This is the PyTorch implementation of our paper:
Efficient Inference for Large Language Model-based Generative Recommendation (ICLR 2025)
We also release a Python package, BeamSD, which can accelerate the beam search generation of transformers by 1.5x with just one line of code!
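Below is a purely hypothetical usage sketch: this README does not document the BeamSD API, so the import name `beamsd`, the entry point `beamsd.accelerate`, and its arguments are all illustrative assumptions about what a one-line integration might look like.

```python
# Hypothetical sketch only: `beamsd.accelerate` and its arguments are
# assumed names, not the documented BeamSD API.
from transformers import AutoModelForCausalLM, AutoTokenizer
import beamsd  # assumed import name

tokenizer = AutoTokenizer.from_pretrained("YOUR_BASE_MODEL")
model = AutoModelForCausalLM.from_pretrained("YOUR_BASE_MODEL")

# The advertised "one line": wrap the model so beam search in `generate`
# is served by speculative decoding with a small draft model.
model = beamsd.accelerate(model, draft_model="YOUR_DRAFT_MODEL")  # assumed signature

inputs = tokenizer("Recommend the next item for this user:", return_tensors="pt")
outputs = model.generate(**inputs, num_beams=20, max_new_tokens=4)
```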
Requirements:

- Anaconda 3
- Python 3.9.0
- PyTorch 1.13.0
- transformers 4.41.0
```
data/
├── beauty
└── games
```

The data in this folder is already processed and can be used directly. The raw data is from Amazon product data.
We sort users' historical interactions by global timestamps and then split them into training, validation, and testing sets with a ratio of 8:1:1. If you want to apply this splitting method to your own dataset, please refer to the example for the Beauty dataset in `data/data_process.ipynb`.
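As a minimal sketch of this split, assuming interactions are stored as (user, item, timestamp) rows with hypothetical file and column names (the repository's actual preprocessing is in `data/data_process.ipynb`):

```python
# Minimal sketch of the 8:1:1 temporal split described above.
# File and column names are illustrative assumptions; see
# data/data_process.ipynb for the actual preprocessing.
import pandas as pd

df = pd.read_csv("interactions.csv").sort_values("timestamp")  # global timestamps

n = len(df)
train = df.iloc[: int(0.8 * n)]               # earliest 80%
valid = df.iloc[int(0.8 * n) : int(0.9 * n)]  # next 10%
test = df.iloc[int(0.9 * n) :]                # latest 10%
```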
For the item identifier, we follow LC-Rec and set the length L = 4, i.e., each generated item corresponds to a token sequence of length 4.
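Concretely, an item identifier is a fixed-length sequence of special tokens; the token names below are hypothetical placeholders in LC-Rec's style, not the repository's actual vocabulary.

```python
# Illustration only: hypothetical identifier tokens for one item (L = 4).
item_identifier = ["<a_5>", "<b_23>", "<c_17>", "<d_2>"]
generated_item = "".join(item_identifier)  # "<a_5><b_23><c_17><d_2>"
assert len(item_identifier) == 4
```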
- Train Target Model

First, replace the parameters in `code/script/finetune_llama.sh` with your own parameters, such as `LOG_DIR`, `OUTPUT_DIR`, etc.
```bash
LOG_DIR=YOUR_LOG_DIR
OUTPUT_DIR=YOUR_OUTPUT_DIR
BASE_MODEL=YOUR_BASE_MODEL_PATH
```

Then, run the following command to train the target model.
```bash
cd code
bash script/finetune_llama.sh
```

- Generate Teacher Data
Replace the parameters in `code/script/generate_teacher_data.sh` with your own parameters, and then run the following command.
```bash
cd code
bash script/generate_teacher_data.sh
```

Then, the data will be generated in `${YOUR_OUTPUT_DIR}/${dataset}/train_teacher_data` and `${YOUR_OUTPUT_DIR}/${dataset}/eval_teacher_data`, which correspond to the `train_data` and `valid_data` parameters in `code/script/train.sh`.
- Train Draft Model
Replace the parameters in `code/script/train.sh` with your own parameters, such as `LOG_DIR`, `OUTPUT_DIR`, `TARGET_MODEL`, `BASE_MODEL`, `MODEL_CLASS`, etc., and modify `accelerate.yaml` if necessary.
```bash
LOG_DIR=YOUR_LOG_DIR
OUTPUT_DIR=YOUR_OUTPUT_DIR
TARGET_MODEL=YOUR_TARGET_MODEL_PATH
BASE_MODEL=YOUR_BASE_MODEL_PATH
MODEL_CLASS=AtSpeedRModel
```

Then, run the following command to train the draft model.
```bash
cd code
bash script/train.sh
```
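For intuition, draft-model training of this kind aligns the draft model's next-token distribution with the target model's. The generic distillation-style loss below is a minimal sketch of that idea under simplified assumptions; it is not the objective actually implemented by `AtSpeedRModel`.

```python
# Generic distillation-style alignment loss (sketch only; the actual
# AtSpeed objectives are implemented in the repository's training code).
import torch.nn.functional as F

def alignment_loss(draft_logits, target_logits, temperature=1.0):
    """KL(target || draft) over next-token distributions.
    Both logits tensors have shape (batch, seq_len, vocab)."""
    log_p_target = F.log_softmax(target_logits / temperature, dim=-1)
    log_p_draft = F.log_softmax(draft_logits / temperature, dim=-1)
    return F.kl_div(log_p_draft, log_p_target, log_target=True,
                    reduction="batchmean")
```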
First, replace the parameters in `code/script/inference.sh` with your own parameters, such as `LOG_DIR`, `OUTPUT_DIR`, `DRAFT_MODEL`, `DRAFT_MODEL_NAME`, etc.

```bash
LOG_DIR=YOUR_LOG_DIR
OUTPUT_DIR=YOUR_OUTPUT_DIR
DRAFT_MODEL=DRAFT_MODEL_PATH
DRAFT_MODEL_NAME=DRAFT_MODEL_NAME
```

Then, run the following command for inference.
```bash
cd code
bash script/inference.sh
```
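To make the decoding loop concrete, here is a toy sketch of strict top-K verification, the idea that keeps speculative beam search lossless: a drafted beam survives only if the target model would also place it in its own top-K. This is a simplified illustration over plain token-id lists, not the repository's inference code.

```python
# Toy sketch of strict top-K verification for speculative beam search.
# Simplified illustration only; real verification operates on model logits.
def verify_draft_beams(draft_beams, target_topk_beams):
    """Accept a drafted beam only if it also appears in the target's top-K."""
    target = {tuple(beam) for beam in target_topk_beams}
    return [beam for beam in draft_beams if tuple(beam) in target]

# Example with 4-token item identifiers (L = 4) and K = 3 beams.
draft = [[1, 5, 2, 9], [1, 5, 3, 7], [2, 8, 4, 6]]
target_topk = [[1, 5, 2, 9], [2, 8, 4, 6], [9, 9, 1, 0]]
print(verify_draft_beams(draft, target_topk))  # -> [[1, 5, 2, 9], [2, 8, 4, 6]]
```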
If you find our work useful for your research, please consider citing:

```bibtex
@inproceedings{lin2024efficient,
  title={Efficient Inference for Large Language Model-based Generative Recommendation},
  author={Lin, Xinyu and Yang, Chaoqun and Wang, Wenjie and Li, Yongqi and Du, Cunxiao and Feng, Fuli and Ng, See-Kiong and Chua, Tat-Seng},
  booktitle={ICLR},
  year={2025}
}
```
NUS © NExT++