
Causal Direct Preference Optimization for Distributionally Robust Generative Recommendation


CausalDPO introduces a backdoor adjustment strategy during the preference alignment phase to eliminate interference from environmental confounders, explicitly models the latent environment distribution with a soft clustering approach, and enforces invariance constraints to encourage consistent behavior across diverse environments. Our theoretical analysis shows that CausalDPO captures users' stable preference structures across multiple environments, thereby improving the out-of-distribution (OOD) generalization of LLM-based recommendation models. Extensive experiments under four representative distribution-shift settings validate the effectiveness of CausalDPO, which achieves an average improvement of 24.10% across four evaluation metrics.
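The full training logic lives in trainer/causal_dpo.py. As a rough illustration of how these three ingredients can combine, here is a minimal PyTorch sketch of an environment-adjusted DPO objective; every name in it (env_probs, beta, lam, the variance penalty weight) is an illustrative assumption, not the repository's actual API:

    import torch
    import torch.nn.functional as F

    def causal_dpo_loss(policy_chosen_logps, policy_rejected_logps,
                        ref_chosen_logps, ref_rejected_logps,
                        env_probs, beta=0.1, lam=0.1):
        # Standard DPO preference logits: the implicit reward margin between
        # chosen and rejected responses, measured against the reference model.
        logits = beta * ((policy_chosen_logps - ref_chosen_logps)
                         - (policy_rejected_logps - ref_rejected_logps))  # [B]
        per_example = -F.logsigmoid(logits)                               # [B]
        # Backdoor-style adjustment: env_probs holds soft environment
        # assignments P(e|x) of shape [B, E]; weight each example by them
        # and average the loss within each latent environment.
        weighted = env_probs * per_example.unsqueeze(-1)                  # [B, E]
        per_env = weighted.sum(dim=0) / env_probs.sum(dim=0).clamp_min(1e-8)
        # Invariance constraint: penalize disagreement of the loss across
        # environments so learned preferences stay stable under shift.
        return per_env.mean() + lam * per_env.var(unbiased=False)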

📝 Environment

  1. Create a conda environment named llm_gpu:
    conda create --name llm_gpu python=3.10
  2. Activate it and install the packages from requirements.txt:
    conda activate llm_gpu
    pip install -r requirements.txt

📈 Dataset

We conducted extensive experiments on the following three datasets. Detailed data processing procedures can be found in Appendix C.1 of our paper.

Dataset         MovieLens-10M   Yelp2018    Book-Crossing
#Sequences      71,567          31,668      278,858
#Items          10,681          38,048      271,379
#Interactions   10,000,054      1,561,406   1,149,780

🔬 Model Framework

Our model architecture comprises three core modules: supervised fine-tuning (SFT), CausalDPO-based preference alignment, and inference/evaluation. The repository is organized as follows (a sketch of a full end-to-end run is given after the directory tree):

project-root/
├── dataset/
├── eval_result/
├── llm/
├── prompt/
├── save_checkpoint/
└── trainer/
    ├── causal_dpo.py
    ├── causal_dpo.sh
    ├── eval.sh
    ├── evaluate.py
    ├── framework.png
    ├── inference.py
    ├── inference.sh
    ├── README.md
    ├── requirements.txt
    ├── sft.py
    ├── sft.sh
    └── text_to_embeddings.py
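For training from scratch rather than the quick reproduction below, the three stages map directly onto the provided shell scripts. The sequence below is a sketch only; check each .sh file for dataset paths and hyperparameters before running:

    cd trainer
    bash sft.sh          # stage 1: supervised fine-tuning
    bash causal_dpo.sh   # stage 2: CausalDPO preference alignment
    bash inference.sh    # stage 3: inference on the test set
    bash eval.sh         # stage 3: metric evaluation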

🚀 Quick Reproduction

If resources are constrained, you can load our pre-trained weights to quickly reproduce the results in the paper. The steps are as follows:

  1. Download Llama-3.1-8B-Instruct and paraphrase-MiniLM-L3-v2 from Hugging Face into their respective directories under the llm/ folder:
    llm/
    ├── Llama-3.1-8B-Instruct
    └── paraphrase-MiniLM-L3-v2
  2. Download the fine-tuned CausalDPO weights from our provided link:
    save_checkpoint/
    └── ml-10m
        └── save_path_cdpo
  3. Run inference.sh and eval.sh to perform model inference and evaluation:
    bash inference.sh
    bash eval.sh
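To sanity-check the encoder downloaded in step 1, you can load it from its local directory with sentence-transformers. This is a quick smoke test rather than part of the official pipeline, and the path assumes the llm/ layout shown above:

    from sentence_transformers import SentenceTransformer

    # Load the sentence encoder from the local folder rather than the Hub.
    encoder = SentenceTransformer("llm/paraphrase-MiniLM-L3-v2")
    vecs = encoder.encode(["a quick smoke test"])
    print(vecs.shape)  # (1, 384): MiniLM-L3 produces 384-dim embeddings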

🙏 Acknowledgments

We extend our special gratitude to the authors of the S-DPO and SPRec methods, whose implementations and codebases informed our model architecture and evaluation methodology. Proper citations are included in our paper.
