Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning

Source code for the paper "Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning" - Otmane Sakhi, Imad Aouali, Pierre Alquier, Nicolas Chopin published at NeuRIPS 2024 (Spotlight).

Creating the environment

A virtual Python environment can be created from the requirement file holding all the needed packages to reproduce the results of the paper.

virtualenv -p python3.9 pess_ls
source pess_ls/bin/activate
pip install -r requirements.txt

Run experiments

Run OPE/OPS experiments

The OPE/OPS experiments are built on the experiments conducted in "Kuzborskij, I., Vernade, C., Gyorgy, A., & Szepesvári, C. (2021, March). Confident off-policy evaluation and selection through self-normalized importance weighting. In International Conference on Artificial Intelligence and Statistics (pp. 640-648). PMLR.".

The code for these experiments is heavily inspired from their associated Github package.

To run OPE experiments, please execute:

python ope_ops/policy_evaluation.py

To run OPS experiments, please execute:

python ope_ops/policy_selection.py

Run OPL experiments

To run OPL experiments, please execute:

python opl/policy_learning.py

Citing this work

If you use this code, please cite our work:

@inproceedings{NEURIPS2024_9379ea6b,
 author = {Sakhi, Otmane and Aouali, Imad and Alquier, Pierre and Chopin, Nicolas},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang},
 pages = {80706--80755},
 publisher = {Curran Associates, Inc.},
 title = {Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning},
 url = {https://proceedings.neurips.cc/paper_files/paper/2024/file/9379ea6ba7a61a402c7750833848b99f-Paper-Conference.pdf},
 volume = {37},
 year = {2024}}

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
ope_ops		ope_ops
opl		opl
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning

Creating the environment

Run experiments

Run OPE/OPS experiments

Run OPL experiments

Citing this work

About

Uh oh!

Releases

Packages

Uh oh!

Languages

otmhi/offpolicy_ls

Folders and files

Latest commit

History

Repository files navigation

Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning

Creating the environment

Run experiments

Run OPE/OPS experiments

Run OPL experiments

Citing this work

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages