├── LICENSE
├── Precise_IF_Generalization_Abilities.pdf
├── README.md
├── __pycache__
    ├── evaluation_lib.cpython-311.pyc
    ├── instructions.cpython-311.pyc
    ├── instructions_registry.cpython-311.pyc
    ├── instructions_util.cpython-311.pyc
    └── run_eval.cpython-311.pyc
├── data
    ├── IFBench_test.jsonl
    └── sample_output.jsonl
├── eval
    ├── eval_results_loose.jsonl
    └── eval_results_strict.jsonl
├── evaluation_lib.py
├── instructions.py
├── instructions_registry.py
├── instructions_test.py
├── instructions_util.py
├── requirements.txt
└── run_eval.py

/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/LICENSE
--------------------------------------------------------------------------------
/Precise_IF_Generalization_Abilities.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/Precise_IF_Generalization_Abilities.pdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/README.md
--------------------------------------------------------------------------------
/__pycache__/evaluation_lib.cpython-311.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/__pycache__/evaluation_lib.cpython-311.pyc
--------------------------------------------------------------------------------
/__pycache__/instructions.cpython-311.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/__pycache__/instructions.cpython-311.pyc
--------------------------------------------------------------------------------
/__pycache__/instructions_registry.cpython-311.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/__pycache__/instructions_registry.cpython-311.pyc
--------------------------------------------------------------------------------
/__pycache__/instructions_util.cpython-311.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/__pycache__/instructions_util.cpython-311.pyc
--------------------------------------------------------------------------------
/__pycache__/run_eval.cpython-311.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/__pycache__/run_eval.cpython-311.pyc
--------------------------------------------------------------------------------
/data/IFBench_test.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/data/IFBench_test.jsonl
--------------------------------------------------------------------------------
/data/sample_output.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/data/sample_output.jsonl
--------------------------------------------------------------------------------
/eval/eval_results_loose.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/eval/eval_results_loose.jsonl
--------------------------------------------------------------------------------
/eval/eval_results_strict.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/eval/eval_results_strict.jsonl
--------------------------------------------------------------------------------
/evaluation_lib.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/evaluation_lib.py
--------------------------------------------------------------------------------
/instructions.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/instructions.py
--------------------------------------------------------------------------------
/instructions_registry.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/instructions_registry.py
--------------------------------------------------------------------------------
/instructions_test.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/instructions_test.py
--------------------------------------------------------------------------------
/instructions_util.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/instructions_util.py
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/requirements.txt
--------------------------------------------------------------------------------
/run_eval.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/allenai/IFBench/HEAD/run_eval.py
--------------------------------------------------------------------------------