├── .gitignore
├── README.md
├── data
    ├── Deltabench_v1.csv
    └── Deltabench_v1.jsonl
├── evaluation.py
├── image
    ├── crop_div_sections.png
    ├── exp.jpg
    ├── human_annotation.png
    ├── intro_longcot.png
    ├── logo.jpg
    └── main.png
├── scripts
    ├── 1_o1-preview-0912.sh
    └── 2_gpt-4o-0806.sh
└── utils
    └── prompt.py

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/.gitignore

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/README.md

--------------------------------------------------------------------------------
/data/Deltabench_v1.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/data/Deltabench_v1.csv

--------------------------------------------------------------------------------
/data/Deltabench_v1.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/data/Deltabench_v1.jsonl

--------------------------------------------------------------------------------
/evaluation.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/evaluation.py

--------------------------------------------------------------------------------
/image/crop_div_sections.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/image/crop_div_sections.png

--------------------------------------------------------------------------------
/image/exp.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/image/exp.jpg

--------------------------------------------------------------------------------
/image/human_annotation.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/image/human_annotation.png

--------------------------------------------------------------------------------
/image/intro_longcot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/image/intro_longcot.png

--------------------------------------------------------------------------------
/image/logo.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/image/logo.jpg

--------------------------------------------------------------------------------
/image/main.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/image/main.png

--------------------------------------------------------------------------------
/scripts/1_o1-preview-0912.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/scripts/1_o1-preview-0912.sh

--------------------------------------------------------------------------------
/scripts/2_gpt-4o-0806.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/scripts/2_gpt-4o-0806.sh

--------------------------------------------------------------------------------
/utils/prompt.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LivingFutureLab/DeltaBench/HEAD/utils/prompt.py

--------------------------------------------------------------------------------