├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── data
│   ├── README.md
│   ├── cruxeval.jsonl
│   ├── data_generating_prompt.jsonl
│   ├── diverse_fewshot_examples.py
│   ├── filter
│   │   ├── analyze_ops.py
│   │   └── get_stack.py
│   └── generate_function_prompts.py
├── evaluation
│   ├── evaluate_all_predictions_input.sh
│   ├── evaluate_all_predictions_output.sh
│   ├── evaluate_generations.py
│   ├── evaluation_results
│   │   └── .gitkeep
│   ├── print_evaluation_directories.py
│   ├── read_results.py
│   ├── utils_execute.py
│   └── utils_general.py
├── inference
│   ├── combine_generations.py
│   ├── generation_arguments.py
│   ├── generator.py
│   ├── main.py
│   ├── scripts
│   │   ├── run_input_prediction.sh
│   │   ├── run_input_prediction_cot.sh
│   │   ├── run_output_prediction.sh
│   │   └── run_output_prediction_cot.sh
│   ├── tasks
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── input_prediction.py
│   │   └── output_prediction.py
│   └── utils.py
├── model_generations
│   └── .gitkeep
├── openai
│   ├── openai_prompt.py
│   └── openai_run.py
├── prompts.py
├── quickstart.ipynb
├── requirements-base.txt
├── requirements-inference.txt
├── requirements-openai.txt
├── requirements.txt
└── samples
    ├── evaluation_results.zip
    ├── evaluation_results
    │   ├── sample_scored_codellama-7b_temp0.2_input.json
    │   └── sample_scored_codellama-7b_temp0.2_output.json
    ├── model_generations.zip
    └── model_generations
        ├── sample_codellama-7b_temp0.2_input
        │   └── generations.json
        └── sample_codellama-7b_temp0.2_output
            └── generations.json

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/.gitignore
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/CODE_OF_CONDUCT.md
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/CONTRIBUTING.md
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/README.md
--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/data/README.md
--------------------------------------------------------------------------------
/data/cruxeval.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/data/cruxeval.jsonl
--------------------------------------------------------------------------------
/data/data_generating_prompt.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/data/data_generating_prompt.jsonl
--------------------------------------------------------------------------------
/data/diverse_fewshot_examples.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/data/diverse_fewshot_examples.py
--------------------------------------------------------------------------------
/data/filter/analyze_ops.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/data/filter/analyze_ops.py
--------------------------------------------------------------------------------
/data/filter/get_stack.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/data/filter/get_stack.py
--------------------------------------------------------------------------------
/data/generate_function_prompts.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/data/generate_function_prompts.py
--------------------------------------------------------------------------------
/evaluation/evaluate_all_predictions_input.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/evaluation/evaluate_all_predictions_input.sh
--------------------------------------------------------------------------------
/evaluation/evaluate_all_predictions_output.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/evaluation/evaluate_all_predictions_output.sh
--------------------------------------------------------------------------------
/evaluation/evaluate_generations.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/evaluation/evaluate_generations.py
--------------------------------------------------------------------------------
/evaluation/evaluation_results/.gitkeep:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/evaluation/print_evaluation_directories.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/evaluation/print_evaluation_directories.py
--------------------------------------------------------------------------------
/evaluation/read_results.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/evaluation/read_results.py
--------------------------------------------------------------------------------
/evaluation/utils_execute.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/evaluation/utils_execute.py
--------------------------------------------------------------------------------
/evaluation/utils_general.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/evaluation/utils_general.py
--------------------------------------------------------------------------------
/inference/combine_generations.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/combine_generations.py
--------------------------------------------------------------------------------
/inference/generation_arguments.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/generation_arguments.py
--------------------------------------------------------------------------------
/inference/generator.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/generator.py
--------------------------------------------------------------------------------
/inference/main.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/main.py
--------------------------------------------------------------------------------
/inference/scripts/run_input_prediction.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/scripts/run_input_prediction.sh
--------------------------------------------------------------------------------
/inference/scripts/run_input_prediction_cot.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/scripts/run_input_prediction_cot.sh
--------------------------------------------------------------------------------
/inference/scripts/run_output_prediction.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/scripts/run_output_prediction.sh
--------------------------------------------------------------------------------
/inference/scripts/run_output_prediction_cot.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/scripts/run_output_prediction_cot.sh
--------------------------------------------------------------------------------
/inference/tasks/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/tasks/__init__.py
--------------------------------------------------------------------------------
/inference/tasks/base.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/tasks/base.py
--------------------------------------------------------------------------------
/inference/tasks/input_prediction.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/tasks/input_prediction.py
--------------------------------------------------------------------------------
/inference/tasks/output_prediction.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/tasks/output_prediction.py
--------------------------------------------------------------------------------
/inference/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/inference/utils.py
--------------------------------------------------------------------------------
/model_generations/.gitkeep:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/openai/openai_prompt.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/openai/openai_prompt.py
--------------------------------------------------------------------------------
/openai/openai_run.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/openai/openai_run.py
--------------------------------------------------------------------------------
/prompts.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/prompts.py
--------------------------------------------------------------------------------
/quickstart.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/quickstart.ipynb
--------------------------------------------------------------------------------
/requirements-base.txt:
--------------------------------------------------------------------------------
numpy
tabulate
--------------------------------------------------------------------------------
/requirements-inference.txt:
--------------------------------------------------------------------------------
torch
datasets
transformers==4.36.2
vllm==0.2.6
tqdm
--------------------------------------------------------------------------------
/requirements-openai.txt:
--------------------------------------------------------------------------------
openai
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/requirements.txt
--------------------------------------------------------------------------------
/samples/evaluation_results.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/samples/evaluation_results.zip
--------------------------------------------------------------------------------
/samples/evaluation_results/sample_scored_codellama-7b_temp0.2_input.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/samples/evaluation_results/sample_scored_codellama-7b_temp0.2_input.json
--------------------------------------------------------------------------------
/samples/evaluation_results/sample_scored_codellama-7b_temp0.2_output.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/samples/evaluation_results/sample_scored_codellama-7b_temp0.2_output.json
--------------------------------------------------------------------------------
/samples/model_generations.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/samples/model_generations.zip
--------------------------------------------------------------------------------
/samples/model_generations/sample_codellama-7b_temp0.2_input/generations.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/samples/model_generations/sample_codellama-7b_temp0.2_input/generations.json
--------------------------------------------------------------------------------
/samples/model_generations/sample_codellama-7b_temp0.2_output/generations.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/cruxeval/HEAD/samples/model_generations/sample_codellama-7b_temp0.2_output/generations.json
--------------------------------------------------------------------------------