├── LICENSE
├── README.md
├── datasets
    ├── CosmosQA_val.csv
    ├── ETHICS_val.zip
    ├── HellaSwag_val.zip
    ├── NumerSense_val.tsv
    ├── PIQA_val.zip
    ├── QASC_val.jsonl
    ├── RiddleSense_val.jsonl
    ├── Social_IQa_val.zip
    ├── TRAM_val.zip
    ├── VCR_val.zip
    └── aNLI_val.zip
├── image_sources
    ├── datasets_overview.png
    └── image.txt
└── results
    ├── CommonsenseQA_eval.csv
    ├── CosmosQA_eval.csv
    ├── ETHICS_eval.csv
    ├── HellaSwag_eval.csv
    ├── NumerSense_eval.csv
    ├── PIQA_eval.csv
    ├── QASC_eval.csv
    ├── RiddleSense_eval.csv
    ├── Social_IQa_eval.csv
    ├── TRAM_eval.csv
    ├── VCR_eval.zip
    └── aNLI_eval.csv

/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/README.md
--------------------------------------------------------------------------------
/datasets/CosmosQA_val.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/datasets/CosmosQA_val.csv
--------------------------------------------------------------------------------
/datasets/ETHICS_val.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/datasets/ETHICS_val.zip
--------------------------------------------------------------------------------
/datasets/HellaSwag_val.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/datasets/HellaSwag_val.zip
--------------------------------------------------------------------------------
/datasets/NumerSense_val.tsv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/datasets/NumerSense_val.tsv
--------------------------------------------------------------------------------
/datasets/PIQA_val.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/datasets/PIQA_val.zip
--------------------------------------------------------------------------------
/datasets/QASC_val.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/datasets/QASC_val.jsonl
--------------------------------------------------------------------------------
/datasets/RiddleSense_val.jsonl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/datasets/RiddleSense_val.jsonl
--------------------------------------------------------------------------------
/datasets/Social_IQa_val.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/datasets/Social_IQa_val.zip
--------------------------------------------------------------------------------
/datasets/TRAM_val.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/datasets/TRAM_val.zip
--------------------------------------------------------------------------------
/datasets/VCR_val.zip:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/datasets/aNLI_val.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/datasets/aNLI_val.zip
--------------------------------------------------------------------------------
/image_sources/datasets_overview.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/image_sources/datasets_overview.png
--------------------------------------------------------------------------------
/image_sources/image.txt:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/results/CommonsenseQA_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/CommonsenseQA_eval.csv
--------------------------------------------------------------------------------
/results/CosmosQA_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/CosmosQA_eval.csv
--------------------------------------------------------------------------------
/results/ETHICS_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/ETHICS_eval.csv
--------------------------------------------------------------------------------
/results/HellaSwag_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/HellaSwag_eval.csv
--------------------------------------------------------------------------------
/results/NumerSense_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/NumerSense_eval.csv
--------------------------------------------------------------------------------
/results/PIQA_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/PIQA_eval.csv
--------------------------------------------------------------------------------
/results/QASC_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/QASC_eval.csv
--------------------------------------------------------------------------------
/results/RiddleSense_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/RiddleSense_eval.csv
--------------------------------------------------------------------------------
/results/Social_IQa_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/Social_IQa_eval.csv
--------------------------------------------------------------------------------
/results/TRAM_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/TRAM_eval.csv
--------------------------------------------------------------------------------
/results/VCR_eval.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/VCR_eval.zip
--------------------------------------------------------------------------------
/results/aNLI_eval.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/EternityYW/Gemini-Commonsense-Evaluation/HEAD/results/aNLI_eval.csv
--------------------------------------------------------------------------------