├── .gitignore
├── LICENSE
├── README.md
├── doc
    ├── data_format.md
    ├── schema.svg
    ├── unarXive_data_sample.tar.gz
    └── unarXive_supplement_-_mag_zero_coverage_-_mzcao.html
└── src
    ├── README.md
    ├── extend_matched.py
    ├── match_references_openalex.py
    ├── normalize_arxiv_dump.py
    ├── parse_latex_tralics.py
    ├── prepare.py
    ├── requirements.txt
    └── utility_scripts
        ├── arxiv_taxonomy.py
        ├── calc_stats.py
        ├── count_licenses.py
        ├── filter_permissively_livensed.py
        ├── generate_metadata_db.py
        ├── generate_openalex_db.py
        ├── generate_openalex_db_using_locations.py
        ├── ml_tasks_prep_data.py
        └── ml_tasks_split_data.py

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/.gitignore
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/README.md
--------------------------------------------------------------------------------
/doc/data_format.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/doc/data_format.md
--------------------------------------------------------------------------------
/doc/schema.svg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/doc/schema.svg
--------------------------------------------------------------------------------
/doc/unarXive_data_sample.tar.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/doc/unarXive_data_sample.tar.gz
--------------------------------------------------------------------------------
/doc/unarXive_supplement_-_mag_zero_coverage_-_mzcao.html:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/doc/unarXive_supplement_-_mag_zero_coverage_-_mzcao.html
--------------------------------------------------------------------------------
/src/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/README.md
--------------------------------------------------------------------------------
/src/extend_matched.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/extend_matched.py
--------------------------------------------------------------------------------
/src/match_references_openalex.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/match_references_openalex.py
--------------------------------------------------------------------------------
/src/normalize_arxiv_dump.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/normalize_arxiv_dump.py
--------------------------------------------------------------------------------
/src/parse_latex_tralics.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/parse_latex_tralics.py
--------------------------------------------------------------------------------
/src/prepare.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/prepare.py
--------------------------------------------------------------------------------
/src/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/requirements.txt
--------------------------------------------------------------------------------
/src/utility_scripts/arxiv_taxonomy.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/utility_scripts/arxiv_taxonomy.py
--------------------------------------------------------------------------------
/src/utility_scripts/calc_stats.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/utility_scripts/calc_stats.py
--------------------------------------------------------------------------------
/src/utility_scripts/count_licenses.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/utility_scripts/count_licenses.py
--------------------------------------------------------------------------------
/src/utility_scripts/filter_permissively_livensed.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/utility_scripts/filter_permissively_livensed.py
--------------------------------------------------------------------------------
/src/utility_scripts/generate_metadata_db.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/utility_scripts/generate_metadata_db.py
--------------------------------------------------------------------------------
/src/utility_scripts/generate_openalex_db.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/utility_scripts/generate_openalex_db.py
--------------------------------------------------------------------------------
/src/utility_scripts/generate_openalex_db_using_locations.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/utility_scripts/generate_openalex_db_using_locations.py
--------------------------------------------------------------------------------
/src/utility_scripts/ml_tasks_prep_data.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/utility_scripts/ml_tasks_prep_data.py
--------------------------------------------------------------------------------
/src/utility_scripts/ml_tasks_split_data.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IllDepence/unarXive/HEAD/src/utility_scripts/ml_tasks_split_data.py
--------------------------------------------------------------------------------