├── .gitattributes
├── .gitignore
├── App.pdf
├── DATA_LICENSE
├── LEGAL.md
├── MODEL_LICENSE
├── README.md
├── corpus
    └── law&regulation
    │   ├── README.md
    │   └── law&regulation.zip
├── dataset
    ├── cls
    │   ├── README.md
    │   └── open_source_multilabel_test.csv
    ├── ner
    │   ├── README.md
    │   └── open_source_ner_test.csv
    └── qa
    │   ├── README.md
    │   └── open_source_qa_test.csv
├── img.png
└── models
    ├── README.md
    └── pretrain
        ├── added_tokens.json
        ├── config.json
        ├── pytorch_model.bin
        ├── special_tokens_map.json
        ├── tokenizer.json
        ├── tokenizer_config.json
        └── vocab.txt

/.gitattributes:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/.gitattributes
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/.gitignore
--------------------------------------------------------------------------------
/App.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/App.pdf
--------------------------------------------------------------------------------
/DATA_LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/DATA_LICENSE
--------------------------------------------------------------------------------
/LEGAL.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/LEGAL.md
--------------------------------------------------------------------------------
/MODEL_LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/MODEL_LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/README.md
--------------------------------------------------------------------------------
/corpus/law&regulation/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/corpus/law&regulation/README.md
--------------------------------------------------------------------------------
/corpus/law&regulation/law&regulation.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/corpus/law&regulation/law&regulation.zip
--------------------------------------------------------------------------------
/dataset/cls/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/dataset/cls/README.md
--------------------------------------------------------------------------------
/dataset/cls/open_source_multilabel_test.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/dataset/cls/open_source_multilabel_test.csv
--------------------------------------------------------------------------------
/dataset/ner/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/dataset/ner/README.md
--------------------------------------------------------------------------------
/dataset/ner/open_source_ner_test.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/dataset/ner/open_source_ner_test.csv
--------------------------------------------------------------------------------
/dataset/qa/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/dataset/qa/README.md
--------------------------------------------------------------------------------
/dataset/qa/open_source_qa_test.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/dataset/qa/open_source_qa_test.csv
--------------------------------------------------------------------------------
/img.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/img.png
--------------------------------------------------------------------------------
/models/README.md:
--------------------------------------------------------------------------------
## Pretrained model
Pretrained on tens of thousands of privacy policy texts using the whole word masking task.

--------------------------------------------------------------------------------
/models/pretrain/added_tokens.json:
--------------------------------------------------------------------------------
{}
--------------------------------------------------------------------------------
/models/pretrain/config.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/models/pretrain/config.json
--------------------------------------------------------------------------------
/models/pretrain/pytorch_model.bin:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/models/pretrain/pytorch_model.bin
--------------------------------------------------------------------------------
/models/pretrain/special_tokens_map.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/models/pretrain/special_tokens_map.json
--------------------------------------------------------------------------------
/models/pretrain/tokenizer.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/models/pretrain/tokenizer.json
--------------------------------------------------------------------------------
/models/pretrain/tokenizer_config.json:
--------------------------------------------------------------------------------
{"init_inputs": []}
--------------------------------------------------------------------------------
/models/pretrain/vocab.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alipay/ComBERT/HEAD/models/pretrain/vocab.txt
--------------------------------------------------------------------------------