├── LICENSE
├── README.md
├── bert_tf
│   └── create_pretraining_data.py
├── crawling
│   ├── ilbe_crwal.py
│   ├── namuwiki_crwal.py
│   ├── naver_news_crwal.py
│   └── youtube_crwal.py
├── img
│   ├── ap_graph.png
│   ├── example.gif
│   └── pool_mean.png
├── purifier
│   ├── convert_tf_checkpoint_to_pytorch.py
│   ├── data
│   │   ├── bert_config.json
│   │   └── vocab_korea.txt
│   ├── file_utils.py
│   ├── mask_tokenizer.py
│   ├── modeling_purifier.py
│   ├── optimization.py
│   ├── preprocess.py
│   └── run_purifier.py
└── requirements.txt

/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/README.md
--------------------------------------------------------------------------------
/bert_tf/create_pretraining_data.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/bert_tf/create_pretraining_data.py
--------------------------------------------------------------------------------
/crawling/ilbe_crwal.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/crawling/ilbe_crwal.py
--------------------------------------------------------------------------------
/crawling/namuwiki_crwal.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/crawling/namuwiki_crwal.py
--------------------------------------------------------------------------------
/crawling/naver_news_crwal.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/crawling/naver_news_crwal.py
--------------------------------------------------------------------------------
/crawling/youtube_crwal.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/crawling/youtube_crwal.py
--------------------------------------------------------------------------------
/img/ap_graph.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/img/ap_graph.png
--------------------------------------------------------------------------------
/img/example.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/img/example.gif
--------------------------------------------------------------------------------
/img/pool_mean.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/img/pool_mean.png
--------------------------------------------------------------------------------
/purifier/convert_tf_checkpoint_to_pytorch.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/purifier/convert_tf_checkpoint_to_pytorch.py
--------------------------------------------------------------------------------
/purifier/data/bert_config.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/purifier/data/bert_config.json
--------------------------------------------------------------------------------
/purifier/data/vocab_korea.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/purifier/data/vocab_korea.txt
--------------------------------------------------------------------------------
/purifier/file_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/purifier/file_utils.py
--------------------------------------------------------------------------------
/purifier/mask_tokenizer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/purifier/mask_tokenizer.py
--------------------------------------------------------------------------------
/purifier/modeling_purifier.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/purifier/modeling_purifier.py
--------------------------------------------------------------------------------
/purifier/optimization.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/purifier/optimization.py
--------------------------------------------------------------------------------
/purifier/preprocess.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/purifier/preprocess.py
--------------------------------------------------------------------------------
/purifier/run_purifier.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/purifier/run_purifier.py
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/teammatmul/project-purifier/HEAD/requirements.txt
--------------------------------------------------------------------------------