├── .DS_Store
├── README.md
├── common
    ├── arg.py
    └── dataset.py
├── data
    ├── sample.txt
    ├── train
    │   └── kowiki_sample.txt
    └── vocab-v1.txt
├── images
    ├── explicit-sparse-attention.png
    ├── macaron.png
    ├── residual_attn.png
    └── rezero.png
├── model
    ├── o_transformer.py
    ├── pipeline.py
    └── transformer.py
├── requirements.txt
├── train
    ├── config.json
    └── run_pretraining.py
├── train_deepspeed
    ├── __pycache__
    │   └── ds_util.cpython-36.pyc
    ├── config_rezero_sparsetopk.json
    ├── ds_config_rezero_sparsetopk.json
    ├── ds_train_rezero_sparsetopk.sh
    ├── ds_util.py
    └── train_rezero_sparsetopk.py
└── train_pl
    ├── config.json
    ├── config_small.json
    ├── run_pretraining.py
    ├── run_pretraining_rezero.py
    └── run_pretraining_rezero_sparsetopk.py


/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/.DS_Store
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/README.md
--------------------------------------------------------------------------------
/common/arg.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/common/arg.py
--------------------------------------------------------------------------------
/common/dataset.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/common/dataset.py
--------------------------------------------------------------------------------
/data/sample.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/data/sample.txt
--------------------------------------------------------------------------------
/data/train/kowiki_sample.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/data/train/kowiki_sample.txt
--------------------------------------------------------------------------------
/data/vocab-v1.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/data/vocab-v1.txt
--------------------------------------------------------------------------------
/images/explicit-sparse-attention.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/images/explicit-sparse-attention.png
--------------------------------------------------------------------------------
/images/macaron.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/images/macaron.png
--------------------------------------------------------------------------------
/images/residual_attn.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/images/residual_attn.png
--------------------------------------------------------------------------------
/images/rezero.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/images/rezero.png
--------------------------------------------------------------------------------
/model/o_transformer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/model/o_transformer.py
--------------------------------------------------------------------------------
/model/pipeline.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/model/pipeline.py
--------------------------------------------------------------------------------
/model/transformer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/model/transformer.py
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/requirements.txt
--------------------------------------------------------------------------------
/train/config.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train/config.json
--------------------------------------------------------------------------------
/train/run_pretraining.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train/run_pretraining.py
--------------------------------------------------------------------------------
/train_deepspeed/__pycache__/ds_util.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_deepspeed/__pycache__/ds_util.cpython-36.pyc
--------------------------------------------------------------------------------
/train_deepspeed/config_rezero_sparsetopk.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_deepspeed/config_rezero_sparsetopk.json
--------------------------------------------------------------------------------
/train_deepspeed/ds_config_rezero_sparsetopk.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_deepspeed/ds_config_rezero_sparsetopk.json
--------------------------------------------------------------------------------
/train_deepspeed/ds_train_rezero_sparsetopk.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_deepspeed/ds_train_rezero_sparsetopk.sh
--------------------------------------------------------------------------------
/train_deepspeed/ds_util.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_deepspeed/ds_util.py
--------------------------------------------------------------------------------
/train_deepspeed/train_rezero_sparsetopk.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_deepspeed/train_rezero_sparsetopk.py
--------------------------------------------------------------------------------
/train_pl/config.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_pl/config.json
--------------------------------------------------------------------------------
/train_pl/config_small.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_pl/config_small.json
--------------------------------------------------------------------------------
/train_pl/run_pretraining.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_pl/run_pretraining.py
--------------------------------------------------------------------------------
/train_pl/run_pretraining_rezero.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_pl/run_pretraining_rezero.py
--------------------------------------------------------------------------------
/train_pl/run_pretraining_rezero_sparsetopk.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nawnoes/pytorch-gpt-x/HEAD/train_pl/run_pretraining_rezero_sparsetopk.py
--------------------------------------------------------------------------------