├── README.md
├── demo_usage_of_all_preconditioners.py
├── hello_psgd.py
├── lstm_with_xor_problem.py
├── misc
    ├── affine_wrapping_F_conv2d.py
    ├── affine_wrapping_VF_rnn_tanh.py
    ├── gpt2.py
    ├── gpt2_adamw_vs_psgd.svg
    ├── how_psgd_generalize.py
    ├── how_psgd_generalize.svg
    ├── mnist_logistic_regression.py
    ├── preconditioner_fitting_rule_verification.py
    ├── psgd_affine_integrate_out_v.py
    ├── psgd_affine_matmul_vs_einsum.py
    ├── psgd_kron_verification.py
    ├── psgd_lra_verification.py
    ├── psgd_numerical_stability.py
    ├── psgd_numerical_stability.svg
    ├── psgd_shampoo_caspr.py
    ├── psgd_updates.pdf
    ├── psgd_vs_adafactor.py
    ├── psgd_with_finite_precision_arithmetic.py
    ├── tightness_of_spectral_norm_bound.py
    ├── vit.py
    └── vit_adam_vs_psgd.svg
├── mnist_with_lenet5.py
├── preconditioned_stochastic_gradient_descent.py
├── psgd.py
├── rnn_xor_problem_general_purpose_preconditioner.py
└── wrapped_as_torch_optimizer_for_ddp.py

/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/README.md
--------------------------------------------------------------------------------
/demo_usage_of_all_preconditioners.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/demo_usage_of_all_preconditioners.py
--------------------------------------------------------------------------------
/hello_psgd.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/hello_psgd.py
--------------------------------------------------------------------------------
/lstm_with_xor_problem.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/lstm_with_xor_problem.py
--------------------------------------------------------------------------------
/misc/affine_wrapping_F_conv2d.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/affine_wrapping_F_conv2d.py
--------------------------------------------------------------------------------
/misc/affine_wrapping_VF_rnn_tanh.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/affine_wrapping_VF_rnn_tanh.py
--------------------------------------------------------------------------------
/misc/gpt2.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/gpt2.py
--------------------------------------------------------------------------------
/misc/gpt2_adamw_vs_psgd.svg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/gpt2_adamw_vs_psgd.svg
--------------------------------------------------------------------------------
/misc/how_psgd_generalize.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/how_psgd_generalize.py
--------------------------------------------------------------------------------
/misc/how_psgd_generalize.svg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/how_psgd_generalize.svg
--------------------------------------------------------------------------------
/misc/mnist_logistic_regression.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/mnist_logistic_regression.py
--------------------------------------------------------------------------------
/misc/preconditioner_fitting_rule_verification.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/preconditioner_fitting_rule_verification.py
--------------------------------------------------------------------------------
/misc/psgd_affine_integrate_out_v.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/psgd_affine_integrate_out_v.py
--------------------------------------------------------------------------------
/misc/psgd_affine_matmul_vs_einsum.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/psgd_affine_matmul_vs_einsum.py
--------------------------------------------------------------------------------
/misc/psgd_kron_verification.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/psgd_kron_verification.py
--------------------------------------------------------------------------------
/misc/psgd_lra_verification.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/psgd_lra_verification.py
--------------------------------------------------------------------------------
/misc/psgd_numerical_stability.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/psgd_numerical_stability.py
--------------------------------------------------------------------------------
/misc/psgd_numerical_stability.svg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/psgd_numerical_stability.svg
--------------------------------------------------------------------------------
/misc/psgd_shampoo_caspr.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/psgd_shampoo_caspr.py
--------------------------------------------------------------------------------
/misc/psgd_updates.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/psgd_updates.pdf
--------------------------------------------------------------------------------
/misc/psgd_vs_adafactor.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/psgd_vs_adafactor.py
--------------------------------------------------------------------------------
/misc/psgd_with_finite_precision_arithmetic.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/psgd_with_finite_precision_arithmetic.py
--------------------------------------------------------------------------------
/misc/tightness_of_spectral_norm_bound.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/tightness_of_spectral_norm_bound.py
--------------------------------------------------------------------------------
/misc/vit.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/vit.py
--------------------------------------------------------------------------------
/misc/vit_adam_vs_psgd.svg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/misc/vit_adam_vs_psgd.svg
--------------------------------------------------------------------------------
/mnist_with_lenet5.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/mnist_with_lenet5.py
--------------------------------------------------------------------------------
/preconditioned_stochastic_gradient_descent.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/preconditioned_stochastic_gradient_descent.py
--------------------------------------------------------------------------------
/psgd.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/psgd.py
--------------------------------------------------------------------------------
/rnn_xor_problem_general_purpose_preconditioner.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/rnn_xor_problem_general_purpose_preconditioner.py
--------------------------------------------------------------------------------
/wrapped_as_torch_optimizer_for_ddp.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lixilinx/psgd_torch/HEAD/wrapped_as_torch_optimizer_for_ddp.py
--------------------------------------------------------------------------------