
lion-tf
public · 10 stars · 4 forks · 0 issues

Commits

List of commits on branch main.
| Commit | Message | Author | Status |
|---|---|---|---|
| eceaf68d6c27cd8ab9bb53fc931d84ff87a7d549 | Update setup.py | Rocketknight1, 2 years ago | Verified |
| fd826995be454ba5a89a5f00b6066c7f10c8c368 | Update README.md | Rocketknight1, 2 years ago | Verified |
| b5265b7fa27de933f2971a3c0acf5183f95ebff3 | Explain sparse_lerp | Rocketknight1, 2 years ago | Unverified |
| ad30977903ab5fa6618b56bfe20549d3d4ea5b22 | Add tips to README | Rocketknight1, 2 years ago | Unverified |
| 5b8f25b6fa6f15bfe3aa2570df642d1b2cfbd705 | Small README fix | Rocketknight1, 2 years ago | Unverified |
| 3d08b0d6f7205eb5b48ff970c676ea34e8b31c69 | Create README.md | Rocketknight1, 2 years ago | Verified |

README

The README file for this repository.

lion-tf

A TensorFlow implementation of the Lion optimizer from the paper "Symbolic Discovery of Optimization Algorithms". Partially adapted from the lucidrains PyTorch implementation.

The maths seem right and it successfully trained a couple of models for me, but that doesn't mean I haven't forgotten something stupid, or that there isn't room for optimization!

In general, the code trusts in 🙏XLA🙏 to efficiently reuse buffers and save memory rather than manually doing all the ops in-place like the PyTorch version does. Note that the optimizer will be compiled with XLA even if you don't use jit_compile for the rest of your model!
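For reference, the update rule Lion implements (as published in the paper) is easy to sketch outside TensorFlow. The snippet below is a minimal NumPy illustration of the maths, not this repo's actual code or API; the function name and signature are hypothetical:

```python
import numpy as np

def lion_step(param, grad, momentum, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion update (hypothetical sketch of the published rule, not the repo's API).

    Lion takes the *sign* of an interpolation between the momentum and the
    current gradient, so every parameter moves by exactly +/- lr per step.
    """
    # Interpolate momentum and gradient, then keep only the sign.
    update = np.sign(beta1 * momentum + (1 - beta1) * grad)
    # Decoupled weight decay, as in AdamW.
    new_param = param - lr * (update + wd * param)
    # The momentum update uses a different interpolation coefficient (beta2).
    new_momentum = beta2 * momentum + (1 - beta2) * grad
    return new_param, new_momentum

# One step from zero: each weight moves by exactly lr in the sign direction.
p, m = lion_step(np.zeros(3), np.array([1.0, -2.0, 0.5]), np.zeros(3), lr=0.1)
# p is now [-0.1, 0.1, -0.1]
```

Because the whole update is elementwise sign/interpolate/subtract, there is little for XLA to fuse badly, which is why relying on the compiler rather than manual in-place ops is reasonable here.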

Installation

```bash
pip install git+https://github.com/Rocketknight1/lion-tf.git
```

Usage

```python
from lion_tf import Lion

model.compile(optimizer=Lion(1e-5))
```

Tips

Lion likes much lower learning rates than Adam - I'd suggest a factor of 10 lower as a good starting point. When fine-tuning pre-trained models, learning rates are already quite low, which means the optimal LR for Lion can be very low. I found 1e-5 or less worked well for fine-tuning BERT!
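The reason for the lower learning rate falls out of the update rule: since the update is a sign, every parameter moves by exactly ±lr per step regardless of gradient magnitude, whereas Adam's effective steps shrink as gradients shrink. The toy NumPy sketch below (hypothetical names, mirroring the published rule rather than this repo's code) demonstrates that a huge gradient and a tiny one produce identically sized steps:

```python
import numpy as np

def lion_sketch_step(param, grad, m, lr, beta1=0.9, beta2=0.99):
    # Sign-based update: step magnitude is exactly lr for every parameter,
    # no matter how small the gradient is (hypothetical sketch, not the repo API).
    new_param = param - lr * np.sign(beta1 * m + (1 - beta1) * grad)
    new_m = beta2 * m + (1 - beta2) * grad
    return new_param, new_m

param, m = np.array([0.0, 0.0]), np.zeros(2)
# First coordinate sees large gradients, second sees tiny ones.
for g in [np.array([-6.0, 0.004]), np.array([-5.8, 0.001])]:
    new_param, m = lion_sketch_step(param, g, m, lr=0.01)
    # Both coordinates moved by exactly lr in magnitude:
    assert np.allclose(np.abs(new_param - param), 0.01)
    param = new_param
```

This is why an LR that is safe for Adam can be too aggressive for Lion, especially near convergence during fine-tuning.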