transformerlens-model-table

Commits

List of commits on branch main.

  • 2031b0718a8bd337acae77b07cad0e80ce54f4fe: Merge branch 'main' of https://github.com/mivanit/transformerlens-model-table (mivanit, 3 months ago)
  • b18c3eda857c43e113ea91800bc40ddc7c249701: fix typing, mypy passes (mivanit, 3 months ago)
  • 804396c4aa2c2ab391e6c5a9589d20bcc3148ba9: run formatters (mivanit, 3 months ago)
  • 942ff273140c6583f7b4dde878952512ed49c771: add mixtral and Qwen2 model types (mivanit, 3 months ago)
  • 08c5c2fa9af9d64c4cc4fb5d6f6d8433d7053a1c: Apply automatic changes (mivanit, 3 months ago)
  • ba0cf18acac751a4f966c32f402ea3fb4abc72f2: restrict python because PyO3 fails on 3.13 (mivanit, 3 months ago)

README

This is a utility for viewing all models available in the TransformerLens package as a nice table, which you can view at miv.name/transformerlens-model-table/.

You can also get the table as:

The data for all of these should update daily at midnight UTC. If this appears not to have worked, or if there has been a new TransformerLens release and you want to force an update sooner, make any comment on this issue and the action should run automatically.

The tables contain:

  • all model names and aliases
  • model parameter counts
    • from model name
    • from (non-embedding) parameter count
    • (non-embedding) parameter count as a readable string
  • model type (e.g. gpt2, tiny-stories, pythia, etc.)
  • useful config parameters
    • number of layers, heads, d_model, d_vocab
    • type of positional embeddings, activation function, normalization, etc.
    • original architecture
  • full model config
  • shapes of all tensors in state dict, activation cache
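
To make the parameter-count columns concrete, here is a minimal sketch of how the three values could be derived. It is not the repository's actual code: the function names are hypothetical, the name parser is a simple regex guess, and the non-embedding count uses the common approximation of attention plus MLP weight matrices per layer (no biases, no layer norms).

```python
import re
from typing import Optional

# Multipliers for size suffixes that commonly appear in model names.
_SUFFIXES = {"k": 1e3, "m": 1e6, "b": 1e9}


def params_from_name(name: str) -> Optional[int]:
    """Guess a parameter count from a name such as 'pythia-70m'; None if no size is found."""
    match = re.search(r"(\d+(?:\.\d+)?)([kmb])\b", name.lower())
    if match is None:
        return None
    return round(float(match.group(1)) * _SUFFIXES[match.group(2)])


def non_embedding_params(n_layers: int, d_model: int, n_heads: int, d_head: int, d_mlp: int) -> int:
    """Approximate count (an assumption): Q, K, V, O matrices plus two MLP matrices per layer."""
    attn = 4 * d_model * n_heads * d_head
    mlp = 2 * d_model * d_mlp
    return n_layers * (attn + mlp)


def readable_count(n: int) -> str:
    """Format a count as a short string, e.g. 84934656 becomes '84.9M'."""
    for threshold, suffix in ((1e12, "T"), (1e9, "B"), (1e6, "M"), (1e3, "K")):
        if n >= threshold:
            text = f"{n / threshold:.1f}".rstrip("0").rstrip(".")
            return text + suffix
    return str(n)


# GPT-2 small dimensions: 12 layers, d_model 768, 12 heads of size 64, d_mlp 3072.
gpt2_small = non_embedding_params(n_layers=12, d_model=768, n_heads=12, d_head=64, d_mlp=3072)
print(gpt2_small, readable_count(gpt2_small))
print(params_from_name("pythia-70m"), params_from_name("gpt2-xl"))
```

The name-based and config-based counts can disagree, since a name usually advertises the total size while the config-based value here excludes embeddings, which is presumably why the table lists both.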

Interactive table features:

  • sort or filter by any value
  • rearrange columns
  • view pretty-printed tensor shapes
  • for longer elements (model cfg, tensor shapes):
    • hover to view
    • left click to copy
    • right click to open as plaintext in new tab
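
As an illustration of what pretty-printed tensor shapes might look like, the sketch below groups a flat mapping of dotted tensor names into an indented tree. The helper names are hypothetical and the output format is an assumption, not necessarily the one the table uses; the example shapes are illustrative values in the style of a GPT-2 small state dict.

```python
from typing import Dict, List, Sequence


def nest_shapes(flat: Dict[str, Sequence[int]]) -> dict:
    """Turn {'blocks.0.attn.W_Q': (12, 768, 64)} into nested dicts keyed by each name part."""
    tree: dict = {}
    for name, shape in flat.items():
        *parents, leaf = name.split(".")
        node = tree
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = list(shape)
    return tree


def format_shapes(tree: dict, indent: int = 0) -> List[str]:
    """Render the nested dict as indented lines, one tensor per line."""
    lines = []
    pad = "  " * indent
    for key, value in tree.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(format_shapes(value, indent + 1))
        else:
            dims = ", ".join(str(d) for d in value)
            lines.append(f"{pad}{key}: [{dims}]")
    return lines


# Illustrative shapes only (assumed, not read from a real model here).
example = {
    "embed.W_E": (50257, 768),
    "blocks.0.attn.W_Q": (12, 768, 64),
    "blocks.0.attn.b_Q": (12, 64),
    "blocks.0.mlp.W_in": (768, 3072),
}
pretty = format_shapes(nest_shapes(example))
print("\n".join(pretty))
```

The sketch assumes no tensor name is also a prefix of another tensor name; a real state dict with such a clash would need extra handling.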

screenshot of the table

pretty printing of tensor shapes

animated gif of table features

TODO:

  • [x] proper CLI interface for script
  • [x] truncated version (no config or tensor shapes) for the markdown table and csv, since those columns are not legible there anyway
    • [x] interface to choose which formats to write in
  • [x] add config as json to jsonl data
  • [ ] make PR to TransformerLens docs
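
The split between full and truncated outputs mentioned in the TODO items can be sketched as follows. This is a rough illustration using only the standard library; the column names `config` and `tensor_shapes` and both function names are hypothetical, and the real script may structure its rows differently.

```python
import csv
import io
import json
from typing import Dict, List

# Hypothetical names for the long columns that the truncated formats leave out.
LONG_COLUMNS = ("config", "tensor_shapes")


def to_jsonl(rows: List[Dict]) -> str:
    """Full output: one JSON object per line, with the config kept as nested JSON."""
    return "".join(json.dumps(row) + "\n" for row in rows)


def to_truncated_csv(rows: List[Dict]) -> str:
    """Truncated output: drop the long columns, then write plain CSV."""
    short_rows = [{k: v for k, v in row.items() if k not in LONG_COLUMNS} for row in rows]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(short_rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(short_rows)
    return buffer.getvalue()


# A single illustrative row; the values are examples, not data from the real table.
rows = [
    {
        "name": "gpt2-small",
        "n_layers": 12,
        "config": {"d_model": 768},
        "tensor_shapes": {"embed.W_E": [50257, 768]},
    }
]
jsonl_text = to_jsonl(rows)
csv_text = to_truncated_csv(rows)
print(csv_text)
```

Reading the JSONL back is one `json.loads` call per line, which keeps the nested config intact in a way that a CSV cell would not.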