transformerlens-model-table

public
3 stars
0 forks
1 issue

Commits

List of commits on branch main.
ec2cbd747e0b492b37dd8b3d3bad9c3f169ecb4d

Apply automatic changes

mivanit committed 3 months ago

46766f09f930b0cc6f31103bd64f6fe45d2f0ebf

Apply automatic changes

mivanit committed 3 months ago

edeabe4fda0198e62bf4e08d88aaa4af492849b3

dont use | for type hints (compatibility)

mivanit committed 3 months ago

8f1cb239e307f6f2da422ed50c165aeb06c220cf

Merge branch 'main' of https://github.com/mivanit/transformerlens-model-table

mivanit committed 3 months ago

bba4d05a6614bc273b7e1dccd7281dd14e9f4f2a

format index.html

mivanit committed 3 months ago

0fafae5e401a83d8a09579ae334209eaf11f7c76

Apply automatic changes

mivanit committed 3 months ago

README

[badge: TransformerLens version for last successful build] [badge: GitHub Workflow Status]

This is a utility for viewing all models available in the TransformerLens package as a nice table, which you can view at miv.name/transformerlens-model-table/.

You can also get the table as:

The data for all of these should update daily at midnight UTC. If this appears not to have worked, or if there has been a new TransformerLens release and you want to force an update sooner, make any comment on this issue and the action should run automatically.
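
If you work with the jsonl form of the data, filtering it needs nothing beyond the Python standard library. The sketch below is only an illustration, not code from this repository: the field names (`name`, `n_layers`, `d_model`) and the sample rows are assumptions made up for the example, so check them against the real file before relying on them.

```python
import json
from typing import Any, Callable, Dict, Iterable, List


def load_jsonl(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse JSON Lines text: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def filter_rows(
    rows: List[Dict[str, Any]],
    key: str,
    predicate: Callable[[Any], bool],
) -> List[Dict[str, Any]]:
    """Keep rows that have `key` and whose value satisfies `predicate`."""
    return [row for row in rows if key in row and predicate(row[key])]


# Hypothetical sample rows; the real file's field names may differ.
SAMPLE = """\
{"name": "model-a", "n_layers": 2, "d_model": 128}
{"name": "model-b", "n_layers": 12, "d_model": 768}

{"name": "model-c", "n_layers": 24, "d_model": 1024}
"""

rows = load_jsonl(SAMPLE.splitlines())
deep = filter_rows(rows, "n_layers", lambda n: n >= 12)
print([row["name"] for row in deep])
```

To use it on a downloaded file, pass an open file handle to `load_jsonl` instead of `SAMPLE.splitlines()`.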

The tables contain:

  • all model names and aliases
  • model parameter counts
    • from model name
    • from (non-embedding) parameter count
    • (non-embedding) parameter count as a readable string
  • model type (e.g. gpt2, tiny-stories, pythia)
  • useful config parameters
    • number of layers, heads, d_model, d_vocab
    • type of positional embeddings, activation function, normalization, etc.
    • original architecture
  • full model config
  • shapes of all tensors in state dict, activation cache
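
The two parameter-count columns and the readable string can be illustrated with a short sketch. This is not the script's actual implementation: the helper names `param_count_from_name` and `readable_param_count`, the regular expression, and the one-decimal rounding are all assumptions chosen for the example.

```python
import re
from typing import Optional

# Multipliers for the size suffixes commonly found in model names.
_SUFFIXES = {"k": 10**3, "m": 10**6, "b": 10**9}

# A number followed by k/m/b, not embedded in a longer alphanumeric token.
_SIZE_TOKEN = re.compile(r"(?<![a-z0-9.])(\d+(?:\.\d+)?)([kmb])(?![a-z0-9])")


def param_count_from_name(name: str) -> Optional[int]:
    """Guess a parameter count from a size token such as '70m' or '1.3b'.

    Returns None when the name carries no such token (e.g. 'gpt2-small').
    """
    match = _SIZE_TOKEN.search(name.lower())
    if match is None:
        return None
    return int(round(float(match.group(1)) * _SUFFIXES[match.group(2)]))


def readable_param_count(n: int) -> str:
    """Format a raw count with one decimal place and a K/M/B suffix."""
    for threshold, suffix in ((10**9, "B"), (10**6, "M"), (10**3, "K")):
        if n >= threshold:
            return f"{n / threshold:.1f}{suffix}"
    return str(n)


for model_name in ("pythia-70m", "opt-1.3b", "gpt2-small"):
    guess = param_count_from_name(model_name)
    label = readable_param_count(guess) if guess is not None else "unknown"
    print(model_name, guess, label)
```

Keeping the name-derived guess separate from the counted value is useful because the two can disagree: the size in a model name usually includes embedding parameters, while the counted column here does not.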

Interactive table features:

  • sort or filter by any value
  • rearrange columns
  • view pretty-printed tensor shapes
  • for longer elements (model cfg, tensor shapes):
    • hover to view
    • left click to copy
    • right click to open as plaintext in new tab
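
One plausible way to pretty-print tensor shapes is to group the dotted tensor names into an indented tree. The sketch below shows that idea under stated assumptions: the helpers `nest_by_dots` and `render` are hypothetical, the example names and shapes are illustrative only, and the table's real output format may differ. It also assumes no tensor name is a prefix of another.

```python
from typing import Any, Dict, List, Tuple

Shape = Tuple[int, ...]


def nest_by_dots(shapes: Dict[str, Shape]) -> Dict[str, Any]:
    """Turn {'a.b.c': shape} into nested dicts {'a': {'b': {'c': shape}}}."""
    tree: Dict[str, Any] = {}
    for name, shape in shapes.items():
        *parents, leaf = name.split(".")
        node = tree
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = shape
    return tree


def render(tree: Dict[str, Any], indent: int = 0) -> List[str]:
    """Render the nested dict as indented lines, one tensor per line."""
    lines: List[str] = []
    pad = "  " * indent
    for key, value in tree.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(render(value, indent + 1))
        else:
            dims = ", ".join(str(d) for d in value)
            lines.append(f"{pad}{key}: ({dims})")
    return lines


# Illustrative names and shapes only, not taken from the real table.
EXAMPLE: Dict[str, Shape] = {
    "embed.W_E": (50257, 768),
    "blocks.0.attn.W_Q": (12, 768, 64),
    "blocks.0.attn.b_Q": (12, 64),
}

pretty_lines = render(nest_by_dots(EXAMPLE))
print("\n".join(pretty_lines))
```

Grouping by the dotted prefix keeps repeated parts such as `blocks.0.attn` from being printed once per tensor, which is most of what makes long shape listings hard to read.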

[image: screenshot of the table]

[image: pretty printing of tensor shapes]

[image: animated gif of table features]

TODO:

  • [x] proper CLI interface for script
  • [x] truncated version (no config or tensor shapes) for the markdown table and csv (those columns are not legible there anyway)
    • [x] interface to choose which formats to write in
  • [x] add config as json to jsonl data
  • [ ] make PR to TransformerLens docs