
ng-video-lecture

public
3697 stars
975 forks
37 issues

Commits

List of commits on branch master.
Unverified
d5ae96fc27fc186140113e18d6fcb92dcb9cc085

rename the class it's a remnant of the past

karpathy committed 2 years ago
Verified
17115f78e63074d50d6a2af6933435f96d6e17ea

Merge pull request #6 from vineetm/master

karpathy committed 2 years ago
Unverified
988848a5f7613d33a6f7f12eed96c2d093759866

scale by head size

committed 2 years ago
Unverified
381c613da07c82ad27c8b06a01f93583318571b0

small readme file

karpathy committed 2 years ago
Unverified
3fdc1ea02f3849fd4b3c093430bc49c6fd6a4feb

rename v2 to gpt hah

karpathy committed 2 years ago
Unverified
d3df5cfa7424e44043b54299a98cb781a09f2086

might as well add these too

karpathy committed 2 years ago

README

The README file for this repository.

nanogpt-lecture

Code created in the Neural Networks: Zero To Hero video lecture series, specifically in the first lecture on nanoGPT. Published here as a GitHub repo so people can easily hack on it, walk through its git log history, etc.

NOTE: sadly I did not go too much into model initialization in the video lecture, but it is quite important for good performance. The current code will train and work fine, but it converges more slowly because it starts off in a not-great spot in weight space. Please see nanoGPT's model.py, at the "# init all weights" comment, and especially how it calls the _init_weights function. Even more sadly, the code in this repo is a bit different in how it names and stores the various modules, so it's not possible to directly copy-paste that code here. My current plan is to publish a supplementary video lecture covering these parts, and then push the exact code changes to this repo. For now I'm keeping it as is, so that it is almost exactly what we actually covered in the video.
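For readers who want a feel for the pattern the note points to, here is a minimal sketch, not the code from this repo or the exact nanoGPT code. It assumes an initialization function that draws Linear and Embedding weights from a normal distribution with a small standard deviation (0.02 is assumed here) and zeroes the biases, applied to every submodule through PyTorch's `nn.Module.apply`. `TinyModel`, `init_weights`, and the module names inside it are hypothetical stand-ins, so adapting this to the repo's own modules would still require matching its naming.

```python
import torch
import torch.nn as nn


def init_weights(module: nn.Module) -> None:
    """Illustrative init: small normal weights, zero biases (std=0.02 is an assumption)."""
    if isinstance(module, nn.Linear):
        nn.init.normal_(module.weight, mean=0.0, std=0.02)
        if module.bias is not None:
            nn.init.zeros_(module.bias)
    elif isinstance(module, nn.Embedding):
        nn.init.normal_(module.weight, mean=0.0, std=0.02)


class TinyModel(nn.Module):
    """Hypothetical stand-in model, not the one defined in this repo."""

    def __init__(self, vocab_size: int = 65, n_embd: int = 32):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, n_embd)
        self.lm_head = nn.Linear(n_embd, vocab_size)
        # apply() calls init_weights on every submodule, then on the model itself
        self.apply(init_weights)

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        # (batch, time) token ids -> (batch, time, vocab_size) logits
        return self.lm_head(self.token_embedding(idx))


torch.manual_seed(0)
model = TinyModel()
logits = model(torch.zeros((2, 8), dtype=torch.long))
```

With PyTorch's default initialization, an `nn.Embedding` starts with unit-variance weights, so the first logits can be much larger than after this kind of init; starting with small weights keeps the initial predictions closer to uniform, which is one plausible reading of the "not-great spot in weight space" remark.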

License

MIT