Starts with building micrograd to develop an understanding of how PyTorch computes gradients, then proceeds all the way to building a GPT-2 clone.
Looks like this is an effort to reorganize and build on that existing work.
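As a sketch of the micrograd idea mentioned above, here is a minimal scalar autograd engine in the micrograd style — a toy illustration of how PyTorch-like frameworks compute gradients via reverse-mode backpropagation, not the actual micrograd or PyTorch implementation. All class and variable names here are illustrative.

```python
# Minimal micrograd-style scalar autograd: each Value records its inputs
# and a closure that propagates gradients back to them (chain rule).
class Value:
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the compute graph, then apply the
        # chain rule from the output back to every input.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# z = x*y + x, so dz/dx = y + 1 and dz/dy = x
x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```

PyTorch does the same thing at tensor granularity (`loss.backward()` populating `.grad` on leaf tensors); building it for scalars first is what makes the mechanism transparent.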