A GPT-style decoder-only transformer language model built from scratch with 190M parameters. Trained on 474M tokens from TinyStories, implementing modern features like RMSNorm and SwiGLU.
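As a rough illustration of the two components named above, here is a minimal sketch of the arithmetic behind RMSNorm and a SwiGLU feed-forward layer. It is written in dependency-free Python on plain lists, not the project's PyTorch code, and every name in it (`rms_norm`, `swiglu_ffn`, `w_gate`, `w_up`, `w_down`, the `eps` default) is an assumption chosen for illustration.

```python
import math


def rms_norm(x, gain, eps=1e-6):
    """Divide x by its root mean square, then apply a per-feature gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * g for v, g in zip(x, gain)]


def silu(z):
    """SiLU (swish) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(v) for v in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


if __name__ == "__main__":
    # Toy sizes: model width 2, hidden width 3 (hypothetical values).
    x = rms_norm([3.0, 4.0], gain=[1.0, 1.0])
    w_gate = [[0.5, -0.5], [1.0, 0.0], [0.0, 1.0]]
    w_up = [[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
    w_down = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
    print(swiglu_ffn(x, w_gate, w_up, w_down))
```

Unlike LayerNorm, RMSNorm skips mean subtraction and the bias term, and the SwiGLU layer uses three weight matrices instead of the two in a standard MLP block. A real model would apply both per token on batched tensors.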
Role
Personal Project
Year
2025
Status
Complete
Stack
PyTorch, Transformers, GPT, Language Modeling, LLM Architecture