This repository consists of various projects/ideas that I've tinkered with and blogged about on my website or Medium.
Most of the projects are structured as self-contained Jupyter notebooks. Here's a brief overview of each project and its accompanying blog post:
This project provides an implementation of the GRPO (Group Relative Policy Optimization) algorithm and a trainer for training reasoning Large Language Models (LLMs) without relying on libraries like TRL or veRL. It includes dataset processing, reward functions, and training logic built with PyTorch Lightning and Hugging Face Transformers.
Blog Post: Simple GRPO Trainer
Installation and usage instructions can be found in the sub-project's README.
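For a rough sense of the core idea, GRPO replaces a learned value baseline with rewards standardized within a group of completions sampled for the same prompt. The snippet below is only an illustrative sketch of that computation (the function name and values are made up; see the sub-project for the actual implementation):

```python
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """Sketch of GRPO-style advantages for one prompt.

    `rewards` holds the scalar reward of each of the G completions sampled
    for the same prompt. Each completion's advantage is its reward
    standardized against the group mean and std, so no value network is needed.
    """
    mean = rewards.mean()
    std = rewards.std()
    return (rewards - mean) / (std + eps)

# Example: 4 completions sampled for one prompt, scored by some reward function.
rewards = torch.tensor([1.0, 0.0, 0.5, 1.0])
print(group_relative_advantages(rewards))
```

These per-completion advantages are then used to weight the token log-probabilities in the clipped policy-gradient loss.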
Mechanistic interpretability is an emerging area of AI research focused on understanding the inner workings of neural networks. LLMs and diffusion models have taken the world by storm in the past couple of years, but despite their jaw-dropping capabilities, very little is known about how and why these deep neural networks generate the outputs they do.
In this notebook we'll attempt to break down some of the key ideas of Mechanistic Interpretability (mech interp). We haven't found many good resources for learning the fundamentals of mech interp; there's a certain irony in how dense the literature is for a field that aims to make neural nets easier to understand 😅.
As a novice diving into this area of research, my goal is to improve my own understanding of the topic as I learn, and hopefully make it easier for others to learn too. The initial articles are heavily based on a blog post released by Chris Olah and team in 2022, and the code in this series largely derives from it as well.
I found the blog quite dense for a newbie to follow, so my aim is to dumb it down as much as possible. A word of caution: even this series expects readers to have a good understanding of ML and how to train deep neural networks. If you've completed an ML 101 class in your schooling, you should have no trouble following these articles.
Link to blog
This notebook is associated with our Medium article here, where we detail our exploration of encoder-only and decoder-only Transformer models and the existence of sink tokens in them.
At a high level, sink tokens are a small group of tokens to which transformer models offload a disproportionately high share of their attention scores. For more, please read our article.
In this notebook we'll visualize the attention scores of various models and identify the tokens that receive the highest share of attention, on a layer-by-layer basis. We'll show how sink tokens are prominent in both encoder and decoder models of all sizes.
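As a rough illustration of this kind of analysis (not the notebook's exact code), here's a minimal sketch that pulls per-layer attention maps out of a Hugging Face model and reports which token receives the most attention in each layer; the model and example sentence are arbitrary choices for this sketch:

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)

text = "Attention sinks show up in surprisingly many transformer models."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for layer, attn in enumerate(outputs.attentions):  # attn: (batch, heads, query, key)
    received = attn[0].mean(dim=0).sum(dim=0)       # total attention received per key token
    top = received.argmax().item()
    print(f"layer {layer:2d}: '{tokens[top]}' receives the most attention")
```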
In our prior notebook, we found that both encoder-only and decoder-only models offload a significant portion of their attention scores to sink tokens. We identified that these sink tokens tend to be either special tokens like [CLS] and [SEP] or tokens corresponding to punctuation. The consistent display of this phenomenon across model architectures and inputs makes one question the relevance of dense self-attention.
In this notebook we'll explore the performance of BERT when using custom attention masks that are sparse in nature. We'll create a unique mask for each token, where every token attends to the special tokens and to the k tokens in its neighborhood. When visualized, this forms a diagonal band of width 2*k+1, with the first and last tokens (in the case of BERT) also being attended to. We'll also explore the effects of allowing dense attention in some layers and sparse attention in the rest.
We'll assess the downstream performance of models that use this type of custom attention mask on some commonly used benchmark datasets like [TBD]. Our resulting article can be found here.
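For a concrete sense of the masking scheme described above, here's a minimal, illustrative sketch (not the notebook's code) of constructing such a mask: a diagonal band of width 2*k+1 plus columns for the special-token positions, with the function name and positions chosen purely for this example:

```python
import torch

def local_plus_special_mask(seq_len: int, k: int, special_positions=(0,)) -> torch.Tensor:
    """Sparse attention mask: each token attends to its local neighborhood plus special tokens.

    Each query token may attend to the k tokens on either side of it
    (a diagonal band of width 2*k + 1) plus a fixed set of special-token
    positions (e.g. [CLS] at 0 and [SEP] at seq_len - 1 for BERT).
    Returns a (seq_len, seq_len) 0/1 mask where 1 means "may attend".
    """
    idx = torch.arange(seq_len)
    band = (idx[:, None] - idx[None, :]).abs() <= k    # local neighborhood
    mask = band.clone()
    for pos in special_positions:
        mask[:, pos] = True                             # everyone attends to special tokens
    return mask.long()

# Example: 8 tokens, k=1, with positions 0 and 7 (e.g. [CLS] and [SEP]) attended to by all tokens.
print(local_plus_special_mask(8, k=1, special_positions=(0, 7)))
```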
This notebook trains a reasoning model using GRPO on a dataset of riddles via Unsloth and TRL.
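As a rough sketch of what such a setup can look like with TRL's GRPOTrainer (the riddle dataset, reward function, and model id below are hypothetical, the Unsloth model-loading step is omitted, and the notebook's actual code may differ):

```python
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Hypothetical riddle dataset: prompts plus expected answers.
dataset = Dataset.from_list([
    {"prompt": "Riddle: What has keys but can't open locks? Answer: ", "answer": "a piano"},
    {"prompt": "Riddle: What gets wetter the more it dries? Answer: ", "answer": "a towel"},
])

def correctness_reward(completions, answer, **kwargs):
    # Reward 1.0 when the expected answer string appears in the completion, else 0.0.
    return [float(a.lower() in c.lower()) for c, a in zip(completions, answer)]

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # placeholder; the notebook loads its model via Unsloth
    reward_funcs=correctness_reward,
    args=GRPOConfig(output_dir="grpo-riddles"),
    train_dataset=dataset,
)
trainer.train()
```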