
Shivnath Tathe

Independent AI Researcher | 4-bit Neural Networks | Training LLMs from Scratch

arXiv · Zenodo · Google Scholar · LinkedIn · HuggingFace


About

I'm Shivnath Tathe, a Software Engineer at ISG eSolutions and an independent AI researcher from India. I work on training neural networks at extremely low precision, showing that 4-bit quantized models can match full-precision accuracy without expensive GPUs.

My first paper on arXiv demonstrates training a convolutional network from scratch at true 4-bit precision on a standard CPU, achieving 92.34% on CIFAR-10 with 8x memory compression. I'm currently building T4NT, a 1.5B parameter multilingual Indian language model trained from scratch on 10 languages using 4-bit quantization-aware training with tanh soft clipping.

I believe powerful AI should not require powerful hardware.


Publications

| Paper | Venue | Links |
| --- | --- | --- |
| True 4-Bit Quantized Convolutional Neural Network Training on CPU: Achieving Full-Precision Parity | arXiv (cs.LG) | Paper · Code |
| Autonomous Tool-Creation in AI Agents: A Conceptual Framework for Self-Evolving Systems | Zenodo | Paper · Code |

Research

4-bit Quantization-Aware Training

  • Trained VGG-style networks at true 4-bit precision from scratch using symmetric quantization + straight-through estimators
  • 92.34% on CIFAR-10 (0.16% gap from full-precision) with 8x memory compression
  • Validated on CIFAR-100 (70.94%) and on mobile (OnePlus 9R, 83.16% in 6 epochs)
  • No specialized GPU kernels. Standard PyTorch on CPU.
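The bullets above describe fake-quantizing weights with symmetric 4-bit quantization and a straight-through estimator (STE). A minimal PyTorch sketch of that step, assuming a per-tensor scale taken from the max weight magnitude (the paper's exact formulation may differ):

```python
import torch

def quantize_4bit_symmetric(w: torch.Tensor) -> torch.Tensor:
    """Fake-quantize a weight tensor to 16 symmetric 4-bit levels."""
    scale = w.detach().abs().max() / 7.0             # per-tensor symmetric scale
    q = torch.clamp(torch.round(w / scale), -8, 7)   # integer levels in [-8, 7]
    w_q = q * scale                                  # dequantize back to float
    # Straight-through estimator: forward pass uses the quantized values,
    # backward pass routes gradients to the full-precision weights unchanged.
    return w + (w_q - w).detach()
```

In training, a layer would call this on its weights each forward pass while the optimizer updates the underlying full-precision copy; rounding is only "undone" at export time, when the integer levels are stored directly.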

T4NT-1.5B (in progress)

  • Multilingual Indian LLM trained from scratch on 10 languages
  • Architecture: RMSNorm + RoPE + SwiGLU + 4-bit QAT + Tanh Soft Clipping
  • Custom SentencePiece tokenizer (65K vocab) covering Devanagari, Bengali, Tamil, Telugu, Kannada, Malayalam, Gurmukhi scripts
  • Training on Kaggle TPU v5e-8
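The "tanh soft clipping" named in the architecture line can be sketched as bounding weights with a smooth tanh instead of a hard clamp before quantizing, so out-of-range weights keep a nonzero gradient. This is an illustrative guess at the idea, not the T4NT implementation; the `alpha` sharpness knob is hypothetical:

```python
import torch

def tanh_soft_clip_quantize(w: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    """4-bit fake quantization with tanh soft clipping (illustrative sketch).

    A hard clamp zeroes gradients for weights outside the clip range; tanh
    squashes them smoothly, so every weight still receives gradient signal.
    `alpha` (assumed, not from the source) controls how sharply tanh saturates.
    """
    soft = torch.tanh(alpha * w)                       # smoothly bounded to (-1, 1)
    scale = 1.0 / 7.0                                  # map (-1, 1) onto levels [-8, 7]
    q = torch.clamp(torch.round(soft / scale), -8, 7)
    w_q = q * scale
    return soft + (w_q - soft).detach()                # STE through the rounding step
```

The design trade-off: the hard clamp's gradient is exactly zero past the boundary, while here it decays like `alpha * (1 - tanh²(alpha·w))`, which stays positive everywhere.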

Projects

| Project | Description |
| --- | --- |
| true-4bit-training | 4-bit QAT with tanh soft clipping · arXiv published |
| DevShakti Offline RAG | React Native + GGUF, fully offline on-device LLM chatbot |
| AgentForge | LangChain/CrewAI agents that build their own tools |

Tech

Research : PyTorch, Quantization, QAT, STE, LoRA, PEFT
Models   : llama.cpp, GGUF, HuggingFace Transformers
Frontend : React Native, Angular, Electron
Backend  : FastAPI, Node.js
Infra    : Kaggle TPU, Google Colab, Linux

Stats

GitHub Stats

Streak Stats


arXiv:2603.13931 · DOI:10.5281/zenodo.15272894

Pinned Loading

  1. ai-for-everyone ai-for-everyone Public

    N/A

    2