JinjieNi

💭 Working

Jinjie Ni (JinjieNi)

🔨 Researching next-generation modeling paradigms and building scalable foundation model systems

73 followers · 16 following

Pinned

MegaDLMs (Public)

GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 training.

Python · 12 stars · 2 forks

XueFuzhao/OpenMoE (Public)

A family of open-source Mixture-of-Experts (MoE) large language models

Python · 1.6k stars · 84 forks

NVIDIA/Megatron-LM (Public)

Ongoing research training transformer models at scale

Python · 14k stars · 3.2k forks

MixEval (Public)

The official evaluation suite and dynamic data release for MixEval.

Python · 250 stars · 41 forks

deepseek-ai/DeepSeek-MoE (Public)

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Python · 1.8k stars · 294 forks

EvolvingLMMs-Lab/lmms-engine (Public)

A simple, unified training engine for multimodal models. Lean, flexible, and built for hacking at scale.

Python · 453 stars · 15 forks