JinjieNi
Pinned

  1. MegaDLMs Public

    GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 training.

    Python, 12 stars, 2 forks

  2. XueFuzhao/OpenMoE Public

    A family of open-sourced Mixture-of-Experts (MoE) Large Language Models

    Python, 1.6k stars, 84 forks

  3. NVIDIA/Megatron-LM Public

    Ongoing research training transformer models at scale

    Python, 14k stars, 3.2k forks

  4. MixEval Public

    The official evaluation suite and dynamic data release for MixEval.

    Python, 250 stars, 41 forks

  5. deepseek-ai/DeepSeek-MoE Public

    DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

    Python, 1.8k stars, 294 forks

  6. EvolvingLMMs-Lab/lmms-engine Public

    A simple, unified multimodal models training engine. Lean, flexible, and built for hacking at scale.

    Python, 453 stars, 15 forks