What can we do for the default backend?
- Model: let's start with Llama 3.2 1B, preferably a quantized variant.
- Runtime candidates:
  - MediaPipe and AI-Edge-Torch: our current default backend is TFLite-based, so staying with a TFLite-based solution would be convenient.
  - The new LiteRT API (https://ai.google.dev/edge/litert)
  - ExecuTorch: https://github.com/pytorch/executorch (Llama example: https://github.com/pytorch/executorch/blob/main/examples/models/llama/README.md)
  - llama.cpp: https://github.com/ggerganov/llama.cpp
  - ONNX Runtime: https://github.com/microsoft/onnxruntime; we should check whether it works on Android.
  - ONNX Runtime GenAI: https://github.com/microsoft/onnxruntime-genai
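Whichever runtime is chosen, it may help to keep the decision behind a small interface so candidate backends can be swapped in and compared during evaluation. A minimal sketch of that idea in Python (all class, function, and file names here are hypothetical, not from this repo or any of the runtimes above):

```python
from abc import ABC, abstractmethod

class LlmBackend(ABC):
    """Hypothetical interface each candidate runtime would implement."""

    @abstractmethod
    def load(self, model_path: str) -> None:
        """Load model weights from the given path."""

    @abstractmethod
    def generate(self, prompt: str, max_tokens: int = 128) -> str:
        """Produce a completion for the prompt."""

class EchoBackend(LlmBackend):
    """Stand-in backend used only to illustrate the interface shape."""

    def load(self, model_path: str) -> None:
        self.model_path = model_path

    def generate(self, prompt: str, max_tokens: int = 128) -> str:
        # A real backend (LiteRT, ExecuTorch, llama.cpp, ...) would run
        # inference here; this stub just echoes the prompt back.
        return f"[{self.model_path}] {prompt[:max_tokens]}"

# Registry keyed by name, so the default backend can be changed in one
# place without touching call sites.
BACKENDS: dict[str, type[LlmBackend]] = {"echo": EchoBackend}

def make_backend(name: str) -> LlmBackend:
    return BACKENDS[name]()
```

Usage would then be the same regardless of which runtime ends up as the default, e.g. `make_backend("echo").load("llama-3.2-1b-q4.bin")` followed by `generate(...)`.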