llama reporting tracking #383

@saienduri

We are tracking several llama configurations (8b, 70b, 405b), each with different attention lowerings, tensor parallelism settings, and batch sizes. With so many configurations, it is important that each one is well tested and that there is an easy way to track its current status. To that end, Avi and I have been iterating on a few tests and getting initial reports up (nod-ai/shark-ai#284, nod-ai/shark-ai#321, nod-ai/shark-ai#363, nod-ai/shark-ai#322, nod-ai/shark-ai#414).
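
To give a feel for how quickly the matrix grows, here is a minimal sketch that enumerates the combinations and gives each one a stable name for a test ID or report row. Only the model sizes come from this issue; the attention lowering names, tensor parallelism degrees, batch sizes, and the helper names `enumerate_configs` and `config_id` are hypothetical placeholders, not the set actually being tracked.

```python
from itertools import product

# Model sizes are taken from the issue description.
MODEL_SIZES = ["8b", "70b", "405b"]

# Everything below is a hypothetical placeholder, not the real tracked set.
ATTENTION_LOWERINGS = ["lowering_a", "lowering_b"]
TENSOR_PARALLEL_SIZES = [1, 8]
BATCH_SIZES = [1, 4]


def enumerate_configs():
    """Return one dict per (model, attention, tp, batch size) combination."""
    return [
        {"model": m, "attention": a, "tp": tp, "bs": bs}
        for m, a, tp, bs in product(
            MODEL_SIZES, ATTENTION_LOWERINGS, TENSOR_PARALLEL_SIZES, BATCH_SIZES
        )
    ]


def config_id(cfg):
    """Build a readable name usable as a test ID or report row key."""
    return f"llama_{cfg['model']}_{cfg['attention']}_tp{cfg['tp']}_bs{cfg['bs']}"


if __name__ == "__main__":
    for cfg in enumerate_configs():
        print(config_id(cfg))
```

Even with two placeholder values per axis, three model sizes produce 24 combinations, which is why a single report covering every configuration is worth having.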

Remaining work:
