llama reporting tracking

We are tracking several llama configurations (8b, 70b, 405b), each with different attention lowerings, tensor parallelism settings, and batch sizes. With this many configurations, each one needs to be well tested, and we need an easy way to see its current status. To that end, Avi and I have been iterating on a few tests and getting initial reports up (https://github.com/nod-ai/SHARK-Platform/pull/284, https://github.com/nod-ai/SHARK-Platform/pull/321, https://github.com/nod-ai/SHARK-Platform/pull/363, https://github.com/nod-ai/SHARK-Platform/pull/322, https://github.com/nod-ai/SHARK-Platform/pull/414).
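To make the size of the configuration matrix concrete, here is a minimal sketch of how the combinations could be enumerated and summarized in a status report. Only the model sizes (8b, 70b, 405b) come from this issue. The attention lowering names, tensor parallelism sizes, batch sizes, and the `build_matrix`, `config_id`, and `status_report` helpers are hypothetical placeholders, not the actual test setup in the linked PRs.

```python
from itertools import product

# Model sizes are taken from the issue text.
MODEL_SIZES = ["8b", "70b", "405b"]
# Everything below is a hypothetical placeholder, not the real set of values.
ATTENTION_LOWERINGS = ["lowering_a", "lowering_b"]
TENSOR_PARALLEL_SIZES = [1, 8]
BATCH_SIZES = [1, 4]


def build_matrix():
    """Return one dict per combination of the four axes."""
    return [
        {"model": model, "attention": attention, "tp": tp, "batch_size": bs}
        for model, attention, tp, bs in product(
            MODEL_SIZES, ATTENTION_LOWERINGS, TENSOR_PARALLEL_SIZES, BATCH_SIZES
        )
    ]


def config_id(cfg):
    """Build a stable, readable identifier for a configuration."""
    return (
        f"llama-{cfg['model']}-{cfg['attention']}"
        f"-tp{cfg['tp']}-bs{cfg['batch_size']}"
    )


def status_report(results):
    """Map every configuration to its recorded status, or 'not run'.

    `results` is a dict from config id to a status string such as
    'pass' or 'fail' (a hypothetical format for this sketch).
    """
    return {
        config_id(cfg): results.get(config_id(cfg), "not run")
        for cfg in build_matrix()
    }


if __name__ == "__main__":
    example_results = {"llama-8b-lowering_a-tp1-bs1": "pass"}
    for name, status in status_report(example_results).items():
        print(f"{name}: {status}")
```

Even with only two placeholder values per axis, the matrix already has 24 entries, which is why a single generated report is easier to maintain than per-configuration tracking by hand.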

Remaining work:

- Land https://github.com/nod-ai/SHARK-Platform/pull/387
- Iterate on expected output/accuracy validation


llama reporting tracking #383
