Open
Description
We are tracking several configurations for llama (8b, 70b, 405b), each with different lowerings for attention, tensor parallelism, and batch size. With so many configurations, it is important to have them all well tested and to have an easy way to track the current status of each one. To that end, Avi and I have been iterating on a few tests and getting initial reports up (nod-ai/shark-ai#284, nod-ai/shark-ai#321, nod-ai/shark-ai#363, nod-ai/shark-ai#322, nod-ai/shark-ai#414).
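To give a sense of how quickly the matrix grows, here is a minimal sketch that enumerates it. Only the model sizes and the presubmit/nightly split come from this issue; the attention lowering names, tensor parallelism sizes, batch sizes, and the `LlamaConfig` helper are hypothetical placeholders, not the values or code used in the actual tests.

```python
from dataclasses import dataclass
from itertools import product

# Model sizes are taken from the issue; every other axis below is a placeholder.
MODEL_SIZES = ("8b", "70b", "405b")
ATTENTION_LOWERINGS = ("lowering_a", "lowering_b")  # hypothetical names
TENSOR_PARALLEL_SIZES = (1, 8)  # hypothetical values
BATCH_SIZES = (1, 4)  # hypothetical values


@dataclass(frozen=True)
class LlamaConfig:
    """One point in the test matrix (hypothetical helper)."""

    model_size: str
    attention: str
    tensor_parallel: int
    batch_size: int

    @property
    def test_id(self) -> str:
        # Stable identifier that can be used as a row key in a status report.
        return (
            f"llama_{self.model_size}_{self.attention}"
            f"_tp{self.tensor_parallel}_bs{self.batch_size}"
        )

    @property
    def schedule(self) -> str:
        # Mirrors the plan in this issue: 8b on presubmit, 70b and 405b nightly.
        return "presubmit" if self.model_size == "8b" else "nightly"


def all_configs() -> list[LlamaConfig]:
    """Return the full cross product of the axes above."""
    axes = (MODEL_SIZES, ATTENTION_LOWERINGS, TENSOR_PARALLEL_SIZES, BATCH_SIZES)
    return [LlamaConfig(*combo) for combo in product(*axes)]


if __name__ == "__main__":
    for cfg in all_configs():
        print(f"{cfg.schedule:<9} {cfg.test_id}")
```

Even with two placeholder values per axis, this yields 24 configurations, which is why a per-configuration status report is worth having.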
Remaining work:
- Land "Update yml file to run 8b tests on presubmit and 70b and 405b tests nightly" (shark-ai#387)
- Iterate on expected output/accuracy validation