Conversation

ArkaMukherjee0
Contributor

@ArkaMukherjee0 ArkaMukherjee0 commented May 14, 2025

Added Radeon RX 9000 series GPUs to packages/tasks/src/hardware.ts

  1. AMD Radeon RX 9070 XT 16 GB - 97.32 TFLOPS; Source: https://www.techpowerup.com/gpu-specs/radeon-rx-9070-xt.c4229
  2. AMD Radeon RX 9070 16 GB - 72.25 TFLOPS; Source: https://www.techpowerup.com/gpu-specs/radeon-rx-9070.c4250

Added Ryzen 9000 series (Zen 5) CPUs to packages/tasks/src/hardware.ts

  1. Ryzen Zen5 9000 (Ryzen 9) - 0.56 TFLOPS
  2. Ryzen Zen5 9000 (Ryzen 7) - 0.56 TFLOPS
  3. Ryzen Zen5 9000 (Ryzen 5) - 0.56 TFLOPS
    Source for TFLOPS count: https://www.cpu-monkey.com/en/cpu-amd_ryzen_9_9950x3d

@ArkaMukherjee0 ArkaMukherjee0 marked this pull request as ready for review May 14, 2025 13:01
Member

@pcuenca pcuenca left a comment

Thank you! Not sure how to distinguish between iGPU and CPU figures for processors with an integrated GPU, but I think we can go with this for now.

Comment on lines +601 to +609
"Ryzen Zen5 9000 (Ryzen 9)": {
	tflops: 0.56,
},
"Ryzen Zen5 9000 (Ryzen 7)": {
	tflops: 0.56,
},
"Ryzen Zen5 9000 (Ryzen 5)": {
	tflops: 0.56,
},
Member

These correspond to the integrated GPU, not the CPU; but I guess it's ok.

Contributor Author

@ArkaMukherjee0 ArkaMukherjee0 May 15, 2025

There's no reliable proxy for CPU performance that I can think of if we want to equate them to GPUs. INT8 / INT4 performance would have been a better metric, but then we'd end up with two different quantities. Best to leave CPUs out of the question.

Also, another problem I noticed: all Ryzen 9s are grouped together, even though chips like the Ryzen 9 9900X and the Ryzen 9 9950X3D differ substantially in performance. For example, I tested the cheapest and the most expensive Ryzen 9:

DeBERTa inference:

  • Ryzen 9 9900X: 0.7947s
  • Ryzen 9 9950X3D: 0.6674s (19% faster)

Granite Vision 3.2 2B 4-bit inference (image+text input):

  • Ryzen 9 9900X: 32.4109s
  • Ryzen 9 9950X3D: 30.5195s (6.2% faster)

With text-only input, the 9950X3D reached 28.01 tok/s with this model. That's already about 1/5th of what the RTX 5060 Ti 16 GB did (137.89 tok/s) in the same test, yet by going with TFLOPS we currently credit that card with 42.32x the compute power.

Hence, the 0.56 TFLOPS number for all Ryzen 9000 and 7000 series CPUs is neither fair nor comparable with the GPU figures.
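
To make the mismatch concrete, here is a rough sketch of the arithmetic using only the numbers quoted in this comment. The variable names are purely illustrative and are not part of hardware.ts.

```typescript
// Figures quoted in this comment; names are illustrative only.
const cpuTokPerSec = 28.01; // Ryzen 9 9950X3D, text-only input
const gpuTokPerSec = 137.89; // RTX 5060 Ti 16 GB, same test
const tflopsRatio = 42.32; // GPU-to-CPU ratio implied by the listed TFLOPS

// Ratio actually observed in tokens per second (roughly 5x).
const measuredRatio = gpuTokPerSec / cpuTokPerSec;

// How much the TFLOPS-based ratio overstates the measured gap (roughly 8-9x).
const overstatement = tflopsRatio / measuredRatio;

console.log(`measured GPU/CPU ratio: ${measuredRatio.toFixed(2)}x`);
console.log(`TFLOPS ratio overstates the gap by: ${overstatement.toFixed(1)}x`);
```

This is a single model on a single test, so it only indicates the size of the gap and should not be read as a general conversion factor.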

Member

Exactly, those tokens/second are probably coming from CPU inference, which is why I think it's interesting to use CPU numbers if we can (it's what people will use for ML inference). The problem is that CPU manufacturers don't usually publish TFLOPS. We can compute them from the number of cores, the clock frequency, and the number of "representative" operations (multiply-add, for example) per clock cycle. The latter, however, is difficult to find and requires digging into spec sheets; see this comment for an example.
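
As a minimal sketch of that calculation: peak TFLOPS is approximately cores × clock (GHz) × FLOPs per core per cycle, divided by 1000. The `CpuSpec` type, the `peakTflops` helper, and the example values below are all hypothetical; in particular, the per-cycle figure is an assumption and not a published number for any specific chip.

```typescript
// Hypothetical helper, not part of hardware.ts.
interface CpuSpec {
  cores: number;
  clockGHz: number; // sustained all-core clock, in GHz
  flopsPerCyclePerCore: number; // e.g. SIMD lanes x FMA units x 2 (multiply + add)
}

function peakTflops(spec: CpuSpec): number {
  // cores * GHz * FLOPs/cycle yields GFLOPS; divide by 1000 for TFLOPS.
  return (spec.cores * spec.clockGHz * spec.flopsPerCyclePerCore) / 1000;
}

// Assumed values for illustration only: 16 cores at 4.0 GHz,
// 32 FP32 FLOPs per cycle per core.
const example: CpuSpec = { cores: 16, clockGHz: 4.0, flopsPerCyclePerCore: 32 };
console.log(`${peakTflops(example).toFixed(2)} TFLOPS (theoretical peak)`);
```

The result is a theoretical upper bound; sustained clocks under vector load and memory bandwidth usually keep real inference throughput well below it.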

I would suggest we merge this PR and revisit the CPU numbers you reported in a follow-up. Personally, I'm happy to include an estimate based on comparative performance if we can't find anything else!
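
One possible shape for such a comparative estimate, sketched under loud assumptions: scale a reference device's listed TFLOPS by the ratio of measured throughput. The `comparativeTflops` function is hypothetical, and the reference value is simply what the 42.32x ratio and the 0.56 TFLOPS entry from this thread imply; it has not been checked against hardware.ts.

```typescript
// Hypothetical helper: scale a reference TFLOPS figure by relative throughput.
function comparativeTflops(
  referenceTflops: number,
  referenceTokPerSec: number,
  measuredTokPerSec: number
): number {
  return referenceTflops * (measuredTokPerSec / referenceTokPerSec);
}

// Reference TFLOPS implied by the thread (42.32 x 0.56); an assumption, not a verified spec.
const impliedGpuTflops = 42.32 * 0.56;

// Tokens/second figures reported above for the same text-only test.
const estimate = comparativeTflops(impliedGpuTflops, 137.89, 28.01);
console.log(`comparative estimate: ${estimate.toFixed(2)} TFLOPS-equivalent`);
```

An estimate like this depends on the model, quantization, and runtime used for the measurement, so it would need several benchmarks before being treated as representative.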

Contributor Author

@ArkaMukherjee0 ArkaMukherjee0 May 15, 2025

Sure, that sounds good to me. We can revisit this in a separate conversation. Do let me know the best medium for the exchange.

Member

A draft PR or an issue is good imo! 🤗

@pcuenca pcuenca merged commit 96b48c8 into huggingface:main May 16, 2025
4 checks passed

3 participants