wrong TFLOP/s with --accurate-pp in qemu user-mode emulation #293

Closed
lilyanatia opened this issue Sep 25, 2024 · 2 comments

Comments

@lilyanatia

correct output:

$ cpufetch --accurate-pp



                                                              Name:                Intel Xeon E5-4657L v2
                                                              Microarchitecture:   Ivy Bridge
                                                              Technology:          22nm
                                                              Max Frequency:       2.900 GHz
                                                              Sockets:             4
                                                              Cores:               12 cores (24 threads)
                                                              Cores (Total):       48 cores (96 threads)
                                                              AVX:                 AVX
                                                              FMA:                 No
                                                              L1i Size:            32KB (1.5MB Total)
                                                              L1d Size:            32KB (1.5MB Total)
                                                              L2 Size:             256KB (12MB Total)
                                                              L3 Size:             30MB (120MB Total)
                                                              Peak Performance:    1.04 TFLOP/s

running in qemu-x86_64 results in 4x the TFLOP/s:

$ qemu-x86_64 /usr/bin/cpufetch --accurate-pp



                                                              Name:                QEMU TCG CPU version 2.5+
                                                              Hypervisor:          QEMU
                                                              Microarchitecture:   K7
                                                              Technology:          Unknown
                                                              Max Frequency:       2.900 GHz
                                                              Sockets:             9
                                                              Cores:               1 core
                                                              Cores (Total):       96 cores
                                                              AVX:                 AVX,AVX2
                                                              FMA:                 FMA3
                                                              L1i Size:            64KB
                                                              L1d Size:            64KB
                                                              L2 Size:             512KB
                                                              L3 Size:             16MB
                                                              Peak Performance:    4.15 TFLOP/s

I'm not sure if it's possible to fix --accurate-pp under emulation other than actually running a benchmark, but perhaps a warning that "Peak Performance", even with --accurate-pp, might be completely wrong when a hypervisor is detected would be a good idea.

@Dr-Noob
Owner

Dr-Noob commented Sep 26, 2024

Hey, thanks for the report.

I disagree with your interpretation. It's not that --accurate-pp is "wrong". If you compare the QEMU output with the native output, you'll see that almost everything (except max frequency) is "wrong". And it's not really wrong: QEMU is simply tampering with the values (VMs can report whatever values they want).

In this case, it's the microarchitecture and the number of sockets that make cpufetch think the peak performance is higher. cpufetch can do nothing about that, except maybe warn the user about these effects when running under a VM. I would expect the user to be aware of this though, which is why no warning is displayed right now.
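
As a rough illustration of how the reported values feed into the estimate, here is a minimal sketch. It assumes the usual theoretical-peak formula (total cores × frequency × SIMD lanes × 2 if FMA is available), which is not taken from cpufetch's source. The function name `peak_flops` and the 2.7 GHz effective frequency are assumptions; the frequency was picked only because it reproduces both figures shown above. Under those assumptions, the doubled core count (96 vs 48) and the reported FMA support each contribute a factor of 2.

```python
def peak_flops(total_cores, freq_hz, simd_lanes, has_fma, vector_units=1):
    """Theoretical single-precision peak (hypothetical helper, not cpufetch code)."""
    fma_factor = 2 if has_fma else 1  # a fused multiply-add counts as 2 FLOPs
    return total_cores * freq_hz * vector_units * simd_lanes * fma_factor

# Assumption: effective frequency inferred from the outputs, not stated in the thread.
ASSUMED_FREQ_HZ = 2.7e9

# Native run: 48 cores, AVX (8 single-precision lanes), no FMA.
native = peak_flops(48, ASSUMED_FREQ_HZ, simd_lanes=8, has_fma=False)
# QEMU run: 96 cores reported, AVX/AVX2 (8 lanes), FMA3 reported.
emulated = peak_flops(96, ASSUMED_FREQ_HZ, simd_lanes=8, has_fma=True)

print(f"native:   {native / 1e12:.2f} TFLOP/s")    # native:   1.04 TFLOP/s
print(f"emulated: {emulated / 1e12:.2f} TFLOP/s")  # emulated: 4.15 TFLOP/s
print(f"ratio:    {emulated / native:.1f}x")       # ratio:    4.0x
```

Whatever the exact formula is, every input here comes from what the (emulated) CPU reports, so the result can only be as trustworthy as those reported values.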

@Dr-Noob
Owner

Dr-Noob commented Oct 9, 2024

Closing, since there is not much more to do except maybe add a message warning that the VM is probably tampering with the values.
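
For illustration only, such a warning could be keyed off the x86 hypervisor-present bit (CPUID leaf 1, ECX bit 31). The sketch below takes the ECX value as a plain integer rather than executing CPUID. The function names and the message text are hypothetical, and this is not how cpufetch is implemented.

```python
HYPERVISOR_BIT = 31  # CPUID leaf 1, ECX bit 31: set when running under a hypervisor

def hypervisor_present(cpuid_leaf1_ecx: int) -> bool:
    """Return True if the hypervisor-present bit is set in the given ECX value."""
    return bool((cpuid_leaf1_ecx >> HYPERVISOR_BIT) & 1)

def peak_performance_note(cpuid_leaf1_ecx: int):
    """Return a warning string under a hypervisor, else None (hypothetical message)."""
    if hypervisor_present(cpuid_leaf1_ecx):
        return ("warning: hypervisor detected; reported CPU values "
                "(and therefore Peak Performance) may not match the host")
    return None

# Example ECX values (made up for the demonstration):
print(peak_performance_note(0x80000000))  # bit 31 set, so the warning is returned
print(peak_performance_note(0x00000000))  # bit 31 clear, so None is returned
```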
