Details are in the attached tarball.
I'm running a very simple RISC-V system configuration with only L1 data / instruction caches.
I'm running a very simple cache test executable that reuses memory inside of a 24KiB block.
When I run with 32KiB caches ("riscv-gem5 ./cache_bench.py"), the run takes 6971237877 ticks.
When I run with 1KiB caches ("riscv-gem5 ./cache_bench.py --no-cache"), it takes 7042666907 ticks.
This is only a 1.01x improvement from having 32x the cache size, even though the 24KiB working set should fit in the 32KiB cache but not in the 1KiB one.
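For reference, the ratio can be checked directly from the two tick counts quoted above. This is only the arithmetic; the variable names are made up for illustration.

```python
# Tick counts reported for the two runs above.
ticks_32k = 6971237877  # 32KiB caches
ticks_1k = 7042666907   # 1KiB caches ("--no-cache")

# Speedup of the 32KiB configuration relative to the 1KiB one.
speedup = ticks_1k / ticks_32k
print(f"speedup: {speedup:.3f}x")  # speedup: 1.010x
```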
I must be using gem5 wrong, since there is almost no improvement with the larger caches.
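As a rough sanity check of that expectation, here is a toy model of repeated sweeps over a 24KiB block. It is only a sketch under loud assumptions: a fully associative LRU cache, 64-byte lines, sequential 8-byte accesses, and 10 passes. None of these are taken from "ubench_cache.c" or from the gem5 config, and the function name is hypothetical.

```python
from collections import OrderedDict

LINE_BYTES = 64  # assumed cache line size, not taken from the real config


def miss_rate(cache_bytes, working_set_bytes=24 * 1024, passes=10, stride=8):
    """Miss rate of a fully associative LRU cache under repeated sequential sweeps."""
    capacity = cache_bytes // LINE_BYTES  # number of lines the cache can hold
    cache = OrderedDict()                 # keys are line numbers, ordered by recency
    misses = accesses = 0
    for _ in range(passes):
        for addr in range(0, working_set_bytes, stride):
            line = addr // LINE_BYTES
            accesses += 1
            if line in cache:
                cache.move_to_end(line)        # hit: mark as most recently used
            else:
                misses += 1
                cache[line] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict the least recently used line
    return misses / accesses


small = miss_rate(1024)       # 1KiB cache
large = miss_rate(32 * 1024)  # 32KiB cache
print(f"1KiB miss rate: {small:.4f}, 32KiB miss rate: {large:.4f}")
```

Under these assumptions the 1KiB cache misses once per line on every pass, while the 32KiB cache misses only on the first pass, so the miss counts differ by the number of passes. The real access pattern of the benchmark may differ, so this says nothing about how large the gap in ticks should be.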
Does anyone have any ideas about what I might be missing here?
The test program, "ubench_cache.c" has been used in a different context with success.
I was able to use a Linux kernel module on an x86 machine to get access to uncacheable memory:
https://github.com/lemonsqueeze/uncached-ram-lkm
On x86 hardware, the benchmark runs more than 100x faster with cacheable memory than with uncacheable memory.
I feel like I must be missing something about configuring gem5 correctly.
Any ideas or tips would be greatly appreciated.
Thanks,
~Aaron Vose
It looks like the email system stripped out my tarball. I am attaching it again, with the config file named "cache_bench_py_.txt" instead of "cache_bench.py".
Thanks,
~Aaron Vose
From: Aaron Vose via gem5-users gem5-users@gem5.org
Sent: Saturday, July 1, 2023 6:01 PM
To: The gem5 Users mailing list gem5-users@gem5.org
Cc: Aaron Vose avose@maxlinear.com
Subject: [gem5-users] Confused When Running Cache Benchmark in Gem5