Hello,
I am trying to use IndirectMemoryPrefetcher() in my research project. I wrote a simple proof-of-concept (PoC) program to check whether the IMP is working, and I enabled the ‘HWPrefetch’ debug flag to see whether any prefetch requests or hits are generated. When the PoC runs the loop that is meant to train the Indirect Memory Prefetcher, I do not see any prefetch hits at all, although I expect this loop to activate the IMP:
printf("- IMP training Start\n");
for (int i = 0; i < trainSize; i++)
    tmp = dataArray[indices[i]];
printf("- IMP training End\n");
I suspect that the Indirect Memory Prefetcher is not being activated, because when I run the same PoC with StridePrefetcher() the indices array is prefetched between the two print statements. If IMP prefetching were working, I would expect a cache hit on dataArray elements beyond ‘trainSize’:
T1 = __rdtscp(&trash);
tmp = dataArray[trainSize + 64];
T2 = __rdtscp(&trash);
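For anyone who wants to reproduce this, here is a minimal sketch of how the two snippets above could fit together. It is not my exact PoC: the helper names (fill_indices, train_indirect, time_load, now_ticks) are made up for illustration, the pseudo-random fill of indices is an assumption (chosen so that the accesses to dataArray do not form a regular stride), and the clock_gettime fallback is only there so the sketch also builds on non-x86 hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Timestamp counter, read the same way as in the snippet above. */
static uint64_t now_ticks(void) {
    unsigned int trash;
    return __rdtscp(&trash);
}
#else
#include <time.h>
/* Assumed fallback for non-x86 hosts: monotonic clock in nanoseconds. */
static uint64_t now_ticks(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

/* Hypothetical helper: fill indices[0..n) with pseudo-random offsets
 * in [0, dataLen), using a small linear congruential generator so the
 * pattern is deterministic but has no constant stride. */
void fill_indices(uint32_t *indices, size_t n, uint32_t dataLen, uint32_t seed) {
    assert(dataLen > 0);
    uint32_t state = seed;
    for (size_t i = 0; i < n; i++) {
        state = state * 1103515245u + 12345u;
        indices[i] = (state >> 16) % dataLen;
    }
}

/* Training loop: the indirect access pattern dataArray[indices[i]].
 * The sum is returned so the compiler cannot drop the loads. */
uint64_t train_indirect(const uint8_t *dataArray, const uint32_t *indices,
                        size_t trainSize) {
    uint64_t sum = 0;
    for (size_t i = 0; i < trainSize; i++)
        sum += dataArray[indices[i]];
    return sum;
}

/* Probe: time a single load from addr. The volatile qualifier forces
 * the load to actually be issued between the two timestamps. */
uint64_t time_load(const volatile uint8_t *addr) {
    uint64_t t1 = now_ticks();
    uint8_t value = *addr;
    uint64_t t2 = now_ticks();
    (void)value;
    return t2 - t1;
}
```

The intended use is to call fill_indices once, print the "training Start" marker, call train_indirect, print the "training End" marker, and then call time_load(&dataArray[trainSize + 64]) as in the probe above, comparing the result against the latency of a load that is known to miss.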
I’m using caches.py and two_level.py from configs/learning_gem5/part1/. I’ve added system.cpu = DerivO3CPU(branchPred=LTAGE()) to two_level.py and the prefetcher details to caches.py, as shown in the screenshot below. IndirectMemoryPrefetcher() is attached to L2Cache in a similar fashion.
It would be a great help if anyone could guide us on how to use the IMP correctly.