gem5-dev@gem5.org

The gem5 Developer List

View all threads

[S] Change in gem5/gem5[develop]: cpu: Move fetch stats from simple and minor to base

BB
Bobby Bruce (Gerrit)
Mon, May 8, 2023 7:09 PM

Bobby Bruce has submitted this change. (
https://gem5-review.googlesource.com/c/public/gem5/+/69097?usp=email )

Change subject: cpu: Move fetch stats from simple and minor to base
......................................................................

cpu: Move fetch stats from simple and minor to base

This summarizes a series of changes to move general Simple, Minor,
O3 CPU stats to BaseCPU. This commit focuses on moving numBranches
from SimpleCPU to the FetchCPUStats in the BaseCPU, and
numFetchSuspends from MinorCPU into FetchCPUStats.  More general
information about this relation chain is below.  In addition, this
changeset first adds all relevant stats to base in the first half,
then removes the duplicated stats in the second half.  Duplicated
stats are denoted in the code. In addition, to view the difference
between the old stats output and the current output, view
https://gem5.atlassian.net/browse/GEM5-1304

  1. Summary:
    Moved general CPU stats found across Simple, Minor, and O3 CPU models
    into BaseCPU through new stat groups. The stat groups are
    FetchCPUStats, ExecuteCPUStats, and CommitCPUStats. Implemented the
    committedControl stat vector found in MinorCPU for Simple and O3 CPU.
    Implemented the numStoreInsts stat found in SimpleCPU for O3CPU. IPC
    and CPI stats are now tracked at the core and thread level in BaseCPU
    and are made universal for simple, minor, o3, and kvm CPUs. Duplicate
    stats across the models are merged into a single stat in BaseCPU under
    the same stat name. This change does not implement every general level
    stat moved to BaseCPU for every model.

  2. Stat API Changes
    a. SimpleCPU:
    statExecutedInstType vector unified into committedInstType
    numCondCtrlInsts unified into committedControl::isControl

b. O3CPU:
i. Fetch Stage
branches in fetch unified into with numBranches
rate renamed to fetchRate
insts unified into with numInsts

ii. Execute Stage
Regfile stats unified into base with use of Simple's stat naming
numRefs in IEW unified into numMemRefs
numRate from IEW renamed to instRate

iii. Commit Stage
committedInsts is renamed to numInstsNotNOP
committedOps is renamed to numOpsNotNOP
instsCommitted is unified into numInsts
opsCommitted is unified into numOps
branches is unified into committedControl::isControl
floating is unified into numFpInsts
integer is unified into numIntInsts
loads is unified into numLoadInsts
memRefs is renamed to numMemRefs
vectorInstructions is unified into numVecInsts

  1. Details:
    Created three stat groups in BaseCPU. FetchCPUStats track statistics
    related to the fetch stage. ExecuteCPUStats track statistics related
    to the execute stage. CommitCPUStats track statistics related to the
    commit stage.

There are three vectors in Base that store unique pointers to per
thread instances of these stat groups. The stat group pointer for
thread i is accessible at index i of one of these vectors. For example,
stat numCCRegReads of the execute stage for thread 0 can be accessed
with executeStats[0]->numCCRegReads. The stats.txt output will print the
thread ID of the stat group. For example, numVecRegReads on thread 0
of a single core prints as
"board.processor.cores.core.executeStats0.numVecRegReads".
NOTE: Multithreading in gem5 is untested. Therefore per thread stats
output in stats.txt is not currently guaranteed to be correctly
formatted.

For FetchCPUStats, the stats moved from  SimpleCPU are numBranches
and numInsts. From MinorCPU, the stat moved is numFetchSuspends. From
O3CPU, the stats moved are from the O3 fetch stage: Stat branches is
unified into numBranches, stat rate is renamed to fetchRate in Base,
stat insts is unified into numInsts, stat icacheStallCycles keeps the
same name in Base.

For ExecuteCPUStats, the stats moved from SimpleCPU are
dcacheStallCycles, numCCRegReads, numCCRegWrites,
numFpAluAccesses, numFpRegReads, numFpRegWrites, numIntAluAccesses,
numIntRegReads, numIntRegWrites, numMemRefs, numMiscRegReads,
numMiscRegWrites, numVecAluAccesses, numVecPredRegReads,
numVecPredRegWrites, numVecRegReads, numVecRegWrites. The stat moved
from MinorCPU is numDiscardedOps. From O3, the Regfile stats in CPU are
unified into the reg stats in Base and use the names found originally
in SimpleCPU. From O3 IEW stage, numInsts keeps the same name in
Base, numBranches is unified into numBranches in base, numNop keeps
the same name in Base, numRefs is unified into numMemRefs in Base,
numLoadInsts and numStoreInsts are moved into Base, numRate is renamed
to instRate in base.

For CommitCPUStats, the stats moved from SimpleCPU are
numCondCtrlInsts, numFpInsts, numIntInsts, numLoadInsts, numStoreInsts,
numVecInsts. The stats moved from MinorCPU are numInsts,
committedInstType, and committedControl. statExecutedInstType of
SimpleCPU is unified with committedInstType of MinorCPU. Implemented
committedControl stats from MinorCPU in Simple and O3 CPU. In MinorCPU,
this stat was a 2D vector, where the first dimension is the thread ID.
In base it is now a 1D vector that is tied to a thread ID via the
commitStats vector that the object is accessible through. From the O3
commit stage, committedInsts is renamed to numInstsNotNOP, committedOps
is renamed to numOpsNotNOP, instsCommitted is unified into numInsts,
opsCommitted is renamed to numOps, committedInstType is unified into
committedInstType from Minor, branches is removed because it duplicates
committedControl::IsControl, floating is unified into numFpInsts,
interger is unified into numIntInsts, loads is unified into
numLoadInsts, numStoreInsts is implemented for tracking in O3, memRefs
is renamed to numMemRefs, vectorInstructions is unified into
numVecInsts. Note that numCondCtrlInsts of Simple is unified into
committedControl::IsCondCtrl.

Implemented IPC and CPI tracking inside BaseCPU.
In BaseCPU::BaseCPUStats, numInsts and numOps track per CPU core
committed instructions and operations.
In BaseCPU::FetchCPUStats, numInsts and numOps track per thread
fetched instructions and operations.
In BaseCPU::CommitCPUStats, numInsts tracks per thread executed
instructions.
In BaseCPU::CommitCPUStats, numInsts and numOps track per thread
committed instructions and operations.
In BaseSimpleCPU, the countInst() function has been split into
countInst(), countFetchInst(), and countCommitInst(). The stat count
incrementation step of countInst() has been removed and delegated to the
other two functions. countFetchInst() increments numInsts and numOps
of the FetchCPUStats group for a thread. countCommitInst() increments
the numInsts and numOps of the CommitCPUStats group for a thread and
of the BaseCPUStats group for a CPU core. These functions are called
in the appropriate stage within timing.cc and atomic.cc. The call to
countInst() is left unchanged. countFetchInst() is called in
preExecute(). countCommitInst() is called in postExecute().
For MinorCPU, only the commit level numInsts and numOps stats have been
implemented.
IPC and CPI stats have been added to BaseCPUStats (core level) and
CommitCPUStats (thread level). The formulas for the IPC and CPI stats
in CommitCPUStats are set in the BaseCPU constructor, after the
CommitCPUStats stat group object has been created. These replace IPC,
CPI, totalIpc, and totalCpi stats in O3.

Replaced committedInsts stats of KVM CPU with commitStats.numInsts
of BaseCPU. This results in IPC and CPI printing in stats.txt for
KVM simulations.

This change does not implement most general stats found in one or two
model for all others.

Change-Id: I44d8ff6f3d102e94e53f9b2ce9b7917d96341e51
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69097
Reviewed-by: Bobby Bruce bbruce@ucdavis.edu
Tested-by: kokoro noreply+kokoro@google.com
Maintainer: Bobby Bruce bbruce@ucdavis.edu

M src/cpu/base.cc
M src/cpu/base.hh
M src/cpu/minor/execute.cc
M src/cpu/simple/base.cc
4 files changed, 40 insertions(+), 0 deletions(-)

Approvals:
kokoro: Regressions pass
Bobby Bruce: Looks good to me, approved; Looks good to me, approved

diff --git a/src/cpu/base.cc b/src/cpu/base.cc
index d2c0a78..1d29339 100644
--- a/src/cpu/base.cc
+++ b/src/cpu/base.cc
@@ -191,6 +191,11 @@
modelResetPort.onChange([this](const bool &new_val) {
setReset(new_val);
});

  • // create a stat group object for each thread on this core

  • fetchStats.reserve(numThreads);

  • for (int i = 0; i < numThreads; i++) {

  •    fetchStats.emplace_back(new FetchCPUStats(this, i));
    
  • }
    }

    void
    @@ -827,4 +832,18 @@
    hostOpRate = simOps / hostSeconds;
    }

+BaseCPU::
+FetchCPUStats::FetchCPUStats(statistics::Group *parent, int thread_id)

  • : statistics::Group(parent, csprintf("fetchStats%i",
    thread_id).c_str()),
  • ADD_STAT(numBranches, statistics::units::Count::get(),
  •         "Number of branches fetched"),
    
  • ADD_STAT(numFetchSuspends, statistics::units::Count::get(),
  •         "Number of times Execute suspended instruction fetching")
    

+{

  • numBranches
  •    .prereq(numBranches);
    

+}
+
} // namespace gem5
diff --git a/src/cpu/base.hh b/src/cpu/base.hh
index 084d9b9..e8fb777 100644
--- a/src/cpu/base.hh
+++ b/src/cpu/base.hh
@@ -42,6 +42,7 @@
#ifndef CPU_BASE_HH
#define CPU_BASE_HH

+#include <memory>
#include <vector>

#include "arch/generic/interrupts.hh"
@@ -676,6 +677,22 @@
const Cycles pwrGatingLatency;
const bool powerGatingOnIdle;
EventFunctionWrapper enterPwrGatingEvent;
+
+

  • public:
  • struct FetchCPUStats : public statistics::Group
  • {
  •    FetchCPUStats(statistics::Group *parent, int thread_id);
    
  •    /* Total number of branches fetched */
    
  •    statistics::Scalar numBranches;
    
  •    /* Number of times fetch was asked to suspend by Execute */
    
  •    statistics::Scalar numFetchSuspends;
    
  • };
  • std::vector<std::unique_ptr<FetchCPUStats>> fetchStats;
    };

} // namespace gem5
diff --git a/src/cpu/minor/execute.cc b/src/cpu/minor/execute.cc
index 5eaaf58..c37c6c6 100644
--- a/src/cpu/minor/execute.cc
+++ b/src/cpu/minor/execute.cc
@@ -1054,7 +1054,9 @@
DPRINTF(MinorInterrupt, "Suspending thread: %d from Execute"
" inst: %s\n", thread_id, *inst);

  •        // output both old and new stats
            cpu.stats.numFetchSuspends++;
    
  •        cpu.fetchStats[thread_id]->numFetchSuspends++;
    
            updateBranchData(thread_id, BranchData::SuspendThread, inst,
                resume_pc, branch);
    

diff --git a/src/cpu/simple/base.cc b/src/cpu/simple/base.cc
index 768f63e..1632f54 100644
--- a/src/cpu/simple/base.cc
+++ b/src/cpu/simple/base.cc
@@ -396,7 +396,9 @@
}

  if (curStaticInst->isControl()) {
  •    // output both old and new stats
        ++t_info.execContextStats.numBranches;
    
  •    ++fetchStats[t_info.thread->threadId()]->numBranches;
    }
    
    /* Power model statistics */
    

--
To view, visit
https://gem5-review.googlesource.com/c/public/gem5/+/69097?usp=email
To unsubscribe, or for help writing mail filters, visit
https://gem5-review.googlesource.com/settings?usp=email

Gerrit-MessageType: merged
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I44d8ff6f3d102e94e53f9b2ce9b7917d96341e51
Gerrit-Change-Number: 69097
Gerrit-PatchSet: 3
Gerrit-Owner: Melissa Jost melissakjost@gmail.com
Gerrit-Reviewer: Bobby Bruce bbruce@ucdavis.edu
Gerrit-Reviewer: Gabe Black gabe.black@gmail.com
Gerrit-Reviewer: Jason Lowe-Power jason@lowepower.com
Gerrit-Reviewer: kokoro noreply+kokoro@google.com

Bobby Bruce has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/69097?usp=email ) Change subject: cpu: Move fetch stats from simple and minor to base ...................................................................... cpu: Move fetch stats from simple and minor to base This summarizes a series of changes to move general Simple, Minor, O3 CPU stats to BaseCPU. This commit focuses on moving numBranches from SimpleCPU to the FetchCPUStats in the BaseCPU, and numFetchSuspends from MinorCPU into FetchCPUStats. More general information about this relation chain is below. In addition, this changeset first adds all relevant stats to base in the first half, then removes the duplicated stats in the second half. Duplicated stats are denoted in the code. In addition, to view the difference between the old stats output and the current output, view https://gem5.atlassian.net/browse/GEM5-1304 1. Summary: Moved general CPU stats found across Simple, Minor, and O3 CPU models into BaseCPU through new stat groups. The stat groups are FetchCPUStats, ExecuteCPUStats, and CommitCPUStats. Implemented the committedControl stat vector found in MinorCPU for Simple and O3 CPU. Implemented the numStoreInsts stat found in SimpleCPU for O3CPU. IPC and CPI stats are now tracked at the core and thread level in BaseCPU and are made universal for simple, minor, o3, and kvm CPUs. Duplicate stats across the models are merged into a single stat in BaseCPU under the same stat name. This change does not implement every general level stat moved to BaseCPU for every model. 2. Stat API Changes a. SimpleCPU: statExecutedInstType vector unified into committedInstType numCondCtrlInsts unified into committedControl::isControl b. O3CPU: i. Fetch Stage branches in fetch unified into with numBranches rate renamed to fetchRate insts unified into with numInsts ii. Execute Stage Regfile stats unified into base with use of Simple's stat naming numRefs in IEW unified into numMemRefs numRate from IEW renamed to instRate iii. Commit Stage committedInsts is renamed to numInstsNotNOP committedOps is renamed to numOpsNotNOP instsCommitted is unified into numInsts opsCommitted is unified into numOps branches is unified into committedControl::isControl floating is unified into numFpInsts integer is unified into numIntInsts loads is unified into numLoadInsts memRefs is renamed to numMemRefs vectorInstructions is unified into numVecInsts 3. Details: Created three stat groups in BaseCPU. FetchCPUStats track statistics related to the fetch stage. ExecuteCPUStats track statistics related to the execute stage. CommitCPUStats track statistics related to the commit stage. There are three vectors in Base that store unique pointers to per thread instances of these stat groups. The stat group pointer for thread i is accessible at index i of one of these vectors. For example, stat numCCRegReads of the execute stage for thread 0 can be accessed with executeStats[0]->numCCRegReads. The stats.txt output will print the thread ID of the stat group. For example, numVecRegReads on thread 0 of a single core prints as "board.processor.cores.core.executeStats0.numVecRegReads". NOTE: Multithreading in gem5 is untested. Therefore per thread stats output in stats.txt is not currently guaranteed to be correctly formatted. For FetchCPUStats, the stats moved from SimpleCPU are numBranches and numInsts. From MinorCPU, the stat moved is numFetchSuspends. From O3CPU, the stats moved are from the O3 fetch stage: Stat branches is unified into numBranches, stat rate is renamed to fetchRate in Base, stat insts is unified into numInsts, stat icacheStallCycles keeps the same name in Base. For ExecuteCPUStats, the stats moved from SimpleCPU are dcacheStallCycles, numCCRegReads, numCCRegWrites, numFpAluAccesses, numFpRegReads, numFpRegWrites, numIntAluAccesses, numIntRegReads, numIntRegWrites, numMemRefs, numMiscRegReads, numMiscRegWrites, numVecAluAccesses, numVecPredRegReads, numVecPredRegWrites, numVecRegReads, numVecRegWrites. The stat moved from MinorCPU is numDiscardedOps. From O3, the Regfile stats in CPU are unified into the reg stats in Base and use the names found originally in SimpleCPU. From O3 IEW stage, numInsts keeps the same name in Base, numBranches is unified into numBranches in base, numNop keeps the same name in Base, numRefs is unified into numMemRefs in Base, numLoadInsts and numStoreInsts are moved into Base, numRate is renamed to instRate in base. For CommitCPUStats, the stats moved from SimpleCPU are numCondCtrlInsts, numFpInsts, numIntInsts, numLoadInsts, numStoreInsts, numVecInsts. The stats moved from MinorCPU are numInsts, committedInstType, and committedControl. statExecutedInstType of SimpleCPU is unified with committedInstType of MinorCPU. Implemented committedControl stats from MinorCPU in Simple and O3 CPU. In MinorCPU, this stat was a 2D vector, where the first dimension is the thread ID. In base it is now a 1D vector that is tied to a thread ID via the commitStats vector that the object is accessible through. From the O3 commit stage, committedInsts is renamed to numInstsNotNOP, committedOps is renamed to numOpsNotNOP, instsCommitted is unified into numInsts, opsCommitted is renamed to numOps, committedInstType is unified into committedInstType from Minor, branches is removed because it duplicates committedControl::IsControl, floating is unified into numFpInsts, interger is unified into numIntInsts, loads is unified into numLoadInsts, numStoreInsts is implemented for tracking in O3, memRefs is renamed to numMemRefs, vectorInstructions is unified into numVecInsts. Note that numCondCtrlInsts of Simple is unified into committedControl::IsCondCtrl. Implemented IPC and CPI tracking inside BaseCPU. In BaseCPU::BaseCPUStats, numInsts and numOps track per CPU core committed instructions and operations. In BaseCPU::FetchCPUStats, numInsts and numOps track per thread fetched instructions and operations. In BaseCPU::CommitCPUStats, numInsts tracks per thread executed instructions. In BaseCPU::CommitCPUStats, numInsts and numOps track per thread committed instructions and operations. In BaseSimpleCPU, the countInst() function has been split into countInst(), countFetchInst(), and countCommitInst(). The stat count incrementation step of countInst() has been removed and delegated to the other two functions. countFetchInst() increments numInsts and numOps of the FetchCPUStats group for a thread. countCommitInst() increments the numInsts and numOps of the CommitCPUStats group for a thread and of the BaseCPUStats group for a CPU core. These functions are called in the appropriate stage within timing.cc and atomic.cc. The call to countInst() is left unchanged. countFetchInst() is called in preExecute(). countCommitInst() is called in postExecute(). For MinorCPU, only the commit level numInsts and numOps stats have been implemented. IPC and CPI stats have been added to BaseCPUStats (core level) and CommitCPUStats (thread level). The formulas for the IPC and CPI stats in CommitCPUStats are set in the BaseCPU constructor, after the CommitCPUStats stat group object has been created. These replace IPC, CPI, totalIpc, and totalCpi stats in O3. Replaced committedInsts stats of KVM CPU with commitStats.numInsts of BaseCPU. This results in IPC and CPI printing in stats.txt for KVM simulations. This change does not implement most general stats found in one or two model for all others. Change-Id: I44d8ff6f3d102e94e53f9b2ce9b7917d96341e51 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69097 Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> --- M src/cpu/base.cc M src/cpu/base.hh M src/cpu/minor/execute.cc M src/cpu/simple/base.cc 4 files changed, 40 insertions(+), 0 deletions(-) Approvals: kokoro: Regressions pass Bobby Bruce: Looks good to me, approved; Looks good to me, approved diff --git a/src/cpu/base.cc b/src/cpu/base.cc index d2c0a78..1d29339 100644 --- a/src/cpu/base.cc +++ b/src/cpu/base.cc @@ -191,6 +191,11 @@ modelResetPort.onChange([this](const bool &new_val) { setReset(new_val); }); + // create a stat group object for each thread on this core + fetchStats.reserve(numThreads); + for (int i = 0; i < numThreads; i++) { + fetchStats.emplace_back(new FetchCPUStats(this, i)); + } } void @@ -827,4 +832,18 @@ hostOpRate = simOps / hostSeconds; } +BaseCPU:: +FetchCPUStats::FetchCPUStats(statistics::Group *parent, int thread_id) + : statistics::Group(parent, csprintf("fetchStats%i", thread_id).c_str()), + ADD_STAT(numBranches, statistics::units::Count::get(), + "Number of branches fetched"), + ADD_STAT(numFetchSuspends, statistics::units::Count::get(), + "Number of times Execute suspended instruction fetching") + +{ + numBranches + .prereq(numBranches); + +} + } // namespace gem5 diff --git a/src/cpu/base.hh b/src/cpu/base.hh index 084d9b9..e8fb777 100644 --- a/src/cpu/base.hh +++ b/src/cpu/base.hh @@ -42,6 +42,7 @@ #ifndef __CPU_BASE_HH__ #define __CPU_BASE_HH__ +#include <memory> #include <vector> #include "arch/generic/interrupts.hh" @@ -676,6 +677,22 @@ const Cycles pwrGatingLatency; const bool powerGatingOnIdle; EventFunctionWrapper enterPwrGatingEvent; + + + public: + struct FetchCPUStats : public statistics::Group + { + FetchCPUStats(statistics::Group *parent, int thread_id); + + /* Total number of branches fetched */ + statistics::Scalar numBranches; + + /* Number of times fetch was asked to suspend by Execute */ + statistics::Scalar numFetchSuspends; + + }; + + std::vector<std::unique_ptr<FetchCPUStats>> fetchStats; }; } // namespace gem5 diff --git a/src/cpu/minor/execute.cc b/src/cpu/minor/execute.cc index 5eaaf58..c37c6c6 100644 --- a/src/cpu/minor/execute.cc +++ b/src/cpu/minor/execute.cc @@ -1054,7 +1054,9 @@ DPRINTF(MinorInterrupt, "Suspending thread: %d from Execute" " inst: %s\n", thread_id, *inst); + // output both old and new stats cpu.stats.numFetchSuspends++; + cpu.fetchStats[thread_id]->numFetchSuspends++; updateBranchData(thread_id, BranchData::SuspendThread, inst, resume_pc, branch); diff --git a/src/cpu/simple/base.cc b/src/cpu/simple/base.cc index 768f63e..1632f54 100644 --- a/src/cpu/simple/base.cc +++ b/src/cpu/simple/base.cc @@ -396,7 +396,9 @@ } if (curStaticInst->isControl()) { + // output both old and new stats ++t_info.execContextStats.numBranches; + ++fetchStats[t_info.thread->threadId()]->numBranches; } /* Power model statistics */ -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/69097?usp=email To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings?usp=email Gerrit-MessageType: merged Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I44d8ff6f3d102e94e53f9b2ce9b7917d96341e51 Gerrit-Change-Number: 69097 Gerrit-PatchSet: 3 Gerrit-Owner: Melissa Jost <melissakjost@gmail.com> Gerrit-Reviewer: Bobby Bruce <bbruce@ucdavis.edu> Gerrit-Reviewer: Gabe Black <gabe.black@gmail.com> Gerrit-Reviewer: Jason Lowe-Power <jason@lowepower.com> Gerrit-Reviewer: kokoro <noreply+kokoro@google.com>