gem5-dev@gem5.org

The gem5 Developer List

[XS] Change in gem5/gem5[develop]: mem-ruby: fix load deadlock with WB GPU L2 caches

Matt Sinclair (Gerrit)
Wed, Mar 22, 2023 4:00 AM

Matt Sinclair has submitted this change. (
https://gem5-review.googlesource.com/c/public/gem5/+/68977?usp=email )

Change subject: mem-ruby: fix load deadlock with WB GPU L2 caches
......................................................................

mem-ruby: fix load deadlock with WB GPU L2 caches

By default, the GPU VIPER coherence protocol uses a WT L2 cache.
However, it also supports WB caches (although this is not currently
tested).  Using a WB L2 cache for the GPU results in deadlocks with
loads.

Specifically, when a load reaches the L2 and the line is currently
in the W state, that line must be written back before the load can
be performed.  However, the existing L2 transition for this case
did not retry the load when the WB completed, resulting in a
deadlock.  This deadlock can be replicated by running the GPU Ruby
random tester as is with a WB L2 cache instead of a WT L2 cache.

To fix this, this change modifies the transition in question to
put the load on the stalled requests buffer, which the WBAck will
check when it returns to the L2 (and thus perform the load).
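
The behavior described above can be illustrated with a small toy
model.  This is only a sketch and not gem5 or SLICC code: the class
ToyL2, its fields, and the run() helper are hypothetical names, a
load to a line in I is assumed to complete immediately (the real
protocol would fetch the data first), and a load arriving while the
line is in WI is assumed to be parked.  With stall_on_writeback set
to False the load is popped and forgotten, as in the old transition;
with True it is parked and replayed once the WBAck arrives.

```python
from collections import deque, defaultdict


class ToyL2:
    """Toy model of an L2 handling loads to lines in the dirty (W) state."""

    def __init__(self, stall_on_writeback):
        self.stall_on_writeback = stall_on_writeback
        self.state = {}                    # addr -> "I", "V", "W" or "WI"
        self.requests = deque()            # incoming loads (addresses)
        self.stalled = defaultdict(deque)  # addr -> loads parked until woken
        self.pending_wb = deque()          # write-backs waiting for a WBAck
        self.completed = []                # loads that were serviced

    def step(self):
        """Do one unit of work; return False once nothing is left to do."""
        if self.requests:
            addr = self.requests.popleft()
            st = self.state.get(addr, "I")
            if st == "W":
                # Dirty line: start the write-back before serving the load.
                self.state[addr] = "WI"
                self.pending_wb.append(addr)
                if self.stall_on_writeback:
                    # New behavior: park the load until it is woken up.
                    self.stalled[addr].append(addr)
                # Old behavior: the load was popped and is now lost.
            elif st == "WI":
                # Assumption: loads arriving mid write-back are parked.
                self.stalled[addr].append(addr)
            else:
                # Assumption: loads to I or V lines complete immediately.
                self.state[addr] = "V"
                self.completed.append(addr)
            return True
        if self.pending_wb:
            # WBAck returns: invalidate the line and replay parked loads.
            addr = self.pending_wb.popleft()
            self.state[addr] = "I"
            self.requests.extend(self.stalled.pop(addr, ()))
            return True
        return False


def run(stall_on_writeback):
    """Issue one load to a dirty line and report which loads completed."""
    l2 = ToyL2(stall_on_writeback)
    l2.state[0x40] = "W"
    l2.requests.append(0x40)
    while l2.step():
        pass
    return l2.completed
```

In this model, run(False) ends with no completed loads, which
corresponds to a requester waiting forever, while run(True)
completes the load after the write-back is acknowledged.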

This fix has been tested and verified with both the per-checkin and
nightly GPU Ruby Random tester tests (with a WB L2 cache).

Change-Id: Ieec4f61a3070cf9976b8c3ef0cdbd0cc5a1443c6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68977
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
---
M src/mem/ruby/protocol/GPU_VIPER-TCC.sm
1 file changed, 5 insertions(+), 2 deletions(-)

Approvals:
Matthew Poremba: Looks good to me, approved
Bobby Bruce: Looks good to me, approved
kokoro: Regressions pass

diff --git a/src/mem/ruby/protocol/GPU_VIPER-TCC.sm b/src/mem/ruby/protocol/GPU_VIPER-TCC.sm
index 0f93339..0b7f5ed 100644
--- a/src/mem/ruby/protocol/GPU_VIPER-TCC.sm
+++ b/src/mem/ruby/protocol/GPU_VIPER-TCC.sm
@@ -718,10 +718,13 @@
     p_popRequestQueue;
   }
   transition(W, RdBlk, WI) {TagArrayRead, DataArrayRead} {
-    p_profileHit;
     t_allocateTBE;
     wb_writeBack;
-    p_popRequestQueue;
+    // need to try this request again after writing back the current entry -- to
+    // do so, put it with other stalled requests in a buffer to reduce resource
+    // contention since they won't try again every cycle and will instead only
+    // try again once woken up
+    st_stallAndWaitRequest;
   }
   transition(I, RdBlk, IV) {TagArrayRead} {

--
To view, visit
https://gem5-review.googlesource.com/c/public/gem5/+/68977?usp=email
To unsubscribe, or for help writing mail filters, visit
https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Ieec4f61a3070cf9976b8c3ef0cdbd0cc5a1443c6
Gerrit-Change-Number: 68977
Gerrit-PatchSet: 2
Gerrit-Owner: Matt Sinclair <mattdsinclair.wisc@gmail.com>
Gerrit-Reviewer: Bobby Bruce <bbruce@ucdavis.edu>
Gerrit-Reviewer: Bradford Beckmann <bradford.beckmann@gmail.com>
Gerrit-Reviewer: Jason Lowe-Power <jason@lowepower.com>
Gerrit-Reviewer: Matt Sinclair <mattdsinclair@gmail.com>
Gerrit-Reviewer: Matthew Poremba <matthew.poremba@amd.com>
Gerrit-Reviewer: kokoro <noreply+kokoro@google.com>
Gerrit-CC: VISHNU RAMADAS <vramadas@wisc.edu>
Gerrit-MessageType: merged
