gem5-dev@gem5.org

The gem5 Developer List

[XS] Change in gem5/gem5[develop]: mem-ruby: fix load deadlock with WB GPU L2 caches

Matt Sinclair (Gerrit)
Wed, Mar 15, 2023 10:19 PM

Matt Sinclair has uploaded this change for review. (
https://gem5-review.googlesource.com/c/public/gem5/+/68977?usp=email )

Change subject: mem-ruby: fix load deadlock with WB GPU L2 caches
......................................................................

mem-ruby: fix load deadlock with WB GPU L2 caches

By default, the GPU VIPER coherence protocol uses a write-through (WT)
L2 cache.  However, it also supports write-back (WB) caches (although
this configuration is not currently tested).  When a WB L2 cache is
used for the GPU, loads can deadlock.

Specifically, when a load reaches the L2 and the line is currently in
the W state, that line must be written back before the load can be
performed.  However, the current L2 transition for this case does not
retry the load once the writeback completes, resulting in a deadlock.
This deadlock can be replicated by running the GPU Ruby random tester
as-is with a WB L2 cache instead of a WT L2 cache.

To fix this, this change modifies the transition in question to place
the load in the stalled-requests buffer; when the WBAck returns to the
L2, the stalled load is woken up and performed.

This fix has been tested and verified with both the per-checkin and
nightly GPU Ruby Random tester tests (with a WB L2 cache).

Change-Id: Ieec4f61a3070cf9976b8c3ef0cdbd0cc5a1443c6

---
M src/mem/ruby/protocol/GPU_VIPER-TCC.sm
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/mem/ruby/protocol/GPU_VIPER-TCC.sm b/src/mem/ruby/protocol/GPU_VIPER-TCC.sm
index 0f93339..0b7f5ed 100644
--- a/src/mem/ruby/protocol/GPU_VIPER-TCC.sm
+++ b/src/mem/ruby/protocol/GPU_VIPER-TCC.sm
@@ -718,10 +718,13 @@
     p_popRequestQueue;
   }
   transition(W, RdBlk, WI) {TagArrayRead, DataArrayRead} {
-    p_profileHit;
     t_allocateTBE;
     wb_writeBack;
-    p_popRequestQueue;
+    // need to try this request again after writing back the current entry -- to
+    // do so, put it with other stalled requests in a buffer to reduce resource
+    // contention since they won't try again every cycle and will instead only
+    // try again once woken up
+    st_stallAndWaitRequest;
   }
   transition(I, RdBlk, IV) {TagArrayRead} {
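
For context, the stall added above relies on Ruby's standard SLICC
stall/wake mechanism: stall_and_wait() parks a request keyed by its
address, and wakeUpBuffers() replays everything parked on that address.
The sketch below illustrates that pattern only; st_stallAndWaitRequest
appears in the diff, but its definition here, and the port, state, and
action names in the WBAck transition, are assumptions rather than
verbatim excerpts from GPU_VIPER-TCC.sm.

  // Park the incoming load; it is re-evaluated only after an explicit
  // wake-up for this address, rather than retrying every cycle.
  action(st_stallAndWaitRequest, "st", desc="Stall and wait on request") {
    stall_and_wait(requestNetwork_in, address);  // assumed in_port name
  }

  // Wake everything stalled on this address so the parked load is
  // replayed against the now written-back line.
  action(wa_wakeUpDependents, "wa", desc="Wake up stalled requests") {
    wakeUpBuffers(address);
  }

  // Hypothetical WBAck handling: leave the writeback-pending state and
  // replay the stalled load (action names are placeholders).
  transition(WI, WBAck, I) {
    wa_wakeUpDependents;
    dt_deallocateTBE;     // assumed TBE-deallocation action
    pr_popResponseQueue;  // assumed response-queue pop action
  }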

--
To view, visit
https://gem5-review.googlesource.com/c/public/gem5/+/68977?usp=email
To unsubscribe, or for help writing mail filters, visit
https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Ieec4f61a3070cf9976b8c3ef0cdbd0cc5a1443c6
Gerrit-Change-Number: 68977
Gerrit-PatchSet: 1
Gerrit-Owner: Matt Sinclair <mattdsinclair.wisc@gmail.com>
Gerrit-MessageType: newchange
