gem5-dev@gem5.org

The gem5 Developer List

View all threads

[XS] Change in gem5/gem5[develop]: mem-ruby: fix atomic deadlock with WB GPU L2 caches

MS
Matt Sinclair (Gerrit)
Wed, Mar 22, 2023 4:00 AM

Matt Sinclair has submitted this change. (
https://gem5-review.googlesource.com/c/public/gem5/+/68978?usp=email )

Change subject: mem-ruby: fix atomic deadlock with WB GPU L2 caches
......................................................................

mem-ruby: fix atomic deadlock with WB GPU L2 caches

By default the GPU VIPER coherence protocol uses a WT L2 cache.
However it has support for using WB caches (although this is not
tested currently).  When using a WB L2 cache for the GPU, this
results in deadlocks with atomics.

Specifically, when an atomic reaches the L2 and the line is
currently in M or W, the line must be written back before the atomic
can be performed.  However, the current support has two issues:

a) it never performs the atomic operation -- while VIPER current
assumes all atomics are system scope atomics and thus cannot be
performed at the L2 and this transition requires the dirty line be
written back before performing the atomic, the transition never
performs the atomic nor does the response path handle it.
b) putting the atomic action right after the write back is not
safe because we need to ensure the requests are ordered when they
reach memory -- thus we have to wait until the write back is
acknowledged before it's safe to send/perform the atomic.

To fix this, this change modifies the transition in question to
put the atomic on the stalled requests buffer, which the WBAck will
check when it returns to the L2 (and thus perform the atomic, which
will result in the atomic being sent on to the directory).

This fix has been tested and verified with both the per-checkin and
nightly GPU Ruby Random tester tests (with a WB L2 cache).

Change-Id: I9a43fd985dc71297521f4b05c47288d92c314ac7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68978
Maintainer: Bobby Bruce bbruce@ucdavis.edu
Reviewed-by: Matthew Poremba matthew.poremba@amd.com
Tested-by: kokoro noreply+kokoro@google.com

M src/mem/ruby/protocol/GPU_VIPER-TCC.sm
1 file changed, 6 insertions(+), 2 deletions(-)

Approvals:
kokoro: Regressions pass
Matthew Poremba: Looks good to me, approved
Bobby Bruce: Looks good to me, approved

diff --git a/src/mem/ruby/protocol/GPU_VIPER-TCC.sm
b/src/mem/ruby/protocol/GPU_VIPER-TCC.sm
index 0b7f5ed..a595898 100644
--- a/src/mem/ruby/protocol/GPU_VIPER-TCC.sm
+++ b/src/mem/ruby/protocol/GPU_VIPER-TCC.sm
@@ -816,10 +816,14 @@
}

transition({M, W}, Atomic, WI) {TagArrayRead} {
  • p_profileHit;
    t_allocateTBE;
    wb_writeBack;
  • p_popRequestQueue;
  • // after writing back the current line, we need to wait for it to be
    done
  • // before we try to perform the atomic
  • // by putting the stalled requests in a buffer, we reduce resource
    contention
  • // since they won't try again every cycle and will instead only try
    again once
  • // woken up
  • st_stallAndWaitRequest;
    }
transition(I, WrVicBlk) {TagArrayRead} {

--
To view, visit
https://gem5-review.googlesource.com/c/public/gem5/+/68978?usp=email
To unsubscribe, or for help writing mail filters, visit
https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I9a43fd985dc71297521f4b05c47288d92c314ac7
Gerrit-Change-Number: 68978
Gerrit-PatchSet: 2
Gerrit-Owner: Matt Sinclair mattdsinclair.wisc@gmail.com
Gerrit-Reviewer: Bobby Bruce bbruce@ucdavis.edu
Gerrit-Reviewer: Bradford Beckmann bradford.beckmann@gmail.com
Gerrit-Reviewer: Jason Lowe-Power jason@lowepower.com
Gerrit-Reviewer: Matt Sinclair mattdsinclair@gmail.com
Gerrit-Reviewer: Matthew Poremba matthew.poremba@amd.com
Gerrit-Reviewer: kokoro noreply+kokoro@google.com
Gerrit-CC: VISHNU RAMADAS vramadas@wisc.edu
Gerrit-MessageType: merged

Matt Sinclair has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/68978?usp=email ) Change subject: mem-ruby: fix atomic deadlock with WB GPU L2 caches ...................................................................... mem-ruby: fix atomic deadlock with WB GPU L2 caches By default the GPU VIPER coherence protocol uses a WT L2 cache. However it has support for using WB caches (although this is not tested currently). When using a WB L2 cache for the GPU, this results in deadlocks with atomics. Specifically, when an atomic reaches the L2 and the line is currently in M or W, the line must be written back before the atomic can be performed. However, the current support has two issues: a) it never performs the atomic operation -- while VIPER current assumes all atomics are system scope atomics and thus cannot be performed at the L2 and this transition requires the dirty line be written back before performing the atomic, the transition never performs the atomic nor does the response path handle it. b) putting the atomic action right after the write back is not safe because we need to ensure the requests are ordered when they reach memory -- thus we have to wait until the write back is acknowledged before it's safe to send/perform the atomic. To fix this, this change modifies the transition in question to put the atomic on the stalled requests buffer, which the WBAck will check when it returns to the L2 (and thus perform the atomic, which will result in the atomic being sent on to the directory). This fix has been tested and verified with both the per-checkin and nightly GPU Ruby Random tester tests (with a WB L2 cache). Change-Id: I9a43fd985dc71297521f4b05c47288d92c314ac7 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68978 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> --- M src/mem/ruby/protocol/GPU_VIPER-TCC.sm 1 file changed, 6 insertions(+), 2 deletions(-) Approvals: kokoro: Regressions pass Matthew Poremba: Looks good to me, approved Bobby Bruce: Looks good to me, approved diff --git a/src/mem/ruby/protocol/GPU_VIPER-TCC.sm b/src/mem/ruby/protocol/GPU_VIPER-TCC.sm index 0b7f5ed..a595898 100644 --- a/src/mem/ruby/protocol/GPU_VIPER-TCC.sm +++ b/src/mem/ruby/protocol/GPU_VIPER-TCC.sm @@ -816,10 +816,14 @@ } transition({M, W}, Atomic, WI) {TagArrayRead} { - p_profileHit; t_allocateTBE; wb_writeBack; - p_popRequestQueue; + // after writing back the current line, we need to wait for it to be done + // before we try to perform the atomic + // by putting the stalled requests in a buffer, we reduce resource contention + // since they won't try again every cycle and will instead only try again once + // woken up + st_stallAndWaitRequest; } transition(I, WrVicBlk) {TagArrayRead} { -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/68978?usp=email To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I9a43fd985dc71297521f4b05c47288d92c314ac7 Gerrit-Change-Number: 68978 Gerrit-PatchSet: 2 Gerrit-Owner: Matt Sinclair <mattdsinclair.wisc@gmail.com> Gerrit-Reviewer: Bobby Bruce <bbruce@ucdavis.edu> Gerrit-Reviewer: Bradford Beckmann <bradford.beckmann@gmail.com> Gerrit-Reviewer: Jason Lowe-Power <jason@lowepower.com> Gerrit-Reviewer: Matt Sinclair <mattdsinclair@gmail.com> Gerrit-Reviewer: Matthew Poremba <matthew.poremba@amd.com> Gerrit-Reviewer: kokoro <noreply+kokoro@google.com> Gerrit-CC: VISHNU RAMADAS <vramadas@wisc.edu> Gerrit-MessageType: merged