 
Kazi Asifuzzaman
Wed, Aug 23, 2023 4:15 PM

Hello,
I am exploring the gem5 GPU model GCN3_X86 (version 22.1.0.0) and have the
following questions, if you could kindly clarify:
1. In previous versions of gem5 we could use the --outdir or -d option to
redirect the output to a specific directory. That does not appear to be
supported in the version specified above (please correct me if I am wrong).
Is there any other way to redirect simulation outputs to a directory
other than m5out? Otherwise, is there any other way to run multiple
instances of gem5 at the same time without the stats.txt of one application
being overwritten by another instance that saves its output in the same
directory (m5out)?
2. For GCN3_X86, I assume --mem-type selects the memory model for
CPU/host memory. With the default settings, does this work as "unified
memory"? If not, how do I define the memory type for the GPU (global
memory) if it is different from CPU main memory?
3. --list-rp-types lists the available replacement policies, but what is
the option to select one of those?
4. -n defines the number of CPUs or cores. Should L1d_size and L2_size be
the size per core or per CPU?
5. Are there any output parameters that report % of resources (e.g. CUs)
used by the application, or quantify memory contention in unified memory?
Thanks,
*K. Zaman*
        
    
    
             
    
        
            
                
                     
Matt Sinclair
Wed, Aug 23, 2023 9:10 PM

Hi Kazi,
Trying to answer your questions:
1.  I am not aware of -d not working -- as of yesterday my students and I
were able to use it (with head of develop, or something close to it).  How
are you attempting to use it on the command line?
2.  I am not sure about the --mem-type flag (maybe Matt P., CC'd, knows
better), but in an APU model like the one you seem to be looking at, the CPU
and GPU memory are one and the same.  So there isn't a different CPU and GPU
memory if you are using an APU.  Matt P. might have better information on
how to do this for a dGPU model, which we support in the VEGA_X86 model.
3.  Right now, the GPU support in the public gem5 and its VIPER Ruby
coherence protocol only allow the replacement policies available in Ruby
(LRU, TreePLRU) to be picked.  I believe there is also no command line flag
to choose them, so you'd need to edit the appropriate Python file (e.g.,
https://gem5.googlesource.com/public/gem5/+/refs/heads/develop/configs/ruby/GPU_VIPER.py#235)
to pick the policy you want.  However, some of the students working with me
have updated Ruby and Classic's replacement policy support such that you can
now pick all the other Classic replacement policies in Ruby protocols too
(originally started here:
https://gem5-review.googlesource.com/c/public/gem5/+/20879, and
subsequently added to here:
https://gem5-review.googlesource.com/q/owner:jia44@wisc.edu.test-google-a.com).
We have some patches internally that allow users to specify replacement
policies for the different GPU caching levels (as part of the work
described here:
https://www.gem5.org/assets/files/workshop-isca-2023/slides/analyzing-the-benefits-of-more-complex-cache.pdf),
but we have not pushed them to the public code yet, as we're trying to debug
the issues highlighted in slides 19-23 of that presentation.  We can push
the patches publicly now if that would help, but it would be caveat emptor,
since the source of those bugs is unknown.
4.  I am assuming you are referring to the parameters eventually used in
places like this:
https://gem5.googlesource.com/public/gem5/+/refs/heads/develop/configs/ruby/GPU_VIPER.py#142?
If so, tcp_size is the size per instance of the L1D$ (currently per CU in
the GPU implementation) and tcc_size is the size of the shared GPU L2.  Or
are you seeing some other parameters somewhere?
5.  Matt P might have better information on this, but from briefly looking
at the stats output, my guess is the waveLevelParallelism stats are the
ones that might provide this information.
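
To check which of those stats a given run actually reports, a minimal
sketch like the one below can filter stats.txt by name.  It assumes the
usual "name value # description" layout of stats.txt; the stat names in
SAMPLE are made up for illustration and will differ between gem5 versions
and configurations.

```python
def find_stats(stats_text, needle):
    """Return {stat_name: value} for stats whose name contains `needle`."""
    matches = {}
    for line in stats_text.splitlines():
        # Drop the trailing "# description" part, then split into columns.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2 or needle not in fields[0]:
            continue
        try:
            matches[fields[0]] = float(fields[1])
        except ValueError:
            continue  # second column is not numeric, skip the line
    return matches


# Hypothetical excerpt; real stat names depend on the simulated system.
SAMPLE = """
---------- Begin Simulation Statistics ----------
simSeconds                                     0.000123   # Number of seconds simulated
system.cpu1.CUs0.waveLevelParallelism::mean    3.5        # hypothetical entry
system.cpu1.CUs1.waveLevelParallelism::mean    2.0        # hypothetical entry
"""

if __name__ == "__main__":
    for name, value in find_stats(SAMPLE, "waveLevelParallelism").items():
        print(name, value)
```

For a real run, pass the contents of the run's stats.txt instead of SAMPLE.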
Hope this helps,
Matt
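
On question 1: gem5's own options, such as -d/--outdir, go before the config
script on the command line, while anything after the script is passed to the
script itself.  The sketch below builds one command per application, each
with its own output directory, so that concurrent runs do not overwrite each
other's stats.txt.  The binary path, config script, "-c" script option, and
benchmark names are assumptions for illustration and are not taken from this
thread.

```python
import shlex
from pathlib import Path

# Hypothetical paths: adjust to your own build and config script.
GEM5_BINARY = "build/GCN3_X86/gem5.opt"
CONFIG_SCRIPT = "configs/example/apu_se.py"


def build_command(app, out_root="runs"):
    """Return the gem5 argv for one application with its own output dir."""
    outdir = Path(out_root) / Path(app).name
    return [
        GEM5_BINARY,
        f"--outdir={outdir}",  # gem5 option: placed before the config script
        CONFIG_SCRIPT,
        "-c", app,             # script option (assumed): program to simulate
    ]


if __name__ == "__main__":
    # Hypothetical benchmark binaries; print the commands instead of running.
    for app in ["bin/square", "bin/vector_add"]:
        print(shlex.join(build_command(app)))
```

Each printed command can be launched in its own shell or job; the results
then land in runs/square, runs/vector_add, and so on rather than in a shared
m5out.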