Learning Objectives
Use the framework available in VisualSim to build a model for the following objectives:
- Utilize different levels of modelling abstractions based on the available information.
- Analyse a memory subsystem to evaluate the response of the memory for:
- Different types of requests
- Requests from different devices arriving at different rates
- Different data sizes
Introduction
A designer or architect considers the performance of a memory
subsystem as one of the crucial factors to achieve real-time
application performance. In addition, they must accurately evaluate the
tradeoff between performance, power, cost, and reliability of the
system. They also need to evaluate properties such as locality,
interface technology, arbitration algorithm, and data width while
making architecture decisions.
In this tutorial, we explore a memory system architecture at the following levels of abstraction:
- Statistical DRAM and Controller with an Abstract Bus Arbiter.
Tutorial Model can be found at $VS\VS_AR\doc\Training_Material\Tutorial\WebHelp\Tutorial\Performance_Modeling\mem_bw_model.xml
- Statistical DRAM with a Memory Controller Logic and Single AXI.
Tutorial Model can be found at
$VS\VS_AR\doc\Training_Material\Tutorial\WebHelp\Tutorial\Performance_Modeling\mem_bw_model_V3_1.xml
- Statistical DRAM with a Memory Controller Logic and Single AXI and Dedicated Local Bus.
Tutorial Model can be found at
$VS\VS_AR\doc\Training_Material\Tutorial\WebHelp\Tutorial\Performance_Modeling\mem_bw_model_V3_2.xml
- Statistical DRAM with a Memory Controller Logic and Single AXI, Dedicated local Bus and a DMA.
Tutorial Model can be found at
$VS\VS_AR\doc\Training_Material\Tutorial\WebHelp\Tutorial\Performance_Modeling\mem_bw_model_V3_5.xml
- Cycle Accurate DRAM and Memory Controller.
Tutorial Model can be found at
$VS\VS_AR\doc\Training_Material\Tutorial\WebHelp\Tutorial\Performance_Modeling\mem_bw_model_V3_6.xml
Note: Add an offset to every traffic block. In AXI_Bus, set the Threshold_Trans_T_Bytes_F value to true.
Design Methodology
Figure 1 depicts a simple block diagram of a memory
system. Because the analysis focuses on the memory subsystem, we have
abstracted out the processor or external device that issues the requests.
In this tutorial, the user analyses the response of the memory to
requests from different devices. The user chooses an arbitration scheme that
selects either the highest-priority request or the request that arrived first.
The request is then placed in a Command Queue according to the selected
arbitration algorithm. Based on the address of each transaction, a DRAM
Requestor decides which location in the memory receives the
transaction.
After the DRAM Read/Write/Erase activity completes, the response is sent to the Requestor in first-in, first-out (FIFO) order.
Figure 1: Block Diagram
Use the above block diagram to create enhanced models with additional blocks. The details are in the subsequent sections.
Block Diagram Usage of VisualSim
In VisualSim, you model the block diagram in five variations. The details of the five variations are given below:
Variation
|
Details
|
Variation 1 (Abstract Arbitration Algorithm and Memory Controller) |
- Traffic generators
- Processors to assign values to transactions
- Arbitration mechanism
- Command Queues
- DRAM Requestor
- DRAM
|
Variation 2 (Single Bus Interface between devices and Memory)
|
- Traffic generators
- Processors to assign values to transactions
- Device Interface
- BUS
- Script
- DRAM
|
Variation 3 (Local bus and AXI Bus)
|
- Traffic generators
- Processors to assign values to transactions
- Device Interface
- Local Bus
- Bridge
- BUS
- Script
- DRAM
|
Variation 4 (Extension to Variation 3 with DMA)
|
- Traffic generators
- Processors to assign values to transactions
- Device Interface
- DMA Controller
- DMA Database
- BUS
- Script
- DRAM
|
Variation 5 (Extension to Variation 4 and use Cycle Accurate Memory Controller and Memory)
|
- Traffic generators
- Processors to assign values to transactions
- Device Interface
- DMA Controller
- DMA Database
- BUS
- Memory Controller
- DRAM
|
Apart from these parts, the following elements are unique to VisualSim:
- Parameters: values that remain constant during a simulation and can be varied across experiments.
- Variables: registers or single-value memory locations that
transfer data between blocks, hold status flags, and store
intermediate computed values.
- Simulator: defines the duration of the simulation.
Variances from the Block Diagram in VisualSim Modeling
Variances
We factor in the following variances when we model the block diagram
in VisualSim. These variances are simulation-specific, either for ease
of modelling or to capture statistics in the model.
- Data Structure fields: Enter the fields for Time Entered,
Priority, Data Size, Channel ID, Command type, Source name, and
Destination device name.
- Traffic: This block simulates the workload or the data packets
entering the queues. The block attribute is the data rate. In this
case, traffic blocks emulate the requests from processors or
peripheral devices.
- Attributes of the Memory Request: The typical attributes for the
Memory Request are data size in bytes, source, destination, priority,
type of request, and time entered.
- Differences with hardware: Note that the memory request does not
carry the actual data in this tutorial. However, the user can utilize
the real data and address information. In this tutorial, the request
carries just the size and command type.
- Arbitration: Although arbitration is not considered in the first
variation of the model, it is incorporated in the AHB and AXI buses from
Variation 2 onwards. Based on the type of arbitration algorithm
selected, the bus controller in AMBA AHB and AMBA AXI determines which
master is granted control.
- DeviceInterface: These blocks in VisualSim are not part of the
real hardware. They allow custom devices and abstract master
or slave devices to be modelled as part of the system.
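As a rough illustration of the data structure fields and request attributes listed above, the following Python sketch groups them into one record. The class name, field names, and types are assumptions made for this example and are not VisualSim's own data structure definition. As in the tutorial, the record carries only the size and command type, not the actual data.

```python
from dataclasses import dataclass


@dataclass
class MemoryRequest:
    """Illustrative stand-in for a memory request (hypothetical, not a VisualSim type)."""
    time_entered: float  # simulation time at which the request was created
    priority: int        # used by the arbitration scheme
    data_size: int       # size of the request in bytes
    channel_id: int
    command: str         # type of request, for example "Read" or "Write"
    source: str          # name of the requesting device
    destination: str     # name of the destination device


# Example request using values that appear later in this tutorial.
req = MemoryRequest(
    time_entered=0.0,
    priority=1,
    data_size=400,
    channel_id=0,
    command="Write",
    source="Group1",
    destination="DRAM",
)
print(req)
```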
Mapping of the block diagram to VisualSim Model
The following table provides a mapping of the block diagram to the VisualSim Model.
Block Diagram
|
VisualSim Model
|
|
|
|
|
|
|
|
|
|
- TimeDataPlotter
- Expression_List
- Queue
|
Building the VisualSim Model
Use the block information as the base and build a VisualSim model
with the Library Blocks listed in the following table (Table 1). Note
that some of the steps are applicable only to a particular variation.
Such information is given in the “Applicable for Variation” column.
Initial Setup
S.No.
|
Process
|
Library Block
|
Applicable for Variation
|
1.
|
- Create a Digital Simulator.
- Use the “Parameter=” block to define a parameter (TStop) and value (for example 10.0) for the Digital parameter “stopTime”.
|
Digital
|
Variation 1, 2, 3, 4, 5
|
2.
|
- Implement an Architecture setup
|
Architecture_Setup
|
Variation 1, 2, 3, 4, 5
|
Create Traffic Generators
S.No.
|
Process
|
Library Block
|
Applicable for Variation
|
1.
|
- Create three traffic generators (Group 1, Group 2, Group 3) to send transactions to the memory.
- Use the “Parameter=” block to define parameters
(Mean_Interrarival_Group1, Mean_Interrarival_Group2,
Mean_Interrarival_Group3) and a uniform value (for example 0.1) for the
parameter “Value_1” of the traffic generators.
- Specify the value “Fixed (Value_1)” for the parameter “Time_Distribution” of the traffic sources.
Note: Create three
modules of traffic generators with each module comprising three
different traffic controllers for Variations 2, 3, 4, and 5.
|
Traffic
|
All the Variations
|
2.
|
- Build blocks to assign values to the transactions from the traffic sources.
- Specify the following values for the parameter “Expression_List”.
- input.A_Bytes = 400
- input.A_Bytes_Remaining = 0
- input.A_Bytes_Sent = input.A_Bytes
- input.A_Command = "Write"
- input.A_Destination = "DRAM"
- input.A_Hop = "DRAM"
- input.A_Status = ""
- input.A_Task_Flag = true
- input.A_Interrupt = false
- input.A_Prefetch = false
- input.A_Priority = 1
- input.Time_Generated = TNow
|
Processing
|
All the Variations
|
3.
|
- Add the following instance-specific statements to the processing block of each traffic generator.
- Group 1 processing block:
- input.A_Source = "Group1"
- input.Origin = "Group1"
- Group 2 processing block:
- input.A_Source = "Group2"
- input.Origin = "Group2"
- Group 3 processing block:
- input.A_Source = "Group3"
- input.Origin = "Group3"
|
|
All the Variations
|
4.
|
- Create a block to make the transactions compatible with the proposed architecture.
|
DeviceInterface
|
Variations 2, 3, 4, and 5
|
5.
|
- Implement a block to receive the transactions from the Memory.
|
OUT
|
Variations 2, 3, 4, and 5
|
6.
|
|
DMA Controller
|
Variations 4 and 5
|
7.
|
|
DMA Database
|
Variations 4 and 5
|
Note:
- Connect the traffic sources to their respective processing blocks. (All Variations)
- Connect the Processing blocks to the Device Interfaces. (Variations 2, 3, 4, and 5)
- Connect the DeviceInterface to the OUT block. (Variations 2, 3, 4, and 5)
- Create back and forth connections between the DMA Controller and Device Interface. (Variations 4 and 5)
- Connect the Device Interface to the DMA Database. (Variations 4 and 5)
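The traffic generation and field assignment steps above can be approximated outside VisualSim with a short Python sketch. It is only a mental model of what the Traffic and Processing blocks do: the function names are hypothetical, the first transaction is assumed to arrive at time 0, and plain dictionaries stand in for the VisualSim data structure.

```python
def make_transaction(group, now):
    """Build one transaction carrying the fields from the Expression_List above."""
    return {
        "A_Bytes": 400,
        "A_Bytes_Remaining": 0,
        "A_Bytes_Sent": 400,      # same value as A_Bytes
        "A_Command": "Write",
        "A_Destination": "DRAM",
        "A_Hop": "DRAM",
        "A_Status": "",
        "A_Task_Flag": True,
        "A_Interrupt": False,
        "A_Prefetch": False,
        "A_Priority": 1,
        "Time_Generated": now,    # TNow at creation
        "A_Source": group,        # instance-specific fields
        "Origin": group,
    }


def generate(groups, interarrival, stop_time):
    """Emit transactions from every group at a fixed interarrival time."""
    transactions = []
    for group in groups:
        n = 0
        # Assumption: arrivals occur at 0, interarrival, 2 * interarrival, ...
        while n * interarrival < stop_time:
            transactions.append(make_transaction(group, n * interarrival))
            n += 1
    # Order by generation time; the stable sort keeps the group order on ties.
    transactions.sort(key=lambda t: t["Time_Generated"])
    return transactions


# Values from the tutorial: interarrival 0.1, stop time 10.0.
traffic = generate(["Group1", "Group2", "Group3"], interarrival=0.1, stop_time=10.0)
print(traffic[0]["A_Source"], traffic[0]["Time_Generated"])
```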
Prioritize the transactions (Applicable only for Variation 1)
S.No.
|
Process
|
Library Block
|
Applicable for Variation
|
1.
|
- Create a block titled “Arbitration” to prioritize the transactions.
|
Hierarchical
|
Variation 1
|
2.
|
- Implement a Command_Queue to put the transactions in a queue.
- Specify “A_Priority” as the value for the parameter “Priority_Field”.
|
Queue
|
Variation 1
|
3.
|
- Create a Const to POP the head of the queue:
- Right-click on the “Const” block and select menu Appearance->Flip Ports Horizontally.
|
Const
|
Variation 1
|
Note:
- Connect the processing blocks to the arbitration block. (Variation 1)
- Connect the arbitration block to the Command_Queue. (Variation 1)
- Connect the Const block to the Command_Queue. (Variation 1)
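The behaviour of a queue ordered on the A_Priority field and popped from the head can be sketched in Python as follows. This is not the VisualSim Queue block. The class name is hypothetical, and two details are assumptions: a larger A_Priority value is served first, and transactions with equal priority leave in arrival order.

```python
import heapq
import itertools


class CommandQueue:
    """Priority-ordered queue keyed on the A_Priority field (illustrative only)."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker that preserves arrival order

    def push(self, transaction):
        # Assumption: larger A_Priority is served first, so the key is negated.
        key = (-transaction["A_Priority"], next(self._arrival))
        heapq.heappush(self._heap, (key, transaction))

    def pop(self):
        """Return the head of the queue, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[1]


q = CommandQueue()
q.push({"A_Source": "Group1", "A_Priority": 1})
q.push({"A_Source": "Group2", "A_Priority": 3})
q.push({"A_Source": "Group3", "A_Priority": 1})

# Each pop plays the role of the Const block triggering the head of the queue.
served = [q.pop()["A_Source"] for _ in range(3)]
print(served)
```

Because the arrival counter is unique for every pushed transaction, the heap never has to compare two dictionaries directly.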
Implement BUS (Not applicable for Variation 1)
S.No.
|
Process
|
Library Block
|
Applicable for Variation
|
1.
|
- Create BUS to route the transactions.
|
AMBA_AXI
|
Variation 2
|
2.
|
- Implement a local BUS to route the transactions.
|
AMBA_AHB
|
Variations 3, 4, and 5
|
3.
|
- Create a bridge to connect the local bus to BUS.
|
Bridge
|
Variations 3, 4, and 5
|
Note:
- Create back and forth connections between the
- BUS and the Device Interface. (Variation 2)
- local BUS and the DeviceInterface. (Variations 3, 4, and 5)
- local BUS and the Bridge. (Variations 3, 4, and 5)
- BUS and the bridge. (Variations 3, 4, and 5)
Implement a Memory
S.No.
|
Process
|
Library Block
|
1.
|
- Implement a script to either randomly or sequentially write/read the transactions to the memory.
|
Script
|
2.
|
- Implement a Cycle Accurate Memory Controller to either randomly or sequentially write/read the transactions to the memory.
|
Memory Controller
|
3.
|
- Insert a memory to hold the transactions.
- Use the “Parameter=” block to define:
- “Access_Time” parameter with the value "Read
1000.0/Memory_Speed_Mhz, Prefetch 3.0, Write 1000.0/Memory_Speed_Mhz,
ReadWrite 3.0, Erase 3.0"
- “Memory_Speed_Mhz” with a value 256.0.
- Specify the following values for other parameters:
- Memory_Name: “DRAM”
- Memory_Type: DDR
- Deselect “Enable_Hello_Messages”
- Add the parameter “Refresh” and specify false as the default value.
|
RAM
|
Note:
- Connect the “Arbitration” block to the “Command_Queue”. (Variation 1)
- Connect the “Command_Queue” to “DRAM”. (Variation 1)
- Connect “DRAM” to “Const”. (Variation 1)
- Create back and forth connections between the
- Script and the BUS. (Variations 2, 3, and 4)
- Memory Controller and the BUS. (Variation 5)
- DRAM and Script. (Variations 2, 3, and 4)
- DRAM and Memory Controller. (Variation 5)
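The Access_Time expression above depends on Memory_Speed_Mhz, so it can help to evaluate it by hand once. The Python sketch below does only that arithmetic. The tutorial does not state the time unit here. Reading 1000.0/Memory_Speed_Mhz as the clock period in nanoseconds is an assumption.

```python
MEMORY_SPEED_MHZ = 256.0  # value of the Memory_Speed_Mhz parameter

# Mirrors the Access_Time string: Read and Write scale with memory speed,
# while Prefetch, ReadWrite, and Erase are fixed values.
access_time = {
    "Read": 1000.0 / MEMORY_SPEED_MHZ,
    "Prefetch": 3.0,
    "Write": 1000.0 / MEMORY_SPEED_MHZ,
    "ReadWrite": 3.0,
    "Erase": 3.0,
}

for command, value in access_time.items():
    print(f"{command}: {value}")
```

With Memory_Speed_Mhz set to 256.0, the Read and Write entries evaluate to 3.90625. Doubling the memory speed halves both values, while the three fixed entries are unaffected.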
Illustration of the Model
The VisualSim models are given below:
Variation 1
Figure 2: Type 1 VisualSim Model
Variation 2
Figure 3: Type 2 VisualSim Model
Variation 3
Figure 4: Type 3 VisualSim Model
Variation 4
Figure 5: Type 4 VisualSim Model
Variation 5
Figure 6: Type 5 VisualSim Model
Gathering Resource Statistics and Reports
Build a VisualSim model using the Library blocks listed in the table to gather resource statistics and reports.
Variation 1
S.No.
|
Process
|
Library Block
|
1.
|
- Create a block titled “xTime_yData_Plotter”.
- Right-click on the block and select menu Appearance->Flip Ports Horizontally.
|
TimeDataPlotter
|
2.
|
- Right-click on the block to select:
- menu Appearance->Flip Ports Horizontally.
- menu Customize Ports
- Add two additional output ports
- Click the “Add” button and specify “output2”, put a
check in the “Output” column, and select double for the “Type” from the
pull-down menu.
- Click the “Add” button and specify “output3”, put a check
in the “Output” column, and select double for the “Type” from the
pull-down menu.
|
Expression_List
|
3.
|
- Implement a queue titled “Smart_Timed_Resource”.
- Use the “Parameter=” block to define “Exit_Queue_Service_Time” as the value for the parameter “Time_Field”.
- Right-click on the block to select menu Appearance->Flip Ports Horizontally.
|
Queue
|
Note:
- Connect the “DRAM” block to the “Smart_Timed_Resource” block.
- Connect the “Smart_Timed_Resource” block to the “Decision” block.
- Connect the “Decision” block to the “xTime_yData_Plotter” block.
Other Variations
S.No.
|
Process
|
Library Block
|
1.
|
- Create a block to send the transaction from the memory for analysis.
|
IN
|
2.
|
- Right-click on the block to select:
- menu Appearance->Flip Ports Horizontally.
- menu Customize Ports
- Add two additional output ports
- Click the “Add” button and specify “output2”, put a
check in the “Output” column, and select double for the “Type” from the
pull-down menu.
- Click the “Add” button and specify “output3”, put a check
in the “Output” column, and select double for the “Type” from the
pull-down menu.
|
Expression_List
|
3.
|
- Create a block titled “Latency”.
- Right-click on the block and select menu Appearance->Flip Ports Horizontally.
|
TimeDataPlotter
|
Note:
- Connect the “IN” block to the “Decision” block.
- Connect the “Decision” block to the “Latency” block.
Analysis and Results
The latency graph shows the end-to-end latency for the complete
system. The X-axis is the simulation time and the Y-axis displays latency in
seconds. Latency increases gradually, and the spread between the
maximum and minimum latency is large, which may result in
unpredictable system behaviour.
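To put numbers on the spread described above, the latency of each transaction can be summarized offline. The Python sketch below assumes that latency is the completion time minus the Time_Generated field. The sample records are made up for illustration and are not results from the tutorial models.

```python
import statistics


def latency_summary(records):
    """Summarize end-to-end latency from (time_generated, time_completed) pairs."""
    latencies = [completed - generated for generated, completed in records]
    return {
        "min": min(latencies),
        "max": max(latencies),
        "mean": statistics.mean(latencies),
        "stdev": statistics.pstdev(latencies),  # population standard deviation
    }


# Hypothetical records in seconds: latency grows as the simulation progresses.
sample = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
summary = latency_summary(sample)
print(summary)
```

A large gap between "min" and "max", or a standard deviation that is large relative to the mean, points to the kind of unpredictability noted in the plots.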
Variation 1
Figure 7: Variation 1 - Latency vs. Simulation Time Plot
Variation 2
Figure 8: Variation 2 - Latency vs. Simulation Time Plot
Variation 3
Figure 9: Variation 3 - Latency vs. Simulation Time Plot
Variation 4
Figure 10: Variation 4 - Latency vs. Simulation Time Plot
Variation 5
Figure 11: Variation 5 - Latency vs. Simulation Time Plot
Additional Analysis
Build another model for three Traffic Generators with the following blocks:
- Traffic (Located in library Traffic->Traffic)
- Processing (Located in library Full Library -> Defining_Flow)
- VariableList (Located in library ModelSetup->VariableList)
- Delay (Located in library Traffic)
- Fork (Located in library Behavior)
- Text Display (Located in library Results)
Use the following values for the Traffic Generators:
- Traffic Generator 1: 1 transaction every 10 ms
- Traffic Generator 2: 10 transactions every 10 ms with a constant interarrival time (i.e., one transaction every 1 ms)
- Traffic Generator 3: 10 transactions every 10 ms with random interarrival times
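The three traffic profiles can be compared with a small Python sketch before building the model. The function names and the 100 ms observation window are hypothetical. The tutorial says only that Traffic Generator 3 uses random interarrival times, so the exponential distribution with a 1 ms mean is an assumption made for this example.

```python
import random


def fixed_arrivals(interarrival_ms, window_ms):
    """Arrival times at a constant interarrival time, starting at 0."""
    count = int(window_ms // interarrival_ms)
    return [i * interarrival_ms for i in range(count)]


def random_arrivals(mean_interarrival_ms, window_ms, rng):
    """Arrival times with exponentially distributed gaps (assumed distribution)."""
    times = []
    t = rng.expovariate(1.0 / mean_interarrival_ms)
    while t < window_ms:
        times.append(t)
        t += rng.expovariate(1.0 / mean_interarrival_ms)
    return times


rng = random.Random(1)                   # fixed seed so the run is repeatable
gen1 = fixed_arrivals(10.0, 100.0)       # 1 transaction every 10 ms
gen2 = fixed_arrivals(1.0, 100.0)        # 10 transactions every 10 ms, evenly spaced
gen3 = random_arrivals(1.0, 100.0, rng)  # about 10 transactions every 10 ms on average

print(len(gen1), len(gen2), len(gen3))
```

Generators 2 and 3 offer the same average load, so any difference in latency between them comes from the burstiness of the random arrivals.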