HW_Cache

Memory/Integrated_Cache
Block Name: Integrated_Cache

Code File Location: VisualSim/actor/arch/Memory/Integrated_Cache

Block Overview

This block can be used as an L1 (instruction or data), L2, or L3 cache. It has two modes of operation: a stochastic cache model and an address-based cache model. It can receive inputs from the processor or from any other device through the bus, and it performs the operation given by the command in the requesting data structure. How hits and misses are determined depends on the cache mode (stochastic or address based).

Description

    The cache block receives input from the processor, an IO device, or a lower-level cache and checks the block for a hit or a miss. On a hit, the data is sent back to the source (Read_Req) or written into the cache (Write_Req). On a miss, the cache sends a request for the specific block number to the higher-level cache or secondary memory, whichever is connected to it.
    Input flow control for the cache block is achieved by adding the field "A_Event_Name" to the input data structure. Output flow control is activated by the "Output_Flow_Control" parameter in the configuration window.
    The cache block can serve dedicated sources (specified in the Block_Configuration parameter) and non-dedicated sources. Each dedicated source has a separate memory space that other sources cannot access. Non-dedicated sources can use the cache without being named in Block_Configuration; the cache lets them access the common space that any device can access.

   Stochastic Model:
       The cache block uses the instruction hit ratio and the data hit ratio to determine whether a requested input is a hit or a miss.

   Address Based Model:
       The cache block uses A_I_Addr and A_D_Addr from the input data structure to check whether the specific address is available in the cache block.

   Write Policies:
       The cache block offers Write Through and Write Back policy options.
       With Write Through, data written to the current cache block is sent to the higher-level cache or memory at the same time.
       With Write Back, data is written to the cache block and is not sent to the higher-level memory until a read request for, or a replacement of, the corresponding block occurs.

   Replacement policy:
       The cache block offers Least Recently Used (LRU) and Pseudo-LRU options.
       With LRU, the least recently used block of memory is replaced by the new block.
       With Pseudo-LRU, the blocks that accumulated in the earliest stages are replaced.
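
The LRU option described above can be illustrated with a small Python sketch. This is not the block's implementation; the class name LRUBlocks and its methods are hypothetical, and the sketch only shows the general technique of evicting the block that was accessed longest ago.

```python
from collections import OrderedDict


class LRUBlocks:
    """Toy LRU tracker (hypothetical, for illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Keys are block numbers, ordered from least to most recently used.
        self.blocks = OrderedDict()

    def access(self, block_no):
        """Return (hit, evicted_block); evicted_block is None if nothing was replaced."""
        if block_no in self.blocks:
            # A hit makes this block the most recently used one.
            self.blocks.move_to_end(block_no)
            return True, None
        evicted = None
        if len(self.blocks) >= self.capacity:
            # Cache full: drop the least recently used block.
            evicted, _ = self.blocks.popitem(last=False)
        self.blocks[block_no] = None
        return False, evicted


cache = LRUBlocks(capacity=2)
cache.access(1)         # miss, block 1 loaded
cache.access(2)         # miss, block 2 loaded
cache.access(1)         # hit, block 1 becomes most recent
print(cache.access(3))  # miss, block 2 is the least recently used and is replaced
```

Pseudo-LRU is left out of the sketch because the text above does not specify its approximation scheme.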

   Required Fields:            Example_Value
        A_Source                   "Processor_1"
        A_Destination              "I_1_Cache"
        A_D_Addr                   32L
        A_I_Addr                   0L
        A_Command                  "Read_Instr"
        A_Bytes                    32
        A_Task_Flag                true
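
As a quick illustration of the required fields, the following Python sketch models a request as a dictionary and reports which required fields are missing. The helper missing_fields is hypothetical and is not part of VisualSim; the long-integer values (32L, 0L) are written as plain Python integers.

```python
# Field names taken from the list above; the checker itself is a hypothetical helper.
REQUIRED_FIELDS = (
    "A_Source",
    "A_Destination",
    "A_D_Addr",
    "A_I_Addr",
    "A_Command",
    "A_Bytes",
    "A_Task_Flag",
)


def missing_fields(request):
    """Return the required field names that are absent from the request."""
    return [name for name in REQUIRED_FIELDS if name not in request]


request = {
    "A_Source": "Processor_1",
    "A_Destination": "I_1_Cache",
    "A_D_Addr": 32,
    "A_I_Addr": 0,
    "A_Command": "Read_Instr",
    "A_Bytes": 32,
    "A_Task_Flag": True,
}

print(missing_fields(request))                    # complete request
print(missing_fields({"A_Source": "Processor_1"}))  # incomplete request
```

A check like this, run before a request reaches the cache, mirrors the advice in the error section to make sure the necessary fields are present at the input.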

    Use Cases:

Instruction Cache:

The A_Command field should be "Read_Instr" to read an instruction from the cache.
No_of_D_Blocks in the Block_Configuration parameter can be {0,0}, in which case the whole cache block is used for instructions alone.
Memory space for instructions is allocated to the dedicated sources specified in the block configuration.

Data Cache:

The A_Command field can be "Read_Req" or "Write_Req" to read or write data in the cache.
No_of_I_Blocks in the Block_Configuration parameter can be {0,0}, in which case the whole cache block is used for data alone.
Memory space for data is allocated to the dedicated sources specified in the block configuration.

L2 Cache:

The Block_Configuration parameter must list each dedicated source together with its number of blocks for the I-cache and the D-cache.
Memory space for both instructions and data is allocated to the dedicated sources based on the block configuration.

L3 Cache:

The whole cache block can be used as a data cache to read or write blocks of data.

The cache block supports the following commands:

Read_Instr (Instruction Cache)
Read_Req (Data Cache)
Write_Req (Data Cache)
Read (the request is treated as a data read)
Write (the request is treated as a data write)
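
The command list above can be summarized as a lookup table. The Python sketch below is only an illustration; the function classify and the (space, operation) tuples are hypothetical names, not VisualSim identifiers.

```python
# Command names come from the list above; the tuple values are illustrative labels.
COMMAND_KIND = {
    "Read_Instr": ("instruction", "read"),
    "Read_Req": ("data", "read"),
    "Write_Req": ("data", "write"),
    "Read": ("data", "read"),
    "Write": ("data", "write"),
}


def classify(command):
    """Map an A_Command value to (cache space, operation)."""
    try:
        return COMMAND_KIND[command]
    except KeyError:
        raise ValueError("unsupported A_Command: " + repr(command))


print(classify("Read_Instr"))
print(classify("Write"))
```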

    To see the usage illustrated, open the following example in BDE:
    Cache_Demo

Operation

Stochastic:

The following parameters play an important role in the stochastic model:

1. Data_Hit_Ratio

2. Instruction_Hit_Ratio

3. Loop_Ratio

Cache hits and misses are decided from these three parameters, which are not used in the address-based model.

An input request is processed in the cache as follows:

1. Determine whether the request is a hit or a miss.

2. On a hit, send the response out to the source.

3. On a miss, keep the input request in the queue and send out a request for a block of memory (Block_Size_KB).

4. When the next-level memory returns the content, process the request waiting in the queue and return the data to the source.

Because hits and misses are decided statistically, the outcome for an individual request is not accurate. This mode of operation is suited to simple architectures.
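
The stochastic flow above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the block's code: the class StochasticCache and its method names are hypothetical, the hit decision is assumed to be a uniform random draw compared against the hit ratio, and Loop_Ratio is left out because the text does not say how it enters the decision.

```python
import random
from collections import deque


class StochasticCache:
    """Hypothetical sketch of the stochastic hit/miss flow."""

    def __init__(self, data_hit_ratio, instruction_hit_ratio, seed=0):
        self.data_hit_ratio = data_hit_ratio
        self.instruction_hit_ratio = instruction_hit_ratio
        self.rng = random.Random(seed)  # seeded so runs are repeatable
        self.waiting = deque()          # requests waiting for a block from next-level memory

    def request(self, req):
        """Steps 1 to 3: decide hit or miss; queue the request on a miss."""
        is_instruction = req["A_Command"] == "Read_Instr"
        ratio = self.instruction_hit_ratio if is_instruction else self.data_hit_ratio
        # ASSUMPTION: a hit is a uniform draw in [0, 1) falling below the ratio.
        if self.rng.random() < ratio:
            return "hit"
        self.waiting.append(req)
        return "miss"

    def block_returned(self):
        """Step 4: next-level memory returned a block; release the oldest waiting request."""
        return self.waiting.popleft() if self.waiting else None


cache = StochasticCache(data_hit_ratio=0.8, instruction_hit_ratio=0.8)
results = [cache.request({"A_Command": "Read_Req"}) for _ in range(1000)]
print(results.count("hit") / len(results))  # close to 0.8, but not exact
```

Over many requests the measured hit fraction approaches the configured ratio, while any single request is unpredictable, which is why this mode trades accuracy for simplicity.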

Address_Based:

The following input fields are necessary for the address-based model:

1. A_Address

2. A_I_Addr and A_D_Addr, specifically for processor requests.

3. Block_No

Incoming requests are processed as follows:

1. The address value is used to determine a hit or a miss.

2. The first request to the cache block is a miss (prefetch is not implemented), so a block request is sent to the next-level memory and the request is kept in the queue.

3. If other requests arrive for the same block during this miss (within the address range of the requested block), they are also buffered in the queue.

4. Once the data is returned from the higher-level memory, the requests waiting for the returned block are processed and sent out to the source.

5. If the requested address is not in the cache, or the cache is full and a new address range is requested, the cache replaces a block according to the algorithm chosen in the replacement policy.

The flow above applies to read requests.
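
The read flow above can be sketched in Python. The class AddressBasedCache and its methods are hypothetical, the block number is assumed to be the address divided by the block size in bytes, and replacement (step 5) is omitted to keep the sketch short.

```python
class AddressBasedCache:
    """Hypothetical sketch of the address-based read flow (no replacement)."""

    def __init__(self, block_size_bytes):
        self.block_size = block_size_bytes
        self.resident = set()  # block numbers currently held in the cache
        self.waiting = {}      # block number -> addresses queued for that block

    def block_no(self, address):
        # ASSUMPTION: blocks are aligned, so integer division gives the block number.
        return address // self.block_size

    def request(self, address):
        """Return 'hit', 'miss' (block fetch issued) or 'queued' (fetch already outstanding)."""
        blk = self.block_no(address)
        if blk in self.resident:
            return "hit"
        if blk in self.waiting:
            # Step 3: another request for a block that is already being fetched.
            self.waiting[blk].append(address)
            return "queued"
        # Step 2: first touch of this block, request it from next-level memory.
        self.waiting[blk] = [address]
        return "miss"

    def block_returned(self, blk):
        """Step 4: mark the block resident and release every request waiting on it."""
        self.resident.add(blk)
        return self.waiting.pop(blk, [])


cache = AddressBasedCache(block_size_bytes=1024)
print(cache.request(0))         # first access to block 0
print(cache.request(32))        # same block, fetch still outstanding
print(cache.block_returned(0))  # both waiting requests are released
print(cache.request(100))       # block 0 is now resident
```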

Write Policies:

Write_Back:

An incoming write request updates the current cache but is not forwarded to the next-level memory.

The content is written to the next-level memory only in the following scenarios:

1. A read request arrives for an address that was written previously. In this case the current content is written to the next-level memory and the response is sent out to the source.

2. The cache overflows and needs to perform a replacement. The current block is written to the next-level memory and the replacement is initiated.

Write_Through:

An incoming write request updates the current cache, and the content is also sent out to the next-level memory for update.
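
The difference between the two write policies can be sketched in Python. The class WritePolicyCache is hypothetical and tracks only which block numbers are sent to the next-level memory; it follows the text in flushing a written block on a later read or on replacement under Write_Back.

```python
class WritePolicyCache:
    """Hypothetical sketch contrasting Write_Back and Write_Through."""

    def __init__(self, policy):
        if policy not in ("Write_Back", "Write_Through"):
            raise ValueError("unknown policy: " + repr(policy))
        self.policy = policy
        self.dirty = set()            # blocks written locally but not yet forwarded
        self.sent_to_next_level = []  # log of blocks written to next-level memory

    def write(self, blk):
        if self.policy == "Write_Through":
            # The update is forwarded together with the local write.
            self.sent_to_next_level.append(blk)
        else:
            # Write_Back: only the local copy changes for now.
            self.dirty.add(blk)

    def read(self, blk):
        # Scenario 1: a read of a previously written block triggers the update.
        self._flush(blk)

    def replace(self, blk):
        # Scenario 2: the block is written out before it is replaced.
        self._flush(blk)

    def _flush(self, blk):
        if blk in self.dirty:
            self.dirty.discard(blk)
            self.sent_to_next_level.append(blk)


wb = WritePolicyCache("Write_Back")
wb.write(3)
print(wb.sent_to_next_level)  # nothing forwarded yet
wb.read(3)
print(wb.sent_to_next_level)  # block 3 forwarded on the read
```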

Expected Data Structure

Data Structure Field	Value (Data Type)	Explanation
A_Bytes (necessary)	100	The total number of bytes to be transferred for this transaction. All bursts of this transaction carry this value.
A_Bytes_Remaining	96	The number of bytes remaining after the current transaction.
A_Bytes_Sent (necessary)	4	The number of bytes in this transaction.
A_Command (necessary)	"Read" or "Write"	Determines the operation.
A_Address (necessary)	100L	Used by the address decoder to determine the Row, Column, Bank, and Rank for the Read/Write operation.
A_Source (necessary)	"Processor"	The unique name of the source. When the transaction returns from the destination, the source and destination names are swapped: the source becomes the destination and the destination becomes the source.
A_Destination (necessary)	"DRAM"	The final destination.
A_Task_Flag (necessary)	false	The default is false, which means that the master does not require an acknowledgment for a write. If set to true, the VisualSim standard blocks send an acknowledgment back when all the data has been written to the slave. The DMA block uses this field to get a return from the Cache or DRAM block.

Input request combination

Description	A_Command	A_Bytes	A_Bytes_Remaining	A_Bytes_Sent
100 Byte Read at Slave. Bus Width = 4	Read	100	96	4
100 Byte Read Return at Master	Write	100	0	100
100 Byte Write at Slave	Write	100	0	100
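
The first row of the table above (A_Bytes = 100, A_Bytes_Sent = 4, A_Bytes_Remaining = 96) is consistent with a transfer split into bursts of one bus width each. The Python sketch below works through that arithmetic. It assumes every burst carries at most one bus width of data, and the generator name bursts is hypothetical.

```python
def bursts(total_bytes, bus_width):
    """Yield the byte-count fields for each burst of one transaction."""
    remaining = total_bytes
    while remaining > 0:
        # ASSUMPTION: each burst carries one bus width, except possibly the last.
        sent = min(bus_width, remaining)
        remaining -= sent
        yield {
            "A_Bytes": total_bytes,          # same value in every burst
            "A_Bytes_Sent": sent,            # bytes in this burst
            "A_Bytes_Remaining": remaining,  # bytes left after this burst
        }


sequence = list(bursts(total_bytes=100, bus_width=4))
print(sequence[0])    # first burst: 4 sent, 96 remaining
print(len(sequence))  # number of bursts needed for 100 bytes
```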

List of models

VS_AR\demo\memory

Cache_and_mem.xml
Cache_Demo.xml
Proc_Cache_MC.XML
4x_proc_Private_L2.xml
4x_proc_common_L2.xml

Parameter Configuration

Parameter	Explanation	Example
Cache_Name	Name of the cache block. Enter a unique name to avoid a clash with other blocks.	"Cache_1"
Cache_Speed_Mhz	Speed of the cache in MHz. It determines the clock cycle and internal timing of the cache block, and can be used to analyze the output timing.	500.0
Cache_Width_Bytes	Maximum width, in bytes, that the cache block can process in a single clock cycle. It can be used to analyze the output and the internal operation.	4
Cache_Size_KB	Overall size of the cache memory.	16
Block_Size_KB	Size of the blocks into which the cache memory is subdivided. On a miss, the cache requests a block of this size. For example, with a 32 KB cache and a 2 KB block size, the memory is organized as 16 blocks. These blocks can be assigned to a specific source/processor through the "Block_Configuration" parameter.	1
Block_Configuration	Organizes the cache memory for each source in terms of instruction and data blocks. A single cache can be configured as an instruction, data, or L2 cache. For each source, the total number of blocks and the blocks for I and D must be configured.	See the examples after this table
Data Hit Ratio	Used in the stochastic mode of operation to determine a hit or a miss for an input data request. Not used in the address-based (cycle-accurate) mode.	0.8
Instruction_Hit_Ratio	Used in the stochastic mode of operation to determine a hit or a miss for an input instruction request. Not used in the address-based (cycle-accurate) mode.	0.8
Loop_Ratio	Used in the stochastic mode to emulate looping in instruction fetch. Not used in the address-based (cycle-accurate) mode.	0.2
Overhead_Cycles	Overhead cycles added to each request, as required.	1
Cache_Replacement_Policies	Algorithm used to replace cache blocks. Options: Pseudo-LRU, Least_Recently_Used.	Pseudo-LRU
Cache_Write_Policy	Write policy of this cache block for the entire simulation. The operation of the policies is explained in this document.	Write_Through
Stochastic_or_Address_Based	Mode of operation of this cache block. Choose Address_Based for cycle-accurate operation, or Stochastic to emulate the cache with less accuracy. Options: Stochastic, Address_Based.	Address_Based
Miss_Memory_Name	Name of the next-level memory, to which the cache block sends a request on a miss. If the name is wrong, or no next-level memory with this name exists, the buses throw the error "Destination not found".	"L2_Cache"
Power_Manager_Name	Name of the power table. Enter it to observe the power consumption of this block along with the other blocks in the architecture.	"none"
First_Word	Determines the output transaction sequence. If true, the first word (cache-width length) of the requested bytes is sent out. If false, the last word (requested byte length) is sent out.	false
No_of_Statistics	Number of statistics samples generated for this cache block. The statistics can be viewed by connecting a text display to the stats port at the bottom of the block.	2
FIFO_Buffers	Currently not used.	16
Architecture_Name	Name of the architecture setup. It helps the tool observe the model and verify that the model is built correctly. Keep the name the same as in the Architecture setup block.	"Architecture_1"
Enable_Hello_Messages	Helps the tool find the source and destination of a request. When enabled, the blocks can route requests better.	true
Output_Flow_Control	Enables flow control on the slave side. Input flow control can be enabled by adding the field "Event_Name" on the master side of the cache. For a more detailed description of flow control, see the AXI Bus document (Master_Throttle_Enable parameter).	false

Block_Configuration examples

To configure the block as an L1 instruction cache:
Cache_Allocation       No_of_Blocks No_of_I_Blocks       No_of_D_Blocks
Source_Name                     15         {1,15}                    {0,0}

Example value with two sources, each with instruction and data blocks:
Cache_Allocation    No_of_Blocks    No_of_I_Blocks    No_of_D_Blocks    ;
Src_1            32        {1,16}        {17,32}        ;
Src_2            32        {1,16}        {17,32}        ;
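
The arithmetic behind several of these parameters can be checked with a short Python sketch. The function names are hypothetical. The block count follows the 32 KB / 2 KB example in the Block_Size_KB explanation; the cycle formula (one cache width per cycle plus Overhead_Cycles) and the reading of {first,last} as an inclusive block range are assumptions made for illustration, not documented behavior.

```python
import math


def number_of_blocks(cache_size_kb, block_size_kb):
    """Blocks in the cache, e.g. 32 KB with 2 KB blocks gives 16 blocks."""
    if cache_size_kb % block_size_kb != 0:
        raise ValueError("cache size should be a multiple of the block size")
    return cache_size_kb // block_size_kb


def cycle_time_ns(cache_speed_mhz):
    """Clock period in nanoseconds for a speed given in MHz."""
    return 1000.0 / cache_speed_mhz


def transfer_cycles(n_bytes, cache_width_bytes, overhead_cycles):
    # ASSUMPTION: one cache width is moved per cycle, plus the overhead cycles.
    return math.ceil(n_bytes / cache_width_bytes) + overhead_cycles


def allocation_is_consistent(no_of_blocks, i_range, d_range):
    # ASSUMPTION: {first,last} is an inclusive block range and {0,0} means "none".
    used = set()
    for first, last in (i_range, d_range):
        if (first, last) == (0, 0):
            continue
        if not (1 <= first <= last <= no_of_blocks):
            return False
        blocks = set(range(first, last + 1))
        if used & blocks:
            return False  # I and D ranges overlap
        used |= blocks
    return True


print(number_of_blocks(32, 2))
print(cycle_time_ns(500.0))
print(transfer_cycles(32, 4, 1))
print(allocation_is_consistent(32, (1, 16), (17, 32)))
```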




Port	Explanation
to_cache	West-side input port. The block can be connected to two buses, one on either side. This port is connected to the processor.
fm_cache	West-side output port. The block can be connected to two buses, one on either side. This port is connected to the processor.
stats	Debug messages and statistics are output on this port.
Throughput	Throughput of the block during the simulation period. A time data plotter is used to view the plot.
Latency	Latencies of the block for each input. A time data plotter is used to view the plot.
to_next_cache	East-side output port. The block can be connected to two buses, one on either side. This port is connected to the lower-level memory.
fm_next_cache	East-side input port. The block can be connected to two buses, one on either side. This port is connected to the lower-level memory.

How to connect

The image below shows a basic connection of the cache block. The bottom ports are dedicated to statistics, latency, and throughput.

[Image not available: basic connection of the cache block]

Enabling plots

To observe the latency and throughput of the cache block, connect a time data plotter to the bottom ports Throughput and Latency.

Statistics and Analysis

The Cache block generates the following statistics for analyzing its internal characteristics.

1. Hit_Ratio:

2. Miss_Ratio:

3. Prefetch_Ratio:

The hit, miss, and prefetch ratios of this cache block over the simulation period, as percentages.

4. Read_MBs:

5. Write_MBs:

6. Total_MBS:

Throughput of the cache block for reads, for writes, and for the overall block (reads and writes).

7. Read_MBs_per_Second:

8. Write_MBs_per_Second:

9. Total_MBs_per_Second:

Throughput per second for reads, for writes, and for reads plus writes.

10. Buffer_Occupancy:

Requests waiting in the buffer that have not yet been processed at the sample time.

11. Utilization:

The utilization of this block in the model, as a percentage, over the simulation period.

12. Number_Entered:

13. Number_Returned:

The total number of requests that entered and were returned by the cache block during the simulation period.
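
How such figures relate to raw counts can be shown with a short Python sketch. The function summarize and its inputs are hypothetical, the statistic names are copied from the list above, and one MB is assumed to be 10^6 bytes; the block's actual definitions may differ. Prefetch_Ratio, Buffer_Occupancy, and Utilization are left out.

```python
def summarize(hits, misses, read_bytes, write_bytes, sim_time_s):
    """Derive ratio and throughput statistics from raw counters (illustrative only)."""
    total_requests = hits + misses
    mb = 1e6  # ASSUMPTION: 1 MB = 10**6 bytes
    read_mb = read_bytes / mb
    write_mb = write_bytes / mb
    return {
        "Hit_Ratio": 100.0 * hits / total_requests,     # percentage
        "Miss_Ratio": 100.0 * misses / total_requests,  # percentage
        "Read_MBs": read_mb,
        "Write_MBs": write_mb,
        "Total_MBS": read_mb + write_mb,
        "Read_MBs_per_Second": read_mb / sim_time_s,
        "Write_MBs_per_Second": write_mb / sim_time_s,
        "Total_MBs_per_Second": (read_mb + write_mb) / sim_time_s,
    }


stats = summarize(hits=80, misses=20, read_bytes=2_000_000,
                  write_bytes=1_000_000, sim_time_s=0.5)
for name, value in stats.items():
    print(name, value)
```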

The following image shows the statistics that can be generated for the cache block.

[Image not available: statistics generated for the cache block]

Error messages and solutions

VisualSim.kernel.util.IllegalActionException: in .IC_Test.manager
Because:
java.lang.NullPointerException

Please make sure the Architecture setup block is present in the model.

Please make sure the necessary fields are present at the input. If not, please add them.

Error_Number     : Script_021
Explanation     : GTO (port_token.A_Destination)*, Check argument types, argument values, field names, and variables.
Exception        : VisualSim.kernel.util.IllegalActionException: No method found matching {BLOCK = string, DELTA = double, DS_NAME = string, ID = int, INDEX = int, TIME = double}.A_Destination()

Please make sure the necessary fields are present at the input. If not, please add them.

Error_Number     : Script_075B
Explanation     : hit_t = QUEUE("Q1",pop), getting QUEUE POP Check argument types, argument values, field names, and variables.
Exception        : Check QUEUE name: Q1

Please make sure the source name is configured in the "Block_Configuration" parameter.

Error : Issue with RegEx execution
Exception : VisualSim.kernel.util.IllegalActionException: Error invoking function public static java.lang.String VisualSim.data.expr.UtilityFunctions.throwMyException(java.lang.String) throws VisualSim.kernel.util.IllegalActionException

Because:
User RegEx Exception:
AXI_Top_Master_1 did not find Slave named DRAM
on port number: 1 from Source: L2_Cache
Check Device_Attached_to_Slave_N parmeters

Please make sure the name given in the Miss_Memory_Name parameter is valid.
