Block Name: CycleAccurateDRAM
Code File Location: VisualSim/actor/arch/Memory/HW_DRAM
Table of contents
- Block Description
- Operation
- State_Plot_Diagram
- How to connect
- Necessary Fields
- Parameter Configuration
- How to configure
- Refresh Configuration and Restriction
- Power Configuration
- Checklist before running the model
- Deadlock and Debugging
- Error handling
- Future Implementation
Description:
The CycleAccurateDRAM block is a standard DRAM module that emulates the memory array and the bidirectional data transfer between the sense amplifiers and the IO buffer. The DRAM and the memory controller follow the open-page policy, and the memory controller sends the appropriate commands to perform each operation. Both the memory controller and the DRAM follow the JEDEC standard.
Operation:
The DRAM module performs read/write accesses and the activation sequence according to the commands obtained from the memory controller.
Activation commands from the memory controller:
- RCD command: DRAM opens a page in the specified bank.
- RP command: DRAM closes the page in the specified bank.
Read command from the memory controller:
For SDR and DDRs, the DRAM initiates the read operation and transfers the data through the IO channel (tRL + data transfer through the channel).
- For SDR, the channel transfer rate is the same as that of the internal bus: 1 word of data takes 1 cycle in the channel.
- For DDRs, the channel transfer rate is twice that of the internal bus: 1 word of data takes 1/2 cycle in the channel.
- For LPDDRs, the internal transfer delay is tRL + DQSCK. The channel transfer rate is twice that of the internal bus (1 word of data takes 1/2 cycle in the channel).
Write command from the memory controller:
For SDR and DDRs (except DDR1), the DRAM initiates the write operation and transfers the data through the IO channel (tWL + data transfer through the channel).
- For SDR, the channel transfer rate is the same as that of the internal bus: 1 word of data takes 1 cycle in the channel.
- For DDRs, the channel transfer rate is twice that of the internal bus: 1 word of data takes 1/2 cycle in the channel.
- For LPDDRs and DDR1, the internal transfer delay is tWL + DQSS. The channel transfer rate is twice that of the internal bus (1 word of data takes 1/2 cycle in the channel).
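The read and write latency rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the block's implementation: the function name and arguments are hypothetical, a burst is assumed to consist of Burst_Length words, and DDR1 is assumed to correspond to the DRAM_Type string "DDR".

```python
LPDDR_TYPES = {"LPDDR", "LPDDR2", "LPDDR3", "LPDDR4", "LPDDR5"}


def access_time_ns(dram_type, is_read, burst_length, t_ck_ns,
                   t_rl_ns=0.0, t_wl_ns=0.0, t_dqsck_ns=0.0, t_dqss_ns=0.0):
    """Hypothetical helper: internal delay plus channel transfer time, in ns."""
    # SDR moves 1 word per clock; DDRs and LPDDRs move 1 word per half clock.
    cycles_per_word = 1.0 if dram_type == "SDR" else 0.5
    transfer_ns = burst_length * cycles_per_word * t_ck_ns

    if is_read:
        internal_ns = t_rl_ns
        if dram_type in LPDDR_TYPES:
            internal_ns += t_dqsck_ns  # LPDDR read: tRL + DQSCK
    else:
        internal_ns = t_wl_ns
        # Assumption: DRAM_Type "DDR" denotes DDR1.
        if dram_type in LPDDR_TYPES or dram_type == "DDR":
            internal_ns += t_dqss_ns  # LPDDR and DDR1 write: tWL + DQSS
    return internal_ns + transfer_ns


# LPDDR3 with a 1.25 ns clock and a burst of 8: tRL = 15 ns, DQSCK = 2 ns.
print(access_time_ns("LPDDR3", True, 8, 1.25, t_rl_ns=15.0, t_dqsck_ns=2.0))
```

Passing the timing values in nanoseconds keeps the sketch aligned with the units used by Mfg_Suggest_Timing and Extra_Timing.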
Refresh command from the memory controller:
- The RP command for an open bank precharges the bank and blocks other commands to that bank.
- The Refresh command emulates the refresh operation for the tRFC period; the affected bank is idle during that time.
State Plot Diagram:
The bottom flow, LPDDR5_Read_Write, shows the data transfer through the IO channel. Each of the remaining flows shows the state of the corresponding bank.
The data transfer of each read or write on the bidirectional bus appears as two consecutive pulses: a short pulse followed by a wider one. The short pulse shows the transfer of the first word, and the wide pulse shows the transfer of the remaining words up to the last word.
The state change in each bank describes one of the following operations:
1. RCD (Activate) - opening a page in that bank. Single pulse of tRCD duration (new activate in a closed bank).
2. RP (Precharge) - closing the page in that bank. Two pulses: the first of tRP duration and the second of tRCD duration (open-page policy: a page is closed only when a new page needs to be opened).
3. Refresh - refreshing a certain number of rows in a bank. Two pulses: the first of tRP duration and the second of tRFC duration (RP occurs only for banks that are open; for the other banks only RFC occurs. All banks are in the closed state after refresh).
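The pulse patterns listed above can be expressed as a small lookup. The sketch below is illustrative only; the function name, the operation strings, and the (label, duration) tuple format are assumptions and do not reflect how the plot is generated internally.

```python
def bank_pulses(operation, bank_open, t_rcd, t_rp, t_rfc):
    """Hypothetical helper: list of (label, duration) pulses for one bank."""
    if operation == "activate":
        # New activate in a closed bank: a single tRCD pulse.
        return [("RCD", t_rcd)]
    if operation == "precharge":
        # The page is closed only because a new page must be opened.
        return [("RP", t_rp), ("RCD", t_rcd)]
    if operation == "refresh":
        # Open banks are precharged first; closed banks see only RFC.
        pulses = [("RP", t_rp)] if bank_open else []
        return pulses + [("RFC", t_rfc)]
    raise ValueError("unknown operation: " + operation)


# DDR4-style values in ns, used purely as sample numbers.
print(bank_pulses("refresh", True, 13.75, 13.75, 160.0))
```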
How to Connect HW_DRAM to Memory Controller Block
Left-hand side: input/output to CycleAccurateDRAM, typically from/to the MC.
Right-hand side: input/output to write memory values, typically not used.
Note the port names in the diagram.
Necessary Fields for the HW_DRAM
1. A_Address_Bank - Bank Id for the requested command
2. A_Address_Row - Row Id for the requested command
3. A_Address_Col - Column Id for the requested command
4. A_Source - Source device of this request
5. A_Address - Requesting Address
6. A_Bytes - Requested Bytes
7. A_Mem_ID - Internal ID for each packet
8. A_Bytes_Total - Total bytes requested by the source (it differs from A_Bytes if the requested bytes differ from the burst size)
9. A_Addr_Ctrl_Flag - Internal flag to differentiate Read and Write
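As a quick sanity check, the field list above can be turned into a lookup that reports which fields a request is missing. This is a minimal sketch: the dictionary representation, the helper name, and every sample value (including the A_Addr_Ctrl_Flag value, whose real encoding is internal to the library) are assumptions for illustration.

```python
REQUIRED_FIELDS = (
    "A_Address_Bank", "A_Address_Row", "A_Address_Col", "A_Source",
    "A_Address", "A_Bytes", "A_Mem_ID", "A_Bytes_Total", "A_Addr_Ctrl_Flag",
)


def missing_fields(data_structure):
    """Hypothetical helper: names of required fields absent from a request."""
    return [name for name in REQUIRED_FIELDS if name not in data_structure]


# Illustrative values only; a 64-byte request split into 8-byte bursts.
request = {
    "A_Address_Bank": 2, "A_Address_Row": 100, "A_Address_Col": 16,
    "A_Source": "Processor_1", "A_Address": 0x1A2B00,
    "A_Bytes": 8, "A_Mem_ID": 1, "A_Bytes_Total": 64,
    "A_Addr_Ctrl_Flag": 0,  # placeholder value; real encoding is not documented here
}
print(missing_fields(request))
```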
Parameter Configuration:
Parameter | Explanation | Example |
HW_DRAM_Name | Unique name for the CycleAccurateDRAM. | "DRAM" |
DRAM_Type | Determines the type of memory. | "DDR4" /* SDR, DDR, DDR2, DDR3, DDR4, DDR5, LPDDR, LPDDR2, LPDDR3, LPDDR4, LPDDR5 */ |
HW_DRAM_Speed_Mhz | Speed of the CycleAccurateDRAM in MHz (data rate). | 3200.0 /* DDR4 typical shown */ |
Burst_Length | Number of data transfers in a burst. | 8 /* DDR4 typical shown */ |
Memory_Width_Bytes | Width of the memory interface in bytes. | 1 /* DDR4 typical shown */ |
Memory_Controller | Name of the memory controller connected to this DRAM. | "none" /* Default */ /* eg: "DDR_Controller" */ |
Address_Bit_Map | Determines the size and organization of the memory. It must be a 2D array with 4 internal arrays. Each internal array holds the range of address bits (or a specific bit) for one field: {{Column_Range}, {Row_Range}, {Bank_Range}, {Rank_Range}}. | {{4,13},{14,29},{0,3},{30}} /* col, row, bank, rank (min, max) Bit Position */ This configures the memory as a single-rank 8GB package with 4 bank groups, 4 banks per bank group, and a 1KB page size. |
Mfg_Suggest_Timing | The recommended timing for the DRAM as an array of six double values (in ns). The six entries are, in this order: tCL, tRCD, tRP, tRAS, tAL, tCWL. tCL = CAS latency (read); tRCD = RAS# to CAS# delay; tRP = RAS# precharge; tRAS = active to precharge delay; tAL = additive latency in read and write; tCWL = CAS latency for write. | {13.75, 13.75, 13.75, 32, 0, 12.5} /* tCL, tRCD, tRP, tRAS, tAL, tCWL in ns */ /* DDR4 typical shown */ |
Extra_Timing | Extra timing for the memory controller: an array of 9 double values in ns that vendors use to describe other intermediate latencies and further refine memory timing, in this order: DQSS, tWTR, tCCD, tRRD, tWR, tDQSCK, tRTP, tHWpre, tFAW. DQSS is the time for the bidirectional bus strobe; tWTR is the minimum interval between the end of a WRITE and the start of a READ command; tCCD is the minimum interval between two sequential reads/writes; tRRD is the minimum interval between successive ACTIVE commands to different banks; tWR is the minimum interval between the end of a WRITE and a PRECHARGE command; tDQSCK is the data strobe (DQS) output access time from the clock; tRTP is the read-to-precharge time; tHWpre can be used to add cycles to the tRAS, tRTP, and tWR parameters (default 0); tFAW is the window in which at most 4 activates can be issued, so a 5th activate must wait for the next FAW window. | {0, 7.5, 0, 0, 15, 0, 7.5, 0, 21} /* DQSS, tWTR, tCCD, tRRD, tWR, tDQSCK, tRTP, tHWpre, tFAW in ns */ /* DDR4 typical shown */ |
Retention_Time | Refresh window in which the entire memory is refreshed once. | 64.0E-3 /* LPDDR2 and DDR3 typical shown */ |
Fine_Granularity_Refresh_Time | Refresh time (tRFC) for each refresh command. | 166.0E-09 |
Fine_Granularity_Refresh | Fine-tunes the refresh interval. (Only applies to DDR4 and DDR5. For other DRAM types it is FGR_1x by default.) | FGR_1x /* None, FGR_1x, FGR_2x, FGR_4x, FGR_8x, FGR_16x */ |
REFpb_T_REFab_F | Selects per-bank or all-bank refresh. "true" enables per-bank refresh and "false" enables all-bank refresh. | true |
Same_Bank_Refresh | Enables same-bank refresh (only used in DDR5). | false |
Refresh_Statistical | Enables or disables statistical refresh (a stochastic version of refresh). | false |
Power_Manager_Name | Name of the power manager. "none" disables power analysis; a valid PowerTable name enables it. | "none" /* Default */ /* eg: "Manager_1" */ |
Debug | If true, an activity trace is output on the port. If false, no output is produced. This gives the user information on the actions within the block. | true /* Debug data */ |
State_Plot_Enable | Enables or disables the timing diagram of this DRAM. | false |
Enable_External_Data | If true, the data structure is sent out on port_3. The DRAM does not continue processing that data structure until it receives it back on port_4. If false, port_3 is bypassed. | false /* No output on port_3 */ |
Standard_Name | (Only used if the standard timings are configured through a file.) Memory standard name used to select the set of timings from the standard file. | "none" /* eg: DDR_Memory_Standards.txt */ |
Standard_File | Another way to configure the timing values, using a text file. The text file must contain the timings along with the memory standard name. | Option to browse and select the file. |
Architecture_Name | Unique architecture name, typically "Architecture_1". | "Architecture_1" |
Port | Explanation |
port_1 | Input from the Memory Controller. |
port_2 | Output to the Memory Controller. |
port_3 | Connected to external logic for making a read request or writing data to an address location. |
port_4 | Connected to external logic for reading data from an address location. |
fm_ctrl | Control signals (tRAS, tRP) from the Memory Controller. |
to_ctrl | Control signals (tRAS, tRP, tRCD) to the Memory Controller. |
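The tRRD and tFAW entries of Extra_Timing in the parameter table constrain when an ACTIVE command may be issued. The sketch below shows one way to reason about them. It assumes the usual rolling-window reading of tFAW (no more than 4 activates in any tFAW span); the function name and the list-of-timestamps representation are hypothetical and are not taken from the block.

```python
def earliest_activate_ns(history_ns, request_ns, t_rrd_ns, t_faw_ns):
    """Hypothetical helper: earliest time a new ACTIVE may issue.

    history_ns holds the issue times of earlier activates, oldest first.
    """
    earliest = request_ns
    if history_ns:
        # tRRD: minimum spacing from the previous activate.
        earliest = max(earliest, history_ns[-1] + t_rrd_ns)
    if len(history_ns) >= 4:
        # tFAW: the 5th activate waits until the window that began
        # with the 4th-most-recent activate has elapsed.
        earliest = max(earliest, history_ns[-4] + t_faw_ns)
    return earliest


# Four activates already issued; the fifth is limited by tFAW = 21 ns.
print(earliest_activate_ns([0.0, 5.0, 10.0, 15.0], 16.0, 5.0, 21.0))
```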
How to Configure specific DRAM type:
The user has to extract the following timing values from the LPDDR3 datasheet. The timing values must be chosen according to the data rate.
In this configuration we use a 1600 data rate and a single-channel 4 GB LPDDR3.
DRAM Configuration:
HW_DRAM_Speed_MHz : 1600.0 (1.25e-9 clock time)
Burst_Length : 8 (8n prefetch)
Memory_Width_Bytes : 4 (x32 device)
Mfg_Suggest_Timing : {15, 18, 18, 42, 0, 7.5} /* tCL, tRCD, tRP, tRAS, tAL, tCWL in ns */
CL or RL = 12 clocks = 15ns
RCD = max(18ns, 3clk) = 18ns
RP = max(18ns, 3clk) = 18ns (per bank refresh)
RAS = max(42ns, 3clk) = 42ns (RAS_min)
AL = 0 (choosing 0 additive latency, user's choice)
CWL or WL = 6 clocks = 7.5ns
Extra_Timing : {1.25, 7.5, 5, 10, 15, 0, 7.5, 0, 50} /* DQSS, tWTR, tCCD, tRRD, tWR, tDQSCK, tRTP, tHWpre, tFAW in ns */
DQSS min = 0.75 clock
DQSS max = 1.25 clock
Use 1 clock for all LPDDRs to maintain the same functionality, so 1.25ns.
WTR = max(7.5ns, 4clk) = 7.5ns
CCD = 4 clocks = 5ns
RRD = max(10ns, 2clk) = 10ns
WR = max(15ns, 3clk) = 15ns
DQSCK = 2ns (DQSCK min) (applicable only for LPDDRs; for DDRs it must be 0)
RTP = max(7.5ns, 4clk) = 7.5ns
HWpre = 0 (not used in the current version)
FAW = max(50ns, 8clk) = 50ns
Address_Bit_Map : {{3,12},{13,26},{0,2},{30}} /* col, row, bank, rank (min, max) Bit Position */
8 banks - 3 bits to address them
1K columns - 10 bits to address them
16K rows - 14 bits to address them
Bank-interleaved addressing format:
| 0 to 2 | 3 to 12 | 13 to 26 | 30 |
| 8 banks | 1K Columns | 16K Rows | single rank | (31st address bit must be 0 all the time)
Fine_Granularity_Refresh_Time : 130.0e-9
RFC = 130ns for 4GB
Retention_Time : 32ms
REFW = 32ms for 4GB
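The arithmetic used in the LPDDR3 example (clock period from the data rate, clocks to nanoseconds, and the max(ns, clocks) rule) can be reproduced with a short script. This is a sketch under stated assumptions: the helper names are hypothetical, and the clock is assumed to run at half the data rate, which matches the 1.25e-9 clock time quoted for a 1600 data rate.

```python
import math


def clock_period_ns(data_rate, transfers_per_clock=2):
    """Clock period in ns for a data rate given in mega-transfers per second."""
    return 1000.0 * transfers_per_clock / data_rate


def effective_ns(min_ns, min_clocks, t_ck_ns):
    """Datasheet rule max(time, n clocks), expressed in ns."""
    return max(min_ns, min_clocks * t_ck_ns)


def to_clocks(time_ns, t_ck_ns):
    """Round a time in ns up to whole clocks (small epsilon guards float noise)."""
    return math.ceil(time_ns / t_ck_ns - 1e-9)


t_ck = clock_period_ns(1600.0)          # LPDDR3-1600
timings = {
    "tCL": 12 * t_ck,                   # 12 clocks
    "tRCD": effective_ns(18.0, 3, t_ck),
    "tWTR": effective_ns(7.5, 4, t_ck),
    "tFAW": effective_ns(50.0, 8, t_ck),
}
print(t_ck, timings)
```

The same helpers can be used in the other direction, for example to check how many clocks a nanosecond value from a datasheet occupies at a given data rate.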
In this configuration we use a 2400 data rate and a single-channel 8 GB DDR4 (1 clock preamble).
DRAM Configuration:
HW_DRAM_Speed_MHz : 2400.0 (0.833e-9 clock time)
Burst_Length : 8 (8n prefetch)
Memory_Width_Bytes : 1 (x8 device)
Mfg_Suggest_Timing : {13.75, 13.75, 13.75, 32, 0, 13.3} /* tCL, tRCD, tRP, tRAS, tAL, tCWL in ns */
CL - RCD - RP = 17 - 17 - 17
CL or AA = 17 clocks = 13.75ns
RCD = 17 clocks = 13.75ns
RP = 17 clocks = 13.75ns
RAS = 39 clocks = 32ns (RAS_min)
AL = 0 (choosing 0 additive latency, user's choice)
CWL or WL = 16 clocks = 13.3ns
Extra_Timing : {0, 0, 0, 0, 15, 0, 7.5, 0, 21} /* DQSS, tWTR, tCCD, tRRD, tWR, tDQSCK, tRTP, tHWpre, tFAW in ns */
DQSS min = 0.75 clock
DQSS max = 1.25 clock
Use 0 clocks for all DDRs to maintain the same functionality, so 0ns.
WTR = 0
CCD = 0
RRD = 0
WR = 15ns
DQSCK = 0
RTP = max(7.5ns, 4clk) = 7.5ns
HWpre = 0 (not used in the current version)
FAW = max(21ns, 20clk) = 21ns
Address_Bit_Map : {{4,13},{14,29},{0,3},{30}} /* col, row, bank, rank (min, max) Bit Position */
16 banks - 4 bits to address them (4 bank groups and 4 banks per bank group)
1K columns - 10 bits to address them
64K rows - 16 bits to address them
Bank-interleaved addressing format:
| 0 to 3 | 4 to 13 | 14 to 29 | 30 |
| 16 banks | 1K Columns | 64K Rows | single rank | (31st address bit must be 0 all the time)
Fine_Granularity_Refresh_Time : 160.0e-9
RFC = 160ns for 8GB
Retention_Time : 64ms
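To see how an Address_Bit_Map such as the ones above splits an address, the following sketch extracts each field from its inclusive (min, max) bit range. The function names and the sample address are hypothetical; the block's own decoding logic is not shown in this document and may differ.

```python
def field_size(entry):
    """Number of distinct values addressed by an inclusive bit range."""
    lo, hi = entry[0], entry[-1]
    return 1 << (hi - lo + 1)


def extract(address, entry):
    """Value of the address bits covered by an inclusive (min, max) range."""
    lo, hi = entry[0], entry[-1]
    return (address >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode(address, bit_map):
    """Split an address using the {col, row, bank, rank} ordering."""
    col, row, bank, rank = bit_map
    return {
        "col": extract(address, col),
        "row": extract(address, row),
        "bank": extract(address, bank),
        "rank": extract(address, rank),
    }


# DDR4 map from the example: col 4..13, row 14..29, bank 0..3, rank bit 30.
ddr4_map = [[4, 13], [14, 29], [0, 3], [30]]
address = (0x1234 << 14) | (0x155 << 4) | 0xA   # made-up sample address
print(decode(address, ddr4_map))
print(field_size(ddr4_map[2]), field_size(ddr4_map[0]), field_size(ddr4_map[1]))
```

The field sizes give a quick consistency check against the bank, column, and row counts listed with each map.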
Refresh Configuration and Restriction:
Basic configuration:
1. Retention_Time - Refresh window in which the entire DRAM is refreshed once. It is typically configured as either 64ms or 32ms according to the DDR type.
2. Fine_Granularity_Refresh_Time - tRFC, the time taken to complete a refresh command. It varies with the device size in each DRAM type.
3. Refresh_Statistical - setting it to true will disable the Refresh operation in DRAM.
Refresh Type:
1. REFpb_T_REFab_F - true selects per-bank refresh and false selects all-bank refresh.
2. Same_Bank_Refresh - true selects same-bank refresh (the same bank in each bank group is refreshed at the same time).
Restrictions:
1. DDRs support only all-bank refresh. If per-bank refresh is set, it is ignored.
2. LPDDRs support only per-bank refresh. If all-bank refresh is set, it is ignored. (All-bank support will be added in the future.)
3. DDR5 supports both all-bank and same-bank refresh. If same-bank refresh is set, the REFpb_T_REFab_F parameter is ignored.
4. FGR rate: all DDRs and LPDDRs except DDR4 and DDR5 use FGR_1x by default; setting a different rate has no effect.
5. DDR4 supports only 1x, 2x, and 4x. Setting a rate beyond that causes an error.
6. DDR5 supports only 1x and 2x. Setting a rate beyond that causes an error.
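The restrictions above can be condensed into a small resolver that returns the refresh mode and FGR rate that actually take effect. This is a sketch only: the function name and return strings are hypothetical, SDR is assumed to behave like the DDR group, and the "None" option of Fine_Granularity_Refresh is not modeled.

```python
SUPPORTED_FGR = {
    "DDR4": {"FGR_1x", "FGR_2x", "FGR_4x"},
    "DDR5": {"FGR_1x", "FGR_2x"},
}


def resolve_refresh(dram_type, per_bank, same_bank=False, fgr="FGR_1x"):
    """Hypothetical helper: (effective refresh mode, effective FGR rate)."""
    # per_bank mirrors REFpb_T_REFab_F; under restrictions 1 and 2 it
    # currently never changes the outcome, so it is accepted but unused.
    if dram_type.startswith("LPDDR"):
        mode = "per_bank"
    elif dram_type == "DDR5" and same_bank:
        mode = "same_bank"
    else:
        mode = "all_bank"  # assumption: SDR is treated like the DDR group

    if dram_type in SUPPORTED_FGR:
        if fgr not in SUPPORTED_FGR[dram_type]:
            raise ValueError(fgr + " is not supported for " + dram_type)
        rate = fgr
    else:
        rate = "FGR_1x"  # other types always use 1x
    return mode, rate


print(resolve_refresh("DDR5", per_bank=True, same_bank=True, fgr="FGR_2x"))
```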
Power Configuration:
DRAM requires the following entries in the power table for power analysis (missing any of these entries will cause an error):
Format
1. ACT Standby - "ACT_Standby_"+DRAM_Name
2. ACT Active - "ACT_Active_"+DRAM_Name
3. Read Power - "Read_Power_"+DRAM_Name
4. Write Power - "Write_Power_"+DRAM_Name
5. Refresh Power - "RFSH_Power_"+Bank_ID+"_"+DRAM_Name (refresh power must be entered for each bank, else an error will occur)
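Because one refresh entry is needed per bank, it can help to generate the expected entry names and compare them with the power table before running. The sketch below does that; the helper names are hypothetical, bank IDs are assumed to start at 0, and the "ACT_Standby_" spelling follows the example entries.

```python
def required_power_entries(dram_name, num_banks):
    """Hypothetical helper: power table row names expected for one DRAM."""
    names = [
        "ACT_Standby_" + dram_name,
        "ACT_Active_" + dram_name,
        "Read_Power_" + dram_name,
        "Write_Power_" + dram_name,
    ]
    # One refresh entry per bank; bank IDs assumed to be 0 .. num_banks - 1.
    names += ["RFSH_Power_%d_%s" % (bank, dram_name) for bank in range(num_banks)]
    return names


def missing_power_entries(table_names, dram_name, num_banks):
    """Entries that are required but absent from the given power table names."""
    present = set(table_names)
    return [n for n in required_power_entries(dram_name, num_banks) if n not in present]


table = ["ACT_Standby_DRAM", "ACT_Active_DRAM", "Write_Power_DRAM",
         "Read_Power_DRAM", "RFSH_Power_0_DRAM", "RFSH_Power_1_DRAM"]
print(missing_power_entries(table, "DRAM", 4))
```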
Example entries:
Architecture_Block  Standby  Active  Wait  Idle  Existing  OffState  OnState  t_OnOff  Mhz     Volts ;
ACT_Standby_DRAM    37.6     0.1     22.4  0.0   Standby   Standby   Active   0.0      1000.0  1.0   ;
ACT_Active_DRAM     37.6     0.1     22.4  0.0   Standby   Standby   Active   0.0      1000.0  1.0   ;
Write_Power_DRAM    0.0      216.2   0.0   0.0   Standby   Standby   Active   0.0      1000.0  1.0   ;
Read_Power_DRAM     0.0      392.2   0.0   0.0   Standby   Standby   Active   0.0      1000.0  1.0   ;
RFSH_Power_0_DRAM   11.8     12.0    0.0   0.0   Standby   Standby   Active   0.0      1000.0  1.0   ;
RFSH_Power_1_DRAM   11.8     12.0    0.0   0.0   Standby   Standby   Active   0.0      1000.0  1.0   ;
Note for Developer:
Refresh power is supported only for DRAMs with up to 32 banks. If the DRAM is configured with more banks, the refresh loops must be replicated for the required number of banks.
Checklist before running the model:
- The DRAM name and memory controller name are configured correctly in both blocks.
- DRAM type must be the same in the memory controller and the DRAM.
- Make sure the DRAM name is correctly configured with the appropriate source device.
- Make sure the timing parameters are configured as per the vendor's datasheet. The default values are for DDR4, obtained from a Micron datasheet.
- Memory_Width_Bytes must be the same in the memory controller and the DRAM.
- The Address_Bit_Map parameter must match the Memory_Coloumn, Memory_Row and Memory_Rank in the memory controller.
- Mfg_Suggest_Timing and Extra_Timing must be the same in the memory controller and the DRAM.
- Retention_Time must be within the range of 1us to 1sec. Configure the value as per the datasheet. Typical values are in ms.
- Fine_Granularity_Refresh_Time must be within the range of 50ns to 1sec. Configure the value as per the datasheet. Typical values are in ns.
- To enable power analysis, make sure Power_Manager_Name is the valid name of a Power Table. The Power Table must have the required DRAM entries.
- The DRAM can be connected to the bus interface through a memory controller. Make sure the memory controller is configured correctly for the DRAM.
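Several of the checklist items are mechanical and can be verified with a short script before a run. The sketch below is an assumption-laden illustration: it represents each block's parameters as a plain dictionary and assumes the memory controller uses the same parameter names as the DRAM for the values that must match.

```python
MATCHED_PARAMETERS = (
    "DRAM_Type", "Memory_Width_Bytes", "Mfg_Suggest_Timing", "Extra_Timing",
)


def check_configuration(dram, controller):
    """Hypothetical helper: list of checklist violations (empty if none)."""
    problems = []
    for name in MATCHED_PARAMETERS:
        if dram.get(name) != controller.get(name):
            problems.append(name + " differs between DRAM and memory controller")
    # Range limits quoted in the checklist, in seconds.
    if not 1.0e-6 <= dram["Retention_Time"] <= 1.0:
        problems.append("Retention_Time must be within 1us to 1sec")
    if not 50.0e-9 <= dram["Fine_Granularity_Refresh_Time"] <= 1.0:
        problems.append("Fine_Granularity_Refresh_Time must be within 50ns to 1sec")
    return problems


shared = {
    "DRAM_Type": "DDR4", "Memory_Width_Bytes": 1,
    "Mfg_Suggest_Timing": [13.75, 13.75, 13.75, 32, 0, 12.5],
    "Extra_Timing": [0, 7.5, 0, 0, 15, 0, 7.5, 0, 21],
}
dram = dict(shared, Retention_Time=64.0e-3, Fine_Granularity_Refresh_Time=166.0e-9)
print(check_configuration(dram, dict(shared)))
```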
Deadlock and debugging:
Deadlock situations:
- If Command_Buffer_Length is less than or equal to the total number of fragments of the input request, a deadlock can occur.
- Fragment size = Burst_Length * Memory_Width_Bytes. If the input request size is greater than this size, the request is fragmented.
- If TResolution is set to 1.0E-12 in the Digital Simulator, it can cause a deadlock. Please reduce the TResolution to 1.0E-13.
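The first two deadlock conditions reduce to simple arithmetic, sketched below. The function names are hypothetical, and the request is assumed to be split into equal fragments with the last one possibly partial.

```python
def fragment_count(request_bytes, burst_length, memory_width_bytes):
    """Number of fragments a request is split into (rounded up)."""
    fragment_size = burst_length * memory_width_bytes
    return -(-request_bytes // fragment_size)  # integer ceiling division


def may_deadlock(command_buffer_length, request_bytes,
                 burst_length, memory_width_bytes):
    """True when the buffer is not larger than the request's fragment count."""
    fragments = fragment_count(request_bytes, burst_length, memory_width_bytes)
    return command_buffer_length <= fragments


# A 64-byte request on a x8 device with burst length 8 yields 8 fragments.
print(fragment_count(64, 8, 1), may_deadlock(8, 64, 8, 1))
```

Under this reading, a Command_Buffer_Length of at least one more than the largest expected fragment count avoids the first condition.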
Debugging methods:
- Enable the Debug parameter in the memory controller. Connect a text display to save or view the debug messages from the memory controller.
- Look for the last command issued to the DRAM other than refresh messages.
- Connect a text display to ArchitectureSetup to observe statistics.
- Look for Total Requests and Completed_Requests.
- Set State_Plot_Enable to true in the DRAM to observe the state of each bank.
- Make sure the ports are connected correctly (output to input).
Error Handling:
1. VisualSim.kernel.util.IllegalActionException:
Block : Test_Run.CycleAccurateDRAM.State_Machine
Error : Issue with RegEx execution
Exception :
VisualSim.kernel.util.IllegalActionException: Error invoking function
public static java.lang.String
VisualSim.data.expr.UtilityFunctions.throwMyException(java.lang.String)
throws VisualSim.kernel.util.IllegalActionException
Because:
User RegEx Exception:
LPDDR5_State_Machine cannot locate Memory_Controller: M_LPDDR5_Controller, suggest edit.
Please make sure the memory controller name is configured correctly in the DRAM block.
2. VisualSim.kernel.util.IllegalActionException:
Block : Test_Run.CycleAccurateDRAM.State_Machine
Error : Issue with RegEx execution
Exception :
VisualSim.kernel.util.IllegalActionException: Error invoking function
public static java.lang.String
VisualSim.data.expr.UtilityFunctions.throwMyException(java.lang.String)
throws VisualSim.kernel.util.IllegalActionException
Because:
User RegEx Exception:
Row First memory bit should contain integer or array, found: {12.0, 24.0}
Please make sure the Address_Bit_Map parameter is an array of integers; other data types are not supported.
3. VisualSim.kernel.util.IllegalActionException:
Block : Test_Run.CycleAccurateDRAM.HW_DRAM
Error : Issue with RegEx execution
Exception :
VisualSim.kernel.util.IllegalActionException: add method not supported
between VisualSim.data.StringToken '"Parameter 'Burst_Length' contains
Unsupported data type value ="' and VisualSim.data.ArrayToken '{16}'
because the types are incomparable.
in .Test_Run.CycleAccurateDRAM.HW_DRAM
Please make sure the Burst_Length parameter is an integer; other data types are not supported.
4. VisualSim.kernel.util.IllegalActionException:
Block : Test_Run.CycleAccurateDRAM.State_Machine
Error : Issue with RegEx execution
Exception :
VisualSim.kernel.util.IllegalActionException: Error invoking function
public static java.lang.String
VisualSim.data.expr.UtilityFunctions.throwMyException(java.lang.String)
throws VisualSim.kernel.util.IllegalActionException
Because:
User RegEx Exception:
LPDDR5_State_Machine Retention_Time (32.0) is negative, less than 1 usec, or greater than 1.0, suggest edit.
in .Test_Run.CycleAccurateDRAM.State_Machine
Please make sure the Retention_Time parameter is a double within the range of 1us to 1.0 sec; out-of-range values are not supported. A typical value is 64.0E-3.
5. VisualSim.kernel.util.IllegalActionException:
Block : Test_Run.CycleAccurateDRAM.State_Machine
Error : Issue with RegEx execution
Exception :
VisualSim.kernel.util.IllegalActionException: Error invoking function
public static java.lang.String
VisualSim.data.expr.UtilityFunctions.throwMyException(java.lang.String)
throws VisualSim.kernel.util.IllegalActionException
Because:
User RegEx Exception:
LPDDR5_State_Machine Fine_Granularity_Refresh_Time (2.0E-8) is negative, less than 50 ns, or greater than 1.0, suggest edit.
in .Test_Run.CycleAccurateDRAM.State_Machine
Please make sure the Fine_Granularity_Refresh_Time parameter is a double within the range of 50ns to 1.0 sec; out-of-range values are not supported. A typical value is 160.0E-9.
6.
VisualSim.kernel.util.IllegalActionException:
Block : Test_Run.CycleAccurateDRAM.HW_DRAM
Line : 257
Error_Number : Script_001
Explanation :
Set_Block_Reference
= addBlockReference(Architecture_Name, Block_Name), Check argument
types, argument values, field names, and variables.
Exception :
VisualSim.kernel.util.IllegalActionException: Error invoking function
public static VisualSim.data.BooleanToken
VisualSim.data.expr.UtilityFunctions.addBlockReference(VisualSim.data.StringToken,VisualSim.data.StringToken)
throws VisualSim.kernel.util.IllegalActionException
Because:
Cannot read Memory Reference: Architectue_1
Source Reference: LPDDR5
Please make sure the ArchitectureSetup name is configured correctly.
Future Implementations:
- Multi-rank support: timing constraints for multi-rank activate and access must be added.
- LPDDR4 dual-cycle issue: 2 cycles for issuing all commands except precharge must be added.
- All-bank refresh support for LPDDRs must be added.
- FAW constraint for activates following an all-bank refresh must be added.
- Postponing and advancing of refresh must be added.
- Generalization of the refresh code to support N banks must be added. Right now it supports only 32 banks with hardcoded loops.
- Address bit map verification against the DRAM size parameter can be added.
- Statistical refresh support can be added.
- Row and column boundary crossing as per the address bit map can be added.