Memory / CycleAccurateDRAM
Block Name: CycleAccurateDRAM

Code File Location: VisualSim/actor/arch/Memory/HW_DRAM


Table of contents

  1. Block Description
  2. Operation
  3. State_Plot_Diagram
  4. How to connect
  5. Necessary Fields
  6. Parameter Configuration
  7. How to configure
  8. Refresh Configuration and Restriction
  9. Power Configuration
  10. Checklist before running the model
  11. Deadlock and Debugging
  12. Error handling
  13. Future Implementation

Description:

The CycleAccurateDRAM block is a standard DRAM module that emulates the memory array and the bidirectional data transfer between the sense amplifiers and the IO buffer. The DRAM and the memory controller follow the open page policy, and the memory controller sends the appropriate commands to achieve a specific set of operations. Both the memory controller and the DRAM follow the JEDEC standard.

Operation:

The DRAM module performs the read/write access and the activation sequence as per the commands obtained from the memory controller.

Activation commands from the memory controller:

  1. RCD command: DRAM opens a page in the particular bank.

  2. RP command : DRAM closes the page in the particular bank.

Read command from the memory controller:

  1. For SDR and DDRs, the DRAM initiates the Read operation and transfers the data through the IO channel. (tRL + data transfer through the channel)

    1. For SDR, the data transfer rate is the same as the internal bus. 1 word of data takes 1 cycle in the channel.

    2. For DDRs, the data transfer rate is twice that of the internal bus. 1 word of data takes 1/2 cycle in the channel.

  2. For LPDDRs, the internal transfer delay is tRL + DQSCK. The channel transfer rate is twice that of the internal bus. (1 word of data takes 1/2 cycle in the channel.)


Write command from the memory controller:

  1. For SDR and DDRs (except DDR1), the DRAM initiates the Write operation and transfers the data through the IO channel. (tWL + data transfer through the channel)

    1. For SDR, the data transfer rate is the same as the internal bus. 1 word of data takes 1 cycle in the channel.

    2. For DDRs, the data transfer rate is twice that of the internal bus. 1 word of data takes 1/2 cycle in the channel.

  2. For LPDDRs and DDR1, the internal transfer delay is tWL + DQSS. The channel transfer rate is twice that of the internal bus. (1 word of data takes 1/2 cycle in the channel.)
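The read and write delays listed above can be sketched as a small calculation. This is only an illustrative sketch under stated assumptions, not the block's implementation: the function `access_delay_ns` and its arguments are hypothetical, DDR1 is assumed to be the "DDR" entry of DRAM_Type, and the burst is assumed to move `words` words back to back.

```python
# Illustrative sketch only: the names below are hypothetical, not VisualSim API.

LPDDR_TYPES = {"LPDDR", "LPDDR2", "LPDDR3", "LPDDR4", "LPDDR5"}


def access_delay_ns(dram_type, op, latency_ns, strobe_ns, words, tck_ns):
    """Return internal delay plus channel transfer time, in ns.

    latency_ns : tRL for a read, tWL for a write
    strobe_ns  : DQSCK for a read, DQSS for a write (added only where listed)
    words      : number of words moved in the burst
    tck_ns     : one cycle of the internal bus, in ns
    """
    # SDR moves 1 word per cycle; every other type moves 1 word per half cycle.
    cycles_per_word = 1.0 if dram_type == "SDR" else 0.5

    if op == "read":
        adds_strobe = dram_type in LPDDR_TYPES
    elif op == "write":
        # For writes, DDR1 (assumed to be "DDR") is grouped with the LPDDRs.
        adds_strobe = dram_type in LPDDR_TYPES or dram_type == "DDR"
    else:
        raise ValueError("op must be 'read' or 'write'")

    internal_ns = latency_ns + (strobe_ns if adds_strobe else 0.0)
    return internal_ns + words * cycles_per_word * tck_ns


# LPDDR3 example: tRL = 15 ns, DQSCK = 2 ns, 8 words, 1.25 ns cycle.
lpddr3_read_ns = access_delay_ns("LPDDR3", "read", 15.0, 2.0, 8, 1.25)
print(lpddr3_read_ns)  # 22.0
```

The strobe term is passed in for every call but only added for the types the lists above name, which keeps the read and write rules in one place.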

Refresh command from the memory controller:

  1. The RP command for an open bank will precharge the bank and block other commands to that particular bank.

  2. The Refresh command emulates the refresh operation for the RFC time period, and the corresponding bank will be idle during that time.

State Plot Diagram:


[Figure: state plot diagram (image not available)]

      

The bottom flow, LPDDR5_Read_Write, shows the data transfer through the IO channel. Each of the remaining flows shows the state of the corresponding bank.

Each read or write data transfer on the bidirectional bus shows 2 consecutive pulses: the 1st pulse is shorter and the 2nd pulse is wider. The short pulse shows the transfer of the first word, and the wide pulse shows the transfer of the remaining words up to the last word.

The state change in each bank describes one of the following operations:

        1. RCD - (Activate) opening a page in that bank.  - Single pulse with tRCD duration (new activate in a closed bank)

        2. RP - (Precharge) closing the page in that bank   - 2 pulses, 1st pulse for tRP duration and 2nd pulse for tRCD duration (open page policy: the page is closed only if a new page needs to be opened)

        3. Refresh - Refreshing a certain number of rows in a bank  - 2 pulses, 1st pulse for tRP duration and 2nd pulse for tRFC duration (RP will occur for the banks that are open; for other banks just RFC will occur. All banks will be in the closed state after refresh).
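The pulse sequences described above can be sketched as follows. This is a hedged illustration only: `bank_pulses` and its arguments are hypothetical names, and the timing values in the usage lines are example numbers, not values read from the block.

```python
# Illustrative sketch only: hypothetical helper, not part of the block.

def bank_pulses(operation, bank_open, t_rcd_ns, t_rp_ns, t_rfc_ns):
    """Return the (label, duration_ns) pulses expected for one bank."""
    # An open bank is precharged first (tRP pulse) in both cases.
    pulses = [("RP", t_rp_ns)] if bank_open else []
    if operation == "activate":
        pulses.append(("RCD", t_rcd_ns))   # page is opened
    elif operation == "refresh":
        pulses.append(("RFC", t_rfc_ns))   # bank is closed afterwards
    else:
        raise ValueError("operation must be 'activate' or 'refresh'")
    return pulses


print(bank_pulses("activate", False, 18.0, 18.0, 130.0))  # [('RCD', 18.0)]
print(bank_pulses("activate", True, 18.0, 18.0, 130.0))   # [('RP', 18.0), ('RCD', 18.0)]
print(bank_pulses("refresh", True, 18.0, 18.0, 130.0))    # [('RP', 18.0), ('RFC', 130.0)]
```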


How to Connect HW_DRAM to Memory Controller Block

Left-Hand Side Input/Output to CycleAccurateDRAM, typically from/to MC

Right-Hand Side Input/Output to write memory values, typically not used

Note port names in diagram

Necessary Fields for the HW_DRAM

1. A_Address_Bank   - Bank Id for the requested command

2. A_Address_Row    - Row Id for the requested command

3. A_Address_Col    - Column Id for the requested command

4. A_Source         - Source device of this request

5. A_Address        - Requesting address

6. A_Bytes          - Requested bytes

7. A_Mem_ID         - Internal ID for each packet

8. A_Bytes_Total    - Total bytes requested by the source (it will be different from A_Bytes if the requested bytes differ from the burst size)

9. A_Addr_Ctrl_Flag - Internal flag to differentiate Read and Write
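As a rough illustration of the field list above, the check below verifies that a request carries every necessary field. The dictionary representation, the helper `missing_fields`, and all field values (including the source name and the flag value) are placeholders chosen for this sketch, not values defined by the block.

```python
# Illustrative sketch only: a plain dict stands in for the data structure.

REQUIRED_FIELDS = (
    "A_Address_Bank", "A_Address_Row", "A_Address_Col",
    "A_Source", "A_Address", "A_Bytes",
    "A_Mem_ID", "A_Bytes_Total", "A_Addr_Ctrl_Flag",
)


def missing_fields(request):
    """Return the necessary field names that the request does not carry."""
    return [name for name in REQUIRED_FIELDS if name not in request]


request = {
    "A_Address_Bank": 5,
    "A_Address_Row": 3,
    "A_Address_Col": 7,
    "A_Source": "Processor_1",   # placeholder source name
    "A_Address": 24637,          # placeholder address
    "A_Bytes": 32,
    "A_Mem_ID": 1,
    "A_Bytes_Total": 64,         # larger than A_Bytes: request spans two bursts
    "A_Addr_Ctrl_Flag": True,    # placeholder; actual encoding is internal
}

print(missing_fields(request))  # []
```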


Parameter Configuration:


Parameter | Explanation | Example
HW_DRAM_Name | Unique name for the CycleAccurateDRAM. | "DRAM"
DRAM_Type | Determines the type of memory. | "DDR4"  /* SDR, DDR, DDR2, DDR3, DDR4, DDR5, LPDDR, LPDDR2, LPDDR3, LPDDR4, LPDDR5 */
HW_DRAM_Speed_Mhz | Speed of the CycleAccurateDRAM in MHz (data rate). | 3200.0  /* DDR4 typical shown */
Burst_Length | Number of data transfers in a burst. | 8  /* DDR4 typical shown */
Memory_Width_Bytes | Width of the memory interface in bytes. | 1  /* DDR4 typical shown */
Memory_Controller | Name of the memory controller connected to this DRAM. | "none"  /* Default */  /* eg: "DDR_Controller" */
Address_Bit_Map | Determines the size and organization of the memory. It must be a 2D array with 4 internal array entries. Each internal array must hold the range of address pins (or specific bits): {{Column_Range}, {Row_Range}, {Bank_Range}, {Rank_Range}}. | {{4,13},{14,29},{0,3},{30}}  /* col, row, bank, rank (min, max) Bit Position */
      This example configures the memory as a single rank 8GB package with 4 bank groups, 4 banks per bank group, and a 1KB page size.
Mfg_Suggest_Timing | The recommended timing for the DRAM as an array of six double values (in ns), in this order: tCL, tRCD, tRP, tRAS, tAL, tCWL. | {13.75, 13.75, 13.75, 32, 0, 12.5}  /* tCL, tRCD, tRP, tRAS, tAL, tCWL in ns */  /* DDR4 typical shown */
  • tCL = CAS latency time (read).
  • tRCD = DRAM RAS# to CAS# delay.
  • tRP = DRAM RAS# precharge.
  • tRAS = Active to precharge delay.
  • tAL = Additive latency in read and write.
  • tCWL = CAS latency for write.
Extra_Timing | Extra timing for the memory controller: an array of 9 added timing values (DQSS, tWTR, tCCD, tRRD, tWR, tDQSCK, tRTP, tHWpre, tFAW) that further refine memory timing. These are additional parameters used by vendors to describe other intermediate latencies. The format is an array of double values in ns. | {0, 7.5, 0, 0, 15, 0, 7.5, 0, 21}  /* DQSS, tWTR, tCCD, tRRD, tWR, tDQSCK, tRTP, tHWpre, tFAW in ns */  /* DDR4 typical shown */
  • DQSS is the time for the bidirectional bus strobe.
  • tWTR is the minimum time interval between the end of a WRITE and the start of a READ command.
  • tCCD is the minimum time interval between two sequential reads/writes.
  • tRRD is the minimum time interval between successive ACTIVE commands to different banks.
  • tWR is the minimum time interval between the end of a WRITE and a PRECHARGE command.
  • tDQSCK is the data strobe (DQS) output access time from the clock.
  • tRTP is the READ to PRECHARGE timing.
  • tHWpre can be used to add cycles to the tRAS, tRTP, and tWR parameters; the default is 0.
  • tFAW is the time window in which only 4 activates can be issued. A 5th activate must be issued in the next FAW window.
Retention_Time | Refresh window for the entire memory to be refreshed once. | 64.0E-3  /* LPDDR2 and DDR3 typical shown */
Fine_Granularity_Refresh_Time | Refresh time (tRFC) for each refresh command. | 166.0E-09
Fine_Granularity_Refresh | Fine tunes the refresh interval. Only applies to DDR4 and DDR5; for other DRAM types it is FGR_1x by default. | FGR_1x  /* None, FGR_1x, FGR_2x, FGR_4x, FGR_8x, FGR_16x */
REFpb_T_REFab_F | Selects per bank or all bank refresh. "true" enables per bank refresh and "false" enables all bank refresh. | true
Same_Bank_Refresh | Enables same bank refresh (only used in DDR5). | false
Refresh_Statistical | Enables or disables statistical refresh (a stochastic version of refresh). | false
Power_Manager_Name | Name of the power manager. "none" disables power analysis; a valid PowerTable name enables it. | "none"  /* Default */  /* eg: "Manager_1" */
Debug | If true, an activity trace is output on port. If false, no output is received. The trace gives the user information on the actions within the block. | true  /* Debug data */
State_Plot_Enable | Enables or disables the timing diagram of this DRAM. | false
Enable_External_Data | If true, the data structure is sent out on port_3, and the DRAM does not continue with that data structure until it receives it back on port_4. If false, port_3 is bypassed. | false  /* No output on port_3 */
Standard_Name | Memory standard name used to select the set of timings from the standard file. Only used if the standard timings are configured through a file. | "none"  /* eg: DDR_Memory_Standards.txt */
Standard_File | Another way to configure the timing values, using a text file. The text file must contain the timings along with the memory standard name. | Option to browse and select the file.
Architecture_Name | Unique architecture name, typically "Architecture_1". | "Architecture_1"

      

Port | Explanation
port_1 | Input from Memory Controller.
port_2 | Output to Memory Controller.
port_3 | Connected to external logic for making a read request or writing data to an address location.
port_4 | Connected to external logic for reading data from an address location.
fm_ctrl | Control signals, tRAS, tRP from the Memory Controller.
to_ctrl | Control signals, tRAS, tRP, tRCD to the Memory Controller.

  

How to Configure specific DRAM type:


   The user has to extract the following timing values from the LPDDR3 datasheet.
   The timing values must be chosen according to the data rate.
   
    This configuration uses a 1600 data rate and a single channel 4 GB LPDDR3.
 
    DRAM Configuration:
    HW_DRAM_Speed_MHz : 1600.0   (1.25e-9 clock time)
    Burst_Length                     : 8            (8n prefetch)
    Memory_Width_Bytes      : 4            (x32 device)
    Mfg_Suggest_Timing         : {15, 18, 18, 42, 0,  7.5} /* tCL, tRCD, tRP, tRAS, tAL, tCWL in ns*/

                                                  CL or RL       = 12 clocks           = 15ns

                                                  RCD               = max(18ns,3clk) = 18ns

                                                  RP                  = max(18ns,3clk) = 18ns (per bank precharge)

                                                  RAS                = max(42ns,3clk) = 42ns (RAS_min)

                                                  AL                  = 0                                    (choosing 0 additive latency, users choice)

                                                  CWL or WL   = 6 clocks             = 7.5ns


    Extra_Timing                     :  {1.25, 7.5, 5, 10, 15, 2, 7.5, 0, 50} /* DQSS, tWTR, tCCD, tRRD, tWR, tDQSCK, tRTP, tHWpre, tFAW  in ns*/  

                                                   DQSS min      = 0.75 clock

                                                   DQSS max     = 1.25 clock

                                                   Use 1 clock for all LPDDRs to maintain the same functionality, so DQSS = 1.25ns

                                                   WTR              = max(7.5ns, 4clk)  = 7.5ns

                                                   CCD               = 4 clocks               = 5ns

                                                   RRD               = max(10ns, 2clk)   = 10ns

                                                   WR                 = max(15ns, 3clk)   = 15ns

                                                   DQSCK          = 2ns (DQSCK min)  (applicable only for LPDDRs, for DDRs it must be 0)

                                                   RTP                = max(7.5ns, 4clk)  = 7.5ns

                                                   HWpre            = 0   (not used in the current version)

                                                   FAW               = max(50ns,8clk)    = 50ns
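The "max(ns, clocks)" entries above can be reproduced with a short calculation. This is a minimal sketch assuming a 1.25 ns clock for the 1600 data rate; the helper name `timing_ns` is hypothetical and the inputs are the values listed in this example.

```python
# Illustrative sketch only: reproduces the LPDDR3 arithmetic shown above.

TCK_NS = 1.25  # clock time for the 1600 data rate example


def timing_ns(min_ns, min_clocks, tck_ns=TCK_NS):
    """Return the larger of the ns limit and the clock-count limit, in ns."""
    return max(min_ns, min_clocks * tck_ns)


# Values given purely as clock counts are converted directly.
t_cl = 12 * TCK_NS    # 15.0
t_cwl = 6 * TCK_NS    # 7.5
t_ccd = 4 * TCK_NS    # 5.0

# Values given as max(ns, clocks).
extra = {
    "tWTR": timing_ns(7.5, 4),
    "tRRD": timing_ns(10.0, 2),
    "tWR": timing_ns(15.0, 3),
    "tRTP": timing_ns(7.5, 4),
    "tFAW": timing_ns(50.0, 8),
}
print(extra)  # {'tWTR': 7.5, 'tRRD': 10.0, 'tWR': 15.0, 'tRTP': 7.5, 'tFAW': 50.0}
```

At a slower clock the clock-count limit can win, so the comparison should be repeated whenever the data rate changes.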


    Address_Bit_Map              :  {{3,12},{13,26},{0,2},{30}}  /* col, row, bank, rank (min, max) Bit Position */

                                                   8 banks         - 3 bits to address it.

                                                   1K Columns - 10 bits to address it

                                                   16K Rows    - 14 bits to address it


                                                    Bank interleaved addressing format

                                                    |    0 to 2     |      3 to 12     |    13 to 26     |        30       |

                                                    |   8 banks   | 1K Columns |   16K Rows  | single rank |(31st address bit must be 0 all the time )
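To make the bank interleaved format concrete, the sketch below splits an address into its column, row, bank, and rank fields using the bit ranges of this example. It is only an illustration: `decode_address` is a hypothetical helper, and a single-entry range such as {30} is assumed to mean a one-bit field at that position.

```python
# Illustrative sketch only: hypothetical decoder for the example bit map.

LPDDR3_BIT_MAP = [[3, 12], [13, 26], [0, 2], [30]]  # col, row, bank, rank


def decode_address(address, bit_map):
    """Split an address into col/row/bank/rank using (min, max) bit positions."""
    fields = {}
    for name, bits in zip(("col", "row", "bank", "rank"), bit_map):
        low, high = bits[0], bits[-1]      # [30] is treated as a 1-bit field
        width = high - low + 1
        fields[name] = (address >> low) & ((1 << width) - 1)
    return fields


# Build an address for bank 5, column 7, row 3 and decode it again.
address = (3 << 13) | (7 << 3) | 5
print(decode_address(address, LPDDR3_BIT_MAP))
# {'col': 7, 'row': 3, 'bank': 5, 'rank': 0}
```

Because the bank bits are the lowest bits, consecutive addresses rotate through the banks before the column changes.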


    Fine_Granularity_Refresh_Time  : 130.0e-9
                                            RFC     = 130ns for 4GB
    Retention_Time                            : 32ms
                                            REFW = 32ms for 4GB




    This configuration uses a 2400 data rate and a single channel 8 GB DDR4.
    (1 clock preamble)

    DRAM Configuration:
    HW_DRAM_Speed_MHz : 2400.0   (0.833e-9 clock time)
    Burst_Length                     : 8            (8n prefetch)
    Memory_Width_Bytes      : 1            (x8 device)
    Mfg_Suggest_Timing         : {13.75, 13.75, 13.75, 32, 0, 13.3} /* tCL, tRCD, tRP, tRAS, tAL, tCWL in ns*/

                                                  CL - RCD - RP = 17 - 17 - 17

                                                  CL or AA       = 17 clocks           = 13.75ns

                                                  RCD               = 17 clocks           = 13.75ns

                                                  RP                  = 17 clocks           = 13.75ns

                                                  RAS                = 39 clocks          = 32ns (RAS_min)

                                                  AL                  = 0                       (choosing 0 additive latency, users choice)

                                                  CWL or WL   = 16 clocks          = 13.3ns


    Extra_Timing                     :  {0, 0, 0, 0, 15, 0, 7.5, 0, 21} /* DQSS, tWTR, tCCD, tRRD, tWR, tDQSCK, tRTP, tHWpre, tFAW  in ns*/  

                                                   DQSS min      = 0.75 clock

                                                   DQSS max     = 1.25 clock

                                                   Use 0 clocks for all DDRs to maintain the same functionality, so DQSS = 0ns

                                                   WTR              = 0

                                                   CCD               = 0

                                                   RRD               = 0

                                                   WR                 = 15ns

                                                   DQSCK          = 0

                                                   RTP                = max(7.5ns, 4clk)  = 7.5ns

                                                   HWpre            = 0   (not used in the current version)

                                                   FAW               = max(21ns,20clk)    = 21ns


    Address_Bit_Map              :  {{4,13},{14,29},{0,3},{30}}  /* col, row, bank, rank (min, max) Bit Position */

                                                   16 banks      - 4 bits to address it. (4 Bank groups and 4 banks  per bank group)

                                                   1K Columns - 10 bits to address it

                                                   64K Rows    - 16 bits to address it


                                                    Bank interleaved addressing format

                                                    |      0 to 3     |      4 to 13     |    14 to 29     |        30       |

                                                    |   16 banks   | 1K Columns |   64K Rows  | single rank |(31st address bit must be 0 all the time )
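The bank, column, and row counts follow directly from the width of each bit range. The sketch below derives them for the DDR4 bit map of this example; `field_sizes` is a hypothetical helper, and the rank entry is left out because the rank bit is held at 0 for a single rank.

```python
# Illustrative sketch only: derives the organization from the bit ranges.

DDR4_BIT_MAP = [[4, 13], [14, 29], [0, 3], [30]]  # col, row, bank, rank


def field_sizes(bit_map):
    """Return the number of columns, rows, and banks addressed by the map."""
    sizes = {}
    # zip stops after the three named fields, so the rank entry is skipped.
    for name, (low, high) in zip(("columns", "rows", "banks"), bit_map):
        sizes[name] = 2 ** (high - low + 1)
    return sizes


sizes = field_sizes(DDR4_BIT_MAP)
print(sizes)  # {'columns': 1024, 'rows': 65536, 'banks': 16}

# Total addressable locations in one rank.
locations = sizes["columns"] * sizes["rows"] * sizes["banks"]
print(locations)  # 1073741824
```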

                                                   

    Fine_Granularity_Refresh_Time  : 160.0e-9
                                           RFC     = 160ns for 8GB
    Retention_Time                            : 64ms

Refresh Configuration and Restriction:


   Basic configuration:
        1. Retention_Time - Refresh window for the entire DRAM to be refreshed once. It is typically configured as either 64ms or 32ms according to the DDR type.
        2. Fine_Granularity_Refresh_Time - tRFC time; it defines the time taken to complete a refresh command. It varies with the device size in each DRAM type.
        3. Refresh_Statistical - setting it to true will disable the Refresh operation in DRAM.

    Refresh Type:
        1. REFpb_T_REFab_F       - true will set per bank refresh and false will set all bank refresh.
        2. Same_Bank_Refresh     - true will set same bank refresh. (The same bank in each bank group will be refreshed at the same time.)
   
    Restrictions:
        1. DDRs only support all bank refresh. If per bank refresh is set, it will be ignored.
        2. LPDDRs only support per bank refresh. If all bank refresh is set, it will be ignored. (All bank support will be added in the future.)
        3. DDR5 can support both all bank and same bank refresh. If same bank refresh is set, the REFpb_T_REFab_F parameter will be ignored.
        4. FGR rate: all DDRs and LPDDRs except DDR4 and DDR5 use FGR_1x by default; setting a rate beyond that has no effect.
        5. DDR4 supports only 1x, 2x and 4x. Setting a rate beyond that will cause an error.
        6. DDR5 supports only 1x and 2x. Setting a rate beyond that will cause an error.
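The restrictions above can be summarized as a small decision function. This is a sketch under assumptions, not the block's logic: `resolve_refresh` is a hypothetical name, every non-LPDDR type (including SDR) is assumed to behave like the DDRs, and the "None" setting of Fine_Granularity_Refresh is not modeled.

```python
# Illustrative sketch only: mirrors the listed restrictions.

MAX_FGR_FACTOR = {"DDR4": 4, "DDR5": 2}


def resolve_refresh(dram_type, per_bank, same_bank, fgr):
    """Return (refresh_mode, effective_fgr) for the requested settings.

    per_bank is accepted for completeness; under the listed restrictions it
    never changes the outcome, because each type supports only one of the
    per bank / all bank choices.
    """
    if dram_type.startswith("LPDDR"):
        mode = "per_bank"        # an all bank request is ignored
    elif dram_type == "DDR5" and same_bank:
        mode = "same_bank"       # the per bank / all bank selector is ignored
    else:
        mode = "all_bank"        # a per bank request is ignored

    factor = int(fgr[len("FGR_"):-1])  # "FGR_4x" -> 4
    if dram_type not in MAX_FGR_FACTOR:
        return mode, "FGR_1x"    # other types always run at 1x
    if factor > MAX_FGR_FACTOR[dram_type]:
        raise ValueError(fgr + " is not supported by " + dram_type)
    return mode, fgr


print(resolve_refresh("LPDDR4", False, False, "FGR_4x"))  # ('per_bank', 'FGR_1x')
print(resolve_refresh("DDR4", True, False, "FGR_2x"))     # ('all_bank', 'FGR_2x')
print(resolve_refresh("DDR5", True, True, "FGR_1x"))      # ('same_bank', 'FGR_1x')
```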

  

Power Configuration:

    DRAM requires the following entries in the power table for power analysis. (Missing any of these entries will cause an error):
                                                                  Format
             1. ACT Standby         -   "ACT_Standby_"+DRAM_Name
             2. ACT Active          -   "ACT_Active_"+DRAM_Name
             3. Read Power          -   "Read_Power_"+DRAM_Name
             4. Write Power         -   "Write_Power_"+DRAM_Name
             5. Refresh Power       -   "RFSH_Power_"+Bank_ID+"_"+DRAM_Name    (Refresh power must be entered for each bank, else an error will occur)

    Example entries:
                       Architecture_Block             Standby  Active  Wait  Idle  Existing  OffState  OnState    t_OnOff        Mhz       Volts   ;
                1. ACT_Standby_DRAM                  37.6     0.1    22.4   0.0   Standby   Standby   Active     0.0          1000.0     1.0     ; 
                2. ACT_Active_DRAM                     37.6     0.1    22.4   0.0   Standby   Standby   Active     0.0          1000.0     1.0     ;    
                3. Write_Power_DRAM                    0.0      216.2   0.0   0.0   Standby   Standby   Active     0.0          1000.0     1.0     ;
                4. Read_Power_DRAM                     0.0      392.2   0.0   0.0   Standby   Standby   Active     0.0          1000.0     1.0     ;
                5. RFSH_Power_0_DRAM                11.8      12.0   0.0   0.0   Standby   Standby   Active     0.0          1000.0     1.0     ; 
                6. RFSH_Power_1_DRAM                11.8      12.0   0.0   0.0   Standby   Standby   Active     0.0          1000.0     1.0     ; 
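The naming format above can be used to check a power table before a run. The sketch below is an illustration only: both helpers are hypothetical, bank IDs are assumed to start at 0 as in the example entries, and the table is represented simply as a list of entry names.

```python
# Illustrative sketch only: builds the expected power table entry names.

def required_power_entries(dram_name, bank_count):
    """Return every entry name the DRAM needs in the power table."""
    names = [
        "ACT_Standby_" + dram_name,
        "ACT_Active_" + dram_name,
        "Read_Power_" + dram_name,
        "Write_Power_" + dram_name,
    ]
    # One refresh entry per bank, numbered from 0.
    names += ["RFSH_Power_%d_%s" % (bank, dram_name) for bank in range(bank_count)]
    return names


def missing_power_entries(table_names, dram_name, bank_count):
    """Return the required names that are absent from the power table."""
    present = set(table_names)
    return [n for n in required_power_entries(dram_name, bank_count) if n not in present]


table = ["ACT_Standby_DRAM", "ACT_Active_DRAM", "Write_Power_DRAM",
         "Read_Power_DRAM", "RFSH_Power_0_DRAM", "RFSH_Power_1_DRAM"]
print(missing_power_entries(table, "DRAM", 2))  # []
print(missing_power_entries(table, "DRAM", 4))  # ['RFSH_Power_2_DRAM', 'RFSH_Power_3_DRAM']
```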

    Note for Developer:
            
    The refresh power supports DRAMs with up to 32 banks only. If the DRAM is configured beyond that, the Refresh loops must be replicated to the required number of banks.

Checklist before running the model:

  1. The DRAM name and memory controller name are configured correctly in both blocks.
  2. DRAM type must be the same in the memory controller and the DRAM.
  3. Make sure the DRAM name is correctly configured with the appropriate source device.
  4. Make sure the timing parameters are configured as per the datasheet from the vendor. The default values are for DDR4, obtained from a Micron datasheet.
  5. Memory_Width_Bytes must be the same in the memory controller and the DRAM.
  6. The Address_Bit_Map parameter must match the Memory_Column, Memory_Row and Memory_Rank in the memory controller.
  7. Mfg_Suggest_Timing and Extra_Timing must be the same in the memory controller and the DRAM.
  8. Retention_Time must be within the range of 1us to 1sec. Please configure the value as per the datasheet. The typical value will be in ms.
  9. Fine_Granularity_Refresh_Time must be within the range of 50ns to 1sec. Please configure the value as per the datasheet. The typical value will be in ns.
  10. To enable power analysis, make sure the Power_Manager_Name is a valid name of a Power Table. The Power Table must have the required DRAM entries.
  11. The DRAM can be connected to the bus interface through the memory controller. Please make sure the memory controller is configured correctly with the DRAM.
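Part of this checklist can be automated before a run. The sketch below compares the shared parameters and applies the two range limits; it is only an illustration. The helper `check_configuration` is hypothetical, the configurations are plain dictionaries, and the memory controller is assumed to use the same parameter names as the DRAM.

```python
# Illustrative sketch only: a pre-run check of a few checklist items.

SHARED_PARAMETERS = ("DRAM_Type", "Memory_Width_Bytes",
                     "Mfg_Suggest_Timing", "Extra_Timing")


def check_configuration(dram, controller):
    """Return a list of checklist problems (empty when none are found)."""
    problems = []
    for name in SHARED_PARAMETERS:
        if dram.get(name) != controller.get(name):
            problems.append(name + " differs between DRAM and memory controller")
    if not 1.0e-6 <= dram["Retention_Time"] <= 1.0:
        problems.append("Retention_Time is outside 1us to 1sec")
    if not 50.0e-9 <= dram["Fine_Granularity_Refresh_Time"] <= 1.0:
        problems.append("Fine_Granularity_Refresh_Time is outside 50ns to 1sec")
    return problems


dram = {
    "DRAM_Type": "DDR4",
    "Memory_Width_Bytes": 1,
    "Mfg_Suggest_Timing": [13.75, 13.75, 13.75, 32, 0, 13.3],
    "Extra_Timing": [0, 0, 0, 0, 15, 0, 7.5, 0, 21],
    "Retention_Time": 64.0e-3,
    "Fine_Granularity_Refresh_Time": 160.0e-9,
}
controller = {name: dram[name] for name in SHARED_PARAMETERS}

print(check_configuration(dram, controller))  # []
```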

Deadlock and debugging:

             Deadlock situations:  
    1. If Command_Buffer_Length is less than or equal to the total number of fragments of the input request, a deadlock can occur.
      1. Fragment size = Burst_Length * Memory_Width_Bytes. If the input request size is greater than this size, the request will be fragmented.
    2. If TResolution is set to 1.0E-12 in the Digital Simulator, it can cause a deadlock. Please reduce the TResolution to 1.0E-13.
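The first deadlock condition can be checked with simple arithmetic. The sketch below is a minimal illustration; both helper names are hypothetical, and the request size is assumed to be rounded up to a whole number of fragments.

```python
# Illustrative sketch only: checks the Command_Buffer_Length condition.

def fragment_count(request_bytes, burst_length, memory_width_bytes):
    """Return how many fragments a request is split into."""
    fragment_size = burst_length * memory_width_bytes
    # Integer ceiling division: a partial fragment still occupies one slot.
    return (request_bytes + fragment_size - 1) // fragment_size


def may_deadlock(command_buffer_length, request_bytes, burst_length, memory_width_bytes):
    """True when the buffer is not larger than the number of fragments."""
    fragments = fragment_count(request_bytes, burst_length, memory_width_bytes)
    return command_buffer_length <= fragments


# 64-byte request, burst length 8, x8 device: 8-byte fragments, 8 fragments.
print(fragment_count(64, 8, 1))      # 8
print(may_deadlock(8, 64, 8, 1))     # True
print(may_deadlock(16, 64, 8, 1))    # False
```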
       
        Debugging methods:
  1. Enable the Debug parameter in the memory controller. Connect a text display to save or view the debug messages from the memory controller.
    1. Look for the last command issued to the DRAM other than refresh messages.
  2. Connect a text display to ArchitectureSetup to observe statistics.
    1. Look for Total Requests and Completed_Requests.
  3. Enable State_Plot_Enable in the DRAM to observe the state of each bank in the DRAM.
  4. Please make sure the ports are connected correctly (output to input).


Error Handling:

           1. VisualSim.kernel.util.IllegalActionException:
Block        :  Test_Run.CycleAccurateDRAM.State_Machine
Error        :  Issue with RegEx execution
Exception    :  VisualSim.kernel.util.IllegalActionException: Error invoking function public static java.lang.String VisualSim.data.expr.UtilityFunctions.throwMyException(java.lang.String) throws VisualSim.kernel.util.IllegalActionException

Because:
User RegEx Exception:
LPDDR5_State_Machine cannot locate Memory_Controller: M_LPDDR5_Controller, suggest edit.
        Please make sure the memory controller name is configured correctly in the DRAM block.
      
          2. VisualSim.kernel.util.IllegalActionException:
Block        :  Test_Run.CycleAccurateDRAM.State_Machine
Error        :  Issue with RegEx execution
Exception    :  VisualSim.kernel.util.IllegalActionException: Error invoking function public static java.lang.String VisualSim.data.expr.UtilityFunctions.throwMyException(java.lang.String) throws VisualSim.kernel.util.IllegalActionException

Because:
User RegEx Exception:
Row First memory bit should contain integer or array, found: {12.0, 24.0}
        Please make sure the Address_Bit_Map parameter is an array of integers, other data types are not supported.
   
       3. VisualSim.kernel.util.IllegalActionException:
Block        :  Test_Run.CycleAccurateDRAM.HW_DRAM
Error        :  Issue with RegEx execution
Exception    :  VisualSim.kernel.util.IllegalActionException: add method not supported between VisualSim.data.StringToken '"Parameter 'Burst_Length' contains Unsupported data type value ="' and VisualSim.data.ArrayToken '{16}' because the types are incomparable.
  in .Test_Run.CycleAccurateDRAM.HW_DRAM      
        Please make sure the Burst_Length parameter is an integer; other data types are not supported.
        
        4. VisualSim.kernel.util.IllegalActionException:
Block        :  Test_Run.CycleAccurateDRAM.State_Machine
Error        :  Issue with RegEx execution
Exception    :  VisualSim.kernel.util.IllegalActionException: Error invoking function public static java.lang.String VisualSim.data.expr.UtilityFunctions.throwMyException(java.lang.String) throws VisualSim.kernel.util.IllegalActionException

Because:
User RegEx Exception:
LPDDR5_State_Machine Retention_Time (32.0) is negative, less than 1 usec, or greater than 1.0, suggest edit.
  in .Test_Run.CycleAccurateDRAM.State_Machine
        Please make sure the Retention Time parameter is a double and within the range of 1us to 1.0 sec, out of range values will not be supported. Typical value is 64.0E-3.

        5. VisualSim.kernel.util.IllegalActionException:
Block        :  Test_Run.CycleAccurateDRAM.State_Machine
Error        :  Issue with RegEx execution
Exception    :  VisualSim.kernel.util.IllegalActionException: Error invoking function public static java.lang.String VisualSim.data.expr.UtilityFunctions.throwMyException(java.lang.String) throws VisualSim.kernel.util.IllegalActionException

Because:
User RegEx Exception:
LPDDR5_State_Machine Fine_Granularity_Refresh_Time (2.0E-8) is negative, less than 50 ns, or greater than 1.0, suggest edit.
  in .Test_Run.CycleAccurateDRAM.State_Machine
        Please make sure the Fine Granularity Refresh Time parameter is a double and within the range of 50ns to 1.0 sec, out of range values will not be supported. Typical value is 160.0E-9.
         
    6.
VisualSim.kernel.util.IllegalActionException:
Block            :  Test_Run.CycleAccurateDRAM.HW_DRAM
Line             :  257
Error_Number     :  Script_001
Explanation      :  Set_Block_Reference            = addBlockReference(Architecture_Name, Block_Name), Check argument types, argument values, field names, and variables.
Exception        :  VisualSim.kernel.util.IllegalActionException: Error invoking function public static VisualSim.data.BooleanToken VisualSim.data.expr.UtilityFunctions.addBlockReference(VisualSim.data.StringToken,VisualSim.data.StringToken) throws VisualSim.kernel.util.IllegalActionException

Because:
Cannot read Memory Reference: Architectue_1
Source Reference: LPDDR5

        Please make sure the ArchitectureSetup name is configured correctly.

Future Implementations:

  1. Multi Rank Support: Timing constraints for multi rank activate and access must be added.
  2. LPDDR4 dual cycle issue: 2 cycles for issuing all the commands except precharge must be added.
  3. All bank refresh support for LPDDRs must be added.
  4. FAW constraint for activates following an all bank refresh must be added.
  5. Postponing and advancing of refresh must be added.
  6. Generalization of the refresh code to support N banks must be added. Right now it supports only 32 banks with hardcoded loops.
  7. Address bit map verification against the DRAM Size parameter can be added.
  8. Statistical Refresh support can be added.
  9. Row and Column boundary crossing as per the address bit map can be added.