Introduction

Architecture-accurate processor model

Quick Explanation

  • Generator for RISC, CISC and microcontroller processors
  • Spreadsheet-based
  • Defines timing, power and functionality
  • Multiple stage pipeline
  • Multi-level cache and memory hierarchy
  • Supports multiple interfaces
  • DMA support
  • Changing the clock speed when a specific operation or the stage of the pipeline needs to be expanded
  • Using a multi-cycle delay for the flush from annotated C code
  • Preemption Enabled - adding preemption to the Processor

Protocol

  • Supports commercial, research and future generation design
  • Multi-Thread Processor - shows the definition of multi-threaded processor
  • Multi-Core Processor - shows the use of the Processor block for defining multi-core
  • Multi-Processor Shared Cache - create a multi-processor system with all processors sharing a single cache structure
  • Co-Processor Model - adding a co-processor that operates off the main pipeline
  • SIMD Processor Model - Single Instruction, Multiple Data
  • MIMD Processor Model - Multiple Instruction, Multiple Data
  • Processor External Definition Model - used when a specific operation or the stage of the pipeline needs to be expanded.
  • Processor to External Scheduler Model - similar to above but using the Scheduler block
  • Static Checker - checks the correctness of the optional parameters

Architecture-Accurate Processor Model

The VisualSim Processor Generator is a library that contains the generators and a large set of pre-defined components. The VisualSim Artificial Intelligence (AI)-driven Processor Generator is used for performance analysis and architecture exploration of System-on-Chip (SoC) and embedded systems. The generated model is pipeline-accurate and has port integration with standard buses and memories. This processor model is used to compare different processor families, optimize the specification and identify system bottlenecks. The AI Processor Generator currently supports microcontrollers, microprocessors, DSPs and GPUs. The breadth of processors ranges from 8-bit to 128-bit and from zero to 4-level caches.

Selecting the right processor, configuring multiple cores and establishing the right topology is very challenging for complex systems. Each alternative has drawbacks: acquiring boards and loading software on each processor instance is expensive; emulators, RTL and cycle-accurate models take a long time to simulate and are not easily available; virtual prototypes do not provide timing accuracy; and analytical models cannot handle complex traffic patterns. AI technology has evolved to enable this library to take a spreadsheet input and generate a processor model that is fast, accurate and visual.

Mirabilis Design has used Artificial Intelligence to identify patterns in over 100 processors. Using these patterns, a unique input spreadsheet was created for the VisualSim AI Processor Generator. From this input and the learning algorithm database loaded into the generator, models of existing and future processors are generated. Data for the input spreadsheet is available in the vendor datasheet. The generated model supports variable processor pipelines, SIMD/MIMD, multi-threading, multi-level cache hierarchy, coherency, heterogeneous execution units, buffers and bus interfaces. The generated model has over 150 statistics for cache hit-ratio, stalls and utilization. The processor has probes to trace the pipeline execution sequence, prefetch requests, interrupts and preemption.

<!–VisualSim Hardware Core Architecture library contains generators for processors, memories, caches and software threads to use in models of distributed systems and System-on-Chip (SoC). These components are modeled for timing, throughput and power accuracy. They provide extensive visibility into their internal operations, allowing the designer to understand possible bottlenecks and identify areas of improvement.

Key Capabilities

  1. Contains models of processors, caches, memories and software tasks
  2. Blocks have timing, logic and power state information
  3. Generates a large set of pre-built statistics
  4. Defined and extended using parameters and extension ports
  5. Parameters are populated with information that is normally available in the datasheet
  6. Provides a high level of timing, throughput and power accuracy
  7. No programming is required to define the components

Technology supported

  1. Generates embedded processors, advanced VLIW general-purpose processors, DSPs, application-specific processors, network processors and microcontrollers
  2. Memory controller and array for all types of DDR, DDR2, DDR3, SDR, SRAM and NAND
  3. Hierarchical cache with concurrent operations
  4. Most software task profiles supported for instruction sequence generation

Analysis

  1. Architecture studies for system sizing, parameter tuning and optimizing cache/memory hierarchy.
  2. Study cache strategies, branch prediction algorithm, off-load engines, software allocation schemes and processor selection.
  3. Evaluate scalability and the feasibility of incorporating new applications on existing systems.

Processor technology

  1. Single- and multi-core processors
  2. Multi-processor and multi-board systems
  3. Single Instruction Multiple Data (SIMD) and Multiple Instruction Multiple Data (MIMD)
  4. VLIW, CISC and RISC
  5. Shared cache and distributed cache processing
  6. Cache-less architectures

The processor defines the pipeline; resources such as queues, caches and interfaces; the execution units; the instructions associated with each execution unit; and the widths. The processor handles advanced hazard modeling, buffered writes, co-processor calls, pipeline flushes, custom branch algorithms, pipeline stalls, context-switching, interrupt instructions and load-store operations. The processor model executes a sequence of instructions that can be sourced from an actual execution trace or synthetically generated using the included Software Generator module. Over 100 statistics are generated, including resource utilization, task latency, pipeline stalls, IOs per second, throughput, hit-miss ratio and active threads. The cache handles request queuing, access latency, hit-miss evaluation, prefetch, Read/Write data response and miss activity to the next level of memory. The cache can extend the processor cache or can be used standalone with a trace or generated distribution.
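
To make the hit-miss evaluation concrete, here is a minimal Python sketch of a direct-mapped cache that counts hits and misses over an address trace. It is an illustration only and not the VisualSim cache block: the class name `DirectMappedCache`, the line count, the line size and the trace are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DirectMappedCache:
    """Toy direct-mapped cache; sizes are illustrative assumptions."""
    num_lines: int = 4
    line_bytes: int = 16
    tags: dict = field(default_factory=dict)  # line index -> stored tag
    hits: int = 0
    misses: int = 0

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, record the new tag for the line."""
        block = address // self.line_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags.get(index) == tag:
            self.hits += 1
            return True
        # Miss: in a real hierarchy this would go to the next memory level.
        self.misses += 1
        self.tags[index] = tag
        return False

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = DirectMappedCache()
trace = [0, 4, 8, 64, 0, 16, 20]  # hypothetical byte addresses
results = [cache.access(address) for address in trace]
print(results, cache.hit_ratio())
```

Addresses 0 and 64 map to the same line in this configuration, so the second access to address 0 is a conflict miss even though the block was loaded earlier.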

The memory block combines a memory controller and the memory array. The memory controller has a request queue and can model the impact of sequential and random requests. This block handles prefetch requests, reads, writes, refreshes and erases. The block can be connected to a bus, processor or another hardware device.
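
The difference between sequential and random requests can be sketched with a simple open-row model: a request to the row that is already open is cheaper than one that must open a new row. The Python below is a rough illustration under assumed values, not the VisualSim memory block; `ROW_BYTES`, `ROW_HIT_CYCLES`, `ROW_MISS_CYCLES` and `service_requests` are hypothetical names and numbers.

```python
from collections import deque

ROW_BYTES = 1024       # assumed row (page) size
ROW_HIT_CYCLES = 4     # assumed latency when the row is already open
ROW_MISS_CYCLES = 12   # assumed latency when a new row must be opened


def service_requests(addresses):
    """Drain a FIFO request queue and return the total cycles spent."""
    queue = deque(addresses)
    open_row = None
    total_cycles = 0
    while queue:
        address = queue.popleft()
        row = address // ROW_BYTES
        total_cycles += ROW_HIT_CYCLES if row == open_row else ROW_MISS_CYCLES
        open_row = row
    return total_cycles


sequential = [i * 64 for i in range(8)]    # all requests fall in row 0
scattered = [i * 4096 for i in range(8)]   # every request opens a new row
print(service_requests(sequential), service_requests(scattered))
```

With these assumed latencies the sequential stream pays for one row open followed by seven row hits, while the scattered stream pays the row-open cost on every request.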

The Software Generator block generates a sequence of instructions, based on the profile of a software task, to execute on the processor. The instruction list can be specific to the target processor or generic. The mix of instructions can be distinguished by integer, floating-point, branch, load/store, logic, mathematical and other special operations. The duration of each task can also be specified.
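
As a rough illustration of generating an instruction sequence from a task profile, the Python sketch below draws instruction classes at random according to a weighted mix. The profile values, the class names and the `generate_instructions` helper are assumptions for this example and do not reflect the Software Generator's actual input format.

```python
import random

# Hypothetical task profile: fraction of each instruction class.
PROFILE = {
    "integer": 0.40,
    "floating_point": 0.15,
    "branch": 0.15,
    "load_store": 0.30,
}


def generate_instructions(profile, count, seed=0):
    """Return `count` instruction classes drawn according to the profile mix."""
    rng = random.Random(seed)  # seeded so the sequence is reproducible
    classes = list(profile)
    weights = [profile[name] for name in classes]
    return rng.choices(classes, weights=weights, k=count)


sequence = generate_instructions(PROFILE, 1000)
mix = {name: sequence.count(name) / len(sequence) for name in PROFILE}
print(mix)
```

Over a long enough sequence the observed mix approaches the profile fractions; a fixed seed keeps runs comparable when other parameters are varied.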

A number of execution activity views are provided. The Pipeline view provides the exact instruction-by-instruction execution sequence and shows the stalls by stage and cycle, the parallel execution, and the sequence of flow for misses and prefetches. The textual view shows the step-by-step operation from data arrival to internal logic operation to delays and output actions.
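
The kind of information a per-stage, per-cycle pipeline view conveys can be sketched with a small in-order pipeline schedule. The Python below is a simplified illustration, not the VisualSim Pipeline view: it assumes an unbuffered in-order pipeline in which an instruction holds its stage until the next stage is free, and the stage latencies and the `schedule` helper are hypothetical.

```python
def schedule(latencies):
    """Compute the cycle at which each instruction enters each stage.

    latencies[i][s] is the number of cycles instruction i needs in stage s.
    An instruction enters stage s once it has finished stage s-1 and the
    previous instruction has vacated stage s.
    """
    n_stages = len(latencies[0])
    enter = []
    for i, lat in enumerate(latencies):
        row = []
        for s in range(n_stages):
            ready = row[s - 1] + lat[s - 1] if s > 0 else 0
            if i > 0:
                prev = enter[i - 1]
                if s + 1 < n_stages:
                    free = prev[s + 1]  # previous instruction moved on
                else:
                    free = prev[s] + latencies[i - 1][s]  # left the last stage
                ready = max(ready, free)
            row.append(ready)
        enter.append(row)
    return enter


# Three stages (fetch, execute, writeback); the second instruction
# needs three cycles in execute, which stalls the one behind it.
program = [[1, 1, 1], [1, 3, 1], [1, 1, 1]]
for i, row in enumerate(schedule(program)):
    print(f"instr {i}: enters stages at cycles {row}")
```

In this example the third instruction enters fetch at cycle 2 but cannot enter execute until cycle 5, so it stalls in fetch for two cycles behind the long-running second instruction.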

The library comes with a large list of vendor processors. These have been built using information from the datasheet and validated using application models from the template library. Some of the processors provided include:

ARM: ARM Cortex-A8, M3, ARM1136J(F)-S, ARM1176JZ(F)-S, ARM720T, ARM920T, ARM922T, ARM926EJ-S, ARM7EJ-S, ARM7TDMI, ARM7TDMI-S, ARM946E-S, ARM966E-S, ARM968E-S, ARM996HS

PowerPC: 7XXX, 7410, 750, 405, 603e, RAD750

Intel: Xeon, Nehalem

AMD: Opteron

Renesas: SH4 and SH5

TI: C64, C6678, C6713

Tensilica: Xtensa LX2

AD: TS201

–>