Technologies such as the Internet of Things (IoT), enterprise mobility, cloud computing and virtualization have changed business models, infrastructure and protocols. Sophisticated systems place growing demands on security, data storage, power control and memory bandwidth. Memory and power largely determine the efficiency of the system. Architecture exploration and performance analysis make it possible to evaluate the processes and develop a customized product architecture that suits the business need.
Designing an Information Appliance

Marketing has to explore the feasibility of requests for new products that come in from customer surveys and from the field. Product engineering and the CTO's office have to develop prototypes of the new technology for demonstration at major trade shows. During this period, feasibility, conformance to market requirements and the product specification have to be designed and verified. For this purpose it is essential to combine timing and functional behavior with true data streams when exploring the system.

Companies such as Nokia generate 120 products out of just 5-10 platforms, so the robustness and flexibility of these platforms are critical. In addition, companies are developing platforms that can be used for a variety of applications. The OMAP platform from TI and the Geode from National are examples of such generic platforms that customers can adapt for a variety of applications in the consumer and wireless markets.

Semiconductor companies must demonstrate the superiority of their platforms over rivals. Most do that today using relationships and data sheets. Most of these are not dynamic and cannot capture the differentiating value. Customers need an evaluation platform to experiment with the performance and functionality of these platforms or ICs for their own application and traffic requirements. The VisualSim models provide an effective evaluation platform and are a replacement for the static reference designs available on semiconductor and system companies' Web sites. Customers could be on your Web site exploring new technology the same way you are currently on our Web site viewing and modifying these models.

Project Objectives

To illustrate these requirements, a Personal Digital Assistant (PDA) that contains a smart media and a wireless connection is considered. The application selected to evaluate the system is an MPEG stream.
The model uses information that is normally available on semiconductor companies' data sheets to size the individual components. It studies the latency and image quality for rendering an MPEG4 image on a CRT. The MPEG4 data can arrive from two sources: the memory stick/hard drive and the wireless interface. The data flows in the following manner from the source to the CRT:
Capabilities Demonstrated

This model demonstrates the separation methodology and also shows how customers can use the Web as a medium for transmitting technical information in a dynamic manner. The model combines the software application with the hardware. In addition, it displays the original transmitted image and the image after decoding, thus showing the superiority of the particular implementation of the Vector Quantization algorithm for HDTV design. The model can also be expanded extensively to explore numerous other trade-offs, such as image ratio versus CPU cycles consumed, determining the optimum software execution platform (CPU vs. custom processor), and reducing power by sequencing tasks to optimize the switching functions.

Model Development Statistics

Define data flow through the system = 2 days
Number of unique blocks required to create the model = 6
Time to do the initial model construction = 1 day
Model analysis and refinement = 3 days
Documentation = 0.5 day

Model Construction

This VisualSim system model consists of three sections: workload generator, behavior description and architecture description. These sections are connected using the Virtual method provided by VisualSim. In the diagram, the mapping between the behavior and the architecture is done using the Virtual Execution method.

In this model, the transactions generated are the incoming MPEG frames. There are two generators in the hierarchical block shown as Workload_Gen: one for the antenna and one for the hard drive. The transactions are generated in a pattern described by a Poisson distribution. A refined model can use the actual arrival rate of the MPEG streams by capturing an arrival stream and feeding that file in as the input. The Workload_Gen has one parameter, Transaction. This parameter can be modified to increase or reduce the number of frames that are transmitted in any second. Each transaction is treated as a Data Structure and contains multiple fields.
These fields carry the information required for the simulation. The list of Data Structure fields can be seen in the Transaction output from the Dual Processor Model. The MPEG data itself is not transmitted through the simulation; instead, a representation in the form of the frame size is sent. The frame can be encapsulated as an object field in the data structure so that it can be used by any part of the design that evaluates the algorithm. In this example the image processing algorithm is evaluated in two locations: at the behavior and at the Video Processor.

In this example, the parameter "Transaction" is said to be exported to the upper layer, as it is made common to the entire layer of this system. The parameter can be made global and evaluated at simulation time, or it can be specified at any level of the hierarchy.

The output from the Workload_Gen is fed into the behavior portion of the design. The behavior describes the flow of data through the appliance. Because the execution is the same for data generated from the antenna and from the hard drive, the flow is also similar. There is one decision point in this flow: determining whether the data needs to be retrieved from the cache or from memory. One of the fields of the data structure holds a randomly distributed value of 0 or 1. If the field value is 0, the data is acquired from the cache; if the value is 1, the data is acquired from the SDRAM.

Each item in the behavior flow is mapped to the respective hardware or software on the architecture. When data arrives at the PCI behavior block, a request is sent to the PCI architecture block, where the execution occurs. The behavior simply defines the functionality, while the actual execution is performed on the architecture blocks. The behavior is described using the Mapper_Adv SmartBlock. The architecture elements are defined using the Scheduler SmartBlock. The Video Processor is refined further in that the actual vector quantization algorithm is implemented.
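The workload generation and the cache/SDRAM decision described above can be sketched outside VisualSim in a few lines of Python. This is only an illustrative sketch, not the VisualSim API or the model's actual implementation: the names `Transaction`, `generate_workload`, `memory_target` and `run_demo`, the field names, the 30 frames-per-second rate, the 4096-byte frame size and the 0.5 probability of selecting SDRAM are all assumptions introduced for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Transaction:
    """Hypothetical stand-in for the Data Structure carried through the model."""
    source: str          # "antenna" or "hard_drive"
    arrival_time: float  # seconds since the start of the simulation
    frame_bytes: int     # frame size stands in for the MPEG payload
    mem_select: int      # 0 -> cache, 1 -> SDRAM


def generate_workload(source, rate_per_s, duration_s, rng,
                      frame_bytes=4096, sdram_prob=0.5):
    """Generate Poisson arrivals for one source.

    A Poisson arrival process has exponentially distributed gaps
    between arrivals, with mean 1 / rate_per_s.
    """
    t = 0.0
    transactions = []
    while True:
        t += rng.expovariate(rate_per_s)
        if t >= duration_s:
            break
        # Assumed: the 0/1 field is drawn with a fixed probability.
        mem_select = 1 if rng.random() < sdram_prob else 0
        transactions.append(Transaction(source, t, frame_bytes, mem_select))
    return transactions


def memory_target(txn):
    """Decision point in the behavior flow: 0 -> cache, 1 -> SDRAM."""
    return "Cache" if txn.mem_select == 0 else "SDRAM"


def run_demo(seed=1):
    """Merge one second of traffic from both generators, ordered by time."""
    rng = random.Random(seed)
    txns = (generate_workload("antenna", 30.0, 1.0, rng)
            + generate_workload("hard_drive", 30.0, 1.0, rng))
    txns.sort(key=lambda txn: txn.arrival_time)
    return txns


if __name__ == "__main__":
    for txn in run_demo()[:5]:
        print(f"{txn.arrival_time:.4f}s {txn.source:10s} -> {memory_target(txn)}")
```

Raising `rate_per_s` plays the role of the Transaction parameter that changes the number of frames sent per second, and lowering `sdram_prob` corresponds to a higher cache hit ratio.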
The processing requirements at each entity are determined by the size of the incoming frame and the clock speed of that block. The connections between the architecture blocks are for statistics gathering and do not affect the simulation. The results are gathered in two locations: the Result hierarchical block and the Timeline plotter. The bottom-right port of each architecture block is the statistics output. The data from this port is captured and displayed using the plotter in the Result block. All of the SmartBlocks have statistics generation that can be used to produce data on the fly. This eliminates the need to perform complex analysis outside the environment. The middle-right port reports the time for which the block is utilized, which can be plotted on a timeline.

Results

A number of results are generated from this simulation:
Utilization Graph: All of the components are heavily underutilized except for the CPU Bus. This is to be expected, as the CPU Bus is accessed at least 4 times by each flow. Additional trade-off analysis can be performed to determine whether the video processor can be eliminated. Appliance makers commonly face this question of whether to eliminate the application processor and share the load between the DSP and the CPU. The appliance platform has a large amount of headroom, and a number of new features can be added to this product without modifying the underlying hardware. This is important where the product is being sold as a generic development platform or where multiple products emanate from a single platform.

Timeline Plot: The timeline plot indicates that the instructions are accessing the SRAM at a much higher rate than the cache. This could be a reason for the image quality degradation. As the utilization graph indicates, the cache speed can stay the same, but the cache hit ratio must be increased. It is possible to determine the optimum cache hit ratio.

Refinement Opportunities

There are a number of opportunities to refine this model in order to obtain more analysis reports and thus make more detailed decisions. Some are described below: