Synthesis of Transaction-Level Models to FPGAs

Prof. Jason Cong
Yiping Fan, Guoling Han, Wei Jiang, Zhiru Zhang
VLSI CAD Lab, Computer Science Department
University of California, Los Angeles

Outline
- Transaction-level model (TLM)
  - SystemC TLM
  - Metropolis Meta Model
- Synthesis from TLM
  - RDR/MCAS: our existing architectural synthesis approach
  - xPilot: ongoing synthesis infrastructure for TLM

SystemC Framework
- SystemC history
  - OO system/HW modeling and simulation
  - SystemC under development by CAD vendors/researchers
    - Synopsys
    - Frontier Design
    - CoWare (Belgium)
  - Released to public Sept. 99
  - Open source distribution: www.systemc.org
  - Version 2 out July 01

Channels and Modules
- Basic building blocks: module (class) instances, communicating via channel (class) instances
- Module functionality is coded as concurrent processes
- Processes communicate via channels or events

Communication Modeling in SystemC

Primitive Channels in SystemC Library
- Ordinary signal (wire) of a templated type
  - Fill in data type T when instantiated
  - Point-to-point or multi-point (1 writer, n readers)
- Signal bus (arbitrary width)
- FIFO, for producer/consumer connection
- Pseudo-channels
  - Mutex & semaphore, for interprocess sync
  - Accessed using channel syntax
- Complex “hierarchical” channels composed of primitive channels, processes, modules

Events and Processes
- Events: abstract occurrences used for
  - Process triggering (like a VHDL sensitivity list)
  - Channel communication
  - Interprocess synchronization
- A process can call wait() to block on an event
- An event occurrence tells the simulator to schedule simulation of the relevant process
- Process execution
  - Not called directly from your code
  - Triggered for simulation by events on ports, channels, or explicit named events
  - Registered in constructor of enclosing module (associate method with events)
- Thread process: infinite loop
  - Must call wait() to yield control
- Method process: runs to completion
  - Less scheduling overhead

Data Types in SystemC
- SystemC supports
  - Native C/C++ types
  - SystemC types
- SystemC types
  - Data types for system modeling
  - 2-value (0,1) logic/logic vector
  - 4-value (0,1,Z,X) logic/logic vector
  - Arbitrary-sized integer (signed/unsigned)
  - Fixed-point types (templated/untemplated)
- Objective: to reflect HW registers & ALU operations

Functional Level and RTL Modeling in SystemC
- Functional level
  - Sequential, algorithmic, software-like
  - Explore HW/SW architectures, proof of algorithms, performance modeling & analysis
- Register transfer level
  - Complete detailed functional description of hardware
  - Every register, bus, bit for every clock cycle
  - Use C++ switch/case for FSM implementation
  - At this point, can switch to HDL, but staying in SystemC leverages test benches
  - Prepare for HW synthesis step by using only synthesizable constructs

Transaction Level Modeling in SystemC
- Transaction level
  - Model includes architectural components
  - Maintain component interface accuracy
    - E.g., buses modeled as channels (read/write operations)
  - Behavioral style inside a component
  - Simulates 100-10,000x faster than RTL
  - Provides an execution platform for SW development

TLM Raises the Level of Architectural Modeling
- What is TLM?
  - Communication uses function calls
    - burst_read(char* buf, int addr, int len);
- Why is TLM interesting?
  - Simulation: fast and compact
  - Integrate HW and SW models
  - Early platform for SW development
  - Early system exploration and verification
  - Verification reuse
  - Synthesis
- Reference: www.systemc.org

Typical Design Flow Using TLM
- Functional model
  - Captures system behaviour
- TLM, Transaction Level Model
  - Bus transactions
  - Accurate interaction with SW portion
  - Simulates rapidly
- Can create TLM model initially

Introduction to Metropolis
- A UCB and GSRC project, http://www.gigascale.org/metropolis/
- Platform-based design [ASV]
  - Platforms have sufficient flexibility to support a series of applications/products
  - Choose a platform by design space exploration
  - Both of the above require models to be reusable
- Orthogonalization of concerns
  - Computation vs. communication
  - Behavior vs. coordination
  - Behavior vs. architecture
  - Capability vs. cost

Metropolis Meta Model
- A combination of imperative program and declarative constraints
- Imperative program
  - Objects (process, media, quantity, statemedia)
  - Netlist
  - Await
  - Block and label
  - Interface function call
  - Quantity annotation
- Declarative constraints
  - Linear Temporal Logic (LTL) (synch)
  - Logic of Constraints (LOC)

A Metropolis Design Tutorial
[Figure labels: MyMapNetlist, MyFncNetlist, M, P1, P2, Env1, Env2]
Event pairs listed on the slide:
    B(P1, M.write)   B(mP1, mP1.writeCpu);
    E(P1, M.write)   E(mP1, mP1.writeCpu);
    B(P1, P1.f)      B(mP1, mP1.mapf);
    E(P1, P1.f)      E(mP1, mP1.mapf);
    B(P2, M.read)    B(P2, mP2.readCpu);
    E(P2, M.read)    E(mP2, mP2.readCpu);
    B(P2, P2.f)      B(mP2, mP2.mapf);
    E(P2, P2.f)      E(mP2, mP2.mapf);

Outlook of the First Metropolis Release
- Meta model infrastructure
- Front end
- Back end 1, Back end 2, Back end 3, Back end N
- Sample architectural libraries
  - Coarse-simple cpu, bus, memory, arbiters
  - Time quantity
- Sample MoC
  - Multi-media (Yapi, TTL)
  - Synchronous
- A design tutorial
- http://www.gigascale.org/metropolis/

TLM Conclusions
- SystemC is the de facto system-level-design standard
  - Pushed by many CAD tool vendors
  - Used widely in industry and academia
    - E.g., Intel handheld system project [ICCAD04]
  - Unified language to model a system at different levels
  - Improving path to HW synthesis from SystemC source code
  - Fits with the trend to take system design to a higher level
- Metropolis is a novel academic framework for models of computation
  - Capable of representing TLM as well
  - Provides a comprehensive starting point for synthesis

Outline
- Transaction-level model (TLM)
  - SystemC TLM
  - Metropolis Meta Model
- Synthesis from TLM
  - xPilot: our ongoing synthesis infrastructure for TLM
  - RDR/MCAS: our existing architectural synthesis approach

xPilot: TLM to RTL Synthesis Flow
[Flow diagram; blocks as labeled on the slide]
- TLM in SystemC/Metropolis
- Frontend
- SSDM
- Arch-independent passes
  - SSDM checking
  - Loop unrolling/pipelining
  - Strength reduction / bitwidth analysis
  - Speculative-execution transformation
- Arch-dependent passes
  - Memory analysis/allocation
  - Scheduling/binding
  - Register/port binding
  - Traditional / low power / RDR-pipe or placement driven
- Arch-generation passes: RTL/constraints generation
  - Verilog/VHDL/SystemC
  - Altera/Xilinx
  - General/Synopsys/Magma
- RTL
- FPGAs

Integration of xPilot with Metropolis
[Diagram labels]
- Meta model infrastructure
- Front end
- SystemC Simulation
- LOC Checking
- SPIN Interface
- Synthesis
- HW Implementation
- RTL
- Timing Constraints
- Physical Constraints
- RTL Handoff
- Latency Insensitive Design
- GALS
- RDR/MCAS
- IP Library
- HW implementation
- Compilation for RP
- Simulation
- Extended Instruction
- Reconfigurable Interconnect
- Reconfigurable Coprocessor
- xPilot/SSDM

SSDM Zoomed In: CDFG
    if (cond1) bb1();
    else bb2();
    bb3();
    switch (test1) {
    case c1: bb4(); break;
    case c2: bb5(); break;
    case c3: bb6(); break;
    }
    bb7();
- 2-level CDFG representation
  - 1st level: control flow graph
  - 2nd level: data flow graph

SSDM Features Different from Software IR
- Top-level: netlist of concurrent processes
- Process port/interface semantics
  - FIFO: FifoRead() / FifoWrite()
  - BUFF: BuffRead() / BuffWrite()
  - Memory: MemRead() / MemWrite()
- Bit vector manipulation
  - Bit extraction / concatenation / insertion
  - Bit-width property for every value
- Cycle-level notation
  - Scheduling / binding information / delay

Our Architectural Synthesis Approaches: RDR / MCAS
- Consideration of multi-cycle communication during architectural (or behavioral)
  synthesis
- Regular Distributed Register (RDR) micro-architecture [Cong et al, ISPD03]
  - Highly regular
  - Direct support of multi-cycle on-chip communication
- MCAS: Architectural Synthesis for Multi-cycle Communication
  - Efficiently maps the behavioral descriptions to RDR uArch
  - Integrates architectural synthesis (e.g. resource binding, scheduling) with physical planning

RDR/MCAS: Support for Heterogeneous Integration with Multi-cycle Communication & Automatic Interconnect Pipelining
- Distribute registers to each “island”
- Choose the island size such that
  - Single cycle for intra-island computation and communication
  - Multi-cycle communication between islands
- Support interconnect pipelining
  - Inter-island pipeline register station (PRS) for global communications
  - PRS performs autonomous store-and-forward
- MCAS: multi-cycle architectural synthesis integrated with global placement
- Experimental results
  - MCAS vs. conventional flow: 36% reduction in clock period and 30% reduction in total latency
  - MCAS-Pipe vs. MCAS: 28.8% long global wirelength reduction, 19.3% total wirelength reduction
- Can also support IP integration using latency insensitive technique [Carloni, ICCAD99]

Synthesis Flow: MCAS-Pipe System
[Diagram labels]
- ICG
- C / VHDL
- Locations
- Placement-driven rescheduling & rebinding
- Scheduling-driven placement
- CDFG generation
- Register and port binding
- Datapath & FSM generation
- Resource allocation & functional unit binding
- RTL VHDL & floorplan constraints
- CDFG
- Global interconnect sharing
  - Enables multiple data communications to share one physical link (a wire with pipeline registers)

Related Publications
- Regular distributed register (RDR) architecture and MCAS synthesis algorithms [ISPD03, ICCAD03]
- RDR-Pipe and MCAS-Pipe synthesis algorithms [DAC04]
- Lopass: high-level synthesis for low-power FPGAs [ISLPED03]
- Multiplexor optimization through register/port binding [ASPDAC04]
- Bitwidth-aware scheduling and binding algorithms [ASPDAC05]

Conclusions
- Higher level abstraction is needed in the current SO(P)C design flow
- SystemC has become the SLD standard; TLM in particular is widely used
- Metropolis is a platform-based design framework
- It is time to build a new generation of behavioral synthesis systems from TLM
- xPilot: ongoing project
  - An architectural synthesis infrastructure from TLM to RTL (FPGAs)
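
Sketch: TLM communication as function calls

The "What is TLM?" slide says that communication uses function calls and gives burst_read(char* buf, int addr, int len) as the example. The following is a minimal plain C++ sketch of that idea, not SystemC code. Only the burst_read signature comes from the slide; BusIf, SimpleMemory, burst_write, transactions, and checksum are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical transaction-level bus interface: a master calls burst_read()
// instead of driving address, data, and handshake wires cycle by cycle.
class BusIf {
public:
    virtual ~BusIf() = default;
    virtual void burst_read(char* buf, int addr, int len) = 0;         // signature from the slide
    virtual void burst_write(const char* buf, int addr, int len) = 0;  // assumed counterpart
};

// Hypothetical memory model standing in for the channel behind the interface.
class SimpleMemory : public BusIf {
public:
    explicit SimpleMemory(std::size_t size) : data_(size, 0) {}

    void burst_read(char* buf, int addr, int len) override {
        assert(in_range(addr, len));
        std::memcpy(buf, data_.data() + addr, static_cast<std::size_t>(len));
        ++transactions_;  // one call models one whole bus transaction
    }

    void burst_write(const char* buf, int addr, int len) override {
        assert(in_range(addr, len));
        std::memcpy(data_.data() + addr, buf, static_cast<std::size_t>(len));
        ++transactions_;
    }

    int transactions() const { return transactions_; }

private:
    bool in_range(int addr, int len) const {
        return addr >= 0 && len >= 0 &&
               static_cast<std::size_t>(addr) + static_cast<std::size_t>(len) <= data_.size();
    }

    std::vector<char> data_;
    int transactions_ = 0;
};

// A master written in behavioral style: it depends only on the interface,
// so the memory model could later be swapped for a more detailed one.
int checksum(BusIf& bus, int addr, int len) {
    std::vector<char> buf(static_cast<std::size_t>(len));
    bus.burst_read(buf.data(), addr, len);
    int sum = 0;
    for (char c : buf) sum += static_cast<unsigned char>(c);
    return sum;
}
```

Because the master sees only BusIf, the same checksum code would run against a refined bus model, which is one way to read the "verification reuse" point on the same slide.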
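
Sketch: a two-level CDFG

The "SSDM Zoomed In" slide describes a 2-level CDFG: a control flow graph whose nodes are basic blocks, each holding a data flow graph. The sketch below is one possible way to hold that structure in plain C++ and to build the control flow of the slide's if/switch example. It is not the SSDM data structure itself; all type and function names are hypothetical, and the data flow graph placed in bb1 is invented purely to show the second level.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Second level: one data-flow node (an operation and the nodes it reads).
struct DfgNode {
    std::string op;
    std::vector<std::size_t> operands;  // indices of producer nodes in the same block
};

// First level: a basic block with its own DFG and its control-flow successors.
struct BasicBlock {
    std::string name;
    std::vector<DfgNode> dfg;
    std::vector<std::size_t> succs;  // indices into Cdfg::blocks
};

struct Cdfg {
    std::vector<BasicBlock> blocks;

    std::size_t add_block(const std::string& name) {
        BasicBlock bb;
        bb.name = name;
        blocks.push_back(bb);
        return blocks.size() - 1;
    }

    void add_edge(std::size_t from, std::size_t to) {
        assert(from < blocks.size() && to < blocks.size());
        blocks[from].succs.push_back(to);
    }

    std::size_t edge_count() const {
        std::size_t n = 0;
        for (const BasicBlock& bb : blocks) n += bb.succs.size();
        return n;
    }
};

// Control flow of the slide's example:
//   if (cond1) bb1(); else bb2();  bb3();
//   switch (test1) { case c1: bb4(); ... case c3: bb6(); }  bb7();
Cdfg build_slide_example() {
    Cdfg g;
    std::size_t entry = g.add_block("entry");  // evaluates cond1
    std::size_t bb[8] = {};                    // bb[1]..bb[7]; bb[0] unused
    for (int i = 1; i <= 7; ++i) bb[i] = g.add_block("bb" + std::to_string(i));

    g.add_edge(entry, bb[1]);
    g.add_edge(entry, bb[2]);
    g.add_edge(bb[1], bb[3]);
    g.add_edge(bb[2], bb[3]);
    for (int i = 4; i <= 6; ++i) {
        g.add_edge(bb[3], bb[i]);  // one edge per switch case
        g.add_edge(bb[i], bb[7]);  // each case breaks to bb7
    }
    g.add_edge(bb[3], bb[7]);      // the switch has no default case

    // Invented DFG for bb1, computing (a + b) * c.
    std::vector<DfgNode>& dfg = g.blocks[bb[1]].dfg;
    dfg.push_back(DfgNode{"read a", {}});
    dfg.push_back(DfgNode{"read b", {}});
    dfg.push_back(DfgNode{"read c", {}});
    dfg.push_back(DfgNode{"add", {0, 1}});
    dfg.push_back(DfgNode{"mul", {3, 2}});
    return g;
}
```

Keeping the data flow graph inside each basic block lets control-level passes (such as loop unrolling) and operation-level passes (such as scheduling and binding) work on separate levels of the same representation.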
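
Sketch: bit-vector manipulation

The "SSDM Features Different from Software IR" slide lists bit extraction, concatenation, and insertion as operations a hardware-oriented IR needs. A minimal sketch of what these three operations compute, written as plain C++ helpers on 64-bit values, follows. The function names are hypothetical and do not come from SSDM.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `width` bits set; width must be in [1, 64].
std::uint64_t bit_mask(unsigned width) {
    assert(width >= 1 && width <= 64);
    return width == 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << width) - 1);
}

// Extraction: return bits hi..lo of v (inclusive), right-aligned.
std::uint64_t bit_extract(std::uint64_t v, unsigned hi, unsigned lo) {
    assert(hi >= lo && hi < 64);
    return (v >> lo) & bit_mask(hi - lo + 1);
}

// Concatenation: place `high` above a `low_width`-bit wide `low` field.
std::uint64_t bit_concat(std::uint64_t high, std::uint64_t low, unsigned low_width) {
    assert(low_width >= 1 && low_width < 64);
    return (high << low_width) | (low & bit_mask(low_width));
}

// Insertion: overwrite bits hi..lo of v with the low bits of `field`.
std::uint64_t bit_insert(std::uint64_t v, std::uint64_t field, unsigned hi, unsigned lo) {
    assert(hi >= lo && hi < 64);
    const std::uint64_t m = bit_mask(hi - lo + 1) << lo;
    return (v & ~m) | ((field << lo) & m);
}
```

In software these helpers always operate on a fixed 64-bit word. The slide's "bit-width property for every value" point is that a synthesis IR instead tracks the exact width of each result, so that hardware is sized to the bits actually used.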
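
Sketch: multi-cycle communication between islands

The RDR/MCAS slides state that intra-island computation and communication fit in a single cycle, while communication between islands takes multiple cycles. The sketch below shows how a scheduler could account for this once operations are placed. The cost model is an assumption made for illustration only (one extra cycle per island hop, measured as Manhattan distance on an island grid) and is not taken from the RDR or MCAS publications. Island, extra_comm_cycles, and earliest_start are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>

// Position of an island on a regular grid (assumed layout).
struct Island {
    int row;
    int col;
};

// Assumed cost model: a transfer inside one island needs no extra cycle;
// every island hop (Manhattan distance) adds one cycle.
int extra_comm_cycles(Island src, Island dst) {
    return std::abs(src.row - dst.row) + std::abs(src.col - dst.col);
}

// Earliest cycle in which a consumer can start, given the cycle in which
// its producer finishes and the islands both operations are placed in.
int earliest_start(int producer_finish_cycle, Island producer, Island consumer) {
    assert(producer_finish_cycle >= 0);
    return producer_finish_cycle + extra_comm_cycles(producer, consumer);
}
```

Under this assumed model, moving a consumer closer to its producer directly shortens the schedule, which illustrates why the slides pair scheduling and binding with placement rather than running them separately.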