1. Synthesis

2. What is Synthesis?
- Transformation of an abstract description into a more detailed description.
- The "+" operator is transformed into a gate netlist.
- "if (VEC_A = VEC_B) then" is realized as a comparator that controls a multiplexer.
- The transformation depends on several factors, including the target gates (AND, OR) and the tool.

3. Field Programmable Gate Array (FPGA)

4. FPLD
[Slide text lost in extraction; only the terms "FPLD" and "Debug" survive.]

5. Synthesizability
- Only a subset of VHDL is synthesizable.
- Different tools support different subsets: records? arrays of integers? clock edge detection? sensitivity list?

6. Different Language Support for Synthesis
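The two transformations named under "What is Synthesis?" can be sketched in a few lines of VHDL. This is a minimal illustration, not code from the slides: the entity name SYNTH_EXAMPLE, the ports D0, D1, SUM and Y, and the 4-bit widths are assumptions chosen for the example. Only VEC_A and VEC_B come from the slide text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: names and widths are illustrative assumptions.
entity SYNTH_EXAMPLE is
  port (
    VEC_A, VEC_B : in  std_logic_vector(3 downto 0);
    D0, D1       : in  unsigned(3 downto 0);
    SUM          : out unsigned(3 downto 0);
    Y            : out unsigned(3 downto 0)
  );
end entity SYNTH_EXAMPLE;

architecture RTL of SYNTH_EXAMPLE is
begin
  -- "+" is typically turned into an adder netlist
  -- (gates, or LUTs and carry logic on an FPGA).
  SUM <= D0 + D1;

  -- The equality test becomes a comparator; its one-bit result
  -- drives the select input of a multiplexer choosing D0 or D1.
  process (VEC_A, VEC_B, D0, D1)
  begin
    if VEC_A = VEC_B then
      Y <= D0;
    else
      Y <= D1;
    end if;
  end process;
end architecture RTL;
```

The exact netlist depends on the tool and target technology, so two synthesis runs of the same source may legitimately differ.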
7. How to Do?
- Macrocells: adder, comparator, bus interface.
- Constraints: speed, area, power.
- Optimizations: Boolean (mathematical); gate-level (technological).

8. Non-functional requirements
- Performance: clock speed is generally a primary requirement, usually expressed as a lower bound.
- Design cycle and timing closure.
- Size: determines manufacturing cost. If your design doesn't fit into one size of FPGA, you must use the next larger FPGA. Very large designs may need multiple FPGAs.
- Power/energy: related to battery life and heat. May add cost: more expensive packaging to dissipate heat, or more extreme measures (e.g. cooling fans). Many digital systems are power- or energy-limited.

9. Mapping into an FPGA
- Must choose the FPGA: capacity, pinout/package type, maximum speed.

10. Synthesis Process in Practice

11. Path delay
- Combinational network delay is measured over paths through the network.
- Can trace a causality chain from inputs to the worst-case output.

12. Path delay example
[Figure: network and its graph model.]

13. Critical path
- Critical path = the path that creates the longest delay.
- Can trace the transitions that cause the delays making up the critical path.

14. Critical path through delay graph

15. Delay Paths in a design

16. False paths
- Logic gates are not simple nodes: some input changes don't cause output changes.
- A false path is a path that can never be exercised because of Boolean gate conditions.
- False paths cause pessimistic delay estimates.

17. Placement and delay
- Placement helps determine routing.
- Routing determines wire length.
- Wire length determines capacitive load.
- Capacitive load determines delay.

18. Example: Adder placement and delay
- N-bit adder (optimal placement).
[Figure: four adder cells (+).]

19. Bad placement and routing
- With no delay constraints.
[Figure: placement and routing views.]

20. Bad placement and routing
- The adder has been distributed throughout the FPGA.
- I/O pins have been spread around the chip.
- P&R algorithms do not catch on to regularity.

21. Better placement and routing
- With delay constraints.
- Better but far from optimal (less spread out horizontally but spread out vertically).

22. How to improve?
- Use (optimized) macros.
- Put constraints on the placement of objects.
- Hand-place objects. Example: later.

23. Power Optimization

24. Power optimization
- Transitions cause power consumption.
- Logic network design helps control power consumption: minimizing capacitance; eliminating unnecessary glitches.

25. Power optimization
- Leakage matters in more advanced processes, even when logic is idle.
- The only way to avoid it: disconnect the power supply from the logic when it is not needed for some time.
- It generally takes a considerable period (longer than a clock period) to reconnect power and let the circuits stabilize.

26. Glitching example
[Figure: gate network.]

27. Glitching example behavior
- The NOR gate produces a 0 output at the beginning and at the end:
  - beginning: bottom input is 1;
  - end: NAND output is 1.
- The difference in delay between the application of the primary inputs and the generation of the new NAND output causes the glitch.

28. Adder Chain Glitching
[Figure: "bad" and "good" adder structures.]
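The "bad" and "good" structures on the adder chain glitching slide are an unbalanced chain and a balanced tree. A minimal VHDL sketch of the two forms, summing four operands, might look as follows. The entity name ADD4, the port names and the 8-bit operand width are assumptions made for illustration. A synthesis tool is free to re-associate additions, so the structure written in the source is a hint and not a guarantee.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: names and widths are illustrative assumptions.
entity ADD4 is
  port (
    A, B, C, D : in  unsigned(7 downto 0);
    SUM_CHAIN  : out unsigned(9 downto 0);
    SUM_TREE   : out unsigned(9 downto 0)
  );
end entity ADD4;

architecture RTL of ADD4 is
  signal AB, CD : unsigned(9 downto 0);
begin
  -- Unbalanced chain ("bad"): three adders in series, so the inputs of
  -- the second and third adders settle at different times.
  SUM_CHAIN <= ((resize(A, 10) + resize(B, 10)) + resize(C, 10)) + resize(D, 10);

  -- Balanced tree ("good"): A+B and C+D are computed in parallel and
  -- reach the final adder at roughly the same time.
  AB       <= resize(A, 10) + resize(B, 10);
  CD       <= resize(C, 10) + resize(D, 10);
  SUM_TREE <= AB + CD;
end architecture RTL;
```

Both outputs compute the same value; only the arrival times of the intermediate signals differ, which is what affects glitch activity.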
29. Explanation
- An unbalanced chain has signals arriving at different times at each adder.
- A glitch at one adder propagates through all the adders that follow it in the chain.
- A balanced tree introduces multiple glitches simultaneously, reducing total glitch activity.

30. Factorization for low power
- Proper factorization reduces glitching.
[Figure: "bad" and "good" factorizations involving the term ac; a has a high transition probability.]

31. Factorization techniques
- In the example, a has a high transition probability; b and c have low probabilities.
- Reduce the number of logic levels through which high-probability signals must travel, in order to reduce the propagation of glitches.

32. Example (ALU)
- The ALU output is not used in every cycle.
- If the ALU inputs change in a cycle where the output is not used, energy is consumed needlessly.

33. Example (ALU)
- A control signal selects whether data is allowed to pass to the logic or the previous value is held to avoid transitions.
[Figure: Data and Control feed a D/Q storage element in front of the Logic block.]

34. Layout for low power
- Place and route to minimize the capacitance of nodes with high glitching activity.
- Feed wiring capacitance values back to power analysis for better estimates.

35. State assignment for low power
- Later.

36. Case Study
- 16 x 16 multiplier example.

37. The FPGA design process
Xilinx ISE (Integrated Synthesis Environment):
- Translation from HDL (Synthesis, Translation).
- Logic synthesis (Mapping).
- Placement and routing (Place and Route).
- Configuration generation (Program File Generation).

38. Design experiments
- Synthesize with no constraints.
- Synthesize with a timing constraint.
- Tighten the timing constraint.
- Synthesize with placement constraints.
- Power: many tools don't allow us to specify power consumption directly, so we must rewrite our hardware description for better power consumption characteristics.

39. Post-translation simulation model
- No timing or area constraints.
- HDL model in terms of FPGA primitives. Example:

    X_LUT4 p12_Madd_n0015_Mxor_Result_Xo1 (.ADR0(x_7_IBUF), .ADR1(y_13_IBUF), .ADR2(c127), .ADR3(row128), .O(row137));

40. Mapping report

    Design Summary
    Number of errors:   0
    Number of warnings: 0
    Logic Utilization:
      Number of 4 input LUTs:                          501 out of 1,024   48%
    Logic Distribution:
      Number of occupied Slices:                       255 out of 512     49%
      Number of Slices containing only related logic:  255 out of 255    100%
      Number of Slices containing unrelated logic:       0 out of 255      0%
        *See NOTES below for an explanation of the effects of unrelated logic
    Total Number 4 input LUTs:                         501 out of 1,024   48%
    Number of bonded IOBs:                              64 out of 92      69%
    Total equivalent gate count for design:  3,006
    Additional JTAG gate count for IOBs:     3,072
    Peak Memory Usage:  64 MB

41. Related vs. Unrelated Logic (Hidden)
- Related logic: logic that shares connectivity.
- Unrelated logic: logic that shares no connectivity.
- When assembling slices, the mapper gives priority to combining logic that is related, which gives the best results.
- The mapper will only begin packing unrelated logic into a slice once all of the slices are occupied.

42. Static timing analysis report

    Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 99.999 uS ;
    20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
    Maximum delay is 20.916ns.

- After mapping: estimated delays (no information about interconnects).

43. Static timing report: delays along paths

    Data Sheet report:
    All values displayed in nanoseconds (ns)
    Pad to Pad
    Source Pad | Destination Pad | Delay
    -----------+-----------------+--------
    x          | p               |  5.824
    x          | p               | 10.675
    x          | p               | 11.214
    x          | p               | 11.753

44. Routing report

    Phase 1: 1975 unrouted;      REAL time: 11 secs
    Phase 2: 1975 unrouted;      REAL time: 11 secs
    Phase 3: 619 unrouted;       REAL time: 12 secs
    Phase 4: 619 unrouted; (0)   REAL time: 12 secs
    Phase 5: 619 unrouted; (0)   REAL time: 12 secs
    Phase 6: 619 unrouted; (0)   REAL time: 12 secs
    Phase 7: 0 unrouted; (0)     REAL time: 12 secs
    The NUMBER OF SIGNALS NOT COMPLETELY ROUTED for this design is: 0

- REAL time: routing algorithm run time.

45. Static timing after routing

    Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 99.999 uS ;
    20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
    Maximum delay is 38.424ns.

- Compare with 20.916 ns in the mapping report: the increase is due to interconnect delays.

46. Timing constraint
- Use the timing constraint editor.

47. Post-map static timing report

    Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 32 nS ;
    20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
    Maximum delay is 20.916ns.

- The constraint is pad to pad.
- The maximum delay hasn't changed, since this design has limited opportunities for logic synthesis to change delays by restructuring logic.
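For readers who want to rerun these experiments, a 16 x 16 combinational multiplier can be described in a few lines of VHDL. This is a hedged sketch, not the design used in the case study: the pad names x, y and p are taken from the timing reports, and the widths agree with the 64 bonded IOBs in the mapping report (16 + 16 + 32), but the entity name MULT16 and the use of the numeric_std "*" operator are assumptions. Net names in the reports, such as row128, suggest the original was built from explicit adder rows, which this sketch does not reproduce.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name; port names follow the pads in the reports.
entity MULT16 is
  port (
    x : in  std_logic_vector(15 downto 0);
    y : in  std_logic_vector(15 downto 0);
    p : out std_logic_vector(31 downto 0)
  );
end entity MULT16;

architecture RTL of MULT16 is
begin
  -- Purely combinational: pad-to-pad delay is the multiplier's
  -- critical path. unsigned * unsigned yields a 32-bit result.
  p <= std_logic_vector(unsigned(x) * unsigned(y));
end architecture RTL;
```

On devices with dedicated multiplier blocks the tool may map "*" to such a block instead of LUTs, in which case utilization and delay figures will not resemble the reports shown here.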
48. Post-routing static timing report

    Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 32 nS ;
    20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
    Maximum delay is 31.984ns.

- Tools generally try to meet the delay goal as closely as possible to minimize area.

49. Tighter timing constraints
- Tighten the requirement to 25 ns.
- Post-place-and-route timing report:

    Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 25 nS ;
    20135312 items analyzed, 11 timing errors detected. (11 setup errors, 0 hold errors)
    Maximum delay is 31.128ns.
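The two post-route results can be compared through slack, the margin by which the worst path meets its requirement. The short calculation below uses the definition the timing tool prints in its per-path reports (slack = requirement - data path delay) and the maximum delays quoted in the two reports. The symbol names are chosen here for illustration only.

```latex
\begin{aligned}
\text{slack} &= T_{\text{requirement}} - T_{\text{data path}} \\
\text{32 ns constraint:}\quad \text{slack} &= 32.000 - 31.984 = +0.016\ \text{ns} \quad (\text{met}) \\
\text{25 ns constraint:}\quad \text{slack} &= 25.000 - 31.128 = -6.128\ \text{ns} \quad (\text{violated})
\end{aligned}
```

A positive slack means the constraint is met with that much margin; a negative slack is the amount by which the path must be shortened.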
50. Report on a violated path

    Slack:            -6.128ns (requirement - data path)
    Source:           y (PAD)
    Destination:      p (PAD)
    Requirement:      25.000ns
    Data Path Delay:  31.128ns (Levels of Logic = 31)

- Modify the logic and/or physical design to improve the delay.

51. Power report

    Power summary:                        I(mA)   P(mW)
    Total estimated power consumption:              333
    Vccint 1.50V:                             0       0
    Vccaux 3.30V:                           100     330
    Vcco33 3.30V:                             1       3
    Inputs:                                   0       0
    Logic:                                    0       0
    Outputs:
      Vcco33                                  0       0
    Signals:                                  0       0
    Quiescent Vccaux 3.30V:                 100     330
    Quiescent Vcco33 3.30V:                   1       3

    Thermal summary:
    Estimated junction temperature:  36C
    Ambient temp:                    25C
    Case temp:                       35C
    Theta J-A:                       34C/W

- Helps us determine whether we need additional cooling.

52. Improving area
- Floorplanner window: the Floorplanner lets you view and edit the placed design.
- Green rectangles: components mapped to CLBs.
[Figure labels: LEs; chip floorplan.]

53. Rat's nest wiring
- If you click on a component in the design hierarchy window, its rat's nest is shown.

54. Routing editor view
- FPGA Editor: view/edit the routed design.

55. Editing constraints
- Use the constraints editor to place constraints.
- This tool allows you to constrain the placement of logic as well as the assignment of chip I/Os to IOBs (e.g. useful for PCB design).

56. Design browser pane

57. Drag and drop constraints

58. Change the shape of constraints

59. Full set of placement constraints
- We place the rows of the multiplier one below the other to create the row structure of the floorplan.

60. Placement results

61. New timing report
- After placement constraints:

    19742142 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
    Maximum delay is 29.934ns.

- Compares to 31 ns for unconstrained placement.

62. Combinational Process: Sensitivity List

    library IEEE;
    use IEEE.Std_Logic_1164.all;

    entity IF_EXAMPLE is
      port (A, B, C, X : in  std_ulogic_vector(3 downto 0);
            Z          : out std_ulogic_vector(3 downto 0));
    end IF_EXAMPLE;

    architecture A of IF_EXAMPLE is
    begin
      -- All signals read in the process appear in the sensitivity list.
      process (A, B, C, X)
      begin
        if (X = "1110") then
          Z <= A;
        elsif (X = "0101") then
          Z <= B;
        else
          Z <= C;
        end if;
      end process;
    end A;

63. Combinational Process: Sensitivity List

    process (A, B, SEL)
    begin
      if SEL = '1' then
        Z <= A;
      else
        Z <= B;
      end if;
    end process;

- If SEL is missing from the sensitivity list, what will the simulation behavior be?
- The sensitivity list is usually ignored during synthesis.
- For equivalent behavior of the simulation model and the hardware, all signals that are read must be entered into the sensitivity list.
- Use a complete if-statement for the synthesis of combinational logic.

64. Combinational Process: Incomplete Assignments

    library IEEE;
    use IEEE.Std_Logic_1164.all;

    entity INCOMP_IF is
      port (A, B, SEL : in  std_ulogic;
            Z         : out std_ulogic);
    end INCOMP_IF;

    architecture RTL of INCOMP_IF is
    begin
      process (A, B, SEL)
      begin
        -- No else branch: Z must keep its old value when SEL = '0'.
        if SEL = '1' then
          Z <= A;
        end if;
      end process;
    end RTL;

- What is the value of Z if SEL = '0'? What hardware would be generated during synthesis?
- Result: a latch that is transparent while SEL = '1' (transparent latch), not a flip-flop.

65. Modeling of Flip-Flops

    library IEEE;
    use IEEE.Std_Logic_1164.all;

    entity FLOP is
      port (D, CLK : in  std_ulogic;
            Q      : out std_ulogic);
    end FLOP;

    architecture A of FLOP is
    begin
      process
      begin
        -- Suspend until the rising edge of CLK, then sample D.
        wait until CLK'event and CLK = '1';
        Q <= D;
      end process;
    end A;

66. Description of Rising Clock Edge for Synthesis
- Standard for synthesis: IEEE 1076.6.
- The edge can be described either as an if condition or as a wait until condition.
- As an if condition:

    RISING_EDGE(clock_signal_name)    (not always supported)
    clock_signal_name'EVENT and clock_signal_name = '1'
    clock_signal_name = '1' and clock_signal_name'EVENT
    not clock_signal_name'STABLE and clock_signal_name = '1'
    clock_signal_name = '1' and not clock_signal_name'STABLE

67. Description of Rising Clock Edge for Synthesis
- As a wait until condition:

    RISING_EDGE(clock_signal_name)
    clock_signal_name'EVENT and clock_signal_name = '1'
    clock_signal_name = '1' and clock_signal_name'EVENT
    not clock_signal_name'STABLE and clock_signal_name = '1'
    clock_signal_name = '1' and not clock_signal_name'STABLE
    clock_signal_name = '1'
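The if-condition form of the rising-edge description can be sketched as a complete flip-flop, as a counterpart to the wait until form used in the FLOP example. The entity name FLOP_IF is a hypothetical name chosen for this illustration. As the slide notes, RISING_EDGE is not supported by every tool, in which case CLK'event and CLK = '1' can be substituted in the condition.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity name; ports mirror the FLOP example.
entity FLOP_IF is
  port (
    D, CLK : in  std_ulogic;
    Q      : out std_ulogic
  );
end entity FLOP_IF;

architecture RTL of FLOP_IF is
begin
  -- Only CLK is in the sensitivity list: Q may change on a clock
  -- event and nowhere else.
  process (CLK)
  begin
    if rising_edge(CLK) then
      Q <= D;  -- sample D on the rising edge
    end if;
  end process;
end architecture RTL;
```

Unlike the incomplete if-statement in a combinational process, the missing else branch here is intentional: holding Q between clock edges is exactly the storage behavior of a flip-flop.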