
A Tool for Partitioning and Pipelined Scheduling of Hardware-Software Systems
Karam S. Chatha and Ranga Vemuri
Department of ECECS, University of Cincinnati
{kchatha, ranga}@ececs.uc.edu

Organization of Talk
- Introduction
- Overview of Tool
- Codesign Partitioner
- Pipelined Scheduler
- Results
- Conclusion

Introduction
Motivation: The throughput of a loop-oriented HW-SW application can be maximized by obtaining a pipelined implementation.
Objective: To obtain a pipelined implementation of the application on the codesign architecture such that:
- the throughput constraint is satisfied
- the HW area constraint is satisfied
- the number of pipeline stages is minimized
- the increase in memory requirement is minimized

Introduction: Architecture and Task Graph
[Figure: codesign architecture and an example task graph. Each task is annotated with its S (SW) and H (HW) execution times and its HW resources.]
- S = 225 ns, H = 175 ns (8 +, )
- S = 400 ns, H = 150 ns (4 *, 8 +, )
- S = 100 ns, H = 400 ns (3 *, 3 +, )
- S = 200 ns, H = 100 ns (4 *, 8 -, )
10 data items per dependence.

Introduction: Pipelined Design
Assumptions:
- SW-SW communication time is taken into account by the SW runtime of the task, hence it is not shown.
- The HW co-processor cannot execute tasks in parallel.

Some Definitions
A pipelined design is characterized by its initiation interval. The initiation interval (II) is the time difference between the start of two consecutive iterations of the steady state.
Given a partitioned task graph, there exists a theoretical lower bound on the II of its pipelined schedule, called the Minimum Initiation Interval (MII). For a directed acyclic task graph the MII is given by:
MII = max(Sum_hw, Sum_sw)
where Sum_hw is the sum of execution times of tasks bound to HW and Sum_sw is the sum of execution times of tasks bound to SW.

HW-SW Codesign
[Flowchart; only the box labels survive in the extracted text.]
- Inputs: Task Graph, Architecture, Throughput and Area Constraints
- Partition Design: satisfy throughput and area constraints.
- Decision: Constraint satisfied?
- Calculate MII; set II = MII.
- Obtain a pipelined schedule which executes in II time: satisfy throughput constraints, minimize the number of pipeline stages and minimize the increase in memory requirements.
- Decision: Schedule found?
- Increase II.
- Decision: II constraint?
- Outcomes: Output Successful Design, or Unable to Design with Given Constraints

HW-SW Partitioner
- Branch and bound algorithm.
- The initial solution tries to minimize the MII:
  - The suitability of a task to be assigned to HW is given by: [formula not preserved in the extracted text]
  - Sort tasks in descending order of their suitabilities.
  - Assign tasks to HW and SW alternately from the front and back of the sorted list so that Sum_hw and Sum_sw remain balanced.
- We also apply heuristics to effectively limit the search space of the algorithm.

HW-SW Partitioner: Area Estimation
Resources required by tasks are divided into two types:
1. Shared: adders, subtractors, multipliers, dividers
2. Unshared: interconnect and controller
The shared resource area is estimated by taking the union of the shared resources required by all the HW tasks.
The unshared resource area is estimated by adding the area associated with the unshared resources of all the HW tasks.
The total area is estimated by taking the sum of the area requirements of shared and unshared resources.

Pipelined Scheduling
In the following explanation we call the task graph before the retiming transformation the original loop, and after the transformation the steady state.
In order to apply the retiming transformation we associate
- an iteration index "l" with every task, and
- a dependence distance "d" with every dependency.
The iteration index l(u) of a task u implies that at iteration I of the steady state, the instance of task u belonging to iteration (I + l(u)) of the original loop is executed.
The dependence distance d(uv) of a dependency uv implies that data produced by task u is consumed by task v d(uv) iterations later.

Some Definitions
[Slide content not preserved in the extracted text.]

RECOD Step 1: Select a dependency to retime
1. The dependency is an intra-loop dependency (ILD).
2. The dependency is between tasks bound to heterogeneous processors.
3. The dependency's predecessor task belongs to the longer constraining path.
4. The dependency represents the least number of data items transferred.
[Figure: example task graph with tasks A, B, C, D, E, H, F, G, I, listed with bindings HW, HW, SW, HW, SW, SW, SW, SW, SW. Edges are labelled with dependence distances d = 0 or d = 1, and two edges carry the data-volume labels Var = 20 and Var = 10.]

RECOD Step 2: Partition to minimize increase in memory requirements
[Figure: tasks A, B, C, D, E, H, F, G, I divided into Set R, Set P and Set S, with the cutset marked.]
Cost function for the partitioner: [formula not preserved in the extracted text]

Retiming Transformation
[Slide content not preserved in the extracted text.]

JPEG Case Study
We specified the JPEG image compression algorithm as a task graph with 12 tasks. We then obtained pipelined codesign implementations by specifying different constraints on the II and HW area.

Execution Time
We evaluated the runtime of the tool by invoking it for 50 random task graphs and searching for optimal HW-SW partitions.

Percentage deviation of initial solution from final
We calculated the percentage deviation in initiation interval of the initial partition from the final partition. The average percentage deviation was 8.4%.

Percentage deviation of final result from optimal
We compared the II obtained by the tool with the minimum MII that was obtained during design space exploration. The minimum MII is a lower bound on the global optimum for a particular task graph. The solution obtained by our tool was on average within 2.2% of the global optimum.

Conclusion
- The tool can optimize the throughput, area, pipeline stages and memory requirements of a pipelined HW-SW system.
- The tool can obtain solutions for task graphs with up to 30 nodes within a short period of time.
- Although it assumes a single SW processor and a single HW coprocessor, the technique can be extended to multiple-processor architectures.
- The limitation of the tool is its inability to handle large task graphs (> 30 nodes) in a reasonable amount of time.
- A time-out option with the branch and bound partitioner can overcome this limitation.
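As an illustration of the MII definition and the initial-solution heuristic on the HW-SW Partitioner slide, here is a minimal Python sketch. It uses the four S/H time pairs from the task-graph slide; the task names T1 to T4 are hypothetical labels. The suitability formula is not preserved in the slide text, so the sketch assumes suitability = S - H (the time saved by moving a task to HW). The rule "take from the front for HW while Sum_hw is not larger than Sum_sw, otherwise from the back for SW" is one possible reading of the balancing step, not necessarily the tool's exact procedure.

```python
from collections import deque

# (SW time, HW time) in ns from the task-graph slide.
# The names T1..T4 are hypothetical labels, not from the talk.
TASKS = {"T1": (225, 175), "T2": (400, 150), "T3": (100, 400), "T4": (200, 100)}


def mii(binding, tasks=TASKS):
    """MII = max(Sum_hw, Sum_sw) for a given task -> "HW"/"SW" binding."""
    sum_hw = sum(tasks[t][1] for t, side in binding.items() if side == "HW")
    sum_sw = sum(tasks[t][0] for t, side in binding.items() if side == "SW")
    return max(sum_hw, sum_sw)


def initial_partition(tasks=TASKS):
    """Balanced initial binding built from a suitability-sorted list."""
    # ASSUMPTION: suitability = S - H; the slide's formula is not recoverable.
    order = deque(sorted(tasks, key=lambda t: tasks[t][0] - tasks[t][1],
                         reverse=True))
    binding, sum_hw, sum_sw = {}, 0, 0
    while order:
        if sum_hw <= sum_sw:
            task = order.popleft()        # most HW-suitable remaining task
            binding[task] = "HW"
            sum_hw += tasks[task][1]
        else:
            task = order.pop()            # least HW-suitable remaining task
            binding[task] = "SW"
            sum_sw += tasks[task][0]
    return binding


binding = initial_partition()
print(binding, mii(binding))
```

With these numbers the sketch binds T2 and T4 to HW (Sum_hw = 250 ns) and T1 and T3 to SW (Sum_sw = 325 ns), giving an MII of 325 ns. In the talk, a branch and bound search then refines such an initial solution.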
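The area-estimation rule on the Area Estimation slide can be sketched in the same way. The sketch reads "union of the shared resources" as a multiset union (per resource type, the maximum count needed by any HW task), which is an assumption about the slide's intent. The unit areas and the unshared areas in the example are made-up numbers; only the resource counts (4 *, 8 +) and (4 *, 8 -) come from the task-graph slide.

```python
from collections import Counter


def estimate_area(hw_tasks, unit_area):
    """Total area = area of the shared-resource union + sum of unshared areas."""
    shared = Counter()
    unshared_area = 0
    for task in hw_tasks:
        # Counter's | operator keeps the per-type maximum (multiset union).
        shared |= Counter(task["shared"])
        unshared_area += task["unshared_area"]
    shared_area = sum(unit_area[kind] * count for kind, count in shared.items())
    return shared_area + unshared_area


# Hypothetical unit areas and unshared (interconnect + controller) areas.
UNIT_AREA = {"+": 10, "-": 10, "*": 40}
HW_TASKS = [
    {"shared": {"*": 4, "+": 8}, "unshared_area": 50},
    {"shared": {"*": 4, "-": 8}, "unshared_area": 30},
]

print(estimate_area(HW_TASKS, UNIT_AREA))
```

Here the two tasks share the four multipliers, so the shared area is 4*40 + 8*10 + 8*10 = 320 and the total is 320 + 80 = 400 area units, rather than the 560 a per-task sum would give.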
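The definitions of iteration index and dependence distance on the Pipelined Scheduling slide imply a simple bookkeeping rule, sketched below. The rule is derived here from those two definitions and is not stated on the slides. If steady-state iteration I runs u from original iteration I + l(u) and v from original iteration I + l(v), then the data v consumes was produced d(uv) + l(u) - l(v) steady-state iterations earlier. A retiming is assumed to be legal only if every such distance is non-negative. The task names and the helper functions are hypothetical.

```python
def retimed_distances(deps, index):
    """Dependence distances in the steady state.

    deps:  {(u, v): d(uv)} for the original loop
    index: {task: l(task)} iteration indices chosen by the retiming
    """
    return {(u, v): d + index[u] - index[v] for (u, v), d in deps.items()}


def is_legal(deps, index):
    """ASSUMPTION: legal means no steady-state distance is negative."""
    return all(d >= 0 for d in retimed_distances(deps, index).values())


# Two intra-loop dependencies (d = 0) in a chain A -> B -> C.
DEPS = {("A", "B"): 0, ("B", "C"): 0}

# Running A one iteration ahead turns A -> B into a loop-carried dependency.
print(retimed_distances(DEPS, {"A": 1, "B": 0, "C": 0}))
print(is_legal(DEPS, {"A": 0, "B": 1, "C": 0}))
```

In the first call the distance of A -> B becomes 1 while B -> C stays 0, matching the mix of d = 0 and d = 1 labels in the RECOD Step 1 figure. The second call returns False, because running B ahead of its producer A would make the distance of A -> B negative.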
