Algebraic Techniques To Enhance Common Sub-expression .ppt

上传人:sofeeling205 文档编号:378177 上传时间:2018-10-09 格式:PPT 页数:24 大小:943.50KB
下载 相关 举报
Algebraic Techniques To Enhance Common Sub-expression .ppt_第1页
第1页 / 共24页
Algebraic Techniques To Enhance Common Sub-expression .ppt_第2页
第2页 / 共24页
Algebraic Techniques To Enhance Common Sub-expression .ppt_第3页
第3页 / 共24页
Algebraic Techniques To Enhance Common Sub-expression .ppt_第4页
第4页 / 共24页
Algebraic Techniques To Enhance Common Sub-expression .ppt_第5页
第5页 / 共24页
亲,该文档总共24页,到这儿已超出免费预览范围,如果喜欢就下载吧!
资源描述

1、Algebraic Techniques To Enhance Common Sub-expression Extraction for Polynomial System Synthesis,Sivaram Gopalakrishnan Synopsys Inc., Hillsboro, OR 97124Priyank Kalla Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, UT- 84112,Outline,Problem context: Polynomial

2、 datapath synthesis Our Focus: Integrating CSE and Algebraic methods Applications: DSP for audio, video, multimedia. Motivation Previous Work and Limitations Integrated Approach Square-free factorization Common Coefficient Extraction Common Cube Extraction Algebraic Division Results: Area Optimizati

3、on Conclusions & Future Work,The Synthesis Flow,Polynomial representation?,Quadratic filter design for polynomial signal processingy = a0 . x12 + a1 . x1 + b0 . x02 + b1 . x0 + c . x0 . x1,Motivation,P1 = x2 + 6xy + 9y2 P2 = 4xy2 + 12y3 P3 = 2zx2 + 6xyz P1 = x(x+ 6y) + 9y2 P2 = 4xy2 + 12y3 P3 = x(2z

4、x + 6yz)P1 = x(x+ 6y) + 9y2 P2 = y2(4x+ 12y) P3 = xz(2x + 6y),Direct Implementation 17 Mults & 4 Adds,Horner form 15 Mults & 4 Adds,Factorization + CSE 12 Mults & 4 Adds,Motivation,d1 = x + 3y P1 = d12 P2 = 4d1y2 P3 = 2xzd1 d1 is a good building blockHow to identify such building blocks across multi

5、ple polynomial datapaths?Need an methodology to expose many common expressions!,Our Approach 8 Mults & 1 Add,Conventional Methods,Extracting control-dataflow graphs (CDFGs) from RTL Scheduling Resource sharing Retiming Control synthesis Algebraic Transforms for arithmetic designs Factorization Hosan

6、gadi et al, ICCAD 04 Common Sub-expression Elimination Hosangadi et al, VLSI 05 Term-rewriting Arvind et al, IEEE. Micro 98 Tree-Height Reduction De Micheli 94 Lack of symbolic computer algebra manipulation,Conventional Methods,Kernel/Co-kernel Extraction (Factorization + CSE) Integrates CSE with cu

7、be/coefficient extraction Uses coefficients and variables to identify cubes (co-kernels)to obtain kernels Subsequently uses CSE for further optimizationP = 5x2 + 10y3 + 15pq; Uses 5, 10, 15, x, y, p, q for kernel/co-kernel extraction Does not perform algebraic division Cannot determine decomposition

8、 5(x2 + 2y3 + 3pq)P = x2 + 2xy + y2; - (x+y)2 Cannot determine the above decomposition,Symbolic algebra techniques,Polynomial models for complex computational blocksGuiding Synthesis engines using Grbners basis Peymandoust and De Micheli, TCAD 02 Given polynomial F and Library elements F = h1 I1 + +

9、 hn In Restricted to library elementsDatapath optimization using word-length informationGopalakrishnan et al, ICCAD 07 Restricted to fixed-size datapaths Cannot address systems of polynomials,Optimization techniques,Canonical Form representation ckYk ck : Coefficient in the range (0 ck bk) Yk : Fall

10、ing factorial F = 3x2y2 - 3x2y - 3xy2 + 3xy = 3x(x-1)y(y-1)f1 = 5x3y2 - 5x3y - 15x2y2 + 15x2y + 10xy2 - 10xy + 3z2 f2 = 3x2y2 - 3x2y - 3xy2 + 3xy + z + 1d1 = x(x-1)y(y-1) f1 = 5d1(x-2) + 3z2 f2 = 3d1 + z + 1,Optimization techniques,Square-free factorizationLet F be an integral domain Z A polynomial

11、u in Fx is square-free if there is no polynomial v in Fx with deg(v, x) 0, such that v2 | u.u1 = x2 + 3x + 2; u1 = (x+1)(x+2) is square-free u2 = x4 + 7x3 + 18x2 + 20x + 8;u2 = (x+1)(x+2)2 is not square-free!,Optimization techniques,Common Coefficient Extraction P = 8x + 16y + 24z; P1 = 2(4x + 8y +

12、12z); P2 = 4(2x + 4y + 6z); P3 = 8(x + 2y + 3z); best transformationUse GCD computation Get the coefficients (ais) Compute GCD of every pair (ai, aj) Retain GCDs atleast (ai, aj) Arrange GCDs in decreasing order, perform extraction Update GCD list and continue,Optimization techniques,Common Coeffici

13、ent Extraction (Example) P = 8x + 16y + 24z + 15a + 30b; Coefficients 8, 16, 24, 15, 30 GCD list 8, 8, 1, 2, 8, 1, 2, 1, 6, 15 Reduced GCD list 8, 15 - decreasing order 15, 8Extracting 15 results in P = 8x + 16y + 24z + 15(a + 2b);Similarly, extracting 8 results in P = 8(x + 2y + 3z) + 15(a + 2b);,O

14、ptimization techniques,Common Cube Extraction Similar to kernel/co-kernel extraction (for variables) P1 = x2y + xyz; P2 = ab2c3 + b2c2x; P3 = axz + x2z2b; kernel/co-kernel extraction results in P1 = xy(x + z); P2 = b2c2(ac + x); P3 = xz(a + xzb);,Optimization techniques,Polynomial long divisionGiven

15、 two polynomials a(x) and b(x), algebraic division determines q(x) and r(x) such thata(x) = b(x) q(x) + r(x)a(x) = x4 - 2x3 + 5; b(x) = x2 + 3x - 2;a(x) = b(x) (x2 5x + 17) 61x + 39 q(x) r(x),Optimization techniques,Common Sub-Expression EliminationIdentify isomorphic patterns in an arithmetic expre

16、ssion tree and merge them!k = x + y; m = x + y + z; n = xy + x + y;k = x + y; m = k + z; n = xy + k;,Integrated approach,Input: The polynomial system Porig (list of arrays) Perform Canonization, Square-free factorization Get best initial cost: Cinitial Perform Coefficient extraction: Pcce Perform cu

17、be extraction: Pcce_cube, get linear blocks Get the lists representing the system For every linear block, for each list perform algebraic division Pick the best cost,Illustration,Integrated approach (Example),P1 = 13x2 + 26xy + 13y2 + 7x - 7y + 11; P2 = 15x2 - 30xy + 15y2 + 11x + 11y + 9; PorigSquar

18、e-free factorization does not work! Initial cost: 16 M and 10 AAfter common coefficient extraction (Pcce) P1 = 13(x2 + 2xy + y2) + 7(x y) + 11; P2 = 15(x2 - 2xy + y2) + 11(x + y) + 9; Linear blocks: (x y), (x + y),Integrated approach (Example),After common cube extraction (Pcce_cube) P1 = 13(x(x + 2

19、y) + y2) + 7(x y) + 11; P2 = 15(x(x- 2y) + y2) + 11(x + y) + 9; Linear blocks: (x y), (x + y), (x + 2y), (x 2y)Perform algebraic division using the linear blocksPcce is the best cost implementation with (x+y) (x-y)d1 = x + y; d2 = x - y; P1 = 13d12 + 7d2 + 11; P2 = 15d22 + 11d1 + 9; Cost: 6 M and 6

20、A,Results,Average area improvement: 42%,Results,Average area improvement: 42%,Conclusions & Future Work,Polynomial decomposition approach for arithmetic datapaths Arithmetic datapaths modeled as polynomial systems Integrating CSE with algebraic manipulation Performing algebraic decomposition to enhance the power of CSEImpressive area savings But delay penalty!Future Work: Address the concerns in delay!Retarget the approach towards power savings?,Questions?,

展开阅读全文
相关资源
猜你喜欢
相关搜索

当前位置:首页 > 教学课件 > 大学教育

copyright@ 2008-2019 麦多课文库(www.mydoc123.com)网站版权所有
备案/许可证编号:苏ICP备17064731号-1