The Effectiveness of Lloyd-type Methods for the k-means Problem (.ppt, 20 slides)


The Effectiveness of Lloyd-type Methods for the k-means Problem
Chaitanya Swamy, University of Waterloo
Joint work with Rafi Ostrovsky (UCLA), Yuval Rabani (Technion), and Leonard Schulman (Caltech)

The k-means Problem
Given: n points in d-dimensional space; X ⊆ R^d is the point set, with |X| = n.
Partition X into k clusters X_1, ..., X_k, assigning each point in X_i to a common center c_i ∈ R^d.
Goal: minimize Σ_i Σ_{x∈X_i} d(x, c_i)², where d is the L2 distance.

k-means (contd.)
Given the c_i's, the best clustering assigns each point to its nearest center: X_i = {x ∈ X : c_i is the center nearest to x}.
Given the X_i's, the best choice of centers is to set c_i = center of mass of X_i = ctr(X_i) = Σ_{x∈X_i} x / |X_i|.
An optimal solution satisfies both properties.
The problem is NP-hard even for k = 2 (when n and d are not fixed).

Related Work
The k-means problem dates back to Steinhaus (1956).
a) Approximation algorithms: algorithms with provable guarantees.
PTASs with varying runtime dependence on n, d, k (poly/linear in n, but possibly exponential in d and/or k): Matousek (poly(n), exp(d,k)); Kumar, Sabharwal.
Guarantees that also translate to k-means: Charikar, Guha, Tardos & Shmoys; Arya et al. + Kanungo et al.: a (9+ε)-approximation.
b) Heuristics: Lloyd's method, invented in 1957, remains an extremely popular heuristic even today:
1) Start with k initial / "seed" centers c_1, ..., c_k.
2) Iterate the following Lloyd step: assign each point to its nearest center c_i to obtain a clustering X_1, ..., X_k, then update c_i ← ctr(X_i) = Σ_{x∈X_i} x / |X_i|.

Lloyd's method: What's known?
Some bounds on the number of iterations of Lloyd-type methods: Inaba-Katoh-Imai; Har-Peled-Sadri; Arthur-Vassilvitskii ('06).
Performance is very sensitive to the choice of seed centers; there is a lot of literature on finding "good" seeding methods for Lloyd.
But there is almost no analysis proving performance guarantees on the quality of the final solution for arbitrary k and dimension.
Our Goal: to analyze Lloyd and try to prove rigorous performance guarantees for Lloyd-type methods.
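The Lloyd step described above is easy to state in code. Below is a minimal NumPy sketch of the iteration (assign to nearest center, then recompute centroids); the function names, the empty-cluster handling, and the convergence test are illustrative choices, not part of the slides.

```python
import numpy as np

def lloyd_step(X, centers):
    """One Lloyd step: assign each point to its nearest center, then
    move each center to the centroid of the points assigned to it."""
    # Squared L2 distances from every point to every center: shape (n, k).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    new_centers = centers.copy()
    for i in range(len(centers)):
        pts = X[labels == i]
        if len(pts) > 0:              # keep the old center if its cluster emptied
            new_centers[i] = pts.mean(axis=0)
    return new_centers, labels

def lloyd(X, seed_centers, max_iters=100, tol=1e-8):
    """Iterate Lloyd steps from the given seeds until the centers stop moving."""
    centers = np.asarray(seed_centers, dtype=float).copy()
    for _ in range(max_iters):
        new_centers, labels = lloyd_step(X, centers)
        if np.max(np.abs(new_centers - centers)) < tol:
            return new_centers, labels
        centers = new_centers
    return centers, labels
```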

Our Results
We introduce a clusterability, or separation, condition.
We give a novel, efficient sampling process for seeding Lloyd's method with initial centers.
We show that if the data satisfies our clusterability condition:
  - seeding + 1 Lloyd step yields a constant-factor approximation in time linear in n and d and polynomial in k; this is potentially faster than Lloyd variants which require multiple reseedings;
  - seeding + KSS04-sampling gives a PTAS; the algorithm is faster and simpler than the PTAS in KSS04.
Main Theorem: If the data has a "meaningful k-clustering", then there is a simple, efficient seeding method such that Lloyd-type methods return a near-optimal solution.

"Meaningful k-Clustering"
Settings where one would NOT consider the data to possess a meaningful k-clustering:
1) If near-optimum cost can be achieved by two very distinct k-partitions of the data, then the identity of an optimal k-partition carries little meaning: it provides an ambiguous classification.
2) If the cost of the best k-clustering ≈ the cost of the best (k-1)-clustering, then a k-clustering yields only marginal benefit over the best (k-1)-clustering; one should use a smaller value of k here.
(Example slide: k = 3.)
We formalize 2). Let Δ_k²(X) = cost of the best k-clustering of X. X is ε-separated for k-means iff Δ_k²(X) / Δ_{k-1}²(X) ≤ ε².
This is a simple condition: the drop in k-clustering cost is already used by practitioners to choose the right k.
Can show that (roughly) the two views coincide: X is ε-separated for k-means ⇔ any two low-cost k-clusterings disagree on only a small fraction of the data.
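The separation condition can be probed numerically by comparing the best (k-1)- and k-clustering costs. A small sketch follows; since the slides do not say how Δ_k²(X) is computed, the multi-restart scikit-learn estimate used here (which only upper-bounds the true optimum) and the example data are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def clustering_cost(X, k, n_init=10, seed=0):
    """Estimate Δ_k²(X), the cost of the best k-clustering, by running
    multi-restart k-means and taking the best objective found."""
    km = KMeans(n_clusters=k, n_init=n_init, random_state=seed).fit(X)
    return km.inertia_   # sum of squared distances to the nearest center

def is_eps_separated(X, k, eps):
    """Check the slide's condition: Δ_k²(X) / Δ_{k-1}²(X) <= ε²."""
    return clustering_cost(X, k) / clustering_cost(X, k - 1) <= eps ** 2

# Example: three well-separated Gaussian blobs should be ε-separated for k = 3.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.1, size=(100, 2))
               for c in [(0, 0), (5, 0), (0, 5)]])
print(is_eps_separated(X, k=3, eps=0.3))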

The 2-means problem (k = 2)
(Figure: two optimal clusters X*_1, X*_2 with centers c*_1, c*_2, radii r*_1, r*_2, and inter-center distance D*.)
X*_1, X*_2: the optimal clusters; c*_i = ctr(X*_i); D* = d(c*_1, c*_2); n_i = |X*_i|.
(r*_i)² = Σ_{x∈X*_i} d(x, c*_i)² / n_i = Δ_1²(X*_i) / n_i = the average squared distance within cluster X*_i.
Assume X is ε-separated for 2-means.
Lemma: For i = 1, 2, (r*_i)² ≤ ε²/(1-ε²) · D*².
Proof: Δ_2²(X)/ε² ≤ Δ_1²(X) = Δ_2²(X) + (n_1 n_2 / n) · D*².
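The slide's one-line proof compresses a couple of algebraic steps. A worked version in the slide's notation, using only the two facts displayed there plus Δ_2²(X) = n_1(r*_1)² + n_2(r*_2)², is:
\[
n_i (r^*_i)^2 \;\le\; n_1 (r^*_1)^2 + n_2 (r^*_2)^2 \;=\; \Delta_2^2(X)
\;\le\; \varepsilon^2\,\Delta_1^2(X)
\;=\; \varepsilon^2\Bigl(\Delta_2^2(X) + \tfrac{n_1 n_2}{n}\,D^{*2}\Bigr),
\]
so $(1-\varepsilon^2)\,\Delta_2^2(X) \le \varepsilon^2\,\tfrac{n_1 n_2}{n}\,D^{*2}$. Dividing by $n_i$ and using $n_1 n_2 / (n\,n_i) = n_{3-i}/n \le 1$,
\[
(r^*_i)^2 \;\le\; \frac{\Delta_2^2(X)}{n_i}
\;\le\; \frac{\varepsilon^2}{1-\varepsilon^2}\cdot\frac{n_1 n_2}{n\,n_i}\,D^{*2}
\;\le\; \frac{\varepsilon^2}{1-\varepsilon^2}\,D^{*2}.
\]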

The 2-means algorithm
Assume: X is ε-separated for 2-means.
1) Sampling-based seeding procedure: pick two seed centers c_1, c_2 by randomly picking the pair x, y ∈ X with probability proportional to d(x,y)².
2) Lloyd step, or the simpler "ball k-means step": for each c_i, let B_i = {x ∈ X : d(x,c_i) ≤ d(c_1,c_2)/3}; update c_i ← ctr(B_i) and return these as the final centers.
The sampling can be implemented in O(nd) time, so the entire algorithm runs in O(nd) time.
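A compact sketch of the two stages is below. The pair (x, y) with probability proportional to d(x,y)² is drawn in two stages (x proportional to its total squared distance to all points, then y given x proportional to d(x,y)²), which realizes the stated pair distribution; the naive O(n²d) distance computation, rather than the O(nd) implementation mentioned on the slide, and all function names are illustrative.

```python
import numpy as np

def seed_two_means(X, rng):
    """Pick two seed centers c1, c2: the pair (x, y) is chosen with
    probability proportional to d(x, y)^2."""
    # All pairwise squared distances (naive O(n^2 d); fine for a sketch).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Draw x with probability proportional to sum_y d(x, y)^2, then y given x
    # with probability proportional to d(x, y)^2; the product of the two draws
    # is proportional to d(x, y)^2.
    row_mass = d2.sum(axis=1)
    i = rng.choice(len(X), p=row_mass / row_mass.sum())
    j = rng.choice(len(X), p=d2[i] / d2[i].sum())
    return X[i].copy(), X[j].copy()

def ball_kmeans_step(X, c1, c2):
    """Ball k-means step: recentre each seed on the points within
    distance d(c1, c2)/3 of it."""
    radius = np.linalg.norm(c1 - c2) / 3.0
    out = []
    for c in (c1, c2):
        ball = X[np.linalg.norm(X - c, axis=1) <= radius]
        out.append(ball.mean(axis=0) if len(ball) else c)
    return out

# Usage on two well-separated blobs (illustrative data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal((0, 0), 0.1, (200, 2)), rng.normal((4, 0), 0.1, (200, 2))])
c1, c2 = seed_two_means(X, rng)
c1, c2 = ball_kmeans_step(X, c1, c2)
```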

2-means: Analysis
(Figure: the seeds c_1, c_2 landing in the cores of the optimal clusters X*_1, X*_2.)
Let core(X*_i) = {x ∈ X*_i : d(x, c*_i)² ≤ (r*_i)²/ρ}, where ρ = Θ(ε²) ≪ 1.
Seeding lemma: With probability 1-O(ρ), c_1 and c_2 lie in the cores of X*_1 and X*_2.
Proof: |core(X*_i)| ≥ (1-ρ)·n_i for i = 1, 2.
Let A = Σ_{x∈core(X*_1), y∈core(X*_2)} d(x,y)² ≥ (1-ρ)²·n_1 n_2·D*², and
B = Σ_{x,y∈X} d(x,y)² = n·Δ_1²(X) ≈ n_1 n_2·D*² (up to a 1/(1-ε²) factor, since Δ_2²(X) ≤ ε²·Δ_1²(X)).
Then the success probability is ≥ A/B ≈ (1-ρ)² = 1-O(ρ).
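The first line of the proof, |core(X*_i)| ≥ (1-ρ)·n_i, is Markov's inequality applied to the squared distances within X*_i; written out using only the definitions above:
\[
\bigl|\{x \in X^*_i : d(x, c^*_i)^2 > (r^*_i)^2/\rho\}\bigr| \cdot \frac{(r^*_i)^2}{\rho}
\;\le\; \sum_{x \in X^*_i} d(x, c^*_i)^2 \;=\; n_i\,(r^*_i)^2,
\]
so at most $\rho\, n_i$ points of $X^*_i$ fall outside $\mathrm{core}(X^*_i)$, i.e. $|\mathrm{core}(X^*_i)| \ge (1-\rho)\,n_i$.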

2-means analysis (contd.)
(Figure: the balls B_1, B_2 around c_1, c_2 sit inside X*_1, X*_2 and contain the cores.)
Recall that B_i = {x ∈ X : d(x,c_i) ≤ d(c_1,c_2)/3}.
Ball-k-means lemma: For i = 1, 2, core(X*_i) ⊆ B_i ⊆ X*_i. Therefore d(ctr(B_i), c*_i)² ≤ ρ·(r*_i)² / (1-ρ).
Intuitively, since B_i ⊆ X*_i and B_i contains almost all of the mass of X*_i, ctr(B_i) must be close to ctr(X*_i) = c*_i.

2-means analysis (contd.)
Theorem: With probability 1-O(ρ), the cost of the final clustering is at most Δ_2²(X)/(1-ρ), i.e., we get a 1/(1-ρ)-approximation algorithm.
Since ρ = O(ε²), the approximation ratio → 1 as ε → 0, and the probability of success → 1 as ε → 0.

Arbitrary k
The algorithm and analysis follow the same outline as in the 2-means case.
If X is ε-separated for k-means, one can again show that all clusters are well separated, that is, cluster radius ≪ inter-cluster distance: r*_i = O(ε)·d(c*_i, c*_j) for all i, j.
1) Seeding stage: we choose k initial centers and ensure that they lie in the "cores" of the k optimal clusters; this exploits the fact that the clusters are well separated. After the seeding stage, each optimal center has a distinct seed center very "near" it.
2) Now we can run either a Lloyd step or a ball-k-means step.
Theorem: If X is ε-separated for k-means, then one can obtain an α(ε)-approximation algorithm where α(ε) → 1 as ε → 0.

Schematic of entire algorithm
Seeding (three options):
  - Simple sampling: pick k centers as follows: first pick 2 centers c_1, c_2 as in 2-means; to pick center c_{i+1}, pick x ∈ X with probability proportional to min_{j≤i} d(x, c_j)². Success probability: exp(-k).
  - Oversampling + deletion: sample O(k) centers by the same process, then greedily delete until k remain. Constant (Ω(1)) success probability; O(nkd + k³d) time.
  - Greedy deletion: start with n centers and keep deleting the center that causes the least cost increase until k centers remain. O(n³d) time.
Any of these yields k well-placed seeds. Then a ball k-means or Lloyd step gives an O(1)-approximation, and KSS04-sampling gives a PTAS. (A sketch of the sampling-based seeding follows below.)
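A minimal sketch of the sampling-based seeding referenced above, covering both simple sampling and the oversample-then-delete variant; the concrete deletion criterion (drop the center whose removal increases the assignment cost least), the oversampling factor, and the function names are illustrative assumptions, and no attempt is made to match the stated running times.

```python
import numpy as np

def sample_seeds(X, m, rng):
    """Draw m seed indices from X: the first pair (c1, c2) with probability
    proportional to d(x, y)^2, each later center with probability proportional
    to its squared distance to the nearest center chosen so far."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # all pairwise squared distances
    row_mass = d2.sum(axis=1)
    i = rng.choice(len(X), p=row_mass / row_mass.sum())      # two draws whose product
    j = rng.choice(len(X), p=d2[i] / d2[i].sum())            # is proportional to d(x, y)^2
    chosen = [i, j]
    while len(chosen) < m:
        mind2 = d2[:, chosen].min(axis=1)                    # min_j d(x, c_j)^2
        chosen.append(rng.choice(len(X), p=mind2 / mind2.sum()))
    return chosen

def cost(X, centers):
    """k-means assignment cost of a given center set."""
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).min(axis=1).sum()

def oversample_and_delete(X, k, rng, oversample=3):
    """Oversampling + deletion: sample O(k) seeds, then greedily drop the seed
    whose removal increases the assignment cost least, until k remain."""
    idx = sample_seeds(X, oversample * k, rng)
    while len(idx) > k:
        best = min(range(len(idx)),
                   key=lambda t: cost(X, X[idx[:t] + idx[t + 1:]]))
        idx.pop(best)
    return X[idx]

# Example usage (hypothetical data).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
seeds = oversample_and_delete(X, k=3, rng=rng)
```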

Open Questions
Deeper analysis of Lloyd: are there weaker conditions under which one can prove performance guarantees for Lloyd-type methods?
Is there a PTAS for k-means with polynomial-time dependence on n, k and d, or is the problem APX-hard in the geometric setting?
Is there a PTAS for k-means under our separation condition?
Are there other applications of the separation condition?

Thank You.
