A Review of Information Filtering, Part II: Collaborative Filtering
ChengXiang Zhai, Language Technologies Institute, School of Computer Science, Carnegie Mellon University

Outline
- A Conceptual Framework for Collaborative Filtering (CF)
- Rating-based Methods (Breese et al. 98)
  - Memory-based methods
  - Model-based methods
- Preference-based Methods (Cohen et al. 99 & Freund et al. 98)
- Summary & Research Directions

What is Collaborative Filtering (CF)?
- Making filtering decisions for an individual user based on the judgments of other users
- Inferring an individual's interests/preferences from those of other, similar users
- General idea
  - Given a user u, find similar users u1, ..., um
  - Predict u's preferences based on the preferences of u1, ..., um

CF: Applications
- Recommender systems: books, CDs, videos, movies, potentially anything!
- Can be combined with content-based filtering
- Example (commercial) systems
  - GroupLens (Resnick et al. 94): Usenet news rating
  - Amazon: book recommendation
  - Firefly (purchased by Microsoft?): music recommendation
  - Alexa: web page recommendation

CF: Assumptions
- Users with a common interest will have similar preferences
- Users with similar preferences probably share the same interest
- Examples
  - "interest is IR" => "read SIGIR papers"
  - "read SIGIR papers" => "interest is IR"
- A sufficiently large number of user preferences is available

CF: Intuitions
- User similarity
  - If Jamie liked the paper, I'll like the paper?
  - If Jamie liked the movie, I'll like the movie? (Suppose Jamie and I viewed similar movies in the past six months)
- Item similarity
  - Since 90% of those who liked Star Wars also liked Independence Day, and you liked Star Wars, you may also like Independence Day

Collaborative Filtering vs. Content-based Filtering
- Basic filtering question: will user U like item X?
- Two different ways of answering it
  - Look at what U likes => characterize X => content-based filtering
  - Look at who likes X => characterize U => collaborative filtering
- The two can be combined

Rating-based vs. Preference-based
- Rating-based: the user's preferences are encoded as numerical ratings on items
  - Complete ordering
  - Absolute values can be meaningful, but values must be normalized to combine
- Preference-based: the user's preferences are represented as a partial ordering of items
  - Partial ordering
  - Easier to exploit implicit preferences

A Formal Framework for Rating
- Users U = {u1, u2, ..., ui, ..., um}; objects O = {o1, o2, ..., oj, ..., on}
- [Figure: a partially filled user-by-object rating matrix with entries Xij = f(ui, oj); most entries are unknown ("?")]

The Task
- Unknown function f: U x O -> R
- Assume f values are known for some (u, o) pairs; predict f values for the other (u, o) pairs
- Essentially function approximation, like other learning problems
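To make the function-approximation view concrete, here is a minimal sketch (assuming Python with numpy; the matrix values are illustrative, not from the talk) of a partially observed rating matrix, where NaN marks the unknown f(u, o) values the filter must predict.

```python
import numpy as np

# Rows = users u1..u4, columns = objects o1..o5; np.nan marks unknown ratings.
X = np.array([
    [3.0,    np.nan, 1.5,    np.nan, 2.0],
    [np.nan, 2.0,    np.nan, 1.0,    3.0],
    [3.0,    np.nan, 2.0,    np.nan, np.nan],
    [np.nan, 1.0,    np.nan, 3.0,    np.nan],
])

known = ~np.isnan(X)  # the (u, o) pairs where f(u, o) is observed
print("observed entries:", int(known.sum()), "of", X.size)
# The CF task: approximate f on the remaining (u, o) pairs.
```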

Where Are the Intuitions?
- Similar users have similar preferences: if u ≈ u', then for all o, f(u, o) ≈ f(u', o)
- Similar objects have similar user preferences: if o ≈ o', then for all u, f(u, o) ≈ f(u, o')
- In general, f is "locally constant": if u ≈ u' and o ≈ o', then f(u, o) ≈ f(u', o')
- This "local smoothness" makes it possible to predict unknown values by interpolation or extrapolation
- What does "local" mean?

Two Groups of Approaches
- Memory-based approaches
  - f(u, o) = g(u)(o) ≈ g(u')(o) if u ≈ u'
  - Find "neighbors" of u and combine their g(u')(o) values
- Model-based approaches
  - Assume a structure/model: object clusters, user clusters, f defined on the clusters: f(u, o) = f(cu, co)
  - Estimation & probabilistic inference

Memory-based Approaches (Breese et al. 98)
- General ideas
  - Xij: rating of object j by user i
  - ni: average rating of all objects by user i
  - Normalized ratings: Vij = Xij - ni
  - Memory-based prediction: predicted Xaj = na + κ · Σi w(a, i) · Vij (κ normalizes, e.g., κ = 1 / Σi |w(a, i)|)
- Specific approaches differ in w(a, i), the distance/similarity between users a and i

User Similarity Measures
- Pearson correlation coefficient (sum over commonly rated items j):
  w(a, i) = Σj (Xaj - na)(Xij - ni) / sqrt( Σj (Xaj - na)^2 · Σj (Xij - ni)^2 )
- Cosine measure:
  w(a, i) = Σj Xaj · Xij / ( sqrt(Σj Xaj^2) · sqrt(Σj Xij^2) )
- Many other possibilities!

Improving User Similarity Measures (Breese et al. 98)
- Dealing with missing values: default ratings
- Inverse User Frequency (IUF): similar to IDF
- Case amplification: use w(a, i)^p, e.g., p = 2.5
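The following is a minimal sketch of the memory-based approach (assuming numpy; function and variable names are my own, and the data is illustrative). It computes the Pearson correlation over commonly rated items and applies the prediction formula above; case amplification is included as an optional exponent p.

```python
import numpy as np

def pearson_w(X, a, i):
    """Pearson correlation between users a and i over their commonly rated items."""
    common = ~np.isnan(X[a]) & ~np.isnan(X[i])
    if common.sum() < 2:
        return 0.0
    va = X[a, common] - np.nanmean(X[a])   # deviations from each user's mean rating
    vi = X[i, common] - np.nanmean(X[i])
    denom = np.sqrt((va ** 2).sum() * (vi ** 2).sum())
    return float(va @ vi / denom) if denom > 0 else 0.0

def predict(X, a, j, p=1.0):
    """Memory-based prediction: X_hat[a, j] = n_a + k * sum_i w(a, i) * V[i, j]."""
    n_a = np.nanmean(X[a])
    num, norm = 0.0, 0.0
    for i in range(X.shape[0]):
        if i == a or np.isnan(X[i, j]):
            continue
        w = pearson_w(X, a, i)
        w = np.sign(w) * abs(w) ** p              # case amplification (p = 1: no change)
        num += w * (X[i, j] - np.nanmean(X[i]))   # weighted normalized rating V_ij
        norm += abs(w)
    return n_a if norm == 0 else n_a + num / norm

X = np.array([[3.0, np.nan, 1.5, np.nan, 2.0],
              [2.5, 2.0,    1.0, 1.0,    3.0],
              [3.0, 3.5,    2.0, np.nan, 2.5]])
print(predict(X, a=0, j=1, p=2.5))
```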

Model-based Approaches (Breese et al. 98)
- General ideas
  - Assume that the data/ratings are explained by a probabilistic model with parameter θ
  - Estimate/learn the model parameter θ from the data
  - Predict an unknown rating using E[xk+1 | x1, ..., xk], computed with the estimated model
- Specific methods differ in the model used and in how the model is estimated

Probabilistic Clustering
- Cluster users based on their ratings
- Assume ratings are observations from a multinomial mixture model with parameters p(C) and p(xi | C)
- Model estimated using standard EM
- Predict ratings using E[xk+1 | x1, ..., xk]
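Below is a minimal sketch of this probabilistic-clustering idea (assuming numpy, integer ratings 1..V with 0 marking a missing entry; the names and smoothing constants are my own). EM alternates a soft assignment of users to clusters with re-estimation of p(C) and p(xi = v | C), and prediction takes the expectation of the unseen rating under the user's cluster posterior.

```python
import numpy as np

def em_mixture(X, K=2, V=5, iters=50, seed=0):
    """X: users x items, integer ratings 1..V, 0 = missing. Returns p(C), p(x_i = v | C)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = X.shape
    pC = np.full(K, 1.0 / K)
    pXC = rng.dirichlet(np.ones(V), size=(K, n_items))      # shape (K, n_items, V)
    for _ in range(iters):
        # E-step: posterior over clusters for each user, using only observed ratings.
        log_post = np.tile(np.log(pC), (n_users, 1))
        for u in range(n_users):
            for i in np.nonzero(X[u])[0]:
                log_post[u] += np.log(pXC[:, i, X[u, i] - 1] + 1e-12)
        post = np.exp(log_post - log_post.max(axis=1, keepdims=True))
        post /= post.sum(axis=1, keepdims=True)
        # M-step: re-estimate p(C) and p(x_i = v | C) from the soft counts.
        pC = post.mean(axis=0)
        pXC = np.full((K, n_items, V), 1e-2)                 # small smoothing count
        for u in range(n_users):
            for i in np.nonzero(X[u])[0]:
                pXC[:, i, X[u, i] - 1] += post[u]
        pXC /= pXC.sum(axis=2, keepdims=True)
    return pC, pXC

def expected_rating(x_obs, item, pC, pXC, V=5):
    """E[x_item | observed ratings] = sum_C p(C | obs) * sum_v v * p(x_item = v | C)."""
    log_post = np.log(pC).copy()
    for i in np.nonzero(x_obs)[0]:
        log_post += np.log(pXC[:, i, x_obs[i] - 1] + 1e-12)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return float(post @ (pXC[:, item, :] @ np.arange(1, V + 1)))
```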

Bayesian Network
- Use a Bayesian network (BN) to capture object/item dependencies
- Each item/object is a node
- The (dependency) structure is learned from all the data
- Model parameters: p(xk+1 | pa(xk+1)), where pa(xk+1) are the parents/predictors of xk+1 (represented as a decision tree)
- Predict ratings using E[xk+1 | x1, ..., xk]

Three-way Aspect Model (Popescul et al. 2001)
- CF + content-based
- Generative model: (u, d, w) as observations, z as a hidden variable
- Standard EM; essentially clustering the joint data
- Evaluation on ResearchIndex data; found it is better to treat (u, w) as the observations

Evaluation Criteria (Breese et al. 98)
- Rating accuracy: average absolute deviation over Pa, the set of items predicted for user a
- Ranking accuracy: expected utility with an exponentially decaying viewing probability (halflife α = the rank where the viewing probability is 0.5); d = neutral rating

Datasets
- [Table of the evaluation datasets; not preserved in this text version]

Results
- BN & CR+ are generally better than VSIM & BC
- BN is best with more training data
- VSIM is better with little training data
- Inverse User Frequency is effective
- Case amplification is mostly effective
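Here is a minimal sketch of the two evaluation criteria (the function names and the omission of the per-user utility normalization are my own simplifications of Breese et al. 98): the average absolute deviation of predicted ratings, and the halflife expected utility, which discounts each item's above-neutral rating by an exponentially decaying viewing probability.

```python
def avg_abs_deviation(pred, true, items):
    """Mean |predicted - actual| rating over Pa, the set of items predicted."""
    return sum(abs(pred[j] - true[j]) for j in items) / len(items)

def expected_utility(ranked_items, true, d=3.0, halflife=5):
    """R_a = sum over ranks of max(v_aj - d, 0) / 2^((rank - 1) / (halflife - 1)).

    ranked_items: items in the order the system would present them (best first);
    d is the neutral rating; halflife is the rank with viewing probability 0.5.
    """
    return sum(max(true.get(j, d) - d, 0.0) / 2 ** ((rank - 1) / (halflife - 1))
               for rank, j in enumerate(ranked_items, start=1))

pred = {"o1": 4.2, "o2": 2.8}
true = {"o1": 5.0, "o2": 2.0}
print(avg_abs_deviation(pred, true, ["o1", "o2"]))  # 0.8
print(expected_utility(["o1", "o2"], true, d=3.0))  # only o1 contributes
```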

Summary of Rating-based Methods
- Effectiveness
  - Both memory-based and model-based methods can be effective
  - The correlation method appears to be robust
  - The Bayesian network works well with plenty of training data, but not very well with little training data
  - The cosine similarity method works well with little training data

Summary of Rating-based Methods (cont.)
- Efficiency
  - Memory-based methods are slower than model-based methods at prediction time
  - Learning can be extremely slow for model-based methods

Preference-based Methods (Cohen et al. 99, Freund et al. 98)
- Motivation
  - Explicit ratings are not always available, but implicit orderings/preferences might be
  - Only relative ratings are meaningful, even when ratings are available
  - Combining preferences has other applications, e.g., merging results from different search engines

A Formal Model of Preferences
- Instances: O = {o1, ..., on}
- Ranking function R: (U x) O x O -> [0, 1]
  - R(u, v) = 1 means u is strongly preferred to v
  - R(u, v) = 0 means v is strongly preferred to u
  - R(u, v) = 0.5 means no preference
- Feedback: F = {(u, v)}, where each pair means u is preferred to v
- Minimize the loss: the fraction of feedback pairs that the learned ranking disagrees with
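A minimal sketch of this formulation (all names hypothetical): a preference function maps an ordered pair of items to [0, 1], and the loss measures how strongly it disagrees with the feedback pairs.

```python
def loss(R, feedback):
    """Average disagreement of preference function R with feedback pairs (u, v),
    where (u, v) in feedback means u should be preferred to v."""
    return sum(1.0 - R(u, v) for u, v in feedback) / len(feedback)

# Example: a preference function derived from one expert's ranked list.
ranking = ["o1", "o3", "o2"]
def R(u, v, order=ranking):
    if u in order and v in order:
        return 1.0 if order.index(u) < order.index(v) else 0.0
    return 0.5  # no opinion on items the expert did not rank

feedback = [("o1", "o2"), ("o2", "o3")]
print(loss(R, feedback))  # 0.5: agrees with the first pair, disagrees with the second
```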

The Hypothesis Space H
- Without constraints on H, the loss is minimized by any R that agrees with F
- Appropriate constraints for collaborative filtering
- Compare this with ...

The Hedge Algorithm for Combining Preferences
- Iterative updating of the expert weights w1, w2, ..., wn
- Initialization: the wi are uniform
- Updating: wi <- wi · β^(loss of expert i), then renormalize, with β in [0, 1]
  - Loss = 0 => the weight stays the same
  - Large loss => the weight is decreased
- The combined preference is the weighted sum of the experts' preference functions
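Below is a minimal sketch of Hedge applied to combining preference experts (assuming the loss() and list-based preference functions sketched above; β and the rounds structure are illustrative). Experts that disagree with the observed feedback are down-weighted multiplicatively, and the combined preference is the weighted sum of the experts.

```python
def loss(R, feedback):
    """Same disagreement loss as above: mean (1 - R(u, v)) over feedback pairs."""
    return sum(1.0 - R(u, v) for u, v in feedback) / len(feedback)

def hedge(experts, feedback_rounds, beta=0.5):
    """experts: list of preference functions R_i(u, v) -> [0, 1];
    feedback_rounds: one feedback list (of (preferred, other) pairs) per round."""
    w = [1.0 / len(experts)] * len(experts)
    for feedback in feedback_rounds:
        # Multiplicative update: zero loss keeps the weight, large loss shrinks it.
        w = [wi * beta ** loss(Ri, feedback) for wi, Ri in zip(w, experts)]
        total = sum(w)
        w = [wi / total for wi in w]
    return w

def combined(experts, w):
    """Combined preference R_a(u, v) = sum_i w_i * R_i(u, v)."""
    return lambda u, v: sum(wi * Ri(u, v) for wi, Ri in zip(w, experts))
```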

Some Theoretical Results
- The cumulative loss of Ra (the combined preference) will not be much worse than that of the best ranking expert/feature
- Preferences Ra => an ordering ρ: L(ρ, F) ≤ DISAGREE(ρ, Ra)/|F| + L(Ra, F)
- Need to find the ordering ρ that minimizes disagreement with Ra
- General case: NP-complete

A Greedy Ordering Algorithm
- Use a weighted graph to represent the preferences R
- For each node, compute the potential value, i.e., the sum of outgoing weights minus the sum of ingoing weights
- Rank the node with the highest potential value above all others
- Remove this node and its edges, and repeat
- At least half of the optimal agreement is guaranteed

Improvement
- Identify all the strongly connected components
- Rank the components consistently with the edges between them
- Rank the nodes within a component using the basic greedy algorithm
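A minimal sketch of the basic greedy ordering step (the preference graph is represented here as a hypothetical dict of edge weights). Each round ranks the node with the largest potential, i.e., outgoing minus incoming weight among the remaining nodes, then removes it and its edges.

```python
def greedy_order(nodes, pref):
    """nodes: iterable of items; pref[(u, v)]: weight of the preference 'u above v'."""
    remaining, order = set(nodes), []
    while remaining:
        def potential(v):
            # Outgoing minus incoming weight, counting only edges among remaining nodes.
            out_w = sum(w for (a, b), w in pref.items() if a == v and b in remaining)
            in_w = sum(w for (a, b), w in pref.items() if b == v and a in remaining)
            return out_w - in_w
        best = max(remaining, key=potential)
        order.append(best)      # rank this node above all remaining nodes
        remaining.remove(best)  # its edges are ignored from now on
    return order

pref = {("a", "b"): 1.0, ("b", "c"): 0.8, ("a", "c"): 0.6, ("c", "a"): 0.2}
print(greedy_order({"a", "b", "c"}, pref))  # ['a', 'b', 'c']
```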

Evaluation of Ordering Algorithms
- Measure: "weight coverage"
- Datasets: randomly generated small graphs
- Observations
  - The basic greedy algorithm works better than a random-permutation baseline
  - The improved version is generally better, but the improvement is insignificant for large graphs

Metasearch Experiments
- Task: known-item search
  - Search for an ML researcher's homepage
  - Search for a university homepage
- Search expert = a variant of the query; learn to merge the results of all search experts
- Feedback
  - Complete: the known item is preferred to all other results
  - Click data: the known item is preferred to everything ranked above it
- Leave-one-out testing
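The two feedback conditions translate into preference pairs as in the following minimal sketch (function names are my own):

```python
def complete_feedback(known, results):
    """Complete feedback: the known item is preferred to every other returned result."""
    return [(known, r) for r in results if r != known]

def click_feedback(known, results):
    """Click data: the known item is preferred only to the results ranked above it."""
    above = results[:results.index(known)] if known in results else []
    return [(known, r) for r in above]

results = ["page3", "page7", "homepage", "page9"]
print(complete_feedback("homepage", results))  # 3 pairs
print(click_feedback("homepage", results))     # 2 pairs: preferred to page3 and page7
```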

Metasearch Results
- Measures: compare the combined preferences with each individual ranking function
  - Sign test: which system tends to rank the known relevant article higher
  - Number of queries with the known relevant item ranked above rank k
  - Average rank of the known relevant item
- The learned system is better than each individual expert by all measures (not surprising; why?)

Metasearch Results (cont.)
- [Result tables/plots; not preserved in this text version]

Direct Learning of an Ordering Function
- Each expert is treated as a ranking feature fi: O -> R ∪ {⊥} (allowing partial rankings)
- Given preference feedback Φ: X x X -> R
- Goal: learn H that minimizes the ranking loss
- D(x0, x1): a distribution over X x X (in practice a uniform distribution over the pairs with feedback order): D(x0, x1) = c · max{0, Φ(x0, x1)}

The RankBoost Algorithm
- Iterative updating of D(x0, x1)
- Initialization: D1 = D
- For t = 1, ..., T:
  - Train a weak learner using Dt
  - Get a weak hypothesis ht: X -> R
  - Choose αt > 0
  - Update: Dt+1(x0, x1) = Dt(x0, x1) · exp(αt · (ht(x0) - ht(x1))) / Zt
- Final hypothesis: H(x) = Σt αt · ht(x)
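Here is a minimal sketch of the RankBoost loop (assuming numpy; the weak learner simply picks the single ranking feature that best matches the weighted pairs, αt uses the standard (1/2)·ln((1+r)/(1-r)) choice for h in [0, 1], and all names and data are illustrative). Each pair (x0, x1) in the feedback means x1 should be ranked above x0.

```python
import numpy as np

def rankboost(features, pairs, T=10):
    """features: (n_items, n_feats) ranking-feature scores in [0, 1];
    pairs: list of (x0, x1) item indices with x1 to be ranked above x0."""
    n_items, n_feats = features.shape
    D = np.full(len(pairs), 1.0 / len(pairs))  # D_1: uniform over the feedback pairs
    H = np.zeros(n_items)
    for _ in range(T):
        # Weak learner: pick the feature maximizing |r|, r = sum D(x0,x1)*(h(x1)-h(x0)).
        rs = [sum(d * (features[x1, f] - features[x0, f]) for d, (x0, x1) in zip(D, pairs))
              for f in range(n_feats)]
        f = int(np.argmax(np.abs(rs)))
        h = features[:, f] if rs[f] >= 0 else 1.0 - features[:, f]  # flip if anti-correlated
        r = min(abs(rs[f]), 1 - 1e-9)
        alpha = 0.5 * np.log((1 + r) / (1 - r))
        # Update: D_{t+1}(x0, x1) proportional to D_t(x0, x1) * exp(alpha * (h(x0) - h(x1))).
        D = np.array([d * np.exp(alpha * (h[x0] - h[x1])) for d, (x0, x1) in zip(D, pairs)])
        D /= D.sum()                    # Z_t normalization
        H += alpha * h                  # final hypothesis H(x) = sum_t alpha_t * h_t(x)
    return H

features = np.array([[0.9, 0.1], [0.4, 0.8], [0.2, 0.3]])  # 3 items, 2 ranking features
pairs = [(2, 0), (2, 1), (1, 0)]  # item 0 should rank highest, then item 1, then item 2
print(rankboost(features, pairs))
```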

How to Choose αt and Design ht?
- Bound on the ranking loss: rloss(H) ≤ Πt Zt, so choose αt to minimize the bound
- Three approaches
  - Numerical search
  - Special case: h is either 0 or 1
  - Approximate Z, then find an analytic solution

Efficient RankBoost for Bipartite Feedback
- Bipartite feedback (every item in X1 is preferred to every item in X0) is essentially binary classification
- Complexity at each round: O(|X0|·|X1|) reduced to O(|X0| + |X1|)

Evaluation of RankBoost
- Meta-search: same setup as in (Cohen et al. 99); perfect feedback; 4-fold cross-validation
- EachMovie evaluation: results examined as a function of # users, # movies/user, and # feedback movies
- Performance comparison: Cohen et al. 99 vs. Freund et al. 99

Summary
- CF is "easy"
  - The user's expectation is low
  - Any recommendation is better than none
- Making it practically useful, CF is "hard"
  - Data sparseness
  - Scalability
  - Domain-dependent

Summary (cont.)
- CF as a learning task
  - Rating-based formulation: learn f: U x O -> R
    - Algorithms: instance-based/memory-based (k-nearest neighbors); model-based (probabilistic clustering)
  - Preference-based formulation: learn PREF: U x O x O -> R
    - Algorithms: general preference combination (Hedge) with greedy ordering; efficient restricted preference combination (RankBoost)

Summary (cont.)
- Evaluation
  - Rating-based methods: simple methods seem to be reasonably effective; the advantage of sophisticated methods seems to be limited
  - Preference-based methods: more effective than rating-based methods according to one evaluation, but the evaluation on meta-search is weak

Research Directions
- Exploiting complete information: CF + content-based filtering + domain knowledge + user model
- More "localized" kernels for instance-based methods
  - Predicting movies needs different "neighbor users" than predicting books
  - Suggestion: use items similar to the target item as features for finding neighbors

Research Directions (cont.)
- Modeling time: there might be sequential patterns in the items a user purchases (e.g., bread machine -> bread machine mix)
- Probabilistic models of preferences: make the preference function a probability, e.g., P(A > B | U)
- Clustering items and users
- Minimizing preference disagreements

References
- Cohen, W. W., Schapire, R. E., and Singer, Y. (1999). Learning to Order Things. Journal of AI Research, Volume 10, pp. 243-270.
- Freund, Y., Iyer, R., Schapire, R. E., and Singer, Y. (1999). An Efficient Boosting Algorithm for Combining Preferences. Machine Learning Journal.
- Breese, J. S., Heckerman, D., and Kadie, C. (1998). Empirical Analysis of Predictive Algorithms for Collaborative Filtering. In Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence, pp. 43-52.
- Popescul, A. and Ungar, L. H. (2001). Probabilistic Models for Unified Collaborative and Content-Based Recommendation in Sparse-Data Environments. UAI 2001.
- Good, N., Schafer, J. B., Konstan, J., Borchers, A., Sarwar, B., Herlocker, J., and Riedl, J. (1999). Combining Collaborative Filtering with Personal Agents for Better Recommendations. Proceedings of AAAI-99, pp. 439-446.

The End. Thank you!
