
A Unified Model for Stable and Temporal Topic Detection from.ppt

A Unified Model for Stable and Temporal Topic Detection from Social Media Data
Hongzhi Yin, Bin Cui, Hua Lu, Yuxin Huang and Junjie Yao
Peking University, Aalborg University

Outline
- Motivation
- Problem Formulation
- A Basic Solution
  - A User-Temporal Mixture Model
- Enhancement of the Basic Solution
  - Regularization Technique
  - Burst-Weighted Boosting
- Experiments
- Q/A

Motivation
- Two different types of topics are mixed together on social media platforms such as Twitter, Weibo and Delicious.
- Temporal topics are temporally coherent, meaningful themes. They are time-sensitive and usually concern popular real-life events or hot spots, i.e., breaking events in the real world.
- Stable topics usually concern users' regular interests and their daily routine discussions, e.g., their moods and statuses.

One Example in Twitter
- Temporal topic: Dead pigs in Shanghai
- Stable topic: Big Data

Another Example in Twitter
- Temporal topic: Independence Day
- Stable topic: Animal Adoption
- We can tell temporal and stable topics apart from their temporal distributions and their description words.

Motivation (Cont.)
- Discovering different topics of events that are coherent in temporal space
  - Detecting bursty events, such as disasters (e.g., earthquakes), politics (e.g., elections), and public events (e.g., the Olympics)
  - Analyzing topic trends
- Extracting stable topics that are coherent in user-interest space
  - Finding users' intrinsic interests and better modeling user preferences

Problem Formulation
- A user-time-associated document d is a text document associated with a time stamp and a user.
- A temporal topic is a temporally coherent theme. In other words, words that emerge close together in time are clustered into one topic.
- Example of temporal topics: given a collection of user-time-associated tweets, the desired temporal topics are the events happening at different times.
- Formally, a temporal/stable topic is represented by a word distribution.

Problem Formulation (Cont.)
- A topic distribution in the time dimension is the distribution of topics given a specific time interval. Formally, it gives the probability of each temporal topic given time interval t.
- A topic distribution in user space is the distribution of topics given a specific user. Formally, it gives the probability of each stable topic given user u.

Problem Formulation (Cont.)
- A user-time-keyword matrix M is a hyper-matrix whose three dimensions refer to user, time and keyword. A cell M[u, t, w] stores the frequency of word w generated by user u within time interval t.
- Given a collection of user-time-associated documents C, we first formulate the matrix M and then address two tasks:
  - Task 1: Detecting temporal topics
  - Task 2: Extracting stable topics

Problem Formulation (Cont.)
- Detecting a set of temporal topics that are event-driven
  - Detecting bursty events, such as disasters (e.g., earthquakes), politics (e.g., elections), and public events (e.g., the Olympics)
  - Analyzing topic trends
- Extracting a set of stable topics that are interest-driven
  - Finding users' intrinsic interests and better modeling user preferences

A User-Time Mixture Model
Main insights: to find both temporal and stable topics in a unified manner, we propose a topic model that simultaneously captures two observations:
- Words generated around the same time are more likely to have the same event-driven temporal topic.
- Words generated by the same user are more likely to have the same interest-driven stable topic.
The former helps find event-driven temporal topics, while the latter helps identify interest-driven stable topics.

A User-Time Mixture Model (combining user and time information)
- We assume that when a user u generates a word w at time t, he/she is probably influenced by two factors: the breaking news/events occurring at time t and his/her intrinsic interests.
- Breaking events are modeled by temporal topics, and users' intrinsic interests are modeled by stable topics.
- The likelihood that user u generates word w at time t is as follows: [equation not preserved in the extracted text]
- The two mixing-weight parameters control the choice of motivation factor and also denote the proportions of temporal topics and stable topics in the dataset. It is worth mentioning that they are learnt automatically instead of being fixed.

Parameter Estimation
- The log-likelihood of the whole user-time-associated document collection C is: [equation not preserved in the extracted text]
- The E-M algorithm is used to estimate the parameters:
  - E-step: compute the expectation
  - M-step: maximize; closed-form solution
- [E-step and M-step update equations not preserved in the extracted text]
- Please refer to Section 4.2 for the details of the E-M algorithm.

Spatial Regularization
Intuitions:
- If two users are connected in the social network space, they are more likely to share the same or similar interests/topics.
- A topic is interest-coherent if the people who are interested in it are also close in the network space.
- [Figure: a user marked "?" whose neighbors are labeled DB] Is this user more likely to be a DB person or an IR person?
- Intuition: users' interests are similar to those of their neighbors.

Spatial Regularization (Cont.)
- Topic model with spatial regularization: a regularized data likelihood is defined as the data likelihood combined with a regularizer. [equation not preserved in the extracted text]
- The spatial regularizer plays the role of spatial smoothing for user interests.

Parameter Estimation
- Maximize using Newton-Raphson.
- Smooth using a spatial regularizer; in each iteration, a user's interests are smoothed by those of his/her spatial neighbors.

Insights
- In topic models, words with a high occurrence rate, i.e., popular words, have a high probability of appearing at the top positions of each discovered topic.
- These popular words are mostly general words denoting abstract concepts. In stable topics, they can illustrate the domain of a topic at first glance.
- In temporal topics, however, words with notable bursty features are better at expressing temporal information, since users are more interested in bursty words than in abstract concepts when browsing temporal topics.

Example: Michael Jackson's Death
- In this temporal topic, we expect the bursty words “mj”, “michael jackson” and “moonwalk” to become the dominant words rather than the general words “world”, “news” and “death”.
- But the general words cannot be removed as stop words, since they can help illustrate the stable topics.

Burst-Weighted Boosting
- We implement a bursty boosting step to escalate the probability of these bursty words during the procedure of detecting temporal topics.
- We first compute the bursty degree of each word in each time interval (Yao et al. ICDE 2010).
- A boosting step is then taken after every few E-M iterations, as follows: [equation not preserved in the extracted text]
- In this step, a word w has its generation probability boosted in a temporal topic only if w's bursty period overlaps with that of the topic.

Data Sets
- Twitter data set (Mar. 2009 to Oct. 2009)
- Delicious data set (Feb. 2008 to Dec. 2009)
- Sina Weibo (2011)

Data Sets (Cont.)
- Twitter: People on this platform often discuss social events and their daily lives. The data set contains 9,884,640 tweets posted by 456,024 users in the period from Mar. 2009 to Oct. 2009. Each user in this data set published at least 200 posts. We first removed all the stop words.
- Delicious: Delicious is a collaborative tagging system on which users can upload and tag web pages. We collected 200,000 users and their tagging behaviors from the period of Feb. 2008 to Dec. 2009. The data set contains 7,103,622 tags. Topics on technology and electronics cover more than half of the tags. Breaking news is also present.

Compared Methods
- Our models
  - BUT is the basic model.
  - EUTS is the model enhanced with spatial regularization.
  - EUTB is the model enhanced with both spatial regularization and burst-weighted boosting.
- PLSA Model on Time Slices (Mei et al. KDD05)
- Individual Detection Method (Wang et al. KDD07)
- Topics over Time Model (TOT) (Wang et al. KDD06)
- TimeUserLDA (Diao et al. ACL12)

Time Stamp Prediction Comparison
[Figures not preserved in the extracted text]

Topic Quality Comparison
- Excellent: a nicely presented temporal topic
- Good: a topic containing bursty features
- Poor: a topic without obvious bursty features

Result slides (figures not preserved in the extracted text)
- Stable Topics Detected in Delicious
- Temporal Topics Detected in Delicious
- Stable Topics Detected in Twitter
- Temporal Topics Detected in Twitter
- Stable Topics (Sina Weibo)
- Temporal Topics (Sina Weibo)
- Temporal Topic Trends Analysis

Thank You!
Any questions?
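
Illustrative sketch for the Problem Formulation and Parameter Estimation slides. The deck's likelihood and E-M update equations did not survive extraction, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a PLSA-style two-component mixture, p(w|u,t) = lam * sum_z p(z|u) p(w|z) + (1 - lam) * sum_x p(x|t) p(w|x), where z ranges over stable topics and x over temporal topics. The names `build_counts`, `fit_user_time_mixture`, `lam` and `eps`, and the toy data, are all hypothetical. It first builds the user-time-keyword matrix M as sparse counts, then alternates an E-step (topic posteriors per word occurrence) with a closed-form M-step (renormalized expected counts), and learns the mixing weight rather than fixing it.

```python
import math
import random
from collections import Counter


def build_counts(docs):
    """Sparse user-time-keyword matrix: M[(u, t, w)] = frequency of w."""
    M = Counter()
    for user, time, words in docs:
        for w in words:
            M[(user, time, w)] += 1
    return M


def _normalize(row):
    total = sum(row)
    return [v / total for v in row]


def _random_rows(n_rows, n_cols, rng):
    return [_normalize([rng.random() + 0.1 for _ in range(n_cols)])
            for _ in range(n_rows)]


def fit_user_time_mixture(M, n_stable, n_temporal, n_iter=30, seed=0):
    """EM for the ASSUMED likelihood
    p(w|u,t) = lam * sum_z p(z|u) p(w|z) + (1 - lam) * sum_x p(x|t) p(w|x).
    """
    rng = random.Random(seed)
    users = {u: i for i, u in enumerate(sorted({k[0] for k in M}))}
    times = {t: i for i, t in enumerate(sorted({k[1] for k in M}))}
    vocab = {w: i for i, w in enumerate(sorted({k[2] for k in M}))}
    n_words = len(vocab)
    p_z_u = _random_rows(len(users), n_stable, rng)     # p(z | u), stable
    p_w_z = _random_rows(n_stable, n_words, rng)        # p(w | z)
    p_x_t = _random_rows(len(times), n_temporal, rng)   # p(x | t), temporal
    p_w_x = _random_rows(n_temporal, n_words, rng)      # p(w | x)
    lam = 0.5    # mixing weight of the stable (user-interest) component
    eps = 1e-12  # hypothetical guard: keeps every accumulator positive
    log_liks = []
    for _ in range(n_iter):
        acc_zu = [[eps] * n_stable for _ in users]
        acc_wz = [[eps] * n_words for _ in range(n_stable)]
        acc_xt = [[eps] * n_temporal for _ in times]
        acc_wx = [[eps] * n_words for _ in range(n_temporal)]
        stable_mass = total_mass = log_lik = 0.0
        for (u, t, w), count in M.items():
            ui, ti, wi = users[u], times[t], vocab[w]
            # E-step: unnormalized posterior of each topic for this word.
            s = [lam * p_z_u[ui][z] * p_w_z[z][wi] for z in range(n_stable)]
            r = [(1 - lam) * p_x_t[ti][x] * p_w_x[x][wi]
                 for x in range(n_temporal)]
            denom = sum(s) + sum(r)
            log_lik += count * math.log(denom)
            for z in range(n_stable):
                mass = count * s[z] / denom
                acc_zu[ui][z] += mass
                acc_wz[z][wi] += mass
                stable_mass += mass
            for x in range(n_temporal):
                mass = count * r[x] / denom
                acc_xt[ti][x] += mass
                acc_wx[x][wi] += mass
            total_mass += count
        # M-step: closed-form updates, i.e. renormalized expected counts.
        p_z_u = [_normalize(row) for row in acc_zu]
        p_w_z = [_normalize(row) for row in acc_wz]
        p_x_t = [_normalize(row) for row in acc_xt]
        p_w_x = [_normalize(row) for row in acc_wx]
        lam = stable_mass / total_mass
        log_liks.append(log_lik)
    return {"lam": lam, "p_z_u": p_z_u, "p_w_z": p_w_z,
            "p_x_t": p_x_t, "p_w_x": p_w_x, "log_lik": log_liks,
            "users": users, "times": times, "vocab": vocab}


docs = [  # (user, time interval, words); toy data for illustration only
    ("alice", 0, ["bigdata", "hadoop", "bigdata"]),
    ("bob", 0, ["earthquake", "rescue", "bigdata"]),
    ("alice", 1, ["bigdata", "spark", "election"]),
    ("bob", 1, ["election", "vote", "hadoop"]),
]
M = build_counts(docs)
model = fit_user_time_mixture(M, n_stable=2, n_temporal=2)
print(round(model["lam"], 3), model["log_lik"][-1] >= model["log_lik"][0])
```

The log-likelihood recorded at each iteration should not decrease, which is a convenient sanity check for any E-M implementation of this kind. Dense nested lists are used only to keep the sketch dependency-free; data at the scale described on the Data Sets slides would need sparse or vectorized storage.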

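Illustrative sketch for the Spatial Regularization and Burst-Weighted Boosting slides. The regularizer and the boosting formula were not preserved in the extracted text, so both functions below are stand-ins under loudly stated assumptions. `smooth_user_interests` replaces the deck's Newton-Raphson update with a plain neighbor-averaging step controlled by a hypothetical `weight`. `boost_bursty_words` multiplies p(w|x) by (1 + burst degree) when the word's bursty period overlaps the topic's and then renormalizes; this multiplicative form, and treating the burst degree as a precomputed input, are assumptions rather than the method of Yao et al.

```python
def smooth_user_interests(p_z_u, neighbors, weight=0.5):
    """Pull each user's topic distribution toward the mean of its neighbors.

    p_z_u: {user: [p(z|u) for each stable topic z]}
    neighbors: {user: iterable of connected users}
    weight: hypothetical smoothing strength in [0, 1]
    """
    smoothed = {}
    for user, dist in p_z_u.items():
        nbrs = [n for n in neighbors.get(user, ()) if n in p_z_u]
        if not nbrs:
            smoothed[user] = list(dist)  # isolated users are left unchanged
            continue
        smoothed[user] = [
            (1 - weight) * p
            + weight * sum(p_z_u[n][z] for n in nbrs) / len(nbrs)
            for z, p in enumerate(dist)
        ]
    return smoothed


def intervals_overlap(a, b):
    """True if closed intervals a = (start, end) and b = (start, end) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


def boost_bursty_words(p_w_x, word_bursts, topic_period):
    """Boost words whose bursty period overlaps the temporal topic's period.

    p_w_x: {word: p(w|x)} for one temporal topic x
    word_bursts: {word: [(start, end, burst_degree), ...]}, precomputed
    topic_period: (start, end) of the topic's bursty period
    """
    boosted = {}
    for word, prob in p_w_x.items():
        degrees = [d for (s, e, d) in word_bursts.get(word, ())
                   if intervals_overlap((s, e), topic_period)]
        # Assumed boosting rule: scale by (1 + strongest overlapping burst).
        boosted[word] = prob * (1.0 + max(degrees, default=0.0))
    total = sum(boosted.values())
    return {w: p / total for w, p in boosted.items()}


# Toy usage: user "a" is pulled toward neighbor "b"; "mj" bursts with the topic.
interests = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5]}
smoothed = smooth_user_interests(interests, {"a": ["b"], "b": ["a"]})
topic = {"mj": 0.2, "world": 0.4, "news": 0.4}
bursts = {"mj": [(5, 7, 3.0)], "news": [(20, 22, 3.0)]}
boosted = boost_bursty_words(topic, bursts, topic_period=(6, 9))
print(smoothed["a"], max(boosted, key=boosted.get))
```

Because each smoothed row is a convex combination of probability distributions, it still sums to one. Renormalizing after boosting does the same for the topic's word distribution, so general words such as “world” lose rank without being removed.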