End-User Programming of Intelligent Learning Agents.ppt

End-User Programming of Intelligent Learning Agents
Prasad Tadepalli, Ron Metoyer, and Margaret Burnett
In conjunction with the EUSES Consortium: End Users Shaping Effective Software

Prasad Tadepalli: Machine Learning
- Scaling average-reward reinforcement learning to large spaces.
- Relational learning.
- Relational learning from prior knowledge and sparse user input.
- Relational reinforcement learning.

Ron Metoyer: Computer Graphics & Animation
- NSF CAREER Award winner (2003).
- Complexities of animated content.
- Creating characters for training.
- Emphasis on usability and realism.
- Real-time simulation of evacuation dynamics for large crowds.

Margaret Burnett: Visual & End-User Programming
- Project director: EUSES Consortium (End Users Shaping Effective Software), an ITR project by Oregon State, Carnegie Mellon, Drexel, Nebraska, and Penn State.
- Principal architect: Forms/3, FAR end-user programming support.
- Co-architect: Functions for Excel users (a Microsoft Research project).

Motivation
- Task training: sports, military.
- Boston Dynamics Inc.
- Who creates the training content?

Current Approaches
- Joystick control: user does all (once, not reusable).
- Scripting languages: user does all (reusable program).
- Programming by demonstration: user and system share.
- Autonomous agents: system does all.

Application: Quarterback Training
- QBs can benefit from 3D training content.
- Coaches: do not program or animate; need responsive, semi-intelligent agents that perform football tasks.
- Agents: should get better over time, and should do so with few examples.
- Agent behavior: must morph over time (different opponents).

End-User Programming by Demonstration
- Generalizing from demonstrations is still an active area of research: some approaches are viable under particular assumptions, but it is not a solved problem.
- Other systems allow demonstrating only reactive behaviors.
- Not used to train people in strategy.
- Largely distinct from machine learning.

Our Approach to End-User Programming
- Our approach: demonstrate goals and strategies to achieve the goals.
- Allows generalization and planning by agents.
- Thus suited to training: agents can simulate both “good” characters for training (desirable strategies) and “bad” characters (strategies we know they employ).

Example
- Goal: get the football to Character A.
- Demonstration: start state, goal state.
- Research issue: “What is relevant?”
  - Any trees are ignorable background.
  - Character A can be any character.
  - The football is a unique object.
- Figures: Start, Goal.

Example (cont.)
- Strategy 1: pass it directly.
  - Demonstration: passing to A.
  - “What's relevant” issues arise again.
- Strategy 2: pass it to B, who passes to A.
  - New issue: recursiveness. (Need to learn a general strategy of “get it to someone who can get it closer to A”.)

Machine Learning Challenges
- Learning must be on-line.
- Users can only give a few examples.
- Provide a predictable model of generalization.
- Must include support for debugging.
- Must allow safety checks.
- Expressive representation language.

Strategy Languages
- Some high-level languages exist to express strategies, e.g., Golog, CML.
- Our plan: simpler rule-based languages, suitable for learning.
- Starting point: our previous work on a decomposition-rule language:
  IF Condition(s) and Goal(s) THEN Subgoals(s1, s2, ..., sn) WHILE invariant conditions hold.

Requirements of the Learning Algorithms
- Follow HCI findings: user motivation, attention, trust.
- Need a transparent generalization procedure, e.g., no neural nets.
- Treat user input as examples of a high-level specification of strategy ... and fill in the details.
- User “steers” agent behaviors to correct faulty generalizations.
- Assertions to monitor behavior: provided, inferred, and propagated.

Learning from Exercises
- Generate examples automatically by searching for successful plans.
- Bottom-up learning of skills:
  - Learn how to solve simple problems first.
  - Compose known strategies for solving subgoals to solve more complex goals.

Oops! That's Not Right!
- Debugging by end-user programmers:
  - When the agents pick the right strategy but it doesn't work right.
  - When the agents pick the wrong strategies.
- These provide negative examples to the learning component.

How to Support Debugging?
- User/system collaboration: the user helps narrow the problem; the system revises its rules and runs them on the example until the user is satisfied.
- Testing and assertions: used for quality control, but designed specifically for end users. Assertions will be used to rule out bad generalizations.

Debugging (cont.)
- Draws from our previous work on end-user software development: WYSIWYT testing, fault localization, and assertions.
- Surprise-Explain-Reward strategy: empirically driven research; draws from psychology to motivate desired behaviors via surprises (to arouse curiosity).

Research Issues
- How to learn from a small number of examples?
- How to let the user “speak” his/her own language?
- How to motivate the users and earn their trust?
- How to facilitate debugging and maintenance in a natural way?
- How to make learning safe?

Summary: The Research Question
- Is it possible to empower end users ... to program in evolving task-training environments ...
  ... using machine learning and programming by demonstration?

(The End)

Leftovers

How to Support Debugging?
- User/system collaboration.
- Builds on our previous work: motivating, suggesting, and supporting ... end-user testing, end-user fault localization, and end-user assertions.

Web Navigation (*possibly cut)
- Navigation of the web to satisfy a goal:
  - Students trying to find an appropriate school that matches their interests and constraints.
  - Shoppers looking for bargain purchases.
  - Traders searching for appropriate stocks to buy and sell.
- In each case, the system should learn to retrieve the target information efficiently.

Debugging
- Negative examples are used to specialize over-general rules.
- Maintain confidences of rules based on their support among the training examples, and suggest possibly incorrect rules.
- Encourage users to enter assertions to correct errors.
- Verify assertions during rule evaluation and warn the user if they are not valid.

Agent Behavior Control
- Joystick controlled
- Autonomous
- Scripting languages
- Autonomous but “teachable”
- End-user agents: program by interaction; generalize.
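
The Debugging slide's first two points (specializing over-general rules with negative examples, and keeping a per-rule confidence based on support among the training examples) can be illustrated with a small sketch. Everything here is an assumption made for illustration, not the authors' system: rules are plain conjunctions of facts, confidence is simply the fraction of covered examples that are positive, and the names `Example`, `Rule`, `confidence`, `specialize`, and `suspicious` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    facts: frozenset   # facts true in the demonstrated situation
    positive: bool     # False when the user flagged the behavior as wrong


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: frozenset  # the rule fires only if all of these facts hold

    def covers(self, example):
        return self.conditions <= example.facts


def confidence(rule, examples):
    """Fraction of the examples covered by the rule that are positive."""
    covered = [e for e in examples if rule.covers(e)]
    if not covered:
        return 0.0
    return sum(e.positive for e in covered) / len(covered)


def specialize(rule, examples):
    """Add one condition shared by every covered positive example and
    absent from every covered negative example (if such a fact exists)."""
    pos = [e.facts for e in examples if e.positive and rule.covers(e)]
    neg = [e.facts for e in examples if not e.positive and rule.covers(e)]
    if not pos or not neg:
        return rule  # nothing to learn from, or rule is not over-general
    shared = frozenset.intersection(*pos) - rule.conditions
    candidates = sorted(c for c in shared if all(c not in n for n in neg))
    if not candidates:
        return rule
    return Rule(rule.name, rule.conditions | {candidates[0]})


def suspicious(rules, examples, threshold=0.75):
    """Names of rules to show the user as possibly incorrect."""
    return [r.name for r in rules if confidence(r, examples) < threshold]


# Hypothetical football facts, made up for this sketch.
examples = [
    Example(frozenset({"has_ball", "receiver_open"}), True),
    Example(frozenset({"has_ball", "receiver_open", "short_range"}), True),
    Example(frozenset({"has_ball"}), False),  # user said: "that's not right"
]
over_general = Rule("pass_directly", frozenset({"has_ball"}))
refined = specialize(over_general, examples)
```

In this toy data the over-general rule covers the flagged example, so its confidence drops and it would be suggested to the user; the refined rule gains the `receiver_open` condition and no longer covers the negative example. Because the refinement is a single added fact, the generalization step stays inspectable, in the spirit of the deck's call for a transparent procedure.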

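
The decomposition-rule form quoted on the Strategy Languages slide (IF conditions and goal, THEN subgoals, WHILE invariants hold) and the recursive “get it to someone who can get it closer to A” strategy from the example can be sketched together. This is a minimal hand-written illustration under stated assumptions, not the authors' language or learner: the state layout (`holder`, `lanes`, `sacked`), the `DecompositionRule` class, and the `achieve` planner are all hypothetical, and the rules are written by hand here whereas the deck proposes learning them from demonstrations.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DecompositionRule:
    name: str
    condition: Callable      # IF: (state, goal) -> bool
    alternatives: Callable   # THEN: (state, goal) -> list of step sequences
    invariant: Callable      # WHILE: state -> bool


def achieve(goal, state, rules, depth=4):
    """Return a list of primitive passes that reaches the goal, or None."""
    if state["holder"] == goal[1]:
        return []            # goal already satisfied
    if depth == 0:
        return None          # bound the recursion (passing lanes may cycle)
    for rule in rules:
        if not rule.condition(state, goal):
            continue
        for steps in rule.alternatives(state, goal):
            plan = run(steps, state, rule, rules, depth)
            if plan is not None:
                return plan
    return None


def run(steps, state, rule, rules, depth):
    """Execute one decomposition, checking the invariant before each step."""
    plan, current = [], dict(state)
    for step in steps:
        if not rule.invariant(current):
            return None
        if step[0] == "pass":                 # primitive action
            plan.append(step)
            current["holder"] = step[2]
        else:                                 # subgoal: solve recursively
            sub = achieve(step, current, rules, depth - 1)
            if sub is None:
                return None
            plan.extend(sub)
            current["holder"] = step[1]
    return plan


def not_sacked(state):
    return not state.get("sacked", False)


# Strategy 1: pass it directly.
direct = DecompositionRule(
    "pass_directly",
    condition=lambda s, g: (s["holder"], g[1]) in s["lanes"],
    alternatives=lambda s, g: [[("pass", s["holder"], g[1])]],
    invariant=not_sacked,
)

# Strategy 2, generalized: pass to someone, then get it to the target.
relay = DecompositionRule(
    "relay",
    condition=lambda s, g: any(a == s["holder"] and b != g[1] for a, b in s["lanes"]),
    alternatives=lambda s, g: [
        [("pass", s["holder"], b), ("ball_at", g[1])]
        for a, b in sorted(s["lanes"]) if a == s["holder"] and b != g[1]
    ],
    invariant=not_sacked,
)

state = {"holder": "QB", "lanes": {("QB", "B"), ("B", "A")}}
plan = achieve(("ball_at", "A"), state, [direct, relay])
```

With the lanes above, no direct pass to A is available, so the relay rule passes to B and then reuses the direct rule for the remaining subgoal. The depth bound is one simple way to keep the recursive rule from looping when passing lanes form a cycle.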