TACC Retrospective: Contributions, Non-Contributions, and What We Really Learned
Armando Fox
University of California, Berkeley
fox@cs.berkeley.edu

Vision: “The Content You Want”
- What do the above apps have in common?
- Adapt (collect, filter, transform) existing content
  - according to client constraints
  - respecting network limitations
  - according to per-user preferences
- But: lack of a unified framework for designing apps that exploit this observation

Contributions
- TACC, a model for structuring services
  - Transformation, Aggregation, Caching, Customization of Internet content
- Scalable TACC server
  - Based on clusters of commodity PCs
  - Easy to author “industrial strength” services
  - Scalable Network Service (SNS) platform maps app semantics onto cluster-based availability mechanisms
- Experience with real users
  - 15,000 today at UCB

What's TACC?
- Transformation (“local”, “one-to-one”)
  - TranSend, Anonymizer
- Aggregation (“nonlocal”, “many-to-one”)
  - Search engines, crawlers, newswatchers
- Caching
  - Both original and locally-generated content
- Customization
  - Per user: for content generation
  - Per device: data delivery, content “packaging”

TACC Example: TranSend
- Transparent HTTP proxy
- On-the-fly, lossy compression of specific MIME types (GIF, JPG, etc.)
- Cache both original & transformed
- User specifies aggressiveness and “refinement” UI
  - Parameters to HTML & image transformers

Top Gun Wingman
- PalmPilot web browser
- Intermediate-form page layout
- Image scaling & transcoding
  - Controlled by layout engine
- Device-specific ADU marshalling
  - Including client versioning
- Originals and device-specific pages cached
[Figure: data flow diagram; labels: $, A, ADU, html]

Application Partitioning
- Client competence
  - Styled text, images, widgets are fine
  - Bitmaps unnecessary
- Client responsiveness
  - Scrolling, etc. shouldn't require a roundtrip to the server
- Client independence
  - Very late conversion to client-specific format

TACC Conceptual Data Flow
[Figure: data flow diagram; labels: FE, User request, To Internet]
- Front end accepts RPC-like user requests
- User's customization profile retrieved
- Original data fetched from cache or Internet
- Aggregation/transformation workers operate on data according to customization profile

TACC Model Summary
- Mostly stateless, composable workers
- Unifies previously ad hoc applications under one framework
- Encourages re-use through modularization
- Composition enables both new services and new clients
- TACC breakdown provides a unified way to think about app structure

Services Should Be Easy To Write
- Rapid prototyping
  - Insulate workers from “mundane” details
- Easy to incorporate existing/legacy code
  - Few assumptions about code structure
  - Must support variety of languages
  - May be fragile
- Composition to leverage existing code

Building a TACC Server
- Challenge: Scalable Network Service (SNS) requirements
  - Scalability to 100Ks of users with high availability
  - Cost effective to deploy & administer
- But, services should remain easy to write
  - Server provides some bug robustness
  - Server provides availability
  - Server handles load balancing and scaling
- Preserve modularity (& componentwise upgradability)
  when deploying

Layered Model of Internet Services
- TACC Layer
  - Programming model based on composable building blocks
- SNS Layer: “large virtual server”
  - Implements SNS requirements
  - Cluster computing for hardware F/T and incremental scaling
- Exploit TACC model semantics for software F/T
- SNS layer is reusable and isolated from TACC
  - Application “content” orthogonal to SNS mechanisms
  - Key to making apps easy to write
[Figure: layer diagram; labels: httpd, etc.; TACC; Scalable Network Svc]

Why Use a Cluster?
- Incremental scalability, low cost components
- High availability through hardware redundancy
- Goals:
  - Demonstrate that clusters and TACC fit well together
  - Separate SNS from TACC

Cluster-Based TACC Server
- Component replication for scaling and availability
- High-bandwidth, low-latency interconnect
- Incremental scaling: commodity PCs
[Figure: cluster diagram; labels: Front Ends, Caches, User Profile Database, Workers, Load Balancing & Fault Tolerance, Administration Interface]
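
The conceptual data flow from the slides above (front end receives a request, retrieves the user's profile, fetches the original from cache or the Internet, then runs workers according to the profile) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the actual TACC server code: the names `FrontEnd`, `truncate`, `upper`, and `fake_origin` are hypothetical, workers are modeled as plain stateless functions, and the cache is an in-process dictionary rather than a cluster cache.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Profile = Dict[str, str]
# Assumption: a worker is a stateless function (content, profile) -> content.
Worker = Callable[[bytes, Profile], bytes]


def truncate(content: bytes, profile: Profile) -> bytes:
    """Toy stand-in for lossy compression: keep the first max_bytes bytes."""
    return content[: int(profile.get("max_bytes", "8"))]


def upper(content: bytes, profile: Profile) -> bytes:
    """Toy stand-in for a second transformation stage."""
    return content.upper()


@dataclass
class FrontEnd:
    fetch_origin: Callable[[str], bytes]
    profiles: Dict[str, Profile]
    workers: Dict[str, Worker]
    cache: Dict[Tuple, bytes] = field(default_factory=dict)

    def handle(self, user: str, url: str, pipeline: List[str]) -> bytes:
        # 1. Retrieve the user's customization profile (empty if unknown).
        profile = self.profiles.get(user, {})
        # 2. Serve a previously transformed result if one is cached.
        result_key = (url, tuple(pipeline), tuple(sorted(profile.items())))
        if result_key in self.cache:
            return self.cache[result_key]
        # 3. Fetch the original from the cache, or from the origin on a miss.
        origin_key = (url,)
        if origin_key not in self.cache:
            self.cache[origin_key] = self.fetch_origin(url)
        data = self.cache[origin_key]
        # 4. Run the composed workers, each parameterized by the profile.
        for name in pipeline:
            data = self.workers[name](data, profile)
        # 5. Cache the transformed content alongside the original.
        self.cache[result_key] = data
        return data


fetch_count = {"n": 0}


def fake_origin(url: str) -> bytes:
    """Hypothetical origin server that counts how often it is contacted."""
    fetch_count["n"] += 1
    return b"hello, tacc world"


front_end = FrontEnd(
    fetch_origin=fake_origin,
    profiles={"alice": {"max_bytes": "5"}},
    workers={"truncate": truncate, "upper": upper},
)
first = front_end.handle("alice", "http://example.invalid/page", ["truncate", "upper"])
second = front_end.handle("bob", "http://example.invalid/page", ["truncate"])
```

Because the workers share one signature, a new service is just a different list of worker names, and both requests above reuse the same cached original even though their profiles and pipelines differ.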

“Starfish” Availability: LB Death
- FE detects via broken pipe/timeout, restarts LB
- New LB announces itself (multicast), contacted by workers, gradually rebuilds load tables
- If partition heals, extra LBs commit suicide
- FEs operate using cached LB info during failure
[Figure: cluster diagram; labels: C, $, FE, LB/FT]

Fault Recovery Latency
[Figure: plot; axis label: Task queue length]

Behavior in the Large
- TranSend: 160 image transformations/sec = 10 Ultra-1 servers
  - Peak seen during UCB traces on 700-modem bank: 15/sec
- Amortized hardware cost $0.35/user/month
  (one $5K PC serving 15,000 subscribers)
- Wingman: factor of 6-8 worse
- Administration: one undergraduate part-time

Building a Big System
- Restartable, atomic workers
  - Read-only data from other origin server(s)
- Orthogonal separation of scalability/availability from application “content”
  - Multiple lines of defense
  - App modules agree to obey semantics compatible with these mechanisms
  - Common-case failure behavior compatible with users' Internet experience
- Enables reuse of whole workers, however diverse

Availability & Scalability Summary
- Pervasive strategy: timeout, retry, restart
  - Transient failures usually invisible to user
  - Process peers watch each other
  - Mostly stateless workers, transaction support possible
- Simplicity from exploiting soft state
  - Piggyback status info on multicast beacons
  - Use of stale LB info fine in practice
- “Starfish” availability works in practice

Service Authoring
- Keyword hiliting: 1 day
- Wingman: 2-3 weeks
- Various apps from graduate seminar projects
  - Safe worker upload
  - Annotate the Web
  - “Channel aggregators”

New Services By Composition
- Compose existing services to create a new one
- 2.5 hours to implement
- Composes with TranSend or Wingman
[Figure: composition diagram; labels: TranSend, Metasearch, Internet]

Experience With Real Users
- Transparent enhancements
- Minimal downtime
- Low administration cost
  - Multicast-based administration GUI
- Virtually no dedicated resources at UCB
  - “Overflow pool” of 100 UltraSPARC servers
- Users don't mind relying on middleware proxy

Why Now?
- Internet's critical mass
- Commercial push for many device types (transistor curves)
- Cluster computing economically viable
- A good time for infrastructural services

Related Work
- Transformational proxy services: WBI, Strands
- Application partitioning: Wit, InfoPad, PARC Ubiquitous Computing
- Computing in the infrastructure: Active Networks
- Soft state for simplicity and robustness: Microsoft Tiger, multicast routing protocols

Summary of Contributions
- TACC, a composition-based Internet services programming model
  - Captures rich variety of apps
  - One view of customization
- No-hassle deployment on a cluster
  - Automatic and robust partial-failure handling
  - Availability & scaling strategies work in practice
- New apps are easy to write, deploy, debug
  - SNS behaviors are free
  - Compose existing services to enable new clients

Non-Contributions (a/k/a Future Work)
- Accidental contributions:
  - Legacy code glue
  - Cheap test rig for next project (prototyping path discovery; a bare bones “cluster OS”)
- Non-contributions:
  - Fair resource allocation over cluster
  - Built-in security abstractions
  - Rich state management abstractions

What We Really Learned
- Design for failure
  - It will fail anyway
  - End-to-end argument applied to availability
- Orthogonality is even better than layering
  - Narrow interface vs. no interface
  - A great way to manage system complexity
  - The price of orthogonality
- Techniques: refreshable soft state; watchdogs/timeouts; sandboxing

How About State Management?
- Transactional apps?
  - APIs are there, but you have to roll your own consistency
- Groupware apps with group state?
  - One way: distributed, F/T group state like SRM!
  - Keeps state management orthogonal to SNS layer
- The Moral: Consistency, Availability, Partition-resilience: pick at most 2

Future Work
- TACC as test rig for Ninja
- Taxonomy of app structure and platforms
  - What is the “big picture” of different types of Internet services, and where does TACC fit in?
  - Joint work with Dr. Murray Mazer at the Open Group Research Institute
- Apply lessons to reliable distributed systems
- Formalize programming model
- Finish writing thesis
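
Refreshable soft state, named among the techniques in “What We Really Learned” and used by the load balancer in the “Starfish” slides, can be illustrated with a small sketch. It assumes workers periodically report their task queue length and that an entry not refreshed within a time-to-live is simply dropped; the class name `SoftStateLoadTable`, its methods, and the `ttl` parameter are hypothetical and not taken from the TACC implementation. Time is passed in explicitly so the example is deterministic.

```python
from typing import Dict, List, Optional, Tuple


class SoftStateLoadTable:
    """Load table built only from periodic worker beacons (soft state)."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        # worker name -> (reported queue length, time of last beacon)
        self._entries: Dict[str, Tuple[int, float]] = {}

    def beacon(self, worker: str, queue_length: int, now: float) -> None:
        """Record or refresh a worker's load report."""
        self._entries[worker] = (queue_length, now)

    def expire(self, now: float) -> List[str]:
        """Drop workers whose last beacon is older than ttl; return their names."""
        dead = [w for w, (_, seen) in self._entries.items() if now - seen > self.ttl]
        for w in dead:
            del self._entries[w]
        return dead

    def pick(self, now: float) -> Optional[str]:
        """Return the live worker with the shortest queue, or None if none known."""
        self.expire(now)
        if not self._entries:
            return None
        return min(self._entries, key=lambda w: self._entries[w][0])


# A freshly (re)started load balancer knows nothing until beacons arrive.
table = SoftStateLoadTable(ttl=3.0)
empty_choice = table.pick(now=0.0)

table.beacon("w1", 4, now=1.0)
table.beacon("w2", 1, now=1.0)
choice_before = table.pick(now=2.0)

# w1 keeps beaconing; w2 falls silent and ages out of the table.
table.beacon("w1", 4, now=4.0)
choice_after = table.pick(now=5.0)
```

No recovery protocol is needed in this model: a restarted table rebuilds itself from the next round of beacons, and a crashed worker disappears when its entry expires, which matches the slides' point that simplicity comes from exploiting soft state.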