1. Transaction Research History and Challenges
Jim Gray, Microsoft
http:/
Talk for the session on Systems Perspectives on Database Technology: Achievements and Dreams Forgotten
ACM SIGMOD 2006, Chicago, Ill, 27 June 2006
Thanks to: Phil Bernstein, Surajit Chaudhuri, Dave DeWitt, Rick Snodgrass, Gerhard Weikum

2. Databases Are State
DB is a collection of facts
Store the facts
Find the facts
Combine the facts to make new ones

3. Transactions are State Changes
And of course these changes are state (facts)
My meta data is just your data
It's all rock 'n' roll to me.
It's turtles all the way down.

4. Transactions Have a LONG History
First clay tablets were transaction records
General ledger: lots of technology there
Punched cards
Batch (tape) transactions
Online (concurrency & durability issues)
And now: What next?
I believe it is back to clay tablets.
[Figure: timescale in years before present: 6000, 1000, 100, 50, 25, 0]

5. “Formal” Transaction Notion: Interesting History
Formalization happened concurrently in many groups: GE, IBM, MIT, Tokyo
Many others saw it as useless: Transactions don't give THE right answer, they just give A answer.
Heated debate among the “enlightened”
“Winner” was wrong, but right at the time.

6. ACID came to define Transaction “elevator pitch”
Atomicity: All or nothing
Consistency: Preserve application invariants
Isolation: No concurrency surprises
Durability: No commitments lost
Lesson: Simple Story Matters. It is IMPORTANT to get it right.

7. Red-Green Balls Example: What's Wrong With This Picture?
A: Change all Red to Green
B: Change all Green to Red
Even people who have worked on this for 40 years still puzzle about things like this.
The “answer” is subtle. Probably there is no “answer”. We get to set the rules.

8. The Virtue of Transactions
They are simple
Convert complex errors into simple go / no-go
Simplifies component composition
Simplifies distributed system error handling (especially useful in a “cluster”)
Lampson: Transactions are “pixie dust” that you sprinkle on your program to make it reliable.

9. But Technology Clouded our/my Thinking
Disk & RAM storage was expensive
Accesses were expensive
So we discarded old values, did update-in-place
Makes it easy to find current state; possible to find old state from the log.
But many applications want data lineage, and databases don't optimize for that.
But now storage is “free”. Keep everything!
Some kept old versions: Prime Codasyl, Oracle, Rdb, but “garbage collected”

10. Restatement: It is a Mistake to Update Data
Discards information!
You should only ADD information
Examples: clay tablets, general ledger, punched cards, batch processing (Old-Master, New-Master)

11. Correct Solution: Temporal Databases
No Update! No Delete! Only Insert and Read-Time, grouped into transactions
Every item has time dimension(s): transaction time, valid time
This is BETTER than clay tablets & punched cards (they did not have valid time & transaction time)
Same as general ledger
References: Bernstein, Hadzilacos, Goodman “transaction” book. Snodgrass et al., Temporal SQL. Reed, Atomic Actions thesis.

12. Lesson
Technology can warp your (my) thinking
No update with clay tablets, cards, tape
Disks allowed/encouraged update (precious disk space).
Now that disk space is free, I see the error of my ways
But at the time (1970-2005) it was “right”
Systems that tried “failed” (e.g. Postgres, TSQL)
Real lesson: Good ideas can go bad
Good ideas may have to wait

13. What About Durability
Discussion so far: atomic-consistent-isolated (ACI) state change
Durability always used replicas.
Log replica is compact but “useless”
Want object-replica
Want security, query
Log is just a technology for replicas.
Replica technology has made huge progress.
Problem: too many solutions.
Durability requires geo-plex
Lots of Copies Keep Stuff Safe (LOCKSS)
Challenge today is to simplify options.

14. What's WRONG With ACID Transactions?
Transactions are an UN-availability feature
Correctness/consistency fight with “Do it now!” (Lesson: “Do it now!” usually wins)
Users hate to wait!
Transactions are:
Good within an organization: I trust you!
Bad across organizations: Can I depend on you?

15. Workflow: Still An Elusive Goal
If X is good, recursive-X is better
What is the generalization of transaction?
If they are atomic, what are molecules? How to compose them?
Great progress on Multi-level Transaction Model (Weikum-Vossen book)
Limited progress on workflow parallelism within transactions

16. Workflow Progress
There are LOTS of workflow systems.
What “concepts” have helped?
Compensation model
Simple metaphors (e.g. Sagas)
Commit/Abort dependencies

17. Aside: The Software Crisis and Transactional Memory
Software systems are getting too complex.
Try-catch fault handling model: a huge advance, but unworkable in complex systems.
Multi-core and Many-core force parallel programming
So, software is in crisis (as usual).
Transactional memory (treat methods as sub-transactions) simplifies error handling.
Reminiscent of Randell's Recovery Blocks and
Great progress in this space, challenging problems.
PS: they definitely update in place.

18. Transaction Research Advice for 2007
Think in terms of temporal databases
Transactions of Insert and Read-time
Simplify replication (as a path to Durability)
LOCKSS is the key to durability
Temporal model may make it easier
Don't give up on workflow
It is too important.
Non-ACID workflow?
But all my advice on it has been a dead end.
Simpler programming model with Transactions?
Cleaner & simpler fault handling.
Many-core parallelism?

20. The abstract I promised to talk about
Database Operating Systems: Storage & Transactions
Database systems now use most of the technologies the research community developed over the last 3 decades: self-organizing data, non-procedural query processors, automatic parallelism, transactional storage and execution, self-tuning, and self-healing. After a period of linear evolution, database concepts and systems are undergoing rapid evolution and mutation, entering a synthesis with programming languages, with file systems, with networking, and with sensor networks. Files are being unified with other types and becoming first-class objects. The transaction model appears to be fundamental to the transactional memory needed to program multi-core systems in parallel. Workflow systems are now a reality. The long-heralded parallel database machine idea of data-flow programming has begun to bear fruit. Each of these new applications of our ideas raises new and challenging research questions.
Blue are undelivered promises. So it goes.
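
Editor's illustration (not part of the talk): the "Correct Solution: Temporal Databases" slide describes a model with no update and no delete, only inserts and reads at a given time, grouped into transactions, where every item carries a transaction time and a valid time. The sketch below is one possible minimal reading of that idea in Python. The names `Version`, `TemporalStore`, `commit`, and `read`, the integer logical clock, and the tie-breaking rule in `read` are all assumptions made for illustration; they do not come from the slides or from any real temporal database.

```python
from dataclasses import dataclass
from itertools import count
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Version:
    key: str
    value: Any
    valid_from: int  # valid time: when the fact became true in the modeled world
    tx_time: int     # transaction time: when the store recorded the fact


class TemporalStore:
    """Hypothetical append-only store: facts are inserted, never updated or deleted."""

    def __init__(self) -> None:
        self._log: List[Version] = []
        self._clock = count(1)  # assumed logical commit clock, not wall-clock time

    def commit(self, facts: Dict[str, Tuple[Any, int]]) -> int:
        """Insert a group of facts as one transaction sharing a single tx_time."""
        tx_time = next(self._clock)
        # Build the whole batch first so a malformed fact leaves the log untouched.
        batch = [
            Version(key, value, valid_from, tx_time)
            for key, (value, valid_from) in facts.items()
        ]
        self._log.extend(batch)
        return tx_time

    def read(self, key: str, valid_at: int, as_of_tx: Optional[int] = None) -> Any:
        """Value of `key` valid at `valid_at`, as the store knew it at `as_of_tx`."""
        candidates = [
            v for v in self._log
            if v.key == key
            and v.valid_from <= valid_at
            and (as_of_tx is None or v.tx_time <= as_of_tx)
        ]
        if not candidates:
            return None
        # Assumed rule: latest valid time wins; ties go to the latest recording.
        return max(candidates, key=lambda v: (v.valid_from, v.tx_time)).value


store = TemporalStore()
t1 = store.commit({"balance": (100, 10)})  # balance is 100 from valid time 10
t2 = store.commit({"balance": (120, 10)})  # later correction; the old fact is kept

current = store.read("balance", valid_at=15)                 # sees the correction
as_recorded = store.read("balance", valid_at=15, as_of_tx=t1)  # sees the original
before = store.read("balance", valid_at=5)                   # nothing valid yet
```

Because the correction is a new insert, the original fact is still readable by asking what the store knew as of the first transaction, which is the data-lineage property the slides say update-in-place discards.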