Chapter 23- Advanced Application Development.ppt

Slide 1: Chapter 23: Advanced Application Development
- Performance Tuning
- Performance Benchmarks
- Standardization
- E-Commerce
- Legacy Systems

Slide 2: Performance Tuning
- Adjusting various parameters and design choices to improve system performance for a specific application.
- Tuning is best done by identifying bottlenecks and eliminating them.
- Can tune a database system at 3 levels:
  - Hardware: e.g., add disks to speed up I/O, add memory to increase buffer hits, move to a faster processor.
  - Database system parameters: e.g., set the buffer size to avoid paging of the buffer, set checkpointing intervals to limit log size. The system may have automatic tuning.
  - Higher-level database design, such as the schema, indices and transactions (more later).

Slide 3: Bottlenecks
- Performance of most systems (at least before they are tuned) is usually limited by the performance of one or a few components: these are called bottlenecks.
- E.g., 80% of the code may take up 20% of the time, and 20% of the code may take up 80% of the time.
  - Worth spending most tuning effort on the 20% of the code that takes 80% of the time.
- Bottlenecks may be in hardware (e.g., disks are very busy while the CPU is idle) or in software.
- Removing one bottleneck often exposes another.
- De-bottlenecking consists of repeatedly finding bottlenecks and removing them.
  - This is a heuristic.

Slide 4: Identifying Bottlenecks
- Transactions request a sequence of services, e.g., CPU, disk I/O, locks.
- With concurrent transactions, a transaction may have to wait for a requested service while other transactions are being served.
- Can model the database as a queueing system with a queue for each service:
  - transactions repeatedly request a service, wait in the queue for that service, and get serviced.
- Bottlenecks in a database system typically show up as very high utilizations (and correspondingly, very long queues) of a particular service.
  - E.g., disk vs. CPU utilization.
  - 100% utilization leads to very long waiting times.
  - Rule of thumb: design the system for about 70% utilization at peak load; utilization over 90% should be avoided (see the sketch below).

Slide 5: Queues In A Database System
- (Figure of the queueing model; not reproduced in this text extraction.)
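To make the rule of thumb above concrete, here is a minimal sketch of the queueing behaviour it rests on. It assumes a single service (say, one disk) behaves like an M/M/1 queue, so mean response time is service time divided by (1 - utilization); the 10 ms service time is an illustrative figure, not a value from the slides.

```python
# Illustrative M/M/1 estimate: response time of a single service (e.g. one disk)
# grows sharply as its utilization approaches 100%.
# Assumption: mean response time R = S / (1 - U) for service time S, utilization U.

service_time_ms = 10.0  # hypothetical mean service time per request

for utilization in (0.5, 0.7, 0.9, 0.99):
    response_ms = service_time_ms / (1.0 - utilization)
    print(f"utilization {utilization:.0%}: mean response time {response_ms:.1f} ms")

# utilization 50%: mean response time 20.0 ms
# utilization 70%: mean response time 33.3 ms
# utilization 90%: mean response time 100.0 ms
# utilization 99%: mean response time 1000.0 ms
```

The steep growth past 90% is why the slide recommends sizing for roughly 70% utilization at peak load.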

Slide 6: Tunable Parameters
- Tuning of hardware
- Tuning of schema
- Tuning of indices
- Tuning of materialized views
- Tuning of transactions

Slide 7: Tuning of Hardware
- Even well-tuned transactions typically require a few I/O operations.
  - A typical disk supports about 100 random I/O operations per second.
  - Suppose each transaction requires just 2 random I/O operations. Then to support n transactions per second, we need to stripe data across n/50 disks (ignoring skew); see the worked example below.
- The number of I/O operations per transaction can be reduced by keeping more data in memory.
  - If all data is in memory, I/O is needed only for writes.
  - Keeping frequently used data in memory reduces disk accesses, reducing the number of disks required, but has a memory cost.
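A small calculation spelling out the disk-sizing arithmetic on the slide (about 100 random I/Os per second per disk, 2 random I/Os per transaction); the target transaction rate is a made-up workload.

```python
# Disk count from the slide's assumptions: each disk sustains ~100 random I/Os/s,
# each transaction needs 2 random I/Os, so one disk handles about 50 transactions/s.
import math

ios_per_disk_per_second = 100   # assumed random I/O capacity of one disk
ios_per_transaction = 2
target_tps = 1200               # hypothetical required transactions per second

tps_per_disk = ios_per_disk_per_second / ios_per_transaction   # 50 tps per disk
disks_needed = math.ceil(target_tps / tps_per_disk)            # i.e. n / 50, ignoring skew
print(f"{target_tps} tps needs data striped across about {disks_needed} disks")   # 24
```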

Slide 8: Hardware Tuning: Five-Minute Rule
- Question: which data to keep in memory?
  - If a page is accessed n times per second, keeping it in memory saves
    n * (price-per-disk-drive / accesses-per-second-per-disk).
  - The cost of keeping a page in memory is
    price-per-MB-of-memory / pages-per-MB-of-memory.
  - Break-even point: the value of n for which the two costs above are equal.
    - If the page is accessed more often than this, the saving is greater than the cost.
  - Solving the above equation with current disk and memory prices leads to the
    5-minute rule: if a randomly accessed page is used more frequently than once in 5 minutes, it should be kept in memory (by buying sufficient memory!).
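A sketch of the break-even calculation behind the five-minute rule. The disk and memory prices are rough, late-1990s-style placeholders chosen only to illustrate the arithmetic; plugging in other prices moves the break-even interval.

```python
# Break-even access rate n for keeping one randomly accessed page in memory.
# Saving of caching the page: n * (price_per_disk / accesses_per_second_per_disk)
# Cost of caching the page  : price_per_mb_memory / pages_per_mb_memory
# All prices below are illustrative assumptions, not figures from the slides.

price_per_disk = 500.0              # dollars per disk drive (hypothetical)
accesses_per_second_per_disk = 100  # random I/Os per second, as on the earlier slide
price_per_mb_memory = 5.0           # dollars per MB of memory (hypothetical)
pages_per_mb_memory = 256           # 4 KB pages

cost_per_cached_page = price_per_mb_memory / pages_per_mb_memory
saving_per_unit_access_rate = price_per_disk / accesses_per_second_per_disk
break_even_n = cost_per_cached_page / saving_per_unit_access_rate

print(f"break-even at {break_even_n:.4f} accesses/second, "
      f"i.e. about once every {1 / break_even_n:.0f} seconds")
# -> roughly once every 256 seconds (~4-5 minutes) with these prices: keep in
#    memory any randomly accessed page used more often than that.
```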

Slide 9: Hardware Tuning: One-Minute Rule
- For sequentially accessed data, more pages can be read per second. Assuming sequential reads of 1 MB of data at a time:
  - 1-minute rule: sequentially accessed data that is accessed once or more in a minute should be kept in memory.
- Prices of disk and memory have changed greatly over the years, but the ratios have not changed much.
  - So the rules remain the 5-minute and 1-minute rules, not 1-hour or 1-second rules!

Slide 10: Hardware Tuning: Choice of RAID Level
- Use RAID 1 or RAID 5?
  - Depends on the ratio of reads to writes.
  - RAID 5 requires 2 block reads and 2 block writes to write out one data block.
- If an application requires r reads and w writes per second:
  - RAID 1 requires r + 2w I/O operations per second.
  - RAID 5 requires r + 4w I/O operations per second.
- For reasonably large r and w, this requires lots of disks to handle the workload (see the comparison sketched below).
  - RAID 5 may require more disks than RAID 1 to handle the load!
  - The apparent saving in the number of disks with RAID 5 (by using parity, as opposed to the mirroring done by RAID 1) may be illusory!
- Rule of thumb: RAID 5 is fine when writes are rare and data is very large, but RAID 1 is preferable otherwise.
  - If you need more disks to handle the I/O load, just mirror them, since disk capacities these days are enormous!
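A small comparison that works through the slide's I/O formulas. The read/write rates and the per-disk I/O capacity are hypothetical numbers chosen to show how RAID 5 can end up needing more spindles than RAID 1.

```python
# I/O load comparison from the slide: RAID 1 needs r + 2w I/Os per second,
# RAID 5 needs r + 4w (2 block reads + 2 block writes per logical write).
# Workload numbers and per-disk capacity are hypothetical.
import math

r, w = 3000, 1500               # reads and writes per second (made-up workload)
ios_per_disk = 100              # random I/Os per second each disk can sustain

raid1_ios = r + 2 * w           # 6000
raid5_ios = r + 4 * w           # 9000

print("RAID 1:", raid1_ios, "I/O/s ->", math.ceil(raid1_ios / ios_per_disk), "disks")
print("RAID 5:", raid5_ios, "I/O/s ->", math.ceil(raid5_ios / ios_per_disk), "disks")
# RAID 5's parity saves capacity but can cost more spindles once writes dominate.
```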

Slide 11: Tuning the Database Design
- Schema tuning
  - Vertically partition relations to isolate the data that is accessed most often: only fetch the needed information.
    - E.g., split account into two relations, (account-number, branch-name) and (account-number, balance); branch-name need not be fetched unless required (see the sketch below).
  - Improve performance by storing a denormalized relation.
    - E.g., store the join of account and depositor; branch-name and balance information is repeated for each holder of an account, but the join need not be computed repeatedly.
    - Price paid: more space, and more work for the programmer to keep the relation consistent on updates.
    - Better to use materialized views (more on this later).
  - Cluster together on the same disk page records that would match in a frequently required join, to compute the join very efficiently when required.
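A minimal sqlite3 sketch of the vertical-partitioning example above, using the slide's account attributes; the partition names and data are invented for illustration.

```python
# Minimal sketch of the slide's vertical partitioning example using sqlite3.
# account(account_number, branch_name, balance) is split so that balance
# lookups never have to touch branch_name.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account_branch  (account_number TEXT PRIMARY KEY, branch_name TEXT);
    CREATE TABLE account_balance (account_number TEXT PRIMARY KEY, balance REAL);

    INSERT INTO account_branch  VALUES ('A-101', 'Downtown');
    INSERT INTO account_balance VALUES ('A-101', 500.0);
""")

# Hot path reads only the narrow balance partition...
print(conn.execute(
    "SELECT balance FROM account_balance WHERE account_number = 'A-101'").fetchone())

# ...and the full row is recovered by a join only when actually required.
print(conn.execute("""
    SELECT b.branch_name, c.balance
      FROM account_branch b JOIN account_balance c USING (account_number)
""").fetchone())
```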

Slide 12: Tuning the Database Design (Cont.)
- Index tuning
  - Create appropriate indices to speed up slow queries/updates (see the sketch below).
  - Speed up slow updates by removing excess indices (a trade-off between queries and updates).
  - Choose the type of index (B-tree/hash) appropriate for the most frequent types of queries.
  - Choose which index to make clustered.
- Index tuning wizards look at the past history of queries and updates (the workload) and recommend which indices would be best for that workload.
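A small sqlite3 sketch of index tuning: the same query is examined with EXPLAIN QUERY PLAN before and after creating an index. The exact plan text varies with the SQLite version, and the table and column names are illustrative.

```python
# Index tuning sketch: compare the query plan for a lookup before and after
# adding an index, using sqlite3's EXPLAIN QUERY PLAN.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, branch_name TEXT, balance REAL)")

query = "SELECT balance FROM account WHERE branch_name = ?"

print(conn.execute("EXPLAIN QUERY PLAN " + query, ("Downtown",)).fetchall())
# -> plan shows a full scan of account (slow for a large relation)

conn.execute("CREATE INDEX idx_account_branch ON account(branch_name)")
print(conn.execute("EXPLAIN QUERY PLAN " + query, ("Downtown",)).fetchall())
# -> plan now searches account using idx_account_branch (branch_name=?)
# Remember the trade-off: this index also slows every insert/update on account.
```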

Slide 13: Tuning the Database Design (Cont.)
- Materialized views
  - Materialized views can help speed up certain queries.
    - Particularly aggregate queries.
  - Overheads:
    - Space.
    - Time for view maintenance.
      - Immediate view maintenance: done as part of the update transaction; the time overhead is paid by the update transaction (see the sketch below).
      - Deferred view maintenance: done only when required; the update transaction is not affected, but system time is spent on view maintenance, and until it is refreshed the view may be out of date.
  - Preferable to a denormalized schema, since view maintenance is the system's responsibility, not the programmer's.
    - Avoids inconsistencies caused by errors in update programs.
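SQLite (used here only because it ships with Python) has no materialized views, so the sketch below fakes one with a summary table kept current by a trigger. This illustrates the immediate-maintenance flavour described above, with the inserting transaction paying the maintenance cost. Table and trigger names are invented, and only inserts are handled.

```python
# Simulated materialized view: branch_total holds the total balance per branch
# and is maintained immediately by a trigger on account.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (account_number TEXT PRIMARY KEY,
                          branch_name TEXT, balance REAL);

    -- "Materialized" aggregate: total balance per branch.
    CREATE TABLE branch_total (branch_name TEXT PRIMARY KEY, total_balance REAL);

    CREATE TRIGGER account_insert AFTER INSERT ON account
    BEGIN
        INSERT OR IGNORE INTO branch_total VALUES (NEW.branch_name, 0);
        UPDATE branch_total
           SET total_balance = total_balance + NEW.balance
         WHERE branch_name = NEW.branch_name;
    END;

    INSERT INTO account VALUES ('A-101', 'Downtown', 500.0);
    INSERT INTO account VALUES ('A-102', 'Downtown', 250.0);
""")

# The aggregate query is now a cheap lookup instead of a scan of account.
print(conn.execute("SELECT * FROM branch_total").fetchall())   # [('Downtown', 750.0)]
```

Deferred maintenance would instead refresh branch_total periodically or on demand, leaving it possibly stale in between.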

Slide 14: Tuning the Database Design (Cont.)
- How to choose the set of materialized views:
  - Helping one transaction type by introducing a materialized view may hurt others.
  - The choice of materialized views depends on costs.
    - Users often have no idea of the actual cost of operations.
  - Overall, manual selection of materialized views is tedious.
- Some database systems provide tools to help the DBA choose views to materialize: "materialized view selection wizards".

Slide 15: Tuning of Transactions
- Basic approaches to tuning of transactions:
  - Improve set orientation.
  - Reduce lock contention.
- Rewriting queries to improve performance was important in the past, but smart optimizers have made this less important.
- Communication overhead and query handling overhead are a significant part of the cost of each call.
  - Combine multiple embedded SQL/ODBC/JDBC queries into a single set-oriented query.
    - Set orientation means fewer calls to the database.
    - E.g., tune a program that computes the total salary for each department using a separate SQL query per department by instead using a single query that computes total salaries for all departments at once (using group by); see the sketch below.
  - Use stored procedures: avoids re-parsing and re-optimization of queries.
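A sqlite3 sketch of the department-salary example above: the untuned version issues one query per department, the tuned version computes every total in a single group-by query. Schema and data are made up.

```python
# Set orientation sketch: one GROUP BY query replaces a round trip per department.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, dept TEXT, salary REAL);
    INSERT INTO employee VALUES ('Ann', 'Sales', 100), ('Bob', 'Sales', 120),
                                ('Cho', 'IT',    150);
""")

# Tuning target: one query (and one client/server round trip) per department.
depts = [d for (d,) in conn.execute("SELECT DISTINCT dept FROM employee")]
totals_slow = {d: conn.execute(
    "SELECT SUM(salary) FROM employee WHERE dept = ?", (d,)).fetchone()[0]
    for d in depts}

# Tuned version: a single set-oriented query computes every total at once.
totals_fast = dict(conn.execute(
    "SELECT dept, SUM(salary) FROM employee GROUP BY dept"))

assert totals_slow == totals_fast   # same answer, far fewer calls to the database
```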

Slide 16: Tuning of Transactions (Cont.)
- Reducing lock contention
- Long transactions (typically read-only) that examine large parts of a relation result in lock contention with update transactions.
  - E.g., a large query to compute bank statistics versus regular bank transactions.
- To reduce contention:
  - Use multi-version concurrency control.
    - E.g., Oracle "snapshots", which support multi-version 2PL.
  - Use degree-two consistency (cursor stability) for long transactions.
    - Drawback: the result may be approximate.

Slide 17: Tuning of Transactions (Cont.)
- Long update transactions cause several problems:
  - They exhaust lock space.
  - They exhaust log space.
    - They also greatly increase recovery time after a crash, and may even exhaust log space during recovery if the recovery algorithm is badly designed!
- Use mini-batch transactions to limit the number of updates that a single transaction can carry out (sketched below). E.g., if a single large transaction updates every record of a very large relation, the log may grow too big.
  - Split the large transaction into a batch of mini-transactions, each performing part of the updates.
  - Hold locks across transactions in a mini-batch to ensure serializability.
    - If lock table size is a problem, locks can be released, but at the cost of serializability.
  - In case of failure during a mini-batch, its remaining portion must be completed on recovery, to ensure atomicity.
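A sketch of the mini-batch idea using sqlite3: a hypothetical interest-posting update over 10,000 accounts is committed in chunks of 1,000 rather than as one huge transaction. Batch size and the update itself are illustrative, and as the slide notes, chunked commits trade away serializability of the overall update.

```python
# Mini-batch sketch: instead of updating every row of a huge relation in one
# transaction (one huge log/lock footprint), commit the update in chunks.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [(i, 100.0) for i in range(10_000)])
conn.commit()

BATCH = 1_000
for low in range(0, 10_000, BATCH):
    with conn:                      # each mini-transaction commits (or rolls back) alone
        conn.execute(
            "UPDATE account SET balance = balance * 1.05 "
            "WHERE account_number >= ? AND account_number < ?",
            (low, low + BATCH))
# On a crash, only the current mini-batch is rolled back; recovery must remember
# which batches are done (e.g. in a progress table) and finish the rest.
```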

Slide 18: Performance Simulation
- Performance simulation using a queueing model is useful to predict bottlenecks as well as the effects of tuning changes, even without access to the real system.
- Queueing model as we saw earlier:
  - models activities that go on in parallel.
- The simulation model is quite detailed, but usually omits some low-level details.
  - Model service time, but disregard details of the service.
  - E.g., approximate disk read time by using an average disk read time.
- Experiments can be run on the model, and provide estimates of measures such as average throughput/response time.
- Parameters can be tuned in the model and then replicated in the real system.
  - E.g., number of disks, memory, algorithms, etc.
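A toy simulation in the spirit of this slide: a single service (a disk) is modeled as a FIFO queue fed by random arrivals, and the run estimates mean response time. The arrival rate and service time are made-up parameters; at about 70% utilization the result should roughly agree with the earlier analytic queueing sketch.

```python
# Toy performance simulation: model one service (a disk) as a FIFO queue with
# random arrivals and service times, and estimate the mean response time.
import random

random.seed(1)
arrival_rate = 70.0        # requests per second offered to the disk (assumed)
service_time = 0.010       # 10 ms average service time, so capacity is ~100/s

clock = free_at = 0.0
total_response = 0.0
n = 100_000
for _ in range(n):
    clock += random.expovariate(arrival_rate)          # next arrival time
    start = max(clock, free_at)                        # wait if the disk is busy
    service = random.expovariate(1.0 / service_time)   # this request's service time
    free_at = start + service
    total_response += free_at - clock

print(f"~{arrival_rate * service_time:.0%} utilization, "
      f"mean response {1000 * total_response / n:.1f} ms")
```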

Slide 19: Performance Benchmarks
- Suites of tasks used to quantify the performance of software systems.
- Important in comparing database systems, especially as systems become more standards compliant.
- Commonly used performance measures:
  - Throughput (transactions per second, or tps).
  - Response time (delay from submission of a transaction to return of its result).
  - Availability or mean time to failure.

Slide 20: Performance Benchmarks (Cont.)
- Suites of tasks are used to characterize performance:
  - a single task is not enough for complex systems.
- Beware when computing the average throughput of different transaction types.
  - E.g., suppose a system runs transaction type A at 99 tps and transaction type B at 1 tps.
  - Given an equal mixture of types A and B, the throughput is not (99 + 1)/2 = 50 tps.
  - Running one transaction of each type takes time 1 + .01 seconds, giving a throughput of 1.98 tps.
  - To compute average throughput, use the harmonic mean:
    n / (1/t1 + 1/t2 + ... + 1/tn)
  - Interference (e.g., lock contention) makes even this incorrect if different transaction types run concurrently.
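A two-line check of the slide's arithmetic: averaging 99 tps and 1 tps with the arithmetic mean suggests 50 tps, while the harmonic mean gives the correct ~1.98 tps for an equal mix.

```python
# Reproduces the slide's throughput arithmetic for an equal mix of two
# transaction types running at 99 tps and 1 tps.
rates = [99.0, 1.0]                                        # tps of each transaction type

arithmetic_mean = sum(rates) / len(rates)                  # 50.0 -- misleading
harmonic_mean = len(rates) / sum(1.0 / t for t in rates)   # n / (1/t1 + ... + 1/tn)

print(arithmetic_mean, round(harmonic_mean, 2))            # 50.0 1.98
```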

Slide 21: Database Application Classes
- Online transaction processing (OLTP): requires high concurrency and clever techniques to speed up commit processing, to support a high rate of update transactions.
- Decision support applications, including online analytical processing (OLAP) applications: require good query evaluation algorithms and query optimization.
- The architecture of some database systems is tuned to one of the two classes.
  - E.g., Teradata is tuned to decision support.
- Others try to balance the two requirements.
  - E.g., Oracle, with snapshot support for long read-only transactions.

Slide 22: Benchmark Suites
- The Transaction Processing Council (TPC) benchmark suites are widely used.
  - TPC-A and TPC-B: simple OLTP applications modeling a bank teller application, with and without communication.
    - Not used anymore.
  - TPC-C: a complex OLTP application modeling an inventory system.
    - The current standard for OLTP benchmarking.

Slide 23: Benchmark Suites (Cont.)
- TPC benchmarks (cont.):
  - TPC-D: a complex decision support application.
    - Superseded by TPC-H and TPC-R.
  - TPC-H (H for ad hoc): based on TPC-D with some extra queries.
    - Models ad hoc queries which are not known beforehand.
      - A total of 22 queries with emphasis on aggregation.
    - Prohibits materialized views.
    - Permits indices only on primary and foreign keys.
  - TPC-R (R for reporting): same as TPC-H, but without any restrictions on materialized views and indices.
  - TPC-W (W for Web): an end-to-end Web service benchmark modeling a Web bookstore, with a combination of static and dynamically generated pages.

Slide 24: TPC Performance Measures
- TPC performance measures:
  - transactions per second with specified constraints on response time;
  - transactions per second per dollar, which accounts for the cost of owning the system.
- TPC benchmarks require database sizes to be scaled up with increasing transactions-per-second.
  - Reflects real-world applications, where more customers mean a larger database and more transactions per second.
- External audit of TPC performance numbers is mandatory.
  - TPC performance claims can be trusted.

Slide 25: TPC Performance Measures
- Two types of tests for TPC-H and TPC-R:
  - Power test: runs queries and updates sequentially, then takes the mean to find queries per hour.
  - Throughput test: runs queries and updates concurrently.
    - Multiple streams run in parallel, each generating queries, with one parallel update stream.
  - Composite query-per-hour metric: the square root of the product of the power and throughput metrics (see the sketch below).
  - Composite price/performance metric.
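A tiny illustration of the composite metrics; the power and throughput figures and the system price are invented, and only the square-root-of-the-product formula comes from the slide.

```python
# Composite TPC-H/TPC-R style metric: geometric mean of the power and
# throughput results, plus a price/performance figure.
import math

power_qph = 1200.0        # queries per hour from the power (sequential) test, assumed
throughput_qph = 800.0    # queries per hour from the throughput (concurrent) test, assumed
system_price = 250_000.0  # hypothetical total cost of ownership in dollars

composite_qph = math.sqrt(power_qph * throughput_qph)
print(f"composite = {composite_qph:.0f} queries/hour, price/performance = "
      f"${system_price / composite_qph:.2f} per query/hour")
```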

Slide 26: Other Benchmarks
- OODB transactions require a different set of benchmarks.
  - The OO7 benchmark has several different operations, and provides a separate benchmark number for each kind of operation.
  - Reason: it is hard to define what a typical OODB application is.
- Benchmarks for XML are being discussed.

Slide 27: Standardization
- The complexity of contemporary database systems and the need for their interoperation require a variety of standards:
  - syntax and semantics of programming languages;
  - functions in application program interfaces;
  - data models (e.g., object-oriented/object-relational databases).
- Formal standards are standards developed by a standards organization (ANSI, ISO), or by industry groups, through a public process.
- De facto standards are generally accepted as standards without any formal process of recognition.
  - Standards defined by dominant vendors (IBM, Microsoft) often become de facto standards.
  - De facto standards often go through a formal process of recognition and become formal standards.

Slide 28: Standardization (Cont.)
- Anticipatory standards lead the marketplace, defining features that vendors then implement.
  - Ensure compatibility of future products.
  - But at times they become very large and unwieldy, since standards bodies may not pay enough attention to ease of implementation (e.g., SQL-92 or SQL:1999).
- Reactionary standards attempt to standardize features that vendors have already implemented, possibly in different ways.
  - It can be hard to convince vendors to change already implemented features. E.g., OODB systems.

Slide 29: SQL Standards History
- SQL was developed by IBM in the late 70s/early 80s.
- SQL-86 was the first formal standard.
- IBM SAA standard for SQL in 1987.
- SQL-89 added features to SQL-86 that were already implemented in many systems.
  - It was a reactionary standard.
- SQL-92 added many new features to SQL-89 (an anticipatory standard).
  - Defines levels of compliance (entry, intermediate and full).
  - Even now, few database vendors have a full SQL-92 implementation.

Slide 30: SQL Standards History (Cont.)
- SQL:1999
  - Adds a variety of new features: extended data types, object orientation, procedures, triggers, etc.
  - Broken into several parts:
    - SQL/Framework (Part 1): overview.
    - SQL/Foundation (Part 2): types, schemas, tables, query/update statements, security, etc.
    - SQL/CLI (Call Level Interface) (Part 3): API interface.
    - SQL/PSM (Persistent Stored Modules) (Part 4): procedural extensions.
    - SQL/Bindings (Part 5): embedded SQL for different embedding languages.
