Internal Schema Design,Performance and Indexing.ppt

资源描述

1、Internal Schema Design, Performance and Indexing,CS2312,Internal Schema Design,Data Blocks,Oracle manages data in datafiles as data blocks the smallest unit of I/O used by a database. the smallest units of storage that Oracle can use or allocate.,In contrast, at the physical, operating system level,

2、 all data is stored in bytes. Each operating system has what is called a block size. Oracle requests data in multiples of Oracle data blocks, not operating system blocks. Set the data block size for each Oracle database when you create the database as a multiple of the operating systems block size w

3、ithin the maximum (port-specific) limit to avoid unnecessary I/O.,Performance Profiling,Query Profilefrequency of certain querieshit rate on relationscertain relations used togetherselection attributes Update Profiledynamic or statichit rate of certain updatespredictablepre-fetch strategiesAnalysis

4、and Monitoring using tuning tools,Performance: Joins with Composite FK.,Flight(flightcode, ukairport,holairport, depday, deptime.) Hotel(hotelid, hotelname,.) GenericPackage(flightcode, hotelid, reservedrooms, reservedseats) SpecificPackage(flightcode, hotelid, depdate,availseats,availrooms, ) Booki

5、ng(bookingid, contact, flightcode, hotelid, depdate, noofpeople,noofrooms,)GenericPackage(gpackid,flightcode, hotelid, reservedrooms, reservedseats) SpecificPackage(spackid,gpackid, depdate, availseats, availrooms, ) Booking(bookingid, contact, spackid, noofpeople,noofrooms,),Performance: Frequent J

6、oins,Department,Staff,worksfor,deptno,roomno,staffno,staffname,1,m,Dept(deptno, deptname)Staff(staffno, staffname, roomno, deptno),deptname,1. Denormalise to 2NF,Replicate attribute valuesStaff(staffno, staffname, roomno, deptno, deptname) staffno staffname, roomno deptno deptname,2. Physically stor

7、ing a file resulting from a join,Materialised View Update integrity management,File Organisation,Organisation of the data of a file into records, blocks and access structuresOrganisation Unordered records Ordered records Hashing,File Organisation: Unordered Records,Place records in the order they ar

8、e inserted. New records inserted at the end of the file HEAP / PILE,File Organisation: Ordered Records,Physically order the records of a file on disk based on values of one of the fields ordering field / ordering key,Overflow Blocks,3. Inter-file clustering,Store records that are logically related a

9、nd frequently used together physically close together on disk,cluster applied across multiple files e.g. frequently access depts with their staffTherefore interleave Dept and Staff,1, Comp Sci,10, Goble, 2.82,23, Paton, 2.83,2, Maths,31, Smith, 1.100,49, Leuder, 1.23,Oracle Inter file Clustering,Cre

10、ate a cluster named PERSONNEL with the cluster key column DEPTNO CREATE CLUSTER personnel ( deptno NUMBER(2) ) ;Add the STAFF and DEPT tables to the cluster:,CREATE TABLE staff (staffno NUMBER PRIMARY KEY, staffname VARCHAR2(10) roomno NUMBER(2,3), deptno NUMBER(2) NOT NULL ) CLUSTER personnel (dept

11、no);,CREATE TABLE dept (deptno NUMBER(2), deptname VARCHAR2(9) CLUSTER personnel (deptno);,Cluster key,Intra file clustering,cluster around a clustering field in a single stored file e.g. frequently access STUDENT by yearcreate table student (studentno number(8) primary key, givenname char(20), surn

12、ame char(20), hons char(3), tutorid number(4), yearno number(1) not null, cluster year(yearno), );,4. Construct Access Structures for the join attributes,Access Structures / Access Paths Indexes Multi-level indexes B trees, B+ trees Hashing BitmapAccess Methods routines that allow operations to be a

13、pplied to a file,Primary Index,Data File,Index File,Data file is physically ordered on a key field (ordering key field),Oracle: Index-organized Tables,create table student (studentno number(8) primary key, givenname char(20), surname char(20), hons char(3), tutorid number(4), yearno number(1) not nu

14、ll, ORGANIZATION INDEX TABLESPACE students OVERFLOW TABLESPACE students_overflow;,Clustering Index,Data file is physically ordered on a non-key field (clustering field),Index File,Data File,Clustering Index in Oracle,The following statement creates a cluster index on the cluster key of PERSONNEL: CR

15、EATE INDEX idx_personnel ON CLUSTER personnel; After creating the cluster index, you can insert rows into either the STAFF or DEPT tables.,Clustering Index with Separate Blocks,Separate blocks for each group of records with the same cluster field value,Index File,Data File,Dense Secondary Index on a

16、 non-ordering key field,Index field value,Block pointer,Index File,Data File,Oracle: Create Index,create table enrol (studno number(8), courseno char(5), primary key (studno, courseno), );CREATE INDEX enrol-idx1 ON enrol (studno, courseno);CREATE INDEX enrol-idx2 ON enrol (courseno, studno);,Seconda

17、ry Index on a non-key field,Field value,Block pointer,Blocks of record pointers,Data File,Indexing field,Index File,Dense & Sparse Indexes,Dense Index Every record in the data file is in the index Sparse Index Not every record in the data file is in the index. The index indicates the block of record

18、s takes less space quicker to scan the index efficient but.no existence test based on the index A file can have one sparse index and many dense indexes, because a sparse index relies on a unique physical ordering of the data file on disk,Types of Keys,Unordered data files lots of secondary indexes S

19、pecify ordering attribute for file primary / clustering index attribute used most often for joins,Analysing database queries and transactions,Each queryfiles that will be accessedfields whose value is retrieved access pathsfields specified in selection conditions access pathsfields specified in join

20、s access paths Each update transactionfiles that will be updatedtype of update operation on each filefields used in selection conditionsfields whose value is modified avoid access structure,Analysing database queries and transactions,Expected frequency of invocation of queries and transactions expec

21、ted frequency of each field as a selection field or join field over all transactions expected frequency of retrieving and /or updating each recordAnalysing time constraints of queries and transactions stringent performance constraints influence access paths on selection fieldsAnalysing expected freq

22、uency of update operations volatile files reduce number of access paths,Types and Properties of Indexes,Index Summary,Speeds up retrieval but slows down inserts and updates Improve performance when relations are large queries extract 25% of all tuple in a relation a where clause is properly construc

23、tedTwo main considerations: 1. Organisation 2. Access sequential range queries direct criteria queries existence tests,Data Definition: Create Table,create table year (yearno number(1) primary key, yeartutorid number(4), yeartut_uk unique exceptions into bad_tutors using index not null constraint tu

24、t_fk foreign key (yeartutorid) references staff(staffid) tablespace cags_course storage (initial 6144 next 6144 minextents 1 maxextents 5 pctincrease 5 pctfree 20);,Multi-leveled indexes: an index for an index,Index has bi blocks bfri = blocking factor for the index bfri = fan-out = fo,2nd (top) lev

25、el,First (base) level,Data File,Primary Key Field,Tree Indexes,Order a measure of the number of indexing field values at each node Depth number of levels,A,C,D,G,H,I,K,B,E,F,J,Root node,Subtree for node B,B-trees Balanced Trees,Every leaf is at the same level Ordered - Search time O(log(n) Predictab

26、le search time Efficiency - each node = block A key value is stored once, with the address of the associated data record,B trees Order p,1. at least (p-1)/2 and at most p-1 key values at each internal node and leaf internal nodes and leaves must always be at least half full (or half empty) 2. the ro

27、ot must contain at least one key and 2 pointers (thus 2 child nodes) unless its a leaf cant have an empty start point for a non-null tree 3. for k key values at a node, there will be k+1 pointers to child nodes a node must have between p/2 and p tree pointers (child nodes),B tree,1,4,5,1,1,1,7,1,0,1

28、2,3,2,5,0,3,5,5,3,6,0,5,2,5,4,6,2,B trees,Predictable search pattern at most X node comparisons for a tree of X levels Dense index addresses record location index value can lie anywhere in the tree Cost maintenance but ? Sequential access ? Range queries ? Sparse index,B+ trees,Amendment to B tree:

29、 addresses for data records are in the leaves and no where else,B+ trees,32,7,54,53,51,52,50,35,14,11,12,12,14,32,35,50,51,52,53,54,60,10,11,=,B+ trees,1. Each node has at most p-1 comparisons 2. Number of nodes visited is limited by the depth of the tree A tree with k levels has at most (p)(k-1) le

30、aves at least (p/2)(k-1) leaves Each leaf has p/2 addresses if dense or 1 block of n if sparse,Sparse / Dense B+ Trees,Donna,Brian,Bruce,Paul,40,60,20,2,70,21,12,47,59,23,60,79,Donna,21,Brian,Bruce,Claire,Marcia,Paul,Tim,Aaron,47,23,59,60,79,12,2,Data blocks and indexes,Sparse Primary Index,Dense (S

31、econdary) Index,B+ trees,Sequential and direct access Straightforward insertion and deletion maintaining ordering Grows as requiredonly as big as needs to be Predictable scanning pattern Predictable and constant search time but maintenance overhead overkill for small static files duplicate keys? rel

32、ies on random distribution of key values for efficient insertion,Data Blocks, Extents, and Segments,Data stored in data blocks (also called logical blocks, Oracle blocks, or pages). One data block corresponds to a specific number of bytes of physical database space on disk An extent is a specific nu

33、mber of contiguous data blocks allocated for storing a specific type of information. A segment is a set of extents that have been allocated for a specific type of data structure. Each tables data is stored in its own data segment, while each indexs data is stored in its own index segment.,Tablespace

34、s and Datafiles,An Oracle database is divided into one or more logical storage units called tablespaces. The databases data is collectively stored in the databases tablespaces. Each tablespace in an Oracle database consists of one or more files called datafiles. These are physical structures that co

35、nform with the operating system in which Oracle is running. A databases data is collectively stored in the datafiles that constitute each tablespace of the database. The simplest Oracle database would have one tablespace and one datafile. A more complicated database might have three tablespaces, eac

36、h consisting of two datafiles (for a total of six datafiles).,Tablespaces and Datafiles,Why bother with tablespaces?,Uses tablespaces to: control disk space allocation for database data assign specific space quotas for database users control availability of data by taking individual tablespaces onli

37、ne or offline perform partial database backup or recovery operations allocate data storage across devices to improve performance Different functions System tablespace Temporary tablespaces User tablespaces Read-only table spaces,Example,create table year (yearno number(1) primary key, yeartutorid nu

38、mber(4), yeartut_uk unique not null constraint tut_fk foreign key (yeartutorid) references staff(staffid) tablespace secondyr_course storage (initial 6144 next 6144 minextents 1 maxextents 5 pctincrease 5 pctfree 20);,Partitioned Tables in Oracle,Supports very large tables and indexes by allowing us

39、ers to decompose them into smaller and more manageable pieces called partitions. Each partition is stored in a separate segment and you can store each partition in a separate tablespace, so: contain the impact of damaged data. back up and recover each partition independently. balance I/O load by map

40、ping partitions to disk drives. Useful for: Very Large Databases (VLDBs) Reducing Downtime for Scheduled Maintenance Reducing Downtime Due to Data Failures DSS Performance I/O Performance Disk Striping: Performance vs. Availability,Example,A sales table contains historical data divided by week numbe

41、r into 13 four-week partitions. CREATE TABLE sales ( acct_no NUMBER(5), acct_name CHAR(30), amount_of_sale NUMBER(6), week_no INTEGER ) PARTITION BY RANGE ( week_no ) . (PARTITION VALUES LESS THAN ( 4) TABLESPACE ts0, PARTITION VALUES LESS THAN ( 8) TABLESPACE ts1, . PARTITION VALUES LESS THAN ( 52

42、) TABLESPACE ts12 );,Hashing: Hash Clusters,Physically store the rows of a table in a hash cluster and retrieve them according to the results of a hash function. A hash function generates a distribution of numeric values, called hash values, which are based on specific cluster key values. The key of

43、 a hash cluster can be a single column or composite key. To find or store a row in a hash cluster, apply the hash function to the rows cluster key value; the resulting hash value corresponds to a data block in the cluster, which you then reads or writes on behalf of the issued statement.,Hashing Exa

44、mple,Hash functionmod ( hash key prime number )Collisions Rehash functionsOracle internal hash function user defined hash function,Hashing vs Indexing,Choice of Hashing,If a key attribute is used mainly for equality selection and join Nothing depends on layout order of data file Data files are static and of known size,

展开阅读全文