85 - Have you learned any of these SQL tuning tips?
2022-06-22 21:23:00 【Tiger Liu】
The following article comes from a WeChat official account. Its author collected SQL optimization "tips" from early Oracle versions and gave the article a very attractive title.
I read the article carefully and found that many of its statements are seriously out of date or simply wrong in theory. If such an article goes uncorrected and keeps being forwarded, it will mislead many DBAs and SQL developers. If these so-called knowledge points do not lead you astray, count it as my loss.
The original article follows, together with my comments. My comments start with "tiger:" and were shown in red in the original post.
< This article covers all SQL tuning tips >
1. Choose the right ORACLE optimizer
ORACLE has three optimizers:
a. RULE (rule-based)
b. COST (cost-based)
c. CHOOSE (selective)
The default optimizer is set through the OPTIMIZER_MODE parameter in init.ora, with values such as RULE, COST, CHOOSE, ALL_ROWS and FIRST_ROWS. You can of course override it at the SQL statement level or the session level.
To use the cost-based optimizer (CBO, Cost-Based Optimizer), you must run the analyze command regularly to keep the object statistics in the database accurate.
If the optimizer mode of the database is set to CHOOSE, the actual optimizer mode depends on whether the analyze command has been run.
If a table has been analyzed, the optimizer mode automatically becomes CBO; otherwise the database uses the RULE optimizer.
By default, ORACLE uses the CHOOSE optimizer. To avoid unnecessary full table scans, you should avoid the CHOOSE optimizer and use the cost-based optimizer directly.
tiger:
Starting with Oracle 10g (released in 2003), the default and recommended OPTIMIZER_MODE is ALL_ROWS (CBO), and Oracle declared that the obsolete RBO is no longer supported. This tip describes 9i and earlier versions, which date back more than 18 years. Outdated knowledge points should not be brought up unless you want to study antique databases. Technical articles should keep pace with the times.
Just remember that today's optimizer is the smart CBO, not the outdated RBO. Both terms appear again in the comments below.
2. Ways to access a table
ORACLE accesses the records in a table in two ways:
a. Full table scan. A full table scan accesses every record in the table sequentially. ORACLE optimizes full table scans by reading multiple data blocks at a time.
b. Access by ROWID. You can use ROWID-based access to make table access more efficient. The ROWID contains the physical location of a record in the table, and ORACLE uses indexes (INDEX) to link data to its physical location (ROWID). Indexes usually provide a fast way to reach a ROWID, so queries based on indexed columns gain performance.
tiger:
It is wrong to assume that using an index is always more efficient than a full table scan. An index suits access to a small number of records in a table; if you need most of the records in the table, going through the index is inefficient.
Suppose table t has a status column with only two values, 0 and 1. Few records have the value 0, most records have the value 1, and an index is created on the status column. Then select * from t where status=0 is efficient through the index, while select * from t where status=1 is inefficient through the index and is better served by a full table scan.
Addendum: the old RBO uses an index whenever one exists, efficient or not, whereas the CBO uses it selectively: it chooses the index for status=0 and does not choose it for status=1. This is one of the reasons the RBO was retired.
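The skew in the status example can be made concrete with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle, and the row counts (10 rows with status 0, 990 with status 1) are made up for illustration. It only computes the fraction of the table each predicate would return, which is the kind of statistic a cost-based optimizer relies on; it does not reproduce Oracle's cost model.

```python
import sqlite3

# Hypothetical table t: 10 rows with status=0 and 990 rows with status=1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, status INTEGER)")
conn.executemany(
    "INSERT INTO t (status) VALUES (?)",
    [(0 if i < 10 else 1,) for i in range(1000)],
)

total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# Fraction of the table that each predicate "status = <value>" would return.
selectivity = {
    status: count / total
    for status, count in conn.execute(
        "SELECT status, COUNT(*) FROM t GROUP BY status"
    )
}

for status, fraction in sorted(selectivity.items()):
    print(f"status={status}: {fraction:.1%} of rows")
```

A predicate that returns about 1% of the rows is a natural candidate for index access, while one that returns about 99% reads nearly the whole table anyway.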
3. Share SQL statements
To avoid parsing the same SQL statement repeatedly, ORACLE keeps the statement in memory after the first parse.
This memory, located in the shared pool (Shared Buffer Pool) of the system global area SGA (System Global Area), can be shared by all database users.
Therefore, when you execute a SQL statement (sometimes called a cursor) that is exactly the same as a previously executed statement, ORACLE can quickly obtain the parsed statement and its best execution path.
This feature greatly improves SQL execution performance and saves memory. Unfortunately, ORACLE provides this caching (cache buffering) only for simple tables; it does not apply to multi-table join queries.
The database administrator must set appropriate parameters for this area in init.ora. The larger the memory area, the more statements it can hold and the more likely they are to be shared.
When you submit a SQL statement to ORACLE, ORACLE first looks for the same statement in this memory.
Note that ORACLE matches the two strictly: to be shared, the SQL statements must be exactly identical (including spaces, line breaks and so on).
A shared statement must satisfy three conditions:
A. Character-level comparison: the statement being executed must be identical to the statement in the shared pool. For example:
SELECT * FROM EMP;
differs from each of the following:
SELECT * from EMP;
Select * From Emp;
SELECT  *  FROM  EMP;
B. The two statements must refer to exactly the same objects.
C. The two SQL statements must use bind variables (bind variables) with the same names. For example, the two statements in the first group are identical (and can be shared), while the two statements in the second group are different (even if the different bind variables are assigned the same value at run time).
a.
select pin, name from people where pin = :blk1.pin;
select pin, name from people where pin = :blk1.pin;
b.
select pin, name from people where pin = :blk1.ot_ind;
select pin, name from people where pin = :blk1.ov_ind;
tiger:
In one sentence: for OLTP workloads, to share SQL and reduce the time and resources spent on hard parsing, use bind variables where reasonable. Note the word "reasonable": bind variables are not recommended for every variable.
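As a minimal sketch of the difference between literals and bind variables, the snippet below uses Python's sqlite3 module in place of an Oracle client; the people table and its rows are hypothetical. With literals, every value produces a new statement text, so a database that matches statements character by character sees three different statements. With a bind variable, the text stays the same and only the bound value changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (pin INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cy")],
)

pins = [1, 2, 3]

# Literal values: every call sends a different statement text.
literal_texts = {f"SELECT name FROM people WHERE pin = {pin}" for pin in pins}
literal_names = [conn.execute(text).fetchone()[0] for text in sorted(literal_texts)]

# Bind variable: one statement text, the value is supplied separately.
bound_text = "SELECT name FROM people WHERE pin = :pin"
bound_names = [conn.execute(bound_text, {"pin": pin}).fetchone()[0] for pin in pins]

print(len(literal_texts), "distinct texts with literals; 1 with a bind variable")
```

Both variants return the same names; only the number of distinct statement texts differs, and that number is what drives hard parsing.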
4. Choose the most efficient table name order (works only with the rule-based optimizer)
The ORACLE parser processes the table names in the FROM clause from right to left, so the table written last in the FROM clause (the base table, or driving table) is processed first.
When the FROM clause contains more than one table, you must choose the table with the fewest records as the base table.
When ORACLE processes multiple tables, it joins them by sort and merge. First it scans the first table (the last table in the FROM clause) and sorts its records, then it scans the second table (the second-to-last table in the FROM clause), and finally it merges all records retrieved from the second table with the matching records from the first table.
For example, table TAB1 has 16,384 records and table TAB2 has 1 record.
Choosing TAB2 as the base table (the best way):
select count(*) from tab1,tab2   execution time 0.96 seconds
Choosing TAB1 as the base table (a bad way):
select count(*) from tab2,tab1   execution time 26.09 seconds
If a query joins three or more tables, choose the intersection table (intersection table) as the base table. The intersection table is the table that is referenced by the other tables.
For example, the EMP table describes the intersection of the LOCATION table and the CATEGORY table.
SELECT * FROM LOCATION L, CATEGORY C, EMP E WHERE E.EMP_NO BETWEEN 1000 AND 2000 AND E.CAT_NO = C.CAT_NO AND E.LOCN = L.LOCN
is more efficient than the following SQL:
SELECT *
FROM EMP E, LOCATION L ,CATEGORY C
WHERE E.CAT_NO = C.CAT_NO AND E.LOCN = L.LOCN
AND E.EMP_NO BETWEEN 1000 AND 2000;
tiger:
These rules (RBO rules) have been out of date for a long time; hardly any production system still runs 9i or an earlier version. In 10g and later versions, where CBO is the default, the tables can be written in any order, and the optimizer automatically chooses the join order and join method based on statistics.
Because such outdated articles exist, many people still ask me this question, since many developers do not know what a rule-based optimizer is.
5. Join order in the WHERE clause
ORACLE parses the WHERE clause bottom-up. By this principle, joins between tables must be written before the other WHERE conditions, and the conditions that filter out the largest number of records must be written at the end of the WHERE clause.
For example:
(Inefficient, execution time 156.3 seconds)
SELECT ... FROM EMP E WHERE SAL > 50000 AND JOB = 'MANAGER' AND 25 < (SELECT COUNT(*) FROM EMP WHERE MGR=E.EMPNO);
(Efficient, execution time 10.6 seconds)
SELECT ... FROM EMP E WHERE 25 < (SELECT COUNT(*) FROM EMP WHERE MGR=E.EMPNO) AND SAL > 50000 AND JOB = 'MANAGER';
tiger:
As with item 4, under CBO the predicates can be written in any order. The "efficient" form above may also be inefficient, because its WHERE clause contains a scalar subquery.
6. Avoid using '*' in the SELECT clause
When you want to list all COLUMNs in the SELECT clause, referencing them with the dynamic SQL column reference '*' is convenient. Unfortunately, it is a very inefficient approach.
In fact, while parsing, ORACLE converts '*' into all the column names one by one by querying the data dictionary, which takes more time.
tiger: There are many reasons not to use select *, but the reason given above is not one of them. I have summarized the reasons that affect performance as follows:
- 1. Exadata storage nodes support column projection. Fewer columns mean less data sent to the compute nodes, which lightens the work of the compute nodes.
- 2. If the table contains LOB columns that you do not need to process, select * causes a lot of redundant physical reads and network traffic (and a query that returns LOB columns cannot use materialize).
- 3. In hash join and merge join, more columns occupy more PGA memory, which has a large impact on performance.
- 4. The query may miss a covering index, which affects performance even more.
- 5. Network traffic from server to client grows; the more data is transferred, the lower the efficiency.
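Reason 4 (missing a covering index) can be illustrated with a small sketch. It uses SQLite through Python's sqlite3 module, not Oracle, so the plan text is SQLite's own; the emp table, its columns and the index name emp_dept_idx are hypothetical. The idea carries over: when every selected column is in the index, the table itself does not have to be visited.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (emp_no INTEGER, dept_no INTEGER, ename TEXT, notes TEXT)"
)
conn.execute("CREATE INDEX emp_dept_idx ON emp (dept_no, emp_no)")


def plan(sql):
    """Return SQLite's query plan for sql as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


# Only indexed columns are selected: the index alone can answer the query.
plan_columns = plan("SELECT emp_no FROM emp WHERE dept_no = 10")
# All columns are selected: the table must be visited for ename and notes.
plan_star = plan("SELECT * FROM emp WHERE dept_no = 10")

print(plan_columns)
print(plan_star)
```

In SQLite the first plan mentions a covering index and the second does not; in Oracle the analogous difference is whether a TABLE ACCESS BY INDEX ROWID step appears in the plan.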
7. Reduce the number of database accesses
Each time a SQL statement is executed, ORACLE does a lot of work internally: parsing the statement, estimating index utilization, binding variables, reading data blocks and so on.
So reducing the number of database accesses actually reduces ORACLE's workload.
For example, there are three ways to retrieve the employees whose employee numbers are 0342 or 0291.
Method 1 (least efficient)
SELECT EMP_NAME, SALARY, GRADE FROM EMP WHERE EMP_NO = 342;
SELECT EMP_NAME, SALARY, GRADE FROM EMP WHERE EMP_NO = 291;
Method 2 (next least efficient)
DECLARE CURSOR C1 (E_NO NUMBER) IS SELECT EMP_NAME, SALARY, GRADE FROM EMP WHERE EMP_NO = E_NO; BEGIN OPEN C1(342); FETCH C1 INTO …,..,.. ; CLOSE C1; END;
Method 3 (efficient)
SELECT A.EMP_NAME, A.SALARY, A.GRADE, B.EMP_NAME, B.SALARY, B.GRADE FROM EMP A, EMP B WHERE A.EMP_NO = 342 AND B.EMP_NO = 291;
Note: resetting the ARRAYSIZE parameter in SQL*Plus, SQL*Forms and Pro*C increases the amount of data retrieved per database access. The recommended value is 200.
tiger: I do not know who invented the "efficient" method 3; it is a truly bizarre way to write the query. I suggest the following form:
SELECT EMP_NAME, SALARY, GRADE FROM EMP WHERE EMP_NO in (342,291);
Setting arraysize is fine. The default is 15 in sqlplus and 10 in JDBC; changing it to 100 works better. There is no officially recommended value.
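To compare the IN-list form with the self-join of method 3, here is a small sketch in Python's sqlite3 (a stand-in for Oracle; the emp rows are invented). It also shows a side effect of the self-join that is easy to overlook: if one of the two employee numbers does not exist, the self-join returns no row at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, emp_name TEXT, "
    "salary INTEGER, grade INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(291, "LEE", 4000, 2), (342, "KIM", 5000, 3), (500, "ROY", 3000, 1)],
)

# IN list: one statement, one row per employee found.
in_rows = conn.execute(
    "SELECT emp_name, salary, grade FROM emp "
    "WHERE emp_no IN (342, 291) ORDER BY emp_no"
).fetchall()

# Self-join (method 3): both employees squeezed into one wide row.
join_rows = conn.execute(
    "SELECT a.emp_name, a.salary, a.grade, b.emp_name, b.salary, b.grade "
    "FROM emp a, emp b WHERE a.emp_no = 342 AND b.emp_no = 291"
).fetchall()

# Employee 999 does not exist: the IN list still returns employee 342,
# the self-join returns nothing.
in_missing = conn.execute(
    "SELECT emp_name FROM emp WHERE emp_no IN (342, 999)"
).fetchall()
join_missing = conn.execute(
    "SELECT a.emp_name, b.emp_name FROM emp a, emp b "
    "WHERE a.emp_no = 342 AND b.emp_no = 999"
).fetchall()
```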
8. Use the DECODE function to reduce processing time
The DECODE function avoids scanning the same records repeatedly or joining the same table repeatedly. For example:
SELECT COUNT(*),SUM(SAL) FROM EMP WHERE DEPT_NO = 0020 AND ENAME LIKE 'SMITH%';
SELECT COUNT(*),SUM(SAL) FROM EMP WHERE DEPT_NO = 0030 AND ENAME LIKE 'SMITH%';
The DECODE function gets the same result efficiently:
SELECT COUNT(DECODE(DEPT_NO,0020,'X',NULL)) D0020_COUNT, COUNT(DECODE(DEPT_NO,0030,'X',NULL)) D0030_COUNT, SUM(DECODE(DEPT_NO,0020,SAL,NULL)) D0020_SAL, SUM(DECODE(DEPT_NO,0030,SAL,NULL)) D0030_SAL FROM EMP WHERE ENAME LIKE 'SMITH%';
Similarly, the DECODE function can also be used in GROUP BY and ORDER BY clauses.
tiger:
The theory is correct, but the example and the way it is written look clumsy. Wouldn't it be better to write it as follows?
select dept_no,count(*),sum(sal)
from emp
where dept_no in(20,30) and ename like 'SMITH%'
group by dept_no;
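The two forms can be checked against each other with a short sketch. It runs on SQLite through Python's sqlite3, and because SQLite has no DECODE the pivot is written with CASE, which is an assumption of this example rather than the article's syntax; the emp rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, dept_no INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        ("SMITH", 20, 800),
        ("SMITHERS", 20, 1200),
        ("SMITHSON", 30, 1500),
        ("JONES", 20, 2975),
        ("SMITH", 10, 900),
    ],
)

# DECODE-style pivot, written with CASE because SQLite has no DECODE.
pivot = conn.execute(
    """
    SELECT COUNT(CASE WHEN dept_no = 20 THEN 'X' END),
           COUNT(CASE WHEN dept_no = 30 THEN 'X' END),
           SUM(CASE WHEN dept_no = 20 THEN sal END),
           SUM(CASE WHEN dept_no = 30 THEN sal END)
    FROM emp
    WHERE ename LIKE 'SMITH%'
    """
).fetchone()

# The GROUP BY form: one row per department instead of one wide row.
grouped = conn.execute(
    """
    SELECT dept_no, COUNT(*), SUM(sal)
    FROM emp
    WHERE dept_no IN (20, 30) AND ename LIKE 'SMITH%'
    GROUP BY dept_no
    ORDER BY dept_no
    """
).fetchall()
```

Both queries scan the table once and carry the same counts and sums; the GROUP BY form also restricts the scan to the two departments of interest.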
9. Combine simple, unrelated database accesses
If you have several simple database queries, you can combine them into one query (even if they are unrelated to each other). For example:
SELECT NAME FROM EMP WHERE EMP_NO = 1234;
SELECT NAME FROM DPT WHERE DPT_NO = 10;
SELECT NAME FROM CAT WHERE CAT_TYPE = 'RD';
The three queries above can be merged into one:
SELECT E.NAME, D.NAME, C.NAME FROM CAT C, DPT D, EMP E, DUAL X WHERE NVL('X',X.DUMMY) = NVL('X',E.ROWID(+)) AND NVL('X',X.DUMMY) = NVL('X',D.ROWID(+)) AND NVL('X',X.DUMMY) = NVL('X',C.ROWID(+)) AND E.EMP_NO(+) = 1234 AND D.DPT_NO(+) = 10 AND C.CAT_TYPE(+) = 'RD';
(Although this approach improves efficiency, it greatly reduces the readability of the program, so the reader still has to weigh the pros and cons.)
tiger:
This rewrite is the most bizarre of the bizarre. These are just a few ordinary SQL statements; merging a few unrelated statements into one that nobody can understand, in the name of optimization, is crazy!
10. Delete duplicate records
The most efficient way to delete duplicate records (because it uses ROWID):
DELETE FROM EMP E WHERE E.ROWID > (SELECT MIN(X.ROWID) FROM EMP X WHERE X.EMP_NO = E.EMP_NO);
tiger:
This is one way to delete duplicate records, but not necessarily the most efficient. Sometimes the record to keep must be chosen by other conditions, in which case rowid cannot be used.
For large tables or tables with many duplicate records, using CTAS to create a new table and then renaming it is also a good choice.
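The rowid-based delete can be tried out in SQLite, which also exposes a rowid for ordinary tables. The sketch below uses Python's sqlite3 with invented rows; SQLite's rowid is a simple integer and not Oracle's physical ROWID, so this only illustrates the logic of keeping the row with the smallest rowid per EMP_NO.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_no INTEGER, ename TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, "A"), (2, "B"), (1, "A"), (3, "C"), (2, "B"), (1, "A")],
)

# Keep the row with the smallest rowid for each emp_no, delete the rest.
conn.execute(
    """
    DELETE FROM emp
    WHERE rowid > (SELECT MIN(x.rowid) FROM emp x WHERE x.emp_no = emp.emp_no)
    """
)

remaining = conn.execute(
    "SELECT rowid, emp_no, ename FROM emp ORDER BY rowid"
).fetchall()
print(remaining)
```

Which copy survives is decided purely by insertion order here, which is exactly the limitation the comment above points out: if the row to keep must be chosen by a business condition, the subquery has to express that condition instead.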
11. Use TRUNCATE instead of DELETE
When records are deleted from a table, rollback segments (rollback segments) normally store the information needed to recover them.
If you have not committed the transaction, ORACLE restores the data to its state before the deletion (more precisely, to the state before the delete command was executed).
With TRUNCATE, the rollback segments store no recoverable information. Once the command has run, the data cannot be recovered. It therefore uses very few resources and runs very quickly.
(TRUNCATE applies only to deleting all rows of a table; TRUNCATE is DDL, not DML.)
tiger:
This theory is correct. To be precise: to delete all the data in a table, use truncate rather than a delete without a where condition.
12. COMMIT as often as possible
Whenever possible, use COMMIT as often as you can in the program. This improves program performance, and the demand for resources drops because of the resources that COMMIT releases.
Resources released by COMMIT:
a. Information on the rollback segments used to recover data.
b. Locks acquired by program statements.
c. Space in the redo log buffer.
d. ORACLE's internal overhead for managing the three resources above.
(When using COMMIT you must pay attention to transaction integrity; in practice, efficiency and transaction integrity often cannot both be had.)
tiger:
In batch processing the commit frequency should be moderate: commit once every 500 or 1000 rows (depending on the situation). Committing after every row greatly reduces execution efficiency.
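A moderate commit frequency can be sketched as follows. The helper insert_in_batches, the log table and the batch size of 1000 are all hypothetical, and the code runs against SQLite through Python's sqlite3 rather than Oracle; the point is only the control flow of committing once per batch instead of once per row.

```python
import sqlite3


def insert_in_batches(conn, rows, batch_size=1000):
    """Insert rows and commit once per batch_size rows; return the commit count."""
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO log (id, msg) VALUES (?, ?)", row)
        pending += 1
        if pending == batch_size:
            conn.commit()
            commits += 1
            pending = 0
    if pending:  # commit the final partial batch
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER, msg TEXT)")

rows = ((i, f"row {i}") for i in range(2500))
commits = insert_in_batches(conn, rows, batch_size=1000)
row_count = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(commits, "commits for", row_count, "rows")
```

The right batch size depends on the workload; a batch must still be a unit that is safe to commit on its own, so transaction integrity sets the upper and lower bounds.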
13. Count the number of records
Contrary to the common view, count(*) is slightly faster than count(1). Of course, if the count can be retrieved through an index, counting the indexed column is still the fastest, for example COUNT(EMPNO). (In actual tests, the three methods show no significant performance difference.)
tiger:
count(*) and count(1) are equivalent, and both are equivalent to count(not null column). If the empno column is defined as nullable, they are not equivalent, and comparing them makes no sense.
count(not null column) is fast because the not null column has an index and index fast full scan is used. An index usually takes less space than the table, so scanning the index is faster than scanning the table. Let theory guide the test instead of deriving theory from test results!
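The equivalence rules for count can be verified on a tiny table. The sketch uses Python's sqlite3 and invented data; NULL handling in COUNT is standard SQL, so the result is expected to carry over to Oracle. Here ename is declared NOT NULL and empno is nullable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, "A"), (2, "B"), (None, "C"), (None, "D")],
)

# COUNT(*) and COUNT(1) count rows; COUNT(column) skips NULLs in that column.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT(ename), COUNT(empno) FROM emp"
).fetchone()
print(counts)
```

The first three counts agree because ename can never be NULL, while the count on the nullable empno column is smaller. Comparing the speed of queries that return different answers is therefore meaningless.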
14. Replace the HAVING clause with a WHERE clause
Avoid the HAVING clause where possible.
HAVING filters the result set only after all records have been retrieved, which requires sorting, totaling and so on.
If the WHERE clause can limit the number of records, that overhead is reduced.
For example:
Inefficient:
SELECT REGION, AVG(LOG_SIZE) FROM LOCATION GROUP BY REGION HAVING REGION != 'SYDNEY' AND REGION != 'PERTH';
Efficient:
SELECT REGION, AVG(LOG_SIZE) FROM LOCATION WHERE REGION != 'SYDNEY' AND REGION != 'PERTH' GROUP BY REGION;
(Conditions in HAVING are generally used for comparisons on aggregate functions such as COUNT(). Other, ordinary conditions should be written in the WHERE clause.)
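That the two forms return the same rows when the condition does not involve an aggregate can be checked with a short sketch. It uses Python's sqlite3 and made-up LOCATION rows; it says nothing about how Oracle's optimizer actually executes either statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (region TEXT, log_size INTEGER)")
conn.executemany(
    "INSERT INTO location VALUES (?, ?)",
    [
        ("SYDNEY", 10),
        ("SYDNEY", 20),
        ("PERTH", 5),
        ("MELB", 30),
        ("MELB", 50),
        ("DARWIN", 7),
    ],
)

# Filter after grouping: every region is aggregated, then two are discarded.
having_rows = conn.execute(
    """
    SELECT region, AVG(log_size) FROM location
    GROUP BY region
    HAVING region != 'SYDNEY' AND region != 'PERTH'
    ORDER BY region
    """
).fetchall()

# Filter before grouping: the unwanted rows never reach the aggregation.
where_rows = conn.execute(
    """
    SELECT region, AVG(log_size) FROM location
    WHERE region != 'SYDNEY' AND region != 'PERTH'
    GROUP BY region
    ORDER BY region
    """
).fetchall()
```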
15. Reduce the number of queries against a table
In SQL statements, take special care to reduce the number of queries against the same table. For example:
Inefficient:
SELECT TAB_NAME FROM TABLES WHERE TAB_NAME = (SELECT TAB_NAME FROM TAB_COLUMNS WHERE VERSION = 604) AND DB_VER = (SELECT DB_VER FROM TAB_COLUMNS WHERE VERSION = 604)
Efficient:
SELECT TAB_NAME FROM TABLES WHERE (TAB_NAME, DB_VER) = (SELECT TAB_NAME, DB_VER FROM TAB_COLUMNS WHERE VERSION = 604)
Example of updating multiple columns:
Inefficient:
UPDATE EMP SET EMP_CAT = (SELECT MAX(CATEGORY) FROM EMP_CATEGORIES), SAL_RANGE = (SELECT MAX(SAL_RANGE) FROM EMP_CATEGORIES) WHERE EMP_DEPT = 0020;
Efficient:
UPDATE EMP SET (EMP_CAT, SAL_RANGE) = (SELECT MAX(CATEGORY), MAX(SAL_RANGE) FROM EMP_CATEGORIES) WHERE EMP_DEPT = 0020;
16. Improve SQL efficiency with internal functions
SELECT H.EMPNO, E.ENAME, H.HIST_TYPE, T.TYPE_DESC, COUNT(*) FROM HISTORY_TYPE T, EMP E, EMP_HISTORY H WHERE H.EMPNO = E.EMPNO AND H.HIST_TYPE = T.HIST_TYPE GROUP BY H.EMPNO, E.ENAME, H.HIST_TYPE, T.TYPE_DESC;
Efficiency can be improved by calling the following functions:
FUNCTION LOOKUP_HIST_TYPE(TYP IN NUMBER) RETURN VARCHAR2
AS
  TDESC VARCHAR2(30);
  CURSOR C1 IS SELECT TYPE_DESC FROM HISTORY_TYPE WHERE HIST_TYPE = TYP;
BEGIN
  OPEN C1;
  FETCH C1 INTO TDESC;
  CLOSE C1;
  RETURN (NVL(TDESC,'?'));
END;
FUNCTION LOOKUP_EMP(EMP IN NUMBER) RETURN VARCHAR2
AS
  ENAME VARCHAR2(30);
  CURSOR C1 IS SELECT ENAME FROM EMP WHERE EMPNO=EMP;
BEGIN
  OPEN C1;
  FETCH C1 INTO ENAME;
  CLOSE C1;
  RETURN (NVL(ENAME,'?'));
END;
The SQL becomes:
SELECT H.EMPNO,LOOKUP_EMP(H.EMPNO),H.HIST_TYPE,
LOOKUP_HIST_TYPE(H.HIST_TYPE),COUNT(*)
FROM EMP_HISTORY H
GROUP BY H.EMPNO ,H.HIST_TYPE;
(I often see forum posts asking "Can this be written in a single SQL statement ...", not realizing that complex SQL often sacrifices execution efficiency. Mastering the function-based approach above is very useful in practical work.)
tiger:
Creating two extra functions for one query is too complicated. Without functions, first run the group by on EMP_HISTORY alone and then join the result to the other two small tables. The principle is the same as with the functions, and it is more efficient than using functions.
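The "aggregate first, then join" rewrite can be sketched as below, again with Python's sqlite3 and invented rows. The equivalence with the original join-then-group query assumes that EMPNO is unique in EMP and HIST_TYPE is unique in HISTORY_TYPE, which the article does not state explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT);
    CREATE TABLE history_type (hist_type INTEGER PRIMARY KEY, type_desc TEXT);
    CREATE TABLE emp_history (empno INTEGER, hist_type INTEGER);
    INSERT INTO emp VALUES (1, 'ANN'), (2, 'BOB');
    INSERT INTO history_type VALUES (10, 'HIRE'), (20, 'RAISE');
    INSERT INTO emp_history VALUES (1, 10), (1, 20), (1, 20), (2, 10);
    """
)

# Original form: join all three tables, then group the joined rows.
join_first = conn.execute(
    """
    SELECT h.empno, e.ename, h.hist_type, t.type_desc, COUNT(*)
    FROM history_type t, emp e, emp_history h
    WHERE h.empno = e.empno AND h.hist_type = t.hist_type
    GROUP BY h.empno, e.ename, h.hist_type, t.type_desc
    ORDER BY h.empno, h.hist_type
    """
).fetchall()

# Rewrite: group EMP_HISTORY alone, then join the small grouped result.
aggregate_first = conn.execute(
    """
    SELECT h.empno, e.ename, h.hist_type, t.type_desc, h.cnt
    FROM (SELECT empno, hist_type, COUNT(*) AS cnt
          FROM emp_history
          GROUP BY empno, hist_type) h
    JOIN emp e ON e.empno = h.empno
    JOIN history_type t ON t.hist_type = h.hist_type
    ORDER BY h.empno, h.hist_type
    """
).fetchall()
```

The lookups into the two small tables now happen once per group instead of once per history row, which is the same saving the functions were meant to achieve.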
17. Use table aliases (Alias)
When a SQL statement joins multiple tables, use table aliases and prefix every Column with its alias. This reduces parsing time and avoids syntax errors caused by Column ambiguity.
(Column ambiguity means that different tables in the SQL statement have a Column with the same name; when that Column appears in the statement, the SQL parser cannot tell which table it belongs to.)
tiger:
Using aliases is a good habit and should be written into the development standard. But aliases mainly serve to avoid logic errors caused by identically named columns; parsing time and syntax errors are secondary.
18. Replace IN with EXISTS
In many queries on a base table, satisfying a condition often requires a join to another table. In such cases, using EXISTS (or NOT EXISTS) usually improves query efficiency.
Inefficient:
SELECT * FROM EMP (base table) WHERE EMPNO > 0 AND DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE LOC = 'MELB')
Efficient:
SELECT * FROM EMP (base table) WHERE EMPNO > 0 AND EXISTS (SELECT 'X' FROM DEPT WHERE DEPT.DEPTNO = EMP.DEPTNO AND LOC = 'MELB')
Relatively speaking, replacing NOT IN with NOT EXISTS improves efficiency more significantly.
tiger:
In most cases in and exists are equally efficient (for example in the queries above); they differ only in a few special cases.
not in and not exists are not equivalent. not exists is generally recommended; pay attention to whether the join columns can be null and whether records whose join column in the main query is null should be returned.
19. Replace NOT IN with NOT EXISTS
In a subquery, the NOT IN clause performs an internal sort and merge.
In either case, NOT IN is the least efficient (because it performs a full table traversal of the table in the subquery).
To avoid NOT IN, rewrite it as an outer join (Outer Joins) or as NOT EXISTS.
For example:
SELECT … FROM EMP WHERE DEPT_NO NOT IN (SELECT DEPT_NO FROM DEPT WHERE DEPT_CAT='A');
To improve efficiency, rewrite it as:
(Method 1: efficient)
SELECT …. FROM EMP A, DEPT B WHERE A.DEPT_NO = B.DEPT_NO(+) AND B.DEPT_NO IS NULL AND B.DEPT_CAT(+) = 'A'
(Method 2: most efficient)
SELECT … FROM EMP E WHERE NOT EXISTS (SELECT 'X' FROM DEPT D WHERE D.DEPT_NO = E.DEPT_NO AND DEPT_CAT = 'A');
tiger:
When the join columns are not all not null, not in and not exists cannot be rewritten into each other equivalently. The outer join form is equivalent to not exists and not equivalent to not in.
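The non-equivalence shows up as soon as NULLs appear in the join columns. The sketch below uses Python's sqlite3 with invented rows: one employee has a NULL DEPT_NO, and one category 'A' department has a NULL DEPT_NO. The three-valued logic involved is standard SQL, so the same behaviour is expected in Oracle; the outer join is written with LEFT JOIN because SQLite has no (+) syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (ename TEXT, dept_no INTEGER);
    CREATE TABLE dept (dept_no INTEGER, dept_cat TEXT);
    INSERT INTO emp VALUES ('A', 10), ('B', 20), ('C', NULL);
    INSERT INTO dept VALUES (10, 'A'), (NULL, 'A');
    """
)

# NOT IN: a NULL in the subquery result makes the predicate unknown for every row.
not_in = conn.execute(
    """
    SELECT ename FROM emp
    WHERE dept_no NOT IN (SELECT dept_no FROM dept WHERE dept_cat = 'A')
    ORDER BY ename
    """
).fetchall()

# NOT EXISTS: rows without a matching department are returned, NULLs included.
not_exists = conn.execute(
    """
    SELECT ename FROM emp e
    WHERE NOT EXISTS (SELECT 'X' FROM dept d
                      WHERE d.dept_no = e.dept_no AND d.dept_cat = 'A')
    ORDER BY ename
    """
).fetchall()

# Outer join form (LEFT JOIN stands in for Oracle's (+) notation).
outer_join = conn.execute(
    """
    SELECT e.ename FROM emp e
    LEFT JOIN dept d ON d.dept_no = e.dept_no AND d.dept_cat = 'A'
    WHERE d.dept_no IS NULL
    ORDER BY e.ename
    """
).fetchall()
```

With this data NOT IN returns no rows at all, while NOT EXISTS and the outer join both return employees B and C.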
20. Replace EXISTS with a table join
Generally speaking, a table join is more efficient than EXISTS.
SELECT ENAME FROM EMP E WHERE EXISTS (SELECT 'X' FROM DEPT WHERE DEPT_NO = E.DEPT_NO AND DEPT_CAT = 'A');
Efficient:
SELECT ENAME FROM DEPT D, EMP E WHERE E.DEPT_NO = D.DEPT_NO AND DEPT_CAT = 'A';
tiger: Only when the join column values of the subquery table are unique is the rewrite above equivalent. Otherwise the rewritten result set is larger than the original result set. Do not be misled by this non-equivalent rewrite!
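A minimal sketch of the case where the rewrite is not equivalent: DEPT_NO is deliberately not unique in the hypothetical dept table below. It runs on SQLite through Python's sqlite3; the join semantics shown are standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dept (dept_no INTEGER, dept_cat TEXT);
    CREATE TABLE emp (ename TEXT, dept_no INTEGER);
    INSERT INTO dept VALUES (10, 'A'), (10, 'A'), (20, 'B');
    INSERT INTO emp VALUES ('SMITH', 10), ('JONES', 20);
    """
)

# EXISTS: each employee appears at most once, however many departments match.
exists_rows = conn.execute(
    """
    SELECT ename FROM emp e
    WHERE EXISTS (SELECT 'X' FROM dept d
                  WHERE d.dept_no = e.dept_no AND d.dept_cat = 'A')
    """
).fetchall()

# Join: the employee is repeated once per matching department row.
join_rows = conn.execute(
    """
    SELECT ename FROM dept d, emp e
    WHERE e.dept_no = d.dept_no AND d.dept_cat = 'A'
    """
).fetchall()
```

With a unique constraint on dept.dept_no the two queries would return the same rows; without it the join duplicates SMITH.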
21. Replace DISTINCT with EXISTS
When submitting a query that involves one-to-many tables (such as the department table and the employee table), avoid DISTINCT in the SELECT clause. Consider replacing it with EXISTS. For example:
Inefficient:
SELECT DISTINCT DEPT_NO, DEPT_NAME FROM DEPT D, EMP E WHERE D.DEPT_NO = E.DEPT_NO;
Efficient:
SELECT DEPT_NO, DEPT_NAME FROM DEPT D WHERE EXISTS (SELECT 'X' FROM EMP E WHERE E.DEPT_NO = D.DEPT_NO);
EXISTS makes the query faster because the RDBMS kernel returns the result as soon as the subquery condition is satisfied.
tiger:
This rewrite is not equivalent. Many articles online and in books describe it, usually rewriting the lower SQL into the upper one. One SQL has de-duplication logic and the other does not, so they are obviously not equivalent. They are equivalent only in the special case where the result set contains no duplicates to begin with.
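The difference only becomes visible when the department rows themselves contain duplicates. The sketch below builds such a case with Python's sqlite3; the duplicate (10, 'SALES') row is an assumption made for illustration, since a real DEPT table would normally have a primary key on DEPT_NO.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dept (dept_no INTEGER, dept_name TEXT);
    CREATE TABLE emp (ename TEXT, dept_no INTEGER);
    INSERT INTO dept VALUES (10, 'SALES'), (10, 'SALES'), (20, 'OPS');
    INSERT INTO emp VALUES ('A', 10), ('B', 10), ('C', 20);
    """
)

# DISTINCT removes every duplicate, including those that come from DEPT itself.
distinct_rows = conn.execute(
    """
    SELECT DISTINCT d.dept_no, d.dept_name
    FROM dept d, emp e
    WHERE d.dept_no = e.dept_no
    ORDER BY d.dept_no
    """
).fetchall()

# EXISTS returns each qualifying DEPT row as it is, duplicates included.
exists_rows = conn.execute(
    """
    SELECT dept_no, dept_name FROM dept d
    WHERE EXISTS (SELECT 'X' FROM emp e WHERE e.dept_no = d.dept_no)
    ORDER BY dept_no
    """
).fetchall()
```

If (DEPT_NO, DEPT_NAME) is unique in DEPT, both queries return the same rows, which is the special case mentioned in the comment above.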
22. Identify 'inefficiently executed' SQL statements
Use the following SQL to find inefficient SQL:
SELECT EXECUTIONS, DISK_READS, BUFFER_GETS, ROUND((BUFFER_GETS-DISK_READS)/BUFFER_GETS,2) Hit_ratio, ROUND(DISK_READS/EXECUTIONS,2) Reads_per_run, SQL_TEXT FROM V$SQLAREA WHERE EXECUTIONS>0 AND BUFFER_GETS > 0 AND (BUFFER_GETS-DISK_READS)/BUFFER_GETS < 0.8 ORDER BY 4 DESC;
(Although graphical tools for SQL optimization keep appearing, writing your own SQL tools is always the best way to solve problems.)
tiger:
Inefficient index access also produces few disk_reads, so the results of using this SQL to find inefficient SQL are one-sided. I suggest listing the top SQL along several dimensions (execution time, CPU consumption, logical reads, physical reads and so on) and analyzing them one by one.
23. Use the TKPROF tool to query SQL performance
The SQL trace tool collects performance data for the SQL being executed and records it in a trace file. The trace file provides a lot of useful information, such as the number of parses, the number of executions and CPU time. This data can be used to optimize your system.
To enable SQL TRACE at the session level:
ALTER SESSION SET SQL_TRACE=TRUE;
To enable SQL TRACE for the whole database, set the SQL_TRACE parameter to TRUE in init.ora.
The USER_DUMP_DEST parameter specifies the directory where the trace files are generated.
Four steps:
(1) Set the path in user_dump_dest.
(2) Set timed_statistics=true (alter system can also be used).
(3) Set max_dump_file_size to a larger value, or use alter system to set it to unlimited.
(4) alter session set sql_trace=true
tiger:
These are ancient tools and methods that were used more in earlier versions. Setting sql_trace yields little information; it is equivalent to level 1 of the 10046 trace. A 10046 trace is usually set to level 12 (which adds bind variable and wait event information).
24. Analyze SQL statements with EXPLAIN PLAN
EXPLAIN PLAN is a good tool for analyzing SQL statements; it can even analyze a statement without executing it.
Through the analysis, we can learn how ORACLE joins tables, how it scans tables (index scan or full table scan) and which indexes it uses.
Read the results from the inside out and from top to bottom. EXPLAIN PLAN output is indented: the innermost operation is interpreted first, and if two operations are at the same level, the one with the smaller operation number is executed first.
NESTED LOOP is one of the few operations that does not follow these rules. The correct approach is to examine the operations that feed data to the NESTED LOOP; among them, the one with the smallest operation number is processed first.
Example:
SQL> @D:\oracle\ora90\rdbms\admin\utlxplan.sql
SQL> @D:\oracle\ora90\sqlplus\admin\plustrce.sql
SQL> list
  1  SELECT *
  2  FROM dept,emp
  3* WHERE emp.deptno = dept.deptno;
SQL> set autotrace traceonly
// SQL> set timing on              displays the execution time
// SQL> set autotrace on           shows the execution plan
// SQL> set autotrace traceonly    shows only the execution plan; the queried data is not displayed
SQL> /
14 rows selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   NESTED LOOPS
   2    1     TABLE ACCESS (FULL) OF 'EMP'
   3    1     TABLE ACCESS (BY INDEX ROWID) OF 'DEPT'
   4    3       INDEX (UNIQUE SCAN) OF 'PK_DEPT' (UNIQUE)
Statistics
----------------------------------------------------------
          0  recursive calls
          2  db block gets
         30  consistent gets
          0  physical reads
          0  redo size
       2598  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         14  rows processed
From the analysis above, the actual execution steps are:
1. TABLE ACCESS (FULL) OF 'EMP'
2. INDEX (UNIQUE SCAN) OF 'PK_DEPT' (UNIQUE)
3. TABLE ACCESS (BY INDEX ROWID) OF 'DEPT'
4. NESTED LOOPS (JOINING 1 AND 3)
tiger:
There is nothing special about nested loops in an execution plan.
set autotrace on normally has to execute the SQL (only with set autotrace traceonly exp is a select statement not executed, but a dml statement is still executed).
explain plan does not execute the SQL. The execution plan it produces, like that of autotrace, is not necessarily the real execution plan.
25. Use indexes to improve efficiency
An index is a conceptual part of a table, used to improve the efficiency of data retrieval.
In fact, ORACLE uses a complex self-balancing B-tree structure. Querying data through an index is usually faster than a full table scan.
When ORACLE works out the best path for executing a query or an Update statement, the optimizer uses indexes. Using indexes to join multiple tables likewise improves efficiency.
Another benefit of an index is that it enforces the uniqueness of the primary key (primary key). Apart from the LONG and LONG RAW data types, you can index almost any column.
Indexes are usually especially effective on large tables. Of course, you will find that they can also improve efficiency when scanning small tables.
Although indexes improve query efficiency, they come at a cost. An index needs storage space and regular maintenance; whenever a record is added to or deleted from the table, or an indexed column is modified, the index itself is modified too.
This means that each INSERT, DELETE or UPDATE of a record costs 4 or 5 extra disk I/Os.
Because indexes require extra storage and processing, unnecessary indexes slow down query response time.
Rebuilding indexes regularly is necessary.
ALTER INDEX <INDEXNAME> REBUILD TABLESPACE <TABLESPACENAME>;
tiger:
For a small amount of data the index is fast; for a large amount of data a full table scan is fast! delete and update generally also need indexes, and the cost of maintaining the index is far lower than the cost of a full table scan. Rebuilding indexes on a regular basis is not required; only with frequent update and delete operations may a rebuild be needed.
26. Operation of index
ORACLE There are two modes of access to indexes .
1. Index unique scan ( INDEX UNIQUE SCAN)
Most of the time , Optimizer pass WHERE Clause access INDEX.
for example : surface LODGING There are two indexes :
Based on the LODGING Unique index on column LODGING_PK And based on MANAGER Non unique index on column LODGING$MANAGER.
SELECT * FROM LODGING WHERE LODGING = 'ROSE HILL';
In the internal , Above SQL Will be carried out in two steps , First ,LODGING_PK The index will be accessed by index unique scanning , Get the corresponding ROWID, adopt ROWID How to access the table Perform the next step to retrieve .
If the retrieved column is included in INDEX In the column ,ORACLE The processing of step 2 will not be performed ( adopt ROWID Access table ). Because the retrieved data is stored in the index , Simply accessing the index can fully meet the query results .
below SQL It only needs INDEX UNIQUE SCAN operation .
SELECT LODGING FROM LODGING WHERE LODGING = 'ROSE HILL';
2. Index range query (INDEX RANGE SCAN)
It can be used in two cases :1. Search based on a range 2. Search based on non unique index example 1:
SELECT LODGING FROM LODGING WHERE LODGING LIKE 'M%';
WHERE Clause conditions include a series of values ,ORACLE Will query by index range query LODGING_PK . Because of the index range, the query will return a set of values , It is less efficient than index unique scanning . example 2:
SELECT LODGING FROM LODGING WHERE MANAGER = 'BILL GATES';
This SQL is executed in two steps: an index range scan on LODGING$MANAGER (to obtain the ROWIDs of all matching records), followed by a table access by ROWID to fetch the value of the LODGING column.
Because LODGING$MANAGER is a non-unique index, the database cannot perform an index unique scan on it.
Because the SQL returns the LODGING column, which is not stored in the LODGING$MANAGER index, the index range scan is followed by a table access by ROWID.
In a WHERE clause, if the value compared with an indexed column starts with a wildcard, the index is not used.
SELECT LODGING FROM LODGING WHERE MANAGER LIKE '%HANMAN';
In this case, ORACLE uses a full table scan.
tiger:
ORACLE also supports several other index access paths, such as index skip scan, index full scan, and index fast full scan; there are also index full scan (min/max), index range scan (min/max), index xxx scan descending, and so on.
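One way to see which of these access paths a statement actually gets is to ask the optimizer for its plan. The sketch below assumes the LODGING table and its indexes from the examples above exist and that the release ships the DBMS_XPLAN package; the exact output layout differs between versions.

```sql
-- Ask the optimizer for the plan without running the statement.
EXPLAIN PLAN FOR
SELECT LODGING FROM LODGING WHERE MANAGER = 'BILL GATES';

-- Show the plan just stored in PLAN_TABLE; look for operations such as
-- INDEX RANGE SCAN, INDEX SKIP SCAN or TABLE ACCESS FULL.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Checking the plan this way is usually more reliable than reasoning from rules of thumb, because the result depends on the current statistics.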
27. Choosing the driving table
The driving table (Driving Table) is the table that is accessed first (usually by a full table scan). Which table in a SQL statement becomes the driving table depends on the optimizer.
If you use the CBO (COST BASED OPTIMIZER), the optimizer checks the physical size of each table in the SQL statement and the state of its indexes, then chooses the execution path with the lowest cost.
If you use the RBO (RULE BASED OPTIMIZER) and every join condition has a matching index, the driving table is the last table in the FROM clause. For example: SELECT A.NAME ,B.MANAGER FROM WORKER A,LODGING B WHERE A.LODGING = B.LODING; Because there is an index on the LODING column of the LODGING table and no comparable index on the WORKER table, WORKER is used as the driving table of the query.
tiger:
The RBO chooses indexes blindly: it gives priority to an index whether or not the index is efficient, which is one reason the RBO was retired. The driving table is chosen not for being a small table but for producing a small result set; in a multi-table join, the table whose filter conditions eliminate most of its records is generally chosen as the driving table.
28. Multiple indexes of equal rank
When the execution path of a SQL statement can use several indexes spread over several tables, ORACLE uses the indexes together and merges their records at run time, keeping only the records that satisfy all of the indexes.
When ORACLE chooses the execution path, a unique index ranks higher than a non-unique index.
However, this rule holds only when the WHERE clause compares the indexed column with a constant. If the indexed column is compared with an indexed column of another table, the clause ranks very low in the optimizer.
If two indexes of the same rank are referenced in different tables, the order of the tables in the FROM clause decides which is used first: the index of the last table in the FROM clause has the highest priority.
If two indexes of the same rank are referenced in the same table, the index referenced first in the WHERE clause has the highest priority.
For example: there is a non-unique index on DEPTNO and another non-unique index on EMP_CAT.
SELECT ENAME FROM EMP WHERE DEPTNO = 20 AND EMP_CAT = 'A';
Here the DEPTNO index is retrieved first, and its records are then merged with the records retrieved from the EMP_CAT index. The execution path is as follows:
TABLE ACCESS BY ROWID ON EMP
  AND-EQUAL
    INDEX RANGE SCAN ON DEPT_IDX
    INDEX RANGE SCAN ON CAT_IDX
tiger:
This is how a low-version optimizer behaved under the RBO nearly 20 years ago. Today's optimizer automatically selects the efficient index based on table and index statistics. Only if both indexes are inefficient might an AND-EQUAL execution plan like this one be used.
29. Equality comparison and range comparison
When the WHERE clause contains indexed columns, one compared by equality and one by range, ORACLE cannot merge the indexes and uses only the index with the equality comparison. For example: there is a non-unique index on DEPTNO and another non-unique index on EMP_CAT.
SELECT ENAME FROM EMP WHERE DEPTNO > 20 AND EMP_CAT = 'A';
Here only the EMP_CAT index is used, and every record it returns is then checked against the DEPTNO condition. The execution path is as follows:
TABLE ACCESS BY ROWID ON EMP
  INDEX RANGE SCAN ON CAT_IDX
tiger:
An RBO characteristic that has been retired. If neither single-column index is selective enough, you can create a composite index; for this SQL, a two-column composite index on emp_cat and deptno.
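A minimal sketch of the composite index suggested in the comment, assuming the EMP table from the example; the index name EMP_CAT_DEPTNO_IDX is hypothetical. The equality column comes first so that the range condition on DEPTNO can narrow the scan within a single EMP_CAT value.

```sql
-- Hypothetical index name; equality column first, range column second.
CREATE INDEX EMP_CAT_DEPTNO_IDX ON EMP (EMP_CAT, DEPTNO);

-- Both predicates can now be resolved in one index range scan.
SELECT ENAME FROM EMP WHERE DEPTNO > 20 AND EMP_CAT = 'A';
```

Whether the optimizer actually picks the new index still depends on the statistics, so it is worth confirming with the execution plan.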
30. Ambiguous index rank
When ORACLE cannot determine which index ranks higher, the optimizer uses only one index: the one listed first in the WHERE clause.
For example: there is a non-unique index on DEPTNO and another non-unique index on EMP_CAT.
SELECT ENAME FROM EMP WHERE DEPTNO > 20 AND EMP_CAT > 'A';
Here ORACLE uses only the DEPTNO index. The execution path is as follows:
TABLE ACCESS BY ROWID ON EMP
  INDEX RANGE SCAN ON DEPT_IDX
Let's try the following:
SQL> select index_name, uniqueness from user_indexes where table_name = 'EMP';
INDEX_NAME                     UNIQUENES
------------------------------ ---------
EMPNO                          UNIQUE
EMPTYPE                        NONUNIQUE
SQL> select * from emp where empno >= 2 and emp_type = 'A' ;
no rows selected
Execution Plan
----------------------------------------------------------
0     SELECT STATEMENT Optimizer=CHOOSE
1   0   TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
2   1     INDEX (RANGE SCAN) OF 'EMPTYPE' (NON-UNIQUE)
Although EMPNO is a unique index, it is used in a range comparison, so it ranks lower than the equality comparison on the non-unique index.
tiger:
An RBO characteristic that has been retired.
31. Forcing an index to be ignored
If two or more indexes have the same rank, you can force the ORACLE optimizer to use one of them (the one that retrieves fewer records). For example:
SELECT ENAME
FROM EMP
WHERE EMPNO = 7935
AND DEPTNO + 0 = 10 /* the index on DEPTNO is disabled */
AND EMP_TYPE || '' = 'A' /* the index on EMP_TYPE is disabled */
This is a fairly direct way to improve query efficiency, but weigh the strategy carefully: in general, use it only when you want to tune a few individual SQL statements.
Here is an example of when to adopt this strategy. Suppose there is a non-unique index on the EMP_TYPE column of the EMP table and no index on EMP_CLASS.
SELECT ENAME FROM EMP WHERE EMP_TYPE = 'A' AND EMP_CLASS = 'X';
The optimizer notices the index on EMP_TYPE and uses it. It is the only option at this point.
If, some time later, another non-unique index is created on EMP_CLASS, the optimizer must choose between the two indexes. Normally it uses both indexes and sorts and merges their result sets.
However, if one index (EMP_TYPE) is close to unique and the other (EMP_CLASS) has thousands of duplicate values,
the sort and merge becomes an unnecessary burden. In this case you want the optimizer to ignore the EMP_CLASS index.
The following rewrite solves the problem.
SELECT ENAME FROM EMP WHERE EMP_TYPE = 'A' AND EMP_CLASS||'' = 'X';
tiger:
Disabling an index this way is not recommended. I recently tuned a complex SQL statement in a 9i database that joined 18 tables and included outer joins, NOT IN, GROUP BY subqueries and so on. It had been "optimized" several times with the method above and still could not get the best path. After analysis, I had to change the disabled index expressions back to normal and then specify the join order of the tables to get the best path; the final result was several times faster.
Where the CBO is available, use index-related hints to specify or disable an index. Under the RBO, using an index hint also switches the statement to the CBO.
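The hint-based alternative mentioned in the comment can be sketched as follows. The index names EMP_CLASS_IDX and EMP_TYPE_IDX are hypothetical, and the WORKER/LODGING join reuses the tables from tip 27; hints are requests to the optimizer, so the resulting plan should still be checked.

```sql
-- Ignore one index without touching the predicate itself.
SELECT /*+ NO_INDEX(E EMP_CLASS_IDX) */ ENAME
FROM EMP E
WHERE EMP_TYPE = 'A' AND EMP_CLASS = 'X';

-- Ask for one specific index.
SELECT /*+ INDEX(E EMP_TYPE_IDX) */ ENAME
FROM EMP E
WHERE EMP_TYPE = 'A' AND EMP_CLASS = 'X';

-- Fix the join order: start from LODGING (filtered), then probe WORKER.
SELECT /*+ LEADING(B A) USE_NL(A) */ A.NAME, B.MANAGER
FROM WORKER A, LODGING B
WHERE A.LODGING = B.LODING
AND B.MANAGER = 'BILL GATES';
```

Unlike the `+ 0` and `|| ''` tricks, removing a hint later restores the original statement without changing any predicate.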
32. Avoid calculations on indexed columns
In a WHERE clause, if the indexed column is part of a function or expression, the optimizer uses a full table scan instead of the index. For example: Inefficient:
SELECT … FROM DEPT WHERE SAL * 12 > 25000;
Efficient:
SELECT … FROM DEPT WHERE SAL > 25000/12;
This is a very practical rule; please remember it.
tiger:
That's right. Whether or not the column is indexed, keeping the column bare in the predicate is the best choice. Many junior programmers like to apply to_char to date columns, which is also strongly discouraged.
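The to_char case can be sketched like this. The ORDERS table with a DATE column ORDER_DATE is a hypothetical example, not something from the original article.

```sql
-- Function applied to the column: a plain index on ORDER_DATE
-- cannot be used to seek on this predicate.
SELECT ORDER_ID
FROM ORDERS
WHERE TO_CHAR(ORDER_DATE, 'YYYY-MM-DD') = '2022-06-22';

-- Column left bare: the same day expressed as a half-open range.
SELECT ORDER_ID
FROM ORDERS
WHERE ORDER_DATE >= DATE '2022-06-22'
AND ORDER_DATE < DATE '2022-06-23';
```

The half-open range also covers every time of day on that date, which a comparison with a single DATE value would not.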
33. Automatic index selection
If a table has two or more indexes, one of them unique and the others non-unique,
ORACLE uses the unique index and ignores the non-unique indexes completely. For example:
SELECT ENAME FROM EMP WHERE EMPNO = 2326 AND DEPTNO = 20 ;
Here only the index on EMPNO is unique, so the EMPNO index is used to retrieve the record.
TABLE ACCESS BY ROWID ON EMP
  INDEX UNIQUE SCAN ON EMP_NO_IDX
34. Avoid NOT on indexed columns
In general, avoid NOT on indexed columns: NOT has the same effect as applying a function to the indexed column.
When ORACLE encounters NOT, it stops using the index and performs a full table scan instead.
With "<>", ORACLE likewise does not use the index and performs a full table scan, whereas the form "column < value OR column > value" performs two index scans. The suggested form is "column < value UNION SELECT … column > value", which is faster.
For example:
Inefficient: (here, the index is not used)
SELECT … FROM DEPT WHERE NOT DEPT_CODE = 0
Efficient: (here, the index is used)
SELECT … FROM DEPT WHERE DEPT_CODE > 0;
Note that in some cases the ORACLE optimizer automatically converts NOT into the corresponding relational operator:
NOT > to <=
NOT >= to <
NOT < to >=
NOT <= to >
tiger:
This is partly true: !=, <>, NOT IN (1,2,3), NOT LIKE and LIKE '%xxx%' predicates do not use an index.
However, the equivalent rewrite of <> must use UNION ALL, not UNION.
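The difference the comment points to can be sketched with the DEPT table from the example. UNION would silently remove duplicate DEPT_CODE values that the original predicate returns, so it is not an equivalent rewrite.

```sql
-- Original predicate.
SELECT DEPT_CODE FROM DEPT WHERE DEPT_CODE <> 0;

-- Equivalent rewrite: the two ranges cannot overlap, so UNION ALL
-- introduces no duplicates and removes none.
SELECT DEPT_CODE FROM DEPT WHERE DEPT_CODE < 0
UNION ALL
SELECT DEPT_CODE FROM DEPT WHERE DEPT_CODE > 0;
```

Rows where DEPT_CODE is NULL are excluded by both forms, so NULL handling does not change.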
35. Replace OR with UNION (for indexed columns)
In general, replacing OR in the WHERE clause with UNION gives better results. Using OR on indexed columns causes a full table scan.
Note that this rule applies only when several indexed columns are involved. If one of the columns is not indexed, query efficiency may drop because you did not choose OR. In the following example, there are indexes on LOC_ID and REGION.
Efficient:
SELECT LOC_ID ,LOC_DESC ,REGION FROM LOCATION WHERE LOC_ID = 10 UNION SELECT LOC_ID ,LOC_DESC ,REGION FROM LOCATION WHERE REGION = 'MELBOURNE'
Inefficient:
SELECT LOC_ID ,LOC_DESC ,REGION FROM LOCATION WHERE LOC_ID = 10 OR REGION = 'MELBOURNE'
If you insist on using OR, write the indexed column that returns the fewest records first. Note:
WHERE KEY1 = 10 (returns the fewest records)
OR KEY2 = 20 (returns the most records)
ORACLE internally converts the above to
WHERE KEY1 = 10 AND ((NOT KEY1 = 10) AND KEY2 = 20)
The following test data is for reference only: (a = 1003 returns one record, b = 1 returns 1003 records)
SQL> select * from unionvsor /*1st test*/
  2  where a = 1003 or b = 1;
1003 rows selected.
Execution Plan
----------------------------------------------------------
0     SELECT STATEMENT Optimizer=CHOOSE
1   0   CONCATENATION
2   1     TABLE ACCESS (BY INDEX ROWID) OF 'UNIONVSOR'
3   2       INDEX (RANGE SCAN) OF 'UB' (NON-UNIQUE)
4   1     TABLE ACCESS (BY INDEX ROWID) OF 'UNIONVSOR'
5   4       INDEX (RANGE SCAN) OF 'UA' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
144 consistent gets
0 physical reads
0 redo size
63749 bytes sent via SQL*Net to client
7751 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1003 rows processed
SQL> select * from unionvsor /*2nd test*/
  2  where b = 1 or a = 1003 ;
1003 rows selected.
Execution Plan
----------------------------------------------------------
0     SELECT STATEMENT Optimizer=CHOOSE
1   0   CONCATENATION
2   1     TABLE ACCESS (BY INDEX ROWID) OF 'UNIONVSOR'
3   2       INDEX (RANGE SCAN) OF 'UA' (NON-UNIQUE)
4   1     TABLE ACCESS (BY INDEX ROWID) OF 'UNIONVSOR'
5   4       INDEX (RANGE SCAN) OF 'UB' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
143 consistent gets
0 physical reads
0 redo size
63749 bytes sent via SQL*Net to client
7751 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1003 rows processed
SQL> select * from unionvsor /*3rd test*/
  2  where a = 1003
  3  union
  4  select * from unionvsor
  5  where b = 1;
1003 rows selected.
Execution Plan
----------------------------------------------------------
0     SELECT STATEMENT Optimizer=CHOOSE
1   0   SORT (UNIQUE)
2   1     UNION-ALL
3   2       TABLE ACCESS (BY INDEX ROWID) OF 'UNIONVSOR'
4   3         INDEX (RANGE SCAN) OF 'UA' (NON-UNIQUE)
5   2       TABLE ACCESS (BY INDEX ROWID) OF 'UNIONVSOR'
6   5         INDEX (RANGE SCAN) OF 'UB' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
10 consistent gets
0 physical reads
0 redo size
63735 bytes sent via SQL*Net to client
7751 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1003 rows processed
The effect of UNION can be seen in the reduction of consistent gets and of the SQL*Net data volume.
tiger:
Rewriting OR as UNION is not only non-equivalent but may also be less efficient. Getting the same result set is only a special case, as I have mentioned in many of my official account articles.
In the test case above, buffer gets drop from 144 to 10 after the UNION rewrite. This figure is implausible: UNION adds a sort step, so the expected outcome is lower efficiency than the original SQL.
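The non-equivalence is easy to reproduce. The sketch below uses a hypothetical inline data set, LOC_DEMO, containing two identical rows; the final statement shows one equivalent manual rewrite on the LOCATION table, using UNION ALL with LNNVL so that rows matched by the first branch are excluded from the second.

```sql
-- OR keeps both duplicate rows: 2 rows returned.
WITH LOC_DEMO AS (
  SELECT 10 AS LOC_ID, 'MELBOURNE' AS REGION FROM DUAL UNION ALL
  SELECT 10, 'MELBOURNE' FROM DUAL UNION ALL
  SELECT 20, 'SYDNEY' FROM DUAL
)
SELECT LOC_ID, REGION FROM LOC_DEMO
WHERE LOC_ID = 10 OR REGION = 'MELBOURNE';

-- UNION removes duplicates: 1 row returned for the same data.
WITH LOC_DEMO AS (
  SELECT 10 AS LOC_ID, 'MELBOURNE' AS REGION FROM DUAL UNION ALL
  SELECT 10, 'MELBOURNE' FROM DUAL UNION ALL
  SELECT 20, 'SYDNEY' FROM DUAL
)
SELECT LOC_ID, REGION FROM LOC_DEMO WHERE LOC_ID = 10
UNION
SELECT LOC_ID, REGION FROM LOC_DEMO WHERE REGION = 'MELBOURNE';

-- Equivalent rewrite of the OR query: the second branch skips rows
-- the first branch already returned (LNNVL is true when the
-- condition is false or unknown).
SELECT LOC_ID, LOC_DESC, REGION FROM LOCATION WHERE LOC_ID = 10
UNION ALL
SELECT LOC_ID, LOC_DESC, REGION FROM LOCATION
WHERE REGION = 'MELBOURNE' AND LNNVL(LOC_ID = 10);
```

In practice the optimizer can apply this kind of OR expansion on its own, so a manual rewrite is rarely needed.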
36. Replace OR with IN
The following query can be replaced by a more efficient statement:
Inefficient:
SELECT … FROM LOCATION WHERE LOC_ID = 10 OR LOC_ID = 20 OR LOC_ID = 30
Efficient:
SELECT … FROM LOCATION WHERE LOC_ID IN (10,20,30);
This is a simple rule to remember, but its actual effect still needs testing; under ORACLE8i the execution paths of the two appear to be the same.
tiger:
Since the execution plans are the same, there is no room for doubt: the two are equivalent, and IN is simply more concise.
37. Avoid IS NULL and IS NOT NULL on indexed columns
Avoid using nullable columns in an index; ORACLE will be unable to use the index.
For a single-column index, a record whose column value is null is not stored in the index.
For a composite index, a record is not stored in the index if every indexed column is null. If at least one column is not null, the record is stored in the index.
For example: if a unique index is built on columns A and B of a table, and the table already holds a record whose A,B values are (123,null), ORACLE will not accept the insert of another record with the same A,B values (123,null).
However, if all indexed columns are null, ORACLE treats the entire key as null, and null is not equal to null. So you can insert 1000 records with the same key value,
provided they are all null! Because null values are not stored in the index, a null comparison on an indexed column in the WHERE clause makes ORACLE ignore the index.
For example:
Inefficient: (the index is not used)
SELECT … FROM DEPARTMENT WHERE DEPT_CODE IS NOT NULL;
Efficient: (the index is used)
SELECT … FROM DEPARTMENT WHERE DEPT_CODE >=0;
tiger:
These two ways of writing the SQL may have behaved differently in 9i; in 10g and above, both produce the same execution plan, and there is no need to rewrite.
38. Always use the leading column of the index
If an index is built on multiple columns,
the optimizer chooses the index only when its first column (leading column) is referenced in the WHERE clause. This is another simple but important rule. See the following example.
SQL> create table multiindexusage ( inda number , indb number , descr varchar2(10));
Table created.
SQL> create index multindex on multiindexusage(inda,indb);
Index created.
SQL> set autotrace traceonly
SQL> select * from multiindexusage where inda = 1;
Execution Plan
----------------------------------------------------------
0     SELECT STATEMENT Optimizer=CHOOSE
1   0   TABLE ACCESS (BY INDEX ROWID) OF 'MULTIINDEXUSAGE'
2   1     INDEX (RANGE SCAN) OF 'MULTINDEX' (NON-UNIQUE)
SQL> select * from multiindexusage where indb = 1;
Execution Plan
----------------------------------------------------------
0     SELECT STATEMENT Optimizer=CHOOSE
1   0   TABLE ACCESS (FULL) OF 'MULTIINDEXUSAGE'
Obviously, when only the second column of the index is referenced, the optimizer uses a full table scan and ignores the index.
tiger:
If the first column has few distinct values, an index skip scan can also be used.
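On releases that support index skip scans, the access path can be requested explicitly to compare it with the full table scan. This is a sketch using the MULTIINDEXUSAGE table and MULTINDEX index from the example above; the optimizer may still decline the hint if the path is not valid.

```sql
-- Request a skip scan of the (inda, indb) index for a predicate
-- on the second column only.
EXPLAIN PLAN FOR
SELECT /*+ INDEX_SS(M MULTINDEX) */ *
FROM MULTIINDEXUSAGE M
WHERE INDB = 1;

-- The plan should show INDEX SKIP SCAN if the hint was honored.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

A skip scan probes the index once per distinct value of the leading column, which is why it only pays off when INDA has few distinct values.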
39. ORACLE internal operations
When executing a query, ORACLE uses internal operations. The following table lists several important ones.
ORACLE clause                     Internal operation
ORDER BY                          SORT ORDER BY
UNION                             UNION-ALL
MINUS                             MINUS
INTERSECT                         INTERSECT
DISTINCT,MINUS,INTERSECT,UNION    SORT UNIQUE
MIN,MAX,COUNT                     SORT AGGREGATE
GROUP BY                          SORT GROUP BY
ROWNUM                            COUNT or COUNT STOPKEY
Queries involving Joins           SORT JOIN,MERGE JOIN,NESTED LOOPS
CONNECT BY                        CONNECT BY
40. Replace UNION with UNION ALL (where possible)
When a SQL statement UNIONs two query result sets, the two sets are first combined in UNION-ALL fashion and then sorted before the final result is output.
If you use UNION ALL instead of UNION, the sort is unnecessary, and efficiency improves as a result.
For example: Inefficient:
SELECT ACCT_NUM,BALANCE_AMT FROM DEBIT_TRANSACTIONS WHERE TRAN_DATE = '31-DEC-95' UNION SELECT ACCT_NUM,BALANCE_AMT FROM DEBIT_TRANSACTIONS WHERE TRAN_DATE = '31-DEC-95'
Efficient :
SELECT ACCT_NUM,BALANCE_AMT FROM DEBIT_TRANSACTIONS WHERE TRAN_DATE = '31-DEC-95' UNION ALL SELECT ACCT_NUM,BALANCE_AMT FROM DEBIT_TRANSACTIONS WHERE TRAN_DATE = '31-DEC-95';
Note that UNION ALL outputs records that appear in both result sets twice. You therefore still need to judge from the business requirements whether UNION ALL is feasible.
UNION sorts the result set, and this operation uses SORT_AREA_SIZE memory. Tuning this memory is also important. The following SQL can be used to check sort consumption:
SELECT substr(name,1,25) "Sort Area Name", substr(value,1,15) "Value" FROM v$sysstat WHERE name LIKE 'sort%'
tiger:
Applying UNION / UNION ALL to two identical SQL statements is a strange choice of test case. UNION ALL simply concatenates two result sets, whereas UNION removes duplicates from the combined result set; the logic is completely different.
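The difference in logic shows up even on the smallest possible input. A sketch using DUAL:

```sql
-- UNION removes the duplicate: one row.
SELECT 1 AS N FROM DUAL
UNION
SELECT 1 FROM DUAL;

-- UNION ALL keeps every row from both branches: two rows.
SELECT 1 AS N FROM DUAL
UNION ALL
SELECT 1 FROM DUAL;
```

Swapping one for the other is therefore a change in results first and a performance decision second.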
41. Use hints (Hints)
For table access, you can use two hints: FULL and ROWID.
The FULL hint tells ORACLE to access the specified table with a full table scan. For example:
SELECT /*+ FULL(EMP) */ * FROM EMP WHERE EMPNO = 7893;
The ROWID hint tells ORACLE to access the table with TABLE ACCESS BY ROWID. TABLE ACCESS BY ROWID is usually what you want, especially when accessing large tables; to use it, you need to know the ROWID or use an index.
If a large table has not been defined as a cached (CACHED) table and you want its data to stay in the SGA after the query ends,
you can use the CACHE hint to tell the optimizer to keep the data in the SGA. The CACHE hint is usually used together with the FULL hint.
For example:
SELECT /*+ FULL(WORKER) CACHE(WORKER)*/ * FROM WORKER;
The INDEX hint tells ORACLE to use an index-based scan. You do not have to specify the index name. For example:
SELECT /*+ INDEX(LODGING) */ LODGING FROM LODGING WHERE MANAGER = 'BILL GATES';
Without the hint, the query above should also use the index. However, if the index has too many duplicate values and your optimizer is the CBO, the optimizer may ignore the index.
In this case, you can use the INDEX hint to force ORACLE to use the index.
ORACLE hints also include ALL_ROWS, FIRST_ROWS, RULE, USE_NL, USE_MERGE, USE_HASH and others.
Using a hint means we are not satisfied with the ORACLE optimizer's default execution path and need to adjust it by hand.
This is highly skilled work. Hints are best reserved for a few specific SQL statements; you should still have confidence in the ORACLE optimizer (especially the CBO).
42. Replace ORDER BY with WHERE
The ORDER BY clause uses an index only under two strict conditions: all columns in the ORDER BY must be contained in the same index and keep the column order of that index, and all columns in the ORDER BY must be defined as NOT NULL. In addition, the index used by the WHERE clause and the index used by the ORDER BY clause cannot be used side by side.
For example, table DEPT contains the following columns:
DEPT_CODE PK NOT NULL
DEPT_DESC NOT NULL
DEPT_TYPE NULL
Non-unique index (DEPT_TYPE). Inefficient: (the index is not used)
SELECT DEPT_CODE FROM DEPT ORDER BY DEPT_TYPE;
EXPLAIN PLAN:
SORT ORDER BY
  TABLE ACCESS FULL
Efficient: (the index is used)
SELECT DEPT_CODE FROM DEPT WHERE DEPT_TYPE > 0;
EXPLAIN PLAN:
TABLE ACCESS BY ROWID ON EMP
  INDEX RANGE SCAN ON DEPT_IDX
ORDER BY can use an index too! This is a point that is easily overlooked. Let's verify it:
SQL> select * from emp order by empno;
Execution Plan
---------------------------------------------------------
0     SELECT STATEMENT Optimizer=CHOOSE
1   0   TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
2   1     INDEX (FULL SCAN) OF 'EMPNO' (UNIQUE)
tiger:
Two SQL statements with different logic are not comparable, and it is common knowledge that ORDER BY can use an index. The last SQL can use the index because the empno column is defined as NOT NULL.
43. Avoid changing the type of an indexed column
When comparing data of different types, ORACLE automatically performs a simple type conversion on the columns. Suppose EMPNO is an indexed column of numeric type.
SELECT … FROM EMP WHERE EMPNO = '123'
In fact, after ORACLE's type conversion, the statement becomes:
SELECT … FROM EMP WHERE EMPNO = TO_NUMBER('123')
Fortunately, the conversion is not applied to the indexed column, so the index can still be used. Now suppose EMP_TYPE is an indexed column of character type.
SELECT … FROM EMP WHERE EMP_TYPE = 123
ORACLE converts this statement to:
SELECT … FROM EMP WHERE TO_NUMBER(EMP_TYPE)=123
Because of the internal type conversion, the index is not used.
To keep ORACLE from applying an implicit type conversion to your SQL, it is better to make the conversion explicit.
Note that when comparing characters with numbers, ORACLE gives priority to converting the numeric type to the character type.
tiger:
The examples above are correct, but the closing summary is backwards: ORACLE converts the character type to the numeric type first.
Only one form of implicit type conversion is listed here; other common ones are date and timestamp, char and nchar, raw and string, and so on.
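A sketch of keeping the conversion off the indexed column, assuming EMP_TYPE is the character column from the example above:

```sql
-- Compare the character column with a character value...
SELECT ENAME FROM EMP WHERE EMP_TYPE = '123';

-- ...or convert the value explicitly, never the column.
SELECT ENAME FROM EMP WHERE EMP_TYPE = TO_CHAR(123);
```

These are not strictly identical to EMP_TYPE = 123: a numeric comparison would also match stored values such as '0123', so the right form depends on how the column is actually populated.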
44. WHERE clauses that need care
The WHERE clause of some SELECT statements does not use an index. Here are some examples. In the following example, '!=' does not use the index.
Remember: an index can only tell you what exists in a table, not what does not exist in it. No index:
SELECT ACCOUNT_NAME FROM TRANSACTION WHERE AMOUNT !=0;
Use index:
SELECT ACCOUNT_NAME FROM TRANSACTION WHERE AMOUNT >0;
tiger:
The premises of an efficient rewrite are:
1. AMOUNT has no negative values
2. few records satisfy AMOUNT > 0
Without these premises, the rewrite is meaningless.
In the following example, '||' is the character concatenation function. As with other functions, it disables the index. No index:
SELECT ACCOUNT_NAME,AMOUNT FROM TRANSACTION WHERE ACCOUNT_NAME||ACCOUNT_TYPE='AMEXA';
Use index:
SELECT ACCOUNT_NAME,AMOUNT FROM TRANSACTION WHERE ACCOUNT_NAME = 'AMEX' AND ACCOUNT_TYPE='A';
tiger: An example of tips 31/32.
In the following example, '+' is a mathematical function. As with other mathematical functions, it disables the index. No index:
SELECT ACCOUNT_NAME,AMOUNT FROM TRANSACTION WHERE AMOUNT + 3000 >5000;
Use index:
SELECT ACCOUNT_NAME,AMOUNT FROM TRANSACTION WHERE AMOUNT > 2000 ;
tiger: An example of tips 31/32.
In the following example, an indexed column is compared with itself, which triggers a full table scan. No index:
SELECT ACCOUNT_NAME,AMOUNT FROM TRANSACTION WHERE ACCOUNT_NAME = NVL(:ACC_NAME,ACCOUNT_NAME);
Use index:
SELECT ACCOUNT_NAME,AMOUNT FROM TRANSACTION WHERE ACCOUNT_NAME LIKE NVL(:ACC_NAME,'%');
If you must use an index on a column wrapped in a function, a newer ORACLE feature,
the function-based index (Function-Based Index), may be a better option.
CREATE INDEX EMP_I ON EMP(UPPER(ename)); /* create a function-based index */
SELECT * FROM emp WHERE UPPER(ename) ='BLACKSNAIL'; /* the index will be used */
45. Concatenating multiple scans
If you compare a column with a finite set of values, the optimizer may perform multiple scans and concatenate the results. For example: SELECT * FROM LODGING WHERE MANAGER IN ('BILL GATES','KEN MULLER'); The optimizer may transform it into the following form:
SELECT * FROM LODGING WHERE MANAGER = 'BILL GATES' OR MANAGER = 'KEN MULLER';
When selecting the execution path, the optimizer may use an index range scan on LODGING$MANAGER for each condition.
The returned ROWIDs are used to access the records of the LODGING table (by TABLE ACCESS BY ROWID). Finally the two sets of records are combined into a single set by concatenation (CONCATENATION).
Explain Plan:
SELECT STATEMENT Optimizer=CHOOSE
  CONCATENATION
    TABLE ACCESS (BY INDEX ROWID) OF LODGING
      INDEX (RANGE SCAN) OF LODGING$MANAGER (NON-UNIQUE)
    TABLE ACCESS (BY INDEX ROWID) OF LODGING
      INDEX (RANGE SCAN) OF LODGING$MANAGER (NON-UNIQUE)
tiger:
The optimizer performs this transformation automatically; programmers do not need to think about whether to use IN or OR. In 10g and above the execution plan no longer shows CONCATENATION but INLIST ITERATOR.
46. Use more selective indexes under the CBO
The cost-based optimizer (CBO, Cost-Based Optimizer) evaluates the selectivity of an index to decide whether using it improves efficiency.
An index is highly selective when each distinct index key value corresponds to only a small number of records.
For example, if a table has 100 records and 80 distinct index key values, the selectivity of the index is 80/100 = 0.8. The higher the selectivity, the fewer records are retrieved per index key value.
If the index selectivity is low, retrieving data requires a large number of index range scans and table accesses by ROWID, which may be less efficient than a full table scan.
Please refer to the following rules of thumb:
a. If the retrieved data exceeds 30% of the records in the table, using an index brings no significant efficiency gain.
b. In certain circumstances, using an index may be slower than a full table scan, but the difference stays within the same order of magnitude. Usually, using an index is several times or even thousands of times faster than a full table scan!
tiger:
A threshold of 30% is very high. It amounts to saying that a column with about 3 distinct values, evenly distributed, can use an index; in practice such an index is very inefficient, less efficient than a full table scan. Under the CBO such an index is not used at all, so creating it is equivalent to creating a useless index that takes up space and slows down DML.
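The selectivity figure described above can be measured before an index is created. A sketch assuming the EMP table and its DEPTNO column from the earlier examples; the second query reads what the optimizer statistics already record and is only meaningful after statistics have been gathered.

```sql
-- Selectivity = distinct key values / total rows (closer to 1 is better).
-- NULLIF avoids a divide-by-zero error on an empty table.
SELECT COUNT(DISTINCT DEPTNO) AS DISTINCT_KEYS,
       COUNT(*) AS TOTAL_ROWS,
       COUNT(DISTINCT DEPTNO) / NULLIF(COUNT(*), 0) AS SELECTIVITY
FROM EMP;

-- The same figures for existing indexes, from the data dictionary.
SELECT INDEX_NAME, DISTINCT_KEYS, NUM_ROWS
FROM USER_INDEXES
WHERE TABLE_NAME = 'EMP';
```

A column with 3 evenly distributed values on a large table would show a selectivity close to 0, which matches the comment's point that such an index is effectively useless.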
47. Avoid resource-intensive operations
SQL statements with DISTINCT, UNION, MINUS, INTERSECT or ORDER BY make the SQL engine perform resource-intensive sorting (SORT).
DISTINCT requires one sort operation; the others require at least two.
For example, take a UNION query in which each branch has a GROUP BY clause.
GROUP BY triggers a nested sort (NESTED SORT);
so each branch is sorted once, and then the UNION runs another unique sort (SORT UNIQUE), which can start only after the nested sorts have finished. The depth of the nested sorts greatly affects the efficiency of the query.
In general, SQL statements with UNION, MINUS or INTERSECT can be rewritten in other ways.
If SORT_AREA_SIZE is well configured in your database, UNION, MINUS and INTERSECT are still worth considering; after all, they are very readable.
tiger:
UNION is the same kind of operation as DISTINCT, so rewriting it makes no sense. MINUS and INTERSECT are best rewritten as table joins, which is more efficient.
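A sketch of the kind of rewrite the comment suggests for MINUS. CLOSED_ACCOUNTS is a hypothetical table, and the two statements are equivalent only under the stated assumption that ACCT_NUM is NOT NULL in both tables, because MINUS treats two NULLs as equal while the NOT EXISTS comparison does not.

```sql
-- Set operator form: distinct account numbers not present in the second table.
SELECT ACCT_NUM FROM DEBIT_TRANSACTIONS
MINUS
SELECT ACCT_NUM FROM CLOSED_ACCOUNTS;

-- Anti-join form; DISTINCT reproduces the duplicate removal MINUS performs.
SELECT DISTINCT D.ACCT_NUM
FROM DEBIT_TRANSACTIONS D
WHERE NOT EXISTS (
  SELECT 1 FROM CLOSED_ACCOUNTS C WHERE C.ACCT_NUM = D.ACCT_NUM
);
```

Whether the anti-join is actually faster depends on the data volumes and available indexes, so both plans should be compared before committing to the rewrite.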
48. Optimize GROUP BY
You can improve the efficiency of a GROUP BY statement by filtering out unwanted records before the GROUP BY. The next two queries return the same result, but the second is clearly much faster. Inefficient:
SELECT JOB ,AVG(SAL) FROM EMP GROUP BY JOB HAVING JOB = 'PRESIDENT' OR JOB = 'MANAGER';
Efficient:
SELECT JOB ,AVG(SAL) FROM EMP WHERE JOB = 'PRESIDENT' OR JOB = 'MANAGER' GROUP BY JOB;
tiger:
This repeats tip 14.
49. Use explicit cursors (CURSORs)
Using an implicit cursor performs two operations:
the first retrieves the record,
the second checks for the TOO MANY ROWS exception.
An explicit cursor does not perform the second operation.
tiger:
Someone has tested this: the implicit cursor form for cur in (select .... from xxx) is concise and also more efficient than an explicit cursor. Of course, explicit cursors have other benefits, so one cannot generalize.
50. Optimize EXPORT and IMPORT
Using a larger BUFFER (such as 10MB)
can speed up EXPORT and IMPORT. ORACLE tries to obtain the memory size you specify and does not report an error even if there is not enough memory. The value must be at least as large as the largest column in the table; otherwise the column value is truncated.
51. Separate tables and indexes
Always create your tables and indexes in different tablespaces (TABLESPACES). Never store objects that are not ORACLE internal system objects in the SYSTEM tablespace. Also make sure that the data tablespace and the index tablespace are on different hard disks.
tiger:
Storing indexes and tables in different tablespaces is a good habit. 9i had no ASM; 10g and above have ASM, so from a performance standpoint there is no hard requirement to keep tables and indexes in separate tablespaces, although doing so brings some management benefits.
Summary:
SQL optimization is a rigorous discipline. It is not a good thing that unproven or plainly wrong "tips" circulate widely on the Internet. I hope that experts who enjoy sharing will take a serious attitude when publishing professional books or writing blogs and official account articles, so as not to mislead readers who do not know better.
My comments are only one person's view; discussion, criticism, and corrections are welcome!