SQL Performance Tuning Techniques
What is SQL Tuning ?
It comprises of two words :
SQL
Tuning
SQL – Structured Query Language
Tuning – Process to get High Performance
Therefore, SQL tuning is a process of improving the performance of SQL query using different techniques and methodologies.
Advantages of SQL Tuning:
Improves Performance
Improves Efficiency
Reduces Cost of Query
Lesser Execution Time
Quick Outputs
The need of SQL Tuning Advisor arises due to the following problems –
- Higher Cost of Query
When Query takes higher execution time, naturally, it will
be costlier.
The Cost of the query can be determined from Explain Plan. - Higher Execution Time
When scanning of tables and output of Query
take longer delays, then expected time of
completion of query gets delayed.
Execution Time can be generated through Toad (Quest Software, Interface to access Database) or even SQL plus.
Explain PLAN?
- It’s an application generated plan which conveys detailed information about the Cost of SQL query.
- If it is not enabled in the Toad, then it can be enabled by creating a plan_table
in the Toad (if generating plan through Toad).
Difference between Where clause and Having clause?
Both clauses, Having and Where are widely used in making SQL queries and are used to filter the data with respect to certain conditions.
The major difference between Having and Where is the Having clause first fetches the whole data and then filtering/sorting/summing is done while in case of Where clause, the filtering/sorting are done before fetching the whole data.
Thus, Where clause takes less time to execute and hence will cost less as compare to Having clause.
Statement 1:
SELECT region, AVG (population)
FROM location
GROUP BY region
HAVING region != ‘Bangalore’
AND region != ‘Mysore’;
In the statement 1, the whole data will be first fetched and grouped by region and hence, those records will be excluded whose region is not either in Bangalore or in Mysore. Now, this exclusion will be done from the total number of records that were fetched. Hence, the query will take considerably a good time and hence, a high cost.
Statement 2:
SELECT region, AVG (population)
FROM location
WHERE region != ‘Bangalore’
AND region != ‘Mysore’
GROUP BY region;
As compare to statement 1, the query logic will first exclude those records whose region is not either in Bangalore or Mysore and hence grouping of records will take place with respect to region. Thus, filtering of records is done at initial level only that will lead to the faster execution time and hence, the cost of query will fall significantly.
Union vs Union All clause?
The Union clause are generally used to combine two tables using some validations. However, there is a major difference between Union and Union All clause.
Union clause is used to combine two or more tables when there is a repetition of data and the records to be displayed should be distinct whereas Union All is used when there is no repetition of data.
The major difference between Union and Union All clause is that the Union clause is costly as compare to Union All clause. Hence, whenever there is a need to print data from multiple tables with multiple conditions, we go for Union All clause; provided there is no scope of redundancy in data.
A query with one or more OR clauses can sometimes be rewritten as a series of queries that are combined with a UNION statement. This may or may not improve the performance but will help to use available Indexes and hence, indirectly boost the performance of the query.
For Example:
SELECT emp_name, emp_location
FROM emp_details
WHERE emp_dept = ‘ABC’ or city = ‘Hyderabad’ or IBU = ‘Test’
The above statement can be rewritten as:
SELECT emp_name, emp_location
FROM emp_details
WHERE emp_dept = ‘ABC’
UNION
SELECT emp_name, emp_location
FROM emp_details
WHERE city = ‘Hyderabad’
UNION
SELECT emp_name, emp_location
FROM emp_details
WHERE IBU = ‘Test’
Minus vs Not Exist
Example:
SELECT birth_date, last_name, first_name
FROM employee
MINUS
SELECT birth_date, last_name, first_name FROM stock_holder;
The above query will return those records from the table ‘employee’ which are not in the table ‘stock_holder’.
Instead of using Minus operator, as it is a costly operator, we generally go for Not Exist operator by using Indexes.
Instead of using Minus operator, as it is a costly operator, we generally go for Not Exist operator by using Indexes.
For Example:
SELECT birth_date, last_name, first_name
FROM employee
WHERE NOT EXISTS
(SELECT 1
FROM stock_holder
WHERE stock_holder.birth_date
= employee.birth_date
AND stock_holder.first_name =
employee.first_name);
Distinct vs Exists
SELECT DISTINCT emp.emp_name, dept.dept_name, dept.dept_number
FROM employee_details emp,
department_details dept
WHERE dept.dept_number=emp.dept_number
This query is returning all the employee names, department numbers and names which have at least one employee.
But the problem is that the query needs to check the condition each time it moves i.e. it will check the same condition for each and every record separately. Thus, the query will take more time to execute.
SELECT dept.dept_number, dept.dept_name
FROM employee_details emp,
department_details dept
WHERE EXISTS(
SELECT ‘X‘
FROM employee_details emp
WHERE dept.dept_number=emp.dept_number);
This query is fetching the same records but the difference is in the execution time. Here, when the query optimizer realizes that the sub query has been satisfied once, hence there is no need to proceed further and the next matching records can be fetched thereafter. Thus there is no need to move to the same condition over and again.
Hence, the use of Exists will lead to the faster execution of query as compare to its counterpart Distinct.
Some points to analyze:
- Avoid ORDER BY and GROUP BY clauses in the SQL query if they can be avoided.
- Always have the foreign key indexed.
- Ensure that columns used in OR clause is indexed else will lead to full table scan.
- LIKE clause beginning with ‘%’ or ‘_’ will not use index even if the column is indexed.
- Indexes should be considered on all columns that are frequently accessed by the WHERE, ORDER BY, GROUP BY, TOP, and DISTINCT clauses. Without an index, each of these operations will require a table scan of your table, potentially hurting performance.
- In the Join conditions, put the table name, which is either the smallest, or has the minimum dependencies (i.e. in the WHERE conditions) to the rightmost.
- Oracle reads the table from right to left and it has to sort and merge the driving table.
- In the WHERE clauses, put the most expensive AND condition at the beginning and the least expensive one at the last.
- Oracle evaluates the conditions from the last condition.
- NULL is never equal to anything, never NOT equal to anything, never >= to anything, never <= to anything.
- Evaluation of Inequality disables usage of the Index. So avoid != or <> conditions.
- The condition > 0 is better than != 0.
- Of the search criterion’s in the where clause, at least one of them should be based on a highly selective column that has an index.
- If at least one of the search criterion’s in the where clause is not highly selective, consider adding indexes to all of the columns referenced in the where clause.
- If none of the column in the where clause are selective enough to use an Index on their own, consider creating a covering Index for this query.