Dynamic SQL in the wrong hands can have a drastic effect on the performance of your database. This article discusses remedial and preventative techniques to ensure that your dynamic SQL performs well.
Dynamic SQL is very accessible through a wide range of programming technologies, such as JDBC with Java, ODBC with Visual Basic, DBMS_SQL, and Native Dynamic SQL (NDS) within PL/SQL. Sometimes a developer is faced with a problem where compiled static SQL does not provide the required flexibility for a demanding requirement, and has no choice but to resort to dynamic SQL.
However, using dynamic SQL in these technologies is increasingly becoming the norm rather than the exception, and without deep experience in developing well-tuned SQL, solutions are being delivered poorly prepared for the high-volume, scalable systems they are expected to operate in.
This article is directed primarily at developers writing dynamic SQL, but also DBAs and architects requiring awareness of the issues surrounding dynamic SQL.
The first part of this article discusses some of the remedial tasks that can be applied to minimize the side effects of inappropriately written dynamic SQL in existing applications. It then discusses some of the preventative techniques that can be used to improve the performance of dynamic SQL during the design and development stages. It also covers how dynamic SQL can be used to improve the performance of solutions initially developed using static SQL.
This article uses the standard example SCOTT schema and the PL/SQL language to demonstrate the discussion points and examples. To simulate very high volumes of data in the tables used, it will be necessary to import statistics using the following commands:
BEGIN
  DBMS_STATS.SET_TABLE_STATS (
    ownname       => 'SCOTT',
    tabname       => 'EMP',
    partname      => NULL,
    stattab       => NULL,
    statid        => NULL,
    numrows       => 1400000,
    numblks       => 10000,
    avgrlen       => 40,
    flags         => NULL,
    statown       => NULL,
    no_invalidate => FALSE );
END;
/

BEGIN
  DBMS_STATS.SET_TABLE_STATS (
    ownname       => 'SCOTT',
    tabname       => 'DEPT',
    partname      => NULL,
    stattab       => NULL,
    statid        => NULL,
    numrows       => 100000,
    numblks       => 1000,
    avgrlen       => 40,
    flags         => NULL,
    statown       => NULL,
    no_invalidate => FALSE );
END;
/
To restore the original statistics on the tables afterwards, re-gather them using the DBMS_STATS package.
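For example, a minimal sketch of re-gathering real statistics to replace the simulated values might look like this (the `cascade` option also refreshes the index statistics):

```sql
-- Replace the simulated statistics with freshly gathered ones.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SCOTT', tabname => 'EMP',  cascade => TRUE);
  DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SCOTT', tabname => 'DEPT', cascade => TRUE);
END;
/
```

Alternatively, DBMS_STATS.DELETE_TABLE_STATS can be used to simply remove the imported values.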
It is not uncommon for programmers to build a dynamic SQL (or PL/SQL) statement that does not use bind variables, where every call builds the specific literal value into the SQL string. This virtually guarantees that every statement submitted to the server is unique, eliminating any chance of sharing SQL statements in the SGA.
For those statements that do use bind variables and are being successfully shared in the library cache, a pounding from an application using dynamic SQL with literal values can quickly flush them out, giving little opportunity for them to be reused.
When a SQL statement is sent to the server, Oracle first scans the library cache for statements that match in structure and in the use of bind variables. The values of the bind variables are not considered for this match. If no match is found, the statement is parsed (known as a hard parse) and stored along with its execution plan in the library cache. If a statement is found, Oracle can reuse the execution plan and avoid performing a hard parse, a CPU-intensive task. This reuse is known as a soft parse.
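The difference is easy to see in PL/SQL. The sketch below contrasts the two styles; `p_empno` is a hypothetical parameter used for illustration:

```sql
-- Literal-building version: every distinct value of p_empno produces a
-- unique statement text, so each call forces a hard parse.
EXECUTE IMMEDIATE
  'UPDATE emp SET sal = sal * 1.1 WHERE empno = ' || TO_CHAR(p_empno);

-- Bind-variable version: one statement text regardless of the value,
-- so repeat executions are soft-parsed from the library cache.
EXECUTE IMMEDIATE
  'UPDATE emp SET sal = sal * 1.1 WHERE empno = :b_empno'
  USING p_empno;
```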
Increasing the size of the shared pool does not necessarily alleviate the problem, and if the resizing is over-compensated, it can often add to the problem. The more statements in the library cache, the more statements Oracle has to scan and compare in order to find a match, resulting in degraded parse times.
In an ideal world, PL/SQL programming standards should be strict enough to prevent literal values from being used in dynamic SQL. However, the problem often comes to light late in a project, or, even more concerning, a packaged solution may be guilty of the problem. There are some quick solutions that can help to overcome this without changing the code.
The CURSOR_SHARING=FORCE configuration parameter setting causes the parsing process to replace all literal strings and constant values in a SQL statement with system-generated bind variables.
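The parameter can be set at either session or instance level, which makes it easy to trial on a single session before committing to an instance-wide change:

```sql
-- Trial the setting for the current session only.
ALTER SESSION SET CURSOR_SHARING = FORCE;

-- Apply it instance-wide once the effect has been verified.
ALTER SYSTEM SET CURSOR_SHARING = FORCE;
```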
There are side effects to using bind variables, particularly if the values in indexed columns are heavily skewed. Histograms gathered when analysing a table and its indexes help the Cost Based Optimiser (CBO) decide whether the selectivity of a particular value would perform better with an index or a full table scan. Literal values force a reparse each time and therefore benefit from any histogram statistics. Bind variables only allow the values used for the initial parse (known as bind peeking) to be taken into consideration, and therefore do not take full advantage of histograms. Setting this configuration parameter is not recommended on data that is heavily skewed. However, if there are only localised issues with skewness, they can be dealt with on an individual basis using the optimiser hint CURSOR_SHARING_EXACT to override the configuration parameter. E.g.
SELECT /*+ CURSOR_SHARING_EXACT */ .... FROM ...
The performance of dynamic SQL can be improved by the approaches described in this section. Dynamic SQL can also itself be used to improve the performance of static embedded SQL, as described later in the section.
Existing Dynamic SQL
An existing dynamic SQL statement that already has a good execution plan can be further improved by ensuring that bind variables are used wherever possible. Any literal strings or hard-coded values (including constants) should be replaced with bind variables (e.g. PL/SQL uses the colon prefix, and JDBC prepared statements use the question mark).
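As a minimal before-and-after sketch in NDS (`v_deptno`, `v_sqltext` and `c_sql` are assumed to be declared as in the later examples):

```sql
-- Before: the literal value makes each generated statement unique.
v_sqltext := 'SELECT ename FROM emp WHERE deptno = ' || TO_CHAR(v_deptno);
OPEN c_sql FOR v_sqltext;

-- After: the :p_deptno placeholder lets every execution share one
-- library cache entry, with the value supplied via the USING clause.
v_sqltext := 'SELECT ename FROM emp WHERE deptno = :p_deptno';
OPEN c_sql FOR v_sqltext USING v_deptno;
```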
Dynamic SQL Opportunities
Not all uses of dynamic SQL are obvious and in some scenarios where static embedded SQL has been used, better performance can be achieved by using dynamic SQL.
Consider the scenario where we have a procedure used by a Query By Example (QBE) screen, passing in a number of optional search parameters, and returning an open cursor back to the screen application. The problem with NULL bind variables is that you cannot use the standard operators on them, but must use the IS NULL expression instead. A common technique to get round this is to use the NVL function to self join the column back to itself if the bind variable is NULL, e.g.
a.emp_name = NVL(:p_emp_name, a.emp_name)
Let us take a typical statement from a QBE screen on an OLTP (On Line Transaction Processing) system where a FIRST_ROWS optimiser goal would be favoured using nested loop joins.
SELECT e.ename, d.dname
FROM   emp e, dept d
WHERE  e.deptno = d.deptno
AND    d.dname = NVL(:v_dept_name, d.dname)
AND    e.ename = NVL(:v_emp_name, e.ename)
This SQL statement contains table joins and uses bind variables operating on columns spanning more than one table. The optimiser will attempt to do the best it can in trying to determine a suitable driving table that covers all scenarios, which could result in a CONCATENATION execution plan such as the one below.
SELECT STATEMENT, GOAL = CHOOSE
 CONCATENATION
  FILTER
   TABLE ACCESS BY INDEX ROWID EMP
    NESTED LOOPS
     TABLE ACCESS FULL DEPT
     INDEX RANGE SCAN IDX_EMP_01
  FILTER
   NESTED LOOPS
    TABLE ACCESS BY INDEX ROWID EMP
     INDEX RANGE SCAN IDX_EMP_02
    TABLE ACCESS BY INDEX ROWID DEPT
     INDEX UNIQUE SCAN PK_DEPT
However, if a dozen or so parameters can be applied to the query, then an execution plan like the one above would not be very effective.
Let us consider some of the most likely QBE scenarios that the query could cover, and look at the execution plans of a SQL statement specifically written for the populated bind variables.
Scenario 1 : The open query
The execution plan has no choice but to full table scan the tables and retrieve the full data.
SELECT e.ename, d.dname
FROM   emp e, dept d
WHERE  e.deptno = d.deptno

SELECT STATEMENT, GOAL = CHOOSE
 HASH JOIN
  TABLE ACCESS FULL EMP
  TABLE ACCESS FULL DEPT
Scenario 2 : Query on employee name
The execution plan drives off the EMP table using the index on ENAME.
SELECT e.ename, d.dname
FROM   emp e, dept d
WHERE  e.deptno = d.deptno
AND    e.ename = :v_emp_name

SELECT STATEMENT, GOAL = CHOOSE
 NESTED LOOPS
  TABLE ACCESS BY INDEX ROWID EMP
   INDEX RANGE SCAN IDX_EMP_02
  TABLE ACCESS BY INDEX ROWID DEPT
   INDEX UNIQUE SCAN PK_DEPT
Scenario 3 : Query on department name
The execution plan drives off the DEPT table using the index on DNAME.
SELECT e.ename, d.dname
FROM   emp e, dept d
WHERE  e.deptno = d.deptno
AND    d.dname = :v_dept_name

SELECT STATEMENT, GOAL = CHOOSE
 TABLE ACCESS BY INDEX ROWID EMP
  NESTED LOOPS
   TABLE ACCESS BY INDEX ROWID DEPT
    INDEX RANGE SCAN IDX_DEPT_01
   INDEX RANGE SCAN IDX_EMP_01
What this demonstrates is that each combination of bind variables has its own specific requirements to achieve an efficient execution plan. Dynamic SQL provides a means of overcoming this by being able to address the populated bind variables effectively. The instinct would be to leave the predicates for NULL bind variables out of the query altogether, but NDS is limited to a fixed number of bind variables with a predefined data type, so all bind variables must be present in the query. However, they can be passively handled by simply adjusting the predicate in the dynamic SQL to:
:v_dept_name IS NULL
The following PL/SQL block demonstrates how a SQL statement can be built up using the full complement of bind variables by using the dummy predicate.
DECLARE
  v_ename   emp.ename%TYPE := 'ADAMS';
  v_dname   dept.dname%TYPE := NULL;
  TYPE t_sql IS REF CURSOR;
  c_sql     t_sql;
  v_sql     emp%ROWTYPE;
  v_sqltext VARCHAR2(32767);
BEGIN
  v_sqltext := 'SELECT e.* '||   -- e.* so the fetched row fits emp%ROWTYPE
               'FROM emp e, '||
               '     dept d '||
               'WHERE e.deptno = d.deptno '||
               'AND '||
               CASE WHEN v_ename IS NULL
                    THEN ':p_ename IS NULL '
                    ELSE 'e.ename = :p_ename '
               END||
               'AND '||
               CASE WHEN v_dname IS NULL
                    THEN ':p_dname IS NULL '
                    ELSE 'd.dname = :p_dname '
               END;
  DBMS_OUTPUT.PUT_LINE('SQL is : '||v_sqltext);
  OPEN c_sql FOR v_sqltext USING v_ename, v_dname;
  LOOP
    FETCH c_sql INTO v_sql;
    EXIT WHEN c_sql%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE('Employee Name: '||v_sql.ename);
  END LOOP;
  CLOSE c_sql;
EXCEPTION
  WHEN OTHERS THEN
    IF c_sql%ISOPEN THEN
      CLOSE c_sql;
    END IF;
    RAISE;
END;
/
There are scenarios where bind variables cannot be used in dynamic SQL, such as IN statements with a varying number of values in their lists. Multi-select list boxes are commonly used in Query-By-Example (QBE) screens and are a typical example of where an IN statement would be best handled by a dynamic SQL statement using literal values.
We could have a procedure that accepts an array-based parameter holding the list of values selected from the list box, and then builds a dynamic SQL statement using an IN operator listing the values as literals. E.g.
SELECT e.ename, d.dname
FROM   emp e, dept d
WHERE  e.deptno = d.deptno
AND    d.dname IN ('Accounts', 'Sales', 'Marketing');
Unfortunately, we are now back to the literal variable problem again and this will affect the library cache if there is heavy use of the screen application.
Ideally, we would want to use the array based parameter passed into the procedure as a bind variable directly in the SQL. Standard SQL syntax would not allow you to use this, but the Object Relational features of Oracle do provide a useful means of dealing with collections.
First, we create an object type and represent it as a collection using a table type:
CREATE OR REPLACE TYPE t_value AS OBJECT
( list_value VARCHAR2(2000) );
/

CREATE OR REPLACE TYPE t_value_list AS TABLE OF t_value;
/
The collection can then be used in SQL by casting it into a table, enabling it to be queried like a standard table.
SELECT *
FROM   emp
WHERE  ename IN
       ( SELECT list_value
         FROM   TABLE( CAST( the_big_in.get_value_list AS t_value_list )) );
The example above uses a function in the "the_big_in" package to represent the collection. The reason is that collections cannot be represented as bind variables, so the statement has to use a package function to return a pre-populated package global variable. The complete package would look something like the one below. It consists of a global variable to hold the list of values in the object type collection; a function used in SQL to return the global variable collection; a procedure called by the QBE screen that returns a REF CURSOR; and a procedure to demonstrate the use of the query procedure.
CREATE OR REPLACE PACKAGE BODY the_big_in IS

  -- Note: this technique requires a patched Oracle9i release.

  TYPE t_ref_cursor IS REF CURSOR;

  g_value_list t_value_list;

  FUNCTION get_value_list RETURN t_value_list IS
  BEGIN
    RETURN g_value_list;
  END;

  PROCEDURE get_query (
    p_emp_list IN  t_value_list,
    p_cursor   OUT t_ref_cursor ) AS
    v_cursor t_ref_cursor;
  BEGIN
    g_value_list := p_emp_list;
    OPEN v_cursor FOR
      SELECT *
      FROM   emp
      WHERE  ename IN
             ( SELECT list_value
               FROM   TABLE( CAST( the_big_in.get_value_list AS t_value_list )) );
    p_cursor := v_cursor;
  END;

  PROCEDURE run_query AS
    v_emp_list       t_value_list := t_value_list();
    v_empty_emp_list t_value_list := t_value_list();
    v_cursor         t_ref_cursor;
    v_emp            emp%ROWTYPE;
  BEGIN
    v_emp_list := v_empty_emp_list;
    v_emp_list.EXTEND;
    v_emp_list(v_emp_list.LAST) := t_value('Fred Bloggs');
    v_emp_list.EXTEND;
    v_emp_list(v_emp_list.LAST) := t_value('David Jones');
    v_emp_list.EXTEND;
    v_emp_list(v_emp_list.LAST) := t_value('Bill Smith');
    get_query(v_emp_list, v_cursor);
    IF v_cursor%ISOPEN THEN
      DBMS_OUTPUT.PUT_LINE('Cursor is open');
      FOR n IN 1..v_emp_list.COUNT LOOP
        DBMS_OUTPUT.PUT_LINE(v_emp_list(n).list_value);
      END LOOP;
    END IF;
    LOOP
      FETCH v_cursor INTO v_emp;
      EXIT WHEN v_cursor%NOTFOUND;
      DBMS_OUTPUT.PUT_LINE(v_emp.ename||', '||v_emp.deptno);
    END LOOP;
    CLOSE v_cursor;
  END;

END;
/
Consider taking dynamic SQL further by attempting dynamic PL/SQL. The benefits do not necessarily include performance gains, but the opportunities for highly generic reusable code are endless. This is very attractive from a software engineering perspective where there is a high emphasis on component reusability.
Consider an ETL application which copies data from table to table. The application allows you to define a source SQL statement (extract and transform) and a target table (load) with associated target columns. Configuration settings allow you to select features supporting various loading approaches, such as bulk binding and collecting, record type inserts and updates, update if row already present, etc.
Without catering specifically for a particular table and its list of required columns, and without knowing which loading options are required, it would be impossible to provide a solution in PL/SQL relying on dynamic SQL alone. However, the PL/SQL to perform the required task on the required table can itself be built dynamically and executed using EXECUTE IMMEDIATE or DBMS_SQL.
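As a hypothetical sketch (the target table, column list, and source query here stand in for values that would come from the application's configuration), an anonymous PL/SQL block can be assembled as a string and executed:

```sql
DECLARE
  -- These values would be read from configuration tables in a real ETL tool.
  v_target VARCHAR2(30)  := 'EMP_COPY';
  v_cols   VARCHAR2(200) := 'empno, ename, deptno';
  v_source VARCHAR2(500) := 'SELECT empno, ename, deptno FROM emp';
  v_plsql  VARCHAR2(32767);
BEGIN
  -- Build the anonymous block dynamically...
  v_plsql := 'BEGIN '||
             '  INSERT INTO '||v_target||' ('||v_cols||') '||v_source||'; '||
             '  COMMIT; '||
             'END;';
  -- ...and execute it.
  EXECUTE IMMEDIATE v_plsql;
END;
/
```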
There are some drawbacks to be aware of when executing anonymous PL/SQL blocks from within compiled PL/SQL. Unlike compiled PL/SQL, anonymous blocks execute with invoker rights; that is, the privileges for any data manipulation language (DML) statements must be appropriately granted to the user executing the compiled PL/SQL, not just to the owner. In contrast, DML embedded in compiled PL/SQL code executes with the privileges of the owner of the compiled program. Also, the EXECUTE IMMEDIATE command accepts the statement only in a VARCHAR2 variable and is therefore limited to 32,767 bytes of code (databases with multi-byte character sets reduce this further). If this size is exceeded, the alternative means of executing the dynamic PL/SQL is the DBMS_SQL supplied package, which is not so limited.
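DBMS_SQL avoids the size limit because its PARSE procedure has an overload that accepts the source split across a PL/SQL table of VARCHAR2 lines. A minimal sketch (the three lines here stand in for a much larger generated block):

```sql
DECLARE
  v_code   DBMS_SQL.VARCHAR2S;  -- PL/SQL table of code lines
  v_cursor INTEGER;
  v_result INTEGER;
BEGIN
  -- In a real generator these lines would be appended as the block is built,
  -- allowing the total source to exceed 32,767 bytes.
  v_code(1) := 'BEGIN ';
  v_code(2) := '  NULL; ';      -- generated statements would go here
  v_code(3) := 'END;';

  v_cursor := DBMS_SQL.OPEN_CURSOR;
  DBMS_SQL.PARSE(v_cursor, v_code, v_code.FIRST, v_code.LAST,
                 TRUE, DBMS_SQL.NATIVE);
  v_result := DBMS_SQL.EXECUTE(v_cursor);
  DBMS_SQL.CLOSE_CURSOR(v_cursor);
END;
/
```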
A common misconception about dynamic SQL is that, because it is not compiled, it will be slower. However, this article has demonstrated that dynamic SQL, despite the potential overhead of a parse on each execution, can improve performance significantly in other respects, such as achieving a far superior execution plan.