Context switching is one of the most overlooked topics in Oracle database tuning, as you can’t observe it directly (there is no “context switching” wait event).

Instead, you have to take an indirect approach and look for increased CPU consumption, since the switching overhead is recorded as CPU time.

What is context switching, and what impact does it have on SQL performance?

SQL, as a nonprocedural, declarative language, has a separate execution engine within the Oracle database kernel.

PL/SQL, on the other hand, is a typical procedural language with its own execution engine.

If you want to call a PL/SQL function from SQL, Oracle must switch between the two execution engines, which brings overhead, as all variables and state must be passed between them (this is what is called context switching).

Although the overhead of a single switch is usually very small, context switching happens for each row.

If you have 10,000 rows, it means you’ll call the PL/SQL function 10,000 times.

If the overhead of each call is 0.01 seconds, you’ll end up with 100 seconds of overhead and increased CPU consumption.

Although Oracle SQL has advanced incredibly over the years, it is still not possible to code everything in SQL, which means PL/SQL function calls are sometimes inevitable.

Still, in the vast majority of cases, context switching is triggered by a lack of knowledge of SQL capabilities, developer laziness, or ignorance of the new SQL functionality offered by each release of the Oracle database.

The most important thing is to know what impact of context switching you can expect and how to reduce it.

Let’s start by finding the answers to those questions.

The best place to start is the SQL ordered by CPU Time section of an AWR report.

If you observe a SQL statement with significant CPU consumption, and after inspecting its text you find a function call, you need to dig further and check the PL/SQL code.

AWR SQL ordered by CPU Time
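If you prefer a query over the report itself, a rough equivalent can be pulled from the AWR repository views. This is a sketch: it assumes you are licensed for the Diagnostics Pack, and it sums the deltas across all snapshots currently kept in the repository.

```sql
-- Top 10 SQL statements by CPU time from AWR history (Diagnostics Pack required)
select *
  from (select sql_id,
               round(sum(cpu_time_delta) / 1e6, 2) as cpu_sec,
               sum(executions_delta)               as execs
          from dba_hist_sqlstat
         group by sql_id
         order by cpu_sec desc)
 where rownum <= 10;
```

Any sql_id near the top that turns out to contain a PL/SQL function call is a candidate for the analysis described below.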

Throughout this article I’m going to use the SH sample schema, more specifically the sales and customers tables, whose data model is presented in the picture below.

SH Data model

To simulate context switching I need to create a simple function:

create or replace function fcust_name(p_cust_id in number) return varchar2
as
    l_cust_name varchar2(256);
begin
    select cust_first_name || ' ' || cust_last_name
      into l_cust_name
      from customers
     where cust_id = p_cust_id;

    return l_cust_name;
end;
/

and to execute the following code:

SH@orcl> select fcust_name(s.cust_id), s.* from sales s;

918843 rows selected.

Elapsed: 00:00:15.67

Statistics
----------------------------------------------------------
     918844  recursive calls
          0  db block gets
    2767309  consistent gets
         16  physical reads
          0  redo size
   40204257  bytes sent via SQL*Net to client
     101691  bytes received via SQL*Net from client
       9190  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     918843  rows processed

As you can see, the elapsed time is 15.67 seconds to fetch almost 1 million records.

If I execute a SQL statement that fetches the same result, but this time without PL/SQL function calls, I get the following results.

SH@orcl> select c.cust_first_name || ' ' || c.cust_last_name , s.* from sales s, customers c where s.cust_id = c.cust_id;

918843 rows selected.

Elapsed: 00:00:01.98

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      12301  consistent gets
       1472  physical reads
          0  redo size
   23926382  bytes sent via SQL*Net to client
     101691  bytes received via SQL*Net from client
       9190  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     918843  rows processed

The code completes in 1.98 seconds, which is roughly 7x faster (15.67 seconds was the result with function calls).

While in the first case I had 918,844 recursive calls, in the second case I have only 1, which is an indicator of what is going on behind the scenes.

To investigate further, you need to create 10046 trace files for both cases.
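A minimal way to produce such a trace for one test run is to enable event 10046 at level 8 for the session; the trace file name in the tkprof example below is only an illustration, as the actual name depends on your instance and process id.

```sql
-- tag the trace file so it is easy to find in the diagnostic destination
alter session set tracefile_identifier = 'ctx_switch';

-- level 8 includes wait events in the raw trace
alter session set events '10046 trace name context forever, level 8';

select fcust_name(s.cust_id), s.* from sales s;

alter session set events '10046 trace name context off';

-- then format the raw trace on the database server, e.g.:
-- $ tkprof orcl_ora_12345_ctx_switch.trc ctx_switch.txt sys=no
```

The row source operations and call statistics shown in the rest of this article come from tkprof output produced this way.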

Here is the execution plan for the SQL + PL/SQL function call case.

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
    918843     918843     918843  PARTITION RANGE ALL PARTITION: 1 28 (cr=10780 pr=0 pw=0 time=589642 us starts=1 cost=517 size=26646447 card=918843)
    918843     918843     918843   TABLE ACCESS FULL SALES PARTITION: 1 28 (cr=10780 pr=0 pw=0 time=412394 us starts=28 cost=517 size=26646447 card=918843)

The execution plan for the second SQL statement (no PL/SQL function calls) seems more complex.

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
    918843     918843     918843  HASH JOIN  (cr=12303 pr=1269 pw=0 time=637267 us starts=1 cost=942 size=45023307 card=918843)
     55500      55500      55500   NESTED LOOPS  (cr=1522 pr=0 pw=0 time=31512 us starts=1 cost=942 size=45023307 card=918843)
     55500      55500      55500    NESTED LOOPS  (cr=1522 pr=0 pw=0 time=24349 us starts=1)
     55500      55500      55500     STATISTICS COLLECTOR  (cr=1522 pr=0 pw=0 time=17827 us starts=1)
     55500      55500      55500      TABLE ACCESS FULL CUSTOMERS (cr=1522 pr=0 pw=0 time=10409 us starts=1 cost=423 size=1110000 card=55500)
         0          0          0     PARTITION RANGE ALL PARTITION: 1 28 (cr=0 pr=0 pw=0 time=0 us starts=0)
         0          0          0      BITMAP CONVERSION TO ROWIDS (cr=0 pr=0 pw=0 time=0 us starts=0)
         0          0          0       BITMAP INDEX SINGLE VALUE SALES_CUST_BIX PARTITION: 1 28 (cr=0 pr=0 pw=0 time=0 us starts=0)(object id 74422)
         0          0          0    TABLE ACCESS BY LOCAL INDEX ROWID SALES PARTITION: 1 1 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=517 size=493 card=17)
    918843     918843     918843   PARTITION RANGE ALL PARTITION: 1 28 (cr=10781 pr=1269 pw=0 time=262488 us starts=1 cost=517 size=26646447 card=918843)
    918843     918843     918843    TABLE ACCESS FULL SALES PARTITION: 1 28 (cr=10781 pr=1269 pw=0 time=248242 us starts=28 cost=517 size=26646447 card=918843)

The trick is that by looking at the execution plan you won’t be able to see the problem, as the PL/SQL part is missing.

In the SQL + PL/SQL function call example there is another SQL statement (a recursive call) you need to inspect:

SELECT CUST_FIRST_NAME || ' ' || CUST_LAST_NAME 
FROM
 CUSTOMERS WHERE CUST_ID = :B1 

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  TABLE ACCESS BY INDEX ROWID CUSTOMERS (cr=3 pr=2 pw=0 time=1259 us starts=1 cost=2 size=20 card=1)
         1          1          1   INDEX UNIQUE SCAN CUSTOMERS_PK (cr=2 pr=2 pw=0 time=1254 us starts=1 cost=1 size=0 card=1)(object id 74388)

Now we can clearly see that the main execution plan in the SQL + PL/SQL function call case is incomplete, but we still have no evidence of where the 7x larger elapsed time comes from.

To answer that question, we need to look at a different part of the SQL trace file.

In SQL + PL/SQL function call example, we have the following fragment:

select fcust_name(s.cust_id), s.* 
from
 sales s


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     9190     25.43      31.53          0      10780          0      918843
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     9192     25.44      31.54          0      10780          0      918843

As you can see. 25.44 seconds was spent on CPU (fetch phase as it’s SELECT statement).

Furthermore, you can find the following within the same trace file:

SELECT CUST_FIRST_NAME || ' ' || CUST_LAST_NAME 
FROM
 CUSTOMERS WHERE CUST_ID = :B1 


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 918843     23.21      27.36          0          0          0           0
Fetch   918843      3.09       4.02        106    2756529          0      918843
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   1837687     26.30      31.38        106    2756529          0      918843

From the previous output you can see the actual number of function calls (918,843) and the CPU consumption (26.3 seconds, with the majority of the time coming from the execute phase).

This is the point where the majority of the elapsed time is spent.

By now you should be able to estimate what impact of context switching you can expect.

Let’s look at the impact of context switching from a different perspective, by using top or a similar Unix/Linux tool.

In the following picture you can see the CPU consumption while executing the SQL + PL/SQL function call case.

context switching

As you can see, the CPU consumption is huge, taking into account that it comes from just one single SQL statement running in one thread (not in parallel).

Let’s see the same SQL without PL/SQL function calls.

no context switching, hard parsing

Although the CPU load is significantly lower compared with the context switching case, it’s still not negligible, as the SQL engine needs to hard parse the statement when executing it for the first time.

In the following picture you can see the load when executing the same SQL for the second time (meaning without hard parsing).

no context switching, no hard parsing

The CPU load in the case without context switching and without hard parsing is almost negligible.

 

How to fix context switching performance issues?

 

The best approach is to rewrite the SQL & PL/SQL in such a way that all logic is performed in SQL.

Keep in mind that over the decades, Oracle has enhanced the SQL language significantly.

Thus many cases where you used to need the SQL + PL/SQL function call approach can today be rewritten in pure SQL.

In the rare cases where removing PL/SQL function calls is not possible, you need to examine the logic behind them to reduce the number of PL/SQL calls, as each call counts.
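When the function has to stay and you are on 12c or later, PRAGMA UDF can at least lower the cost of each switch by compiling the function specifically for calls from SQL. A sketch based on the fcust_name function above; note that this reduces, but does not eliminate, the overhead:

```sql
create or replace function fcust_name(p_cust_id in number) return varchar2
as
    pragma udf;  -- 12c+: compile the function for cheaper calls from SQL
    l_cust_name varchar2(256);
begin
    select cust_first_name || ' ' || cust_last_name
      into l_cust_name
      from customers
     where cust_id = p_cust_id;

    return l_cust_name;
end;
/
```

The function remains callable from PL/SQL as well, but the optimization targets the SQL-to-PL/SQL call path.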

Additionally, you can look into caching (there are many options, such as scalar subquery caching, table caching, materialized views, the function result cache, collections…).
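As an example of the function cache option, the PL/SQL function result cache (available since 11g) can be sketched like this. Note that, as with scalar subquery caching, it only pays off when the number of distinct inputs is relatively small:

```sql
create or replace function fcust_name(p_cust_id in number) return varchar2
    result_cache  -- 11g+: cache the result per distinct p_cust_id
as
    l_cust_name varchar2(256);
begin
    select cust_first_name || ' ' || cust_last_name
      into l_cust_name
      from customers
     where cust_id = p_cust_id;

    return l_cust_name;
end;
/
```

Repeated calls with the same cust_id are then served from the shared result cache without re-executing the function body, and the cache is invalidated automatically when the customers table changes.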

 

Summary:

From this article you have learned what context switching is, what issues it introduces, what impact it has, how you can indirectly monitor the performance penalties related to it, and how to fix it.

Besides allowing SQL to run much faster (7x faster in the previous example), removing context switching might significantly reduce the number of Oracle licenses you need.

The last point is that PL/SQL bulk operations (FORALL, BULK COLLECT…) also work by reducing context switching, which usually brings a 10x speedup.
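For illustration, here is a minimal FORALL sketch against the same SH schema; the cust_valid update itself is hypothetical and serves only to show the pattern. The point is that the whole batch crosses the engine boundary in a single switch instead of one switch per row:

```sql
declare
    type t_ids is table of customers.cust_id%type;
    l_ids t_ids;
begin
    -- one switch to SQL to fetch the whole batch
    select cust_id bulk collect into l_ids from customers;

    -- one switch to SQL for all updates instead of one per row
    forall i in 1 .. l_ids.count
        update customers
           set cust_valid = 'A'
         where cust_id = l_ids(i);

    rollback;  -- demo only, discard the changes
end;
/
```

A row-by-row loop with the same update would pay the SQL/PL/SQL switch for every iteration, which is exactly the per-row overhead measured earlier in this article.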

 




Comments

zhwsh
2019-09-06 16:15:38
If you use a scalar subquery, it reduces the calls:
select (select fcust_name(s.cust_id) from dual), s.* from sales s;
Josip Pojatina
2019-09-07 21:52:07
That's correct, but only if you have just a few distinct values. In this particular case, I had many millions of names, thus scalar caching doesn't work.
zhwsh
2019-09-19 01:22:16
If you add DETERMINISTIC to the function and set arraysize 10000, it reduces the calls. But if you have just a few distinct values, the effect is very poor.
Josip Pojatina
2019-09-19 08:20:42
1. SH@orcl> set arraysize 10000
SP2-0267: arraysize option 10000 out of range (1 through 5000)
As you can see, the max. value for arraysize is 5000, not 10000.

2. arraysize has the most impact when the code is executed from the client. Here I'm executing from the server. Thus no matter that I have more round trips, the timing is the same (you can check it yourself, as the SH schema is publicly available).

3. Again, the most important thing is the uniqueness and distribution of your data. This is why the result is always the same. Here is the result after adding DETERMINISTIC and changing the arraysize to 5000 (100 is the default):

SH@orcl> select fcust_name(s.cust_id), s.* from sales s;

918843 rows selected.

Elapsed: 00:00:15.54

Statistics
----------------------------------------------------------
     916422  recursive calls
          0  db block gets
    2751054  consistent gets
          0  physical reads
          0  redo size
   38484302  bytes sent via SQL*Net to client
       2637  bytes received via SQL*Net from client
        185  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     918843  rows processed
zhwsh
2019-09-19 16:07:12
Oh! I don't have the SH schema installed; my test uses the SCOTT schema, tables dept and emp.

> 3. Again, the most important thing is the uniqueness and distribution of your data. This is why the result is always the same. Here is the result after adding DETERMINISTIC and changing the arraysize to 5000 (100 is the default).

That's correct, but if there are a lot of repeated values, adding DETERMINISTIC to the function reduces the recursive calls.

SCOTT@test01p> select * from v$version where rownum=1;

BANNER                                                                           CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production              0

--//GRANT EXECUTE ON SYS.DBMS_LOCK TO SCOTT;

SCOTT@test01p> show rowprefetch
rowprefetch 1

CREATE OR REPLACE FUNCTION get_dept (p_deptno dept.deptno%TYPE)
  RETURN dept.dname%TYPE DETERMINISTIC
IS
  l_dname dept.dname%TYPE;
BEGIN
  DBMS_LOCK.sleep (1);
  SELECT dname INTO l_dname FROM dept WHERE deptno = p_deptno;
  RETURN l_dname;
END;
/

$ cat s.txt
set timing on
set arraysize &1
set head off
set feedback only
select empno, ename, deptno, get_dept(deptno) c20 from emp;
quit

--//seq 2 14 | xargs -I {} -n1 sqlplus scott/xxxx@test01p @s.txt {}

> sqlplus -s -l scott/xxxx@test01p @s.txt 2
Elapsed: 00:00:13.00
> sqlplus -s -l scott/xxxx@test01p @s.txt 3
Elapsed: 00:00:11.00
> sqlplus -s -l scott/xxxx@test01p @s.txt 4
Elapsed: 00:00:09.00
> sqlplus -s -l scott/xxxx@test01p @s.txt 5
Elapsed: 00:00:09.00
> sqlplus -s -l scott/xxxx@test01p @s.txt 6
Elapsed: 00:00:08.00
> sqlplus -s -l scott/xxxx@test01p @s.txt 7
Elapsed: 00:00:07.00
> sqlplus -s -l scott/xxxx@test01p @s.txt 14
Elapsed: 00:00:07.00
zhwsh
2019-09-21 14:15:41
--//Avoiding unnecessary function calls to optimize SQL statements http://nimishgarg.blogspot.com/2016/03/avoiding-unnecessary-function-calls-to.html
Josip Pojatina
2019-09-21 17:22:46
Regarding the SH schema, in case you didn't install it during dbca, you can install it later. More at: https://docs.oracle.com/database/121/COMSC/installation.htm#COMSC00002

Regarding the link and the workarounds, every one listed has its drawbacks you need to be aware of. In this particular case, the client is still running 10g, thus the result cache is not available. Even if it were, there are drawbacks, especially when you have a large amount of memory (I observed that effect on Exadata).

Deterministic functions still don't resolve context switching; the effect depends on the arraysize, which you are usually not allowed to change from the default value of 100 rows. Scalar sub-query caching depends on the amount and distribution of the data.

There is a better way than all mentioned here. I'll write about it sometime in the future.
Mukund
2019-12-23 21:08:37
We generally go for functions for code re-usability. If this hampers performance significantly, Oracle should provide some sort of solution for reducing the cost involved in context switching.
Josip Pojatina
2019-12-25 16:16:13
Oracle did provide a solution, in the form of the most powerful SQL engine in the world.
