The main goal of this article is to explain the impact of the arraysize parameter on database performance.

First, it’s important to define what the array size is.

The array size is the number of rows that SQL*Plus fetches from the database in a single network round trip.

I’m using Oracle’s SH demo schema and its SALES table, which has almost 1 million rows.

In the following snippets I’ve experimented with these arraysize values:

  • 1
  • 100
  • 1000
  • 5000 (maximum value)
-- Case 1: arraysize = 1
SH@orcl> set arraysize 1
SH@orcl> select * from sales;

918843 rows selected.

Elapsed: 00:12:19.49

Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
     460271  consistent gets
	  0  physical reads
	  0  redo size
  113459118  bytes sent via SQL*Net to client
    5054255  bytes received via SQL*Net from client
     459423  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
     918843  rows processed



-- Case 2: arraysize = 100
SH@orcl> set arraysize 100
SH@orcl> select * from sales;

Elapsed: 00:00:03.46

Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
      10781  consistent gets
	  0  physical reads
	132  redo size
   27464662  bytes sent via SQL*Net to client
     101692  bytes received via SQL*Net from client
       9190  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
     918843  rows processed



-- Case 3: arraysize = 1000
SH@orcl> set arraysize 1000
SH@orcl> select * from sales;

918843 rows selected.

Elapsed: 00:00:01.64

Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
       2520  consistent gets
	  0  physical reads
	  0  redo size
   25885092  bytes sent via SQL*Net to client
      10722  bytes received via SQL*Net from client
	920  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
     918843  rows processed



-- Case 4: arraysize = 5000 (max. value)
SH@orcl> set arraysize 5000
SH@orcl> select * from sales;

918843 rows selected.

Elapsed: 00:00:01.50

Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
       1788  consistent gets
	  0  physical reads
	  0  redo size
   25744707  bytes sent via SQL*Net to client
       2637  bytes received via SQL*Net from client
	185  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
     918843  rows processed

Here are the test results:

array size      elapsed   consistent gets   SQL*Net trips
         1   00:12:19.49           460271          459423
       100   00:00:03.46            10781            9190
      1000   00:00:01.64             2520             920
      5000   00:00:01.50             1788             185

You need to be aware that by increasing the array size, your client memory consumption will increase as well.

From the elapsed time you can see that arraysize = 1000 is the optimal value.

Increasing it further (e.g. to 5000) did not improve performance noticeably.

While the first column (elapsed time) is just a consequence of changing the array size, the last two columns explain what is behind the performance improvement.

 

Consistent gets represents the number of times a consistent read was requested for a block in the buffer cache.

Database blocks may be retrieved in one of two ways:

  • current mode (db block gets)
  • consistent mode (consistent gets)

While the former is related to DML statements (updating the latest copy of the block), the latter is related to SELECT statements (reading database blocks from the buffer cache in read-consistent mode, plus reading UNDO blocks if required).

When I increase the arraysize parameter, the number of logical I/Os drops from 460271 (arraysize = 1) to 1788 (arraysize = 5000).

That is one of the reasons for the dramatically improved elapsed time.
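A quick sanity check on the numbers above (a back-of-the-envelope model, not an Oracle-documented formula): each array fetch has to re-visit the block where the previous fetch stopped, so consistent gets should be roughly the number of table blocks plus the number of round trips. The measured statistics fit that model well for arraysize >= 100:

```python
# Measured statistics from the tests above: arraysize -> (consistent gets,
# SQL*Net round trips). Numbers are copied from the SQL*Plus output.
stats = {
    100:  (10781, 9190),
    1000: (2520,   920),
    5000: (1788,   185),
}

# Assumed model: consistent gets ~= table blocks + round trips,
# because each new fetch call re-gets the block it left off in.
for arraysize, (gets, trips) in stats.items():
    implied_blocks = gets - trips
    print(f"arraysize {arraysize:>5}: gets - trips = {implied_blocks}")
```

The difference is nearly constant (about 1600), which is consistent with the SALES rows occupying roughly 1600 buffer-cache blocks; at arraysize = 5000 we are therefore close to the floor of about one consistent get per block.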

 

The other reason for the better performance can be found in the last column, “SQL*Net roundtrips to/from client”.

This statistic is the total number of SQL*Net messages sent to and received from the client.

There are two main contributors to a high number of SQL*Net roundtrips in the statistics:

  • fetch requests (e.g. from a cursor)
  • single-row fetches for a multi-row result set

In the previous examples we have almost 1 million rows in the result set.

By increasing arraysize (the number of rows returned in one fetch), the number of round trips decreases from 459423 (arraysize = 1) to 185 (arraysize = 5000).
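You can roughly predict the round-trip count: about one fetch per arraysize rows, plus a little protocol overhead. Here is that estimate compared against the measured values (the arraysize = 1 case is left out, since SQL*Plus behaves a bit differently there):

```python
import math

total_rows = 918843  # rows in the SALES table

# Expected fetch round trips: roughly ceil(total_rows / arraysize);
# the observed value is slightly higher because of the extra
# parse/execute and end-of-fetch messages.
for arraysize, observed in [(100, 9190), (1000, 920), (5000, 185)]:
    expected = math.ceil(total_rows / arraysize)
    print(f"arraysize {arraysize:>5}: expected ~{expected}, observed {observed}")
```

For all three settings the estimate lands within a couple of round trips of the measured statistic.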

For this particular case (several other factors also influence the result), consistent gets and SQL*Net round trips are the statistics most responsible for the elapsed time dropping from more than 12 minutes (precisely 12:19.49) with arraysize = 1 down to 1.5 seconds with arraysize = 5000.

 

In OLTP environments you usually need to fetch just a few rows, plus run some operational reports on top of the OLTP workload.

-- Case 1: arraysize = 1
SH@orcl> set arraysize 1
SH@orcl> select * from sales where prod_id = 13 and cust_id = 6277 and channel_id = 3 and promo_id = 999;

Elapsed: 00:00:00.01

Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	 44  consistent gets
	  0  physical reads
	  0  redo size
       1383  bytes sent via SQL*Net to client
	635  bytes received via SQL*Net from client
	  3  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  4  rows processed



-- Case 2: arraysize = 5000
SH@orcl> set arraysize 5000
SH@orcl> select * from sales where prod_id = 13 and cust_id = 6277 and channel_id = 3 and promo_id = 999;

Elapsed: 00:00:00.02

Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	 44  consistent gets
	  0  physical reads
	  0  redo size
       1192  bytes sent via SQL*Net to client
	624  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  4  rows processed

array size      elapsed   consistent gets   SQL*Net trips
         1   00:00:00.01               44               3
      5000   00:00:00.02               44               2

This time the performance of the two extreme cases (arraysize = 1 versus arraysize = 5000) is essentially the same on a quiet system, with identical consistent gets and nearly identical SQL*Net roundtrips.

If there is a lot of concurrent DML activity, arraysize = 5000 can even perform worse.

 

Summary:

This article shows how changing the arraysize parameter in SQL*Plus brings dramatic improvements when you fetch a large result set over the network.

For ETL, data warehouse, or machine learning workloads, increasing the array size is definitely the way to go.

As you can see from the results, if you are fetching only a few records (the most common characteristic of an OLTP load), you won’t benefit from increasing the number of records fetched in one trip.

Quite the opposite: you can actually degrade performance, especially in environments with a high number of concurrent DMLs.

Still, in most OLTP environments it is common to run operational reports that track the business in near real time, and that is a case for increasing the arraysize parameter.

For those reasons, in OLTP environments you need to set a compromise value for the array size, somewhere between 50 and 100.
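To see why a value around 50–100 is a sensible compromise, here is a back-of-the-envelope calculation for a hypothetical 1,000-row operational report (the report size and the ~120-byte row size are both illustrative assumptions):

```python
import math

# Hypothetical OLTP report: 1,000 rows, ~120 bytes per row (assumptions).
report_rows, row_bytes = 1000, 120

for arraysize in (1, 50, 100, 1000, 5000):
    trips = math.ceil(report_rows / arraysize)          # fetch round trips
    buffer_kb = arraysize * row_bytes / 1024            # client fetch buffer
    print(f"arraysize {arraysize:>4}: {trips:>4} trips, ~{buffer_kb:.0f} KB buffer")
```

Going from 1 to 100 removes 99% of the fetch round trips while the client buffer stays tiny; raising arraysize beyond the report size saves nothing and only inflates the buffer.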

If you are using JDBC, you can call the setFetchSize method on a Statement object to control how many rows are returned in one fetch.

In Python, you can do the same via the arraysize attribute of the Cursor object.
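As a minimal sketch (assuming a DB-API 2.0 driver such as python-oracledb / cx_Oracle; fetch_sales is a hypothetical helper name, and connection setup is omitted):

```python
def fetch_sales(conn, fetch_size=1000):
    """Fetch all rows from SALES using the given fetch array size.

    Works with any DB-API 2.0 driver whose cursors honour the
    `arraysize` attribute (e.g. python-oracledb / cx_Oracle).
    """
    cur = conn.cursor()
    cur.arraysize = fetch_size           # rows per network round trip
    cur.execute("select * from sales")
    rows = cur.fetchall()                # driver fetches arraysize rows at a time
    cur.close()
    return rows
```

Setting arraysize before the fetch, not the execute, is enough: the attribute is consulted each time the driver pulls the next batch of rows.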

