Last time I’ve suggested two ways of how to perform string concatenation.
Unfortunately, both approaches have serious drawbacks that you might experience.
In this post I’ll explain one approach which is not Oracle version dependent, which corrects both shortcomings of previously described methods:
- size of concatenated string
- speed of data processing
Part I you can find on the following link:
https://www.josip-pojatina.com/en/string-aggregation-part-i/
Method 3 – cast/collect
I’ll again concatenate CACHE column from my TEST table.
As CACHE column is of type Number, first I need to create appropriate TYPE:
create or replace type name_coll_type as table of number;
Next I need to create a function that will accept collection (previously created table of number) as input parameter and return CLOB.
There s nothing special with that function, that would require further explanation.
create or replace function name_coll_type_to_clob (
p_name_coll in name_coll_type
, p_delimiter in varchar2 default null
)
return clob
is
v_name_string clob := ' ';
begin
for idx in p_name_coll.first..p_name_coll.last
loop
if idx = p_name_coll.first then
dbms_lob.open(v_name_string, DBMS_LOB.LOB_READWRITE);
dbms_lob.write(v_name_string, length(p_name_coll(idx)), 1, p_name_coll(idx));
else
DBMS_LOB.WRITEAPPEND(v_name_string, length(p_name_coll(idx) || p_delimiter), p_delimiter || p_name_coll(idx));
end if;
end loop;
DBMS_LOB.CLOSE(v_name_string);
return v_name_string;
end name_coll_type_to_clob;
Finally I need to execute the query:
select
name_coll_type_to_clob(
cast(
collect(
cache
order by timestamp
)
as name_coll_type
)
, ','
) as some_list
from test t
where rownum < 10000000;
Elapsed: 00:02:08.11
Statistics
----------------------------------------------------------
302 recursive calls
89615909 db block gets
20002237 consistent gets
105463 physical reads
0 redo size
23606 bytes sent via SQL*Net to client
14131 bytes received via SQL*Net from client
66 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
1 rows processed
Oracle COLLECT function enables aggregating data into a collection retaining multiple records within a single row – similar to the nested tables.
CAST function converts input collection (result of COLLECT function) into another collection, where I need to specify exactly which one.
That’s why I have to use “as name_coll_type”.
More about SQL functions you can find in Oracle Database SQL Language Reference book, which is available at the following link:
As you can observe, SQL completes in 2 minutes and 8 seconds which is a very good result, when compared with two previous attempt, and taking into account extensive context switching – 10 million times between SQL and PL/SQL engine.
Can we do better?
Method 4 – apex_string – join_clob function
In this method I’m going to use Oracel APEX join_clob function.
For this to work, you need to first install APEX.
You can download APEX from the following link:
https://www.oracle.com/tools/downloads/apex-downloads.html
You can choose the latest version of APEX (at the time of writing it is 19.2) no matter which database version you are running (e.g. you can install APEX 19.2 on Oracle database 12c).
If you only need one language (english), you can download smaller version.
After downloading and unpacking zip file, you need to connect as SYSDBA to create tablespace and execute the installation script:
connect / as sysdba
create tablespace apex datafile '/u01/app/oracle/oradata/ORCL/apex01.dbf' size 10m autoextend on next 10m maxsize unlimited
extent management local autoallocate;
--execute as SYSDBA
@apexins.sql apex apex temp /i/
Position 1: Name of tablespace for Application Express application user – apex
Position 2: Name of tablespace for Application Express files user - apex
Position 3: Name of temporary tablespace or tablespace group - temp
Position 4: Virtual directory for APEX images - i
You don’t need to create web listener if you just want to install APEX functions and you are not planning to run APEX apps.
Finally I can execute the query:
select
apex_string.join_clob(
cast(
collect(
rpad(t.cache, 2)
order by t.timestamp
)
as apex_t_varchar2
)
, ''
, 12 /* dbms_lob.call */
) as some_list
from test t
where rownum < 10000000;
Elapsed: 00:00:24.33
Statistics
----------------------------------------------------------
283 recursive calls
55942 db block gets
92872 consistent gets
29510 physical reads
0 redo size
23610 bytes sent via SQL*Net to client
14130 bytes received via SQL*Net from client
66 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
1 rows processed
Again I’m using cast/collect combination as in previous method (name_coll_type type/name_coll_type_to_clob function), but this time I added apex_string.join_clob function.
But this approach has brought the best result so far (from 2 minutes 8 sec → 24 sec).
Method 5 – json_arrayagg / json_value
This is the last method I’ll explain.
JSON_ARRAYAGG function takes as input column of SQL expression, converts it to JSON value and returns JSON array where each element is JSON value.
JSON_VALUE converts JSON value into the SQL value.
Here is the SQL code:
select
json_value(
replace(
json_arrayagg(
rpad(t.cache, 2)
order by t.timestamp
returning clob
)
, '","'
, ''
)
, '$[0]' returning clob
) as some_list
from test t
where rownum < 10000000;
Elapsed: 00:00:14.92
Statistics
----------------------------------------------------------
406 recursive calls
670738 db block gets
175185 consistent gets
29510 physical reads
0 redo size
23606 bytes sent via SQL*Net to client
14130 bytes received via SQL*Net from client
66 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
1 rows processed
The JSON method is even better than previous one, as elapsed time dropped from 24 sec → 15 sec.
Summary:
There are many way to achieve desired functionality.
In this and previous posts I’ve listed five different methods.
While there are many ways to achieve the same goal, only one or two ways are usually scalable and fast.
As always, the key is to understand how, under the hood, some functionality is implemented.
Comments