Very often you’ll find delimited text to be a more suitable format than splitting the text across many rows.

Typical use cases for delimited text include preparing data for further analysis, reporting, importing data into a spreadsheet, etc.

To demonstrate the concept, I’ll use my demo table TEST with 10 million rows.

TEST – table description

And here is the content of the table:

TEST – table data
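
The exact table structure is shown in the screenshots above. Purely for reference, here is a minimal, hypothetical stand-in that recreates only the two columns the queries below actually use (CACHE and TIMESTAMP) and fills it with generated sample data; the real demo table holds 10 million rows.

-- Hypothetical stand-in for the demo TEST table; the real structure
-- is the one shown in the screenshot above.
create table test (
   cache      varchar2(10),
   timestamp  date
);

-- Generate some sample rows (the original demo uses 10 million).
insert into test (cache, timestamp)
select mod(level, 100), sysdate - level / 86400
from dual
connect by level <= 100000;

commit;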

Method 1 – LISTAGG

One extremely easy (and preferred) way to perform string concatenation is to use the LISTAGG function, as in the following example:

select
   listagg(rpad(t.cache, 2)) within group (
      order by t.timestamp
   ) as some_list
from test t
where rownum < 100;

And here is the output of string concatenation:

The problem with this method is the maximum string length of 4000 bytes.

If the output is larger than 4000 bytes, I get the following error:

select
   listagg(rpad(t.cache, 2)) within group (
      order by t.timestamp
   ) as some_list
from test t
where rownum < 10000000;

ORA-01489: result of string concatenation is too long
01489. 00000 -  "result of string concatenation is too long"
*Cause:    String concatenation result is more than the maximum size.
*Action:   Make sure that the result is less than the maximum size.

This means I won’t be able to process my 10 million rows with that method, due to the 4000-byte limitation.
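
As a side note (and not a solution here, since I need the complete string): if you are on 12.2 or later and a truncated result would be acceptable, LISTAGG can be told to truncate instead of raising ORA-01489. A minimal sketch, with a comma delimiter added just for readability:

select
   -- on overflow truncate keeps as much as fits and appends an ellipsis
   -- plus the count of omitted values instead of raising ORA-01489
   listagg(rpad(t.cache, 2), ',' on overflow truncate '...' with count)
      within group (order by t.timestamp) as some_list
from test t
where rownum < 10000000;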

One way to raise that limit (in case you are running 12c or later) is to change the max_string_size initialization parameter from STANDARD to EXTENDED, which increases the maximum to 32767 bytes.

As this is a major change, you need to perform the following steps:

CONN / AS SYSDBA

PURGE DBA_RECYCLEBIN;

SHUTDOWN IMMEDIATE;
STARTUP UPGRADE;
ALTER SYSTEM SET max_string_size=extended;
@?/rdbms/admin/utl32k.sql
SHUTDOWN IMMEDIATE;
STARTUP;

Additionally, you’ll need to ensure that the COMPATIBLE initialization parameter is set to 12.0.0 or higher.
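
A quick way to double-check both settings before relying on longer strings:

-- Verify the prerequisites for 32767-byte strings.
select name, value
from   v$parameter
where  name in ('max_string_size', 'compatible');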

Be warned that this is a one-way street and there is no easy way to roll back that change.

If you still decide to go with that change, Oracle will store the resulting string in a VARCHAR2 when it fits within the 4000-byte limit; otherwise it will automatically create a LOB and a LOB index in the background.
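
For illustration, once EXTENDED is active you can declare VARCHAR2 values up to 32767 bytes; a minimal sketch with a hypothetical table name:

-- Works only with max_string_size = EXTENDED; with STANDARD the 32767
-- declaration is rejected. Values longer than roughly 4000 bytes are
-- stored out of line as a LOB, which Oracle manages automatically.
create table t_extended_demo (
   big_text varchar2(32767)
);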

Method 2 – XMLAGG / XMLPARSE

As the previous method is system-wide and requires a major database change, there is an alternative if you don’t want to take that approach: XMLAGG, which takes a list of XML elements from a column and returns an aggregated XML document in a single cell.

To do that I also need the XMLPARSE function, which parses the evaluated result of a value expression and generates an XML instance from it.

Here I also specified WELLFORMED to skip the check that value_expr resolves to a well-formed XML document, as this speeds up execution.

select
   xmlagg(
      xmlparse(
         content rpad(t.cache, 2) wellformed
      )
      order by t.timestamp
   ).getclobval() as some_list
from test t
where rownum < 100000;

Elapsed: 00:00:31.69

Statistics
----------------------------------------------------------
          3  recursive calls
        685  db block gets
       1052  consistent gets
          0  physical reads
          0  redo size
      23606  bytes sent via SQL*Net to client
      14130  bytes received via SQL*Net from client
         66  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

In this case it takes almost 32 seconds to process only 100k records, while my TEST table has 10 million rows, which I can’t process within a reasonable time frame using this method.

Summary:

The first two methods are not the best fit: either I need to perform a serious database-level change that I can’t (easily) revert, or, in the case of XML string manipulation, I get unacceptably slow query execution.

In the next post I’ll describe a method that gives you both: much better performance and no string size limitation.


