In this post I’m going to explain the most important drawbacks of using database link technology for purposes it was not designed for.
To be more precise, I’ll cover the case in which database links are used to exchange large amounts of data between many separate databases.
In the previous post you can find some frequently overlooked fundamentals of database link technology:
https://www.josip-pojatina.com/en/oracle-database-links-locking-strategy/
As usual, this article is based on a real production use case. I decided to write about it because this is the first time I’ve had to fix a completely wrong architecture built on database link technology.
Here I’ll list a few of the most important implications of using database links.
1. Efficiency of SQL
If you have SQL where one or more tables are actually remote, it is very difficult for the CBO (Cost Based Optimizer) to find the most efficient execution plan.
There is no way for the CBO to realistically estimate the costs involved (IO speed, utilization of the remote database, network speed, etc.) and come up with the optimal execution plan.
Although Oracle provides hints like DRIVING_SITE, which let you control on which side (the local database that initiates the SQL, or the remote one) the SQL engine should execute, that is far from enough to produce an optimal execution plan.
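As an illustration, here is a minimal sketch of the hint (the table and link names are hypothetical; the hint tells Oracle to execute the join on the site of the aliased table):

```sql
-- Ask Oracle to execute the join on the remote site where the big
-- table lives, so only the final result travels over the dblink.
-- local_orders, big_remote_items and sales_link are hypothetical names.
SELECT /*+ DRIVING_SITE(i) */
       o.order_id,
       i.item_desc
FROM   local_orders o,
       big_remote_items@sales_link i
WHERE  i.order_id = o.order_id;
```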
Even when the execution plan that includes a remote table is exactly the same as the plan in which the remote table is replaced by a local one, the performance penalties will be huge due to the network overhead and the locking strategy (check my previous post).
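If you want to verify where a distributed statement will run, you can display its plan; a sketch along these lines (same hypothetical names as above) will show a REMOTE operation and a “Remote SQL Information” section in the output:

```sql
-- Generate and display the plan of a distributed query; the output
-- marks remote operations and shows the SQL shipped over the dblink.
EXPLAIN PLAN FOR
SELECT o.order_id, i.item_desc
FROM   local_orders o,
       big_remote_items@sales_link i
WHERE  i.order_id = o.order_id;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());
```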
If you need to call packages, procedures, or functions on the remote site, you’ll experience further issues with dependencies, compilation, patching, etc.
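For illustration, a remote call looks like the sketch below (hypothetical package and link names); every such reference creates a dependency on the remote object, so recompiling or patching the remote package can invalidate the local caller:

```sql
-- Hypothetical call of a packaged procedure over a database link.
-- The local code now depends on the remote object's signature.
BEGIN
   billing_pkg.recalculate@sales_link(p_order_id => 42);
END;
/
```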
On the following two screenshots you can see a 5x performance improvement, although a performance boost of up to 150x can be expected.
The performance overhead mainly depends on the amount of data that needs to be transferred across the wire, the average row length, and the data types involved (especially whether LOB/XML types are present).
2. SQL*Net message from dblink, TCP Socket (KGAS) & log file sync
In case of extensive usage of database links, you can expect to observe high values for the following wait events:
- SQL*Net message from dblink
- TCP Socket (KGAS)
- log file sync
These events point to a wrongly chosen architecture style (they belong to the Network and Commit/Configuration wait classes), but that is beyond the scope of this article.
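As a quick check, a sketch like the following (it assumes access to v$system_event) shows how much time the instance has accumulated on these events:

```sql
-- Accumulated instance-wide time for the dblink-related wait events.
SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1e6) AS seconds_waited
FROM   v$system_event
WHERE  event IN ('SQL*Net message from dblink',
                 'TCP Socket (KGAS)',
                 'log file sync')
ORDER  BY time_waited_micro DESC;
```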
On the following slides you can see the impact on the above-mentioned events after I redirected the SQL code to use local tables instead of remote ones.
While analyzing the problem, some suggested tuning infrastructure components such as the storage, cluster, or file system, as they observed high values for the log file sync wait event. That suggestion was discarded after I proved the event was only a consequence of the failed architecture, not its cause.
Finally, on the following slide you can observe the impact at the database level (screenshots are taken from an AWR Diff report) after I implemented just a few changes: migrating one out of dozens of tables from the remote to the local database and changing the code in a few places.
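As a rough sketch, the first migration step can be as simple as copying the table over the link (hypothetical names; indexes, constraints, grants, and statistics must be handled separately):

```sql
-- One-off copy of a remote table into the local database.
CREATE TABLE items AS
SELECT * FROM items@sales_link;

-- Code that referenced the remote table is then rewritten, e.g.:
--   FROM items@sales_link   -->   FROM items
```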
When the migration of the remote tables to the local database is complete, and the code that points to remote tables is rewritten against local ones, the whole problem will disappear.
The simple name for that process is “database consolidation”, something that should have been done during the design phase of the enterprise architecture.
Summary:
It can never be stressed enough that when troubleshooting, it is essential to understand how each component of your enterprise architecture works under the hood; otherwise you won’t be able to troubleshoot correctly (check my comments regarding storage/file system optimization).
By knowing the strengths and weaknesses of each component, you will also know when to use a given technology (database links in this case) and when not to.
In this case, consolidating the several different databases during the design phase of the project would have avoided all of the issues mentioned in this post.
Unfortunately, I’m usually called in to help when a client’s production system is already in very bad shape with tons of incidents, not at the moment of creation, when I could still influence the application design, data model, and architecture.
Yet, although the most efficient option would be to consolidate the remote databases into a central one, it is still possible to bring stability and performance by consolidating/replicating a very small part (just one or two tables) along with minimal changes in the code, as sketched below.
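One lightweight replication option, shown here only as a hedged sketch with hypothetical names and refresh interval, is a periodically refreshed materialized view over the dblink:

```sql
-- Local, hourly refreshed copy of one remote table; a COMPLETE refresh
-- avoids the need for a materialized view log on the remote master.
CREATE MATERIALIZED VIEW items_mv
BUILD IMMEDIATE
REFRESH COMPLETE
START WITH SYSDATE NEXT SYSDATE + 1/24
AS SELECT * FROM items@sales_link;
```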