Striving for Optimal Performance
  • Home
  • Blog
    • Archive
    • Categories
  • Troubleshooting Oracle Performance
    • Description
    • Structure
    • Table of Contents
    • Forewords
    • Reviews
    • Downloadable Files
    • Addenda and Errata
  • Publications
  • Public Appearances
    • Past Public Appearances
  • Contact
  • Search
  • About

COMMIT_WAIT and COMMIT_LOGGING

Apr05
2012
8 Comments Written by Christian Antognini

Recently I used the COMMIT_WAIT and COMMIT_LOGGING parameters for solving (or, better, working around) a problem I faced while optimizing a specific task for one of my customers. Since it was the first time I used them in a production system, I thought to write this post not only to shortly explain the purpose of the these two parameters, but also to show a case where it is sensible to use them.

The purpose of the two parameters is the following:

COMMIT_WAIT

  • Simply put this parameter specifies whether a server process that issues a commit waits for the log writer while it writes the redo data in the redo log files.
  • If it’s set to WAIT, the default value, the server process waits. And, in most situations, this is the best thing to do.
  • If it’s set to NOWAIT, it doesn’t wait. This means that, in case of a crash just after a commit, the D of ACID might be violated! Hence, in general, it is not advised to use this value.
  • If it’s set to FORCE_WAIT the behaviour is similar to WAIT. The only difference is that settings at a lower level are ignored. In other words, if it is set at the system level it overrides the setting at the session and transaction level. If it’s set at session level it overrides the setting at the transaction level.

COMMIT_LOGGING

  • Simply put this parameter specifies whether the log writer writes redo data in batches.
  • If it’s set to IMMEDIATE, the default value, it basically performs a write operation for each commit.
  • If it’s set to BATCH, it writes redo data in batches. Since with this value less but larger write operations might be performed, in case of small transactions the log writer should be able to write the redo data in a more efficient way.

Note that in 10.2, the version that introduced these features, there is a single parameter (COMMIT_WRITE) to control them. For example, it is possible to set it to “NOWAIT, BATCH”.

To illustrates how these two parameters work I wrote two scripts: commit.sh and commit.sql. Their purpose is to show the number of times specific system calls are executed by the log writer and a server process that executes the following PL/SQL block:

DECLARE
  l_dummy INTEGER;
BEGIN
  FOR i IN 1..1000
  LOOP
    INSERT INTO t VALUES (i, rpad('*',100,'*'));
    COMMIT;
    SELECT count(*) INTO l_dummy FROM dual;
  END LOOP;
END;

The two system calls I was interested in were the following:

  • semtimedop(2): simply put this one is used by the server process to wait for the log writer
  • io_submit(2): simply put this one is used by the log writer to write redo data in the redo log files

Let’s have a look the output generated by the scripts for three particular cases:

  • COMMIT_WAIT = WAIT and COMMIT_LOGGING = IMMEDIATE: Notice that the server process executes 1005 times semtimedop(2) and that the log writer executes 1016 times io_submit(2). In other words, for both of them the number of executions is approximately the number of commits performed by the PL/SQL block.
   oracle@helicon:~/commit/ [DBM11203] ./commit.sh chris ian wait immediate 2> /dev/null

   ***** Server Process *****

   % time     seconds  usecs/call     calls    errors syscall
   ------ ----------- ----------- --------- --------- ----------------
   100.00    0.069561          69      1005         1 semtimedop
   ------ ----------- ----------- --------- --------- ----------------
   100.00    0.069561                  1005         1 total

   ***** Log Writer *****

   % time     seconds  usecs/call     calls    errors syscall
   ------ ----------- ----------- --------- --------- ----------------
   100.00    0.013919          14      1016           io_submit
   ------ ----------- ----------- --------- --------- ----------------
   100.00    0.013919                  1016           total
  • COMMIT_WAIT = NOWAIT and COMMIT_LOGGING = IMMEDIATE: Notice that, for the server process, the executions of semtimedop(2) dropped to 5. No noticeable difference is observable for the log writer.
oracle@helicon:~/commit/ [DBM11203] ./commit.sh chris ian nowait immediate 2> /dev/null

  ***** Server Process *****

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.002195         439         5         1 semtimedop
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.002195                     5         1 total

  ***** Log Writer *****

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.010073          10      1015           io_submit
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.010073                  1015           total
  • COMMIT_WAIT = NOWAIT and COMMIT_LOGGING = BATCH: Notice that this time not only the number of executions to io_submit(2) dropped to 15, but, in total, much less time was spent for writing the redo data (10*1015 >> 36*15).
   oracle@helicon:~/commit/ [DBM11203] ./commit.sh chris ian nowait batch 2> /dev/null

  ***** Server Process *****

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.002132         533         4         1 semtimedop
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.002132                     4         1 total

  ***** Log Writer *****

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.000533          36        15           io_submit
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.000533                    15           total

Now that I explained what the purpose of these parameters is, let me describe a case where I successfully used them.

Few weeks ago one of my customers migrated its DMS to a new version. During the migration data had to be moved from one database to another. Unfortunately the migration code was written to process data slow by slow. Hence, to speed up the processing, parallelization was added to the picture.

The following chart (you can click on it to increase its size) shows the load generated by 20 parallel processes with default values for COMMIT_WAIT and COMMIT_LOGGING. With the default configutation the system was able to process about 1000 “objects” per second. As you can see there were a lot of waits related to the commit wait class.

Database load (20 parallel processes; COMMIT_WAIT WAIT; COMMIT_LOGGING = IMMEDIATE)

Since this was a controlled processing using COMMIT_WAIT was not considered a problem. So, we tested the very same load with COMMIT_WAIT set to NOWAIT and COMMIT_LOGGING set to BATCH. The throughput increased to about 1200 “object” per second. In other words, not dramatically. But, more importantly, as the following chart shows almost all waits in the commit wait class disappeared. This was very important because without that serialization taking place we were able to increase the number of parallel processes.

Database load (20 parallel processes; COMMIT_WAIT NOWAIT; COMMIT_LOGGING = BATCH)

The following chart shows the load generated by 50 parallel processes. Again, almost no wait related to commits. With 50 processes the system was able to process about 2300 “objects” per second.

Database load (50 parallel processes; COMMIT_WAIT NOWAIT; COMMIT_LOGGING = BATCH)

In summary, COMMIT_WAIT and COMMIT_LOGGING are not commonly used parameters but, in some specific situations, using them it might be beneficial to avoid wait events related to commits.

Posted in 10gR2, 11gR1, 11gR2

Analysing Row Lock Contention with LogMiner

Mar12
2012
7 Comments Written by Christian Antognini

Recently I had to analyse a row lock contention problem that can be illustrated by the following test case:

  • A session (let’s call it #1) creates a table and inserts a row into it (note that “n” is the primary key of the table):
SQL> CREATE TABLE t (n NUMBER PRIMARY KEY);

SQL> VARIABLE n NUMBER

SQL> execute :n := 1

SQL> INSERT INTO t VALUES (:n);
  • Another session (let’s call it #2) inserts the same data into the same table:
SQL> VARIABLE n NUMBER

SQL> execute :n := 1

SQL> INSERT INTO t VALUES (:n);
  • Since session #1 did not commit and that session #2 inserted a row with the same primary key, session #2 is blocked waiting for session #1 to either commit or rollback:
SQL> SELECT sid, blocking_session, event, sql_text
  2  FROM v$session LEFT OUTER JOIN v$sqlarea USING (sql_id)
  3  WHERE nvl(blocking_session,sid) IN (SELECT holding_session FROM dba_blockers);

       SID BLOCKING_SESSION EVENT                          SQL_TEXT
---------- ---------------- ------------------------------ ------------------------------
       130                  SQL*Net message from client
       197              130 enq: TX - row lock contention  INSERT INTO t VALUES (:n)

It goes without saying that in such a case the problem is the application that tried to insert two rows with the same primary key. In the real case the primary key was a natural key, not a surrogate key generated through a sequence. To help the developers troubleshoot the problem it was therefore necessary to know the actual value of the bind variable used for the two inserts. Unfortunately this information is stored in the PGA of the server processes and, therefore, it is not directly accessible. In addition I started investigating the problem only a couple of hours after the contention began and, therefore, also a view like V$SQL_BIND_CAPTURE is of no use.

As a result I decided to mine the archived redo logs to find the information I was looking for. Here is what I did:

  • Find out the XID of the transaction holding the row lock:
SQL> SELECT t.xidusn, t.xidslot, t.xidsqn, t.start_time, t.start_scn
  2  FROM v$transaction t JOIN v$session s ON t.addr = s.taddr
  3  WHERE s.sid = 130;

    XIDUSN    XIDSLOT     XIDSQN START_TIME            START_SCN
---------- ---------- ---------- -------------------- ----------
         7         12       1049 03/09/12 07:10:25       1462388
  • Find out which archived redo log contains the first redo records of that transaction:
SQL> SELECT name
  2  FROM v$archived_log
  3  WHERE 1462388 BETWEEN first_change# AND next_change# - 1;

NAME
-----------------------------------------------------------------------------------------------
/u00/app/oracle/fast_recovery_area/DBA112/archivelog/2012_03_09/o1_mf_1_94_7om7qzdx_.arc
  • Start LogMiner:
SQL> EXECUTE dbms_logmnr.add_logfile(logfilename=>'/u00/app/oracle/fast_recovery_area/DBA112/archivelog/2012_03_09/o1_mf_1_94_7om7qzdx_.arc')

SQL> EXECUTE dbms_logmnr.start_logmnr(options=>dbms_logmnr.dict_from_online_catalog)
  • Extract the redo information of the operations performed by the transaction holding the row lock and, therefore, finding that the value of the bind variable was “1”:
SQL> SELECT sql_redo
  2  FROM v$logmnr_contents
  3  WHERE xidusn = 7
  4  AND xidslt = 12
  5  AND xidsqn = 1049;

SQL_REDO
--------------------------------------------------------------------------------
set transaction read write;
insert into "CHRIS"."T"("N") values ('1');
  • Stop LogMiner
SQL> EXECUTE dbms_logmnr.end_logmnr

In conclusion, even though mining (archived) redo logs for extracting the value of bind variables is probably not the most typical use of LogMiner, it works well. Therefore, do not forget this possibility if you have to do it.

Posted in 10gR1, 10gR2, 11gR1, 11gR2, LogMiner

Index Scan with Filter Predicate Based on a Subquery

Feb06
2012
3 Comments Written by Christian Antognini

Most execution plans can be interpreted by following few basic rules (in TOP, Chapter 6, I provide such a list of rules). Nevertheless, there are some special cases. One of them is when an index scan, in addition to the access predicate, has a filter predicate applying a subquery.

The following execution plan, taken from Enterprise Manager 11.2, is an example (click on the image to increase its size):
Execution Plan
Notes:

  • According to the order column the first operation being executed is the scan of the I2 index. Unfortunately this is wrong. In fact the first operation being executed is the scan of the I1 index. This is a bug in Enterprise Manager. I wanted to show you this example to demonstrate that not only for us it might be difficult to correctly interpret an execution plan ;-)
  • The filter predicate IS NOT NULL is also wrong. This is not a bug, however. It is a limitation in the current implementation. The problem is that in some cases the V$SQL_PLAN and V$SQL_PLAN_STATISTICS_ALL views are not able to show all the necessary details.

Without seeing the query on which this execution plan is based, it is not obvious at all to know what’s going on. So, here is the query:

SELECT *
FROM t1
WHERE n1 = 8
AND n2 IN (SELECT t2.n1 FROM t2, t3 WHERE t2.id = t3.id AND t3.n1 = 8) 

Based on the query it is essential to point out that the access predicate "T2"."N1"=:B1 cannot be evaluated and, therefore, the scan of the I2 index cannot be carried out, without having a value passed through the B1 bind variable. In other words, without knowing the value of T1.N2.

To describe how this execution plan is carried out, let’s have a look to the information provided by the DBMS_XPLAN.DISPLAY function (which does not expose the limitation related to the filter predicate).

-----------------------------------------------
| Id  | Operation                      | Name |
-----------------------------------------------
|   0 | SELECT STATEMENT               |      |
|   1 |  TABLE ACCESS BY INDEX ROWID   | T1   |
|*  2 |   INDEX RANGE SCAN             | I1   |
|   3 |    NESTED LOOPS                |      |
|   4 |     TABLE ACCESS BY INDEX ROWID| T2   |
|*  5 |      INDEX RANGE SCAN          | I2   |
|*  6 |     TABLE ACCESS BY INDEX ROWID| T3   |
|*  7 |      INDEX RANGE SCAN          | I3   |
-----------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N1"=8)
    filter( EXISTS (SELECT /*+ LEADING ("T2" "T3") USE_NL ("T3") INDEX
           ("T3" "I3") INDEX_RS_ASC ("T2" "I2") */ 0 FROM "T3" "T3","T2" "T2" WHERE
           "T2"."N1"=:B1 AND "T3"."N1"=8 AND "T2"."ID"="T3"."ID"))
5 - access("T2"."N1"=:B1)
6 - filter("T2"."ID"="T3"."ID")
7 - access("T3"."N1"=8)

The operations are carried out as follows:

  1. Operation 2 applies the access predicate "N1"=8 by scanning the I1 index.
  2. For each key returned by the previous scan, the subquery is executed once. Note that the subquery carries out a nested loop. While the outer loop accesses the T2 table, the inner loop accesses the T3 table.
  3. The first operation of the outer loop is operation 5. It applies the access predicate "T2"."N1"=:B1 by scanning the I2 index. Based on the rowid returned by the index access the T2 table is accessed (operation 4).
  4. For each row returned by the outer loop, the inner loop is executed once. The first operation of the inner loop is operation 7. It applies the access predicate "T3"."N1"=8 by scanning the I3 index. Based on the rowid returned by the index access the T3 table is accessed (operation 6) and the filter predicate "T2"."ID"="T3"."ID" (the join condition) is applied. By the way, it is interesting to notice that, contrary to the join condition is not applied as an access predicate, as it usually happens.
  5. If the subquery returns a row, the rowid returned by operation 2 can be used to access the T1 table (operation 1). The row extracted from this operation is sent to the caller.

All in all, this is a very special execution plan…

In summary, be careful when you see an index scan with a filter predicate applying a subquery. The execution plan might not be carried out as you expect at first sight. It is also essential to point out that in such a case the predicate information is essential to fully understand what’s going on.

Posted in 10gR2, 11gR1, 11gR2, Bug, Query Optimizer, TOP
← Older Entries

EvoLve theme by Theme4Press  •  Powered by WordPress Striving for Optimal Performance