Striving for Optimal Performance
  • Home
  • Blog
    • Archive
    • Categories
  • Troubleshooting Oracle Performance
    • Description
    • Structure
    • Table of Contents
    • Forewords
    • Reviews
    • Downloadable Files
    • Addenda and Errata
  • Publications
  • Public Appearances
    • Past Public Appearances
  • Contact
  • Search
  • About

Posts in category Parallel Processing

Parallel Full Table Scans Do Not Always Perform Direct Reads

Sep12
2010
5 Comments Written by Christian Antognini

Even though in general parallel full table scans performs direct reads, some exceptions exist. The aim of this post is to show such an exception.

For test purposes I build in my own schema a copy of the SH.SALES table (the one distributed by Oracle with the demo schemas…). On that table I build an index on the TIME_ID column and, at the same time, I trace the execution. The statements I use are the following.

execute dbms_monitor.session_trace_enable
CREATE INDEX sales_time_id ON sales (time_id) PARALLEL 2 ONLINE;
execute dbms_monitor.session_trace_disable

Since the index is built in parallel (DOP=2), 5 processes are used to run the statement: the query coordinator, 2 slaves for reading the table, and 2 slaves for building the index. Based on the data provided by extended SQL trace let’s have a look to some information about the execution.

  • Execution plan (without runtime statistics and query optimizer estimations): both the build of the index and the full table scan are performed in parallel.
Operation
----------------------------------------
PX COORDINATOR
  PX SEND QC (ORDER) :TQ10001
    INDEX BUILD NON UNIQUE SALES_TIME_ID
      SORT CREATE INDEX
        PX RECEIVE
          PX SEND RANGE :TQ10000
            PX BLOCK ITERATOR
              TABLE ACCESS FULL SALES
  • Resource usage profile of the query coordinator: no real work is performed, it’s just coordination work…
                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
PX Deq: Execute Reply                          1.928  79.044           44            0.044
recursive statements                           0.427  17.487          n/a              n/a
CPU                                            0.024   0.984          n/a              n/a
os thread startup                              0.017   0.690            1            0.017
log file sync                                  0.014   0.569            2            0.007
PX Deq: Parse Reply                            0.014   0.568            4            0.003
enq: CR - block range reuse ckpt               0.005   0.202            2            0.002
PX Deq: Join ACK                               0.004   0.145            5            0.001
SQL*Net message from client                    0.002   0.098            1            0.002
PX qref latch                                  0.002   0.075            1            0.002
PX Deq: Table Q qref                           0.001   0.058            1            0.001
enq: RO - fast object reuse                    0.001   0.028            1            0.001
PX Deq: Signal ACK EXT                         0.001   0.025            6            0.000
PX Deq: Slave Session Stats                    0.000   0.014            3            0.000
reliable message                               0.000   0.006            2            0.000
latch: call allocation                         0.000   0.004            1            0.000
db file sequential read                        0.000   0.004            6            0.000
rdbms ipc reply                                0.000   0.002            2            0.000
SQL*Net message to client                      0.000   0.000            1            0.000
----------------------------------- ---------------- -------
Total                                          2.439 100.000
  • Resource usage profile of one of the slaves building the index: direct writes are used to store the index.
                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
direct path write                              0.982  49.671          376            0.003
CPU                                            0.781  39.508          n/a              n/a
PX Deq: Table Q Normal                         0.177   8.957          130            0.001
PX Deq: Execution Msg                          0.012   0.595            4            0.003
cursor: pin S wait on X                        0.011   0.535            1            0.011
recursive statements                           0.008   0.420          n/a              n/a
log file sync                                  0.005   0.247            3            0.002
Disk file operations I/O                       0.001   0.030            1            0.001
PX Deq: Slave Session Stats                    0.000   0.021            1            0.000
reliable message                               0.000   0.011            1            0.000
rdbms ipc reply                                0.000   0.006            2            0.000
----------------------------------- ---------------- -------
Total                                          1.977 100.000
  • Resource usage profile of one of the slaves scanning the table: instead of performing direct reads, “regular” buffered reads are performed (notice the db file scattered read event).
                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
PX Deq: Execution Msg                          1.571  80.295           23            0.068
CPU                                            0.311  15.889          n/a              n/a
PX Deq Credit: need buffer                     0.055   2.794          432            0.000
db file scattered read                         0.016   0.835           55            0.000
PX Deq: Table Q Get Keys                       0.002   0.109            1            0.002
Disk file operations I/O                       0.001   0.032            1            0.001
PX Deq Credit: send blkd                       0.001   0.028            1            0.001
PX Deq: Slave Session Stats                    0.000   0.010            1            0.000
db file sequential read                        0.000   0.005            4            0.000
PX qref latch                                  0.000   0.002            1            0.000
latch: cache buffers chains                    0.000   0.000            1            0.000
----------------------------------- ---------------- -------
Total                                          1.957 100.000

As pointed out by the last resource usage profile no direct reads are performed. Why? In this case it is because the ONLINE option was specified. By the way, I do not know why there is such a limitation… Anway, without this option, for one of the slaves reading the table the following resource usage profile is used (notice the direct path read event). I do not show the other resource usage profiles and the execution plan because they do not change.

                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
PX Deq: Execution Msg                          1.499  80.358           22            0.068
CPU                                            0.278  14.897          n/a              n/a
PX Deq Credit: need buffer                     0.071   3.790          504            0.000
direct path read                               0.012   0.630           52            0.000
PX Deq Credit: send blkd                       0.003   0.176            2            0.002
PX Deq: Table Q Get Keys                       0.001   0.047            2            0.000
PX Deq: Slave Session Stats                    0.001   0.041            1            0.001
Disk file operations I/O                       0.001   0.033            1            0.001
library cache: mutex X                         0.000   0.025            1            0.000
db file sequential read                        0.000   0.002            2            0.000
asynch descriptor resize                       0.000   0.001           18            0.000
----------------------------------- ---------------- -------
Total                                          1.866 100.000

In summary, do not expect to always see direct reads when a parallel full table scan is performed.

Posted in 10gR1, 10gR2, 11gR1, 11gR2, Indexes

Parallel Processing With Standard Edition

Sep01
2010
9 Comments Written by Christian Antognini

As clearly stated in the Licensing Information guide, all features related to parallel processing (parallel backup and recovery, parallel query/DML, parallel statistics gathering, parallel index build/scans, parallel Data Pump export/import, in-memory parallel execution, parallel statement queuing and parallel spatial index builds) are only available with the Enterprise Edition. However, as of Oracle Database 11g Release 2, there is a feature that provides parallel processing capabilities in the Standard Edition as well. This feature is available through the DBMS_PARALLEL_EXECUTE package.

For example, let’s say that you have to execute an UPDATE statement on all rows of a table containing a huge amount of data. If you can take the table “offline”, provided you have enough space doing a CTAS statement instead of an UPDATE statement is probably much faster. However, if you cannot put the table “offline”, doing it in parallel might be a sensible way to speed-up the execution. Hence, if you are using the Enterprise Edition you can take advantage of the parallel processing features integrated in the SQL engine. Thus, you can execute something like that:

ALTER SESSION ENABLE PARALLEL DML;
UPDATE /*+ parallel(t,4) */ t SET col = expr;

Since such a possibility is not available with the Standard Edition, as of Oracle Database 11g Release 2 you might execute a PL/SQL block like the following one to perform the same operation in parallel:

SET SERVEROUTPUT ON
DECLARE
  l_task_name user_parallel_execute_tasks.task_name%TYPE;
  l_sql_stmt user_parallel_execute_tasks.sql_stmt%TYPE;
BEGIN
  l_task_name := 'px_update';
  l_sql_stmt := 'UPDATE t SET col = expr WHERE rowid BETWEEN :start_id AND :end_id';

  dbms_parallel_execute.create_task(task_name => l_task_name);

  dbms_parallel_execute.create_chunks_by_rowid(
    task_name   => l_task_name,
    table_owner => user,
    table_name  => 'T',
    by_row      => FALSE,
    chunk_size  => 128
  );

  dbms_parallel_execute.run_task(
    task_name      => l_task_name,
    sql_stmt       => l_sql_stmt,
    language_flag  => dbms_sql.native,
    parallel_level => 4
  );

  WHILE (dbms_parallel_execute.task_status(task_name => l_task_name)
           NOT IN (
             dbms_parallel_execute.chunking_failed,
             dbms_parallel_execute.finished,
             dbms_parallel_execute.finished_with_error,
             dbms_parallel_execute.crashed
           ))
  LOOP
    dbms_lock.sleep(1);
  END LOOP;

  CASE dbms_parallel_execute.task_status(task_name => l_task_name)
    WHEN dbms_parallel_execute.chunking_failed THEN dbms_output.put_line('chunking_failed');
    WHEN dbms_parallel_execute.finished THEN dbms_output.put_line('finished');
    WHEN dbms_parallel_execute.finished_with_error THEN dbms_output.put_line('finished_with_error');
    WHEN dbms_parallel_execute.crashed THEN dbms_output.put_line('crashed');
  END CASE;

  dbms_parallel_execute.drop_task(task_name => l_task_name);
END;
/

Note that using the DBMS_PARALLEL_EXECUTE package is not limited to the Standard Edition, though. I see at least two situations where it can be handy with the Enterprise Edition:

  • You do not want to process the whole DML statement in a single transaction.
  • You want to process in parallel a PL/SQL block, not a DML statement.

Both situations are relevant if, as of Oracle Database 11g Release 2, you plan to perform such an operation during an online application upgrade by taking advantage of edition-based redefinition. I guess that the package was implemented for that purpose…

Posted in 11gR2

Exadata Storage Server and the Query Optimizer – Part 3

May05
2010
2 Comments Written by Christian Antognini

In the first and second post of this series I shared with you some basics about smart scan and gave some details about projection and restriction. The aim of this post is to cover the third basic technique: join filtering.

Join filtering is not something specific to the Exadata Storage Server. In fact, it is an Enterprise Edition feature available since Oracle Database 10g Release 2. Simply put, it is used to reduce data communication between slave processes in parallel joins. For more information about it I suggest you to read a paper I published in June 2008 entitled Bloom Filters. In it I describe not only what bloom filters are, but also how Oracle Database uses them. And, one of the use cases is join filtering.

What I want to show here is how Exadata Storage Server is able to take advantage of join filtering. For that purpose let’s have a look to the following execution plan:

-----------------------------------------------------
| Id  | Operation                        | Name     |
-----------------------------------------------------
|   0 | SELECT STATEMENT                 |          |
|   1 |  PX COORDINATOR                  |          |
|   2 |   PX SEND QC (RANDOM)            | :TQ10002 |
|*  3 |    HASH JOIN BUFFERED            |          |
|   4 |     JOIN FILTER CREATE           | :BF0000  |
|   5 |      PX RECEIVE                  |          |
|   6 |       PX SEND HASH               | :TQ10000 |
|   7 |        PX BLOCK ITERATOR         |          |
|*  8 |         TABLE ACCESS STORAGE FULL| T1       |
|   9 |     PX RECEIVE                   |          |
|  10 |      PX SEND HASH                | :TQ10001 |
|  11 |       JOIN FILTER USE            | :BF0000  |
|  12 |        PX BLOCK ITERATOR         |          |
|* 13 |         TABLE ACCESS STORAGE FULL| T2       |
-----------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."ID"="T2"."ID")
   8 - storage("T1"."MOD"=42)
       filter("T1"."MOD"=42)
  13 - storage(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."ID"))
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."ID"))

As you can see, join filtering is used. In fact, the operation 4 (JOIN FILTER CREATE) builds a bloom filter that, later on, is used by operation 11 (JOIN FILTER USE) to filter out the data that does not fulfill the join condition. However, the most important thing to notice in this execution plan is the STORAGE predicate applied by the operation 13. According to it the bloom filter is applied not only by the operation 11, but also by the operation 13. And, since the operation 13 is a smart scan operation, the STORAGE predicate is evaluated by the cells. This means that the reduction of data communication does not only take place between slave processes, but also between the cells and the database instances. Remarkable!

Posted in 11gR1, 11gR2, Exadata, Query Optimizer
← Older Entries

EvoLve theme by Theme4Press  •  Powered by WordPress Striving for Optimal Performance