Striving for Optimal Performance
  • Home
  • Blog
    • Archive
    • Categories
  • Troubleshooting Oracle Performance
    • Description
    • Structure
    • Table of Contents
    • Forewords
    • Reviews
    • Downloadable Files
    • Addenda and Errata
  • Publications
  • Public Appearances
    • Past Public Appearances
  • Contact
  • Search
  • About

Posts in category Indexes

Partially Index a Table

Aug04
2010
9 Comments Written by Christian Antognini

Recenty the following question was posted on oracle-l (I paraphrase…):

With Oracle Database it is possible to create something similar to Teradata’s sparse indexes?

Since the question is an interesting one, I decided to write this short post.

First of all, I have to say that such a feature is not supported by the CREATE INDEX statement with Oracle Database. What a pity! I would really like to see this feature in Oracle Database 12x (I bet that “x” will be “f”; post you opinion as a comment if you want). Several database engines do so. Teradata was mentioned by the OP, two additional examples are PostgreSQL and SQL Server.

How to do something similar with Oracle Database?

As of Oracle Database 11g Release 2 it is possible to partially index a table by taking advantage of Zero-Size Unusable Indexes. The following SQL statements show an example where only the data of August 2010 is indexed.

SQL> CREATE TABLE t AS
  2  SELECT rownum AS id, sysdate-mod(rownum,100) AS tim, rpad('*',50,'*') AS pad
  3  FROM dual
  4  CONNECT BY level <= 1000;

SQL> CREATE INDEX i ON t (tim)
  2  GLOBAL PARTITION BY RANGE (tim) (
  3    PARTITION i_201001 VALUES LESS THAN (to_date('2010-02-01','YYYY-MM-DD')),
  4    PARTITION i_201002 VALUES LESS THAN (to_date('2010-03-01','YYYY-MM-DD')),
  5    PARTITION i_201003 VALUES LESS THAN (to_date('2010-04-01','YYYY-MM-DD')),
  6    PARTITION i_201004 VALUES LESS THAN (to_date('2010-05-01','YYYY-MM-DD')),
  7    PARTITION i_201005 VALUES LESS THAN (to_date('2010-06-01','YYYY-MM-DD')),
  8    PARTITION i_201006 VALUES LESS THAN (to_date('2010-07-01','YYYY-MM-DD')),
  9    PARTITION i_201007 VALUES LESS THAN (to_date('2010-08-01','YYYY-MM-DD')),
 10    PARTITION i_201008 VALUES LESS THAN (to_date('2010-09-01','YYYY-MM-DD')),
 11    PARTITION i_201009 VALUES LESS THAN (to_date('2010-10-01','YYYY-MM-DD')),
 12    PARTITION i_201010 VALUES LESS THAN (to_date('2010-11-01','YYYY-MM-DD')),
 13    PARTITION i_201011 VALUES LESS THAN (to_date('2010-12-01','YYYY-MM-DD')),
 14    PARTITION i_201012 VALUES LESS THAN (to_date('2011-01-01','YYYY-MM-DD')),
 15    PARTITION i_maxvalue VALUES LESS THAN (MAXVALUE)
 16  )
 17  UNUSABLE;

SQL> ALTER INDEX i REBUILD PARTITION i_201008;

It goes without saying that you are not forced to have so many partitions in place. In fact, to index the data of August, the following CREATE INDEX is more appropriate.

SQL> CREATE INDEX i ON t (tim)
  2  GLOBAL PARTITION BY RANGE (tim) (
  3    PARTITION i_201007 VALUES LESS THAN (to_date('2010-08-01','YYYY-MM-DD')),
  4    PARTITION i_201008 VALUES LESS THAN (to_date('2010-09-01','YYYY-MM-DD')),
  5    PARTITION i_maxvalue VALUES LESS THAN (MAXVALUE)
  6  )
  7  UNUSABLE;

Then, to index the data of September, you have to execute some SQL statements like the following ones.

  • Create a partition for the data of September and rebuild it:
SQL> ALTER INDEX i SPLIT PARTITION i_maxvalue AT (to_date('2010-10-01','YYYY-MM-DD')) INTO (
  2    PARTITION i_201009,
  3    PARTITION i_maxvalue
  4  );

SQL> ALTER INDEX i REBUILD PARTITION i_201009;
  • Drop the oldest partition:
SQL> ALTER INDEX i DROP PARTITION i_201007;
  • Make unusable the partition for the data of August:
SQL> ALTER INDEX i MODIFY PARTITION i_201008 UNUSABLE;

In this way you have at most three partitions available. And, for most of the time, only one of them is usable and, therefore, occupying space.

Even though in the example I provide in this post I use a global index, you can use the same technique with local indexes as well. That said, I see no problem in using a global index as the one shown in this post.

Posted in 11gR2, Partitioning

Does CREATE INDEX Gather Global Statistics?

Dec17
2009
4 Comments Written by Christian Antognini

You can add the COMPUTE STATISTICS clause to the CREATE INDEX statement. It instructs the SQL statement to gather and store index statistics in the data dictionary, while creating the index. This is useful because the overhead associated with the gathering of statistics while executing this SQL statement is negligible. In Oracle9i, the gathering of statistics is performed only when this clause is specified. As of Oracle Database 10g, whenever statistics are not locked, their gathering is done by default, which means the COMPUTE STATISTICS clause is deprecated and available for backward compatibility only.

Unfortunately, CREATE INDEX does not gather global statistics. As a result, whenever you are creating partitioned indexes, the global statistics might be inaccurate. Let me show you an example:

  • Create partitioned table, insert data (notice that the number of distinct values is equal to the number of rows) and create a local index
SQL> CREATE TABLE t (n1 number, n2 number)
  2  PARTITION BY RANGE (n1) (
  3    PARTITION p1 VALUES LESS THAN (11),
  4    PARTITION p2 VALUES LESS THAN (21)
  5  );

Table created.

SQL> INSERT INTO t
  2  SELECT rownum, rownum
  3  FROM dual
  4  CONNECT BY level <= 20;

20 rows created.

SQL> CREATE INDEX i ON t (n2) LOCAL;

Index created.
  • The CREATE INDEX statement gathered the statistics for the index; let’s check them…
SQL> SELECT partition_name, global_stats, distinct_keys
  2  FROM user_ind_statistics
  3  WHERE index_name = 'I';

PARTITION_NAME GLOBAL_STATS DISTINCT_KEYS
-------------- ------------ -------------
               NO                      10
P1             NO                      10
P2             NO                      10

As you can see 1) the number of distinct keys at the global level is wrong; it should be 20! 2) the GLOBAL_STATS column at the index level is set to NO. As a result, when you create a partitioned index, you should manually gather the global index statistics straight after. In other words, you should do the following:

  • Manually gather global level index statistics
SQL> execute dbms_stats.gather_index_stats(ownname=>user, indname=>'i', granularity=>'global')

PL/SQL procedure successfully completed.
  • Check whether the index statistics are accurate
SQL> SELECT partition_name, global_stats, distinct_keys
  2  FROM user_ind_statistics
  3  WHERE index_name = 'I';

PARTITION_NAME GLOBAL_STATS DISTINCT_KEYS
-------------- ------------ -------------
               YES                     20
P1             NO                      10
P2             NO                      10

There are situations, however, where it is not necessary to manually gather the global index statistics. For example, when the index is prefixed. But, as a general rule, I would not rely on the automatically gathered statistics for partitioned indexes.

Posted in 10gR1, 10gR2, 11gR1, 11gR2, 9iR2, Object Statistics, Partitioning

Zero-Size Unusable Indexes and the Query Optimizer

Nov25
2009
10 Comments Written by Christian Antognini

Zero-size unusable indexes and index partions is a small but useful feature of Oracle Database 11g Release 2. Simply put, its aim is to save space in the database by immediately releasing the segment associated to unusable indexes or index partitions. To illustrate this, let’s have a look to an example…

  • Create a partitioned table, insert data, create a local index and gather object statistics:
SQL> CREATE TABLE t (
  2    id NUMBER NOT NULL,
  3    d DATE NOT NULL,
  4    n NUMBER NOT NULL,
  5    pad VARCHAR2(4000) NOT NULL
  6  )
  7  PARTITION BY RANGE (d) (
  8    PARTITION t_jan_2009 VALUES LESS THAN (to_date('2009-02-01','yyyy-mm-dd')),
  9    PARTITION t_feb_2009 VALUES LESS THAN (to_date('2009-03-01','yyyy-mm-dd')),
 10    PARTITION t_mar_2009 VALUES LESS THAN (to_date('2009-04-01','yyyy-mm-dd')),
 11    PARTITION t_apr_2009 VALUES LESS THAN (to_date('2009-05-01','yyyy-mm-dd')),
 12    PARTITION t_may_2009 VALUES LESS THAN (to_date('2009-06-01','yyyy-mm-dd')),
 13    PARTITION t_jun_2009 VALUES LESS THAN (to_date('2009-07-01','yyyy-mm-dd')),
 14    PARTITION t_jul_2009 VALUES LESS THAN (to_date('2009-08-01','yyyy-mm-dd')),
 15    PARTITION t_aug_2009 VALUES LESS THAN (to_date('2009-09-01','yyyy-mm-dd')),
 16    PARTITION t_sep_2009 VALUES LESS THAN (to_date('2009-10-01','yyyy-mm-dd')),
 17    PARTITION t_oct_2009 VALUES LESS THAN (to_date('2009-11-01','yyyy-mm-dd')),
 18    PARTITION t_nov_2009 VALUES LESS THAN (to_date('2009-12-01','yyyy-mm-dd')),
 19    PARTITION t_dec_2009 VALUES LESS THAN (to_date('2010-01-01','yyyy-mm-dd'))
 20  );

SQL> INSERT INTO t
  2  SELECT rownum, to_date('2009-01-01','yyyy-mm-dd')+rownum/274, mod(rownum,11), rpad('*',100,'*')
  3  FROM dual
  4  CONNECT BY level <= 100000;

SQL> CREATE INDEX i ON t (d) LOCAL;

SQL> execute dbms_stats.gather_table_stats(user,'T')
  • Make all partitions but the last one unusable:
SQL> ALTER INDEX i MODIFY PARTITION t_jan_2009 UNUSABLE;

SQL> ALTER INDEX i MODIFY PARTITION t_feb_2009 UNUSABLE;

SQL> ALTER INDEX i MODIFY PARTITION t_mar_2009 UNUSABLE;

SQL> ALTER INDEX i MODIFY PARTITION t_apr_2009 UNUSABLE;

SQL> ALTER INDEX i MODIFY PARTITION t_may_2009 UNUSABLE;

SQL> ALTER INDEX i MODIFY PARTITION t_jun_2009 UNUSABLE;

SQL> ALTER INDEX i MODIFY PARTITION t_jul_2009 UNUSABLE;

SQL> ALTER INDEX i MODIFY PARTITION t_aug_2009 UNUSABLE;

SQL> ALTER INDEX i MODIFY PARTITION t_sep_2009 UNUSABLE;

SQL> ALTER INDEX i MODIFY PARTITION t_oct_2009 UNUSABLE;

SQL> ALTER INDEX i MODIFY PARTITION t_nov_2009 UNUSABLE;
  • Check whether the segments associated to the unusable partitions still exist:
SQL> SELECT partition_name, bytes
  2  FROM user_segments
  3  WHERE segment_name = 'I'
  4  AND segment_type = 'INDEX PARTITION'
  5  ORDER BY partition_name;

PARTITION_NAME       BYTES
--------------- ----------
T_DEC_2009          262144

As you can see from the output of the last query, only the segment associated to the partition T_DEC_2009 exists. All other segments have been freed. (Note that up to Oracle Database 11g Release 1 all segments would still exist.)

This is nice but, in my opinion, there is a more important thing to consider…
What does the query optimizer do when it has to generate the execution plan for a query that reads data stored into a table having unusable index partitions? For example, does the query optimizer take advantage of the usable partitions to apply a restriction? If yes, what happens when both partitions having usable and unusable index partitions have to be accessed? Let’s have a look to some examples…

  • First, let’s check whether the usable index partition can be used to apply a restriction:
SQL> SELECT count(d)
  2  FROM t
  3  WHERE d BETWEEN to_date('2009-12-01 23:00:00','yyyy-mm-dd hh24:mi:ss')
  4              AND to_date('2009-12-02 01:00:00','yyyy-mm-dd hh24:mi:ss');

SQL> SELECT * FROM table(dbms_xplan.display_cursor(format=>'basic +partition'));

--------------------------------------------------------
| Id  | Operation               | Name | Pstart| Pstop |
--------------------------------------------------------
|   0 | SELECT STATEMENT        |      |       |       |
|   1 |  SORT AGGREGATE         |      |       |       |
|   2 |   PARTITION RANGE SINGLE|      |    12 |    12 |
|   3 |    INDEX RANGE SCAN     | I    |    12 |    12 |
--------------------------------------------------------

Nice, an index range scan can be performed.


  • Second, let’s check what happen when an unusable index partition would be accessed:
SQL> SELECT count(d)
  2  FROM t
  3  WHERE d BETWEEN to_date('2009-11-01 23:00:00','yyyy-mm-dd hh24:mi:ss')
  4              AND to_date('2009-11-02 01:00:00','yyyy-mm-dd hh24:mi:ss');

SQL> SELECT * FROM table(dbms_xplan.display_cursor(format=>'basic +partition'));

--------------------------------------------------------
| Id  | Operation               | Name | Pstart| Pstop |
--------------------------------------------------------
|   0 | SELECT STATEMENT        |      |       |       |
|   1 |  SORT AGGREGATE         |      |       |       |
|   2 |   PARTITION RANGE SINGLE|      |    11 |    11 |
|   3 |    TABLE ACCESS FULL    | T    |    11 |    11 |
--------------------------------------------------------

Obviously, a partition scan is performed. Note that this is only true if the initialization parameter SKIP_UNUSABLE_INDEXES is set to TRUE (this is the default). Otherwise an ORA-01502 would be generated.


  • Third, let’s check what happens when both usable and unusable index partitions would be accessed:
SQL> SELECT count(d)
  2  FROM t
  3  WHERE d BETWEEN to_date('2009-11-30 23:00:00','yyyy-mm-dd hh24:mi:ss')
  4              AND to_date('2009-12-01 01:00:00','yyyy-mm-dd hh24:mi:ss');

SQL> SELECT * FROM table(dbms_xplan.display_cursor(format=>'basic +partition'));

-------------------------------------------------------------
| Id  | Operation                 | Name    | Pstart| Pstop |
-------------------------------------------------------------
|   0 | SELECT STATEMENT          |         |       |       |
|   1 |  SORT AGGREGATE           |         |       |       |
|   2 |   VIEW                    | VW_TE_2 |       |       |
|   3 |    UNION-ALL              |         |       |       |
|   4 |     PARTITION RANGE SINGLE|         |    12 |    12 |
|   5 |      INDEX RANGE SCAN     | I       |    12 |    12 |
|   6 |     PARTITION RANGE SINGLE|         |    11 |    11 |
|   7 |      TABLE ACCESS FULL    | T       |    11 |    11 |
-------------------------------------------------------------

Good stuff! The query optimizer generates an execution plan containing a UNION ALL to takes advantage of the usable index partition. This is really interesting because it allows us to selectively remove unnecessary index partitions. For example, in case some indexes are only used for the “current” partition(s), the index partitions of the older ones could be set unusable. As a result, lot of space might be released.

Be careful, however, that the flexibility of the query optimizer has (still?) some limits. For example, in the following query the unusable partition is the one in the “middle”. In such a case, I was not able to let the query optimizer apply the optimization described above.

SQL> ALTER INDEX i REBUILD PARTITION t_oct_2009;

SQL> SELECT count(d)
  2  FROM t
  3  WHERE d BETWEEN to_date('2009-10-30 23:00:00','yyyy-mm-dd hh24:mi:ss')
  4              AND to_date('2009-12-01 01:00:00','yyyy-mm-dd hh24:mi:ss');

SQL> SELECT * FROM table(dbms_xplan.display_cursor(format=>'basic +partition'));

----------------------------------------------------------
| Id  | Operation                 | Name | Pstart| Pstop |
----------------------------------------------------------
|   0 | SELECT STATEMENT          |      |       |       |
|   1 |  SORT AGGREGATE           |      |       |       |
|   2 |   PARTITION RANGE ITERATOR|      |    10 |    12 |
|   3 |    TABLE ACCESS FULL      | T    |    10 |    12 |
----------------------------------------------------------
Posted in 11gR2, Partitioning, Query Optimizer
← Older Entries Newer Entries →

EvoLve theme by Theme4Press  •  Powered by WordPress Striving for Optimal Performance