I'm somewhat puzzled by the scenario below. The table involved (relevant parts) is defined as:
CREATE TABLE Nya. UPSEC_COURSE (
UPSEC_COURSE_ID CHAR(11) NOT NULL,
CORE_UPSEC_SUBJECT SMALLINT NOT NULL WITH DEFAULT 0,
POINT SMALLINT NOT NULL WITH DEFAULT 0,
CONVERTEDPOINT SMALLINT,
UPSEC_COURSETYPE_ID CHAR(5) NOT NULL with default 'N',
UPSEC_COURSE_SHORT CHAR(20) NOT NULL with default,
EQ_UPSEC_SUBJECT_ID CHAR(5),
UPSEC_MARKSCALE_ID SMALLINT NOT NULL,
SUBMIT_BY VARCHAR(130) NOT NULL ,
SUBMIT_TIME TIMESTAMP NOT NULL WITH EFAULT CURRENT TIMESTAMP,
UPSEC_COURSE VARCHAR(64) NOT NULL with default
) IN USERSPACE1 @
CREATE UNIQUE INDEX NYA.XPK_UPSEC_COURSE ON NYA.UPSEC_COURSE
(UPSEC_COURSE_ID) INCLUDE (UPSEC_COURSE, UPSEC_MARKSCALE_ID, EQ_UPSEC_SUBJECT_ID
,UPSEC_COURSETYPE_ID, CONVERTEDPOINT, CORE_UPSEC_SUBJECT)
ALLOW REVERSE SCANS
COLLECT SAMPLED DETAILED STATISTICS @
The query under concern is:
Original Statement:
------------------
SELECT
RTRIM(UC.EQ_UPSEC_SUBJECT_ID) AS EQ_UPSEC_SUBJECT_ID,
LEV.UPSEC_SUBJECTLEVEL_ID,
RTRIM(UC.UPSEC_COURSETYPE_ID)
FROM nya.UPSEC_COURSE UC
JOIN nya.UPSEC_SUBJECTLEVEL_COURSE LEV
ON UC.UPSEC_COURSE_ID = LEV.UPSEC_COURSE_ID
WHERE LEV.UPSEC_COURSE_ID = ?
and UC.UPSEC_COURSETYPE_ID in (?, ?)
AND EXISTS (
SELECT 1
FROM NYA.UPSEC_COURSE UC2
JOIN NYA.UPSEC_COURSE_WEIGHTED UCW
ON UC2.UPSEC_COURSE_ID = UCW.UPSEC_COURSE_ID
WHERE UC2.EQ_UPSEC_SUBJECT_ID = UC.EQ_UPSEC_SUBJECT_ID
)
All indexes and tables are reorged and have updated statistics.
The query reads roughly 10000 rows per execution due to a hash-join (I replaced the parameter markers with constants in these numbers)
NUM_EXECUTIONS ROWS_READ POOL_DATA_L_READS POOL_INDEX_L_READS
-------------------- -------------------- -------------------- --------------------
5 50535 1415 145
Access Plan:
-----------
Total Cost: 665.512
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
1
TBSCAN
( 2)
665.512
296
|
1
SORT
( 3)
665.512
296
|
6.04348
^HSJOIN
( 4)
665.511
296
/-------------+-------------\
6.04348 1
HSJOIN^ IXSCAN
( 5) ( 10)
652.652 12.8584
295 1
/---------+---------\ |
10014 2641 2663
TBSCAN NLJOIN INDEX: NYA
( 6) ( 7) XPK_UPSEC_SUBJECTLEVEL_COURSE
482.247 169.719 Q6
271 24
| /--------+--------\
10014 1 2641
TABLE: NYA IXSCAN IXSCAN
UPSEC_COURSE ( 8) ( 9)
Q5 25.7102 144.009
2 22
| |
10014 2641
INDEX: NYA INDEX: NYA
XPK_UPSEC_COURSE XPK_UPSEC_COURSE_WEIGHTED
Q7 Q4
As mentioned table and indexes are reorged and have updated statistics on. If I create an almost identical index
CREATE UNIQUE INDEX TMP.XPK_UPSEC_COURSE ON NYA.UPSEC_COURSE
(UPSEC_COURSE_ID) INCLUDE (UPSEC_COURSE, UPSEC_MARKSCALE_ID, EQ_UPSEC_SUBJECT_ID
,UPSEC_COURSETYPE_ID, CONVERTEDPOINT, CORE_UPSEC_SUBJECT, SUBMIT_BY)
ALLOW REVERSE SCANS
COLLECT SAMPLED DETAILED STATISTICS @
I.e. only difference is that I included the completely irrelevant column SUBMIT by in the INCLUDE, I get much nicer numbers:
NUM_EXECUTIONS ROWS_READ POOL_DATA_L_READS POOL_INDEX_L_READS
-------------------- -------------------- -------------------- --------------------
5 0 0 125
The plan is changed to:
Rows
RETURN
( 1)
Cost
I/O
|
1
NLJOIN
( 2)
123.687
34.985
/----------+----------\
1 1
IXSCAN ^NLJOIN
( 3) ( 5)
12.8584 110.829
1 33.985
| /---+----\
2663 1 6.14874
INDEX: NYA IXSCAN FILTER
XPK_UPSEC_SUBJECTLEVEL_COURSE ( 6) ( 7)
Q5 25.7101 609.94
2 229.197
| |
10107 2687
INDEX: NYA ^MSJOIN
XPK_UPSEC_COURSE ( 8)
Q6 609.77
229.197
/------+------\
2687 1
IXSCAN FILTER
( 9) ( 10)
144.02 471.243
22 210
| |
2687 10107
INDEX: NYA IXSCAN
XPK_UPSEC_COURSE_WEIGHTED ( 11)
Q3 471.243
210
|
10107
INDEX: TMP
XPK_UPSEC_COURSE
Q4
If I create the table in another schema with the original index definition I also get this nice behaviour.
Question is why db2 insist on using the index the way it does? I get the feeling that the index is not healthy and should be dropped and recreated. Only problem is that ~50 foreign keys are referencing the p.k. that the index supports, so I'd rather avoid that. Beside reorging (which I tried), what can be done to change the plan?
I noticed that INDEXREC = RESTART, so marking the index as invalid probably won't take effect until restart
Any thoughts?
EDIT: Added result of db2 reorgchk on table nya.upsec_course
Table statistics:
F1: 100 * OVERFLOW / CARD < 5
F2: 100 * (Effective Space Utilization of Data Pages) > 70
F3: 100 * (Required Pages / Total Pages) > 80
SCHEMA.NAME CARD OV NP FP ACTBLK TSIZE F1 F2 F3 REORG
----------------------------------------------------------------------------------------
Table: NYA.UPSEC_COURSE
10107 0 273 273 - 1081449 0 100 100 ---
----------------------------------------------------------------------------------------
Index statistics:
F4: CLUSTERRATIO or normalized CLUSTERFACTOR > 80
F5: 100 * (Space used on leaf pages / Space available on non-empty leaf pages) > MIN(50, (100 - PCTFREE))
F6: (100 - PCTFREE) * (Amount of space available in an index with one less level / Amount of space required for all keys) < 100
F7: 100 * (Number of pseudo-deleted RIDs / Total number of RIDs) < 20
F8: 100 * (Number of pseudo-empty leaf pages / Total number of leaf pages) < 20
SCHEMA.NAME INDCARD LEAF ELEAF LVLS NDEL KEYS LEAF_RECSIZE NLEAF_RECSIZE LEAF_PAGE_OVERHEAD NLEAF_PAGE_OVERHEAD PCT_PAGES_SAVED F4 F5 F6 F7 F8 REORG
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Table: NYA.UPSEC_COURSE
Index: NYA.XAK1_UPSEC_COURSE
10107 77 0 2 0 10107 16 16 416 416 0 87 93 - 0 0 -----
Index: NYA.XPK_UPSEC_COURSE
10107 168 0 3 0 10107 47 11 296 496 0 87 90 73 0 0 -----
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
I haven't used reorgchk in years (I usually just look at the number of overflows to determine if reorg is needed) but as far as I can tell the numbers looks decent enough.
Details about the other table involved. UPSEC_COURSE_WEIGHTED is a subset of UPSEC_COURSE:
CREATE TABLE NYA.UPSEC_COURSE_WEIGHTED (
UPSEC_COURSE_ID CHAR(11) NOT NULL
) IN USERSPACE1 @
CREATE UNIQUE INDEX NYA.XPK_UPSEC_COURSE_WEIGHTED ON NYA.UPSEC_COURSE_WEIGHTED
( UPSEC_COURSE_ID )
ALLOW REVERSE SCANS
COLLECT SAMPLED DETAILED STATISTICS @
ALTER TABLE NYA.UPSEC_COURSE_WEIGHTED ADD CONSTRAINT XPK_UPSEC_COURSE_WEIGHTED
PRIMARY KEY ( UPSEC_COURSE_ID ) @
ALTER TABLE NYA.UPSEC_COURSE_WEIGHTED ADD CONSTRAINT XFK_UPSEC_COURSE
FOREIGN KEY (UPSEC_COURSE_ID)
REFERENCES NYA.UPSEC_COURSE (UPSEC_COURSE_ID)
ON DELETE CASCADE
ON UPDATE RESTRICT @
Reorgchk for table nya.upsec_course_weighted
F1: 100 * OVERFLOW / CARD < 5
F2: 100 * (Effective Space Utilization of Data Pages) > 70
F3: 100 * (Required Pages / Total Pages) > 80
SCHEMA.NAME CARD OV NP FP ACTBLK TSIZE F1 F2 F3 REORG
----------------------------------------------------------------------------------------
Table: NYA.UPSEC_COURSE_WEIGHTED
2687 0 15 15 - 56427 0 100 100 ---
----------------------------------------------------------------------------------------
Index statistics:
F4: CLUSTERRATIO or normalized CLUSTERFACTOR > 80
F5: 100 * (Space used on leaf pages / Space available on non-empty leaf pages) > MIN(50, (100 - PCTFREE))
F6: (100 - PCTFREE) * (Amount of space available in an index with one less level / Amount of space required for all keys) < 100
F7: 100 * (Number of pseudo-deleted RIDs / Total number of RIDs) < 20
F8: 100 * (Number of pseudo-empty leaf pages / Total number of leaf pages) < 20
SCHEMA.NAME INDCARD LEAF ELEAF LVLS NDEL KEYS LEAF_RECSIZE NLEAF_RECSIZE LEAF_PAGE_OVERHEAD NLEAF_PAGE_OVERHEAD PCT_PAGES_SAVED F4 F5 F6 F7 F8 REORG
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Table: NYA.UPSEC_COURSE_WEIGHTED
Index: NYA.XPK_UPSEC_COURSE_WEIGHTED
2687 22 0 2 12 2687 11 11 496 496 0 98 74 - 0 0 -----
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------