I'm having trouble understanding why an index I have on one of my tables is doing nothing to help the performance of a JOIN, even when forcing the index. Here is the query:
EXPLAIN SELECT
*
FROM
tusers PARTITION (p362) tu FORCE INDEX (index_tu_group_id_email)
JOIN users PARTITION (p362) u FORCE INDEX (cc_group_email_index)
ON u.group_id=tu.group_id
AND tu.email_address=u.email
AND u.group_id = 362
WHERE
tu.appliation_id=253555;
+----+-------------+-------+------+-------------------------+-------------------------+---------+-----------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------------+-------------------------+---------+-----------------------------+---------+-------------+
| 1 | SIMPLE | u | ref | cc_group_email_index | cc_group_email_index | 4 | const | 8282617 | Using where |
| 1 | SIMPLE | tu | ref | index_tu_group_id_email | index_tu_group_id_email | 247 | const,db.u.email | 8 | Using where |
+----+-------------+-------+------+-------------------------+-------------------------+---------+-----------------------------+---------+-------------+
I included the FORCE INDEX on tusers here just to simplify the output, including or excluding that makes no difference on the query plan.
When I run the same query on my local database (with significantly less data) the query uses the index, and it provides the performance benefit you'd expect from doing so. One other difference I noticed when running locally: the explain output starts with the tusers filter, and then uses the index on the users table -- I suspect that is related to my issue, but I'm not seeing why/how.
Production database (where the query is failing) has ~300,000,000 rows in the tusers table, and the users table has ~175,000,000 rows. Local database only has a few hundred thousand rows in each. MySQL versions are the 5.6.19 on production, 5.6.10 locally. The partitions in question on production have ~17,000,000 rows (users) and 7,600,000 rows (tusers).
Running a query directly against the email column in the users table makes use of the index:
EXPLAIN SELECT
*
FROM users FORCE INDEX (cc_group_email_index)
WHERE group_id=362 and email='emailaddress@gmail.com'
+----+-------------+-------+------+----------------------+----------------------+---------+-------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------+----------------------+---------+-------------+------+-------------+
| 1 | SIMPLE | users | ref | cc_group_email_index | cc_group_email_index | 407 | const,const | 1 | Using where |
+----+-------------+-------+------+----------------------+----------------------+---------+-------------+------+-------------+
I don't think this is related, given I'm forcing the index, but I noticed it so including just in case: cc_group_email_index is composed of two keys, each of which has a cardinality listed of 9, even though the cardinality of both columns is significantly higher. I also have another nearly identical index on a varchar column "raw_id" which when replaced in the above example works perfectly. The only difference between the two columns / indexes is: the raw_id column is varchar(255), and the index is defined to only use the first 60 characters:
KEY `cc_group_email_index` (`group_id`,`email`),
KEY `cc_group_raw_id_index` (`group_id`,`raw_id`(60))
The email address column on tusers is 255 characters (which doesn't match users), though I wouldn't think this matters (and in fact, it does not impact things locally).
Any help/suggestions are HUGELY appreciated.
UPDATE
As requested, here is the table schema:
users:
+---------------------------------+--------------+------+-----+------------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------------------------+--------------+------+-----+------------+-------+
| group_id | int(11) | NO | PRI | NULL | |
| internal_user_id | int(11) | NO | PRI | NULL | |
| create_date | date | NO | | 1000-01-01 | |
| first_trans_date | date | NO | | 1000-01-01 | |
| first_trans_year | smallint(6) | NO | | 0 | |
| first_trans_month | tinyint(4) | NO | | 0 | |
| last_trans_date | date | NO | | 1000-01-01 | |
| first_purchase_revenue_in_cents | bigint(20) | NO | | 0 | |
| first_purchase_profit_in_cents | bigint(20) | NO | | 0 | |
| first_name | varchar(50) | YES | | NULL | |
| last_name | varchar(50) | YES | | NULL | |
| raw_id | varchar(255) | YES | | NULL | |
| address_1 | varchar(200) | YES | | NULL | |
| city | varchar(75) | YES | | NULL | |
| state | varchar(50) | YES | | NULL | |
| zip | varchar(20) | YES | | NULL | |
| phone | varchar(20) | YES | | NULL | |
| email | varchar(100) | YES | | NULL | |
| is_opt_in | tinyint(1) | YES | | NULL | |
| mail_merge_1 | varchar(255) | YES | | NULL | |
+---------------------------------+--------------+------+-----+------------+-------+
tusers:
+--------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| group_id | int(11) | NO | PRI | NULL | |
| application_id | int(11) | YES | MUL | NULL | |
| group_user_id | int(11) | YES | | NULL | |
| raw_id | varchar(255) | YES | | NULL | |
| email_address | varchar(255) | YES | | NULL | |
| first_name | varchar(255) | YES | | NULL | |
| last_name | varchar(255) | YES | | NULL | |
| mail_merge_1 | varchar(255) | YES | | NULL | |
+--------------------------+--------------+------+-----+---------+----------------+
Cardinality:
tusers: index_tu_group_id_email - 2,500,000 users: 17,000,000