0

I am looking to optimize a JOIN between tables that uses a small subset of columns in the second table.

CREATE TABLE `employees` (
  `employee_id` bigint NOT NULL,
  `manager_id` bigint NOT NULL,
  `org_id` bigint NOT NULL,
  `union_id` bigint NOT NULL
  ...
  PRIMARY KEY (employee_id),
  INDEX (union_id)
);

CREATE TABLE managers ( manager_id bigint NOT NULL, org_id bigint NOT NULL, some_condition boolean NOT NULL, PRIMARY KEY (manager_id)
);

Now I want to optimize two types of queries. Both join the tables together on manager_id and org_id, and optionally apply a filter to the some_condition column.

SELECT employees.* 
FROM employees 
JOIN managers
ON (employees.manager_id = managers.manager_id AND employees.org_id = managers.org_id)
WHERE (employees.union_id = ? AND managers.some_condition);

SELECT employees.* FROM employees JOIN managers ON (employees.manager_id = managers.manager_id AND employees.org_id = managers.org_id) WHERE (employees.union_id = ?);

Assuming these are very large tables and employees > manager. I am trying to create an index on managers that will speed up the query. Right now the query is slow because for each row it has to read org_id and some_condition. I want to avoid going to disk if possible.

So far I have two indexes that might work:

INDEX `join_index` (`org_id`,`some_condition`)
INDEX `id_join_index` (`manager_id`, `org_id`, `some_condition`)

My main issue is that MySQL does not use either index in the EXPLAIN statement unless I force it to with use index (...).

Which index (if either) will speed up my query, and do I need manager_id in the index to speed up the join if I do not filter on some_condition?

1 Answers1

0

As said in the comment, you need the composite index for both tables. Also, you do need manager_id in the index even if you're not filtering on some_condition. Without it, MySQL can't efficiently locate the matching rows for the join.

ishimwe
  • 101
  • 1