Indexing the columns that only have yes and no values (cardinality of 2) would never get use. You would quickly subject table1 to table scans. Index merges would be out of the question : See Combining columns in index
In the given table, note that you could separate the clothing types from the person
Here is your table. Let's call it person
CREATE TABLE person
(
id INT NOT NULL AUTO_INCREMENT,
state CHAR(2),
jacket CHAR(3),
coat CHAR(3),
shoes CHAR(3),
raincoat CHAR(3),
sweater CHAR(3),
pants CHAR(3),
PRIMARY KEY (id),
KEY (state)
);
Let's create two tables
Here is one to hold types of clothing
CREATE TABLE articles_of_clothing
(
id INT NOT NULL AUTO_INCREMENT,
article VARCHAR(20),
PRIMARY KEY (id)
);
INSERT INTO clothing_types (article) VALUES
('jacket'),('coat'),('shoes'),('raincoat'),('sweater'),('pants');
Here is one to hold types of clothing associated with a person
CREATE TABLE clothing
(
id INT NOT NULL AUTO_INCREMENT,
person_id INT NOT NULL,
article_id INT NOT NULL,
PRIMARY KEY (id),
KEY person_article (person_id,article_id),
KEY article_person (article_id,person_id)
);
INSERT INTO clothing (person_id,article_id)
SELECT A.id,,B.id FROM person A,
(SELECT id FROM articles_of_clothing WHERE article='jacket') B
WHERE A.jacket='yes';
INSERT INTO clothing (person_id,article_id)
SELECT A.id,,B.id FROM person A,
(SELECT id FROM articles_of_clothing WHERE article='coat') B
WHERE A.coat='yes';
INSERT INTO clothing (person_id,article_id)
SELECT A.id,,B.id FROM person A,
(SELECT id FROM articles_of_clothing WHERE article='shoes') B
WHERE A.shoes='yes';
INSERT INTO clothing (person_id,article_id)
SELECT A.id,,B.id FROM person A,
(SELECT id FROM articles_of_clothing WHERE article='raincoat') B
WHERE A.raincoat='yes';
INSERT INTO clothing (person_id,article_id)
SELECT A.id,,B.id FROM person A,
(SELECT id FROM articles_of_clothing WHERE article='sweater') B
WHERE A.sweater='yes';
INSERT INTO clothing (person_id,article_id)
SELECT A.id,,B.id FROM person A,
(SELECT id FROM articles_of_clothing WHERE article='pants') B
WHERE A.pants='yes';
Let's remove the clothing columns from the person table
CREATE TABLE person_new LIKE person;
ALTER TABLE person_new DROP COLUMN jacket;
ALTER TABLE person_new DROP COLUMN coat;
ALTER TABLE person_new DROP COLUMN shoes;
ALTER TABLE person_new DROP COLUMN raincoat;
ALTER TABLE person_new DROP COLUMN sweater;
ALTER TABLE person_new DROP COLUMN pants;
INSERT INTO person_new (id,state) SELECt id,state FROM person;
ALTER TABLE person RENAME person_old;
ALTER TABLE person_new RENAME person;
DROP TABLE person_old;
Now that they are separated, let's look at your queries. We'll use just the first one.
SELECT * FROM table1 WHERE pants = 'yes' AND sweater = 'yes';
How can you accomplish this under the new design? Do it in stages
- Create a Table to collect person_ids from each pass
- Query the clothing table for all person_ids that have
pants and collect them
- Query the clothing table for all person_ids that have
sweater and collect them
- Count how many have person_id written twice
- Join all person_ids found twice back to person table
Here are those steps
CREATE TEMPORARY TABLE queryids (person_id INT NOT NULL,KEY (person_id));
INSERT INTO queryids
SELECT B.person_id FROM
(SELECT id article_id, FROM articles_of_clothing WHERE article='pants') A
INNER JOIN clothing USING (article_id) B
;
INSERT INTO queryids
SELECT B.person_id FROM
(SELECT id article_id, FROM articles_of_clothing WHERE article='sweater') A
INNER JOIN clothing USING (article_id) B
;
SELECT B.* FROM
(
SELECT COUNT(1) rcount,person_id
FROM queryids GROUP BY person_id
HAVING COUNT(1)=2
) A LEFT JOIN person B;
Since you only have 6 articles of clothing this may not be the best solution. If you have over twenty articles of clothing, the Query Optimizer stands a better chance of using indexes. I say this because when the Query optimizer has to read more than index entries for more than 5% a table, the Query Optimizer gives up an does a full table scan.
I have given this rule-of-thumb as a root cause to many index searching misadventures: