I have a table for visitor stats as follows:
CREATE TABLE `stats` (
`key` int(11) NOT NULL AUTO_INCREMENT,
`u_id` int(11) NOT NULL,
`app` varchar(20) NOT NULL,
`type` varchar(20) DEFAULT '',
`category` varchar(80) DEFAULT '',
`sub_category` varchar(80) DEFAULT '',
`date` int(11) NOT NULL,
`ip` varchar(20) NOT NULL,
PRIMARY KEY (`key`)
) ENGINE=MyISAM AUTO_INCREMENT=41094490 DEFAULT CHARSET=utf8;
It was created without much thought a few years ago and has been collecting data ever since. It now has about 40 million rows.
Selecting by "app" takes about 1 minute to complete, which is far too slow. I mostly work in PHP and JavaScript, but my MySQL knowledge is pretty limited.
The table only ever needs rows appended to it and SELECT queries run against it. The SELECT queries are used to generate graphs of usage stats on the frontend.
A typical query selects by matching 2-4 column values and always includes a date range. Normally I would expect the date range to cover a period of 1 week to 6 months, but there are cases where the user may want to view a few years of data. The priority is to optimise the former.
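For illustration, a typical query might look something like this (the specific values are made up, and I'm assuming the `date` column holds Unix timestamps):

select * from stats
where u_id = 123
  and app = 'articles'
  and `date` between UNIX_TIMESTAMP('2013-01-01') and UNIX_TIMESTAMP('2013-07-01');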
I don't mind having a separate database with this data sectioned off into different tables according to date, so maybe one table per month of data. I can see how this would help queries covering a few months at a time (combining the relevant month tables with a UNION), but when it comes to data over a few years it would get even slower.
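From what I've read, MySQL's native partitioning might achieve the same thing without managing separate tables. A rough sketch of what I think that would look like (again assuming `date` is a Unix timestamp; as I understand it the primary key would first have to be extended to include the partitioning column, and the boundary dates here are just illustrative):

-- Partitioning requires every unique key to include the partitioning column,
-- so the primary key has to be extended first:
alter table stats drop primary key, add primary key (`key`, `date`);

-- One partition per month; queries with a date range should then only
-- have to scan the matching partitions:
alter table stats partition by range (`date`) (
    partition p201301 values less than (unix_timestamp('2013-02-01')),
    partition p201302 values less than (unix_timestamp('2013-03-01')),
    partition pmax values less than maxvalue
);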
Is there anything else I can alter in the table's structure to improve things? Would it be better to separate the table into related tables with fewer columns?
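The most obvious fix I've found so far is a composite index covering the columns I filter on, something like the sketch below (the index name is made up, and I gather the range column should go last so the equality parts of the WHERE clause can still use the index):

alter table stats add index idx_uid_app_date (`u_id`, `app`, `date`);

I haven't run this yet, as I assume building an index over ~40 million MyISAM rows will lock the table for quite a while.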
The query I am using to test performance is:
select * from stats where u_id='123' and app='articles'
Running EXPLAIN on this query I get:
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
1,"SIMPLE","stats","ALL",NULL,NULL,NULL,NULL,37462750,"Using where"