Which is faster, InnoDB or MyISAM?

Question

How can MyISAM be "faster" than InnoDB if

MyISAM needs to do disk reads for the data?
InnoDB uses the buffer pool for indexes and data, and MyISAM just for the index?

score 79 · Accepted Answer · edited Jun 15 '20 at 09:05

The only way MyISAM can be faster that InnoDB would be under this unique circumstance

MyISAM

When read, a MyISAM table's indexes can be read once from the .MYI file and loaded in the MyISAM Key Cache (as sized by key_buffer_size). How can you make a MyISAM table's .MYD faster to read? With this:

ALTER TABLE mytable ROW_FORMAT=Fixed;

I wrote about this in my past posts

Best of MyISAM and InnoDB (Please Read This One First)
What is the performance impact of using CHAR vs VARCHAR on a fixed-size field? (TRADEOFF #2)
Optimized my.cnf for high-end and busy server (Under the heading Replication)
Which DBMS is good for super-fast reads and a simple data structure? (Paragraph 3)

InnoDB

OK, what about InnoDB? Does InnoDB do any disk I/O for queries? Surprisingly, yes it does !! You are probably thinking I am crazy for saying that, but it is absolutely true, even for SELECT queries. At this point, you are probably wondering "How in the world is InnoDB doing disk I/O for queries?"

It all goes back to InnoDB being an ACID-complaint Transactional Storage Engine. In order for InnoDB to be Transactional, it has to support the I in ACID, which is Isolation. The technique for maintaining isolation for transactions is done via MVCC, Multiversion Concurrency Control. In simple terms, InnoDB records what data looks like before transactions attempt to change them. Where does that get recorded? In the system tablespace file, better known as ibdata1. That requires disk I/O.

COMPARISON

Since both InnoDB and MyISAM do disk I/O, what random factors dictate who is faster?

Size of Columns
Column Format
Character Sets
Range of Numeric Values (requiring large enough INTs)
Rows Being Split Across Blocks (Row Chaining)
Data Fragmentation caused by DELETEs and UPDATEs
Size of Primary Key (InnoDB has a Clustered Index, requiring two key lookups)
Size of Index Entries
the list goes on...

Thus, in a heavy-read environment, it is possible for a MyISAM table with a Fixed Row Format to outperform InnoDB reads out of the InnoDB Buffer Pool if there is enough data being written into the undo logs contained within ibdata1 to support the transactional behavior imposed on the InnoDB data.

CONCLUSION

Plan your data types, queries, and storage engine real carefully. Once the data grows, it might become very difficult to move data around. Just ask Facebook...

score 24 · Answer 2 · answered May 18 '12 at 05:51

In a simple world, MyISAM is faster for reads, InnoDB is faster for writes.

Once you start introducing mixed read/writes, InnoDB will be faster for reads as well, thanks to its Row locking mechanism.

I wrote a comparison of MySQL storage engines a few years ago, that still holds true to this day, outlining the unique differences between MyISAM and InnoDB.

In my experience, you should use InnoDB for everything except for read-heavy cache-tables, where losing data due to corruption is not as critical.

score 19 · Answer 3 · edited Jun 15 '20 at 09:05

To add to the responses here covering the mechanical differences between the two engines, I present an empirical speed comparison study.

In terms of pure speed, it is not always the case that MyISAM is faster than InnoDB but in my experience it tends to be faster for PURE READ working environments by a factor of about 2.0-2.5 times. Clearly this isn't appropriate for all environments - as others have written, MyISAM lacks such things as transactions and foreign keys.

I've done a bit of benchmarking below - I've used python for looping and the timeit library for timing comparisons. For interest I've also included the memory engine, this gives the best performance across the board although it is only suitable for smaller tables (you continually encounter The table 'tbl' is full when you exceed the MySQL memory limit). The four types of select I look at are:

vanilla SELECTs
counts
conditional SELECTs
indexed and non-indexed sub-selects

Firstly, I created three tables using the following SQL

CREATE TABLE
    data_interrogation.test_table_myisam
    (
        index_col BIGINT NOT NULL AUTO_INCREMENT,
        value1 DOUBLE,
        value2 DOUBLE,
        value3 DOUBLE,
        value4 DOUBLE,
        PRIMARY KEY (index_col)
    )
    ENGINE=MyISAM DEFAULT CHARSET=utf8

with 'MyISAM' substituted for 'InnoDB' and 'memory' in the second and third tables.

1) Vanilla selects

Query: SELECT * FROM tbl WHERE index_col = xx

Result: draw

Comparison of vanilla selects by different database engines

The speed of these is all broadly the same, and as expected is linear in the number of columns to be selected. InnoDB seems slightly faster than MyISAM but this is really marginal.

Code:

import timeit
import MySQLdb
import MySQLdb.cursors
import random
from random import randint
db = MySQLdb.connect(host="...", user="...", passwd="...", db="...", cursorclass=MySQLdb.cursors.DictCursor)
cur = db.cursor()
lengthOfTable = 100000
Fill up the tables with random data
for x in xrange(lengthOfTable):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()
insertString = &quot;INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;
insertString2 = &quot;INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;
insertString3 = &quot;INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;

cur.execute(insertString)
cur.execute(insertString2)
cur.execute(insertString3)


db.commit()
Define a function to pull a certain number of records from these tables
def selectRandomRecords(testTable,numberOfRecords):
for x in xrange(numberOfRecords):
    rand1 = randint(0,lengthOfTable)

    selectString = &quot;SELECT * FROM &quot; + testTable + &quot; WHERE index_col = &quot; + str(rand1)
    cur.execute(selectString)


setupString = "from main import selectRandomRecords"
Test time taken using timeit
myisam_times = []
innodb_times = []
memory_times = []
for theLength in [3,10,30,100,300,1000,3000,10000]:
innodb_times.append( timeit.timeit('selectRandomRecords(&quot;test_table_innodb&quot;,' + str(theLength) + ')', number=100, setup=setupString) )
myisam_times.append( timeit.timeit('selectRandomRecords(&quot;test_table_myisam&quot;,' + str(theLength) + ')', number=100, setup=setupString) )
memory_times.append( timeit.timeit('selectRandomRecords(&quot;test_table_memory&quot;,' + str(theLength) + ')', number=100, setup=setupString) )

2) Counts

Query: SELECT count(*) FROM tbl

Result: MyISAM wins

Comparison of counts by different database engines

This one demonstrates a big difference between MyISAM and InnoDB - MyISAM (and memory) keeps track of the number of records in the table, so this transaction is fast and O(1). The amount of time required for InnoDB to count increases super-linearly with table size in the range I investigated. I suspect many of the speed-ups from MyISAM queries that are observed in practice are due to similar effects.

Code:

myisam_times = []
innodb_times = []
memory_times = []
Define a function to count the records
def countRecords(testTable):
selectString = &quot;SELECT count(*) FROM &quot; + testTable
cur.execute(selectString)


setupString = "from main import countRecords"
Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:
truncateString = &quot;TRUNCATE test_table_innodb&quot;
truncateString2 = &quot;TRUNCATE test_table_myisam&quot;
truncateString3 = &quot;TRUNCATE test_table_memory&quot;

cur.execute(truncateString)
cur.execute(truncateString2)
cur.execute(truncateString3)

for x in xrange(theLength):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()

    insertString = &quot;INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;
    insertString2 = &quot;INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;
    insertString3 = &quot;INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;

    cur.execute(insertString)
    cur.execute(insertString2)
    cur.execute(insertString3)

db.commit()

# Count and time the query
innodb_times.append( timeit.timeit('countRecords(&quot;test_table_innodb&quot;)', number=100, setup=setupString) )
myisam_times.append( timeit.timeit('countRecords(&quot;test_table_myisam&quot;)', number=100, setup=setupString) )
memory_times.append( timeit.timeit('countRecords(&quot;test_table_memory&quot;)', number=100, setup=setupString) )

3) Conditional selects

Query: SELECT * FROM tbl WHERE value1<0.5 AND value2<0.5 AND value3<0.5 AND value4<0.5

Result: MyISAM wins

Comparison of conditional selects by different database engines

Here, MyISAM and memory perform approximately the same, and beat InnoDB by about 50% for larger tables. This is the sort of query for which the benefits of MyISAM seem to be maximised.

Code:

myisam_times = []
innodb_times = []
memory_times = []
Define a function to perform conditional selects
def conditionalSelect(testTable):
    selectString = "SELECT * FROM " + testTable + " WHERE value1 < 0.5 AND value2 < 0.5 AND value3 < 0.5 AND value4 < 0.5"
    cur.execute(selectString)
setupString = "from main import conditionalSelect"
Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:
truncateString = &quot;TRUNCATE test_table_innodb&quot;
truncateString2 = &quot;TRUNCATE test_table_myisam&quot;
truncateString3 = &quot;TRUNCATE test_table_memory&quot;

cur.execute(truncateString)
cur.execute(truncateString2)
cur.execute(truncateString3)

for x in xrange(theLength):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()

    insertString = &quot;INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;
    insertString2 = &quot;INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;
    insertString3 = &quot;INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;

    cur.execute(insertString)
    cur.execute(insertString2)
    cur.execute(insertString3)

db.commit()

# Count and time the query
innodb_times.append( timeit.timeit('conditionalSelect(&quot;test_table_innodb&quot;)', number=100, setup=setupString) )
myisam_times.append( timeit.timeit('conditionalSelect(&quot;test_table_myisam&quot;)', number=100, setup=setupString) )
memory_times.append( timeit.timeit('conditionalSelect(&quot;test_table_memory&quot;)', number=100, setup=setupString) )

4) Sub-selects

Result: InnoDB wins

For this query, I created an additional set of tables for the sub-select. Each is simply two columns of BIGINTs, one with a primary key index and one without any index. Due to the large table size, I didn't test the memory engine. The SQL table creation command was

CREATE TABLE
    subselect_myisam
    (
        index_col bigint NOT NULL,
        non_index_col bigint,
        PRIMARY KEY (index_col)
    )
    ENGINE=MyISAM DEFAULT CHARSET=utf8;

where once again, 'MyISAM' is substituted for 'InnoDB' in the second table.

In this query, I leave the size of the selection table at 1000000 and instead vary the size of the sub-selected columns.

Comparison of sub-selects by different database engines

Here the InnoDB wins easily. After we get to a reasonable size table both engines scale linearly with the size of the sub-select. The index speeds up the MyISAM command but interestingly has little effect on the InnoDB speed. subSelect.png

Code:

myisam_times = []
innodb_times = []
myisam_times_2 = []
innodb_times_2 = []
def subSelectRecordsIndexed(testTable,testSubSelect):
    selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT index_col FROM " + testSubSelect + " )"
    cur.execute(selectString)
setupString = "from main import subSelectRecordsIndexed"
def subSelectRecordsNotIndexed(testTable,testSubSelect):
    selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT non_index_col FROM " + testSubSelect + " )"
    cur.execute(selectString)
setupString2 = "from main import subSelectRecordsNotIndexed"
Truncate the old tables, and re-fill with 1000000 records
truncateString = "TRUNCATE test_table_innodb"
truncateString2 = "TRUNCATE test_table_myisam"
cur.execute(truncateString)
cur.execute(truncateString2)
lengthOfTable = 1000000
Fill up the tables with random data
for x in xrange(lengthOfTable):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()
insertString = &quot;INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;
insertString2 = &quot;INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (&quot; + str(rand1) + &quot;,&quot; + str(rand2) + &quot;,&quot; + str(rand3) + &quot;,&quot; + str(rand4) + &quot;)&quot;

cur.execute(insertString)
cur.execute(insertString2)


for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:
truncateString = &quot;TRUNCATE subselect_innodb&quot;
truncateString2 = &quot;TRUNCATE subselect_myisam&quot;

cur.execute(truncateString)
cur.execute(truncateString2)

# For each length, empty the table and re-fill it with random data
rand_sample = sorted(random.sample(xrange(lengthOfTable), theLength))
rand_sample_2 = random.sample(xrange(lengthOfTable), theLength)

for (the_value_1,the_value_2) in zip(rand_sample,rand_sample_2):
    insertString = &quot;INSERT INTO subselect_innodb (index_col,non_index_col) VALUES (&quot; + str(the_value_1) + &quot;,&quot; + str(the_value_2) + &quot;)&quot;
    insertString2 = &quot;INSERT INTO subselect_myisam (index_col,non_index_col) VALUES (&quot; + str(the_value_1) + &quot;,&quot; + str(the_value_2) + &quot;)&quot;

    cur.execute(insertString)
    cur.execute(insertString2)

db.commit()

# Finally, time the queries
innodb_times.append( timeit.timeit('subSelectRecordsIndexed(&quot;test_table_innodb&quot;,&quot;subselect_innodb&quot;)', number=100, setup=setupString) )
myisam_times.append( timeit.timeit('subSelectRecordsIndexed(&quot;test_table_myisam&quot;,&quot;subselect_myisam&quot;)', number=100, setup=setupString) )

innodb_times_2.append( timeit.timeit('subSelectRecordsNotIndexed(&quot;test_table_innodb&quot;,&quot;subselect_innodb&quot;)', number=100, setup=setupString2) )
myisam_times_2.append( timeit.timeit('subSelectRecordsNotIndexed(&quot;test_table_myisam&quot;,&quot;subselect_myisam&quot;)', number=100, setup=setupString2) )

I think the take-home message of all of this is that if you are really concerned about speed, you need to benchmark the queries that you're doing rather than make any assumptions about which engine will be more suitable.

score 4 · Answer 4 · answered May 23 '12 at 19:36

4

Which is faster? Either might be faster. YMMV.

Which should you use? InnoDB -- crash-safe, etc, etc.

answered May 23 '12 at 19:36

Rick James

80,479
5
52
119

Which is faster, InnoDB or MyISAM?

4 Answers4

MyISAM

InnoDB

COMPARISON

CONCLUSION

1) Vanilla selects

Fill up the tables with random data

Define a function to pull a certain number of records from these tables

Test time taken using timeit

2) Counts

Define a function to count the records

Truncate the tables and re-fill with a set amount of data

3) Conditional selects

Define a function to perform conditional selects

Truncate the tables and re-fill with a set amount of data

4) Sub-selects

Truncate the old tables, and re-fill with 1000000 records

Fill up the tables with random data

Linked