
I am considering scaling up instead of scaling out. I would like to know whether it is feasible to use the MySQL MEMORY storage engine for a database of 500+ GB, assuming the server has that much RAM. I should mention that the queries against the table are mostly ad-hoc.

One issue I believe I have found: queries against the MEMORY table could cause temporary tables to be created. In-memory temporary tables have a hard limit of 4 GB of RAM before they are converted to on-disk MyISAM tables, which would of course kill performance completely. That limit could be worked around by pointing tmpdir at a ramdisk, so that even converted temporary tables stay in memory.
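A workaround along those lines might look like the following my.cnf fragment (the paths and sizes here are assumptions for illustration, not tuned recommendations):

```ini
# my.cnf sketch: keep implicit temporary tables in memory as long as possible,
# and let any on-disk conversions land on a ramdisk instead of a real disk.
[mysqld]
tmp_table_size      = 4G        # in-memory temp-table ceiling (assumed value)
max_heap_table_size = 4G        # the smaller of these two values applies
tmpdir              = /dev/shm  # assumes /dev/shm is a tmpfs ramdisk
```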

What other kinds of problems do you foresee?

RolandoMySQLDBA
David

4 Answers

The MEMORY storage engine can be either a blessing or a curse depending on

  • what you intend to store
  • how often you plan to perform DML
  • how much RAM you are leaving for the
    • database caches (MyISAM Key Cache, InnoDB Buffer Pool)
    • OS caches
    • OS operation

The MEMORY storage engine

  • uses full table locking for INSERTs, UPDATEs, and DELETEs
  • cannot perform concurrent INSERTs
  • uses HASH indexes instead of BTREE indexes by default
  • can use BTREE indexes, but they must be explicitly specified at CREATE TABLE time
  • has no transaction support
  • is great for single-row queries, especially via HASH indexes, but ranged queries and sequential access are horrific unless you specify BTREE explicitly (which requires more memory)
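To make the index point concrete, a MEMORY table that needs both point lookups and range scans might be declared like this (the table and column names are purely illustrative):

```sql
-- Illustrative sketch: MEMORY table mixing the default-style HASH index
-- for point lookups with an explicit BTREE index for ranged queries.
CREATE TABLE session_cache (
    id         INT NOT NULL,
    created_at DATETIME NOT NULL,
    payload    VARCHAR(255),
    PRIMARY KEY (id) USING HASH,              -- fast single-row lookups
    KEY idx_created (created_at) USING BTREE  -- needed for efficient ranges
) ENGINE=MEMORY;
```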

Even though the data lives in RAM, mysqld will still hit the table's .frm file on disk as a reference point to check that the table exists, so there is always a little disk I/O. Under proportionally heavy access, a MEMORY table can therefore still generate noticeable disk I/O.

RolandoMySQLDBA

I'm posting a second answer instead of revising my first because I think this one has merit on its own.

Why not create a VM with a ~600 GB ramdisk as its hard drive, and load the 512 GB datastore into the ramdisk? Then you can treat it like a regular HDD, and it will never hit the physical medium.
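Inside the guest, one way to sketch this (purely hypothetical; the mount point and size are assumptions) is a tmpfs entry in /etc/fstab so the MySQL datadir lives entirely in RAM:

```
# /etc/fstab sketch: back the MySQL datadir with a 600 GB tmpfs ramdisk.
# Assumes the machine has well over 600 GB of RAM to spare; data on tmpfs
# is lost on reboot, so this only makes sense with backups elsewhere.
tmpfs  /var/lib/mysql  tmpfs  size=600g,mode=0750  0  0
```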

If I had a machine with 2TB RAM I would help you figure out the proper semantics on this, but I don't. I do, however, have friends in academia who are working on these sorts of problems daily. I just don't have that sort of hardware at my disposal ;)

But this would solve all of your speed problems.

jcolebrand

I would think that contemporary process size limitations would be the primary limiting factor in this situation. The question then becomes not "can the app do it" but rather "can the OS do it" as well as "does the OS support it".

Since you didn't specify an OS, I'm going to assume that you're using a 64-bit variant of a "normal" OS (you are, after all, discussing servers in the 1 TB RAM category).

According to this MSDN article for Windows (http://msdn.microsoft.com/en-us/library/aa366778%28VS.85%29.aspx), there is the potential for a 500 GB in-memory store, but from my basic and rather limited reading it seems unlikely that the folks at MySQL have optimized for that particular use case. Databases are generally disk-oriented, after all. (See also https://stackoverflow.com/questions/2399162/process-memory-limit-of-64-bit-process.)

My google-fu this morning is a little weak, but everyone I can find on the subject seems to weakly reference 2^48, without strong proof, as the upper limit on per-process address space in 64-bit Linux. That is 256 TiB, far more than you need here. Maybe someone can come up with a better maximum per-process limit for 64-bit Linux kernels.
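As a quick sanity check on that figure, the arithmetic can be done in any MySQL client:

```sql
-- 2^48 bytes expressed in TiB: 2^48 / 2^40 = 2^8
SELECT POW(2, 48) / POW(2, 40) AS tib;  -- returns 256
```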

jcolebrand

The problem I foresee is the server crashing and you losing 500 GB of data. I only use in-memory engines for temporary things that aren't really production-critical, or for queues that can ultimately be rebuilt in the event of a crash.

It would be more prudent to go with InnoDB: set your InnoDB buffer pool size accordingly and leave the rest of the RAM for the OS disk cache. It doesn't sound like money is a problem for you, so invest in an SSD RAID solution for your disks.
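A minimal sketch of that kind of setup (the sizes below are placeholders for a box with roughly 512 GB of RAM, not tuned recommendations):

```ini
# my.cnf sketch: give InnoDB the bulk of RAM, leave the rest to the OS cache
[mysqld]
default-storage-engine  = InnoDB
innodb_buffer_pool_size = 400G   # assumed: ~75-80% of a 512 GB server
innodb_log_file_size    = 2G     # larger redo logs help write-heavy loads
```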

atxdba