32

I have two identical servers (in terms of hardware). Both are standard installations of Windows Server 2008 R2 with minimal software installed (basically my code and required components such as the JVM).

On one server I am running SQL Server 2005; on the second, PostgreSQL 9.1. The difference in performance between these two servers is staggering. It's so bad on PostgreSQL that I'm regretting my initial "let's use PostgreSQL instead of paying for the SQL Server license" speech to my boss. We're talking differences of 30 seconds vs. 15 minutes for the same command, and it's not just this one command: it's any query or command I throw at it. They both have pretty much the same data (records were inserted in a different order), and both databases have the exact same structure, indexes, etc.

But I'm hoping it's just a matter of performance tuning. The thing is, SQL Server is using pretty much all 32 GB of RAM on its server, whereas PostgreSQL is using almost nothing, definitely less than a gig, though I haven't measured it in fine detail.

How do I get PostgreSQL to use 20+ GB of RAM? These servers were built specifically for this database work, so any RAM not in use by the database and supporting processes is wasted, in my opinion.

user85116

4 Answers

48

There are many tweakable settings, initialised via postgresql.conf. The most important ones are:

  • max_connections: the number of concurrent sessions
  • work_mem: the maximum amount of memory used for intermediate results such as hash tables, and for sorting
  • shared_buffers: the amount of memory dedicated to 'pinned' buffer space
  • effective_cache_size: the amount of memory assumed to be used by the OS's LRU buffers
  • random_page_cost: an estimate of the relative cost of disk seeks

max_connections should not be set higher than needed: connections cost resources even when idle, and in most cases a connection is better off waiting outside the database than inside it (at the price of concurrency). A nice rule-of-thumb formula is "number of spindles + number of processors + X".

work_mem is tricky: it can be applied to every sort or hash operation in a plan, so a query with 5 hash joins might cost 5 * work_mem. And for worst-case scenarios you should also think of multiple sessions consuming this amount (another reason to keep max_connections low).
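To see how quickly this multiplies, here is a rough back-of-the-envelope sketch (all three numbers are hypothetical, chosen only to illustrate the worst case):

```python
# Rough worst-case memory estimate for work_mem (illustrative sketch).
# Each sort/hash node in a plan may use up to work_mem, and every
# concurrent session could run such a plan at the same time.

work_mem_mb = 64          # hypothetical per-operation setting
nodes_per_query = 5       # e.g. a query with 5 hash joins
max_connections = 100     # hypothetical concurrent sessions

worst_case_mb = work_mem_mb * nodes_per_query * max_connections
print(worst_case_mb)      # prints 32000
```

That worst case is 32000 MB, roughly the entire 32 GB in the box, which is why work_mem and max_connections have to be budgeted together.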

shared_buffers is (IMHO) overrated. It is normally advised to set it to between 1/4 and 1/2 of all available "free" memory, but I tend to keep it low and set effective_cache_size to all available "free" memory.

random_page_cost is the cost of a seek + read on disk. It is relative to seq_page_cost, which is 1. The default of 4 for random_page_cost is too high for modern machines and network storage; normally it can be lowered to between 2 and 1.x. For SSDs you could even set it to 1.0, since seeking is almost free on SSDs.
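Putting those settings together, a starting point for a dedicated 32 GB machine might look like the fragment below in postgresql.conf. The values are illustrative assumptions to experiment from, not a prescription; measure and adjust:

```
# Illustrative starting values for a dedicated 32 GB box (assumptions, not a prescription)
max_connections = 100          # keep as low as the application allows
shared_buffers = 2GB           # kept modest here; others advise 1/4 of RAM
work_mem = 32MB                # per sort/hash operation, so budget against max_connections
effective_cache_size = 24GB    # memory the OS is expected to use for caching
random_page_cost = 2.0         # lower than the default 4 for fast storage
```

After editing the file, the server must be restarted (or reloaded, for the settings that allow it) for the changes to take effect.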

wildplasser
20

Consider using PGTune to help you tune the PostgreSQL configuration.

PostgreSQL's default configuration is very conservative and that tool is meant to help with this exact situation. The documentation is a light read and using the tool is pretty straightforward.

Keep in mind that there's no need to use PGTune's exact suggestions. Playing with its settings and watching the resulting changes to the conf file will give you a better understanding of PostgreSQL's configuration and how to tweak it manually.

More info on PGTune and an alternative tool called ClusterControl: PGTune Alternatives - ClusterControl PostgreSQL Configuration

Paul Bellora
3

If every query or command is running slowly I suspect that:

  • you connect to database for every query you run;
  • you have configured some kind of authentication method which does not work, and it halts your queries until that particular authentication method times out.

Could you please tell us how long it takes to run a query like select version()? It should be instant (0.16 ms on my workstation).
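One way to check this from psql (this is a sketch of an interactive session; \timing reports per-statement time, so connection and authentication overhead shows up separately, as a delay when psql first connects):

```
-- inside an interactive psql session:
\timing on
select version();
```

If select version() itself is slow even with \timing on, the problem is per-statement; if only connecting to the database is slow, suspect the authentication setup (for example a DNS or ident lookup that has to time out).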

Tometzky
2

If EVERY query is that much slower, something is terribly wrong with the server or its configuration. In my experience each DB has a few things it's better at than the other, but performance-wise PostgreSQL is easily in the same realm as SQL Server.

So: what OS are you running PostgreSQL on? What hardware? What settings have you changed already? How big is your dataset? What's an example of a poor query, and what is the output of explain analyze? Run your query like this:

explain analyze select ...rest of query here... ;

Post the output to http://explain.depesz.com/ and post the link here.

Scott Marlowe