3

Our users experience slow session performance at times during normal work hours. Applications (IE, Office apps, etc.) are slow to respond, and so is switching between them. The problem happens sporadically; below is the troubleshooting done so far.

We started gathering performance counters throughout the day and asked users to report when the slowdowns occur. See below for the graphs that show disk performance. The arrows point to the times when users reported slowdowns, and they suggest that the problem is disk related.

Disk use graphs

Can anyone suggest further troubleshooting in order to track the culprit process/application?

Some server specs: [OS: Server 2003 32-bit Enterprise with /PAE flag] [RAM: 32GB] [CPU: 2x quad-core @ 2.27GHz] [HD: RAID5 1.2GB 3xSAS 10,000RPM HD. Controller has no battery and write cache is disabled]

Using Process Explorer, I can look at the processes and see which ones do the most disk reads/writes.

Processes with highest DISK WRITES: System, ccSvcHst.exe (Symantec process), firefox.exe

Processes with highest DISK READS: winlogon.exe, firefox.exe, explorer.exe

Processes with highest DISK WRITE BYTES: System, firefox.exe, ccSvcHst.exe

Processes with highest DISK READ BYTES: System, winlogon.exe, firefox.exe
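The same kind of ranking can be limited to the moments users actually complained about, by logging the per-process I/O counters and averaging them only over a reported slowdown window. Below is a rough sketch of that idea, not a finished tool. It assumes the counters were logged to CSV (for example with typeperf or relog) using the `\Process(*)\IO Write Bytes/sec` counter; the server name `TS01`, the timestamps, and all the numbers in `SAMPLE_LOG` are made up for illustration.

```python
import csv
import io
import re
from collections import defaultdict
from datetime import datetime

# Hypothetical sample in the CSV layout of a perfmon counter log:
# first column is the timestamp, the other columns are counter paths.
SAMPLE_LOG = r'''"(PDH-CSV 4.0)","\\TS01\Process(firefox)\IO Write Bytes/sec","\\TS01\Process(ccSvcHst)\IO Write Bytes/sec","\\TS01\Process(System)\IO Write Bytes/sec"
"03/01/2012 10:00:00.000","1000","500","2000"
"03/01/2012 10:00:15.000","9000","30000","20000"
"03/01/2012 10:00:30.000","11000","50000","20000"
"03/01/2012 10:00:45.000","1200","400","2100"
'''

# Splits "\\HOST\Process(name)\Counter" into the process name and counter name.
COUNTER_RE = re.compile(r"\\Process\((.+)\)\\(.+)$")


def rank_processes(log_text, counter, start, end):
    """Average `counter` per process over rows stamped within [start, end]."""
    reader = csv.reader(io.StringIO(log_text))
    header = next(reader)

    # Map column index -> process name for the counter we care about.
    columns = {}
    for idx, path in enumerate(header[1:], start=1):
        match = COUNTER_RE.search(path)
        if match and match.group(2) == counter:
            columns[idx] = match.group(1)

    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in reader:
        if not row:
            continue
        stamp = datetime.strptime(row[0], "%m/%d/%Y %H:%M:%S.%f")
        if not (start <= stamp <= end):
            continue
        for idx, name in columns.items():
            try:
                value = float(row[idx])
            except (ValueError, IndexError):
                continue  # blank or missing sample for this process
            totals[name] += value
            counts[name] += 1

    averages = {name: totals[name] / counts[name] for name in totals}
    # Highest average first.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Rank writers during one (hypothetical) user-reported slowdown.
window_start = datetime(2012, 3, 1, 10, 0, 15)
window_end = datetime(2012, 3, 1, 10, 0, 30)
for name, avg in rank_processes(SAMPLE_LOG, "IO Write Bytes/sec",
                                window_start, window_end):
    print(f"{name}: {avg:.0f} bytes/sec")
```

Comparing the ranking inside a slowdown window against a quiet window should show whether one process spikes or whether everything just gets slower together.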

MikeM
  • 41

2 Answers

4

Write caching disabled and RAID5? That is a particularly poor-performing combination. Windows leans heavily on the user profiles, so the AppData and registry activity alone would surface this issue on such a slow storage subsystem. There could be other aggravating factors, such as the default registry lazy flush interval being too short.

The registry lazy flush interval may be increased by adjusting the following DWORD registry value:

Key: HKLM\System\CurrentControlSet\Control\Session Manager\Configuration Manager  
Value: RegistryLazyFlushInterval 

Use 60 (decimal) to specify 60 seconds. I believe the default value is 5 seconds.
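If you need to push the change to more than one server, a .reg file is handy. Here is a minimal sketch that builds one for the value above. The function name `build_reg_file` is just something made up for this example; the key and value name are the ones listed above, with `HKLM` spelled out as `HKEY_LOCAL_MACHINE` because .reg files need the full hive name.

```python
# Key and value from the text above; .reg files use the full hive name.
KEY = (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
       r"\Session Manager\Configuration Manager")
VALUE_NAME = "RegistryLazyFlushInterval"


def build_reg_file(seconds):
    """Return .reg file text that sets the lazy flush interval (a DWORD)."""
    if not 0 < seconds <= 0xFFFFFFFF:
        raise ValueError("interval must fit in a positive 32-bit DWORD")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{KEY}]\r\n"
        # .reg files store DWORDs as 8 hex digits, so 60 becomes 0000003c.
        f'"{VALUE_NAME}"=dword:{seconds:08x}\r\n'
    )


print(build_reg_file(60))
```

Save the output to a file (regedit normally writes these as UTF-16), review it, and test the import on a non-production box first.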

The registry in particular is predisposed to locking issues. One issue we encountered on Windows Server 2003 showed up after an Internet Explorer security hotfix and was related to the Browser Helper Object for Java. You can read more about that here:

https://serverfault.com/a/110242/20701

20 users seems a bit low to be causing performance issues, but it's difficult to know because that really depends on the applications in use and the user type/behavior. While you may be able to address some of the issues by increasing the lazy flush interval or ruling out the Java BHO, I would start by addressing the problematic disk subsystem.

Greg Askew
  • 39,132
1

I'm going to suggest that your culprit probably is not an application or process, but that you're simply trying to push too much read/write through your card or disks (in that configuration). RAID5 is a parity RAID, which means that any single small write also needs a parity update: the controller has to read the old data and the old parity, then write the new data and the new parity, so one logical write costs roughly four disk I/Os. That's why random write performance on RAID 5 arrays tends to be pretty bad, and with the write cache disabled the controller has nothing to absorb those writes with.

See our canonical RAID levels Q&A here, but in general you only want to use parity RAID when the majority of the disk load is reads, like on a read-only or rarely written file share, for example. (And given the problems almost all of us have run across recovering a broken parity RAID array, you'll find many SAs try to avoid parity RAID in general, whenever possible.)

The fact that your OS is on the same RAID5 volume as everything else, and that you have multiple clients accessing data simultaneously, is just a recipe for this kind of problem in my experience. My solution (assuming 6 drives in your server) would [probably] be to break the array into two: a 2 drive mirror for the OS, and a 4 drive RAID 10 for the rest. Honestly though, as long as you get out of the RAID5 situation and switch to a RAID level better suited to your needs (like RAID 10), you'll be in much better shape.
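To put rough numbers on it, here's a back-of-the-envelope sketch using the usual write-penalty rule of thumb (about 4 back-end I/Os per small write on RAID 5, 2 on RAID 10). The 140 IOPS per 10k disk and the 50/50 read/write mix are assumptions picked for illustration, not measurements from your server, so plug in your own figures.

```python
# Rule-of-thumb back-end I/Os per small random write (assumed, not measured).
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def effective_iops(level, disks, iops_per_disk, write_fraction):
    """Estimate front-end random IOPS for an array under a read/write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw = disks * iops_per_disk
    penalty = WRITE_PENALTY[level]
    # Each read costs 1 back-end I/O, each write costs `penalty` of them.
    cost_per_io = (1.0 - write_fraction) + write_fraction * penalty
    return raw / cost_per_io


# Assumed: ~140 random IOPS per 10k SAS disk, half of the load is writes.
for level, disks in (("raid5", 3), ("raid10", 4)):
    estimate = effective_iops(level, disks, 140, 0.5)
    print(f"{level} x{disks}: ~{estimate:.0f} IOPS")
```

This ignores caching, stripe size, and sequential I/O entirely, so treat it as a way to compare layouts rather than a prediction.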

HopelessN00b
  • 54,273