From: Adam Goryachev <mailinglists@websitemanagers.com.au>
To: stan@hardwarefreak.com
Cc: Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: RAID performance -  5x SSD RAID5 - effects of stripe cache sizing
Date: Fri, 08 Mar 2013 11:17:35 +1100
Message-ID: <51392E1F.3080000@websitemanagers.com.au>
In-Reply-To: <51384366.2030307@hardwarefreak.com>

On 07/03/13 18:36, Stan Hoeppner wrote:
> On 3/5/2013 9:53 AM, Adam Goryachev wrote:
>> On 05/03/13 20:30, Stan Hoeppner wrote:
>> Thanks to the tip about running fio on windows, I think I've now come
>> full circle.... Today I had numerous complaints from users that their
>> outlook froze/etc, and some cases were the TS couldn't copy a file from
>> the DC to its local C: (iSCSI). The cause was the DC was logging events
>> with event ID 2020 which is "The server was unable to allocate from the
>> system paged pool because the pool was empty". Supposedly the solution
>> to this is tuning two random numbers in the registry, not much is said
>> what the consequences of this are, nor about how to calculate the
>> correct value. 
> ...
>> Running the same fio test on the same TS (win2003) against a SMB share
>> from the DC (SMB -> Win2000 -> Xen -> iSCSI -> etc)
>>> READ: io=16384MB, aggrb=14818KB/s, minb=14818KB/s, maxb=14818KB/s, mint=1132181msec, maxt=0msec
>>> WRITE: io=16384MB, aggrb=8039KB/s, minb=8039KB/s, maxb=8039KB/s, mint=2086815msec, maxt=0msec
> 
> Run FIO on the DC itself and see what your NTFS throughput is to this
> 300GB filesystem.  Use a small file, say 2GB, since the FS is nearly
> full.  Post results.

Can't, I don't see a version of fio that runs on win2000...

> Fire up the Windows CLI FTP client in a TS session DOS box and do a GET
> and PUT into this filesystem share on the DC.  This will tell us if the
> TS to DC problem is TCP in general or limited to SMB.  Post transfer
> rate results for GET and PUT.

I had somewhat forgotten about FTP, and it does provide nice simple
performance results/numbers too. I'll give this a try, talking to
another linux box on the network, it should achieve 100MB/s (gigabit
speeds) or close to it. I'll also run the same FTP test from one of the
2003 boxes for comparison.
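
For reference, the fio figures quoted above can be cross-checked and compared
against the gigabit ceiling with a little arithmetic. This is only a rough
sketch: the function name is made up for illustration, and 125MB/s is the raw
line rate of gigabit Ethernet before any protocol overhead, so the ~100MB/s
estimate above is the more realistic target.

```python
def fio_rate_kb_s(io_mb, runtime_msec):
    """Aggregate bandwidth in KB/s from fio's io= (MB) and mint= (msec) fields."""
    return io_mb * 1024 / (runtime_msec / 1000.0)

# Raw gigabit line rate: 1000 Mbit/s / 8 = 125 MB/s (no protocol overhead).
GIGABIT_MB_S = 1000 / 8

# Values taken from the READ and WRITE summary lines quoted earlier.
read_kb_s = fio_rate_kb_s(16384, 1132181)
write_kb_s = fio_rate_kb_s(16384, 2086815)

for name, rate in (("read", read_kb_s), ("write", write_kb_s)):
    mb_s = rate / 1024
    share = 100 * mb_s / GIGABIT_MB_S
    # int() truncates, which lines up with the aggrb= values fio printed.
    print(f"{name}: {int(rate)} KB/s = {mb_s:.1f} MB/s ({share:.0f}% of raw gigabit)")
```

Both results line up with the aggrb values fio reported, so the summary lines
are at least internally consistent; whether they reflect real SMB performance
is a separate question.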

>> This is pretty shockingly slow, and seems to clearly indicate why the
>> users are so upset... 14MB/s read and 8MB/s write, it's a wonder they
>> haven't formed a mob and lynched me yet!
> 
> I've never used FIO on Windows against a Windows SMB share.  And
> according to Google nobody else does.  So before we assume these numbers
> paint an accurate picture of your DC SMB performance, and that the CPU
> burn isn't due to an anomaly of FIO, you should run some simple Windows
> file copy tests in Explorer and use Netmeter to measure the speed.  If
> they're in the same ballpark then you know you can somewhat trust FIO
> for SMB testing.  If they're wildly apart, probably not.

Well, during the day, under normal user load, the CPU frequently rises
to around 70 to 80%. While this is not as clear-cut as 100%, it makes me
worry that it is limiting performance.

>> However, the truly useful information is that during the read portion of
>> the test, the DC has a CPU load of 100% (no variation, just pegged at
>> 100%), during the write portion, it fluctuates between 80% to 100%.
> 
> That 100% CPU is bothersome.  Turn off NTFS compression on any/all NTFS
> volumes residing on SAN LUNs on the SSD array.  You're burning 100% DC
> CPU at ~12MB/s data rate on the DC, so I can only assume it's turned on
> for this 300GB volume.  These SSDs do on the fly compression, and very
> quickly as you've seen.  Doing NTFS compression on top simply wastes cycles.

NTFS compression is already disabled on all volumes.... I've *never*
enabled it on any system I've ever been responsible for, and never seen
anyone else do that. However, due to the age of this system, it is
possible that it has been enabled and then disabled again at some point.

> This should drop the CPU burn for new writes to the filesystem.  It
> probably won't for reads, since NTFS must still decompress the existing
> 250GB+ of files.  If CPU drops considerably for writes but reads still
> eat 100%, the only fix for this is to backup the filesystem, reformat
> the device, then restore.  

Since it was already disabled, I would expect that the majority of files
currently in use (especially the problematic outlook pst files) have
already been modified/decompressed anyway.

> On the off chance that NTFS compression is
> more efficient than the SandForce controller, you probably want to
> increase the size of the volume before formatting.  And in fact,
> Sysadmin 101 tells us never to run a production filesystem at more than
> ~70% capacity, so it would be smart to bump it up to 400GB to cover your
> bases.

I only bumped it up a small amount just in case I got burned by windows
2000 having an upper limit on disk size supported. I couldn't find a
clear answer on the maximum size supported.... I'll probably increase it
to at least 400 or even 500 as soon as I complete the upgrade to win2003.
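
As a rough check on those sizes, here is the fill level worked out from the
figures in this thread (279GB at about 90% full). A sketch only: the helper
names are made up for illustration, and since the 90% figure is rounded the
results are approximate.

```python
def used_gb(size_gb, pct_full):
    """Approximate space in use, from volume size and percent full."""
    return size_gb * pct_full / 100.0

def pct_full(used, size_gb):
    """Percent full for a given amount of used space and volume size."""
    return 100.0 * used / size_gb

def min_size_for_target(used, target_pct):
    """Smallest volume size (GB) that keeps usage at or below target_pct."""
    return used * 100.0 / target_pct

used = used_gb(279, 90)  # roughly 251GB in use before the resize

for size in (300, 400, 500):
    print(f"{size}GB volume: {pct_full(used, size):.0f}% full")

print(f"size needed to stay at 70%: {min_size_for_target(used, 70):.0f}GB")
```

On these numbers, anything from about 360GB up would satisfy the ~70%
guideline, so both 400 and 500 leave some headroom for growth.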

> My second recommendation is to turn off the indexing service for all
> these NTFS volumes, as this will conserve CPU cycles as well.

That is a good thought... I recently did do a complete file search on
the volume, and it seemed to need to traverse the directory tree anyway
(I was looking for files *.bak to delete old pst files).

>> Extended the data drive from 279GB to 300GB (it was 90% full, now 84% full)
> 
> Growing a filesystem in small chunks like this is a recipe for disaster.
>  Your free space map is always heavily fragmented and is very large.
> The more entries the filesystem driver must walk the more CPU you burn.
>  Recall we just discussed the table walking overhead of the md/RAID
> stripe cache?  Filesystem maps/tables/B+ trees are much, much larger
> structures.  When they don't fit in cache we read memory, and when they
> don't fit in memory (remember your "pool" problem) we must read from disk.

Yes, I did try and run a defrag (win2000 version) on the volume. I was
fairly curious about whether this would have any advantage given the SSD
backed filesystem where random IO shouldn't matter. Though I did think
it might still offer some improvement if both free space and files were
contiguous. However, after running on two occasions for about 20 hours,
it added around 2 or 3 very narrow defragged sections with no real
progress. I think the defrag is either running very slowly as well,
and/or requires more free space to run more efficiently.

> If you've been expanding this NTFS this way for a while it would also
> explain some of your CPU burn at the DC.  FYI, XFS is a MUCH higher
> performance and much more efficient filesystem than NTFS ever dreamed of
> becoming, but even XFS suffers slow IO and CPU burn due to heavily
> fragmented free space.

Nope, it has had the same 300G HDD (279G) for at least 6 years (as far
as I know). I'm pretty sure it has only been extended by physical
replacement of the HDD from time to time.

The current plan is to upgrade to win2003, I'm hoping this will improve
performance equivalent to what is being achieved on the other 2003
servers, which should make the users happy again. I may increase the
disk space and have another crack at defrag prior to the upgrade since
the upgrade won't happen until next weekend at the earliest.

Regards,
Adam

-- 
Adam Goryachev
Website Managers
www.websitemanagers.com.au
