From: Jens Axboe <jens.axboe@oracle.com>
To: Martin Knoblauch <spamtrap@knobisoft.de>
Cc: linux-kernel@vger.kernel.org,
	Peter zijlstra <a.p.zijlstra@chello.nl>,
	mingo@redhat.com
Subject: Re: Understanding I/O behaviour - next try
Date: Wed, 29 Aug 2007 11:48:01 +0200
Message-ID: <20070829094801.GK23758@kernel.dk>
In-Reply-To: <713252.42570.qm@web32614.mail.mud.yahoo.com>

On Tue, Aug 28 2007, Martin Knoblauch wrote:
> Keywords: I/O, bdi-v9, cfs
> 
> Hi,
> 
>  a while ago I asked a few questions on the Linux I/O behaviour,
> because I was (and still am) fighting some "misbehaviour" related to heavy
> I/O.
> 
>  The basic setup is a dual x86_64 box with 8 GB of memory. The DL380
> has a HW RAID5, made from 4x72GB disks and about 100 MB write cache.
> The performance of the block device with O_DIRECT is about 90 MB/sec.
> 
>  The problematic behaviour comes when we are moving large files through
> the system. The file usage in this case is mostly "use once" or
> streaming. As soon as the amount of file data is larger than 7.5 GB, we
> see occasional unresponsiveness of the system (e.g. no more ssh
> connections into the box) of more than 1 or 2 minutes (!) duration
> (kernels up to 2.6.19). Load goes up, mainly due to pdflush threads and
> some other poor guys being in "D" state.
> 
>  The data flows in basically three modes. All of them are affected:
> 
> local-disk -> NFS
> NFS -> local-disk
> NFS -> NFS
> 
>  NFS is V3/TCP.
> 
>  So, I made a few experiments in the last few days, using three
> different kernels: 2.6.22.5, 2.6.22.5+cfs20.4 and 2.6.22.5+bdi-v9.
> 
>  The first observation (independent of the kernel) is that we *should*
> use O_DIRECT, at least for output to the local disk. Here we see about
> 90 MB/sec write performance. A simple "dd" using 1, 2 and 3 parallel
> threads to the same block device (through an ext2 FS) gives:
> 
> O_DIRECT: 88 MB/s, 2x44, 3x29.5
> non-O_DIRECT: 51 MB/s, 2x19, 3x12.5
> 
> - Observation 1a: IO schedulers are mostly equivalent, with CFQ
> slightly worse than AS and DEADLINE
> - Observation 1b: when using a 2.6.22.5+cfs20.4, the non-O_DIRECT
> performance goes [slightly] down. With three threads it is 3x10 MB/s.
> Ingo?
> - Observation 1c: bdi-v9 does not help in this case, which is not
> surprising.
> 
>  The real question here is why the non-O_DIRECT case is so slow. Is
> this a general thing? Is this related to the CCISS controller? Using
> O_DIRECT is unfortunately not an option for us.
> 
>  When using three different targets (local disk plus two different NFS
> Filesystems) bdi-v9 is a big winner. Without it, all threads are [seem
> to be] limited to the speed of the slowest FS. With bdi-v9 we see a
> considerable speedup.
> 
>  Just by chance I found out that doing all I/O in sync mode does
> prevent the load from going up. Of course, I/O throughput is not
> stellar (but not much worse than the non-O_DIRECT case). But the
> responsiveness seems OK. Maybe a solution, as this can be controlled via
> mount (would be great for O_DIRECT :-).
> 
>  In general 2.6.22 seems to be better than 2.6.19, but this is highly
> subjective :-( I am using the following settings in /proc. They seem to
> provide the smoothest responsiveness:
> 
> vm.dirty_background_ratio = 1
> vm.dirty_ratio = 1
> vm.swappiness = 1
> vm.vfs_cache_pressure = 1
> 
>  Another thing I saw during my tests is that when writing to NFS, the
> "dirty" or "nr_dirty" numbers are always 0. Is this a conceptual thing,
> or a bug?
> 
>  In any case, view this as a report for one specific loadcase that does
> not behave very well. It seems there are ways to make things better
> (sync, per device throttling, ...), but nothing "perfect" yet. Use once
> does seem to be a problem.

Try limiting the queue depth on the cciss device, some of those are
notorious for starving commands. Something like the below hack, see
if it makes a difference (and please verify in dmesg that it prints the
message about limiting depth!):

diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index 084358a..257e1c3 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -2992,7 +2992,12 @@ static int cciss_pci_init(ctlr_info_t *c, struct pci_dev *pdev)
 		if (board_id == products[i].board_id) {
 			c->product_name = products[i].product_name;
 			c->access = *(products[i].access);
+#if 0
 			c->nr_cmds = products[i].nr_cmds;
+#else
+			c->nr_cmds = 2;
+			printk("cciss: limited max commands to 2\n");
+#endif
 			break;
 		}
 	}

-- 
Jens Axboe

