From: Nick Piggin <npiggin@suse.de>
To: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Christoph Lameter <clameter@sgi.com>,
	linux-kernel@vger.kernel.org, arjan@linux.intel.com,
	mingo@elte.hu, ak@suse.de, jens.axboe@oracle.com,
	James.Bottomley@SteelEye.com, andrea@suse.de,
	akpm@linux-foundation.org, andrew.vasquez@qlogic.com
Subject: Re: [rfc] direct IO submission and completion scalability issues
Date: Tue, 31 Jul 2007 06:19:17 +0200
Message-ID: <20070731041917.GA25874@wotan.suse.de>
In-Reply-To: <20070730203519.GD10033@linux-os.sc.intel.com>

On Mon, Jul 30, 2007 at 01:35:19PM -0700, Suresh B wrote:
> On Mon, Jul 30, 2007 at 11:20:04AM -0700, Christoph Lameter wrote:
> > On Fri, 27 Jul 2007, Siddha, Suresh B wrote:
> 
> > > Observation #2: This introduces some migration overhead during IO submission.
> > > With the current prototype, every incoming IO request results in an IPI and
> > > context switch(to kblockd thread) on the interrupt processing cpu.
> > > This issue needs to be addressed and main challenge to address is
> > > the efficient mechanism of doing this IO migration(how much batching to do and
> > > when to send the migrate request?), so that we don't delay the IO much and at
> > > the same point, don't cause much overhead during migration.
> > 
> > Right.
> 
> So any suggestions for making this clean and acceptable to everyone?

It is obviously a good idea to hand over the IO at the point which
requires the least number of cachelines to be moved, and I think doing
it in the block layer is right. Mostly you have to convince the block
and driver maintainers I guess.

The scheduler really should be made interrupt-load aware anyway, so I
don't have a problem with changing that; or scheduling kblockd at a
higher priority, but I don't know if SCHED_FIFO is a good idea. Couldn't
it be done in a softirq instead?

Latency for IO migration could be the most difficult problem to solve
really. You don't give many details of the workload, profiles, etc... I
hope this is for a real world test? Can the locking be improved in simpler
ways first?

Just some random questions...

It looks like the main source of cacheline bouncing you're eliminating
is from the initial starting of IO from an empty queue (ie. unplug).
From then on, the submission is driven by completion, right?

Why is the queue allowed to go empty in the first place in an IO critical
workload?

Are you loading up each CPU with as many disks as it can possibly handle
plus a few more? If so, is that realistic? (I honestly don't know).

You say that you'd like to do this for direct IO only, but if it is more
efficient, why not for buffered IO as well? (or is it not more efficient
for buffered IO? if not, why?)

AFAIKS, you'd still have significant queue_lock contention from other
CPUs inserting requests into the list? What IO scheduler are you using?
I assume noop... as a crazy experiment, what happens if you create per-cpu
request queues?

