Re: [PATCH] block: fix q->max_segment_size checking in blk_recalc_rq_segments about VMERGE

From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
	jens.axboe@oracle.com, linux-kernel@vger.kernel.org,
	linux-scsi@vger.kernel.org, davem@davemloft.net,
	linux-parisc@vger.kernel.org
Subject: Re: [PATCH] block: fix q->max_segment_size checking in blk_recalc_rq_segments about VMERGE
Date: Tue, 15 Jul 2008 10:41:43 -0500
Message-ID: <1216136503.3312.48.camel@localhost.localdomain>
In-Reply-To: <Pine.LNX.4.64.0807151104450.26824@devserv.devel.redhat.com>

On Tue, 2008-07-15 at 11:24 -0400, Mikulas Patocka wrote:
> >>> On sparc64 it is broken anyway with or without your patch.
> >>
> >> Yeah, we need to modify SPARC64 IOMMU code (I'm not sure that it's
> >> worth it). Right now, the best fix is setting BIO_VMERGE_BOUNDARY to 0.
> >>
> >>
> >>> And alpha alone doesn't justify substantial code bloat in the generic
> >>> block layer. So I propose this patch to drop it altogether.
> >>
> >> Jens, what do you think about removing VMERGE code?
> >
> > Actually, it's code I did.
> >
> > There are plusses and minusses to all of this.  The original vmerge code
> > was done for sparc ... mainly because the benefits of virtual merging
> > can offset the cost of having to use the iommu.  However, most
> > architectures didn't use it.  When I fixed it up to work for parisc (and
> > introduced the parameters) we were trying to demonstrate that using it
> > was feasible.
> >
> > The idea behind vmerging is that assembling and programming sg lists is
> > expensive, so you want to do it once.  Either in the iommu or in the
> > driver sg list, but not in both.  There is evidence that it saves around
> > 7% or so on drivers.  However, for architectures that can do it, better
> > savings are made simply by lifting the iommu out of the I/O path (so
> > called bypass mode).
> 
> The problem is with vmerge accounting in block layer (that is what I'm 
> proposing to remove), not with vmerge itself.

I don't think that's true ... otherwise parisc would be falling over
left, right and centre.

> Vmerge accounting has advantages only if you have a device with a small
> number of sg slots --- it allows the block layer to create a request that
> has a higher number of segments than the device.

This isn't really true either.  A lot of devices with a high sg slot
count are still less efficient than an iommu for programming.

Even if they're not, on parisc we have to program the iommu, we can't
bypass, so it still makes sense to only have one large sg list (in the
iommu) and one small one (in the device).  Having two large ones reduces
our I/O throughput because of the extra overhead.
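
To make the accounting point quoted above concrete, here is a minimal
sketch of how counting hardware segments separately from physical ones
changes how many pages fit into one request.  It is a toy model, not
kernel code: the struct and function names are hypothetical, the field
names merely echo the block layer's max_phys_segments and
max_hw_segments, and it assumes every page can be virtually merged with
its predecessor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified limits and counters (not the kernel's structs). */
struct limits {
    unsigned max_phys_segments;  /* entries the driver's sg table can hold */
    unsigned max_hw_segments;    /* sg slots the device itself can take */
};

struct req_counts {
    unsigned nr_phys_segments;
    unsigned nr_hw_segments;
};

/*
 * Try to append one physically discontiguous page to the request.
 * With accounting on, a virtually mergeable page folds into the previous
 * hardware segment; with it off, every page is assumed to need its own
 * device slot.
 */
static bool add_page(struct req_counts *rq, const struct limits *lim,
                     bool virt_mergeable, bool accounting)
{
    bool folds = accounting && virt_mergeable && rq->nr_hw_segments > 0;
    unsigned new_phys = rq->nr_phys_segments + 1;
    unsigned new_hw = rq->nr_hw_segments + (folds ? 0 : 1);

    if (new_phys > lim->max_phys_segments || new_hw > lim->max_hw_segments)
        return false;

    rq->nr_phys_segments = new_phys;
    rq->nr_hw_segments = new_hw;
    return true;
}

/* Count how many of npages mergeable pages fit into a single request. */
static unsigned fill_request(const struct limits *lim, unsigned npages,
                             bool accounting)
{
    struct req_counts rq = { 0, 0 };
    unsigned added = 0;

    assert(lim != NULL);
    while (added < npages && add_page(&rq, lim, true, accounting))
        added++;
    return added;
}
```

With a 16-slot device and a 128-entry driver table, the model admits 128
pages per request when accounting is on and 16 when it is off.  That is
only the "fitting" half of the argument; it says nothing about the cost
of walking the lists.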

> If you have a device with, for example, 1024 slots, the virtual merge
> accounting has no effect, because any request will fit into that size.

It's not about fitting a request, it's about efficient processing.

> Even without virtual merge accounting, the virtual merging will happen, so 
> there will be no performance penalty for the controller --- the controller 
> will be programmed with exactly the same number of segments as if virtual 
> merge accounting was present. (there could be even slight positive 
> performance effect if you remove accounting, because you burn less CPU 
> cycles per request)

Yes, there is a penalty.  Both the iommu and the device have to traverse large SG
lists.  This is where the inefficiency lies.  On PA, we use exactly the
same number of iotlb slots whether virtual merging is in effect or not,
but the device has an internal loop to go over the list.  It's that loop
that virtual merging reduces.
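
As an illustration of the loop being discussed, here is a minimal sketch
that counts how many sg entries a device would have to walk for the same
set of pages, with and without virtual merging.  The struct, the helper
names and the 4096-byte boundary are assumptions made for this example
(the real boundary is per-architecture), and the sketch ignores
max_segment_size entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one buffer: physical start address and length. */
struct seg {
    uint64_t phys;
    uint32_t len;
};

/* Assumed IOMMU page size for this sketch only. */
#define VMERGE_BOUNDARY 4096u

/* Physically contiguous: b starts exactly where a ends. */
static int phys_mergeable(const struct seg *a, const struct seg *b)
{
    return a->phys + a->len == b->phys;
}

/*
 * Virtually mergeable: a ends and b starts on an IOMMU page boundary, so
 * the IOMMU can map them back to back in bus address space.
 */
static int virt_mergeable(const struct seg *a, const struct seg *b)
{
    return (((a->phys + a->len) | b->phys) & (VMERGE_BOUNDARY - 1)) == 0;
}

/* Number of sg entries the device sees for the n buffers in v. */
static size_t count_segments(const struct seg *v, size_t n, int use_vmerge)
{
    size_t count = 0;

    assert(n == 0 || v != NULL);
    for (size_t i = 0; i < n; i++) {
        if (i > 0 &&
            (phys_mergeable(&v[i - 1], &v[i]) ||
             (use_vmerge && virt_mergeable(&v[i - 1], &v[i]))))
            continue;  /* folded into the previous entry */
        count++;
    }
    return count;
}
```

For four page-aligned, 4096-byte buffers scattered across physical
memory, this counts four device entries without virtual merging and one
with it.  The IOMMU maps the same four pages in both cases; only the
length of the list the device iterates over changes.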

Since the virtual merge computation is in line when the request is built
(by design) it doesn't really detract from the throughput and the cost
is pretty small.

> If you have a device with a small number of sg slots (16 or so), vmerge
> accounting can improve performance by creating requests with more than 16
> segments --- the question is: is there any such device? And is the device
> performance-sensitive? (i.e. isn't it such old hardware that no one
> cares about performance anyway?)
> 
> > I suspect with IOMMUs coming back (and being unable to be bypassed) with
> > virtualisation, virtual merging might once more become a significant
> > value.
> 
> I suppose that no one would manufacture a new SCSI card with 16 or 32 sg
> slots these days, so the accounting of hardware segments has no effect on
> modern hardware.

It's not about accounting, it's about performance.  There's a cost in
every device to traversing large count sg lists.  If you have to bear it
in the iommu (which is usually more efficient because the iotlb tends to
follow mmtlb optimisations) you can reduce the cost by eliminating it
from the device.

James
