linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>,
	linux-scsi <linux-scsi@vger.kernel.org>,
	linux-ide <linux-ide@vger.kernel.org>,
	Jens Axboe <Jens.Axboe@oracle.com>,
	FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: Re: [PATCH RESEND number 2] libata: eliminate the home grown dma padding in favour of that provided by the block layer
Date: Mon, 04 Feb 2008 23:03:11 -0600	[thread overview]
Message-ID: <1202187791.3096.192.camel@localhost.localdomain> (raw)
In-Reply-To: <47A7B6B6.2050801@gmail.com>

On Tue, 2008-02-05 at 10:07 +0900, Tejun Heo wrote:
> James Bottomley wrote:
> >>>> No, it doesn't.  Drain needs to extend the sg table too 
> >>> The patches do extend the sg table.
> >> Hmm... Where?
> > 
> > It's the block component; it's already in git head with this patch:
> > 
> > commit fa0ccd837e3dddb44c7db2f128a8bb7e4eabc21a
> > Author: James Bottomley <James.Bottomley@HansenPartnership.com>
> > Date:   Thu Jan 10 11:30:36 2008 -0600
> > 
> >     block: implement drain buffers
> > 
> > The description is pretty reasonable.  The way it works is by making
> > block add the drain buffer to the sg table as the last element.
> > 
> > The way it works is if you simply use the length block gives you, you
> > find yourself doing DMA to every element in the sg table except the last
> > one.  If you expand the length to anything up to blk_rq_drain_size() you
> > then use the last element for the drain.
> 
> Oops, I was talking about padding.  Sorry about that.

Oh, OK ... the I thought the changelog was pretty explicit.  However, it
works because the dma_aligment parameter of the block layer ensures that
all elements of the sg list begin and end on an address that conforms to
the dma_alignment.  It does this basically by forcing a copy of user
incoming commands if they don't conform.  It does nothing with kernel
based commands, but you can always assume for direct DMA that they begin
and end on the cache line boundary ... and if they don't, there's always
a spare allocation to expand them.

> >> Adjusting qc->nbytes isn't enough.  libata will have to trim at the end
> >> of sglist.  With both draining and padding, it means libata will have to
> >> trim one sg entry and a few more bytes.  Block layer is behaving
> >> differently and in a very noticeable way, it's sending inconsistent
> >> values in data_len and sg list.
> > 
> > No, that's the point.  With all the space available, you don't need to
> > trim at all ... you simply don't use it.  The rationale is the way the
> > sg element stuff works.  If you program a DMA table (PRD table in IDE
> > speak) into a controller, it doesn't care if you don't use the last few
> > pieces.  With the default length set, the DMA controller won't use the
> > last programmed element, but it's their if you expand the length or tell
> > it to for an overrun.
> 
> Not all controllers have a place to set 'default length'.  They just run
> till sg list terminates.

True ... but the traversal of the sg list terminates when the data
transfer finishes ... this is the basis of your drain patch (and mine).
Basically we don't require that anything transfer into the drain, but
it's there just in case the transfer overruns and it needs to be used.

> >> Okay, I see your point.  The biggest problem is that by doing that blk
> >> layer breaks a pretty important rule - keeping data len consistent with
> >> sg list and the bugs it's gonna cause will be very subtle and dangerous.
> >>  This is definitely where we can spend four extra bytes.
> > 
> > The lengths don't have to be consistent (otherwise we wouldn't need to
> > calculate them separately) the request reported length just has to be
> > less than the sg table actual length.  We even make use of this property
> > in SCSI when we can't fit a request into a single command ... we are
> > allowed to chunk up the request as we complete it.  I'm not saying it's
> > best practice, but it is acceptable behaviour.
> 
> libata definitely isn't ready for that and I don't think expanding the
> behavior to whole block layer is a wise thing to do when the best
> practice can be bought at the cost of 4 bytes per request.  No?

but the whole basis of both patches is inconsistent lengths: we're
expecting a transfer of X, but we program the element lists for Y (>X)
and see if the transfer actually needs >X (but hopefully <Y).

James



  reply	other threads:[~2008-02-05  5:03 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-31 21:56 [PATCH] libata: eliminate the home grown dma padding in favour of that provided by the block layer James Bottomley
2007-12-31 22:56 ` Jeff Garzik
2008-01-03  7:58 ` FUJITA Tomonori
2008-01-03 15:12   ` James Bottomley
2008-01-09  2:10 ` Tejun Heo
2008-01-09  4:24   ` James Bottomley
2008-01-09  5:13     ` Tejun Heo
2008-01-09 15:13       ` James Bottomley
2008-01-18 23:14 ` [PATCH RESEND] " James Bottomley
2008-02-01 19:40   ` [PATCH RESEND number 2] " James Bottomley
2008-02-01 20:02     ` Jeff Garzik
2008-02-01 21:09       ` James Bottomley
2008-02-03  3:04         ` Tejun Heo
2008-02-03  4:32           ` James Bottomley
2008-02-03  7:37             ` Tejun Heo
2008-02-03 14:38               ` James Bottomley
2008-02-03 15:14                 ` Tejun Heo
2008-02-03 16:12                   ` James Bottomley
2008-02-03 16:38                     ` Jeff Garzik
2008-02-03 17:12                       ` James Bottomley
2008-02-04  1:21                         ` Tejun Heo
2008-02-04  1:28                     ` Tejun Heo
2008-02-04  9:25                       ` Tejun Heo
2008-02-04 14:43                         ` Tejun Heo
2008-02-04 16:23                           ` James Bottomley
2008-02-05  0:06                             ` Tejun Heo
2008-02-05  0:32                               ` James Bottomley
2008-02-05  0:43                                 ` Tejun Heo
2008-02-05  0:53                                   ` James Bottomley
2008-02-05  1:07                                     ` Tejun Heo
2008-02-05  5:03                                       ` James Bottomley [this message]
2008-02-05  5:22                                         ` Tejun Heo
2008-02-04 15:43                         ` James Bottomley

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1202187791.3096.192.camel@localhost.localdomain \
    --to=james.bottomley@hansenpartnership.com \
    --cc=Jens.Axboe@oracle.com \
    --cc=fujita.tomonori@lab.ntt.co.jp \
    --cc=htejun@gmail.com \
    --cc=jeff@garzik.org \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).