public inbox for linux-kernel@vger.kernel.org
From: Tejun Heo <htejun@gmail.com>
To: Pierre Ossman <drzeus-list@drzeus.cx>
Cc: Jens Axboe <axboe@suse.de>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: IOMMU and scatterlist limits
Date: Tue, 20 Dec 2005 21:04:38 +0900	[thread overview]
Message-ID: <43A7F356.90600@gmail.com> (raw)
In-Reply-To: <43A7ECCB.2060108@drzeus.cx>

Pierre Ossman wrote:
> Tejun Heo wrote:
> 
>>Pierre Ossman wrote:
>>>
>>>After testing this it seems the block layer never gives me more than
>>>max_hw_segs segments. Is it being clever because I'm compiling for a
>>>system without an IOMMU?
>>>
>>>The hardware should (haven't properly tested this) be able to get new
>>>DMA addresses during a transfer. In essence scatter gather with some CPU
>>>support. Since I avoid MMC overhead this should give a nice performance
>>>boost. But this relies on the block layer giving me more than one
>>>segment. Do I need to lie in max_hw_segs to achieve this?
>>>
>>
>>Hi, Pierre.
>>
>>max_phys_segments: the maximum number of segments in a request
>>           *before* DMA mapping
>>
>>max_hw_segments: the maximum number of segments in a request
>>         *after* DMA mapping (ie. after IOMMU merging)
>>
>>Those limits are for the block layer.  The block layer must not
>>exceed them when it passes a request downward.  As long as all
>>entries in the sg list are processed, the block layer doesn't care
>>whether sg iteration is performed by the driver or by the hardware.
>>
>>So, if you're going to perform sg iteration in the driver, the
>>numbers to report for max_phys_segments and max_hw_segments are
>>entirely up to how many entries the driver can handle.
>>
>>Just report some nice number (64 or 128?) for both.  Don't forget that
>>the number of sg entries can be decreased after DMA-mapping on
>>machines with IOMMU.
>>
>>IOW, the part that performs sg iteration gets to determine the above
>>limits.  In your case, the driver is responsible for both iterations
>>(pre and post DMA mapping), so all the limits are up to the driver.
>>
>>
> 
> 
> I'm still a bit confused why the block layer needs to know the maximum
> number of hw segments. Different hardware might be connected to
> different IOMMUs, so only the driver will know how much the number can
> be reduced. So the block layer should only care about not going above
> max_phys_segments, since that's what the driver has room for.
> 
> What is the scenario that requires both?
> 

Let's say there is a (crap) controller which can handle only 4 
segments, but the system has a powerful IOMMU which merges pretty 
well.  The driver wants to handle large requests for performance, but it 
doesn't want to break up requests itself (pretty pointless: the block 
layer merges, then the driver breaks down again).  A request should be 
large, but not larger than what the hardware can take at once.

So, it uses max_phys_segments to tell the block layer how many sg 
entries the driver is willing to handle (some arbitrarily large number) 
and reports 4 for max_hw_segments, letting the block layer know that 
requests should not be more than 4 segments after DMA-mapping.

To sum up, the block layer performs request sizing on behalf of block 
drivers, so it needs to know both size limits.

Is this explanation any better than my previous one?  :-P

Also, theoretically there can be more than one IOMMU on a system (is 
there one already?).  The block layer isn't yet ready to handle such 
cases, but when it becomes necessary, all that's needed is to make the 
currently global IOMMU merging parameters request-queue specific and to 
modify drivers so that they tell the block layer their IOMMU parameters.

-- 
tejun


Thread overview: 13+ messages
2005-11-17  8:34 IOMMU and scatterlist limits Pierre Ossman
2005-11-17  8:54 ` Jens Axboe
2005-11-17  9:02   ` Pierre Ossman
2005-11-17  9:13     ` Jens Axboe
2005-11-17  9:27       ` Pierre Ossman
2005-11-17  9:38         ` Jens Axboe
2005-11-17  9:49           ` Pierre Ossman
2005-11-17 12:02             ` Jens Axboe
2005-12-18 22:41           ` Pierre Ossman
2005-12-20 11:10             ` Tejun Heo
2005-12-20 11:36               ` Pierre Ossman
2005-12-20 12:04                 ` Tejun Heo [this message]
2005-12-20 12:28                   ` Pierre Ossman
