From: Jens Axboe <axboe@suse.de>
To: Andi Kleen <ak@suse.de>
Cc: Michael Monnerie <m.monnerie@zmi.at>,
Jeff Garzik <jgarzik@pobox.com>,
linux-kernel@vger.kernel.org
Subject: Re: PCI-DMA: Out of IOMMU space on x86-64 (Athlon64x2), with solution
Date: Thu, 2 Mar 2006 15:14:12 +0100
Message-ID: <20060302141412.GT4329@suse.de>
In-Reply-To: <200603021458.02934.ak@suse.de>

On Thu, Mar 02 2006, Andi Kleen wrote:
> On Thursday 02 March 2006 14:49, Jens Axboe wrote:
> > On Thu, Mar 02 2006, Andi Kleen wrote:
> > > On Thursday 02 March 2006 14:33, Jens Axboe wrote:
> > >
> > > > Hmm I would have guessed the first is way more common, the device/driver
> > > > consuming lots of iommu space would be the most likely to run into
> > > > IOMMU-OOM.
> > >
> > > e.g. consider a simple RAID-1. It will always map the requests twice so the
> > > normal case is 2 times as much IOMMU space needed. Or even more with bigger
> > > raids.
> > >
> > > But you're right of course that only waiting for one user would be likely
> > > sufficient. e.g. even if it misses some freeing events the "current" device
> > > should eventually free some space too.
> > >
> > > On the other hand it would seem cleaner to me to solve it globally
> > > instead of trying to hack around it in the higher layers.
> >
> > But I don't think that's really possible.
>
> Wasn't this whole thread about making it possible?

Sorry, what I mean is that I don't think it's solvable in the normal
dma_map_sg() path. You have to punt and allow the upper layer to wait.

> > As Jeff points out, SCSI can't
> > do this right now because of the way we map requests.
>
> Sure you have to punt out outside this spinlock and then find
> a "safe place" as you put it to wait. The low level IOMMU code
> would supply the wakeup.

Precisely.

> > And it would be a
> > shame to change the hot path because of the error case. And then you
> > have things like networking and other block drivers - it would be a big
> > audit/fixup to make that work.
> >
> > It's much easier to extend the dma mapping api to have an error
> > fallback.
>
> It already has one (pci_map_sg returning 0 or pci_mapping_error()
> for pci_map_single())

Yeah, we can signal the error in map_sg() with 0, that's not what I
meant. I meant adding a way to handle that error, not signal it. Which
is the wait stuff we are discussing.

> The problem is just that when you get it you can only error out
> because there is no way to wait for a free space event. With
> your help I've been trying to figure out how to add it. Of course
> after that's done you still have to do the work to handle
> it in the block layer somewhere.

Yes, that's the issue. We can have a defer helper in the block layer that
could reinvoke the request handling when we _hope_ it'll work. That's
already in place, the driver does a BLKPREP_DEFER for that case. For
drivers that don't use the prep handler, we can do something very
similar.
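
A toy userspace model of that defer-and-rerun idea, not block layer
code. Everything named toy_* and the PREP_* enum is hypothetical:
PREP_DEFER is modeled on BLKPREP_DEFER, and toy_map_sg() follows the
map_sg() convention of returning 0 on failure. The sketch assumes a
deferred request simply stays at the head of the queue until the queue
is run again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for BLKPREP_OK / BLKPREP_DEFER. */
enum prep_result { PREP_OK, PREP_DEFER };

struct toy_iommu   { size_t free_entries; };
struct toy_request { size_t nents; bool mapped; };

/* Like map_sg(): returns the number of entries mapped, 0 on failure. */
size_t toy_map_sg(struct toy_iommu *io, size_t nents)
{
    if (nents == 0 || nents > io->free_entries)
        return 0;
    io->free_entries -= nents;
    return nents;
}

/* Prep handler: a mapping failure is a soft error, so defer, don't fail. */
enum prep_result toy_prep(struct toy_iommu *io, struct toy_request *rq)
{
    if (rq->mapped)
        return PREP_OK;
    if (toy_map_sg(io, rq->nents) == 0)
        return PREP_DEFER;
    rq->mapped = true;
    return PREP_OK;
}

/*
 * Dispatch requests from *head onwards and stop at the first one that
 * defers. The deferred request stays at the head, so a later call
 * (after some space has been freed) retries it first.
 * Returns how many requests were dispatched by this call.
 */
size_t toy_run_queue(struct toy_iommu *io, struct toy_request *q,
                     size_t n, size_t *head)
{
    size_t dispatched = 0;

    assert(head != NULL);
    while (*head < n) {
        if (toy_prep(io, &q[*head]) == PREP_DEFER)
            break;
        (*head)++;
        dispatched++;
    }
    return dispatched;
}
```

The rerun is only a hope, as said above: nothing in this model guarantees
the retry succeeds, it just keeps the request queued instead of erroring
it out.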
> > > > I was thinking just a global one, we are in soft error handling anyways
> > > > so should be ok. I don't think you would need to dirty any global cache
> > > > line unless you actually need to wake waiters.
> > >
> > > __wake_up takes the spinlock even when nobody waits.
> >
> > I would not want to call wake_up() unless I have to. Would a
> >
> >     smp_mb();
> >     if (waitqueue_active(&iommu_wq))
> >             ...
> >
> > not be sufficient?
>
> Probably, but one would need to be careful to not miss events this way.

Definitely, as far as I can see the above should be enough...
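
A minimal userspace sketch of the smp_mb() + waitqueue_active() idea,
to show where an event could be missed. It is a pthread/C11 model, not
kernel code: struct iommu_space, nr_waiters, slow_wakeups and all the
function names are hypothetical. nr_waiters plays the role of
waitqueue_active(&iommu_wq) and the fence plays the role of smp_mb().
The ordering is the point: the freeing side publishes the free space
before it looks for waiters, and the waiting side registers itself
before it re-checks for space.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct iommu_space {
    atomic_long     free_pages;
    atomic_int      nr_waiters;    /* stand-in for waitqueue_active() */
    pthread_mutex_t lock;
    pthread_cond_t  wq;
    long            slow_wakeups;  /* times the lock was taken to wake */
};

void iommu_space_init(struct iommu_space *s, long pages)
{
    atomic_init(&s->free_pages, pages);
    atomic_init(&s->nr_waiters, 0);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->wq, NULL);
    s->slow_wakeups = 0;
}

/* Lock-free attempt to take 'pages' out of the pool. */
bool iommu_try_alloc(struct iommu_space *s, long pages)
{
    long cur = atomic_load(&s->free_pages);

    while (cur >= pages) {
        /* On failure 'cur' is reloaded and the bound is re-checked. */
        if (atomic_compare_exchange_weak(&s->free_pages, &cur, cur - pages))
            return true;
    }
    return false;
}

void iommu_free(struct iommu_space *s, long pages)
{
    assert(pages > 0);
    atomic_fetch_add(&s->free_pages, pages);    /* publish the space */
    atomic_thread_fence(memory_order_seq_cst);  /* models smp_mb() */
    if (atomic_load(&s->nr_waiters) == 0)
        return;                                 /* fast path: no lock */

    pthread_mutex_lock(&s->lock);
    s->slow_wakeups++;
    pthread_cond_broadcast(&s->wq);
    pthread_mutex_unlock(&s->lock);
}

/* The "safe place" to wait: blocks until the allocation succeeds. */
void iommu_wait_alloc(struct iommu_space *s, long pages)
{
    pthread_mutex_lock(&s->lock);
    atomic_fetch_add(&s->nr_waiters, 1);        /* register first... */
    while (!iommu_try_alloc(s, pages))          /* ...then re-check */
        pthread_cond_wait(&s->wq, &s->lock);
    atomic_fetch_sub(&s->nr_waiters, 1);
    pthread_mutex_unlock(&s->lock);
}
```

If the freeing side reads nr_waiters as zero, the waiter's registration
comes later in the sequentially consistent order, so the waiter's
re-check sees the freed pages. Otherwise the freeing side takes the lock
and the broadcast cannot slip in between the waiter's check and its
sleep.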
--
Jens Axboe