linux-kernel.vger.kernel.org archive mirror
From: Peter Zijlstra <peterz@infradead.org>
To: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>,
	"Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	ak@suse.de, gregkh@suse.de, muli@il.ibm.com, ashok.raj@intel.com,
	davem@davemloft.net, clameter@sgi.com
Subject: Re: [Intel IOMMU 06/10] Avoid memory allocation failures in dma map api calls
Date: Wed, 20 Jun 2007 20:05:03 +0200
Message-ID: <1182362703.21117.79.camel@twins>
In-Reply-To: <20070620173038.GA25516@linux-os.sc.intel.com>

On Wed, 2007-06-20 at 10:30 -0700, Siddha, Suresh B wrote:
> On Wed, Jun 20, 2007 at 06:03:02AM -0700, Arjan van de Ven wrote:
> > Peter Zijlstra wrote:
> > >
> > >
> > >PF_MEMALLOC as is, is meant to salvage the VM from the typical VM
> > >deadlock. 
> > 
> > .. and this IS the typical VM deadlock.. it is your storage driver 
> > trying to write out a piece of memory on behalf of the VM, and calls 
> > the iommu to map it, which then needs a bit of memory....
> 
> Today PF_MEMALLOC doesn't do much in interrupt context. If PF_MEMALLOC
> is the right usage model for this, then we need to fix the behavior of
> PF_MEMALLOC in the interrupt context(for our usage model, we do most
> of the allocations in interrupt context).

Right, I have patches that add GFP_EMERGENCY to do basically that.

> I am not very familiar with PF_MEMALLOC. So experts please comment.

PF_MEMALLOC is meant to avoid the VM deadlock, that is, the case where
we need memory in order to free memory. The one constraint is that its
use be bounded (which is currently violated, in that there is no bound
on the number of direct reclaim contexts; fixing that is on my list).

So a reclaim context (kswapd or direct reclaim) sets PF_MEMALLOC to
ensure that it will not itself block on a memory allocation. And it is
understood that these code paths have a bounded memory footprint.

Now, this code seems to be running from interrupt context, which makes
it impossible to tell if the work is being done on behalf of a reclaim
task. Is it possible to set up the needed data for the IRQ handler from
process context?

Blindly adding GFP_EMERGENCY to do this has the distinct disadvantage
that there is no inherent bound on the amount of memory consumed. In my
patch set I add an emergency reserve (below the current watermarks,
because ALLOC_HIGH and ALLOC_HARDER modify the threshold in a relative
way, and thus cannot provide a guaranteed limit). I then accurately
account all allocations made from this reserve to ensure I never cross
the set limit.
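
The accounting idea can be sketched in a few lines of userspace C. This
is not the code from the patch set; struct emergency_reserve,
reserve_charge and reserve_uncharge are hypothetical names, and the only
assumption is that every allocation from the reserve is charged against
a fixed limit and refused once the limit would be crossed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a bounded reserve; not the kernel's structures. */
struct emergency_reserve {
    size_t limit;   /* hard bound on pages that may be drawn */
    size_t used;    /* pages currently charged against the reserve */
};

/* Charge 'pages' against the reserve; refuse if the limit would be crossed. */
static bool reserve_charge(struct emergency_reserve *r, size_t pages)
{
    if (pages > r->limit - r->used)     /* used <= limit always holds */
        return false;
    r->used += pages;
    return true;
}

/* Return 'pages' to the reserve once the allocation has been freed. */
static void reserve_uncharge(struct emergency_reserve *r, size_t pages)
{
    assert(pages <= r->used);
    r->used -= pages;
}
```

A caller would charge before allocating below the watermarks and
uncharge on free, so consumption can never exceed the limit.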

As has been said before: if possible, move to blocking allocs
(GFP_NOIO); if that is not possible, use mempools (for kmem_cache or
page alloc); if that is not possible, use ALLOC_NO_WATERMARKS
(PF_MEMALLOC, GFP_EMERGENCY), but put in a reserve and account its usage.

The last option basically boils down to reserve-based allocation,
something which I hope to introduce some day...

That is, failure is OK unless you're in a reclaim context; those
should make progress.


One thing I'm confused about: in earlier discussions it was said that
mempools are not sufficient because they deplete the GFP_ATOMIC reserve
and only then use the mempool. This would not work because some
downstream allocation would then go splat. But using
PF_MEMALLOC/GFP_EMERGENCY has exactly the same problem!
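
For reference, the ordering complained about looks roughly like this toy
userspace pool: the backing allocator is tried first and the
preallocated elements are touched only when it fails. This is a sketch,
not the kernel's mempool implementation; toy_pool, POOL_MIN and the
backing_fails knob are hypothetical, and malloc() stands in for the
underlying allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_MIN 2      /* hypothetical number of preallocated elements */

struct toy_pool {
    void *elem[POOL_MIN];   /* preallocated elements held in reserve */
    int nr;                 /* how many of them are currently available */
    size_t size;            /* element size */
    int backing_fails;      /* knob: simulate backing allocator failure */
};

/* Preallocate POOL_MIN elements; returns 0 on success, -1 on failure. */
static int toy_pool_init(struct toy_pool *p, size_t size)
{
    p->nr = 0;
    p->size = size;
    p->backing_fails = 0;
    while (p->nr < POOL_MIN) {
        void *e = malloc(size);
        if (!e) {
            while (p->nr > 0)
                free(p->elem[--p->nr]);
            return -1;
        }
        p->elem[p->nr++] = e;
    }
    return 0;
}

/* Try the backing allocator first; fall back to the reserve only on failure. */
static void *toy_pool_alloc(struct toy_pool *p)
{
    void *e = p->backing_fails ? NULL : malloc(p->size);

    if (e)
        return e;
    if (p->nr > 0)
        return p->elem[--p->nr];
    return NULL;    /* a caller that may sleep would wait for a free here */
}

/* Refill the reserve before handing memory back to the backing allocator. */
static void toy_pool_free(struct toy_pool *p, void *e)
{
    assert(e != NULL);
    if (p->nr < POOL_MIN)
        p->elem[p->nr++] = e;
    else
        free(e);
}
```

By the time the pool's own elements are used, the backing allocator has
already failed, so anything downstream that allocates without a pool of
its own is in trouble.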