From: Nick Piggin <nickpiggin@yahoo.com.au>
To: David Gibson <dwg@au1.ibm.com>
Cc: William Lee Irwin <wli@holomorphy.com>, linux-kernel@vger.kernel.org
Subject: Re: RFC: Block reservation for hugetlbfs
Date: Tue, 21 Feb 2006 15:18:59 +1100
Message-ID: <43FA94B3.4040904@yahoo.com.au>
In-Reply-To: <20060221022124.GA18535@localhost.localdomain>

David Gibson wrote:
> These days, hugepages are demand-allocated at first fault time.
> There's a somewhat dubious (and racy) heuristic when making a new
> mmap() to check if there are enough available hugepages to fully
> satisfy that mapping.
> 
> A particularly obvious case where the heuristic breaks down is where a
> process maps its hugepages not as a single chunk, but as a bunch of
> individually mmap()ed (or shmat()ed) blocks without touching and
> instantiating the blocks in between allocations.  In this case the
> size of each block is compared against the total number of available
> hugepages.  It's thus easy for the process to become overcommitted,
> because each block mapping will succeed, although the total number of
> hugepages required by all blocks exceeds the number available.  In
> particular, this defeats a program which detects a mapping failure
> and adjusts its hugepage usage downward accordingly.
> 
> The patch below is a draft attempt to address this problem, by
> strictly reserving a number of physical hugepages for hugepages inodes
> which have been mapped, but not instantiated.  MAP_SHARED mappings are
> thus "safe" - they will fail on mmap(), not later with a SIGBUS.
> MAP_PRIVATE mappings can still SIGBUS.
> 
> This patch appears to address the problem at hand - it allows DB2 to
> start correctly, for instance, which previously suffered the failure
> described above.  I'm almost certain I'm missing some locking or other
> synchronization - I am entirely bewildered as to what I need to hold
> to safely update i_blocks as below.  Corrections for my ignorance
> solicited...
> 
> Signed-off-by: David Gibson <dwg@au1.ibm.com>
> 

This introduces
tree_lock(r) -> hugetlb_lock

And we already have
hugetlb_lock -> lru_lock

So we now have tree_lock(r) -> lru_lock, which would deadlock
against lru_lock -> tree_lock(w), right?

From a quick glance it looks safe, but I'd _really_ rather not
introduce something like this.
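
To make the ordering argument above concrete, here is a minimal
user-space sketch (not kernel code) that records "A is held while B is
taken" edges and checks whether a new edge would close a cycle.  The
enum names, struct and helper functions are hypothetical and exist only
for this illustration.  The sketch also treats the read and write sides
of tree_lock as a single node, which is a conservative simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock IDs for this sketch; not kernel identifiers. */
enum lock_id { TREE_LOCK, HUGETLB_LOCK, LRU_LOCK, NR_LOCKS };

/* order[a][b] is true if some code path takes b while holding a. */
struct lock_graph {
	bool order[NR_LOCKS][NR_LOCKS];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record that 'taken' is acquired while 'held' is already held. */
static void add_order(struct lock_graph *g, enum lock_id held,
		      enum lock_id taken)
{
	assert(held < NR_LOCKS && taken < NR_LOCKS);
	g->order[held][taken] = true;
}

/* Depth-first search: is 'target' reachable from 'from' via edges? */
static bool reaches(const struct lock_graph *g, int from, int target,
		    bool *seen)
{
	if (from == target)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++) {
		if (g->order[from][next] && !seen[next] &&
		    reaches(g, next, target, seen))
			return true;
	}
	return false;
}

/*
 * Adding held -> taken closes a cycle exactly when 'taken' can already
 * reach 'held' through the existing ordering edges.
 */
static bool would_deadlock(const struct lock_graph *g, enum lock_id held,
			   enum lock_id taken)
{
	bool seen[NR_LOCKS] = { false };

	return reaches(g, taken, held, seen);
}
```

Seeding the graph with the two existing orderings mentioned above
(hugetlb_lock -> lru_lock and lru_lock -> tree_lock) and then asking
about tree_lock -> hugetlb_lock reports a cycle, which is the inversion
I am worried about.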

-- 
SUSE Labs, Novell Inc.

