From: Nick Piggin <nickpiggin@yahoo.com.au>
To: David Gibson <dwg@au1.ibm.com>
Cc: William Lee Irwin <wli@holomorphy.com>, linux-kernel@vger.kernel.org
Subject: Re: RFC: Block reservation for hugetlbfs
Date: Tue, 21 Feb 2006 15:18:59 +1100
Message-ID: <43FA94B3.4040904@yahoo.com.au>
In-Reply-To: <20060221022124.GA18535@localhost.localdomain>

David Gibson wrote:
> These days, hugepages are demand-allocated at first fault time.
> There's a somewhat dubious (and racy) heuristic when making a new
> mmap() to check if there are enough available hugepages to fully
> satisfy that mapping.
>
> A particularly obvious case where the heuristic breaks down is where a
> process maps its hugepages not as a single chunk, but as a bunch of
> individually mmap()ed (or shmat()ed) blocks without touching and
> instantiating the blocks in between allocations. In this case the
> size of each block is compared against the total number of available
> hugepages. It's thus easy for the process to become overcommitted,
> because each block mapping will succeed, although the total number of
> hugepages required by all blocks exceeds the number available. In
> particular, this defeats any program that detects a mapping
> failure and adjusts its hugepage usage downward accordingly.
>
> The patch below is a draft attempt to address this problem, by
> strictly reserving a number of physical hugepages for hugepage inodes
> which have been mapped, but not instantiated. MAP_SHARED mappings are
> thus "safe" - they will fail on mmap(), not later with a SIGBUS.
> MAP_PRIVATE mappings can still SIGBUS.
>
> This patch appears to address the problem at hand - it allows DB2 to
> start correctly, for instance, which previously suffered the failure
> described above. I'm almost certain I'm missing some locking or other
> synchronization - I am entirely bewildered as to what I need to hold
> to safely update i_blocks as below. Corrections for my ignorance
> solicited...
>
> Signed-off-by: David Gibson <dwg@au1.ibm.com>
>

This introduces
	tree_lock(r) -> hugetlb_lock

And we already have
	hugetlb_lock -> lru_lock

So we now have tree_lock(r) -> lru_lock, which would deadlock
against lru_lock -> tree_lock(w), right?

From a quick glance it looks safe, but I'd _really_ rather not
introduce something like this.
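
To make the ordering argument above concrete, here is a minimal userspace
sketch in C. It is not kernel code and not part of the patch. It records
"outer -> inner" lock orderings in a small table and reports whether a
proposed new ordering would close a cycle. The enum, the table and the
helper names (add_order, would_deadlock, reset_orders) are hypothetical,
and treating the read and write sides of tree_lock as one node is a
simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lock identifiers, named after the locks discussed above. */
enum lock_id { TREE_LOCK, HUGETLB_LOCK, LRU_LOCK, NR_LOCKS };

/* order[a][b] is true when some code path takes b while holding a. */
static bool order[NR_LOCKS][NR_LOCKS];

static void reset_orders(void)
{
	memset(order, 0, sizeof(order));
}

static void add_order(enum lock_id outer, enum lock_id inner)
{
	assert(outer < NR_LOCKS && inner < NR_LOCKS);
	order[outer][inner] = true;
}

/* Depth-first search: can 'to' be reached from 'from' via recorded orderings? */
static bool reachable(enum lock_id from, enum lock_id to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++) {
		if (order[from][next] && !seen[next] &&
		    reachable((enum lock_id)next, to, seen))
			return true;
	}
	return false;
}

/*
 * Taking 'inner' while holding 'outer' closes a cycle if 'outer' is
 * already reachable from 'inner' through existing orderings.
 */
static bool would_deadlock(enum lock_id outer, enum lock_id inner)
{
	bool seen[NR_LOCKS] = { false };

	return reachable(inner, outer, seen);
}

#ifdef LOCK_ORDER_DEMO
int main(void)
{
	reset_orders();
	add_order(HUGETLB_LOCK, LRU_LOCK);	/* hugetlb_lock -> lru_lock */
	add_order(LRU_LOCK, TREE_LOCK);		/* lru_lock -> tree_lock(w) */

	/* The ordering the patch would add: tree_lock(r) -> hugetlb_lock */
	if (would_deadlock(TREE_LOCK, HUGETLB_LOCK))
		printf("tree_lock -> hugetlb_lock closes a lock-order cycle\n");
	return 0;
}
#endif
```

Building with -DLOCK_ORDER_DEMO runs the three orderings from this mail
through the check. A cycle in this table only means a deadlock is
possible in principle. Whether the paths involved can actually run
concurrently on the same objects is a separate question.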
--
SUSE Labs, Novell Inc.