From: Nick Piggin <nickpiggin@yahoo.com.au>
To: David Gibson <dwg@au1.ibm.com>
Cc: William Lee Irwin <wli@holomorphy.com>, linux-kernel@vger.kernel.org
Subject: Re: RFC: Block reservation for hugetlbfs
Date: Tue, 21 Feb 2006 15:18:59 +1100
Message-ID: <43FA94B3.4040904@yahoo.com.au>
In-Reply-To: <20060221022124.GA18535@localhost.localdomain>
David Gibson wrote:
> These days, hugepages are demand-allocated at first fault time.
> There's a somewhat dubious (and racy) heuristic when making a new
> mmap() to check if there are enough available hugepages to fully
> satisfy that mapping.
>
> A particularly obvious case where the heuristic breaks down is where a
> process maps its hugepages not as a single chunk, but as a bunch of
> individually mmap()ed (or shmat()ed) blocks without touching and
> instantiating the blocks in between allocations. In this case the
> size of each block is compared against the total number of available
> hugepages. It's thus easy for the process to become overcommitted,
> because each block mapping will succeed, although the total number of
> hugepages required by all blocks exceeds the number available. In
> particular, this defeats a program that detects a mapping failure
> and adjusts its hugepage usage downward accordingly.
>
> The patch below is a draft attempt to address this problem, by
> strictly reserving a number of physical hugepages for hugepages inodes
> which have been mapped, but not instantiated. MAP_SHARED mappings are
> thus "safe" - they will fail on mmap(), not later with a SIGBUS.
> MAP_PRIVATE mappings can still SIGBUS.
>
> This patch appears to address the problem at hand - it allows DB2 to
> start correctly, for instance, which previously suffered the failure
> described above. I'm almost certain I'm missing some locking or other
> synchronization - I am entirely bewildered as to what I need to hold
> to safely update i_blocks as below. Corrections for my ignorance
> solicited...
>
> Signed-off-by: David Gibson <dwg@au1.ibm.com>
>

This introduces

    tree_lock(r) -> hugetlb_lock

And we already have

    hugetlb_lock -> lru_lock

So we now have tree_lock(r) -> lru_lock, which would deadlock
against lru_lock -> tree_lock(w), right?

From a quick glance it looks safe, but I'd _really_ rather not
introduce something like this.
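
The ordering argument above can be checked mechanically. What follows is
only a rough sketch, not anything from the patch or the kernel: it treats
each "A -> B" as an edge meaning "B is taken while A is held" and looks
for a cycle. The function name find_cycle is made up for illustration,
the read/write mode of tree_lock is collapsed into a single node (a held
reader still blocks a writer), and the lru_lock -> tree_lock edge is the
hypothetical ordering asked about above, not one confirmed to exist.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one cycle in a lock-order graph as a list of names, or None.

    Each edge (held, wanted) means "wanted is acquired while held is held".
    """
    graph = defaultdict(list)
    for held, wanted in edges:
        graph[held].append(wanted)

    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on current path, finished
    colour = defaultdict(int)
    path = []

    def visit(lock):
        colour[lock] = GREY
        path.append(lock)
        for nxt in graph[lock]:
            if colour[nxt] == GREY:   # back edge: ordering loops on itself
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[lock] = BLACK
        return None

    for lock in list(graph):
        if colour[lock] == WHITE:
            found = visit(lock)
            if found:
                return found
    return None


# Orderings assumed to exist before the patch. The second edge is the
# hypothetical lru_lock -> tree_lock(w) ordering, an assumption here.
existing = [
    ("hugetlb_lock", "lru_lock"),
    ("lru_lock", "tree_lock"),
]

# The ordering the patch would add: tree_lock(r) -> hugetlb_lock.
proposed = existing + [("tree_lock", "hugetlb_lock")]

print(find_cycle(existing))
print(find_cycle(proposed))
```

If the lru_lock -> tree_lock edge does not exist anywhere in the tree,
the cycle goes away, which is consistent with the patch looking safe at
a quick glance while still leaving a fragile ordering behind.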
--
SUSE Labs, Novell Inc.