public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: stable-review@kernel.org, torvalds@linux-foundation.org,
	akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk,
	Mel Gorman <mel@csn.ul.ie>,
	Lee Schermerhorn <lee.schermerhorn@hp.com>,
	David Rientjes <rientjes@google.com>,
	Andi Kleen <andi@firstfloor.org>
Subject: [04/34] hugetlbfs: kill applications that use MAP_NORESERVE with SIGBUS instead of OOM-killer
Date: Mon, 24 May 2010 15:59:36 -0700	[thread overview]
Message-ID: <20100524230349.707407462@clark.site> (raw)
In-Reply-To: <20100524230418.GA12770@kroah.com>

2.6.34-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Mel Gorman <mel@csn.ul.ie>

commit 4a6018f7f4f1075c1a5403b5ec0ee7262187b86c upstream.

Ordinarily, application using hugetlbfs will create mappings with
reserves.  For shared mappings, these pages are reserved before mmap()
returns success and for private mappings, the caller process is guaranteed
and a child process that cannot get the pages gets killed with sigbus.

An application that uses MAP_NORESERVE gets no reservations and mmap()
will always succeed at the risk the page will not be available at fault
time.  This might be used for example on very large sparse mappings where
the developer is confident the necessary huge pages exist to satisfy all
faults even though the whole mapping cannot be backed by huge pages.
Unfortunately, if an allocation does fail, VM_FAULT_OOM is returned to the
fault handler which proceeds to trigger the OOM-killer.  This is
unhelpful.

Even without hugetlbfs mounted, a user using mmap() can trivially trigger
the OOM-killer because VM_FAULT_OOM is returned (will provide example
program if desired - it's a whopping 24 lines long).  It could be
considered a DOS available to an unprivileged user.

This patch alters hugetlbfs to kill a process that uses MAP_NORESERVE
where huge pages were not available with SIGBUS instead of triggering the
OOM killer.

This change affects hugetlb_cow() as well.  I feel there is a failure case
in there, but I didn't create one.  It would need a fairly specific target
in terms of the faulting application and the hugepage pool size.  The
hugetlb_no_page() path is much easier to hit but both might as well be
closed.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 mm/hugetlb.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1008,7 +1008,7 @@ static struct page *alloc_huge_page(stru
 		page = alloc_buddy_huge_page(h, vma, addr);
 		if (!page) {
 			hugetlb_put_quota(inode->i_mapping, chg);
-			return ERR_PTR(-VM_FAULT_OOM);
+			return ERR_PTR(-VM_FAULT_SIGBUS);
 		}
 	}
 



  parent reply	other threads:[~2010-05-24 23:16 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-24 23:04 [00/34] 2.6.32.14-stable review Greg KH
2010-05-24 22:59 ` [01/34] ipv4: udp: fix short packet and bad checksum logging Greg KH
2010-05-25  7:08   ` Bjørn Mork
2010-05-25 14:06     ` Greg KH
2010-05-24 22:59 ` [02/34] hp_accel: fix race in device removal Greg KH
2010-05-24 22:59 ` [03/34] fbdev: bfin-t350mcqb-fb: fix fbmem allocation with blanking lines Greg KH
2010-05-24 22:59 ` Greg KH [this message]
2010-05-24 22:59 ` [05/34] dma-mapping: fix dma_sync_single_range_* Greg KH
2010-05-24 22:59 ` [06/34] ACPI: sleep: eliminate duplicate entries in acpisleep_dmi_table[] Greg KH
2010-05-24 22:59 ` [07/34] mmc: atmel-mci: fix two parameters swapped Greg KH
2010-05-24 22:59 ` [08/34] mmc: atmel-mci: prevent kernel oops while removing card Greg KH
2010-05-24 22:59 ` [09/34] mmc: atmel-mci: remove data error interrupt after xfer Greg KH
2010-05-24 22:59 ` [10/34] [S390] ptrace: fix return value of do_syscall_trace_enter() Greg KH
2010-05-24 22:59 ` [11/34] powerpc/perf_event: Fix oops due to perf_event_do_pending call Greg KH
2010-05-24 22:59 ` [12/34] cifs: guard against hardlinking directories Greg KH
2010-05-24 22:59 ` [13/34] serial: imx.c: fix CTS trigger level lower to avoid lost chars Greg KH
2010-05-24 22:59 ` [14/34] ALSA: ice1724 - Fix ESI Maya44 capture source control Greg KH
2010-05-24 22:59 ` [15/34] ALSA: hda: Fix 0 dB for Lenovo models using Conexant CX20549 (Venice) Greg KH
2010-05-24 22:59 ` [16/34] inotify: race use after free/double free in inotify inode marks Greg KH
2010-05-24 22:59 ` [17/34] inotify: dont leak user struct on inotify release Greg KH
2010-05-24 22:59 ` [18/34] profile: fix stats and data leakage Greg KH
2010-05-24 22:59 ` [19/34] x86, k8: Fix build error when K8_NB is disabled Greg KH
2010-05-24 23:13   ` H. Peter Anvin
2010-05-24 23:26     ` Greg KH
2010-05-24 22:59 ` [20/34] x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments Greg KH
2010-05-24 23:13   ` H. Peter Anvin
2010-05-24 23:25     ` Greg KH
2010-05-24 22:59 ` [21/34] x86, amd: Check X86_FEATURE_OSVW bit before accessing OSVW MSRs Greg KH
2010-05-24 23:20   ` H. Peter Anvin
2010-05-24 22:59 ` [22/34] Btrfs: check for read permission on src file in the clone ioctl Greg KH
2010-05-24 22:59 ` [23/34] ALSA: hda - New Intel HDA controller Greg KH
2010-05-24 22:59 ` [24/34] proc: partially revert "procfs: provide stack information for threads" Greg KH
2010-05-24 22:59 ` [25/34] revert "procfs: provide stack information for threads" and its fixup commits Greg KH
2010-05-24 22:59 ` [26/34] iwlwifi: clear all the stop_queue flag after load firmware Greg KH
2010-05-24 22:59 ` [27/34] p54: disable channels with incomplete calibration data sets Greg KH
2010-05-24 23:00 ` [28/34] CacheFiles: Fix error handling in cachefiles_determine_cache_security() Greg KH
2010-05-24 23:00 ` [29/34] [SCSI] megaraid_sas: fix for 32bit apps Greg KH
2010-05-24 23:00 ` [30/34] mmap_min_addr check CAP_SYS_RAWIO only for write Greg KH
2010-05-24 23:00 ` [31/34] nilfs2: fix sync silent failure Greg KH
2010-05-24 23:00 ` [32/34] Revert "ath9k: fix lockdep warning when unloading module" on stable kernels Greg KH
2010-05-24 23:00 ` [33/34] crypto: authenc - Add EINPROGRESS check Greg KH
2010-05-24 23:00 ` [34/34] Revert "parisc: Set PCI CLS early in boot." Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100524230349.707407462@clark.site \
    --to=gregkh@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=andi@firstfloor.org \
    --cc=lee.schermerhorn@hp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mel@csn.ul.ie \
    --cc=rientjes@google.com \
    --cc=stable-review@kernel.org \
    --cc=stable@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox