From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Josh Boyer <jwboyer@redhat.com>,
Samu Kallio <samu.kallio@aberdeencloud.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
"H. Peter Anvin" <hpa@linux.intel.com>
Subject: [ 25/27] x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates
Date: Sun, 14 Apr 2013 19:43:24 -0700 [thread overview]
Message-ID: <20130415024233.223446306@linuxfoundation.org> (raw)
In-Reply-To: <20130415024231.351969241@linuxfoundation.org>
3.8-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samu Kallio <samu.kallio@aberdeencloud.com>
commit 1160c2779b826c6f5c08e5cc542de58fd1f667d5 upstream.
In paravirtualized x86_64 kernels, vmalloc_fault may cause an oops
when lazy MMU updates are enabled, because set_pgd effects are being
deferred.
One instance of this problem is during process mm cleanup with memory
cgroups enabled. The chain of events is as follows:
- zap_pte_range enables lazy MMU updates
- zap_pte_range eventually calls mem_cgroup_charge_statistics,
which accesses the vmalloc'd mem_cgroup per-cpu stat area
- vmalloc_fault is triggered which tries to sync the corresponding
PGD entry with set_pgd, but the update is deferred
- vmalloc_fault oopses due to a mismatch in the PUD entries
The OOPs usually looks as so:
------------[ cut here ]------------
kernel BUG at arch/x86/mm/fault.c:396!
invalid opcode: 0000 [#1] SMP
.. snip ..
CPU 1
Pid: 10866, comm: httpd Not tainted 3.6.10-4.fc18.x86_64 #1
RIP: e030:[<ffffffff816271bf>] [<ffffffff816271bf>] vmalloc_fault+0x11f/0x208
.. snip ..
Call Trace:
[<ffffffff81627759>] do_page_fault+0x399/0x4b0
[<ffffffff81004f4c>] ? xen_mc_extend_args+0xec/0x110
[<ffffffff81624065>] page_fault+0x25/0x30
[<ffffffff81184d03>] ? mem_cgroup_charge_statistics.isra.13+0x13/0x50
[<ffffffff81186f78>] __mem_cgroup_uncharge_common+0xd8/0x350
[<ffffffff8118aac7>] mem_cgroup_uncharge_page+0x57/0x60
[<ffffffff8115fbc0>] page_remove_rmap+0xe0/0x150
[<ffffffff8115311a>] ? vm_normal_page+0x1a/0x80
[<ffffffff81153e61>] unmap_single_vma+0x531/0x870
[<ffffffff81154962>] unmap_vmas+0x52/0xa0
[<ffffffff81007442>] ? pte_mfn_to_pfn+0x72/0x100
[<ffffffff8115c8f8>] exit_mmap+0x98/0x170
[<ffffffff810050d9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[<ffffffff81059ce3>] mmput+0x83/0xf0
[<ffffffff810624c4>] exit_mm+0x104/0x130
[<ffffffff8106264a>] do_exit+0x15a/0x8c0
[<ffffffff810630ff>] do_group_exit+0x3f/0xa0
[<ffffffff81063177>] sys_exit_group+0x17/0x20
[<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
Calling arch_flush_lazy_mmu_mode immediately after set_pgd makes the
changes visible to the consistency checks.
RedHat-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=914737
Tested-by: Josh Boyer <jwboyer@redhat.com>
Reported-and-Tested-by: Krishna Raman <kraman@redhat.com>
Signed-off-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Link: http://lkml.kernel.org/r/1364045796-10720-1-git-send-email-konrad.wilk@oracle.com
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/mm/fault.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -378,10 +378,12 @@ static noinline __kprobes int vmalloc_fa
if (pgd_none(*pgd_ref))
return -1;
- if (pgd_none(*pgd))
+ if (pgd_none(*pgd)) {
set_pgd(pgd, *pgd_ref);
- else
+ arch_flush_lazy_mmu_mode();
+ } else {
BUG_ON(pgd_page_vaddr(*pgd) != pgd_page_vaddr(*pgd_ref));
+ }
/*
* Below here mismatches are bugs because these lower tables
next prev parent reply other threads:[~2013-04-15 2:44 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-15 2:42 [ 00/27] 3.8.8-stable review Greg Kroah-Hartman
2013-04-15 2:43 ` [ 01/27] ALSA: usb-audio: fix endianness bug in snd_nativeinstruments_* Greg Kroah-Hartman
2013-04-15 2:43 ` [ 02/27] ASoC: core: Fix to check return value of snd_soc_update_bits_locked() Greg Kroah-Hartman
2013-04-15 2:43 ` [ 03/27] ASoC: wm5102: Correct lookup of arizona struct in SYSCLK event Greg Kroah-Hartman
2013-04-15 2:43 ` [ 04/27] ASoC: wm8903: Fix the bypass to HP/LINEOUT when no DAC or ADC is running Greg Kroah-Hartman
2013-04-15 2:43 ` [ 05/27] tracing: Fix double free when function profile init failed Greg Kroah-Hartman
2013-04-15 2:43 ` [ 06/27] ARM: Kirkwood: Fix typo in the definition of ix2-200 rebuild LED Greg Kroah-Hartman
2013-04-15 2:43 ` [ 07/27] ARM: imx35 Bugfix admux clock Greg Kroah-Hartman
2013-04-15 2:43 ` [ 08/27] dmaengine: omap-dma: Start DMA without delay for cyclic channels Greg Kroah-Hartman
2013-04-15 2:43 ` [ 09/27] PM / reboot: call syscore_shutdown() after disable_nonboot_cpus() Greg Kroah-Hartman
2013-04-15 2:43 ` [ 10/27] Revert "brcmsmac: support 4313iPA" Greg Kroah-Hartman
2013-04-15 2:43 ` [ 11/27] ipc: set msg back to -EAGAIN if copy wasnt performed Greg Kroah-Hartman
2013-04-15 2:43 ` [ 12/27] GFS2: Fix unlock of fcntl locks during withdrawn state Greg Kroah-Hartman
2013-04-15 2:43 ` [ 13/27] GFS2: return error if malloc failed in gfs2_rs_alloc() Greg Kroah-Hartman
2013-04-15 2:43 ` [ 14/27] SCSI: libsas: fix handling vacant phy in sas_set_ex_phy() Greg Kroah-Hartman
2013-04-15 2:43 ` [ 15/27] cifs: Allow passwords which begin with a delimitor Greg Kroah-Hartman
2013-04-15 2:43 ` [ 16/27] target: Fix incorrect fallthrough of ALUA Standby/Offline/Transition CDBs Greg Kroah-Hartman
2013-04-15 2:43 ` [ 17/27] vfs: Revert spurious fix to spinning prevention in prune_icache_sb Greg Kroah-Hartman
2013-04-15 2:43 ` [ 18/27] kobject: fix kset_find_obj() race with concurrent last kobject_put() Greg Kroah-Hartman
2013-04-15 2:43 ` [ 19/27] gpio: fix wrong checking condition for gpio range Greg Kroah-Hartman
2013-04-15 2:43 ` [ 20/27] x86-32: Fix possible incomplete TLB invalidate with PAE pagetables Greg Kroah-Hartman
2013-04-15 2:43 ` [ 21/27] tracing: Fix possible NULL pointer dereferences Greg Kroah-Hartman
2013-04-15 2:43 ` [ 22/27] udl: handle EDID failure properly Greg Kroah-Hartman
2013-04-15 2:43 ` [ 23/27] ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE section Greg Kroah-Hartman
2013-04-15 2:43 ` [ 24/27] sched_clock: Prevent 64bit inatomicity on 32bit systems Greg Kroah-Hartman
2013-04-15 2:43 ` Greg Kroah-Hartman [this message]
2013-04-15 2:43 ` [ 26/27] x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal Greg Kroah-Hartman
2013-04-15 2:43 ` [ 27/27] tty: dont deadlock while flushing workqueue Greg Kroah-Hartman
2013-04-15 19:37 ` Peter Hurley
2013-04-15 19:50 ` Greg Kroah-Hartman
2013-04-15 14:05 ` [ 00/27] 3.8.8-stable review Shuah Khan
2013-04-15 16:07 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130415024233.223446306@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=hpa@linux.intel.com \
--cc=jwboyer@redhat.com \
--cc=konrad.wilk@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=samu.kallio@aberdeencloud.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.