From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Josh Boyer <jwboyer@redhat.com>,
Samu Kallio <samu.kallio@aberdeencloud.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
"H. Peter Anvin" <hpa@linux.intel.com>
Subject: [ 25/27] x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates
Date: Sun, 14 Apr 2013 19:43:24 -0700 [thread overview]
Message-ID: <20130415024233.223446306@linuxfoundation.org> (raw)
In-Reply-To: <20130415024231.351969241@linuxfoundation.org>
3.8-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samu Kallio <samu.kallio@aberdeencloud.com>
commit 1160c2779b826c6f5c08e5cc542de58fd1f667d5 upstream.
In paravirtualized x86_64 kernels, vmalloc_fault may cause an oops
when lazy MMU updates are enabled, because set_pgd effects are being
deferred.
One instance of this problem is during process mm cleanup with memory
cgroups enabled. The chain of events is as follows:
- zap_pte_range enables lazy MMU updates
- zap_pte_range eventually calls mem_cgroup_charge_statistics,
which accesses the vmalloc'd mem_cgroup per-cpu stat area
- vmalloc_fault is triggered which tries to sync the corresponding
PGD entry with set_pgd, but the update is deferred
- vmalloc_fault oopses due to a mismatch in the PUD entries
The OOPs usually looks as so:
------------[ cut here ]------------
kernel BUG at arch/x86/mm/fault.c:396!
invalid opcode: 0000 [#1] SMP
.. snip ..
CPU 1
Pid: 10866, comm: httpd Not tainted 3.6.10-4.fc18.x86_64 #1
RIP: e030:[<ffffffff816271bf>] [<ffffffff816271bf>] vmalloc_fault+0x11f/0x208
.. snip ..
Call Trace:
[<ffffffff81627759>] do_page_fault+0x399/0x4b0
[<ffffffff81004f4c>] ? xen_mc_extend_args+0xec/0x110
[<ffffffff81624065>] page_fault+0x25/0x30
[<ffffffff81184d03>] ? mem_cgroup_charge_statistics.isra.13+0x13/0x50
[<ffffffff81186f78>] __mem_cgroup_uncharge_common+0xd8/0x350
[<ffffffff8118aac7>] mem_cgroup_uncharge_page+0x57/0x60
[<ffffffff8115fbc0>] page_remove_rmap+0xe0/0x150
[<ffffffff8115311a>] ? vm_normal_page+0x1a/0x80
[<ffffffff81153e61>] unmap_single_vma+0x531/0x870
[<ffffffff81154962>] unmap_vmas+0x52/0xa0
[<ffffffff81007442>] ? pte_mfn_to_pfn+0x72/0x100
[<ffffffff8115c8f8>] exit_mmap+0x98/0x170
[<ffffffff810050d9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[<ffffffff81059ce3>] mmput+0x83/0xf0
[<ffffffff810624c4>] exit_mm+0x104/0x130
[<ffffffff8106264a>] do_exit+0x15a/0x8c0
[<ffffffff810630ff>] do_group_exit+0x3f/0xa0
[<ffffffff81063177>] sys_exit_group+0x17/0x20
[<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
Calling arch_flush_lazy_mmu_mode immediately after set_pgd makes the
changes visible to the consistency checks.
RedHat-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=914737
Tested-by: Josh Boyer <jwboyer@redhat.com>
Reported-and-Tested-by: Krishna Raman <kraman@redhat.com>
Signed-off-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Link: http://lkml.kernel.org/r/1364045796-10720-1-git-send-email-konrad.wilk@oracle.com
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/mm/fault.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -378,10 +378,12 @@ static noinline __kprobes int vmalloc_fa
if (pgd_none(*pgd_ref))
return -1;
- if (pgd_none(*pgd))
+ if (pgd_none(*pgd)) {
set_pgd(pgd, *pgd_ref);
- else
+ arch_flush_lazy_mmu_mode();
+ } else {
BUG_ON(pgd_page_vaddr(*pgd) != pgd_page_vaddr(*pgd_ref));
+ }
/*
* Below here mismatches are bugs because these lower tables
next prev parent reply other threads:[~2013-04-15 2:44 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-15 2:42 [ 00/27] 3.8.8-stable review Greg Kroah-Hartman
2013-04-15 2:43 ` [ 01/27] ALSA: usb-audio: fix endianness bug in snd_nativeinstruments_* Greg Kroah-Hartman
2013-04-15 2:43 ` [ 02/27] ASoC: core: Fix to check return value of snd_soc_update_bits_locked() Greg Kroah-Hartman
2013-04-15 2:43 ` [ 03/27] ASoC: wm5102: Correct lookup of arizona struct in SYSCLK event Greg Kroah-Hartman
2013-04-15 2:43 ` [ 04/27] ASoC: wm8903: Fix the bypass to HP/LINEOUT when no DAC or ADC is running Greg Kroah-Hartman
2013-04-15 2:43 ` [ 05/27] tracing: Fix double free when function profile init failed Greg Kroah-Hartman
2013-04-15 2:43 ` [ 06/27] ARM: Kirkwood: Fix typo in the definition of ix2-200 rebuild LED Greg Kroah-Hartman
2013-04-15 2:43 ` [ 07/27] ARM: imx35 Bugfix admux clock Greg Kroah-Hartman
2013-04-15 2:43 ` [ 08/27] dmaengine: omap-dma: Start DMA without delay for cyclic channels Greg Kroah-Hartman
2013-04-15 2:43 ` [ 09/27] PM / reboot: call syscore_shutdown() after disable_nonboot_cpus() Greg Kroah-Hartman
2013-04-15 2:43 ` [ 10/27] Revert "brcmsmac: support 4313iPA" Greg Kroah-Hartman
2013-04-15 2:43 ` [ 11/27] ipc: set msg back to -EAGAIN if copy wasnt performed Greg Kroah-Hartman
2013-04-15 2:43 ` [ 12/27] GFS2: Fix unlock of fcntl locks during withdrawn state Greg Kroah-Hartman
2013-04-15 2:43 ` [ 13/27] GFS2: return error if malloc failed in gfs2_rs_alloc() Greg Kroah-Hartman
2013-04-15 2:43 ` [ 14/27] SCSI: libsas: fix handling vacant phy in sas_set_ex_phy() Greg Kroah-Hartman
2013-04-15 2:43 ` [ 15/27] cifs: Allow passwords which begin with a delimitor Greg Kroah-Hartman
2013-04-15 2:43 ` [ 16/27] target: Fix incorrect fallthrough of ALUA Standby/Offline/Transition CDBs Greg Kroah-Hartman
2013-04-15 2:43 ` [ 17/27] vfs: Revert spurious fix to spinning prevention in prune_icache_sb Greg Kroah-Hartman
2013-04-15 2:43 ` [ 18/27] kobject: fix kset_find_obj() race with concurrent last kobject_put() Greg Kroah-Hartman
2013-04-15 2:43 ` [ 19/27] gpio: fix wrong checking condition for gpio range Greg Kroah-Hartman
2013-04-15 2:43 ` [ 20/27] x86-32: Fix possible incomplete TLB invalidate with PAE pagetables Greg Kroah-Hartman
2013-04-15 2:43 ` [ 21/27] tracing: Fix possible NULL pointer dereferences Greg Kroah-Hartman
2013-04-15 2:43 ` [ 22/27] udl: handle EDID failure properly Greg Kroah-Hartman
2013-04-15 2:43 ` [ 23/27] ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE section Greg Kroah-Hartman
2013-04-15 2:43 ` [ 24/27] sched_clock: Prevent 64bit inatomicity on 32bit systems Greg Kroah-Hartman
2013-04-15 2:43 ` Greg Kroah-Hartman [this message]
2013-04-15 2:43 ` [ 26/27] x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal Greg Kroah-Hartman
2013-04-15 2:43 ` [ 27/27] tty: dont deadlock while flushing workqueue Greg Kroah-Hartman
2013-04-15 19:37 ` Peter Hurley
2013-04-15 19:50 ` Greg Kroah-Hartman
2013-04-15 14:05 ` [ 00/27] 3.8.8-stable review Shuah Khan
2013-04-15 16:07 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130415024233.223446306@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=hpa@linux.intel.com \
--cc=jwboyer@redhat.com \
--cc=konrad.wilk@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=samu.kallio@aberdeencloud.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox