From: Greg KH <gregkh@suse.de>
To: <linux-kernel@vger.kernel.org>, <stable@vger.kernel.org>
Cc: <torvalds@linux-foundation.org>, <akpm@linux-foundation.org>,
<alan@lxorguk.ukuu.org.uk>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>,
Ingo Molnar <mingo@elte.hu>
Subject: [082/104] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode
Date: Wed, 07 Dec 2011 08:12:24 -0800
Message-ID: <20111207161221.538192066@clark.kroah.org>
In-Reply-To: <20111207161246.GA10995@kroah.com>
3.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
commit 2cd1c8d4dc7ecca9e9431e2dabe41ae9c7d89e51 upstream.
Fix an outstanding issue that has been reported since 2.6.37.
On a heavily loaded machine, processing "fork()" calls could
crash with:
BUG: unable to handle kernel paging request at f573fc8c
IP: [<c01abc54>] swap_count_continued+0x104/0x180
*pdpt = 000000002a3b9027 *pde = 0000000001bed067 *pte = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in:
Pid: 1638, comm: apache2 Not tainted 3.0.4-linode37 #1
EIP: 0061:[<c01abc54>] EFLAGS: 00210246 CPU: 3
EIP is at swap_count_continued+0x104/0x180
.. snip..
Call Trace:
[<c01ac222>] ? __swap_duplicate+0xc2/0x160
[<c01040f7>] ? pte_mfn_to_pfn+0x87/0xe0
[<c01ac2e4>] ? swap_duplicate+0x14/0x40
[<c01a0a6b>] ? copy_pte_range+0x45b/0x500
[<c01a0ca5>] ? copy_page_range+0x195/0x200
[<c01328c6>] ? dup_mmap+0x1c6/0x2c0
[<c0132cf8>] ? dup_mm+0xa8/0x130
[<c013376a>] ? copy_process+0x98a/0xb30
[<c013395f>] ? do_fork+0x4f/0x280
[<c01573b3>] ? getnstimeofday+0x43/0x100
[<c010f770>] ? sys_clone+0x30/0x40
[<c06c048d>] ? ptregs_clone+0x15/0x48
[<c06bfb71>] ? syscall_call+0x7/0xb
The problem is that in copy_page_range() we turn lazy mode on,
and then in swap_entry_free() we call swap_count_continued()
which ends up in:
map = kmap_atomic(page, KM_USER0) + offset;
and then later we touch *map.
Since we are running in batched (lazy) mode, the PTE update is
queued rather than applied, so the kmap_atomic() mapping is not
set up synchronously and we end up dereferencing an address
whose PTE has not been set yet.
kmap_atomic_prot_pfn() already calls arch_flush_lazy_mmu_mode();
doing the same in kmap_atomic_prot() and __kunmap_atomic() makes
the problem go away.
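To illustrate the failure mode and the fix, here is a minimal
user-space sketch. It is a toy model, not kernel code: every
toy_* name is hypothetical, the "page table" is a plain array,
and toy_flush_lazy_mmu() merely stands in for what
arch_flush_lazy_mmu_mode() is assumed to do, namely apply all
queued PTE updates.

```c
/* User-space toy model of lazy MMU batching. NOT kernel code;
 * all toy_* names are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stdbool.h>

#define TOY_SLOTS 4
#define TOY_QUEUE 8

static unsigned long toy_pte[TOY_SLOTS];  /* the "live" page-table entries */
static bool toy_lazy;                     /* true while updates are batched */
static struct { int idx; unsigned long val; } toy_queue[TOY_QUEUE];
static int toy_queued;

/* Apply every queued update; stands in for arch_flush_lazy_mmu_mode(). */
static void toy_flush_lazy_mmu(void)
{
	for (int i = 0; i < toy_queued; i++)
		toy_pte[toy_queue[i].idx] = toy_queue[i].val;
	toy_queued = 0;
}

/* In lazy mode the update is only queued; otherwise it lands at once. */
static void toy_set_pte(int idx, unsigned long val)
{
	if (toy_lazy) {
		assert(toy_queued < TOY_QUEUE);
		toy_queue[toy_queued].idx = idx;
		toy_queue[toy_queued].val = val;
		toy_queued++;
	} else {
		toy_pte[idx] = val;
	}
}

/* Map a "page" into a slot. flush_now models the patched behaviour.
 * Returns true if the entry is live, i.e. a dereference would succeed. */
static bool toy_kmap(int idx, unsigned long val, bool flush_now)
{
	toy_set_pte(idx, val);
	if (flush_now)
		toy_flush_lazy_mmu();
	return toy_pte[idx] != 0;
}
```

With toy_lazy set, toy_kmap(0, 0x1000, false) returns false
because the entry is still sitting in the queue, which mirrors the
oops above. toy_kmap(1, 0x2000, true) returns true, and as a side
effect the earlier queued update for slot 0 is applied as well.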
Interestingly, commit b8bcfe997e4615 ("x86/paravirt: remove lazy
mode in interrupts") removed part of this to fix an interrupt
issue - but it went too far and did not consider this scenario.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
arch/x86/mm/highmem_32.c | 2 ++
1 file changed, 2 insertions(+)
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -45,6 +45,7 @@ void *kmap_atomic_prot(struct page *page
 	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
 	BUG_ON(!pte_none(*(kmap_pte-idx)));
 	set_pte(kmap_pte-idx, mk_pte(page, prot));
+	arch_flush_lazy_mmu_mode();
 
 	return (void *)vaddr;
 }
@@ -88,6 +89,7 @@ void __kunmap_atomic(void *kvaddr)
 		 */
 		kpte_clear_flush(kmap_pte-idx, vaddr);
 		kmap_atomic_idx_pop();
+		arch_flush_lazy_mmu_mode();
 	}
 #ifdef CONFIG_DEBUG_HIGHMEM
 	else {