From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: Justin Forbes <jmforbes@linuxtx.org>,
Zwane Mwaikambo <zwane@arm.linux.org.uk>,
"Theodore Ts'o" <tytso@mit.edu>,
Randy Dunlap <rdunlap@xenotime.net>,
Dave Jones <davej@redhat.com>,
Chuck Wolber <chuckw@quantumlinux.com>,
Chris Wedgwood <reviews@ml.cw.f00f.org>,
Michael Krufky <mkrufky@linuxtv.org>,
Chuck Ebbert <cebbert@redhat.com>,
Domenico Andreoli <cavokz@gmail.com>, Willy Tarreau <w@1wt.eu>,
Rodrigo Rubira Branco <rbranco@la.checkpoint.com>,
Jake Edge <jake@lwn.net>, Eugene Teo <eteo@redhat.com>,
torvalds@linux-foundation.org, akpm@linux-foundation.org,
alan@lxorguk.ukuu.org.uk, Nick Piggin <npiggin@suse.de>
Subject: [patch 88/94] mm lockless pagecache barrier fix
Date: Thu, 15 Jan 2009 12:00:37 -0800
Message-ID: <20090115200037.GJ14419@kroah.com>
In-Reply-To: <20090115195520.GA14403@kroah.com>
[-- Attachment #1: mm-lockless-pagecache-barrier-fix.patch --]
[-- Type: text/plain, Size: 2950 bytes --]
2.6.28-stable review patch. If anyone has any objections, please let us know.
------------------
From: Nick Piggin <npiggin@suse.de>
commit e8c82c2e23e3527e0c9dc195e432c16784d270fa upstream.
An XFS workload showed up a bug in the lockless pagecache patch. Basically it
would go into an "infinite" loop, although it would sometimes be able to break
out of the loop! The reason is a missing compiler barrier in the "increment
reference count unless it was zero" case of the lockless pagecache protocol in
the gang lookup functions.
This would cause the compiler to retry the operation with a cached value of
the struct page pointer, rather than reload it. So the page might have been
removed from pagecache and freed (refcount==0) but the lookup would not correctly
notice the page is no longer in pagecache, and keep attempting to increment the
refcount and failing, until the page gets reallocated for something else. This
isn't a data corruption bug, because the condition will be detected if the page
has been reallocated. However, it can result in a lockup.
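For readers unfamiliar with the "increment reference count unless it was
zero" step, here is a minimal userspace sketch of that primitive using C11
atomics. It is not the kernel's implementation: struct demo_page and
demo_get_unless_zero() are hypothetical names, and the kernel uses its own
atomic_t helpers rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the reference count. */
struct demo_page {
	atomic_int count;
};

/*
 * Take a reference only if the count is not already zero.
 * Returns true on success, false if the object has already been freed
 * (count == 0), in which case the caller must redo its lookup.
 */
static bool demo_get_unless_zero(struct demo_page *p)
{
	int c = atomic_load(&p->count);

	while (c != 0) {
		/* On failure, c is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&p->count, &c, c + 1))
			return true;
	}
	return false;
}
```

A false return only tells the caller that this particular object is dead;
the caller still has to go back to the slot and load the pointer again,
which is the step the compiler was allowed to skip.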
Linus points out that ACCESS_ONCE is also required in that pointer load, even
if its absence is not causing a bug on our particular build. The most general
way to solve this is just to put an rcu_dereference in radix_tree_deref_slot.
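The effect of forcing the reload can be sketched in userspace C. This is an
illustration under assumptions, not the kernel code: DEMO_ACCESS_ONCE imitates
the volatile-cast idiom behind ACCESS_ONCE (it relies on the GCC/Clang
__typeof__ extension), and demo_page, demo_try_get() and demo_lookup() are
hypothetical. demo_try_get() also simulates a concurrent remover by clearing
the slot when it finds a dead page, so the example can run single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Imitation of the ACCESS_ONCE() idiom: force one real load of x. */
#define DEMO_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for struct page. */
struct demo_page {
	int count;
};

/*
 * Hypothetical "get unless zero". On failure it clears the slot, standing
 * in for another CPU that has removed the page from the tree.
 */
static int demo_try_get(struct demo_page *page, void **slot)
{
	if (page->count == 0) {
		*slot = NULL;	/* simulated concurrent removal */
		return 0;
	}
	page->count++;
	return 1;
}

/* Retry loop: the slot is re-read on every iteration. */
static struct demo_page *demo_lookup(void **slot)
{
	for (;;) {
		struct demo_page *page = DEMO_ACCESS_ONCE(*slot);

		if (page == NULL)
			return NULL;	/* nothing in the slot any more */
		if (demo_try_get(page, slot))
			return page;	/* reference taken */
		/* Refcount was zero: loop and reload the slot. */
	}
}
```

Without the forced load, a compiler that can see through the loop body may
keep the first value of *slot in a register and retry against the same dead
page, which matches the "before" assembly.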
Assembly of find_get_pages,
before:
.L220:
movq (%rbx), %rax #* ivtmp.1162, tmp82
movq (%rax), %rdi #, prephitmp.1149
.L218:
testb $1, %dil #, prephitmp.1149
jne .L217 #,
testq %rdi, %rdi # prephitmp.1149
je .L203 #,
cmpq $-1, %rdi #, prephitmp.1149
je .L217 #,
movl 8(%rdi), %esi # <variable>._count.counter, c
testl %esi, %esi # c
je .L218 #,
after:
.L212:
movq (%rbx), %rax #* ivtmp.1109, tmp81
movq (%rax), %rdi #, ret
testb $1, %dil #, ret
jne .L211 #,
testq %rdi, %rdi # ret
je .L197 #,
cmpq $-1, %rdi #, ret
je .L211 #,
movl 8(%rdi), %esi # <variable>._count.counter, c
testl %esi, %esi # c
je .L212 #,
(notice the obvious infinite loop in the first example, if page->count remains 0)
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
include/linux/radix-tree.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -136,7 +136,7 @@ do { \
*/
static inline void *radix_tree_deref_slot(void **pslot)
{
- void *ret = *pslot;
+ void *ret = rcu_dereference(*pslot);
if (unlikely(radix_tree_is_indirect_ptr(ret)))
ret = RADIX_TREE_RETRY;
return ret;