From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Hin-Tak Leung <htl10@users.sourceforge.net>,
	Sergei Antonov <saproj@gmail.com>,
	Anton Altaparmakov <anton@tuxera.com>,
	Sasha Levin <sasha.levin@oracle.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	Vyacheslav Dubeyko <slava@dubeyko.com>,
	Sougata Santra <sougata@tuxera.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.10 30/56] hfs,hfsplus: cache pages correctly between bnode_create and bnode_free
Date: Tue, 29 Sep 2015 15:47:18 +0200
Message-ID: <20150929134701.709736032@linuxfoundation.org>
In-Reply-To: <20150929134700.376714360@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hin-Tak Leung <htl10@users.sourceforge.net>

commit 7cb74be6fd827e314f81df3c5889b87e4c87c569 upstream.

Pages looked up by __hfs_bnode_create() (called by hfs_bnode_create() and
hfs_bnode_find() for finding or creating pages corresponding to an inode)
are immediately kmap()'ed and used (both read and write) and kunmap()'ed,
and should not be page_cache_release()'ed until hfs_bnode_free().

This patch fixes a problem I first saw in July 2012: merely running "du"
on a large hfsplus-mounted directory a few times on a reasonably loaded
system would confuse the hfsplus driver, which then complained about
B-tree inconsistencies and generated a "BUG: Bad page state".  Most
recently, I could reproduce the problem on an up-to-date Fedora 22 with
the shipped kernel 4.0.5 by running "du /" (="/" + "/home" + "/mnt" +
other smaller mounts) and "du /mnt" simultaneously in two windows, where
/mnt is a lightly-used QEMU VM image of the full Mac OS X 10.9:

$ df -i / /home /mnt
Filesystem                  Inodes   IUsed      IFree IUse% Mounted on
/dev/mapper/fedora-root    3276800  551665    2725135   17% /
/dev/mapper/fedora-home   52879360  716221   52163139    2% /home
/dev/nbd0p2             4294967295 1387818 4293579477    1% /mnt

After applying the patch, I was able to run "du /" (60+ times) and "du
/mnt" (150+ times) continuously and simultaneously for 6+ hours.

There are many reports of the hfsplus driver getting confused under load
and generating "BUG: Bad page state" or other similar issues over the
years.  [1]

The unpatched code [2] has been wrong ever since it entered the kernel
tree.  It gets away with it only because the kmap/memcpy/kunmap follow
very quickly after the page_cache_release(), so most of the time the
kernel has not had a chance to reuse the memory for something else.

The current RW driver appears to have followed the design and development
of the earlier read-only hfsplus driver [3], whereby version 0.1 (Dec
2001) had a B-tree node-centric approach of
read_cache_page()/page_cache_release() per bnode_get()/bnode_put(),
migrating in version 0.2 (June 2002) towards caching and releasing pages
per inode extents.  When the current RW code first entered the kernel [2]
in 2004, there was a REF_PAGES conditional (and "//" commented-out code)
to switch between B-node-centric paging and inode-centric paging.  The
direction of one of the REF_PAGES conditionals in __hfs_bnode_create()
was wrong.  In a subsequent "remove debug code" commit [4], the
read_cache_page()/page_cache_release() per bnode_get()/bnode_put() were
removed, but a page_cache_release() was mistakenly left in (propagating
the "REF_PAGES <-> !REF_PAGES" mistake), and the commented-out
page_cache_release() in bnode_release() (which should have been guarded
by !REF_PAGES) was never enabled.

References:
[1]:
Michael Fox, Apr 2013
http://www.spinics.net/lists/linux-fsdevel/msg63807.html
("hfsplus volume suddenly inaccessable after 'hfs: recoff %d too large'")

Sasha Levin, Feb 2015
http://lkml.org/lkml/2015/2/20/85 ("use after free")

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/740814
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1027887
https://bugzilla.kernel.org/show_bug.cgi?id=42342
https://bugzilla.kernel.org/show_bug.cgi?id=63841
https://bugzilla.kernel.org/show_bug.cgi?id=78761

[2]:
http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
fs/hfs/bnode.c?id=d1081202f1d0ee35ab0beb490da4b65d4bc763db
commit d1081202f1d0ee35ab0beb490da4b65d4bc763db
Author: Andrew Morton <akpm@osdl.org>
Date:   Wed Feb 25 16:17:36 2004 -0800

    [PATCH] HFS rewrite

http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
fs/hfsplus/bnode.c?id=91556682e0bf004d98a529bf829d339abb98bbbd

commit 91556682e0bf004d98a529bf829d339abb98bbbd
Author: Andrew Morton <akpm@osdl.org>
Date:   Wed Feb 25 16:17:48 2004 -0800

    [PATCH] HFS+ support

[3]:
http://sourceforge.net/projects/linux-hfsplus/

http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.1/
http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.2/

http://linux-hfsplus.cvs.sourceforge.net/viewvc/linux-hfsplus/linux/\
fs/hfsplus/bnode.c?r1=1.4&r2=1.5

Date:   Thu Jun 6 09:45:14 2002 +0000
Use buffer cache instead of page cache in bnode.c. Cache inode extents.

[4]:
http://git.kernel.org/cgit/linux/kernel/git/\
stable/linux-stable.git/commit/?id=a5e3985fa014029eb6795664c704953720cc7f7d

commit a5e3985fa014029eb6795664c704953720cc7f7d
Author: Roman Zippel <zippel@linux-m68k.org>
Date:   Tue Sep 6 15:18:47 2005 -0700

    [PATCH] hfs: remove debug code

Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Sougata Santra <sougata@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/hfs/bnode.c     |    9 ++++-----
 fs/hfsplus/bnode.c |    3 ---
 2 files changed, 4 insertions(+), 8 deletions(-)

--- a/fs/hfs/bnode.c
+++ b/fs/hfs/bnode.c
@@ -288,7 +288,6 @@ static struct hfs_bnode *__hfs_bnode_cre
 			page_cache_release(page);
 			goto fail;
 		}
-		page_cache_release(page);
 		node->page[i] = page;
 	}
 
@@ -398,11 +397,11 @@ node_error:
 
 void hfs_bnode_free(struct hfs_bnode *node)
 {
-	//int i;
+	int i;
 
-	//for (i = 0; i < node->tree->pages_per_bnode; i++)
-	//	if (node->page[i])
-	//		page_cache_release(node->page[i]);
+	for (i = 0; i < node->tree->pages_per_bnode; i++)
+		if (node->page[i])
+			page_cache_release(node->page[i]);
 	kfree(node);
 }
 
--- a/fs/hfsplus/bnode.c
+++ b/fs/hfsplus/bnode.c
@@ -456,7 +456,6 @@ static struct hfs_bnode *__hfs_bnode_cre
 			page_cache_release(page);
 			goto fail;
 		}
-		page_cache_release(page);
 		node->page[i] = page;
 	}
 
@@ -568,13 +567,11 @@ node_error:
 
 void hfs_bnode_free(struct hfs_bnode *node)
 {
-#if 0
 	int i;
 
 	for (i = 0; i < node->tree->pages_per_bnode; i++)
 		if (node->page[i])
 			page_cache_release(node->page[i]);
-#endif
 	kfree(node);
 }
 


