From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Hin-Tak Leung <htl10@users.sourceforge.net>,
Sergei Antonov <saproj@gmail.com>,
Anton Altaparmakov <anton@tuxera.com>,
Sasha Levin <sasha.levin@oracle.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Christoph Hellwig <hch@infradead.org>,
Vyacheslav Dubeyko <slava@dubeyko.com>,
Sougata Santra <sougata@tuxera.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.10 30/56] hfs,hfsplus: cache pages correctly between bnode_create and bnode_free
Date: Tue, 29 Sep 2015 15:47:18 +0200 [thread overview]
Message-ID: <20150929134701.709736032@linuxfoundation.org> (raw)
In-Reply-To: <20150929134700.376714360@linuxfoundation.org>
3.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hin-Tak Leung <htl10@users.sourceforge.net>
commit 7cb74be6fd827e314f81df3c5889b87e4c87c569 upstream.
Pages looked up by __hfs_bnode_create() (called by hfs_bnode_create() and
hfs_bnode_find() for finding or creating pages corresponding to an inode)
are immediately kmap()'ed and used (both read and write) and kunmap()'ed,
and should not be page_cache_release()'ed until hfs_bnode_free().
This patch fixes a problem I first saw in July 2012: merely running "du"
on a large hfsplus-mounted directory a few times on a reasonably loaded
system would get the hfsplus driver all confused and complaining about
B-tree inconsistencies, and generates a "BUG: Bad page state". Most
recently, I can generate this problem on up-to-date Fedora 22 with shipped
kernel 4.0.5, by running "du /" (="/" + "/home" + "/mnt" + other smaller
mounts) and "du /mnt" simultaneously on two windows, where /mnt is a
lightly-used QEMU VM image of the full Mac OS X 10.9:
$ df -i / /home /mnt
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/fedora-root 3276800 551665 2725135 17% /
/dev/mapper/fedora-home 52879360 716221 52163139 2% /home
/dev/nbd0p2 4294967295 1387818 4293579477 1% /mnt
After applying the patch, I was able to run "du /" (60+ times) and "du
/mnt" (150+ times) continuously and simultaneously for 6+ hours.
There are many reports of the hfsplus driver getting confused under load
and generating "BUG: Bad page state" or other similar issues over the
years. [1]
The unpatched code [2] has always been wrong since it entered the kernel
tree. The only reason why it gets away with it is that the
kmap/memcpy/kunmap follow very quickly after the page_cache_release() so
the kernel has not had a chance to reuse the memory for something else,
most of the time.
The current RW driver appears to have followed the design and development
of the earlier read-only hfsplus driver [3], where-by version 0.1 (Dec
2001) had a B-tree node-centric approach to
read_cache_page()/page_cache_release() per bnode_get()/bnode_put(),
migrating towards version 0.2 (June 2002) of caching and releasing pages
per inode extents. When the current RW code first entered the kernel [2]
in 2005, there was an REF_PAGES conditional (and "//" commented out code)
to switch between B-node centric paging to inode-centric paging. There
was a mistake with the direction of one of the REF_PAGES conditionals in
__hfs_bnode_create(). In a subsequent "remove debug code" commit [4], the
read_cache_page()/page_cache_release() per bnode_get()/bnode_put() were
removed, but a page_cache_release() was mistakenly left in (propagating
the "REF_PAGES <-> !REF_PAGE" mistake), and the commented-out
page_cache_release() in bnode_release() (which should be spanned by
!REF_PAGES) was never enabled.
References:
[1]:
Michael Fox, Apr 2013
http://www.spinics.net/lists/linux-fsdevel/msg63807.html
("hfsplus volume suddenly inaccessable after 'hfs: recoff %d too large'")
Sasha Levin, Feb 2015
http://lkml.org/lkml/2015/2/20/85 ("use after free")
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/740814
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1027887
https://bugzilla.kernel.org/show_bug.cgi?id=42342
https://bugzilla.kernel.org/show_bug.cgi?id=63841
https://bugzilla.kernel.org/show_bug.cgi?id=78761
[2]:
http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
fs/hfs/bnode.c?id=d1081202f1d0ee35ab0beb490da4b65d4bc763db
commit d1081202f1d0ee35ab0beb490da4b65d4bc763db
Author: Andrew Morton <akpm@osdl.org>
Date: Wed Feb 25 16:17:36 2004 -0800
[PATCH] HFS rewrite
http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
fs/hfsplus/bnode.c?id=91556682e0bf004d98a529bf829d339abb98bbbd
commit 91556682e0bf004d98a529bf829d339abb98bbbd
Author: Andrew Morton <akpm@osdl.org>
Date: Wed Feb 25 16:17:48 2004 -0800
[PATCH] HFS+ support
[3]:
http://sourceforge.net/projects/linux-hfsplus/
http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.1/
http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.2/
http://linux-hfsplus.cvs.sourceforge.net/viewvc/linux-hfsplus/linux/\
fs/hfsplus/bnode.c?r1=1.4&r2=1.5
Date: Thu Jun 6 09:45:14 2002 +0000
Use buffer cache instead of page cache in bnode.c. Cache inode extents.
[4]:
http://git.kernel.org/cgit/linux/kernel/git/\
stable/linux-stable.git/commit/?id=a5e3985fa014029eb6795664c704953720cc7f7d
commit a5e3985fa014029eb6795664c704953720cc7f7d
Author: Roman Zippel <zippel@linux-m68k.org>
Date: Tue Sep 6 15:18:47 2005 -0700
[PATCH] hfs: remove debug code
Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Sougata Santra <sougata@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/hfs/bnode.c | 9 ++++-----
fs/hfsplus/bnode.c | 3 ---
2 files changed, 4 insertions(+), 8 deletions(-)
--- a/fs/hfs/bnode.c
+++ b/fs/hfs/bnode.c
@@ -288,7 +288,6 @@ static struct hfs_bnode *__hfs_bnode_cre
page_cache_release(page);
goto fail;
}
- page_cache_release(page);
node->page[i] = page;
}
@@ -398,11 +397,11 @@ node_error:
void hfs_bnode_free(struct hfs_bnode *node)
{
- //int i;
+ int i;
- //for (i = 0; i < node->tree->pages_per_bnode; i++)
- // if (node->page[i])
- // page_cache_release(node->page[i]);
+ for (i = 0; i < node->tree->pages_per_bnode; i++)
+ if (node->page[i])
+ page_cache_release(node->page[i]);
kfree(node);
}
--- a/fs/hfsplus/bnode.c
+++ b/fs/hfsplus/bnode.c
@@ -456,7 +456,6 @@ static struct hfs_bnode *__hfs_bnode_cre
page_cache_release(page);
goto fail;
}
- page_cache_release(page);
node->page[i] = page;
}
@@ -568,13 +567,11 @@ node_error:
void hfs_bnode_free(struct hfs_bnode *node)
{
-#if 0
int i;
for (i = 0; i < node->tree->pages_per_bnode; i++)
if (node->page[i])
page_cache_release(node->page[i]);
-#endif
kfree(node);
}
next prev parent reply other threads:[~2015-09-29 13:50 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-29 13:46 [PATCH 3.10 00/56] 3.10.90-stable review Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 01/56] unshare: Unsharing a thread does not require unsharing a vm Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 02/56] rtlwifi: rtl8192cu: Add new device ID Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 03/56] tg3: Fix temperature reporting Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 04/56] mac80211: enable assoc check for mesh interfaces Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 05/56] arm64: kconfig: Move LIST_POISON to a safe value Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 06/56] arm64: compat: fix vfp save/restore across signal handlers in big-endian Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 07/56] arm64: head.S: initialise mdcr_el2 in el2_setup Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 08/56] Input: synaptics - fix handling of disabling gesture mode Greg Kroah-Hartman
2015-09-29 13:57 ` Dmitry Torokhov
2015-09-29 14:18 ` Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 09/56] ALSA: hda - Enable headphone jack detect on old Fujitsu laptops Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 10/56] ALSA: hda - Use ALC880_FIXUP_FUJITSU for FSC Amilo M1437 Greg Kroah-Hartman
2015-09-29 13:46 ` [PATCH 3.10 11/56] powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hash Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 12/56] powerpc/rtas: Introduce rtas_get_sensor_fast() for IRQ handlers Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 13/56] Add radeon suspend/resume quirk for HP Compaq dc5750 Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 14/56] x86/mm: Initialize pmd_idx in page_table_range_init_count() Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 16/56] NFSv4: dont set SETATTR for O_RDONLY|O_EXCL Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 17/56] NFS: nfs_set_pgio_error sometimes misses errors Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 18/56] parisc: Filter out spurious interrupts in PA-RISC irq handler Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 19/56] vmscan: fix increasing nr_isolated incurred by putback unevictable pages Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 20/56] fs: if a coredump already exists, unlink and recreate with O_EXCL Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 21/56] mmc: core: fix race condition in mmc_wait_data_done Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 22/56] md/raid10: always set reshape_safe when initializing reshape_position Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 23/56] xen/gntdev: convert priv->lock to a mutex Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 24/56] hfs: fix B-tree corruption after insertion at position 0 Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 25/56] IB/uverbs: reject invalid or unknown opcodes Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 26/56] IB/uverbs: Fix race between ib_uverbs_open and remove_one Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 27/56] IB/mlx4: Forbid using sysfs to change RoCE pkeys Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 28/56] IB/mlx4: Use correct SL on AH query under RoCE Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 29/56] stmmac: fix check for phydev being open Greg Kroah-Hartman
2015-09-30 11:20 ` Sergei Shtylyov
2015-09-29 13:47 ` Greg Kroah-Hartman [this message]
2015-09-29 13:47 ` [PATCH 3.10 31/56] sctp: fix ASCONF list handling Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 32/56] vhost/scsi: potential memory corruption Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 33/56] x86: bpf_jit: fix compilation of large bpf programs Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 34/56] ipv6: Make MLD packets to only be processed locally Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 35/56] net/tipc: initialize security state for new connection socket Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 36/56] bridge: mdb: zero out the local br_ip variable before use Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 37/56] net: pktgen: fix race between pktgen_thread_worker() and kthread_stop() Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 38/56] net: call rcu_read_lock early in process_backlog Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 39/56] net: Clone skb before setting peeked flag Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 40/56] net: Fix skb csum races when peeking Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 41/56] net: Fix skb_set_peeked use-after-free bug Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 42/56] bridge: mdb: fix double add notification Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 43/56] isdn/gigaset: reset tty->receive_room when attaching ser_gigaset Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 44/56] ipv6: lock socket in ip6_datagram_connect() Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 45/56] bonding: fix destruction of bond with devices different from arphrd_ether Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 46/56] inet: frags: fix defragmented packets IP header for af_packet Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 47/56] netlink: dont hold mutex in rcu callback when releasing mmapd ring Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 48/56] rds: fix an integer overflow test in rds_info_getsockopt() Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 49/56] ip6_gre: release cached dst on tunnel removal Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 50/56] usbnet: Get EVENT_NO_RUNTIME_PM bit before it is cleared Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 51/56] ipv6: fix exthdrs offload registration in out_rt path Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 52/56] net/ipv6: Correct PIM6 mrt_lock handling Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 53/56] sctp: fix race on protocol/netns initialization Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 54/56] fib_rules: fix fib rule dumps across multiple skbs Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 55/56] vfs: Remove incorrect debugging WARN in prepend_path Greg Kroah-Hartman
2015-09-29 13:47 ` [PATCH 3.10 56/56] Revert "iio: bmg160: IIO_BUFFER and IIO_TRIGGERED_BUFFER are required" Greg Kroah-Hartman
2015-09-29 16:53 ` [PATCH 3.10 00/56] 3.10.90-stable review Shuah Khan
2015-09-29 21:14 ` Guenter Roeck
2015-09-30 5:45 ` Sudip Mukherjee
[not found] ` <562a7d97.a9c6b40a.4a84c.46d4@mx.google.com>
2015-10-23 18:36 ` Kevin Hilman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150929134701.709736032@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=anton@tuxera.com \
--cc=hch@infradead.org \
--cc=htl10@users.sourceforge.net \
--cc=linux-kernel@vger.kernel.org \
--cc=saproj@gmail.com \
--cc=sasha.levin@oracle.com \
--cc=slava@dubeyko.com \
--cc=sougata@tuxera.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).