From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>,
Anthony Doggett <Anthony2486@interfaces.org.uk>,
Vyacheslav Dubeyko <slava@dubeyko.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [ 19/33] nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary
Date: Wed, 5 Jun 2013 13:52:57 -0700
Message-ID: <20130605204706.836625177@linuxfoundation.org>
In-Reply-To: <20130605204702.359510786@linuxfoundation.org>
3.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
commit 136e8770cd5d1fe38b3c613100dd6dc4db6d4fa6 upstream.
nilfs2: fix issue of nilfs_set_page_dirty for page at EOF boundary
DESCRIPTION:
In some use cases, a NILFS2 file system formatted with a block size
smaller than 4 KB can be remounted in RO mode after encountering a
"broken bmap" issue.
The issue was reported by Anthony Doggett <Anthony2486@interfaces.org.uk>:
"The machine I've been trialling nilfs on is running Debian Testing,
Linux version 3.2.0-4-686-pae (debian-kernel@lists.debian.org) (gcc
version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.35-2), but I've
also reproduced it (identically) with Debian Unstable amd64 and Debian
Experimental (using the 3.8-trunk kernel). The problematic partitions
were formatted with "mkfs.nilfs2 -b 1024 -B 8192"."
SYMPTOMS:
(1) The system log contains error messages like:
[63102.496756] nilfs_direct_assign: invalid pointer: 0
[63102.496786] NILFS error (device dm-17): nilfs_bmap_assign: broken bmap (inode number=28)
[63102.496798]
[63102.524403] Remounting filesystem read-only
(2) The NILFS2 file system is remounted in RO mode.
REPRODUCING PATH:
(1) Create a volume group named "unencrypted" with the vgcreate utility.
(2) Run script (prepared by Anthony Doggett <Anthony2486@interfaces.org.uk>):
----------------[BEGIN SCRIPT]--------------------
VG=unencrypted
lvcreate --size 2G --name ntest $VG
mkfs.nilfs2 -b 1024 -B 8192 /dev/mapper/$VG-ntest
mkdir /var/tmp/n
mkdir /var/tmp/n/ntest
mount /dev/mapper/$VG-ntest /var/tmp/n/ntest
mkdir /var/tmp/n/ntest/thedir
cd /var/tmp/n/ntest/thedir
sleep 2
date
darcs init
sleep 2
dmesg|tail -n 5
date
darcs whatsnew || true
date
sleep 2
dmesg|tail -n 5
----------------[END SCRIPT]--------------------
REPRODUCIBILITY: 100%
INVESTIGATION:
Investigation showed that the issue occurs during segment construction
after the following sequence of user-space operations:
open("_darcs/index", O_RDWR|O_CREAT|O_NOCTTY, 0666) = 7
fstat(7, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
ftruncate(7, 60)
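For readers who want to replay the traced calls without darcs, the
following is a minimal sketch in C. The helper name
replay_index_sequence() is hypothetical, and the mmap() write at the end
is an assumption taken from the root-cause analysis in this message
(the trace itself shows only open, fstat and ftruncate). It is not
claimed to trigger the bug on its own.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Replays open/fstat/ftruncate on `path`, then dirties the first byte
 * through a shared mapping. Returns 0 on success, -1 on failure.
 * The mmap step is an assumption; it is not part of the trace.
 */
static int replay_index_sequence(const char *path, off_t len)
{
	struct stat st;
	char *p;
	int fd;

	assert(len > 0);	/* mmap() rejects a zero-length mapping */

	fd = open(path, O_RDWR | O_CREAT | O_NOCTTY, 0666);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || ftruncate(fd, len) < 0)
		goto fail;

	p = mmap(NULL, (size_t)len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto fail;
	p[0] = 1;	/* write fault on the page at the EOF boundary */
	munmap(p, (size_t)len);

	return close(fd);
fail:
	close(fd);
	return -1;
}
```

Calling replay_index_sequence("_darcs/index", 60) inside the test
directory mirrors the 60-byte truncate from the trace.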
The error message "NILFS error (device dm-17): nilfs_bmap_assign: broken
bmap (inode number=28)" is raised when trying to get the block number
for the third block of the file, at logical offset 3072 bytes. As the
output above shows, the file is only 60 bytes long, so a single block
(1 KB in size) is enough for the whole file. Several blocks are
processed instead of one because nilfs_segctor_scan_file() finds
several dirty buffers for this file.
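The block arithmetic behind this observation can be checked with a small
sketch. The helper names blocks_for_size() and block_index() are
hypothetical and exist only for illustration; blkbits = 10 corresponds
to the 1 KB block size chosen with "mkfs.nilfs2 -b 1024".

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks needed to hold `size` bytes with 2^blkbits-byte blocks. */
static uint64_t blocks_for_size(uint64_t size, unsigned blkbits)
{
	assert(blkbits < 64);
	return (size + (1ULL << blkbits) - 1) >> blkbits;
}

/* Zero-based index of the block that contains byte offset `pos`. */
static uint64_t block_index(uint64_t pos, unsigned blkbits)
{
	assert(blkbits < 64);
	return pos >> blkbits;
}
```

With blkbits = 10, blocks_for_size(60, 10) yields one block, while
block_index(3072, 10) yields a zero-based index of 3, well beyond the
only block that lies inside EOF.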
The root cause of this issue is in the nilfs_set_page_dirty() function,
which is called just before writing to an mmapped page.
When nilfs_page_mkwrite() handles a page at the EOF boundary, it fills
hole blocks only inside EOF through __block_page_mkwrite().
The __block_page_mkwrite() function calls set_page_dirty() after filling
hole blocks, thus nilfs_set_page_dirty() (= a_ops->set_page_dirty) is
called. However, the current implementation of nilfs_set_page_dirty()
wrongly marks all buffers dirty, even for a page at the EOF boundary.
As a result, buffers outside EOF are inconsistently marked dirty and
queued for write even though they were never mapped by
nilfs_get_block().
FIX:
This modifies nilfs_set_page_dirty() not to mark hole blocks dirty.
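The difference between the old and the new behaviour can be illustrated
with a small user-space model. This is only a sketch: struct model_bh
and the model_* functions are hypothetical stand-ins, not kernel APIs,
and they keep just the mapped and dirty state plus the circular
per-page buffer list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head. */
struct model_bh {
	bool mapped;
	bool dirty;
	struct model_bh *next;	/* circular, like b_this_page */
};

/* Link n buffers into a ring; only the first n_mapped are mapped. */
static void model_init_page(struct model_bh *bufs, size_t n, size_t n_mapped)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		bufs[i].mapped = i < n_mapped;
		bufs[i].dirty = false;
		bufs[i].next = &bufs[(i + 1) % n];
	}
}

/* Old behaviour: every buffer on the page is dirtied and counted. */
static unsigned model_dirty_all(struct model_bh *head)
{
	struct model_bh *bh = head;
	unsigned nr = 0;

	do {
		bh->dirty = true;
		nr++;
		bh = bh->next;
	} while (bh != head);
	return nr;
}

/* New behaviour: skip buffers that are already dirty or are holes. */
static unsigned model_dirty_mapped(struct model_bh *head)
{
	struct model_bh *bh = head;
	unsigned nr = 0;

	do {
		if (!bh->dirty && bh->mapped) {
			bh->dirty = true;
			nr++;
		}
		bh = bh->next;
	} while (bh != head);
	return nr;
}
```

For a 4 KB page with 1 KB blocks where only the first buffer is mapped
(the 60-byte file above), model_dirty_all() reports four dirty buffers,
whereas model_dirty_mapped() reports one and leaves the three hole
buffers clean.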
Thanks to Vyacheslav Dubeyko for his effort on analysis and proposals
for this issue.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reported-by: Anthony Doggett <Anthony2486@interfaces.org.uk>
Reported-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/nilfs2/inode.c | 27 +++++++++++++++++++++++----
1 file changed, 23 insertions(+), 4 deletions(-)
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -195,13 +195,32 @@ static int nilfs_writepage(struct page *
 
 static int nilfs_set_page_dirty(struct page *page)
 {
-	int ret = __set_page_dirty_buffers(page);
+	int ret = __set_page_dirty_nobuffers(page);
 
-	if (ret) {
+	if (page_has_buffers(page)) {
 		struct inode *inode = page->mapping->host;
-		unsigned nr_dirty = 1 << (PAGE_SHIFT - inode->i_blkbits);
+		unsigned nr_dirty = 0;
+		struct buffer_head *bh, *head;
 
-		nilfs_set_file_dirty(inode, nr_dirty);
+		/*
+		 * This page is locked by callers, and no other thread
+		 * concurrently marks its buffers dirty since they are
+		 * only dirtied through routines in fs/buffer.c in
+		 * which call sites of mark_buffer_dirty are protected
+		 * by page lock.
+		 */
+		bh = head = page_buffers(page);
+		do {
+			/* Do not mark hole blocks dirty */
+			if (buffer_dirty(bh) || !buffer_mapped(bh))
+				continue;
+
+			set_buffer_dirty(bh);
+			nr_dirty++;
+		} while (bh = bh->b_this_page, bh != head);
+
+		if (nr_dirty)
+			nilfs_set_file_dirty(inode, nr_dirty);
 	}
 	return ret;
 }