stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Andreas Rohner <andreas.rohner@gmx.net>,
	Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.10 14/23] nilfs2: fix segctor bug that causes file system corruption
Date: Thu, 23 Jan 2014 10:39:47 -0800	[thread overview]
Message-ID: <20140123183901.852158418@linuxfoundation.org> (raw)
In-Reply-To: <20140123183859.635713053@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andreas Rohner <andreas.rohner@gmx.net>

commit 70f2fe3a26248724d8a5019681a869abdaf3e89a upstream.

There is a bug in the function nilfs_segctor_collect, which results in
active data being written to a segment, that is marked as clean.  It is
possible, that this segment is selected for a later segment
construction, whereby the old data is overwritten.

The problem shows itself with the following kernel log message:

  nilfs_sufile_do_cancel_free: segment 6533 must be clean

Usually a few hours later the file system gets corrupted:

  NILFS: bad btree node (blocknr=8748107): level = 0, flags = 0x0, nchildren = 0
  NILFS error (device sdc1): nilfs_bmap_last_key: broken bmap (inode number=114660)

The issue can be reproduced with a file system that is nearly full and
with the cleaner running, while some IO intensive task is running.
Although it is quite hard to reproduce.

This is what happens:

 1. The cleaner starts the segment construction
 2. nilfs_segctor_collect is called
 3. sc_stage is on NILFS_ST_SUFILE and segments are freed
 4. sc_stage is on NILFS_ST_DAT current segment is full
 5. nilfs_segctor_extend_segments is called, which
    allocates a new segment
 6. The new segment is one of the segments freed in step 3
 7. nilfs_sufile_cancel_freev is called and produces an error message
 8. Loop around and the collection starts again
 9. sc_stage is on NILFS_ST_SUFILE and segments are freed
    including the newly allocated segment, which will contain active
    data and can be allocated at a later time
10. A few hours later another segment construction allocates the
    segment and causes file system corruption

This can be prevented by simply reordering the statements.  If
nilfs_sufile_cancel_freev is called before nilfs_segctor_extend_segments
the freed segments are marked as dirty and cannot be allocated any more.

Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Reviewed-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/nilfs2/segment.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -1440,17 +1440,19 @@ static int nilfs_segctor_collect(struct
 
 		nilfs_clear_logs(&sci->sc_segbufs);
 
-		err = nilfs_segctor_extend_segments(sci, nilfs, nadd);
-		if (unlikely(err))
-			return err;
-
 		if (sci->sc_stage.flags & NILFS_CF_SUFREED) {
 			err = nilfs_sufile_cancel_freev(nilfs->ns_sufile,
 							sci->sc_freesegs,
 							sci->sc_nfreesegs,
 							NULL);
 			WARN_ON(err); /* do not happen */
+			sci->sc_stage.flags &= ~NILFS_CF_SUFREED;
 		}
+
+		err = nilfs_segctor_extend_segments(sci, nilfs, nadd);
+		if (unlikely(err))
+			return err;
+
 		nadd = min_t(int, nadd << 1, SC_MAX_SEGDELTA);
 		sci->sc_stage = prev_stage;
 	}



  parent reply	other threads:[~2014-01-23 18:39 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-23 18:39 [PATCH 3.10 00/23] 3.10.28-stable review Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 01/23] ARM: 7815/1: kexec: offline non panic CPUs on Kdump panic Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 03/23] perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 04/23] GFS2: Increase i_writecount during gfs2_setattr_chown Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 05/23] mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 06/23] staging: comedi: addi_apci_1032: fix subdevice type/flags bug Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 07/23] staging: comedi: adl_pci9111: fix incorrect irq passed to request_irq() Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 08/23] vfs: In d_path dont call d_dname on a mount point Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 09/23] hwmon: (coretemp) Fix truncated name of alarm attributes Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 10/23] writeback: Fix data corruption on NFS Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 11/23] SELinux: Fix possible NULL pointer dereference in selinux_inode_permission() Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 12/23] ftrace/x86: Load ftrace_ops in parameter not the variable holding it Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 13/23] thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only Greg Kroah-Hartman
2014-01-23 18:39 ` Greg Kroah-Hartman [this message]
2014-01-23 18:39 ` [PATCH 3.10 15/23] drm/i915: fix DDI PLLs HW state readout code Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 16/23] md: fix problem when adding device to read-only array with bitmap Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 17/23] md/raid10: fix bug when raid10 recovery fails to recover a block Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 18/23] md/raid10: fix two bugs in handling of known-bad-blocks Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 19/23] md/raid5: Fix possible confusion when multiple write errors occur Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 20/23] mm: Make {,set}page_address() static inline if WANT_PAGE_VIRTUAL Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 21/23] serial: amba-pl011: use port lock to guard control register access Greg Kroah-Hartman
2014-01-23 18:39 ` [PATCH 3.10 23/23] ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling Greg Kroah-Hartman
2014-01-23 23:35 ` [PATCH 3.10 00/23] 3.10.28-stable review Guenter Roeck
2014-01-24  4:12   ` Greg Kroah-Hartman
2014-01-24 15:18 ` Shuah Khan
2014-01-24 16:43   ` Greg Kroah-Hartman
2014-01-25 14:03 ` Satoru Takeuchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140123183901.852158418@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=andreas.rohner@gmx.net \
    --cc=konishi.ryusuke@lab.ntt.co.jp \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).