[Bug 75881] New: lazyinit failure on new mdadm raid5 & encrypted array

linux-ext4.vger.kernel.org archive mirror

* [Bug 75881] New: lazyinit failure on new mdadm raid5 & encrypted array
@ 2014-05-10 18:47 bugzilla-daemon
  2014-05-11 19:05 ` [Bug 75881] " bugzilla-daemon
  2014-05-12  5:09 ` bugzilla-daemon
  0 siblings, 2 replies; 3+ messages in thread
From: bugzilla-daemon @ 2014-05-10 18:47 UTC
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=75881

            Bug ID: 75881
           Summary: lazyinit failure on new mdadm raid5 & encrypted array
           Product: File System
           Version: 2.5
    Kernel Version: 3.13.10
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: thomas_reardon@hotmail.com
        Regression: No

Created attachment 135751
  --> https://bugzilla.kernel.org/attachment.cgi?id=135751&action=edit
kernel log, ext4 fs params

ext4 lazyinit fails with corruption when competing with mdadm raid5
reconstruction.  In the _default_ scenario, mdadm does not fully initialize a
new raid5 array; instead it starts the array degraded and treats the last
drive as a spare that is rebuilt in the background.

Adding to the load, the volume was an encrypted LUKS volume (at
/dev/mapper/private).

So the order of events:
1) create new 4x4TB RAID5 array via mdadm.  volume is started degraded and
starts rebuilding while allowing access.
2) layer new encrypted LUKS volume: /dev/md1 -> /dev/mapper/private
3) create EXT4 volume with 64bit,sparse_super2,
4) mount volume with journal_async_commit
5) a slow rsync is started in background, averaging 5MB/s
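
For anyone trying to reproduce this, the five steps above can be sketched as
the command list below. This is an approximation under stated assumptions, not
the reporter's actual commands: the member devices, mount point and rsync
source are hypothetical placeholders, and the feature list in step 3 is cut
off in the report, so only the two features that are named appear here. The
script only builds and prints the commands; it does not run them.

```python
import shlex


def reproduction_commands(members, md_dev="/dev/md1", mapper_name="private",
                          mount_point="/mnt/private", source_dir="/srv/source"):
    """Build (but do not run) commands approximating steps 1-5 of the report.

    members, mount_point and source_dir are hypothetical placeholders;
    md_dev and mapper_name follow the names given in the report.
    """
    mapper_dev = f"/dev/mapper/{mapper_name}"
    return [
        # 1) New RAID5 array; by default mdadm starts it degraded and rebuilds.
        ["mdadm", "--create", md_dev, "--level=5",
         f"--raid-devices={len(members)}", *members],
        # 2) Layer a LUKS volume on top of the md device.
        ["cryptsetup", "luksFormat", md_dev],
        ["cryptsetup", "luksOpen", md_dev, mapper_name],
        # 3) ext4 with the features named in the report (the list is truncated there).
        ["mkfs.ext4", "-O", "64bit,sparse_super2", mapper_dev],
        # 4) Mount with journal_async_commit.
        ["mount", "-o", "journal_async_commit", mapper_dev, mount_point],
        # 5) Background copy; the 5MB/s in the report was an observed average,
        #    so no rate limit is set here.
        ["rsync", "-a", f"{source_dir}/", f"{mount_point}/"],
    ]


if __name__ == "__main__":
    # Hypothetical member disks, for illustration only.
    for cmd in reproduction_commands(["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"]):
        print(shlex.join(cmd))
```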

Please see the attached kernel log and ext4 fs params.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 75881] lazyinit failure on new mdadm raid5 & encrypted array
  2014-05-10 18:47 [Bug 75881] New: lazyinit failure on new mdadm raid5 & encrypted array bugzilla-daemon
@ 2014-05-11 19:05 ` bugzilla-daemon
  2014-05-12  5:09 ` bugzilla-daemon
  1 sibling, 0 replies; 3+ messages in thread
From: bugzilla-daemon @ 2014-05-11 19:05 UTC
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=75881

Theodore Tso <tytso@mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu

--- Comment #1 from Theodore Tso <tytso@mit.edu> ---
Is this something you can reliably reproduce?  The log doesn't tell us anything
useful, and it's not clear whether the problem is with the dm-crypt (i.e.,
LUKS) layer, or with the ext4 layer.   All the log tells us is that we are
waiting forever for a block I/O operation to finish in the jbd2 commit thread,
and this is causing the lazyinit thread to give a soft lockup warning (meaning
that two minutes have gone by without any forward progress taking place).

I suspect the problem is in an interaction between the dm-crypt and md raid5
code, which is being tickled by the I/O patterns that you've described.  But
before we kick this over to the device mapper developers, the first
question is whether you can reliably reproduce the problem.
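
When trying to reproduce, it may help to pull the stalled tasks out of the
kernel log automatically. The sketch below assumes the two-minute warning
mentioned above is the kernel's hung-task report ("INFO: task NAME:PID blocked
for more than N seconds", with a default timeout of 120 seconds); the attached
log is not reproduced in this thread, so that is an assumption.
find_blocked_tasks is a hypothetical helper name.

```python
import re

# Matches the kernel's hung-task report line. Task names may contain
# slashes and dashes (jbd2 commit threads are named after their device).
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_blocked_tasks(log_text):
    """Return (task name, pid, seconds) for each hung-task report in log_text."""
    return [
        (m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]
```

Feeding it the output of dmesg, or the saved attachment, would list which
threads were blocked, which is the information needed to tell whether the jbd2
commit thread stalls first.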


* [Bug 75881] lazyinit failure on new mdadm raid5 & encrypted array
  2014-05-10 18:47 [Bug 75881] New: lazyinit failure on new mdadm raid5 & encrypted array bugzilla-daemon
  2014-05-11 19:05 ` [Bug 75881] " bugzilla-daemon
@ 2014-05-12  5:09 ` bugzilla-daemon
  1 sibling, 0 replies; 3+ messages in thread
From: bugzilla-daemon @ 2014-05-12  5:09 UTC
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=75881

--- Comment #2 from TR Reardon <thomas_reardon@hotmail.com> ---
I cannot reproduce.  A quick glance at the stack made me fear that this was not
an obvious ext4 problem but an interaction bug between block layers.  Still, I
thought I should submit it in case it proved useful; it is hard to reproduce
errors that occur while building 16TB arrays on production servers.

