From: bugzilla-daemon@bugzilla.kernel.org
To: linux-ext4@vger.kernel.org
Subject: [Bug 102731] I have a cough.
Date: Wed, 30 Sep 2015 09:49:21 +0000
Message-ID: <bug-102731-13602-ThvJJxnjtl@https.bugzilla.kernel.org/>
In-Reply-To: <bug-102731-13602@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=102731
--- Comment #14 from John Hughes <john@calva.com> ---
On 28/09/15 19:06, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=102731
>
> --- Comment #13 from Theodore Tso <tytso@mit.edu> ---
> So it's been 12 days, and previously when you were using the Debian 3.16
> kernel, it was triggering once every four days, right? Can I assume that your
> silence indicates that you haven't seen a problem to date?
I haven't seen the problem, but unfortunately I'm running 3.18.19 at the
moment (I screwed up on the last boot and let it boot the default
kernel). I haven't had time to reboot. So I'd like to give it a bit
more time.
>
> If so, then it really does seem that it might be an interaction between LVM/MD
> and KVM.
>
> So if that's the case, then the next thing to ask is to try to figure out what
> might be the triggering cause. A couple of things come to mind:
>
> 1) Some failure to properly handle a flush cache command being sent to the MD
> device. This, combined with either a power failure or a crash of the guest OS
> (depending on how KVM is configured), might explain a block update getting
> lost. The fact that the block bitmap is out of sync with the block group
> descriptor is consistent with this failure. However, if you were seeing
> failures once every four days, that would imply that the guest OS and/or host
> OS would be crashing at that or about that level of frequency, and you haven't
> reported that.
I haven't had any host or guest crashes.
>
> 2) Some kind of race between a 4k write and a RAID1 resync leading to a block
> write getting lost. Again, this reported data corruption is consistent with
> this theory --- but this also requires the guest OS crashing due to some kind
> of kernel crash or KVM/qemu shutdown and/or host OS crash / power failure, as
> in (1) above. If you weren't seeing these failures once every four days or so,
> then this isn't a likely explanation.
No crashes.
>
> 3) Some kind of corruption caused by the TRIM command being sent to the
> RAID/MD device, possibly racing with a block bitmap update. This could be
> caused either by the file system being mounted with the -o discard mount
> option, or by fstrim getting run out of cron, or by e2fsck explicitly being
> asked to discard unused blocks (with the "-E discard" option).
I'm not using "-o discard" or fstrim, and I've never used the "-E discard"
option to fsck.
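
For anyone wanting to double-check the mount-option part of (3), here is a minimal sketch, assuming the usual /proc/mounts layout (device, mount point, filesystem type, comma-separated options). The function name mounts_with_discard is hypothetical, and the check only covers "-o discard"; it says nothing about fstrim run from cron or about "-E discard" passed to e2fsck.

```python
def mounts_with_discard(mounts_text):
    """Return (device, mount_point) pairs whose options include 'discard'.

    mounts_text is assumed to be the contents of /proc/mounts:
    one mount per line, whitespace-separated fields, options in field 4.
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        # Exact match on the option name, so 'nodiscard' does not count.
        if "discard" in options.split(","):
            hits.append((device, mount_point))
    return hits


# Illustrative input only; these devices are made up for the example.
sample = (
    "/dev/md0 / ext4 rw,relatime,discard 0 0\n"
    "/dev/sda1 /boot ext4 rw,relatime 0 0\n"
)
print(mounts_with_discard(sample))
```

On a live system the same function could be fed the text read from /proc/mounts in both the host and the guest.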
>
> 4) Some kind of bug which happens rarely either in qemu, the host kernel or
> the guest kernel depending on how it communicates with the virtual disk.
> (i.e., virtio, scsi, ide, etc.) Virtio is the most likely use case, and so
> trying to change to use scsi emulation might be interesting. (OTOH, if the
> problem is specific to the MD layer, then this possibility is less likely.)
>
> So as far as #3 is concerned, can you check to see if you had fstrim enabled,
> or are mounting the file system with -o discard?
>
I'm a bit overwhelmed with work at the moment, so I haven't had time to
read this message with the care it deserves. I'll get back to you with
more detail next week.
--
You are receiving this mail because:
You are watching the assignee of the bug.