From: Abelardo Ricart III <aricart@memnix.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com, mpatocka@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: Regression: Disk corruption with dm-crypt and kernels >= 4.0
Date: Fri, 01 May 2015 19:42:15 -0400
Message-ID: <1430523735.5352.1.camel@memnix.com>
In-Reply-To: <1430519090.5537.4.camel@memnix.com>

On Fri, 2015-05-01 at 18:24 -0400, Abelardo Ricart III wrote:
> On Fri, 2015-05-01 at 17:17 -0400, Mike Snitzer wrote:
> > On Fri, May 01 2015 at 12:37am -0400,
> > Abelardo Ricart III <aricart@memnix.com> wrote:
> > 
> > > I made sure to run a completely vanilla kernel when testing why I was suddenly
> > > seeing some nasty libata errors with all kernels >= v4.0. Here's a snippet:
> > > 
> > > -------------------->8--------------------
> > > [  165.592136] ata5.00: exception Emask 0x60 SAct 0x7000 SErr 0x800 action 0x6 frozen
> > > [  165.592140] ata5.00: irq_stat 0x20000000, host bus error
> > > [  165.592143] ata5: SError: { HostInt }
> > > [  165.592145] ata5.00: failed command: READ FPDMA QUEUED
> > > [  165.592149] ata5.00: cmd 60/08:60:a0:0d:89/00:00:07:00:00/40 tag 12 ncq 4096 in
> > >                         res 40/00:74:40:58:5d/00:00:00:00:00/40 Emask 0x60 (host bus error)
> > > [  165.592151] ata5.00: status: { DRDY }
> > > -------------------->8--------------------
> > > 
> > > After a few dozen of these errors, I'd suddenly find my system in read-only
> > > mode with corrupted files throughout my encrypted filesystems (seemed like
> > > either a read or a write would corrupt a file, though I could be mistaken). I
> > > decided to do a git bisect with a random read-write-sync test to narrow down
> > > the culprit, which turned out to be this commit (part of a series):
> > > 
> > > # first bad commit: [cf2f1abfbd0dba701f7f16ef619e4d2485de3366] dm crypt: don't allocate pages for a partial request
> > > 
> > > Just to be sure, I created a patch to revert the entire nine patch series that
> > > commit belonged to... and the bad behavior disappeared. I've now been running
> > > kernel 4.0 for a few days without issue, and went so far as to stress test my
> > > poor SSD for a few hours to be 100% positive.
> > > 
> > > Here's some more info on my setup.
> > > 
> > > -------------------->8--------------------
> > > $ lsblk -f
> > > NAME         FSTYPE      LABEL MOUNTPOINT
> > > sda                  
> > > ├─sda1       vfat              /boot/EFI
> > > ├─sda2       ext4              /boot
> > > └─sda3       LVM2_member
> > >   ├─SSD-root crypto_LUKS
> > >   │ └─root   f2fs              /
> > >   └─SSD-home crypto_LUKS
> > >     └─home   f2fs              /home
> > > 
> > > $ cat /proc/cmdline
> > > BOOT_IMAGE=/vmlinuz-linux-memnix cryptdevice=/dev/SSD/root:root:allow-discards
> > > root=/dev/mapper/root acpi_osi=Linux security=tomoyo
> > > TOMOYO_trigger=/usr/lib/systemd/systemd intel_iommu=on
> > > modprobe.blacklist=nouveau rw quiet
> > > 
> > > $ cat /etc/lvm/lvm.conf | grep "issue_discards"
> > > issue_discards = 1
> > > -------------------->8--------------------
> > > 
> > > If there's anything else I can do to help diagnose the underlying problem, I'm
> > > more than willing.
> > 
> > The patchset in question was tested quite heavily so this is a
> > surprising report.  I'm noticing you are opting in to dm-crypt discard
> > support.  Have you tested without discards enabled?
> 
> I've disabled discards universally and rebuilt a vanilla kernel. After running
> my heavy read-write-sync scripts, everything seems to be working fine now. I
> suppose this could be something that used to fail silently before, but now
> produces bad behavior? I seem to remember having something in my message log
> about "discards not supported on this device" when running with it enabled
> before.

Forgive me, but I spoke too soon. The corruption and libata errors are still
there, as was evidenced when I went to reboot and got treated to an eyeful of
"read-only filesystem" and ata errors.

So no, disabling discards unfortunately did nothing to help.
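
The read-write-sync scripts mentioned above are not included in this message, so
for anyone wanting to run a comparable check, here is only a minimal sketch of
that kind of test, not the actual scripts. It writes random 4096-byte blocks at
random offsets, fsyncs after every write, then reads each block back and
compares digests. The function name, block size, and default counts are
assumptions made for illustration.

```python
import hashlib
import os
import random

BLOCK_SIZE = 4096  # assumption: chosen to match the 4096-byte NCQ reads in the log


def stress_once(path, blocks=256, rounds=200, seed=0):
    """Write random blocks with fsync, read back, return mismatched block indexes."""
    rng = random.Random(seed)
    expected = {}  # block index -> SHA-256 digest of the last data written there
    with open(path, "w+b") as f:
        f.truncate(blocks * BLOCK_SIZE)
        for _ in range(rounds):
            idx = rng.randrange(blocks)
            data = rng.getrandbits(8 * BLOCK_SIZE).to_bytes(BLOCK_SIZE, "little")
            f.seek(idx * BLOCK_SIZE)
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the write out to the block device
            expected[idx] = hashlib.sha256(data).digest()
        # Note: these reads may be served from the page cache. To exercise the
        # device itself, drop caches first (as root) or reopen the file later.
        bad = []
        for idx, digest in sorted(expected.items()):
            f.seek(idx * BLOCK_SIZE)
            if hashlib.sha256(f.read(BLOCK_SIZE)).digest() != digest:
                bad.append(idx)
    return bad
```

Calling stress_once() on a file inside the dm-crypt mount and getting back a
non-empty list would indicate blocks whose contents no longer match what was
written. An empty list does not prove the stack is healthy, especially if the
reads never left the page cache.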
