* ext4 filesystem corruption with 4.10-rc2 on ppc64le
@ 2017-01-04 5:18 Anton Blanchard
2017-01-04 6:02 ` Chandan Rajendra
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Anton Blanchard @ 2017-01-04 5:18 UTC
To: jack, Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Stephen Rothwell, axboe
Cc: linux-fsdevel, linux-ext4, linuxppc-dev, linux-kernel
Hi,
I'm consistently seeing ext4 filesystem corruption using a mainline
kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
cloud image, boot it in KVM and run:
sudo apt-get update
sudo apt-get dist-upgrade
sudo reboot
And it never makes it back up, dying with rather severe filesystem
corruption.
I've narrowed it down to:
64e1c57fa474 ("ext4: Use clean_bdev_aliases() instead of iteration")
e64855c6cfaa ("fs: Add helper to clean bdev aliases under a bh and use it")
ce98321bf7d2 ("fs: Remove unmap_underlying_metadata")
Backing these patches out fixes the issue.
Anton
* Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le
2017-01-04 5:18 ext4 filesystem corruption with 4.10-rc2 on ppc64le Anton Blanchard
@ 2017-01-04 6:02 ` Chandan Rajendra
2017-01-04 15:28 ` Theodore Ts'o
2017-01-04 7:34 ` luigi burdo
2017-01-04 15:09 ` Jens Axboe
2 siblings, 1 reply; 9+ messages in thread
From: Chandan Rajendra @ 2017-01-04 6:02 UTC
To: Anton Blanchard
Cc: jack, Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Stephen Rothwell, axboe, linuxppc-dev, linux-kernel, linux-ext4,
linux-fsdevel

On Wednesday, January 04, 2017 04:18:08 PM Anton Blanchard wrote:
> Hi,
>
> I'm consistently seeing ext4 filesystem corruption using a mainline
> kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
> cloud image, boot it in KVM and run:
>
> sudo apt-get update
> sudo apt-get dist-upgrade
> sudo reboot
>
> And it never makes it back up, dying with rather severe filesystem
> corruption.

Hi,

The patch at https://patchwork.kernel.org/patch/9488235/ should fix the bug.

>
> I've narrowed it down to:
>
> 64e1c57fa474 ("ext4: Use clean_bdev_aliases() instead of iteration")
> e64855c6cfaa ("fs: Add helper to clean bdev aliases under a bh and use it")
> ce98321bf7d2 ("fs: Remove unmap_underlying_metadata")
>
> Backing these patches out fixes the issue.
>
> Anton

--
chandan
* Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le
2017-01-04 6:02 ` Chandan Rajendra
@ 2017-01-04 15:28 ` Theodore Ts'o
2017-01-04 16:23 ` Jens Axboe
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Theodore Ts'o @ 2017-01-04 15:28 UTC
To: Chandan Rajendra
Cc: Anton Blanchard, jack, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, Stephen Rothwell, axboe, linuxppc-dev, linux-kernel,
linux-ext4, linux-fsdevel, Jens Axboe, torvalds

On Wed, Jan 04, 2017 at 11:32:42AM +0530, Chandan Rajendra wrote:
> On Wednesday, January 04, 2017 04:18:08 PM Anton Blanchard wrote:
> > I'm consistently seeing ext4 filesystem corruption using a mainline
> > kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
> > cloud image, boot it in KVM and run:
> >
> > sudo apt-get update
> > sudo apt-get dist-upgrade
> > sudo reboot
> >
> > And it never makes it back up, dying with rather severe filesystem
> > corruption.
>
> The patch at https://patchwork.kernel.org/patch/9488235/ should fix the
> bug.

It looks like this patch is already queued up on the "for-linus"
branch on the linux-block.git tree.

Chandan, thanks for pointing this out! I had missed your e-mail from
Christmas day, and it was on my todo list to figure out why I was
seeing lots of 1k block regressions on gce-xfstests post-merge window
that weren't showing up on the ext4.git tree before I sent my pull
request to Linus.

Jens, could you expedite a pull request to Linus? This is affecting
ext4 on 1k block file systems on x86/x86_64, so this is not a ppc-only
regression.

Anton or Chandan, could you do me a favor and verify whether or not
64k block sizes are working for you on ppc64le on ext4 by running
xfstests? Light duty testing works for me but when I stress ext4 with
pagesize==blocksize on ppc64le via xfstests, it blows up. I suspect
(but am not sure) it's due to (non-upstream) device driver issues, and
a verification that you can run xfstests on your ppc64le systems using
standard upstream device drivers would be very helpful, since I don't
have easy console access on the machines I have access to at
$WORK. :-(

And of course, if there are still blocksize==pagesize issues on ext4
on ppc64le, it would be good to know that too.

Many thanks!!

- Ted

P.S. And for those people who are doing storage work, let me put in a
plug for "gce-xfstests full". It's cheap and finds lots of problems
before I and others have to. And if the $1.50 USD is the problem, let
me know and I'll try to work something out. :-) :-)
* Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le
2017-01-04 15:28 ` Theodore Ts'o
@ 2017-01-04 16:23 ` Jens Axboe
2017-01-04 18:09 ` Linus Torvalds
2017-01-05 10:44 ` Anton Blanchard
2017-01-09 4:10 ` Chandan Rajendra
2 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2017-01-04 16:23 UTC
To: Theodore Ts'o, Chandan Rajendra, Anton Blanchard, jack,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Stephen Rothwell, linuxppc-dev, linux-kernel, linux-ext4,
linux-fsdevel, Jens Axboe, torvalds

On 01/04/2017 08:28 AM, Theodore Ts'o wrote:
> On Wed, Jan 04, 2017 at 11:32:42AM +0530, Chandan Rajendra wrote:
>> On Wednesday, January 04, 2017 04:18:08 PM Anton Blanchard wrote:
>>> I'm consistently seeing ext4 filesystem corruption using a mainline
>>> kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
>>> cloud image, boot it in KVM and run:
>>>
>>> sudo apt-get update
>>> sudo apt-get dist-upgrade
>>> sudo reboot
>>>
>>> And it never makes it back up, dying with rather severe filesystem
>>> corruption.
>>
>> The patch at https://patchwork.kernel.org/patch/9488235/ should fix the
>> bug.
>
> It looks like this patch is already queued up on the "for-linus"
> branch on the linux-block.git tree.
>
> Chandan, thanks for pointing this out! I had missed your e-mail from
> Christmas day, and it was on my todo list to figure out why I was
> seeing lots of 1k block regressions on gce-xfstests post-merge window
> that weren't showing up on the ext4.git tree before I sent my pull
> request to Linus.
>
> Jens, could you expedite a pull request to Linus? This is affecting
> ext4 on 1k block file systems on x86/x86_64, so this is not a ppc-only
> regression.

Yes, it'll go out this morning.

--
Jens Axboe
* Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le
2017-01-04 16:23 ` Jens Axboe
@ 2017-01-04 18:09 ` Linus Torvalds
0 siblings, 0 replies; 9+ messages in thread
From: Linus Torvalds @ 2017-01-04 18:09 UTC
To: Jens Axboe
Cc: Theodore Ts'o, Chandan Rajendra, Anton Blanchard, Jan Kara,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Stephen Rothwell, ppc-dev, Linux Kernel Mailing List,
linux-ext4@vger.kernel.org, linux-fsdevel, Jens Axboe

On Wed, Jan 4, 2017 at 8:23 AM, Jens Axboe <axboe@fb.com> wrote:
> On 01/04/2017 08:28 AM, Theodore Ts'o wrote:
>>
>> Jens, could you expedite a pull request to Linus? This is affecting
>> ext4 on 1k block file systems on x86/x86_64, so this is not a ppc-only
>> regression.
>
> Yes, it'll go out this morning.

It's merged and out there in my tree now.

Linus
* Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le
2017-01-04 15:28 ` Theodore Ts'o
2017-01-04 16:23 ` Jens Axboe
@ 2017-01-05 10:44 ` Anton Blanchard
2017-01-09 4:10 ` Chandan Rajendra
2 siblings, 0 replies; 9+ messages in thread
From: Anton Blanchard @ 2017-01-05 10:44 UTC
To: Theodore Ts'o
Cc: Jens Axboe, Stephen Rothwell, jack, linux-kernel, axboe, torvalds,
Chandan Rajendra, Paul Mackerras, linux-fsdevel, linux-ext4,
linuxppc-dev

Hi Ted,

> Anton or Chandan, could you do me a favor and verify whether or not
> 64k block sizes are working for you on ppc64le on ext4 by running
> xfstests? Light duty testing works for me but when I stress ext4 with
> pagesize==blocksize on ppc64le via xfstests, it blows up. I suspect
> (but am not sure) it's due to (non-upstream) device driver issues, and
> a verification that you can run xfstests on your ppc64le systems using
> standard upstream device drivers would be very helpful, since I don't
> have easy console access on the machines I have access to at
> $WORK. :-(

I fired off an xfstests run, and it looks good. There are 3 failures,
but they seem to be setup issues on my part. I also double-checked that
those same three failed on 4.8.

Chandan has been running the test suite regularly, and plans to do a
run against mainline too.

Anton
* Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le
2017-01-04 15:28 ` Theodore Ts'o
2017-01-04 16:23 ` Jens Axboe
2017-01-05 10:44 ` Anton Blanchard
@ 2017-01-09 4:10 ` Chandan Rajendra
2 siblings, 0 replies; 9+ messages in thread
From: Chandan Rajendra @ 2017-01-09 4:10 UTC
To: Theodore Ts'o
Cc: Anton Blanchard, jack, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, Stephen Rothwell, axboe, linuxppc-dev, linux-kernel,
linux-ext4, linux-fsdevel, Jens Axboe, torvalds

On Wednesday, January 04, 2017 10:28:37 AM Theodore Ts'o wrote:
> On Wed, Jan 04, 2017 at 11:32:42AM +0530, Chandan Rajendra wrote:
> > On Wednesday, January 04, 2017 04:18:08 PM Anton Blanchard wrote:
> > > I'm consistently seeing ext4 filesystem corruption using a mainline
> > > kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
> > > cloud image, boot it in KVM and run:
> > >
> > > sudo apt-get update
> > > sudo apt-get dist-upgrade
> > > sudo reboot
> > >
> > > And it never makes it back up, dying with rather severe filesystem
> > > corruption.
> >
> > The patch at https://patchwork.kernel.org/patch/9488235/ should fix the
> > bug.
>
> It looks like this patch is already queued up on the "for-linus"
> branch on the linux-block.git tree.
>
> Chandan, thanks for pointing this out! I had missed your e-mail from
> Christmas day, and it was on my todo list to figure out why I was
> seeing lots of 1k block regressions on gce-xfstests post-merge window
> that weren't showing up on the ext4.git tree before I sent my pull
> request to Linus.
>
> Jens, could you expedite a pull request to Linus? This is affecting
> ext4 on 1k block file systems on x86/x86_64, so this is not a ppc-only
> regression.
>
> Anton or Chandan, could you do me a favor and verify whether or not
> 64k block sizes are working for you on ppc64le on ext4 by running
> xfstests? Light duty testing works for me but when I stress ext4 with
> pagesize==blocksize on ppc64le via xfstests, it blows up. I suspect
> (but am not sure) it's due to (non-upstream) device driver issues, and
> a verification that you can run xfstests on your ppc64le systems using
> standard upstream device drivers would be very helpful, since I don't
> have easy console access on the machines I have access to at $WORK. :-(

Hi Ted,

I found one regression w.r.t. 64k blocksize. I posted a patch
(http://marc.info/?l=linux-block&m=148388687722745&w=2) to fix the issue.

--
chandan
* Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le
2017-01-04 5:18 ext4 filesystem corruption with 4.10-rc2 on ppc64le Anton Blanchard
2017-01-04 6:02 ` Chandan Rajendra
@ 2017-01-04 7:34 ` luigi burdo
2017-01-04 15:09 ` Jens Axboe
2 siblings, 0 replies; 9+ messages in thread
From: luigi burdo @ 2017-01-04 7:34 UTC
To: Anton Blanchard, jack@suse.cz, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, Stephen Rothwell, axboe@fb.com
Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org

Hi,

It is present on ppc (not LE) too. I found it on Ubuntu MATE 16.10 PPC
with kernel 4.9-rc6 PPC64 on P5020/P5040.

Thanks,
Luigi

________________________________
From: Linuxppc-dev <linuxppc-dev-bounces+intermediadc=hotmail.com@lists.ozlabs.org> on behalf of Anton Blanchard <anton@samba.org>
Sent: Wednesday, 4 January 2017 06:18
To: jack@suse.cz; Michael Ellerman; Benjamin Herrenschmidt; Paul Mackerras; Stephen Rothwell; axboe@fb.com
Cc: linux-fsdevel@vger.kernel.org; linux-ext4@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
Subject: ext4 filesystem corruption with 4.10-rc2 on ppc64le

Hi,

I'm consistently seeing ext4 filesystem corruption using a mainline
kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
cloud image, boot it in KVM and run:

sudo apt-get update
sudo apt-get dist-upgrade
sudo reboot

And it never makes it back up, dying with rather severe filesystem
corruption.

I've narrowed it down to:

64e1c57fa474 ("ext4: Use clean_bdev_aliases() instead of iteration")
e64855c6cfaa ("fs: Add helper to clean bdev aliases under a bh and use it")
ce98321bf7d2 ("fs: Remove unmap_underlying_metadata")

Backing these patches out fixes the issue.

Anton
* Re: ext4 filesystem corruption with 4.10-rc2 on ppc64le
2017-01-04 5:18 ext4 filesystem corruption with 4.10-rc2 on ppc64le Anton Blanchard
2017-01-04 6:02 ` Chandan Rajendra
2017-01-04 7:34 ` luigi burdo
@ 2017-01-04 15:09 ` Jens Axboe
2 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2017-01-04 15:09 UTC
To: Anton Blanchard, jack, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, Stephen Rothwell
Cc: linuxppc-dev, linux-kernel, linux-ext4, linux-fsdevel

On 01/03/2017 10:18 PM, Anton Blanchard wrote:
> Hi,
>
> I'm consistently seeing ext4 filesystem corruption using a mainline
> kernel. It doesn't take much to trigger it - download a ppc64le Ubuntu
> cloud image, boot it in KVM and run:
>
> sudo apt-get update
> sudo apt-get dist-upgrade
> sudo reboot
>
> And it never makes it back up, dying with rather severe filesystem
> corruption.
>
> I've narrowed it down to:
>
> 64e1c57fa474 ("ext4: Use clean_bdev_aliases() instead of iteration")
> e64855c6cfaa ("fs: Add helper to clean bdev aliases under a bh and use it")
> ce98321bf7d2 ("fs: Remove unmap_underlying_metadata")
>
> Backing these patches out fixes the issue.

Fix is going out today, I see Chandan already pointed you at it.

For the other reporter, it's not an LE vs BE thing, it's a fs blocksize
< page size problem.

--
Jens Axboe
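The condition named in the thread (filesystem block size smaller than the page size) can be checked on a given machine. What follows is a minimal Python sketch and not part of the original thread: the function names are hypothetical, and it assumes a Unix-like system where statvfs() reports the filesystem block size in f_bsize (the case for ext4; some other filesystems may report a preferred I/O size there instead).

```python
import mmap
import os


def page_size() -> int:
    """Return the memory page size of the running system, in bytes."""
    return mmap.PAGESIZE


def fs_block_size(path: str) -> int:
    """Return f_bsize from statvfs() for the filesystem that holds `path`.

    Assumption: for ext4 this is the filesystem block size; other
    filesystems may report a preferred I/O size instead.
    """
    return os.statvfs(path).f_bsize


def blocksize_below_pagesize(path: str) -> bool:
    """True if the filesystem block size is smaller than the page size."""
    return fs_block_size(path) < page_size()


if __name__ == "__main__":
    root = "/"
    print("page size:       ", page_size())
    print("fs block size:   ", fs_block_size(root))
    print("blocksize < page:", blocksize_below_pagesize(root))
```

On a ppc64le kernel built with 64k pages, a 4k-block ext4 root would be expected to report True here, as would a 1k-block ext4 filesystem on x86_64 with 4k pages. Those are the two configurations reported as affected in the thread.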