From: Ming Lei <ming.lei@redhat.com>
To: Dongli Zhang <dongli.zhang@oracle.com>
Cc: linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Subject: Re: NVMe: Regression: write zeros corrupts ext4 file system
Date: Mon, 11 Mar 2019 18:16:19 +0800
Message-ID: <20190311101618.GA26229@ming.t460p>
In-Reply-To: <08f3d0f9-c10c-ef72-72f5-62670388763b@oracle.com>

On Mon, Mar 11, 2019 at 03:54:16PM +0800, Dongli Zhang wrote:
>
>
> On 3/11/19 10:24 AM, Ming Lei wrote:
> > Hi,
> >
> > It is observed that ext4 is corrupted easily by running some workloads
> > on QEMU NVMe, such as:
>
> I cannot reproduce with most recent up-to-date mainline kernel on below qemu
> versions:
>
> - qemu-2.10.2
> - qemu-3.0.0

The QEMU in my test is the stock Fedora 27 package, not built by me.
'qemu-system-x86_64 -version' shows:

	QEMU emulator version 2.10.2(qemu-2.10.2-1.fc27)

My test VM is cloned from the official Fedora 27 Cloud image[1], and I
ran 'dnf update' before starting the test.

[1] https://download.fedoraproject.org/pub/fedora/linux/releases/27/CloudImages/x86_64/images/Fedora-Cloud-Base-27-1.6.x86_64.qcow2

>
> >
> > 1) mkfs.ext4 /dev/nvme0n1
> >
> > 2) mount /dev/nvme0n1 /mnt
> >
> > 3) cd /mnt; git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> >
> > 4) then the following error message may show up:
> >
> > [ 1642.271816] EXT4-fs error (device nvme0n1): ext4_mb_generate_buddy:747: group 0, block bitmap and bg descriptor inconsistent: 32768 vs 23513 free clusters
> >
> > Or fsck.ext4 will complain after running 'umount /mnt'
> >
> > The issue disappears by reverting 6e02318eaea53eaafe6 ("nvme: add support for the
> > Write Zeroes command").
>
> As above commit is for Write Zeros command, I instrument and add printf at the
> beginning of nvme_write_zeros() for qemu-2.10.2.
>
> nvme_write_zeros() are only called for 47 times during "mount /dev/nvme0n1 /mnt".
>
>
> During "git clone" from torvalds' linux.git, there is no call of nvme_write_zeros().
>
> Perhaps there is some special configuration required to trigger the
> nvme_write_zeros() on purpose during "git clone" to involve the
> nvme_cmd_write_zeroes on kernel side?

It can be triggered by random write workloads after running mkfs and
mount on the NVMe device.
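
For anyone trying to reproduce this, a random write workload could look
roughly like the sketch below. This is only an illustration, not the
workload used in my test: the function name, block size, write count and
seed are assumptions, and there is no guarantee that this particular
pattern reaches the Write Zeroes path on every setup.

```python
import os
import random


def random_write_workload(path, file_size, block_size=4096, writes=1000, seed=0):
    """Write `writes` blocks of random data at random block-aligned offsets.

    Returns the total number of bytes written.
    """
    rng = random.Random(seed)            # fixed seed so a run can be repeated
    blocks = file_size // block_size
    total = 0
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.ftruncate(fd, file_size)      # sparse file; blocks get allocated on write
        for _ in range(writes):
            offset = rng.randrange(blocks) * block_size
            total += os.pwrite(fd, os.urandom(block_size), offset)
        os.fsync(fd)                     # flush the data down to the block device
    finally:
        os.close(fd)
    return total
```

To exercise the device, point `path` at a file on the mounted filesystem
(for example a hypothetical /mnt/randwrite.dat), then check dmesg or run
fsck.ext4 after 'umount /mnt'.
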
>
> My test nvme image is only about 5GB.

Mine is 8GB.

Thanks,
Ming