linux-ext4.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Ritesh Harjani <riteshh@linux.ibm.com>
Cc: linux-ext4@vger.kernel.org, "Theodore Y. Ts'o" <tytso@mit.edu>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Jan Kara <jack@suse.cz>
Subject: Re: Ext4 corruption with VM images as 3 > drop_caches
Date: Fri, 20 Mar 2020 12:49:40 +0100
Message-ID: <20200320114940.GA20455@quack2.suse.cz>
In-Reply-To: <20200320053451.B7AD0AE04D@d06av26.portsmouth.uk.ibm.com>

On Fri 20-03-20 11:04:50, Ritesh Harjani wrote:
> On 3/19/20 6:54 PM, Ritesh Harjani wrote:
> > On 3/18/20 9:17 AM, Aneesh Kumar K.V wrote:
> > > Hi,
> > > 
> > > With new vm install I am finding corruption with the vm image if I
> > > follow up the install with echo 3 > /proc/sys/vm/drop_caches
> > > 
> > > The file system reports below error.
> > > 
> > > Begin: Running /scripts/local-bottom ... done.
> > > Begin: Running /scripts/init-bottom ...
> > > [    4.916017] EXT4-fs error (device vda2): ext4_lookup:1700: inode
> > > #787185: comm sh: iget: checksum invalid
> > > done.
> > > [    5.244312] EXT4-fs error (device vda2): ext4_lookup:1700: inode
> > > #917954: comm init: iget: checksum invalid
> > > [    5.257246] EXT4-fs error (device vda2): ext4_lookup:1700: inode
> > > #917954: comm init: iget: checksum invalid
> > > /sbin/init: error while loading shared libraries: libc.so.6: cannot
> > > open shared object file: Error 74
> > > [    5.271207] Kernel panic - not syncing: Attempted to kill init!
> > > exitcode=0x00007f00
> > > 
> > > And debugfs reports
> > > 
> > > debugfs:  stat <917954>
> > > Inode: 917954   Type: bad type    Mode:  0000   Flags: 0x0
> > > Generation: 0    Version: 0x00000000
> > > User:     0   Group:     0   Size: 0
> > > File ACL: 0
> > > Links: 0   Blockcount: 0
> > > Fragment:  Address: 0    Number: 0    Size: 0
> > > ctime: 0x00000000 -- Wed Dec 31 18:00:00 1969
> > > atime: 0x00000000 -- Wed Dec 31 18:00:00 1969
> > > mtime: 0x00000000 -- Wed Dec 31 18:00:00 1969
> > > Size of extra inode fields: 0
> > > Inode checksum: 0x00000000
> > > BLOCKS:
> > > debugfs:
> > > 
> > > Bisecting this finds
> > > Commit 244adf6426ee31a83f397b700d964cff12a247d3("ext4: make
> > > dioread_nolock the default")
> > > as bad. If I revert the same on top of linus
> > > upstream(fb33c6510d5595144d585aa194d377cf74d31911)
> > > I don't hit the corruption anymore.
> > 
> > Tried replicating this and could easily replicate it on Power box.
> > I tried to reproduce this on x86 too, but could not reproduce on x86.
> > Now one difference on Power could be that pagesize is 64K and fs
> > blocksize is 4K.
> > 
> > The issue looks like the guest qemu image file is not properly written
> > back, after host does echo 3 > drop_caches. (correct me if this is not
> > the case).
> 
> Ok. So I tried this with the "cache=directsync" parameter passed to the
> drive file. This parameter says it should bypass the host side page
> cache. With this parameter, I don't see this issue on the Power box.

OK, so this likely means that there is something hosed in the writeback
path using unwritten extents when blocksize < pagesize. Maybe we miss some
conversion of an unwritten extent to a written one and thus after dropping
caches we effectively lose data?

								Honza

> > I tried replicating via below test, but it could not reproduce.
> > 
> > Any idea what kind of unit test could be written for this?
> > I am not sure how exactly qemu is writing to its image file.
> > 
> > 
> > 1. Create 2 files. "mmap-file", "mmap-data".
> > 2. "mmap-file" is a 2GB sparse file. Then at some random offsets (tried
> > with both 64KB-aligned and 4KB-aligned offsets), try to write a
> > pagesize/blocksize amount of a known data pattern.
> > 3. These offsets (which are pagesize/blocksize aligned) are recorded into
> > the "mmap-data" file via normal read/write calls.
> > 4. Then after we wrote to both files, we munmap the "mmap-file" and
> > close both of these files.
> > 5. Then we do echo 3 > drop_caches.
> > 6. Then in the verify phase, using the offsets written in the "mmap-data"
> > file, I read the "mmap-file" to verify whether its contents are correct
> > or not.
> > With that I could not reproduce this issue.
> > 
> > 
> > -ritesh
> > 
> > 
> 
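
For reference, the six steps quoted above could be sketched roughly as
below. This is not Ritesh's actual test program (which is not shown in the
thread), only a minimal sketch under assumptions: the helper names
(pattern, write_phase, verify_phase, drop_caches), the 64MB FILE_SIZE
(instead of 2GB, to keep it quick) and the XOR-based data pattern are all
made up for illustration. As reported above, a test of this shape did not
reproduce the corruption, so it should not be read as a reproducer.

```python
import mmap
import random
import struct

# Assumption: 64MB instead of the 2GB sparse file from the description.
FILE_SIZE = 64 * 1024 * 1024


def pattern(offset, length):
    """Known data pattern derived from the offset (hypothetical choice)."""
    word = struct.pack("<Q", offset ^ 0xA5A5A5A5A5A5A5A5)
    return (word * (length // len(word) + 1))[:length]


def write_phase(data_path, index_path, chunk, count, seed=0):
    """Steps 1-4: write chunks through mmap, record offsets with write()."""
    rng = random.Random(seed)
    slots = rng.sample(range(FILE_SIZE // chunk), count)
    offsets = sorted(slot * chunk for slot in slots)  # chunk-aligned offsets
    with open(data_path, "w+b") as f:
        f.truncate(FILE_SIZE)  # sparse "mmap-file"
        with mmap.mmap(f.fileno(), FILE_SIZE) as m:  # shared, writable mapping
            for off in offsets:
                m[off:off + chunk] = pattern(off, chunk)
        # leaving the inner "with" unmaps the file; no explicit msync is done
    with open(index_path, "wb") as idx:  # "mmap-data", ordinary write calls
        for off in offsets:
            idx.write(struct.pack("<Q", off))
    return offsets


def drop_caches():
    """Step 5: echo 3 > /proc/sys/vm/drop_caches (needs root on Linux)."""
    try:
        with open("/proc/sys/vm/drop_caches", "w") as f:
            f.write("3\n")
        return True
    except OSError:
        return False  # not root, or not Linux


def verify_phase(data_path, index_path, chunk):
    """Step 6: re-read every recorded offset, return those that mismatch."""
    bad = []
    with open(index_path, "rb") as idx, open(data_path, "rb") as f:
        while True:
            raw = idx.read(8)
            if not raw:
                break
            (off,) = struct.unpack("<Q", raw)
            f.seek(off)
            if f.read(chunk) != pattern(off, chunk):
                bad.append(off)
    return bad
```

One would call write_phase() with chunk set to 4096 or 65536, then
drop_caches() as root, then verify_phase() on the same two paths; an empty
list from verify_phase() means every recorded chunk read back intact.
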
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
