linux-btrfs.vger.kernel.org archive mirror
From: Kai Krakow <hurikhan77@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: csum errors in VirtualBox VDI files
Date: Tue, 22 Mar 2016 19:48:54 +0100
Message-ID: <20160322194854.161e9c4c@jupiter.sol.kaishome.de>
In-Reply-To: <56F1068E.6050806@cn.fujitsu.com>

On Tue, 22 Mar 2016 16:47:10 +0800,
Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:

> Hi,
> 
> Kai Krakow wrote on 2016/03/22 09:03 +0100:
> > Hello!
> >
> > Since one of the last kernel updates (I don't know which exactly),
> > I'm experiencing csum errors within VDI files when running
> > VirtualBox. A side effect of this is, as soon as dmesg shows these
> > errors, commands like "du" and "df" hang until reboot.
> >
> > I've now restored the file from backup but it happens over and over
> > again.
> >
> > On another machine I'm also seeing errors with big files in the
> > following scenario (on an older kernel, 4.1.x AFAIR):
> >
> > # ntfsclone --save /dev/md126p2 -o rescue.ntfs.img
> >                     ^ big NTFS partition   ^ file on btrfs
> >
> > results in a write error and the file system goes read-only.  
> 
> When it goes RO, it must have some warning in kernel log.
> Would you please paste the kernel log?

Unfortunately, that system no longer boots due to errors in the bcache
b-tree. That being said, it may well be a bcache error and not btrfs'
fault. I couldn't capture the output because I was in a hurry. It said
"write error" and included a backtrace. I will come back to this
later.

Let's turn to the system I currently care about (the one with the
VDI file that keeps breaking):

> > Both systems have in common they are using btrfs on bcache with
> > compress=lzo,autodefrag,nossd,discard (mraid=1,draid=0 and
> > mraid=1,draid=single).
> >
> > The system mentioned first is running Kernel 4.5.0 with Gentoo
> > patch-set. I upgraded from the last 4.4.x kernel when I first
> > experienced this problem. The first time the problem resulted in a
> > duplicate extent which btrfsck wasn't able to fix, that's when I
> > first restored from backup. But now I'm getting csum errors in this
> > file over and over again, plus when rsync has run for backup, the
> > system no longer responds to "du" and "df" commands - it just hangs.
> >
> > Known problem? Does it help if I send debug info? If so, please
> > instruct.
> >  
> Does btrfs check report anything wrong?

After the error occurred?

Yes, it printed a message saying the extent is compressed and that btrfs
repair doesn't currently handle that case (I tried --repair since I have
a backup). I decided not to investigate further at that point and
instead deleted the affected file and restored it from backup. However,
this is the message from dmesg (though I didn't catch the backtrace):

btrfs_run_delayed_refs:2927: errno=-17 Object already exists

After this, the system went RO and I had to reboot. I ran btrfs check
and it reported a duplicate extent. Using btrfs inspect and the inode
number, I identified the affected file as the VDI file and restored it.
Afterwards, I upgraded from the latest 4.4 to 4.5. Since this incident
I've been watching more closely, and the file becomes damaged without
any message in the kernel log whenever VirtualBox does more IO than
usual. When my backup script then reads the file, I get errors about
missing csums and the block is not readable. I have now run ddrescue
and replaced the file with its output to get a current, slightly
damaged VDI image back (my backup uses time rotation, so that's no
problem). But running chkdsk inside VirtualBox damages the VDI again.

Regarding the error on the other machine, I'm not completely
convinced that bcache isn't involved in this problem.

As soon as I have "produced" csum errors again, I'll run btrfs check. Or
should I run it now, without forcing the csum error to occur first?


-- 
Regards,
Kai

Replies to list-only preferred.


