linux-btrfs.vger.kernel.org archive mirror
From: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
To: Marc MERLIN <marc@merlins.org>, Liu Bo <bo.li.liu@oracle.com>
Cc: <linux-btrfs@vger.kernel.org>
Subject: Re: 3.15.1: kernel BUG at fs/btrfs/locking.c:269
Date: Fri, 4 Jul 2014 13:29:29 +0800
Message-ID: <53B63BB9.2020208@cn.fujitsu.com>
In-Reply-To: <20140704041102.GS11539@merlins.org>

On 07/04/2014 12:11 PM, Marc MERLIN wrote:
> On Fri, Jul 04, 2014 at 11:07:22AM +0800, Liu Bo wrote:
>>>>>> [160562.925463] parent transid verify failed on 2776298520576 wanted 41015 found 18120
>>> What should I be doing about this?
>>> Does it mean that I do have some kind of corruption/damage on my
>>> filesystem?
>>>
>> If there is another copy of the block (RAID1, DUP, RAID5/6), it will try to read
>> that copy and repair the bad one from it; that's all we can do about it.
> Right. It's not quite my question though.
> I mean I don't know what device it's on, never mind what file is affected.
> If I know which file is corrupted, I can simply delete it and restore from
> backup, no biggie.
> Right now I don't even know which one of my 3 btrfs filesystems (over 10TB)
> has this problem. That makes the message kind of problematic: "you have a
> problem, but I'm not giving you any fighting chance of finding out
> where" :)
>   
>>> Also, is it possible to have all these messages state which devid they
>>> occurred on? I don't even know which device I should be worrying about
>>> right now, and although I'm running scrub now, my understanding is that
>>> scrub doesn't actually look at FS structures and is likely to miss this
>>> anyway.
>> Yes, we can, but it'd need a bit more effort. For now, every device message we've seen
>> in panic info comes from sb->s_id, which points to @fs_info->latest_device.
> Food for thought: as is, the message is unfortunately close to useless, except
> to an FS developer with a system that has only one btrfs filesystem.
>
> On Fri, Jul 04, 2014 at 11:50:25AM +0800, Wang Shilong wrote:
>> I am afraid scrub may not be able to fix this kind of error. All scrub
>> does is verify whether checksums match and, if possible, use a good
>> mirror to rewrite the bad one.
> I wouldn't be bothered if scrub can't fix it, but it would be good if it
> could tell me.
>   
>> Such errors seem to imply that the content itself is corrupted: we may have passed
>> the checksum check after the I/O ended, but we fail the generation check afterwards.
>   
> So should I really replace scrub with
> find / -type f -print0 | xargs -0 grep . >/dev/null ?
>
> Basically we need something that will scan the filesystem and ensure that
> all files are reachable correctly without causing filesystem problems, and
> if one is bad, output the name of the bad file(s).
> Scrub only does a half job of that it seems.
>
>> To get the physical device name, we still need the mirror number to know which
>> device the block is located on.
> Ok, so it's missing for now and therefore the code can't easily report it,
> I understand.
>
> Well, I explained the problem, ext4 and others of course tell me which devid
> an error is on, hopefully btrfs will be able to do so in the near future.
So is it OK for you if we print one of the btrfs filesystem's devices (for
example, the device name)?
It may not really be the physical location of the metadata, but this
is easier.


>
> Back to the original problem, would you agree that
> find / -type f -print0 | xargs -0 grep . >/dev/null ?
> may do a better job scanning the entire FS for problems than scrub would?
>
> Thanks,
> Marc
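
For illustration, here is a minimal sketch of the kind of full-read scan
Marc describes: walk a directory tree, read every regular file to the end,
and print the name of any file that fails. It is only an assumption of how
such a check could look, not a btrfs tool and not a replacement for scrub.
The names scan_tree, chunk_size and open_fn are hypothetical; open_fn exists
only so that a failing read can be simulated.

```python
import os
import stat


def scan_tree(root, chunk_size=1 << 20, open_fn=open):
    """Read every regular file under root and collect read failures.

    Returns (files_read, failures), where failures is a list of
    (path, error_text) pairs for anything that raised OSError while
    being listed, opened, or read (for example EIO from the filesystem).
    open_fn is a hypothetical hook, used here only for testing.
    """
    files_read = 0
    failures = []

    def on_walk_error(exc):
        # os.walk calls this when a directory itself cannot be listed.
        failures.append((exc.filename, str(exc)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks, sockets, device nodes and so on.
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                with open_fn(path, "rb") as fh:
                    # Read to end of file in chunks and discard the data.
                    while fh.read(chunk_size):
                        pass
                files_read += 1
            except OSError as exc:
                failures.append((path, str(exc)))
    return files_read, failures
```

Calling scan_tree("/mnt/pool") (the path is a placeholder) and printing each
entry of the returned failures list gives the bad file names directly. Like
the find | xargs grep pipeline, this only exercises what is reachable by
reading file contents.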


