From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: kernel BUG at linux-4.2.0/fs/btrfs/extent-tree.c:1833 on rebalance
Date: Wed, 16 Sep 2015 05:02:26 +0000 (UTC)
References: <9c864637fe7676a8b7badc5ddd7a4e0c@all.all>
 <2c00c4b7c15e424659fb2e810170e32e@all.all>
 <55F83181.9010201@fb.com>
 <532aadf0f92d08d3d2b274173548aee1@all.all>

Stéphane Lesimple posted on Tue, 15 Sep 2015 23:47:01 +0200 as excerpted:

> On 2015-09-15 16:56, Josef Bacik wrote:
>> On 09/15/2015 10:47 AM, Stéphane Lesimple wrote:
>>>> I've been experiencing repetitive "kernel BUG" occurrences in the
>>>> past few days trying to balance a raid5 filesystem after adding a
>>>> new drive.  It occurs on both 4.2.0 and 4.1.7, using 4.2 userspace
>>>> tools.
>>>
>>> I've run a scrub on this filesystem after the crash happened twice,
>>> and it found no errors.
>>>
>>> The BUG_ON() condition that my filesystem triggers is the following:
>>>
>>> BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID);
>>> // in insert_inline_extent_backref() of extent-tree.c.
>>>
>> Does btrfsck complain at all?

Just to elucidate a bit...  Scrub is designed to detect, and, where a
second copy is available (dup or raid1/10 modes; raid5/6 can reconstruct
from parity), correct exactly one problem: corruption where the checksum
stored when the data was written doesn't match the one computed on the
data read back from storage.

As such, it detects and corrects media errors and (perhaps more commonly)
data corrupted by a crash in the middle of a write.  But if the data was
already bad when it was written in the first place, the checksum covering
it simply validates what was bad before the write ever happened, and
scrub is none the wiser: it will happily pass the incorrect data, since
the checksum is perfectly valid for data that was bad before the checksum
was ever created.

Which is where btrfs check comes in, and why JB asked you to run it:
unlike scrub, check is designed to catch filesystem logic errors.

> Thanks for your suggestion.
> You're right, even if btrfs scrub didn't complain, btrfsck does:
>
> checking extents
> bad metadata [4179166806016, 4179166822400) crossing stripe boundary
> bad metadata [4179166871552, 4179166887936) crossing stripe boundary
> bad metadata [4179166937088, 4179166953472) crossing stripe boundary

This is an actively in-focus bug ATM, and while I'm not a dev and can't
tell you for sure that it's behind the specific balance-related crash and
traces you posted (tho I believe it is), it certainly has the potential
to be that serious, yes.
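For anyone wondering what "crossing stripe boundary" actually refers to:
scrub verifies metadata in fixed 64KiB stripe units, so a metadata tree
block is only cleanly checkable if it sits entirely inside one stripe.
The little C program below is only my own rough sketch of that
constraint, not the btrfs-progs code, and it assumes a 64KiB stripe
length and a 16KiB nodesize; the byte numbers are made up for
illustration, not taken from the btrfsck output above.

/*
 * Rough illustration only (not the actual btrfs check test): a
 * metadata block "crosses a stripe boundary" when the half-open
 * range [start, start+len) touches more than one 64KiB stripe.
 */
#include <stdio.h>
#include <stdint.h>

#define STRIPE_LEN (64ULL * 1024)   /* assumed 64KiB stripe length */
#define NODESIZE   (16ULL * 1024)   /* assumed 16KiB metadata node */

/* returns 1 if [start, start+len) spans two stripes, else 0 */
static int crosses_stripe(uint64_t start, uint64_t len)
{
        return start / STRIPE_LEN != (start + len - 1) / STRIPE_LEN;
}

int main(void)
{
        /* hypothetical bytenrs, picked to show both outcomes */
        uint64_t aligned   = 3 * STRIPE_LEN;             /* starts on a boundary   */
        uint64_t straddler = 3 * STRIPE_LEN + 56 * 1024; /* last 8KiB spills over  */

        printf("aligned node crosses:   %d\n", crosses_stripe(aligned, NODESIZE));
        printf("straddler node crosses: %d\n", crosses_stripe(straddler, NODESIZE));
        return 0;
}

The exact test the new check code runs may well differ in detail, but
that's the general shape of what it's complaining about.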
The most common cause is a buggy btrfs-convert that, at one point, was
creating invalid btrfs when converting from ext*.  AFAIK they've hotfixed
the immediate convert issue, but are still actively working on a longer
term proper fix.  Meanwhile, while btrfs check does now detect the issue
(and even that is quite new code, added in 4.2 I believe), there's still
no real fix for what was, after all, a defective btrfs from the moment
the convert was done.

So where that's the cause (the filesystem was created from an ext* fs
using the buggy btrfs-convert and is thus actually invalid due to this
cross-stripe metadata), the current fix is to back up the files you want
to keep (and FWIW, as any good sysadmin will tell you, a backup that
hasn't been tested restorable isn't yet a backup, as the job isn't
complete), then blow away and recreate the filesystem properly with
mkfs.btrfs, and of course restore to the new filesystem.

If, however, you created the filesystem with mkfs.btrfs, then the problem
must have occurred some other way.  Whether there's some other cause
beyond the known one, the buggy btrfs-convert, has in fact been in
question, so in that case the devs are likely to be quite interested
indeed in your report, and perhaps in the filesystem history that brought
you to this point.

The ultimate fix is likely to be the same (unless the devs have you test
new fix code for btrfs check --repair), but I'd strongly urge you to
delay blowing away the filesystem, if possible, until they've had a
chance to ask you to run other diagnostics and perhaps even capture a
btrfs-image for them, since you may well have stumbled onto a corner case
they'd have trouble reproducing without your information.

--
Duncan - List replies preferred.  No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman