From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: raid1: cannot add disk to replace faulty because can only mount fs as read-only.
To: linux-btrfs@vger.kernel.org
References: <54678ac94c95687e00485d41fa5b5bc9@server1.deragon.biz> <51114ea93a0f76a5ff6621e4d8983944@server1.deragon.biz> <07bba687-64c3-6713-6f6a-c8da183cbd3d@gmail.com> <20170201115530.l2ce5afcqld2kzi4@angband.pl>
From: "Austin S. Hemmelgarn"
Message-ID: <7cb699fd-44a7-d8eb-e492-c448f67b0eac@gmail.com>
Date: Thu, 2 Feb 2017 07:49:50 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org

On 2017-02-01 17:48, Duncan wrote:
> Adam Borowski posted on Wed, 01 Feb 2017 12:55:30 +0100 as excerpted:
>
>> On Wed, Feb 01, 2017 at 05:23:16AM +0000, Duncan wrote:
>>> Hans Deragon posted on Tue, 31 Jan 2017 21:51:22 -0500 as excerpted:
>>>> But the current scenario makes it difficult for me to put redundancy
>>>> back into service! How much time did I wait until I found the
>>>> mailing list, subscribed to it, posted my email, and got an answer?
>>>> Wouldn't it be better if the user could actually add the disk at
>>>> any time, ideally ASAP?
>>>>
>>>> And to fix this, I have to learn how to patch and compile the kernel.
>>>> I have not done this since the beginning of the century.
>>>> More delays, more risk added to the system (what if I compile the
>>>> kernel with the wrong parameters?).
>>>
>>> The patch fixing the problem, making return from degraded no longer
>>> the one-shot thing it tends to be now, will eventually be merged.
>>
>> Not anything like the one I posted to this thread -- that one merely
>> removes a check that can't handle this particular (common) case of an
>> otherwise healthy RAID that lost one device and was then mounted
>> degraded twice. We instead need a better check, one that sees whether
>> every block group is present.
>>
>> This can be done quite easily since, as far as I know, the list of
>> block groups is fully present in memory at that point, but someone
>> actually has to code it, and I for one don't know btrfs internals
>> (yet?).
>
> I didn't mention it because you spared me the trouble with your
> hack-patch that did the current job, but FWIW, there's actually a patch
> that does per-chunk testing as you describe. However, it got merged
> into a longer-running feature-add project (hot spares, IIRC), and thus
> isn't likely to be mainline-merged until that project is merged.
>
> But who knows when that might be? It could be years before it is
> considered ready.
>
> Meanwhile, perhaps it's simply because I'm not a dev and don't
> appreciate the complexities of some detail or other, but as
> demonstrated by the people who have local-merged that patch to get out
> of just this sort of jam, as well as the continuing saga of more and
> more people appearing here with the same problem, it is arguably a
> high-priority fix on its own. It should have been reviewed and
> ultimately mainline-merged on its own merits, instead of being stuck
> in someone's feature-add project queue for potentially years, while
> more and more people who could definitely have used it must either
> suffer without it or go find and local-merge it themselves.
> Even if this feature is critical to the longer-term feature, merging
> this little one now would make the final patch set for the longer-term
> feature that much smaller.

Agreed, it should have been mainlined. I have no issue with the
hot-spare patches depending on it, but this is a severe bug.

> But that's a btrfs-using sysadmin's viewpoint, not a developer
> viewpoint, and it's the developers doing the work, so they get to
> decide when and how it gets merged, and we non-devs must either simply
> live with it or, if circumstances allow, fund some dev to take our
> specific interests as their priority and take care of them for us.

I don't care if I draw some flak from the developers in this case, but
this particular developer viewpoint is wrong. If this were a commercial
software product, the person responsible would at least be facing a
serious reprimand and, depending on the company, might be out of a job.
This is a severe bug that makes a not-all-that-uncommon (albeit bad)
use case fail completely. The fix itself had no dependencies and could
have been merged on its own.

> Meanwhile, perhaps I should have bookmarked that patch at least as it
> appeared on-list, but I didn't, so while I know it exists, I too would
> have to go looking to actually find it, should I end up needing it. In
> my defense, I /thought/ the immediate benefit was obvious enough that
> it would be mainline-merged by now, not hoovered up into some
> long-term project with no real hint as to /when/ it might be merged.
> Oh, well...

I think (although I'm not sure about it) that this:
http://www.spinics.net/lists/linux-btrfs/msg47283.html
is the first posting of the patch series.
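For anyone following along, the per-chunk check Adam describes can be
sketched with a small model (hypothetical Python, not btrfs internals;
all names and structures here are illustrative). The naive check refuses
a writable degraded mount whenever any chunk profile on the filesystem
tolerates fewer missing devices than are currently missing; the
per-chunk check instead walks every block group and only complains if
that block group actually lost more stripes than its own profile
tolerates. A two-device raid1 that lost a device and was then mounted
degraded read-write once picks up single-profile chunks on the surviving
device, which trips the naive check but passes the per-chunk one:

```python
# Hypothetical model of the two degraded-mount checks discussed in this
# thread. Simplified: real btrfs profiles and chunk layouts are richer.

# Missing devices each profile can tolerate (simplified).
PROFILE_TOLERANCE = {"raid1": 1, "dup": 0, "single": 0}

def naive_check(chunks, missing_devices):
    """Old-style check: allow rw degraded mount only if every profile
    present on the fs tolerates the total number of missing devices."""
    num_missing = len(missing_devices)
    return all(PROFILE_TOLERANCE[c["profile"]] >= num_missing
               for c in chunks)

def per_chunk_check(chunks, missing_devices):
    """Per-chunk check: a chunk is only a problem if it actually has
    more stripes on missing devices than its profile tolerates."""
    for c in chunks:
        lost = sum(1 for dev in c["devices"] if dev in missing_devices)
        if lost > PROFILE_TOLERANCE[c["profile"]]:
            return False
    return True

# Two-device raid1 that lost device 2, then was mounted degraded rw
# once: the new writes landed in single-profile chunks that live
# entirely on the surviving device 1.
chunks = [
    {"profile": "raid1", "devices": [1, 2]},
    {"profile": "raid1", "devices": [1, 2]},
    {"profile": "single", "devices": [1]},
]
missing = {2}

print(naive_check(chunks, missing))      # False: 'single' tolerates 0 missing
print(per_chunk_check(chunks, missing))  # True: the single chunk is on dev 1
```

The second mount would be allowed read-write under the per-chunk check,
which is exactly the "otherwise healthy RAID mounted degraded twice"
case that the current code rejects.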