* Can't mount degraded. How to remove/add drives OFFLINE?
From: Timothy Normand Miller @ 2015-08-14 18:12 UTC
To: Btrfs BTRFS
Sorry about that empty email. I hit a wrong key, and gmail decided to send.
Anyhow, my replacement drive is going to arrive this evening, and I
need to know how to add it to my btrfs array. Here's the situation:
- I had a drive fail, so I removed it and mounted degraded.
- I hooked up a replacement drive, did an "add" on that one, and did a
"delete missing".
- During the rebalance, the replacement drive failed, there were OOPSes, etc.
- Now, although all of my data is there, I can't mount degraded,
because btrfs is complaining that too many devices are missing (3 are
there, but it sees 2 missing).
So I could use some help with cleaning up this mess. All the data is
there, so I need to know how to either force it to mount degraded, or
add and remove devices offline. Where do I begin?
Also, doesn't it seem a bit arbitrary that there are "too many
missing," when all of the data is there? If I understand correctly,
all four drives in my RAID1 should all have copies of the metadata,
and of the remaining three good drives, there should be one or two
copies of every data block. So it's all there, but btrfs has decided,
based on the NUMBER of missing devices, that it won't mount.
Shouldn't it refuse to mount if it knows there is data missing? For
that matter, why should it even refuse in that case? If some data
might be missing, it should just throw errors when you try to access
that missing data. Right?
Thanks!
--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
* Re: Can't mount degraded. How to remove/add drives OFFLINE?
From: Chris Murphy @ 2015-08-14 18:44 UTC
To: Timothy Normand Miller, Btrfs BTRFS
On Fri, Aug 14, 2015 at 12:12 PM, Timothy Normand Miller
<theosib@gmail.com> wrote:
> Sorry about that empty email. I hit a wrong key, and gmail decided to send.
>
> Anyhow, my replacement drive is going to arrive this evening, and I
> need to know how to add it to my btrfs array. Here's the situation:
>
> - I had a drive fail, so I removed it and mounted degraded.
> - I hooked up a replacement drive, did an "add" on that one, and did a
> "delete missing".
> - During the rebalance, the replacement drive failed, there were OOPSes, etc.
> - Now, although all of my data is there, I can't mount degraded,
> because btrfs is complaining that too many devices are missing (3 are
> there, but it sees 2 missing).
It might be related to this (long) bug:
https://bugzilla.kernel.org/show_bug.cgi?id=92641
While Btrfs RAID 1 can tolerate only a single device failure, what you
have is an in-progress rebuild of a missing device. If the replacement
also goes missing, the volume should be no worse off than it was before.
But Btrfs doesn't see it this way; instead it sees two separate missing
devices, concludes that too many devices are missing, and refuses to
proceed. And there's no mechanism to remove missing devices unless you
can mount rw. So it's stuck.
> So I could use some help with cleaning up this mess. All the data is
> there, so I need to know how to either force it to mount degraded, or
> add and remove devices offline. Where do I begin?
You can try asking on IRC. I have no ideas for this scenario; I've
tried and failed. My case was a throwaway volume. What should still be
possible is using btrfs restore to copy the files off.
> Also, doesn't it seem a bit arbitrary that there are "too many
> missing," when all of the data is there? If I understand correctly,
> all four drives in my RAID1 should all have copies of the metadata,
No, that's not correct. RAID 1 means 2 copies of metadata. In a
4-device RAID 1 that's still only 2 copies. It is not n-way RAID 1.
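To make the two-copy point concrete, here is a minimal Python sketch. It is a toy model, not Btrfs's actual allocator: the device names, the chunk count, and the round-robin pairing are assumptions made up for illustration. It only shows that with exactly two copies per chunk, any single device can disappear without losing a chunk, while losing two devices can lose some.

```python
from itertools import combinations

DEVICES = ["devA", "devB", "devC", "devD"]  # hypothetical 4-device volume


def place_chunks(num_chunks, devices):
    """Toy placement: each chunk gets exactly 2 copies on 2 distinct devices.

    Pairs are handed out round-robin. The real allocator chooses devices
    differently; the "exactly two copies" rule is the point here.
    """
    pairs = list(combinations(devices, 2))
    return [set(pairs[i % len(pairs)]) for i in range(num_chunks)]


def lost_chunks(chunks, missing):
    """Return indexes of chunks whose copies all sit on missing devices."""
    gone = set(missing)
    return [i for i, copies in enumerate(chunks) if copies <= gone]


chunks = place_chunks(12, DEVICES)

# Any single missing device still leaves one copy of every chunk.
single_loss_ok = all(not lost_chunks(chunks, [d]) for d in DEVICES)

# Two missing devices lose every chunk that lived on exactly that pair.
double_loss = lost_chunks(chunks, ["devA", "devB"])

print("survives any single device loss:", single_loss_ok)
print("chunks lost without devA and devB:", double_loss)
```

Adding more devices to this model spreads the pairs out but never adds a third copy, which is why a 4-device RAID 1 is no more redundant per chunk than a 2-device one.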
But that doesn't matter here. The problem is that Btrfs has a narrow
idea of the volume: it assumes, without context, that once the number of
devices is below the minimum, the volume can't be mounted. In reality,
an exception exists if the failure is for an in-progress rebuild of a
missing drive. That drive failing should mean the volume is no worse
off than before, but Btrfs doesn't know that.
Pretty sure about that anyway.
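The distinction can be sketched in Python. This is a hypothetical model, not kernel code: the function names, the tolerance of one missing device for RAID 1, and the example chunk layout are all assumptions. It contrasts a check that only counts missing devices with one that asks whether every chunk still has a readable copy.

```python
RAID1_TOLERATED_MISSING = 1  # assumed limit for the count-based policy


def count_based_ok(missing_devices):
    """Count-only policy: refuse once too many devices are missing."""
    return len(missing_devices) <= RAID1_TOLERATED_MISSING


def chunk_based_ok(chunks, missing_devices):
    """Alternative policy: allow the mount if every chunk keeps one copy."""
    missing = set(missing_devices)
    return all(copies - missing for copies in chunks)


# Hypothetical layout: "old" failed first; "new" was its replacement and
# failed partway through the rebuild.
chunks = [
    {"dev1", "dev2"},
    {"dev2", "dev3"},
    {"dev1", "new"},  # already rebuilt onto the replacement, degraded again
    {"dev3", "old"},  # not yet rebuilt, still degraded from the first failure
]
missing = ["old", "new"]

print("count-based check allows mount:", count_based_ok(missing))
print("chunk-based check allows mount:", chunk_based_ok(chunks, missing))
```

In this layout the count-based check refuses because two devices are missing, while the chunk-based check passes because every chunk still has one surviving copy on dev1, dev2, or dev3.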
> and of the remaining three good drives, there should be one or two
> copies of every data block. So it's all there, but btrfs has decided,
> based on the NUMBER of missing devices, that it won't mount.
> Shouldn't it refuse to mount if it knows there is data missing? For
> that matter, why should it even refuse in that case? If some data
> might be missing, it should just throw errors when you try to access
> that missing data. Right?
I think no data is missing, no metadata is missing, and Btrfs is
confused and stuck in this case.
--
Chris Murphy
* Re: Can't mount degraded. How to remove/add drives OFFLINE?
From: Timothy Normand Miller @ 2015-08-14 19:03 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS
I'm not sure my situation is quite like the one you linked, so here's
my bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=102881
--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
* Re: Can't mount degraded. How to remove/add drives OFFLINE?
From: Anand Jain @ 2015-08-14 23:49 UTC
To: Timothy Normand Miller, Btrfs BTRFS
On 08/15/2015 02:12 AM, Timothy Normand Miller wrote:
> Sorry about that empty email. I hit a wrong key, and gmail decided to send.
>
> Anyhow, my replacement drive is going to arrive this evening, and I
> need to know how to add it to my btrfs array. Here's the situation:
>
> - I had a drive fail, so I removed it and mounted degraded.
That's a bit dangerous to do without the patch below. The patch has more
details on why.
> - I hooked up a replacement drive, did an "add" on that one, and did a
> "delete missing".
> - During the rebalance, the replacement drive failed, there were OOPSes, etc.
> - Now, although all of my data is there, I can't mount degraded,
> because btrfs is complaining that too many devices are missing (3 are
> there, but it sees 2 missing).
This is addressed in the patch
[PATCH 23/23] Btrfs: allow -o rw,degraded for single group profile
Thanks, Anand
* Re: Can't mount degraded. How to remove/add drives OFFLINE?
From: Timothy Normand Miller @ 2015-08-15 0:12 UTC
To: Anand Jain; +Cc: Btrfs BTRFS
On Fri, Aug 14, 2015 at 7:49 PM, Anand Jain <anand.jain@oracle.com> wrote:
>
>>
>> - I had a drive fail, so I removed it and mounted degraded.
>
>
> That's a bit dangerous to do without the patch below. The patch has more
> details on why.
Just to be clear, I removed the drive (the original failed drive) when
the power was off, then powered up, and then mounted degraded. That's
not dangerous that I know of.
>
>> - I hooked up a replacement drive, did an "add" on that one, and did a
>> "delete missing".
>> - During the rebalance, the replacement drive failed, there were OOPSes,
>> etc.
>> - Now, although all of my data is there, I can't mount degraded,
>> because btrfs is complaining that too many devices are missing (3 are
>> there, but it sees 2 missing).
>
>
>
> This is addressed in the patch
>
> [PATCH 23/23] Btrfs: allow -o rw,degraded for single group profile
>
Where is this patch, and what kernel versions can this be applied to?
--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
* Re: Can't mount degraded. How to remove/add drives OFFLINE?
From: Anand Jain @ 2015-08-15 0:28 UTC
To: Timothy Normand Miller; +Cc: Btrfs BTRFS
> Just to be clear, I removed the drive (the original failed drive) when
> the power was off, then powered up, and then mounted degraded. That's
> not dangerous that I know of.
The patch has the details; please refer to it.
>
> Where is this patch, and what kernel versions can this be applied to?
https://patchwork.kernel.org/patch/7014141/
It's against 4.3, but it should apply cleanly to earlier kernels.
thanks
Anand
* Re: Can't mount degraded. How to remove/add drives OFFLINE?
From: Chris Murphy @ 2015-08-15 1:39 UTC
To: Anand Jain; +Cc: Timothy Normand Miller, Btrfs BTRFS
I thought for a second that maybe the problem is due to the "phantom"
single chunk(s) created at mkfs time. I redid the test, and did a
balance to get rid of the single chunk. I did this right after
populating the volume with some data. But the problem still happens.
---
Chris Murphy
* Re: Can't mount degraded. How to remove/add drives OFFLINE?
From: Timothy Normand Miller @ 2015-08-15 2:32 UTC
To: Anand Jain; +Cc: Btrfs BTRFS
I applied that patch to my 4.1.4 kernel, it mounted degraded, and now
it's balancing to the new drive.
Thanks for all the help!
--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project
* Can't mount degraded. How to remove/add drives OFFLINE?
From: Timothy Normand Miller @ 2015-08-14 18:06 UTC
To: Btrfs BTRFS
My
--
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project