* Bug(?): btrfs carries on working if part of a device disappears
From: Maik Zumstrull @ 2012-01-05 18:02 UTC
To: linux-btrfs

Hello list,

I hit a funny BIOS bug the other day where the BIOS suddenly sets a
HPA on a random hard disk, leaving only the first 33 MB accessible.
That disk had one device of a multi-device btrfs on it in my case.
(With dm-crypt/LUKS in between, no partitioning or LVM.)

The reason I'm writing to you is that btrfs apparently didn't care at
all. It didn't complain, and it certainly didn't consider "Uhm, maybe
I should stop writing to a file system that mostly doesn't exist
anymore." The only errors I saw in dmesg were from the lower block
device level: someone trying to read or write beyond the end of a
device. An error btrfs apparently didn't mind. It took me a while to
figure out what had happened, during which time btrfsck and the btrfs
kernel part worked together to pretty much totally trash the fs. (I'm
still trying a few things, but I'm not hopeful. Hold the default
backup rant, I can in fact recover anything that was on this from
elsewhere, I think.)

So, I think during mount, btrfs should check the reported size of the
block device, and if it's significantly smaller than fs metadata
implies it must be, mount degraded or read-only or not at all. And
mostly, complain. Loudly.

This was on Debian's linux-image-3.1.0-1-amd64 at version 3.1.6-1.
Other ways this could happen than HPA are LVM or partitioning.


Maik
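The mount-time check proposed above could look roughly like the sketch below. This is only an illustration in user-space C, not btrfs code: `check_device_size`, `enum size_verdict`, and both size parameters are hypothetical names, and it assumes the caller already knows the size the block layer reports and the size recorded for the device in the filesystem metadata (both in bytes).

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result of the check; not a btrfs type. */
enum size_verdict {
	SIZE_OK,	/* device is at least as large as the metadata expects */
	SIZE_TOO_SMALL	/* device shrank: refuse, or mount read-only/degraded */
};

/*
 * Compare the size the block layer reports for a device with the size
 * recorded for that device in the filesystem metadata (both in bytes).
 * Complains on stderr when the device is smaller than expected.
 */
enum size_verdict check_device_size(const char *name,
				    uint64_t reported_bytes,
				    uint64_t recorded_bytes)
{
	if (reported_bytes >= recorded_bytes)
		return SIZE_OK;

	fprintf(stderr,
		"%s: device reports %llu bytes, metadata expects %llu bytes\n",
		name,
		(unsigned long long)reported_bytes,
		(unsigned long long)recorded_bytes);
	return SIZE_TOO_SMALL;
}
```

What the caller does with `SIZE_TOO_SMALL` (fail the mount, go read-only, or mount degraded) is left open here, as it is in the proposal.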
* Re: Bug(?): btrfs carries on working if part of a device disappears
From: Liu Bo @ 2012-01-13 12:07 UTC
To: Maik Zumstrull; +Cc: linux-btrfs

On 01/06/2012 02:02 AM, Maik Zumstrull wrote:
> Hello list,
>
> I hit a funny BIOS bug the other day where the BIOS suddenly sets a
> HPA on a random hard disk, leaving only the first 33 MB accessible.
> That disk had one device of a multi-device btrfs on it in my case.
> (With dm-crypt/LUKS in between, no partitioning or LVM.)
>
> The reason I'm writing to you is that btrfs apparently didn't care at
> all. It didn't complain, and it certainly didn't consider "Uhm, maybe
> I should stop writing to a file system that mostly doesn't exist
> anymore." The only errors I saw in dmesg were from the lower block
> device level: someone trying to read or write beyond the end of a
> device. An error btrfs apparently didn't mind. It took me a while to
> figure out what had happened, during which time btrfsck and the btrfs
> kernel part worked together to pretty much totally trash the fs. (I'm
> still trying a few things, but I'm not hopeful. Hold the default
> backup rant, I can in fact recover anything that was on this from
> elsewhere, I think.)
>
> So, I think during mount, btrfs should check the reported size of the
> block device, and if it's significantly smaller than fs metadata
> implies it must be, mount degraded or read-only or not at all. And
> mostly, complain. Loudly.
>

I have noticed this too: when we run "mkfs.btrfs" with "-b fssize" and
"fssize" is larger than the device size, it does not complain, and we
later get "beyond the end" errors.

So maybe we should limit the mkfs size:

diff --git a/mkfs.c b/mkfs.c
index e3ced19..3ac8525 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1282,6 +1282,8 @@ int main(int ac, char **av)
 		ret = btrfs_prepare_device(fd, file, zero_end, &dev_block_count, &mixed);
 		if (block_count == 0)
 			block_count = dev_block_count;
+		if (block_count > dev_block_count)
+			block_count = dev_block_count;
 	} else {
 		ac = 0;
 		file = av[optind++];

thanks,
liubo

> This was on Debian's linux-image-3.1.0-1-amd64 at version 3.1.6-1.
> Other ways this could happen than HPA are LVM or partitioning.
>
>
> Maik
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
* Re: Bug(?): btrfs carries on working if part of a device disappears
From: Ben Klein @ 2012-01-13 12:51 UTC
To: Liu Bo; +Cc: Maik Zumstrull, linux-btrfs

On 13 January 2012 23:07, Liu Bo <liubo2009@cn.fujitsu.com> wrote:
> On 01/06/2012 02:02 AM, Maik Zumstrull wrote:
>> Hello list,
>>
>> I hit a funny BIOS bug the other day where the BIOS suddenly sets a
>> HPA on a random hard disk, leaving only the first 33 MB accessible.
>> That disk had one device of a multi-device btrfs on it in my case.
>> (With dm-crypt/LUKS in between, no partitioning or LVM.)
>>
>> The reason I'm writing to you is that btrfs apparently didn't care at
>> all. It didn't complain, and it certainly didn't consider "Uhm, maybe
>> I should stop writing to a file system that mostly doesn't exist
>> anymore." The only errors I saw in dmesg were from the lower block
>> device level: someone trying to read or write beyond the end of a
>> device. An error btrfs apparently didn't mind. It took me a while to
>> figure out what had happened, during which time btrfsck and the btrfs
>> kernel part worked together to pretty much totally trash the fs. (I'm
>> still trying a few things, but I'm not hopeful. Hold the default
>> backup rant, I can in fact recover anything that was on this from
>> elsewhere, I think.)
>>
>> So, I think during mount, btrfs should check the reported size of the
>> block device, and if it's significantly smaller than fs metadata
>> implies it must be, mount degraded or read-only or not at all. And
>> mostly, complain. Loudly.
>>
>
> I also notice this, when we "mkfs.btrfs" with a "-b fssize", if "fssize" is
> larger than dev size, it will not complain and get "beyond the end" errors.
>
> so maybe we limit the mkfs size:
>
> diff --git a/mkfs.c b/mkfs.c
> index e3ced19..3ac8525 100644
> --- a/mkfs.c
> +++ b/mkfs.c
> @@ -1282,6 +1282,8 @@ int main(int ac, char **av)
>                 ret = btrfs_prepare_device(fd, file, zero_end, &dev_block_count, &mixed);
>                 if (block_count == 0)
>                         block_count = dev_block_count;
> +               if (block_count > dev_block_count)
> +                       block_count = dev_block_count;
>         } else {
>                 ac = 0;
>                 file = av[optind++];
>
> thanks,
> liubo

It might be a better idea to error out at this point. If the user is
asking for a filesystem larger than what is possible on the device, I
think the mkfs should fail completely.
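A minimal sketch of the "error out" alternative, for comparison with the clamping patch. It is not the actual mkfs.c code: `resolve_block_count` is a hypothetical helper, and it assumes a requested size of 0 means "use the whole device", as the `block_count == 0` branch in the quoted hunk suggests.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of mkfs.c: pick the size to format.
 * requested == 0 means "use the whole device".
 * Returns 0 and stores the size in *out on success; returns -1 without
 * touching *out if the request is larger than the device.
 */
int resolve_block_count(uint64_t requested, uint64_t dev_block_count,
			uint64_t *out)
{
	if (requested == 0) {
		*out = dev_block_count;
		return 0;
	}

	if (requested > dev_block_count) {
		/* Fail instead of silently shrinking the request. */
		fprintf(stderr,
			"requested size %llu is larger than device size %llu\n",
			(unsigned long long)requested,
			(unsigned long long)dev_block_count);
		return -1;
	}

	*out = requested;
	return 0;
}
```

A caller would exit with an error on -1 rather than continue with a smaller filesystem than the user asked for.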