linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* “Bug”-report: inconsistency kernel <-> tools
@ 2012-08-28 19:52 M G Berberich
  2012-08-30 18:24 ` Goffredo Baroncelli
  0 siblings, 1 reply; 6+ messages in thread
From: M G Berberich @ 2012-08-28 19:52 UTC (permalink / raw)
  To: linux-btrfs

Hello,

We had set up a btrfs-fs over 6 hot-plugable SAS-disks for
testing and got it into a state where kernel and btrfs-tools do not
agree any more about the state of the filesystem.

We do not remember exaclty what we did, but roughly it was something
like this (on the running system). THIS IS FROM MEMORY!

(1) pulled out one disk
(2) removed disk from btrfs
(3) rebalanced btrfs
(4) pulled out another disk
(5) removed disk from btrfs
(6) rebalanced btrfs

This went fine sofar.

(7) reinserted disk (and rebooted)
    At some point before reboot the first 10 sectors of one disk
    were zeroed to test if the disk gets removed from the btrfs.

Now btrfs-tools showed:

---------------------------------------------------------------------------
# btrfs fi show                      
failed to read /dev/sr0
Label: 'BTRFS_RAID'  uuid: 807193fd-17de-4088-9a54-3e7cacdc89db
        Total devices 6 FS bytes used 3.07GB
        devid    4 size 931.00GB used 75.00GB path /dev/sdf
        devid    5 size 931.00GB used 324.03GB path /dev/sde
        devid    6 size 931.00GB used 83.03GB path /dev/sdd
        devid    3 size 931.00GB used 326.03GB path /dev/sdc
        devid    2 size 931.00GB used 326.03GB path /dev/sdb
        devid    1 size 931.00GB used 324.04GB path /dev/sda

Btrfs Btrfs v0.19
---------------------------------------------------------------------------

As far as we can tell, only four of the disks are considered part of
the btrfs by kernel. There were only four “btrfs: bdev”-lines in dmesg
and only four disks took part in balancing. “btrfs device scan” says:

  unable to scan the device '/dev/sdd' - Device or resource busy

and balance does not balance theses two devices (of 6)

It was neither possible to remove the disk from the btrfs via “btrfs
device delete” nor adding them via “btrfs device add”.

(8) a colleague swaped the two disk

Now btrfs-tools showed:

---------------------------------------------------------------------------
# btrfs fi show
failed to read /dev/sr0
Label: 'BTRFS_RAID'  uuid: 807193fd-17de-4088-9a54-3e7cacdc89db
        Total devices 5 FS bytes used 3.01GB
        devid    6 size 931.00GB used 83.03GB path /dev/sdf
        devid    4 size 931.00GB used 75.00GB path /dev/sdd
        devid    5 size 931.00GB used 325.03GB path /dev/sde
        devid    3 size 931.00GB used 326.03GB path /dev/sdc
        devid    2 size 931.00GB used 325.03GB path /dev/sdb
        devid    1 size 931.51GB used 326.04GB path /dev/sda

Btrfs Btrfs v0.19
---------------------------------------------------------------------------

Claiming the btrfs has 5 disk, but listing 6 disks out of 5 (6 of 5).

He finally managed to get the btrfs complete again by overwriting the
first 100G of the two disk. After this the btrfs-tools (correctly)
stated a filesystem with 4 disk and it was possible to add the two
disk again.


Assumption:
kernel and btrfs do not share the same view of the filesystem.

In this state commands to repair the filesystem do not work, because
they are either rejected by the tools or by the kernel.

A tool that allows a disk/partition to be marked as not-a-btrfs-part
would be nice.

A “/proc/btrfs” showing the kernels view of the filesystem would be
usefull.

	MfG
	bmg

-- 
„Des is völlig wurscht, was heut beschlos- | M G Berberich
 sen wird: I bin sowieso dagegn!“          | berberic@fmi.uni-passau.de
(SPD-Stadtrat Kurt Schindler; Regensburg)  | www.fmi.uni-passau.de/~berberic

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: “Bug”-report: inconsistency kernel <-> tools
  2012-08-28 19:52 “Bug”-report: inconsistency kernel <-> tools M G Berberich
@ 2012-08-30 18:24 ` Goffredo Baroncelli
  2012-08-30 18:37   ` Hugo Mills
  2012-08-31 19:08   ` Goffredo Baroncelli
  0 siblings, 2 replies; 6+ messages in thread
From: Goffredo Baroncelli @ 2012-08-30 18:24 UTC (permalink / raw)
  To: M G Berberich; +Cc: linux-btrfs, Hubert Kario

On 08/28/2012 09:52 PM, M G Berberich wrote:
> Hello,
>
> We had set up a btrfs-fs over 6 hot-plugable SAS-disks for
> testing and got it into a state where kernel and btrfs-tools do not
> agree any more about the state of the filesystem.
>
> We do not remember exaclty what we did, but roughly it was something
> like this (on the running system). THIS IS FROM MEMORY!
>
> (1) pulled out one disk
> (2) removed disk from btrfs
> (3) rebalanced btrfs
> (4) pulled out another disk
> (5) removed disk from btrfs
> (6) rebalanced btrfs
>
> This went fine sofar.
>
> (7) reinserted disk (and rebooted)
>      At some point before reboot the first 10 sectors of one disk
>      were zeroed to test if the disk gets removed from the btrfs.

IIRC the superblock is not placed at the beginning of the disk. On the 
basis of [1] it should be near the 64KB (around the sector #128)


[1] 
https://btrfs.wiki.kernel.org/index.php/User:Wtachi/On-disk_Format#Superblock
>
> Now btrfs-tools showed:
>
> ---------------------------------------------------------------------------
> # btrfs fi show
> failed to read /dev/sr0
> Label: 'BTRFS_RAID'  uuid: 807193fd-17de-4088-9a54-3e7cacdc89db
>          Total devices 6 FS bytes used 3.07GB
>          devid    4 size 931.00GB used 75.00GB path /dev/sdf
>          devid    5 size 931.00GB used 324.03GB path /dev/sde
>          devid    6 size 931.00GB used 83.03GB path /dev/sdd
>          devid    3 size 931.00GB used 326.03GB path /dev/sdc
>          devid    2 size 931.00GB used 326.03GB path /dev/sdb
>          devid    1 size 931.00GB used 324.04GB path /dev/sda

"btrfs filesystem show" shows the content of the disks, which could be 
unrelated to the kernel status. Pay attention that if the data is not 
flushed to the disk the report of "btrfs fi show" could be unreliable.

I posted few days ago a patch which adds the sysfs support to btrfs. 
With this support it is possible to know the real state of the disks.

For example I have a filesystem with 4 disks (note "Total devices 4"):

   ghigo@emulato:~$ sudo btrfs fi show
   Label: 'btrfs3'  uuid: 2a66286d-63e9-4ed5-b347-5af5e4ada814
	Total devices 4 FS bytes used 284.00KB
	devid    4 size 100.00GB used 8.01GB path /dev/vdj
	devid    3 size 100.00GB used 6.04GB path /dev/vdi
	devid    5 size 100.00GB used 0.00 path /dev/vdh
	devid    1 size 100.00GB used 7.05GB path /dev/vdg

   Btrfs Btrfs v0.19

My sysfs interface says that the filesystem is composed by 4 disks:

   ghigo@emulato:~$ cat /sys/fs/btrfs/filesystems/2a66286d-
   63e9-4ed5b347-5af5e4ada814/fs_devices/open_devices
   4

Then I remove 1 disk

   ghigo@emulato:~$ sudo btrfs dev del /dev/vdi  /mnt/btrfs3/

Now the sysfs interface says:

   ghigo@emulato:~$ cat /sys/fs/btrfs/filesystems/2a66286d-
   63e9-4ed5b347-5af5e4ada814/fs_devices/open_devices
   3

But "btrfs filesystem show" says (note still "Total devices 4"):

   ghigo@emulato:~$ sudo btrfs fi show
   failed to read /dev/sr0
   Label: 'btrfs3'  uuid: 2a66286d-63e9-4ed5-b347-5af5e4ada814
	Total devices 4 FS bytes used 92.00KB
	devid    4 size 100.00GB used 7.00GB path /dev/vdj
	devid    3 size 100.00GB used 6.04GB path /dev/vdi
	devid    5 size 100.00GB used 5.06GB path /dev/vdh
	devid    1 size 100.00GB used 6.08GB path /dev/vdg

   Btrfs Btrfs v0.19

Then I do a sync

   ghigo@emulato:~$ sync
   ghigo@emulato:~$ sudo btrfs fi show
   failed to read /dev/sr0
   Label: 'btrfs3'  uuid: 2a66286d-63e9-4ed5-b347-5af5e4ada814
	Total devices 3 FS bytes used 92.00KB
	devid    4 size 100.00GB used 7.00GB path /dev/vdj
	devid    3 size 100.00GB used 6.04GB path /dev/vdi
	devid    5 size 100.00GB used 5.06GB path /dev/vdh
	devid    1 size 100.00GB used 6.08GB path /dev/vdg

   Btrfs Btrfs v0.19

(note "Total devices 3")

And magically the filesystem is now composed by three disks. However 4 
physical devices are show. This because the disk /dev/vdi superblock 
says that the disk is still valid (after the "btrfs device del" the disk 
is not touched any more)

In the past Hubert posted a patch [2] to clear a btrfs superblock. A 
further enhancement of the "btrfs device del" could be to reset 
automatically the first superblock (leaving the backup ones unaffected).



[2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/17065
>
> Btrfs Btrfs v0.19
> ---------------------------------------------------------------------------
>
> As far as we can tell, only four of the disks are considered part of
> the btrfs by kernel. There were only four “btrfs: bdev”-lines in dmesg
> and only four disks took part in balancing. “btrfs device scan” says:
>
>    unable to scan the device '/dev/sdd' - Device or resource busy
>
> and balance does not balance theses two devices (of 6)
>
> It was neither possible to remove the disk from the btrfs via “btrfs
> device delete” nor adding them via “btrfs device add”.
>
> (8) a colleague swaped the two disk
>
> Now btrfs-tools showed:
>
> ---------------------------------------------------------------------------
> # btrfs fi show
> failed to read /dev/sr0
> Label: 'BTRFS_RAID'  uuid: 807193fd-17de-4088-9a54-3e7cacdc89db
>          Total devices 5 FS bytes used 3.01GB
>          devid    6 size 931.00GB used 83.03GB path /dev/sdf
>          devid    4 size 931.00GB used 75.00GB path /dev/sdd
>          devid    5 size 931.00GB used 325.03GB path /dev/sde
>          devid    3 size 931.00GB used 326.03GB path /dev/sdc
>          devid    2 size 931.00GB used 325.03GB path /dev/sdb
>          devid    1 size 931.51GB used 326.04GB path /dev/sda
>
> Btrfs Btrfs v0.19
> ---------------------------------------------------------------------------
>
> Claiming the btrfs has 5 disk, but listing 6 disks out of 5 (6 of 5).
>
> He finally managed to get the btrfs complete again by overwriting the
> first 100G of the two disk. After this the btrfs-tools (correctly)
> stated a filesystem with 4 disk and it was possible to add the two
> disk again.
>
>
> Assumption:
> kernel and btrfs do not share the same view of the filesystem.
>
> In this state commands to repair the filesystem do not work, because
> they are either rejected by the tools or by the kernel.
>
> A tool that allows a disk/partition to be marked as not-a-btrfs-part
> would be nice.
>
> A “/proc/btrfs” showing the kernels view of the filesystem would be
> usefull.
>
> 	MfG
> 	bmg
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: “Bug”-report: inconsistency kernel <-> tools
  2012-08-30 18:24 ` Goffredo Baroncelli
@ 2012-08-30 18:37   ` Hugo Mills
  2012-08-31 19:08   ` Goffredo Baroncelli
  1 sibling, 0 replies; 6+ messages in thread
From: Hugo Mills @ 2012-08-30 18:37 UTC (permalink / raw)
  To: kreijack; +Cc: M G Berberich, linux-btrfs, Hubert Kario

[-- Attachment #1: Type: text/plain, Size: 872 bytes --]

On Thu, Aug 30, 2012 at 08:24:53PM +0200, Goffredo Baroncelli wrote:
> On 08/28/2012 09:52 PM, M G Berberich wrote:
> >(7) reinserted disk (and rebooted)
> >     At some point before reboot the first 10 sectors of one disk
> >     were zeroed to test if the disk gets removed from the btrfs.
> 
> IIRC the superblock is not placed at the beginning of the disk. On
> the basis of [1] it should be near the 64KB (around the sector #128)

   Just for the record, the first is at 64KiB; each subsequent one is
shifted 12 bits left (256MiB, 1TiB, 4EiB, 16ZiB, 64YiB).

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
      --- This chap Anon is writing some perfectly lovely stuff ---      
                             at the moment.                              

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: “Bug”-report: inconsistency kernel <-> tools
  2012-08-30 18:24 ` Goffredo Baroncelli
  2012-08-30 18:37   ` Hugo Mills
@ 2012-08-31 19:08   ` Goffredo Baroncelli
  2012-08-31 21:37     ` [BTRFS-PROGS][BUG][PATCH] Incorrect detection of a removed device [was Re: “Bug”-report: inconsistency kernel <-> tools] Goffredo Baroncelli
  2012-09-11 17:31     ` [RESPOST][BTRFS-PROGS][PATCH] btrfs_read_dev_super(): uninitialized variable Goffredo Baroncelli
  1 sibling, 2 replies; 6+ messages in thread
From: Goffredo Baroncelli @ 2012-08-31 19:08 UTC (permalink / raw)
  To: M G Berberich; +Cc: linux-btrfs, Hubert Kario

On 08/30/2012 08:24 PM, Goffredo Baroncelli wrote:
>
> And magically the filesystem is now composed by three disks. However 4
> physical devices are show. This because the disk /dev/vdi superblock
> says that the disk is still valid (after the "btrfs device del" the disk
> is not touched any more)

I have to correct myself. When a device is removed its superblock is 
zero-ed (from btrfs_rm_device():

[...]
         /*
          * at this point, the device is zero sized.  We want to
          * remove it from the devices list and zero out the old super
          */
         if (clear_super) {
                 /* make sure this device isn't detected as part of
                  * the FS anymore
                  */
                 memset(&disk_super->magic, 0, sizeof(disk_super->magic));
                 set_buffer_dirty(bh);
                 sync_dirty_buffer(bh);
         }
[...]


clear_super is set to true when the device is writeable.



However making a test I found both the behaviours: sometime the removed 
disk disappears from the output of "btrfs fi show" and sometime not...

May be that there is a bug somewhere...


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [BTRFS-PROGS][BUG][PATCH] Incorrect detection of a removed device  [was Re: “Bug”-report: inconsistency kernel <-> tools]
  2012-08-31 19:08   ` Goffredo Baroncelli
@ 2012-08-31 21:37     ` Goffredo Baroncelli
  2012-09-11 17:31     ` [RESPOST][BTRFS-PROGS][PATCH] btrfs_read_dev_super(): uninitialized variable Goffredo Baroncelli
  1 sibling, 0 replies; 6+ messages in thread
From: Goffredo Baroncelli @ 2012-08-31 21:37 UTC (permalink / raw)
  To: Yan Zheng; +Cc: M G Berberich, linux-btrfs, Chris Mason

[-- Attachment #1: Type: text/plain, Size: 4621 bytes --]

Hi all, Yan,

On 08/31/2012 09:08 PM, Goffredo Baroncelli wrote:
> However making a test I found both the behaviours: sometime the removed
> disk disappears from the output of "btrfs fi show" and sometime not...
>
> May be that there is a bug somewhere...


I became crazy looking at this bug. I found that a debian package raises 
the bug, but when I compiled the source by hand the bug disappeared... 
Finally I discovered that this bug depends by an uninitialized variable; 
this lead to the unpredictable behaviour.

The problem is that when a device is removed, the function 
btrfs_read_dev_super() should ignore it. In fact the kernel clear the 
magic number in the *first* superblock. However the function 
btrfs_read_dev_super() checks also the backup superblocks and when it 
found a valid one, the function returns success.

Lukely (?) this function fails very often because the fsid of the backup 
superblock are checked against an uninitialized buffer. However when 
this check has success this device is considered suitable even tough it 
was removed.

The BUG is in the function btrfs_read_dev_super():


int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 
sb_bytenr)
{
         u8 fsid[BTRFS_FSID_SIZE];
[...]

line 933:

         for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) {
                 bytenr = btrfs_sb_offset(i);
                 ret = pread64(fd, &buf, sizeof(buf), bytenr);
                 if (ret < sizeof(buf))
                         break;

                 if (btrfs_super_bytenr(&buf) != bytenr ||
                     strncmp((char *)(&buf.magic), BTRFS_MAGIC,
                             sizeof(buf.magic)))
                         continue;

                 if (i == 0)
                         memcpy(fsid, buf.fsid, sizeof(fsid));
                 else if (memcmp(fsid, buf.fsid, sizeof(fsid)))
                         continue;

                 if (btrfs_super_generation(&buf) > transid) {
                         memcpy(sb, &buf, sizeof(*sb));
                         transid = btrfs_super_generation(&buf);
                 }
         }


When a device is removed, the *first* superblock magic field is zeroed 
so the first check "strncmp((char *)(&buf.magic), BTRFS_MAGIC,..." fails 
, "i" is increased, and the "continue" statement is execute.
Then the check "memcmp(fsid...." became unreliable in the next iteration 
because the fsid variable is not initialized.

To me the test is unclear: what is the purpose to continue when the 
superblocks (the original one and its backup) refer to different fsid: 
there is something wrong which require an user decision...
May be that Yan added this check (see commit 
50860d6e31c28cf4789ef099729dfbce2108620a ) to converting from different 
format ? Yan do you remember something about this code ?

The enclosed patch corrects the initialization of the fsid variable; 
morover if the fsid are different between the superblocks (the original 
one and its backup) the function fails because the device cannot be 
trusted. Finally it is handled the special case when the magic fields is 
zeroed in the *first* superblock. In this case the device is skipped.

BR
G.Baroncelli

-- 
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>

diff --git a/disk-io.c b/disk-io.c
index b21a87f..82fc3b8 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -910,6 +910,7 @@ struct btrfs_root *open_ctree_fd(int fp, const char 
*path, u64 sb_bytenr,
  int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 
sb_bytenr)
  {
  	u8 fsid[BTRFS_FSID_SIZE];
+	int fsid_is_initialized = 0;
  	struct btrfs_super_block buf;
  	int i;
  	int ret;
@@ -936,15 +937,26 @@ int btrfs_read_dev_super(int fd, struct 
btrfs_super_block *sb, u64 sb_bytenr)
  		if (ret < sizeof(buf))
  			break;

-		if (btrfs_super_bytenr(&buf) != bytenr ||
-		    strncmp((char *)(&buf.magic), BTRFS_MAGIC,
+		if (btrfs_super_bytenr(&buf) != bytenr )
+			continue;
+		/* if magic is NULL, the device was removed */
+		if (buf.magic == 0 && i==0)
+			return -1;
+		if (strncmp((char *)(&buf.magic), BTRFS_MAGIC,
  			    sizeof(buf.magic)))
  			continue;

-		if (i == 0)
+		if (!fsid_is_initialized){
  			memcpy(fsid, buf.fsid, sizeof(fsid));
-		else if (memcmp(fsid, buf.fsid, sizeof(fsid)))
-			continue;
+			fsid_is_initialized = 1;
+		} else if (memcmp(fsid, buf.fsid, sizeof(fsid))) {
+			/*
+			 * the superblocks (the original one and
+			 * its backups) contain data of different
+			 * filesystems -> the disk cannot be trusted
+			 */
+			return -1;
+		}

  		if (btrfs_super_generation(&buf) > transid) {
  			memcpy(sb, &buf, sizeof(*sb));

[-- Attachment #2: btrfs_read_dev_super-bug.patch --]
[-- Type: text/x-patch, Size: 1362 bytes --]

diff --git a/disk-io.c b/disk-io.c
index b21a87f..82fc3b8 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -910,6 +910,7 @@ struct btrfs_root *open_ctree_fd(int fp, const char *path, u64 sb_bytenr,
 int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 sb_bytenr)
 {
 	u8 fsid[BTRFS_FSID_SIZE];
+	int fsid_is_initialized = 0;
 	struct btrfs_super_block buf;
 	int i;
 	int ret;
@@ -936,15 +937,26 @@ int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 sb_bytenr)
 		if (ret < sizeof(buf))
 			break;
 
-		if (btrfs_super_bytenr(&buf) != bytenr ||
-		    strncmp((char *)(&buf.magic), BTRFS_MAGIC,
+		if (btrfs_super_bytenr(&buf) != bytenr )
+			continue;
+		/* if magic is NULL, the device was removed */
+		if (buf.magic == 0 && i==0)
+			return -1;
+		if (strncmp((char *)(&buf.magic), BTRFS_MAGIC,
 			    sizeof(buf.magic)))
 			continue;
 
-		if (i == 0)
+		if (!fsid_is_initialized){
 			memcpy(fsid, buf.fsid, sizeof(fsid));
-		else if (memcmp(fsid, buf.fsid, sizeof(fsid)))
-			continue;
+			fsid_is_initialized = 1;
+		} else if (memcmp(fsid, buf.fsid, sizeof(fsid))) {
+			/* 
+			 * the superblocks (the original one and
+			 * its backups) contain data of different
+			 * filesystems -> the disk cannot be trusted
+			 */
+			return -1;
+		}
 
 		if (btrfs_super_generation(&buf) > transid) {
 			memcpy(sb, &buf, sizeof(*sb));

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [RESPOST][BTRFS-PROGS][PATCH] btrfs_read_dev_super(): uninitialized variable
  2012-08-31 19:08   ` Goffredo Baroncelli
  2012-08-31 21:37     ` [BTRFS-PROGS][BUG][PATCH] Incorrect detection of a removed device [was Re: “Bug”-report: inconsistency kernel <-> tools] Goffredo Baroncelli
@ 2012-09-11 17:31     ` Goffredo Baroncelli
  1 sibling, 0 replies; 6+ messages in thread
From: Goffredo Baroncelli @ 2012-09-11 17:31 UTC (permalink / raw)
  To: Chris Mason; +Cc: Yan Zheng, M G Berberich, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 3471 bytes --]

This is a repost because I rebased the change. The first attempt was 
done with the email "[BTRFS-PROGS][BUG][PATCH] Incorrect detection of a 
removed device  [was Re: “Bug”-report: inconsistency kernel <-> tools]" 
dated 08/31/2012.

In the function btrfs_read_dev_super() it is possible to use the 
variable fsid without initialisation.

In btrfs_read_dev_super(), during the scan of the superblock the 
variable fsid is initialised with the value of fsid of the first 
superblock. But if the first superblock contains an incorrect signature 
this initialisation is skipped.

int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 
sb_bytenr)
{
          u8 fsid[BTRFS_FSID_SIZE];
[...]

line 933:

          for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) {
                  bytenr = btrfs_sb_offset(i);
                  ret = pread64(fd, &buf, sizeof(buf), bytenr);
                  if (ret < sizeof(buf))
                          break;

                  if (btrfs_super_bytenr(&buf) != bytenr ||
                      strncmp((char *)(&buf.magic), BTRFS_MAGIC,
                              sizeof(buf.magic)))
                          continue;

                  if (i == 0)
                          memcpy(fsid, buf.fsid, sizeof(fsid));
                  else if (memcmp(fsid, buf.fsid, sizeof(fsid)))
                          continue;

                 [...]
          }

This bug could be triggered by the command "btrfs device delete", which 
zeros the signature of the first superblock.

The enclosed patch corrects the initialisation of the fsid variable; 
moreover if the fsid are different between the superblocks (the original 
one and its backups) the function fails because the device cannot be 
trusted. Finally it is handled the special case when the magic fields is 
zeroed in the *first* superblock. In this case the device is skipped.

Please apply, thank.

You can pull from
	http://cassiopea.homelinux.net/git/btrfs-progs-unstable.git
branch
	btrfs_read_dev_super-bug
	

BR
G.Baroncelli

-- 
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>

diff --git a/disk-io.c b/disk-io.c
index b21a87f..82fc3b8 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -910,6 +910,7 @@ struct btrfs_root *open_ctree_fd(int fp, const char 
*path, u64 sb_bytenr,
  int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 
sb_bytenr)
  {
  	u8 fsid[BTRFS_FSID_SIZE];
+	int fsid_is_initialized = 0;
  	struct btrfs_super_block buf;
  	int i;
  	int ret;
@@ -936,15 +937,26 @@ int btrfs_read_dev_super(int fd, struct 
btrfs_super_block *sb, u64 sb_bytenr)
  		if (ret < sizeof(buf))
  			break;

-		if (btrfs_super_bytenr(&buf) != bytenr ||
-		    strncmp((char *)(&buf.magic), BTRFS_MAGIC,
+		if (btrfs_super_bytenr(&buf) != bytenr )
+			continue;
+		/* if magic is NULL, the device was removed */
+		if (buf.magic == 0 && i==0)
+			return -1;
+		if (strncmp((char *)(&buf.magic), BTRFS_MAGIC,
  			    sizeof(buf.magic)))
  			continue;

-		if (i == 0)
+		if (!fsid_is_initialized){
  			memcpy(fsid, buf.fsid, sizeof(fsid));
-		else if (memcmp(fsid, buf.fsid, sizeof(fsid)))
-			continue;
+			fsid_is_initialized = 1;
+		} else if (memcmp(fsid, buf.fsid, sizeof(fsid))) {
+			/*
+			 * the superblocks (the original one and
+			 * its backups) contain data of different
+			 * filesystems -> the disk cannot be trusted
+			 */
+			return -1;
+		}

  		if (btrfs_super_generation(&buf) > transid) {
  			memcpy(sb, &buf, sizeof(*sb));

[-- Attachment #2: btrfs_read_dev_super-bug.diff --]
[-- Type: text/x-patch, Size: 1362 bytes --]

diff --git a/disk-io.c b/disk-io.c
index b21a87f..82fc3b8 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -910,6 +910,7 @@ struct btrfs_root *open_ctree_fd(int fp, const char *path, u64 sb_bytenr,
 int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 sb_bytenr)
 {
 	u8 fsid[BTRFS_FSID_SIZE];
+	int fsid_is_initialized = 0;
 	struct btrfs_super_block buf;
 	int i;
 	int ret;
@@ -936,15 +937,26 @@ int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 sb_bytenr)
 		if (ret < sizeof(buf))
 			break;
 
-		if (btrfs_super_bytenr(&buf) != bytenr ||
-		    strncmp((char *)(&buf.magic), BTRFS_MAGIC,
+		if (btrfs_super_bytenr(&buf) != bytenr )
+			continue;
+		/* if magic is NULL, the device was removed */
+		if (buf.magic == 0 && i==0)
+			return -1;
+		if (strncmp((char *)(&buf.magic), BTRFS_MAGIC,
 			    sizeof(buf.magic)))
 			continue;
 
-		if (i == 0)
+		if (!fsid_is_initialized){
 			memcpy(fsid, buf.fsid, sizeof(fsid));
-		else if (memcmp(fsid, buf.fsid, sizeof(fsid)))
-			continue;
+			fsid_is_initialized = 1;
+		} else if (memcmp(fsid, buf.fsid, sizeof(fsid))) {
+			/* 
+			 * the superblocks (the original one and
+			 * its backups) contain data of different
+			 * filesystems -> the disk cannot be trusted
+			 */
+			return -1;
+		}
 
 		if (btrfs_super_generation(&buf) > transid) {
 			memcpy(sb, &buf, sizeof(*sb));

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-09-11 17:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-28 19:52 “Bug”-report: inconsistency kernel <-> tools M G Berberich
2012-08-30 18:24 ` Goffredo Baroncelli
2012-08-30 18:37   ` Hugo Mills
2012-08-31 19:08   ` Goffredo Baroncelli
2012-08-31 21:37     ` [BTRFS-PROGS][BUG][PATCH] Incorrect detection of a removed device [was Re: “Bug”-report: inconsistency kernel <-> tools] Goffredo Baroncelli
2012-09-11 17:31     ` [RESPOST][BTRFS-PROGS][PATCH] btrfs_read_dev_super(): uninitialized variable Goffredo Baroncelli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).