RAID0 wrong (raw) device?

linux-btrfs.vger.kernel.org archive mirror

* RAID0 wrong (raw) device?
@ 2015-08-12 13:07 Ulli Horlacher
  2015-08-12 17:03 ` Chris Murphy
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Ulli Horlacher @ 2015-08-12 13:07 UTC (permalink / raw)
  To: linux-btrfs


I have 2 identical servers with 2 x 2 Hitachi (HGST) SATA disks (and some
other disks) which are mirrored with drbd.
On top of this drbd setup I have created a btrfs RAID0 filesystem.
The problem now is that btrfs shows the raw device instead of the drbd
device.

root@toy02:~# mkfs.btrfs /dev/drbd2 /dev/drbd3
root@toy02:~# btrfs filesystem label /dev/drbd2 data
root@toy02:~# mount /dev/drbd2 /data


root@toy02:~# df -T /data
Filesystem     Type   1K-blocks      Used  Available Use% Mounted on
/dev/sdb       btrfs 3906909856 140031696 3765056176   4% /data

root@toy02:~# btrfs filesystem show /data
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
        Total devices 2 FS bytes used 129.81GiB
        devid    3 size 1.82TiB used 67.03GiB path /dev/drbd2
        devid    4 size 1.82TiB used 67.03GiB path /dev/sdb

Btrfs v3.12

==> btrfs shows the wrong (raw) device /dev/sdb instead of /dev/drbd3 !


root@toy02:~# uname -a; lsb_release -a
Linux toy02 3.13.0-61-generic #100-Ubuntu SMP Wed Jul 29 11:21:34 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.3 LTS
Release:        14.04
Codename:       trusty


root@toy02:~# find /dev -ls | grep drbd
 47453    0 brw-rw----   1 root     disk              Aug 12 14:51 /dev/drbd3
 47433    0 brw-rw----   1 root     disk              Aug 11 14:00 /dev/drbd2
 14706    0 drwxr-xr-x   4 root     root           80 Aug 10 14:17 /dev/drbd
 14713    0 drwxr-xr-x   2 root     root          100 Aug 12 13:40 /dev/drbd/by-res
 41685    0 lrwxrwxrwx   1 root     root           11 Aug 12 14:51 /dev/drbd/by-res/d3 -> ../../drbd3
 42759    0 lrwxrwxrwx   1 root     root           11 Aug 11 14:00 /dev/drbd/by-res/d2 -> ../../drbd2
 14707    0 drwxr-xr-x   3 root     root           60 Aug 10 14:17 /dev/drbd/by-disk
 14708    0 drwxr-xr-x   3 root     root           60 Aug 10 14:17 /dev/drbd/by-disk/disk
 14709    0 drwxr-xr-x   2 root     root          100 Aug 12 13:40 /dev/drbd/by-disk/disk/by-id
 41682    0 lrwxrwxrwx   1 root     root           17 Aug 12 14:51 /dev/drbd/by-disk/disk/by-id/ata-HGST_HUS724020ALA640_PN2134P5G2P2AX -> ../../../../drbd3
 42756    0 lrwxrwxrwx   1 root     root           17 Aug 11 14:00 /dev/drbd/by-disk/disk/by-id/ata-HGST_HUS724020ALA640_PN2134P5G2P2XX -> ../../../../drbd2
 41681    0 lrwxrwxrwx   1 root     root            8 Aug 12 14:51 /dev/block/147:3 -> ../drbd3
 42755    0 lrwxrwxrwx   1 root     root            8 Aug 11 14:00 /dev/block/147:2 -> ../drbd2

root@toy02:~# find /dev -ls | grep HGST
 41682    0 lrwxrwxrwx   1 root     root           17 Aug 12 14:51 /dev/drbd/by-disk/disk/by-id/ata-HGST_HUS724020ALA640_PN2134P5G2P2AX -> ../../../../drbd3
 42756    0 lrwxrwxrwx   1 root     root           17 Aug 11 14:00 /dev/drbd/by-disk/disk/by-id/ata-HGST_HUS724020ALA640_PN2134P5G2P2XX -> ../../../../drbd2
 63889    0 lrwxrwxrwx   1 root     root            9 Aug 12 13:42 /dev/disk/by-id/ata-HGST_HUS724020ALA640_PN2134P5G2P2AX -> ../../sdb
  7429    0 lrwxrwxrwx   1 root     root            9 Aug 10 16:45 /dev/disk/by-id/ata-HGST_HUS724020ALA640_PN2134P5G2P2XX -> ../../sdd



root@toy02:~# hdparm -I /dev/sdb| grep Number:
        Model Number: HGST HUS724020ALA640
        Serial Number: PN2134P5G2P2AX

root@toy02:~# hdparm -I /dev/sdd| grep Number:
        Model Number: HGST HUS724020ALA640
        Serial Number: PN2134P5G2P2XX

root@toy02:~# hdparm -I /dev/sde| grep Number:
        Model Number: HGST HUS724020ALA640
        Serial Number: PN2134P5G2P2AX

/dev/sdb and /dev/sde have the same serial number!
But there are really only 2 HGST drives in the server (and some other
Seagate disks, not relevant here).

root@toy02:~# find /dev -ls | grep sde 
 10391    0 brw-rw----   1 root     disk              Aug 10 16:45 /dev/sde
  8360    0 lrwxrwxrwx   1 root     root            9 Aug 10 16:45 /dev/disk/by-path/pci-0000:08:00.0-scsi-0:1:2:0 -> ../../sde
  8355    0 lrwxrwxrwx   1 root     root            6 Aug 10 16:45 /dev/block/8:64 -> ../sde

root@toy02:~# find /dev -ls | grep sdb
 10382    0 brw-rw----   1 root     disk              Aug 12 13:42 /dev/sdb
 68794    0 lrwxrwxrwx   1 root     root            9 Aug 12 13:42 /dev/disk/by-uuid/411af13f-6cae-4f03-99dc-5941acb3135b -> ../../sdb
 12410    0 lrwxrwxrwx   1 root     root            9 Aug 12 13:42 /dev/disk/by-path/pci-0000:08:00.0-sas-0x1221000002000000-lun-0 -> ../../sdb
 68791    0 lrwxrwxrwx   1 root     root            9 Aug 12 13:42 /dev/disk/by-label/data -> ../../sdb
 63890    0 lrwxrwxrwx   1 root     root            9 Aug 12 13:42 /dev/disk/by-id/wwn-0x5000cca24ec137db -> ../../sdb
 63889    0 lrwxrwxrwx   1 root     root            9 Aug 12 13:42 /dev/disk/by-id/ata-HGST_HUS724020ALA640_PN2134P5G2P2AX -> ../../sdb
 12403    0 lrwxrwxrwx   1 root     root            6 Aug 12 13:42 /dev/block/8:16 -> ../sdb

/dev/sdb and /dev/sde are in reality the same physical disk!
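
A minimal sketch of how such an alias could be spotted mechanically: group
device nodes by the serial number they report and print every serial that is
claimed by more than one node. The function name dup_serials and the
"<device> <serial>" input format are assumptions made for illustration; only
hdparm -I and its "Serial Number:" line come from the output above.

```shell
#!/bin/sh
# Sketch: report serial numbers that are claimed by more than one device node.
# Input (stdin): one "<device> <serial>" pair per line. This format is an
# assumption for illustration, not the output of any particular tool.
dup_serials() {
    awk '
        # Append the device to the list kept for its serial and count it.
        { devs[$2] = (devs[$2] == "" ? $1 : devs[$2] " " $1); count[$2]++ }
        # Print only serials seen on more than one device node.
        END { for (s in count) if (count[s] > 1) print s ": " devs[s] }
    ' | sort
}

# Possible way to build the input on a live system (needs root and hdparm):
#   for d in /dev/sd?; do
#       printf '%s %s\n' "$d" \
#           "$(hdparm -I "$d" | awk '/Serial Number:/ {print $3}')"
#   done | dup_serials
```

Fed with the three device/serial pairs shown above, this would list /dev/sdb
and /dev/sde under the serial they share.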


-- 
Ullrich Horlacher              Informationssysteme und Serverbetrieb
IZUS/TIK                       E-Mail: horlacher@rus.uni-stuttgart.de
Universitaet Stuttgart         Tel:    ++49-711-68565868
Allmandring 30a                Fax:    ++49-711-682357
70550 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20150812130758.GA26529@rus.uni-stuttgart.de>

* Re: RAID0 wrong (raw) device?
  2015-08-12 13:07 RAID0 wrong (raw) device? Ulli Horlacher
@ 2015-08-12 17:03 ` Chris Murphy
  2015-08-12 17:43   ` Hugo Mills
  2015-08-13 12:11   ` Ulli Horlacher
  2015-08-13  7:34 ` anand jain
  2015-08-13 11:44 ` Austin S Hemmelgarn
  2 siblings, 2 replies; 18+ messages in thread
From: Chris Murphy @ 2015-08-12 17:03 UTC (permalink / raw)
  To: Btrfs BTRFS

On Wed, Aug 12, 2015 at 7:07 AM, Ulli Horlacher
<framstag@rus.uni-stuttgart.de> wrote:

> /dev/sdb and /dev/sde are in reality the same physical disk!

When does all of this confusion happen? Is it already confused before
mkfs or only after mkfs or only after mount? I would find out what
instigates it, wipe all signatures from everything, reboot, start from
scratch, and then strace the command that causes the confusion. And
attach that output as well as the entire dmesg to a bug report. Just
my 2 cents, I have no idea what's going on but sounds like a block
layer and/or DRBD bug that's triggered by Btrfs multiple device
setups.

-- 
Chris Murphy

* Re: RAID0 wrong (raw) device?
  2015-08-12 17:03 ` Chris Murphy
@ 2015-08-12 17:43   ` Hugo Mills
  2015-08-12 17:53     ` Chris Murphy
  2015-08-13 12:11   ` Ulli Horlacher
  1 sibling, 1 reply; 18+ messages in thread
From: Hugo Mills @ 2015-08-12 17:43 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, Ulli Horlacher

[adding Ulli back into the cc list]

On Wed, Aug 12, 2015 at 11:03:00AM -0600, Chris Murphy wrote:
> On Wed, Aug 12, 2015 at 7:07 AM, Ulli Horlacher
> <framstag@rus.uni-stuttgart.de> wrote:
> 
> > /dev/sdb and /dev/sde are in reality the same physical disk!
> 
> When does all of this confusion happen? Is it already confused before
> mkfs or only after mkfs or only after mount? I would find out what
> instigates it, wipe all signatures from everything, reboot, start from
> scratch, and then strace the command that causes the confusion. And
> attach that output as well as the entire dmesg to a bug report. Just
> my 2 cents, I have no idea what's going on but sounds like a block
> layer and/or DRBD bug that's triggered by Btrfs multiple device
> setups.

   If (some of) the DRBD host devices are also physically present on
the machine to which the DRBDs are exported, then you're in the same
situation as having block-level snapshots or dd copies of the data --
the FS will see two devices (the backing store and the DRBD) which
have the same UUID. It will pick an arbitrary one to write to, which
is probably not something that the DRBD driver will cope with very
well, I suspect.

   I think the solution here would be to blacklist the backing store
from btrfs dev scan. I recall that there was such a capability at some
point -- I don't know if it made it into the userspace tools?
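
   Until such a blacklist is available, the symptom can at least be
detected. The sketch below reads the output of btrfs filesystem show and
prints every member path that is not a /dev/drbdN node. The function name
non_drbd_members and the /dev/drbdN naming rule are assumptions taken from
this particular setup, not a btrfs-progs feature.

```shell
#!/bin/sh
# Sketch: read `btrfs filesystem show` output on stdin and print every member
# path that does not look like /dev/drbdN. The naming rule is an assumption
# that fits the setup in this thread.
non_drbd_members() {
    # Member lines start with "devid"; the path is the last field.
    awk '$1 == "devid" && $NF !~ /^\/dev\/drbd[0-9]+$/ { print $NF }'
}

# Possible use on a live system (needs the btrfs tool):
#   btrfs filesystem show /data | non_drbd_members
```

An empty result means that all listed members are DRBD devices; any path
printed is a candidate backing store that the scan picked up.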

   Hugo.

-- 
Hugo Mills             | Great oxymorons of the world, no. 7:
hugo@... carfax.org.uk | The Simple Truth
http://carfax.org.uk/  |
PGP: E2AB1DE4          |

* Re: RAID0 wrong (raw) device?
  2015-08-12 17:43   ` Hugo Mills
@ 2015-08-12 17:53     ` Chris Murphy
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Murphy @ 2015-08-12 17:53 UTC (permalink / raw)
  To: Hugo Mills, Btrfs BTRFS, Ulli Horlacher

On Wed, Aug 12, 2015 at 11:43 AM, Hugo Mills <hugo@carfax.org.uk> wrote:
> [adding Ulli back into the cc list]
>
> On Wed, Aug 12, 2015 at 11:03:00AM -0600, Chris Murphy wrote:
>> On Wed, Aug 12, 2015 at 7:07 AM, Ulli Horlacher
>> <framstag@rus.uni-stuttgart.de> wrote:
>>
>> > /dev/sdb and /dev/sde are in reality the same physical disk!
>>
>> When does all of this confusion happen? Is it already confused before
>> mkfs or only after mkfs or only after mount? I would find out what
>> instigates it, wipe all signatures from everything, reboot, start from
>> scratch, and then strace the command that causes the confusion. And
>> attach that output as well as the entire dmesg to a bug report. Just
>> my 2 cents, I have no idea what's going on but sounds like a block
>> layer and/or DRBD bug that's triggered by Btrfs multiple device
>> setups.
>
>    If (some of) the DRBD host devices are also physically present on
> the machine to which the DRBDs are exported, then you're in the same
> situation as having block-level snapshots or dd copies of the data --
> the FS will see two devices (the backing store and the DRBD) which
> have the same UUID. It will pick an arbitrary one to write to, which
> is probably not something that the DRBD driver will cope with very
> well, I suspect.
>
>    I think the solution here would be to blacklist the backing store
> from btrfs dev scan. I recall that there was such a capability at some
> point -- I don't know if it made it into the userspace tools?

That makes sense. But then this would also affect XFS, I'd think,
except XFS will refuse to mount if the kernel sees two of the same fs
UUID.

-- 
Chris Murphy

* Re: RAID0 wrong (raw) device?
  2015-08-12 13:07 RAID0 wrong (raw) device? Ulli Horlacher
  2015-08-12 17:03 ` Chris Murphy
@ 2015-08-13  7:34 ` anand jain
  2015-08-13 12:02   ` Ulli Horlacher
  2015-08-13 11:44 ` Austin S Hemmelgarn
  2 siblings, 1 reply; 18+ messages in thread
From: anand jain @ 2015-08-13  7:34 UTC (permalink / raw)
  To: Ulli Horlacher, linux-btrfs



> root@toy02:~# df -T /data
> Filesystem     Type   1K-blocks      Used  Available Use% Mounted on
> /dev/sdb       btrfs 3906909856 140031696 3765056176   4% /data
>
> root@toy02:~# btrfs filesystem show /data
> Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
>          Total devices 2 FS bytes used 129.81GiB
>          devid    3 size 1.82TiB used 67.03GiB path /dev/drbd2
>          devid    4 size 1.82TiB used 67.03GiB path /dev/sdb
>
> Btrfs v3.12
>
> ==> btrfs shows the wrong (raw) device /dev/sdb instead of /dev/drbd3 !

Don't be too alarmed by that; progs do a bit of userland fabrication
(which is wrong). The kernel may or may not be using sdb. Try the -m option.

In case the kernel is not using the desired device, use the mount
-o device= option or btrfs dev scan <dev> to provide the desired device
path to the kernel.

Thanks, Anand

* Re: RAID0 wrong (raw) device?
  2015-08-12 13:07 RAID0 wrong (raw) device? Ulli Horlacher
  2015-08-12 17:03 ` Chris Murphy
  2015-08-13  7:34 ` anand jain
@ 2015-08-13 11:44 ` Austin S Hemmelgarn
  2015-08-13 12:06   ` Ulli Horlacher
  2015-08-13 22:32   ` Gareth Pye
  2 siblings, 2 replies; 18+ messages in thread
From: Austin S Hemmelgarn @ 2015-08-13 11:44 UTC (permalink / raw)
  To: Ulli Horlacher, linux-btrfs

A couple of observations:
1. BTRFS currently has no knowledge of multipath or anything like that. 
  In theory it should work fine as long as the multiple device instances 
all point to the same storage directly (including having identical block 
addresses), but we still need to add proper handling for it.

2. Be _VERY_ careful using BTRFS on top of _ANY_ kind of shared storage. 
  Most non-clustered filesystems will have issues if multiply mounted, 
but in almost all cases I've personally seen, it _WILL_ cause 
irreparable damage to a BTRFS filesystem (we really need to do something 
like ext4's MMP in BTRFS).

3. See the warnings about doing block level copies and LVM snapshots of 
BTRFS volumes; the same currently applies to using it on DRBD as well, 
with the possible exception of remote DRBD nodes (i.e., ones without a 
local copy of the backing store). In this case, we need to blacklist 
backing devices for stacked storage (I think the same issue may be 
present with BTRFS on an MD-based RAID1 set).
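
As a rough illustration of point 3: all members of a multi-device btrfs 
share the filesystem UUID, so a block-level copy or a visible backing 
device would show up as two nodes carrying the same per-device UUID. The 
sketch below assumes that blkid prints a UUID_SUB field for btrfs members 
(this should be checked on the installed util-linux version); the function 
name dup_uuid_sub is made up for this example.

```shell
#!/bin/sh
# Sketch: read blkid-style lines on stdin and print every UUID_SUB value that
# is reported by more than one device node. Assumes lines of the form
#   /dev/xyz: ... UUID_SUB="<value>" ...
# Lines without a UUID_SUB field are ignored.
dup_uuid_sub() {
    # Turn each matching line into "<uuid_sub> <device>".
    sed -n 's/^\([^:]*\):.* UUID_SUB="\([^"]*\)".*/\2 \1/p' | sort |
    awk '
        { devs[$1] = (devs[$1] == "" ? $2 : devs[$1] " " $2); n[$1]++ }
        END { for (u in n) if (n[u] > 1) print u ": " devs[u] }
    ' | sort
}

# Possible use on a live system (needs root):
#   blkid | dup_uuid_sub
```

Any line printed names device nodes that btrfs cannot tell apart by their 
superblock alone.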

* Re: RAID0 wrong (raw) device?
  2015-08-13  7:34 ` anand jain
@ 2015-08-13 12:02   ` Ulli Horlacher
  2015-08-13 14:55     ` Ulli Horlacher
  0 siblings, 1 reply; 18+ messages in thread
From: Ulli Horlacher @ 2015-08-13 12:02 UTC (permalink / raw)
  To: linux-btrfs

On Thu 2015-08-13 (15:34), anand jain wrote:
> > root@toy02:~# df -T /data
> > Filesystem     Type   1K-blocks      Used  Available Use% Mounted on
> > /dev/sdb       btrfs 3906909856 140031696 3765056176   4% /data
> >
> > root@toy02:~# btrfs filesystem show /data
> > Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
> >          Total devices 2 FS bytes used 129.81GiB
> >          devid    3 size 1.82TiB used 67.03GiB path /dev/drbd2
> >          devid    4 size 1.82TiB used 67.03GiB path /dev/sdb
> >
> > Btrfs v3.12
> >
> > ==> btrfs shows the wrong (raw) device /dev/sdb instead of /dev/drbd3 !
> 
> Don't be too alarmed by that; progs do a bit of userland fabrication
> (which is wrong). The kernel may or may not be using sdb. Try the -m option.

It is really weird: meanwhile (without any mount change or even reboot) I
get: 

root@toy02:~# btrfs filesystem show   
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
        Total devices 2 FS bytes used 106.51GiB
        devid    3 size 1.82TiB used 82.03GiB path /dev/drbd2
        devid    4 size 1.82TiB used 82.03GiB path /dev/drbd3

==> no more /dev/sdb !

BUT:

root@toy02:~# df -T /data
Filesystem     Type   1K-blocks      Used  Available Use% Mounted on
/dev/sdb       btrfs 3906909856 111822636 3793208212   3% /data

root@toy02:~# mount | grep /data
/dev/sdb on /data type btrfs (rw)

root@toy02:~# grep /data /proc/mounts
/dev/drbd2 /data btrfs rw,relatime,space_cache 0 0
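
One possible reading, stated as an assumption and not confirmed in this
thread: /proc/mounts is generated by the kernel, while mount and df on this
release may be reading a userspace table such as /etc/mtab, so the two can
name different devices for the same mountpoint. A minimal sketch that
compares the two tables (function names and default paths are illustrative):

```shell
#!/bin/sh
# mount_source FILE MOUNTPOINT
# Print the device recorded for MOUNTPOINT in an fstab-style table. If the
# mountpoint is listed several times, the last entry wins.
mount_source() {
    awk -v mp="$2" '$2 == mp { src = $1 } END { if (src != "") print src }' "$1"
}

# compare_mount_tables MOUNTPOINT [USERSPACE_TABLE] [KERNEL_TABLE]
# The defaults /etc/mtab and /proc/mounts are assumptions about the system.
compare_mount_tables() {
    mp=$1
    user_tab=${2:-/etc/mtab}
    kern_tab=${3:-/proc/mounts}
    u=$(mount_source "$user_tab" "$mp")
    k=$(mount_source "$kern_tab" "$mp")
    if [ "$u" = "$k" ]; then
        echo "consistent: $k"
    else
        echo "mismatch: userspace=$u kernel=$k"
    fi
}

# Possible use: compare_mount_tables /data
```

If the userspace table turns out to be a symlink to the kernel table, the
comparison is trivially consistent and the explanation has to be sought
elsewhere.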

And still, Linux sees 3 HGST devices (there are really only 2 drives):

root@toy02:~# hdparm -I /dev/sdb | grep Number:
        Model Number:       HGST HUS724020ALA640
        Serial Number:      PN2134P5G2P2AX

root@toy02:~# hdparm -I /dev/sdd | grep Number:
        Model Number:       HGST HUS724020ALA640
        Serial Number:      PN2134P5G2P2XX

root@toy02:~# hdparm -I /dev/sde | grep Number:
        Model Number:       HGST HUS724020ALA640
        Serial Number:      PN2134P5G2P2AX


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum IZUS/TIK         E-Mail: horlacher@tik.uni-stuttgart.de
Universitaet Stuttgart         Tel:    ++49-711-68565868
Allmandring 30a                Fax:    ++49-711-682357
70550 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<55CC488D.4020203@oracle.com>

* Re: RAID0 wrong (raw) device?
  2015-08-13 11:44 ` Austin S Hemmelgarn
@ 2015-08-13 12:06   ` Ulli Horlacher
  2015-08-13 22:32   ` Gareth Pye
  1 sibling, 0 replies; 18+ messages in thread
From: Ulli Horlacher @ 2015-08-13 12:06 UTC (permalink / raw)
  To: linux-btrfs

On Thu 2015-08-13 (07:44), Austin S Hemmelgarn wrote:

> 2. Be _VERY_ careful using BTRFS on top of _ANY_ kind of shared storage. 
>   Most non-clustered filesystems will have issues if multiply mounted, 
> but in almost all cases I've personally seen, it _WILL_ cause 
> irreparable damage to a BTRFS filesystem (we really need to do something 
> like ext4's MMP in BTRFS).

Same with ZFS: it also has no MMP, and when you mount a shared block device
twice you will destroy it irreparably. Kids, do not try this at home - I
have done so ;-)

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum IZUS/TIK         E-Mail: horlacher@tik.uni-stuttgart.de
Universitaet Stuttgart         Tel:    ++49-711-68565868
Allmandring 30a                Fax:    ++49-711-682357
70550 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<55CC830D.2070304@gmail.com>

* Re: RAID0 wrong (raw) device?
  2015-08-12 17:03 ` Chris Murphy
  2015-08-12 17:43   ` Hugo Mills
@ 2015-08-13 12:11   ` Ulli Horlacher
  1 sibling, 0 replies; 18+ messages in thread
From: Ulli Horlacher @ 2015-08-13 12:11 UTC (permalink / raw)
  To: Btrfs BTRFS

On Wed 2015-08-12 (11:03), Chris Murphy wrote:
> On Wed, Aug 12, 2015 at 7:07 AM, Ulli Horlacher
> <framstag@rus.uni-stuttgart.de> wrote:
> 
> > /dev/sdb and /dev/sde are in reality the same physical disk!
> 
> When does all of this confusion happen? Is it already confused before
> mkfs or only after mkfs or only after mount?

I have not looked closely before the mkfs.btrfs, because I noticed no
problems.


> I would find out what instigates it, wipe all signatures from everything,
> reboot, start from scratch

This is not an option for me, because this server (despite its name) is a
production system.


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum IZUS/TIK         E-Mail: horlacher@tik.uni-stuttgart.de
Universitaet Stuttgart         Tel:    ++49-711-68565868
Allmandring 30a                Fax:    ++49-711-682357
70550 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<CAJCQCtRTvn-Xu_ipM7pCTtVuxUm0m7kjqt=F=+q6cj0vOOhF7g@mail.gmail.com>

* Re: RAID0 wrong (raw) device?
  2015-08-13 12:02   ` Ulli Horlacher
@ 2015-08-13 14:55     ` Ulli Horlacher
  2015-08-13 16:24       ` Anand Jain
  0 siblings, 1 reply; 18+ messages in thread
From: Ulli Horlacher @ 2015-08-13 14:55 UTC (permalink / raw)
  To: linux-btrfs

On Thu 2015-08-13 (14:02), Ulli Horlacher wrote:
> On Thu 2015-08-13 (15:34), anand jain wrote:
> 
> > > root@toy02:~# df -T /data
> > > Filesystem     Type   1K-blocks      Used  Available Use% Mounted on
> > > /dev/sdb       btrfs 3906909856 140031696 3765056176   4% /data
> > >
> > > root@toy02:~# btrfs filesystem show /data
> > > Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
> > >          Total devices 2 FS bytes used 129.81GiB
> > >          devid    3 size 1.82TiB used 67.03GiB path /dev/drbd2
> > >          devid    4 size 1.82TiB used 67.03GiB path /dev/sdb
> > >
> > > Btrfs v3.12
> > >
> > > ==> btrfs shows the wrong (raw) device /dev/sdb instead of /dev/drbd3 !
> > 
> > Don't be too alarmed by that; progs do a bit of userland fabrication
> > (which is wrong). The kernel may or may not be using sdb. Try the -m option.
> 
> It is really weird: meanwhile (without any mount change or even reboot) I
> get: 
> 
> root@toy02:~# btrfs filesystem show   
> Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
>         Total devices 2 FS bytes used 106.51GiB
>         devid    3 size 1.82TiB used 82.03GiB path /dev/drbd2
>         devid    4 size 1.82TiB used 82.03GiB path /dev/drbd3

And now, after a reboot:

root@toy02:~/bin# btrfs filesystem show
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
        Total devices 2 FS bytes used 119.82GiB
        devid    3 size 1.82TiB used 82.03GiB path /dev/drbd2
        devid    4 size 1.82TiB used 82.03GiB path /dev/sde

GRMPF!


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum IZUS/TIK         E-Mail: horlacher@tik.uni-stuttgart.de
Universitaet Stuttgart         Tel:    ++49-711-68565868
Allmandring 30a                Fax:    ++49-711-682357
70550 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20150813120211.GA24122@rus.uni-stuttgart.de>

* Re: RAID0 wrong (raw) device?
  2015-08-13 14:55     ` Ulli Horlacher
@ 2015-08-13 16:24       ` Anand Jain
  2015-08-14  7:32         ` Ulli Horlacher
  0 siblings, 1 reply; 18+ messages in thread
From: Anand Jain @ 2015-08-13 16:24 UTC (permalink / raw)
  To: linux-btrfs



On 08/13/2015 10:55 PM, Ulli Horlacher wrote:
> On Thu 2015-08-13 (14:02), Ulli Horlacher wrote:
>> On Thu 2015-08-13 (15:34), anand jain wrote:
>>
>>>> root@toy02:~# df -T /data
>>>> Filesystem     Type   1K-blocks      Used  Available Use% Mounted on
>>>> /dev/sdb       btrfs 3906909856 140031696 3765056176   4% /data
>>>>
>>>> root@toy02:~# btrfs filesystem show /data
>>>> Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
>>>>           Total devices 2 FS bytes used 129.81GiB
>>>>           devid    3 size 1.82TiB used 67.03GiB path /dev/drbd2
>>>>           devid    4 size 1.82TiB used 67.03GiB path /dev/sdb
>>>>
>>>> Btrfs v3.12
>>>>
>>>> ==> btrfs shows the wrong (raw) device /dev/sdb instead of /dev/drbd3 !
>>>
>>> Don't be too alarmed by that; progs do a bit of userland fabrication
>>> (which is wrong). The kernel may or may not be using sdb. Try the -m option.
>>
>> It is really weird: meanwhile (without any mount change or even reboot) I
>> get:
>>
>> root@toy02:~# btrfs filesystem show
>> Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
>>          Total devices 2 FS bytes used 106.51GiB
>>          devid    3 size 1.82TiB used 82.03GiB path /dev/drbd2
>>          devid    4 size 1.82TiB used 82.03GiB path /dev/drbd3
>
> And now, after a reboot:
>
> root@toy02:~/bin# btrfs filesystem show
> Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
>          Total devices 2 FS bytes used 119.82GiB
>          devid    3 size 1.82TiB used 82.03GiB path /dev/drbd2
>          devid    4 size 1.82TiB used 82.03GiB path /dev/sde
>
> GRMPF!

Please use 'btrfs fi show -m' and ignore the output with no option or -d
if the fs is mounted, as -m reads from the kernel.

At mount time you could assemble the correct set of devices using the
mount -o device= option.
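
A minimal sketch of the second suggestion, assuming the two DRBD nodes from
this thread are the intended members. It only builds and prints the mount
command (a dry run); the helper name btrfs_device_opts is made up for this
example. Whether naming the devices this way is enough to make the kernel
prefer them over an already registered raw device is not established here.

```shell
#!/bin/sh
# btrfs_device_opts DEV...
# Join device paths into a "device=A,device=B" mount option string. The
# btrfs device= mount option may be given several times, once per device.
btrfs_device_opts() {
    opts=""
    for dev in "$@"; do
        # Add a comma separator only when something is already in the list.
        opts="${opts:+$opts,}device=$dev"
    done
    printf '%s\n' "$opts"
}

# Dry run: print the command instead of executing it.
echo mount -o "$(btrfs_device_opts /dev/drbd2 /dev/drbd3)" /dev/drbd2 /data
```

After mounting for real, 'btrfs fi show -m' can be used to check which
device paths the kernel actually registered.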

Thanks, -Anand



* Re: RAID0 wrong (raw) device?
  2015-08-13 11:44 ` Austin S Hemmelgarn
  2015-08-13 12:06   ` Ulli Horlacher
@ 2015-08-13 22:32   ` Gareth Pye
  2015-08-13 22:54     ` Hugo Mills
  1 sibling, 1 reply; 18+ messages in thread
From: Gareth Pye @ 2015-08-13 22:32 UTC (permalink / raw)
  To: Austin S Hemmelgarn; +Cc: Ulli Horlacher, linux-btrfs

On Thu, Aug 13, 2015 at 9:44 PM, Austin S Hemmelgarn
<ahferroin7@gmail.com> wrote:
> 3. See the warnings about doing block level copies and LVM snapshots of
> BTRFS volumes; the same currently applies to using it on DRBD as well, with
> the possible exception of remote DRBD nodes (i.e., ones without a local copy
> of the backing store). In this case, we need to blacklist backing devices
> for stacked storage (I think the same issue may be present with BTRFS on an
> MD-based RAID1 set).


I've been using BTRFS on top of DRBD for several years now, what
specifically am I meant to avoid?

I have 6 drives mirrored across a local network, this is done with DRBD.
At any one time only a single server has the 6 drives mounted with btrfs.
Is this a ticking time bomb?

-- 
Gareth Pye - blog.cerberos.id.au
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"

* Re: RAID0 wrong (raw) device?
  2015-08-13 22:32   ` Gareth Pye
@ 2015-08-13 22:54     ` Hugo Mills
  2015-08-13 23:29       ` Gareth Pye
  0 siblings, 1 reply; 18+ messages in thread
From: Hugo Mills @ 2015-08-13 22:54 UTC (permalink / raw)
  To: Gareth Pye; +Cc: Austin S Hemmelgarn, Ulli Horlacher, linux-btrfs

On Fri, Aug 14, 2015 at 08:32:46AM +1000, Gareth Pye wrote:
> On Thu, Aug 13, 2015 at 9:44 PM, Austin S Hemmelgarn
> <ahferroin7@gmail.com> wrote:
> > 3. See the warnings about doing block level copies and LVM snapshots of
> > BTRFS volumes; the same currently applies to using it on DRBD as well, with
> > the possible exception of remote DRBD nodes (i.e., ones without a local copy
> > of the backing store). In this case, we need to blacklist backing devices
> > for stacked storage (I think the same issue may be present with BTRFS on an
> > MD-based RAID1 set).
> 
> 
> I've been using BTRFS on top of DRBD for several years now, what
> specifically am I meant to avoid?
> 
> I have 6 drives mirrored across a local network, this is done with DRBD.
> At any one time only a single server has the 6 drives mounted with btrfs.
> Is this a ticking time bomb?

   There are two things which are potentially worrisome here:

 - Having the same filesystem mounted on more than one machine at a
   time (which you're not doing).

 - Having one or more of the DRBD backing store devices present on the
   same machine that the DRBD filesystem is mounted on (which you may
   be doing).

   Of these, the first is definitely going to be dangerous. The second
may or may not be, depending on how well DRBD copes with direct writes
to its backing store, and how lucky you are about the kernel
identifying the right devices to use for the FS.

   Hugo.

-- 
Hugo Mills             | "Big data" doesn't just mean increasing the font
hugo@... carfax.org.uk | size.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |

* Re: RAID0 wrong (raw) device?
  2015-08-13 22:54     ` Hugo Mills
@ 2015-08-13 23:29       ` Gareth Pye
  2015-08-14 11:26         ` Austin S Hemmelgarn
  0 siblings, 1 reply; 18+ messages in thread
From: Gareth Pye @ 2015-08-13 23:29 UTC (permalink / raw)
  To: Hugo Mills, Gareth Pye, Austin S Hemmelgarn, Ulli Horlacher,
	linux-btrfs

I would have been surprised if any generic file system copes well with
being mounted in several locations at once, DRBD appears to fight
really hard to avoid that happening :)

And yeah I'm doing the second thing, I've successfully switched which
of the servers is active a few times with no ill effect (I would
expect scrub to give me some significant warnings if one of the disks
was a couple of months out of date) so I'm presuming that DRBD copes
reasonably well or I've been very lucky. Either that luck is very
deterministic, DRBD copes correctly, or I've been very very lucky.

Very very lucky doesn't sound likely.

On Fri, Aug 14, 2015 at 8:54 AM, Hugo Mills <hugo@carfax.org.uk> wrote:
> On Fri, Aug 14, 2015 at 08:32:46AM +1000, Gareth Pye wrote:
>> On Thu, Aug 13, 2015 at 9:44 PM, Austin S Hemmelgarn
>> <ahferroin7@gmail.com> wrote:
>> > 3. See the warnings about doing block level copies and LVM snapshots of
>> > BTRFS volumes; the same currently applies to using it on DRBD as well, with
>> > the possible exception of remote DRBD nodes (i.e., ones without a local copy
>> > of the backing store). In this case, we need to blacklist backing devices
>> > for stacked storage (I think the same issue may be present with BTRFS on an
>> > MD-based RAID1 set).
>>
>>
>> I've been using BTRFS on top of DRBD for several years now, what
>> specifically am I meant to avoid?
>>
>> I have 6 drives mirrored across a local network, this is done with DRBD.
>> At any one time only a single server has the 6 drives mounted with btrfs.
>> Is this a ticking time bomb?
>
>    There are two things which are potentially worrisome here:
>
>  - Having the same filesystem mounted on more than one machine at a
>    time (which you're not doing).
>
>  - Having one or more of the DRBD backing store devices present on the
>    same machine that the DRBD filesystem is mounted on (which you may
>    be doing).
>
>    Of these, the first is definitely going to be dangerous. The second
> may or may not be, depending on how well DRBD copes with direct writes
> to its backing store, and how lucky you are about the kernel
> identifying the right devices to use for the FS.
>
>    Hugo.
>
> --
> Hugo Mills             | "Big data" doesn't just mean increasing the font
> hugo@... carfax.org.uk | size.
> http://carfax.org.uk/  |
> PGP: E2AB1DE4          |



-- 
Gareth Pye - blog.cerberos.id.au
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"

* Re: RAID0 wrong (raw) device?
  2015-08-13 16:24       ` Anand Jain
@ 2015-08-14  7:32         ` Ulli Horlacher
  2015-08-15  0:02           ` Anand Jain
  0 siblings, 1 reply; 18+ messages in thread
From: Ulli Horlacher @ 2015-08-14  7:32 UTC (permalink / raw)
  To: linux-btrfs

On Fri 2015-08-14 (00:24), Anand Jain wrote:

> >> root@toy02:~# btrfs filesystem show
> >> Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
> >>          Total devices 2 FS bytes used 106.51GiB
> >>          devid    3 size 1.82TiB used 82.03GiB path /dev/drbd2
> >>          devid    4 size 1.82TiB used 82.03GiB path /dev/drbd3
> >
> > And now, after a reboot:
> >
> > root@toy02:~/bin# btrfs filesystem show
> > Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
> >          Total devices 2 FS bytes used 119.82GiB
> >          devid    3 size 1.82TiB used 82.03GiB path /dev/drbd2
> >          devid    4 size 1.82TiB used 82.03GiB path /dev/sde
> >
> > GRMPF!
> 
> Please use 'btrfs fi show -m' and ignore the output with no option or -d
> if the fs is mounted, as -m reads from the kernel.

There is now a new behaviour: right after the btrfs mount I briefly see the
wrong raw device /dev/sde, and a few seconds later the correct /dev/drbd3:


root@toy02:/etc# umount /data
root@toy02:/etc# mount /data
root@toy02:/etc# btrfs filesystem show
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
        Total devices 2 FS bytes used 109.56GiB
        devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
        devid    4 size 1.82TiB used 63.03GiB path /dev/sde

Btrfs v3.12
root@toy02:/etc# btrfs filesystem show
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
        Total devices 2 FS bytes used 109.56GiB
        devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
        devid    4 size 1.82TiB used 63.03GiB path /dev/drbd3

Btrfs v3.12

root@toy02:/etc# btrfs filesystem show -m
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
        Total devices 2 FS bytes used 109.56GiB
        devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
        devid    4 size 1.82TiB used 63.03GiB path /dev/drbd3

Btrfs v3.12


Still, the kernel sees 3 instead of (really) 2 HGST drives:

root@toy02:/etc# hdparm -I /dev/sdb | grep Number:
        Model Number:       HGST HUS724020ALA640
        Serial Number:      PN2134P5G2P2AX

root@toy02:/etc# hdparm -I /dev/sde | grep Number:
        Model Number:       HGST HUS724020ALA640
        Serial Number:      PN2134P5G2P2AX

root@toy02:/etc# hdparm -I /dev/sdd | grep Number:
        Model Number:       HGST HUS724020ALA640
        Serial Number:      PN2134P5G2P2XX
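
The same comparison can be done for all disks at once. The following is a
rough sketch, not a tested tool: duplicate_serials is a hypothetical helper
name, and it expects "DEVICE SERIAL" pairs on stdin, one per line. It prints
each serial number that shows up under more than one device node.

```shell
#!/bin/sh
# duplicate_serials
# Hypothetical helper: reads "DEVICE SERIAL" pairs on stdin and prints every
# serial that occurs under more than one device node, followed by the nodes.
duplicate_serials() {
    awk '
        # Skip lines without a serial (e.g. hdparm failed for that device).
        NF >= 2 { count[$2]++; devs[$2] = devs[$2] " " $1 }
        END {
            for (s in count)
                if (count[s] > 1)
                    print s ":" devs[s]
        }'
}

# Collecting the pairs on a live system (assumes root and hdparm):
#   for d in /dev/sd?; do
#       printf '%s %s\n' "$d" \
#           "$(hdparm -I "$d" 2>/dev/null | awk '/Serial Number:/ {print $3}')"
#   done | duplicate_serials
```

With the three drives above, this would report the serial PN2134P5G2P2AX for
both /dev/sdb and /dev/sde.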

-- 
Ullrich Horlacher              Informationssysteme und Serverbetrieb
IZUS/TIK                       E-Mail: horlacher@rus.uni-stuttgart.de
Universitaet Stuttgart         Tel:    ++49-711-68565868
Allmandring 30a                Fax:    ++49-711-682357
70550 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<55CCC4AB.2080600@oracle.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: RAID0 wrong (raw) device?
  2015-08-13 23:29       ` Gareth Pye
@ 2015-08-14 11:26         ` Austin S Hemmelgarn
  0 siblings, 0 replies; 18+ messages in thread
From: Austin S Hemmelgarn @ 2015-08-14 11:26 UTC (permalink / raw)
  To: Gareth Pye, Hugo Mills, Ulli Horlacher, linux-btrfs

On 2015-08-13 19:29, Gareth Pye wrote:
> I would have been surprised if any generic file system copes well with
> being mounted in several locations at once, DRBD appears to fight
> really hard to avoid that happening :)
>
> And yeah I'm doing the second thing, I've successfully switched which
> of the servers is active a few times with no ill effect (I would
> expect scrub to give me some significant warnings if one of the disks
> was a couple of months out of date) so I'm presuming that DRBD copes
> reasonably well or I've been very lucky. Either that luck is very
> deterministic, DRBD copes correctly, or I've been very very lucky.
>
> Very very lucky doesn't sound likely.
>
Yeah, I'd be willing to bet that DRBD does cope well with direct writes 
to the backing store (either that or it prevents the kernel from doing 
that, which would be even better and would not surprise me at all).  In 
my experience it's one of the most resilient shared storage options out 
there.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: RAID0 wrong (raw) device?
  2015-08-14  7:32         ` Ulli Horlacher
@ 2015-08-15  0:02           ` Anand Jain
  2015-08-15 10:09             ` Ulli Horlacher
  0 siblings, 1 reply; 18+ messages in thread
From: Anand Jain @ 2015-08-15  0:02 UTC (permalink / raw)
  To: linux-btrfs


First of all, there is a known issue in btrfs with handling multiple paths /
instances of the same device image. An earlier fix for this caused a
regression, and my survey
    [survey]  BTRFS_IOC_DEVICES_READY return status
almost told me not to fix the bug.

But this is just a reporting issue; it confuses users and should
be fixed.


> There is now a new behaviour: right after the btrfs mount I briefly see the
> wrong raw device /dev/sde, and a few seconds later the correct
> /dev/drbd3 appears:

Yep, possible. But it does not mean that btrfs in the kernel is using the
new path; it is just a reporting bug.



(Please use the -m option.)
>
> root@toy02:/etc# umount /data
> root@toy02:/etc# mount /data
> root@toy02:/etc# btrfs filesystem show
> Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
>          Total devices 2 FS bytes used 109.56GiB
>          devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
>          devid    4 size 1.82TiB used 63.03GiB path /dev/sde
>
> Btrfs v3.12
> root@toy02:/etc# btrfs filesystem show
> Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
>          Total devices 2 FS bytes used 109.56GiB
>          devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
>          devid    4 size 1.82TiB used 63.03GiB path /dev/drbd3
>
> Btrfs v3.12
>
> root@toy02:/etc# btrfs filesystem show -m
> Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
>          Total devices 2 FS bytes used 109.56GiB
>          devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
>          devid    4 size 1.82TiB used 63.03GiB path /dev/drbd3
>
> Btrfs v3.12
>



> Still, the kernel sees 3 instead of (really) 2 HGST drives:
>
> root@toy02:/etc# hdparm -I /dev/sdb | grep Number:
>          Model Number:       HGST HUS724020ALA640
>          Serial Number:      PN2134P5G2P2AX
>
> root@toy02:/etc# hdparm -I /dev/sde | grep Number:
>          Model Number:       HGST HUS724020ALA640
>          Serial Number:      PN2134P5G2P2AX

This is important to know, but it is not a btrfs issue. Do you have multiple
host paths reaching this device with serial # PN2134P5G2P2AX?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: RAID0 wrong (raw) device?
  2015-08-15  0:02           ` Anand Jain
@ 2015-08-15 10:09             ` Ulli Horlacher
  0 siblings, 0 replies; 18+ messages in thread
From: Ulli Horlacher @ 2015-08-15 10:09 UTC (permalink / raw)
  To: linux-btrfs

On Sat 2015-08-15 (08:02), Anand Jain wrote:

> First of all, there is a known issue in btrfs with handling multiple paths /
> instances of the same device image. An earlier fix for this caused a
> regression, and my survey
>     [survey]  BTRFS_IOC_DEVICES_READY return status
> almost told me not to fix the bug.

I subscribed to this list only this week, so I am a newbie :-)


> > There is now a new behaviour: right after the btrfs mount I briefly see the
> > wrong raw device /dev/sde, and a few seconds later the correct
> > /dev/drbd3 appears:
> 
> Yep, possible. But it does not mean that btrfs in the kernel is using the
> new path; it is just a reporting bug.

Which one is the reporting bug: /dev/sde or /dev/drbd3?

> > root@toy02:/etc# btrfs filesystem show -m
> > Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
> >          Total devices 2 FS bytes used 109.56GiB
> >          devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
> >          devid    4 size 1.82TiB used 63.03GiB path /dev/drbd3
> 
> 
> > Still, the kernel sees 3 instead of (really) 2 HGST drives:
> >
> > root@toy02:/etc# hdparm -I /dev/sdb | grep Number:
> >          Model Number:       HGST HUS724020ALA640
> >          Serial Number:      PN2134P5G2P2AX
> >
> > root@toy02:/etc# hdparm -I /dev/sde | grep Number:
> >          Model Number:       HGST HUS724020ALA640
> >          Serial Number:      PN2134P5G2P2AX
> 
> This is important to know, but it is not a btrfs issue. Do you have multiple
> host paths reaching this device with serial # PN2134P5G2P2AX?

root@toy02:~# find /dev -ls | grep PN2134P5G2P2AX
 14354    0 lrwxrwxrwx   1 root     root           17 Aug 14 09:00 /dev/drbd/by-disk/disk/by-id/ata-HGST_HUS724020ALA640_PN2134P5G2P2AX -> ../../../../drbd3
 13640    0 lrwxrwxrwx   1 root     root            9 Aug 13 16:25 /dev/disk/by-id/ata-HGST_HUS724020ALA640_PN2134P5G2P2AX -> ../../sdb

root@toy02:~# find /dev -ls | grep sdb
  7417    0 brw-rw----   1 root     disk              Aug 13 16:25 /dev/sdb
 12366    0 lrwxrwxrwx   1 root     root            9 Aug 13 16:25 /dev/disk/by-path/pci-0000:08:00.0-sas-0x1221000002000000-lun-0 -> ../../sdb
 13641    0 lrwxrwxrwx   1 root     root            9 Aug 13 16:25 /dev/disk/by-id/wwn-0x5000cca24ec137db -> ../../sdb
 13640    0 lrwxrwxrwx   1 root     root            9 Aug 13 16:25 /dev/disk/by-id/ata-HGST_HUS724020ALA640_PN2134P5G2P2AX -> ../../sdb
 12356    0 lrwxrwxrwx   1 root     root            6 Aug 13 16:25 /dev/block/8:16 -> ../sdb

root@toy02:~# find /dev -ls | grep sde
 13353    0 brw-rw----   1 root     disk              Aug 13 16:24 /dev/sde
 15725    0 lrwxrwxrwx   1 root     root            9 Aug 13 16:25 /dev/disk/by-uuid/411af13f-6cae-4f03-99dc-5941acb3135b -> ../../sde
 15724    0 lrwxrwxrwx   1 root     root            9 Aug 13 16:25 /dev/disk/by-label/data -> ../../sde
  9394    0 lrwxrwxrwx   1 root     root            9 Aug 13 16:24 /dev/disk/by-path/pci-0000:08:00.0-scsi-0:1:2:0 -> ../../sde
  9387    0 lrwxrwxrwx   1 root     root            6 Aug 13 16:24 /dev/block/8:64 -> ../sde
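
The grep above matches on names only. To list every symlink that really
resolves to a given device node, something like the following could be used.
It is a sketch under assumptions: links_to is a hypothetical helper name, and
'readlink -f' (GNU coreutils or busybox) is assumed to be available.

```shell
#!/bin/sh
# links_to DIR TARGET
# Hypothetical helper: prints every symlink below DIR whose fully resolved
# target is the same file as TARGET.
links_to() {
    dir=$1
    target=$(readlink -f "$2")
    find "$dir" -type l | while IFS= read -r link; do
        # Resolve the whole symlink chain before comparing.
        if [ "$(readlink -f "$link")" = "$target" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Usage on a live system:
#   links_to /dev /dev/sde
#   links_to /dev /dev/drbd3
```

In the listing above, the by-uuid and by-label links for the filesystem point
to /dev/sde and not to /dev/drbd3.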

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum IZUS/TIK         E-Mail: horlacher@tik.uni-stuttgart.de
Universitaet Stuttgart         Tel:    ++49-711-68565868
Allmandring 30a                Fax:    ++49-711-682357
70550 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<55CE81A6.5070305@oracle.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-08-15 10:09 UTC | newest]

Thread overview: 18+ messages
2015-08-12 13:07 RAID0 wrong (raw) device? Ulli Horlacher
2015-08-12 17:03 ` Chris Murphy
2015-08-12 17:43   ` Hugo Mills
2015-08-12 17:53     ` Chris Murphy
2015-08-13 12:11   ` Ulli Horlacher
2015-08-13  7:34 ` anand jain
2015-08-13 12:02   ` Ulli Horlacher
2015-08-13 14:55     ` Ulli Horlacher
2015-08-13 16:24       ` Anand Jain
2015-08-14  7:32         ` Ulli Horlacher
2015-08-15  0:02           ` Anand Jain
2015-08-15 10:09             ` Ulli Horlacher
2015-08-13 11:44 ` Austin S Hemmelgarn
2015-08-13 12:06   ` Ulli Horlacher
2015-08-13 22:32   ` Gareth Pye
2015-08-13 22:54     ` Hugo Mills
2015-08-13 23:29       ` Gareth Pye
2015-08-14 11:26         ` Austin S Hemmelgarn
