linux-raid.vger.kernel.org archive mirror
* NFS and /dev/mdXpY
@ 2010-04-17 15:57 Vlad Glagolev
       [not found] ` <20100417195747.5fae8834.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Vlad Glagolev @ 2010-04-17 15:57 UTC (permalink / raw)
  To: linux-nfs-u79uwXL29TY76Z2rM5mHXA; +Cc: linux-raid-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 5391 bytes --]

Well, hello there,

I also posted this on the linux-kernel ML, and am posting it here for more specific analysis.

I ran into this problem today while trying to mount an NFS share on an OpenBSD box.
The mount succeeded without any visible errors, but I wasn't able to cd into it. The printed error was:

ksh: cd: /storage - Stale NFS file handle

By the way, the partition is 5.5 TB. I tried another one on my box and it mounted successfully. It was possible to manage files there too. Its size is ~3GB.
That's why I first suspected some size limitation in OpenBSD/Linux/NFS.

While talking on #openbsd @ freenode, I discovered this via tcpdump on both sides:

http://pastebin.ca/1864713

Googling for 3 hours didn't help at all; some posts described a similar issue, but either with no answer at all or without a full description.

Then I started to experiment with another Linux box to rule out other possibilities.

On another box I also have nfs-utils 1.1.6 and kernel 2.6.32. Mounting that big partition was unsuccessful; it just got stuck. In tcpdump I saw this:

--
    172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x25e4 (correct), seq 1, ack 1, win 92, options [nop,nop,TS val 1808029984 ecr 1618999], length 0
    172.17.2.5.3565791363 > 172.17.2.2.2049: 40 null
    172.17.2.2.2049 > 172.17.2.5.884: Flags [.], cksum 0x25e6 (correct), seq 1, ack 45, win 46, options [nop,nop,TS val 1618999 ecr 1808029984], length 0
    172.17.2.2.2049 > 172.17.2.5.3565791363: reply ok 24 null
    172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x259b (correct), seq 45, ack 29, win 92, options [nop,nop,TS val 1808029985 ecr 1618999], length 0
    172.17.2.5.3582568579 > 172.17.2.2.2049: 40 null
    172.17.2.2.2049 > 172.17.2.5.3582568579: reply ok 24 null
    172.17.2.5.3599345795 > 172.17.2.2.2049: 92 fsinfo fh Unknown/0100030005030100000800000000000000000000000000000000000000000000
    172.17.2.2.2049 > 172.17.2.5.3599345795: reply ok 32 fsinfo ERROR: Stale NFS file handle POST:
    172.17.2.5.3616123011 > 172.17.2.2.2049: 92 fsinfo fh Unknown/0100030005030100000800000000000000000000000000000000000000000000
    172.17.2.2.2049 > 172.17.2.5.3616123011: reply ok 32 fsinfo ERROR: Stale NFS file handle POST:
    172.17.2.5.884 > 172.17.2.2.2049: Flags [F.], cksum 0x2449 (correct), seq 281, ack 129, win 92, options [nop,nop,TS val 1808029986 ecr 1618999], length 0
    172.17.2.2.2049 > 172.17.2.5.884: Flags [F.], cksum 0x2476 (correct), seq 129, ack 282, win 46, options [nop,nop,TS val 1618999 ecr 1808029986], length 0
    172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x2448 (correct), seq 282, ack 130, win 92, options [nop,nop,TS val 1808029986 ecr 1618999], length 0
--

familiar messages, eh?

At that point I concluded that it's not an OpenBSD problem. So only NFS and Linux are left as possible causes.
It was possible to mount that small partition on the Linux box too, the same as on OpenBSD.

But after that I noticed something interesting: I have different sw raid setups on my storage server.
I tried to mount a small partition on the same md device where the 5.5TB partition is located, and got the same
error message! Now I'm sure it's about the NFS <-> MDADM setup, which is why I named the topic like this.

A bit about my setup:

# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4] [multipath] 
md3 : active raid1 sdc1[0] sdd1[1]
      61376 blocks [2/2] [UU]
      
md1 : active raid5 sdc2[2] sdd2[3] sdb2[1] sda2[0]
      3153408 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      
md2 : active raid5 sdc3[2] sdd3[3] sdb3[1] sda3[0]
      5857199616 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      
md0 : active raid1 sdb1[1] sda1[0]
      61376 blocks [2/2] [UU]
      
unused devices: <none>

md0, md1, and md3 aren't so interesting, since the file systems are created directly on them. This is the _problem device_:

# parted /dev/md2
GNU Parted 2.2
Using /dev/md2
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p free                                                           
p free
Model: Unknown (unknown)
Disk /dev/md2: 5998GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system     Name   Flags
        17.4kB  1049kB  1031kB  Free Space
 1      1049kB  2147MB  2146MB  linux-swap(v1)  swap
 2      2147MB  23.6GB  21.5GB  xfs             home
 3      23.6GB  24.7GB  1074MB  xfs             temp
 4      24.7GB  35.4GB  10.7GB  xfs             user
 5      35.4GB  51.5GB  16.1GB  xfs             var
 6      51.5GB  5998GB  5946GB  xfs             vault
        5998GB  5998GB  507kB   Free Space

# ls /dev/md?*
/dev/md0  /dev/md1  /dev/md2  /dev/md2p1  /dev/md2p2  /dev/md2p3  /dev/md2p4  /dev/md2p5  /dev/md2p6  /dev/md3

It's a very handy partitioning scheme: by adding more HDDs (growing the RAID 5) I can extend only the /vault partition, while "losing" (a.k.a. not using for this partition) only ~1 GB of space from every 2TB drive.

The system boots ok, xfs_check passes with no problems, etc.
The only problem: it's not possible to use NFS shares on any partition of the /dev/md2 device.

Finally, my question to NFS and MDADM developers: any idea?

-- 
Dont wait to die to find paradise...
--
Cheerz,
Vlad "Stealth" Glagolev

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]


* Re: NFS and /dev/mdXpY
       [not found] ` <20100417195747.5fae8834.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
@ 2010-04-21 16:39   ` Steve Cousins
  2010-04-21 16:48     ` Vlad Glagolev
  2010-04-22 18:25   ` J. Bruce Fields
  1 sibling, 1 reply; 15+ messages in thread
From: Steve Cousins @ 2010-04-21 16:39 UTC (permalink / raw)
  To: Vlad Glagolev
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-raid-u79uwXL29TY76Z2rM5mHXA

Since md2 with XFS is acting fine locally, it seems to be an NFS issue. 
What export and mounting parameters are you using?



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: NFS and /dev/mdXpY
  2010-04-21 16:39   ` Steve Cousins
@ 2010-04-21 16:48     ` Vlad Glagolev
       [not found]       ` <20100421204819.b86ee3f7.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Vlad Glagolev @ 2010-04-21 16:48 UTC (permalink / raw)
  To: Steve Cousins; +Cc: linux-nfs, linux-raid

[-- Attachment #1: Type: text/plain, Size: 7211 bytes --]

Thanks for the reply, Steve!

The parameters are pretty trivial: (rw,insecure) for exports, and the defaults when mounting via the ``mount host:/path /path'' command.

Yes. That sounds interesting, since XFS works fine with these partitions.
Also, I must say these are WD20EARS drives (with a 4 KB sector size, though parted says it's 512 B).

I also tried another NFS daemon implementation (cvs version, not .22) -- unfsd (unfs3).
It mounts ok, but when I try to write any file to the server, I get the same error (Stale NFS file handle).

And on the server side in dmesg I see this:

--
NFS: server 172.17.2.2 error: fileid changed
fsid 0:f: expected fileid 0x2033, got 0xb6d1e05fa150ce09
NFS: server 172.17.2.2 error: fileid changed
fsid 0:f: expected fileid 0x2033, got 0x26550b0132c0b1
NFS: server 172.17.2.2 error: fileid changed
fsid 0:f: expected fileid 0x2033, got 0x8202a60053000020
NFS: server 172.17.2.2 error: fileid changed
fsid 0:f: expected fileid 0x2033, got 0xe542f93ebc8fe157
NFS: server 172.17.2.2 error: fileid changed
fsid 0:f: expected fileid 0x2033, got 0xc00cd74ea904301
--

It looks like the NFS protocol doesn't like something about partitioned software RAID.

On Wed, 21 Apr 2010 12:39:08 -0400
Steve Cousins <steve.cousins@maine.edu> wrote:

> Since md2 with XFS is acting fine locally, it seems to be an NFS issue. 
> What export and mounting parameters are you using?
> 
> 


-- 
Dont wait to die to find paradise...
--
Cheerz,
Vlad "Stealth" Glagolev

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]


* Re: NFS and /dev/mdXpY
       [not found]       ` <20100421204819.b86ee3f7.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
@ 2010-04-21 17:09         ` Roger Heflin
       [not found]           ` <q2zd3da20d01004211009jccd81479v83e2ef4b6d5db7bf-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Roger Heflin @ 2010-04-21 17:09 UTC (permalink / raw)
  To: Vlad Glagolev
  Cc: Steve Cousins, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-raid-u79uwXL29TY76Z2rM5mHXA

On Wed, Apr 21, 2010 at 11:48 AM, Vlad Glagolev <stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org> wrote:
> Thanks for reply, Steve!
>
> parameters are pretty trivial, (rw,insecure) for exports, and defaults while mounting via ``mount host:/path /path'' command.
>
> Yes. That sounds interesting, since XFS works fine with there partitions.
> Also, I must say it's WD20EARS drives (with 4kb sector size, though parted says it's 512b).
>
> I also tried another NFS daemon implementation (cvs version, not .22) -- unfsd (unfs3).
> It mounts ok, but when I try to write any file to the server -- I get the same error (Stale NFS file handle).
>
> And on the server side in dmesg I see this:
>
> --
> NFS: server 172.17.2.2 error: fileid changed
> fsid 0:f: expected fileid 0x2033, got 0xb6d1e05fa150ce09
> NFS: server 172.17.2.2 error: fileid changed
> fsid 0:f: expected fileid 0x2033, got 0x26550b0132c0b1
> NFS: server 172.17.2.2 error: fileid changed
> fsid 0:f: expected fileid 0x2033, got 0x8202a60053000020
> NFS: server 172.17.2.2 error: fileid changed
> fsid 0:f: expected fileid 0x2033, got 0xe542f93ebc8fe157
> NFS: server 172.17.2.2 error: fileid changed
> fsid 0:f: expected fileid 0x2033, got 0xc00cd74ea904301
> --
>
> looks like NFS protocol doesn't like something in partitioned software RAID.
>


Try manually setting fsid=something in the exports file, then
re-export and remount on the target system. If there was an fsid
collision of some sort, then NFS would be hitting the wrong fs...

NFS generates the fsid automatically based on the device's major/minor
numbers, and it is possible there is something odd about those numbers
that makes them not unique, so that they collide with some other
device's major/minor.
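
One way to look at the numbers this hypothesis is about is to print the major/minor pair behind each device node or exported directory. The following is only a minimal sketch, not anything nfsd itself runs: the helper names `device_numbers` and `group_shared` are made up for illustration, and the demo paths are placeholders to be replaced with real export paths or nodes such as /dev/md2p6.

```python
import os
import stat


def device_numbers(path):
    """Return (major, minor) for a path.

    For a block device node this is the device the node refers to (st_rdev);
    for any other path it is the device holding the file system that
    contains the path (st_dev).
    """
    st = os.stat(path)
    dev = st.st_rdev if stat.S_ISBLK(st.st_mode) else st.st_dev
    return os.major(dev), os.minor(dev)


def group_shared(paths):
    """Group paths by (major, minor); keep only groups with several paths."""
    groups = {}
    for p in paths:
        groups.setdefault(device_numbers(p), []).append(p)
    return {key: grp for key, grp in groups.items() if len(grp) > 1}


if __name__ == "__main__":
    # Placeholder paths; substitute the exported directories or device nodes.
    demo_paths = ["/", "."]
    for p in demo_paths:
        major, minor = device_numbers(p)
        print(f"{p}: major={major} minor={minor}")
    for key, grp in group_shared(demo_paths).items():
        print(f"same device numbers {key}: {grp}")
```

Two directories on the same file system legitimately share one pair, so a shared pair is only suspicious when the paths are separately mounted file systems.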


* Re: NFS and /dev/mdXpY
       [not found]           ` <q2zd3da20d01004211009jccd81479v83e2ef4b6d5db7bf-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-04-21 17:32             ` Vlad Glagolev
       [not found]               ` <20100421213201.67a4a7a2.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Vlad Glagolev @ 2010-04-21 17:32 UTC (permalink / raw)
  To: Roger Heflin
  Cc: Steve Cousins, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-raid-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 2394 bytes --]

On Wed, 21 Apr 2010 12:09:20 -0500
Roger Heflin <rogerheflin-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> On Wed, Apr 21, 2010 at 11:48 AM, Vlad Glagolev <stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org> wrote:
> > Thanks for reply, Steve!
> >
> > parameters are pretty trivial, (rw,insecure) for exports, and defaults while mounting via ``mount host:/path /path'' command.
> >
> > Yes. That sounds interesting, since XFS works fine with there partitions.
> > Also, I must say it's WD20EARS drives (with 4kb sector size, though parted says it's 512b).
> >
> > I also tried another NFS daemon implementation (cvs version, not .22) -- unfsd (unfs3).
> > It mounts ok, but when I try to write any file to the server -- I get the same error (Stale NFS file handle).
> >
> > And on the server side in dmesg I see this:
> >
> > --
> > NFS: server 172.17.2.2 error: fileid changed
> > fsid 0:f: expected fileid 0x2033, got 0xb6d1e05fa150ce09
> > NFS: server 172.17.2.2 error: fileid changed
> > fsid 0:f: expected fileid 0x2033, got 0x26550b0132c0b1
> > NFS: server 172.17.2.2 error: fileid changed
> > fsid 0:f: expected fileid 0x2033, got 0x8202a60053000020
> > NFS: server 172.17.2.2 error: fileid changed
> > fsid 0:f: expected fileid 0x2033, got 0xe542f93ebc8fe157
> > NFS: server 172.17.2.2 error: fileid changed
> > fsid 0:f: expected fileid 0x2033, got 0xc00cd74ea904301
> > --
> >
> > looks like NFS protocol doesn't like something in partitioned software RAID.
> >
> 
> 
> Try manually setting the fsid=something in the exports file and
> reexport and remount on the target system, if there was a fsid
> collision of some sort then nfs would be hitting the wrong fs...
> 
> NFS generates the fsid automatically based on the devices major minor,
> and it is possible there is something odd about the major minor
> numbers that make them not unique...and collide with someone else
> major minor.

BAH! How simple!

Thank you very much, Roger!

I've just added fsid=1 (yes, only these few chars) to exports, and it worked! Unbelievable, really :)
Of course I've checked it on OpenBSD and Linux under both nfsd and unfsd. Works flawlessly.
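
For anyone wanting to spot exports that still rely on an automatically generated fsid, a rough sketch of a checker follows. It is an illustration under stated assumptions: the function name `exports_missing_fsid` is invented here, the sample paths and network are hypothetical (only the options rw, insecure and fsid=1 come from this thread), and the parsing ignores quoted paths and other corner cases of the real exports(5) syntax.

```python
import re


def exports_missing_fsid(text):
    """Return (path, client) pairs whose option list has no fsid= entry."""
    missing = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        path, *clients = line.split()
        for client in clients:
            # A client entry looks like host(opt1,opt2,...); options are optional.
            m = re.fullmatch(r"([^()]*)\((.*)\)", client)
            if m:
                host, opts = m.group(1), m.group(2).split(",")
            else:
                host, opts = client, []
            if not any(o.startswith("fsid=") for o in opts):
                missing.append((path, host or "*"))
    return missing


# Hypothetical exports content; the paths and the network are made up.
SAMPLE = """\
# example only
/storage  172.17.2.0/24(rw,insecure,fsid=1)
/home     172.17.2.0/24(rw,insecure)
"""

print(exports_missing_fsid(SAMPLE))
```

With the sample text above, only the /home entry is reported, since the /storage entry carries an explicit fsid.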

Thanks a lot again.

But it seems to be a bug, right? If so, patches welcome.. I'll test it with great pleasure.

-- 
Dont wait to die to find paradise...
--
Cheerz,
Vlad "Stealth" Glagolev

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]


* Re: NFS and /dev/mdXpY
       [not found]               ` <20100421213201.67a4a7a2.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
@ 2010-04-21 18:26                 ` Vlad Glagolev
  2010-04-21 19:08                   ` Vlad Glagolev
       [not found]                   ` <20100421222612.7aa4f21a.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
  0 siblings, 2 replies; 15+ messages in thread
From: Vlad Glagolev @ 2010-04-21 18:26 UTC (permalink / raw)
  To: Roger Heflin
  Cc: Steve Cousins, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-raid-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 2971 bytes --]

Hmm, more testing.. On OpenBSD (client) it works flawlessly only with tiny files.

If the file size is around 50 MiB, then it just freezes and eats CPU in the nfsrcvl call.

On Linux I don't see such a problem. Even big files are transferred at a good enough speed.



-- 
Dont wait to die to find paradise...
--
Cheerz,
Vlad "Stealth" Glagolev

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]


* Re: NFS and /dev/mdXpY
  2010-04-21 18:26                 ` Vlad Glagolev
@ 2010-04-21 19:08                   ` Vlad Glagolev
       [not found]                   ` <20100421222612.7aa4f21a.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
  1 sibling, 0 replies; 15+ messages in thread
From: Vlad Glagolev @ 2010-04-21 19:08 UTC (permalink / raw)
  To: Roger Heflin; +Cc: Steve Cousins, linux-nfs, linux-raid

[-- Attachment #1: Type: text/plain, Size: 5503 bytes --]

Some more interesting facts:

According to exports(5), small integers or UUIDs must be used for the "fsid=" option.

If I set "fsid=__UUID__" in /etc/exports (where __UUID__ is the UUID of the partition as returned by the blkid command), then I get _exactly_ the same error as the first time: it's impossible to mount the NFS partition from the Linux client box, and I get "Stale NFS file handle" when trying to cd into the mounted dir on the OpenBSD box.

If I set "fsid=1" in /etc/exports, then from the Linux client box I can write files without any performance issues. From the OpenBSD client box I get this: I copy a file (around 50-60 MiB); once the file appears to exist in full on the other side, cp freezes and I see the nfsrcvl call in top; a few minutes later I notice the nfs_fsy call in top; a few minutes after that cp returns 0, and the file has been copied successfully.

I checked the sha1sum hashes on both sides; they're equal.

On the server, this is what tcpdump shows while writing a file from the OpenBSD box:

22:49:17.002445 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 164)
    172.17.2.2.2049 > 81.200.8.213.1393674899: reply ok 136 write POST: REG 100644 ids 8000/10 sz 27583794 8192 bytes
22:49:17.003105 IP (tos 0x0, ttl 64, id 64645, offset 0, flags [+], proto UDP (17), length 1500)
    81.200.8.213.1015648788 > 172.17.2.2.2049: 1472 write fh Unknown/01000101010000001D080000000000000000000000A4C0000000200000000002 8192 (8192) bytes @ 10797056 <filesync>
22:49:17.003131 IP (tos 0x0, ttl 64, id 64645, offset 1480, flags [+], proto UDP (17), length 1500)
    81.200.8.213 > 172.17.2.2: udp
22:49:17.003345 IP (tos 0x0, ttl 64, id 64645, offset 2960, flags [+], proto UDP (17), length 1500)
    81.200.8.213 > 172.17.2.2: udp
22:49:17.003468 IP (tos 0x0, ttl 64, id 64645, offset 4440, flags [+], proto UDP (17), length 1500)
    81.200.8.213 > 172.17.2.2: udp
22:49:17.003590 IP (tos 0x0, ttl 64, id 64645, offset 5920, flags [+], proto UDP (17), length 1500)
    81.200.8.213 > 172.17.2.2: udp
22:49:17.003598 IP (tos 0x0, ttl 64, id 64645, offset 7400, flags [none], proto UDP (17), length 940)
    81.200.8.213 > 172.17.2.2: udp

No errors like those in the first log. But something's definitely incorrect here.

Also tried mounting the partition with "-T" (tcp) flag on the client side -- no luck.
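
As a side note on reading the capture above: the offsets 0, 1480, 2960, 4440, 5920 and 7400 are what plain IPv4 fragmentation of one large UDP datagram over a 1500-byte MTU would produce. The sketch below reproduces that arithmetic. It assumes a 20-byte IP header with no options, and the function name `ipv4_fragments` is made up for illustration. The payload size is inferred from the capture, not taken from any NFS specification, and whether fragmentation has anything to do with the stall is not established here.

```python
def ipv4_fragments(ip_payload_len, mtu=1500, ip_header_len=20):
    """Return (offset, total_length, more_fragments) for each IPv4 fragment."""
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - ip_header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < ip_payload_len:
        data_len = min(per_fragment, ip_payload_len - offset)
        more = offset + data_len < ip_payload_len
        fragments.append((offset, data_len + ip_header_len, more))
        offset += data_len
    return fragments


# Inferred from the capture: five full fragments plus a final 940-byte one,
# each including a 20-byte IP header. That gives 8320 bytes of IP payload:
# an 8-byte UDP header, the 8192 bytes being written, and the remaining
# 120 bytes presumably RPC/NFS headers.
payload = 5 * 1480 + (940 - 20)

for offset, length, more in ipv4_fragments(payload):
    flags = "+" if more else "none"
    print(f"offset {offset:5d}  length {length:4d}  flags [{flags}]")
```

Each 8192-byte write over UDP therefore travels as six IP fragments, all of which must arrive for the datagram to be reassembled.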

On Wed, 21 Apr 2010 22:26:12 +0400
Vlad Glagolev <stealth@sourcemage.org> wrote:

> Hmm, more testing.. It works only with tiny files flawlessly on OpenBSD (client).
> 
> If a filesize is around 50 mibs, then it just freezes and eats cpu with nfsrcvl call.
> 
> On Linux I don't see such problem. Even big files are transfered with good enough speed.
> 
> On Wed, 21 Apr 2010 21:32:01 +0400
> Vlad Glagolev <stealth@sourcemage.org> wrote:
> 
> > On Wed, 21 Apr 2010 12:09:20 -0500
> > Roger Heflin <rogerheflin@gmail.com> wrote:
> > 
> > > On Wed, Apr 21, 2010 at 11:48 AM, Vlad Glagolev <stealth@sourcemage.org> wrote:
> > > > Thanks for reply, Steve!
> > > >
> > > > parameters are pretty trivial, (rw,insecure) for exports, and defaults while mounting via ``mount host:/path /path'' command.
> > > >
> > > > Yes. That sounds interesting, since XFS works fine with there partitions.
> > > > Also, I must say it's WD20EARS drives (with 4kb sector size, though parted says it's 512b).
> > > >
> > > > I also tried another NFS daemon implementation (cvs version, not .22) -- unfsd (unfs3).
> > > > It mounts ok, but when I try to write any file to the server -- I get the same error (Stale NFS file handle).
> > > >
> > > > And on the server side in dmesg I see this:
> > > >
> > > > --
> > > > NFS: server 172.17.2.2 error: fileid changed
> > > > fsid 0:f: expected fileid 0x2033, got 0xb6d1e05fa150ce09
> > > > NFS: server 172.17.2.2 error: fileid changed
> > > > fsid 0:f: expected fileid 0x2033, got 0x26550b0132c0b1
> > > > NFS: server 172.17.2.2 error: fileid changed
> > > > fsid 0:f: expected fileid 0x2033, got 0x8202a60053000020
> > > > NFS: server 172.17.2.2 error: fileid changed
> > > > fsid 0:f: expected fileid 0x2033, got 0xe542f93ebc8fe157
> > > > NFS: server 172.17.2.2 error: fileid changed
> > > > fsid 0:f: expected fileid 0x2033, got 0xc00cd74ea904301
> > > > --
> > > >
> > > > looks like NFS protocol doesn't like something in partitioned software RAID.
> > > >
> > > 
> > > 
> > > Try manually setting the fsid=something in the exports file and
> > > reexport and remount on the target system, if there was a fsid
> > > collision of some sort then nfs would be hitting the wrong fs...
> > > 
> > > NFS generates the fsid automatically based on the devices major minor,
> > > and it is possible there is something odd about the major minor
> > > numbers that make them not unique...and collide with someone else
> > > major minor.
> > 
> > BAH! How simple!
> > 
> > Thank you very much, Roger!
> > 
> > I've just added fsid=1 (yes, only these few chars) to exports, and it worked! Unbelievable, really :)
> > Of course I've checked it on OpenBSD and Linux under both nfsd and unfsd. Works flawlessly.
> > 
> > Thanks a lot again.
> > 
> > But it seems to be a bug, right? If so, patches welcome.. I'll test it with great pleasure.
> > 
> > -- 
> > Dont wait to die to find paradise...
> > --
> > Cheerz,
> > Vlad "Stealth" Glagolev
> 
> 
> -- 
> Dont wait to die to find paradise...
> --
> Cheerz,
> Vlad "Stealth" Glagolev


-- 
Dont wait to die to find paradise...
--
Cheerz,
Vlad "Stealth" Glagolev

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: NFS and /dev/mdXpY
       [not found]                   ` <20100421222612.7aa4f21a.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
@ 2010-04-22  1:20                     ` Roger Heflin
  0 siblings, 0 replies; 15+ messages in thread
From: Roger Heflin @ 2010-04-22  1:20 UTC (permalink / raw)
  To: Vlad Glagolev
  Cc: Steve Cousins, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-raid-u79uwXL29TY76Z2rM5mHXA

Vlad Glagolev wrote:
> Hmm, more testing.. It works only with tiny files flawlessly on OpenBSD (client).
> 
> If a filesize is around 50 mibs, then it just freezes and eats cpu with nfsrcvl call.
> 
> On Linux I don't see such problem. Even big files are transfered with good enough speed.
> 

I think that is a second problem.   There are simple ways to screw up 
the fsids and produce stale fs warnings, so I was guessing it might 
be a slightly different variation of one I had seen before, where 
some series of commands left different exported filesystems with 
the same fsid, and the client machine basically had the fs 
switched out from under it...

You might try different nfsvers values on the mounts on BSD (this may 
work, but version 2 has some issues displaying the proper sizes on 
>2TB filesystems and, I believe, has 1-2TB file limits), and/or 
different wsize/rsize values (unlikely to help). I am not sure it 
will necessarily matter, but one set of options may cause fewer 
issues than another, as each is likely a fairly different code path. 
The same goes for trying proto tcp vs udp, which switches out quite 
a bit of code...

On most OSes you can check which options were actually accepted for 
the mount. On Linux it is cat /proc/mounts; the mount command may do 
it on BSD. That will tell you whether it accepted the specified 
options or ignored them.
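
For the Linux side, the check described above can be scripted. This is a minimal sketch in Python, assuming the usual whitespace-separated /proc/mounts layout (device, mount point, filesystem type, options, ...). The function name and the sample line are hypothetical and only there for illustration; on a real client you would pass in the contents of /proc/mounts.

```python
def nfs_mount_options(mounts_text):
    """Map each NFS mount point to the option list the kernel reports."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[2] in ("nfs", "nfs4"):
            result[fields[1]] = fields[3].split(",")
    return result


# Hypothetical sample line, not taken from the thread.
sample = "172.17.2.2:/vault /storage nfs rw,vers=3,rsize=8192,wsize=8192,proto=udp 0 0\n"
for mount_point, options in nfs_mount_options(sample).items():
    print(mount_point, options)
```

Comparing the reported options with what was requested on the mount command line shows whether something like proto or nfsvers was silently ignored.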
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: NFS and /dev/mdXpY
       [not found] ` <20100417195747.5fae8834.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
  2010-04-21 16:39   ` Steve Cousins
@ 2010-04-22 18:25   ` J. Bruce Fields
       [not found]     ` <20100422182543.GB8858-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
  1 sibling, 1 reply; 15+ messages in thread
From: J. Bruce Fields @ 2010-04-22 18:25 UTC (permalink / raw)
  To: Vlad Glagolev
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-raid-u79uwXL29TY76Z2rM5mHXA

On Sat, Apr 17, 2010 at 07:57:47PM +0400, Vlad Glagolev wrote:
> Well, hello there,
> 
> Posted it on linux-kernel ML also, and post it here, for more specific analysis.
> 
> I faced this problem today while trying to mount some NFS share on OpenBSD box.
> I mounted it successfully without any visible errors, but I wasn't able to cd there, the printed error was:
> 
> ksh: cd: /storage - Stale NFS file handle
> 
> Apropos, the partition is 5.5 TB. I tried another one on my box and it was mounted successfully. It was possible to manage files there too. Its size is ~3GB.
> That's why the first time I thought about some size limitations of OpenBSD/Linux/NFS.
> 
> While talking on #openbsd @ freenode, I discovered this via tcpdump on both sides:
> 
> http://pastebin.ca/1864713
> 
> Googling for 3 hours didn't help at all, some posts had similiar issue but either with no answer at all or without any full description.
> 
> Then I started to experiment with another Linux box to kill the possible different variants.
> 
> On another box I also have nfs-utils 1.1.6 and kernel 2.6.32. Mounting that big partition was unsuccessful, it got just stuck. On tcpdump I've seen this:

I'm a bit confused.  What kernel and nfs-utils version is running on the
problematic Linux server?

Also, what are the contents of /proc/net/rpc/nfsd.export/content and
/proc/net/rpc/nfsd.fh/content after you try to access the filesystem?

--b.

> 
> --
>     172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x25e4 (correct), seq 1, ack 1, win 92, options [nop,nop,TS val 1808029984 ecr 1618999], length 0
>     172.17.2.5.3565791363 > 172.17.2.2.2049: 40 null
>     172.17.2.2.2049 > 172.17.2.5.884: Flags [.], cksum 0x25e6 (correct), seq 1, ack 45, win 46, options [nop,nop,TS val 1618999 ecr 1808029984], length 0
>     172.17.2.2.2049 > 172.17.2.5.3565791363: reply ok 24 null
>     172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x259b (correct), seq 45, ack 29, win 92, options [nop,nop,TS val 1808029985 ecr 1618999], length 0
>     172.17.2.5.3582568579 > 172.17.2.2.2049: 40 null
>     172.17.2.2.2049 > 172.17.2.5.3582568579: reply ok 24 null
>     172.17.2.5.3599345795 > 172.17.2.2.2049: 92 fsinfo fh Unknown/0100030005030100000800000000000000000000000000000000000000000000
>     172.17.2.2.2049 > 172.17.2.5.3599345795: reply ok 32 fsinfo ERROR: Stale NFS file handle POST:
>     172.17.2.5.3616123011 > 172.17.2.2.2049: 92 fsinfo fh Unknown/0100030005030100000800000000000000000000000000000000000000000000
>     172.17.2.2.2049 > 172.17.2.5.3616123011: reply ok 32 fsinfo ERROR: Stale NFS file handle POST:
>     172.17.2.5.884 > 172.17.2.2.2049: Flags [F.], cksum 0x2449 (correct), seq 281, ack 129, win 92, options [nop,nop,TS val 1808029986 ecr 1618999], length 0
>     172.17.2.2.2049 > 172.17.2.5.884: Flags [F.], cksum 0x2476 (correct), seq 129, ack 282, win 46, options [nop,nop,TS val 1618999 ecr 1808029986], length 0
>     172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x2448 (correct), seq 282, ack 130, win 92, options [nop,nop,TS val 1808029986 ecr 1618999], length 0
> --
> 
> familiar messages, eh?
> 
> Since that time I've solved that's not OpenBSD problem. So only NFS and Linux left as the reasons of this.
> It was possible to mount that small partition on Linux box too, the same as on OpenBSD.
> 
> But afterthat I recongnized an interesting issue: I have different sw raid setups on my storage server.
> I tried to mount a small partition on the same md device where 5.5TB partition is located, and got the same
> error message! Now I'm sure it's about NFS <-> MDADM setup, that's why I called the topic like this.
> 
> A bit about my setup:
> 
> # cat /proc/mdstat 
> Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4] [multipath] 
> md3 : active raid1 sdc1[0] sdd1[1]
>       61376 blocks [2/2] [UU]
>       
> md1 : active raid5 sdc2[2] sdd2[3] sdb2[1] sda2[0]
>       3153408 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
>       
> md2 : active raid5 sdc3[2] sdd3[3] sdb3[1] sda3[0]
>       5857199616 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
>       
> md0 : active raid1 sdb1[1] sda1[0]
>       61376 blocks [2/2] [UU]
>       
> unused devices: <none>
> 
> md0, md1, and md3 aren't so interesting, since fs is created directly on them, and that's a _problem device_:
> 
> # parted /dev/md2
> GNU Parted 2.2
> Using /dev/md2
> Welcome to GNU Parted! Type 'help' to view a list of commands.
> (parted) p free                                                           
> p free
> Model: Unknown (unknown)
> Disk /dev/md2: 5998GB
> Sector size (logical/physical): 512B/512B
> Partition Table: gpt
> 
> Number  Start   End     Size    File system     Name   Flags
>         17.4kB  1049kB  1031kB  Free Space
>  1      1049kB  2147MB  2146MB  linux-swap(v1)  swap
>  2      2147MB  23.6GB  21.5GB  xfs             home
>  3      23.6GB  24.7GB  1074MB  xfs             temp
>  4      24.7GB  35.4GB  10.7GB  xfs             user
>  5      35.4GB  51.5GB  16.1GB  xfs             var
>  6      51.5GB  5998GB  5946GB  xfs             vault
>         5998GB  5998GB  507kB   Free Space
> 
> # ls /dev/md?*
> /dev/md0  /dev/md1  /dev/md2  /dev/md2p1  /dev/md2p2  /dev/md2p3  /dev/md2p4  /dev/md2p5  /dev/md2p6  /dev/md3
> 
> It's very handy partitioning scheme where I can extend (grow 5th raid) with more hdds only /vault partition while "loosing" (a.k.a. not using for this partition) only ~1gb of space from every 2TB drive.
> 
> System boots ok and xfs_check passes with no problems, etc.
> The only problem: it's not possible to use NFS shares on any partition of /dev/md2 device.
> 
> Finally, my question to NFS and MDADM developers: any idea?
> 
> -- 
> Dont wait to die to find paradise...
> --
> Cheerz,
> Vlad "Stealth" Glagolev



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: NFS and /dev/mdXpY
       [not found]     ` <20100422182543.GB8858-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
@ 2010-04-22 18:53       ` Vlad Glagolev
  2010-04-22 19:32         ` J. Bruce Fields
  0 siblings, 1 reply; 15+ messages in thread
From: Vlad Glagolev @ 2010-04-22 18:53 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-raid-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 6681 bytes --]

On Thu, 22 Apr 2010 14:25:43 -0400
"J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote:

> On Sat, Apr 17, 2010 at 07:57:47PM +0400, Vlad Glagolev wrote:
> > Well, hello there,
> > 
> > Posted it on linux-kernel ML also, and post it here, for more specific analysis.
> > 
> > I faced this problem today while trying to mount some NFS share on OpenBSD box.
> > I mounted it successfully without any visible errors, but I wasn't able to cd there, the printed error was:
> > 
> > ksh: cd: /storage - Stale NFS file handle
> > 
> > Apropos, the partition is 5.5 TB. I tried another one on my box and it was mounted successfully. It was possible to manage files there too. Its size is ~3GB.
> > That's why the first time I thought about some size limitations of OpenBSD/Linux/NFS.
> > 
> > While talking on #openbsd @ freenode, I discovered this via tcpdump on both sides:
> > 
> > http://pastebin.ca/1864713
> > 
> > Googling for 3 hours didn't help at all, some posts had similiar issue but either with no answer at all or without any full description.
> > 
> > Then I started to experiment with another Linux box to kill the possible different variants.
> > 
> > On another box I also have nfs-utils 1.1.6 and kernel 2.6.32. Mounting that big partition was unsuccessful, it got just stuck. On tcpdump I've seen this:
> 
> I'm a bit confused.  What kernel and nfs-utils version is running on the
> problematic Linux server?

same. nfs-utils 1.1.6 and kernel 2.6.32.

> 
> Also, what are the contents of /proc/net/rpc/nfsd.export/content and
> /proc/net/rpc/nfsd.content after you try to access the filesystem?

# cat /proc/net/rpc/nfsd.export/content
#path domain(flags)
/vault	172.17.2.5(ro,insecure,root_squash,sync,wdelay,no_subtree_check)

# cat /proc/net/rpc/nfsd.fh/content 
#domain fsidtype fsid [path]
# 172.17.2.5 3 0x0001030500000803
172.17.2.5 0 0x0500030100000803 /vault
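
To compare such dumps across machines it can help to split them into fields. This is a rough Python sketch that follows the "#domain fsidtype fsid [path]" header shown above. The function name is made up, and treating lines prefixed with "# " as entries the cache has flagged in some way is an assumption; the sketch only records that the prefix was present.

```python
def parse_fh_cache(text):
    """Split nfsd.fh/content lines into dictionaries, one per cache entry."""
    entries = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#domain"):
            continue  # skip blank lines and the column header
        commented = line.startswith("# ")
        fields = (line[2:] if commented else line).split()
        if len(fields) < 3:
            continue
        entries.append({
            "domain": fields[0],
            "fsidtype": int(fields[1]),
            "fsid": fields[2],
            "path": fields[3] if len(fields) > 3 else None,
            "commented": commented,  # line started with "# " in the dump
        })
    return entries


dump = """#domain fsidtype fsid [path]
# 172.17.2.5 3 0x0001030500000803
172.17.2.5 0 0x0500030100000803 /vault
"""
for entry in parse_fh_cache(dump):
    print(entry)
```

For the dump above this yields two entries for 172.17.2.5: a type 3 fsid with no path and a type 0 fsid that resolves to /vault.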

> 
> --b.
> 
> > 
> > --
> >     172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x25e4 (correct), seq 1, ack 1, win 92, options [nop,nop,TS val 1808029984 ecr 1618999], length 0
> >     172.17.2.5.3565791363 > 172.17.2.2.2049: 40 null
> >     172.17.2.2.2049 > 172.17.2.5.884: Flags [.], cksum 0x25e6 (correct), seq 1, ack 45, win 46, options [nop,nop,TS val 1618999 ecr 1808029984], length 0
> >     172.17.2.2.2049 > 172.17.2.5.3565791363: reply ok 24 null
> >     172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x259b (correct), seq 45, ack 29, win 92, options [nop,nop,TS val 1808029985 ecr 1618999], length 0
> >     172.17.2.5.3582568579 > 172.17.2.2.2049: 40 null
> >     172.17.2.2.2049 > 172.17.2.5.3582568579: reply ok 24 null
> >     172.17.2.5.3599345795 > 172.17.2.2.2049: 92 fsinfo fh Unknown/0100030005030100000800000000000000000000000000000000000000000000
> >     172.17.2.2.2049 > 172.17.2.5.3599345795: reply ok 32 fsinfo ERROR: Stale NFS file handle POST:
> >     172.17.2.5.3616123011 > 172.17.2.2.2049: 92 fsinfo fh Unknown/0100030005030100000800000000000000000000000000000000000000000000
> >     172.17.2.2.2049 > 172.17.2.5.3616123011: reply ok 32 fsinfo ERROR: Stale NFS file handle POST:
> >     172.17.2.5.884 > 172.17.2.2.2049: Flags [F.], cksum 0x2449 (correct), seq 281, ack 129, win 92, options [nop,nop,TS val 1808029986 ecr 1618999], length 0
> >     172.17.2.2.2049 > 172.17.2.5.884: Flags [F.], cksum 0x2476 (correct), seq 129, ack 282, win 46, options [nop,nop,TS val 1618999 ecr 1808029986], length 0
> >     172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x2448 (correct), seq 282, ack 130, win 92, options [nop,nop,TS val 1808029986 ecr 1618999], length 0
> > --
> > 
> > familiar messages, eh?
> > 
> > Since that time I've solved that's not OpenBSD problem. So only NFS and Linux left as the reasons of this.
> > It was possible to mount that small partition on Linux box too, the same as on OpenBSD.
> > 
> > But afterthat I recongnized an interesting issue: I have different sw raid setups on my storage server.
> > I tried to mount a small partition on the same md device where 5.5TB partition is located, and got the same
> > error message! Now I'm sure it's about NFS <-> MDADM setup, that's why I called the topic like this.
> > 
> > A bit about my setup:
> > 
> > # cat /proc/mdstat 
> > Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4] [multipath] 
> > md3 : active raid1 sdc1[0] sdd1[1]
> >       61376 blocks [2/2] [UU]
> >       
> > md1 : active raid5 sdc2[2] sdd2[3] sdb2[1] sda2[0]
> >       3153408 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
> >       
> > md2 : active raid5 sdc3[2] sdd3[3] sdb3[1] sda3[0]
> >       5857199616 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
> >       
> > md0 : active raid1 sdb1[1] sda1[0]
> >       61376 blocks [2/2] [UU]
> >       
> > unused devices: <none>
> > 
> > md0, md1, and md3 aren't so interesting, since fs is created directly on them, and that's a _problem device_:
> > 
> > # parted /dev/md2
> > GNU Parted 2.2
> > Using /dev/md2
> > Welcome to GNU Parted! Type 'help' to view a list of commands.
> > (parted) p free                                                           
> > p free
> > Model: Unknown (unknown)
> > Disk /dev/md2: 5998GB
> > Sector size (logical/physical): 512B/512B
> > Partition Table: gpt
> > 
> > Number  Start   End     Size    File system     Name   Flags
> >         17.4kB  1049kB  1031kB  Free Space
> >  1      1049kB  2147MB  2146MB  linux-swap(v1)  swap
> >  2      2147MB  23.6GB  21.5GB  xfs             home
> >  3      23.6GB  24.7GB  1074MB  xfs             temp
> >  4      24.7GB  35.4GB  10.7GB  xfs             user
> >  5      35.4GB  51.5GB  16.1GB  xfs             var
> >  6      51.5GB  5998GB  5946GB  xfs             vault
> >         5998GB  5998GB  507kB   Free Space
> > 
> > # ls /dev/md?*
> > /dev/md0  /dev/md1  /dev/md2  /dev/md2p1  /dev/md2p2  /dev/md2p3  /dev/md2p4  /dev/md2p5  /dev/md2p6  /dev/md3
> > 
> > It's very handy partitioning scheme where I can extend (grow 5th raid) with more hdds only /vault partition while "loosing" (a.k.a. not using for this partition) only ~1gb of space from every 2TB drive.
> > 
> > System boots ok and xfs_check passes with no problems, etc.
> > The only problem: it's not possible to use NFS shares on any partition of /dev/md2 device.
> > 
> > Finally, my question to NFS and MDADM developers: any idea?
> > 
> > -- 
> > Dont wait to die to find paradise...
> > --
> > Cheerz,
> > Vlad "Stealth" Glagolev
> 
> 


-- 
Dont wait to die to find paradise...
--
Cheerz,
Vlad "Stealth" Glagolev

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: NFS and /dev/mdXpY
  2010-04-22 18:53       ` Vlad Glagolev
@ 2010-04-22 19:32         ` J. Bruce Fields
       [not found]           ` <20100422193236.GA10302-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: J. Bruce Fields @ 2010-04-22 19:32 UTC (permalink / raw)
  To: Vlad Glagolev; +Cc: linux-nfs, linux-raid

On Thu, Apr 22, 2010 at 10:53:10PM +0400, Vlad Glagolev wrote:
> On Thu, 22 Apr 2010 14:25:43 -0400
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Sat, Apr 17, 2010 at 07:57:47PM +0400, Vlad Glagolev wrote:
> > > Well, hello there,
> > > 
> > > Posted it on linux-kernel ML also, and post it here, for more specific analysis.
> > > 
> > > I faced this problem today while trying to mount some NFS share on OpenBSD box.
> > > I mounted it successfully without any visible errors, but I wasn't able to cd there, the printed error was:
> > > 
> > > ksh: cd: /storage - Stale NFS file handle
> > > 
> > > Apropos, the partition is 5.5 TB. I tried another one on my box and it was mounted successfully. It was possible to manage files there too. Its size is ~3GB.
> > > That's why the first time I thought about some size limitations of OpenBSD/Linux/NFS.
> > > 
> > > While talking on #openbsd @ freenode, I discovered this via tcpdump on both sides:
> > > 
> > > http://pastebin.ca/1864713
> > > 
> > > Googling for 3 hours didn't help at all, some posts had similiar issue but either with no answer at all or without any full description.
> > > 
> > > Then I started to experiment with another Linux box to kill the possible different variants.
> > > 
> > > On another box I also have nfs-utils 1.1.6 and kernel 2.6.32. Mounting that big partition was unsuccessful, it got just stuck. On tcpdump I've seen this:
> > 
> > I'm a bit confused.  What kernel and nfs-utils version is running on the
> > problematic Linux server?
> 
> same. nfs-utils 1.1.6 and kernel 2.6.32.

Huh.  That should be new enough for it to be using UUIDs.  I wonder why
it isn't?

--b.

> 
> > 
> > Also, what are the contents of /proc/net/rpc/nfsd.export/content and
> > /proc/net/rpc/nfsd.content after you try to access the filesystem?
> 
> # cat /proc/net/rpc/nfsd.export/content
> #path domain(flags)
> /vault	172.17.2.5(ro,insecure,root_squash,sync,wdelay,no_subtree_check)
> 
> # cat /proc/net/rpc/nfsd.fh/content 
> #domain fsidtype fsid [path]
> # 172.17.2.5 3 0x0001030500000803
> 172.17.2.5 0 0x0500030100000803 /vault
> 
> > 
> > --b.
> > 
> > > 
> > > --
> > >     172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x25e4 (correct), seq 1, ack 1, win 92, options [nop,nop,TS val 1808029984 ecr 1618999], length 0
> > >     172.17.2.5.3565791363 > 172.17.2.2.2049: 40 null
> > >     172.17.2.2.2049 > 172.17.2.5.884: Flags [.], cksum 0x25e6 (correct), seq 1, ack 45, win 46, options [nop,nop,TS val 1618999 ecr 1808029984], length 0
> > >     172.17.2.2.2049 > 172.17.2.5.3565791363: reply ok 24 null
> > >     172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x259b (correct), seq 45, ack 29, win 92, options [nop,nop,TS val 1808029985 ecr 1618999], length 0
> > >     172.17.2.5.3582568579 > 172.17.2.2.2049: 40 null
> > >     172.17.2.2.2049 > 172.17.2.5.3582568579: reply ok 24 null
> > >     172.17.2.5.3599345795 > 172.17.2.2.2049: 92 fsinfo fh Unknown/0100030005030100000800000000000000000000000000000000000000000000
> > >     172.17.2.2.2049 > 172.17.2.5.3599345795: reply ok 32 fsinfo ERROR: Stale NFS file handle POST:
> > >     172.17.2.5.3616123011 > 172.17.2.2.2049: 92 fsinfo fh Unknown/0100030005030100000800000000000000000000000000000000000000000000
> > >     172.17.2.2.2049 > 172.17.2.5.3616123011: reply ok 32 fsinfo ERROR: Stale NFS file handle POST:
> > >     172.17.2.5.884 > 172.17.2.2.2049: Flags [F.], cksum 0x2449 (correct), seq 281, ack 129, win 92, options [nop,nop,TS val 1808029986 ecr 1618999], length 0
> > >     172.17.2.2.2049 > 172.17.2.5.884: Flags [F.], cksum 0x2476 (correct), seq 129, ack 282, win 46, options [nop,nop,TS val 1618999 ecr 1808029986], length 0
> > >     172.17.2.5.884 > 172.17.2.2.2049: Flags [.], cksum 0x2448 (correct), seq 282, ack 130, win 92, options [nop,nop,TS val 1808029986 ecr 1618999], length 0
> > > --
> > > 
> > > familiar messages, eh?
> > > 
> > > Since that time I've solved that's not OpenBSD problem. So only NFS and Linux left as the reasons of this.
> > > It was possible to mount that small partition on Linux box too, the same as on OpenBSD.
> > > 
> > > But afterthat I recongnized an interesting issue: I have different sw raid setups on my storage server.
> > > I tried to mount a small partition on the same md device where 5.5TB partition is located, and got the same
> > > error message! Now I'm sure it's about NFS <-> MDADM setup, that's why I called the topic like this.
> > > 
> > > A bit about my setup:
> > > 
> > > # cat /proc/mdstat 
> > > Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4] [multipath] 
> > > md3 : active raid1 sdc1[0] sdd1[1]
> > >       61376 blocks [2/2] [UU]
> > >       
> > > md1 : active raid5 sdc2[2] sdd2[3] sdb2[1] sda2[0]
> > >       3153408 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
> > >       
> > > md2 : active raid5 sdc3[2] sdd3[3] sdb3[1] sda3[0]
> > >       5857199616 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
> > >       
> > > md0 : active raid1 sdb1[1] sda1[0]
> > >       61376 blocks [2/2] [UU]
> > >       
> > > unused devices: <none>
> > > 
> > > md0, md1, and md3 aren't so interesting, since fs is created directly on them, and that's a _problem device_:
> > > 
> > > # parted /dev/md2
> > > GNU Parted 2.2
> > > Using /dev/md2
> > > Welcome to GNU Parted! Type 'help' to view a list of commands.
> > > (parted) p free                                                           
> > > p free
> > > Model: Unknown (unknown)
> > > Disk /dev/md2: 5998GB
> > > Sector size (logical/physical): 512B/512B
> > > Partition Table: gpt
> > > 
> > > Number  Start   End     Size    File system     Name   Flags
> > >         17.4kB  1049kB  1031kB  Free Space
> > >  1      1049kB  2147MB  2146MB  linux-swap(v1)  swap
> > >  2      2147MB  23.6GB  21.5GB  xfs             home
> > >  3      23.6GB  24.7GB  1074MB  xfs             temp
> > >  4      24.7GB  35.4GB  10.7GB  xfs             user
> > >  5      35.4GB  51.5GB  16.1GB  xfs             var
> > >  6      51.5GB  5998GB  5946GB  xfs             vault
> > >         5998GB  5998GB  507kB   Free Space
> > > 
> > > # ls /dev/md?*
> > > /dev/md0  /dev/md1  /dev/md2  /dev/md2p1  /dev/md2p2  /dev/md2p3  /dev/md2p4  /dev/md2p5  /dev/md2p6  /dev/md3
> > > 
> > > It's very handy partitioning scheme where I can extend (grow 5th raid) with more hdds only /vault partition while "loosing" (a.k.a. not using for this partition) only ~1gb of space from every 2TB drive.
> > > 
> > > System boots ok and xfs_check passes with no problems, etc.
> > > The only problem: it's not possible to use NFS shares on any partition of /dev/md2 device.
> > > 
> > > Finally, my question to NFS and MDADM developers: any idea?
> > > 
> > > -- 
> > > Dont wait to die to find paradise...
> > > --
> > > Cheerz,
> > > Vlad "Stealth" Glagolev
> > 
> > 
> 
> 
> -- 
> Dont wait to die to find paradise...
> --
> Cheerz,
> Vlad "Stealth" Glagolev



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: NFS and /dev/mdXpY
       [not found]           ` <20100422193236.GA10302-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
@ 2010-04-22 19:47             ` Trond Myklebust
  2010-04-22 19:51               ` Vlad Glagolev
  0 siblings, 1 reply; 15+ messages in thread
From: Trond Myklebust @ 2010-04-22 19:47 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Vlad Glagolev, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-raid-u79uwXL29TY76Z2rM5mHXA

On Thu, 2010-04-22 at 15:32 -0400, J. Bruce Fields wrote: 
> On Thu, Apr 22, 2010 at 10:53:10PM +0400, Vlad Glagolev wrote:
> > On Thu, 22 Apr 2010 14:25:43 -0400
> > "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote:
> > 
> > > On Sat, Apr 17, 2010 at 07:57:47PM +0400, Vlad Glagolev wrote:
> > > > Well, hello there,
> > > > 
> > > > Posted it on linux-kernel ML also, and post it here, for more specific analysis.
> > > > 
> > > > I faced this problem today while trying to mount some NFS share on OpenBSD box.
> > > > I mounted it successfully without any visible errors, but I wasn't able to cd there, the printed error was:
> > > > 
> > > > ksh: cd: /storage - Stale NFS file handle
> > > > 
> > > > Apropos, the partition is 5.5 TB. I tried another one on my box and it was mounted successfully. It was possible to manage files there too. Its size is ~3GB.
> > > > That's why the first time I thought about some size limitations of OpenBSD/Linux/NFS.
> > > > 
> > > > While talking on #openbsd @ freenode, I discovered this via tcpdump on both sides:
> > > > 
> > > > http://pastebin.ca/1864713
> > > > 
> > > > Googling for 3 hours didn't help at all, some posts had similiar issue but either with no answer at all or without any full description.
> > > > 
> > > > Then I started to experiment with another Linux box to kill the possible different variants.
> > > > 
> > > > On another box I also have nfs-utils 1.1.6 and kernel 2.6.32. Mounting that big partition was unsuccessful, it got just stuck. On tcpdump I've seen this:
> > > 
> > > I'm a bit confused.  What kernel and nfs-utils version is running on the
> > > problematic Linux server?
> > 
> > same. nfs-utils 1.1.6 and kernel 2.6.32.
> 
> Huh.  That should be new enough for it to be using uuid's.  I wonder why
> it isn't?

What are the contents of /dev/disk/by-uuid?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: NFS and /dev/mdXpY
  2010-04-22 19:47             ` Trond Myklebust
@ 2010-04-22 19:51               ` Vlad Glagolev
  2010-04-22 19:56                 ` Trond Myklebust
  0 siblings, 1 reply; 15+ messages in thread
From: Vlad Glagolev @ 2010-04-22 19:51 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs, linux-raid

[-- Attachment #1: Type: text/plain, Size: 2536 bytes --]

On Thu, 22 Apr 2010 15:47:30 -0400
Trond Myklebust <trond.myklebust@fys.uio.no> wrote:

> On Thu, 2010-04-22 at 15:32 -0400, J. Bruce Fields wrote: 
> > On Thu, Apr 22, 2010 at 10:53:10PM +0400, Vlad Glagolev wrote:
> > > On Thu, 22 Apr 2010 14:25:43 -0400
> > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > > 
> > > > On Sat, Apr 17, 2010 at 07:57:47PM +0400, Vlad Glagolev wrote:
> > > > > Well, hello there,
> > > > > 
> > > > > Posted it on linux-kernel ML also, and post it here, for more specific analysis.
> > > > > 
> > > > > I faced this problem today while trying to mount some NFS share on OpenBSD box.
> > > > > I mounted it successfully without any visible errors, but I wasn't able to cd there, the printed error was:
> > > > > 
> > > > > ksh: cd: /storage - Stale NFS file handle
> > > > > 
> > > > > Apropos, the partition is 5.5 TB. I tried another one on my box and it was mounted successfully. It was possible to manage files there too. Its size is ~3GB.
> > > > > That's why the first time I thought about some size limitations of OpenBSD/Linux/NFS.
> > > > > 
> > > > > While talking on #openbsd @ freenode, I discovered this via tcpdump on both sides:
> > > > > 
> > > > > http://pastebin.ca/1864713
> > > > > 
> > > > > Googling for 3 hours didn't help at all, some posts had similiar issue but either with no answer at all or without any full description.
> > > > > 
> > > > > Then I started to experiment with another Linux box to kill the possible different variants.
> > > > > 
> > > > > On another box I also have nfs-utils 1.1.6 and kernel 2.6.32. Mounting that big partition was unsuccessful, it got just stuck. On tcpdump I've seen this:
> > > > 
> > > > I'm a bit confused.  What kernel and nfs-utils version is running on the
> > > > problematic Linux server?
> > > 
> > > same. nfs-utils 1.1.6 and kernel 2.6.32.
> > 
> > Huh.  That should be new enough for it to be using uuid's.  I wonder why
> > it isn't?
> 
> What are the contents of /dev/disk/by-uuid?
> 

$ ls -1 /dev/disk/by-uuid/
0e9742f6-44e3-431c-911f-4c914e4f81d5
31ae89c6-dba6-4351-b3a9-e8b08be07c3d
4429ba7a-afd2-4c61-83a0-900dae1bccdc
463bbc42-c19b-4b9e-bae7-838ac0e2e5c6
473b9320-88a4-44eb-b592-2ac98619bc9b
53d16b07-d496-4f8f-ad59-ea34aaf169f4
6adf1c55-405c-43cf-a84d-be5d2746d300
b35a7bca-12ad-4738-a895-52f20b7cc5d9
dc892f1f-0b83-41dd-bde7-0761295f33a3
f7ac4165-320f-4235-a78a-5fe1bd0aac24
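
The bare listing does not show which UUID belongs to which device. Since the entries in /dev/disk/by-uuid are normally symlinks to device nodes, they can be resolved with a short Python sketch like the one below. The function name is hypothetical and the output on the server in question is not known.

```python
import os


def uuid_to_device(by_uuid_dir="/dev/disk/by-uuid"):
    """Map each UUID symlink name to the device node it resolves to."""
    mapping = {}
    for name in sorted(os.listdir(by_uuid_dir)):
        link = os.path.join(by_uuid_dir, name)
        if os.path.islink(link):
            # realpath follows the symlink, e.g. to a node under /dev
            mapping[name] = os.path.realpath(link)
    return mapping
```

Calling uuid_to_device() on the server and printing the result would show whether any of the UUIDs points at a partition of /dev/md2.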

-- 
Dont wait to die to find paradise...
--
Cheerz,
Vlad "Stealth" Glagolev

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: NFS and /dev/mdXpY
  2010-04-22 19:51               ` Vlad Glagolev
@ 2010-04-22 19:56                 ` Trond Myklebust
       [not found]                   ` <1271966181.593.23.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Trond Myklebust @ 2010-04-22 19:56 UTC (permalink / raw)
  To: Vlad Glagolev; +Cc: J. Bruce Fields, linux-nfs, linux-raid

On Thu, 2010-04-22 at 23:51 +0400, Vlad Glagolev wrote: 
> On Thu, 22 Apr 2010 15:47:30 -0400
> Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> 
> > On Thu, 2010-04-22 at 15:32 -0400, J. Bruce Fields wrote: 
> > > On Thu, Apr 22, 2010 at 10:53:10PM +0400, Vlad Glagolev wrote:
> > > > On Thu, 22 Apr 2010 14:25:43 -0400
> > > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > > > 
> > > > > On Sat, Apr 17, 2010 at 07:57:47PM +0400, Vlad Glagolev wrote:
> > > > > > Well, hello there,
> > > > > > 
> > > > > > Posted it on linux-kernel ML also, and post it here, for more specific analysis.
> > > > > > 
> > > > > > I faced this problem today while trying to mount some NFS share on OpenBSD box.
> > > > > > I mounted it successfully without any visible errors, but I wasn't able to cd there, the printed error was:
> > > > > > 
> > > > > > ksh: cd: /storage - Stale NFS file handle
> > > > > > 
> > > > > > Apropos, the partition is 5.5 TB. I tried another one on my box and it was mounted successfully. It was possible to manage files there too. Its size is ~3GB.
> > > > > > That's why the first time I thought about some size limitations of OpenBSD/Linux/NFS.
> > > > > > 
> > > > > > While talking on #openbsd @ freenode, I discovered this via tcpdump on both sides:
> > > > > > 
> > > > > > http://pastebin.ca/1864713
> > > > > > 
> > > > > > Googling for 3 hours didn't help at all, some posts had similar issue but either with no answer at all or without any full description.
> > > > > > 
> > > > > > Then I started to experiment with another Linux box to kill the possible different variants.
> > > > > > 
> > > > > > On another box I also have nfs-utils 1.1.6 and kernel 2.6.32. Mounting that big partition was unsuccessful, it got just stuck. On tcpdump I've seen this:
> > > > > 
> > > > > I'm a bit confused.  What kernel and nfs-utils version is running on the
> > > > > problematic Linux server?
> > > > 
> > > > same. nfs-utils 1.1.6 and kernel 2.6.32.
> > > 
> > > Huh.  That should be new enough for it to be using uuid's.  I wonder why
> > > it isn't?
> > 
> > What are the contents of /dev/disk/by-uuid?
> > 
> 
> $ ls -1 /dev/disk/by-uuid/
> 0e9742f6-44e3-431c-911f-4c914e4f81d5
> 31ae89c6-dba6-4351-b3a9-e8b08be07c3d
> 4429ba7a-afd2-4c61-83a0-900dae1bccdc
> 463bbc42-c19b-4b9e-bae7-838ac0e2e5c6
> 473b9320-88a4-44eb-b592-2ac98619bc9b
> 53d16b07-d496-4f8f-ad59-ea34aaf169f4
> 6adf1c55-405c-43cf-a84d-be5d2746d300
> b35a7bca-12ad-4738-a895-52f20b7cc5d9
> dc892f1f-0b83-41dd-bde7-0761295f33a3
> f7ac4165-320f-4235-a78a-5fe1bd0aac24
> 

So, when you do 'ls -l' on the above, you do indeed see all the
partitions that are being exported via NFS?
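
For what it's worth, that check can also be scripted. Below is a minimal
Python sketch, assuming the usual udev layout where every entry in
/dev/disk/by-uuid is a symlink to a device node. The function name
uuid_links is made up for illustration and is not part of any tool
mentioned in this thread.

```python
import os


def uuid_links(by_uuid_dir="/dev/disk/by-uuid"):
    """Return {uuid: resolved device path} for each symlink in by_uuid_dir.

    Assumption: entries are symlinks named after the filesystem UUID,
    which is how udev normally populates this directory.
    """
    mapping = {}
    for name in sorted(os.listdir(by_uuid_dir)):
        path = os.path.join(by_uuid_dir, name)
        # Skip anything that is not a symlink; only links carry a UUID name.
        if os.path.islink(path):
            mapping[name] = os.path.realpath(path)
    return mapping
```

Calling uuid_links() on the server and looking for the exported device
among the values should give the same information as 'ls -l'.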



* Re: NFS and /dev/mdXpY
       [not found]                   ` <1271966181.593.23.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2010-04-22 20:07                     ` Vlad Glagolev
  0 siblings, 0 replies; 15+ messages in thread
From: Vlad Glagolev @ 2010-04-22 20:07 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: J. Bruce Fields, linux-nfs, linux-raid


On Thu, 22 Apr 2010 15:56:21 -0400
Trond Myklebust <trond.myklebust@fys.uio.no> wrote:

> On Thu, 2010-04-22 at 23:51 +0400, Vlad Glagolev wrote: 
> > On Thu, 22 Apr 2010 15:47:30 -0400
> > Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > 
> > > On Thu, 2010-04-22 at 15:32 -0400, J. Bruce Fields wrote: 
> > > > On Thu, Apr 22, 2010 at 10:53:10PM +0400, Vlad Glagolev wrote:
> > > > > On Thu, 22 Apr 2010 14:25:43 -0400
> > > > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > > > > 
> > > > > > On Sat, Apr 17, 2010 at 07:57:47PM +0400, Vlad Glagolev wrote:
> > > > > > > Well, hello there,
> > > > > > > 
> > > > > > > Posted it on linux-kernel ML also, and post it here, for more specific analysis.
> > > > > > > 
> > > > > > > I faced this problem today while trying to mount some NFS share on OpenBSD box.
> > > > > > > I mounted it successfully without any visible errors, but I wasn't able to cd there, the printed error was:
> > > > > > > 
> > > > > > > ksh: cd: /storage - Stale NFS file handle
> > > > > > > 
> > > > > > > Apropos, the partition is 5.5 TB. I tried another one on my box and it was mounted successfully. It was possible to manage files there too. Its size is ~3GB.
> > > > > > > That's why the first time I thought about some size limitations of OpenBSD/Linux/NFS.
> > > > > > > 
> > > > > > > While talking on #openbsd @ freenode, I discovered this via tcpdump on both sides:
> > > > > > > 
> > > > > > > http://pastebin.ca/1864713
> > > > > > > 
> > > > > > > Googling for 3 hours didn't help at all, some posts had similar issue but either with no answer at all or without any full description.
> > > > > > > 
> > > > > > > Then I started to experiment with another Linux box to kill the possible different variants.
> > > > > > > 
> > > > > > > On another box I also have nfs-utils 1.1.6 and kernel 2.6.32. Mounting that big partition was unsuccessful, it got just stuck. On tcpdump I've seen this:
> > > > > > 
> > > > > > I'm a bit confused.  What kernel and nfs-utils version is running on the
> > > > > > problematic Linux server?
> > > > > 
> > > > > same. nfs-utils 1.1.6 and kernel 2.6.32.
> > > > 
> > > > Huh.  That should be new enough for it to be using uuid's.  I wonder why
> > > > it isn't?
> > > 
> > > What are the contents of /dev/disk/by-uuid?
> > > 
> > 
> > $ ls -1 /dev/disk/by-uuid/
> > 0e9742f6-44e3-431c-911f-4c914e4f81d5
> > 31ae89c6-dba6-4351-b3a9-e8b08be07c3d
> > 4429ba7a-afd2-4c61-83a0-900dae1bccdc
> > 463bbc42-c19b-4b9e-bae7-838ac0e2e5c6
> > 473b9320-88a4-44eb-b592-2ac98619bc9b
> > 53d16b07-d496-4f8f-ad59-ea34aaf169f4
> > 6adf1c55-405c-43cf-a84d-be5d2746d300
> > b35a7bca-12ad-4738-a895-52f20b7cc5d9
> > dc892f1f-0b83-41dd-bde7-0761295f33a3
> > f7ac4165-320f-4235-a78a-5fe1bd0aac24
> > 
> 
> So, when you do 'ls -l' on the above, you do indeed see all the
> partitions that are being exported via NFS?

There's only one exported partition, and yes, you're right: /dev/md2p6 isn't in the list.

According to blkid:

# blkid
/dev/sda1: UUID="9a9beac0-d5fa-94d1-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sda2: UUID="8327bc1c-3d13-ee78-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sda3: UUID="01821f8d-da8e-fd6a-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sdb1: UUID="9a9beac0-d5fa-94d1-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sdb2: UUID="8327bc1c-3d13-ee78-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sdb3: UUID="01821f8d-da8e-fd6a-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sdc1: UUID="ed9c6039-5faf-d0ef-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sdc2: UUID="8327bc1c-3d13-ee78-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sdc3: UUID="01821f8d-da8e-fd6a-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sdd1: UUID="ed9c6039-5faf-d0ef-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sdd2: UUID="8327bc1c-3d13-ee78-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/sdd3: UUID="01821f8d-da8e-fd6a-86df-b710391e5a7c" TYPE="linux_raid_member" 
/dev/md0: UUID="53d16b07-d496-4f8f-ad59-ea34aaf169f4" TYPE="xfs" 
/dev/md2p1: UUID="31ae89c6-dba6-4351-b3a9-e8b08be07c3d" TYPE="swap" 
/dev/md2p2: UUID="dc892f1f-0b83-41dd-bde7-0761295f33a3" TYPE="xfs" 
/dev/md2p3: UUID="f7ac4165-320f-4235-a78a-5fe1bd0aac24" TYPE="xfs" 
/dev/md2p4: UUID="4429ba7a-afd2-4c61-83a0-900dae1bccdc" TYPE="xfs" 
/dev/md2p5: UUID="b35a7bca-12ad-4738-a895-52f20b7cc5d9" TYPE="xfs" 
/dev/md2p6: UUID="9f8d66e8-4b11-4ae7-8f90-b00ee9d204b1" TYPE="xfs" 
/dev/md1: UUID="0e9742f6-44e3-431c-911f-4c914e4f81d5" TYPE="xfs" 
/dev/md3: UUID="463bbc42-c19b-4b9e-bae7-838ac0e2e5c6" TYPE="xfs" 
/dev/sde1: UUID="6adf1c55-405c-43cf-a84d-be5d2746d300" TYPE="ext2"

and not all of them exist in /dev/disk/by-uuid (annotated with the matching device from blkid):

# ls -1 /dev/disk/by-uuid
0e9742f6-44e3-431c-911f-4c914e4f81d5 <- /dev/md1
31ae89c6-dba6-4351-b3a9-e8b08be07c3d <- /dev/md2p1
4429ba7a-afd2-4c61-83a0-900dae1bccdc <- /dev/md2p4
463bbc42-c19b-4b9e-bae7-838ac0e2e5c6 <- /dev/md3
473b9320-88a4-44eb-b592-2ac98619bc9b <- ??
53d16b07-d496-4f8f-ad59-ea34aaf169f4 <- /dev/md0
6adf1c55-405c-43cf-a84d-be5d2746d300 <- /dev/sde1
b35a7bca-12ad-4738-a895-52f20b7cc5d9 <- /dev/md2p5
dc892f1f-0b83-41dd-bde7-0761295f33a3 <- /dev/md2p2
f7ac4165-320f-4235-a78a-5fe1bd0aac24 <- /dev/md2p3

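Interesting. So the UUID that blkid reports for /dev/md2p6 (9f8d66e8-4b11-4ae7-8f90-b00ee9d204b1) has no symlink, while 473b9320-88a4-44eb-b592-2ac98619bc9b has a symlink but matches no device in the blkid output.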

Bug in udev?
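
The comparison above was done by eye, but it could be automated. Here is
a minimal Python sketch, assuming blkid prints lines in the
DEVICE: UUID="..." TYPE="..." form shown above. The names parse_blkid
and compare are hypothetical, and skipping linux_raid_member entries is
an assumption based on none of them appearing in the by-uuid listing.

```python
import re

# Matches lines such as: /dev/md2p6: UUID="9f8d..." TYPE="xfs"
BLKID_LINE = re.compile(
    r'^(?P<dev>\S+):.*?\bUUID="(?P<uuid>[^"]+)"'
    r'(?:.*?\bTYPE="(?P<type>[^"]+)")?'
)


def parse_blkid(text):
    """Return a list of (device, uuid, type) tuples from blkid-style text."""
    entries = []
    for line in text.splitlines():
        m = BLKID_LINE.match(line.strip())
        if m:
            entries.append((m.group("dev"), m.group("uuid"), m.group("type")))
    return entries


def compare(blkid_text, by_uuid_names, skip_types=("linux_raid_member",)):
    """Compare blkid output with the names found in /dev/disk/by-uuid.

    Returns (missing, unknown):
      missing - (device, uuid) pairs reported by blkid without a symlink
      unknown - symlink names that match no UUID reported by blkid
    """
    entries = [e for e in parse_blkid(blkid_text) if e[2] not in skip_types]
    linked = set(by_uuid_names)
    reported = {uuid for _, uuid, _ in entries}
    missing = [(dev, uuid) for dev, uuid, _ in entries if uuid not in linked]
    unknown = sorted(linked - reported)
    return missing, unknown
```

Fed the two listings above, compare() would report /dev/md2p6 as missing
and 473b9320-88a4-44eb-b592-2ac98619bc9b as unknown.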

-- 
Dont wait to die to find paradise...
--
Cheerz,
Vlad "Stealth" Glagolev



end of thread, other threads:[~2010-04-22 20:07 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz, follow: Atom feed)
2010-04-17 15:57 NFS and /dev/mdXpY Vlad Glagolev
     [not found] ` <20100417195747.5fae8834.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
2010-04-21 16:39   ` Steve Cousins
2010-04-21 16:48     ` Vlad Glagolev
     [not found]       ` <20100421204819.b86ee3f7.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
2010-04-21 17:09         ` Roger Heflin
     [not found]           ` <q2zd3da20d01004211009jccd81479v83e2ef4b6d5db7bf-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-04-21 17:32             ` Vlad Glagolev
     [not found]               ` <20100421213201.67a4a7a2.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
2010-04-21 18:26                 ` Vlad Glagolev
2010-04-21 19:08                   ` Vlad Glagolev
     [not found]                   ` <20100421222612.7aa4f21a.stealth-L+UJwxqiw56VyaH7bEyXVA@public.gmane.org>
2010-04-22  1:20                     ` Roger Heflin
2010-04-22 18:25   ` J. Bruce Fields
     [not found]     ` <20100422182543.GB8858-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-04-22 18:53       ` Vlad Glagolev
2010-04-22 19:32         ` J. Bruce Fields
     [not found]           ` <20100422193236.GA10302-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-04-22 19:47             ` Trond Myklebust
2010-04-22 19:51               ` Vlad Glagolev
2010-04-22 19:56                 ` Trond Myklebust
     [not found]                   ` <1271966181.593.23.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-04-22 20:07                     ` Vlad Glagolev
