linux-nfs.vger.kernel.org archive mirror
* Help wanted: ENOCLK returned during lock test#2 in connectathon's test
@ 2011-12-05 13:52 DENIEL Philippe
  2011-12-05 23:33 ` Trond Myklebust
  0 siblings, 1 reply; 6+ messages in thread
From: DENIEL Philippe @ 2011-12-05 13:52 UTC
  To: Ganesha NFS List, NFS list

[-- Attachment #1: Type: text/plain, Size: 3295 bytes --]

Hi,

as you may know (we may have met at Bake-A-Thon), I am working on 
NFS-Ganesha, an NFS server running in user space. I am currently facing 
an issue when running the cthon04 test suite, during the "lock" step.
The client is Linux 3.1.0-rc4; the server is nfs-ganesha compiled with 
FSAL_VFS support. The server is mounted with the command "mount 
-overs=4.minorversion=1,lock <server>:<path> /mnt".

During test #2 of the "lock" tests, I got the following error:

    Creating parent/child synchronization pipes.

    Test #2 - Try to lock the whole file.
            Parent: 2.0  - F_TLOCK [               0,          ENDING]
    FAILED!
            Parent: **** Expected success, returned errno=37...
            Parent: **** Probably implementation error.

    ** PARENT pass 1 results: 0/0 pass, 0/0 warn, 1/1 fail (pass/total).

    **  CHILD pass 1 results: 0/0 pass, 0/0 warn, 0/0 fail (pass/total).
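For reference, the call that fails in test #2 can be approximated as
follows. This is a minimal sketch, not the cthon04 source: it assumes
that a non-blocking lockf() over the whole file (offset 0, length 0)
behaves like the test's F_TLOCK request, and the helper name
try_lock_whole_file is hypothetical.

```python
import errno
import fcntl
import os


def try_lock_whole_file(path):
    """Try a non-blocking exclusive lock on the whole file.

    Returns 0 on success, or the errno value raised by the lock call.
    (Hypothetical helper, for illustration only.)
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # start=0 with len=0 means "from offset 0 to end of file,
        # however large the file grows".
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 0, 0, os.SEEK_SET)
        fcntl.lockf(fd, fcntl.LOCK_UN, 0, 0, os.SEEK_SET)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)
```

On a file under the NFS mount, a return value equal to errno.ENOLCK
would correspond to the "returned errno=37" line in the test output.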


I made a Wireshark capture of the packets (see attachment). Apparently, 
the client sends two compounds, one for OP4_OPEN and a second one to 
call OP4_OPEN_CONFIRM.

On the client side, the "lock/locklfs" binary fails when calling lockf() 
(this is not the first time it calls it; it already did so in test #1, 
which passed). The error returned is ENOLCK (POSIX error #37).
I enabled the kernel's debug messages with the command 'echo 32767 > 
/proc/sys/sunrpc/nfs_debug' (complete log attached, reduced to what is 
related to lock test #2). Grepping for 'NFS' in it shows this:

    Dec  5 13:31:08 aury63 kernel: NFS: nfs_update_inode(0:16/523265
    ct=2 info=0x27e7f)
    Dec  5 13:31:08 aury63 kernel: NFS: permission(0:16/523265),
    mask=0x1, res=0
    Dec  5 13:31:08 aury63 kernel: NFS: nfs_lookup_revalidate(/rep) is valid
    Dec  5 13:31:08 aury63 kernel: NFS: nfs_update_inode(0:16/544895
    ct=1 info=0x27e7f)
    Dec  5 13:31:08 aury63 kernel: NFS: permission(0:16/544895),
    mask=0x1, res=0
    Dec  5 13:31:08 aury63 kernel: NFS: permission(0:16/544895),
    mask=0x1, res=0
    Dec  5 13:31:08 aury63 kernel: NFS: atomic_lookup(0:16/544895),
    lockfile2908
    Dec  5 13:31:08 aury63 kernel: decode_attr_pnfstype: bitmap is 0
    Dec  5 13:31:08 aury63 kernel: nfs4_schedule_state_renewal:
    requeueing work. Lease period = 80
    Dec  5 13:31:08 aury63 kernel: --> nfs_put_client({2})
    Dec  5 13:31:08 aury63 kernel: <-- nfs4_setup_sequence status=0
    Dec  5 13:31:08 aury63 kernel: NFS: nfs_update_inode(0:16/544895
    ct=1 info=0x27e67)
    Dec  5 13:31:08 aury63 kernel: NFS: change_attr change on server for
    file 0:16/544895
    Dec  5 13:31:08 aury63 kernel: NFS: mtime change on server for file
    0:16/544895
    Dec  5 13:31:08 aury63 kernel: NFS: nfs_fhget(0:16/544930 ct=1)
    Dec  5 13:31:08 aury63 kernel: NFS: open file(rep/lockfile2908)
    Dec  5 13:31:08 aury63 kernel: NFS: permission(0:16/544930),
    mask=0x26, res=0
    Dec  5 13:31:08 aury63 kernel: NFS: lock(rep/lockfile2908, t=1,
    fl=1, r=0:9223372036854775807)
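The last line of this excerpt can be read mechanically. The sketch
below pulls the byte range out of an "NFS: lock(...)" debug line and
checks whether it covers the whole file. It assumes the r=start:end
field holds decimal offsets, as in the line above; the t= and fl=
fields are kept as raw strings because their encoding is not stated
here. The names parse_lock_line and OFFSET_MAX are illustrative.

```python
import re

# Largest signed 64-bit file offset; an end value equal to this is taken
# to mean "up to end of file" (assumption for this sketch).
OFFSET_MAX = 2**63 - 1

LOCK_LINE = re.compile(
    r"NFS: lock\((?P<path>[^,]+), t=(?P<type>\w+), fl=(?P<flags>\w+), "
    r"r=(?P<start>-?\d+):(?P<end>-?\d+)\)"
)


def parse_lock_line(line):
    """Return the fields of an 'NFS: lock(...)' line, or None if no match."""
    m = LOCK_LINE.search(line)
    if m is None:
        return None
    start, end = int(m["start"]), int(m["end"])
    return {
        "path": m["path"],
        "type": m["type"],    # raw text, not interpreted
        "flags": m["flags"],  # raw text, not interpreted
        "start": start,
        "end": end,
        "whole_file": start == 0 and end == OFFSET_MAX,
    }
```

Applied to the unwrapped form of the line above, the range 0 to
9223372036854775807 is reported as a whole-file lock, which matches
what test #2 asks for.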

I can see no reason why ENOLCK is returned. This is clearly a bug on the 
server side (within nfs-ganesha), but I need to know what the client is 
doing here to get a clearer idea.
I ran the same test using NFSv4.1 and NFSv3+NLMv4, and everything passes 
with no failure.

Can someone help me?

    Regards

       Philippe


[-- Attachment #2: bug_lock_2.pcap --]
[-- Type: application/octet-stream, Size: 1548 bytes --]

[-- Attachment #3: log_kernel_bug_lock2.txt --]
[-- Type: text/plain, Size: 6686 bytes --]

Dec  5 13:31:08 aury63 kernel: encode_compound: tag=
Dec  5 13:31:08 aury63 kernel: decode_attr_type: type=040000
Dec  5 13:31:08 aury63 kernel: decode_attr_change: change attribute=1323082676
Dec  5 13:31:08 aury63 kernel: decode_attr_size: file size=856064
Dec  5 13:31:08 aury63 kernel: decode_attr_fsid: fsid=(0xc0/0xa8)
Dec  5 13:31:08 aury63 kernel: decode_attr_fileid: fileid=523265
Dec  5 13:31:08 aury63 kernel: decode_attr_fs_locations: fs_locations done, error = 0
Dec  5 13:31:08 aury63 kernel: decode_attr_mode: file mode=01777
Dec  5 13:31:08 aury63 kernel: decode_attr_nlink: nlink=14
Dec  5 13:31:08 aury63 kernel: decode_attr_owner: uid=0
Dec  5 13:31:08 aury63 kernel: decode_attr_group: gid=0
Dec  5 13:31:08 aury63 kernel: decode_attr_rdev: rdev=(0x0:0x0)
Dec  5 13:31:08 aury63 kernel: decode_attr_space_used: space used=860160
Dec  5 13:31:08 aury63 kernel: decode_attr_time_access: atime=1323081627
Dec  5 13:31:08 aury63 kernel: decode_attr_time_metadata: ctime=1323082676
Dec  5 13:31:08 aury63 kernel: decode_attr_time_modify: mtime=1323082676
Dec  5 13:31:08 aury63 kernel: decode_attr_mounted_on_fileid: fileid=0
Dec  5 13:31:08 aury63 kernel: decode_getfattr: xdr returned 0
Dec  5 13:31:08 aury63 kernel: NFS: nfs_update_inode(0:16/523265 ct=2 info=0x27e7f)
Dec  5 13:31:08 aury63 kernel: NFS: permission(0:16/523265), mask=0x1, res=0
Dec  5 13:31:08 aury63 kernel: NFS: nfs_lookup_revalidate(/rep) is valid
Dec  5 13:31:08 aury63 kernel: encode_compound: tag=
Dec  5 13:31:08 aury63 kernel: decode_attr_type: type=040000
Dec  5 13:31:08 aury63 kernel: decode_attr_change: change attribute=1323088137
Dec  5 13:31:08 aury63 kernel: decode_attr_size: file size=4096
Dec  5 13:31:08 aury63 kernel: decode_attr_fsid: fsid=(0xc0/0xa8)
Dec  5 13:31:08 aury63 kernel: decode_attr_fileid: fileid=544895
Dec  5 13:31:08 aury63 kernel: decode_attr_fs_locations: fs_locations done, error = 0
Dec  5 13:31:08 aury63 kernel: decode_attr_mode: file mode=0755
Dec  5 13:31:08 aury63 kernel: decode_attr_nlink: nlink=2
Dec  5 13:31:08 aury63 kernel: decode_attr_owner: uid=0
Dec  5 13:31:08 aury63 kernel: decode_attr_group: gid=0
Dec  5 13:31:08 aury63 kernel: decode_attr_rdev: rdev=(0x0:0x0)
Dec  5 13:31:08 aury63 kernel: decode_attr_space_used: space used=4096
Dec  5 13:31:08 aury63 kernel: decode_attr_time_access: atime=1323071853
Dec  5 13:31:08 aury63 kernel: decode_attr_time_metadata: ctime=1323088137
Dec  5 13:31:08 aury63 kernel: decode_attr_time_modify: mtime=1323088137
Dec  5 13:31:08 aury63 kernel: decode_attr_mounted_on_fileid: fileid=0
Dec  5 13:31:08 aury63 kernel: decode_getfattr: xdr returned 0
Dec  5 13:31:08 aury63 kernel: NFS: nfs_update_inode(0:16/544895 ct=1 info=0x27e7f)
Dec  5 13:31:08 aury63 kernel: NFS: permission(0:16/544895), mask=0x1, res=0
Dec  5 13:31:08 aury63 kernel: NFS: permission(0:16/544895), mask=0x1, res=0
Dec  5 13:31:08 aury63 kernel: NFS: atomic_lookup(0:16/544895), lockfile2908
Dec  5 13:31:08 aury63 kernel: encode_compound: tag=
Dec  5 13:31:08 aury63 kernel: encode_compound: tag=
Dec  5 13:31:08 aury63 kernel: decode_attr_lease_time: file size=120
Dec  5 13:31:08 aury63 kernel: decode_attr_maxfilesize: maxfilesize=0
Dec  5 13:31:08 aury63 kernel: decode_attr_maxread: maxread=1024
Dec  5 13:31:08 aury63 kernel: decode_attr_maxwrite: maxwrite=1024
Dec  5 13:31:08 aury63 kernel: decode_attr_pnfstype: bitmap is 0
Dec  5 13:31:08 aury63 kernel: decode_attr_layout_blksize: bitmap is 0
Dec  5 13:31:08 aury63 kernel: decode_fsinfo: xdr returned 0!
Dec  5 13:31:08 aury63 kernel: nfs4_schedule_state_renewal: requeueing work. Lease period = 80
Dec  5 13:31:08 aury63 kernel: --> nfs_put_client({2})
Dec  5 13:31:08 aury63 kernel: <-- nfs4_setup_sequence status=0
Dec  5 13:31:08 aury63 kernel: encode_compound: tag=
Dec  5 13:31:08 aury63 kernel: decode_attr_type: type=0100000
Dec  5 13:31:08 aury63 kernel: decode_attr_change: change attribute=1323088268
Dec  5 13:31:08 aury63 kernel: decode_attr_size: file size=0
Dec  5 13:31:08 aury63 kernel: decode_attr_fsid: fsid=(0xc0/0xa8)
Dec  5 13:31:08 aury63 kernel: decode_attr_fileid: fileid=544930
Dec  5 13:31:08 aury63 kernel: decode_attr_fs_locations: fs_locations done, error = 0
Dec  5 13:31:08 aury63 kernel: decode_attr_mode: file mode=0644
Dec  5 13:31:08 aury63 kernel: decode_attr_nlink: nlink=1
Dec  5 13:31:08 aury63 kernel: decode_attr_owner: uid=-2
Dec  5 13:31:08 aury63 kernel: decode_attr_group: gid=-2
Dec  5 13:31:08 aury63 kernel: decode_attr_rdev: rdev=(0x0:0x0)
Dec  5 13:31:08 aury63 kernel: decode_attr_space_used: space used=0
Dec  5 13:31:08 aury63 kernel: decode_attr_time_access: atime=1323088268
Dec  5 13:31:08 aury63 kernel: decode_attr_time_metadata: ctime=1323088268
Dec  5 13:31:08 aury63 kernel: decode_attr_time_modify: mtime=1323088268
Dec  5 13:31:08 aury63 kernel: decode_attr_mounted_on_fileid: fileid=0
Dec  5 13:31:08 aury63 kernel: decode_getfattr: xdr returned 0
Dec  5 13:31:08 aury63 kernel: decode_attr_type: type=040000
Dec  5 13:31:08 aury63 kernel: decode_attr_change: change attribute=1323088137
Dec  5 13:31:08 aury63 kernel: decode_attr_size: file size=4096
Dec  5 13:31:08 aury63 kernel: decode_attr_fsid: fsid=(0xc0/0xa8)
Dec  5 13:31:08 aury63 kernel: decode_attr_fileid: fileid=544895
Dec  5 13:31:08 aury63 kernel: decode_attr_fs_locations: fs_locations done, error = 0
Dec  5 13:31:08 aury63 kernel: decode_attr_mode: file mode=0755
Dec  5 13:31:08 aury63 kernel: decode_attr_nlink: nlink=2
Dec  5 13:31:08 aury63 kernel: decode_attr_owner: uid=-2
Dec  5 13:31:08 aury63 kernel: decode_attr_group: gid=-2
Dec  5 13:31:08 aury63 kernel: decode_attr_rdev: rdev=(0x0:0x0)
Dec  5 13:31:08 aury63 kernel: decode_attr_space_used: space used=4096
Dec  5 13:31:08 aury63 kernel: decode_attr_time_access: atime=1323071853
Dec  5 13:31:08 aury63 kernel: decode_attr_time_metadata: ctime=1323088268
Dec  5 13:31:08 aury63 kernel: decode_attr_time_modify: mtime=1323088268
Dec  5 13:31:08 aury63 kernel: decode_attr_mounted_on_fileid: fileid=0
Dec  5 13:31:08 aury63 kernel: decode_getfattr: xdr returned 0
Dec  5 13:31:08 aury63 kernel: NFS: nfs_update_inode(0:16/544895 ct=1 info=0x27e67)
Dec  5 13:31:08 aury63 kernel: NFS: change_attr change on server for file 0:16/544895
Dec  5 13:31:08 aury63 kernel: NFS: mtime change on server for file 0:16/544895
Dec  5 13:31:08 aury63 kernel: encode_compound: tag=
Dec  5 13:31:08 aury63 kernel: NFS: nfs_fhget(0:16/544930 ct=1)
Dec  5 13:31:08 aury63 kernel: NFS: open file(rep/lockfile2908)
Dec  5 13:31:08 aury63 kernel: NFS: permission(0:16/544930), mask=0x26, res=0
Dec  5 13:31:08 aury63 kernel: NFS: lock(rep/lockfile2908, t=1, fl=1, r=0:9223372036854775807)


* Re: Help wanted: ENOCLK returned during lock test#2 in connectathon's test
  2011-12-05 13:52 Help wanted: ENOCLK returned during lock test#2 in connectathon's test DENIEL Philippe
@ 2011-12-05 23:33 ` Trond Myklebust
  2011-12-06 12:11   ` DENIEL Philippe
  0 siblings, 1 reply; 6+ messages in thread
From: Trond Myklebust @ 2011-12-05 23:33 UTC
  To: DENIEL Philippe; +Cc: Ganesha NFS List, NFS list

On Mon, 2011-12-05 at 14:52 +0100, DENIEL Philippe wrote: 
> Hi,
> 
> as you may know (we may have met at Bake-A-Thon), I am working on 
> NFS-Ganesha, a NFS server running in userspace. I currently face an 
> issue when running cthon04 test suite, during the "lock step".
> Client is linux 3.1.0-rc4, server is nfs-ganesha compiled with FSAL_VFS 
> support. Server is mounted via command "mount 
> -overs=4.minorversion=1,lock <server>:<path> /mnt"
> 
> During the test#2 in "lock" tests, I got the following error:
> 
>     Creating parent/child synchronization pipes.
> 
>     Test #2 - Try to lock the whole file.
>             Parent: 2.0  - F_TLOCK [               0,          ENDING]
>     FAILED!
>             Parent: **** Expected success, returned errno=37...
>             Parent: **** Probably implementation error.
> 
>     ** PARENT pass 1 results: 0/0 pass, 0/0 warn, 1/1 fail (pass/total).
> 
>     **  CHILD pass 1 results: 0/0 pass, 0/0 warn, 0/0 fail (pass/total).
> 
> 
> I made a wireshark capture of the packet (see attachement). Apparently, 
> the client does 2 compounds, one for OP4_OPEN and a second to call 
> OP4_OPEN_CONFIRM.

Hi Philippe,

As far as I can see from the pcap file, your server isn't setting the
OPEN4_RESULT_LOCKTYPE_POSIX flag in the OPEN reply, and so the client
can't support posix locking semantics. In that case, it will return
ENOLCK to all fcntl locking requests.
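The behaviour described here can be modelled in a few lines. This is
only a sketch of the decision, not the Linux client's code: the
function name lock_precheck is hypothetical, and the flag values are
the ones from the NFSv4 protocol definition (CONFIRM = 0x2,
LOCKTYPE_POSIX = 0x4).

```python
import errno

# rflags bits returned in an OPEN reply (values per the NFSv4 protocol).
OPEN4_RESULT_CONFIRM = 0x00000002
OPEN4_RESULT_LOCKTYPE_POSIX = 0x00000004


def lock_precheck(open_rflags):
    """Model of the check: 0 if POSIX locks may be attempted, else ENOLCK.

    Hypothetical helper; the real client logic is more involved.
    """
    if open_rflags & OPEN4_RESULT_LOCKTYPE_POSIX:
        return 0
    # The server did not advertise POSIX lock semantics for this open,
    # so every fcntl lock request is refused locally.
    return errno.ENOLCK
```

With this model, an OPEN reply whose rflags carry only the CONFIRM bit
leads to ENOLCK without any LOCK operation being sent to the server.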

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com



* Re: Help wanted: ENOCLK returned during lock test#2 in connectathon's test
  2011-12-05 23:33 ` Trond Myklebust
@ 2011-12-06 12:11   ` DENIEL Philippe
  2011-12-06 19:37     ` J. Bruce Fields
  0 siblings, 1 reply; 6+ messages in thread
From: DENIEL Philippe @ 2011-12-06 12:11 UTC
  To: Trond Myklebust; +Cc: Ganesha NFS List, NFS list

Hi Trond,

many thanks for your reply.
In fact, the "rflag" in the OP4_OPEN reply is set to 6 = 4|2 = 
OPEN4_RESULT_LOCKTYPE_POSIX|OPEN4_RESULT_CONFIRM.
For some reason I do not understand, Wireshark sees 
OPEN4_RESULT_LOCKTYPE_POSIX as an 'unknown' flag and does not print it.  
But it actually seems that OPEN4_RESULT_LOCKTYPE_POSIX is set.
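The value 6 can be checked by hand when a dissector does not name a
bit. The sketch below decodes an rflags word into flag names. Only the
two flags discussed in this thread are listed, so anything else is
reported as unknown; the table and the function name decode_rflags are
illustrative, not taken from Wireshark or from ganesha.

```python
# Flags discussed in this thread; other bits exist in the protocol but
# are deliberately left out of this sketch.
KNOWN_RFLAGS = {
    0x00000002: "OPEN4_RESULT_CONFIRM",
    0x00000004: "OPEN4_RESULT_LOCKTYPE_POSIX",
}


def decode_rflags(rflags):
    """Return the names of the bits set in rflags, lowest bit first."""
    names = [name for bit, name in sorted(KNOWN_RFLAGS.items()) if rflags & bit]
    known_mask = 0
    for bit in KNOWN_RFLAGS:
        known_mask |= bit
    unknown = rflags & ~known_mask
    if unknown:
        names.append(f"unknown(0x{unknown:x})")
    return names
```

For rflags = 6 this yields both OPEN4_RESULT_CONFIRM and
OPEN4_RESULT_LOCKTYPE_POSIX, consistent with the 4|2 decomposition
above.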

Your mail made me take a closer look at my implementation of OP4_OPEN 
and OP4_OPEN_CONFIRM in NFSv4.0. Since I first met this bug, I have 
suspected something related to seqids: it does not occur in NFSv4.1, 
where seqid management is done in OP4_SEQUENCE, at the beginning of the 
request. So I ran lock test #2 against a kernel nfsd, captured the 
result, and compared it to what ganesha produces. I saw a difference:
- when OP4_OPEN is invoked, the nfsd replies with a stateid containing 
seqid=0. This seqid is passed to OP4_OPEN_CONFIRM, which confirms it and 
(if OK) replies with an updated stateid (seqid is now 1);
- when ganesha does the same, OP4_OPEN returns an (unconfirmed) stateid 
whose seqid is equal to 1, then OP4_OPEN_CONFIRM sets this seqid to 2 
when confirming the stateid.
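The two traces can be put side by side with a small model. This is a
sketch under stated assumptions: the Stateid class, the
open_then_confirm helper and the all-zero "other" field are made up
for illustration, and only the seqid values come from the captures
described above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Stateid:
    seqid: int    # counter carried inside the stateid
    other: bytes  # 12 server-chosen bytes (placeholder value below)


def open_then_confirm(initial_seqid, other=b"\x00" * 12):
    """Return (unconfirmed, confirmed) stateids for one OPEN + OPEN_CONFIRM.

    Hypothetical model: OPEN hands out initial_seqid, OPEN_CONFIRM bumps
    it by one and leaves the rest of the stateid unchanged.
    """
    unconfirmed = Stateid(initial_seqid, other)
    confirmed = replace(unconfirmed, seqid=unconfirmed.seqid + 1)
    return unconfirmed, confirmed


nfsd_trace = open_then_confirm(0)     # seqid 0, then 1
ganesha_trace = open_then_confirm(1)  # seqid 1, then 2
```

In both traces the confirmed stateid is exactly one step newer than
the unconfirmed one; only the starting value differs.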

From your point of view, could this mess in seqid management produce 
the bug that I see when running lock test #2?

    Regards

       Philippe


* Re: Help wanted: ENOCLK returned during lock test#2 in connectathon's test
  2011-12-06 12:11   ` DENIEL Philippe
@ 2011-12-06 19:37     ` J. Bruce Fields
  2011-12-07 13:29       ` DENIEL Philippe
  0 siblings, 1 reply; 6+ messages in thread
From: J. Bruce Fields @ 2011-12-06 19:37 UTC
  To: DENIEL Philippe; +Cc: Trond Myklebust, Ganesha NFS List, NFS list

On Tue, Dec 06, 2011 at 01:11:05PM +0100, DENIEL Philippe wrote:
> Hi Trond,
> 
> many thanks for your reply.
> In fact, the "rflag" in OP4_OPEN's reply is set to 6 = 4|2 =
> OPEN4_RESULT_LOCKTYPE_POSIX|OPEN4_RESULT_CONFIRM
> For some reason I do not understand, wireshark see
> OPEN4_RESULT_LOCKTYPE_POSIX as an 'unknown' flag and do not print
> it.  Bug actually it seems like OPEN4_RESULT_LOCKTYPE_POSIX is set.
> 
> Your mail made me have a closer look to my implementation of
> OP4_OPEN and OP4_OPEN_CONFIRM in NFSv4.0 . Since the beginning
> (since I met this bug), I suspect something related to seqids : it
> does not occur in NFSv4.1 where seqids 's management is made in
> OP4_SEQUENCE, at the beginning of the request. So I ran lock test#2
> on a kernel nfsd, capture the result and compared to what ganesha
> produces. I saw a difference:
> - when OP4_OPEN is invoked, the nfsd replies with a stateid
> containing seqid=0. This seqid is passed to OP4_OPEN_CONFIRM which
> confirms it and (if OK) replies with an updated stateid (seqid is
> now 1)
> - when ganesha does the same OP4_OPEN return a (unconfirmed) stateid
> whose seqid is equal to 1, then OP4_OPEN_CONFIRM set this seqid to 2
> when confirming the stateid.

Sounds like you're talking about the seqid field that's contained in the
stateid itself--I'd be surprised if the client cares about it.  The spec
does allow the client to inspect that field to decide what order opens
were done in, but other than that a client normally treats the whole
stateid as opaque.
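To make "opaque" concrete: on the wire a stateid4 is a 4-byte seqid
followed by 12 bytes of "other", 16 bytes in total. The sketch below
packs and inspects that layout; the helper names pack_stateid and
peek_seqid are illustrative, and nothing here is taken from the Linux
client.

```python
import struct

NFS4_OTHER_SIZE = 12  # size of the "other" field in a stateid4


def pack_stateid(seqid, other):
    """Encode a stateid4 as 16 bytes: big-endian uint32 seqid + other."""
    if len(other) != NFS4_OTHER_SIZE:
        raise ValueError("other must be exactly 12 bytes")
    return struct.pack(">I", seqid) + other


def peek_seqid(stateid_bytes):
    """Read the seqid from an encoded stateid (the only non-opaque part)."""
    (seqid,) = struct.unpack_from(">I", stateid_bytes, 0)
    return seqid
```

A client that treats the stateid as opaque simply stores these 16
bytes and echoes them back, so whether the first four bytes encode 0
and 1 or 1 and 2 makes no difference to it.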

--b.


* Re: Help wanted: ENOCLK returned during lock test#2 in connectathon's test
  2011-12-06 19:37     ` J. Bruce Fields
@ 2011-12-07 13:29       ` DENIEL Philippe
  2011-12-07 15:12         ` [Nfs-ganesha-devel] " Frank S Filz
  0 siblings, 1 reply; 6+ messages in thread
From: DENIEL Philippe @ 2011-12-07 13:29 UTC
  To: J. Bruce Fields; +Cc: Trond Myklebust, Ganesha NFS List, NFS list

Hi Bruce,

Yes I am talking about the seqid inside the stateid.

    Philippe


* Re: [Nfs-ganesha-devel] Help wanted: ENOCLK returned during lock test#2 in connectathon's test
  2011-12-07 13:29       ` DENIEL Philippe
@ 2011-12-07 15:12         ` Frank S Filz
  0 siblings, 0 replies; 6+ messages in thread
From: Frank S Filz @ 2011-12-07 15:12 UTC
  To: DENIEL Philippe
  Cc: J. Bruce Fields, NFS list, Ganesha NFS List, Trond Myklebust

DENIEL Philippe <philippe.deniel@cea.fr> wrote on 12/07/2011 05:29:02 AM:
> Hi Bruce,
>
> Yes I am talking about the seqid inside the stateid.

The seqid inside the stateid is also an unlikely place for a
difference between NFSv4.0 and NFSv4.1, since NFSv4.1 still has the seqid
inside the stateid. See Section 9.4, "Stateid Seqid Values and Byte-Range
Locks" (p. 189 of RFC 5661), for locks, and Section 9.9, "Open Upgrade
and Downgrade" (p. 192), for opens.

Frank
