From: "J. Bruce Fields" <bfields@fieldses.org>
To: Soumya Koduri <skoduri@redhat.com>
Cc: "Omar Walid Llorente" <omar@dit.upm.es>,
"Jeff Layton" <jlayton@poochiereds.net>,
linux-nfs@vger.kernel.org,
"administración del centro de cálculo del dit" <cdc@dit.upm.es>
Subject: Re: possible bug in nfs-kernel-server
Date: Fri, 18 Dec 2015 15:08:40 -0500
Message-ID: <20151218200840.GA28692@fieldses.org>
In-Reply-To: <56743FB6.80903@redhat.com>
On Fri, Dec 18, 2015 at 10:47:42PM +0530, Soumya Koduri wrote:
>
>
> On 12/18/2015 08:50 PM, J. Bruce Fields wrote:
> >On Fri, Dec 18, 2015 at 02:13:40PM +0530, Soumya Koduri wrote:
> >>
> >>
> >>On 12/18/2015 06:07 AM, Malahal Naineni wrote:
> >>>IIRC, permission checks are done in open(). write/read syscalls should
> >>>NOT do much access checks (at least based on POSIX). This is why once an
> >>>open is done, you remove permissions for that process, but it should
> >>>still be able to read/write based on the open flags it did when it
> >>>opened the file.
> >>>
> >>>I don't know all the details of this defect, but gluster seems to be
> >>>doing what it is supposed to do.
> >>>
> >>Right. Thanks for the correction. I assumed the behavior should be
> >>same for both OPEN+WRITE vs CREATE+WRITE in the below scenario. But
> >>looks like (from 'man creat') the open() call that creates a
> >>read-only file may well return a read/write file descriptor, which
> >>is the reason the following WRITE can succeed.
> >
> >I forgot another complication, which is that knfsd actually does a
> >temporary open before each read or write--I assume that's getting
> >translated into fuse and gluster open operations?
> >
> Yes. It is the OPEN done as part of the NFS WRITE that fails with
> an EACCES error (with both NFSv3 and NFSv4 mounts).
Makes sense for v3, but I wouldn't normally expect the extra temporary
open on v4 WRITEs. Could you share any details?
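
A minimal local sketch of the semantics discussed above, assuming a POSIX
system and Python's os module. The creating open() asks for mode 0444 yet
returns a writable descriptor, while a later fresh open for writing (roughly
what a per-WRITE temporary open amounts to) is subject to the mode check.
The function and file names are illustrative only; this is not knfsd or
gluster code.

```python
import os
import tempfile


def write_after_readonly_create(directory):
    """Create a 0444 file, write through the creating fd, then try to reopen.

    Returns (bytes_written, resulting_mode, reopen_for_write_succeeded).
    """
    path = os.path.join(directory, "444.txt")
    # The mode argument applies to later opens, not to the descriptor
    # returned by the open() call that creates the file.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o444)
    try:
        written = os.write(fd, b"prueba\n")
        mode = os.stat(path).st_mode & 0o777
    finally:
        os.close(fd)

    # A fresh open for writing goes through the permission check again.
    # A privileged caller (root) bypasses the check, so the result is
    # reported rather than assumed.
    try:
        os.close(os.open(path, os.O_WRONLY))
        reopen_ok = True
    except PermissionError:
        reopen_ok = False
    return written, mode, reopen_ok


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(write_after_readonly_create(d))
```

For an unprivileged user the reopen is expected to fail with EACCES, which
is the same kind of failure as the OPEN in the trace below.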
--b.
>
> 63 16:59:09.278651000 ::1 -> ::1 NFS 232 V3 WRITE Call, FH: 0x49a35e54 Offset: 0 Len: 7 FILE_SYNC
> 64 16:59:09.278926000 192.168.122.1 -> 192.168.122.202 GlusterFS 164 V330 OPEN Call
> 65 16:59:09.278937000 192.168.122.1 -> 192.168.122.202 GlusterFS 164 [RPC retransmission of #64][TCP Retransmission] V330 OPEN Call
> 66 16:59:09.279459000 192.168.122.202 -> 192.168.122.1 GlusterFS 116 V330 OPEN Reply (Call In 64)
> 67 16:59:09.279459000 192.168.122.202 -> 192.168.122.1 GlusterFS 116 [RPC duplicate of #66][TCP Retransmission] V330 OPEN Reply (Call In 64)
> 68 16:59:09.279733000 ::1 -> ::1 NFS 212 V3 WRITE Reply (Call In 63) Error: NFS3ERR_ACCES
>
>
> Thanks,
> Soumya
>
> >In which case it might be worth experimenting with NFSv4 or with Jeff
> >Layton's filehandle-caching patches. Neither's a real fix, but that
> >could help confirm whether it's the temporary opens that are a problem.
> >
> >--b.
> >
> >>
> >>Thanks,
> >>Soumya
> >>
> >>
> >>>Regards, Malahal.
> >>>
> >>>Soumya Koduri [skoduri@redhat.com] wrote:
> >>>>As mentioned by Bruce, GlusterFS doesn't have an owner-override rule
> >>>>except for setattr.
> >>>>
> >>>>I did a few experiments to check why this test case passes on a plain
> >>>>glusterfs fuse mount & NFS-Ganesha but fails with kernel-NFS.
> >>>>
> >>>>NFS-Ganesha (for most of the FSALs) seems to pass the actual
> >>>>request credentials to the back-end filesystem only for
> >>>>CREATE(-like) and UNLINK fops. For all the remaining fops, it does
> >>>>the access check at its end and then performs the operation with root
> >>>>credentials. That's the reason WRITE succeeded in your case, as
> >>>>NFS-Ganesha (like kernel-NFS) skipped the access check if the
> >>>>request caller_uid proved to be the file's owner.
> >>>>
> >>>>In case of native GlusterFS FUSE mount, there is no OPEN fop
> >>>>involved. WRITE is performed on the fd returned by CREATE. And
> >>>>strangely GlusterFS seems to be doing certain access checks only
> >>>>during OPEN but not for WRITE (this seems like a bug and probably
> >>>>needs to be fixed in Gluster).
> >>>>
> >>>>Thanks,
> >>>>Soumya
> >>>>
> >>>>On 12/14/2015 10:27 PM, Omar Walid Llorente wrote:
> >>>>>
> >>>>>Thank you Bruce, others, for the responses. I have attached a complete
> >>>>>capture of the issue, including the glusterfs transactions.
> >>>>>
> >>>>>Hope this helps clarify where the problem may be...
> >>>>>
> >>>>>Omar
> >>>>>
> >>>>>On 10/12/15 at 15:44, J. Bruce Fields wrote:
> >>>>>>On Thu, Dec 10, 2015 at 05:59:33PM +0530, Soumya Koduri wrote:
> >>>>>>>
> >>>>>>>On 12/10/2015 04:02 PM, Omar Walid Llorente wrote:
> >>>>>>>>Hi, Jeff, Bruce, finally I got some time to get the capture of the nfs
> >>>>>>>>packets (you can find them in attached file nfs-problem-nks.pcap.zip).
> >>>>>>>>Sorry for being so late.
> >>>>>>>>
> >>>>>>>>What I did was the following:
> >>>>>>>>
> >>>>>>>>1st) Create the RO file:
> >>>>>>>>cdc@l056:~/prueba-git$ rm -f kk.txt 444.txt; echo "prueba" > 444.txt;
> >>>>>>>>chmod 444 444.txt;
> >>>>>>>>
> >>>>>>>>2nd) Init the capture:
> >>>>>>>>root@l056:~# tcpdump -i eth2 -w /tmp/nfs.pcap -s 512 port 2049
> >>>>>>>>tcpdump: listening on eth2, link-type EN10MB (Ethernet), capture size
> >>>>>>>>512 bytes
> >>>>>>>>
> >>>>>>>GlusterFS protocol support was added to Wireshark in version 1.8.0 [1].
> >>>>>>>It may be helpful to see which GlusterFS operations are being processed
> >>>>>>>as part of the NFS WRITE call (which has failed in this case).
> >>>>>>>
> >>>>>>>Could you please try taking the packet trace on the machine where
> >>>>>>>NFS server is running (without filtering out based on the port
> >>>>>>>number).
> >>>>>>>
> >>>>>>>Also, I tried the same test on a Fedora 22 machine, but haven't run
> >>>>>>>into any issue. What fuse mount options did you use to
> >>>>>>>mount the gluster volume?
> >>>>>>Oh, I think this is a simple problem (but maybe hard to fix). The
> >>>>>>capture shows NFSv3 traffic like:
> >>>>>>
> >>>>>> CREATE -> OK
> >>>>>> SETATTR (mode set to 0400) -> OK
> >>>>>> WRITE -> NFS3ERR_ACCES
> >>>>>>
> >>>>>>That write would succeed locally (because the mode doesn't matter to a
> >>>>>>local application that already holds the file open). It would fail over
> >>>>>>NFSv3, which doesn't know about the open--except that there's a hack for
> >>>>>>this case: NFSv3 servers allow IO operations to ignore the mode, if the
> >>>>>>operation comes from the owner of the file. NFSv3 clients are then
> >>>>>>careful to perform necessary access checks on open to ensure that this
> >>>>>>owner-override rule doesn't grant too many permissions.
> >>>>>>
> >>>>>>That allows NFSv3 applications to see behavior that's mostly like a
> >>>>>>local filesystem, without opening much of a security hole (since the
> >>>>>>owner could always chmod anyway).
> >>>>>>
> >>>>>>So, knfsd is making this special exception--but gluster (which I believe
> >>>>>>it's exporting in this case, via fuse?)--probably doesn't.... I'm not
> >>>>>>sure what you can do about that.
> >>>>>>
> >>>>>>--b.
> >>>>>
> >>>>
> >>>
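
The owner-override rule described in the quoted 10 Dec message can be
sketched as a small decision function. This is a deliberately simplified
model, not knfsd's actual code: it looks only at the owner and "other"
permission bits and ignores groups, ACLs and root. The function name and
the owner_override switch are hypothetical, introduced only to contrast a
server that applies the rule with a backend that does not.

```python
def nfs3_io_allowed(mode, file_uid, caller_uid, want_write,
                    owner_override=True):
    """Return True if a simplified server would allow the READ or WRITE.

    Only the owner and "other" bits are modelled (assumption); group
    membership, ACLs and privileged callers are left out.
    """
    if caller_uid == file_uid:
        if owner_override:
            # The exception described in the thread: IO from the file's
            # owner ignores the mode bits.
            return True
        bit = 0o200 if want_write else 0o400
    else:
        bit = 0o002 if want_write else 0o004
    return bool(mode & bit)


if __name__ == "__main__":
    # CREATE, SETATTR(mode 0400), then WRITE by the owner (uid 1000 is
    # an arbitrary example value).
    print(nfs3_io_allowed(0o400, 1000, 1000, want_write=True))
    print(nfs3_io_allowed(0o400, 1000, 1000, want_write=True,
                          owner_override=False))
```

With the override in place the owner's WRITE to a mode-0400 file is
allowed; without it the same request is refused, which corresponds to the
NFS3ERR_ACCES seen in the capture.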