Linux NFS development
From: Jeff Garzik <jeff@garzik.org>
To: Benny Halevy <bhalevy@panasas.com>
Cc: NFS list <linux-nfs@vger.kernel.org>,
	nfsv4@linux-nfs.org, Greg Banks <gnb@melbourne.sgi.com>
Subject: Re: A new NFSv4 server...
Date: Fri, 04 Jan 2008 10:49:16 -0500
Message-ID: <477E557C.3000104@garzik.org>
In-Reply-To: <477DF761.8040306@panasas.com>

Benny Halevy wrote:
> Jeff, taking into account the amount of effort people and different
> organizations have already put into NFSv4 and NFSv4.1 I wish you could
> tunnel your inventive energy into making NFSv4.1 better rather than
> trying to reinvent NFS/RPC/XDR.
> 
> Although it's rather late in the process since the NFSv4 working group
> is close to putting the NFSv4.1 Internet-draft up for last
> call, we would certainly appreciate more implementation feedback.


I am more than happy to give feedback, though (as you say) it is 
probably too late for substantial feedback to have any large effect.

My general engineering opinions of pNFS:

* Fills an obvious need:  eliminating the need to copy data through the 
metadata interface to backend storage.  Many clear, tangible benefits here.


* pNFS major issue #1:  client storage protocol

Storing and retrieving blobs over the network, with strong 
authentication/integrity/security, is a solved problem.

Pick ONE client storage protocol (HTTP? iSCSI OSD2?), and stick to it. 
Or maybe HTTP|SCSI but nothing more.  Heck, even BitTorrent w/ auth 
extensions would be better than yet another protocol for similar 
purposes (not that I'm advocating BT, just saying...).

Maximize reuse of existing software and mindshare.


* pNFS major issue #2:  abandons NFS's "one true generic" path

I believe pNFS violates the "spirit of NFS" by deviating from a de facto 
assumption found in earlier versions:  data transfer is simple, 
arbitrary blobs, addressed in the same manner, and sent via the same 
protocol.

Pick ONE layout type, and stick to it.  Banish all other layout types to 
other software layers.

Protocol conversion servers, firmware, and other software can easily 
convert from a generic layout to something more exotic like OSD or 
[insert site specific protocol here].

NFS itself should not be delving into low-level storage details like 
this.  Clients should not need to know low-level details (like stripe 
sizes).  In Linux, we call this a layering violation.

Working on kernel storage drivers as I do, I can see the attraction of 
wanting to do things this way...  but we invented layering and 
abstraction in computer science for good reasons :)
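
To illustrate the layering argument, here is a minimal sketch. All names
(BlobStore, StripedBackend, copy_blob) are hypothetical and come from no
NFS implementation; the in-memory dicts merely stand in for storage
devices. The point is only that the stripe size lives below the generic
interface, so client code never sees it.

```python
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """The single generic data path a client sees: named blobs, byte ranges."""

    @abstractmethod
    def read(self, blob_id: str, offset: int, length: int) -> bytes: ...

    @abstractmethod
    def write(self, blob_id: str, offset: int, data: bytes) -> None: ...


class StripedBackend(BlobStore):
    """Hypothetical conversion layer that stripes blobs across devices."""

    def __init__(self, num_devices: int, stripe_size: int) -> None:
        self.stripe_size = stripe_size
        # One dict per device, keyed by (blob_id, stripe index).
        self.devices = [dict() for _ in range(num_devices)]

    def _chunk_key(self, blob_id: str, pos: int):
        stripe = pos // self.stripe_size
        return self.devices[stripe % len(self.devices)], (blob_id, stripe)

    def write(self, blob_id: str, offset: int, data: bytes) -> None:
        for i, byte in enumerate(data):
            pos = offset + i
            dev, key = self._chunk_key(blob_id, pos)
            chunk = dev.setdefault(key, bytearray(self.stripe_size))
            chunk[pos % self.stripe_size] = byte

    def read(self, blob_id: str, offset: int, length: int) -> bytes:
        out = bytearray()
        for pos in range(offset, offset + length):
            dev, key = self._chunk_key(blob_id, pos)
            # Unwritten ranges read back as zero bytes.
            chunk = dev.get(key, bytes(self.stripe_size))
            out.append(chunk[pos % self.stripe_size])
        return bytes(out)


def copy_blob(store: BlobStore, src: str, dst: str, length: int) -> None:
    """Client code: written against BlobStore only, no layout knowledge."""
    store.write(dst, 0, store.read(src, 0, length))
```

Swapping StripedBackend for any other BlobStore implementation would
leave copy_blob untouched, which is the property the generic path is
meant to preserve.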


* pNFS major issue #3: no longer a "closed loop" protocol

By permitting multiple layout types, and in particular undefined 
(site-specific) layout types, it is by definition _impossible_ for 
anyone to claim full protocol interoperability with other implementations.

The number of possible combinations approaches infinity, with obvious 
consequences for testing and production software quality.

And when a marketing department advertises "fully NFSv4.1 compliant!" on 
their company's appliance, it is trivial for any engineer to construct 
another "fully NFSv4.1 compliant" setup -- with equivalent 
authentication, metadata and data sets -- that is not interoperable 
except via the fallback case (copy through the metadata server).

Such interoperability breakdowns are IMO not in the spirit of NFS.
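
A toy model of that breakdown, under stated assumptions: the layout
labels are illustrative strings rather than protocol constants, and
choose_data_path is a hypothetical helper, not part of any NFS client
or server. Two setups that each support some layout type still end up
on the fallback path whenever their sets do not overlap.

```python
FALLBACK = "copy through metadata server"


def choose_data_path(client_layouts: set, server_layouts: set) -> str:
    """Return a layout type both sides support, else the fallback path."""
    common = client_layouts & server_layouts
    # min() just makes the choice deterministic for this illustration.
    return min(common) if common else FALLBACK


# Both sides could be advertised as compliant, yet they share no layout type.
appliance = {"blocks", "site-specific-x"}
client = {"objects"}
print(choose_data_path(client, appliance))
```

With a client supporting {"files", "objects"} against a server
supporting {"objects"}, the same function returns "objects"; only a
shared layout type gets the direct data path.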

	Jeff
