linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Benny Halevy <bhalevy@panasas.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jeff Garzik <jeff@garzik.org>,
	Boaz Harrosh <bharrosh@panasas.com>,
	FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
	avishay@gmail.com, akpm@linux-foundation.org,
	linux-fsdevel@vger.kernel.org, osd-dev@open-osd.org,
	linux-kernel@vger.kernel.org, jens.axboe@oracle.com,
	linux-scsi@vger.kernel.org
Subject: Re: pNFS rant (was Re: [PATCH 1/8] exofs: Kbuild, Headers and osd utils)
Date: Mon, 16 Feb 2009 18:27:53 +0200	[thread overview]
Message-ID: <49999409.4030602@panasas.com> (raw)
In-Reply-To: <1234799401.3237.16.camel@localhost.localdomain>

On Feb. 16, 2009, 17:50 +0200, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Mon, 2009-02-16 at 06:05 -0500, Jeff Garzik wrote:
>> Boaz Harrosh wrote:
>>> No can do. exofs is meant to be a reference implementation of a pNFS-objects
>>> file serving system. Have you read the spec of pNFS-objects layout? they define
>>> RAID 0, 1, 5, and 6. In pNFS the MDS is suppose to be able to write the data
>>> for its clients as NFS, so it needs to have all the infra structure and knowledge
>>> of an Client pNFS-object layout drive.
>> Yes, I have studied pNFS!  I plan to add v4.1 and pNFS support to my NFS 
>> server, once v4.0 support is working well.
>>
>>
>> pNFS The Theory:   is wise and necessary:  permit clients to directly 
>> connect to data storage, rather than copying through the metadata 
>> server(s).  This is what every distributed filesystem is doing these 
>> days -- direct to data server for bulk data read/write.
>>
>> pNFS The Specification:   is an utter piece of shit.  I can only presume 
>> some shady backroom deal in a smoke-filled room was the reason this saw 
>> the light of day.
>>
>>
>> In a sane world, NFS clients would speak... NFS.
>>
>> In the crazy world of pNFS, NFS clients are now forced to speak NFS, 
>> SCSI, RAID, and any number of proprietary layout types.  When will HTTP 
>> be added to the list?  :)
> 
> Heh, it's one of the endearing faults of the storage industry that we
> never learn from our mistakes ... particularly in storage protocols.
> 
> Actually, perhaps that's a mischaracterised: we never actually learn
> from our successes.  For example, most popular storage protocols solve
> about 80% of the problem (NFSv2) get something bolted on to take that to
> 95% (locking) and rule for decades.  We end up obsessing about the 5%
> and produce something that's like 10x the overhead to solve it.
> Customers, for some unfathomable reason, hate complexity  (I suspect
> principally because it in some measure equals expense) so the 100%
> solution (which actually turns out to be a 95% one because the over
> engineered complexity adds another 5% of different problems that take
> years to find) tends to work its way into a niche and stay there ...
> eventually fading.
> 
> If you're really lucky, the niche evolves into something sustainable.
> For example iSCSI: blew its early promise, pulled a bunch of unnecessary
> networking into the protocol and ended up too big to fit in disk
> firmware (thus destroying the ability to have a simple network tap to
> replace storage fabric).  It's been slowly fading until Virtualisation
> came along.  Now all the other solutions to getting storage into virtual
> machines are so horrible and arcane that iSCSI looks like a winner (if
> the alternative is Frankenstein's monster, Grendel's mother suddenly
> looks more attractive as a partner).
> 
> So, trust the customer ... if it's so horrible it shouldn't have seen
> the light of day, the chances are that no-one will buy it anyway.

I completely agree with this sentence.
And no customer, whatsoever, that I've talked to about pNFS had
any reservations about supporting multiple layout types.  On the
contrary...

Benny

> 
> James
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2009-02-16 16:27 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-09 13:07 [PATCHSET 0/8 version 3] exofs Boaz Harrosh
2009-02-09 13:12 ` [PATCH 1/8] exofs: Kbuild, Headers and osd utils Boaz Harrosh
2009-02-16  4:18   ` FUJITA Tomonori
2009-02-16  8:49     ` Boaz Harrosh
2009-02-16  9:00       ` FUJITA Tomonori
2009-02-16  9:19         ` Boaz Harrosh
2009-02-16  9:27           ` Jeff Garzik
2009-02-16 10:19             ` Boaz Harrosh
2009-02-16 11:05               ` pNFS rant (was Re: [PATCH 1/8] exofs: Kbuild, Headers and osd utils) Jeff Garzik
2009-02-16 12:45                 ` Boaz Harrosh
2009-02-16 15:50                 ` James Bottomley
2009-02-16 16:27                   ` Benny Halevy [this message]
2009-02-16 16:23                 ` Benny Halevy
2009-02-16  9:38           ` [PATCH 1/8] exofs: Kbuild, Headers and osd utils FUJITA Tomonori
2009-02-16 10:29             ` Boaz Harrosh
2009-02-17  0:20               ` FUJITA Tomonori
2009-02-17  8:10                 ` [osd-dev] " Boaz Harrosh
2009-02-27  8:09                   ` FUJITA Tomonori
2009-03-01 10:43                     ` Boaz Harrosh
2009-02-09 13:18 ` [PATCH 2/8] exofs: file and file_inode operations Boaz Harrosh
2009-02-09 13:20 ` [PATCH 3/8] exofs: symlink_inode and fast_symlink_inode operations Boaz Harrosh
2009-02-09 13:22 ` [PATCH 4/8] exofs: address_space_operations Boaz Harrosh
2009-02-09 13:24 ` [PATCH 5/8] exofs: dir_inode and directory operations Boaz Harrosh
2009-02-15 17:08   ` Evgeniy Polyakov
2009-02-16  9:31     ` Boaz Harrosh
2009-03-15 18:10       ` Boaz Harrosh
2009-03-15 18:37         ` Evgeniy Polyakov
2009-02-09 13:25 ` [PATCH 6/8] exofs: super_operations and file_system_type Boaz Harrosh
2009-02-15 17:24   ` Evgeniy Polyakov
2009-02-16  9:59     ` Boaz Harrosh
2009-02-09 13:29 ` [PATCH 7/8] exofs: Documentation Boaz Harrosh
2009-02-09 13:31 ` [PATCH 8/8] fs: Add exofs to Kernel build Boaz Harrosh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49999409.4030602@panasas.com \
    --to=bhalevy@panasas.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=akpm@linux-foundation.org \
    --cc=avishay@gmail.com \
    --cc=bharrosh@panasas.com \
    --cc=fujita.tomonori@lab.ntt.co.jp \
    --cc=jeff@garzik.org \
    --cc=jens.axboe@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=osd-dev@open-osd.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).