From: Trond Myklebust <Trond.Myklebust@netapp.com>
To: Boaz Harrosh <bharrosh@panasas.com>
Cc: Brent Welch <welch@panasas.com>,
NFS list <linux-nfs@vger.kernel.org>,
open-osd <osd-dev@open-osd.org>
Subject: Re: [PATCH 1/8] pnfs-obj: Remove redundant EOF from objlayout_io_state
Date: Mon, 31 Oct 2011 22:25:52 -0400
Message-ID: <1320114352.10028.33.camel@lade.trondhjem.org>
In-Reply-To: <4EAF366C.8040504@panasas.com>

On Mon, 2011-10-31 at 16:59 -0700, Boaz Harrosh wrote:
> On 10/31/2011 04:29 PM, Trond Myklebust wrote:
> >> In files-type reads in a "condense" layout. You should be careful
> >> because in striping it is common place to have eof on some DSs because
> >> of file holes even though there are more bits higher on in the file
> >> at other DSs. You should check to return back only the answer from the
> >> highest logical read DS. (Or I'm wrong in my interpretation?)
> >
> > In the close-to-open cache consistency, O_DIRECT database, or file
> > locking cases, then either the data has been committed, the file size
> > extended and the DSes updated,
>
> I meant the case where all that has happened (just opened the file) but
> any particular DS can still return EOF. Example:
> I have 3 DSs, with a stripe unit of, say, 1K.
>
> The file has been written to 0K..1K and 2K..3K. In a dense layout the file size
> on DS2 is zero, right? Because it was never written to. So if the client
> is reading 0K..3K (the whole file), will it get EOF from DS2?

Once the client issues LAYOUTCOMMIT, the server will need to ensure that
DS2 gets filled too. Dense file layouts aren't supposed to have holes
according to section 13.4.4 of RFC 5661 (the whole point is to support
filesystems like NTFS that don't do sparse files).

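To make the example above concrete, here is a minimal sketch of dense
stripe-unit packing as I understand it from 13.4.4. It assumes the first
stripe index is 0 and ignores any pattern offset; the function names are
made up for illustration and are not taken from the client code. DS2 in
the example is index 1 here.

```python
def dense_location(file_offset, stripe_unit, num_ds):
    """Map a file offset to (ds_index, ds_offset) under dense packing.

    Assumption: first stripe index is 0 and there is no pattern offset.
    """
    unit_no = file_offset // stripe_unit      # which stripe unit of the file
    ds_index = unit_no % num_ds               # round-robin across the DSs
    stripe_no = unit_no // num_ds             # which full stripe we are in
    ds_offset = stripe_no * stripe_unit + file_offset % stripe_unit
    return ds_index, ds_offset


def dense_ds_sizes(written_ranges, stripe_unit, num_ds):
    """Per-DS file size after writing the given [start, end) file ranges."""
    sizes = [0] * num_ds
    for start, end in written_ranges:
        pos = start
        while pos < end:
            # Stop at the next stripe-unit boundary or at the end of the range.
            chunk_end = min(end, (pos // stripe_unit + 1) * stripe_unit)
            ds_index, ds_offset = dense_location(pos, stripe_unit, num_ds)
            sizes[ds_index] = max(sizes[ds_index], ds_offset + (chunk_end - pos))
            pos = chunk_end
    return sizes


K = 1024
# 3 DSs, 1K stripe unit, file written at 0K..1K and 2K..3K.
sizes = dense_ds_sizes([(0, 1 * K), (2 * K, 3 * K)], K, 3)
```

With these inputs the middle DS ends up with size 0, so a read of
1K..2K sent to it would hit EOF unless the server fills it in.
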
> > or our client must know that the server
> > has incomplete information because it is holding cached writes or
> > layoutcommits that extend the file. In either case, the meaning of the
> > eofs should be obvious.
> >
>
> I hope that is taken care of, surely?
>
> > Benny's old pet project of making 'tail -f' work on a log file that is
> > being extended by someone else is, OTOH, subject to screwiness. However
> > that case can be screwy on ordinary read-through-MDS too.
> >
>
> Yes, that one was me too. I still think file length can easily be extended
> only on commit/layout_commit and not on any random write, so the above can
> work. I think everything needed is already in the protocol for servers that
> *want* to support this, with any compliant client. (Ask me if you don't know how;
> it involves keeping a shadow length per client up until commit. Actually, with
> pNFS it is easier.)

You can't do it safely with MDS only due to the issue of WRITE
reordering on the wire+server and pNFS just adds more ways to reorder
those writes. So unless you play games with layout recalls etc. in order
to order the READs and WRITEs from different clients (which goes against
the premise that layouts do not constitute a caching protocol) then I
fail to see how pNFS can help.
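
A toy illustration of the reordering problem, as a sketch only: the
"server" below is just a byte buffer and the helper names are invented
for this example. Two appending WRITEs arrive in the opposite order, and
a reader that samples the file between them sees the new length with
zeros where the first record should be.

```python
def apply_write(buf: bytearray, offset: int, data: bytes) -> None:
    """Apply one WRITE to the buffer, zero-filling any gap before it."""
    end = offset + len(data)
    if end > len(buf):
        buf.extend(b"\0" * (end - len(buf)))
    buf[offset:end] = data


def reader_snapshots(writes_in_arrival_order):
    """Return what a reader would see after each WRITE is applied."""
    server_file = bytearray()
    snapshots = []
    for offset, data in writes_in_arrival_order:
        apply_write(server_file, offset, data)
        snapshots.append(bytes(server_file))
    return snapshots


first = (0, b"line1\n")
second = (6, b"line2\n")

in_order = reader_snapshots([first, second])
reordered = reader_snapshots([second, first])
```

Both orders converge on the same final contents, but in the reordered
case the intermediate snapshot already has the extended size while the
first record is still missing.
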
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com