From: Benny Halevy <bhalevy@tonian.com>
To: Boaz Harrosh <bharrosh@panasas.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
Benny Halevy <bhalevy@panasas.com>,
Brent Welch <welch@panasas.com>,
NFS list <linux-nfs@vger.kernel.org>,
open-osd <osd-dev@open-osd.org>
Subject: Re: [PATCH 13/19] pnfs-obj: Remove redundant EOF from objlayout_io_state
Date: Fri, 07 Oct 2011 12:58:31 -0400
Message-ID: <4E8F2FB7.40803@tonian.com>
In-Reply-To: <1317724499-27702-1-git-send-email-bharrosh@panasas.com>

On 2011-10-04 06:34, Boaz Harrosh wrote:
> The EOF calculation was done on .read_pagelist(), cached
> in objlayout_io_state->eof, and set in objlayout_read_done()
> into nfs_read_data->res.eof.
>
> So set it directly into nfs_read_data->res.eof and avoid
> the extra member.
>
> This is a slight behaviour change because before eof was
> *not* set on an error update at objlayout_read_done(). But
> is that a problem? Is Generic layer so sensitive that it
> will miss the error IO if eof was set? From my testing
> I did not see such a problem.
>
> Benny please review.
>
> Which brings me to a more abstract problem. Why does the
> LAYOUT driver needs to do this eof calculation? .i.e we
> are inspecting generic i_size_read() and if spanned by
> offset + count which is received from generic layer we set
> eof. It looks like all this can/should be done in generic
> layer and not at LD. Where does NFS and files-LD do it?
> It looks like it can be promoted.
In the files layout case, nfs_read_done sets res.eof.
But I agree this code could be moved to the generic layout
at least to serve non-rpc LDs.
And BTW, currently the object layout's handling of the eof flag
is stricter than the blocks layout's: it requires an extra
call with offset >= i_size to set the eof flag, while for
nfs and blocks eof is set when offset + count >= i_size.
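
To make the difference concrete, here is a minimal user-space sketch
of the two conditions as described above. It is not kernel code: the
helper names are made up for illustration, and i_size is passed in as
a plain integer instead of being read with i_size_read().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the rule described above for nfs and blocks.
 * eof is set as soon as the requested span reaches the file size.
 */
static bool eof_when_span_reaches_size(uint64_t offset, uint64_t count,
				       uint64_t i_size)
{
	return offset + count >= i_size;
}

/*
 * Hypothetical helper: the stricter rule described above for the
 * object layout. eof is set only when the read starts at or past
 * the file size, so a reader needs one extra call to see it.
 */
static bool eof_when_offset_at_or_past_size(uint64_t offset,
					    uint64_t i_size)
{
	return offset >= i_size;
}
```

For a 4096-byte file, a read with offset=0 and count=4096 satisfies
the first condition but not the second; the second only becomes true
on a follow-up read at offset=4096.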
>
> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Reviewed-by: Benny Halevy <bhalevy@tonian.com>
> ---
> fs/nfs/objlayout/objlayout.c | 16 +++++++---------
> fs/nfs/objlayout/objlayout.h | 1 -
> 2 files changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/fs/nfs/objlayout/objlayout.c b/fs/nfs/objlayout/objlayout.c
> index 1d06f8e..1300736 100644
> --- a/fs/nfs/objlayout/objlayout.c
> +++ b/fs/nfs/objlayout/objlayout.c
> @@ -287,17 +287,14 @@ static void _rpc_read_complete(struct work_struct *work)
> void
> objlayout_read_done(struct objlayout_io_state *state, ssize_t status, bool sync)
> {
> - int eof = state->eof;
> - struct nfs_read_data *rdata;
> + struct nfs_read_data *rdata = state->rpcdata;
>
> state->status = status;
> - dprintk("%s: Begin status=%zd eof=%d\n", __func__, status, eof);
> - rdata = state->rpcdata;
> + dprintk("%s: Begin status=%zd eof=%d\n", __func__,
> + status, rdata->res.eof);
> rdata->task.tk_status = status;
> - if (status >= 0) {
> + if (status >= 0)
> rdata->res.count = status;
> - rdata->res.eof = eof;
> - }
> objlayout_iodone(state);
> /* must not use state after this point */
>
> @@ -330,11 +327,14 @@ objlayout_read_pagelist(struct nfs_read_data *rdata)
> status = 0;
> rdata->res.count = 0;
> rdata->res.eof = 1;
> + /*FIXME: do we need to call pnfs_ld_read_done() */
Yes, it looks like we do; otherwise we might leak a refcount on the lseg.
We also need to set rdata->task.tk_status = 0, to mimic what
objlayout_read_done() would have done in the sync case.
Benny
> goto out;
> }
> count = eof - offset;
> }
>
> + rdata->res.eof = (offset + count) >= eof;
> +
> state = objlayout_alloc_io_state(NFS_I(rdata->inode)->layout,
> rdata->args.pages, rdata->args.pgbase,
> offset, count,
> @@ -345,8 +345,6 @@ objlayout_read_pagelist(struct nfs_read_data *rdata)
> goto out;
> }
>
> - state->eof = state->offset + state->count >= eof;
> -
> status = objio_read_pagelist(state);
> out:
> dprintk("%s: Return status %Zd\n", __func__, status);
> diff --git a/fs/nfs/objlayout/objlayout.h b/fs/nfs/objlayout/objlayout.h
> index a8244c8..ffb884c 100644
> --- a/fs/nfs/objlayout/objlayout.h
> +++ b/fs/nfs/objlayout/objlayout.h
> @@ -86,7 +86,6 @@ struct objlayout_io_state {
>
> void *rpcdata;
> int status; /* res */
> - int eof; /* res */
> int committed; /* res */
>
> /* Error reporting (layout_return) */