* Re: [pnfs] nfs41_sequence_done
From: Benny Halevy @ 2010-07-12 18:58 UTC
To: Jim Rees; +Cc: NFS list, Trond Myklebust

[pnfs@linux-nfs.org -> linux-nfs@vger.kernel.org]

On Jul. 12, 2010, 21:29 +0300, Jim Rees <rees@umich.edu> wrote:
> Does anyone still care about this?
>
> WARNING: nfs41_sequence_done: Operation in progress slot=1 seq=7 highest_used_slotid=1: please report to pnfs@linux-nfs.org if you saw this message

Heh, need to update hard-coded instructions to point to the new list...

>
> I'm getting this on the client side of a pnfs block layout mount against the
> spnfs server. Kernel is benny's pnfs-all-2.6.35-rc3-2010-07-01 plus EMC
> complex block layout patches. It's possible the complex layout code is to
> blame, but I doubt it because this isn't a complex layout mount. I can
> provide more details.

I agree. This is a generic issue.
The patch that adds this check is
d6ce9ad DEVONLY: nfs41: Do not free slot if retried while operation was in progress

It was originally rejected (http://www.spinics.net/lists/linux-nfs/msg09562.html)
due to noise regarding where nfs41_sequence_free_slot is called
but that masked the real issue.

Can you readily reproduce this?
Can you debug also the server side to see if indeed the client retries the RPC
while it is in progress on the server?

Benny

* Re: [pnfs] nfs41_sequence_done
From: Trond Myklebust @ 2010-07-12 19:14 UTC
To: Benny Halevy; +Cc: Jim Rees, NFS list

On Mon, 2010-07-12 at 21:58 +0300, Benny Halevy wrote:
> [pnfs@linux-nfs.org -> linux-nfs@vger.kernel.org]
>
> On Jul. 12, 2010, 21:29 +0300, Jim Rees <rees@umich.edu> wrote:
> > Does anyone still care about this?
> >
> > WARNING: nfs41_sequence_done: Operation in progress slot=1 seq=7 highest_used_slotid=1: please report to pnfs@linux-nfs.org if you saw this message
>
> Heh, need to update hard-coded instructions to point to the new list...
>
> >
> > I'm getting this on the client side of a pnfs block layout mount against the
> > spnfs server. Kernel is benny's pnfs-all-2.6.35-rc3-2010-07-01 plus EMC
> > complex block layout patches. It's possible the complex layout code is to
> > blame, but I doubt it because this isn't a complex layout mount. I can
> > provide more details.
>
> I agree. This is a generic issue.
> The patch that adds this check is
> d6ce9ad DEVONLY: nfs41: Do not free slot if retried while operation was in progress
>
> It was originally rejected (http://www.spinics.net/lists/linux-nfs/msg09562.html)
> due to noise regarding where nfs41_sequence_free_slot is called
> but that masked the real issue.
>
> Can you readily reproduce this?
> Can you debug also the server side to see if indeed the client retries the RPC
> while it is in progress on the server?

So what is the root cause here? Is it the known issue that we don't deal
correctly with an NFS4ERR_DELAY on the SEQUENCE operation?

Trond

* Re: [pnfs] nfs41_sequence_done
From: Benny Halevy @ 2010-07-12 19:16 UTC
To: Trond Myklebust; +Cc: Jim Rees, NFS list

On Jul. 12, 2010, 22:14 +0300, Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> On Mon, 2010-07-12 at 21:58 +0300, Benny Halevy wrote:
>> [pnfs@linux-nfs.org -> linux-nfs@vger.kernel.org]
>>
>> On Jul. 12, 2010, 21:29 +0300, Jim Rees <rees@umich.edu> wrote:
>>> Does anyone still care about this?
>>>
>>> WARNING: nfs41_sequence_done: Operation in progress slot=1 seq=7 highest_used_slotid=1: please report to pnfs@linux-nfs.org if you saw this message
>>
>> Heh, need to update hard-coded instructions to point to the new list...
>>
>>>
>>> I'm getting this on the client side of a pnfs block layout mount against the
>>> spnfs server. Kernel is benny's pnfs-all-2.6.35-rc3-2010-07-01 plus EMC
>>> complex block layout patches. It's possible the complex layout code is to
>>> blame, but I doubt it because this isn't a complex layout mount. I can
>>> provide more details.
>>
>> I agree. This is a generic issue.
>> The patch that adds this check is
>> d6ce9ad DEVONLY: nfs41: Do not free slot if retried while operation was in progress
>>
>> It was originally rejected (http://www.spinics.net/lists/linux-nfs/msg09562.html)
>> due to noise regarding where nfs41_sequence_free_slot is called
>> but that masked the real issue.
>>
>> Can you readily reproduce this?
>> Can you debug also the server side to see if indeed the client retries the RPC
>> while it is in progress on the server?
>
> So what is the root cause here? Is it the known issue that we don't deal
> correctly with an NFS4ERR_DELAY on the SEQUENCE operation?

Yes.

>
> Trond

* Re: [pnfs] nfs41_sequence_done
From: Trond Myklebust @ 2010-07-12 19:26 UTC
To: Benny Halevy; +Cc: Jim Rees, NFS list

On Mon, 2010-07-12 at 22:16 +0300, Benny Halevy wrote:
> On Jul. 12, 2010, 22:14 +0300, Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> > On Mon, 2010-07-12 at 21:58 +0300, Benny Halevy wrote:
> >> [pnfs@linux-nfs.org -> linux-nfs@vger.kernel.org]
> >>
> >> On Jul. 12, 2010, 21:29 +0300, Jim Rees <rees@umich.edu> wrote:
> >>> Does anyone still care about this?
> >>>
> >>> WARNING: nfs41_sequence_done: Operation in progress slot=1 seq=7 highest_used_slotid=1: please report to pnfs@linux-nfs.org if you saw this message
> >>
> >> Heh, need to update hard-coded instructions to point to the new list...
> >>
> >>>
> >>> I'm getting this on the client side of a pnfs block layout mount against the
> >>> spnfs server. Kernel is benny's pnfs-all-2.6.35-rc3-2010-07-01 plus EMC
> >>> complex block layout patches. It's possible the complex layout code is to
> >>> blame, but I doubt it because this isn't a complex layout mount. I can
> >>> provide more details.
> >>
> >> I agree. This is a generic issue.
> >> The patch that adds this check is
> >> d6ce9ad DEVONLY: nfs41: Do not free slot if retried while operation was in progress
> >>
> >> It was originally rejected (http://www.spinics.net/lists/linux-nfs/msg09562.html)
> >> due to noise regarding where nfs41_sequence_free_slot is called
> >> but that masked the real issue.
> >>
> >> Can you readily reproduce this?
> >> Can you debug also the server side to see if indeed the client retries the RPC
> >> while it is in progress on the server?
> >
> > So what is the root cause here? Is it the known issue that we don't deal
> > correctly with an NFS4ERR_DELAY on the SEQUENCE operation?
>
> Yes.

I'm happy to take patches to fix that.

Trond

* [PATCH] nfs41: Do not free slot if retried while operation was in progress
From: Benny Halevy @ 2010-07-12 19:49 UTC
To: Trond Myklebust; +Cc: Jim Rees, NFS list

On Jul. 12, 2010, 22:26 +0300, Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> On Mon, 2010-07-12 at 22:16 +0300, Benny Halevy wrote:
>> On Jul. 12, 2010, 22:14 +0300, Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
>>> On Mon, 2010-07-12 at 21:58 +0300, Benny Halevy wrote:
>>>> [pnfs@linux-nfs.org -> linux-nfs@vger.kernel.org]
>>>>
>>>> On Jul. 12, 2010, 21:29 +0300, Jim Rees <rees@umich.edu> wrote:
>>>>> Does anyone still care about this?
>>>>>
>>>>> WARNING: nfs41_sequence_done: Operation in progress slot=1 seq=7 highest_used_slotid=1: please report to pnfs@linux-nfs.org if you saw this message
>>>>
>>>> Heh, need to update hard-coded instructions to point to the new list...
>>>>
>>>>>
>>>>> I'm getting this on the client side of a pnfs block layout mount against the
>>>>> spnfs server. Kernel is benny's pnfs-all-2.6.35-rc3-2010-07-01 plus EMC
>>>>> complex block layout patches. It's possible the complex layout code is to
>>>>> blame, but I doubt it because this isn't a complex layout mount. I can
>>>>> provide more details.
>>>>
>>>> I agree. This is a generic issue.
>>>> The patch that adds this check is
>>>> d6ce9ad DEVONLY: nfs41: Do not free slot if retried while operation was in progress
>>>>
>>>> It was originally rejected (http://www.spinics.net/lists/linux-nfs/msg09562.html)
>>>> due to noise regarding where nfs41_sequence_free_slot is called
>>>> but that masked the real issue.
>>>>
>>>> Can you readily reproduce this?
>>>> Can you debug also the server side to see if indeed the client retries the RPC
>>>> while it is in progress on the server?
>>>
>>> So what is the root cause here? Is it the known issue that we don't deal
>>> correctly with an NFS4ERR_DELAY on the SEQUENCE operation?
>>
>> Yes.
>
> I'm happy to take patches to fix that.
>
> Trond

From 7dc3c468463a337dabff7f714a3475e3f51380f6 Mon Sep 17 00:00:00 2001
From: Benny Halevy <bhalevy@panasas.com>
Date: Mon, 12 Jul 2010 22:42:15 +0300
Subject: [PATCH] nfs41: Do not free slot if retried while operation was in progress

Getting NFS4ERR_DELAY on OP_SEQUENCE means that the compound was retried
while it's still in progress on the server. Therefore its respective
slot must not be freed and reused for other compounds until it either
succeeds or fails with another error status.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
---

That fixed, do we ensure that the client either closes or loses the connection
before retrying?

 fs/nfs/nfs4proc.c | 6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 70015dd..baf86b9 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -425,6 +425,12 @@ static void nfs41_sequence_done(struct nfs_client *clp,
 		/* Check sequence flags */
 		if (atomic_read(&clp->cl_count) > 1)
 			nfs41_handle_sequence_flag_errors(clp, res->sr_status_flags);
+	} else if (unlikely(res->sr_status == -NFS4ERR_DELAY)) {
+		/* Do not free slot if retried while operation was in progress */
+		tbl = &res->sr_session->fc_slot_table;
+		dprintk("%s: slot=%d seq=%d: Operation in progress\n", __func__,
+			res->sr_slotid, tbl->slots[res->sr_slotid].seq_nr);
+		return;
 	}
 out:
 	/* The session may be reset by one of the error handlers. */
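
[Editorial sketch, not part of the original thread] The rule stated in the commit message above (a slot whose SEQUENCE returned NFS4ERR_DELAY stays busy until a later reply settles it) can be modelled in user space as below. This is a simplified illustration only: every `sketch_*` name, the table size, and the choice to advance the sequence number only on success are assumptions made for this example, not the kernel's data structures or logic. The follow-up reply in this thread calls the patch itself insufficient, so the sketch covers the stated idea and nothing more.

```c
#include <stdbool.h>

#define NFS4_OK          0
#define NFS4ERR_DELAY    10008  /* protocol error code; stored negated, as in the patch */
#define SKETCH_MAX_SLOTS 4      /* hypothetical table size for this example */

/* Hypothetical, simplified stand-in for one session slot. */
struct sketch_slot {
	unsigned int seq_nr;  /* sequence number last acknowledged on this slot */
	bool in_use;          /* true while a compound owns the slot */
};

struct sketch_slot_table {
	struct sketch_slot slots[SKETCH_MAX_SLOTS];
};

/* Claim the lowest free slot; returns its index, or -1 if all are busy. */
static int sketch_alloc_slot(struct sketch_slot_table *tbl)
{
	for (int i = 0; i < SKETCH_MAX_SLOTS; i++) {
		if (!tbl->slots[i].in_use) {
			tbl->slots[i].in_use = true;
			return i;
		}
	}
	return -1;
}

/*
 * Process the SEQUENCE status for a reply on `slotid`.
 * Returns true if the slot was released, false if it must stay busy.
 */
static bool sketch_sequence_done(struct sketch_slot_table *tbl, int slotid,
				 int sr_status)
{
	struct sketch_slot *slot = &tbl->slots[slotid];

	if (sr_status == -NFS4ERR_DELAY) {
		/* Still in progress on the server: keep the slot and its seq_nr. */
		return false;
	}
	if (sr_status == NFS4_OK)
		slot->seq_nr++;  /* assumption: advance only on success */
	slot->in_use = false;
	return true;
}
```

With this model, a caller that sees `false` from `sketch_sequence_done()` would retry the same compound on the same slot with the same sequence number, and any other compound asking for a slot in the meantime is handed a different one.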

* Re: [PATCH] nfs41: Do not free slot if retried while operation was in progress
From: Trond Myklebust @ 2010-07-12 19:59 UTC
To: Benny Halevy; +Cc: Jim Rees, NFS list

On Mon, 2010-07-12 at 22:49 +0300, Benny Halevy wrote:
> On Jul. 12, 2010, 22:26 +0300, Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> > On Mon, 2010-07-12 at 22:16 +0300, Benny Halevy wrote:
> >> On Jul. 12, 2010, 22:14 +0300, Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> >>> On Mon, 2010-07-12 at 21:58 +0300, Benny Halevy wrote:
> >>>> [pnfs@linux-nfs.org -> linux-nfs@vger.kernel.org]
> >>>>
> >>>> On Jul. 12, 2010, 21:29 +0300, Jim Rees <rees@umich.edu> wrote:
> >>>>> Does anyone still care about this?
> >>>>>
> >>>>> WARNING: nfs41_sequence_done: Operation in progress slot=1 seq=7 highest_used_slotid=1: please report to pnfs@linux-nfs.org if you saw this message
> >>>>
> >>>> Heh, need to update hard-coded instructions to point to the new list...
> >>>>
> >>>>>
> >>>>> I'm getting this on the client side of a pnfs block layout mount against the
> >>>>> spnfs server. Kernel is benny's pnfs-all-2.6.35-rc3-2010-07-01 plus EMC
> >>>>> complex block layout patches. It's possible the complex layout code is to
> >>>>> blame, but I doubt it because this isn't a complex layout mount. I can
> >>>>> provide more details.
> >>>>
> >>>> I agree. This is a generic issue.
> >>>> The patch that adds this check is
> >>>> d6ce9ad DEVONLY: nfs41: Do not free slot if retried while operation was in progress
> >>>>
> >>>> It was originally rejected (http://www.spinics.net/lists/linux-nfs/msg09562.html)
> >>>> due to noise regarding where nfs41_sequence_free_slot is called
> >>>> but that masked the real issue.
> >>>>
> >>>> Can you readily reproduce this?
> >>>> Can you debug also the server side to see if indeed the client retries the RPC
> >>>> while it is in progress on the server?
> >>>
> >>> So what is the root cause here? Is it the known issue that we don't deal
> >>> correctly with an NFS4ERR_DELAY on the SEQUENCE operation?
> >>
> >> Yes.
> >
> > I'm happy to take patches to fix that.
> >
> > Trond
>
> From 7dc3c468463a337dabff7f714a3475e3f51380f6 Mon Sep 17 00:00:00 2001
> From: Benny Halevy <bhalevy@panasas.com>
> Date: Mon, 12 Jul 2010 22:42:15 +0300
> Subject: [PATCH] nfs41: Do not free slot if retried while operation was in progress
>
> Getting NFS4ERR_DELAY on OP_SEQUENCE means that the compound was retried
> while it's still in progress on the server. Therefore its respective
> slot must not be freed and reused for other compounds until it either
> succeeds or fails with another error status.
>
> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
> ---
>
> That fixed, do we ensure that the client either closes or loses the connection
> before retrying?
>
>  fs/nfs/nfs4proc.c | 6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 70015dd..baf86b9 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -425,6 +425,12 @@ static void nfs41_sequence_done(struct nfs_client *clp,
>  		/* Check sequence flags */
>  		if (atomic_read(&clp->cl_count) > 1)
>  			nfs41_handle_sequence_flag_errors(clp, res->sr_status_flags);
> +	} else if (unlikely(res->sr_status == -NFS4ERR_DELAY)) {
> +		/* Do not free slot if retried while operation was in progress */
> +		tbl = &res->sr_session->fc_slot_table;
> +		dprintk("%s: slot=%d seq=%d: Operation in progress\n", __func__,
> +			res->sr_slotid, tbl->slots[res->sr_slotid].seq_nr);
> +		return;
>  	}
>  out:
>  	/* The session may be reset by one of the error handlers. */

No. That is very clearly insufficient...

Never mind. I'll do it...