* [PATCH] sunrpc: remove unnecessary svc_xprt_put
@ 2010-02-26 22:33 Neil Brown
[not found] ` <19336.19524.469529.431210-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
0 siblings, 1 reply; 20+ messages in thread
From: Neil Brown @ 2010-02-26 22:33 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Tom Tucker, linux-nfs
[I found this while looking for the current refcount problem
that triggers a warning in svc_recv. This isn't that bug
but is a different refcount bug - NB]
The 'struct svc_deferred_req's on the xpt_deferred queue do not
own a reference to the owning xprt. This is seen in svc_revisit
which is where things are added to this queue. dr->xprt is set to
NULL and the reference to the xprt is put.
So when this list is cleaned up in svc_delete_xprt, we mustn't
put the reference.
Also, replace the 'for' with a 'while' which is arguably
simpler and more likely to compile efficiently.
Cc: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: NeilBrown <neilb@suse.de>
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 7d1f9e9..4f30336 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -889,11 +889,8 @@ void svc_delete_xprt(struct svc_xprt *xprt)
if (test_bit(XPT_TEMP, &xprt->xpt_flags))
serv->sv_tmpcnt--;
- for (dr = svc_deferred_dequeue(xprt); dr;
- dr = svc_deferred_dequeue(xprt)) {
- svc_xprt_put(xprt);
+ while ((dr = svc_deferred_dequeue(xprt)) != NULL)
kfree(dr);
- }
svc_xprt_put(xprt);
spin_unlock_bh(&serv->sv_lock);
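
A minimal userspace sketch of the ownership rule described in the commit message, assuming toy stand-ins for the kernel types (toy_xprt, toy_deferred and every toy_* helper are hypothetical names, not sunrpc code). Queued deferred requests hold no reference, so draining the queue only frees the entries and the single remaining put accounts for the transport itself:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel structures; all names are hypothetical. */
struct toy_deferred {
	struct toy_deferred *next;
};

struct toy_xprt {
	int refcount;			/* models xpt_ref */
	struct toy_deferred *deferred;	/* models the xpt_deferred queue */
};

static void toy_xprt_put(struct toy_xprt *xprt)
{
	/* A put without a matching get would trip this check. */
	assert(xprt->refcount > 0);
	xprt->refcount--;
}

/* Queue a deferred request. As in the commit message, it takes no reference. */
static int toy_defer(struct toy_xprt *xprt)
{
	struct toy_deferred *dr = malloc(sizeof(*dr));

	if (dr == NULL)
		return -1;
	dr->next = xprt->deferred;
	xprt->deferred = dr;
	return 0;
}

static struct toy_deferred *toy_deferred_dequeue(struct toy_xprt *xprt)
{
	struct toy_deferred *dr = xprt->deferred;

	if (dr != NULL)
		xprt->deferred = dr->next;
	return dr;
}

/* Drain the queue the way the patched loop does; returns entries freed. */
static int toy_delete_xprt(struct toy_xprt *xprt)
{
	struct toy_deferred *dr;
	int freed = 0;

	while ((dr = toy_deferred_dequeue(xprt)) != NULL) {
		free(dr);	/* no put here: the entry never owned a reference */
		freed++;
	}
	toy_xprt_put(xprt);	/* the one reference this path does own */
	return freed;
}
```

With one reference held and two queued entries, toy_delete_xprt() frees both entries and leaves the count at zero. A put inside the loop, as in the removed code, would have tripped the assertion on the final put.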
* Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put
[not found] ` <19336.19524.469529.431210-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2010-02-26 22:44 ` J. Bruce Fields
2010-02-26 22:54 ` J. Bruce Fields
1 sibling, 0 replies; 20+ messages in thread
From: J. Bruce Fields @ 2010-02-26 22:44 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs
On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
>
> [I found this while looking for the current refcount problem
> that triggers a warning in svc_recv. This isn't that bug
> but is a different refcount bug - NB]
>
> The 'struct svc_deferred_req's on the xpt_deferred queue do not
> own a reference to the owning xprt. This is seen in svc_revisit
> which is where things are added to this queue. dr->xprt is set to
> NULL and the reference to the xprt is put.
>
> So when this list is cleaned up in svc_delete_xprt, we mustn't
> put the reference.
>
> Also, replace the 'for' with a 'while' which is arguably
> simpler and more likely to compile efficiently.
OK, thanks, queuing up for 2.6.34 and stable.
--b.
>
> Cc: Tom Tucker <tom@opengridcomputing.com>
> Signed-off-by: NeilBrown <neilb@suse.de>
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 7d1f9e9..4f30336 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -889,11 +889,8 @@ void svc_delete_xprt(struct svc_xprt *xprt)
> if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> serv->sv_tmpcnt--;
>
> - for (dr = svc_deferred_dequeue(xprt); dr;
> - dr = svc_deferred_dequeue(xprt)) {
> - svc_xprt_put(xprt);
> + while ((dr = svc_deferred_dequeue(xprt)) != NULL)
> kfree(dr);
> - }
>
> svc_xprt_put(xprt);
> spin_unlock_bh(&serv->sv_lock);
* Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put
[not found] ` <19336.19524.469529.431210-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-02-26 22:44 ` J. Bruce Fields
@ 2010-02-26 22:54 ` J. Bruce Fields
2010-02-27 0:40 ` Tom Tucker
1 sibling, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2010-02-26 22:54 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs
On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
>
> [I found this while looking for the current refcount problem
> that triggers a warning in svc_recv. This isn't that bug
> but is a different refcount bug - NB]
And thanks very much for looking into that, I'm worried.... Seems to
have appeared some time between v2.6.31 and v2.6.32.2. On a quick skim
commits in that range that struck me as worth a second look included
8f55f3c0a013, b0401d725334, and the 4.1 backchannel patches (3ddc8bf5f3
and preceding).
Oh, and I also have some very rough notes from when I looked at this
before, in case there's anything useful.
--b.
Re: 2.6.32.2 - WARNING: at lib/kref.c:43 kref_get+0x23/0x2b()
Seen on: 2.6.32.2, 2.6.32.6, 2.6.32.8; probably was OK on 2.6.29.6 and
2.6.31.
Is the warning actually warning about anything that's a problem, or can
that counter be zero by design? Yes, it's actually a problem.
Is probably svc_xprt_get(xprt) in svc_recv() (only obvious kref_get I
found on a quick glance through svc_recv).
Double-check:
svc_recv+0x305/0x7e6
Note next bug is on putting a socket (that we probably
shouldn't have!?):
- BUG_ON(inode->i_state == I_CLEAR).
- Implies clear_inode() was previously called on
it.
- stack includes kref_put() call in
svc_xprt_release, which is indeed put of same
xpt_ref field that svc_xprt_get() gets.
So, most probable explanation:
- We still had a dangling reference to an xprt after putting
one. So we ended up doing another get/put pair on it later
and trying to free the same socket twice.
So, plan: look for svc_xprt_puts (after checking for other stray uses of
xpt_ref) and verify that they're all legit. And gets while we're at it:
Ignore svc_rdma for now. Those reporters that answered weren't
using rdma.
Most puts outside of rdma are in svc_xprt.c:
- svc_xprt_release (unconditional): 0 to caller (put matched
with removal from rq_xprt)
- svc_check_conn_limits(): 0 to caller
- takes an xprt off a sv_tempsocks list, gets it (and
sets XPT_CLOSE) before dropping sv_lock, then enqueues
and puts. (Note: enqueue will get, and assign to
rq_xprt, if thread found.)
- svc_age_temp_xprts: 0 to caller
- same pattern as svc_check_conn_limits().
- svc_delete_xprt: 0 to caller (put matched with removal from
xpt_list)
- if test_and_set_bit of XPT_DEAD succeeds, will
svc_xprt_put(), after calling xpo_detach, then (under
sv_lock) removing xpt_list.
- ALSO unconditionally puts once for each deferred
request it finds associated with this request. Is
that right? Yup: svc_defer() gets on success, when
assigning dr->xprt.
- svc_close_xprt:
- sets XPT_CLOSE, then if test_and_set_bit of XPT_BUSY
succeeds, gets xprt, deletes, clears BUSY, puts.
- revisit:
- puts associated xprt unconditionally.
Also some puts are in fs/nfsd/nfsctl.c, fs/nfsd/nfs4state.c,
fs/lockd/svc.c:
nfsctl.c:
fs/nfsd/nfsctl.c:__write_ports_delxprt():
- svc_find_xprt() gets a reference; if found:
svc_close_xprt, svc_xprt_put. OK.
nfs4state.c:
free_client: svc_xprt_put(clp->cl_cb_xprt);
Looks basically correct: we take reference when we
assign that in nfsd4_create_session.
Hm. Note we copy pointer to clp->cl_cb_xprt without
taking reference? The client holds a reference,
though. Looking at cb_xprt use in client xprt code, I
can't see any references taken or dropped. This all
looks fine.
lockd/svc.c: create_lock_listener() looks innocuous.
* Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put
2010-02-26 22:54 ` J. Bruce Fields
@ 2010-02-27 0:40 ` Tom Tucker
2010-02-27 1:35 ` Neil Brown
0 siblings, 1 reply; 20+ messages in thread
From: Tom Tucker @ 2010-02-27 0:40 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Neil Brown, linux-nfs
J. Bruce Fields wrote:
> On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
>
>> [I found this while looking for the current refcount problem
>> that triggers a warning in svc_recv. This isn't that bug
>> but is a different refcount bug - NB]
>>
>
>
I seem to recall that we added that reference for a reason. There was
an issue with unmount while there were deferrals pending. That's why the
reference was added.
Tom
> And thanks very much for looking into that, I'm worried.... Seems to
> have appeared some time between v2.6.31 and v2.6.32.2. On a quick skim
> commits in that range that struck me as worth a second look included
> 8f55f3c0a013, b0401d725334, and the 4.1 backchannel patches (3ddc8bf5f3
> and preceding).
>
> Oh, and I also have some very rough notes from when I looked at this
> before, in case there's anything useful.
>
> --b.
>
> Re: 2.6.32.2 - WARNING: at lib/kref.c:43 kref_get+0x23/0x2b()
>
> Seen on: 2.6.32.2, 2.6.32.6, 2.6.32.8; probably was OK on 2.6.29.6 and
> 2.6.31.
>
> Is the warning actually warning about anything that's a problem, or can
> that counter be zero by design? Yes, it's actually a problem.
>
> Is probably svc_xprt_get(xprt) in svc_recv() (only obvious kref_get I
> found on a quick glance through svc_recv).
> Double-check:
> svc_recv+0x305/0x7e6
> Note next bug is on putting a socket (that we probably
> shouldn't have!?):
> - BUG_ON(inode->i_state == I_CLEAR).
> - Implies clear_inode() was previously called on
> it.
> - stack includes kref_put() call in
> svc_xprt_release, which is indeed put of same
> xpt_ref field that svc_xprt_get() gets.
>
> So, most probable explanation:
> - We still had a dangling reference to an xprt after putting
> one. So we ended up doing another get/put pair on it later
> and trying to free the same socket twice.
>
> So, plan: look for svc_xprt_puts (after checking for other stray uses of
> xpt_ref) and verify that they're all legit. And gets while we're at it:
>
> Ignore svc_rdma for now. Those reporters that answered weren't
> using rdma.
>
> Most puts outside of rdma are in svc_xprt.c:
>
> - svc_xprt_release (unconditional): 0 to caller (put matched
> with removal from rq_xprt)
> - svc_check_conn_limits(): 0 to caller
> - takes an xprt off a sv_tempsocks list, gets it (and
> sets XPT_CLOSE) before dropping sv_lock, then enqueues
> and puts. (Note: enqueue will get, and assign to
> rq_xprt, if thread found.)
> - svc_age_temp_xprts: 0 to caller
> - same pattern as svc_check_conn_limits().
> - svc_delete_xprt: 0 to caller (put matched with removal from
> xpt_list)
> - if test_and_set_bit of XPT_DEAD succeeds, will
> svc_xprt_put(), after calling xpo_detach, then (under
> sv_lock) removing xpt_list.
> - ALSO unconditionally puts once for each deferred
> request it finds associated with this request. Is
> that right? Yup: svc_defer() gets on success, when
> assigning dr->xprt.
> - svc_close_xprt:
> - sets XPT_CLOSE, then if test_and_set_bit of XPT_BUSY
> succeeds, gets xprt, deletes, clears BUSY, puts.
> - revisit:
> - puts associated xprt unconditionally.
>
> Also some puts are in fs/nfsd/nfsctl.c, fs/nfsd/nfs4state.c,
> fs/lockd/svc.c:
>
> nfsctl.c:
> fs/nfsd/nfsctl.c:__write_ports_delxprt():
> - svc_find_xprt() gets a reference; if found:
> svc_close_xprt, svc_xprt_put. OK.
> nfs4state.c:
> free_client: svc_xprt_put(clp->cl_cb_xprt);
> Looks basically correct: we take reference when we
> assign that in nfsd4_create_session.
>
> Hm. Note we copy pointer to clp->cl_cb_xprt without
> taking reference? The client holds a reference,
> though. Looking at cb_xprt use in client xprt code, I
> can't see any references taken or dropped. This all
> looks fine.
>
> lockd/svc.c: create_lock_listener() looks innocuous.
>
* Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put
2010-02-27 0:40 ` Tom Tucker
@ 2010-02-27 1:35 ` Neil Brown
[not found] ` <20100227123537.6289e326-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
0 siblings, 1 reply; 20+ messages in thread
From: Neil Brown @ 2010-02-27 1:35 UTC (permalink / raw)
To: Tom Tucker; +Cc: J. Bruce Fields, linux-nfs
On Fri, 26 Feb 2010 18:40:58 -0600
Tom Tucker <tom@opengridcomputing.com> wrote:
> J. Bruce Fields wrote:
> > On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
> >
> >> [I found this while looking for the current refcount problem
> >> that triggers a warning in svc_recv. This isn't that bug
> >> but is a different refcount bug - NB]
> >>
> >
> >
>
> I seem to recall that we added that reference for a reason. There was
> an issue with unmount while there were deferrals pending. That's why the
> reference was added.
>
> Tom
What reference?
What I (thought I) found was code that was dropping a reference which it
didn't hold. Are you saying that it is supposed to be holding a reference
here, but isn't, or that it really is holding a reference here and I didn't
see it?
And just for completeness, my understanding of the refcounting here is:
A counted reference is held on an svc_xprt when:
- a 'struct svc_rqst' refers to it through ->rq_xprt
- a 'cache_deferred_req' refers to it through ->xprt
This only happens while the req is waiting to be
revisited, and is in the hash table and on the lru.
Once the req gets revisited (svc_revisit) ->xprt
is set to NULL and the reference is dropped.
- XPT_DEAD is *not* set. So the refcount is initialised
to '1' to reflect this, and this ref is dropped
when we set XPT_DEAD.
- there are a few transient references in svc_xprt.c
which very clearly have matched 'get' and 'put'.
- svc_find_xprt returns a counted reference. This is
called once in lockd and once in nfsd, and both
calls drop the ref correctly.
Whenever we drop a counted ref that was stored in a pointer, we set that
pointer to NULL.
So if there was a race where two threads both get a reference from a pointer
and then drop that reference, you would expect that slightly different timing
would cause one of those threads to get a NULL from the pointer, dereference
it, and crash. There are no important tests-for-NULL on either of the
pointers in question, so that wouldn't be protecting us from a crash. But
we don't see that crash, so there cannot be a race there.
So: The refcount cannot possibly be zero in svc_recv :-)
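
A small sketch of the "drop the ref, clear the pointer" convention described above, written as a userspace model (toy_xprt, toy_rqst and the toy_* helpers are hypothetical names). One deliberate difference from the kernel code under discussion: the NULL test in toy_rqst_release() exists only so the model can report a second release instead of crashing, whereas the argument above relies on there being no such guard in the real paths.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these names exist in the kernel. */
struct toy_xprt {
	int refcount;			/* models xpt_ref */
};

struct toy_rqst {
	struct toy_xprt *rq_xprt;	/* counted reference while non-NULL */
};

/* Store a counted reference in the request. */
static void toy_rqst_attach(struct toy_rqst *rqstp, struct toy_xprt *xprt)
{
	xprt->refcount++;
	rqstp->rq_xprt = xprt;
}

/*
 * Drop the reference stored in rq_xprt and clear the pointer.
 * Returns 1 if a reference was dropped, 0 if the pointer was already NULL
 * (the guard is an addition for this model only).
 */
static int toy_rqst_release(struct toy_rqst *rqstp)
{
	struct toy_xprt *xprt = rqstp->rq_xprt;

	if (xprt == NULL)
		return 0;
	rqstp->rq_xprt = NULL;		/* pointer no longer carries a reference */
	assert(xprt->refcount > 0);
	xprt->refcount--;
	return 1;
}
```

In this model a second toy_rqst_release() on the same request finds NULL and leaves the count alone. Without the guard it would dereference NULL, which is the crash the reasoning above expects a racing double put to produce.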
I just noticed some slightly odd code later in svc_recv:
    if (XPT_LISTENER && !XPT_CLOSE) {
        ...
    } else if (!XPT_CLOSE) {
        ...
        ->xpo_recvfrom()
    }
    if (XPT_CLOSE) {
        ...
        svc_delete_xprt()
    }
So if XPT_CLOSE is set while xpo_recvfrom is being called, which I think
is possible, and if ->xpo_recvfrom returns non-zero, then we end up
processing a request on a dead socket, which doesn't sound like the right
thing to do. I don't think it can cause the present problem, but
it looks wrong. That last 'if' should just be an 'else'.
I guess that would effectively reverse b0401d7253, though - not that
that patch seems entirely right to me - if there is a problem I probably
would have fixed it differently, though I'm not sure how.
So maybe change "if (XPT_CLOSE)" to "if (len <= 0 && XPT_CLOSE)" ???
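
To make the two variants of that last test concrete, here is a toy model of the decision taken after ->xpo_recvfrom() returns. It assumes, as a simplification, that a request is processed exactly when len > 0; toy_result and toy_after_recvfrom() are hypothetical names, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_result {
	bool deleted;	/* svc_delete_xprt() would have been called */
	bool processed;	/* the caller goes on to process a request */
};

/*
 * close_set:   XPT_CLOSE is observed after recvfrom returned.
 * len:         value returned by recvfrom.
 * require_no_data: false models "if (XPT_CLOSE)",
 *                  true models "if (len <= 0 && XPT_CLOSE)".
 */
static struct toy_result toy_after_recvfrom(bool close_set, int len,
					    bool require_no_data)
{
	struct toy_result r = { false, false };

	if (close_set && (!require_no_data || len <= 0))
		r.deleted = true;
	if (len > 0)		/* simplification: data means "process it" */
		r.processed = true;
	return r;
}
```

With the plain test, close_set together with len > 0 yields both deleted and processed, which is the "request on a dead socket" case. With the len <= 0 condition the two outcomes become mutually exclusive in this model.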
NeilBrown
* Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put
[not found] ` <20100227123537.6289e326-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2010-02-27 2:38 ` Tom Tucker
2010-03-01 4:23 ` Neil Brown
2010-02-27 5:59 ` The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put) Neil Brown
1 sibling, 1 reply; 20+ messages in thread
From: Tom Tucker @ 2010-02-27 2:38 UTC (permalink / raw)
To: Neil Brown; +Cc: J. Bruce Fields, linux-nfs
Neil Brown wrote:
> On Fri, 26 Feb 2010 18:40:58 -0600
> Tom Tucker <tom@opengridcomputing.com> wrote:
>
>
>> J. Bruce Fields wrote:
>>
>>> On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
>>>
>>>
>>>> [I found this while looking for the current refcount problem
>>>> that triggers a warning in svc_recv. This isn't that bug
>>>> but is a different refcount bug - NB]
>>>>
>>>>
>>>
>>>
>> I seem to recall that we added that reference for a reason. There was
>> an issue with unmount while there were deferrals pending. That's why the
>> reference was added.
>>
>> Tom
>>
>
> What reference?
> What I (thought I) found was code that was dropping a reference which it
> didn't hold. Are you saying that it is supposed to be holding a reference
> here, but isn't, or that it really is holding a reference here and I didn't
> see it?
>
Here's the commit that I was thinking of...
22945e4a1c7454c97f5d8aee1ef526c83fef3223
I think this change adds the bug that you are now fixing. It fixed one
problem, but added another that you have now resolved.
What do you guys think?
Thanks,
Tom
> And just for completeness, my understanding of the refcounting here is:
>
> A counted reference is held on an svc_xprt when:
> - a 'struct svc_rqst' refers to it through ->rq_xprt
> - a 'cache_deferred_req' refers to it through ->xprt
> This only happens while the req is waiting to be
> revisited, and is in the hash table and on the lru.
> Once the req gets revisited (svc_revisit) ->xprt
> is set to NULL and the reference is dropped.
> - XPT_DEAD is *not* set. So the refcount is initialised
> to '1' to reflect this, and this ref is dropped
> when we set XPT_DEAD.
> - there are a few transient references in svc_xprt.c
> which very clearly have matched 'get' and 'put'.
> - svc_find_xprt returns a counted reference. This is
> called once in lockd and once in nfsd, and both
> calls drop the ref correctly.
>
> Whenever we drop a counted ref that was stored in a pointer, we set that
> pointer to NULL.
> So if there was a race where two threads both get a reference from a pointer
> and then drop that reference, you would expect that slightly different timing
> would cause one of those threads to get a NULL from the pointer, dereference
> it, and crash. There are no important tests-for-NULL on either of the
> pointers in question, so that wouldn't be protecting us from a crash. But
> we don't see that crash, so there cannot be a race there.
>
> So: The refcount cannot possibly be zero in svc_recv :-)
>
> I just noticed some slightly odd code later in svc_recv:
>
>     if (XPT_LISTENER && !XPT_CLOSE) {
>         ...
>     } else if (!XPT_CLOSE) {
>         ...
>         ->xpo_recvfrom()
>     }
>     if (XPT_CLOSE) {
>         ...
>         svc_delete_xprt()
>     }
>
> So if XPT_CLOSE is set while xpo_recvfrom is being called, which I think
> is possible, and if ->xpo_recvfrom returns non-zero, then we end up
> processing a request on a dead socket, which doesn't sound like the right
> thing to do. I don't think it can cause the present problem, but
> it looks wrong. That last 'if' should just be an 'else'.
> I guess that would effectively reverse b0401d7253, though - not that
> that patch seems entirely right to me - if there is a problem I probably
> would have fixed it differently, though I'm not sure how.
> So maybe change "if (XPT_CLOSE)" to "if (len <= 0 && XPT_CLOSE)" ???
>
> NeilBrown
>
* The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
[not found] ` <20100227123537.6289e326-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-02-27 2:38 ` Tom Tucker
@ 2010-02-27 5:59 ` Neil Brown
[not found] ` <20100227165913.53718449-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
1 sibling, 1 reply; 20+ messages in thread
From: Neil Brown @ 2010-02-27 5:59 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, J. Bruce Fields, linux-nfs, Wei Yongjun
On Sat, 27 Feb 2010 12:35:37 +1100
Neil Brown <neilb@suse.de> wrote:
> So if XPT_CLOSE is set while xpo_recvfrom is being called, which I think
> is possible, and if ->xpo_recvfrom returns non-zero, then we end up
> processing a request on a dead socket, which doesn't sound like the right
> thing to do. I don't think it can cause the present problem, but
> it looks wrong. That last 'if' should just be an 'else'.
> I guess that would effectively reverse b0401d7253, though - not that
> that patch seems entirely right to me - if there is a problem I probably
> would have fixed it differently, though I'm not sure how.
> So maybe change "if (XPT_CLOSE)" to "if (len <= 0 && XPT_CLOSE)" ???
OK, I think I've nailed it. I think b0401d7253 is the culprit.
Now let me see if I can convince you (and me).
Firstly, why is this patch wrong?
It claims:
sunrpc: "Move close processing to a single place"
(d7979ae4a050a45b78af51832475001b68263d2a) moved the
close processing before the recvfrom method. This may
cause the close processing never to execute. So this
patch moves it to the right place.
The referenced commit did *not* move the close processing before the
recvfrom method - it was already there. The close processing was previously
at the top of the individual recvfrom methods. It was split out into common
code which was placed before the now-slightly-reduced recvfrom methods.
This is functionally a null change.
However that doesn't explain why sometimes "the close processing [would]
never .. execute".
The reason for this is subtle. One of the changes in commit d7979ae4a is
err_delete:
- svc_delete_socket(svsk);
+ set_bit(SK_CLOSE, &svsk->sk_flags);
return -EAGAIN;
This is insufficient. The recvfrom methods must always call svc_xprt_received
on completion so that the socket gets re-queued if there is any more work to
do. This particular path did not make that call because it actually
destroyed the svsk, making requeue pointless. When the svc_delete_socket was
changed to just set a bit, we should have added a call to svc_xprt_received,
but we didn't. Sorry. As I said, it was subtle.
So how is the b0401d7253 patch causing a problem?
svc_tcp_state_change - which can be called at any time - sets XPT_CLOSE.
If this happens while svc_tcp_recvfrom is running and before one of the
calls to svc_xprt_received, then svc_xprt_received will requeue the svsk for
further processing (to handle the close).
As soon as svc_tcp_recvfrom completes, svc_recv will notice that XPT_CLOSE is
set and will close the socket, dropping the last refcount. Subsequently the
thread which the socket was queued to wakes up, calls svc_recv, and triggers
the warning.
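
The sequence above can be replayed in a single-threaded userspace model. This is only a sketch under stated assumptions: the transport starts with two references (the "not dead" one and the one held through rq_xprt by the thread in recvfrom), and the requeue puts the transport on the pool queue without taking a reference. toy_xprt, toy_get(), toy_put() and toy_replay() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_xprt {
	int refcount;	/* models xpt_ref */
	bool close;	/* models XPT_CLOSE */
	bool queued;	/* transport sits on the pool queue */
};

/* Models kref_get(): returns false where the kernel would WARN. */
static bool toy_get(struct toy_xprt *x)
{
	if (x->refcount == 0)
		return false;
	x->refcount++;
	return true;
}

static void toy_put(struct toy_xprt *x)
{
	assert(x->refcount > 0);
	x->refcount--;
}

/* Replay the steps in order; returns true if the final get is safe. */
static bool toy_replay(bool close_during_recvfrom)
{
	struct toy_xprt x = { .refcount = 2, .close = false, .queued = false };

	/* Thread A is inside recvfrom when the state change arrives. */
	if (close_during_recvfrom)
		x.close = true;

	/* svc_xprt_received: work is pending, so the transport is requeued. */
	if (x.close)
		x.queued = true;

	/* Back in svc_recv on thread A, with the close test after recvfrom. */
	if (x.close)
		toy_put(&x);	/* delete drops the "not dead" reference */
	toy_put(&x);		/* thread A releases its rq_xprt reference */

	/* Thread B wakes up for the queued transport and takes a reference. */
	if (x.queued)
		return toy_get(&x);
	return true;
}
```

toy_replay(true) ends with a get on a zero count, the situation the kref warning reports, while toy_replay(false) never reaches that get.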
So the fix I propose is:
- make the XPT_CLOSE case in svc_recv once more exclusive with the
->recvfrom case
- make sure all paths out of all recvfrom methods call svc_xprt_received.
Maybe it should be called after the ->xpo_recvfrom call instead.
So something like this? I've made quite a few changes here - it might be
worth splitting them up. One worth noting is that we now don't re-queue a udp
socket at the earliest opportunity, but possibly do a
csum_partial_copy_to_xdr before the requeue which could reduce performance
slightly with udp on a multiprocessor. I have no idea what the actual
performance effect would be, but I think it makes the code a lot more robust
(move the svc_xprt_received to just one place).
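
A sketch of the "one place" idea, again as a userspace model with hypothetical names (toy_recvfrom, toy_recv, toy_xprt_received): the receive method has several exit paths and none of them reports completion; the caller does so exactly once, whichever path was taken.

```c
#include <assert.h>
#include <errno.h>

/* Counts calls to the stand-in for svc_xprt_received. */
static int toy_received_calls;

static void toy_xprt_received(void)
{
	toy_received_calls++;
}

enum toy_path { TOY_OK, TOY_SHORT_READ, TOY_CLOSE };

/* A receive method with several exits, none of which reports completion. */
static int toy_recvfrom(enum toy_path path)
{
	switch (path) {
	case TOY_SHORT_READ:
		return -EAGAIN;	/* incomplete record, try again later */
	case TOY_CLOSE:
		return 0;	/* close was flagged, nothing to process */
	default:
		return 128;	/* a complete request of 128 bytes */
	}
}

/* The caller reports completion once, after the method returns. */
static int toy_recv(enum toy_path path)
{
	int len = toy_recvfrom(path);

	toy_xprt_received();
	return len;
}
```

The cost mentioned above shows up here too: nothing can be requeued until toy_recvfrom() has fully returned, so any work done inside it delays the next thread.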
NeilBrown
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 4f30336..2d99fb8 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -699,8 +699,12 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
spin_unlock_bh(&pool->sp_lock);
len = 0;
- if (test_bit(XPT_LISTENER, &xprt->xpt_flags) &&
- !test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
+
+ if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
+ dprintk("svc_recv: found XPT_CLOSE\n");
+ svc_delete_xprt(xprt);
+ return 0;
+ } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
struct svc_xprt *newxpt;
newxpt = xprt->xpt_ops->xpo_accept(xprt);
if (newxpt) {
@@ -726,23 +730,19 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
svc_xprt_received(newxpt);
}
svc_xprt_received(xprt);
- } else if (!test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
- dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
- rqstp, pool->sp_id, xprt,
- atomic_read(&xprt->xpt_ref.refcount));
- rqstp->rq_deferred = svc_deferred_dequeue(xprt);
- if (rqstp->rq_deferred) {
- svc_xprt_received(xprt);
- len = svc_deferred_recv(rqstp);
- } else
- len = xprt->xpt_ops->xpo_recvfrom(rqstp);
- dprintk("svc: got len=%d\n", len);
+ return 0;
}
- if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
- dprintk("svc_recv: found XPT_CLOSE\n");
- svc_delete_xprt(xprt);
- }
+ dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
+ rqstp, pool->sp_id, xprt,
+ atomic_read(&xprt->xpt_ref.refcount));
+ rqstp->rq_deferred = svc_deferred_dequeue(xprt);
+ if (rqstp->rq_deferred)
+ len = svc_deferred_recv(rqstp);
+ else
+ len = xprt->xpt_ops->xpo_recvfrom(rqstp);
+ dprintk("svc: got len=%d\n", len);
+ svc_xprt_received(xprt);
/* No data, incomplete (TCP) read, or accept() */
if (len == 0 || len == -EAGAIN) {
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 870929e..22d9904 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -547,7 +547,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
dprintk("svc: recvfrom returned error %d\n", -err);
set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
}
- svc_xprt_received(&svsk->sk_xprt);
return -EAGAIN;
}
len = svc_addr_len(svc_addr(rqstp));
@@ -562,11 +561,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
svsk->sk_sk->sk_stamp = skb->tstamp;
set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); /* there may be more data... */
- /*
- * Maybe more packets - kick another thread ASAP.
- */
- svc_xprt_received(&svsk->sk_xprt);
-
len = skb->len - sizeof(struct udphdr);
rqstp->rq_arg.len = len;
@@ -917,7 +911,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
if (len < want) {
dprintk("svc: short recvfrom while reading record "
"length (%d of %d)\n", len, want);
- svc_xprt_received(&svsk->sk_xprt);
goto err_again; /* record header not complete */
}
@@ -953,7 +946,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
if (len < svsk->sk_reclen) {
dprintk("svc: incomplete TCP record (%d of %d)\n",
len, svsk->sk_reclen);
- svc_xprt_received(&svsk->sk_xprt);
goto err_again; /* record not complete */
}
len = svsk->sk_reclen;
@@ -961,10 +953,9 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
return len;
error:
- if (len == -EAGAIN) {
+ if (len == -EAGAIN)
dprintk("RPC: TCP recv_record got EAGAIN\n");
- svc_xprt_received(&svsk->sk_xprt);
- }
+
return len;
err_delete:
set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
@@ -1109,18 +1100,14 @@ out:
svsk->sk_tcplen = 0;
svc_xprt_copy_addrs(rqstp, &svsk->sk_xprt);
- svc_xprt_received(&svsk->sk_xprt);
if (serv->sv_stats)
serv->sv_stats->nettcpcnt++;
return len;
err_again:
- if (len == -EAGAIN) {
+ if (len == -EAGAIN)
dprintk("RPC: TCP recvfrom got EAGAIN\n");
- svc_xprt_received(&svsk->sk_xprt);
- return len;
- }
error:
if (len != -EAGAIN) {
printk(KERN_NOTICE "%s: recvfrom returned errno %d\n",
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index f92e37e..0194de8 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -566,7 +566,6 @@ static int rdma_read_complete(struct svc_rqst *rqstp,
ret, rqstp->rq_arg.len, rqstp->rq_arg.head[0].iov_base,
rqstp->rq_arg.head[0].iov_len);
- svc_xprt_received(rqstp->rq_xprt);
return ret;
}
@@ -665,7 +664,6 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
rqstp->rq_arg.head[0].iov_len);
rqstp->rq_prot = IPPROTO_MAX;
svc_xprt_copy_addrs(rqstp, xprt);
- svc_xprt_received(xprt);
return ret;
close_out:
@@ -678,6 +676,5 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
*/
set_bit(XPT_CLOSE, &xprt->xpt_flags);
defer:
- svc_xprt_received(xprt);
return 0;
}
* Re: The recent kref_put warning
[not found] ` <20100227165913.53718449-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2010-02-28 0:46 ` Tom Tucker
2010-02-28 21:05 ` The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put) J. Bruce Fields
1 sibling, 0 replies; 20+ messages in thread
From: Tom Tucker @ 2010-02-28 0:46 UTC (permalink / raw)
To: Neil Brown; +Cc: J. Bruce Fields, linux-nfs, Wei Yongjun
Makes sense to me.
Neil Brown wrote:
> On Sat, 27 Feb 2010 12:35:37 +1100
> Neil Brown <neilb@suse.de> wrote:
>
>
>> So if XPT_CLOSE is set while xpo_recvfrom is being called, which I think
>> is possible, and if ->xpo_recvfrom returns non-zero, then we end up
>> processing a request on a dead socket, which doesn't sound like the right
>> thing to do. I don't think it can cause the present problem, but
>> it looks wrong. That last 'if' should just be an 'else'.
>> I guess that would effectively reverse b0401d7253, though - not that
>> that patch seems entirely right to me - if there is a problem I probably
>> would have fixed it differently, though I'm not sure how.
>> So maybe change "if (XPT_CLOSE)" to "if (len <= 0 && XPT_CLOSE)" ???
>>
>
> OK, I think I've nailed it. I think b0401d7253 is the culprit.
> Now let me see if I can convince you (and me).
>
> Firstly, why is this patch wrong?
> It claims:
>
> sunrpc: "Move close processing to a single place"
> (d7979ae4a050a45b78af51832475001b68263d2a) moved the
> close processing before the recvfrom method. This may
> cause the close processing never to execute. So this
> patch moves it to the right place.
>
> The referenced commit did *not* move the close processing before the
> recvfrom method - it was already there. The close processing was previously
> at the top of the individual recvfrom methods. It was split out into common
> code which was placed before the now-slightly-reduced recvfrom methods.
> This is functionally a null change.
>
> However that doesn't explain why sometimes "the close processing [would]
> never .. execute".
> The reason for this is subtle. One of the changes in commit d7979ae4a is
>
> err_delete:
> - svc_delete_socket(svsk);
> + set_bit(SK_CLOSE, &svsk->sk_flags);
> return -EAGAIN;
>
> This is insufficient. The recvfrom methods must always call svc_xprt_received
> on completion so that the socket gets re-queued if there is any more work to
> do. This particular path did not make that call because it actually
> destroyed the svsk, making requeue pointless. When the svc_delete_socket was
> changed to just set a bit, we should have added a call to svc_xprt_received,
> but we didn't. Sorry. As I said, it was subtle.
>
> So how is the b0401d7253 patch causing a problem?
>
> svc_tcp_state_change - which can be called at any time - sets XPT_CLOSE.
> If this happens while svc_tcp_recvfrom is running and before one of the
> calls to svc_xprt_received, then svc_xprt_received will requeue the svsk for
> further processing (to handle the close).
> As soon as svc_tcp_recvfrom completes, svc_recv will notice that XPT_CLOSE is
> set and will close the socket, dropping the last refcount. Subsequently the
> thread which the socket was queued to wakes up, calls svc_recv, and triggers
> the warning.
>
> So the fix I propose is:
> - make the XPT_CLOSE case in svc_recv once more exclusive with the
> ->recvfrom case
> - make sure all paths out of all recvfrom methods call svc_xprt_received.
> Maybe it should be called after the ->xpo_recvfrom call instead.
>
> So something like this? I've made quite a few changes here - it might be
> worth splitting them up. One worth noting is that we now don't re-queue a udp
> socket at the earliest opportunity, but possibly do a
> csum_partial_copy_to_xdr before the requeue which could reduce performance
> slightly with udp on a multiprocessor. I have no idea what the actual
> performance effect would be, but I think it makes the code a lot more robust
> (move the svc_xprt_received to just one place).
>
> NeilBrown
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 4f30336..2d99fb8 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -699,8 +699,12 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
> spin_unlock_bh(&pool->sp_lock);
>
> len = 0;
> - if (test_bit(XPT_LISTENER, &xprt->xpt_flags) &&
> - !test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
> +
> + if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
> + dprintk("svc_recv: found XPT_CLOSE\n");
> + svc_delete_xprt(xprt);
> + return 0;
> + } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
> struct svc_xprt *newxpt;
> newxpt = xprt->xpt_ops->xpo_accept(xprt);
> if (newxpt) {
> @@ -726,23 +730,19 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
> svc_xprt_received(newxpt);
> }
> svc_xprt_received(xprt);
> - } else if (!test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
> - dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
> - rqstp, pool->sp_id, xprt,
> - atomic_read(&xprt->xpt_ref.refcount));
> - rqstp->rq_deferred = svc_deferred_dequeue(xprt);
> - if (rqstp->rq_deferred) {
> - svc_xprt_received(xprt);
> - len = svc_deferred_recv(rqstp);
> - } else
> - len = xprt->xpt_ops->xpo_recvfrom(rqstp);
> - dprintk("svc: got len=%d\n", len);
> + return 0;
> }
>
> - if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
> - dprintk("svc_recv: found XPT_CLOSE\n");
> - svc_delete_xprt(xprt);
> - }
> + dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
> + rqstp, pool->sp_id, xprt,
> + atomic_read(&xprt->xpt_ref.refcount));
> + rqstp->rq_deferred = svc_deferred_dequeue(xprt);
> + if (rqstp->rq_deferred)
> + len = svc_deferred_recv(rqstp);
> + else
> + len = xprt->xpt_ops->xpo_recvfrom(rqstp);
> + dprintk("svc: got len=%d\n", len);
> + svc_xprt_received(xprt);
>
> /* No data, incomplete (TCP) read, or accept() */
> if (len == 0 || len == -EAGAIN) {
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 870929e..22d9904 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -547,7 +547,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
> dprintk("svc: recvfrom returned error %d\n", -err);
> set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
> }
> - svc_xprt_received(&svsk->sk_xprt);
> return -EAGAIN;
> }
> len = svc_addr_len(svc_addr(rqstp));
> @@ -562,11 +561,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
> svsk->sk_sk->sk_stamp = skb->tstamp;
> set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); /* there may be more data... */
>
> - /*
> - * Maybe more packets - kick another thread ASAP.
> - */
> - svc_xprt_received(&svsk->sk_xprt);
> -
> len = skb->len - sizeof(struct udphdr);
> rqstp->rq_arg.len = len;
>
> @@ -917,7 +911,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
> if (len < want) {
> dprintk("svc: short recvfrom while reading record "
> "length (%d of %d)\n", len, want);
> - svc_xprt_received(&svsk->sk_xprt);
> goto err_again; /* record header not complete */
> }
>
> @@ -953,7 +946,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
> if (len < svsk->sk_reclen) {
> dprintk("svc: incomplete TCP record (%d of %d)\n",
> len, svsk->sk_reclen);
> - svc_xprt_received(&svsk->sk_xprt);
> goto err_again; /* record not complete */
> }
> len = svsk->sk_reclen;
> @@ -961,10 +953,9 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
>
> return len;
> error:
> - if (len == -EAGAIN) {
> + if (len == -EAGAIN)
> dprintk("RPC: TCP recv_record got EAGAIN\n");
> - svc_xprt_received(&svsk->sk_xprt);
> - }
> +
> return len;
> err_delete:
> set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
> @@ -1109,18 +1100,14 @@ out:
> svsk->sk_tcplen = 0;
>
> svc_xprt_copy_addrs(rqstp, &svsk->sk_xprt);
> - svc_xprt_received(&svsk->sk_xprt);
> if (serv->sv_stats)
> serv->sv_stats->nettcpcnt++;
>
> return len;
>
> err_again:
> - if (len == -EAGAIN) {
> + if (len == -EAGAIN)
> dprintk("RPC: TCP recvfrom got EAGAIN\n");
> - svc_xprt_received(&svsk->sk_xprt);
> - return len;
> - }
> error:
> if (len != -EAGAIN) {
> printk(KERN_NOTICE "%s: recvfrom returned errno %d\n",
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index f92e37e..0194de8 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -566,7 +566,6 @@ static int rdma_read_complete(struct svc_rqst *rqstp,
> ret, rqstp->rq_arg.len, rqstp->rq_arg.head[0].iov_base,
> rqstp->rq_arg.head[0].iov_len);
>
> - svc_xprt_received(rqstp->rq_xprt);
> return ret;
> }
>
> @@ -665,7 +664,6 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
> rqstp->rq_arg.head[0].iov_len);
> rqstp->rq_prot = IPPROTO_MAX;
> svc_xprt_copy_addrs(rqstp, xprt);
> - svc_xprt_received(xprt);
> return ret;
>
> close_out:
> @@ -678,6 +676,5 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
> */
> set_bit(XPT_CLOSE, &xprt->xpt_flags);
> defer:
> - svc_xprt_received(xprt);
> return 0;
> }
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
[not found] ` <20100227165913.53718449-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-02-28 0:46 ` The recent kref_put warning Tom Tucker
@ 2010-02-28 21:05 ` J. Bruce Fields
2010-02-28 22:07 ` J. Bruce Fields
1 sibling, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2010-02-28 21:05 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs, Wei Yongjun
On Sat, Feb 27, 2010 at 04:59:13PM +1100, Neil Brown wrote:
> On Sat, 27 Feb 2010 12:35:37 +1100
> Neil Brown <neilb@suse.de> wrote:
>
> > So if XPT_CLOSE is set while xpo_recvfrom is being called, which I think
> > is possible, and if ->xpo_recvfrom returns non-zero, then we end up
> > processing a request on a dead socket, which doesn't sound like the right
> > thing to do. I don't think it can cause the present problem, but
> > it looks wrong. That last 'if' should just be an 'else'.
> > I guess that would effectively reverse b0401d7253, though - not that
> > that patch seems entirely right to me - if there is a problem I probably
> > would have fixed it differently, though I'm not sure how.
> > So maybe change "if (XPT_CLOSE)" to "if (len <= 0 && XPT_CLOSE)" ???
>
> OK, I think I've nailed it. I think b0401d7253 is the culprit.
> Now let me see if I can convince you (and me).
Thanks, Neil, for the explanation.
> Firstly, why is this patch wrong.
> It claims:
>
> sunrpc: "Move close processing to a single place"
> (d7979ae4a050a45b78af51832475001b68263d2a) moved the
> close processing before the recvfrom method. This may
> cause the close processing never to execute. So this
> patch moves it to the right place.
>
> The referenced commit did *not* move the close processing before the
> recvfrom method - it was already there. The close processing was previously
> at the top of the individual recvfrom methods. It was split out and common
> code was placed before the now-slightly-reduced recvfrom methods.
> This is functionally a null change.
>
> However that doesn't explain why sometimes "the close processing [would]
> never .. execute".
> The reason for this is subtle. One of the changes in commit d7979ae4a is
>
> err_delete:
> - svc_delete_socket(svsk);
> + set_bit(SK_CLOSE, &svsk->sk_flags);
> return -EAGAIN;
>
> This is insufficient. The recvfrom methods must always call svc_xprt_received
> on completion so that the socket gets re-queued if there is any more work to
> do. This particular path did not make that call because it actually
> destroyed the svsk, making requeue pointless. When the svc_delete_socket was
> changed to just set a bit, we should have added a call to svc_xprt_received,
> but we didn't. Sorry. As I said, it was subtle.
>
> So how is the b0401d7253 patch causing a problem?
>
> svc_tcp_state_change - which can be called at any time - sets XPT_CLOSE.
> If this happens while svc_tcp_recvfrom is running and before one of the
> calls to svc_xprt_received, then svc_xprt_received will requeue the svsk for
> further processing (to handle the close).
> As soon as svc_tcp_recvfrom completes, svc_recv will notice that XPT_CLOSE is
> set and will close the socket, dropping the last refcount.
OK, so the rule (or one of the rules) that was violated here was that we
end up calling svc_delete_xprt() without XPT_BUSY?
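To make the ordering concrete, here is a toy model of the two orderings. It is only a sketch under loud assumptions: struct toy_xprt and every toy_* function are hypothetical stand-ins, not the kernel's struct svc_xprt or its API, and the "concurrent" state change is simulated by setting the close flag in the middle of the function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a transport; NOT the kernel's struct svc_xprt. */
struct toy_xprt {
	bool busy;                 /* models XPT_BUSY */
	bool close;                /* models XPT_CLOSE */
	bool queued;               /* requeued for some other thread */
	bool deleted;              /* toy_delete() has run */
	bool deleted_while_queued; /* the situation behind the warning */
};

/* Models svc_xprt_received(): drop busy, requeue if work is pending. */
static void toy_received(struct toy_xprt *x)
{
	x->busy = false;
	if (x->close)
		x->queued = true;
}

/* Models another thread picking the transport off the queue. */
static void toy_dequeue(struct toy_xprt *x)
{
	x->queued = false;
}

/* Models svc_delete_xprt(): record whether a queued user still exists. */
static void toy_delete(struct toy_xprt *x)
{
	assert(!x->deleted); /* deleting twice would itself be a bug */
	x->deleted = true;
	if (x->queued)
		x->deleted_while_queued = true;
}

/* Ordering after b0401d7: recvfrom first, close check afterwards. */
static bool toy_recv_close_after(struct toy_xprt *x)
{
	x->busy = true;
	x->close = true;   /* state change arrives during recvfrom */
	toy_received(x);   /* recvfrom path requeues the transport */
	if (x->close)
		toy_delete(x); /* runs without busy, transport is queued */
	return x->deleted_while_queued;
}

/* Ordering with the reverts: close check is exclusive with recvfrom. */
static bool toy_recv_close_first(struct toy_xprt *x)
{
	x->busy = true;
	if (x->close) {
		toy_delete(x); /* still busy, nobody else has it queued */
		return x->deleted_while_queued;
	}
	x->close = true;   /* state change arrives during recvfrom */
	toy_received(x);   /* requeue; the close is handled next pass */
	return x->deleted_while_queued;
}
```

With a zeroed toy_xprt, toy_recv_close_after() reports a delete on a queued transport, while toy_recv_close_first() only requeues it and performs the delete on the following pass (after toy_dequeue()), while busy is still held.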
> Subsequently the
> thread which the socket was queued to wakes up, calls svc_recv, and triggers
> the warning.
>
> So the fix I propose is:
> - make the XPT_CLOSE case in svc_recv once more exclusive with the
> ->recvfrom case
> - make sure all paths out of all recvfrom methods call svc_xprt_received.
> Maybe it should be called after the ->xpo_recvfrom call instead.
>
> So something like this?
Makes sense to me on a first look. It would be helpful if people that
can reproduce this could test....
> I've made quite a few changes here - it might be worth splitting them
> up.
Probably so.
> One worth noting is that we now don't re-queue a udp
> socket at the earliest opportunity, but possibly do a
> csum_partial_copy_to_xdr before the requeue which could reduce performance
> slightly with udp on a multiprocessor.
Just because we're slower to get another CPU working on the next
incoming packet?
> I have no idea what the actual
> performance effect would be, but I think it makes the code a lot more robust
> (move the svc_xprt_received to just one place).
OK.
--b.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
2010-02-28 21:05 ` The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put) J. Bruce Fields
@ 2010-02-28 22:07 ` J. Bruce Fields
2010-02-28 23:57 ` Neil Brown
0 siblings, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2010-02-28 22:07 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs, Wei Yongjun
On Sun, Feb 28, 2010 at 04:05:53PM -0500, J. Bruce Fields wrote:
> On Sat, Feb 27, 2010 at 04:59:13PM +1100, Neil Brown wrote:
> > I've made quite a few changes here - it might be worth splitting them
> > up.
>
> Probably so.
So, if I first revert b292cf9 and then b0401d7, I get the following.
I don't understand the "return 0" in the XPT_CLOSE case. Is it really
OK for the caller to try to process this request?
--b.
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 8f0f1fb..48f91fb 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -706,9 +706,11 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
spin_unlock_bh(&pool->sp_lock);
len = 0;
+
if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
dprintk("svc_recv: found XPT_CLOSE\n");
svc_delete_xprt(xprt);
+ return 0;
} else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
struct svc_xprt *newxpt;
newxpt = xprt->xpt_ops->xpo_accept(xprt);
@@ -735,19 +737,20 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
svc_xprt_received(newxpt);
}
svc_xprt_received(xprt);
- } else {
- dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
- rqstp, pool->sp_id, xprt,
- atomic_read(&xprt->xpt_ref.refcount));
- rqstp->rq_deferred = svc_deferred_dequeue(xprt);
- if (rqstp->rq_deferred) {
- svc_xprt_received(xprt);
- len = svc_deferred_recv(rqstp);
- } else
- len = xprt->xpt_ops->xpo_recvfrom(rqstp);
- dprintk("svc: got len=%d\n", len);
+ return 0;
}
+ dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
+ rqstp, pool->sp_id, xprt,
+ atomic_read(&xprt->xpt_ref.refcount));
+ rqstp->rq_deferred = svc_deferred_dequeue(xprt);
+ if (rqstp->rq_deferred)
+ len = svc_deferred_recv(rqstp);
+ else
+ len = xprt->xpt_ops->xpo_recvfrom(rqstp);
+ dprintk("svc: got len=%d\n", len);
+ svc_xprt_received(xprt);
+
/* No data, incomplete (TCP) read, or accept() */
if (len == 0 || len == -EAGAIN) {
rqstp->rq_res.len = 0;
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 9e09391..cc68137 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -547,7 +547,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
dprintk("svc: recvfrom returned error %d\n", -err);
set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
}
- svc_xprt_received(&svsk->sk_xprt);
return -EAGAIN;
}
len = svc_addr_len(svc_addr(rqstp));
@@ -562,11 +561,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
svsk->sk_sk->sk_stamp = skb->tstamp;
set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); /* there may be more data... */
- /*
- * Maybe more packets - kick another thread ASAP.
- */
- svc_xprt_received(&svsk->sk_xprt);
-
len = skb->len - sizeof(struct udphdr);
rqstp->rq_arg.len = len;
@@ -917,7 +911,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
if (len < want) {
dprintk("svc: short recvfrom while reading record "
"length (%d of %d)\n", len, want);
- svc_xprt_received(&svsk->sk_xprt);
goto err_again; /* record header not complete */
}
@@ -953,7 +946,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
if (len < svsk->sk_reclen) {
dprintk("svc: incomplete TCP record (%d of %d)\n",
len, svsk->sk_reclen);
- svc_xprt_received(&svsk->sk_xprt);
goto err_again; /* record not complete */
}
len = svsk->sk_reclen;
@@ -961,10 +953,9 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
return len;
error:
- if (len == -EAGAIN) {
+ if (len == -EAGAIN)
dprintk("RPC: TCP recv_record got EAGAIN\n");
- svc_xprt_received(&svsk->sk_xprt);
- }
+
return len;
err_delete:
set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
@@ -1109,18 +1100,14 @@ out:
svsk->sk_tcplen = 0;
svc_xprt_copy_addrs(rqstp, &svsk->sk_xprt);
- svc_xprt_received(&svsk->sk_xprt);
if (serv->sv_stats)
serv->sv_stats->nettcpcnt++;
return len;
err_again:
- if (len == -EAGAIN) {
+ if (len == -EAGAIN)
dprintk("RPC: TCP recvfrom got EAGAIN\n");
- svc_xprt_received(&svsk->sk_xprt);
- return len;
- }
error:
if (len != -EAGAIN) {
printk(KERN_NOTICE "%s: recvfrom returned errno %d\n",
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index f92e37e..0194de8 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -566,7 +566,6 @@ static int rdma_read_complete(struct svc_rqst *rqstp,
ret, rqstp->rq_arg.len, rqstp->rq_arg.head[0].iov_base,
rqstp->rq_arg.head[0].iov_len);
- svc_xprt_received(rqstp->rq_xprt);
return ret;
}
@@ -665,7 +664,6 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
rqstp->rq_arg.head[0].iov_len);
rqstp->rq_prot = IPPROTO_MAX;
svc_xprt_copy_addrs(rqstp, xprt);
- svc_xprt_received(xprt);
return ret;
close_out:
@@ -678,6 +676,5 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
*/
set_bit(XPT_CLOSE, &xprt->xpt_flags);
defer:
- svc_xprt_received(xprt);
return 0;
}
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
2010-02-28 22:07 ` J. Bruce Fields
@ 2010-02-28 23:57 ` Neil Brown
[not found] ` <20100301105734.7fe935b0-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
0 siblings, 1 reply; 20+ messages in thread
From: Neil Brown @ 2010-02-28 23:57 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Tom Tucker, linux-nfs, Wei Yongjun
On Sun, 28 Feb 2010 17:07:23 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Sun, Feb 28, 2010 at 04:05:53PM -0500, J. Bruce Fields wrote:
> > On Sat, Feb 27, 2010 at 04:59:13PM +1100, Neil Brown wrote:
> > > I've made quite a few changes here - it might be worth splitting them
> > > up.
> >
> > Probably so.
>
> So, if I first revert b292cf9 and then b0401d7, I get the following.
>
> I don't understand the "return 0" in the XPT_CLOSE case. Is it really
> OK for the caller to try to process this request?
No, you are correct. "return 0" is wrong, it should be "return -EAGAIN",
both in the XPT_CLOSE case and the XPT_LISTENER case.
I observed that in both those cases, 'len' remained at 0 and we didn't do
much else but 'return len', so I optimised.
I forgot to factor in:
if (len == 0 || len == -EAGAIN) {
rqstp->rq_res.len = 0;
svc_xprt_release(rqstp);
return -EAGAIN;
}
So the svc_xprt_release needs to be moved in there as well; I'm not sure
about the rq_res.len = 0.
Maybe that was a bad case of premature-optimisation :-)
We should probably leave that last else clause as it is and just have a
single return from the function.
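A small sketch of why the shortcut matters, assuming a hypothetical struct toy_rqst and toy_* helpers (none of these are kernel names; only the len == 0 || len == -EAGAIN test is taken from the code quoted above):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the request; NOT struct svc_rqst. */
struct toy_rqst {
	int  res_len;   /* models rqstp->rq_res.len */
	bool released;  /* set once the transport has been released */
};

/* Models svc_xprt_release() for the purposes of this sketch. */
static void toy_release(struct toy_rqst *r)
{
	r->released = true;
}

/* The common tail: "nothing to process" always releases and says EAGAIN. */
static int toy_recv_tail(struct toy_rqst *r, int len)
{
	if (len == 0 || len == -EAGAIN) {
		r->res_len = 0;
		toy_release(r);
		return -EAGAIN;
	}
	return len;
}

/* The shortcut: returning early skips the release and reports 0. */
static int toy_recv_shortcut(struct toy_rqst *r)
{
	(void)r; /* nothing is released on this path */
	return 0;
}
```

The shortcut hands the caller a 0 with the request still unreleased, whereas falling through to the common tail gives -EAGAIN with the release done.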
Thanks
NeilBrown
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
[not found] ` <20100301105734.7fe935b0-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2010-03-01 3:46 ` J. Bruce Fields
2010-03-01 3:48 ` J. Bruce Fields
2010-03-01 5:51 ` Neil Brown
0 siblings, 2 replies; 20+ messages in thread
From: J. Bruce Fields @ 2010-03-01 3:46 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs, Wei Yongjun
On Mon, Mar 01, 2010 at 10:57:34AM +1100, Neil Brown wrote:
> No, you are correct. "return 0" is wrong, it should be "return -EAGAIN",
> both in the XPT_CLOSE case and the XPT_LISTENER case.
>
> I observed that in both those cases, 'len' remained at 0 and we didn't do
> much else but 'return len', so I optimised.
> I forgot to factor in:
>
> if (len == 0 || len == -EAGAIN) {
> rqstp->rq_res.len = 0;
> svc_xprt_release(rqstp);
> return -EAGAIN;
> }
>
> So the svc_xprt_release needs to be moved in there as well, I'm not sure
> about the rq_res.len = 0.
> Maybe that was a bad case of premature-optimisation :-)
>
> We should probably leave that last else clause as it is and just have a
> single return from the function.
OK, so the below is what I'm thinking of sending, after some testing;
really just a split-up version of your patches (uh, so credits may be
wrong) with the final cleanup removed:
1. remove the extra put from svc_delete_xprt().
2,3. Revert 2 problematic patches which caused the oops people
are seeing.
4. Fix the original bug from the rdma series.
And the first 3 will go to stable as well. The 4th might eventually
too, it just seems less urgent.
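For the first item, a toy illustration of the drain loop may help. It assumes a hypothetical singly linked queue and an integer reference count; struct toy_xprt, struct toy_deferred and the toy_* functions are made up for this sketch and are not the kernel structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical deferred request; it holds no reference on the xprt. */
struct toy_deferred {
	struct toy_deferred *next;
};

/* Hypothetical transport with a plain integer reference count. */
struct toy_xprt {
	int refs;
	struct toy_deferred *deferred;
};

/* Queue a deferred request without touching the reference count. */
static int toy_defer(struct toy_xprt *x)
{
	struct toy_deferred *dr = malloc(sizeof(*dr));

	if (!dr)
		return -1;
	dr->next = x->deferred;
	x->deferred = dr;
	return 0;
}

/* Pop one deferred request, or NULL when the queue is empty. */
static struct toy_deferred *toy_deferred_dequeue(struct toy_xprt *x)
{
	struct toy_deferred *dr = x->deferred;

	if (dr)
		x->deferred = dr->next;
	return dr;
}

/* Drain in the 'while' form: free each entry, drop no references. */
static int toy_drain(struct toy_xprt *x)
{
	struct toy_deferred *dr;
	int freed = 0;

	assert(x->refs > 0); /* caller still holds its own reference */
	while ((dr = toy_deferred_dequeue(x)) != NULL) {
		free(dr);
		freed++;
	}
	return freed;
}
```

Because queued entries never took a reference, draining two of them leaves refs exactly where it started; a put per entry would drop the count below what the remaining holders expect.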
I also agree with the cleanup that moves the svc_xprt_received to one
place, I'm just hoping you won't mind regenerating it against this.
--b.
From ab1b18f70a007ea6caeb007d269abb75b131a410 Mon Sep 17 00:00:00 2001
From: Neil Brown <neilb@suse.de>
Date: Sat, 27 Feb 2010 09:33:40 +1100
Subject: [PATCH 1/4] sunrpc: remove unnecessary svc_xprt_put
The 'struct svc_deferred_req's on the xpt_deferred queue do not
own a reference to the owning xprt. This is seen in svc_revisit
which is where things are added to this queue. dr->xprt is set to
NULL and the reference to the xprt is put.
So when this list is cleaned up in svc_delete_xprt, we mustn't
put the reference.
Also, replace the 'for' with a 'while' which is arguably
simpler and more likely to compile efficiently.
Cc: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
---
net/sunrpc/svc_xprt.c | 5 +----
1 files changed, 1 insertions(+), 4 deletions(-)
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index d7ec5ca..0983830 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -896,11 +896,8 @@ void svc_delete_xprt(struct svc_xprt *xprt)
if (test_bit(XPT_TEMP, &xprt->xpt_flags))
serv->sv_tmpcnt--;
- for (dr = svc_deferred_dequeue(xprt); dr;
- dr = svc_deferred_dequeue(xprt)) {
- svc_xprt_put(xprt);
+ while ((dr = svc_deferred_dequeue(xprt)) != NULL)
kfree(dr);
- }
svc_xprt_put(xprt);
spin_unlock_bh(&serv->sv_lock);
--
1.6.3.3
From 56dd703462dad7311f3c5a736343f38d7b34b965 Mon Sep 17 00:00:00 2001
From: J. Bruce Fields <bfields@citi.umich.edu>
Date: Sun, 28 Feb 2010 16:32:51 -0500
Subject: [PATCH 2/4] Revert "sunrpc: fix peername failed on closed listener"
This reverts commit b292cf9ce70d221c3f04ff62db5ab13d9a249ca8. The
commit that it attempted to patch up, b0401d "sunrpc: move the close
processing after do recvfrom method", was fundamentally wrong, and will
also be reverted.
Cc: stable@kernel.org
Cc: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
---
net/sunrpc/svc_xprt.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 0983830..818c4c3 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -706,8 +706,7 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
spin_unlock_bh(&pool->sp_lock);
len = 0;
- if (test_bit(XPT_LISTENER, &xprt->xpt_flags) &&
- !test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
+ if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
struct svc_xprt *newxpt;
newxpt = xprt->xpt_ops->xpo_accept(xprt);
if (newxpt) {
--
1.6.3.3
From 4d87b1d6c9832b19068f662101d27c82f3bb659d Mon Sep 17 00:00:00 2001
From: J. Bruce Fields <bfields@citi.umich.edu>
Date: Sun, 28 Feb 2010 16:33:31 -0500
Subject: [PATCH 3/4] Revert "sunrpc: move the close processing after do recvfrom method"
This reverts commit b0401d725334a94d57335790b8ac2404144748ee, which
moved svc_delete_xprt() outside of XPT_BUSY, and allowed it to be called
after svc_xprt_received(), removing the xprt's last reference and
destroying the xprt after it had already been queued for future
processing.
Cc: Wei Yongjun <yjwei@cn.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
---
net/sunrpc/svc_xprt.c | 12 +++++-------
1 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 818c4c3..8f0f1fb 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -706,7 +706,10 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
spin_unlock_bh(&pool->sp_lock);
len = 0;
- if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
+ if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
+ dprintk("svc_recv: found XPT_CLOSE\n");
+ svc_delete_xprt(xprt);
+ } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
struct svc_xprt *newxpt;
newxpt = xprt->xpt_ops->xpo_accept(xprt);
if (newxpt) {
@@ -732,7 +735,7 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
svc_xprt_received(newxpt);
}
svc_xprt_received(xprt);
- } else if (!test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
+ } else {
dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
rqstp, pool->sp_id, xprt,
atomic_read(&xprt->xpt_ref.refcount));
@@ -745,11 +748,6 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
dprintk("svc: got len=%d\n", len);
}
- if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
- dprintk("svc_recv: found XPT_CLOSE\n");
- svc_delete_xprt(xprt);
- }
-
/* No data, incomplete (TCP) read, or accept() */
if (len == 0 || len == -EAGAIN) {
rqstp->rq_res.len = 0;
--
1.6.3.3
From f41357becb29e874a7adf4d77d52c31cb7b91820 Mon Sep 17 00:00:00 2001
From: Neil Brown <neilb@suse.de>
Date: Sun, 28 Feb 2010 22:01:05 -0500
Subject: [PATCH 4/4] nfsd: ensure sockets are closed on error
One of the changes in commit d7979ae4a "svc: Move close processing to a
single place" is:
err_delete:
- svc_delete_socket(svsk);
+ set_bit(SK_CLOSE, &svsk->sk_flags);
return -EAGAIN;
This is insufficient. The recvfrom methods must always call
svc_xprt_received on completion so that the socket gets re-queued if
there is any more work to do. This particular path did not make that
call because it actually destroyed the svsk, making requeue pointless.
When the svc_delete_socket was changed to just set a bit, we should have
added a call to svc_xprt_received.
This is the problem that b0401d7253 attempted to fix, incorrectly.
Cc: Tom Tucker <tom@opengridcomputing.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Greg Banks <gnb-xTcybq6BZ68@public.gmane.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
---
net/sunrpc/svcsock.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 9e09391..a29f259 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -968,6 +968,7 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
return len;
err_delete:
set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
+ svc_xprt_received(&svsk->sk_xprt);
err_again:
return -EAGAIN;
}
--
1.6.3.3
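The requeue problem described in patch 4/4 can be illustrated with a
small userspace sketch. This is not kernel code: the model_* names, the
two flag bits and the "queued" field are assumptions made for the
example, and only loosely mirror XPT_BUSY, XPT_CLOSE and the behaviour
of svc_xprt_received. It shows why setting the close bit alone leaves
the transport stranded while it is still marked busy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; names only loosely mirror the kernel's. */
enum { MODEL_BUSY = 1u << 0, MODEL_CLOSE = 1u << 1 };

struct model_xprt {
	unsigned int flags;
	bool queued;	/* true once a server thread may pick it up again */
};

/* In the spirit of an enqueue helper: a busy transport is never queued. */
static void model_enqueue(struct model_xprt *x)
{
	if (x->flags & MODEL_BUSY)
		return;
	if (x->flags & MODEL_CLOSE)
		x->queued = true;	/* pending close counts as work */
}

/* In the spirit of svc_xprt_received: drop BUSY, then retry the enqueue. */
static void model_received(struct model_xprt *x)
{
	assert(x->flags & MODEL_BUSY);
	x->flags &= ~MODEL_BUSY;
	model_enqueue(x);
}

/* Error path before the patch: only the close bit is set. */
static void model_err_delete_old(struct model_xprt *x)
{
	x->flags |= MODEL_CLOSE;
}

/* Error path with the patch: also hand the transport back. */
static void model_err_delete_new(struct model_xprt *x)
{
	x->flags |= MODEL_CLOSE;
	model_received(x);
}
```

In this model the old error path leaves the transport busy and unqueued,
so nothing ever gets to act on the close bit; the new path clears the
busy flag and the transport is queued for closing.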
* Re: The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
2010-03-01 3:46 ` J. Bruce Fields
@ 2010-03-01 3:48 ` J. Bruce Fields
2010-03-01 5:51 ` Neil Brown
1 sibling, 0 replies; 20+ messages in thread
From: J. Bruce Fields @ 2010-03-01 3:48 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs, Wei Yongjun
For people that can test, the combined patch is just as follows.
--b.
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index d7ec5ca..8f0f1fb 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -706,8 +706,10 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
spin_unlock_bh(&pool->sp_lock);
len = 0;
- if (test_bit(XPT_LISTENER, &xprt->xpt_flags) &&
- !test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
+ if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
+ dprintk("svc_recv: found XPT_CLOSE\n");
+ svc_delete_xprt(xprt);
+ } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
struct svc_xprt *newxpt;
newxpt = xprt->xpt_ops->xpo_accept(xprt);
if (newxpt) {
@@ -733,7 +735,7 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
svc_xprt_received(newxpt);
}
svc_xprt_received(xprt);
- } else if (!test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
+ } else {
dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
rqstp, pool->sp_id, xprt,
atomic_read(&xprt->xpt_ref.refcount));
@@ -746,11 +748,6 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
dprintk("svc: got len=%d\n", len);
}
- if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
- dprintk("svc_recv: found XPT_CLOSE\n");
- svc_delete_xprt(xprt);
- }
-
/* No data, incomplete (TCP) read, or accept() */
if (len == 0 || len == -EAGAIN) {
rqstp->rq_res.len = 0;
@@ -896,11 +893,8 @@ void svc_delete_xprt(struct svc_xprt *xprt)
if (test_bit(XPT_TEMP, &xprt->xpt_flags))
serv->sv_tmpcnt--;
- for (dr = svc_deferred_dequeue(xprt); dr;
- dr = svc_deferred_dequeue(xprt)) {
- svc_xprt_put(xprt);
+ while ((dr = svc_deferred_dequeue(xprt)) != NULL)
kfree(dr);
- }
svc_xprt_put(xprt);
spin_unlock_bh(&serv->sv_lock);
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 9e09391..a29f259 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -968,6 +968,7 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
return len;
err_delete:
set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
+ svc_xprt_received(&svsk->sk_xprt);
err_again:
return -EAGAIN;
}
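The reordered branch in svc_recv from the combined patch can be sketched
in userspace as follows. This is only a model under stated assumptions:
the M_* flags, struct model_xprt and model_recv_then_close are invented
for the example, and a single call to model_recv_pass stands in for one
trip through the relevant part of svc_recv. The point it tries to show
is that the close test now comes first and excludes the other two
branches, so a close flag raised during a receive is acted on in a later
pass rather than in the same pass that returns data.

```c
#include <assert.h>

/* Hypothetical flag bits standing in for XPT_CLOSE and XPT_LISTENER. */
enum { M_CLOSE = 1u << 0, M_LISTENER = 1u << 1 };

struct model_xprt {
	unsigned int flags;
	int deleted;	/* set once the transport has been torn down */
	int accepted;	/* number of accept passes on a listener */
};

/* Hypothetical receive method: returns 100 bytes, then the peer goes away. */
static int model_recv_then_close(struct model_xprt *x)
{
	x->flags |= M_CLOSE;
	return 100;
}

/* One pass: close first, else accept on a listener, else receive. */
static int model_recv_pass(struct model_xprt *x,
			   int (*recvfrom)(struct model_xprt *))
{
	int len = 0;

	assert(!x->deleted);
	if (x->flags & M_CLOSE)
		x->deleted = 1;
	else if (x->flags & M_LISTENER)
		x->accepted++;
	else
		len = recvfrom(x);
	return len;
}
```

In the model, the first pass returns 100 with the transport still alive,
and only the second pass deletes it and returns 0. A listener that
already has the close bit set is deleted without attempting an accept.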
* Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put
2010-02-27 2:38 ` Tom Tucker
@ 2010-03-01 4:23 ` Neil Brown
[not found] ` <20100301152310.750f3504-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
0 siblings, 1 reply; 20+ messages in thread
From: Neil Brown @ 2010-03-01 4:23 UTC (permalink / raw)
To: Tom Tucker; +Cc: J. Bruce Fields, linux-nfs
On Fri, 26 Feb 2010 20:38:25 -0600
Tom Tucker <tom@opengridcomputing.com> wrote:
> Neil Brown wrote:
> > On Fri, 26 Feb 2010 18:40:58 -0600
> > Tom Tucker <tom@opengridcomputing.com> wrote:
> >
> >
> >> J. Bruce Fields wrote:
> >>
> >>> On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
> >>>
> >>>
> >>>> [I found this while looking for the current refcount problem
> >>>> that triggers a warning in svc_recv. This isn't that bug
> >>>> but is a different refcount bug - NB]
> >>>>
> >>>>
> >>>
> >>>
> >> I seem to recall that we added that reference for a reason. There was
> >> an issue with unmount while there were deferrals pending. That's why the
> >> reference was added.
> >>
> >> Tom
> >>
> >
> > What reference?
> > What I (thought I) found was code that was dropping a reference which it
> > didn't hold. Are you saying that it is supposed to be holding a reference
> > here, but isn't, or that it really is holding a reference here and I didn't
> > see it?
> >
>
> Here's the commit that I was thinking of...
> 22945e4a1c7454c97f5d8aee1ef526c83fef3223
>
> I think this change adds the bug that you are now fixing. It fixed one
> problem, but added another that you have now resolved.
>
> What do you guys think?
Yes, I see what you are saying.
I agree that commit did fix a problem, but inadvertently introduced a new one.
Thanks,
NeilBrown
>
> Thanks,
> Tom
> > And just for completeness, my understanding of the refcounting here is:
> >
> > A counted reference is held on an svc_xprt when:
> > - a 'struct rqst' refers to it through ->rq_xprt
> > - a 'cache_deferred_req' refers to it through ->xprt
> > This only happens while the req is waiting to be
> > revisited, and is in the hash table and on the lru.
> > Once the req gets revisited (svc_revisit) ->xprt
> > is set to NULL and the reference is dropped.
> > - XPT_DEAD is *not* set. So the refcount is initialised
> > to '1' to reflect this, and this ref is dropped
> > when we set XPT_DEAD.
> > - there are a few transient references in svc_xprt.c
> > which very clearly have matched 'get' and 'put'.
> > - svc_find_xprt returns a counted reference. This is
> > called once in lockd and once in nfsd, and both
> > calls drop the ref correctly.
> >
> > Whenever we drop a counted ref that was stored in a pointer, we set that
> > pointer to NULL.
> > So if there was a race where two threads both get a reference from a pointer
> > and then drop that reference, you would expect that slightly different timing
> > would cause one of those threads to get a NULL from the pointer, dereference
> > it, and crash. There are no important tests-for-NULL on either of the
> > pointers in question, so that wouldn't be protecting us from a crash. But
> > we don't see that crash, so there cannot be a race there.
> >
> > So: The refcount cannot possibly be zero in svc_recv :-)
> >
> > I just noticed some slightly odd code later in svc_recv:
> >
> > if (XPT_LISTENER && !XPT_CLOSE) {
> > ...
> > } else if (!XPT_CLOSE) {
> > ...
> > ->xpo_recvfrom()
> > }
> > if (XPT_CLOSE) {
> > ...
> > svc_delete_xprt()
> > }
> >
> > So if XPT_CLOSE is set while xpo_recvfrom is being called, which I think
> > is possible, and if ->xpo_recvfrom returns non-zero, then we end up
> > processing a request on a dead socket, which doesn't sound like the right
> > thing to do. I don't think it can cause the present problem, but
> > it looks wrong. That last 'if' should just be an 'else'.
> > I guess that would effectively reverse b0401d7253, though - not that
> > that patch seems entirely right to me - if there is a problem I probably
> > would have fixed it differently, though I'm not sure how.
> > So maybe change "if (XPT_CLOSE)" to "if (len <= 0 && XPT_CLOSE)" ???
> >
> > NeilBrown
> >
>
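The refcounting rule quoted above (a deferred request holds a counted
reference only while it is waiting to be revisited, and gives it up when
it is moved onto the transport's deferred queue) can be sketched in
userspace. All model_* names are assumptions for the example; the real
structures are svc_xprt and svc_deferred_req, and the real counter is a
kref. The sketch shows why draining the queue at delete time must free
the entries without dropping any further references.

```c
#include <assert.h>
#include <stdlib.h>

struct model_dreq;

/* Hypothetical stand-in for a transport with a plain integer refcount. */
struct model_xprt {
	int refs;
	struct model_dreq *deferred;	/* queue of revisited requests */
};

struct model_dreq {
	struct model_xprt *xprt;	/* counted reference while non-NULL */
	struct model_dreq *next;
};

static void model_get(struct model_xprt *x) { x->refs++; }

static void model_put(struct model_xprt *x)
{
	assert(x->refs > 0);	/* a put without a matching get is the bug */
	x->refs--;
}

/* Defer: the request takes a counted reference through ->xprt. */
static struct model_dreq *model_defer(struct model_xprt *x)
{
	struct model_dreq *dr = calloc(1, sizeof(*dr));

	if (!dr)
		return NULL;
	model_get(x);
	dr->xprt = x;
	return dr;
}

/* Revisit: clear the pointer, queue the request, drop the reference. */
static void model_revisit(struct model_dreq *dr)
{
	struct model_xprt *x = dr->xprt;

	dr->xprt = NULL;
	dr->next = x->deferred;
	x->deferred = dr;
	model_put(x);
}

/* Delete-time cleanup: queued entries own no reference, so only free. */
static int model_drain(struct model_xprt *x)
{
	struct model_dreq *dr;
	int n = 0;

	while ((dr = x->deferred) != NULL) {
		x->deferred = dr->next;
		free(dr);
		n++;
	}
	return n;
}
```

Starting from the initial count of 1, two defers raise the model's count
to 3, two revisits bring it back to 1, and draining the queue leaves it
at 1. An extra put per drained entry would drive the count below what
the remaining holders expect.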
* Re: The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
2010-03-01 3:46 ` J. Bruce Fields
2010-03-01 3:48 ` J. Bruce Fields
@ 2010-03-01 5:51 ` Neil Brown
[not found] ` <20100301165114.74d2797b-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
1 sibling, 1 reply; 20+ messages in thread
From: Neil Brown @ 2010-03-01 5:51 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Tom Tucker, linux-nfs, Wei Yongjun
On Sun, 28 Feb 2010 22:46:47 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Mon, Mar 01, 2010 at 10:57:34AM +1100, Neil Brown wrote:
> > No, you are correct. "return 0" is wrong, it should be "return -EAGAIN",
> > both in the XPT_CLOSE case and the XPT_LISTENER case.
> >
> > I observed that in both those cases, 'len' remained at 0 and we didn't do
> > much else but 'return len', so I optimised.
> > I forgot to factor in:
> >
> > if (len == 0 || len == -EAGAIN) {
> > rqstp->rq_res.len = 0;
> > svc_xprt_release(rqstp);
> > return -EAGAIN;
> > }
> >
> > So the svc_xprt_release needs to be moved in there as well, I'm not sure
> > about the rq_res.len = 0.
> > Maybe that was a bad case of premature-optimisation :-)
> >
> > We should probably leave that last else clause as it is and just have a
> > single return from the function.
>
> OK, so the below is what I'm thinking of sending, after some testing;
> really just a split-up version of your patches (uh, so credits may be
> wrong) with the final cleanup removed:
Credits and code look OK to me, thanks.
>
> 1. remove the extra put from svc_delete_xprt().
> 2,3. Revert 2 problematic patches which caused the oops people
> are seeing.
> 4. Fix the original bug from the rdma series.
>
> And the first 3 will go to stable as well. The 4th might eventually
> too, it just seems less urgent.
>
> I also agree with the cleanup that moves the svc_xprt_received to one
> place, I'm just hoping you won't mind regenerating it against this.
See below.
There is still room to tidy up svc_recv, including getting the xpo_recvfrom
routines to report -EAGAIN when that is what they mean, rather than '0',
but I'm not really happy with what I have so-far so I won't post it yet.
NeilBrown
From 1e75b9d1232957cd44e0d8ea704c9af431cc85be Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Mon, 1 Mar 2010 15:49:11 +1100
Subject: [PATCH] sunrpc: centralise most calls to svc_xprt_received
svc_xprt_received must be called when ->xpo_recvfrom has finished
receiving a message, so that the XPT_BUSY flag will be cleared and
the xprt, if necessary, requeued for further work.
This call is currently made in each ->xpo_recvfrom function, often
from multiple different points; in each case it is the earliest point
on a particular path where it is known that the protection provided by
XPT_BUSY is no longer needed.
However there are (still) some error paths which do not call
svc_xprt_received, and requiring each ->xpo_recvfrom to make the call
does not encourage robustness.
So: move the svc_xprt_received call to be made just after the
call to ->xpo_recvfrom(), and move it out of the various ->xpo_recvfrom
methods.
This means that it may not be called at the earliest possible instant,
but this is unlikely to be a measurable performance issue.
Note that there are still other calls to svc_xprt_received as it is
also needed when an xprt is newly created.
Signed-off-by: NeilBrown <neilb@suse.de>
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 6bd41a9..70b74be 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -736,8 +736,10 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
if (rqstp->rq_deferred) {
svc_xprt_received(xprt);
len = svc_deferred_recv(rqstp);
- } else
+ } else {
len = xprt->xpt_ops->xpo_recvfrom(rqstp);
+ svc_xprt_received(xprt);
+ }
dprintk("svc: got len=%d\n", len);
}
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 528efef..7425029 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -547,7 +547,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
dprintk("svc: recvfrom returned error %d\n", -err);
set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
}
- svc_xprt_received(&svsk->sk_xprt);
return -EAGAIN;
}
len = svc_addr_len(svc_addr(rqstp));
@@ -562,11 +561,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
svsk->sk_sk->sk_stamp = skb->tstamp;
set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); /* there may be more data... */
- /*
- * Maybe more packets - kick another thread ASAP.
- */
- svc_xprt_received(&svsk->sk_xprt);
-
len = skb->len - sizeof(struct udphdr);
rqstp->rq_arg.len = len;
@@ -917,7 +911,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
if (len < want) {
dprintk("svc: short recvfrom while reading record "
"length (%d of %d)\n", len, want);
- svc_xprt_received(&svsk->sk_xprt);
goto err_again; /* record header not complete */
}
@@ -953,7 +946,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
if (len < svsk->sk_reclen) {
dprintk("svc: incomplete TCP record (%d of %d)\n",
len, svsk->sk_reclen);
- svc_xprt_received(&svsk->sk_xprt);
goto err_again; /* record not complete */
}
len = svsk->sk_reclen;
@@ -961,14 +953,11 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
return len;
error:
- if (len == -EAGAIN) {
+ if (len == -EAGAIN)
dprintk("RPC: TCP recv_record got EAGAIN\n");
- svc_xprt_received(&svsk->sk_xprt);
- }
return len;
err_delete:
set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
- svc_xprt_received(&svsk->sk_xprt);
err_again:
return -EAGAIN;
}
@@ -1110,7 +1099,6 @@ out:
svsk->sk_tcplen = 0;
svc_xprt_copy_addrs(rqstp, &svsk->sk_xprt);
- svc_xprt_received(&svsk->sk_xprt);
if (serv->sv_stats)
serv->sv_stats->nettcpcnt++;
@@ -1119,7 +1107,6 @@ out:
err_again:
if (len == -EAGAIN) {
dprintk("RPC: TCP recvfrom got EAGAIN\n");
- svc_xprt_received(&svsk->sk_xprt);
return len;
}
error:
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index f92e37e..0194de8 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -566,7 +566,6 @@ static int rdma_read_complete(struct svc_rqst *rqstp,
ret, rqstp->rq_arg.len, rqstp->rq_arg.head[0].iov_base,
rqstp->rq_arg.head[0].iov_len);
- svc_xprt_received(rqstp->rq_xprt);
return ret;
}
@@ -665,7 +664,6 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
rqstp->rq_arg.head[0].iov_len);
rqstp->rq_prot = IPPROTO_MAX;
svc_xprt_copy_addrs(rqstp, xprt);
- svc_xprt_received(xprt);
return ret;
close_out:
@@ -678,6 +676,5 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
*/
set_bit(XPT_CLOSE, &xprt->xpt_flags);
defer:
- svc_xprt_received(xprt);
return 0;
}
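The idea behind this patch, having the caller release the transport
once after the receive method returns instead of relying on every exit
path of every method to do so, can be sketched in userspace. The
model_* names, the ops table and the received_calls counter are
assumptions made for the example and do not correspond to real kernel
symbols.

```c
#include <assert.h>
#include <errno.h>

struct model_xprt;

/* Hypothetical per-transport method table, like xpt_ops in spirit. */
struct model_ops {
	int (*recvfrom)(struct model_xprt *);
};

struct model_xprt {
	const struct model_ops *ops;
	int busy;		/* stands in for XPT_BUSY */
	int received_calls;	/* how often the transport was released */
};

/* Stand-in for svc_xprt_received: clear busy exactly once per receive. */
static void model_received(struct model_xprt *x)
{
	assert(x->busy);
	x->busy = 0;
	x->received_calls++;
}

/* Two hypothetical receive methods; neither touches the busy flag. */
static int model_recv_ok(struct model_xprt *x)
{
	(void)x;
	return 128;
}

static int model_recv_short(struct model_xprt *x)
{
	(void)x;
	return -EAGAIN;
}

/* Caller: the release happens here, whatever path the method took. */
static int model_svc_recv(struct model_xprt *x)
{
	int len;

	x->busy = 1;
	len = x->ops->recvfrom(x);
	model_received(x);
	return len;
}
```

With the release in one place, a success path and an error path both
leave the model transport non-busy after exactly one release, and a new
receive method cannot forget the call.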
* Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put
[not found] ` <20100301152310.750f3504-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2010-03-01 14:44 ` J. Bruce Fields
0 siblings, 0 replies; 20+ messages in thread
From: J. Bruce Fields @ 2010-03-01 14:44 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs
On Mon, Mar 01, 2010 at 03:23:10PM +1100, Neil Brown wrote:
> On Fri, 26 Feb 2010 20:38:25 -0600
> Tom Tucker <tom@opengridcomputing.com> wrote:
>
> > Neil Brown wrote:
> > > On Fri, 26 Feb 2010 18:40:58 -0600
> > > Tom Tucker <tom@opengridcomputing.com> wrote:
> > >
> > >
> > >> J. Bruce Fields wrote:
> > >>
> > >>> On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
> > >>>
> > >>>
> > >>>> [I found this while looking for the current refcount problem
> > >>>> that triggers a warning in svc_recv. This isn't that bug
> > >>>> but is a different refcount bug - NB]
> > >>>>
> > >>>>
> > >>>
> > >>>
> > >> I seem to recall that we added that reference for a reason. There was
> > >> an issue with unmount while there were deferrals pending. That's why the
> > >> reference was added.
> > >>
> > >> Tom
> > >>
> > >
> > > What reference?
> > > What I (thought I) found was code that was dropping a reference which it
> > > didn't hold. Are you saying that it is supposed to be holding a reference
> > > here, but isn't, or that it really is holding a reference here and I didn't
> > > see it?
> > >
> >
> > Here's the commit that I was thinking of...
> > 22945e4a1c7454c97f5d8aee1ef526c83fef3223
> >
> > I think this change adds the bug that you are now fixing. It fixed one
> > problem, but added another that you have now resolved.
> >
> > What do you guys think?
>
> Yes, I see what you are saying.
>
> I agree that commit did fix a problem, but inadvertently introduced a new one.
Agreed. So it looks to me like there's nothing additional here to fix.
(Correct me if I'm overlooking something.)
--b.
* Re: The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
[not found] ` <20100301165114.74d2797b-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2010-03-01 14:50 ` J. Bruce Fields
2010-03-01 23:19 ` J. Bruce Fields
2010-04-28 21:43 ` J. Bruce Fields
1 sibling, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2010-03-01 14:50 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs, Wei Yongjun
On Mon, Mar 01, 2010 at 04:51:14PM +1100, Neil Brown wrote:
> On Sun, 28 Feb 2010 22:46:47 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
>
> > On Mon, Mar 01, 2010 at 10:57:34AM +1100, Neil Brown wrote:
> > > No, you are correct. "return 0" is wrong, it should be "return -EAGAIN",
> > > both in the XPT_CLOSE case and the XPT_LISTENER case.
> > >
> > > I observed that in both those cases, 'len' remained at 0 and we didn't do
> > > much else but 'return len', so I optimised.
> > > I forgot to factor in:
> > >
> > > if (len == 0 || len == -EAGAIN) {
> > > rqstp->rq_res.len = 0;
> > > svc_xprt_release(rqstp);
> > > return -EAGAIN;
> > > }
> > >
> > > So the svc_xprt_release needs to be moved in there as well, I'm not sure
> > > about the rq_res.len = 0.
> > > Maybe that was a bad case of premature-optimisation :-)
> > >
> > > We should probably leave that last else clause as it is and just have a
> > > single return from the function.
> >
> > OK, so the below is what I'm thinking of sending, after some testing;
> > really just a split-up version of your patches (uh, so credits may be
> > wrong) with the final cleanup removed:
>
> Credits and code look OK to me, thanks.
Thanks, and I'll add S-o-b/acked-by for you as appropriate assuming no
objection.
> > 1. remove the extra put from svc_delete_xprt().
> > 2,3. Revert 2 problematic patches which caused the oops people
> > are seeing.
> > 4. Fix the original bug from the rdma series.
> >
> > And the first 3 will go to stable as well. The 4th might eventually
> > too, it just seems less urgent.
> >
> > I also agree with the cleanup that moves the svc_xprt_received to one
> > place, I'm just hoping you won't mind regenerating it against this.
>
> See below.
> There is still room to tidy up svc_recv, including getting the xpo_recvfrom
> routines to report -EAGAIN when that is what they mean, rather than '0',
> but I'm not really happy with what I have so-far so I won't post it yet.
OK, thanks again!
--b.
>
> NeilBrown
>
>
> From 1e75b9d1232957cd44e0d8ea704c9af431cc85be Mon Sep 17 00:00:00 2001
> From: NeilBrown <neilb@suse.de>
> Date: Mon, 1 Mar 2010 15:49:11 +1100
> Subject: [PATCH] sunrpc: centralise most calls to svc_xprt_received
>
> svc_xprt_received must be called when ->xpo_recvfrom has finished
> receiving a message, so that the XPT_BUSY flag will be cleared and
> the xprt, if necessary, requeued for further work.
>
> This call is currently made in each ->xpo_recvfrom function, often
> from multiple different points; in each case it is the earliest point
> on a particular path where it is known that the protection provided by
> XPT_BUSY is no longer needed.
>
> However there are (still) some error paths which do not call
> svc_xprt_received, and requiring each ->xpo_recvfrom to make the call
> does not encourage robustness.
>
> So: move the svc_xprt_received call to be made just after the
> call to ->xpo_recvfrom(), and move it out of the various ->xpo_recvfrom
> methods.
>
> This means that it may not be called at the earliest possible instant,
> but this is unlikely to be a measurable performance issue.
>
> Note that there are still other calls to svc_xprt_received as it is
> also needed when an xprt is newly created.
>
> Signed-off-by: NeilBrown <neilb@suse.de>
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 6bd41a9..70b74be 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -736,8 +736,10 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
> if (rqstp->rq_deferred) {
> svc_xprt_received(xprt);
> len = svc_deferred_recv(rqstp);
> - } else
> + } else {
> len = xprt->xpt_ops->xpo_recvfrom(rqstp);
> + svc_xprt_received(xprt);
> + }
> dprintk("svc: got len=%d\n", len);
> }
>
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 528efef..7425029 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -547,7 +547,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
> dprintk("svc: recvfrom returned error %d\n", -err);
> set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
> }
> - svc_xprt_received(&svsk->sk_xprt);
> return -EAGAIN;
> }
> len = svc_addr_len(svc_addr(rqstp));
> @@ -562,11 +561,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
> svsk->sk_sk->sk_stamp = skb->tstamp;
> set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); /* there may be more data... */
>
> - /*
> - * Maybe more packets - kick another thread ASAP.
> - */
> - svc_xprt_received(&svsk->sk_xprt);
> -
> len = skb->len - sizeof(struct udphdr);
> rqstp->rq_arg.len = len;
>
> @@ -917,7 +911,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
> if (len < want) {
> dprintk("svc: short recvfrom while reading record "
> "length (%d of %d)\n", len, want);
> - svc_xprt_received(&svsk->sk_xprt);
> goto err_again; /* record header not complete */
> }
>
> @@ -953,7 +946,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
> if (len < svsk->sk_reclen) {
> dprintk("svc: incomplete TCP record (%d of %d)\n",
> len, svsk->sk_reclen);
> - svc_xprt_received(&svsk->sk_xprt);
> goto err_again; /* record not complete */
> }
> len = svsk->sk_reclen;
> @@ -961,14 +953,11 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
>
> return len;
> error:
> - if (len == -EAGAIN) {
> + if (len == -EAGAIN)
> dprintk("RPC: TCP recv_record got EAGAIN\n");
> - svc_xprt_received(&svsk->sk_xprt);
> - }
> return len;
> err_delete:
> set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
> - svc_xprt_received(&svsk->sk_xprt);
> err_again:
> return -EAGAIN;
> }
> @@ -1110,7 +1099,6 @@ out:
> svsk->sk_tcplen = 0;
>
> svc_xprt_copy_addrs(rqstp, &svsk->sk_xprt);
> - svc_xprt_received(&svsk->sk_xprt);
> if (serv->sv_stats)
> serv->sv_stats->nettcpcnt++;
>
> @@ -1119,7 +1107,6 @@ out:
> err_again:
> if (len == -EAGAIN) {
> dprintk("RPC: TCP recvfrom got EAGAIN\n");
> - svc_xprt_received(&svsk->sk_xprt);
> return len;
> }
> error:
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index f92e37e..0194de8 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -566,7 +566,6 @@ static int rdma_read_complete(struct svc_rqst *rqstp,
> ret, rqstp->rq_arg.len, rqstp->rq_arg.head[0].iov_base,
> rqstp->rq_arg.head[0].iov_len);
>
> - svc_xprt_received(rqstp->rq_xprt);
> return ret;
> }
>
> @@ -665,7 +664,6 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
> rqstp->rq_arg.head[0].iov_len);
> rqstp->rq_prot = IPPROTO_MAX;
> svc_xprt_copy_addrs(rqstp, xprt);
> - svc_xprt_received(xprt);
> return ret;
>
> close_out:
> @@ -678,6 +676,5 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
> */
> set_bit(XPT_CLOSE, &xprt->xpt_flags);
> defer:
> - svc_xprt_received(xprt);
> return 0;
> }
* Re: The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
2010-03-01 14:50 ` J. Bruce Fields
@ 2010-03-01 23:19 ` J. Bruce Fields
2010-03-01 23:20 ` J. Bruce Fields
0 siblings, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2010-03-01 23:19 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs, Wei Yongjun
On Mon, Mar 01, 2010 at 09:50:15AM -0500, J. Bruce Fields wrote:
> On Mon, Mar 01, 2010 at 04:51:14PM +1100, Neil Brown wrote:
> > On Sun, 28 Feb 2010 22:46:47 -0500
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> >
> > > On Mon, Mar 01, 2010 at 10:57:34AM +1100, Neil Brown wrote:
> > > > No, you are correct. "return 0" is wrong, it should be "return -EAGAIN",
> > > > both in the XPT_CLOSE case and the XPT_LISTENER case.
> > > >
> > > > I observed that in both those cases, 'len' remained at 0 and we didn't do
> > > > much else but 'return len', so I optimised.
> > > > I forgot to factor in:
> > > >
> > > > if (len == 0 || len == -EAGAIN) {
> > > > rqstp->rq_res.len = 0;
> > > > svc_xprt_release(rqstp);
> > > > return -EAGAIN;
> > > > }
> > > >
> > > > So the svc_xprt_release needs to be moved in there as well, I'm not sure
> > > > about the rq_res.len = 0.
> > > > Maybe that was a bad case of premature-optimisation :-)
> > > >
> > > > We should probably leave that last else clause as it is and just have a
> > > > single return from the function.
> > >
> > > OK, so the below is what I'm thinking of sending, after some testing;
> > > really just a split-up version of your patches (uh, so credits may be
> > > wrong) with the final cleanup removed:
> >
> > Credits and code look OK to me, thanks.
And, by the way, this is all ready to submit--but I'd like to avoid
having to revert anything more, and as part of that I'd greatly
appreciate any testing results, however partial.
--b.
* Re: The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
2010-03-01 23:19 ` J. Bruce Fields
@ 2010-03-01 23:20 ` J. Bruce Fields
0 siblings, 0 replies; 20+ messages in thread
From: J. Bruce Fields @ 2010-03-01 23:20 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs, Wei Yongjun
On Mon, Mar 01, 2010 at 06:19:56PM -0500, J. Bruce Fields wrote:
> On Mon, Mar 01, 2010 at 09:50:15AM -0500, J. Bruce Fields wrote:
> > On Mon, Mar 01, 2010 at 04:51:14PM +1100, Neil Brown wrote:
> > > On Sun, 28 Feb 2010 22:46:47 -0500
> > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > >
> > > > On Mon, Mar 01, 2010 at 10:57:34AM +1100, Neil Brown wrote:
> > > > > No, you are correct. "return 0" is wrong, it should be "return -EAGAIN",
> > > > > both in the XPT_CLOSE case and the XPT_LISTENER case.
> > > > >
> > > > > I observed that in both those cases, 'len' remained at 0 and we didn't do
> > > > > much else but 'return len', so I optimised.
> > > > > I forgot to factor in:
> > > > >
> > > > > if (len == 0 || len == -EAGAIN) {
> > > > > rqstp->rq_res.len = 0;
> > > > > svc_xprt_release(rqstp);
> > > > > return -EAGAIN;
> > > > > }
> > > > >
> > > > > So the svc_xprt_release needs to be moved in there as well, I'm not sure
> > > > > about the rq_res.len = 0.
> > > > > Maybe that was a bad case of premature-optimisation :-)
> > > > >
> > > > > We should probably leave that last else clause as it is and just have a
> > > > > single return from the function.
> > > >
> > > > OK, so the below is what I'm thinking of sending, after some testing;
> > > > really just a split-up version of your patches (uh, so credits may be
> > > > wrong) with the final cleanup removed:
> > >
> > > Credits and code look OK to me, thanks.
>
> And, by the way, this is all ready to submit--but I'd like to avoid
> having to revert anything more, and as part of that I'd greatly
> appreciate any testing results, however partial.
(I've run my basic regression tests, but they were never enough to
reproduce the refcnt warning others were seeing.)
--b.
* Re: The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put)
[not found] ` <20100301165114.74d2797b-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-03-01 14:50 ` J. Bruce Fields
@ 2010-04-28 21:43 ` J. Bruce Fields
1 sibling, 0 replies; 20+ messages in thread
From: J. Bruce Fields @ 2010-04-28 21:43 UTC (permalink / raw)
To: Neil Brown; +Cc: Tom Tucker, linux-nfs, Wei Yongjun
On Mon, Mar 01, 2010 at 04:51:14PM +1100, Neil Brown wrote:
> On Sun, 28 Feb 2010 22:46:47 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > I also agree with the cleanup that moves the svc_xprt_received to one
> > place, I'm just hoping you won't mind regenerating it against this.
>
> See below.
> There is still room to tidy up svc_recv, including getting the xpo_recvfrom
> routines to report -EAGAIN when that is what they mean, rather than '0',
> but I'm not really happy with what I have so-far so I won't post it yet.
I've applied this for 2.6.35; apologies for the delay.
Did you get another chance to look at the further cleanup you were
considering?
--b.
>
> NeilBrown
>
>
> From 1e75b9d1232957cd44e0d8ea704c9af431cc85be Mon Sep 17 00:00:00 2001
> From: NeilBrown <neilb@suse.de>
> Date: Mon, 1 Mar 2010 15:49:11 +1100
> Subject: [PATCH] sunrpc: centralise most calls to svc_xprt_received
>
> svc_xprt_received must be called when ->xpo_recvfrom has finished
> receiving a message, so that the XPT_BUSY flag will be cleared and
> the xprt, if necessary, requeued for further work.
>
> This call is currently made in each ->xpo_recvfrom function, often
> from multiple different points; in each case it is the earliest point
> on a particular path where it is known that the protection provided by
> XPT_BUSY is no longer needed.
>
> However there are (still) some error paths which do not call
> svc_xprt_received, and requiring each ->xpo_recvfrom to make the call
> does not encourage robustness.
>
> So: move the svc_xprt_received call to be made just after the
> call to ->xpo_recvfrom(), and move it out of the various ->xpo_recvfrom
> methods.
>
> This means that it may not be called at the earliest possible instant,
> but this is unlikely to be a measurable performance issue.
>
> Note that there are still other calls to svc_xprt_received as it is
> also needed when an xprt is newly created.
>
> Signed-off-by: NeilBrown <neilb@suse.de>
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 6bd41a9..70b74be 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -736,8 +736,10 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
> if (rqstp->rq_deferred) {
> svc_xprt_received(xprt);
> len = svc_deferred_recv(rqstp);
> - } else
> + } else {
> len = xprt->xpt_ops->xpo_recvfrom(rqstp);
> + svc_xprt_received(xprt);
> + }
> dprintk("svc: got len=%d\n", len);
> }
>
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 528efef..7425029 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -547,7 +547,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
> dprintk("svc: recvfrom returned error %d\n", -err);
> set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
> }
> - svc_xprt_received(&svsk->sk_xprt);
> return -EAGAIN;
> }
> len = svc_addr_len(svc_addr(rqstp));
> @@ -562,11 +561,6 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
> svsk->sk_sk->sk_stamp = skb->tstamp;
> set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); /* there may be more data... */
>
> - /*
> - * Maybe more packets - kick another thread ASAP.
> - */
> - svc_xprt_received(&svsk->sk_xprt);
> -
> len = skb->len - sizeof(struct udphdr);
> rqstp->rq_arg.len = len;
>
> @@ -917,7 +911,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
> if (len < want) {
> dprintk("svc: short recvfrom while reading record "
> "length (%d of %d)\n", len, want);
> - svc_xprt_received(&svsk->sk_xprt);
> goto err_again; /* record header not complete */
> }
>
> @@ -953,7 +946,6 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
> if (len < svsk->sk_reclen) {
> dprintk("svc: incomplete TCP record (%d of %d)\n",
> len, svsk->sk_reclen);
> - svc_xprt_received(&svsk->sk_xprt);
> goto err_again; /* record not complete */
> }
> len = svsk->sk_reclen;
> @@ -961,14 +953,11 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
>
> return len;
> error:
> - if (len == -EAGAIN) {
> + if (len == -EAGAIN)
> dprintk("RPC: TCP recv_record got EAGAIN\n");
> - svc_xprt_received(&svsk->sk_xprt);
> - }
> return len;
> err_delete:
> set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
> - svc_xprt_received(&svsk->sk_xprt);
> err_again:
> return -EAGAIN;
> }
> @@ -1110,7 +1099,6 @@ out:
> svsk->sk_tcplen = 0;
>
> svc_xprt_copy_addrs(rqstp, &svsk->sk_xprt);
> - svc_xprt_received(&svsk->sk_xprt);
> if (serv->sv_stats)
> serv->sv_stats->nettcpcnt++;
>
> @@ -1119,7 +1107,6 @@ out:
> err_again:
> if (len == -EAGAIN) {
> dprintk("RPC: TCP recvfrom got EAGAIN\n");
> - svc_xprt_received(&svsk->sk_xprt);
> return len;
> }
> error:
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index f92e37e..0194de8 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -566,7 +566,6 @@ static int rdma_read_complete(struct svc_rqst *rqstp,
> ret, rqstp->rq_arg.len, rqstp->rq_arg.head[0].iov_base,
> rqstp->rq_arg.head[0].iov_len);
>
> - svc_xprt_received(rqstp->rq_xprt);
> return ret;
> }
>
> @@ -665,7 +664,6 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
> rqstp->rq_arg.head[0].iov_len);
> rqstp->rq_prot = IPPROTO_MAX;
> svc_xprt_copy_addrs(rqstp, xprt);
> - svc_xprt_received(xprt);
> return ret;
>
> close_out:
> @@ -678,6 +676,5 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
> */
> set_bit(XPT_CLOSE, &xprt->xpt_flags);
> defer:
> - svc_xprt_received(xprt);
> return 0;
> }
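
The idea in the quoted patch, releasing the transport once in the caller instead of on every exit path of each recvfrom method, can be sketched in user space. This is only a toy model under stated assumptions: struct toy_xprt, toy_xprt_received(), toy_recvfrom() and toy_recv() are hypothetical stand-ins invented for illustration, not the kernel's svc_xprt code, and the busy flag is a plain bool rather than an atomic bit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a transport; NOT the kernel's struct svc_xprt. */
struct toy_xprt {
	bool busy;           /* loosely models the XPT_BUSY flag */
	int  received_calls; /* how many times the transport was released */
	int  next_result;    /* bytes the toy recvfrom pretends to read */
};

/* Loosely models svc_xprt_received(): hand the transport back. */
static void toy_xprt_received(struct toy_xprt *x)
{
	assert(x->busy);     /* releasing an idle transport would be a bug */
	x->busy = false;
	x->received_calls++;
}

/*
 * Loosely models an xpo_recvfrom method with more than one exit path.
 * None of the paths releases the transport; that is the caller's job.
 */
static int toy_recvfrom(struct toy_xprt *x)
{
	if (x->next_result <= 0)
		return -EAGAIN;      /* short or empty read: try again later */
	return x->next_result;       /* complete record */
}

/* Loosely models the non-deferred branch of svc_recv() after the patch. */
static int toy_recv(struct toy_xprt *x)
{
	int len;

	x->busy = true;              /* the caller has claimed the transport */
	len = toy_recvfrom(x);
	toy_xprt_received(x);        /* single release point for every path */
	return len;
}
```

Whichever path toy_recvfrom() takes, the release happens exactly once per toy_recv() call, which is the property the patch aims for. The cost, as the commit message says, is that the release may happen slightly later than the earliest possible instant.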
Thread overview: 20+ messages
2010-02-26 22:33 [PATCH] sunrpc: remove unnecessary svc_xprt_put Neil Brown
[not found] ` <19336.19524.469529.431210-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-02-26 22:44 ` J. Bruce Fields
2010-02-26 22:54 ` J. Bruce Fields
2010-02-27 0:40 ` Tom Tucker
2010-02-27 1:35 ` Neil Brown
[not found] ` <20100227123537.6289e326-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-02-27 2:38 ` Tom Tucker
2010-03-01 4:23 ` Neil Brown
[not found] ` <20100301152310.750f3504-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-03-01 14:44 ` J. Bruce Fields
2010-02-27 5:59 ` The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put) Neil Brown
[not found] ` <20100227165913.53718449-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-02-28 0:46 ` The recent kref_put warning Tom Tucker
2010-02-28 21:05 ` The recent kref_put warning (was: [PATCH] sunrpc: remove unnecessary svc_xprt_put) J. Bruce Fields
2010-02-28 22:07 ` J. Bruce Fields
2010-02-28 23:57 ` Neil Brown
[not found] ` <20100301105734.7fe935b0-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-03-01 3:46 ` J. Bruce Fields
2010-03-01 3:48 ` J. Bruce Fields
2010-03-01 5:51 ` Neil Brown
[not found] ` <20100301165114.74d2797b-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-03-01 14:50 ` J. Bruce Fields
2010-03-01 23:19 ` J. Bruce Fields
2010-03-01 23:20 ` J. Bruce Fields
2010-04-28 21:43 ` J. Bruce Fields