From: NeilBrown <neilb@suse.com>
To: "J. Bruce Fields" <bfields@fieldses.org>,
Vasily Averin <vvs@virtuozzo.com>
Cc: Jeff Layton <jlayton@kernel.org>,
linux-nfs@vger.kernel.org,
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Subject: Re: [PATCH 0/1] cache_head leak in sunrpc_cache_lookup()
Date: Fri, 05 Apr 2019 09:21:23 +1100 [thread overview]
Message-ID: <87sguxza1o.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <87zhtso38v.fsf@notabene.neil.brown.name>
[-- Attachment #1: Type: text/plain, Size: 4364 bytes --]
On Thu, Nov 29 2018, NeilBrown wrote:
> On Wed, Nov 28 2018, J. Bruce Fields wrote:
>
>> On Wed, Nov 28, 2018 at 11:45:46AM +0300, Vasily Averin wrote:
>>> Dear all, we have found memory leak on OpenVz7 node and believe it
>>> affects mainline too.
>>>
>>> sunrpc_cache_lookup() removes exprired cache_head from hash, however
>>> if it waits for reply on submitted cache_request both of them can leak
>>> forever, nobody cleans unhashed cache_heads.
>>>
>>> Originally we had claim on busy loop device of stopped container, that
>>> had executed nfs server inside. Device was kept by mount that was
>>> detached from already destroyed mount namespace. By using crash
>>> search we have found some structure with path struct related to our
>>> mount. Finally we have found that it was alive svc_export struct used
>>> by to alive cache_request, however both of them pointed to already
>>> freed cache_detail.
>>>
>>> We decided that cache_detail was correctly freed during destroy of net
>>> namespace, however svc_export with taken path struct, cache_request
>>> and some other structures seems was leaked forever.
>>>
>>> This could happen only if cache_head of svc_export was removed from
>>> hash on cache_detail before its destroy. Finally we have found that it
>>> could happen when sunrpc_cache_lookup() removes expired cache_head
>>> from hash.
>>>
>>> Usually it works correctly and cache_put(freeme) frees expired
>>> cache_head. However in our case cache_head have an extra reference
>>> counter from stalled cache_request. Becasue of cache_head was removed
>>> from hash of cache_detail it cannot be found in cache_clean() and its
>>> cache_request cannot be freed in cache_dequeue(). Memory leaks
>>> forever, exactly like we observed.
>>>
>>> After may attempts we have reproduced this situation on OpenVz7
>>> kernel, however our reproducer is quite long and complex.
>>> Unfortunately we still did not reproduced this problem on mainline
>>> kernel and did not validated the patch yet.
>>>
>>> It would be great if someone advised us some simple way to trigger
>>> described scenario.
>>
>> I think you should be able to produce hung upcalls by flushing the cache
>> (exportfs -f), then stopping mountd, then trying to access the
>> filesystem from a client. Does that help?
>>
>>> We are not sure that our patch is correct, please let us know if our
>>> analyze missed something.
>>
>> It looks OK to me, but it would be helpful to have Neil's review too.
>
> Yes, it makes sense to me.
> Reviewed-by: NeilBrown <neilb@suse.com>
Unfortunately I was wrong. See below.
NeilBrown
From: NeilBrown <neilb@suse.com>
Subject: [PATCH] sunrpc: don't mark uninitialised items as VALID.
A recent commit added a call to cache_fresh_locked()
when an expired item was found.
The call sets the CACHE_VALID flag, so it is important
that the item actually is valid.
There are two ways it could be valid:
1/ If ->update has been called to fill in relevant content
2/ If CACHE_NEGATIVE is set, to say that content doesn't exist.
An expired item that is waiting for an update will be neither.
Setting CACHE_VALID will mean that a subsequent call to cache_put()
will be likely to dereference uninitialised pointers.
So we must make sure the item is valid, and we already have code to do
that in try_to_negate_entry(). This takes the hash lock and so cannot
be used directly, so take out the two lines that we need and use them.
Now cache_fresh_locked() is certain to be called only on
a valid item.
Cc: stable@kernel.org # 2.6.35
Fixes: 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued request")
Signed-off-by: NeilBrown <neilb@suse.com>
---
net/sunrpc/cache.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 12bb23b8e0c5..f5f8ce8c3443 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -105,6 +105,8 @@ static struct cache_head *sunrpc_cache_add_entry(struct cache_detail *detail,
if (cache_is_expired(detail, tmp)) {
hlist_del_init_rcu(&tmp->cache_list);
detail->entries --;
+ if (cache_is_valid(tmp) == -EAGAIN)
+ set_bit(CACHE_NEGATIVE, &h->flags);
cache_fresh_locked(tmp, 0, detail);
freeme = tmp;
break;
--
2.14.0.rc0.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
next prev parent reply other threads:[~2019-04-04 22:21 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-28 8:45 [PATCH 0/1] cache_head leak in sunrpc_cache_lookup() Vasily Averin
2018-11-28 23:35 ` J. Bruce Fields
2018-11-29 5:35 ` NeilBrown
2018-12-04 20:42 ` J. Bruce Fields
2019-04-04 22:21 ` NeilBrown [this message]
2019-04-05 0:34 ` NeilBrown
2019-04-05 23:52 ` J. Bruce Fields
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87sguxza1o.fsf@notabene.neil.brown.name \
--to=neilb@suse.com \
--cc=bfields@fieldses.org \
--cc=jlayton@kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=ptikhomirov@virtuozzo.com \
--cc=vvs@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox