From: Ben Greear <greearb@candelatech.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>,
    Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Question on debugging use-after-free memory issues.
Date: Tue, 28 Jun 2011 23:01:25 -0700
Message-ID: <4E0ABFB5.2050403@candelatech.com>
In-Reply-To: <1309326112.2532.104.camel@edumazet-laptop>

On 06/28/2011 10:41 PM, Eric Dumazet wrote:
> On Wednesday, 29 June 2011 at 00:00 +0200, Jiri Kosina wrote:
>> On Mon, 27 Jun 2011, Ben Greear wrote:
>> Anyway, I'd propose to start with kmemcheck (see
>> Documentation/kmemcheck.txt). It could pin-point the problematic spot
>> immediately (or it might not).
>>
>
> kmemcheck is fine as long as the problem is not an SMP-only bug. Also,
> kmemcheck is so slow that it can make a rare bug very hard to trigger.

With the slub patches I posted, I think I've pretty much verified that
freed memory is passed down a certain call path.

What I can't figure out is how that came to be.

> Ben, given that you know that RPC might have a problem on a given small
> object (struct rpcbind_args), you could afford to replace the
> kmalloc()/kfree() calls used to allocate/free such objects with calls to
> the page allocator, and not free the page but unmap it from the kernel
> mapping, so that any further read/write access triggers a fault. You can
> then get a more precise idea of what's happening, without slowing down
> the whole kernel. Of course there is a memory leak for each
> "struct rpcbind_args" allocated, so this is a debugging aid only.
>
> DEBUG_PAGEALLOC might be too expensive, so try this patch (untested, you
> might need to complete it)
>
> diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
> index 9a80a92..9b4dbaf 100644
> --- a/net/sunrpc/rpcb_clnt.c
> +++ b/net/sunrpc/rpcb_clnt.c
> @@ -158,7 +158,7 @@ static void rpcb_map_release(void *data)
>  	rpcb_wake_rpcbind_waiters(map->r_xprt, map->r_status);
>  	xprt_put(map->r_xprt);
>  	kfree(map->r_addr);
> -	kfree(map);
> +	kernel_map_pages(virt_to_page(map), 1, 0);
>  }
>
>  /*
> @@ -668,7 +668,7 @@ void rpcb_getport_async(struct rpc_task *task)
>  		goto bailout_nofree;
>  	}
>
> -	map = kzalloc(sizeof(struct rpcbind_args), GFP_ATOMIC);
> +	map = (struct rpcbind_args *)__get_free_page(GFP_ATOMIC | __GFP_ZERO);
>  	if (!map) {
>  		status = -ENOMEM;
>  		dprintk("RPC: %5u %s: no memory available\n",
>

It can take hours of heavy load to hit the problem, so I don't think
I can afford to leak that much memory.

Interestingly, I added the sched.c change below, and haven't hit the
problem since. I'm not sure if it just changed the timing, or what... or
maybe I'll hit it overnight...

I also tried setting the fill value (below) to 0x6b instead of 0x0
(mempool doesn't really kmalloc/free too often, so the slub poisoning
doesn't help), but never did hit the bug again.

I suspect that somehow the task object is still on the work-queue
when it is deleted, but since neither the 0x6b nor the 0x0 poisoning
caused any funny crashes, I could easily be wrong about that.

diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 17c3e3a..d94f009 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -859,6 +859,12 @@ static void rpc_free_task(struct rpc_task *task)
 	if (task->tk_flags & RPC_TASK_DYNAMIC) {
 		dprintk("RPC: %5u freeing task\n", task->tk_pid);
+		/* HACK: Have been seeing use-after-free of calldata. Zero this memory
+		 * so that it cannot happen here. Seems to have fixed the problem
+		 * in 3.0 kernel, but maybe it just adjusted timing..either way,
+		 * it's not a real fix. --Ben
+		 */
+		memset(task, 0, sizeof(*task));
 		mempool_free(task, rpc_task_mempool);
 	}
 	rpc_release_calldata(tk_ops, calldata);
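
For anyone who wants to play with the poisoning idea outside the kernel,
here is a minimal userspace sketch in C. It is not the kernel's mempool
code: struct demo_task, struct demo_pool, DEMO_MAGIC and the demo_*
helpers are hypothetical names made up for illustration. It assumes a
pool that keeps freed objects on a small reserve, so the underlying
allocator may never see them again and the pool has to fill them with a
poison byte itself if stale users are to be noticed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define POISON_FREE 0x6b        /* byte value used to mark freed objects */
#define DEMO_MAGIC  0x7461736bU /* hypothetical "live object" marker */
#define POOL_SIZE   4           /* hypothetical reserve size */

/* Hypothetical stand-in for a task object; not the kernel's layout. */
struct demo_task {
	unsigned int magic;
	void (*callback)(void *);
	void *calldata;
};

/* Toy pool: a small stack of recycled objects. */
struct demo_pool {
	struct demo_task *slot[POOL_SIZE];
	size_t count;
};

/* Reuse a recycled object if there is one, otherwise malloc() a new one. */
static struct demo_task *demo_pool_alloc(struct demo_pool *pool)
{
	struct demo_task *t;

	if (pool->count > 0)
		t = pool->slot[--pool->count];
	else
		t = malloc(sizeof(*t));
	if (!t)
		return NULL;

	/* Wipe whatever was there (including poison) and mark it live. */
	memset(t, 0, sizeof(*t));
	t->magic = DEMO_MAGIC;
	return t;
}

/* Poison the object, then recycle it (or free() it if the pool is full). */
static void demo_pool_free(struct demo_pool *pool, struct demo_task *t)
{
	assert(t != NULL);
	memset(t, POISON_FREE, sizeof(*t));
	if (pool->count < POOL_SIZE)
		pool->slot[pool->count++] = t;
	else
		free(t);
}

/* Returns 1 if every byte of the object holds the poison value, else 0. */
static int demo_task_is_poisoned(const struct demo_task *t)
{
	const unsigned char *p = (const unsigned char *)t;
	size_t i;

	for (i = 0; i < sizeof(*t); i++)
		if (p[i] != POISON_FREE)
			return 0;
	return 1;
}
```

After demo_pool_free(), a stale pointer to the recycled object reads 0x6b
in every byte instead of plausible-looking data, so a check on the magic
field (or a call through the callback pointer) fails loudly rather than
silently operating on a dead object.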

Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com