* Question on debugging use-after-free memory issues.
@ 2011-06-27 18:12 UTC
From: Ben Greear
To: Linux Kernel Mailing List

I have a case where deleted memory is being passed into an RPC callback.
I enabled SLUB memory poisoning and verified that the data pointed to
has the 0x6b...6b poison value.

Unfortunately, the RPC code is a giant maze of callbacks and I'm having
a difficult time figuring out where this data could be erroneously
deleted.

So, first question: given a pointer to memory, and with SLUB memory
debugging on (and/or other debugging options if applicable), is there a
way to get any info about where the memory was last freed?

Second: any other suggestions for how to go about debugging this?

I hit this problem under load after multiple hours, so just adding
printks in random places may not be feasible...

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
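The 0x6b pattern Ben describes is SLUB's POISON_FREE byte, written over an object when it is freed with poisoning enabled. As a rough sketch (plain userspace C, not a kernel API; the helper name is made up), a pointer suspected of being stale can be screened for that pattern before chasing it further:

```c
#include <stddef.h>
#include <stdbool.h>

/* SLUB fills freed objects with POISON_FREE (0x6b) when slub_debug=P is
 * active.  A quick check like this (hypothetical helper, not kernel API)
 * tells you whether a suspect object still carries the free poison. */
static bool looks_slub_poisoned(const void *obj, size_t len)
{
	const unsigned char *p = obj;
	size_t i;

	for (i = 0; i < len; i++)
		if (p[i] != 0x6b)
			return false;
	return len > 0;
}
```

If the whole object reads back as 0x6b, it was freed and has not yet been reallocated; a partial match usually means the slot has already been reused for something else.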
* Re: Question on debugging use-after-free memory issues.
@ 2011-06-28 22:00 UTC
From: Jiri Kosina
To: Ben Greear; +Cc: Linux Kernel Mailing List

On Mon, 27 Jun 2011, Ben Greear wrote:

> I have a case where deleted memory is being passed into an RPC callback.
> I enabled SLUB memory poisoning and verified that the data pointed to
> has the 0x6b...6b poison value.
>
> Unfortunately, the RPC code is a giant maze of callbacks and I'm having
> a difficult time figuring out where this data could be erroneously
> deleted.
>
> So, first question: given a pointer to memory, and with SLUB memory
> debugging on (and/or other debugging options if applicable), is there a
> way to get any info about where the memory was last freed?
>
> Second: any other suggestions for how to go about debugging this?
>
> I hit this problem under load after multiple hours, so just adding
> printks in random places may not be feasible...

First, this is not really the proper list for such questions; I'd
suggest the kernelnewbies community next time.

Anyway, I'd propose starting with kmemcheck (see
Documentation/kmemcheck.txt). It could pinpoint the problematic spot
immediately (or it might not).

--
Jiri Kosina
SUSE Labs
* Re: Question on debugging use-after-free memory issues.
@ 2011-06-29  5:41 UTC
From: Eric Dumazet
To: Jiri Kosina; +Cc: Ben Greear, Linux Kernel Mailing List

On Wednesday, 29 June 2011 at 00:00 +0200, Jiri Kosina wrote:
> On Mon, 27 Jun 2011, Ben Greear wrote:
>
> > I have a case where deleted memory is being passed into an RPC callback.
> > I enabled SLUB memory poisoning and verified that the data pointed to
> > has the 0x6b...6b poison value.
> >
> > Unfortunately, the RPC code is a giant maze of callbacks and I'm having
> > a difficult time figuring out where this data could be erroneously
> > deleted.
> >
> > So, first question: given a pointer to memory, and with SLUB memory
> > debugging on (and/or other debugging options if applicable), is there a
> > way to get any info about where the memory was last freed?
> >
> > Second: any other suggestions for how to go about debugging this?
> >
> > I hit this problem under load after multiple hours, so just adding
> > printks in random places may not be feasible...
>
> First, this is not really the proper list for such questions; I'd
> suggest the kernelnewbies community next time.

LKML is definitely a place for such questions.

> Anyway, I'd propose starting with kmemcheck (see
> Documentation/kmemcheck.txt). It could pinpoint the problematic spot
> immediately (or it might not).

kmemcheck is fine unless the problem only shows up as an SMP race.
Also, kmemcheck is so slow that it can make a rare bug very hard to
trigger.

Ben, given that you know RPC might have a problem with one particular
small object (struct rpcbind_args), you can afford to change the
kmalloc()/kfree() calls used to allocate/free such objects into calls
to the page allocator, and then not free the page but instead unmap it
from the kernel mapping, so that any further read/write access triggers
a fault. You then get a much more precise picture of what's happening,
without slowing down the whole kernel. Of course this leaks memory for
each "struct rpcbind_args" allocated, so it is a debugging aid only.

DEBUG_PAGEALLOC might be too expensive, so try this patch (untested,
you might need to complete it):

diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 9a80a92..9b4dbaf 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -158,7 +158,7 @@ static void rpcb_map_release(void *data)
 	rpcb_wake_rpcbind_waiters(map->r_xprt, map->r_status);
 	xprt_put(map->r_xprt);
 	kfree(map->r_addr);
-	kfree(map);
+	kernel_map_pages(virt_to_page(map), 1, 0);
 }
 
 /*
@@ -668,7 +668,7 @@ void rpcb_getport_async(struct rpc_task *task)
 		goto bailout_nofree;
 	}
 
-	map = kzalloc(sizeof(struct rpcbind_args), GFP_ATOMIC);
+	map = (struct rpcbind_args *)__get_free_page(GFP_ATOMIC | __GFP_ZERO);
 	if (!map) {
 		status = -ENOMEM;
 		dprintk("RPC: %5u %s: no memory available\n",
* Re: Question on debugging use-after-free memory issues.
@ 2011-06-29  6:01 UTC
From: Ben Greear
To: Eric Dumazet; +Cc: Jiri Kosina, Linux Kernel Mailing List

On 06/28/2011 10:41 PM, Eric Dumazet wrote:
> On Wednesday, 29 June 2011 at 00:00 +0200, Jiri Kosina wrote:
>> On Mon, 27 Jun 2011, Ben Greear wrote:
>>
>> Anyway, I'd propose starting with kmemcheck (see
>> Documentation/kmemcheck.txt). It could pinpoint the problematic spot
>> immediately (or it might not).
>
> kmemcheck is fine unless the problem only shows up as an SMP race.
> Also, kmemcheck is so slow that it can make a rare bug very hard to
> trigger.

I think I've pretty much verified, with the SLUB patches I posted, that
deleted memory is passed down a certain call path. What I can't figure
out is how that came to be.

> Ben, given that you know RPC might have a problem with one particular
> small object (struct rpcbind_args), you can afford to change the
> kmalloc()/kfree() calls used to allocate/free such objects into calls
> to the page allocator, and then not free the page but instead unmap it
> from the kernel mapping, so that any further read/write access triggers
> a fault. You then get a much more precise picture of what's happening,
> without slowing down the whole kernel. Of course this leaks memory for
> each "struct rpcbind_args" allocated, so it is a debugging aid only.
> DEBUG_PAGEALLOC might be too expensive, so try this patch (untested,
> you might need to complete it):
>
> diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
> index 9a80a92..9b4dbaf 100644
> --- a/net/sunrpc/rpcb_clnt.c
> +++ b/net/sunrpc/rpcb_clnt.c
> @@ -158,7 +158,7 @@ static void rpcb_map_release(void *data)
>  	rpcb_wake_rpcbind_waiters(map->r_xprt, map->r_status);
>  	xprt_put(map->r_xprt);
>  	kfree(map->r_addr);
> -	kfree(map);
> +	kernel_map_pages(virt_to_page(map), 1, 0);
>  }
>
>  /*
> @@ -668,7 +668,7 @@ void rpcb_getport_async(struct rpc_task *task)
>  		goto bailout_nofree;
>  	}
>
> -	map = kzalloc(sizeof(struct rpcbind_args), GFP_ATOMIC);
> +	map = (struct rpcbind_args *)__get_free_page(GFP_ATOMIC | __GFP_ZERO);
>  	if (!map) {
>  		status = -ENOMEM;
>  		dprintk("RPC: %5u %s: no memory available\n",

It possibly takes hours of heavy load to hit the problem, so I don't
think I can afford to leak that much memory.

Interestingly, I added the code below and haven't hit the problem
since. I'm not sure if it just changed the timing, or what... or maybe
I'll hit it overnight. I also tried setting this memset (below) to 0x6b
instead of 0x0 (mempool doesn't really kmalloc/kfree very often, so the
SLUB poisoning doesn't help here), but never did hit the bug again.

I suspect that somehow the task object is still on the work-queue when
it is deleted, but since neither the 0x6b nor the 0x0 poisoning caused
any funny crashes, I could easily be wrong about that.

diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 17c3e3a..d94f009 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -859,6 +859,12 @@ static void rpc_free_task(struct rpc_task *task)
 
 	if (task->tk_flags & RPC_TASK_DYNAMIC) {
 		dprintk("RPC: %5u freeing task\n", task->tk_pid);
+		/* HACK: Have been seeing use-after-free of calldata. Zero this
+		 * memory so that it cannot happen here. Seems to have fixed the
+		 * problem in the 3.0 kernel, but maybe it just adjusted the
+		 * timing... either way, it's not a real fix. --Ben
+		 */
+		memset(task, 0, sizeof(*task));
 		mempool_free(task, rpc_task_mempool);
 	}
 	rpc_release_calldata(tk_ops, calldata);

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
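An alternative to zeroing the whole task, sketched below with made-up names (struct fake_task, TASK_DEAD_MAGIC; none of this is real sunrpc code), is to stamp a magic value on free and check for it at each use site. Unlike memset(0), a hit then identifies itself as a use-after-free instead of just turning the crash into a quieter NULL dereference:

```c
/* Poison-with-magic sketch: mark the object dead on free and let use
 * sites assert on the stamp.  All names here are illustrative. */
#define TASK_DEAD_MAGIC 0xdeadbeefu

struct fake_task {
	unsigned int magic;	/* debug stamp, checked on every use */
	int tk_pid;
};

static void task_mark_dead(struct fake_task *t)
{
	t->magic = TASK_DEAD_MAGIC;	/* call just before freeing */
}

static int task_is_dead(const struct fake_task *t)
{
	return t->magic == TASK_DEAD_MAGIC;
}
```

In the kernel one would pair this with a WARN_ON(task_is_dead(task)) at the top of each callback, so the first stale use logs a backtrace pointing at the offending path.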