From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Mason
Subject: Re: Memory leak?
Date: Fri, 08 Jul 2011 12:17:54 -0400
Message-ID: <1310141768-sup-424@shiny>
References: <20110703190913.GA4474@yahoo.fr> <20110706081111.GA6931@yahoo.fr> <20110708124429.GB4284@yahoo.fr> <1310137241-sup-8158@shiny> <20110708154123.GA17886@yahoo.fr> <20110708161103.GD4284@yahoo.fr>
Content-Type: text/plain; charset=UTF-8
Cc: cwillu , linux-btrfs
To: Stephane Chazelas
Return-path:
In-reply-to: <20110708161103.GD4284@yahoo.fr>
List-ID:

Excerpts from Stephane Chazelas's message of 2011-07-08 12:11:03 -0400:
> 2011-07-08 16:41:23 +0100, Stephane Chazelas:
> > 2011-07-08 11:06:08 -0400, Chris Mason:
> > [...]
> > > So the invalid opcode in btrfs-fixup-0 is the big problem. We're
> > > either failing to write because we weren't able to allocate memory (and
> > > not dealing with it properly) or there is a bigger problem.
> > >
> > > Does the btrfs-fixup-0 oops come before or after the ooms?
> >
> > Hi Chris, thanks for looking into this.
> >
> > It comes long before. Hours before there's any problem. So it
> > seems unrelated.
>
> Though every time I had the issue, there had been such an
> "invalid opcode" before. But also, I only had both the "invalid
> opcode" and memory issue when doing that rsync onto external
> hard drive.
>
> > > Please send along any oops output during the run. Only the first
> > > (earliest) oops matters.
> >
> > There's always only one in between two reboots. I've sent two
> > already, but here they are:
> [...]
>
> I dug up the traces for before I switched to debian (thinking
> getting a newer kernel would improve matters) in case it helps:
>
> And:
>
> Jun 5 00:58:10 BUG: Bad page state in process rsync pfn:1bfdf
> Jun 5 00:58:10 page:ffffea000061f8c8 count:0 mapcount:0 mapping: (null) index:0x2300
> Jun 5 00:58:10 page flags: 0x100000000000010(dirty)
> Jun 5 00:58:10 Pid: 1584, comm: rsync Tainted: G D C 2.6.38-7-server #35-Ubuntu
> Jun 5 00:58:10 Call Trace:

Ok, this one is really interesting. Did you get this after another oops
or was it after a reboot?

How easily can you recompile your kernel with more debugging flags?
That should help narrow it down. I'm looking for CONFIG_DEBUG_SLAB (or
CONFIG_SLUB_DEBUG if you use slub) and CONFIG_DEBUG_PAGEALLOC

-chris
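
A quick way to see whether a given kernel config already has these
debugging options turned on is sketched below. This is only an
illustration, not part of any btrfs or kernel tooling: the helper names
(parse_kconfig, debug_status) are hypothetical, and it assumes the usual
mainline spellings CONFIG_DEBUG_SLAB, CONFIG_SLUB_DEBUG and
CONFIG_DEBUG_PAGEALLOC. The config file path is whatever you pass on the
command line.

```python
import re
import sys


def parse_kconfig(text):
    """Map each CONFIG_ option in a kernel .config text to its value.

    Lines of the form '# CONFIG_FOO is not set' are recorded as 'n'.
    """
    opts = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def debug_status(text):
    """Report whether the two kinds of debugging asked for are enabled.

    Hypothetical helper: allocator debugging counts as enabled if either
    the slab or the slub variant is set to 'y'.
    """
    opts = parse_kconfig(text)
    return {
        "allocator_debug": opts.get("CONFIG_DEBUG_SLAB") == "y"
        or opts.get("CONFIG_SLUB_DEBUG") == "y",
        "pagealloc_debug": opts.get("CONFIG_DEBUG_PAGEALLOC") == "y",
    }


if __name__ == "__main__":
    # Usage: python check_config.py <path-to-kernel-config>
    if len(sys.argv) != 2:
        sys.exit("usage: check_config.py <path-to-kernel-config>")
    with open(sys.argv[1]) as fh:
        for name, enabled in debug_status(fh.read()).items():
            print(f"{name}: {'enabled' if enabled else 'MISSING'}")
```

If either line prints MISSING, the option still needs to be switched on
before rebuilding.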