linux-scsi.vger.kernel.org archive mirror
* blk-mq vs kmemleak
@ 2015-07-03 16:11 Dave Jones
  2015-07-03 17:04 ` Bart Van Assche
  0 siblings, 1 reply; 12+ messages in thread
From: Dave Jones @ 2015-07-03 16:11 UTC (permalink / raw)
  To: linux-scsi

After a fuzzing run recently, I noticed that the machine had oom'd and
killed everything, but there was still 3GB of memory in use that I
couldn't reclaim even with /proc/sys/vm/drop_caches.

So I enabled kmemleak. After applying this..

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index cf79f110157c..6dc18dbad9ec 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -553,8 +553,8 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
 
        object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
        if (!object) {
-               pr_warning("Cannot allocate a kmemleak_object structure\n");
-               kmemleak_disable();
+               //pr_warning("Cannot allocate a kmemleak_object structure\n");
+               //kmemleak_disable();
                return NULL;
        }

otherwise it would disable itself within a minute of runtime.

I notice now that I'm seeing a lot of traces like this..


unreferenced object 0xffff8800ba8202c0 (size 320):
  comm "kworker/u4:1", pid 38, jiffies 4294741176 (age 46887.690s)
  hex dump (first 32 bytes):
    21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
    [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
    [<ffffffff89165963>] mempool_alloc+0x63/0x180
    [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
    [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
    [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
    [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
    [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
    [<ffffffff8946da59>] sd_init_command+0x59/0xe40
    [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
    [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
    [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
    [<ffffffff892f5e5e>] blk_mq_run_hw_queue+0x9e/0x120
    [<ffffffff892f7524>] blk_mq_insert_requests+0xd4/0x1a0
    [<ffffffff892f8273>] blk_mq_flush_plug_list+0x123/0x140

unreferenced object 0xffff8800ba824800 (size 640):
  comm "trinity-c2", pid 3687, jiffies 4294843075 (age 46785.966s)
  hex dump (first 32 bytes):
    21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
    [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
    [<ffffffff89165963>] mempool_alloc+0x63/0x180
    [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
    [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
    [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
    [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
    [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
    [<ffffffff8946da59>] sd_init_command+0x59/0xe40
    [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
    [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
    [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
    [<ffffffff892f5e5e>] blk_mq_run_hw_queue+0x9e/0x120
    [<ffffffff892f7524>] blk_mq_insert_requests+0xd4/0x1a0
    [<ffffffff892f8273>] blk_mq_flush_plug_list+0x123/0x140

unreferenced object 0xffff8800a9fe6780 (size 2560):
  comm "kworker/1:1H", pid 171, jiffies 4294843118 (age 46785.923s)
  hex dump (first 32 bytes):
    21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
    [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
    [<ffffffff89165963>] mempool_alloc+0x63/0x180
    [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
    [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
    [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
    [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
    [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
    [<ffffffff8946da59>] sd_init_command+0x59/0xe40
    [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
    [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
    [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
    [<ffffffff892f66b2>] blk_mq_run_work_fn+0x12/0x20
    [<ffffffff8908eba7>] process_one_work+0x147/0x420
    [<ffffffff8908f209>] worker_thread+0x69/0x470

The sizes vary, but the hex dump is always the same.

What's the usual completion path where these would get deallocated?
I'm wondering if there's just some annotation missing to appease kmemleak,
because I'm seeing thousands of these.

Or it could be a real leak, but it seems surprising no-one else is complaining.
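
For reference, the only teardown path I can find by inspection is
roughly this (my possibly-wrong reading of scsi_lib.c, so treat it as
a sketch):

  scsi_end_request()
    scsi_mq_uninit_cmd()
      scsi_mq_free_sgtables()
        scsi_free_sgtable()
          __sg_free_table(&sdb->table, SCSI_MAX_SG_SEGMENTS, mq, scsi_sg_free)
            scsi_sg_free()
              mempool_free(sgl, sgp->pool)    <- pairs with scsi_sg_alloc()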

	Dave



* Re: blk-mq vs kmemleak
  2015-07-03 16:11 blk-mq vs kmemleak Dave Jones
@ 2015-07-03 17:04 ` Bart Van Assche
  2015-07-03 17:07   ` Dave Jones
  2015-07-07 10:33   ` Catalin Marinas
  0 siblings, 2 replies; 12+ messages in thread
From: Bart Van Assche @ 2015-07-03 17:04 UTC (permalink / raw)
  To: Dave Jones, Catalin Marinas; +Cc: linux-scsi@vger.kernel.org

On 07/03/15 09:11, Dave Jones wrote:
> After a fuzzing run recently, I noticed that the machine had oom'd and
> killed everything, but there was still 3GB of memory in use that I
> couldn't reclaim even with /proc/sys/vm/drop_caches.
>
> So I enabled kmemleak. After applying this..
>
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index cf79f110157c..6dc18dbad9ec 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -553,8 +553,8 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>
>          object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
>          if (!object) {
> -               pr_warning("Cannot allocate a kmemleak_object structure\n");
> -               kmemleak_disable();
> +               //pr_warning("Cannot allocate a kmemleak_object structure\n");
> +               //kmemleak_disable();
>                  return NULL;
>          }
>
> otherwise it would disable itself within a minute of runtime.
>
> I notice now that I'm seeing a lot of traces like this..
>
>
> unreferenced object 0xffff8800ba8202c0 (size 320):
>    comm "kworker/u4:1", pid 38, jiffies 4294741176 (age 46887.690s)
>    hex dump (first 32 bytes):
>      21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>    backtrace:
>      [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
>      [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
>      [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
>      [<ffffffff89165963>] mempool_alloc+0x63/0x180
>      [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
>      [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
>      [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
>      [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
>      [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
>      [<ffffffff8946da59>] sd_init_command+0x59/0xe40
>      [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
>      [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
>      [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
>      [<ffffffff892f5e5e>] blk_mq_run_hw_queue+0x9e/0x120
>      [<ffffffff892f7524>] blk_mq_insert_requests+0xd4/0x1a0
>      [<ffffffff892f8273>] blk_mq_flush_plug_list+0x123/0x140
>
> [...]
>
> The sizes vary, but the hex dump is always the same.
>
> What's the usual completion path where these would get deallocated?
> I'm wondering if there's just some annotation missing to appease kmemleak,
> because I'm seeing thousands of these.
>
> Or it could be a real leak, but it seems surprising no-one else is complaining.

(+Catalin)

Dave, with which kernel version has this behavior been observed?

Catalin, can you recommend which patches Dave Jones should apply to 
kmemleak? A few weeks ago I had noticed similar kmemleak reports.
However, when I reran my test with kmemleak disabled, memory usage was
stable. See also 
https://www.redhat.com/archives/dm-devel/2015-May/msg00198.html.

Thanks,

Bart.



* Re: blk-mq vs kmemleak
  2015-07-03 17:04 ` Bart Van Assche
@ 2015-07-03 17:07   ` Dave Jones
  2015-07-07 10:33   ` Catalin Marinas
  1 sibling, 0 replies; 12+ messages in thread
From: Dave Jones @ 2015-07-03 17:07 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Catalin Marinas, linux-scsi@vger.kernel.org

On Fri, Jul 03, 2015 at 10:04:00AM -0700, Bart Van Assche wrote:
 > On 07/03/15 09:11, Dave Jones wrote:
 > > After a fuzzing run recently, I noticed that the machine had oom'd and
 > > killed everything, but there was still 3GB of memory in use that I
 > > couldn't reclaim even with /proc/sys/vm/drop_caches.
 > > ...
 > > I'm wondering if there's just some annotation missing to appease kmemleak,
 > > because I'm seeing thousands of these.
 > >
 > > Or it could be a real leak, but it seems surprising no-one else is complaining.
 > 
 > Dave, with which kernel version has this behavior been observed?

Linus' current tree

	Dave



* Re: blk-mq vs kmemleak
  2015-07-03 17:04 ` Bart Van Assche
  2015-07-03 17:07   ` Dave Jones
@ 2015-07-07 10:33   ` Catalin Marinas
  2015-07-07 13:59     ` Bart Van Assche
  1 sibling, 1 reply; 12+ messages in thread
From: Catalin Marinas @ 2015-07-07 10:33 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Dave Jones, linux-scsi@vger.kernel.org

On Fri, Jul 03, 2015 at 06:04:00PM +0100, Bart Van Assche wrote:
> On 07/03/15 09:11, Dave Jones wrote:
> > After a fuzzing run recently, I noticed that the machine had oom'd and
> > killed everything, but there was still 3GB of memory in use that I
> > couldn't reclaim even with /proc/sys/vm/drop_caches.
> >
> > So I enabled kmemleak. After applying this..
> >
> > diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> > index cf79f110157c..6dc18dbad9ec 100644
> > --- a/mm/kmemleak.c
> > +++ b/mm/kmemleak.c
> > @@ -553,8 +553,8 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
> >
> >          object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
> >          if (!object) {
> > -               pr_warning("Cannot allocate a kmemleak_object structure\n");
> > -               kmemleak_disable();
> > +               //pr_warning("Cannot allocate a kmemleak_object structure\n");
> > +               //kmemleak_disable();
> >                  return NULL;
> >          }
> >
> > otherwise it would disable itself within a minute of runtime.
> >
> > I notice now that I'm seeing a lot of traces like this..
> >
> >
> > unreferenced object 0xffff8800ba8202c0 (size 320):
> >    comm "kworker/u4:1", pid 38, jiffies 4294741176 (age 46887.690s)
> >    hex dump (first 32 bytes):
> >      21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
> >      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

If kmemleak fails to allocate its metadata, the object referencing graph
is broken, so whatever kmemleak prints afterwards can no longer be
trusted. For example, if objects form a linked list and a block in the
middle of the list is one kmemleak doesn't know about and won't scan,
everything that block references will be reported as a leak.
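
A contrived sketch of that failure mode (hypothetical code, error
handling omitted, purely to illustrate the scanning gap):

#include <linux/slab.h>

struct node {
	struct node *next;
};

static struct node *head;	/* static data: a kmemleak scanning root */

static void build_list(void)
{
	struct node *a = kmalloc(sizeof(*a), GFP_KERNEL); /* tracked */
	struct node *b = kmalloc(sizeof(*b), GFP_KERNEL); /* assume kmemleak's
							     metadata alloc
							     failed for this */
	struct node *c = kmalloc(sizeof(*c), GFP_KERNEL); /* tracked */

	head = a;
	a->next = b;	/* a is scanned, so b's address is seen */
	b->next = c;	/* b is never scanned: the only reference to c is
			   invisible, and reachable c is reported as a leak */
	c->next = NULL;
}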

After kmemleak disables itself, does it still show any potential leaks?
Even that is not entirely to be trusted. A better start would be to look
at /proc/slabinfo.

> Catalin, can you recommend which patches Dave Jones should apply to 
> kmemleak? 

All kmemleak patches I had are in mainline (as of 4.2-rc1) but none of
them targets leaks (or false positives).

> A few weeks ago I had noticed similar kmemleak reports. 
> However, when I reran my test with kmemleak disabled, memory usage was 
> stable. See also 
> https://www.redhat.com/archives/dm-devel/2015-May/msg00198.html.

Further down in this thread it gets interesting: the kmalloc-96 objects
disappear from /proc/slabinfo when kmemleak is disabled. Kmemleak does
not allocate such objects; the two types it allocates live in separate
kmem_caches, kmemleak_object and kmemleak_scan_area. However, the 96
size is reported by kmemleak as well, so maybe it was fixed by some
other patch in the meantime?

If you manage to reproduce it again please let me know. Thanks.

-- 
Catalin



* Re: blk-mq vs kmemleak
  2015-07-07 10:33   ` Catalin Marinas
@ 2015-07-07 13:59     ` Bart Van Assche
  2015-07-08  8:17       ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2015-07-07 13:59 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Dave Jones, linux-scsi@vger.kernel.org

On 07/07/15 03:33, Catalin Marinas wrote:
> On Fri, Jul 03, 2015 at 06:04:00PM +0100, Bart Van Assche wrote:
>> A few weeks ago I had noticed similar kmemleak reports.
>> However, when I reran my test with kmemleak disabled, memory usage was
>> stable. See also
>> https://www.redhat.com/archives/dm-devel/2015-May/msg00198.html.
>
> Further down in this thread it gets interesting: the kmalloc-96 objects
> disappear from /proc/slabinfo when kmemleak is disabled. Kmemleak does
> not allocate such objects; the two types it allocates live in separate
> kmem_caches, kmemleak_object and kmemleak_scan_area. However, the 96
> size is reported by kmemleak as well, so maybe it was fixed by some
> other patch in the meantime?
>
> If you manage to reproduce it again please let me know. Thanks.

Hello Catalin,

Please note that my test was run with CONFIG_SLUB_DEBUG=y, which causes a
red zone to be allocated before and after each block of allocated
memory. Could that explain the kmalloc-96 objects?

Thanks,

Bart.


* Re: blk-mq vs kmemleak
  2015-07-07 13:59     ` Bart Van Assche
@ 2015-07-08  8:17       ` Christoph Hellwig
  2015-08-01  0:37         ` Bart Van Assche
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2015-07-08  8:17 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Catalin Marinas, Dave Jones, linux-scsi@vger.kernel.org

On Tue, Jul 07, 2015 at 06:59:37AM -0700, Bart Van Assche wrote:
> Please note that my test was run with CONFIG_SLUB_DEBUG=y, which causes a red
> zone to be allocated before and after each block of allocated memory. Could
> that explain the kmalloc-96 objects?

96 is almost guaranteed to be the sense buffer allocated in
scsi_init_request and freed in scsi_exit_request.
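
For reference, those looked roughly like this in the 4.2-era scsi_lib.c
(a from-memory sketch, not the verbatim code). SCSI_SENSE_BUFFERSIZE is
96, which matches the kmalloc-96 size class being discussed:

static int scsi_init_request(void *data, struct request *rq,
		unsigned int hctx_idx, unsigned int request_idx,
		unsigned int numa_node)
{
	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);

	/* 96 bytes, so it lands in the kmalloc-96 cache */
	cmd->sense_buffer = kzalloc_node(SCSI_SENSE_BUFFERSIZE, GFP_KERNEL,
					 numa_node);
	if (!cmd->sense_buffer)
		return -ENOMEM;

	return 0;
}

static void scsi_exit_request(void *data, struct request *rq,
		unsigned int hctx_idx, unsigned int request_idx)
{
	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);

	kfree(cmd->sense_buffer);
}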


* Re: blk-mq vs kmemleak
  2015-07-08  8:17       ` Christoph Hellwig
@ 2015-08-01  0:37         ` Bart Van Assche
  2015-08-03 10:43           ` Catalin Marinas
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2015-08-01  0:37 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Catalin Marinas, Dave Jones, linux-scsi@vger.kernel.org

On 07/08/2015 01:17 AM, Christoph Hellwig wrote:
> On Tue, Jul 07, 2015 at 06:59:37AM -0700, Bart Van Assche wrote:
>> Please note that my test was run with CONFIG_SLUB_DEBUG=y, which causes a red
>> zone to be allocated before and after each block of allocated memory. Could
>> that explain the kmalloc-96 objects?
>
> 96 is almost guaranteed to be the sense buffer allocated in
> scsi_init_request and freed in scsi_exit_request.

Hello Catalin and Christoph,

kmemleak still reports large numbers of unreferenced objects for the
scsi-mq code with the v4.2-rc4 kernel, even with the recently posted
scsi-mq leak fix applied on top. Here is an example of one such report:

unreferenced object 0xffff88045e05dc28 (size 96):
   comm "srp_daemon", pid 8461, jiffies 4294973034 (age 742.350s)
   hex dump (first 32 bytes):
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<ffffffff814f2ede>] kmemleak_alloc+0x4e/0xb0
     [<ffffffff811b0678>] kmem_cache_alloc_trace+0xc8/0x2d0
     [<ffffffffa006cc37>] scsi_init_request+0x27/0x40 [scsi_mod]
     [<ffffffff81278b91>] blk_mq_init_rq_map+0x1d1/0x260
     [<ffffffff81278cc4>] blk_mq_alloc_tag_set+0xa4/0x1f0
     [<ffffffffa006fb0d>] scsi_mq_setup_tags+0xcd/0xd0 [scsi_mod]
     [<ffffffffa0066464>] scsi_add_host_with_dma+0x74/0x2e0 [scsi_mod]
     [<ffffffffa0478e12>] srp_create_target+0xe12/0x1320 [ib_srp]
     [<ffffffff8138a728>] dev_attr_store+0x18/0x30
     [<ffffffff812371f8>] sysfs_kf_write+0x48/0x60
     [<ffffffff812367f4>] kernfs_fop_write+0x144/0x190
     [<ffffffff811bdaf8>] __vfs_write+0x28/0xe0
     [<ffffffff811be1a9>] vfs_write+0xa9/0x190
     [<ffffffff811bef09>] SyS_write+0x49/0xa0
     [<ffffffff815022f2>] entry_SYSCALL_64_fastpath+0x16/0x7a
     [<ffffffffffffffff>] 0xffffffffffffffff

Thanks,

Bart.


* Re: blk-mq vs kmemleak
  2015-08-01  0:37         ` Bart Van Assche
@ 2015-08-03 10:43           ` Catalin Marinas
  2015-08-03 13:33             ` Christoph Hellwig
  2015-08-03 17:05             ` Bart Van Assche
  0 siblings, 2 replies; 12+ messages in thread
From: Catalin Marinas @ 2015-08-03 10:43 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Christoph Hellwig, Dave Jones, linux-scsi@vger.kernel.org

Hi Bart,

On Sat, Aug 01, 2015 at 01:37:02AM +0100, Bart Van Assche wrote:
> On 07/08/2015 01:17 AM, Christoph Hellwig wrote:
> > On Tue, Jul 07, 2015 at 06:59:37AM -0700, Bart Van Assche wrote:
> >> Please note that my test was run with CONFIG_SLUB_DEBUG=y, which causes a red
> >> zone to be allocated before and after each block of allocated memory. Could
> >> that explain the kmalloc-96 objects?
> >
> > 96 is almost guaranteed to be the sense buffer allocated in
> > scsi_init_request and freed in scsi_exit_request.
> 
> kmemleak still reports large numbers of unreferenced objects for the
> scsi-mq code with the v4.2-rc4 kernel, even with the recently posted
> scsi-mq leak fix applied on top. Here is an example of one such report:
> 
> unreferenced object 0xffff88045e05dc28 (size 96):
>    comm "srp_daemon", pid 8461, jiffies 4294973034 (age 742.350s)
>    hex dump (first 32 bytes):
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>    backtrace:
>      [<ffffffff814f2ede>] kmemleak_alloc+0x4e/0xb0
>      [<ffffffff811b0678>] kmem_cache_alloc_trace+0xc8/0x2d0
>      [<ffffffffa006cc37>] scsi_init_request+0x27/0x40 [scsi_mod]
>      [<ffffffff81278b91>] blk_mq_init_rq_map+0x1d1/0x260
>      [<ffffffff81278cc4>] blk_mq_alloc_tag_set+0xa4/0x1f0
>      [<ffffffffa006fb0d>] scsi_mq_setup_tags+0xcd/0xd0 [scsi_mod]
>      [<ffffffffa0066464>] scsi_add_host_with_dma+0x74/0x2e0 [scsi_mod]
>      [<ffffffffa0478e12>] srp_create_target+0xe12/0x1320 [ib_srp]
>      [<ffffffff8138a728>] dev_attr_store+0x18/0x30
>      [<ffffffff812371f8>] sysfs_kf_write+0x48/0x60
>      [<ffffffff812367f4>] kernfs_fop_write+0x144/0x190
>      [<ffffffff811bdaf8>] __vfs_write+0x28/0xe0
>      [<ffffffff811be1a9>] vfs_write+0xa9/0x190
>      [<ffffffff811bef09>] SyS_write+0x49/0xa0
>      [<ffffffff815022f2>] entry_SYSCALL_64_fastpath+0x16/0x7a
>      [<ffffffffffffffff>] 0xffffffffffffffff

If I read the code correctly, I think this is a false positive. In
blk_mq_init_rq_map(), the pages backing tags->rqs[i] are allocated with
alloc_pages_node(). The scsi_cmnd is placed after the request structure
in the same page, and the sense_buffer pointer is stored in the
scsi_cmnd structure. Since kmemleak does not track page allocations, it
never scans the scsi_cmnd structure for the sense_buffer pointer, and
therefore reports those buffers as leaks.
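
(For the record, blk_mq_rq_to_pdu() is plain pointer arithmetic, which
is why the scsi_cmnd ends up inside the same untracked page; from
include/linux/blk-mq.h:

static inline void *blk_mq_rq_to_pdu(struct request *rq)
{
	return rq + 1;	/* the driver pdu follows struct request in the
			   page-backed chunk carved out by
			   blk_mq_init_rq_map() */
}
)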

The simplest fix would be to add a kmemleak_not_leak() annotation in
scsi_init_request(), though that would hide real leaks (if any).
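An untested sketch of that annotation in scsi_init_request():

	cmd->sense_buffer = kzalloc_node(SCSI_SENSE_BUFFERSIZE, GFP_KERNEL,
					 numa_node);
	if (!cmd->sense_buffer)
		return -ENOMEM;
	/* untested: mark the buffer as not leaked; the reports go away,
	   but a genuine leak of the sense buffer would also be hidden */
	kmemleak_not_leak(cmd->sense_buffer);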

A better way could be to inform kmemleak of these pages, something like
below (compile-tested only):

--------8<-------------
From b5526babaf4b991fcc530c15563bc9b333f7c86a Mon Sep 17 00:00:00 2001
From: Catalin Marinas <catalin.marinas@arm.com>
Date: Mon, 3 Aug 2015 11:30:24 +0100
Subject: [PATCH] block: kmemleak: Track the page allocations for struct
 request

The pages allocated for struct request contain pointers to other slab
allocations (via ops->init_request). Since kmemleak does not track/scan
page allocations, the slab objects will be reported as leaks (false
positives). This patch adds kmemleak callbacks to allow tracking of such
pages.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
---
 block/blk-mq.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7d842db59699..c5fe656fe244 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -9,6 +9,7 @@
 #include <linux/backing-dev.h>
 #include <linux/bio.h>
 #include <linux/blkdev.h>
+#include <linux/kmemleak.h>
 #include <linux/mm.h>
 #include <linux/init.h>
 #include <linux/slab.h>
@@ -1448,6 +1449,11 @@ static void blk_mq_free_rq_map(struct blk_mq_tag_set *set,
 	while (!list_empty(&tags->page_list)) {
 		page = list_first_entry(&tags->page_list, struct page, lru);
 		list_del_init(&page->lru);
+		/*
+		 * Remove kmemleak object previously allocated in
+		 * blk_mq_init_rq_map().
+		 */
+		kmemleak_free(page_address(page));
 		__free_pages(page, page->private);
 	}
 
@@ -1520,6 +1526,11 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set,
 		list_add_tail(&page->lru, &tags->page_list);
 
 		p = page_address(page);
+		/*
+		 * Allow kmemleak to scan these pages as they contain pointers
+		 * to additional allocations like via ops->init_request().
+		 */
+		kmemleak_alloc(p, order_to_size(this_order), 1, GFP_KERNEL);
 		entries_per_page = order_to_size(this_order) / rq_size;
 		to_do = min(entries_per_page, set->queue_depth - i);
 		left -= to_do * rq_size;
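
For completeness, the kmemleak_alloc() prototype used above is:

void kmemleak_alloc(const void *ptr, size_t size, int min_count, gfp_t gfp);

With min_count == 1 the region is leak-checked like any other object
(the tags->rqs[] entries pointing into it keep it referenced), and,
more importantly, kmemleak now scans it, which is what makes the
sense_buffer pointers stored inside these pages visible.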



* Re: blk-mq vs kmemleak
  2015-08-03 10:43           ` Catalin Marinas
@ 2015-08-03 13:33             ` Christoph Hellwig
  2015-08-03 15:34               ` Catalin Marinas
  2015-08-03 17:05             ` Bart Van Assche
  1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2015-08-03 13:33 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Bart Van Assche, Christoph Hellwig, Dave Jones,
	linux-scsi@vger.kernel.org

On Mon, Aug 03, 2015 at 11:43:09AM +0100, Catalin Marinas wrote:
> The simplest would be to add a kmemleak_not_leak() annotation in
> scsi_init_request(), though you would hide real leaks (if any).
> 
> A better way could be to inform kmemleak of these pages, something like
> below (compile-tested only):

Both versions sound reasonable to me, but I'd prefer the second one if
it works.


* Re: blk-mq vs kmemleak
  2015-08-03 13:33             ` Christoph Hellwig
@ 2015-08-03 15:34               ` Catalin Marinas
  0 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2015-08-03 15:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Bart Van Assche, Dave Jones, linux-scsi@vger.kernel.org

On Mon, Aug 03, 2015 at 02:33:27PM +0100, Christoph Hellwig wrote:
> On Mon, Aug 03, 2015 at 11:43:09AM +0100, Catalin Marinas wrote:
> > The simplest would be to add a kmemleak_not_leak() annotation in
> > scsi_init_request(), though you would hide real leaks (if any).
> > 
> > A better way could be to inform kmemleak of these pages, something like
> > below (compile-tested only):
> 
> Both versions sound reasonable to me, but I'd prefer the second one if
> it works.

I don't currently have a system where scsi_init_request() is used, but I
hacked loop_init_request() (and added a loop_exit_request()) to allocate
a dummy structure. Without this patch I indeed get a few hundred
kmemleak reports, which disappear once I apply it.
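
Roughly what the hack looked like (a reconstruction; the ->dummy field
is invented for the test, and the functions are wired into the loop
driver's blk_mq_ops):

static int loop_init_request(void *data, struct request *rq,
		unsigned int hctx_idx, unsigned int request_idx,
		unsigned int numa_node)
{
	struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);

	cmd->dummy = kzalloc(64, GFP_KERNEL);	/* invented field */
	return cmd->dummy ? 0 : -ENOMEM;
}

static void loop_exit_request(void *data, struct request *rq,
		unsigned int hctx_idx, unsigned int request_idx)
{
	struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);

	kfree(cmd->dummy);
}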

-- 
Catalin



* Re: blk-mq vs kmemleak
  2015-08-03 10:43           ` Catalin Marinas
  2015-08-03 13:33             ` Christoph Hellwig
@ 2015-08-03 17:05             ` Bart Van Assche
  2015-08-03 17:50               ` Catalin Marinas
  1 sibling, 1 reply; 12+ messages in thread
From: Bart Van Assche @ 2015-08-03 17:05 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Christoph Hellwig, Dave Jones, linux-scsi@vger.kernel.org

On 08/03/2015 03:43 AM, Catalin Marinas wrote:
> The pages allocated for struct request contain pointers to other slab
> allocations (via ops->init_request). Since kmemleak does not track/scan
> page allocations, the slab objects will be reported as leaks (false
> positives). This patch adds kmemleak callbacks to allow tracking of such
> pages.
>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>

That patch works fine on my test setup, hence:

Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>

Thanks!

Bart.


* Re: blk-mq vs kmemleak
  2015-08-03 17:05             ` Bart Van Assche
@ 2015-08-03 17:50               ` Catalin Marinas
  0 siblings, 0 replies; 12+ messages in thread
From: Catalin Marinas @ 2015-08-03 17:50 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Dave Jones, linux-scsi@vger.kernel.org,
	Jens Axboe

On Mon, Aug 03, 2015 at 06:05:59PM +0100, Bart Van Assche wrote:
> On 08/03/2015 03:43 AM, Catalin Marinas wrote:
> > The pages allocated for struct request contain pointers to other slab
> > allocations (via ops->init_request). Since kmemleak does not track/scan
> > page allocations, the slab objects will be reported as leaks (false
> > positives). This patch adds kmemleak callbacks to allow tracking of such
> > pages.
> >
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
> 
> That patch works fine on my test setup, hence:
> 
> Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>

Thanks for testing.

Cc'ing the block layer maintainer since the patch touches this part.
Jens, are you OK with the patch posted below? If yes, would you like me
to re-post it, or are you happy to pick it up from the list:

http://lkml.kernel.org/r/20150803104309.GB4033@e104818-lin.cambridge.arm.com

Thanks.

-- 
Catalin


