linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown
@ 2017-03-29  2:29 Junichi Nomura
  2017-03-29 11:17 ` Johannes Thumshirn
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Junichi Nomura @ 2017-03-29  2:29 UTC (permalink / raw)
  To: linux-scsi, dick.kennedy@broadcom.com, james.smart@broadcom.com
  Cc: anton@samba.org, martin.petersen@oracle.com

Since commit 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications"),
"rmmod lpfc" starting to cause panic or corruption due to double free.

The double-free occurs as followings:
  - During initialization, lpfc_create_wq_cq() binds cq and wq to
    the same ring in the way that both cq->pring and wq->pring point
    to the same object.
  - Upon removal, lpfc_sli4_queue_destroy() ends up calling
    lpfc_sli4_queue_free() for both wqs and cqs
    and kfree(queue->pring) is done twice.

The problem became more visible in v4.11-rc3 because commit 85e8a23936ab
("scsi: lpfc: Add shutdown method for kexec") made lpfc_pci_remove_one()
called during driver shutdown.

A sample of slub_debug output is attached below.

=============================================================================
BUG kmalloc-512 (Not tainted): Object already free
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: Allocated in lpfc_wq_create+0x31c/0x4f0 [lpfc] age=259902 cpu=0 pid=314
	___slab_alloc+0x47f/0x4b0
	__slab_alloc+0x40/0x5c
	kmem_cache_alloc_trace+0x16c/0x1b0
	lpfc_wq_create+0x31c/0x4f0 [lpfc]
	lpfc_create_wq_cq+0xb6/0x370 [lpfc]
	lpfc_sli4_queue_setup+0x331/0xd70 [lpfc]
	lpfc_sli4_hba_setup+0x12ce/0x1e90 [lpfc]
	lpfc_pci_probe_one_s4.isra.43+0x7c2/0x8f0 [lpfc]
	lpfc_pci_probe_one+0xbd/0xc30 [lpfc]
	local_pci_probe+0x45/0xa0
	work_for_cpu_fn+0x14/0x20
	process_one_work+0x165/0x410
	worker_thread+0x27f/0x4c0
	kthread+0x101/0x140
	ret_from_fork+0x2c/0x40
INFO: Freed in lpfc_sli4_queue_free+0x11b/0x160 [lpfc] age=100 cpu=3 pid=11802
	__slab_free+0x1ba/0x2c0
	kfree+0x122/0x170
	lpfc_sli4_queue_free+0x11b/0x160 [lpfc]
	lpfc_sli4_queue_destroy+0xba/0x470 [lpfc]
	lpfc_pci_remove_one+0x6b4/0x880 [lpfc]
	pci_device_remove+0x39/0xc0
	device_release_driver_internal+0x141/0x1f0
	driver_detach+0x3f/0x80
	bus_remove_driver+0x55/0xd0
	driver_unregister+0x2c/0x50
	pci_unregister_driver+0x2a/0xa0
	lpfc_exit+0x1c/0xe84 [lpfc]
	SyS_delete_module+0x1ba/0x220
	do_syscall_64+0x67/0x180
	return_from_SYSCALL_64+0x0/0x6a
INFO: Slab 0xffffea0040c9ce00 objects=38 used=34 fp=0xffff881032739a88 flags=0x17ffffc0008101
INFO: Object 0xffff881032739098 @offset=4248 fp=0x          (null)

Redzone ffff881032739090: bb bb bb bb bb bb bb bb                          ........
Object ffff881032739098: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327390a8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327390b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327390c8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327390d8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327390e8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327390f8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739108: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739118: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739128: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739138: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739148: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739158: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739168: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739178: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739188: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739198: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327391a8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327391b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327391c8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327391d8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327391e8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff8810327391f8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739208: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739218: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739228: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739238: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739248: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739258: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739268: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739278: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
Object ffff881032739288: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
Redzone ffff881032739298: bb bb bb bb bb bb bb bb                          ........
Padding ffff8810327393d8: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
CPU: 3 PID: 11802 Comm: rmmod Tainted: G    B           4.11.0-rc3 #1
Call Trace:
 dump_stack+0x63/0x87
 print_trailer+0x165/0x260
 free_debug_processing+0x20c/0x278
 ? lpfc_sli4_queue_free+0x11b/0x160 [lpfc]
 __slab_free+0x1ba/0x2c0
 ? lpfc_sli4_queue_destroy+0xda/0x470 [lpfc]
 ? free_hot_cold_page+0x21f/0x280
 ? __free_pages+0x25/0x30
 ? free_pages.part.88+0x40/0x50
 ? lpfc_sli4_queue_free+0x11b/0x160 [lpfc]
 kfree+0x122/0x170
 lpfc_sli4_queue_free+0x11b/0x160 [lpfc]
 lpfc_sli4_queue_destroy+0x11b/0x470 [lpfc]
 lpfc_pci_remove_one+0x6b4/0x880 [lpfc]
 pci_device_remove+0x39/0xc0
 device_release_driver_internal+0x141/0x1f0
 driver_detach+0x3f/0x80
 bus_remove_driver+0x55/0xd0
 driver_unregister+0x2c/0x50
 pci_unregister_driver+0x2a/0xa0
 lpfc_exit+0x1c/0xe84 [lpfc]
 SyS_delete_module+0x1ba/0x220
 do_syscall_64+0x67/0x180
 entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7fa3e194ac27
RSP: 002b:00007ffdcd1607b8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000000000789210 RCX: 00007fa3e194ac27
RDX: 00007fa3e19bb000 RSI: 0000000000000800 RDI: 0000000000789278
RBP: 0000000000000000 R08: 00007fa3e1c0e060 R09: 00007fa3e19bb000
R10: 00007ffdcd160540 R11: 0000000000000206 R12: 00007ffdcd1625ca
R13: 0000000000000000 R14: 0000000000789210 R15: 0000000000789010
FIX kmalloc-512: Object at 0xffff881032739098 not freed

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown
  2017-03-29  2:29 [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown Junichi Nomura
@ 2017-03-29 11:17 ` Johannes Thumshirn
  2017-03-29 23:26   ` Junichi Nomura
  2017-04-03 21:51 ` [PATCH] lpfc: fix double free of bound CQ/WQ ring pointer Mauricio Faria de Oliveira
  2017-04-03 21:53 ` [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown Mauricio Faria de Oliveira
  2 siblings, 1 reply; 7+ messages in thread
From: Johannes Thumshirn @ 2017-03-29 11:17 UTC (permalink / raw)
  To: Junichi Nomura
  Cc: linux-scsi, dick.kennedy@broadcom.com, james.smart@broadcom.com,
	anton@samba.org, martin.petersen@oracle.com

On Wed, Mar 29, 2017 at 02:29:45AM +0000, Junichi Nomura wrote:
> Since commit 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications"),
> "rmmod lpfc" starting to cause panic or corruption due to double free.
> 
> The double-free occurs as followings:
>   - During initialization, lpfc_create_wq_cq() binds cq and wq to
>     the same ring in the way that both cq->pring and wq->pring point
>     to the same object.
>   - Upon removal, lpfc_sli4_queue_destroy() ends up calling
>     lpfc_sli4_queue_free() for both wqs and cqs
>     and kfree(queue->pring) is done twice.
> 
> The problem became more visible in v4.11-rc3 because commit 85e8a23936ab
> ("scsi: lpfc: Add shutdown method for kexec") made lpfc_pci_remove_one()
> called during driver shutdown.

Well the obvious band-aid would be setting the pointers to NULL after freeing
them. lpfc_sli4_queue_free() checks for queue's precense and doesn't use
queue->pring prior to freeing it, so the following _should_ to the trick:

>From befa936d8935a1bed01df65b376f515fa42c99da Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn <jthumshirn@suse.de>
Date: Wed, 29 Mar 2017 13:08:55 +0200
Subject: [PATCH] lpfc: prevent double free of lpfc queue ring pointer

Since commit 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications")
rmoving the lpfc module causes a double free in lpfc_sli4_queue_free().

This can be prevented by setting the queue->pring and queue pointers to NULL,
so kfree() will simply ignore the pointers on a second call.

Reported-by: Junichi Nomura <j-nomura@ce.jp.nec.com>
Fixes: 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications")
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/lpfc/lpfc_sli.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 1c9fa45..86e1529 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -13759,7 +13759,9 @@ lpfc_sli4_queue_free(struct lpfc_queue *queue)
 		kfree(queue->rqbp);
 	}
 	kfree(queue->pring);
+	queue->pring = NULL;
 	kfree(queue);
+	queue = NULL;
 	return;
 }
 
-- 
2.10.2

I'll have a look if we at the callers of lpfc_sli4_queue_free() as well
and check if there's a better (a.k.a more correct) way to fix this.

Byte,
	Johannes

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown
  2017-03-29 11:17 ` Johannes Thumshirn
@ 2017-03-29 23:26   ` Junichi Nomura
  0 siblings, 0 replies; 7+ messages in thread
From: Junichi Nomura @ 2017-03-29 23:26 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: linux-scsi, dick.kennedy@broadcom.com, james.smart@broadcom.com,
	anton@samba.org, martin.petersen@oracle.com

On 03/29/17 20:17, Johannes Thumshirn wrote:
> On Wed, Mar 29, 2017 at 02:29:45AM +0000, Junichi Nomura wrote:
>> The double-free occurs as followings:
>>   - During initialization, lpfc_create_wq_cq() binds cq and wq to
>>     the same ring in the way that both cq->pring and wq->pring point
>>     to the same object.
>>   - Upon removal, lpfc_sli4_queue_destroy() ends up calling
>>     lpfc_sli4_queue_free() for both wqs and cqs
>>     and kfree(queue->pring) is done twice.
>>
>> The problem became more visible in v4.11-rc3 because commit 85e8a23936ab
>> ("scsi: lpfc: Add shutdown method for kexec") made lpfc_pci_remove_one()
>> called during driver shutdown.
> 
> Well the obvious band-aid would be setting the pointers to NULL after freeing
> them. lpfc_sli4_queue_free() checks for queue's precense and doesn't use
> queue->pring prior to freeing it, so the following _should_ to the trick:
> 
> From befa936d8935a1bed01df65b376f515fa42c99da Mon Sep 17 00:00:00 2001
> From: Johannes Thumshirn <jthumshirn@suse.de>
> Date: Wed, 29 Mar 2017 13:08:55 +0200
> Subject: [PATCH] lpfc: prevent double free of lpfc queue ring pointer
> 
> Since commit 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications")
> rmoving the lpfc module causes a double free in lpfc_sli4_queue_free().
> 
> This can be prevented by setting the queue->pring and queue pointers to NULL,
> so kfree() will simply ignore the pointers on a second call.

No, it doesn't work.

Even if lpfc_sli4_queue_free(wq) sets wq->pring to NULL, cq->pring still
holds bogus pointer and lpfc_sli4_queue_free(cq) will call kfree(cq->pring)
and cause double-free.

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] lpfc: fix double free of bound CQ/WQ ring pointer
  2017-03-29  2:29 [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown Junichi Nomura
  2017-03-29 11:17 ` Johannes Thumshirn
@ 2017-04-03 21:51 ` Mauricio Faria de Oliveira
  2017-04-03 21:53 ` [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown Mauricio Faria de Oliveira
  2 siblings, 0 replies; 7+ messages in thread
From: Mauricio Faria de Oliveira @ 2017-04-03 21:51 UTC (permalink / raw)
  To: jthumshirn; +Cc: linux-scsi, dick.kennedy, james.smart, anton, martin.petersen

commit 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications")
binds the CQs and WQs ring pointer (sets it to same address on both).

    lpfc_create_wq_cq():
    ...
            rc = lpfc_cq_create(phba, cq, eq, <...>)
    ...
                    rc = lpfc_wq_create(phba, wq, cq, qtype);
    ...
                    /* Bind this CQ/WQ to the NVME ring */
                    pring = wq->pring;
    ...
                    cq->pring = pring;
    ...

The commit frees both CQ & WQ for FCP/NVME on lpfc_sli4_queue_destroy(),
which causes a double free (potential corruption or panic) on freeing
the ring pointer of the second entity (CQ is first, WQ is second):

    lpfc_pci_remove_one() # that is, .remove / .shutdown
    -> lpfc_pci_remove_one_s4()
       -> lpfc_sli4_hba_unset()
          -> lpfc_sli4_queue_destroy()

             -> lpfc_sli4_release_queues()  # Release FCP/NVME cqs
                -> __lpfc_sli4_release_queue()
                   -> lpfc_sli4_queue_free()
                      -> kfree(queue->pring)  # first free

             -> lpfc_sli4_release_queues()  # Release FCP/NVME wqs
                -> __lpfc_sli4_release_queue()
                   -> lpfc_sli4_queue_free()
                      -> kfree(queue->pring)  # second free

So, check for WQs in lpfc_sli4_queue_free() and do not free the pring,
as it is freed before in the bound CQ.  [the WQs are created only via
lpfc_wq_create(), which sets struct lpfc_queue::type == LPFC_WQ.  And
that happens in 2 sites (lpfc_create_wq_cq() & lpfc_fof_queue_setup()),
both of which bind the CQs & WQs. Thus, checking for the LPFC_WQ type
correlates to whether the WQ is bound to a CQ, which is freed first.]

Additional details:

For reference, that binding also occurs on one other function:

    lpfc_fof_queue_setup():
    ...
                    rc = lpfc_cq_create(phba, phba->sli4_hba.oas_cq, <...>)
    ...
                    rc = lpfc_wq_create(phba, phba->sli4_hba.oas_wq, <...>)
    ...
                    /* Bind this CQ/WQ to the NVME ring */
                    pring = phba->sli4_hba.oas_wq->pring;
    ...
                    phba->sli4_hba.oas_cq->pring = pring;

And used to occur similarly on lpfc_sli4_queue_setup(), but was changed
by that commit; although the problem is more related to the new freeing
pattern introduced in lpfc_sli4_queue_destroy() plus the bound CQs/WQs.

    -               /* Bind this WQ to the next FCP ring */
    -               pring = &psli->ring[MAX_SLI3_CONFIGURED_RINGS + fcp_wqidx];
    ...
    -               phba->sli4_hba.fcp_cq[fcp_wqidx]->pring = pring;

commit 85e8a23936ab ("scsi: lpfc: Add shutdown method for kexec") made
this more likely as lpfc_pci_remove_one() is called on driver shutdown
(e.g., modprobe -r / rmmod).

(this patch is partially based on a different patch suggested by Johannes,
 thus adding a Suggested-by tag for due credit.)

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Reported-by: Junichi Nomura <j-nomura@ce.jp.nec.com>
Suggested-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/lpfc/lpfc_sli.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 1c9fa45df7eb..8befe841adaa 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -13758,7 +13758,14 @@ void lpfc_sli4_els_xri_abort_event_proc(struct lpfc_hba *phba)
 		lpfc_free_rq_buffer(queue->phba, queue);
 		kfree(queue->rqbp);
 	}
-	kfree(queue->pring);
+
+	/*
+	 * The WQs/CQs' pring is bound (same pointer).
+	 * So free it only once, and not again on WQ.
+	 */
+	if (queue->type != LPFC_WQ)
+		kfree(queue->pring);
+
 	kfree(queue);
 	return;
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown
  2017-03-29  2:29 [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown Junichi Nomura
  2017-03-29 11:17 ` Johannes Thumshirn
  2017-04-03 21:51 ` [PATCH] lpfc: fix double free of bound CQ/WQ ring pointer Mauricio Faria de Oliveira
@ 2017-04-03 21:53 ` Mauricio Faria de Oliveira
  2017-04-04  2:10   ` Junichi Nomura
  2 siblings, 1 reply; 7+ messages in thread
From: Mauricio Faria de Oliveira @ 2017-04-03 21:53 UTC (permalink / raw)
  To: Junichi Nomura, linux-scsi, dick.kennedy@broadcom.com,
	james.smart@broadcom.com
  Cc: anton@samba.org, martin.petersen@oracle.com

Hi Junichi,

On 03/28/2017 11:29 PM, Junichi Nomura wrote:
> Since commit 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications"),
> "rmmod lpfc" starting to cause panic or corruption due to double free.

Thanks for the report. Can you please check whether the patch just sent
([PATCH] lpfc: fix double free of bound CQ/WQ ring pointer) resolves it?

I don't have a setup to test it handy right now.

cheers,

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown
  2017-04-03 21:53 ` [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown Mauricio Faria de Oliveira
@ 2017-04-04  2:10   ` Junichi Nomura
  2017-04-04 12:07     ` Mauricio Faria de Oliveira
  0 siblings, 1 reply; 7+ messages in thread
From: Junichi Nomura @ 2017-04-04  2:10 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira, linux-scsi, dick.kennedy@broadcom.com,
	james.smart@broadcom.com
  Cc: anton@samba.org, martin.petersen@oracle.com

On 04/04/17 06:53, Mauricio Faria de Oliveira wrote:
> Hi Junichi,
> 
> On 03/28/2017 11:29 PM, Junichi Nomura wrote:
>> Since commit 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications"),
>> "rmmod lpfc" starting to cause panic or corruption due to double free.
> 
> Thanks for the report. Can you please check whether the patch just sent
> ([PATCH] lpfc: fix double free of bound CQ/WQ ring pointer) resolves it?

It works for me. Thank you!

Considering future maintenance, it might be a bit fragile to just depend
on the code comment about representing the relation between cq/wq and
shared pring but it's maintainers' choice.

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown
  2017-04-04  2:10   ` Junichi Nomura
@ 2017-04-04 12:07     ` Mauricio Faria de Oliveira
  0 siblings, 0 replies; 7+ messages in thread
From: Mauricio Faria de Oliveira @ 2017-04-04 12:07 UTC (permalink / raw)
  To: Junichi Nomura, james.smart@broadcom.com,
	martin.petersen@oracle.com
  Cc: linux-scsi, dick.kennedy@broadcom.com, anton@samba.org

Hi Martin and Junichi,

On 04/03/2017 11:10 PM, Junichi Nomura wrote:
> On 04/04/17 06:53, Mauricio Faria de Oliveira wrote:

>> On 03/28/2017 11:29 PM, Junichi Nomura wrote:
>>> Since commit 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications"),
>>> "rmmod lpfc" starting to cause panic or corruption due to double free.

>> Thanks for the report. Can you please check whether the patch just sent
>> ([PATCH] lpfc: fix double free of bound CQ/WQ ring pointer) resolves it?

> It works for me. Thank you!

Excellent, thanks!

Martin, can you review/consider it for 4.11-rc6, please?

> Considering future maintenance, it might be a bit fragile to just depend
> on the code comment about representing the relation between cq/wq and
> shared pring but it's maintainers' choice.

I agree -- there should be a better way of identifying a bound WQ/CQ.
Perhaps there is, but I couldn't find it currently.

For now, as far as I could grep and examine the code (detailed in commit
message), a WQ is always bound to a CQ, so to check for WQ and not free
its ring pointer seems to be sufficient (as the CQ ring pointer is freed
first).

If that changes, probably some form of flagging and/or queue type
determination would be better/necessary.

cheers,

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-04-04 12:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-29  2:29 [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown Junichi Nomura
2017-03-29 11:17 ` Johannes Thumshirn
2017-03-29 23:26   ` Junichi Nomura
2017-04-03 21:51 ` [PATCH] lpfc: fix double free of bound CQ/WQ ring pointer Mauricio Faria de Oliveira
2017-04-03 21:53 ` [REGRESSION] v4.11-rc3: lpfc: panic during module removal / shutdown Mauricio Faria de Oliveira
2017-04-04  2:10   ` Junichi Nomura
2017-04-04 12:07     ` Mauricio Faria de Oliveira

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).