linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mpt2sas: prevent double free on error path
@ 2013-01-23 21:20 Jörn Engel
  2013-01-24  7:51 ` Bjørn Mork
  0 siblings, 1 reply; 7+ messages in thread
From: Jörn Engel @ 2013-01-23 21:20 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Sreekanth Reddy, support,
	James E.J. Bottomley, DL-MPTFusionLinux, linux-scsi, linux-kernel

I noticed this one when list_del was called with poisoned list
pointers, but the real problem is a double-free (and a use-after-free
just before that).

Both _scsih_probe_boot_devices() and _scsih_sas_device_add() put the
sas_device onto a list, thereby giving up control.  Next they call
mpt2sas_transport_port_add() and will list_del and free the object on
errors.  If some other function already did the list_del and free, it
will happen again.

This patch adds reference counting to prevent the double free.  One
reference count goes to the caller of mpt2sas_transport_port_add(), the
second to the list.  Whoever removes the object from the list gets to
drop one reference count.  _scsih_probe_boot_devices() and
_scsih_sas_device_add() get a second reference count to ensure the
object is not freed while they are still accessing it.

To prevent the double list_del(), I changed the code to list_del_init()
and added a list_empty() check before that.  Since the
list_empty/list_del_init is always called under a lock, this should be
safe.

I hate the complexity this patch adds, but see no alternative to it.

mpt2sas0: failure at drivers/scsi/mpt2sas/mpt2sas_transport.c:708/mpt2sas_transport_port_add()!
general protection fault: 0000 [#1] SMP
CPU 9
Pid: 3097, comm: kworker/u:11 Tainted: G        W  O 3.6.10+ #31392.trunk    /0JP31P
RIP: 0010:[<ffffffffa0309744>]  [<ffffffffa0309744>] _scsih_sas_device_remove+0x54/0x90 [mpt2sas]
RSP: 0018:ffff881fed4d7ab0  EFLAGS: 00010046
RAX: dead000000200200 RBX: ffff881ff6a5cd88 RCX: 00000000000010e8
RDX: ffff881ff7dab800 RSI: ffff881ff7daba00 RDI: dead000000100100
RBP: ffff881fed4d7ad0 R08: dead000000200200 R09: ffff880fff802200
R10: ffffffffa0317980 R11: 0000000000000000 R12: ffff881ff7daba00
R13: 0000000000000286 R14: 500605ba006c9d09 R15: ffff881ff7daba00
FS:  0000000000000000(0000) GS:ffff88203fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f8ac89ec458 CR3: 0000001ff4c5c000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/u:11 (pid: 3097, threadinfo ffff881fed4d6000, task ffff881402f3d9c0)
Stack:
 0000000000000401 ffff881ff6a5c6b0 0000000000000401 0000000000000016
 ffff881fed4d7bb0 ffffffffa030f93e 0000000000000000 ffff881ff6a5cd88
 0012000e0f000008 006c9d090002000b 00180009500605ba 0000040100000016
Call Trace:
 [<ffffffffa030f93e>] _scsih_add_device.clone.32+0x2fe/0x420 [mpt2sas]
 [<ffffffffa03126e5>] _scsih_sas_topology_change_event.clone.38+0x285/0x620 [mpt2sas]
 [<ffffffff81078c90>] ? load_balance+0x100/0x7a0
 [<ffffffffa0312a80>] ? _scsih_sas_topology_change_event.clone.38+0x620/0x620 [mpt2sas]
 [<ffffffffa0312d8a>] _firmware_event_work+0x30a/0xfc0 [mpt2sas]
 [<ffffffff810015cc>] ? __switch_to+0x14c/0x410
 [<ffffffff8106dfdb>] ? finish_task_switch+0x4b/0xf0
 [<ffffffffa0312a80>] ? _scsih_sas_topology_change_event.clone.38+0x620/0x620 [mpt2sas]
 [<ffffffff8105bf40>] process_one_work+0x140/0x500
 [<ffffffff8105d354>] worker_thread+0x194/0x510
 [<ffffffff8106dfdb>] ? finish_task_switch+0x4b/0xf0
 [<ffffffff8105d1c0>] ? manage_workers+0x320/0x320
 [<ffffffff8106282e>] kthread+0x9e/0xb0
 [<ffffffff815bef44>] kernel_thread_helper+0x4/0x10
 [<ffffffff815b5e5d>] ? retint_restore_args+0x13/0x13
 [<ffffffff81062790>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff815bef40>] ? gs_change+0x13/0x13

Signed-off-by: Joern Engel <joern@logfs.org>
---
 drivers/scsi/mpt2sas/mpt2sas_base.h  |    1 +
 drivers/scsi/mpt2sas/mpt2sas_scsih.c |   55 ++++++++++++++++++++++++++++------
 2 files changed, 47 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index 543d8d6..ceb7d41 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -367,6 +367,7 @@ struct _sas_device {
 	u16	slot;
 	u8	phy;
 	u8	responding;
+	struct kref kref;
 };
 
 /**
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index c6bdc92..43b3a98 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -570,6 +570,18 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	return NULL;
 }
 
+static void free_sas_device(struct kref *kref)
+{
+	struct _sas_device *sas_device = container_of(kref, struct _sas_device,
+			kref);
+	kfree(sas_device);
+}
+
+static void put_sas_device(struct _sas_device *sas_device)
+{
+	kref_put(&sas_device->kref, free_sas_device);
+}
+
 /**
  * _scsih_sas_device_remove - remove sas_device from list.
  * @ioc: per adapter object
@@ -583,14 +595,19 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
     struct _sas_device *sas_device)
 {
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (!sas_device)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		was_on_list = 1;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (was_on_list)
+		put_sas_device(sas_device);
 }
 
 
@@ -613,6 +630,8 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	/* Get an extra refcount... */
+	kref_get(&sas_device->kref);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -630,6 +649,12 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 			sas_device->sas_address_parent);
 		_scsih_sas_device_remove(ioc, sas_device);
 	}
+	/*
+	 * ...and drop it again.  If an error already happend, this is the
+	 * final put and we free the object now.  Otherwise whoever removes
+	 * the object from the list will do the final put and free.
+	 */
+	put_sas_device(sas_device);
 }
 
 /**
@@ -5270,6 +5295,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
+	kref_init(&sas_device->kref);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc, le16_to_cpu
 		(sas_device_pg0.ParentDevHandle),
@@ -5341,7 +5367,7 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
 	    "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
-	kfree(sas_device);
+	put_sas_device(sas_device);
 }
 /**
  * _scsih_device_remove_by_handle - removing device object by handle
@@ -5355,16 +5381,21 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 
@@ -5381,6 +5412,7 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
@@ -5388,10 +5420,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
 	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -7805,6 +7841,7 @@ _scsih_probe_boot_devices(struct MPT2SAS_ADAPTER *ioc)
 		handle = sas_device->handle;
 		sas_address_parent = sas_device->sas_address_parent;
 		sas_address = sas_device->sas_address;
+		kref_get(&sas_device->kref);
 		list_move_tail(&sas_device->list, &ioc->sas_device_list);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] mpt2sas: prevent double free on error path
  2013-01-23 21:20 [PATCH] mpt2sas: prevent double free on error path Jörn Engel
@ 2013-01-24  7:51 ` Bjørn Mork
  2013-01-25 17:12   ` Jörn Engel
  0 siblings, 1 reply; 7+ messages in thread
From: Bjørn Mork @ 2013-01-24  7:51 UTC (permalink / raw)
  To: Jörn Engel
  Cc: Nagalakshmi Nandigama, Sreekanth Reddy, support,
	James E.J. Bottomley, DL-MPTFusionLinux, linux-scsi, linux-kernel

Jörn Engel <joern@logfs.org> writes:

> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index c6bdc92..43b3a98 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -570,6 +570,18 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>  	return NULL;
>  }
>  
> +static void free_sas_device(struct kref *kref)
> +{
> +	struct _sas_device *sas_device = container_of(kref, struct _sas_device,
> +			kref);
> +	kfree(sas_device);
> +}
> +
> +static void put_sas_device(struct _sas_device *sas_device)
> +{
> +	kref_put(&sas_device->kref, free_sas_device);
> +}
> +
>  /**
>   * _scsih_sas_device_remove - remove sas_device from list.
>   * @ioc: per adapter object
> @@ -583,14 +595,19 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
>      struct _sas_device *sas_device)
>  {
>  	unsigned long flags;
> +	int was_on_list = 0;
>  
>  	if (!sas_device)
>  		return;
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -	list_del(&sas_device->list);
> -	kfree(sas_device);
> +	if (!list_empty(&sas_device->list)) {
> +		list_del_init(&sas_device->list);
> +		was_on_list = 1;
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +	if (was_on_list)
> +		put_sas_device(sas_device);
>  }
>  


How about the copy of this code in drivers/scsi/mpt3sas/mpt3sas_scsih.c?
Is that safe, or does it need fixing as well?


Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mpt2sas: prevent double free on error path
  2013-01-24  7:51 ` Bjørn Mork
@ 2013-01-25 17:12   ` Jörn Engel
  2013-01-25 18:00     ` [PATCH v2] mpt2sas/mpt3sas: " Jörn Engel
  0 siblings, 1 reply; 7+ messages in thread
From: Jörn Engel @ 2013-01-25 17:12 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: Nagalakshmi Nandigama, Sreekanth Reddy, support,
	James E.J. Bottomley, DL-MPTFusionLinux, linux-scsi, linux-kernel

On Thu, 24 January 2013 08:51:20 +0100, Bjørn Mork wrote:
> 
> How about the copy of this code in drivers/scsi/mpt3sas/mpt3sas_scsih.c?
> Is that safe, or does it need fixing as well?

Well spotted, that appears to suffer from the same ailment.  Will cook
up a second patch for that.

Are there any other copies of this code around?

Jörn

--
The object-oriented version of 'Spaghetti code' is, of course, 'Lasagna code'.
(Too many layers).
-- Roberto Waltman.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2] mpt2sas/mpt3sas: prevent double free on error path
  2013-01-25 17:12   ` Jörn Engel
@ 2013-01-25 18:00     ` Jörn Engel
  2013-01-25 19:20       ` [PATCH v3] " Jörn Engel
  0 siblings, 1 reply; 7+ messages in thread
From: Jörn Engel @ 2013-01-25 18:00 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: Nagalakshmi Nandigama, Sreekanth Reddy, support,
	James E.J. Bottomley, DL-MPTFusionLinux, linux-scsi, linux-kernel

Changes since v1:
- Also fixes mpt3sas, thanks to Bjørn Mork.
- Adds a missing put_sas_device to _scsih_probe_boot_devices
- Added a newline
- Moved a kref_get outside the spinlock

I noticed this one when list_del was called with poisoned list
pointers, but the real problem is a double-free (and a use-after-free
just before that).

Both _scsih_probe_boot_devices() and _scsih_sas_device_add() put the
sas_device onto a list, thereby giving up control.  Next they call
mpt2sas_transport_port_add() and will list_del and free the object on
errors.  If some other function already did the list_del and free, it
will happen again.

This patch adds reference counting to prevent the double free.  One
reference count goes to the caller of mpt2sas_transport_port_add(), the
second to the list.  Whoever removes the object from the list gets to
drop one reference count.  _scsih_probe_boot_devices() and
_scsih_sas_device_add() get a second reference count to ensure the
object is not freed while they are still accessing it.

To prevent the double list_del(), I changed the code to list_del_init()
and added a list_empty() check before that.  Since the
list_empty/list_del_init is always called under a lock, this should be
safe.

I hate the complexity this patch adds, but see no alternative to it.

mpt2sas0: failure at drivers/scsi/mpt2sas/mpt2sas_transport.c:708/mpt2sas_transport_port_add()!
general protection fault: 0000 [#1] SMP
CPU 9
Pid: 3097, comm: kworker/u:11 Tainted: G        W  O 3.6.10+ #31392.trunk    /0JP31P
RIP: 0010:[<ffffffffa0309744>]  [<ffffffffa0309744>] _scsih_sas_device_remove+0x54/0x90 [mpt2sas]
RSP: 0018:ffff881fed4d7ab0  EFLAGS: 00010046
RAX: dead000000200200 RBX: ffff881ff6a5cd88 RCX: 00000000000010e8
RDX: ffff881ff7dab800 RSI: ffff881ff7daba00 RDI: dead000000100100
RBP: ffff881fed4d7ad0 R08: dead000000200200 R09: ffff880fff802200
R10: ffffffffa0317980 R11: 0000000000000000 R12: ffff881ff7daba00
R13: 0000000000000286 R14: 500605ba006c9d09 R15: ffff881ff7daba00
FS:  0000000000000000(0000) GS:ffff88203fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f8ac89ec458 CR3: 0000001ff4c5c000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/u:11 (pid: 3097, threadinfo ffff881fed4d6000, task ffff881402f3d9c0)
Stack:
 0000000000000401 ffff881ff6a5c6b0 0000000000000401 0000000000000016
 ffff881fed4d7bb0 ffffffffa030f93e 0000000000000000 ffff881ff6a5cd88
 0012000e0f000008 006c9d090002000b 00180009500605ba 0000040100000016
Call Trace:
 [<ffffffffa030f93e>] _scsih_add_device.clone.32+0x2fe/0x420 [mpt2sas]
 [<ffffffffa03126e5>] _scsih_sas_topology_change_event.clone.38+0x285/0x620 [mpt2sas]
 [<ffffffff81078c90>] ? load_balance+0x100/0x7a0
 [<ffffffffa0312a80>] ? _scsih_sas_topology_change_event.clone.38+0x620/0x620 [mpt2sas]
 [<ffffffffa0312d8a>] _firmware_event_work+0x30a/0xfc0 [mpt2sas]
 [<ffffffff810015cc>] ? __switch_to+0x14c/0x410
 [<ffffffff8106dfdb>] ? finish_task_switch+0x4b/0xf0
 [<ffffffffa0312a80>] ? _scsih_sas_topology_change_event.clone.38+0x620/0x620 [mpt2sas]
 [<ffffffff8105bf40>] process_one_work+0x140/0x500
 [<ffffffff8105d354>] worker_thread+0x194/0x510
 [<ffffffff8106dfdb>] ? finish_task_switch+0x4b/0xf0
 [<ffffffff8105d1c0>] ? manage_workers+0x320/0x320
 [<ffffffff8106282e>] kthread+0x9e/0xb0
 [<ffffffff815bef44>] kernel_thread_helper+0x4/0x10
 [<ffffffff815b5e5d>] ? retint_restore_args+0x13/0x13
 [<ffffffff81062790>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff815bef40>] ? gs_change+0x13/0x13

Signed-off-by: Joern Engel <joern@logfs.org>
---
 drivers/scsi/mpt2sas/mpt2sas_base.h  |    1 +
 drivers/scsi/mpt2sas/mpt2sas_scsih.c |   57 ++++++++++++++++++++++++++++------
 drivers/scsi/mpt3sas/mpt3sas_base.h  |    1 +
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   57 ++++++++++++++++++++++++++++------
 4 files changed, 98 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index 543d8d6..ceb7d41 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -367,6 +367,7 @@ struct _sas_device {
 	u16	slot;
 	u8	phy;
 	u8	responding;
+	struct kref kref;
 };
 
 /**
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index c6bdc92..217660c 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -570,6 +570,19 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	return NULL;
 }
 
+static void free_sas_device(struct kref *kref)
+{
+	struct _sas_device *sas_device = container_of(kref, struct _sas_device,
+			kref);
+
+	kfree(sas_device);
+}
+
+static void put_sas_device(struct _sas_device *sas_device)
+{
+	kref_put(&sas_device->kref, free_sas_device);
+}
+
 /**
  * _scsih_sas_device_remove - remove sas_device from list.
  * @ioc: per adapter object
@@ -583,14 +596,19 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
     struct _sas_device *sas_device)
 {
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (!sas_device)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		was_on_list = 1;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (was_on_list)
+		put_sas_device(sas_device);
 }
 
 
@@ -612,6 +630,8 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 	    "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
+	/* Get an extra refcount... */
+	kref_get(&sas_device->kref);
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -630,6 +650,12 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 			sas_device->sas_address_parent);
 		_scsih_sas_device_remove(ioc, sas_device);
 	}
+	/*
+	 * ...and drop it again.  If an error already happend, this is the
+	 * final put and we free the object now.  Otherwise whoever removes
+	 * the object from the list will do the final put and free.
+	 */
+	put_sas_device(sas_device);
 }
 
 /**
@@ -5270,6 +5296,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
+	kref_init(&sas_device->kref);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc, le16_to_cpu
 		(sas_device_pg0.ParentDevHandle),
@@ -5341,7 +5368,7 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
 	    "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
-	kfree(sas_device);
+	put_sas_device(sas_device);
 }
 /**
  * _scsih_device_remove_by_handle - removing device object by handle
@@ -5355,16 +5382,21 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 
@@ -5381,6 +5413,7 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
@@ -5388,10 +5421,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
 	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -7805,6 +7842,7 @@ _scsih_probe_boot_devices(struct MPT2SAS_ADAPTER *ioc)
 		handle = sas_device->handle;
 		sas_address_parent = sas_device->sas_address_parent;
 		sas_address = sas_device->sas_address;
+		kref_get(&sas_device->kref);
 		list_move_tail(&sas_device->list, &ioc->sas_device_list);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -7819,6 +7857,7 @@ _scsih_probe_boot_devices(struct MPT2SAS_ADAPTER *ioc)
 					sas_address_parent);
 			_scsih_sas_device_remove(ioc, sas_device);
 		}
+		put_sas_device(sas_device);
 	}
 }
 
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 994656c..fff7fe7 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -293,6 +293,7 @@ struct _sas_device {
 	u8	phy;
 	u8	responding;
 	u8	fast_path;
+	struct kref kref;
 };
 
 /**
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 6421a06..54fdd7c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -569,6 +569,19 @@ _scsih_sas_device_find_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
 	return NULL;
 }
 
+static void free_sas_device(struct kref *kref)
+{
+	struct _sas_device *sas_device = container_of(kref, struct _sas_device,
+			kref);
+
+	kfree(sas_device);
+}
+
+static void put_sas_device(struct _sas_device *sas_device)
+{
+	kref_put(&sas_device->kref, free_sas_device);
+}
+
 /**
  * _scsih_sas_device_remove - remove sas_device from list.
  * @ioc: per adapter object
@@ -582,14 +595,19 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
 	struct _sas_device *sas_device)
 {
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (!sas_device)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		was_on_list = 1;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (was_on_list)
+		put_sas_device(sas_device);
 }
 
 /**
@@ -604,16 +622,21 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 
@@ -630,6 +653,7 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
@@ -637,10 +661,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
 	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 
@@ -663,6 +691,8 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
 		ioc->name, __func__, sas_device->handle,
 		(unsigned long long)sas_device->sas_address));
 
+	/* Get an extra refcount... */
+	kref_get(&sas_device->kref);
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -682,6 +712,12 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
 			    sas_device->sas_address_parent);
 		_scsih_sas_device_remove(ioc, sas_device);
 	}
+	/*
+	 * ...and drop it again.  If an error already happend, this is the
+	 * final put and we free the object now.  Otherwise whoever removes
+	 * the object from the list will do the final put and free.
+	 */
+	put_sas_device(sas_device);
 }
 
 /**
@@ -4859,6 +4895,7 @@ _scsih_add_device(struct MPT3SAS_ADAPTER *ioc, u16 handle, u8 phy_num,
 		return 0;
 	}
 
+	kref_init(&sas_device->kref);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc,
 	    le16_to_cpu(sas_device_pg0.ParentDevHandle),
@@ -4935,7 +4972,7 @@ _scsih_remove_device(struct MPT3SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
 
-	kfree(sas_device);
+	put_sas_device(sas_device);
 }
 
 #ifdef CONFIG_SCSI_MPT3SAS_LOGGING
@@ -7521,6 +7558,7 @@ _scsih_probe_boot_devices(struct MPT3SAS_ADAPTER *ioc)
 		handle = sas_device->handle;
 		sas_address_parent = sas_device->sas_address_parent;
 		sas_address = sas_device->sas_address;
+		kref_get(&sas_device->kref);
 		list_move_tail(&sas_device->list, &ioc->sas_device_list);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -7533,6 +7571,7 @@ _scsih_probe_boot_devices(struct MPT3SAS_ADAPTER *ioc)
 				    sas_address_parent);
 			_scsih_sas_device_remove(ioc, sas_device);
 		}
+		put_sas_device(sas_device);
 	}
 }
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v3] mpt2sas/mpt3sas: prevent double free on error path
  2013-01-25 18:00     ` [PATCH v2] mpt2sas/mpt3sas: " Jörn Engel
@ 2013-01-25 19:20       ` Jörn Engel
  2013-01-26  3:52         ` mpt2sas: device_blocked question spren.gm
  0 siblings, 1 reply; 7+ messages in thread
From: Jörn Engel @ 2013-01-25 19:20 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: Nagalakshmi Nandigama, Sreekanth Reddy, support,
	James E.J. Bottomley, DL-MPTFusionLinux, linux-scsi, linux-kernel

Changes since v1:
- Adds another missing put_sas_device to _scsih_probe_boot_devices,
  proving the need for more coffee before sending out patches.

I noticed this one when list_del was called with poisoned list
pointers, but the real problem is a double-free (and a use-after-free
just before that).

Both _scsih_probe_boot_devices() and _scsih_sas_device_add() put the
sas_device onto a list, thereby giving up control.  Next they call
mpt2sas_transport_port_add() and will list_del and free the object on
errors.  If some other function already did the list_del and free, it
will happen again.

This patch adds reference counting to prevent the double free.  One
reference count goes to the caller of mpt2sas_transport_port_add(), the
second to the list.  Whoever removes the object from the list gets to
drop one reference count.  _scsih_probe_boot_devices() and
_scsih_sas_device_add() get a second reference count to ensure the
object is not freed while they are still accessing it.

To prevent the double list_del(), I changed the code to list_del_init()
and added a list_empty() check before that.  Since the
list_empty/list_del_init is always called under a lock, this should be
safe.

I hate the complexity this patch adds, but see no alternative to it.

mpt2sas0: failure at drivers/scsi/mpt2sas/mpt2sas_transport.c:708/mpt2sas_transport_port_add()!
general protection fault: 0000 [#1] SMP
CPU 9
Pid: 3097, comm: kworker/u:11 Tainted: G        W  O 3.6.10+ #31392.trunk    /0JP31P
RIP: 0010:[<ffffffffa0309744>]  [<ffffffffa0309744>] _scsih_sas_device_remove+0x54/0x90 [mpt2sas]
RSP: 0018:ffff881fed4d7ab0  EFLAGS: 00010046
RAX: dead000000200200 RBX: ffff881ff6a5cd88 RCX: 00000000000010e8
RDX: ffff881ff7dab800 RSI: ffff881ff7daba00 RDI: dead000000100100
RBP: ffff881fed4d7ad0 R08: dead000000200200 R09: ffff880fff802200
R10: ffffffffa0317980 R11: 0000000000000000 R12: ffff881ff7daba00
R13: 0000000000000286 R14: 500605ba006c9d09 R15: ffff881ff7daba00
FS:  0000000000000000(0000) GS:ffff88203fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f8ac89ec458 CR3: 0000001ff4c5c000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/u:11 (pid: 3097, threadinfo ffff881fed4d6000, task ffff881402f3d9c0)
Stack:
 0000000000000401 ffff881ff6a5c6b0 0000000000000401 0000000000000016
 ffff881fed4d7bb0 ffffffffa030f93e 0000000000000000 ffff881ff6a5cd88
 0012000e0f000008 006c9d090002000b 00180009500605ba 0000040100000016
Call Trace:
 [<ffffffffa030f93e>] _scsih_add_device.clone.32+0x2fe/0x420 [mpt2sas]
 [<ffffffffa03126e5>] _scsih_sas_topology_change_event.clone.38+0x285/0x620 [mpt2sas]
 [<ffffffff81078c90>] ? load_balance+0x100/0x7a0
 [<ffffffffa0312a80>] ? _scsih_sas_topology_change_event.clone.38+0x620/0x620 [mpt2sas]
 [<ffffffffa0312d8a>] _firmware_event_work+0x30a/0xfc0 [mpt2sas]
 [<ffffffff810015cc>] ? __switch_to+0x14c/0x410
 [<ffffffff8106dfdb>] ? finish_task_switch+0x4b/0xf0
 [<ffffffffa0312a80>] ? _scsih_sas_topology_change_event.clone.38+0x620/0x620 [mpt2sas]
 [<ffffffff8105bf40>] process_one_work+0x140/0x500
 [<ffffffff8105d354>] worker_thread+0x194/0x510
 [<ffffffff8106dfdb>] ? finish_task_switch+0x4b/0xf0
 [<ffffffff8105d1c0>] ? manage_workers+0x320/0x320
 [<ffffffff8106282e>] kthread+0x9e/0xb0
 [<ffffffff815bef44>] kernel_thread_helper+0x4/0x10
 [<ffffffff815b5e5d>] ? retint_restore_args+0x13/0x13
 [<ffffffff81062790>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff815bef40>] ? gs_change+0x13/0x13

Signed-off-by: Joern Engel <joern@logfs.org>
---
 drivers/scsi/mpt2sas/mpt2sas_base.h  |    1 +
 drivers/scsi/mpt2sas/mpt2sas_scsih.c |   57 ++++++++++++++++++++++++++++------
 drivers/scsi/mpt3sas/mpt3sas_base.h  |    1 +
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   57 ++++++++++++++++++++++++++++------
 4 files changed, 98 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index 543d8d6..ceb7d41 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -367,6 +367,7 @@ struct _sas_device {
 	u16	slot;
 	u8	phy;
 	u8	responding;
+	struct kref kref;
 };
 
 /**
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index c6bdc92..217660c 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -570,6 +570,19 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	return NULL;
 }
 
+static void free_sas_device(struct kref *kref)
+{
+	struct _sas_device *sas_device = container_of(kref, struct _sas_device,
+			kref);
+
+	kfree(sas_device);
+}
+
+static void put_sas_device(struct _sas_device *sas_device)
+{
+	kref_put(&sas_device->kref, free_sas_device);
+}
+
 /**
  * _scsih_sas_device_remove - remove sas_device from list.
  * @ioc: per adapter object
@@ -583,14 +596,19 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
     struct _sas_device *sas_device)
 {
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (!sas_device)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		was_on_list = 1;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (was_on_list)
+		put_sas_device(sas_device);
 }
 
 
@@ -612,6 +630,8 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 	    "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
+	/* Get an extra refcount... */
+	kref_get(&sas_device->kref);
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -630,6 +650,12 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 			sas_device->sas_address_parent);
 		_scsih_sas_device_remove(ioc, sas_device);
 	}
+	/*
+	 * ...and drop it again.  If an error already happend, this is the
+	 * final put and we free the object now.  Otherwise whoever removes
+	 * the object from the list will do the final put and free.
+	 */
+	put_sas_device(sas_device);
 }
 
 /**
@@ -5270,6 +5296,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
+	kref_init(&sas_device->kref);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc, le16_to_cpu
 		(sas_device_pg0.ParentDevHandle),
@@ -5341,7 +5368,7 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
 	    "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
-	kfree(sas_device);
+	put_sas_device(sas_device);
 }
 /**
  * _scsih_device_remove_by_handle - removing device object by handle
@@ -5355,16 +5382,21 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 
@@ -5381,6 +5413,7 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
@@ -5388,10 +5421,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
 	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -7805,6 +7842,7 @@ _scsih_probe_boot_devices(struct MPT2SAS_ADAPTER *ioc)
 		handle = sas_device->handle;
 		sas_address_parent = sas_device->sas_address_parent;
 		sas_address = sas_device->sas_address;
+		kref_get(&sas_device->kref);
 		list_move_tail(&sas_device->list, &ioc->sas_device_list);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -7819,6 +7857,7 @@ _scsih_probe_boot_devices(struct MPT2SAS_ADAPTER *ioc)
 					sas_address_parent);
 			_scsih_sas_device_remove(ioc, sas_device);
 		}
+		put_sas_device(sas_device);
 	}
 }
 
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 994656c..fff7fe7 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -293,6 +293,7 @@ struct _sas_device {
 	u8	phy;
 	u8	responding;
 	u8	fast_path;
+	struct kref kref;
 };
 
 /**
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 6421a06..54fdd7c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -569,6 +569,19 @@ _scsih_sas_device_find_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
 	return NULL;
 }
 
+static void free_sas_device(struct kref *kref)
+{
+	struct _sas_device *sas_device = container_of(kref, struct _sas_device,
+			kref);
+
+	kfree(sas_device);
+}
+
+static void put_sas_device(struct _sas_device *sas_device)
+{
+	kref_put(&sas_device->kref, free_sas_device);
+}
+
 /**
  * _scsih_sas_device_remove - remove sas_device from list.
  * @ioc: per adapter object
@@ -582,14 +595,19 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
 	struct _sas_device *sas_device)
 {
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (!sas_device)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		was_on_list = 1;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (was_on_list)
+		put_sas_device(sas_device);
 }
 
 /**
@@ -604,16 +622,21 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 
@@ -630,6 +653,7 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
@@ -637,10 +661,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
 	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 
@@ -663,6 +691,8 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
 		ioc->name, __func__, sas_device->handle,
 		(unsigned long long)sas_device->sas_address));
 
+	/* Get an extra refcount... */
+	kref_get(&sas_device->kref);
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -682,6 +712,12 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
 			    sas_device->sas_address_parent);
 		_scsih_sas_device_remove(ioc, sas_device);
 	}
+	/*
+	 * ...and drop it again.  If an error already happend, this is the
+	 * final put and we free the object now.  Otherwise whoever removes
+	 * the object from the list will do the final put and free.
+	 */
+	put_sas_device(sas_device);
 }
 
 /**
@@ -4859,6 +4895,7 @@ _scsih_add_device(struct MPT3SAS_ADAPTER *ioc, u16 handle, u8 phy_num,
 		return 0;
 	}
 
+	kref_init(&sas_device->kref);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc,
 	    le16_to_cpu(sas_device_pg0.ParentDevHandle),
@@ -4935,7 +4972,7 @@ _scsih_remove_device(struct MPT3SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
 
-	kfree(sas_device);
+	put_sas_device(sas_device);
 }
 
 #ifdef CONFIG_SCSI_MPT3SAS_LOGGING
@@ -7521,6 +7558,7 @@ _scsih_probe_boot_devices(struct MPT3SAS_ADAPTER *ioc)
 		handle = sas_device->handle;
 		sas_address_parent = sas_device->sas_address_parent;
 		sas_address = sas_device->sas_address;
+		kref_get(&sas_device->kref);
 		list_move_tail(&sas_device->list, &ioc->sas_device_list);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -7533,6 +7571,7 @@ _scsih_probe_boot_devices(struct MPT3SAS_ADAPTER *ioc)
 				    sas_address_parent);
 			_scsih_sas_device_remove(ioc, sas_device);
 		}
+		put_sas_device(sas_device);
 	}
 }
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* mpt2sas: device_blocked question
  2013-01-25 19:20       ` [PATCH v3] " Jörn Engel
@ 2013-01-26  3:52         ` spren.gm
  2013-01-30  6:52           ` Reddy, Sreekanth
  0 siblings, 1 reply; 7+ messages in thread
From: spren.gm @ 2013-01-26  3:52 UTC (permalink / raw)
  To: JörnEngel, BjørnMork, Nagalakshmi Nandigama,
	Sreekanth Reddy, support@lsi.com, James E.J. Bottomley,
	DL-MPTFusionLinux@lsi.com
  Cc: linux-scsi@vger.kernel.org

Hi,
  we encounter a problem that a disk is blocked forever, the key log is:

mpt2sas0: TEST_UNIT_READY: handle(0x000a), lun(0)
mpt2sas0: TEST_UNIT_READY: handle(0x000a), lun(0)
mpt2sas0: Spinning up disk.... handle(0x000a), lun(0)
mpt2sas0: TEST_UNIT_READY: handle(0x000a), lun(0)
mpt2sas0: Spinning up disk.... handle(0x000a), lun(0)
mpt2sas0: TEST_UNIT_READY: handle(0x000a), lun(0)
mpt2sas0: Spinning up disk.... handle(0x000a), lun(0)
sd 2:0:1:0: device_blocked, handle(0x000a)

  looking into the driver code, it seems that the device is blocked by function 
_scsih_block_io_to_children_attached_directly() in mpt2sas/mpt2sas_scsih.c
with reason_code == MPI2_EVENT_SAS_TOPO_RC_DELAY_NOT_RESPONDING,
  what does this reason_code mean please? when should it be returned? and any way to handle it?

Best Regards,
spren
2013-01-26


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: mpt2sas: device_blocked question
  2013-01-26  3:52         ` mpt2sas: device_blocked question spren.gm
@ 2013-01-30  6:52           ` Reddy, Sreekanth
  0 siblings, 0 replies; 7+ messages in thread
From: Reddy, Sreekanth @ 2013-01-30  6:52 UTC (permalink / raw)
  To: spren.gm@gmail.com, Nandigama, Nagalakshmi, Support,
	James E.J. Bottomley, DL-MPT Fusion Linux
  Cc: linux-scsi@vger.kernel.org

Hi Spren,

Reason code MPI2_EVENT_SAS_TOPO_RC_DELAY_NOT_RESPONDING mean that "A target device has stopped responding or is missing but the ReportDeviceMissingDelay timer has not expired for this device".

Can you please send me the some more information i.e.

 1. For which operation/TestCase this issue has occurred?
 2. Which Driver & Firmware version you are using?
 3. Can you also please send me the complete log.

Thanks,
Sreekanth.

-----Original Message-----
From: spren.gm@gmail.com [mailto:spren.gm@gmail.com] 
Sent: Saturday, January 26, 2013 9:22 AM
To: JörnEngel; BjørnMork; Nandigama, Nagalakshmi; Reddy, Sreekanth; Support; James E.J. Bottomley; DL-MPT Fusion Linux
Cc: linux-scsi@vger.kernel.org
Subject: mpt2sas: device_blocked question

Hi,
  we encounter a problem that a disk is blocked forever, the key log is:

mpt2sas0: TEST_UNIT_READY: handle(0x000a), lun(0)
mpt2sas0: TEST_UNIT_READY: handle(0x000a), lun(0)
mpt2sas0: Spinning up disk.... handle(0x000a), lun(0)
mpt2sas0: TEST_UNIT_READY: handle(0x000a), lun(0)
mpt2sas0: Spinning up disk.... handle(0x000a), lun(0)
mpt2sas0: TEST_UNIT_READY: handle(0x000a), lun(0)
mpt2sas0: Spinning up disk.... handle(0x000a), lun(0) sd 2:0:1:0: device_blocked, handle(0x000a)

  looking into the driver code, it seems that the device is blocked by function
_scsih_block_io_to_children_attached_directly() in mpt2sas/mpt2sas_scsih.c with reason_code == MPI2_EVENT_SAS_TOPO_RC_DELAY_NOT_RESPONDING,
  what does this reason_code mean please? when should it be returned? and any way to handle it?

Best Regards,
spren
2013-01-26

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-01-30  6:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-23 21:20 [PATCH] mpt2sas: prevent double free on error path Jörn Engel
2013-01-24  7:51 ` Bjørn Mork
2013-01-25 17:12   ` Jörn Engel
2013-01-25 18:00     ` [PATCH v2] mpt2sas/mpt3sas: " Jörn Engel
2013-01-25 19:20       ` [PATCH v3] " Jörn Engel
2013-01-26  3:52         ` mpt2sas: device_blocked question spren.gm
2013-01-30  6:52           ` Reddy, Sreekanth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).