public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mpt2sas: prevent double free on error path
@ 2013-01-23 21:20 Jörn Engel
  2013-01-24  7:51 ` Bjørn Mork
  0 siblings, 1 reply; 6+ messages in thread
From: Jörn Engel @ 2013-01-23 21:20 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Sreekanth Reddy, support,
	James E.J. Bottomley, DL-MPTFusionLinux, linux-scsi, linux-kernel

I noticed this one when list_del was called with poisoned list
pointers, but the real problem is a double-free (and a use-after-free
just before that).

Both _scsih_probe_boot_devices() and _scsih_sas_device_add() put the
sas_device onto a list, thereby giving up control.  Next they call
mpt2sas_transport_port_add() and will list_del and free the object on
errors.  If some other function already did the list_del and free, it
will happen again.

This patch adds reference counting to prevent the double free.  One
reference count goes to the caller of mpt2sas_transport_port_add(), the
second to the list.  Whoever removes the object from the list gets to
drop one reference count.  _scsih_probe_boot_devices() and
_scsih_sas_device_add() get a second reference count to ensure the
object is not freed while they are still accessing it.

To prevent the double list_del(), I changed the code to list_del_init()
and added a list_empty() check before that.  Since the
list_empty/list_del_init is always called under a lock, this should be
safe.

I hate the complexity this patch adds, but see no alternative to it.

mpt2sas0: failure at drivers/scsi/mpt2sas/mpt2sas_transport.c:708/mpt2sas_transport_port_add()!
general protection fault: 0000 [#1] SMP
CPU 9
Pid: 3097, comm: kworker/u:11 Tainted: G        W  O 3.6.10+ #31392.trunk    /0JP31P
RIP: 0010:[<ffffffffa0309744>]  [<ffffffffa0309744>] _scsih_sas_device_remove+0x54/0x90 [mpt2sas]
RSP: 0018:ffff881fed4d7ab0  EFLAGS: 00010046
RAX: dead000000200200 RBX: ffff881ff6a5cd88 RCX: 00000000000010e8
RDX: ffff881ff7dab800 RSI: ffff881ff7daba00 RDI: dead000000100100
RBP: ffff881fed4d7ad0 R08: dead000000200200 R09: ffff880fff802200
R10: ffffffffa0317980 R11: 0000000000000000 R12: ffff881ff7daba00
R13: 0000000000000286 R14: 500605ba006c9d09 R15: ffff881ff7daba00
FS:  0000000000000000(0000) GS:ffff88203fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f8ac89ec458 CR3: 0000001ff4c5c000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/u:11 (pid: 3097, threadinfo ffff881fed4d6000, task ffff881402f3d9c0)
Stack:
 0000000000000401 ffff881ff6a5c6b0 0000000000000401 0000000000000016
 ffff881fed4d7bb0 ffffffffa030f93e 0000000000000000 ffff881ff6a5cd88
 0012000e0f000008 006c9d090002000b 00180009500605ba 0000040100000016
Call Trace:
 [<ffffffffa030f93e>] _scsih_add_device.clone.32+0x2fe/0x420 [mpt2sas]
 [<ffffffffa03126e5>] _scsih_sas_topology_change_event.clone.38+0x285/0x620 [mpt2sas]
 [<ffffffff81078c90>] ? load_balance+0x100/0x7a0
 [<ffffffffa0312a80>] ? _scsih_sas_topology_change_event.clone.38+0x620/0x620 [mpt2sas]
 [<ffffffffa0312d8a>] _firmware_event_work+0x30a/0xfc0 [mpt2sas]
 [<ffffffff810015cc>] ? __switch_to+0x14c/0x410
 [<ffffffff8106dfdb>] ? finish_task_switch+0x4b/0xf0
 [<ffffffffa0312a80>] ? _scsih_sas_topology_change_event.clone.38+0x620/0x620 [mpt2sas]
 [<ffffffff8105bf40>] process_one_work+0x140/0x500
 [<ffffffff8105d354>] worker_thread+0x194/0x510
 [<ffffffff8106dfdb>] ? finish_task_switch+0x4b/0xf0
 [<ffffffff8105d1c0>] ? manage_workers+0x320/0x320
 [<ffffffff8106282e>] kthread+0x9e/0xb0
 [<ffffffff815bef44>] kernel_thread_helper+0x4/0x10
 [<ffffffff815b5e5d>] ? retint_restore_args+0x13/0x13
 [<ffffffff81062790>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff815bef40>] ? gs_change+0x13/0x13

Signed-off-by: Joern Engel <joern@logfs.org>
---
 drivers/scsi/mpt2sas/mpt2sas_base.h  |    1 +
 drivers/scsi/mpt2sas/mpt2sas_scsih.c |   55 ++++++++++++++++++++++++++++------
 2 files changed, 47 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index 543d8d6..ceb7d41 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -367,6 +367,7 @@ struct _sas_device {
 	u16	slot;
 	u8	phy;
 	u8	responding;
+	struct kref kref;
 };
 
 /**
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index c6bdc92..43b3a98 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -570,6 +570,18 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	return NULL;
 }
 
+static void free_sas_device(struct kref *kref)
+{
+	struct _sas_device *sas_device = container_of(kref, struct _sas_device,
+			kref);
+	kfree(sas_device);
+}
+
+static void put_sas_device(struct _sas_device *sas_device)
+{
+	kref_put(&sas_device->kref, free_sas_device);
+}
+
 /**
  * _scsih_sas_device_remove - remove sas_device from list.
  * @ioc: per adapter object
@@ -583,14 +595,19 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
     struct _sas_device *sas_device)
 {
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (!sas_device)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		was_on_list = 1;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (was_on_list)
+		put_sas_device(sas_device);
 }
 
 
@@ -613,6 +630,8 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	/* Get an extra refcount... */
+	kref_get(&sas_device->kref);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -630,6 +649,12 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 			sas_device->sas_address_parent);
 		_scsih_sas_device_remove(ioc, sas_device);
 	}
+	/*
+	 * ...and drop it again.  If an error already happend, this is the
+	 * final put and we free the object now.  Otherwise whoever removes
+	 * the object from the list will do the final put and free.
+	 */
+	put_sas_device(sas_device);
 }
 
 /**
@@ -5270,6 +5295,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
+	kref_init(&sas_device->kref);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc, le16_to_cpu
 		(sas_device_pg0.ParentDevHandle),
@@ -5341,7 +5367,7 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
 	    "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
-	kfree(sas_device);
+	put_sas_device(sas_device);
 }
 /**
  * _scsih_device_remove_by_handle - removing device object by handle
@@ -5355,16 +5381,21 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 
@@ -5381,6 +5412,7 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 {
 	struct _sas_device *sas_device;
 	unsigned long flags;
+	int was_on_list = 0;
 
 	if (ioc->shost_recovery)
 		return;
@@ -5388,10 +5420,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
 	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (!list_empty(&sas_device->list)) {
+			list_del_init(&sas_device->list);
+			was_on_list = 1;
+		}
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	if (was_on_list)
 		_scsih_remove_device(ioc, sas_device);
 }
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -7805,6 +7841,7 @@ _scsih_probe_boot_devices(struct MPT2SAS_ADAPTER *ioc)
 		handle = sas_device->handle;
 		sas_address_parent = sas_device->sas_address_parent;
 		sas_address = sas_device->sas_address;
+		kref_get(&sas_device->kref);
 		list_move_tail(&sas_device->list, &ioc->sas_device_list);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread
* [PATCH 00/14] Add blockconsole version 1.1 (try 3)
@ 2013-05-09 20:42 Joern Engel
  2013-05-09 20:42 ` [PATCH] mpt2sas: prevent double free on error path Joern Engel
  0 siblings, 1 reply; 6+ messages in thread
From: Joern Engel @ 2013-05-09 20:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Jens Axboe, Borislav Petkov, Takashi Iwai,
	Joern Engel

Blockconsole is a console driver very roughly similar to netconsole.
Instead of sending messages out via UDP, they are written to a block
device.  Typically a USB stick is chosen, although in principle any
block device will do.

In most cases blockconsole is useful where netconsole is not, i.e.
single machines without network access or without an accessable
netconsole capture server.  When using both blockconsole and
netconsole, I have found netconsole to sometimes create a mess under
high message load (sysrq-t, etc.) while blockconsole does not.

Most importantly, a number of bugs were identified and fixed that
would have been unexplained machine reboots without blockconsole.

More highlights:
* reasonably small and self-contained code,
* some 100+ machine years of runtime,
* nice tutorial with a 30-sec guide for the impatient.

Special thanks to Borislav Petkov for many improvements and kicking my
behind to provide a proper git tree and resend patches.

Git tree is on kernel.org and I intend to keep it stable, as people
seem to be using it already.  It has been in -next since Mar 7.

git://git.kernel.org/pub/scm/linux/kernel/git/joern/bcon2.git

Joern Engel (10):
  do_mounts: constify name_to_dev_t parameter
  add blockconsole version 1.1
  printk: add CON_ALLDATA console flag
  netconsole: use CON_ALLDATA
  blockconsole: use CON_ALLDATA
  bcon: add a release work struct
  bcon: check for hdparm in bcon_tail
  bcon: remove version 1.0 support
  bcon: Fix wrap-around behaviour
  netconsole: s/syslogd/cancd/ in documentation

Takashi Iwai (4):
  blockconsole: Allow to pass a device file path to bcon_tail
  blockconsole: Fix undefined MAX_RT_PRIO
  blockconsole: Rename device_lock with bc_device_lock
  blockconsole: Mark a local work struct static

 Documentation/block/blockconsole.txt            |   94 ++++
 Documentation/block/blockconsole/bcon_tail      |   82 +++
 Documentation/block/blockconsole/mkblockconsole |   29 ++
 Documentation/networking/netconsole.txt         |   16 +-
 block/partitions/Makefile                       |    1 +
 block/partitions/blockconsole.c                 |   22 +
 block/partitions/check.c                        |    3 +
 block/partitions/check.h                        |    3 +
 drivers/block/Kconfig                           |    6 +
 drivers/block/Makefile                          |    1 +
 drivers/block/blockconsole.c                    |  618 +++++++++++++++++++++++
 drivers/net/netconsole.c                        |    2 +-
 include/linux/blockconsole.h                    |    7 +
 include/linux/console.h                         |    1 +
 include/linux/mount.h                           |    2 +-
 init/do_mounts.c                                |    2 +-
 kernel/printk.c                                 |    5 +-
 17 files changed, 885 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/block/blockconsole.txt
 create mode 100755 Documentation/block/blockconsole/bcon_tail
 create mode 100755 Documentation/block/blockconsole/mkblockconsole
 create mode 100644 block/partitions/blockconsole.c
 create mode 100644 drivers/block/blockconsole.c
 create mode 100644 include/linux/blockconsole.h

-- 
1.7.10.4


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-05-09 22:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-23 21:20 [PATCH] mpt2sas: prevent double free on error path Jörn Engel
2013-01-24  7:51 ` Bjørn Mork
2013-01-25 17:12   ` Jörn Engel
2013-01-25 18:00     ` [PATCH v2] mpt2sas/mpt3sas: " Jörn Engel
2013-01-25 19:20       ` [PATCH v3] " Jörn Engel
  -- strict thread matches above, loose matches on Subject: below --
2013-05-09 20:42 [PATCH 00/14] Add blockconsole version 1.1 (try 3) Joern Engel
2013-05-09 20:42 ` [PATCH] mpt2sas: prevent double free on error path Joern Engel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox