From: Calvin Owens <calvinowens@fb.com>
To: Sreekanth Reddy <sreekanth.reddy@avagotech.com>
Cc: martin.petersen@oracle.com, linux-scsi@vger.kernel.org,
jejb@kernel.org, JBottomley@Parallels.com,
Sathya.Prakash@avagotech.com, chaitra.basappa@avagotech.com,
linux-kernel@vger.kernel.org, hch@infradead.org,
calvinowens@fb.com
Subject: Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization
Date: Wed, 6 May 2015 11:48:52 -0700 [thread overview]
Message-ID: <20150506184852.GA1007288@mail.thefacebook.com> (raw)
In-Reply-To: <1430751931-14002-1-git-send-email-Sreekanth.Reddy@avagotech.com>
On Monday 05/04 at 20:35 +0530, Sreekanth Reddy wrote:
> I have applied this patch on the latest upstream mpt3sas driver, then
> I have compiled and loaded the driver. In the driver logs I didn't
> see any attached drives are added to the OS, 'fdisk -l' command also
> doesn't list the drives which are actually attached to the HBA.
>
> When I debug this issue then I see that in '_scsih_target_alloc'
> driver is searching for sas_device from the lists
> 'sas_device_init_list' & 'sas_device_list' based on the device sas
> address using the function
> mpt3sas_scsih_sas_device_find_by_sas_address(), since this device is
> not in the 'sas_device_init_list' (as it is moved it to head list)
> driver exit from this function without updating the required device
> addition information.
Yes, I misunderstood that the initialization depended on the devices
still being on the init_list.
What's interesting about this is that when I tested it, it still worked.
I think the MPT2SAS_PORT_ENABLE_COMPLETE fw_event might zero
ioc->start_scan and allow scsih_scan_finished() to start probing devices
before all the devices are actually on the init_list. It seems to be
very repeatable per-machine whether or not it works.
But in any case, my patch was wrong.
> To solve the original problem (i.e memory corruption), here I have
> attached the patch, in this patch I have added one atomic flag
> is_on_sas_device_init_list in _sas_device_structure and I followed
> below algorithm.
The problem is that this only solves a single case. There isn't anything
to enforce that this or a similar chain of events can't happen elsewhere
in the code.
I think the best general solution would be to add a refcount to these
objects. They sit on a list that can be concurrently accessed from
multiple threads, so I think a refcount is the best way to ensure that
objects aren't freed out from under other users.
I'm working on a patchset that does this. I'm starting by adding a
refcount to the sas_device object only, and refactoring the code in
mpt2sas_scsih.c to use it. I should be able to send up a first version
of that pretty soon to get some feedback.
Thanks,
Calvin
> 1. when ever a device is added to sas_device_init_list then driver
> will set this atomic flag of this device to one.
>
> 2. And during the addition of this device to SCSI mid layer, if the
> device is successfully added to the OS then driver will move this
> device list in to sas_device_list list from sas_device_init_list list
> and at this time driver will reset this flag to zero. if device is
> failed to register with SCSI mid layer then also driver will reset
> this flag to zero in function _scsih_sas_device_remove and will remove
> the device entry from sas_device_init_list and will free the device
> structure.
>
> 3. Now when a device is removed then driver will receive target not responding event and in the function _scsih_device_remove_by_handle,
> a. driver will check whether addition of discovered devices to SML process is currently running or not,
> i. if addition (or registration) of discovered devices to SML process is running then driver will check whether device is in sas_device_init_list or not (by reading the atomic flag)?.
> if it is in a sas_device_init_list then driver will ignore this device removal event (since device registration with SML will fail and it is removed in function _scsih_sas_device_remove as mentioned in step 2).
> ii. if the device is not in a sas_device_init_list or addition (or registration) of discovered devices to SML process is already completed then device structure is removed from this function and this device entry is removed from sas_device_list.
>
> 4. if the device removal event is received after device structure is freed due to failure of device registration with SML them in the function _scsih_device_remove_by_handle driver won't find this device in the sas_device_list or in a sas_device_init_list and so driver will ignore this device removal event.
>
> Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
> ---
> drivers/scsi/mpt2sas/mpt2sas_base.h | 2 ++
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 45 +++++++++++++++++++++++++++---------
> drivers/scsi/mpt3sas/mpt3sas_base.h | 2 ++
> drivers/scsi/mpt3sas/mpt3sas_scsih.c | 43 ++++++++++++++++++++++++++--------
> 4 files changed, 71 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..1aa10d2 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -376,6 +376,7 @@ struct _sas_device {
> u8 phy;
> u8 responding;
> u8 pfa_led_on;
> + atomic_t is_on_sas_device_init_list;
> };
>
> /**
> @@ -833,6 +834,7 @@ struct MPT2SAS_ADAPTER {
> u8 broadcast_aen_busy;
> u16 broadcast_aen_pending;
> u8 shost_recovery;
> + u8 discovered_device_addition_on;
>
> struct mutex reset_in_progress_mutex;
> spinlock_t ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..2a61286 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -590,13 +590,20 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> struct _sas_device *sas_device)
> {
> unsigned long flags;
> + struct _sas_device *same_sas_device;
>
> if (!sas_device)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> + sas_device->handle);
> + if (same_sas_device) {
> + list_del(&same_sas_device->list);
> + if (atomic_read(&sas_device->is_on_sas_device_init_list))
> + atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> + kfree(same_sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -658,6 +664,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> sas_device->handle, (unsigned long long)sas_device->sas_address));
>
> + atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> _scsih_determine_boot_device(ioc, sas_device, 0);
> @@ -5364,8 +5371,14 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -5391,8 +5404,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> sas_address);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -7978,32 +7997,36 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> struct _sas_device *sas_device, *next;
> unsigned long flags;
>
> + ioc->discovered_device_addition_on = 1;
> /* SAS Device List */
> list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> list) {
>
> if (ioc->hide_drives)
> continue;
> -
> +
> if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> sas_device->sas_address_parent)) {
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + mpt2sas_transport_port_remove(ioc,
> + sas_device->sas_address,
> + sas_device->sas_address_parent);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> } else if (!sas_device->starget) {
> if (!ioc->is_driver_loading) {
> mpt2sas_transport_port_remove(ioc,
> sas_device->sas_address,
> sas_device->sas_address_parent);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> }
> }
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_move_tail(&sas_device->list, &ioc->sas_device_list);
> + atomic_dec(&sas_device->is_on_sas_device_init_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> + ioc->discovered_device_addition_on = 0;
> }
>
> /**
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
> index afa8816..6188490 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> @@ -315,6 +315,7 @@ struct _sas_device {
> u8 responding;
> u8 fast_path;
> u8 pfa_led_on;
> + atomic_t is_on_sas_device_init_list;
> };
>
> /**
> @@ -766,6 +767,7 @@ struct MPT3SAS_ADAPTER {
> u8 broadcast_aen_busy;
> u16 broadcast_aen_pending;
> u8 shost_recovery;
> + u8 discovered_device_addition_on;
>
> struct mutex reset_in_progress_mutex;
> spinlock_t ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..53cc9ea 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -582,13 +582,20 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
> struct _sas_device *sas_device)
> {
> unsigned long flags;
> + struct _sas_device *same_sas_device;
>
> if (!sas_device)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> + sas_device->handle);
> + if (same_sas_device) {
> + list_del(&same_sas_device->list);
> + if (atomic_read(&sas_device->is_on_sas_device_init_list))
> + atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> + kfree(same_sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -610,8 +616,14 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -637,8 +649,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
> sas_address);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -663,6 +681,7 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
> ioc->name, __func__, sas_device->handle,
> (unsigned long long)sas_device->sas_address));
>
> + atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_add_tail(&sas_device->list, &ioc->sas_device_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -7610,14 +7629,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> struct _sas_device *sas_device, *next;
> unsigned long flags;
>
> + ioc->discovered_device_addition_on = 1;
> /* SAS Device List */
> list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> list) {
>
> if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
> sas_device->sas_address_parent)) {
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + mpt3sas_transport_port_remove(ioc,
> + sas_device->sas_address,
> + sas_device->sas_address_parent);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> } else if (!sas_device->starget) {
> /*
> @@ -7630,16 +7652,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> mpt3sas_transport_port_remove(ioc,
> sas_device->sas_address,
> sas_device->sas_address_parent);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> }
> }
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_move_tail(&sas_device->list, &ioc->sas_device_list);
> + atomic_dec(&sas_device->is_on_sas_device_init_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> + ioc->discovered_device_addition_on = 0;
> }
>
> /**
> --
> 2.0.2
>
WARNING: multiple messages have this Message-ID (diff)
From: Calvin Owens <calvinowens@fb.com>
To: Sreekanth Reddy <sreekanth.reddy@avagotech.com>
Cc: <martin.petersen@oracle.com>, <linux-scsi@vger.kernel.org>,
<jejb@kernel.org>, <JBottomley@Parallels.com>,
<Sathya.Prakash@avagotech.com>, <chaitra.basappa@avagotech.com>,
<linux-kernel@vger.kernel.org>, <hch@infradead.org>,
<calvinowens@fb.com>
Subject: Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization
Date: Wed, 6 May 2015 11:48:52 -0700 [thread overview]
Message-ID: <20150506184852.GA1007288@mail.thefacebook.com> (raw)
In-Reply-To: <1430751931-14002-1-git-send-email-Sreekanth.Reddy@avagotech.com>
On Monday 05/04 at 20:35 +0530, Sreekanth Reddy wrote:
> I have applied this patch on the latest upstream mpt3sas driver, then
> I have compiled and loaded the driver. In the driver logs I didn't
> see any attached drives are added to the OS, 'fdisk -l' command also
> doesn't list the drives which are actually attached to the HBA.
>
> When I debug this issue then I see that in '_scsih_target_alloc'
> driver is searching for sas_device from the lists
> 'sas_device_init_list' & 'sas_device_list' based on the device sas
> address using the function
> mpt3sas_scsih_sas_device_find_by_sas_address(), since this device is
> not in the 'sas_device_init_list' (as it is moved it to head list)
> driver exit from this function without updating the required device
> addition information.
Yes, I misunderstood that the initialization depended on the devices
still being on the init_list.
What's interesting about this is that when I tested it, it still worked.
I think the MPT2SAS_PORT_ENABLE_COMPLETE fw_event might zero
ioc->start_scan and allow scsih_scan_finished() to start probing devices
before all the devices are actually on the init_list. It seems to be
very repeatable per-machine whether or not it works.
But in any case, my patch was wrong.
> To solve the original problem (i.e memory corruption), here I have
> attached the patch, in this patch I have added one atomic flag
> is_on_sas_device_init_list in _sas_device_structure and I followed
> below algorithm.
The problem is that this only solves a single case. There isn't anything
to enforce that this or a similar chain of events can't happen elsewhere
in the code.
I think the best general solution would be to add a refcount to these
objects. They sit on a list that can be concurrently accessed from
multiple threads, so I think a refcount is the best way to ensure that
objects aren't freed out from under other users.
I'm working on a patchset that does this. I'm starting by adding a
refcount to the sas_device object only, and refactoring the code in
mpt2sas_scsih.c to use it. I should be able to send up a first version
of that pretty soon to get some feedback.
Thanks,
Calvin
> 1. when ever a device is added to sas_device_init_list then driver
> will set this atomic flag of this device to one.
>
> 2. And during the addition of this device to SCSI mid layer, if the
> device is successfully added to the OS then driver will move this
> device list in to sas_device_list list from sas_device_init_list list
> and at this time driver will reset this flag to zero. if device is
> failed to register with SCSI mid layer then also driver will reset
> this flag to zero in function _scsih_sas_device_remove and will remove
> the device entry from sas_device_init_list and will free the device
> structure.
>
> 3. Now when a device is removed then driver will receive target not responding event and in the function _scsih_device_remove_by_handle,
> a. driver will check whether addition of discovered devices to SML process is currently running or not,
> i. if addition (or registration) of discovered devices to SML process is running then driver will check whether device is in sas_device_init_list or not (by reading the atomic flag)?.
> if it is in a sas_device_init_list then driver will ignore this device removal event (since device registration with SML will fail and it is removed in function _scsih_sas_device_remove as mentioned in step 2).
> ii. if the device is not in a sas_device_init_list or addition (or registration) of discovered devices to SML process is already completed then device structure is removed from this function and this device entry is removed from sas_device_list.
>
> 4. if the device removal event is received after device structure is freed due to failure of device registration with SML them in the function _scsih_device_remove_by_handle driver won't find this device in the sas_device_list or in a sas_device_init_list and so driver will ignore this device removal event.
>
> Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
> ---
> drivers/scsi/mpt2sas/mpt2sas_base.h | 2 ++
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 45 +++++++++++++++++++++++++++---------
> drivers/scsi/mpt3sas/mpt3sas_base.h | 2 ++
> drivers/scsi/mpt3sas/mpt3sas_scsih.c | 43 ++++++++++++++++++++++++++--------
> 4 files changed, 71 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..1aa10d2 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -376,6 +376,7 @@ struct _sas_device {
> u8 phy;
> u8 responding;
> u8 pfa_led_on;
> + atomic_t is_on_sas_device_init_list;
> };
>
> /**
> @@ -833,6 +834,7 @@ struct MPT2SAS_ADAPTER {
> u8 broadcast_aen_busy;
> u16 broadcast_aen_pending;
> u8 shost_recovery;
> + u8 discovered_device_addition_on;
>
> struct mutex reset_in_progress_mutex;
> spinlock_t ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..2a61286 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -590,13 +590,20 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> struct _sas_device *sas_device)
> {
> unsigned long flags;
> + struct _sas_device *same_sas_device;
>
> if (!sas_device)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> + sas_device->handle);
> + if (same_sas_device) {
> + list_del(&same_sas_device->list);
> + if (atomic_read(&sas_device->is_on_sas_device_init_list))
> + atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> + kfree(same_sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -658,6 +664,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> sas_device->handle, (unsigned long long)sas_device->sas_address));
>
> + atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> _scsih_determine_boot_device(ioc, sas_device, 0);
> @@ -5364,8 +5371,14 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -5391,8 +5404,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> sas_address);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -7978,32 +7997,36 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> struct _sas_device *sas_device, *next;
> unsigned long flags;
>
> + ioc->discovered_device_addition_on = 1;
> /* SAS Device List */
> list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> list) {
>
> if (ioc->hide_drives)
> continue;
> -
> +
> if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> sas_device->sas_address_parent)) {
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + mpt2sas_transport_port_remove(ioc,
> + sas_device->sas_address,
> + sas_device->sas_address_parent);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> } else if (!sas_device->starget) {
> if (!ioc->is_driver_loading) {
> mpt2sas_transport_port_remove(ioc,
> sas_device->sas_address,
> sas_device->sas_address_parent);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> }
> }
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_move_tail(&sas_device->list, &ioc->sas_device_list);
> + atomic_dec(&sas_device->is_on_sas_device_init_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> + ioc->discovered_device_addition_on = 0;
> }
>
> /**
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
> index afa8816..6188490 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> @@ -315,6 +315,7 @@ struct _sas_device {
> u8 responding;
> u8 fast_path;
> u8 pfa_led_on;
> + atomic_t is_on_sas_device_init_list;
> };
>
> /**
> @@ -766,6 +767,7 @@ struct MPT3SAS_ADAPTER {
> u8 broadcast_aen_busy;
> u16 broadcast_aen_pending;
> u8 shost_recovery;
> + u8 discovered_device_addition_on;
>
> struct mutex reset_in_progress_mutex;
> spinlock_t ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..53cc9ea 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -582,13 +582,20 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
> struct _sas_device *sas_device)
> {
> unsigned long flags;
> + struct _sas_device *same_sas_device;
>
> if (!sas_device)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> + sas_device->handle);
> + if (same_sas_device) {
> + list_del(&same_sas_device->list);
> + if (atomic_read(&sas_device->is_on_sas_device_init_list))
> + atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> + kfree(same_sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -610,8 +616,14 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -637,8 +649,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
> sas_address);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -663,6 +681,7 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
> ioc->name, __func__, sas_device->handle,
> (unsigned long long)sas_device->sas_address));
>
> + atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_add_tail(&sas_device->list, &ioc->sas_device_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -7610,14 +7629,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> struct _sas_device *sas_device, *next;
> unsigned long flags;
>
> + ioc->discovered_device_addition_on = 1;
> /* SAS Device List */
> list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> list) {
>
> if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
> sas_device->sas_address_parent)) {
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + mpt3sas_transport_port_remove(ioc,
> + sas_device->sas_address,
> + sas_device->sas_address_parent);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> } else if (!sas_device->starget) {
> /*
> @@ -7630,16 +7652,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> mpt3sas_transport_port_remove(ioc,
> sas_device->sas_address,
> sas_device->sas_address_parent);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> }
> }
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_move_tail(&sas_device->list, &ioc->sas_device_list);
> + atomic_dec(&sas_device->is_on_sas_device_init_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> + ioc->discovered_device_addition_on = 0;
> }
>
> /**
> --
> 2.0.2
>
next prev parent reply other threads:[~2015-05-06 18:48 UTC|newest]
Thread overview: 90+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-04 15:05 [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization Sreekanth Reddy
2015-05-05 15:35 ` Tomas Henzl
2015-05-12 9:38 ` Sreekanth Reddy
2015-05-06 18:48 ` Calvin Owens [this message]
2015-05-06 18:48 ` Calvin Owens
2015-05-15 3:41 ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
2015-05-15 3:41 ` Calvin Owens
2015-05-15 3:41 ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
2015-05-15 3:41 ` Calvin Owens
2015-05-15 3:41 ` [PATCH 2/6] Refactor code to use new sas_device refcount Calvin Owens
2015-05-15 3:41 ` Calvin Owens
2015-05-15 3:41 ` [PATCH 3/6] Fix unsafe sas_device_list usage Calvin Owens
2015-05-15 3:41 ` Calvin Owens
2015-05-15 3:42 ` [PATCH 4/6] Add refcount to fw_event_work struct Calvin Owens
2015-05-15 3:42 ` Calvin Owens
2015-05-15 3:42 ` [PATCH 5/6] Refactor code to use new fw_event refcount Calvin Owens
2015-05-15 3:42 ` Calvin Owens
2015-05-15 3:42 ` [PATCH 6/6] Fix unsafe fw_event_list usage Calvin Owens
2015-05-15 3:42 ` Calvin Owens
2015-06-09 3:50 ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
2015-06-09 3:50 ` Calvin Owens
2015-06-09 3:50 ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
2015-06-09 3:50 ` Calvin Owens
2015-07-03 15:24 ` Christoph Hellwig
2015-06-09 3:50 ` [PATCH 2/6] Refactor code to use new sas_device refcount Calvin Owens
2015-06-09 3:50 ` Calvin Owens
2015-07-03 15:38 ` Christoph Hellwig
2015-07-12 4:15 ` Calvin Owens
2015-07-12 4:15 ` Calvin Owens
2015-06-09 3:50 ` [PATCH 3/6] Fix unsafe sas_device_list usage Calvin Owens
2015-06-09 3:50 ` Calvin Owens
2015-07-03 16:03 ` Christoph Hellwig
2015-06-09 3:50 ` [PATCH 4/6] Add refcount to fw_event_work struct Calvin Owens
2015-06-09 3:50 ` Calvin Owens
2015-07-03 15:38 ` Christoph Hellwig
2015-06-09 3:50 ` [PATCH 5/6] Refactor code to use new fw_event refcount Calvin Owens
2015-06-09 3:50 ` Calvin Owens
2015-07-03 16:00 ` Christoph Hellwig
2015-07-12 4:13 ` Calvin Owens
2015-07-12 4:13 ` Calvin Owens
2015-06-09 3:50 ` [PATCH 6/6] Fix unsafe fw_event_list usage Calvin Owens
2015-06-09 3:50 ` Calvin Owens
2015-07-03 16:02 ` Christoph Hellwig
2015-07-12 4:20 ` Calvin Owens
2015-07-12 4:20 ` Calvin Owens
2015-07-02 20:15 ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Bart Van Assche
2015-07-02 20:15 ` Bart Van Assche
2015-07-12 4:24 ` [PATCH 0/2 v2] " Calvin Owens
2015-07-12 4:24 ` Calvin Owens
2015-07-12 4:24 ` [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
2015-07-12 4:24 ` Calvin Owens
2015-07-13 6:52 ` Christoph Hellwig
2015-07-21 7:06 ` Calvin Owens
2015-07-21 7:06 ` Calvin Owens
2015-07-13 15:05 ` Joe Lawrence
2015-07-13 15:05 ` Joe Lawrence
2015-07-21 7:04 ` Calvin Owens
2015-07-21 7:04 ` Calvin Owens
2015-07-16 14:57 ` Sreekanth Reddy
2015-07-21 7:03 ` Calvin Owens
2015-07-21 7:03 ` Calvin Owens
2015-07-12 4:24 ` [PATCH 2/2] mpt2sas: Refcount fw_events " Calvin Owens
2015-07-12 4:24 ` Calvin Owens
2015-07-13 6:52 ` Christoph Hellwig
2015-08-01 5:02 ` [PATCH v3 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
2015-08-01 5:02 ` Calvin Owens
2015-08-01 5:02 ` [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
2015-08-01 5:02 ` Calvin Owens
2015-08-10 13:15 ` Sreekanth Reddy
2015-08-14 1:43 ` Calvin Owens
2015-08-14 1:43 ` Calvin Owens
2015-08-01 5:02 ` [PATCH v3 2/2] mpt2sas: Refcount fw_events " Calvin Owens
2015-08-01 5:02 ` Calvin Owens
2015-08-14 1:48 ` [PATCH v4 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
2015-08-14 1:48 ` Calvin Owens
2015-08-14 1:48 ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
2015-08-14 1:48 ` Calvin Owens
2015-08-14 1:48 ` [PATCH v4 2/2] mpt2sas: Refcount fw_events " Calvin Owens
2015-08-14 1:48 ` Calvin Owens
2015-08-25 21:06 ` Nicholas A. Bellinger
2015-09-04 14:35 ` Sreekanth Reddy
2015-08-25 21:03 ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects " Nicholas A. Bellinger
2015-09-04 14:34 ` Sreekanth Reddy
2015-08-25 21:21 ` [PATCH v4 0/2] Fixes for memory corruption in mpt2sas Nicholas A. Bellinger
2015-07-02 19:22 ` [PATCH 0/6] " Jens Axboe
2015-07-02 19:22 ` Jens Axboe
-- strict thread matches above, loose matches on Subject: below --
2015-04-10 7:14 [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization Calvin Owens
2015-04-10 14:30 ` James Bottomley
2015-04-10 14:30 ` James Bottomley
2015-04-10 16:43 ` Sathya Prakash
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150506184852.GA1007288@mail.thefacebook.com \
--to=calvinowens@fb.com \
--cc=JBottomley@Parallels.com \
--cc=Sathya.Prakash@avagotech.com \
--cc=chaitra.basappa@avagotech.com \
--cc=hch@infradead.org \
--cc=jejb@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=sreekanth.reddy@avagotech.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.