From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Jason Yan <yanaijie@huawei.com>,
John Garry <john.garry@huawei.com>,
Johannes Thumshirn <jthumshirn@suse.de>,
Ewan Milne <emilne@redhat.com>, Christoph Hellwig <hch@lst.de>,
Tomas Henzl <thenzl@redhat.com>,
Dan Williams <dan.j.williams@intel.com>,
Hannes Reinecke <hare@suse.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.14 01/51] scsi: libsas: direct call probe and destruct
Date: Mon, 3 Aug 2020 14:19:46 +0200
Message-ID: <20200803121849.564535738@linuxfoundation.org>
In-Reply-To: <20200803121849.488233135@linuxfoundation.org>
From: Jason Yan <yanaijie@huawei.com>
[ Upstream commit 0558f33c06bb910e2879e355192227a8e8f0219d ]
Commit 87c8331fcf72 ("[SCSI] libsas: prevent domain rediscovery
competing with ata error handling") introduced the disco mutex to
prevent rediscovery from competing with ata error handling, and put the
whole revalidation under the mutex. But the rphy add/remove needs to
wait for the error handling, which also grabs the disco mutex. This may
lead to a deadlock. So the probe and destruct events were introduced to
do the rphy add/remove asynchronously and outside the lock.

The asynchronously processed workers make the whole discovery process
non-atomic, so other events may interrupt it. For example, if a loss of
signal event is inserted before the probe event, sas_deform_port() is
called and the port is deleted.

sas_port_delete() may also run before the destruct event, but port-x:x
is the top parent of the end device or expander. This leads to a kernel
WARNING such as:
[ 82.042979] sysfs group 'power' not found for kobject 'phy-1:0:22'
[ 82.042983] ------------[ cut here ]------------
[ 82.042986] WARNING: CPU: 54 PID: 1714 at fs/sysfs/group.c:237
sysfs_remove_group+0x94/0xa0
[ 82.043059] Call trace:
[ 82.043082] [<ffff0000082e7624>] sysfs_remove_group+0x94/0xa0
[ 82.043085] [<ffff00000864e320>] dpm_sysfs_remove+0x60/0x70
[ 82.043086] [<ffff00000863ee10>] device_del+0x138/0x308
[ 82.043089] [<ffff00000869a2d0>] sas_phy_delete+0x38/0x60
[ 82.043091] [<ffff00000869a86c>] do_sas_phy_delete+0x6c/0x80
[ 82.043093] [<ffff00000863dc20>] device_for_each_child+0x58/0xa0
[ 82.043095] [<ffff000008696f80>] sas_remove_children+0x40/0x50
[ 82.043100] [<ffff00000869d1bc>] sas_destruct_devices+0x64/0xa0
[ 82.043102] [<ffff0000080e93bc>] process_one_work+0x1fc/0x4b0
[ 82.043104] [<ffff0000080e96c0>] worker_thread+0x50/0x490
[ 82.043105] [<ffff0000080f0364>] kthread+0xfc/0x128
[ 82.043107] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
Make probe and destruct direct calls in the disco and revalidate
functions, but put them outside the lock. The whole discovery or
revalidation won't be interrupted by other events. The DISCE_PROBE and
DISCE_DESTRUCT events are deleted as a result of the direct calls.

Introduce a new list to destruct the sas_port and put the port delete
after the destruct. This ensures the right order of destroying the sysfs
kobjects and fixes the warning above.

sas_ex_revalidate_domain() has a loop to find all broadcast devices, and
sometimes the same expander can be found twice. Because the sas_port is
deleted at the end of the whole revalidation process, a sas_port with
the same name cannot be added before that; otherwise sysfs will complain
about creating a duplicate filename. Since the LLDD will send a
broadcast for every device change, we can process only one expander's
revalidation per pass.
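
The ordering described above (queue under the lock, then tear down
children before parents once the lock is dropped) can be sketched in
plain userspace C. This is only an illustration under stated
assumptions: every demo_* name is hypothetical and does not exist in
libsas, a plain flag stands in for disco_mutex, and fixed arrays stand
in for destroy_list and sas_port_del_list.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; none of these are libsas names. */
#define MAX_PENDING 8
#define LOG_SIZE 256

struct demo_port {
	int locked;                             /* stands in for disco_mutex */
	const char *dev_pending[MAX_PENDING];   /* plays the role of destroy_list */
	int n_dev;
	const char *port_pending[MAX_PENDING];  /* plays the role of sas_port_del_list */
	int n_port;
	char log[LOG_SIZE];                     /* records the teardown order */
};

void demo_init(struct demo_port *p)
{
	memset(p, 0, sizeof(*p));
}

/* While "locked", objects are only queued; nothing is torn down. */
void demo_queue_dev(struct demo_port *p, const char *name)
{
	if (p->n_dev < MAX_PENDING)
		p->dev_pending[p->n_dev++] = name;
}

void demo_queue_port(struct demo_port *p, const char *name)
{
	if (p->n_port < MAX_PENDING)
		p->port_pending[p->n_port++] = name;
}

static void demo_log(struct demo_port *p, const char *what, const char *name)
{
	size_t used = strlen(p->log);

	/* snprintf truncates safely if the log buffer fills up */
	snprintf(p->log + used, sizeof(p->log) - used, "%s:%s;", what, name);
}

/* Teardown runs outside the lock: devices (children) first, then ports. */
void demo_teardown(struct demo_port *p)
{
	int i;

	assert(!p->locked);

	for (i = 0; i < p->n_dev; i++)
		demo_log(p, "del_dev", p->dev_pending[i]);
	p->n_dev = 0;

	for (i = 0; i < p->n_port; i++)
		demo_log(p, "del_port", p->port_pending[i]);
	p->n_port = 0;
}

void demo_revalidate(struct demo_port *p)
{
	p->locked = 1;
	/* The parent is queued first on purpose: queueing order must not matter. */
	demo_queue_port(p, "port-1:0");
	demo_queue_dev(p, "end_device-1:0");
	p->locked = 0;

	demo_teardown(p);
}
```

Calling demo_init() and then demo_revalidate() on a struct demo_port
leaves the device entry ahead of the port entry in the log, even though
the port was queued first. That is the property the new list is meant to
give: a parent is never removed while its children still exist.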
[mkp: kbuild test robot warning]
Signed-off-by: Jason Yan <yanaijie@huawei.com>
CC: John Garry <john.garry@huawei.com>
CC: Johannes Thumshirn <jthumshirn@suse.de>
CC: Ewan Milne <emilne@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: Tomas Henzl <thenzl@redhat.com>
CC: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/scsi/libsas/sas_ata.c | 1 -
drivers/scsi/libsas/sas_discover.c | 32 +++++++++++++++++-------------
drivers/scsi/libsas/sas_expander.c | 8 +++-----
drivers/scsi/libsas/sas_internal.h | 1 +
drivers/scsi/libsas/sas_port.c | 3 +++
include/scsi/libsas.h | 3 +--
include/scsi/scsi_transport_sas.h | 1 +
7 files changed, 27 insertions(+), 22 deletions(-)
diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 70be4425ae0be..2b3637b40dde9 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -730,7 +730,6 @@ int sas_discover_sata(struct domain_device *dev)
if (res)
return res;
- sas_discover_event(dev->port, DISCE_PROBE);
return 0;
}
diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
index b200edc665a58..d6365e2fcc603 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -221,13 +221,9 @@ void sas_notify_lldd_dev_gone(struct domain_device *dev)
}
}
-static void sas_probe_devices(struct work_struct *work)
+static void sas_probe_devices(struct asd_sas_port *port)
{
struct domain_device *dev, *n;
- struct sas_discovery_event *ev = to_sas_discovery_event(work);
- struct asd_sas_port *port = ev->port;
-
- clear_bit(DISCE_PROBE, &port->disc.pending);
/* devices must be domain members before link recovery and probe */
list_for_each_entry(dev, &port->disco_list, disco_list_node) {
@@ -303,7 +299,6 @@ int sas_discover_end_dev(struct domain_device *dev)
res = sas_notify_lldd_dev_found(dev);
if (res)
return res;
- sas_discover_event(dev->port, DISCE_PROBE);
return 0;
}
@@ -362,13 +357,9 @@ static void sas_unregister_common_dev(struct asd_sas_port *port, struct domain_d
sas_put_device(dev);
}
-static void sas_destruct_devices(struct work_struct *work)
+void sas_destruct_devices(struct asd_sas_port *port)
{
struct domain_device *dev, *n;
- struct sas_discovery_event *ev = to_sas_discovery_event(work);
- struct asd_sas_port *port = ev->port;
-
- clear_bit(DISCE_DESTRUCT, &port->disc.pending);
list_for_each_entry_safe(dev, n, &port->destroy_list, disco_list_node) {
list_del_init(&dev->disco_list_node);
@@ -379,6 +370,16 @@ static void sas_destruct_devices(struct work_struct *work)
}
}
+static void sas_destruct_ports(struct asd_sas_port *port)
+{
+ struct sas_port *sas_port, *p;
+
+ list_for_each_entry_safe(sas_port, p, &port->sas_port_del_list, del_list) {
+ list_del_init(&sas_port->del_list);
+ sas_port_delete(sas_port);
+ }
+}
+
void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
{
if (!test_bit(SAS_DEV_DESTROY, &dev->state) &&
@@ -393,7 +394,6 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
sas_rphy_unlink(dev->rphy);
list_move_tail(&dev->disco_list_node, &port->destroy_list);
- sas_discover_event(dev->port, DISCE_DESTRUCT);
}
}
@@ -499,6 +499,8 @@ static void sas_discover_domain(struct work_struct *work)
port->port_dev = NULL;
}
+ sas_probe_devices(port);
+
SAS_DPRINTK("DONE DISCOVERY on port %d, pid:%d, result:%d\n", port->id,
task_pid_nr(current), error);
}
@@ -532,6 +534,10 @@ static void sas_revalidate_domain(struct work_struct *work)
port->id, task_pid_nr(current), res);
out:
mutex_unlock(&ha->disco_mutex);
+
+ sas_destruct_devices(port);
+ sas_destruct_ports(port);
+ sas_probe_devices(port);
}
/* ---------- Events ---------- */
@@ -587,10 +593,8 @@ void sas_init_disc(struct sas_discovery *disc, struct asd_sas_port *port)
static const work_func_t sas_event_fns[DISC_NUM_EVENTS] = {
[DISCE_DISCOVER_DOMAIN] = sas_discover_domain,
[DISCE_REVALIDATE_DOMAIN] = sas_revalidate_domain,
- [DISCE_PROBE] = sas_probe_devices,
[DISCE_SUSPEND] = sas_suspend_devices,
[DISCE_RESUME] = sas_resume_devices,
- [DISCE_DESTRUCT] = sas_destruct_devices,
};
disc->pending = 0;
diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index f77d72f01da91..84df6cf467605 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -1946,7 +1946,8 @@ static void sas_unregister_devs_sas_addr(struct domain_device *parent,
sas_port_delete_phy(phy->port, phy->phy);
sas_device_set_phy(found, phy->port);
if (phy->port->num_phys == 0)
- sas_port_delete(phy->port);
+ list_add_tail(&phy->port->del_list,
+ &parent->port->sas_port_del_list);
phy->port = NULL;
}
}
@@ -2156,7 +2157,7 @@ int sas_ex_revalidate_domain(struct domain_device *port_dev)
struct domain_device *dev = NULL;
res = sas_find_bcast_dev(port_dev, &dev);
- while (res == 0 && dev) {
+ if (res == 0 && dev) {
struct expander_device *ex = &dev->ex_dev;
int i = 0, phy_id;
@@ -2168,9 +2169,6 @@ int sas_ex_revalidate_domain(struct domain_device *port_dev)
res = sas_rediscover(dev, phy_id);
i = phy_id + 1;
} while (i < ex->num_phys);
-
- dev = NULL;
- res = sas_find_bcast_dev(port_dev, &dev);
}
return res;
}
diff --git a/drivers/scsi/libsas/sas_internal.h b/drivers/scsi/libsas/sas_internal.h
index c07e081364915..f3449fde9c5fb 100644
--- a/drivers/scsi/libsas/sas_internal.h
+++ b/drivers/scsi/libsas/sas_internal.h
@@ -98,6 +98,7 @@ int sas_try_ata_reset(struct asd_sas_phy *phy);
void sas_hae_reset(struct work_struct *work);
void sas_free_device(struct kref *kref);
+void sas_destruct_devices(struct asd_sas_port *port);
#ifdef CONFIG_SCSI_SAS_HOST_SMP
extern void sas_smp_host_handler(struct bsg_job *job, struct Scsi_Host *shost);
diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c
index d3c5297c6c89e..5d3244c8f2801 100644
--- a/drivers/scsi/libsas/sas_port.c
+++ b/drivers/scsi/libsas/sas_port.c
@@ -66,6 +66,7 @@ static void sas_resume_port(struct asd_sas_phy *phy)
rc = sas_notify_lldd_dev_found(dev);
if (rc) {
sas_unregister_dev(port, dev);
+ sas_destruct_devices(port);
continue;
}
@@ -219,6 +220,7 @@ void sas_deform_port(struct asd_sas_phy *phy, int gone)
if (port->num_phys == 1) {
sas_unregister_domain_devices(port, gone);
+ sas_destruct_devices(port);
sas_port_delete(port->port);
port->port = NULL;
} else {
@@ -323,6 +325,7 @@ static void sas_init_port(struct asd_sas_port *port,
INIT_LIST_HEAD(&port->dev_list);
INIT_LIST_HEAD(&port->disco_list);
INIT_LIST_HEAD(&port->destroy_list);
+ INIT_LIST_HEAD(&port->sas_port_del_list);
spin_lock_init(&port->phy_list_lock);
INIT_LIST_HEAD(&port->phy_list);
port->ha = sas_ha;
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index a966d281dedc3..1b1cf9eff3b5a 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -87,10 +87,8 @@ enum discover_event {
DISCE_DISCOVER_DOMAIN = 0U,
DISCE_REVALIDATE_DOMAIN = 1,
DISCE_PORT_GONE = 2,
- DISCE_PROBE = 3,
DISCE_SUSPEND = 4,
DISCE_RESUME = 5,
- DISCE_DESTRUCT = 6,
DISC_NUM_EVENTS = 7,
};
@@ -269,6 +267,7 @@ struct asd_sas_port {
struct list_head dev_list;
struct list_head disco_list;
struct list_head destroy_list;
+ struct list_head sas_port_del_list;
enum sas_linkrate linkrate;
struct sas_work work;
diff --git a/include/scsi/scsi_transport_sas.h b/include/scsi/scsi_transport_sas.h
index 62895b4059330..05ec927a3c729 100644
--- a/include/scsi/scsi_transport_sas.h
+++ b/include/scsi/scsi_transport_sas.h
@@ -156,6 +156,7 @@ struct sas_port {
struct mutex phy_list_mutex;
struct list_head phy_list;
+ struct list_head del_list; /* libsas only */
};
#define dev_to_sas_port(d) \
--
2.25.1