linux-scsi.vger.kernel.org archive mirror
* [PATCH v2 0/4] fix / cleanup async scsi scanning
@ 2012-05-30 18:21 Dan Williams
  2012-05-30 18:21 ` [PATCH v2 1/4] async: introduce 'async_domain' type Dan Williams
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Dan Williams @ 2012-05-30 18:21 UTC
  To: linux-kernel, linux-scsi; +Cc: mroos, JBottomley

Commit a7a20d10 "[SCSI] sd: limit the scope of the async probe domain"
introduces a boot regression by moving sd probe work off of the global
async queue.  Using a local async domain hides the probe work from being
synchronized by wait_for_device_probe()->async_synchronize_full().

Fix this by teaching async_synchronize_full() to flush all async work
regardless of domain, and take the opportunity to convert scsi scanning
to async_schedule().  This enables wait_for_device_probe() to flush scsi
scanning work.

Changes since v1: http://marc.info/?l=linux-scsi&m=133793153025832&w=2

1/ Tested to fix the boot hang that Meelis reported with v1.  Reworked
   async_synchronize_full() to walk through all the active domains,
   otherwise we spin on !list_empty(async_domains) and prevent the async
   context from running.

2/ Added the ability for domains to opt out of global syncing, as
   requested by Arjan.  This is also needed for stack-allocated domains,
   which would otherwise risk list corruption when the domain goes out
   of scope.

---

Dan Williams (4):
      async: introduce 'async_domain' type
      async: make async_synchronize_full() flush all work regardless of domain
      scsi: queue async scan work to an async_schedule domain
      scsi: cleanup usages of scsi_complete_async_scans


 drivers/regulator/core.c      |    2 +
 drivers/scsi/libsas/sas_ata.c |    2 +
 drivers/scsi/scsi.c           |    4 ++
 drivers/scsi/scsi_priv.h      |    3 +-
 drivers/scsi/scsi_scan.c      |   24 +++----------
 drivers/scsi/scsi_wait_scan.c |   15 +++-----
 include/linux/async.h         |   36 +++++++++++++++++--
 include/scsi/scsi_scan.h      |   11 ------
 kernel/async.c                |   76 +++++++++++++++++++++++++++++++----------
 kernel/power/hibernate.c      |    8 ----
 kernel/power/user.c           |    2 -
 11 files changed, 107 insertions(+), 76 deletions(-)
 delete mode 100644 include/scsi/scsi_scan.h


* [PATCH v2 1/4] async: introduce 'async_domain' type
  2012-05-30 18:21 [PATCH v2 0/4] fix / cleanup async scsi scanning Dan Williams
@ 2012-05-30 18:21 ` Dan Williams
  2012-05-30 18:21 ` [PATCH v2 2/4] async: make async_synchronize_full() flush all work regardless of domain Dan Williams
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Dan Williams @ 2012-05-30 18:21 UTC
  To: linux-kernel, linux-scsi
  Cc: mroos, Mark Brown, Arjan van de Ven, Liam Girdwood,
	James Bottomley

This is in preparation for teaching async_synchronize_full() to sync all
pending async work, not just work on the async_running domain.  This
conversion is functionally equivalent; it just embeds the existing list
in a new async_domain type.

The .registered attribute is used in a later patch to distinguish
between domains that want to be flushed by async_synchronize_full()
versus those that only expect async_synchronize_{full|cookie}_domain to
be used for flushing.

Cc: Liam Girdwood <lrg@ti.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: James Bottomley <JBottomley@parallels.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
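The two declaration macros described above can be tried outside the
kernel with a small userspace model.  This is only a sketch: the
list_head stand-in and the domain_idle() helper are assumptions made for
illustration, while the struct layout and both macros follow the
include/linux/async.h hunk in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Same fields as the async_domain type introduced by this patch. */
struct async_domain {
	struct list_head node;   /* link on the global list of active domains */
	struct list_head domain; /* running entries that belong to this domain */
	int count;               /* only maintained for registered domains */
	unsigned registered:1;   /* 1: visible to async_synchronize_full() */
};

/* Domain participates in the global async_synchronize_full(). */
#define ASYNC_DOMAIN(_name) \
	struct async_domain _name = { .node = LIST_HEAD_INIT(_name.node), \
				      .domain = LIST_HEAD_INIT(_name.domain), \
				      .count = 0, \
				      .registered = 1 }

/* Domain may go out of scope once its own work is done (e.g. on the stack). */
#define ASYNC_DOMAIN_EXCLUSIVE(_name) \
	struct async_domain _name = { .node = LIST_HEAD_INIT(_name.node), \
				      .domain = LIST_HEAD_INIT(_name.domain), \
				      .count = 0, \
				      .registered = 0 }

/* Hypothetical helper: true when nothing is queued or linked anywhere. */
static int domain_idle(const struct async_domain *d)
{
	return list_empty(&d->domain) && list_empty(&d->node) && d->count == 0;
}
```

Both macros produce an idle domain; the only difference is the
.registered bit, which is why converting the LIST_HEAD() users in this
patch does not change behaviour on its own.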
 drivers/regulator/core.c      |    2 +-
 drivers/scsi/libsas/sas_ata.c |    2 +-
 drivers/scsi/scsi.c           |    3 ++-
 drivers/scsi/scsi_priv.h      |    3 ++-
 include/linux/async.h         |   35 +++++++++++++++++++++++++++++++----
 kernel/async.c                |   35 +++++++++++++++++------------------
 6 files changed, 54 insertions(+), 26 deletions(-)

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index e70dd38..87377a3 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -2521,7 +2521,7 @@ static void regulator_bulk_enable_async(void *data, async_cookie_t cookie)
 int regulator_bulk_enable(int num_consumers,
 			  struct regulator_bulk_data *consumers)
 {
-	LIST_HEAD(async_domain);
+	ASYNC_DOMAIN_EXCLUSIVE(async_domain);
 	int i;
 	int ret = 0;
 
diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 607a35b..899d190 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -828,7 +828,7 @@ static void async_sas_ata_eh(void *data, async_cookie_t cookie)
 void sas_ata_strategy_handler(struct Scsi_Host *shost)
 {
 	struct sas_ha_struct *sas_ha = SHOST_TO_SAS_HA(shost);
-	LIST_HEAD(async);
+	ASYNC_DOMAIN_EXCLUSIVE(async);
 	int i;
 
 	/* it's ok to defer revalidation events during ata eh, these
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 2ecdb01..2f92007 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -54,6 +54,7 @@
 #include <linux/notifier.h>
 #include <linux/cpu.h>
 #include <linux/mutex.h>
+#include <linux/async.h>
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
@@ -91,7 +92,7 @@ EXPORT_SYMBOL(scsi_logging_level);
 #endif
 
 /* sd and scsi_pm need to coordinate flushing async actions */
-LIST_HEAD(scsi_sd_probe_domain);
+ASYNC_DOMAIN(scsi_sd_probe_domain);
 EXPORT_SYMBOL(scsi_sd_probe_domain);
 
 /* NB: These are exposed through /proc/scsi/scsi and form part of the ABI.
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 07ce3f5..eab472d 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -2,6 +2,7 @@
 #define _SCSI_PRIV_H
 
 #include <linux/device.h>
+#include <linux/async.h>
 
 struct request_queue;
 struct request;
@@ -163,7 +164,7 @@ static inline int scsi_autopm_get_host(struct Scsi_Host *h) { return 0; }
 static inline void scsi_autopm_put_host(struct Scsi_Host *h) {}
 #endif /* CONFIG_PM_RUNTIME */
 
-extern struct list_head scsi_sd_probe_domain;
+extern struct async_domain scsi_sd_probe_domain;
 
 /* 
  * internal scsi timeout functions: for use by mid-layer and transport
diff --git a/include/linux/async.h b/include/linux/async.h
index 68a9530..364e7ff 100644
--- a/include/linux/async.h
+++ b/include/linux/async.h
@@ -9,19 +9,46 @@
  * as published by the Free Software Foundation; version 2
  * of the License.
  */
+#ifndef __ASYNC_H__
+#define __ASYNC_H__
 
 #include <linux/types.h>
 #include <linux/list.h>
 
 typedef u64 async_cookie_t;
 typedef void (async_func_ptr) (void *data, async_cookie_t cookie);
+struct async_domain {
+	struct list_head node;
+	struct list_head domain;
+	int count;
+	unsigned registered:1;
+};
+
+/*
+ * domain participates in global async_synchronize_full
+ */
+#define ASYNC_DOMAIN(_name) \
+	struct async_domain _name = { .node = LIST_HEAD_INIT(_name.node), \
+				      .domain = LIST_HEAD_INIT(_name.domain), \
+				      .count = 0, \
+				      .registered = 1 }
+
+/*
+ * domain is free to go out of scope as soon as all pending work is
+ * complete, this domain does not participate in async_synchronize_full
+ */
+#define ASYNC_DOMAIN_EXCLUSIVE(_name) \
+	struct async_domain _name = { .node = LIST_HEAD_INIT(_name.node), \
+				      .domain = LIST_HEAD_INIT(_name.domain), \
+				      .count = 0, \
+				      .registered = 0 }
 
 extern async_cookie_t async_schedule(async_func_ptr *ptr, void *data);
 extern async_cookie_t async_schedule_domain(async_func_ptr *ptr, void *data,
-					    struct list_head *list);
+					    struct async_domain *domain);
 extern void async_synchronize_full(void);
-extern void async_synchronize_full_domain(struct list_head *list);
+extern void async_synchronize_full_domain(struct async_domain *domain);
 extern void async_synchronize_cookie(async_cookie_t cookie);
 extern void async_synchronize_cookie_domain(async_cookie_t cookie,
-					    struct list_head *list);
-
+					    struct async_domain *domain);
+#endif
diff --git a/kernel/async.c b/kernel/async.c
index bd0c168..ba5491d 100644
--- a/kernel/async.c
+++ b/kernel/async.c
@@ -62,7 +62,7 @@ static async_cookie_t next_cookie = 1;
 #define MAX_WORK	32768
 
 static LIST_HEAD(async_pending);
-static LIST_HEAD(async_running);
+static ASYNC_DOMAIN(async_running);
 static DEFINE_SPINLOCK(async_lock);
 
 struct async_entry {
@@ -71,7 +71,7 @@ struct async_entry {
 	async_cookie_t		cookie;
 	async_func_ptr		*func;
 	void			*data;
-	struct list_head	*running;
+	struct async_domain	*running;
 };
 
 static DECLARE_WAIT_QUEUE_HEAD(async_done);
@@ -82,13 +82,12 @@ static atomic_t entry_count;
 /*
  * MUST be called with the lock held!
  */
-static async_cookie_t  __lowest_in_progress(struct list_head *running)
+static async_cookie_t  __lowest_in_progress(struct async_domain *running)
 {
 	struct async_entry *entry;
 
-	if (!list_empty(running)) {
-		entry = list_first_entry(running,
-			struct async_entry, list);
+	if (!list_empty(&running->domain)) {
+		entry = list_first_entry(&running->domain, typeof(*entry), list);
 		return entry->cookie;
 	}
 
@@ -99,7 +98,7 @@ static async_cookie_t  __lowest_in_progress(struct list_head *running)
 	return next_cookie;	/* "infinity" value */
 }
 
-static async_cookie_t  lowest_in_progress(struct list_head *running)
+static async_cookie_t  lowest_in_progress(struct async_domain *running)
 {
 	unsigned long flags;
 	async_cookie_t ret;
@@ -119,10 +118,11 @@ static void async_run_entry_fn(struct work_struct *work)
 		container_of(work, struct async_entry, work);
 	unsigned long flags;
 	ktime_t uninitialized_var(calltime), delta, rettime;
+	struct async_domain *running = entry->running;
 
 	/* 1) move self to the running queue */
 	spin_lock_irqsave(&async_lock, flags);
-	list_move_tail(&entry->list, entry->running);
+	list_move_tail(&entry->list, &running->domain);
 	spin_unlock_irqrestore(&async_lock, flags);
 
 	/* 2) run (and print duration) */
@@ -156,7 +156,7 @@ static void async_run_entry_fn(struct work_struct *work)
 	wake_up(&async_done);
 }
 
-static async_cookie_t __async_schedule(async_func_ptr *ptr, void *data, struct list_head *running)
+static async_cookie_t __async_schedule(async_func_ptr *ptr, void *data, struct async_domain *running)
 {
 	struct async_entry *entry;
 	unsigned long flags;
@@ -223,7 +223,7 @@ EXPORT_SYMBOL_GPL(async_schedule);
  * Note: This function may be called from atomic or non-atomic contexts.
  */
 async_cookie_t async_schedule_domain(async_func_ptr *ptr, void *data,
-				     struct list_head *running)
+				     struct async_domain *running)
 {
 	return __async_schedule(ptr, data, running);
 }
@@ -238,20 +238,20 @@ void async_synchronize_full(void)
 {
 	do {
 		async_synchronize_cookie(next_cookie);
-	} while (!list_empty(&async_running) || !list_empty(&async_pending));
+	} while (!list_empty(&async_running.domain) || !list_empty(&async_pending));
 }
 EXPORT_SYMBOL_GPL(async_synchronize_full);
 
 /**
  * async_synchronize_full_domain - synchronize all asynchronous function within a certain domain
- * @list: running list to synchronize on
+ * @domain: running list to synchronize on
  *
  * This function waits until all asynchronous function calls for the
- * synchronization domain specified by the running list @list have been done.
+ * synchronization domain specified by the running list @domain have been done.
  */
-void async_synchronize_full_domain(struct list_head *list)
+void async_synchronize_full_domain(struct async_domain *domain)
 {
-	async_synchronize_cookie_domain(next_cookie, list);
+	async_synchronize_cookie_domain(next_cookie, domain);
 }
 EXPORT_SYMBOL_GPL(async_synchronize_full_domain);
 
@@ -261,11 +261,10 @@ EXPORT_SYMBOL_GPL(async_synchronize_full_domain);
  * @running: running list to synchronize on
  *
  * This function waits until all asynchronous function calls for the
- * synchronization domain specified by the running list @list submitted
+ * synchronization domain specified by running list @running submitted
  * prior to @cookie have been done.
  */
-void async_synchronize_cookie_domain(async_cookie_t cookie,
-				     struct list_head *running)
+void async_synchronize_cookie_domain(async_cookie_t cookie, struct async_domain *running)
 {
 	ktime_t uninitialized_var(starttime), delta, endtime;
 


* [PATCH v2 2/4] async: make async_synchronize_full() flush all work regardless of domain
  2012-05-30 18:21 [PATCH v2 0/4] fix / cleanup async scsi scanning Dan Williams
  2012-05-30 18:21 ` [PATCH v2 1/4] async: introduce 'async_domain' type Dan Williams
@ 2012-05-30 18:21 ` Dan Williams
  2012-05-30 18:21 ` [PATCH v2 3/4] scsi: queue async scan work to an async_schedule domain Dan Williams
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Dan Williams @ 2012-05-30 18:21 UTC
  To: linux-kernel, linux-scsi
  Cc: Len Brown, Meelis Roos, James Bottomley, Eldad Zack,
	Rafael J. Wysocki, Arjan van de Ven

In response to an async related regression James noted:

  "My theory is that this is an init problem: The assumption in a lot of
   our code is that async_synchronize_full() waits for everything ... even
   the domain specific async schedules, which isn't true."

...so make this assumption true.

Each domain, including the default one, registers itself on a global domain
list when work is scheduled.  Once all of its entries complete, the domain
is removed from that list.  Waiting for the list to be empty syncs all
in-flight work across all domains.

Domains can opt out of global syncing by being declared exclusive with
ASYNC_DOMAIN_EXCLUSIVE().  All stack-based domains have been declared
exclusive since the domain may go out of scope as soon as the last work
item completes.

Statically declared domains are mostly ok, but async_unregister_domain()
is there to close any theoretical races with pending
async_synchronize_full waiters at module removal time.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: James Bottomley <JBottomley@parallels.com>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Eldad Zack <eldadzack@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
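The registration scheme described above can be modelled in a few lines
of single-threaded userspace C.  This is a sketch under stated
assumptions, not the kernel code: every toy_* name is hypothetical,
there is no locking, and "waiting" on a domain is simulated by running
its outstanding work to completion.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical model of a domain; 'pending' replaces the entry list. */
struct toy_domain {
	struct list_head node;
	int pending;            /* work items not yet completed */
	int count;              /* only touched when registered */
	unsigned registered:1;
};

/* Global list of domains that currently have work in flight. */
static struct list_head toy_domains = { &toy_domains, &toy_domains };

static void toy_domain_init(struct toy_domain *d, int registered)
{
	list_init(&d->node);
	d->pending = 0;
	d->count = 0;
	d->registered = registered;
}

/* Mirrors the __async_schedule() hunk: join the list on 0 -> 1. */
static void toy_schedule(struct toy_domain *d)
{
	d->pending++;
	if (d->registered && d->count++ == 0)
		list_add_tail(&d->node, &toy_domains);
}

/* Mirrors the async_run_entry_fn() hunk: leave the list on 1 -> 0. */
static void toy_complete_one(struct toy_domain *d)
{
	d->pending--;
	if (d->registered && --d->count == 0)
		list_del_init(&d->node);
}

/* Flush every registered domain until the global list is empty. */
static void toy_synchronize_full(void)
{
	while (!list_empty(&toy_domains)) {
		struct toy_domain *d =
			container_of(toy_domains.next, struct toy_domain, node);

		/* stand-in for async_synchronize_cookie_domain() */
		while (d->pending > 0)
			toy_complete_one(d);
	}
}
```

In this model an exclusive domain never appears on the global list, so
toy_synchronize_full() leaves its work alone, which is the opt-out
behaviour the changelog describes.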
 drivers/scsi/scsi.c   |    1 +
 include/linux/async.h |    1 +
 kernel/async.c        |   43 +++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 2f92007..2dd3c93 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -1355,6 +1355,7 @@ static void __exit exit_scsi(void)
 	scsi_exit_devinfo();
 	scsi_exit_procfs();
 	scsi_exit_queue();
+	async_unregister_domain(&scsi_sd_probe_domain);
 }
 
 subsys_initcall(init_scsi);
diff --git a/include/linux/async.h b/include/linux/async.h
index 364e7ff..7a24fe9 100644
--- a/include/linux/async.h
+++ b/include/linux/async.h
@@ -46,6 +46,7 @@ struct async_domain {
 extern async_cookie_t async_schedule(async_func_ptr *ptr, void *data);
 extern async_cookie_t async_schedule_domain(async_func_ptr *ptr, void *data,
 					    struct async_domain *domain);
+void async_unregister_domain(struct async_domain *domain);
 extern void async_synchronize_full(void);
 extern void async_synchronize_full_domain(struct async_domain *domain);
 extern void async_synchronize_cookie(async_cookie_t cookie);
diff --git a/kernel/async.c b/kernel/async.c
index ba5491d..9d31183 100644
--- a/kernel/async.c
+++ b/kernel/async.c
@@ -63,7 +63,9 @@ static async_cookie_t next_cookie = 1;
 
 static LIST_HEAD(async_pending);
 static ASYNC_DOMAIN(async_running);
+static LIST_HEAD(async_domains);
 static DEFINE_SPINLOCK(async_lock);
+static DEFINE_MUTEX(async_register_mutex);
 
 struct async_entry {
 	struct list_head	list;
@@ -145,6 +147,8 @@ static void async_run_entry_fn(struct work_struct *work)
 	/* 3) remove self from the running queue */
 	spin_lock_irqsave(&async_lock, flags);
 	list_del(&entry->list);
+	if (running->registered && --running->count == 0)
+		list_del_init(&running->node);
 
 	/* 4) free the entry */
 	kfree(entry);
@@ -187,6 +191,8 @@ static async_cookie_t __async_schedule(async_func_ptr *ptr, void *data, struct a
 	spin_lock_irqsave(&async_lock, flags);
 	newcookie = entry->cookie = next_cookie++;
 	list_add_tail(&entry->list, &async_pending);
+	if (running->registered && running->count++ == 0)
+		list_add_tail(&running->node, &async_domains);
 	atomic_inc(&entry_count);
 	spin_unlock_irqrestore(&async_lock, flags);
 
@@ -236,13 +242,43 @@ EXPORT_SYMBOL_GPL(async_schedule_domain);
  */
 void async_synchronize_full(void)
 {
+	mutex_lock(&async_register_mutex);
 	do {
-		async_synchronize_cookie(next_cookie);
-	} while (!list_empty(&async_running.domain) || !list_empty(&async_pending));
+		struct async_domain *domain = NULL;
+
+		spin_lock_irq(&async_lock);
+		if (!list_empty(&async_domains))
+			domain = list_first_entry(&async_domains, typeof(*domain), node);
+		spin_unlock_irq(&async_lock);
+
+		async_synchronize_cookie_domain(next_cookie, domain);
+	} while (!list_empty(&async_domains));
+	mutex_unlock(&async_register_mutex);
 }
 EXPORT_SYMBOL_GPL(async_synchronize_full);
 
 /**
+ * async_unregister_domain - ensure no more anonymous waiters on this domain
+ * @domain: idle domain to flush out of any async_synchronize_full instances
+ *
+ * async_synchronize_{cookie|full}_domain() are not flushed since callers
+ * of these routines should know the lifetime of @domain
+ *
+ * Prefer ASYNC_DOMAIN_EXCLUSIVE() declarations over flushing
+ */
+void async_unregister_domain(struct async_domain *domain)
+{
+	mutex_lock(&async_register_mutex);
+	spin_lock_irq(&async_lock);
+	WARN_ON(!domain->registered || !list_empty(&domain->node) ||
+		!list_empty(&domain->domain));
+	domain->registered = 0;
+	spin_unlock_irq(&async_lock);
+	mutex_unlock(&async_register_mutex);
+}
+EXPORT_SYMBOL_GPL(async_unregister_domain);
+
+/**
  * async_synchronize_full_domain - synchronize all asynchronous function within a certain domain
  * @domain: running list to synchronize on
  *
@@ -268,6 +304,9 @@ void async_synchronize_cookie_domain(async_cookie_t cookie, struct async_domain
 {
 	ktime_t uninitialized_var(starttime), delta, endtime;
 
+	if (!running)
+		return;
+
 	if (initcall_debug && system_state == SYSTEM_BOOTING) {
 		printk(KERN_DEBUG "async_waiting @ %i\n", task_pid_nr(current));
 		starttime = ktime_get();


* [PATCH v2 3/4] scsi: queue async scan work to an async_schedule domain
  2012-05-30 18:21 [PATCH v2 0/4] fix / cleanup async scsi scanning Dan Williams
  2012-05-30 18:21 ` [PATCH v2 1/4] async: introduce 'async_domain' type Dan Williams
  2012-05-30 18:21 ` [PATCH v2 2/4] async: make async_synchronize_full() flush all work regardless of domain Dan Williams
@ 2012-05-30 18:21 ` Dan Williams
  2012-05-30 18:21 ` [PATCH v2 4/4] scsi: cleanup usages of scsi_complete_async_scans Dan Williams
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Dan Williams @ 2012-05-30 18:21 UTC
  To: linux-kernel, linux-scsi
  Cc: Rafael J. Wysocki, Len Brown, mroos, Arjan van de Ven,
	James Bottomley

This is preparation to enable async_synchronize_full() to be used as a
replacement for scsi_complete_async_scans(), i.e. to stop leaking scsi
internal details where they are not needed.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: James Bottomley <JBottomley@parallels.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
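The conversion below changes the scan callback from a kthread-style
function (returns int, takes one argument) to the async callback shape
(returns void, also receives a cookie).  A minimal userspace sketch of
that shape follows.  The two typedefs match include/linux/async.h as
shown in patch 1; toy_async_schedule() and the toy_scan_data type are
hypothetical, and the stand-in runs the callback immediately rather than
deferring it.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t async_cookie_t;
typedef void (async_func_ptr)(void *data, async_cookie_t cookie);

static async_cookie_t toy_next_cookie = 1;

/*
 * Hypothetical stand-in for async_schedule(): hands out the next cookie
 * and runs the callback synchronously instead of queueing it.
 */
static async_cookie_t toy_async_schedule(async_func_ptr *ptr, void *data)
{
	async_cookie_t cookie = toy_next_cookie++;

	ptr(data, cookie);
	return cookie;
}

/* Hypothetical per-host scan state, loosely modelled on async_scan_data. */
struct toy_scan_data {
	int host_no;
	int scanned;
	async_cookie_t seen_cookie;
};

/* Async-style callback: no return value, cookie as second argument. */
static void toy_do_scan_async(void *_data, async_cookie_t c)
{
	struct toy_scan_data *data = _data;

	data->scanned = 1;      /* stands in for do_scsi_scan_host() */
	data->seen_cookie = c;
}
```

Because the callback no longer returns a value and no thread has to be
created, the IS_ERR() fallback that followed kthread_run() has nothing
left to handle and is dropped in the diff.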
 drivers/scsi/scsi_scan.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 5e00e09..fb42aa0 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1847,14 +1847,13 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
 	}
 }
 
-static int do_scan_async(void *_data)
+static void do_scan_async(void *_data, async_cookie_t c)
 {
 	struct async_scan_data *data = _data;
 	struct Scsi_Host *shost = data->shost;
 
 	do_scsi_scan_host(shost);
 	scsi_finish_async_scan(data);
-	return 0;
 }
 
 /**
@@ -1863,7 +1862,6 @@ static int do_scan_async(void *_data)
  **/
 void scsi_scan_host(struct Scsi_Host *shost)
 {
-	struct task_struct *p;
 	struct async_scan_data *data;
 
 	if (strncmp(scsi_scan_type, "none", 4) == 0)
@@ -1878,9 +1876,11 @@ void scsi_scan_host(struct Scsi_Host *shost)
 		return;
 	}
 
-	p = kthread_run(do_scan_async, data, "scsi_scan_%d", shost->host_no);
-	if (IS_ERR(p))
-		do_scan_async(data);
+	/* register with the async subsystem so wait_for_device_probe()
+	 * will flush this work
+	 */
+	async_schedule(do_scan_async, data);
+
 	/* scsi_autopm_put_host(shost) is called in scsi_finish_async_scan() */
 }
 EXPORT_SYMBOL(scsi_scan_host);


* [PATCH v2 4/4] scsi: cleanup usages of scsi_complete_async_scans
  2012-05-30 18:21 [PATCH v2 0/4] fix / cleanup async scsi scanning Dan Williams
                   ` (2 preceding siblings ...)
  2012-05-30 18:21 ` [PATCH v2 3/4] scsi: queue async scan work to an async_schedule domain Dan Williams
@ 2012-05-30 18:21 ` Dan Williams
  2012-05-30 21:34   ` Rafael J. Wysocki
  2012-05-30 18:22 ` [PATCH v2 0/4] fix / cleanup async scsi scanning Borislav Petkov
  2012-05-31  9:05 ` mroos
  5 siblings, 1 reply; 14+ messages in thread
From: Dan Williams @ 2012-05-30 18:21 UTC
  To: linux-kernel, linux-scsi
  Cc: Rafael J. Wysocki, Len Brown, mroos, Arjan van de Ven,
	James Bottomley

Now that scsi registers its async scan work with the async subsystem,
wait_for_device_probe() is sufficient for ensuring all scanning is
complete.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: James Bottomley <JBottomley@parallels.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/scsi_scan.c      |   12 ------------
 drivers/scsi/scsi_wait_scan.c |   15 +++++----------
 include/scsi/scsi_scan.h      |   11 -----------
 kernel/power/hibernate.c      |    8 --------
 kernel/power/user.c           |    2 --
 5 files changed, 5 insertions(+), 43 deletions(-)
 delete mode 100644 include/scsi/scsi_scan.h

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index fb42aa0..20c7108 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -184,18 +184,6 @@ int scsi_complete_async_scans(void)
 	return 0;
 }
 
-/* Only exported for the benefit of scsi_wait_scan */
-EXPORT_SYMBOL_GPL(scsi_complete_async_scans);
-
-#ifndef MODULE
-/*
- * For async scanning we need to wait for all the scans to complete before
- * trying to mount the root fs.  Otherwise non-modular drivers may not be ready
- * yet.
- */
-late_initcall(scsi_complete_async_scans);
-#endif
-
 /**
  * scsi_unlock_floptical - unlock device via a special MODE SENSE command
  * @sdev:	scsi device to send command to
diff --git a/drivers/scsi/scsi_wait_scan.c b/drivers/scsi/scsi_wait_scan.c
index 74708fc..57de24a 100644
--- a/drivers/scsi/scsi_wait_scan.c
+++ b/drivers/scsi/scsi_wait_scan.c
@@ -12,21 +12,16 @@
 
 #include <linux/module.h>
 #include <linux/device.h>
-#include <scsi/scsi_scan.h>
 
 static int __init wait_scan_init(void)
 {
 	/*
-	 * First we need to wait for device probing to finish;
-	 * the drivers we just loaded might just still be probing
-	 * and might not yet have reached the scsi async scanning
+	 * This will not return until all async work (system wide) is
+	 * quiesced.  Probing queues host-scanning work to the async
+	 * queue which is why we don't need a separate call to
+	 * scsi_complete_async_scans()
 	 */
 	wait_for_device_probe();
-	/*
-	 * and then we wait for the actual asynchronous scsi scan
-	 * to finish.
-	 */
-	scsi_complete_async_scans();
 	return 0;
 }
 
@@ -38,5 +33,5 @@ MODULE_DESCRIPTION("SCSI wait for scans");
 MODULE_AUTHOR("James Bottomley");
 MODULE_LICENSE("GPL");
 
-late_initcall(wait_scan_init);
+module_init(wait_scan_init);
 module_exit(wait_scan_exit);
diff --git a/include/scsi/scsi_scan.h b/include/scsi/scsi_scan.h
deleted file mode 100644
index 7889888..0000000
--- a/include/scsi/scsi_scan.h
+++ /dev/null
@@ -1,11 +0,0 @@
-#ifndef _SCSI_SCSI_SCAN_H
-#define _SCSI_SCSI_SCAN_H
-
-#ifdef CONFIG_SCSI
-/* drivers/scsi/scsi_scan.c */
-extern int scsi_complete_async_scans(void);
-#else
-static inline int scsi_complete_async_scans(void) { return 0; }
-#endif
-
-#endif /* _SCSI_SCSI_SCAN_H */
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index e09dfbf..821114a 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -25,7 +25,6 @@
 #include <linux/freezer.h>
 #include <linux/gfp.h>
 #include <linux/syscore_ops.h>
-#include <scsi/scsi_scan.h>
 
 #include "power.h"
 
@@ -735,13 +734,6 @@ static int software_resume(void)
 			async_synchronize_full();
 		}
 
-		/*
-		 * We can't depend on SCSI devices being available after loading
-		 * one of their modules until scsi_complete_async_scans() is
-		 * called and the resume device usually is a SCSI one.
-		 */
-		scsi_complete_async_scans();
-
 		swsusp_resume_device = name_to_dev_t(resume_file);
 		if (!swsusp_resume_device) {
 			error = -ENODEV;
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 91b0fd0..4ed81e7 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -24,7 +24,6 @@
 #include <linux/console.h>
 #include <linux/cpu.h>
 #include <linux/freezer.h>
-#include <scsi/scsi_scan.h>
 
 #include <asm/uaccess.h>
 
@@ -84,7 +83,6 @@ static int snapshot_open(struct inode *inode, struct file *filp)
 		 * appear.
 		 */
 		wait_for_device_probe();
-		scsi_complete_async_scans();
 
 		data->swap = -1;
 		data->mode = O_WRONLY;


* Re: [PATCH v2 0/4] fix / cleanup async scsi scanning
  2012-05-30 18:21 [PATCH v2 0/4] fix / cleanup async scsi scanning Dan Williams
                   ` (3 preceding siblings ...)
  2012-05-30 18:21 ` [PATCH v2 4/4] scsi: cleanup usages of scsi_complete_async_scans Dan Williams
@ 2012-05-30 18:22 ` Borislav Petkov
  2012-05-30 18:29   ` Dan Williams
  2012-05-31  9:05 ` mroos
  5 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2012-05-30 18:22 UTC
  To: Dan Williams; +Cc: linux-kernel, linux-scsi, mroos, JBottomley

On Wed, May 30, 2012 at 11:21:25AM -0700, Dan Williams wrote:
> Commit a7a20d10 "[SCSI] sd: limit the scope of the async probe domain"
> introduces a boot regression by moving sd probe work off of the global
> async queue.  Using a local async domain hides the probe work from being
> synchronized by wait_for_device_probe()->async_synchronize_full().
> 
> Fix this by teaching async_synchronize_full() to flush all async work
> regardless of domain, and take the opportunity to convert scsi scanning
> to async_schedule().  This enables wait_for_device_probe() to flush scsi
> scanning work.

Looks like those fix a similar boot issue I reported earlier:

http://marc.info/?l=linux-kernel&m=133839683405526&w=2

Should I give them a run or are they still in review?

Thanks.

-- 
Regards/Gruss,
Boris.


* Re: [PATCH v2 0/4] fix / cleanup async scsi scanning
  2012-05-30 18:22 ` [PATCH v2 0/4] fix / cleanup async scsi scanning Borislav Petkov
@ 2012-05-30 18:29   ` Dan Williams
  2012-05-30 22:33     ` walt
  2012-05-31 13:37     ` Borislav Petkov
  0 siblings, 2 replies; 14+ messages in thread
From: Dan Williams @ 2012-05-30 18:29 UTC
  To: Borislav Petkov, Dan Williams, linux-kernel, linux-scsi, mroos,
	JBottomley

On Wed, May 30, 2012 at 11:22 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, May 30, 2012 at 11:21:25AM -0700, Dan Williams wrote:
>> Commit a7a20d10 "[SCSI] sd: limit the scope of the async probe domain"
>> introduces a boot regression by moving sd probe work off of the global
>> async queue.  Using a local async domain hides the probe work from being
>> synchronized by wait_for_device_probe()->async_synchronize_full().
>>
>> Fix this by teaching async_synchronize_full() to flush all async work
>> regardless of domain, and take the opportunity to convert scsi scanning
>> to async_schedule().  This enables wait_for_device_probe() to flush scsi
>> scanning work.
>
> Looks like those fix a similar boot issue I reported earlier:
>
> http://marc.info/?l=linux-kernel&m=133839683405526&w=2
>
> Should I give them a run or are they still in review?
>

They're ready for a run, but are likely 3.6 material.  For 3.5 I think
James is going with the smaller fix posted here:

http://marc.info/?l=linux-scsi&m=133796775807498&w=2

--
Dan


* Re: [PATCH v2 4/4] scsi: cleanup usages of scsi_complete_async_scans
  2012-05-30 18:21 ` [PATCH v2 4/4] scsi: cleanup usages of scsi_complete_async_scans Dan Williams
@ 2012-05-30 21:34   ` Rafael J. Wysocki
  2012-05-30 21:37     ` Rafael J. Wysocki
  2012-05-30 21:41     ` Rafael J. Wysocki
  0 siblings, 2 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2012-05-30 21:34 UTC
  To: Dan Williams
  Cc: linux-kernel, linux-scsi, Len Brown, mroos, Arjan van de Ven,
	James Bottomley

On Wednesday, May 30, 2012, Dan Williams wrote:
> Now that scsi registers its async scan work with the async subsystem,
> wait_for_device_probe() is sufficient for ensuring all scanning is
> complete.
> 
> Cc: Arjan van de Ven <arjan@linux.intel.com>
> Cc: Len Brown <len.brown@intel.com>
> Cc: Rafael J. Wysocki <rjw@sisk.pl>
> Cc: James Bottomley <JBottomley@parallels.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/scsi/scsi_scan.c      |   12 ------------
>  drivers/scsi/scsi_wait_scan.c |   15 +++++----------
>  include/scsi/scsi_scan.h      |   11 -----------
>  kernel/power/hibernate.c      |    8 --------
>  kernel/power/user.c           |    2 --
>  5 files changed, 5 insertions(+), 43 deletions(-)
>  delete mode 100644 include/scsi/scsi_scan.h
> 
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index fb42aa0..20c7108 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -184,18 +184,6 @@ int scsi_complete_async_scans(void)
>  	return 0;
>  }
>  
> -/* Only exported for the benefit of scsi_wait_scan */
> -EXPORT_SYMBOL_GPL(scsi_complete_async_scans);
> -
> -#ifndef MODULE
> -/*
> - * For async scanning we need to wait for all the scans to complete before
> - * trying to mount the root fs.  Otherwise non-modular drivers may not be ready
> - * yet.
> - */
> -late_initcall(scsi_complete_async_scans);
> -#endif
> -
>  /**
>   * scsi_unlock_floptical - unlock device via a special MODE SENSE command
>   * @sdev:	scsi device to send command to
> diff --git a/drivers/scsi/scsi_wait_scan.c b/drivers/scsi/scsi_wait_scan.c
> index 74708fc..57de24a 100644
> --- a/drivers/scsi/scsi_wait_scan.c
> +++ b/drivers/scsi/scsi_wait_scan.c
> @@ -12,21 +12,16 @@
>  
>  #include <linux/module.h>
>  #include <linux/device.h>
> -#include <scsi/scsi_scan.h>
>  
>  static int __init wait_scan_init(void)
>  {
>  	/*
> -	 * First we need to wait for device probing to finish;
> -	 * the drivers we just loaded might just still be probing
> -	 * and might not yet have reached the scsi async scanning
> +	 * This will not return until all async work (system wide) is
> +	 * quiesced.  Probing queues host-scanning work to the async
> +	 * queue which is why we don't need a separate call to
> +	 * scsi_complete_async_scans()
>  	 */
>  	wait_for_device_probe();
> -	/*
> -	 * and then we wait for the actual asynchronous scsi scan
> -	 * to finish.
> -	 */
> -	scsi_complete_async_scans();
>  	return 0;
>  }
>  
> @@ -38,5 +33,5 @@ MODULE_DESCRIPTION("SCSI wait for scans");
>  MODULE_AUTHOR("James Bottomley");
>  MODULE_LICENSE("GPL");
>  
> -late_initcall(wait_scan_init);
> +module_init(wait_scan_init);
>  module_exit(wait_scan_exit);
> diff --git a/include/scsi/scsi_scan.h b/include/scsi/scsi_scan.h
> deleted file mode 100644
> index 7889888..0000000
> --- a/include/scsi/scsi_scan.h
> +++ /dev/null
> @@ -1,11 +0,0 @@
> -#ifndef _SCSI_SCSI_SCAN_H
> -#define _SCSI_SCSI_SCAN_H
> -
> -#ifdef CONFIG_SCSI
> -/* drivers/scsi/scsi_scan.c */
> -extern int scsi_complete_async_scans(void);
> -#else
> -static inline int scsi_complete_async_scans(void) { return 0; }
> -#endif
> -
> -#endif /* _SCSI_SCSI_SCAN_H */
> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> index e09dfbf..821114a 100644
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -25,7 +25,6 @@
>  #include <linux/freezer.h>
>  #include <linux/gfp.h>
>  #include <linux/syscore_ops.h>
> -#include <scsi/scsi_scan.h>
>  
>  #include "power.h"
>  
> @@ -735,13 +734,6 @@ static int software_resume(void)
>  			async_synchronize_full();
>  		}
>  
> -		/*
> -		 * We can't depend on SCSI devices being available after loading
> -		 * one of their modules until scsi_complete_async_scans() is
> -		 * called and the resume device usually is a SCSI one.
> -		 */
> -		scsi_complete_async_scans();
> -

I believe this is wrong.  You're going to introduce a regression on systems
using built-in hibernation and built-in SCSI stack.

>  		swsusp_resume_device = name_to_dev_t(resume_file);
>  		if (!swsusp_resume_device) {
>  			error = -ENODEV;
> diff --git a/kernel/power/user.c b/kernel/power/user.c
> index 91b0fd0..4ed81e7 100644
> --- a/kernel/power/user.c
> +++ b/kernel/power/user.c
> @@ -24,7 +24,6 @@
>  #include <linux/console.h>
>  #include <linux/cpu.h>
>  #include <linux/freezer.h>
> -#include <scsi/scsi_scan.h>
>  
>  #include <asm/uaccess.h>
>  
> @@ -84,7 +83,6 @@ static int snapshot_open(struct inode *inode, struct file *filp)
>  		 * appear.
>  		 */
>  		wait_for_device_probe();
> -		scsi_complete_async_scans();

Same here.

>  
>  		data->swap = -1;
>  		data->mode = O_WRONLY;

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 4/4] scsi: cleanup usages of scsi_complete_async_scans
  2012-05-30 21:34   ` Rafael J. Wysocki
@ 2012-05-30 21:37     ` Rafael J. Wysocki
  2012-05-30 21:41     ` Rafael J. Wysocki
  1 sibling, 0 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2012-05-30 21:37 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux-scsi, Len Brown, mroos, Arjan van de Ven,
	James Bottomley

On Wednesday, May 30, 2012, Rafael J. Wysocki wrote:
> On Wednesday, May 30, 2012, Dan Williams wrote:
> > [...]
> > diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> > index e09dfbf..821114a 100644
> > --- a/kernel/power/hibernate.c
> > +++ b/kernel/power/hibernate.c
> > @@ -25,7 +25,6 @@
> >  #include <linux/freezer.h>
> >  #include <linux/gfp.h>
> >  #include <linux/syscore_ops.h>
> > -#include <scsi/scsi_scan.h>
> >  
> >  #include "power.h"
> >  
> > @@ -735,13 +734,6 @@ static int software_resume(void)
> >  			async_synchronize_full();
> >  		}
> >  
> > -		/*
> > -		 * We can't depend on SCSI devices being available after loading
> > -		 * one of their modules until scsi_complete_async_scans() is
> > -		 * called and the resume device usually is a SCSI one.
> > -		 */
> > -		scsi_complete_async_scans();
> > -
> 
> I believe this is wrong.  You're going to introduce a regression on systems
> using built-in hibernation and built-in SCSI stack.
> 
> >  		swsusp_resume_device = name_to_dev_t(resume_file);
> >  		if (!swsusp_resume_device) {
> >  			error = -ENODEV;
> > diff --git a/kernel/power/user.c b/kernel/power/user.c
> > index 91b0fd0..4ed81e7 100644
> > --- a/kernel/power/user.c
> > +++ b/kernel/power/user.c
> > @@ -24,7 +24,6 @@
> >  #include <linux/console.h>
> >  #include <linux/cpu.h>
> >  #include <linux/freezer.h>
> > -#include <scsi/scsi_scan.h>
> >  
> >  #include <asm/uaccess.h>
> >  
> > @@ -84,7 +83,6 @@ static int snapshot_open(struct inode *inode, struct file *filp)
> >  		 * appear.
> >  		 */
> >  		wait_for_device_probe();
> > -		scsi_complete_async_scans();
> 
> Same here.

Well, not exactly.  This one affects built-in SCSI with userspace-based
hibernation.

> >  
> >  		data->swap = -1;
> >  		data->mode = O_WRONLY;

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 4/4] scsi: cleanup usages of scsi_complete_async_scans
  2012-05-30 21:34   ` Rafael J. Wysocki
  2012-05-30 21:37     ` Rafael J. Wysocki
@ 2012-05-30 21:41     ` Rafael J. Wysocki
  2012-05-30 21:49       ` Dan Williams
  1 sibling, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2012-05-30 21:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux-scsi, Len Brown, mroos, Arjan van de Ven,
	James Bottomley

On Wednesday, May 30, 2012, Rafael J. Wysocki wrote:
> On Wednesday, May 30, 2012, Dan Williams wrote:
> > [...]
> > diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> > index e09dfbf..821114a 100644
> > --- a/kernel/power/hibernate.c
> > +++ b/kernel/power/hibernate.c
> > @@ -25,7 +25,6 @@
> >  #include <linux/freezer.h>
> >  #include <linux/gfp.h>
> >  #include <linux/syscore_ops.h>
> > -#include <scsi/scsi_scan.h>
> >  
> >  #include "power.h"
> >  
> > @@ -735,13 +734,6 @@ static int software_resume(void)
> >  			async_synchronize_full();
> >  		}
> >  
> > -		/*
> > -		 * We can't depend on SCSI devices being available after loading
> > -		 * one of their modules until scsi_complete_async_scans() is
> > -		 * called and the resume device usually is a SCSI one.
> > -		 */
> > -		scsi_complete_async_scans();
> > -
> 
> I believe this is wrong.  You're going to introduce a regression on systems
> using built-in hibernation and built-in SCSI stack.

Ah, wait.  Do I understand correctly that wait_for_device_probe()
is now going to do an equivalent of scsi_complete_async_scans()?

If so, that should work, in which case please disregard my previous
messages in this thread.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 4/4] scsi: cleanup usages of scsi_complete_async_scans
  2012-05-30 21:41     ` Rafael J. Wysocki
@ 2012-05-30 21:49       ` Dan Williams
  0 siblings, 0 replies; 14+ messages in thread
From: Dan Williams @ 2012-05-30 21:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, linux-scsi, Len Brown, mroos, Arjan van de Ven,
	James Bottomley

On Wed, May 30, 2012 at 2:41 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> >             }
>> >
>> > -           /*
>> > -            * We can't depend on SCSI devices being available after loading
>> > -            * one of their modules until scsi_complete_async_scans() is
>> > -            * called and the resume device usually is a SCSI one.
>> > -            */
>> > -           scsi_complete_async_scans();
>> > -
>>
>> I believe this is wrong.  You're going to introduce a regression on systems
>> using built-in hibernation and built-in SCSI stack.
>
> Ah, wait.  Do I understand correctly that wait_for_device_probe()
> is now going to do an equivalent of scsi_complete_async_scans()?
>
> If so, that should work, in which case please disregard my previous
> messages in this thread.

Yeah, no problem.  Patch 3 is what changed the assumptions in this regard.
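
To make that point concrete, here is a minimal userspace sketch of the
idea, not kernel code.  Everything in it (struct toy_domain,
toy_schedule(), toy_synchronize_full(), the "registered" flag) is a
hypothetical stand-in invented for illustration; the real async API and
its data structures differ.  It only models the claim made in this
thread: once scan work is queued to a domain that the global sync walks,
one full sync flushes it, so a separate "complete scans" call adds
nothing, while a domain that opted out is left alone.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an async domain: just a count of outstanding work. */
struct toy_domain {
	const char *name;
	int pending;      /* work items queued but not yet run */
	bool registered;  /* true: visible to toy_synchronize_full() */
};

#define TOY_MAX_DOMAINS 8
static struct toy_domain *toy_domains[TOY_MAX_DOMAINS];
static size_t toy_nr_domains;

/* Track a domain; returns false if the table is full. */
static bool toy_domain_add(struct toy_domain *d)
{
	if (toy_nr_domains == TOY_MAX_DOMAINS)
		return false;
	toy_domains[toy_nr_domains++] = d;
	return true;
}

/* Queue one unit of work (think: one host scan) to a domain. */
static void toy_schedule(struct toy_domain *d)
{
	d->pending++;
}

/* Flush a single domain; returns how many items were run. */
static int toy_synchronize_domain(struct toy_domain *d)
{
	int ran = d->pending;

	d->pending = 0;
	return ran;
}

/* Flush every registered domain; opted-out domains are skipped. */
static int toy_synchronize_full(void)
{
	int ran = 0;

	for (size_t i = 0; i < toy_nr_domains; i++)
		if (toy_domains[i]->registered)
			ran += toy_synchronize_domain(toy_domains[i]);
	return ran;
}
```

In this model, queueing two scans to a registered domain and one item to
an opted-out domain, then calling toy_synchronize_full(), runs exactly
the two scans and leaves the opted-out item pending until its owner
flushes that domain itself.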

--
Dan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] fix / cleanup async scsi scanning
  2012-05-30 18:29   ` Dan Williams
@ 2012-05-30 22:33     ` walt
  2012-05-31 13:37     ` Borislav Petkov
  1 sibling, 0 replies; 14+ messages in thread
From: walt @ 2012-05-30 22:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-scsi

On 05/30/2012 11:29 AM, Dan Williams wrote:
> For 3.5 I think
> James is going with the smaller fix posted here:
> 
> http://marc.info/?l=linux-scsi&m=133796775807498&w=2

Hi Dan.  I replied to that thread not long ago, explaining
that I still have a booting problem after that patch.  Just
a corner case maybe, but it's *my* corner case ;)

From that article click twice on "next in thread" to see it.

Many thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] fix / cleanup async scsi scanning
  2012-05-30 18:21 [PATCH v2 0/4] fix / cleanup async scsi scanning Dan Williams
                   ` (4 preceding siblings ...)
  2012-05-30 18:22 ` [PATCH v2 0/4] fix / cleanup async scsi scanning Borislav Petkov
@ 2012-05-31  9:05 ` mroos
  5 siblings, 0 replies; 14+ messages in thread
From: mroos @ 2012-05-31  9:05 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-kernel, linux-scsi, JBottomley

> Commit a7a20d10 "[SCSI] sd: limit the scope of the async probe domain"
> introduces a boot regression by moving sd probe work off of the global
> async queue.  Using a local async domain hides the probe work from being
> synchronized by wait_for_device_probe()->async_synchronize_full().
> 
> Fix this by teaching async_synchronize_full() to flush all async work
> regardless of domain, and take the opportunity to convert scsi scanning
> to async_schedule().  This enables wait_for_device_probe() to flush scsi
> scanning work.
> 
> Changes since v1: http://marc.info/?l=linux-scsi&m=133793153025832&w=2
> 
> 1/ Tested to fix the boot hang that Meelis reported with v1.  Reworked
>    async_synchronize_full() to walk through all the active domains,
>    otherwise we spin on !list_empty(async_domains) and prevent the async
>    context from running.
> 
> 2/ Added the ability for domains to opt-out of global syncing as
>    requested by Arjan, but also needed for domains that don't want to worry
>    about list corruption when the domain goes out of scope (stack-allocated
>    domains).

Tested successfully on my Netra X1 where the original problem happened, 
on top of 3.4.0-08215-g1e2aec8. Thank you!

-- 
Meelis Roos (mroos@linux.ee)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/4] fix / cleanup async scsi scanning
  2012-05-30 18:29   ` Dan Williams
  2012-05-30 22:33     ` walt
@ 2012-05-31 13:37     ` Borislav Petkov
  1 sibling, 0 replies; 14+ messages in thread
From: Borislav Petkov @ 2012-05-31 13:37 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-kernel, linux-scsi, mroos, JBottomley

On Wed, May 30, 2012 at 11:29:23AM -0700, Dan Williams wrote:
> They're ready for a run, but are likely 3.6 material.  For 3.5 I think
> James is going with the smaller fix posted here:
> 
> http://marc.info/?l=linux-scsi&m=133796775807498&w=2

Yep,

both the smaller fix and the 4-patch set cure the box here, thanks.

Tested-by: Borislav Petkov <bp@alien8.de>

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-05-31 13:37 UTC | newest]

Thread overview: 14+ messages
2012-05-30 18:21 [PATCH v2 0/4] fix / cleanup async scsi scanning Dan Williams
2012-05-30 18:21 ` [PATCH v2 1/4] async: introduce 'async_domain' type Dan Williams
2012-05-30 18:21 ` [PATCH v2 2/4] async: make async_synchronize_full() flush all work regardless of domain Dan Williams
2012-05-30 18:21 ` [PATCH v2 3/4] scsi: queue async scan work to an async_schedule domain Dan Williams
2012-05-30 18:21 ` [PATCH v2 4/4] scsi: cleanup usages of scsi_complete_async_scans Dan Williams
2012-05-30 21:34   ` Rafael J. Wysocki
2012-05-30 21:37     ` Rafael J. Wysocki
2012-05-30 21:41     ` Rafael J. Wysocki
2012-05-30 21:49       ` Dan Williams
2012-05-30 18:22 ` [PATCH v2 0/4] fix / cleanup async scsi scanning Borislav Petkov
2012-05-30 18:29   ` Dan Williams
2012-05-30 22:33     ` walt
2012-05-31 13:37     ` Borislav Petkov
2012-05-31  9:05 ` mroos
