[RFC] firmware load: defer request_firmware during early boot and resume

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [RFC] firmware load: defer request_firmware during early boot and resume
@ 2012-07-20 12:33 Ming Lei
  2012-07-20 12:52 ` Borislav Petkov
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Ming Lei @ 2012-07-20 12:33 UTC (permalink / raw)
  To: Linux Kernel Mailing List, linux-usb
  Cc: Alan Stern, Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

The RFC patch is just for discussing if the idea of deferring
request_firmware is doable for addressing the issue of
request_firmware in resume path, which is caused by driver
unbind/rebind during resume.

At least usb bus is involved in such things, one driver may be
unbound and rebound in resume path at several situations, and
request_firmware is often called inside probe().

Also the idea should be helpful for other hotplug buses too,
at least there was the similar problem report on pcmcia bus.

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 6cd2c6c..fb02eff 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -123,7 +123,7 @@ static bool driver_deferred_probe_enable = false;
  * list and schedules the deferred probe workqueue to process them.  It
  * should be called anytime a driver is successfully bound to a device.
  */
-static void driver_deferred_probe_trigger(void)
+void driver_deferred_probe_trigger(void)
 {
 	if (!driver_deferred_probe_enable)
 		return;
@@ -144,6 +144,7 @@ static void driver_deferred_probe_trigger(void)
 	 */
 	queue_work(deferred_wq, &deferred_probe_work);
 }
+EXPORT_SYMBOL_GPL(driver_deferred_probe_trigger);

 /**
  * deferred_probe_initcall() - Enable probing of deferred devices
diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 5401814..4fe742f 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -593,6 +593,9 @@ request_firmware(const struct firmware
**firmware_p, const char *name,
 	if (IS_ERR_OR_NULL(fw_priv))
 		return PTR_RET(fw_priv);

+	if (system_state != SYSTEM_RUNNING)
+		return -EPROBE_DEFER;
+
 	ret = usermodehelper_read_trylock();
 	if (WARN_ON(ret)) {
 		dev_err(device, "firmware: %s will not be loaded\n", name);
diff --git a/include/linux/device.h b/include/linux/device.h
index d0e4d99..a63d3171 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -244,7 +244,7 @@ extern struct device_driver *driver_find(const char *name,
 					 struct bus_type *bus);
 extern int driver_probe_done(void);
 extern void wait_for_device_probe(void);
-
+extern void driver_deferred_probe_trigger(void);

 /* sysfs interface for exporting driver attributes */

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e07f5e0..c8d74c6 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -378,6 +378,7 @@ extern enum system_states {
 	SYSTEM_POWER_OFF,
 	SYSTEM_RESTART,
 	SYSTEM_SUSPEND_DISK,
+	SYSTEM_SUSPEND,
 } system_state;

 #define TAINT_PROPRIETARY_MODULE	0
diff --git a/init/main.c b/init/main.c
index e60679d..237eae1 100644
--- a/init/main.c
+++ b/init/main.c
@@ -809,6 +809,8 @@ static noinline int init_post(void)
 	current->signal->flags |= SIGNAL_UNKILLABLE;
 	flush_delayed_fput();

+	driver_deferred_probe_trigger();
+
 	if (ramdisk_execute_command) {
 		run_init_process(ramdisk_execute_command);
 		printk(KERN_WARNING "Failed to execute %s\n",
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 1da39ea..dbf6ffb 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -224,6 +224,9 @@ int suspend_devices_and_enter(suspend_state_t state)
 		goto Recover_platform;
 	}
 	suspend_test_finish("suspend devices");
+
+	system_state = SYSTEM_SUSPEND;
+
 	if (suspend_test(TEST_DEVICES))
 		goto Recover_platform;

@@ -301,6 +304,10 @@ static int enter_state(suspend_state_t state)
  Finish:
 	pr_debug("PM: Finishing wakeup.\n");
 	suspend_finish();
+
+	system_state = SYSTEM_RUNNING;
+	driver_deferred_probe_trigger();
+
  Unlock:
 	mutex_unlock(&pm_mutex);
 	return error;


Thanks,
-- 
Ming Lei

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-20 12:33 [RFC] firmware load: defer request_firmware during early boot and resume Ming Lei
@ 2012-07-20 12:52 ` Borislav Petkov
  2012-07-20 12:57   ` Ming Lei
  2012-07-20 13:54 ` Oliver Neukum
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Borislav Petkov @ 2012-07-20 12:52 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Fri, Jul 20, 2012 at 08:33:32PM +0800, Ming Lei wrote:
> The RFC patch is just for discussing if the idea of deferring
> request_firmware is doable for addressing the issue of
> request_firmware in resume path, which is caused by driver
> unbind/rebind during resume.
> 
> At least usb bus is involved in such things, one driver may be
> unbound and rebound in resume path at several situations, and
> request_firmware is often called inside probe().
> 
> Also the idea should be helpful for other hotplug buses too,
> at least there was the similar problem report on pcmcia bus.
> 
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 6cd2c6c..fb02eff 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -123,7 +123,7 @@ static bool driver_deferred_probe_enable = false;
>   * list and schedules the deferred probe workqueue to process them.  It
>   * should be called anytime a driver is successfully bound to a device.
>   */
> -static void driver_deferred_probe_trigger(void)
> +void driver_deferred_probe_trigger(void)
>  {
>  	if (!driver_deferred_probe_enable)
>  		return;
> @@ -144,6 +144,7 @@ static void driver_deferred_probe_trigger(void)
>  	 */
>  	queue_work(deferred_wq, &deferred_probe_work);
>  }
> +EXPORT_SYMBOL_GPL(driver_deferred_probe_trigger);
> 
>  /**
>   * deferred_probe_initcall() - Enable probing of deferred devices
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 5401814..4fe742f 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -593,6 +593,9 @@ request_firmware(const struct firmware
> **firmware_p, const char *name,
>  	if (IS_ERR_OR_NULL(fw_priv))
>  		return PTR_RET(fw_priv);
> 
> +	if (system_state != SYSTEM_RUNNING)
> +		return -EPROBE_DEFER;
> +
>  	ret = usermodehelper_read_trylock();
>  	if (WARN_ON(ret)) {
>  		dev_err(device, "firmware: %s will not be loaded\n", name);
> diff --git a/include/linux/device.h b/include/linux/device.h
> index d0e4d99..a63d3171 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -244,7 +244,7 @@ extern struct device_driver *driver_find(const char *name,
>  					 struct bus_type *bus);
>  extern int driver_probe_done(void);
>  extern void wait_for_device_probe(void);
> -
> +extern void driver_deferred_probe_trigger(void);
> 
>  /* sysfs interface for exporting driver attributes */
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index e07f5e0..c8d74c6 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -378,6 +378,7 @@ extern enum system_states {
>  	SYSTEM_POWER_OFF,
>  	SYSTEM_RESTART,
>  	SYSTEM_SUSPEND_DISK,
> +	SYSTEM_SUSPEND,
>  } system_state;
> 
>  #define TAINT_PROPRIETARY_MODULE	0
> diff --git a/init/main.c b/init/main.c
> index e60679d..237eae1 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -809,6 +809,8 @@ static noinline int init_post(void)
>  	current->signal->flags |= SIGNAL_UNKILLABLE;
>  	flush_delayed_fput();
> 
> +	driver_deferred_probe_trigger();
> +
>  	if (ramdisk_execute_command) {
>  		run_init_process(ramdisk_execute_command);
>  		printk(KERN_WARNING "Failed to execute %s\n",
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 1da39ea..dbf6ffb 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -224,6 +224,9 @@ int suspend_devices_and_enter(suspend_state_t state)
>  		goto Recover_platform;
>  	}
>  	suspend_test_finish("suspend devices");
> +
> +	system_state = SYSTEM_SUSPEND;

This new SYSTEM_SUSPEND state is declared above and only assigned here
to system_state without being tested anywhere. AFAICT, the only test
you're doing is system_state != SYSTEM_RUNNING and that works without
defining a new SYSTEM_SUSPEND state.

So are you sure you really need it?

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-20 12:52 ` Borislav Petkov
@ 2012-07-20 12:57   ` Ming Lei
  2012-07-20 13:03     ` Borislav Petkov
  0 siblings, 1 reply; 25+ messages in thread
From: Ming Lei @ 2012-07-20 12:57 UTC (permalink / raw)
  To: Borislav Petkov, Ming Lei, Linux Kernel Mailing List, linux-usb,
	Alan Stern, Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Fri, Jul 20, 2012 at 8:52 PM, Borislav Petkov <bp@alien8.de> wrote:

> This new SYSTEM_SUSPEND state is declared above and only assigned here
> to system_state without being tested anywhere. AFAICT, the only test
> you're doing is system_state != SYSTEM_RUNNING and that works without
> defining a new SYSTEM_SUSPEND state.
>
> So are you sure you really need it?

If the approach is workable, I will rename SYSTEM_SUSPEND_DISK as
SYSTEM_SUSPEND since SYSTEM_SUSPEND_DISK is not used now.


Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-20 12:57   ` Ming Lei
@ 2012-07-20 13:03     ` Borislav Petkov
  2012-07-20 13:09       ` Ming Lei
  0 siblings, 1 reply; 25+ messages in thread
From: Borislav Petkov @ 2012-07-20 13:03 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Fri, Jul 20, 2012 at 08:57:05PM +0800, Ming Lei wrote:
> On Fri, Jul 20, 2012 at 8:52 PM, Borislav Petkov <bp@alien8.de> wrote:
> 
> > This new SYSTEM_SUSPEND state is declared above and only assigned here
> > to system_state without being tested anywhere. AFAICT, the only test
> > you're doing is system_state != SYSTEM_RUNNING and that works without
> > defining a new SYSTEM_SUSPEND state.
> >
> > So are you sure you really need it?
> 
> If the approach is workable, I will rename SYSTEM_SUSPEND_DISK as
> SYSTEM_SUSPEND since SYSTEM_SUSPEND_DISK is not used now.

This still doesn't change the fact that SYSTEM_SUSPEND or
SYSTEM_SUSPEND_DISK is unused. IOW, both states are unused. So why
introduce a new state instead of simply test != SYSTEM_RUNNING?

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-20 13:03     ` Borislav Petkov
@ 2012-07-20 13:09       ` Ming Lei
  2012-07-20 13:13         ` Borislav Petkov
  0 siblings, 1 reply; 25+ messages in thread
From: Ming Lei @ 2012-07-20 13:09 UTC (permalink / raw)
  To: Borislav Petkov, Ming Lei, Linux Kernel Mailing List, linux-usb,
	Alan Stern, Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Fri, Jul 20, 2012 at 9:03 PM, Borislav Petkov <bp@alien8.de> wrote:
> This still doesn't change the fact that SYSTEM_SUSPEND or
> SYSTEM_SUSPEND_DISK is unused. IOW, both states are unused. So why
> introduce a new state instead of simply test != SYSTEM_RUNNING?

Because system_state is still SYSTEM_RUNNING during S2R or hibernation.

Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-20 13:09       ` Ming Lei
@ 2012-07-20 13:13         ` Borislav Petkov
  0 siblings, 0 replies; 25+ messages in thread
From: Borislav Petkov @ 2012-07-20 13:13 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Fri, Jul 20, 2012 at 09:09:10PM +0800, Ming Lei wrote:
> On Fri, Jul 20, 2012 at 9:03 PM, Borislav Petkov <bp@alien8.de> wrote:
> > This still doesn't change the fact that SYSTEM_SUSPEND or
> > SYSTEM_SUSPEND_DISK is unused. IOW, both states are unused. So why
> > introduce a new state instead of simply test != SYSTEM_RUNNING?
> 
> Because system_state is still SYSTEM_RUNNING during S2R or hibernation.

Ah, and you change that in suspend_devices_and_enter().

Ok, thanks.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-20 12:33 [RFC] firmware load: defer request_firmware during early boot and resume Ming Lei
  2012-07-20 12:52 ` Borislav Petkov
@ 2012-07-20 13:54 ` Oliver Neukum
  2012-07-20 15:53   ` Ming Lei
  2012-07-21  4:13 ` Ming Lei
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Oliver Neukum @ 2012-07-20 13:54 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Rafael J. Wysocki

On Friday 20 July 2012 20:33:32 Ming Lei wrote:
> The RFC patch is just for discussing if the idea of deferring
> request_firmware is doable for addressing the issue of
> request_firmware in resume path, which is caused by driver
> unbind/rebind during resume.
> 
> At least usb bus is involved in such things, one driver may be
> unbound and rebound in resume path at several situations, and
> request_firmware is often called inside probe().
> 
> Also the idea should be helpful for other hotplug buses too,
> at least there was the similar problem report on pcmcia bus.

The approach seems to me to be less comprehensive than
it ought to be. If you defer, why not the whole probe()?
Deferring only the upoad of the firmware complicates
error handling.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-20 13:54 ` Oliver Neukum
@ 2012-07-20 15:53   ` Ming Lei
  0 siblings, 0 replies; 25+ messages in thread
From: Ming Lei @ 2012-07-20 15:53 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Rafael J. Wysocki

On Fri, Jul 20, 2012 at 9:54 PM, Oliver Neukum <oneukum@suse.de> wrote:

>
> The approach seems to me to be less comprehensive than
> it ought to be. If you defer, why not the whole probe()?

Because for other failures, we don't know when the conditions
for them can be satisfied. For example, we don't know when the
memory allocation can be met for one previous allocation failure.

IMO, at least the approach provides one simple way to solve the
firmware loading problem during. early boot or resume

> Deferring only the upoad of the firmware complicates
> error handling.

Looks there is no special requirement for the error handling of
request_firmware, just like other failures' handling, undo things
which has been done, isn't there?

Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-20 12:33 [RFC] firmware load: defer request_firmware during early boot and resume Ming Lei
  2012-07-20 12:52 ` Borislav Petkov
  2012-07-20 13:54 ` Oliver Neukum
@ 2012-07-21  4:13 ` Ming Lei
  2012-07-21  9:56   ` Rafael J. Wysocki
  2012-07-21 17:31 ` Linus Torvalds
  2012-07-21 17:49 ` Rafael J. Wysocki
  4 siblings, 1 reply; 25+ messages in thread
From: Ming Lei @ 2012-07-21  4:13 UTC (permalink / raw)
  To: Linux Kernel Mailing List, linux-usb
  Cc: Alan Stern, Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki,
	Dave Jones, Linus Torvalds, Matthew Garrett, Jack Stone,
	Larry Finger, Alan Cox, Ming Lei

CC guys who discussed the problem in the below link in Jan. :

      http://marc.info/?t=132528956000002&r=10&w=2

On Fri, Jul 20, 2012 at 8:33 PM, Ming Lei <tom.leiming@gmail.com> wrote:
> The RFC patch is just for discussing if the idea of deferring
> request_firmware is doable for addressing the issue of
> request_firmware in resume path, which is caused by driver
> unbind/rebind during resume.
>
> At least usb bus is involved in such things, one driver may be
> unbound and rebound in resume path at several situations, and
> request_firmware is often called inside probe().
>
> Also the idea should be helpful for other hotplug buses too,
> at least there was the similar problem report on pcmcia bus.

Looks it works well in my two test cases: one is to call request_firmware
in early boot(initcall), another one is to call request_firmware in probe()
of resume path(caused by unbind & rebind).  And without the patch, both
two request_firmware return failure and can't complete the loading.

V1:
        only defer handling the requests called from probe()

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 6cd2c6c..fb02eff 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -123,7 +123,7 @@ static bool driver_deferred_probe_enable = false;
  * list and schedules the deferred probe workqueue to process them.  It
  * should be called anytime a driver is successfully bound to a device.
  */
-static void driver_deferred_probe_trigger(void)
+void driver_deferred_probe_trigger(void)
 {
 	if (!driver_deferred_probe_enable)
 		return;
@@ -144,6 +144,7 @@ static void driver_deferred_probe_trigger(void)
 	 */
 	queue_work(deferred_wq, &deferred_probe_work);
 }
+EXPORT_SYMBOL_GPL(driver_deferred_probe_trigger);

 /**
  * deferred_probe_initcall() - Enable probing of deferred devices
diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 5401814..1b13536 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -21,6 +21,7 @@
 #include <linux/firmware.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include "base.h"

 #define to_dev(obj) container_of(obj, struct device, kobj)

@@ -593,6 +594,11 @@ request_firmware(const struct firmware
**firmware_p, const char *name,
 	if (IS_ERR_OR_NULL(fw_priv))
 		return PTR_RET(fw_priv);

+	/* only defer handling the requests called from probe() */
+	if (!klist_node_attached(&device->p->knode_driver) &&
+			system_state != SYSTEM_RUNNING)
+		return -EPROBE_DEFER;
+
 	ret = usermodehelper_read_trylock();
 	if (WARN_ON(ret)) {
 		dev_err(device, "firmware: %s will not be loaded\n", name);
diff --git a/include/linux/device.h b/include/linux/device.h
index d0e4d99..a63d3171 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -244,7 +244,7 @@ extern struct device_driver *driver_find(const char *name,
 					 struct bus_type *bus);
 extern int driver_probe_done(void);
 extern void wait_for_device_probe(void);
-
+extern void driver_deferred_probe_trigger(void);

 /* sysfs interface for exporting driver attributes */

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e07f5e0..c8d74c6 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -378,6 +378,7 @@ extern enum system_states {
 	SYSTEM_POWER_OFF,
 	SYSTEM_RESTART,
 	SYSTEM_SUSPEND_DISK,
+	SYSTEM_SUSPEND,
 } system_state;

 #define TAINT_PROPRIETARY_MODULE	0
diff --git a/init/main.c b/init/main.c
index e60679d..02fc1c2 100644
--- a/init/main.c
+++ b/init/main.c
@@ -809,6 +809,9 @@ static noinline int init_post(void)
 	current->signal->flags |= SIGNAL_UNKILLABLE;
 	flush_delayed_fput();

+	/* defer handling request_firmware in probe of initcall path */
+	driver_deferred_probe_trigger();
+
 	if (ramdisk_execute_command) {
 		run_init_process(ramdisk_execute_command);
 		printk(KERN_WARNING "Failed to execute %s\n",
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 1da39ea..61c723f 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -224,6 +224,9 @@ int suspend_devices_and_enter(suspend_state_t state)
 		goto Recover_platform;
 	}
 	suspend_test_finish("suspend devices");
+
+	system_state = SYSTEM_SUSPEND;
+
 	if (suspend_test(TEST_DEVICES))
 		goto Recover_platform;

@@ -301,6 +304,12 @@ static int enter_state(suspend_state_t state)
  Finish:
 	pr_debug("PM: Finishing wakeup.\n");
 	suspend_finish();
+
+	system_state = SYSTEM_RUNNING;
+
+	/* defer handling request_firmware in probe of resume path */
+	driver_deferred_probe_trigger();
+
  Unlock:
 	mutex_unlock(&pm_mutex);
 	return error;



Thanks,
-- 
Ming Lei

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21  4:13 ` Ming Lei
@ 2012-07-21  9:56   ` Rafael J. Wysocki
  2012-07-21 13:25     ` Ming Lei
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2012-07-21  9:56 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Dave Jones, Linus Torvalds,
	Matthew Garrett, Jack Stone, Larry Finger, Alan Cox, Ming Lei

On Saturday, July 21, 2012, Ming Lei wrote:
> CC guys who discussed the problem in the below link in Jan. :
> 
>       http://marc.info/?t=132528956000002&r=10&w=2
> 
> On Fri, Jul 20, 2012 at 8:33 PM, Ming Lei <tom.leiming@gmail.com> wrote:
> > The RFC patch is just for discussing if the idea of deferring
> > request_firmware is doable for addressing the issue of
> > request_firmware in resume path, which is caused by driver
> > unbind/rebind during resume.
> >
> > At least usb bus is involved in such things, one driver may be
> > unbound and rebound in resume path at several situations, and
> > request_firmware is often called inside probe().
> >
> > Also the idea should be helpful for other hotplug buses too,
> > at least there was the similar problem report on pcmcia bus.
> 
> Looks it works well in my two test cases: one is to call request_firmware
> in early boot(initcall), another one is to call request_firmware in probe()
> of resume path(caused by unbind & rebind).  And without the patch, both
> two request_firmware return failure and can't complete the loading.

And that's precisely why you're not supposed to use request_firmware() in
those two situations.

My opinion is that, at least for the suspend/hibernate case, the firmware
data should be loaded into memory before suspend (e.g. using a PM notifier)
and released only after the corresponding resume (or suspend failure), so
that it's present in memory during the entire suspend-resume cycle.

The early bood situation may be handled through deferred probing.

Thanks,
Rafael


> V1:
>         only defer handling the requests called from probe()
> 
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 6cd2c6c..fb02eff 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -123,7 +123,7 @@ static bool driver_deferred_probe_enable = false;
>   * list and schedules the deferred probe workqueue to process them.  It
>   * should be called anytime a driver is successfully bound to a device.
>   */
> -static void driver_deferred_probe_trigger(void)
> +void driver_deferred_probe_trigger(void)
>  {
>  	if (!driver_deferred_probe_enable)
>  		return;
> @@ -144,6 +144,7 @@ static void driver_deferred_probe_trigger(void)
>  	 */
>  	queue_work(deferred_wq, &deferred_probe_work);
>  }
> +EXPORT_SYMBOL_GPL(driver_deferred_probe_trigger);
> 
>  /**
>   * deferred_probe_initcall() - Enable probing of deferred devices
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 5401814..1b13536 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -21,6 +21,7 @@
>  #include <linux/firmware.h>
>  #include <linux/slab.h>
>  #include <linux/sched.h>
> +#include "base.h"
> 
>  #define to_dev(obj) container_of(obj, struct device, kobj)
> 
> @@ -593,6 +594,11 @@ request_firmware(const struct firmware
> **firmware_p, const char *name,
>  	if (IS_ERR_OR_NULL(fw_priv))
>  		return PTR_RET(fw_priv);
> 
> +	/* only defer handling the requests called from probe() */
> +	if (!klist_node_attached(&device->p->knode_driver) &&
> +			system_state != SYSTEM_RUNNING)
> +		return -EPROBE_DEFER;
> +
>  	ret = usermodehelper_read_trylock();
>  	if (WARN_ON(ret)) {
>  		dev_err(device, "firmware: %s will not be loaded\n", name);
> diff --git a/include/linux/device.h b/include/linux/device.h
> index d0e4d99..a63d3171 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -244,7 +244,7 @@ extern struct device_driver *driver_find(const char *name,
>  					 struct bus_type *bus);
>  extern int driver_probe_done(void);
>  extern void wait_for_device_probe(void);
> -
> +extern void driver_deferred_probe_trigger(void);
> 
>  /* sysfs interface for exporting driver attributes */
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index e07f5e0..c8d74c6 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -378,6 +378,7 @@ extern enum system_states {
>  	SYSTEM_POWER_OFF,
>  	SYSTEM_RESTART,
>  	SYSTEM_SUSPEND_DISK,
> +	SYSTEM_SUSPEND,
>  } system_state;
> 
>  #define TAINT_PROPRIETARY_MODULE	0
> diff --git a/init/main.c b/init/main.c
> index e60679d..02fc1c2 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -809,6 +809,9 @@ static noinline int init_post(void)
>  	current->signal->flags |= SIGNAL_UNKILLABLE;
>  	flush_delayed_fput();
> 
> +	/* defer handling request_firmware in probe of initcall path */
> +	driver_deferred_probe_trigger();
> +
>  	if (ramdisk_execute_command) {
>  		run_init_process(ramdisk_execute_command);
>  		printk(KERN_WARNING "Failed to execute %s\n",
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 1da39ea..61c723f 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -224,6 +224,9 @@ int suspend_devices_and_enter(suspend_state_t state)
>  		goto Recover_platform;
>  	}
>  	suspend_test_finish("suspend devices");
> +
> +	system_state = SYSTEM_SUSPEND;
> +
>  	if (suspend_test(TEST_DEVICES))
>  		goto Recover_platform;
> 
> @@ -301,6 +304,12 @@ static int enter_state(suspend_state_t state)
>   Finish:
>  	pr_debug("PM: Finishing wakeup.\n");
>  	suspend_finish();
> +
> +	system_state = SYSTEM_RUNNING;
> +
> +	/* defer handling request_firmware in probe of resume path */
> +	driver_deferred_probe_trigger();
> +
>   Unlock:
>  	mutex_unlock(&pm_mutex);
>  	return error;
> 
> 
> 
> Thanks,
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21  9:56   ` Rafael J. Wysocki
@ 2012-07-21 13:25     ` Ming Lei
  2012-07-21 17:35       ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Ming Lei @ 2012-07-21 13:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Dave Jones, Linus Torvalds,
	Matthew Garrett, Jack Stone, Larry Finger, Alan Cox

On Sat, Jul 21, 2012 at 5:56 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Saturday, July 21, 2012, Ming Lei wrote:
>> CC guys who discussed the problem in the below link in Jan. :
>>
>>       http://marc.info/?t=132528956000002&r=10&w=2
>>
>> On Fri, Jul 20, 2012 at 8:33 PM, Ming Lei <tom.leiming@gmail.com> wrote:
>> > The RFC patch is just for discussing if the idea of deferring
>> > request_firmware is doable for addressing the issue of
>> > request_firmware in resume path, which is caused by driver
>> > unbind/rebind during resume.
>> >
>> > At least usb bus is involved in such things, one driver may be
>> > unbound and rebound in resume path at several situations, and
>> > request_firmware is often called inside probe().
>> >
>> > Also the idea should be helpful for other hotplug buses too,
>> > at least there was the similar problem report on pcmcia bus.
>>
>> Looks it works well in my two test cases: one is to call request_firmware
>> in early boot(initcall), another one is to call request_firmware in probe()
>> of resume path(caused by unbind & rebind).  And without the patch, both
>> two request_firmware return failure and can't complete the loading.
>
> And that's precisely why you're not supposed to use request_firmware() in
> those two situations.

So you mean we should do as below?

For the early boot situation, the driver which calls request_firmware in probe()
can't be built in kernel image.

For the second situation, some drivers can't be allowed to call
request_firmware()
in its probe() because these drivers may be unbound and rebound inside resume()
or complete_resume(),  for example, see usb_resume_complete().

IMO, the 1st one is very unfriendly and the 2nd one can't be avoided for some
hotplug buses.

That is just the motivation of this patch to make request_firmware() workable in
both the two above situations.

>
> My opinion is that, at least for the suspend/hibernate case, the firmware
> data should be loaded into memory before suspend (e.g. using a PM notifier)
> and released only after the corresponding resume (or suspend failure), so
> that it's present in memory during the entire suspend-resume cycle.

The patch isn't to replace caching firmware data during suspend-resume cycle,
and just a supplement for it.

It is not easy to cache firmware data during suspend-resume cycle for
the above 2nd situation because of the lifetime problem of firmware data:
the driver may be unbound and rebound inside resume path, even the
device may vanish and appear again.

Also, Matthew had a below case[1] which can't be solved with caching
firmware data at all, not to mention consuming much memory by caching
firmware:

       1) user boots from cold. Device comes up with generic USB ID.
       2) isight_firmware loads and binds. Firmware is loaded. Device
           disconnects and reconnects with an ID that's bound by the UVC
           driver.
       3) user reboots. Device comes up with UVC ID. isight_firmware
           doesn't bind.
       4) user suspends.
       5) user resumes. isight_firmware binds and attempts to load firmware.

But it can be dealt with easily by the simple patch.

Finally, suppose caching firmware may work well for the 2nd situation, we still
have to cache all the firmwares of all hotplug devices(in one system) which
need firmware before suspending, because these devices may be unplugged
and plugged again during suspend-resume cycle or be powered off by system.

>
> The early bood situation may be handled through deferred probing.

Yes, deferring it will make drivers more friendly and be easy to use.

[1], http://marc.info/?l=linux-usb&m=132554118928398&w=2

thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21 13:25     ` Ming Lei
@ 2012-07-21 17:35       ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2012-07-21 17:35 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Dave Jones, Linus Torvalds,
	Matthew Garrett, Jack Stone, Larry Finger, Alan Cox,
	Linux PM list

On Saturday, July 21, 2012, Ming Lei wrote:
> On Sat, Jul 21, 2012 at 5:56 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Saturday, July 21, 2012, Ming Lei wrote:
> >> CC guys who discussed the problem in the below link in Jan. :
> >>
> >>       http://marc.info/?t=132528956000002&r=10&w=2
> >>
> >> On Fri, Jul 20, 2012 at 8:33 PM, Ming Lei <tom.leiming@gmail.com> wrote:
> >> > The RFC patch is just for discussing if the idea of deferring
> >> > request_firmware is doable for addressing the issue of
> >> > request_firmware in resume path, which is caused by driver
> >> > unbind/rebind during resume.
> >> >
> >> > At least usb bus is involved in such things, one driver may be
> >> > unbound and rebound in resume path at several situations, and
> >> > request_firmware is often called inside probe().
> >> >
> >> > Also the idea should be helpful for other hotplug buses too,
> >> > at least there was the similar problem report on pcmcia bus.
> >>
> >> Looks it works well in my two test cases: one is to call request_firmware
> >> in early boot(initcall), another one is to call request_firmware in probe()
> >> of resume path(caused by unbind & rebind).  And without the patch, both
> >> two request_firmware return failure and can't complete the loading.
> >
> > And that's precisely why you're not supposed to use request_firmware() in
> > those two situations.
> 
> So you mean we should do as below?
> 
> For the early boot situation, the driver which calls request_firmware in probe()
> can't be built in kernel image.
> 
> For the second situation, some drivers can't be allowed to call
> request_firmware()
> in its probe() because these drivers may be unbound and rebound inside resume()
> or complete_resume(),  for example, see usb_resume_complete().
> 
> IMO, the 1st one is very unfriendly and the 2nd one can't be avoided for some
> hotplug buses.

I'm not sure if it really can't be avoided.

> That is just the motivation of this patch to make request_firmware() workable in
> both the two above situations.
> 
> >
> > My opinion is that, at least for the suspend/hibernate case, the firmware
> > data should be loaded into memory before suspend (e.g. using a PM notifier)
> > and released only after the corresponding resume (or suspend failure), so
> > that it's present in memory during the entire suspend-resume cycle.
> 
> The patch isn't to replace caching firmware data during suspend-resume cycle,
> and just a supplement for it.
> 
> It is not easy to cache firmware data during suspend-resume cycle for
> the above 2nd situation because of the lifetime problem of firmware data:
> the driver may be unbound and rebound inside resume path, even the
> device may vanish and appear again.
> 
> Also, Matthew had a below case[1] which can't be solved with caching
> firmware data at all, not to mention consuming much memory by caching
> firmware:
> 
>        1) user boots from cold. Device comes up with generic USB ID.
>        2) isight_firmware loads and binds. Firmware is loaded. Device
>            disconnects and reconnects with an ID that's bound by the UVC
>            driver.
>        3) user reboots. Device comes up with UVC ID. isight_firmware
>            doesn't bind.
>        4) user suspends.
>        5) user resumes. isight_firmware binds and attempts to load firmware.
> 
> But it can be dealt with easily by the simple patch.
> 
> Finally, suppose caching firmware may work well for the 2nd situation, we still
> have to cache all the firmwares of all hotplug devices(in one system) which
> need firmware before suspending, because these devices may be unplugged
> and plugged again during suspend-resume cycle or be powered off by system.

OK, I give up.  This may not be too ugly to live, after all.

I'll post some comments in a reply to the message with the patch.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-20 12:33 [RFC] firmware load: defer request_firmware during early boot and resume Ming Lei
                   ` (2 preceding siblings ...)
  2012-07-21  4:13 ` Ming Lei
@ 2012-07-21 17:31 ` Linus Torvalds
  2012-07-21 17:53   ` Rafael J. Wysocki
  2012-07-21 19:55   ` Ming Lei
  2012-07-21 17:49 ` Rafael J. Wysocki
  4 siblings, 2 replies; 25+ messages in thread
From: Linus Torvalds @ 2012-07-21 17:31 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Fri, Jul 20, 2012 at 5:33 AM, Ming Lei <tom.leiming@gmail.com> wrote:
> The RFC patch is just for discussing if the idea of deferring
> request_firmware is doable for addressing the issue of
> request_firmware in resume path, which is caused by driver
> unbind/rebind during resume.

NAK.

It really *has* to be handled some other way.

This whole approach seems to actively try to *silently* fix up broken
drivers. And it's wrong.

There's a reason we warn, and there's a reason we *have* to warn: to
let driver writers know that what they are attempting to do MAY NOT
WORK.

Really.

Sure, for a lot of devices it's fine to load the firmware later. But
some devices may be part of the resume sequence in very critical ways,
and deferring the firmware loading will just mean that the resume will
fail.

This we *need* the WARN_ON() - so that even in the case where it
happens to work, people are very much told that "sure, your
suspend/resume may have worked, but it was doing fundamentally wrong
things that may mean that for other people it *won't* work".

For example, maybe it's a USB network dongle, and for *YOU* it is
perfectly fine to defer the firmware loading, so that the network
comes back up a few seconds after the system is up and running.

But in another machine, that exact same USB network dongle may
actually be hardwired on the motherboard (it's fairly common to use
USB as a "system bus" in some laptop and embedded devices), and maybe
that other machine actually is a thin client that has some tiny rdinit
thing, and then everything else is NFS-mounted, and if you resume
without networking, the machine is simply *dead*.

Ok, so that was a completely made-up example, but we have actually had
situations kind of like that, where a device is just not that
important for lots of people, but in other situations it's critical
for the rest of the suspend/resume to succeed.

This is why I'm so vehemently against silently "hiding" these issues.

If you have a driver that has problems, make THAT ONE PARTICULAR
driver do the deferral explicitly. Don't make some generic "silently
defer if there are problems" patch.

See what I'm saying? You're solving things in exactly the wrong place,
and in exactly the wrong way. You're papering things over, and making
the generic code silently just make broken cases work. That's really
really bad, because it makes it *easier* for driver writers to do the
wrong thing without even thinking about it, and without ever seeing
the problem. And then when people say "suspend/resume doesn't work",
the driver author says "it works for me" and ignores the problem.
Because you've systemically made it easy to ignore the problem, and
made it easy to do the wrong thing by default.

So we should make driver writers do the right thing by default, and if
they cannot do the right thing (and the "isight" camera always comes
up, and f*ck it, just fix that driver) then they should do extra work.

Seriously. People should load their firmware *before* the
suspend/resume cycle. And if that isn't possible, then the system
should ABSOLUTELY NOT silently say "whatever" and defer it until
later.  We should have that big failure and the big noisy warning, and
drivers that need to defer need to do so themselves, so that we never
*ever* have that silent automatic defer situation.

                          Linus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21 17:31 ` Linus Torvalds
@ 2012-07-21 17:53   ` Rafael J. Wysocki
  2012-07-21 19:55   ` Ming Lei
  1 sibling, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2012-07-21 17:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ming Lei, Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Linux PM list

On Saturday, July 21, 2012, Linus Torvalds wrote:
> On Fri, Jul 20, 2012 at 5:33 AM, Ming Lei <tom.leiming@gmail.com> wrote:
> > The RFC patch is just for discussing if the idea of deferring
> > request_firmware is doable for addressing the issue of
> > request_firmware in resume path, which is caused by driver
> > unbind/rebind during resume.
> 
> NAK.

[...]

> Seriously. People should load their firmware *before* the
> suspend/resume cycle. And if that isn't possible, then the system
> should ABSOLUTELY NOT silently say "whatever" and defer it until
> later.  We should have that big failure and the big noisy warning, and
> drivers that need to defer need to do so themselves, so that we never
> *ever* have that silent automatic defer situation.

Well, thanks.  I totally agree, but didn't know how to say this. :-)

Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21 17:31 ` Linus Torvalds
  2012-07-21 17:53   ` Rafael J. Wysocki
@ 2012-07-21 19:55   ` Ming Lei
  2012-07-21 20:38     ` Linus Torvalds
  2012-07-21 21:01     ` Francois Romieu
  1 sibling, 2 replies; 25+ messages in thread
From: Ming Lei @ 2012-07-21 19:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

Thanks for your so detailed comments.

On Sun, Jul 22, 2012 at 1:31 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Jul 20, 2012 at 5:33 AM, Ming Lei <tom.leiming@gmail.com> wrote:
>> The RFC patch is just for discussing if the idea of deferring
>> request_firmware is doable for addressing the issue of
>> request_firmware in resume path, which is caused by driver
>> unbind/rebind during resume.
>
> NAK.

Suppose it is not good for resume case, I think it still makes sense
for early boot
situation, at least the patch will support to request firmware inside
init call, and
allow drivers to be built in kernel in case of requesting firmware from probe().

>
> It really *has* to be handled some other way.
>
> This whole approach seems to actively try to *silently* fix up broken
> drivers. And it's wrong.
>
> There's a reason we warn, and there's a reason we *have* to warn: to
> let driver writers know that what they are attempting to do MAY NOT
> WORK.
>
> Really.
>
> Sure, for a lot of devices it's fine to load the firmware later. But
> some devices may be part of the resume sequence in very critical ways,
> and deferring the firmware loading will just mean that the resume will
> fail.
>
> This we *need* the WARN_ON() - so that even in the case where it
> happens to work, people are very much told that "sure, your
> suspend/resume may have worked, but it was doing fundamentally wrong
> things that may mean that for other people it *won't* work".
>
> For example, maybe it's a USB network dongle, and for *YOU* it is
> perfectly fine to defer the firmware loading, so that the network
> comes back up a few seconds after the system is up and running.
>
> But in another machine, that exact same USB network dongle may
> actually be hardwired on the motherboard (it's fairly common to use
> USB as a "system bus" in some laptop and embedded devices), and maybe
> that other machine actually is a thin client that has some tiny rdinit
> thing, and then everything else is NFS-mounted, and if you resume
> without networking, the machine is simply *dead*.
>
> Ok, so that was a completely made-up example, but we have actually had
> situations kind of like that, where a device is just not that
> important for lots of people, but in other situations it's critical
> for the rest of the suspend/resume to succeed.
>
> This is why I'm so vehemently against silently "hiding" these issues.

OK, I see your concerns.

>
> If you have a driver that has problems, make THAT ONE PARTICULAR
> driver do the deferral explicitly. Don't make some generic "silently
> defer if there are problems" patch.

It is a good idea to let the driver defer request explicitly, but still need
some changes in generic code to support it.

>
> See what I'm saying? You're solving things in exactly the wrong place,
> and in exactly the wrong way. You're papering things over, and making
> the generic code silently just make broken cases work. That's really
> really bad, because it makes it *easier* for driver writers to do the
> wrong thing without even thinking about it, and without ever seeing
> the problem. And then when people say "suspend/resume doesn't work",
> the driver author says "it works for me" and ignores the problem.
> Because you've systemically made it easy to ignore the problem, and
> made it easy to do the wrong thing by default.
>
> So we should make driver writers do the right thing by default, and if
> they cannot do the right thing (and the "isight" camera always comes
> up, and f*ck it, just fix that driver) then they should do extra work.

So we can let isight driver to defer its request_firmware explicitly.

>
> Seriously. People should load their firmware *before* the
> suspend/resume cycle. And if that isn't possible, then the system
> should ABSOLUTELY NOT silently say "whatever" and defer it until
> later.  We should have that big failure and the big noisy warning, and
> drivers that need to defer need to do so themselves, so that we never
> *ever* have that silent automatic defer situation.

In my opinion,  we should cache firmware data for all hotplug
devices or devices which may experience power loss automatically
in kernel during suspend-resume cycle because all such devices may be
disconnected and connected again during suspend-resume cycle.

Looks it is not difficult to cache firmware data by kernel, for example, just
call the

           cache_firmware(fw_name)

for each device which need firmware before suspending,
then call the below to uncache firmware after resume:

           uncache_firmware(fw_name)

The problem is that many firmwares may consume much
memory, so we still may let drivers to choose if they need to
let kernel cache firmware automatically or just defer the
request_firmware in resume path by unbind & rebind driver to save
memory space, suppose the device is not important wrt. system
resume.  Maybe just a few devices can't be allowed to defer
requesting firmware.

So saving memory space is another advantage of the deferral
of request_firmware.


Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21 19:55   ` Ming Lei
@ 2012-07-21 20:38     ` Linus Torvalds
  2012-07-21 20:46       ` david
                         ` (2 more replies)
  2012-07-21 21:01     ` Francois Romieu
  1 sibling, 3 replies; 25+ messages in thread
From: Linus Torvalds @ 2012-07-21 20:38 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Sat, Jul 21, 2012 at 12:55 PM, Ming Lei <tom.leiming@gmail.com> wrote:
>
> Suppose it is not good for resume case, I think it still makes sense
> for early boot situation, at least the patch will support to request
> firmware inside init call, and allow drivers to be built in kernel i
> case of requesting firmware from probe().

I agree that this is a problem. At the same time, early boot has some
of the exact same problems as resume has, and I do wish that people
would ask themselves: "why do I try to load the firmware at early boot
time"?

There is really only *one* real reason to load firmware at device
probe time, and that's because the device is needed for the boot. But
in that case, deferral is wrong, isn't it?

And if the device isn't needed for boot, then why is it loading so
early? For network devices, for example (and this is a *common*
issue), firmware should be loaded not at device init, but at device
*open* time, exactly because we don't want to load it too early when
it might not even be available yet.

So I would prefer if people basically just understood that "if you're
trying to load firmware at module init time, you are almost certainly
doing something wrong".

Delaying the firmware load as much as possible (and here "delaying"
does *not* mean your kind of "deferred" load, but explicitly doing it
only when really needed) allows things like "boot the system, copy the
new firmware from a USB stick in single-user mode, then bring up
networking". It also simply avoids the whole module load ordering
issue.

So I really think you are looking (again) too much at working around
the symptoms, rather than fixing the deeper issue.

> It is a good idea to let the driver defer request explicitly, but still need
> some changes in generic code to support it.

That's fine. I am not arguing against making core driver core changes,
I'm just arguing against making them so that you facilitate bad
behavior and work around the symptoms of bad choices.

In fact, I'd actually want to argue for *bigger* core device layer
changes to make it easier to do the right thing. Right now, one of the
reasons why driver writers load the firmware at init time is that it's
often _easiest_ for them to do it there, even if it's the wrong point
to do it. And that is partly because I think the device layer doesn't
help enough in making it really convenient to do later.

> In my opinion,  we should cache firmware data for all hotplug
> devices or devices which may experience power loss automatically
> in kernel during suspend-resume cycle because all such devices may be
> disconnected and connected again during suspend-resume cycle.

Yes. *THAT* is absolutely the kind of change I'd love to see. The core
device layer doesn't really make it easy to handle firmware sanely
over suspend/resume, which is kind of sad. Why does every driver have
to have its own "let me remember my firmware over the suspend/resume
event" and have extra code in suspend/resume, when it's really a
pretty generic situation: if the device has firmware, wouldn't it be
really nice if the core driver layer just knew about that and kept
track of it?

> Looks it is not difficult to cache firmware data by kernel, for example, just
> call the
>
>            cache_firmware(fw_name)
>
> for each device which need firmware before suspending,
> then call the below to uncache firmware after resume:
>
>            uncache_firmware(fw_name)

Exactly. But we should make it automatic, and we should only do it if
the device is actually *active*. If nobody is using the device over
the suspend-resume event, the firmware shouldn't be loaded in the
first place, and resume obviously shouldn't need to re-load it.
Wouldn't it be nice if something like the PCI layer (or the USB layer)
just knew to do the rigth thing for the device on its own?

I would also suggest that the firmware caching have some internal
timeout, so that for the (fairly common) case where a suspend/resume
event might look like a unplug/replug event, the caching would
actually still remember the firmware despite the fact that it looked
(for a short while) like the device went away.

So *this* is where I think we could improve on the generic code. Make
it really easy for devices to do the right thing. Make sure that
firmware caches work, even if it looks like devices disappeared
momentarily. Maybe add a few callbacks from generic code to say "you
can load your firmware now, because the system is up".

So I'm really not against improving the situation with firmware
loading. What I'm against is making it easy for device drivers to do
all the wrong things - like loading firmware in their init routines,
or trying to load firmware at resume time. Because those are both
fundamentally *BAD* things to do.

Don't try to help people do bad things.

> The problem is that many firmwares may consume much
> memory

This tends to come up as an argument, but is it actually true?

I don't think so. Especially for suspend-resume, if the device is
active, you *KNOW* that you will want to load the firmware at resume
time. But at the same time, resume time is when you want to be really
quick: you want to aim for a model where the resume has fully
*completed* by the time the user has opened the lid of the laptop
fully (just as an example). And what does that imply? It implies that
you really want to do as much of the expensive stuff at *suspend*
time.

Who cares if you use memory for firmware while the device is
suspended? NOBODY. And if it takes two extra seconds for the laptop to
really suspend after you close the lid, that's fine too. You'd much
rather spend the time then (when the user clearly doesn't care about
using his device), than at resume time.

So it's really FUNDAMENTALLY WRONG to load firmware at resume time.
It's fundamentally wrong not just because it can be hard to do (the
machine isn't really fully functional yet), but it's fundamentally
wrong because it's STUPID. You want to load the firmware at suspend
time because that's better for the user interface too!

> So saving memory space is another advantage of the deferral
> of request_firmware.

I agree, but see above: I think that argument is only true for the
"the device is not actually in use".

So I really think the rules should always be:

 - firmware should NEVER be loaded at module init time, because it's
the wrong time to do it - the device may never be needed at all.
Slowing down the init sequence is just stupid.

   For example, you may have both a wired and a wireless network in
your laptop, but if you have turned off wireless (airplane mode, for
example), maybe you shouldn't be loading the firmware at all. Or
conversely, maybe you *did* load the firmware of both the wired and
wireless networking when you booted, but the wired network then
noticed that there's no cable attached (after you loaded the firmware
- maybe the driver cannot even tell without the firmware), so the
wired network is not in use, and is not opened. When a suspend
happens, in a perfect world, we should just notice. So we wouldn't
preload the firmware for suspend, because the device isn't even *open*
while suspended, so resume doesn't need to load the firmware.

   Of course, after resume, maybe networkmanager wakes up and checks
the cables, and at that point we load the wired firmware too (*after*
resume, and as a result of opening the device). But then it's the
*correct* thing to load the firmware only after resume, because it
wasn't loaded as *part* of the resume. See what I'm saying.

 - If firmware is needed for resume, it should be loaded by the
suspend logic and cached in memory.

   The reason for this is not just that loading it at resume time
might be hard (so load it when you know the system is fully working),
but also the user interface issue I already mentioned.

Sure, sometimes firmware is a few megabytes in size. But machines
where a few megabytes is a big deal will *not* be running those kinds
of devices. Plus if you load it at suspend-time, nobody really cares
if it takes a while to load and wastes memory while the machine isn't
doing anything. Why would they care?

                Linus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21 20:38     ` Linus Torvalds
@ 2012-07-21 20:46       ` david
  2012-07-21 23:24       ` Ming Lei
  2012-07-22 12:58       ` Borislav Petkov
  2 siblings, 0 replies; 25+ messages in thread
From: david @ 2012-07-21 20:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ming Lei, Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Sat, 21 Jul 2012, Linus Torvalds wrote:

>> In my opinion,  we should cache firmware data for all hotplug
>> devices or devices which may experience power loss automatically
>> in kernel during suspend-resume cycle because all such devices may be
>> disconnected and connected again during suspend-resume cycle.
>
> Yes. *THAT* is absolutely the kind of change I'd love to see. The core
> device layer doesn't really make it easy to handle firmware sanely
> over suspend/resume, which is kind of sad. Why does every driver have
> to have its own "let me remember my firmware over the suspend/resume
> event" and have extra code in suspend/resume, when it's really a
> pretty generic situation: if the device has firmware, wouldn't it be
> really nice if the core driver layer just knew about that and kept
> track of it?

firmware can be added to the kernel image at compile time. would it make 
sense for there to be some mechanism that can add firmware to the kernel 
image after the fact so that it can create a 'cache' of the firmware 
needed for the particular system as part of that systems kernel image?

David Lang

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21 20:38     ` Linus Torvalds
  2012-07-21 20:46       ` david
@ 2012-07-21 23:24       ` Ming Lei
  2012-07-22 12:58       ` Borislav Petkov
  2 siblings, 0 replies; 25+ messages in thread
From: Ming Lei @ 2012-07-21 23:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Sun, Jul 22, 2012 at 4:38 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I agree that this is a problem. At the same time, early boot has some
> of the exact same problems as resume has, and I do wish that people
> would ask themselves: "why do I try to load the firmware at early boot
> time"?
>
> There is really only *one* real reason to load firmware at device
> probe time, and that's because the device is needed for the boot. But
> in that case, deferral is wrong, isn't it?

Linus, sorry for forget sending all, :-(

Maybe not, I know some usb devices, if no firmware is downloaded
into device, you can't use them at all. For example, the isight camera,
only downloading firmware can make the device look like a UVC
camera device.  Also there are some usb BT devices alike, such as
ath3k.

For this kind of devices, deferral should be correct, IMO.

>
> And if the device isn't needed for boot, then why is it loading so
> early? For network devices, for example (and this is a *common*
> issue), firmware should be loaded not at device init, but at device
> *open* time, exactly because we don't want to load it too early when
> it might not even be available yet.

Yes, I agree, we should load firmware just before the device is used in
theory. Also this may be helpful to deal with the caching firmware before
suspend, see below.

But, is one device capable of being downloaded firmware more
than one times? It is still a question and only can be replied by its
vendors. Because one device is only powered on one time and it
is probably or reasonable that the device only supports to be
downloaded firmware just one time. Also, suppose the firmware is
based on linux kernel and application, it is not easy to update the
system online.

Also some devices may have not 'open' interfaces and it is just
accessed by sysfs or other kernel built-in interfaces.

>
> So I would prefer if people basically just understood that "if you're
> trying to load firmware at module init time, you are almost certainly
> doing something wrong".
>
> Delaying the firmware load as much as possible (and here "delaying"
> does *not* mean your kind of "deferred" load, but explicitly doing it
> only when really needed) allows things like "boot the system, copy the
> new firmware from a USB stick in single-user mode, then bring up
> networking". It also simply avoids the whole module load ordering
> issue.
>
> So I really think you are looking (again) too much at working around
> the symptoms, rather than fixing the deeper issue.

OK, I will take time to look at more request_firmware uses in drivers/
and study the related problems above.

>
>> It is a good idea to let the driver defer request explicitly, but still need
>> some changes in generic code to support it.
>
> That's fine. I am not arguing against making core driver core changes,
> I'm just arguing against making them so that you facilitate bad
> behavior and work around the symptoms of bad choices.
>
> In fact, I'd actually want to argue for *bigger* core device layer
> changes to make it easier to do the right thing. Right now, one of the
> reasons why driver writers load the firmware at init time is that it's
> often _easiest_ for them to do it there, even if it's the wrong point
> to do it. And that is partly because I think the device layer doesn't
> help enough in making it really convenient to do later.
>
>> In my opinion,  we should cache firmware data for all hotplug
>> devices or devices which may experience power loss automatically
>> in kernel during suspend-resume cycle because all such devices may be
>> disconnected and connected again during suspend-resume cycle.
>
> Yes. *THAT* is absolutely the kind of change I'd love to see. The core

OK, I'd like to volunteer to improve firmware loading with caching fw
during suspend/resume cycle.

> device layer doesn't really make it easy to handle firmware sanely
> over suspend/resume, which is kind of sad. Why does every driver have
> to have its own "let me remember my firmware over the suspend/resume
> event" and have extra code in suspend/resume, when it's really a
> pretty generic situation: if the device has firmware, wouldn't it be
> really nice if the core driver layer just knew about that and kept
> track of it?
>
>> Looks it is not difficult to cache firmware data by kernel, for example, just
>> call the
>>
>>            cache_firmware(fw_name)
>>
>> for each device which need firmware before suspending,
>> then call the below to uncache firmware after resume:
>>
>>            uncache_firmware(fw_name)
>
> Exactly. But we should make it automatic, and we should only do it if

Yes, I mean the cache/uncache firmware should be done automatically
before suspend and after resume, and it will be implemented inside
driver core.

> the device is actually *active*. If nobody is using the device over

As you said above, suppose devices are active just after its firmware
has been downloaded, we can decide one device if it is active by
observing having downloaded firmware into it or not.

But I know, some drivers needed to be fixed to delay request/download
firmware until it is used actively.

> the suspend-resume event, the firmware shouldn't be loaded in the
> first place, and resume obviously shouldn't need to re-load it.
> Wouldn't it be nice if something like the PCI layer (or the USB layer)
> just knew to do the rigth thing for the device on its own?
>
> I would also suggest that the firmware caching have some internal
> timeout, so that for the (fairly common) case where a suspend/resume
> event might look like a unplug/replug event, the caching would
> actually still remember the firmware despite the fact that it looked
> (for a short while) like the device went away.
>
> So *this* is where I think we could improve on the generic code. Make
> it really easy for devices to do the right thing. Make sure that
> firmware caches work, even if it looks like devices disappeared
> momentarily. Maybe add a few callbacks from generic code to say "you
> can load your firmware now, because the system is up".

IMO, if firmware cache is related with device or driver lifetime, the
problem will become a bit complicated:

     Firmware data lifetime may be longer than driver/device's lifetime.

So I will start the work by just caching firmware before suspend
and uncaching firmware after resume or some time later after resume
for actively used devices, and not related with device/driver's lifetime
first. It should be the simplest and the most reliable approach.


> So I really think the rules should always be:
>
>  - firmware should NEVER be loaded at module init time, because it's
> the wrong time to do it - the device may never be needed at all.

OK, but maybe some devices described above need to load firmware
at probe().

> Slowing down the init sequence is just stupid.

Deferring request_firmware doesn't slow down the init sequence.

>
>  - If firmware is needed for resume, it should be loaded by the
> suspend logic and cached in memory.

Agree, some special device(isight) may choose to defer loading
firmware by themselves.


Thanks,

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21 20:38     ` Linus Torvalds
  2012-07-21 20:46       ` david
  2012-07-21 23:24       ` Ming Lei
@ 2012-07-22 12:58       ` Borislav Petkov
  2012-07-22 19:46         ` Linus Torvalds
  2 siblings, 1 reply; 25+ messages in thread
From: Borislav Petkov @ 2012-07-22 12:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ming Lei, Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Sat, Jul 21, 2012 at 01:38:52PM -0700, Linus Torvalds wrote:
> > Looks it is not difficult to cache firmware data by kernel, for example, just
> > call the
> >
> >            cache_firmware(fw_name)
> >
> > for each device which need firmware before suspending,
> > then call the below to uncache firmware after resume:
> >
> >            uncache_firmware(fw_name)
> 
> Exactly. But we should make it automatic, and we should only do it if
> the device is actually *active*. If nobody is using the device over
> the suspend-resume event, the firmware shouldn't be loaded in the
> first place, and resume obviously shouldn't need to re-load it.
> Wouldn't it be nice if something like the PCI layer (or the USB layer)
> just knew to do the rigth thing for the device on its own?
> 
> I would also suggest that the firmware caching have some internal
> timeout, so that for the (fairly common) case where a suspend/resume
> event might look like a unplug/replug event, the caching would
> actually still remember the firmware despite the fact that it looked
> (for a short while) like the device went away.
> 
> So *this* is where I think we could improve on the generic code. Make
> it really easy for devices to do the right thing. Make sure that
> firmware caches work, even if it looks like devices disappeared
> momentarily. Maybe add a few callbacks from generic code to say "you
> can load your firmware now, because the system is up".

Question: is there any other reason

  [besides maybe embedded people who care about each single Kb of memory
   on the system]

why we don't make this cache/uncache firmware thing *implicit*? That is,
load it once at driver open time and keep it in memory during the whole
driver's lifetime. And this all taken care of by the driver core, btw.

This way you have the const void *firmware always there and can
apply it whenever you feel like it, you obviate all the problems of
request_firmware and it needing userspace and simpify the whole firmware
handling and delays stuff immensely.

You'd only waste just a couple of Kbs per system but what does memory
cost nowadays...

Oh, we'd probably need to be able to jettison the old firmware and
update to a new one but this can be done at convenient moments when the
system is up and we have userspace to do so.

Hmm...

-- 
Regards/Gruss,
    Boris.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-22 12:58       ` Borislav Petkov
@ 2012-07-22 19:46         ` Linus Torvalds
  2012-07-23  8:12           ` Borislav Petkov
  0 siblings, 1 reply; 25+ messages in thread
From: Linus Torvalds @ 2012-07-22 19:46 UTC (permalink / raw)
  To: Borislav Petkov, Linus Torvalds, Ming Lei,
	Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Sun, Jul 22, 2012 at 5:58 AM, Borislav Petkov <bp@alien8.de> wrote:
>
> Question: is there any other reason
>
>   [besides maybe embedded people who care about each single Kb of memory
>    on the system]
>
> why we don't make this cache/uncache firmware thing *implicit*? That is,
> load it once at driver open time and keep it in memory during the whole
> driver's lifetime. And this all taken care of by the driver core, btw.

So some firmware is a *lot* more than "a few kB". We're talking
hundreds of kB, sometimes more. And to make matters worse, we keep it
in memory with vmalloc(), which is a limited resource on 32-bit
systems. So it can actually be worse than just the memory use itself.

Also, as you already mentioned, there's the "cache coherency" issue.
There are real advantages to reloading the firmware occasionally,
because it might have changed on disk. So..

I do think we might want to consider it, although I do suspect that we
do want to throw it out because of the problems with infinite caches.

                Linus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-22 19:46         ` Linus Torvalds
@ 2012-07-23  8:12           ` Borislav Petkov
  0 siblings, 0 replies; 25+ messages in thread
From: Borislav Petkov @ 2012-07-23  8:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ming Lei, Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

On Sun, Jul 22, 2012 at 12:46:13PM -0700, Linus Torvalds wrote:
> On Sun, Jul 22, 2012 at 5:58 AM, Borislav Petkov <bp@alien8.de> wrote:
> >
> > Question: is there any other reason
> >
> >   [besides maybe embedded people who care about each single Kb of memory
> >    on the system]
> >
> > why we don't make this cache/uncache firmware thing *implicit*? That is,
> > load it once at driver open time and keep it in memory during the whole
> > driver's lifetime. And this all taken care of by the driver core, btw.
> 
> So some firmware is a *lot* more than "a few kB". We're talking
> hundreds of kB, sometimes more.

Ok.

> And to make matters worse, we keep it in memory with vmalloc(), which
> is a limited resource on 32-bit systems. So it can actually be worse
> than just the memory use itself.

Ok, a follow-up: why do we use vmalloc space for firmware, actually?
Because it can be a lot more than a few KB as you say above and a normal
kmalloc allocation could fail in such a case?

Becase I recently converted the AMD microcode driver to use a normal
get_zeroed_page page and got rid of all the vmalloc allocations it did
and it is still working :).

What I'm saying is, we probably could take care of the vmalloc issue by
allocating firmware memory early enough so that we can always succeed.
Oh, I see one problem here - the driver could be loaded very late in
the system lifetime and we could be having fragmented physical memory
so that kmalloc does actually fail. In such cases, we can fallback to
vmalloc I guess.

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21 19:55   ` Ming Lei
  2012-07-21 20:38     ` Linus Torvalds
@ 2012-07-21 21:01     ` Francois Romieu
  1 sibling, 0 replies; 25+ messages in thread
From: Francois Romieu @ 2012-07-21 21:01 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linus Torvalds, Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki

Ming Lei <tom.leiming@gmail.com> :
> On Sun, Jul 22, 2012 at 1:31 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Fri, Jul 20, 2012 at 5:33 AM, Ming Lei <tom.leiming@gmail.com> wrote:
> >> The RFC patch is just for discussing if the idea of deferring
> >> request_firmware is doable for addressing the issue of
> >> request_firmware in resume path, which is caused by driver
> >> unbind/rebind during resume.
> >
> > NAK.
> 
> Suppose it is not good for resume case, I think it still makes sense
> for early boot
> situation, at least the patch will support to request firmware inside
> init call, and
> allow drivers to be built in kernel in case of requesting firmware from probe().

Could there be a simple, static and inefficient way to link firmware
and code together ?

-- 
Ueimor

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-20 12:33 [RFC] firmware load: defer request_firmware during early boot and resume Ming Lei
                   ` (3 preceding siblings ...)
  2012-07-21 17:31 ` Linus Torvalds
@ 2012-07-21 17:49 ` Rafael J. Wysocki
  2012-07-21 20:09   ` Ming Lei
  4 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2012-07-21 17:49 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum

On Friday, July 20, 2012, Ming Lei wrote:
> The RFC patch is just for discussing if the idea of deferring
> request_firmware is doable for addressing the issue of
> request_firmware in resume path, which is caused by driver
> unbind/rebind during resume.
> 
> At least usb bus is involved in such things, one driver may be
> unbound and rebound in resume path at several situations, and
> request_firmware is often called inside probe().
> 
> Also the idea should be helpful for other hotplug buses too,
> at least there was the similar problem report on pcmcia bus.
> 
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 6cd2c6c..fb02eff 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -123,7 +123,7 @@ static bool driver_deferred_probe_enable = false;
>   * list and schedules the deferred probe workqueue to process them.  It
>   * should be called anytime a driver is successfully bound to a device.
>   */
> -static void driver_deferred_probe_trigger(void)
> +void driver_deferred_probe_trigger(void)
>  {
>  	if (!driver_deferred_probe_enable)
>  		return;
> @@ -144,6 +144,7 @@ static void driver_deferred_probe_trigger(void)
>  	 */
>  	queue_work(deferred_wq, &deferred_probe_work);
>  }
> +EXPORT_SYMBOL_GPL(driver_deferred_probe_trigger);
> 
>  /**
>   * deferred_probe_initcall() - Enable probing of deferred devices
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 5401814..4fe742f 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -593,6 +593,9 @@ request_firmware(const struct firmware
> **firmware_p, const char *name,
>  	if (IS_ERR_OR_NULL(fw_priv))
>  		return PTR_RET(fw_priv);
> 
> +	if (system_state != SYSTEM_RUNNING)
> +		return -EPROBE_DEFER;

You can't just return here, _request_firmware_cleanup() has to be done still.

> +
>  	ret = usermodehelper_read_trylock();

So why don't you do this here, actually, like:

 	if (ret) {
 	 	ret = -EPROBE_DEFER;
 	} else {

instead of the WARN_ON()?

Arguably, all cases in which usermodehelper_read_trylock() returns error
codes will require deferred probing.

>  	if (WARN_ON(ret)) {
>  		dev_err(device, "firmware: %s will not be loaded\n", name);
> diff --git a/include/linux/device.h b/include/linux/device.h
> index d0e4d99..a63d3171 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -244,7 +244,7 @@ extern struct device_driver *driver_find(const char *name,
>  					 struct bus_type *bus);
>  extern int driver_probe_done(void);
>  extern void wait_for_device_probe(void);
> -
> +extern void driver_deferred_probe_trigger(void);
> 
>  /* sysfs interface for exporting driver attributes */
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index e07f5e0..c8d74c6 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -378,6 +378,7 @@ extern enum system_states {
>  	SYSTEM_POWER_OFF,
>  	SYSTEM_RESTART,
>  	SYSTEM_SUSPEND_DISK,
> +	SYSTEM_SUSPEND,

First off, SYSTEM_SUSPEND_DISK is not used and probably should be removed.
Second, both SYSTEM_SUSPEND and SYSTEM_SUSPEND_DISK would require the same
kind of handling in the respect of device probing, so it is not sufficient
to change the state in suspend_devices_and_enter().

Moreover, there are other situations in which tasks are frozen and
request_firmware() won't work just as well, so I don't think using
system_state for that is going to work in general.

>  } system_state;
> 
>  #define TAINT_PROPRIETARY_MODULE	0
> diff --git a/init/main.c b/init/main.c
> index e60679d..237eae1 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -809,6 +809,8 @@ static noinline int init_post(void)
>  	current->signal->flags |= SIGNAL_UNKILLABLE;
>  	flush_delayed_fput();
> 
> +	driver_deferred_probe_trigger();
> +
>  	if (ramdisk_execute_command) {
>  		run_init_process(ramdisk_execute_command);
>  		printk(KERN_WARNING "Failed to execute %s\n",
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 1da39ea..dbf6ffb 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -224,6 +224,9 @@ int suspend_devices_and_enter(suspend_state_t state)
>  		goto Recover_platform;
>  	}
>  	suspend_test_finish("suspend devices");
> +
> +	system_state = SYSTEM_SUSPEND;
> +
>  	if (suspend_test(TEST_DEVICES))
>  		goto Recover_platform;
> 
> @@ -301,6 +304,10 @@ static int enter_state(suspend_state_t state)
>   Finish:
>  	pr_debug("PM: Finishing wakeup.\n");
>  	suspend_finish();
> +
> +	system_state = SYSTEM_RUNNING;
> +	driver_deferred_probe_trigger();
> +
>   Unlock:
>  	mutex_unlock(&pm_mutex);
>  	return error;

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21 17:49 ` Rafael J. Wysocki
@ 2012-07-21 20:09   ` Ming Lei
  2012-07-21 21:33     ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Ming Lei @ 2012-07-21 20:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum

On Sun, Jul 22, 2012 at 1:49 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday, July 20, 2012, Ming Lei wrote:

>> +     if (system_state != SYSTEM_RUNNING)
>> +             return -EPROBE_DEFER;
>
> You can't just return here, _request_firmware_cleanup() has to be done still.

Good catch, thanks.

>
>> +
>>       ret = usermodehelper_read_trylock();
>
> So why don't you do this here, actually, like:
>
>         if (ret) {
>                 ret = -EPROBE_DEFER;

The problem is that the 'ret' is zero for early boot situation.

>         } else {
>
> instead of the WARN_ON()?
>
> Arguably, all cases in which usermodehelper_read_trylock() returns error
> codes will require deferred probing.

Yes, looks !SYSTEM_RUNNING has covered all the cases already.

>
>>       if (WARN_ON(ret)) {
>>               dev_err(device, "firmware: %s will not be loaded\n", name);
>> diff --git a/include/linux/device.h b/include/linux/device.h
>> index d0e4d99..a63d3171 100644
>> --- a/include/linux/device.h
>> +++ b/include/linux/device.h
>> @@ -244,7 +244,7 @@ extern struct device_driver *driver_find(const char *name,
>>                                        struct bus_type *bus);
>>  extern int driver_probe_done(void);
>>  extern void wait_for_device_probe(void);
>> -
>> +extern void driver_deferred_probe_trigger(void);
>>
>>  /* sysfs interface for exporting driver attributes */
>>
>> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
>> index e07f5e0..c8d74c6 100644
>> --- a/include/linux/kernel.h
>> +++ b/include/linux/kernel.h
>> @@ -378,6 +378,7 @@ extern enum system_states {
>>       SYSTEM_POWER_OFF,
>>       SYSTEM_RESTART,
>>       SYSTEM_SUSPEND_DISK,
>> +     SYSTEM_SUSPEND,
>
> First off, SYSTEM_SUSPEND_DISK is not used and probably should be removed.
> Second, both SYSTEM_SUSPEND and SYSTEM_SUSPEND_DISK would require the same
> kind of handling in the respect of device probing, so it is not sufficient
> to change the state in suspend_devices_and_enter().

suspend_devices_and_enter is used by hibernation too, and the state is just
updated to RUNNING after suspend_finish.

>
> Moreover, there are other situations in which tasks are frozen and
> request_firmware() won't work just as well, so I don't think using
> system_state for that is going to work in general.

Looks system_state becoming SYSTEM_RUNNING means all tasks has
been thawed completely.



Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] firmware load: defer request_firmware during early boot and resume
  2012-07-21 20:09   ` Ming Lei
@ 2012-07-21 21:33     ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2012-07-21 21:33 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linux Kernel Mailing List, linux-usb, Alan Stern,
	Greg Kroah-Hartman, Oliver Neukum

On Saturday, July 21, 2012, Ming Lei wrote:
> On Sun, Jul 22, 2012 at 1:49 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Friday, July 20, 2012, Ming Lei wrote:
> 
> >> +     if (system_state != SYSTEM_RUNNING)
> >> +             return -EPROBE_DEFER;
> >
> > You can't just return here, _request_firmware_cleanup() has to be done still.
> 
> Good catch, thanks.
> 
> >
> >> +
> >>       ret = usermodehelper_read_trylock();
> >
> > So why don't you do this here, actually, like:
> >
> >         if (ret) {
> >                 ret = -EPROBE_DEFER;
> 
> The problem is that the 'ret' is zero for early boot situation.

If you don't use SYSTEM_SUSPEND, you could just leave your !SYSTEM_RUNNING
check above and use this one to cover the suspend/resume case.

However, this is all moot in the face of the Linus' objection.

> >         } else {
> >
> > instead of the WARN_ON()?
> >
> > Arguably, all cases in which usermodehelper_read_trylock() returns error
> > codes will require deferred probing.
> 
> Yes, looks !SYSTEM_RUNNING has covered all the cases already.

Well, not really.

> >
> >>       if (WARN_ON(ret)) {
> >>               dev_err(device, "firmware: %s will not be loaded\n", name);
> >> diff --git a/include/linux/device.h b/include/linux/device.h
> >> index d0e4d99..a63d3171 100644
> >> --- a/include/linux/device.h
> >> +++ b/include/linux/device.h
> >> @@ -244,7 +244,7 @@ extern struct device_driver *driver_find(const char *name,
> >>                                        struct bus_type *bus);
> >>  extern int driver_probe_done(void);
> >>  extern void wait_for_device_probe(void);
> >> -
> >> +extern void driver_deferred_probe_trigger(void);
> >>
> >>  /* sysfs interface for exporting driver attributes */
> >>
> >> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> >> index e07f5e0..c8d74c6 100644
> >> --- a/include/linux/kernel.h
> >> +++ b/include/linux/kernel.h
> >> @@ -378,6 +378,7 @@ extern enum system_states {
> >>       SYSTEM_POWER_OFF,
> >>       SYSTEM_RESTART,
> >>       SYSTEM_SUSPEND_DISK,
> >> +     SYSTEM_SUSPEND,
> >
> > First off, SYSTEM_SUSPEND_DISK is not used and probably should be removed.
> > Second, both SYSTEM_SUSPEND and SYSTEM_SUSPEND_DISK would require the same
> > kind of handling in the respect of device probing, so it is not sufficient
> > to change the state in suspend_devices_and_enter().
> 
> suspend_devices_and_enter is used by hibernation too,

No now look for the second time and then tell me what you got wrong.  OK?

> and the state is just updated to RUNNING after suspend_finish.

No, it is not.

> > Moreover, there are other situations in which tasks are frozen and
> > request_firmware() won't work just as well, so I don't think using
> > system_state for that is going to work in general.
> 
> Looks system_state becoming SYSTEM_RUNNING means all tasks has
> been thawed completely.

No, it doesn't.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2012-07-23  8:12 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-07-20 12:33 [RFC] firmware load: defer request_firmware during early boot and resume Ming Lei
2012-07-20 12:52 ` Borislav Petkov
2012-07-20 12:57   ` Ming Lei
2012-07-20 13:03     ` Borislav Petkov
2012-07-20 13:09       ` Ming Lei
2012-07-20 13:13         ` Borislav Petkov
2012-07-20 13:54 ` Oliver Neukum
2012-07-20 15:53   ` Ming Lei
2012-07-21  4:13 ` Ming Lei
2012-07-21  9:56   ` Rafael J. Wysocki
2012-07-21 13:25     ` Ming Lei
2012-07-21 17:35       ` Rafael J. Wysocki
2012-07-21 17:31 ` Linus Torvalds
2012-07-21 17:53   ` Rafael J. Wysocki
2012-07-21 19:55   ` Ming Lei
2012-07-21 20:38     ` Linus Torvalds
2012-07-21 20:46       ` david
2012-07-21 23:24       ` Ming Lei
2012-07-22 12:58       ` Borislav Petkov
2012-07-22 19:46         ` Linus Torvalds
2012-07-23  8:12           ` Borislav Petkov
2012-07-21 21:01     ` Francois Romieu
2012-07-21 17:49 ` Rafael J. Wysocki
2012-07-21 20:09   ` Ming Lei
2012-07-21 21:33     ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox