Freezable workqueue blocks non-freezable workqueue during the system resume process

All of lore.kernel.org
 help / color / mirror / Atom feed

* Freezable workqueue blocks non-freezable workqueue during the system resume process
@ 2016-02-23  3:20 Peter Chen
  2016-02-23  9:47 ` Peter Chen
  2016-02-23 15:34 ` Alan Stern
  0 siblings, 2 replies; 43+ messages in thread
From: Peter Chen @ 2016-02-23  3:20 UTC (permalink / raw)
  To: tj, florian; +Cc: linux-usb, linux-kernel, usb-storage

Hi Tejun Heo and Florian Mickler,

I have a question that during the system resume process, the freezable
workqueue can be thawed if there is a non-freezable workqueue is
blocked (At uninterruptable state)?

My case like below, I have a USB OTG (Micro-AB) cable is at USB
Micro-B port, and there is a USB driver on it, and un-plug this
cable can wake up system from the suspend. There is a non-freezable
workqueue ci_otg will be scheduled after disconnecting OTG cable,
and in its worker ci_otg_work, it will try to disconnect USB drive,
and flush disk information. But flush disk information is done by
freezable workqueue writeback, it seeems workqueue writeback is
never got chance to execute, the workqueue ci_otg is waiting there
forever, and the system is deadlock.

Both change workqueue ci_otg as freezable or change workqueue writeback
as non-freezable can fix this problem.

The call stack like below:

[  546.987379] writeback       S c07de74c     0    12      2 0x00000000
[  546.993804] Backtrace:
[  546.996307] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
[  547.003370]  r10:ef14bc80 r9:ef14ca00 r8:00000000 r7:c0045c90 r6:ef14bc80 r5:ef14bc98
[  547.011325]  r4:ef164000
[  547.013907] [<c07dec24>] (schedule) from [<c0045f20>] (rescuer_thread+0x290/0x308)
[  547.021490]  r4:00000000 r3:00000008
[  547.025136] [<c0045c90>] (rescuer_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
[  547.032459]  r10:00000000 r9:00000000 r8:00000000 r7:c0045c90 r6:ef14bc80 r5:ef1526c0
[  547.040412]  r4:00000000
[  547.042993] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
[  547.050229]  r7:00000000 r6:00000000 r5:c004b9d8 r4:ef1526c0
[  555.178869] kworker/u2:13   D c07de74c     0   826      2 0x00000000

[  555.185310] Workqueue: ci_otg ci_otg_work
[  555.189353] Backtrace:
[  555.191849] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
[  555.198912]  r10:ee471ba0 r9:00000000 r8:00000000 r7:00000002 r6:ee470000 r5:ee471ba4
[  555.206867]  r4:ee470000
[  555.209453] [<c07dec24>] (schedule) from [<c07e2fc4>] (schedule_timeout+0x15c/0x1e0)
[  555.217212]  r4:7fffffff r3:edc2b000
[  555.220862] [<c07e2e68>] (schedule_timeout) from [<c07df6c8>] (wait_for_common+0x94/0x144)
[  555.229140]  r8:00000000 r7:00000002 r6:ee470000 r5:ee471ba4 r4:7fffffff
[  555.235980] [<c07df634>] (wait_for_common) from [<c07df790>] (wait_for_completion+0x18/0x1c)
[  555.244430]  r10:00000001 r9:c0b5563c r8:c0042e48 r7:ef086000 r6:eea4372c r5:ef131b00
[  555.252383]  r4:00000000
[  555.254970] [<c07df778>] (wait_for_completion) from [<c0043cb8>] (flush_work+0x19c/0x234)
[  555.263177] [<c0043b1c>] (flush_work) from [<c0043fac>] (flush_delayed_work+0x48/0x4c)
[  555.271106]  r8:ed5b5000 r7:c0b38a3c r6:eea439cc r5:eea4372c r4:eea4372c
[  555.277958] [<c0043f64>] (flush_delayed_work) from [<c00eae18>] (bdi_unregister+0x84/0xec)
[  555.286236]  r4:eea43520 r3:20000153
[  555.289885] [<c00ead94>] (bdi_unregister) from [<c02c2154>] (blk_cleanup_queue+0x180/0x29c)
[  555.298250]  r5:eea43808 r4:eea43400
[  555.301909] [<c02c1fd4>] (blk_cleanup_queue) from [<c0417914>] (__scsi_remove_device+0x48/0xb8)
[  555.310623]  r7:00000000 r6:20000153 r5:ededa950 r4:ededa800
[  555.316403] [<c04178cc>] (__scsi_remove_device) from [<c0415e90>] (scsi_forget_host+0x64/0x68)
[  555.325028]  r5:ededa800 r4:ed5b5000
[  555.328689] [<c0415e2c>] (scsi_forget_host) from [<c0409828>] (scsi_remove_host+0x78/0x104)
[  555.337054]  r5:ed5b5068 r4:ed5b5000
[  555.340709] [<c04097b0>] (scsi_remove_host) from [<c04cdfcc>] (usb_stor_disconnect+0x50/0xb4)
[  555.349247]  r6:ed5b56e4 r5:ed5b5818 r4:ed5b5690 r3:00000008
[  555.355025] [<c04cdf7c>] (usb_stor_disconnect) from [<c04b3bc8>] (usb_unbind_interface+0x78/0x25c)
[  555.363997]  r8:c13919b4 r7:edd3c000 r6:edd3c020 r5:ee551c68 r4:ee551c00 r3:c04cdf7c
[  555.371892] [<c04b3b50>] (usb_unbind_interface) from [<c03dc248>] (__device_release_driver+0x8c/0x118)
[  555.381213]  r10:00000001 r9:edd90c00 r8:c13919b4 r7:ee551c68 r6:c0b546e0 r5:c0b5563c
[  555.389167]  r4:edd3c020
[  555.391752] [<c03dc1bc>] (__device_release_driver) from [<c03dc2fc>] (device_release_driver+0x28/0x34)
[  555.401071]  r5:edd3c020 r4:edd3c054
[  555.404721] [<c03dc2d4>] (device_release_driver) from [<c03db304>] (bus_remove_device+0xe0/0x110)
[  555.413607]  r5:edd3c020 r4:ef17f04c
[  555.417253] [<c03db224>] (bus_remove_device) from [<c03d8128>] (device_del+0x114/0x21c)
[  555.425270]  r6:edd3c028 r5:edd3c020 r4:ee551c00 r3:00000000
[  555.431045] [<c03d8014>] (device_del) from [<c04b1560>] (usb_disable_device+0xa4/0x1e8)
[  555.439061]  r8:edd3c000 r7:eded8000 r6:00000000 r5:00000001 r4:ee551c00
[  555.445906] [<c04b14bc>] (usb_disable_device) from [<c04a8e54>] (usb_disconnect+0x74/0x224)
[  555.454271]  r9:edd90c00 r8:ee551000 r7:ee551c68 r6:ee551c9c r5:ee551c00 r4:00000001
[  555.462156] [<c04a8de0>] (usb_disconnect) from [<c04a8fb8>] (usb_disconnect+0x1d8/0x224)
[  555.470259]  r10:00000001 r9:edd90000 r8:ee471e2c r7:ee551468 r6:ee55149c r5:ee551400
[  555.478213]  r4:00000001
[  555.480797] [<c04a8de0>] (usb_disconnect) from [<c04ae5ec>] (usb_remove_hcd+0xa0/0x1ac)
[  555.488813]  r10:00000001 r9:ee471eb0 r8:00000000 r7:ef3d9500 r6:eded810c r5:eded80b0
[  555.496765]  r4:eded8000
[  555.499351] [<c04ae54c>] (usb_remove_hcd) from [<c04d4158>] (host_stop+0x28/0x64)
[  555.506847]  r6:eeb50010 r5:eded8000 r4:eeb51010
[  555.511563] [<c04d4130>] (host_stop) from [<c04d09b8>] (ci_otg_work+0xc4/0x124)
[  555.518885]  r6:00000001 r5:eeb50010 r4:eeb502a0 r3:c04d4130
[  555.524665] [<c04d08f4>] (ci_otg_work) from [<c00454f0>] (process_one_work+0x194/0x420)
[  555.532682]  r6:ef086000 r5:eeb502a0 r4:edc44480
[  555.537393] [<c004535c>] (process_one_work) from [<c00457b0>] (worker_thread+0x34/0x514)
[  555.545496]  r10:edc44480 r9:ef086000 r8:c0b1a100 r7:ef086034 r6:00000088 r5:edc44498
[  555.553450]  r4:ef086000
[  555.556032] [<c004577c>] (worker_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
[  555.563268]  r10:00000000 r9:00000000 r8:00000000 r7:c004577c r6:edc44480 r5:eddc15c0
[  555.571221]  r4:00000000
[  555.573804] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
[  555.581040]  r7:00000000 r6:00000000 r5:c004b9d8 r4:eddc15c0
[  555.586803] kworker/u2:14   S c07de74c     0   827      2 0x00000000
[  555.593232] Backtrace:
[  555.595739] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
[  555.602806]  r10:edc44200 r9:00000000 r8:00000000 r7:ef086034 r6:edc44200 r5:ef086000
[  555.610761]  r4:ed72a000
[  555.613345] [<c07dec24>] (schedule) from [<c0045834>] (worker_thread+0xb8/0x514)
[  555.620755]  r4:ef086000 r3:ed72bef0
[  555.624399] [<c004577c>] (worker_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
[  555.631635]  r10:00000000 r9:00000000 r8:00000000 r7:c004577c r6:edc44200 r5:eddc15c0
[  555.639589]  r4:00000000
[  555.642170] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
[  555.649406]  r7:00000000 r6:00000000 r5:c004b9d8 r4:eddc15c0
[  555.655168]
[  555.655168] Showing all locks held in the system:
[  555.661424] 5 locks held by sh/694:
[  555.664926]  #0:  (sb_writers#6){.+.+.+}, at: [<c0118f0c>] __sb_start_write+0xb0/0xbc
[  555.672921]  #1:  (&of->mutex){+.+.+.}, at: [<c0187ad4>] kernfs_fop_write+0x60/0x1fc
[  555.680805]  #2:  (s_active#211){.+.+.+}, at: [<c0187adc>] kernfs_fop_write+0x68/0x1fc
[  555.688873]  #3:  (pm_mutex){+.+.+.}, at: [<c0079d10>] pm_suspend+0xd0/0x2cc
[  555.696052]  #4:  (&dev->mutex){......}, at: [<c03e927c>] dpm_complete+0xc0/0x1b0
[  555.703684] 7 locks held by kworker/u2:13/826:
[  555.708140]  #0:  ("%s""ci_otg"){++++.+}, at: [<c0045484>] process_one_work+0x128/0x420
[  555.716277]  #1:  ((&ci->work)){+.+.+.}, at: [<c0045484>] process_one_work+0x128/0x420
[  555.724317]  #2:  (usb_bus_list_lock){+.+.+.}, at: [<c04ae5e4>] usb_remove_hcd+0x98/0x1ac
[  555.732626]  #3:  (&dev->mutex){......}, at: [<c04a8e28>] usb_disconnect+0x48/0x224
[  555.740403]  #4:  (&dev->mutex){......}, at: [<c04a8e28>] usb_disconnect+0x48/0x224
[  555.748179]  #5:  (&dev->mutex){......}, at: [<c03dc2f4>] device_release_driver+0x20/0x34
[  555.756487]  #6:  (&shost->scan_mutex){+.+.+.}, at: [<c04097d0>] scsi_remove_host+0x20/0x104
[  555.765062]
[  555.766567] =============================================
[  555.766567]
[  555.773467] Showing busy workqueues and worker pools:
[  555.778544] workqueue events_freezable: flags=0x4
[  555.783371]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
[  555.789316]     delayed: thermal_zone_device_check
[  555.794191] workqueue pm: flags=0x4
[  555.797700]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
[  555.803636]     delayed: pm_runtime_work
[  555.807629] workqueue writeback: flags=0x4e
[  555.811894]   pwq 2: cpus=0 flags=0x4 nice=0 active=0/0
[  555.817209]     delayed: wb_workfn BAR(826)
[  555.821493] workqueue usb_hub_wq: flags=0x4
[  555.825693]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
[  555.831627]     delayed: hub_event
[  555.835119] workqueue ci_otg: flags=0x2000a
[  555.839320]   pwq 2: cpus=0 flags=0x4 nice=0 active=1/1
[  555.844623]     in-flight: 826:ci_otg_work
[  555.848810] pool 2: cpus=0 flags=0x4 nice=0 hung=0s workers=15 idle: 814 733 700 824 6 822 820 819 816 817 815 813 827 825

Thanks you!

-- 

Best Regards,
Peter Chen

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-02-23  3:20 Freezable workqueue blocks non-freezable workqueue during the system resume process Peter Chen
@ 2016-02-23  9:47 ` Peter Chen
  2016-02-23 15:34 ` Alan Stern
  1 sibling, 0 replies; 43+ messages in thread
From: Peter Chen @ 2016-02-23  9:47 UTC (permalink / raw)
  To: tj, florian; +Cc: linux-usb, linux-kernel, usb-storage

On Tue, Feb 23, 2016 at 11:20:56AM +0800, Peter Chen wrote:
> Hi Tejun Heo and Florian Mickler,
> 
> I have a question that during the system resume process, the freezable
> workqueue can be thawed if there is a non-freezable workqueue is
> blocked (At uninterruptable state)?
> 
> My case like below, I have a USB OTG (Micro-AB) cable is at USB
> Micro-B port, and there is a USB driver on it, and un-plug this
> cable can wake up system from the suspend. There is a non-freezable
> workqueue ci_otg will be scheduled after disconnecting OTG cable,
> and in its worker ci_otg_work, it will try to disconnect USB drive,
> and flush disk information. But flush disk information is done by
> freezable workqueue writeback, it seeems workqueue writeback is
> never got chance to execute, the workqueue ci_otg is waiting there
> forever, and the system is deadlock.
> 

> Both change workqueue ci_otg as freezable or change workqueue writeback
> as non-freezable can fix this problem.
> 
Please ignore it, the system is locked at driver's resume,
maybe at scsi or usb driver, so of cos, the freezable processes
can't be thawed.

[  553.429383] sh              D c07de74c     0   694    691 0x00000000
[  553.435801] Backtrace:
[  553.438295] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
[  553.445358]  r10:edd3c054 r9:edd3c078 r8:edddbd50 r7:edcbbc00 r6:c1377c34 r5:60000153
[  553.453313]  r4:eddda000
[  553.455896] [<c07dec24>] (schedule) from [<c07deff8>] (schedule_preempt_disabled+0x10/0x14)
[  553.464261]  r4:edd3c058 r3:0000000a
[  553.467910] [<c07defe8>] (schedule_preempt_disabled) from [<c07e0bbc>] (mutex_lock_nested+0x1a0/0x3e8)
[  553.477254] [<c07e0a1c>] (mutex_lock_nested) from [<c03e927c>] (dpm_complete+0xc0/0x1b0)
[  553.485358]  r10:00561408 r9:edd3c054 r8:c0b4863c r7:edddbd90 r6:c0b485d8 r5:edd3c020
[  553.493313]  r4:edd3c0d0
[  553.495896] [<c03e91bc>] (dpm_complete) from [<c03e9388>] (dpm_resume_end+0x1c/0x20)
[  553.503652]  r9:00000000 r8:c0b1a9d0 r7:c1334ec0 r6:c1334edc r5:00000003 r4:00000010
[  553.511544] [<c03e936c>] (dpm_resume_end) from [<c0079894>] (suspend_devices_and_enter+0x158/0x504)
[  553.520604]  r4:00000000 r3:c1334efc
[  553.524250] [<c007973c>] (suspend_devices_and_enter) from [<c0079e74>] (pm_suspend+0x234/0x2cc)
[  553.532961]  r10:00561408 r9:ed6b7300 r8:00000004 r7:c1334eec r6:00000000 r5:c1334ee8
[  553.540914]  r4:00000003
[  553.543493] [<c0079c40>] (pm_suspend) from [<c0078a6c>] (state_store+0x6c/0xc0)
[  553.550815]  r6:00000003 r5:c09b2ca4 r4:00000003 r3:0000006d
[  553.556599] [<c0078a00>] (state_store) from [<c02e71fc>] (kobj_attr_store+0x1c/0x28)
[  553.564358]  r9:00000004 r8:c0010004 r7:edf9480c r6:ed6b7300 r5:edf94800 r4:00000004
[  553.572258] [<c02e71e0>] (kobj_attr_store) from [<c01885d4>] (sysfs_kf_write+0x54/0x58)
[  553.580295] [<c0188580>] (sysfs_kf_write) from [<c0187b4c>] (kernfs_fop_write+0xd8/0x1fc)
[  553.588487]  r6:ed6b7300 r5:00000000 r4:00000000 r3:c0188580
[  553.594262] [<c0187a74>] (kernfs_fop_write) from [<c0114e98>] (__vfs_write+0x2c/0xe0)
[  553.602105]  r10:00000000 r9:eddda000 r8:c0010004 r7:edddbf80 r6:00561408 r5:edddbf80
[  553.610060]  r4:ed445280
[  553.612641] [<c0114e6c>] (__vfs_write) from [<c0115d78>] (vfs_write+0x98/0x16c)
[  553.619963]  r8:c0010004 r7:edddbf80 r6:00561408 r5:00000004 r4:ed445280
[  553.626800] [<c0115ce0>] (vfs_write) from [<c0116818>] (SyS_write+0x4c/0xa8)
[  553.633861]  r8:c0010004 r7:00561408 r6:00000004 r5:ed445280 r4:ed445280
[  553.640705] [<c01167cc>] (SyS_write) from [<c000fe60>] (ret_fast_syscall+0x0/0x1c)
[  553.648291]  r7:00000004 r6:b6f27d60 r5:00561408 r4:00000004

> The call stack like below:
> 
> [  546.987379] writeback       S c07de74c     0    12      2 0x00000000
> [  546.993804] Backtrace:
> [  546.996307] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
> [  547.003370]  r10:ef14bc80 r9:ef14ca00 r8:00000000 r7:c0045c90 r6:ef14bc80 r5:ef14bc98
> [  547.011325]  r4:ef164000
> [  547.013907] [<c07dec24>] (schedule) from [<c0045f20>] (rescuer_thread+0x290/0x308)
> [  547.021490]  r4:00000000 r3:00000008
> [  547.025136] [<c0045c90>] (rescuer_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
> [  547.032459]  r10:00000000 r9:00000000 r8:00000000 r7:c0045c90 r6:ef14bc80 r5:ef1526c0
> [  547.040412]  r4:00000000
> [  547.042993] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
> [  547.050229]  r7:00000000 r6:00000000 r5:c004b9d8 r4:ef1526c0
> [  555.178869] kworker/u2:13   D c07de74c     0   826      2 0x00000000
> 
> [  555.185310] Workqueue: ci_otg ci_otg_work
> [  555.189353] Backtrace:
> [  555.191849] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
> [  555.198912]  r10:ee471ba0 r9:00000000 r8:00000000 r7:00000002 r6:ee470000 r5:ee471ba4
> [  555.206867]  r4:ee470000
> [  555.209453] [<c07dec24>] (schedule) from [<c07e2fc4>] (schedule_timeout+0x15c/0x1e0)
> [  555.217212]  r4:7fffffff r3:edc2b000
> [  555.220862] [<c07e2e68>] (schedule_timeout) from [<c07df6c8>] (wait_for_common+0x94/0x144)
> [  555.229140]  r8:00000000 r7:00000002 r6:ee470000 r5:ee471ba4 r4:7fffffff
> [  555.235980] [<c07df634>] (wait_for_common) from [<c07df790>] (wait_for_completion+0x18/0x1c)
> [  555.244430]  r10:00000001 r9:c0b5563c r8:c0042e48 r7:ef086000 r6:eea4372c r5:ef131b00
> [  555.252383]  r4:00000000
> [  555.254970] [<c07df778>] (wait_for_completion) from [<c0043cb8>] (flush_work+0x19c/0x234)
> [  555.263177] [<c0043b1c>] (flush_work) from [<c0043fac>] (flush_delayed_work+0x48/0x4c)
> [  555.271106]  r8:ed5b5000 r7:c0b38a3c r6:eea439cc r5:eea4372c r4:eea4372c
> [  555.277958] [<c0043f64>] (flush_delayed_work) from [<c00eae18>] (bdi_unregister+0x84/0xec)
> [  555.286236]  r4:eea43520 r3:20000153
> [  555.289885] [<c00ead94>] (bdi_unregister) from [<c02c2154>] (blk_cleanup_queue+0x180/0x29c)
> [  555.298250]  r5:eea43808 r4:eea43400
> [  555.301909] [<c02c1fd4>] (blk_cleanup_queue) from [<c0417914>] (__scsi_remove_device+0x48/0xb8)
> [  555.310623]  r7:00000000 r6:20000153 r5:ededa950 r4:ededa800
> [  555.316403] [<c04178cc>] (__scsi_remove_device) from [<c0415e90>] (scsi_forget_host+0x64/0x68)
> [  555.325028]  r5:ededa800 r4:ed5b5000
> [  555.328689] [<c0415e2c>] (scsi_forget_host) from [<c0409828>] (scsi_remove_host+0x78/0x104)
> [  555.337054]  r5:ed5b5068 r4:ed5b5000
> [  555.340709] [<c04097b0>] (scsi_remove_host) from [<c04cdfcc>] (usb_stor_disconnect+0x50/0xb4)
> [  555.349247]  r6:ed5b56e4 r5:ed5b5818 r4:ed5b5690 r3:00000008
> [  555.355025] [<c04cdf7c>] (usb_stor_disconnect) from [<c04b3bc8>] (usb_unbind_interface+0x78/0x25c)
> [  555.363997]  r8:c13919b4 r7:edd3c000 r6:edd3c020 r5:ee551c68 r4:ee551c00 r3:c04cdf7c
> [  555.371892] [<c04b3b50>] (usb_unbind_interface) from [<c03dc248>] (__device_release_driver+0x8c/0x118)
> [  555.381213]  r10:00000001 r9:edd90c00 r8:c13919b4 r7:ee551c68 r6:c0b546e0 r5:c0b5563c
> [  555.389167]  r4:edd3c020
> [  555.391752] [<c03dc1bc>] (__device_release_driver) from [<c03dc2fc>] (device_release_driver+0x28/0x34)
> [  555.401071]  r5:edd3c020 r4:edd3c054
> [  555.404721] [<c03dc2d4>] (device_release_driver) from [<c03db304>] (bus_remove_device+0xe0/0x110)
> [  555.413607]  r5:edd3c020 r4:ef17f04c
> [  555.417253] [<c03db224>] (bus_remove_device) from [<c03d8128>] (device_del+0x114/0x21c)
> [  555.425270]  r6:edd3c028 r5:edd3c020 r4:ee551c00 r3:00000000
> [  555.431045] [<c03d8014>] (device_del) from [<c04b1560>] (usb_disable_device+0xa4/0x1e8)
> [  555.439061]  r8:edd3c000 r7:eded8000 r6:00000000 r5:00000001 r4:ee551c00
> [  555.445906] [<c04b14bc>] (usb_disable_device) from [<c04a8e54>] (usb_disconnect+0x74/0x224)
> [  555.454271]  r9:edd90c00 r8:ee551000 r7:ee551c68 r6:ee551c9c r5:ee551c00 r4:00000001
> [  555.462156] [<c04a8de0>] (usb_disconnect) from [<c04a8fb8>] (usb_disconnect+0x1d8/0x224)
> [  555.470259]  r10:00000001 r9:edd90000 r8:ee471e2c r7:ee551468 r6:ee55149c r5:ee551400
> [  555.478213]  r4:00000001
> [  555.480797] [<c04a8de0>] (usb_disconnect) from [<c04ae5ec>] (usb_remove_hcd+0xa0/0x1ac)
> [  555.488813]  r10:00000001 r9:ee471eb0 r8:00000000 r7:ef3d9500 r6:eded810c r5:eded80b0
> [  555.496765]  r4:eded8000
> [  555.499351] [<c04ae54c>] (usb_remove_hcd) from [<c04d4158>] (host_stop+0x28/0x64)
> [  555.506847]  r6:eeb50010 r5:eded8000 r4:eeb51010
> [  555.511563] [<c04d4130>] (host_stop) from [<c04d09b8>] (ci_otg_work+0xc4/0x124)
> [  555.518885]  r6:00000001 r5:eeb50010 r4:eeb502a0 r3:c04d4130
> [  555.524665] [<c04d08f4>] (ci_otg_work) from [<c00454f0>] (process_one_work+0x194/0x420)
> [  555.532682]  r6:ef086000 r5:eeb502a0 r4:edc44480
> [  555.537393] [<c004535c>] (process_one_work) from [<c00457b0>] (worker_thread+0x34/0x514)
> [  555.545496]  r10:edc44480 r9:ef086000 r8:c0b1a100 r7:ef086034 r6:00000088 r5:edc44498
> [  555.553450]  r4:ef086000
> [  555.556032] [<c004577c>] (worker_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
> [  555.563268]  r10:00000000 r9:00000000 r8:00000000 r7:c004577c r6:edc44480 r5:eddc15c0
> [  555.571221]  r4:00000000
> [  555.573804] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
> [  555.581040]  r7:00000000 r6:00000000 r5:c004b9d8 r4:eddc15c0
> [  555.586803] kworker/u2:14   S c07de74c     0   827      2 0x00000000
> [  555.593232] Backtrace:
> [  555.595739] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
> [  555.602806]  r10:edc44200 r9:00000000 r8:00000000 r7:ef086034 r6:edc44200 r5:ef086000
> [  555.610761]  r4:ed72a000
> [  555.613345] [<c07dec24>] (schedule) from [<c0045834>] (worker_thread+0xb8/0x514)
> [  555.620755]  r4:ef086000 r3:ed72bef0
> [  555.624399] [<c004577c>] (worker_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
> [  555.631635]  r10:00000000 r9:00000000 r8:00000000 r7:c004577c r6:edc44200 r5:eddc15c0
> [  555.639589]  r4:00000000
> [  555.642170] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
> [  555.649406]  r7:00000000 r6:00000000 r5:c004b9d8 r4:eddc15c0
> [  555.655168]
> [  555.655168] Showing all locks held in the system:
> [  555.661424] 5 locks held by sh/694:
> [  555.664926]  #0:  (sb_writers#6){.+.+.+}, at: [<c0118f0c>] __sb_start_write+0xb0/0xbc
> [  555.672921]  #1:  (&of->mutex){+.+.+.}, at: [<c0187ad4>] kernfs_fop_write+0x60/0x1fc
> [  555.680805]  #2:  (s_active#211){.+.+.+}, at: [<c0187adc>] kernfs_fop_write+0x68/0x1fc
> [  555.688873]  #3:  (pm_mutex){+.+.+.}, at: [<c0079d10>] pm_suspend+0xd0/0x2cc
> [  555.696052]  #4:  (&dev->mutex){......}, at: [<c03e927c>] dpm_complete+0xc0/0x1b0
> [  555.703684] 7 locks held by kworker/u2:13/826:
> [  555.708140]  #0:  ("%s""ci_otg"){++++.+}, at: [<c0045484>] process_one_work+0x128/0x420
> [  555.716277]  #1:  ((&ci->work)){+.+.+.}, at: [<c0045484>] process_one_work+0x128/0x420
> [  555.724317]  #2:  (usb_bus_list_lock){+.+.+.}, at: [<c04ae5e4>] usb_remove_hcd+0x98/0x1ac
> [  555.732626]  #3:  (&dev->mutex){......}, at: [<c04a8e28>] usb_disconnect+0x48/0x224
> [  555.740403]  #4:  (&dev->mutex){......}, at: [<c04a8e28>] usb_disconnect+0x48/0x224
> [  555.748179]  #5:  (&dev->mutex){......}, at: [<c03dc2f4>] device_release_driver+0x20/0x34
> [  555.756487]  #6:  (&shost->scan_mutex){+.+.+.}, at: [<c04097d0>] scsi_remove_host+0x20/0x104
> [  555.765062]
> [  555.766567] =============================================
> [  555.766567]
> [  555.773467] Showing busy workqueues and worker pools:
> [  555.778544] workqueue events_freezable: flags=0x4
> [  555.783371]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
> [  555.789316]     delayed: thermal_zone_device_check
> [  555.794191] workqueue pm: flags=0x4
> [  555.797700]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
> [  555.803636]     delayed: pm_runtime_work
> [  555.807629] workqueue writeback: flags=0x4e
> [  555.811894]   pwq 2: cpus=0 flags=0x4 nice=0 active=0/0
> [  555.817209]     delayed: wb_workfn BAR(826)
> [  555.821493] workqueue usb_hub_wq: flags=0x4
> [  555.825693]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
> [  555.831627]     delayed: hub_event
> [  555.835119] workqueue ci_otg: flags=0x2000a
> [  555.839320]   pwq 2: cpus=0 flags=0x4 nice=0 active=1/1
> [  555.844623]     in-flight: 826:ci_otg_work
> [  555.848810] pool 2: cpus=0 flags=0x4 nice=0 hung=0s workers=15 idle: 814 733 700 824 6 822 820 819 816 817 815 813 827 825
> 

-- 

Best Regards,
Peter Chen

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-02-23  3:20 Freezable workqueue blocks non-freezable workqueue during the system resume process Peter Chen
  2016-02-23  9:47 ` Peter Chen
@ 2016-02-23 15:34 ` Alan Stern
  2016-02-24  7:24   ` Peter Chen
  1 sibling, 1 reply; 43+ messages in thread
From: Alan Stern @ 2016-02-23 15:34 UTC (permalink / raw)
  To: Peter Chen; +Cc: tj, florian, linux-usb, linux-kernel, usb-storage

On Tue, 23 Feb 2016, Peter Chen wrote:

> Hi Tejun Heo and Florian Mickler,
> 
> I have a question that during the system resume process, the freezable
> workqueue can be thawed if there is a non-freezable workqueue is
> blocked (At uninterruptable state)?
> 
> My case like below, I have a USB OTG (Micro-AB) cable is at USB
> Micro-B port, and there is a USB driver on it, and un-plug this
> cable can wake up system from the suspend. There is a non-freezable
> workqueue ci_otg will be scheduled after disconnecting OTG cable,
> and in its worker ci_otg_work, it will try to disconnect USB drive,
> and flush disk information.

These operations probably are not safe while the system is resuming.  
It might be best to make them wait until the resume is finished.

> But flush disk information is done by
> freezable workqueue writeback, it seeems workqueue writeback is
> never got chance to execute, the workqueue ci_otg is waiting there
> forever, and the system is deadlock.

> Both change workqueue ci_otg as freezable or change workqueue writeback
> as non-freezable can fix this problem.

It sounds like making ci_otg freezable is the easiest solution.

> Please ignore it, the system is locked at driver's resume,
> maybe at scsi or usb driver, so of cos, the freezable processes
> can't be thawed.
 
> > [  555.263177] [<c0043b1c>] (flush_work) from [<c0043fac>] (flush_delayed_work+0x48/0x4c)   
> > [  555.271106]  r8:ed5b5000 r7:c0b38a3c r6:eea439cc r5:eea4372c r4:eea4372c
> > [  555.277958] [<c0043f64>] (flush_delayed_work) from [<c00eae18>] (bdi_unregister+0x84/0xec)
> > [  555.286236]  r4:eea43520 r3:20000153
> > [  555.289885] [<c00ead94>] (bdi_unregister) from [<c02c2154>] (blk_cleanup_queue+0x180/0x29c)
> > [  555.298250]  r5:eea43808 r4:eea43400

You might want to complain to the block-layer people about this.  I 
don't know if anything can be done to fix it.

Or maybe flush_work and flush_delayed_work can be changed to avoid 
blocking if the workqueue is frozen.  Tejun?

Alan Stern

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-02-23 15:34 ` Alan Stern
@ 2016-02-24  7:24   ` Peter Chen
  2016-02-25 22:01     ` Tejun Heo
  0 siblings, 1 reply; 43+ messages in thread
From: Peter Chen @ 2016-02-24  7:24 UTC (permalink / raw)
  To: Alan Stern; +Cc: tj, florian, linux-usb, linux-kernel, usb-storage

On Tue, Feb 23, 2016 at 10:34:09AM -0500, Alan Stern wrote:
> On Tue, 23 Feb 2016, Peter Chen wrote:
> 
> > Hi Tejun Heo and Florian Mickler,
> > 
> > I have a question that during the system resume process, the freezable
> > workqueue can be thawed if there is a non-freezable workqueue is
> > blocked (At uninterruptable state)?
> > 
> > My case like below, I have a USB OTG (Micro-AB) cable is at USB
> > Micro-B port, and there is a USB driver on it, and un-plug this
> > cable can wake up system from the suspend. There is a non-freezable
> > workqueue ci_otg will be scheduled after disconnecting OTG cable,
> > and in its worker ci_otg_work, it will try to disconnect USB drive,
> > and flush disk information.
> 
> These operations probably are not safe while the system is resuming.  
> It might be best to make them wait until the resume is finished.
> 
> > But flush disk information is done by
> > freezable workqueue writeback, it seeems workqueue writeback is
> > never got chance to execute, the workqueue ci_otg is waiting there
> > forever, and the system is deadlock.
> 
> > Both change workqueue ci_otg as freezable or change workqueue writeback
> > as non-freezable can fix this problem.
> 
> It sounds like making ci_otg freezable is the easiest solution.
> 
> > Please ignore it, the system is locked at driver's resume,
> > maybe at scsi or usb driver, so of cos, the freezable processes
> > can't be thawed.
>  
> > > [  555.263177] [<c0043b1c>] (flush_work) from [<c0043fac>] (flush_delayed_work+0x48/0x4c)   
> > > [  555.271106]  r8:ed5b5000 r7:c0b38a3c r6:eea439cc r5:eea4372c r4:eea4372c
> > > [  555.277958] [<c0043f64>] (flush_delayed_work) from [<c00eae18>] (bdi_unregister+0x84/0xec)
> > > [  555.286236]  r4:eea43520 r3:20000153
> > > [  555.289885] [<c00ead94>] (bdi_unregister) from [<c02c2154>] (blk_cleanup_queue+0x180/0x29c)
> > > [  555.298250]  r5:eea43808 r4:eea43400
> 
> You might want to complain to the block-layer people about this.  I 
> don't know if anything can be done to fix it.
> 
> Or maybe flush_work and flush_delayed_work can be changed to avoid 
> blocking if the workqueue is frozen.  Tejun?
> 

I have a patch to show the root cause of this issue.

http://www.spinics.net/lists/linux-usb/msg136815.html

-- 

Best Regards,
Peter Chen

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-02-24  7:24   ` Peter Chen
@ 2016-02-25 22:01     ` Tejun Heo
  2016-02-26  6:19       ` Peter Chen
  0 siblings, 1 reply; 43+ messages in thread
From: Tejun Heo @ 2016-02-25 22:01 UTC (permalink / raw)
  To: Peter Chen; +Cc: Alan Stern, florian, linux-usb, linux-kernel, usb-storage

Hello, Peter.

On Wed, Feb 24, 2016 at 03:24:30PM +0800, Peter Chen wrote:
> > You might want to complain to the block-layer people about this.  I 
> > don't know if anything can be done to fix it.
> > 
> > Or maybe flush_work and flush_delayed_work can be changed to avoid 
> > blocking if the workqueue is frozen.  Tejun?
> > 
> 
> I have a patch to show the root cause of this issue.
> 
> http://www.spinics.net/lists/linux-usb/msg136815.html

I don't get it.  Why would it deadlock?  Shouldn't things get rolling
once the workqueues are thawed?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-02-25 22:01     ` Tejun Heo
@ 2016-02-26  6:19       ` Peter Chen
  2016-03-02 16:00         ` Tejun Heo
  0 siblings, 1 reply; 43+ messages in thread
From: Peter Chen @ 2016-02-26  6:19 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Alan Stern, florian, linux-usb, linux-kernel, usb-storage

On Thu, Feb 25, 2016 at 05:01:12PM -0500, Tejun Heo wrote:
> Hello, Peter.
> 
> On Wed, Feb 24, 2016 at 03:24:30PM +0800, Peter Chen wrote:
> > > You might want to complain to the block-layer people about this.  I 
> > > don't know if anything can be done to fix it.
> > > 
> > > Or maybe flush_work and flush_delayed_work can be changed to avoid 
> > > blocking if the workqueue is frozen.  Tejun?
> > > 
> > 
> > I have a patch to show the root cause of this issue.
> > 
> > http://www.spinics.net/lists/linux-usb/msg136815.html
> 
> I don't get it.  Why would it deadlock?  Shouldn't things get rolling
> once the workqueues are thawed?
> 

Hi Tejun,

The workqueue writeback can't be thawed due to driver's resume
(dpm_complete) is lock nested, and can't be finished.

-- 

Best Regards,
Peter Chen

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-02-26  6:19       ` Peter Chen
@ 2016-03-02 16:00         ` Tejun Heo
  2016-03-03  9:33           ` Jan Kara
  0 siblings, 1 reply; 43+ messages in thread
From: Tejun Heo @ 2016-03-02 16:00 UTC (permalink / raw)
  To: Peter Chen
  Cc: Alan Stern, florian, linux-usb, linux-kernel, usb-storage,
	Jan Kara

Hello,

(cc'ing Jan)

On Fri, Feb 26, 2016 at 02:19:20PM +0800, Peter Chen wrote:
> On Thu, Feb 25, 2016 at 05:01:12PM -0500, Tejun Heo wrote:
> > Hello, Peter.
> > 
> > On Wed, Feb 24, 2016 at 03:24:30PM +0800, Peter Chen wrote:
> > > > You might want to complain to the block-layer people about this.  I 
> > > > don't know if anything can be done to fix it.
> > > > 
> > > > Or maybe flush_work and flush_delayed_work can be changed to avoid 
> > > > blocking if the workqueue is frozen.  Tejun?
> > > > 
> > > 
> > > I have a patch to show the root cause of this issue.
> > > 
> > > http://www.spinics.net/lists/linux-usb/msg136815.html
> > 
> > I don't get it.  Why would it deadlock?  Shouldn't things get rolling
> > once the workqueues are thawed?
> 
> The workqueue writeback can't be thawed due to driver's resume
> (dpm_complete) is lock nested, and can't be finished.

Ugh... that's nasty.  I wonder whether the right thing to do is making
writeback workers non-freezable.  IOs are supposed to be blocked from
lower layer anyway.  Jan, what do you think?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-03-02 16:00         ` Tejun Heo
@ 2016-03-03  9:33           ` Jan Kara
  2016-03-11 17:56             ` Tejun Heo
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Kara @ 2016-03-03  9:33 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Peter Chen, Alan Stern, florian, linux-usb, linux-kernel,
	usb-storage, Jan Kara, jkosina

Hello,

On Wed 02-03-16 11:00:58, Tejun Heo wrote:
> On Fri, Feb 26, 2016 at 02:19:20PM +0800, Peter Chen wrote:
> > On Thu, Feb 25, 2016 at 05:01:12PM -0500, Tejun Heo wrote:
> > > Hello, Peter.
> > > 
> > > On Wed, Feb 24, 2016 at 03:24:30PM +0800, Peter Chen wrote:
> > > > > You might want to complain to the block-layer people about this.  I 
> > > > > don't know if anything can be done to fix it.
> > > > > 
> > > > > Or maybe flush_work and flush_delayed_work can be changed to avoid 
> > > > > blocking if the workqueue is frozen.  Tejun?
> > > > > 
> > > > 
> > > > I have a patch to show the root cause of this issue.
> > > > 
> > > > http://www.spinics.net/lists/linux-usb/msg136815.html
> > > 
> > > I don't get it.  Why would it deadlock?  Shouldn't things get rolling
> > > once the workqueues are thawed?
> > 
> > The workqueue writeback can't be thawed due to driver's resume
> > (dpm_complete) is lock nested, and can't be finished.
> 
> Ugh... that's nasty.  I wonder whether the right thing to do is making
> writeback workers non-freezable.  IOs are supposed to be blocked from
> lower layer anyway.  Jan, what do you think?

Well no, at least currently IO is not blocked in lower layers AFAIK - for
that you'd need to freeze block devices & filesystems and there are issues
with that (Jiri Kosina was the last one which was trying to make this work
AFAIR). And I think you need to stop writeback (and generally any IO) to be
generated so that it doesn't interact in a strange way with device drivers
being frozen. So IMO until suspend freezes filesystems & devices properly
you have to freeze writeback workqueue.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-03-03  9:33           ` Jan Kara
@ 2016-03-11 17:56             ` Tejun Heo
  2016-03-14  7:22               ` Jan Kara
  0 siblings, 1 reply; 43+ messages in thread
From: Tejun Heo @ 2016-03-11 17:56 UTC (permalink / raw)
  To: Jan Kara
  Cc: Peter Chen, Alan Stern, florian, linux-usb, linux-kernel,
	usb-storage, Jan Kara, jkosina

Hello, Jan.

On Thu, Mar 03, 2016 at 10:33:10AM +0100, Jan Kara wrote:
> > Ugh... that's nasty.  I wonder whether the right thing to do is making
> > writeback workers non-freezable.  IOs are supposed to be blocked from
> > lower layer anyway.  Jan, what do you think?
> 
> Well no, at least currently IO is not blocked in lower layers AFAIK - for
> that you'd need to freeze block devices & filesystems and there are issues

At least libata does and I think SCSI does too, but yeah, there
probably are drivers which depend on block layer blocking IOs, which
btw is a pretty fragile way to go about as upper layers might not be
the only source of activities.

> with that (Jiri Kosina was the last one which was trying to make this work
> AFAIR). And I think you need to stop writeback (and generally any IO) to be
> generated so that it doesn't interact in a strange way with device drivers
> being frozen. So IMO until suspend freezes filesystems & devices properly
> you have to freeze writeback workqueue.

I still think the right thing to do is plugging that block layer or
low level drivers.  It's like we're trying to plug multiple sources
when we can plug the point where they come together anyway.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-03-11 17:56             ` Tejun Heo
@ 2016-03-14  7:22               ` Jan Kara
  2016-03-14 14:37                 ` Alan Stern
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Kara @ 2016-03-14  7:22 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Kara, Peter Chen, Alan Stern, florian, linux-usb,
	linux-kernel, usb-storage, Jan Kara, jkosina

On Fri 11-03-16 12:56:10, Tejun Heo wrote:
> Hello, Jan.
> 
> On Thu, Mar 03, 2016 at 10:33:10AM +0100, Jan Kara wrote:
> > > Ugh... that's nasty.  I wonder whether the right thing to do is making
> > > writeback workers non-freezable.  IOs are supposed to be blocked from
> > > lower layer anyway.  Jan, what do you think?
> > 
> > Well no, at least currently IO is not blocked in lower layers AFAIK - for
> > that you'd need to freeze block devices & filesystems and there are issues
> 
> At least libata does and I think SCSI does too, but yeah, there
> probably are drivers which depend on block layer blocking IOs, which
> btw is a pretty fragile way to go about as upper layers might not be
> the only source of activities.
> 
> > with that (Jiri Kosina was the last one which was trying to make this work
> > AFAIR). And I think you need to stop writeback (and generally any IO) to be
> > generated so that it doesn't interact in a strange way with device drivers
> > being frozen. So IMO until suspend freezes filesystems & devices properly
> > you have to freeze writeback workqueue.
> 
> I still think the right thing to do is plugging that block layer or
> low level drivers.  It's like we're trying to plug multiple sources
> when we can plug the point where they come together anyway.

I agree that freezing writeback workers is a workaround for real issues at
best and ideally we shouldn't have to do that. But at least for now I had
the impression that it is needed for suspend to work reasonably reliably.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-03-14  7:22               ` Jan Kara
@ 2016-03-14 14:37                 ` Alan Stern
  2016-03-15  9:25                   ` Jan Kara
  0 siblings, 1 reply; 43+ messages in thread
From: Alan Stern @ 2016-03-14 14:37 UTC (permalink / raw)
  To: Jan Kara
  Cc: Tejun Heo, Peter Chen, florian, linux-usb, linux-kernel,
	usb-storage, Jan Kara, jkosina

On Mon, 14 Mar 2016, Jan Kara wrote:

> On Fri 11-03-16 12:56:10, Tejun Heo wrote:
> > Hello, Jan.
> > 
> > On Thu, Mar 03, 2016 at 10:33:10AM +0100, Jan Kara wrote:
> > > > Ugh... that's nasty.  I wonder whether the right thing to do is making
> > > > writeback workers non-freezable.  IOs are supposed to be blocked from
> > > > lower layer anyway.  Jan, what do you think?
> > > 
> > > Well no, at least currently IO is not blocked in lower layers AFAIK - for
> > > that you'd need to freeze block devices & filesystems and there are issues
> > 
> > At least libata does and I think SCSI does too, but yeah, there
> > probably are drivers which depend on block layer blocking IOs, which
> > btw is a pretty fragile way to go about as upper layers might not be
> > the only source of activities.
> > 
> > > with that (Jiri Kosina was the last one which was trying to make this work
> > > AFAIR). And I think you need to stop writeback (and generally any IO) to be
> > > generated so that it doesn't interact in a strange way with device drivers
> > > being frozen. So IMO until suspend freezes filesystems & devices properly
> > > you have to freeze writeback workqueue.

What do you mean by "freezes ... devices"?  Only a piece of code can be 
frozen -- not a device.

The kernel does suspend device drivers; that is, it invokes their
suspend callbacks.  But it doesn't "freeze" them in any sense.  Once a 
driver has been suspended, it assumes it won't receive any I/O requests 
until it has been resumed.  Therefore the kernel first has to prevent 
all the upper layers from generating such requests and/or sending them 
to the low-level drivers.

> > I still think the right thing to do is plugging that block layer or
> > low level drivers.  It's like we're trying to plug multiple sources
> > when we can plug the point where they come together anyway.
> 
> I agree that freezing writeback workers is a workaround for real issues at
> best and ideally we shouldn't have to do that. But at least for now I had
> the impression that it is needed for suspend to work reasonably reliably.

The design is not to plug low-level drivers, but instead to prevent
them from receiving any requests by plugging or freezing high-level
code.

It's pretty clear that we don't want to have ongoing I/O during a 
system suspend, right?  And that means the I/O has to be prevented (or 
"plugged", if you prefer) somewhere -- either at an upper layer or at a 
lower layer.  There was a choice to be made, and the decision was to do 
it at an upper layer.

Alan Stern

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-03-14 14:37                 ` Alan Stern
@ 2016-03-15  9:25                   ` Jan Kara
  2016-03-16 15:00                     ` Tejun Heo
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Kara @ 2016-03-15  9:25 UTC (permalink / raw)
  To: Alan Stern
  Cc: Jan Kara, Tejun Heo, Peter Chen, florian, linux-usb, linux-kernel,
	usb-storage, Jan Kara, jkosina

On Mon 14-03-16 10:37:22, Alan Stern wrote:
> On Mon, 14 Mar 2016, Jan Kara wrote:
> 
> > On Fri 11-03-16 12:56:10, Tejun Heo wrote:
> > > Hello, Jan.
> > > 
> > > On Thu, Mar 03, 2016 at 10:33:10AM +0100, Jan Kara wrote:
> > > > > Ugh... that's nasty.  I wonder whether the right thing to do is making
> > > > > writeback workers non-freezable.  IOs are supposed to be blocked from
> > > > > lower layer anyway.  Jan, what do you think?
> > > > 
> > > > Well no, at least currently IO is not blocked in lower layers AFAIK - for
> > > > that you'd need to freeze block devices & filesystems and there are issues
> > > 
> > > At least libata does and I think SCSI does too, but yeah, there
> > > probably are drivers which depend on block layer blocking IOs, which
> > > btw is a pretty fragile way to go about as upper layers might not be
> > > the only source of activities.
> > > 
> > > > with that (Jiri Kosina was the last one which was trying to make this work
> > > > AFAIR). And I think you need to stop writeback (and generally any IO) to be
> > > > generated so that it doesn't interact in a strange way with device drivers
> > > > being frozen. So IMO until suspend freezes filesystems & devices properly
> > > > you have to freeze writeback workqueue.
> 
> What do you mean by "freezes ... devices"?  Only a piece of code can be 
> frozen -- not a device.

By that I meant block device and filesystem freezing. That way filesystem
is frozen so that it doesn't submit any more IO to the device.

> The kernel does suspend device drivers; that is, it invokes their
> suspend callbacks.  But it doesn't "freeze" them in any sense.  Once a 
> driver has been suspended, it assumes it won't receive any I/O requests 
> until it has been resumed.  Therefore the kernel first has to prevent 
> all the upper layers from generating such requests and/or sending them 
> to the low-level drivers.

OK, so Tejun and you should talk together because you both seem to want
something else... If I understand it right, Tejun wants suspended devices
to just queue requests that have been submitted after these devices were
suspended and complete them once they are resumed...

> > > I still think the right thing to do is plugging that block layer or
> > > low level drivers.  It's like we're trying to plug multiple sources
> > > when we can plug the point where they come together anyway.
> > 
> > I agree that freezing writeback workers is a workaround for real issues at
> > best and ideally we shouldn't have to do that. But at least for now I had
> > the impression that it is needed for suspend to work reasonably reliably.
> 
> The design is not to plug low-level drivers, but instead to prevent
> them from receiving any requests by plugging or freezing high-level
> code.
> 
> It's pretty clear that we don't want to have ongoing I/O during a 
> system suspend, right?  And that means the I/O has to be prevented (or 
> "plugged", if you prefer) somewhere -- either at an upper layer or at a 
> lower layer.  There was a choice to be made, and the decision was to do 
> it at an upper layer.

I agree the IO has to be plugged somewhere. And Tejun seems to want to plug
it at lower layer...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
  2016-03-15  9:25                   ` Jan Kara
@ 2016-03-16 15:00                     ` Tejun Heo
  2016-03-16 15:37                       ` Should suspend plug low-level devices? Alan Stern
  0 siblings, 1 reply; 43+ messages in thread
From: Tejun Heo @ 2016-03-16 15:00 UTC (permalink / raw)
  To: Jan Kara
  Cc: Alan Stern, Peter Chen, florian, linux-usb, linux-kernel,
	usb-storage, Jan Kara, jkosina

Hello, Jan, Alan.

On Tue, Mar 15, 2016 at 10:25:43AM +0100, Jan Kara wrote:
> > The kernel does suspend device drivers; that is, it invokes their
> > suspend callbacks.  But it doesn't "freeze" them in any sense.  Once a 
> > driver has been suspended, it assumes it won't receive any I/O requests 
> > until it has been resumed.  Therefore the kernel first has to prevent 
> > all the upper layers from generating such requests and/or sending them 
> > to the low-level drivers.
> 
> OK, so Tejun and you should talk together because you both seem to want
> something else... If I understand it right, Tejun wants suspended devices
> to just queue requests that have been submitted after these devices were
> suspended and complete them once they are resumed...

Yeah, I suppose that's why we have the code base we do now.  I don't
think freezing kernel threads is the right mechanism to plug IO
devices during suspend.  It's way too error-prone and causes a
dependency nightmare as it acts essentially as a system-wide lock.

More complex drivers already plug themselves which are necessary no
matter what as upper layers or some kthreads aren't the only sources
of commands to devices.  We can plug at block layer for IOs coming
down from higher layers.  We can even provide a mechanism to plug
certain kthreads if necessary but they should be contained in the
driver - e.g. the suspend callback specifically blocking certain
specific kthreads - instead of the vague "the system is generally
stopped now and it seems to work most of the time" that we're doing
now.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Should suspend plug low-level devices?
  2016-03-16 15:00                     ` Tejun Heo
@ 2016-03-16 15:37                       ` Alan Stern
  2016-03-16 16:31                         ` Rafael J. Wysocki
  2016-03-16 16:34                         ` Tejun Heo
  0 siblings, 2 replies; 43+ messages in thread
From: Alan Stern @ 2016-03-16 15:37 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jan Kara, Peter Chen, florian, jkosina, Linux-pm mailing list

[Removed linux-usb and usb-storage from CC: and added linux-pm; changed 
$SUBJECT]

On Wed, 16 Mar 2016, Tejun Heo wrote:

> Hello, Jan, Alan.
> 
> On Tue, Mar 15, 2016 at 10:25:43AM +0100, Jan Kara wrote:
> > > The kernel does suspend device drivers; that is, it invokes their
> > > suspend callbacks.  But it doesn't "freeze" them in any sense.  Once a 
> > > driver has been suspended, it assumes it won't receive any I/O requests 
> > > until it has been resumed.  Therefore the kernel first has to prevent 
> > > all the upper layers from generating such requests and/or sending them 
> > > to the low-level drivers.
> > 
> > OK, so Tejun and you should talk together because you both seem to want
> > something else... If I understand it right, Tejun wants suspended devices
> > to just queue requests that have been submitted after these devices were
> > suspended and complete them once they are resumed...
> 
> Yeah, I suppose that's why we have the code base we do now.  I don't
> think freezing kernel threads is the right mechanism to plug IO
> devices during suspend.  It's way too error-prone and causes a
> dependency nightmare as it acts essentially as a system-wide lock.
> 
> More complex drivers already plug themselves which are necessary no
> matter what as upper layers or some kthreads aren't the only sources
> of commands to devices.  We can plug at block layer for IOs coming
> down from higher layers.  We can even provide a mechanism to plug
> certain kthreads if necessary but they should be contained in the
> driver - e.g. the suspend callback specifically blocking certain
> specific kthreads - instead of the vague "the system is generally
> stopped now and it seems to work most of the time" that we're doing
> now.

Doing what you suggest would require a tremendous effort.  There are
lots and lots of drivers that have no provisions for plugging; they
would all need to be modified.  Changing the higher layers instead (to
prevent them from generating I/O requests during suspend) seems much
easier.

The original design of the PM subsystem was intended to make life
easier for drivers by minimizing the amount of work needed to support
suspend/resume.  Maybe the decision to do it that way was wrong, but
it's kind of late to change the design now.

I foresee other kinds of problems, too.  Suppose after a suspend 
starts, some higher layer code creates an I/O request with a short 
timeout.  If the device weren't suspended the request would succeed -- 
should it time out and fail now that the system is going to sleep?  
That would make system sleep transitions non-transparent.

Alan Stern


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 15:37                       ` Should suspend plug low-level devices? Alan Stern
@ 2016-03-16 16:31                         ` Rafael J. Wysocki
  2016-03-16 16:35                           ` Tejun Heo
  2016-03-16 16:34                         ` Tejun Heo
  1 sibling, 1 reply; 43+ messages in thread
From: Rafael J. Wysocki @ 2016-03-16 16:31 UTC (permalink / raw)
  To: Alan Stern
  Cc: Tejun Heo, Jan Kara, Peter Chen, florian, jkosina,
	Linux-pm mailing list

On Wed, Mar 16, 2016 at 4:37 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> [Removed linux-usb and usb-storage from CC: and added linux-pm; changed
> $SUBJECT]
>
> On Wed, 16 Mar 2016, Tejun Heo wrote:
>
>> Hello, Jan, Alan.
>>
>> On Tue, Mar 15, 2016 at 10:25:43AM +0100, Jan Kara wrote:
>> > > The kernel does suspend device drivers; that is, it invokes their
>> > > suspend callbacks.  But it doesn't "freeze" them in any sense.  Once a
>> > > driver has been suspended, it assumes it won't receive any I/O requests
>> > > until it has been resumed.  Therefore the kernel first has to prevent
>> > > all the upper layers from generating such requests and/or sending them
>> > > to the low-level drivers.
>> >
>> > OK, so Tejun and you should talk together because you both seem to want
>> > something else... If I understand it right, Tejun wants suspended devices
>> > to just queue requests that have been submitted after these devices were
>> > suspended and complete them once they are resumed...
>>
>> Yeah, I suppose that's why we have the code base we do now.  I don't
>> think freezing kernel threads is the right mechanism to plug IO
>> devices during suspend.  It's way too error-prone and causes a
>> dependency nightmare as it acts essentially as a system-wide lock.
>>
>> More complex drivers already plug themselves which are necessary no
>> matter what as upper layers or some kthreads aren't the only sources
>> of commands to devices.  We can plug at block layer for IOs coming
>> down from higher layers.  We can even provide a mechanism to plug
>> certain kthreads if necessary but they should be contained in the
>> driver - e.g. the suspend callback specifically blocking certain
>> specific kthreads - instead of the vague "the system is generally
>> stopped now and it seems to work most of the time" that we're doing
>> now.
>
> Doing what you suggest would require a tremendous effort.  There are
> lots and lots of drivers that have no provisions for plugging; they
> would all need to be modified.  Changing the higher layers instead (to
> prevent them from generating I/O requests during suspend) seems much
> easier.
>
> The original design of the PM subsystem was intended to make life
> easier for drivers by minimizing the amount of work needed to support
> suspend/resume.  Maybe the decision to do it that way was wrong, but
> it's kind of late to change the design now.
>
> I foresee other kinds of problems, too.  Suppose after a suspend
> starts, some higher layer code creates an I/O request with a short
> timeout.  If the device weren't suspended the request would succeed --
> should it time out and fail now that the system is going to sleep?
> That would make system sleep transitions non-transparent.

Right.

Essentially, every device interface that might be accessed by user
space would have to be intercepted somehow.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 15:37                       ` Should suspend plug low-level devices? Alan Stern
  2016-03-16 16:31                         ` Rafael J. Wysocki
@ 2016-03-16 16:34                         ` Tejun Heo
  2016-03-17 18:51                           ` Alan Stern
  1 sibling, 1 reply; 43+ messages in thread
From: Tejun Heo @ 2016-03-16 16:34 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jan Kara, Peter Chen, florian, jkosina, Linux-pm mailing list

Hello, Alan.

On Wed, Mar 16, 2016 at 11:37:35AM -0400, Alan Stern wrote:
> Doing what you suggest would require a tremendous effort.  There are

I don't know.  Is it?

> lots and lots of drivers that have no provisions for plugging; they
> would all need to be modified.  Changing the higher layers instead (to
> prevent them from generating I/O requests during suspend) seems much
> easier.

Big lock and "much easier" usually don't go together.  I get why we
started out this way - there wasn't anything working and we didn't
quite understand what the actual requirements were and wanted
something more or less working with manageable amount of effort, but I
think we're past the point where we need to address the issue
properly.  The more complex drivers are already doing most of what's
necessary and even for drivers which depend on the freezer the changes
don't need to be too difficult.

Stopping kthreads in itself is fine but the problem is that we
currently don't know who's stopping what.  Just making each freezing
explicit, say, an explicit invocation of kthread_freeze_kthread(),
wouldn't be that difficult and would go a long way.  e.g. It would
immediately solve the problem arising from kthread freezing order not
matching device dependency hierarchy.

> The original design of the PM subsystem was intended to make life
> easier for drivers by minimizing the amount of work needed to support
> suspend/resume.  Maybe the decision to do it that way was wrong, but
> it's kind of late to change the design now.

Hmmm... but that's how most subsystems evolve.  We start out with
something coarse and not quite optimal or even correct and then as we
learn more, the code and design evolve.  I frankly don't think
shifting to driver-side plugging would be *that* difficult.  It's
gonna be quite a bit of work, for sure, but things can be gradual.
For example,

* Make block layer plug bio's from upper layers.

* Visit each freezable workqueue or kthread users and convert them to
  plug explicitly.

> I foresee other kinds of problems, too.  Suppose after a suspend 
> starts, some higher layer code creates an I/O request with a short 
> timeout.  If the device weren't suspended the request would succeed -- 
> should it time out and fail now that the system is going to sleep?  
> That would make system sleep transitions non-transparent.

I can't see how freezing or not freezing would make any difference
there.  What if a kthread gets frozen with with IOs in flight?  The
timeouts would still trigger whether the issuer is frozen or not.  IO
timeouts are mostly implemented in block layer and block layer just
needs to handle it sensibly.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 16:31                         ` Rafael J. Wysocki
@ 2016-03-16 16:35                           ` Tejun Heo
  2016-03-16 16:39                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 43+ messages in thread
From: Tejun Heo @ 2016-03-16 16:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Jan Kara, Peter Chen, florian, jkosina,
	Linux-pm mailing list

Hello, Rafael.

On Wed, Mar 16, 2016 at 05:31:23PM +0100, Rafael J. Wysocki wrote:
> Essentially, every device interface that might be accessed by user
> space would have to be intercepted somehow.

That's what we're doing now.  What I'm suggesting is that we should be
plugging the point where they come together instead of the sources of
which there can be many.  Freezing does block a lot of those sources
but not all.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 16:35                           ` Tejun Heo
@ 2016-03-16 16:39                             ` Rafael J. Wysocki
  2016-03-16 23:18                               ` Jiri Kosina
  0 siblings, 1 reply; 43+ messages in thread
From: Rafael J. Wysocki @ 2016-03-16 16:39 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Alan Stern, Jan Kara, Peter Chen, florian,
	jkosina, Linux-pm mailing list

On Wed, Mar 16, 2016 at 5:35 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Rafael.
>
> On Wed, Mar 16, 2016 at 05:31:23PM +0100, Rafael J. Wysocki wrote:
>> Essentially, every device interface that might be accessed by user
>> space would have to be intercepted somehow.
>
> That's what we're doing now.  What I'm suggesting is that we should be
> plugging the point where they come together instead of the sources of
> which there can be many.  Freezing does block a lot of those sources
> but not all.

Well, is there any point where they come together?  Say somebody
communicates with user space over mmaped address range and writes to
that go to hardware.  What do we do then?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 16:39                             ` Rafael J. Wysocki
@ 2016-03-16 23:18                               ` Jiri Kosina
  2016-03-16 23:24                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 43+ messages in thread
From: Jiri Kosina @ 2016-03-16 23:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Alan Stern, Jan Kara, Peter Chen, florian,
	Linux-pm mailing list

On Wed, 16 Mar 2016, Rafael J. Wysocki wrote:

> > That's what we're doing now.  What I'm suggesting is that we should be
> > plugging the point where they come together instead of the sources of
> > which there can be many.  Freezing does block a lot of those sources
> > but not all.
> 
> Well, is there any point where they come together?  Say somebody
> communicates with user space over mmaped address range and writes to
> that go to hardware.  What do we do then?

There are no more writes happening to that region after userspace has been 
frozen. And we're definitely not getting rid of userspace freezing.

Or have I missed the point of your question?

Thanks,

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 23:18                               ` Jiri Kosina
@ 2016-03-16 23:24                                 ` Rafael J. Wysocki
  2016-03-16 23:42                                   ` Jiri Kosina
                                                     ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Rafael J. Wysocki @ 2016-03-16 23:24 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Rafael J. Wysocki, Tejun Heo, Alan Stern, Jan Kara, Peter Chen,
	Florian Mickler, Linux-pm mailing list

On Thu, Mar 17, 2016 at 12:18 AM, Jiri Kosina <jikos@kernel.org> wrote:
> On Wed, 16 Mar 2016, Rafael J. Wysocki wrote:
>
>> > That's what we're doing now.  What I'm suggesting is that we should be
>> > plugging the point where they come together instead of the sources of
>> > which there can be many.  Freezing does block a lot of those sources
>> > but not all.
>>
>> Well, is there any point where they come together?  Say somebody
>> communicates with user space over mmaped address range and writes to
>> that go to hardware.  What do we do then?
>
> There are no more writes happening to that region after userspace has been
> frozen. And we're definitely not getting rid of userspace freezing.
>
> Or have I missed the point of your question?

I thought Tejun was talking about getting rid of userspace freezing.
Sorry if that was not the case.

As I've already said for multiple times, I think that the freezing
should really be limited to user space as a rule.  And in particular,
if freezing of any kernel threads or workqueues etc causes problems to
happen, the code using those freezable things is simply buggy and
needs to be fixed.  Most likely by getting rid of the freezing from
it.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 23:24                                 ` Rafael J. Wysocki
@ 2016-03-16 23:42                                   ` Jiri Kosina
  2016-03-16 23:48                                     ` Rafael J. Wysocki
                                                       ` (2 more replies)
  2016-03-17  0:05                                   ` Tejun Heo
  2016-03-17 18:53                                   ` Alan Stern
  2 siblings, 3 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-03-16 23:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Alan Stern, Jan Kara, Peter Chen, Florian Mickler,
	Linux-pm mailing list

On Thu, 17 Mar 2016, Rafael J. Wysocki wrote:

> >> Well, is there any point where they come together?  Say somebody
> >> communicates with user space over mmaped address range and writes to
> >> that go to hardware.  What do we do then?
> >
> > There are no more writes happening to that region after userspace has been
> > frozen. And we're definitely not getting rid of userspace freezing.
> >
> > Or have I missed the point of your question?
> 
> I thought Tejun was talking about getting rid of userspace freezing.
> Sorry if that was not the case.

Oh, I definitely belive that the current "holy grail" is to get rid of 
ktrehad / kernel code freezing. Userspace is a completely different story, 
I agree.

> As I've already said for multiple times, I think that the freezing 
> should really be limited to user space as a rule.  And in particular, if 
> freezing of any kernel threads or workqueues etc causes problems to 
> happen, the code using those freezable things is simply buggy and needs 
> to be fixed.  Most likely by getting rid of the freezing from it.

I am (slowly) working towards this goal. First I am trying to make sure 
that all the kernel is using the kthread freezing API in a well-defined 
way (which is currently not the case at all).

Before this is done, it is more or less impossible to analyse the current 
users and make any decisions on top of that.

Once some kind of analysis is possible, I am pretty sure that we'll be 
able to get rid of most kthread freezing by introducing fs freezing during 
suspend (and fixing a lot of currently existing bugs as a nice 
side-effect), and perhaps eventually ditch the kthread freezer altogether.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 23:42                                   ` Jiri Kosina
@ 2016-03-16 23:48                                     ` Rafael J. Wysocki
  2016-03-17  7:50                                     ` Oliver Neukum
  2016-03-17 18:55                                     ` Alan Stern
  2 siblings, 0 replies; 43+ messages in thread
From: Rafael J. Wysocki @ 2016-03-16 23:48 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Rafael J. Wysocki, Tejun Heo, Alan Stern, Jan Kara, Peter Chen,
	Florian Mickler, Linux-pm mailing list

On Thu, Mar 17, 2016 at 12:42 AM, Jiri Kosina <jikos@kernel.org> wrote:
> On Thu, 17 Mar 2016, Rafael J. Wysocki wrote:
>
>> >> Well, is there any point where they come together?  Say somebody
>> >> communicates with user space over mmaped address range and writes to
>> >> that go to hardware.  What do we do then?
>> >
>> > There are no more writes happening to that region after userspace has been
>> > frozen. And we're definitely not getting rid of userspace freezing.
>> >
>> > Or have I missed the point of your question?
>>
>> I thought Tejun was talking about getting rid of userspace freezing.
>> Sorry if that was not the case.
>
> Oh, I definitely belive that the current "holy grail" is to get rid of
> ktrehad / kernel code freezing. Userspace is a completely different story,
> I agree.
>
>> As I've already said for multiple times, I think that the freezing
>> should really be limited to user space as a rule.  And in particular, if
>> freezing of any kernel threads or workqueues etc causes problems to
>> happen, the code using those freezable things is simply buggy and needs
>> to be fixed.  Most likely by getting rid of the freezing from it.
>
> I am (slowly) working towards this goal. First I am trying to make sure
> that all the kernel is using the kthread freezing API in a well-defined
> way (which is currently not the case at all).
>
> Before this is done, it is more or less impossible to analyse the current
> users and make any decisions on top of that.
>
> Once some kind of analysis is possible, I am pretty sure that we'll be
> able to get rid of most kthread freezing by introducing fs freezing during
> suspend (and fixing a lot of currently existing bugs as a nice
> side-effect), and perhaps eventually ditch the kthread freezer altogether.

Sounds good! :-)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 23:24                                 ` Rafael J. Wysocki
  2016-03-16 23:42                                   ` Jiri Kosina
@ 2016-03-17  0:05                                   ` Tejun Heo
  2016-03-17 18:58                                     ` Alan Stern
  2016-03-17 18:53                                   ` Alan Stern
  2 siblings, 1 reply; 43+ messages in thread
From: Tejun Heo @ 2016-03-17  0:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jiri Kosina, Alan Stern, Jan Kara, Peter Chen, Florian Mickler,
	Linux-pm mailing list

Hello, Rafael, Jiri.

On Thu, Mar 17, 2016 at 12:24:57AM +0100, Rafael J. Wysocki wrote:
> > There are no more writes happening to that region after userspace has been
> > frozen. And we're definitely not getting rid of userspace freezing.
> >
> > Or have I missed the point of your question?
> 
> I thought Tejun was talking about getting rid of userspace freezing.
> Sorry if that was not the case.

Oh, kthreads, definitely.  Userland freezing doesn't cause any
dependency issues in the kernel and I don't see any reason for
changing userland part.

> As I've already said for multiple times, I think that the freezing
> should really be limited to user space as a rule.  And in particular,
> if freezing of any kernel threads or workqueues etc causes problems to
> happen, the code using those freezable things is simply buggy and
> needs to be fixed.  Most likely by getting rid of the freezing from
> it.

We're in full agreement.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 23:42                                   ` Jiri Kosina
  2016-03-16 23:48                                     ` Rafael J. Wysocki
@ 2016-03-17  7:50                                     ` Oliver Neukum
  2016-03-17  8:02                                       ` Jiri Kosina
  2016-03-17 18:55                                     ` Alan Stern
  2 siblings, 1 reply; 43+ messages in thread
From: Oliver Neukum @ 2016-03-17  7:50 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Rafael J. Wysocki, Tejun Heo, Alan Stern, Jan Kara, Peter Chen,
	Florian Mickler, Linux-pm mailing list

On Thu, 2016-03-17 at 00:42 +0100, Jiri Kosina wrote:
> I am (slowly) working towards this goal. First I am trying to make
> sure 
> that all the kernel is using the kthread freezing API in a
> well-defined 
> way (which is currently not the case at all).
> 
> Before this is done, it is more or less impossible to analyse the
> current 
> users and make any decisions on top of that.
> 
> Once some kind of analysis is possible, I am pretty sure that we'll
> be 
> able to get rid of most kthread freezing by introducing fs freezing
> during 
> suspend (and fixing a lot of currently existing bugs as a nice 
> side-effect), and perhaps eventually ditch the kthread freezer
> altogether.

But what about character devices?

	Regards
		Oliver



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17  7:50                                     ` Oliver Neukum
@ 2016-03-17  8:02                                       ` Jiri Kosina
  2016-03-17  8:20                                         ` Oliver Neukum
  0 siblings, 1 reply; 43+ messages in thread
From: Jiri Kosina @ 2016-03-17  8:02 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Rafael J. Wysocki, Tejun Heo, Alan Stern, Jan Kara, Peter Chen,
	Florian Mickler, Linux-pm mailing list

On Thu, 17 Mar 2016, Oliver Neukum wrote:

> But what about character devices?

What makes the special in this context?

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17  8:02                                       ` Jiri Kosina
@ 2016-03-17  8:20                                         ` Oliver Neukum
  2016-03-17  8:35                                           ` Jiri Kosina
  0 siblings, 1 reply; 43+ messages in thread
From: Oliver Neukum @ 2016-03-17  8:20 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Peter Chen, Rafael J. Wysocki, Tejun Heo, Florian Mickler,
	Alan Stern, Jan Kara, Linux-pm mailing list

On Thu, 2016-03-17 at 09:02 +0100, Jiri Kosina wrote:
> On Thu, 17 Mar 2016, Oliver Neukum wrote:
> 
> > But what about character devices?
> 
> What makes the special in this context?

Actually nothing, but they don't dpend on file systems.
So I see a basic problem, how does a device driver
know which kernel threads cause IO? Are you assuming
that only those a driver starts itself are important?

	Regards
		Oliver




^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17  8:20                                         ` Oliver Neukum
@ 2016-03-17  8:35                                           ` Jiri Kosina
  2016-03-17  9:12                                             ` Oliver Neukum
  0 siblings, 1 reply; 43+ messages in thread
From: Jiri Kosina @ 2016-03-17  8:35 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Peter Chen, Rafael J. Wysocki, Tejun Heo, Florian Mickler,
	Alan Stern, Jan Kara, Linux-pm mailing list

On Thu, 17 Mar 2016, Oliver Neukum wrote:

> > What makes the special in this context?
> 
> Actually nothing, but they don't dpend on file systems. 

Sure. And userspace is frozen, so nothing new is happening with them from 
the process side.

> So I see a basic problem, how does a device driver know which kernel 
> threads cause IO? Are you assuming that only those a driver starts 
> itself are important?

Every driver should know what the kthreads it is spawning are doing, and 
whether they should be handled in a special way during suspend (in most 
cases, they don't, I believe).

Most kthreads don't generate any IO by themselves; usually they are 
actualy I/O helpers, which are threads you in fact want to be running 
during suspend.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17  8:35                                           ` Jiri Kosina
@ 2016-03-17  9:12                                             ` Oliver Neukum
  2016-03-17 10:20                                               ` Jiri Kosina
  0 siblings, 1 reply; 43+ messages in thread
From: Oliver Neukum @ 2016-03-17  9:12 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Peter Chen, Rafael J. Wysocki, Tejun Heo, Florian Mickler,
	Alan Stern, Jan Kara, Linux-pm mailing list

On Thu, 2016-03-17 at 09:35 +0100, Jiri Kosina wrote:
> On Thu, 17 Mar 2016, Oliver Neukum wrote:
> 
> > > What makes the special in this context?
> > 
> > Actually nothing, but they don't dpend on file systems. 
> 
> Sure. And userspace is frozen, so nothing new is happening with them from 
> the process side.

Well, to the same extent as to block devices.

> > So I see a basic problem, how does a device driver know which kernel 
> > threads cause IO? Are you assuming that only those a driver starts 
> > itself are important?
> 
> Every driver should know what the kthreads it is spawning are doing, and 
> whether they should be handled in a special way during suspend (in most 
> cases, they don't, I believe).
> 
> Most kthreads don't generate any IO by themselves; usually they are 
> actualy I/O helpers, which are threads you in fact want to be running 
> during suspend.

Yes, but you are making the assumption that only the kthreads a driver
started are causing IO to it. That is a tall one.

- logging
- network
- device discovery on busses
- detection of media
- knfsd

All these things cause IO and I probably forgot some.
Defined semantics for the threads a driver starts are a good idea,
but it leaves out the really hard cases.

	Regards
		Oliver




^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17  9:12                                             ` Oliver Neukum
@ 2016-03-17 10:20                                               ` Jiri Kosina
  2016-03-17 10:29                                                 ` Oliver Neukum
  0 siblings, 1 reply; 43+ messages in thread
From: Jiri Kosina @ 2016-03-17 10:20 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Peter Chen, Rafael J. Wysocki, Tejun Heo, Florian Mickler,
	Alan Stern, Jan Kara, Linux-pm mailing list

On Thu, 17 Mar 2016, Oliver Neukum wrote:

> Yes, but you are making the assumption that only the kthreads a driver
> started are causing IO to it. That is a tall one.
> 
> - logging
> - network
> - device discovery on busses
> - detection of media
> - knfsd
> 
> All these things cause IO and I probably forgot some. Defined semantics 
> for the threads a driver starts are a good idea, but it leaves out the 
> really hard cases.

So you named only one specific kthread here (knfsd), so let's focus on 
that one. Where is that one currently being frozen during suspend?

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17 10:20                                               ` Jiri Kosina
@ 2016-03-17 10:29                                                 ` Oliver Neukum
  2016-03-17 10:36                                                   ` Jiri Kosina
  0 siblings, 1 reply; 43+ messages in thread
From: Oliver Neukum @ 2016-03-17 10:29 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Peter Chen, Rafael J. Wysocki, Tejun Heo, Florian Mickler,
	Alan Stern, Jan Kara, Linux-pm mailing list

On Thu, 2016-03-17 at 11:20 +0100, Jiri Kosina wrote:
> On Thu, 17 Mar 2016, Oliver Neukum wrote:
> 
> > Yes, but you are making the assumption that only the kthreads a driver
> > started are causing IO to it. That is a tall one.
> > 
> > - logging
> > - network
> > - device discovery on busses
> > - detection of media
> > - knfsd
> > 
> > All these things cause IO and I probably forgot some. Defined semantics 
> > for the threads a driver starts are a good idea, but it leaves out the 
> > really hard cases.
> 
> So you named only one specific kthread here (knfsd), so let's focus on 
> that one. Where is that one currently being frozen during suspend?

In svc_recv()

That concentration, however, is problematic. We can target anything
specific, but the point is, do we know what is running?

	Regards
		Oliver




^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17 10:29                                                 ` Oliver Neukum
@ 2016-03-17 10:36                                                   ` Jiri Kosina
  0 siblings, 0 replies; 43+ messages in thread
From: Jiri Kosina @ 2016-03-17 10:36 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Peter Chen, Rafael J. Wysocki, Tejun Heo, Florian Mickler,
	Alan Stern, Jan Kara, Linux-pm mailing list

On Thu, 17 Mar 2016, Oliver Neukum wrote:

> > > - logging
> > > - network
> > > - device discovery on busses
> > > - detection of media
> > > - knfsd
> > > 
> > > All these things cause IO and I probably forgot some. Defined semantics 
> > > for the threads a driver starts are a good idea, but it leaves out the 
> > > really hard cases.
> > 
> > So you named only one specific kthread here (knfsd), so let's focus on 
> > that one. Where is that one currently being frozen during suspend?
> 
> In svc_recv()
> 
> That concentration, however, is problematic. We can target anything
> specific, but the point is, do we know what is running?

The important question here is (and I don't have answer to that yet, 
unfortunately), whether filesystem freezing works properly on NFS.

Properly working filesystem freezing is of course hard pre-requisity.

Thanks,

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 16:34                         ` Tejun Heo
@ 2016-03-17 18:51                           ` Alan Stern
  2016-03-18 20:32                             ` Tejun Heo
  0 siblings, 1 reply; 43+ messages in thread
From: Alan Stern @ 2016-03-17 18:51 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jan Kara, Peter Chen, florian, jkosina, Linux-pm mailing list

On Wed, 16 Mar 2016, Tejun Heo wrote:

> Hello, Alan.
> 
> On Wed, Mar 16, 2016 at 11:37:35AM -0400, Alan Stern wrote:
> > Doing what you suggest would require a tremendous effort.  There are
> 
> I don't know.  Is it?
> 
> > lots and lots of drivers that have no provisions for plugging; they
> > would all need to be modified.  Changing the higher layers instead (to
> > prevent them from generating I/O requests during suspend) seems much
> > easier.
> 
> Big lock and "much easier" usually don't go together.  I get why we
> started out this way - there wasn't anything working and we didn't
> quite understand what the actual requirements were and wanted
> something more or less working with manageable amount of effort, but I
> think we're past the point where we need to address the issue
> properly.  The more complex drivers are already doing most of what's
> necessary and even for drivers which depend on the freezer the changes
> don't need to be too difficult.
> 
> Stopping kthreads in itself is fine but the problem is that we
> currently don't know who's stopping what.  Just making each freezing
> explicit, say, an explicit invocation of kthread_freeze_kthread(),
> wouldn't be that difficult and would go a long way.  e.g. It would
> immediately solve the problem arising from kthread freezing order not
> matching device dependency hierarchy.

We're talking about workqueues in particular.  Freezing order is not 
the issue.

The problem that started this discussion was that a block device can be
hot-unplugged while the system is asleep.  The device driver's resume
routine realizes the device is gone and unregisters it.  As part of the
unregistration, the block layer (bdi_unregister) does a
flush_delayed_work.

I'm not sure exactly what work it's trying to flush (the code is
difficult to follow), but evidently it can't go forward while the
system is still resuming.  Maybe the workqueue thread is frozen.

> > The original design of the PM subsystem was intended to make life
> > easier for drivers by minimizing the amount of work needed to support
> > suspend/resume.  Maybe the decision to do it that way was wrong, but
> > it's kind of late to change the design now.
> 
> Hmmm... but that's how most subsystems evolve.  We start out with
> something coarse and not quite optimal or even correct and then as we
> learn more, the code and design evolve.  I frankly don't think
> shifting to driver-side plugging would be *that* difficult.  It's
> gonna be quite a bit of work, for sure, but things can be gradual.
> For example,
> 
> * Make block layer plug bio's from upper layers.
> 
> * Visit each freezable workqueue or kthread users and convert them to
>   plug explicitly.

How would this help the case in question?  If a workqueue is plugged 
instead of frozen, won't flush_delayed_work still deadlock?

> > I foresee other kinds of problems, too.  Suppose after a suspend 
> > starts, some higher layer code creates an I/O request with a short 
> > timeout.  If the device weren't suspended the request would succeed -- 
> > should it time out and fail now that the system is going to sleep?  
> > That would make system sleep transitions non-transparent.
> 
> I can't see how freezing or not freezing would make any difference
> there.  What if a kthread gets frozen with with IOs in flight?  The
> timeouts would still trigger whether the issuer is frozen or not.  IO
> timeouts are mostly implemented in block layer and block layer just
> needs to handle it sensibly.

kthreads (including workqueues) get frozen at controlled places.  
Presumably they are smart enough not to allow themselves to be frozen
while any I/Os are in flight.

Alan Stern


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 23:24                                 ` Rafael J. Wysocki
  2016-03-16 23:42                                   ` Jiri Kosina
  2016-03-17  0:05                                   ` Tejun Heo
@ 2016-03-17 18:53                                   ` Alan Stern
  2016-03-18 20:29                                     ` Tejun Heo
  2 siblings, 1 reply; 43+ messages in thread
From: Alan Stern @ 2016-03-17 18:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jiri Kosina, Tejun Heo, Jan Kara, Peter Chen, Florian Mickler,
	Linux-pm mailing list

On Thu, 17 Mar 2016, Rafael J. Wysocki wrote:

> I thought Tejun was talking about getting rid of userspace freezing.
> Sorry if that was not the case.
> 
> As I've already said for multiple times, I think that the freezing
> should really be limited to user space as a rule.  And in particular,
> if freezing of any kernel threads or workqueues etc causes problems to
> happen, the code using those freezable things is simply buggy and
> needs to be fixed.  Most likely by getting rid of the freezing from
> it.

I agree, except that I think the most likely solution is to delay the 
operations that require a freezable thing until after the system is 
resumed.

Alan Stern


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-16 23:42                                   ` Jiri Kosina
  2016-03-16 23:48                                     ` Rafael J. Wysocki
  2016-03-17  7:50                                     ` Oliver Neukum
@ 2016-03-17 18:55                                     ` Alan Stern
  2016-03-18 20:29                                       ` Tejun Heo
  2 siblings, 1 reply; 43+ messages in thread
From: Alan Stern @ 2016-03-17 18:55 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Rafael J. Wysocki, Tejun Heo, Jan Kara, Peter Chen,
	Florian Mickler, Linux-pm mailing list

On Thu, 17 Mar 2016, Jiri Kosina wrote:

> On Thu, 17 Mar 2016, Rafael J. Wysocki wrote:
> 
> > >> Well, is there any point where they come together?  Say somebody
> > >> communicates with user space over mmaped address range and writes to
> > >> that go to hardware.  What do we do then?
> > >
> > > There are no more writes happening to that region after userspace has been
> > > frozen. And we're definitely not getting rid of userspace freezing.
> > >
> > > Or have I missed the point of your question?
> > 
> > I thought Tejun was talking about getting rid of userspace freezing.
> > Sorry if that was not the case.
> 
> Oh, I definitely belive that the current "holy grail" is to get rid of 
> ktrehad / kernel code freezing. Userspace is a completely different story, 
> I agree.

Why get rid of kernel code freezing?  It is the perfect solution for 
some problems.

> > As I've already said for multiple times, I think that the freezing 
> > should really be limited to user space as a rule.  And in particular, if 
> > freezing of any kernel threads or workqueues etc causes problems to 
> > happen, the code using those freezable things is simply buggy and needs 
> > to be fixed.  Most likely by getting rid of the freezing from it.
> 
> I am (slowly) working towards this goal. First I am trying to make sure 
> that all the kernel is using the kthread freezing API in a well-defined 
> way (which is currently not the case at all).
> 
> Before this is done, it is more or less impossible to analyse the current 
> users and make any decisions on top of that.
> 
> Once some kind of analysis is possible, I am pretty sure that we'll be 
> able to get rid of most kthread freezing by introducing fs freezing during 
> suspend (and fixing a lot of currently existing bugs as a nice 
> side-effect), and perhaps eventually ditch the kthread freezer altogether.

While fs freezing would undoubtedly be a good thing, I don't think it 
will allow you to eliminate kthread (or more specifically, workqueue) 
freezing.

Alan Stern


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17  0:05                                   ` Tejun Heo
@ 2016-03-17 18:58                                     ` Alan Stern
  2016-03-18 20:25                                       ` Tejun Heo
  0 siblings, 1 reply; 43+ messages in thread
From: Alan Stern @ 2016-03-17 18:58 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Jiri Kosina, Jan Kara, Peter Chen,
	Florian Mickler, Linux-pm mailing list

On Wed, 16 Mar 2016, Tejun Heo wrote:

> Hello, Rafael, Jiri.
> 
> On Thu, Mar 17, 2016 at 12:24:57AM +0100, Rafael J. Wysocki wrote:
> > > There are no more writes happening to that region after userspace has been
> > > frozen. And we're definitely not getting rid of userspace freezing.
> > >
> > > Or have I missed the point of your question?
> > 
> > I thought Tejun was talking about getting rid of userspace freezing.
> > Sorry if that was not the case.
> 
> Oh, kthreads, definitely.  Userland freezing doesn't cause any
> dependency issues in the kernel and I don't see any reason for
> changing userland part.

Actually, it does cause dependency issues.  The prototype example is
fuse, when one process can't be frozen because it is waiting in the
kernel for file I/O to complete, but the I/O involves a fuse filesystem
and the userspace driver is already frozen.

Alan Stern


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17 18:58                                     ` Alan Stern
@ 2016-03-18 20:25                                       ` Tejun Heo
  0 siblings, 0 replies; 43+ messages in thread
From: Tejun Heo @ 2016-03-18 20:25 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Jiri Kosina, Jan Kara, Peter Chen,
	Florian Mickler, Linux-pm mailing list

Hello, Alan.

On Thu, Mar 17, 2016 at 02:58:31PM -0400, Alan Stern wrote:
> Actually, it does cause dependency issues.  The prototype example is
> fuse, when one process can't be frozen because it is waiting in the
> kernel for file I/O to complete, but the I/O involves a fuse filesystem
> and the userspace driver is already frozen.

Fuse is a clearly exceptional case where kernel internals depend on
userland for forward-progress.  I don't think this has much bearing on
the overall design.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17 18:55                                     ` Alan Stern
@ 2016-03-18 20:29                                       ` Tejun Heo
  0 siblings, 0 replies; 43+ messages in thread
From: Tejun Heo @ 2016-03-18 20:29 UTC (permalink / raw)
  To: Alan Stern
  Cc: Jiri Kosina, Rafael J. Wysocki, Jan Kara, Peter Chen,
	Florian Mickler, Linux-pm mailing list

Hello,

On Thu, Mar 17, 2016 at 02:55:47PM -0400, Alan Stern wrote:
> > Oh, I definitely belive that the current "holy grail" is to get rid of 
> > ktrehad / kernel code freezing. Userspace is a completely different story, 
> > I agree.
> 
> Why get rid of kernel code freezing?  It is the perfect solution for 
> some problems.

with pretty severe generic problems which are difficult to root out.
It's more about making it finer grained rather than removing it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17 18:53                                   ` Alan Stern
@ 2016-03-18 20:29                                     ` Tejun Heo
  0 siblings, 0 replies; 43+ messages in thread
From: Tejun Heo @ 2016-03-18 20:29 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Jiri Kosina, Jan Kara, Peter Chen,
	Florian Mickler, Linux-pm mailing list

On Thu, Mar 17, 2016 at 02:53:42PM -0400, Alan Stern wrote:
> I agree, except that I think the most likely solution is to delay the 
> operations that require a freezable thing until after the system is 
> resumed.

Yes, by following the existing and well-established device suspend
order.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-17 18:51                           ` Alan Stern
@ 2016-03-18 20:32                             ` Tejun Heo
  2016-03-18 21:06                               ` Alan Stern
  0 siblings, 1 reply; 43+ messages in thread
From: Tejun Heo @ 2016-03-18 20:32 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jan Kara, Peter Chen, florian, jkosina, Linux-pm mailing list

Hello,

On Thu, Mar 17, 2016 at 02:51:36PM -0400, Alan Stern wrote:
> > Stopping kthreads in itself is fine but the problem is that we
> > currently don't know who's stopping what.  Just making each freezing
> > explicit, say, an explicit invocation of kthread_freeze_kthread(),
> > wouldn't be that difficult and would go a long way.  e.g. It would
> > immediately solve the problem arising from kthread freezing order not
> > matching device dependency hierarchy.
> 
> We're talking about workqueues in particular.  Freezing order is not 
> the issue.

Whether something is running on kthread or workqueue doesn't make any
difference.  Why would it?  It's about dependency chain, not what
something is running on.

> > * Make block layer plug bio's from upper layers.
> > 
> > * Visit each freezable workqueue or kthread users and convert them to
> >   plug explicitly.
> 
> How would this help the case in question?  If a workqueue is plugged 
> instead of frozen, won't flush_delayed_work still deadlock?

By plugging and unplugging in a well defined order rather than
"whatever comes first in workqueue or task lists".

> kthreads (including workqueues) get frozen at controlled places.  
> Presumably they are smart enough not to allow themselves to be frozen
> while any I/Os are in flight.

I don't think that's the main problem here.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-18 20:32                             ` Tejun Heo
@ 2016-03-18 21:06                               ` Alan Stern
  2016-03-20 18:29                                 ` Tejun Heo
  0 siblings, 1 reply; 43+ messages in thread
From: Alan Stern @ 2016-03-18 21:06 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jan Kara, Peter Chen, florian, jkosina, Linux-pm mailing list

On Fri, 18 Mar 2016, Tejun Heo wrote:

> Whether something is running on kthread or workqueue doesn't make any
> difference.  Why would it?  It's about dependency chain, not what
> something is running on.
> 
> > > * Make block layer plug bio's from upper layers.
> > > 
> > > * Visit each freezable workqueue or kthread users and convert them to
> > >   plug explicitly.
> > 
> > How would this help the case in question?  If a workqueue is plugged 
> > instead of frozen, won't flush_delayed_work still deadlock?
> 
> By plugging and unplugging in a well defined order rather than
> "whatever comes first in workqueue or task lists".

The original problem here was that some unfreezable code was waiting
(via flush_delayed_work) on frozen code, and it was blocking the
routine that would unfreeze things.  Therefore the order of freezing or
thawing has no bearing on this particular problem.

This is an example of a dependency chain where finer-grained control of
freezing will not help.  And even if we moved the I/O plugging from the
high-level routines to the low-level drivers, we could still end up
with situations where something waiting for a plugged device driver was
blocking the code that would unplug it.

Now, I have nothing against finer-grained control of freezing, although
I'm not sure how you would implement it in cases where a kthread is not
per-device but rather is per-driver or per-subsystem.  But we will 
still have to handle inversions where unfreezable code is waiting for 
frozen code.

Alan Stern

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-18 21:06                               ` Alan Stern
@ 2016-03-20 18:29                                 ` Tejun Heo
  2016-03-21 14:38                                   ` Alan Stern
  0 siblings, 1 reply; 43+ messages in thread
From: Tejun Heo @ 2016-03-20 18:29 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jan Kara, Peter Chen, florian, jkosina, Linux-pm mailing list

Hello,

On Fri, Mar 18, 2016 at 05:06:01PM -0400, Alan Stern wrote:
> The original problem here was that some unfreezable code was waiting
> (via flush_delayed_work) on frozen code, and it was blocking the
> routine that would unfreeze things.  Therefore the order of freezing or
> thawing has no bearing on this particular problem.

Yeah, this one is caused by a high-level kernel construct being frozen
and device resuming as a whole taking place before unfreezing.  The
eventual goal would be avoiding freezing any high-level kernel
constructs as it can lead to circular dependencies as shown here.

> Now, I have nothing against finer-grained control of freezing, although
> I'm not sure how you would implement it in cases where a kthread is not
> per-device but rather is per-driver or per-subsystem.  But we will 
> still have to handle inversions where unfreezable code is waiting for 
> frozen code.

It won't be simple one-to-one conversion for all cases.  It's about
terminating suspend handling where the endpoints actually are after
all.

thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-20 18:29                                 ` Tejun Heo
@ 2016-03-21 14:38                                   ` Alan Stern
  2016-03-30 17:09                                     ` Tejun Heo
  0 siblings, 1 reply; 43+ messages in thread
From: Alan Stern @ 2016-03-21 14:38 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jan Kara, Peter Chen, florian, jkosina, Linux-pm mailing list

On Sun, 20 Mar 2016, Tejun Heo wrote:

> Hello,
> 
> On Fri, Mar 18, 2016 at 05:06:01PM -0400, Alan Stern wrote:
> > The original problem here was that some unfreezable code was waiting
> > (via flush_delayed_work) on frozen code, and it was blocking the
> > routine that would unfreeze things.  Therefore the order of freezing or
> > thawing has no bearing on this particular problem.
> 
> Yeah, this one is caused by a high-level kernel construct being frozen
> and device resuming as a whole taking place before unfreezing.  The
> eventual goal would be avoiding freezing any high-level kernel
> constructs as it can lead to circular dependencies as shown here.

That's not the point I was trying to make.

Even if we avoid freezing any high-level kernel constructs, there will 
_still_ be problems if a high-level construct blocks waiting for a 
low-level driver that is plugged... which is exactly the sort of thing 
that can happen when you call flush_delayed_work().

Alan Stern


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: Should suspend plug low-level devices?
  2016-03-21 14:38                                   ` Alan Stern
@ 2016-03-30 17:09                                     ` Tejun Heo
  0 siblings, 0 replies; 43+ messages in thread
From: Tejun Heo @ 2016-03-30 17:09 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jan Kara, Peter Chen, florian, jkosina, Linux-pm mailing list

Hello, Alan.

On Mon, Mar 21, 2016 at 10:38:55AM -0400, Alan Stern wrote:
> Even if we avoid freezing any high-level kernel constructs, there will 
> _still_ be problems if a high-level construct blocks waiting for a 
> low-level driver that is plugged... which is exactly the sort of thing 
> that can happen when you call flush_delayed_work().

The problem is caused by or at least made difficult to solve by having
these dependencies lumped up together.  If a device is detected to
have been removed across suspend, the device should be marked as gone
and further IO requests should be failed immediately instead of
getting blocked on filesystem layer generally blocked on global
freezer state.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2016-03-30 17:09 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-23  3:20 Freezable workqueue blocks non-freezable workqueue during the system resume process Peter Chen
2016-02-23  9:47 ` Peter Chen
2016-02-23 15:34 ` Alan Stern
2016-02-24  7:24   ` Peter Chen
2016-02-25 22:01     ` Tejun Heo
2016-02-26  6:19       ` Peter Chen
2016-03-02 16:00         ` Tejun Heo
2016-03-03  9:33           ` Jan Kara
2016-03-11 17:56             ` Tejun Heo
2016-03-14  7:22               ` Jan Kara
2016-03-14 14:37                 ` Alan Stern
2016-03-15  9:25                   ` Jan Kara
2016-03-16 15:00                     ` Tejun Heo
2016-03-16 15:37                       ` Should suspend plug low-level devices? Alan Stern
2016-03-16 16:31                         ` Rafael J. Wysocki
2016-03-16 16:35                           ` Tejun Heo
2016-03-16 16:39                             ` Rafael J. Wysocki
2016-03-16 23:18                               ` Jiri Kosina
2016-03-16 23:24                                 ` Rafael J. Wysocki
2016-03-16 23:42                                   ` Jiri Kosina
2016-03-16 23:48                                     ` Rafael J. Wysocki
2016-03-17  7:50                                     ` Oliver Neukum
2016-03-17  8:02                                       ` Jiri Kosina
2016-03-17  8:20                                         ` Oliver Neukum
2016-03-17  8:35                                           ` Jiri Kosina
2016-03-17  9:12                                             ` Oliver Neukum
2016-03-17 10:20                                               ` Jiri Kosina
2016-03-17 10:29                                                 ` Oliver Neukum
2016-03-17 10:36                                                   ` Jiri Kosina
2016-03-17 18:55                                     ` Alan Stern
2016-03-18 20:29                                       ` Tejun Heo
2016-03-17  0:05                                   ` Tejun Heo
2016-03-17 18:58                                     ` Alan Stern
2016-03-18 20:25                                       ` Tejun Heo
2016-03-17 18:53                                   ` Alan Stern
2016-03-18 20:29                                     ` Tejun Heo
2016-03-16 16:34                         ` Tejun Heo
2016-03-17 18:51                           ` Alan Stern
2016-03-18 20:32                             ` Tejun Heo
2016-03-18 21:06                               ` Alan Stern
2016-03-20 18:29                                 ` Tejun Heo
2016-03-21 14:38                                   ` Alan Stern
2016-03-30 17:09                                     ` Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.