* [PATCH] driver core: flush async calls before testing driver removal
@ 2016-12-10 0:15 Vladimir Zapolskiy
2016-12-10 1:59 ` Dmitry Torokhov
2016-12-10 7:32 ` Greg Kroah-Hartman
0 siblings, 2 replies; 10+ messages in thread
From: Vladimir Zapolskiy @ 2016-12-10 0:15 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rob Herring; +Cc: linux-kernel
If CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled a number of false
positives are reported for ATA controller drivers, because ATA port
probes are done asynchronously, and the same problem may also touch
other asynchronously probed drivers.
To reduce the rate of false reports on boot call async_synchronize_full()
before attempting to remove a driver, the same is done in delete_module()
syscall for all possible drivers and in __device_release_driver() function
for asynchronously probed drivers.
Fixes: bea5b158ff0d ("driver core: add test of driver remove calls during probe")
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
---
Some time ago the issue was discussed on the linux-ide mailing list, see
https://www.spinics.net/lists/linux-ide/msg53481.html
drivers/base/dd.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index d76cd97..a4feecf 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -384,6 +384,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
if (test_remove) {
test_remove = false;
+ async_synchronize_full();
+
if (dev->bus->remove)
dev->bus->remove(dev);
else if (drv->remove)
--
2.10.2
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH] driver core: flush async calls before testing driver removal 2016-12-10 0:15 [PATCH] driver core: flush async calls before testing driver removal Vladimir Zapolskiy @ 2016-12-10 1:59 ` Dmitry Torokhov 2016-12-10 12:57 ` Vladimir Zapolskiy 2016-12-10 7:32 ` Greg Kroah-Hartman 1 sibling, 1 reply; 10+ messages in thread From: Dmitry Torokhov @ 2016-12-10 1:59 UTC (permalink / raw) To: Vladimir Zapolskiy; +Cc: Greg Kroah-Hartman, Rob Herring, lkml On Fri, Dec 9, 2016 at 4:15 PM, Vladimir Zapolskiy <vz@mleia.com> wrote: > If CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled a number of false > positives are reported for ATA controller drivers, because ATA port > probes are done asynchronously, and the same problem may also touch > other asynchronously probed drivers. > > To reduce the rate of false reports on boot call async_synchronize_full() > before attempting to remove a driver, the same is done in delete_module() > syscall for all possible drivers and in __device_release_driver() function > for asynchronously probed drivers. I'd say CONFIG_DEBUG_TEST_DRIVER_REMOVE did what it was supposed to do and uncovered a big in ATA drivers. Since driver core did not asynchronously scheduled those actions it should not wait for their completion either, but either ATA core or drivers should wait for probing to complete before allowing remove() methods to run. > > Fixes: bea5b158ff0d ("driver core: add test of driver remove calls during probe") > Suggested-by: Tejun Heo <tj@kernel.org> > Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> > --- > Some time ago the issue was discussed on the linux-ide mailing list, see > > https://www.spinics.net/lists/linux-ide/msg53481.html > > drivers/base/dd.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c > index d76cd97..a4feecf 100644 > --- a/drivers/base/dd.c > +++ b/drivers/base/dd.c > @@ -384,6 +384,8 @@ static int really_probe(struct device *dev, struct device_driver *drv) > if (test_remove) { > test_remove = false; > > + async_synchronize_full(); > + > if (dev->bus->remove) > dev->bus->remove(dev); > else if (drv->remove) > -- > 2.10.2 > Thanks. -- Dmitry ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] driver core: flush async calls before testing driver removal 2016-12-10 1:59 ` Dmitry Torokhov @ 2016-12-10 12:57 ` Vladimir Zapolskiy 0 siblings, 0 replies; 10+ messages in thread From: Vladimir Zapolskiy @ 2016-12-10 12:57 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: Greg Kroah-Hartman, Rob Herring, lkml Hello Dmitry, On 12/10/2016 03:59 AM, Dmitry Torokhov wrote: > On Fri, Dec 9, 2016 at 4:15 PM, Vladimir Zapolskiy <vz@mleia.com> wrote: >> If CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled a number of false >> positives are reported for ATA controller drivers, because ATA port >> probes are done asynchronously, and the same problem may also touch >> other asynchronously probed drivers. >> >> To reduce the rate of false reports on boot call async_synchronize_full() >> before attempting to remove a driver, the same is done in delete_module() >> syscall for all possible drivers and in __device_release_driver() function >> for asynchronously probed drivers. > > I'd say CONFIG_DEBUG_TEST_DRIVER_REMOVE did what it was supposed to do > and uncovered a big in ATA drivers. Since driver core did not > asynchronously scheduled those actions it should not wait for their > completion either, but either ATA core or drivers should wait for > probing to complete before allowing remove() methods to run. > can you please share the idea why? I haven't managed to find any problems with ATA subsystem and drivers, my fuzz testing by inserting delays to postpone scheduled execution of async_port_probe() don't show any problems also. So I believe the problem is with the test itself. -- With best wishes, Vladimir ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] driver core: flush async calls before testing driver removal 2016-12-10 0:15 [PATCH] driver core: flush async calls before testing driver removal Vladimir Zapolskiy 2016-12-10 1:59 ` Dmitry Torokhov @ 2016-12-10 7:32 ` Greg Kroah-Hartman 2016-12-10 12:38 ` Vladimir Zapolskiy 1 sibling, 1 reply; 10+ messages in thread From: Greg Kroah-Hartman @ 2016-12-10 7:32 UTC (permalink / raw) To: Vladimir Zapolskiy; +Cc: Rob Herring, linux-kernel On Sat, Dec 10, 2016 at 02:15:19AM +0200, Vladimir Zapolskiy wrote: > If CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled a number of false > positives are reported for ATA controller drivers, because ATA port > probes are done asynchronously, and the same problem may also touch > other asynchronously probed drivers. > > To reduce the rate of false reports on boot call async_synchronize_full() > before attempting to remove a driver, the same is done in delete_module() > syscall for all possible drivers and in __device_release_driver() function > for asynchronously probed drivers. __device_release_driver() already calls this function, why call it again? thanks, greg k-h ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] driver core: flush async calls before testing driver removal 2016-12-10 7:32 ` Greg Kroah-Hartman @ 2016-12-10 12:38 ` Vladimir Zapolskiy 2016-12-10 13:04 ` Greg Kroah-Hartman 0 siblings, 1 reply; 10+ messages in thread From: Vladimir Zapolskiy @ 2016-12-10 12:38 UTC (permalink / raw) To: Greg Kroah-Hartman; +Cc: Rob Herring, linux-kernel Hello Greg, On 12/10/2016 09:32 AM, Greg Kroah-Hartman wrote: > On Sat, Dec 10, 2016 at 02:15:19AM +0200, Vladimir Zapolskiy wrote: >> If CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled a number of false >> positives are reported for ATA controller drivers, because ATA port >> probes are done asynchronously, and the same problem may also touch >> other asynchronously probed drivers. >> >> To reduce the rate of false reports on boot call async_synchronize_full() >> before attempting to remove a driver, the same is done in delete_module() >> syscall for all possible drivers and in __device_release_driver() function >> for asynchronously probed drivers. > > __device_release_driver() already calls this function, why call it > again? > __device_release_driver() is not called on test removal of drivers, if CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled. This opens a possibility to races like one I've discovered: https://www.spinics.net/lists/linux-ide/msg53473.html Next, async_synchronize_full() from __device_release_driver() is not called in case of removal of ATA controller drivers, because driver_allows_async_probing(drv) return value is false. -- With best wishes, Vladimir ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] driver core: flush async calls before testing driver removal 2016-12-10 12:38 ` Vladimir Zapolskiy @ 2016-12-10 13:04 ` Greg Kroah-Hartman 2016-12-11 1:44 ` Vladimir Zapolskiy 0 siblings, 1 reply; 10+ messages in thread From: Greg Kroah-Hartman @ 2016-12-10 13:04 UTC (permalink / raw) To: Vladimir Zapolskiy; +Cc: Rob Herring, linux-kernel On Sat, Dec 10, 2016 at 02:38:41PM +0200, Vladimir Zapolskiy wrote: > Hello Greg, > > On 12/10/2016 09:32 AM, Greg Kroah-Hartman wrote: > > On Sat, Dec 10, 2016 at 02:15:19AM +0200, Vladimir Zapolskiy wrote: > >> If CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled a number of false > >> positives are reported for ATA controller drivers, because ATA port > >> probes are done asynchronously, and the same problem may also touch > >> other asynchronously probed drivers. > >> > >> To reduce the rate of false reports on boot call async_synchronize_full() > >> before attempting to remove a driver, the same is done in delete_module() > >> syscall for all possible drivers and in __device_release_driver() function > >> for asynchronously probed drivers. > > > > __device_release_driver() already calls this function, why call it > > again? > > > > __device_release_driver() is not called on test removal of drivers, if > CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled. > > This opens a possibility to races like one I've discovered: > > https://www.spinics.net/lists/linux-ide/msg53473.html > > Next, async_synchronize_full() from __device_release_driver() is not > called in case of removal of ATA controller drivers, because > driver_allows_async_probing(drv) return value is false. Hm, how does this not also get hit if you unbind/bind/unbind/bind/etc. from userspace as well? I don't think this is a CONFIG_DEBUG_TEST_DRIVER_REMOVE issue, but just that this option finds the problem corner cases as you are finding out :) thanks, greg k-h ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] driver core: flush async calls before testing driver removal 2016-12-10 13:04 ` Greg Kroah-Hartman @ 2016-12-11 1:44 ` Vladimir Zapolskiy 2016-12-12 17:50 ` Tejun Heo 0 siblings, 1 reply; 10+ messages in thread From: Vladimir Zapolskiy @ 2016-12-11 1:44 UTC (permalink / raw) To: Greg Kroah-Hartman, Tejun Heo; +Cc: Rob Herring, linux-kernel Hello Greg, I'm adding Tejun to the list of addressees. On 12/10/2016 03:04 PM, Greg Kroah-Hartman wrote: > On Sat, Dec 10, 2016 at 02:38:41PM +0200, Vladimir Zapolskiy wrote: >> Hello Greg, >> >> On 12/10/2016 09:32 AM, Greg Kroah-Hartman wrote: >>> On Sat, Dec 10, 2016 at 02:15:19AM +0200, Vladimir Zapolskiy wrote: >>>> If CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled a number of false >>>> positives are reported for ATA controller drivers, because ATA port >>>> probes are done asynchronously, and the same problem may also touch >>>> other asynchronously probed drivers. >>>> >>>> To reduce the rate of false reports on boot call async_synchronize_full() >>>> before attempting to remove a driver, the same is done in delete_module() >>>> syscall for all possible drivers and in __device_release_driver() function >>>> for asynchronously probed drivers. >>> >>> __device_release_driver() already calls this function, why call it >>> again? >>> >> >> __device_release_driver() is not called on test removal of drivers, if >> CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled. >> >> This opens a possibility to races like one I've discovered: >> >> https://www.spinics.net/lists/linux-ide/msg53473.html >> >> Next, async_synchronize_full() from __device_release_driver() is not >> called in case of removal of ATA controller drivers, because >> driver_allows_async_probing(drv) return value is false. > > Hm, how does this not also get hit if you unbind/bind/unbind/bind/etc. > from userspace as well? I don't think this is a > CONFIG_DEBUG_TEST_DRIVER_REMOVE issue, but just that this option finds > the problem corner cases as you are finding out :) > and you are right, I managed to reproduce exactly the same race as before running the unmodified kernel built from Torvald's branch head: root@imx6q-sabresd:~# zcat /proc/config.gz | grep TEST_DRIVER_REMOVE # CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set root@imx6q-sabresd:~# uname -a Linux imx6q-sabresd 4.9.0-rc8-00134-g0451698 #23 SMP Sun Dec 11 03:37:30 EET 2016 armv7l GNU/Linux root@imx6q-sabresd:~# echo 2200000.sata > /sys/bus/platform/drivers/ahci-imx/unbind ; echo 2200000.sata > /sys/bus/platform/drivers/ahci-imx/bind; echo 2200000.sata > /sys/bus/platform/drivers/ahci-imx/unbind ahci-imx 2200000.sata: fsl,transmit-level-mV not specified, using 00000024 ahci-imx 2200000.sata: fsl,transmit-boost-mdB not specified, using 00000480 ahci-imx 2200000.sata: fsl,transmit-atten-16ths not specified, using 00002000 ahci-imx 2200000.sata: fsl,receive-eq-mdB not specified, using 05000000 ahci-imx 2200000.sata: SSS flag set, parallel bus scan disabled ahci-imx 2200000.sata: AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl platform mode ahci-imx 2200000.sata: flags: ncq sntf stag pm led clo only pmp pio slum part ccc apst scsi host0: ahci-imx ata3: SATA max UDMA/133 mmio [mem 0x02200000-0x02203fff] port 0x100 irq 72 ata3: SATA link down (SStatus 0 SControl 300) ahci-imx 2200000.sata: no device found, disabling link. ahci-imx 2200000.sata: pass .hotplug=1 to enable hotplug ------------[ cut here ]------------ WARNING: CPU: 2 PID: 375 at drivers/ata/libata-core.c:6482 ata_port_detach+0x130/0x140 Modules linked in: ahci_imx dw_hdmi_ahb_audio evbug CPU: 2 PID: 375 Comm: sh Not tainted 4.9.0-rc8-00134-g0451698 #23 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Backtrace: [<>] (dump_backtrace) from [<>] (show_stack+0x20/0x24) [<>] (show_stack) from [<>] (dump_stack+0xb4/0xe8) [<>] (dump_stack) from [<>] (__warn+0xe4/0x110) [<>] (__warn) from [<>] (warn_slowpath_null+0x30/0x38) [<>] (warn_slowpath_null) from [<>] (ata_port_detach+0x130/0x140) [<>] (ata_port_detach) from [<>] (ata_platform_remove_one+0x34/0x4c) [<>] (ata_platform_remove_one) from [<>] (platform_drv_remove+0x34/0x4c) [<>] (platform_drv_remove) from [<>] (__device_release_driver+0x98/0x134) [<>] (__device_release_driver) from [<>] (device_release_driver+0x30/0x3c) [<>] (device_release_driver) from [<>] (unbind_store+0x88/0x108) [<>] (unbind_store) from [<>] (drv_attr_store+0x30/0x3c) [<>] (drv_attr_store) from [<>] (sysfs_kf_write+0x58/0x5c) [<>] (sysfs_kf_write) from [<>] (kernfs_fop_write+0x108/0x21c) [<>] (kernfs_fop_write) from [<>] (__vfs_write+0x3c/0x128) [<>] (__vfs_write) from [<>] (vfs_write+0xb0/0x178) [<>] (vfs_write) from [<>] (SyS_write+0x4c/0xa0) [<>] (SyS_write) from [<>] (ret_fast_syscall+0x0/0x1c) ---[ end trace 457071853738c95e ]--- Tejun, will you look at the problem again? -- With best wishes, Vladimir ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] driver core: flush async calls before testing driver removal 2016-12-11 1:44 ` Vladimir Zapolskiy @ 2016-12-12 17:50 ` Tejun Heo 2016-12-12 18:33 ` Rob Herring 0 siblings, 1 reply; 10+ messages in thread From: Tejun Heo @ 2016-12-12 17:50 UTC (permalink / raw) To: Vladimir Zapolskiy; +Cc: Greg Kroah-Hartman, Rob Herring, linux-kernel Hello, On Sun, Dec 11, 2016 at 03:44:36AM +0200, Vladimir Zapolskiy wrote: > On 12/10/2016 03:04 PM, Greg Kroah-Hartman wrote: > > Hm, how does this not also get hit if you unbind/bind/unbind/bind/etc. > > from userspace as well? I don't think this is a > > CONFIG_DEBUG_TEST_DRIVER_REMOVE issue, but just that this option finds > > the problem corner cases as you are finding out :) > > > > and you are right, I managed to reproduce exactly the same race as before > running the unmodified kernel built from Torvald's branch head: Ah, you're right, so this means we need to add flush to all async probing drivers. Will do so for libata shortly. Thanks! -- tejun ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] driver core: flush async calls before testing driver removal 2016-12-12 17:50 ` Tejun Heo @ 2016-12-12 18:33 ` Rob Herring 2016-12-12 18:46 ` Tejun Heo 0 siblings, 1 reply; 10+ messages in thread From: Rob Herring @ 2016-12-12 18:33 UTC (permalink / raw) To: Tejun Heo Cc: Vladimir Zapolskiy, Greg Kroah-Hartman, linux-kernel@vger.kernel.org On Mon, Dec 12, 2016 at 11:50 AM, Tejun Heo <tj@kernel.org> wrote: > Hello, > > On Sun, Dec 11, 2016 at 03:44:36AM +0200, Vladimir Zapolskiy wrote: >> On 12/10/2016 03:04 PM, Greg Kroah-Hartman wrote: >> > Hm, how does this not also get hit if you unbind/bind/unbind/bind/etc. >> > from userspace as well? I don't think this is a >> > CONFIG_DEBUG_TEST_DRIVER_REMOVE issue, but just that this option finds >> > the problem corner cases as you are finding out :) >> > >> >> and you are right, I managed to reproduce exactly the same race as before >> running the unmodified kernel built from Torvald's branch head: > > Ah, you're right, so this means we need to add flush to all async > probing drivers. Will do so for libata shortly. Maybe I'm confused, but don't you need this for all drivers? You need sync the async SCSI scanning to the driver remove regardless of async probe. The driver core synchronization is only for synchronizing the remove with probe AIUI. Rob ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] driver core: flush async calls before testing driver removal 2016-12-12 18:33 ` Rob Herring @ 2016-12-12 18:46 ` Tejun Heo 0 siblings, 0 replies; 10+ messages in thread From: Tejun Heo @ 2016-12-12 18:46 UTC (permalink / raw) To: Rob Herring Cc: Vladimir Zapolskiy, Greg Kroah-Hartman, linux-kernel@vger.kernel.org Hello, On Mon, Dec 12, 2016 at 12:33:36PM -0600, Rob Herring wrote: > Maybe I'm confused, but don't you need this for all drivers? You need > sync the async SCSI scanning to the driver remove regardless of async > probe. The driver core synchronization is only for synchronizing the > remove with probe AIUI. Heh, I'm not quite following what you mean. Can you please elaborate? Also, on the second thought, it probably would be better to flush async calls before unbind. It's fragile to require indivdiual drivers to do that and I can't think of benefits of doing so. It's not like the unbind / unload paths are hot in any way. Thanks. -- tejun ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2016-12-12 18:46 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-12-10 0:15 [PATCH] driver core: flush async calls before testing driver removal Vladimir Zapolskiy 2016-12-10 1:59 ` Dmitry Torokhov 2016-12-10 12:57 ` Vladimir Zapolskiy 2016-12-10 7:32 ` Greg Kroah-Hartman 2016-12-10 12:38 ` Vladimir Zapolskiy 2016-12-10 13:04 ` Greg Kroah-Hartman 2016-12-11 1:44 ` Vladimir Zapolskiy 2016-12-12 17:50 ` Tejun Heo 2016-12-12 18:33 ` Rob Herring 2016-12-12 18:46 ` Tejun Heo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox