* pci driver loads right after unload
From: Max Gurtovoy @ 2018-01-01 23:50 UTC
To: helgaas, linux-pci

Hi all,

I encountered a strange phenomenon using 2 different pci drivers
(nvme and mlx5_core) since 4.15-rc1:
when I try to unload the modules using the "modprobe -r" cmd, it calls
the .probe function right after calling the .remove function and the
module is not really unloaded.
I think there is some race condition because when I added a
msleep(1000) after "pci_unregister_driver(&nvme_driver);" (in the
nvme module testing, it also worked in the mlx5_core), the issue
seems to disappear.

Any thoughts or ideas on what is causing this behaviour?

-Max.

* Re: pci driver loads right after unload
From: Bjorn Helgaas @ 2018-01-02 19:00 UTC
To: Max Gurtovoy
Cc: linux-pci, linux-kernel, Greg Kroah-Hartman

[+cc Greg, linux-kernel]

Hi Max,

Thanks for the report!

On Tue, Jan 02, 2018 at 01:50:23AM +0200, Max Gurtovoy wrote:
> hi all,
> I encountered a strange phenomenon using 2 different pci drivers
> (nvme and mlx5_core) since 4.15-rc1:
> when I try to unload the modules using "modprobe -r" cmd it calls
> the .probe function right after calling the .remove function and the
> module is not really unloaded.
> I think there is some race condition because when I added a
> msleep(1000) after "pci_unregister_driver(&nvme_driver);" (in the
> nvme module testing, it also worked in the mlx5_core), the issue
> seems to disappear.

You say "since 4.15-rc1".  Does that mean it's a regression?  If so,
what's the most recent kernel that does not have this problem?  Worst
case, you could bisect to find where it broke.

I don't see anything obvious in the drivers/pci changes between v4.14
and v4.15-rc1.  Module loading and driver binding are mostly driven by
the driver core and udev.  Maybe you could learn something with
"udevadm monitor" or by turning on some of the debug output in
lib/kobject_uevent.c?

Bjorn

* Re: pci driver loads right after unload
From: Greg Kroah-Hartman @ 2018-01-02 19:27 UTC
To: Bjorn Helgaas
Cc: Max Gurtovoy, linux-pci, linux-kernel

On Tue, Jan 02, 2018 at 01:00:03PM -0600, Bjorn Helgaas wrote:
> [...]
> You say "since 4.15-rc1".  Does that mean it's a regression?  If so,
> what's the most recent kernel that does not have this problem?  Worst
> case, you could bisect to find where it broke.
>
> I don't see anything obvious in the drivers/pci changes between v4.14
> and v4.15-rc1.  Module loading and driver binding are mostly driven by
> the driver core and udev.  Maybe you could learn something with
> "udevadm monitor" or by turning on some of the debug output in
> lib/kobject_uevent.c?

This should be resolved in 4.15-rc6; there was a regression in -rc1 in
this area when dealing with uevents over netlink.

Max, can you test -rc6 to verify whether this is really fixed?

thanks,

greg k-h

* Re: pci driver loads right after unload
From: Max Gurtovoy @ 2018-01-03 10:50 UTC
To: Greg Kroah-Hartman, Bjorn Helgaas
Cc: linux-pci, linux-kernel

Hi Greg/Bjorn,

On 1/2/2018 9:27 PM, Greg Kroah-Hartman wrote:
> [...]
> This should be resolved in 4.15-rc6, there was a regression in -rc1 in
> this area when dealing with uevents over netlink.
>
> Max, can you test -rc6 to verify if this is really fixed or not?

I've tested -rc6 and the issue doesn't reproduce.
I'll continue monitoring this scenario in -rc7.
Can you point me to the commit that fixes the issue? I'll then test the
kernel with and without this patch.

Cheers,
Max.

* Re: pci driver loads right after unload
From: Greg Kroah-Hartman @ 2018-01-03 11:18 UTC
To: Max Gurtovoy
Cc: Bjorn Helgaas, linux-pci, linux-kernel

On Wed, Jan 03, 2018 at 12:50:05PM +0200, Max Gurtovoy wrote:
> [...]
> I've tested -rc6 and the issue doesn't reproduce.
> I'll continue monitoring this scenario in -rc7.
> Can you point me to the commit that fixes the issue? I'll then test the
> kernel with and without this patch.

9b3fa47d4a76 ("kobject: fix suppressing modalias in uevents delivered over netlink")

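A note for anyone reproducing this: the title of the commit Greg cites points at MODALIAS handling in uevents delivered over netlink. What follows is a minimal Python sketch, not taken from the thread, of how one might check a captured kernel uevent for a MODALIAS property. It assumes the usual kernel uevent layout (an "action@devpath" header followed by NUL-separated KEY=VALUE strings). The function names, the sample payloads, the device path and the modalias string are all hypothetical, and treating "unbind" as the action of interest is an assumption rather than something stated in the thread.

```python
from typing import Dict, Tuple


def parse_kernel_uevent(payload: bytes) -> Tuple[str, Dict[str, str]]:
    """Split a raw kernel uevent into its header and KEY=VALUE properties.

    Assumes the kernel layout: "action@devpath" first, then NUL-separated
    KEY=VALUE strings.
    """
    fields = [f.decode("utf-8", errors="replace")
              for f in payload.split(b"\0") if f]
    if not fields or "@" not in fields[0]:
        raise ValueError("missing 'action@devpath' header")
    props: Dict[str, str] = {}
    for field in fields[1:]:
        key, sep, value = field.partition("=")
        if sep:  # skip anything that is not KEY=VALUE
            props[key] = value
    return fields[0], props


def carries_modalias(props: Dict[str, str], action: str = "unbind") -> bool:
    """True if the event has the given ACTION and still carries MODALIAS."""
    return props.get("ACTION") == action and "MODALIAS" in props


# Hypothetical payloads, hand-written for illustration (not captured output).
LEAKY = (b"unbind@/devices/pci0000:00/0000:00:01.0\0"
         b"ACTION=unbind\0DEVPATH=/devices/pci0000:00/0000:00:01.0\0"
         b"SUBSYSTEM=pci\0MODALIAS=pci:v00001234d00005678\0SEQNUM=100\0")
CLEAN = (b"unbind@/devices/pci0000:00/0000:00:01.0\0"
         b"ACTION=unbind\0DEVPATH=/devices/pci0000:00/0000:00:01.0\0"
         b"SUBSYSTEM=pci\0SEQNUM=101\0")

for name, payload in (("leaky", LEAKY), ("clean", CLEAN)):
    header, props = parse_kernel_uevent(payload)
    # Print the label, the action taken from the header, and the check result.
    print(name, header.split("@", 1)[0], carries_modalias(props))
```

Real payloads would have to come from the running system, for example from a NETLINK_KOBJECT_UEVENT socket, and could be compared against what "udevadm monitor" reports on kernels with and without the fix. The sketch only inspects a message; it does not reproduce the reload itself.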