* [PATCH 1/1] driver core: Fix unbalance probe_count in really_probe()
@ 2020-06-03 6:45 Ji-Ze Hong (Peter Hong)
2020-06-03 7:13 ` Geert Uytterhoeven
0 siblings, 1 reply; 4+ messages in thread
From: Ji-Ze Hong (Peter Hong) @ 2020-06-03 6:45 UTC (permalink / raw)
To: gregkh, rafael
Cc: geert+renesas, peter_hong, linux-kernel, Ji-Ze Hong (Peter Hong)
In previous patch, using return -EBUSY in really_probe() instead WARN_ON()
only. The following is the partial code.
...
atomic_inc(&probe_count);
pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
drv->bus->name, __func__, drv->name, dev_name(dev));
if (!list_empty(&dev->devres_head)) {
dev_crit(dev, "Resources present before probing\n");
return -EBUSY;
}
...
When the devres_head is not empty, this code will return -EBUSY to prevent
resource conflict, but it forgot to balance probe_count. We can move the
increasement code below the resource checking.
...
pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
drv->bus->name, __func__, drv->name, dev_name(dev));
if (!list_empty(&dev->devres_head)) {
dev_crit(dev, "Resources present before probing\n");
return -EBUSY;
}
atomic_inc(&probe_count);
...
The original code will cause lots motherboard freeze in reboot/shutdown
with systemd message "Reached target Reboot" or "Reached target Shutdown"
with serial8250 platform driver. e.g. AOPEN DE6200. The conflict boot
dmesg below:
Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 921600) is a 16550A
00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 921600) is a 16550A
00:05: ttyS2 at I/O 0x3e8 (irq = 5, base_baud = 921600) is a 16550A
serial8250: ttyS3 at I/O 0x2e8 (irq = 3, base_baud = 921600) is a 16550A
Reboot/Shutdown will freeze in wait_for_device_probe(), message as
following:
INFQ: task systemd-shutdown: 1 blocked for more than 120 seconds.
Not tainted 5.7.0-rc7-tty-next+ #241
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Call Trace:
__schedule+0x3d2/0x700
? printk+0x52/0x6e
schedule+0x4f/0xc0
wait_for_device_probe+0xbb/0xl40
? wait_woken+0x80/0x80
device_shutdown+0xl5/0xle0
kernel_power_off+0x35/0x70
__do_sys_reboot+0xla0/0x220
? do_sigtimedwait+0xld0/0x210
? do.writev+0x6a/0xll0
? do.writev+0x6a/0xll0
? sigprocmask+0x6f/Oxa0
__64_sys_reboot+0xle/0x20
do_syscall_64+0x57/0xlb0
Fixes: 7c35e699c88b ("driver core: Print device when resources present in really_probe()")
Signed-off-by: Ji-Ze Hong (Peter Hong) <hpeter+linux_kernel@gmail.com>
---
drivers/base/dd.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 9a1d940342ac..5173b0766a26 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -487,7 +487,6 @@ static int really_probe(struct device *dev, struct device_driver *drv)
if (ret)
return ret;
- atomic_inc(&probe_count);
pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
drv->bus->name, __func__, drv->name, dev_name(dev));
if (!list_empty(&dev->devres_head)) {
@@ -495,6 +494,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
return -EBUSY;
}
+ atomic_inc(&probe_count);
+
re_probe:
dev->driver = drv;
--
2.17.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] driver core: Fix unbalance probe_count in really_probe()
2020-06-03 6:45 [PATCH 1/1] driver core: Fix unbalance probe_count in really_probe() Ji-Ze Hong (Peter Hong)
@ 2020-06-03 7:13 ` Geert Uytterhoeven
2020-06-03 7:35 ` Ji-Ze Hong (Peter Hong)
0 siblings, 1 reply; 4+ messages in thread
From: Geert Uytterhoeven @ 2020-06-03 7:13 UTC (permalink / raw)
To: Ji-Ze Hong (Peter Hong)
Cc: Greg KH, Rafael J. Wysocki, peter_hong, Linux Kernel Mailing List,
Ji-Ze Hong (Peter Hong)
Hi Ji-Ze,
On Wed, Jun 3, 2020 at 8:45 AM Ji-Ze Hong (Peter Hong) <hpeter@gmail.com> wrote:
> In previous patch, using return -EBUSY in really_probe() instead WARN_ON()
> only. The following is the partial code.
>
> ...
> atomic_inc(&probe_count);
> pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
> drv->bus->name, __func__, drv->name, dev_name(dev));
> if (!list_empty(&dev->devres_head)) {
> dev_crit(dev, "Resources present before probing\n");
> return -EBUSY;
> }
> ...
>
> When the devres_head is not empty, this code will return -EBUSY to prevent
> resource conflict, but it forgot to balance probe_count. We can move the
> increasement code below the resource checking.
>
> ...
> pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
> drv->bus->name, __func__, drv->name, dev_name(dev));
> if (!list_empty(&dev->devres_head)) {
> dev_crit(dev, "Resources present before probing\n");
> return -EBUSY;
> }
> atomic_inc(&probe_count);
If devres_head is not empty, you have a serious problem on your system,
as those resources may be in an unknown state (e.g. freed but still in
use). While I had missed the probe_count imbalance when implementing
the original change, it may actually be safer to not decrease
probe_count, to prevent further probes from happening. But I guess it
doesn't matter: if you get here, your system is in a bad state anyway.
> The original code will cause lots motherboard freeze in reboot/shutdown
> with systemd message "Reached target Reboot" or "Reached target Shutdown"
> with serial8250 platform driver. e.g. AOPEN DE6200. The conflict boot
> dmesg below:
>
> Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
> 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 921600) is a 16550A
> 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 921600) is a 16550A
> 00:05: ttyS2 at I/O 0x3e8 (irq = 5, base_baud = 921600) is a 16550A
> serial8250: ttyS3 at I/O 0x2e8 (irq = 3, base_baud = 921600) is a 16550A
>
> Reboot/Shutdown will freeze in wait_for_device_probe(), message as
> following:
> INFQ: task systemd-shutdown: 1 blocked for more than 120 seconds.
Now, how did you get to this state, i.e. which driver triggered the
"Resources present before probing" message? Because that is the root
issue that must be fixed, and the probe_count imbalance is IMHO just a
red herring.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] driver core: Fix unbalance probe_count in really_probe()
2020-06-03 7:13 ` Geert Uytterhoeven
@ 2020-06-03 7:35 ` Ji-Ze Hong (Peter Hong)
2020-06-03 8:30 ` Geert Uytterhoeven
0 siblings, 1 reply; 4+ messages in thread
From: Ji-Ze Hong (Peter Hong) @ 2020-06-03 7:35 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Greg KH, Rafael J. Wysocki, peter_hong, Linux Kernel Mailing List,
Ji-Ze Hong (Peter Hong)
Hi Geert,
Geert Uytterhoeven 於 2020/6/3 下午 03:13 寫道:
> Hi Ji-Ze,
>
> If devres_head is not empty, you have a serious problem on your system,
> as those resources may be in an unknown state (e.g. freed but still in
> use). While I had missed the probe_count imbalance when implementing
> the original change, it may actually be safer to not decrease
> probe_count, to prevent further probes from happening. But I guess it
> doesn't matter: if you get here, your system is in a bad state anyway.
We want to fix the shutdown/reboot freeze issue and bisect to this
patch and found if the probe_count != 0, the PC will stuck with
wait_for_device_probe() with shutdown/reboot forever. So we just
change the increment after return -EBUSY.
In this case, it maybe 8250_PNP & serial 8250 platform driver resources
conflict. I'll try to dump more message to debug.
IMO, the shutdown/reboot operation should not block.
>> with serial8250 platform driver. e.g. AOPEN DE6200. The conflict boot
>> dmesg below:
>>
>> Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
>> 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 921600) is a 16550A
>> 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 921600) is a 16550A
>> 00:05: ttyS2 at I/O 0x3e8 (irq = 5, base_baud = 921600) is a 16550A
>> serial8250: ttyS3 at I/O 0x2e8 (irq = 3, base_baud = 921600) is a 16550A
>>
>> Reboot/Shutdown will freeze in wait_for_device_probe(), message as
>> following:
>> INFQ: task systemd-shutdown: 1 blocked for more than 120 seconds.
>
> Now, how did you get to this state, i.e. which driver triggered the
> "Resources present before probing" message? Because that is the root
> issue that must be fixed, and the probe_count imbalance is IMHO just a
> red herring.
>
Sorry for lost important dmesg:
Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 921600) is a 16550A
00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 921600) is a 16550A
00:05: ttyS2 at I/O 0x3e8 (irq = 5, base_baud = 921600) is a 16550A
serial8250: ttyS3 at I/O 0x2e8 (irq = 3, base_baud = 921600) is a 16550A
platform serial8250: Resources present before probing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
--
With Best Regards,
Peter Hong
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] driver core: Fix unbalance probe_count in really_probe()
2020-06-03 7:35 ` Ji-Ze Hong (Peter Hong)
@ 2020-06-03 8:30 ` Geert Uytterhoeven
0 siblings, 0 replies; 4+ messages in thread
From: Geert Uytterhoeven @ 2020-06-03 8:30 UTC (permalink / raw)
To: Ji-Ze Hong (Peter Hong)
Cc: Greg KH, Rafael J. Wysocki, peter_hong, Linux Kernel Mailing List,
Ji-Ze Hong (Peter Hong)
Hi Ji-Ze,
On Wed, Jun 3, 2020 at 9:35 AM Ji-Ze Hong (Peter Hong) <hpeter@gmail.com> wrote:
> Geert Uytterhoeven 於 2020/6/3 下午 03:13 寫道:
> > If devres_head is not empty, you have a serious problem on your system,
> > as those resources may be in an unknown state (e.g. freed but still in
> > use). While I had missed the probe_count imbalance when implementing
> > the original change, it may actually be safer to not decrease
> > probe_count, to prevent further probes from happening. But I guess it
> > doesn't matter: if you get here, your system is in a bad state anyway.
>
> We want to fix the shutdown/reboot freeze issue and bisect to this
> patch and found if the probe_count != 0, the PC will stuck with
> wait_for_device_probe() with shutdown/reboot forever. So we just
> change the increment after return -EBUSY.
IC. And before my change, you got a big fat warning backtrace, telling you
something is seriously wrong? ;-)
> In this case, it maybe 8250_PNP & serial 8250 platform driver resources
> conflict. I'll try to dump more message to debug.
OK.
> IMO, the shutdown/reboot operation should not block.
Well, it depends. If there's an issue with resources, the system may crash,
too.
> >> with serial8250 platform driver. e.g. AOPEN DE6200. The conflict boot
> >> dmesg below:
> >>
> >> Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
> >> 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 921600) is a 16550A
> >> 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 921600) is a 16550A
> >> 00:05: ttyS2 at I/O 0x3e8 (irq = 5, base_baud = 921600) is a 16550A
> >> serial8250: ttyS3 at I/O 0x2e8 (irq = 3, base_baud = 921600) is a 16550A
> >>
> >> Reboot/Shutdown will freeze in wait_for_device_probe(), message as
> >> following:
> >> INFQ: task systemd-shutdown: 1 blocked for more than 120 seconds.
> >
> > Now, how did you get to this state, i.e. which driver triggered the
> > "Resources present before probing" message? Because that is the root
> > issue that must be fixed, and the probe_count imbalance is IMHO just a
> > red herring.
> >
>
> Sorry for lost important dmesg:
>
> Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
> 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 921600) is a 16550A
> 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 921600) is a 16550A
> 00:05: ttyS2 at I/O 0x3e8 (irq = 5, base_baud = 921600) is a 16550A
> serial8250: ttyS3 at I/O 0x2e8 (irq = 3, base_baud = 921600) is a 16550A
> platform serial8250: Resources present before probing
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OK. So the serial8250 driver does something fishy.
When the warning triggered for me, it was due to a driver calling a devm_*()
function on a different device than the one being probed, cfr.
https://lore.kernel.org/r/alpine.DEB.2.21.1911201053330.25420@ramsan.of.borg
which was fixed by commit 32085f25d7b68404 ("mdio_bus: don't use managed
reset-controller").
The serial8250 driver, or the subdriver for an SoC-specific variant, may
do something similar.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-06-03 8:30 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-06-03 6:45 [PATCH 1/1] driver core: Fix unbalance probe_count in really_probe() Ji-Ze Hong (Peter Hong)
2020-06-03 7:13 ` Geert Uytterhoeven
2020-06-03 7:35 ` Ji-Ze Hong (Peter Hong)
2020-06-03 8:30 ` Geert Uytterhoeven
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox