From: Claudiu Beznea <claudiu.beznea@kernel.org>
To: yoshihiro.shimoda.uh@renesas.com, vkoul@kernel.org,
neil.armstrong@linaro.org, geert+renesas@glider.be,
magnus.damm@gmail.com, prabhakar.mahadev-lad.rj@bp.renesas.com
Cc: linux-renesas-soc@vger.kernel.org, linux-phy@lists.infradead.org,
linux-kernel@vger.kernel.org,
Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>,
stable@vger.kernel.org, Pavel Machek <pavel@nabladev.com>,
Nobuhiro Iwamatsu <iwamatsu@nigauri.org>
Subject: sashiko review: Re: [PATCH] phy: renesas: rcar-gen3-usb2: Avoid long delay in atomic context
Date: Fri, 15 May 2026 18:01:33 +0300 [thread overview]
Message-ID: <a88cbcff-2114-45fe-9ab6-d456c69dba99@kernel.org> (raw)
In-Reply-To: <20260514111300.2152386-1-claudiu.beznea@kernel.org>
>commit 856d28e68a5bcfa08fa03aa685cb15772197dedc
>Author: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
>phy: renesas: rcar-gen3-usb2: Avoid long delay in atomic context
>This commit releases the spin lock before sleeping for 20 ms during
>OTG PHY initialization and introduces completion mechanisms to prevent
>other threads from configuring the hardware concurrently. It also disables
>IRQs while waiting to prevent interrupt handlers from applying settings.
>Closes: https://lore.kernel.org/all/afhkX2Ys2BG1gnqy@duo.ucw.cz
>Closes: https://lore.kernel.org/all/afhkX2Ys2BG1gnqy@duo.ucw.cz
>> diff --git a/drivers/phy/renesas/phy-rcar-gen3-usb2.c
b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
>> index 79e820e2fe55d..5a18bbd87c3b9 100644
>> --- a/drivers/phy/renesas/phy-rcar-gen3-usb2.c
>> +++ b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
>[ ... ]
>> @@ -388,32 +392,68 @@ static bool rcar_gen3_are_all_rphys_power_off(struct
rcar_gen3_chan *ch)
>> return true;
>> }
>>
>> +static int rcar_gen3_phy_wait_otg_init(struct rcar_gen3_chan *channel,
>> + unsigned long *flags)
>> +{
>> + unsigned long timeout = msecs_to_jiffies(25);
>Could this completion timeout be too tight?
I tried to have a bit increased compared with the fsleep() value to avoid just
being to tight but not to big.
>The initializing thread uses fsleep(20000) which guarantees a minimum sleep
>of 20ms. Due to scheduler jitter or timer wheel granularity (especially if
>HZ=100, where 25ms rounds to just 2 or 3 jiffies), the 20ms sleep can easily
>exceed 25ms of wall-clock time, causing this to spuriously return -ETIMEDOUT.
>> + unsigned long ret = 1;
>> +
>> + lockdep_assert_held(&channel->lock);
>[ ... ]
>> @@ -427,7 +467,10 @@ static ssize_t role_store(struct device *dev, struct
device_attribute *attr,
>> const char *buf, size_t count)
>> {
>> struct rcar_gen3_chan *ch = dev_get_drvdata(dev);
>> bool is_b_device;
>> enum phy_mode cur_mode, new_mode;
>> + unsigned long flags;
>> + int ret = -EIO;
>>
>> - guard(spinlock_irqsave)(&ch->lock);
>> + spin_lock_irqsave(&ch->lock, flags);
>>
>> if (!ch->is_otg_channel || !rcar_gen3_is_any_otg_rphy_initialized(ch))
>> - return -EIO;
>> + goto unlock;
>> +
>> + ret = rcar_gen3_phy_wait_otg_init(ch, &flags);
>Does checking the initialized state before waiting for the lock to be
>released introduce a race condition?
>Because rcar_gen3_phy_usb2_init() only sets rphy->initialized = true after
>it finishes its 20ms sleep and reacquires the lock, a concurrent sysfs write
>to role_store() during this 20ms window will see initialized as false and
>immediately return -EIO instead of waiting for initialization to complete.
It is better to have it as suggested here, thus I'll adjust it as proposed.
>> + if (ret)
>> + goto unlock;
>[ ... ]
>> @@ -512,6 +556,11 @@ static irqreturn_t rcar_gen3_phy_usb2_irq(int irq, void
*_ch)
>> goto rpm_put;
>>
>> scoped_guard(spinlock, &ch->lock) {
>> + if (READ_ONCE(ch->otg_initializing)) {
>> + dev_warn(dev, "%s: Got IRQ while waiting for OTG
init!\n", __func__);
>> + return IRQ_NONE;
>> + }
>Does returning early here leak the PM runtime reference?
Indeed, I missed that, will adjust it in the next version.
>The interrupt handler calls pm_runtime_get_noresume(dev) at the beginning of
>the function, and releases it at the rpm_put label. By using return IRQ_NONE
>inside the scoped_guard, the code bypasses pm_runtime_put_noidle() completely,
>preventing the device from ever entering runtime suspend.
>Additionally, since this interrupt is requested with IRQF_SHARED, could
>another device sharing the line trigger a valid interrupt during the 20ms
>OTG initialization window? If so, this would print a spurious warning to
>the kernel log for a valid shared interrupt event.
That's a valid point. Better would be to just mask the interrupt locally.
>> +
>> status = readl(usb2_base + USB2_OBINTSTA);
>> if (status & ch->phy_data->obint_enable_bits) {
>> dev_vdbg(dev, "%s: %08x\n", __func__, status);
>[ ... ]
>> @@ -550,8 +605,22 @@ static int rcar_gen3_phy_usb2_init(struct phy *p)
>> }
>>
>> /* Initialize otg part (only if we initialize a PHY with IRQs). */
>> - if (rphy->int_enable_bits)
>> - rcar_gen3_init_otg(channel);
>> + if (rphy->int_enable_bits && channel->is_otg_channel &&
>> + !rcar_gen3_is_any_otg_rphy_initialized(channel)) {
>> + rcar_gen3_init_otg_phase0(channel);
>> + disable_irq_nosync(channel->irq);
>Could channel->irq be 0 here?
In theory, AFAIK, this code should not be executed when the interrupt is not
present as the OTG PHY needs the interrupt.
>In rcar_gen3_phy_usb2_probe(), platform_get_irq_optional() is used. If it
>returns -ENXIO, channel->irq remains 0. Since rphy->int_enable_bits is a
>statically defined non-zero value for OTG PHYs, this would pass 0 to
>disable_irq_nosync(), which manipulates critical system interrupts.
>Also, does calling disable_irq_nosync() on a shared interrupt line starve
>other devices?
>Since the interrupt is requested with IRQF_SHARED, masking it at the
>interrupt controller level means all other unrelated devices on the same
>physical line will have their interrupts disabled for the full 20ms
>duration. Should shared interrupts be masked locally at the device level
>instead?
That would be better, indeed.
>> + reinit_completion(&channel->otg_init_done);
>> + WRITE_ONCE(channel->otg_initializing, true);
>> + spin_unlock_irqrestore(&channel->lock, flags);
>[ ... ]
>> @@ -598,6 +678,7 @@ static int rcar_gen3_phy_usb2_power_on(struct phy *p)
>> struct rcar_gen3_phy *rphy = phy_get_drvdata(p);
>> struct rcar_gen3_chan *channel = rphy->ch;
>> void __iomem *usb2_base = channel->base;
>> + unsigned long flags;
>> u32 val;
>> int ret = 0;
>>
>> @@ -607,11 +688,15 @@ static int rcar_gen3_phy_usb2_power_on(struct phy *p)
>> return ret;
>> }
>>
>> - guard(spinlock_irqsave)(&channel->lock);
>> + spin_lock_irqsave(&channel->lock, flags);
>>
>> if (!rcar_gen3_are_all_rphys_power_off(channel))
>> goto out;
>>
>> + ret = rcar_gen3_phy_wait_otg_init(channel, &flags);
>> + if (ret)
>> + goto unlock;
>> +
>> val = readl(usb2_base + USB2_USBCTR);
>> val |= USB2_USBCTR_PLL_RST;
>Does dropping the lock in rcar_gen3_phy_wait_otg_init() introduce a race
>condition with the !rcar_gen3_are_all_rphys_power_off() check?
>If two threads concurrently power on different PHYs, both will evaluate the
>condition as false because neither has reached rphy->powered = true at the
>end of the function. Both threads might then sleep in
>rcar_gen3_phy_wait_otg_init().
>Upon waking and reacquiring the lock, both will unconditionally apply
>USB2_USBCTR_PLL_RST, which could catastrophically disrupt the PHY that was
>just initialized by the first thread.
>Should the power off condition be re-evaluated after the lock is reacquired?
Yes, rcar_gen3_phy_wait_otg_init() should be called first.
--
linux-phy mailing list
linux-phy@lists.infradead.org
https://lists.infradead.org/mailman/listinfo/linux-phy
WARNING: multiple messages have this Message-ID (diff)
From: Claudiu Beznea <claudiu.beznea@kernel.org>
To: yoshihiro.shimoda.uh@renesas.com, vkoul@kernel.org,
neil.armstrong@linaro.org, geert+renesas@glider.be,
magnus.damm@gmail.com, prabhakar.mahadev-lad.rj@bp.renesas.com
Cc: linux-renesas-soc@vger.kernel.org, linux-phy@lists.infradead.org,
linux-kernel@vger.kernel.org,
Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>,
stable@vger.kernel.org, Pavel Machek <pavel@nabladev.com>,
Nobuhiro Iwamatsu <iwamatsu@nigauri.org>
Subject: sashiko review: Re: [PATCH] phy: renesas: rcar-gen3-usb2: Avoid long delay in atomic context
Date: Fri, 15 May 2026 18:01:33 +0300 [thread overview]
Message-ID: <a88cbcff-2114-45fe-9ab6-d456c69dba99@kernel.org> (raw)
In-Reply-To: <20260514111300.2152386-1-claudiu.beznea@kernel.org>
>commit 856d28e68a5bcfa08fa03aa685cb15772197dedc
>Author: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
>phy: renesas: rcar-gen3-usb2: Avoid long delay in atomic context
>This commit releases the spin lock before sleeping for 20 ms during
>OTG PHY initialization and introduces completion mechanisms to prevent
>other threads from configuring the hardware concurrently. It also disables
>IRQs while waiting to prevent interrupt handlers from applying settings.
>Closes: https://lore.kernel.org/all/afhkX2Ys2BG1gnqy@duo.ucw.cz
>Closes: https://lore.kernel.org/all/afhkX2Ys2BG1gnqy@duo.ucw.cz
>> diff --git a/drivers/phy/renesas/phy-rcar-gen3-usb2.c
b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
>> index 79e820e2fe55d..5a18bbd87c3b9 100644
>> --- a/drivers/phy/renesas/phy-rcar-gen3-usb2.c
>> +++ b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
>[ ... ]
>> @@ -388,32 +392,68 @@ static bool rcar_gen3_are_all_rphys_power_off(struct
rcar_gen3_chan *ch)
>> return true;
>> }
>>
>> +static int rcar_gen3_phy_wait_otg_init(struct rcar_gen3_chan *channel,
>> + unsigned long *flags)
>> +{
>> + unsigned long timeout = msecs_to_jiffies(25);
>Could this completion timeout be too tight?
I tried to have a bit increased compared with the fsleep() value to avoid just
being to tight but not to big.
>The initializing thread uses fsleep(20000) which guarantees a minimum sleep
>of 20ms. Due to scheduler jitter or timer wheel granularity (especially if
>HZ=100, where 25ms rounds to just 2 or 3 jiffies), the 20ms sleep can easily
>exceed 25ms of wall-clock time, causing this to spuriously return -ETIMEDOUT.
>> + unsigned long ret = 1;
>> +
>> + lockdep_assert_held(&channel->lock);
>[ ... ]
>> @@ -427,7 +467,10 @@ static ssize_t role_store(struct device *dev, struct
device_attribute *attr,
>> const char *buf, size_t count)
>> {
>> struct rcar_gen3_chan *ch = dev_get_drvdata(dev);
>> bool is_b_device;
>> enum phy_mode cur_mode, new_mode;
>> + unsigned long flags;
>> + int ret = -EIO;
>>
>> - guard(spinlock_irqsave)(&ch->lock);
>> + spin_lock_irqsave(&ch->lock, flags);
>>
>> if (!ch->is_otg_channel || !rcar_gen3_is_any_otg_rphy_initialized(ch))
>> - return -EIO;
>> + goto unlock;
>> +
>> + ret = rcar_gen3_phy_wait_otg_init(ch, &flags);
>Does checking the initialized state before waiting for the lock to be
>released introduce a race condition?
>Because rcar_gen3_phy_usb2_init() only sets rphy->initialized = true after
>it finishes its 20ms sleep and reacquires the lock, a concurrent sysfs write
>to role_store() during this 20ms window will see initialized as false and
>immediately return -EIO instead of waiting for initialization to complete.
It is better to have it as suggested here, thus I'll adjust it as proposed.
>> + if (ret)
>> + goto unlock;
>[ ... ]
>> @@ -512,6 +556,11 @@ static irqreturn_t rcar_gen3_phy_usb2_irq(int irq, void
*_ch)
>> goto rpm_put;
>>
>> scoped_guard(spinlock, &ch->lock) {
>> + if (READ_ONCE(ch->otg_initializing)) {
>> + dev_warn(dev, "%s: Got IRQ while waiting for OTG
init!\n", __func__);
>> + return IRQ_NONE;
>> + }
>Does returning early here leak the PM runtime reference?
Indeed, I missed that, will adjust it in the next version.
>The interrupt handler calls pm_runtime_get_noresume(dev) at the beginning of
>the function, and releases it at the rpm_put label. By using return IRQ_NONE
>inside the scoped_guard, the code bypasses pm_runtime_put_noidle() completely,
>preventing the device from ever entering runtime suspend.
>Additionally, since this interrupt is requested with IRQF_SHARED, could
>another device sharing the line trigger a valid interrupt during the 20ms
>OTG initialization window? If so, this would print a spurious warning to
>the kernel log for a valid shared interrupt event.
That's a valid point. Better would be to just mask the interrupt locally.
>> +
>> status = readl(usb2_base + USB2_OBINTSTA);
>> if (status & ch->phy_data->obint_enable_bits) {
>> dev_vdbg(dev, "%s: %08x\n", __func__, status);
>[ ... ]
>> @@ -550,8 +605,22 @@ static int rcar_gen3_phy_usb2_init(struct phy *p)
>> }
>>
>> /* Initialize otg part (only if we initialize a PHY with IRQs). */
>> - if (rphy->int_enable_bits)
>> - rcar_gen3_init_otg(channel);
>> + if (rphy->int_enable_bits && channel->is_otg_channel &&
>> + !rcar_gen3_is_any_otg_rphy_initialized(channel)) {
>> + rcar_gen3_init_otg_phase0(channel);
>> + disable_irq_nosync(channel->irq);
>Could channel->irq be 0 here?
In theory, AFAIK, this code should not be executed when the interrupt is not
present as the OTG PHY needs the interrupt.
>In rcar_gen3_phy_usb2_probe(), platform_get_irq_optional() is used. If it
>returns -ENXIO, channel->irq remains 0. Since rphy->int_enable_bits is a
>statically defined non-zero value for OTG PHYs, this would pass 0 to
>disable_irq_nosync(), which manipulates critical system interrupts.
>Also, does calling disable_irq_nosync() on a shared interrupt line starve
>other devices?
>Since the interrupt is requested with IRQF_SHARED, masking it at the
>interrupt controller level means all other unrelated devices on the same
>physical line will have their interrupts disabled for the full 20ms
>duration. Should shared interrupts be masked locally at the device level
>instead?
That would be better, indeed.
>> + reinit_completion(&channel->otg_init_done);
>> + WRITE_ONCE(channel->otg_initializing, true);
>> + spin_unlock_irqrestore(&channel->lock, flags);
>[ ... ]
>> @@ -598,6 +678,7 @@ static int rcar_gen3_phy_usb2_power_on(struct phy *p)
>> struct rcar_gen3_phy *rphy = phy_get_drvdata(p);
>> struct rcar_gen3_chan *channel = rphy->ch;
>> void __iomem *usb2_base = channel->base;
>> + unsigned long flags;
>> u32 val;
>> int ret = 0;
>>
>> @@ -607,11 +688,15 @@ static int rcar_gen3_phy_usb2_power_on(struct phy *p)
>> return ret;
>> }
>>
>> - guard(spinlock_irqsave)(&channel->lock);
>> + spin_lock_irqsave(&channel->lock, flags);
>>
>> if (!rcar_gen3_are_all_rphys_power_off(channel))
>> goto out;
>>
>> + ret = rcar_gen3_phy_wait_otg_init(channel, &flags);
>> + if (ret)
>> + goto unlock;
>> +
>> val = readl(usb2_base + USB2_USBCTR);
>> val |= USB2_USBCTR_PLL_RST;
>Does dropping the lock in rcar_gen3_phy_wait_otg_init() introduce a race
>condition with the !rcar_gen3_are_all_rphys_power_off() check?
>If two threads concurrently power on different PHYs, both will evaluate the
>condition as false because neither has reached rphy->powered = true at the
>end of the function. Both threads might then sleep in
>rcar_gen3_phy_wait_otg_init().
>Upon waking and reacquiring the lock, both will unconditionally apply
>USB2_USBCTR_PLL_RST, which could catastrophically disrupt the PHY that was
>just initialized by the first thread.
>Should the power off condition be re-evaluated after the lock is reacquired?
Yes, rcar_gen3_phy_wait_otg_init() should be called first.
next prev parent reply other threads:[~2026-05-15 15:01 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-14 11:13 [PATCH] phy: renesas: rcar-gen3-usb2: Avoid long delay in atomic context Claudiu Beznea
2026-05-14 11:13 ` Claudiu Beznea
2026-05-14 18:52 ` sashiko-bot
2026-05-14 21:18 ` Pavel Machek
2026-05-14 21:18 ` Pavel Machek
2026-05-15 9:47 ` David Laight
2026-05-15 9:47 ` David Laight
2026-05-15 14:37 ` Claudiu Beznea
2026-05-15 14:37 ` Claudiu Beznea
2026-05-15 14:14 ` Claudiu Beznea
2026-05-15 14:14 ` Claudiu Beznea
2026-05-15 15:01 ` Claudiu Beznea [this message]
2026-05-15 15:01 ` sashiko review: " Claudiu Beznea
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a88cbcff-2114-45fe-9ab6-d456c69dba99@kernel.org \
--to=claudiu.beznea@kernel.org \
--cc=claudiu.beznea.uj@bp.renesas.com \
--cc=geert+renesas@glider.be \
--cc=iwamatsu@nigauri.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-phy@lists.infradead.org \
--cc=linux-renesas-soc@vger.kernel.org \
--cc=magnus.damm@gmail.com \
--cc=neil.armstrong@linaro.org \
--cc=pavel@nabladev.com \
--cc=prabhakar.mahadev-lad.rj@bp.renesas.com \
--cc=stable@vger.kernel.org \
--cc=vkoul@kernel.org \
--cc=yoshihiro.shimoda.uh@renesas.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.