* [PATCH 0/2] usb: Call cpu_relax() when polling registers
@ 2024-08-20 12:14 Zhongqiu Han
2024-08-20 12:15 ` [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops Zhongqiu Han
2024-08-20 12:15 ` [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers Zhongqiu Han
0 siblings, 2 replies; 7+ messages in thread
From: Zhongqiu Han @ 2024-08-20 12:14 UTC (permalink / raw)
To: Thinh.Nguyen, gregkh, mathias.nyman; +Cc: linux-usb, linux-kernel, quic_zhonhan
This series adds cpu_relax() in registers polling busy loops and
it is a well-known idiom in the kernel.
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
---
Zhongqiu Han (2):
usb: dwc3: core: Call cpu_relax() in registers polling busy loops
usb: xhci: ext-caps: Use cpu_relax() when polling registers
drivers/usb/dwc3/core.c | 2 ++
drivers/usb/host/xhci-ext-caps.h | 2 ++
2 files changed, 4 insertions(+)
base-commit: 469f1bad3c1c6e268059f78c0eec7e9552b3894c
--
2.25.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops
2024-08-20 12:14 [PATCH 0/2] usb: Call cpu_relax() when polling registers Zhongqiu Han
@ 2024-08-20 12:15 ` Zhongqiu Han
2024-08-20 22:05 ` Thinh Nguyen
2024-08-20 12:15 ` [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers Zhongqiu Han
1 sibling, 1 reply; 7+ messages in thread
From: Zhongqiu Han @ 2024-08-20 12:15 UTC (permalink / raw)
To: Thinh.Nguyen, gregkh, mathias.nyman; +Cc: linux-usb, linux-kernel, quic_zhonhan
Busy loops that poll on a register should call cpu_relax(). On some
architectures, it can lower CPU power consumption or yield to a
hyperthreaded twin processor. It also serves as a compiler barrier,
see Documentation/process/volatile-considered-harmful.rst. In addition,
if something goes wrong in the busy loop at least it can prevent things
from getting worse.
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
---
drivers/usb/dwc3/core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 734de2a8bd21..498f08dbbdb5 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2050,6 +2050,8 @@ static int dwc3_get_num_ports(struct dwc3 *dwc)
if (!offset)
break;
+ cpu_relax();
+
val = readl(base + offset);
major_revision = XHCI_EXT_PORT_MAJOR(val);
--
2.25.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers
2024-08-20 12:14 [PATCH 0/2] usb: Call cpu_relax() when polling registers Zhongqiu Han
2024-08-20 12:15 ` [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops Zhongqiu Han
@ 2024-08-20 12:15 ` Zhongqiu Han
2024-08-21 9:40 ` Mathias Nyman
1 sibling, 1 reply; 7+ messages in thread
From: Zhongqiu Han @ 2024-08-20 12:15 UTC (permalink / raw)
To: Thinh.Nguyen, gregkh, mathias.nyman; +Cc: linux-usb, linux-kernel, quic_zhonhan
It is considered good practice to call cpu_relax() in busy loops, see
Documentation/process/volatile-considered-harmful.rst. This can lower
CPU power consumption or yield to a hyperthreaded twin processor and
also serve as a compiler barrier. In addition, if something goes wrong
in the busy loop at least it can prevent things from getting worse.
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
---
drivers/usb/host/xhci-ext-caps.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/host/xhci-ext-caps.h b/drivers/usb/host/xhci-ext-caps.h
index 96eb36a58738..25d148d60ab0 100644
--- a/drivers/usb/host/xhci-ext-caps.h
+++ b/drivers/usb/host/xhci-ext-caps.h
@@ -144,6 +144,8 @@ static inline int xhci_find_next_ext_cap(void __iomem *base, u32 start, int id)
if (offset != start && (id == 0 || XHCI_EXT_CAPS_ID(val) == id))
return offset;
+ cpu_relax();
+
next = XHCI_EXT_CAPS_NEXT(val);
offset += next << 2;
} while (next);
--
2.25.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops
2024-08-20 12:15 ` [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops Zhongqiu Han
@ 2024-08-20 22:05 ` Thinh Nguyen
2024-08-21 13:59 ` Zhongqiu Han
0 siblings, 1 reply; 7+ messages in thread
From: Thinh Nguyen @ 2024-08-20 22:05 UTC (permalink / raw)
To: Zhongqiu Han
Cc: Thinh Nguyen, gregkh@linuxfoundation.org, mathias.nyman@intel.com,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org
On Tue, Aug 20, 2024, Zhongqiu Han wrote:
> Busy loops that poll on a register should call cpu_relax(). On some
> architectures, it can lower CPU power consumption or yield to a
> hyperthreaded twin processor. It also serves as a compiler barrier,
> see Documentation/process/volatile-considered-harmful.rst. In addition,
> if something goes wrong in the busy loop at least it can prevent things
> from getting worse.
>
> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
> ---
> drivers/usb/dwc3/core.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> index 734de2a8bd21..498f08dbbdb5 100644
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -2050,6 +2050,8 @@ static int dwc3_get_num_ports(struct dwc3 *dwc)
> if (!offset)
> break;
>
> + cpu_relax();
> +
> val = readl(base + offset);
> major_revision = XHCI_EXT_PORT_MAJOR(val);
>
> --
> 2.25.1
>
We're not polling on a register here. We're just traversing and reading
the next port capability. The loop in dwc3_get_num_ports() should not be
more than DWC3_USB2_MAX_PORTS + DWC3_USB3_MAX_PORTS.
What's really causing this busy loop you found?
If polling for a register is really a problem, then we would have that
problem everywhere else in dwc3. But why here?
Thanks,
Thinh
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers
2024-08-20 12:15 ` [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers Zhongqiu Han
@ 2024-08-21 9:40 ` Mathias Nyman
2024-08-21 14:04 ` Zhongqiu Han
0 siblings, 1 reply; 7+ messages in thread
From: Mathias Nyman @ 2024-08-21 9:40 UTC (permalink / raw)
To: Zhongqiu Han, Thinh.Nguyen, gregkh, mathias.nyman; +Cc: linux-usb, linux-kernel
On 20.8.2024 15.15, Zhongqiu Han wrote:
> It is considered good practice to call cpu_relax() in busy loops, see
> Documentation/process/volatile-considered-harmful.rst. This can lower
> CPU power consumption or yield to a hyperthreaded twin processor and
> also serve as a compiler barrier. In addition, if something goes wrong
> in the busy loop at least it can prevent things from getting worse.
>
> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
> ---
> drivers/usb/host/xhci-ext-caps.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/usb/host/xhci-ext-caps.h b/drivers/usb/host/xhci-ext-caps.h
> index 96eb36a58738..25d148d60ab0 100644
> --- a/drivers/usb/host/xhci-ext-caps.h
> +++ b/drivers/usb/host/xhci-ext-caps.h
> @@ -144,6 +144,8 @@ static inline int xhci_find_next_ext_cap(void __iomem *base, u32 start, int id)
> if (offset != start && (id == 0 || XHCI_EXT_CAPS_ID(val) == id))
> return offset;
>
> + cpu_relax();
> +
> next = XHCI_EXT_CAPS_NEXT(val);
> offset += next << 2;
> } while (next);
Similar case as with PATCH 1/2
This isn't a busy loop polling for some value.
We traverse xhci extended capabilities until the one we are looking for is found.
Thanks
Mathias
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops
2024-08-20 22:05 ` Thinh Nguyen
@ 2024-08-21 13:59 ` Zhongqiu Han
0 siblings, 0 replies; 7+ messages in thread
From: Zhongqiu Han @ 2024-08-21 13:59 UTC (permalink / raw)
To: Thinh Nguyen
Cc: gregkh@linuxfoundation.org, mathias.nyman@intel.com,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org
On 8/21/2024 6:05 AM, Thinh Nguyen wrote:
> On Tue, Aug 20, 2024, Zhongqiu Han wrote:
>> Busy loops that poll on a register should call cpu_relax(). On some
>> architectures, it can lower CPU power consumption or yield to a
>> hyperthreaded twin processor. It also serves as a compiler barrier,
>> see Documentation/process/volatile-considered-harmful.rst. In addition,
>> if something goes wrong in the busy loop at least it can prevent things
>> from getting worse.
>>
>> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
>> ---
>> drivers/usb/dwc3/core.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>> index 734de2a8bd21..498f08dbbdb5 100644
>> --- a/drivers/usb/dwc3/core.c
>> +++ b/drivers/usb/dwc3/core.c
>> @@ -2050,6 +2050,8 @@ static int dwc3_get_num_ports(struct dwc3 *dwc)
>> if (!offset)
>> break;
>>
>> + cpu_relax();
>> +
>> val = readl(base + offset);
>> major_revision = XHCI_EXT_PORT_MAJOR(val);
>>
>> --
>> 2.25.1
>>
>
> We're not polling on a register here. We're just traversing and reading
> the next port capability. The loop in dwc3_get_num_ports() should not be
> more than DWC3_USB2_MAX_PORTS + DWC3_USB3_MAX_PORTS.
>
Hi Thinh,
Thanks a lot for the review~
yes, now i know that the iterations are limited so it wouldn't make
sense to relax here. I will be careful about this next time and sorry
for this.
> What's really causing this busy loop you found?
>
actually no practical issue.
> If polling for a register is really a problem, then we would have that
> problem everywhere else in dwc3. But why here?
>
I also think that polling for a register is not a problem, but if there
polling for a register in the potential infinite loop, It's better to
relax the cpu and as i saw, basically there are two types of
implementations in other codes for the relax cpu target,
1. use (u/m)sleep or (u/n)delay function or the iterations limited,
such as:
(1)
core.c- if (DWC3_VER_IS_WITHIN(DWC31, 190A, ANY) || DWC3_IP_IS(DWC32))
core.c- retries = 10;
core.c-
core.c- do {
core.c: reg = dwc3_readl(dwc->regs, DWC3_DCTL);
core.c- if (!(reg & DWC3_DCTL_CSFTRST))
core.c- goto done;
core.c-
core.c- if (DWC3_VER_IS_WITHIN(DWC31, 190A, ANY) ||
DWC3_IP_IS(DWC32))
core.c- msleep(20);
core.c- else
core.c- udelay(1);
core.c- } while (--retries);
(2)
gadget.c- /* poll until Link State changes to ON */
gadget.c- retries = 20000;
gadget.c-
gadget.c- while (retries--) {
gadget.c: reg = dwc3_readl(dwc->regs, DWC3_DSTS);
gadget.c-
gadget.c- /* in HS, means ON */
gadget.c- if (DWC3_DSTS_USBLNKST(reg) ==
DWC3_LINK_STATE_U0)
gadget.c- break;
gadget.c- }
By the way, for (2) case, the retries is 20000, seems the value is large
without relax if break the loop only while retries is 0, but as we know
although if there need delay/relax, we cannot easily use (u/m)delay or m
u(sleep) functions because we need consider to avoid the "scheduling on
atomic/invalid context" BUG. Just shared my guess, unless there is
optimization comparison data after relax cpu or practical issue here.
2. use cpu_relax() to relax for busy loop, such as:
(1)
ulpi.c-static int dwc3_ulpi_busyloop(struct dwc3 *dwc, u8 addr, bool read)
..........................
ulpi.c-
ulpi.c- while (count--) {
ulpi.c- ndelay(ns);
ulpi.c: reg = dwc3_readl(dwc->regs, DWC3_GUSB2PHYACC(0));
ulpi.c- if (reg & DWC3_GUSB2PHYACC_DONE)
ulpi.c- return 0;
ulpi.c- cpu_relax();
ulpi.c- }
(2)
host/ohci-pxa27x.c: while (__raw_readl(pxa_ohci->mmio_base + UHCHR)
& UHCHR_FSBIR)
host/ohci-pxa27x.c- cpu_relax();
(3)
gadget/udc/fsl_udc_core.c: while (fsl_readl(&dr_regs->usbcmd) &
USB_CMD_CTRL_RESET) {
gadget/udc/fsl_udc_core.c- if (time_after(jiffies, timeout)) {
gadget/udc/fsl_udc_core.c-
dev_err(&udc->gadget.dev, "udc reset timeout!\n");
gadget/udc/fsl_udc_core.c- return -ETIMEDOUT;
gadget/udc/fsl_udc_core.c- }
gadget/udc/fsl_udc_core.c- cpu_relax();
gadget/udc/fsl_udc_core.c- }
Anyways, for current patch there the iterations are limited, thanks a
lot for the review and the discussion and i will be careful next time.
> Thanks,
> Thinh
--
Thx and BRs,
Zhongqiu Han
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers
2024-08-21 9:40 ` Mathias Nyman
@ 2024-08-21 14:04 ` Zhongqiu Han
0 siblings, 0 replies; 7+ messages in thread
From: Zhongqiu Han @ 2024-08-21 14:04 UTC (permalink / raw)
To: Mathias Nyman, Thinh.Nguyen, gregkh, mathias.nyman
Cc: linux-usb, linux-kernel, Zhongqiu Han
On 8/21/2024 5:40 PM, Mathias Nyman wrote:
> On 20.8.2024 15.15, Zhongqiu Han wrote:
>> It is considered good practice to call cpu_relax() in busy loops, see
>> Documentation/process/volatile-considered-harmful.rst. This can lower
>> CPU power consumption or yield to a hyperthreaded twin processor and
>> also serve as a compiler barrier. In addition, if something goes wrong
>> in the busy loop at least it can prevent things from getting worse.
>>
>> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
>> ---
>> drivers/usb/host/xhci-ext-caps.h | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/usb/host/xhci-ext-caps.h b/drivers/usb/host/xhci-
>> ext-caps.h
>> index 96eb36a58738..25d148d60ab0 100644
>> --- a/drivers/usb/host/xhci-ext-caps.h
>> +++ b/drivers/usb/host/xhci-ext-caps.h
>> @@ -144,6 +144,8 @@ static inline int xhci_find_next_ext_cap(void
>> __iomem *base, u32 start, int id)
>> if (offset != start && (id == 0 || XHCI_EXT_CAPS_ID(val) ==
>> id))
>> return offset;
>> + cpu_relax();
>> +
>> next = XHCI_EXT_CAPS_NEXT(val);
>> offset += next << 2;
>> } while (next);
>
> Similar case as with PATCH 1/2
>
> This isn't a busy loop polling for some value.
> We traverse xhci extended capabilities until the one we are looking for
> is found.
>
> Thanks
> Mathias
>
Hi Mathias,
Thanks a lot for the review, yes, it is similar case as with PATCH 1/2
there is not a busy loop polling, sorry for this and i will careful for
similar case next time, and thanks for the discussion as well.
--
Thx and BRs,
Zhongqiu Han
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-08-21 14:05 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-20 12:14 [PATCH 0/2] usb: Call cpu_relax() when polling registers Zhongqiu Han
2024-08-20 12:15 ` [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops Zhongqiu Han
2024-08-20 22:05 ` Thinh Nguyen
2024-08-21 13:59 ` Zhongqiu Han
2024-08-20 12:15 ` [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers Zhongqiu Han
2024-08-21 9:40 ` Mathias Nyman
2024-08-21 14:04 ` Zhongqiu Han
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox