* [PATCH 0/2] usb: Call cpu_relax() when polling registers @ 2024-08-20 12:14 Zhongqiu Han 2024-08-20 12:15 ` [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops Zhongqiu Han 2024-08-20 12:15 ` [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers Zhongqiu Han 0 siblings, 2 replies; 7+ messages in thread From: Zhongqiu Han @ 2024-08-20 12:14 UTC (permalink / raw) To: Thinh.Nguyen, gregkh, mathias.nyman; +Cc: linux-usb, linux-kernel, quic_zhonhan This series adds cpu_relax() in registers polling busy loops and it is a well-known idiom in the kernel. Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> --- Zhongqiu Han (2): usb: dwc3: core: Call cpu_relax() in registers polling busy loops usb: xhci: ext-caps: Use cpu_relax() when polling registers drivers/usb/dwc3/core.c | 2 ++ drivers/usb/host/xhci-ext-caps.h | 2 ++ 2 files changed, 4 insertions(+) base-commit: 469f1bad3c1c6e268059f78c0eec7e9552b3894c -- 2.25.1 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops 2024-08-20 12:14 [PATCH 0/2] usb: Call cpu_relax() when polling registers Zhongqiu Han @ 2024-08-20 12:15 ` Zhongqiu Han 2024-08-20 22:05 ` Thinh Nguyen 2024-08-20 12:15 ` [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers Zhongqiu Han 1 sibling, 1 reply; 7+ messages in thread From: Zhongqiu Han @ 2024-08-20 12:15 UTC (permalink / raw) To: Thinh.Nguyen, gregkh, mathias.nyman; +Cc: linux-usb, linux-kernel, quic_zhonhan Busy loops that poll on a register should call cpu_relax(). On some architectures, it can lower CPU power consumption or yield to a hyperthreaded twin processor. It also serves as a compiler barrier, see Documentation/process/volatile-considered-harmful.rst. In addition, if something goes wrong in the busy loop at least it can prevent things from getting worse. Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> --- drivers/usb/dwc3/core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index 734de2a8bd21..498f08dbbdb5 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -2050,6 +2050,8 @@ static int dwc3_get_num_ports(struct dwc3 *dwc) if (!offset) break; + cpu_relax(); + val = readl(base + offset); major_revision = XHCI_EXT_PORT_MAJOR(val); -- 2.25.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops 2024-08-20 12:15 ` [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops Zhongqiu Han @ 2024-08-20 22:05 ` Thinh Nguyen 2024-08-21 13:59 ` Zhongqiu Han 0 siblings, 1 reply; 7+ messages in thread From: Thinh Nguyen @ 2024-08-20 22:05 UTC (permalink / raw) To: Zhongqiu Han Cc: Thinh Nguyen, gregkh@linuxfoundation.org, mathias.nyman@intel.com, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org On Tue, Aug 20, 2024, Zhongqiu Han wrote: > Busy loops that poll on a register should call cpu_relax(). On some > architectures, it can lower CPU power consumption or yield to a > hyperthreaded twin processor. It also serves as a compiler barrier, > see Documentation/process/volatile-considered-harmful.rst. In addition, > if something goes wrong in the busy loop at least it can prevent things > from getting worse. > > Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> > --- > drivers/usb/dwc3/core.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c > index 734de2a8bd21..498f08dbbdb5 100644 > --- a/drivers/usb/dwc3/core.c > +++ b/drivers/usb/dwc3/core.c > @@ -2050,6 +2050,8 @@ static int dwc3_get_num_ports(struct dwc3 *dwc) > if (!offset) > break; > > + cpu_relax(); > + > val = readl(base + offset); > major_revision = XHCI_EXT_PORT_MAJOR(val); > > -- > 2.25.1 > We're not polling on a register here. We're just traversing and reading the next port capability. The loop in dwc3_get_num_ports() should not be more than DWC3_USB2_MAX_PORTS + DWC3_USB3_MAX_PORTS. What's really causing this busy loop you found? If polling for a register is really a problem, then we would have that problem everywhere else in dwc3. But why here? Thanks, Thinh ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops 2024-08-20 22:05 ` Thinh Nguyen @ 2024-08-21 13:59 ` Zhongqiu Han 0 siblings, 0 replies; 7+ messages in thread From: Zhongqiu Han @ 2024-08-21 13:59 UTC (permalink / raw) To: Thinh Nguyen Cc: gregkh@linuxfoundation.org, mathias.nyman@intel.com, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org On 8/21/2024 6:05 AM, Thinh Nguyen wrote: > On Tue, Aug 20, 2024, Zhongqiu Han wrote: >> Busy loops that poll on a register should call cpu_relax(). On some >> architectures, it can lower CPU power consumption or yield to a >> hyperthreaded twin processor. It also serves as a compiler barrier, >> see Documentation/process/volatile-considered-harmful.rst. In addition, >> if something goes wrong in the busy loop at least it can prevent things >> from getting worse. >> >> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> >> --- >> drivers/usb/dwc3/core.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c >> index 734de2a8bd21..498f08dbbdb5 100644 >> --- a/drivers/usb/dwc3/core.c >> +++ b/drivers/usb/dwc3/core.c >> @@ -2050,6 +2050,8 @@ static int dwc3_get_num_ports(struct dwc3 *dwc) >> if (!offset) >> break; >> >> + cpu_relax(); >> + >> val = readl(base + offset); >> major_revision = XHCI_EXT_PORT_MAJOR(val); >> >> -- >> 2.25.1 >> > > We're not polling on a register here. We're just traversing and reading > the next port capability. The loop in dwc3_get_num_ports() should not be > more than DWC3_USB2_MAX_PORTS + DWC3_USB3_MAX_PORTS. > Hi Thinh, Thanks a lot for the review~ yes, now i know that the iterations are limited so it wouldn't make sense to relax here. I will be careful about this next time and sorry for this. > What's really causing this busy loop you found? > actually no practical issue. > If polling for a register is really a problem, then we would have that > problem everywhere else in dwc3. But why here? > I also think that polling for a register is not a problem, but if there polling for a register in the potential infinite loop, It's better to relax the cpu and as i saw, basically there are two types of implementations in other codes for the relax cpu target, 1. use (u/m)sleep or (u/n)delay function or the iterations limited, such as: (1) core.c- if (DWC3_VER_IS_WITHIN(DWC31, 190A, ANY) || DWC3_IP_IS(DWC32)) core.c- retries = 10; core.c- core.c- do { core.c: reg = dwc3_readl(dwc->regs, DWC3_DCTL); core.c- if (!(reg & DWC3_DCTL_CSFTRST)) core.c- goto done; core.c- core.c- if (DWC3_VER_IS_WITHIN(DWC31, 190A, ANY) || DWC3_IP_IS(DWC32)) core.c- msleep(20); core.c- else core.c- udelay(1); core.c- } while (--retries); (2) gadget.c- /* poll until Link State changes to ON */ gadget.c- retries = 20000; gadget.c- gadget.c- while (retries--) { gadget.c: reg = dwc3_readl(dwc->regs, DWC3_DSTS); gadget.c- gadget.c- /* in HS, means ON */ gadget.c- if (DWC3_DSTS_USBLNKST(reg) == DWC3_LINK_STATE_U0) gadget.c- break; gadget.c- } By the way, for (2) case, the retries is 20000, seems the value is large without relax if break the loop only while retries is 0, but as we know although if there need delay/relax, we cannot easily use (u/m)delay or m u(sleep) functions because we need consider to avoid the "scheduling on atomic/invalid context" BUG. Just shared my guess, unless there is optimization comparison data after relax cpu or practical issue here. 2. use cpu_relax() to relax for busy loop, such as: (1) ulpi.c-static int dwc3_ulpi_busyloop(struct dwc3 *dwc, u8 addr, bool read) .......................... ulpi.c- ulpi.c- while (count--) { ulpi.c- ndelay(ns); ulpi.c: reg = dwc3_readl(dwc->regs, DWC3_GUSB2PHYACC(0)); ulpi.c- if (reg & DWC3_GUSB2PHYACC_DONE) ulpi.c- return 0; ulpi.c- cpu_relax(); ulpi.c- } (2) host/ohci-pxa27x.c: while (__raw_readl(pxa_ohci->mmio_base + UHCHR) & UHCHR_FSBIR) host/ohci-pxa27x.c- cpu_relax(); (3) gadget/udc/fsl_udc_core.c: while (fsl_readl(&dr_regs->usbcmd) & USB_CMD_CTRL_RESET) { gadget/udc/fsl_udc_core.c- if (time_after(jiffies, timeout)) { gadget/udc/fsl_udc_core.c- dev_err(&udc->gadget.dev, "udc reset timeout!\n"); gadget/udc/fsl_udc_core.c- return -ETIMEDOUT; gadget/udc/fsl_udc_core.c- } gadget/udc/fsl_udc_core.c- cpu_relax(); gadget/udc/fsl_udc_core.c- } Anyways, for current patch there the iterations are limited, thanks a lot for the review and the discussion and i will be careful next time. > Thanks, > Thinh -- Thx and BRs, Zhongqiu Han ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers 2024-08-20 12:14 [PATCH 0/2] usb: Call cpu_relax() when polling registers Zhongqiu Han 2024-08-20 12:15 ` [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops Zhongqiu Han @ 2024-08-20 12:15 ` Zhongqiu Han 2024-08-21 9:40 ` Mathias Nyman 1 sibling, 1 reply; 7+ messages in thread From: Zhongqiu Han @ 2024-08-20 12:15 UTC (permalink / raw) To: Thinh.Nguyen, gregkh, mathias.nyman; +Cc: linux-usb, linux-kernel, quic_zhonhan It is considered good practice to call cpu_relax() in busy loops, see Documentation/process/volatile-considered-harmful.rst. This can lower CPU power consumption or yield to a hyperthreaded twin processor and also serve as a compiler barrier. In addition, if something goes wrong in the busy loop at least it can prevent things from getting worse. Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> --- drivers/usb/host/xhci-ext-caps.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/usb/host/xhci-ext-caps.h b/drivers/usb/host/xhci-ext-caps.h index 96eb36a58738..25d148d60ab0 100644 --- a/drivers/usb/host/xhci-ext-caps.h +++ b/drivers/usb/host/xhci-ext-caps.h @@ -144,6 +144,8 @@ static inline int xhci_find_next_ext_cap(void __iomem *base, u32 start, int id) if (offset != start && (id == 0 || XHCI_EXT_CAPS_ID(val) == id)) return offset; + cpu_relax(); + next = XHCI_EXT_CAPS_NEXT(val); offset += next << 2; } while (next); -- 2.25.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers 2024-08-20 12:15 ` [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers Zhongqiu Han @ 2024-08-21 9:40 ` Mathias Nyman 2024-08-21 14:04 ` Zhongqiu Han 0 siblings, 1 reply; 7+ messages in thread From: Mathias Nyman @ 2024-08-21 9:40 UTC (permalink / raw) To: Zhongqiu Han, Thinh.Nguyen, gregkh, mathias.nyman; +Cc: linux-usb, linux-kernel On 20.8.2024 15.15, Zhongqiu Han wrote: > It is considered good practice to call cpu_relax() in busy loops, see > Documentation/process/volatile-considered-harmful.rst. This can lower > CPU power consumption or yield to a hyperthreaded twin processor and > also serve as a compiler barrier. In addition, if something goes wrong > in the busy loop at least it can prevent things from getting worse. > > Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> > --- > drivers/usb/host/xhci-ext-caps.h | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/usb/host/xhci-ext-caps.h b/drivers/usb/host/xhci-ext-caps.h > index 96eb36a58738..25d148d60ab0 100644 > --- a/drivers/usb/host/xhci-ext-caps.h > +++ b/drivers/usb/host/xhci-ext-caps.h > @@ -144,6 +144,8 @@ static inline int xhci_find_next_ext_cap(void __iomem *base, u32 start, int id) > if (offset != start && (id == 0 || XHCI_EXT_CAPS_ID(val) == id)) > return offset; > > + cpu_relax(); > + > next = XHCI_EXT_CAPS_NEXT(val); > offset += next << 2; > } while (next); Similar case as with PATCH 1/2 This isn't a busy loop polling for some value. We traverse xhci extended capabilities until the one we are looking for is found. Thanks Mathias ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers 2024-08-21 9:40 ` Mathias Nyman @ 2024-08-21 14:04 ` Zhongqiu Han 0 siblings, 0 replies; 7+ messages in thread From: Zhongqiu Han @ 2024-08-21 14:04 UTC (permalink / raw) To: Mathias Nyman, Thinh.Nguyen, gregkh, mathias.nyman Cc: linux-usb, linux-kernel, Zhongqiu Han On 8/21/2024 5:40 PM, Mathias Nyman wrote: > On 20.8.2024 15.15, Zhongqiu Han wrote: >> It is considered good practice to call cpu_relax() in busy loops, see >> Documentation/process/volatile-considered-harmful.rst. This can lower >> CPU power consumption or yield to a hyperthreaded twin processor and >> also serve as a compiler barrier. In addition, if something goes wrong >> in the busy loop at least it can prevent things from getting worse. >> >> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> >> --- >> drivers/usb/host/xhci-ext-caps.h | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/usb/host/xhci-ext-caps.h b/drivers/usb/host/xhci- >> ext-caps.h >> index 96eb36a58738..25d148d60ab0 100644 >> --- a/drivers/usb/host/xhci-ext-caps.h >> +++ b/drivers/usb/host/xhci-ext-caps.h >> @@ -144,6 +144,8 @@ static inline int xhci_find_next_ext_cap(void >> __iomem *base, u32 start, int id) >> if (offset != start && (id == 0 || XHCI_EXT_CAPS_ID(val) == >> id)) >> return offset; >> + cpu_relax(); >> + >> next = XHCI_EXT_CAPS_NEXT(val); >> offset += next << 2; >> } while (next); > > Similar case as with PATCH 1/2 > > This isn't a busy loop polling for some value. > We traverse xhci extended capabilities until the one we are looking for > is found. > > Thanks > Mathias > Hi Mathias, Thanks a lot for the review, yes, it is similar case as with PATCH 1/2 there is not a busy loop polling, sorry for this and i will careful for similar case next time, and thanks for the discussion as well. -- Thx and BRs, Zhongqiu Han ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-08-21 14:05 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-08-20 12:14 [PATCH 0/2] usb: Call cpu_relax() when polling registers Zhongqiu Han 2024-08-20 12:15 ` [PATCH 1/2] usb: dwc3: core: Call cpu_relax() in registers polling busy loops Zhongqiu Han 2024-08-20 22:05 ` Thinh Nguyen 2024-08-21 13:59 ` Zhongqiu Han 2024-08-20 12:15 ` [PATCH 2/2] usb: xhci: ext-caps: Use cpu_relax() when polling registers Zhongqiu Han 2024-08-21 9:40 ` Mathias Nyman 2024-08-21 14:04 ` Zhongqiu Han
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox