From: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
To: Selvarasu Ganesan <selvarasu.g@samsung.com>
Cc: Thinh Nguyen <Thinh.Nguyen@synopsys.com>,
"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"jh0801.jung@samsung.com" <jh0801.jung@samsung.com>,
"dh10.jung@samsung.com" <dh10.jung@samsung.com>,
"naushad@samsung.com" <naushad@samsung.com>,
"akash.m5@samsung.com" <akash.m5@samsung.com>,
"rc93.raju@samsung.com" <rc93.raju@samsung.com>,
"taehyun.cho@samsung.com" <taehyun.cho@samsung.com>,
"hongpooh.kim@samsung.com" <hongpooh.kim@samsung.com>,
"eomji.oh@samsung.com" <eomji.oh@samsung.com>,
"shijie.cai@samsung.com" <shijie.cai@samsung.com>
Subject: Re: [PATCH] usb: dwc3: Potential fix of possible dwc3 interrupt storm
Date: Wed, 11 Sep 2024 00:24:10 +0000 [thread overview]
Message-ID: <20240911002408.gr4fv5vkst7ukxd5@synopsys.com> (raw)
In-Reply-To: <dff83c7d-56b8-481f-af69-8d4262bd54e4@samsung.com>
On Tue, Sep 10, 2024, Selvarasu Ganesan wrote:
>
> On 9/7/2024 6:09 AM, Thinh Nguyen wrote:
> > On Sat, Sep 07, 2024, Selvarasu Ganesan wrote:
> >> Hi Thinh,
> >>
> >> I ran the code you recommended on our testing environment and was able
> >> to reproduce the issue one time.
> >>
> >> When evt->flags contains DWC3_EVENT_PENDING, I've included the following
> >> debugging information: I added this debug message at the start of
> >> dwc3_event_buffers_cleanup and dwc3_event_buffers_setup functions in
> >> during suspend and resume.
> >>
> >> The results were quite interesting . I'm curious to understand how
> >> evt->flags is set to DWC3_EVENT_PENDING, and along with DWC3_GEVNTSIZ is
> >> equal to 0x1000 during the suspend.
> > That is indeed strange.
> >
> >> Its means that the previous bottom-half handler prior to suspend might
> >> still be executing in the middle of the process.
> >>
> >> Could you please give your suggestions here? And let me know if anything
> >> want to test or additional details are required.
> >>
> >>
> >> ##DBG: dwc3_event_buffers_cleanup:
> >> evt->length :0x1000
> >> evt->lpos :0x20c
> >> evt->count :0x0
> >> evt->flags :0x1 // This is Unexpected if DWC3_GEVNTSIZ(0)(0xc408):
> >> 0x00001000. Its means that previous bottom-half handler may be still
> >> running in middle
> > Perhaps.
> >
> > But I doubt that's the case since it shouldn't take that long for the
> > bottom-half to be completed before the next resume yet the flag is still
> > set.
> >
> >> DWC3_GEVNTSIZ(0)(0xc408) : 0x00001000
> >> DWC3_GEVNTCOUNT(0)(0xc40c) : 0x00000000
> >> DWC3_DCFG(0xc700) : 0x00e008a8
> >> DWC3_DCTL(0xc704) : 0x0cf00a00
> >> DWC3_DEVTEN(0xc708) : 0x00000000
> >> DWC3_DSTS(0xc70c) : 0x00d20cd1
> >>
> > The controller status is halted. So there's no problem with
> > soft-disconnect. For the interrupt mask in GEVNTSIZ to be cleared,
> > that likely means that the bottom-half had probably completed.
>
> Agree, But I am worrying on If the bottom-half is completed, then
> DWC3_EVENT_PENDING must be cleared in evt->flags.
> Is there any possibility of a CPU reordering issue when updating
> evt->flags in the bottom-half handler?.
> Should I try with wmb() when writing to evt->flags?
Assuming that the problem occurs after the bottom-half completed, there
should be implicit memory barrier. The memory operation should complete
before the release from spin_unlock complete. I don't think wmb() will
help.
> >
> >> ##DBG: dwc3_event_buffers_setup:
> >> evt->length :0x1000
> >> evt->lpos :0x20c
> > They fact that evt->lpos did not get updated tells me that there's
> > something wrong with memory access to your platform during suspend and
> > resume.
>
> Are you expecting the evt->lpos value to be zero here? If so, this is
> expected in our test setup because we avoid writing zero to evt->lpos as
> part of dwc3_event_buffers_cleanup if evt->flags have a value of 1. This
Oh ok. I did not know you made this modification.
> is simply to track the status of evt->lpos during suspend to resume when
> evt->flags have a value of DWC3_EVENT_PENDING. The following test codes
> for the reference.
>
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -505,8 +505,20 @@ static int dwc3_alloc_event_buffers(struct dwc3
> *dwc, unsigned int length)
> int dwc3_event_buffers_setup(struct dwc3 *dwc)
> {
> struct dwc3_event_buffer *evt;
> + u32 reg;
>
> evt = dwc->ev_buf;
> +
> + if (evt->flags & DWC3_EVENT_PENDING) {
> + pr_info("evt->length :%x\n", evt->length);
> + pr_info("evt->lpos :%x\n", evt->lpos);
> + pr_info("evt->count :%x\n", evt->count);
> + pr_info("evt->flags :%x\n", evt->flags);
> +
> + dwc3_exynos_reg_dump(dwc);
> +
> + }
> +
> evt->lpos = 0;
> dwc3_writel(dwc->regs, DWC3_GEVNTADRLO(0),
> lower_32_bits(evt->dma));
> @@ -514,8 +526,10 @@ int dwc3_event_buffers_setup(struct dwc3 *dwc)
> upper_32_bits(evt->dma));
> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
> DWC3_GEVNTSIZ_SIZE(evt->length));
> - dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 0);
>
> + /* Clear any stale event */
> + reg = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
> + dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), reg);
> return 0;
> }
>
> @@ -525,7 +539,16 @@ void dwc3_event_buffers_cleanup(struct dwc3 *dwc)
>
> evt = dwc->ev_buf;
>
> - evt->lpos = 0;
> + if (evt->flags & DWC3_EVENT_PENDING) {
> + pr_info("evt->length :%x\n", evt->length);
> + pr_info("evt->lpos :%x\n", evt->lpos);
> + pr_info("evt->count :%x\n", evt->count);
> + pr_info("evt->flags :%x\n", evt->flags);
> +
> + dwc3_exynos_reg_dump(dwc);
> + } else {
> + evt->lpos = 0;
I wasn't aware of this change.
> + }
>
> >
> >> evt->count :0x0
> >> evt->flags :0x1 // Still It's not clearing in during resume.
> >>
> >> DWC3_GEVNTSIZ(0)(0xc408) : 0x00000000
> >> DWC3_GEVNTCOUNT(0)(0xc40c) : 0x00000000
> >> DWC3_DCFG(0xc700) : 0x00080800
> >> DWC3_DCTL(0xc704) : 0x00f00000
> >> DWC3_DEVTEN(0xc708) : 0x00000000
> >> DWC3_DSTS(0xc70c) : 0x00d20001
> >>
> > Please help look into your platform to see what condition triggers this
> > memory access issue. If this is a hardware quirk, we can properly update
> > the change and note it to be so.
>
> Sure I will try to figure it out. However, we are facing challenges in
> reproducing the issue. There could be a delay in understanding the
> conditions that trigger the memory issue if it is related to a memory issue.
>
> >
> > Thanks,
> > Thinh
> >
> > (If possible, for future tests, please dump the dwc3 tracepoints. Many
> > thanks for the tests.)
>
> I tried to get dwc3 traces in the failure case, but so far no instances
> have been reported. Our testing is still in progress with enable dwc3
> traces.
>
> I will keep posting once I get the dwc3 traces in the failure scenario.
>
Thanks for the update. I hope enabling of the driver tracepoints will
not impact the reproduction of the issue. With the driver log, we'll get
more clues to what was going on.
Thanks,
Thinh
next prev parent reply other threads:[~2024-09-11 0:24 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20240719110149epcas5p3dd468685a095c094ed2e540279bf3ec2@epcas5p3.samsung.com>
2024-07-19 11:00 ` [PATCH] usb: dwc3: Potential fix of possible dwc3 interrupt storm Selvarasu Ganesan
2024-08-07 0:38 ` Thinh Nguyen
2024-08-07 6:20 ` Selvarasu Ganesan
2024-08-08 1:15 ` Thinh Nguyen
2024-08-08 6:23 ` Selvarasu Ganesan
2024-08-09 23:42 ` Thinh Nguyen
2024-08-09 23:45 ` Thinh Nguyen
2024-08-10 15:14 ` Selvarasu Ganesan
2024-08-30 12:16 ` Selvarasu Ganesan
2024-08-31 0:50 ` Thinh Nguyen
2024-09-02 11:27 ` Selvarasu Ganesan
2024-09-03 23:41 ` Thinh Nguyen
2024-09-04 1:03 ` Thinh Nguyen
2024-09-04 15:50 ` Selvarasu Ganesan
2024-09-05 0:26 ` Thinh Nguyen
2024-09-05 13:19 ` Selvarasu Ganesan
2024-09-05 21:13 ` Thinh Nguyen
2024-09-05 23:05 ` Selvarasu Ganesan
2024-09-05 23:18 ` Thinh Nguyen
2024-09-06 0:28 ` Selvarasu Ganesan
2024-09-06 0:59 ` Thinh Nguyen
2024-09-06 19:02 ` Selvarasu Ganesan
2024-09-07 0:39 ` Thinh Nguyen
2024-09-10 13:37 ` Selvarasu Ganesan
2024-09-11 0:24 ` Thinh Nguyen [this message]
2024-09-13 12:42 ` Selvarasu Ganesan
2024-09-13 17:51 ` Thinh Nguyen
2024-09-13 18:00 ` Thinh Nguyen
2024-09-16 12:43 ` Selvarasu Ganesan
2024-09-16 21:19 ` Thinh Nguyen
2024-09-16 12:41 ` Selvarasu Ganesan
2024-09-16 21:18 ` Thinh Nguyen
2024-09-16 22:54 ` Selvarasu Ganesan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240911002408.gr4fv5vkst7ukxd5@synopsys.com \
--to=thinh.nguyen@synopsys.com \
--cc=akash.m5@samsung.com \
--cc=dh10.jung@samsung.com \
--cc=eomji.oh@samsung.com \
--cc=gregkh@linuxfoundation.org \
--cc=hongpooh.kim@samsung.com \
--cc=jh0801.jung@samsung.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-usb@vger.kernel.org \
--cc=naushad@samsung.com \
--cc=rc93.raju@samsung.com \
--cc=selvarasu.g@samsung.com \
--cc=shijie.cai@samsung.com \
--cc=taehyun.cho@samsung.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox