From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>,
intel-xe@lists.freedesktop.org
Subject: Re: [Intel-xe] [PATCH] drm/xe/mmio: Make xe_mmio_wait32() aware of interrupts
Date: Wed, 15 Nov 2023 13:31:25 -0500
Message-ID: <ZVUOfWnAi8Y8qhPF@intel.com>
In-Reply-To: <170007111735.1634.7618023957528021915@gjsousa-mobl2>

On Wed, Nov 15, 2023 at 02:58:37PM -0300, Gustavo Sousa wrote:
> Quoting Lucas De Marchi (2023-11-15 13:51:41-03:00)
> >On Tue, Nov 14, 2023 at 07:09:22PM -0300, Gustavo Sousa wrote:
> >>With the current implementation, a preemption or other kind of interrupt
> >>might happen between xe_mmio_read32() and ktime_get_raw(). Such an
> >>interruption (especially in the case of preemption) might be long enough
> >>to cause a timeout without giving a chance of a new check on the
> >>register value on a next iteration, which would have happened otherwise.
> >>
> >>This issue causes some sporadic timeouts in some code paths. As an
> >>example, we were experiencing some rare timeouts when waiting for PLL
> >>unlock for C10/C20 PHYs (see intel_cx0pll_disable()). After debugging,
> >>we found out that the PLL unlock was happening within the expected time
> >>period (20us), which suggested a bug in xe_mmio_wait32().
> >>
> >>To fix the issue, ensure that we call ktime_get_raw() to get the current
> >>time before we check the register value. This allows for a last check
> >>after a timeout and covers the case where it is caused by some
> >>preemption/interrupt.
> >>
> >>This change was tested with the aforementioned PLL unlocking code path.
> >>Experiments showed that, before this change, we observed reported
> >>timeouts in 54 of 5000 runs; and, after this change, no timeouts were
> >>reported in 5000 runs.
> >
> >good find :)

indeed!

> >> 	for (;;) {
> >>+		cur = ktime_get_raw();
> >>+
> >>+		/*
> >>+		 * Keep the compiler from re-ordering calls to ktime_get_raw()
> >>+		 * and xe_mmio_read32(): reading the current time after reading
> >>+		 * register has the potential for "fake timeouts" due to
> >>+		 * preemption/interrupts in between the two.
> >>+		 */
> >>+		barrier();
> >
> >we are only protecting here against the compiler reordering this. I
> >don't think it will due to the external call to ktime_get_raw(). Do you
> >get any failures if you remove the barrier?
>
> Makes sense.
>
> I put the barrier here to be sure that the compiler doesn't surprise us, but I
> haven't tested without it. Maybe it is not really necessary indeed...

Yep, it is probably better without the barrier, if it works.

>
> >
> >Anyway, we probably don't need a full barrier and just using
> >WRITE_ONCE() for both would be more appropriate.
> >
> >I wonder if we could do this logic better and have a header / wait /
> >tail approach. In both header and tail we read the mmio and don't
> >care if timeout occurred.
>
> Yeah. I was thinking about something similar, but ended up sending it this way to
> have a single place doing the register read.

Yep, that is the same reason I kept it like this in the past. But I wouldn't
mind a change if you believe there is a better way, as long as we respect the
requested timeout.

> >
> >+Rodrigo, shouldn't we move this to a .c? IMO it's very big for an
> >inline.

Indeed... although I believe the compiler would ignore our inline request
anyway if this gets called from multiple places. IIRC I had seen other
inlines of the same size in the kernel, so I thought it would be okay. But
it is also fine by me if this gets moved to the .c file.

Thanks for taking care of this,
Rodrigo.
Thread overview: 16+ messages
2023-11-14 22:09 [Intel-xe] [PATCH] drm/xe/mmio: Make xe_mmio_wait32() aware of interrupts Gustavo Sousa
2023-11-14 23:29 ` [Intel-xe] ✓ CI.Patch_applied: success for " Patchwork
2023-11-14 23:29 ` [Intel-xe] ✓ CI.checkpatch: " Patchwork
2023-11-14 23:30 ` [Intel-xe] ✓ CI.KUnit: " Patchwork
2023-11-14 23:37 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-11-14 23:38 ` [Intel-xe] ✓ CI.Hooks: " Patchwork
2023-11-14 23:39 ` [Intel-xe] ✓ CI.checksparse: " Patchwork
2023-11-15 0:14 ` [Intel-xe] ✓ CI.BAT: " Patchwork
2023-11-15 16:51 ` [Intel-xe] [PATCH] " Lucas De Marchi
2023-11-15 17:58 ` Gustavo Sousa
2023-11-15 18:31 ` Rodrigo Vivi [this message]
2023-11-15 20:04 ` Lucas De Marchi
2023-11-16 13:42 ` Gustavo Sousa
2023-11-16 17:56 ` Lucas De Marchi
2023-11-15 22:31 ` [Intel-xe] ✗ CI.Patch_applied: failure for drm/xe/mmio: Make xe_mmio_wait32() aware of interrupts (rev3) Patchwork
2023-11-17 21:30 ` Patchwork