From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: David Airlie <airlied@linux.ie>,
intel-gfx <intel-gfx@lists.freedesktop.org>,
Mike Galbraith <umgwanakikbuti@gmail.com>,
dri-devel <dri-devel@lists.freedesktop.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [Intel-gfx] [PATCH 4/8] drm/i915: Use preempt_disable/enable_rt() where recommended
Date: Fri, 11 Feb 2022 09:44:17 +0100 [thread overview]
Message-ID: <YgYh4Y119dKY8vaO@linutronix.de> (raw)
In-Reply-To: <CAEsyxygWP50qP9Xj-GmAycb6Ex8DET2c4DbY-i3dnocaXZ=JZA@mail.gmail.com>
On 2022-01-27 00:29:37 [+0100], Mario Kleiner wrote:
> Hi, first thank you for implementing these preempt disables according to
Hi Mario,
> the markers i left long ago. And sorry for the rather late reply.
>
> I had a look at the code, as of Linux 5.16, and did also a little test run
> (of a standard kernel, not with PREEMPT_RT, only
> CONFIG_PREEMPT_VOLUNTARY=y) on my Intel Kabylake GT2, so some thoughts:
>
> The area covers only register reads and writes. The part that worries me
> > is:
> > - __intel_get_crtc_scanline() the worst case is 100us if no match is
> > found.
> >
>
> This one can be a problem indeed on (maybe all?) modern Intel gpu's since
> Haswell, ie. the last ~10 years. I was able to reproduce it on my Kabylake
> Intel gpu.
>
> Most of the time that for-loop with up to 100 repetitions (~ 100
> udelay(1) + one mmio register read) (cfe.
> https://elixir.bootlin.com/linux/v5.17-rc1/source/drivers/gpu/drm/i915/i915_irq.c#L856)
> will not execute, because most of the time that function gets called from
> the vblank irq handler and then that trigger condition (if
> (HAS_DDI(dev_priv) && !position)) is not true. However, it also gets called
> as part of power-saving on behalf of userspace context, whenever the
> desktop graphics goes idle for two video refresh cycles. If the desktop
> shows graphics activity again, and vblank interrupts need to get reenabled,
> the probability of hitting that case is then ~1-4% depending on video mode.
> How many loops it runs also varies.
>
> On my little Intel(R) Core(TM) i5-8250U CPU machine with a mostly idle
> desktop, I observed about one hit every couple of seconds of regular use,
> and each hit took between 125 usecs and almost 250 usecs. I guess udelay(1)
> can take a bit longer than 1 usec?
it should get very close to this. Maybe something else extended the time
depending on what you observe.
> So that's too much for preempt-rt. What one could do is the following:
>
> 1. In the for-loop in __intel_get_crtc_scanline(), add a preempt_enable()
> before the udelay(1); and a preempt_disable() again after it. Or
> potentially around the whole for-loop if the overhead of
> preempt_en/disable() is significant?
It is very optimized on x86 ;)
> 2. In intel_get_crtc_scanline() also wrap the call to
> __intel_get_crtc_scanline() into a preempt_disable() and preempt_enable(),
> so we can be sure that __intel_get_crtc_scanline() always gets called with
> preemption disabled.
>
> Why should this work ok'ish? The point of the original preempt disable
> inside i915_get_crtc_scanoutpos
> <https://elixir.bootlin.com/linux/v5.17-rc1/C/ident/i915_get_crtc_scanoutpos>
> is that those two *stime = ktime_get() and *etime = ktime_get() clock
> queries happen as close to the scanout position query as possible to get a
> small confidence interval for when exactly the scanoutpos was
> read/determined from the display hardware. error = (etime - stime) is the
> error margin. If that margin becomes greater than 20 usecs, then the
> higher-level code will consider the measurement invalid and repeat the
> whole procedure up to 3 times before giving up.
The preempt-disable is needed then? The task is preemptible here on
PREEMPT_RT but it _may_ not come to this. The difference vs !RT is that
an interrupt will preempt this code without it.
> Normally, in my experience with different graphics chips, one would observe
> error < 3 usecs, so the measurement almost always succeeds at first try,
> only very rarely takes two attempts. The preempt disable is meant to make
> sure that this stays the case on a PREEMPT_RT kernel.
Was it needed?
> The problem here are the relatively rare cases where we hit that up to 100
> iterations for-loop. Here even on a regular kernel, due to hardware quirks,
> we already exceed the 20 usecs tolerance by a huge amount of more than 100
> usecs, leading to a retry of the measurement. And my tests showed that
> often the two succeeding retries also fail, because of hardware quirks can
> apparently create a blackout situation approaching 1 msec, so we lose
> anyway, regardless if we get preempted on a RT kernel or not. That's why
> enabling preemption on RT again during that for-loop should not make the
> situation worse and at least keep RT as real-time as intended.
>
> In practice I would also expect that this failure case is the one least
> likely to impair userspace applications greatly in practice. The cases that
> mostly matter are the ones executed during vblank hardware irq, where the
> for-loop never executes and error margin and preempt off time is only about
> 1 usec. My own software which depends on very precise timestamps from the
> mechanism never reported >> 20 usecs errors during startup tests or runtime
> tests.
That is without RT I assume?
> > - intel_crtc_scanlines_since_frame_timestamp() not sure how long this
> > may take in the worst case.
> >
> >
> intel_crtc_scanlines_since_frame_timestamp() should be harmless. That
> do-while loop just tries to make sure that two register reads that should
> happen within the same video refresh cycle are happening in the same
> refresh cycle. As such, the while loop will almost always just execute only
> once, and at most two times, so that's at most 6 mmio register reads for
> two loop iterations.
>
> In the long run one could try to test if
> __intel_get_crtc_scanline_from_timestamp
> <https://elixir.bootlin.com/linux/v5.17-rc1/C/ident/__intel_get_crtc_scanline_from_timestamp>()
> wouldn't be the better choice for affected hardware always. Code and
> register descriptions suggest the feature is supported by all potentially
> affected hardware, so if it would turn out that that method works as
> accurate and reliable as the old one, we could save the overhead and time
> delays for that 100 for-loop iterations and make the whole timestamping
> more reliable on modern hw.
>
> It was in the RT queue for a while and nobody complained.
> > Disable preemption on PREEPMPT_RT during timestamping.
> >
> >
> Do other patches exist to implement the preempt_dis/enable() also for AMD
> and NVidia / nouveau / vc4? I left corresponding markers for
No, nobody complained. Most likely the i915 is wider used since it is
built-in into many chipsets which then run RT and some of them use the
display in production.
> radeon/amdgpu-kms and RaspberryPi's vc4 driver. Ideally all kms drivers
> which use scanout position queries should have such code. Always a
> preempt_disable() before the "if (stime) *stime = ktime_get();" statement,
> and a preempt_enable() after the "if (etime) *etime = ktime_get();"
> statement.
>
> Checking Linux 5.16 code, this should be safe - short preempt off interval
> with only a few mmio register reads - for all kms drivers that support it
> atm. I found the following functions to modify:
>
> amdgpu: amdgpu_display_get_crtc_scanoutpos()
> radeon: radeon_get_crtc_scanoutpos()
> msm: mdp5_crtc_get_scanout_position() and dpu_crtc_get_scanout_position()
> ltdc: ltdc_crtc_get_scanout_position()
> vc4: vc4_crtc_get_scanout_position()
I that is "small" with locks and such, then it should work
> nouveau: In nvkm_head_mthd_scanoutpos(), one needs to put the
> preempt_disable() right before
>
…
>
> Is the plan to integrate these patches into the mainline kernel soon, as
> part of ongoing preempt-rt upstreaming?
I want to get the i915 in as part of RT upstreaming. But now I've been
thinking to not allowing i915 on RT via Kconfig and worry about it
afterwards.
> thanks,
> -mario
Sebastian
next prev parent reply other threads:[~2022-02-11 8:44 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-14 14:02 [Intel-gfx] [PATCH 0/8] drm/i915: PREEMPT_RT related fixups Sebastian Andrzej Siewior
2021-12-14 14:02 ` [Intel-gfx] [PATCH 1/8] drm/i915: Drop the irqs_disabled() check Sebastian Andrzej Siewior
2021-12-14 14:02 ` [Intel-gfx] [PATCH 2/8] drm/i915/gt: Queue and wait for the irq_work item Sebastian Andrzej Siewior
2021-12-14 14:02 ` [Intel-gfx] [PATCH 3/8] drm/i915/gt: Use spin_lock_irq() instead of local_irq_disable() + spin_lock() Sebastian Andrzej Siewior
2021-12-14 14:02 ` [Intel-gfx] [PATCH 4/8] drm/i915: Use preempt_disable/enable_rt() where recommended Sebastian Andrzej Siewior
2022-01-26 23:29 ` Mario Kleiner
2022-02-11 8:44 ` Sebastian Andrzej Siewior [this message]
2022-02-14 18:38 ` Mario Kleiner
2021-12-14 14:02 ` [Intel-gfx] [PATCH 5/8] drm/i915: Don't disable interrupts on PREEMPT_RT during atomic updates Sebastian Andrzej Siewior
2021-12-14 14:02 ` [Intel-gfx] [PATCH 6/8] drm/i915: Don't check for atomic context on PREEMPT_RT Sebastian Andrzej Siewior
2021-12-14 14:03 ` [Intel-gfx] [PATCH 7/8] drm/i915: Disable tracing points " Sebastian Andrzej Siewior
2021-12-14 14:36 ` Steven Rostedt
2021-12-14 15:41 ` Jani Nikula
2021-12-14 15:56 ` Sebastian Andrzej Siewior
2021-12-14 16:34 ` Ville Syrjälä
2021-12-14 16:58 ` Steven Rostedt
2022-02-08 17:32 ` Sebastian Andrzej Siewior
2021-12-14 14:03 ` [Intel-gfx] [PATCH 8/8] drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with NOTRACE Sebastian Andrzej Siewior
2021-12-14 15:58 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: PREEMPT_RT related fixups. (rev4) Patchwork
2021-12-14 16:12 ` Sebastian Andrzej Siewior
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YgYh4Y119dKY8vaO@linutronix.de \
--to=bigeasy@linutronix.de \
--cc=airlied@linux.ie \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-gfx@lists.freedesktop.org \
--cc=mario.kleiner.de@gmail.com \
--cc=tglx@linutronix.de \
--cc=umgwanakikbuti@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox