From: Mario Kleiner <mario.kleiner.de@gmail.com>
To: "Michel Dänzer" <michel@daenzer.net>
Cc: alexander.deucher@amd.com, daniel.vetter@ffwll.ch,
	dri-devel@lists.freedesktop.org, vbabka@suse.cz,
	christian.koenig@amd.com
Subject: Re: [PATCH 1/2] drm/radeon: Use drm_vblank_off/on to fix vblank counter trouble.
Date: Fri, 22 Jan 2016 18:08:29 +0100
Message-ID: <56A2620D.7050909@gmail.com>
In-Reply-To: <56A19F3C.5030208@daenzer.net>

On 01/22/2016 04:17 AM, Michel Dänzer wrote:
> On 21.01.2016 18:16, Mario Kleiner wrote:
>> On 01/21/2016 09:25 AM, Michel Dänzer wrote:
>>> On 21.01.2016 17:16, Mario Kleiner wrote:
>>>>
>>>> This patch replaces calls to drm_vblank_pre/post_modeset in the
>>>> drivers dpms code with calls to drm_vblank_off/on, as recommended
>>>> for drivers with hw counters that reset to zero during modeset.
>>>
>>> Sounds like you fell for the drm_vblank_on/off propaganda. :(
>>>
>>> This was working fine with drm_vblank_pre/post_modeset, that it broke
>>> is simply a regression.
>>
>> I agree with you that pre/post modeset breakage is a regression. It's
>> just that i stumbled over the on/off stuff while searching for a
>> solution and the other sort of hacks i could think of looked similar or
>> more convoluted/hacky/fragile to me.
>
> Finding and fixing the cause of a regression isn't a hack, it's
> established procedure.
>

That's not what i meant. I meant i couldn't find anything less
complicated or risky, or without potential for new regressions, so this
looked like the better solution. Of course i would have tested my own
patches against at least a couple bits of userspace (ati ddx,
modesetting ddx, weston), i just didn't have access to the machine
yesterday.

Anyway, some more hours of thinking and code browsing later, now i think 
i have a simple and safe solution which should hopefully restore the 
drm_vblank_pre/post_modeset behaviour with only a few lines of core 
code. At the same time it should fix up another bug in that new 
drm_update_vblank_count code that i just realized, in a way simple 
enough for a stable fix.

Now i just need to actually code and test it first.

>
>> And they probably wouldn't solve that other small race i found as easily
>> - I don't think it's likely to happen (often/at all?) in practice, but i
>> have trouble "forgetting" about its existence now.
>
> That's something which should be addressed independently from the
> regression fix.
>
> Please split up the PM fixes from your patch into one or two separate
> patches (which may be appropriate for 4.5 / stable trees), and leave the
> switch to drm_vblank_on/off to Daniel's patch for 4.6.
>

I will look into that once i'm done with the above, and probably got 
some sleep again.

Fixing this race regression without switching to vblank_off/on might 
need a small bit of extra band aid there.

Btw. wrt. the radeon_pm.c fix: It's certainly good to fix that potential 
drm_vblank_get/put imbalance. I wonder if that "might glitch" DEBUG 
message makes much sense though. Can that code run during a modeset at 
all? And if so, i'd almost expect that there won't be any vblank irqs 
available at that point anyway - once the crtc's are off they don't 
trigger vblank irqs anymore - so that code might glitch due to lack of 
vblank sync regardless if drm_vblank_get is successful or not?
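
To make the get/put imbalance concrete, here is a minimal userspace sketch, not the actual radeon_pm.c code. All mock_* names are hypothetical stand-ins and not DRM API. The only point is that the put happens exactly when the get succeeded, and that a failed get (e.g. crtc off, no vblank irqs) takes the "might glitch" path without touching the refcount:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for per-crtc vblank state; not the real DRM struct. */
struct mock_vblank {
	bool irq_available;	/* false once the crtc is off */
	int refcount;		/* outstanding vblank references */
};

/* Mock "get": returns 0 on success, -1 if vblank irqs are unavailable. */
static int mock_vblank_get(struct mock_vblank *vbl)
{
	if (!vbl->irq_available)
		return -1;
	vbl->refcount++;
	return 0;
}

/* Mock "put": must only be called to balance a successful get. */
static void mock_vblank_put(struct mock_vblank *vbl)
{
	assert(vbl->refcount > 0);
	vbl->refcount--;
}

/* Reclock-style helper: returns true if it held a vblank reference. */
static bool mock_reclock(struct mock_vblank *vbl)
{
	bool have_ref = (mock_vblank_get(vbl) == 0);

	if (!have_ref)
		fprintf(stderr, "no vblank reference, reclock might glitch\n");

	/* ... the actual clock change would go here ... */

	if (have_ref)
		mock_vblank_put(vbl);	/* balanced: only after a successful get */
	return have_ref;
}
```

Calling mock_reclock() on a state with irq_available set and on one without it leaves the refcount at zero in both cases. An unconditional put after a failed get would drive it negative, which is the imbalance.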

The other thing is my placement of the radeon_pm_compute_clocks() in the
DPMS_ON path. I moved it to fix the potential extra race i described.
But thinking about it, wouldn't the better place be at the beginning of
the DPMS on path, before the atombios calls reenable the crtcs? I don't
know the driver well enough, but it looked a bit suspicious to me that
the memory clocks, linebuffer watermarks etc. get updated for the new
video mode after the crtc has been enabled. Won't it then potentially
start running for a moment with wrong memory bandwidth etc.? That's
probably something for you to check - no idea, just something i noticed
as slightly odd.

Also moving it up might avoid collisions with Daniel's patch, if that 
move doesn't hurt.

>
>>> I'm not against switching to drm_vblank_on/off for 4.6, but it's not a
>>> solution for older kernels.
>>
>> Linux 4.4 is an especially important stable kernel for me because it's
>> supposed to be the standard distro kernel for Ubuntu 16.04-LTS and
>> siblings/derivatives (Linux Mint) for up to the next 5 years. Having
>> many of my neuroscience users ending on that kernel as their very first
>> impression of Linux with something potentially broken in vblank land
>> scares me. The reliability of timing/timestamping stuff is
>> super-important for them, at the same time hand-holding many of them
>> through non-standard kernel upgrades would be so much not fun.
>
> But fast-tracking the switch to drm_vblank_on/off, which haven't been
> widely tested with this driver, all the way to 4.4 seems less risky to
> you? Seriously?
>

It made a lot of sense after 12 hours of browsing code, thinking about
all kinds of race conditions and other new personal horrors ;) - Luckily
i don't have to decide.

>> Just to say i'm probably way too biased wrt. what solution for this
>> should get backported into an older kernel.
>
> At the risk of sounding like a broken record: There's a bug in 4.4 (and
> older/newer kernels) causing the vblank counter to jump forward across
> DPMS off with drm_vblank_on/off. So it sounds like you don't want to use
> those at least until we've found and fixed that.
>
>

Modesetting "jumps" are not a problem for my application, as production
or test sessions won't ever run during modesets or other "disruptive"
events like dpms on/off. That's also why my own test suite never detected
such trouble despite extensive testing. Obviously not having a well
working desktop, or non-working logins etc., would be a bad problem for
myself and my users. It still surprises me that the bug Vlastimil
reported never happened to me in days, given that i logged out/in a
lot for testing different X-Server xorg.conf configs on KDE Plasma 5.
Even now that i see the jumps during modesets in my dmesg i haven't
managed to get Plasma or glxgears etc. to lock up during dozens of modesets.
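
For illustration, the kind of forward jump discussed in this thread can be reproduced with a simplified model. It assumes the software vblank count is advanced by a masked difference of successive hardware counter samples. The 24-bit mask and all mock_* names are assumptions for this sketch, not the driver's values or DRM API. A wrap-safe delta is correct when the hw counter wraps, but turns a reset to zero into a huge positive increment unless the baseline is resampled first:

```c
#include <stdint.h>

/* Assumed width of the hardware frame counter (24 bits); illustrative only. */
#define MOCK_MAX_VBLANK_COUNT 0x00ffffffu

struct mock_vblank_state {
	uint32_t last_hw;	/* hw counter value at the previous update */
	uint32_t sw_count;	/* software vblank count seen by clients */
};

/*
 * Wrap-safe delta: right for a counter that wraps around, wrong for a
 * counter that was reset to zero behind our back.
 */
static uint32_t mock_update(struct mock_vblank_state *st, uint32_t cur_hw)
{
	uint32_t diff = (cur_hw - st->last_hw) & MOCK_MAX_VBLANK_COUNT;

	st->sw_count += diff;
	st->last_hw = cur_hw;
	return diff;
}

/* After a known reset, resample the baseline instead of applying a delta. */
static void mock_resync(struct mock_vblank_state *st, uint32_t cur_hw)
{
	st->last_hw = cur_hw;
}
```

With last_hw at 1003, a hw counter that reset and now reads 2 yields a delta of 0x1000000 - 1001 in mock_update(), i.e. a jump of almost the full counter range. Calling mock_resync() with the post-reset value first yields a delta of 2.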

-mario
