From: Jani Nikula <jani.nikula@intel.com>
To: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>,
"Nikkanen, Kimmo" <kimmo.nikkanen@intel.com>,
Daniel Vetter <daniel.vetter@ffwll.ch>,
intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v3] drm/i915: Remove unsafe i915.enable_rc6
Date: Thu, 02 Nov 2017 17:17:41 +0200
Message-ID: <87lgjod816.fsf@intel.com>
In-Reply-To: <1509634780.27999.22.camel@linux.intel.com>
On Thu, 02 Nov 2017, Joonas Lahtinen <joonas.lahtinen@linux.intel.com> wrote:
> On Thu, 2017-11-02 at 07:47 -0700, Rodrigo Vivi wrote:
>> On Thu, Nov 02, 2017 at 08:06:29AM +0000, Jani Nikula wrote:
>> > On Wed, 01 Nov 2017, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>> > > On Wed, Nov 01, 2017 at 04:21:08PM +0000, Ben Widawsky wrote:
>> > > > On 17-11-01 18:09:47, Joonas Lahtinen wrote:
>> > > > > + Kimmo and Paul
>> > > > >
>> > > > > On Wed, 2017-11-01 at 07:43 -0700, Ben Widawsky wrote:
>> > > > > > On 17-11-01 14:07:28, Joonas Lahtinen wrote:
>> > > > > > > On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
>> > > > > > > > On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
>> > > > > > > > > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On 26/10/17 03:32, Chris Wilson wrote:
>> > > > > > > > > > > It has been many years since the last confirmed sighting (and fix) of an
>> > > > > > > > > > > RC6 related bug (usually a system hang). Remove the parameter to stop
>> > > > > > > > > > > users from setting dangerous values, as they often set it during triage
>> > > > > > > > > > > and end up disabling the entire runtime pm instead (the option is not a
>> > > > > > > > > > > fine scalpel!).
>> > > > > > > > > > >
>> > > > > > > > > > > Furthermore, it allows users to set known dangerous values which were
>> > > > > > > > > > > intended for testing and not for production use. For testing, we can
>> > > > > > > > > > > always patch in the required setting without having to expose ourselves
>> > > > > > > > > > > to random abuse.
>> > > > > > > > > > >
>> > > > > > > > > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
>> > > > > > > > > > > lack of ilk support better.
>> > > > > > > > > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
>> > > > > > > > > > >
>> > > > > > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > > > > > > > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> > > > > > > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > > > > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
>> > > > > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
>> > > > > > > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > > > > > > > > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > > > > > > > > > > ---
>> > > > > > > > > >
>> > > > > > > > > > I think that for execution/debug on early silicon we might still want the
>> > > > > > > > > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
>> > > > > > > > > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
>> > > > > > > > > > IMO.
>> > > > > > > > >
>> > > > > > > > > Most of the BIOSes I've seen on our RVPs have had an option to disable
>> > > > > > > > > RC6.
>> > > > > > > >
>> > > > > > > > BIOS options don't block our code from running and setting some MMIOs.
>> > > > > > > > Not sure how the GPU will behave in such cases.
>> > > > > > > >
>> > > > > > > > I like the idea of removing some and keeping the parameters clean.
>> > > > > > > > But there are a few, like RC6 and disable_power_wells, that are very
>> > > > > > > > useful on platform enabling and also when assisting others to debug issues.
>> > > > > > > >
>> > > > > > > > For instance, right now that we fixed RC6 on CNL, someone said that
>> > > > > > > > he believes he is seeing more hangs, so I immediately asked him to boot with
>> > > > > > > > i915.enable_rc6=0 to double check. It is easier and more straightforward
>> > > > > > > > to direct them to the unsafe param than to ask them to compile the code
>> > > > > > > > with different options or to use some BIOS options that we are not sure about.
>> > > > > > > >
>> > > > > > > > Also on bug triage some options like this are helpful.
>> > > > > > > >
>> > > > > > > > Also, BIOS and compile-time settings are saved flags. So if you need to do a quick test
>> > > > > > > > you have to save it, and then unsave later. Parameters are very convenient
>> > > > > > > > for a one-boot-only check.
>> > > > > > >
>> > > > > > > It's convenient for sure, but the unsafe module parameters seem to be
>> > > > > > > finding their way into way too many HOWTOs, and from there to some
>> > > > > > > "productized" use-cases. Chris states that it has been some years since
>> > > > > > > setting .enable_rc6=0 last solved an issue on publicly shipping products,
>> > > > > > > so I don't see a need for carrying this.
>> > > > > > >
>> > > > > > > We shouldn't allow the convenience of not having to change one line and
>> > > > > > > recompile kernel during development to affect the end-users who are
>> > > > > > > Googling how to get the best performance out of their hardware (I could
>> > > > > > > mention some distro here).
>> > > > > > >
>> > > > > > > This seems like the best option, as I don't think introducing kernel
>> > > > > > > parameters that only exist on debug builds would be too convenient
>> > > > > > > either. It'd maybe just add more confusion.
>> > > > > > >
>> > > > > > > Regards, Joonas
>> > > > > >
>> > > > > > I believe the ability to disable RC6 is valuable not just for debugging
>> > > > > > purposes. Folks with very latency sensitive workloads are often willing to
>> > > > > > forego power savings. The real problem I see is that we don't test without rc6
>> > > > > > in our setup, which indeed makes it unsafe. As such, I see the other option here
>> > > > > > going back to the ability to toggle rc6 after load (either module parameter, or
>> > > > > > make it sysfs), and actually run some subset of our workloads with RC6. I
>> > > > > > suspect people will poop on that suggestion, but I figured I'd mention.
>> > > > >
>> > > > > I agree there, but by my understanding there's really no ask to support
>> > > > > the feature in upstream. And the original motive from Chris to drop the
>> > > > > feature is that it's unsafe as it currently is.
>> > > > >
>> > > > > So unless we've got the resources to bring it back from the unsafe
>> > > > > zone, I think we should drop it like this patch proposes.
>> > > > >
>> > > > > Regards, Joonas
>> > > >
>> > > > Yep, I agree. One other option would be to move i915_forcewake_user to sysfs and
>> > > > let them use that.
>> > >
>> > > Well, I won't try to block that. I just put my 2 cents that I believe it is a very
>> > > useful parameter.
>> > >
>> > > It wasn't that long ago that we last needed the flag to allow
>> > > end users to have a functional machine: https://plus.google.com/+JonMasters/posts/BqWLEjenLKv.
>> > >
>> > > or to debug a related issue:
>> > > https://bugzilla.redhat.com/show_bug.cgi?id=1440988
>> > > https://bugzilla.kernel.org/show_bug.cgi?id=116431
>> > >
>> > > Although the dates on a few seem to be over a year old, we need to consider that
>> > > that was our latest new GPU... gen9.
>> > >
>> > > If products are recommending the use of enable_rc6=0, I can see them
>> > > adding a patch to disable it. The effect is the same and our convenience is gone.
>> > >
>> > > But again, just my view here. Not a nack ;)
>> >
>> > I suppose the compromise would be to make it a boolean module parameter
>> > to only allow disabling rc6 on platforms where it's enabled by default,
>> > but not letting you enable rc6 where it's disabled by default. I.e. only
>> > support i915.enable_rc6=0 to be passed by the user.
>>
>> +1. I like this approach.
>
> Umm, it still doesn't resolve the issue that it's not being tested.
>
> I try to be super clear; until we have resources to support that
> specific code path, I'd much prefer not to have an easy kernel
> parameter to set it.
It resolves the worst part of the issue: people enabling rc6 where it's
known not to work.
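
To make the compromise concrete, here is a minimal sketch in plain C. It is
not the actual i915 code: struct platform_info, rc6_effective() and
rc6_policy_selfcheck() are hypothetical names used only for illustration,
with has_rc6 standing in for the per-platform default (info->has_rc6)
mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-platform capability (info->has_rc6). */
struct platform_info {
	bool has_rc6;
};

/*
 * Hypothetical helper, not the i915 implementation.
 *
 * The user-supplied boolean can only veto RC6: the result is true only
 * when the platform enables RC6 by default AND the user did not pass 0.
 * Passing 1 on a platform without RC6 support has no effect.
 */
bool rc6_effective(const struct platform_info *info, bool enable_rc6_param)
{
	return info->has_rc6 && enable_rc6_param;
}

/* Exercises the three cases discussed above. */
void rc6_policy_selfcheck(void)
{
	struct platform_info with_rc6 = { .has_rc6 = true };
	struct platform_info without_rc6 = { .has_rc6 = false };

	/* Default: parameter left at 1, platform default wins. */
	assert(rc6_effective(&with_rc6, true));
	/* User passes enable_rc6=0: RC6 is turned off. */
	assert(!rc6_effective(&with_rc6, false));
	/* User passes enable_rc6=1 where RC6 is off by default: stays off. */
	assert(!rc6_effective(&without_rc6, true));
}
```

The point of the AND is that the parameter has no value that widens what
the platform default already allows, so there are no "known dangerous
values" left to set.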
BR,
Jani.
--
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx