From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: "Hans de Goede" <hdegoede@redhat.com>,
	"Ville Syrjälä" <ville.syrjala@linux.intel.com>,
	"Chris Wilson" <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/7] drm/i915: Disable preemption and sleeping while using the punit sideband
Date: Fri, 12 Jan 2018 12:02:06 +0200
Message-ID: <876087jsyp.fsf@gaia.fi.intel.com>
In-Reply-To: <55f551e0-e8b0-a1b9-aa58-8ceffcd5676d@redhat.com>

Hans de Goede <hdegoede@redhat.com> writes:

> Hi,
>
> On 11-01-18 22:42, Hans de Goede wrote:
>> Hi,
>> 
>> On 11-01-18 22:17, Ville Syrjälä wrote:
>>> On Thu, Jan 11, 2018 at 08:53:42PM +0000, Chris Wilson wrote:
>>>> Quoting Ville Syrjälä (2018-01-11 20:10:45)
>>>>> On Wed, Jan 10, 2018 at 12:55:05PM +0000, Chris Wilson wrote:
>>>>>> While we talk to the punit over its sideband, we need to prevent the cpu
>>>>>> from sleeping in order to prevent a potential machine hang.
>>>>>>
>>>>>> Note that by itself, it appears that pm_qos_update_request (via
>>>>>> intel_idle) doesn't provide a sufficient barrier to ensure that all cores
>>>>>> are indeed awake (out of Cstate) and that the package is awake. To do so,
>>>>>> we need to supplement the pm_qos with a manual ping on_each_cpu.
>>>>>>
>>>>>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=109051
>>>>>> References: https://bugs.freedesktop.org/show_bug.cgi?id=102657
>>>>>> References: https://bugzilla.kernel.org/show_bug.cgi?id=195255
>>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>>>>> Cc: Hans de Goede <hdegoede@redhat.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/i915/i915_drv.c       |  6 ++++
>>>>>>   drivers/gpu/drm/i915/i915_drv.h       |  1 +
>>>>>>   drivers/gpu/drm/i915/intel_sideband.c | 61 ++++++++++++++++++++++++-----------
>>>>>>   3 files changed, 50 insertions(+), 18 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>>>>>> index 6c8da9d20c33..d4b90cc0130b 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>>>>> @@ -902,6 +902,9 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
>>>>>>        spin_lock_init(&dev_priv->uncore.lock);
>>>>>>        mutex_init(&dev_priv->sb_lock);
>>>>>> +     pm_qos_add_request(&dev_priv->sb_qos,
>>>>>> +                        PM_QOS_CPU_DMA_LATENCY, PM_QOS_DEFAULT_VALUE);
>>>>>> +
>>>>>>        mutex_init(&dev_priv->modeset_restore_lock);
>>>>>>        mutex_init(&dev_priv->av_mutex);
>>>>>>        mutex_init(&dev_priv->wm.wm_mutex);
>>>>>> @@ -953,6 +956,9 @@ static void i915_driver_cleanup_early(struct drm_i915_private *dev_priv)
>>>>>>        intel_irq_fini(dev_priv);
>>>>>>        i915_workqueues_cleanup(dev_priv);
>>>>>>        i915_engines_cleanup(dev_priv);
>>>>>> +
>>>>>> +     pm_qos_remove_request(&dev_priv->sb_qos);
>>>>>> +     mutex_destroy(&dev_priv->sb_lock);
>>>>>>   }
>>>>>>   static int i915_mmio_setup(struct drm_i915_private *dev_priv)
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> index a689396d0ff6..ff3f9effc0bb 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> @@ -1887,6 +1887,7 @@ struct drm_i915_private {
>>>>>>        /* Sideband mailbox protection */
>>>>>>        struct mutex sb_lock;
>>>>>> +     struct pm_qos_request sb_qos;
>>>>>>        /** Cached value of IMR to avoid reads in updating the bitfield */
>>>>>>        union {
>>>>>> diff --git a/drivers/gpu/drm/i915/intel_sideband.c b/drivers/gpu/drm/i915/intel_sideband.c
>>>>>> index 75c872bb8cc9..02bdd2e2cef6 100644
>>>>>> --- a/drivers/gpu/drm/i915/intel_sideband.c
>>>>>> +++ b/drivers/gpu/drm/i915/intel_sideband.c
>>>>>> @@ -22,6 +22,8 @@
>>>>>>    *
>>>>>>    */
>>>>>> +#include <asm/iosf_mbi.h>
>>>>>> +
>>>>>>   #include "i915_drv.h"
>>>>>>   #include "intel_drv.h"
>>>>>> @@ -39,18 +41,20 @@
>>>>>>   /* Private register write, double-word addressing, non-posted */
>>>>>>   #define SB_CRWRDA_NP 0x07
>>>>>> -static int vlv_sideband_rw(struct drm_i915_private *dev_priv, u32 devfn,
>>>>>> -                        u32 port, u32 opcode, u32 addr, u32 *val)
>>>>>> +static void ping(void *info)
>>>>>>   {
>>>>>> -     u32 cmd, be = 0xf, bar = 0;
>>>>>> -     bool is_read = (opcode == SB_MRD_NP || opcode == SB_CRRDDA_NP);
>>>>>> +}
>>>>>> -     cmd = (devfn << IOSF_DEVFN_SHIFT) | (opcode << IOSF_OPCODE_SHIFT) |
>>>>>> -             (port << IOSF_PORT_SHIFT) | (be << IOSF_BYTE_ENABLES_SHIFT) |
>>>>>> -             (bar << IOSF_BAR_SHIFT);
>>>>>> +static int vlv_sideband_rw(struct drm_i915_private *dev_priv,
>>>>>> +                        u32 devfn, u32 port, u32 opcode,
>>>>>> +                        u32 addr, u32 *val)
>>>>>> +{
>>>>>> +     const bool is_read = (opcode == SB_MRD_NP || opcode == SB_CRRDDA_NP);
>>>>>> +     int err;
>>>>>> -     WARN_ON(!mutex_is_locked(&dev_priv->sb_lock));
>>>>>> +     lockdep_assert_held(&dev_priv->sb_lock);
>>>>>> +     /* Flush the previous comms, just in case it failed last time. */
>>>>>>        if (intel_wait_for_register(dev_priv,
>>>>>>                                    VLV_IOSF_DOORBELL_REQ, IOSF_SB_BUSY, 0,
>>>>>>                                    5)) {
>>>>>> @@ -59,22 +63,43 @@ static int vlv_sideband_rw(struct drm_i915_private *dev_priv, u32 devfn,
>>>>>>                return -EAGAIN;
>>>>>>        }
>>>>>> -     I915_WRITE(VLV_IOSF_ADDR, addr);
>>>>>> -     I915_WRITE(VLV_IOSF_DATA, is_read ? 0 : *val);
>>>>>> -     I915_WRITE(VLV_IOSF_DOORBELL_REQ, cmd);
>>>>>> +     iosf_mbi_punit_acquire();
>>>>>> -     if (intel_wait_for_register(dev_priv,
>>>>>> -                                 VLV_IOSF_DOORBELL_REQ, IOSF_SB_BUSY, 0,
>>>>>> -                                 5)) {
>>>>>> +     /*
>>>>>> +      * Prevent the cpu from sleeping while we use this sideband, otherwise
>>>>>> +      * the punit may cause a machine hang.
>>>>>> +      */
>>>>>> +     pm_qos_update_request(&dev_priv->sb_qos, 0);
>>>>>> +     on_each_cpu(ping, NULL, 1);
>>>>>
>>>>> pm_qos_update_request() doesn't wake up the cpus on its own? I wonder
>>>>> what kind of latency guarantees it can actually give without doing that.
>>>>
>>>> intel_idle plugs into the update to wake up idle cpus, but it is not
>>>> synchronous. If we don't have the on_each_cpu() here, the hangs reoccur;
>>>> the easiest answer seems to be that the on_each_cpu() is causing a
>>>> schedule of the RT migration thread which is enough to ensure that the
>>>> wakeup has occurred.
>>>>> Shouldn't we check if we're actually talking to the punit and not
>>>>> some other unit before we do all this extra work?
>>>>
>>>> Checking port? I am suspicious of the whole iosf mechanism...
>>>
>>> I doubt iosf sb itself is busted. That would probably make the entire
>>> soc pretty much dead. I suspect the problem is either in the punit
>>> firmware, or something fishy is going on with power delivery and
>>> reducing the rate at which we change the voltages and whatnot helps
>>> keep things a bit more stable.
>>>
>>> So yeah, I would limit this to punit only by checking the port.
>>>
>>>>
>>>> Also debating whether to only apply it to j1900 or all. Is it just pure
>>>> fluke that we can easily reproduce it on the quad core Baytrail as
>>>> opposed to the dual core vlv or chv?
>>>
>>> Not sure. I guess if the problem is around power delivery it might even
>>> be board specific to some degree.
>> 
>> The wordpress (on azure.com) link here:
>> https://bugzilla.kernel.org/show_bug.cgi?id=109051#c860
>> 
>> Certainly seems to hint at a power delivery problem (with the SoC not
>> with specific boards) and all models reported as needing
>> intel_idle.max_cstate=1 there are quad-core Bay Trails.
>> 
>> I guess the problem happens on all quad-core Bay Trail variants when
>> the GPU and CPU both increase their power demand (by up-clocking, resp.
>> leaving C6) at the same time, and Chris' fix makes sure all CPU cores
>> leave C6 before doing the GPU up-clocking, avoiding this.
>> 
>> Note this is all speculation / educated guessing.
>
> Also interesting wrt this is this comment:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=109051#c752
>
> Which speculates along the same line and the commenter has done
> extensive testing (with 2 boards) coming to the conclusion that
> the dual core part does not have whatever problem he is seeing.
>

My understanding has also been that the 2-core parts are
not affected. Further, limiting a 4-core J1900 to 2 cores with
maxcpus=2 is an effective workaround.
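
For checking whether a given boot has one of the workarounds from this
thread applied (maxcpus=2 or intel_idle.max_cstate=1), here is a minimal
userspace sketch. It assumes the caller passes in the kernel command line
as a string, e.g. the contents of /proc/cmdline. The helper names are
hypothetical, and it only looks at the first match of each parameter,
which is a simplification.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look for "key=<number>" as a whole space-separated token in a kernel
 * command line string. Returns true and stores the number in *out when
 * found. Hypothetical helper, not part of any kernel or libc API.
 */
static bool cmdline_get_long(const char *cmdline, const char *key, long *out)
{
	size_t klen = strlen(key);
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		/* The key must start a token, not sit inside another word. */
		bool at_start = (p == cmdline) || p[-1] == ' ' || p[-1] == '\n';

		if (at_start && p[klen] == '=') {
			char *end;
			long v = strtol(p + klen + 1, &end, 10);

			if (end != p + klen + 1) {
				*out = v;
				return true;
			}
		}
		p += klen;
	}
	return false;
}

/*
 * True if the command line limits the machine to at most 2 CPUs or
 * caps intel_idle at C1, the two workarounds mentioned in this thread.
 */
static bool baytrail_workaround_active(const char *cmdline)
{
	long v;

	if (cmdline_get_long(cmdline, "maxcpus", &v) && v >= 0 && v <= 2)
		return true;
	if (cmdline_get_long(cmdline, "intel_idle.max_cstate", &v) && v == 1)
		return true;
	return false;
}
```

Reading /proc/cmdline into a buffer and passing it to
baytrail_workaround_active() is enough to use it. This only reports what
was requested at boot; it says nothing about whether the hang itself is
avoided.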

-Mika
