dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Hans de Goede <hdegoede@redhat.com>
To: rwright@hpe.com, jani.nikula@linux.intel.com,
	joonas.lahtinen@linux.intel.com, rodrigo.vivi@intel.com,
	airlied@linux.ie, daniel@ffwll.ch, sumit.semwal@linaro.org,
	christian.koenig@amd.com, wambui.karugax@gmail.com,
	chris@chris-wilson.co.uk, matthew.auld@intel.com,
	akeem.g.abodunrin@intel.com, prathap.kumar.valsan@intel.com,
	mika.kuoppala@linux.intel.com
Cc: intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org
Subject: Re: [PATCH v3 0/3] Reduce context clear batch size to avoid gpu hang
Date: Mon, 2 Nov 2020 10:48:54 +0100	[thread overview]
Message-ID: <8cdf0dd0-2a2f-bae9-71ea-89a88fdb14a5@redhat.com> (raw)
In-Reply-To: <20201101174132.10513-1-rwright@hpe.com>

Hi,

On 11/1/20 6:41 PM, rwright@hpe.com wrote:
> From: Randy Wright <rwright@hpe.com>
> 
> For several months, I've been experiencing GPU hangs when  starting
> Cinnamon on an HP Pavilion Mini 300-020 if I try to run an upstream
> kernel.  I reported this recently in
> https://gitlab.freedesktop.org/drm/intel/-/issues/2413 where I have
> attached the requested evidence including the state collected from
> /sys/class/drm/card0/error and debug output from dmesg.
> 
> I ran a bisect to find the problem, which indicates this is the
> troublesome commit:
> 
>   [47f8253d2b8947d79fd3196bf96c1959c0f25f20] drm/i915/gen7: Clear all EU/L3 residual contexts
> 
> The nature of that commit suggested to me that reducing the
> batch size used in the context clear operation might help this
> relatively low-powered system to avoid the hang.... and it did!
> I simply forced this system to take the smaller batch length that is
> already used for non-Haswell systems.
> 
> The first two versions of this patch were posted as RFC
> patches to the Intel-gfx list, implementing the same
> algorithmic change in function batch_get_defaults,
> but without employing a properly constructed quirk.
> 
> I've now cleaned up the patch to employ a new QUIRK_RENDERCLEAR_REDUCED.
> The quirk is presently set only for the aforementioned HP Pavilion Mini
> 300-020.  The patch now touches three files to define the quirk, set it,
> and then check for it in function batch_get_defaults.

Note I'm not really an i915 dev.

With that said I do wonder if we should not use the
reduced batch size in a lot more cases, the machine in question uses a
3558U CPU if the iGPU of that CPU has this issue, then I would expect
pretty much all Haswell U models (at a minimum) to have this issue.

So solving this with a quirk for just the HP Pavilion Mini 300-020
seems wrong to me. I think we need a more generic way of enabling
the reduced batch size. I even wonder if we should not simply use
it everywhere. Since you do have a proper Haswell CPU, I guess
it being an U model makes the hang easier to trigger, but I suspect
the higher TPD ones may also still be susceptible ...

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  parent reply	other threads:[~2020-11-02  9:49 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-01 17:41 [PATCH v3 0/3] Reduce context clear batch size to avoid gpu hang rwright
2020-11-01 17:41 ` [PATCH v3 1/3] drm/i915: Introduce quirk QUIRK_RENDERCLEAR_REDUCED rwright
2020-11-01 17:41 ` [PATCH v3 2/3] drm/i915/display: Add function quirk_renderclear_reduced rwright
2020-11-01 17:41 ` [PATCH v3 3/3] drm/i915/gt: Force reduced batch size if new QUIRK_RENDERCLEAR_REDUCED is set rwright
2020-11-02  9:48 ` Hans de Goede [this message]
2020-11-02 19:57   ` [PATCH v3 0/3] Reduce context clear batch size to avoid gpu hang rwright
2020-11-07 17:57     ` rwright
  -- strict thread matches above, loose matches on Subject: below --
2020-11-01 14:42 rwright

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8cdf0dd0-2a2f-bae9-71ea-89a88fdb14a5@redhat.com \
    --to=hdegoede@redhat.com \
    --cc=airlied@linux.ie \
    --cc=akeem.g.abodunrin@intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=christian.koenig@amd.com \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jani.nikula@linux.intel.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=matthew.auld@intel.com \
    --cc=mika.kuoppala@linux.intel.com \
    --cc=prathap.kumar.valsan@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=rwright@hpe.com \
    --cc=sumit.semwal@linaro.org \
    --cc=wambui.karugax@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).