public inbox for intel-gfx@lists.freedesktop.org
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	Intel-gfx@lists.freedesktop.org
Cc: Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>,
	dri-devel@lists.freedesktop.org,
	Chris Wilson <chris@chris-wilson.co.uk>,
	Renato Pereyra <renatopereyra@google.com>,
	Matthew Auld <matthew.auld@intel.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	stable@vger.kernel.org,
	Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Subject: Re: [Intel-gfx] [PATCH] drm/i915: Stop doing writeback from the shrinker
Date: Fri, 10 Dec 2021 15:36:17 +0000
Message-ID: <931129d0-4e86-48f9-7b2e-bddef93697c6@linux.intel.com>
In-Reply-To: <a7898ef462a49db825b3fdd4efdba1e546466473.camel@linux.intel.com>


On 10/12/2021 14:46, Thomas Hellström wrote:
> On Fri, 2021-12-10 at 11:05 +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> This effectively removes writeback which was added in 2d6692e642e7
>> ("drm/i915: Start writeback from the shrinker").
>>
>> Digging through the history it seems we went back and forth on the
>> topic
>> of whether it would be safe a couple of times. See for instance
>> 5537252b6b6d ("drm/i915: Invalidate our pages under memory pressure")
>> where Hugh Dickins has advised against it. I do not have enough
>> expertise
>> in the memory management area so am hoping for expert input here.
>>
>> Reason for proposing removal is that there are reports from the field
>> which indicate a system-wide deadlock (of a sort) implicating i915
>> doing
>> writeback at shrinking time.
>>
>> Signature is a hung task notifier kicking in and task traces such as:
> 
> It would be interesting to see what exactly the find_get_entry is
> blocked on. The other two tasks are blocked on the shrinker_rwsem which
> is held by i915. If it's indeed a deadlock with either of those two,

It may indeed be a livelock rather than a deadlock. I have received a
newer trace and it does show kswapd in the running state. But no progress
in 120s and a dead machine seemed too suspicious to be caused by just a
gaming workload, so I assumed a more serious issue than just severe
memory pressure.

> then the fix Chris is working on for an unrelated issue we discovered
> with shrinking would move out the writeback call from the
> shrinker_rwsem and resolve this, but if i915 is in turn deadlocking
> with another process and these two are just hanging waiting for the
> shrinker_rwsem, we would still have other issues.

Presumably this would involve an extra worker and tracking on a list or 
something?

Otherwise my main hope really was to get a verdict from memory 
management experts on pros & cons of doing writeback from the driver in 
any flavour.

> Do you by any chance have the list of the locks held by the system at
> this point?

No, but Renato, maybe you could also collect the output of "echo d" and
"echo m" written to /proc/sysrq-trigger when things go bad?
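
For reference, a minimal sketch of how that could be collected, assuming
root access and a kernel with the magic SysRq interface built in. The "d"
key dumps all held locks only if lock debugging (CONFIG_LOCKDEP) is
enabled, and "m" dumps current memory info to the kernel log. The helper
name and its optional path argument are purely illustrative:

```shell
# Hypothetical helper: write the "d" (held locks) and "m" (memory info)
# SysRq keys to the trigger file, one at a time. The path argument exists
# only so the function can be pointed at a scratch file; it defaults to
# the real trigger.
collect_sysrq() {
    trigger="${1:-/proc/sysrq-trigger}"
    for key in d m; do
        echo "$key" > "$trigger" || return 1
    done
}

# Usage, as root, once the hang is observed:
#   collect_sysrq && dmesg > sysrq-dump.txt
```

The dumps land in the kernel log, so dmesg (or the serial console, if the
machine is too far gone) is where to look afterwards.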

Regards,

Tvrtko
