From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
Intel-gfx@lists.freedesktop.org
Cc: Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>,
dri-devel@lists.freedesktop.org,
Chris Wilson <chris@chris-wilson.co.uk>,
Renato Pereyra <renatopereyra@google.com>,
Matthew Auld <matthew.auld@intel.com>,
Daniel Vetter <daniel.vetter@ffwll.ch>,
stable@vger.kernel.org,
Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Subject: Re: [Intel-gfx] [PATCH] drm/i915: Stop doing writeback from the shrinker
Date: Fri, 10 Dec 2021 15:36:17 +0000
Message-ID: <931129d0-4e86-48f9-7b2e-bddef93697c6@linux.intel.com>
In-Reply-To: <a7898ef462a49db825b3fdd4efdba1e546466473.camel@linux.intel.com>

On 10/12/2021 14:46, Thomas Hellström wrote:
> On Fri, 2021-12-10 at 11:05 +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> This effectively removes writeback which was added in 2d6692e642e7
>> ("drm/i915: Start writeback from the shrinker").
>>
>> Digging through the history it seems we went back and forth on the
>> topic
>> of whether it would be safe a couple of times. See for instance
>> 5537252b6b6d ("drm/i915: Invalidate our pages under memory pressure")
>> where Hugh Dickins has advised against it. I do not have enough
>> expertise
>> in the memory management area so am hoping for expert input here.
>>
>> Reason for proposing removal is that there are reports from the field
>> which indicate a system wide deadlock (of a sort) implicating i915
>> doing
>> writeback at shrinking time.
>>
>> Signature is a hung task notifier kicking in and task traces such as:
>
> It would be interesting to see what exactly the find_get_entry is
> blocked on. The other two tasks are blocked on the shrinker_rwsem which
> is held by i915. If it's indeed a deadlock with either of those two,

It may indeed be a livelock rather than a deadlock. I have received a
newer trace and it does show kswapd in the running state. But no progress
in 120s and a dead machine sounded too suspicious to happen with just a
gaming workload, so I assumed a more serious issue than just severe
memory pressure.

> then the fix Chris is working on for an unrelated issue we discovered
> with shrinking would move out the writeback call from the
> shrinker_rwsem and resolve this, but if i915 is in turn deadlocking
> with another process and these two are just hanging waiting for the
> shrinker_rwsem, we would still have other issues.

Presumably this would involve an extra worker and tracking on a list or
something?

Otherwise my main hope really was to get a verdict from memory
management experts on pros & cons of doing writeback from the driver in
any flavour.

> Do you by any chance have the list of the locks held by the system at
> this point?

No, but Renato, maybe you could also collect the output of "echo d" and
"echo m" to sysrq-trigger when things go bad?
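
For reference, a minimal sketch of that collection. It assumes root
access, and a kernel built with CONFIG_LOCKDEP for the 'd' (held locks)
dump to produce anything; 'm' dumps memory info. The helper name and the
output file name are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: trigger the SysRq 'd' (all held locks) and 'm'
# (memory info) dumps, then save the kernel log where the output lands.
collect_sysrq_dumps() {
    trigger="${1:-/proc/sysrq-trigger}"   # SysRq trigger file
    out="${2:-sysrq-dump.txt}"            # illustrative output file name

    for key in d m; do
        # Each write asks the kernel to run one SysRq command.
        echo "$key" > "$trigger" || return 1
    done

    # The dumps go to the kernel log; keep a copy (ignore dmesg failures).
    dmesg > "$out" 2>/dev/null || true
}
```

Calling collect_sysrq_dumps with no arguments as root, while the machine
is still responsive enough to run it, should leave both dumps in
sysrq-dump.txt.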
Regards,
Tvrtko