From: Brian Norris <briannorris@chromium.org>
To: Matthew Auld <matthew.william.auld@gmail.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Set PROBE_PREFER_ASYNCHRONOUS
Date: Wed, 2 Nov 2022 17:14:25 -0700
Message-ID: <Y2MH4QCYqiAmvBQP@google.com>
In-Reply-To: <CAM0jSHM99OxpmS-pqmEiyoK2pa07fnhekTKLRQTMsWqFkHCgJg@mail.gmail.com>
On Wed, Nov 02, 2022 at 12:18:37PM +0000, Matthew Auld wrote:
> On Tue, 1 Nov 2022 at 21:58, Brian Norris <briannorris@chromium.org> wrote:
> >
> > On Fri, Oct 28, 2022 at 5:24 PM Patchwork
> > <patchwork@emeril.freedesktop.org> wrote:
> > >
> > > Patch Details
> > > Series:drm/i915: Set PROBE_PREFER_ASYNCHRONOUS
> > > URL:https://patchwork.freedesktop.org/series/110277/
> > > State:failure
> > > Details:https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110277v1/index.html
> > >
> > > CI Bug Log - changes from CI_DRM_12317 -> Patchwork_110277v1
> > >
> > > Summary
> > >
> > > FAILURE
> > >
> > > Serious unknown changes coming with Patchwork_110277v1 absolutely need to be
> > > verified manually.
> > >
> > > If you think the reported changes have nothing to do with the changes
> > > introduced in Patchwork_110277v1, please notify your bug team to allow them
> > > to document this new failure mode, which will reduce false positives in CI.
> > >
> > > External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110277v1/index.html
> >
> > For the record, I have almost zero idea what to do with this. From
> > what I can tell, most (all?) of these failures are flaky(?) already
> > and are probably not related to my change.
>
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110277v1/index.html
>
> According to that link, this change appears to break every platform
> when running the live selftests (looking at the purple squares).
> Running the selftests normally involves loading and unloading the
> module. Looking at the logs there is scary stuff like:
>
[...]

Ah, thanks. I'm not sure what made me think the tests were failing the
same way on drm-tip, but maybe just chalk that up to my unfamiliarity
with this particular dashboard... (There are a few isolated failures
and/or flakes on drm-tip, but they don't look like this.)

Anyway, I think I managed to run some of these tests on my own platforms
[1], and I can't reproduce those failures. I do see other failures
(crashes) though, like in i915_gem_mman_live_selftests/igt_mmap, where
igt_mmap_offset() (selftest-only code) -> vm_mmap() assumes we have a
valid |current->mm|. But that's borrowing the modprobe process's memory
map, and with async probe, the selftest sequence happens in a kernel
worker instead (and current->mm is NULL). So that clearly won't work.

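To make that failure mode concrete, here is a small userspace C model. It
is a sketch only: it is not kernel code and not the actual i915 selftest.
The names mm_model, task_model, and map_for_caller() are hypothetical, and
the -EINVAL guard is an assumption added for illustration; the real path
has no such check, which is consistent with the crashes described above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-process address space. */
struct mm_model {
	int id;
};

/* Hypothetical stand-in for a task; mm is NULL for kernel workers. */
struct task_model {
	const char *comm;
	struct mm_model *mm;
};

/*
 * Models a helper that maps into the caller's user address space.
 * Returns 0 if the caller has an address space to map into, or
 * -EINVAL if it has none (assumed guard, for illustration only).
 */
int map_for_caller(const struct task_model *caller)
{
	if (caller->mm == NULL)
		return -EINVAL;	/* kernel worker: nothing to map into */

	return 0;		/* process context: mapping can proceed */
}
```

With a task_model whose mm is set (the synchronous modprobe case),
map_for_caller() returns 0; with mm == NULL (the async-probe worker case),
it returns -EINVAL.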
I suppose I could disable async probe when built as a module (I believe
async probe doesn't really have any value there, since the module load
task just waits for the async task anyway). I'm not familiar enough with
MM to know what the vm_mmap() alternatives are, but this particular bit
of code does feel odd.

Additionally, I think this implies that live_selftests will break if
i915 is built-in (i.e., =y, not =m), as we'll again run in a
kernel-thread context at boot time. But I would hope nobody is trying to
run them that way? I guess this gets even hairier, because even if the
driver is built into the kernel, it's possible to kick them off from a
process context by tweaking the module parameters later, and then
re-binding the device... So all in all, this bug leaves an ugly
situation, with or without my patch.

I'm still curious about the reported failures, but maybe they require
some particular sequence of tests? I also don't have the full
igt-gpu-tools set running, so maybe it does something a little
differently than my steps in [1]?

Brian

[1] I have a GLK system, if it matters. I figured I can run some of
these with any one of the following:

  modprobe i915 live_selftests=1
  modprobe i915 live_selftests=1 igt__20__live_workarounds=Y
  modprobe i915 live_selftests=1 igt__19__live_uncore=Y
  modprobe i915 live_selftests=1 igt__18__live_sanitycheck=Y
  ...