From: Jason Gunthorpe <jgg@nvidia.com>
To: "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com>,
Kevin Tian <kevin.tian@intel.com>,
Lu Baolu <baolu.lu@linux.intel.com>
Cc: "intel-gfx@lists.freedesktop.org"
<intel-gfx@lists.freedesktop.org>,
"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
Lucas De Marchi <lucas.demarchi@intel.com>,
"Kurmi, Suresh Kumar" <suresh.kumar.kurmi@intel.com>,
"Saarinen, Jani" <jani.saarinen@intel.com>,
matthew.auld@intel.com, baolu.lu@linux.intel.com,
iommu@lists.linux.dev
Subject: Re: REGRESSION on linux-next (next-20251106)
Date: Wed, 12 Nov 2025 18:32:18 -0400
Message-ID: <aRUK8vDZ3dE1zNxL@nvidia.com>
In-Reply-To: <4f15cf3b-6fad-4cd8-87e5-6d86c0082673@intel.com>
On Mon, Nov 10, 2025 at 12:06:30PM +0530, Borah, Chaitanya Kumar wrote:
> Hello Jason,
>
> Hope you are doing well. I am Chaitanya from the linux graphics team in
> Intel.
>
> This mail is regarding a regression we are seeing in our CI runs[1] on
> linux-next repository.
>
> Since the version next-20251106 [2], we are seeing our tests timing out
> presumably caused by a GPU Hang.
Thank you for reporting this.
I don't have any immediate theory, so I think this will need some
debugging. Maybe Kevin or Lu have some ideas?
Some general things to check:
1) Is there an iommu fault report? I did not see one in your dmesg,
but maybe it was truncated? It is puzzling to see an iommu
related error and not see a fault report.
2) Could it be one of the special iommu behaviors to support iGPU that
is not working? Maybe we missed one?
3) I seem to recall Lu tested the coherent cache flushing, but it is
still worth asking: is this iGPU cache incoherent with the CPU?
Could this be a cache flushing bug? That path is very hard to
test, so it would not be a surprise if it has a bug.
4) Nobody has reported any other problems so far, so I'm inclined to
think the map/unmap is working - but maybe there is some edge case
the gpu driver is tripping up on?
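For #1, a minimal sketch of how a saved log could be scanned for fault
reports. It assumes Intel VT-d faults are logged with a "DMAR:" prefix
and the word "fault" on the same line; the function name and the
pattern are illustrative only and may need adjusting for your kernel.

```python
import re
import sys

# Assumption: VT-d fault lines contain "DMAR:" followed by "fault".
FAULT_RE = re.compile(r"DMAR:.*fault", re.IGNORECASE)


def find_iommu_faults(dmesg_text):
    """Return the log lines that look like IOMMU fault reports."""
    return [line for line in dmesg_text.splitlines() if FAULT_RE.search(line)]


if __name__ == "__main__":
    # Read a dmesg dump from stdin and print only the matching lines.
    for line in find_iommu_faults(sys.stdin.read()):
        print(line)
```

Piping the full, untruncated dmesg into the script on stdin is enough; no
output means no line matched the assumed pattern, not that no fault occurred.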
The lack of a fault report is very puzzling. Even if it was #3, I would
think a fault would be the most likely outcome of missing
flushing. The lack of a fault report suggests the wrong physical
address was mapped as present, which points to #4.
Can you investigate a bit further and maybe see if we can get a bit
more detail on what the GPU thinks went wrong?
Jason