Intel-GFX Archive on lore.kernel.org
From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Andrew Cooper <Andrew.Cooper3@citrix.com>
Cc: "intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	"Lucas De Marchi" <lucas.demarchi@intel.com>,
	"Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>,
	"Daniel Vetter" <daniel@ffwll.ch>,
	"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	"Demi M. Obenour" <demi@invisiblethingslab.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [Intel-gfx] [cache coherency bug] i915 and PAT attributes
Date: Thu, 22 Dec 2022 10:29:57 +0200
Message-ID: <Y6QVhRP+voSLi9xm@intel.com>
In-Reply-To: <1c326e0c-5812-083a-0739-aa20fab3efc4@citrix.com>

On Fri, Dec 16, 2022 at 03:30:13PM +0000, Andrew Cooper wrote:
> On 08/12/2022 1:55 pm, Marek Marczykowski-Górecki wrote:
> > Hi,
> >
> > There is an issue with i915 on Xen PV (dom0). The end result is a lot of
> > glitches, like here: https://openqa.qubes-os.org/tests/54748#step/startup/8
> > (this one is on ADL, Linux 6.1-rc7 as a Xen PV dom0). It's using Xorg
> > with "modesetting" driver.
> >
> > After some iterations of debugging, we narrowed it down to i915's handling
> > of caching. The main difference is that PAT is set up differently on Xen PV
> > than on native Linux. Normally, Linux has an appropriate abstraction
> > for that, but apparently something related to i915 doesn't play well
> > with it. The specific difference is:
> > native linux:
> > x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT
> > xen pv:
> > x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WC  WP  UC  UC
> >                                   ~~          ~~      ~~  ~~
> >
> > The specific impact depends on kernel version and the hardware. The most
> > severe issues I see on >=ADL, but some older hardware is affected too -
> > sometimes only if composition is disabled in the window manager.
> > Some more information is collected at
> > https://github.com/QubesOS/qubes-issues/issues/4782 (and a few linked
> > duplicates...).
> >
> > Kind-of related commit is here:
> > https://github.com/torvalds/linux/commit/bdd8b6c98239cad ("drm/i915:
> > replace X86_FEATURE_PAT with pat_enabled()") - it is the place where
> > i915 explicitly checks for PAT support, so I'm cc-ing people mentioned
> > there too.
> >
> > Any ideas?
> >
> > The issue can be easily reproduced without Xen too, by adjusting PAT in
> > Linux:
> > -----8<-----
> > diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
> > index 66a209f7eb86..319ab60c8d8c 100644
> > --- a/arch/x86/mm/pat/memtype.c
> > +++ b/arch/x86/mm/pat/memtype.c
> > @@ -400,8 +400,8 @@ void pat_init(void)
> >  		 * The reserved slots are unused, but mapped to their
> >  		 * corresponding types in the presence of PAT errata.
> >  		 */
> > -		pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
> > -		      PAT(4, WB) | PAT(5, WP) | PAT(6, UC_MINUS) | PAT(7, WT);
> > +		pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
> > +		      PAT(4, WC) | PAT(5, WP) | PAT(6, UC)       | PAT(7, UC);
> >  	}
> >  
> >  	if (!pat_bp_initialized) {
> > -----8<-----
> >
> 
> Hello, can anyone help please?
> 
> Intel's CI has taken this reproducer of the bug, and confirmed the
> regression. 
> https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/T/#m4480c15a0d117dce6210562eb542875e757647fb
> 
> We're reasonably confident that it is an i915 bug (given the repro with
> no Xen in the mix), but we're out of any further ideas.

I don't think we have any code that assumes anything about the PAT,
apart from WC being available (which seems like it should still be
the case with your modified PAT). I suppose you'll just have to
start digging from pgprot_writecombine()/pgprot_noncached() and make
sure everything ends up using the correct PAT entry.

-- 
Ville Syrjälä
Intel
