public inbox for intel-gfx@lists.freedesktop.org
From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: "Yao, Jia" <jia.yao@intel.com>
Cc: "intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>,
	"Zuo, Alex" <alex.zuo@intel.com>,
	"Lin, Shuicheng" <shuicheng.lin@intel.com>,
	Askar Safin <safinaskar@gmail.com>,
	Pingfan Liu <piliu@redhat.com>,
	Chris Wilson <chris.p.wilson@linux.intel.com>
Subject: Re: [PATCH v2] drm/i915: Setting/clearing the memory access bit when en/disabling i915
Date: Wed, 8 Oct 2025 19:15:35 +0300
Message-ID: <aOaOJ1YI-NgTloIy@intel.com>
In-Reply-To: <PH8PR11MB80407C75DF808C33C70B1FBDF4E1A@PH8PR11MB8040.namprd11.prod.outlook.com>

On Wed, Oct 08, 2025 at 04:06:39PM +0000, Yao, Jia wrote:
> The actual bug is shown in https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14598
> If CONFIG_INTEL_IOMMU_DEFAULT_ON=y, the IOMMU prevents the invalid access, but if CONFIG_INTEL_IOMMU_DEFAULT_ON=n, the invalid access directly causes a system crash after a kexec reboot.

I was asking you whether that invalid access was caused by that
pxp stuff or not?

If yes, then just fix it.

If not, then I guess someone needs to keep on debugging.
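
For readers following the quoted discussion below: the "forcewake register
returns 0xFFFFFFFF" error is what one would typically expect once the device
stops decoding memory cycles, because an unclaimed MMIO read usually comes
back as all ones. The following is only a user-space sketch of that
behaviour, not kernel code. struct mock_pci_dev and the mock_* functions are
hypothetical names invented for this illustration; only the value of
PCI_COMMAND_MEMORY comes from the PCI register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory Space Enable bit of the 16-bit PCI command register. */
#define PCI_COMMAND_MEMORY 0x2

/* Hypothetical stand-in for a PCI function with one 32-bit MMIO register. */
struct mock_pci_dev {
	uint16_t command;  /* PCI command register */
	uint32_t mmio_reg; /* a device register behind a memory BAR */
};

/*
 * Model of an MMIO read. Assumption: when memory decoding is disabled
 * nobody claims the read, and the host reports all ones.
 */
uint32_t mock_mmio_read(const struct mock_pci_dev *dev)
{
	if (!(dev->command & PCI_COMMAND_MEMORY))
		return 0xFFFFFFFFu;
	return dev->mmio_reg;
}

/* Same kind of check as the "MMIO unreliable" message: all ones is suspect. */
bool mock_mmio_looks_dead(const struct mock_pci_dev *dev)
{
	return mock_mmio_read(dev) == 0xFFFFFFFFu;
}
```

In this model, any code that still reads device registers after the bit has
been cleared sees 0xFFFFFFFF, which is why the order of the shutdown
callbacks matters in the discussion below.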

> 
> -----Original Message-----
> From: Ville Syrjälä <ville.syrjala@linux.intel.com> 
> Sent: Wednesday, October 8, 2025 5:22 AM
> To: Yao, Jia <jia.yao@intel.com>
> Cc: intel-gfx@lists.freedesktop.org; Zuo, Alex <alex.zuo@intel.com>; Lin, Shuicheng <shuicheng.lin@intel.com>; Askar Safin <safinaskar@gmail.com>; Pingfan Liu <piliu@redhat.com>; Chris Wilson <chris.p.wilson@linux.intel.com>
> Subject: Re: [PATCH v2] drm/i915: Setting/clearing the memory access bit when en/disabling i915
> 
> On Tue, Oct 07, 2025 at 09:40:45PM +0000, Yao, Jia wrote:
> > You mean intel_pxp_fini(i915)?
> > This is because mei_me_shutdown is called after i915_driver_shutdown
> > in the pci_device_shutdown sequence. If we don't close pxp in advance, it
> > will cause
> > 
> > [  295.584775] i915 0000:00:02.0: [drm] *ERROR* gt: MMIO unreliable (forcewake register returns 0xFFFFFFFF)!
> 
> So that is the actual bug you're trying to fix? Please just submit the pxp fix on its own.
> 
> > 
> > since we disabled PCI_COMMAND_MEMORY in i915_driver_shutdown.
> > 
> > Thanks,
> > Jia
> > 
> > -----Original Message-----
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Sent: Tuesday, October 7, 2025 2:25 PM
> > To: Yao, Jia <jia.yao@intel.com>
> > Cc: intel-gfx@lists.freedesktop.org; Zuo, Alex <alex.zuo@intel.com>; 
> > Lin, Shuicheng <shuicheng.lin@intel.com>; Askar Safin 
> > <safinaskar@gmail.com>; Pingfan Liu <piliu@redhat.com>; Chris Wilson 
> > <chris.p.wilson@linux.intel.com>
> > Subject: Re: [PATCH v2] drm/i915: Setting/clearing the memory access 
> > bit when en/disabling i915
> > 
> > On Tue, Oct 07, 2025 at 08:25:14PM +0000, Jia Yao wrote:
> > > Make i915's PCI device management more robust by always 
> > > setting/clearing the memory access bit when enabling/disabling the 
> > > device, and by consolidating this logic into helper functions.
> > > 
> > > It fixes kexec reboot issue by disabling memory access before 
> > > shutting down the device, which can block unsafe and unwanted access from DMA.
> > > 
> > > v2:
> > >   - follow brace style
> > > 
> > > Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14598
> > > Cc: Alex Zuo <alex.zuo@intel.com>
> > > Cc: Shuicheng Lin <shuicheng.lin@intel.com>
> > > Cc: Askar Safin <safinaskar@gmail.com>
> > > Cc: Pingfan Liu <piliu@redhat.com>
> > > Suggested-by: Chris Wilson <chris.p.wilson@linux.intel.com>
> > > Signed-off-by: Jia Yao <jia.yao@intel.com>
> > > ---
> > > >  drivers/gpu/drm/i915/i915_driver.c | 35 +++++++++++++++++++++++++++---
> > > >  1 file changed, 32 insertions(+), 3 deletions(-)
> > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
> > > index b46cb54ef5dc..766f85726b67 100644
> > > --- a/drivers/gpu/drm/i915/i915_driver.c
> > > +++ b/drivers/gpu/drm/i915/i915_driver.c
> > > @@ -118,6 +118,33 @@
> > >  
> > >  static const struct drm_driver i915_drm_driver;
> > >  
> > > +static int i915_enable_device(struct pci_dev *pdev) {
> > > +	u32 cmd;
> > > +	int ret;
> > > +
> > > +	ret = pci_enable_device(pdev);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	pci_read_config_dword(pdev, PCI_COMMAND, &cmd);
> > > +	if (!(cmd & PCI_COMMAND_MEMORY))
> > > > +		pci_write_config_dword(pdev, PCI_COMMAND, cmd |
> > > > +				       PCI_COMMAND_MEMORY);
> > > +
> > > +	return 0;
> > > +}
> > 
> > NAK. If the pci code is broken then fix the problem there.
> > Do not add ugly hacks into random drivers.
> > 
> > > +
> > > +static void i915_disable_device(struct pci_dev *pdev) {
> > > +	u32 cmd;
> > > +
> > > +	pci_read_config_dword(pdev, PCI_COMMAND, &cmd);
> > > +	if (cmd & PCI_COMMAND_MEMORY)
> > > > +		pci_write_config_dword(pdev, PCI_COMMAND, cmd &
> > > > +				       ~PCI_COMMAND_MEMORY);
> > > +
> > > +	pci_disable_device(pdev);
> > > +}
> > > +
> > > >  static int i915_workqueues_init(struct drm_i915_private *dev_priv)
> > > >  {
> > >  	/*
> > > @@ -788,7 +815,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> > >  	struct intel_display *display;
> > >  	int ret;
> > >  
> > > -	ret = pci_enable_device(pdev);
> > > +	ret = i915_enable_device(pdev);
> > >  	if (ret) {
> > >  		pr_err("Failed to enable graphics device: %pe\n", ERR_PTR(ret));
> > >  		return ret;
> > > > @@ -796,7 +823,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> > >  
> > >  	i915 = i915_driver_create(pdev, ent);
> > >  	if (IS_ERR(i915)) {
> > > -		pci_disable_device(pdev);
> > > +		i915_disable_device(pdev);
> > >  		return PTR_ERR(i915);
> > >  	}
> > >  
> > > @@ -885,7 +912,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> > >  	enable_rpm_wakeref_asserts(&i915->runtime_pm);
> > >  	i915_driver_late_release(i915);
> > >  out_pci_disable:
> > > -	pci_disable_device(pdev);
> > > +	i915_disable_device(pdev);
> > >  	i915_probe_error(i915, "Device initialization failed (%d)\n", ret);
> > >  	return ret;
> > >  }
> > > > @@ -1003,6 +1030,7 @@ void i915_driver_shutdown(struct drm_i915_private *i915)
> > >  
> > >  	intel_dmc_suspend(display);
> > >  
> > > +	intel_pxp_fini(i915);
> > 
> > What is that doing in this patch?
> > 
> > >  	i915_gem_suspend(i915);
> > >  
> > >  	/*
> > > @@ -1020,6 +1048,7 @@ void i915_driver_shutdown(struct drm_i915_private *i915)
> > >  	enable_rpm_wakeref_asserts(&i915->runtime_pm);
> > >  
> > >  	intel_runtime_pm_driver_last_release(&i915->runtime_pm);
> > > +	i915_disable_device(to_pci_dev(i915->drm.dev));
> > >  }
> > >  
> > >  static bool suspend_to_idle(struct drm_i915_private *dev_priv)
> > > --
> > > 2.34.1
> > 
> > --
> > Ville Syrjälä
> > Intel
> 
> --
> Ville Syrjälä
> Intel

-- 
Ville Syrjälä
Intel
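
As an aside for readers of the quoted patch, the bit it toggles can be
illustrated outside the kernel. The sketch below is a user-space model under
stated assumptions, not the patch's implementation: struct mock_cfg and the
mock_* helpers are hypothetical, while PCI_COMMAND (offset 0x04) and
PCI_COMMAND_MEMORY (0x2) follow the standard PCI configuration header. It
uses 16-bit accesses because the command register is 16 bits wide and the
status register sits directly after it at offset 0x06.

```c
#include <stdint.h>

#define PCI_COMMAND        0x04 /* 16-bit command register offset */
#define PCI_COMMAND_MEMORY 0x2  /* Memory Space Enable */

/* Hypothetical in-memory copy of a 256-byte PCI configuration space. */
struct mock_cfg {
	uint8_t bytes[256];
};

/* Little-endian 16-bit read from the mock configuration space. */
uint16_t mock_read_config_word(const struct mock_cfg *cfg, unsigned int off)
{
	return (uint16_t)(cfg->bytes[off] | (cfg->bytes[off + 1] << 8));
}

/* Little-endian 16-bit write to the mock configuration space. */
void mock_write_config_word(struct mock_cfg *cfg, unsigned int off,
			    uint16_t val)
{
	cfg->bytes[off] = (uint8_t)(val & 0xff);
	cfg->bytes[off + 1] = (uint8_t)(val >> 8);
}

/* Set or clear Memory Space Enable, writing only when the value changes. */
void mock_set_memory_decode(struct mock_cfg *cfg, int enable)
{
	uint16_t cmd = mock_read_config_word(cfg, PCI_COMMAND);
	uint16_t new_cmd = enable ? (uint16_t)(cmd | PCI_COMMAND_MEMORY)
				  : (uint16_t)(cmd & ~PCI_COMMAND_MEMORY);

	if (new_cmd != cmd)
		mock_write_config_word(cfg, PCI_COMMAND, new_cmd);
}
```

Restricting the read-modify-write to the 16-bit command register leaves the
neighbouring status register untouched in this model. In the kernel, the
corresponding accessors would be pci_read_config_word() and
pci_write_config_word().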

Thread overview: 29+ messages
2025-10-07 18:06 [PATCH] drm/i915: Setting/clearing the memory access bit when enabling/disabling i915 Jia Yao
2025-10-07 20:25 ` [PATCH v2] drm/i915: Setting/clearing the memory access bit when en/disabling i915 Jia Yao
2025-10-07 21:25   ` Ville Syrjälä
2025-10-07 21:40     ` Yao, Jia
2025-10-08 12:21       ` Ville Syrjälä
2025-10-08 16:06         ` Yao, Jia
2025-10-08 16:15           ` Ville Syrjälä [this message]
2025-10-08 17:14             ` Yao, Jia
2025-10-09  1:55               ` Pingfan Liu
2025-10-11 12:35             ` Askar Safin
2025-10-11 12:49             ` Askar Safin
2025-10-13 16:16               ` Yao, Jia
2025-10-14  6:29               ` Pingfan Liu
2025-11-01 16:02                 ` Askar Safin
2025-10-08  5:17   ` Pingfan Liu
2025-10-08  7:05     ` Yao, Jia
2025-10-08 10:58       ` Pingfan Liu
2025-10-08  8:50   ` Askar Safin
2025-10-09  1:10   ` Askar Safin
2025-10-08  4:06 ` ✓ i915.CI.BAT: success for drm/i915: Setting/clearing the memory access bit when enabling/disabling i915 Patchwork
2025-10-08  4:29 ` ✓ i915.CI.BAT: success for drm/i915: Setting/clearing the memory access bit when enabling/disabling i915 (rev2) Patchwork
2026-01-20  4:42 ` [PATCH v3] drm/i915: Clearing the Memory Space Enable bit when disabling i915 Jia Yao
2026-01-20  9:50   ` Jani Nikula
2026-01-21 21:51     ` Yao, Jia
2026-01-20 16:11   ` Ville Syrjälä
2026-01-21  7:19     ` Yao, Jia
2026-01-21 15:02       ` Ville Syrjälä
2026-01-22  6:43         ` Yao, Jia
2026-01-20  5:31 ` ✗ i915.CI.BAT: failure for drm/i915: Setting/clearing the memory access bit when enabling/disabling i915 (rev3) Patchwork
