public inbox for stable@vger.kernel.org
* [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device.
@ 2022-09-21 17:39 Rodrigo Vivi
  2022-09-22  7:56 ` [Intel-gfx] " Tvrtko Ursulin
  2022-09-22  9:23 ` Jani Nikula
  0 siblings, 2 replies; 8+ messages in thread
From: Rodrigo Vivi @ 2022-09-21 17:39 UTC (permalink / raw)
  To: intel-gfx
  Cc: Rodrigo Vivi, Daniel J Blueman, stable, Tvrtko Ursulin,
	Anshuman Gupta

The force_probe protection actively prevents i915 from probing and
managing a device that is still under development. It is a nice
protection for future users who get a new platform but run
some older kernel.

However, when we avoid the probe we don't take back the registration
of the device. We cannot give up the registration anyway, since we can
have multiple devices present, for instance an integrated and a
discrete one.

When this scenario occurs, the user will not be able to change any
of the runtime PM configuration of the unmanaged device. So it will
be stuck in the D0 state, wasting power. This is especially bad in the
case where we have a discrete platform attached, but the user is
able to fully use the integrated one for everything else.

So, let's put the protected and unmanaged device in D3 so we can
save some power.

Reported-by: Daniel J Blueman <daniel@quora.org>
Cc: stable@vger.kernel.org
Cc: Daniel J Blueman <daniel@quora.org>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 77e7df21f539..fc3e7c69af2a 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -25,6 +25,7 @@
 #include <drm/drm_color_mgmt.h>
 #include <drm/drm_drv.h>
 #include <drm/i915_pciids.h>
+#include <linux/pm_runtime.h>
 
 #include "gt/intel_gt_regs.h"
 #include "gt/intel_sa_media.h"
@@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	struct intel_device_info *intel_info =
 		(struct intel_device_info *) ent->driver_data;
+	struct device *kdev = &pdev->dev;
 	int err;
 
 	if (intel_info->require_force_probe &&
@@ -1314,6 +1316,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 			 "module parameter or CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
 			 "or (recommended) check for kernel updates.\n",
 			 pdev->device, pdev->device, pdev->device);
+
+		/* Let's not waste power if we are not managing the device */
+		pm_runtime_use_autosuspend(kdev);
+		pm_runtime_allow(kdev);
+		pm_runtime_put_autosuspend(kdev);
+
 		return -ENODEV;
 	}
 
-- 
2.37.2



* Re: [Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device.
  2022-09-21 17:39 [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device Rodrigo Vivi
@ 2022-09-22  7:56 ` Tvrtko Ursulin
  2022-09-22  9:43   ` Rodrigo Vivi
  2022-09-22  9:23 ` Jani Nikula
  1 sibling, 1 reply; 8+ messages in thread
From: Tvrtko Ursulin @ 2022-09-22  7:56 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-gfx; +Cc: Daniel J Blueman, stable, Imre Deak


On 21/09/2022 18:39, Rodrigo Vivi wrote:
> The force_probe protection actively avoids the probe of i915 to
> manage a device that is currently under development. It is a nice
> protection for future users when getting a new platform but using
> some older kernel.
> 
> However, when we avoid the probe we don't take back the registration
> of the device. We cannot give up the registration anyway since we can
> have multiple devices present. For instance an integrated and a discrete
> one.
> 
> When this scenario occurs, the user will not be able to change any
> of the runtime pm configuration of the unmanaged device. So, it will
> be blocked in D0 state wasting power. This is specially bad in the
> case where we have a discrete platform attached, but the user is
> able to fully use the integrated one for everything else.
> 
> So, let's put the protected and unmanaged device in D3. So we can
> save some power.
> 
> Reported-by: Daniel J Blueman <daniel@quora.org>
> Cc: stable@vger.kernel.org
> Cc: Daniel J Blueman <daniel@quora.org>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Anshuman Gupta <anshuman.gupta@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 77e7df21f539..fc3e7c69af2a 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -25,6 +25,7 @@
>   #include <drm/drm_color_mgmt.h>
>   #include <drm/drm_drv.h>
>   #include <drm/i915_pciids.h>
> +#include <linux/pm_runtime.h>
>   
>   #include "gt/intel_gt_regs.h"
>   #include "gt/intel_sa_media.h"
> @@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   {
>   	struct intel_device_info *intel_info =
>   		(struct intel_device_info *) ent->driver_data;
> +	struct device *kdev = &pdev->dev;
>   	int err;
>   
>   	if (intel_info->require_force_probe &&
> @@ -1314,6 +1316,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   			 "module parameter or CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
>   			 "or (recommended) check for kernel updates.\n",
>   			 pdev->device, pdev->device, pdev->device);
> +
> +		/* Let's not waste power if we are not managing the device */
> +		pm_runtime_use_autosuspend(kdev);
> +		pm_runtime_allow(kdev);
> +		pm_runtime_put_autosuspend(kdev);

This sequence is black magic to me, so I can't really comment on the specifics. But in general, what I think I've figured out is that the PCI core calls our runtime resume callback before probe:

local_pci_probe:
...
         /*
          * Unbound PCI devices are always put in D0, regardless of
          * runtime PM status.  During probe, the device is set to
          * active and the usage count is incremented.  If the driver
          * supports runtime PM, it should call pm_runtime_put_noidle(),
          * or any other runtime PM helper function decrementing the usage
          * count, in its probe routine and pm_runtime_get_noresume() in
          * its remove routine.
          */
         pm_runtime_get_sync(dev);
         pci_dev->driver = pci_drv;
         rc = pci_drv->probe(pci_dev, ddi->id);
         if (!rc)
                 return rc;
         if (rc < 0) {
                 pci_dev->driver = NULL;
                 pm_runtime_put_sync(dev);
                 return rc;
         }

And if probe fails, it calls pm_runtime_put_sync(), which presumably does not provide the symmetry we need?

Anyway, since I can't provide a meaningful review, I'll copy Imre, since I think he worked in this area in the past. More eyes are better.
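
The get/put pairing in the excerpt above can be replayed with a toy
counter in user space. This is only a sketch, not kernel code: struct
toy_rpm, toy_get(), toy_put() and toy_failed_probe_underflows() are
hypothetical names, and the counter's starting value is a parameter
because the excerpt does not say what it is when probe begins.

```c
#include <stdbool.h>

/* Toy stand-in for a runtime PM usage counter; not the kernel's struct. */
struct toy_rpm {
	int usage_count;
	bool underflow_seen;
};

/* Take a reference, as the core does before calling ->probe(). */
static void toy_get(struct toy_rpm *rpm)
{
	rpm->usage_count++;
}

/* Drop a reference; refuse to go below zero and record the attempt. */
static void toy_put(struct toy_rpm *rpm)
{
	if (rpm->usage_count == 0) {
		rpm->underflow_seen = true;
		return;
	}
	rpm->usage_count--;
}

/*
 * Replay a failed probe: the core takes one reference, the probe
 * routine drops probe_puts references, then the core drops its own
 * reference on the error path. Returns true if any drop found the
 * counter already at zero.
 */
static bool toy_failed_probe_underflows(int initial_count, int probe_puts)
{
	struct toy_rpm rpm = { .usage_count = initial_count };
	int i;

	toy_get(&rpm);			/* core: before ->probe() */
	for (i = 0; i < probe_puts; i++)
		toy_put(&rpm);		/* driver: inside ->probe() */
	toy_put(&rpm);			/* core: ->probe() returned an error */

	return rpm.underflow_seen;
}
```

In this model, a probe that drops one reference and then fails leaves
the core's own drop unbalanced when the counter started at zero, but
not when something else already held a reference.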

Regards,

Tvrtko


> +
>   		return -ENODEV;
>   	}
>   


* Re: [Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device.
  2022-09-21 17:39 [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device Rodrigo Vivi
  2022-09-22  7:56 ` [Intel-gfx] " Tvrtko Ursulin
@ 2022-09-22  9:23 ` Jani Nikula
  1 sibling, 0 replies; 8+ messages in thread
From: Jani Nikula @ 2022-09-22  9:23 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-gfx
  Cc: Daniel J Blueman, stable, Rodrigo Vivi, Rafael J. Wysocki


Cc: Rafael

On Wed, 21 Sep 2022, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> The force_probe protection actively avoids the probe of i915 to
> manage a device that is currently under development. It is a nice
> protection for future users when getting a new platform but using
> some older kernel.
>
> However, when we avoid the probe we don't take back the registration
> of the device. We cannot give up the registration anyway since we can
> have multiple devices present. For instance an integrated and a discrete
> one.
>
> When this scenario occurs, the user will not be able to change any
> of the runtime pm configuration of the unmanaged device. So, it will
> be blocked in D0 state wasting power. This is specially bad in the
> case where we have a discrete platform attached, but the user is
> able to fully use the integrated one for everything else.
>
> So, let's put the protected and unmanaged device in D3. So we can
> save some power.
>
> Reported-by: Daniel J Blueman <daniel@quora.org>
> Cc: stable@vger.kernel.org
> Cc: Daniel J Blueman <daniel@quora.org>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Anshuman Gupta <anshuman.gupta@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 77e7df21f539..fc3e7c69af2a 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -25,6 +25,7 @@
>  #include <drm/drm_color_mgmt.h>
>  #include <drm/drm_drv.h>
>  #include <drm/i915_pciids.h>
> +#include <linux/pm_runtime.h>
>  
>  #include "gt/intel_gt_regs.h"
>  #include "gt/intel_sa_media.h"
> @@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  {
>  	struct intel_device_info *intel_info =
>  		(struct intel_device_info *) ent->driver_data;
> +	struct device *kdev = &pdev->dev;
>  	int err;
>  
>  	if (intel_info->require_force_probe &&
> @@ -1314,6 +1316,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  			 "module parameter or CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
>  			 "or (recommended) check for kernel updates.\n",
>  			 pdev->device, pdev->device, pdev->device);
> +
> +		/* Let's not waste power if we are not managing the device */
> +		pm_runtime_use_autosuspend(kdev);
> +		pm_runtime_allow(kdev);
> +		pm_runtime_put_autosuspend(kdev);
> +
>  		return -ENODEV;
>  	}

-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: [Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device.
  2022-09-22  7:56 ` [Intel-gfx] " Tvrtko Ursulin
@ 2022-09-22  9:43   ` Rodrigo Vivi
  2022-09-22 11:09     ` Gupta, Anshuman
  0 siblings, 1 reply; 8+ messages in thread
From: Rodrigo Vivi @ 2022-09-22  9:43 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: intel-gfx, Daniel J Blueman, stable, Rafael J. Wysocki,
	Jani Nikula

On Thu, Sep 22, 2022 at 08:56:00AM +0100, Tvrtko Ursulin wrote:
> 
> On 21/09/2022 18:39, Rodrigo Vivi wrote:
> > The force_probe protection actively avoids the probe of i915 to
> > manage a device that is currently under development. It is a nice
> > protection for future users when getting a new platform but using
> > some older kernel.
> > 
> > However, when we avoid the probe we don't take back the registration
> > of the device. We cannot give up the registration anyway since we can
> > have multiple devices present. For instance an integrated and a discrete
> > one.
> > 
> > When this scenario occurs, the user will not be able to change any
> > of the runtime pm configuration of the unmanaged device. So, it will
> > be blocked in D0 state wasting power. This is specially bad in the
> > case where we have a discrete platform attached, but the user is
> > able to fully use the integrated one for everything else.
> > 
> > So, let's put the protected and unmanaged device in D3. So we can
> > save some power.
> > 
> > Reported-by: Daniel J Blueman <daniel@quora.org>
> > Cc: stable@vger.kernel.org
> > Cc: Daniel J Blueman <daniel@quora.org>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Anshuman Gupta <anshuman.gupta@intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
> >   1 file changed, 8 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> > index 77e7df21f539..fc3e7c69af2a 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -25,6 +25,7 @@
> >   #include <drm/drm_color_mgmt.h>
> >   #include <drm/drm_drv.h>
> >   #include <drm/i915_pciids.h>
> > +#include <linux/pm_runtime.h>
> >   #include "gt/intel_gt_regs.h"
> >   #include "gt/intel_sa_media.h"
> > @@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> >   {
> >   	struct intel_device_info *intel_info =
> >   		(struct intel_device_info *) ent->driver_data;
> > +	struct device *kdev = &pdev->dev;
> >   	int err;
> >   	if (intel_info->require_force_probe &&
> > @@ -1314,6 +1316,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> >   			 "module parameter or CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
> >   			 "or (recommended) check for kernel updates.\n",
> >   			 pdev->device, pdev->device, pdev->device);
> > +
> > +		/* Let's not waste power if we are not managing the device */
> > +		pm_runtime_use_autosuspend(kdev);
> > +		pm_runtime_allow(kdev);
> > +		pm_runtime_put_autosuspend(kdev);
> 
> This sequence is black magic to me so can't really comment on the specifics. But in general, what I think I've figured out is, that the PCI core calls our runtime resume callback before probe:
> 
> local_pci_probe:
> ...
>         /*
>          * Unbound PCI devices are always put in D0, regardless of
>          * runtime PM status.  During probe, the device is set to
>          * active and the usage count is incremented.  If the driver
>          * supports runtime PM, it should call pm_runtime_put_noidle(),
>          * or any other runtime PM helper function decrementing the usage
>          * count, in its probe routine and pm_runtime_get_noresume() in
>          * its remove routine.
>          */
>         pm_runtime_get_sync(dev);
>         pci_dev->driver = pci_drv;
>         rc = pci_drv->probe(pci_dev, ddi->id);
>         if (!rc)
>                 return rc;
>         if (rc < 0) {
>                 pci_dev->driver = NULL;
>                 pm_runtime_put_sync(dev);
>                 return rc;
>         }
> 

Yes, in Linux the default is D0 for any unmanaged device. But then the
user can go to sysfs, change power/control to 'auto', and get the
device to D3.

> And if probe fails it calls pm_runtime_put_sync which presumably does not provide the symmetry we need?

The main problem I see is that when the probe fails in our case, we don't
unregister, and i915 is still listed as controlling that device, as we can
see with lspci -nnv.

And any attempt to change the control to 'auto' fails. So we are forever
stuck in D0.
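
For reference, the sysfs write mentioned above can be sketched in
user-space C. The helper name set_runtime_pm_control() and its error
convention are assumptions made for illustration; the only facts relied
on are that the power/control attribute accepts the strings "auto" and
"on". The sketch does not try to reproduce how the write fails in the
scenario described here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a runtime PM mode ("auto" or "on") to a power/control attribute.
 * Returns 0 on success or a negative errno value on failure.
 */
static int set_runtime_pm_control(const char *control_path, const char *mode)
{
	FILE *f;
	int err = 0;

	/* Reject anything other than the two accepted strings. */
	if (strcmp(mode, "auto") != 0 && strcmp(mode, "on") != 0)
		return -EINVAL;

	f = fopen(control_path, "w");
	if (!f)
		return -errno;

	if (fputs(mode, f) == EOF)
		err = -errno;
	/* Buffered write errors may only surface when the file is closed. */
	if (fclose(f) == EOF && !err)
		err = -errno;

	return err;
}
```

A caller would pass a path such as
/sys/bus/pci/devices/0000:03:00.0/power/control and check the return
value to see whether the kernel accepted the change.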

So, I really believe it is better to bring the device to D3 than to
leave it there blocked in D0 forever.

Or than forcing users to use another parameter to entirely prevent i915
from getting this device in the first place.

> 
> Anyway since I can't provide meaningful review I'll copy Imre since I think he worked in the area in the past. Just so more eyes is better.
> 
> Regards,
> 
> Tvrtko
> 
> 
> > +
> >   		return -ENODEV;
> >   	}


* Re: [Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device.
  2022-09-22  9:43   ` Rodrigo Vivi
@ 2022-09-22 11:09     ` Gupta, Anshuman
  2022-09-22 12:40       ` Gupta, Anshuman
  0 siblings, 1 reply; 8+ messages in thread
From: Gupta, Anshuman @ 2022-09-22 11:09 UTC (permalink / raw)
  To: Rodrigo Vivi, Tvrtko Ursulin
  Cc: Nikula, Jani, intel-gfx@lists.freedesktop.org, Daniel J Blueman,
	Wysocki, Rafael J, stable@vger.kernel.org



On 9/22/2022 3:13 PM, Rodrigo Vivi wrote:
> On Thu, Sep 22, 2022 at 08:56:00AM +0100, Tvrtko Ursulin wrote:
>>
>> On 21/09/2022 18:39, Rodrigo Vivi wrote:
>>> The force_probe protection actively avoids the probe of i915 to
>>> manage a device that is currently under development. It is a nice
>>> protection for future users when getting a new platform but using
>>> some older kernel.
>>>
>>> However, when we avoid the probe we don't take back the registration
>>> of the device. We cannot give up the registration anyway since we can
>>> have multiple devices present. For instance an integrated and a discrete
>>> one.
>>>
>>> When this scenario occurs, the user will not be able to change any
>>> of the runtime pm configuration of the unmanaged device. So, it will
>>> be blocked in D0 state wasting power. This is specially bad in the
>>> case where we have a discrete platform attached, but the user is
>>> able to fully use the integrated one for everything else.
>>>
>>> So, let's put the protected and unmanaged device in D3. So we can
>>> save some power.
>>>
>>> Reported-by: Daniel J Blueman <daniel@quora.org>
>>> Cc: stable@vger.kernel.org
>>> Cc: Daniel J Blueman <daniel@quora.org>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Anshuman Gupta <anshuman.gupta@intel.com>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
>>>    1 file changed, 8 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
>>> index 77e7df21f539..fc3e7c69af2a 100644
>>> --- a/drivers/gpu/drm/i915/i915_pci.c
>>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>>> @@ -25,6 +25,7 @@
>>>    #include <drm/drm_color_mgmt.h>
>>>    #include <drm/drm_drv.h>
>>>    #include <drm/i915_pciids.h>
>>> +#include <linux/pm_runtime.h>
>>>    #include "gt/intel_gt_regs.h"
>>>    #include "gt/intel_sa_media.h"
>>> @@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>>    {
>>>    	struct intel_device_info *intel_info =
>>>    		(struct intel_device_info *) ent->driver_data;
>>> +	struct device *kdev = &pdev->dev;
>>>    	int err;
>>>    	if (intel_info->require_force_probe &&
>>> @@ -1314,6 +1316,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>>    			 "module parameter or CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
>>>    			 "or (recommended) check for kernel updates.\n",
>>>    			 pdev->device, pdev->device, pdev->device);
>>> +
>>> +		/* Let's not waste power if we are not managing the device */
>>> +		pm_runtime_use_autosuspend(kdev);
>>> +		pm_runtime_allow(kdev);
>>> +		pm_runtime_put_autosuspend(kdev);
AFAIK we don't need to enable autosuspend here.
pm_runtime_put_autosuspend() will cause a NULL pointer dereference, as
it will immediately call intel_runtime_suspend() (because we haven't
called pm_runtime_mark_last_busy()) without i915 being initialized.

Having said that, we only need the call below in order to let the PCI
core keep the PCI device in D3:

pm_runtime_put_noidle()

Br,
Anshuman Gupta


>>
>> This sequence is black magic to me so can't really comment on the specifics. But in general, what I think I've figured out is, that the PCI core calls our runtime resume callback before probe:
>>
>> local_pci_probe:
>> ...
>>          /*
>>           * Unbound PCI devices are always put in D0, regardless of
>>           * runtime PM status.  During probe, the device is set to
>>           * active and the usage count is incremented.  If the driver
>>           * supports runtime PM, it should call pm_runtime_put_noidle(),
>>           * or any other runtime PM helper function decrementing the usage
>>           * count, in its probe routine and pm_runtime_get_noresume() in
>>           * its remove routine.
>>           */
>>          pm_runtime_get_sync(dev);
>>          pci_dev->driver = pci_drv;
>>          rc = pci_drv->probe(pci_dev, ddi->id);
>>          if (!rc)
>>                  return rc;
>>          if (rc < 0) {
>>                  pci_dev->driver = NULL;
>>                  pm_runtime_put_sync(dev);
>>                  return rc;
>>          }
>>
> 
> Yes, in Linux the default is D0 for any unmanaged device. But then the
> user can go there in the sysfs and change the power/control to 'auto'
> and get the device to D3.
> 
>> And if probe fails it calls pm_runtime_put_sync which presumably does not provide the symmetry we need?
> 
> The main problem I see is that when the probe fail in our case we don't
> unregister and i915 is still listed as controlling that device as we could
> see with lspci --nnv.
> 
> And any attempt to change the control to 'auto' fails. So we are forever
> stuck in D0.
> 
> So, I really believe it is better to bring the device to D3 then leaving
> it there blocked in D0 forever.
> 
> Or forcing users to use another parameter to entirely avoid i915 to get
> this device at first place.
> 
>>
>> Anyway since I can't provide meaningful review I'll copy Imre since I think he worked in the area in the past. Just so more eyes is better.
>>
>> Regards,
>>
>> Tvrtko
>>
>>
>>> +
>>>    		return -ENODEV;
>>>    	}


* RE: [Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device.
  2022-09-22 11:09     ` Gupta, Anshuman
@ 2022-09-22 12:40       ` Gupta, Anshuman
  2022-09-23 18:20         ` Vivi, Rodrigo
  0 siblings, 1 reply; 8+ messages in thread
From: Gupta, Anshuman @ 2022-09-22 12:40 UTC (permalink / raw)
  To: Vivi, Rodrigo, Tvrtko Ursulin
  Cc: Nikula, Jani, intel-gfx@lists.freedesktop.org, Daniel J Blueman,
	Wysocki, Rafael J, stable@vger.kernel.org



> -----Original Message-----
> From: Gupta, Anshuman
> Sent: Thursday, September 22, 2022 4:40 PM
> To: Vivi, Rodrigo <rodrigo.vivi@intel.com>; Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com>
> Cc: Nikula, Jani <jani.nikula@intel.com>; intel-gfx@lists.freedesktop.org; Daniel
> J Blueman <daniel@quora.org>; Wysocki, Rafael J
> <rafael.j.wysocki@intel.com>; stable@vger.kernel.org
> Subject: Re: [Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not actively
> managing a known PCI device.
> 
> 
> 
> On 9/22/2022 3:13 PM, Rodrigo Vivi wrote:
> > On Thu, Sep 22, 2022 at 08:56:00AM +0100, Tvrtko Ursulin wrote:
> >>
> >> On 21/09/2022 18:39, Rodrigo Vivi wrote:
> >>> The force_probe protection actively avoids the probe of i915 to
> >>> manage a device that is currently under development. It is a nice
> >>> protection for future users when getting a new platform but using
> >>> some older kernel.
> >>>
> >>> However, when we avoid the probe we don't take back the registration
> >>> of the device. We cannot give up the registration anyway since we
> >>> can have multiple devices present. For instance an integrated and a
> >>> discrete one.
> >>>
> >>> When this scenario occurs, the user will not be able to change any
> >>> of the runtime pm configuration of the unmanaged device. So, it will
> >>> be blocked in D0 state wasting power. This is specially bad in the
> >>> case where we have a discrete platform attached, but the user is
> >>> able to fully use the integrated one for everything else.
> >>>
> >>> So, let's put the protected and unmanaged device in D3. So we can
> >>> save some power.
> >>>
> >>> Reported-by: Daniel J Blueman <daniel@quora.org>
> >>> Cc: stable@vger.kernel.org
> >>> Cc: Daniel J Blueman <daniel@quora.org>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> Cc: Anshuman Gupta <anshuman.gupta@intel.com>
> >>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
> >>>    1 file changed, 8 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_pci.c
> >>> b/drivers/gpu/drm/i915/i915_pci.c index 77e7df21f539..fc3e7c69af2a
> >>> 100644
> >>> --- a/drivers/gpu/drm/i915/i915_pci.c
> >>> +++ b/drivers/gpu/drm/i915/i915_pci.c
> >>> @@ -25,6 +25,7 @@
> >>>    #include <drm/drm_color_mgmt.h>
> >>>    #include <drm/drm_drv.h>
> >>>    #include <drm/i915_pciids.h>
> >>> +#include <linux/pm_runtime.h>
> >>>    #include "gt/intel_gt_regs.h"
> >>>    #include "gt/intel_sa_media.h"
> >>> @@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev,
> const struct pci_device_id *ent)
> >>>    {
> >>>    	struct intel_device_info *intel_info =
> >>>    		(struct intel_device_info *) ent->driver_data;
> >>> +	struct device *kdev = &pdev->dev;
> >>>    	int err;
> >>>    	if (intel_info->require_force_probe && @@ -1314,6 +1316,12 @@
> >>> static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id
> *ent)
> >>>    			 "module parameter or
> CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
> >>>    			 "or (recommended) check for kernel updates.\n",
> >>>    			 pdev->device, pdev->device, pdev->device);
> >>> +
> >>> +		/* Let's not waste power if we are not managing the device */
> >>> +		pm_runtime_use_autosuspend(kdev);
> >>> +		pm_runtime_allow(kdev);
> >>> +		pm_runtime_put_autosuspend(kdev);
> AFAIK we don't need to enable autosuspend here,
> pm_runtime_put_autosuspend() will cause a NULL pointer de-reference as it will
> immediately call the intel_runtime_suspend()(because we haven't called the
> pm_runtime_mark_last_busy) without initializing i915.
> 
> Having said that we only need below, in order to let pci core keep the pci dev in
> D3.
> 
> pm_runtime_put_noidle()
Hi Rodrigo,
It seems that playing with these runtime hooks will only enable "runtime suspend";
the actual state in the PMCSR PCI config register is still D0 even though the device is runtime suspended, when there is no driver.
Example:
root@DUT2135-DG2MRB:/home/gta# cat /sys/bus/pci/devices/0000\:03\:00.0/power/runtime_status
suspended
root@DUT2135-DG2MRB:/home/gta# setpci -s 03:00.0 0xd4.l
00000008
(Bits 1:0 are the power state in the PMCSR config register, which sits at offset 4 from the PM capability at 0xd0.)
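
To make the reading above concrete, here is a minimal user-space sketch
that decodes the PowerState field from a raw PMCSR value. The helper
name pmcsr_power_state() and the mask macro are made up for this
example; the bit encoding (00b = D0, 01b = D1, 10b = D2, 11b = D3hot)
is the one defined by the PCI power management capability.

```c
#include <stdint.h>

/* PowerState field: bits 1:0 of the PCI PM Control/Status Register. */
#define PMCSR_STATE_MASK 0x3u

/* Map a raw PMCSR value to the name of the D-state it encodes. */
static const char *pmcsr_power_state(uint32_t pmcsr)
{
	static const char *const names[] = { "D0", "D1", "D2", "D3hot" };

	return names[pmcsr & PMCSR_STATE_MASK];
}
```

Feeding it the 00000008 value read with setpci gives D0, since only
bit 3 is set and bits 1:0 are both zero.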

Thanks,
Anshuman Gupta.
> 
> Br,
> Anshuman Gupta
> 
> 
> >>
> >> This sequence is black magic to me so can't really comment on the specifics.
> But in general, what I think I've figured out is, that the PCI core calls our runtime
> resume callback before probe:
> >>
> >> local_pci_probe:
> >> ...
> >>          /*
> >>           * Unbound PCI devices are always put in D0, regardless of
> >>           * runtime PM status.  During probe, the device is set to
> >>           * active and the usage count is incremented.  If the driver
> >>           * supports runtime PM, it should call pm_runtime_put_noidle(),
> >>           * or any other runtime PM helper function decrementing the usage
> >>           * count, in its probe routine and pm_runtime_get_noresume() in
> >>           * its remove routine.
> >>           */
> >>          pm_runtime_get_sync(dev);
> >>          pci_dev->driver = pci_drv;
> >>          rc = pci_drv->probe(pci_dev, ddi->id);
> >>          if (!rc)
> >>                  return rc;
> >>          if (rc < 0) {
> >>                  pci_dev->driver = NULL;
> >>                  pm_runtime_put_sync(dev);
> >>                  return rc;
> >>          }
> >>
> >
> > Yes, in Linux the default is D0 for any unmanaged device. But then the
> > user can go there in the sysfs and change the power/control to 'auto'
> > and get the device to D3.
> >
> >> And if probe fails it calls pm_runtime_put_sync which presumably does not
> provide the symmetry we need?
> >
> > The main problem I see is that when the probe fail in our case we
> > don't unregister and i915 is still listed as controlling that device
> > as we could see with lspci --nnv.
> >
> > And any attempt to change the control to 'auto' fails. So we are
> > forever stuck in D0.
> >
> > So, I really believe it is better to bring the device to D3 then
> > leaving it there blocked in D0 forever.
> >
> > Or forcing users to use another parameter to entirely avoid i915 to
> > get this device at first place.
> >
> >>
> >> Anyway since I can't provide meaningful review I'll copy Imre since I think he
> worked in the area in the past. Just so more eyes is better.
> >>
> >> Regards,
> >>
> >> Tvrtko
> >>
> >>
> >>> +
> >>>    		return -ENODEV;
> >>>    	}


* Re: [Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device.
  2022-09-22 12:40       ` Gupta, Anshuman
@ 2022-09-23 18:20         ` Vivi, Rodrigo
  2022-09-24 17:52           ` Rafael J. Wysocki
  0 siblings, 1 reply; 8+ messages in thread
From: Vivi, Rodrigo @ 2022-09-23 18:20 UTC (permalink / raw)
  To: tvrtko.ursulin@linux.intel.com, Gupta, Anshuman
  Cc: Nikula, Jani, daniel@quora.org, intel-gfx@lists.freedesktop.org,
	stable@vger.kernel.org, Wysocki, Rafael J

Rafael, could you please add your thoughts here?

On Thu, 2022-09-22 at 12:40 +0000, Gupta, Anshuman wrote:
> 
> 
> > -----Original Message-----
> > From: Gupta, Anshuman
> > Sent: Thursday, September 22, 2022 4:40 PM
> > To: Vivi, Rodrigo <rodrigo.vivi@intel.com>; Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com>
> > Cc: Nikula, Jani <jani.nikula@intel.com>; intel-
> > gfx@lists.freedesktop.org; Daniel
> > J Blueman <daniel@quora.org>; Wysocki, Rafael J
> > <rafael.j.wysocki@intel.com>; stable@vger.kernel.org
> > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not
> > actively
> > managing a known PCI device.
> > 
> > 
> > 
> > On 9/22/2022 3:13 PM, Rodrigo Vivi wrote:
> > > On Thu, Sep 22, 2022 at 08:56:00AM +0100, Tvrtko Ursulin wrote:
> > > > 
> > > > On 21/09/2022 18:39, Rodrigo Vivi wrote:
> > > > > The force_probe protection actively avoids the probe of i915
> > > > > to
> > > > > manage a device that is currently under development. It is a
> > > > > nice
> > > > > protection for future users when getting a new platform but
> > > > > using
> > > > > some older kernel.
> > > > > 
> > > > > However, when we avoid the probe we don't take back the
> > > > > registration
> > > > > of the device. We cannot give up the registration anyway
> > > > > since we
> > > > > can have multiple devices present. For instance an integrated
> > > > > and a
> > > > > discrete one.
> > > > > 
> > > > > When this scenario occurs, the user will not be able to
> > > > > change any
> > > > > of the runtime pm configuration of the unmanaged device. So,
> > > > > it will
> > > > > be blocked in D0 state wasting power. This is specially bad
> > > > > in the
> > > > > case where we have a discrete platform attached, but the user
> > > > > is
> > > > > able to fully use the integrated one for everything else.
> > > > > 
> > > > > So, let's put the protected and unmanaged device in D3. So we
> > > > > can
> > > > > save some power.
> > > > > 
> > > > > Reported-by: Daniel J Blueman <daniel@quora.org>
> > > > > Cc: stable@vger.kernel.org
> > > > > Cc: Daniel J Blueman <daniel@quora.org>
> > > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > > > Cc: Anshuman Gupta <anshuman.gupta@intel.com>
> > > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > ---
> > > > >    drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
> > > > >    1 file changed, 8 insertions(+)
> > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> > > > > > index 77e7df21f539..fc3e7c69af2a 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > > > > @@ -25,6 +25,7 @@
> > > > > >    #include <drm/drm_color_mgmt.h>
> > > > > >    #include <drm/drm_drv.h>
> > > > > >    #include <drm/i915_pciids.h>
> > > > > > +#include <linux/pm_runtime.h>
> > > > > >    #include "gt/intel_gt_regs.h"
> > > > > >    #include "gt/intel_sa_media.h"
> > > > > > @@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> > > > > >    {
> > > > > >         struct intel_device_info *intel_info =
> > > > > >                 (struct intel_device_info *) ent->driver_data;
> > > > > > +       struct device *kdev = &pdev->dev;
> > > > > >         int err;
> > > > > >         if (intel_info->require_force_probe &&
> > > > > > @@ -1314,6 +1316,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> > > > > >                          "module parameter or CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
> > > > > >                          "or (recommended) check for kernel updates.\n",
> > > > > >                          pdev->device, pdev->device, pdev->device);
> > > > > > +
> > > > > > +               /* Let's not waste power if we are not managing the device */
> > > > > > +               pm_runtime_use_autosuspend(kdev);
> > > > > > +               pm_runtime_allow(kdev);
> > > > > > +               pm_runtime_put_autosuspend(kdev);
> > AFAIK we don't need to enable autosuspend here.
> > pm_runtime_put_autosuspend() will cause a NULL pointer dereference, as it
> > will immediately call intel_runtime_suspend() (because we haven't called
> > pm_runtime_mark_last_busy()) without i915 being initialized.

I don't see any null pointer dereference here.
The problem is exactly that we do the initialization and then we give up
on the device and end up blocking the runtime pm in some state that we
cannot change.

> > 
> > Having said that we only need below, in order to let pci core keep
> > the pci dev in
> > D3.
> > 
> > pm_runtime_put_noidle()

as for this one here I get:
[ 9036.357078] i915 0000:03:00.0: Runtime PM usage count underflow!
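
One possible reading of that warning, sketched as a toy user-space model
rather than kernel code: the PCI core takes a usage-count reference before
calling probe and drops it again when probe fails, so an extra put inside
a failing probe leaves the core's own put with nothing to drop. The
struct and the toy_* names below are made up for illustration, and the
model assumes the counter is zero when the core takes its reference (in
the real code pm_runtime_forbid() and pm_runtime_allow() also move it).

```c
#include <stdio.h>

/* Toy stand-in for a device's runtime PM usage counter; not a kernel type. */
struct toy_dev {
	int usage_count;
};

/* Models a "get": take one reference. */
static void toy_get(struct toy_dev *dev)
{
	dev->usage_count++;
}

/*
 * Models a "put": drop one reference. If the counter is already zero,
 * leave it unchanged, warn, and return -1.
 */
static int toy_put(struct toy_dev *dev)
{
	if (dev->usage_count == 0) {
		fprintf(stderr, "usage count underflow\n");
		return -1;
	}
	dev->usage_count--;
	return 0;
}

/*
 * Models a failing probe: the core takes a reference, the driver's probe
 * optionally drops one, then the core drops its own reference because
 * probe returned an error. Returns the result of the core's final put.
 */
static int toy_failed_probe(struct toy_dev *dev, int driver_puts_in_probe)
{
	toy_get(dev);			/* core: pm_runtime_get_sync() */
	if (driver_puts_in_probe)
		toy_put(dev);		/* driver: pm_runtime_put_noidle() */
	return toy_put(dev);		/* core: pm_runtime_put_sync() on rc < 0 */
}
```

In this model toy_failed_probe(&dev, 0) stays balanced, while
toy_failed_probe(&dev, 1) hits the underflow branch on the core's put.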

> 
> Hi Rodrigo,
> It seems playing with these runtime hooks will only enable the
> "runtime suspend", but the actual state in the "PMCSR" PCI config
> register is D0 despite the device being runtime suspended, when there
> is no driver.
> Example:
> root@DUT2135-DG2MRB:/home/gta# cat /sys/bus/pci/devices/0000\:03\:00.0/power/runtime_status
> suspended
> root@DUT2135-DG2MRB:/home/gta# setpci -s 03:00.0 0xd4.l
> 00000008
> (Bits 00:01 are the power state in the PMCSR config register (offset 4
> from the PM capability at 0xd0).)

Well, this is indeed awkward.

Rafael, do you know what we could be missing here to ensure we get the
proper D3?

I noticed that with the kernel parameter vfio-pci.ids=8086:<dg2_id> the
device does get to D3.

# setpci -s 03:00.0 0xd4.l
0000010b

While with the approach in this patch, or with the noidle() one, I also
get 00000008.
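
For reference, the two PMCSR readings quoted in this thread can be
decoded by masking bits 1:0, as in the small sketch below. The helper
names are made up for illustration; the state encoding (0 = D0, 1 = D1,
2 = D2, 3 = D3hot) is the one from the PCI power management capability.

```c
#include <stdint.h>

/* PMCSR bits 1:0 hold the current power state. */
#define PMCSR_STATE_MASK 0x3u

/* Returns 0..3, meaning D0, D1, D2, D3hot. */
static unsigned int pmcsr_power_state(uint32_t pmcsr)
{
	return pmcsr & PMCSR_STATE_MASK;
}

/* Maps a raw PMCSR value to a printable state name. */
static const char *pmcsr_state_name(uint32_t pmcsr)
{
	static const char *const names[] = { "D0", "D1", "D2", "D3hot" };

	return names[pmcsr_power_state(pmcsr)];
}
```

With these helpers 0x00000008 decodes to D0 and 0x0000010b decodes to
D3hot, which matches the two observations above.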

Thanks,
Rodrigo.

> 
> Thanks,
> Anshuman Gupta.
> > 
> > Br,
> > Anshuman Gupta
> > 
> > 
> > > > 
> > > > This sequence is black magic to me so can't really comment on the
> > > > specifics. But in general, what I think I've figured out is, that the
> > > > PCI core calls our runtime resume callback before probe:
> > > > 
> > > > local_pci_probe:
> > > > ...
> > > >          /*
> > > >           * Unbound PCI devices are always put in D0, regardless of
> > > >           * runtime PM status.  During probe, the device is set to
> > > >           * active and the usage count is incremented.  If the driver
> > > >           * supports runtime PM, it should call pm_runtime_put_noidle(),
> > > >           * or any other runtime PM helper function decrementing the usage
> > > >           * count, in its probe routine and pm_runtime_get_noresume() in
> > > >           * its remove routine.
> > > >           */
> > > >          pm_runtime_get_sync(dev);
> > > >          pci_dev->driver = pci_drv;
> > > >          rc = pci_drv->probe(pci_dev, ddi->id);
> > > >          if (!rc)
> > > >                  return rc;
> > > >          if (rc < 0) {
> > > >                  pci_dev->driver = NULL;
> > > >                  pm_runtime_put_sync(dev);
> > > >                  return rc;
> > > >          }
> > > > 
> > > 
> > > Yes, in Linux the default is D0 for any unmanaged device. But then the
> > > user can go there in the sysfs and change the power/control to 'auto'
> > > and get the device to D3.
> > > 
> > > > And if probe fails it calls pm_runtime_put_sync which presumably does
> > > > not provide the symmetry we need?
> > > 
> > > The main problem I see is that when the probe fails in our case we
> > > don't unregister and i915 is still listed as controlling that device
> > > as we could see with lspci -nnv.
> > > 
> > > And any attempt to change the control to 'auto' fails. So we are
> > > forever stuck in D0.
> > > 
> > > So, I really believe it is better to bring the device to D3 than
> > > leaving it there blocked in D0 forever.
> > > 
> > > Or forcing users to use another parameter to keep i915 from getting
> > > this device in the first place.
> > > 
> > > > 
> > > > Anyway since I can't provide meaningful review I'll copy Imre since I
> > > > think he worked in the area in the past. Just so more eyes is better.
> > > > 
> > > > Regards,
> > > > 
> > > > Tvrtko
> > > > 
> > > > 
> > > > > +
> > > > >                 return -ENODEV;
> > > > >         }



* Re: [Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device.
  2022-09-23 18:20         ` Vivi, Rodrigo
@ 2022-09-24 17:52           ` Rafael J. Wysocki
  0 siblings, 0 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2022-09-24 17:52 UTC (permalink / raw)
  To: Vivi, Rodrigo, tvrtko.ursulin@linux.intel.com, Gupta, Anshuman
  Cc: Nikula, Jani, daniel@quora.org, intel-gfx@lists.freedesktop.org,
	stable@vger.kernel.org

On 9/23/2022 8:20 PM, Vivi, Rodrigo wrote:
> Rafael, could you please add your thoughts here?

Sure, sorry for the delay.

I gather the idea is to bind the driver to the device without actually 
doing anything more to it and to put it into D3, so it doesn't draw too 
much power.

Using PM-runtime for that should work, but the driver needs to make sure 
that its PM-runtime callbacks will work then (they may simply return 0 
all the time in that case, but they need to take it into account).
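
A minimal sketch of what "simply return 0" could look like, written as
plain user-space C so it can be compiled on its own. The struct and
function names are hypothetical stand-ins, not the i915 callbacks, and
the NULL driver_data check is only one assumed way for a driver to tell
that it never initialized the device.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct device; only what the sketch needs. */
struct sketch_device {
	void *driver_data;	/* NULL when the driver never initialized the device */
	int hw_calls;		/* counts how often real suspend/resume work ran */
};

/* Runtime suspend callback: a no-op that reports success when unmanaged. */
static int sketch_runtime_suspend(struct sketch_device *dev)
{
	if (!dev->driver_data)
		return 0;	/* nothing was set up, so nothing to save */

	dev->hw_calls++;	/* placeholder for the real suspend work */
	return 0;
}

/* Runtime resume callback: same early return for the unmanaged case. */
static int sketch_runtime_resume(struct sketch_device *dev)
{
	if (!dev->driver_data)
		return 0;	/* nothing was saved, so nothing to restore */

	dev->hw_calls++;	/* placeholder for the real resume work */
	return 0;
}
```

The point of the early returns is that the PM core can still run the
callbacks for a bound but uninitialized device without them touching
state that was never allocated.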


> On Thu, 2022-09-22 at 12:40 +0000, Gupta, Anshuman wrote:
>>
>>> -----Original Message-----
>>> From: Gupta, Anshuman
>>> Sent: Thursday, September 22, 2022 4:40 PM
>>> To: Vivi, Rodrigo <rodrigo.vivi@intel.com>; Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com>
>>> Cc: Nikula, Jani <jani.nikula@intel.com>; intel-gfx@lists.freedesktop.org;
>>> Daniel J Blueman <daniel@quora.org>; Wysocki, Rafael J
>>> <rafael.j.wysocki@intel.com>; stable@vger.kernel.org
>>> Subject: Re: [Intel-gfx] [PATCH] drm/i915: Allow D3 when we are not
>>> actively managing a known PCI device.
>>>
>>>
>>>
>>> On 9/22/2022 3:13 PM, Rodrigo Vivi wrote:
>>>> On Thu, Sep 22, 2022 at 08:56:00AM +0100, Tvrtko Ursulin wrote:
>>>>> On 21/09/2022 18:39, Rodrigo Vivi wrote:
>>>>>> The force_probe protection actively avoids the probe of i915 to
>>>>>> manage a device that is currently under development. It is a nice
>>>>>> protection for future users when getting a new platform but using
>>>>>> some older kernel.
>>>>>>
>>>>>> However, when we avoid the probe we don't take back the registration
>>>>>> of the device. We cannot give up the registration anyway since we
>>>>>> can have multiple devices present. For instance an integrated and a
>>>>>> discrete one.
>>>>>>
>>>>>> When this scenario occurs, the user will not be able to change any
>>>>>> of the runtime pm configuration of the unmanaged device. So, it will
>>>>>> be blocked in D0 state wasting power. This is especially bad in the
>>>>>> case where we have a discrete platform attached, but the user is
>>>>>> able to fully use the integrated one for everything else.
>>>>>>
>>>>>> So, let's put the protected and unmanaged device in D3. So we can
>>>>>> save some power.
>>>>>>
>>>>>> Reported-by: Daniel J Blueman <daniel@quora.org>
>>>>>> Cc: stable@vger.kernel.org
>>>>>> Cc: Daniel J Blueman <daniel@quora.org>
>>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>> Cc: Anshuman Gupta <anshuman.gupta@intel.com>
>>>>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/i915/i915_pci.c | 8 ++++++++
>>>>>>     1 file changed, 8 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
>>>>>> index 77e7df21f539..fc3e7c69af2a 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_pci.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>>>>>> @@ -25,6 +25,7 @@
>>>>>>     #include <drm/drm_color_mgmt.h>
>>>>>>     #include <drm/drm_drv.h>
>>>>>>     #include <drm/i915_pciids.h>
>>>>>> +#include <linux/pm_runtime.h>
>>>>>>     #include "gt/intel_gt_regs.h"
>>>>>>     #include "gt/intel_sa_media.h"
>>>>>> @@ -1304,6 +1305,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>>>>>     {
>>>>>>          struct intel_device_info *intel_info =
>>>>>>                  (struct intel_device_info *) ent->driver_data;
>>>>>> +       struct device *kdev = &pdev->dev;
>>>>>>          int err;
>>>>>>          if (intel_info->require_force_probe &&
>>>>>> @@ -1314,6 +1316,12 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>>>>>                           "module parameter or CONFIG_DRM_I915_FORCE_PROBE=%04x configuration option,\n"
>>>>>>                           "or (recommended) check for kernel updates.\n",
>>>>>>                           pdev->device, pdev->device, pdev->device);
>>>>>> +
>>>>>> +               /* Let's not waste power if we are not managing the device */
>>>>>> +               pm_runtime_use_autosuspend(kdev);
>>>>>> +               pm_runtime_allow(kdev);
>>>>>> +               pm_runtime_put_autosuspend(kdev);
>>> AFAIK we don't need to enable autosuspend here.
>>> pm_runtime_put_autosuspend() will cause a NULL pointer dereference, as it
>>> will immediately call intel_runtime_suspend() (because we haven't called
>>> pm_runtime_mark_last_busy()) without i915 being initialized.
> I don't see any null pointer dereference here.
> The problem is exactly that we do the initialization and then we give up
> on the device and end up blocking the runtime pm in some state that we
> cannot change.
>
>>> Having said that we only need below, in order to let pci core keep
>>> the pci dev in
>>> D3.
>>>
>>> pm_runtime_put_noidle()
> as for this one here I get:
> [ 9036.357078] i915 0000:03:00.0: Runtime PM usage count underflow!
>
>> Hi Rodrigo,
>> It seems playing with these runtime hooks will only enable the
>> "runtime suspend", but the actual state in the "PMCSR" PCI config
>> register is D0 despite the device being runtime suspended, when there
>> is no driver.
>> Example:
>> root@DUT2135-DG2MRB:/home/gta# cat /sys/bus/pci/devices/0000\:03\:00.0/power/runtime_status
>> suspended
>> root@DUT2135-DG2MRB:/home/gta# setpci -s 03:00.0 0xd4.l
>> 00000008
>> (Bits 00:01 are the power state in the PMCSR config register (offset 4
>> from the PM capability at 0xd0).)
> Well, this is indeed awkward.
>
> Rafael, do you know what we could be missing here to ensure we get the
> proper D3?
>
> I noticed that with the kernel parameter vfio-pci.ids=8086:<dg2_id> the
> device does get to D3.
>
> # setpci -s 03:00.0 0xd4.l
> 0000010b
>
> While with the approach in this patch, or with the noidle() one, I also
> get 00000008.
>
> Thanks,
> Rodrigo.
>
>> Thanks,
>> Anshuman Gupta.
>>> Br,
>>> Anshuman Gupta
>>>
>>>
>>>>> This sequence is black magic to me so can't really comment on the
>>>>> specifics. But in general, what I think I've figured out is, that the
>>>>> PCI core calls our runtime resume callback before probe:
>>>>>
>>>>> local_pci_probe:
>>>>> ...
>>>>>           /*
>>>>>            * Unbound PCI devices are always put in D0, regardless of
>>>>>            * runtime PM status.  During probe, the device is set to
>>>>>            * active and the usage count is incremented.  If the driver
>>>>>            * supports runtime PM, it should call pm_runtime_put_noidle(),
>>>>>            * or any other runtime PM helper function decrementing the usage
>>>>>            * count, in its probe routine and pm_runtime_get_noresume() in
>>>>>            * its remove routine.
>>>>>            */
>>>>>           pm_runtime_get_sync(dev);
>>>>>           pci_dev->driver = pci_drv;
>>>>>           rc = pci_drv->probe(pci_dev, ddi->id);
>>>>>           if (!rc)
>>>>>                   return rc;
>>>>>           if (rc < 0) {
>>>>>                   pci_dev->driver = NULL;
>>>>>                   pm_runtime_put_sync(dev);
>>>>>                   return rc;
>>>>>           }
>>>>>
>>>> Yes, in Linux the default is D0 for any unmanaged device. But then the
>>>> user can go there in the sysfs and change the power/control to 'auto'
>>>> and get the device to D3.
>>>>
>>>>> And if probe fails it calls pm_runtime_put_sync which presumably does
>>>>> not provide the symmetry we need?
>>>> The main problem I see is that when the probe fails in our case we
>>>> don't unregister and i915 is still listed as controlling that device
>>>> as we could see with lspci -nnv.
>>>>
>>>> And any attempt to change the control to 'auto' fails. So we are
>>>> forever stuck in D0.
>>>>
>>>> So, I really believe it is better to bring the device to D3 than
>>>> leaving it there blocked in D0 forever.
>>>>
>>>> Or forcing users to use another parameter to keep i915 from getting
>>>> this device in the first place.
>>>>
>>>>> Anyway since I can't provide meaningful review I'll copy Imre since I
>>>>> think he worked in the area in the past. Just so more eyes is better.
>>>>> Regards,
>>>>>
>>>>> Tvrtko
>>>>>
>>>>>
>>>>>> +
>>>>>>                  return -ENODEV;
>>>>>>          }



Thread overview: 8+ messages
2022-09-21 17:39 [PATCH] drm/i915: Allow D3 when we are not actively managing a known PCI device Rodrigo Vivi
2022-09-22  7:56 ` [Intel-gfx] " Tvrtko Ursulin
2022-09-22  9:43   ` Rodrigo Vivi
2022-09-22 11:09     ` Gupta, Anshuman
2022-09-22 12:40       ` Gupta, Anshuman
2022-09-23 18:20         ` Vivi, Rodrigo
2022-09-24 17:52           ` Rafael J. Wysocki
2022-09-22  9:23 ` Jani Nikula
