* [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
@ 2017-11-21 15:01 Karol Herbst
From: Karol Herbst @ 2017-11-21 15:01 UTC
To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

This should make systems more stable where resuming the GPU fails. This
can happen due to bad firmware or due to a bug within the kernel. The
last thing which should happen in either case is an unusable system.

v2: do the same in nouveau_pmops_resume

Tested-by: Karl Hastings <kazen@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
---
drm/nouveau/nouveau_drm.c | 31 +++++++++++++++++++++++--------
1 file changed, 23 insertions(+), 8 deletions(-)
diff --git a/drm/nouveau/nouveau_drm.c b/drm/nouveau/nouveau_drm.c
index 8d4a5be3..6e4cb4f7 100644
--- a/drm/nouveau/nouveau_drm.c
+++ b/drm/nouveau/nouveau_drm.c
@@ -792,6 +792,27 @@ nouveau_pmops_suspend(struct device *dev)
 	return 0;
 }
 
+static int
+nouveau_set_power_state_D0(struct pci_dev *pdev)
+{
+	struct nouveau_drm *drm = nouveau_drm(pci_get_drvdata(pdev));
+	int ret;
+
+	pci_set_power_state(pdev, PCI_D0);
+	/* abort if anything went wrong */
+	if (pdev->current_state != PCI_D0) {
+		NV_ERROR(drm, "couldn't wake up GPU!\n");
+		return -EBUSY;
+	}
+	pci_restore_state(pdev);
+	ret = pci_enable_device(pdev);
+	if (ret)
+		return ret;
+
+	pci_set_master(pdev);
+	return 0;
+}
+
 int
 nouveau_pmops_resume(struct device *dev)
 {
@@ -803,12 +824,9 @@ nouveau_pmops_resume(struct device *dev)
 	    drm_dev->switch_power_state == DRM_SWITCH_POWER_DYNAMIC_OFF)
 		return 0;
 
-	pci_set_power_state(pdev, PCI_D0);
-	pci_restore_state(pdev);
-	ret = pci_enable_device(pdev);
+	ret = nouveau_set_power_state_D0(pdev);
 	if (ret)
 		return ret;
-	pci_set_master(pdev);
 
 	ret = nouveau_do_resume(drm_dev, false);
 
@@ -879,12 +897,9 @@ nouveau_pmops_runtime_resume(struct device *dev)
 		return -EBUSY;
 	}
 
-	pci_set_power_state(pdev, PCI_D0);
-	pci_restore_state(pdev);
-	ret = pci_enable_device(pdev);
+	ret = nouveau_set_power_state_D0(pdev);
 	if (ret)
 		return ret;
-	pci_set_master(pdev);
 
 	ret = nouveau_do_resume(drm_dev, true);
 
--
2.14.3
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau
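The control flow of the helper in the patch above can be tried out in user space. What follows is a minimal sketch, not kernel code: `fake_pci_dev`, `fake_set_power_state()` and `fake_wake_device()` are hypothetical stand-ins for `struct pci_dev`, `pci_set_power_state()` and `nouveau_set_power_state_D0()`, and the `wakeable` flag is an invented knob for simulating a GPU that refuses to leave its low-power state. It only illustrates that none of the later steps run once the device is found not to be in D0.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's PCI power states. */
enum fake_power_state { FAKE_D0 = 0, FAKE_D3HOT = 3 };

/* Hypothetical, heavily simplified model of struct pci_dev. */
struct fake_pci_dev {
	enum fake_power_state current_state;
	bool wakeable;        /* simulation knob: does the device accept D0? */
	int steps_after_wake; /* counts the restore/enable/set_master steps */
};

/* Models a power transition whose failure is not reported to the caller. */
void fake_set_power_state(struct fake_pci_dev *pdev,
			  enum fake_power_state state)
{
	if (pdev->wakeable)
		pdev->current_state = state;
}

/* Same shape as the helper in the patch: check the state, then carry on. */
int fake_wake_device(struct fake_pci_dev *pdev)
{
	fake_set_power_state(pdev, FAKE_D0);
	/* abort if anything went wrong */
	if (pdev->current_state != FAKE_D0)
		return -EBUSY;

	pdev->steps_after_wake++; /* stands in for pci_restore_state() */
	pdev->steps_after_wake++; /* stands in for pci_enable_device() */
	pdev->steps_after_wake++; /* stands in for pci_set_master() */
	return 0;
}
```

A caller that propagates the negative return value, as both resume paths in the patch do, never reaches its own resume step for a device that stayed asleep.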
* Re: [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
@ 2017-11-21 17:46 Thierry Reding
From: Thierry Reding @ 2017-11-21 17:46 UTC
To: Karol Herbst; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Tue, Nov 21, 2017 at 04:01:16PM +0100, Karol Herbst wrote:
> This should make systems more stable where resuming the GPU fails. This
> can happen due to bad firmware or due to a bug within the kernel. The
> last thing which should happen in either case is an unusable system.
>
> v2: do the same in nouveau_pmops_resume
>
> Tested-by: Karl Hastings <kazen@redhat.com>
> Signed-off-by: Karol Herbst <kherbst@redhat.com>
> ---
>  drm/nouveau/nouveau_drm.c | 31 +++++++++++++++++++++++--------
>  1 file changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/drm/nouveau/nouveau_drm.c b/drm/nouveau/nouveau_drm.c
> index 8d4a5be3..6e4cb4f7 100644
> --- a/drm/nouveau/nouveau_drm.c
> +++ b/drm/nouveau/nouveau_drm.c
> @@ -792,6 +792,27 @@ nouveau_pmops_suspend(struct device *dev)
>  	return 0;
>  }
>
> +static int
> +nouveau_set_power_state_D0(struct pci_dev *pdev)
> +{
> +	struct nouveau_drm *drm = nouveau_drm(pci_get_drvdata(pdev));
> +	int ret;
> +
> +	pci_set_power_state(pdev, PCI_D0);
> +	/* abort if anything went wrong */
> +	if (pdev->current_state != PCI_D0) {
> +		NV_ERROR(drm, "couldn't wake up GPU!\n");
> +		return -EBUSY;
> +	}

Looks to me like the more idiomatic way to do this is:

	ret = pci_set_power_state(pdev, PCI_D0);
	if (ret < 0 && ret != -EIO)
		return ret;

> +	pci_restore_state(pdev);
> +	ret = pci_enable_device(pdev);
> +	if (ret)
> +		return ret;
> +
> +	pci_set_master(pdev);

Looking closer it also seems like pci_enable_device() will already set
the power state to D0 (via do_pci_enable_device()). Is the sequence
above really necessary because the hardware is quirky, or was it
cargo-culted?

Thierry
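The return-value check suggested in the message above can be written out as a small predicate. This is a sketch only: `sketch_wake_failed()` is a hypothetical name, and the reading of `-EIO` is an assumption taken from the kernel-doc comment of `pci_set_power_state()`, which, as far as I recall it, lists `-EIO` for devices that do not support PCI power management or the requested state. Under that reading the check tolerates `-EIO` and treats every other negative value as fatal.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical predicate expressing the suggested check: a negative
 * return value counts as a failure unless it is -EIO.
 */
bool sketch_wake_failed(int ret)
{
	return ret < 0 && ret != -EIO;
}
```

Note that a check of this kind can only be as good as the return value it is given: a result of 0 is accepted without looking at the device's recorded power state.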
* Re: [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
@ 2017-11-21 19:03 Karol Herbst
From: Karol Herbst @ 2017-11-21 19:03 UTC
To: Thierry Reding; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, David Airlie

On Tue, Nov 21, 2017 at 6:46 PM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Tue, Nov 21, 2017 at 04:01:16PM +0100, Karol Herbst wrote:
>> This should make systems more stable where resuming the GPU fails. This
>> can happen due to bad firmware or due to a bug within the kernel. The
>> last thing which should happen in either case is an unusable system.
>>
>> v2: do the same in nouveau_pmops_resume
>>
>> Tested-by: Karl Hastings <kazen@redhat.com>
>> Signed-off-by: Karol Herbst <kherbst@redhat.com>
>> ---
>>  drm/nouveau/nouveau_drm.c | 31 +++++++++++++++++++++++--------
>>  1 file changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/drm/nouveau/nouveau_drm.c b/drm/nouveau/nouveau_drm.c
>> index 8d4a5be3..6e4cb4f7 100644
>> --- a/drm/nouveau/nouveau_drm.c
>> +++ b/drm/nouveau/nouveau_drm.c
>> @@ -792,6 +792,27 @@ nouveau_pmops_suspend(struct device *dev)
>>  	return 0;
>>  }
>>
>> +static int
>> +nouveau_set_power_state_D0(struct pci_dev *pdev)
>> +{
>> +	struct nouveau_drm *drm = nouveau_drm(pci_get_drvdata(pdev));
>> +	int ret;
>> +
>> +	pci_set_power_state(pdev, PCI_D0);
>> +	/* abort if anything went wrong */
>> +	if (pdev->current_state != PCI_D0) {
>> +		NV_ERROR(drm, "couldn't wake up GPU!\n");
>> +		return -EBUSY;
>> +	}
>
> Looks to me like the more idiomatic way to do this is:
>
>	ret = pci_set_power_state(pdev, PCI_D0);
>	if (ret < 0 && ret != -EIO)
>		return ret;
>

I thought so too, but it ends up returning 0 even if setting the power
state fails. Or maybe I did something wrong when installing the
kernel. I could take another shot at it, but what I came up with seems
to work. Adding airlied in CC, because he saw my patch and didn't
complain about it. Hopefully he knows more.

>> +	pci_restore_state(pdev);
>> +	ret = pci_enable_device(pdev);
>> +	if (ret)
>> +		return ret;
>> +
>> +	pci_set_master(pdev);
>
> Looking closer it also seems like pci_enable_device() will already set
> the power state to D0 (via do_pci_enable_device()). Is the sequence
> above really necessary because the hardware is quirky, or was it
> cargo-culted?
>
> Thierry

No clue. And because it was already there in the original code, I didn't
really feel like doing anything with it.
* Re: [Nouveau] [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
@ 2017-11-22 10:31 Thierry Reding
From: Thierry Reding @ 2017-11-22 10:31 UTC
To: Karol Herbst; +Cc: nouveau, David Airlie, Bjorn Helgaas, linux-pci

On Tue, Nov 21, 2017 at 08:03:20PM +0100, Karol Herbst wrote:
> On Tue, Nov 21, 2017 at 6:46 PM, Thierry Reding
> <thierry.reding@gmail.com> wrote:
> > On Tue, Nov 21, 2017 at 04:01:16PM +0100, Karol Herbst wrote:
> >> This should make systems more stable where resuming the GPU fails. This
> >> can happen due to bad firmware or due to a bug within the kernel. The
> >> last thing which should happen in either case is an unusable system.
> >>
> >> v2: do the same in nouveau_pmops_resume
> >>
> >> Tested-by: Karl Hastings <kazen@redhat.com>
> >> Signed-off-by: Karol Herbst <kherbst@redhat.com>
> >> ---
> >>  drm/nouveau/nouveau_drm.c | 31 +++++++++++++++++++++++--------
> >>  1 file changed, 23 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/drm/nouveau/nouveau_drm.c b/drm/nouveau/nouveau_drm.c
> >> index 8d4a5be3..6e4cb4f7 100644
> >> --- a/drm/nouveau/nouveau_drm.c
> >> +++ b/drm/nouveau/nouveau_drm.c
> >> @@ -792,6 +792,27 @@ nouveau_pmops_suspend(struct device *dev)
> >>  	return 0;
> >>  }
> >>
> >> +static int
> >> +nouveau_set_power_state_D0(struct pci_dev *pdev)
> >> +{
> >> +	struct nouveau_drm *drm = nouveau_drm(pci_get_drvdata(pdev));
> >> +	int ret;
> >> +
> >> +	pci_set_power_state(pdev, PCI_D0);
> >> +	/* abort if anything went wrong */
> >> +	if (pdev->current_state != PCI_D0) {
> >> +		NV_ERROR(drm, "couldn't wake up GPU!\n");
> >> +		return -EBUSY;
> >> +	}
> >
> > Looks to me like the more idiomatic way to do this is:
> >
> >	ret = pci_set_power_state(pdev, PCI_D0);
> >	if (ret < 0 && ret != -EIO)
> >		return ret;
> >
>
> I thought so too, but it ends up returning 0 even if setting the power
> state fails. Or maybe I did something wrong when installing the
> kernel. I could take another shot at it, but what I came up with seems
> to work. Adding airlied in CC, because he saw my patch and didn't
> complain about it. Hopefully he knows more.

pci_raw_set_power_state(), called by pci_set_power_state(), contains
this, which looks to me like it would be the only case where the problem
you're describing could be coming from:

	dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
	if (dev->current_state != state && printk_ratelimit())
		dev_info(&dev->dev, "Refused to change power state, currently in D%d\n",
			 dev->current_state);

Do you happen to see this in the kernel logs? Perhaps this should be
considered an error rather than just a KERN_INFO-level message?

Adding Bjorn and linux-pci for visibility.

Thierry
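The read-back quoted in the message above can be modelled in user space to show how a caller can see a return value of 0 while the device is still asleep. This is a sketch under stated assumptions, not the kernel's implementation: `sketch_dev`, its `stuck` flag and both functions are hypothetical, the mask value 0x0003 is the low two bits of the PMCSR register that the kernel names `PCI_PM_CTRL_STATE_MASK`, and returning 0 after a mismatch is modelled on the behaviour reported in this thread rather than on a reading of any particular kernel version.

```c
#include <stdint.h>
#include <stdio.h>

/* Low two bits of PMCSR encode the power state, D0 (0) to D3hot (3). */
#define SKETCH_PM_CTRL_STATE_MASK 0x0003

/* Hypothetical device model: `stuck` makes PMCSR writes have no effect. */
struct sketch_dev {
	uint16_t pmcsr;
	int stuck;
	int current_state;
};

/* Decode the power state from a PMCSR value. */
int sketch_pmcsr_state(uint16_t pmcsr)
{
	return pmcsr & SKETCH_PM_CTRL_STATE_MASK;
}

/*
 * Request a state, re-read PMCSR, record what the device reports and
 * only log a mismatch. Returning 0 in the mismatch case is this
 * sketch's assumption, chosen to match what the thread describes.
 */
int sketch_raw_set_power_state(struct sketch_dev *dev, int state)
{
	if (!dev->stuck)
		dev->pmcsr = (uint16_t)((dev->pmcsr & ~SKETCH_PM_CTRL_STATE_MASK) | state);

	dev->current_state = sketch_pmcsr_state(dev->pmcsr);
	if (dev->current_state != state)
		fprintf(stderr, "Refused to change power state, currently in D%d\n",
			dev->current_state);
	return 0;
}
```

With a model like this, a check on the return value passes for a stuck device, whereas a check on `current_state`, as in the patch, does not.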
* Re: [Nouveau] [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up
@ 2017-11-22 10:51 Karol Herbst
From: Karol Herbst @ 2017-11-22 10:51 UTC
To: Thierry Reding; +Cc: nouveau, David Airlie, Bjorn Helgaas, linux-pci

On Wed, Nov 22, 2017 at 11:31 AM, Thierry Reding
<thierry.reding@gmail.com> wrote:
> On Tue, Nov 21, 2017 at 08:03:20PM +0100, Karol Herbst wrote:
>> On Tue, Nov 21, 2017 at 6:46 PM, Thierry Reding
>> <thierry.reding@gmail.com> wrote:
>> > On Tue, Nov 21, 2017 at 04:01:16PM +0100, Karol Herbst wrote:
>> >> This should make systems more stable where resuming the GPU fails. This
>> >> can happen due to bad firmware or due to a bug within the kernel. The
>> >> last thing which should happen in either case is an unusable system.
>> >>
>> >> v2: do the same in nouveau_pmops_resume
>> >>
>> >> Tested-by: Karl Hastings <kazen@redhat.com>
>> >> Signed-off-by: Karol Herbst <kherbst@redhat.com>
>> >> ---
>> >>  drm/nouveau/nouveau_drm.c | 31 +++++++++++++++++++++++--------
>> >>  1 file changed, 23 insertions(+), 8 deletions(-)
>> >>
>> >> diff --git a/drm/nouveau/nouveau_drm.c b/drm/nouveau/nouveau_drm.c
>> >> index 8d4a5be3..6e4cb4f7 100644
>> >> --- a/drm/nouveau/nouveau_drm.c
>> >> +++ b/drm/nouveau/nouveau_drm.c
>> >> @@ -792,6 +792,27 @@ nouveau_pmops_suspend(struct device *dev)
>> >>  	return 0;
>> >>  }
>> >>
>> >> +static int
>> >> +nouveau_set_power_state_D0(struct pci_dev *pdev)
>> >> +{
>> >> +	struct nouveau_drm *drm = nouveau_drm(pci_get_drvdata(pdev));
>> >> +	int ret;
>> >> +
>> >> +	pci_set_power_state(pdev, PCI_D0);
>> >> +	/* abort if anything went wrong */
>> >> +	if (pdev->current_state != PCI_D0) {
>> >> +		NV_ERROR(drm, "couldn't wake up GPU!\n");
>> >> +		return -EBUSY;
>> >> +	}
>> >
>> > Looks to me like the more idiomatic way to do this is:
>> >
>> >	ret = pci_set_power_state(pdev, PCI_D0);
>> >	if (ret < 0 && ret != -EIO)
>> >		return ret;
>> >
>>
>> I thought so too, but it ends up returning 0 even if setting the power
>> state fails. Or maybe I did something wrong when installing the
>> kernel. I could take another shot at it, but what I came up with seems
>> to work. Adding airlied in CC, because he saw my patch and didn't
>> complain about it. Hopefully he knows more.
>
> pci_raw_set_power_state(), called by pci_set_power_state(), contains
> this, which looks to me like it would be the only case where the problem
> you're describing could be coming from:
>
>	dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
>	if (dev->current_state != state && printk_ratelimit())
>		dev_info(&dev->dev, "Refused to change power state, currently in D%d\n",
>			 dev->current_state);
>
> Do you happen to see this in the kernel logs? Perhaps this should be
> considered an error rather than just a KERN_INFO-level message?
>
> Adding Bjorn and linux-pci for visibility.
>
> Thierry

Yeah, that is the error we have in dmesg.