From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga01.intel.com ([192.55.52.88]:59229 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753447AbbK0Oow (ORCPT ); Fri, 27 Nov 2015 09:44:52 -0500 Subject: Re: [Intel-gfx] [PATCH v2] PCI / PM: tune down RPM suspend error message with EBUSY and EAGAIN retval To: Jani Nikula , Daniel Vetter , Imre Deak References: <1447838178-15308-1-git-send-email-imre.deak@intel.com> <1447844188-21999-1-git-send-email-imre.deak@intel.com> <1447853318.14073.2.camel@intel.com> <20151118141943.GU20799@phenom.ffwll.local> <878u5j7jt9.fsf@intel.com> Cc: Daniel Vetter , intel-gfx@lists.freedesktop.org, "Rafael J. Wysocki" , Linux PM , Linux PCI , Bjorn Helgaas From: "Rafael J. Wysocki" Message-ID: <56586C60.5050104@intel.com> Date: Fri, 27 Nov 2015 15:44:48 +0100 MIME-Version: 1.0 In-Reply-To: <878u5j7jt9.fsf@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Sender: linux-pci-owner@vger.kernel.org List-ID: On 11/27/2015 12:39 PM, Jani Nikula wrote: > On Wed, 18 Nov 2015, Daniel Vetter wrote: >> On Wed, Nov 18, 2015 at 03:28:38PM +0200, Imre Deak wrote: >>> On ke, 2015-11-18 at 12:56 +0200, Imre Deak wrote: >>>> The runtime PM core doesn't treat EBUSY and EAGAIN retvals from the driver >>>> suspend hooks as errors, but they still show up as errors in dmesg. Tune >>>> them down. >>>> >>>> One problem caused by this was noticed by Daniel: the i915 driver >>>> returns EAGAIN to signal a temporary failure to suspend and as a request >>>> towards the RPM core for scheduling a suspend again. This is a normal >>>> event, but the resulting error message flags a breakage during the >>>> driver's automated testing which parses dmesg and picks up the error. >>>> >>>> v2: >>>> - fix compile breake when CONFIG_PM_SLEEP=n (0-day builder) >>>> >>>> Reported-by: Daniel Vetter >>>> Signed-off-by: Imre Deak >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92992 >> Reviewed-by: Daniel Vetter >> >> Rafael, can you please pick this up for 4.4? The spurious KERN_ERR noise >> in dmesg is causing a lot fo spurious fail in our (very recently put into >> place) i915 CI system. > Rafael, ping. Well, so I'm not sure about this one. And the question is -> >>>> --- >>>> drivers/base/power/main.c | 7 +++++-- >>>> drivers/pci/pci-driver.c | 2 +- >>>> include/linux/pm.h | 11 +++++++++-- >>>> 3 files changed, 15 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c >>>> index 1710c26..39d2090 100644 >>>> --- a/drivers/base/power/main.c >>>> +++ b/drivers/base/power/main.c >>>> @@ -1679,9 +1679,12 @@ int dpm_suspend_start(pm_message_t state) >>>> } >>>> EXPORT_SYMBOL_GPL(dpm_suspend_start); >>>> >>>> -void __suspend_report_result(const char *function, void *fn, int ret) >>>> +void __suspend_report_result(const char *function, void *fn, int ret, >>>> + bool runtime_pm) >>>> { >>>> - if (ret) >>>> + if (runtime_pm && (ret == -EBUSY || ret == -EAGAIN)) >>>> + printk(KERN_DEBUG "%s(): %pF returns %d\n", function, fn, ret); >>>> + else if (ret) >>>> printk(KERN_ERR "%s(): %pF returns %d\n", function, fn, ret); >>>> } -> why you are adding overhead to this function, instead of --> >>>> EXPORT_SYMBOL_GPL(__suspend_report_result); >>>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c >>>> index 108a311..9569572 100644 >>>> --- a/drivers/pci/pci-driver.c >>>> +++ b/drivers/pci/pci-driver.c >>>> @@ -1142,7 +1142,7 @@ static int pci_pm_runtime_suspend(struct device *dev) >>>> pci_dev->state_saved = false; >>>> pci_dev->no_d3cold = false; >>>> error = pm->runtime_suspend(dev); >>>> - suspend_report_result(pm->runtime_suspend, error); >>>> + rpm_suspend_report_result(pm->runtime_suspend, error); --> replacing the suspend_report_result() above with a direct printk() in the if (error) block below. Surely, suspend_report_result() was not designed with runtime PM in mind and it was a mistake to use it here. It just seemed to do the right thing, but it clearly doesn't. >>>> if (error) >>>> return error; >>>> if (!pci_dev->d3cold_allowed) >>>> diff --git a/include/linux/pm.h b/include/linux/pm.h >>>> index 35d599e..54f37e3 100644 >>>> --- a/include/linux/pm.h >>>> +++ b/include/linux/pm.h >>>> @@ -702,11 +702,17 @@ extern int dpm_suspend_late(pm_message_t state); >>>> extern int dpm_suspend(pm_message_t state); >>>> extern int dpm_prepare(pm_message_t state); >>>> >>>> -extern void __suspend_report_result(const char *function, void *fn, int ret); >>>> +extern void __suspend_report_result(const char *function, void *fn, int ret, >>>> + bool runtime_pm); >>>> >>>> #define suspend_report_result(fn, ret) \ >>>> do { \ >>>> - __suspend_report_result(__func__, fn, ret); \ >>>> + __suspend_report_result(__func__, fn, ret, false); \ >>>> + } while (0) >>>> + >>>> +#define rpm_suspend_report_result(fn, ret) \ >>>> + do { \ >>>> + __suspend_report_result(__func__, fn, ret, true); \ >>>> } while (0) >>>> >>>> extern int device_pm_wait_for_dev(struct device *sub, struct device *dev); >>>> @@ -744,6 +750,7 @@ static inline int dpm_suspend_start(pm_message_t state) >>>> } >>>> >>>> #define suspend_report_result(fn, ret) do {} while (0) >>>> +#define rpm_suspend_report_result(fn, ret) do {} while (0) >>>> >>>> static inline int device_pm_wait_for_dev(struct device *a, struct device *b) >>>> { BTW, if you're changing PM code, it is good to CC linux-pm too (now done) and if you're changing PCI code, it is mandatory to CC linux-pci and the PCI maintainer (now done too). Thanks, Rafael