From: Oliver Neukum <oneukum@suse.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>,
Oliver Neukum <oneukum@suse.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Vincent Whitchurch <vincent.whitchurch@axis.com>,
"jic23@kernel.org" <jic23@kernel.org>,
"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-iio@vger.kernel.org" <linux-iio@vger.kernel.org>
Subject: Re: PM runtime_error handling missing in many drivers?
Date: Wed, 27 Jul 2022 10:08:06 +0200
Message-ID: <5c37ee19-fe2c-fb22-63a2-638e3dab8f7a@suse.com>
In-Reply-To: <CAJZ5v0j0mgOcfKXRzyx12EX8CYLzowXrM8DGCH9XvQGnRNv0iw@mail.gmail.com>
On 26.07.22 17:41, Rafael J. Wysocki wrote:
> On Tue, Jul 26, 2022 at 11:05 AM Oliver Neukum <oneukum@suse.com> wrote:
> I guess that depends on what is regarded as "the framework". I mean
> the PM-runtime code, excluding the bus type or equivalent.
Yes, we have multiple candidates in the generic case. Easy to overengineer.
>>> The idea was that drivers would clear these errors.
>>
>> I am afraid that is a deeply hidden layering violation. Yes, a driver's
>> resume() method may have failed. In that case, if that is the same
>> driver, it will obviously already know about the failure.
>
> So presumably it will do something to recover and avoid returning the
> error in the first place.
Yes, but that does not help us if they do return an error.
> From the PM-runtime core code perspective, if an error is returned by
> a suspend callback and it is not -EBUSY or -EAGAIN, the subsequent
> suspend is also likely to fail.
True.
> If a resume callback returns an error, any subsequent suspend or
> resume operations are likely to fail.
Also true, but the consequences are different.
> Storing the error effectively prevents subsequent operations from
> being carried out in both cases and that's why it is done.
I am afraid seeing these two operations as equivalent for this
purpose is a problem for two reasons:
1. suspend can be initiated by the generic framework
2. a failure to suspend leads to worse power consumption,
while a failure to resume is -EIO, at best
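
To make the behaviour under discussion concrete, here is a toy userspace
model of it. It is only a sketch of what is described above: an error other
than -EBUSY or -EAGAIN is stored, and a stored error blocks both later
suspends and later resumes. Every name in it (toy_dev, toy_suspend, and so
on) is hypothetical, and returning -EINVAL for a blocked operation is an
assumption of the sketch, not a quote of the PM-runtime core.

```c
#include <errno.h>

/* Hypothetical stand-in for per-device runtime-PM state (not a kernel struct). */
struct toy_dev {
	int runtime_error;	/* sticky error, 0 when none is stored */
	int suspended;		/* 1 while the toy device is suspended */
};

typedef int (*toy_cb)(struct toy_dev *);

/* Example callbacks standing in for a driver's suspend/resume methods. */
static int toy_cb_ok(struct toy_dev *dev)   { (void)dev; return 0; }
static int toy_cb_busy(struct toy_dev *dev) { (void)dev; return -EBUSY; }
static int toy_cb_eio(struct toy_dev *dev)  { (void)dev; return -EIO; }

/* Run a callback; store its error unless it is a transient one. */
static int toy_run_cb(struct toy_dev *dev, toy_cb cb)
{
	int ret = cb(dev);

	if (ret && ret != -EBUSY && ret != -EAGAIN)
		dev->runtime_error = ret;
	return ret;
}

/* A stored error blocks the suspend before the callback is even tried. */
static int toy_suspend(struct toy_dev *dev, toy_cb cb)
{
	int ret;

	if (dev->runtime_error)
		return -EINVAL;	/* assumption: blocked operations report -EINVAL */
	ret = toy_run_cb(dev, cb);
	if (!ret)
		dev->suspended = 1;
	return ret;
}

/* The same stored error blocks the resume as well. */
static int toy_resume(struct toy_dev *dev, toy_cb cb)
{
	int ret;

	if (dev->runtime_error)
		return -EINVAL;
	ret = toy_run_cb(dev, cb);
	if (!ret)
		dev->suspended = 0;
	return ret;
}
```

In this model a single -EIO from either callback leaves the device stuck
until somebody resets runtime_error, which is the case the two points above
are about.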
>> PM operations, however, are operating on a tree. A driver requesting
>> a resume may get an error code. But it has no idea where this error
>> comes from. The generic code knows at least that.
>
> Well, what do you mean by "the generic code"?
In this case the device model, which has the tree and all dependencies.
Error handling here is potentially very complicated because
1. a driver can experience an error from a node higher in the tree
2. a driver can trigger a failure in a sibling
3. a driver for a node can be less specific than the drivers higher up
Reducing this to a single error condition is difficult.
Suppose you have a USB device with two interfaces. The driver for A
initiates a resume. Interface A is resumed; B reports an error.
Should this block further attempts to suspend the whole device?
>> Let's look at a USB storage device. The request to resume comes
>> from sd.c. sd.c is certainly not equipped to handle a PCI error
>> condition that has prevented a USB host controller from resuming.
>
> Sure, but this doesn't mean that suspending or resuming the device is
> a good idea until the error condition gets resolved.
For suspending, clearly yes. Resuming is another matter: it has to work
if you want to operate without errors.
>> I am afraid this part of the API has issues. And they keep growing
>> the more we divorce the device driver from the bus driver, which
>> actually does the PM operation.
>
> Well, in general suspending or resuming a device is a collaborative
> effort and if one of the pieces falls over, making it work again
> involves fixing up the failing piece and notifying the others that it
> is ready again. However, that part isn't covered and I'm not sure if
> it can be covered in a sufficiently generic way.
True. But that still does not answer the question of what is to be done
if error handling fails. Hence my proposal:
- record all failures
- heed the record only when suspending
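
A minimal sketch of how the two points above could look, again as a toy
userspace model with hypothetical names (prop_dev, prop_suspend,
prop_resume). Two things are assumptions of the sketch rather than part of
the proposal: -EBUSY and -EAGAIN are still treated as transient and not
recorded, and a refused suspend reports -EINVAL.

```c
#include <errno.h>

/* Hypothetical per-device state for the proposed scheme (not a kernel struct). */
struct prop_dev {
	int last_error;	/* most recently recorded failure, 0 if none */
	int suspended;	/* 1 while the toy device is suspended */
};

typedef int (*prop_cb)(struct prop_dev *);

/* Example callbacks standing in for a driver's suspend/resume methods. */
static int prop_cb_ok(struct prop_dev *dev)  { (void)dev; return 0; }
static int prop_cb_eio(struct prop_dev *dev) { (void)dev; return -EIO; }

/* Record a failure from either direction (transient errors are skipped). */
static void prop_record(struct prop_dev *dev, int ret)
{
	if (ret && ret != -EBUSY && ret != -EAGAIN)
		dev->last_error = ret;
}

/* Suspend heeds the record: a recorded failure refuses the suspend. */
static int prop_suspend(struct prop_dev *dev, prop_cb cb)
{
	int ret;

	if (dev->last_error)
		return -EINVAL;	/* assumption: refusal is reported as -EINVAL */
	ret = cb(dev);
	prop_record(dev, ret);
	if (!ret)
		dev->suspended = 1;
	return ret;
}

/* Resume ignores the record: it is always attempted, failures are recorded. */
static int prop_resume(struct prop_dev *dev, prop_cb cb)
{
	int ret = cb(dev);

	prop_record(dev, ret);
	if (!ret)
		dev->suspended = 0;
	return ret;
}
```

With this, a device whose resume failed once can still be resumed on a
later attempt, while the recorded failure keeps the device from being
suspended again until the record is cleared by whoever handles the error.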
Regards
Oliver
Thread overview: 19+ messages
2022-06-20 14:42 PM runtime_error handling missing in many drivers? Vincent Whitchurch
2022-06-21 9:38 ` Oliver Neukum
2022-07-08 11:03 ` Vincent Whitchurch
2022-07-08 20:10 ` Rafael J. Wysocki
2022-07-26 9:05 ` Oliver Neukum
2022-07-26 15:41 ` Rafael J. Wysocki
2022-07-27 8:08 ` Oliver Neukum [this message]
2022-07-27 16:31 ` Rafael J. Wysocki
2025-02-10 3:32 ` Ajay Agarwal
2025-02-11 22:21 ` Brian Norris
2025-02-12 19:29 ` Rafael J. Wysocki
2025-02-17 3:49 ` Ajay Agarwal
2025-02-17 20:23 ` Rafael J. Wysocki
2025-02-18 5:37 ` Ajay Agarwal
2025-02-18 5:45 ` Ajay Agarwal
2025-02-18 14:57 ` Rafael J. Wysocki
2025-02-19 22:15 ` Brian Norris
2025-02-20 9:30 ` Oliver Neukum
2025-02-22 1:51 ` Brian Norris