From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Zhang Rui <rui.zhang@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-pm <linux-pm@lists.linux-foundation.org>,
linux-acpi <linux-acpi@vger.kernel.org>,
Len Brown <lenb@kernel.org>, Pavel Machek <pavel@ucw.cz>,
Arjan van de Ven <arjan@linux.intel.com>
Subject: Re: [PATCH 2/8] introduce the device async action mechanism
Date: Mon, 20 Jul 2009 18:24:16 +0200 [thread overview]
Message-ID: <200907201824.17755.rjw@sisk.pl> (raw)
In-Reply-To: <1247798643.26272.275.camel@rzhang-dt>
On Friday 17 July 2009, Zhang Rui wrote:
> On Fri, 2009-07-17 at 09:11 +0800, Rafael J. Wysocki wrote:
> > On Thursday 16 July 2009, Zhang Rui wrote:
> > > On Wed, 2009-07-15 at 21:00 +0800, Arjan van de Ven wrote:
> > > > On Wed, 15 Jul 2009 15:38:36 +0800
> > > > Zhang Rui <rui.zhang@intel.com> wrote:
> > > >
> > > > > Introduce the device async action mechanism.
> > > > >
> > > > > In order to speed up Linux suspend/resume/shutdown process,
> > > > > we introduce the device async action mechanism that allow devices
> > > > > to suspend/resume/shutdown asynchronously.
> > > > >
> > > > > The basic idea is that,
> > > > > if the suspend/resume/shutdown process of a device set,
> > > > > including a root device and its child devices, are independent of
> > > > > other devices, we create an async domain for this device set,
> > > > > and make them suspend/resume/shutdown asynchronously.
> > > >
> > > > Hi,
> > > >
> > > > I have some concerns about having an async domain per device(group)
> > > > rather than having one async domain for all of this,
> > >
> > > we create an async domain ONLY if we are sure that the device group is
> > > independent with the other devices.
> > >
> > > and IMO, using multiple async domains brings real device async actions.
> > > For example, in S3 resume case, there are two device groups:
> > > device group1: device1, device2, device3
> > > device group2: device4, device5, device6
> > >
> > > If they share the global domain, we may get:
> > > device group1: device1(cookie 1), device2(cookie 4), device3(cookie 5)
> > > device group2: device4(cookie 2), device5(cookie 3), device6(cookie 6)
> > >
> > > this is not real asynchronous resume because
> > > device3 needs to call async_synchronize_cookie(5) before resume itself.
> > > which means that device4 and device5 must be resumed before device3.
> > >
> > > But if multiple async domain is used, we get:
> > > device group1: device1(cookie 1), device2(cookie 2), device3(cookie 3)
> > > device group2: device4(cookie 1), device5(cookie 2), device6(cookie 3)
> > >
> > > device group1 and group2 can be resumed asynchronously.
> > >
> > >
> > > Another example, in my previous test,
> > > 1. sata controller. takes 0.4s to resume.
> > > 2. usb, including uchi and ehci controller takes 1.4s to resume
> > > 3. ACPI battery takes 0.3s to resume.
> > > 3. all the other devices take 0.2s to resume.
> > >
> > > sata, usb and ACPI battery are independent device groups.
> > > If we use multiple async domains, we can reduce the total device resume
> > > time from 2.3s to a little more than 1.4s because there are a lot of
> > > sleep in usb resume process.
> > > But if we share the global async domain, the total resume time can only
> > > be reduced to about 2.1s because sata, usb and ACPI battery are actually
> > > resumed synchronously.
> >
> > Well, first, I'm not really sure that using the async _boot_ infrastructure for
> > that is a good choice.
>
> IMO, kernel/async.c provides good interfaces that can be used not only
> in boot phrase.
> it's generic enough for suspend/resume.
Well, if that were not your opinion, you certainly wouldn't post patches using
it. :-)
> > During suspend-resume we know dependencies between
> > devices beforehand, at least in theory, so we can use them.
> >
> that's why I use multiple async domains. :)
> One domain for a device group.
And how exactly the device groups are defined?
> > In particular, we have to make sure that parent devices will not be suspended
> > until all of their children have been suspended and children devices will not
> > be resumed before the parents.
>
> that's not enough.
> For examples,
> ACPI battery and EC are independent devices, but EC must be resumed
> before battery because battery driver may access some EC address space
> during resume time.
> Of course the problem I describe above doesn't exist because ACPI
> battery driver doesn't have .resume method right now.
> But this is the case that works well in the current code but can not be
> converted to async device resume.
>
> > The current code handles this quite
> > efficiently, so I wonder what you're going to do not to break it.
> >
> sorry I don't quite understand.
>
> > Second, you seem to think that it only makes sense to execute ->suspend()
> > and ->resume() asynchronously (or in parallel), while for example in the case
> > of PCI ->suspend_noirq() and ->resume_noirq() also contain code that
> > potentially can take quite some time to execute.
> >
> IMO, this patch set just provides a framework that can be used for
> suspend/resume/shutdown optimization, and it doesn't solve all the
> problem at one time.
>
> IMO, the next step we should do is:
> 1. analyze the suspend/resume/shutdown process and find out the hotspot,
> i.e. which drivers suspend/resume slowly
> 2. If it's software problem that we can fix in the driver, fix it.
> like commit 24920c8a6358bf5532f1336b990b1c0fe2b599ee
> 3. If the hardware is slow, try to do it asynchronously.
> like I did in patch 8/8.
>
> This framework makes it quite easy to register an async device group.
>
> And even the suspend_noirq and resume_noirq can be covered easily with
> this framework.
> For example,
> 1. introduce two new device async actions.
> DEV_ASYNC_SUSPEND_NOIRQ/DEV_ASYNC_RESUME_NOIRQ
> just like what I did in patch 4/8, 5/8 and 6/8.
> 2. find out which device is slow in ->suspend_noirq and ->resume_noirq
> 3. see if we can find an async device group that including this device.
> 4. if yes, register this new async device group,
> with the type DEV_ASYNC_SUSPEND_NOIRQ/DEV_ASYNC_RESUME_NOIRQ
>
> > Finally, I don't really understand what the code in the $subject patch is
> > supposed to do. In particular, what's the purpose of dev_action()?
> > It only seems to check if func is not NULL right now. Also, you define
> > DEV_ASYNC_ACTIONS_ALL as 0, so the condition
> > if (!(DEV_ASYNC_ACTIONS_ALL & type)) in dev_async_register() is always
> > satisfied.
>
> please refer to patch 4/8 and 5/8 and 6/8
>
> Patch 2/8 is just a framework. No device async action is support yet.
OK
So please remove the redundant return statements from there and please don't
use (void *) for passing function pointers.
> And in Patch 4, 5, 6, we introduced three different types of device
> async actions, DEV_ASYNC_SUSPEND, DEV_ASYNC_RESUME and
> DEV_ASYNC_SHUTDOWN.
> I tried to split a big patch into several small patches.
> And it suggests how to adding a new device async type, like
> DEV_ASYNC_SUSPEND_NOIRQ, DEV_ASYNC_RESUME_NO_IRQ, etc. :)
>
> > Can we please discuss this thoroughly before any new patches are sent?
> >
> do I still need to resend the patch?
Yes, please.
> If yes, I'll resend patch 2/8, 3/8, 4/8, 5/8, 6/8 as a new big one. :)
Yes, please, I think that would help to understand the design.
Best,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
WARNING: multiple messages have this Message-ID (diff)
From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Zhang Rui <rui.zhang@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"linux-pm" <linux-pm@lists.linux-foundation.org>,
"linux-acpi" <linux-acpi@vger.kernel.org>,
Len Brown <lenb@kernel.org>, Pavel Machek <pavel@ucw.cz>,
Arjan van de Ven <arjan@linux.intel.com>
Subject: Re: [PATCH 2/8] introduce the device async action mechanism
Date: Mon, 20 Jul 2009 18:24:16 +0200 [thread overview]
Message-ID: <200907201824.17755.rjw@sisk.pl> (raw)
In-Reply-To: <1247798643.26272.275.camel@rzhang-dt>
On Friday 17 July 2009, Zhang Rui wrote:
> On Fri, 2009-07-17 at 09:11 +0800, Rafael J. Wysocki wrote:
> > On Thursday 16 July 2009, Zhang Rui wrote:
> > > On Wed, 2009-07-15 at 21:00 +0800, Arjan van de Ven wrote:
> > > > On Wed, 15 Jul 2009 15:38:36 +0800
> > > > Zhang Rui <rui.zhang@intel.com> wrote:
> > > >
> > > > > Introduce the device async action mechanism.
> > > > >
> > > > > In order to speed up Linux suspend/resume/shutdown process,
> > > > > we introduce the device async action mechanism that allow devices
> > > > > to suspend/resume/shutdown asynchronously.
> > > > >
> > > > > The basic idea is that,
> > > > > if the suspend/resume/shutdown process of a device set,
> > > > > including a root device and its child devices, are independent of
> > > > > other devices, we create an async domain for this device set,
> > > > > and make them suspend/resume/shutdown asynchronously.
> > > >
> > > > Hi,
> > > >
> > > > I have some concerns about having an async domain per device(group)
> > > > rather than having one async domain for all of this,
> > >
> > > we create an async domain ONLY if we are sure that the device group is
> > > independent with the other devices.
> > >
> > > and IMO, using multiple async domains brings real device async actions.
> > > For example, in S3 resume case, there are two device groups:
> > > device group1: device1, device2, device3
> > > device group2: device4, device5, device6
> > >
> > > If they share the global domain, we may get:
> > > device group1: device1(cookie 1), device2(cookie 4), device3(cookie 5)
> > > device group2: device4(cookie 2), device5(cookie 3), device6(cookie 6)
> > >
> > > this is not real asynchronous resume because
> > > device3 needs to call async_synchronize_cookie(5) before resume itself.
> > > which means that device4 and device5 must be resumed before device3.
> > >
> > > But if multiple async domain is used, we get:
> > > device group1: device1(cookie 1), device2(cookie 2), device3(cookie 3)
> > > device group2: device4(cookie 1), device5(cookie 2), device6(cookie 3)
> > >
> > > device group1 and group2 can be resumed asynchronously.
> > >
> > >
> > > Another example, in my previous test,
> > > 1. sata controller. takes 0.4s to resume.
> > > 2. usb, including uchi and ehci controller takes 1.4s to resume
> > > 3. ACPI battery takes 0.3s to resume.
> > > 3. all the other devices take 0.2s to resume.
> > >
> > > sata, usb and ACPI battery are independent device groups.
> > > If we use multiple async domains, we can reduce the total device resume
> > > time from 2.3s to a little more than 1.4s because there are a lot of
> > > sleep in usb resume process.
> > > But if we share the global async domain, the total resume time can only
> > > be reduced to about 2.1s because sata, usb and ACPI battery are actually
> > > resumed synchronously.
> >
> > Well, first, I'm not really sure that using the async _boot_ infrastructure for
> > that is a good choice.
>
> IMO, kernel/async.c provides good interfaces that can be used not only
> in boot phrase.
> it's generic enough for suspend/resume.
Well, if that were not your opinion, you certainly wouldn't post patches using
it. :-)
> > During suspend-resume we know dependencies between
> > devices beforehand, at least in theory, so we can use them.
> >
> that's why I use multiple async domains. :)
> One domain for a device group.
And how exactly the device groups are defined?
> > In particular, we have to make sure that parent devices will not be suspended
> > until all of their children have been suspended and children devices will not
> > be resumed before the parents.
>
> that's not enough.
> For examples,
> ACPI battery and EC are independent devices, but EC must be resumed
> before battery because battery driver may access some EC address space
> during resume time.
> Of course the problem I describe above doesn't exist because ACPI
> battery driver doesn't have .resume method right now.
> But this is the case that works well in the current code but can not be
> converted to async device resume.
>
> > The current code handles this quite
> > efficiently, so I wonder what you're going to do not to break it.
> >
> sorry I don't quite understand.
>
> > Second, you seem to think that it only makes sense to execute ->suspend()
> > and ->resume() asynchronously (or in parallel), while for example in the case
> > of PCI ->suspend_noirq() and ->resume_noirq() also contain code that
> > potentially can take quite some time to execute.
> >
> IMO, this patch set just provides a framework that can be used for
> suspend/resume/shutdown optimization, and it doesn't solve all the
> problem at one time.
>
> IMO, the next step we should do is:
> 1. analyze the suspend/resume/shutdown process and find out the hotspot,
> i.e. which drivers suspend/resume slowly
> 2. If it's software problem that we can fix in the driver, fix it.
> like commit 24920c8a6358bf5532f1336b990b1c0fe2b599ee
> 3. If the hardware is slow, try to do it asynchronously.
> like I did in patch 8/8.
>
> This framework makes it quite easy to register an async device group.
>
> And even the suspend_noirq and resume_noirq can be covered easily with
> this framework.
> For example,
> 1. introduce two new device async actions.
> DEV_ASYNC_SUSPEND_NOIRQ/DEV_ASYNC_RESUME_NOIRQ
> just like what I did in patch 4/8, 5/8 and 6/8.
> 2. find out which device is slow in ->suspend_noirq and ->resume_noirq
> 3. see if we can find an async device group that including this device.
> 4. if yes, register this new async device group,
> with the type DEV_ASYNC_SUSPEND_NOIRQ/DEV_ASYNC_RESUME_NOIRQ
>
> > Finally, I don't really understand what the code in the $subject patch is
> > supposed to do. In particular, what's the purpose of dev_action()?
> > It only seems to check if func is not NULL right now. Also, you define
> > DEV_ASYNC_ACTIONS_ALL as 0, so the condition
> > if (!(DEV_ASYNC_ACTIONS_ALL & type)) in dev_async_register() is always
> > satisfied.
>
> please refer to patch 4/8 and 5/8 and 6/8
>
> Patch 2/8 is just a framework. No device async action is support yet.
OK
So please remove the redundant return statements from there and please don't
use (void *) for passing function pointers.
> And in Patch 4, 5, 6, we introduced three different types of device
> async actions, DEV_ASYNC_SUSPEND, DEV_ASYNC_RESUME and
> DEV_ASYNC_SHUTDOWN.
> I tried to split a big patch into several small patches.
> And it suggests how to adding a new device async type, like
> DEV_ASYNC_SUSPEND_NOIRQ, DEV_ASYNC_RESUME_NO_IRQ, etc. :)
>
> > Can we please discuss this thoroughly before any new patches are sent?
> >
> do I still need to resend the patch?
Yes, please.
> If yes, I'll resend patch 2/8, 3/8, 4/8, 5/8, 6/8 as a new big one. :)
Yes, please, I think that would help to understand the design.
Best,
Rafael
next prev parent reply other threads:[~2009-07-20 16:23 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-07-15 7:38 [PATCH 2/8] introduce the device async action mechanism Zhang Rui
2009-07-15 0:31 ` Pavel Machek
2009-07-15 0:31 ` [linux-pm] " Pavel Machek
2009-07-15 13:00 ` Arjan van de Ven
2009-07-15 13:00 ` Arjan van de Ven
2009-07-16 2:05 ` Zhang Rui
2009-07-17 1:11 ` Rafael J. Wysocki
2009-07-17 1:11 ` Rafael J. Wysocki
2009-07-17 2:44 ` Zhang Rui
2009-07-17 2:44 ` Zhang Rui
2009-07-17 2:44 ` Zhang Rui
2009-07-19 22:16 ` Pavel Machek
2009-07-19 22:16 ` Pavel Machek
2009-07-20 3:09 ` Zhang Rui
2009-07-20 3:09 ` Zhang Rui
2009-07-20 3:09 ` Zhang Rui
2009-07-20 16:24 ` Rafael J. Wysocki
2009-07-20 16:24 ` Rafael J. Wysocki [this message]
2009-07-20 16:24 ` Rafael J. Wysocki
2009-07-17 4:31 ` Arjan van de Ven
2009-07-17 4:31 ` Arjan van de Ven
2009-07-17 1:11 ` Rafael J. Wysocki
2009-07-16 2:05 ` Zhang Rui
2009-07-15 13:00 ` Arjan van de Ven
-- strict thread matches above, loose matches on Subject: below --
2009-07-15 7:38 Zhang Rui
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200907201824.17755.rjw@sisk.pl \
--to=rjw@sisk.pl \
--cc=arjan@infradead.org \
--cc=arjan@linux.intel.com \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@lists.linux-foundation.org \
--cc=pavel@ucw.cz \
--cc=rui.zhang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.