From: Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
To: Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Cc: Andrew Morton <akpm@osdl.org>, linux-acpi@vger.kernel.org
Subject: Re: Possible bug in ACPI
Date: Wed, 06 Dec 2006 18:36:04 +0300 [thread overview]
Message-ID: <4576E364.3090800@linux.intel.com> (raw)
In-Reply-To: <45734070.6040305@linux.intel.com>
Andrew,
Would it be better to replace sys_sched_yield() with cond_resched()?
Thanks in advance,
Alex.
Alexey Starikovskiy wrote:
> Andrew Morton wrote:
>> On Sun, 03 Dec 2006 12:30:52 +0300
>> Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com> wrote:
>>
>>
>>> Andrew Morton wrote:
>>>
>>>> On Sun, 03 Dec 2006 12:06:54 +0300
>>>> Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com> wrote:
>>>>
>>>>
>>>>> This is a patch reverted by Linus from rc6-git2 because it broke
>>>>> his Compaq n620c, it refers to #5534 bug. Basically, kacpid
>>>>> deadlocks on some new HP notebooks, and all incoming requests
>>>>> would be queued until memory is over if this patch is not applied.
>>>>> On a bright side -- it's not a memory leak...
>>>>> Patch, which works for Linus laptop and "looks acceptable" to
>>>>> Linus is the last in #5534 list.
>>>>>
>>>> hm, if you say so.
>>>>
>>> I forwarded Linus' mail to you...
>>>
>>
>> I didn't receive it.
>>
> Please see attached.
>>
>>>> The description in that patch is nowhere near complete
>>>> enough for me to be able to work out what it does.
>>>>
>>>>
>>> Will update.
>>>
>>>
>>>> The sys_sched_yield() is particularly incomprehensible and needs good
>>>> commenting. You are, I hope, aware of the severe problems which
>>>> yield()
>>>> causes when the system is busy? The process which calls it will get
>>>> practically no CPU at all.
>>>>
>>>>
>>> On Linus' machine, as soon as we execute deferred work, it's GPE
>>> becomes enabled again and BIOS sends us a new event.
>>> So kacpid is always ready to run, while kacpi_notify don't have a
>>> chance to run. sys_sched_yield() was added to
>>> give kacpi_notify a chance to run.
>>> I was thinking about lowering the priority of kacpid, is it better?
>>>
>>
>> It all sounds horridly hacky, but I don't understand the problem well
>> enough to be able to recommend any solutions. That's why I was
>> hoping for
>> a decent description of the patch.
>>
>>
> HP nx6125/nx6325/... machines have a _GPE handler with an infinite
> loop sending Notify() events to different ACPI subsystems.
> Notify handler in ACPI driver is a C-routine, which may call ACPI
> interpreter again to get access to some ACPI variables
> (acpi_evaluate_xxx).
> On these HP machines such an evaluation changes state of some variable
> and lets the loop above break.
> In the current ACPI implementation Notify requests are being deferred
> to the same kacpid workqueue on which the above GPE handler with
> infinite loop is executing. Thus we have a deadlock -- loop will
> continue to spin, sending notify events, and at the same time
> preventing these notify events from being run on a workqueue. All
> notify events are deferred, thus we see increase in memory consumption
> noticed by author of the thread. Also as GPE handling is bloked,
> machines overheat. Eventually by external poll of the same
> acpi_evaluate, kacpid is released and all the queued notify events are
> free to run, thus 100% cpu utilization by kacpid for several seconds
> or more.
>
> To prevent all these horrors it's needed to not put notify events to
> kacpid workqueue by either executing them immediately or putting them
> on some other thread.
> It's dangerous to execute notify events in place, as it will put
> several ACPI interpreter stacks on top of each other (at least 4 in
> case of nx6125), thus causing kernel stack overflow.
> First attempt to create a new thread was done by /Peter Wainwright
> <mailto:prw@ceiriog.eclipse.co.uk>: he created a bunch of threads
> which were stealing work from a kacpid workqueue.
> This patch appeared in 2.6.15 kernel shipped with Ubuntu 6.06 LTS.
> Second attempt was done by me, I created a new thread for each
> Notify event. This worked OK on HP nx machines, but broke Linus'
> Compaq n620c, by producing threads with a speed what they stopped the
> machine completely. Thus this patch was reverted from 18-rc2 as I
> remember.
> I re-made the patch to create second workqueue just for notify events,
> thus hopping it will not break Linus' machine. Patch was tested on the
> same HP nx machines in #5534 and #7122, but I did not received reply
> from Linus on a test patch sent to him.
> Patch went to 19-rc and was rejected with much fanfare again.
> There was 4th patch, which inserted schedule_timeout(1) into deferred
> execution of kacpid if we had any notify requests pending, but Linus
> decided that it was too complex (involved either changes to workqueue
> to see if it's empty or atomic inc/dec).
> Now you see last variant which adds yield() to every GPE execution.
> /
>> How does it relate to this? http://lkml.org/lkml/2006/8/8/336
>>
>>
> In both cases GPE (General Purpose Events) are involved.
> There are two types of GPE -- level and edge triggered, they are
> described as _Lxx or _Exx methods under _GPE namespace of ACPI
> interpreter (DSDT). Both should be disabled, then we receive them
> until they are handled by methods above. Level should be cleared after
> execution of handler, while Edge should be cleared immediately. In
> this post DSDT was written with error and edge method was described as
> level, thus requiring changing clearing of level events as soon as
> edge ones. The solution for this problem is changing DSDT to declare
> method in question as _Exx and not _Lxx.
>>>> Minor point: that patch has several unneded (and undesirable) casts
>>>> of void*:
>>>>
>>>> +static void acpi_os_execute_notify(void *context)
>>>> +{
>>>> + struct acpi_os_dpc *dpc = (struct acpi_os_dpc *)context;
>>>>
>>>> please remove those.
>>>>
>>
>> I'll take that as an "OK" ;)
>>
> Sorry, yes, OK. :)
>
> Regards,
> Alex.
>
> ------------------------------------------------------------------------
>
> Subject:
> Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
> From:
> Linus Torvalds <torvalds@osdl.org>
> Date:
> Mon, 20 Nov 2006 10:07:39 -0800 (PST)
> To:
> Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
>
> To:
> Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
>
>
> On Sun, 19 Nov 2006, Alexey Starikovskiy wrote:
>
>> I agree to all your comments with one exception, please see below. Attached is
>> the reworked patch against latest git. Please test.
>>
>
> Ok, this one works for me too, and looks much simpler.
>
>
>> Linus Torvalds wrote:
>>
>>> And we might as well do it when we add an entry to the _deferred_ queue, no?
>>>
>>
>> acpi_os_execute() is called from interrupt context for insertion into
>> _deferred_ queue, so it's not possible to yield in it, no?
>>
>
> Hmm. Yes. Anyway, the new patch looks acceptable, and certainly much
> simpler than trying to count events.
>
> It probably causes tons of new unnecessary scheduling events, but I doubt
> we really care.
>
> That said, what we _really_ want here is a "priority queue" for the
> events, and some way to put an event back on the queue while running it
> (eg ACPI "Sleep" event). But I guess the ACPI interpreter isn't done that
> way (ie you can't just push and pop ACPI state).
>
> Linus
>
next prev parent reply other threads:[~2006-12-06 15:36 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-09-19 21:47 Possible bug in ACPI Dalibor Straka
2006-09-19 22:03 ` Jiri Slaby
2006-09-19 22:22 ` Dalibor Straka
2006-09-19 23:33 ` Andrew Morton
2006-09-20 7:11 ` Jarek Poplawski
2006-09-20 9:28 ` Tomasz Torcz
[not found] ` <20061109140416.db12bcbe.akpm@osdl.org>
[not found] ` <20061202205140.GA12447@panelnet.cz>
2006-12-03 3:26 ` Andrew Morton
2006-12-03 9:06 ` Alexey Starikovskiy
2006-12-03 10:12 ` Andrew Morton
2006-12-03 9:30 ` Alexey Starikovskiy
2006-12-03 20:27 ` Andrew Morton
2006-12-03 21:24 ` Alexey Starikovskiy
2006-12-06 15:36 ` Alexey Starikovskiy [this message]
2006-12-06 15:54 ` Andrew Morton
2006-12-06 16:07 ` Alexey Starikovskiy
2006-12-06 16:21 ` Alexey Starikovskiy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4576E364.3090800@linux.intel.com \
--to=alexey.y.starikovskiy@linux.intel.com \
--cc=akpm@osdl.org \
--cc=linux-acpi@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.