From: "Bjørn Mork" <bjorn@mork.no>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>,
Lan Tianyu <lantianyu1986@gmail.com>,
ziegler@uni-freiburg.de, viresh kumar <viresh.kumar@linaro.org>,
"cpufreq@vger.kernel.org" <cpufreq@vger.kernel.org>,
Linux PM list <linux-pm@vger.kernel.org>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Subject: Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
Date: Thu, 12 Dec 2013 09:52:01 +0100
Message-ID: <87txeesb7y.fsf@nemi.mork.no>
In-Reply-To: <4241242.m6mjyy0put@vostro.rjw.lan> (Rafael J. Wysocki's message of "Thu, 12 Dec 2013 02:59:47 +0100")

"Rafael J. Wysocki" <rjw@rjwysocki.net> writes:

> On Monday, December 09, 2013 11:04:53 AM Bjørn Mork wrote:
>> "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> writes:
>> > On 12/09/2013 08:29 AM, Lan Tianyu wrote:
>> >> 2013/12/5 Rafael J. Wysocki <rjw@rjwysocki.net>:
>> >>> On Wednesday, December 04, 2013 04:02:18 PM viresh kumar wrote:
>> >>>> On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
>> >>>>> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
>> >>>>> light-weight init/teardown during suspend/resume"), which enabled
>> >>>>> suspend/resume optimizations leaving the sysfs files in place.
>> > [...]
>> >>> I took Bjorn's patch for 3.13 and this one I can queue up for 3.14,
>> >>> but for that I guess it should contain a revert of the change made by
>> >>> Bjorn's patch.
>> >>
>> >> This patch causes a s3 regression. Cc:Martin Ziegler
>> >> https://bugzilla.kernel.org/show_bug.cgi?id=66751
>> >>
>> >
>> > Hmm.. With Bjorn's patch applied, the cpufreq hotplug callback should become
>> > identical to what happens during regular CPU hotplug.
>>
>> Yes, I also wondered how that could have happened.
>>
>> Apparently this is due to bad interaction between two patches. Commit
>>
>> 5a87182aa21d ("cpufreq: suspend governors on system suspend/hibernate")
>>
>> added an implicit dependency on the suspend/resume code which commit
>>
>> 2167e2399dc5 ("cpufreq: fix garbage kobjects on errors during suspend/resume")
>>
>> disabled.
>
> I suspected so, but then I was about to jump on a plane to another continent
> in several hours, so I preferred to simply revert both commits and start over
> after the dust settled.

No problem. I saw your mail about travelling. And I definitely support
the "revert first, research later" strategy in any case. There were
still too many people being hit by this, and bisecting it just to find
an already known bug.

>> This would make the last patch applied of these two come out of the
>> bisect, which is 2167e2399dc5 in this case. I can confirm that
>> reverting only this patch also fixes my hibernate problem.
>>
>> BUT: It reintroduces the problem it was supposed to fix. AND: As you
>> note, it really does nothing but revert to the assumed safe regular CPU
>> hotplug operations. Which means that the other patch somehow has made
>> regular CPU hotplugging fail *if suspending*. It won't make it fail
>> unless suspending, so there is no need to test CPU hotplugging
>> separately.
>>
>> In any case, my claim is that the real bug here still is in commit
>> 5a87182aa21d, which added an undocumented implicit dependency on the
>> special cpufreq suspend/resume code. There is no way in hell that
>> anyone could have guessed that the seemingly innocent changes in commit
>> 2167e2399dc5 would fail because of this. Which should be more than
>> enough to understand why the continuous sprinkling of suspend/resume code
>> all over has to stop. Where did all the nice and clean pm hooks design
>> disappear?
>
> cpufreq has always had problems with suspend/resume in the first place,
> but it just didn't have so much testing coverage before.

Yes... I have known about the problems with acpi-cpufreq "forever" and
do feel bad about not reporting it before. But I usually don't want to
report bugs without being able to dedicate some time to follow up in
case the developers need more info or patch testing etc. Which means
that "low priority" (rare, only slightly annoying, etc) bugs can end up
not being reported at all.

So the additional cpufreq breakage in v3.12 was actually good because it
made the acpi-cpufreq bug a lot more annoying, and therefore increased
the priority :-)

>> My opinion is that commit 2167e2399dc5 still is the correct short term
>> fix, and it should be reapplied to v3.13-rcX and resubmitted for
>> 3.12-stable.
>
> First of all, I'm not going to send any pull requests this week and even
> the next week may be too early to reintroduce that commit. However, the
> second next week will be the -rc6 time frame, so I'm not sure. It may
> end up in 3.14-rc1.

You decide of course, but if it matters then I tend to agree that this
should wait for 3.14. It has gone back and forth enough for now, and
the fact that no one(?) else has reported it as a 3.12 regression shows
that it probably isn't a big problem for most people.

>> I anticipate the real cleanup of this mess. But I don't think any
>> additional "if suspending" tests have any place in it. Test *once* and
>> fork to whatever you want to do differently when suspending.
>> Sprinkling these tests all over, having separate code blocks implicitly
>> depending on each other, is nothing but a recipe for hard to track bugs.
>
> Yes, that's pretty much the case, but it looks like we need to do a major
> redesign of stuff to really fix those problems.
Yes, I am hoping you will do that :-)
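
To make the "test *once* and fork" point above concrete, here is a
minimal sketch in plain C. It is not the cpufreq code: struct policy,
policy_teardown_full(), policy_teardown_suspend() and policy_cpu_down()
are all hypothetical names invented for illustration. The only point is
that the suspend check lives in exactly one place, and each path is
complete on its own instead of relying on checks scattered elsewhere.

```c
#include <stdbool.h>

/* Hypothetical per-CPU policy state, NOT the kernel's struct cpufreq_policy. */
struct policy {
	bool sysfs_present;	/* sysfs files/kobject still registered */
	bool governor_running;	/* governor currently active */
};

/* Regular CPU hotplug path: stop the governor and remove sysfs files. */
static void policy_teardown_full(struct policy *p)
{
	p->governor_running = false;
	p->sysfs_present = false;
}

/* Suspend path: stop the governor but leave the sysfs files in place. */
static void policy_teardown_suspend(struct policy *p)
{
	p->governor_running = false;
}

/*
 * Single decision point. Callers never test "suspending" themselves,
 * so neither path can silently depend on a check made somewhere else.
 */
static void policy_cpu_down(struct policy *p, bool suspending)
{
	if (suspending)
		policy_teardown_suspend(p);
	else
		policy_teardown_full(p);
}
```

With this shape, disabling the suspend-specific path means changing one
branch in policy_cpu_down(), and anything else that needs to happen on
suspend has an obvious home in policy_teardown_suspend().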
Bjørn