From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>,
Eduardo Valentin <edubezval@gmail.com>,
linux-pm@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
x86@kernel.org, rt@linutronix.de, Borislav Petkov <bp@alien8.de>
Subject: [patch 00/12] thermal/x86_pkg_temp: Sanitize yet another hotplug and locking trainwreck
Date: Fri, 18 Nov 2016 00:03:21 -0000 [thread overview]
Message-ID: <20161117231435.891545908@linutronix.de> (raw)
We solely intended to convert that driver to the hotplug state machine and
stumbled over a large pile of insanities, which are all interwoven with the
package management:
- The work cancelation code, the thermal zone unregistering, the work code
and the interrupt notification function are racy against each other and
against cpu hotplug and module exit. The random locking sprinkeled all
over the place does not help anything and probably exists to make people
feel good. The resulting issues (mainly use after free) are probably
hard to trigger, but they clearly exist
- Work cancelation in the cpu hotplug code can leave the work marked
scheduled and the package interrupts disabled forever.
- Storage for a boolean information whether work is scheduled for a
package is kept in separate allocated storage, which is resized when the
number of detected packages grows.
- Delayed work structs are held in a static percpu storage, which makes no
sense at all because work is strictly per package.
- Packages are kept in a list, which must be searched over and over.
Fixing the whole pile of races with a few duct tape fixes was pretty much
impossible, so I decided to do a major rewrite to fix all of the
above. Here are the major changes:
- Rewrite the offline code with proper locking against interrupts and work
function and make sure that canceled work is rescheduled if there is
another online cpu in the package.
- Use the cpu offline callback on module exit to fix the work cancelation
race.
- Move the bool which denotes scheduled work into the package struct
where it belongs.
- Move the delayed work struct into the package struct, which is the only
sensible place to have it and schedule the work on the cpu which is the
target for the sysfs files as this makes the cancellation and
rescheduling in the cpu offline path simple.
- Add a large pile of comments documenting the cpu teardown mechanism
- Code sanitizing, revamp the horrible name choices plus a general coding
style cleanup.
Note, that I did the namespace and code cleanup in the middle of the
series, because staring at that mess just made my eyes bleeding.
- Store the package pointers in an array which is allocated at init
time. Sizing of the array is determined from the topology
information. That makes the package lookup a simple array lookup.
As a last step the driver is converted to the hotplug state machine.
The total damage is:
x86_pkg_temp_thermal.c | 590 ++++++++++++++++++++-----------------------------
1 file changed, 245 insertions(+), 345 deletions(-)
So a net reduction of 100 lines and the compiled size of the driver changes
in the following way:
text data bss dec
4370 549 128 5047 Before
2845 424 40 3309 After fixes and cleanup
2690 404 40 3134 After hotplug conversion
Thanks,
tglx
next reply other threads:[~2016-11-18 0:06 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-11-18 0:03 Thomas Gleixner [this message]
2016-11-18 0:03 ` [patch 01/12] thermal/x86_pkg_temp: Cleanup thermal interrupt handling Thomas Gleixner
2016-11-18 0:03 ` [patch 02/12] thermal/x86_pkg_temp: Remove redundant package search Thomas Gleixner
2016-11-18 0:03 ` [patch 03/12] thermal/x86_pkg_temp: Replace open coded cpu search Thomas Gleixner
2016-11-18 0:03 ` [patch 04/12] thermal/x86_pkg_temp: Sanitize callback (de)initialization Thomas Gleixner
2016-11-18 0:03 ` [patch 05/12] thermal/x86_pkg_temp: Get rid of ref counting Thomas Gleixner
2016-11-18 0:03 ` [patch 06/12] thermal/x86_pkg_temp: Cleanup namespace Thomas Gleixner
2016-11-18 0:03 ` [patch 07/12] thermal/x86_pkg_temp: Cleanup code some more Thomas Gleixner
2016-11-18 0:03 ` [patch 08/12] thermal/x86_pkg_temp: Sanitize locking Thomas Gleixner
2016-11-18 0:03 ` [patch 09/12] thermal/x86_pkg_temp: Move work scheduled flag into package struct Thomas Gleixner
2016-11-18 0:03 ` [patch 10/12] thermal/x86_pkg_temp: Move work " Thomas Gleixner
2016-11-18 0:03 ` [patch 11/12] thermal/x86_pkg_temp: Sanitize package management Thomas Gleixner
2016-11-18 0:03 ` [patch 12/12] thermal/x86 pkg temp: Convert to hotplug state machine Thomas Gleixner
2016-11-21 20:02 ` [patch 00/12] thermal/x86_pkg_temp: Sanitize yet another hotplug and locking trainwreck Pandruvada, Srinivas
2016-11-21 21:34 ` Thomas Gleixner
2016-11-21 23:16 ` Pandruvada, Srinivas
2016-11-22 9:05 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161117231435.891545908@linutronix.de \
--to=tglx@linutronix.de \
--cc=bp@alien8.de \
--cc=edubezval@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=rt@linutronix.de \
--cc=rui.zhang@intel.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).