From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751478AbbHBKyf (ORCPT ); Sun, 2 Aug 2015 06:54:35 -0400 Received: from mx1.redhat.com ([209.132.183.28]:47718 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751031AbbHBKyd (ORCPT ); Sun, 2 Aug 2015 06:54:33 -0400 Date: Sun, 2 Aug 2015 11:54:29 +0100 From: Aaron Tomlin To: Ulrich Obergfell Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org, dzickus@redhat.com, jolsa@kernel.org, mhocko@suse.cz, eranian@google.com, cmetcalf@ezchip.com, fweisbec@gmail.com Subject: Re: [PATCH 0/4] watchdog: avoid races in watchdog_nmi_{en|disable} functions Message-ID: <20150802105429.GA7037@atomlin.usersys.redhat.com> References: <1438433365-2979-1-git-send-email-uobergfe@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ibTvN161/egqYuK8" Content-Disposition: inline In-Reply-To: <1438433365-2979-1-git-send-email-uobergfe@redhat.com> X-PGP-Key: http://pgp.mit.edu/pks/lookup?search=atomlin%40redhat.com X-PGP-Fingerprint: 7906 84EB FA8A 9638 8D1E 6E9B E2DE 9658 19CC 77D6 User-Agent: Mutt/1.5.23.1-rc1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --ibTvN161/egqYuK8 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat 2015-08-01 14:49 +0200, Ulrich Obergfell wrote: > Originally watchdog_nmi_enable(cpu) and watchdog_nmi_disable(cpu) were > only called in watchdog thread context. However, the following commits > utilize these functions outside of watchdog thread context too. >=20 > commit 9809b18fcf6b8d8ec4d3643677345907e6b50eca > Author: Michal Hocko > Date: Tue Sep 24 15:27:30 2013 -0700 >=20 > watchdog: update watchdog_thresh properly >=20 > commit b3738d29323344da3017a91010530cf3a58590fc > Author: Stephane Eranian > Date: Mon Nov 17 20:07:03 2014 +0100 >=20 > watchdog: Add watchdog enable/disable all functions >=20 > Hence, it is now possible that these functions execute concurrently with > the same 'cpu' argument. This concurrency is problematic because per-cpu > 'watchdog_ev' can be accessed/modified without adequate synchronization. >=20 > The patch series aims to address the above problem. However, instead of > introducing locks to protect per-cpu 'watchdog_ev' a different approach > is taken: Invoke these functions by parking and unparking the watchdog > threads (to ensure they are always called in watchdog thread context). >=20 > static struct smp_hotplug_thread watchdog_threads =3D { > ... > .park =3D watchdog_disable, // calls watchdog_nmi_disable() > .unpark =3D watchdog_enable, // calls watchdog_nmi_enable() > }; >=20 > Both previously mentioned commits call these functions in a similar way > and thus in principle contain some duplicate code. The patch series also > avoids this duplication by providing a commonly usable mechanism. >=20 > - Patch 1/4 introduces the watchdog_{park|unpark}_threads functions that > park/unpark all watchdog threads specified in 'watchdog_cpumask'. They > are intended to be called inside of kernel/watchdog.c only. >=20 > - Patch 2/4 introduces the watchdog_{suspend|resume} functions which can > be utilized by external callers to deactivate the hard and soft lockup > detector temporarily. >=20 > - Patch 3/4 utilizes watchdog_{park|unpark}_threads to replace some code > that was introduced by commit 9809b18fcf6b8d8ec4d3643677345907e6b50eca. >=20 > - Patch 4/4 utilizes watchdog_{suspend|resume} to replace some code that > was introduced by commit b3738d29323344da3017a91010530cf3a58590fc. >=20 > A few corner cases should be mentioned here for completeness. >=20 > - kthread_park() of watchdog/N could hang if cpu N is already locked up. > However, if watchdog is enabled the lockup will be detected anyway. >=20 > - kthread_unpark() of watchdog/N could hang if cpu N got locked up after > kthread_park(). The occurrence of this scenario should be _very_ rare > in practice, in particular because it is not expected that temporary > deactivation will happen frequently, and if it happens at all it is > expected that the duration of deactivation will be short. >=20 > Ulrich Obergfell (4): > watchdog: introduce watchdog_park_threads() and > watchdog_unpark_threads() > watchdog: introduce watchdog_suspend() and watchdog_resume() > watchdog: use park/unpark functions in update_watchdog_all_cpus() > watchdog: use suspend/resume interface in fixup_ht_bug() >=20 > arch/x86/kernel/cpu/perf_event_intel.c | 9 +- > include/linux/nmi.h | 2 + > include/linux/watchdog.h | 8 -- > kernel/watchdog.c | 157 +++++++++++++++++++--------= ------ > 4 files changed, 100 insertions(+), 76 deletions(-) >=20 The whole patch series looks good to me. Reviewed-by: Aaron Tomlin --ibTvN161/egqYuK8 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJVvfblAAoJEOLellgZzHfWYbQP/3pPzfcrLPsym110FzeF3cBm JnP/UKSHHb162MvX+FaNNfq72Z272H4zx3y1iL4r7c1j2HeK0ac3cXJEjWczof3A c8kMbW8PvJn2n/TBtaPCRO3CHemZxxqLmRa+tXDTJu3Cu6ZZF0e22hLMTMrNu4kk Tu3KQsUvnIfXw1YX09XSP7AV2LvbFjlX46d7HpMtWPU3gpjLCefTPnMMroUC0x1g YdK4Y/YKns9EvsJdL1HkBW4x6+WXVYaVNkpnAkAjmZBbZzQ2Vjkk8RFquMIQVjTK alRZm50hJviY4vns5qR+dGZerg4BgZ6ux8LdRiErUToZrULHs1juudo6f/8yYHRx vxUo+hA++sqo6c6HTeClyTu5E4u1STaV/StF2P2K9uY9KmsTCbpbncHJCX1fafLd 4UYIBBcgMyTuqU6nA5cHgBuSv9eeTmmBRIKG7E1sR+jR9kJvJwFs46mL078Hx0Cp +h/bk+bXQO7fSPmHdtV5hkztgG8E3mbaf1o8slEoEa5K/ovtJYhKbpNZyKSQaFbI FLCFhwCxPpu/aaOnysXtmZKQa5hdxtma/Q3TituBU+X4eZvaTh0oF6p4dGPFVxUD dwgWQEJCQ0DmiRWqNwllQpaEeSo0eP9t4QYq1gEpFJJN/thUX7sjXLwCyw4nKXy0 qmQSiQeUHs/HZJ+eL82A =wIqd -----END PGP SIGNATURE----- --ibTvN161/egqYuK8--