Date: Wed, 22 Jun 2016 14:51:28 +0800
From: Boqun Feng
To: Wei Fang
Cc: viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org, jack@suse.com, axboe@kernel.dk, tj@kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH v2] fs/dcache.c: avoid soft-lockup in dput()
Message-ID: <20160622065128.GB28443@insomnia>
References: <1466564475-30417-1-git-send-email-fangwei1@huawei.com>
In-Reply-To: <1466564475-30417-1-git-send-email-fangwei1@huawei.com>

Hi Wei Fang,

On Wed, Jun 22, 2016 at 11:01:15AM +0800, Wei Fang wrote:
> We triggered soft-lockup under stress test which
> open/access/write/close one file concurrently on more than
> five different CPUs:
>
> WARN: soft lockup - CPU#0 stuck for 11s! [who:30631]
> ...
> [] dput+0x100/0x298
> [] terminate_walk+0x4c/0x60
> [] path_lookupat+0x5cc/0x7a8
> [] filename_lookup+0x38/0xf0
> [] user_path_at_empty+0x78/0xd0
> [] user_path_at+0x1c/0x28
> [] SyS_faccessat+0xb4/0x230
>
> ->d_lock trylock may failed many times because of concurrently
> operations, and dput() may execute a long time.
>
> Fix this by replacing cpu_relax() with cond_resched().
> dput() used to be sleepable, so make it sleepable again
> should be safe.
>
> Cc:
> Signed-off-by: Wei Fang
> ---
> Changes v1->v2:
> - add might_sleep() to annotate that dput() can sleep
>
> fs/dcache.c | 4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/fs/dcache.c b/fs/dcache.c
> index d5ecc6e..074fc1c 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -578,7 +578,7 @@ static struct dentry *dentry_kill(struct dentry *dentry)
>
>  failed:
>          spin_unlock(&dentry->d_lock);
> -        cpu_relax();
> +        cond_resched();

Would it be better to put the cond_resched() in the caller (i.e. dput()),
right before "goto repeat"? There is obviously a loop there, which makes
the purpose of cond_resched() more straightforward.

Regards,
Boqun

>          return dentry; /* try again with same dentry */
>  }
>
> @@ -752,6 +752,8 @@ void dput(struct dentry *dentry)
>                  return;
>
>  repeat:
> +        might_sleep();
> +
>          rcu_read_lock();
>          if (likely(fast_dput(dentry))) {
>                  rcu_read_unlock();
> --
> 1.7.1
>
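
P.S. To make the suggested placement concrete, here is a small userspace
sketch of the shape I have in mind. It is not kernel code and has not been
tested against fs/dcache.c: struct obj, obj_kill() and obj_put() are
hypothetical stand-ins for the dentry, dentry_kill() and dput(), and
sched_yield() only approximates what cond_resched() does in the kernel.
The point is that the callee just reports "try again", while the yield
sits next to the retry loop in the caller.

```c
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry; not the kernel's struct dentry. */
struct obj {
	int contended;	/* number of upcoming trylock attempts that fail */
	int yields;	/* number of times the caller yielded the CPU */
};

/*
 * Stand-in for dentry_kill(): returns the same object when the
 * simulated trylock fails, or NULL once teardown succeeds.
 * It does not yield; it only reports that a retry is needed.
 */
static struct obj *obj_kill(struct obj *o)
{
	if (o->contended > 0) {
		o->contended--;
		return o;	/* try again with same object */
	}
	return NULL;
}

/*
 * Stand-in for dput(): the retry loop lives here, so the yield
 * is placed here too, right before going around again.
 */
static void obj_put(struct obj *o)
{
	while (o) {
		struct obj *next = obj_kill(o);

		if (next) {
			next->yields++;
			sched_yield();	/* userspace analogue of cond_resched() */
		}
		o = next;
	}
}
```

With contended set to 3, obj_put() yields three times and returns once the
fourth attempt succeeds.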