From: Waiman Long
To: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Johannes Weiner, Michal Hocko, Vladimir Davydov
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Juri Lelli, Qian Cai, Waiman Long
Subject: [PATCH v2 3/4] mm/slub: Fix another circular locking dependency in slab_attr_store()
Date: Mon, 27 Apr 2020 19:56:20 -0400
Message-Id: <20200427235621.7823-4-longman@redhat.com>
In-Reply-To: <20200427235621.7823-1-longman@redhat.com>
References: <20200427235621.7823-1-longman@redhat.com>
It turns out that switching from slab_mutex to memcg_cache_ids_sem in
slab_attr_store() does not completely eliminate the circular locking
dependency, as shown by the following lockdep splat when the system is
shut down:

[ 2095.079697] Chain exists of:
[ 2095.079697]   kn->count#278 --> memcg_cache_ids_sem --> slab_mutex
[ 2095.079697]
[ 2095.090278]  Possible unsafe locking scenario:
[ 2095.090278]
[ 2095.096227]        CPU0                    CPU1
[ 2095.100779]        ----                    ----
[ 2095.105331]   lock(slab_mutex);
[ 2095.108486]                                lock(memcg_cache_ids_sem);
[ 2095.114961]                                lock(slab_mutex);
[ 2095.120649]   lock(kn->count#278);
[ 2095.124068]
[ 2095.124068]  *** DEADLOCK ***

To eliminate this possibility, we have to use trylock to acquire
memcg_cache_ids_sem. Unlike slab_mutex, which can be acquired in many
places, the memcg_cache_ids_sem write lock is only acquired in
memcg_alloc_cache_id() to double the size of memcg_nr_cache_ids. So the
chance of successive calls to memcg_alloc_cache_id() within a short time
is pretty low. As a result, we can retry the read lock acquisition a few
times if the first attempt fails.

Signed-off-by: Waiman Long
---
 include/linux/memcontrol.h |  1 +
 mm/memcontrol.c            |  5 +++++
 mm/slub.c                  | 25 +++++++++++++++++++++++--
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d275c72c4f8e..9285f14965b1 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1379,6 +1379,7 @@ extern struct workqueue_struct *memcg_kmem_cache_wq;
 extern int memcg_nr_cache_ids;
 void memcg_get_cache_ids(void);
 void memcg_put_cache_ids(void);
+int memcg_tryget_cache_ids(void);
 
 /*
  * Helper macro to loop through all memcg-specific caches. Callers must still

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5beea03dd58a..9fa8535ff72a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -279,6 +279,11 @@ void memcg_get_cache_ids(void)
 	down_read(&memcg_cache_ids_sem);
 }
 
+int memcg_tryget_cache_ids(void)
+{
+	return down_read_trylock(&memcg_cache_ids_sem);
+}
+
 void memcg_put_cache_ids(void)
 {
 	up_read(&memcg_cache_ids_sem);

diff --git a/mm/slub.c b/mm/slub.c
index 44cb5215c17f..cf2114ca27f7 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 
 #include
@@ -5572,6 +5573,7 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 	    !list_empty(&s->memcg_params.children)) {
 		struct kmem_cache *c, **pcaches;
 		int idx, max, cnt = 0;
+		int retries = 3;
 		size_t size, old = s->max_attr_size;
 		struct memcg_cache_array *arr;
 
@@ -5585,9 +5587,28 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 			old = cmpxchg(&s->max_attr_size, size, len);
 		} while (old != size);
 
-		memcg_get_cache_ids();
-		max = memcg_nr_cache_ids;
+		/*
+		 * To avoid the following circular lock chain
+		 *
+		 * kn->count#278 --> memcg_cache_ids_sem --> slab_mutex
+		 *
+		 * we need to use trylock to acquire memcg_cache_ids_sem.
+		 *
+		 * The write lock is acquired only in
+		 * memcg_alloc_cache_id() to double the size of
+		 * memcg_nr_cache_ids, so the chance of successive
+		 * memcg_alloc_cache_id() calls within a short time is
+		 * very low except at the beginning where the number of
+		 * memory cgroups is low. So we retry a few times to get
+		 * the memcg_cache_ids_sem read lock.
+		 */
+		while (!memcg_tryget_cache_ids()) {
+			if (retries-- <= 0)
+				return -EBUSY;
+			msleep(100);
+		}
+		max = memcg_nr_cache_ids;
 
 		pcaches = kmalloc_array(max, sizeof(void *), GFP_KERNEL);
 		if (!pcaches) {
 			memcg_put_cache_ids();
-- 
2.18.1