From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9914EC4360C for ; Tue, 8 Oct 2019 07:31:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6B09E206C2 for ; Tue, 8 Oct 2019 07:31:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1570519867; bh=q6+LlR3pdZCH3+ao29SsNtID7vLUsgj6ETsnFpyd+UM=; h=Subject:To:Cc:From:Date:List-ID:From; b=R9a6liZYSnofqPbIbxvidXCUJA8U0dvYhj3EIDkQTjnQLSdNjUvVv9oVopwJivU6R sJb3sTcCM23YXsIcVTwGLazZuBKAFTL7LX/49uymOwB7dvBm3vCbebTavrYfmhJ/Dn dxJdXRd3phu268oWrMV2V3Wit0qcqFvnLL0pB+uc= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730177AbfJHHbH (ORCPT ); Tue, 8 Oct 2019 03:31:07 -0400 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:34767 "EHLO wout2-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730112AbfJHHbG (ORCPT ); Tue, 8 Oct 2019 03:31:06 -0400 Received: from compute6.internal (compute6.nyi.internal [10.202.2.46]) by mailout.west.internal (Postfix) with ESMTP id 46EBB40C; Tue, 8 Oct 2019 03:31:05 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute6.internal (MEProxy); Tue, 08 Oct 2019 03:31:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:message-id:mime-version:subject:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm1; bh=MmkLq3 DD1OZkdxNvQ8UOO/cNYSSU5+W1NIP/ss9lxbA=; b=iFH0xd5GBf/V+0o/0mpMa0 i0BI74G3WNHMB5sefxrBiH3cIzPHM7CWRmupNP+lQ7Gk6zQTaDsder9y78hdzjaD PU7ogQ1CNjwepCbsaQxq67e0spy33ORXO477IInQMN5TC4MPEAONHKJPnZSPf5v/ kco8d7apgeFQ/nGJ/OdGlew3dCjoWs+r29H2OkE+wqyM472Nz9RXS0sgk9pd8K7J PrxbSs5loP6mxJO9N9WhyoFoCXE6kdqwG6wq6Mu8a/oHbWFQFiBxXqE5y4dLpXp8 5mF7jZjTbaS8gjPkIHIA8syZ7ahnlL8IwwQHsqczbJ7mGsmSx4gbEldPB00XAmgA == X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedufedrheekgdduudekucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefuvffhfffkgggtgfesthekredttd dtlfenucfhrhhomhepoehgrhgvghhkhheslhhinhhugihfohhunhgurghtihhonhdrohhr gheqnecuffhomhgrihhnpehkvghrnhgvlhdrohhrghenucfkphepkeefrdekiedrkeelrd dutdejnecurfgrrhgrmhepmhgrihhlfhhrohhmpehgrhgvgheskhhrohgrhhdrtghomhen ucevlhhushhtvghrufhiiigvpedv X-ME-Proxy: Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) by mail.messagingengine.com (Postfix) with ESMTPA id 7C3BCD60062; Tue, 8 Oct 2019 03:31:04 -0400 (EDT) Subject: FAILED: patch "[PATCH] powerpc/pseries: Fix cpu_hotplug_lock acquisition in" failed to apply to 4.14-stable tree To: ego@linux.vnet.ibm.com, aneesh.kumar@linux.ibm.com, mpe@ellerman.id.au Cc: From: Date: Tue, 08 Oct 2019 09:31:03 +0200 Message-ID: <1570519863137214@kroah.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ANSI_X3.4-1968 Content-Transfer-Encoding: 8bit Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org The patch below does not apply to the 4.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to . thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From c784be435d5dae28d3b03db31753dd7a18733f0c Mon Sep 17 00:00:00 2001 From: "Gautham R. Shenoy" Date: Wed, 15 May 2019 13:15:52 +0530 Subject: [PATCH] powerpc/pseries: Fix cpu_hotplug_lock acquisition in resize_hpt() The calls to arch_add_memory()/arch_remove_memory() are always made with the read-side cpu_hotplug_lock acquired via memory_hotplug_begin(). On pSeries, arch_add_memory()/arch_remove_memory() eventually call resize_hpt() which in turn calls stop_machine() which acquires the read-side cpu_hotplug_lock again, thereby resulting in the recursive acquisition of this lock. In the absence of CONFIG_PROVE_LOCKING, we hadn't observed a system lockup during a memory hotplug operation because cpus_read_lock() is a per-cpu rwsem read, which, in the fast-path (in the absence of the writer, which in our case is a CPU-hotplug operation) simply increments the read_count on the semaphore. Thus a recursive read in the fast-path doesn't cause any problems. However, we can hit this problem in practice if there is a concurrent CPU-Hotplug operation in progress which is waiting to acquire the write-side of the lock. This will cause the second recursive read to block until the writer finishes. While the writer is blocked since the first read holds the lock. Thus both the reader as well as the writers fail to make any progress thereby blocking both CPU-Hotplug as well as Memory Hotplug operations. Memory-Hotplug CPU-Hotplug CPU 0 CPU 1 ------ ------ 1. down_read(cpu_hotplug_lock.rw_sem) [memory_hotplug_begin] 2. down_write(cpu_hotplug_lock.rw_sem) [cpu_up/cpu_down] 3. down_read(cpu_hotplug_lock.rw_sem) [stop_machine()] Lockdep complains as follows in these code-paths. swapper/0/1 is trying to acquire lock: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: stop_machine+0x2c/0x60 but task is already holding lock: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: mem_hotplug_begin+0x20/0x50 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(cpu_hotplug_lock.rw_sem); lock(cpu_hotplug_lock.rw_sem); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by swapper/0/1: #0: (____ptrval____) (&dev->mutex){....}, at: __driver_attach+0x12c/0x1b0 #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: mem_hotplug_begin+0x20/0x50 #2: (____ptrval____) (mem_hotplug_lock.rw_sem){++++}, at: percpu_down_write+0x54/0x1a0 stack backtrace: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc5-58373-gbc99402235f3-dirty #166 Call Trace: dump_stack+0xe8/0x164 (unreliable) __lock_acquire+0x1110/0x1c70 lock_acquire+0x240/0x290 cpus_read_lock+0x64/0xf0 stop_machine+0x2c/0x60 pseries_lpar_resize_hpt+0x19c/0x2c0 resize_hpt_for_hotplug+0x70/0xd0 arch_add_memory+0x58/0xfc devm_memremap_pages+0x5e8/0x8f0 pmem_attach_disk+0x764/0x830 nvdimm_bus_probe+0x118/0x240 really_probe+0x230/0x4b0 driver_probe_device+0x16c/0x1e0 __driver_attach+0x148/0x1b0 bus_for_each_dev+0x90/0x130 driver_attach+0x34/0x50 bus_add_driver+0x1a8/0x360 driver_register+0x108/0x170 __nd_driver_register+0xd0/0xf0 nd_pmem_driver_init+0x34/0x48 do_one_initcall+0x1e0/0x45c kernel_init_freeable+0x540/0x64c kernel_init+0x2c/0x160 ret_from_kernel_thread+0x5c/0x68 Fix this issue by 1) Requiring all the calls to pseries_lpar_resize_hpt() be made with cpu_hotplug_lock held. 2) In pseries_lpar_resize_hpt() invoke stop_machine_cpuslocked() as a consequence of 1) 3) To satisfy 1), in hpt_order_set(), call mmu_hash_ops.resize_hpt() with cpu_hotplug_lock held. Fixes: dbcf929c0062 ("powerpc/pseries: Add support for hash table resizing") Cc: stable@vger.kernel.org # v4.11+ Reported-by: Aneesh Kumar K.V Signed-off-by: Gautham R. Shenoy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/1557906352-29048-1-git-send-email-ego@linux.vnet.ibm.com diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c index e6d471058597..c363e850550e 100644 --- a/arch/powerpc/mm/book3s64/hash_utils.c +++ b/arch/powerpc/mm/book3s64/hash_utils.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include @@ -1931,10 +1932,16 @@ static int hpt_order_get(void *data, u64 *val) static int hpt_order_set(void *data, u64 val) { + int ret; + if (!mmu_hash_ops.resize_hpt) return -ENODEV; - return mmu_hash_ops.resize_hpt(val); + cpus_read_lock(); + ret = mmu_hash_ops.resize_hpt(val); + cpus_read_unlock(); + + return ret; } DEFINE_DEBUGFS_ATTRIBUTE(fops_hpt_order, hpt_order_get, hpt_order_set, "%llu\n"); diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c index 09bb878c21e0..4f76e5f30c97 100644 --- a/arch/powerpc/platforms/pseries/lpar.c +++ b/arch/powerpc/platforms/pseries/lpar.c @@ -1413,7 +1413,10 @@ static int pseries_lpar_resize_hpt_commit(void *data) return 0; } -/* Must be called in user context */ +/* + * Must be called in process context. The caller must hold the + * cpus_lock. + */ static int pseries_lpar_resize_hpt(unsigned long shift) { struct hpt_resize_state state = { @@ -1467,7 +1470,8 @@ static int pseries_lpar_resize_hpt(unsigned long shift) t1 = ktime_get(); - rc = stop_machine(pseries_lpar_resize_hpt_commit, &state, NULL); + rc = stop_machine_cpuslocked(pseries_lpar_resize_hpt_commit, + &state, NULL); t2 = ktime_get();