From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C10B0C433F5 for ; Mon, 6 Dec 2021 15:17:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=GtFdTweUiH/P99LmPSizXFLAuaTOesbQQVhATN64oNY=; b=S5YTI844ahyb+r tGIaF+1zS51xoyI6PEGE9CYbOB3In3XOgtFN41xz32varxbC2qjUVW5g+oAamO15XUzZrnvw7aqhI GCLM49nl4XAe/7uHvuBqR7GiQUe0MdGAzClxj6z8PXzHwq5aGwuRRKN3kaJjh4Aaf/GmfocVM9piq 6ILpYfbtorZw2qgHbY48neXCsgQwuqSHiASDYwAkR/A8BK8d9oCjmh5NO2Rdb+JgwvZPLnoENMKGZ qJ6EWHsQzBUSgb9JYHeqNnp+yBiTs8fDg1GyxadGJEf3Fv2WEKFKY0ixb6KzAfST1wTXw2RvAfDyK VWZjSacdY41QhciHDRlg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1muFi9-004Rjb-Da; Mon, 06 Dec 2021 15:15:25 +0000 Received: from dfw.source.kernel.org ([2604:1380:4641:c500::1]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1muFha-004RW1-9s for linux-arm-kernel@lists.infradead.org; Mon, 06 Dec 2021 15:14:51 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2C6A16132E; Mon, 6 Dec 2021 15:14:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 06363C341C6; Mon, 6 Dec 2021 15:14:46 +0000 (UTC) Date: Mon, 6 Dec 2021 15:14:43 +0000 From: Catalin Marinas To: David Hildenbrand Cc: Jianyong Wu , Anshuman Khandual , "will@kernel.org" , "linux-kernel@vger.kernel.org" , "linux-arm-kernel@lists.infradead.org" , "maz@kernel.org" , "ardb@kernel.org" , "gshan@redhat.com" , Justin He , nd Subject: Re: [PATCH v1] arm64/mm: avoid race condition of update page table when kernel init Message-ID: References: <20211027094828.7629-1-jianyong.wu@arm.com> <1cd8e875-24b1-2904-4e9f-2a4eb13674dc@arm.com> <3c971e70-f8c7-4406-d098-74e92f3c7dc4@redhat.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <3c971e70-f8c7-4406-d098-74e92f3c7dc4@redhat.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211206_071450_490870_5EAB7BB5 X-CRM114-Status: GOOD ( 30.26 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Fri, Dec 03, 2021 at 07:13:31PM +0100, David Hildenbrand wrote: > On 03.12.21 18:44, Catalin Marinas wrote: > > On Thu, Oct 28, 2021 at 08:36:07AM +0100, Jianyong Wu wrote: > >> From Anshuman Khandual : > >>> On 10/27/21 3:18 PM, Jianyong Wu wrote: > >>>> Race condition of page table update can happen in kernel boot period > >>>> as both of memory hotplug action when kernel init and the > >>>> mark_rodata_ro can update page table. For virtio-mem, the function excute flow chart is: > >>>> > >>>> ------------------------- > >>>> kernel_init > >>>> kernel_init_freeable > >>>> ... > >>>> do_initcall > >>>> ... > >>>> module_init [A] > >>>> > >>>> ... > >>>> mark_readonly > >>>> mark_rodata_ro [B] > >>>> ------------------------- > > [...] > >>>> We can see that the error derived from the l3 translation as the pte > >>>> value is *0*. That is because the fixmap has been clear when access. > >>>> > >>>> Signed-off-by: Jianyong Wu > >>>> --- > >>>> arch/arm64/mm/mmu.c | 2 ++ > >>>> 1 file changed, 2 insertions(+) > >>>> > >>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index > >>>> cfd9deb347c3..567dfba8f08a 100644 > >>>> --- a/arch/arm64/mm/mmu.c > >>>> +++ b/arch/arm64/mm/mmu.c > >>>> @@ -564,8 +564,10 @@ void mark_rodata_ro(void) > >>>> * to cover NOTES and EXCEPTION_TABLE. > >>>> */ > >>>> section_size = (unsigned long)__init_begin - (unsigned long)__start_rodata; > >>>> + get_online_mems(); > >>>> update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata, > >>>> section_size, PAGE_KERNEL_RO); > >>>> + put_online_mems(); > >>>> > >>>> debug_checkwx(); > >>>> } > >>> > >>> While this should solve the current problem i.e race between concurrent > >>> memory hotplug operation and mark_rodata_ro(), but I am still wondering > >>> whether this is the fix at the right place and granularity. Basically a hotplug > >>> operation queued in an work queue at [A] can execute during [B] is the root > >>> cause of this problem. > >> > >> Not exactly, this issue doesn't only happen at the the *pure* kernel > >> boot. For example, hotplug memory through VM monitor when VM boot. We > >> can't foresee when that happen. Thus, this issue can affect all kinds > >> of memory hotplug mechanism, including ACPI based memory hotplug and > >> virtio-mem. I'm not sure that fix it here is the best way. If the race > >> only happens between kernel init and memory hotplug, I think it's fine > >> to fix it here. IMO, this issue results from the race for "fixmap" > >> resource. I wonder why this global resource is not protected by a > >> lock. Maybe we can add one and fix it there. > > > > IIUC the race is caused by multiple attempts to use the fixmap at the > > same time. We can add a fixmap_lock and hold it during > > __create_pgd_mapping(). > > IIRC that's something along the lines I suggested, so, yes :) Yeah, I just echoed what you and Anshuman said ;). -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CD0FC433EF for ; Mon, 6 Dec 2021 15:29:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1386879AbhLFPaN (ORCPT ); Mon, 6 Dec 2021 10:30:13 -0500 Received: from dfw.source.kernel.org ([139.178.84.217]:37210 "EHLO dfw.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348351AbhLFPSS (ORCPT ); Mon, 6 Dec 2021 10:18:18 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8E57E61333 for ; Mon, 6 Dec 2021 15:14:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 06363C341C6; Mon, 6 Dec 2021 15:14:46 +0000 (UTC) Date: Mon, 6 Dec 2021 15:14:43 +0000 From: Catalin Marinas To: David Hildenbrand Cc: Jianyong Wu , Anshuman Khandual , "will@kernel.org" , "linux-kernel@vger.kernel.org" , "linux-arm-kernel@lists.infradead.org" , "maz@kernel.org" , "ardb@kernel.org" , "gshan@redhat.com" , Justin He , nd Subject: Re: [PATCH v1] arm64/mm: avoid race condition of update page table when kernel init Message-ID: References: <20211027094828.7629-1-jianyong.wu@arm.com> <1cd8e875-24b1-2904-4e9f-2a4eb13674dc@arm.com> <3c971e70-f8c7-4406-d098-74e92f3c7dc4@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3c971e70-f8c7-4406-d098-74e92f3c7dc4@redhat.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Dec 03, 2021 at 07:13:31PM +0100, David Hildenbrand wrote: > On 03.12.21 18:44, Catalin Marinas wrote: > > On Thu, Oct 28, 2021 at 08:36:07AM +0100, Jianyong Wu wrote: > >> From Anshuman Khandual : > >>> On 10/27/21 3:18 PM, Jianyong Wu wrote: > >>>> Race condition of page table update can happen in kernel boot period > >>>> as both of memory hotplug action when kernel init and the > >>>> mark_rodata_ro can update page table. For virtio-mem, the function excute flow chart is: > >>>> > >>>> ------------------------- > >>>> kernel_init > >>>> kernel_init_freeable > >>>> ... > >>>> do_initcall > >>>> ... > >>>> module_init [A] > >>>> > >>>> ... > >>>> mark_readonly > >>>> mark_rodata_ro [B] > >>>> ------------------------- > > [...] > >>>> We can see that the error derived from the l3 translation as the pte > >>>> value is *0*. That is because the fixmap has been clear when access. > >>>> > >>>> Signed-off-by: Jianyong Wu > >>>> --- > >>>> arch/arm64/mm/mmu.c | 2 ++ > >>>> 1 file changed, 2 insertions(+) > >>>> > >>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index > >>>> cfd9deb347c3..567dfba8f08a 100644 > >>>> --- a/arch/arm64/mm/mmu.c > >>>> +++ b/arch/arm64/mm/mmu.c > >>>> @@ -564,8 +564,10 @@ void mark_rodata_ro(void) > >>>> * to cover NOTES and EXCEPTION_TABLE. > >>>> */ > >>>> section_size = (unsigned long)__init_begin - (unsigned long)__start_rodata; > >>>> + get_online_mems(); > >>>> update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata, > >>>> section_size, PAGE_KERNEL_RO); > >>>> + put_online_mems(); > >>>> > >>>> debug_checkwx(); > >>>> } > >>> > >>> While this should solve the current problem i.e race between concurrent > >>> memory hotplug operation and mark_rodata_ro(), but I am still wondering > >>> whether this is the fix at the right place and granularity. Basically a hotplug > >>> operation queued in an work queue at [A] can execute during [B] is the root > >>> cause of this problem. > >> > >> Not exactly, this issue doesn't only happen at the the *pure* kernel > >> boot. For example, hotplug memory through VM monitor when VM boot. We > >> can't foresee when that happen. Thus, this issue can affect all kinds > >> of memory hotplug mechanism, including ACPI based memory hotplug and > >> virtio-mem. I'm not sure that fix it here is the best way. If the race > >> only happens between kernel init and memory hotplug, I think it's fine > >> to fix it here. IMO, this issue results from the race for "fixmap" > >> resource. I wonder why this global resource is not protected by a > >> lock. Maybe we can add one and fix it there. > > > > IIUC the race is caused by multiple attempts to use the fixmap at the > > same time. We can add a fixmap_lock and hold it during > > __create_pgd_mapping(). > > IIRC that's something along the lines I suggested, so, yes :) Yeah, I just echoed what you and Anshuman said ;). -- Catalin