From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 4 Mar 2026 09:49:03 +0100
X-Mailing-List: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v6 03/14] vdso/datastore: Allocate data pages dynamically
To: Thomas Weißschuh, Andy Lutomirski, Vincenzo Frascino, Arnd Bergmann, "David S. Miller", Andreas Larsson, Nick Alcock, John Stultz, Stephen Boyd, John Paul Adrian Glaubitz, Shuah Khan, Catalin Marinas, Will Deacon, Theodore Ts'o, "Jason A. Donenfeld", Russell King, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin, Huacai Chen, WANG Xuerui, Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Shannon Nelson, Thomas Gleixner
Cc: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linux-s390@vger.kernel.org
References: <20260304-vdso-sparc64-generic-2-v6-0-d8eb3b0e1410@linutronix.de> <20260304-vdso-sparc64-generic-2-v6-3-d8eb3b0e1410@linutronix.de>
From: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
In-Reply-To: <20260304-vdso-sparc64-generic-2-v6-3-d8eb3b0e1410@linutronix.de>

On 04/03/2026 at 08:49, Thomas Weißschuh wrote:
> Allocating the data pages as part of the kernel image does not work on
> SPARC. The MMU will throw a fault when userspace tries to access them.
>
> Allocate the data pages through the page allocator instead.
> Unused pages in the vDSO VMA are still allocated to keep the virtual
> addresses aligned. Switch the mapping from PFNs to 'struct page' as that
> is required for dynamically allocated pages.
> This also aligns the allocation of the data pages with the code
> pages and is a prerequisite for mlockall() support.
>
> VM_MIXEDMAP is necessary for the call to vmf_insert_page() in the timens
> prefault path to work.
>
> The data pages need to be order-0, non-compound pages so that the
> mapping to userspace and the different orderings work.
>
> These pages are also used by the timekeeping, random pool and
> architecture initialization code. Some of these run before the
> page allocator is available. To keep these subsystems working without
> changes, introduce early, static data storage which will then be
> replaced by the real one as soon as that is available.
>
> Signed-off-by: Thomas Weißschuh

Reviewed-by: Christophe Leroy (CS GROUP)

> ---
>  include/linux/vdso_datastore.h |  6 +++
>  init/main.c                    |  2 +
>  lib/vdso/datastore.c           | 92 +++++++++++++++++++++++++++---------------
>  3 files changed, 68 insertions(+), 32 deletions(-)
>
> diff --git a/include/linux/vdso_datastore.h b/include/linux/vdso_datastore.h
> index a91fa24b06e0..0b530428db71 100644
> --- a/include/linux/vdso_datastore.h
> +++ b/include/linux/vdso_datastore.h
> @@ -2,9 +2,15 @@
>  #ifndef _LINUX_VDSO_DATASTORE_H
>  #define _LINUX_VDSO_DATASTORE_H
>
> +#ifdef CONFIG_HAVE_GENERIC_VDSO
>  #include
>
>  extern const struct vm_special_mapping vdso_vvar_mapping;
>  struct vm_area_struct *vdso_install_vvar_mapping(struct mm_struct *mm, unsigned long addr);
>
> +void __init vdso_setup_data_pages(void);
> +#else /* !CONFIG_HAVE_GENERIC_VDSO */
> +static inline void vdso_setup_data_pages(void) { }
> +#endif /* CONFIG_HAVE_GENERIC_VDSO */
> +
>  #endif /* _LINUX_VDSO_DATASTORE_H */
> diff --git a/init/main.c b/init/main.c
> index 1cb395dd94e4..de867b2693d2 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -105,6 +105,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>
>  #include
> @@ -1119,6 +1120,7 @@ void start_kernel(void)
>  	srcu_init();
>  	hrtimers_init();
>  	softirq_init();
> +	vdso_setup_data_pages();
>  	timekeeping_init();
>  	time_init();
>
> diff --git a/lib/vdso/datastore.c b/lib/vdso/datastore.c
> index 7377fcb6e1df..faebf5b7cd6e 100644
> --- a/lib/vdso/datastore.c
> +++ b/lib/vdso/datastore.c
> @@ -1,52 +1,79 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>
> -#include
> +#include
> +#include
>  #include
>  #include
>  #include
>  #include
>  #include
>
> -/*
> - * The vDSO data page.
> - */
> +static u8 vdso_initdata[VDSO_NR_PAGES * PAGE_SIZE] __aligned(PAGE_SIZE) __initdata = {};
> +
>  #ifdef CONFIG_GENERIC_GETTIMEOFDAY
> -static union {
> -	struct vdso_time_data data;
> -	u8 page[PAGE_SIZE];
> -} vdso_time_data_store __page_aligned_data;
> -struct vdso_time_data *vdso_k_time_data = &vdso_time_data_store.data;
> -static_assert(sizeof(vdso_time_data_store) == PAGE_SIZE);
> +struct vdso_time_data *vdso_k_time_data __refdata =
> +	(void *)&vdso_initdata[VDSO_TIME_PAGE_OFFSET * PAGE_SIZE];
> +
> +static_assert(sizeof(struct vdso_time_data) <= PAGE_SIZE);
>  #endif /* CONFIG_GENERIC_GETTIMEOFDAY */
>
>  #ifdef CONFIG_VDSO_GETRANDOM
> -static union {
> -	struct vdso_rng_data data;
> -	u8 page[PAGE_SIZE];
> -} vdso_rng_data_store __page_aligned_data;
> -struct vdso_rng_data *vdso_k_rng_data = &vdso_rng_data_store.data;
> -static_assert(sizeof(vdso_rng_data_store) == PAGE_SIZE);
> +struct vdso_rng_data *vdso_k_rng_data __refdata =
> +	(void *)&vdso_initdata[VDSO_RNG_PAGE_OFFSET * PAGE_SIZE];
> +
> +static_assert(sizeof(struct vdso_rng_data) <= PAGE_SIZE);
>  #endif /* CONFIG_VDSO_GETRANDOM */
>
>  #ifdef CONFIG_ARCH_HAS_VDSO_ARCH_DATA
> -static union {
> -	struct vdso_arch_data data;
> -	u8 page[VDSO_ARCH_DATA_SIZE];
> -} vdso_arch_data_store __page_aligned_data;
> -struct vdso_arch_data *vdso_k_arch_data = &vdso_arch_data_store.data;
> +struct vdso_arch_data *vdso_k_arch_data __refdata =
> +	(void *)&vdso_initdata[VDSO_ARCH_PAGES_START * PAGE_SIZE];
>  #endif /* CONFIG_ARCH_HAS_VDSO_ARCH_DATA */
>
> +void __init vdso_setup_data_pages(void)
> +{
> +	unsigned int order = get_order(VDSO_NR_PAGES * PAGE_SIZE);
> +	struct page *pages;
> +
> +	/*
> +	 * Allocate the data pages dynamically. SPARC does not support
> +	 * mapping static kernel pages into userspace.
> +	 * It is also a requirement for mlockall() support.
> +	 *
> +	 * Do not use folios. In time namespaces the pages are mapped in a different order
> +	 * to userspace, which is not handled by the folio optimizations in finish_fault().
> +	 */
> +	pages = alloc_pages(GFP_KERNEL, order);
> +	if (!pages)
> +		panic("Unable to allocate VDSO storage pages");
> +
> +	/* The pages are mapped one-by-one into userspace and each one needs to be refcounted. */
> +	split_page(pages, order);
> +
> +	/* Move the data already written by other subsystems to the new pages */
> +	memcpy(page_address(pages), vdso_initdata, VDSO_NR_PAGES * PAGE_SIZE);
> +
> +	if (IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY))
> +		vdso_k_time_data = page_address(pages + VDSO_TIME_PAGE_OFFSET);
> +
> +	if (IS_ENABLED(CONFIG_VDSO_GETRANDOM))
> +		vdso_k_rng_data = page_address(pages + VDSO_RNG_PAGE_OFFSET);
> +
> +	if (IS_ENABLED(CONFIG_ARCH_HAS_VDSO_ARCH_DATA))
> +		vdso_k_arch_data = page_address(pages + VDSO_ARCH_PAGES_START);
> +}
> +
>  static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
>  			     struct vm_area_struct *vma, struct vm_fault *vmf)
>  {
> -	struct page *timens_page = find_timens_vvar_page(vma);
> -	unsigned long pfn;
> +	struct page *page, *timens_page;
> +
> +	timens_page = find_timens_vvar_page(vma);
>
>  	switch (vmf->pgoff) {
>  	case VDSO_TIME_PAGE_OFFSET:
>  		if (!IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY))
>  			return VM_FAULT_SIGBUS;
> -		pfn = __phys_to_pfn(__pa_symbol(vdso_k_time_data));
> +		page = virt_to_page(vdso_k_time_data);
>  		if (timens_page) {
>  			/*
>  			 * Fault in VVAR page too, since it will be accessed
> @@ -56,10 +83,10 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
>  			vm_fault_t err;
>
>  			addr = vmf->address + VDSO_TIMENS_PAGE_OFFSET * PAGE_SIZE;
> -			err = vmf_insert_pfn(vma, addr, pfn);
> +			err = vmf_insert_page(vma, addr, page);
>  			if (unlikely(err & VM_FAULT_ERROR))
>  				return err;
> -			pfn = page_to_pfn(timens_page);
> +			page = timens_page;
>  		}
>  		break;
>  	case VDSO_TIMENS_PAGE_OFFSET:
> @@ -72,24 +99,25 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
>  		 */
>  		if (!IS_ENABLED(CONFIG_TIME_NS) || !timens_page)
>  			return VM_FAULT_SIGBUS;
> -		pfn = __phys_to_pfn(__pa_symbol(vdso_k_time_data));
> +		page = virt_to_page(vdso_k_time_data);
>  		break;
>  	case VDSO_RNG_PAGE_OFFSET:
>  		if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
>  			return VM_FAULT_SIGBUS;
> -		pfn = __phys_to_pfn(__pa_symbol(vdso_k_rng_data));
> +		page = virt_to_page(vdso_k_rng_data);
>  		break;
>  	case VDSO_ARCH_PAGES_START ... VDSO_ARCH_PAGES_END:
>  		if (!IS_ENABLED(CONFIG_ARCH_HAS_VDSO_ARCH_DATA))
>  			return VM_FAULT_SIGBUS;
> -		pfn = __phys_to_pfn(__pa_symbol(vdso_k_arch_data)) +
> -		      vmf->pgoff - VDSO_ARCH_PAGES_START;
> +		page = virt_to_page(vdso_k_arch_data) + vmf->pgoff - VDSO_ARCH_PAGES_START;
>  		break;
>  	default:
>  		return VM_FAULT_SIGBUS;
>  	}
>
> -	return vmf_insert_pfn(vma, vmf->address, pfn);
> +	get_page(page);
> +	vmf->page = page;
> +	return 0;
>  }
>
>  const struct vm_special_mapping vdso_vvar_mapping = {
> @@ -101,7 +129,7 @@ struct vm_area_struct *vdso_install_vvar_mapping(struct mm_struct *mm, unsigned
>  {
>  	return _install_special_mapping(mm, addr, VDSO_NR_PAGES * PAGE_SIZE,
>  					VM_READ | VM_MAYREAD | VM_IO | VM_DONTDUMP |
> -					VM_PFNMAP | VM_SEALED_SYSMAP,
> +					VM_MIXEDMAP | VM_SEALED_SYSMAP,
>  					&vdso_vvar_mapping);
>  }
>