Date: Mon, 14 Apr 2025 08:53:02 +0200
From: Ingo Molnar
To: Ankur Arora
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	mingo@redhat.com, luto@kernel.org, peterz@infradead.org,
	paulmck@kernel.org, rostedt@goodmis.org, tglx@linutronix.de,
	willy@infradead.org, jon.grimm@amd.com, bharata@amd.com,
	raghavendra.kt@amd.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com
Subject: Re: [PATCH v3 4/4] x86/folio_zero_user: multi-page clearing
References: <20250414034607.762653-1-ankur.a.arora@oracle.com>
	<20250414034607.762653-5-ankur.a.arora@oracle.com>
In-Reply-To: <20250414034607.762653-5-ankur.a.arora@oracle.com>

* Ankur Arora wrote:

> clear_pages_rep(), clear_pages_erms() use string instructions to zero
> memory. When operating on more than a single page, we can use these
> more effectively by explicitly advertising the region-size to the
> processor, which can use that as a hint to optimize the clearing
> (ex. by eliding cacheline allocation.)

> +#ifndef CONFIG_HIGHMEM
> +/*
> + * folio_zero_user_preemptible(): multi-page clearing variant of folio_zero_user().
> + *
> + * Taking inspiration from the common code variant, we split the zeroing in
> + * three parts: left of the fault, right of the fault, and up to 5 pages
> + * in the immediate neighbourhood of the target page.
> + *
> + * Cleared in that order to keep cache lines of the target region hot.
> + *
> + * For gigantic pages, there is no expectation of cache locality so just do a
> + * straight zero.
> + */
> +void folio_zero_user_preemptible(struct folio *folio, unsigned long addr_hint)
> +{
> +	unsigned long base_addr = ALIGN_DOWN(addr_hint, folio_size(folio));
> +	const long fault_idx = (addr_hint - base_addr) / PAGE_SIZE;
> +	const struct range pg = DEFINE_RANGE(0, folio_nr_pages(folio) - 1);
> +	int width = 2; /* pages cleared last on either side */
> +	struct range r[3];
> +	int i;
> +
> +	if (folio_nr_pages(folio) > MAX_ORDER_NR_PAGES) {
> +		clear_pages(page_address(folio_page(folio, 0)), folio_nr_pages(folio));

> +		clear_pages(page_address(folio_page(folio, r[i].start)), len);

So the _user postfix naming is super confusing here and elsewhere in
this series.

clear_page(), and by extension the clear_pages() interface you extended
it to, fundamentally only works on kernel addresses:

	/*
	 * Zero a page.
	 * %rdi - page
	 */
	SYM_TYPED_FUNC_START(clear_page_rep)
		movl $4096/8,%ecx
		xorl %eax,%eax
		rep stosq
		RET

Note the absolute lack of fault & exception handling.

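For readers less used to the string instructions, here is a rough
user-space C model of what the quoted clear_page_rep() does. It is only
a sketch under stated assumptions: 4 KiB pages as in the quoted asm, and
the names MODEL_PAGE_SIZE, model_clear_page() and model_clear_pages()
are made up for illustration and are not kernel interfaces. Like the
asm, it just stores zeroes and has no way of recovering from a bad
address.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, matching the $4096/8 count in the quoted asm. */
#define MODEL_PAGE_SIZE 4096u

/*
 * Hypothetical model of clear_page_rep(): 4096/8 = 512 quadword stores
 * of zero, with no fault or exception handling of any kind.
 */
static void model_clear_page(void *page)
{
	uint64_t *p = page;
	size_t count = MODEL_PAGE_SIZE / 8;	/* movl $4096/8,%ecx */

	while (count--)				/* rep stosq with %rax == 0 */
		*p++ = 0;
}

/*
 * Hypothetical multi-page model: a single longer run over npages
 * contiguous pages instead of a per-page loop.
 */
static void model_clear_pages(void *base, unsigned int npages)
{
	uint64_t *p = base;
	size_t count = (size_t)npages * (MODEL_PAGE_SIZE / 8);

	while (count--)
		*p++ = 0;
}
```

Both helpers expect an 8-byte aligned, fully mapped buffer and will
simply crash on anything else, which is the property the quoted routine
shares.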
But folio_zero_user*() uses the kernel-space variants of page clearing
AFAICT (contrary to the naming):

	void folio_zero_user(struct folio *folio, unsigned long addr_hint)
	{
		unsigned int nr_pages = folio_nr_pages(folio);

		if (unlikely(nr_pages > MAX_ORDER_NR_PAGES))
			clear_gigantic_page(folio, addr_hint, nr_pages);
		else
			process_huge_page(addr_hint, nr_pages, clear_subpage, folio);
	}

	static void clear_gigantic_page(struct folio *folio, unsigned long addr_hint,
					unsigned int nr_pages)
	{
		unsigned long addr = ALIGN_DOWN(addr_hint, folio_size(folio));
		int i;

		might_sleep();
		for (i = 0; i < nr_pages; i++) {
			cond_resched();
			clear_user_highpage(folio_page(folio, i), addr + i * PAGE_SIZE);
		}
	}

Which on x86 is simply mapped into a kernel-memory interface:

	static inline void clear_user_page(void *page, unsigned long vaddr,
					   struct page *pg)
	{
		clear_page(page);
	}

So at minimum this is a misnomer and a confusing mixture of user/kernel
interface names on an epic scale that TBH should be cleaned up first
before extended...

> +out:
> +	/* Explicitly invoke cond_resched() to handle any live patching necessary. */
> +	cond_resched();

What again?

Thanks,

	Ingo