Date: Fri, 8 May 2026 11:30:11 +0200
Subject: Re: [PATCH] x86/xen: Fix lazy mmu handling across context switch
From: Kevin Brodsky
To: Jürgen Groß, linux-kernel@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org
Cc: Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, "H. Peter Anvin", Andrew Morton, David Hildenbrand,
 Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
 Suren Baghdasaryan, Michal Hocko, xen-devel@lists.xenproject.org,
 Marek Marczykowski-Górecki
References: <20260508080514.454607-1-jgross@suse.com> <5cb54bd1-5981-4a46-9083-f7b527ca342f@suse.com> <775a5fd2-394d-4409-81d3-4dc6ef8209f0@arm.com>
In-Reply-To: <775a5fd2-394d-4409-81d3-4dc6ef8209f0@arm.com>

+ Marek (the wrong address was in Cc)

On 08/05/2026 11:08, Kevin Brodsky wrote:
> On 08/05/2026 10:33, Jürgen Groß wrote:
>> Please disregard this patch. It isn't fixing the real problem.
> That's what I would expect, see below.
>
>> On 08.05.26 10:05, Juergen Gross wrote:
>>> The recent rework of mmu lazy mode has resulted in problems when
>>> running as a Xen PV guest. Enabling lazy mmu mode for the new context
>>> during context switch is done from the arch_end_context_switch() hook,
>>> but when calling this hook current hasn't been changed yet, so the
>>> lazy mmu mode state of the wrong task is modified.
> Currently xen_end_context_switch() checks if next has lazy MMU mode
> enabled and if so calls arch_enter_lazy_mmu_mode(), i.e.
> enter_lazy(XEN_LAZY_MMU). This does *not* modify any task state, rather
> it writes to the xen_lazy_mode percpu variable.
>
> I've thought about this from various angles when reworking lazy MMU, and
> the conclusion I made is that arch_{start,end}_context_switch() have no
> reason to change any task state. On arm64, for instance, we do nothing
> at all on context switching, since everything lazy MMU-related is
> tracked in task_struct and therefore already switched.
>
> Xen is trickier because it tracks lazy MMU/CPU state in a percpu
> variable, so these hooks do need to do something about it. This is
> entirely Xen-internal though, and there's no reason to be calling
> generic functions like lazy_mmu_mode_pause() that modify task state.
>
> The idea behind commit 291b3abed657 ("x86/xen: use lazy_mmu_state when
> context-switching") is that TIF_LAZY_MMU_UPDATES now duplicates
> lazy_mmu_state in task_struct and we can therefore replace the former
> with the latter. More specifically, the assumption is that
> TIF_LAZY_MMU_UPDATES is set if and only if the task has been scheduled
> out and __task_lazy_mmu_mode_active(task) is true.
>
> Clearly there is something wrong with this assumption, but I still can't
> put my finger on it. For now I would suggest reverting this commit if
> that solves the issue Marek reported; the intention was not to introduce
> any functional change, but only a (minor) optimisation.
>
> - Kevin
>
>>> Additionally it is much cleaner to use lazy_mmu_mode_pause() and
>>> lazy_mmu_mode_resume() in the Xen context switch hooks, as it avoids
>>> conditionals in those hooks.
>>>
>>> In order not having to add another hook to be called after switching
>>> current, modify lazy_mmu_mode_resume() to use a new sub-function which
>>> takes a task pointer as parameter.
>>> This new sub-function can then be
>>> used in the xen_end_context_switch() hook.
>>>
>>> Fixes: 291b3abed657 ("x86/xen: use lazy_mmu_state when
>>> context-switching")
>>> Signed-off-by: Juergen Gross
>>> ---
>>>  arch/x86/xen/enlighten_pv.c |  7 ++-----
>>>  include/linux/pgtable.h     | 33 ++++++++++++++++++++++++---------
>>>  2 files changed, 26 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
>>> index ed2d7a3756ce..67bb6bf6d240 100644
>>> --- a/arch/x86/xen/enlighten_pv.c
>>> +++ b/arch/x86/xen/enlighten_pv.c
>>> @@ -424,9 +424,7 @@ static void xen_start_context_switch(struct task_struct *prev)
>>>  {
>>>  	BUG_ON(preemptible());
>>>
>>> -	if (this_cpu_read(xen_lazy_mode) == XEN_LAZY_MMU) {
>>> -		arch_leave_lazy_mmu_mode();
>>> -	}
>>> +	lazy_mmu_mode_pause();
>>>
>>>  	enter_lazy(XEN_LAZY_CPU);
>>>  }
>>>
>>> @@ -436,8 +434,7 @@ static void xen_end_context_switch(struct task_struct *next)
>>>
>>>  	xen_mc_flush();
>>>  	leave_lazy(XEN_LAZY_CPU);
>>> -	if (__task_lazy_mmu_mode_active(next))
>>> -		arch_enter_lazy_mmu_mode();
>>> +	lazy_mmu_mode_resume_task(next);
>>>  }
>>>
>>>  static unsigned long xen_store_tr(void)
>>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>>> index cdd68ed3ae1a..83a099bf2038 100644
>>> --- a/include/linux/pgtable.h
>>> +++ b/include/linux/pgtable.h
>>> @@ -326,6 +326,28 @@ static inline void lazy_mmu_mode_pause(void)
>>>  		arch_leave_lazy_mmu_mode();
>>>  }
>>>
>>> +/**
>>> + * lazy_mmu_mode_resume_task() - Resume the lazy MMU mode for a specific task.
>>> + *
>>> + * Like lazy_mmu_mode_resume() below, but with a task specified.
>>> + * Must be called only by lazy_mmu_mode_resume() or during context switch.
>>> + * Must never be called in interrupt context.
>>> + *
>>> + * Must match a call to lazy_mmu_mode_pause().
>>> + *
>>> + * Has no effect if called:
>>> + * - While paused (inside another pause()/resume() pair)
>>> + */
>>> +static inline void lazy_mmu_mode_resume_task(struct task_struct *task)
>>> +{
>>> +	struct lazy_mmu_state *state = &task->lazy_mmu_state;
>>> +
>>> +	VM_WARN_ON_ONCE(state->pause_count == 0);
>>> +
>>> +	if (--state->pause_count == 0 && state->enable_count > 0)
>>> +		arch_enter_lazy_mmu_mode();
>>> +}
>>> +
>>>  /**
>>>   * lazy_mmu_mode_resume() - Resume the lazy MMU mode.
>>>   *
>>> @@ -341,15 +363,8 @@ static inline void lazy_mmu_mode_pause(void)
>>>   */
>>>  static inline void lazy_mmu_mode_resume(void)
>>>  {
>>> -	struct lazy_mmu_state *state = &current->lazy_mmu_state;
>>> -
>>> -	if (in_interrupt())
>>> -		return;
>>> -
>>> -	VM_WARN_ON_ONCE(state->pause_count == 0);
>>> -
>>> -	if (--state->pause_count == 0 && state->enable_count > 0)
>>> -		arch_enter_lazy_mmu_mode();
>>> +	if (!in_interrupt())
>>> +		lazy_mmu_mode_resume_task(current);
>>>  }
>>>  #else
>>>  static inline void lazy_mmu_mode_enable(void) {}