Date: Fri, 7 Feb 2025 19:37:57 +0100
From: Frederic Weisbecker
To: Valentin Schneider
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
 virtualization@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 loongarch@lists.linux.dev, linux-riscv@lists.infradead.org,
 linux-perf-users@vger.kernel.org, xen-devel@lists.xenproject.org,
 kvm@vger.kernel.org, linux-arch@vger.kernel.org, rcu@vger.kernel.org,
 linux-hardening@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, bpf@vger.kernel.org,
 bcm-kernel-feedback-list@broadcom.com, Juergen Gross, Ajay Kaher,
 Alexey Makhalov, Russell King, Catalin Marinas, Will Deacon, Huacai Chen,
 WANG Xuerui, Paul Walmsley, Palmer Dabbelt, Albert Ou, Thomas Gleixner,
 Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin",
 Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, "Liang, Kan",
 Boris Ostrovsky, Josh Poimboeuf, Pawan Gupta, Sean Christopherson,
 Paolo Bonzini, Andy Lutomirski, Arnd Bergmann, "Paul E. McKenney",
 Jason Baron, Steven Rostedt, Ard Biesheuvel, Neeraj Upadhyay,
 Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
 Mathieu Desnoyers, Lai Jiangshan, Zqiang, Juri Lelli, Clark Williams,
 Yair Podemsky, Tomas Glozar, Vincent Guittot, Dietmar Eggemann,
 Ben Segall, Mel Gorman, Kees Cook, Andrew Morton, Christoph Hellwig,
 Shuah Khan, Sami Tolvanen, Miguel Ojeda, Alice Ryhl,
 "Mike Rapoport (Microsoft)", Samuel Holland, Rong Xu,
 Nicolas Saenz Julienne, Geert Uytterhoeven, Yosry Ahmed, "Kirill A.
 Shutemov", "Masami Hiramatsu (Google)", Jinghao Jia, Luis Chamberlain,
 Randy Dunlap, Tiezhu Yang
Subject: Re: [PATCH v4 22/30] context_tracking: Exit CT_STATE_IDLE upon irq/nmi entry
References: <20250114175143.81438-1-vschneid@redhat.com>
 <20250114175143.81438-23-vschneid@redhat.com>
X-Mailing-List: linux-perf-users@vger.kernel.org

On Fri, Feb 07, 2025 at 06:06:45PM +0100, Valentin Schneider wrote:
> On 27/01/25 12:17, Valentin Schneider wrote:
> > On 22/01/25 01:22, Frederic Weisbecker wrote:
> >> And NMIs interrupting userspace don't call
> >> enter_from_user_mode(). In fact they don't call irqentry_enter_from_user_mode()
> >> like regular IRQs but irqentry_nmi_enter() instead. Well that's for archs
> >> implementing common entry code, I can't speak for the others.
> >>
> >
> > That I didn't realize, so thank you for pointing it out. Having another
> > look now, I mistook DEFINE_IDTENTRY_RAW(exc_int3) for the general case
> > when it really isn't :(
> >
> >> Unifying the behaviour between user and idle such that the IRQs/NMIs exit the
> >> CT_STATE can be interesting but I fear this may not come for free. You would
> >> need to save the old state on IRQ/NMI entry and restore it on exit.
> >>
> >
> > That's what I tried to avoid, but it sounds like there's no nice way around it.
> >
> >> Do we really need it?
> >>
> >
> > Well, my problem with not doing IDLE->KERNEL transitions on IRQ/NMI is that
> > this leads the IPI deferral logic to observe a technically-out-of-sync state
> > for remote CPUs. Consider:
> >
> >   CPUx                                  CPUy
> >                                           state := CT_STATE_IDLE
> >                                           ...
> >                                           ~>IRQ
> >                                           ...
> >                                           ct_nmi_enter()
> >                                           [in the kernel proper by now]
> >
> >   text_poke_bp_batch()
> >     ct_set_cpu_work(CPUy, CT_WORK_SYNC)
> >       READ CPUy ct->state
> >         `-> CT_STATE_IDLE
> >         `-> defer IPI
> >
> >
> > I thought this meant I would need to throw out the "defer IPIs if CPU is
> > idle" part, but AIUI this also affects CT_STATE_USER and CT_STATE_GUEST,
> > which is a bummer :(
>
> Soooo I've been thinking...
>
> Isn't
>
>   (context_tracking.state & CT_RCU_WATCHING)
>
> pretty much a proxy for knowing whether a CPU is executing in kernelspace,
> including NMIs?

You got it!

>
>   NMI interrupts userspace/VM/idle -> ct_nmi_enter()   -> it becomes true
>   IRQ interrupts idle              -> ct_irq_enter()   -> it becomes true
>   IRQ interrupts userspace         -> __ct_user_exit() -> it becomes true
>   IRQ interrupts VM                -> __ct_user_exit() -> it becomes true
>
> IOW, if I gate setting deferred work by checking for this instead of
> explicitly CT_STATE_KERNEL, "it should work" and prevent the
> aforementioned issue? Or should I be out drinking instead? :-)

Exactly, it should work! Now that doesn't mean you can't go out for a drink :-)

Thanks.
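
For illustration only, here is a minimal userspace sketch of the gating idea
discussed above: deferred work is queued only while the remote CPU's
"RCU watching" bit is clear, and any kernel entry sets that bit and collects
whatever was queued. This is not the code from the series. Every model_* /
MODEL_* name and the bit layout are assumptions made up for the example; only
the idea of testing a watching bit instead of CT_STATE_KERNEL comes from the
thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: bit 0 is a deferred-work flag, bit 8 means "RCU watching". */
#define MODEL_WORK_SYNC    (1u << 0)
#define MODEL_RCU_WATCHING (1u << 8)

/* Stand-in for a per-CPU context tracking word (assumed, not the kernel struct). */
struct model_ct {
	_Atomic unsigned int state;
};

/*
 * Models any kernel entry (IRQ or NMI from idle, userspace or a guest):
 * set the watching bit, clear the deferred-work bit, and return the work
 * that was pending so the caller can run it.
 */
static unsigned int model_kernel_enter(struct model_ct *ct)
{
	unsigned int old = atomic_load(&ct->state);
	unsigned int new;

	do {
		new = (old | MODEL_RCU_WATCHING) & ~MODEL_WORK_SYNC;
	} while (!atomic_compare_exchange_weak(&ct->state, &old, new));

	return old & MODEL_WORK_SYNC;
}

/* Models the return to idle, userspace or a guest: stop watching. */
static void model_kernel_exit(struct model_ct *ct)
{
	atomic_fetch_and(&ct->state, ~MODEL_RCU_WATCHING);
}

/*
 * Remote side: try to queue deferred work. Returns true if the work was
 * queued (no IPI needed), false if the target is watching, i.e. running
 * kernel code, in which case the caller would have to send the IPI.
 */
static bool model_try_defer(struct model_ct *ct, unsigned int work)
{
	unsigned int old = atomic_load(&ct->state);

	do {
		if (old & MODEL_RCU_WATCHING)
			return false;
	} while (!atomic_compare_exchange_weak(&ct->state, &old, old | work));

	return true;
}
```

The compare-and-exchange in model_try_defer() is what closes the window from
the diagram in this model: if the target enters the kernel between the read
and the update, the exchange fails, the state is re-read, and the watching bit
is then seen.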