Date: Sun, 26 Apr 2026 09:47:53 +0800
From: Cheng-Yang Chou
To: Kuba Piecuch
Cc: Tejun Heo, Andrea Righi, David Vernet, Changwoo Min, Emil Tsalapatis,
	Christian Loehle, Daniel Hodges, sched-ext@lists.linux.dev,
	linux-kernel@vger.kernel.org, Ching-Chun Huang, Chia-Ping Tsai
Subject: Re: [PATCH v2 sched_ext/for-7.1] sched_ext: Invalidate dispatch
 decisions on CPU affinity changes
Message-ID: <20260426093756.Gd781@cchengyang.duckdns.org>
References: <20260319083518.94673-1-arighi@nvidia.com>
 <20260422142633.G7180@cchengyang.duckdns.org>

Hi Kuba,

On Thu, Apr 23, 2026 at 01:32:20PM +0000, Kuba Piecuch wrote:
> > On Mon, Mar 23, 2026 at 01:13:20PM -1000, Tejun Heo wrote:
> >> > The simple way to do this is to do scx_bpf_dsq_insert() at the very beginning,
> >> > once we know which task we would like to dispatch, and cancel the pending
> >> > dispatch via scx_bpf_dispatch_cancel() if any of the pre-dispatch checks fail
> >> > on the BPF side. This way, the "critical section" includes BPF-side checks, and
> >> > SCX will ignore the dispatch if there was a dequeue/enqueue racing with the
> >> > critical section.
> >> >
> >> > With this solution, we can throw an error if task_can_run_on_remote_rq() is
> >> > false, because we know that there was no racing cpumask change (if there was,
> >> > it would have been caught earlier, in finish_dispatch()).
> >>
> >> Yeah, I think this makes more sense. qseq is already there to provide
> >> protection against these events. It's just that the capturing of qseq is too
> >> late. If insert/cancel is too ugly, we can introduce another kfunc to
> >> capture the qseq - scx_bpf_dsq_insert_begin() or something like that - and
> >> stash it in a per-cpu variable. That way, qseq would cover the "current"
> >> queued instance and the existing qseq mechanism would be able to reliably
> >> ignore the ones that lost the race to dequeue.
> >
> > Since this has been stale for a while, I prepared a patch to implement
> > scx_bpf_dsq_insert_begin() as suggested.
>
> Thanks for creating the patch. A couple of thoughts:
>
> 1. Do we have a use case that requires dsq_insert_begin() that isn't
>    satisfied using the "insert and then cancel if needed" approach?

IIUC, yes. scx_bpf_dispatch_cancel() is only registered in
scx_kfunc_ids_dispatch, so it is only callable from ops.dispatch().
dsq_insert_begin(), on the other hand, is available from both
ops.enqueue() and ops.dispatch() (SCX_KF_ENQUEUE | SCX_KF_DISPATCH).
Since there is nothing to cancel in ops.enqueue(), the insert-and-cancel
approach simply doesn't work there.

> 2. Do we want to restrict ourselves to the one qseq slot provided by
>    dsq_insert_begin()? The most flexible approach IMO would be to simply
>    allow BPF to read the qseq directly via a kfunc and then supply it to
>    dsq_insert() later. With this, we can have multiple qseqs saved at the
>    same time, and we can even pass them between CPUs, e.g. if one CPU
>    dequeues a task for a sibling CPU, but we want the checks to be made
>    inside the sibling's ops.dispatch(). (I just made this use case up, it
>    may not be practical.)
> That said, exposing an internal thing like qseq to BPF may be a step
> too far.

In Tejun's reply back in [1], he suggested dsq_insert_begin() precisely
to avoid promoting qseq into the BPF ABI, which matches your own
concern. The single per-CPU slot is sufficient for the
one-task-per-iteration dispatch loops used by existing schedulers
(e.g., scx_central). If a concrete cross-CPU use case materializes
later, we can always extend dsq_insert() to accept an explicit qseq
without breaking the current, simpler path.

[1]: https://lore.kernel.org/all/acHJED4iAeytdC2l@slm.duckdns.org/

> Let me know what you think.

Please correct me if I'm missing something, thanks! ^0^

--
Cheers,
Cheng-Yang