Date: Mon, 04 May 2026 08:00:50 +0000
In-Reply-To: <20260502000039.Ga94c@cchengyang.duckdns.org>
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20260319083518.94673-1-arighi@nvidia.com>
 <20260422142633.G7180@cchengyang.duckdns.org>
 <20260426093756.Gd781@cchengyang.duckdns.org>
 <20260502000039.Ga94c@cchengyang.duckdns.org>
Subject: Re: [PATCH v2 sched_ext/for-7.1] sched_ext: Invalidate dispatch decisions on CPU affinity changes
From: Kuba Piecuch
To: Cheng-Yang Chou, Kuba Piecuch
Cc: Tejun Heo, Andrea Righi, David Vernet, Changwoo Min, Emil Tsalapatis, Christian Loehle, Daniel Hodges, Ching-Chun Huang, Chia-Ping Tsai

Hi Cheng-Yang,

On Fri May 1, 2026 at 4:19 PM UTC, Cheng-Yang Chou wrote:
>> >> 2. Do we want to restrict ourselves to the one qseq slot provided by
>> >> dsq_insert_begin()? The most flexible approach IMO would be to simply
>> >> allow BPF to read the qseq directly via a kfunc and then supply it to
>> >> dsq_insert() later. With this, we can have multiple qseqs saved at the
>> >> same time, and we can even pass them between CPUs, e.g. if one CPU
>> >> dequeues a task for a sibling CPU, but we want the checks to be made
>> >> inside the sibling's ops.dispatch(). (I just made this use case up; it
>> >> may not be practical.)
>> >> That said, exposing an internal thing like qseq to BPF may be a step too far.
>> >
>> > In Tejun's reply back in [1], he suggested dsq_insert_begin() precisely
>> > to avoid promoting qseq into the BPF ABI -- which matches your own concern.
>> > The single per-CPU slot is sufficient for the one-task-per-iteration
>> > dispatch loops used by existing schedulers (e.g., scx_central).
>> > If a concrete cross-CPU use case materializes later, we can always extend
>> > dsq_insert() to accept an explicit qseq without breaking the current,
>> > simpler path.
>> >
>> > [1]: https://lore.kernel.org/all/acHJED4iAeytdC2l@slm.duckdns.org/
>> >
>>
>> Well, Tejun doesn't explicitly say there that he's against exposing qseq, but
>> I won't be surprised if he is.
>>
>> FWIW, ghOSt (our Google-internal BPF scheduling solution) uses exactly this
>> approach to guard the dispatch path against racing dequeues/enqueues.
>> Every task has a seqnum that gets incremented on each "event" pertaining to
>> the task. In the dispatch path, the BPF scheduler reads the task seqnum,
>> does whatever checks it needs to do, and passes the seqnum to ghOSt at the end.
>>
>> Admittedly, what works downstream doesn't have to work upstream, but I still
>> wanted to provide this data point :-)
>
> The ghOSt data point is appreciated. If a concrete use case emerges where
> the single-slot approach falls short, extending dsq_insert() to accept an
> explicit qseq seems like a natural next step.
>
> Tejun, Andrea, sched-ext folks, any preferences?

Random thought: if exposing qseq values to BPF directly is undesirable, then
perhaps a less objectionable approach would be to expose them as opaque
cookie/token values? Same semantics, but fewer SCX internals leaking to BPF.

Thanks,
Kuba