Date: Tue, 24 Feb 2026 12:16:46 +0100
From: Heiko Carstens
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, mathieu.desnoyers@efficios.com, Mark Rutland, cmarinas@kernel.org, maddy@linux.ibm.com, ryan.roberts@arm.com
Subject: Re: [RFC] in-kernel rseq
Message-ID: <20260224111646.20006Ddc-hca@linux.ibm.com>
In-Reply-To: <20260223163843.GR1282955@noisy.programming.kicks-ass.net>

On Mon, Feb 23, 2026 at 05:38:43PM +0100, Peter Zijlstra wrote:
> This means, it needs to be woven into the asm... and I'm not that handy
> with arm64 asm.
>
> The pseudo code would be something like:
>
>	current->sched_seq = &_R;
>	...
>
> _start:  compute per cpu-addr
>          load addr
>          $OP
> _commit: store addr
>
>	...
>	current->sched_rseq = NULL;
>
>
> Then when preemption happens (from interrupt), the instruction pointer
> is 'simply' reset to _start and it tries again.

I guess current->sched_rseq also needs to be saved on entry and restored
on exit for every interrupt, exception, and NMI, since other contexts can
make use of this_cpu ops as well.

> Anyway, this was aimed at arm64, which chose to use atomics for
> this_cpu. But if we move sched_rseq() from schedule-tail into interrupt
> entry, then this would also work for things like Power.

Let's assume s390 would be the target, which also uses atomics for
this_cpu ops.
A very simple function like:

static DEFINE_PER_CPU(long, bar);

long foo(long val)
{
	return this_cpu_add_return(bar, val);
}

would turn into the below with PREEMPT_NONE:

0000000000000000 <foo>:
   0:   c0 04 00 00 00 00   jgnop   0
   6:   c0 10 00 00 00 00   larl    %r1,6                <- r1 contains address of "bar"
                    8: R_390_PC32DBL   .data..percpu+0x2
   c:   a7 39 00 00         lghi    %r3,0
  10:   e3 10 33 b8 00 08   ag      %r1,952(%r3)         <- add per-cpu offset
  16:   eb 02 10 00 00 e8   laag    %r0,%r2,0(%r1)       <- atomic op
  1c:   b9 08 00 20         agr     %r2,%r0
  20:   07 fe               br      %r14

With PREEMPT_LAZY this turns into:

0000000000000000 <foo>:
   0:   c0 04 00 00 00 00   jgnop   0
   6:   eb af f0 68 00 24   stmg    %r10,%r15,104(%r15)
   c:   b9 04 00 ef         lgr     %r14,%r15
  10:   b9 04 00 b2         lgr     %r11,%r2
  14:   e3 f0 ff c8 ff 71   lay     %r15,-56(%r15)
  1a:   e3 e0 f0 98 00 24   stg     %r14,152(%r15)       <- up to here: create stack frame
  20:   eb 01 03 a8 00 6a   asi     936,1                <- preempt_inc()
  26:   c0 10 00 00 00 00   larl    %r1,26
                    28: R_390_PC32DBL   .data..percpu+0x2
  2c:   a7 29 00 00         lghi    %r2,0
  30:   e3 10 23 b8 00 08   ag      %r1,952(%r2)
  36:   eb ab 10 00 00 e8   laag    %r10,%r11,0(%r1)
  3c:   eb ff 03 a8 00 6e   alsi    936,-1               <- preempt_dec_and_test()
  42:   a7 54 00 05         jnhe    4c
  46:   c0 e5 00 00 00 00   brasl   %r14,46
                    48: R_390_PLT32DBL  preempt_schedule_notrace+0x2
  4c:   b9 e8 b0 2a         agrk    %r2,%r10,%r11
  50:   eb af f0 a0 00 04   lmg     %r10,%r15,160(%r15)
  56:   07 fe               br      %r14

With your proposal I guess this would turn into something like the below.
Note that it is hand-edited, so offsets etc. do not make any sense; it is
just the instruction sequence I guess we _could_ end up with:

0000000000000000 <foo>:
   0:   c0 04 00 00 00 00   jgnop   0
                            larl    %r1,#this_seq        <- &_RR
                            stg     %r1,944              <- lowcore->sched_seq = &_R;
   c:   c0 10 00 00 00 00   larl    %r1,c
                    e: R_390_PC32DBL   .data..percpu+0x2
  16:   e3 10 33 b8 00 08   ag      %r1,952
  1c:   eb 02 10 00 00 e8   laag    %r0,%r2,0(%r1)
                            mvghi   944,0                <- lowcore->sched_seq = NULL;
  2c:   b9 08 00 20         agr     %r2,%r0
  30:   07 fe               br      %r14

This uses the s390-specific "lowcore" instead of current for sched_seq,
since it is an architecture per-cpu area mapped at address zero.

Let me give it a try to verify whether the generated code would really
look like the above, but it might take a few days.
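
For illustration only, the restart rule from the quoted pseudo code and the save/restore on interrupt entry and exit can be modelled in plain user-space C. This is a minimal sketch, not kernel code: struct rseq_region, rseq_fixup_ip(), rseq_entry_save(), rseq_exit_restore() and the addresses are all made up for the example, the global pointer merely stands in for current->sched_rseq (or a lowcore field), and treating an instruction pointer equal to _commit as "still restart" is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor of one restartable sequence ("_R" in the pseudo code). */
struct rseq_region {
	unsigned long start;	/* address of the _start label */
	unsigned long commit;	/* address of the _commit store */
};

/* Stand-in for current->sched_rseq (or a lowcore field on s390). */
static const struct rseq_region *sched_rseq;

/*
 * Preemption from interrupt: return the instruction pointer to resume at.
 * Assumption: an ip equal to commit means the store has not executed yet,
 * so the sequence is still restarted.
 */
static unsigned long rseq_fixup_ip(unsigned long ip)
{
	const struct rseq_region *r = sched_rseq;

	if (r && ip >= r->start && ip <= r->commit)
		return r->start;
	return ip;
}

/* Interrupt/exception/NMI entry: save and clear the interrupted context's pointer. */
static const struct rseq_region *rseq_entry_save(void)
{
	const struct rseq_region *saved = sched_rseq;

	sched_rseq = NULL;
	return saved;
}

/* Matching exit: put the interrupted context's pointer back. */
static void rseq_exit_restore(const struct rseq_region *saved)
{
	sched_rseq = saved;
}

/* Walk through one task-level sequence with a nested handler-level one. */
static void rseq_model_demo(void)
{
	static const struct rseq_region task_seq = { 0x100, 0x110 };
	static const struct rseq_region irq_seq = { 0x200, 0x208 };
	const struct rseq_region *saved;

	sched_rseq = &task_seq;			/* sched_rseq = &_R */
	assert(rseq_fixup_ip(0x108) == 0x100);	/* inside: restart at _start */
	assert(rseq_fixup_ip(0x110) == 0x100);	/* at _commit, store still pending */
	assert(rseq_fixup_ip(0x114) == 0x114);	/* past _commit: left alone */

	saved = rseq_entry_save();		/* interrupt arrives */
	sched_rseq = &irq_seq;			/* handler runs its own this_cpu op */
	assert(rseq_fixup_ip(0x108) == 0x108);	/* handler's region does not cover 0x108 */
	sched_rseq = NULL;
	rseq_exit_restore(saved);		/* task's pointer is visible again */
	assert(sched_rseq == &task_seq);

	sched_rseq = NULL;			/* sched_rseq = NULL */
	assert(rseq_fixup_ip(0x108) == 0x108);	/* no active sequence: no fixup */
}
```

Calling rseq_model_demo() from a test program runs through the cases. Without the save/restore pair, the handler's own sequence would overwrite and then clear the pointer that belongs to the interrupted task.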