public inbox for linux-kernel@vger.kernel.org
From: Waiman Long <llong@redhat.com>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Waiman Long <llong@redhat.com>
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Tejun Heo <tj@kernel.org>, Barret Rhoden <brho@google.com>,
	Josh Don <joshdon@google.com>, Dohyun Kim <dohyunkim@google.com>,
	kernel-team@meta.com
Subject: Re: [PATCH bpf-next v1 12/22] rqspinlock: Add basic support for CONFIG_PARAVIRT
Date: Wed, 8 Jan 2025 19:48:02 -0500
Message-ID: <dfbaf200-7c87-41b2-ab87-906cbdf3e0d7@redhat.com>
In-Reply-To: <CAP01T74UX4VKNKmeooiCKsw7G6qkhohSFTXP0r=DZ1AuaEetAw@mail.gmail.com>


On 1/8/25 3:32 PM, Kumar Kartikeya Dwivedi wrote:
> On Wed, 8 Jan 2025 at 21:57, Waiman Long <llong@redhat.com> wrote:
>> On 1/7/25 8:59 AM, Kumar Kartikeya Dwivedi wrote:
>>> We ripped out PV and virtualization related bits from rqspinlock in an
>>> earlier commit, however, a fair lock performs poorly within a virtual
>>> machine when the lock holder is preempted. As such, retain the
>>> virt_spin_lock fallback to test and set lock, but with timeout and
>>> deadlock detection.
>>>
>>> We don't integrate support for CONFIG_PARAVIRT_SPINLOCKS yet, as that
>>> requires more involved algorithmic changes and introduces more
>>> complexity. It can be done when the need arises in the future.
>> virt_spin_lock() doesn't scale well. It is for hypervisors that don't
>> support PV qspinlock yet. Now rqspinlock() will be in this category.
> We would need to make algorithmic changes to paravirt versions, which
> would be too much for this series, so I didn't go there.
I know. The paravirt part is the most difficult. It took me over a year
of work on the paravirt part of qspinlock to get it right and merged
upstream.
>
>> I wonder if we should provide an option to disable rqspinlock and fall
>> back to the regular qspinlock with strict BPF locking semantics.
> That unfortunately won't work, because rqspinlock operates essentially
> like a trylock, where it is allowed to fail and callers must handle
> errors accordingly. Some of the users in BPF (e.g. in patch 17) remove
> their per-cpu nesting counts to rely on AA deadlock detection of
> rqspinlock, which would cause a deadlock if we transparently replace
> it with qspinlock as a fallback.

I see. This information should be documented somewhere.
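
As a starting point for such documentation, here is a minimal userspace
sketch of the semantics described above: an acquire operation that is
allowed to fail, either because the caller already holds the lock (AA
deadlock) or because a timeout expired. It is an illustration only, not
the rqspinlock implementation from this series. The names res_lock,
res_lock_acquire() and res_lock_release(), the integer owner id, and the
error codes chosen are all assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical userspace analogue; NOT the kernel rqspinlock API. */
struct res_lock {
	atomic_int owner;	/* 0 = unlocked, otherwise id of the holder */
};

/* Monotonic clock in nanoseconds, used to bound the spin. */
static long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Test-and-set style acquire that may fail.
 * Returns 0 on success, -EDEADLK if 'self' already holds the lock,
 * and -ETIMEDOUT if the lock could not be taken within timeout_ns.
 */
static int res_lock_acquire(struct res_lock *l, int self, long long timeout_ns)
{
	long long deadline = now_ns() + timeout_ns;

	assert(self != 0);	/* id 0 is reserved for "unlocked" */

	for (;;) {
		int expected = 0;

		if (atomic_compare_exchange_strong_explicit(&l->owner,
				&expected, self,
				memory_order_acquire, memory_order_relaxed))
			return 0;

		/* On failure, 'expected' holds the current owner id. */
		if (expected == self)
			return -EDEADLK;	/* AA deadlock: report at once */

		if (now_ns() > deadline)
			return -ETIMEDOUT;	/* holder did not release in time */
	}
}

/* Release the lock; only the current holder should call this. */
static void res_lock_release(struct res_lock *l)
{
	atomic_store_explicit(&l->owner, 0, memory_order_release);
}
```

A caller must check the return value and take an error path instead of
entering the critical section. That is why a plain qspinlock cannot be
substituted transparently: the second acquire by the same owner would
spin forever instead of returning -EDEADLK.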


>> Another question that I have is about PREEMPT_RT kernel which cannot
>> tolerate any locking stall. That will probably require disabling
>> rqspinlock if CONFIG_PREEMPT_RT is enabled.
> I think rqspinlock better maps to the raw spin lock variants, which
> stays as a spin lock on RT kernels, and as you see in patch 17 and 18,
> BPF maps were already using the raw spin lock variants. To avoid
> stalling, we perform deadlock checks immediately when we enter the
> slow path, so for the cases where we rely upon rqspinlock to diagnose
> and report an error, we'll recover quickly. If we still hit the
> timeout it is probably a different problem / bug anyway (and would
> have caused a kernel hang otherwise).

Is the intention to only replace raw_spinlock_t by rqspinlock but never
spinlock_t? Again, this information needs to be documented. Looking at
the PDF file, it looks like rqspinlock usage will be extended over time.

As for the locking semantics allowed by the BPF verifier, is it possible
to enforce the strict locking rules for a PREEMPT_RT kernel and use the
relaxed semantics for a non-PREEMPT_RT kernel? We don't want the loading
of an arbitrary BPF program to break the latency guarantee of a
PREEMPT_RT kernel.

Cheers,
Longman
