All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] ARM: vfp: Introduce vfp_lock() for VFP locking.
@ 2023-06-15 10:16 Sebastian Andrzej Siewior
  2023-06-15 10:16 ` [PATCH 1/3] ARM: vfp: Provide " Sebastian Andrzej Siewior
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-06-15 10:16 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Russell King, Ard Biesheuvel, Thomas Gleixner


There was a bug report on PREEMPT_RT which made me look into VFP locking
on ARM. The usage of local_bh_disable() to ensure exclusive access to
the VFP unit is not working on PREEMPT_RT because the softirq context is
preemptible.

This series introduces vfp_lock() which does the right thing.

This series depends on Ard's rewrite of the VFP exception handling:
	ARM: convert VFP exception handling to C code
	https://lore.kernel.org/20230522080310.502250-1-ardb@kernel.org

Sebastian


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
@ 2023-05-17 22:05 Ard Biesheuvel
  2023-05-19 14:57 ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Sebastian Andrzej Siewior
  0 siblings, 1 reply; 19+ messages in thread
From: Ard Biesheuvel @ 2023-05-17 22:05 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra

On Wed, 17 May 2023 at 16:37, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2023-05-17 13:05:32 [+0200], Ard Biesheuvel wrote:
> > For context, these changes give a substantial performance boost on the
> > RX path to IPsec or WireGuard using SIMD based software crypto.
> >
> > > …
> > > > Please take a look, and confirm whether or not we still need to drop
> > > > the get_cpu() call from vfp_sync_hwstate()
> > >
> > > Yes, it needs to be dealt with. At the time vfp_sync_hwstate() the
> > > context is fully preemptible. So that get_cpu() needs to be removed
> > > before local_bh_disable(). But…
> > >
> > > If I understand the logic right, then the VFP state is saved on context
> > > switch and the FPU access is disabled but the content remains and is in
> > > sync with vfp_current_hw_state. Upon first usage (after the context
> > > switch) in userland there is an exception at which point the access to
> > > the FPU is enabled (FPEXC_EX) and the FPU content is loaded unless it is
> > > in sync as per vfp_current_hw_state.
> > >
> >
> > Exactly. On UP we lazily preserve the user space FP state, and on SMP
> > we always preserve it on a context switch. Then, only when user space
> > actually tries to use the SIMD context, the correct data is copied in.
> > That way, we don't take the performance hit for tasks that are blocked
> > in the kernel.
> >
> > > That means it relies on disabled preemption while operating on
> > > vfp_current_hw_state.
> >
> > In general, this code relies on preserving/restoring the VFP context
> > to/from memory to be a critical section, as it needs to run to
> > completion once it is started.
> >
> > > On PREEMPT_RT softirqs are only handled in process
> > > context (for instance at the end of the threaded interrupt). Therefore
> > > it is sufficient to disable preemption instead of BH. It would need
> > > something like x86's fpregs_lock().
> >
> > I think this is not the only issue, to be honest. We cannot preempt
> > tasks while they are using the SIMD unit in kernel mode, as the kernel
> > mode context is never preserved/restored. So we at least need to
> > disable preemption again in kernel_neon_begin() [which now relies on
> > BH disable to disable preemption but as you point out, that is only
> > the case on !RT]
> >
> > Given this, I wonder whether supporting kernel mode NEON on PREEMPT_RT
> > is worth it to begin with. AIUI, the alternative is to disable both
> > preemption and BH when touching the VFP state.
>
> Only preemption on RT.

Right.

> So it is debatable if it makes sense to use NEON
> on RT. The preempt-off section is limited to PAGE_SIZE due the scatter/
> gather walk if I remember correctly.
>

This depends on the type of algorithm (skciphers/aeads vs
shashes/ahashes) but they generally all have a bounded quantum of work
before yielding the NEON (and therefore the CPU).

> > > Otherwise vfp_entry() can get preempted after decisions based on
> > > vfp_current_hw_state have been made. Also kernel_neon_begin() could get
> > > preempted after enabling FPU. I think based on current logic, after
> > > kernel_neon_begin() a task can get preempted on PREEMPT_RT and since
> > > vfp_current_hw_state is NULL the registers won't be saved and a zero set
> > > of registers will be loaded once the neon task gets back on the CPU (it
> > > seems that VFP exceptions in kernel mode are treated the same way as
> > > those in user mode).
> > >
> > > What do you think?
> > >
> >
> > Indeed. kernel_neon_begin() no longer disabling preemption is
> > definitely wrong on PREEMPT_RT. The question is whether we want to
> > untangle this or just make PREEMPT_RT and KERNEL_MODE_NEON mutually
> > exclusive. This, by itself, makes quite a lot of sense, actually,
> > given that on actual 32-bit hardware, the SIMD speedup is not that
> > spectacular (the AES and other crypto instructions, which do give an
> > order of magnitude speedup are only implemented on 64-bit cores, which
> > usually run a 64-bit kernel). So if real-time behavior is a
> > requirement, using ordinary crypto implemented in C (which does not
> > require preemption to be disabled) is likely preferred anyway.
>
> I have no opinion which can be backed up by numbers. So I have no
> problem to go along with it and disable KERNEL_MODE_NEON since it is
> simpler and the benefit is questionable.
>
> So patch
> - #1 KERNEL_MODE_NEON depends on !RT

Actually, given your explanation above, I think this may not be
necessary. Given that on !RT, disabling BH implies disabling
preemption and on RT disabling preemption implies disabling BH, it
should just be a matter of doing one or the other consistently,
depending on IS_ENABLED(CONFIG_PREEMPT_RT)

> - #2 disable BH followed by preemption in vfp_sync_hwstate() and
>   vfp_entry().
>

Not sure what to do here, given my answer to #1

In most cases where the VFP code disables BH, it is either because an
interrupting softirq may enable the SIMD unit and disable it again,
which would result in a FP exception when the interrupted context
tries to access the registers, or because such an interruption would
corrupt the preserved/restored FP state if it occurs while such a
preserve/restore is in progress.

Maybe the commit log of patch 62b95a7b44d1a30b3a9 sheds some more light here.

> if so, let me prepare something unless you want to :)
>

Don't let me stop you :-)

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2023-09-26 11:04 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-15 10:16 [PATCH 0/3] ARM: vfp: Introduce vfp_lock() for VFP locking Sebastian Andrzej Siewior
2023-06-15 10:16 ` [PATCH 1/3] ARM: vfp: Provide " Sebastian Andrzej Siewior
2023-06-15 10:16 ` [PATCH 2/3] ARM: vfp: Use vfp_lock() in vfp_sync_hwstate() Sebastian Andrzej Siewior
2023-06-15 10:16 ` [PATCH 3/3] ARM: vfp: Use vfp_lock() in vfp_entry() Sebastian Andrzej Siewior
2023-06-27 10:22   ` Ard Biesheuvel
2023-06-27 12:41     ` Sebastian Andrzej Siewior
2023-06-27 12:57       ` Ard Biesheuvel
2023-06-27 13:06         ` Sebastian Andrzej Siewior
2023-06-27 13:22           ` Russell King (Oracle)
2023-06-27 13:26             ` Ard Biesheuvel
2023-06-27 13:32               ` Russell King (Oracle)
2023-06-27 14:40                 ` Russell King (Oracle)
2023-06-27 15:02                   ` Russell King (Oracle)
2023-06-27 17:41                     ` Russell King (Oracle)
2023-07-28 16:19                       ` Sebastian Andrzej Siewior
2023-06-27 13:34             ` Sebastian Andrzej Siewior
2023-09-07 13:53 ` [PATCH 0/3] ARM: vfp: Introduce vfp_lock() for VFP locking Ard Biesheuvel
2023-09-26 11:03   ` Sebastian Andrzej Siewior
  -- strict thread matches above, loose matches on Subject: below --
2023-05-17 22:05 Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010 Ard Biesheuvel
2023-05-19 14:57 ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Sebastian Andrzej Siewior
2023-05-19 14:57   ` [PATCH 3/3] ARM: vfp: Use vfp_lock() in vfp_entry() Sebastian Andrzej Siewior

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.