* [PATCH 00/34] biops: add atomig find_bit() operations
@ 2023-11-18 15:50 Yury Norov
2023-11-18 15:50 ` [PATCH 01/34] lib/find: add atomic find_bit() primitives Yury Norov
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Yury Norov @ 2023-11-18 15:50 UTC (permalink / raw)
To: linux-kernel, David S. Miller, H. Peter Anvin,
James E.J. Bottomley, K. Y. Srinivasan, Md. Haris Iqbal,
Akinobu Mita, Andrew Morton, Bjorn Andersson, Borislav Petkov,
Chaitanya Kulkarni, Christian Brauner, Damien Le Moal,
Dave Hansen, David Disseldorp, Edward Cree, Eric Dumazet,
Fenghua Yu, Geert Uytterhoeven, Greg Kroah-Hartman,
Gregory Greenman, Hans Verkuil, Hans de Goede, Hugh Dickins,
Ingo Molnar, Jakub Kicinski, Jaroslav Kysela, Jason Gunthorpe,
Jens Axboe, Jiri Pirko, Jiri Slaby, Kalle Valo, Karsten Graul,
Karsten Keil, Kees Cook, Leon Romanovsky, Mark Rutland,
Martin Habets, Mauro Carvalho Chehab, Michael Ellerman,
Michal Simek, Nicholas Piggin, Oliver Neukum, Paolo Abeni,
Paolo Bonzini, Peter Zijlstra, Ping-Ke Shih, Rich Felker,
Rob Herring, Robin Murphy, Sathya Prakash Veerichetty,
Sean Christopherson, Shuai Xue, Stanislaw Gruszka, Steven Rostedt,
Thomas Bogendoerfer, Thomas Gleixner, Valentin Schneider,
Vitaly Kuznetsov, Wenjia Zhang, Will Deacon, Yoshinori Sato,
GR-QLogic-Storage-Upstream, alsa-devel, ath10k, dmaengine, iommu,
kvm, linux-arm-kernel, linux-arm-msm, linux-block,
linux-bluetooth, linux-hyperv, linux-m68k, linux-media,
linux-mips, linux-net-drivers, linux-pci, linux-rdma, linux-s390,
linux-scsi, linux-serial, linux-sh, linux-sound, linux-usb,
linux-wireless, linuxppc-dev, mpi3mr-linuxdrv.pdl, netdev,
sparclinux, x86
Cc: Yury Norov, Jan Kara, Mirsad Todorovac, Matthew Wilcox,
Rasmus Villemoes, Andy Shevchenko, Maxim Kuvyrkov, Alexey Klimov
Add helpers around test_and_{set,clear}_bit() that allow searching for
clear or set bits and flipping them atomically.
The target patterns may look like this:
for (idx = 0; idx < nbits; idx++)
if (test_and_clear_bit(idx, bitmap))
do_something(idx);
Or like this:
do {
bit = find_first_bit(bitmap, nbits);
if (bit >= nbits)
return nbits;
} while (!test_and_clear_bit(bit, bitmap));
return bit;
In both cases, the opencoded loop may be converted to a single function
or iterator call. Correspondingly:
for_each_test_and_clear_bit(idx, bitmap, nbits)
do_something(idx);
Or:
return find_and_clear_bit(bitmap, nbits);
Obviously, the less routine code people have to write themselves, the lower
the probability of making a mistake. Patch #31 of this series fixes one such
error in the perf/m1 codebase.
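For illustration only, the find-then-claim loop above can be sketched in
userspace. This is a minimal sketch, assuming C11 <stdatomic.h> and the
GCC/Clang builtin __builtin_ctzl(); the sketch_* names and BITS_PER_WORD are
hypothetical stand-ins and are not the helpers added by this series:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Userspace stand-in for test_and_clear_bit(): returns the old bit value. */
static int sketch_test_and_clear_bit(unsigned long bit, _Atomic unsigned long *map)
{
	unsigned long mask = 1UL << (bit % BITS_PER_WORD);
	unsigned long old = atomic_fetch_and(&map[bit / BITS_PER_WORD], ~mask);

	return (old & mask) != 0;
}

/*
 * Find a set bit and clear it atomically. Returns the claimed bit number,
 * or nbits if no set bit could be claimed.
 */
static unsigned long sketch_find_and_clear_bit(_Atomic unsigned long *map,
					       unsigned long nbits)
{
	unsigned long idx, val, bit;

	assert(map);
	for (idx = 0; idx * BITS_PER_WORD < nbits; idx++) {
		for (;;) {
			/* One load per attempt: 'val' is a private snapshot. */
			val = atomic_load(&map[idx]);
			if (val == 0)
				break;	/* nothing to claim in this word */
			bit = idx * BITS_PER_WORD + (unsigned long)__builtin_ctzl(val);
			if (bit >= nbits)
				return nbits;
			if (sketch_test_and_clear_bit(bit, map))
				return bit;
			/* Lost the race for this bit: re-read the word and retry. */
		}
	}
	return nbits;
}
```

A caller only has to compare the result against nbits; the retry on a lost
race stays inside the helper instead of being repeated at every call site.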
These are not only handy helpers; they also resolve a non-trivial
issue of using non-atomic find_bit() together with atomic
test_and_{set,clear}_bit().
The trick is that find_bit() implies that the bitmap is a regular
non-volatile piece of memory, so the compiler is allowed to use
optimization techniques such as re-fetching memory instead of caching it.
For example, find_first_bit() is implemented like this:
for (idx = 0; idx * BITS_PER_LONG < sz; idx++) {
val = addr[idx];
if (val) {
sz = min(idx * BITS_PER_LONG + __ffs(val), sz);
break;
}
}
On register-memory architectures, like x86, the compiler may decide to
access memory twice: the first time to compare against 0, and the second
time to fetch its value to pass it to __ffs().
When running find_first_bit() on volatile memory, the memory may get
changed in between, which may, for instance, lead to passing 0 to
__ffs(), whose result is undefined. This is a potentially dangerous call.
find_and_clear_bit(), as a wrapper around test_and_clear_bit(),
naturally treats the underlying bitmap as volatile memory and prevents
the compiler from such optimizations.
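The double-fetch hazard can be avoided by reading each word exactly once
into a local variable and using only that snapshot afterwards. A minimal
userspace sketch, assuming C11 <stdatomic.h> and the GCC/Clang builtin
__builtin_ctzl() (the function name and BITS_PER_WORD are hypothetical):

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Search for the first set bit while loading every word exactly once.
 * The zero check and the bit scan both use the local copy 'val', so a
 * concurrent writer cannot make the bit scan see 0.
 */
static unsigned long sketch_find_first_bit_once(_Atomic unsigned long *map,
						unsigned long nbits)
{
	unsigned long idx, val, bit;

	assert(map);
	for (idx = 0; idx * BITS_PER_WORD < nbits; idx++) {
		/* Single, explicit load; the compiler may not re-fetch it. */
		val = atomic_load_explicit(&map[idx], memory_order_relaxed);
		if (val) {
			bit = idx * BITS_PER_WORD + (unsigned long)__builtin_ctzl(val);
			return bit < nbits ? bit : nbits;
		}
	}
	return nbits;
}
```

The result may still be stale by the time the caller looks at it; the
sketch only guarantees that the bit scan never runs on a zero word.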
KCSAN now catches exactly this type of situation and warns on
undercover memory modifications. We can use it to reveal improper usage
of find_bit(), and convert it to atomic find_and_*_bit() as appropriate.
The 1st patch of the series adds the following atomic primitives:
find_and_set_bit(addr, nbits);
find_and_set_next_bit(addr, nbits, start);
...
Here, the find_and_{set,clear} part refers to the corresponding
test_and_{set,clear}_bit() function, and suffixes like _wrap or _lock
derive their semantics from the corresponding find() or test() functions.
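As an illustration of the _wrap suffix, the search first covers
[offset, nbits) and, if nothing was claimed there, wraps around to
[0, offset). A minimal userspace sketch, assuming C11 <stdatomic.h>; the
sketch_* names are hypothetical, and the bit-by-bit scan is chosen for
brevity rather than speed:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Claim the first clear bit in [start, nbits); return nbits if none. */
static unsigned long sketch_find_and_set_next_bit(_Atomic unsigned long *map,
						  unsigned long nbits,
						  unsigned long start)
{
	unsigned long bit, mask, old;

	for (bit = start; bit < nbits; bit++) {
		mask = 1UL << (bit % BITS_PER_WORD);
		/* Atomic OR; the bit is ours only if it was clear before. */
		old = atomic_fetch_or(&map[bit / BITS_PER_WORD], mask);
		if (!(old & mask))
			return bit;
	}
	return nbits;
}

/* Same, but wrap around to [0, offset) when [offset, nbits) is full. */
static unsigned long sketch_find_and_set_bit_wrap(_Atomic unsigned long *map,
						  unsigned long nbits,
						  unsigned long offset)
{
	unsigned long bit;

	assert(offset <= nbits);
	bit = sketch_find_and_set_next_bit(map, nbits, offset);
	if (bit < nbits || offset == 0)
		return bit;

	bit = sketch_find_and_set_next_bit(map, offset, 0);
	return bit < offset ? bit : nbits;
}
```

Whichever half the bit comes from, the caller sees the same contract: a
value below nbits is an exclusively claimed bit, and nbits means nothing
was claimed.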
For brevity, the naming omits the fact that we search for a zero bit in
the find_and_set functions and, correspondingly, for a set bit in the
find_and_clear functions.
The patch also adds iterators with atomic semantics, like
for_each_test_and_set_bit(). Here, the naming rule is to simply prefix
the corresponding atomic operation with 'for_each'.
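The iterator idea can be sketched in userspace as a for-loop macro that
calls a find-and-clear helper in its condition. This is a minimal sketch,
assuming C11 <stdatomic.h>; all sketch_* names are hypothetical, and the
helper scans bit by bit for brevity:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Claim (clear) the first set bit in [start, nbits); return nbits if none. */
static unsigned long sketch_find_and_clear_next_bit(_Atomic unsigned long *map,
						    unsigned long nbits,
						    unsigned long start)
{
	unsigned long bit, mask, old;

	for (bit = start; bit < nbits; bit++) {
		mask = 1UL << (bit % BITS_PER_WORD);
		/* Atomic AND; the bit is ours only if it was set before. */
		old = atomic_fetch_and(&map[bit / BITS_PER_WORD], ~mask);
		if (old & mask)
			return bit;
	}
	return nbits;
}

/* Visit every set bit, atomically clearing each one before the body runs. */
#define sketch_for_each_test_and_clear_bit(bit, map, nbits)		\
	for ((bit) = 0;							\
	     (bit) = sketch_find_and_clear_next_bit((map), (nbits), (bit)), \
	     (bit) < (nbits);						\
	     (bit)++)

/* Example user: collect all claimed bit numbers into 'out'. */
static unsigned long sketch_drain(_Atomic unsigned long *map,
				  unsigned long nbits, unsigned long *out)
{
	unsigned long bit, n = 0;

	assert(out);
	sketch_for_each_test_and_clear_bit(bit, map, nbits)
		out[n++] = bit;

	return n;
}
```

Each bit the body sees was already cleared by this thread, so two
concurrent walkers never process the same bit.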
This series is a result of discussion [1]. All find_bit() functions imply
exclusive access to the bitmaps. However, KCSAN reports quite a number
of warnings related to find_bit() API. Some of them are not pointing
to real bugs because in many situations people intentionally allow
concurrent bitmap operations.
If so, find_bit() can be annotated such that KCSAN will ignore it:
bit = data_race(find_first_bit(bitmap, nbits));
This series addresses the other important case where people really need
atomic find ops. As the following patches show, the resulting code
looks safer and more verbose compared to opencoded loops followed by
atomic bit flips.
In [1] Mirsad reported a 2% slowdown in a single-thread search test when
switching the find_bit() functions to treat bitmaps as volatile arrays. On
the other hand, the kernel robot in the same thread reported +3.7% to the
performance of the will-it-scale.per_thread_ops test.
Assuming that our compilers are sane and generate better code against
properly annotated data, the above discrepancy doesn't look weird. When
running on non-volatile bitmaps, plain find_bit() outperforms atomic
find_and_*_bit(), and vice versa.
So, all users of the find_bit() API where heavy concurrency is expected
are encouraged to switch to atomic find_and_*_bit() as appropriate.
The 1st patch of this series adds the atomic find_and_*_bit() API, and all
the following patches spread it over the kernel. They can be applied
separately from each other on a per-subsystem basis, or I can pull them
into the bitmap tree, as appropriate.
[1] https://lore.kernel.org/lkml/634f5fdf-e236-42cf-be8d-48a581c21660@alu.unizg.hr/T/#m3e7341eb3571753f3acf8fe166f3fb5b2c12e615
Yury Norov (34):
lib/find: add atomic find_bit() primitives
lib/sbitmap; make __sbitmap_get_word() using find_and_set_bit()
watch_queue: use atomic find_bit() in post_one_notification()
sched: add cpumask_find_and_set() and use it in __mm_cid_get()
mips: sgi-ip30: rework heart_alloc_int()
sparc: fix opencoded find_and_set_bit() in alloc_msi()
perf/arm: optimize opencoded atomic find_bit() API
drivers/perf: optimize ali_drw_get_counter_idx() by using find_bit()
dmaengine: idxd: optimize perfmon_assign_event()
ath10k: optimize ath10k_snoc_napi_poll() by using find_bit()
wifi: rtw88: optimize rtw_pci_tx_kick_off() by using find_bit()
wifi: intel: use atomic find_bit() API where appropriate
KVM: x86: hyper-v: optimize and cleanup kvm_hv_process_stimers()
PCI: hv: switch hv_get_dom_num() to use atomic find_bit()
scsi: use atomic find_bit() API where appropriate
powerpc: use atomic find_bit() API where appropriate
iommu: use atomic find_bit() API where appropriate
media: radio-shark: use atomic find_bit() API where appropriate
sfc: switch to using atomic find_bit() API where appropriate
tty: nozomi: optimize interrupt_handler()
usb: cdc-acm: optimize acm_softint()
block: null_blk: fix opencoded find_and_set_bit() in get_tag()
RDMA/rtrs: fix opencoded find_and_set_bit_lock() in
__rtrs_get_permit()
mISDN: optimize get_free_devid()
media: em28xx: cx231xx: fix opencoded find_and_set_bit()
ethernet: rocker: optimize ofdpa_port_internal_vlan_id_get()
serial: sc12is7xx: optimize sc16is7xx_alloc_line()
bluetooth: optimize cmtp_alloc_block_id()
net: smc: fix opencoded find_and_set_bit() in
smc_wr_tx_get_free_slot_index()
ALSA: use atomic find_bit() functions where applicable
drivers/perf: optimize m1_pmu_get_event_idx() by using find_bit() API
m68k: rework get_mmu_context()
microblaze: rework get_mmu_context()
sh: rework ilsel_enable()
arch/m68k/include/asm/mmu_context.h | 11 +-
arch/microblaze/include/asm/mmu_context_mm.h | 11 +-
arch/mips/sgi-ip30/ip30-irq.c | 12 +-
arch/powerpc/mm/book3s32/mmu_context.c | 10 +-
arch/powerpc/platforms/pasemi/dma_lib.c | 45 +--
arch/powerpc/platforms/powernv/pci-sriov.c | 12 +-
arch/sh/boards/mach-x3proto/ilsel.c | 4 +-
arch/sparc/kernel/pci_msi.c | 9 +-
arch/x86/kvm/hyperv.c | 39 ++-
drivers/block/null_blk/main.c | 41 +--
drivers/dma/idxd/perfmon.c | 8 +-
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 15 +-
drivers/iommu/arm/arm-smmu/arm-smmu.h | 10 +-
drivers/iommu/msm_iommu.c | 18 +-
drivers/isdn/mISDN/core.c | 9 +-
drivers/media/radio/radio-shark.c | 5 +-
drivers/media/radio/radio-shark2.c | 5 +-
drivers/media/usb/cx231xx/cx231xx-cards.c | 16 +-
drivers/media/usb/em28xx/em28xx-cards.c | 37 +--
drivers/net/ethernet/rocker/rocker_ofdpa.c | 11 +-
drivers/net/ethernet/sfc/rx_common.c | 4 +-
drivers/net/ethernet/sfc/siena/rx_common.c | 4 +-
drivers/net/ethernet/sfc/siena/siena_sriov.c | 14 +-
drivers/net/wireless/ath/ath10k/snoc.c | 9 +-
.../net/wireless/intel/iwlegacy/4965-mac.c | 7 +-
drivers/net/wireless/intel/iwlegacy/common.c | 8 +-
drivers/net/wireless/intel/iwlwifi/dvm/sta.c | 8 +-
drivers/net/wireless/intel/iwlwifi/dvm/tx.c | 19 +-
drivers/net/wireless/realtek/rtw88/pci.c | 5 +-
drivers/net/wireless/realtek/rtw89/pci.c | 5 +-
drivers/pci/controller/pci-hyperv.c | 7 +-
drivers/perf/alibaba_uncore_drw_pmu.c | 10 +-
drivers/perf/apple_m1_cpu_pmu.c | 8 +-
drivers/perf/arm-cci.c | 23 +-
drivers/perf/arm-ccn.c | 10 +-
drivers/perf/arm_dmc620_pmu.c | 9 +-
drivers/perf/arm_pmuv3.c | 8 +-
drivers/scsi/mpi3mr/mpi3mr_os.c | 21 +-
drivers/scsi/qedi/qedi_main.c | 9 +-
drivers/scsi/scsi_lib.c | 5 +-
drivers/tty/nozomi.c | 5 +-
drivers/tty/serial/sc16is7xx.c | 8 +-
drivers/usb/class/cdc-acm.c | 5 +-
include/linux/cpumask.h | 12 +
include/linux/find.h | 289 ++++++++++++++++++
kernel/sched/sched.h | 52 +---
kernel/watch_queue.c | 6 +-
lib/find_bit.c | 85 ++++++
lib/sbitmap.c | 46 +--
net/bluetooth/cmtp/core.c | 10 +-
net/smc/smc_wr.c | 10 +-
sound/pci/hda/hda_codec.c | 7 +-
sound/usb/caiaq/audio.c | 13 +-
53 files changed, 588 insertions(+), 481 deletions(-)
--
2.39.2
* [PATCH 01/34] lib/find: add atomic find_bit() primitives
2023-11-18 15:50 [PATCH 00/34] biops: add atomig find_bit() operations Yury Norov
@ 2023-11-18 15:50 ` Yury Norov
2023-11-18 16:23 ` Bart Van Assche
2023-11-18 15:50 ` [PATCH 13/34] KVM: x86: hyper-v: optimize and cleanup kvm_hv_process_stimers() Yury Norov
2023-11-18 16:18 ` [PATCH 00/34] biops: add atomig find_bit() operations Bart Van Assche
2 siblings, 1 reply; 8+ messages in thread
From: Yury Norov @ 2023-11-18 15:50 UTC (permalink / raw)
Add helpers around test_and_{set,clear}_bit() that allow searching for
clear or set bits and flipping them atomically.
The target patterns may look like this:
for (idx = 0; idx < nbits; idx++)
if (test_and_clear_bit(idx, bitmap))
do_something(idx);
Or like this:
do {
bit = find_first_bit(bitmap, nbits);
if (bit >= nbits)
return nbits;
} while (!test_and_clear_bit(bit, bitmap));
return bit;
In both cases, the opencoded loop may be converted to a single function
or iterator call. Correspondingly:
for_each_test_and_clear_bit(idx, bitmap, nbits)
do_something(idx);
Or:
return find_and_clear_bit(bitmap, nbits);
Obviously, the less routine code people have to write themselves, the lower
the probability of making a mistake.
These are not only handy helpers; they also resolve a non-trivial
issue of using non-atomic find_bit() together with atomic
test_and_{set,clear}_bit().
The trick is that find_bit() implies that the bitmap is a regular
non-volatile piece of memory, so the compiler is allowed to use
optimization techniques such as re-fetching memory instead of caching it.
For example, find_first_bit() is implemented like this:
for (idx = 0; idx * BITS_PER_LONG < sz; idx++) {
val = addr[idx];
if (val) {
sz = min(idx * BITS_PER_LONG + __ffs(val), sz);
break;
}
}
On register-memory architectures, like x86, the compiler may decide to
access memory twice: the first time to compare against 0, and the second
time to fetch its value to pass it to __ffs().
When running find_first_bit() on volatile memory, the memory may get
changed in between, which may, for instance, lead to passing 0 to
__ffs(), whose result is undefined. This is a potentially dangerous call.
find_and_clear_bit(), as a wrapper around test_and_clear_bit(),
naturally treats the underlying bitmap as volatile memory and prevents
the compiler from such optimizations.
KCSAN now catches exactly this type of situation and warns on
undercover memory modifications. We can use it to reveal improper usage
of find_bit(), and convert it to atomic find_and_*_bit() as appropriate.
The 1st patch of the series adds the following atomic primitives:
find_and_set_bit(addr, nbits);
find_and_set_next_bit(addr, nbits, start);
...
Here, the find_and_{set,clear} part refers to the corresponding
test_and_{set,clear}_bit() function, and suffixes like _wrap or _lock
derive their semantics from the corresponding find() or test() functions.
For brevity, the naming omits the fact that we search for a zero bit in
the find_and_set functions and, correspondingly, for a set bit in the
find_and_clear functions.
The patch also adds iterators with atomic semantics, like
for_each_test_and_set_bit(). Here, the naming rule is to simply prefix
the corresponding atomic operation with 'for_each'.
All users of the find_bit() API where heavy concurrency is expected
are encouraged to switch to atomic find_and_*_bit() as appropriate.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
include/linux/find.h | 289 +++++++++++++++++++++++++++++++++++++++++++
lib/find_bit.c | 85 +++++++++++++
2 files changed, 374 insertions(+)
diff --git a/include/linux/find.h b/include/linux/find.h
index 5e4f39ef2e72..e8567f336f42 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -32,6 +32,16 @@ extern unsigned long _find_first_and_bit(const unsigned long *addr1,
extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
extern unsigned long _find_last_bit(const unsigned long *addr, unsigned long size);
+unsigned long _find_and_set_bit(volatile unsigned long *addr, unsigned long nbits);
+unsigned long _find_and_set_next_bit(volatile unsigned long *addr, unsigned long nbits,
+ unsigned long start);
+unsigned long _find_and_set_bit_lock(volatile unsigned long *addr, unsigned long nbits);
+unsigned long _find_and_set_next_bit_lock(volatile unsigned long *addr, unsigned long nbits,
+ unsigned long start);
+unsigned long _find_and_clear_bit(volatile unsigned long *addr, unsigned long nbits);
+unsigned long _find_and_clear_next_bit(volatile unsigned long *addr, unsigned long nbits,
+ unsigned long start);
+
#ifdef __BIG_ENDIAN
unsigned long _find_first_zero_bit_le(const unsigned long *addr, unsigned long size);
unsigned long _find_next_zero_bit_le(const unsigned long *addr, unsigned
@@ -460,6 +470,267 @@ unsigned long __for_each_wrap(const unsigned long *bitmap, unsigned long size,
return bit < start ? bit : size;
}
+/**
+ * find_and_set_bit - Find a zero bit and set it atomically
+ * @addr: The address to base the search on
+ * @nbits: The bitmap size in bits
+ *
+ * This function is designed to operate in concurrent access environment.
+ *
+ * Because of concurrency and volatile nature of underlying bitmap, it's not
+ * guaranteed that the bit found is the 1st bit in the bitmap. It's also not
+ * guaranteed that if @nbits is returned, the bitmap has no clear bits.
+ *
+ * The function does guarantee that if returned value is in range [0 .. @nbits),
+ * the acquired bit belongs to the caller exclusively.
+ *
+ * Returns: found and set bit, or @nbits if no bits found
+ */
+static inline
+unsigned long find_and_set_bit(volatile unsigned long *addr, unsigned long nbits)
+{
+ if (small_const_nbits(nbits)) {
+ unsigned long val, ret;
+
+ do {
+ val = *addr | ~GENMASK(nbits - 1, 0);
+ if (val == ~0UL)
+ return nbits;
+ ret = ffz(val);
+ } while (test_and_set_bit(ret, addr));
+
+ return ret;
+ }
+
+ return _find_and_set_bit(addr, nbits);
+}
+
+
+/**
+ * find_and_set_next_bit - Find a zero bit and set it, starting from @offset
+ * @addr: The address to base the search on
+ * @nbits: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * This function is designed to operate in concurrent access environment.
+ *
+ * Because of concurrency and volatile nature of underlying bitmap, it's not
+ * guaranteed that the bit found is the 1st bit in the bitmap, starting from @offset.
+ * It's also not guaranteed that if @nbits is returned, the bitmap has no clear bits.
+ *
+ * The function does guarantee that if returned value is in range [@offset .. @nbits),
+ * the acquired bit belongs to the caller exclusively.
+ *
+ * Returns: found and set bit, or @nbits if no bits found
+ */
+static inline
+unsigned long find_and_set_next_bit(volatile unsigned long *addr,
+ unsigned long nbits, unsigned long offset)
+{
+ if (small_const_nbits(nbits)) {
+ unsigned long val, ret;
+
+ do {
+ val = *addr | ~GENMASK(nbits - 1, offset);
+ if (val == ~0UL)
+ return nbits;
+ ret = ffz(val);
+ } while (test_and_set_bit(ret, addr));
+
+ return ret;
+ }
+
+ return _find_and_set_next_bit(addr, nbits, offset);
+}
+
+/**
+ * find_and_set_bit_wrap - find and set bit starting at @offset, wrapping around zero
+ * @addr: The first address to base the search on
+ * @nbits: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * Returns: the bit number for the next clear bit, or first clear bit up to @offset,
+ * while atomically setting it. If no bits are found, returns @nbits.
+ */
+static inline
+unsigned long find_and_set_bit_wrap(volatile unsigned long *addr,
+ unsigned long nbits, unsigned long offset)
+{
+ unsigned long bit = find_and_set_next_bit(addr, nbits, offset);
+
+ if (bit < nbits || offset == 0)
+ return bit;
+
+ bit = find_and_set_bit(addr, offset);
+ return bit < offset ? bit : nbits;
+}
+
+/**
+ * find_and_set_bit_lock - find a zero bit, then set it atomically with lock
+ * @addr: The address to base the search on
+ * @nbits: The bitmap size in bits
+ *
+ * This function is designed to operate in concurrent access environment.
+ *
+ * Because of concurrency and volatile nature of underlying bitmap, it's not
+ * guaranteed that the bit found is the 1st bit in the bitmap. It's also not
+ * guaranteed that if @nbits is returned, the bitmap has no clear bits.
+ *
+ * The function does guarantee that if returned value is in range [0 .. @nbits),
+ * the acquired bit belongs to the caller exclusively.
+ *
+ * Returns: found and set bit, or @nbits if no bits found
+ */
+static inline
+unsigned long find_and_set_bit_lock(volatile unsigned long *addr, unsigned long nbits)
+{
+ if (small_const_nbits(nbits)) {
+ unsigned long val, ret;
+
+ do {
+ val = *addr | ~GENMASK(nbits - 1, 0);
+ if (val == ~0UL)
+ return nbits;
+ ret = ffz(val);
+ } while (test_and_set_bit_lock(ret, addr));
+
+ return ret;
+ }
+
+ return _find_and_set_bit_lock(addr, nbits);
+}
+
+/**
+ * find_and_set_next_bit_lock - find a zero bit and set it atomically with lock
+ * @addr: The address to base the search on
+ * @nbits: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * This function is designed to operate in concurrent access environment.
+ *
+ * Because of concurrency and volatile nature of underlying bitmap, it's not
+ * guaranteed that the bit found is the 1st bit in the range. It's also not
+ * guaranteed that if @nbits is returned, the bitmap has no clear bits.
+ *
+ * The function does guarantee that if returned value is in range [@offset .. @nbits),
+ * the acquired bit belongs to the caller exclusively.
+ *
+ * Returns: found and set bit, or @nbits if no bits found
+ */
+static inline
+unsigned long find_and_set_next_bit_lock(volatile unsigned long *addr,
+ unsigned long nbits, unsigned long offset)
+{
+ if (small_const_nbits(nbits)) {
+ unsigned long val, ret;
+
+ do {
+ val = *addr | ~GENMASK(nbits - 1, offset);
+ if (val == ~0UL)
+ return nbits;
+ ret = ffz(val);
+ } while (test_and_set_bit_lock(ret, addr));
+
+ return ret;
+ }
+
+ return _find_and_set_next_bit_lock(addr, nbits, offset);
+}
+
+/**
+ * find_and_set_bit_wrap_lock - find zero bit starting at @offset and set it
+ * with lock, and wrap around zero if nothing found
+ * @addr: The first address to base the search on
+ * @nbits: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * Returns: the bit number for the next clear bit, or first clear bit up to @offset,
+ * while atomically setting it with lock. If no bits are found, returns @nbits.
+ */
+static inline
+unsigned long find_and_set_bit_wrap_lock(volatile unsigned long *addr,
+ unsigned long nbits, unsigned long offset)
+{
+ unsigned long bit = find_and_set_next_bit_lock(addr, nbits, offset);
+
+ if (bit < nbits || offset == 0)
+ return bit;
+
+ bit = find_and_set_bit_lock(addr, offset);
+ return bit < offset ? bit : nbits;
+}
+
+/**
+ * find_and_clear_bit - Find a set bit and clear it atomically
+ * @addr: The address to base the search on
+ * @nbits: The bitmap size in bits
+ *
+ * This function is designed to operate in concurrent access environment.
+ *
+ * Because of concurrency and volatile nature of underlying bitmap, it's not
+ * guaranteed that the found bit is the 1st bit in the bitmap. It's also not
+ * guaranteed that if @nbits is returned, the bitmap is empty.
+ *
+ * The function does guarantee that if returned value is in range [0 .. @nbits),
+ * the acquired bit belongs to the caller exclusively.
+ *
+ * Returns: found and cleared bit, or @nbits if no bits found
+ */
+static inline unsigned long find_and_clear_bit(volatile unsigned long *addr, unsigned long nbits)
+{
+ if (small_const_nbits(nbits)) {
+ unsigned long val, ret;
+
+ do {
+ val = *addr & GENMASK(nbits - 1, 0);
+ if (val == 0)
+ return nbits;
+ ret = __ffs(val);
+ } while (!test_and_clear_bit(ret, addr));
+
+ return ret;
+ }
+
+ return _find_and_clear_bit(addr, nbits);
+}
+
+/**
+ * find_and_clear_next_bit - Find a set bit next after @offset, and clear it atomically
+ * @addr: The address to base the search on
+ * @nbits: The bitmap size in bits
+ * @offset: bit offset at which to start searching
+ *
+ * This function is designed to operate in concurrent access environment.
+ *
+ * Because of concurrency and volatile nature of underlying bitmap, it's not
+ * guaranteed that the bit found is the 1st bit in the range. It's also not
+ * guaranteed that if @nbits is returned, there's no set bits after @offset.
+ *
+ * The function does guarantee that if returned value is in range [@offset .. @nbits),
+ * the acquired bit belongs to the caller exclusively.
+ *
+ * Returns: found and cleared bit, or @nbits if no bits found
+ */
+static inline
+unsigned long find_and_clear_next_bit(volatile unsigned long *addr,
+ unsigned long nbits, unsigned long offset)
+{
+ if (small_const_nbits(nbits)) {
+ unsigned long val, ret;
+
+ do {
+ val = *addr & GENMASK(nbits - 1, offset);
+ if (val == 0)
+ return nbits;
+ ret = __ffs(val);
+ } while (!test_and_clear_bit(ret, addr));
+
+ return ret;
+ }
+
+ return _find_and_clear_next_bit(addr, nbits, offset);
+}
+
/**
* find_next_clump8 - find next 8-bit clump with set bits in a memory region
* @clump: location to store copy of found clump
@@ -577,6 +848,24 @@ unsigned long find_next_bit_le(const void *addr, unsigned
#define for_each_set_bit_from(bit, addr, size) \
for (; (bit) = find_next_bit((addr), (size), (bit)), (bit) < (size); (bit)++)
+/* same as for_each_set_bit() but atomically clears each found bit */
+#define for_each_test_and_clear_bit(bit, addr, size) \
+ for ((bit) = 0; \
+ (bit) = find_and_clear_next_bit((addr), (size), (bit)), (bit) < (size); \
+ (bit)++)
+
+/* same as for_each_clear_bit() but atomically sets each found bit */
+#define for_each_test_and_set_bit(bit, addr, size) \
+ for ((bit) = 0; \
+ (bit) = find_and_set_next_bit((addr), (size), (bit)), (bit) < (size); \
+ (bit)++)
+
+/* same as for_each_clear_bit_from() but atomically sets each found bit */
+#define for_each_test_and_set_bit_from(bit, addr, size) \
+ for (; \
+ (bit) = find_and_set_next_bit((addr), (size), (bit)), (bit) < (size); \
+ (bit)++)
+
#define for_each_clear_bit(bit, addr, size) \
for ((bit) = 0; \
(bit) = find_next_zero_bit((addr), (size), (bit)), (bit) < (size); \
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 32f99e9a670e..c9b6b9f96610 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -116,6 +116,91 @@ unsigned long _find_first_and_bit(const unsigned long *addr1,
EXPORT_SYMBOL(_find_first_and_bit);
#endif
+unsigned long _find_and_set_bit(volatile unsigned long *addr, unsigned long nbits)
+{
+ unsigned long bit;
+
+ do {
+ bit = FIND_FIRST_BIT(~addr[idx], /* nop */, nbits);
+ if (bit >= nbits)
+ return nbits;
+ } while (test_and_set_bit(bit, addr));
+
+ return bit;
+}
+EXPORT_SYMBOL(_find_and_set_bit);
+
+unsigned long _find_and_set_next_bit(volatile unsigned long *addr,
+ unsigned long nbits, unsigned long start)
+{
+ unsigned long bit;
+
+ do {
+ bit = FIND_NEXT_BIT(~addr[idx], /* nop */, nbits, start);
+ if (bit >= nbits)
+ return nbits;
+ } while (test_and_set_bit(bit, addr));
+
+ return bit;
+}
+EXPORT_SYMBOL(_find_and_set_next_bit);
+
+unsigned long _find_and_set_bit_lock(volatile unsigned long *addr, unsigned long nbits)
+{
+ unsigned long bit;
+
+ do {
+ bit = FIND_FIRST_BIT(~addr[idx], /* nop */, nbits);
+ if (bit >= nbits)
+ return nbits;
+ } while (test_and_set_bit_lock(bit, addr));
+
+ return bit;
+}
+EXPORT_SYMBOL(_find_and_set_bit_lock);
+
+unsigned long _find_and_set_next_bit_lock(volatile unsigned long *addr,
+ unsigned long nbits, unsigned long start)
+{
+ unsigned long bit;
+
+ do {
+ bit = FIND_NEXT_BIT(~addr[idx], /* nop */, nbits, start);
+ if (bit >= nbits)
+ return nbits;
+ } while (test_and_set_bit_lock(bit, addr));
+
+ return bit;
+}
+EXPORT_SYMBOL(_find_and_set_next_bit_lock);
+
+unsigned long _find_and_clear_bit(volatile unsigned long *addr, unsigned long nbits)
+{
+ unsigned long bit;
+
+ do {
+ bit = FIND_FIRST_BIT(addr[idx], /* nop */, nbits);
+ if (bit >= nbits)
+ return nbits;
+ } while (!test_and_clear_bit(bit, addr));
+
+ return bit;
+}
+EXPORT_SYMBOL(_find_and_clear_bit);
+
+unsigned long _find_and_clear_next_bit(volatile unsigned long *addr,
+ unsigned long nbits, unsigned long start)
+{
+ do {
+ start = FIND_NEXT_BIT(addr[idx], /* nop */, nbits, start);
+ if (start >= nbits)
+ return nbits;
+ } while (!test_and_clear_bit(start, addr));
+
+ return start;
+}
+EXPORT_SYMBOL(_find_and_clear_next_bit);
+
#ifndef find_first_zero_bit
/*
* Find the first cleared bit in a memory region.
--
2.39.2
* [PATCH 13/34] KVM: x86: hyper-v: optimize and cleanup kvm_hv_process_stimers()
2023-11-18 15:50 [PATCH 00/34] biops: add atomig find_bit() operations Yury Norov
2023-11-18 15:50 ` [PATCH 01/34] lib/find: add atomic find_bit() primitives Yury Norov
@ 2023-11-18 15:50 ` Yury Norov
2023-11-20 14:26 ` Vitaly Kuznetsov
2023-11-18 16:18 ` [PATCH 00/34] biops: add atomig find_bit() operations Bart Van Assche
2 siblings, 1 reply; 8+ messages in thread
From: Yury Norov @ 2023-11-18 15:50 UTC (permalink / raw)
To: linux-kernel, Vitaly Kuznetsov, Sean Christopherson,
Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, kvm
Cc: Yury Norov, Jan Kara, Mirsad Todorovac, Matthew Wilcox,
Rasmus Villemoes, Andy Shevchenko, Maxim Kuvyrkov, Alexey Klimov
The function traverses stimer_pending_bitmap in a for-loop bit by bit.
We can do it faster by using the atomic for_each_test_and_clear_bit() iterator.
While here, refactor the logic by decreasing indentation level
and dropping the 2nd check for stimer->config.enable.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
arch/x86/kvm/hyperv.c | 39 +++++++++++++++++++--------------------
1 file changed, 19 insertions(+), 20 deletions(-)
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 238afd7335e4..460e300b558b 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -870,27 +870,26 @@ void kvm_hv_process_stimers(struct kvm_vcpu *vcpu)
if (!hv_vcpu)
return;
- for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++)
- if (test_and_clear_bit(i, hv_vcpu->stimer_pending_bitmap)) {
- stimer = &hv_vcpu->stimer[i];
- if (stimer->config.enable) {
- exp_time = stimer->exp_time;
-
- if (exp_time) {
- time_now =
- get_time_ref_counter(vcpu->kvm);
- if (time_now >= exp_time)
- stimer_expiration(stimer);
- }
-
- if ((stimer->config.enable) &&
- stimer->count) {
- if (!stimer->msg_pending)
- stimer_start(stimer);
- } else
- stimer_cleanup(stimer);
- }
+ for_each_test_and_clear_bit(i, hv_vcpu->stimer_pending_bitmap,
+ ARRAY_SIZE(hv_vcpu->stimer)) {
+ stimer = &hv_vcpu->stimer[i];
+ if (!stimer->config.enable)
+ continue;
+
+ exp_time = stimer->exp_time;
+
+ if (exp_time) {
+ time_now = get_time_ref_counter(vcpu->kvm);
+ if (time_now >= exp_time)
+ stimer_expiration(stimer);
}
+
+ if (stimer->count) {
+ if (!stimer->msg_pending)
+ stimer_start(stimer);
+ } else
+ stimer_cleanup(stimer);
+ }
}
void kvm_hv_vcpu_uninit(struct kvm_vcpu *vcpu)
--
2.39.2
* Re: [PATCH 00/34] biops: add atomig find_bit() operations
2023-11-18 15:50 [PATCH 00/34] biops: add atomig find_bit() operations Yury Norov
2023-11-18 15:50 ` [PATCH 01/34] lib/find: add atomic find_bit() primitives Yury Norov
2023-11-18 15:50 ` [PATCH 13/34] KVM: x86: hyper-v: optimize and cleanup kvm_hv_process_stimers() Yury Norov
@ 2023-11-18 16:18 ` Bart Van Assche
2023-11-18 19:06 ` Sergey Shtylyov
2 siblings, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2023-11-18 16:18 UTC (permalink / raw)
To: Yury Norov, linux-kernel, David S. Miller, H. Peter Anvin,
James E.J. Bottomley, K. Y. Srinivasan, Md. Haris Iqbal,
Akinobu Mita, Andrew Morton, Bjorn Andersson, Borislav Petkov,
Chaitanya Kulkarni, Christian Brauner, Damien Le Moal,
Dave Hansen, David Disseldorp, Edward Cree, Eric Dumazet,
Fenghua Yu, Geert Uytterhoeven, Greg Kroah-Hartman,
Gregory Greenman, Hans Verkuil, Hans de Goede, Hugh Dickins,
Ingo Molnar, Jakub Kicinski, Jaroslav Kysela, Jason Gunthorpe,
Jens Axboe, Jiri Pirko, Jiri Slaby, Kalle Valo, Karsten Graul,
Karsten Keil, Kees Cook, Leon Romanovsky, Mark Rutland,
Martin Habets, Mauro Carvalho Chehab, Michael Ellerman,
Michal Simek, Nicholas Piggin, Oliver Neukum, Paolo Abeni,
Paolo Bonzini, Peter Zijlstra, Ping-Ke Shih, Rich Felker,
Rob Herring, Robin Murphy, Sathya Prakash Veerichetty,
Sean Christopherson, Shuai Xue, Stanislaw Gruszka, Steven Rostedt,
Thomas Bogendoerfer, Thomas Gleixner, Valentin Schneider,
Vitaly Kuznetsov, Wenjia Zhang, Will Deacon, Yoshinori Sato,
GR-QLogic-Storage-Upstream, alsa-devel, ath10k, dmaengine, iommu,
kvm, linux-arm-kernel, linux-arm-msm, linux-block,
linux-bluetooth, linux-hyperv, linux-m68k, linux-media,
linux-mips, linux-net-drivers, linux-pci, linux-rdma, linux-s390,
linux-scsi, linux-serial, linux-sh, linux-sound, linux-usb,
linux-wireless, linuxppc-dev, mpi3mr-linuxdrv.pdl, netdev,
sparclinux, x86
Cc: Jan Kara, Mirsad Todorovac, Matthew Wilcox, Rasmus Villemoes,
Andy Shevchenko, Maxim Kuvyrkov, Alexey Klimov
On 11/18/23 07:50, Yury Norov wrote:
> Add helpers around test_and_{set,clear}_bit() that allow to search for
> clear or set bits and flip them atomically.
There is a typo in the subject: shouldn't "atomig" be changed
into "atomic"?
Thanks,
Bart.
* Re: [PATCH 01/34] lib/find: add atomic find_bit() primitives
2023-11-18 15:50 ` [PATCH 01/34] lib/find: add atomic find_bit() primitives Yury Norov
@ 2023-11-18 16:23 ` Bart Van Assche
0 siblings, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2023-11-18 16:23 UTC (permalink / raw)
To: Yury Norov, linux-kernel, David S. Miller, H. Peter Anvin,
James E.J. Bottomley, K. Y. Srinivasan, Md. Haris Iqbal,
Akinobu Mita, Andrew Morton, Bjorn Andersson, Borislav Petkov,
Chaitanya Kulkarni, Christian Brauner, Damien Le Moal,
Dave Hansen, David Disseldorp, Edward Cree, Eric Dumazet,
Fenghua Yu, Geert Uytterhoeven, Greg Kroah-Hartman,
Gregory Greenman, Hans Verkuil, Hans de Goede, Hugh Dickins,
Ingo Molnar, Jakub Kicinski, Jaroslav Kysela, Jason Gunthorpe,
Jens Axboe, Jiri Pirko, Jiri Slaby, Kalle Valo, Karsten Graul,
Karsten Keil, Kees Cook, Leon Romanovsky, Mark Rutland,
Martin Habets, Mauro Carvalho Chehab, Michael Ellerman,
Michal Simek, Nicholas Piggin, Oliver Neukum, Paolo Abeni,
Paolo Bonzini, Peter Zijlstra, Ping-Ke Shih, Rich Felker,
Rob Herring, Robin Murphy, Sathya Prakash Veerichetty,
Sean Christopherson, Shuai Xue, Stanislaw Gruszka, Steven Rostedt,
Thomas Bogendoerfer, Thomas Gleixner, Valentin Schneider,
Vitaly Kuznetsov, Wenjia Zhang, Will Deacon, Yoshinori Sato,
GR-QLogic-Storage-Upstream, alsa-devel, ath10k, dmaengine, iommu,
kvm, linux-arm-kernel, linux-arm-msm, linux-block,
linux-bluetooth, linux-hyperv, linux-m68k, linux-media,
linux-mips, linux-net-drivers, linux-pci, linux-rdma, linux-s390,
linux-scsi, linux-serial, linux-sh, linux-sound, linux-usb,
linux-wireless, linuxppc-dev, mpi3mr-linuxdrv.pdl, netdev,
sparclinux, x86
Cc: Jan Kara, Mirsad Todorovac, Matthew Wilcox, Rasmus Villemoes,
Andy Shevchenko, Maxim Kuvyrkov, Alexey Klimov
On 11/18/23 07:50, Yury Norov wrote:
> Add helpers around test_and_{set,clear}_bit() that allow to search for
> clear or set bits and flip them atomically.
Has it been considered to add kunit tests for the new functions?
Thanks,
Bart.
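
For readers following the thread, the "find a set bit and clear it atomically" behaviour described in the quoted cover letter can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: model_find_and_clear_bit() is a hypothetical name, it handles a single-word bitmap, it uses C11 atomics plus the GCC/Clang __builtin_ctzl() builtin, and it is not the implementation from the series.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical userspace model, NOT the kernel implementation.
 * Find the lowest set bit below nbits in a one-word bitmap, clear it
 * atomically and return its index. Returns nbits if no bit is set.
 */
static unsigned int model_find_and_clear_bit(_Atomic unsigned long *map,
					     unsigned int nbits)
{
	unsigned long limit, old, bit;

	assert(nbits > 0 && nbits <= WORD_BITS);
	limit = nbits == WORD_BITS ? ~0UL : (1UL << nbits) - 1;

	for (;;) {
		old = atomic_load(map) & limit;
		if (!old)
			return nbits;

		bit = old & -old;	/* isolate the lowest set bit */

		/*
		 * atomic_fetch_and() returns the previous value, so the
		 * caller that observes the bit still set is the one that
		 * cleared it; everyone else retries.
		 */
		if (atomic_fetch_and(map, ~bit) & bit)
			return (unsigned int)__builtin_ctzl(bit);
	}
}
```

A unit test for such a helper would typically seed a bitmap with a known pattern, call the helper until it returns nbits, and check that the indices come back in ascending order and that the bitmap ends up empty.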
* Re: [PATCH 00/34] biops: add atomig find_bit() operations
2023-11-18 16:18 ` [PATCH 00/34] biops: add atomig find_bit() operations Bart Van Assche
@ 2023-11-18 19:06 ` Sergey Shtylyov
0 siblings, 0 replies; 8+ messages in thread
From: Sergey Shtylyov @ 2023-11-18 19:06 UTC (permalink / raw)
To: Bart Van Assche, Yury Norov, linux-kernel, David S. Miller,
H. Peter Anvin, James E.J. Bottomley, K. Y. Srinivasan,
Md. Haris Iqbal, Akinobu Mita, Andrew Morton, Bjorn Andersson,
Borislav Petkov, Chaitanya Kulkarni, Christian Brauner,
Damien Le Moal, Dave Hansen, David Disseldorp, Edward Cree,
Eric Dumazet, Fenghua Yu, Geert Uytterhoeven, Greg Kroah-Hartman,
Gregory Greenman, Hans Verkuil, Hans de Goede, Hugh Dickins,
Ingo Molnar, Jakub Kicinski, Jaroslav Kysela, Jason Gunthorpe,
Jens Axboe, Jiri Pirko, Jiri Slaby, Kalle Valo, Karsten Graul,
Karsten Keil, Kees Cook, Leon Romanovsky, Mark Rutland,
Martin Habets, Mauro Carvalho Chehab, Michael Ellerman,
Michal Simek, Nicholas Piggin, Oliver Neukum, Paolo Abeni,
Paolo Bonzini, Peter Zijlstra, Ping-Ke Shih, Rich Felker,
Rob Herring, Robin Murphy, Sathya Prakash Veerichetty,
Sean Christopherson, Shuai Xue, Stanislaw Gruszka, Steven Rostedt,
Thomas Bogendoerfer, Thomas Gleixner, Valentin Schneider,
Vitaly Kuznetsov, Wenjia Zhang, Will Deacon, Yoshinori Sato,
GR-QLogic-Storage-Upstream, alsa-devel, ath10k, dmaengine, iommu,
kvm, linux-arm-kernel, linux-arm-msm, linux-block,
linux-bluetooth, linux-hyperv, linux-m68k, linux-media,
linux-mips, linux-net-drivers, linux-pci, linux-rdma, linux-s390,
linux-scsi, linux-serial, linux-sh, linux-sound, linux-usb,
linux-wireless, linuxppc-dev, mpi3mr-linuxdrv.pdl, netdev,
sparclinux, x86
Cc: Jan Kara, Mirsad Todorovac, Matthew Wilcox, Rasmus Villemoes,
Andy Shevchenko, Maxim Kuvyrkov, Alexey Klimov
On 11/18/23 7:18 PM, Bart Van Assche wrote:
[...]
>> Add helpers around test_and_{set,clear}_bit() that allow to search for
>> clear or set bits and flip them atomically.
>
> There is a typo in the subject: shouldn't "atomig" be changed
> into "atomic"?
And "biops" to "bitops"? :-)
> Thanks,
>
> Bart.
MBR, Sergey
* Re: [PATCH 13/34] KVM: x86: hyper-v: optimize and cleanup kvm_hv_process_stimers()
2023-11-18 15:50 ` [PATCH 13/34] KVM: x86: hyper-v: optimize and cleanup kvm_hv_process_stimers() Yury Norov
@ 2023-11-20 14:26 ` Vitaly Kuznetsov
2023-11-21 13:35 ` Yury Norov
0 siblings, 1 reply; 8+ messages in thread
From: Vitaly Kuznetsov @ 2023-11-20 14:26 UTC (permalink / raw)
To: Yury Norov, linux-kernel, Sean Christopherson, Paolo Bonzini,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, kvm
Cc: Yury Norov, Jan Kara, Mirsad Todorovac, Matthew Wilcox,
Rasmus Villemoes, Andy Shevchenko, Maxim Kuvyrkov, Alexey Klimov
Yury Norov <yury.norov@gmail.com> writes:
> The function traverses stimer_pending_bitmap in a for-loop bit by bit.
> We can do it faster by using atomic find_and_set_bit().
>
> While here, refactor the logic by decreasing indentation level
> and dropping 2nd check for stimer->config.enable.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
> arch/x86/kvm/hyperv.c | 39 +++++++++++++++++++--------------------
> 1 file changed, 19 insertions(+), 20 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 238afd7335e4..460e300b558b 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -870,27 +870,26 @@ void kvm_hv_process_stimers(struct kvm_vcpu *vcpu)
> if (!hv_vcpu)
> return;
>
> - for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++)
> - if (test_and_clear_bit(i, hv_vcpu->stimer_pending_bitmap)) {
> - stimer = &hv_vcpu->stimer[i];
> - if (stimer->config.enable) {
> - exp_time = stimer->exp_time;
> -
> - if (exp_time) {
> - time_now =
> - get_time_ref_counter(vcpu->kvm);
> - if (time_now >= exp_time)
> - stimer_expiration(stimer);
> - }
> -
> - if ((stimer->config.enable) &&
> - stimer->count) {
> - if (!stimer->msg_pending)
> - stimer_start(stimer);
> - } else
> - stimer_cleanup(stimer);
> - }
> + for_each_test_and_clear_bit(i, hv_vcpu->stimer_pending_bitmap,
> + ARRAY_SIZE(hv_vcpu->stimer)) {
> + stimer = &hv_vcpu->stimer[i];
> + if (!stimer->config.enable)
> + continue;
> +
> + exp_time = stimer->exp_time;
> +
> + if (exp_time) {
> + time_now = get_time_ref_counter(vcpu->kvm);
> + if (time_now >= exp_time)
> + stimer_expiration(stimer);
> }
> +
> + if (stimer->count) {
You can't drop 'stimer->config.enable' check here as stimer_expiration()
call above actually changes it. This is done on purpose: oneshot timers
fire only once so 'config.enable' is reset to 0.
> + if (!stimer->msg_pending)
> + stimer_start(stimer);
> + } else
> + stimer_cleanup(stimer);
> + }
> }
>
> void kvm_hv_vcpu_uninit(struct kvm_vcpu *vcpu)
--
Vitaly
* Re: [PATCH 13/34] KVM: x86: hyper-v: optimize and cleanup kvm_hv_process_stimers()
2023-11-20 14:26 ` Vitaly Kuznetsov
@ 2023-11-21 13:35 ` Yury Norov
0 siblings, 0 replies; 8+ messages in thread
From: Yury Norov @ 2023-11-21 13:35 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: linux-kernel, Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
kvm, Jan Kara, Mirsad Todorovac, Matthew Wilcox, Rasmus Villemoes,
Andy Shevchenko, Maxim Kuvyrkov, Alexey Klimov
On Mon, Nov 20, 2023 at 03:26:08PM +0100, Vitaly Kuznetsov wrote:
> Yury Norov <yury.norov@gmail.com> writes:
>
> > The function traverses stimer_pending_bitmap in a for-loop bit by bit.
> > We can do it faster by using atomic find_and_set_bit().
> >
> > While here, refactor the logic by decreasing indentation level
> > and dropping 2nd check for stimer->config.enable.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> > arch/x86/kvm/hyperv.c | 39 +++++++++++++++++++--------------------
> > 1 file changed, 19 insertions(+), 20 deletions(-)
> >
> > diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> > index 238afd7335e4..460e300b558b 100644
> > --- a/arch/x86/kvm/hyperv.c
> > +++ b/arch/x86/kvm/hyperv.c
> > @@ -870,27 +870,26 @@ void kvm_hv_process_stimers(struct kvm_vcpu *vcpu)
> > if (!hv_vcpu)
> > return;
> >
> > - for (i = 0; i < ARRAY_SIZE(hv_vcpu->stimer); i++)
> > - if (test_and_clear_bit(i, hv_vcpu->stimer_pending_bitmap)) {
> > - stimer = &hv_vcpu->stimer[i];
> > - if (stimer->config.enable) {
> > - exp_time = stimer->exp_time;
> > -
> > - if (exp_time) {
> > - time_now =
> > - get_time_ref_counter(vcpu->kvm);
> > - if (time_now >= exp_time)
> > - stimer_expiration(stimer);
> > - }
> > -
> > - if ((stimer->config.enable) &&
> > - stimer->count) {
> > - if (!stimer->msg_pending)
> > - stimer_start(stimer);
> > - } else
> > - stimer_cleanup(stimer);
> > - }
> > + for_each_test_and_clear_bit(i, hv_vcpu->stimer_pending_bitmap,
> > + ARRAY_SIZE(hv_vcpu->stimer)) {
> > + stimer = &hv_vcpu->stimer[i];
> > + if (!stimer->config.enable)
> > + continue;
> > +
> > + exp_time = stimer->exp_time;
> > +
> > + if (exp_time) {
> > + time_now = get_time_ref_counter(vcpu->kvm);
> > + if (time_now >= exp_time)
> > + stimer_expiration(stimer);
> > }
> > +
> > + if (stimer->count) {
>
> You can't drop 'stimer->config.enable' check here as stimer_expiration()
> call above actually changes it. This is done on purpose: oneshot timers
> fire only once so 'config.enable' is reset to 0.
Ok, I see. Will fix in v2
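
To illustrate the point raised in the review, here is a small userspace model of the per-timer logic with the second enable check kept. It is a sketch only: the model_* names, the struct layout and the simplified expiration handler are assumptions made for illustration, not the kernel code and not the actual v2. The one behaviour taken from the thread is that expiration resets the enable flag for oneshot timers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the synthetic timer state. */
struct model_stimer {
	bool enable;		/* models stimer->config.enable */
	bool periodic;		/* assumption: periodic timers stay enabled */
	bool msg_pending;
	uint64_t exp_time;
	uint64_t count;
	bool started;		/* set by model_stimer_start() */
	bool cleaned;		/* set by model_stimer_cleanup() */
};

/* Simplified: a oneshot timer disables itself once it has fired. */
static void model_stimer_expiration(struct model_stimer *t)
{
	if (!t->periodic)
		t->enable = false;
}

static void model_stimer_start(struct model_stimer *t)
{
	t->started = true;
}

static void model_stimer_cleanup(struct model_stimer *t)
{
	t->cleaned = true;
}

/* Body of the loop for one pending timer. */
static void model_process_stimer(struct model_stimer *t, uint64_t time_now)
{
	assert(t);

	if (!t->enable)
		return;

	if (t->exp_time && time_now >= t->exp_time)
		model_stimer_expiration(t);

	/* Re-check enable: the expiration call above may have cleared it. */
	if (t->enable && t->count) {
		if (!t->msg_pending)
			model_stimer_start(t);
	} else {
		model_stimer_cleanup(t);
	}
}
```

With the re-check in place, an expired oneshot timer falls through to the cleanup branch instead of being restarted, while a periodic timer with a non-zero count is started again.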