* [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion
@ 2023-11-13 17:37 Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 01/11] bitops: add missing prototype check Alexander Lobakin
` (11 more replies)
0 siblings, 12 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
Based on top of "lib/bitmap: add bitmap_{read,write}()"[0] from Alexander
Potapenko as it uses those new bitmap_{read,write}() functions to not
introduce another pair of similar ones.
Derived from the PFCP support series[1] as this grew bigger (2 -> 13
commits in v2) and involved more core bitmap changes, finally transforming
into a pure bitmap series. The actual mentioned ip_tunnel flags conversion
from `__be16` to bitmaps will be submitted bundled with the PFCP set after
this one lands.
Little breakdown:
* #1, #8, #10: misc cleanups;
* #2, #5, #6, #7: symbol scope, name collisions;
* #3, #4, #9, #11: compile-time optimizations.
[0] https://lore.kernel.org/lkml/20231109151106.2385155-1-glider@google.com
[1] https://lore.kernel.org/netdev/20230721071532.613888-1-marcin.szycik@linux.intel.com
Alexander Lobakin (11):
bitops: add missing prototype check
bitops: make BYTES_TO_BITS() treewide-available
bitops: let the compiler optimize {__,}assign_bit()
linkmode: convert linkmode_{test,set,clear,mod}_bit() to macros
s390/cio: rename bitmap_size() -> idset_bitmap_size()
fs/ntfs3: add prefix to bitmap_size() and use BITS_TO_U64()
btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits()
tools: move alignment-related macros to new <linux/align.h>
bitmap: introduce generic optimized bitmap_size()
bitmap: make bitmap_{get,set}_value8() use bitmap_{read,write}()
lib/bitmap: add compile-time test for __assign_bit() optimization
drivers/md/dm-clone-metadata.c | 5 ----
drivers/s390/cio/idset.c | 12 +++++----
fs/btrfs/free-space-cache.c | 8 +++---
fs/ntfs3/bitmap.c | 4 +--
fs/ntfs3/fsntfs.c | 2 +-
fs/ntfs3/index.c | 11 ++++----
fs/ntfs3/ntfs_fs.h | 4 +--
fs/ntfs3/super.c | 2 +-
include/linux/bitmap.h | 46 ++++++++--------------------------
include/linux/bitops.h | 23 ++++++-----------
include/linux/cpumask.h | 2 +-
include/linux/linkmode.h | 27 +++-----------------
kernel/trace/trace_probe.c | 2 --
lib/math/prime_numbers.c | 2 --
lib/test_bitmap.c | 18 +++++++------
tools/include/linux/align.h | 11 ++++++++
tools/include/linux/bitmap.h | 9 ++++---
tools/include/linux/bitops.h | 2 ++
tools/include/linux/mm.h | 5 +---
tools/perf/util/probe-finder.c | 4 +--
20 files changed, 75 insertions(+), 124 deletions(-)
create mode 100644 tools/include/linux/align.h
---
From v2[2]:
* 00: drop ip_tunnel part: will be sent together with the PFCP part once
this gets accepted (Jakub, Yury);
* 02: fix format string literal build warning due to the new generic
macro having different implicit type (Stephen);
* 04: pick Acked-by (Jakub);
* 08: new;
* 09: pick Acked-by (Yury), fix including the header not existing under
the tools/ tree (via #8) (Stephen).
From v1[3]:
* 03: convert assign_bit() to a macro as well, saves some bytes and
looks more consistent (Yury);
* 03: enclose each argument in its own pair of braces (Yury);
* 06: use generic BITS_TO_U64() while at it (Yury);
* 07: pick Acked-by (David);
* 08: Acked-by, use bitmap_size() in the code from 05 as well (Yury);
* 09: instead of introducing a new pair of functions, use generic
bitmap_{read,write}() from [0]. bloat-o-meter shows no regressions
from the switch (Yury, also Andy).
Old pfcp -> bitmap changelog:
As for former commits (now 10 and 11), almost all of the changes were
suggested by Andy, notably: stop violating bitmap API, use
__assign_bit() where appropriate, and add more tests to make sure
everything works as expected. Apart from that, add simple wrappers for
bitmap_*() used in the IP tunnel code to avoid manually specifying
``__IP_TUNNEL_FLAG_NUM`` each time.
[2] https://lore.kernel.org/lkml/20231016165247.14212-1-aleksander.lobakin@intel.com
[3] https://lore.kernel.org/lkml/20231009151026.66145-1-aleksander.lobakin@intel.com
--
2.41.0
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v3 01/11] bitops: add missing prototype check
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 02/11] bitops: make BYTES_TO_BITS() treewide-available Alexander Lobakin
` (10 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
Commit 8238b4579866 ("wait_on_bit: add an acquire memory barrier") added
a new bitop, test_bit_acquire(), with proper wrapping in order to try to
optimize it at compile time, but missed adding it to the list of bitops
whose prototypes are checked a bit below.
The functions added have consistent prototypes, so no further changes
are required and no functional changes take place.
Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
include/linux/bitops.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 2ba557e067fe..f7f5a783da2a 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -80,6 +80,7 @@ __check_bitop_pr(__test_and_set_bit);
__check_bitop_pr(__test_and_clear_bit);
__check_bitop_pr(__test_and_change_bit);
__check_bitop_pr(test_bit);
+__check_bitop_pr(test_bit_acquire);
#undef __check_bitop_pr
--
2.41.0
* [PATCH v3 02/11] bitops: make BYTES_TO_BITS() treewide-available
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 01/11] bitops: add missing prototype check Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 03/11] bitops: let the compiler optimize {__,}assign_bit() Alexander Lobakin
` (9 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
Avoid open-coding that simple expression each time by moving
BYTES_TO_BITS() from the probes code to <linux/bitops.h> to export
it to the rest of the kernel.
Simplify the macro while at it. `BITS_PER_LONG / sizeof(long)` always
equals %BITS_PER_BYTE, regardless of the target architecture.
Do the same for the tools ecosystem as well (incl. its version of
bitops.h). The previous implementation had an implicit result type of
long, while the new one's is int, so adjust the format literal in the
perf code accordingly.
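The identity behind the simplification can be checked in a standalone sketch (BITS_PER_LONG and BITS_PER_BYTE are redefined here for userspace; the OLD_ name is invented for the comparison):

```c
#include <limits.h>

#define BITS_PER_BYTE		CHAR_BIT
#define BITS_PER_LONG		(CHAR_BIT * (int)sizeof(long))

/* the old probes-local definition vs the new generic one */
#define OLD_BYTES_TO_BITS(nb)	((BITS_PER_LONG * (nb)) / (int)sizeof(long))
#define BYTES_TO_BITS(nb)	((nb) * BITS_PER_BYTE)
```

Since BITS_PER_LONG is by definition BITS_PER_BYTE * sizeof(long), the division by sizeof(long) cancels and both forms reduce to `nb * 8` on any architecture.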
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
include/linux/bitops.h | 2 ++
kernel/trace/trace_probe.c | 2 --
tools/include/linux/bitops.h | 2 ++
tools/perf/util/probe-finder.c | 4 +---
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index f7f5a783da2a..e0cd09eb91cd 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -21,6 +21,8 @@
#define BITS_TO_U32(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
#define BITS_TO_BYTES(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(char))
+#define BYTES_TO_BITS(nb) ((nb) * BITS_PER_BYTE)
+
extern unsigned int __sw_hweight8(unsigned int w);
extern unsigned int __sw_hweight16(unsigned int w);
extern unsigned int __sw_hweight32(unsigned int w);
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 4dc74d73fc1d..2b743c1e37db 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -1053,8 +1053,6 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
return ret;
}
-#define BYTES_TO_BITS(nb) ((BITS_PER_LONG * (nb)) / sizeof(long))
-
/* Bitfield type needs to be parsed into a fetch function */
static int __parse_bitfield_probe_arg(const char *bf,
const struct fetch_type *t,
diff --git a/tools/include/linux/bitops.h b/tools/include/linux/bitops.h
index f18683b95ea6..bc6600466e7b 100644
--- a/tools/include/linux/bitops.h
+++ b/tools/include/linux/bitops.h
@@ -20,6 +20,8 @@
#define BITS_TO_U32(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
#define BITS_TO_BYTES(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(char))
+#define BYTES_TO_BITS(nb) ((nb) * BITS_PER_BYTE)
+
extern unsigned int __sw_hweight8(unsigned int w);
extern unsigned int __sw_hweight16(unsigned int w);
extern unsigned int __sw_hweight32(unsigned int w);
diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index f171360b0ef4..f967379ca42e 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -304,8 +304,6 @@ static int convert_variable_location(Dwarf_Die *vr_die, Dwarf_Addr addr,
return ret2;
}
-#define BYTES_TO_BITS(nb) ((nb) * BITS_PER_LONG / sizeof(long))
-
static int convert_variable_type(Dwarf_Die *vr_die,
struct probe_trace_arg *tvar,
const char *cast, bool user_access)
@@ -335,7 +333,7 @@ static int convert_variable_type(Dwarf_Die *vr_die,
total = dwarf_bytesize(vr_die);
if (boffs < 0 || total < 0)
return -ENOENT;
- ret = snprintf(buf, 16, "b%d@%d/%zd", bsize, boffs,
+ ret = snprintf(buf, 16, "b%d@%d/%d", bsize, boffs,
BYTES_TO_BITS(total));
goto formatted;
}
--
2.41.0
* [PATCH v3 03/11] bitops: let the compiler optimize {__,}assign_bit()
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 01/11] bitops: add missing prototype check Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 02/11] bitops: make BYTES_TO_BITS() treewide-available Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 04/11] linkmode: convert linkmode_{test,set,clear,mod}_bit() to macros Alexander Lobakin
` (8 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops
on compile-time constants"), the compilers are able to expand inline
bitmap operations to compile-time initializers when possible.
However, during the round of replacement if-__set-else-__clear with
__assign_bit() as per Andy's advice, bloat-o-meter showed +1024 bytes
difference in object code size for one module (even one function),
where the pattern:
DECLARE_BITMAP(foo, NBITS) = { }; // on the stack, zeroed
if (a)
__set_bit(const_bit_num, foo);
if (b)
__set_bit(another_const_bit_num, foo);
...
is heavily used, although there should be no difference: the bitmap is
zeroed, so the second (clear) half of __assign_bit() should be compiled
out as a no-op.
I either missed the fact that __assign_bit() has the bitmap pointer
marked as `volatile` (as we usually do for bitops) or was hoping that
the compilers would at least try to look past the `volatile` for
__always_inline functions. Anyhow, due to that attribute, the compilers
always compiled the whole expression and none of the mentioned
compile-time optimizations kicked in.
Convert __assign_bit() to a macro, since it's a very simple if-else and
all of the checks are performed inside __set_bit() and __clear_bit();
the wrapper thus has to be as transparent as possible. After that
change, despite it showing only a -20 byte change for vmlinux (as the
helper is still relatively unpopular), no drastic code size changes
happen when replacing if-set-else-clear for onstack bitmaps with
__assign_bit(), meaning the compiler now expands them to the actual
operations with all the expected optimizations.
Atomic assign_bit() is less affected due to its nature, but let's
convert it to a macro as well to keep the code consistent and not
leave a place for possible suboptimal codegen. Moreover, with certain
kernel configurations it actually gives some savings (x86):
do_ip_setsockopt 4154 4099 -55
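The shape of the conversion can be modeled in userspace. This is a deliberately reduced sketch: single-word, non-volatile stand-ins for __set_bit()/__clear_bit(), and a hypothetical demo_assign() helper; the kernel's real bitops take a `volatile unsigned long *` bitmap.

```c
#define __set_bit(nr, addr)	((void)(*(addr) |= 1UL << (nr)))
#define __clear_bit(nr, addr)	((void)(*(addr) &= ~(1UL << (nr))))

/* the new wrapper: a transparent ternary with no logic of its own */
#define __assign_bit(nr, addr, value)					\
	((value) ? __set_bit((nr), (addr)) : __clear_bit((nr), (addr)))

static unsigned long demo_assign(unsigned long word, unsigned int nr, int value)
{
	__assign_bit(nr, &word, value);
	return word;
}
```

Because the macro adds no `volatile`-qualified function parameter of its own, whatever constant-folding __set_bit()/__clear_bit() are capable of passes straight through.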
Suggested-by: Yury Norov <yury.norov@gmail.com> # assign_bit(), too
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
include/linux/bitops.h | 20 ++++----------------
1 file changed, 4 insertions(+), 16 deletions(-)
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index e0cd09eb91cd..b25dc8742124 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -275,23 +275,11 @@ static inline unsigned long fns(unsigned long word, unsigned int n)
* @addr: the address to start counting from
* @value: the value to assign
*/
-static __always_inline void assign_bit(long nr, volatile unsigned long *addr,
- bool value)
-{
- if (value)
- set_bit(nr, addr);
- else
- clear_bit(nr, addr);
-}
+#define assign_bit(nr, addr, value) \
+ ((value) ? set_bit((nr), (addr)) : clear_bit((nr), (addr)))
-static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
- bool value)
-{
- if (value)
- __set_bit(nr, addr);
- else
- __clear_bit(nr, addr);
-}
+#define __assign_bit(nr, addr, value) \
+ ((value) ? __set_bit((nr), (addr)) : __clear_bit((nr), (addr)))
/**
* __ptr_set_bit - Set bit in a pointer's value
--
2.41.0
* [PATCH v3 04/11] linkmode: convert linkmode_{test,set,clear,mod}_bit() to macros
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
` (2 preceding siblings ...)
2023-11-13 17:37 ` [PATCH v3 03/11] bitops: let the compiler optimize {__,}assign_bit() Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-13 19:08 ` Andrew Lunn
2023-11-13 17:37 ` [PATCH v3 05/11] s390/cio: rename bitmap_size() -> idset_bitmap_size() Alexander Lobakin
` (7 subsequent siblings)
11 siblings, 1 reply; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops
on compile-time constants"), the non-atomic bitops are macros which can
be expanded by the compilers into compile-time expressions, which will
result in better optimized object code. Unfortunately, it turned out
that passing `volatile` to those macros discards any possibility of
optimization, as the compilers then don't even try to check whether
the passed bitmap is known at compile time. In addition to that,
the mentioned linkmode helpers are marked `inline`, not
`__always_inline`, meaning it's not guaranteed that some compiler won't
uninline them for no reason, which would also effectively prevent them
from being optimized (it's a well-known thing that compilers sometimes
uninline `2 + 2`).
Convert linkmode_*_bit() from inlines to macros. Their calling
conventions are 1:1 with the corresponding bitops, so it's not even
needed to enumerate and map the arguments, only the names. No changes
in vmlinux's object code (compiled by LLVM for x86_64) whatsoever, but
that doesn't necessarily mean the change is meaningless.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
include/linux/linkmode.h | 27 ++++-----------------------
1 file changed, 4 insertions(+), 23 deletions(-)
diff --git a/include/linux/linkmode.h b/include/linux/linkmode.h
index 7303b4bc2ce0..f231e2edbfa5 100644
--- a/include/linux/linkmode.h
+++ b/include/linux/linkmode.h
@@ -38,29 +38,10 @@ static inline int linkmode_andnot(unsigned long *dst, const unsigned long *src1,
return bitmap_andnot(dst, src1, src2, __ETHTOOL_LINK_MODE_MASK_NBITS);
}
-static inline void linkmode_set_bit(int nr, volatile unsigned long *addr)
-{
- __set_bit(nr, addr);
-}
-
-static inline void linkmode_clear_bit(int nr, volatile unsigned long *addr)
-{
- __clear_bit(nr, addr);
-}
-
-static inline void linkmode_mod_bit(int nr, volatile unsigned long *addr,
- int set)
-{
- if (set)
- linkmode_set_bit(nr, addr);
- else
- linkmode_clear_bit(nr, addr);
-}
-
-static inline int linkmode_test_bit(int nr, const volatile unsigned long *addr)
-{
- return test_bit(nr, addr);
-}
+#define linkmode_test_bit test_bit
+#define linkmode_set_bit __set_bit
+#define linkmode_clear_bit __clear_bit
+#define linkmode_mod_bit __assign_bit
static inline void linkmode_set_bit_array(const int *array, int array_size,
unsigned long *addr)
--
2.41.0
* [PATCH v3 05/11] s390/cio: rename bitmap_size() -> idset_bitmap_size()
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
` (3 preceding siblings ...)
2023-11-13 17:37 ` [PATCH v3 04/11] linkmode: convert linkmode_{test,set,clear,mod}_bit() to macros Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 06/11] fs/ntfs3: add prefix to bitmap_size() and use BITS_TO_U64() Alexander Lobakin
` (6 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
bitmap_size() is a pretty generic name and one may want to use it for
a generic bitmap API function. At the same time, its logic is not
"generic", i.e. it's not just an `nbits -> size of bitmap in bytes`
converter, as would be expected from its name.
Add the prefix 'idset_' used throughout the file where the function
resides.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
idset_new() really wants its vmalloc() + memset() pair to be replaced
with vzalloc().
---
drivers/s390/cio/idset.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/s390/cio/idset.c b/drivers/s390/cio/idset.c
index 45f9c0736be4..0a1105a483bf 100644
--- a/drivers/s390/cio/idset.c
+++ b/drivers/s390/cio/idset.c
@@ -16,7 +16,7 @@ struct idset {
unsigned long bitmap[];
};
-static inline unsigned long bitmap_size(int num_ssid, int num_id)
+static inline unsigned long idset_bitmap_size(int num_ssid, int num_id)
{
return BITS_TO_LONGS(num_ssid * num_id) * sizeof(unsigned long);
}
@@ -25,11 +25,12 @@ static struct idset *idset_new(int num_ssid, int num_id)
{
struct idset *set;
- set = vmalloc(sizeof(struct idset) + bitmap_size(num_ssid, num_id));
+ set = vmalloc(sizeof(struct idset) +
+ idset_bitmap_size(num_ssid, num_id));
if (set) {
set->num_ssid = num_ssid;
set->num_id = num_id;
- memset(set->bitmap, 0, bitmap_size(num_ssid, num_id));
+ memset(set->bitmap, 0, idset_bitmap_size(num_ssid, num_id));
}
return set;
}
@@ -41,7 +42,8 @@ void idset_free(struct idset *set)
void idset_fill(struct idset *set)
{
- memset(set->bitmap, 0xff, bitmap_size(set->num_ssid, set->num_id));
+ memset(set->bitmap, 0xff,
+ idset_bitmap_size(set->num_ssid, set->num_id));
}
static inline void idset_add(struct idset *set, int ssid, int id)
--
2.41.0
* [PATCH v3 06/11] fs/ntfs3: add prefix to bitmap_size() and use BITS_TO_U64()
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
` (4 preceding siblings ...)
2023-11-13 17:37 ` [PATCH v3 05/11] s390/cio: rename bitmap_size() -> idset_bitmap_size() Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 07/11] btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits() Alexander Lobakin
` (5 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
bitmap_size() is a pretty generic name and one may want to use it for
a generic bitmap API function. At the same time, its logic is
NTFS-specific, as it aligns to sizeof(u64), not sizeof(long) (although
it rightly uses ALIGN() instead of open-coded division).
Add the prefix 'ntfs3_' used for that FS (not just 'ntfs_', to avoid
mixing it up with the legacy module) and use the generic BITS_TO_U64()
while at it.
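The before/after equivalence can be verified with a standalone sketch (a hypothetical userspace reimplementation; BITS_TO_U64() is expanded inline here, and old_bitmap_size() is an invented name for the previous formula):

```c
#include <stdint.h>
#include <stddef.h>

#define BITS_TO_U64(nr)		(((nr) + 63) / 64)
#define ALIGN8(x)		(((x) + 7) & ~(size_t)7)

/* the old form: bytes for nbits, then quad-aligned */
static inline size_t old_bitmap_size(size_t bits)
{
	return ALIGN8((bits + 7) >> 3);
}

/* the new form: whole u64 words, multiplied back to bytes */
static inline size_t ntfs3_bitmap_size(size_t bits)
{
	return BITS_TO_U64(bits) * sizeof(uint64_t);
}
```

Both round the byte count up to a multiple of 8, matching NTFS's quad-aligned on-disk bitmaps.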
Suggested-by: Yury Norov <yury.norov@gmail.com> # BITS_TO_U64()
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
fs/ntfs3/bitmap.c | 4 ++--
fs/ntfs3/fsntfs.c | 2 +-
fs/ntfs3/index.c | 11 ++++++-----
fs/ntfs3/ntfs_fs.h | 4 ++--
fs/ntfs3/super.c | 2 +-
5 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/fs/ntfs3/bitmap.c b/fs/ntfs3/bitmap.c
index 63f14a0232f6..a19a73ed630b 100644
--- a/fs/ntfs3/bitmap.c
+++ b/fs/ntfs3/bitmap.c
@@ -654,7 +654,7 @@ int wnd_init(struct wnd_bitmap *wnd, struct super_block *sb, size_t nbits)
wnd->total_zeroes = nbits;
wnd->extent_max = MINUS_ONE_T;
wnd->zone_bit = wnd->zone_end = 0;
- wnd->nwnd = bytes_to_block(sb, bitmap_size(nbits));
+ wnd->nwnd = bytes_to_block(sb, ntfs3_bitmap_size(nbits));
wnd->bits_last = nbits & (wbits - 1);
if (!wnd->bits_last)
wnd->bits_last = wbits;
@@ -1347,7 +1347,7 @@ int wnd_extend(struct wnd_bitmap *wnd, size_t new_bits)
return -EINVAL;
/* Align to 8 byte boundary. */
- new_wnd = bytes_to_block(sb, bitmap_size(new_bits));
+ new_wnd = bytes_to_block(sb, ntfs3_bitmap_size(new_bits));
new_last = new_bits & (wbits - 1);
if (!new_last)
new_last = wbits;
diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
index fbfe21dbb425..e18de9c4c2fa 100644
--- a/fs/ntfs3/fsntfs.c
+++ b/fs/ntfs3/fsntfs.c
@@ -522,7 +522,7 @@ static int ntfs_extend_mft(struct ntfs_sb_info *sbi)
ni->mi.dirty = true;
/* Step 2: Resize $MFT::BITMAP. */
- new_bitmap_bytes = bitmap_size(new_mft_total);
+ new_bitmap_bytes = ntfs3_bitmap_size(new_mft_total);
err = attr_set_size(ni, ATTR_BITMAP, NULL, 0, &sbi->mft.bitmap.run,
new_bitmap_bytes, &new_bitmap_bytes, true, NULL);
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index cf92b2433f7a..e0cef8f4e414 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -1456,8 +1456,8 @@ static int indx_create_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
alloc->nres.valid_size = alloc->nres.data_size = cpu_to_le64(data_size);
- err = ni_insert_resident(ni, bitmap_size(1), ATTR_BITMAP, in->name,
- in->name_len, &bitmap, NULL, NULL);
+ err = ni_insert_resident(ni, ntfs3_bitmap_size(1), ATTR_BITMAP,
+ in->name, in->name_len, &bitmap, NULL, NULL);
if (err)
goto out2;
@@ -1518,8 +1518,9 @@ static int indx_add_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
if (bmp) {
/* Increase bitmap. */
err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
- &indx->bitmap_run, bitmap_size(bit + 1),
- NULL, true, NULL);
+ &indx->bitmap_run,
+ ntfs3_bitmap_size(bit + 1), NULL, true,
+ NULL);
if (err)
goto out1;
}
@@ -2092,7 +2093,7 @@ static int indx_shrink(struct ntfs_index *indx, struct ntfs_inode *ni,
if (in->name == I30_NAME)
ni->vfs_inode.i_size = new_data;
- bpb = bitmap_size(bit);
+ bpb = ntfs3_bitmap_size(bit);
if (bpb * 8 == nbits)
return 0;
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index f6706143d14b..16b84d605cd2 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -961,9 +961,9 @@ static inline bool run_is_empty(struct runs_tree *run)
}
/* NTFS uses quad aligned bitmaps. */
-static inline size_t bitmap_size(size_t bits)
+static inline size_t ntfs3_bitmap_size(size_t bits)
{
- return ALIGN((bits + 7) >> 3, 8);
+ return BITS_TO_U64(bits) * sizeof(u64);
}
#define _100ns2seconds 10000000
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 9153dffde950..0248db1e5c01 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -1331,7 +1331,7 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)
/* Check bitmap boundary. */
tt = sbi->used.bitmap.nbits;
- if (inode->i_size < bitmap_size(tt)) {
+ if (inode->i_size < ntfs3_bitmap_size(tt)) {
ntfs_err(sb, "$Bitmap is corrupted.");
err = -EINVAL;
goto put_inode_out;
--
2.41.0
* [PATCH v3 07/11] btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits()
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
` (5 preceding siblings ...)
2023-11-13 17:37 ` [PATCH v3 06/11] fs/ntfs3: add prefix to bitmap_size() and use BITS_TO_U64() Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-14 0:48 ` Anand Jain
2023-11-13 17:37 ` [PATCH v3 08/11] tools: move alignment-related macros to new <linux/align.h> Alexander Lobakin
` (4 subsequent siblings)
11 siblings, 1 reply; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel,
David Sterba
bitmap_set_bits() does not start with the FS' prefix and may collide
with a new generic helper one day. It operates on the FS-specific
types, so there's no chance those two could do the same thing.
Just add the prefix to exclude such a possible conflict.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
fs/btrfs/free-space-cache.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 6f93c9a2c3e3..8f4949f0b5e2 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -1913,9 +1913,9 @@ static inline void bitmap_clear_bits(struct btrfs_free_space_ctl *ctl,
ctl->free_space -= bytes;
}
-static void bitmap_set_bits(struct btrfs_free_space_ctl *ctl,
- struct btrfs_free_space *info, u64 offset,
- u64 bytes)
+static void btrfs_bitmap_set_bits(struct btrfs_free_space_ctl *ctl,
+ struct btrfs_free_space *info, u64 offset,
+ u64 bytes)
{
unsigned long start, count, end;
int extent_delta = 1;
@@ -2251,7 +2251,7 @@ static u64 add_bytes_to_bitmap(struct btrfs_free_space_ctl *ctl,
bytes_to_set = min(end - offset, bytes);
- bitmap_set_bits(ctl, info, offset, bytes_to_set);
+ btrfs_bitmap_set_bits(ctl, info, offset, bytes_to_set);
return bytes_to_set;
--
2.41.0
* [PATCH v3 08/11] tools: move alignment-related macros to new <linux/align.h>
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
` (6 preceding siblings ...)
2023-11-13 17:37 ` [PATCH v3 07/11] btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits() Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 09/11] bitmap: introduce generic optimized bitmap_size() Alexander Lobakin
` (3 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
Currently, the tools have the *ALIGN*() macros scattered across
unrelated headers, as there are only 3 of them and each was added
separately on an as-needed basis.
Anyway, let's make it more consistent with the kernel headers and allow
using those macros outside of the mentioned headers. Create
<linux/align.h> inside the tools/ folder and include it where needed.
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
tools/include/linux/align.h | 11 +++++++++++
tools/include/linux/bitmap.h | 2 +-
tools/include/linux/mm.h | 5 +----
3 files changed, 13 insertions(+), 5 deletions(-)
create mode 100644 tools/include/linux/align.h
diff --git a/tools/include/linux/align.h b/tools/include/linux/align.h
new file mode 100644
index 000000000000..62e5582bbb1f
--- /dev/null
+++ b/tools/include/linux/align.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _TOOLS_LINUX_ALIGN_H
+#define _TOOLS_LINUX_ALIGN_H
+
+#include <uapi/linux/const.h>
+
+#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
+#define ALIGN_DOWN(x, a) __ALIGN_KERNEL((x) - ((a) - 1), (a))
+#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
+
+#endif /* _TOOLS_LINUX_ALIGN_H */
diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index f3566ea0f932..8c6852dba04f 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -3,6 +3,7 @@
#define _TOOLS_LINUX_BITMAP_H
#include <string.h>
+#include <linux/align.h>
#include <linux/bitops.h>
#include <linux/find.h>
#include <stdlib.h>
@@ -126,7 +127,6 @@ static inline bool bitmap_and(unsigned long *dst, const unsigned long *src1,
#define BITMAP_MEM_ALIGNMENT (8 * sizeof(unsigned long))
#endif
#define BITMAP_MEM_MASK (BITMAP_MEM_ALIGNMENT - 1)
-#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
static inline bool bitmap_equal(const unsigned long *src1,
const unsigned long *src2, unsigned int nbits)
diff --git a/tools/include/linux/mm.h b/tools/include/linux/mm.h
index f3c82ab5b14c..7a6b98f4e579 100644
--- a/tools/include/linux/mm.h
+++ b/tools/include/linux/mm.h
@@ -2,8 +2,8 @@
#ifndef _TOOLS_LINUX_MM_H
#define _TOOLS_LINUX_MM_H
+#include <linux/align.h>
#include <linux/mmzone.h>
-#include <uapi/linux/const.h>
#define PAGE_SHIFT 12
#define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT)
@@ -11,9 +11,6 @@
#define PHYS_ADDR_MAX (~(phys_addr_t)0)
-#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
-#define ALIGN_DOWN(x, a) __ALIGN_KERNEL((x) - ((a) - 1), (a))
-
#define PAGE_ALIGN(addr) ALIGN(addr, PAGE_SIZE)
#define __va(x) ((void *)((unsigned long)(x)))
--
2.41.0
* [PATCH v3 09/11] bitmap: introduce generic optimized bitmap_size()
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
` (7 preceding siblings ...)
2023-11-13 17:37 ` [PATCH v3 08/11] tools: move alignment-related macros to new <linux/align.h> Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 10/11] bitmap: make bitmap_{get,set}_value8() use bitmap_{read,write}() Alexander Lobakin
` (2 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
The number of times yet another open-coded
`BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge.
Some generic helper is long overdue.
Add one, bitmap_size(), but with one detail.
BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both
the dividend and the divisor are compile-time constants or when the
divisor is not a pow-of-2. When it is, however, the compilers sometimes
tend to generate suboptimal code (GCC 13):
48 83 c0 3f add $0x3f,%rax
48 c1 e8 06 shr $0x6,%rax
48 8d 14 c5 00 00 00 00 lea 0x0(,%rax,8),%rdx
BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does
a full division of `nbits + 63` by it and then a multiplication by 8.
Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC:
8d 50 3f lea 0x3f(%rax),%edx
c1 ea 03 shr $0x3,%edx
81 e2 f8 ff ff 1f and $0x1ffffff8,%edx
Now it shifts `nbits + 63` by 3 positions (IOW performs fast division
by 8) and then masks bits[2:0]. bloat-o-meter:
add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617)
Clang handles this better and generates identical code before and after
the change starting from -O1, except that with the ALIGN() approach it
uses %edx and thus still saves some bytes:
add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520)
Note that we can't expand DIV_ROUND_UP() by adding a check and using
this approach there, as it's used in array declarations, where such
expressions are not allowed.
Add this helper to tools/ as well.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
drivers/md/dm-clone-metadata.c | 5 -----
drivers/s390/cio/idset.c | 2 +-
include/linux/bitmap.h | 8 +++++---
include/linux/cpumask.h | 2 +-
lib/math/prime_numbers.c | 2 --
tools/include/linux/bitmap.h | 7 ++++---
6 files changed, 11 insertions(+), 15 deletions(-)
diff --git a/drivers/md/dm-clone-metadata.c b/drivers/md/dm-clone-metadata.c
index c43d55672bce..47c1fa7aad8b 100644
--- a/drivers/md/dm-clone-metadata.c
+++ b/drivers/md/dm-clone-metadata.c
@@ -465,11 +465,6 @@ static void __destroy_persistent_data_structures(struct dm_clone_metadata *cmd)
/*---------------------------------------------------------------------------*/
-static size_t bitmap_size(unsigned long nr_bits)
-{
- return BITS_TO_LONGS(nr_bits) * sizeof(long);
-}
-
static int __dirty_map_init(struct dirty_map *dmap, unsigned long nr_words,
unsigned long nr_regions)
{
diff --git a/drivers/s390/cio/idset.c b/drivers/s390/cio/idset.c
index 0a1105a483bf..e5f28370a903 100644
--- a/drivers/s390/cio/idset.c
+++ b/drivers/s390/cio/idset.c
@@ -18,7 +18,7 @@ struct idset {
static inline unsigned long idset_bitmap_size(int num_ssid, int num_id)
{
- return BITS_TO_LONGS(num_ssid * num_id) * sizeof(unsigned long);
+ return bitmap_size(size_mul(num_ssid, num_id));
}
static struct idset *idset_new(int num_ssid, int num_id)
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 7ca0379be8c1..9a6a27a7f675 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -218,9 +218,11 @@ void bitmap_fold(unsigned long *dst, const unsigned long *orig,
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
+#define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE)
+
static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
{
- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ unsigned int len = bitmap_size(nbits);
if (small_const_nbits(nbits))
*dst = 0;
@@ -230,7 +232,7 @@ static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
{
- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ unsigned int len = bitmap_size(nbits);
if (small_const_nbits(nbits))
*dst = ~0UL;
@@ -241,7 +243,7 @@ static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
unsigned int nbits)
{
- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ unsigned int len = bitmap_size(nbits);
if (small_const_nbits(nbits))
*dst = *src;
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index cfb545841a2c..a2064c2a9441 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -839,7 +839,7 @@ static inline int cpulist_parse(const char *buf, struct cpumask *dstp)
*/
static inline unsigned int cpumask_size(void)
{
- return BITS_TO_LONGS(large_cpumask_bits) * sizeof(long);
+ return bitmap_size(large_cpumask_bits);
}
/*
diff --git a/lib/math/prime_numbers.c b/lib/math/prime_numbers.c
index d42cebf7407f..d3b64b10da1c 100644
--- a/lib/math/prime_numbers.c
+++ b/lib/math/prime_numbers.c
@@ -6,8 +6,6 @@
#include <linux/prime_numbers.h>
#include <linux/slab.h>
-#define bitmap_size(nbits) (BITS_TO_LONGS(nbits) * sizeof(unsigned long))
-
struct primes {
struct rcu_head rcu;
unsigned long last, sz;
diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index 8c6852dba04f..210c13b1b857 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -26,13 +26,14 @@ bool __bitmap_intersects(const unsigned long *bitmap1,
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
+#define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE)
+
static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
{
if (small_const_nbits(nbits))
*dst = 0UL;
else {
- int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
- memset(dst, 0, len);
+ memset(dst, 0, bitmap_size(nbits));
}
}
@@ -84,7 +85,7 @@ static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
*/
static inline unsigned long *bitmap_zalloc(int nbits)
{
- return calloc(1, BITS_TO_LONGS(nbits) * sizeof(unsigned long));
+ return calloc(1, bitmap_size(nbits));
}
/*
--
2.41.0
* [PATCH v3 10/11] bitmap: make bitmap_{get,set}_value8() use bitmap_{read,write}()
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
` (8 preceding siblings ...)
2023-11-13 17:37 ` [PATCH v3 09/11] bitmap: introduce generic optimized bitmap_size() Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-13 17:37 ` [PATCH v3 11/11] lib/bitmap: add compile-time test for __assign_bit() optimization Alexander Lobakin
2023-11-20 11:04 ` [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
11 siblings, 0 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
Now that we have the generic bitmap_read() and bitmap_write(), which are
inline and take care of the aligned and non-boundary-crossing cases to
keep them optimized, collapse bitmap_{get,set}_value8() into simple
wrappers around them.
bloat-o-meter shows no difference for vmlinux and -2 bytes for
gpio-pca953x.ko, which means the optimization didn't suffer from the
change. The converted helpers have the value width embedded as an
always-compile-time constant, which helps a lot.
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
include/linux/bitmap.h | 38 +++++---------------------------------
1 file changed, 5 insertions(+), 33 deletions(-)
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 9a6a27a7f675..f80e116b8f60 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -609,39 +609,6 @@ static inline void bitmap_from_u64(unsigned long *dst, u64 mask)
bitmap_from_arr64(dst, &mask, 64);
}
-/**
- * bitmap_get_value8 - get an 8-bit value within a memory region
- * @map: address to the bitmap memory region
- * @start: bit offset of the 8-bit value; must be a multiple of 8
- *
- * Returns the 8-bit value located at the @start bit offset within the @src
- * memory region.
- */
-static inline unsigned long bitmap_get_value8(const unsigned long *map,
- unsigned long start)
-{
- const size_t index = BIT_WORD(start);
- const unsigned long offset = start % BITS_PER_LONG;
-
- return (map[index] >> offset) & 0xFF;
-}
-
-/**
- * bitmap_set_value8 - set an 8-bit value within a memory region
- * @map: address to the bitmap memory region
- * @value: the 8-bit value; values wider than 8 bits may clobber bitmap
- * @start: bit offset of the 8-bit value; must be a multiple of 8
- */
-static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
- unsigned long start)
-{
- const size_t index = BIT_WORD(start);
- const unsigned long offset = start % BITS_PER_LONG;
-
- map[index] &= ~(0xFFUL << offset);
- map[index] |= value << offset;
-}
-
/**
* bitmap_read - read a value of n-bits from the memory region
* @map: address to the bitmap memory region
@@ -715,6 +682,11 @@ static inline void bitmap_write(unsigned long *map, unsigned long value,
map[index + 1] |= (value >> space);
}
+#define bitmap_get_value8(map, start) \
+ bitmap_read(map, start, BITS_PER_BYTE)
+#define bitmap_set_value8(map, value, start) \
+ bitmap_write(map, value, start, BITS_PER_BYTE)
+
#endif /* __ASSEMBLY__ */
#endif /* __LINUX_BITMAP_H */
--
2.41.0
* [PATCH v3 11/11] lib/bitmap: add compile-time test for __assign_bit() optimization
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
` (9 preceding siblings ...)
2023-11-13 17:37 ` [PATCH v3 10/11] bitmap: make bitmap_{get,set}_value8() use bitmap_{read,write}() Alexander Lobakin
@ 2023-11-13 17:37 ` Alexander Lobakin
2023-11-20 11:04 ` [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
11 siblings, 0 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-13 17:37 UTC (permalink / raw)
To: Yury Norov
Cc: Alexander Lobakin, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
Commit dc34d5036692 ("lib: test_bitmap: add compile-time
optimization/evaluations assertions") initially missed __assign_bit(),
so quite some time passed before I realized it wasn't getting optimized
at compile time. Now that it is, add a test for it just to make sure
nothing breaks one day.
To make things more interesting, use bitmap_complement() and
bitmap_full(), thus checking their compile-time evaluation as well. Also
remove the misleading comment mentioning the workaround that was
recently dropped in favor of adding the whole file to the GCov
exceptions.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
lib/test_bitmap.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index a6e92cf5266a..4ee1f8ceb51d 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -1204,14 +1204,7 @@ static void __init test_bitmap_const_eval(void)
* in runtime.
*/
- /*
- * Equals to `unsigned long bitmap[1] = { GENMASK(6, 5), }`.
- * Clang on s390 optimizes bitops at compile-time as intended, but at
- * the same time stops treating @bitmap and @bitopvar as compile-time
- * constants after regular test_bit() is executed, thus triggering the
- * build bugs below. So, call const_test_bit() there directly until
- * the compiler is fixed.
- */
+ /* Equals to `unsigned long bitmap[1] = { GENMASK(6, 5), }` */
bitmap_clear(bitmap, 0, BITS_PER_LONG);
if (!test_bit(7, bitmap))
bitmap_set(bitmap, 5, 2);
@@ -1243,6 +1236,15 @@ static void __init test_bitmap_const_eval(void)
/* ~BIT(25) */
BUILD_BUG_ON(!__builtin_constant_p(~var));
BUILD_BUG_ON(~var != ~BIT(25));
+
+ /* ~BIT(25) | BIT(25) == ~0UL */
+ bitmap_complement(&var, &var, BITS_PER_LONG);
+ __assign_bit(25, &var, true);
+
+ /* !(~(~0UL)) == 1 */
+ res = bitmap_full(&var, BITS_PER_LONG);
+ BUILD_BUG_ON(!__builtin_constant_p(res));
+ BUILD_BUG_ON(!res);
}
/*
--
2.41.0
* Re: [PATCH v3 04/11] linkmode: convert linkmode_{test,set,clear,mod}_bit() to macros
2023-11-13 17:37 ` [PATCH v3 04/11] linkmode: convert linkmode_{test,set,clear,mod}_bit() to macros Alexander Lobakin
@ 2023-11-13 19:08 ` Andrew Lunn
0 siblings, 0 replies; 15+ messages in thread
From: Andrew Lunn @ 2023-11-13 19:08 UTC (permalink / raw)
To: Alexander Lobakin
Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes,
Alexander Potapenko, Jakub Kicinski, Przemek Kitszel, netdev,
linux-btrfs, dm-devel, ntfs3, linux-s390, linux-kernel
On Mon, Nov 13, 2023 at 06:37:10PM +0100, Alexander Lobakin wrote:
> Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops
> on compile-time constants"), the non-atomic bitops are macros which can
> be expanded by the compilers into compile-time expressions, which will
> result in better optimized object code. Unfortunately, it turned out that
> passing `volatile` to those macros discards any possibility of
> optimization, as the compilers then don't even try to look whether
> the passed bitmap is known at compilation time. In addition to that,
> the mentioned linkmode helpers are marked with `inline`, not
> `__always_inline`, meaning that it's not guaranteed some compiler won't
> uninline them for no reason, which will also effectively prevent them
> from being optimized (it's a well-known thing the compilers sometimes
> uninline `2 + 2`).
> Convert linkmode_*_bit() from inlines to macros. Their calling
> convention is 1:1 with the corresponding bitops, so there's no need to
> enumerate and map the arguments, only the names. No changes in
> vmlinux' object code (compiled by LLVM for x86_64) whatsoever, but that
> doesn't necessarily mean the change is meaningless.
>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Acked-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
* Re: [PATCH v3 07/11] btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits()
2023-11-13 17:37 ` [PATCH v3 07/11] btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits() Alexander Lobakin
@ 2023-11-14 0:48 ` Anand Jain
0 siblings, 0 replies; 15+ messages in thread
From: Anand Jain @ 2023-11-14 0:48 UTC (permalink / raw)
To: Alexander Lobakin, Yury Norov
Cc: Andy Shevchenko, Rasmus Villemoes, Alexander Potapenko,
Jakub Kicinski, Przemek Kitszel, netdev, linux-btrfs, dm-devel,
ntfs3, linux-s390, linux-kernel, David Sterba
Reviewed-by: Anand Jain <anand.jain@oracle.com>
* Re: [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion
2023-11-13 17:37 [PATCH v3 00/11] bitmap: prereqs for ip_tunnel flags conversion Alexander Lobakin
` (10 preceding siblings ...)
2023-11-13 17:37 ` [PATCH v3 11/11] lib/bitmap: add compile-time test for __assign_bit() optimization Alexander Lobakin
@ 2023-11-20 11:04 ` Alexander Lobakin
11 siblings, 0 replies; 15+ messages in thread
From: Alexander Lobakin @ 2023-11-20 11:04 UTC (permalink / raw)
To: Yury Norov
Cc: Andy Shevchenko, Rasmus Villemoes, Alexander Potapenko,
Jakub Kicinski, Przemek Kitszel, netdev, linux-btrfs, dm-devel,
ntfs3, linux-s390, linux-kernel
From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Mon, 13 Nov 2023 18:37:06 +0100
> Based on top of "lib/bitmap: add bitmap_{read,write}()"[0] from Alexander
> Potapenko as it uses those new bitmap_{read,write}() functions to not
> introduce another pair of similar ones.
>
> Derived from the PFCP support series[1] as this grew bigger (2 -> 13
> commits in v2) and involved more core bitmap changes, finally transforming
> into a pure bitmap series. The actual mentioned ip_tunnel flags conversion
> from `__be16` to bitmaps will be submitted bundled with the PFCP set after
> this one lands.
>
> Little breakdown:
> * #1, #8, #10: misc cleanups;
> * #2, #5, #6, #7: symbol scope, name collisions;
> * #3, #4, #9, #11: compile-time optimizations.
Ping?
>
> [0] https://lore.kernel.org/lkml/20231109151106.2385155-1-glider@google.com
> [1] https://lore.kernel.org/netdev/20230721071532.613888-1-marcin.szycik@linux.intel.com
[...]
Thanks,
Olek