* [PATCH bpf-next] bpf: hashtab - allow BPF_MAP_LOOKUP{,_AND_DELETE}_BATCH with NULL keys/values.
From: Maciej Żenczykowski @ 2025-08-13 7:39 UTC
To: Maciej Żenczykowski, Alexei Starovoitov, Daniel Borkmann
Cc: Linux Network Development Mailing List, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Linux Kernel Mailing List, BPF Mailing List,
Maciej Żenczykowski, Stanislav Fomichev
BPF_MAP_LOOKUP_AND_DELETE_BATCH with keys & values == NULL
seems like a nice way to quickly clear a map.
BPF_MAP_LOOKUP_BATCH with keys/values == NULL might be useful if we
just want the values/keys and don't want to bother copying the other.
BPF_MAP_LOOKUP_BATCH with keys & values == NULL might be useful to
count the number of populated entries.
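For illustration, a minimal userspace sketch of the intended usage (not
part of this patch; the helper name and error handling are illustrative,
and out_batch must still point at real storage, since the kernel writes
the next bucket index unconditionally):

	#include <errno.h>
	#include <string.h>
	#include <unistd.h>
	#include <sys/syscall.h>
	#include <linux/bpf.h>

	/* Hypothetical usage sketch, assuming this patch is applied:
	 * drain every entry from a hash map without copying keys or
	 * values back to userspace.
	 */
	static int clear_map(int map_fd)
	{
		union bpf_attr attr;
		__u32 out_batch;
		int err;

		memset(&attr, 0, sizeof(attr));
		attr.batch.map_fd = map_fd;
		attr.batch.in_batch = 0;	/* NULL: start from the first bucket */
		attr.batch.out_batch = (__u64)(unsigned long)&out_batch; /* still required */
		attr.batch.keys = 0;		/* NULL: don't copy keys out */
		attr.batch.values = 0;		/* NULL: don't copy values out */
		attr.batch.count = ~0U;		/* no limit on deleted entries */

		err = syscall(__NR_bpf, BPF_MAP_LOOKUP_AND_DELETE_BATCH,
			      &attr, sizeof(attr));
		/* -ENOENT just means we walked past the last bucket */
		return (err < 0 && errno != ENOENT) ? -errno : 0;
	}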
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
---
kernel/bpf/hashtab.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 5001131598e5..8fbdd000d9e0 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -1873,9 +1873,9 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
 
 	rcu_read_unlock();
 	bpf_enable_instrumentation();
-	if (bucket_cnt && (copy_to_user(ukeys + total * key_size, keys,
+	if (bucket_cnt && (ukeys && copy_to_user(ukeys + total * key_size, keys,
 					key_size * bucket_cnt) ||
-			   copy_to_user(uvalues + total * value_size, values,
+			   uvalues && copy_to_user(uvalues + total * value_size, values,
 					value_size * bucket_cnt))) {
 		ret = -EFAULT;
 		goto after_loop;
--
2.51.0.rc1.163.g2494970778-goog
* Re: [PATCH bpf-next] bpf: hashtab - allow BPF_MAP_LOOKUP{,_AND_DELETE}_BATCH with NULL keys/values.
From: kernel test robot @ 2025-08-13 20:46 UTC
To: Maciej Żenczykowski, Maciej Żenczykowski,
Alexei Starovoitov, Daniel Borkmann
Cc: oe-kbuild-all, Linux Network Development Mailing List,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Linux Kernel Mailing List, BPF Mailing List, Stanislav Fomichev
Hi Maciej,
kernel test robot noticed the following build warnings:
[auto build test WARNING on bpf-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/Maciej-enczykowski/bpf-hashtab-allow-BPF_MAP_LOOKUP-_AND_DELETE-_BATCH-with-NULL-keys-values/20250813-154227
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/r/20250813073955.1775315-1-maze%40google.com
patch subject: [PATCH bpf-next] bpf: hashtab - allow BPF_MAP_LOOKUP{,_AND_DELETE}_BATCH with NULL keys/values.
config: csky-randconfig-001-20250814 (https://download.01.org/0day-ci/archive/20250814/202508140435.GA2XXxaK-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250814/202508140435.GA2XXxaK-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202508140435.GA2XXxaK-lkp@intel.com/
All warnings (new ones prefixed by >>):
kernel/bpf/hashtab.c: In function '__htab_map_lookup_and_delete_batch':
>> kernel/bpf/hashtab.c:1875:34: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
1875 | if (bucket_cnt && (ukeys && copy_to_user(ukeys + total * key_size, keys,
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1876 | key_size * bucket_cnt) ||
| ~~~~~~~~~~~~~~~~~~~~~~
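The warning is about readability rather than behavior, since '&&' already
binds more tightly than '||'. A minimal sketch of the explicit
parenthesization gcc is suggesting (one possible respin, not the author's
posted fix):

	if (bucket_cnt && ((ukeys && copy_to_user(ukeys + total * key_size, keys,
						  key_size * bucket_cnt)) ||
			   (uvalues && copy_to_user(uvalues + total * value_size, values,
						    value_size * bucket_cnt)))) {
		ret = -EFAULT;
		goto after_loop;
	}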
vim +1875 kernel/bpf/hashtab.c
1675
1676 static int
1677 __htab_map_lookup_and_delete_batch(struct bpf_map *map,
1678 const union bpf_attr *attr,
1679 union bpf_attr __user *uattr,
1680 bool do_delete, bool is_lru_map,
1681 bool is_percpu)
1682 {
1683 struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
1684 void *keys = NULL, *values = NULL, *value, *dst_key, *dst_val;
1685 void __user *uvalues = u64_to_user_ptr(attr->batch.values);
1686 void __user *ukeys = u64_to_user_ptr(attr->batch.keys);
1687 void __user *ubatch = u64_to_user_ptr(attr->batch.in_batch);
1688 u32 batch, max_count, size, bucket_size, map_id;
1689 u32 bucket_cnt, total, key_size, value_size;
1690 struct htab_elem *node_to_free = NULL;
1691 u64 elem_map_flags, map_flags;
1692 struct hlist_nulls_head *head;
1693 struct hlist_nulls_node *n;
1694 unsigned long flags = 0;
1695 bool locked = false;
1696 struct htab_elem *l;
1697 struct bucket *b;
1698 int ret = 0;
1699
1700 elem_map_flags = attr->batch.elem_flags;
1701 if ((elem_map_flags & ~BPF_F_LOCK) ||
1702 ((elem_map_flags & BPF_F_LOCK) && !btf_record_has_field(map->record, BPF_SPIN_LOCK)))
1703 return -EINVAL;
1704
1705 map_flags = attr->batch.flags;
1706 if (map_flags)
1707 return -EINVAL;
1708
1709 max_count = attr->batch.count;
1710 if (!max_count)
1711 return 0;
1712
1713 if (put_user(0, &uattr->batch.count))
1714 return -EFAULT;
1715
1716 batch = 0;
1717 if (ubatch && copy_from_user(&batch, ubatch, sizeof(batch)))
1718 return -EFAULT;
1719
1720 if (batch >= htab->n_buckets)
1721 return -ENOENT;
1722
1723 key_size = htab->map.key_size;
1724 value_size = htab->map.value_size;
1725 size = round_up(value_size, 8);
1726 if (is_percpu)
1727 value_size = size * num_possible_cpus();
1728 total = 0;
1729 /* while experimenting with hash tables with sizes ranging from 10 to
1730 * 1000, it was observed that a bucket can have up to 5 entries.
1731 */
1732 bucket_size = 5;
1733
1734 alloc:
1735 /* We cannot do copy_from_user or copy_to_user inside
1736 * the rcu_read_lock. Allocate enough space here.
1737 */
1738 keys = kvmalloc_array(key_size, bucket_size, GFP_USER | __GFP_NOWARN);
1739 values = kvmalloc_array(value_size, bucket_size, GFP_USER | __GFP_NOWARN);
1740 if (!keys || !values) {
1741 ret = -ENOMEM;
1742 goto after_loop;
1743 }
1744
1745 again:
1746 bpf_disable_instrumentation();
1747 rcu_read_lock();
1748 again_nocopy:
1749 dst_key = keys;
1750 dst_val = values;
1751 b = &htab->buckets[batch];
1752 head = &b->head;
1753 /* do not grab the lock unless need it (bucket_cnt > 0). */
1754 if (locked) {
1755 ret = htab_lock_bucket(b, &flags);
1756 if (ret) {
1757 rcu_read_unlock();
1758 bpf_enable_instrumentation();
1759 goto after_loop;
1760 }
1761 }
1762
1763 bucket_cnt = 0;
1764 hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
1765 bucket_cnt++;
1766
1767 if (bucket_cnt && !locked) {
1768 locked = true;
1769 goto again_nocopy;
1770 }
1771
1772 if (bucket_cnt > (max_count - total)) {
1773 if (total == 0)
1774 ret = -ENOSPC;
1775 /* Note that since bucket_cnt > 0 here, it is implicit
1776 * that the locked was grabbed, so release it.
1777 */
1778 htab_unlock_bucket(b, flags);
1779 rcu_read_unlock();
1780 bpf_enable_instrumentation();
1781 goto after_loop;
1782 }
1783
1784 if (bucket_cnt > bucket_size) {
1785 bucket_size = bucket_cnt;
1786 /* Note that since bucket_cnt > 0 here, it is implicit
1787 * that the locked was grabbed, so release it.
1788 */
1789 htab_unlock_bucket(b, flags);
1790 rcu_read_unlock();
1791 bpf_enable_instrumentation();
1792 kvfree(keys);
1793 kvfree(values);
1794 goto alloc;
1795 }
1796
1797 /* Next block is only safe to run if you have grabbed the lock */
1798 if (!locked)
1799 goto next_batch;
1800
1801 hlist_nulls_for_each_entry_safe(l, n, head, hash_node) {
1802 memcpy(dst_key, l->key, key_size);
1803
1804 if (is_percpu) {
1805 int off = 0, cpu;
1806 void __percpu *pptr;
1807
1808 pptr = htab_elem_get_ptr(l, map->key_size);
1809 for_each_possible_cpu(cpu) {
1810 copy_map_value_long(&htab->map, dst_val + off, per_cpu_ptr(pptr, cpu));
1811 check_and_init_map_value(&htab->map, dst_val + off);
1812 off += size;
1813 }
1814 } else {
1815 value = htab_elem_value(l, key_size);
1816 if (is_fd_htab(htab)) {
1817 struct bpf_map **inner_map = value;
1818
1819 /* Actual value is the id of the inner map */
1820 map_id = map->ops->map_fd_sys_lookup_elem(*inner_map);
1821 value = &map_id;
1822 }
1823
1824 if (elem_map_flags & BPF_F_LOCK)
1825 copy_map_value_locked(map, dst_val, value,
1826 true);
1827 else
1828 copy_map_value(map, dst_val, value);
1829 /* Zeroing special fields in the temp buffer */
1830 check_and_init_map_value(map, dst_val);
1831 }
1832 if (do_delete) {
1833 hlist_nulls_del_rcu(&l->hash_node);
1834
1835 /* bpf_lru_push_free() will acquire lru_lock, which
1836 * may cause deadlock. See comments in function
1837 * prealloc_lru_pop(). Let us do bpf_lru_push_free()
1838 * after releasing the bucket lock.
1839 *
1840 * For htab of maps, htab_put_fd_value() in
1841 * free_htab_elem() may acquire a spinlock with bucket
1842 * lock being held and it violates the lock rule, so
1843 * invoke free_htab_elem() after unlock as well.
1844 */
1845 l->batch_flink = node_to_free;
1846 node_to_free = l;
1847 }
1848 dst_key += key_size;
1849 dst_val += value_size;
1850 }
1851
1852 htab_unlock_bucket(b, flags);
1853 locked = false;
1854
1855 while (node_to_free) {
1856 l = node_to_free;
1857 node_to_free = node_to_free->batch_flink;
1858 if (is_lru_map)
1859 htab_lru_push_free(htab, l);
1860 else
1861 free_htab_elem(htab, l);
1862 }
1863
1864 next_batch:
1865 /* If we are not copying data, we can go to next bucket and avoid
1866 * unlocking the rcu.
1867 */
1868 if (!bucket_cnt && (batch + 1 < htab->n_buckets)) {
1869 batch++;
1870 goto again_nocopy;
1871 }
1872
1873 rcu_read_unlock();
1874 bpf_enable_instrumentation();
> 1875 if (bucket_cnt && (ukeys && copy_to_user(ukeys + total * key_size, keys,
1876 key_size * bucket_cnt) ||
1877 uvalues && copy_to_user(uvalues + total * value_size, values,
1878 value_size * bucket_cnt))) {
1879 ret = -EFAULT;
1880 goto after_loop;
1881 }
1882
1883 total += bucket_cnt;
1884 batch++;
1885 if (batch >= htab->n_buckets) {
1886 ret = -ENOENT;
1887 goto after_loop;
1888 }
1889 goto again;
1890
1891 after_loop:
1892 if (ret == -EFAULT)
1893 goto out;
1894
1895 /* copy # of entries and next batch */
1896 ubatch = u64_to_user_ptr(attr->batch.out_batch);
1897 if (copy_to_user(ubatch, &batch, sizeof(batch)) ||
1898 put_user(total, &uattr->batch.count))
1899 ret = -EFAULT;
1900
1901 out:
1902 kvfree(keys);
1903 kvfree(values);
1904 return ret;
1905 }
1906
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH bpf-next] bpf: hashtab - allow BPF_MAP_LOOKUP{,_AND_DELETE}_BATCH with NULL keys/values.
From: Yonghong Song @ 2025-08-18 20:58 UTC
To: Maciej Żenczykowski, Maciej Żenczykowski,
Alexei Starovoitov, Daniel Borkmann
Cc: Linux Network Development Mailing List, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Linux Kernel Mailing List, BPF Mailing List, Stanislav Fomichev
On 8/13/25 12:39 AM, Maciej Żenczykowski wrote:
> BPF_MAP_LOOKUP_AND_DELETE_BATCH with keys & values == NULL
> seems like a nice way to quickly clear a map.
This will change the existing API, as users will expect
some error (e.g., -EFAULT) to be returned when keys or values is NULL.
We have a 'flags' field in uapi header in
struct { /* struct used by BPF_MAP_*_BATCH commands */
	__aligned_u64	in_batch;	/* start batch,
					 * NULL to start from beginning
					 */
	__aligned_u64	out_batch;	/* output: next start batch */
	__aligned_u64	keys;
	__aligned_u64	values;
	__u32		count;		/* input/output:
					 * input: # of key/value elements
					 * output: # of filled elements
					 */
	__u32		map_fd;
	__u64		elem_flags;
	__u64		flags;
} batch;
we can add a flag in 'flags' like BPF_F_CLEAR_MAP_IF_KV_NULL with a comment
that if keys or values is NULL, the batched elements will be cleared.
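A sketch of what that might look like in include/uapi/linux/bpf.h - the
flag name and bit value are hypothetical; note that any non-zero
batch.flags is rejected with -EINVAL today (see the map_flags check in
__htab_map_lookup_and_delete_batch() below), so a new flag would be
backwards-compatible:

	/* Hypothetical flag for attr->batch.flags - illustrative only.
	 * If set, keys/values may be NULL and the batched elements are
	 * deleted without being copied to userspace.
	 */
	enum {
		BPF_F_CLEAR_MAP_IF_KV_NULL	= (1U << 0),
	};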
>
> BPF_MAP_LOOKUP_BATCH with keys/values == NULL might be useful if we
> just want the values/keys and don't want to bother copying the other.
>
> BPF_MAP_LOOKUP_BATCH with keys & values == NULL might be useful to
> count the number of populated entries.
bpf_map_lookup_elem() does not have a flags field, so we probably should
not change the existing semantics.
>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Stanislav Fomichev <sdf@fomichev.me>
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> ---
> kernel/bpf/hashtab.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 5001131598e5..8fbdd000d9e0 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -1873,9 +1873,9 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>
> rcu_read_unlock();
> bpf_enable_instrumentation();
> - if (bucket_cnt && (copy_to_user(ukeys + total * key_size, keys,
> + if (bucket_cnt && (ukeys && copy_to_user(ukeys + total * key_size, keys,
> key_size * bucket_cnt) ||
> - copy_to_user(uvalues + total * value_size, values,
> + uvalues && copy_to_user(uvalues + total * value_size, values,
> value_size * bucket_cnt))) {
> ret = -EFAULT;
> goto after_loop;
* Re: [PATCH bpf-next] bpf: hashtab - allow BPF_MAP_LOOKUP{,_AND_DELETE}_BATCH with NULL keys/values.
From: Maciej Żenczykowski @ 2025-08-21 4:07 UTC
To: Yonghong Song
Cc: Alexei Starovoitov, Daniel Borkmann,
Linux Network Development Mailing List, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Linux Kernel Mailing List, BPF Mailing List, Stanislav Fomichev
On Mon, Aug 18, 2025 at 1:58 PM Yonghong Song <yonghong.song@linux.dev> wrote:
> On 8/13/25 12:39 AM, Maciej Żenczykowski wrote:
> > BPF_MAP_LOOKUP_AND_DELETE_BATCH with keys & values == NULL
> > seems like a nice way to quickly clear a map.
>
> This will change the existing API, as users will expect
> some error (e.g., -EFAULT) to be returned when keys or values is NULL.
No reasonable user will call the current api with NULLs.
This is a similar API change to adding a new system call
(where previously it returned -ENOSYS) - which *is* also a UAPI
change, but obviously allowed.
Or adding support for a new address family / protocol (where
previously it returned -EAFNOSUPPORT),
or adding support for a new flag (where previously it returned -EINVAL).
Consider why userspace would ever pass in NULL; there are two possibilities:
(a) an explicit NULL - you'd never do this, since it would (till now)
always return -EFAULT, so this would only show up in a very thorough
test suite;
(b) you're using dynamically allocated memory and the allocation failed -
that's already a program bug, and you should catch it before you call bpf().
> We have a 'flags' field in uapi header in
>
> struct { /* struct used by BPF_MAP_*_BATCH commands */
> __aligned_u64 in_batch; /* start batch,
> * NULL to start from beginning
> */
> __aligned_u64 out_batch; /* output: next start batch */
> __aligned_u64 keys;
> __aligned_u64 values;
> __u32 count; /* input/output:
> * input: # of key/value
> * elements
> * output: # of filled elements
> */
> __u32 map_fd;
> __u64 elem_flags;
> __u64 flags;
> } batch;
>
> we can add a flag in 'flags' like BPF_F_CLEAR_MAP_IF_KV_NULL with a comment
> that if keys or values is NULL, the batched elements will be cleared.
I just don't see what value this provides.
> > BPF_MAP_LOOKUP_BATCH with keys/values == NULL might be useful if we
> > just want the values/keys and don't want to bother copying the other.
> >
> > BPF_MAP_LOOKUP_BATCH with keys & values == NULL might be useful to
> > count the number of populated entries.
>
> bpf_map_lookup_elem() does not have a flags field, so we probably should
> not change the existing semantics.
This is unrelated to this patch, since it only touches the 'batch' operations
(unless I'm missing something).
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: Stanislav Fomichev <sdf@fomichev.me>
> > Signed-off-by: Maciej Żenczykowski <maze@google.com>
> > ---
> > kernel/bpf/hashtab.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > index 5001131598e5..8fbdd000d9e0 100644
> > --- a/kernel/bpf/hashtab.c
> > +++ b/kernel/bpf/hashtab.c
> > @@ -1873,9 +1873,9 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >
> > rcu_read_unlock();
> > bpf_enable_instrumentation();
> > - if (bucket_cnt && (copy_to_user(ukeys + total * key_size, keys,
> > + if (bucket_cnt && (ukeys && copy_to_user(ukeys + total * key_size, keys,
> > key_size * bucket_cnt) ||
> > - copy_to_user(uvalues + total * value_size, values,
> > + uvalues && copy_to_user(uvalues + total * value_size, values,
> > value_size * bucket_cnt))) {
> > ret = -EFAULT;
> > goto after_loop;
* Re: [PATCH bpf-next] bpf: hashtab - allow BPF_MAP_LOOKUP{,_AND_DELETE}_BATCH with NULL keys/values.
From: Yonghong Song @ 2025-08-21 21:48 UTC
To: Maciej Żenczykowski
Cc: Alexei Starovoitov, Daniel Borkmann,
Linux Network Development Mailing List, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Linux Kernel Mailing List, BPF Mailing List, Stanislav Fomichev
On 8/20/25 7:23 PM, Maciej Żenczykowski wrote:
> On Mon, Aug 18, 2025 at 1:58 PM Yonghong Song
> <yonghong.song@linux.dev> wrote:
> > On 8/13/25 12:39 AM, Maciej Żenczykowski wrote:
> > > BPF_MAP_LOOKUP_AND_DELETE_BATCH with keys & values == NULL
> > > seems like a nice way to quickly clear a map.
> >
> > This will change the existing API, as users will expect
> > some error (e.g., -EFAULT) to be returned when keys or values is NULL.
>
> No reasonable user will call the current api with NULLs.
I do agree it is really unlikely users will have NULL keys or values.
>
> This is a similar API change to adding a new system call
> (where previously it returned -ENOSYS) - which *is* also a UAPI
> change, but obviously allowed.
>
> Or adding support for a new address family / protocol (where
> previously it returned -EAFNOSUPPORT),
> or adding support for a new flag (where previously it returned -EINVAL).
>
> Consider why userspace would ever pass in NULL; there are two possibilities:
> (a) an explicit NULL - you'd never do this, since it would (till now)
> always return -EFAULT, so this would only show up in a very thorough
> test suite;
> (b) you're using dynamically allocated memory and the allocation failed -
> that's already a program bug, and you should catch it before you call
> bpf().
Okay. What you describe makes sense.
Could you add a selftest for this?
Could you also add some comments to the uapi bpf.h header below describing
the new functionality?
>
> > We have a 'flags' field in uapi header in
> >
> > struct { /* struct used by BPF_MAP_*_BATCH commands */
> > 	__aligned_u64	in_batch;	/* start batch,
> > 					 * NULL to start from beginning
> > 					 */
> > 	__aligned_u64	out_batch;	/* output: next start batch */
> > 	__aligned_u64	keys;
> > 	__aligned_u64	values;
> > 	__u32		count;		/* input/output:
> > 					 * input: # of key/value elements
> > 					 * output: # of filled elements
> > 					 */
> > 	__u32		map_fd;
> > 	__u64		elem_flags;
> > 	__u64		flags;
> > } batch;
> >
> > we can add a flag in 'flags' like BPF_F_CLEAR_MAP_IF_KV_NULL with a comment
> > that if keys or values is NULL, the batched elements will be cleared.
>
> I just don't see what value this provides.
>
> > > BPF_MAP_LOOKUP_BATCH with keys/values == NULL might be useful if we
> > > just want the values/keys and don't want to bother copying the other.
> > >
> > > BPF_MAP_LOOKUP_BATCH with keys & values == NULL might be useful to
> > > count the number of populated entries.
> >
> > bpf_map_lookup_elem() does not have a flags field, so we probably should
> > not change the existing semantics.
>
> This is unrelated to this patch, since it only touches the 'batch'
> operations (unless I'm missing something).
>
> > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > > Cc: Stanislav Fomichev <sdf@fomichev.me>
> > > Signed-off-by: Maciej Żenczykowski <maze@google.com>
> > > ---
> > > kernel/bpf/hashtab.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > > index 5001131598e5..8fbdd000d9e0 100644
> > > --- a/kernel/bpf/hashtab.c
> > > +++ b/kernel/bpf/hashtab.c
> > > @@ -1873,9 +1873,9 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> > >
> > > 	rcu_read_unlock();
> > > 	bpf_enable_instrumentation();
> > > -	if (bucket_cnt && (copy_to_user(ukeys + total * key_size, keys,
> > > +	if (bucket_cnt && (ukeys && copy_to_user(ukeys + total * key_size, keys,
> > > 					key_size * bucket_cnt) ||
> > > -			   copy_to_user(uvalues + total * value_size, values,
> > > +			   uvalues && copy_to_user(uvalues + total * value_size, values,
> > > 					value_size * bucket_cnt))) {
> > > 		ret = -EFAULT;
> > > 		goto after_loop;
> >
>
>
> --
> Maciej Żenczykowski, Kernel Networking Developer @ Google
* Re: [PATCH bpf-next] bpf: hashtab - allow BPF_MAP_LOOKUP{,_AND_DELETE}_BATCH with NULL keys/values.
From: Andrii Nakryiko @ 2025-08-22 19:00 UTC
To: Yonghong Song
Cc: Maciej Żenczykowski, Alexei Starovoitov, Daniel Borkmann,
Linux Network Development Mailing List, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Linux Kernel Mailing List, BPF Mailing List, Stanislav Fomichev
On Thu, Aug 21, 2025 at 2:49 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>
>
>
> On 8/20/25 7:23 PM, Maciej Żenczykowski wrote:
> > On Mon, Aug 18, 2025 at 1:58 PM Yonghong Song
> > <yonghong.song@linux.dev> wrote:
> > > On 8/13/25 12:39 AM, Maciej Żenczykowski wrote:
> > > > BPF_MAP_LOOKUP_AND_DELETE_BATCH with keys & values == NULL
> > > > seems like a nice way to quickly clear a map.
> > >
> > > This will change the existing API, as users will expect
> > > some error (e.g., -EFAULT) to be returned when keys or values is NULL.
> >
> > No reasonable user will call the current api with NULLs.
>
> I do agree it is really unlikely users will have NULL keys or values.
>
> >
> > This is a similar API change to adding a new system call
> > (where previously it returned -ENOSYS) - which *is* also a UAPI
> > change, but obviously allowed.
> >
> > Or adding support for a new address family / protocol (where
> > previously it returned -EAFNOSUPPORT),
> > or adding support for a new flag (where previously it returned -EINVAL).
> >
> > Consider why userspace would ever pass in NULL; there are two possibilities:
> > (a) an explicit NULL - you'd never do this, since it would (till now)
> > always return -EFAULT, so this would only show up in a very thorough
> > test suite;
> > (b) you're using dynamically allocated memory and the allocation failed -
> > that's already a program bug, and you should catch it before you call
> > bpf().
>
> Okay. What you describe makes sense.
+1, I think there is no backwards compat concern with this extension.
> Could you add a selftest for this?
yes, please
> Could you also add some comments to the uapi bpf.h header below describing
> the new functionality?
+1, I'd also appreciate it if you can incorporate that into the libbpf API
doc comments in tools/lib/bpf/bpf.h; we already have a decent
description of this rather complicated API, so it would be nice to
keep it up to date with the added semantics. Thanks!
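For context, a hedged sketch of what that usage could look like through
libbpf's existing wrapper (assuming the kernel-side patch lands; the
wrapper already forwards NULL pointers unchanged, but out_batch still
needs real storage):

	#include <errno.h>
	#include <bpf/bpf.h>

	/* Sketch: drain a hash map via libbpf, assuming NULL keys/values
	 * are accepted by the kernel after this patch.
	 */
	static int drain_map(int map_fd)
	{
		LIBBPF_OPTS(bpf_map_batch_opts, opts);
		__u32 out_batch, count = ~0U;	/* in: no limit; out: # deleted */
		int err;

		err = bpf_map_lookup_and_delete_batch(map_fd,
						      NULL,	  /* in_batch: from start */
						      &out_batch, /* still required */
						      NULL,	  /* keys: don't copy */
						      NULL,	  /* values: don't copy */
						      &count, &opts);
		/* libbpf returns -ENOENT once the whole table has been walked */
		return (err && err != -ENOENT) ? err : 0;
	}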
Also, please shorten the subject, it's way too long.
pw-bot: cr
>
> >
> > > We have a 'flags' field in uapi header in
> > >
> > > struct { /* struct used by BPF_MAP_*_BATCH commands */
> > > 	__aligned_u64	in_batch;	/* start batch,
> > > 					 * NULL to start from beginning
> > > 					 */
> > > 	__aligned_u64	out_batch;	/* output: next start batch */
> > > 	__aligned_u64	keys;
> > > 	__aligned_u64	values;
> > > 	__u32		count;		/* input/output:
> > > 					 * input: # of key/value elements
> > > 					 * output: # of filled elements
> > > 					 */
> > > 	__u32		map_fd;
> > > 	__u64		elem_flags;
> > > 	__u64		flags;
> > > } batch;
> > >
> > > we can add a flag in 'flags' like BPF_F_CLEAR_MAP_IF_KV_NULL with a comment
> > > that if keys or values is NULL, the batched elements will be cleared.
> >
> > I just don't see what value this provides.
> >
> > > > BPF_MAP_LOOKUP_BATCH with keys/values == NULL might be useful if we
> > > > just want the values/keys and don't want to bother copying the other.
> > > >
> > > > BPF_MAP_LOOKUP_BATCH with keys & values == NULL might be useful to
> > > > count the number of populated entries.
> > >
> > > bpf_map_lookup_elem() does not have a flags field, so we probably should
> > > not change the existing semantics.
> >
> > This is unrelated to this patch, since it only touches the 'batch'
> > operations (unless I'm missing something).
> >
> > > > Cc: Alexei Starovoitov <ast@kernel.org>
> > > > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > > > Cc: Stanislav Fomichev <sdf@fomichev.me>
> > > > Signed-off-by: Maciej Żenczykowski <maze@google.com>
> > > > ---
> > > > kernel/bpf/hashtab.c | 4 ++--
> > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > > > index 5001131598e5..8fbdd000d9e0 100644
> > > > --- a/kernel/bpf/hashtab.c
> > > > +++ b/kernel/bpf/hashtab.c
> > > > @@ -1873,9 +1873,9 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> > > >
> > > > 	rcu_read_unlock();
> > > > 	bpf_enable_instrumentation();
> > > > -	if (bucket_cnt && (copy_to_user(ukeys + total * key_size, keys,
> > > > +	if (bucket_cnt && (ukeys && copy_to_user(ukeys + total * key_size, keys,
> > > > 					key_size * bucket_cnt) ||
> > > > -			   copy_to_user(uvalues + total * value_size, values,
> > > > +			   uvalues && copy_to_user(uvalues + total * value_size, values,
> > > > 					value_size * bucket_cnt))) {
> > > > 		ret = -EFAULT;
> > > > 		goto after_loop;
> > >
> >
> >
> > --
> > Maciej Żenczykowski, Kernel Networking Developer @ Google
>
>