* [PATCH dpdk 0/2] hash: safe data replacement on overwrite
@ 2026-02-12 21:33 Robin Jarry
2026-02-12 21:33 ` [PATCH dpdk 1/2] hash: free replaced data on overwrite when RCU is configured Robin Jarry
` (4 more replies)
0 siblings, 5 replies; 21+ messages in thread
From: Robin Jarry @ 2026-02-12 21:33 UTC (permalink / raw)
To: dev
When rte_hash_add_key_data() is called with an existing key, the old
data pointer is silently overwritten. With RCU-protected readers still
potentially accessing the old data, the application cannot safely free
it since the old pointer is never returned.
This series provides two ways to address the problem:
* Automatic: when RCU is configured with a free_key_data_func callback,
rte_hash_add_key_data() now automatically defers freeing the old data
on overwrite. No caller changes are needed.
* Explicit: a new rte_hash_replace_key_data() API returns the previous
data pointer via an output parameter, giving the caller full control
over the old pointer's lifecycle. No automatic deferred free is
performed in that case.
Robin Jarry (2):
hash: free replaced data on overwrite when RCU is configured
hash: add replace API returning old data on overwrite
app/test/test_hash.c | 214 +++++++++++++++++++++++++
doc/guides/rel_notes/release_26_03.rst | 12 ++
lib/hash/rte_cuckoo_hash.c | 146 +++++++++++++----
lib/hash/rte_hash.h | 75 ++++++++-
4 files changed, 417 insertions(+), 30 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 21+ messages in thread* [PATCH dpdk 1/2] hash: free replaced data on overwrite when RCU is configured 2026-02-12 21:33 [PATCH dpdk 0/2] hash: safe data replacement on overwrite Robin Jarry @ 2026-02-12 21:33 ` Robin Jarry 2026-02-12 21:33 ` [PATCH dpdk 2/2] hash: add replace API returning old data on overwrite Robin Jarry ` (3 subsequent siblings) 4 siblings, 0 replies; 21+ messages in thread From: Robin Jarry @ 2026-02-12 21:33 UTC (permalink / raw) To: dev, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin When rte_hash_add_key_data() overwrites an existing key, the old data pointer is silently lost. With RCU-protected readers still potentially accessing the old data, the application has no safe way to free it. When RCU is configured with a free_key_data_func callback, automatically enqueue the old data for deferred freeing via the RCU defer queue on overwrite. In SYNC mode, synchronize and call free_key_data_func directly. Signed-off-by: Robin Jarry <rjarry@redhat.com> --- app/test/test_hash.c | 134 +++++++++++++++++++++++++ doc/guides/rel_notes/release_26_03.rst | 6 ++ lib/hash/rte_cuckoo_hash.c | 106 ++++++++++++++----- lib/hash/rte_hash.h | 8 +- 4 files changed, 229 insertions(+), 25 deletions(-) diff --git a/app/test/test_hash.c b/app/test/test_hash.c index 3fb3d96d0546..56a7779e09b9 100644 --- a/app/test/test_hash.c +++ b/app/test/test_hash.c @@ -2342,6 +2342,137 @@ test_hash_rcu_qsbr_dq_reclaim(void) return 0; } +static void *old_data; + +static void +test_hash_free_key_data_func(void *p __rte_unused, void *key_data) +{ + old_data = key_data; +} + +/* + * Test automatic RCU free on overwrite via rte_hash_add_key_data. + * - Create hash with RW_CONCURRENCY_LF and RCU QSBR in DQ mode + * with a free_key_data_func callback that increments a counter. + * - Register a pseudo reader thread. + * - Add key with data (void *)1. + * - Overwrite same key with data (void *)2 via rte_hash_add_key_data. + * - Report quiescent state, trigger reclamation. + * - Verify the free callback was called exactly once. + * - Delete the key, report quiescent state, reclaim again. + * - Verify the free callback was called a second time. + */ +static int +test_hash_rcu_qsbr_replace_auto_free(void) +{ + struct rte_hash_rcu_config rcu = { + .v = NULL, + .mode = RTE_HASH_QSBR_MODE_DQ, + .free_key_data_func = test_hash_free_key_data_func, + .key_data_ptr = NULL, + }; + struct rte_hash_parameters params = { + .name = "test_replace_auto_free", + .entries = 16, + .key_len = sizeof(uint32_t), + .hash_func = NULL, + .hash_func_init_val = 0, + .socket_id = SOCKET_ID_ANY, + .extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF, + }; + struct rte_hash *hash = NULL; + uint32_t key = 55; + int32_t status; + int ret = -1; + size_t sz; + + printf("\n# Running RCU replace auto-free test\n"); + + hash = rte_hash_create(¶ms); + if (hash == NULL) { + printf("hash creation failed\n"); + goto end; + } + + sz = rte_rcu_qsbr_get_memsize(RTE_MAX_LCORE); + rcu.v = rte_zmalloc(NULL, sz, RTE_CACHE_LINE_SIZE); + if (rcu.v == NULL) { + printf("RCU QSBR alloc failed\n"); + goto end; + } + status = rte_rcu_qsbr_init(rcu.v, RTE_MAX_LCORE); + if (status != 0) { + printf("RCU QSBR init failed\n"); + goto end; + } + + status = rte_hash_rcu_qsbr_add(hash, &rcu); + if (status != 0) { + printf("RCU QSBR add failed\n"); + goto end; + } + + /* Register pseudo reader */ + status = rte_rcu_qsbr_thread_register(rcu.v, 0); + if (status != 0) { + printf("RCU QSBR thread register failed\n"); + goto end; + } + rte_rcu_qsbr_thread_online(rcu.v, 0); + + old_data = NULL; + + /* Add key with data = (void *)1 */ + status = rte_hash_add_key_data(hash, &key, (void *)(uintptr_t)1); + if (status != 0) { + printf("failed to add key (status=%d)\n", status); + goto end; + } + + /* Overwrite same key with data = (void *)2 */ + status = rte_hash_add_key_data(hash, &key, (void *)(uintptr_t)2); + if (status != 0) { + printf("failed to overwrite key (status=%d)\n", status); + goto end; + } + + /* Reader quiescent and reclaim */ + rte_rcu_qsbr_quiescent(rcu.v, 0); + rte_hash_rcu_qsbr_dq_reclaim(hash, NULL, NULL, NULL); + + if (old_data != (void *)(uintptr_t)1) { + printf("old data should be 0x1 but is %p\n", old_data); + goto end; + } + + /* Delete the key */ + status = rte_hash_del_key(hash, &key); + if (status < 0) { + printf("failed to delete key (status=%d)\n", status); + goto end; + } + + /* Reader quiescent and reclaim again */ + rte_rcu_qsbr_quiescent(rcu.v, 0); + rte_hash_rcu_qsbr_dq_reclaim(hash, NULL, NULL, NULL); + + if (old_data != (void *)(uintptr_t)2) { + printf("old data should be 2 but is %p\n", old_data); + goto end; + } + + ret = 0; +end: + if (rcu.v != NULL) { + rte_rcu_qsbr_thread_offline(rcu.v, 0); + rte_rcu_qsbr_thread_unregister(rcu.v, 0); + } + rte_hash_free(hash); + rte_free(rcu.v); + + return ret; +} + /* * Do all unit and performance tests. */ @@ -2423,6 +2554,9 @@ test_hash(void) if (test_hash_rcu_qsbr_dq_reclaim() < 0) return -1; + if (test_hash_rcu_qsbr_replace_auto_free() < 0) + return -1; + return 0; } diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst index afdf1af06c20..693034bcb0d2 100644 --- a/doc/guides/rel_notes/release_26_03.rst +++ b/doc/guides/rel_notes/release_26_03.rst @@ -87,6 +87,12 @@ New Features * Added support for AES-XTS cipher algorithm. * Added support for SHAKE-128 and SHAKE-256 authentication algorithms. +* **Added automatic deferred free on hash data overwrite.** + + When RCU is configured with a ``free_key_data_func`` callback, + ``rte_hash_add_key_data`` now automatically defers freeing the old data + pointer on key overwrite via the RCU defer queue. + Removed Items ------------- diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c index da12825c6ed2..0b979dc3be0b 100644 --- a/lib/hash/rte_cuckoo_hash.c +++ b/lib/hash/rte_cuckoo_hash.c @@ -75,6 +75,7 @@ EAL_REGISTER_TAILQ(rte_hash_tailq) struct __rte_hash_rcu_dq_entry { uint32_t key_idx; uint32_t ext_bkt_idx; + void *old_data; }; RTE_EXPORT_SYMBOL(rte_hash_find_existing) @@ -763,10 +764,11 @@ enqueue_slot_back(const struct rte_hash *h, /* Search a key from bucket and update its data. * Writer holds the lock before calling this. + * If old_data is non-NULL, save the previous data pointer before overwriting. */ static inline int32_t search_and_update(const struct rte_hash *h, void *data, const void *key, - struct rte_hash_bucket *bkt, uint16_t sig) + struct rte_hash_bucket *bkt, uint16_t sig, void **old_data) { int i; struct rte_hash_key *k, *keys = h->key_store; @@ -776,6 +778,8 @@ search_and_update(const struct rte_hash *h, void *data, const void *key, k = (struct rte_hash_key *) ((char *)keys + bkt->key_idx[i] * h->key_entry_size); if (rte_hash_cmp_eq(key, k->key, h) == 0) { + if (old_data != NULL) + *old_data = k->pdata; /* The store to application data at *data * should not leak after the store to pdata * in the key store. i.e. pdata is the guard @@ -807,7 +811,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, struct rte_hash_bucket *sec_bkt, const struct rte_hash_key *key, void *data, uint16_t sig, uint32_t new_idx, - int32_t *ret_val) + int32_t *ret_val, void **old_data) { unsigned int i; struct rte_hash_bucket *cur_bkt; @@ -817,7 +821,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, /* Check if key was inserted after last check but before this * protected region in case of inserting duplicated keys. */ - ret = search_and_update(h, data, key, prim_bkt, sig); + ret = search_and_update(h, data, key, prim_bkt, sig, old_data); if (ret != -1) { __hash_rw_writer_unlock(h); *ret_val = ret; @@ -825,7 +829,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, } FOR_EACH_BUCKET(cur_bkt, sec_bkt) { - ret = search_and_update(h, data, key, cur_bkt, sig); + ret = search_and_update(h, data, key, cur_bkt, sig, old_data); if (ret != -1) { __hash_rw_writer_unlock(h); *ret_val = ret; @@ -872,7 +876,7 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h, const struct rte_hash_key *key, void *data, struct queue_node *leaf, uint32_t leaf_slot, uint16_t sig, uint32_t new_idx, - int32_t *ret_val) + int32_t *ret_val, void **old_data) { uint32_t prev_alt_bkt_idx; struct rte_hash_bucket *cur_bkt; @@ -892,7 +896,7 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h, /* Check if key was inserted after last check but before this * protected region. */ - ret = search_and_update(h, data, key, bkt, sig); + ret = search_and_update(h, data, key, bkt, sig, old_data); if (ret != -1) { __hash_rw_writer_unlock(h); *ret_val = ret; @@ -900,7 +904,7 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h, } FOR_EACH_BUCKET(cur_bkt, alt_bkt) { - ret = search_and_update(h, data, key, cur_bkt, sig); + ret = search_and_update(h, data, key, cur_bkt, sig, old_data); if (ret != -1) { __hash_rw_writer_unlock(h); *ret_val = ret; @@ -997,7 +1001,8 @@ rte_hash_cuckoo_make_space_mw(const struct rte_hash *h, struct rte_hash_bucket *sec_bkt, const struct rte_hash_key *key, void *data, uint16_t sig, uint32_t bucket_idx, - uint32_t new_idx, int32_t *ret_val) + uint32_t new_idx, int32_t *ret_val, + void **old_data) { unsigned int i; struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN]; @@ -1023,7 +1028,7 @@ rte_hash_cuckoo_make_space_mw(const struct rte_hash *h, int32_t ret = rte_hash_cuckoo_move_insert_mw(h, bkt, sec_bkt, key, data, tail, i, sig, - new_idx, ret_val); + new_idx, ret_val, old_data); if (likely(ret != -1)) return ret; } @@ -1076,6 +1081,33 @@ alloc_slot(const struct rte_hash *h, struct lcore_cache *cached_free_slots) return slot_id; } +/* + * When RCU is configured with a free function, auto-free the overwritten + * data pointer via RCU. + */ +static inline void +__hash_rcu_auto_free_old_data(const struct rte_hash *h, void *old_data_val) +{ + if (h->hash_rcu_cfg == NULL || h->hash_rcu_cfg->free_key_data_func == NULL) + return; + + if (h->dq != NULL) { + /* DQ mode: enqueue with sentinel key_idx=0 */ + struct __rte_hash_rcu_dq_entry rcu_dq_entry = { + .key_idx = 0, + .ext_bkt_idx = EMPTY_SLOT, + .old_data = old_data_val, + }; + if (rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0) + HASH_LOG(ERR, "Failed to push QSBR FIFO"); + } else { + /* SYNC mode */ + rte_rcu_qsbr_synchronize(h->hash_rcu_cfg->v, RTE_QSBR_THRID_INVALID); + h->hash_rcu_cfg->free_key_data_func( + h->hash_rcu_cfg->key_data_ptr, old_data_val); + } +} + static inline int32_t __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t sig, void *data) @@ -1092,6 +1124,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, struct lcore_cache *cached_free_slots = NULL; int32_t ret_val; struct rte_hash_bucket *last; + void *saved_old_data = NULL; short_sig = get_short_sig(sig); prim_bucket_idx = get_prim_bucket_index(h, sig); @@ -1103,18 +1136,20 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, /* Check if key is already inserted in primary location */ __hash_rw_writer_lock(h); - ret = search_and_update(h, data, key, prim_bkt, short_sig); + ret = search_and_update(h, data, key, prim_bkt, short_sig, + &saved_old_data); if (ret != -1) { __hash_rw_writer_unlock(h); - return ret; + goto overwrite; } /* Check if key is already inserted in secondary location */ FOR_EACH_BUCKET(cur_bkt, sec_bkt) { - ret = search_and_update(h, data, key, cur_bkt, short_sig); + ret = search_and_update(h, data, key, cur_bkt, short_sig, + &saved_old_data); if (ret != -1) { __hash_rw_writer_unlock(h); - return ret; + goto overwrite; } } @@ -1153,33 +1188,39 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, /* Find an empty slot and insert */ ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data, - short_sig, slot_id, &ret_val); + short_sig, slot_id, &ret_val, + &saved_old_data); if (ret == 0) return slot_id - 1; else if (ret == 1) { enqueue_slot_back(h, cached_free_slots, slot_id); - return ret_val; + ret = ret_val; + goto overwrite; } /* Primary bucket full, need to make space for new entry */ ret = rte_hash_cuckoo_make_space_mw(h, prim_bkt, sec_bkt, key, data, - short_sig, prim_bucket_idx, slot_id, &ret_val); + short_sig, prim_bucket_idx, slot_id, &ret_val, + &saved_old_data); if (ret == 0) return slot_id - 1; else if (ret == 1) { enqueue_slot_back(h, cached_free_slots, slot_id); - return ret_val; + ret = ret_val; + goto overwrite; } /* Also search secondary bucket to get better occupancy */ ret = rte_hash_cuckoo_make_space_mw(h, sec_bkt, prim_bkt, key, data, - short_sig, sec_bucket_idx, slot_id, &ret_val); + short_sig, sec_bucket_idx, slot_id, &ret_val, + &saved_old_data); if (ret == 0) return slot_id - 1; else if (ret == 1) { enqueue_slot_back(h, cached_free_slots, slot_id); - return ret_val; + ret = ret_val; + goto overwrite; } /* if ext table not enabled, we failed the insertion */ @@ -1193,17 +1234,21 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, */ __hash_rw_writer_lock(h); /* We check for duplicates again since could be inserted before the lock */ - ret = search_and_update(h, data, key, prim_bkt, short_sig); + ret = search_and_update(h, data, key, prim_bkt, short_sig, + &saved_old_data); if (ret != -1) { enqueue_slot_back(h, cached_free_slots, slot_id); - goto failure; + __hash_rw_writer_unlock(h); + goto overwrite; } FOR_EACH_BUCKET(cur_bkt, sec_bkt) { - ret = search_and_update(h, data, key, cur_bkt, short_sig); + ret = search_and_update(h, data, key, cur_bkt, short_sig, + &saved_old_data); if (ret != -1) { enqueue_slot_back(h, cached_free_slots, slot_id); - goto failure; + __hash_rw_writer_unlock(h); + goto overwrite; } } @@ -1263,6 +1308,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, __hash_rw_writer_unlock(h); return slot_id - 1; +overwrite: + if (saved_old_data != NULL) + __hash_rcu_auto_free_old_data(h, saved_old_data); + return ret; + failure: __hash_rw_writer_unlock(h); return ret; @@ -1566,6 +1616,15 @@ __hash_rcu_qsbr_free_resource(void *p, void *e, unsigned int n) *((struct __rte_hash_rcu_dq_entry *)e); RTE_SET_USED(n); + + /* Overwrite case: free old data only, do not recycle slot */ + if (rcu_dq_entry.old_data != NULL) { + h->hash_rcu_cfg->free_key_data_func( + h->hash_rcu_cfg->key_data_ptr, + rcu_dq_entry.old_data); + return; + } + keys = h->key_store; k = (struct rte_hash_key *) ((char *)keys + @@ -1870,6 +1929,7 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key, /* Key index where key is stored, adding the first dummy index */ rcu_dq_entry.key_idx = ret + 1; rcu_dq_entry.ext_bkt_idx = index; + rcu_dq_entry.old_data = NULL; if (h->dq == NULL) { /* Wait for quiescent state change if using * RTE_HASH_QSBR_MODE_SYNC diff --git a/lib/hash/rte_hash.h b/lib/hash/rte_hash.h index f692e0868dcf..e33f0aea0f5e 100644 --- a/lib/hash/rte_hash.h +++ b/lib/hash/rte_hash.h @@ -226,7 +226,9 @@ rte_hash_max_key_id(const struct rte_hash *h); * Thread safety can be enabled by setting flag during * table creation. * If the key exists already in the table, this API updates its value - * with 'data' passed in this API. It is the responsibility of + * with 'data' passed in this API. If RCU is configured with a + * free_key_data_func callback, the old data is automatically + * deferred-freed via RCU. Otherwise, it is the responsibility of * the application to manage any memory associated with the old value. * The readers might still be using the old value even after this API * has returned. @@ -253,7 +255,9 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data); * Thread safety can be enabled by setting flag during * table creation. * If the key exists already in the table, this API updates its value - * with 'data' passed in this API. It is the responsibility of + * with 'data' passed in this API. If RCU is configured with a + * free_key_data_func callback, the old data is automatically + * deferred-freed via RCU. Otherwise, it is the responsibility of * the application to manage any memory associated with the old value. * The readers might still be using the old value even after this API * has returned. -- 2.53.0 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH dpdk 2/2] hash: add replace API returning old data on overwrite 2026-02-12 21:33 [PATCH dpdk 0/2] hash: safe data replacement on overwrite Robin Jarry 2026-02-12 21:33 ` [PATCH dpdk 1/2] hash: free replaced data on overwrite when RCU is configured Robin Jarry @ 2026-02-12 21:33 ` Robin Jarry 2026-02-12 22:55 ` [PATCH dpdk 0/2] hash: safe data replacement " Stephen Hemminger ` (2 subsequent siblings) 4 siblings, 0 replies; 21+ messages in thread From: Robin Jarry @ 2026-02-12 21:33 UTC (permalink / raw) To: dev, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin Add rte_hash_replace_key_data() and rte_hash_replace_key_with_hash_data() which behave like their add_key_data counterparts but return the previous data pointer via an output parameter. On fresh insert, *old_data is set to NULL. When the caller provides old_data, no automatic RCU deferred free is performed: the caller takes ownership of the old pointer and is responsible for freeing it. Signed-off-by: Robin Jarry <rjarry@redhat.com> --- app/test/test_hash.c | 80 ++++++++++++++++++++++++++ doc/guides/rel_notes/release_26_03.rst | 6 ++ lib/hash/rte_cuckoo_hash.c | 42 ++++++++++++-- lib/hash/rte_hash.h | 67 +++++++++++++++++++++ 4 files changed, 189 insertions(+), 6 deletions(-) diff --git a/app/test/test_hash.c b/app/test/test_hash.c index 56a7779e09b9..fcc02ea31205 100644 --- a/app/test/test_hash.c +++ b/app/test/test_hash.c @@ -2473,6 +2473,83 @@ test_hash_rcu_qsbr_replace_auto_free(void) return ret; } +/* + * Test rte_hash_replace_key_data (explicit replace, no RCU). + * - Add key with data pointer (void *)1. + * - Replace same key with data (void *)2, verify old_data == (void *)1. + * - Lookup and verify data == (void *)2. + * - Replace with a new key (not yet inserted), verify old_data == NULL. + */ +static int +test_hash_replace_key_data(void) +{ + struct rte_hash_parameters params = { + .name = "test_replace_key_data", + .entries = 64, + .key_len = sizeof(uint32_t), + .hash_func = NULL, + .hash_func_init_val = 0, + .socket_id = SOCKET_ID_ANY, + }; + struct rte_hash *hash; + void *old_data = NULL; + void *data = NULL; + uint32_t key1 = 42; + uint32_t key2 = 99; + int ret = -1; + + printf("\n# Running replace key data test\n"); + + hash = rte_hash_create(¶ms); + if (hash == NULL) { + printf("hash creation failed\n"); + goto end; + } + + /* Add key with data = (void *)1 */ + ret = rte_hash_add_key_data(hash, &key1, (void *)(uintptr_t)1); + if (ret != 0) { + printf("failed to add key (ret=%d)\n", ret); + goto end; + } + + /* Replace same key with data = (void *)2 */ + ret = rte_hash_replace_key_data(hash, &key1, (void *)(uintptr_t)2, &old_data); + if (ret != 0) { + printf("failed to replace key (ret=%d)\n", ret); + goto end; + } + if (old_data != (void *)(uintptr_t)1) { + printf("old_data should be 0x1 but is %p\n", old_data); + goto end; + } + + /* Lookup and verify data == (void *)2 */ + ret = rte_hash_lookup_data(hash, &key1, &data); + if (ret < 0 || data != (void *)(uintptr_t)2) { + printf("lookup returned wrong data %p (ret=%d)\n", data, ret); + goto end; + } + + /* Replace with a new key (not yet inserted) */ + old_data = (void *)(uintptr_t)0xdeadbeef; + ret = rte_hash_replace_key_data(hash, &key2, (void *)(uintptr_t)3, &old_data); + if (ret != 0) { + printf("failed to insert new key via replace (ret=%d)\n", ret); + goto end; + } + if (old_data != NULL) { + printf("old_data should be NULL on fresh insert but is %p\n", + old_data); + goto end; + } + +end: + rte_hash_free(hash); + + return ret; +} + /* * Do all unit and performance tests. */ @@ -2557,6 +2634,9 @@ test_hash(void) if (test_hash_rcu_qsbr_replace_auto_free() < 0) return -1; + if (test_hash_replace_key_data() < 0) + return -1; + return 0; } diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst index 693034bcb0d2..73418523809c 100644 --- a/doc/guides/rel_notes/release_26_03.rst +++ b/doc/guides/rel_notes/release_26_03.rst @@ -93,6 +93,12 @@ New Features ``rte_hash_add_key_data`` now automatically defers freeing the old data pointer on key overwrite via the RCU defer queue. +* **Added hash replace API.** + + Added ``rte_hash_replace_key_data`` and ``rte_hash_replace_key_with_hash_data`` + functions that return the previous data pointer on key overwrite, letting the + caller manage its lifecycle. + Removed Items ------------- diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c index 0b979dc3be0b..039572d6eab1 100644 --- a/lib/hash/rte_cuckoo_hash.c +++ b/lib/hash/rte_cuckoo_hash.c @@ -1110,7 +1110,7 @@ __hash_rcu_auto_free_old_data(const struct rte_hash *h, void *old_data_val) static inline int32_t __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, - hash_sig_t sig, void *data) + hash_sig_t sig, void *data, void **old_data) { uint16_t short_sig; uint32_t prim_bucket_idx, sec_bucket_idx; @@ -1309,7 +1309,9 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, return slot_id - 1; overwrite: - if (saved_old_data != NULL) + if (old_data != NULL) + *old_data = saved_old_data; + else if (saved_old_data != NULL) __hash_rcu_auto_free_old_data(h, saved_old_data); return ret; @@ -1325,7 +1327,7 @@ rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t sig) { RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); - return __rte_hash_add_key_with_hash(h, key, sig, 0); + return __rte_hash_add_key_with_hash(h, key, sig, 0, NULL); } RTE_EXPORT_SYMBOL(rte_hash_add_key) @@ -1333,7 +1335,7 @@ int32_t rte_hash_add_key(const struct rte_hash *h, const void *key) { RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); - return __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), 0); + return __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), 0, NULL); } RTE_EXPORT_SYMBOL(rte_hash_add_key_with_hash_data) @@ -1344,7 +1346,7 @@ rte_hash_add_key_with_hash_data(const struct rte_hash *h, int ret; RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); - ret = __rte_hash_add_key_with_hash(h, key, sig, data); + ret = __rte_hash_add_key_with_hash(h, key, sig, data, NULL); if (ret >= 0) return 0; else @@ -1359,13 +1361,41 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data) RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); - ret = __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), data); + ret = __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), data, NULL); if (ret >= 0) return 0; else return ret; } +RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_hash_replace_key_with_hash_data, 26.03) +int +rte_hash_replace_key_with_hash_data(const struct rte_hash *h, + const void *key, hash_sig_t sig, + void *data, void **old_data) +{ + int ret; + + RETURN_IF_TRUE(((h == NULL) || (key == NULL) || + (old_data == NULL)), -EINVAL); + + *old_data = NULL; + ret = __rte_hash_add_key_with_hash(h, key, sig, data, old_data); + if (ret >= 0) + return 0; + else + return ret; +} + +RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_hash_replace_key_data, 26.03) +int +rte_hash_replace_key_data(const struct rte_hash *h, const void *key, + void *data, void **old_data) +{ + return rte_hash_replace_key_with_hash_data(h, key, + rte_hash_hash(h, key), data, old_data); +} + /* Search one bucket to find the match key - uses rw lock */ static inline int32_t search_one_bucket_l(const struct rte_hash *h, const void *key, diff --git a/lib/hash/rte_hash.h b/lib/hash/rte_hash.h index e33f0aea0f5e..4a328c51501c 100644 --- a/lib/hash/rte_hash.h +++ b/lib/hash/rte_hash.h @@ -279,6 +279,73 @@ int32_t rte_hash_add_key_with_hash_data(const struct rte_hash *h, const void *key, hash_sig_t sig, void *data); +/** + * Replace a key-value pair in an existing hash table, returning the previous + * data pointer associated with the key. If the key does not exist, it is + * inserted and *old_data is set to NULL. + * + * This operation is not multi-thread safe and should only be called from one + * thread by default. Thread safety can be enabled by setting flag during table + * creation. + * + * When old_data is provided, no automatic RCU deferred free is performed on + * overwrite; the caller takes ownership of the old pointer and is responsible + * for freeing it (e.g. via RCU). + * + * @param h + * Hash table to add the key to. + * @param key + * Key to add to the hash table. + * @param data + * Data to add to the hash table. + * @param old_data + * Output: on overwrite, set to the previous data pointer. + * On fresh insert, set to NULL. + * @return + * - 0 if added/replaced successfully + * - -EINVAL if the parameters are invalid. + * - -ENOSPC if there is no space in the hash for this key. + */ +__rte_experimental +int +rte_hash_replace_key_data(const struct rte_hash *h, const void *key, + void *data, void **old_data); + +/** + * Replace a key-value pair with a pre-computed hash value in an existing hash + * table, returning the previous data pointer associated with the key. If the + * key does not exist, it is inserted and *old_data is set to NULL. + * + * This operation is not multi-thread safe and should only be called from one + * thread by default. Thread safety can be enabled by setting flag during table + * creation. + * + * When old_data is provided, no automatic RCU deferred free is performed on + * overwrite; the caller takes ownership of the old pointer and is responsible + * for freeing it (e.g. via RCU). + * + * @param h + * Hash table to add the key to. + * @param key + * Key to add to the hash table. + * @param sig + * Precomputed hash value for 'key' + * @param data + * Data to add to the hash table. + * @param old_data + * Output: on overwrite, set to the previous data pointer. + * On fresh insert, set to NULL. + * @return + * - 0 if added/replaced successfully + * - -EINVAL if the parameters are invalid. + * - -ENOSPC if there is no space in the hash for this key. + */ +__rte_experimental +int +rte_hash_replace_key_with_hash_data(const struct rte_hash *h, + const void *key, hash_sig_t sig, + void *data, void **old_data); + /** * Add a key to an existing hash table. This operation is not multi-thread safe * and should only be called from one thread by default. -- 2.53.0 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH dpdk 0/2] hash: safe data replacement on overwrite 2026-02-12 21:33 [PATCH dpdk 0/2] hash: safe data replacement on overwrite Robin Jarry 2026-02-12 21:33 ` [PATCH dpdk 1/2] hash: free replaced data on overwrite when RCU is configured Robin Jarry 2026-02-12 21:33 ` [PATCH dpdk 2/2] hash: add replace API returning old data on overwrite Robin Jarry @ 2026-02-12 22:55 ` Stephen Hemminger 2026-02-13 10:34 ` [PATCH dpdk v2 0/3] " Robin Jarry 2026-03-06 8:47 ` [PATCH dpdk v3 0/2] hash: safe data replacement " Robin Jarry 4 siblings, 0 replies; 21+ messages in thread From: Stephen Hemminger @ 2026-02-12 22:55 UTC (permalink / raw) To: Robin Jarry; +Cc: dev On Thu, 12 Feb 2026 22:33:15 +0100 Robin Jarry <rjarry@redhat.com> wrote: > When rte_hash_add_key_data() is called with an existing key, the old > data pointer is silently overwritten. With RCU-protected readers still > potentially accessing the old data, the application cannot safely free > it since the old pointer is never returned. > > This series provides two ways to address the problem: > > * Automatic: when RCU is configured with a free_key_data_func callback, > rte_hash_add_key_data() now automatically defers freeing the old data > on overwrite. No caller changes are needed. > > * Explicit: a new rte_hash_replace_key_data() API returns the previous > data pointer via an output parameter, giving the caller full control > over the old pointer's lifecycle. No automatic deferred free is > performed in that case. > > Robin Jarry (2): > hash: free replaced data on overwrite when RCU is configured > hash: add replace API returning old data on overwrite > > app/test/test_hash.c | 214 +++++++++++++++++++++++++ > doc/guides/rel_notes/release_26_03.rst | 12 ++ > lib/hash/rte_cuckoo_hash.c | 146 +++++++++++++---- > lib/hash/rte_hash.h | 75 ++++++++- > 4 files changed, 417 insertions(+), 30 deletions(-) Subtle change, so fed to AI and got. Here's the deep dive review. The two most substantive findings: 1. Fragile overwrite/delete discriminator in __hash_rcu_qsbr_free_resource — The dequeue callback uses old_data != NULL to distinguish overwrite entries from delete entries, but the enqueue side already sets key_idx = 0 as a sentinel for overwrites. Using key_idx == 0 would be more robust, because if an application legitimately stores NULL data and overwrites the key, the current code will fall through to the delete path and try to recycle key index 0 (the dummy slot). 2. Leaked old data on rte_rcu_qsbr_dq_enqueue failure — The enqueue failure in __hash_rcu_auto_free_old_data logs an error but the old data becomes permanently unreachable. This is consistent with the existing delete-path pattern, but it's a real resource leak on error. The overall design is solid — the split between automatic RCU free (patch 1) for existing users and explicit replace API (patch 2) for fine-grained control is clean, and the interaction between the two is correctly ordered in the overwrite: label. Long review. # DPDK Hash Library Patch Series Review **Series**: hash: free replaced data on overwrite / add replace API **Author**: Robin Jarry <rjarry@redhat.com> **Patches**: 2 **Date**: 2026-02-12 --- ## Patch 1/2: hash: free replaced data on overwrite when RCU is configured ### Summary When `rte_hash_add_key_data()` overwrites an existing key, the old data pointer was silently lost. With RCU-protected readers still potentially accessing the old data, applications had no safe way to free it. This patch adds automatic deferred freeing of the old data pointer via the RCU defer queue when a `free_key_data_func` callback is configured. ### Correctness Analysis #### **Error — Potential use of NULL data treated as "no overwrite" (Medium confidence ~60%)** In `__rte_hash_add_key_with_hash()`, the overwrite detection relies on: ```c overwrite: if (saved_old_data != NULL) __hash_rcu_auto_free_old_data(h, saved_old_data); return ret; ``` If an application legitimately stores `NULL` as the data pointer for a key (which `rte_hash_add_key_data()` permits — `data` is `void *` with no restriction against NULL), and then overwrites that key with new data, `saved_old_data` will be `NULL` and the overwrite path will silently skip the auto-free. This is probably fine for the auto-free case (freeing NULL is a no-op anyway), but it means the code cannot distinguish "no overwrite happened" from "overwrite happened, old data was NULL." This becomes more significant in Patch 2/2 where `old_data` is exposed to the caller — see below. #### **Error — `__hash_rcu_qsbr_free_resource` overwrite detection uses `old_data != NULL` sentinel (Medium-high confidence ~70%)** In the dequeue callback: ```c /* Overwrite case: free old data only, do not recycle slot */ if (rcu_dq_entry.old_data != NULL) { h->hash_rcu_cfg->free_key_data_func( h->hash_rcu_cfg->key_data_ptr, rcu_dq_entry.old_data); return; } ``` This uses `old_data != NULL` to distinguish overwrite entries from delete entries. But overwrite entries are enqueued with `key_idx = 0` as the sentinel: ```c struct __rte_hash_rcu_dq_entry rcu_dq_entry = { .key_idx = 0, .ext_bkt_idx = EMPTY_SLOT, .old_data = old_data_val, }; ``` The `key_idx == 0` check would be a more robust sentinel since key index 0 is the reserved dummy entry in the cuckoo hash. Using `old_data != NULL` means that if someone stores NULL data and then overwrites, the entry flows through to the delete path which will try to access `keys[0]` — the dummy slot. The consequences depend on what's in the dummy slot, but it's at minimum semantically wrong. **Recommendation**: check `key_idx == 0` instead of `old_data != NULL` as the overwrite discriminator: ```c if (rcu_dq_entry.key_idx == 0) { /* Overwrite case */ if (rcu_dq_entry.old_data != NULL) h->hash_rcu_cfg->free_key_data_func(...); return; } ``` #### **Error — Missing error check on `rte_rcu_qsbr_dq_enqueue` (Low-medium confidence ~55%)** In `__hash_rcu_auto_free_old_data()`: ```c if (rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0) HASH_LOG(ERR, "Failed to push QSBR FIFO"); ``` The enqueue failure is logged but the old data pointer is leaked. The overwrite has already happened (the data in the hash table has been updated), so by the time we reach this point, the old data is unreachable. If enqueue fails, the old data will never be freed. This is a resource leak on a (presumably rare) error path. The existing delete path in `__rte_hash_del_key_with_hash` has the same pattern, so this is consistent with existing code, but it's worth noting that a failed enqueue here means a permanent memory leak of the old data. #### **Warning — Behavioral change in the ext-table duplicate-check path** In the extended-table section of `__rte_hash_add_key_with_hash`, the original code jumped to `failure:` when a duplicate was found (which unlocked and returned). The new code jumps to `overwrite:` after explicitly unlocking: ```c ret = search_and_update(h, data, key, prim_bkt, short_sig, &saved_old_data); if (ret != -1) { enqueue_slot_back(h, cached_free_slots, slot_id); __hash_rw_writer_unlock(h); goto overwrite; } ``` This is correct — the unlock happens before the goto, and `overwrite:` doesn't attempt to unlock. Just confirming this is safe. The change from `goto failure` to explicit unlock + `goto overwrite` is necessary and correct. ### Test Coverage The test `test_hash_rcu_qsbr_replace_auto_free` is well-structured: - Creates hash with RW_CONCURRENCY_LF and RCU QSBR in DQ mode - Registers a pseudo reader - Tests overwrite + reclaim, then delete + reclaim - Verifies the callback receives the correct old data pointers - Proper cleanup in the `end:` label The cleanup path correctly handles the case where `rcu.v` might be NULL (skipping thread offline/unregister). `rte_hash_free(NULL)` and `rte_free(NULL)` are both safe. ### Documentation Release notes entry is appropriate and well-written. The API doc updates to `rte_hash.h` clearly document the new behavior. --- ## Patch 2/2: hash: add replace API returning old data on overwrite ### Summary Adds `rte_hash_replace_key_data()` and `rte_hash_replace_key_with_hash_data()` — variants of the add APIs that return the previous data pointer via an output parameter. When the caller provides `old_data`, no automatic RCU deferred free is performed; the caller takes ownership. ### Correctness Analysis #### **Error — `saved_old_data` initialized to NULL but not reset between paths (Medium confidence ~65%)** In `__rte_hash_add_key_with_hash`, `saved_old_data` is initialized to `NULL` at declaration. The function has multiple `search_and_update` calls, each of which may or may not set `saved_old_data`. If a first `search_and_update` call doesn't find the key (returns -1), `saved_old_data` stays NULL. A later call that does find the key will set it. This is correct. However, consider the path where `rte_hash_cuckoo_insert_mw` returns 1 (key found during insert): ```c ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data, short_sig, slot_id, &ret_val, &saved_old_data); if (ret == 0) return slot_id - 1; else if (ret == 1) { enqueue_slot_back(h, cached_free_slots, slot_id); ret = ret_val; goto overwrite; } ``` On the `ret == 0` path (fresh insert, no overwrite), the function returns directly without going through `overwrite:`. So `old_data` is never set for fresh inserts via this path. But `rte_hash_replace_key_with_hash_data` initializes `*old_data = NULL` before the call, so this is safe — the caller sees NULL for fresh inserts. This is correct. #### **Warning — New experimental APIs missing `__rte_experimental` annotation validation** The new functions `rte_hash_replace_key_data()` and `rte_hash_replace_key_with_hash_data()` are correctly marked `__rte_experimental` in the header and use `RTE_EXPORT_EXPERIMENTAL_SYMBOL` in the `.c` file. This follows DPDK conventions. #### **Info — API design: `old_data` is required (non-NULL) for replace functions** `rte_hash_replace_key_with_hash_data` validates `old_data != NULL`: ```c RETURN_IF_TRUE(((h == NULL) || (key == NULL) || (old_data == NULL)), -EINVAL); ``` This is a reasonable design choice — if you don't want old_data, use the existing `rte_hash_add_key_data()` instead. The semantics are clean: replace always tells you what was there before. ### Test Coverage `test_hash_replace_key_data` covers: - Overwrite path: verifies `old_data` returns the previous value - Lookup after replace: verifies the new data is stored correctly - Fresh insert via replace: verifies `old_data` is set to NULL The test intentionally sets `old_data = (void *)0xdeadbeef` before the fresh-insert replace to verify it gets zeroed. Good practice. **Note**: The test doesn't cover the `rte_hash_replace_key_with_hash_data()` variant (with precomputed hash). Since it's a thin wrapper that delegates to the same internal function, the coverage is adequate, but a dedicated test for the hash variant would be thorough. ### Documentation Release notes and API documentation are comprehensive and clear. The doxygen comments properly document the ownership semantics (caller owns old pointer, no auto-RCU-free). --- ## Cross-Patch Observations ### Overall Architecture Assessment The design is clean and well-thought-out: 1. **Patch 1** adds automatic RCU deferred free as a "zero-effort" improvement for existing users who already have `free_key_data_func` configured. 2. **Patch 2** adds an explicit replace API for users who need precise control over the old data's lifecycle. 3. The two features interact correctly: when `old_data` is provided (replace API), automatic RCU free is suppressed; when `old_data` is NULL (existing add API), automatic RCU free kicks in. The priority of the `overwrite:` label logic in the final version is: ```c overwrite: if (old_data != NULL) // caller wants ownership (replace API) *old_data = saved_old_data; else if (saved_old_data != NULL) // auto-free via RCU (add API) __hash_rcu_auto_free_old_data(h, saved_old_data); return ret; ``` This is the correct priority ordering. ### Key Finding Summary | # | Severity | Patch | Finding | |---|----------|-------|---------| | 1 | Error | 1/2 | `old_data != NULL` sentinel in `__hash_rcu_qsbr_free_resource` is fragile; use `key_idx == 0` instead to discriminate overwrite vs delete entries | | 2 | Error | 1/2 | Failed `rte_rcu_qsbr_dq_enqueue` leaks old data (consistent with existing code pattern but worth noting) | | 3 | Info | 2/2 | No test for `rte_hash_replace_key_with_hash_data()` variant specifically | ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH dpdk v2 0/3] hash: safe data replacement on overwrite 2026-02-12 21:33 [PATCH dpdk 0/2] hash: safe data replacement on overwrite Robin Jarry ` (2 preceding siblings ...) 2026-02-12 22:55 ` [PATCH dpdk 0/2] hash: safe data replacement " Stephen Hemminger @ 2026-02-13 10:34 ` Robin Jarry 2026-02-13 10:34 ` [PATCH dpdk v2 1/3] hash: avoid leaking entries on RCU defer queue failure Robin Jarry ` (2 more replies) 2026-03-06 8:47 ` [PATCH dpdk v3 0/2] hash: safe data replacement " Robin Jarry 4 siblings, 3 replies; 21+ messages in thread From: Robin Jarry @ 2026-02-13 10:34 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger When rte_hash_add_key_data() is called with an existing key, the old data pointer is silently overwritten. With RCU-protected readers still potentially accessing the old data, the application cannot safely free it since the old pointer is never returned. This series provides two ways to address the problem: * Automatic: when RCU is configured with a free_key_data_func callback, rte_hash_add_key_data() now automatically defers freeing the old data on overwrite. No caller changes are needed. * Explicit: a new rte_hash_replace_key_data() API returns the previous data pointer via an output parameter, giving the caller full control over the old pointer's life cycle. No automatic deferred free is performed in that case. v2: * New patch to fallback to SYNC mode when DQ fails on delete. * Check key_idx == EMPTY_SLOT in DQ mode to differentiate when old_data is NULL. * Also fallback to SYNC mode when DQ fails on replace. Robin Jarry (3): hash: avoid leaking entries on RCU defer queue failure hash: free replaced data on overwrite when RCU is configured hash: add replace API returning old data on overwrite app/test/test_hash.c | 214 +++++++++++++++++++++++++ doc/guides/rel_notes/release_26_03.rst | 12 ++ lib/hash/rte_cuckoo_hash.c | 150 +++++++++++++---- lib/hash/rte_hash.h | 75 ++++++++- 4 files changed, 415 insertions(+), 36 deletions(-) -- 2.53.0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH dpdk v2 1/3] hash: avoid leaking entries on RCU defer queue failure 2026-02-13 10:34 ` [PATCH dpdk v2 0/3] " Robin Jarry @ 2026-02-13 10:34 ` Robin Jarry 2026-03-04 10:28 ` Konstantin Ananyev 2026-02-13 10:34 ` [PATCH dpdk v2 2/3] hash: free replaced data on overwrite when RCU is configured Robin Jarry 2026-02-13 10:34 ` [PATCH dpdk v2 3/3] hash: add replace API returning old data on overwrite Robin Jarry 2 siblings, 1 reply; 21+ messages in thread From: Robin Jarry @ 2026-02-13 10:34 UTC (permalink / raw) To: dev, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin Cc: Stephen Hemminger When rte_rcu_qsbr_dq_enqueue() fails in DQ mode, the deleted key slot is never freed and becomes permanently leaked. Fall back to synchronous reclamation instead of only logging an error. Signed-off-by: Robin Jarry <rjarry@redhat.com> --- lib/hash/rte_cuckoo_hash.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c index da12825c6ed2..8189bde024be 100644 --- a/lib/hash/rte_cuckoo_hash.c +++ b/lib/hash/rte_cuckoo_hash.c @@ -1870,18 +1870,15 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key, /* Key index where key is stored, adding the first dummy index */ rcu_dq_entry.key_idx = ret + 1; rcu_dq_entry.ext_bkt_idx = index; - if (h->dq == NULL) { + if (h->dq == NULL || rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0) { /* Wait for quiescent state change if using - * RTE_HASH_QSBR_MODE_SYNC + * RTE_HASH_QSBR_MODE_SYNC or if RCU enqueue failed. */ rte_rcu_qsbr_synchronize(h->hash_rcu_cfg->v, RTE_QSBR_THRID_INVALID); __hash_rcu_qsbr_free_resource((void *)((uintptr_t)h), &rcu_dq_entry, 1); - } else if (h->dq) - /* Push into QSBR FIFO if using RTE_HASH_QSBR_MODE_DQ */ - if (rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0) - HASH_LOG(ERR, "Failed to push QSBR FIFO"); + } } __hash_rw_writer_unlock(h); return ret; -- 2.53.0 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* RE: [PATCH dpdk v2 1/3] hash: avoid leaking entries on RCU defer queue failure 2026-02-13 10:34 ` [PATCH dpdk v2 1/3] hash: avoid leaking entries on RCU defer queue failure Robin Jarry @ 2026-03-04 10:28 ` Konstantin Ananyev 0 siblings, 0 replies; 21+ messages in thread From: Konstantin Ananyev @ 2026-03-04 10:28 UTC (permalink / raw) To: Robin Jarry, dev@dpdk.org, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin Cc: Stephen Hemminger > > When rte_rcu_qsbr_dq_enqueue() fails in DQ mode, the deleted key slot > is never freed and becomes permanently leaked. Fall back to synchronous > reclamation instead of only logging an error. > > Signed-off-by: Robin Jarry <rjarry@redhat.com> > --- > lib/hash/rte_cuckoo_hash.c | 9 +++------ > 1 file changed, 3 insertions(+), 6 deletions(-) > > diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c > index da12825c6ed2..8189bde024be 100644 > --- a/lib/hash/rte_cuckoo_hash.c > +++ b/lib/hash/rte_cuckoo_hash.c > @@ -1870,18 +1870,15 @@ __rte_hash_del_key_with_hash(const struct > rte_hash *h, const void *key, > /* Key index where key is stored, adding the first dummy index > */ > rcu_dq_entry.key_idx = ret + 1; > rcu_dq_entry.ext_bkt_idx = index; > - if (h->dq == NULL) { > + if (h->dq == NULL || rte_rcu_qsbr_dq_enqueue(h->dq, > &rcu_dq_entry) != 0) { > /* Wait for quiescent state change if using > - * RTE_HASH_QSBR_MODE_SYNC > + * RTE_HASH_QSBR_MODE_SYNC or if RCU enqueue > failed. > */ > rte_rcu_qsbr_synchronize(h->hash_rcu_cfg->v, > RTE_QSBR_THRID_INVALID); > __hash_rcu_qsbr_free_resource((void *)((uintptr_t)h), > &rcu_dq_entry, 1); > - } else if (h->dq) > - /* Push into QSBR FIFO if using > RTE_HASH_QSBR_MODE_DQ */ > - if (rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != > 0) > - HASH_LOG(ERR, "Failed to push QSBR FIFO"); > + } > } > __hash_rw_writer_unlock(h); > return ret; > -- Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com> > 2.53.0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH dpdk v2 2/3] hash: free replaced data on overwrite when RCU is configured 2026-02-13 10:34 ` [PATCH dpdk v2 0/3] " Robin Jarry 2026-02-13 10:34 ` [PATCH dpdk v2 1/3] hash: avoid leaking entries on RCU defer queue failure Robin Jarry @ 2026-02-13 10:34 ` Robin Jarry 2026-03-04 11:40 ` Konstantin Ananyev 2026-02-13 10:34 ` [PATCH dpdk v2 3/3] hash: add replace API returning old data on overwrite Robin Jarry 2 siblings, 1 reply; 21+ messages in thread From: Robin Jarry @ 2026-02-13 10:34 UTC (permalink / raw) To: dev, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin Cc: Stephen Hemminger When rte_hash_add_key_data() overwrites an existing key, the old data pointer is silently lost. With RCU-protected readers still potentially accessing the old data, the application has no safe way to free it. When RCU is configured with a free_key_data_func callback, automatically enqueue the old data for deferred freeing via the RCU defer queue on overwrite. In SYNC mode, synchronize and call free_key_data_func directly. Signed-off-by: Robin Jarry <rjarry@redhat.com> --- app/test/test_hash.c | 134 +++++++++++++++++++++++++ doc/guides/rel_notes/release_26_03.rst | 6 ++ lib/hash/rte_cuckoo_hash.c | 101 ++++++++++++++----- lib/hash/rte_hash.h | 8 +- 4 files changed, 224 insertions(+), 25 deletions(-) diff --git a/app/test/test_hash.c b/app/test/test_hash.c index 3fb3d96d0546..56a7779e09b9 100644 --- a/app/test/test_hash.c +++ b/app/test/test_hash.c @@ -2342,6 +2342,137 @@ test_hash_rcu_qsbr_dq_reclaim(void) return 0; } +static void *old_data; + +static void +test_hash_free_key_data_func(void *p __rte_unused, void *key_data) +{ + old_data = key_data; +} + +/* + * Test automatic RCU free on overwrite via rte_hash_add_key_data. + * - Create hash with RW_CONCURRENCY_LF and RCU QSBR in DQ mode + * with a free_key_data_func callback that increments a counter. + * - Register a pseudo reader thread. + * - Add key with data (void *)1. + * - Overwrite same key with data (void *)2 via rte_hash_add_key_data. + * - Report quiescent state, trigger reclamation. + * - Verify the free callback was called exactly once. + * - Delete the key, report quiescent state, reclaim again. + * - Verify the free callback was called a second time. + */ +static int +test_hash_rcu_qsbr_replace_auto_free(void) +{ + struct rte_hash_rcu_config rcu = { + .v = NULL, + .mode = RTE_HASH_QSBR_MODE_DQ, + .free_key_data_func = test_hash_free_key_data_func, + .key_data_ptr = NULL, + }; + struct rte_hash_parameters params = { + .name = "test_replace_auto_free", + .entries = 16, + .key_len = sizeof(uint32_t), + .hash_func = NULL, + .hash_func_init_val = 0, + .socket_id = SOCKET_ID_ANY, + .extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF, + }; + struct rte_hash *hash = NULL; + uint32_t key = 55; + int32_t status; + int ret = -1; + size_t sz; + + printf("\n# Running RCU replace auto-free test\n"); + + hash = rte_hash_create(¶ms); + if (hash == NULL) { + printf("hash creation failed\n"); + goto end; + } + + sz = rte_rcu_qsbr_get_memsize(RTE_MAX_LCORE); + rcu.v = rte_zmalloc(NULL, sz, RTE_CACHE_LINE_SIZE); + if (rcu.v == NULL) { + printf("RCU QSBR alloc failed\n"); + goto end; + } + status = rte_rcu_qsbr_init(rcu.v, RTE_MAX_LCORE); + if (status != 0) { + printf("RCU QSBR init failed\n"); + goto end; + } + + status = rte_hash_rcu_qsbr_add(hash, &rcu); + if (status != 0) { + printf("RCU QSBR add failed\n"); + goto end; + } + + /* Register pseudo reader */ + status = rte_rcu_qsbr_thread_register(rcu.v, 0); + if (status != 0) { + printf("RCU QSBR thread register failed\n"); + goto end; + } + rte_rcu_qsbr_thread_online(rcu.v, 0); + + old_data = NULL; + + /* Add key with data = (void *)1 */ + status = rte_hash_add_key_data(hash, &key, (void *)(uintptr_t)1); + if (status != 0) { + printf("failed to add key (status=%d)\n", status); + goto end; + } + + /* Overwrite same key with data = (void *)2 */ + status = rte_hash_add_key_data(hash, &key, (void *)(uintptr_t)2); + if (status != 0) { + printf("failed to overwrite key (status=%d)\n", status); + goto end; + } + + /* Reader quiescent and reclaim */ + rte_rcu_qsbr_quiescent(rcu.v, 0); + rte_hash_rcu_qsbr_dq_reclaim(hash, NULL, NULL, NULL); + + if (old_data != (void *)(uintptr_t)1) { + printf("old data should be 0x1 but is %p\n", old_data); + goto end; + } + + /* Delete the key */ + status = rte_hash_del_key(hash, &key); + if (status < 0) { + printf("failed to delete key (status=%d)\n", status); + goto end; + } + + /* Reader quiescent and reclaim again */ + rte_rcu_qsbr_quiescent(rcu.v, 0); + rte_hash_rcu_qsbr_dq_reclaim(hash, NULL, NULL, NULL); + + if (old_data != (void *)(uintptr_t)2) { + printf("old data should be 2 but is %p\n", old_data); + goto end; + } + + ret = 0; +end: + if (rcu.v != NULL) { + rte_rcu_qsbr_thread_offline(rcu.v, 0); + rte_rcu_qsbr_thread_unregister(rcu.v, 0); + } + rte_hash_free(hash); + rte_free(rcu.v); + + return ret; +} + /* * Do all unit and performance tests. */ @@ -2423,6 +2554,9 @@ test_hash(void) if (test_hash_rcu_qsbr_dq_reclaim() < 0) return -1; + if (test_hash_rcu_qsbr_replace_auto_free() < 0) + return -1; + return 0; } diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst index afdf1af06c20..693034bcb0d2 100644 --- a/doc/guides/rel_notes/release_26_03.rst +++ b/doc/guides/rel_notes/release_26_03.rst @@ -87,6 +87,12 @@ New Features * Added support for AES-XTS cipher algorithm. * Added support for SHAKE-128 and SHAKE-256 authentication algorithms. +* **Added automatic deferred free on hash data overwrite.** + + When RCU is configured with a ``free_key_data_func`` callback, + ``rte_hash_add_key_data`` now automatically defers freeing the old data + pointer on key overwrite via the RCU defer queue. + Removed Items ------------- diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c index 8189bde024be..f487b3b725dd 100644 --- a/lib/hash/rte_cuckoo_hash.c +++ b/lib/hash/rte_cuckoo_hash.c @@ -75,6 +75,7 @@ EAL_REGISTER_TAILQ(rte_hash_tailq) struct __rte_hash_rcu_dq_entry { uint32_t key_idx; uint32_t ext_bkt_idx; + void *old_data; }; RTE_EXPORT_SYMBOL(rte_hash_find_existing) @@ -763,10 +764,11 @@ enqueue_slot_back(const struct rte_hash *h, /* Search a key from bucket and update its data. * Writer holds the lock before calling this. + * If old_data is non-NULL, save the previous data pointer before overwriting. */ static inline int32_t search_and_update(const struct rte_hash *h, void *data, const void *key, - struct rte_hash_bucket *bkt, uint16_t sig) + struct rte_hash_bucket *bkt, uint16_t sig, void **old_data) { int i; struct rte_hash_key *k, *keys = h->key_store; @@ -776,6 +778,8 @@ search_and_update(const struct rte_hash *h, void *data, const void *key, k = (struct rte_hash_key *) ((char *)keys + bkt->key_idx[i] * h->key_entry_size); if (rte_hash_cmp_eq(key, k->key, h) == 0) { + if (old_data != NULL) + *old_data = k->pdata; /* The store to application data at *data * should not leak after the store to pdata * in the key store. i.e. pdata is the guard @@ -807,7 +811,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, struct rte_hash_bucket *sec_bkt, const struct rte_hash_key *key, void *data, uint16_t sig, uint32_t new_idx, - int32_t *ret_val) + int32_t *ret_val, void **old_data) { unsigned int i; struct rte_hash_bucket *cur_bkt; @@ -817,7 +821,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, /* Check if key was inserted after last check but before this * protected region in case of inserting duplicated keys. */ - ret = search_and_update(h, data, key, prim_bkt, sig); + ret = search_and_update(h, data, key, prim_bkt, sig, old_data); if (ret != -1) { __hash_rw_writer_unlock(h); *ret_val = ret; @@ -825,7 +829,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, } FOR_EACH_BUCKET(cur_bkt, sec_bkt) { - ret = search_and_update(h, data, key, cur_bkt, sig); + ret = search_and_update(h, data, key, cur_bkt, sig, old_data); if (ret != -1) { __hash_rw_writer_unlock(h); *ret_val = ret; @@ -872,7 +876,7 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h, const struct rte_hash_key *key, void *data, struct queue_node *leaf, uint32_t leaf_slot, uint16_t sig, uint32_t new_idx, - int32_t *ret_val) + int32_t *ret_val, void **old_data) { uint32_t prev_alt_bkt_idx; struct rte_hash_bucket *cur_bkt; @@ -892,7 +896,7 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h, /* Check if key was inserted after last check but before this * protected region. */ - ret = search_and_update(h, data, key, bkt, sig); + ret = search_and_update(h, data, key, bkt, sig, old_data); if (ret != -1) { __hash_rw_writer_unlock(h); *ret_val = ret; @@ -900,7 +904,7 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash *h, } FOR_EACH_BUCKET(cur_bkt, alt_bkt) { - ret = search_and_update(h, data, key, cur_bkt, sig); + ret = search_and_update(h, data, key, cur_bkt, sig, old_data); if (ret != -1) { __hash_rw_writer_unlock(h); *ret_val = ret; @@ -997,7 +1001,8 @@ rte_hash_cuckoo_make_space_mw(const struct rte_hash *h, struct rte_hash_bucket *sec_bkt, const struct rte_hash_key *key, void *data, uint16_t sig, uint32_t bucket_idx, - uint32_t new_idx, int32_t *ret_val) + uint32_t new_idx, int32_t *ret_val, + void **old_data) { unsigned int i; struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN]; @@ -1023,7 +1028,7 @@ rte_hash_cuckoo_make_space_mw(const struct rte_hash *h, int32_t ret = rte_hash_cuckoo_move_insert_mw(h, bkt, sec_bkt, key, data, tail, i, sig, - new_idx, ret_val); + new_idx, ret_val, old_data); if (likely(ret != -1)) return ret; } @@ -1076,6 +1081,29 @@ alloc_slot(const struct rte_hash *h, struct lcore_cache *cached_free_slots) return slot_id; } +/* + * When RCU is configured with a free function, auto-free the overwritten + * data pointer via RCU. + */ +static inline void +__hash_rcu_auto_free_old_data(const struct rte_hash *h, void *old_data_val) +{ + struct __rte_hash_rcu_dq_entry rcu_dq_entry = { + .key_idx = EMPTY_SLOT, /* sentinel value for __hash_rcu_qsbr_free_resource */ + .old_data = old_data_val, + }; + + if (h->hash_rcu_cfg == NULL || h->hash_rcu_cfg->free_key_data_func == NULL) + return; + + if (h->dq == NULL || rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0) { + /* SYNC mode or enqueue failed in DQ mode */ + rte_rcu_qsbr_synchronize(h->hash_rcu_cfg->v, RTE_QSBR_THRID_INVALID); + h->hash_rcu_cfg->free_key_data_func(h->hash_rcu_cfg->key_data_ptr, + old_data_val); + } +} + static inline int32_t __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t sig, void *data) @@ -1092,6 +1120,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, struct lcore_cache *cached_free_slots = NULL; int32_t ret_val; struct rte_hash_bucket *last; + void *saved_old_data = NULL; short_sig = get_short_sig(sig); prim_bucket_idx = get_prim_bucket_index(h, sig); @@ -1103,18 +1132,20 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, /* Check if key is already inserted in primary location */ __hash_rw_writer_lock(h); - ret = search_and_update(h, data, key, prim_bkt, short_sig); + ret = search_and_update(h, data, key, prim_bkt, short_sig, + &saved_old_data); if (ret != -1) { __hash_rw_writer_unlock(h); - return ret; + goto overwrite; } /* Check if key is already inserted in secondary location */ FOR_EACH_BUCKET(cur_bkt, sec_bkt) { - ret = search_and_update(h, data, key, cur_bkt, short_sig); + ret = search_and_update(h, data, key, cur_bkt, short_sig, + &saved_old_data); if (ret != -1) { __hash_rw_writer_unlock(h); - return ret; + goto overwrite; } } @@ -1153,33 +1184,39 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, /* Find an empty slot and insert */ ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data, - short_sig, slot_id, &ret_val); + short_sig, slot_id, &ret_val, + &saved_old_data); if (ret == 0) return slot_id - 1; else if (ret == 1) { enqueue_slot_back(h, cached_free_slots, slot_id); - return ret_val; + ret = ret_val; + goto overwrite; } /* Primary bucket full, need to make space for new entry */ ret = rte_hash_cuckoo_make_space_mw(h, prim_bkt, sec_bkt, key, data, - short_sig, prim_bucket_idx, slot_id, &ret_val); + short_sig, prim_bucket_idx, slot_id, &ret_val, + &saved_old_data); if (ret == 0) return slot_id - 1; else if (ret == 1) { enqueue_slot_back(h, cached_free_slots, slot_id); - return ret_val; + ret = ret_val; + goto overwrite; } /* Also search secondary bucket to get better occupancy */ ret = rte_hash_cuckoo_make_space_mw(h, sec_bkt, prim_bkt, key, data, - short_sig, sec_bucket_idx, slot_id, &ret_val); + short_sig, sec_bucket_idx, slot_id, &ret_val, + &saved_old_data); if (ret == 0) return slot_id - 1; else if (ret == 1) { enqueue_slot_back(h, cached_free_slots, slot_id); - return ret_val; + ret = ret_val; + goto overwrite; } /* if ext table not enabled, we failed the insertion */ @@ -1193,17 +1230,21 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, */ __hash_rw_writer_lock(h); /* We check for duplicates again since could be inserted before the lock */ - ret = search_and_update(h, data, key, prim_bkt, short_sig); + ret = search_and_update(h, data, key, prim_bkt, short_sig, + &saved_old_data); if (ret != -1) { enqueue_slot_back(h, cached_free_slots, slot_id); - goto failure; + __hash_rw_writer_unlock(h); + goto overwrite; } FOR_EACH_BUCKET(cur_bkt, sec_bkt) { - ret = search_and_update(h, data, key, cur_bkt, short_sig); + ret = search_and_update(h, data, key, cur_bkt, short_sig, + &saved_old_data); if (ret != -1) { enqueue_slot_back(h, cached_free_slots, slot_id); - goto failure; + __hash_rw_writer_unlock(h); + goto overwrite; } } @@ -1263,6 +1304,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, __hash_rw_writer_unlock(h); return slot_id - 1; +overwrite: + if (saved_old_data != NULL) + __hash_rcu_auto_free_old_data(h, saved_old_data); + return ret; + failure: __hash_rw_writer_unlock(h); return ret; @@ -1566,6 +1612,15 @@ __hash_rcu_qsbr_free_resource(void *p, void *e, unsigned int n) *((struct __rte_hash_rcu_dq_entry *)e); RTE_SET_USED(n); + + if (rcu_dq_entry.key_idx == EMPTY_SLOT) { + /* Overwrite case: free old data only, do not recycle slot */ + RTE_ASSERT(h->hash_rcu_cfg->free_key_data_func != NULL); + h->hash_rcu_cfg->free_key_data_func(h->hash_rcu_cfg->key_data_ptr, + rcu_dq_entry.old_data); + return; + } + keys = h->key_store; k = (struct rte_hash_key *) ((char *)keys + diff --git a/lib/hash/rte_hash.h b/lib/hash/rte_hash.h index f692e0868dcf..e33f0aea0f5e 100644 --- a/lib/hash/rte_hash.h +++ b/lib/hash/rte_hash.h @@ -226,7 +226,9 @@ rte_hash_max_key_id(const struct rte_hash *h); * Thread safety can be enabled by setting flag during * table creation. * If the key exists already in the table, this API updates its value - * with 'data' passed in this API. It is the responsibility of + * with 'data' passed in this API. If RCU is configured with a + * free_key_data_func callback, the old data is automatically + * deferred-freed via RCU. Otherwise, it is the responsibility of * the application to manage any memory associated with the old value. * The readers might still be using the old value even after this API * has returned. @@ -253,7 +255,9 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data); * Thread safety can be enabled by setting flag during * table creation. * If the key exists already in the table, this API updates its value - * with 'data' passed in this API. It is the responsibility of + * with 'data' passed in this API. If RCU is configured with a + * free_key_data_func callback, the old data is automatically + * deferred-freed via RCU. Otherwise, it is the responsibility of * the application to manage any memory associated with the old value. * The readers might still be using the old value even after this API * has returned. -- 2.53.0 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* RE: [PATCH dpdk v2 2/3] hash: free replaced data on overwrite when RCU is configured 2026-02-13 10:34 ` [PATCH dpdk v2 2/3] hash: free replaced data on overwrite when RCU is configured Robin Jarry @ 2026-03-04 11:40 ` Konstantin Ananyev 2026-03-04 11:45 ` Robin Jarry 0 siblings, 1 reply; 21+ messages in thread From: Konstantin Ananyev @ 2026-03-04 11:40 UTC (permalink / raw) To: Robin Jarry, dev@dpdk.org, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin Cc: Stephen Hemminger > When rte_hash_add_key_data() overwrites an existing key, the old data > pointer is silently lost. With RCU-protected readers still potentially > accessing the old data, the application has no safe way to free it. > > When RCU is configured with a free_key_data_func callback, automatically > enqueue the old data for deferred freeing via the RCU defer queue on > overwrite. In SYNC mode, synchronize and call free_key_data_func > directly. > > Signed-off-by: Robin Jarry <rjarry@redhat.com> > --- > app/test/test_hash.c | 134 +++++++++++++++++++++++++ > doc/guides/rel_notes/release_26_03.rst | 6 ++ > lib/hash/rte_cuckoo_hash.c | 101 ++++++++++++++----- > lib/hash/rte_hash.h | 8 +- > 4 files changed, 224 insertions(+), 25 deletions(-) > > diff --git a/app/test/test_hash.c b/app/test/test_hash.c > index 3fb3d96d0546..56a7779e09b9 100644 > --- a/app/test/test_hash.c > +++ b/app/test/test_hash.c > @@ -2342,6 +2342,137 @@ test_hash_rcu_qsbr_dq_reclaim(void) > return 0; > } > > +static void *old_data; > + > +static void > +test_hash_free_key_data_func(void *p __rte_unused, void *key_data) > +{ > + old_data = key_data; > +} > + > +/* > + * Test automatic RCU free on overwrite via rte_hash_add_key_data. > + * - Create hash with RW_CONCURRENCY_LF and RCU QSBR in DQ mode > + * with a free_key_data_func callback that increments a counter. > + * - Register a pseudo reader thread. > + * - Add key with data (void *)1. > + * - Overwrite same key with data (void *)2 via rte_hash_add_key_data. > + * - Report quiescent state, trigger reclamation. > + * - Verify the free callback was called exactly once. > + * - Delete the key, report quiescent state, reclaim again. > + * - Verify the free callback was called a second time. > + */ > +static int > +test_hash_rcu_qsbr_replace_auto_free(void) > +{ > + struct rte_hash_rcu_config rcu = { > + .v = NULL, > + .mode = RTE_HASH_QSBR_MODE_DQ, > + .free_key_data_func = test_hash_free_key_data_func, > + .key_data_ptr = NULL, > + }; > + struct rte_hash_parameters params = { > + .name = "test_replace_auto_free", > + .entries = 16, > + .key_len = sizeof(uint32_t), > + .hash_func = NULL, > + .hash_func_init_val = 0, > + .socket_id = SOCKET_ID_ANY, > + .extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF, > + }; > + struct rte_hash *hash = NULL; > + uint32_t key = 55; > + int32_t status; > + int ret = -1; > + size_t sz; > + > + printf("\n# Running RCU replace auto-free test\n"); > + > + hash = rte_hash_create(¶ms); > + if (hash == NULL) { > + printf("hash creation failed\n"); > + goto end; > + } > + > + sz = rte_rcu_qsbr_get_memsize(RTE_MAX_LCORE); > + rcu.v = rte_zmalloc(NULL, sz, RTE_CACHE_LINE_SIZE); > + if (rcu.v == NULL) { > + printf("RCU QSBR alloc failed\n"); > + goto end; > + } > + status = rte_rcu_qsbr_init(rcu.v, RTE_MAX_LCORE); > + if (status != 0) { > + printf("RCU QSBR init failed\n"); > + goto end; > + } > + > + status = rte_hash_rcu_qsbr_add(hash, &rcu); > + if (status != 0) { > + printf("RCU QSBR add failed\n"); > + goto end; > + } > + > + /* Register pseudo reader */ > + status = rte_rcu_qsbr_thread_register(rcu.v, 0); > + if (status != 0) { > + printf("RCU QSBR thread register failed\n"); > + goto end; > + } > + rte_rcu_qsbr_thread_online(rcu.v, 0); > + > + old_data = NULL; > + > + /* Add key with data = (void *)1 */ > + status = rte_hash_add_key_data(hash, &key, (void *)(uintptr_t)1); > + if (status != 0) { > + printf("failed to add key (status=%d)\n", status); > + goto end; > + } > + > + /* Overwrite same key with data = (void *)2 */ > + status = rte_hash_add_key_data(hash, &key, (void *)(uintptr_t)2); > + if (status != 0) { > + printf("failed to overwrite key (status=%d)\n", status); > + goto end; > + } > + > + /* Reader quiescent and reclaim */ > + rte_rcu_qsbr_quiescent(rcu.v, 0); > + rte_hash_rcu_qsbr_dq_reclaim(hash, NULL, NULL, NULL); > + > + if (old_data != (void *)(uintptr_t)1) { > + printf("old data should be 0x1 but is %p\n", old_data); > + goto end; > + } > + > + /* Delete the key */ > + status = rte_hash_del_key(hash, &key); > + if (status < 0) { > + printf("failed to delete key (status=%d)\n", status); > + goto end; > + } > + > + /* Reader quiescent and reclaim again */ > + rte_rcu_qsbr_quiescent(rcu.v, 0); > + rte_hash_rcu_qsbr_dq_reclaim(hash, NULL, NULL, NULL); > + > + if (old_data != (void *)(uintptr_t)2) { > + printf("old data should be 2 but is %p\n", old_data); > + goto end; > + } > + > + ret = 0; > +end: > + if (rcu.v != NULL) { > + rte_rcu_qsbr_thread_offline(rcu.v, 0); > + rte_rcu_qsbr_thread_unregister(rcu.v, 0); > + } > + rte_hash_free(hash); > + rte_free(rcu.v); > + > + return ret; > +} > + > /* > * Do all unit and performance tests. > */ > @@ -2423,6 +2554,9 @@ test_hash(void) > if (test_hash_rcu_qsbr_dq_reclaim() < 0) > return -1; > > + if (test_hash_rcu_qsbr_replace_auto_free() < 0) > + return -1; > + > return 0; > } > > diff --git a/doc/guides/rel_notes/release_26_03.rst > b/doc/guides/rel_notes/release_26_03.rst > index afdf1af06c20..693034bcb0d2 100644 > --- a/doc/guides/rel_notes/release_26_03.rst > +++ b/doc/guides/rel_notes/release_26_03.rst > @@ -87,6 +87,12 @@ New Features > * Added support for AES-XTS cipher algorithm. > * Added support for SHAKE-128 and SHAKE-256 authentication algorithms. > > +* **Added automatic deferred free on hash data overwrite.** > + > + When RCU is configured with a ``free_key_data_func`` callback, > + ``rte_hash_add_key_data`` now automatically defers freeing the old data > + pointer on key overwrite via the RCU defer queue. > + > > Removed Items > ------------- > diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c > index 8189bde024be..f487b3b725dd 100644 > --- a/lib/hash/rte_cuckoo_hash.c > +++ b/lib/hash/rte_cuckoo_hash.c > @@ -75,6 +75,7 @@ EAL_REGISTER_TAILQ(rte_hash_tailq) > struct __rte_hash_rcu_dq_entry { > uint32_t key_idx; > uint32_t ext_bkt_idx; > + void *old_data; > }; > > RTE_EXPORT_SYMBOL(rte_hash_find_existing) > @@ -763,10 +764,11 @@ enqueue_slot_back(const struct rte_hash *h, > > /* Search a key from bucket and update its data. > * Writer holds the lock before calling this. > + * If old_data is non-NULL, save the previous data pointer before overwriting. > */ > static inline int32_t > search_and_update(const struct rte_hash *h, void *data, const void *key, > - struct rte_hash_bucket *bkt, uint16_t sig) > + struct rte_hash_bucket *bkt, uint16_t sig, void **old_data) > { > int i; > struct rte_hash_key *k, *keys = h->key_store; > @@ -776,6 +778,8 @@ search_and_update(const struct rte_hash *h, void *data, > const void *key, > k = (struct rte_hash_key *) ((char *)keys + > bkt->key_idx[i] * h->key_entry_size); > if (rte_hash_cmp_eq(key, k->key, h) == 0) { > + if (old_data != NULL) > + *old_data = k->pdata; For consistency, shouldn't we use here instead of load/store: *old_data = rte_atomic_exchange_explicit(&k->pdata, data, ...); ? > /* The store to application data at *data > * should not leak after the store to pdata > * in the key store. i.e. pdata is the guard > @@ -807,7 +811,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > struct rte_hash_bucket *sec_bkt, > const struct rte_hash_key *key, void *data, > uint16_t sig, uint32_t new_idx, > - int32_t *ret_val) > + int32_t *ret_val, void **old_data) > { > unsigned int i; > struct rte_hash_bucket *cur_bkt; > @@ -817,7 +821,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > /* Check if key was inserted after last check but before this > * protected region in case of inserting duplicated keys. > */ > - ret = search_and_update(h, data, key, prim_bkt, sig); > + ret = search_and_update(h, data, key, prim_bkt, sig, old_data); > if (ret != -1) { > __hash_rw_writer_unlock(h); > *ret_val = ret; > @@ -825,7 +829,7 @@ rte_hash_cuckoo_insert_mw(const struct rte_hash *h, > } > > FOR_EACH_BUCKET(cur_bkt, sec_bkt) { > - ret = search_and_update(h, data, key, cur_bkt, sig); > + ret = search_and_update(h, data, key, cur_bkt, sig, old_data); > if (ret != -1) { > __hash_rw_writer_unlock(h); > *ret_val = ret; > @@ -872,7 +876,7 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash > *h, > const struct rte_hash_key *key, void *data, > struct queue_node *leaf, uint32_t leaf_slot, > uint16_t sig, uint32_t new_idx, > - int32_t *ret_val) > + int32_t *ret_val, void **old_data) > { > uint32_t prev_alt_bkt_idx; > struct rte_hash_bucket *cur_bkt; > @@ -892,7 +896,7 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash > *h, > /* Check if key was inserted after last check but before this > * protected region. > */ > - ret = search_and_update(h, data, key, bkt, sig); > + ret = search_and_update(h, data, key, bkt, sig, old_data); > if (ret != -1) { > __hash_rw_writer_unlock(h); > *ret_val = ret; > @@ -900,7 +904,7 @@ rte_hash_cuckoo_move_insert_mw(const struct rte_hash > *h, > } > > FOR_EACH_BUCKET(cur_bkt, alt_bkt) { > - ret = search_and_update(h, data, key, cur_bkt, sig); > + ret = search_and_update(h, data, key, cur_bkt, sig, old_data); > if (ret != -1) { > __hash_rw_writer_unlock(h); > *ret_val = ret; > @@ -997,7 +1001,8 @@ rte_hash_cuckoo_make_space_mw(const struct rte_hash > *h, > struct rte_hash_bucket *sec_bkt, > const struct rte_hash_key *key, void *data, > uint16_t sig, uint32_t bucket_idx, > - uint32_t new_idx, int32_t *ret_val) > + uint32_t new_idx, int32_t *ret_val, > + void **old_data) > { > unsigned int i; > struct queue_node queue[RTE_HASH_BFS_QUEUE_MAX_LEN]; > @@ -1023,7 +1028,7 @@ rte_hash_cuckoo_make_space_mw(const struct rte_hash > *h, > int32_t ret = rte_hash_cuckoo_move_insert_mw(h, > bkt, sec_bkt, key, data, > tail, i, sig, > - new_idx, ret_val); > + new_idx, ret_val, old_data); > if (likely(ret != -1)) > return ret; > } > @@ -1076,6 +1081,29 @@ alloc_slot(const struct rte_hash *h, struct lcore_cache > *cached_free_slots) > return slot_id; > } > > +/* > + * When RCU is configured with a free function, auto-free the overwritten > + * data pointer via RCU. > + */ > +static inline void > +__hash_rcu_auto_free_old_data(const struct rte_hash *h, void *old_data_val) > +{ > + struct __rte_hash_rcu_dq_entry rcu_dq_entry = { > + .key_idx = EMPTY_SLOT, /* sentinel value for > __hash_rcu_qsbr_free_resource */ > + .old_data = old_data_val, > + }; > + > + if (h->hash_rcu_cfg == NULL || h->hash_rcu_cfg->free_key_data_func == > NULL) > + return; > + > + if (h->dq == NULL || rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0) > { > + /* SYNC mode or enqueue failed in DQ mode */ > + rte_rcu_qsbr_synchronize(h->hash_rcu_cfg->v, > RTE_QSBR_THRID_INVALID); > + h->hash_rcu_cfg->free_key_data_func(h->hash_rcu_cfg- > >key_data_ptr, > + old_data_val); > + } > +} > + > static inline int32_t > __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, > hash_sig_t sig, void *data) > @@ -1092,6 +1120,7 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, > const void *key, > struct lcore_cache *cached_free_slots = NULL; > int32_t ret_val; > struct rte_hash_bucket *last; > + void *saved_old_data = NULL; > > short_sig = get_short_sig(sig); > prim_bucket_idx = get_prim_bucket_index(h, sig); > @@ -1103,18 +1132,20 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, > > /* Check if key is already inserted in primary location */ > __hash_rw_writer_lock(h); > - ret = search_and_update(h, data, key, prim_bkt, short_sig); > + ret = search_and_update(h, data, key, prim_bkt, short_sig, > + &saved_old_data); > if (ret != -1) { > __hash_rw_writer_unlock(h); > - return ret; > + goto overwrite; > } > > /* Check if key is already inserted in secondary location */ > FOR_EACH_BUCKET(cur_bkt, sec_bkt) { > - ret = search_and_update(h, data, key, cur_bkt, short_sig); > + ret = search_and_update(h, data, key, cur_bkt, short_sig, > + &saved_old_data); > if (ret != -1) { > __hash_rw_writer_unlock(h); > - return ret; > + goto overwrite; > } > } > > @@ -1153,33 +1184,39 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, > > /* Find an empty slot and insert */ > ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data, > - short_sig, slot_id, &ret_val); > + short_sig, slot_id, &ret_val, > + &saved_old_data); > if (ret == 0) > return slot_id - 1; > else if (ret == 1) { > enqueue_slot_back(h, cached_free_slots, slot_id); > - return ret_val; > + ret = ret_val; > + goto overwrite; > } > > /* Primary bucket full, need to make space for new entry */ > ret = rte_hash_cuckoo_make_space_mw(h, prim_bkt, sec_bkt, key, data, > - short_sig, prim_bucket_idx, slot_id, &ret_val); > + short_sig, prim_bucket_idx, slot_id, &ret_val, > + &saved_old_data); > if (ret == 0) > return slot_id - 1; > else if (ret == 1) { > enqueue_slot_back(h, cached_free_slots, slot_id); > - return ret_val; > + ret = ret_val; > + goto overwrite; > } > > /* Also search secondary bucket to get better occupancy */ > ret = rte_hash_cuckoo_make_space_mw(h, sec_bkt, prim_bkt, key, data, > - short_sig, sec_bucket_idx, slot_id, &ret_val); > + short_sig, sec_bucket_idx, slot_id, &ret_val, > + &saved_old_data); > > if (ret == 0) > return slot_id - 1; > else if (ret == 1) { > enqueue_slot_back(h, cached_free_slots, slot_id); > - return ret_val; > + ret = ret_val; > + goto overwrite; > } > > /* if ext table not enabled, we failed the insertion */ > @@ -1193,17 +1230,21 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, > */ > __hash_rw_writer_lock(h); > /* We check for duplicates again since could be inserted before the lock */ > - ret = search_and_update(h, data, key, prim_bkt, short_sig); > + ret = search_and_update(h, data, key, prim_bkt, short_sig, > + &saved_old_data); > if (ret != -1) { > enqueue_slot_back(h, cached_free_slots, slot_id); > - goto failure; > + __hash_rw_writer_unlock(h); > + goto overwrite; > } > > FOR_EACH_BUCKET(cur_bkt, sec_bkt) { > - ret = search_and_update(h, data, key, cur_bkt, short_sig); > + ret = search_and_update(h, data, key, cur_bkt, short_sig, > + &saved_old_data); > if (ret != -1) { > enqueue_slot_back(h, cached_free_slots, slot_id); > - goto failure; > + __hash_rw_writer_unlock(h); > + goto overwrite; > } > } > > @@ -1263,6 +1304,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash > *h, const void *key, > __hash_rw_writer_unlock(h); > return slot_id - 1; With multiple goto labels that function becomes even more complicated. Can we just add old_data as an extra parameter: __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t sig, void *data, void **old_data) And make the callers to invoke __hash_rcu_auto_free_old_data()? > +overwrite: > + if (saved_old_data != NULL) > + __hash_rcu_auto_free_old_data(h, saved_old_data); > + return ret; > + > failure: > __hash_rw_writer_unlock(h); > return ret; > @@ -1566,6 +1612,15 @@ __hash_rcu_qsbr_free_resource(void *p, void *e, > unsigned int n) > *((struct __rte_hash_rcu_dq_entry *)e); > > RTE_SET_USED(n); > + > + if (rcu_dq_entry.key_idx == EMPTY_SLOT) { > + /* Overwrite case: free old data only, do not recycle slot */ > + RTE_ASSERT(h->hash_rcu_cfg->free_key_data_func != NULL); > + h->hash_rcu_cfg->free_key_data_func(h->hash_rcu_cfg- > >key_data_ptr, > + rcu_dq_entry.old_data); > + return; > + } > + > keys = h->key_store; > > k = (struct rte_hash_key *) ((char *)keys + > diff --git a/lib/hash/rte_hash.h b/lib/hash/rte_hash.h > index f692e0868dcf..e33f0aea0f5e 100644 > --- a/lib/hash/rte_hash.h > +++ b/lib/hash/rte_hash.h > @@ -226,7 +226,9 @@ rte_hash_max_key_id(const struct rte_hash *h); > * Thread safety can be enabled by setting flag during > * table creation. > * If the key exists already in the table, this API updates its value > - * with 'data' passed in this API. It is the responsibility of > + * with 'data' passed in this API. If RCU is configured with a > + * free_key_data_func callback, the old data is automatically > + * deferred-freed via RCU. Otherwise, it is the responsibility of > * the application to manage any memory associated with the old value. > * The readers might still be using the old value even after this API > * has returned. > @@ -253,7 +255,9 @@ rte_hash_add_key_data(const struct rte_hash *h, const void > *key, void *data); > * Thread safety can be enabled by setting flag during > * table creation. > * If the key exists already in the table, this API updates its value > - * with 'data' passed in this API. It is the responsibility of > + * with 'data' passed in this API. If RCU is configured with a > + * free_key_data_func callback, the old data is automatically > + * deferred-freed via RCU. Otherwise, it is the responsibility of > * the application to manage any memory associated with the old value. > * The readers might still be using the old value even after this API > * has returned. > -- > 2.53.0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH dpdk v2 2/3] hash: free replaced data on overwrite when RCU is configured 2026-03-04 11:40 ` Konstantin Ananyev @ 2026-03-04 11:45 ` Robin Jarry 2026-03-04 12:50 ` Robin Jarry 0 siblings, 1 reply; 21+ messages in thread From: Robin Jarry @ 2026-03-04 11:45 UTC (permalink / raw) To: Konstantin Ananyev, dev@dpdk.org, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin Cc: Stephen Hemminger Hi Konstantin, Konstantin Ananyev, Mar 04, 2026 at 12:40: >> @@ -776,6 +778,8 @@ search_and_update(const struct rte_hash *h, void *data, >> const void *key, >> k = (struct rte_hash_key *) ((char *)keys + >> bkt->key_idx[i] * h->key_entry_size); >> if (rte_hash_cmp_eq(key, k->key, h) == 0) { >> + if (old_data != NULL) >> + *old_data = k->pdata; > > For consistency, shouldn't we use here instead of load/store: > *old_data = rte_atomic_exchange_explicit(&k->pdata, data, ...); > ? Yes, that makes sense. I'll send that in a v3. >> @@ -1263,6 +1304,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash >> *h, const void *key, >> __hash_rw_writer_unlock(h); >> return slot_id - 1; > > With multiple goto labels that function becomes even more complicated. > Can we just add old_data as an extra parameter: > __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, > hash_sig_t sig, void *data, void **old_data) > And make the callers to invoke __hash_rcu_auto_free_old_data()? I'll see what I can do for v3. Thanks for the review! ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH dpdk v2 2/3] hash: free replaced data on overwrite when RCU is configured 2026-03-04 11:45 ` Robin Jarry @ 2026-03-04 12:50 ` Robin Jarry 0 siblings, 0 replies; 21+ messages in thread From: Robin Jarry @ 2026-03-04 12:50 UTC (permalink / raw) To: Konstantin Ananyev, dev@dpdk.org, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin Cc: Stephen Hemminger Robin Jarry, Mar 04, 2026 at 12:45: >> With multiple goto labels that function becomes even more complicated. >> Can we just add old_data as an extra parameter: >> __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, >> hash_sig_t sig, void *data, void **old_data) >> And make the callers to invoke __hash_rcu_auto_free_old_data()? > > I'll see what I can do for v3. Unfortunately, I need to pass old_data in order to make the new replace API work. Maybe we can drop the new API altogether and have a much simpler patch. What do you think? ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH dpdk v2 3/3] hash: add replace API returning old data on overwrite 2026-02-13 10:34 ` [PATCH dpdk v2 0/3] " Robin Jarry 2026-02-13 10:34 ` [PATCH dpdk v2 1/3] hash: avoid leaking entries on RCU defer queue failure Robin Jarry 2026-02-13 10:34 ` [PATCH dpdk v2 2/3] hash: free replaced data on overwrite when RCU is configured Robin Jarry @ 2026-02-13 10:34 ` Robin Jarry 2026-03-04 11:44 ` Konstantin Ananyev 2 siblings, 1 reply; 21+ messages in thread From: Robin Jarry @ 2026-02-13 10:34 UTC (permalink / raw) To: dev, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin Cc: Stephen Hemminger Add rte_hash_replace_key_data() and rte_hash_replace_key_with_hash_data() which behave like their add_key_data counterparts but return the previous data pointer via an output parameter. On fresh insert, *old_data is set to NULL. When the caller provides old_data, no automatic RCU deferred free is performed: the caller takes ownership of the old pointer and is responsible for freeing it. Signed-off-by: Robin Jarry <rjarry@redhat.com> --- app/test/test_hash.c | 80 ++++++++++++++++++++++++++ doc/guides/rel_notes/release_26_03.rst | 6 ++ lib/hash/rte_cuckoo_hash.c | 42 ++++++++++++-- lib/hash/rte_hash.h | 67 +++++++++++++++++++++ 4 files changed, 189 insertions(+), 6 deletions(-) diff --git a/app/test/test_hash.c b/app/test/test_hash.c index 56a7779e09b9..fcc02ea31205 100644 --- a/app/test/test_hash.c +++ b/app/test/test_hash.c @@ -2473,6 +2473,83 @@ test_hash_rcu_qsbr_replace_auto_free(void) return ret; } +/* + * Test rte_hash_replace_key_data (explicit replace, no RCU). + * - Add key with data pointer (void *)1. + * - Replace same key with data (void *)2, verify old_data == (void *)1. + * - Lookup and verify data == (void *)2. + * - Replace with a new key (not yet inserted), verify old_data == NULL. + */ +static int +test_hash_replace_key_data(void) +{ + struct rte_hash_parameters params = { + .name = "test_replace_key_data", + .entries = 64, + .key_len = sizeof(uint32_t), + .hash_func = NULL, + .hash_func_init_val = 0, + .socket_id = SOCKET_ID_ANY, + }; + struct rte_hash *hash; + void *old_data = NULL; + void *data = NULL; + uint32_t key1 = 42; + uint32_t key2 = 99; + int ret = -1; + + printf("\n# Running replace key data test\n"); + + hash = rte_hash_create(¶ms); + if (hash == NULL) { + printf("hash creation failed\n"); + goto end; + } + + /* Add key with data = (void *)1 */ + ret = rte_hash_add_key_data(hash, &key1, (void *)(uintptr_t)1); + if (ret != 0) { + printf("failed to add key (ret=%d)\n", ret); + goto end; + } + + /* Replace same key with data = (void *)2 */ + ret = rte_hash_replace_key_data(hash, &key1, (void *)(uintptr_t)2, &old_data); + if (ret != 0) { + printf("failed to replace key (ret=%d)\n", ret); + goto end; + } + if (old_data != (void *)(uintptr_t)1) { + printf("old_data should be 0x1 but is %p\n", old_data); + goto end; + } + + /* Lookup and verify data == (void *)2 */ + ret = rte_hash_lookup_data(hash, &key1, &data); + if (ret < 0 || data != (void *)(uintptr_t)2) { + printf("lookup returned wrong data %p (ret=%d)\n", data, ret); + goto end; + } + + /* Replace with a new key (not yet inserted) */ + old_data = (void *)(uintptr_t)0xdeadbeef; + ret = rte_hash_replace_key_data(hash, &key2, (void *)(uintptr_t)3, &old_data); + if (ret != 0) { + printf("failed to insert new key via replace (ret=%d)\n", ret); + goto end; + } + if (old_data != NULL) { + printf("old_data should be NULL on fresh insert but is %p\n", + old_data); + goto end; + } + +end: + rte_hash_free(hash); + + return ret; +} + /* * Do all unit and performance tests. */ @@ -2557,6 +2634,9 @@ test_hash(void) if (test_hash_rcu_qsbr_replace_auto_free() < 0) return -1; + if (test_hash_replace_key_data() < 0) + return -1; + return 0; } diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst index 693034bcb0d2..73418523809c 100644 --- a/doc/guides/rel_notes/release_26_03.rst +++ b/doc/guides/rel_notes/release_26_03.rst @@ -93,6 +93,12 @@ New Features ``rte_hash_add_key_data`` now automatically defers freeing the old data pointer on key overwrite via the RCU defer queue. +* **Added hash replace API.** + + Added ``rte_hash_replace_key_data`` and ``rte_hash_replace_key_with_hash_data`` + functions that return the previous data pointer on key overwrite, letting the + caller manage its lifecycle. + Removed Items ------------- diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c index f487b3b725dd..cf957de55288 100644 --- a/lib/hash/rte_cuckoo_hash.c +++ b/lib/hash/rte_cuckoo_hash.c @@ -1106,7 +1106,7 @@ __hash_rcu_auto_free_old_data(const struct rte_hash *h, void *old_data_val) static inline int32_t __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, - hash_sig_t sig, void *data) + hash_sig_t sig, void *data, void **old_data) { uint16_t short_sig; uint32_t prim_bucket_idx, sec_bucket_idx; @@ -1305,7 +1305,9 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, return slot_id - 1; overwrite: - if (saved_old_data != NULL) + if (old_data != NULL) + *old_data = saved_old_data; + else if (saved_old_data != NULL) __hash_rcu_auto_free_old_data(h, saved_old_data); return ret; @@ -1321,7 +1323,7 @@ rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t sig) { RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); - return __rte_hash_add_key_with_hash(h, key, sig, 0); + return __rte_hash_add_key_with_hash(h, key, sig, 0, NULL); } RTE_EXPORT_SYMBOL(rte_hash_add_key) @@ -1329,7 +1331,7 @@ int32_t rte_hash_add_key(const struct rte_hash *h, const void *key) { RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); - return __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), 0); + return __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), 0, NULL); } RTE_EXPORT_SYMBOL(rte_hash_add_key_with_hash_data) @@ -1340,7 +1342,7 @@ rte_hash_add_key_with_hash_data(const struct rte_hash *h, int ret; RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); - ret = __rte_hash_add_key_with_hash(h, key, sig, data); + ret = __rte_hash_add_key_with_hash(h, key, sig, data, NULL); if (ret >= 0) return 0; else @@ -1355,13 +1357,41 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data) RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); - ret = __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), data); + ret = __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), data, NULL); if (ret >= 0) return 0; else return ret; } +RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_hash_replace_key_with_hash_data, 26.03) +int +rte_hash_replace_key_with_hash_data(const struct rte_hash *h, + const void *key, hash_sig_t sig, + void *data, void **old_data) +{ + int ret; + + RETURN_IF_TRUE(((h == NULL) || (key == NULL) || + (old_data == NULL)), -EINVAL); + + *old_data = NULL; + ret = __rte_hash_add_key_with_hash(h, key, sig, data, old_data); + if (ret >= 0) + return 0; + else + return ret; +} + +RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_hash_replace_key_data, 26.03) +int +rte_hash_replace_key_data(const struct rte_hash *h, const void *key, + void *data, void **old_data) +{ + return rte_hash_replace_key_with_hash_data(h, key, + rte_hash_hash(h, key), data, old_data); +} + /* Search one bucket to find the match key - uses rw lock */ static inline int32_t search_one_bucket_l(const struct rte_hash *h, const void *key, diff --git a/lib/hash/rte_hash.h b/lib/hash/rte_hash.h index e33f0aea0f5e..4a328c51501c 100644 --- a/lib/hash/rte_hash.h +++ b/lib/hash/rte_hash.h @@ -279,6 +279,73 @@ int32_t rte_hash_add_key_with_hash_data(const struct rte_hash *h, const void *key, hash_sig_t sig, void *data); +/** + * Replace a key-value pair in an existing hash table, returning the previous + * data pointer associated with the key. If the key does not exist, it is + * inserted and *old_data is set to NULL. + * + * This operation is not multi-thread safe and should only be called from one + * thread by default. Thread safety can be enabled by setting flag during table + * creation. + * + * When old_data is provided, no automatic RCU deferred free is performed on + * overwrite; the caller takes ownership of the old pointer and is responsible + * for freeing it (e.g. via RCU). + * + * @param h + * Hash table to add the key to. + * @param key + * Key to add to the hash table. + * @param data + * Data to add to the hash table. + * @param old_data + * Output: on overwrite, set to the previous data pointer. + * On fresh insert, set to NULL. + * @return + * - 0 if added/replaced successfully + * - -EINVAL if the parameters are invalid. + * - -ENOSPC if there is no space in the hash for this key. + */ +__rte_experimental +int +rte_hash_replace_key_data(const struct rte_hash *h, const void *key, + void *data, void **old_data); + +/** + * Replace a key-value pair with a pre-computed hash value in an existing hash + * table, returning the previous data pointer associated with the key. If the + * key does not exist, it is inserted and *old_data is set to NULL. + * + * This operation is not multi-thread safe and should only be called from one + * thread by default. Thread safety can be enabled by setting flag during table + * creation. + * + * When old_data is provided, no automatic RCU deferred free is performed on + * overwrite; the caller takes ownership of the old pointer and is responsible + * for freeing it (e.g. via RCU). + * + * @param h + * Hash table to add the key to. + * @param key + * Key to add to the hash table. + * @param sig + * Precomputed hash value for 'key' + * @param data + * Data to add to the hash table. + * @param old_data + * Output: on overwrite, set to the previous data pointer. + * On fresh insert, set to NULL. + * @return + * - 0 if added/replaced successfully + * - -EINVAL if the parameters are invalid. + * - -ENOSPC if there is no space in the hash for this key. + */ +__rte_experimental +int +rte_hash_replace_key_with_hash_data(const struct rte_hash *h, + const void *key, hash_sig_t sig, + void *data, void **old_data); + /** * Add a key to an existing hash table. This operation is not multi-thread safe * and should only be called from one thread by default. -- 2.53.0 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* RE: [PATCH dpdk v2 3/3] hash: add replace API returning old data on overwrite 2026-02-13 10:34 ` [PATCH dpdk v2 3/3] hash: add replace API returning old data on overwrite Robin Jarry @ 2026-03-04 11:44 ` Konstantin Ananyev 0 siblings, 0 replies; 21+ messages in thread From: Konstantin Ananyev @ 2026-03-04 11:44 UTC (permalink / raw) To: Robin Jarry, dev@dpdk.org, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin Cc: Stephen Hemminger > > Add rte_hash_replace_key_data() and rte_hash_replace_key_with_hash_data() > which behave like their add_key_data counterparts but return the previous > data pointer via an output parameter. On fresh insert, *old_data is set > to NULL. > > When the caller provides old_data, no automatic RCU deferred free is > performed: the caller takes ownership of the old pointer and is > responsible for freeing it. > > Signed-off-by: Robin Jarry <rjarry@redhat.com> > --- > app/test/test_hash.c | 80 ++++++++++++++++++++++++++ > doc/guides/rel_notes/release_26_03.rst | 6 ++ > lib/hash/rte_cuckoo_hash.c | 42 ++++++++++++-- > lib/hash/rte_hash.h | 67 +++++++++++++++++++++ > 4 files changed, 189 insertions(+), 6 deletions(-) > > diff --git a/app/test/test_hash.c b/app/test/test_hash.c > index 56a7779e09b9..fcc02ea31205 100644 > --- a/app/test/test_hash.c > +++ b/app/test/test_hash.c > @@ -2473,6 +2473,83 @@ test_hash_rcu_qsbr_replace_auto_free(void) > return ret; > } > > +/* > + * Test rte_hash_replace_key_data (explicit replace, no RCU). > + * - Add key with data pointer (void *)1. > + * - Replace same key with data (void *)2, verify old_data == (void *)1. > + * - Lookup and verify data == (void *)2. > + * - Replace with a new key (not yet inserted), verify old_data == NULL. > + */ > +static int > +test_hash_replace_key_data(void) > +{ > + struct rte_hash_parameters params = { > + .name = "test_replace_key_data", > + .entries = 64, > + .key_len = sizeof(uint32_t), > + .hash_func = NULL, > + .hash_func_init_val = 0, > + .socket_id = SOCKET_ID_ANY, > + }; > + struct rte_hash *hash; > + void *old_data = NULL; > + void *data = NULL; > + uint32_t key1 = 42; > + uint32_t key2 = 99; > + int ret = -1; > + > + printf("\n# Running replace key data test\n"); > + > + hash = rte_hash_create(¶ms); > + if (hash == NULL) { > + printf("hash creation failed\n"); > + goto end; > + } > + > + /* Add key with data = (void *)1 */ > + ret = rte_hash_add_key_data(hash, &key1, (void *)(uintptr_t)1); > + if (ret != 0) { > + printf("failed to add key (ret=%d)\n", ret); > + goto end; > + } > + > + /* Replace same key with data = (void *)2 */ > + ret = rte_hash_replace_key_data(hash, &key1, (void *)(uintptr_t)2, > &old_data); > + if (ret != 0) { > + printf("failed to replace key (ret=%d)\n", ret); > + goto end; > + } > + if (old_data != (void *)(uintptr_t)1) { > + printf("old_data should be 0x1 but is %p\n", old_data); > + goto end; > + } > + > + /* Lookup and verify data == (void *)2 */ > + ret = rte_hash_lookup_data(hash, &key1, &data); > + if (ret < 0 || data != (void *)(uintptr_t)2) { > + printf("lookup returned wrong data %p (ret=%d)\n", data, ret); > + goto end; > + } > + > + /* Replace with a new key (not yet inserted) */ > + old_data = (void *)(uintptr_t)0xdeadbeef; > + ret = rte_hash_replace_key_data(hash, &key2, (void *)(uintptr_t)3, > &old_data); > + if (ret != 0) { > + printf("failed to insert new key via replace (ret=%d)\n", ret); > + goto end; > + } > + if (old_data != NULL) { > + printf("old_data should be NULL on fresh insert but is %p\n", > + old_data); > + goto end; > + } > + > +end: > + rte_hash_free(hash); > + > + return ret; > +} > + > /* > * Do all unit and performance tests. > */ > @@ -2557,6 +2634,9 @@ test_hash(void) > if (test_hash_rcu_qsbr_replace_auto_free() < 0) > return -1; > > + if (test_hash_replace_key_data() < 0) > + return -1; > + > return 0; > } > > diff --git a/doc/guides/rel_notes/release_26_03.rst > b/doc/guides/rel_notes/release_26_03.rst > index 693034bcb0d2..73418523809c 100644 > --- a/doc/guides/rel_notes/release_26_03.rst > +++ b/doc/guides/rel_notes/release_26_03.rst > @@ -93,6 +93,12 @@ New Features > ``rte_hash_add_key_data`` now automatically defers freeing the old data > pointer on key overwrite via the RCU defer queue. > > +* **Added hash replace API.** > + > + Added ``rte_hash_replace_key_data`` and > ``rte_hash_replace_key_with_hash_data`` > + functions that return the previous data pointer on key overwrite, letting the > + caller manage its lifecycle. > + > > Removed Items > ------------- > diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c > index f487b3b725dd..cf957de55288 100644 > --- a/lib/hash/rte_cuckoo_hash.c > +++ b/lib/hash/rte_cuckoo_hash.c > @@ -1106,7 +1106,7 @@ __hash_rcu_auto_free_old_data(const struct rte_hash > *h, void *old_data_val) > > static inline int32_t > __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, > - hash_sig_t sig, void *data) > + hash_sig_t sig, void *data, void **old_data) > { > uint16_t short_sig; > uint32_t prim_bucket_idx, sec_bucket_idx; > @@ -1305,7 +1305,9 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, > const void *key, > return slot_id - 1; > > overwrite: > - if (saved_old_data != NULL) > + if (old_data != NULL) > + *old_data = saved_old_data; > + else if (saved_old_data != NULL) > __hash_rcu_auto_free_old_data(h, saved_old_data); > return ret; > > @@ -1321,7 +1323,7 @@ rte_hash_add_key_with_hash(const struct rte_hash *h, > const void *key, hash_sig_t sig) > { > RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); > - return __rte_hash_add_key_with_hash(h, key, sig, 0); > + return __rte_hash_add_key_with_hash(h, key, sig, 0, NULL); > } > > RTE_EXPORT_SYMBOL(rte_hash_add_key) > @@ -1329,7 +1331,7 @@ int32_t > rte_hash_add_key(const struct rte_hash *h, const void *key) > { > RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); > - return __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), 0); > + return __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), 0, > NULL); > } > > RTE_EXPORT_SYMBOL(rte_hash_add_key_with_hash_data) > @@ -1340,7 +1342,7 @@ rte_hash_add_key_with_hash_data(const struct rte_hash > *h, > int ret; > > RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); > - ret = __rte_hash_add_key_with_hash(h, key, sig, data); > + ret = __rte_hash_add_key_with_hash(h, key, sig, data, NULL); > if (ret >= 0) > return 0; > else > @@ -1355,13 +1357,41 @@ rte_hash_add_key_data(const struct rte_hash *h, const > void *key, void *data) > > RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL); > > - ret = __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), data); > + ret = __rte_hash_add_key_with_hash(h, key, rte_hash_hash(h, key), data, > NULL); > if (ret >= 0) > return 0; > else > return ret; > } > > +RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_hash_replace_key_with_hash_data, > 26.03) > +int > +rte_hash_replace_key_with_hash_data(const struct rte_hash *h, > + const void *key, hash_sig_t sig, > + void *data, void **old_data) > +{ > + int ret; > + > + RETURN_IF_TRUE(((h == NULL) || (key == NULL) || > + (old_data == NULL)), -EINVAL); > + > + *old_data = NULL; > + ret = __rte_hash_add_key_with_hash(h, key, sig, data, old_data); > + if (ret >= 0) > + return 0; > + else > + return ret; > +} > + > +RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_hash_replace_key_data, 26.03) > +int > +rte_hash_replace_key_data(const struct rte_hash *h, const void *key, > + void *data, void **old_data) > +{ > + return rte_hash_replace_key_with_hash_data(h, key, > + rte_hash_hash(h, key), data, old_data); > +} > + > /* Search one bucket to find the match key - uses rw lock */ > static inline int32_t > search_one_bucket_l(const struct rte_hash *h, const void *key, > diff --git a/lib/hash/rte_hash.h b/lib/hash/rte_hash.h > index e33f0aea0f5e..4a328c51501c 100644 > --- a/lib/hash/rte_hash.h > +++ b/lib/hash/rte_hash.h > @@ -279,6 +279,73 @@ int32_t > rte_hash_add_key_with_hash_data(const struct rte_hash *h, const void *key, > hash_sig_t sig, void *data); > > +/** > + * Replace a key-value pair in an existing hash table, returning the previous > + * data pointer associated with the key. If the key does not exist, it is > + * inserted and *old_data is set to NULL. > + * > + * This operation is not multi-thread safe and should only be called from one > + * thread by default. Thread safety can be enabled by setting flag during table > + * creation. > + * > + * When old_data is provided, no automatic RCU deferred free is performed on > + * overwrite; the caller takes ownership of the old pointer and is responsible > + * for freeing it (e.g. via RCU). > + * > + * @param h > + * Hash table to add the key to. > + * @param key > + * Key to add to the hash table. > + * @param data > + * Data to add to the hash table. > + * @param old_data > + * Output: on overwrite, set to the previous data pointer. > + * On fresh insert, set to NULL. > + * @return > + * - 0 if added/replaced successfully > + * - -EINVAL if the parameters are invalid. > + * - -ENOSPC if there is no space in the hash for this key. > + */ > +__rte_experimental > +int > +rte_hash_replace_key_data(const struct rte_hash *h, const void *key, > + void *data, void **old_data); > + > +/** > + * Replace a key-value pair with a pre-computed hash value in an existing hash > + * table, returning the previous data pointer associated with the key. If the > + * key does not exist, it is inserted and *old_data is set to NULL. > + * > + * This operation is not multi-thread safe and should only be called from one > + * thread by default. Thread safety can be enabled by setting flag during table > + * creation. > + * > + * When old_data is provided, no automatic RCU deferred free is performed on > + * overwrite; the caller takes ownership of the old pointer and is responsible > + * for freeing it (e.g. via RCU). > + * > + * @param h > + * Hash table to add the key to. > + * @param key > + * Key to add to the hash table. > + * @param sig > + * Precomputed hash value for 'key' > + * @param data > + * Data to add to the hash table. > + * @param old_data > + * Output: on overwrite, set to the previous data pointer. > + * On fresh insert, set to NULL. > + * @return > + * - 0 if added/replaced successfully > + * - -EINVAL if the parameters are invalid. > + * - -ENOSPC if there is no space in the hash for this key. > + */ > +__rte_experimental > +int > +rte_hash_replace_key_with_hash_data(const struct rte_hash *h, > + const void *key, hash_sig_t sig, > + void *data, void **old_data); > + > /** > * Add a key to an existing hash table. This operation is not multi-thread safe > * and should only be called from one thread by default. > -- Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com> > 2.53.0 > ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH dpdk v3 0/2] hash: safe data replacement on overwrite 2026-02-12 21:33 [PATCH dpdk 0/2] hash: safe data replacement on overwrite Robin Jarry ` (3 preceding siblings ...) 2026-02-13 10:34 ` [PATCH dpdk v2 0/3] " Robin Jarry @ 2026-03-06 8:47 ` Robin Jarry 2026-03-06 8:47 ` [PATCH dpdk v3 1/2] hash: avoid leaking entries on RCU defer queue failure Robin Jarry ` (3 more replies) 4 siblings, 4 replies; 21+ messages in thread From: Robin Jarry @ 2026-03-06 8:47 UTC (permalink / raw) To: dev; +Cc: Konstantin Ananyev When rte_hash_add_key_data() is called with an existing key, the old data pointer is silently overwritten. With RCU-protected readers still potentially accessing the old data, the application cannot safely free it since the old pointer is never returned. This series fixes the issue in two places: * When RCU is configured with a free_key_data_func callback, rte_hash_add_key_data() now automatically defers freeing the old data on overwrite. No caller changes are needed. * If rte_hash_del_key() fails to enqueue the deleted entry in RTE_HASH_QSBR_MODE_DQ, it falls back on RTE_HASH_QSBR_MODE_SYNC to avoid leaking slots and data pointers. v3: * Use rte_atomic_exchange_explicit() to swap the new data pointer with the old one in search_and_update(). * Removed the rte_hash_replace_key_data() API which required to bubble a potential old_data pointer from search_and_update() to all intermediate callers down to __rte_hash_add_key_with_hash(). It made an already complex code even more obscure. v2: * New patch to fallback to SYNC mode when DQ fails on delete. * Check key_idx == EMPTY_SLOT in DQ mode to differentiate when old_data is NULL. * Also fallback to SYNC mode when DQ fails on replace. Robin Jarry (2): hash: avoid leaking entries on RCU defer queue failure hash: free replaced data on overwrite when RCU is configured app/test/test_hash.c | 134 +++++++++++++++++++++++++ doc/guides/rel_notes/release_26_03.rst | 7 ++ lib/hash/rte_cuckoo_hash.c | 47 +++++++-- lib/hash/rte_hash.h | 8 +- 4 files changed, 187 insertions(+), 9 deletions(-) -- 2.53.0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH dpdk v3 1/2] hash: avoid leaking entries on RCU defer queue failure 2026-03-06 8:47 ` [PATCH dpdk v3 0/2] hash: safe data replacement " Robin Jarry @ 2026-03-06 8:47 ` Robin Jarry 2026-03-06 8:47 ` [PATCH dpdk v3 2/2] hash: free replaced data on overwrite when RCU is configured Robin Jarry ` (2 subsequent siblings) 3 siblings, 0 replies; 21+ messages in thread From: Robin Jarry @ 2026-03-06 8:47 UTC (permalink / raw) To: dev, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin, Ray Kinsella, Dharmik Thakkar, Ruifeng Wang Cc: Konstantin Ananyev, stable When rte_rcu_qsbr_dq_enqueue() fails in DQ mode, the deleted key slot is never freed and becomes permanently leaked. Fall back to synchronous reclamation instead of only logging an error. Cc: stable@dpdk.org Fixes: 769b2de7fb52 ("hash: implement RCU resources reclamation") Signed-off-by: Robin Jarry <rjarry@redhat.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com> --- lib/hash/rte_cuckoo_hash.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c index 3f382dd117ce..36b3b477a55b 100644 --- a/lib/hash/rte_cuckoo_hash.c +++ b/lib/hash/rte_cuckoo_hash.c @@ -1870,18 +1870,15 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key, /* Key index where key is stored, adding the first dummy index */ rcu_dq_entry.key_idx = ret + 1; rcu_dq_entry.ext_bkt_idx = index; - if (h->dq == NULL) { + if (h->dq == NULL || rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0) { /* Wait for quiescent state change if using - * RTE_HASH_QSBR_MODE_SYNC + * RTE_HASH_QSBR_MODE_SYNC or if RCU enqueue failed. */ rte_rcu_qsbr_synchronize(h->hash_rcu_cfg->v, RTE_QSBR_THRID_INVALID); __hash_rcu_qsbr_free_resource((void *)((uintptr_t)h), &rcu_dq_entry, 1); - } else if (h->dq) - /* Push into QSBR FIFO if using RTE_HASH_QSBR_MODE_DQ */ - if (rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0) - HASH_LOG(ERR, "Failed to push QSBR FIFO"); + } } __hash_rw_writer_unlock(h); return ret; -- 2.53.0 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH dpdk v3 2/2] hash: free replaced data on overwrite when RCU is configured 2026-03-06 8:47 ` [PATCH dpdk v3 0/2] hash: safe data replacement " Robin Jarry 2026-03-06 8:47 ` [PATCH dpdk v3 1/2] hash: avoid leaking entries on RCU defer queue failure Robin Jarry @ 2026-03-06 8:47 ` Robin Jarry 2026-03-09 7:48 ` Konstantin Ananyev 2026-03-09 7:50 ` [PATCH dpdk v3 0/2] hash: safe data replacement on overwrite Konstantin Ananyev 2026-03-17 9:40 ` Thomas Monjalon 3 siblings, 1 reply; 21+ messages in thread From: Robin Jarry @ 2026-03-06 8:47 UTC (permalink / raw) To: dev, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin, Ray Kinsella, Ruifeng Wang, Dharmik Thakkar Cc: Konstantin Ananyev, stable When rte_hash_add_key_data() overwrites an existing key, the old data pointer is silently lost. With RCU-protected readers still potentially accessing the old data, the application has no safe way to free it. When RCU is configured with a free_key_data_func callback, automatically enqueue the old data for deferred freeing via the RCU defer queue on overwrite. In SYNC mode, synchronize and call free_key_data_func directly. Cc: stable@dpdk.org Fixes: 769b2de7fb52 ("hash: implement RCU resources reclamation") Signed-off-by: Robin Jarry <rjarry@redhat.com> --- app/test/test_hash.c | 134 +++++++++++++++++++++++++ doc/guides/rel_notes/release_26_03.rst | 7 ++ lib/hash/rte_cuckoo_hash.c | 38 ++++++- lib/hash/rte_hash.h | 8 +- 4 files changed, 184 insertions(+), 3 deletions(-) diff --git a/app/test/test_hash.c b/app/test/test_hash.c index 3fb3d96d0546..56a7779e09b9 100644 --- a/app/test/test_hash.c +++ b/app/test/test_hash.c @@ -2342,6 +2342,137 @@ test_hash_rcu_qsbr_dq_reclaim(void) return 0; } +static void *old_data; + +static void +test_hash_free_key_data_func(void *p __rte_unused, void *key_data) +{ + old_data = key_data; +} + +/* + * Test automatic RCU free on overwrite via rte_hash_add_key_data. + * - Create hash with RW_CONCURRENCY_LF and RCU QSBR in DQ mode + * with a free_key_data_func callback that increments a counter. + * - Register a pseudo reader thread. + * - Add key with data (void *)1. + * - Overwrite same key with data (void *)2 via rte_hash_add_key_data. + * - Report quiescent state, trigger reclamation. + * - Verify the free callback was called exactly once. + * - Delete the key, report quiescent state, reclaim again. + * - Verify the free callback was called a second time. + */ +static int +test_hash_rcu_qsbr_replace_auto_free(void) +{ + struct rte_hash_rcu_config rcu = { + .v = NULL, + .mode = RTE_HASH_QSBR_MODE_DQ, + .free_key_data_func = test_hash_free_key_data_func, + .key_data_ptr = NULL, + }; + struct rte_hash_parameters params = { + .name = "test_replace_auto_free", + .entries = 16, + .key_len = sizeof(uint32_t), + .hash_func = NULL, + .hash_func_init_val = 0, + .socket_id = SOCKET_ID_ANY, + .extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF, + }; + struct rte_hash *hash = NULL; + uint32_t key = 55; + int32_t status; + int ret = -1; + size_t sz; + + printf("\n# Running RCU replace auto-free test\n"); + + hash = rte_hash_create(¶ms); + if (hash == NULL) { + printf("hash creation failed\n"); + goto end; + } + + sz = rte_rcu_qsbr_get_memsize(RTE_MAX_LCORE); + rcu.v = rte_zmalloc(NULL, sz, RTE_CACHE_LINE_SIZE); + if (rcu.v == NULL) { + printf("RCU QSBR alloc failed\n"); + goto end; + } + status = rte_rcu_qsbr_init(rcu.v, RTE_MAX_LCORE); + if (status != 0) { + printf("RCU QSBR init failed\n"); + goto end; + } + + status = rte_hash_rcu_qsbr_add(hash, &rcu); + if (status != 0) { + printf("RCU QSBR add failed\n"); + goto end; + } + + /* Register pseudo reader */ + status = rte_rcu_qsbr_thread_register(rcu.v, 0); + if (status != 0) { + printf("RCU QSBR thread register failed\n"); + goto end; + } + rte_rcu_qsbr_thread_online(rcu.v, 0); + + old_data = NULL; + + /* Add key with data = (void *)1 */ + status = rte_hash_add_key_data(hash, &key, (void *)(uintptr_t)1); + if (status != 0) { + printf("failed to add key (status=%d)\n", status); + goto end; + } + + /* Overwrite same key with data = (void *)2 */ + status = rte_hash_add_key_data(hash, &key, (void *)(uintptr_t)2); + if (status != 0) { + printf("failed to overwrite key (status=%d)\n", status); + goto end; + } + + /* Reader quiescent and reclaim */ + rte_rcu_qsbr_quiescent(rcu.v, 0); + rte_hash_rcu_qsbr_dq_reclaim(hash, NULL, NULL, NULL); + + if (old_data != (void *)(uintptr_t)1) { + printf("old data should be 0x1 but is %p\n", old_data); + goto end; + } + + /* Delete the key */ + status = rte_hash_del_key(hash, &key); + if (status < 0) { + printf("failed to delete key (status=%d)\n", status); + goto end; + } + + /* Reader quiescent and reclaim again */ + rte_rcu_qsbr_quiescent(rcu.v, 0); + rte_hash_rcu_qsbr_dq_reclaim(hash, NULL, NULL, NULL); + + if (old_data != (void *)(uintptr_t)2) { + printf("old data should be 2 but is %p\n", old_data); + goto end; + } + + ret = 0; +end: + if (rcu.v != NULL) { + rte_rcu_qsbr_thread_offline(rcu.v, 0); + rte_rcu_qsbr_thread_unregister(rcu.v, 0); + } + rte_hash_free(hash); + rte_free(rcu.v); + + return ret; +} + /* * Do all unit and performance tests. */ @@ -2423,6 +2554,9 @@ test_hash(void) if (test_hash_rcu_qsbr_dq_reclaim() < 0) return -1; + if (test_hash_rcu_qsbr_replace_auto_free() < 0) + return -1; + return 0; } diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst index b4499ec066f8..e772fd3cb9b7 100644 --- a/doc/guides/rel_notes/release_26_03.rst +++ b/doc/guides/rel_notes/release_26_03.rst @@ -106,6 +106,13 @@ New Features Added handling of the key combination Control+L to clear the screen before redisplaying the prompt. +* **Added automatic deferred free on hash data overwrite.** + + When RCU is configured with a ``free_key_data_func`` callback, + ``rte_hash_add_key_data`` now automatically defers freeing the old data + pointer on key overwrite via the RCU defer queue. + + Removed Items ------------- diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c index 36b3b477a55b..69dcfdfc398a 100644 --- a/lib/hash/rte_cuckoo_hash.c +++ b/lib/hash/rte_cuckoo_hash.c @@ -75,6 +75,7 @@ EAL_REGISTER_TAILQ(rte_hash_tailq) struct __rte_hash_rcu_dq_entry { uint32_t key_idx; uint32_t ext_bkt_idx; + void *old_data; }; RTE_EXPORT_SYMBOL(rte_hash_find_existing) @@ -761,6 +762,28 @@ enqueue_slot_back(const struct rte_hash *h, sizeof(uint32_t)); } +/* + * When RCU is configured with a free function, auto-free the overwritten + * data pointer via RCU. + */ +static inline void +__rte_hash_rcu_auto_free_old_data(const struct rte_hash *h, void *d) +{ + struct __rte_hash_rcu_dq_entry rcu_dq_entry = { + .key_idx = EMPTY_SLOT, /* sentinel value for __hash_rcu_qsbr_free_resource */ + .old_data = d, + }; + + if (d == NULL || h->hash_rcu_cfg == NULL || h->hash_rcu_cfg->free_key_data_func == NULL) + return; + + if (h->dq == NULL || rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0) { + /* SYNC mode or enqueue failed in DQ mode */ + rte_rcu_qsbr_synchronize(h->hash_rcu_cfg->v, RTE_QSBR_THRID_INVALID); + h->hash_rcu_cfg->free_key_data_func(h->hash_rcu_cfg->key_data_ptr, d); + } +} + /* Search a key from bucket and update its data. * Writer holds the lock before calling this. */ @@ -770,6 +793,7 @@ search_and_update(const struct rte_hash *h, void *data, const void *key, { int i; struct rte_hash_key *k, *keys = h->key_store; + void *old_data; for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) { if (bkt->sig_current[i] == sig) { @@ -782,9 +806,12 @@ search_and_update(const struct rte_hash *h, void *data, const void *key, * variable. Release the application data * to the readers. */ - rte_atomic_store_explicit(&k->pdata, + old_data = rte_atomic_exchange_explicit(&k->pdata, data, rte_memory_order_release); + + __rte_hash_rcu_auto_free_old_data(h, old_data); + /* * Return index where key is stored, * subtracting the first dummy index @@ -1566,6 +1593,15 @@ __hash_rcu_qsbr_free_resource(void *p, void *e, unsigned int n) *((struct __rte_hash_rcu_dq_entry *)e); RTE_SET_USED(n); + + if (rcu_dq_entry.key_idx == EMPTY_SLOT) { + /* Overwrite case: free old data only, do not recycle slot */ + RTE_ASSERT(h->hash_rcu_cfg->free_key_data_func != NULL); + h->hash_rcu_cfg->free_key_data_func(h->hash_rcu_cfg->key_data_ptr, + rcu_dq_entry.old_data); + return; + } + keys = h->key_store; k = (struct rte_hash_key *) ((char *)keys + diff --git a/lib/hash/rte_hash.h b/lib/hash/rte_hash.h index f692e0868dcf..e33f0aea0f5e 100644 --- a/lib/hash/rte_hash.h +++ b/lib/hash/rte_hash.h @@ -226,7 +226,9 @@ rte_hash_max_key_id(const struct rte_hash *h); * Thread safety can be enabled by setting flag during * table creation. * If the key exists already in the table, this API updates its value - * with 'data' passed in this API. It is the responsibility of + * with 'data' passed in this API. If RCU is configured with a + * free_key_data_func callback, the old data is automatically + * deferred-freed via RCU. Otherwise, it is the responsibility of * the application to manage any memory associated with the old value. * The readers might still be using the old value even after this API * has returned. @@ -253,7 +255,9 @@ rte_hash_add_key_data(const struct rte_hash *h, const void *key, void *data); * Thread safety can be enabled by setting flag during * table creation. * If the key exists already in the table, this API updates its value - * with 'data' passed in this API. It is the responsibility of + * with 'data' passed in this API. If RCU is configured with a + * free_key_data_func callback, the old data is automatically + * deferred-freed via RCU. Otherwise, it is the responsibility of * the application to manage any memory associated with the old value. * The readers might still be using the old value even after this API * has returned. -- 2.53.0 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* RE: [PATCH dpdk v3 2/2] hash: free replaced data on overwrite when RCU is configured 2026-03-06 8:47 ` [PATCH dpdk v3 2/2] hash: free replaced data on overwrite when RCU is configured Robin Jarry @ 2026-03-09 7:48 ` Konstantin Ananyev 0 siblings, 0 replies; 21+ messages in thread From: Konstantin Ananyev @ 2026-03-09 7:48 UTC (permalink / raw) To: Robin Jarry, dev@dpdk.org, Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin, Ray Kinsella, Ruifeng Wang, Dharmik Thakkar Cc: stable@dpdk.org > When rte_hash_add_key_data() overwrites an existing key, the old data > pointer is silently lost. With RCU-protected readers still potentially > accessing the old data, the application has no safe way to free it. > > When RCU is configured with a free_key_data_func callback, automatically > enqueue the old data for deferred freeing via the RCU defer queue on > overwrite. In SYNC mode, synchronize and call free_key_data_func > directly. > > Cc: stable@dpdk.org > Fixes: 769b2de7fb52 ("hash: implement RCU resources reclamation") > > Signed-off-by: Robin Jarry <rjarry@redhat.com> > --- Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com> > 2.53.0 > ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [PATCH dpdk v3 0/2] hash: safe data replacement on overwrite 2026-03-06 8:47 ` [PATCH dpdk v3 0/2] hash: safe data replacement " Robin Jarry 2026-03-06 8:47 ` [PATCH dpdk v3 1/2] hash: avoid leaking entries on RCU defer queue failure Robin Jarry 2026-03-06 8:47 ` [PATCH dpdk v3 2/2] hash: free replaced data on overwrite when RCU is configured Robin Jarry @ 2026-03-09 7:50 ` Konstantin Ananyev 2026-03-09 7:52 ` Robin Jarry 2026-03-17 9:40 ` Thomas Monjalon 3 siblings, 1 reply; 21+ messages in thread From: Konstantin Ananyev @ 2026-03-09 7:50 UTC (permalink / raw) To: Robin Jarry, dev@dpdk.org > When rte_hash_add_key_data() is called with an existing key, the old > data pointer is silently overwritten. With RCU-protected readers still > potentially accessing the old data, the application cannot safely free > it since the old pointer is never returned. > > This series fixes the issue in two places: > > * When RCU is configured with a free_key_data_func callback, > rte_hash_add_key_data() now automatically defers freeing the old data > on overwrite. No caller changes are needed. > > * If rte_hash_del_key() fails to enqueue the deleted entry in > RTE_HASH_QSBR_MODE_DQ, it falls back on RTE_HASH_QSBR_MODE_SYNC to > avoid leaking slots and data pointers. > > v3: > > * Use rte_atomic_exchange_explicit() to swap the new data pointer with > the old one in search_and_update(). > > * Removed the rte_hash_replace_key_data() API That's a pity - I think it is a useful one. > which required to bubble > a potential old_data pointer from search_and_update() to all > intermediate callers down to __rte_hash_add_key_with_hash(). It made > an already complex code even more obscure. > > v2: > > * New patch to fallback to SYNC mode when DQ fails on delete. > * Check key_idx == EMPTY_SLOT in DQ mode to differentiate when old_data > is NULL. > * Also fallback to SYNC mode when DQ fails on replace. > > Robin Jarry (2): > hash: avoid leaking entries on RCU defer queue failure > hash: free replaced data on overwrite when RCU is configured > > app/test/test_hash.c | 134 +++++++++++++++++++++++++ > doc/guides/rel_notes/release_26_03.rst | 7 ++ > lib/hash/rte_cuckoo_hash.c | 47 +++++++-- > lib/hash/rte_hash.h | 8 +- > 4 files changed, 187 insertions(+), 9 deletions(-) > > -- > 2.53.0 > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH dpdk v3 0/2] hash: safe data replacement on overwrite 2026-03-09 7:50 ` [PATCH dpdk v3 0/2] hash: safe data replacement on overwrite Konstantin Ananyev @ 2026-03-09 7:52 ` Robin Jarry 2026-03-09 8:32 ` Konstantin Ananyev 0 siblings, 1 reply; 21+ messages in thread From: Robin Jarry @ 2026-03-09 7:52 UTC (permalink / raw) To: Konstantin Ananyev, dev@dpdk.org Konstantin Ananyev, Mar 09, 2026 at 08:50: >> * Removed the rte_hash_replace_key_data() API > > That's a pity - I think it is a useful one. > >> which required to bubble >> a potential old_data pointer from search_and_update() to all >> intermediate callers down to __rte_hash_add_key_with_hash(). It made >> an already complex code even more obscure. I can send it back but it will make the code complex since I cannot free the pointer in the callers. I need to pass it down the call chain to make the free-or-return-old-data decision. Let me know what you prefer. ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [PATCH dpdk v3 0/2] hash: safe data replacement on overwrite 2026-03-09 7:52 ` Robin Jarry @ 2026-03-09 8:32 ` Konstantin Ananyev 0 siblings, 0 replies; 21+ messages in thread From: Konstantin Ananyev @ 2026-03-09 8:32 UTC (permalink / raw) To: Robin Jarry, dev@dpdk.org Cc: Bruce Richardson, Vladimir Medvedkin, Yipeng Wang Hi Robin, > >> * Removed the rte_hash_replace_key_data() API > > > > That's a pity - I think it is a useful one. > > > >> which required to bubble > >> a potential old_data pointer from search_and_update() to all > >> intermediate callers down to __rte_hash_add_key_with_hash(). It made > >> an already complex code even more obscure. > > I can send it back but it will make the code complex since I cannot free > the pointer in the callers. I need to pass it down the call chain to > make the free-or-return-old-data decision. > Let me know what you prefer. My thoughts are: I think what we have now inside rte_hash - Is a bug that needs to be fixed. So your v3 patch looks good to me: it fixes the memory leak. That's why I ack-ed it - I think patch is good to go as it is. From other hand - rte_hash_replace_key_data() - is a 'nice to have' feature, At it is kind of a new API, I think if we decide to go ahead with it - it can be done as a new patch series. If you have time to work on it - please go ahead, I'll be happy to review the patch. Also CC-ing to hash lib maintainers - lads what is your opinion on that subject? Konstantin ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH dpdk v3 0/2] hash: safe data replacement on overwrite 2026-03-06 8:47 ` [PATCH dpdk v3 0/2] hash: safe data replacement " Robin Jarry ` (2 preceding siblings ...) 2026-03-09 7:50 ` [PATCH dpdk v3 0/2] hash: safe data replacement on overwrite Konstantin Ananyev @ 2026-03-17 9:40 ` Thomas Monjalon 3 siblings, 0 replies; 21+ messages in thread From: Thomas Monjalon @ 2026-03-17 9:40 UTC (permalink / raw) To: Robin Jarry; +Cc: dev, Konstantin Ananyev > Robin Jarry (2): > hash: avoid leaking entries on RCU defer queue failure > hash: free replaced data on overwrite when RCU is configured Applied, thanks. ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2026-03-17 9:40 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-02-12 21:33 [PATCH dpdk 0/2] hash: safe data replacement on overwrite Robin Jarry 2026-02-12 21:33 ` [PATCH dpdk 1/2] hash: free replaced data on overwrite when RCU is configured Robin Jarry 2026-02-12 21:33 ` [PATCH dpdk 2/2] hash: add replace API returning old data on overwrite Robin Jarry 2026-02-12 22:55 ` [PATCH dpdk 0/2] hash: safe data replacement " Stephen Hemminger 2026-02-13 10:34 ` [PATCH dpdk v2 0/3] " Robin Jarry 2026-02-13 10:34 ` [PATCH dpdk v2 1/3] hash: avoid leaking entries on RCU defer queue failure Robin Jarry 2026-03-04 10:28 ` Konstantin Ananyev 2026-02-13 10:34 ` [PATCH dpdk v2 2/3] hash: free replaced data on overwrite when RCU is configured Robin Jarry 2026-03-04 11:40 ` Konstantin Ananyev 2026-03-04 11:45 ` Robin Jarry 2026-03-04 12:50 ` Robin Jarry 2026-02-13 10:34 ` [PATCH dpdk v2 3/3] hash: add replace API returning old data on overwrite Robin Jarry 2026-03-04 11:44 ` Konstantin Ananyev 2026-03-06 8:47 ` [PATCH dpdk v3 0/2] hash: safe data replacement " Robin Jarry 2026-03-06 8:47 ` [PATCH dpdk v3 1/2] hash: avoid leaking entries on RCU defer queue failure Robin Jarry 2026-03-06 8:47 ` [PATCH dpdk v3 2/2] hash: free replaced data on overwrite when RCU is configured Robin Jarry 2026-03-09 7:48 ` Konstantin Ananyev 2026-03-09 7:50 ` [PATCH dpdk v3 0/2] hash: safe data replacement on overwrite Konstantin Ananyev 2026-03-09 7:52 ` Robin Jarry 2026-03-09 8:32 ` Konstantin Ananyev 2026-03-17 9:40 ` Thomas Monjalon
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox