From: Raslan Darawsheh <rasland@nvidia.com>
To: Linhu Li <lilinhu618@gmail.com>, dev@dpdk.org
Cc: stable@dpdk.org, dsosnowski@nvidia.com
Subject: Re: [PATCH v6] net/mlx5: fix counter TAILQ race between free and query callback
Date: Wed, 24 Jun 2026 10:17:25 +0300
Message-ID: <8a2bf4bd-c9f1-4abd-aed9-9d3337152d6c@nvidia.com>
In-Reply-To: <20260618091450.19204-1-lilinhu618@gmail.com>
Hi,
On 18/06/2026 12:14 PM, Linhu Li wrote:
> flow_dv_counter_free() inserts counters into
> pool->counters[pool->query_gen] under pool->csl. Meanwhile,
> mlx5_flow_async_pool_query_handle() moves counters from
> pool->counters[query_gen ^ 1] to the global free list via
> TAILQ_CONCAT while holding only cmng->csl, not pool->csl.
>
> The comment in flow_dv_counter_free() claims the lock is not needed
> because the query callback and the release function operate on
> different lists. That holds only if the free path always observes
> the up-to-date query_gen. It can be violated:
>
> 1. A counter free thread (non-PMD, e.g. OVS offload thread) reads
> pool->query_gen == 0 and is about to insert into counters[0].
> 2. The free thread is preempted by the OS scheduler; it is a regular
> pthread, not pinned to a core.
> 3. The eal-intr-thread alarm fires: query_gen++ (now 1) and the async
> query is sent.
> 4. Hardware completes the query and the callback runs TAILQ_CONCAT on
> counters[0] (= query_gen ^ 1).
> 5. The free thread resumes and runs TAILQ_INSERT_TAIL on counters[0]
> concurrently with step 4 on another core.
>
> Because the two paths take different locks, TAILQ_INSERT_TAIL and
> TAILQ_CONCAT run concurrently on the same list with no synchronization
> and corrupt it: the pool-local list ends up with a NULL head but a
> dangling tqh_last, and the global free list tail no longer points to
> the real tail. The just-freed counter and every counter inserted
> afterwards become unreachable and are leaked.
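
To make the corrupted state described above concrete, here is a minimal single-threaded sketch that replays one unlucky interleaving. It is an illustration only: `struct counter`, `replay_race()` and the two lists are hypothetical stand-ins, not the mlx5 structures, and the insert steps are written out by hand on the assumption that TAILQ_INSERT_TAIL expands as in the usual BSD/glibc <sys/queue.h> and that the free path had already loaded tqh_last before being preempted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a flow counter; not the mlx5 structure. */
struct counter {
	int id;
	TAILQ_ENTRY(counter) next;
};
TAILQ_HEAD(counter_list, counter);

struct replay_result {
	bool pool_head_is_null;        /* pool list looks empty ... */
	bool pool_tail_dangling;       /* ... but tqh_last points into b */
	bool global_tail_stale;        /* global tqh_last still points into a */
	bool b_lost_after_next_insert; /* a later insert unlinks b */
};

/* Replay the interleaving: half an insert, a full concat, the other half. */
static struct replay_result
replay_race(void)
{
	struct counter a = { .id = 1 }, b = { .id = 2 }, x = { .id = 3 };
	struct counter_list pool_list, global_list;
	struct replay_result res;

	TAILQ_INIT(&pool_list);
	TAILQ_INIT(&global_list);
	TAILQ_INSERT_TAIL(&pool_list, &a, next);

	/* Free path: first half of TAILQ_INSERT_TAIL(&pool_list, &b, next). */
	struct counter **cached_last = pool_list.tqh_last;
	assert(cached_last == &a.next.tqe_next);
	b.next.tqe_next = NULL;
	b.next.tqe_prev = cached_last;

	/* Query callback runs to completion: pool list moves to global list. */
	TAILQ_CONCAT(&global_list, &pool_list, next);

	/* Free path resumes and finishes the insert with the stale pointer. */
	*cached_last = &b;                     /* links b after a, in global */
	pool_list.tqh_last = &b.next.tqe_next; /* pool tail now points into b */

	res.pool_head_is_null = (TAILQ_FIRST(&pool_list) == NULL);
	res.pool_tail_dangling = (pool_list.tqh_last == &b.next.tqe_next);
	res.global_tail_stale = (global_list.tqh_last == &a.next.tqe_next);

	/* An ordinary later insert on the global list overwrites a's link to b. */
	TAILQ_INSERT_TAIL(&global_list, &x, next);
	res.b_lost_after_next_insert = (TAILQ_NEXT(&a, next) == &x);
	return res;
}
```

Calling `replay_race()` returns a result with all four flags set under the assumptions above. Other interleavings are possible on real hardware; this is just one that reproduces the symptoms listed in the commit message.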
>
> Non-PMD threads can be preempted for hundreds of microseconds under
> CPU pressure, which is well within the async query round-trip time,
> so the window is reachable in practice.
>
> Fix it by taking pool->csl in the query completion callback before
> operating on pool->counters[query_gen], serializing the CONCAT with
> any concurrent INSERT. The lock is taken once per pool per query
> completion in the eal-intr-thread context, not on the datapath, so
> the cost is negligible. Lock order is pool->csl then cmng->csl,
> matching all other sites.
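
The locking scheme described above could be sketched roughly as follows. This is not the patch itself: `struct pool`, `struct manager`, `counter_free()` and `query_done()` are hypothetical names, and a pthread mutex stands in for whatever lock type the driver uses for pool->csl and cmng->csl. The point is only the shape: both paths take the pool lock, and the callback takes the pool lock before the manager lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-ins; not the mlx5 structures. */
struct counter {
	int id;
	TAILQ_ENTRY(counter) next;
};
TAILQ_HEAD(counter_list, counter);

struct pool {
	pthread_mutex_t csl;             /* stand-in for pool->csl */
	volatile unsigned int query_gen; /* only the low bit is used */
	struct counter_list counters[2];
};

struct manager {
	pthread_mutex_t csl;             /* stand-in for cmng->csl */
	struct counter_list free_list;
};

static void
pool_init(struct pool *pool)
{
	pthread_mutex_init(&pool->csl, NULL);
	pool->query_gen = 0;
	TAILQ_INIT(&pool->counters[0]);
	TAILQ_INIT(&pool->counters[1]);
}

static void
manager_init(struct manager *mng)
{
	pthread_mutex_init(&mng->csl, NULL);
	TAILQ_INIT(&mng->free_list);
}

/* Free path: read the generation once and insert, all under the pool lock. */
static void
counter_free(struct pool *pool, struct counter *cnt)
{
	assert(cnt != NULL);
	pthread_mutex_lock(&pool->csl);
	unsigned int gen = pool->query_gen & 1;
	TAILQ_INSERT_TAIL(&pool->counters[gen], cnt, next);
	pthread_mutex_unlock(&pool->csl);
}

/* Query completion: drain the previous generation's list.
 * Lock order is pool lock first, then manager lock. */
static void
query_done(struct manager *mng, struct pool *pool)
{
	pthread_mutex_lock(&pool->csl);
	unsigned int prev_gen = (pool->query_gen ^ 1) & 1;
	pthread_mutex_lock(&mng->csl);
	TAILQ_CONCAT(&mng->free_list, &pool->counters[prev_gen], next);
	pthread_mutex_unlock(&mng->csl);
	pthread_mutex_unlock(&pool->csl);
}

/* Helper for inspection: number of entries on a list. */
static size_t
list_len(struct counter_list *list)
{
	struct counter *cnt;
	size_t n = 0;

	TAILQ_FOREACH(cnt, list, next)
		n++;
	return n;
}
```

With both paths holding the same pool lock, an insert and a concat on the same pool-local list can no longer overlap, whichever generation the free path happened to read.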
>
> Also handle the error path: previously the counters accumulated in
> pool->counters[query_gen] were abandoned when a query failed. Move
> them back to the global free list to avoid a leak on persistent
> query failures.
>
> Additionally, fix a second independent race in flow_dv_counter_free():
> TAILQ_INSERT_TAIL is passed &pool->counters[pool->query_gen] directly,
> but the macro evaluates its head argument multiple times. Since
> pool->query_gen is a volatile bit-field, if mlx5_flow_query_alarm()
> increments query_gen between two evaluations of the macro, the same
> insertion can operate on two different lists: the earlier steps update
> counters[0] while the later steps update counters[1], leaving both
> lists with inconsistent metadata and leaking the counter. Fix by
> caching pool->query_gen into a local variable before calling the macro.
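
The multiple-evaluation hazard is easy to observe in isolation. The sketch below is a hedged illustration: `current_head()` is a hypothetical helper standing in for the expression &pool->counters[pool->query_gen], and it simply counts how many times the macro evaluates its head argument. The exact count depends on the <sys/queue.h> in use, so the example only relies on it being greater than one.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-ins; not the mlx5 structures. */
struct counter {
	int id;
	TAILQ_ENTRY(counter) next;
};
TAILQ_HEAD(counter_list, counter);

static struct counter_list lists[2];
static volatile unsigned int query_gen; /* only the low bit is used */
static unsigned int head_evals;

static void
lists_init(void)
{
	TAILQ_INIT(&lists[0]);
	TAILQ_INIT(&lists[1]);
	query_gen = 0;
}

/* Stands in for &pool->counters[pool->query_gen]; counts evaluations. */
static struct counter_list *
current_head(void)
{
	head_evals++;
	return &lists[query_gen & 1];
}

/* Head expression passed straight to the macro: evaluated per use. */
static unsigned int
insert_uncached(struct counter *cnt)
{
	head_evals = 0;
	TAILQ_INSERT_TAIL(current_head(), cnt, next);
	return head_evals;
}

/* Head resolved once into a local, then handed to the macro. */
static unsigned int
insert_cached(struct counter *cnt)
{
	head_evals = 0;
	struct counter_list *head = current_head();
	assert(head != NULL);
	TAILQ_INSERT_TAIL(head, cnt, next);
	return head_evals;
}
```

`insert_uncached()` reports more than one evaluation, and each of those is a point where a change of query_gen would redirect the rest of the insert to the other list. `insert_cached()` reports exactly one, which is the effect of caching the generation in a local before calling the macro.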
>
> Fixes: ac79183dc6f7 ("net/mlx5: optimize free counter lookup")
> Cc: stable@dpdk.org
>
> Signed-off-by: Linhu Li <lilinhu618@gmail.com>
> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Patch applied to next-net-mlx,
Kindest regards
Raslan Darawsheh
Thread overview:
2026-06-04 10:11 [PATCH] net/mlx5: fix counter TAILQ race between free and query callback Laaahu
2026-06-08 12:41 ` Dariusz Sosnowski
2026-06-08 13:25 ` [PATCH v2] " Linhu Li
2026-06-08 14:11 ` Dariusz Sosnowski
2026-06-09 9:22 ` Dariusz Sosnowski
2026-06-10 6:34 ` [PATCH v3] " Linhu Li
2026-06-11 7:51 ` [PATCH v4] " Linhu Li
2026-06-16 8:03 ` [PATCH v5] " Linhu Li
2026-06-18 9:14 ` [PATCH v6] " Linhu Li
2026-06-24 7:17 ` Raslan Darawsheh [this message]