DPDK-dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Stephen Hemminger <stephen@networkplumber.org>
To: Mark Blasko <blasko@google.com>
Cc: dev@dpdk.org, joshwash@google.com, jtranoleary@google.com
Subject: Re: [PATCH v4 0/6] net/gve: add hardware timestamping support
Date: Mon, 15 Jun 2026 11:53:14 -0700	[thread overview]
Message-ID: <20260615115314.5e2ff054@phoenix.local> (raw)
In-Reply-To: <20260613042300.3760470-1-blasko@google.com>

On Sat, 13 Jun 2026 04:22:33 +0000
Mark Blasko <blasko@google.com> wrote:

> This patch series introduces support for GVE hardware timestamping
> on DQO queues. To support concurrent access, a mutex lock is introduced
> to protect admin queue operations. A mechanism is then added to
> periodically synchronize the NIC clock via a dedicated control thread,
> and support is introduced for the read_clock ethdev operation.
> Finally, the RX datapath is updated to reconstruct full 64-bit
> timestamps from the 32-bit values in DQO descriptors.
> 
> ---

AI spotted several issues...

Reviewed the v3 series against current main. Findings on 4/6 and 5/6
below; patches 1, 2, 3, and 6 look good.

[PATCH v3 4/6] net/gve: add periodic NIC clock synchronization

Warning: gve_read_nic_clock() runs as an rte_alarm callback, i.e. on the
shared EAL interrupt thread, and calls gve_adminq_report_nic_timestamp()
-> gve_adminq_kick_and_wait(), which busy-waits via rte_delay_ms() up to
GVE_MAX_ADMINQ_EVENT_COUNTER_CHECK * GVE_ADMINQ_SLEEP_LEN = 100 * 20ms =
2s on an AdminQ timeout (tens of ms in the normal case). Blocking the
interrupt thread that long stalls link-status/reset detection and every
other device's interrupt and alarm handling for the whole process. The
existing gve_check_device_status() alarm only does a single ioread32be(),
so this is new behavior for the driver. Consider running the periodic
read off a dedicated control thread, or otherwise bounding the time spent
on the interrupt thread.

The teardown ordering itself is fine: rte_eal_alarm_cancel() is called
before gve_free_nic_ts_report(), and its spin-until-not-executing
semantics catch the self-rescheduled alarm, since the new entry is queued
during the callback before the dispatcher removes the executing one. No
use-after-free on the memzone there.

[PATCH v3 5/6] net/gve: support read clock ethdev op

Error: priv->nic_ts_lock is locked before it is initialized. In
gve_dev_init() the order is:

	err = gve_init_priv(priv, false);
	...
	pthread_mutex_init(&priv->nic_ts_lock, &mutexattr);

but gve_init_priv() -> gve_setup_device_resources() ->
gve_setup_nic_timestamp() calls gve_read_nic_clock() synchronously when
the device reports NIC-timestamp support, and after this patch
gve_read_nic_clock() takes priv->nic_ts_lock. So on timestamp-capable
hardware the first lock runs on an uninitialized mutex, and the later
pthread_mutex_init() then re-initializes an already-used mutex - both
undefined behavior. It only appears to work because dev_private is zeroed
(a zeroed pthread_mutex_t happens to be a valid default mutex on glibc).
Initialize nic_ts_lock (and the mutexattr) before the gve_init_priv()
call.

Error: priv->nic_ts_lock is destroyed before the alarm that uses it is
cancelled. In gve_dev_close():

	pthread_mutex_destroy(&priv->nic_ts_lock);
	gve_free_queues(dev);
	gve_teardown_device_resources(priv);	/* cancels gve_read_nic_clock alarm */

The periodic gve_read_nic_clock() alarm is still armed when the mutex is
destroyed, and that callback locks nic_ts_lock; if it fires in the window
before gve_teardown_device_resources() cancels it, it locks a destroyed
mutex. Move the pthread_mutex_destroy(&priv->nic_ts_lock) to after
gve_teardown_device_resources() returns.

The two 5/6 errors are the same root cause from opposite ends: the
nic_ts_lock lifetime needs to bracket all its users - initialized before
the synchronous setup-time read, and destroyed only after the alarm is
cancelled.

  parent reply	other threads:[~2026-06-15 18:53 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11 22:43 [PATCH 0/6] net/gve: add hardware timestamping support mark-blasko
2026-05-11 22:43 ` [PATCH 1/6] net/gve: add thread safety to admin queue mark-blasko
2026-05-11 22:43 ` [PATCH 2/6] net/gve: add device option support for HW timestamps mark-blasko
2026-05-11 22:43 ` [PATCH 3/6] net/gve: add AdminQ command for NIC timestamps mark-blasko
2026-05-11 22:43 ` [PATCH 4/6] net/gve: add periodic NIC clock synchronization mark-blasko
2026-05-11 22:43 ` [PATCH 5/6] net/gve: support read clock ethdev op mark-blasko
2026-05-11 22:43 ` [PATCH 6/6] net/gve: reconstruct HW timestamps from DQO mark-blasko
2026-05-12  7:14 ` [PATCH 0/6] net/gve: add hardware timestamping support Stephen Hemminger
2026-05-15 23:19 ` [PATCH v2 " Mark Blasko
2026-05-15 23:19   ` [PATCH v2 1/6] net/gve: add thread safety to admin queue Mark Blasko
2026-05-15 23:19   ` [PATCH v2 2/6] net/gve: add device option support for HW timestamps Mark Blasko
2026-05-15 23:19   ` [PATCH v2 3/6] net/gve: add AdminQ command for NIC timestamps Mark Blasko
2026-05-15 23:19   ` [PATCH v2 4/6] net/gve: add periodic NIC clock synchronization Mark Blasko
2026-05-15 23:19   ` [PATCH v2 5/6] net/gve: support read clock ethdev op Mark Blasko
2026-05-15 23:19   ` [PATCH v2 6/6] net/gve: reconstruct HW timestamps from DQO Mark Blasko
2026-05-17 23:15   ` [PATCH v2 0/6] net/gve: add hardware timestamping support Stephen Hemminger
2026-05-18 18:43     ` Mark Blasko
2026-05-21 16:10       ` Joshua Washington
2026-06-05 21:29   ` [PATCH v3 " Mark Blasko
2026-06-05 21:29     ` [PATCH v3 1/6] net/gve: add thread safety to admin queue Mark Blasko
2026-06-05 21:29     ` [PATCH v3 2/6] net/gve: add device option support for HW timestamps Mark Blasko
2026-06-05 21:29     ` [PATCH v3 3/6] net/gve: add AdminQ command for NIC timestamps Mark Blasko
2026-06-05 21:29     ` [PATCH v3 4/6] net/gve: add periodic NIC clock synchronization Mark Blasko
2026-06-05 21:29     ` [PATCH v3 5/6] net/gve: support read clock ethdev op Mark Blasko
2026-06-05 21:29     ` [PATCH v3 6/6] net/gve: reconstruct HW timestamps from DQO Mark Blasko
2026-06-13  4:22     ` [PATCH v4 0/6] net/gve: add hardware timestamping support Mark Blasko
2026-06-13  4:22       ` [PATCH v4 1/6] net/gve: add thread safety to admin queue Mark Blasko
2026-06-13  4:22       ` [PATCH v4 2/6] net/gve: add device option support for HW timestamps Mark Blasko
2026-06-13  4:22       ` [PATCH v4 3/6] net/gve: add AdminQ command for NIC timestamps Mark Blasko
2026-06-13  4:22       ` [PATCH v4 4/6] net/gve: add periodic NIC clock synchronization Mark Blasko
2026-06-13  4:22       ` [PATCH v4 5/6] net/gve: support read clock ethdev op Mark Blasko
2026-06-13  4:22       ` [PATCH v4 6/6] net/gve: reconstruct HW timestamps from DQO Mark Blasko
2026-06-15 18:53       ` Stephen Hemminger [this message]
2026-06-15 21:01         ` [PATCH v4 0/6] net/gve: add hardware timestamping support Mark Blasko
2026-06-15 21:29           ` Stephen Hemminger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260615115314.5e2ff054@phoenix.local \
    --to=stephen@networkplumber.org \
    --cc=blasko@google.com \
    --cc=dev@dpdk.org \
    --cc=joshwash@google.com \
    --cc=jtranoleary@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox