All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stephen Hemminger <stephen@networkplumber.org>
To: Mark Blasko <blasko@google.com>
Cc: dev@dpdk.org, joshwash@google.com, jtranoleary@google.com
Subject: Re: [PATCH v4 0/6] net/gve: add hardware timestamping support
Date: Mon, 15 Jun 2026 11:53:14 -0700	[thread overview]
Message-ID: <20260615115314.5e2ff054@phoenix.local> (raw)
In-Reply-To: <20260613042300.3760470-1-blasko@google.com>

On Sat, 13 Jun 2026 04:22:33 +0000
Mark Blasko <blasko@google.com> wrote:

> This patch series introduces support for GVE hardware timestamping
> on DQO queues. To support concurrent access, a mutex lock is introduced
> to protect admin queue operations. A mechanism is then added to
> periodically synchronize the NIC clock via a dedicated control thread,
> and support is introduced for the read_clock ethdev operation.
> Finally, the RX datapath is updated to reconstruct full 64-bit
> timestamps from the 32-bit values in DQO descriptors.
> 
> ---

AI spotted several issues...

Reviewed the v3 series against current main. Findings on 4/6 and 5/6
below; patches 1, 2, 3, and 6 look good.

[PATCH v3 4/6] net/gve: add periodic NIC clock synchronization

Warning: gve_read_nic_clock() runs as an rte_alarm callback, i.e. on the
shared EAL interrupt thread, and calls gve_adminq_report_nic_timestamp()
-> gve_adminq_kick_and_wait(), which busy-waits via rte_delay_ms() up to
GVE_MAX_ADMINQ_EVENT_COUNTER_CHECK * GVE_ADMINQ_SLEEP_LEN = 100 * 20ms =
2s on an AdminQ timeout (tens of ms in the normal case). Blocking the
interrupt thread that long stalls link-status/reset detection and every
other device's interrupt and alarm handling for the whole process. The
existing gve_check_device_status() alarm only does a single ioread32be(),
so this is new behavior for the driver. Consider running the periodic
read off a dedicated control thread, or otherwise bounding the time spent
on the interrupt thread.

The teardown ordering itself is fine: rte_eal_alarm_cancel() is called
before gve_free_nic_ts_report(), and its spin-until-not-executing
semantics catch the self-rescheduled alarm, since the new entry is queued
during the callback before the dispatcher removes the executing one. No
use-after-free on the memzone there.

[PATCH v3 5/6] net/gve: support read clock ethdev op

Error: priv->nic_ts_lock is locked before it is initialized. In
gve_dev_init() the order is:

	err = gve_init_priv(priv, false);
	...
	pthread_mutex_init(&priv->nic_ts_lock, &mutexattr);

but gve_init_priv() -> gve_setup_device_resources() ->
gve_setup_nic_timestamp() calls gve_read_nic_clock() synchronously when
the device reports NIC-timestamp support, and after this patch
gve_read_nic_clock() takes priv->nic_ts_lock. So on timestamp-capable
hardware the first lock runs on an uninitialized mutex, and the later
pthread_mutex_init() then re-initializes an already-used mutex - both
undefined behavior. It only appears to work because dev_private is zeroed
(a zeroed pthread_mutex_t happens to be a valid default mutex on glibc).
Initialize nic_ts_lock (and the mutexattr) before the gve_init_priv()
call.

Error: priv->nic_ts_lock is destroyed before the alarm that uses it is
cancelled. In gve_dev_close():

	pthread_mutex_destroy(&priv->nic_ts_lock);
	gve_free_queues(dev);
	gve_teardown_device_resources(priv);	/* cancels gve_read_nic_clock alarm */

The periodic gve_read_nic_clock() alarm is still armed when the mutex is
destroyed, and that callback locks nic_ts_lock; if it fires in the window
before gve_teardown_device_resources() cancels it, it locks a destroyed
mutex. Move the pthread_mutex_destroy(&priv->nic_ts_lock) to after
gve_teardown_device_resources() returns.

The two 5/6 errors are the same root cause from opposite ends: the
nic_ts_lock lifetime needs to bracket all its users - initialized before
the synchronous setup-time read, and destroyed only after the alarm is
cancelled.

  parent reply	other threads:[~2026-06-15 18:53 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11 22:43 [PATCH 0/6] net/gve: add hardware timestamping support mark-blasko
2026-05-11 22:43 ` [PATCH 1/6] net/gve: add thread safety to admin queue mark-blasko
2026-05-11 22:43 ` [PATCH 2/6] net/gve: add device option support for HW timestamps mark-blasko
2026-05-11 22:43 ` [PATCH 3/6] net/gve: add AdminQ command for NIC timestamps mark-blasko
2026-05-11 22:43 ` [PATCH 4/6] net/gve: add periodic NIC clock synchronization mark-blasko
2026-05-11 22:43 ` [PATCH 5/6] net/gve: support read clock ethdev op mark-blasko
2026-05-11 22:43 ` [PATCH 6/6] net/gve: reconstruct HW timestamps from DQO mark-blasko
2026-05-12  7:14 ` [PATCH 0/6] net/gve: add hardware timestamping support Stephen Hemminger
2026-05-15 23:19 ` [PATCH v2 " Mark Blasko
2026-05-15 23:19   ` [PATCH v2 1/6] net/gve: add thread safety to admin queue Mark Blasko
2026-05-15 23:19   ` [PATCH v2 2/6] net/gve: add device option support for HW timestamps Mark Blasko
2026-05-15 23:19   ` [PATCH v2 3/6] net/gve: add AdminQ command for NIC timestamps Mark Blasko
2026-05-15 23:19   ` [PATCH v2 4/6] net/gve: add periodic NIC clock synchronization Mark Blasko
2026-05-15 23:19   ` [PATCH v2 5/6] net/gve: support read clock ethdev op Mark Blasko
2026-05-15 23:19   ` [PATCH v2 6/6] net/gve: reconstruct HW timestamps from DQO Mark Blasko
2026-05-17 23:15   ` [PATCH v2 0/6] net/gve: add hardware timestamping support Stephen Hemminger
2026-05-18 18:43     ` Mark Blasko
2026-05-21 16:10       ` Joshua Washington
2026-06-05 21:29   ` [PATCH v3 " Mark Blasko
2026-06-05 21:29     ` [PATCH v3 1/6] net/gve: add thread safety to admin queue Mark Blasko
2026-06-05 21:29     ` [PATCH v3 2/6] net/gve: add device option support for HW timestamps Mark Blasko
2026-06-05 21:29     ` [PATCH v3 3/6] net/gve: add AdminQ command for NIC timestamps Mark Blasko
2026-06-05 21:29     ` [PATCH v3 4/6] net/gve: add periodic NIC clock synchronization Mark Blasko
2026-06-05 21:29     ` [PATCH v3 5/6] net/gve: support read clock ethdev op Mark Blasko
2026-06-05 21:29     ` [PATCH v3 6/6] net/gve: reconstruct HW timestamps from DQO Mark Blasko
2026-06-13  4:22     ` [PATCH v4 0/6] net/gve: add hardware timestamping support Mark Blasko
2026-06-13  4:22       ` [PATCH v4 1/6] net/gve: add thread safety to admin queue Mark Blasko
2026-06-13  4:22       ` [PATCH v4 2/6] net/gve: add device option support for HW timestamps Mark Blasko
2026-06-13  4:22       ` [PATCH v4 3/6] net/gve: add AdminQ command for NIC timestamps Mark Blasko
2026-06-13  4:22       ` [PATCH v4 4/6] net/gve: add periodic NIC clock synchronization Mark Blasko
2026-06-13  4:22       ` [PATCH v4 5/6] net/gve: support read clock ethdev op Mark Blasko
2026-06-13  4:22       ` [PATCH v4 6/6] net/gve: reconstruct HW timestamps from DQO Mark Blasko
2026-06-15 18:53       ` Stephen Hemminger [this message]
2026-06-15 21:01         ` [PATCH v4 0/6] net/gve: add hardware timestamping support Mark Blasko
2026-06-15 21:29           ` Stephen Hemminger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260615115314.5e2ff054@phoenix.local \
    --to=stephen@networkplumber.org \
    --cc=blasko@google.com \
    --cc=dev@dpdk.org \
    --cc=joshwash@google.com \
    --cc=jtranoleary@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.