Netdev List
* [PATCH net-next v10 3/4] ptr_ring: move free-space check into separate helper
From: Simon Schippers @ 2026-05-06 14:10 UTC (permalink / raw)
  To: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, mst, eperezma, leiyang, stephen, jon, tim.gebauer,
	simon.schippers, netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20260506141033.180450-1-simon.schippers@tu-dortmund.de>

This patch moves the check for available free space for a new entry into
a separate function. As a result, __ptr_ring_produce() remains logically
unchanged, while the new helper allows callers to determine in advance
whether the next __ptr_ring_produce() call will succeed. This
information can, for example, be used to temporarily stop producing until
__ptr_ring_produce_peek() indicates that space is available again.
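
For readers outside the kernel tree, the split can be sketched in user
space. This is only an illustrative model, not the kernel code: the names
toy_ring, toy_produce_peek(), toy_produce() and toy_consume() are
hypothetical, locking and memory barriers are omitted, and the consumer
clears each slot immediately, whereas the real ptr_ring consumer clears
slots in batches.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified model of a pointer ring: a slot is free iff NULL. */
struct toy_ring {
	int producer;	/* index of the next slot to fill */
	int consumer;	/* index of the next slot to drain */
	int size;	/* number of slots */
	void **queue;
};

/* Model of the new helper: 0 if the next produce would succeed. */
static int toy_produce_peek(const struct toy_ring *r)
{
	if (!r->size || r->queue[r->producer])
		return -ENOSPC;

	return 0;
}

/* Produce reuses the peek helper, so its behavior is unchanged. */
static int toy_produce(struct toy_ring *r, void *ptr)
{
	int p = toy_produce_peek(r);

	if (p)
		return p;

	r->queue[r->producer++] = ptr;
	if (r->producer >= r->size)
		r->producer = 0;
	return 0;
}

/* Simplified consume: take one entry and clear its slot right away. */
static void *toy_consume(struct toy_ring *r)
{
	void *ptr = r->size ? r->queue[r->consumer] : NULL;

	if (ptr) {
		r->queue[r->consumer++] = NULL;
		if (r->consumer >= r->size)
			r->consumer = 0;
	}
	return ptr;
}
```

In this model a caller can check toy_produce_peek() after each successful
toy_produce() and pause once it reports -ENOSPC, resuming when a consume
has freed the slot at the producer index.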

Co-developed-by: Tim Gebauer <tim.gebauer@tu-dortmund.de>
Signed-off-by: Tim Gebauer <tim.gebauer@tu-dortmund.de>
Signed-off-by: Simon Schippers <simon.schippers@tu-dortmund.de>
---
 include/linux/ptr_ring.h | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index d2c3629bbe45..0887284e5b43 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -96,6 +96,17 @@ static inline bool ptr_ring_full_bh(struct ptr_ring *r)
 	return ret;
 }
 
+/* Note: callers invoking this in a loop must use a compiler barrier,
+ * for example cpu_relax(). Callers must hold producer_lock.
+ */
+static inline int __ptr_ring_produce_peek(struct ptr_ring *r)
+{
+	if (unlikely(!r->size) || data_race(r->queue[r->producer]))
+		return -ENOSPC;
+
+	return 0;
+}
+
 /* Note: callers invoking this in a loop must use a compiler barrier,
  * for example cpu_relax(). Callers must hold producer_lock.
  * Callers are responsible for making sure pointer that is being queued
@@ -103,8 +114,10 @@ static inline bool ptr_ring_full_bh(struct ptr_ring *r)
  */
 static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
 {
-	if (unlikely(!r->size) || data_race(r->queue[r->producer]))
-		return -ENOSPC;
+	int p = __ptr_ring_produce_peek(r);
+
+	if (p)
+		return p;
 
 	/* Make sure the pointer we are storing points to a valid data. */
 	/* Pairs with the dependency ordering in __ptr_ring_consume. */
-- 
2.43.0



* [PATCH net-next v10 4/4] tun/tap & vhost-net: avoid ptr_ring tail-drop when a qdisc is present
From: Simon Schippers @ 2026-05-06 14:10 UTC (permalink / raw)
  To: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, mst, eperezma, leiyang, stephen, jon, tim.gebauer,
	simon.schippers, netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20260506141033.180450-1-simon.schippers@tu-dortmund.de>

This commit prevents tail-drop when a qdisc is present and the ptr_ring
becomes full. Once an entry is successfully produced and the ptr_ring
reaches capacity, the netdev queue is stopped instead of dropping
subsequent packets. If no qdisc is present, the previous tail-drop
behavior is preserved.

If producing an entry fails anyway due to a race, tun_net_xmit() drops
the packet. Such races are expected because LLTX is enabled and the
transmit path operates without the usual locking.
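
The resulting producer-side decision can be modelled roughly as follows.
This is a single-threaded sketch under simplifying assumptions: toy_txq and
toy_xmit() are hypothetical names, the ring is reduced to a fill counter,
and the producer lock and the memory barrier are left out.

```c
#include <assert.h>
#include <stdbool.h>

enum xmit_result { XMIT_QUEUED, XMIT_DROPPED };

/* Hypothetical model of one TX queue and its ring. */
struct toy_txq {
	int used;		/* entries currently in the ring */
	int size;		/* ring capacity */
	bool has_qdisc;		/* a qdisc is attached to the queue */
	bool stopped;		/* models __QUEUE_STATE_DRV_XOFF */
};

static bool toy_ring_full(const struct toy_txq *q)
{
	return q->used >= q->size;
}

static enum xmit_result toy_xmit(struct toy_txq *q)
{
	bool failed = toy_ring_full(q);	/* produce fails on a full ring */

	if (!failed)
		q->used++;

	/* Stop the queue once the ring is full, but only with a qdisc. */
	if (q->has_qdisc && (toy_ring_full(q) || failed)) {
		q->stopped = true;
		/*
		 * The real code issues a barrier here and re-checks the
		 * ring; in this single-threaded model the re-check can
		 * never find free space.
		 */
		if (!toy_ring_full(q))
			q->stopped = false;
	}

	return failed ? XMIT_DROPPED : XMIT_QUEUED;
}
```

With a qdisc, the queue is stopped by the xmit that fills the last slot, so
no further packet reaches a full ring in this model; without a qdisc, the
next xmit is tail-dropped as before.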

The __tun_wake_queue() function of the consumer races with the producer
for waking/stopping the netdev queue, which could result in a stalled
queue. Therefore, an smp_mb__after_atomic() is introduced that pairs
with the smp_mb() of the consumer. It follows the principle of store
buffering described in tools/memory-model/Documentation/recipes.txt:

- The producer in tun_net_xmit() first sets __QUEUE_STATE_DRV_XOFF,
  followed by an smp_mb__after_atomic() (= smp_mb()), and then reads the
  ring with __ptr_ring_produce_peek().

- The consumer in __tun_wake_queue() first writes zero to the ring in
  __ptr_ring_consume(), followed by an smp_mb(), and then reads the queue
  status with netif_tx_queue_stopped().

=> Following the aforementioned principle, it is impossible for the
   producer to see a full ring (and therefore not wake the queue on the
   re-check) while the consumer simultaneously fails to see a stopped
   queue (and therefore also does not wake it).
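
The argument can be checked mechanically for sequentially consistent
executions, which is the behavior the two full barriers are meant to
restore for this pattern. The sketch below is an illustration only, not a
proof for real hardware: count_lost_wakeups() is a hypothetical helper that
enumerates every interleaving of the two producer steps and the two
consumer steps that respects program order, and counts those in which both
sides miss the wakeup.

```c
#include <assert.h>

/*
 * Steps, in program order per side:
 *   P1: stopped = 1      (netif_tx_stop_queue)
 *   P2: read ring slot   (__ptr_ring_produce_peek)
 *   C1: slot = 0         (__ptr_ring_consume)
 *   C2: read stopped     (netif_tx_queue_stopped)
 *
 * Returns the number of interleavings in which the producer still sees
 * a full ring AND the consumer sees a running queue. *total receives
 * the number of interleavings examined.
 */
static int count_lost_wakeups(int *total)
{
	int lost = 0;

	*total = 0;
	/* Bit i of mask set: global step i belongs to the producer. */
	for (unsigned int mask = 0; mask < 16; mask++) {
		int bits = 0;

		for (int i = 0; i < 4; i++)
			bits += (mask >> i) & 1;
		if (bits != 2)
			continue;

		int slot = 1;		/* ring starts out full */
		int stopped = 0;	/* queue starts out running */
		int p_step = 0, c_step = 0;
		int p_saw_full = 0, c_saw_stopped = 0;

		for (int i = 0; i < 4; i++) {
			if (mask & (1u << i)) {
				if (p_step++ == 0)
					stopped = 1;		/* P1 */
				else
					p_saw_full = slot;	/* P2 */
			} else {
				if (c_step++ == 0)
					slot = 0;		/* C1 */
				else
					c_saw_stopped = stopped; /* C2 */
			}
		}

		(*total)++;
		if (p_saw_full && !c_saw_stopped)
			lost++;
	}
	return lost;
}
```

All six interleavings leave at least one side able to wake the queue.
Without the barriers, store buffering on real CPUs could let both reads
return stale values, which is the case the barrier pair rules out.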

Benchmarks:
The benchmarks show a slight regression in raw transmission performance
when using two sending threads. Packet loss also occurs only in the
two-thread sending case; no packet loss was observed with a single
sending thread.

Test setup:
AMD Ryzen 5 5600X at 4.3 GHz, 3200 MHz RAM, isolated QEMU threads;
Average over 50 runs @ 100,000,000 packets. SRSO and spectre v2
mitigations disabled.

Note for tap+vhost-net:
XDP drop program active in VM -> ~2.5x faster; slower for tap due to
more syscalls (high utilization of entry_SYSRETQ_unsafe_stack in perf)

+--------------------------+--------------+----------------+----------+
| 1 thread                 | Stock        | Patched with   | diff     |
| sending                  |              | fq_codel qdisc |          |
+------------+-------------+--------------+----------------+----------+
| TAP        | Received    | 1.132 Mpps   | 1.133 Mpps     | +0.1%    |
|            +-------------+--------------+----------------+----------+
|            | Lost/s      | 3.765 Mpps   | 0 pps          |          |
+------------+-------------+--------------+----------------+----------+
| TAP        | Received    | 3.857 Mpps   | 3.905 Mpps     | +1.2%    |
|            +-------------+--------------+----------------+----------+
| +vhost-net | Lost/s      | 0.802 Mpps   | 0 pps          |          |
+------------+-------------+--------------+----------------+----------+

+--------------------------+--------------+----------------+----------+
| 2 threads                | Stock        | Patched with   | diff     |
| sending                  |              | fq_codel qdisc |          |
+------------+-------------+--------------+----------------+----------+
| TAP        | Received    | 1.115 Mpps   | 1.092 Mpps     | -2.1%    |
|            +-------------+--------------+----------------+----------+
|            | Lost/s      | 8.490 Mpps   | 359 pps        |          |
+------------+-------------+--------------+----------------+----------+
| TAP        | Received    | 3.664 Mpps   | 3.549 Mpps     | -3.1%    |
|            +-------------+--------------+----------------+----------+
| +vhost-net | Lost/s      | 5.330 Mpps   | 832 pps        |          |
+------------+-------------+--------------+----------------+----------+

Co-developed-by: Tim Gebauer <tim.gebauer@tu-dortmund.de>
Signed-off-by: Tim Gebauer <tim.gebauer@tu-dortmund.de>
Signed-off-by: Simon Schippers <simon.schippers@tu-dortmund.de>
---
 drivers/net/tun.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index fc358c4c355b..d9ffbf88cfd8 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1018,6 +1018,7 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct netdev_queue *queue;
 	struct tun_file *tfile;
 	int len = skb->len;
+	int ret;
 
 	rcu_read_lock();
 	tfile = rcu_dereference(tun->tfiles[txq]);
@@ -1072,13 +1073,33 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	nf_reset_ct(skb);
 
-	if (ptr_ring_produce(&tfile->tx_ring, skb)) {
+	queue = netdev_get_tx_queue(dev, txq);
+
+	spin_lock(&tfile->tx_ring.producer_lock);
+	ret = __ptr_ring_produce(&tfile->tx_ring, skb);
+	if (!qdisc_txq_has_no_queue(queue) &&
+	    (__ptr_ring_produce_peek(&tfile->tx_ring) || ret)) {
+		netif_tx_stop_queue(queue);
+		/* Paired with smp_mb() in __tun_wake_queue() */
+		smp_mb__after_atomic();
+		if (!__ptr_ring_produce_peek(&tfile->tx_ring))
+			netif_tx_wake_queue(queue);
+	}
+	spin_unlock(&tfile->tx_ring.producer_lock);
+
+	if (ret) {
+		/* This should be a rare case if a qdisc is present, but
+		 * can happen due to lltx.
+		 * Since skb_tx_timestamp(), skb_orphan(),
+		 * run_ebpf_filter() and pskb_trim() could have tinkered
+		 * with the SKB, returning NETDEV_TX_BUSY is unsafe and
+		 * we must drop instead.
+		 */
 		drop_reason = SKB_DROP_REASON_FULL_RING;
 		goto drop;
 	}
 
 	/* dev->lltx requires to do our own update of trans_start */
-	queue = netdev_get_tx_queue(dev, txq);
 	txq_trans_cond_update(queue);
 
 	/* Notify and wake up reader process */
-- 
2.43.0



* [PATCH net-next v10 1/4] tun/tap: add ptr_ring consume helper with netdev queue wakeup
From: Simon Schippers @ 2026-05-06 14:10 UTC (permalink / raw)
  To: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, mst, eperezma, leiyang, stephen, jon, tim.gebauer,
	simon.schippers, netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20260506141033.180450-1-simon.schippers@tu-dortmund.de>

Introduce tun_ring_consume() that wraps ptr_ring_consume() and calls
__tun_wake_queue(). The latter wakes the stopped netdev subqueue once
half of the ring capacity has been consumed, tracked via the new
cons_cnt field in tun_file. cons_cnt is updated while holding the ring
consumer lock, avoiding races. As a safety net, the queue is also woken
when the ring becomes empty. The point is to allow the queue to be
stopped when it gets full, which is required for traffic shaping -
implemented by the following "avoid ptr_ring tail-drop when a qdisc
is present". That patch also explains the pairing of the smp_mb()
of __tun_wake_queue().
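
The wake threshold can be illustrated with a small user-space model. It is
a sketch under simplifying assumptions: toy_wake_state and toy_wake_queue()
are hypothetical names, the consumer lock and the smp_mb() are omitted, and
the caller passes in whether the ring is empty.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-queue wake bookkeeping. */
struct toy_wake_state {
	int cons_cnt;	/* entries consumed since the last wake */
	int ring_size;	/* ring capacity */
	bool stopped;	/* netdev subqueue is currently stopped */
};

/* Returns true if this call woke the queue. */
static bool toy_wake_queue(struct toy_wake_state *s, int consumed,
			   bool ring_empty)
{
	/* Consumption is only counted while the queue is stopped. */
	if (!s->stopped)
		return false;

	s->cons_cnt += consumed;
	if (s->cons_cnt >= s->ring_size / 2 || ring_empty) {
		s->stopped = false;
		s->cons_cnt = 0;
		return true;
	}
	return false;
}
```

For a ring of 8 entries, the fourth single-entry consume after a stop wakes
the queue; an empty ring wakes it immediately regardless of the counter.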

Without the corresponding queue stopping, this patch alone causes no
regression for a tap setup sending to a qemu VM: 1.132 Mpps
to 1.144 Mpps.

Details: AMD Ryzen 5 5600X at 4.3 GHz, 3200 MHz RAM, isolated QEMU
threads, pktgen sender; Avg over 50 runs @ 100,000,000 packets;
SRSO and spectre v2 mitigations disabled.

Co-developed-by: Tim Gebauer <tim.gebauer@tu-dortmund.de>
Signed-off-by: Tim Gebauer <tim.gebauer@tu-dortmund.de>
Signed-off-by: Simon Schippers <simon.schippers@tu-dortmund.de>
---
 drivers/net/tun.c | 54 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 50 insertions(+), 4 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index b183189f1853..00ecf128fe8e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -145,6 +145,7 @@ struct tun_file {
 	struct list_head next;
 	struct tun_struct *detached;
 	struct ptr_ring tx_ring;
+	int cons_cnt;
 	struct xdp_rxq_info xdp_rxq;
 };
 
@@ -557,6 +558,13 @@ void tun_ptr_free(void *ptr)
 }
 EXPORT_SYMBOL_GPL(tun_ptr_free);
 
+static void tun_reset_cons_cnt(struct tun_file *tfile)
+{
+	spin_lock(&tfile->tx_ring.consumer_lock);
+	tfile->cons_cnt = 0;
+	spin_unlock(&tfile->tx_ring.consumer_lock);
+}
+
 static void tun_queue_purge(struct tun_file *tfile)
 {
 	void *ptr;
@@ -564,6 +572,7 @@ static void tun_queue_purge(struct tun_file *tfile)
 	while ((ptr = ptr_ring_consume(&tfile->tx_ring)) != NULL)
 		tun_ptr_free(ptr);
 
+	tun_reset_cons_cnt(tfile);
 	skb_queue_purge(&tfile->sk.sk_write_queue);
 	skb_queue_purge(&tfile->sk.sk_error_queue);
 }
@@ -730,6 +739,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file,
 		goto out;
 	}
 
+	tun_reset_cons_cnt(tfile);
 	tfile->queue_index = tun->numqueues;
 	tfile->socket.sk->sk_shutdown &= ~RCV_SHUTDOWN;
 
@@ -2115,13 +2125,46 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 	return total;
 }
 
-static void *tun_ring_recv(struct tun_file *tfile, int noblock, int *err)
+/* Callers must hold ring.consumer_lock */
+static void __tun_wake_queue(struct tun_struct *tun,
+			     struct tun_file *tfile, int consumed)
+{
+	struct netdev_queue *txq = netdev_get_tx_queue(tun->dev,
+						tfile->queue_index);
+
+	/* Paired with smp_mb__after_atomic() in tun_net_xmit() */
+	smp_mb();
+	if (netif_tx_queue_stopped(txq)) {
+		tfile->cons_cnt += consumed;
+		if (tfile->cons_cnt >= tfile->tx_ring.size / 2 ||
+		    __ptr_ring_empty(&tfile->tx_ring)) {
+			netif_tx_wake_queue(txq);
+			tfile->cons_cnt = 0;
+		}
+	}
+}
+
+static void *tun_ring_consume(struct tun_struct *tun, struct tun_file *tfile)
+{
+	void *ptr;
+
+	spin_lock(&tfile->tx_ring.consumer_lock);
+	ptr = __ptr_ring_consume(&tfile->tx_ring);
+	if (ptr)
+		__tun_wake_queue(tun, tfile, 1);
+
+	spin_unlock(&tfile->tx_ring.consumer_lock);
+	return ptr;
+}
+
+static void *tun_ring_recv(struct tun_struct *tun, struct tun_file *tfile,
+			   int noblock, int *err)
 {
 	DECLARE_WAITQUEUE(wait, current);
 	void *ptr = NULL;
 	int error = 0;
 
-	ptr = ptr_ring_consume(&tfile->tx_ring);
+	ptr = tun_ring_consume(tun, tfile);
 	if (ptr)
 		goto out;
 	if (noblock) {
@@ -2133,7 +2176,7 @@ static void *tun_ring_recv(struct tun_file *tfile, int noblock, int *err)
 
 	while (1) {
 		set_current_state(TASK_INTERRUPTIBLE);
-		ptr = ptr_ring_consume(&tfile->tx_ring);
+		ptr = tun_ring_consume(tun, tfile);
 		if (ptr)
 			break;
 		if (signal_pending(current)) {
@@ -2170,7 +2213,7 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
 
 	if (!ptr) {
 		/* Read frames from ring */
-		ptr = tun_ring_recv(tfile, noblock, &err);
+		ptr = tun_ring_recv(tun, tfile, noblock, &err);
 		if (!ptr)
 			return err;
 	}
@@ -3406,6 +3449,8 @@ static int tun_chr_open(struct inode *inode, struct file * file)
 		return -ENOMEM;
 	}
 
+	tun_reset_cons_cnt(tfile);
+
 	mutex_init(&tfile->napi_mutex);
 	RCU_INIT_POINTER(tfile->tun, NULL);
 	tfile->flags = 0;
@@ -3614,6 +3659,7 @@ static int tun_queue_resize(struct tun_struct *tun)
 	for (i = 0; i < tun->numqueues; i++) {
 		tfile = rtnl_dereference(tun->tfiles[i]);
 		rings[i] = &tfile->tx_ring;
+		tun_reset_cons_cnt(tfile);
 	}
 	list_for_each_entry(tfile, &tun->disabled, next)
 		rings[i++] = &tfile->tx_ring;
-- 
2.43.0



* [PATCH net-next v10 0/4] tun/tap & vhost-net: apply qdisc backpressure on full ptr_ring to reduce TX drops
From: Simon Schippers @ 2026-05-06 14:10 UTC (permalink / raw)
  To: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, mst, eperezma, leiyang, stephen, jon, tim.gebauer,
	simon.schippers, netdev, linux-kernel, kvm, virtualization

tun/tap and vhost-net currently drop incoming SKBs whenever their
internal ptr_ring buffer is full. With this patch series, the associated
netdev queue is stopped instead, but only when a qdisc is attached. If no
qdisc is present, the existing behavior is preserved. The XDP transmit
path is not affected. The series touches both tun/tap and vhost-net
because they share common logic and must be updated together; modifying
only one of them would break the other.

By applying proper backpressure, this change allows the connected qdisc to 
operate correctly, as reported in [1], and significantly improves
performance in real-world scenarios, as demonstrated in our paper [2]. For 
example, we observed a 36% TCP throughput improvement for an OpenVPN 
connection between Germany and the USA.

Synthetic pktgen benchmarks indicate a slight regression, but packet loss
no longer occurs. Pktgen benchmarks are provided per commit, with the final
commit showing the overall performance.

Thanks!

[1] Link: https://unix.stackexchange.com/questions/762935/traffic-shaping-ineffective-on-tun-device
[2] Link: https://cni.etit.tu-dortmund.de/storages/cni-etit/r/Research/Publications/2025/Gebauer_2025_VTCFall/Gebauer_VTCFall2025_AuthorsVersion.pdf

---
Changelog:
v10:
- Changed the term "Transmitted" to "Received" in the benchmarks,
  as correctly pointed out by MST, and reran the benchmarks.

Addressed the Sashiko AI review:
- Avoid a data race on tfile->cons_cnt by always locking.
- Correctly count the number of consumed packets for vhost-net.
- Corrected a typo in the commit message of commit 3.
- Added a missing barrier on the consumer side.
--> The barriers now follow the "store buffering" principle.
- No longer return NETDEV_TX_BUSY at all, because it is unsafe.
--> Result: There are still a few drops with multiple senders, which
            would be avoided by disabling LLTX.

V9: https://lore.kernel.org/netdev/20260428123859.19578-1-simon.schippers@tu-dortmund.de/
- Addressed minor nit by MST in patches 1 and 2.
- Rebased patch 3 because of commit d748047
  ("ptr_ring: disable KCSAN warnings").
- Documented the pair of the smp_mb__after_atomic() in tun_net_xmit()
  with tun_ring_consume().
  --> It simply pairs with the test_and_clear_bit() inside of
      netif_wake_subqueue().
- Use 1 ptr_ring consumer spinlock instead of 2.
- Ran pktgen benchmarks with pg_set SHARED for 50 iterations on
  latest kernel
  --> No significant performance difference noticed

V8: https://lore.kernel.org/netdev/20260312130639.138988-1-simon.schippers@tu-dortmund.de/
- Drop code changes in drivers/net/tap.c; The code there deals with
  ipvtap/macvtap which are unrelated to the goal of this patch series
  and I did not realize that before
-> Greatly simplified logic, 4 instead of 9 commits
-> No more duplicated logic and no distinction required in vhost
- Only wake after the queue stopped and half of the ring was consumed
  as suggested by MST
-> Performance improvements for TAP, but still slightly slower
- Better benchmarking with pinned threads, XDP drop program for
  tap+vhost-net and disabling CPU mitigations (and newer Ryzen 5 5600X
  processor) as suggested by Jason Wang

V7: https://lore.kernel.org/netdev/20260107210448.37851-1-simon.schippers@tu-dortmund.de/
- Switch to an approach similar to veth (excluding the recently fixed 
variant), as suggested by MST, with minor adjustments discussed in V6
- Rename the cover-letter title
- Add multithreaded pktgen and iperf3 benchmarks, as suggested by Jason 
Wang
- Rework __ptr_ring_consume_created_space() so it can also be used after 
batched consume

V6: https://lore.kernel.org/netdev/20251120152914.1127975-1-simon.schippers@tu-dortmund.de/
General:
- Major adjustments to the descriptions. Special thanks to Jon Kohler!
- Fix git bisect by moving most logic into dedicated functions and only 
start using them in patch 7.
- Moved the main logic of the coupled producer and consumer into a single 
patch to avoid a chicken-and-egg dependency between commits :-)
- Rebased to 6.18-rc5 and reran the benchmarks, which now also include lost
packets (previously I missed a 0, so all benchmark results were too high by
a factor of 10).
- Also include the benchmark in patch 7.

Producer:
- Move logic into the new helper tun_ring_produce()
- Added a smp_rmb() paired with the consumer, ensuring freed space of the 
consumer is visible
- Assume that ptr_ring is not full when __ptr_ring_full_next() is called

Consumer:
- Use an unpaired smp_rmb() instead of barrier() to ensure that the 
netdev_tx_queue_stopped() call completes before discarding
- Also wake the netdev queue if it was stopped before discarding and then 
becomes empty
-> Fixes race with producer as identified by MST in V5
-> Waking the netdev queues upon resize is not required anymore
- Use __ptr_ring_consume_created_space() instead of messing with ptr_ring 
internals
-> Batched consume now just calls 
__tun_ring_consume()/__tap_ring_consume() in a loop
- Added an smp_wmb() before waking the netdev queue which is paired with 
the smp_rmb() discussed above

V5: https://lore.kernel.org/netdev/20250922221553.47802-1-simon.schippers@tu-dortmund.de/T/#u
- Stop the netdev queue prior to producing the final fitting ptr_ring entry
-> Ensures the consumer has the latest netdev queue state, making it safe 
to wake the queue
-> Resolves an issue in vhost-net where the netdev queue could remain 
stopped despite being empty
-> For TUN/TAP, the netdev queue no longer needs to be woken in the 
blocking loop
-> Introduces new helpers __ptr_ring_full_next and 
__ptr_ring_will_invalidate for this purpose
- vhost-net now uses wrappers of TUN/TAP for ptr_ring consumption rather 
than maintaining its own rx_ring pointer

V4: https://lore.kernel.org/netdev/20250902080957.47265-1-simon.schippers@tu-dortmund.de/T/#u
- Target net-next instead of net
- Changed to patch series instead of single patch
- Changed to new title from old title
"TUN/TAP: Improving throughput and latency by avoiding SKB drops"
- Wake netdev queue with new helpers wake_netdev_queue when there is any 
spare capacity in the ptr_ring instead of waiting for it to be empty
- Use tun_file instead of tun_struct in tun_ring_recv for more consistent
logic
- Use an smp_wmb() and smp_rmb() barrier pair, which avoids the rare packet
drops that occurred before
- Use safer logic for vhost-net using RCU read locks to access TUN/TAP data

V3: https://lore.kernel.org/netdev/20250825211832.84901-1-simon.schippers@tu-dortmund.de/T/#u
- Added support for TAP and TAP+vhost-net.

V2: https://lore.kernel.org/netdev/20250811220430.14063-1-simon.schippers@tu-dortmund.de/T/#u
- Removed NETDEV_TX_BUSY return case in tun_net_xmit and removed 
unnecessary netif_tx_wake_queue in tun_ring_recv.

V1: https://lore.kernel.org/netdev/20250808153721.261334-1-simon.schippers@tu-dortmund.de/T/#u
---

Simon Schippers (4):
  tun/tap: add ptr_ring consume helper with netdev queue wakeup
  vhost-net: wake queue of tun/tap after ptr_ring consume
  ptr_ring: move free-space check into separate helper
  tun/tap & vhost-net: avoid ptr_ring tail-drop when a qdisc is present

 drivers/net/tun.c        | 102 ++++++++++++++++++++++++++++++++++++---
 drivers/vhost/net.c      |  21 +++++---
 include/linux/if_tun.h   |   3 ++
 include/linux/ptr_ring.h |  17 ++++++-
 4 files changed, 129 insertions(+), 14 deletions(-)

-- 
2.43.0



* [PATCH net-next v10 2/4] vhost-net: wake queue of tun/tap after ptr_ring consume
From: Simon Schippers @ 2026-05-06 14:10 UTC (permalink / raw)
  To: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, mst, eperezma, leiyang, stephen, jon, tim.gebauer,
	simon.schippers, netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20260506141033.180450-1-simon.schippers@tu-dortmund.de>

Add tun_wake_queue() to tun.c and export it for use by vhost-net. The
function validates that the file belongs to a tun/tap device,
dereferences the tun_struct under RCU, and delegates to
__tun_wake_queue().

vhost_net_buf_produce() now calls tun_wake_queue() after a successful
batched consume of the ring to allow the netdev subqueue to be woken up.
The point is to allow the queue to be stopped when it gets full, which
is required for traffic shaping - implemented by the following
"avoid ptr_ring tail-drop when a qdisc is present".

Without the corresponding queue stopping, this patch alone causes no
throughput regression for a tap+vhost-net setup sending to a qemu VM:
3.857 Mpps to 3.891 Mpps.

Details: AMD Ryzen 5 5600X at 4.3 GHz, 3200 MHz RAM, isolated QEMU
threads, XDP drop program active in VM, pktgen sender; Avg over
50 runs @ 100,000,000 packets. SRSO and spectre v2 mitigations disabled.

Co-developed-by: Tim Gebauer <tim.gebauer@tu-dortmund.de>
Signed-off-by: Tim Gebauer <tim.gebauer@tu-dortmund.de>
Signed-off-by: Simon Schippers <simon.schippers@tu-dortmund.de>
---
 drivers/net/tun.c      | 23 +++++++++++++++++++++++
 drivers/vhost/net.c    | 21 +++++++++++++++------
 include/linux/if_tun.h |  3 +++
 3 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 00ecf128fe8e..fc358c4c355b 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -3776,6 +3776,29 @@ struct ptr_ring *tun_get_tx_ring(struct file *file)
 }
 EXPORT_SYMBOL_GPL(tun_get_tx_ring);
 
+/* Callers must hold ring.consumer_lock */
+void tun_wake_queue(struct file *file, int consumed)
+{
+	struct tun_file *tfile;
+	struct tun_struct *tun;
+
+	if (file->f_op != &tun_fops)
+		return;
+
+	tfile = file->private_data;
+	if (!tfile)
+		return;
+
+	rcu_read_lock();
+
+	tun = rcu_dereference(tfile->tun);
+	if (tun)
+		__tun_wake_queue(tun, tfile, consumed);
+
+	rcu_read_unlock();
+}
+EXPORT_SYMBOL_GPL(tun_wake_queue);
+
 module_init(tun_init);
 module_exit(tun_cleanup);
 MODULE_DESCRIPTION(DRV_DESCRIPTION);
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 80965181920c..ee583d6cc0fa 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -176,13 +176,21 @@ static void *vhost_net_buf_consume(struct vhost_net_buf *rxq)
 	return ret;
 }
 
-static int vhost_net_buf_produce(struct vhost_net_virtqueue *nvq)
+static int vhost_net_buf_produce(struct sock *sk,
+				 struct vhost_net_virtqueue *nvq)
 {
+	struct file *file = sk->sk_socket->file;
 	struct vhost_net_buf *rxq = &nvq->rxq;
 
 	rxq->head = 0;
-	rxq->tail = ptr_ring_consume_batched(nvq->rx_ring, rxq->queue,
-					      VHOST_NET_BATCH);
+	spin_lock(&nvq->rx_ring->consumer_lock);
+	rxq->tail = __ptr_ring_consume_batched(nvq->rx_ring, rxq->queue,
+					       VHOST_NET_BATCH);
+
+	if (rxq->tail)
+		tun_wake_queue(file, rxq->tail);
+
+	spin_unlock(&nvq->rx_ring->consumer_lock);
 	return rxq->tail;
 }
 
@@ -209,14 +217,15 @@ static int vhost_net_buf_peek_len(void *ptr)
 	return __skb_array_len_with_tag(ptr);
 }
 
-static int vhost_net_buf_peek(struct vhost_net_virtqueue *nvq)
+static int vhost_net_buf_peek(struct sock *sk,
+			      struct vhost_net_virtqueue *nvq)
 {
 	struct vhost_net_buf *rxq = &nvq->rxq;
 
 	if (!vhost_net_buf_is_empty(rxq))
 		goto out;
 
-	if (!vhost_net_buf_produce(nvq))
+	if (!vhost_net_buf_produce(sk, nvq))
 		return 0;
 
 out:
@@ -995,7 +1004,7 @@ static int peek_head_len(struct vhost_net_virtqueue *rvq, struct sock *sk)
 	unsigned long flags;
 
 	if (rvq->rx_ring)
-		return vhost_net_buf_peek(rvq);
+		return vhost_net_buf_peek(sk, rvq);
 
 	spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
 	head = skb_peek(&sk->sk_receive_queue);
diff --git a/include/linux/if_tun.h b/include/linux/if_tun.h
index 80166eb62f41..5f3e206c7a73 100644
--- a/include/linux/if_tun.h
+++ b/include/linux/if_tun.h
@@ -22,6 +22,7 @@ struct tun_msg_ctl {
 #if defined(CONFIG_TUN) || defined(CONFIG_TUN_MODULE)
 struct socket *tun_get_socket(struct file *);
 struct ptr_ring *tun_get_tx_ring(struct file *file);
+void tun_wake_queue(struct file *file, int consumed);
 
 static inline bool tun_is_xdp_frame(void *ptr)
 {
@@ -55,6 +56,8 @@ static inline struct ptr_ring *tun_get_tx_ring(struct file *f)
 	return ERR_PTR(-EINVAL);
 }
 
+static inline void tun_wake_queue(struct file *f, int consumed) {}
+
 static inline bool tun_is_xdp_frame(void *ptr)
 {
 	return false;
-- 
2.43.0



* [PATCH net v3] af_unix: Reject SIOCATMARK on non-stream sockets
From: Ren Wei @ 2026-05-06 14:08 UTC (permalink / raw)
  To: netdev
  Cc: kuniyu, davem, edumazet, kuba, pabeni, horms, rao.shoaib,
	yuantan098, yifanwucs, tomapufckgml, bird, wangjiexun2025, n05ec

From: Jiexun Wang <wangjiexun2025@gmail.com>

SIOCATMARK reports whether the receive queue is at the urgent mark for
MSG_OOB.

In AF_UNIX, MSG_OOB is supported only for SOCK_STREAM sockets.
SOCK_DGRAM and SOCK_SEQPACKET reject MSG_OOB in sendmsg() and recvmsg(),
so they should not support SIOCATMARK either.

Return -EOPNOTSUPP for non-stream sockets before checking the receive
queue.

Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Suggested-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
---
Changes in v3:
- Update trailers.
- v2 Link: https://lore.kernel.org/all/20260413122916.1479959-1-n05ec@lzu.edu.cn/

Changes in v2:
- Rework the fix based on maintainer feedback.
- Drop the receive-queue locking approach and reject SIOCATMARK on
  non-stream sockets instead, since it is only meaningful for MSG_OOB.
- v1 Link: https://lore.kernel.org/netdev/f6cbbc8da90e95584847b5ceb60aae830d1631c2.1775731983.git.wangjiexun2025@gmail.com/

 net/unix/af_unix.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index e2d787ca3e74..1cbf36ea043b 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -3323,6 +3323,9 @@ static int unix_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 			struct sk_buff *skb;
 			int answ = 0;
 
+			if (sk->sk_type != SOCK_STREAM)
+				return -EOPNOTSUPP;
+
 			mutex_lock(&u->iolock);
 
 			skb = skb_peek(&sk->sk_receive_queue);
-- 
2.34.1



* Re: [PATCH net v2] net: ethernet: cortina: Drop half-assembled SKB
From: Alexander Lobakin @ 2026-05-06 14:02 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Hans Ulli Kroll, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Michał Mirosław,
	Li Xiasong, netdev, Andreas Haarmann-Thiemann
In-Reply-To: <20260505-gemini-ethernet-fix-v2-1-997c31d06079@kernel.org>

From: Linus Walleij <linusw@kernel.org>
Date: Tue, 05 May 2026 23:52:17 +0200

> From: Andreas Haarmann-Thiemann <eitschman@nebelreich.de>
> 
> In gmac_rx() (drivers/net/ethernet/cortina/gemini.c), when
> gmac_get_queue_page() returns NULL for the second page of a multi-page
> fragment, the driver logs an error and continues — but does not free the
> partially assembled skb that was being assembled via napi_build_skb() /
> napi_get_frags().
> 
> Free the in-progress partially assembled skb via napi_free_frags()
> and increase the number of dropped frames appropriately
> and assign the skb pointer NULL to make sure it is not lingering
> around, matching the pattern already used elsewhere in the driver.
> 
> Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
> Signed-off-by: Andreas Haarmann-Thiemann <eitschman@nebelreich.de>
> Signed-off-by: Linus Walleij <linusw@kernel.org>

Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>

> ---
> Changes in v2:
> - Fix up the commit message so it is clear what the patch is doing.
> - Also increase the number of dropped frames as noted by Li Xiasong.
> - Link to v1: https://lore.kernel.org/r/20260330-gemini-ethernet-fix-v1-1-18783a45d13a@kernel.org
> ---
>  drivers/net/ethernet/cortina/gemini.c | 5 +++++
>  1 file changed, 5 insertions(+)

Thanks,
Olek


* [PATCH iproute2-next] ipaddress.c: do not show address priority in brief mode
From: Arseny Maslennikov @ 2026-05-05 19:36 UTC (permalink / raw)
  To: netdev; +Cc: Arseny Maslennikov

Otherwise, the following output is possible:

  % ip -br a
  lo               UNKNOWN    127.0.0.1/8 ::1/128
  enp7s0           UP         10.5.0.9/16 metric 1024 fe80::546f:33ff:fee1:9/64 2001:db8:0:a:546f:33ff:fee1:9/64

This was observed on a machine with systemd-networkd stable 258.5 active.

The point of `ip -brief a` is to provide basic information about addresses
easily processed by awk(1) and the like. Consumers of its output have come
to expect $3 and further fields to be a space-separated list of addresses.
The above output breaks this expectation. While a human reader may
distinguish "metric" and "1024" from an address/prefixlen pair, existing
scripts won't.
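
The breakage can be reproduced with a small stand-in for such a consumer.
This is a hypothetical sketch: count_non_address_fields() is not part of
iproute2, and it assumes the consumer treats every field from the third
onward as an address/prefixlen pair, which must contain a '/'.

```c
#include <assert.h>
#include <string.h>

/*
 * Count the fields from the third onward that do not look like an
 * address/prefixlen pair. Fields are separated by spaces or tabs,
 * as awk(1) would split them by default.
 */
static int count_non_address_fields(const char *line)
{
	char buf[512];
	int field = 0, bad = 0;

	/* strtok() modifies its input, so work on a bounded copy. */
	strncpy(buf, line, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';

	for (char *tok = strtok(buf, " \t"); tok; tok = strtok(NULL, " \t")) {
		field++;
		if (field >= 3 && !strchr(tok, '/'))
			bad++;
	}
	return bad;
}
```

Applied to the enp7s0 line above, the two tokens "metric" and "1024" are
counted as malformed addresses, while the lo line yields none.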

Fixes: 78d04c7b27cf ("ipaddress: Add support for address metric")
Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru>
---
 ip/ipaddress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index 4d93a04a..80070a1a 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -1636,7 +1636,7 @@ int print_addrinfo(struct nlmsghdr *n, void *arg)
 		}
 		print_int(PRINT_ANY, "prefixlen", "/%d ", ifa->ifa_prefixlen);
 
-		if (rta_tb[IFA_RT_PRIORITY])
+		if (!brief && rta_tb[IFA_RT_PRIORITY])
 			print_uint(PRINT_ANY, "metric", "metric %u ",
 				   rta_getattr_u32(rta_tb[IFA_RT_PRIORITY]));
 	}
-- 
2.50.1



* Re: [PATCH net-next v3] net: reduce RFS/ARFS flow updates by checking LLC affinity
From: chuang @ 2026-05-06 14:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Stanislav Fomichev, Kuniyuki Iwashima, Samiullah Khawaja,
	Hangbin Liu, Krishna Kumar, Neal Cardwell, Martin KaFai Lau,
	netdev, linux-kernel
In-Reply-To: <CANn89i+H=gPo22v0ApLges0ScjMOgAcVsDQLwkr2iAEu+M7B7w@mail.gmail.com>

On Tue, Apr 28, 2026 at 1:09 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Apr 27, 2026 at 7:56 PM Chuang Wang <nashuiliang@gmail.com> wrote:
> >
> > The current implementation of rps_record_sock_flow() updates the flow
> > table every time a socket is processed on a different CPU. In high-load
> > scenarios, especially with Accelerated RFS (ARFS), this triggers
> > frequent flow steering updates via ndo_rx_flow_steer.
> >
> > For drivers like mlx5 that implement hardware flow steering, these
> > constant updates lead to significant contention on internal driver locks
> > (e.g., arfs_lock). This contention often becomes a performance
> > bottleneck that outweighs the steering benefits.
> >
> > This patch introduces a cache-aware update strategy: the flow record is
> > only updated if the flow migrates across Last Level Cache (LLC)
> > boundaries. This minimizes expensive hardware reconfigurations while
> > preserving cache locality for the application. A new sysctl,
> > net.core.rps_feat_llc_affinity, is added to toggle this feature.
> >
> > Performance Test Results:
> > The patch was tested in a K8s environment (AMD CPU 128*2, 16-core Pod
> > with CPU pinning, mlx5 NIC) using brpc[1] echo_server and rpc_press.
> >
> > rpc_press Commands:
> >
> >   for i in {1..8}; do
> >     ./rpc_press -proto=./echo.proto -method=example.EchoService.Echo
> >     -server=<IP>:8000 -input='{"message":"hello"}'
> >     -qps=0 -thread_num=512 -connection_type=pooled &
> >   done
> >
> > Monitor mlx5e_rx_flow_steer frequency:
> >
> >   /usr/share/bcc/tools/funccount -i 1 mlx5e_rx_flow_steer
> >
> > Frequency of mlx5e_rx_flow_steer (via funccount[2]):
> >
> >   Before: ~335,000 counts/sec
> >   After:   ~23,000 counts/sec (reduced by ~93%)
> >
> > System Metrics (after enabling rps_feat_llc_affinity):
> >
> >   CPU Utilization: 38% -> 32%
> >   CPU PSI (Pressure Stall Information): 20% -> 10%
> >
> > These results demonstrate that filtering updates by LLC affinity
> > significantly reduces driver lock contention and improves overall
> > CPU efficiency under heavy network load.
> >
> > [1] https://github.com/apache/brpc/
> > [2] https://github.com/iovisor/bcc/blob/master/tools/funccount.py
> >
> > Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
> > ---
> > v2 -> v3: patch net -> net-next
> > v1 -> v2: add rps_feat_llc_affinity; add brpc tests
> >
> >  include/net/rps.h          | 18 ++--------
> >  net/core/dev.c             | 72 ++++++++++++++++++++++++++++++++++++++
> >  net/core/sysctl_net_core.c | 34 ++++++++++++++++++
> >  3 files changed, 108 insertions(+), 16 deletions(-)
> >
> > diff --git a/include/net/rps.h b/include/net/rps.h
> > index e33c6a2fa8bb..37bbb7009c36 100644
> > --- a/include/net/rps.h
> > +++ b/include/net/rps.h
> > @@ -12,6 +12,7 @@
> >
> >  extern struct static_key_false rps_needed;
> >  extern struct static_key_false rfs_needed;
> > +extern struct static_key_false rps_feat_llc_affinity;
> >
> >  /*
> >   * This structure holds an RPS map which can be of variable length.  The
> > @@ -55,22 +56,7 @@ struct rps_sock_flow_table {
> >
> >  #define RPS_NO_CPU 0xffff
> >
> > -static inline void rps_record_sock_flow(rps_tag_ptr tag_ptr, u32 hash)
> > -{
> > -       unsigned int index = hash & rps_tag_to_mask(tag_ptr);
> > -       u32 val = hash & ~net_hotdata.rps_cpu_mask;
> > -       struct rps_sock_flow_table *table;
> > -
> > -       /* We only give a hint, preemption can change CPU under us */
> > -       val |= raw_smp_processor_id();
> > -
> > -       table = rps_tag_to_table(tag_ptr);
> > -       /* The following WRITE_ONCE() is paired with the READ_ONCE()
> > -        * here, and another one in get_rps_cpu().
> > -        */
> > -       if (READ_ONCE(table[index].ent) != val)
> > -               WRITE_ONCE(table[index].ent, val);
> > -}
> > +void rps_record_sock_flow(rps_tag_ptr tag_ptr, u32 hash);
> >
> >  static inline void _sock_rps_record_flow_hash(__u32 hash)
> >  {
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 203dc36aaed5..630a7f21d8de 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -4964,6 +4964,8 @@ struct static_key_false rps_needed __read_mostly;
> >  EXPORT_SYMBOL(rps_needed);
> >  struct static_key_false rfs_needed __read_mostly;
> >  EXPORT_SYMBOL(rfs_needed);
> > +struct static_key_false rps_feat_llc_affinity __read_mostly;
> > +EXPORT_SYMBOL(rps_feat_llc_affinity);
> >
> >  static u32 rfs_slot(u32 hash, rps_tag_ptr tag_ptr)
> >  {
> > @@ -5175,6 +5177,76 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
> >         return cpu;
> >  }
> >
> > +/**
> > + * rps_record_cond - Determine if RPS flow table should be updated
> > + * @old_val: Previous flow record value
> > + * @new_val: Target flow record value
> > + *
> > + * Returns true if the record needs an update.
> > + */
> > +static inline bool rps_record_cond(u32 old_val, u32 new_val)
> > +{
> > +       u32 old_cpu = old_val & ~net_hotdata.rps_cpu_mask;
> > +       u32 new_cpu = new_val & ~net_hotdata.rps_cpu_mask;
> > +
> > +       if (old_val == new_val)
> > +               return false;
> > +
> > +       /*
> > +        * RPS LLC Affinity Feature:
> > +        * Reduce RFS/ARFS flow updates by checking LLC affinity.
> > +        *
> > +        * Frequent flow table updates can trigger constant hardware steering
> > +        * reconfigurations (e.g., ndo_rx_flow_steer), leading to significant
> > +        * contention on driver internal locks (like mlx5's arfs_lock).
> > +        *
> > +        * This strategy only updates the flow record if it migrates across LLC
> > +        * boundaries. This minimizes expensive hardware updates while preserving
> > +        * cache locality for the application.
> > +        */
> > +       if (static_branch_unlikely(&rps_feat_llc_affinity)) {
> > +               /* Force update if the recorded CPU is invalid or has gone offline */
> > +               if (old_cpu >= nr_cpu_ids || !cpu_active(old_cpu))
> > +                       return true;
> > +
> > +               /*
> > +                * Force an update if the current task is no longer permitted
> > +                * to run on the old_cpu.
> > +                */
> > +               if (!cpumask_test_cpu(old_cpu, current->cpus_ptr))
> > +                       return true;
> > +
> > +               /*
> > +                * If CPUs do not share a cache, allow the update to prevent
> > +                * expensive remote memory accesses and cache misses.
> > +                */
> > +               if (!cpus_share_cache(old_cpu, new_cpu))
> > +                       return true;
> > +
> > +               return false;
> > +       }
> > +
> > +       return true;
> > +}
> > +
> > +void rps_record_sock_flow(rps_tag_ptr tag_ptr, u32 hash)
> > +{
> > +       unsigned int index = hash & rps_tag_to_mask(tag_ptr);
> > +       u32 val = hash & ~net_hotdata.rps_cpu_mask;
> > +       struct rps_sock_flow_table *table;
> > +
> > +       /* We only give a hint, preemption can change CPU under us */
> > +       val |= raw_smp_processor_id();
> > +
> > +       table = rps_tag_to_table(tag_ptr);
> > +       /* The following WRITE_ONCE() is paired with the READ_ONCE()
> > +        * here, and another one in get_rps_cpu().
> > +        */
> > +       if (rps_record_cond(READ_ONCE(table[index].ent), val))
> > +               WRITE_ONCE(table[index].ent, val);
> > +}
> > +EXPORT_SYMBOL(rps_record_sock_flow);
>
> We do not want to put rps_record_sock_flow out of line.
> rps_llc_check() is probably fine, it should not be called often.
>

The same issue was reported in:
https://lore.kernel.org/netdev/CACueBy4KyU8DjwtLM6pzjQNTbiy2M+ZhZdO7Ag=ssqWq00CJ7w@mail.gmail.com/
The reason is that 'tun' uses sock_rps_record_flow_hash() in
tun_flow_update(), which triggers the following compilation error due
to symbol visibility when CONFIG_TUN is built as a module:

ERROR: modpost: "rps_llc_check" [drivers/net/tun.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:147: Module.symvers] Error 1

To resolve this, it seems more appropriate to export
sock_rps_record_flow_hash in net/core/dev.c:

+void sock_rps_record_flow_hash(__u32 hash)
+{
+#ifdef CONFIG_RPS
+       if (!rfs_is_needed())
+               return;
+
+       _sock_rps_record_flow_hash(hash);
+#endif
+}
+EXPORT_SYMBOL(sock_rps_record_flow_hash);
+

> diff --git a/include/net/rps.h b/include/net/rps.h
> index e33c6a2fa8bbca3555ecccbbf9132d01cc433c36..7e98918d8751eb824b7057cca9e5d40c28e5f18a
> 100644
> --- a/include/net/rps.h
> +++ b/include/net/rps.h
> @@ -55,10 +55,12 @@ struct rps_sock_flow_table {
>
>  #define RPS_NO_CPU 0xffff
>
> +bool rps_llc_check(u32 old_val, u32 new_val);
> +
>  static inline void rps_record_sock_flow(rps_tag_ptr tag_ptr, u32 hash)
>  {
>         unsigned int index = hash & rps_tag_to_mask(tag_ptr);
> -       u32 val = hash & ~net_hotdata.rps_cpu_mask;
> +       u32 old_val, val = hash & ~net_hotdata.rps_cpu_mask;
>         struct rps_sock_flow_table *table;
>
>         /* We only give a hint, preemption can change CPU under us */
> @@ -68,7 +70,8 @@ static inline void rps_record_sock_flow(rps_tag_ptr
> tag_ptr, u32 hash)
>         /* The following WRITE_ONCE() is paired with the READ_ONCE()
>          * here, and another one in get_rps_cpu().
>          */
> -       if (READ_ONCE(table[index].ent) != val)
> +       old_val = READ_ONCE(table[index].ent);
> +       if (old_val != val && rps_llc_check(old_val, val))
>                 WRITE_ONCE(table[index].ent, val);
>  }

^ permalink raw reply

* Re: [PATCH net] vsock/virtio: fix potential unbounded skb queue
From: Stefano Garzarella @ 2026-05-06 14:00 UTC (permalink / raw)
  To: Arseniy Krasnov
  Cc: Bobby Eshleman, Eric Dumazet, Bobby Eshleman, Stefan Hajnoczi,
	Michael S. Tsirkin, David S . Miller, Jakub Kicinski, Paolo Abeni,
	Simon Horman, netdev, eric.dumazet, Arseniy Krasnov, Jason Wang,
	Xuan Zhuo, Eugenio Pérez, kvm, virtualization
In-Reply-To: <e1f32df5-b6a1-47a8-a783-fcc8e3c91f25@salutedevices.com>

On Wed, May 06, 2026 at 12:50:04PM +0300, Arseniy Krasnov wrote:
>
>
>05.05.2026 19:37, Bobby Eshleman wrote:
>> On Tue, May 05, 2026 at 06:11:13PM +0200, Stefano Garzarella wrote:
>>> On Tue, May 05, 2026 at 07:14:36AM -0700, Eric Dumazet wrote:
>>>> On Tue, May 5, 2026 at 6:52 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
>>>>>
>>>>> On Thu, Apr 30, 2026 at 12:26:52PM +0000, Eric Dumazet wrote:
>>>>>> virtio_transport_inc_rx_pkt() checks vvs->rx_bytes + len > vvs->buf_alloc.
>>>>>>
>>>>>> virtio_transport_recv_enqueue() skips coalescing for packets
>>>>>> with VIRTIO_VSOCK_SEQ_EOM.
>>>>>>
>>>>>> If fed with packets with len == 0 and VIRTIO_VSOCK_SEQ_EOM,
>>>>>> a very large number of packets can be queued
>>>>>> because vvs->rx_bytes stays at 0.
>>>>>>
>>>>>> Fix this by estimating the skb metadata size:
>>>>>>
>>>>>>       (Number of skbs in the queue) * SKB_TRUESIZE(0)
>>>>>>
>>>>>> Fixes: 077706165717 ("virtio/vsock: don't use skbuff state to account credit")
>>>>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>>>>> Cc: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
>>>>>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>>>>>> Cc: Stefano Garzarella <sgarzare@redhat.com>
>>>>>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>>>>>> Cc: Jason Wang <jasowang@redhat.com>
>>>>>> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>>>>> Cc: "Eugenio Pérez" <eperezma@redhat.com>
>>>>>> Cc: kvm@vger.kernel.org
>>>>>> Cc: virtualization@lists.linux.dev
>>>>>> ---
>>>>>> net/vmw_vsock/virtio_transport_common.c | 4 +++-
>>>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>>>>>> index 416d533f493d7b07e9c77c43f741d28cfcd0953e..9b8014516f4fb1130ae184635fbba4dfee58bd64 100644
>>>>>> --- a/net/vmw_vsock/virtio_transport_common.c
>>>>>> +++ b/net/vmw_vsock/virtio_transport_common.c
>>>>>> @@ -447,7 +447,9 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
>>>>>> static bool virtio_transport_inc_rx_pkt(struct virtio_vsock_sock *vvs,
>>>>>>                                       u32 len)
>>>>>> {
>>>>>> -      if (vvs->buf_used + len > vvs->buf_alloc)
>>>>>> +      u64 skb_overhead = (skb_queue_len(&vvs->rx_queue) + 1) * SKB_TRUESIZE(0);
>>>>>> +
>>>>>> +      if (skb_overhead + vvs->buf_used + len > vvs->buf_alloc)
>>>>>>               return false;
>>>>>
>>>>> I'm not sure about this fix, I mean that maybe this is incomplete.
>>>>> In virtio-vsock, there is a credit mechanism between the two peers:
>>>>> https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html#x1-4850003
>>>>>
>>>>> This takes only the payload into account, so it’s true that this problem
>>>>> exists; however, perhaps we should also inform the other peer of a lower
>>>>> credit balance, otherwise the other peer will believe it has much more
>>>>> credit than it actually does, send a large payload, and then the packet
>>>>> will be discarded and the data lost (there are no retransmissions,
>>>>> etc.).
>>>>
>>>> I dunno, perhaps revert 077706165717 ("virtio/vsock: don't use skbuff
>>>> state to account credit")
>>>> and find a better fix then?
>>>
>>> IIRC the same issue was there before the commit fixed by that one (commit
>>> 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")), so
>>> not sure about reverting it TBH.
>>>
>>> CCing Arseniy and Bobby.
>
>Thanks!
>
>>>
>>>>
>>>> There is always a discrepancy between skb->len and skb->truesize.
>>>> You will not be able to announce a 1MB window, and accept one million
>>>> skbs of 1 byte each.
>>>>
>>>> This kind of contract is broken.
>>>>
>>>
>>> Yep, I agree, but before we start discarding data (and losing it), IMHO we
>>> should at least inform the other peer that we're out of space.
>>>
>>> @Stefan, @Michael, do you think we can do something in the spec to avoid
>>> this issue and in some way also take the metadata into account in the
>>> credit? I mean, to avoid flooding with 1-byte packets.
>>>
>>> Thanks,
>>> Stefano
>>>
>>>
>>
>> Indeed the old pre-fix skb code would have the same issue.
>>
>> I can't think of any way around this without extending the spec.
>
>Hi, thanks. I agree with Bobby: accounting for metadata (e.g. skb size here) was left out of the
>credit logic "by design", since the other side of the data exchange knows nothing about it. The same
>situation also existed before Bobby added the skb implementation. So it looks like the spec may need an update.
>

Even if we change the specifications, we still need to work with older 
devices, so we should find a solution for this as well.

My main concern is data loss, so I'm considering the following options:

1. Notify the other peer of a smaller buf_alloc from the start, leave 
some room for overhead, and when it's running out, notify them that 
buf_alloc = 0. This way, the peer realizes it can’t send anything else.

2. Or update buf_alloc each time by removing the overhead, similar to 
what’s currently done in virtio_transport_inc_rx_pkt(), but also do it 
in virtio_transport_inc_tx_pkt().

As I said, IMO this patch alone is incomplete; we need to communicate 
with the peer somehow regarding space. I don’t think including the 
overhead in fwd_cnt is spec compliant, since the other peer has no idea 
how much overhead is needed, but reducing buf_alloc should be okay, even 
though I’m concerned about packets in flight.

As a quick fix, I think option 2 might be the easiest; I’ll run some 
tests and send over a patch.

But in the long run, I think we absolutely need to improve memory 
management in vsock, perhaps by avoiding custom solutions.

Thanks,
Stefano


^ permalink raw reply

* [PATCH net-next 9/9] net: atlantic: add PTP support for AQC113 (Antigua)
From: sukhdeeps @ 2026-05-06 13:57 UTC (permalink / raw)
  To: netdev
  Cc: irusskikh, epomozov, richardcochran, andrew+netdev, davem,
	edumazet, kuba, pabeni, linux-kernel, Sukhdeep Singh
In-Reply-To: <20260506135706.2834-1-sukhdeeps@marvell.com>

From: Sukhdeep Singh <sukhdeeps@marvell.com>

Add IEEE 1588 PTP support for the AQC113 (Antigua) network controller
alongside the existing AQC107 (Atlantic) PTP implementation.

AQC113 PTP uses a different hardware architecture from AQC107:
- Dual TSG clocks (sel 0 for PTP, sel 1 for PTM) instead of PHY-based
  timestamping
- TX timestamp via descriptor writeback instead of firmware mailbox
- Per-instance PTP timestamp offsets instead of global static table
- Hardware L3/L4 filters for PTP multicast steering with IPv4 and
  IPv6 support (4 filter slots for multicast addresses)
- Direct hardware clock control instead of firmware-mediated access

Key implementation details:

PTP clock management:
- Add aq_ptp_state enum to distinguish first init, link up, and no
  link states for proper clock initialization
- On AQC113, only reset the clock on first init (not on every link
  change) to avoid disrupting ongoing PTP synchronization
- Re-apply RX filters on link change since hardware state is lost

TX timestamp path:
- Add per-packet TX timestamp request via request_ts/clk_sel in the
  ring buffer descriptor
- Poll for TX timestamp completion in aq_ring_tx_clean() with a
  timeout mechanism (aq_ptp_tx_ts_timedout/clear)
- Set AQ_HW_TXD_CTL_TS_EN in TX descriptors for timestamp-requested
  packets

RX filter management:
- Replace single UDP filter with array of 4 for IPv4/IPv6 multicast
  PTP addresses (224.0.1.129, 224.0.0.107, ff0e::181, ff02::6b)
- Add aq_ptp_dpath_enable() for comprehensive filter setup/teardown
- Add aq_ptp_parse_rx_filters() to map hwtstamp_rx_filters to L2/L4
  enable flags

PTP TX path in aq_main.c:
- Add IPv6 PTP packet detection using ipv6_hdr()->nexthdr
- Use PTP_EV_PORT/PTP_GEN_PORT defines instead of magic numbers
- Move skb_tx_timestamp() to non-PTP path to avoid double timestamps

IRQ and initialization:
- Account for PTP IRQ vector (AQ_HW_PTP_IRQS) in vector math
- Move filter/VLAN rule application to aq_nic_start() for proper
  ordering after PTP ring setup
- Add AQ_HW_FLAG_STARTED flag management in open/close

HW layer (hw_atl2.c):
- Implement PTP clock enable/disable, read, adjust, increment
- Add GPIO pulse generation for PPS output
- Add TX/RX PTP ring initialization
- Add TX timestamp descriptor readback
- Add RX timestamp extraction from packet trailer
- Re-enable PTP after hardware reset
- Wire all PTP ops into hw_atl2_ops table

Per-instance PTP offsets with empirically measured values for AQC113
at each link speed (100M/1G/2.5G/5G/10G).

Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
---
 .../net/ethernet/aquantia/atlantic/aq_hw.h    |   1 +
 .../net/ethernet/aquantia/atlantic/aq_main.c  |  34 +-
 .../net/ethernet/aquantia/atlantic/aq_nic.c   |  48 +-
 .../ethernet/aquantia/atlantic/aq_pci_func.c  |   4 +-
 .../net/ethernet/aquantia/atlantic/aq_ptp.c   | 535 ++++++++++++++----
 .../net/ethernet/aquantia/atlantic/aq_ptp.h   |  15 +-
 .../net/ethernet/aquantia/atlantic/aq_ring.c  |  42 +-
 .../aquantia/atlantic/hw_atl2/hw_atl2.c       | 179 +++++-
 .../aquantia/atlantic/hw_atl2/hw_atl2.h       |  12 +
 .../atlantic/hw_atl2/hw_atl2_internal.h       |   3 +-
 .../aquantia/atlantic/hw_atl2/hw_atl2_utils.h |  10 +
 11 files changed, 710 insertions(+), 173 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_hw.h b/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
index e3bacad08b93..4141210578fd 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
@@ -15,6 +15,7 @@
 #include "aq_common.h"
 #include "aq_rss.h"
 #include "hw_atl/hw_atl_utils.h"
+#include "hw_atl2/hw_atl2.h"
 
 #define AQ_HW_MAC_COUNTER_HZ   312500000ll
 #define AQ_HW_PHY_COUNTER_HZ   160000000ll
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_main.c b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
index 4ef4fe64b8ac..aadf3f7f40d0 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_main.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
@@ -19,8 +19,10 @@
 #include <linux/netdevice.h>
 #include <linux/module.h>
 #include <linux/ip.h>
+#include <linux/ipv6.h>
 #include <linux/udp.h>
 #include <net/pkt_cls.h>
+#include <linux/ptp_classify.h>
 #include <net/pkt_sched.h>
 #include <linux/filter.h>
 
@@ -68,20 +70,14 @@ int aq_ndev_open(struct net_device *ndev)
 	if (err < 0)
 		goto err_exit;
 
-	err = aq_reapply_rxnfc_all_rules(aq_nic);
-	if (err < 0)
-		goto err_exit;
-
-	err = aq_filters_vlans_update(aq_nic);
-	if (err < 0)
-		goto err_exit;
-
 	err = aq_nic_start(aq_nic);
 	if (err < 0) {
 		aq_nic_stop(aq_nic);
 		goto err_exit;
 	}
 
+	aq_utils_obj_set(&aq_nic->aq_hw->flags, AQ_HW_FLAG_STARTED);
+
 err_exit:
 	if (err < 0)
 		aq_nic_deinit(aq_nic, true);
@@ -97,6 +93,7 @@ int aq_ndev_close(struct net_device *ndev)
 	err = aq_nic_stop(aq_nic);
 	aq_nic_deinit(aq_nic, true);
 
+	aq_utils_obj_clear(&aq_nic->aq_hw->flags, AQ_HW_FLAG_STARTED);
 	return err;
 }
 
@@ -113,16 +110,25 @@ static netdev_tx_t aq_ndev_start_xmit(struct sk_buff *skb, struct net_device *nd
 		 * and hardware PTP design of the chip. Otherwise ptp stream
 		 * will fail to sync
 		 */
-		if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) ||
-		    unlikely((ip_hdr(skb)->version == 4) &&
-			     (ip_hdr(skb)->protocol == IPPROTO_UDP) &&
-			     ((udp_hdr(skb)->dest == htons(319)) ||
-			      (udp_hdr(skb)->dest == htons(320)))) ||
-		    unlikely(eth_hdr(skb)->h_proto == htons(ETH_P_1588)))
+		if (unlikely(skb->protocol == htons(ETH_P_IP) &&
+			     ip_hdr(skb)->protocol == IPPROTO_UDP &&
+			     (udp_hdr(skb)->dest == htons(PTP_EV_PORT) ||
+			      udp_hdr(skb)->dest == htons(PTP_GEN_PORT))))
+			return aq_ptp_xmit(aq_nic, skb);
+
+		/* PTP over IPv6 does not use extension headers */
+		if (unlikely(skb->protocol == htons(ETH_P_IPV6) &&
+			     ipv6_hdr(skb)->nexthdr == IPPROTO_UDP &&
+			     (udp_hdr(skb)->dest == htons(PTP_EV_PORT) ||
+			      udp_hdr(skb)->dest == htons(PTP_GEN_PORT))))
+			return aq_ptp_xmit(aq_nic, skb);
+
+		if (unlikely(eth_hdr(skb)->h_proto == htons(ETH_P_1588)))
 			return aq_ptp_xmit(aq_nic, skb);
 	}
 #endif
 
+	skb_tx_timestamp(skb);
 	return aq_nic_xmit(aq_nic, skb);
 }
 
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
index 3cec853e9fad..63a4987a60de 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -72,8 +72,15 @@ static void aq_nic_cfg_update_num_vecs(struct aq_nic_s *self)
 
 	cfg->vecs = min(cfg->aq_hw_caps->vecs, AQ_CFG_VECS_DEF);
 	cfg->vecs = min(cfg->vecs, num_online_cpus());
-	if (self->irqvecs > AQ_HW_SERVICE_IRQS)
-		cfg->vecs = min(cfg->vecs, self->irqvecs - AQ_HW_SERVICE_IRQS);
+	if (self->irqvecs > AQ_HW_SERVICE_IRQS + AQ_HW_PTP_IRQS)
+		cfg->vecs = min(cfg->vecs,
+				self->irqvecs - AQ_HW_SERVICE_IRQS - AQ_HW_PTP_IRQS);
+	else if (self->irqvecs > AQ_HW_PTP_IRQS)
+		cfg->vecs = min(cfg->vecs,
+				self->irqvecs - AQ_HW_PTP_IRQS);
+	else
+		cfg->vecs = 1U;
+
 	/* cfg->vecs should be power of 2 for RSS */
 	cfg->vecs = rounddown_pow_of_two(cfg->vecs);
 
@@ -138,7 +145,8 @@ void aq_nic_cfg_start(struct aq_nic_s *self)
 	 * link status IRQ. If no - we'll know link state from
 	 * slower service task.
 	 */
-	if (AQ_HW_SERVICE_IRQS > 0 && cfg->vecs + 1 <= self->irqvecs)
+	if (AQ_HW_SERVICE_IRQS > 0 &&
+	    self->irqvecs > AQ_HW_PTP_IRQS + AQ_HW_SERVICE_IRQS)
 		cfg->link_irq_vec = cfg->vecs;
 	else
 		cfg->link_irq_vec = 0;
@@ -172,7 +180,11 @@ static int aq_nic_update_link_status(struct aq_nic_s *self)
 		aq_nic_update_interrupt_moderation_settings(self);
 
 		if (self->aq_ptp) {
-			aq_ptp_clock_init(self);
+			/* PTP does not work in some modes even if physical link is up */
+			bool ptp_link_good = (self->aq_hw->aq_link_status.mbps >= 100 &&
+					      self->aq_hw->aq_link_status.full_duplex);
+
+			aq_ptp_clock_init(self, ptp_link_good ? AQ_PTP_LINK_UP : AQ_PTP_NO_LINK);
 			aq_ptp_tm_offset_set(self,
 					     self->aq_hw->aq_link_status.mbps);
 			aq_ptp_link_change(self);
@@ -279,6 +291,9 @@ static int aq_nic_hw_prepare(struct aq_nic_s *self)
 	int err = 0;
 
 	err = self->aq_hw_ops->hw_soft_reset(self->aq_hw);
+
+	self->aq_hw->clk_select = -1;
+
 	if (err)
 		goto exit;
 
@@ -450,7 +465,14 @@ int aq_nic_init(struct aq_nic_s *self)
 	}
 
 	if (aq_nic_get_cfg(self)->is_ptp) {
-		err = aq_ptp_init(self, self->irqvecs - 1);
+		u32 ptp_isr_vec;
+
+		if (self->irqvecs > AQ_HW_PTP_IRQS)
+			ptp_isr_vec = self->irqvecs - AQ_HW_PTP_IRQS;
+		else
+			ptp_isr_vec = 0;
+
+		err = aq_ptp_init(self, ptp_isr_vec);
 		if (err < 0)
 			goto err_exit;
 
@@ -496,6 +518,14 @@ int aq_nic_start(struct aq_nic_s *self)
 			goto err_exit;
 	}
 
+	err = aq_reapply_rxnfc_all_rules(self);
+	if (err < 0)
+		goto err_exit;
+
+	err = aq_filters_vlans_update(self);
+	if (err < 0)
+		goto err_exit;
+
 	err = aq_ptp_ring_start(self);
 	if (err < 0)
 		goto err_exit;
@@ -793,6 +823,12 @@ unsigned int aq_nic_map_skb(struct aq_nic_s *self, struct sk_buff *skb,
 
 	first->eop_index = dx;
 	dx_buff->is_eop = 1U;
+	if (skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS &&
+	    self->aq_hw_ops->enable_ptp &&
+	    self->aq_hw_ops->hw_get_clk_sel) {
+		dx_buff->request_ts = 1U;
+		dx_buff->clk_sel = self->aq_hw_ops->hw_get_clk_sel(self->aq_hw);
+	}
 	dx_buff->skb = skb;
 	dx_buff->xdpf = NULL;
 	goto exit;
@@ -895,8 +931,6 @@ int aq_nic_xmit(struct aq_nic_s *self, struct sk_buff *skb)
 
 	frags = aq_nic_map_skb(self, skb, ring);
 
-	skb_tx_timestamp(skb);
-
 	if (likely(frags)) {
 		err = self->aq_hw_ops->hw_ring_tx_xmit(self->aq_hw,
 						       ring, frags);
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
index e9e38af680c3..9e72a9c23b40 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
@@ -293,8 +293,8 @@ static int aq_pci_probe(struct pci_dev *pdev,
 	numvecs = min((u8)AQ_CFG_VECS_DEF,
 		      aq_nic_get_cfg(self)->aq_hw_caps->msix_irqs);
 	numvecs = min(numvecs, num_online_cpus());
-	/* Request IRQ vector for PTP */
-	numvecs += 1;
+	/* Request IRQ lines for PTP */
+	numvecs += AQ_HW_PTP_IRQS;
 
 	numvecs += AQ_HW_SERVICE_IRQS;
 	/*enable interrupts */
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c b/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c
index 7486a28d7ff8..781d865e1127 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c
@@ -26,6 +26,18 @@
 
 #define POLL_SYNC_TIMER_MS 15
 
+#define PTP_UDP_FILTERS_CNT 4
+
+#define PTP_IPV4_MC_ADDR1 0xE0000181
+#define PTP_IPV4_MC_ADDR2 0xE000006B
+
+#define PTP_IPV6_MC_ADDR10 0xFF0E
+#define PTP_IPV6_MC_ADDR14 0x0181
+#define PTP_IPV6_MC_ADDR20 0xFF02
+#define PTP_IPV6_MC_ADDR24 0x006B
+
+#define PTP_GPIO_HIGHTIME 100000
+
 enum ptp_speed_offsets {
 	ptp_offset_idx_10 = 0,
 	ptp_offset_idx_100,
@@ -49,6 +61,12 @@ struct ptp_tx_timeout {
 	unsigned long tx_start;
 };
 
+struct ptp_tm_offset {
+	unsigned int mbps;
+	int egress;
+	int ingress;
+};
+
 struct aq_ptp_s {
 	struct aq_nic_s *aq_nic;
 	struct kernel_hwtstamp_config hwtstamp_config;
@@ -64,7 +82,7 @@ struct aq_ptp_s {
 
 	struct ptp_tx_timeout ptp_tx_timeout;
 
-	unsigned int idx_vector;
+	unsigned int idx_ptp_vector;
 	struct napi_struct napi;
 
 	struct aq_ring_s ptp_tx;
@@ -73,7 +91,7 @@ struct aq_ptp_s {
 
 	struct ptp_skb_ring skb_ring;
 
-	struct aq_rx_filter_l3l4 udp_filter;
+	struct aq_rx_filter_l3l4 udp_filter[PTP_UDP_FILTERS_CNT];
 	struct aq_rx_filter_l2 eth_type_filter;
 
 	struct delayed_work poll_sync;
@@ -81,18 +99,15 @@ struct aq_ptp_s {
 
 	bool extts_pin_enabled;
 	u64 last_sync1588_ts;
+	/* TSG clock selection: 0 - PTP, 1 - PTM */
+	u32 ptp_clock_sel;
 
 	bool a1_ptp;
-};
+	bool a2_ptp;
 
-struct ptp_tm_offset {
-	unsigned int mbps;
-	int egress;
-	int ingress;
+	struct ptp_tm_offset ptp_offset[6];
 };
 
-static struct ptp_tm_offset ptp_offset[6];
-
 void aq_ptp_tm_offset_set(struct aq_nic_s *aq_nic, unsigned int mbps)
 {
 	struct aq_ptp_s *aq_ptp = aq_nic->aq_ptp;
@@ -104,10 +119,10 @@ void aq_ptp_tm_offset_set(struct aq_nic_s *aq_nic, unsigned int mbps)
 	egress = 0;
 	ingress = 0;
 
-	for (i = 0; i < ARRAY_SIZE(ptp_offset); i++) {
-		if (mbps == ptp_offset[i].mbps) {
-			egress = ptp_offset[i].egress;
-			ingress = ptp_offset[i].ingress;
+	for (i = 0; i < ARRAY_SIZE(aq_ptp->ptp_offset); i++) {
+		if (mbps == aq_ptp->ptp_offset[i].mbps) {
+			egress = aq_ptp->ptp_offset[i].egress;
+			ingress = aq_ptp->ptp_offset[i].ingress;
 			break;
 		}
 	}
@@ -366,6 +381,8 @@ static void aq_ptp_convert_to_hwtstamp(struct aq_ptp_s *aq_ptp,
 static int aq_ptp_hw_pin_conf(struct aq_nic_s *aq_nic, u32 pin_index, u64 start,
 			      u64 period)
 {
+	struct aq_ptp_s *aq_ptp = aq_nic->aq_ptp;
+
 	if (period)
 		netdev_dbg(aq_nic->ndev,
 			   "Enable GPIO %d pulsing, start time %llu, period %u\n",
@@ -380,7 +397,8 @@ static int aq_ptp_hw_pin_conf(struct aq_nic_s *aq_nic, u32 pin_index, u64 start,
 	 */
 	mutex_lock(&aq_nic->fwreq_mutex);
 	aq_nic->aq_hw_ops->hw_gpio_pulse(aq_nic->aq_hw, pin_index,
-					 0, start, (u32)period, 0);
+					 aq_ptp->ptp_clock_sel, start,
+					 (u32)period, PTP_GPIO_HIGHTIME);
 	mutex_unlock(&aq_nic->fwreq_mutex);
 
 	return 0;
@@ -454,7 +472,8 @@ static void aq_ptp_extts_pin_ctrl(struct aq_ptp_s *aq_ptp)
 
 	if (aq_nic->aq_hw_ops->hw_extts_gpio_enable)
 		aq_nic->aq_hw_ops->hw_extts_gpio_enable(aq_nic->aq_hw, 0,
-							0, enable);
+							aq_ptp->ptp_clock_sel,
+							enable);
 }
 
 static int aq_ptp_extts_pin_configure(struct ptp_clock_info *ptp,
@@ -543,14 +562,193 @@ void aq_ptp_tx_hwtstamp(struct aq_nic_s *aq_nic, u64 timestamp)
 		return;
 	}
 
-	timestamp += atomic_read(&aq_ptp->offset_egress);
-	aq_ptp_convert_to_hwtstamp(aq_ptp, &hwtstamp, timestamp);
-	skb_tstamp_tx(skb, &hwtstamp);
+	if ((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
+		timestamp += atomic_read(&aq_ptp->offset_egress);
+		aq_ptp_convert_to_hwtstamp(aq_ptp, &hwtstamp, timestamp);
+		skb_tstamp_tx(skb, &hwtstamp);
+	}
+
 	dev_kfree_skb_any(skb);
 
 	aq_ptp_tx_timeout_update(aq_ptp);
 }
 
+static void aq_ptp_fill_udpv4_mc(struct ethtool_rx_flow_spec *fsp,
+				 u16 rx_queue, __be32 mc_addr)
+{
+	memset(fsp, 0, sizeof(*fsp));
+	fsp->ring_cookie = rx_queue;
+	fsp->flow_type = UDP_V4_FLOW;
+	fsp->h_u.udp_ip4_spec.pdst = cpu_to_be16(PTP_EV_PORT);
+	fsp->m_u.udp_ip4_spec.pdst = cpu_to_be16(0xffff);
+	fsp->h_u.udp_ip4_spec.ip4dst = mc_addr;
+	fsp->m_u.udp_ip4_spec.ip4dst = cpu_to_be32(0xffffffff);
+}
+
+static void aq_ptp_fill_udpv6_mc(struct ethtool_rx_flow_spec *fsp,
+				 u16 rx_queue,
+				 __be32 ip6dst_hi, __be32 ip6dst_hi_mask,
+				 __be32 ip6dst_lo, __be32 ip6dst_lo_mask)
+{
+	memset(fsp, 0, sizeof(*fsp));
+	fsp->ring_cookie = rx_queue;
+	fsp->flow_type = UDP_V6_FLOW;
+	fsp->h_u.udp_ip6_spec.pdst = cpu_to_be16(PTP_EV_PORT);
+	fsp->m_u.udp_ip6_spec.pdst = cpu_to_be16(0xffff);
+	fsp->h_u.udp_ip6_spec.ip6dst[0] = ip6dst_hi;
+	fsp->m_u.udp_ip6_spec.ip6dst[0] = ip6dst_hi_mask;
+	fsp->h_u.udp_ip6_spec.ip6dst[3] = ip6dst_lo;
+	fsp->m_u.udp_ip6_spec.ip6dst[3] = ip6dst_lo_mask;
+}
+
+static int aq_ptp_add_a2_filter(struct aq_ptp_s *aq_ptp,
+				struct ethtool_rx_flow_spec *fsp,
+				int *flt_idx)
+{
+	struct aq_nic_s *aq_nic = aq_ptp->aq_nic;
+	int err;
+
+	err = aq_set_data_fl3l4(fsp,
+				&aq_ptp->udp_filter[*flt_idx],
+				aq_ptp->udp_filter[*flt_idx].location,
+				true);
+	if (!err) {
+		netdev_dbg(aq_nic->ndev,
+			   "PTP MC filter prepared. Loc: %x\n",
+			   aq_ptp->udp_filter[*flt_idx].location);
+		(*flt_idx)++;
+	}
+	return err;
+}
+
+static int aq_ptp_dpath_enable(struct aq_ptp_s *aq_ptp,
+			       int enable_flags, u16 rx_queue)
+{
+	struct aq_nic_s *aq_nic = aq_ptp->aq_nic;
+	struct ethtool_rxnfc cmd = { 0 };
+	int err = 0, i = 0;
+	int flt_idx = 0;
+	const struct aq_hw_ops *hw_ops = aq_nic->aq_hw_ops;
+	struct ethtool_rx_flow_spec *fsp =
+		(struct ethtool_rx_flow_spec *)&cmd.fs;
+
+	netdev_dbg(aq_nic->ndev,
+		   "%sable ptp filters: %x.\n",
+		   enable_flags ? "En" : "Dis", enable_flags);
+
+	if (enable_flags) {
+		if (enable_flags & AQ_HW_PTP_L4_ENABLE) {
+			if (aq_ptp->a1_ptp) {
+				fsp->ring_cookie = rx_queue;
+				fsp->flow_type = UDP_V4_FLOW;
+				fsp->h_u.udp_ip4_spec.pdst =
+					cpu_to_be16(PTP_EV_PORT);
+				fsp->m_u.udp_ip4_spec.pdst =
+					cpu_to_be16(0xffff);
+				err = aq_set_data_fl3l4(fsp,
+							&aq_ptp->udp_filter[flt_idx],
+							aq_ptp->udp_filter[flt_idx].location,
+							true);
+				if (!err) {
+					netdev_dbg(aq_nic->ndev,
+						   "Set UDPv4, location: %x\n",
+						   aq_ptp->udp_filter[flt_idx]
+						   .location);
+					flt_idx++;
+				}
+			} else {
+				aq_ptp_fill_udpv4_mc(fsp, rx_queue,
+						     cpu_to_be32(PTP_IPV4_MC_ADDR1));
+				err = aq_ptp_add_a2_filter(aq_ptp, fsp,
+							   &flt_idx);
+				if (err)
+					netdev_dbg(aq_nic->ndev,
+						   "UDPv4 filter prepare failed\n");
+
+				aq_ptp_fill_udpv6_mc(fsp, rx_queue,
+						     cpu_to_be32(PTP_IPV6_MC_ADDR20 << 16),
+						     cpu_to_be32(0xffff0000),
+						     cpu_to_be32(PTP_IPV6_MC_ADDR24),
+						     cpu_to_be32(0x0000ffff));
+				err = aq_ptp_add_a2_filter(aq_ptp, fsp,
+							   &flt_idx);
+				if (err)
+					netdev_dbg(aq_nic->ndev,
+						   "UDPv6 filter prepare failed\n");
+
+				aq_ptp_fill_udpv6_mc(fsp, rx_queue,
+						     cpu_to_be32(PTP_IPV6_MC_ADDR10 << 16),
+						     cpu_to_be32(0xffff0000),
+						     cpu_to_be32(PTP_IPV6_MC_ADDR14),
+						     cpu_to_be32(0x0000ffff));
+				err = aq_ptp_add_a2_filter(aq_ptp, fsp,
+							   &flt_idx);
+				if (err)
+					netdev_dbg(aq_nic->ndev,
+						   "UDPv6 filter prepare failed\n");
+
+				aq_ptp_fill_udpv4_mc(fsp, rx_queue,
+						     cpu_to_be32(PTP_IPV4_MC_ADDR2));
+				err = aq_ptp_add_a2_filter(aq_ptp, fsp,
+							   &flt_idx);
+				if (err)
+					netdev_dbg(aq_nic->ndev,
+						   "UDPv4 filter prepare failed\n");
+			}
+		}
+
+		if (enable_flags & AQ_HW_PTP_L2_ENABLE) {
+			aq_ptp->eth_type_filter.ethertype = ETH_P_1588;
+			aq_ptp->eth_type_filter.queue = rx_queue;
+		}
+
+		if (hw_ops->hw_filter_l3l4_set) {
+			for (i = 0; i < flt_idx; i++) {
+				err = hw_ops->hw_filter_l3l4_set(aq_nic->aq_hw,
+						&aq_ptp->udp_filter[i]);
+
+				if (!err) {
+					netdev_dbg(aq_nic->ndev,
+						   "Set UDP filter complete. Location: %x\n",
+						   aq_ptp->udp_filter[i].location);
+				} else {
+					netdev_dbg(aq_nic->ndev, "Set UDP filter failed\n");
+					break;
+				}
+			}
+		}
+
+		if (!err && hw_ops->hw_filter_l2_set) {
+			err = hw_ops->hw_filter_l2_set(aq_nic->aq_hw,
+					&aq_ptp->eth_type_filter);
+
+			if (!err)
+				netdev_dbg(aq_nic->ndev,
+					   "Set L2 filter complete. Location: %d\n",
+					   aq_ptp->eth_type_filter.location);
+		}
+	} else {
+		/* PTP disabled, clear all UDP/L2 filters */
+		for (i = 0; i < PTP_UDP_FILTERS_CNT; i++) {
+			aq_ptp->udp_filter[i].cmd &=
+				~HW_ATL_RX_ENABLE_FLTR_L3L4;
+			if (hw_ops->hw_filter_l3l4_set) {
+				err = hw_ops->hw_filter_l3l4_set(aq_nic->aq_hw,
+						&aq_ptp->udp_filter[i]);
+				if (err)
+					netdev_dbg(aq_nic->ndev,
+						   "Set UDP filter failed\n");
+			}
+		}
+
+		if (!err && hw_ops->hw_filter_l2_clear)
+			err = hw_ops->hw_filter_l2_clear(aq_nic->aq_hw,
+						&aq_ptp->eth_type_filter);
+	}
+
+	return err;
+}
+
 /* aq_ptp_rx_hwtstamp - utility function which checks for RX time stamp
  * @adapter: pointer to adapter struct
  * @shhwtstamps: particular skb_shared_hwtstamps to save timestamp
@@ -572,53 +770,53 @@ void aq_ptp_hwtstamp_config_get(struct aq_ptp_s *aq_ptp,
 	*config = aq_ptp->hwtstamp_config;
 }
 
-static void aq_ptp_prepare_filters(struct aq_ptp_s *aq_ptp)
+static unsigned int aq_ptp_parse_rx_filters(enum hwtstamp_rx_filters rx_filter)
 {
-	aq_ptp->udp_filter.cmd = HW_ATL_RX_ENABLE_FLTR_L3L4 |
-			       HW_ATL_RX_ENABLE_CMP_PROT_L4 |
-			       HW_ATL_RX_UDP |
-			       HW_ATL_RX_ENABLE_CMP_DEST_PORT_L4 |
-			       HW_ATL_RX_HOST << HW_ATL_RX_ACTION_FL3F4_SHIFT |
-			       HW_ATL_RX_ENABLE_QUEUE_L3L4 |
-			       aq_ptp->ptp_rx.idx << HW_ATL_RX_QUEUE_FL3L4_SHIFT;
-	aq_ptp->udp_filter.p_dst = PTP_EV_PORT;
-
-	aq_ptp->eth_type_filter.ethertype = ETH_P_1588;
-	aq_ptp->eth_type_filter.queue = aq_ptp->ptp_rx.idx;
+	unsigned int ptp_en_flags = AQ_HW_PTP_DISABLE;
+
+	switch (rx_filter) {
+	case HWTSTAMP_FILTER_NONE:
+		break;
+	case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
+		ptp_en_flags = AQ_HW_PTP_L2_ENABLE;
+		break;
+	case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+	case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
+	case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
+	case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
+		ptp_en_flags = AQ_HW_PTP_L4_ENABLE;
+		break;
+	case HWTSTAMP_FILTER_PTP_V2_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+	case HWTSTAMP_FILTER_ALL:
+	default:
+		ptp_en_flags = AQ_HW_PTP_L4_ENABLE | AQ_HW_PTP_L2_ENABLE;
+		break;
+	}
+	return ptp_en_flags;
 }
 
 int aq_ptp_hwtstamp_config_set(struct aq_ptp_s *aq_ptp,
 			       struct kernel_hwtstamp_config *config)
 {
+	unsigned int ptp_en_flags = aq_ptp_parse_rx_filters(config->rx_filter);
 	struct aq_nic_s *aq_nic = aq_ptp->aq_nic;
-	const struct aq_hw_ops *hw_ops;
 	int err = 0;
 
-	hw_ops = aq_nic->aq_hw_ops;
-	if (config->tx_type == HWTSTAMP_TX_ON ||
-	    config->rx_filter == HWTSTAMP_FILTER_PTP_V2_EVENT) {
-		aq_ptp_prepare_filters(aq_ptp);
-		if (hw_ops->hw_filter_l3l4_set) {
-			err = hw_ops->hw_filter_l3l4_set(aq_nic->aq_hw,
-							 &aq_ptp->udp_filter);
-		}
-		if (!err && hw_ops->hw_filter_l2_set) {
-			err = hw_ops->hw_filter_l2_set(aq_nic->aq_hw,
-						       &aq_ptp->eth_type_filter);
-		}
+	if (aq_ptp->hwtstamp_config.rx_filter != config->rx_filter)
+		err = aq_ptp_dpath_enable(aq_ptp,
+					  ptp_en_flags,
+					  aq_ptp->ptp_rx.idx);
+
+	if (ptp_en_flags != AQ_HW_PTP_DISABLE)
 		aq_utils_obj_set(&aq_nic->flags, AQ_NIC_PTP_DPATH_UP);
-	} else {
-		aq_ptp->udp_filter.cmd &= ~HW_ATL_RX_ENABLE_FLTR_L3L4;
-		if (hw_ops->hw_filter_l3l4_set) {
-			err = hw_ops->hw_filter_l3l4_set(aq_nic->aq_hw,
-							 &aq_ptp->udp_filter);
-		}
-		if (!err && hw_ops->hw_filter_l2_clear) {
-			err = hw_ops->hw_filter_l2_clear(aq_nic->aq_hw,
-							&aq_ptp->eth_type_filter);
-		}
+	else
 		aq_utils_obj_clear(&aq_nic->flags, AQ_NIC_PTP_DPATH_UP);
-	}
 
 	if (err)
 		return -EREMOTEIO;
@@ -673,21 +871,23 @@ static int aq_ptp_poll(struct napi_struct *napi, int budget)
 		was_cleaned = true;
 	}
 
-	/* Processing HW_TIMESTAMP RX traffic */
-	err = aq_nic->aq_hw_ops->hw_ring_hwts_rx_receive(aq_nic->aq_hw,
-							 &aq_ptp->hwts_rx);
-	if (err < 0)
-		goto err_exit;
-
-	if (aq_ptp->hwts_rx.sw_head != aq_ptp->hwts_rx.hw_head) {
-		aq_ring_hwts_rx_clean(&aq_ptp->hwts_rx, aq_nic);
-
-		err = aq_nic->aq_hw_ops->hw_ring_hwts_rx_fill(aq_nic->aq_hw,
-							      &aq_ptp->hwts_rx);
+	if (aq_ptp->a1_ptp) {
+		/* Processing HW_TIMESTAMP RX traffic */
+		err = aq_nic->aq_hw_ops->hw_ring_hwts_rx_receive(aq_nic->aq_hw,
+			&aq_ptp->hwts_rx);
 		if (err < 0)
 			goto err_exit;
 
-		was_cleaned = true;
+		if (aq_ptp->hwts_rx.sw_head != aq_ptp->hwts_rx.hw_head) {
+			aq_ring_hwts_rx_clean(&aq_ptp->hwts_rx, aq_nic);
+
+			err = aq_nic->aq_hw_ops->hw_ring_hwts_rx_fill(aq_nic->aq_hw,
+				&aq_ptp->hwts_rx);
+			if (err < 0)
+				goto err_exit;
+
+			was_cleaned = true;
+		}
 	}
 
 	/* Processing PTP RX traffic */
@@ -818,7 +1018,7 @@ int aq_ptp_irq_alloc(struct aq_nic_s *aq_nic)
 		return 0;
 
 	if (pdev->msix_enabled || pdev->msi_enabled) {
-		err = request_irq(pci_irq_vector(pdev, aq_ptp->idx_vector),
+		err = request_irq(pci_irq_vector(pdev, aq_ptp->idx_ptp_vector),
 				  aq_ptp_isr, 0, aq_nic->ndev->name, aq_ptp);
 	} else {
 		err = -EINVAL;
@@ -837,7 +1037,7 @@ void aq_ptp_irq_free(struct aq_nic_s *aq_nic)
 	if (!aq_ptp)
 		return;
 
-	free_irq(pci_irq_vector(pdev, aq_ptp->idx_vector), aq_ptp);
+	free_irq(pci_irq_vector(pdev, aq_ptp->idx_ptp_vector), aq_ptp);
 }
 
 int aq_ptp_ring_init(struct aq_nic_s *aq_nic)
@@ -875,6 +1075,9 @@ int aq_ptp_ring_init(struct aq_nic_s *aq_nic)
 	if (err < 0)
 		goto err_rx_free;
 
+	if (aq_ptp->a2_ptp)
+		return 0;
+
 	err = aq_ring_init(&aq_ptp->hwts_rx, ATL_RING_RX);
 	if (err < 0)
 		goto err_rx_free;
@@ -912,10 +1115,12 @@ int aq_ptp_ring_start(struct aq_nic_s *aq_nic)
 	if (err < 0)
 		goto err_exit;
 
-	err = aq_nic->aq_hw_ops->hw_ring_rx_start(aq_nic->aq_hw,
-						  &aq_ptp->hwts_rx);
-	if (err < 0)
-		goto err_exit;
+	if (aq_ptp->a1_ptp) {
+		err = aq_nic->aq_hw_ops->hw_ring_rx_start(aq_nic->aq_hw,
+							  &aq_ptp->hwts_rx);
+		if (err < 0)
+			goto err_exit;
+	}
 
 	napi_enable(&aq_ptp->napi);
 
@@ -933,7 +1138,9 @@ void aq_ptp_ring_stop(struct aq_nic_s *aq_nic)
 	aq_nic->aq_hw_ops->hw_ring_tx_stop(aq_nic->aq_hw, &aq_ptp->ptp_tx);
 	aq_nic->aq_hw_ops->hw_ring_rx_stop(aq_nic->aq_hw, &aq_ptp->ptp_rx);
 
-	aq_nic->aq_hw_ops->hw_ring_rx_stop(aq_nic->aq_hw, &aq_ptp->hwts_rx);
+	if (aq_ptp->a1_ptp)
+		aq_nic->aq_hw_ops->hw_ring_rx_stop(aq_nic->aq_hw,
+						   &aq_ptp->hwts_rx);
 
 	napi_disable(&aq_ptp->napi);
 }
@@ -972,11 +1179,13 @@ int aq_ptp_ring_alloc(struct aq_nic_s *aq_nic)
 	if (err)
 		goto err_exit_ptp_tx;
 
-	err = aq_ring_hwts_rx_alloc(&aq_ptp->hwts_rx, aq_nic, PTP_HWST_RING_IDX,
-				    aq_nic->aq_nic_cfg.rxds,
-				    aq_nic->aq_nic_cfg.aq_hw_caps->rxd_size);
-	if (err)
-		goto err_exit_ptp_rx;
+	if (aq_ptp->a1_ptp) {
+		err = aq_ring_hwts_rx_alloc(&aq_ptp->hwts_rx, aq_nic, PTP_HWST_RING_IDX,
+					    aq_nic->aq_nic_cfg.rxds,
+					    aq_nic->aq_nic_cfg.aq_hw_caps->rxd_size);
+		if (err)
+			goto err_exit_ptp_rx;
+	}
 
 	err = aq_ptp_skb_ring_init(&aq_ptp->skb_ring, aq_nic->aq_nic_cfg.rxds);
 	if (err != 0) {
@@ -984,7 +1193,7 @@ int aq_ptp_ring_alloc(struct aq_nic_s *aq_nic)
 		goto err_exit_hwts_rx;
 	}
 
-	aq_ptp->ptp_ring_param.vec_idx = aq_ptp->idx_vector;
+	aq_ptp->ptp_ring_param.vec_idx = aq_ptp->idx_ptp_vector;
 	aq_ptp->ptp_ring_param.cpu = aq_ptp->ptp_ring_param.vec_idx +
 			aq_nic_get_cfg(aq_nic)->aq_rss.base_cpu_number;
 	cpumask_set_cpu(aq_ptp->ptp_ring_param.cpu,
@@ -993,7 +1202,8 @@ int aq_ptp_ring_alloc(struct aq_nic_s *aq_nic)
 	return 0;
 
 err_exit_hwts_rx:
-	aq_ring_hwts_rx_free(&aq_ptp->hwts_rx);
+	if (aq_ptp->a1_ptp)
+		aq_ring_hwts_rx_free(&aq_ptp->hwts_rx);
 err_exit_ptp_rx:
 	aq_ring_free(&aq_ptp->ptp_rx);
 err_exit_ptp_tx:
@@ -1011,7 +1221,8 @@ void aq_ptp_ring_free(struct aq_nic_s *aq_nic)
 
 	aq_ring_free(&aq_ptp->ptp_tx);
 	aq_ring_free(&aq_ptp->ptp_rx);
-	aq_ring_hwts_rx_free(&aq_ptp->hwts_rx);
+	if (aq_ptp->a1_ptp)
+		aq_ring_hwts_rx_free(&aq_ptp->hwts_rx);
 
 	aq_ptp_skb_ring_release(&aq_ptp->skb_ring);
 }
@@ -1035,46 +1246,49 @@ static struct ptp_clock_info aq_ptp_clock = {
 	.pin_config	= NULL,
 };
 
-#define ptp_offset_init(__idx, __mbps, __egress, __ingress)   do { \
-		ptp_offset[__idx].mbps = (__mbps); \
-		ptp_offset[__idx].egress = (__egress); \
-		ptp_offset[__idx].ingress = (__ingress); } \
-		while (0)
+static inline void ptp_offset_init(struct aq_ptp_s *aq_ptp, int idx,
+				   unsigned int mbps, int egress, int ingress)
+{
+	aq_ptp->ptp_offset[idx].mbps = mbps;
+	aq_ptp->ptp_offset[idx].egress = egress;
+	aq_ptp->ptp_offset[idx].ingress = ingress;
+}
 
-static void aq_ptp_offset_init_from_fw(const struct hw_atl_ptp_offset *offsets)
+static void aq_ptp_offset_init_from_fw(struct aq_ptp_s *aq_ptp,
+				       const struct hw_atl_ptp_offset *offsets)
 {
 	int i;
 
 	/* Load offsets for PTP */
-	for (i = 0; i < ARRAY_SIZE(ptp_offset); i++) {
+	for (i = 0; i < ARRAY_SIZE(aq_ptp->ptp_offset); i++) {
 		switch (i) {
 		/* 100M */
 		case ptp_offset_idx_100:
-			ptp_offset_init(i, 100,
+			ptp_offset_init(aq_ptp, i, 100,
 					offsets->egress_100,
 					offsets->ingress_100);
 			break;
 		/* 1G */
 		case ptp_offset_idx_1000:
-			ptp_offset_init(i, 1000,
+			ptp_offset_init(aq_ptp, i, 1000,
 					offsets->egress_1000,
 					offsets->ingress_1000);
 			break;
 		/* 2.5G */
 		case ptp_offset_idx_2500:
-			ptp_offset_init(i, 2500,
+			ptp_offset_init(aq_ptp, i, 2500,
 					offsets->egress_2500,
 					offsets->ingress_2500);
 			break;
 		/* 5G */
 		case ptp_offset_idx_5000:
-			ptp_offset_init(i, 5000,
+			ptp_offset_init(aq_ptp, i, 5000,
 					offsets->egress_5000,
 					offsets->ingress_5000);
 			break;
 		/* 10G */
 		case ptp_offset_idx_10000:
-			ptp_offset_init(i, 10000,
+			ptp_offset_init(aq_ptp, i, 10000,
 					offsets->egress_10000,
 					offsets->ingress_10000);
 			break;
@@ -1082,11 +1296,12 @@ static void aq_ptp_offset_init_from_fw(const struct hw_atl_ptp_offset *offsets)
 	}
 }
 
-static void aq_ptp_offset_init(const struct hw_atl_ptp_offset *offsets)
+static void aq_ptp_offset_init(struct aq_ptp_s *aq_ptp,
+			       const struct hw_atl_ptp_offset *offsets)
 {
-	memset(ptp_offset, 0, sizeof(ptp_offset));
+	memset(aq_ptp->ptp_offset, 0, sizeof(aq_ptp->ptp_offset));
 
-	aq_ptp_offset_init_from_fw(offsets);
+	aq_ptp_offset_init_from_fw(aq_ptp, offsets);
 }
 
 static void aq_ptp_gpio_init(struct ptp_clock_info *info,
@@ -1139,26 +1354,43 @@ static void aq_ptp_gpio_init(struct ptp_clock_info *info,
 	       sizeof(struct ptp_pin_desc) * info->n_pins);
 }
 
-void aq_ptp_clock_init(struct aq_nic_s *aq_nic)
+void aq_ptp_clock_init(struct aq_nic_s *aq_nic, enum aq_ptp_state state)
 {
 	struct aq_ptp_s *aq_ptp = aq_nic->aq_ptp;
-	struct timespec64 ts;
 
-	ktime_get_real_ts64(&ts);
-	aq_ptp_settime(&aq_ptp->ptp_info, &ts);
+	if (!aq_ptp)
+		return;
+
+	if (aq_ptp->a1_ptp || state == AQ_PTP_FIRST_INIT) {
+		struct timespec64 ts;
+
+		ktime_get_real_ts64(&ts);
+		aq_ptp_settime(&aq_ptp->ptp_info, &ts);
+	}
+
+	if (!aq_ptp->a1_ptp && state != AQ_PTP_FIRST_INIT) {
+		unsigned int ptp_en_flags =
+			aq_ptp_parse_rx_filters(state == AQ_PTP_LINK_UP ?
+						aq_ptp->hwtstamp_config.rx_filter :
+						HWTSTAMP_FILTER_NONE);
+
+		aq_ptp_dpath_enable(aq_ptp, ptp_en_flags, aq_ptp->ptp_rx.idx);
+	}
 }
 
 static void aq_ptp_poll_sync_work_cb(struct work_struct *w);
 
-int aq_ptp_init(struct aq_nic_s *aq_nic, unsigned int idx_vec)
+int aq_ptp_init(struct aq_nic_s *aq_nic, unsigned int idx_ptp_vec)
 {
 	bool a1_ptp = ATL_HW_IS_CHIP_FEATURE(aq_nic->aq_hw, ATLANTIC);
+	bool a2_ptp = ATL_HW_IS_CHIP_FEATURE(aq_nic->aq_hw, ANTIGUA);
 	struct hw_atl_utils_mbox mbox;
 	struct ptp_clock *clock;
-	struct aq_ptp_s *aq_ptp;
+	struct aq_ptp_s *aq_ptp = NULL;
 	int err = 0;
+	int i;
 
-	if (!a1_ptp) {
+	if (!a1_ptp && !a2_ptp) {
 		aq_nic->aq_ptp = NULL;
 		return 0;
 	}
@@ -1168,19 +1400,43 @@ int aq_ptp_init(struct aq_nic_s *aq_nic, unsigned int idx_vec)
 		return 0;
 	}
 
-	if (!aq_nic->aq_fw_ops->enable_ptp) {
-		aq_nic->aq_ptp = NULL;
-		return 0;
+	if (a1_ptp) {
+		if (!aq_nic->aq_fw_ops->enable_ptp) {
+			aq_nic->aq_ptp = NULL;
+			return 0;
+		}
 	}
 
-	hw_atl_utils_mpi_read_stats(aq_nic->aq_hw, &mbox);
-
-	if (!(mbox.info.caps_ex & BIT(CAPS_EX_PHY_PTP_EN))) {
+	/* PTP requires at least 1 free irq vector for itself */
+	if (aq_nic->irqvecs <= AQ_HW_PTP_IRQS) {
+		netdev_warn(aq_nic->ndev,
+			    "Disabling PTP due to insufficient number of available IRQ vectors.\n");
 		aq_nic->aq_ptp = NULL;
 		return 0;
 	}
 
-	aq_ptp_offset_init(&mbox.info.ptp_offset);
+	if (a1_ptp) {
+		hw_atl_utils_mpi_read_stats(aq_nic->aq_hw, &mbox);
+		if (!(mbox.info.caps_ex & BIT(CAPS_EX_PHY_PTP_EN))) {
+			aq_nic->aq_ptp = NULL;
+			return 0;
+		}
+	} else {
+		memset(&mbox, 0, sizeof(mbox));
+
+		if (a2_ptp) {
+			mbox.info.ptp_offset.ingress_100 = HW_ATL2_PTP_OFFSET_INGRESS_100;
+			mbox.info.ptp_offset.egress_100 = HW_ATL2_PTP_OFFSET_EGRESS_100;
+			mbox.info.ptp_offset.ingress_1000 = HW_ATL2_PTP_OFFSET_INGRESS_1000;
+			mbox.info.ptp_offset.egress_1000 = HW_ATL2_PTP_OFFSET_EGRESS_1000;
+			mbox.info.ptp_offset.ingress_2500 = HW_ATL2_PTP_OFFSET_INGRESS_2500;
+			mbox.info.ptp_offset.egress_2500 = HW_ATL2_PTP_OFFSET_EGRESS_2500;
+			mbox.info.ptp_offset.ingress_5000 = HW_ATL2_PTP_OFFSET_INGRESS_5000;
+			mbox.info.ptp_offset.egress_5000 = HW_ATL2_PTP_OFFSET_EGRESS_5000;
+			mbox.info.ptp_offset.ingress_10000 = HW_ATL2_PTP_OFFSET_INGRESS_10000;
+			mbox.info.ptp_offset.egress_10000 = HW_ATL2_PTP_OFFSET_EGRESS_10000;
+		}
+	}
 
 	aq_ptp = kzalloc_obj(*aq_ptp);
 	if (!aq_ptp) {
@@ -1190,10 +1446,12 @@ int aq_ptp_init(struct aq_nic_s *aq_nic, unsigned int idx_vec)
 
 	aq_ptp->aq_nic = aq_nic;
 	aq_ptp->a1_ptp = a1_ptp;
+	aq_ptp->a2_ptp = a2_ptp;
 
 	spin_lock_init(&aq_ptp->ptp_lock);
 	spin_lock_init(&aq_ptp->ptp_ring_lock);
 
+	aq_ptp_offset_init(aq_ptp, &mbox.info.ptp_offset);
 	aq_ptp->ptp_info = aq_ptp_clock;
 	aq_ptp_gpio_init(&aq_ptp->ptp_info, &mbox.info);
 	clock = ptp_clock_register(&aq_ptp->ptp_info, &aq_nic->ndev->dev);
@@ -1210,22 +1468,34 @@ int aq_ptp_init(struct aq_nic_s *aq_nic, unsigned int idx_vec)
 
 	netif_napi_add(aq_nic_get_ndev(aq_nic), &aq_ptp->napi, aq_ptp_poll);
 
-	aq_ptp->idx_vector = idx_vec;
+	aq_ptp->idx_ptp_vector = idx_ptp_vec;
 
 	aq_nic->aq_ptp = aq_ptp;
 
 	/* enable ptp counter */
+	aq_ptp->ptp_clock_sel = ATL_TSG_CLOCK_SEL_0;
 	aq_utils_obj_set(&aq_nic->aq_hw->flags, AQ_HW_PTP_AVAILABLE);
-	mutex_lock(&aq_nic->fwreq_mutex);
-	aq_nic->aq_fw_ops->enable_ptp(aq_nic->aq_hw, 1);
-	aq_ptp_clock_init(aq_nic);
-	mutex_unlock(&aq_nic->fwreq_mutex);
+	if (a1_ptp) {
+		mutex_lock(&aq_nic->fwreq_mutex);
+		aq_nic->aq_fw_ops->enable_ptp(aq_nic->aq_hw, 1);
+		mutex_unlock(&aq_nic->fwreq_mutex);
+	}
+	if (a2_ptp)
+		aq_nic->aq_hw_ops->enable_ptp(aq_nic->aq_hw, aq_ptp->ptp_clock_sel, 1);
 
 	INIT_DELAYED_WORK(&aq_ptp->poll_sync, &aq_ptp_poll_sync_work_cb);
 	aq_ptp->eth_type_filter.location =
-			aq_nic_reserve_filter(aq_nic, aq_rx_filter_ethertype);
-	aq_ptp->udp_filter.location =
+		aq_nic_reserve_filter(aq_nic, aq_rx_filter_ethertype);
+
+	for (i = 0; i < PTP_UDP_FILTERS_CNT; i++) {
+		aq_ptp->udp_filter[i].location =
 			aq_nic_reserve_filter(aq_nic, aq_rx_filter_l3l4);
+	}
+
+	aq_ptp_clock_init(aq_nic, AQ_PTP_FIRST_INIT);
+	netdev_info(aq_nic->ndev,
+		    "Enable PTP Support. %d GPIO(s)\n",
+		    aq_ptp->ptp_info.n_pins);
 
 	return 0;
 
@@ -1244,30 +1514,45 @@ void aq_ptp_unregister(struct aq_nic_s *aq_nic)
 	if (!aq_ptp)
 		return;
 
-	ptp_clock_unregister(aq_ptp->ptp_clock);
+	if (aq_ptp->ptp_clock) {
+		ptp_clock_unregister(aq_ptp->ptp_clock);
+		aq_ptp->ptp_clock = NULL;
+	}
 }
 
 void aq_ptp_free(struct aq_nic_s *aq_nic)
 {
 	struct aq_ptp_s *aq_ptp = aq_nic->aq_ptp;
+	int i;
 
 	if (!aq_ptp)
 		return;
 
+	/* disable ptp */
+	if (aq_ptp->a1_ptp) {
+		mutex_lock(&aq_nic->fwreq_mutex);
+		aq_nic->aq_fw_ops->enable_ptp(aq_nic->aq_hw, 0);
+		mutex_unlock(&aq_nic->fwreq_mutex);
+	}
+
+	if (aq_ptp->a2_ptp)
+		aq_nic->aq_hw_ops->enable_ptp(aq_nic->aq_hw,
+					      aq_ptp->ptp_clock_sel, 0);
+
+	cancel_delayed_work_sync(&aq_ptp->poll_sync);
+
 	aq_nic_release_filter(aq_nic, aq_rx_filter_ethertype,
 			      aq_ptp->eth_type_filter.location);
-	aq_nic_release_filter(aq_nic, aq_rx_filter_l3l4,
-			      aq_ptp->udp_filter.location);
-	cancel_delayed_work_sync(&aq_ptp->poll_sync);
-	/* disable ptp */
-	mutex_lock(&aq_nic->fwreq_mutex);
-	aq_nic->aq_fw_ops->enable_ptp(aq_nic->aq_hw, 0);
-	mutex_unlock(&aq_nic->fwreq_mutex);
+	for (i = 0; i < PTP_UDP_FILTERS_CNT; i++)
+		aq_nic_release_filter(aq_nic, aq_rx_filter_l3l4,
+				      aq_ptp->udp_filter[i].location);
 
 	kfree(aq_ptp->ptp_info.pin_config);
+	aq_ptp->ptp_info.pin_config = NULL;
 
 	netif_napi_del(&aq_ptp->napi);
 	kfree(aq_ptp);
+	aq_utils_obj_clear(&aq_nic->aq_hw->flags, AQ_HW_PTP_AVAILABLE);
 	aq_nic->aq_ptp = NULL;
 }
 
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ptp.h b/drivers/net/ethernet/aquantia/atlantic/aq_ptp.h
index 5e643ec7cc06..df93857deac9 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ptp.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ptp.h
@@ -14,6 +14,12 @@
 
 #include "aq_ring.h"
 
+enum aq_ptp_state {
+	AQ_PTP_NO_LINK = 0,
+	AQ_PTP_FIRST_INIT = 1,
+	AQ_PTP_LINK_UP = 2,
+};
+
 #define PTP_8TC_RING_IDX             8
 #define PTP_4TC_RING_IDX            16
 #define PTP_HWST_RING_IDX           31
@@ -32,7 +38,7 @@ static inline unsigned int aq_ptp_ring_idx(const enum aq_tc_mode tc_mode)
 #if IS_REACHABLE(CONFIG_PTP_1588_CLOCK)
 
 /* Common functions */
-int aq_ptp_init(struct aq_nic_s *aq_nic, unsigned int idx_vec);
+int aq_ptp_init(struct aq_nic_s *aq_nic, unsigned int idx_ptp_vec);
 
 void aq_ptp_unregister(struct aq_nic_s *aq_nic);
 void aq_ptp_free(struct aq_nic_s *aq_nic);
@@ -52,7 +58,7 @@ void aq_ptp_service_task(struct aq_nic_s *aq_nic);
 
 void aq_ptp_tm_offset_set(struct aq_nic_s *aq_nic, unsigned int mbps);
 
-void aq_ptp_clock_init(struct aq_nic_s *aq_nic);
+void aq_ptp_clock_init(struct aq_nic_s *aq_nic, enum aq_ptp_state state);
 
 /* Traffic processing functions */
 int aq_ptp_xmit(struct aq_nic_s *aq_nic, struct sk_buff *skb);
@@ -80,7 +86,7 @@ u64 *aq_ptp_get_stats(struct aq_nic_s *aq_nic, u64 *data);
 
 #else
 
-static inline int aq_ptp_init(struct aq_nic_s *aq_nic, unsigned int idx_vec)
+static inline int aq_ptp_init(struct aq_nic_s *aq_nic, unsigned int idx_ptp_vec)
 {
 	return 0;
 }
@@ -122,7 +128,8 @@ static inline void aq_ptp_ring_deinit(struct aq_nic_s *aq_nic) {}
 static inline void aq_ptp_service_task(struct aq_nic_s *aq_nic) {}
 static inline void aq_ptp_tm_offset_set(struct aq_nic_s *aq_nic,
 					unsigned int mbps) {}
-static inline void aq_ptp_clock_init(struct aq_nic_s *aq_nic) {}
+static inline void aq_ptp_clock_init(struct aq_nic_s *aq_nic,
+				     enum aq_ptp_state state) {}
 static inline int aq_ptp_xmit(struct aq_nic_s *aq_nic, struct sk_buff *skb)
 {
 	return -EOPNOTSUPP;
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
index e270327e47fd..a52d6d3fe464 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
@@ -308,24 +308,30 @@ bool aq_ring_tx_clean(struct aq_ring_s *self)
 			}
 		}
 
-		if (likely(!buff->is_eop))
-			goto out;
-
-		if (buff->skb) {
-			u64_stats_update_begin(&self->stats.tx.syncp);
-			++self->stats.tx.packets;
-			self->stats.tx.bytes += buff->skb->len;
-			u64_stats_update_end(&self->stats.tx.syncp);
-			dev_kfree_skb_any(buff->skb);
-		} else if (buff->xdpf) {
-			u64_stats_update_begin(&self->stats.tx.syncp);
-			++self->stats.tx.packets;
-			self->stats.tx.bytes += xdp_get_frame_len(buff->xdpf);
-			u64_stats_update_end(&self->stats.tx.syncp);
-			xdp_return_frame_rx_napi(buff->xdpf);
-		}
+		if (unlikely(buff->is_eop)) {
+			if (unlikely(buff->request_ts) &&
+			    self->aq_nic->aq_hw_ops->hw_ring_tx_ptp_get_ts) {
+				u64 ts = self->aq_nic->aq_hw_ops->hw_ring_tx_ptp_get_ts(self);
+
+				if (!ts)
+					break;
 
-out:
+				aq_ptp_tx_hwtstamp(self->aq_nic, ts);
+			}
+			if (buff->skb) {
+				u64_stats_update_begin(&self->stats.tx.syncp);
+				++self->stats.tx.packets;
+				self->stats.tx.bytes += buff->skb->len;
+				u64_stats_update_end(&self->stats.tx.syncp);
+				dev_kfree_skb_any(buff->skb);
+			} else if (buff->xdpf) {
+				u64_stats_update_begin(&self->stats.tx.syncp);
+				++self->stats.tx.packets;
+				self->stats.tx.bytes += xdp_get_frame_len(buff->xdpf);
+				u64_stats_update_end(&self->stats.tx.syncp);
+				xdp_return_frame_rx_napi(buff->xdpf);
+			}
+		}
 		buff->skb = NULL;
 		buff->xdpf = NULL;
 		buff->pa = 0U;
@@ -570,7 +576,7 @@ static int __aq_ring_rx_clean(struct aq_ring_s *self, struct napi_struct *napi,
 							    self->hw_head);
 
 				if (unlikely(!is_rsc_completed) ||
-						frag_cnt > MAX_SKB_FRAGS) {
+				    frag_cnt > MAX_SKB_FRAGS) {
 					err = 0;
 					goto err_exit;
 				}
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
index c71e8d1adfc9..3047bda619c0 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
@@ -7,6 +7,7 @@
 #include "aq_hw_utils.h"
 #include "aq_ring.h"
 #include "aq_nic.h"
+#include "aq_ptp.h"
 #include "hw_atl/hw_atl_b0.h"
 #include "hw_atl/hw_atl_utils.h"
 #include "hw_atl/hw_atl_llh.h"
@@ -20,6 +21,15 @@
 static int hw_atl2_act_rslvr_table_set(struct aq_hw_s *self, u8 location,
 				       u32 tag, u32 mask, u32 action);
 
+static void hw_atl2_enable_ptp(struct aq_hw_s *self,
+			       unsigned int param, int enable);
+static int hw_atl2_hw_tx_ptp_ring_init(struct aq_hw_s *self,
+				       struct aq_ring_s *aq_ring);
+static int hw_atl2_hw_rx_ptp_ring_init(struct aq_hw_s *self,
+				       struct aq_ring_s *aq_ring);
+static void aq_get_ptp_ts(struct aq_hw_s *self, u64 *stamp);
+static int hw_atl2_adj_clock_freq(struct aq_hw_s *self, s32 ppb);
+
 #define DEFAULT_BOARD_BASIC_CAPABILITIES \
 	.is_64_dma = true,		  \
 	.op64bit = true,		  \
@@ -144,6 +154,12 @@ static int hw_atl2_hw_reset(struct aq_hw_s *self)
 		priv->l3l4_filters[i].l4_index = -1;
 	}
 
+	if (self->clk_select != -1)
+		hw_atl2_enable_ptp(self,
+				   self->clk_select,
+				   aq_utils_obj_test(&self->flags, AQ_HW_PTP_AVAILABLE) ?
+				   1 : 0);
+
 	self->aq_fw_ops->set_state(self, MPI_RESET);
 
 	err = aq_hw_err_from_flags(self);
@@ -719,14 +735,24 @@ static int hw_atl2_hw_ring_rx_init(struct aq_hw_s *self,
 				   struct aq_ring_s *aq_ring,
 				   struct aq_ring_param_s *aq_ring_param)
 {
-	return hw_atl_b0_hw_ring_rx_init(self, aq_ring, aq_ring_param);
+	int res = hw_atl_b0_hw_ring_rx_init(self, aq_ring, aq_ring_param);
+
+	if (aq_ptp_ring(aq_ring->aq_nic, aq_ring))
+		hw_atl2_hw_rx_ptp_ring_init(self, aq_ring);
+
+	return res;
 }
 
 static int hw_atl2_hw_ring_tx_init(struct aq_hw_s *self,
 				   struct aq_ring_s *aq_ring,
 				   struct aq_ring_param_s *aq_ring_param)
 {
-	return hw_atl_b0_hw_ring_tx_init(self, aq_ring, aq_ring_param);
+	int res = hw_atl_b0_hw_ring_tx_init(self, aq_ring, aq_ring_param);
+
+	if (aq_ptp_ring(aq_ring->aq_nic, aq_ring))
+		hw_atl2_hw_tx_ptp_ring_init(self, aq_ring);
+
+	return res;
 }
 
 #define IS_FILTER_ENABLED(_F_) ((packet_filter & (_F_)) ? 1U : 0U)
@@ -886,6 +912,138 @@ static struct aq_stats_s *hw_atl2_utils_get_hw_stats(struct aq_hw_s *self)
 	return &self->curr_stats;
 }
 
+static u32 hw_atl2_tsg_int_clk_freq(struct aq_hw_s *self)
+{
+	return AQ2_HW_PTP_COUNTER_HZ;
+}
+
+static void hw_atl2_enable_ptp(struct aq_hw_s *self,
+			       unsigned int param, int enable)
+{
+	self->clk_select = param;
+
+	/* enable tsg counter */
+	hw_atl2_tsg_clock_reset(self, self->clk_select);
+	hw_atl2_tsg_clock_en(self, !self->clk_select, enable);
+	hw_atl2_tsg_clock_en(self, self->clk_select, enable);
+
+	if (enable)
+		hw_atl2_adj_clock_freq(self, 0);
+
+	hw_atl2_tpb_tps_highest_priority_tc_enable_set(self, enable);
+}
+
+static void aq_get_ptp_ts(struct aq_hw_s *self, u64 *stamp)
+{
+	if (stamp)
+		*stamp = hw_atl2_tsg_clock_read(self, self->clk_select);
+}
+
+static u64 hw_atl2_hw_ring_tx_ptp_get_ts(struct aq_ring_s *ring)
+{
+	struct hw_atl2_txts_s *txts;
+
+	txts = (struct hw_atl2_txts_s *)&ring->dx_ring[ring->sw_head *
+						HW_ATL2_TXD_SIZE];
+	/* DD + TS_VALID */
+	if ((txts->ctrl & HW_ATL2_TXTS_DD) && (txts->ctrl & HW_ATL2_TXTS_TS_VALID))
+		return txts->ts;
+
+	return 0;
+}
+
+static u16 hw_atl2_hw_rx_extract_ts(struct aq_hw_s *self, u8 *p,
+				    unsigned int len, u64 *timestamp)
+{
+	unsigned int offset = HW_ATL2_RX_TS_SIZE;
+	u8 *ptr;
+
+	if (len <= offset || !timestamp)
+		return 0;
+
+	ptr = p + (len - offset);
+	memcpy(timestamp, ptr, sizeof(*timestamp));
+
+	return HW_ATL2_RX_TS_SIZE;
+}
+
+static int hw_atl2_adj_sys_clock(struct aq_hw_s *self, s64 delta)
+{
+	if (delta >= 0)
+		hw_atl2_tsg_clock_add(self, self->clk_select, (u64)delta);
+	else
+		hw_atl2_tsg_clock_sub(self, self->clk_select, (u64)(-delta));
+
+	return 0;
+}
+
+static int hw_atl2_adj_clock_freq(struct aq_hw_s *self, s32 ppb)
+{
+	u32 freq = hw_atl2_tsg_int_clk_freq(self);
+	u64 divisor = 0, base_ns;
+	u32 nsi_frac = 0, nsi;
+	u32 nsi_rem;
+
+	base_ns = div_u64((u64)((s64)ppb + NSEC_PER_SEC) * NSEC_PER_SEC, freq);
+	nsi = (u32)div_u64_rem(base_ns, NSEC_PER_SEC, &nsi_rem);
+	if (nsi_rem != 0) {
+		divisor = div_u64(mul_u32_u32(NSEC_PER_SEC, NSEC_PER_SEC),
+				  nsi_rem);
+		nsi_frac = (u32)div64_u64(AQ_FRAC_PER_NS * NSEC_PER_SEC,
+					  divisor);
+	}
+
+	hw_atl2_tsg_clock_increment_set(self, self->clk_select, nsi, nsi_frac);
+
+	return 0;
+}
+
+static int hw_atl2_hw_tx_ptp_ring_init(struct aq_hw_s *self,
+				       struct aq_ring_s *aq_ring)
+{
+	hw_atl2_tdm_tx_desc_timestamp_writeback_en_set(self, true,
+						       aq_ring->idx);
+	hw_atl2_tdm_tx_desc_timestamp_en_set(self, true, aq_ring->idx);
+	hw_atl2_tdm_tx_desc_avb_en_set(self, true, aq_ring->idx);
+
+	return aq_hw_err_from_flags(self);
+}
+
+static int hw_atl2_hw_rx_ptp_ring_init(struct aq_hw_s *self,
+				       struct aq_ring_s *aq_ring)
+{
+	hw_atl2_rpf_rx_desc_timestamp_req_set(self,
+					      self->clk_select == ATL_TSG_CLOCK_SEL_1 ? 2 : 1,
+					      aq_ring->idx);
+	return aq_hw_err_from_flags(self);
+}
+
+static u32 hw_atl2_hw_get_clk_sel(struct aq_hw_s *self)
+{
+	return self->clk_select;
+}
+
+static int hw_atl2_gpio_pulse(struct aq_hw_s *self, u32 index, u32 clk_sel,
+			      u64 start, u32 period, u32 hightime)
+{
+	u32 mode;
+
+	if (start == 0)
+		mode = HW_ATL2_GPIO_PIN_SPEC_MODE_GPIO;
+	else if (clk_sel == ATL_TSG_CLOCK_SEL_0)
+		mode = HW_ATL2_GPIO_PIN_SPEC_MODE_TSG0_EVENT_OUTPUT;
+	else
+		mode = HW_ATL2_GPIO_PIN_SPEC_MODE_TSG1_EVENT_OUTPUT;
+
+	if (index == 1 || index == 3) { /* Hardware limitation */
+		hw_atl2_gpio_special_mode_set(self, mode, index);
+	}
+
+	hw_atl2_tsg_ptp_gpio_gen_pulse(self, clk_sel, start, period, hightime);
+
+	return 0;
+}
+
 static bool hw_atl2_rxf_l3_is_equal(struct hw_atl2_l3_filter *f1,
 				    struct hw_atl2_l3_filter *f2)
 {
@@ -1474,4 +1632,21 @@ const struct aq_hw_ops hw_atl2_ops = {
 	.hw_set_offload              = hw_atl_b0_hw_offload_set,
 	.hw_set_loopback             = hw_atl_b0_set_loopback,
 	.hw_set_fc                   = hw_atl_b0_set_fc,
+
+	.hw_ring_hwts_rx_fill        = NULL,
+	.hw_ring_hwts_rx_receive     = NULL,
+
+	.hw_get_ptp_ts           = aq_get_ptp_ts,
+	.hw_adj_clock_freq       = hw_atl2_adj_clock_freq,
+	.hw_adj_sys_clock        = hw_atl2_adj_sys_clock,
+	.hw_gpio_pulse           = hw_atl2_gpio_pulse,
+
+	.enable_ptp              = hw_atl2_enable_ptp,
+	.hw_ring_tx_ptp_get_ts   = hw_atl2_hw_ring_tx_ptp_get_ts,
+	.rx_extract_ts           = hw_atl2_hw_rx_extract_ts,
+	.hw_tx_ptp_ring_init     = hw_atl2_hw_tx_ptp_ring_init,
+	.hw_rx_ptp_ring_init     = hw_atl2_hw_rx_ptp_ring_init,
+	.hw_get_clk_sel          = hw_atl2_hw_get_clk_sel,
+	.extract_hwts            = NULL,
+	.hw_extts_gpio_enable    = NULL,
 };
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.h b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.h
index 346f0dc9912e..4b905231ae73 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.h
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.h
@@ -7,6 +7,18 @@
 #define HW_ATL2_H
 
 #include "aq_common.h"
+#define HW_ATL2_RX_TS_SIZE 8
+
+#define HW_ATL2_PTP_OFFSET_INGRESS_100          768
+#define HW_ATL2_PTP_OFFSET_EGRESS_100           336
+#define HW_ATL2_PTP_OFFSET_INGRESS_1000         510
+#define HW_ATL2_PTP_OFFSET_EGRESS_1000          105
+#define HW_ATL2_PTP_OFFSET_INGRESS_2500         2447
+#define HW_ATL2_PTP_OFFSET_EGRESS_2500          634
+#define HW_ATL2_PTP_OFFSET_INGRESS_5000         1426
+#define HW_ATL2_PTP_OFFSET_EGRESS_5000          361
+#define HW_ATL2_PTP_OFFSET_INGRESS_10000        997
+#define HW_ATL2_PTP_OFFSET_EGRESS_10000         203
 
 extern const struct aq_hw_caps_s hw_atl2_caps_aqc113;
 extern const struct aq_hw_caps_s hw_atl2_caps_aqc115c;
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h
index 31d7cae6641a..e0687fb4350a 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h
@@ -29,7 +29,8 @@
 #define HW_ATL2_TXBUF_MAX              128U
 #define HW_ATL2_PTP_TXBUF_SIZE           8U
 
-#define HW_ATL2_RXBUF_MAX              192U
+/* Reduced from 192 to reserve space for PTP RX timestamp trailer */
+#define HW_ATL2_RXBUF_MAX              172U
 #define HW_ATL2_PTP_RXBUF_SIZE          16U
 #define HW_ATL2_RSS_REDIRECTION_MAX 64U
 
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.h b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.h
index c84955bc14ae..6a90e6389ebd 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.h
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.h
@@ -8,6 +8,16 @@
 
 #include "aq_hw.h"
 
+/* Hardware tx launch time descriptor */
+struct hw_atl2_txts_s {
+	u64 ts;
+	u32 ctrl;
+	u32 reserved;
+};
+
+#define HW_ATL2_TXTS_DD	BIT(3)
+#define HW_ATL2_TXTS_TS_VALID   BIT(20)
+
 /* F W    A P I */
 
 struct link_options_s {
-- 
2.43.0



* [PATCH net-next 8/9] net: atlantic: extend hw_ops and TX descriptor for AQC113 PTP
From: sukhdeeps @ 2026-05-06 13:57 UTC (permalink / raw)
  To: netdev
  Cc: irusskikh, epomozov, richardcochran, andrew+netdev, davem,
	edumazet, kuba, pabeni, linux-kernel, Sukhdeep Singh
In-Reply-To: <20260506135706.2834-1-sukhdeeps@marvell.com>

From: Sukhdeep Singh <sukhdeeps@marvell.com>

Extend the aq_hw_ops interface with new function pointers required for
PTP support on AQC113:
- enable_ptp: enable/disable PTP counter with clock selection
- hw_ring_tx_ptp_get_ts: read TX timestamp from descriptor writeback
- hw_tx_ptp_ring_init/hw_rx_ptp_ring_init: per-ring PTP initialization
- hw_get_clk_sel: query active TSG clock selection

Update existing hw_ops signatures to support AQC113 dual-clock
architecture:
- hw_gpio_pulse: add clk_sel and hightime parameters
- hw_extts_gpio_enable: add channel parameter

Add PTP-related hardware defines:
- AQ_HW_TXD_CTL_TS_EN/TS_TSG0 for TX descriptor timestamp control
- AQ2_HW_PTP_COUNTER_HZ for AQC113 TSG clock frequency
- AQ_HW_PTP_IRQS for PTP interrupt vector accounting
- PTP enable flags (L2/L4) and TSG clock selection constants

Add request_ts and clk_sel bitfields to aq_ring_buff_s for per-packet
TX timestamp request tracking.

Update hw_atl_b0.c (AQC107) implementations:
- Adapt gpio_pulse and extts_gpio_enable to new signatures
- Add TX descriptor timestamp bits for AQC113 when ANTIGUA chip
  feature is detected
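
For illustration only, the TX descriptor control-word handling described
above can be sketched as a standalone helper. The two TS bit values and
ATL_TSG_CLOCK_SEL_1 are taken from this patch; example_txd_ctl_ts() and
EXAMPLE_TXD_CTL_CMD_WB are hypothetical names, and the CMD_WB value is a
placeholder, not the driver's real HW_ATL_B0_TXD_CTL_CMD_WB. The ANTIGUA
chip-feature check is left out of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define AQ_HW_TXD_CTL_TS_EN    0x40000000U /* from this patch */
#define AQ_HW_TXD_CTL_TS_TSG0  0x80000000U /* from this patch */
#define ATL_TSG_CLOCK_SEL_1    1           /* from this patch */

/* Placeholder bit for illustration; not the driver's real value. */
#define EXAMPLE_TXD_CTL_CMD_WB 0x02000000U

/* Hypothetical helper mirroring the logic added to the xmit path. */
static uint32_t example_txd_ctl_ts(uint32_t ctl, bool request_ts,
				   unsigned int clk_sel)
{
	if (!request_ts)
		return ctl;

	/* Ask the hardware to timestamp this descriptor. */
	ctl |= AQ_HW_TXD_CTL_TS_EN;

	/* Any clock selection other than 1 sets the TSG0 bit. */
	if (clk_sel != ATL_TSG_CLOCK_SEL_1)
		ctl |= AQ_HW_TXD_CTL_TS_TSG0;

	/* Timestamped descriptors drop the regular writeback command. */
	ctl &= ~EXAMPLE_TXD_CTL_CMD_WB;

	return ctl;
}
```

With the placeholder value above, a control word that only has the
writeback bit set becomes 0xC0000000 for clock selection 0 and
0x40000000 for clock selection 1.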

Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
---
 .../net/ethernet/aquantia/atlantic/aq_hw.h    | 34 +++++++++++++++++--
 .../net/ethernet/aquantia/atlantic/aq_ptp.c   |  4 +--
 .../net/ethernet/aquantia/atlantic/aq_ring.h  |  4 ++-
 .../aquantia/atlantic/hw_atl/hw_atl_b0.c      | 15 ++++++--
 4 files changed, 48 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_hw.h b/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
index 04fb87d4e56d..e3bacad08b93 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
@@ -19,6 +19,9 @@
 #define AQ_HW_MAC_COUNTER_HZ   312500000ll
 #define AQ_HW_PHY_COUNTER_HZ   160000000ll
 
+#define AQ_HW_TXD_CTL_TS_EN       0x40000000U
+#define AQ_HW_TXD_CTL_TS_TSG0     0x80000000U
+
 enum aq_tc_mode {
 	AQ_TC_MODE_INVALID = -1,
 	AQ_TC_MODE_8TCS,
@@ -38,6 +41,8 @@ enum aq_tc_mode {
 
 #define AQ_FRAC_PER_NS 0x100000000LL
 
+#define AQ2_HW_PTP_COUNTER_HZ   156250000ll
+
 /* Used for rate to Mbps conversion */
 #define AQ_MBPS_DIVISOR         125000 /* 1000000 / 8 */
 
@@ -109,6 +114,7 @@ struct aq_stats_s {
 #define AQ_HW_IRQ_MSIX    3U
 
 #define AQ_HW_SERVICE_IRQS   1U
+#define AQ_HW_PTP_IRQS       1U
 
 #define AQ_HW_POWER_STATE_D0   0U
 #define AQ_HW_POWER_STATE_D3   3U
@@ -157,6 +163,15 @@ enum aq_priv_flags {
 	AQ_HW_LOOPBACK_PHYEXT_SYS,
 };
 
+enum {
+	AQ_HW_PTP_DISABLE = 0,
+	AQ_HW_PTP_L2_ENABLE = BIT(1),
+	AQ_HW_PTP_L4_ENABLE = BIT(2),
+};
+
+#define ATL_TSG_CLOCK_SEL_0 0
+#define ATL_TSG_CLOCK_SEL_1 1
+
 #define AQ_HW_LOOPBACK_MASK	(BIT(AQ_HW_LOOPBACK_DMA_SYS) |\
 				 BIT(AQ_HW_LOOPBACK_PKT_SYS) |\
 				 BIT(AQ_HW_LOOPBACK_DMA_NET) |\
@@ -198,6 +213,7 @@ struct aq_hw_s {
 	u32 rpc_tid;
 	struct hw_atl_utils_fw_rpc rpc;
 	s64 ptp_clk_offset;
+	s8 clk_select;
 	u16 phy_id;
 	void *priv;
 };
@@ -325,11 +341,15 @@ struct aq_hw_ops {
 
 	int (*hw_ts_to_sys_clock)(struct aq_hw_s *self, u64 ts, u64 *time);
 
-	int (*hw_gpio_pulse)(struct aq_hw_s *self, u32 index, u64 start,
-			     u32 period);
+	int (*hw_gpio_pulse)(struct aq_hw_s *self, u32 index,
+			     u32 clk_sel, u64 start,
+			     u32 period, u32 hightime);
 
 	int (*hw_extts_gpio_enable)(struct aq_hw_s *self, u32 index,
-				    u32 enable);
+				    u32 channel, int enable);
+
+	void (*enable_ptp)(struct aq_hw_s *self, unsigned int param,
+			   int enable);
 
 	int (*hw_get_sync_ts)(struct aq_hw_s *self, u64 *ts);
 
@@ -339,6 +359,14 @@ struct aq_hw_ops {
 	int (*extract_hwts)(struct aq_hw_s *self, u8 *p, unsigned int len,
 			    u64 *timestamp);
 
+	u64 (*hw_ring_tx_ptp_get_ts)(struct aq_ring_s *ring);
+
+	int (*hw_tx_ptp_ring_init)(struct aq_hw_s *self,
+				   struct aq_ring_s *aq_ring);
+	int (*hw_rx_ptp_ring_init)(struct aq_hw_s *self,
+				   struct aq_ring_s *aq_ring);
+	u32 (*hw_get_clk_sel)(struct aq_hw_s *self);
+
 	int (*hw_set_fc)(struct aq_hw_s *self, u32 fc, u32 tc);
 
 	int (*hw_set_loopback)(struct aq_hw_s *self, u32 mode, bool enable);
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c b/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c
index 9df8918216f6..7486a28d7ff8 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c
@@ -380,7 +380,7 @@ static int aq_ptp_hw_pin_conf(struct aq_nic_s *aq_nic, u32 pin_index, u64 start,
 	 */
 	mutex_lock(&aq_nic->fwreq_mutex);
 	aq_nic->aq_hw_ops->hw_gpio_pulse(aq_nic->aq_hw, pin_index,
-					 start, (u32)period);
+					 0, start, (u32)period, 0);
 	mutex_unlock(&aq_nic->fwreq_mutex);
 
 	return 0;
@@ -454,7 +454,7 @@ static void aq_ptp_extts_pin_ctrl(struct aq_ptp_s *aq_ptp)
 
 	if (aq_nic->aq_hw_ops->hw_extts_gpio_enable)
 		aq_nic->aq_hw_ops->hw_extts_gpio_enable(aq_nic->aq_hw, 0,
-							enable);
+							0, enable);
 }
 
 static int aq_ptp_extts_pin_configure(struct ptp_clock_info *ptp,
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.h b/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
index d627ace850ff..e578fe04d22c 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
@@ -85,7 +85,9 @@ struct __packed aq_ring_buff_s {
 			u32 is_error:1;
 			u32 is_vlan:1;
 			u32 is_lro:1;
-			u32 rsvd3:3;
+			u32 request_ts:1;
+			u32 clk_sel:1;
+			u32 rsvd3:1;
 			u16 eop_index;
 			u16 rsvd4;
 		};
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
index c7895bfb2ecf..6c25ad264b19 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
@@ -736,6 +736,15 @@ int hw_atl_b0_hw_ring_tx_xmit(struct aq_hw_s *self, struct aq_ring_s *ring,
 				txd->ctl |= HW_ATL_B0_TXD_CTL_CMD_WB;
 				is_gso = false;
 				is_vlan = false;
+
+				if (ATL_HW_IS_CHIP_FEATURE(self, ANTIGUA) &&
+				    unlikely(buff->request_ts)) {
+					txd->ctl |= AQ_HW_TXD_CTL_TS_EN;
+					if (buff->clk_sel != ATL_TSG_CLOCK_SEL_1)
+						txd->ctl |= AQ_HW_TXD_CTL_TS_TSG0;
+					/* The only DD+TS is required */
+					txd->ctl &= ~HW_ATL_B0_TXD_CTL_CMD_WB;
+				}
 			}
 		}
 		ring->sw_tail = aq_ring_next_dx(ring, ring->sw_tail);
@@ -1323,8 +1332,8 @@ static int hw_atl_b0_adj_clock_freq(struct aq_hw_s *self, s32 ppb)
 	return self->aq_fw_ops->send_fw_request(self, &fwreq, size);
 }
 
-static int hw_atl_b0_gpio_pulse(struct aq_hw_s *self, u32 index,
-				u64 start, u32 period)
+static int hw_atl_b0_gpio_pulse(struct aq_hw_s *self, u32 index, u32 clk_sel,
+				u64 start, u32 period, u32 hightime)
 {
 	struct hw_fw_request_iface fwreq;
 	size_t size;
@@ -1342,7 +1351,7 @@ static int hw_atl_b0_gpio_pulse(struct aq_hw_s *self, u32 index,
 }
 
 static int hw_atl_b0_extts_gpio_enable(struct aq_hw_s *self, u32 index,
-				       u32 enable)
+				       u32 channel, int enable)
 {
 	/* Enable/disable Sync1588 GPIO Timestamping */
 	aq_phy_write_reg(self, MDIO_MMD_PCS, 0xc611, enable ? 0x71 : 0);
-- 
2.43.0



* [PATCH net-next 7/9] net: atlantic: add AQC113 PTP traffic class and TX path setup
From: sukhdeeps @ 2026-05-06 13:57 UTC (permalink / raw)
  To: netdev
  Cc: irusskikh, epomozov, richardcochran, andrew+netdev, davem,
	edumazet, kuba, pabeni, linux-kernel, Sukhdeep Singh
In-Reply-To: <20260506135706.2834-1-sukhdeeps@marvell.com>

From: Sukhdeep Singh <sukhdeeps@marvell.com>

Add PTP traffic class (TC) buffer reservation and TX path
improvements for AQC113:

- Reserve dedicated TX and RX buffer space for PTP TC when PTP is
  enabled, reducing user TC buffers accordingly (TX: 8KB, RX: 16KB).
- Configure PTP TC with no flow control and highest priority
  scheduling to ensure timely PTP packet transmission.
- Enable multicast frame tagging (accept_all_mc_packets) so the
  Action Resolver Table (ART) can match and steer PTP multicast
  traffic to the correct TC/queue based on RPF input tags.
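
As a rough sketch of the buffer accounting in the first bullet: the PTP
share is subtracted from the total before the remainder is split among
the user TCs. The sizes (128/8 for TX, 192/16 for RX) come from this
patch; example_user_tc_buf() is a hypothetical helper, and the even
division by the number of TCs is an assumption about code outside this
diff, not something the patch itself shows.

```c
#include <stdbool.h>
#include <stdint.h>

#define HW_ATL2_TXBUF_MAX      128U /* from this patch */
#define HW_ATL2_PTP_TXBUF_SIZE   8U /* from this patch */
#define HW_ATL2_RXBUF_MAX      192U /* from this patch */
#define HW_ATL2_PTP_RXBUF_SIZE  16U /* from this patch */

/* Hypothetical helper: buffer left per user TC after the PTP TC
 * reservation. Assumes an even split across 'tcs' user TCs.
 */
static uint32_t example_user_tc_buf(uint32_t total, uint32_t ptp_size,
				    bool is_ptp, uint32_t tcs)
{
	if (is_ptp)
		total -= ptp_size;

	return total / tcs;
}
```

Under that assumption, four user TCs with PTP enabled would get 30 units
of TX buffer each instead of 32, and 44 units of RX buffer instead of 48.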

TX path improvements:
- Enable extended PCIe tag mode (32-255) when hardware supports it,
  with increased TX data and descriptor read request limits for
  improved throughput.

Also simplify the RSS queue calculation in hw_atl2_hw_rss_set() by
extracting it into a local variable, and use unsigned types for the
loop variables to match their usage.

Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
---
 .../aquantia/atlantic/hw_atl2/hw_atl2.c       | 52 ++++++++++++++++---
 .../atlantic/hw_atl2/hw_atl2_internal.h       |  4 +-
 2 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
index e58bfff38670..c71e8d1adfc9 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
@@ -151,6 +151,24 @@ static int hw_atl2_hw_reset(struct aq_hw_s *self)
 	return err;
 }
 
+static int hw_atl2_tc_ptp_set(struct aq_hw_s *self)
+{
+	/* Init TC2 for PTP_TX */
+	hw_atl_tpb_tx_pkt_buff_size_per_tc_set(self, HW_ATL2_PTP_TXBUF_SIZE,
+					       AQ_HW_PTP_TC);
+
+	/* Init TC2 for PTP_RX */
+	hw_atl_rpb_rx_pkt_buff_size_per_tc_set(self, HW_ATL2_PTP_RXBUF_SIZE,
+					       AQ_HW_PTP_TC);
+
+	/* No flow control for PTP */
+	hw_atl_rpb_rx_xoff_en_per_tc_set(self, 0U, AQ_HW_PTP_TC);
+
+	hw_atl2_tpb_tps_highest_priority_tc_set(self, AQ_HW_PTP_TC);
+
+	return aq_hw_err_from_flags(self);
+}
+
 static int hw_atl2_hw_queue_to_tc_map_set(struct aq_hw_s *self)
 {
 	struct aq_nic_cfg_s *cfg = self->aq_nic_cfg;
@@ -209,6 +227,11 @@ static int hw_atl2_hw_qos_set(struct aq_hw_s *self)
 	unsigned int prio = 0U;
 	u32 tc = 0U;
 
+	if (cfg->is_ptp) {
+		tx_buff_size -= HW_ATL2_PTP_TXBUF_SIZE;
+		rx_buff_size -= HW_ATL2_PTP_RXBUF_SIZE;
+	}
+
 	/* TPS Descriptor rate init */
 	hw_atl_tps_tx_pkt_shed_desc_rate_curr_time_res_set(self, 0x0U);
 	hw_atl_tps_tx_pkt_shed_desc_rate_lim_set(self, 0xA);
@@ -242,6 +265,9 @@ static int hw_atl2_hw_qos_set(struct aq_hw_s *self)
 		hw_atl_b0_set_fc(self, self->aq_nic_cfg->fc.req, tc);
 	}
 
+	if (cfg->is_ptp)
+		hw_atl2_tc_ptp_set(self);
+
 	/* QoS 802.1p priority -> TC mapping */
 	for (prio = 0; prio < 8; ++prio)
 		hw_atl_rpf_rpb_user_priority_tc_map_set(self, prio,
@@ -259,8 +285,9 @@ static int hw_atl2_hw_rss_set(struct aq_hw_s *self,
 	u8 *indirection_table = rss_params->indirection_table;
 	const u32 num_tcs = aq_hw_num_tcs(self);
 	u32 rpf_redir2_enable;
-	int tc;
-	int i;
+	u32 queue;
+	u32 tc;
+	u32 i;
 
 	rpf_redir2_enable = num_tcs > 4 ? 1 : 0;
 
@@ -268,10 +295,9 @@ static int hw_atl2_hw_rss_set(struct aq_hw_s *self,
 
 	for (i = HW_ATL2_RSS_REDIRECTION_MAX; i--;) {
 		for (tc = 0; tc != num_tcs; tc++) {
-			hw_atl2_new_rpf_rss_redir_set(self, tc, i,
-						      tc *
-						      aq_hw_q_per_tc(self) +
-						      indirection_table[i]);
+			queue = tc * aq_hw_q_per_tc(self) +
+				indirection_table[i];
+			hw_atl2_new_rpf_rss_redir_set(self, tc, i, queue);
 		}
 	}
 
@@ -415,9 +441,20 @@ static int hw_atl2_hw_init_tx_path(struct aq_hw_s *self)
 
 	hw_atl2_tpb_tx_buf_clk_gate_en_set(self, 0U);
 
+	if (hw_atl2_phi_ext_tag_get(self)) {
+		hw_atl2_tdm_tx_data_read_req_limit_set(self, 0x7F);
+		hw_atl2_tdm_tx_desc_read_req_limit_set(self, 0x0F);
+	}
+
 	return aq_hw_err_from_flags(self);
 }
 
+/* Initialise new rx filters
+ * L2 promisc OFF
+ * VLAN promisc OFF
+ *
+ * User priority to TC
+ */
 static void hw_atl2_hw_init_new_rx_filters(struct aq_hw_s *self)
 {
 	u8 *prio_tc_map = self->aq_nic_cfg->prio_tc_map;
@@ -429,6 +466,9 @@ static void hw_atl2_hw_init_new_rx_filters(struct aq_hw_s *self)
 	u8 index;
 	int i;
 
+	/* tag MC frames always */
+	hw_atl_rpfl2_accept_all_mc_packets_set(self, 1);
+
 	/* Action Resolver Table (ART) is used by RPF to decide which action
 	 * to take with a packet based upon input tag and tag mask, where:
 	 *  - input tag is a combination of 3-bit VLan Prio (PTP) and
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h
index fc086d84fb91..31d7cae6641a 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h
@@ -27,8 +27,10 @@
 #define HW_ATL2_INT_MASK  (0xFFFFFFFFU)
 
 #define HW_ATL2_TXBUF_MAX              128U
-#define HW_ATL2_RXBUF_MAX              192U
+#define HW_ATL2_PTP_TXBUF_SIZE           8U
 
+#define HW_ATL2_RXBUF_MAX              192U
+#define HW_ATL2_PTP_RXBUF_SIZE          16U
 #define HW_ATL2_RSS_REDIRECTION_MAX 64U
 
 #define HW_ATL2_TC_MAX 8U
-- 
2.43.0



* [PATCH net-next 5/9] net: atlantic: add AQC113 filter data structures and firmware query
From: sukhdeeps @ 2026-05-06 13:57 UTC (permalink / raw)
  To: netdev
  Cc: irusskikh, epomozov, richardcochran, andrew+netdev, davem,
	edumazet, kuba, pabeni, linux-kernel, Sukhdeep Singh
In-Reply-To: <20260506135706.2834-1-sukhdeeps@marvell.com>

From: Sukhdeep Singh <sukhdeeps@marvell.com>

Add filter infrastructure for AQC113 hardware:

- Define L3 (IPv4/IPv6), L4 (TCP/UDP/SCTP), and combined L3L4 filter
  structures with reference-counted sharing support.
- Define tag policy structure for ethertype filter management.
- Add RPF L3/L4 command bit definitions for filter programming.
- Add filter count constants for L3L4, L3V4, L4, VLAN, and ethertype.
- Extend hw_atl2_priv with filter arrays, base indices, and counts
  discovered from firmware.

Query filter capabilities from firmware shared memory at init time
to discover available L2/L3/L4/VLAN/ethertype filter resources and
ART (Action Resolver Table) configuration.
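
To make the VLAN part of that query concrete, here is a standalone
sketch of the clamping that hw_atl2_utils_get_filter_caps() applies to
the firmware-reported VLAN filter count. example_vlan_filter_count() is
a hypothetical name; the arithmetic follows the diff below, with min_t()
replaced by a plain comparison.

```c
#include <stdint.h>

#define HW_ATL2_RPF_VLAN_FILTERS 16 /* from this patch */

/* Hypothetical helper: usable VLAN filters for a firmware-reported
 * count. Two entries are held back (0 - no tag, 1 - reserved for
 * vlan-filter-offload filters).
 */
static uint32_t example_vlan_filter_count(uint32_t fw_count)
{
	uint32_t tag_top;

	/* Full table: 14 tags; anything else: half the table minus 2. */
	tag_top = (fw_count == HW_ATL2_RPF_VLAN_FILTERS) ?
		  (HW_ATL2_RPF_VLAN_FILTERS - 2) :
		  (HW_ATL2_RPF_VLAN_FILTERS / 2 - 2);

	if (fw_count <= 2)
		return 0;

	return (fw_count - 2 < tag_top) ? fw_count - 2 : tag_top;
}
```

A firmware count of 16 yields 14 usable filters, a count of 8 or 12
yields 6, and a count of 2 or less yields none.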

Add hardware register dump utility for AQC113 debug support.

Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
---
 .../atlantic/hw_atl2/hw_atl2_internal.h       | 63 +++++++++++++++++++
 .../aquantia/atlantic/hw_atl2/hw_atl2_utils.c | 33 ++++++++++
 .../aquantia/atlantic/hw_atl2/hw_atl2_utils.h |  5 ++
 .../atlantic/hw_atl2/hw_atl2_utils_fw.c       | 52 +++++++++++++++
 4 files changed, 153 insertions(+)

diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h
index 5a89bb8722f9..fc086d84fb91 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h
@@ -84,6 +84,18 @@ enum HW_ATL2_RPF_ART_INDEX {
 					  HW_ATL_VLAN_MAX_FILTERS,
 };
 
+#define HW_ATL2_RPF_L3_CMD_EN       BIT(0)
+#define HW_ATL2_RPF_L3_CMD_SA_EN    BIT(1)
+#define HW_ATL2_RPF_L3_CMD_DA_EN    BIT(2)
+#define HW_ATL2_RPF_L3_CMD_PROTO_EN BIT(3)
+#define HW_ATL2_RPF_L3_V6_CMD_EN       BIT(0x10)
+#define HW_ATL2_RPF_L3_V6_CMD_SA_EN    BIT(0x11)
+#define HW_ATL2_RPF_L3_V6_CMD_DA_EN    BIT(0x12)
+#define HW_ATL2_RPF_L3_V6_CMD_PROTO_EN BIT(0x13)
+#define HW_ATL2_RPF_L4_CMD_EN       BIT(0)
+#define HW_ATL2_RPF_L4_CMD_DP_EN    BIT(1)
+#define HW_ATL2_RPF_L4_CMD_SP_EN    BIT(2)
+
 #define HW_ATL2_ACTION(ACTION, RSS, INDEX, VALID) \
 	((((ACTION) & 0x3U) << 8) | \
 	(((RSS) & 0x1U) << 7) | \
@@ -94,6 +106,12 @@ enum HW_ATL2_RPF_ART_INDEX {
 #define HW_ATL2_ACTION_DISABLE HW_ATL2_ACTION(0, 0, 0, 0)
 #define HW_ATL2_ACTION_ASSIGN_QUEUE(QUEUE) HW_ATL2_ACTION(1, 0, (QUEUE), 1)
 #define HW_ATL2_ACTION_ASSIGN_TC(TC) HW_ATL2_ACTION(1, 1, (TC), 1)
+#define HW_ATL2_RPF_L3L4_FILTERS 8
+#define HW_ATL2_RPF_L3V4_FILTERS 8
+#define HW_ATL2_RPF_L4_FILTERS 8
+#define HW_ATL2_RPF_VLAN_FILTERS 16
+#define HW_ATL2_RPF_ETYPE_FILTERS 16
+#define HW_ATL2_RPF_ETYPE_TAGS 7
 
 enum HW_ATL2_RPF_RSS_HASH_TYPE {
 	HW_ATL2_RPF_RSS_HASH_TYPE_NONE = 0,
@@ -119,9 +137,54 @@ enum HW_ATL2_RPF_RSS_HASH_TYPE {
 
 #define HW_ATL_MCAST_FLT_ANY_TO_HOST 0x00010FFFU
 
+struct hw_atl2_l3_filter {
+	u8 proto;
+	u8 usage;
+	u32 cmd;
+	u32 srcip[4];
+	u32 dstip[4];
+};
+
+struct hw_atl2_l4_filter {
+	u8 usage;
+	u32 cmd;
+	u16 sport;
+	u16 dport;
+};
+
+struct hw_atl2_l3l4_filter {
+	s8 l3_index;
+	s8 l4_index;
+	u8 ipv6;
+};
+
+struct hw_atl2_tag_policy {
+	u16 action;
+	u16 usage;
+};
+
 struct hw_atl2_priv {
+	struct hw_atl2_l3_filter l3_v4_filters[HW_ATL2_RPF_L3L4_FILTERS];
+	struct hw_atl2_l3_filter l3_v6_filters[HW_ATL2_RPF_L3L4_FILTERS];
+	struct hw_atl2_l4_filter l4_filters[HW_ATL2_RPF_L3L4_FILTERS];
+	struct hw_atl2_l3l4_filter l3l4_filters[HW_ATL2_RPF_L3L4_FILTERS];
+	struct hw_atl2_tag_policy etype_policy[HW_ATL2_RPF_ETYPE_FILTERS];
 	struct statistics_s last_stats;
 	unsigned int art_base_index;
+	unsigned int art_count;
+	unsigned int l2_filters_base_index;
+	unsigned int l2_filter_count;
+	unsigned int etype_filter_base_index;
+	unsigned int etype_filter_count;
+	unsigned int etype_filter_tag_top;
+	unsigned int vlan_filter_base_index;
+	unsigned int vlan_filter_count;
+	unsigned int l3_v4_filter_base_index;
+	unsigned int l3_v4_filter_count;
+	unsigned int l3_v6_filter_base_index;
+	unsigned int l3_v6_filter_count;
+	unsigned int l4_filter_base_index;
+	unsigned int l4_filter_count;
 };
 
 #endif /* HW_ATL2_INTERNAL_H */
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.c
index 0fe6257d9c08..ffd723dcfb63 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.c
@@ -128,3 +128,36 @@ int hw_atl2_utils_soft_reset(struct aq_hw_s *self)
 err_exit:
 	return err;
 }
+
+static const u32 hw_atl2_utils_hw_mac_regs[] = {
+	0x00005580U, 0x00005590U, 0x000055B0U, 0x000055B4U,
+	0x000055C0U, 0x00005B00U, 0x00005B04U, 0x00005B08U,
+	0x00005B0CU, 0x00005B10U, 0x00005B14U, 0x00005B18U,
+	0x00005B1CU, 0x00005B20U, 0x00005B24U, 0x00005B28U,
+	0x00005B2CU, 0x00005B30U, 0x00005B34U, 0x00005B38U,
+	0x00005B3CU, 0x00005B40U, 0x00005B44U, 0x00005B48U,
+	0x00005B4CU, 0x00005B50U, 0x00005B54U, 0x00005B58U,
+	0x00005B5CU, 0x00005B60U, 0x00005B64U, 0x00005B68U,
+	0x00005B6CU, 0x00005B70U, 0x00005B74U, 0x00005B78U,
+	0x00005B7CU, 0x00007C00U, 0x00007C04U, 0x00007C08U,
+	0x00007C0CU, 0x00007C10U, 0x00007C14U, 0x00007C18U,
+	0x00007C1CU, 0x00007C20U, 0x00007C40U, 0x00007C44U,
+	0x00007C48U, 0x00007C4CU, 0x00007C50U, 0x00007C54U,
+	0x00007C58U, 0x00007C5CU, 0x00007C60U, 0x00007C80U,
+	0x00007C84U, 0x00007C88U, 0x00007C8CU, 0x00007C90U,
+	0x00007C94U, 0x00007C98U, 0x00007C9CU, 0x00007CA0U,
+	0x00007CC0U, 0x00007CC4U, 0x00007CC8U, 0x00007CCCU,
+	0x00007CD0U, 0x00007CD4U, 0x00007CD8U, 0x00007CDCU,
+};
+
+int hw_atl2_utils_hw_get_regs(struct aq_hw_s *self,
+			      const struct aq_hw_caps_s *aq_hw_caps,
+			      u32 *regs_buff)
+{
+	unsigned int i;
+
+	for (i = 0; i < aq_hw_caps->mac_regs_count; i++)
+		regs_buff[i] = aq_hw_read_reg(self,
+					      hw_atl2_utils_hw_mac_regs[i]);
+	return 0;
+}
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.h b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.h
index 6bad64c77b87..c84955bc14ae 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.h
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.h
@@ -626,10 +626,15 @@ int hw_atl2_utils_initfw(struct aq_hw_s *self, const struct aq_fw_ops **fw_ops);
 
 int hw_atl2_utils_soft_reset(struct aq_hw_s *self);
 
+int hw_atl2_utils_hw_get_regs(struct aq_hw_s *self,
+			      const struct aq_hw_caps_s *aq_hw_caps,
+			      u32 *regs_buff);
+
 u32 hw_atl2_utils_get_fw_version(struct aq_hw_s *self);
 
 int hw_atl2_utils_get_action_resolve_table_caps(struct aq_hw_s *self,
 						u8 *base_index, u8 *count);
+int hw_atl2_utils_get_filter_caps(struct aq_hw_s *self);
 
 extern const struct aq_fw_ops aq_a2_fw_ops;
 
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils_fw.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils_fw.c
index 7370e3f76b62..546b48f897d3 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils_fw.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils_fw.c
@@ -121,6 +121,10 @@ static int aq_a2_fw_init(struct aq_hw_s *self)
 	u32 val;
 	int err;
 
+	err = hw_atl2_utils_get_filter_caps(self);
+	if (err)
+		return err;
+
 	hw_atl2_shared_buffer_get(self, link_control, link_control);
 	link_control.mode = AQ_HOST_MODE_ACTIVE;
 	hw_atl2_shared_buffer_write(self, link_control, link_control);
@@ -606,6 +610,54 @@ u32 hw_atl2_utils_get_fw_version(struct aq_hw_s *self)
 	       version.bundle.build;
 }
 
+int hw_atl2_utils_get_filter_caps(struct aq_hw_s *self)
+{
+	struct hw_atl2_priv *priv = self->priv;
+	struct filter_caps_s filter_caps;
+	u32 tag_top;
+	int err;
+
+	err = hw_atl2_shared_buffer_read_safe(self, filter_caps, &filter_caps);
+	if (err)
+		return err;
+
+	priv->art_base_index = filter_caps.rslv_tbl_base_index * 8;
+	priv->art_count = filter_caps.rslv_tbl_count * 8;
+	if (priv->art_count == 0)
+		priv->art_count = 128;
+	priv->l2_filters_base_index = filter_caps.l2_filters_base_index;
+	priv->l2_filter_count = filter_caps.l2_filter_count;
+	priv->etype_filter_base_index = filter_caps.ethertype_filter_base_index;
+	priv->etype_filter_count = filter_caps.ethertype_filter_count;
+	priv->etype_filter_tag_top =
+		(priv->etype_filter_count >= HW_ATL2_RPF_ETYPE_TAGS) ?
+		 (HW_ATL2_RPF_ETYPE_TAGS) : (HW_ATL2_RPF_ETYPE_TAGS >> 1);
+	priv->vlan_filter_base_index = filter_caps.vlan_filter_base_index;
+	/* 0 - no tag, 1 - reserved for vlan-filter-offload filters */
+	tag_top =
+		  (filter_caps.vlan_filter_count == HW_ATL2_RPF_VLAN_FILTERS) ?
+		  (HW_ATL2_RPF_VLAN_FILTERS - 2) :
+		  (HW_ATL2_RPF_VLAN_FILTERS / 2 - 2);
+
+	if (filter_caps.vlan_filter_count > 2)
+		priv->vlan_filter_count = min_t(u32,
+						filter_caps.vlan_filter_count - 2,
+						tag_top);
+	else
+		priv->vlan_filter_count = 0;
+
+	priv->l3_v4_filter_base_index = filter_caps.l3_ip4_filter_base_index;
+	priv->l3_v4_filter_count = min_t(u32, filter_caps.l3_ip4_filter_count,
+					 HW_ATL2_RPF_L3V4_FILTERS - 1);
+	priv->l3_v6_filter_base_index = filter_caps.l3_ip6_filter_base_index;
+	priv->l3_v6_filter_count = filter_caps.l3_ip6_filter_count;
+	priv->l4_filter_base_index = filter_caps.l4_filter_base_index;
+	priv->l4_filter_count = min_t(u32, filter_caps.l4_filter_count,
+				      HW_ATL2_RPF_L4_FILTERS - 1);
+
+	return 0;
+}
+
 int hw_atl2_utils_get_action_resolve_table_caps(struct aq_hw_s *self,
 						u8 *base_index, u8 *count)
 {
-- 
2.43.0



* [PATCH net-next 6/9] net: atlantic: implement AQC113 L2/L3/L4 RX filter management
From: sukhdeeps @ 2026-05-06 13:57 UTC (permalink / raw)
  To: netdev
  Cc: irusskikh, epomozov, richardcochran, andrew+netdev, davem,
	edumazet, kuba, pabeni, linux-kernel, Sukhdeep Singh
In-Reply-To: <20260506135706.2834-1-sukhdeeps@marvell.com>

From: Sukhdeep Singh <sukhdeeps@marvell.com>

Implement complete RX filter management for AQC113 hardware:

- Add tag-based filter policy with reference-counted sharing, allowing
  multiple filter rules to share the same L3 or L4 hardware filter
  when their match criteria are identical.
- Implement L3 (IPv4/IPv6 source/destination address and protocol)
  filter find, get (program HW and increment refcount), and put
  (decrement refcount and clear HW when last user releases).
- Implement L4 (TCP/UDP/SCTP source/destination port) filter
  management with the same find/get/put pattern.
- Add combined L3L4 filter configuration that translates legacy
  aq_rx_filter_l3l4 commands into AQC113 separate L3+L4 filter
  programming with Action Resolver Table (ART) entries.
- Add L2 ethertype filter set/clear with tag-based ART integration.
- Add MAC address setup using firmware-provided L2 filter base index.
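
The reference-counted tag sharing from the first bullet can be tried
outside the driver with a small sketch. The logic follows
hw_atl2_filter_tag_get()/hw_atl2_filter_tag_put() from the diff below;
the example_ names and the standalone struct are hypothetical stand-ins
for the driver types.

```c
#include <stdint.h>

/* Stand-in for struct hw_atl2_tag_policy from this series. */
struct example_tag_policy {
	uint16_t action;
	uint16_t usage;
};

/* Reuse a tag that already carries 'action', else claim a free one.
 * Tag 0 is never handed out; 'top' is the highest valid tag.
 */
static int example_tag_get(struct example_tag_policy *tags, int top,
			   uint16_t action)
{
	int i;

	for (i = 1; i <= top; i++)
		if (tags[i].usage > 0 && tags[i].action == action) {
			tags[i].usage++;
			return i;
		}

	for (i = 1; i <= top; i++)
		if (tags[i].usage == 0) {
			tags[i].usage = 1;
			tags[i].action = action;
			return i;
		}

	return -1;
}

/* Drop one reference; the tag becomes free again at usage 0. */
static void example_tag_put(struct example_tag_policy *tags, int tag)
{
	if (tags[tag].usage > 0)
		tags[tag].usage--;
}
```

Two requests with the same action share one tag and bump its usage
count to 2; a request with a different action gets the next free tag.
Once both references are put, the first tag can be reused for a new
action.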

Update hardware initialization:
- Use firmware-reported ART section base and count instead of
  hardcoded 0xFFFF section enable.
- Enable L3 v6/v4 select mode for simultaneous IPv4/IPv6 filtering.
- Initialize L3L4 filter indices to -1 on reset.
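
As a sketch of the ART section mask from the first bullet: each section
covers 8 ART entries, and the mask enables every section from the
firmware-reported base up to base + count. example_art_section_mask() is
a hypothetical helper that mirrors the arithmetic in the diff below; it
uses a 32-bit shift and therefore assumes the last section index stays
below 32.

```c
#include <stdint.h>

/* Hypothetical helper: enable-mask for the ART sections that cover
 * [base_index, base_index + count). Assumes last section < 32.
 */
static uint32_t example_art_section_mask(uint32_t base_index,
					 uint32_t count)
{
	uint32_t first = base_index / 8;
	uint32_t last = first + count / 8;

	/* Bits first..last-1 set, everything else clear. */
	return ((UINT32_C(1) << last) - 1) - ((UINT32_C(1) << first) - 1);
}
```

With base index 0 and 128 entries this gives 0xFFFF, the value that was
previously hardcoded. With base index 16 the two lowest sections stay
out of the mask and the result is 0x3FFFC.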

Wire up hw_filter_l2_set, hw_filter_l2_clear, hw_filter_l3l4_set,
hw_set_mac_address, hw_get_version, and hw_get_regs in hw_atl2_ops.

Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
---
 .../net/ethernet/aquantia/atlantic/aq_hw.h    |   2 +
 .../aquantia/atlantic/hw_atl2/hw_atl2.c       | 582 +++++++++++++++++-
 2 files changed, 580 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_hw.h b/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
index 57ea59026a2c..04fb87d4e56d 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
@@ -236,6 +236,8 @@ struct aq_hw_ops {
 
 	int (*hw_stop)(struct aq_hw_s *self);
 
+	u32 (*hw_get_version)(struct aq_hw_s *self);
+
 	int (*hw_ring_tx_init)(struct aq_hw_s *self, struct aq_ring_s *aq_ring,
 			       struct aq_ring_param_s *aq_ring_param);
 
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
index 0ce9caae8799..e58bfff38670 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
@@ -11,6 +11,7 @@
 #include "hw_atl/hw_atl_utils.h"
 #include "hw_atl/hw_atl_llh.h"
 #include "hw_atl/hw_atl_llh_internal.h"
+#include "hw_atl2.h"
 #include "hw_atl2_utils.h"
 #include "hw_atl2_llh.h"
 #include "hw_atl2_internal.h"
@@ -86,6 +87,38 @@ const struct aq_hw_caps_s hw_atl2_caps_aqc116c = {
 			  AQ_NIC_RATE_10M,
 };
 
+/* Find tag with the same action or new free tag
+ *  top - top inclusive tag value
+ *  action - action for ActionResolverTable
+ */
+static int hw_atl2_filter_tag_get(struct hw_atl2_tag_policy *tags,
+				  int top, u16 action)
+{
+	int i;
+
+	for (i = 1; i <= top; i++)
+		if (tags[i].usage > 0 && tags[i].action == action) {
+			tags[i].usage++;
+			return i;
+		}
+
+	for (i = 1; i <= top; i++)
+		if (tags[i].usage == 0) {
+			tags[i].usage = 1;
+			tags[i].action = action;
+			return i;
+		}
+
+	return -1;
+}
+
+static void hw_atl2_filter_tag_put(struct hw_atl2_tag_policy *tags,
+				   int tag)
+{
+	if (tags[tag].usage > 0)
+		tags[tag].usage--;
+}
+
 static u32 hw_atl2_sem_act_rslvr_get(struct aq_hw_s *self)
 {
 	return hw_atl_reg_glb_cpu_sem_get(self, HW_ATL2_FW_SM_ACT_RSLVR);
@@ -95,12 +128,21 @@ static int hw_atl2_hw_reset(struct aq_hw_s *self)
 {
 	struct hw_atl2_priv *priv = self->priv;
 	int err;
+	int i;
 
 	err = hw_atl2_utils_soft_reset(self);
 	if (err)
 		return err;
 
-	memset(priv, 0, sizeof(*priv));
+	memset(&priv->last_stats, 0, sizeof(priv->last_stats));
+	memset(priv->l3_v4_filters, 0, sizeof(priv->l3_v4_filters));
+	memset(priv->l3_v6_filters, 0, sizeof(priv->l3_v6_filters));
+	memset(priv->l4_filters, 0, sizeof(priv->l4_filters));
+	memset(priv->etype_policy, 0, sizeof(priv->etype_policy));
+	for (i = 0; i < HW_ATL2_RPF_L3L4_FILTERS; i++) {
+		priv->l3l4_filters[i].l3_index = -1;
+		priv->l3l4_filters[i].l4_index = -1;
+	}
 
 	self->aq_fw_ops->set_state(self, MPI_RESET);
 
@@ -380,6 +422,9 @@ static void hw_atl2_hw_init_new_rx_filters(struct aq_hw_s *self)
 {
 	u8 *prio_tc_map = self->aq_nic_cfg->prio_tc_map;
 	struct hw_atl2_priv *priv = self->priv;
+	u32 art_first_sec, art_last_sec;
+	u32 art_sections;
+	u32 art_mask = 0;
 	u16 action;
 	u8 index;
 	int i;
@@ -394,9 +439,14 @@ static void hw_atl2_hw_init_new_rx_filters(struct aq_hw_s *self)
 	 * REC entry is used for further processing. If multiple entries match,
 	 * the lowest REC entry, Action field will be selected.
 	 */
-	hw_atl2_rpf_act_rslvr_section_en_set(self, 0xFFFF);
+	art_last_sec = priv->art_base_index / 8 + priv->art_count / 8;
+	art_first_sec = priv->art_base_index / 8;
+	art_mask = (BIT(art_last_sec) - 1) - (BIT(art_first_sec) - 1);
+	art_sections = hw_atl2_rpf_act_rslvr_section_en_get(self) | art_mask;
+	hw_atl2_rpf_act_rslvr_section_en_set(self, art_sections);
+	hw_atl2_rpf_l3_v6_v4_select_set(self, 1);
 	hw_atl2_rpfl2_uc_flr_tag_set(self, HW_ATL2_RPF_TAG_BASE_UC,
-				     HW_ATL2_MAC_UC);
+				     priv->l2_filters_base_index);
 	hw_atl2_rpfl2_bc_flr_tag_set(self, HW_ATL2_RPF_TAG_BASE_UC);
 
 	/* FW reserves the beginning of ART, thus all driver entries must
@@ -530,6 +580,35 @@ static int hw_atl2_hw_init_rx_path(struct aq_hw_s *self)
 	return aq_hw_err_from_flags(self);
 }
 
+static int hw_atl2_hw_mac_addr_set(struct aq_hw_s *self, const u8 *mac_addr)
+{
+	struct hw_atl2_priv *priv = self->priv;
+	u32 location = priv->l2_filters_base_index;
+	unsigned int h = 0U;
+	unsigned int l = 0U;
+	int err = 0;
+
+	if (!mac_addr) {
+		err = -EINVAL;
+		goto err_exit;
+	}
+	h = (mac_addr[0] << 8) | (mac_addr[1]);
+	l = (mac_addr[2] << 24) | (mac_addr[3] << 16) |
+		(mac_addr[4] << 8) | mac_addr[5];
+
+	hw_atl_rpfl2_uc_flr_en_set(self, 0U, location);
+	hw_atl_rpfl2unicast_dest_addresslsw_set(self, l, location);
+	hw_atl_rpfl2unicast_dest_addressmsw_set(self, h, location);
+	hw_atl_rpfl2unicast_flr_act_set(self, 1U, location);
+	hw_atl2_rpfl2_uc_flr_tag_set(self, HW_ATL2_RPF_TAG_BASE_UC, location);
+	hw_atl_rpfl2_uc_flr_en_set(self, 1U, location);
+
+	err = aq_hw_err_from_flags(self);
+
+err_exit:
+	return err;
+}
+
 static int hw_atl2_hw_init(struct aq_hw_s *self, const u8 *mac_addr)
 {
 	static u32 aq_hw_atl2_igcr_table_[4][2] = {
@@ -767,6 +846,496 @@ static struct aq_stats_s *hw_atl2_utils_get_hw_stats(struct aq_hw_s *self)
 	return &self->curr_stats;
 }
 
+static bool hw_atl2_rxf_l3_is_equal(struct hw_atl2_l3_filter *f1,
+				    struct hw_atl2_l3_filter *f2)
+{
+	if (f1->cmd != f2->cmd)
+		return false;
+
+	if (f1->cmd & HW_ATL2_RPF_L3_CMD_SA_EN)
+		if (f1->srcip[0] != f2->srcip[0])
+			return false;
+
+	if (f1->cmd & HW_ATL2_RPF_L3_CMD_DA_EN)
+		if (f1->dstip[0] != f2->dstip[0])
+			return false;
+
+	if (f1->cmd & (HW_ATL2_RPF_L3_CMD_PROTO_EN |
+		       HW_ATL2_RPF_L3_V6_CMD_PROTO_EN))
+		if (f1->proto != f2->proto)
+			return false;
+
+	if (f1->cmd & HW_ATL2_RPF_L3_V6_CMD_SA_EN)
+		if (memcmp(f1->srcip, f2->srcip, 16))
+			return false;
+
+	if (f1->cmd & HW_ATL2_RPF_L3_V6_CMD_DA_EN)
+		if (memcmp(f1->dstip, f2->dstip, 16))
+			return false;
+
+	return true;
+}
+
+static int hw_atl2_new_fl3l4_find_l3(struct aq_hw_s *self,
+				     struct hw_atl2_l3_filter *l3)
+{
+	struct hw_atl2_priv *priv = self->priv;
+	struct hw_atl2_l3_filter *l3_filters;
+	int i, first, last;
+
+	if (l3->cmd & HW_ATL2_RPF_L3_V6_CMD_EN) {
+		l3_filters = priv->l3_v6_filters;
+		first = priv->l3_v6_filter_base_index;
+		last = priv->l3_v6_filter_base_index +
+		       priv->l3_v6_filter_count;
+	} else {
+		l3_filters = priv->l3_v4_filters;
+		first = priv->l3_v4_filter_base_index;
+		last = priv->l3_v4_filter_base_index +
+		       priv->l3_v4_filter_count;
+	}
+	for (i = first; i < last; i++) {
+		if (hw_atl2_rxf_l3_is_equal(&l3_filters[i], l3))
+			return i;
+	}
+
+	for (i = first; i < last; i++) {
+		u32 l3_enable_mask = HW_ATL2_RPF_L3_CMD_EN |
+				     HW_ATL2_RPF_L3_V6_CMD_EN;
+
+		if (!(l3_filters[i].cmd & l3_enable_mask))
+			return i;
+	}
+
+	return -ENOSPC;
+}
+
+static void hw_atl2_rxf_l3_get(struct aq_hw_s *self,
+			       struct hw_atl2_l3_filter *l3, int idx,
+			       const struct hw_atl2_l3_filter *_l3)
+{
+	int i;
+
+	l3->usage++;
+	if (l3->usage == 1) {
+		l3->cmd = _l3->cmd;
+		for (i = 0; i < 4; i++) {
+			l3->srcip[i] = _l3->srcip[i];
+			l3->dstip[i] = _l3->dstip[i];
+		}
+		l3->proto = _l3->proto;
+
+		if (l3->cmd & HW_ATL2_RPF_L3_CMD_EN) {
+			hw_atl2_rpf_l3_v4_cmd_set(self, l3->cmd, idx);
+			hw_atl2_rpf_l3_v4_tag_set(self, idx + 1, idx);
+			hw_atl2_rpf_l3_v4_dest_addr_set(self,
+							idx,
+							l3->dstip[0]);
+			hw_atl2_rpf_l3_v4_src_addr_set(self,
+						       idx,
+						       l3->srcip[0]);
+		} else {
+			hw_atl2_rpf_l3_v6_cmd_set(self, l3->cmd, idx);
+			hw_atl2_rpf_l3_v6_tag_set(self, idx + 1, idx);
+			hw_atl2_rpf_l3_v6_dest_addr_set(self,
+							idx,
+							l3->dstip);
+			hw_atl2_rpf_l3_v6_src_addr_set(self,
+						       idx,
+						       l3->srcip);
+		}
+	}
+}
+
+static void hw_atl2_rxf_l3_put(struct aq_hw_s *self,
+			       struct hw_atl2_l3_filter *l3, int idx)
+{
+	if (l3->usage)
+		l3->usage--;
+
+	if (!l3->usage) {
+		if (l3->cmd & HW_ATL2_RPF_L3_V6_CMD_EN)
+			hw_atl2_rpf_l3_v6_cmd_set(self, 0, idx);
+		else
+			hw_atl2_rpf_l3_v4_cmd_set(self, 0, idx);
+		l3->cmd = 0;
+	}
+}
+
+static bool hw_atl2_rxf_l4_is_equal(struct hw_atl2_l4_filter *f1,
+				    struct hw_atl2_l4_filter *f2)
+{
+	if (f1->cmd != f2->cmd)
+		return false;
+
+	if (f1->cmd & HW_ATL2_RPF_L4_CMD_SP_EN)
+		if (f1->sport != f2->sport)
+			return false;
+
+	if (f1->cmd & HW_ATL2_RPF_L4_CMD_DP_EN)
+		if (f1->dport != f2->dport)
+			return false;
+
+	return true;
+}
+
+static int hw_atl2_new_fl3l4_find_l4(struct aq_hw_s *self,
+				     struct hw_atl2_l4_filter *l4)
+{
+	struct hw_atl2_priv *priv = self->priv;
+	int i, first, last;
+
+	first = priv->l4_filter_base_index;
+	last = priv->l4_filter_base_index + priv->l4_filter_count;
+
+	for (i = first; i < last; i++)
+		if (hw_atl2_rxf_l4_is_equal(&priv->l4_filters[i], l4))
+			return i;
+
+	for (i = first; i < last; i++)
+		if ((priv->l4_filters[i].cmd & HW_ATL2_RPF_L4_CMD_EN) == 0)
+			return i;
+
+	return -ENOSPC;
+}
+
+static void hw_atl2_rxf_l4_put(struct aq_hw_s *self,
+			       struct hw_atl2_l4_filter *l4, int idx)
+{
+	if (l4->usage)
+		l4->usage--;
+
+	if (!l4->usage) {
+		l4->cmd = 0;
+		hw_atl2_rpf_l4_cmd_set(self, l4->cmd, idx);
+	}
+}
+
+static void hw_atl2_rxf_l4_get(struct aq_hw_s *self,
+			       struct hw_atl2_l4_filter *l4, int idx,
+			       const struct hw_atl2_l4_filter *_l4)
+{
+	l4->usage++;
+	if (l4->usage == 1) {
+		l4->cmd = _l4->cmd;
+		l4->sport = _l4->sport;
+		l4->dport = _l4->dport;
+
+		hw_atl2_rpf_l4_cmd_set(self, l4->cmd, idx);
+		hw_atl2_rpf_l4_tag_set(self, idx + 1, idx);
+		hw_atl_rpf_l4_spd_set(self, l4->sport, idx);
+		hw_atl_rpf_l4_dpd_set(self, l4->dport, idx);
+	}
+}
+
+static int hw_atl2_new_fl3l4_configure(struct aq_hw_s *self,
+				       struct aq_rx_filter_l3l4 *data)
+{
+	struct hw_atl2_priv *priv = self->priv;
+	s8 old_l3_index = priv->l3l4_filters[data->location].l3_index;
+	s8 old_l4_index = priv->l3l4_filters[data->location].l4_index;
+	u8 old_ipv6 = priv->l3l4_filters[data->location].ipv6;
+	struct hw_atl2_l3_filter *l3_filters;
+	struct hw_atl2_l3_filter l3;
+	struct hw_atl2_l4_filter l4;
+	s8 l3_idx = -1;
+	s8 l4_idx = -1;
+
+	if (!(data->cmd & HW_ATL_RX_ENABLE_FLTR_L3L4))
+		return 0;
+
+	memset(&l3, 0, sizeof(l3));
+	memset(&l4, 0, sizeof(l4));
+
+	/* convert legacy filter to new */
+	if (data->cmd & HW_ATL_RX_ENABLE_CMP_PROT_L4) {
+		l3.cmd |= data->is_ipv6 ? HW_ATL2_RPF_L3_V6_CMD_PROTO_EN :
+					  HW_ATL2_RPF_L3_CMD_PROTO_EN;
+		l3.cmd |= data->is_ipv6 ? HW_ATL2_RPF_L3_V6_CMD_EN :
+					  HW_ATL2_RPF_L3_CMD_EN;
+		switch (data->cmd & 0x7) {
+		case HW_ATL_RX_TCP:
+			l3.cmd |= IPPROTO_TCP << (data->is_ipv6 ? 0x18 : 8);
+			break;
+		case HW_ATL_RX_UDP:
+			l3.cmd |= IPPROTO_UDP << (data->is_ipv6 ? 0x18 : 8);
+			break;
+		case HW_ATL_RX_SCTP:
+			l3.cmd |= IPPROTO_SCTP << (data->is_ipv6 ? 0x18 : 8);
+			break;
+		case HW_ATL_RX_ICMP:
+			l3.cmd |= IPPROTO_ICMP << (data->is_ipv6 ? 0x18 : 8);
+			break;
+		}
+	}
+
+	if (data->cmd & HW_ATL_RX_ENABLE_CMP_SRC_ADDR_L3) {
+		if (data->is_ipv6) {
+			l3.cmd |= HW_ATL2_RPF_L3_V6_CMD_SA_EN |
+				  HW_ATL2_RPF_L3_V6_CMD_EN;
+			memcpy(l3.srcip, data->ip_src, sizeof(l3.srcip));
+		} else {
+			l3.cmd |= HW_ATL2_RPF_L3_CMD_SA_EN |
+				  HW_ATL2_RPF_L3_CMD_EN;
+			l3.srcip[0] = data->ip_src[0];
+		}
+	}
+	if (data->cmd & HW_ATL_RX_ENABLE_CMP_DEST_ADDR_L3) {
+		if (data->is_ipv6) {
+			l3.cmd |= HW_ATL2_RPF_L3_V6_CMD_DA_EN |
+				  HW_ATL2_RPF_L3_V6_CMD_EN;
+			memcpy(l3.dstip, data->ip_dst, sizeof(l3.dstip));
+		} else {
+			l3.cmd |= HW_ATL2_RPF_L3_CMD_DA_EN |
+				  HW_ATL2_RPF_L3_CMD_EN;
+			l3.dstip[0] = data->ip_dst[0];
+		}
+	}
+
+	if (data->cmd & HW_ATL_RX_ENABLE_CMP_DEST_PORT_L4) {
+		l4.cmd |= HW_ATL2_RPF_L4_CMD_DP_EN | HW_ATL2_RPF_L4_CMD_EN;
+		l4.dport = data->p_dst;
+	}
+	if (data->cmd & HW_ATL_RX_ENABLE_CMP_SRC_PORT_L4) {
+		l4.cmd |= HW_ATL2_RPF_L4_CMD_SP_EN | HW_ATL2_RPF_L4_CMD_EN;
+		l4.sport = data->p_src;
+	}
+
+	/* find L3 and L4 filters */
+	if (l3.cmd & (HW_ATL2_RPF_L3_CMD_EN | HW_ATL2_RPF_L3_V6_CMD_EN)) {
+		l3_idx = hw_atl2_new_fl3l4_find_l3(self, &l3);
+		if (l3_idx < 0)
+			return l3_idx;
+
+		if (l3.cmd & HW_ATL2_RPF_L3_V6_CMD_EN)
+			l3_filters = priv->l3_v6_filters;
+		else
+			l3_filters = priv->l3_v4_filters;
+
+		if (priv->l3l4_filters[data->location].l3_index != l3_idx)
+			hw_atl2_rxf_l3_get(self, &l3_filters[l3_idx],
+					   l3_idx, &l3);
+	}
+
+	if (old_l3_index != -1) {
+		if (old_ipv6)
+			l3_filters = priv->l3_v6_filters;
+		else
+			l3_filters = priv->l3_v4_filters;
+
+		if (!(hw_atl2_rxf_l3_is_equal(&l3,
+					      &l3_filters[old_l3_index]))) {
+			hw_atl2_rxf_l3_put(self,
+					   &l3_filters[old_l3_index],
+					   old_l3_index);
+		}
+	}
+	if (l3.cmd & HW_ATL2_RPF_L3_V6_CMD_EN)
+		priv->l3l4_filters[data->location].ipv6 = 1;
+	else
+		priv->l3l4_filters[data->location].ipv6 = 0;
+	priv->l3l4_filters[data->location].l3_index = l3_idx;
+
+	if (l4.cmd & HW_ATL2_RPF_L4_CMD_EN) {
+		l4_idx = hw_atl2_new_fl3l4_find_l4(self, &l4);
+		if (l4_idx < 0) {
+			/* Undo L3 acquisition */
+			if (l3_idx >= 0) {
+				hw_atl2_rxf_l3_put(self, &l3_filters[l3_idx], l3_idx);
+				priv->l3l4_filters[data->location].l3_index = old_l3_index;
+				priv->l3l4_filters[data->location].ipv6 = old_ipv6;
+			}
+			return -EINVAL;
+		}
+
+		if (priv->l3l4_filters[data->location].l4_index != l4_idx)
+			hw_atl2_rxf_l4_get(self, &priv->l4_filters[l4_idx],
+					   l4_idx, &l4);
+	}
+
+	if (old_l4_index != -1) {
+		if (!(hw_atl2_rxf_l4_is_equal(&priv->l4_filters[old_l4_index],
+					      &l4))) {
+			hw_atl2_rxf_l4_put(self,
+					   &priv->l4_filters[old_l4_index],
+					   old_l4_index);
+		}
+	}
+	priv->l3l4_filters[data->location].l4_index = l4_idx;
+
+	return 0;
+}
+
+static int hw_atl2_hw_fl3l4_set(struct aq_hw_s *self,
+				struct aq_rx_filter_l3l4 *data)
+{
+	struct hw_atl2_priv *priv = self->priv;
+	struct hw_atl2_l3_filter *l3_filters;
+	struct hw_atl2_l3_filter *l3 = NULL;
+	struct hw_atl2_l4_filter *l4 = NULL;
+	u8 location = data->location;
+	u32 req_tag = 0;
+	u16 action = 0;
+	int l3_index;
+	int l4_index;
+	u32 mask = 0;
+	u8 index;
+	u8 ipv6;
+	int res;
+
+	res = hw_atl2_new_fl3l4_configure(self, data);
+	if (res)
+		return res;
+
+	l3_index = priv->l3l4_filters[location].l3_index;
+	l4_index = priv->l3l4_filters[location].l4_index;
+	ipv6 = priv->l3l4_filters[location].ipv6;
+	if (ipv6)
+		l3_filters = priv->l3_v6_filters;
+	else
+		l3_filters = priv->l3_v4_filters;
+
+	if (!(data->cmd & HW_ATL_RX_ENABLE_FLTR_L3L4)) {
+		if (l3_index > -1)
+			hw_atl2_rxf_l3_put(self, &l3_filters[l3_index],
+					   l3_index);
+
+		if (l4_index > -1)
+			hw_atl2_rxf_l4_put(self, &priv->l4_filters[l4_index],
+					   l4_index);
+
+		priv->l3l4_filters[location].l3_index = -1;
+		priv->l3l4_filters[location].l4_index = -1;
+		index = priv->art_base_index + HW_ATL2_RPF_L3L4_USER_INDEX +
+			location;
+		hw_atl2_act_rslvr_table_set(self, index, 0, 0,
+					    HW_ATL2_ACTION_DISABLE);
+
+		return 0;
+	}
+
+	if (l3_index != -1)
+		l3 = &l3_filters[l3_index];
+	if (l4_index != -1)
+		l4 = &priv->l4_filters[l4_index];
+
+	if (l4 && (l4->cmd & HW_ATL2_RPF_L4_CMD_EN)) {
+		req_tag |= (l4_index + 1) << HW_ATL2_RPF_TAG_L4_OFFSET;
+		mask |= HW_ATL2_RPF_TAG_L4_MASK;
+	}
+
+	if (l3) {
+		if (l3->cmd & HW_ATL2_RPF_L3_V6_CMD_EN) {
+			req_tag |= (l3_index + 1) <<
+				   HW_ATL2_RPF_TAG_L3_V6_OFFSET;
+			mask |= HW_ATL2_RPF_TAG_L3_V6_MASK;
+		} else {
+			req_tag |= (l3_index + 1) <<
+				   HW_ATL2_RPF_TAG_L3_V4_OFFSET;
+			mask |= HW_ATL2_RPF_TAG_L3_V4_MASK;
+		}
+	}
+
+	if (data->cmd & (HW_ATL_RX_HOST << HW_ATL2_RPF_L3_L4_ACTF_SHIFT))
+		action = HW_ATL2_ACTION_ASSIGN_QUEUE((data->cmd  &
+						      HW_ATL2_RPF_L3_L4_RXQF_MSK) >>
+						     HW_ATL2_RPF_L3_L4_RXQF_SHIFT);
+	else if (data->cmd)
+		action = HW_ATL2_ACTION_DROP;
+	else
+		action = HW_ATL2_ACTION_DISABLE;
+
+	index = priv->art_base_index + HW_ATL2_RPF_L3L4_USER_INDEX + location;
+	hw_atl2_act_rslvr_table_set(self, index, req_tag, mask, action);
+	return 0;
+}
+
+static int hw_atl2_hw_fl2_set(struct aq_hw_s *self,
+			      struct aq_rx_filter_l2 *data)
+{
+	struct hw_atl2_priv *priv = self->priv;
+	u32 mask = HW_ATL2_RPF_TAG_ET_MASK;
+	u32 req_tag = 0;
+	u16 action = 0;
+	u32 location;
+	u8 index;
+	int tag;
+
+	location = priv->etype_filter_base_index + data->location;
+	hw_atl_rpf_etht_flr_set(self, data->ethertype, location);
+	hw_atl_rpf_etht_user_priority_en_set(self,
+					     !!data->user_priority_en,
+					     location);
+	if (data->user_priority_en) {
+		hw_atl_rpf_etht_user_priority_set(self,
+						  data->user_priority,
+						  location);
+		req_tag |= data->user_priority << HW_ATL2_RPF_TAG_PCP_OFFSET;
+		mask |= HW_ATL2_RPF_TAG_PCP_MASK;
+	}
+
+	if (data->queue < 0) {
+		hw_atl_rpf_etht_flr_act_set(self, 0U, location);
+		hw_atl_rpf_etht_rx_queue_en_set(self, 0U, location);
+		action = HW_ATL2_ACTION_DROP;
+	} else {
+		hw_atl_rpf_etht_flr_act_set(self, 1U, location);
+		hw_atl_rpf_etht_rx_queue_en_set(self, 1U, location);
+		hw_atl_rpf_etht_rx_queue_set(self, data->queue, location);
+		action = HW_ATL2_ACTION_ASSIGN_QUEUE(data->queue);
+	}
+
+	tag = hw_atl2_filter_tag_get(priv->etype_policy,
+				     priv->etype_filter_tag_top,
+				     action);
+
+	if (tag < 0)
+		return -ENOSPC;
+
+	req_tag |= tag << HW_ATL2_RPF_TAG_ET_OFFSET;
+	hw_atl2_rpf_etht_flr_tag_set(self, tag, location);
+	index = priv->art_base_index + HW_ATL2_RPF_ET_PCP_USER_INDEX +
+		data->location;
+	hw_atl2_act_rslvr_table_set(self, index, req_tag, mask, action);
+
+	hw_atl_rpf_etht_flr_en_set(self, 1U, location);
+
+	return aq_hw_err_from_flags(self);
+}
+
+static int hw_atl2_hw_fl2_clear(struct aq_hw_s *self,
+				struct aq_rx_filter_l2 *data)
+{
+	struct hw_atl2_priv *priv = self->priv;
+	u32 location;
+	u8 index;
+	u32 tag;
+
+	location = priv->etype_filter_base_index + data->location;
+	hw_atl_rpf_etht_flr_en_set(self, 0U, location);
+	hw_atl_rpf_etht_flr_set(self, 0U, location);
+	hw_atl_rpf_etht_user_priority_en_set(self, 0U, location);
+
+	index = priv->art_base_index + HW_ATL2_RPF_ET_PCP_USER_INDEX +
+		data->location;
+	hw_atl2_act_rslvr_table_set(self, index, 0, 0,
+				    HW_ATL2_ACTION_DISABLE);
+	tag = hw_atl2_rpf_etht_flr_tag_get(self, location);
+	hw_atl2_filter_tag_put(priv->etype_policy, tag);
+
+	return aq_hw_err_from_flags(self);
+}
+
+/*
+ * Set VLAN filter table
+ * Configure VLAN filter table to accept (and assign the queue) traffic
+ * for the particular vlan ids.
+ * Note: call this function in VLAN promiscuous mode so that no traffic is lost
+ *
+ * param - aq_hw_s
+ * param - aq_rx_filter_vlan VLAN filter configuration
+ * return 0 - OK, <0 - error
+ */
 static int hw_atl2_hw_vlan_set(struct aq_hw_s *self,
 			       struct aq_rx_filter_vlan *aq_vlans)
 {
@@ -825,7 +1394,7 @@ static int hw_atl2_hw_vlan_ctrl(struct aq_hw_s *self, bool enable)
 const struct aq_hw_ops hw_atl2_ops = {
 	.hw_soft_reset        = hw_atl2_utils_soft_reset,
 	.hw_prepare           = hw_atl2_utils_initfw,
-	.hw_set_mac_address   = hw_atl_b0_hw_mac_addr_set,
+	.hw_set_mac_address   = hw_atl2_hw_mac_addr_set,
 	.hw_init              = hw_atl2_hw_init,
 	.hw_reset             = hw_atl2_hw_reset,
 	.hw_start             = hw_atl_b0_hw_start,
@@ -834,6 +1403,7 @@ const struct aq_hw_ops hw_atl2_ops = {
 	.hw_ring_rx_start     = hw_atl_b0_hw_ring_rx_start,
 	.hw_ring_rx_stop      = hw_atl_b0_hw_ring_rx_stop,
 	.hw_stop              = hw_atl2_hw_stop,
+	.hw_get_version       = hw_atl2_get_hw_version,
 
 	.hw_ring_tx_xmit         = hw_atl_b0_hw_ring_tx_xmit,
 	.hw_ring_tx_head_update  = hw_atl_b0_hw_ring_tx_head_update,
@@ -848,6 +1418,9 @@ const struct aq_hw_ops hw_atl2_ops = {
 	.hw_ring_rx_init             = hw_atl2_hw_ring_rx_init,
 	.hw_ring_tx_init             = hw_atl2_hw_ring_tx_init,
 	.hw_packet_filter_set        = hw_atl2_hw_packet_filter_set,
+	.hw_filter_l2_set            = hw_atl2_hw_fl2_set,
+	.hw_filter_l2_clear          = hw_atl2_hw_fl2_clear,
+	.hw_filter_l3l4_set          = hw_atl2_hw_fl3l4_set,
 	.hw_filter_vlan_set          = hw_atl2_hw_vlan_set,
 	.hw_filter_vlan_ctrl         = hw_atl2_hw_vlan_ctrl,
 	.hw_multicast_list_set       = hw_atl2_hw_multicast_list_set,
@@ -855,6 +1428,7 @@ const struct aq_hw_ops hw_atl2_ops = {
 	.hw_rss_set                  = hw_atl2_hw_rss_set,
 	.hw_rss_hash_set             = hw_atl_b0_hw_rss_hash_set,
 	.hw_tc_rate_limit_set        = hw_atl2_hw_init_tx_tc_rate_limit,
+	.hw_get_regs                 = hw_atl2_utils_hw_get_regs,
 	.hw_get_hw_stats             = hw_atl2_utils_get_hw_stats,
 	.hw_get_fw_version           = hw_atl2_utils_get_fw_version,
 	.hw_set_offload              = hw_atl_b0_hw_offload_set,
-- 
2.43.0


^ permalink raw reply related

* [PATCH net-next 4/9] net: atlantic: add AQC113 hardware register definitions and accessors
From: sukhdeeps @ 2026-05-06 13:57 UTC (permalink / raw)
  To: netdev
  Cc: irusskikh, epomozov, richardcochran, andrew+netdev, davem,
	edumazet, kuba, pabeni, linux-kernel, Sukhdeep Singh
In-Reply-To: <20260506135706.2834-1-sukhdeeps@marvell.com>

From: Sukhdeep Singh <sukhdeeps@marvell.com>

Add low-level hardware register definitions and accessor functions
for AQC113 (Antigua) chip features:

- L3/L4 filter command, tag, and address registers for IPv4/IPv6
- Ethertype filter tag registers
- TSG (Time Stamp Generator) clock control, modification, and
  GPIO event generation/input timestamp registers
- TX descriptor timestamp writeback, timestamp enable, and AVB
  enable registers
- TX data/descriptor read request limit registers
- TPB highest priority TC registers
- PCIe extended tag enable register
- RX descriptor timestamp request register
- Action resolver section enable getter
- GPIO special mode and TSG external GPIO TS input select

Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
---
 .../aquantia/atlantic/hw_atl2/hw_atl2_llh.c   | 359 ++++++++++++++++++
 .../aquantia/atlantic/hw_atl2/hw_atl2_llh.h   | 107 +++++-
 .../atlantic/hw_atl2/hw_atl2_llh_internal.h   | 204 +++++++++-
 3 files changed, 663 insertions(+), 7 deletions(-)
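
Editorial note, not part of the patch: most accessors added below follow one
pattern, a (register address, mask, shift) triple passed to
aq_hw_write_reg_bit() or aq_hw_read_reg_bit(). The following is a minimal
user-space sketch of that pattern, assuming the helpers do a read-modify-write
confined to the mask. All model_* names and the array-backed register file are
hypothetical stand-ins, not driver code; the driver's real helpers live in
aq_hw_utils and may differ in detail (for example, this sketch masks the
shifted value, which is a choice made here).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the device register space (not driver code). */
#define MODEL_NUM_REGS 16u

struct model_hw {
	uint32_t regs[MODEL_NUM_REGS];
	unsigned int writes;	/* number of register writes actually issued */
};

static uint32_t model_read_reg(struct model_hw *hw, uint32_t idx)
{
	assert(idx < MODEL_NUM_REGS);
	return hw->regs[idx];
}

static void model_write_reg(struct model_hw *hw, uint32_t idx, uint32_t val)
{
	assert(idx < MODEL_NUM_REGS);
	hw->regs[idx] = val;
	hw->writes++;
}

/*
 * Update one field: clear the bits under msk, then OR in val << shift.
 * Bits of val that do not fit in the field are dropped by the final mask.
 * The write is skipped when the register value would not change.
 */
static void model_write_reg_bit(struct model_hw *hw, uint32_t idx,
				uint32_t msk, uint32_t shift, uint32_t val)
{
	uint32_t old = model_read_reg(hw, idx);
	uint32_t updated = (old & ~msk) | ((val << shift) & msk);

	if (updated != old)
		model_write_reg(hw, idx, updated);
}

/* Extract one field: keep the bits under msk and shift them down. */
static uint32_t model_read_reg_bit(struct model_hw *hw, uint32_t idx,
				   uint32_t msk, uint32_t shift)
{
	return (model_read_reg(hw, idx) & msk) >> shift;
}
```

Under these assumptions, a setter such as hw_atl2_phi_ext_tag_set() changes
only the bits covered by its _MSK constant and leaves the rest of the register
untouched, and the matching getter returns the field already shifted down to
bit 0.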

diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh.c
index cd954b11d24a..21fda387f60e 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh.c
@@ -7,6 +7,20 @@
 #include "hw_atl2_llh_internal.h"
 #include "aq_hw_utils.h"
 
+void hw_atl2_phi_ext_tag_set(struct aq_hw_s *aq_hw, u32 val)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_PHI_EXT_TAG_EN_ADR,
+			    HW_ATL2_PHI_EXT_TAG_EN_MSK,
+			    HW_ATL2_PHI_EXT_TAG_EN_SHIFT, val);
+}
+
+u32 hw_atl2_phi_ext_tag_get(struct aq_hw_s *aq_hw)
+{
+	return aq_hw_read_reg_bit(aq_hw, HW_ATL2_PHI_EXT_TAG_EN_ADR,
+				  HW_ATL2_PHI_EXT_TAG_EN_MSK,
+				  HW_ATL2_PHI_EXT_TAG_EN_SHIFT);
+}
+
 void hw_atl2_rpf_redirection_table2_select_set(struct aq_hw_s *aq_hw,
 					       u32 select)
 {
@@ -66,6 +80,278 @@ void hw_atl2_rpf_vlan_flr_tag_set(struct aq_hw_s *aq_hw, u32 tag, u32 filter)
 			    tag);
 }
 
+void hw_atl2_rpf_etht_flr_tag_set(struct aq_hw_s *aq_hw, u32 tag, u32 filter)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_RPF_ET_TAG_ADR(filter),
+			    HW_ATL2_RPF_ET_TAG_MSK,
+			    HW_ATL2_RPF_ET_TAG_SHIFT, tag);
+}
+
+u32 hw_atl2_rpf_etht_flr_tag_get(struct aq_hw_s *aq_hw, u32 filter)
+{
+	return aq_hw_read_reg_bit(aq_hw, HW_ATL2_RPF_ET_TAG_ADR(filter),
+				  HW_ATL2_RPF_ET_TAG_MSK,
+				  HW_ATL2_RPF_ET_TAG_SHIFT);
+}
+
+void hw_atl2_rpf_l3_v4_dest_addr_set(struct aq_hw_s *aq_hw, u32 filter, u32 val)
+{
+	u32 addr_set = 6 + ((filter < 4) ? 0 : 1);
+	u32 dword = filter % 4;
+
+	aq_hw_write_reg(aq_hw, HW_ATL2_RPF_L3_DA_DW_ADR(addr_set, dword), val);
+}
+
+void hw_atl2_rpf_l3_v4_src_addr_set(struct aq_hw_s *aq_hw, u32 filter, u32 val)
+{
+	u32 addr_set = 6 + ((filter < 4) ? 0 : 1);
+	u32 dword = filter % 4;
+
+	aq_hw_write_reg(aq_hw, HW_ATL2_RPF_L3_SA_DW_ADR(addr_set, dword), val);
+}
+
+void hw_atl2_rpf_l3_v6_dest_addr_set(struct aq_hw_s *aq_hw, u8 location,
+				     u32 *ipv6_dst)
+{
+	int i;
+
+	for (i = 0; i < 4; ++i)
+		aq_hw_write_reg(aq_hw,
+				HW_ATL2_RPF_L3_DA_DW_ADR(location, 3 - i),
+				ipv6_dst[i]);
+}
+
+void hw_atl2_rpf_l3_v6_src_addr_set(struct aq_hw_s *aq_hw, u8 location,
+				    u32 *ipv6_src)
+{
+	int i;
+
+	for (i = 0; i < 4; ++i)
+		aq_hw_write_reg(aq_hw,
+				HW_ATL2_RPF_L3_SA_DW_ADR(location, 3 - i),
+				ipv6_src[i]);
+}
+
+void hw_atl2_rpf_l3_v4_cmd_set(struct aq_hw_s *aq_hw, u32 val, u32 filter)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_RPF_L3_V4_CMD_ADR(filter),
+			    HW_ATL2_RPF_L3_V4_CMD_MSK,
+			    HW_ATL2_RPF_L3_V4_CMD_SHIFT, val);
+}
+
+void hw_atl2_rpf_l3_v6_cmd_set(struct aq_hw_s *aq_hw, u32 val, u32 filter)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_RPF_L3_V6_CMD_ADR(filter),
+			    HW_ATL2_RPF_L3_V6_CMD_MSK,
+			    HW_ATL2_RPF_L3_V6_CMD_SHIFT, val);
+}
+
+void hw_atl2_rpf_l3_v6_v4_select_set(struct aq_hw_s *aq_hw, u32 val)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_RPF_L3_V6_V4_SELECT_ADR,
+			    HW_ATL2_RPF_L3_V6_V4_SELECT_MSK,
+			    HW_ATL2_RPF_L3_V6_V4_SELECT_SHIFT, val);
+}
+
+void hw_atl2_rpf_l3_v4_tag_set(struct aq_hw_s *aq_hw, u32 val, u32 filter)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_RPF_L3_V4_TAG_ADR(filter),
+			    HW_ATL2_RPF_L3_V4_TAG_MSK,
+			    HW_ATL2_RPF_L3_V4_TAG_SHIFT, val);
+}
+
+void hw_atl2_rpf_l3_v6_tag_set(struct aq_hw_s *aq_hw, u32 val, u32 filter)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_RPF_L3_V6_TAG_ADR(filter),
+			    HW_ATL2_RPF_L3_V6_TAG_MSK,
+			    HW_ATL2_RPF_L3_V6_TAG_SHIFT, val);
+}
+
+void hw_atl2_rpf_l4_tag_set(struct aq_hw_s *aq_hw, u32 val, u32 filter)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_RPF_L4_TAG_ADR(filter),
+			    HW_ATL2_RPF_L4_TAG_MSK,
+			    HW_ATL2_RPF_L4_TAG_SHIFT, val);
+}
+
+void hw_atl2_rpf_l4_cmd_set(struct aq_hw_s *aq_hw, u32 val, u32 filter)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_RPF_L4_CMD_ADR(filter),
+			    HW_ATL2_RPF_L4_CMD_MSK,
+			    HW_ATL2_RPF_L4_CMD_SHIFT, val);
+}
+
+/* tsg */
+static void hw_atl2_clock_modif_value_set(struct aq_hw_s *aq_hw,
+					  u32 clock_sel, u64 ns)
+{
+	aq_hw_write_reg64(aq_hw,
+			  HW_ATL2_TSG_REG_ADR(clock_sel, CLOCK_MODIF_VAL_LSW),
+			  ns);
+}
+
+void hw_atl2_tsg_clock_en(struct aq_hw_s *aq_hw,
+			  u32 clock_sel, u32 clock_enable)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TSG_REG_ADR(clock_sel, CLOCK_CFG),
+			    HW_ATL2_TSG_CLOCK_EN_MSK,
+			    HW_ATL2_TSG_CLOCK_EN_SHIFT,
+			    clock_enable);
+}
+
+void hw_atl2_tsg_clock_reset(struct aq_hw_s *aq_hw, u32 clock_sel)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TSG_REG_ADR(clock_sel, CLOCK_CFG),
+			    HW_ATL2_TSG_SYNC_RESET_MSK,
+			    HW_ATL2_TSG_SYNC_RESET_SHIFT, 1);
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TSG_REG_ADR(clock_sel, CLOCK_CFG),
+			    HW_ATL2_TSG_SYNC_RESET_MSK,
+			    HW_ATL2_TSG_SYNC_RESET_SHIFT, 0);
+}
+
+u64 hw_atl2_tsg_clock_read(struct aq_hw_s *aq_hw, u32 clock_sel)
+{
+	return aq_hw_read_reg64(aq_hw,
+				HW_ATL2_TSG_REG_ADR(clock_sel,
+						    READ_CUR_NS_LSW));
+}
+
+void hw_atl2_tsg_clock_add(struct aq_hw_s *aq_hw, u32 clock_sel, u64 ns)
+{
+	hw_atl2_clock_modif_value_set(aq_hw, clock_sel, ns);
+	aq_hw_write_reg(aq_hw,
+			HW_ATL2_TSG_REG_ADR(clock_sel, CLOCK_MODIF_CTRL),
+			HW_ATL2_TSG_ADD_COUNTER_MSK);
+}
+
+void hw_atl2_tsg_clock_sub(struct aq_hw_s *aq_hw, u32 clock_sel, u64 ns)
+{
+	hw_atl2_clock_modif_value_set(aq_hw, clock_sel, ns);
+	aq_hw_write_reg(aq_hw,
+			HW_ATL2_TSG_REG_ADR(clock_sel, CLOCK_MODIF_CTRL),
+			HW_ATL2_TSG_SUBTRACT_COUNTER_MSK);
+}
+
+void hw_atl2_tsg_clock_increment_set(struct aq_hw_s *aq_hw,
+				     u32 clock_sel, u32 ns, u32 fns)
+{
+	u32 nsfns = (ns & 0xff) | (fns & 0xffffff00);
+
+	aq_hw_write_reg(aq_hw,
+			HW_ATL2_TSG_REG_ADR(clock_sel, CLOCK_INC_CFG),
+			nsfns);
+	aq_hw_write_reg(aq_hw,
+			HW_ATL2_TSG_REG_ADR(clock_sel, CLOCK_MODIF_CTRL),
+			HW_ATL2_TSG_LOAD_INC_CFG_MSK);
+}
+
+void hw_atl2_tsg_ext_isr_to_host_set(struct aq_hw_s *aq_hw, int on)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_GLB_CONTROL_2_ADR,
+			    HW_ATL2_MIF_INTERRUPT_2_TO_ITR_MSK,
+			    HW_ATL2_MIF_INTERRUPT_TO_ITR_SHIFT + 2,
+			    !!on);
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_GLB_CONTROL_2_ADR,
+			    HW_ATL2_EN_INTERRUPT_MIF2_TO_ITR_MSK,
+			    HW_ATL2_EN_INTERRUPT_TO_ITR_SHIFT + 2,
+			    !!on);
+}
+
+void hw_atl2_tpb_tps_highest_priority_tc_enable_set(struct aq_hw_s *aq_hw,
+						    u32 tps_highest_prio_tc_en)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TPB_HIGHEST_PRIO_TC_EN_ADR,
+			    HW_ATL2_TPB_HIGHEST_PRIO_TC_EN_MSK,
+			    HW_ATL2_TPB_HIGHEST_PRIO_TC_EN_SHIFT,
+			    tps_highest_prio_tc_en);
+}
+
+void hw_atl2_tpb_tps_highest_priority_tc_set(struct aq_hw_s *aq_hw,
+					     u32 tps_highest_prio_tc)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TPB_HIGHEST_PRIO_TC_ADR,
+			    HW_ATL2_TPB_HIGHEST_PRIO_TC_MSK,
+			    HW_ATL2_TPB_HIGHEST_PRIO_TC_SHIFT,
+			    tps_highest_prio_tc);
+}
+
+void hw_atl2_tsg_gpio_isr_to_host_set(struct aq_hw_s *aq_hw,
+				      int on, u32 clock_sel)
+{
+	aq_hw_write_reg_bit(aq_hw,
+			    HW_ATL2_GLOBAL_HIGH_PRIO_INTERRUPT_1_MASK_ADR,
+		clock_sel == 1 ? HW_ATL2_TSG_TSG1_GPIO_INTERRUPT_MSK :
+			HW_ATL2_TSG_TSG0_GPIO_INTERRUPT_MSK,
+		clock_sel == 1 ? HW_ATL2_TSG_TSG1_GPIO_INTERRUPT_SHIFT :
+			HW_ATL2_TSG_TSG0_GPIO_INTERRUPT_SHIFT,
+		!!on);
+}
+
+void hw_atl2_tsg_gpio_clear_status(struct aq_hw_s *aq_hw)
+{
+	aq_hw_read_reg(aq_hw, HW_ATL2_GLOBAL_INTERNAL_ALARMS_1_ADR);
+}
+
+void hw_atl2_tsg_gpio_input_event_info_get(struct aq_hw_s *aq_hw,
+					   u32 clock_sel,
+					   u32 *event_count,
+					   u64 *event_ts)
+{
+	if (event_count)
+		*event_count = aq_hw_read_reg(aq_hw,
+					      HW_ATL2_TSG_REG_ADR(clock_sel,
+								  EXT_CLK_COUNT));
+
+	if (event_ts)
+		*event_ts = aq_hw_read_reg64(aq_hw,
+					     HW_ATL2_TSG_REG_ADR(clock_sel,
+								 GPIO_EVENT_TS_LSW));
+}
+
+void hw_atl2_tsg_ptp_gpio_gen_pulse(struct aq_hw_s *aq_hw, u32 clk_sel,
+				    u64 ts, u32 period, u32 hightime)
+{
+	u32 val = (HW_ATL2_TSG_GPIO_EVENT_MODE_SET_ON_TIME <<
+		   (HW_ATL2_TSG_GPIO_EVENT_MODE_SHIFT -
+		    HW_ATL2_TSG_GPIO_OUTPUT_EN_SHIFT)) |
+		  (HW_ATL2_TSG_GPIO_GEN_OUTPUT_EN_MSK) |
+		  (HW_ATL2_TSG_GPIO_OUTPUT_EN_MSK);
+
+	if (ts != 0) {
+		aq_hw_write_reg64(aq_hw,
+				  HW_ATL2_TSG_REG_ADR(clk_sel,
+						      GPIO_EVENT_GEN_TS_LSW),
+				  ts);
+
+		aq_hw_write_reg64(aq_hw,
+				  HW_ATL2_TSG_REG_ADR(clk_sel,
+						      GPIO_EVENT_HIGH_TIME_LSW),
+				  hightime);
+
+		aq_hw_write_reg64(aq_hw,
+				  HW_ATL2_TSG_REG_ADR(clk_sel,
+						      GPIO_EVENT_LOW_TIME_LSW),
+				  (period - hightime));
+	}
+
+	aq_hw_write_reg_bit(aq_hw,
+			    HW_ATL2_TSG_REG_ADR(clk_sel, GPIO_EVENT_GEN_CFG),
+			    HW_ATL2_TSG_GPIO_EVENT_MODE_MSK |
+				HW_ATL2_TSG_GPIO_OUTPUT_EN_MSK |
+				HW_ATL2_TSG_GPIO_GEN_OUTPUT_EN_MSK,
+			   HW_ATL2_TSG_GPIO_OUTPUT_EN_SHIFT,
+			   (!ts ? 0 : val));
+}
+
+void hw_atl2_rpf_rx_desc_timestamp_req_set(struct aq_hw_s *aq_hw, u32 request,
+					   u32 descriptor)
+{
+	aq_hw_write_reg_bit(aq_hw,
+			    HW_ATL2_RPF_TIMESTAMP_REQ_DESCD_ADR(descriptor),
+			    HW_ATL2_RPF_TIMESTAMP_REQ_DESCD_MSK,
+			    HW_ATL2_RPF_TIMESTAMP_REQ_DESCD_SHIFT, request);
+}
+
 /* TX */
 
 void hw_atl2_tpb_tx_tc_q_rand_map_en_set(struct aq_hw_s *aq_hw,
@@ -93,6 +379,30 @@ void hw_atl2_reg_tx_intr_moder_ctrl_set(struct aq_hw_s *aq_hw,
 			tx_intr_moderation_ctl);
 }
 
+void hw_atl2_tdm_tx_desc_timestamp_writeback_en_set(struct aq_hw_s *aq_hw,
+						    u32 enable, u32 descriptor)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TDM_DESCD_TS_WRB_EN_ADR(descriptor),
+			    HW_ATL2_TDM_DESCD_TS_WRB_EN_MSK,
+			    HW_ATL2_TDM_DESCD_TS_WRB_EN_SHIFT, enable);
+}
+
+void hw_atl2_tdm_tx_desc_timestamp_en_set(struct aq_hw_s *aq_hw, u32 enable,
+					  u32 descriptor)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TDM_DESCD_TS_EN_ADR(descriptor),
+			    HW_ATL2_TDM_DESCD_TS_EN_MSK,
+			    HW_ATL2_TDM_DESCD_TS_EN_SHIFT, enable);
+}
+
+void hw_atl2_tdm_tx_desc_avb_en_set(struct aq_hw_s *aq_hw, u32 enable,
+				    u32 descriptor)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TDM_DESCD_AVB_EN_ADR(descriptor),
+			    HW_ATL2_TDM_DESCD_AVB_EN_MSK,
+			    HW_ATL2_TDM_DESCD_AVB_EN_SHIFT, enable);
+}
+
 void hw_atl2_tps_tx_pkt_shed_data_arb_mode_set(struct aq_hw_s *aq_hw,
 					       const u32 data_arb_mode)
 {
@@ -122,6 +432,20 @@ void hw_atl2_tps_tx_pkt_shed_tc_data_weight_set(struct aq_hw_s *aq_hw,
 			    weight);
 }
 
+void hw_atl2_tdm_tx_data_read_req_limit_set(struct aq_hw_s *aq_hw, u32 limit)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TDM_TX_DATA_RD_REQ_LIMIT_ADR,
+			    HW_ATL2_TDM_TX_DATA_RD_REQ_LIMIT_MSK,
+			    HW_ATL2_TDM_TX_DATA_RD_REQ_LIMIT_SHIFT, limit);
+}
+
+void hw_atl2_tdm_tx_desc_read_req_limit_set(struct aq_hw_s *aq_hw, u32 limit)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TDM_TX_DESC_RD_REQ_LIMIT_ADR,
+			    HW_ATL2_TDM_TX_DESC_RD_REQ_LIMIT_MSK,
+			    HW_ATL2_TDM_TX_DESC_RD_REQ_LIMIT_SHIFT, limit);
+}
+
 u32 hw_atl2_get_hw_version(struct aq_hw_s *aq_hw)
 {
 	return aq_hw_read_reg(aq_hw, HW_ATL2_FPGA_VER_ADR);
@@ -164,6 +488,13 @@ void hw_atl2_rpf_act_rslvr_section_en_set(struct aq_hw_s *aq_hw, u32 sections)
 			    sections);
 }
 
+u32 hw_atl2_rpf_act_rslvr_section_en_get(struct aq_hw_s *aq_hw)
+{
+	return aq_hw_read_reg_bit(aq_hw, HW_ATL2_RPF_REC_TAB_EN_ADR,
+				  HW_ATL2_RPF_REC_TAB_EN_MSK,
+				  HW_ATL2_RPF_REC_TAB_EN_SHIFT);
+}
+
 void hw_atl2_mif_shared_buf_get(struct aq_hw_s *aq_hw, int offset, u32 *data,
 				int len)
 {
@@ -232,3 +563,31 @@ void hw_atl2_mif_host_req_int_clr(struct aq_hw_s *aq_hw, u32 val)
 	return aq_hw_write_reg(aq_hw, HW_ATL2_MCP_HOST_REQ_INT_CLR_ADR,
 			       val);
 }
+
+void hw_atl2_tsg1_ext_gpio_ts_input_select_set(struct aq_hw_s *aq_hw,
+					       u32 tsg_gpio_ts_select)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TSG1_EXT_GPIO_TS_INPUT_SEL_ADR,
+			    HW_ATL2_TSG1_EXT_GPIO_TS_INPUT_SEL_MSK,
+			    HW_ATL2_TSG1_EXT_GPIO_TS_INPUT_SEL_SHIFT,
+			    tsg_gpio_ts_select);
+}
+
+void hw_atl2_tsg0_ext_gpio_ts_input_select_set(struct aq_hw_s *aq_hw,
+					       u32 gpio_ts_in_select)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_TSG0_EXT_GPIO_TS_INPUT_SEL_ADR,
+			    HW_ATL2_TSG0_EXT_GPIO_TS_INPUT_SEL_MSK,
+			    HW_ATL2_TSG0_EXT_GPIO_TS_INPUT_SEL_SHIFT,
+			    gpio_ts_in_select);
+}
+
+void hw_atl2_gpio_special_mode_set(struct aq_hw_s *aq_hw,
+				   u32 gpio_special_mode,
+				   u32 pin)
+{
+	aq_hw_write_reg_bit(aq_hw, HW_ATL2_GPIO_PIN_SPEC_MODE_ADR(pin),
+			    HW_ATL2_GPIO_PIN_SPEC_MODE_MSK,
+			    HW_ATL2_GPIO_PIN_SPEC_MODE_SHIFT,
+			    gpio_special_mode);
+}
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh.h b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh.h
index 98c7a4621297..01aaf701b201 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh.h
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh.h
@@ -10,6 +10,11 @@
 
 struct aq_hw_s;
 
+/* Set Enable usage of extended tags from 32-255. */
+void hw_atl2_phi_ext_tag_set(struct aq_hw_s *aq_hw, u32 val);
+/* Get Enable usage of extended tags from 32-255. */
+u32 hw_atl2_phi_ext_tag_get(struct aq_hw_s *aq_hw);
+
 /* Set TX Interrupt Moderation Control Register */
 void hw_atl2_reg_tx_intr_moder_ctrl_set(struct aq_hw_s *aq_hw,
 					u32 tx_intr_moderation_ctl,
@@ -19,7 +24,7 @@ void hw_atl2_reg_tx_intr_moder_ctrl_set(struct aq_hw_s *aq_hw,
 void hw_atl2_rpf_redirection_table2_select_set(struct aq_hw_s *aq_hw,
 					       u32 select);
 
-/** Set RSS HASH type */
+/* Set RSS HASH type */
 void hw_atl2_rpf_rss_hash_type_set(struct aq_hw_s *aq_hw, u32 rss_hash_type);
 
 /* set new RPF enable */
@@ -37,14 +42,92 @@ void hw_atl2_new_rpf_rss_redir_set(struct aq_hw_s *aq_hw, u32 tc, u32 index,
 
 /* Set VLAN filter tag */
 void hw_atl2_rpf_vlan_flr_tag_set(struct aq_hw_s *aq_hw, u32 tag, u32 filter);
+/* set ethertype filter tag */
+void hw_atl2_rpf_etht_flr_tag_set(struct aq_hw_s *aq_hw, u32 tag, u32 filter);
+
+/* get ethertype filter tag */
+u32 hw_atl2_rpf_etht_flr_tag_get(struct aq_hw_s *aq_hw, u32 filter);
+
+/* set L3 v4 dest address */
+void hw_atl2_rpf_l3_v4_dest_addr_set(struct aq_hw_s *aq_hw,
+				     u32 filter, u32 val);
+
+/* set L3 v4 src address */
+void hw_atl2_rpf_l3_v4_src_addr_set(struct aq_hw_s *aq_hw, u32 filter, u32 val);
+
+/* set L3 v4 cmd */
+void hw_atl2_rpf_l3_v4_cmd_set(struct aq_hw_s *aq_hw, u32 val, u32 filter);
+
+/* set L3 v6 cmd */
+void hw_atl2_rpf_l3_v6_cmd_set(struct aq_hw_s *aq_hw, u32 val, u32 filter);
+
+/* set L3 v6 dest address */
+void hw_atl2_rpf_l3_v6_dest_addr_set(struct aq_hw_s *aq_hw, u8 location,
+				     u32 *ipv6_dst);
+
+/* set L3 v6 src address */
+void hw_atl2_rpf_l3_v6_src_addr_set(struct aq_hw_s *aq_hw, u8 location,
+				    u32 *ipv6_src);
+
+/* set L3 v6 v4 select */
+void hw_atl2_rpf_l3_v6_v4_select_set(struct aq_hw_s *aq_hw, u32 val);
+
+/* set L3 v4 tag */
+void hw_atl2_rpf_l3_v4_tag_set(struct aq_hw_s *aq_hw, u32 val, u32 filter);
+
+/* set L3 v6 tag */
+void hw_atl2_rpf_l3_v6_tag_set(struct aq_hw_s *aq_hw, u32 val, u32 filter);
+
+/* set L4 cmd */
+void hw_atl2_rpf_l4_cmd_set(struct aq_hw_s *aq_hw, u32 val, u32 filter);
+
+/* set L4 tag */
+void hw_atl2_rpf_l4_tag_set(struct aq_hw_s *aq_hw, u32 val, u32 filter);
 
 /* set tx random TC-queue mapping enable bit */
 void hw_atl2_tpb_tx_tc_q_rand_map_en_set(struct aq_hw_s *aq_hw,
 					 const u32 tc_q_rand_map_en);
 
+void hw_atl2_tpb_tps_highest_priority_tc_enable_set(struct aq_hw_s *aq_hw,
+						    u32 tps_highest_prio_tc_en);
+
+void hw_atl2_tpb_tps_highest_priority_tc_set(struct aq_hw_s *aq_hw,
+					     u32 tps_highest_prio_tc);
+
 /* set tx buffer clock gate enable */
 void hw_atl2_tpb_tx_buf_clk_gate_en_set(struct aq_hw_s *aq_hw, u32 clk_gate_en);
 
+/* tsg */
+
+void hw_atl2_tsg_clock_en(struct aq_hw_s *aq_hw, u32 clock_sel,
+			  u32 clock_enable);
+
+void hw_atl2_tsg_clock_reset(struct aq_hw_s *aq_hw, u32 clock_sel);
+u64 hw_atl2_tsg_clock_read(struct aq_hw_s *aq_hw, u32 clock_sel);
+void hw_atl2_tsg_clock_add(struct aq_hw_s *aq_hw, u32 clock_sel,
+			   u64 ns);
+void hw_atl2_tsg_clock_sub(struct aq_hw_s *aq_hw, u32 clock_sel,
+			   u64 ns);
+void hw_atl2_tsg_clock_increment_set(struct aq_hw_s *aq_hw, u32 clock_sel,
+				     u32 ns, u32 fns);
+void hw_atl2_tsg_gpio_isr_to_host_set(struct aq_hw_s *aq_hw, int on,
+				      u32 clock_sel);
+void hw_atl2_tsg_ext_isr_to_host_set(struct aq_hw_s *aq_hw, int on);
+void hw_atl2_tsg_gpio_clear_status(struct aq_hw_s *aq_hw);
+void hw_atl2_tsg_gpio_input_event_info_get(struct aq_hw_s *aq_hw,
+					   u32 clock_sel,
+					   u32 *event_count,
+					   u64 *event_ts);
+/* Set Rx Descriptor0 Timestamp request */
+void hw_atl2_rpf_rx_desc_timestamp_req_set(struct aq_hw_s *aq_hw, u32 request,
+					   u32 descriptor);
+/* Set Tx Descriptor Timestamp writeback Enable */
+void hw_atl2_tdm_tx_desc_timestamp_writeback_en_set(struct aq_hw_s *aq_hw,
+						    u32 enable,
+						    u32 descriptor);
+/* Set Tx Descriptor Timestamp enable */
+void hw_atl2_tdm_tx_desc_timestamp_en_set(struct aq_hw_s *aq_hw, u32 enable,
+					  u32 descriptor);
 void hw_atl2_tps_tx_pkt_shed_data_arb_mode_set(struct aq_hw_s *aq_hw,
 					       const u32 data_arb_mode);
 
@@ -57,6 +140,15 @@ void hw_atl2_tps_tx_pkt_shed_tc_data_max_credit_set(struct aq_hw_s *aq_hw,
 void hw_atl2_tps_tx_pkt_shed_tc_data_weight_set(struct aq_hw_s *aq_hw,
 						const u32 tc,
 						const u32 weight);
+/* Set Tx Descriptor AVB enable */
+void hw_atl2_tdm_tx_desc_avb_en_set(struct aq_hw_s *aq_hw, u32 enable,
+				    u32 descriptor);
+void hw_atl2_tsg_ptp_gpio_gen_pulse(struct aq_hw_s *aq_hw, u32 clk_sel,
+				    u64 ts, u32 period, u32 hightime);
+
+void hw_atl2_tdm_tx_data_read_req_limit_set(struct aq_hw_s *aq_hw, u32 limit);
+
+void hw_atl2_tdm_tx_desc_read_req_limit_set(struct aq_hw_s *aq_hw, u32 limit);
 
 u32 hw_atl2_get_hw_version(struct aq_hw_s *aq_hw);
 
@@ -69,6 +161,9 @@ void hw_atl2_rpf_act_rslvr_record_set(struct aq_hw_s *aq_hw, u8 location,
 /* set enable action resolver section */
 void hw_atl2_rpf_act_rslvr_section_en_set(struct aq_hw_s *aq_hw, u32 sections);
 
+/* get enable action resolver section */
+u32 hw_atl2_rpf_act_rslvr_section_en_get(struct aq_hw_s *aq_hw);
+
 /* get data from firmware shared input buffer */
 void hw_atl2_mif_shared_buf_get(struct aq_hw_s *aq_hw, int offset, u32 *data,
 				int len);
@@ -98,5 +193,13 @@ u32 hw_atl2_mif_host_req_int_get(struct aq_hw_s *aq_hw);
 
 /* clear host interrupt request */
 void hw_atl2_mif_host_req_int_clr(struct aq_hw_s *aq_hw, u32 val);
-
+/* Set TSG EXT GPIO TS Input select */
+void hw_atl2_tsg1_ext_gpio_ts_input_select_set(struct aq_hw_s *aq_hw,
+					       u32 tsg_gpio_ts_select);
+/* Set PTP EXT GPIO TS Input select */
+void hw_atl2_tsg0_ext_gpio_ts_input_select_set(struct aq_hw_s *aq_hw,
+					       u32 gpio_ts_in_select);
+/* Set GPIO Special Mode */
+void hw_atl2_gpio_special_mode_set(struct aq_hw_s *aq_hw,
+				   u32 gpio_special_mode, u32 pin);
 #endif /* HW_ATL2_LLH_H */
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh_internal.h b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh_internal.h
index e34c5cda061e..9b9be3ef1332 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh_internal.h
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh_internal.h
@@ -5,6 +5,11 @@
 
 #ifndef HW_ATL2_LLH_INTERNAL_H
 #define HW_ATL2_LLH_INTERNAL_H
+/* RX timestamp_req_desc{D} [1:0] Bitfield Definitions
+ */
+#define HW_ATL2_RPF_TIMESTAMP_REQ_DESCD_ADR(descr) (0x00005B08 + (descr) * 0x20)
+#define HW_ATL2_RPF_TIMESTAMP_REQ_DESCD_MSK 0x00030000
+#define HW_ATL2_RPF_TIMESTAMP_REQ_DESCD_SHIFT 16
 
 /* RX pif_rpf_redir_2_en_i Bitfield Definitions
  * PORT="pif_rpf_redir_2_en_i"
@@ -114,7 +119,68 @@
 #define HW_ATL2_RPF_VL_TAG_WIDTH 4
 /* default value of bitfield vlan_req_tag0{f}[3:0] */
 #define HW_ATL2_RPF_VL_TAG_DEFAULT 0x0
-
+/* register address for bitfield etype_req_tag0{f}[2:0] */
+#define HW_ATL2_RPF_ET_TAG_ADR(filter) (0x00005340 + (filter) * 0x4)
+/* bitmask for bitfield etype_req_tag0{f}[2:0] */
+#define HW_ATL2_RPF_ET_TAG_MSK 0x00000007
+/* lower bit position of bitfield etype_req_tag0{f}[2:0] */
+#define HW_ATL2_RPF_ET_TAG_SHIFT 0
+/* Lower bit position of bitfield l3_l4_act{F}[2:0] */
+#define HW_ATL2_RPF_L3_L4_ACTF_SHIFT 16
+/* Bitmask for bitfield l3_l4_rxq{F}[4:0] */
+#define HW_ATL2_RPF_L3_L4_RXQF_MSK 0x00001F00u
+/* Lower bit position of bitfield l3_l4_rxq{F}[4:0] */
+#define HW_ATL2_RPF_L3_L4_RXQF_SHIFT 8
+/* Register address for bitfield rpf_l3_v6_sa{F}_dw{D}[1F:0] */
+#define HW_ATL2_RPF_L3_SA_DW_ADR(filter, dword) \
+	(0x00006400u + (filter) * 0x10 + (dword) * 0x4)
+
+/* Register address for bitfield rpf_l3_v6_da{F}_dw{D}[1F:0] */
+#define HW_ATL2_RPF_L3_DA_DW_ADR(filter, dword) \
+	(0x00006480u + (filter) * 0x10 + (dword) * 0x4)
+
+/* Register address for bitfield rpf_l3_cmd{F}[1F:0] */
+#define HW_ATL2_RPF_L3_V4_CMD_ADR(filter) (0x00006500u + (filter) * 0x4)
+/* Bitmask for bitfield rpf_l3_cmd{F}[F:0] */
+#define HW_ATL2_RPF_L3_V4_CMD_MSK 0x0000FFFFu
+/* Lower bit position of bitfield rpf_l3_cmd{F}[1F:0] */
+#define HW_ATL2_RPF_L3_V4_CMD_SHIFT 0
+/* Register address for bitfield rpf_l3_v6_cmd{F}[1F:0] */
+#define HW_ATL2_RPF_L3_V6_CMD_ADR(filter) (0x00006500u + (filter) * 0x4)
+/* Bitmask for bitfield rpf_l3_v6_cmd{F}[F:0] */
+#define HW_ATL2_RPF_L3_V6_CMD_MSK 0xFF7F0000u
+/* Lower bit position of bitfield rpf_l3_v6_cmd{F}[1F:0] */
+#define HW_ATL2_RPF_L3_V6_CMD_SHIFT 0
+/* Register address for bitfield rpf_l3_v6_cmd{F}[F:0] */
+#define HW_ATL2_RPF_L3_V6_V4_SELECT_ADR 0x00006500u
+/* Bitmask for bitfield pif_rpf_l3_v6_v4_select*/
+#define HW_ATL2_RPF_L3_V6_V4_SELECT_MSK 0x00800000u
+/* Lower bit position of bitfield pif_rpf_l3_v6_v4_select */
+#define HW_ATL2_RPF_L3_V6_V4_SELECT_SHIFT 23
+/* Register address for bitfield rpf_l3_v4_req_tag{F}[2:0] */
+#define HW_ATL2_RPF_L3_V4_TAG_ADR(filter) (0x00006500u + (filter) * 0x4)
+/* Bitmask for bitfield rpf_l3_v4_req_tag{F}[2:0] */
+#define HW_ATL2_RPF_L3_V4_TAG_MSK 0x00000070u
+/* Lower bit position of bitfield rpf_l3_v4_req_tag{F}[2:0] */
+#define HW_ATL2_RPF_L3_V4_TAG_SHIFT 4
+/* Register address for bitfield rpf_l3_v6_req_tag{F}[2:0] */
+#define HW_ATL2_RPF_L3_V6_TAG_ADR(filter) (0x00006500u + (filter) * 0x4)
+/* Bitmask for bitfield rpf_l3_v6_req_tag{F}[2:0] */
+#define HW_ATL2_RPF_L3_V6_TAG_MSK 0x00700000
+/* Lower bit position of bitfield rpf_l3_v6_req_tag{F}[2:0] */
+#define HW_ATL2_RPF_L3_V6_TAG_SHIFT 20
+/* Register address for bitfield rpf_l4_cmd{F}[2:0] */
+#define HW_ATL2_RPF_L4_CMD_ADR(filter) (0x00006520u + (filter) * 0x4)
+/* Bitmask for bitfield rpf_l4_cmd{F}[2:0] */
+#define HW_ATL2_RPF_L4_CMD_MSK 0x00000007u
+/* Lower bit position of bitfield rpf_l4_cmd{F}[2:0] */
+#define HW_ATL2_RPF_L4_CMD_SHIFT 0
+/* Register address for bitfield rpf_l4_tag{F}[2:0] */
+#define HW_ATL2_RPF_L4_TAG_ADR(filter) (0x00006520u + (filter) * 0x4)
+/* Bitmask for bitfield rpf_l4_tag{F}[2:0] */
+#define HW_ATL2_RPF_L4_TAG_MSK 0x00000070u
+/* Lower bit position of bitfield rpf_l4_tag{F}[2:0] */
+#define HW_ATL2_RPF_L4_TAG_SHIFT 4
 /* RX rx_q{Q}_tc_map[2:0] Bitfield Definitions
  * Preprocessor definitions for the bitfield "rx_q{Q}_tc_map[2:0]".
  * Parameter: Queue {Q} | bit-level stride | range [0, 31]
@@ -131,7 +197,24 @@
 #define HW_ATL2_RX_Q_TC_MAP_WIDTH 3
 /* Default value of bitfield rx_q{Q}_tc_map[2:0] */
 #define HW_ATL2_RX_Q_TC_MAP_DEFAULT 0x0
-
+/* TX desc{D}_ts_wrb_en Bitfield Definitions
+ */
+#define HW_ATL2_TDM_DESCD_TS_WRB_EN_ADR(descriptor) \
+	(0x00007C08 + (descriptor) * 0x40)
+#define HW_ATL2_TDM_DESCD_TS_WRB_EN_MSK 0x00040000
+#define HW_ATL2_TDM_DESCD_TS_WRB_EN_SHIFT 18
+/* TX desc{D}_ts_en Bitfield Definitions
+ */
+#define HW_ATL2_TDM_DESCD_TS_EN_ADR(descriptor) \
+	(0x00007C08 + (descriptor) * 0x40)
+#define HW_ATL2_TDM_DESCD_TS_EN_MSK 0x00020000
+#define HW_ATL2_TDM_DESCD_TS_EN_SHIFT 17
+/* TX desc{D}_avb_en Bitfield Definitions
+ */
+#define HW_ATL2_TDM_DESCD_AVB_EN_ADR(descriptor) \
+	(0x00007C08 + (descriptor) * 0x40)
+#define HW_ATL2_TDM_DESCD_AVB_EN_MSK 0x00010000
+#define HW_ATL2_TDM_DESCD_AVB_EN_SHIFT 16
 /* tx tx_tc_q_rand_map_en bitfield definitions
  * preprocessor definitions for the bitfield "tx_tc_q_rand_map_en".
  * port="pif_tpb_tx_tc_q_rand_map_en_i"
@@ -221,7 +304,18 @@
 #define HW_ATL2_TPS_DATA_TCTCREDIT_MAX_WIDTH 16
 /* default value of bitfield data_tc{t}_credit_max[f:0] */
 #define HW_ATL2_TPS_DATA_TCTCREDIT_MAX_DEFAULT 0x0
-
+/* register address for bitfield pif_tpb_highest_prio_tc_en */
+#define HW_ATL2_TPB_HIGHEST_PRIO_TC_EN_ADR 0x00007180
+/* bitmask for bitfield pif_tpb_highest_prio_tc_en */
+#define HW_ATL2_TPB_HIGHEST_PRIO_TC_EN_MSK 0x00000100
+/* lower bit position of bitfield pif_tpb_highest_prio_tc_en */
+#define HW_ATL2_TPB_HIGHEST_PRIO_TC_EN_SHIFT 8
+/* register address for bitfield pif_tpb_highest_prio_tc */
+#define HW_ATL2_TPB_HIGHEST_PRIO_TC_ADR 0x00007180
+/* bitmask for bitfield pif_tpb_highest_prio_tc */
+#define HW_ATL2_TPB_HIGHEST_PRIO_TC_MSK 0x00000007
+/* lower bit position of bitfield pif_tpb_highest_prio_tc */
+#define HW_ATL2_TPB_HIGHEST_PRIO_TC_SHIFT 0
 /* tx data_tc{t}_weight[e:0] bitfield definitions
  * preprocessor definitions for the bitfield "data_tc{t}_weight[e:0]".
  * parameter: tc {t} | stride size 0x4 | range [0, 7]
@@ -248,7 +342,87 @@
  */
 
 #define HW_ATL2_TX_INTR_MODERATION_CTL_ADR(queue) (0x00007c28u + (queue) * 0x40)
-
+/* TX tx_data_rd_req_limit[7:0] Bitfield Definitions
+ */
+#define HW_ATL2_TDM_TX_DATA_RD_REQ_LIMIT_ADR 0x00007B04
+#define HW_ATL2_TDM_TX_DATA_RD_REQ_LIMIT_MSK 0x0000FF00
+#define HW_ATL2_TDM_TX_DATA_RD_REQ_LIMIT_SHIFT 8
+/* TX tx_desc_rd_req_limit[4:0] Bitfield Definitions
+ */
+#define HW_ATL2_TDM_TX_DESC_RD_REQ_LIMIT_ADR 0x00007B04
+#define HW_ATL2_TDM_TX_DESC_RD_REQ_LIMIT_MSK 0x0000001F
+#define HW_ATL2_TDM_TX_DESC_RD_REQ_LIMIT_SHIFT 0
+/* register address for bitfield uP Force Interrupt */
+#define HW_ATL2_GLB_CONTROL_2_ADR 0x00000404
+#define HW_ATL2_MIF_INTERRUPT_2_TO_ITR_MSK 0x00000100
+/* lower bit position of bitfield MIF Interrupt to ITR */
+#define HW_ATL2_MIF_INTERRUPT_TO_ITR_SHIFT 6
+#define HW_ATL2_EN_INTERRUPT_MIF2_TO_ITR_MSK 0x00001000
+/* lower bit position of bitfield Enable MIF Interrupt to ITR */
+#define HW_ATL2_EN_INTERRUPT_TO_ITR_SHIFT 0xA
+#define HW_ATL2_GLOBAL_INTERNAL_ALARMS_1_ADR 0x00000924
+#define HW_ATL2_GLOBAL_HIGH_PRIO_INTERRUPT_1_MASK_ADR 0x00000964
+/* bitmask for bitfield TSG PTM GPIO interrupt */
+#define HW_ATL2_TSG_TSG1_GPIO_INTERRUPT_MSK 0x00000200
+/* lower bit position of bitfield TSG PTM GPIO interrupt */
+#define HW_ATL2_TSG_TSG1_GPIO_INTERRUPT_SHIFT 9
+/* bitmask for bitfield TSG0 GPIO interrupt */
+#define HW_ATL2_TSG_TSG0_GPIO_INTERRUPT_MSK 0x00000020
+/* lower bit position of bitfield TSG0 GPIO interrupt */
+#define HW_ATL2_TSG_TSG0_GPIO_INTERRUPT_SHIFT 5
+/* TSG registers */
+#define HW_ATL2_TSG_REG_ADR(clk, reg_name) \
+	((clk) == 0 ? HW_ATL2_CLK0_##reg_name##_ADR :\
+		 HW_ATL2_CLK1_##reg_name##_ADR)
+
+#define HW_ATL2_CLK0_CLOCK_CFG_ADR 0x00000CA0u
+#define HW_ATL2_CLK1_CLOCK_CFG_ADR 0x00000D50u
+#define HW_ATL2_TSG_SYNC_RESET_MSK 0x00000001
+#define HW_ATL2_TSG_SYNC_RESET_SHIFT 0x00000000
+#define HW_ATL2_TSG_CLOCK_EN_MSK 0x00000002
+#define HW_ATL2_TSG_CLOCK_EN_SHIFT 0x00000001
+#define HW_ATL2_CLK0_CLOCK_MODIF_CTRL_ADR 0x00000CA4u
+#define HW_ATL2_CLK1_CLOCK_MODIF_CTRL_ADR 0x00000D54u
+#define HW_ATL2_TSG_SUBTRACT_COUNTER_MSK 0x00000002
+#define HW_ATL2_TSG_ADD_COUNTER_MSK 0x00000004
+#define HW_ATL2_TSG_LOAD_INC_CFG_MSK 0x00000008
+#define HW_ATL2_CLK0_CLOCK_MODIF_VAL_LSW_ADR 0x00000CA8u
+#define HW_ATL2_CLK1_CLOCK_MODIF_VAL_LSW_ADR 0x00000D58u
+#define HW_ATL2_CLK0_CLOCK_INC_CFG_ADR 0x00000CB0u
+#define HW_ATL2_CLK1_CLOCK_INC_CFG_ADR 0x00000D60u
+#define HW_ATL2_CLK0_READ_CUR_NS_LSW_ADR 0x00000CB8u
+#define HW_ATL2_CLK1_READ_CUR_NS_LSW_ADR 0x00000D68u
+
+#define HW_ATL2_CLK0_GPIO_CFG_ADR 0x00000CC4u
+#define HW_ATL2_CLK1_GPIO_CFG_ADR 0x00000D74u
+#define HW_ATL2_TSG_GPIO_IN_MONITOR_EN_SHIFT 0x00000000
+#define HW_ATL2_TSG_GPIO_IN_MONITOR_EN_MSK 0x00000001
+#define HW_ATL2_TSG_GPIO_IN_MODE_SHIFT 0x00000001
+#define HW_ATL2_TSG_GPIO_IN_MODE_MSK 0x00000006
+#define HW_ATL2_TSG_GPIO_IN_MODE_POSEDGE 0x00000000
+#define HW_ATL2_CLK0_EXT_CLK_COUNT_ADR 0x00000CCCu
+#define HW_ATL2_CLK1_EXT_CLK_COUNT_ADR 0x00000D7Cu
+#define HW_ATL2_CLK0_GPIO_EVENT_TS_LSW_ADR 0x00000CD0u
+#define HW_ATL2_CLK1_GPIO_EVENT_TS_LSW_ADR 0x00000D80u
+#define HW_ATL2_CLK0_GPIO_EVENT_GEN_TS_LSW_ADR 0x00000CE0u
+#define HW_ATL2_CLK1_GPIO_EVENT_GEN_TS_LSW_ADR 0x00000D90u
+#define HW_ATL2_CLK0_GPIO_EVENT_GEN_CFG_ADR 0x00000CE8u
+#define HW_ATL2_CLK1_GPIO_EVENT_GEN_CFG_ADR 0x00000D98u
+#define HW_ATL2_TSG_GPIO_OUTPUT_EN_SHIFT 0x00000000
+#define HW_ATL2_TSG_GPIO_OUTPUT_EN_MSK 0x00000001
+#define HW_ATL2_TSG_GPIO_EVENT_MODE_SHIFT 0x00000001
+#define HW_ATL2_TSG_GPIO_EVENT_MODE_MSK 0x00000006
+#define HW_ATL2_TSG_GPIO_EVENT_MODE_SET_ON_TIME 0x00000003
+#define HW_ATL2_TSG_GPIO_GEN_OUTPUT_EN_MSK 0x00000008
+#define HW_ATL2_CLK0_GPIO_EVENT_HIGH_TIME_LSW_ADR 0x00000CF0u
+#define HW_ATL2_CLK1_GPIO_EVENT_HIGH_TIME_LSW_ADR 0x00000DA0u
+#define HW_ATL2_CLK0_GPIO_EVENT_LOW_TIME_LSW_ADR 0x00000CF8u
+#define HW_ATL2_CLK1_GPIO_EVENT_LOW_TIME_LSW_ADR 0x00000DA8u
+/* PCIE Extended tag enable Bitfield Definitions
+ */
+#define HW_ATL2_PHI_EXT_TAG_EN_ADR 0x00001000
+#define HW_ATL2_PHI_EXT_TAG_EN_MSK 0x00000020
+#define HW_ATL2_PHI_EXT_TAG_EN_SHIFT 5
 /* Launch time control register */
 #define HW_ATL2_LT_CTRL_ADR 0x00007a1c
 
@@ -387,5 +561,25 @@
 #define HW_ATL2_MCP_HOST_REQ_INT_ADR 0x00000F00u
 #define HW_ATL2_MCP_HOST_REQ_INT_SET_ADR 0x00000F04u
 #define HW_ATL2_MCP_HOST_REQ_INT_CLR_ADR 0x00000F08u
-
+/* Register address for bitfield PTP EXT GPIO TS SEL */
+#define HW_ATL2_TSG0_EXT_GPIO_TS_INPUT_SEL_ADR 0x00003664
+/* Bitmask for bitfield PTP EXT GPIO TS SEL */
+#define HW_ATL2_TSG0_EXT_GPIO_TS_INPUT_SEL_MSK 0x00001F00
+/* Lower bit position of bitfield PTP EXT GPIO TS SEL */
+#define HW_ATL2_TSG0_EXT_GPIO_TS_INPUT_SEL_SHIFT 8
+/* Register address for bitfield TSG EXT GPIO TS SEL */
+#define HW_ATL2_TSG1_EXT_GPIO_TS_INPUT_SEL_ADR 0x00003660
+/* Bitmask for bitfield TSG EXT GPIO TS SEL */
+#define HW_ATL2_TSG1_EXT_GPIO_TS_INPUT_SEL_MSK 0x00001F00
+/* Lower bit position of bitfield TSG EXT GPIO TS SEL */
+#define HW_ATL2_TSG1_EXT_GPIO_TS_INPUT_SEL_SHIFT 8
+/* Register address for bitfield GPIO{P} Special Mode */
+#define HW_ATL2_GPIO_PIN_SPEC_MODE_ADR(pin) (0x00003698 + (pin) * 0x4)
+/* Bitmask for bitfield GPIO{P} Special Mode */
+#define HW_ATL2_GPIO_PIN_SPEC_MODE_MSK 0x0000000C
+/* Lower bit position of bitfield GPIO{P} Special Mode */
+#define HW_ATL2_GPIO_PIN_SPEC_MODE_SHIFT 2
+#define HW_ATL2_GPIO_PIN_SPEC_MODE_TSG1_EVENT_OUTPUT 0
+#define HW_ATL2_GPIO_PIN_SPEC_MODE_TSG0_EVENT_OUTPUT 2
+#define HW_ATL2_GPIO_PIN_SPEC_MODE_GPIO 3
 #endif /* HW_ATL2_LLH_INTERNAL_H */
-- 
2.43.0


^ permalink raw reply related
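
The *_MSK / *_SHIFT pairs added to hw_atl2_llh_internal.h above follow the usual register-bitfield convention. As a minimal userspace sketch (not the driver's code: field_set() and field_get() are hypothetical helper names, and only the macro values are taken from the patch), a read-modify-write on such a field could look like this:

```c
#include <stdint.h>

/* Mask/shift values taken from the patch above. */
#define HW_ATL2_RPF_L3_V4_TAG_MSK   0x00000070u
#define HW_ATL2_RPF_L3_V4_TAG_SHIFT 4
#define HW_ATL2_RPF_L3_V6_TAG_MSK   0x00700000u
#define HW_ATL2_RPF_L3_V6_TAG_SHIFT 20

/* Hypothetical helper: replace one bitfield inside a 32-bit register value. */
static inline uint32_t field_set(uint32_t reg, uint32_t msk, unsigned int shift,
				 uint32_t val)
{
	/* Clear the old field, then insert the new value, truncated to the mask. */
	return (reg & ~msk) | ((val << shift) & msk);
}

/* Hypothetical helper: extract one bitfield from a 32-bit register value. */
static inline uint32_t field_get(uint32_t reg, uint32_t msk, unsigned int shift)
{
	return (reg & msk) >> shift;
}
```

Since the v4 and v6 tag fields share one register address at different shifts, updating one through a masked write of this kind leaves the other untouched.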

* [PATCH net-next 3/9] net: atlantic: decouple aq_set_data_fl3l4() from driver internals
From: sukhdeeps @ 2026-05-06 13:57 UTC (permalink / raw)
  To: netdev
  Cc: irusskikh, epomozov, richardcochran, andrew+netdev, davem,
	edumazet, kuba, pabeni, linux-kernel, Sukhdeep Singh
In-Reply-To: <20260506135706.2834-1-sukhdeeps@marvell.com>

From: Sukhdeep Singh <sukhdeeps@marvell.com>

Refactor aq_set_data_fl3l4() to take an ethtool_rx_flow_spec pointer and
an explicit HW register location instead of driver-internal structures
(aq_nic_s, aq_rx_filter). This makes the function reusable for PTP
filter setup which constructs flow specs independently.

Key changes:
- Add aq_is_ipv6_flow_type() helper to derive IPv6 status from the
  flow_type field, replacing the dependency on rx_fltrs->fl3l4.is_ipv6
  shared state.
- Change aq_set_data_fl3l4() signature to accept (fsp, data, location,
  add) and export it via aq_filters.h.
- Update aq_add_del_fl3l4() to compute the HW register location and
  pass it explicitly.

Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
---
 .../ethernet/aquantia/atlantic/aq_filters.c   | 31 ++++++++++++++-----
 .../ethernet/aquantia/atlantic/aq_filters.h   |  3 ++
 2 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_filters.c b/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
index 150a0b1af26a..4be7b629bfac 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
@@ -181,6 +181,20 @@ aq_check_approve_fvlan(struct aq_nic_s *aq_nic,
 	return 0;
 }
 
+static bool aq_is_ipv6_flow_type(const struct ethtool_rx_flow_spec *fsp)
+{
+	switch (fsp->flow_type & ~FLOW_EXT) {
+	case TCP_V6_FLOW:
+	case UDP_V6_FLOW:
+	case SCTP_V6_FLOW:
+	case IPV6_FLOW:
+	case IPV6_USER_FLOW:
+		return true;
+	default:
+		return false;
+	}
+}
+
 static int __must_check
 aq_check_filter(struct aq_nic_s *aq_nic,
 		struct ethtool_rx_flow_spec *fsp)
@@ -466,18 +480,16 @@ static int aq_add_del_fvlan(struct aq_nic_s *aq_nic,
 	return aq_filters_vlans_update(aq_nic);
 }
 
-static int aq_set_data_fl3l4(struct aq_nic_s *aq_nic,
-			     struct aq_rx_filter *aq_rx_fltr,
-			     struct aq_rx_filter_l3l4 *data, bool add)
+int aq_set_data_fl3l4(const struct ethtool_rx_flow_spec *fsp,
+		      struct aq_rx_filter_l3l4 *data,
+		      int location, bool add)
 {
-	struct aq_hw_rx_fltrs_s *rx_fltrs = aq_get_hw_rx_fltrs(aq_nic);
-	const struct ethtool_rx_flow_spec *fsp = &aq_rx_fltr->aq_fsp;
 	u32 flow = fsp->flow_type & ~FLOW_EXT;
 
 	memset(data, 0, sizeof(*data));
 
-	data->is_ipv6 = rx_fltrs->fl3l4.is_ipv6;
-	data->location = HW_ATL_GET_REG_LOCATION_FL3L4(fsp->location);
+	data->is_ipv6 = aq_is_ipv6_flow_type(fsp);
+	data->location = location;
 
 	if (!add)
 		return 0;
@@ -569,13 +581,16 @@ static int aq_add_del_fl3l4(struct aq_nic_s *aq_nic,
 	const struct aq_hw_ops *aq_hw_ops = aq_nic->aq_hw_ops;
 	struct aq_hw_s *aq_hw = aq_nic->aq_hw;
 	struct aq_rx_filter_l3l4 data;
+	int location;
 	int err;
 
 	if (unlikely(aq_rx_fltr->aq_fsp.location < AQ_RX_FIRST_LOC_FL3L4 ||
 		     aq_rx_fltr->aq_fsp.location > AQ_RX_LAST_LOC_FL3L4))
 		return -EINVAL;
 
-	aq_set_data_fl3l4(aq_nic, aq_rx_fltr, &data, add);
+	location = HW_ATL_GET_REG_LOCATION_FL3L4(aq_rx_fltr->aq_fsp.location);
+
+	aq_set_data_fl3l4(&aq_rx_fltr->aq_fsp, &data, location, add);
 
 	err = aq_set_fl3l4(aq_hw, aq_hw_ops, &data);
 	if (err)
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_filters.h b/drivers/net/ethernet/aquantia/atlantic/aq_filters.h
index 122e06c88a33..96e89c8e52d0 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_filters.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_filters.h
@@ -32,5 +32,8 @@ int aq_clear_rxnfc_all_rules(struct aq_nic_s *aq_nic);
 int aq_reapply_rxnfc_all_rules(struct aq_nic_s *aq_nic);
 int aq_filters_vlans_update(struct aq_nic_s *aq_nic);
 int aq_filters_vlan_offload_off(struct aq_nic_s *aq_nic);
+int aq_set_data_fl3l4(const struct ethtool_rx_flow_spec *fsp,
+		      struct aq_rx_filter_l3l4 *data,
+		      int location, bool add);
 
 #endif /* AQ_FILTERS_H */
-- 
2.43.0


^ permalink raw reply related
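
The aq_is_ipv6_flow_type() helper from patch 3/9 above can be tried outside the kernel. The following is a sketch only: the flow-type constants are local copies that are assumed to match include/uapi/linux/ethtool.h, and the function takes the raw flow_type value instead of an ethtool_rx_flow_spec pointer so that it builds without kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the ethtool flow-type constants (assumed to match
 * include/uapi/linux/ethtool.h; real code should include that header).
 */
#define TCP_V4_FLOW	0x01
#define UDP_V4_FLOW	0x02
#define SCTP_V4_FLOW	0x03
#define TCP_V6_FLOW	0x05
#define UDP_V6_FLOW	0x06
#define SCTP_V6_FLOW	0x07
#define IPV4_USER_FLOW	0x0d
#define IPV6_USER_FLOW	0x0e
#define IPV4_FLOW	0x10
#define IPV6_FLOW	0x11
#define FLOW_EXT	0x80000000u

/* Userspace rendering of the helper: derive "is IPv6" from flow_type alone,
 * after masking out the FLOW_EXT flag.
 */
static bool is_ipv6_flow_type(uint32_t flow_type)
{
	switch (flow_type & ~FLOW_EXT) {
	case TCP_V6_FLOW:
	case UDP_V6_FLOW:
	case SCTP_V6_FLOW:
	case IPV6_FLOW:
	case IPV6_USER_FLOW:
		return true;
	default:
		return false;
	}
}
```

Deriving the answer from the flow spec itself, instead of reading rx_fltrs->fl3l4.is_ipv6, is what lets a caller that builds its own flow spec use the function without touching shared driver state.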

* [PATCH net-next 2/9] net: atlantic: move active_ipv4/ipv6 bitmap updates after HW write
From: sukhdeeps @ 2026-05-06 13:56 UTC (permalink / raw)
  To: netdev
  Cc: irusskikh, epomozov, richardcochran, andrew+netdev, davem,
	edumazet, kuba, pabeni, linux-kernel, Sukhdeep Singh
In-Reply-To: <20260506135706.2834-1-sukhdeeps@marvell.com>

From: Sukhdeep Singh <sukhdeeps@marvell.com>

Move active_ipv4/active_ipv6 bitmap updates from aq_set_data_fl3l4()
into aq_add_del_fl3l4() after the hardware write succeeds. The bitmaps
track which filter slots are actively programmed in hardware and must
only be updated once the HW write is confirmed.

Also remove bitmap manipulation from aq_nic_reserve_filter() and
aq_nic_release_filter(). These functions manage filter slot reservation
counts, not HW filter state. Setting active_ipv4 bits at reservation
time (before any filter is programmed) and clearing them at release
time (regardless of HW state) results in incorrect state visible to
aq_check_approve_fl3l4() for IPv4/IPv6 mixing validation.

This corrected state management is required for the AQC113 L3L4 filter
path introduced later in this series.

Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
---
 .../ethernet/aquantia/atlantic/aq_filters.c   | 36 ++++++++++++-------
 .../net/ethernet/aquantia/atlantic/aq_nic.c   |  3 --
 2 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_filters.c b/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
index eef52f23166d..150a0b1af26a 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
@@ -479,15 +479,8 @@ static int aq_set_data_fl3l4(struct aq_nic_s *aq_nic,
 	data->is_ipv6 = rx_fltrs->fl3l4.is_ipv6;
 	data->location = HW_ATL_GET_REG_LOCATION_FL3L4(fsp->location);
 
-	if (!add) {
-		if (!data->is_ipv6)
-			rx_fltrs->fl3l4.active_ipv4 &= ~BIT(data->location);
-		else
-			rx_fltrs->fl3l4.active_ipv6 &=
-				~BIT((data->location) / 4);
-
+	if (!add)
 		return 0;
-	}
 
 	data->cmd |= HW_ATL_RX_ENABLE_FLTR_L3L4;
 
@@ -515,11 +508,9 @@ static int aq_set_data_fl3l4(struct aq_nic_s *aq_nic,
 			ntohl(fsp->h_u.tcp_ip4_spec.ip4src);
 		data->ip_dst[0] =
 			ntohl(fsp->h_u.tcp_ip4_spec.ip4dst);
-		rx_fltrs->fl3l4.active_ipv4 |= BIT(data->location);
 	} else {
 		int i;
 
-		rx_fltrs->fl3l4.active_ipv6 |= BIT((data->location) / 4);
 		for (i = 0; i < HW_ATL_RX_CNT_REG_ADDR_IPV6; ++i) {
 			data->ip_dst[i] =
 				ntohl(fsp->h_u.tcp_ip6_spec.ip6dst[i]);
@@ -574,16 +565,35 @@ static int aq_set_fl3l4(struct aq_hw_s *aq_hw,
 static int aq_add_del_fl3l4(struct aq_nic_s *aq_nic,
 			    struct aq_rx_filter *aq_rx_fltr, bool add)
 {
+	struct aq_hw_rx_fltrs_s *rx_fltrs = aq_get_hw_rx_fltrs(aq_nic);
 	const struct aq_hw_ops *aq_hw_ops = aq_nic->aq_hw_ops;
 	struct aq_hw_s *aq_hw = aq_nic->aq_hw;
 	struct aq_rx_filter_l3l4 data;
+	int err;
 
 	if (unlikely(aq_rx_fltr->aq_fsp.location < AQ_RX_FIRST_LOC_FL3L4 ||
-		     aq_rx_fltr->aq_fsp.location > AQ_RX_LAST_LOC_FL3L4  ||
-		     aq_set_data_fl3l4(aq_nic, aq_rx_fltr, &data, add)))
+		     aq_rx_fltr->aq_fsp.location > AQ_RX_LAST_LOC_FL3L4))
 		return -EINVAL;
 
-	return aq_set_fl3l4(aq_hw, aq_hw_ops, &data);
+	aq_set_data_fl3l4(aq_nic, aq_rx_fltr, &data, add);
+
+	err = aq_set_fl3l4(aq_hw, aq_hw_ops, &data);
+	if (err)
+		return err;
+
+	if (add) {
+		if (!data.is_ipv6)
+			rx_fltrs->fl3l4.active_ipv4 |= BIT(data.location);
+		else
+			rx_fltrs->fl3l4.active_ipv6 |= BIT(data.location / 4);
+	} else {
+		if (!data.is_ipv6)
+			rx_fltrs->fl3l4.active_ipv4 &= ~BIT(data.location);
+		else
+			rx_fltrs->fl3l4.active_ipv6 &= ~BIT(data.location / 4);
+	}
+
+	return 0;
 }
 
 static int aq_add_del_rule(struct aq_nic_s *aq_nic,
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
index ef9447810071..3cec853e9fad 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -1522,8 +1522,6 @@ u8 aq_nic_reserve_filter(struct aq_nic_s *self, enum aq_rx_filter_type type)
 	case aq_rx_filter_l3l4:
 		fltr_cnt = AQ_RX_LAST_LOC_FL3L4 - AQ_RX_FIRST_LOC_FL3L4;
 		n_bit = fltr_cnt - self->aq_hw_rx_fltrs.fl3l4.reserved_count;
-
-		self->aq_hw_rx_fltrs.fl3l4.active_ipv4 |= BIT(n_bit);
 		self->aq_hw_rx_fltrs.fl3l4.reserved_count++;
 		location = n_bit;
 		break;
@@ -1543,7 +1541,6 @@ void aq_nic_release_filter(struct aq_nic_s *self, enum aq_rx_filter_type type,
 		break;
 	case aq_rx_filter_l3l4:
 		self->aq_hw_rx_fltrs.fl3l4.reserved_count--;
-		self->aq_hw_rx_fltrs.fl3l4.active_ipv4 &= ~BIT(location);
 		break;
 	default:
 		break;
-- 
2.43.0


^ permalink raw reply related
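
The ordering that patch 2/9 above establishes (program the hardware first, touch the active bitmaps only on success) can be modelled in a few lines of userspace C. This is an illustrative sketch, not the driver's code: struct fl3l4_state, add_del_filter() and the two hw_write_* stubs are hypothetical names; only the bit indexing (location for IPv4, location / 4 for IPv6) is taken from the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two "active" bitmaps tracked by the driver. */
struct fl3l4_state {
	uint32_t active_ipv4;
	uint32_t active_ipv6;
};

/* Hypothetical stand-in for the hardware write: 0 on success, negative on error. */
typedef int (*hw_write_fn)(int location, bool add);

static int hw_write_ok(int location, bool add)
{
	(void)location;
	(void)add;
	return 0;
}

static int hw_write_fail(int location, bool add)
{
	(void)location;
	(void)add;
	return -5; /* arbitrary error code for the sketch */
}

/* Update the bitmap only after the (simulated) hardware write has succeeded. */
static int add_del_filter(struct fl3l4_state *st, hw_write_fn hw_write,
			  bool is_ipv6, int location, bool add)
{
	/* IPv6 filters are indexed per group of four locations, as in the diff. */
	uint32_t bit = 1u << (is_ipv6 ? location / 4 : location);
	uint32_t *map = is_ipv6 ? &st->active_ipv6 : &st->active_ipv4;
	int err = hw_write(location, add);

	if (err)
		return err; /* bitmaps stay as they were */

	if (add)
		*map |= bit;
	else
		*map &= ~bit;

	return 0;
}
```

With this ordering, a failed write leaves the bitmaps describing what is actually programmed, which is the state the IPv4/IPv6 mixing check relies on.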

* [PATCH net-next 1/9] net: atlantic: correct L3L4 filter flow_type masking and IPv6 handling
From: sukhdeeps @ 2026-05-06 13:56 UTC (permalink / raw)
  To: netdev
  Cc: irusskikh, epomozov, richardcochran, andrew+netdev, davem,
	edumazet, kuba, pabeni, linux-kernel, Sukhdeep Singh
In-Reply-To: <20260506135706.2834-1-sukhdeeps@marvell.com>

From: Sukhdeep Singh <sukhdeeps@marvell.com>

Correct three issues in aq_set_data_fl3l4() required for the AQC113
PTP filter path introduced later in this series:

1. Mask FLOW_EXT from flow_type before the protocol switch statement.
   Flow types with FLOW_EXT set (e.g. TCP_V4_FLOW | FLOW_EXT) fall
   through to the default case and skip protocol comparison flags.

2. Extend the L3 address comparison check to cover all four IPv6
   words. The original code only checked ip_src[0]/ip_dst[0] and
   required !is_ipv6, so CMP_SRC_ADDR_L3/CMP_DEST_ADDR_L3 were never
   set for IPv6 filters.

3. Use explicit flow type checks for port extraction instead of
   negating IP_USER_FLOW/IPV6_USER_FLOW. The old check did not mask
   FLOW_EXT, so IP_USER_FLOW | FLOW_EXT would incorrectly attempt
   port extraction. Use the actual flow type to pick the correct
   union member directly.

Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
---
 .../ethernet/aquantia/atlantic/aq_filters.c   | 33 ++++++++++---------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_filters.c b/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
index e419c73b32ce..eef52f23166d 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
@@ -472,6 +472,7 @@ static int aq_set_data_fl3l4(struct aq_nic_s *aq_nic,
 {
 	struct aq_hw_rx_fltrs_s *rx_fltrs = aq_get_hw_rx_fltrs(aq_nic);
 	const struct ethtool_rx_flow_spec *fsp = &aq_rx_fltr->aq_fsp;
+	u32 flow = fsp->flow_type & ~FLOW_EXT;
 
 	memset(data, 0, sizeof(*data));
 
@@ -490,7 +491,7 @@ static int aq_set_data_fl3l4(struct aq_nic_s *aq_nic,
 
 	data->cmd |= HW_ATL_RX_ENABLE_FLTR_L3L4;
 
-	switch (fsp->flow_type) {
+	switch (flow) {
 	case TCP_V4_FLOW:
 	case TCP_V6_FLOW:
 		data->cmd |= HW_ATL_RX_ENABLE_CMP_PROT_L4;
@@ -527,23 +528,23 @@ static int aq_set_data_fl3l4(struct aq_nic_s *aq_nic,
 		}
 		data->cmd |= HW_ATL_RX_ENABLE_L3_IPV6;
 	}
-	if (fsp->flow_type != IP_USER_FLOW &&
-	    fsp->flow_type != IPV6_USER_FLOW) {
-		if (!data->is_ipv6) {
-			data->p_dst =
-				ntohs(fsp->h_u.tcp_ip4_spec.pdst);
-			data->p_src =
-				ntohs(fsp->h_u.tcp_ip4_spec.psrc);
-		} else {
-			data->p_dst =
-				ntohs(fsp->h_u.tcp_ip6_spec.pdst);
-			data->p_src =
-				ntohs(fsp->h_u.tcp_ip6_spec.psrc);
-		}
+	if (flow == TCP_V4_FLOW || flow == UDP_V4_FLOW ||
+	    flow == SCTP_V4_FLOW) {
+		data->p_dst = ntohs(fsp->h_u.tcp_ip4_spec.pdst);
+		data->p_src = ntohs(fsp->h_u.tcp_ip4_spec.psrc);
+	}
+	if (flow == TCP_V6_FLOW || flow == UDP_V6_FLOW ||
+	    flow == SCTP_V6_FLOW) {
+		data->p_dst = ntohs(fsp->h_u.tcp_ip6_spec.pdst);
+		data->p_src = ntohs(fsp->h_u.tcp_ip6_spec.psrc);
 	}
-	if (data->ip_src[0] && !data->is_ipv6)
+	if (data->ip_src[0] ||
+	    (data->is_ipv6 && (data->ip_src[1] || data->ip_src[2] ||
+			       data->ip_src[3])))
 		data->cmd |= HW_ATL_RX_ENABLE_CMP_SRC_ADDR_L3;
-	if (data->ip_dst[0] && !data->is_ipv6)
+	if (data->ip_dst[0] ||
+	    (data->is_ipv6 && (data->ip_dst[1] || data->ip_dst[2] ||
+			       data->ip_dst[3])))
 		data->cmd |= HW_ATL_RX_ENABLE_CMP_DEST_ADDR_L3;
 	if (data->p_dst)
 		data->cmd |= HW_ATL_RX_ENABLE_CMP_DEST_PORT_L4;
-- 
2.43.0


^ permalink raw reply related
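
Item 2 of patch 1/9 above (checking all four address words for IPv6) reduces to a small predicate. The sketch below is a standalone restatement of the condition in the diff; l3_addr_is_set() is a hypothetical name, and the array stands in for data->ip_src or data->ip_dst.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true when an L3 address compare should be enabled.
 * IPv4 uses only word 0; IPv6 may have its non-zero bits in any of the
 * four 32-bit words, so all of them are inspected.
 */
static bool l3_addr_is_set(const uint32_t ip[4], bool is_ipv6)
{
	if (ip[0])
		return true;

	return is_ipv6 && (ip[1] || ip[2] || ip[3]);
}
```

The old condition (ip[0] && !is_ipv6) could never be true for an IPv6 filter, which is why the compare flags were not set on that path before this change.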

* [PATCH net-next 0/9] net: atlantic: add PTP support for AQC113 (Antigua)
From: sukhdeeps @ 2026-05-06 13:56 UTC (permalink / raw)
  To: netdev
  Cc: irusskikh, epomozov, richardcochran, andrew+netdev, davem,
	edumazet, kuba, pabeni, linux-kernel, Sukhdeep Singh

From: Sukhdeep Singh <sukhdeeps@marvell.com>

This series adds IEEE 1588 PTP support for the AQC113 (Antigua) network
controller. AQC113 is the successor to the existing AQC107 (Atlantic)
chip already supported by the atlantic driver.

AQC113 uses a substantially different hardware architecture for PTP
compared to AQC107:

  - Dual on-chip TSG clocks with direct register access instead of
    PHY-based timestamping via firmware
  - TX timestamps via descriptor writeback instead of firmware mailbox
  - Hardware L3/L4 RX filters for PTP multicast steering with both
    IPv4 and IPv6 support
  - Reference-counted shared filter slots managed through an Action
    Resolver Table (ART), allowing multiple rules to share L3/L4
    hardware filters when their match criteria are identical

The series is structured in three parts:

Patches 1-3 prepare the existing L3/L4 filter path:

  Patch 1 corrects flow_type masking and IPv6 address handling in
  aq_set_data_fl3l4(). Patch 2 moves the active_ipv4/ipv6 bitmap
  updates to after the hardware write succeeds. Patch 3 decouples
  the function from driver-internal structures so it can be called
  directly by the AQC113 PTP filter setup code.

Patches 4-6 add the AQC113 hardware infrastructure:

  Patch 4 adds the low-level register definitions and accessor
  functions. Patch 5 adds filter data structures and firmware
  capability query. Patch 6 implements the complete L2/L3/L4 RX
  filter management layer including the reference-counted sharing
  and ART integration.

Patches 7-9 add the AQC113 PTP feature:

  Patch 7 reserves the dedicated PTP traffic class buffer and
  configures the TX path. Patch 8 extends the hw_ops interface
  with PTP-specific function pointers and updates AQC107 to the
  new signatures. Patch 9 implements the full PTP subsystem
  integration for AQC113.

The existing AQC107 PTP implementation is not functionally changed
by this series; AQC113-specific code paths are gated on chip
detection throughout.

Tested on AQC113 at 1G, 2.5G, 5G, and 10G link speeds using
ptp4l/phc2sys with hardware timestamping in both L2 and L4
(IPv4/IPv6) modes.

Sukhdeep Singh (9):
  net: atlantic: correct L3L4 filter flow_type masking and IPv6 handling
  net: atlantic: move active_ipv4/ipv6 bitmap updates after HW write
  net: atlantic: decouple aq_set_data_fl3l4() from driver internals
  net: atlantic: add AQC113 hardware register definitions and accessors
  net: atlantic: add AQC113 filter data structures and firmware query
  net: atlantic: implement AQC113 L2/L3/L4 RX filter management
  net: atlantic: add AQC113 PTP traffic class and TX path setup
  net: atlantic: extend hw_ops and TX descriptor for AQC113 PTP
  net: atlantic: add PTP support for AQC113 (Antigua)

 drivers/net/ethernet/aquantia/atlantic/aq_filters.c          |  64 +-
 drivers/net/ethernet/aquantia/atlantic/aq_filters.h          |   3 +
 drivers/net/ethernet/aquantia/atlantic/aq_hw.h               |  37 +-
 drivers/net/ethernet/aquantia/atlantic/aq_main.c             |  33 +-
 drivers/net/ethernet/aquantia/atlantic/aq_nic.c              |  51 +-
 drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c         |   4 +-
 drivers/net/ethernet/aquantia/atlantic/aq_ptp.c              | 531 +++++++--
 drivers/net/ethernet/aquantia/atlantic/aq_ptp.h              |  15 +-
 drivers/net/ethernet/aquantia/atlantic/aq_ring.c             |  42 +-
 drivers/net/ethernet/aquantia/atlantic/aq_ring.h             |   4 +-
 drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c   |  15 +-
 drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c    | 813 +++++++++++-
 drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.h    |  12 +
 drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_internal.h | 69 +-
 drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh.c | 360 ++++++
 drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh.h | 107 +-
 drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_llh_internal.h | 204 ++-
 drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.c |  33 +
 drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils.h |  15 +
 drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2_utils_fw.c |  52 +
 20 files changed, 2244 insertions(+), 224 deletions(-)

-- 
2.43.0

^ permalink raw reply

* Re: [PATCH 1/2] nfc: llcp: Fix use-after-free in llcp_sock_release()
From: Lee Jones @ 2026-05-06 13:51 UTC (permalink / raw)
  To: David Heidelberg
  Cc: Jakub Kicinski, David Heidelberg, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Kuniyuki Iwashima, Kees Cook,
	Junxi Qian, Ingo Molnar, Samuel Ortiz, netdev, linux-kernel
In-Reply-To: <F9112727-E4DF-4884-807A-015809EA5DC7@ixit.cz>

On Wed, 06 May 2026, David Heidelberg wrote:

> Hello Lee.
> 
> Yeah, I think these should hit the linux-next integration tree today, and I need to set up the thank-you email to work in `b4 review` :)

Thanks David.  And thanks for picking up the new role.

BTW, you may want to configure your mailer too. :)

> -------- Original Message --------
> From: Lee Jones <lee@kernel.org>
> Sent: 6 May 2026 08:11:45 UTC
> To: Jakub Kicinski <kuba@kernel.org>
> Cc: David Heidelberg <david+nfc@ixit.cz>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>, Kuniyuki Iwashima <kuniyu@google.com>, Kees Cook <kees@kernel.org>, Junxi Qian <qjx1298677004@gmail.com>, Ingo Molnar <mingo@kernel.org>, Samuel Ortiz <sameo@linux.intel.com>, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 1/2] nfc: llcp: Fix use-after-free in llcp_sock_release()
> 
> On Fri, 01 May 2026, Jakub Kicinski wrote:
> 
> > On Wed, 29 Apr 2026 13:40:41 +0000 Lee Jones wrote:
> > > llcp_sock_release() unconditionally unlinks the socket from the local
> > > sockets list.  However, if the socket is still in connecting state, it
> > > is on the connecting list.
> > > 
> > > Fix this by checking the socket state and unlinking from the correct list.
> > > 
> > > Fixes: b4011239a08e ("NFC: llcp: Fix non blocking sockets connections")
> > > Signed-off-by: Lee Jones <lee@kernel.org>
> > 
> > Adding David H and dropping from netdev's patchwork..
> 
> Is anyone looking at these please?
> 
> These are pretty important.
> 

-- 
Lee Jones

^ permalink raw reply

* Re: [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4)
From: syzbot @ 2026-05-06 13:48 UTC (permalink / raw)
  To: akpm, arjan, davem, dsahern, edumazet, horms, jgg, kuba, kuni1840,
	kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzkaller-bugs, yanjun.zhu, zyjzyj2000
In-Reply-To: <69ea344f.a00a0220.17a17.0040.GAE@google.com>

syzbot has found a reproducer for the following issue on:

HEAD commit:    74fe02ce122a Merge tag 'wq-for-7.1-rc2-fixes' of git://git..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16e895ce580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=59da38148f3a3d24
dashboard link: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13a613ba580000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-74fe02ce.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/c0a591d96864/vmlinux-74fe02ce.xz
kernel image: https://storage.googleapis.com/syzbot-assets/9f94fb623cd1/bzImage-74fe02ce.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com

Oops: general protection fault, probably for non-canonical address 0xdffffc000000000d: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f]
CPU: 3 UID: 0 PID: 5986 Comm: syz.3.20 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:kernel_sock_shutdown+0x47/0x70 net/socket.c:3785
Code: fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 33 48 b8 00 00 00 00 00 fc ff df 4c 8b 63 20 49 8d 7c 24 68 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 1a 49 8b 44 24 68 89 ee 48 89 df 5b 5d 41 5c ff e0
RSP: 0018:ffffc9000391f180 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff88802a2a0040 RCX: ffffffff8b8b72bd
RDX: 000000000000000d RSI: ffffffff89553b32 RDI: 0000000000000068
RBP: 0000000000000002 R08: 0000000000000001 R09: fffff52000723dfc
R10: ffffc9000391efe7 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8880311b8000 R14: 0000000000000002 R15: 0000000000000018
FS:  00007f602d1fe6c0(0000) GS:ffff8880d6675000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561c522a6000 CR3: 000000002e99e000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 udp_tunnel_sock_release+0x68/0x80 net/ipv4/udp_tunnel_core.c:202
 rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
 rxe_sock_put+0xae/0x130 drivers/infiniband/sw/rxe/rxe_net.c:639
 rxe_net_del+0x83/0x120 drivers/infiniband/sw/rxe/rxe_net.c:660
 rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
 nldev_dellink+0x289/0x3c0 drivers/infiniband/core/nldev.c:1849
 rdma_nl_rcv_msg+0x392/0x6f0 drivers/infiniband/core/netlink.c:195
 rdma_nl_rcv_skb.constprop.0.isra.0+0x2cb/0x410 drivers/infiniband/core/netlink.c:239
 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
 netlink_unicast+0x585/0x850 net/netlink/af_netlink.c:1344
 netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
 sock_sendmsg_nosec net/socket.c:787 [inline]
 __sock_sendmsg net/socket.c:802 [inline]
 ____sys_sendmsg+0x9e1/0xb70 net/socket.c:2698
 ___sys_sendmsg+0x190/0x1e0 net/socket.c:2752
 __sys_sendmsg+0x170/0x220 net/socket.c:2784
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x10b/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f602db9cdd9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f602d1fe028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f602de16090 RCX: 00007f602db9cdd9
RDX: 0000000000000000 RSI: 00002000000002c0 RDI: 0000000000000007
RBP: 00007f602dc32d69 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f602de16128 R14: 00007f602de16090 R15: 00007ffc1d89c428
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:kernel_sock_shutdown+0x47/0x70 net/socket.c:3785
Code: fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 33 48 b8 00 00 00 00 00 fc ff df 4c 8b 63 20 49 8d 7c 24 68 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 1a 49 8b 44 24 68 89 ee 48 89 df 5b 5d 41 5c ff e0
RSP: 0018:ffffc9000391f180 EFLAGS: 00010202

RAX: dffffc0000000000 RBX: ffff88802a2a0040 RCX: ffffffff8b8b72bd
RDX: 000000000000000d RSI: ffffffff89553b32 RDI: 0000000000000068
RBP: 0000000000000002 R08: 0000000000000001 R09: fffff52000723dfc
R10: ffffc9000391efe7 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8880311b8000 R14: 0000000000000002 R15: 0000000000000018
FS:  00007f602d1fe6c0(0000) GS:ffff8880d6675000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561c522a6000 CR3: 000000002e99e000 CR4: 0000000000352ef0
----------------
Code disassembly (best guess):
   0:	fc                   	cld
   1:	ff                   	lcall  (bad)
   2:	df 48 89             	fisttps -0x77(%rax)
   5:	fa                   	cli
   6:	48 c1 ea 03          	shr    $0x3,%rdx
   a:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
   e:	75 33                	jne    0x43
  10:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  17:	fc ff df
  1a:	4c 8b 63 20          	mov    0x20(%rbx),%r12
  1e:	49 8d 7c 24 68       	lea    0x68(%r12),%rdi
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1) <-- trapping instruction
  2e:	75 1a                	jne    0x4a
  30:	49 8b 44 24 68       	mov    0x68(%r12),%rax
  35:	89 ee                	mov    %ebp,%esi
  37:	48 89 df             	mov    %rbx,%rdi
  3a:	5b                   	pop    %rbx
  3b:	5d                   	pop    %rbp
  3c:	41 5c                	pop    %r12
  3e:	ff e0                	jmp    *%rax
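
The faulting address can be tied back to the KASAN range with a little arithmetic. The sketch below is a userspace illustration, not kernel code: the helper names are made up for this note, and the constants are taken from the register dump and disassembly above (shift by 3, offset 0xdffffc0000000000 in RAX). It assumes the usual generic-KASAN mapping of one shadow byte per 8 bytes of memory.

```c
#include <assert.h>
#include <stdint.h>

/* One shadow byte per 8 bytes of memory (the "shr $0x3,%rdx" above). */
#define KASAN_SHADOW_SCALE_SHIFT 3
/* Value loaded into RAX by "movabs $0xdffffc0000000000,%rax". */
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Hypothetical helper: shadow byte address checked for a given access. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Hypothetical helper: first byte of the 8-byte granule behind a shadow byte. */
static uint64_t kasan_granule_start(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}

/* Re-derive the numbers printed in the report. */
static void check_report_values(void)
{
	/* RDI = 0x68: "lea 0x68(%r12),%rdi" with R12 == 0 */
	uint64_t rdi = 0x68;
	uint64_t fault = kasan_shadow_addr(rdi);

	/* Matches the non-canonical address in the Oops line. */
	assert(fault == 0xdffffc000000000dULL);
	/* Matches the reported range [0x68-0x6f]. */
	assert(kasan_granule_start(fault) == 0x68);
	assert(kasan_granule_start(fault) + 7 == 0x6f);
}
```

In other words, the trap is the KASAN shadow check for a load at offset 0x68 from a NULL base: R12, loaded from 0x20(%rbx), is zero in the register dump.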


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

^ permalink raw reply

* Re: [PATCH net-next] net: ethernet: atheros: atl2: remove kernel backward-compatibility code
From: Alexander Lobakin @ 2026-05-06 13:43 UTC (permalink / raw)
  To: Ethan Nelson-Moore
  Cc: netdev, Chris Snook, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Ingo Molnar, Thomas Gleixner
In-Reply-To: <20260506054035.23710-1-enelsonmoore@gmail.com>

From: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Date: Tue,  5 May 2026 22:40:27 -0700

> The atl2 driver contains code for compatibility with old kernels that
> do not support module_param_array. Backward compatibility is
> irrelevant because this driver is in-tree. Remove this unreachable
> code to simplify the driver's handling of module parameters.
> 
> Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>

Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>

> ---
>  drivers/net/ethernet/atheros/atlx/atl2.c | 37 ++----------------------
>  1 file changed, 3 insertions(+), 34 deletions(-)

Thanks,
Olek

^ permalink raw reply

* [PATCH net-next V2 3/3] net/mlx5: Add VHCA_ID page management mode support
From: Tariq Toukan @ 2026-05-06 13:32 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Moshe Shemesh, Akiva Goldberger, netdev, linux-rdma, linux-kernel,
	Gal Pressman, Dragos Tatulea
In-Reply-To: <20260506133239.276237-1-tariqt@nvidia.com>

From: Moshe Shemesh <moshe@nvidia.com>

Add support for VHCA_ID-based page management mode. When the device
firmware advertises the icm_mng_function_id_mode capability with
MLX5_ID_MODE_FUNCTION_VHCA_ID, page management operations between the
driver and firmware may use vhca_id instead of function_id as the
effective function identifier, and the ec_function field is ignored.

Update page management commands to set the ec_function field only in
FUNC_ID mode. Boot page allocation always uses FUNC_ID mode semantics
for backward compatibility, as the capability bit is only available
after set_hca_cap(). If VHCA_ID mode is selected after set_hca_cap(),
re-key the boot pages tracked in page_root_xa to use vhca_id as well.

Add mlx5_esw_vhca_id_to_func_type() to resolve the function type in
VHCA_ID mode, enabling per-type debugfs counters. Use a dedicated
vhca_type_map xarray to provide lockless lookup. Store the resolved
type on each fw_page at allocation time so that the reclaim and release
paths read it directly without any lookup.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Akiva Goldberger <agoldberger@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
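Illustration only, not part of the change: a minimal userspace sketch of how the page_root_xa key appears to be derived in each mode, and of when the boot-time PF entry would need to be re-keyed. The enum and function names here are hypothetical stand-ins for get_function_key() and the migration check in the diff below, under the assumption that the boot entry is always stored under function id 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's page management mode. */
enum page_mgt_mode { MODE_FUNC_ID, MODE_VHCA_ID };

/*
 * FUNC_ID mode: key = function id, with ec_function in bit 16.
 * VHCA_ID mode: key = vhca_id alone; ec_function is ignored.
 */
static uint32_t function_key(enum page_mgt_mode mode, uint16_t func_vhca_id,
			     bool ec_function)
{
	if (mode == MODE_VHCA_ID)
		return func_vhca_id;

	return (uint32_t)func_vhca_id | ((uint32_t)ec_function << 16);
}

/*
 * Boot pages are tracked under the FUNC_ID key for function 0. After
 * switching to VHCA_ID mode the entry has to move unless both keys
 * happen to be equal.
 */
static bool boot_root_needs_rekey(uint16_t vhca_id, bool ec_function)
{
	uint32_t old_key = function_key(MODE_FUNC_ID, 0, ec_function);

	return old_key != vhca_id;
}
```

For example, an ECPF with vhca_id 0 still needs the move, because its boot key is 0x10000 rather than 0.
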
 .../net/ethernet/mellanox/mlx5/core/eswitch.c |  45 +++-
 .../net/ethernet/mellanox/mlx5/core/eswitch.h |   8 +
 .../net/ethernet/mellanox/mlx5/core/main.c    |   3 +
 .../ethernet/mellanox/mlx5/core/pagealloc.c   | 250 +++++++++++++-----
 include/linux/mlx5/driver.h                   |   7 +
 5 files changed, 251 insertions(+), 62 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index e0eafcf0c52a..125129ef43e3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -852,6 +852,38 @@ bool mlx5_esw_vport_vhca_id(struct mlx5_eswitch *esw, u16 vportn, u16 *vhca_id)
 	return true;
 }
 
+static enum mlx5_func_type
+esw_vport_to_func_type(struct mlx5_eswitch *esw, struct mlx5_vport *vport)
+{
+	u16 vport_num = vport->vport;
+
+	if (vport_num == MLX5_VPORT_HOST_PF)
+		return MLX5_HOST_PF;
+	if (xa_get_mark(&esw->vports, vport_num, MLX5_ESW_VPT_SF))
+		return MLX5_SF;
+	if (xa_get_mark(&esw->vports, vport_num, MLX5_ESW_VPT_VF))
+		return MLX5_VF;
+	return MLX5_EC_VF;
+}
+
+u16 mlx5_esw_vhca_id_to_func_type(struct mlx5_core_dev *dev, u16 vhca_id)
+{
+	struct mlx5_eswitch *esw = dev->priv.eswitch;
+	void *entry;
+
+	if (vhca_id == MLX5_CAP_GEN(dev, vhca_id))
+		return MLX5_SELF;
+
+	if (!esw)
+		return MLX5_FUNC_TYPE_NONE;
+
+	entry = xa_load(&esw->vhca_type_map, vhca_id);
+	if (entry)
+		return xa_to_value(entry);
+
+	return MLX5_FUNC_TYPE_NONE;
+}
+
 static int esw_vport_setup(struct mlx5_eswitch *esw, struct mlx5_vport *vport)
 {
 	bool vst_mode_steering = esw_vst_mode_is_steering(esw);
@@ -942,6 +974,11 @@ int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, struct mlx5_vport *vport,
 		ret = mlx5_esw_vport_vhca_id_map(esw, vport);
 		if (ret)
 			goto err_vhca_mapping;
+		ret = xa_insert(&esw->vhca_type_map, vport->vhca_id,
+				xa_mk_value(esw_vport_to_func_type(esw, vport)),
+				GFP_KERNEL);
+		if (ret)
+			goto err_type_map;
 	}
 
 	esw_vport_change_handle_locked(vport);
@@ -952,6 +989,8 @@ int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, struct mlx5_vport *vport,
 	mutex_unlock(&esw->state_lock);
 	return ret;
 
+err_type_map:
+	mlx5_esw_vport_vhca_id_unmap(esw, vport);
 err_vhca_mapping:
 	esw_vport_cleanup(esw, vport);
 	mutex_unlock(&esw->state_lock);
@@ -976,8 +1015,10 @@ void mlx5_esw_vport_disable(struct mlx5_eswitch *esw, struct mlx5_vport *vport)
 		arm_vport_context_events_cmd(esw->dev, vport_num, 0);
 
 	if (!mlx5_esw_is_manager_vport(esw, vport_num) &&
-	    MLX5_CAP_GEN(esw->dev, vport_group_manager))
+	    MLX5_CAP_GEN(esw->dev, vport_group_manager)) {
+		xa_erase(&esw->vhca_type_map, vport->vhca_id);
 		mlx5_esw_vport_vhca_id_unmap(esw, vport);
+	}
 
 	if (vport->vport != MLX5_VPORT_HOST_PF &&
 	    (vport->info.ipsec_crypto_enabled || vport->info.ipsec_packet_enabled))
@@ -2084,6 +2125,7 @@ int mlx5_eswitch_init(struct mlx5_core_dev *dev)
 	atomic64_set(&esw->offloads.num_flows, 0);
 	ida_init(&esw->offloads.vport_metadata_ida);
 	xa_init_flags(&esw->offloads.vhca_map, XA_FLAGS_ALLOC);
+	xa_init(&esw->vhca_type_map);
 	mutex_init(&esw->state_lock);
 	init_rwsem(&esw->mode_lock);
 	refcount_set(&esw->qos.refcnt, 0);
@@ -2133,6 +2175,7 @@ void mlx5_eswitch_cleanup(struct mlx5_eswitch *esw)
 	mutex_destroy(&esw->state_lock);
 	WARN_ON(!xa_empty(&esw->offloads.vhca_map));
 	xa_destroy(&esw->offloads.vhca_map);
+	xa_destroy(&esw->vhca_type_map);
 	ida_destroy(&esw->offloads.vport_metadata_ida);
 	mlx5e_mod_hdr_tbl_destroy(&esw->offloads.mod_hdr);
 	mutex_destroy(&esw->offloads.encap_tbl_lock);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 2fd601bd102f..b06d097824ad 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -373,6 +373,7 @@ struct mlx5_eswitch {
 	struct dentry *debugfs_root;
 	struct workqueue_struct *work_queue;
 	struct xarray vports;
+	struct xarray vhca_type_map;
 	u32 flags;
 	int                     total_vports;
 	int                     enabled_vports;
@@ -863,6 +864,7 @@ void mlx5_esw_vport_vhca_id_unmap(struct mlx5_eswitch *esw,
 				  struct mlx5_vport *vport);
 int mlx5_eswitch_vhca_id_to_vport(struct mlx5_eswitch *esw, u16 vhca_id, u16 *vport_num);
 bool mlx5_esw_vport_vhca_id(struct mlx5_eswitch *esw, u16 vportn, u16 *vhca_id);
+u16 mlx5_esw_vhca_id_to_func_type(struct mlx5_core_dev *dev, u16 vhca_id);
 
 void mlx5_esw_offloads_rep_remove(struct mlx5_eswitch *esw,
 				  const struct mlx5_vport *vport);
@@ -1034,6 +1036,12 @@ mlx5_esw_vport_vhca_id(struct mlx5_eswitch *esw, u16 vportn, u16 *vhca_id)
 	return false;
 }
 
+static inline u16
+mlx5_esw_vhca_id_to_func_type(struct mlx5_core_dev *dev, u16 vhca_id)
+{
+	return MLX5_FUNC_TYPE_NONE;
+}
+
 static inline void
 mlx5_eswitch_safe_aux_devs_remove(struct mlx5_core_dev *dev) {}
 static inline struct mlx5_flow_handle *
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 0c1c906b60fa..296c5223cf61 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -597,6 +597,9 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
 	if (MLX5_CAP_GEN_MAX(dev, release_all_pages))
 		MLX5_SET(cmd_hca_cap, set_hca_cap, release_all_pages, 1);
 
+	if (MLX5_CAP_GEN_MAX(dev, icm_mng_function_id_mode))
+		MLX5_SET(cmd_hca_cap, set_hca_cap, icm_mng_function_id_mode, 1);
+
 	if (MLX5_CAP_GEN_MAX(dev, mkey_by_name))
 		MLX5_SET(cmd_hca_cap, set_hca_cap, mkey_by_name, 1);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
index 77ffa31cc505..ce2f7fa9bd48 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
@@ -38,6 +38,7 @@
 #include "mlx5_core.h"
 #include "lib/eq.h"
 #include "lib/tout.h"
+#include "eswitch.h"
 
 enum {
 	MLX5_PAGES_CANT_GIVE	= 0,
@@ -59,6 +60,7 @@ struct fw_page {
 	u64			addr;
 	struct page	       *page;
 	u32			function;
+	u16			func_type;
 	unsigned long		bitmask;
 	struct list_head	list;
 	unsigned int free_count;
@@ -69,9 +71,24 @@ enum {
 	MLX5_NUM_4K_IN_PAGE		= PAGE_SIZE / MLX5_ADAPTER_PAGE_SIZE,
 };
 
-static u32 get_function(u16 func_id, bool ec_function)
+static bool mlx5_page_mgt_mode_is_vhca_id(const struct mlx5_core_dev *dev)
 {
-	return (u32)func_id | (ec_function << 16);
+	return dev->priv.page_mgt_mode == MLX5_PAGE_MGT_MODE_VHCA_ID;
+}
+
+static void mlx5_page_mgt_mode_set(struct mlx5_core_dev *dev,
+				   enum mlx5_page_mgt_mode mode)
+{
+	dev->priv.page_mgt_mode = mode;
+}
+
+static u32 get_function_key(struct mlx5_core_dev *dev, u16 func_vhca_id,
+			    bool ec_function)
+{
+	if (mlx5_page_mgt_mode_is_vhca_id(dev))
+		return (u32)func_vhca_id;
+
+	return (u32)func_vhca_id | (ec_function << 16);
 }
 
 static u16 func_id_to_type(struct mlx5_core_dev *dev, u16 func_id, bool ec_function)
@@ -89,12 +106,21 @@ static u16 func_id_to_type(struct mlx5_core_dev *dev, u16 func_id, bool ec_funct
 	return MLX5_SF;
 }
 
+static u16 func_vhca_id_to_type(struct mlx5_core_dev *dev, u16 func_vhca_id,
+				bool ec_function)
+{
+	if (mlx5_page_mgt_mode_is_vhca_id(dev))
+		return mlx5_esw_vhca_id_to_func_type(dev, func_vhca_id);
+
+	return func_id_to_type(dev, func_vhca_id, ec_function);
+}
+
 static u32 mlx5_get_ec_function(u32 function)
 {
 	return function >> 16;
 }
 
-static u32 mlx5_get_func_id(u32 function)
+static u32 mlx5_get_func_vhca_id(u32 function)
 {
 	return function & 0xffff;
 }
@@ -123,7 +149,8 @@ static struct rb_root *page_root_per_function(struct mlx5_core_dev *dev, u32 fun
 	return root;
 }
 
-static int insert_page(struct mlx5_core_dev *dev, u64 addr, struct page *page, u32 function)
+static int insert_page(struct mlx5_core_dev *dev, u64 addr, struct page *page,
+		       u32 function, u16 func_type)
 {
 	struct rb_node *parent = NULL;
 	struct rb_root *root;
@@ -156,6 +183,7 @@ static int insert_page(struct mlx5_core_dev *dev, u64 addr, struct page *page, u
 	nfp->addr = addr;
 	nfp->page = page;
 	nfp->function = function;
+	nfp->func_type = func_type;
 	nfp->free_count = MLX5_NUM_4K_IN_PAGE;
 	for (i = 0; i < MLX5_NUM_4K_IN_PAGE; i++)
 		set_bit(i, &nfp->bitmask);
@@ -196,7 +224,7 @@ static struct fw_page *find_fw_page(struct mlx5_core_dev *dev, u64 addr,
 	return result;
 }
 
-static int mlx5_cmd_query_pages(struct mlx5_core_dev *dev, u16 *func_id,
+static int mlx5_cmd_query_pages(struct mlx5_core_dev *dev, u16 *func_vhca_id,
 				s32 *npages, int boot)
 {
 	u32 out[MLX5_ST_SZ_DW(query_pages_out)] = {};
@@ -207,14 +235,20 @@ static int mlx5_cmd_query_pages(struct mlx5_core_dev *dev, u16 *func_id,
 	MLX5_SET(query_pages_in, in, op_mod, boot ?
 		 MLX5_QUERY_PAGES_IN_OP_MOD_BOOT_PAGES :
 		 MLX5_QUERY_PAGES_IN_OP_MOD_INIT_PAGES);
-	MLX5_SET(query_pages_in, in, embedded_cpu_function, mlx5_core_is_ecpf(dev));
+
+	if (mlx5_page_mgt_mode_is_vhca_id(dev))
+		MLX5_SET(query_pages_in, in, function_id,
+			 MLX5_CAP_GEN(dev, vhca_id));
+	else
+		MLX5_SET(query_pages_in, in, embedded_cpu_function,
+			 mlx5_core_is_ecpf(dev));
 
 	err = mlx5_cmd_exec_inout(dev, query_pages, in, out);
 	if (err)
 		return err;
 
 	*npages = MLX5_GET(query_pages_out, out, num_pages);
-	*func_id = MLX5_GET(query_pages_out, out, function_id);
+	*func_vhca_id = MLX5_GET(query_pages_out, out, function_id);
 
 	return err;
 }
@@ -245,6 +279,10 @@ static int alloc_4k(struct mlx5_core_dev *dev, u64 *addr, u32 function)
 	if (!fp->free_count)
 		list_del(&fp->list);
 
+	if (fp->func_type != MLX5_FUNC_TYPE_NONE)
+		dev->priv.page_counters[fp->func_type]++;
+	dev->priv.fw_pages++;
+
 	*addr = fp->addr + n * MLX5_ADAPTER_PAGE_SIZE;
 
 	return 0;
@@ -280,6 +318,11 @@ static void free_4k(struct mlx5_core_dev *dev, u64 addr, u32 function)
 		mlx5_core_warn_rl(dev, "page not found\n");
 		return;
 	}
+
+	if (fwp->func_type != MLX5_FUNC_TYPE_NONE)
+		dev->priv.page_counters[fwp->func_type]--;
+	dev->priv.fw_pages--;
+
 	n = (addr & ~MLX5_U64_4K_PAGE_MASK) >> MLX5_ADAPTER_PAGE_SHIFT;
 	fwp->free_count++;
 	set_bit(n, &fwp->bitmask);
@@ -289,7 +332,8 @@ static void free_4k(struct mlx5_core_dev *dev, u64 addr, u32 function)
 		list_add(&fwp->list, &dev->priv.free_list);
 }
 
-static int alloc_system_page(struct mlx5_core_dev *dev, u32 function)
+static int alloc_system_page(struct mlx5_core_dev *dev, u32 function,
+			     u16 func_type)
 {
 	struct device *device = mlx5_core_dma_dev(dev);
 	int nid = dev->priv.numa_node;
@@ -317,7 +361,7 @@ static int alloc_system_page(struct mlx5_core_dev *dev, u32 function)
 		goto map;
 	}
 
-	err = insert_page(dev, addr, page, function);
+	err = insert_page(dev, addr, page, function, func_type);
 	if (err) {
 		mlx5_core_err(dev, "failed to track allocated page\n");
 		dma_unmap_page(device, addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
@@ -334,7 +378,7 @@ static int alloc_system_page(struct mlx5_core_dev *dev, u32 function)
 	return err;
 }
 
-static void page_notify_fail(struct mlx5_core_dev *dev, u16 func_id,
+static void page_notify_fail(struct mlx5_core_dev *dev, u16 func_vhca_id,
 			     bool ec_function)
 {
 	u32 in[MLX5_ST_SZ_DW(manage_pages_in)] = {};
@@ -342,19 +386,23 @@ static void page_notify_fail(struct mlx5_core_dev *dev, u16 func_id,
 
 	MLX5_SET(manage_pages_in, in, opcode, MLX5_CMD_OP_MANAGE_PAGES);
 	MLX5_SET(manage_pages_in, in, op_mod, MLX5_PAGES_CANT_GIVE);
-	MLX5_SET(manage_pages_in, in, function_id, func_id);
-	MLX5_SET(manage_pages_in, in, embedded_cpu_function, ec_function);
+	MLX5_SET(manage_pages_in, in, function_id, func_vhca_id);
+
+	if (!mlx5_page_mgt_mode_is_vhca_id(dev))
+		MLX5_SET(manage_pages_in, in, embedded_cpu_function,
+			 ec_function);
 
 	err = mlx5_cmd_exec_in(dev, manage_pages, in);
 	if (err)
-		mlx5_core_warn(dev, "page notify failed func_id(%d) err(%d)\n",
-			       func_id, err);
+		mlx5_core_warn(dev,
+			       "page notify failed func_vhca_id(%d) err(%d)\n",
+			       func_vhca_id, err);
 }
 
-static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
+static int give_pages(struct mlx5_core_dev *dev, u16 func_vhca_id, int npages,
 		      int event, bool ec_function)
 {
-	u32 function = get_function(func_id, ec_function);
+	u32 function = get_function_key(dev, func_vhca_id, ec_function);
 	u32 out[MLX5_ST_SZ_DW(manage_pages_out)] = {0};
 	int inlen = MLX5_ST_SZ_BYTES(manage_pages_in);
 	int notify_fail = event;
@@ -364,6 +412,8 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 	u32 *in;
 	int i;
 
+	func_type = func_vhca_id_to_type(dev, func_vhca_id, ec_function);
+
 	inlen += npages * MLX5_FLD_SZ_BYTES(manage_pages_in, pas[0]);
 	in = kvzalloc(inlen, GFP_KERNEL);
 	if (!in) {
@@ -377,7 +427,8 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 		err = alloc_4k(dev, &addr, function);
 		if (err) {
 			if (err == -ENOMEM)
-				err = alloc_system_page(dev, function);
+				err = alloc_system_page(dev, function,
+							func_type);
 			if (err) {
 				dev->priv.fw_pages_alloc_failed += (npages - i);
 				goto out_4k;
@@ -390,9 +441,12 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 
 	MLX5_SET(manage_pages_in, in, opcode, MLX5_CMD_OP_MANAGE_PAGES);
 	MLX5_SET(manage_pages_in, in, op_mod, MLX5_PAGES_GIVE);
-	MLX5_SET(manage_pages_in, in, function_id, func_id);
+	MLX5_SET(manage_pages_in, in, function_id, func_vhca_id);
 	MLX5_SET(manage_pages_in, in, input_num_entries, npages);
-	MLX5_SET(manage_pages_in, in, embedded_cpu_function, ec_function);
+
+	if (!mlx5_page_mgt_mode_is_vhca_id(dev))
+		MLX5_SET(manage_pages_in, in, embedded_cpu_function,
+			 ec_function);
 
 	err = mlx5_cmd_do(dev, in, inlen, out, sizeof(out));
 	if (err == -EREMOTEIO) {
@@ -405,17 +459,15 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 	}
 	err = mlx5_cmd_check(dev, err, in, out);
 	if (err) {
-		mlx5_core_warn(dev, "func_id 0x%x, npages %d, err %d\n",
-			       func_id, npages, err);
+		mlx5_core_warn(dev, "func_vhca_id 0x%x, npages %d, err %d\n",
+			       func_vhca_id, npages, err);
 		goto out_dropped;
 	}
 
-	func_type = func_id_to_type(dev, func_id, ec_function);
-	dev->priv.page_counters[func_type] += npages;
-	dev->priv.fw_pages += npages;
-
-	mlx5_core_dbg(dev, "npages %d, ec_function %d, func_id 0x%x, err %d\n",
-		      npages, ec_function, func_id, err);
+	mlx5_core_dbg(dev,
+		      "npages %d, ec_function %d, func 0x%x, mode %d, err %d\n",
+		      npages, ec_function, func_vhca_id,
+		      mlx5_page_mgt_mode_is_vhca_id(dev), err);
 
 	kvfree(in);
 	return 0;
@@ -428,18 +480,17 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 out_free:
 	kvfree(in);
 	if (notify_fail)
-		page_notify_fail(dev, func_id, ec_function);
+		page_notify_fail(dev, func_vhca_id, ec_function);
 	return err;
 }
 
-static void release_all_pages(struct mlx5_core_dev *dev, u16 func_id,
+static void release_all_pages(struct mlx5_core_dev *dev, u16 func_vhca_id,
 			      bool ec_function)
 {
-	u32 function = get_function(func_id, ec_function);
+	u32 function = get_function_key(dev, func_vhca_id, ec_function);
 	struct rb_root *root;
 	struct rb_node *p;
 	int npages = 0;
-	u16 func_type;
 
 	root = xa_load(&dev->priv.page_root_xa, function);
 	if (WARN_ON_ONCE(!root))
@@ -448,18 +499,20 @@ static void release_all_pages(struct mlx5_core_dev *dev, u16 func_id,
 	p = rb_first(root);
 	while (p) {
 		struct fw_page *fwp = rb_entry(p, struct fw_page, rb_node);
+		int used = MLX5_NUM_4K_IN_PAGE - fwp->free_count;
 
 		p = rb_next(p);
-		npages += (MLX5_NUM_4K_IN_PAGE - fwp->free_count);
+		npages += used;
+		if (fwp->func_type != MLX5_FUNC_TYPE_NONE)
+			dev->priv.page_counters[fwp->func_type] -= used;
 		free_fwp(dev, fwp, fwp->free_count);
 	}
 
-	func_type = func_id_to_type(dev, func_id, ec_function);
-	dev->priv.page_counters[func_type] -= npages;
 	dev->priv.fw_pages -= npages;
 
-	mlx5_core_dbg(dev, "npages %d, ec_function %d, func_id 0x%x\n",
-		      npages, ec_function, func_id);
+	mlx5_core_dbg(dev, "npages %d, ec_function %d, func 0x%x, mode %d\n",
+		      npages, ec_function, func_vhca_id,
+		      mlx5_page_mgt_mode_is_vhca_id(dev));
 }
 
 static u32 fwp_fill_manage_pages_out(struct fw_page *fwp, u32 *out, u32 index,
@@ -487,7 +540,7 @@ static int reclaim_pages_cmd(struct mlx5_core_dev *dev,
 	struct fw_page *fwp;
 	struct rb_node *p;
 	bool ec_function;
-	u32 func_id;
+	u32 func_vhca_id;
 	u32 npages;
 	u32 i = 0;
 	int err;
@@ -499,10 +552,11 @@ static int reclaim_pages_cmd(struct mlx5_core_dev *dev,
 
 	/* No hard feelings, we want our pages back! */
 	npages = MLX5_GET(manage_pages_in, in, input_num_entries);
-	func_id = MLX5_GET(manage_pages_in, in, function_id);
+	func_vhca_id = MLX5_GET(manage_pages_in, in, function_id);
 	ec_function = MLX5_GET(manage_pages_in, in, embedded_cpu_function);
 
-	root = xa_load(&dev->priv.page_root_xa, get_function(func_id, ec_function));
+	root = xa_load(&dev->priv.page_root_xa,
+		       get_function_key(dev, func_vhca_id, ec_function));
 	if (WARN_ON_ONCE(!root))
 		return -EEXIST;
 
@@ -518,14 +572,14 @@ static int reclaim_pages_cmd(struct mlx5_core_dev *dev,
 	return 0;
 }
 
-static int reclaim_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
-			 int *nclaimed, bool event, bool ec_function)
+static int reclaim_pages(struct mlx5_core_dev *dev, u16 func_vhca_id,
+			 int npages, int *nclaimed, bool event,
+			 bool ec_function)
 {
-	u32 function = get_function(func_id, ec_function);
+	u32 function = get_function_key(dev, func_vhca_id, ec_function);
 	int outlen = MLX5_ST_SZ_BYTES(manage_pages_out);
 	u32 in[MLX5_ST_SZ_DW(manage_pages_in)] = {};
 	int num_claimed;
-	u16 func_type;
 	u32 *out;
 	int err;
 	int i;
@@ -540,12 +594,16 @@ static int reclaim_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 
 	MLX5_SET(manage_pages_in, in, opcode, MLX5_CMD_OP_MANAGE_PAGES);
 	MLX5_SET(manage_pages_in, in, op_mod, MLX5_PAGES_TAKE);
-	MLX5_SET(manage_pages_in, in, function_id, func_id);
+	MLX5_SET(manage_pages_in, in, function_id, func_vhca_id);
 	MLX5_SET(manage_pages_in, in, input_num_entries, npages);
-	MLX5_SET(manage_pages_in, in, embedded_cpu_function, ec_function);
 
-	mlx5_core_dbg(dev, "func 0x%x, npages %d, outlen %d\n",
-		      func_id, npages, outlen);
+	if (!mlx5_page_mgt_mode_is_vhca_id(dev))
+		MLX5_SET(manage_pages_in, in, embedded_cpu_function,
+			 ec_function);
+
+	mlx5_core_dbg(dev, "func 0x%x, npages %d, outlen %d mode %d\n",
+		      func_vhca_id, npages, outlen,
+		      mlx5_page_mgt_mode_is_vhca_id(dev));
 	err = reclaim_pages_cmd(dev, in, sizeof(in), out, outlen);
 	if (err) {
 		npages = MLX5_GET(manage_pages_in, in, input_num_entries);
@@ -577,10 +635,6 @@ static int reclaim_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 	if (nclaimed)
 		*nclaimed = num_claimed;
 
-	func_type = func_id_to_type(dev, func_id, ec_function);
-	dev->priv.page_counters[func_type] -= num_claimed;
-	dev->priv.fw_pages -= num_claimed;
-
 out_free:
 	kvfree(out);
 	return err;
@@ -658,30 +712,102 @@ static int req_pages_handler(struct notifier_block *nb,
 	 * req->npages (and not min ()).
 	 */
 	req->npages = max_t(s32, npages, MAX_RECLAIM_NPAGES);
-	req->ec_function = ec_function;
+	if (!mlx5_page_mgt_mode_is_vhca_id(dev))
+		req->ec_function = ec_function;
 	req->release_all = release_all;
 	INIT_WORK(&req->work, pages_work_handler);
 	queue_work(dev->priv.pg_wq, &req->work);
 	return NOTIFY_OK;
 }
 
+/*
+ * After set_hca_cap(), the second satisfy_startup_pages(dev, 0) may see
+ * VHCA_ID mode. If page_root_xa already has the PF entry from the first
+ * (boot) call under FUNC_ID keys 0 or (ec_function << 16), migrate that
+ * entry to the device vhca_id key so lookups use VHCA_ID semantics.
+ */
+static int mlx5_pagealloc_migrate_pf_to_vhca_id(struct mlx5_core_dev *dev)
+{
+	u32 vhca_id_key, old_key;
+	struct rb_root *root;
+	struct fw_page *fwp;
+	struct rb_node *p;
+	bool ec_function;
+	int err;
+
+	if (xa_empty(&dev->priv.page_root_xa))
+		return 0;
+
+	vhca_id_key = MLX5_CAP_GEN(dev, vhca_id);
+	ec_function = mlx5_core_is_ecpf(dev);
+
+	old_key = ec_function ? (1U << 16) : 0;
+	root = xa_load(&dev->priv.page_root_xa, old_key);
+	if (!root)
+		return 0;
+
+	if (old_key == vhca_id_key)
+		return 0;
+
+	err = xa_insert(&dev->priv.page_root_xa, vhca_id_key, root, GFP_KERNEL);
+	if (err) {
+		mlx5_core_warn(dev,
+			       "failed to migrate page root key 0x%x to vhca_id 0x%x\n",
+			       old_key, vhca_id_key);
+		return err;
+	}
+
+	for (p = rb_first(root); p; p = rb_next(p)) {
+		fwp = rb_entry(p, struct fw_page, rb_node);
+		fwp->function = vhca_id_key;
+	}
+
+	xa_erase(&dev->priv.page_root_xa, old_key);
+
+	return 0;
+}
+
 int mlx5_satisfy_startup_pages(struct mlx5_core_dev *dev, int boot)
 {
-	u16 func_id;
+	bool ec_function = false;
+	u16 func_vhca_id;
 	s32 npages;
 	int err;
 
-	err = mlx5_cmd_query_pages(dev, &func_id, &npages, boot);
+	/* Boot pages are requested before set_hca_cap(), so the capability
+	 * is not negotiated yet; use FUNC_ID mode for backward compatibility.
+	 * Init pages are requested after set_hca_cap(), which unconditionally
+	 * enables CAP_GEN_MAX. Current caps are not re-queried at this point,
+	 * so check CAP_GEN_MAX directly.
+	 */
+	if (boot) {
+		mlx5_page_mgt_mode_set(dev, MLX5_PAGE_MGT_MODE_FUNC_ID);
+	} else {
+		if (MLX5_CAP_GEN_MAX(dev, icm_mng_function_id_mode) ==
+		    MLX5_ID_MODE_FUNCTION_VHCA_ID) {
+			err = mlx5_pagealloc_migrate_pf_to_vhca_id(dev);
+			if (err)
+				return err;
+			mlx5_page_mgt_mode_set(dev, MLX5_PAGE_MGT_MODE_VHCA_ID);
+		}
+	}
+
+	err = mlx5_cmd_query_pages(dev, &func_vhca_id, &npages, boot);
 	if (err)
 		return err;
 
-	mlx5_core_dbg(dev, "requested %d %s pages for func_id 0x%x\n",
-		      npages, boot ? "boot" : "init", func_id);
+	mlx5_core_dbg(dev,
+		      "requested %d %s pages for func_vhca_id 0x%x\n",
+		      npages, boot ? "boot" : "init", func_vhca_id);
 
 	if (!npages)
 		return 0;
 
-	return give_pages(dev, func_id, npages, 0, mlx5_core_is_ecpf(dev));
+	/* In VHCA_ID mode, ec_function remains false (not used). */
+	if (!mlx5_page_mgt_mode_is_vhca_id(dev))
+		ec_function = mlx5_core_is_ecpf(dev);
+
+	return give_pages(dev, func_vhca_id, npages, 0, ec_function);
 }
 
 enum {
@@ -709,15 +835,17 @@ static int mlx5_reclaim_root_pages(struct mlx5_core_dev *dev,
 
 	while (!RB_EMPTY_ROOT(root)) {
 		u32 ec_function = mlx5_get_ec_function(function);
-		u32 function_id = mlx5_get_func_id(function);
+		u32 func_vhca_id = mlx5_get_func_vhca_id(function);
 		int nclaimed;
 		int err;
 
-		err = reclaim_pages(dev, function_id, optimal_reclaimed_pages(),
+		err = reclaim_pages(dev, func_vhca_id,
+				    optimal_reclaimed_pages(),
 				    &nclaimed, false, ec_function);
 		if (err) {
-			mlx5_core_warn(dev, "reclaim_pages err (%d) func_id=0x%x ec_func=0x%x\n",
-				       err, function_id, ec_function);
+			mlx5_core_warn(dev,
+				       "reclaim_pages err (%d) func_vhca_id=0x%x ec_func=0x%x\n",
+				       err, func_vhca_id, ec_function);
 			return err;
 		}
 
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index d1751c5d01c7..8b4d384125d1 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -558,6 +558,12 @@ enum mlx5_func_type {
 	MLX5_HOST_PF,
 	MLX5_EC_VF,
 	MLX5_FUNC_TYPE_NUM,
+	MLX5_FUNC_TYPE_NONE = MLX5_FUNC_TYPE_NUM,
+};
+
+enum mlx5_page_mgt_mode {
+	MLX5_PAGE_MGT_MODE_FUNC_ID,
+	MLX5_PAGE_MGT_MODE_VHCA_ID,
 };
 
 struct mlx5_frag_buf_node_pools;
@@ -578,6 +584,7 @@ struct mlx5_priv {
 	u32			fw_pages_alloc_failed;
 	u32			give_pages_dropped;
 	u32			reclaim_pages_discard;
+	enum mlx5_page_mgt_mode	page_mgt_mode;
 
 	struct mlx5_core_health health;
 	struct list_head	traps;
-- 
2.44.0
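
The re-keying done by mlx5_pagealloc_migrate_pf_to_vhca_id() in the patch
above can be modeled in userspace roughly as follows. This is only a
sketch under stated assumptions: the demo_* names and the fixed-size table
are hypothetical stand-ins for page_root_xa, not driver code, and the
per-page fwp->function update from the patch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SLOTS 8

/* Hypothetical stand-in for page_root_xa: a tiny key -> root table. */
struct demo_slot {
	bool used;
	uint32_t key;
	void *root;
};

struct demo_table {
	struct demo_slot slots[DEMO_SLOTS];
};

/* Return the slot stored under @key, or NULL if there is none. */
static struct demo_slot *demo_find(struct demo_table *t, uint32_t key)
{
	assert(t != NULL);
	for (size_t i = 0; i < DEMO_SLOTS; i++)
		if (t->slots[i].used && t->slots[i].key == key)
			return &t->slots[i];
	return NULL;
}

/* Insert @root under @key; fails if the key exists or the table is full. */
static int demo_insert(struct demo_table *t, uint32_t key, void *root)
{
	if (demo_find(t, key))
		return -1;
	for (size_t i = 0; i < DEMO_SLOTS; i++) {
		if (!t->slots[i].used) {
			t->slots[i] = (struct demo_slot){ true, key, root };
			return 0;
		}
	}
	return -1;
}

/* FUNC_ID-mode key of the PF itself: function id 0, ec_function in bit 16. */
static uint32_t demo_pf_func_id_key(bool ec_function)
{
	return ec_function ? (1U << 16) : 0;
}

/* Re-key the PF entry: insert under vhca_id first, then drop the old key. */
static int demo_migrate_pf(struct demo_table *t, bool ec_function,
			   uint32_t vhca_id)
{
	uint32_t old_key = demo_pf_func_id_key(ec_function);
	struct demo_slot *old = demo_find(t, old_key);

	if (!old || old_key == vhca_id)
		return 0; /* nothing to migrate */
	if (demo_insert(t, vhca_id, old->root))
		return -1;
	old->used = false;
	return 0;
}
```

As in the patch, the new key is inserted before the old one is erased, so
a failed insert leaves the original entry in place.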



* [PATCH net-next V2 1/3] net/mlx5: Relax capability check for eswitch query paths
From: Tariq Toukan @ 2026-05-06 13:32 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Moshe Shemesh, Akiva Goldberger, netdev, linux-rdma, linux-kernel,
	Gal Pressman, Dragos Tatulea
In-Reply-To: <20260506133239.276237-1-tariqt@nvidia.com>

From: Moshe Shemesh <moshe@nvidia.com>

Several eswitch functions that only query other functions' HCA
capabilities or read cached vport state are guarded by the
vhca_resource_manager capability. This capability is required for
set_hca_cap operations, but query_hca_cap on other functions requires
only the vport_group_manager capability.

Relax the capability check from vhca_resource_manager to
vport_group_manager in the following query-only paths:
- mlx5_esw_vport_caps_get() - queries other function general caps
- esw_ipsec_vf_query_generic() - queries other function ipsec cap
- mlx5_devlink_port_fn_migratable_get() - reads cached vport state
- mlx5_devlink_port_fn_roce_get() - reads cached vport state
- mlx5_devlink_port_fn_max_io_eqs_get() - queries other function caps
- mlx5_esw_vport_enable/disable() - vhca_id map/unmap

Functions that also perform set_hca_cap (migratable_set, roce_set,
max_io_eqs_set, esw_ipsec_vf_set_generic, esw_ipsec_vf_set_bytype)
retain the vhca_resource_manager requirement.
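
The resulting split between query-only and set paths can be pictured
with the sketch below. It is illustrative only: struct demo_caps and the
demo_fn_cap_* helpers are hypothetical and merely mirror the two
capability bits named above, not the driver's MLX5_CAP_GEN() accessors.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two general capability bits. */
struct demo_caps {
	bool vport_group_manager;
	bool vhca_resource_manager;
};

/* Query-only path: with this patch, vport_group_manager is sufficient. */
static int demo_fn_cap_get(const struct demo_caps *caps)
{
	assert(caps != NULL);
	if (!caps->vport_group_manager)
		return -EOPNOTSUPP;
	return 0;
}

/* Path that also issues set_hca_cap: keeps the stricter requirement. */
static int demo_fn_cap_set(const struct demo_caps *caps)
{
	assert(caps != NULL);
	if (!caps->vhca_resource_manager)
		return -EOPNOTSUPP;
	return 0;
}
```

In this model, a device that reports vport_group_manager without
vhca_resource_manager passes the get path and is rejected on the set path.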

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Akiva Goldberger <agoldberger@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/esw/ipsec.c    |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  |  6 +++---
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 14 ++++++++------
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec.c
index 8b12c3ae0cf7..4811b60ea430 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec.c
@@ -12,7 +12,7 @@ static int esw_ipsec_vf_query_generic(struct mlx5_core_dev *dev, u16 vport_num,
 	void *hca_cap, *query_cap;
 	int err;
 
-	if (!MLX5_CAP_GEN(dev, vhca_resource_manager))
+	if (!MLX5_CAP_GEN(dev, vport_group_manager))
 		return -EOPNOTSUPP;
 
 	if (!mlx5_esw_ipsec_vf_offload_supported(dev)) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 66a773a99876..e0eafcf0c52a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -806,7 +806,7 @@ static int mlx5_esw_vport_caps_get(struct mlx5_eswitch *esw, struct mlx5_vport *
 	void *hca_caps;
 	int err;
 
-	if (!MLX5_CAP_GEN(esw->dev, vhca_resource_manager))
+	if (!MLX5_CAP_GEN(esw->dev, vport_group_manager))
 		return 0;
 
 	query_ctx = kzalloc(query_out_sz, GFP_KERNEL);
@@ -938,7 +938,7 @@ int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, struct mlx5_vport *vport,
 		vport->info.trusted = true;
 
 	if (!mlx5_esw_is_manager_vport(esw, vport_num) &&
-	    MLX5_CAP_GEN(esw->dev, vhca_resource_manager)) {
+	    MLX5_CAP_GEN(esw->dev, vport_group_manager)) {
 		ret = mlx5_esw_vport_vhca_id_map(esw, vport);
 		if (ret)
 			goto err_vhca_mapping;
@@ -976,7 +976,7 @@ void mlx5_esw_vport_disable(struct mlx5_eswitch *esw, struct mlx5_vport *vport)
 		arm_vport_context_events_cmd(esw->dev, vport_num, 0);
 
 	if (!mlx5_esw_is_manager_vport(esw, vport_num) &&
-	    MLX5_CAP_GEN(esw->dev, vhca_resource_manager))
+	    MLX5_CAP_GEN(esw->dev, vport_group_manager))
 		mlx5_esw_vport_vhca_id_unmap(esw, vport);
 
 	if (vport->vport != MLX5_VPORT_HOST_PF &&
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 69ddf56e2fc9..392d8f364db6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -4677,8 +4677,9 @@ int mlx5_devlink_port_fn_migratable_get(struct devlink_port *port, bool *is_enab
 		return -EOPNOTSUPP;
 	}
 
-	if (!MLX5_CAP_GEN(esw->dev, vhca_resource_manager)) {
-		NL_SET_ERR_MSG_MOD(extack, "Device doesn't support VHCA management");
+	if (!MLX5_CAP_GEN(esw->dev, vport_group_manager)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Device doesn't support vport group management");
 		return -EOPNOTSUPP;
 	}
 
@@ -4753,8 +4754,9 @@ int mlx5_devlink_port_fn_roce_get(struct devlink_port *port, bool *is_enabled,
 	struct mlx5_eswitch *esw = mlx5_devlink_eswitch_nocheck_get(port->devlink);
 	struct mlx5_vport *vport = mlx5_devlink_port_vport_get(port);
 
-	if (!MLX5_CAP_GEN(esw->dev, vhca_resource_manager)) {
-		NL_SET_ERR_MSG_MOD(extack, "Device doesn't support VHCA management");
+	if (!MLX5_CAP_GEN(esw->dev, vport_group_manager)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Device doesn't support vport group management");
 		return -EOPNOTSUPP;
 	}
 
@@ -5076,9 +5078,9 @@ mlx5_devlink_port_fn_max_io_eqs_get(struct devlink_port *port, u32 *max_io_eqs,
 	int err;
 
 	esw = mlx5_devlink_eswitch_nocheck_get(port->devlink);
-	if (!MLX5_CAP_GEN(esw->dev, vhca_resource_manager)) {
+	if (!MLX5_CAP_GEN(esw->dev, vport_group_manager)) {
 		NL_SET_ERR_MSG_MOD(extack,
-				   "Device doesn't support VHCA management");
+				   "Device doesn't support vport group management");
 		return -EOPNOTSUPP;
 	}
 
-- 
2.44.0



