From: Paolo Abeni <pabeni@redhat.com>
To: ouster@cs.stanford.edu
Cc: horms@kernel.org, netdev@vger.kernel.org, kuba@kernel.org,
pabeni@redhat.com, edumazet@google.com
Subject: Re: [net-next,v17,05/14] net: homa: create homa_peer.h and homa_peer.c
Date: Wed, 18 Mar 2026 08:21:19 +0100 [thread overview]
Message-ID: <20260318072119.268920-1-pabeni@redhat.com> (raw)
In-Reply-To: <20260316223228.2611-6-ouster@cs.stanford.edu>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: homa: create homa_peer.h and homa_peer.c
Homa needs to keep a small amount of information for each peer that
it has communicated with. These files define that state and provide
functions for storing and accessing it.
> diff --git a/net/homa/homa_peer.c b/net/homa/homa_peer.c
> new file mode 100644
> index 000000000000..1e5301cdef41
> --- /dev/null
> +++ b/net/homa/homa_peer.c
[ ... ]
> +/**
> + * homa_peer_free_net() - Garbage collect all of the peer information
> + * associated with a particular network namespace.
> + * @hnet: Network namespace whose peers should be freed. There must not
> + * be any active sockets or RPCs for this namespace.
> + */
> +void homa_peer_free_net(struct homa_net *hnet)
> +{
> + struct homa_peertab *peertab = hnet->homa->peertab;
> + struct rhashtable_iter iter;
> + struct homa_peer *peer;
> +
> + spin_lock_bh(&peertab->lock);
> + peertab->gc_stop_count++;
> + spin_unlock_bh(&peertab->lock);
> +
> + rhashtable_walk_enter(&peertab->ht, &iter);
> + rhashtable_walk_start(&iter);
> + while (1) {
> + peer = rhashtable_walk_next(&iter);
> + if (!peer)
> + break;
> + if (IS_ERR(peer))
> + continue;
> + if (peer->ht_key.hnet != hnet)
> + continue;
> + if (rhashtable_remove_fast(&peertab->ht, &peer->ht_linkage,
> + ht_params) == 0) {
> + homa_peer_release(peer);
> + hnet->num_peers--;
> + peertab->num_peers--;
> + }
> + }
> + rhashtable_walk_stop(&iter);
> + rhashtable_walk_exit(&iter);
Does this code correctly protect peertab->num_peers updates? Looking at
homa_peer_gc():
net/homa/homa_peer.c:homa_peer_gc() {
spin_lock_bh(&peertab->lock);
...
if (rhashtable_remove_fast(&peertab->ht, &peer->ht_linkage,
ht_params) == 0) {
homa_peer_release(peer);
peertab->num_peers--;
peer->ht_key.hnet->num_peers--;
}
...
spin_unlock_bh(&peertab->lock);
}
and homa_peer_get():
net/homa/homa_peer.c:homa_peer_get() {
...
spin_lock_bh(&peertab->lock);
other = rhashtable_lookup_get_insert_fast(&peertab->ht,
&peer->ht_linkage, ht_params);
...
peertab->num_peers++;
key.hnet->num_peers++;
...
spin_unlock_bh(&peertab->lock);
}
Both those functions hold peertab->lock while updating peertab->num_peers.
But homa_peer_free_net() drops peertab->lock before the rhashtable walk and
doesn't reacquire it before decrementing peertab->num_peers. Can a concurrent
homa_peer_get() call (for any namespace) produce lost updates to
peertab->num_peers? The gc_stop_count mechanism prevents concurrent GC but
doesn't prevent concurrent homa_peer_get().
A corrupted num_peers then distorts GC threshold decisions (the
peertab->num_peers < peertab->gc_threshold check in homa_peer_gc()).
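A sketch of one possible fix (untested, against the code as posted): take
peertab->lock around each successful removal, matching the locking in
homa_peer_gc() and homa_peer_get(). Taking the spinlock inside the walk
should be fine, since the walk itself only holds the RCU read lock:

```c
	if (rhashtable_remove_fast(&peertab->ht, &peer->ht_linkage,
				   ht_params) == 0) {
		/* Counter updates now serialized with homa_peer_get()
		 * and homa_peer_gc().
		 */
		spin_lock_bh(&peertab->lock);
		hnet->num_peers--;
		peertab->num_peers--;
		spin_unlock_bh(&peertab->lock);
		homa_peer_release(peer);
	}
```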
> +/**
> + * homa_peer_prefer_evict() - Given two peers, determine which one is
> + * a better candidate for eviction.
> + * @peertab: Overall information used to manage peers.
> + * @peer1: First peer.
> + * @peer2: Second peer.
> + * Return: True if @peer1 is a better candidate for eviction than @peer2.
> + */
> +int homa_peer_prefer_evict(struct homa_peertab *peertab,
> + struct homa_peer *peer1,
> + struct homa_peer *peer2)
> +{
> + /* Prefer a peer whose homa-net is over its limit; if both are either
> + * over or under, then prefer the peer with the shortest idle time.
> + */
^^^^^^^^
This isn't a bug, but the comment says "shortest idle time" while the code
returns peer1->access_jiffies < peer2->access_jiffies, which prefers the peer
accessed earlier (longest idle time). Since eviction should prefer stale peers,
the code is correct but the comment should say "longest idle time".
[ ... ]
> +/**
> + * homa_peer_free() - Release any resources in a peer and free the homa_peer
> + * struct. Invoked by the RCU mechanism via homa_peer_release.
> + * @head: Pointer to the rcu_head field of the peer to free.
> + */
> +void homa_peer_free(struct rcu_head *head)
> +{
> + struct homa_peer *peer;
> +
> + peer = container_of(head, struct homa_peer, rcu_head);
> + dst_release(rcu_dereference(peer->dst));
> + kfree(peer);
> +}
Is the RCU API correct here? homa_peer_free() is an RCU callback invoked via
call_rcu from homa_peer_release():
net/homa/homa_peer.h:homa_peer_release() {
if (refcount_dec_and_test(&peer->refs))
call_rcu(&peer->rcu_head, homa_peer_free);
}
RCU callbacks run after the grace period, outside any RCU read-side critical
section. Using rcu_dereference() without holding rcu_read_lock() triggers
lockdep 'suspicious rcu_dereference_check() usage' warnings at runtime.
The peer is exclusively owned at this point (refcount was zero, grace period
has passed). The same codebase uses the correct pattern in
homa_peer_reset_dst():
net/homa/homa_peer.c:homa_peer_reset_dst() {
...
dst_release(rcu_replace_pointer(peer->dst, dst, true));
...
}
Should this use rcu_dereference_protected(peer->dst, 1) instead of
rcu_dereference(peer->dst)?
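If that reasoning holds, the minimal change would be (sketch):

```c
	/* The RCU callback runs after the grace period with exclusive
	 * ownership of the peer, so no read-side protection applies.
	 */
	dst_release(rcu_dereference_protected(peer->dst, 1));
```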
> +/**
> + * homa_peer_get() - Returns the peer associated with a given host; creates
> + * a new homa_peer if one doesn't already exist.
> + * @hsk: Socket where the peer will be used.
> + * @addr: Address of the desired host: IPv4 addresses are represented
> + * as IPv4-mapped IPv6 addresses.
> + *
> + * Return: The peer associated with @addr, or a negative errno if an
> + * error occurred. On a successful return the reference count
> + * will be incremented for the returned peer. The caller must
> + * eventually call homa_peer_release to release the reference.
> + */
> +struct homa_peer *homa_peer_get(struct homa_sock *hsk,
> + const struct in6_addr *addr)
> +{
> + struct homa_peertab *peertab = hsk->homa->peertab;
> + struct homa_peer *peer, *other;
> + struct homa_peer_key key;
> +
> + key.addr = *addr;
> + key.hnet = hsk->hnet;
> + rcu_read_lock();
> + peer = rhashtable_lookup(&peertab->ht, &key, ht_params);
> + if (peer) {
> + homa_peer_hold(peer);
> + peer->access_jiffies = jiffies;
> + rcu_read_unlock();
> + return peer;
> + }
Can this sequence race with homa_peer_gc()? The lookup uses
rhashtable_lookup() under rcu_read_lock() but without peertab->lock. Then it
calls homa_peer_hold() which uses refcount_inc():
net/homa/homa_peer.h:homa_peer_hold() {
refcount_inc(&peer->refs);
}
Concurrently, homa_peer_gc() can:
net/homa/homa_peer.c:homa_peer_gc() {
spin_lock_bh(&peertab->lock);
...
if (rhashtable_remove_fast(&peertab->ht, &peer->ht_linkage,
ht_params) == 0) {
homa_peer_release(peer);
...
}
...
spin_unlock_bh(&peertab->lock);
}
net/homa/homa_peer.h:homa_peer_release() {
if (refcount_dec_and_test(&peer->refs))
call_rcu(&peer->rcu_head, homa_peer_free);
}
If homa_peer_gc() removes the peer from the rhashtable and calls
homa_peer_release() (dropping the refcount to 0 and scheduling call_rcu)
between the lookup and the hold, then refcount_inc() on a zero refcount
triggers a REFCOUNT_WARN and refuses the increment. The caller is left
holding a pointer to a peer on which it owns no reference; once the RCU
grace period ends the peer is freed, leading to a use-after-free.
The standard kernel pattern for RCU + rhashtable lookups is
refcount_inc_not_zero() with retry or fallback if it returns false. The RCU
read lock keeps the memory valid but doesn't prevent the refcount from
reaching zero.
[ ... ]
> +/**
> + * homa_peer_add_ack() - Add a given RPC to the list of unacked
> + * RPCs for its server. Once this method has been invoked, it's safe
> + * to delete the RPC, since it will eventually be acked to the server.
> + * @rpc: Client RPC that has now completed.
> + */
> +void homa_peer_add_ack(struct homa_rpc *rpc)
> +{
> + struct homa_peer *peer = rpc->peer;
> + struct homa_ack_hdr ack;
> +
> + homa_peer_lock(peer);
> + if (peer->num_acks < HOMA_MAX_ACKS_PER_PKT) {
> + peer->acks[peer->num_acks].client_id = cpu_to_be64(rpc->id);
> + peer->acks[peer->num_acks].server_port = htons(rpc->dport);
> + peer->num_acks++;
> + homa_peer_unlock(peer);
> + return;
> + }
> +
> + /* The peer has filled up; send an ACK message to empty it. The
> + * RPC in the message header will also be considered ACKed.
> + */
> + memcpy(ack.acks, peer->acks, sizeof(peer->acks));
> + ack.num_acks = htons(peer->num_acks);
> + peer->num_acks = 0;
> + homa_peer_unlock(peer);
> + homa_xmit_control(ACK, &ack, sizeof(ack), rpc);
> +}
Does this code leak kernel stack memory? The 'struct homa_ack_hdr ack' is
declared on the stack without zeroing it. Only ack.acks (via memcpy) and
ack.num_acks are initialized before calling homa_xmit_control().
Looking at how homa_xmit_control() uses the ack:
The function sets ack.common.type, sport, dport, and sender_id, but the
following homa_common_hdr fields remain uninitialized: sequence, ack[3], doff,
reserved1, window, checksum, and reserved2.
These uninitialized stack bytes are then memcpy'd into an skb by
__homa_xmit_control() and transmitted on the network, leaking kernel stack
memory.
Additionally, the homa_common_hdr comment for doff warns: "other values may
cause some NICs (such as Intel E810-C) to drop outgoing packets when TCP
hijacking is enabled", so garbage in doff can cause packet drops.
The homa_xmit_control() doc comment claims "the common header will be filled
in by this function" but only 4 of 11 fields are actually set.