From: Paolo Abeni <pabeni@redhat.com>
To: John Ousterhout <ouster@cs.stanford.edu>, netdev@vger.kernel.org
Cc: edumazet@google.com, horms@kernel.org, kuba@kernel.org
Subject: Re: [PATCH net-next v6 06/12] net: homa: create homa_peer.h and homa_peer.c
Date: Thu, 23 Jan 2025 18:45:09 +0100
Message-ID: <8faebc20-2b79-4b5f-aed6-b403d5b0f33d@redhat.com>
In-Reply-To: <20250115185937.1324-7-ouster@cs.stanford.edu>
On 1/15/25 7:59 PM, John Ousterhout wrote:
> +/**
> + * homa_peertab_get_peers() - Return information about all of the peers
> + * currently known
> + * @peertab: The table to search for peers.
> + * @num_peers: Modified to hold the number of peers returned.
> + * Return: kmalloced array holding pointers to all known peers. The
> + * caller must free this. If there is an error, or if there
> + * are no peers, NULL is returned.
> + */
> +struct homa_peer **homa_peertab_get_peers(struct homa_peertab *peertab,
> + int *num_peers)
Looks like this function is unused in the current series. Please don't
introduce unused code.
> +{
> + struct homa_peer **result;
> + struct hlist_node *next;
> + struct homa_peer *peer;
> + int i, count;
> +
> + *num_peers = 0;
> + if (!peertab->buckets)
> + return NULL;
> +
> + /* Figure out how many peers there are. */
> + count = 0;
> + for (i = 0; i < HOMA_PEERTAB_BUCKETS; i++) {
> + hlist_for_each_entry_safe(peer, next, &peertab->buckets[i],
> + peertab_links)
No lock is acquired here, so other processes could concurrently modify
the list; hlist_for_each_entry_safe() is not the correct helper for
that. You should probably use hlist_for_each_entry_rcu() under RCU
protection, assuming the table is actually managed under an RCU scheme,
which is not entirely clear.
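Something along these lines would be the usual pattern, assuming the
table really is RCU-managed (just a sketch):

	rcu_read_lock();
	for (i = 0; i < HOMA_PEERTAB_BUCKETS; i++) {
		hlist_for_each_entry_rcu(peer, &peertab->buckets[i],
					 peertab_links)
			count++;
	}
	rcu_read_unlock();

Note that even then the count may be stale by the time the array is
allocated below, so the filling pass would have to cope with the lists
having grown in the meantime.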
> + count++;
> + }
> +
> + if (count == 0)
> + return NULL;
> +
> + result = kmalloc_array(count, sizeof(peer), GFP_KERNEL);
> + if (!result)
> + return NULL;
> + *num_peers = count;
> + count = 0;
> + for (i = 0; i < HOMA_PEERTAB_BUCKETS; i++) {
> + hlist_for_each_entry_safe(peer, next, &peertab->buckets[i],
> + peertab_links) {
> + result[count] = peer;
> + count++;
> + }
> + }
> + return result;
> +}
> +
> +/**
> + * homa_peertab_gc_dsts() - Invoked to free unused dst_entries, if it is
> + * safe to do so.
> + * @peertab: The table in which to free entries.
> + * @now: Current time, in sched_clock() units; entries with expiration
> + * dates no later than this will be freed. Specify ~0 to
> + * free all entries.
> + */
> +void homa_peertab_gc_dsts(struct homa_peertab *peertab, __u64 now)
> +{
Apparently this is called under (and needs) the peertab lock; an
annotation or a comment would be helpful.
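Something like this would make the requirement explicit (assuming the
lock in question is peertab->write_lock):

	void homa_peertab_gc_dsts(struct homa_peertab *peertab, __u64 now)
		__must_hold(&peertab->write_lock)
	{
		lockdep_assert_held(&peertab->write_lock);
		...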
> + while (!list_empty(&peertab->dead_dsts)) {
> + struct homa_dead_dst *dead =
> + list_first_entry(&peertab->dead_dsts,
> + struct homa_dead_dst, dst_links);
> + if (dead->gc_time > now)
> + break;
> + dst_release(dead->dst);
> + list_del(&dead->dst_links);
> + kfree(dead);
> + }
> +}
> +
> +/**
> + * homa_peer_find() - Returns the peer associated with a given host; creates
> + * a new homa_peer if one doesn't already exist.
> + * @peertab: Peer table in which to perform lookup.
> + * @addr: Address of the desired host: IPv4 addresses are represented
> + * as IPv4-mapped IPv6 addresses.
> + * @inet: Socket that will be used for sending packets.
> + *
> + * Return: The peer associated with @addr, or a negative errno if an
> + * error occurred. The caller can retain this pointer
> + * indefinitely: peer entries are never deleted except in
> + * homa_peertab_destroy.
> + */
> +struct homa_peer *homa_peer_find(struct homa_peertab *peertab,
> + const struct in6_addr *addr,
> + struct inet_sock *inet)
> +{
> + /* Note: this function uses RCU operators to ensure safety even
> + * if a concurrent call is adding a new entry.
> + */
> + struct homa_peer *peer;
> + struct dst_entry *dst;
> +
> + __u32 bucket = hash_32((__force __u32)addr->in6_u.u6_addr32[0],
> + HOMA_PEERTAB_BUCKET_BITS);
> +
> + bucket ^= hash_32((__force __u32)addr->in6_u.u6_addr32[1],
> + HOMA_PEERTAB_BUCKET_BITS);
> + bucket ^= hash_32((__force __u32)addr->in6_u.u6_addr32[2],
> + HOMA_PEERTAB_BUCKET_BITS);
> + bucket ^= hash_32((__force __u32)addr->in6_u.u6_addr32[3],
> + HOMA_PEERTAB_BUCKET_BITS);
> + hlist_for_each_entry_rcu(peer, &peertab->buckets[bucket],
> + peertab_links) {
> + if (ipv6_addr_equal(&peer->addr, addr))
The caller does not acquire the RCU read lock, so this looks buggy.
AFAICS use-after-free is not possible because peers are removed only by
homa_peertab_destroy(), at unload time, but that in turn looks
dangerous/wrong. What about memory utilization for peers over time?
Apparently the bucket lists can grow without bound.
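If peers were refcounted, the lookup could be made safe and stale
entries reclaimable; a rough sketch (the refcnt field and
homa_peer_put() are hypothetical, nothing like them exists in this
series):

	rcu_read_lock();
	hlist_for_each_entry_rcu(peer, &peertab->buckets[bucket],
				 peertab_links) {
		if (ipv6_addr_equal(&peer->addr, addr) &&
		    refcount_inc_not_zero(&peer->refcnt)) {
			rcu_read_unlock();
			return peer;
		}
	}
	rcu_read_unlock();
	/* Slow path: allocate and insert under the write lock. */

with homa_peer_put() dropping the reference and freeing the entry via
kfree_rcu() once it reaches zero.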
[...]
> +/**
> + * homa_peer_lock_slow() - This function implements the slow path for
> + * acquiring a peer's @unacked_lock. It is invoked when the lock isn't
> + * immediately available. It waits for the lock, but also records statistics
> + * about the waiting time.
> + * @peer: Peer to lock.
> + */
> +void homa_peer_lock_slow(struct homa_peer *peer)
> + __acquires(&peer->ack_lock)
> +{
> + spin_lock_bh(&peer->ack_lock);
Is this just a placeholder for future changes? I don't see any stats
update here, and currently homa_peer_lock() is really:

	if (!spin_trylock_bh(&peer->ack_lock))
		spin_lock_bh(&peer->ack_lock);

which does not make much sense to me. Either document that this is
going to change soon (possibly even how and why), or use a plain
spin_lock_bh().
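If stats are coming later, I'd expect the slow path to end up roughly
like this (the metric hook is hypothetical, not in this series):

	void homa_peer_lock_slow(struct homa_peer *peer)
		__acquires(&peer->ack_lock)
	{
		u64 start = sched_clock();

		spin_lock_bh(&peer->ack_lock);
		/* Hypothetical contention metric: */
		INC_METRIC(ack_lock_miss_cycles, sched_clock() - start);
	}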
> +}
> +
> +/**
> + * homa_peer_add_ack() - Add a given RPC to the list of unacked
> + * RPCs for its server. Once this method has been invoked, it's safe
> + * to delete the RPC, since it will eventually be acked to the server.
> + * @rpc: Client RPC that has now completed.
> + */
> +void homa_peer_add_ack(struct homa_rpc *rpc)
> +{
> + struct homa_peer *peer = rpc->peer;
> + struct homa_ack_hdr ack;
> +
> + homa_peer_lock(peer);
> + if (peer->num_acks < HOMA_MAX_ACKS_PER_PKT) {
> + peer->acks[peer->num_acks].client_id = cpu_to_be64(rpc->id);
> + peer->acks[peer->num_acks].server_port = htons(rpc->dport);
> + peer->num_acks++;
> + homa_peer_unlock(peer);
> + return;
> + }
> +
> + /* The peer has filled up; send an ACK message to empty it. The
> + * RPC in the message header will also be considered ACKed.
> + */
> + memcpy(ack.acks, peer->acks, sizeof(peer->acks));
> + ack.num_acks = htons(peer->num_acks);
> + peer->num_acks = 0;
> + homa_peer_unlock(peer);
> + homa_xmit_control(ACK, &ack, sizeof(ack), rpc);
> +}
> +
> +/**
> + * homa_peer_get_acks() - Copy acks out of a peer, and remove them from the
> + * peer.
> + * @peer: Peer to check for possible unacked RPCs.
> + * @count: Maximum number of acks to return.
> + * @dst: The acks are copied to this location.
> + *
> + * Return: The number of acks extracted from the peer (<= count).
> + */
> +int homa_peer_get_acks(struct homa_peer *peer, int count, struct homa_ack *dst)
> +{
> + /* Don't waste time acquiring the lock if there are no ids available. */
> + if (peer->num_acks == 0)
> + return 0;
> +
> + homa_peer_lock(peer);
> +
> + if (count > peer->num_acks)
> + count = peer->num_acks;
> + memcpy(dst, &peer->acks[peer->num_acks - count],
> + count * sizeof(peer->acks[0]));
> + peer->num_acks -= count;
> +
> + homa_peer_unlock(peer);
> + return count;
> +}
> diff --git a/net/homa/homa_peer.h b/net/homa/homa_peer.h
> new file mode 100644
> index 000000000000..556aeda49656
> --- /dev/null
> +++ b/net/homa/homa_peer.h
> @@ -0,0 +1,233 @@
> +/* SPDX-License-Identifier: BSD-2-Clause */
> +
> +/* This file contains definitions related to managing peers (homa_peer
> + * and homa_peertab).
> + */
> +
> +#ifndef _HOMA_PEER_H
> +#define _HOMA_PEER_H
> +
> +#include "homa_wire.h"
> +#include "homa_sock.h"
> +
> +struct homa_rpc;
> +
> +/**
> + * struct homa_dead_dst - Used to retain dst_entries that are no longer
> + * needed, until it is safe to delete them (I'm not confident that the RCU
> + * mechanism will be safe for these: the reference count could get incremented
> + * after it's on the RCU list?).
> + */
> +struct homa_dead_dst {
> + /** @dst: Entry that is no longer used by a struct homa_peer. */
> + struct dst_entry *dst;
> +
> + /**
> + * @gc_time: Time (in units of sched_clock()) when it is safe
> + * to free @dst.
> + */
> + __u64 gc_time;
> +
> + /** @dst_links: Used to link together entries in peertab->dead_dsts. */
> + struct list_head dst_links;
> +};
> +
> +/**
> + * define HOMA_PEERTAB_BUCKET_BITS - Number of bits in the bucket index for a
> + * homa_peertab. Should be large enough to hold an entry for every server
> + * in a datacenter without long hash chains.
> + */
> +#define HOMA_PEERTAB_BUCKET_BITS 16
> +
> +/** define HOMA_PEERTAB_BUCKETS - Number of buckets in a homa_peertab. */
> +#define HOMA_PEERTAB_BUCKETS BIT(HOMA_PEERTAB_BUCKET_BITS)
> +
> +/**
> + * struct homa_peertab - A hash table that maps from IPv6 addresses
> + * to homa_peer objects. IPv4 entries are encapsulated as IPv6 addresses.
> + * Entries are gradually added to this table, but they are never removed
> + * except when the entire table is deleted. We can't safely delete because
> + * results returned by homa_peer_find may be retained indefinitely.
> + *
> + * This table is managed exclusively by homa_peertab.c, using RCU to
> + * permit efficient lookups.
> + */
> +struct homa_peertab {
> + /**
> + * @write_lock: Synchronizes addition of new entries; not needed
> + * for lookups (RCU is used instead).
> + */
> + spinlock_t write_lock;
This lock looks potentially heavily contended on add; why not use a
per-bucket lock?
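E.g. something like (sketch only):

	struct homa_peertab_bucket {
		spinlock_t lock;	/* serializes insertions */
		struct hlist_head peers;
	};

so that writers contend only when they hash to the same bucket.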
> +
> + /**
> + * @dead_dsts: List of dst_entries that are waiting to be deleted.
> + * Hold @write_lock when manipulating.
> + */
> + struct list_head dead_dsts;
> +
> + /**
> + * @buckets: Pointer to heads of chains of homa_peers for each bucket.
> + * Malloc-ed, and must eventually be freed. NULL means this structure
> + * has not been initialized.
> + */
> + struct hlist_head *buckets;
> +};
> +
> +/**
> + * struct homa_peer - One of these objects exists for each machine that we
> + * have communicated with (either as client or server).
> + */
> +struct homa_peer {
> + /**
> + * @addr: IPv6 address for the machine (IPv4 addresses are stored
> + * as IPv4-mapped IPv6 addresses).
> + */
> + struct in6_addr addr;
> +
> + /** @flow: Addressing info needed to send packets. */
> + struct flowi flow;
> +
> + /**
> + * @dst: Used to route packets to this peer; we own a reference
> + * to this, which we must eventually release.
> + */
> + struct dst_entry *dst;
> +
> + /**
> + * @grantable_rpcs: Contains all homa_rpcs (both requests and
> + * responses) involving this peer whose msgins require grants (or
> + * required them in the past) and have not been fully received. The
> + * list is sorted in priority order (head has fewest bytes_remaining).
> + * Locked with homa->grantable_lock.
> + */
> + struct list_head grantable_rpcs;
Apparently not used in this patch series. More fields below have the
same problem. Please introduce such fields in the same series that will
actually use them.
/P