From: Paolo Abeni <pabeni@redhat.com>
To: John Ousterhout <ouster@cs.stanford.edu>, netdev@vger.kernel.org
Cc: edumazet@google.com, horms@kernel.org, kuba@kernel.org
Subject: Re: [PATCH net-next v6 06/12] net: homa: create homa_peer.h and homa_peer.c
Date: Thu, 23 Jan 2025 18:45:09 +0100
Message-ID: <8faebc20-2b79-4b5f-aed6-b403d5b0f33d@redhat.com>
In-Reply-To: <20250115185937.1324-7-ouster@cs.stanford.edu>

On 1/15/25 7:59 PM, John Ousterhout wrote:
> +/**
> + * homa_peertab_get_peers() - Return information about all of the peers
> + * currently known
> + * @peertab:    The table to search for peers.
> + * @num_peers:  Modified to hold the number of peers returned.
> + * Return:      kmalloced array holding pointers to all known peers. The
> + *		caller must free this. If there is an error, or if there
> + *	        are no peers, NULL is returned.
> + */
> +struct homa_peer **homa_peertab_get_peers(struct homa_peertab *peertab,
> +					  int *num_peers)

Looks like this function is unused in the current series. Please don't
introduce unused code.

> +{
> +	struct homa_peer **result;
> +	struct hlist_node *next;
> +	struct homa_peer *peer;
> +	int i, count;
> +
> +	*num_peers = 0;
> +	if (!peertab->buckets)
> +		return NULL;
> +
> +	/* Figure out how many peers there are. */
> +	count = 0;
> +	for (i = 0; i < HOMA_PEERTAB_BUCKETS; i++) {
> +		hlist_for_each_entry_safe(peer, next, &peertab->buckets[i],
> +					  peertab_links)

No lock is acquired, so other processes could concurrently modify the
list; hlist_for_each_entry_safe() is not the correct helper here. You
should probably use hlist_for_each_entry_rcu() under RCU protection,
assuming the table is actually managed under an RCU scheme, which is
not entirely clear.
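
Just to sketch what I mean (untested, and assuming peer entries are
freed via RCU):

	rcu_read_lock();
	for (i = 0; i < HOMA_PEERTAB_BUCKETS; i++) {
		hlist_for_each_entry_rcu(peer, &peertab->buckets[i],
					 peertab_links)
			count++;
	}
	rcu_read_unlock();

Note that even with RCU the count can change between this pass and the
copy loop below, so the second loop must not write past the end of the
allocated array.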

> +			count++;
> +	}
> +
> +	if (count == 0)
> +		return NULL;
> +
> +	result = kmalloc_array(count, sizeof(peer), GFP_KERNEL);
> +	if (!result)
> +		return NULL;
> +	*num_peers = count;
> +	count = 0;
> +	for (i = 0; i < HOMA_PEERTAB_BUCKETS; i++) {
> +		hlist_for_each_entry_safe(peer, next, &peertab->buckets[i],
> +					  peertab_links) {
> +			result[count] = peer;
> +			count++;
> +		}
> +	}
> +	return result;
> +}
> +
> +/**
> + * homa_peertab_gc_dsts() - Invoked to free unused dst_entries, if it is
> + * safe to do so.
> + * @peertab:       The table in which to free entries.
> + * @now:           Current time, in sched_clock() units; entries with expiration
> + *                 dates no later than this will be freed. Specify ~0 to
> + *                 free all entries.
> + */
> +void homa_peertab_gc_dsts(struct homa_peertab *peertab, __u64 now)
> +{

Apparently this is called under (and needs) the peertab lock; an
annotation or a comment would be helpful.
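
Something like the following would also let sparse verify the callers
(assuming the relevant lock is peertab->write_lock):

	void homa_peertab_gc_dsts(struct homa_peertab *peertab, __u64 now)
		__must_hold(&peertab->write_lock)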

> +	while (!list_empty(&peertab->dead_dsts)) {
> +		struct homa_dead_dst *dead =
> +			list_first_entry(&peertab->dead_dsts,
> +					 struct homa_dead_dst, dst_links);
> +		if (dead->gc_time > now)
> +			break;
> +		dst_release(dead->dst);
> +		list_del(&dead->dst_links);
> +		kfree(dead);
> +	}
> +}
> +
> +/**
> + * homa_peer_find() - Returns the peer associated with a given host; creates
> + * a new homa_peer if one doesn't already exist.
> + * @peertab:    Peer table in which to perform lookup.
> + * @addr:       Address of the desired host: IPv4 addresses are represented
> + *              as IPv4-mapped IPv6 addresses.
> + * @inet:       Socket that will be used for sending packets.
> + *
> + * Return:      The peer associated with @addr, or a negative errno if an
> + *              error occurred. The caller can retain this pointer
> + *              indefinitely: peer entries are never deleted except in
> + *              homa_peertab_destroy.
> + */
> +struct homa_peer *homa_peer_find(struct homa_peertab *peertab,
> +				 const struct in6_addr *addr,
> +				 struct inet_sock *inet)
> +{
> +	/* Note: this function uses RCU operators to ensure safety even
> +	 * if a concurrent call is adding a new entry.
> +	 */
> +	struct homa_peer *peer;
> +	struct dst_entry *dst;
> +
> +	__u32 bucket = hash_32((__force __u32)addr->in6_u.u6_addr32[0],
> +			       HOMA_PEERTAB_BUCKET_BITS);
> +
> +	bucket ^= hash_32((__force __u32)addr->in6_u.u6_addr32[1],
> +			  HOMA_PEERTAB_BUCKET_BITS);
> +	bucket ^= hash_32((__force __u32)addr->in6_u.u6_addr32[2],
> +			  HOMA_PEERTAB_BUCKET_BITS);
> +	bucket ^= hash_32((__force __u32)addr->in6_u.u6_addr32[3],
> +			  HOMA_PEERTAB_BUCKET_BITS);
> +	hlist_for_each_entry_rcu(peer, &peertab->buckets[bucket],
> +				 peertab_links) {
> +		if (ipv6_addr_equal(&peer->addr, addr))

The caller does not acquire the RCU read lock, so this looks buggy.
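
I'd expect the lookup to be wrapped in the read lock, something like
this (sketch only; something else must then keep the returned peer
alive after the unlock):

	rcu_read_lock();
	hlist_for_each_entry_rcu(peer, &peertab->buckets[bucket],
				 peertab_links) {
		if (ipv6_addr_equal(&peer->addr, addr)) {
			rcu_read_unlock();
			return peer;
		}
	}
	rcu_read_unlock();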

AFAICS a use-after-free is not possible because peers are removed only
by homa_peertab_destroy(), at unload time. That in turn looks
dangerous/wrong: what about memory utilization for peers over time?
Apparently the bucket lists can grow without bound.

[...]
> +/**
> + * homa_peer_lock_slow() - This function implements the slow path for
> + * acquiring a peer's @unacked_lock. It is invoked when the lock isn't
> + * immediately available. It waits for the lock, but also records statistics
> + * about the waiting time.
> + * @peer:    Peer to lock.
> + */
> +void homa_peer_lock_slow(struct homa_peer *peer)
> +	__acquires(&peer->ack_lock)
> +{
> +	spin_lock_bh(&peer->ack_lock);

Is this just a placeholder for future changes?!? I don't see any stats
update here, and currently homa_peer_lock() is really:

	if (!spin_trylock_bh(&peer->ack_lock))
		spin_lock_bh(&peer->ack_lock);

which does not make much sense to me. Either document that this is
going to change very soon (possibly even how and why) or use a plain
spin_lock_bh().
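
If the stats update is coming in a later patch, I'd expect the slow
path to end up looking something like this (hypothetical sketch, the
actual metric is up to you):

	void homa_peer_lock_slow(struct homa_peer *peer)
		__acquires(&peer->ack_lock)
	{
		__u64 start = sched_clock();

		spin_lock_bh(&peer->ack_lock);
		/* account sched_clock() - start as lock wait time */
	}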

> +}
> +
> +/**
> + * homa_peer_add_ack() - Add a given RPC to the list of unacked
> + * RPCs for its server. Once this method has been invoked, it's safe
> + * to delete the RPC, since it will eventually be acked to the server.
> + * @rpc:    Client RPC that has now completed.
> + */
> +void homa_peer_add_ack(struct homa_rpc *rpc)
> +{
> +	struct homa_peer *peer = rpc->peer;
> +	struct homa_ack_hdr ack;
> +
> +	homa_peer_lock(peer);
> +	if (peer->num_acks < HOMA_MAX_ACKS_PER_PKT) {
> +		peer->acks[peer->num_acks].client_id = cpu_to_be64(rpc->id);
> +		peer->acks[peer->num_acks].server_port = htons(rpc->dport);
> +		peer->num_acks++;
> +		homa_peer_unlock(peer);
> +		return;
> +	}
> +
> +	/* The peer has filled up; send an ACK message to empty it. The
> +	 * RPC in the message header will also be considered ACKed.
> +	 */
> +	memcpy(ack.acks, peer->acks, sizeof(peer->acks));
> +	ack.num_acks = htons(peer->num_acks);
> +	peer->num_acks = 0;
> +	homa_peer_unlock(peer);
> +	homa_xmit_control(ACK, &ack, sizeof(ack), rpc);
> +}
> +
> +/**
> + * homa_peer_get_acks() - Copy acks out of a peer, and remove them from the
> + * peer.
> + * @peer:    Peer to check for possible unacked RPCs.
> + * @count:   Maximum number of acks to return.
> + * @dst:     The acks are copied to this location.
> + *
> + * Return:   The number of acks extracted from the peer (<= count).
> + */
> +int homa_peer_get_acks(struct homa_peer *peer, int count, struct homa_ack *dst)
> +{
> +	/* Don't waste time acquiring the lock if there are no ids available. */
> +	if (peer->num_acks == 0)
> +		return 0;
> +
> +	homa_peer_lock(peer);
> +
> +	if (count > peer->num_acks)
> +		count = peer->num_acks;
> +	memcpy(dst, &peer->acks[peer->num_acks - count],
> +	       count * sizeof(peer->acks[0]));
> +	peer->num_acks -= count;
> +
> +	homa_peer_unlock(peer);
> +	return count;
> +}
> diff --git a/net/homa/homa_peer.h b/net/homa/homa_peer.h
> new file mode 100644
> index 000000000000..556aeda49656
> --- /dev/null
> +++ b/net/homa/homa_peer.h
> @@ -0,0 +1,233 @@
> +/* SPDX-License-Identifier: BSD-2-Clause */
> +
> +/* This file contains definitions related to managing peers (homa_peer
> + * and homa_peertab).
> + */
> +
> +#ifndef _HOMA_PEER_H
> +#define _HOMA_PEER_H
> +
> +#include "homa_wire.h"
> +#include "homa_sock.h"
> +
> +struct homa_rpc;
> +
> +/**
> + * struct homa_dead_dst - Used to retain dst_entries that are no longer
> + * needed, until it is safe to delete them (I'm not confident that the RCU
> + * mechanism will be safe for these: the reference count could get incremented
> + * after it's on the RCU list?).
> + */
> +struct homa_dead_dst {
> +	/** @dst: Entry that is no longer used by a struct homa_peer. */
> +	struct dst_entry *dst;
> +
> +	/**
> +	 * @gc_time: Time (in units of sched_clock()) when it is safe
> +	 * to free @dst.
> +	 */
> +	__u64 gc_time;
> +
> +	/** @dst_links: Used to link together entries in peertab->dead_dsts. */
> +	struct list_head dst_links;
> +};
> +
> +/**
> + * define HOMA_PEERTAB_BUCKET_BITS - Number of bits in the bucket index for a
> + * homa_peertab.  Should be large enough to hold an entry for every server
> + * in a datacenter without long hash chains.
> + */
> +#define HOMA_PEERTAB_BUCKET_BITS 16
> +
> +/** define HOMA_PEERTAB_BUCKETS - Number of buckets in a homa_peertab. */
> +#define HOMA_PEERTAB_BUCKETS BIT(HOMA_PEERTAB_BUCKET_BITS)
> +
> +/**
> + * struct homa_peertab - A hash table that maps from IPv6 addresses
> + * to homa_peer objects. IPv4 entries are encapsulated as IPv6 addresses.
> + * Entries are gradually added to this table, but they are never removed
> + * except when the entire table is deleted. We can't safely delete because
> + * results returned by homa_peer_find may be retained indefinitely.
> + *
> + * This table is managed exclusively by homa_peertab.c, using RCU to
> + * permit efficient lookups.
> + */
> +struct homa_peertab {
> +	/**
> +	 * @write_lock: Synchronizes addition of new entries; not needed
> +	 * for lookups (RCU is used instead).
> +	 */
> +	spinlock_t write_lock;

This lock looks potentially heavily contended on add; why don't you
use a per-bucket lock?
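
i.e. something along the lines of (sketch):

	struct homa_peertab_bucket {
		spinlock_t lock;	/* protects additions to @peers */
		struct hlist_head peers;
	};

so that concurrent adds to different buckets don't contend on a single
global spinlock.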

> +
> +	/**
> +	 * @dead_dsts: List of dst_entries that are waiting to be deleted.
> +	 * Hold @write_lock when manipulating.
> +	 */
> +	struct list_head dead_dsts;
> +
> +	/**
> +	 * @buckets: Pointer to heads of chains of homa_peers for each bucket.
> +	 * Malloc-ed, and must eventually be freed. NULL means this structure
> +	 * has not been initialized.
> +	 */
> +	struct hlist_head *buckets;
> +};
> +
> +/**
> + * struct homa_peer - One of these objects exists for each machine that we
> + * have communicated with (either as client or server).
> + */
> +struct homa_peer {
> +	/**
> +	 * @addr: IPv6 address for the machine (IPv4 addresses are stored
> +	 * as IPv4-mapped IPv6 addresses).
> +	 */
> +	struct in6_addr addr;
> +
> +	/** @flow: Addressing info needed to send packets. */
> +	struct flowi flow;
> +
> +	/**
> +	 * @dst: Used to route packets to this peer; we own a reference
> +	 * to this, which we must eventually release.
> +	 */
> +	struct dst_entry *dst;
> +
> +	/**
> +	 * @grantable_rpcs: Contains all homa_rpcs (both requests and
> +	 * responses) involving this peer whose msgins require (or required
> +	 * them in the past) and have not been fully received. The list is
> +	 * sorted in priority order (head has fewest bytes_remaining).
> +	 * Locked with homa->grantable_lock.
> +	 */
> +	struct list_head grantable_rpcs;

Apparently not used in this patch series. More fields below have the
same problem. Please introduce such fields in the same series that
will actually use them.

/P


