Re: [PATCH net-next 2/6] net: nexthop: Increase weight to u16

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Petr Machata <petrm@nvidia.com>
To: Simon Horman <horms@kernel.org>
Cc: Petr Machata <petrm@nvidia.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>, <netdev@vger.kernel.org>,
	Ido Schimmel <idosch@nvidia.com>,
	David Ahern <dsahern@kernel.org>,
	Donald Sharp <sharpd@nvidia.com>, <mlxsw@nvidia.com>
Subject: Re: [PATCH net-next 2/6] net: nexthop: Increase weight to u16
Date: Mon, 5 Aug 2024 14:33:45 +0200	[thread overview]
Message-ID: <8734nj5esp.fsf@nvidia.com> (raw)
In-Reply-To: <20240801193928.GC2495006@kernel.org>


Simon Horman <horms@kernel.org> writes:

> On Thu, Aug 01, 2024 at 06:23:58PM +0200, Petr Machata wrote:
>> In CLOS networks, as link failures occur at various points in the network,
>> ECMP weights of the involved nodes are adjusted to compensate. With high
>> fan-out of the involved nodes, and overall high number of nodes,
>> a (non-)ECMP weight ratio that we would like to configure does not fit into
>> 8 bits. Instead of, say, 255:254, we might like to configure something like
>> 1000:999. For these deployments, the 8-bit weight may not be enough.
>> 
>> To that end, in this patch increase the next hop weight from u8 to u16.
>> 
>> Increasing the width of an integral type can be tricky, because while the
>> code still compiles, the types may not check out anymore, and numerical
>> errors come up. To prevent this, the conversion was done in two steps.
>> First the type was changed from u8 to a single-member structure, which
>> invalidated all uses of the field. This allowed going through them one by
>> one and audit for type correctness. Then the structure was replaced with a
>> vanilla u16 again. This should ensure that no place was missed.
>> 
>> The UAPI for configuring nexthop group members is that an attribute
>> NHA_GROUP carries an array of struct nexthop_grp entries:
>> 
>> 	struct nexthop_grp {
>> 		__u32	id;	  /* nexthop id - must exist */
>> 		__u8	weight;   /* weight of this nexthop */
>> 		__u8	resvd1;
>> 		__u16	resvd2;
>> 	};
>> 
>> The field resvd1 is currently validated and required to be zero. We can
>> lift this requirement and carry high-order bits of the weight in the
>> reserved field:
>> 
>> 	struct nexthop_grp {
>> 		__u32	id;	  /* nexthop id - must exist */
>> 		__u8	weight;   /* weight of this nexthop */
>> 		__u8	weight_high;
>> 		__u16	resvd2;
>> 	};
>> 
>> Keeping the fields split this way was chosen in case an existing userspace
>> makes assumptions about the width of the weight field, and to sidestep any
>> endianes issues.
>
> nit: endianness

Thanks, will fix for v2. For now I'm still waiting if there's other
feedback.

next prev parent reply	other threads:[~2024-08-05 12:34 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-01 16:23 [PATCH net-next 0/6] net: nexthop: Increase weight to u16 Petr Machata
2024-08-01 16:23 ` [PATCH net-next 1/6] net: nexthop: Add flag to assert that NHGRP reserved fields are zero Petr Machata
2024-08-05 22:23   ` Jakub Kicinski
2024-08-06  9:48     ` Petr Machata
2024-08-01 16:23 ` [PATCH net-next 2/6] net: nexthop: Increase weight to u16 Petr Machata
2024-08-01 19:39   ` Simon Horman
2024-08-05 12:33     ` Petr Machata [this message]
2024-08-06 14:02   ` Przemek Kitszel
2024-08-01 16:23 ` [PATCH net-next 3/6] selftests: router_mpath: Sleep after MZ Petr Machata
2024-08-01 16:24 ` [PATCH net-next 4/6] selftests: router_mpath_nh: Test 16-bit next hop weights Petr Machata
2024-08-01 16:24 ` [PATCH net-next 5/6] selftests: router_mpath_nh_res: " Petr Machata
2024-08-01 16:24 ` [PATCH net-next 6/6] selftests: fib_nexthops: " Petr Machata
2024-08-01 22:52 ` [PATCH net-next 0/6] net: nexthop: Increase weight to u16 Jakub Kicinski
2024-08-05  8:06   ` Petr Machata
2024-08-05 14:31 ` David Ahern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8734nj5esp.fsf@nvidia.com \
    --to=petrm@nvidia.com \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=idosch@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=mlxsw@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sharpd@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).