From: Petr Machata <petrm@nvidia.com>
To: Simon Horman <horms@kernel.org>
Cc: Petr Machata <petrm@nvidia.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>, <netdev@vger.kernel.org>,
Ido Schimmel <idosch@nvidia.com>,
David Ahern <dsahern@kernel.org>,
Donald Sharp <sharpd@nvidia.com>, <mlxsw@nvidia.com>
Subject: Re: [PATCH net-next 2/6] net: nexthop: Increase weight to u16
Date: Mon, 5 Aug 2024 14:33:45 +0200 [thread overview]
Message-ID: <8734nj5esp.fsf@nvidia.com> (raw)
In-Reply-To: <20240801193928.GC2495006@kernel.org>
Simon Horman <horms@kernel.org> writes:
> On Thu, Aug 01, 2024 at 06:23:58PM +0200, Petr Machata wrote:
>> In CLOS networks, as link failures occur at various points in the network,
>> ECMP weights of the involved nodes are adjusted to compensate. With high
>> fan-out of the involved nodes, and overall high number of nodes,
>> a (non-)ECMP weight ratio that we would like to configure does not fit into
>> 8 bits. Instead of, say, 255:254, we might like to configure something like
>> 1000:999. For these deployments, the 8-bit weight may not be enough.
>>
>> To that end, in this patch increase the next hop weight from u8 to u16.
>>
>> Increasing the width of an integral type can be tricky, because while the
>> code still compiles, the types may not check out anymore, and numerical
>> errors come up. To prevent this, the conversion was done in two steps.
>> First the type was changed from u8 to a single-member structure, which
>> invalidated all uses of the field. This allowed going through them one by
>> one and audit for type correctness. Then the structure was replaced with a
>> vanilla u16 again. This should ensure that no place was missed.
>>
>> The UAPI for configuring nexthop group members is that an attribute
>> NHA_GROUP carries an array of struct nexthop_grp entries:
>>
>> struct nexthop_grp {
>> __u32 id; /* nexthop id - must exist */
>> __u8 weight; /* weight of this nexthop */
>> __u8 resvd1;
>> __u16 resvd2;
>> };
>>
>> The field resvd1 is currently validated and required to be zero. We can
>> lift this requirement and carry high-order bits of the weight in the
>> reserved field:
>>
>> struct nexthop_grp {
>> __u32 id; /* nexthop id - must exist */
>> __u8 weight; /* weight of this nexthop */
>> __u8 weight_high;
>> __u16 resvd2;
>> };
>>
>> Keeping the fields split this way was chosen in case an existing userspace
>> makes assumptions about the width of the weight field, and to sidestep any
>> endianes issues.
>
> nit: endianness
Thanks, will fix for v2. For now I'm still waiting if there's other
feedback.
next prev parent reply other threads:[~2024-08-05 12:34 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-01 16:23 [PATCH net-next 0/6] net: nexthop: Increase weight to u16 Petr Machata
2024-08-01 16:23 ` [PATCH net-next 1/6] net: nexthop: Add flag to assert that NHGRP reserved fields are zero Petr Machata
2024-08-05 22:23 ` Jakub Kicinski
2024-08-06 9:48 ` Petr Machata
2024-08-01 16:23 ` [PATCH net-next 2/6] net: nexthop: Increase weight to u16 Petr Machata
2024-08-01 19:39 ` Simon Horman
2024-08-05 12:33 ` Petr Machata [this message]
2024-08-06 14:02 ` Przemek Kitszel
2024-08-01 16:23 ` [PATCH net-next 3/6] selftests: router_mpath: Sleep after MZ Petr Machata
2024-08-01 16:24 ` [PATCH net-next 4/6] selftests: router_mpath_nh: Test 16-bit next hop weights Petr Machata
2024-08-01 16:24 ` [PATCH net-next 5/6] selftests: router_mpath_nh_res: " Petr Machata
2024-08-01 16:24 ` [PATCH net-next 6/6] selftests: fib_nexthops: " Petr Machata
2024-08-01 22:52 ` [PATCH net-next 0/6] net: nexthop: Increase weight to u16 Jakub Kicinski
2024-08-05 8:06 ` Petr Machata
2024-08-05 14:31 ` David Ahern
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8734nj5esp.fsf@nvidia.com \
--to=petrm@nvidia.com \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=idosch@nvidia.com \
--cc=kuba@kernel.org \
--cc=mlxsw@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sharpd@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).