From: Pavel Begunkov <asml.silence@gmail.com>
To: Mina Almasry <almasrymina@google.com>
Cc: "Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, virtualization@lists.linux.dev,
kvm@vger.kernel.org, linux-kselftest@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Simon Horman" <horms@kernel.org>,
"Donald Hunter" <donald.hunter@gmail.com>,
"Jonathan Corbet" <corbet@lwn.net>,
"Andrew Lunn" <andrew+netdev@lunn.ch>,
"David Ahern" <dsahern@kernel.org>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
"Eugenio Pérez" <eperezma@redhat.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Stefano Garzarella" <sgarzare@redhat.com>,
"Shuah Khan" <shuah@kernel.org>,
"Kaiyuan Zhang" <kaiyuanz@google.com>,
"Willem de Bruijn" <willemb@google.com>,
"Samiullah Khawaja" <skhawaja@google.com>,
"Stanislav Fomichev" <sdf@fomichev.me>,
"Joe Damato" <jdamato@fastly.com>,
dw@davidwei.uk
Subject: Re: [PATCH RFC net-next v1 5/5] net: devmem: Implement TX path
Date: Wed, 5 Feb 2025 22:16:18 +0000
Message-ID: <88cb8f03-7976-4846-a74d-e2d234c5cf8d@gmail.com>
In-Reply-To: <CAHS8izMcs=3qo1jhZSM499mxHh10-oBL6Fhb2W0eKWhJGax4Bg@mail.gmail.com>
On 2/5/25 20:22, Mina Almasry wrote:
> On Wed, Feb 5, 2025 at 4:41 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> On 1/28/25 14:49, Willem de Bruijn wrote:
>>>>>> +struct net_devmem_dmabuf_binding *
>>>>>> +net_devmem_get_sockc_binding(struct sock *sk, struct sockcm_cookie *sockc)
>>>>>> +{
>>>>>> + struct net_devmem_dmabuf_binding *binding;
>>>>>> + int err = 0;
>>>>>> +
>>>>>> + binding = net_devmem_lookup_dmabuf(sockc->dmabuf_id);
>>>>>
>>>>> This lookup is from global xarray net_devmem_dmabuf_bindings.
>>>>>
>>>>> Is there a check that the socket is sending out through the device
>>>>> to which this dmabuf was bound with netlink? Should there be?
>>>>> (e.g., SO_BINDTODEVICE).
>>>>>
>>>>
>>>> Yes, I think it may be an issue if the user triggers a send from a
>>>> different netdevice, because indeed when we bind a dmabuf we bind it
>>>> to a specific netdevice.
>>>>
>>>> One option is as you say to require TX sockets to be bound and to
>>>> check that we're bound to the correct netdev. I also wonder if I can
>>>> make this work without SO_BINDTODEVICE, by querying the netdev the
>>>> sock is currently trying to send out on and doing a check in the
>>>> tcp_sendmsg. I'm not sure if this is possible but I'll give it a look.
>>>
>>> I was a bit quick on mentioning SO_BINDTODEVICE. Agreed that it is
>>> vastly preferable to not require that, but infer the device from
>>> the connected TCP sock.
>>
>> I wonder why so? I'd imagine something like SO_BINDTODEVICE is a
>> better way to go. The user has to do it anyway, otherwise packets
>> might go to a different device and the user would suddenly start
>> getting errors with no good way to alleviate them (apart from
>> likes of SO_BINDTODEVICE). It's even worse if it works for a while
>> but starts to unpredictably fail as time passes. With binding at
>> least it'd fail fast if the setup is not done correctly.
>>
>
> I think there may be a misunderstanding. There is nothing preventing
> the user from SO_BINDTODEVICE to make sure the socket is bound to the
Right, not arguing otherwise
> ifindex, and the test changes in the latest series actually do this
> binding.
>
> It's just that on TX, we check what device we happen to be going out
> over, and fail if we're going out of a different device.
>
> There are setups where the device will always be correct even without
> SO_BINDTODEVICE. Like if the host has only 1 interface or if the
> egress IP is only reachable over 1 interface. I don't see much reason
> to require the user to SO_BINDTODEVICE in these cases.
That's exactly the problem. People would test their code with one setup
where it works just fine, but then there will be a rare user of a
library used by some other framework, or a lonely server, where it starts
to fail for no apparent reason while "it worked before and nothing has
changed". It's more predictable if enforced.
I don't think we'd care about the setup overhead of one extra
setsockopt() here(?), but with this option we'd need to be careful
about not racing with rebinding, if it's allowed.
--
Pavel Begunkov