From: David Ahern <dsahern@kernel.org>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Mina Almasry <almasrymina@google.com>,
	Praveen Kaligineedi <pkaligineedi@google.com>,
	Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	netdev@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	Magnus Karlsson <magnus.karlsson@intel.com>,
	sdf@google.com, Willem de Bruijn <willemb@google.com>,
	Kaiyuan Zhang <kaiyuanz@google.com>
Subject: Re: [RFC PATCH v2 02/11] netdev: implement netlink api to bind dma-buf to netdevice
Date: Fri, 18 Aug 2023 21:30:15 -0600
Message-ID: <38a06656-b6bf-e6b7-48a1-c489d2d76db8@kernel.org>
In-Reply-To: <20230818190653.78ca6e5a@kernel.org>

On 8/18/23 8:06 PM, Jakub Kicinski wrote:
> On Fri, 18 Aug 2023 19:34:32 -0600 David Ahern wrote:
>> On 8/18/23 3:52 PM, Mina Almasry wrote:
>>> The sticking points are:
>>> 1. From David: this proposal doesn't give an application the ability
>>> to flush an rx queue, which means that we have to rely on a driver
>>> reset that affects all queues to refill the rx queue buffers.  
>>
>> Generically, the design needs to be able to flush (or invalidate) all
>> references to the dma-buf once the process no longer "owns" it.
> 
> Are we talking about the ability for the app to flush the queue
> when it wants to (no idea what for)? Or auto-flush when the app crashes?

If a buffer reference can be invalidated such that a posted buffer is
ignored by H/W, then no flush is needed per se. Either way, the key point
is that posted buffers can no longer be filled by H/W once a process no
longer owns the dma-buf reference. I believe the actual mechanism here
will vary by H/W.
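
To make that concrete: a purely hypothetical hook, with the name and
signature invented here (as noted, the real mechanism will vary by H/W):

	/* Hypothetical per-queue hook; name and signature are
	 * illustrative only. The contract: once this returns, buffers
	 * from this pool that were already posted to the device can no
	 * longer be written by H/W.
	 */
	int (*ndo_rx_buf_invalidate)(struct net_device *dev, u16 rxq_idx,
				     struct page_pool *pool);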

> 
>>> 2. From Jakub: the uAPI and implementation here needs to be in line
>>> with his general direction & extensible to apply to existing use cases
>>> `ethtool -L/-G`, etc.  
>>
>> I think this is a bit more open ended given the flexibility of the
>> netdev netlink API. i.e., managing a H/W queue (create, delete, stop /
>> flush, associate a page_pool) could be done through this API.
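
e.g., a rough sketch of what that could look like as additions to the
netdev genl family - the command and attribute names here are invented
to illustrate the intent, not an existing API:

	/* Hypothetical additions to the netdev genl family;
	 * names and values are illustrative only.
	 */
	enum {
		NETDEV_CMD_QUEUE_CREATE = 16,	/* value illustrative */
		NETDEV_CMD_QUEUE_DELETE,
		NETDEV_CMD_QUEUE_STOP,		/* quiesce / flush */
		NETDEV_CMD_QUEUE_SET_POOL,	/* attach a page_pool */
	};

	enum {
		NETDEV_A_QUEUE_IFINDEX = 1,
		NETDEV_A_QUEUE_ID,
		NETDEV_A_QUEUE_TYPE,		/* rx / tx */
		NETDEV_A_QUEUE_POOL_ID,
		NETDEV_A_QUEUE_DMABUF_FD,
	};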
>>
>>>
>>> AFAIU this is what I need to do in the next version:
>>>
>>> 1. The uAPI will be changed such that it will either re-configure an
>>> existing queue to bind it to the dma-buf, or allocate a new queue
>>> bound to the dma-buf (not sure which is better at the moment). Either  
>>
>> 1. API to manage a page-pool (create, delete, update).
> 
> I wasn't anticipating a "create page pool" API.
> 
> I was thinking of a scheme where user space sets page pool parameters,
> but the driver still creates the pool.

There needs to be a page pool unique to the process (or process group,
depending on design) because of the lifetime of the memory backing it.
The driver or core netdev code can create it; if it is tied to an rx
queue and created by the driver, there are design implications. As
separate objects, page pools and rx queues can have their own lifetimes
(e.g., multiple rx queues sharing a single pp); generically, a H/W queue
and the source of the buffers used to land packets are independent.
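
For reference, the existing in-kernel API already treats the pool as a
first-class object with its own lifetime; roughly (page_pool_create()
and struct page_pool_params are today's API - sharing one pool across
multiple rx queues is the part that would be new):

	#include <net/page_pool.h>

	/* Create a pool independent of any rx queue; queues would
	 * then only take references to it.
	 */
	static struct page_pool *shared_pool_create(struct device *dev, int nid)
	{
		struct page_pool_params pp_params = {
			.flags		= PP_FLAG_DMA_MAP,
			.order		= 0,
			.pool_size	= 1024,
			.nid		= nid,
			.dev		= dev,
			.dma_dir	= DMA_FROM_DEVICE,
		};

		return page_pool_create(&pp_params);
	}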

> 
> But I guess it is doable. More work, tho. Are there ibverbs which
> can do it? lol.

Well, generically yes - think intent, not necessarily a 1:1 mapping.

> 
>> 2. API to add and remove a dma-buf (or host memory buffer) with a
>> page-pool. Removal may take time to flush references pushed to hardware,
>> so this would be asynchronous.
>>
>> 3. Create a queue or use an existing queue id and associate a page-pool
>> with it.
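
To illustrate 2 and 3: the binding would be an object in its own right,
along these lines (a sketch only - struct and field names invented
here):

	/* Sketch of the dma-buf binding implied by points 2 and 3.
	 * Removal is asynchronous: the attachment is torn down only
	 * once all buffer references pushed to H/W have drained.
	 */
	struct netdev_dmabuf_binding {
		struct dma_buf			*dmabuf;
		struct dma_buf_attachment	*attachment;
		struct sg_table			*sgt;
		struct net_device		*dev;
		refcount_t			ref;		/* posted to H/W */
		struct work_struct		unbind_work;	/* async removal */
	};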
>>
>>> way, the configuration will take place immediately, and not rely on an
>>> entire driver reset to actuate the change.  
>>
>> yes
>>
>>>
>>> 2. The uAPI will be changed such that if the netlink socket is closed,
>>> or the process dies, the rx queue will be unbound from the dma-buf or
>>> the rx queue will be freed entirely (again, not sure which is better  
>>
>> I think those are separate actions. But if the queue was created by and
>> is referenced by a process, then closing the fd means it should be freed.
>>
>>> at the moment). The configuration will take place immediately without
>>> relying on a driver reset.  
>>
>> yes on the reset.
>>
>>>
>>> 3. I will add 4 new net_device_ops that Jakub specified:
>>> queue_mem_alloc/free(), and queue_start/stop().
>>>
>>> 4. The uAPI mentioned in #1 will use the new net_device_ops to
>>> allocate or reconfigure a queue attached to the provided dma-buf.
> 
> I'd leave 2, 3, 4 alone for now. Focus on binding a page pool to 
> an existing queue.
> 
>>> Does this sound roughly reasonable here?
>>>
>>> AFAICT the only technical difficulty is that I'm not sure it's
>>> feasible for a driver to start or stop a single rx queue without
>>> triggering a full driver reset. The two drivers I looked at both do a
>>> full reset to change any queue configuration. I'll investigate.  
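
FWIW, this is roughly how I read the intent of those four ops -
signatures are my guess, not merged code:

	/* Hypothetical per-queue ops; names from the discussion above,
	 * signatures invented here. Splitting mem_alloc/free from
	 * start/stop allows make-before-break: allocate the new queue
	 * memory first, then quiesce and restart only the one queue -
	 * no full driver reset.
	 */
	struct netdev_queue_mgmt_ops {
		void *	(*ndo_queue_mem_alloc)(struct net_device *dev, int idx);
		void	(*ndo_queue_mem_free)(struct net_device *dev, void *mem);
		int	(*ndo_queue_stop)(struct net_device *dev, int idx);
		int	(*ndo_queue_start)(struct net_device *dev, int idx, void *mem);
	};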