From: David Ahern <dsahern@kernel.org>
To: Mina Almasry <almasrymina@google.com>
Cc: "Christian König" <christian.koenig@amd.com>,
"Hari Ramakrishnan" <rharix@google.com>,
"Jason Gunthorpe" <jgg@ziepe.ca>,
"Samiullah Khawaja" <skhawaja@google.com>,
"Willem de Bruijn" <willemb@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Christoph Hellwig" <hch@lst.de>,
"John Hubbard" <jhubbard@nvidia.com>,
"Dan Williams" <dan.j.williams@intel.com>,
"Jesper Dangaard Brouer" <jbrouer@redhat.com>,
brouer@redhat.com, "Alexander Duyck" <alexander.duyck@gmail.com>,
"Yunsheng Lin" <linyunsheng@huawei.com>,
davem@davemloft.net, pabeni@redhat.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org,
"Lorenzo Bianconi" <lorenzo@kernel.org>,
"Yisen Zhuang" <yisen.zhuang@huawei.com>,
"Salil Mehta" <salil.mehta@huawei.com>,
"Eric Dumazet" <edumazet@google.com>,
"Sunil Goutham" <sgoutham@marvell.com>,
"Geetha sowjanya" <gakula@marvell.com>,
"Subbaraya Sundeep" <sbhatta@marvell.com>,
hariprasad <hkelam@marvell.com>,
"Saeed Mahameed" <saeedm@nvidia.com>,
"Leon Romanovsky" <leon@kernel.org>,
"Felix Fietkau" <nbd@nbd.name>,
"Ryder Lee" <ryder.lee@mediatek.com>,
"Shayne Chen" <shayne.chen@mediatek.com>,
"Sean Wang" <sean.wang@mediatek.com>,
"Kalle Valo" <kvalo@kernel.org>,
"Matthias Brugger" <matthias.bgg@gmail.com>,
"AngeloGioacchino Del Regno"
<angelogioacchino.delregno@collabora.com>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
linux-rdma@vger.kernel.org, linux-wireless@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-mediatek@lists.infradead.org,
"Jonathan Lemon" <jonathan.lemon@gmail.com>,
logang@deltatee.com, "Bjorn Helgaas" <bhelgaas@google.com>
Subject: Re: Memory providers multiplexing (Was: [PATCH net-next v4 4/5] page_pool: remove PP_FLAG_PAGE_FRAG flag)
Date: Sun, 16 Jul 2023 21:08:16 -0600
Message-ID: <765b02a5-2f09-e744-f441-c082fa3987ff@kernel.org>
In-Reply-To: <CAHS8izOL593X7=9pGaeC1JJ_5hYookZDn7O=fike=e48+myvxA@mail.gmail.com>
On 7/16/23 8:05 PM, Mina Almasry wrote:
>>
>> For the driver and hardware queue: don't you need a dedicated queue for
>> the flow(s) in question?
>
> In the RFC and the implementation I'm thinking of, the queue is
> 'dedicated' in that each queue will be a devmem TCP queue or a regular
> queue. devmem queues generate devmem skbs and non-devmem queues
> generate non-devmem skbs. We support switching queues between devmem
> mode and non-devmem mode via a uapi.
ethtool APIs or something else?
>
>> If not, how can you properly handle the
>> teardown case (e.g., app crashes and you need to ensure all references
>> to GPU memory are removed from NIC descriptors)?
>
> Jason and Christian will correct me if I'm wrong, but AFAICT the
> dma-buf API requires the dma-buf provider to keep the attachment
> mapping alive as long as the importer requires it. The dma-buf API
> gives the importer dma_buf_map_attachment() and
> dma_buf_unmap_attachment() APIs, but there is no callback for the
> exporter to inform the importer that it has to take the mapping away.
Isn't the importer the application that terminated (cleanly or otherwise)?
That was my thinking, but I guess there are other designs that can span
more than a single application.
> The closest thing I saw was the move_notify() callback, but that is
> optional.
>
> In my mind the way it works is that there will be some uapi that binds
> a dma-buf to an RX queue, that will create the attachment and the
> mapping. If the user crashes or closes the dma-buf handle then that
> will unbind the dma-buf from the RX queue, but the mapping will remain
> alive (via some refcounting) until all the NIC descriptors are freed
> and the mapping is not under use anymore. Usually this will happen
> next driver reset which destroys and recreates rx queues thereby
> freeing all the NIC descriptors (but could be a new API so that we
> don't rely on a driver reset).
>
>> If you agree on this
>> point, then you can require the dedicated queue management in the driver
>> to use and expect only the alternative frag addressing scheme. ie., it
>> knows the address is not struct page (validates by checking skb flag or
>> frag flag or address magic), but a reference to say a page_pool entry
>> (if you are using page_pool for management of the dmabuf slices) which
>> contains the metadata needed for the use case.
>
> Honestly if my understanding above doesn't match what you want, I
> could implement 'dedicated queues' instead, just let me know what you
> want at some future iteration. Now, I'm more worried about this memory
> format issue and I'm working on an RX prototype without struct pages.
> So far purely technically speaking it seems possible.
>
>
My comment was only a suggestion on how to simplify driver changes, i.e.,
a queue is either pages (based on the standard page_pool or alloc_pages) or
some "special" page_pool (i.e., a new abstraction), but not mixed. In that
case the driver knows how to handle the overloaded 'address' in skb_frag in
a clean manner.
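
As a rough illustration of the "not mixed" idea, here is a userspace
sketch only. Every name in it (rxq_mem_mode, devmem_entry, frag,
FRAG_ADDR_DEVMEM and the helpers) is hypothetical and is not an existing
kernel API; it assumes the low bit of the frag address is free to act as
the "address magic" mentioned earlier, and that the queue mode is fixed
while the queue is in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a queue serves exactly one kind of memory, never both. */
enum rxq_mem_mode {
	RXQ_MEM_PAGES,	/* frags reference ordinary pages */
	RXQ_MEM_DEVMEM,	/* frags reference provider entries (e.g. dma-buf slices) */
};

/* Hypothetical per-slice metadata kept by the "special" provider. */
struct devmem_entry {
	uint64_t dma_addr;
	uint32_t len;
};

struct rx_queue {
	enum rxq_mem_mode mode;
};

/* Assumption: the low address bit is unused and can tag non-page frags. */
#define FRAG_ADDR_DEVMEM ((uintptr_t)0x1)

/* Stand-in for skb_frag with an overloaded address field. */
struct frag {
	uintptr_t addr;
	uint32_t len;
};

/* Encode a provider entry into the overloaded address. */
static inline uintptr_t frag_encode_devmem(const struct devmem_entry *e)
{
	/* The tag bit must be free, i.e. the entry is at least 2-byte aligned. */
	assert(((uintptr_t)e & FRAG_ADDR_DEVMEM) == 0);
	return (uintptr_t)e | FRAG_ADDR_DEVMEM;
}

static inline bool frag_is_devmem(const struct frag *f)
{
	return (f->addr & FRAG_ADDR_DEVMEM) != 0;
}

/* Return the provider entry, or NULL if the frag is an ordinary page. */
static inline struct devmem_entry *frag_devmem_entry(const struct frag *f)
{
	if (!frag_is_devmem(f))
		return NULL;
	return (struct devmem_entry *)(f->addr & ~FRAG_ADDR_DEVMEM);
}

/* A queue accepts a frag only if the frag kind matches the queue mode. */
static inline bool rxq_frag_ok(const struct rx_queue *q, const struct frag *f)
{
	return (q->mode == RXQ_MEM_DEVMEM) == frag_is_devmem(f);
}
```

With a per-queue mode like this, the driver's RX path would only need one
check per queue rather than per-frag branching on mixed memory types.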