From: David Wei <dw@davidwei.uk>
To: Pavel Begunkov <asml.silence@gmail.com>,
Mina Almasry <almasrymina@google.com>,
Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Leon Romanovsky" <leonro@nvidia.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Christian König" <christian.koenig@amd.com>,
"Samiullah Khawaja" <skhawaja@google.com>,
"Taehee Yoo" <ap420073@gmail.com>,
davem@davemloft.net, pabeni@redhat.com, edumazet@google.com,
netdev@vger.kernel.org, linux-doc@vger.kernel.org,
donald.hunter@gmail.com, corbet@lwn.net,
michael.chan@broadcom.com, kory.maincent@bootlin.com,
andrew@lunn.ch, maxime.chevallier@bootlin.com,
danieller@nvidia.com, hengqi@linux.alibaba.com,
ecree.xilinx@gmail.com, przemyslaw.kitszel@intel.com,
hkallweit1@gmail.com, ahmed.zaki@intel.com,
paul.greenwalt@intel.com, rrameshbabu@nvidia.com,
idosch@nvidia.com, kaiyuanz@google.com, willemb@google.com,
aleksander.lobakin@intel.com, sridhar.samudrala@intel.com,
bcreeley@amd.com, "David Wei" <dw@davidwei.uk>
Subject: Re: [PATCH net-next v3 7/7] bnxt_en: add support for device memory tcp
Date: Tue, 15 Oct 2024 10:38:56 -0700
Message-ID: <592f06dd-cfc1-4e4b-acf9-350e9747d624@davidwei.uk>
In-Reply-To: <75b16ab0-07c0-41d8-9285-0511a10629f7@gmail.com>

On 2024-10-15 07:29, Pavel Begunkov wrote:
> On 10/14/24 23:38, Mina Almasry wrote:
>> On Sat, Oct 12, 2024 at 2:42 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>>>
>>> On Fri, Oct 11, 2024 at 10:33:43AM -0700, Mina Almasry wrote:
>>>> On Thu, Oct 10, 2024 at 6:34 PM Jakub Kicinski <kuba@kernel.org> wrote:
>>>>>
>>>>> On Thu, 10 Oct 2024 10:44:38 -0700 Mina Almasry wrote:
>>>>>>>> I hadn't considered that failing on PP_FLAG_DMA_SYNC_DEV
>>>>>>>> for dmabuf may be wrong.
>>>>>>>> I think device memory TCP is not related to this flag.
>>>>>>>> So device memory TCP core API should not return failure when
>>>>>>>> PP_FLAG_DMA_SYNC_DEV flag is set.
>>>>>>>> How about removing this condition check code in device memory TCP core?
>>>>>>>
>>>>>>> I think we need to invert the check..
>>>>>>> Mina, WDYT?
>>>>>>
>>>>>> On a closer look, my feeling is similar to Taehee,
>>>>>> PP_FLAG_DMA_SYNC_DEV should be orthogonal to memory providers. The
>>>>>> memory providers allocate the memory and provide the dma-addr, but
>>>>>> need not dma-sync the dma-addr, right? The driver can sync the
>>>>>> dma-addr if it wants and the driver can delegate the syncing to the pp
>>>>>> via PP_FLAG_DMA_SYNC_DEV if it wants. AFAICT I think the check should
>>>>>> be removed, not inverted, but I could be missing something.
>>>>>
>>>>> I don't know much about dmabuf but it hinges on the question whether
>>>>> doing DMA sync for device on a dmabuf address is:
>>>>> - a good thing
>>>>> - a noop
>>>>> - a bad thing
>>>>>
>>>>> If it's a good thing or a noop - agreed.
>>>>>
>>>>> Similar question for the sync for CPU.
>>>>>
>>>>> I agree that intuitively it should be all fine. But the fact that dmabuf
>>>>> has a bespoke API for accessing the memory by the CPU makes me worried
>>>>> that there may be assumptions about these addresses not getting
>>>>> randomly fed into the normal DMA API..
>>>>
>>>> Sorry I'm also a bit unsure what is the right thing to do here. The
>>>> code that we've been running in GVE does a dma-sync for cpu
>>>> unconditionally on RX for dma-buf and non-dmabuf dma-addrs and we
>>>> haven't been seeing issues. It never does dma-sync for device.
>>>>
>>>> My first question is why is dma-sync for device needed on RX path at
>>>> all for some drivers in the first place? For incoming (non-dmabuf)
>>>> data, the data is written by the device and read by the cpu, so sync
>>>> for cpu is really what's needed. Is the sync for device for XDP? Or is
>>>> it that buffers should be dma-syncd for device before they are
>>>> re-posted to the NIC?
>>>>
>>>> Christian/Jason, sorry quick question: are
>>>> dma_sync_single_for_{device|cpu} needed or wanted when the dma-addrs
>>>> come from a dma-buf? Or are these dma-addrs to be treated like any other
>>>> with the normal dma_sync_for_{device|cpu} rules?
>>>
>>> Um, I think because dma-buf hacks things up and generates illegal
>>> scatterlist entries with weird dma_map_resource() addresses for the
>>> typical P2P case the dma sync API should not be used on those things.
>>>
>>> However, there is no way to know if the dma-buf does this, and
>>> there are valid cases where the scatterlist is not ill formed and the
>>> sync is necessary.
>>>
>>> We are getting so close to being able to start fixing these API
>>> issues in dmabuf, I hope next cycle we can begin.. Fingers crossed.
>>>
>>> From a CPU architecture perspective you do not need to cache flush PCI
>>> MMIO BAR memory, and perhaps doing so might be problematic on some
>>> arches (???). But you do need to flush normal cachable CPU memory if
>>> that is in the DMA buf.
>>>
>>
>> Thanks Jason. In that case I agree with Jakub we should take in his change here:
>>
>> https://lore.kernel.org/netdev/20241009170102.1980ed1d@kernel.org/
>>
>> With this change the driver would delegate dma_sync_for_device to the
>> page_pool, and the page_pool will skip it altogether for the dma-buf
>> memory provider.
>
> Requiring ->dma_map should be common to all providers as page pool
> shouldn't be dipping into net_iovs figuring out how to map them. However,
> looking at this discussion it seems that the ->dma_sync concern is devmem
> specific and should be discarded by pp providers using dmabufs, i.e. in
> devmem.c:mp_dmabuf_devmem_init().

Yes, that's my preference as well, see my earlier reply.
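
For readers following the thread, the behaviour being agreed on above can be
modelled with a minimal userspace sketch in C. It is not kernel code and not
the patch under review. Every name in it (model_pool, model_provider_ops,
MODEL_FLAG_DMA_SYNC_DEV and so on) is hypothetical, and it only stands in for
the idea that a dma-buf backed provider turns off syncing in its own init hook
instead of rejecting a pool that asked for PP_FLAG_DMA_SYNC_DEV. The real
mp_dmabuf_devmem_init() named in the thread is not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag, standing in for the kernel's PP_FLAG_DMA_SYNC_DEV. */
#define MODEL_FLAG_DMA_SYNC_DEV (1u << 1)

struct model_pool;

/* Hypothetical provider hook table; only the init hook is modelled. */
struct model_provider_ops {
	int (*init)(struct model_pool *pool);
};

struct model_pool {
	unsigned int flags;                      /* what the driver requested */
	bool dma_sync;                           /* what the pool will actually do */
	const struct model_provider_ops *mp_ops; /* NULL means ordinary pages */
	unsigned int syncs_done;                 /* counts simulated syncs */
};

/* dma-buf style provider: accept the pool, but disable syncing for it. */
static int model_dmabuf_init(struct model_pool *pool)
{
	pool->dma_sync = false;
	return 0;
}

static const struct model_provider_ops model_dmabuf_ops = {
	.init = model_dmabuf_init,
};

/* The driver's flag sets the default; the provider may then override it. */
static int model_pool_init(struct model_pool *pool, unsigned int flags,
			   const struct model_provider_ops *ops)
{
	pool->flags = flags;
	pool->dma_sync = (flags & MODEL_FLAG_DMA_SYNC_DEV) != 0;
	pool->mp_ops = ops;
	pool->syncs_done = 0;

	if (ops && ops->init)
		return ops->init(pool);
	return 0;
}

/* Stand-in for the sync-for-device step; skipped when dma_sync is off. */
static void model_pool_sync_for_device(struct model_pool *pool)
{
	if (!pool->dma_sync)
		return;
	pool->syncs_done++;
}
```

In this model a driver can set the flag unconditionally: a pool created with
no provider performs the simulated sync, while a pool created with
model_dmabuf_ops silently skips it, so the driver needs no provider-specific
branch.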