From: David Wei <dw@davidwei.uk>
To: Pavel Begunkov <asml.silence@gmail.com>,
	Mina Almasry <almasrymina@google.com>,
	Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Leon Romanovsky" <leonro@nvidia.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Samiullah Khawaja" <skhawaja@google.com>,
	"Taehee Yoo" <ap420073@gmail.com>,
	davem@davemloft.net, pabeni@redhat.com, edumazet@google.com,
	netdev@vger.kernel.org, linux-doc@vger.kernel.org,
	donald.hunter@gmail.com, corbet@lwn.net,
	michael.chan@broadcom.com, kory.maincent@bootlin.com,
	andrew@lunn.ch, maxime.chevallier@bootlin.com,
	danieller@nvidia.com, hengqi@linux.alibaba.com,
	ecree.xilinx@gmail.com, przemyslaw.kitszel@intel.com,
	hkallweit1@gmail.com, ahmed.zaki@intel.com,
	paul.greenwalt@intel.com, rrameshbabu@nvidia.com,
	idosch@nvidia.com, kaiyuanz@google.com, willemb@google.com,
	aleksander.lobakin@intel.com, sridhar.samudrala@intel.com,
	bcreeley@amd.com, "David Wei" <dw@davidwei.uk>
Subject: Re: [PATCH net-next v3 7/7] bnxt_en: add support for device memory tcp
Date: Tue, 15 Oct 2024 10:38:56 -0700
Message-ID: <592f06dd-cfc1-4e4b-acf9-350e9747d624@davidwei.uk>
In-Reply-To: <75b16ab0-07c0-41d8-9285-0511a10629f7@gmail.com>

On 2024-10-15 07:29, Pavel Begunkov wrote:
> On 10/14/24 23:38, Mina Almasry wrote:
>> On Sat, Oct 12, 2024 at 2:42 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>>>
>>> On Fri, Oct 11, 2024 at 10:33:43AM -0700, Mina Almasry wrote:
>>>> On Thu, Oct 10, 2024 at 6:34 PM Jakub Kicinski <kuba@kernel.org> wrote:
>>>>>
>>>>> On Thu, 10 Oct 2024 10:44:38 -0700 Mina Almasry wrote:
>>>>>>>> I hadn't considered that failing on PP_FLAG_DMA_SYNC_DEV
>>>>>>>> for dmabuf may be wrong.
>>>>>>>> I think device memory TCP is not related to this flag.
>>>>>>>> So device memory TCP core API should not return failure when
>>>>>>>> PP_FLAG_DMA_SYNC_DEV flag is set.
>>>>>>>> How about removing this condition check code in device memory TCP core?
>>>>>>>
>>>>>>> I think we need to invert the check..
>>>>>>> Mina, WDYT?
>>>>>>
>>>>>> On a closer look, my feeling is similar to Taehee,
>>>>>> PP_FLAG_DMA_SYNC_DEV should be orthogonal to memory providers. The
>>>>>> memory providers allocate the memory and provide the dma-addr, but
>>>>>> need not dma-sync the dma-addr, right? The driver can sync the
>>>>>> dma-addr if it wants and the driver can delegate the syncing to the pp
>>>>>> via PP_FLAG_DMA_SYNC_DEV if it wants. AFAICT I think the check should
>>>>>> be removed, not inverted, but I could be missing something.
>>>>>
>>>>> I don't know much about dmabuf but it hinges on the question whether
>>>>> doing DMA sync for device on a dmabuf address is :
>>>>>   - a good thing
>>>>>   - a noop
>>>>>   - a bad thing
>>>>>
>>>>> If it's a good thing or a noop - agreed.
>>>>>
>>>>> Similar question for the sync for CPU.
>>>>>
>>>>> I agree that intuitively it should be all fine. But the fact that dmabuf
>>>>> has a bespoke API for accessing the memory by the CPU makes me worried
>>>>> that there may be assumptions about these addresses not getting
>>>>> randomly fed into the normal DMA API..
>>>>
>>>> Sorry I'm also a bit unsure what is the right thing to do here. The
>>>> code that we've been running in GVE does a dma-sync for cpu
>>>> unconditionally on RX for dma-buf and non-dmabuf dma-addrs and we
>>>> haven't been seeing issues. It never does dma-sync for device.
>>>>
>>>> My first question is why is dma-sync for device needed on RX path at
>>>> all for some drivers in the first place? For incoming (non-dmabuf)
>>>> data, the data is written by the device and read by the cpu, so sync
>>>> for cpu is really what's needed. Is the sync for device for XDP? Or is
>>>> it that buffers should be dma-syncd for device before they are
>>>> re-posted to the NIC?
>>>>
>>>> Christian/Jason, sorry quick question: are
>>>> dma_sync_single_for_{device|cpu} needed or wanted when the dma-addrs
>>>> come from a dma-buf? Or are these dma-addrs to be treated like any other
>>>> with the normal dma_sync_for_{device|cpu} rules?
>>>
>>> Um, I think because dma-buf hacks things up and generates illegal
>>> scatterlist entries with weird dma_map_resource() addresses for the
>>> typical P2P case the dma sync API should not be used on those things.
>>>
>>> However, there is no way to know if the dma-buf does this, and
>>> there are valid cases where the scatterlist is not ill formed and the
>>> sync is necessary.
>>>
>>> We are getting so close to being able to start fixing these API
>>> issues in dmabuf, I hope next cycle we can begin.. Fingers crossed.
>>>
>>> From a CPU architecture perspective you do not need to cache flush PCI
>>> MMIO BAR memory, and perhaps doing so might be problematic on some
>>> arches (???). But you do need to flush normal cachable CPU memory if
>>> that is in the DMA buf.
>>>
>>
>> Thanks Jason. In that case I agree with Jakub we should take in his change here:
>>
>> https://lore.kernel.org/netdev/20241009170102.1980ed1d@kernel.org/
>>
>> With this change the driver would delegate dma_sync_for_device to the
>> page_pool, and the page_pool will skip it altogether for the dma-buf
>> memory provider.
> 
> Requiring ->dma_map should be common to all providers as page pool
> shouldn't be dipping into net_iovs figuring out how to map them. However,
> looking at this discussion it seems that the ->dma_sync concern is devmem
> specific and should be discarded by pp providers using dmabufs, i.e. in
> devmem.c:mp_dmabuf_devmem_init().

Yes, that's my preference as well, see my earlier reply.

> 
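
To make the proposal above concrete, here is a minimal user-space sketch
of the idea: the pool honours the driver's request to delegate
sync-for-device, every provider requires DMA mapping, and the dma-buf
provider discards the sync in its own init hook. This is not kernel code.
Every name prefixed with sketch_ or SKETCH_ is hypothetical and only
models the behaviour discussed in this thread; the real structures and
the real mp_dmabuf_devmem_init() may look quite different.

```c
/* User-space model only; NOT kernel code. All sketch_ names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the page pool flags discussed in the thread. */
#define SKETCH_PP_FLAG_DMA_MAP      (1u << 0)
#define SKETCH_PP_FLAG_DMA_SYNC_DEV (1u << 1)

struct sketch_pool {
	unsigned int flags; /* flags requested by the driver */
	bool dma_sync;      /* does the pool sync buffers for the device? */
};

/* Generic pool setup: honour the driver's request to delegate syncing. */
void sketch_pool_init(struct sketch_pool *pool, unsigned int flags)
{
	assert(pool != NULL);
	pool->flags = flags;
	pool->dma_sync = (flags & SKETCH_PP_FLAG_DMA_SYNC_DEV) != 0;
}

/*
 * Model of a dma-buf provider init hook: mapping is required, but the
 * sync-for-device request is silently discarded instead of failing.
 * Returns 0 on success, -1 if the pool was not set up for DMA mapping.
 */
int sketch_dmabuf_provider_init(struct sketch_pool *pool)
{
	assert(pool != NULL);
	if (!(pool->flags & SKETCH_PP_FLAG_DMA_MAP))
		return -1;
	pool->dma_sync = false;
	return 0;
}

/* What the recycle path would consult before syncing a buffer. */
bool sketch_should_sync_for_device(const struct sketch_pool *pool)
{
	return pool->dma_sync;
}
```

With this shape, a driver can set both flags unconditionally: a pool
backed by normal pages keeps syncing for the device, while a pool whose
provider is dma-buf backed skips the sync without the driver having to
know which kind of memory it received.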
