netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Stephen Warren <swarren@wwwdotorg.org>
Cc: Tariq Toukan <tariqt@mellanox.com>,
	xavier.huwei@huawei.com, netdev@vger.kernel.org,
	linux-rdma@vger.kernel.org, Doug Ledford <dledford@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	Stephen Warren <swarren@nvidia.com>
Subject: Re: [PATCH V3] net/mlx4: Get rid of page operation after dma_alloc_coherent
Date: Thu, 20 Dec 2018 10:43:18 -0700	[thread overview]
Message-ID: <20181220174318.GA21404@ziepe.ca> (raw)
In-Reply-To: <20181219182031.8675-1-swarren@wwwdotorg.org>

On Wed, Dec 19, 2018 at 11:20:31AM -0700, Stephen Warren wrote:
> From: Stephen Warren <swarren@nvidia.com>
> 
> This patch solves a crash at the time of mlx4 driver unload or system
> shutdown. The crash occurs because dma_alloc_coherent() returns one
> value in mlx4_alloc_icm_coherent(), but a different value is passed to
> dma_free_coherent() in mlx4_free_icm_coherent(). In turn this is because
> when allocated, that pointer is passed to sg_set_buf() to record it,
> then when freed it is re-calculated by calling
> lowmem_page_address(sg_page()) which returns a different value. Solve
> this by recording the value that dma_alloc_coherent() returns, and
> passing this to dma_free_coherent().
> 
> This patch is roughly equivalent to commit 378efe798ecf ("RDMA/hns: Get
> rid of page operation after dma_alloc_coherent"). That patch was
> described as:
> 
> > In general, dma_alloc_coherent() returns a CPU virtual address and
> > a DMA address, and we have no guarantee that the underlying memory
> > even has an associated struct page at all.
> >
> > This patch gets rid of the page operation after dma_alloc_coherent,
> > and records the VA returned form dma_alloc_coherent in the struct
> > of hem in hns RoCE driver.
> 
> However, this patch reworks the code to store all information about
> coherent chunks separately from the sg list, since using sg lists for
> them doesn't make sense. Hence, the structure of this patch is quite
> different compared to the hns patch.
> 
> Based-on-code-from: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Stephen Warren <swarren@nvidia.com>
> ---
> v3:
> - Rework chunk data structure to store all data for coherent allocations
>   separately from the sg list. Code from Christoph Hellwig with fixes by
>   me. Notes:
>   - chunk->coherent is an int not a bool since checkpatch complains about
>     using bool in structs; see https://lkml.org/lkml/2017/11/21/384.

:( bool is much more readable and there is no performance concern in
this struct. I think checkpatch is overzealous here.

>   - chunk->coherent is used rather than chunk->table->coherent since the
>     table pointer isn't available when creating chunks. This duplicates
>     data, but simplifies the patch.
> v2:
> - Rework mlx4_table_find() to explicitly calculate the returned address
>   differently depending on wheter the table was allocated using
>   dma_alloc_coherent() or alloc_pages(), which in turn allows the
>   changes to mlx4_alloc_icm_pages() to be dropped.
> - Drop changes to mlx4_alloc/free_icm_pages. This path uses
>   pci_map_sg() which can re-write the sg list which in turn would cause
>   chunk->mem[] (the sg list) and chunk->buf[] to become inconsistent.
> - Enhance commit description.
> 
> Note: I've tested this patch in a downstream 4.14 based kernel (using
> ibping, ib_read_bw, and ib_write_bw), but can't test it in mainline
> since my system isn't supported there yet. I have compile-tested it in
> mainline at least, for ARM64.
> ---
>  drivers/net/ethernet/mellanox/mlx4/icm.c | 92 ++++++++++++++----------
>  drivers/net/ethernet/mellanox/mlx4/icm.h | 22 +++++-
>  2 files changed, 75 insertions(+), 39 deletions(-)

I think this needs an ack from the driver maintainers, and I gather
they are all on break for the next two weeks or so.

Lets revisit in January?

But at first glance it looks OK to me, though I would tidy the commit
message somewhat, your new leading paragraph fully explains the
problem, no need for the hns quote - and we can now see that the hns
should't have used a sgl either...

Jason

  reply	other threads:[~2018-12-20 17:43 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-19 18:20 [PATCH V3] net/mlx4: Get rid of page operation after dma_alloc_coherent Stephen Warren
2018-12-20 17:43 ` Jason Gunthorpe [this message]
2018-12-20 17:44   ` Christoph Hellwig
2018-12-20 17:49     ` Bart Van Assche
2018-12-21  2:25       ` rfc: bool structure members (was Re: [PATCH V3] net/mlx4: Get rid of page operation after dma_alloc_coherent) Joe Perches
2018-12-21  2:40         ` Andrew Morton
2018-12-21  3:31         ` Jason Gunthorpe
2018-12-21  5:12           ` Joe Perches
2018-12-21 23:52             ` Jason Gunthorpe
2018-12-23 16:42               ` Gal Pressman
2018-12-23 16:53                 ` Al Viro
2018-12-24 22:08                   ` Jason Gunthorpe
2018-12-24 23:12                 ` Jason Gunthorpe
2018-12-25  8:41                   ` Gal Pressman
2019-01-02 16:29   ` [PATCH V3] net/mlx4: Get rid of page operation after dma_alloc_coherent Stephen Warren
2019-01-03  7:48     ` Christoph Hellwig
2019-01-03 14:32     ` Tariq Toukan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181220174318.GA21404@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=dledford@redhat.com \
    --cc=hch@lst.de \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=swarren@nvidia.com \
    --cc=swarren@wwwdotorg.org \
    --cc=tariqt@mellanox.com \
    --cc=xavier.huwei@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).