From: Jason Gunthorpe <jgg@nvidia.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: Doug Ledford <dledford@redhat.com>, <linux-rdma@vger.kernel.org>,
	"Shiraz Saleem" <shiraz.saleem@intel.com>
Subject: Re: [PATCH 02/14] RDMA/umem: Prevent small pages from being returned by ib_umem_find_best_pgsz()
Date: Wed, 2 Sep 2020 13:34:07 -0300
Message-ID: <20200902163407.GU1152540@nvidia.com>
In-Reply-To: <20200902120540.GI59010@unreal>

On Wed, Sep 02, 2020 at 03:05:40PM +0300, Leon Romanovsky wrote:
> On Wed, Sep 02, 2020 at 08:59:12AM -0300, Jason Gunthorpe wrote:
> > On Wed, Sep 02, 2020 at 02:51:19PM +0300, Leon Romanovsky wrote:
> > > On Tue, Sep 01, 2020 at 09:43:30PM -0300, Jason Gunthorpe wrote:
> > > > rdma_for_each_block() makes assumptions about how the SGL is constructed
> > > > that don't work if the block size is below the page size used to build
> > > > the SGL.
> > > >
> > > > The rules for umem SGL construction require that the SG's all be PAGE_SIZE
> > > > aligned and we don't encode the actual byte offset of the VA range inside
> > > > the SGL using offset and length. So rdma_for_each_block() has no idea
> > > > where the actual starting/ending point is to compute the first/last block
> > > > boundary if the starting address should be within a SGL.
> > > >
> > > > Fixing the SGL construction turns out to be really hard, and will be the
> > > > subject of other patches. For now block smaller pages.
> > > >
> > > > Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
> > > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > >  drivers/infiniband/core/umem.c | 6 ++++++
> > > >  1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > > > index 120e98403c345d..7b5bc969e55630 100644
> > > > +++ b/drivers/infiniband/core/umem.c
> > > > @@ -151,6 +151,12 @@ unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
> > > >  	dma_addr_t mask;
> > > >  	int i;
> > > >
> > > > +	/* rdma_for_each_block() has a bug if the page size is smaller than the
> > > > +	 * page size used to build the umem. For now prevent smaller page sizes
> > > > +	 * from being returned.
> > > > +	 */
> > > > +	pgsz_bitmap &= GENMASK(BITS_PER_LONG - 1, PAGE_SHIFT);
> > > > +
> > >
> > > Why do we care about such a case? Why can't we leave this check forever?
> >
> > If HW supports only, say 4k page size, and runs on a 64k page size
> > architecture it should be able to fragment into the native HW page
> > size.
> >
> > The whole point of these APIs is to decouple the system and HW page
> > sizes.
> 
> Right now you are preventing such combinations, but is this a real concern
> for existing drivers?

No, I didn't prevent anything, I've left those drivers just hardwired
to use PAGE_SHIFT/PAGE_SIZE.

Maybe they are broken and malfunction on 64k page size systems, maybe
the HW supports other page sizes and they should call
ib_umem_find_best_pgsz(), I don't really know.

The fix is fairly trivial, but it can't be done until the drivers stop
touching umem->sgl - as it requires changing how the sgl is
constructed to match standard kernel expectations, which also breaks
all the drivers.

Jason
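
Editorial illustration (not part of the original message): the effect of
the mask added by the patch can be sketched in userspace C. The GENMASK()
and BITS_PER_LONG definitions below are stand-ins that assume the same
semantics as the kernel macros, and mask_small_pages() and
hw_blocks_per_page() are hypothetical helper names used only for this
example. The sketch assumes a HW page-size bitmap where bit N set means a
page size of 1 << N bytes is supported.

```c
#include <limits.h>

/* Userspace stand-ins, assumed to match the kernel macros' semantics. */
#define BITS_PER_LONG ((unsigned int)(CHAR_BIT * sizeof(unsigned long)))
/* GENMASK(h, l): an unsigned long with bits l..h (inclusive) set. */
#define GENMASK(h, l) \
	(((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

/*
 * Hypothetical helper: drop every HW page size smaller than the system
 * page size, mirroring the line the patch adds:
 *   pgsz_bitmap &= GENMASK(BITS_PER_LONG - 1, PAGE_SHIFT);
 * A result of 0 means no usable page size is left.
 */
static unsigned long mask_small_pages(unsigned long pgsz_bitmap,
				      unsigned int page_shift)
{
	return pgsz_bitmap & GENMASK(BITS_PER_LONG - 1, page_shift);
}

/*
 * Hypothetical helper: how many HW blocks one system page would have to
 * be split into if fragmentation below PAGE_SIZE were allowed.
 * Assumes hw_shift <= page_shift.
 */
static unsigned long hw_blocks_per_page(unsigned int page_shift,
					unsigned int hw_shift)
{
	return 1UL << (page_shift - hw_shift);
}
```

Under these assumptions, on a 64k page system (page_shift of 16) a bitmap
offering 4k and 2M keeps only the 2M bit, while a bitmap offering only 4k
is masked down to 0. That second case is the combination discussed above:
without the mask, each 64k page would need to be split into
hw_blocks_per_page(16, 12), i.e. 16, blocks of 4k.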
