From: Leon Romanovsky <leonro@mellanox.com>
To: Doug Ledford <dledford@redhat.com>, Jason Gunthorpe <jgg@mellanox.com>
Cc: RDMA mailing list <linux-rdma@vger.kernel.org>,
Ariel Levkovich <lariel@mellanox.com>,
Yishai Hadas <yishaih@mellanox.com>
Subject: Re: [PATCH rdma-rc 2/2] IB/mlx5: Fix device memory flows
Date: Thu, 12 Dec 2019 11:21:46 +0000
Message-ID: <20191212112144.GW67461@unreal>
In-Reply-To: <20191212111457.GV67461@unreal>
On Thu, Dec 12, 2019 at 11:15:01AM +0000, Leon Romanovsky wrote:
> On Thu, Dec 12, 2019 at 12:02:37PM +0200, Leon Romanovsky wrote:
> > From: Yishai Hadas <yishaih@mellanox.com>
> >
> > Fix the device memory flows so that the matching object is destroyed
> > only once there is no live mmaped VA left for a given allocation.
> >
> > This prevents a potential scenario in which a VA that was mmaped by
> > one process could still be used after its deallocation, even though
> > the memory is now owned by another process.
> >
> > This is achieved by integrating with the IB core APIs that manage
> > mmap/munmap. The DM object and its underlying area are freed only
> > once the refcount drops to 0.
> >
> > Fixes: 3b113a1ec3d4 ("IB/mlx5: Support device memory type attribute")
> > Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > ---
> > drivers/infiniband/hw/mlx5/cmd.c | 16 ++--
> > drivers/infiniband/hw/mlx5/cmd.h | 2 +-
> > drivers/infiniband/hw/mlx5/main.c | 120 ++++++++++++++++++---------
> > drivers/infiniband/hw/mlx5/mlx5_ib.h | 19 ++++-
> > 4 files changed, 105 insertions(+), 52 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
> > index 4937947400cd..4c26492ab8a3 100644
> > --- a/drivers/infiniband/hw/mlx5/cmd.c
> > +++ b/drivers/infiniband/hw/mlx5/cmd.c
> > @@ -157,7 +157,7 @@ int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr,
> > return -ENOMEM;
> > }
> >
> > -int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length)
> > +void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length)
> > {
> > struct mlx5_core_dev *dev = dm->dev;
> > u64 hw_start_addr = MLX5_CAP64_DEV_MEM(dev, memic_bar_start_addr);
> > @@ -175,15 +175,13 @@ int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length)
> > MLX5_SET(dealloc_memic_in, in, memic_size, length);
> >
> > err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
> > + if (err)
> > + return;
> >
> > - if (!err) {
> > - spin_lock(&dm->lock);
> > - bitmap_clear(dm->memic_alloc_pages,
> > - start_page_idx, num_pages);
> > - spin_unlock(&dm->lock);
> > - }
> > -
> > - return err;
> > + spin_lock(&dm->lock);
> > + bitmap_clear(dm->memic_alloc_pages,
> > + start_page_idx, num_pages);
> > + spin_unlock(&dm->lock);
> > }
> >
> > int mlx5_cmd_query_ext_ppcnt_counters(struct mlx5_core_dev *dev, void *out)
> > diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h
> > index 169cab4915e3..945ebce73613 100644
> > --- a/drivers/infiniband/hw/mlx5/cmd.h
> > +++ b/drivers/infiniband/hw/mlx5/cmd.h
> > @@ -46,7 +46,7 @@ int mlx5_cmd_modify_cong_params(struct mlx5_core_dev *mdev,
> > void *in, int in_size);
> > int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr,
> > u64 length, u32 alignment);
> > -int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length);
> > +void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length);
> > void mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid);
> > void mlx5_cmd_destroy_tir(struct mlx5_core_dev *dev, u32 tirn, u16 uid);
> > void mlx5_cmd_destroy_tis(struct mlx5_core_dev *dev, u32 tisn, u16 uid);
> > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > index 2f5a159cbe1c..4d89d85226c2 100644
> > --- a/drivers/infiniband/hw/mlx5/main.c
> > +++ b/drivers/infiniband/hw/mlx5/main.c
> > @@ -2074,6 +2074,24 @@ static int mlx5_ib_mmap_clock_info_page(struct mlx5_ib_dev *dev,
> > virt_to_page(dev->mdev->clock_info));
> > }
> >
> > +static void mlx5_ib_mmap_free(struct rdma_user_mmap_entry *entry)
> > +{
> > + struct mlx5_user_mmap_entry *mentry = to_mmmap(entry);
> > + struct mlx5_ib_dev *dev = to_mdev(entry->ucontext->device);
> > + struct mlx5_ib_dm *mdm;
> > +
> > + switch (mentry->mmap_flag) {
> > + case MLX5_IB_MMAP_TYPE_MEMIC:
> > + mdm = container_of(mentry, struct mlx5_ib_dm, mentry);
> > + mlx5_cmd_dealloc_memic(&dev->dm, mdm->dev_addr,
> > + mdm->size);
> > + kfree(mdm);
> > + break;
> > + default:
> > + WARN_ON(true);
> > + }
> > +}
> > +
> > static int uar_mmap(struct mlx5_ib_dev *dev, enum mlx5_ib_mmap_cmd cmd,
> > struct vm_area_struct *vma,
> > struct mlx5_ib_ucontext *context)
> > @@ -2186,26 +2204,55 @@ static int uar_mmap(struct mlx5_ib_dev *dev, enum mlx5_ib_mmap_cmd cmd,
> > return err;
> > }
> >
> > -static int dm_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
> > +static int add_dm_mmap_entry(struct ib_ucontext *context,
> > + struct mlx5_ib_dm *mdm,
> > + u64 address)
> > +{
> > + mdm->mentry.mmap_flag = MLX5_IB_MMAP_TYPE_MEMIC;
> > + mdm->mentry.address = address;
> > + return rdma_user_mmap_entry_insert_range(
> > + context, &mdm->mentry.rdma_entry,
> > + mdm->size,
> > + MLX5_IB_MMAP_DEVICE_MEM << 16,
> > + (MLX5_IB_MMAP_DEVICE_MEM << 16) + (1UL << 16) - 1);
> > +}
> > +
> > +static unsigned long mlx5_vma_to_pgoff(struct vm_area_struct *vma)
> > +{
> > + unsigned long idx;
> > + u8 command;
> > +
> > + command = get_command(vma->vm_pgoff);
> > + idx = get_extended_index(vma->vm_pgoff);
> > +
> > + return (command << 16 | idx);
> > +}
> > +
> > +static int mlx5_ib_mmap_offset(struct mlx5_ib_dev *dev,
> > + struct vm_area_struct *vma,
> > + struct ib_ucontext *ucontext)
> > {
> > - struct mlx5_ib_ucontext *mctx = to_mucontext(context);
> > - struct mlx5_ib_dev *dev = to_mdev(context->device);
> > - u16 page_idx = get_extended_index(vma->vm_pgoff);
> > - size_t map_size = vma->vm_end - vma->vm_start;
> > - u32 npages = map_size >> PAGE_SHIFT;
> > + struct mlx5_user_mmap_entry *mentry;
> > + struct rdma_user_mmap_entry *entry;
> > + unsigned long pgoff;
> > + pgprot_t prot;
> > phys_addr_t pfn;
> > + int ret;
> >
> > - if (find_next_zero_bit(mctx->dm_pages, page_idx + npages, page_idx) !=
> > - page_idx + npages)
> > + pgoff = mlx5_vma_to_pgoff(vma);
> > + entry = rdma_user_mmap_entry_get_pgoff(ucontext, pgoff);
> > + if (!entry)
> > return -EINVAL;
> >
> > - pfn = ((dev->mdev->bar_addr +
> > - MLX5_CAP64_DEV_MEM(dev->mdev, memic_bar_start_addr)) >>
> > - PAGE_SHIFT) +
> > - page_idx;
> > - return rdma_user_mmap_io(context, vma, pfn, map_size,
> > - pgprot_writecombine(vma->vm_page_prot),
> > - NULL);
> > + mentry = to_mmmap(entry);
> > + pfn = (mentry->address >> PAGE_SHIFT);
> > + prot = pgprot_writecombine(vma->vm_page_prot);
> > + ret = rdma_user_mmap_io(ucontext, vma, pfn,
> > + entry->npages * PAGE_SIZE,
> > + prot,
> > + entry);
> > + rdma_user_mmap_entry_put(&mentry->rdma_entry);
> > + return ret;
> > }
> >
> > static int mlx5_ib_mmap(struct ib_ucontext *ibcontext, struct vm_area_struct *vma)
> > @@ -2248,11 +2295,8 @@ static int mlx5_ib_mmap(struct ib_ucontext *ibcontext, struct vm_area_struct *vm
> > case MLX5_IB_MMAP_CLOCK_INFO:
> > return mlx5_ib_mmap_clock_info_page(dev, vma, context);
> >
> > - case MLX5_IB_MMAP_DEVICE_MEM:
> > - return dm_mmap(ibcontext, vma);
> > -
> > default:
> > - return -EINVAL;
> > + return mlx5_ib_mmap_offset(dev, vma, ibcontext);
> > }
> >
> > return 0;
> > @@ -2288,8 +2332,9 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx,
> > {
> > struct mlx5_dm *dm_db = &to_mdev(ctx->device)->dm;
> > u64 start_offset;
> > - u32 page_idx;
> > + u16 page_idx = 0;
>
> This hunk is not needed.
To be clear, I meant this only about the "= 0" part. The change of the
type is still needed.
Thanks
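
In other words, a sketch of how that hunk could look with the "= 0"
dropped but the u32 -> u16 change kept (assuming the rest of the hunk
sets page_idx on every path that copies it back to userspace):

	-	u32 page_idx;
	+	u16 page_idx;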
>
> Thanks
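
For reference, the mmap-entry lifecycle the fix builds on, sketched only
from the calls visible in the diff above (error paths trimmed; the exact
IB core behaviour may differ):

	/* alloc: register the DM region so its pgoff range is owned by the
	 * IB core mmap machinery (add_dm_mmap_entry() above) */
	err = rdma_user_mmap_entry_insert_range(ctx, &mdm->mentry.rdma_entry,
						mdm->size,
						MLX5_IB_MMAP_DEVICE_MEM << 16,
						(MLX5_IB_MMAP_DEVICE_MEM << 16) +
							(1UL << 16) - 1);

	/* mmap: look the entry up by pgoff; rdma_user_mmap_io() ties the
	 * VMA to the entry so every live mapping holds a reference */
	entry = rdma_user_mmap_entry_get_pgoff(ucontext, pgoff);
	ret = rdma_user_mmap_io(ucontext, vma, pfn, entry->npages * PAGE_SIZE,
				pgprot_writecombine(vma->vm_page_prot), entry);
	rdma_user_mmap_entry_put(entry);

	/* free: the core invokes the .mmap_free hook (mlx5_ib_mmap_free()
	 * above) only once the entry's refcount drops to zero, i.e. the DM
	 * was deallocated and no live mapping remains; the hook then calls
	 * mlx5_cmd_dealloc_memic() and kfree()s the DM object */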