From: Jason Gunthorpe <jgg@nvidia.com>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: "trondmy@kernel.org" <trondmy@kernel.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH] RDMA: null pointer in __ib_umem_release causes kernel panic
Date: Wed, 5 Jan 2022 12:09:16 -0400
Message-ID: <20220105160916.GT2328285@nvidia.com>
In-Reply-To: <3b74b8f4481ec27debad500e53facc56f9b388cd.camel@hammerspace.com>

On Wed, Jan 05, 2022 at 03:02:34PM +0000, Trond Myklebust wrote:
> On Wed, 2022-01-05 at 10:37 -0400, Jason Gunthorpe wrote:
> > On Wed, Jan 05, 2022 at 09:18:41AM -0500, trondmy@kernel.org wrote:
> > > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > 
> > > When doing RPC/RDMA, we're seeing a kernel panic when __ib_umem_release()
> > > iterates over the scatter gather list and hits NULL pages.
> > > 
> > > It turns out that commit 79fbd3e1241c ended up changing the iteration
> > > from being over only the mapped entries to being over the original list
> > > size.
> > 
> > You mean this?
> > 
> > -       for_each_sg(umem->sg_head.sgl, sg, umem->sg_nents, i)
> > +       for_each_sgtable_sg(&umem->sgt_append.sgt, sg, i)
> > 
> > I don't see what changed there? The invariant should be that
> > 
> >   umem->sg_nents == sgt->orig_nents
> > 
> > > @@ -55,7 +55,7 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
> > >                 ib_dma_unmap_sgtable_attrs(dev, &umem->sgt_append.sgt,
> > >                                            DMA_BIDIRECTIONAL, 0);
> > >  
> > > -       for_each_sgtable_sg(&umem->sgt_append.sgt, sg, i)
> > > +       for_each_sgtable_dma_sg(&umem->sgt_append.sgt, sg, i)
> > >                 unpin_user_page_range_dirty_lock(sg_page(sg),
> > 
> > Calling sg_page() from under a dma_sg iterator is unconditionally wrong.
> > 
> > More likely your case is something has gone wrong when the sgtable was
> > created and it has the wrong value in orig_nents.
> 
> Can you define "wrong value" in this case? Chuck's RPC/RDMA code
> appears to call ib_alloc_mr() with an 'expected maximum number of
> entries' (depth) in net/sunrpc/xprtrdma/frwr_ops.c:frwr_mr_init().
> 
> It then fills that table with a set of n <= depth pages in
> net/sunrpc/xprtrdma/frwr_ops.c:frwr_map() and calls ib_dma_map_sg() to
> map them, and then adjusts the sgtable with a call to ib_map_mr_sg().

I'm confused, RPC/RDMA should never touch a umem at all.

Is this really the other bug where user and kernel MR are getting
confused?

Jason
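
To make the orig_nents vs. nents distinction discussed above concrete, here
is a minimal user-space sketch. It is not kernel code: struct model_sg,
struct model_sgt and the model_* functions are hypothetical stand-ins for
struct scatterlist, struct sg_table and the mapping/iteration helpers, and
the "DMA mapping" is assumed to be an identity mapping that merges
physically contiguous entries. It only illustrates why a CPU-side walk
(for_each_sgtable_sg, orig_nents entries) and a DMA-side walk
(for_each_sgtable_dma_sg, nents entries) can visit different numbers of
entries, which is why looking at pages under a DMA iterator is unsafe.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the kernel's struct scatterlist. */
struct model_sg {
	unsigned long phys;     /* CPU side: physical address of the page */
	unsigned int len;       /* CPU side: length of this entry */
	int has_page;           /* CPU side: non-zero if a page is attached */
	unsigned long dma_addr; /* DMA side: valid only for the first nents entries */
	unsigned int dma_len;   /* DMA side: valid only for the first nents entries */
};

/* Hypothetical, simplified stand-in for the kernel's struct sg_table. */
struct model_sgt {
	struct model_sg *sgl;
	unsigned int nents;      /* number of DMA-mapped entries */
	unsigned int orig_nents; /* number of CPU-side (page) entries */
};

/*
 * Model of a DMA mapping step (assumption: identity mapping that merges
 * physically contiguous entries). The DMA fields are written into the
 * first nents entries, so nents can end up smaller than orig_nents.
 */
void model_dma_map(struct model_sgt *sgt)
{
	unsigned int i, out = 0;

	for (i = 0; i < sgt->orig_nents; i++) {
		struct model_sg *cur = &sgt->sgl[i];

		if (out > 0) {
			struct model_sg *prev = &sgt->sgl[out - 1];

			/* Extend the previous DMA segment if cur follows it directly. */
			if (prev->dma_addr + prev->dma_len == cur->phys) {
				prev->dma_len += cur->len;
				continue;
			}
		}
		sgt->sgl[out].dma_addr = cur->phys;
		sgt->sgl[out].dma_len = cur->len;
		out++;
	}
	sgt->nents = out;
	assert(sgt->nents <= sgt->orig_nents);
}

/* CPU-side walk: visits all orig_nents entries, as a page release must. */
unsigned int model_count_pages_cpu(const struct model_sgt *sgt)
{
	unsigned int i, pages = 0;

	for (i = 0; i < sgt->orig_nents; i++)
		if (sgt->sgl[i].has_page)
			pages++;
	return pages;
}

/*
 * DMA-side walk: visits only nents entries. Looking at the page here is
 * the mistake being illustrated; pages behind merged entries are skipped.
 */
unsigned int model_count_pages_dma(const struct model_sgt *sgt)
{
	unsigned int i, pages = 0;

	for (i = 0; i < sgt->nents; i++)
		if (sgt->sgl[i].has_page)
			pages++;
	return pages;
}
```

In this model, a table of four page-sized entries where the first three are
physically contiguous maps to two DMA segments. The CPU-side walk still sees
all four pages, while the DMA-side walk sees only two, so a release loop
driven by the DMA iterator would leave pages pinned.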
