From: Jason Gunthorpe <jgg@ziepe.ca>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>,
	akpm@linux-foundation.org, dave@stgolabs.net, jack@suse.cz,
	cl@linux.com, linux-mm@kvack.org, kvm@vger.kernel.org,
	kvm-ppc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-fpga@vger.kernel.org, linux-kernel@vger.kernel.org,
	alex.williamson@redhat.com, paulus@ozlabs.org,
	benh@kernel.crashing.org, mpe@ellerman.id.au, hao.wu@intel.com,
	atull@kernel.org, mdf@kernel.org, aik@ozlabs.ru
Subject: Re: [PATCH 0/5] use pinned_vm instead of locked_vm to account pinned pages
Date: Thu, 14 Feb 2019 06:00:06 +0000
Message-ID: <20190214060006.GE24692@ziepe.ca>
In-Reply-To: <20190214015314.GB1151@iweiny-DESK2.sc.intel.com>

On Wed, Feb 13, 2019 at 05:53:14PM -0800, Ira Weiny wrote:
> On Mon, Feb 11, 2019 at 03:54:47PM -0700, Jason Gunthorpe wrote:
> > On Mon, Feb 11, 2019 at 05:44:32PM -0500, Daniel Jordan wrote:
> > 
> > > All five of these places, and probably some of Davidlohr's conversions,
> > > probably want to be collapsed into a common helper in the core mm for
> > > accounting pinned pages.  I tried, and there are several details that
> > > likely need discussion, so this can be done as a follow-on.
> > 
> > I've wondered the same..
> 
> I'm really thinking this would be a nice way to ensure it gets cleaned up and
> does not happen again.
> 
> Also, by moving it to the core we could better manage any user visible changes.
> 
> From a high level, pinned is a subset of locked so it seems like we need 2
> sets of helpers.
> 
> try_increment_locked_vm(...)
> decrement_locked_vm(...)
> 
> try_increment_pinned_vm(...)
> decrement_pinned_vm(...)
> 
> Where try_increment_pinned_vm() also increments locked_vm...  Of course this
> may end up reverting the improvement of Davidlohr Bueso's atomic work...  :-(
> 
> Furthermore it would seem better (although I don't know if at all possible) if
> this were accounted for in core calls which tracked them based on how the pages
> are being used so that drivers can't call try_increment_locked_vm() and then
> pin the pages...  Thus getting the account wrong vs what actually happened.
> 
> And then in the end we can go back to locked_vm being the value checked against
> RLIMIT_MEMLOCK.

Someone would need to understand the bug that was fixed by splitting
them. 

I think it had to do with double accounting pinned and mlocked pages
and thus delivering a lower than expected limit to userspace.

vfio has this bug, RDMA does not. RDMA has a bug where it can
overallocate locked memory, vfio doesn't.

Really unclear how to fix this. The pinned/locked split with two
buckets may be the right way.

Jason
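
The two sets of helpers proposed in the quoted text could be sketched
roughly as below. This is a single-threaded userspace model, not kernel
code: only the four helper names come from the thread. struct vm_account,
its limit field (a stand-in for RLIMIT_MEMLOCK, counted in pages), and all
function bodies are assumptions made for illustration. It follows the
"pinned is a subset of locked" idea, so pinning also charges locked_vm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-process counters; not struct mm_struct. */
struct vm_account {
	uint64_t locked_vm;	/* all pages charged to the limit (mlocked + pinned) */
	uint64_t pinned_vm;	/* pages pinned by drivers, a subset of locked_vm */
	uint64_t limit;		/* assumed stand-in for RLIMIT_MEMLOCK, in pages */
};

/* Charge npages to locked_vm; leave the counters untouched on failure. */
static bool try_increment_locked_vm(struct vm_account *a, uint64_t npages)
{
	/* Written to avoid overflow in locked_vm + npages. */
	if (a->locked_vm > a->limit || npages > a->limit - a->locked_vm)
		return false;
	a->locked_vm += npages;
	return true;
}

/* Uncharge pages that were locked but not pinned. */
static void decrement_locked_vm(struct vm_account *a, uint64_t npages)
{
	/* Only the non-pinned part of locked_vm may be released here. */
	assert(npages <= a->locked_vm - a->pinned_vm);
	a->locked_vm -= npages;
}

/* Charge npages as pinned; this also charges locked_vm, as proposed. */
static bool try_increment_pinned_vm(struct vm_account *a, uint64_t npages)
{
	if (!try_increment_locked_vm(a, npages))
		return false;
	a->pinned_vm += npages;
	return true;
}

/* Uncharge pinned pages from both counters. */
static void decrement_pinned_vm(struct vm_account *a, uint64_t npages)
{
	assert(npages <= a->pinned_vm);
	a->pinned_vm -= npages;
	a->locked_vm -= npages;
}
```

Because every pin goes through locked_vm, a single check against the limit
covers both kinds of pages. The sketch ignores locking entirely, which is
the concern raised above about undoing the atomic counter work, and it does
not stop a driver from calling the wrong helper.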
