All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Dave Hansen <dave.hansen@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	John Hubbard <jhubbard@nvidia.com>,
	Andy Lutomirski <luto@kernel.org>,
	x86@kernel.org, Jann Horn <jannh@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 5.8 04/17] mm: fix pin vs. gup mismatch with gate pages
Date: Fri,  4 Sep 2020 15:30:03 +0200	[thread overview]
Message-ID: <20200904120258.196815367@linuxfoundation.org> (raw)
In-Reply-To: <20200904120257.983551609@linuxfoundation.org>

From: Dave Hansen <dave.hansen@linux.intel.com>

commit 9fa2dd946743ae6f30dc4830da19147bf100a7f2 upstream.

Gate pages were missed when converting from get to pin_user_pages().
This can lead to refcount imbalances.  This is reliably and quickly
reproducible running the x86 selftests when vsyscall=emulate is enabled
(the default).  Fix by using try_grab_page() with appropriate flags
passed.

The long story:

Today, pin_user_pages() and get_user_pages() are similar interfaces for
manipulating page reference counts.  However, "pins" use a "bias" value
and manipulate the actual reference count by 1024 instead of 1 used by
plain "gets".

That means that pin_user_pages() must be matched with unpin_user_pages()
and can't be mixed with a plain put_user_pages() or put_page().

Enter gate pages, like the vsyscall page.  They are pages usually in the
kernel image, but which are mapped to userspace.  Userspace is allowed
access to them, including interfaces using get/pin_user_pages().  The
refcount of these kernel pages is manipulated just like a normal user
page on the get/pin side so that the put/unpin side can work the same
for normal user pages or gate pages.

get_gate_page() uses try_get_page() which only bumps the refcount by
1, not 1024, even if called in the pin_user_pages() path.  If someone
pins a gate page, this happens:

	pin_user_pages()
		get_gate_page()
			try_get_page() // bump refcount +1
	... some time later
	unpin_user_pages()
		page_ref_sub_and_test(page, 1024))

... and boom, we get a refcount off by 1023.  This is reliably and
quickly reproducible running the x86 selftests when booted with
vsyscall=emulate (the default).  The selftests use ptrace(), but I
suspect anything using pin_user_pages() on gate pages could hit this.

To fix it, simply use try_grab_page() instead of try_get_page(), and
pass 'gup_flags' in so that FOLL_PIN can be respected.

This bug traces back to the very beginning of the FOLL_PIN support in
commit 3faa52c03f44 ("mm/gup: track FOLL_PIN pages"), which showed up in
the 5.7 release.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
Reported-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: x86@kernel.org
Cc: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/gup.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/gup.c
+++ b/mm/gup.c
@@ -843,7 +843,7 @@ static int get_gate_page(struct mm_struc
 			goto unmap;
 		*page = pte_page(*pte);
 	}
-	if (unlikely(!try_get_page(*page))) {
+	if (unlikely(!try_grab_page(*page, gup_flags))) {
 		ret = -ENOMEM;
 		goto unmap;
 	}



  parent reply	other threads:[~2020-09-04 14:22 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-04 13:29 [PATCH 5.8 00/17] 5.8.7-rc1 review Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 01/17] HID: core: Correctly handle ReportSize being zero Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 02/17] HID: core: Sanitize event code and type when mapping input Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 03/17] netfilter: nft_set_rbtree: Handle outcomes of tree rotations in overlap detection Greg Kroah-Hartman
2020-09-04 13:30 ` Greg Kroah-Hartman [this message]
2020-09-04 13:30 ` [PATCH 5.8 05/17] selftests/x86/test_vsyscall: Improve the process_vm_readv() test Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 06/17] perf record/stat: Explicitly call out event modifiers in the documentation Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 07/17] media: media/v4l2-core: Fix kernel-infoleak in video_put_user() Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 08/17] KVM: arm64: Add kvm_extable for vaxorcism code Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 09/17] KVM: arm64: Survive synchronous exceptions caused by AT instructions Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 10/17] dt-bindings: mmc: tegra: Add tmclk for Tegra210 and later Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 11/17] arm64: tegra: Add missing timeout clock to Tegra194 SDMMC nodes Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 12/17] arm64: tegra: Add missing timeout clock to Tegra186 " Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 13/17] arm64: tegra: Add missing timeout clock to Tegra210 SDMMC Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 14/17] sdhci: tegra: Remove SDHCI_QUIRK_DATA_TIMEOUT_USES_SDCLK for Tegra210 Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 15/17] sdhci: tegra: Remove SDHCI_QUIRK_DATA_TIMEOUT_USES_SDCLK for Tegra186 Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 16/17] nl80211: fix NL80211_ATTR_HE_6GHZ_CAPABILITY usage Greg Kroah-Hartman
2020-09-04 13:30 ` [PATCH 5.8 17/17] scsi: target: tcmu: Optimize use of flush_dcache_page Greg Kroah-Hartman
2020-09-04 19:23 ` [PATCH 5.8 00/17] 5.8.7-rc1 review Guenter Roeck
2020-09-05  9:28   ` Greg Kroah-Hartman
2020-09-04 20:11 ` Shuah Khan
2020-09-05  9:28   ` Greg Kroah-Hartman
2020-09-05 15:42 ` Dan Rue

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200904120258.196815367@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=jannh@google.com \
    --cc=jhubbard@nvidia.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.