stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Dan Williams <dan.j.williams@intel.com>
Subject: [PATCH 4.10 24/24] device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation
Date: Tue, 25 Apr 2017 16:09:33 +0100	[thread overview]
Message-ID: <20170425150835.247132201@linuxfoundation.org> (raw)
In-Reply-To: <20170425150834.258486705@linuxfoundation.org>

4.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Williams <dan.j.williams@intel.com>

commit 956a4cd2c957acf638ff29951aabaa9d8e92bbc2 upstream.

The following warning triggers with a new unit test that stresses the
device-dax interface.

 ===============================
 [ ERR: suspicious RCU usage.  ]
 4.11.0-rc4+ #1049 Tainted: G           O
 -------------------------------
 ./include/linux/rcupdate.h:521 Illegal context switch in RCU read-side critical section!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 0
 2 locks held by fio/9070:
  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8d0739d7>] __do_page_fault+0x167/0x4f0
  #1:  (rcu_read_lock){......}, at: [<ffffffffc03fbd02>] dax_dev_huge_fault+0x32/0x620 [dax]

 Call Trace:
  dump_stack+0x86/0xc3
  lockdep_rcu_suspicious+0xd7/0x110
  ___might_sleep+0xac/0x250
  __might_sleep+0x4a/0x80
  __alloc_pages_nodemask+0x23a/0x360
  alloc_pages_current+0xa1/0x1f0
  pte_alloc_one+0x17/0x80
  __pte_alloc+0x1e/0x120
  __get_locked_pte+0x1bf/0x1d0
  insert_pfn.isra.70+0x3a/0x100
  ? lookup_memtype+0xa6/0xd0
  vm_insert_mixed+0x64/0x90
  dax_dev_huge_fault+0x520/0x620 [dax]
  ? dax_dev_huge_fault+0x32/0x620 [dax]
  dax_dev_fault+0x10/0x20 [dax]
  __do_fault+0x1e/0x140
  __handle_mm_fault+0x9af/0x10d0
  handle_mm_fault+0x16d/0x370
  ? handle_mm_fault+0x47/0x370
  __do_page_fault+0x28c/0x4f0
  trace_do_page_fault+0x58/0x2a0
  do_async_page_fault+0x1a/0xa0
  async_page_fault+0x28/0x30

Inserting a page table entry may trigger an allocation while we are
holding a read lock to keep the device instance alive for the duration
of the fault. Use srcu for this keep-alive protection.

Fixes: dee410792419 ("/dev/dax, core: file operations and dax-mmap")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/dax/Kconfig |    1 +
 drivers/dax/dax.c   |   13 +++++++------
 2 files changed, 8 insertions(+), 6 deletions(-)

--- a/drivers/dax/Kconfig
+++ b/drivers/dax/Kconfig
@@ -2,6 +2,7 @@ menuconfig DEV_DAX
 	tristate "DAX: direct access to differentiated memory"
 	default m if NVDIMM_DAX
 	depends on TRANSPARENT_HUGEPAGE
+	select SRCU
 	help
 	  Support raw access to differentiated (persistence, bandwidth,
 	  latency...) memory via an mmap(2) capable character
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -24,6 +24,7 @@
 #include "dax.h"
 
 static dev_t dax_devt;
+DEFINE_STATIC_SRCU(dax_srcu);
 static struct class *dax_class;
 static DEFINE_IDA(dax_minor_ida);
 static int nr_dax = CONFIG_NR_DEV_DAX;
@@ -59,7 +60,7 @@ struct dax_region {
  * @region - parent region
  * @dev - device backing the character device
  * @cdev - core chardev data
- * @alive - !alive + rcu grace period == no new mappings can be established
+ * @alive - !alive + srcu grace period == no new mappings can be established
  * @id - child id in the region
  * @num_resources - number of physical address extents in this device
  * @res - array of physical address ranges
@@ -530,7 +531,7 @@ static int __dax_dev_pmd_fault(struct da
 static int dax_dev_pmd_fault(struct vm_area_struct *vma, unsigned long addr,
 		pmd_t *pmd, unsigned int flags)
 {
-	int rc;
+	int rc, id;
 	struct file *filp = vma->vm_file;
 	struct dax_dev *dax_dev = filp->private_data;
 
@@ -538,9 +539,9 @@ static int dax_dev_pmd_fault(struct vm_a
 			current->comm, (flags & FAULT_FLAG_WRITE)
 			? "write" : "read", vma->vm_start, vma->vm_end);
 
-	rcu_read_lock();
+	id = srcu_read_lock(&dax_srcu);
 	rc = __dax_dev_pmd_fault(dax_dev, vma, addr, pmd, flags);
-	rcu_read_unlock();
+	srcu_read_unlock(&dax_srcu, id);
 
 	return rc;
 }
@@ -656,11 +657,11 @@ static void unregister_dax_dev(void *dev
 	 * Note, rcu is not protecting the liveness of dax_dev, rcu is
 	 * ensuring that any fault handlers that might have seen
 	 * dax_dev->alive == true, have completed.  Any fault handlers
-	 * that start after synchronize_rcu() has started will abort
+	 * that start after synchronize_srcu() has started will abort
 	 * upon seeing dax_dev->alive == false.
 	 */
 	dax_dev->alive = false;
-	synchronize_rcu();
+	synchronize_srcu(&dax_srcu);
 	unmap_mapping_range(dax_dev->inode->i_mapping, 0, 0, 1);
 	cdev_del(cdev);
 	device_unregister(dev);

  parent reply	other threads:[~2017-04-25 15:12 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-25 15:09 [PATCH 4.10 00/24] 4.10.13-stable review Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 01/24] KEYS: Disallow keyrings beginning with . to be joined as session keyrings Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 02/24] KEYS: Change the name of the dead type to ".dead" to prevent user access Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 03/24] KEYS: fix keyctl_set_reqkey_keyring() to not leak thread keyrings Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 04/24] tracing: Allocate the snapshot buffer before enabling probe Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 05/24] HID: wacom: Treat HID_DG_TOOLSERIALNUMBER as unsigned Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 06/24] ring-buffer: Have ring_buffer_iter_empty() return true when empty Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 07/24] mm: prevent NR_ISOLATE_* stats from going negative Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 08/24] cifs: Do not send echoes before Negotiate is complete Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 09/24] CIFS: remove bad_network_name flag Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 10/24] mmc: dw_mmc: silent verbose log when calling from PM context Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 11/24] s390/mm: fix CMMA vs KSM vs others Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 12/24] Input: elantech - add Fujitsu Lifebook E547 to force crc_enabled Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 13/24] ACPI / power: Avoid maybe-uninitialized warning Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 14/24] mmc: dw_mmc: Dont allow Runtime PM for SDIO cards Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 15/24] mmc: sdhci-esdhc-imx: increase the pad I/O drive strength for DDR50 card Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 16/24] ubifs: Fix RENAME_WHITEOUT support Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 17/24] ubifs: Fix O_TMPFILE corner case in ubifs_link() Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 18/24] mac80211: reject ToDS broadcast data frames Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 19/24] mac80211: fix MU-MIMO follow-MAC mode Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 20/24] x86/mce: Make the MCE notifier a blocking one Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 21/24] ubi/upd: Always flush after prepared for an update Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 22/24] powerpc/kprobe: Fix oops when kprobed on stdu instruction Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 23/24] x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs Greg Kroah-Hartman
2017-04-25 15:09 ` Greg Kroah-Hartman [this message]
2017-04-25 18:24 ` [PATCH 4.10 00/24] 4.10.13-stable review Shuah Khan
2017-04-26  2:23 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170425150835.247132201@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).