From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Dan Williams <dan.j.williams@intel.com>
Subject: [PATCH 4.10 24/24] device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation
Date: Tue, 25 Apr 2017 16:09:33 +0100 [thread overview]
Message-ID: <20170425150835.247132201@linuxfoundation.org> (raw)
In-Reply-To: <20170425150834.258486705@linuxfoundation.org>
4.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams <dan.j.williams@intel.com>
commit 956a4cd2c957acf638ff29951aabaa9d8e92bbc2 upstream.
The following warning triggers with a new unit test that stresses the
device-dax interface.
===============================
[ ERR: suspicious RCU usage. ]
4.11.0-rc4+ #1049 Tainted: G O
-------------------------------
./include/linux/rcupdate.h:521 Illegal context switch in RCU read-side critical section!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 0
2 locks held by fio/9070:
#0: (&mm->mmap_sem){++++++}, at: [<ffffffff8d0739d7>] __do_page_fault+0x167/0x4f0
#1: (rcu_read_lock){......}, at: [<ffffffffc03fbd02>] dax_dev_huge_fault+0x32/0x620 [dax]
Call Trace:
dump_stack+0x86/0xc3
lockdep_rcu_suspicious+0xd7/0x110
___might_sleep+0xac/0x250
__might_sleep+0x4a/0x80
__alloc_pages_nodemask+0x23a/0x360
alloc_pages_current+0xa1/0x1f0
pte_alloc_one+0x17/0x80
__pte_alloc+0x1e/0x120
__get_locked_pte+0x1bf/0x1d0
insert_pfn.isra.70+0x3a/0x100
? lookup_memtype+0xa6/0xd0
vm_insert_mixed+0x64/0x90
dax_dev_huge_fault+0x520/0x620 [dax]
? dax_dev_huge_fault+0x32/0x620 [dax]
dax_dev_fault+0x10/0x20 [dax]
__do_fault+0x1e/0x140
__handle_mm_fault+0x9af/0x10d0
handle_mm_fault+0x16d/0x370
? handle_mm_fault+0x47/0x370
__do_page_fault+0x28c/0x4f0
trace_do_page_fault+0x58/0x2a0
do_async_page_fault+0x1a/0xa0
async_page_fault+0x28/0x30
Inserting a page table entry may trigger an allocation while we are
holding a read lock to keep the device instance alive for the duration
of the fault. Use srcu for this keep-alive protection.
Fixes: dee410792419 ("/dev/dax, core: file operations and dax-mmap")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/dax/Kconfig | 1 +
drivers/dax/dax.c | 13 +++++++------
2 files changed, 8 insertions(+), 6 deletions(-)
--- a/drivers/dax/Kconfig
+++ b/drivers/dax/Kconfig
@@ -2,6 +2,7 @@ menuconfig DEV_DAX
tristate "DAX: direct access to differentiated memory"
default m if NVDIMM_DAX
depends on TRANSPARENT_HUGEPAGE
+ select SRCU
help
Support raw access to differentiated (persistence, bandwidth,
latency...) memory via an mmap(2) capable character
--- a/drivers/dax/dax.c
+++ b/drivers/dax/dax.c
@@ -24,6 +24,7 @@
#include "dax.h"
static dev_t dax_devt;
+DEFINE_STATIC_SRCU(dax_srcu);
static struct class *dax_class;
static DEFINE_IDA(dax_minor_ida);
static int nr_dax = CONFIG_NR_DEV_DAX;
@@ -59,7 +60,7 @@ struct dax_region {
* @region - parent region
* @dev - device backing the character device
* @cdev - core chardev data
- * @alive - !alive + rcu grace period == no new mappings can be established
+ * @alive - !alive + srcu grace period == no new mappings can be established
* @id - child id in the region
* @num_resources - number of physical address extents in this device
* @res - array of physical address ranges
@@ -530,7 +531,7 @@ static int __dax_dev_pmd_fault(struct da
static int dax_dev_pmd_fault(struct vm_area_struct *vma, unsigned long addr,
pmd_t *pmd, unsigned int flags)
{
- int rc;
+ int rc, id;
struct file *filp = vma->vm_file;
struct dax_dev *dax_dev = filp->private_data;
@@ -538,9 +539,9 @@ static int dax_dev_pmd_fault(struct vm_a
current->comm, (flags & FAULT_FLAG_WRITE)
? "write" : "read", vma->vm_start, vma->vm_end);
- rcu_read_lock();
+ id = srcu_read_lock(&dax_srcu);
rc = __dax_dev_pmd_fault(dax_dev, vma, addr, pmd, flags);
- rcu_read_unlock();
+ srcu_read_unlock(&dax_srcu, id);
return rc;
}
@@ -656,11 +657,11 @@ static void unregister_dax_dev(void *dev
* Note, rcu is not protecting the liveness of dax_dev, rcu is
* ensuring that any fault handlers that might have seen
* dax_dev->alive == true, have completed. Any fault handlers
- * that start after synchronize_rcu() has started will abort
+ * that start after synchronize_srcu() has started will abort
* upon seeing dax_dev->alive == false.
*/
dax_dev->alive = false;
- synchronize_rcu();
+ synchronize_srcu(&dax_srcu);
unmap_mapping_range(dax_dev->inode->i_mapping, 0, 0, 1);
cdev_del(cdev);
device_unregister(dev);
next prev parent reply other threads:[~2017-04-25 15:12 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-25 15:09 [PATCH 4.10 00/24] 4.10.13-stable review Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 01/24] KEYS: Disallow keyrings beginning with . to be joined as session keyrings Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 02/24] KEYS: Change the name of the dead type to ".dead" to prevent user access Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 03/24] KEYS: fix keyctl_set_reqkey_keyring() to not leak thread keyrings Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 04/24] tracing: Allocate the snapshot buffer before enabling probe Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 05/24] HID: wacom: Treat HID_DG_TOOLSERIALNUMBER as unsigned Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 06/24] ring-buffer: Have ring_buffer_iter_empty() return true when empty Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 07/24] mm: prevent NR_ISOLATE_* stats from going negative Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 08/24] cifs: Do not send echoes before Negotiate is complete Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 09/24] CIFS: remove bad_network_name flag Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 10/24] mmc: dw_mmc: silent verbose log when calling from PM context Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 11/24] s390/mm: fix CMMA vs KSM vs others Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 12/24] Input: elantech - add Fujitsu Lifebook E547 to force crc_enabled Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 13/24] ACPI / power: Avoid maybe-uninitialized warning Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 14/24] mmc: dw_mmc: Dont allow Runtime PM for SDIO cards Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 15/24] mmc: sdhci-esdhc-imx: increase the pad I/O drive strength for DDR50 card Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 16/24] ubifs: Fix RENAME_WHITEOUT support Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 17/24] ubifs: Fix O_TMPFILE corner case in ubifs_link() Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 18/24] mac80211: reject ToDS broadcast data frames Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 19/24] mac80211: fix MU-MIMO follow-MAC mode Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 20/24] x86/mce: Make the MCE notifier a blocking one Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 21/24] ubi/upd: Always flush after prepared for an update Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 22/24] powerpc/kprobe: Fix oops when kprobed on stdu instruction Greg Kroah-Hartman
2017-04-25 15:09 ` [PATCH 4.10 23/24] x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs Greg Kroah-Hartman
2017-04-25 15:09 ` Greg Kroah-Hartman [this message]
2017-04-25 18:24 ` [PATCH 4.10 00/24] 4.10.13-stable review Shuah Khan
2017-04-26 2:23 ` Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170425150835.247132201@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=dan.j.williams@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).