From: Daniel Jordan <daniel.m.jordan@oracle.com>
To: akpm@linux-foundation.org
Cc: dan.carpenter@oracle.com, andrea.parri@amarulasolutions.com,
shli@kernel.org, ying.huang@intel.com,
dave.hansen@linux.intel.com, sfr@canb.auug.org.au,
osandov@fb.com, tj@kernel.org, ak@linux.intel.com,
linux-mm@kvack.org, kernel-janitors@vger.kernel.org,
paulmck@linux.ibm.com, stern@rowland.harvard.edu,
peterz@infradead.org, will.deacon@arm.com,
daniel.m.jordan@oracle.com
Subject: [PATCH] mm, swap: bounds check swap_info accesses to avoid NULL derefs
Date: Tue, 15 Jan 2019 00:23:05 +0000
Message-ID: <20190115002305.15402-1-daniel.m.jordan@oracle.com>
In-Reply-To: <20190114222529.43zay6r242ipw5jb@ca-dmjordan1.us.oracle.com>

Dan Carpenter reports a potential NULL dereference in
get_swap_page_of_type:

  Smatch complains that the NULL checks on "si" aren't consistent.  This
  seems like a real bug because we have not ensured that the type is
  valid and so "si" can be NULL.

Add the missing check for NULL, taking care to use a read barrier to
ensure CPU1 observes CPU0's updates in the correct order:

        CPU0                           CPU1
        alloc_swap_info()              if (type >= nr_swapfiles)
          swap_info[type] = p              /* handle invalid entry */
          smp_wmb()                    smp_rmb()
          ++nr_swapfiles               p = swap_info[type]

Without smp_rmb, CPU1 might observe CPU0's write to nr_swapfiles before
CPU0's write to swap_info[type] and read NULL from swap_info[type].
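
For illustration only, the pairing described above can be sketched in
userspace C11.  This is not the kernel code: every name here (slots,
nr_slots, slot_info, publish_slot, lookup_slot, MAX_SLOTS) is hypothetical,
the C11 release and acquire fences stand in for smp_wmb() and smp_rmb()
(they are somewhat stronger than the kernel primitives), and a single
writer is assumed, as if updates were serialized by a lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_SLOTS 8			/* hypothetical capacity */

struct slot_info { int type; };		/* stand-in for swap_info_struct */

static _Atomic(struct slot_info *) slots[MAX_SLOTS];	/* plays swap_info[] */
static atomic_uint nr_slots;				/* plays nr_swapfiles */

/* Writer (CPU0): store the pointer first, then publish the new count. */
static void publish_slot(struct slot_info *p)
{
	unsigned int type = atomic_load_explicit(&nr_slots, memory_order_relaxed);

	assert(type < MAX_SLOTS);
	p->type = (int)type;
	atomic_store_explicit(&slots[type], p, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* analogue of smp_wmb() */
	atomic_store_explicit(&nr_slots, type + 1, memory_order_relaxed);
}

/* Reader (CPU1): check the bound first, then load the pointer. */
static struct slot_info *lookup_slot(unsigned int type)
{
	if (type >= atomic_load_explicit(&nr_slots, memory_order_relaxed))
		return NULL;				/* handle invalid entry */

	atomic_thread_fence(memory_order_acquire);	/* analogue of smp_rmb() */
	return atomic_load_explicit(&slots[type], memory_order_relaxed);
}
```

In this sketch, a reader that observes the incremented count is guaranteed
to also observe the pointer stored before the writer's fence, so
lookup_slot() does not return NULL for an index below the count it read.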

Ying Huang noticed that other places don't order these reads properly.
Introduce swap_type_to_swap_info to encourage correct usage.

Use READ_ONCE and WRITE_ONCE to follow the Linux Kernel Memory Model
(see tools/memory-model/Documentation/explanation.txt).
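
As a rough illustration of what the marked accesses buy, here is a
simplified sketch.  The two macros are stand-ins written for this example
and are NOT the kernel's READ_ONCE/WRITE_ONCE definitions; nr_entries,
add_entry and count_entries are hypothetical names.  The volatile cast asks
the compiler to perform each access once, without merging or repeating it.
It provides no ordering between different variables; that is still the job
of the barriers.

```c
/*
 * Simplified stand-ins, not the kernel's definitions.
 * __typeof__ is a GCC/Clang extension.
 */
#define READ_ONCE_SKETCH(x)	(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

static unsigned int nr_entries;	/* plays the role of nr_swapfiles */

/* Writer side; assumed to be serialized by a lock, as in the patch. */
static void add_entry(void)
{
	WRITE_ONCE_SKETCH(nr_entries, nr_entries + 1);
}

/* Lockless reader: one marked load of the counter. */
static unsigned int count_entries(void)
{
	return READ_ONCE_SKETCH(nr_entries);
}
```

The writer reads nr_entries with a plain load inside the macro argument,
mirroring WRITE_ONCE(nr_swapfiles, nr_swapfiles + 1) in the patch; that is
only reasonable because writers are assumed not to race with each other.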

This ordering need not be enforced in places where swap_lock is held
(e.g. si_swapinfo) because swap_lock serializes updates to nr_swapfiles
and the swap_info array.
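
A userspace sketch of the locked case, assuming every reader and writer of
the table takes the same lock.  A pthread mutex stands in for swap_lock,
and all names (table_lock, slots, nr_slots, add_slot_locked,
total_inuse_locked, MAX_SLOTS) are hypothetical.  Unlike the kernel, this
sketch has no lockless readers, so the writer can use plain stores too.

```c
#include <pthread.h>

#define MAX_SLOTS 8			/* hypothetical capacity */

struct slot_info { unsigned long inuse_pages; };	/* stand-in type */

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;	/* plays swap_lock */
static struct slot_info *slots[MAX_SLOTS];			/* plays swap_info[] */
static unsigned int nr_slots;					/* plays nr_swapfiles */

/* Writer: both updates happen inside the critical section. */
static int add_slot_locked(struct slot_info *p)
{
	int ret = -1;

	pthread_mutex_lock(&table_lock);
	if (nr_slots < MAX_SLOTS) {
		slots[nr_slots] = p;
		ret = (int)nr_slots++;
	}
	pthread_mutex_unlock(&table_lock);
	return ret;
}

/* Reader: plain loads suffice because the same lock is held. */
static unsigned long total_inuse_locked(void)
{
	unsigned long total = 0;

	pthread_mutex_lock(&table_lock);
	for (unsigned int i = 0; i < nr_slots; i++)
		total += slots[i]->inuse_pages;
	pthread_mutex_unlock(&table_lock);
	return total;
}
```

The lock's acquire and release semantics make the count and the array
entries appear consistent to the reader, so no explicit barrier is needed.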

This is a theoretical problem; no actual reports of it exist.

Fixes: ec8acf20afb8 ("swap: add per-partition lock for swapfile")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
---
I'd appreciate it if someone more familiar with memory barriers could
check this over. Thanks.
Probably no need for stable; this is all theoretical.
Against linux-mmotm tag v5.0-rc1-mmotm-2019-01-09-13-40

 mm/swapfile.c | 43 +++++++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index f0edf7244256..dad52fc67045 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -99,6 +99,15 @@ static atomic_t proc_poll_event = ATOMIC_INIT(0);
 
 atomic_t nr_rotate_swap = ATOMIC_INIT(0);
 
+static struct swap_info_struct *swap_type_to_swap_info(int type)
+{
+	if (type >= READ_ONCE(nr_swapfiles))
+		return NULL;
+
+	smp_rmb();	/* Pairs with smp_wmb in alloc_swap_info. */
+	return READ_ONCE(swap_info[type]);
+}
+
 static inline unsigned char swap_count(unsigned char ent)
 {
 	return ent & ~SWAP_HAS_CACHE;	/* may include COUNT_CONTINUED flag */
@@ -1045,12 +1054,14 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_size)
 /* The only caller of this function is now suspend routine */
 swp_entry_t get_swap_page_of_type(int type)
 {
-	struct swap_info_struct *si;
+	struct swap_info_struct *si = swap_type_to_swap_info(type);
 	pgoff_t offset;
 
-	si = swap_info[type];
+	if (!si)
+		goto fail;
+
 	spin_lock(&si->lock);
-	if (si && (si->flags & SWP_WRITEOK)) {
+	if (si->flags & SWP_WRITEOK) {
 		atomic_long_dec(&nr_swap_pages);
 		/* This is called for allocating swap entry, not cache */
 		offset = scan_swap_map(si, 1);
@@ -1061,6 +1072,7 @@ swp_entry_t get_swap_page_of_type(int type)
 		atomic_long_inc(&nr_swap_pages);
 	}
 	spin_unlock(&si->lock);
+fail:
 	return (swp_entry_t) {0};
 }
 
@@ -1072,9 +1084,9 @@ static struct swap_info_struct *__swap_info_get(swp_entry_t entry)
 	if (!entry.val)
 		goto out;
 	type = swp_type(entry);
-	if (type >= nr_swapfiles)
+	p = swap_type_to_swap_info(type);
+	if (!p)
 		goto bad_nofile;
-	p = swap_info[type];
 	if (!(p->flags & SWP_USED))
 		goto bad_device;
 	offset = swp_offset(entry);
@@ -1212,9 +1224,9 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
 	if (!entry.val)
 		goto out;
 	type = swp_type(entry);
-	if (type >= nr_swapfiles)
+	si = swap_type_to_swap_info(type);
+	if (!si)
 		goto bad_nofile;
-	si = swap_info[type];
 
 	preempt_disable();
 	if (!(si->flags & SWP_VALID))
@@ -1765,10 +1777,9 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p)
 sector_t swapdev_block(int type, pgoff_t offset)
 {
 	struct block_device *bdev;
+	struct swap_info_struct *si = swap_type_to_swap_info(type);
 
-	if ((unsigned int)type >= nr_swapfiles)
-		return 0;
-	if (!(swap_info[type]->flags & SWP_WRITEOK))
+	if (!si || !(si->flags & SWP_WRITEOK))
 		return 0;
 	return map_swap_entry(swp_entry(type, offset), &bdev);
 }
@@ -2799,9 +2810,9 @@ static void *swap_start(struct seq_file *swap, loff_t *pos)
 	if (!l)
 		return SEQ_START_TOKEN;
 
-	for (type = 0; type < nr_swapfiles; type++) {
+	for (type = 0; type < READ_ONCE(nr_swapfiles); type++) {
 		smp_rmb();	/* read nr_swapfiles before swap_info[type] */
-		si = swap_info[type];
+		si = READ_ONCE(swap_info[type]);
 		if (!(si->flags & SWP_USED) || !si->swap_map)
 			continue;
 		if (!--l)
@@ -2821,9 +2832,9 @@ static void *swap_next(struct seq_file *swap, void *v, loff_t *pos)
 	else
 		type = si->type + 1;
 
-	for (; type < nr_swapfiles; type++) {
+	for (; type < READ_ONCE(nr_swapfiles); type++) {
 		smp_rmb();	/* read nr_swapfiles before swap_info[type] */
-		si = swap_info[type];
+		si = READ_ONCE(swap_info[type]);
 		if (!(si->flags & SWP_USED) || !si->swap_map)
 			continue;
 		++*pos;
@@ -2930,14 +2941,14 @@ static struct swap_info_struct *alloc_swap_info(void)
 	}
 	if (type >= nr_swapfiles) {
 		p->type = type;
-		swap_info[type] = p;
+		WRITE_ONCE(swap_info[type], p);
 		/*
 		 * Write swap_info[type] before nr_swapfiles, in case a
 		 * racing procfs swap_start() or swap_next() is reading them.
 		 * (We never shrink nr_swapfiles, we never free this entry.)
 		 */
 		smp_wmb();
-		nr_swapfiles++;
+		WRITE_ONCE(nr_swapfiles, nr_swapfiles + 1);
 	} else {
 		kvfree(p);
 		p = swap_info[type];
--
2.20.0