From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: "zhangyi (F)" <yi.zhang@huawei.com>,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
mawilcox@microsoft.com, ross.zwisler@linux.intel.com,
viro@zeniv.linux.org.uk, miaoxie@huawei.com
Subject: Re: [PATCH] dax: fix potential overflow on 32bit machine
Date: Tue, 5 Dec 2017 10:07:09 -0700 [thread overview]
Message-ID: <20171205170709.GA21010@linux.intel.com> (raw)
In-Reply-To: <20171205052407.GA20757@bombadil.infradead.org>
On Mon, Dec 04, 2017 at 09:24:07PM -0800, Matthew Wilcox wrote:
> On Tue, Dec 05, 2017 at 11:32:10AM +0800, zhangyi (F) wrote:
> > On 32bit machine, when mmap2 a large enough file with pgoff more than
> > ULONG_MAX >> PAGE_SHIFT, it will trigger offset overflow and lead to
> > unmap the wrong page in dax_insert_mapping_entry(). This patch cast
> > pgoff to 64bit to prevent the overflow.
>
> You're quite correct, and you've solved this problem the same way as the
> other half-dozen users in the kernel with the problem, so good job.
>
> I think we can do better though. How does this look?
>
> From 9f8f30197eba970c82eea909624299c86b2c5f7e Mon Sep 17 00:00:00 2001
> From: Matthew Wilcox <mawilcox@microsoft.com>
> Date: Tue, 5 Dec 2017 00:15:54 -0500
> Subject: [PATCH] mm: Add unmap_mapping_pages
>
> Several users of unmap_mapping_range() would much prefer to express
> their range in pages rather than bytes. This leads to four places
> in the current tree where there are bugs on 32-bit kernels because of
> missing casts.
>
> Conveniently, unmap_mapping_range() actually converts from bytes into
> pages, so hoist the guts of unmap_mapping_range() into a new function
> unmap_mapping_pages() and convert the callers.
Yep, this interface is much nicer than all the casting and shifting we
currently have for unmap_mapping_range().
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> Reported-by: "zhangyi (F)" <yi.zhang@huawei.com>
> ---
> fs/dax.c | 19 ++++++-------------
> include/linux/mm.h | 2 ++
> mm/khugepaged.c | 3 +--
> mm/memory.c | 41 +++++++++++++++++++++++++++++------------
> mm/truncate.c | 23 +++++++----------------
> 5 files changed, 45 insertions(+), 43 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 95981591977a..6dd481f8216c 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -44,6 +44,7 @@
>
> /* The 'colour' (ie low bits) within a PMD of a page offset. */
> #define PG_PMD_COLOUR ((PMD_SIZE >> PAGE_SHIFT) - 1)
> +#define PG_PMD_NR (PMD_SIZE >> PAGE_SHIFT)
I wonder if it's confusing that PG_PMD_COLOUR is a mask, but PG_PMD_NR is a
count? Would "PAGES_PER_PMD" be clearer, in the spirit of
PTRS_PER_{PGD,PMD,PTE}?
Also, can we use the same define both in fs/dax.c and in mm/truncate.c,
instead of the latter using HPAGE_PMD_NR?
> static wait_queue_head_t wait_table[DAX_WAIT_TABLE_ENTRIES];
>
> @@ -375,8 +376,8 @@ static void *grab_mapping_entry(struct address_space *mapping, pgoff_t index,
> * unmapped.
> */
> if (pmd_downgrade && dax_is_zero_entry(entry))
> - unmap_mapping_range(mapping,
> - (index << PAGE_SHIFT) & PMD_MASK, PMD_SIZE, 0);
> + unmap_mapping_pages(mapping, index & ~PG_PMD_COLOUR,
> + PG_PMD_NR, 0);
>
> err = radix_tree_preload(
> mapping_gfp_mask(mapping) & ~__GFP_HIGHMEM);
> @@ -538,12 +539,10 @@ static void *dax_insert_mapping_entry(struct address_space *mapping,
> if (dax_is_zero_entry(entry) && !(flags & RADIX_DAX_ZERO_PAGE)) {
> /* we are replacing a zero page with block mapping */
> if (dax_is_pmd_entry(entry))
> - unmap_mapping_range(mapping,
> - (vmf->pgoff << PAGE_SHIFT) & PMD_MASK,
> - PMD_SIZE, 0);
> + unmap_mapping_pages(mapping, vmf->pgoff & PG_PMD_COLOUR,
I think you need: ~PG_PMD_COLOUR,
> + PG_PMD_NR, 0);
> else /* pte entry */
> - unmap_mapping_range(mapping, vmf->pgoff << PAGE_SHIFT,
> - PAGE_SIZE, 0);
> + unmap_mapping_pages(mapping, vmf->pgoff, 1, 0);
> }
>
> spin_lock_irq(&mapping->tree_lock);
> @@ -1269,12 +1268,6 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
> }
>
> #ifdef CONFIG_FS_DAX_PMD
> -/*
> - * The 'colour' (ie low bits) within a PMD of a page offset. This comes up
> - * more often than one might expect in the below functions.
> - */
> -#define PG_PMD_COLOUR ((PMD_SIZE >> PAGE_SHIFT) - 1)
Yay!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-12-05 17:07 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-05 3:32 [PATCH] dax: fix potential overflow on 32bit machine zhangyi (F)
2017-12-05 5:24 ` Matthew Wilcox
2017-12-05 5:24 ` Matthew Wilcox
2017-12-05 8:40 ` zhangyi (F)
2017-12-05 8:40 ` zhangyi (F)
2017-12-05 17:07 ` Ross Zwisler [this message]
2017-12-05 17:37 ` Matthew Wilcox
2017-12-05 19:19 ` Ross Zwisler
2017-12-05 19:54 ` Matthew Wilcox
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171205170709.GA21010@linux.intel.com \
--to=ross.zwisler@linux.intel.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mawilcox@microsoft.com \
--cc=miaoxie@huawei.com \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
--cc=yi.zhang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.