From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrey Ryabinin <a.ryabinin@samsung.com>,
Dave Hansen <dave.hansen@intel.com>,
Sasha Levin <sasha.levin@oracle.com>,
linux-mm@kvack.org
Subject: Re: [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()
Date: Wed, 30 Jul 2014 02:16:36 +0300 [thread overview]
Message-ID: <20140729231636.GA17685@node.dhcp.inet.fi> (raw)
In-Reply-To: <alpine.DEB.2.02.1407291531080.20991@chino.kir.corp.google.com>
On Tue, Jul 29, 2014 at 03:36:57PM -0700, David Rientjes wrote:
> On Tue, 29 Jul 2014, Kirill A. Shutemov wrote:
>
> > Things can go wrong if fault_around_bytes will be changed under
> > do_fault_around(): between fault_around_mask() and fault_around_pages().
> >
> > Let's read fault_around_bytes only once during do_fault_around() and
> > calculate mask based on the reading.
> >
> > Note: fault_around_bytes can only be updated via debug interface. Also
> > I've tried but was not able to trigger a bad behaviour without the
> > patch. So I would not consider this patch as urgent.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> > mm/memory.c | 17 +++++++++++------
> > 1 file changed, 11 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 9d66bc66f338..7f4f0c41c9e9 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2772,12 +2772,12 @@ static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
> >
> > static inline unsigned long fault_around_pages(void)
> > {
> > - return fault_around_bytes >> PAGE_SHIFT;
> > + return ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;
>
> Not sure why this is being added here, ACCESS_ONCE() would be needed
> depending on the context in which the return value is used,
> do_read_fault() won't need it.
Fair enough. I'll move it.
> > }
> >
> > -static inline unsigned long fault_around_mask(void)
> > +static inline unsigned long fault_around_mask(unsigned long nr_pages)
> > {
> > - return ~(fault_around_bytes - 1) & PAGE_MASK;
> > + return ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
> > }
> >
> >
>
> This patch is corrupted because of the newline here that doesn't exist in
> linux-next.
I'll recheck.
> > @@ -2844,12 +2844,17 @@ late_initcall(fault_around_debugfs);
> > static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
> > pte_t *pte, pgoff_t pgoff, unsigned int flags)
> > {
> > - unsigned long start_addr;
> > + unsigned long start_addr, nr_pages;
> > pgoff_t max_pgoff;
> > struct vm_fault vmf;
> > int off;
> >
> > - start_addr = max(address & fault_around_mask(), vma->vm_start);
> > + nr_pages = fault_around_pages();
> > + /* race with fault_around_bytes_set() */
> > + if (unlikely(nr_pages <= 1))
> > + return;
>
> Why exactly is this unlikely if fault_around_bytes is tunable via debugfs
> to equal PAGE_SIZE? I assume we're expecting nobody is going to be doing
> that, otherwise we'll hit the unlikely() branch here every time.
No. We hit do_fault_around() only after fault_around_pages() check in
do_read_fault(): so only in race case.
> So either the unlikely or the tunable should be removed.
>
> The problem is that fault_around_bytes isn't documented so we don't even
> know the min value without looking at the source code.
I would prefer to drop tunable, it will make code a bit simplier.
Andrew, iirc you've asked for it. Do you still think we need the handle?
> I also don't see how nr_pages can be < 1.
As Andrey has pointed, the 'if' is not needed.
--
Kirill A. Shutemov
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-07-29 23:16 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-29 11:33 [PATCH 0/2] faultaround updates Kirill A. Shutemov
2014-07-29 11:33 ` [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
2014-07-29 13:32 ` Andrey Ryabinin
2014-07-29 14:27 ` Kirill A. Shutemov
2014-07-29 17:07 ` Andrey Ryabinin
2014-07-29 22:36 ` David Rientjes
2014-07-29 23:16 ` Kirill A. Shutemov [this message]
2014-07-29 11:33 ` [PATCH 2/2] mm: mark fault_around_bytes __read_mostly Kirill A. Shutemov
2014-07-29 22:38 ` David Rientjes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140729231636.GA17685@node.dhcp.inet.fi \
--to=kirill@shutemov.name \
--cc=a.ryabinin@samsung.com \
--cc=akpm@linux-foundation.org \
--cc=dave.hansen@intel.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-mm@kvack.org \
--cc=rientjes@google.com \
--cc=sasha.levin@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).