linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] faultaround updates
@ 2014-07-29 11:33 Kirill A. Shutemov
  2014-07-29 11:33 ` [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
  2014-07-29 11:33 ` [PATCH 2/2] mm: mark fault_around_bytes __read_mostly Kirill A. Shutemov
  0 siblings, 2 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2014-07-29 11:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Andrey Ryabinin, Sasha Levin, David Rientjes,
	linux-mm, Kirill A. Shutemov

One fix and one tweak for faultaround code.

As alternative, we could just drop debugfs interface and make
fault_around_bytes constant.

Kirill A. Shutemov (2):
  mm: close race between do_fault_around() and fault_around_bytes_set()
  mm: mark fault_around_bytes __read_mostly

 mm/memory.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

-- 
2.0.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()
  2014-07-29 11:33 [PATCH 0/2] faultaround updates Kirill A. Shutemov
@ 2014-07-29 11:33 ` Kirill A. Shutemov
  2014-07-29 13:32   ` Andrey Ryabinin
  2014-07-29 11:33 ` [PATCH 2/2] mm: mark fault_around_bytes __read_mostly Kirill A. Shutemov
  1 sibling, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2014-07-29 11:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Andrey Ryabinin, Sasha Levin, David Rientjes,
	linux-mm, Kirill A. Shutemov

Things can go wrong if fault_around_bytes will be changed under
do_fault_around(): between fault_around_mask() and fault_around_pages().

Let's read fault_around_bytes only once during do_fault_around() and
calculate mask based on the reading.

Note: fault_around_bytes can only be updated via debug interface. Also
I've tried but was not able to trigger a bad behaviour without the
patch. So I would not consider this patch as urgent.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/memory.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 9d66bc66f338..2ce07dc9b52b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2772,12 +2772,12 @@ static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
 
 static inline unsigned long fault_around_pages(void)
 {
-	return fault_around_bytes >> PAGE_SHIFT;
+	return ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;
 }
 
-static inline unsigned long fault_around_mask(void)
+static inline unsigned long fault_around_mask(unsigned long nr_pages)
 {
-	return ~(fault_around_bytes - 1) & PAGE_MASK;
+	return ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
 }
 
 
@@ -2844,12 +2844,17 @@ late_initcall(fault_around_debugfs);
 static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
 		pte_t *pte, pgoff_t pgoff, unsigned int flags)
 {
-	unsigned long start_addr;
+	unsigned long start_addr, nr_pages;
 	pgoff_t max_pgoff;
 	struct vm_fault vmf;
 	int off;
 
-	start_addr = max(address & fault_around_mask(), vma->vm_start);
+	nr_pages = fault_around_pages();
+	/* race with fault_around_bytes_set() */
+	if (nr_pages <= 1)
+		return;
+
+	start_addr = max(address & fault_around_mask(nr_pages), vma->vm_start);
 	off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
 	pte -= off;
 	pgoff -= off;
@@ -2861,7 +2866,7 @@ static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
 	max_pgoff = pgoff - ((start_addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
 		PTRS_PER_PTE - 1;
 	max_pgoff = min3(max_pgoff, vma_pages(vma) + vma->vm_pgoff - 1,
-			pgoff + fault_around_pages() - 1);
+			pgoff + nr_pages - 1);
 
 	/* Check if it makes any sense to call ->map_pages */
 	while (!pte_none(*pte)) {
-- 
2.0.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] mm: mark fault_around_bytes __read_mostly
  2014-07-29 11:33 [PATCH 0/2] faultaround updates Kirill A. Shutemov
  2014-07-29 11:33 ` [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
@ 2014-07-29 11:33 ` Kirill A. Shutemov
  2014-07-29 22:38   ` David Rientjes
  1 sibling, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2014-07-29 11:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Andrey Ryabinin, Sasha Levin, David Rientjes,
	linux-mm, Kirill A. Shutemov

fault_around_bytes can only be changed via debugfs. Let's mark it
read-mostly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/memory.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 2ce07dc9b52b..ed3073d6a0e0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2768,7 +2768,8 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
 	update_mmu_cache(vma, address, pte);
 }
 
-static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
+static unsigned long fault_around_bytes __read_mostly =
+	rounddown_pow_of_two(65536);
 
 static inline unsigned long fault_around_pages(void)
 {
-- 
2.0.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()
  2014-07-29 11:33 ` [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
@ 2014-07-29 13:32   ` Andrey Ryabinin
  2014-07-29 14:27     ` Kirill A. Shutemov
  0 siblings, 1 reply; 9+ messages in thread
From: Andrey Ryabinin @ 2014-07-29 13:32 UTC (permalink / raw)
  To: Kirill A. Shutemov, Andrew Morton
  Cc: Dave Hansen, Sasha Levin, David Rientjes, linux-mm

On 07/29/14 15:33, Kirill A. Shutemov wrote:
> Things can go wrong if fault_around_bytes will be changed under
> do_fault_around(): between fault_around_mask() and fault_around_pages().
> 
> Let's read fault_around_bytes only once during do_fault_around() and
> calculate mask based on the reading.
> 
> Note: fault_around_bytes can only be updated via debug interface. Also
> I've tried but was not able to trigger a bad behaviour without the
> patch. So I would not consider this patch as urgent.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/memory.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 9d66bc66f338..2ce07dc9b52b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2772,12 +2772,12 @@ static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
>  
>  static inline unsigned long fault_around_pages(void)
>  {
> -	return fault_around_bytes >> PAGE_SHIFT;
> +	return ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;
>  }
>  
> -static inline unsigned long fault_around_mask(void)
> +static inline unsigned long fault_around_mask(unsigned long nr_pages)
>  {
> -	return ~(fault_around_bytes - 1) & PAGE_MASK;
> +	return ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
>  }
>  
>  
> @@ -2844,12 +2844,17 @@ late_initcall(fault_around_debugfs);
>  static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
>  		pte_t *pte, pgoff_t pgoff, unsigned int flags)
>  {
> -	unsigned long start_addr;
> +	unsigned long start_addr, nr_pages;
>  	pgoff_t max_pgoff;
>  	struct vm_fault vmf;
>  	int off;
>  
> -	start_addr = max(address & fault_around_mask(), vma->vm_start);
> +	nr_pages = fault_around_pages();
> +	/* race with fault_around_bytes_set() */
> +	if (nr_pages <= 1)

unlikely() ?

> +		return;
> +
> +	start_addr = max(address & fault_around_mask(nr_pages), vma->vm_start);
>  	off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
>  	pte -= off;
>  	pgoff -= off;
> @@ -2861,7 +2866,7 @@ static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
>  	max_pgoff = pgoff - ((start_addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
>  		PTRS_PER_PTE - 1;
>  	max_pgoff = min3(max_pgoff, vma_pages(vma) + vma->vm_pgoff - 1,
> -			pgoff + fault_around_pages() - 1);
> +			pgoff + nr_pages - 1);
>  
>  	/* Check if it makes any sense to call ->map_pages */
>  	while (!pte_none(*pte)) {
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()
  2014-07-29 13:32   ` Andrey Ryabinin
@ 2014-07-29 14:27     ` Kirill A. Shutemov
  2014-07-29 17:07       ` Andrey Ryabinin
  2014-07-29 22:36       ` David Rientjes
  0 siblings, 2 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2014-07-29 14:27 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Kirill A. Shutemov, Andrew Morton, Dave Hansen, Sasha Levin,
	David Rientjes, linux-mm

Andrey Ryabinin wrote:
> On 07/29/14 15:33, Kirill A. Shutemov wrote:
> > Things can go wrong if fault_around_bytes will be changed under
> > do_fault_around(): between fault_around_mask() and fault_around_pages().
> > 
> > Let's read fault_around_bytes only once during do_fault_around() and
> > calculate mask based on the reading.
> > 
> > Note: fault_around_bytes can only be updated via debug interface. Also
> > I've tried but was not able to trigger a bad behaviour without the
> > patch. So I would not consider this patch as urgent.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  mm/memory.c | 17 +++++++++++------
> >  1 file changed, 11 insertions(+), 6 deletions(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 9d66bc66f338..2ce07dc9b52b 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2772,12 +2772,12 @@ static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
> >  
> >  static inline unsigned long fault_around_pages(void)
> >  {
> > -	return fault_around_bytes >> PAGE_SHIFT;
> > +	return ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;
> >  }
> >  
> > -static inline unsigned long fault_around_mask(void)
> > +static inline unsigned long fault_around_mask(unsigned long nr_pages)
> >  {
> > -	return ~(fault_around_bytes - 1) & PAGE_MASK;
> > +	return ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
> >  }
> >  
> >  
> > @@ -2844,12 +2844,17 @@ late_initcall(fault_around_debugfs);
> >  static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
> >  		pte_t *pte, pgoff_t pgoff, unsigned int flags)
> >  {
> > -	unsigned long start_addr;
> > +	unsigned long start_addr, nr_pages;
> >  	pgoff_t max_pgoff;
> >  	struct vm_fault vmf;
> >  	int off;
> >  
> > -	start_addr = max(address & fault_around_mask(), vma->vm_start);
> > +	nr_pages = fault_around_pages();
> > +	/* race with fault_around_bytes_set() */
> > +	if (nr_pages <= 1)
> 
> unlikely() ?

Yep.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()
  2014-07-29 14:27     ` Kirill A. Shutemov
@ 2014-07-29 17:07       ` Andrey Ryabinin
  2014-07-29 22:36       ` David Rientjes
  1 sibling, 0 replies; 9+ messages in thread
From: Andrey Ryabinin @ 2014-07-29 17:07 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Andrey Ryabinin, Andrew Morton, Dave Hansen, Sasha Levin,
	David Rientjes, linux-mm

2014-07-29 18:27 GMT+04:00 Kirill A. Shutemov <kirill.shutemov@linux.intel.com>:
> Andrey Ryabinin wrote:
>> On 07/29/14 15:33, Kirill A. Shutemov wrote:
>> > Things can go wrong if fault_around_bytes will be changed under
>> > do_fault_around(): between fault_around_mask() and fault_around_pages().
>> >
>> > Let's read fault_around_bytes only once during do_fault_around() and
>> > calculate mask based on the reading.
>> >
>> > Note: fault_around_bytes can only be updated via debug interface. Also
>> > I've tried but was not able to trigger a bad behaviour without the
>> > patch. So I would not consider this patch as urgent.
>> >
>> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> > ---
>> >  mm/memory.c | 17 +++++++++++------
>> >  1 file changed, 11 insertions(+), 6 deletions(-)
>> >
>> > diff --git a/mm/memory.c b/mm/memory.c
>> > index 9d66bc66f338..2ce07dc9b52b 100644
>> > --- a/mm/memory.c
>> > +++ b/mm/memory.c
>> > @@ -2772,12 +2772,12 @@ static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
>> >
>> >  static inline unsigned long fault_around_pages(void)
>> >  {
>> > -   return fault_around_bytes >> PAGE_SHIFT;
>> > +   return ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;
>> >  }
>> >
>> > -static inline unsigned long fault_around_mask(void)
>> > +static inline unsigned long fault_around_mask(unsigned long nr_pages)
>> >  {
>> > -   return ~(fault_around_bytes - 1) & PAGE_MASK;
>> > +   return ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
>> >  }
>> >
>> >
>> > @@ -2844,12 +2844,17 @@ late_initcall(fault_around_debugfs);
>> >  static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
>> >             pte_t *pte, pgoff_t pgoff, unsigned int flags)
>> >  {
>> > -   unsigned long start_addr;
>> > +   unsigned long start_addr, nr_pages;
>> >     pgoff_t max_pgoff;
>> >     struct vm_fault vmf;
>> >     int off;
>> >
>> > -   start_addr = max(address & fault_around_mask(), vma->vm_start);
>> > +   nr_pages = fault_around_pages();
>> > +   /* race with fault_around_bytes_set() */
>> > +   if (nr_pages <= 1)
>>
>> unlikely() ?
>
> Yep.
>

Btw, do we need this check at all? nr_pages can't be 0, and code below
seems able to handle nr_page == 1.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()
  2014-07-29 14:27     ` Kirill A. Shutemov
  2014-07-29 17:07       ` Andrey Ryabinin
@ 2014-07-29 22:36       ` David Rientjes
  2014-07-29 23:16         ` Kirill A. Shutemov
  1 sibling, 1 reply; 9+ messages in thread
From: David Rientjes @ 2014-07-29 22:36 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Andrey Ryabinin, Andrew Morton, Dave Hansen, Sasha Levin,
	linux-mm

On Tue, 29 Jul 2014, Kirill A. Shutemov wrote:

> Things can go wrong if fault_around_bytes will be changed under
> do_fault_around(): between fault_around_mask() and fault_around_pages().
> 
> Let's read fault_around_bytes only once during do_fault_around() and
> calculate mask based on the reading.
> 
> Note: fault_around_bytes can only be updated via debug interface. Also
> I've tried but was not able to trigger a bad behaviour without the
> patch. So I would not consider this patch as urgent.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/memory.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 9d66bc66f338..7f4f0c41c9e9 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2772,12 +2772,12 @@ static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
>  
>  static inline unsigned long fault_around_pages(void)
>  {
> -	return fault_around_bytes >> PAGE_SHIFT;
> +	return ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;

Not sure why this is being added here, ACCESS_ONCE() would be needed 
depending on the context in which the return value is used, 
do_read_fault() won't need it.

>  }
>  
> -static inline unsigned long fault_around_mask(void)
> +static inline unsigned long fault_around_mask(unsigned long nr_pages)
>  {
> -	return ~(fault_around_bytes - 1) & PAGE_MASK;
> +	return ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
>  }
>  
>  

This patch is corrupted because of the newline here that doesn't exist in 
linux-next.

> @@ -2844,12 +2844,17 @@ late_initcall(fault_around_debugfs);
>  static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
>  		pte_t *pte, pgoff_t pgoff, unsigned int flags)
>  {
> -	unsigned long start_addr;
> +	unsigned long start_addr, nr_pages;
>  	pgoff_t max_pgoff;
>  	struct vm_fault vmf;
>  	int off;
>  
> -	start_addr = max(address & fault_around_mask(), vma->vm_start);
> +	nr_pages = fault_around_pages();
> +	/* race with fault_around_bytes_set() */
> +	if (unlikely(nr_pages <= 1))
> +		return;

Why exactly is this unlikely if fault_around_bytes is tunable via debugfs 
to equal PAGE_SIZE?  I assume we're expecting nobody is going to be doing 
that, otherwise we'll hit the unlikely() branch here every time.  So 
either the unlikely or the tunable should be removed.

The problem is that fault_around_bytes isn't documented so we don't even 
know the min value without looking at the source code.

I also don't see how nr_pages can be < 1.

> +
> +	start_addr = max(address & fault_around_mask(nr_pages), vma->vm_start);
>  	off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
>  	pte -= off;
>  	pgoff -= off;
> @@ -2861,7 +2866,7 @@ static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
>  	max_pgoff = pgoff - ((start_addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
>  		PTRS_PER_PTE - 1;
>  	max_pgoff = min3(max_pgoff, vma_pages(vma) + vma->vm_pgoff - 1,
> -			pgoff + fault_around_pages() - 1);
> +			pgoff + nr_pages - 1);
>  
>  	/* Check if it makes any sense to call ->map_pages */
>  	while (!pte_none(*pte)) {

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] mm: mark fault_around_bytes __read_mostly
  2014-07-29 11:33 ` [PATCH 2/2] mm: mark fault_around_bytes __read_mostly Kirill A. Shutemov
@ 2014-07-29 22:38   ` David Rientjes
  0 siblings, 0 replies; 9+ messages in thread
From: David Rientjes @ 2014-07-29 22:38 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Andrew Morton, Dave Hansen, Andrey Ryabinin, Sasha Levin,
	linux-mm

On Tue, 29 Jul 2014, Kirill A. Shutemov wrote:

> fault_around_bytes can only be changed via debugfs. Let's mark it
> read-mostly.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Suggested-by: David Rientjes <rientjes@google.com>
Acked-by: David Rientjes <rientjes@google.com>

> ---
>  mm/memory.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 2ce07dc9b52b..ed3073d6a0e0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2768,7 +2768,8 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
>  	update_mmu_cache(vma, address, pte);
>  }
>  
> -static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
> +static unsigned long fault_around_bytes __read_mostly =
> +	rounddown_pow_of_two(65536);
>  
>  static inline unsigned long fault_around_pages(void)
>  {

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()
  2014-07-29 22:36       ` David Rientjes
@ 2014-07-29 23:16         ` Kirill A. Shutemov
  0 siblings, 0 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2014-07-29 23:16 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton
  Cc: Kirill A. Shutemov, Andrey Ryabinin, Dave Hansen, Sasha Levin,
	linux-mm

On Tue, Jul 29, 2014 at 03:36:57PM -0700, David Rientjes wrote:
> On Tue, 29 Jul 2014, Kirill A. Shutemov wrote:
> 
> > Things can go wrong if fault_around_bytes will be changed under
> > do_fault_around(): between fault_around_mask() and fault_around_pages().
> > 
> > Let's read fault_around_bytes only once during do_fault_around() and
> > calculate mask based on the reading.
> > 
> > Note: fault_around_bytes can only be updated via debug interface. Also
> > I've tried but was not able to trigger a bad behaviour without the
> > patch. So I would not consider this patch as urgent.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  mm/memory.c | 17 +++++++++++------
> >  1 file changed, 11 insertions(+), 6 deletions(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 9d66bc66f338..7f4f0c41c9e9 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2772,12 +2772,12 @@ static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
> >  
> >  static inline unsigned long fault_around_pages(void)
> >  {
> > -	return fault_around_bytes >> PAGE_SHIFT;
> > +	return ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;
> 
> Not sure why this is being added here, ACCESS_ONCE() would be needed 
> depending on the context in which the return value is used, 
> do_read_fault() won't need it.

Fair enough. I'll move it.

> >  }
> >  
> > -static inline unsigned long fault_around_mask(void)
> > +static inline unsigned long fault_around_mask(unsigned long nr_pages)
> >  {
> > -	return ~(fault_around_bytes - 1) & PAGE_MASK;
> > +	return ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
> >  }
> >  
> >  
> 
> This patch is corrupted because of the newline here that doesn't exist in 
> linux-next.

I'll recheck.

> > @@ -2844,12 +2844,17 @@ late_initcall(fault_around_debugfs);
> >  static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
> >  		pte_t *pte, pgoff_t pgoff, unsigned int flags)
> >  {
> > -	unsigned long start_addr;
> > +	unsigned long start_addr, nr_pages;
> >  	pgoff_t max_pgoff;
> >  	struct vm_fault vmf;
> >  	int off;
> >  
> > -	start_addr = max(address & fault_around_mask(), vma->vm_start);
> > +	nr_pages = fault_around_pages();
> > +	/* race with fault_around_bytes_set() */
> > +	if (unlikely(nr_pages <= 1))
> > +		return;
> 
> Why exactly is this unlikely if fault_around_bytes is tunable via debugfs 
> to equal PAGE_SIZE?  I assume we're expecting nobody is going to be doing 
> that, otherwise we'll hit the unlikely() branch here every time.

No. We hit do_fault_around() only after fault_around_pages() check in
do_read_fault(): so only in race case.

> So either the unlikely or the tunable should be removed.
> 
> The problem is that fault_around_bytes isn't documented so we don't even 
> know the min value without looking at the source code.

I would prefer to drop tunable, it will make code a bit simplier.
Andrew, iirc you've asked for it. Do you still think we need the handle?

> I also don't see how nr_pages can be < 1.

As Andrey has pointed, the 'if' is not needed.

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-07-29 23:16 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-07-29 11:33 [PATCH 0/2] faultaround updates Kirill A. Shutemov
2014-07-29 11:33 ` [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
2014-07-29 13:32   ` Andrey Ryabinin
2014-07-29 14:27     ` Kirill A. Shutemov
2014-07-29 17:07       ` Andrey Ryabinin
2014-07-29 22:36       ` David Rientjes
2014-07-29 23:16         ` Kirill A. Shutemov
2014-07-29 11:33 ` [PATCH 2/2] mm: mark fault_around_bytes __read_mostly Kirill A. Shutemov
2014-07-29 22:38   ` David Rientjes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).