linux-mm.kvack.org archive mirror
* [PATCH v3] mm: Fix slab->page flags corruption.
@ 2012-05-17 22:17 Pravin B Shelar
  2012-05-22 17:48 ` Andrea Arcangeli
  0 siblings, 1 reply; 3+ messages in thread
From: Pravin B Shelar @ 2012-05-17 22:17 UTC (permalink / raw)
  To: aarcange, cl, penberg, mpm
  Cc: linux-kernel, linux-mm, jesse, abhide, Pravin B Shelar

v2-v3:
	- Check if page is still compound page after inc refcnt.
v1-v2:
	- Avoid taking compound lock for slab pages.

--8<--------------------------cut here-------------------------->8--

Transparent huge pages (THP) can change page->flags (PG_compound_lock)
without taking the slab lock. Since THP cannot split slab pages, we can
safely access a slab compound page without taking the compound lock.

Specifically, this patch fixes a race between compound_unlock() and the
slab functions that update page flags. The race can occur when
get_page()/put_page() is called on a page backing a slab object.

Reported-by: Amey Bhide <abhide@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
---
 include/linux/mm.h |    2 ++
 mm/swap.c          |   27 +++++++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 74aa71b..82f86e6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -321,6 +321,7 @@ static inline int is_vmalloc_or_module_addr(const void *x)
 static inline void compound_lock(struct page *page)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	VM_BUG_ON(PageSlab(page));
 	bit_spin_lock(PG_compound_lock, &page->flags);
 #endif
 }
@@ -328,6 +329,7 @@ static inline void compound_lock(struct page *page)
 static inline void compound_unlock(struct page *page)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	VM_BUG_ON(PageSlab(page));
 	bit_spin_unlock(PG_compound_lock, &page->flags);
 #endif
 }
diff --git a/mm/swap.c b/mm/swap.c
index 8ff73d8..44a0f81 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -82,6 +82,19 @@ static void put_compound_page(struct page *page)
 		if (likely(page != page_head &&
 			   get_page_unless_zero(page_head))) {
 			unsigned long flags;
+
+			if (PageSlab(page_head)) {
+				if (PageTail(page)) {
+					/* THP can not break up slab pages, avoid
+					 * taking compound_lock(). */
+					if (put_page_testzero(page_head))
+						VM_BUG_ON(1);
+
+					atomic_dec(&page->_mapcount);
+					goto skip_lock_tail;
+				} else
+					goto skip_lock;
+			}
 			/*
 			 * page_head wasn't a dangling pointer but it
 			 * may not be a head page anymore by the time
@@ -93,6 +106,7 @@ static void put_compound_page(struct page *page)
 				/* __split_huge_page_refcount run before us */
 				compound_unlock_irqrestore(page_head, flags);
 				VM_BUG_ON(PageHead(page_head));
+			skip_lock:
 				if (put_page_testzero(page_head))
 					__put_single_page(page_head);
 			out_put_single:
@@ -115,6 +129,8 @@ static void put_compound_page(struct page *page)
 			VM_BUG_ON(atomic_read(&page_head->_count) <= 0);
 			VM_BUG_ON(atomic_read(&page->_count) != 0);
 			compound_unlock_irqrestore(page_head, flags);
+
+			skip_lock_tail:
 			if (put_page_testzero(page_head)) {
 				if (PageHead(page_head))
 					__put_compound_page(page_head);
@@ -162,6 +178,15 @@ bool __get_page_tail(struct page *page)
 	struct page *page_head = compound_trans_head(page);
 
 	if (likely(page != page_head && get_page_unless_zero(page_head))) {
+
+		if (PageSlab(page_head)) {
+			if (likely(PageTail(page))) {
+				__get_page_tail_foll(page, false);
+				return true;
+			} else
+				goto out;
+		}
+
 		/*
 		 * page_head wasn't a dangling pointer but it
 		 * may not be a head page anymore by the time
@@ -175,6 +200,8 @@ bool __get_page_tail(struct page *page)
 			got = true;
 		}
 		compound_unlock_irqrestore(page_head, flags);
+
+		out:
 		if (unlikely(!got))
 			put_page(page_head);
 	}
-- 
1.7.10

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH v3] mm: Fix slab->page flags corruption.
  2012-05-17 22:17 [PATCH v3] mm: Fix slab->page flags corruption Pravin B Shelar
@ 2012-05-22 17:48 ` Andrea Arcangeli
  2012-05-22 22:41   ` Pravin Shelar
  0 siblings, 1 reply; 3+ messages in thread
From: Andrea Arcangeli @ 2012-05-22 17:48 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: cl, penberg, mpm, linux-kernel, linux-mm, jesse, abhide

On Thu, May 17, 2012 at 03:17:49PM -0700, Pravin B Shelar wrote:
> diff --git a/mm/swap.c b/mm/swap.c
> index 8ff73d8..44a0f81 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -82,6 +82,19 @@ static void put_compound_page(struct page *page)
>  		if (likely(page != page_head &&
>  			   get_page_unless_zero(page_head))) {
>  			unsigned long flags;
> +
> +			if (PageSlab(page_head)) {
> +				if (PageTail(page)) {
> +					/* THP can not break up slab pages, avoid
> +					 * taking compound_lock(). */
> +					if (put_page_testzero(page_head))
> +						VM_BUG_ON(1);
> +
> +					atomic_dec(&page->_mapcount);
> +					goto skip_lock_tail;
> +				} else
> +					goto skip_lock;
> +			}

A comment noting that slab prefers not to use atomic ops on
page->flags would help here.

>  			/*
>  			 * page_head wasn't a dangling pointer but it
>  			 * may not be a head page anymore by the time
> @@ -93,6 +106,7 @@ static void put_compound_page(struct page *page)
>  				/* __split_huge_page_refcount run before us */
>  				compound_unlock_irqrestore(page_head, flags);
>  				VM_BUG_ON(PageHead(page_head));

Hmm, while reviewing this one I've been thinking that the head page,
after the hugepage split, could have been freed and reallocated as
order 1 or 2, and legitimately become a head page again.

The whole point of the bug-on is that the page cannot be reallocated as
a THP because the tail is still there and not free yet, but it doesn't
take into account that the head page could be allocated as a compound
page of a smaller size, and maybe the tail is the last subpage of the
THP.

So there's the risk of a false positive, in an extremely unlikely case
(the fact that slab goes in unmovable pageblocks and THP goes in movable
ones further decreases the probability). All production kernels run with
VM_BUG_ON disabled so it's a very small concern, but maybe we should
delete it. It has never triggered; this comes just from code review. Do
you agree?

> +			skip_lock:
>  				if (put_page_testzero(page_head))
>  					__put_single_page(page_head);
>  			out_put_single:
> @@ -115,6 +129,8 @@ static void put_compound_page(struct page *page)
>  			VM_BUG_ON(atomic_read(&page_head->_count) <= 0);
>  			VM_BUG_ON(atomic_read(&page->_count) != 0);
>  			compound_unlock_irqrestore(page_head, flags);
> +
> +			skip_lock_tail:
>  			if (put_page_testzero(page_head)) {
>  				if (PageHead(page_head))
>  					__put_compound_page(page_head);
> @@ -162,6 +178,15 @@ bool __get_page_tail(struct page *page)
>  	struct page *page_head = compound_trans_head(page);
>  
>  	if (likely(page != page_head && get_page_unless_zero(page_head))) {
> +
> +		if (PageSlab(page_head)) {
> +			if (likely(PageTail(page))) {
> +				__get_page_tail_foll(page, false);
> +				return true;
> +			} else
> +				goto out;
> +		}
> +

A comment here too would be nice.

>  		/*
>  		 * page_head wasn't a dangling pointer but it
>  		 * may not be a head page anymore by the time
> @@ -175,6 +200,8 @@ bool __get_page_tail(struct page *page)
>  			got = true;
>  		}
>  		compound_unlock_irqrestore(page_head, flags);
> +
> +		out:
>  		if (unlikely(!got))
>  			put_page(page_head);

The out label could go on the line below. That is assuming we don't want
to be cleaner and use put_page above instead of the goto, which would
probably also drop a branch (the goto sits in such a slow path). I'm fine
either way.

It's not the cleanest of patches, but it's clearly a performance
tweak.

Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>

Thanks,
Andrea


* Re: [PATCH v3] mm: Fix slab->page flags corruption.
  2012-05-22 17:48 ` Andrea Arcangeli
@ 2012-05-22 22:41   ` Pravin Shelar
  0 siblings, 0 replies; 3+ messages in thread
From: Pravin Shelar @ 2012-05-22 22:41 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: cl, penberg, mpm, linux-kernel, linux-mm, jesse, abhide

On Tue, May 22, 2012 at 10:48 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> On Thu, May 17, 2012 at 03:17:49PM -0700, Pravin B Shelar wrote:
>> diff --git a/mm/swap.c b/mm/swap.c
>> index 8ff73d8..44a0f81 100644
>> --- a/mm/swap.c
>> +++ b/mm/swap.c
>> @@ -82,6 +82,19 @@ static void put_compound_page(struct page *page)
>>               if (likely(page != page_head &&
>>                          get_page_unless_zero(page_head))) {
>>                       unsigned long flags;
>> +
>> +                     if (PageSlab(page_head)) {
>> +                             if (PageTail(page)) {
>> +                                     /* THP can not break up slab pages, avoid
>> +                                      * taking compound_lock(). */
>> +                                     if (put_page_testzero(page_head))
>> +                                             VM_BUG_ON(1);
>> +
>> +                                     atomic_dec(&page->_mapcount);
>> +                                     goto skip_lock_tail;
>> +                             } else
>> +                                     goto skip_lock;
>> +                     }
>
> A comment noting that slab prefers not to use atomic ops on
> page->flags would help here.
OK.

>
>>                       /*
>>                        * page_head wasn't a dangling pointer but it
>>                        * may not be a head page anymore by the time
>> @@ -93,6 +106,7 @@ static void put_compound_page(struct page *page)
>>                               /* __split_huge_page_refcount run before us */
>>                               compound_unlock_irqrestore(page_head, flags);
>>                               VM_BUG_ON(PageHead(page_head));
>
> Hmm, while reviewing this one I've been thinking that the head page,
> after the hugepage split, could have been freed and reallocated as
> order 1 or 2, and legitimately become a head page again.
>
> The whole point of the bug-on is that the page cannot be reallocated as
> a THP because the tail is still there and not free yet, but it doesn't
> take into account that the head page could be allocated as a compound
> page of a smaller size, and maybe the tail is the last subpage of the
> THP.
>
> So there's the risk of a false positive, in an extremely unlikely case
> (the fact that slab goes in unmovable pageblocks and THP goes in movable
> ones further decreases the probability). All production kernels run with
> VM_BUG_ON disabled so it's a very small concern, but maybe we should
> delete it. It has never triggered; this comes just from code review. Do
> you agree?
>
Right, I will delete it.

>> +                     skip_lock:
>>                               if (put_page_testzero(page_head))
>>                                       __put_single_page(page_head);
>>                       out_put_single:
>> @@ -115,6 +129,8 @@ static void put_compound_page(struct page *page)
>>                       VM_BUG_ON(atomic_read(&page_head->_count) <= 0);
>>                       VM_BUG_ON(atomic_read(&page->_count) != 0);
>>                       compound_unlock_irqrestore(page_head, flags);
>> +
>> +                     skip_lock_tail:
>>                       if (put_page_testzero(page_head)) {
>>                               if (PageHead(page_head))
>>                                       __put_compound_page(page_head);
>> @@ -162,6 +178,15 @@ bool __get_page_tail(struct page *page)
>>       struct page *page_head = compound_trans_head(page);
>>
>>       if (likely(page != page_head && get_page_unless_zero(page_head))) {
>> +
>> +             if (PageSlab(page_head)) {
>> +                     if (likely(PageTail(page))) {
>> +                             __get_page_tail_foll(page, false);
>> +                             return true;
>> +                     } else
>> +                             goto out;
>> +             }
>> +
>
> A comment here too would be nice.
>
>>               /*
>>                * page_head wasn't a dangling pointer but it
>>                * may not be a head page anymore by the time
>> @@ -175,6 +200,8 @@ bool __get_page_tail(struct page *page)
>>                       got = true;
>>               }
>>               compound_unlock_irqrestore(page_head, flags);
>> +
>> +             out:
>>               if (unlikely(!got))
>>                       put_page(page_head);
>
> The out label could go on the line below. That is assuming we don't want
> to be cleaner and use put_page above instead of the goto, which would
> probably also drop a branch (the goto sits in such a slow path). I'm fine
> either way.
>
> It's not the cleanest of patches, but it's clearly a performance
> tweak.
>
OK, I will post a revised patch.
> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
>
> Thanks,
> Andrea

