linux-mm.kvack.org archive mirror
* [Resend PATCH v2] mm: Fix slab->page _count corruption.
@ 2012-05-30 19:20 Pravin B Shelar
  2012-06-08 20:10 ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Pravin B Shelar @ 2012-05-30 19:20 UTC (permalink / raw)
  To: akpm, cl, penberg, aarcange; +Cc: linux-mm, abhide, Pravin B Shelar

On arches that do not support this_cpu_cmpxchg_double, slab_lock is used
to emulate an atomic cmpxchg() on the double word that contains
page->_count. The page count can be changed by get_page() or put_page()
without taking slab_lock, so the emulated update can overwrite it and
corrupt the page counter.

The following patch fixes this by moving page->_count out of the
cmpxchg_double data, so that slub does not change it while updating slub
metadata in struct page.

Reported-by: Amey Bhide <abhide@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Christoph Lameter <cl@linux.com>
---
 include/linux/mm_types.h |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 18b48c4..e54a6b0 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -57,8 +57,16 @@ struct page {
 		};
 
 		union {
+#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \
+    defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
 			/* Used for cmpxchg_double in slub */
 			unsigned long counters;
+#else
+			/* Keep _count separate from slub cmpxchg_double data,
+			 * As rest of double word is protected by slab_lock
+			 * but _count is not. */
+			unsigned counters;
+#endif
 
 			struct {
 
-- 
1.7.10
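
The race described in the commit message can be modelled in user space.
The following is a minimal sketch, not kernel code: struct model_wide,
struct model_narrow and the lost_update_*() helpers are hypothetical
names, a plain int stands in for the atomic _count, and the concurrent
get_page() is simulated by an ordinary increment placed between the
snapshot and the write-back that the slab_lock fallback performs.

```c
#include <assert.h>
#include <string.h>

/* Same bit-field layout as the slub fields quoted in this thread. */
struct slub_fields {
	unsigned inuse:16;
	unsigned objects:15;
	unsigned frozen:1;
};

/* Model of the old layout: counters is a full unsigned long. */
struct model_wide {
	union {
		unsigned long counters;
		struct {
			struct slub_fields slub;
			int _count;	/* stand-in for atomic_t _count */
		};
	};
};

/* Model of the patched fallback layout: counters is a plain unsigned. */
struct model_narrow {
	union {
		unsigned counters;
		struct {
			struct slub_fields slub;
			int _count;
		};
	};
};

/* Returns the final _count after a simulated racing get_page(). */
int lost_update_wide(void)
{
	struct model_wide p, tmp;

	assert(sizeof(struct slub_fields) == sizeof(unsigned));
	memset(&p, 0, sizeof(p));
	memset(&tmp, 0, sizeof(tmp));
	p._count = 1;

	tmp.counters = p.counters;	/* slub snapshots the word */
	p._count++;			/* get_page(), no slab_lock held */
	tmp.slub.frozen = 1;		/* slub prepares its new metadata */
	p.counters = tmp.counters;	/* write-back under slab_lock */
	return p._count;
}

/* Same sequence, but counters no longer spans _count. */
int lost_update_narrow(void)
{
	struct model_narrow p, tmp;

	memset(&p, 0, sizeof(p));
	memset(&tmp, 0, sizeof(tmp));
	p._count = 1;

	tmp.counters = p.counters;
	p._count++;
	tmp.slub.frozen = 1;
	p.counters = tmp.counters;
	return p._count;
}
```

On a target where unsigned long is 8 bytes and unsigned is 4 bytes,
lost_update_wide() returns 1 because the stale snapshot overwrites the
increment, while lost_update_narrow() returns 2. Where unsigned long is
4 bytes, neither model loses the update.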

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/


* Re: [Resend PATCH v2] mm: Fix slab->page _count corruption.
  2012-05-30 19:20 [Resend PATCH v2] mm: Fix slab->page _count corruption Pravin B Shelar
@ 2012-06-08 20:10 ` Andrew Morton
  2012-06-08 20:15   ` Christoph Lameter
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2012-06-08 20:10 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: cl, penberg, aarcange, linux-mm, abhide

On Wed, 30 May 2012 12:20:10 -0700
Pravin B Shelar <pshelar@nicira.com> wrote:

> On arches that do not support this_cpu_cmpxchg_double, slab_lock is used
> to emulate an atomic cmpxchg() on the double word that contains
> page->_count. The page count can be changed by get_page() or put_page()
> without taking slab_lock, so the emulated update can overwrite it and
> corrupt the page counter.
> 
> The following patch fixes this by moving page->_count out of the
> cmpxchg_double data, so that slub does not change it while updating slub
> metadata in struct page.
> 
> Reported-by: Amey Bhide <abhide@nicira.com>
> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
> Acked-by: Christoph Lameter <cl@linux.com>
> ---
>  include/linux/mm_types.h |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 18b48c4..e54a6b0 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -57,8 +57,16 @@ struct page {
>  		};
>  
>  		union {
> +#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \
> +    defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
>  			/* Used for cmpxchg_double in slub */
>  			unsigned long counters;
> +#else
> +			/* Keep _count separate from slub cmpxchg_double data,
> +			 * As rest of double word is protected by slab_lock
> +			 * but _count is not. */
> +			unsigned counters;
> +#endif
>  
>  			struct {

OK.  I assume this bug has been there for quite some time.

How serious is it?  Have people been reporting it in real workloads? 
How to trigger it?  IOW, does this need -stable backporting?

Also, someone forgot to document these:

				struct {
					unsigned inuse:16;
					unsigned objects:15;
					unsigned frozen:1;
				};
pls fix.
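
For readers unfamiliar with these fields, a commented version might look
like the sketch below. The comments reflect the usual reading of the
SLUB allocator and are an assumption here, not text from the patch;
struct slub_counters_doc and SLUB_MAX_OBJECTS are hypothetical names
used only for illustration.

```c
/* Hypothetical, documented copy of the slub bit-fields in struct page. */
struct slub_counters_doc {
	unsigned inuse:16;	/* objects currently allocated from this slab */
	unsigned objects:15;	/* total number of object slots in this slab */
	unsigned frozen:1;	/* slab is owned by a CPU and kept off the
				 * partial lists while it is set */
};

/* A 15-bit field limits a slab to this many objects. */
enum { SLUB_MAX_OBJECTS = (1 << 15) - 1 };
```

The three fields are declared to add up to 32 bits, which is what allows
them to be read and written together through a single counters word.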


* Re: [Resend PATCH v2] mm: Fix slab->page _count corruption.
  2012-06-08 20:10 ` Andrew Morton
@ 2012-06-08 20:15   ` Christoph Lameter
  2012-06-08 20:23     ` Pravin Shelar
  2012-06-08 20:32     ` Andrew Morton
  0 siblings, 2 replies; 8+ messages in thread
From: Christoph Lameter @ 2012-06-08 20:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Pravin B Shelar, penberg, aarcange, linux-mm, abhide

On Fri, 8 Jun 2012, Andrew Morton wrote:

> OK.  I assume this bug has been there for quite some time.

Well the huge pages refcount tricks caused the issue.

> How serious is it?  Have people been reporting it in real workloads?
> How to trigger it?  IOW, does this need -stable backporting?

Possibly.

> Also, someone forgot to document these:
>
> 				struct {
> 					unsigned inuse:16;
> 					unsigned objects:15;
> 					unsigned frozen:1;
> 				};

So far I thought that the field names are pretty clear on their own.


* Re: [Resend PATCH v2] mm: Fix slab->page _count corruption.
  2012-06-08 20:15   ` Christoph Lameter
@ 2012-06-08 20:23     ` Pravin Shelar
  2012-06-08 21:19       ` Andrew Morton
  2012-06-08 20:32     ` Andrew Morton
  1 sibling, 1 reply; 8+ messages in thread
From: Pravin Shelar @ 2012-06-08 20:23 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Andrew Morton, penberg, aarcange, linux-mm, abhide

On Fri, Jun 8, 2012 at 1:15 PM, Christoph Lameter <cl@linux.com> wrote:
> On Fri, 8 Jun 2012, Andrew Morton wrote:
>
>> OK.  I assume this bug has been there for quite some time.
>
> Well the huge pages refcount tricks caused the issue.
>
>> How serious is it?  Have people been reporting it in real workloads?
>> How to trigger it?  IOW, does this need -stable backporting?
>
> Possibly.

If this patch is getting backported then we should also do the same for
5bf5f03c271907978 (mm: fix slab->page flags corruption), which fixes
another issue related to slub and huge page sharing.

>
>> Also, someone forgot to document these:
>>
>>                               struct {
>>                                       unsigned inuse:16;
>>                                       unsigned objects:15;
>>                                       unsigned frozen:1;
>>                               };
>
> So far I thought that the field names are pretty clear on their own.
>


* Re: [Resend PATCH v2] mm: Fix slab->page _count corruption.
  2012-06-08 20:15   ` Christoph Lameter
  2012-06-08 20:23     ` Pravin Shelar
@ 2012-06-08 20:32     ` Andrew Morton
  2012-06-08 21:25       ` Christoph Lameter
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2012-06-08 20:32 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pravin B Shelar, penberg, aarcange, linux-mm, abhide

On Fri, 8 Jun 2012 15:15:50 -0500 (CDT)
Christoph Lameter <cl@linux.com> wrote:

> On Fri, 8 Jun 2012, Andrew Morton wrote:
> 
> > OK.  I assume this bug has been there for quite some time.
> 
> Well the huge pages refcount tricks caused the issue.
> 
> > How serious is it?  Have people been reporting it in real workloads?
> > How to trigger it?  IOW, does this need -stable backporting?
> 
> Possibly.

That was all admirably indecisive.

> > Also, someone forgot to document these:
> >
> > 				struct {
> > 					unsigned inuse:16;
> > 					unsigned objects:15;
> > 					unsigned frozen:1;
> > 				};
> 
> So far I thought that the field names are pretty clear on their own.

Kidding?  I had to grep the tree just to find out which subsystem owns
these.


* Re: [Resend PATCH v2] mm: Fix slab->page _count corruption.
  2012-06-08 20:23     ` Pravin Shelar
@ 2012-06-08 21:19       ` Andrew Morton
  2012-06-11 18:31         ` Pravin Shelar
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2012-06-08 21:19 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Christoph Lameter, penberg, aarcange, linux-mm, abhide

On Fri, 8 Jun 2012 13:23:56 -0700
Pravin Shelar <pshelar@nicira.com> wrote:

> On Fri, Jun 8, 2012 at 1:15 PM, Christoph Lameter <cl@linux.com> wrote:
> > On Fri, 8 Jun 2012, Andrew Morton wrote:
> >
> >> OK.  I assume this bug has been there for quite some time.
> >
> > Well the huge pages refcount tricks caused the issue.
> >
> >> How serious is it?  Have people been reporting it in real workloads?
> >> How to trigger it?  IOW, does this need -stable backporting?
> >
> > Possibly.
> 
> If this patch is getting backported then we should also do the same for
> 5bf5f03c271907978 (mm: fix slab->page flags corruption), which fixes
> another issue related to slub and huge page sharing.

Well I don't know if either are getting backported yet.

To decide that we would have to understand the end-user impact of the
bug(s).  Please tell us?


* Re: [Resend PATCH v2] mm: Fix slab->page _count corruption.
  2012-06-08 20:32     ` Andrew Morton
@ 2012-06-08 21:25       ` Christoph Lameter
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Lameter @ 2012-06-08 21:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Pravin B Shelar, penberg, aarcange, linux-mm, abhide

On Fri, 8 Jun 2012, Andrew Morton wrote:

> > So far I thought that the field names are pretty clear on their own.
>
> Kidding?  I had to grep the tree just to find out which subsystem owns
> these.

Reading the comments a couple of lines up would have helped as well.

But anyways we are already adding more comments in the upcoming patchsets
for the next merge.


* Re: [Resend PATCH v2] mm: Fix slab->page _count corruption.
  2012-06-08 21:19       ` Andrew Morton
@ 2012-06-11 18:31         ` Pravin Shelar
  0 siblings, 0 replies; 8+ messages in thread
From: Pravin Shelar @ 2012-06-11 18:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, penberg, aarcange, linux-mm, abhide,
	Jesse Gross

On Fri, Jun 8, 2012 at 2:19 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Fri, 8 Jun 2012 13:23:56 -0700
> Pravin Shelar <pshelar@nicira.com> wrote:
>
>> On Fri, Jun 8, 2012 at 1:15 PM, Christoph Lameter <cl@linux.com> wrote:
>> > On Fri, 8 Jun 2012, Andrew Morton wrote:
>> >
>> >> OK.  I assume this bug has been there for quite some time.
>> >
>> > Well the huge pages refcount tricks caused the issue.
>> >
>> >> How serious is it?  Have people been reporting it in real workloads?
>> >> How to trigger it?  IOW, does this need -stable backporting?
>> >
>> > Possibly.
>>
>> If this patch is getting backported then we should also do the same for
>> 5bf5f03c271907978 (mm: fix slab->page flags corruption), which fixes
>> another issue related to slub and huge page sharing.
>
> Well I don't know if either are getting backported yet.
>
> To decide that we would have to understand the end-user impact of the
> bug(s).  Please tell us?
>

We are working on zero-copy I/O over skbs in Open vSwitch. That is when we
saw this panic: we tried to get_page() on skb linear data allocated from
slub. But then I realized that it could potentially affect other
subsystems as well, e.g. xfs and ocfs, which do page struct updates on
slub objects.
