* memory leak with linux-3.14.16
@ 2014-08-16  8:40 Peter Koch
  2014-08-16 10:10 ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Koch @ 2014-08-16  8:40 UTC (permalink / raw)
  To: linux-raid

Dear readers,

I am shrinking my raid10 array, consisting of 16 2TB disks,
to 13 disks right now. The reshape has been running for 2 hours and
I'm constantly observing /proc/mdstat and /proc/meminfo

SUnreclaim is constantly growing while MemFree and
MemAvailable are decreasing.

It looks like linux 3.14.16 is leaking memory at a rate
of 4GB per 1TB of reshaped data.

My machine has 32GB of RAM, and if I extrapolate the current
memory values I will run out of memory at 80% of the reshape
operation. This is exactly what happened to me with
linux-3.14.12.

Do I need linux-3.14.17 ??

Kind regards

Peter Koch


* Re: memory leak with linux-3.14.16
  2014-08-16  8:40 Peter Koch
@ 2014-08-16 10:10 ` NeilBrown
  0 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2014-08-16 10:10 UTC (permalink / raw)
  To: Peter Koch; +Cc: linux-raid

On Sat, 16 Aug 2014 10:40:54 +0200 mdraid.pkoch@dfgh.net (Peter Koch) wrote:

> Dear readers,
> 
> I am shrinking my raid10 array, consisting of 16 2TB disks,
> to 13 disks right now. The reshape has been running for 2 hours and
> I'm constantly observing /proc/mdstat and /proc/meminfo
> 
> SUnreclaim is constantly growing while MemFree and
> MemAvailable are decreasing.
> 
> It looks like linux 3.14.16 is leaking memory at a rate
> of 4GB per 1TB of reshaped data.

Hmm... don't know about that bug.
Does /proc/slabinfo show some slab much bigger than the rest?

> 
> My machine has 32GB of RAM, and if I extrapolate the current
> memory values I will run out of memory at 80% of the reshape
> operation. This is exactly what happened to me with
> linux-3.14.12.

If you gracefully shut down and reboot, it should pick up where it left
off but with more memory free.

> 
> Do I need linux-3.14.17 ??

The only bug I know of was fixed in 3.14.6.
I said 3.14.16 before - sorry about the typo.

I'll see if I can reproduce it myself some time next week.

NeilBrown



> 
> Kind regards
> 
> Peter Koch



* Re: memory leak with linux-3.14.16
@ 2014-08-16 13:45 Peter Koch
  2014-08-16 20:36 ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Koch @ 2014-08-16 13:45 UTC (permalink / raw)
  To: linux-raid

Dear Neil,

> The only bug I know of was fixed in 3.14.6.
> I said 3.14.16 before - sorry about the typo.

No wonder 3.14.16 behaves exactly as 3.14.12 did

My server now has reshaped 2.28TB and lost 9.14GB of RAM,
so memory is still leaking at 4GB per TB.

> Hmm... don't know about that bug.
> Does /proc/slabinfo show some slab much bigger than the rest?

I'm not a memory expert, so I made a copy of /proc/slabinfo and
compared this copy with /proc/slabinfo in an endless loop.

There are two values which are unusually high and
are going up constantly:

radix_tree_node   403942
kmalloc-256       38283576

38283576 chunks of 256 bytes are exactly those 9.14GB of
RAM that have leaked so far.
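
For what it's worth, the comparison boiled down to something like the
sketch below (illustrative only, not my actual script - the file name
and details are made up for this mail). It just prints the active object
counts of the two suspicious caches once per second:

/* slabwatch.c - watch kmalloc-256 and radix_tree_node grow.
 * Needs root to read /proc/slabinfo on most systems. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        char line[512];

        for (;;) {
                FILE *f = fopen("/proc/slabinfo", "r");

                if (!f) {
                        perror("/proc/slabinfo");
                        return 1;
                }
                while (fgets(line, sizeof(line), f)) {
                        char name[64];
                        unsigned long active;

                        /* data lines start with "<name> <active_objs> ..." */
                        if (sscanf(line, "%63s %lu", name, &active) != 2)
                                continue;
                        if (!strcmp(name, "kmalloc-256") ||
                            !strcmp(name, "radix_tree_node"))
                                printf("%-20s %lu\n", name, active);
                }
                fclose(f);
                sleep(1);
        }
}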

> If you gracefully shut down and reboot, it should pick up where it left
> off but with more memory free.

Last time my machine crashed when about 10TB of data had been reshaped,
and my machine has 32GB of RAM plus 8GB of swap. According to my
calculations I need 13TB * 4GB/TB = 52GB of RAM, so adding another
20GB of swap space should keep my server running until the reshape
has finished.

Kind regards

Peter


* Re: memory leak with linux-3.14.16
  2014-08-16 13:45 memory leak with linux-3.14.16 Peter Koch
@ 2014-08-16 20:36 ` NeilBrown
  0 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2014-08-16 20:36 UTC (permalink / raw)
  To: Peter Koch; +Cc: linux-raid

On Sat, 16 Aug 2014 15:45:05 +0200 mdraid.pkoch@dfgh.net (Peter Koch) wrote:

> Dear Neil,
> 
> > The only bug I know of was fixed in 3.14.6.
> > I said 3.14.16 before - sorry about the typo.
> 
> No wonder 3.14.16 behaves exactly as 3.14.12 did
> 
> My server now has reshaped 2.28TB and lost 9.14GB of RAM,
> so memory is still leaking at 4GB per TB.
> 
> > Hmm... don't know about that bug.
> > Does /proc/slabinfo show some slab much bigger than the rest?
> 
> I'm not a memory expert, so I made a copy of /proc/slabinfo and
> compared this copy with /proc/slabinfo in an endless loop.
> 
> There are two values which are unusually high and
> are going up constantly:
> 
> radix_tree_node   403942
> kmalloc-256       38283576
> 
> 38283576 chunks of 256 bytes are exactly those 9.14GB of
> RAM that have leaked so far.
> 
> > If you gracefully shut down and reboot, it should pick up where it left
> > off but with more memory free.
> 
> Last time my machine crashed when about 10TB of data had been reshaped,
> and my machine has 32GB of RAM plus 8GB of swap. According to my
> calculations I need 13TB * 4GB/TB = 52GB of RAM, so adding another
> 20GB of swap space should keep my server running until the reshape
> has finished.

That won't help.  Data stored in kmalloc-256 won't get swapped out - it stays
in RAM.  So unless you can hot-plug 20GB of RAM ....

NeilBrown


* Re: memory leak with linux-3.14.16
@ 2014-08-17  8:55 Peter Koch
  2014-08-18  5:01 ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Koch @ 2014-08-17  8:55 UTC (permalink / raw)
  To: linux-raid

Dear Neil,

> That won't help.  Data stored in kmalloc-256 won't get swapped out - it stays
> in RAM.  So unless you can hot-plug 20GB of RAM ....

Thanks for the info. I read it when almost all of my memory was already
in kmalloc-256; half an hour later my machine would have crashed despite
the increased swap space. So I was able to do a graceful reboot, and the
reshape has since finished successfully.

Now I'm going to add those three drives to my array one by one. I'm doing
this because I cannot physically swap drives 13 and 14 (the next maintenance
window for such an operation would be in October). I will grow the array
to 14 drives today, since my main concern is to put the data on an even
number of disks where the mirrors are separated correctly.

Then I will add drive 14 and 15 in one step.

By the way: will a raid10 array with an even number of drives survive
if half of the drives go offline during a reshape operation that
adds an even number of drives?

Should I download linux 3.14.17 sources and wait for a patch? If only
a missing kfree() has to be added somewhere I can do that by hand and
recompile 3.14.16.

Would it help you if I set up another machine and tried to reproduce the
problem with linux 3.15.x, 3.16.x and 3.17.x?

Peter


* Re: memory leak with linux-3.14.16
  2014-08-17  8:55 Peter Koch
@ 2014-08-18  5:01 ` NeilBrown
  0 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2014-08-18  5:01 UTC (permalink / raw)
  To: Peter Koch; +Cc: linux-raid

On Sun, 17 Aug 2014 10:55:04 +0200 mdraid.pkoch@dfgh.net (Peter Koch) wrote:

> Dear Neil,
> 
> > That won't help.  Data stored in kmalloc-256 won't get swapped out - it stays
> > in RAM.  So unless you can hot-plug 20GB of RAM ....
> 
> Thanks for the info. I read it when almost all of my memory was already
> in kmalloc-256; half an hour later my machine would have crashed despite
> the increased swap space. So I was able to do a graceful reboot, and the
> reshape has since finished successfully.
> 
> Now I'm going to add those three drives to my array one by one. I'm doing
> this because I cannot physically swap drives 13 and 14 (the next maintenance
> window for such an operation would be in October). I will grow the array
> to 14 drives today, since my main concern is to put the data on an even
> number of disks where the mirrors are separated correctly.
> 
> Then I will add drive 14 and 15 in one step.
> 
> By the way: will a raid10 array with an even number of drives survive
> if half of the drives go offline during a reshape operation that
> adds an even number of drives?

Should do, yes.

> 
> Should I download linux 3.14.17 sources and wait for a patch? If only
> a missing kfree() has to be added somewhere I can do that by hand and
> recompile 3.14.16.

The following pair of patches should fix your problems.  They should be easy
to apply by hand to whatever kernel you want to use.

> 
> Would it help you if I set up another machine and tried to reproduce the
> problem with linux 3.15.x, 3.16.x and 3.17.x?

No thanks.  Memory leaks are quite easy to find - just enable
CONFIG_DEBUG_KMEMLEAK and there they are....
I found about 4, but these are the only important ones.  The second one might
not seem so important from the description, but it is.  Not freeing that
memory causes it to be re-used in a slightly incorrect way.
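
If you want to look at the reports yourself, it is just a matter of poking
the debugfs file - roughly like the sketch below (a minimal sketch, assuming
CONFIG_DEBUG_KMEMLEAK is enabled and debugfs is mounted at
/sys/kernel/debug; the file name and structure are made up for this mail):

/* kmemleak-dump.c - trigger a kmemleak scan, then dump the recorded
 * reports.  Run as root. */
#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/sys/kernel/debug/kmemleak", "r+");
        char buf[4096];
        size_t n;

        if (!f) {
                perror("/sys/kernel/debug/kmemleak");
                return 1;
        }
        fputs("scan\n", f);             /* force an immediate scan */
        fflush(f);
        rewind(f);
        while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
                fwrite(buf, 1, n, stdout);
        fclose(f);
        return 0;
}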

Thanks for the report.

NeilBrown


From 83a1ebfa292042b11b1e173b3fc50f243cb01c8b Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Mon, 18 Aug 2014 13:56:38 +1000
Subject: [PATCH] md/raid10: fix memory leak when reshaping a RAID10.

raid10 reshape clears unwanted bits from a bio->bi_flags using
a method which, while clumsy, worked until 3.10 when BIO_OWNS_VEC
was added.
Since then it clears that bit but shouldn't.  This results in a
memory leak.

So change to use the approved method of clearing unwanted bits.

As this causes a memory leak which can consume all memory,
the fix is suitable for -stable.

Fixes: a38352e0ac02dbbd4fa464dc22d1352b5fbd06fd
Cc: stable@vger.kernel.org (v3.10+)
Reported-by: mdraid.pkoch@dfgh.net (Peter Koch)
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index b08c18871323..d9073a10f2f2 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4410,7 +4410,7 @@ read_more:
 	read_bio->bi_private = r10_bio;
 	read_bio->bi_end_io = end_sync_read;
 	read_bio->bi_rw = READ;
-	read_bio->bi_flags &= ~(BIO_POOL_MASK - 1);
+	read_bio->bi_flags &= (~0UL << BIO_RESET_BITS);
 	read_bio->bi_flags |= 1 << BIO_UPTODATE;
 	read_bio->bi_vcnt = 0;
 	read_bio->bi_iter.bi_size = 0;
From afad1968a35676fa39ebe64603ffd7fbf4ceea10 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Mon, 18 Aug 2014 13:59:50 +1000
Subject: [PATCH] md/raid10: Fix memory leak when raid10 reshape completes.

When a raid10 commences a resync/recovery/reshape it allocates
some buffer space.
When a resync/recovery completes the buffer space is freed.  But not
when the reshape completes.
This can result in a small memory leak.

Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index d9073a10f2f2..a46124ecafc7 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2953,6 +2953,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr,
 		 */
 		if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)) {
 			end_reshape(conf);
+			close_sync(conf);
 			return 0;
 		}
 
