linux-raid.vger.kernel.org archive mirror
* Re: memory leak with linux-3.14.16
@ 2014-08-17  8:55 Peter Koch
  2014-08-18  5:01 ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Koch @ 2014-08-17  8:55 UTC (permalink / raw)
  To: linux-raid

Dear Neil,

> That won't help.  Data stored in kmalloc-256 won't get swapped out - it stays
> in RAM.  So unless you can hot-plug 20Gig of RAM ....

Thanks for the info. I read it when almost all my memory was in
kmalloc-256. Half an hour later my machine would have crashed despite
the increased swap space. So I could do a graceful reboot, and the reshape
has successfully finished in the meantime.

Now I'm going to add those three drives to my array one by one. I'm doing
this because I cannot physically swap drives 13 and 14 (the next maintenance
window for such an operation would be in October). I will grow the array
to 14 drives today, since my main concern is to put the data on an even
number of disks where the mirrors are separated correctly.

Then I will add drives 14 and 15 in one step.

By the way: Will a raid10 array with an even number of drives survive
if one half of the drives goes offline during a reshape operation that
adds an even number of drives?

Should I download linux 3.14.17 sources and wait for a patch? If only
a missing kfree() has to be added somewhere I can do that by hand and
recompile 3.14.16.

Would it help you if I set up another machine and tried to reproduce the
problem with linux 3.15.x, 3.16.x and 3.17.x?

Peter

* Re: memory leak with linux-3.14.16
@ 2014-08-16 13:45 Peter Koch
  2014-08-16 20:36 ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Koch @ 2014-08-16 13:45 UTC (permalink / raw)
  To: linux-raid

Dear Neil,

> The only bug I know of was fixed in 3.14.6.
> I said 3.14.16 before - sorry about typo.

No wonder 3.14.16 behaves exactly as 3.14.12 did.

My server has now reshaped 2.28TB and lost 9.14GB of RAM,
so memory is still leaking at 4GB per TB.

> Hmm... don't know about that bug.
> Does /proc/slabinfo show some slab much bigger than the rest?

I'm not a memory expert, so I made a copy of /proc/slabinfo and
compared this copy with /proc/slabinfo in an endless loop.
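
The comparison described above could be sketched roughly as follows. This is only an illustration, not the script actually used: the function names parse_slabinfo and growth are made up here, and the code assumes the slabinfo 2.x column order (name, active_objs, num_objs, objsize, ...).

```python
def parse_slabinfo(text):
    """Return {cache_name: (num_objs, objsize_bytes)} from slabinfo 2.x text."""
    caches = {}
    for line in text.splitlines():
        # Skip blank lines, the version banner and the column-header comment.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        # Assumed columns: name active_objs num_objs objsize ...
        caches[fields[0]] = (int(fields[2]), int(fields[3]))
    return caches


def growth(before, after):
    """Return [(cache_name, bytes_grown)] for caches that grew, largest first."""
    deltas = []
    for name, (objs, size) in after.items():
        old_objs, _ = before.get(name, (0, size))
        grown = (objs - old_objs) * size
        if grown > 0:
            deltas.append((name, grown))
    return sorted(deltas, key=lambda item: item[1], reverse=True)
```

On the affected machine one would feed it two reads of /proc/slabinfo taken some time apart (reading that file normally requires root); a leaking cache then shows up at the top of the list on every pass.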

There are two values which are unusually high and
are going up constantly:

radix_tree_node   403942
kmalloc-256       38283576

38283576 chunks of 256 bytes are exactly those 9.14GB of
RAM that have leaked so far.
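
As a quick check of that arithmetic, the following sketch multiplies the numbers out. It assumes that "GB" in this message means binary units (GiB); the constant names are chosen here for illustration only.

```python
OBJECTS = 38283576      # kmalloc-256 object count reported above
OBJECT_SIZE = 256       # bytes per object in the kmalloc-256 cache
RESHAPED_TB = 2.28      # data reshaped so far, as reported above

leaked_bytes = OBJECTS * OBJECT_SIZE
leaked_gib = leaked_bytes / 2**30          # assumption: GB here means GiB
rate_per_tb = leaked_gib / RESHAPED_TB

print(f"{leaked_bytes} bytes = {leaked_gib:.2f} GiB")
print(f"{rate_per_tb:.2f} GiB leaked per TB reshaped")
```

This prints 9800595456 bytes = 9.13 GiB and a rate of 4.00 GiB per TB, which agrees with the 9.14GB and 4GB per TB figures to within rounding.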

> If you gracefully shut down and reboot it should pick up where it left
> off but with more memory free.

Last time my machine crashed when about 10TB of data had been reshaped.
My machine has 32GB of RAM plus 8GB of swap. According to my
calculations I need 13TB * 4GB/TB = 52GB of RAM, so adding another
20GB of swap space should keep my server running until the reshape
has finished.
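
The estimate above can be written out as a small sketch. It follows the reasoning in this message and therefore treats RAM and swap as interchangeable; that is an assumption of the calculation, and the reply quoted in the follow-up message on this page points out that kmalloc-256 data is not swapped out. The constant names are illustrative.

```python
LEAK_GB_PER_TB = 4      # observed leak rate
RESHAPE_TB = 13         # total amount of data to reshape
RAM_GB = 32
SWAP_GB = 8
EXTRA_SWAP_GB = 20      # proposed additional swap

# Memory the leak would consume over the whole reshape.
needed_gb = RESHAPE_TB * LEAK_GB_PER_TB

# Point at which RAM plus existing swap would be exhausted
# (assumption: leaked memory could spill into swap).
crash_point_tb = (RAM_GB + SWAP_GB) / LEAK_GB_PER_TB

# Capacity after adding the extra swap.
total_with_extra_gb = RAM_GB + SWAP_GB + EXTRA_SWAP_GB

print(f"needed: {needed_gb} GB, available with extra swap: {total_with_extra_gb} GB")
print(f"exhaustion without extra swap after about {crash_point_tb:.0f} TB")
```

Under that assumption the numbers are consistent with the earlier crash at about 10TB: 40GB of RAM plus swap divided by 4GB/TB gives 10TB, and 60GB would cover the 52GB needed.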

Kind regards

Peter

* memory leak with linux-3.14.16
@ 2014-08-16  8:40 Peter Koch
  2014-08-16 10:10 ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Koch @ 2014-08-16  8:40 UTC (permalink / raw)
  To: linux-raid

Dear readers,

I am shrinking my raid10 array consisting of 16 2TB disks
to 13 disks right now. The reshape has been running for 2 hours and
I'm constantly observing /proc/mdstat and /proc/meminfo.

SUnreclaim is constantly growing while MemFree and
MemAvailable are decreasing.

Seems like linux 3.14.16 is leaking memory at a rate
of 4GB per 1TB of reshape data.
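
A rate like this can be derived from two /proc/meminfo samples, roughly as in the sketch below. The helper names are hypothetical, the amount of data reshaped between the samples is passed in by hand rather than parsed from /proc/mdstat, and "GB" is taken to mean GiB.

```python
def read_field_kb(meminfo_text, field):
    """Return the value of one /proc/meminfo field in kB."""
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name == field:
            return int(rest.split()[0])
    raise KeyError(field)


def leak_rate_gb_per_tb(before_text, after_text, reshaped_tb):
    """SUnreclaim growth in GiB per TB reshaped between two samples."""
    grown_kb = (read_field_kb(after_text, "SUnreclaim")
                - read_field_kb(before_text, "SUnreclaim"))
    return grown_kb / 2**20 / reshaped_tb
```

Called with two saved copies of /proc/meminfo and the number of TB reshaped in between, it returns the growth of unreclaimable slab memory per TB.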

My machine has 32GB of RAM, and if I extrapolate the current
memory values I will run out of memory at 80% of the reshape
operation. This is exactly what happened to me with
linux-3.14.12.

Do I need linux-3.14.17?

Kind regards

Peter Koch
