cgroups to prevent OSD from taking down a whole machine

* cgroups to prevent OSD from taking down a whole machine
@ 2013-02-08 14:46 Wido den Hollander
  2013-02-08 14:52 ` Mark Nelson
  2013-02-08 14:55 ` Andrey Korolyov
  0 siblings, 2 replies; 4+ messages in thread
From: Wido den Hollander @ 2013-02-08 14:46 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Hi,

Has anybody tried this yet?

After running into the memory leaks during scrubbing [0], I started
thinking about a way to limit each OSD to a specific amount of memory.

Take a machine with 32GB of memory and 4 OSDs: you might want to limit
each OSD to 8GB, so that a leaking OSD only gets itself killed instead
of taking the whole machine down.
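
To make the arithmetic above concrete, here is a minimal sketch in Python,
assuming the cgroup v1 memory controller (its memory.limit_in_bytes and
cgroup.procs files). The directory layout, the helper names and the optional
reserve_bytes parameter are illustrative assumptions, not something proposed
in this thread.

```python
from pathlib import Path

GIB = 1024 ** 3


def per_osd_limit_bytes(total_bytes: int, num_osds: int, reserve_bytes: int = 0) -> int:
    """Split memory evenly across OSDs after setting aside an optional reserve.

    reserve_bytes is a hypothetical knob for memory left to the OS; the
    example in the mail uses no reserve (32GB / 4 OSDs = 8GB each).
    """
    if num_osds <= 0:
        raise ValueError("num_osds must be positive")
    usable = total_bytes - reserve_bytes
    if usable <= 0:
        raise ValueError("reserve leaves no memory for the OSDs")
    return usable // num_osds


def apply_limit(cgroup_dir: Path, limit_bytes: int, pid: int) -> None:
    """Create a memory cgroup (v1 layout), set its hard limit, move pid into it.

    cgroup_dir is assumed to live under a mounted memory controller, for
    example /sys/fs/cgroup/memory/ceph/osd.0 (hypothetical path).
    """
    cgroup_dir.mkdir(parents=True, exist_ok=True)
    (cgroup_dir / "memory.limit_in_bytes").write_text(f"{limit_bytes}\n")
    # Writing a PID to cgroup.procs moves the whole process, threads included.
    (cgroup_dir / "cgroup.procs").write_text(f"{pid}\n")


if __name__ == "__main__":
    limit = per_osd_limit_bytes(32 * GIB, 4)
    print(f"per-OSD limit: {limit} bytes ({limit // GIB} GiB)")
```

Calling apply_limit() against a real cgroup mount needs root and the PID of
a running ceph-osd process; the __main__ part only prints the computed limit.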

I think I'll give it a try on a couple of machines, but first I wanted
to ask whether anybody has tried this already or sees any downsides.

We use cgroups in the CloudStack project (through libvirt) to prevent
a memory leak in one KVM process from taking down a whole hypervisor,
and it works pretty well there.

Suggestions or comments?

Wido

[0]: http://tracker.ceph.com/issues/3883


* Re: cgroups to prevent OSD from taking down a whole machine
  2013-02-08 14:46 cgroups to prevent OSD from taking down a whole machine Wido den Hollander
@ 2013-02-08 14:52 ` Mark Nelson
  2013-02-08 14:58   ` Andrey Korolyov
  2013-02-08 14:55 ` Andrey Korolyov
  1 sibling, 1 reply; 4+ messages in thread
From: Mark Nelson @ 2013-02-08 14:52 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: ceph-devel@vger.kernel.org

I've been thinking about using this for machines where people want to
run OSDs and VMs on the same nodes: keep Ceph and the VMs in separate
cgroups to help keep them from interfering with each other.

It won't help with memory or QPI/HyperTransport throughput (unless you
have them segmented on different sockets), but it should help in some
other cases.

Mark


* Re: cgroups to prevent OSD from taking down a whole machine
  2013-02-08 14:52 ` Mark Nelson
@ 2013-02-08 14:58   ` Andrey Korolyov
  0 siblings, 0 replies; 4+ messages in thread
From: Andrey Korolyov @ 2013-02-08 14:58 UTC (permalink / raw)
  To: Mark Nelson; +Cc: Wido den Hollander, ceph-devel@vger.kernel.org

On Fri, Feb 8, 2013 at 6:52 PM, Mark Nelson <mark.nelson@inktank.com> wrote:
> I've been thinking about using this for machines where people want to run
> OSDs and VMs on the same nodes.  Keep Ceph and the VMs in separate cgroups
> to help keep them from interfering with each other.
>
> It won't help with memory or QPI/HyperTransport throughput (unless you have
> them segmented on different sockets), but it should help in some other
> cases.
>
Unless we are talking about strict memory pinning and the related
performance boost, I don't see how that matters for cgroups by themselves.


* Re: cgroups to prevent OSD from taking down a whole machine
  2013-02-08 14:46 cgroups to prevent OSD from taking down a whole machine Wido den Hollander
  2013-02-08 14:52 ` Mark Nelson
@ 2013-02-08 14:55 ` Andrey Korolyov
  1 sibling, 0 replies; 4+ messages in thread
From: Andrey Korolyov @ 2013-02-08 14:55 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: ceph-devel@vger.kernel.org

On Fri, Feb 8, 2013 at 6:46 PM, Wido den Hollander <wido@widodh.nl> wrote:
> Hi,
>
> Has anybody tried this yet?
>
> Running into the memory leaks during scrubbing[0] I started thinking about a
> way to limit OSDs to a specific amount of memory.
>
> A machine has 32GB of memory, 4 OSDs, so you might want to limit each OSD to
> 8GB so it can't take the whole machine down and would only kill itself.
>
> I think I'll give it a try on a couple of machines, but I just wanted to see
> if anybody has tried this already or sees any downsides to this?
>

Yep, it works fine, although I'd recommend taking a look at an
oom-delay patch to handle OOM situations inside the cgroup more
gracefully. And of course you'll pay for using the memory cgroup with
a few percent of overall node performance.
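
The oom-delay patch is not shown in this thread, so the following does not
reproduce it. As a rough sketch of watching OOM state inside a cgroup with
the stock cgroup v1 memory controller, assuming its memory.oom_control and
memory.failcnt files are present, one could read them like this (the
function names are made up for illustration):

```python
from pathlib import Path


def parse_oom_control(text: str) -> dict:
    """Parse 'key value' lines as found in cgroup v1 memory.oom_control."""
    state = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name <integer>" pairs.
        if len(parts) == 2 and parts[1].isdigit():
            state[parts[0]] = int(parts[1])
    return state


def read_memory_pressure(cgroup_dir: Path) -> dict:
    """Return the OOM flags plus the count of times the limit was hit."""
    state = parse_oom_control((cgroup_dir / "memory.oom_control").read_text())
    state["failcnt"] = int((cgroup_dir / "memory.failcnt").read_text().strip())
    return state
```

A non-zero under_oom value or a growing failcnt for an OSD's cgroup would
suggest that the OSD is running into its limit.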

