From: Milosz Tanski <milosz@adfin.com>
To: Vit Yenukas <Vit.Yenukas@twosigma.com>,
Gregory Farnum <greg@inktank.com>
Cc: Wido den Hollander <wido@42on.com>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: Adding a delay when restarting all OSDs on a host
Date: Wed, 23 Jul 2014 16:57:22 -0400
Message-ID: <53D021B2.7080407@adfin.com>
In-Reply-To: <5F401C8095B99D4FABC1B919C1BE599A01029F8655@EXMBNJE1.ad.twosigma.com>
The default stack size shouldn't matter, at least not on a kernel with overcommit enabled (the default). Most threads use only a small fraction of their stack, so the kernel never allocates physical pages for the untouched part. My bet is on some other resource.
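To make that concrete, here is a rough sketch for the affected host. It computes how much address space the thread stacks would reserve using the figures quoted below, then prints a few kernel limits that can also make thread creation fail. The function names (stack_reservation_gib, show_limit, report_limits) are made up for this example, and the list of limits is only a set of candidates to check, not a diagnosis.

```shell
#!/bin/sh
# Worst-case virtual address space reserved for thread stacks, in GiB
# (integer division). This is address space, not resident memory: a stack
# page is only backed by RAM once a thread actually touches it.
stack_reservation_gib() {
    threads=$1
    stack_mb=$2
    echo $(( threads * stack_mb / 1024 ))
}

# Print one kernel tunable if it is readable (Linux /proc paths).
show_limit() {
    if [ -r "$1" ]; then
        printf '%-36s %s\n' "$1" "$(cat "$1")"
    else
        printf '%-36s %s\n' "$1" "(not readable)"
    fi
}

# Candidate limits to inspect when thread or process creation fails.
report_limits() {
    echo "stacks, 40000 threads x 8 MB: $(stack_reservation_gib 40000 8) GiB reserved"
    show_limit /proc/sys/vm/overcommit_memory
    show_limit /proc/sys/kernel/threads-max
    show_limit /proc/sys/kernel/pid_max
    show_limit /proc/sys/vm/max_map_count
    # Not every shell supports "ulimit -u", hence the fallback.
    echo "max user processes: $(ulimit -u 2>/dev/null || echo unknown)"
}
```

Running report_limits on the 72-disk box and comparing the values against the thread count at the time of the failure should show whether one of these limits is close to being hit.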
On 7/23/14, 3:22 PM, Vit Yenukas wrote:
> Just a fun fact about resource consumption during the startup sequence:
> we ran out of memory on a 72-disk server with 256GB RAM during startup.
> ceph-osd dies with 'can not fork' and dumps core. There were in excess of 40 thousand
> threads when this began to happen. With the default thread stack size being 8MB, no wonder :)
> Note that this was an experimental setup with just one node, so all OSD peering happens on the same host.
> Just for the heck of it, I halved the number of OSDs (to 36) by setting up a soft RAID-0 for each disk pair.
> This worked after some tweaking of the udev rules (which ignore 'md' block devs). I'm not sure if we're going to see
> the same problem with a real cluster (18 such 72-disk nodes), with EC 9-3.
> Also, I'm not sure if reducing the user process stack to 4MB would be a good idea.
>
> On 07/22/2014 08:08 PM, Gregory Farnum wrote:
>
>> On Tue, Jul 22, 2014 at 6:19 AM, Wido den Hollander <wido@42on.com> wrote:
>>> Hi,
>>>
>>> Currently on Ubuntu with Upstart when you invoke a restart like this:
>>>
>>> $ sudo restart ceph-osd-all
>>>
>>> It will restart all OSDs at once, which can increase the load on the system
>>> quite a bit.
>>>
>>> It's better to restart all OSDs by restarting them one by one:
>>>
>>> $ sudo restart ceph-osd id=X
>>>
>>> But you then have to figure out all the IDs by doing a find in
>>> /var/lib/ceph/osd and that's more manual work.
>>>
>>> I'm thinking of patching the init scripts to allow something like this:
>>>
>>> $ sudo restart ceph-osd-all delay=180
>>>
>>> It then waits 180 seconds between each OSD restart, making the process even
>>> smoother.
>>>
>>> I know there are currently sysvinit, upstart and systemd scripts, so it has
>>> to be implemented in various places, but how does the general idea sound?
>> That sounds like a good idea to me. I presume you're meaning to
>> actually delay the restarts, not just turning them on, so that the
>> daemons all remain alive (that's what it sounds like to me here, just
>> wanted to clarify).
>> -Greg
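Until the init scripts grow a delay option, the staggered restart discussed above can be approximated with a small wrapper. This is only a sketch. It assumes the OSD data directories under /var/lib/ceph/osd are named <cluster>-<id> (for example ceph-12) and that the per-OSD Upstart job is invoked as "restart ceph-osd id=N". The function names and the RESTART_CMD override are invented for this example and are not part of Ceph.

```shell
#!/bin/sh
# Print the numeric OSD ids found under the OSD root, one per line, sorted.
# Assumes directory names of the form <cluster>-<id>, e.g. ceph-12.
list_osd_ids() {
    osd_root=${1:-/var/lib/ceph/osd}
    for dir in "$osd_root"/*-*; do
        [ -d "$dir" ] || continue
        id=${dir##*-}
        case $id in
            ''|*[!0-9]*) continue ;;   # skip anything without a numeric id
        esac
        echo "$id"
    done | sort -n
}

# Restart the OSDs one at a time, sleeping between consecutive restarts so
# that the remaining daemons stay up while each one comes back.
# RESTART_CMD is a hypothetical hook for dry runs, e.g. RESTART_CMD=echo.
restart_osds_staggered() {
    delay=${1:-180}
    osd_root=${2:-/var/lib/ceph/osd}
    first=1
    for id in $(list_osd_ids "$osd_root"); do
        if [ "$first" -eq 0 ]; then
            sleep "$delay"
        fi
        first=0
        ${RESTART_CMD:-sudo restart ceph-osd} "id=$id"
    done
}
```

For a dry run, "RESTART_CMD=echo restart_osds_staggered 0" prints the id arguments in the order they would be restarted without touching any daemon. A fixed sleep is the simplest option. Waiting for the cluster to report healthy before moving on would be safer, but is left out of this sketch.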