From: Stefan Kooman <stefan-68+x73Hep80@public.gmane.org>
To: by morphin <morphinwithyou-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org,
ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: Mimic cluster is offline and not healing
Date: Fri, 28 Sep 2018 09:09:23 +0200
Message-ID: <20180928070923.GC17567@shell.dmz.bit.nl>
In-Reply-To: <CAE-AtHo2UVSFcMHMXszSPJXs=BRKb0PELzryMyu4LVEv910pQQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Quoting by morphin (morphinwithyou-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
> Good news... :)
>
> After trying everything, I decided to re-create my MONs from the OSDs,
> and I used this script:
> https://paste.ubuntu.com/p/rNMPdMPhT5/
>
> And it worked!!!
Congrats!
> I think that when the 2 servers crashed and came back at the same time,
> the MONs somehow got confused and the maps got corrupted.
> After re-creation all the MONs had the same map, so it worked.
> But I still don't know how the hell the MONs can cause endless 95% I/O ???
> This is a bug anyway, and if you don't want to live with the problem then
> do not "enable" your MONs. Just start them manually! Another tough lesson.
The only time we needed to manually start the mons was at "bootstrap"
time. After a reboot they are brought up by systemd ... and it keeps on
working. Have you rebooted your mon(s) after the manual start?
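To check whether systemd will bring a mon up after a reboot, something like the sketch below may help. It assumes the stock ceph-mon@<id> template unit and that the mon id equals the short hostname; both are assumptions, and the helper names mon_unit and check_mon are made up for illustration.

```shell
# Build the systemd unit name for a monitor id.
# Assumption: the stock "ceph-mon@<id>.service" template unit is in use.
mon_unit() {
    printf 'ceph-mon@%s.service\n' "$1"
}

# Print whether the mon unit is enabled (starts at boot) and active (running now).
# Hypothetical helper; it only wraps two standard systemctl queries.
check_mon() {
    mon_unit_name=$(mon_unit "$1")
    systemctl is-enabled "$mon_unit_name"
    systemctl is-active "$mon_unit_name"
}
```

Usage, on the mon host: `check_mon "$(hostname -s)"`. If the first line prints "disabled", the mon will not come back by itself after a reboot.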
>
> ceph -s: https://paste.ubuntu.com/p/m3hFF22jM9/
>
> As you can see below, some of the OSDs are still down, and when I try
> to start them they don't start.
> Check start log: https://paste.ubuntu.com/p/ZJQG4khdbx/
> Debug log: https://paste.ubuntu.com/p/J3JyGShHym/
>
> What can we do about the problem?
Apply PR https://github.com/ceph/ceph/pull/24064
I see that you are running Mimic 13.2.1 ... 13.2.2 was released a few
days ago. Not sure if this fix has made it into 13.2.2.
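One possible way to check this in a local clone of the ceph repository is sketched below. The helper names are made up for illustration, and the local branch name pr-<number> is an arbitrary choice. Note the caveat: if the fix was backported as a cherry-pick, the ancestry test fails even though the change is present, so a negative result is not conclusive.

```shell
# Git ref under which GitHub exposes the head of a pull request.
pr_ref() {
    printf 'pull/%s/head\n' "$1"
}

# Fetch the PR head into a local branch named pr-<number>.
fetch_pr() {
    git fetch https://github.com/ceph/ceph.git "$(pr_ref "$1"):pr-$1"
}

# Succeeds (exit 0) if the fetched PR head is an ancestor of the given tag.
# A cherry-picked backport will NOT be detected by this test.
pr_in_tag() {
    git merge-base --is-ancestor "pr-$1" "$2"
}
```

Usage: `fetch_pr 24064 && pr_in_tag 24064 v13.2.2 && echo "included"`.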
> What is the cause of the problem?
Somehow it looks like you hit this issue:
https://tracker.ceph.com/issues/24866
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info-68+x73Hep80@public.gmane.org