Re: why my cluster become unavailable (min_size of pool)

From: "hzwulibin" <hzwulibin@gmail.com>
To: Sage Weil <sage@newdream.net>, Haomai Wang <haomaiwang@gmail.com>
Cc: ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: why my cluster become unavailable (min_size of pool)
Date: Thu, 26 Nov 2015 15:54:04 +0800
Message-ID: <5656BA98.4090301@gmail.com>
In-Reply-To: <5652651C.8040902@gmail.com>

Hi, Sage

I have a question about the min_size setting of a pool.

The default value of min_size is 2, but with this setting, when two OSDs are down at the same time (meaning two replicas are lost), I/O is blocked.
We want to set min_size to 1 in our production environment, because we consider it a normal case for two OSDs (on different hosts, of course) to be down at the same time.

Is there any potential problem with this setting?

We are running version 0.80.10.
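
To make the min_size behaviour described above concrete, here is a minimal sketch in Python. It is a simplified model, not Ceph code: it assumes a pool with 3 replicas (the size is not stated in this message) and assumes the only rule is that a PG serves I/O while at least min_size replicas are up. The function name `pg_accepts_io` is hypothetical.

```python
def pg_accepts_io(replicas_up: int, min_size: int) -> bool:
    """Simplified rule: a PG serves I/O only while at least min_size replicas are up."""
    return replicas_up >= min_size


SIZE = 3  # assumed replica count; the thread does not state the pool's size

for min_size in (2, 1):
    for osds_down in range(SIZE + 1):
        replicas_up = SIZE - osds_down
        state = "serves I/O" if pg_accepts_io(replicas_up, min_size) else "blocked"
        print(f"min_size={min_size} osds_down={osds_down} -> {state}")
```

In this model, min_size 1 keeps I/O flowing with a single surviving replica, which also means that writes made in that window exist on only one OSD.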

Thanks!


------------------				 
hzwulibin
2015-11-26

-------------------------------------------------------------
From: "hzwulibin" <hzwulibin@gmail.com>
Sent: 2015-11-23 09:00
To: Sage Weil, Haomai Wang
Cc: ceph-devel
Subject: Re: why my cluster become unavailable

Hi, Sage

Thanks! I will try it during the next round of testing.

------------------				 
hzwulibin
2015-11-23

-------------------------------------------------------------
From: Sage Weil <sage@newdream.net>
Sent: 2015-11-22 01:49
To: Haomai Wang
Cc: Libin Wu, ceph-devel
Subject: Re: why my cluster become unavailable

On Sun, 22 Nov 2015, Haomai Wang wrote:
> On Thu, Nov 19, 2015 at 11:26 PM, Libin Wu <hzwulibin@gmail.com> wrote:
> > Hi, cephers
> >
> > I have a cluster of 6 OSD servers; every server has 8 OSDs.
> >
> > I marked 4 OSDs out on every server, and then my client I/O blocked.
> >
> > I rebooted my client and created a new rbd device, but the new
> > device can't accept writes either.
> >
> > Yes, I understand that some data may be lost because all three replicas of
> > some objects were lost, but why did the cluster become unavailable?
> >
> > There are 80 incomplete PGs and 4 down+incomplete PGs.
> >
> > Is there any way to solve the problem?
> 
> Yes, if you don't have a special crushmap to control the data
> placement policy, the PGs will lack the metadata necessary to boot. You need
> to re-add the outed OSDs or force-remove the PGs which are incomplete (hope it's
> just a test).

Is min_size 2 or 1?  Reducing it to 1 will generally clear some of the 
incomplete pgs.  Just remember to raise it back to 2 after the cluster 
recovers.
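
A minimal sketch of that lower-then-restore sequence, written in Python so the commands are only printed and never executed. It assumes the standard `ceph osd pool set <pool> min_size <n>` CLI form; the pool name `rbd` and the helper `min_size_cmd` are hypothetical, so substitute your own pool.

```python
import shlex
from typing import List


def min_size_cmd(pool: str, min_size: int) -> List[str]:
    """Build the argument list for `ceph osd pool set <pool> min_size <n>`."""
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    return ["ceph", "osd", "pool", "set", pool, "min_size", str(min_size)]


POOL = "rbd"  # hypothetical pool name; the thread does not name the pool

lower = min_size_cmd(POOL, 1)    # run first, to let incomplete PGs make progress
restore = min_size_cmd(POOL, 2)  # run once the cluster has recovered

# Print the commands for review instead of running them.
print(shlex.join(lower))
print(shlex.join(restore))
```

Printing rather than executing keeps the sketch safe to run anywhere; on a real cluster you would wait for recovery to finish between the two commands.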

sage

Thread overview: 9+ messages
2015-11-19 15:26 why my cluster become unavailable Libin Wu
2015-11-21 17:43 ` Haomai Wang
2015-11-21 17:49   ` Sage Weil
2015-11-23  1:00     ` hzwulibin
2015-11-26  7:54       ` hzwulibin [this message]
2015-11-26  8:00         ` why my cluster become unavailable (min_size of pool) Haomai Wang
2015-11-26  8:04           ` hzwulibin
2015-11-26 13:30         ` Sage Weil
2015-12-02 11:22           ` Libin Wu
