* why my cluster become unavailable
From: Libin Wu @ 2015-11-19 15:26 UTC
To: ceph-devel

Hi, cephers,

I have a cluster of 6 OSD servers; every server has 8 OSDs.

I marked 4 OSDs out on every server, and then my client I/O blocked.

I rebooted my client and created a new rbd device, but the new device
can't complete writes either.

I understand that some data may be lost, since all three replicas of some
objects were lost, but why did the whole cluster become unavailable?

There are 80 incomplete pgs and 4 down+incomplete pgs.

Is there any way to solve this problem?

Thanks!

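As a rough back-of-the-envelope check on how removing half of the OSDs on every server can hit all three replicas of some pgs, the sketch below computes how many replicas of a pg land on removed OSDs. It rests on simplifying assumptions that are not stated in the thread: each replica sits on a different server, every OSD on a server is equally likely to be chosen, the choices are independent, and a removed OSD is treated as unavailable. Real CRUSH placement depends on the crushmap and weights, so the numbers are only illustrative. The function name `replica_loss_distribution` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def replica_loss_distribution(osds_per_host=8, out_per_host=4, replicas=3):
    """Return {k: probability that exactly k replicas of a pg sit on removed OSDs}.

    Assumption (not from the thread): one replica per host, each OSD on a
    host equally likely, replicas chosen independently.
    """
    p_out = Fraction(out_per_host, osds_per_host)
    dist = {k: Fraction(0) for k in range(replicas + 1)}
    # Enumerate every combination of "replica is on a removed OSD" flags.
    for outcome in product((True, False), repeat=replicas):
        prob = Fraction(1)
        for is_out in outcome:
            prob *= p_out if is_out else 1 - p_out
        dist[sum(outcome)] += prob
    return dist


if __name__ == "__main__":
    for lost, prob in replica_loss_distribution().items():
        print(f"{lost} replica(s) on removed OSDs: {prob}")
```

Under these assumptions, one pg in eight would have all three replicas on removed OSDs, and half of the pgs would have at least two there.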
* Re: why my cluster become unavailable
From: Haomai Wang @ 2015-11-21 17:43 UTC
To: Libin Wu; +Cc: ceph-devel

On Thu, Nov 19, 2015 at 11:26 PM, Libin Wu <hzwulibin@gmail.com> wrote:
> Hi, cephers,
>
> I have a cluster of 6 OSD servers; every server has 8 OSDs.
>
> I marked 4 OSDs out on every server, and then my client I/O blocked.
>
> I rebooted my client and created a new rbd device, but the new
> device can't complete writes either.
>
> I understand that some data may be lost, since all three replicas of some
> objects were lost, but why did the whole cluster become unavailable?
>
> There are 80 incomplete pgs and 4 down+incomplete pgs.
>
> Is there any way to solve this problem?

Yes. Unless you have a special crushmap controlling the data placement
policy, the pgs will lack the metadata they need to start. You need to
either re-add the outed osds or force-remove the incomplete pgs (I hope
this is just a test).

>
> Thanks!
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Best Regards,

Wheat

* Re: why my cluster become unavailable
From: Sage Weil @ 2015-11-21 17:49 UTC
To: Haomai Wang; +Cc: Libin Wu, ceph-devel

On Sun, 22 Nov 2015, Haomai Wang wrote:
> Yes. Unless you have a special crushmap controlling the data placement
> policy, the pgs will lack the metadata they need to start. You need to
> either re-add the outed osds or force-remove the incomplete pgs (I hope
> this is just a test).

Is min_size 2 or 1? Reducing it to 1 will generally clear some of the
incomplete pgs. Just remember to raise it back to 2 after the cluster
recovers.

sage

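The sequence suggested above (check min_size, lower it to 1, raise it back to 2 once the cluster has recovered) could be scripted roughly as follows. This is only a sketch: it builds the `ceph osd pool get` and `ceph osd pool set` command lines and prints them without running anything. The pool name `rbd` and the helper name `min_size_commands` are assumptions for illustration, not taken from the thread.

```python
def min_size_commands(pool, temporary=1, normal=2):
    """Return the ceph CLI invocations, in order, as argument lists.

    The commands are only constructed here, never executed.
    """
    return [
        # 1. Check the current value first.
        ["ceph", "osd", "pool", "get", pool, "min_size"],
        # 2. Temporarily lower min_size so more pgs can go active.
        ["ceph", "osd", "pool", "set", pool, "min_size", str(temporary)],
        # 3. Only after the cluster has recovered: restore the normal value.
        ["ceph", "osd", "pool", "set", pool, "min_size", str(normal)],
    ]


if __name__ == "__main__":
    # "rbd" is a placeholder pool name; substitute the affected pool.
    for argv in min_size_commands("rbd"):
        print(" ".join(argv))
```

Keeping the restore step in the same list as the lowering step is deliberate: it makes it harder to forget to raise min_size again.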
* Re: why my cluster become unavailable
From: hzwulibin @ 2015-11-23 1:00 UTC
To: Sage Weil, Haomai Wang; +Cc: ceph-devel

Hi, Sage

Thanks! I will try it in my next test.

------------------
hzwulibin
2015-11-23

-------------------------------------------------------------
From: Sage Weil <sage@newdream.net>
Sent: 2015-11-22 01:49
To: Haomai Wang
Cc: Libin Wu, ceph-devel
Subject: Re: why my cluster become unavailable

Is min_size 2 or 1? Reducing it to 1 will generally clear some of the
incomplete pgs. Just remember to raise it back to 2 after the cluster
recovers.

sage

* Re: why my cluster become unavailable (min_size of pool)
From: hzwulibin @ 2015-11-26 7:54 UTC
To: Sage Weil, Haomai Wang; +Cc: ceph-devel

Hi, Sage

I have a question about the min_size of a pool.

The default value of min_size is 2, but with this setting, when two OSDs
are down at the same time (meaning two replicas are lost), I/O is blocked.
We want to set min_size to 1 in our production environment, because we
consider it a normal case for two OSDs (on different hosts, of course) to
be down at the same time.

So is there any potential problem with this setting?

We use version 0.80.10.

Thanks!

------------------
hzwulibin
2015-11-26

* Re: why my cluster become unavailable (min_size of pool)
From: Haomai Wang @ 2015-11-26 8:00 UTC
To: hzwulibin; +Cc: Sage Weil, ceph-devel

On Thu, Nov 26, 2015 at 3:54 PM, hzwulibin <hzwulibin@gmail.com> wrote:
> Hi, Sage
>
> I have a question about the min_size of a pool.
>
> The default value of min_size is 2, but with this setting, when two OSDs
> are down at the same time (meaning two replicas are lost), I/O is blocked.
> We want to set min_size to 1 in our production environment, because we
> consider it a normal case for two OSDs (on different hosts, of course) to
> be down at the same time.

min_size = 2 means that every object in this pool must be guaranteed two
copies. Its main purpose is to reduce the risk that permanent corruption of
the storage media causes actual data loss. In other words, if min_size is 1
and the pool is in this degraded state, the permanent failure of one more
osd will cause data loss. If min_size is 2, at least 2 osds are needed.

>
> So is there any potential problem with this setting?
>
> We use version 0.80.10.
>
> Thanks!

--
Best Regards,

Wheat

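The trade-off described above can be put into a very small model. The sketch below is a simplification, not Ceph's actual peering logic: it only encodes the rule that a pg serves I/O while the number of available replicas is at least min_size. The pool size of 3 is assumed from the "three replicas" mentioned earlier in the thread, and the function names are hypothetical.

```python
def pg_accepts_io(available_replicas, min_size):
    """Simplified rule: a pg serves I/O only with at least min_size replicas."""
    return available_replicas >= min_size


def failures_tolerated_while_serving(size, min_size):
    """How many replicas may be down before I/O to the pg blocks."""
    return size - min_size


if __name__ == "__main__":
    size = 3  # assumed replica count
    for min_size in (1, 2):
        for available in range(size, 0, -1):
            state = "serves I/O" if pg_accepts_io(available, min_size) else "blocks"
            print(f"min_size={min_size}, {available} replica(s) available: {state}")
```

With min_size = 1 the pg keeps accepting writes on a single remaining copy, which is exactly the state in which one further permanent failure loses data.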
* Re: Re: why my cluster become unavailable (min_size of pool)
From: hzwulibin @ 2015-11-26 8:04 UTC
To: Haomai Wang; +Cc: Sage Weil, ceph-devel

Hi, Haomai

Thanks for the quick reply; your explanation makes sense to me.

Thanks!

------------------
hzwulibin
2015-11-26

-------------------------------------------------------------
From: Haomai Wang <haomaiwang@gmail.com>
Sent: 2015-11-26 16:00
To: hzwulibin
Cc: Sage Weil, ceph-devel
Subject: Re: why my cluster become unavailable (min_size of pool)

min_size = 2 means that every object in this pool must be guaranteed two
copies. Its main purpose is to reduce the risk that permanent corruption of
the storage media causes actual data loss. In other words, if min_size is 1
and the pool is in this degraded state, the permanent failure of one more
osd will cause data loss. If min_size is 2, at least 2 osds are needed.

--
Best Regards,

Wheat

* Re: why my cluster become unavailable (min_size of pool)
From: Sage Weil @ 2015-11-26 13:30 UTC
To: hzwulibin; +Cc: Haomai Wang, ceph-devel

On Thu, 26 Nov 2015, hzwulibin wrote:
> Hi, Sage
>
> I have a question about the min_size of a pool.
>
> The default value of min_size is 2, but with this setting, when two OSDs
> are down at the same time (meaning two replicas are lost), I/O is blocked.
> We want to set min_size to 1 in our production environment, because we
> consider it a normal case for two OSDs (on different hosts, of course) to
> be down at the same time.
>
> So is there any potential problem with this setting?

min_size = 1 is okay, but be aware that it increases the risk of ending up
with a pg history like

 epoch 10: osd.0, osd.1, osd.2
 epoch 11: osd.0 (1 and 2 down)
 epoch 12: - (osd.0 fails hard)
 epoch 13: osd.1 osd.2

i.e., a pg is serviced by a single osd for some period (possibly very
short), that osd then fails permanently, and any writes made during that
period are *only* stored on that osd. It will require some manual recovery
to get past this (mark that osd as lost, and accept that you may have lost
some recent writes to the data).

sage

>
> We use version 0.80.10.
>
> Thanks!
>
> ------------------
> hzwulibin
> 2015-11-26

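The epoch history above can be replayed in a toy model to show where the writes end up. This is a deliberately simplified sketch and not how Ceph tracks pg logs: it assumes exactly one write per epoch in which the pg is active, and that a pg is active whenever its acting set has at least min_size osds. The names `replay` and `writes_at_risk` are hypothetical.

```python
def replay(history, min_size):
    """Replay acting sets; return (latest_version, newest version held per osd).

    Assumption: one write happens in every epoch where the pg is active.
    """
    version = 0
    have = {}
    for acting in history:
        if len(acting) >= min_size:
            version += 1
            for osd in acting:
                have[osd] = version
    return version, have


def writes_at_risk(history, min_size):
    """Count writes that none of the osds in the final acting set hold."""
    # All epochs but the last generate writes; the last one is who comes back.
    version, have = replay(history[:-1], min_size)
    survivors = history[-1]
    newest_on_survivors = max((have.get(osd, 0) for osd in survivors), default=0)
    return version - newest_on_survivors


HISTORY = [
    ["osd.0", "osd.1", "osd.2"],  # epoch 10
    ["osd.0"],                    # epoch 11: osd.1 and osd.2 down
    [],                           # epoch 12: osd.0 fails hard
    ["osd.1", "osd.2"],           # epoch 13
]

if __name__ == "__main__":
    for min_size in (1, 2):
        print(f"min_size={min_size}: writes at risk = {writes_at_risk(HISTORY, min_size)}")
```

In this model, min_size = 1 leaves the epoch 11 write stored only on osd.0, while min_size = 2 blocks that write in the first place, so osd.1 and osd.2 return with everything that was ever acknowledged.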
* Re: why my cluster become unavailable (min_size of pool)
From: Libin Wu @ 2015-12-02 11:22 UTC
To: Sage Weil; +Cc: Haomai Wang, ceph-devel

Sage, thanks! I missed your email until I saw it on GMANE today.

Thanks again!

2015-11-26 21:30 GMT+08:00 Sage Weil <sage@newdream.net>:
> min_size = 1 is okay, but be aware that it increases the risk of ending up
> with a pg history like
>
>  epoch 10: osd.0, osd.1, osd.2
>  epoch 11: osd.0 (1 and 2 down)
>  epoch 12: - (osd.0 fails hard)
>  epoch 13: osd.1 osd.2
>
> i.e., a pg is serviced by a single osd for some period (possibly very
> short), that osd then fails permanently, and any writes made during that
> period are *only* stored on that osd. It will require some manual recovery
> to get past this (mark that osd as lost, and accept that you may have lost
> some recent writes to the data).
>
> sage

end of thread, last message 2015-12-02 11:22 UTC

Thread overview: 9+ messages
2015-11-19 15:26 why my cluster become unavailable Libin Wu
2015-11-21 17:43 ` Haomai Wang
2015-11-21 17:49   ` Sage Weil
2015-11-23  1:00     ` hzwulibin
2015-11-26  7:54       ` why my cluster become unavailable (min_size of pool) hzwulibin
2015-11-26  8:00         ` Haomai Wang
2015-11-26  8:04           ` hzwulibin
2015-11-26 13:30         ` Sage Weil
2015-12-02 11:22           ` Libin Wu