RADOS + deep scrubbing performance issues in production environment

* RADOS + deep scrubbing performance issues in production environment
@ 2014-01-27 15:13 Filippos Giannakos
  2014-01-27 18:10 ` Kyle Bader
  0 siblings, 1 reply; 9+ messages in thread
From: Filippos Giannakos @ 2014-01-27 15:13 UTC (permalink / raw)
  To: ceph-devel; +Cc: synnefo-devel

Hello all,

We have been running RADOS in a large scale, production, public cloud
environment for a few months now and we are generally happy with it.

However, we experience performance problems when deep scrubbing is active.

We managed to reproduce them in our testing cluster running emperor, even while
it was idle.

We ran a simple rados bench test:

  rados -p bench bench -b 524288 120 write

and could consistently reach 230 MB/s [1].

Then, we manually initiated a deep scrub and re-ran the test.

As you can see from the results [2], throughput dropped significantly and I/O
even paused for a few seconds.

Now imagine that behavior in a loaded cluster with thousands of VMs on top of
it. The performance drop is unacceptable for our service.

Are there any tools we are not aware of for controlling, or possibly pausing,
deep scrubbing, and/or for reporting its progress?
Also, since I believe it would be bad practice to disable deep scrubbing, do you
have any recommendations on how to work around (or even solve) this issue?

[1] https://pithos.okeanos.grnet.gr/public/yzq5fHNkl5OnjgLOPlRTA3
[2] https://pithos.okeanos.grnet.gr/public/OjIGAQFBGwcsBNMHtA8ir5

Kind Regards,
-- 
Filippos
<philipgian@grnet.gr>

* Re: RADOS + deep scrubbing performance issues in production environment
  2014-01-27 15:13 RADOS + deep scrubbing performance issues in production environment Filippos Giannakos
@ 2014-01-27 18:10 ` Kyle Bader
  2014-01-27 18:45   ` Sage Weil
  2014-01-28 18:12   ` Filippos Giannakos
  0 siblings, 2 replies; 9+ messages in thread
From: Kyle Bader @ 2014-01-27 18:10 UTC (permalink / raw)
  To: Filippos Giannakos; +Cc: ceph-devel@vger.kernel.org, synnefo-devel

> Are there any tools we are not aware of for controlling, possibly pausing,
> deep-scrub and/or getting some progress about the procedure ?
> Also since I believe it would be a bad practice to disable deep-scrubbing do you
> have any recommendations of how to work around (or even solve) this issue ?

The periodicity of scrubs is controllable with these tunables:

osd scrub max interval
osd deep scrub interval

You may also be interested in adjusting:

osd scrub load threshold

More information on the docs page:

http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing
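
As a rough sketch of how these tunables might be written down: the two
interval options take a value in seconds, so a small helper can convert from
days and print an [osd] fragment for ceph.conf. The function names and the
values passed in (7 days, 14 days, 0.5) are illustrative assumptions, not
recommendations.

```shell
#!/bin/sh
# Convert a number of days to seconds (the interval tunables are in seconds).
days_to_seconds() {
    echo $(( $1 * 24 * 60 * 60 ))
}

# Print an [osd] fragment for ceph.conf.
# Arguments: max scrub interval (days), deep scrub interval (days), load threshold.
# All values are hypothetical examples chosen for illustration.
print_scrub_conf() {
    max_days=$1
    deep_days=$2
    load=$3
    printf '[osd]\n'
    printf 'osd scrub max interval = %s\n' "$(days_to_seconds "$max_days")"
    printf 'osd deep scrub interval = %s\n' "$(days_to_seconds "$deep_days")"
    printf 'osd scrub load threshold = %s\n' "$load"
}

print_scrub_conf 7 14 0.5
```

Lengthening the deep scrub interval only makes deep scrubs less frequent; it
does not make an individual deep scrub any cheaper while it runs.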

Hope that helps some!

-- 

Kyle

* Re: RADOS + deep scrubbing performance issues in production environment
  2014-01-27 18:10 ` Kyle Bader
@ 2014-01-27 18:45   ` Sage Weil
  2014-01-28  6:30     ` Mike Dawson
  2014-01-28 18:13     ` Filippos Giannakos
  2014-01-28 18:12   ` Filippos Giannakos
  1 sibling, 2 replies; 9+ messages in thread
From: Sage Weil @ 2014-01-27 18:45 UTC (permalink / raw)
  To: Kyle Bader; +Cc: Filippos Giannakos, ceph-devel@vger.kernel.org, synnefo-devel

There is also 

 ceph osd set noscrub

and then later

 ceph osd unset noscrub

I forget whether this pauses an in-progress PG scrub or just makes it stop 
when it gets to the next PG boundary.

sage

On Mon, 27 Jan 2014, Kyle Bader wrote:

> > Are there any tools we are not aware of for controlling, possibly pausing,
> > deep-scrub and/or getting some progress about the procedure ?
> > Also since I believe it would be a bad practice to disable deep-scrubbing do you
> > have any recommendations of how to work around (or even solve) this issue ?
> 
> The periodicity of scrubs is controllable with these tunables:
> 
> osd scrub max interval
> osd deep scrub interval
> 
> You may also be interested in adjusting:
> 
> osd scrub load threshold
> 
> More information on the docs page:
> 
> http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing
> 
> Hope that helps some!
> 
> -- 
> 
> Kyle
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

* Re: RADOS + deep scrubbing performance issues in production environment
  2014-01-27 18:45   ` Sage Weil
@ 2014-01-28  6:30     ` Mike Dawson
  2014-01-28 18:13       ` Filippos Giannakos
  2014-01-28 18:13     ` Filippos Giannakos
  1 sibling, 1 reply; 9+ messages in thread
From: Mike Dawson @ 2014-01-28  6:30 UTC (permalink / raw)
  To: Sage Weil, Kyle Bader
  Cc: Filippos Giannakos, ceph-devel@vger.kernel.org, synnefo-devel


On 1/27/2014 1:45 PM, Sage Weil wrote:
> There is also
>
>   ceph osd set noscrub
>
> and then later
>
>   ceph osd unset noscrub
>
In my experience scrub isn't nearly as much of a problem as deep-scrub.
On an IOPS-constrained cluster with writes approaching the available
aggregate spindle performance (minus the replication penalty and possibly
a co-located OSD journal penalty), scrub may run without any disruption.
But deep-scrub tends to make iowait on the spindles get ugly.

To disable/enable deep-scrub use:

ceph osd set nodeep-scrub
ceph osd unset nodeep-scrub
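
For anyone who wants deep scrubs confined to a quiet period, the two flag
commands above could be driven from cron. This is a minimal sketch under
stated assumptions: the quiet window of 22:00 to 06:00 and all function names
are hypothetical, and the only ceph invocations are the set/unset
nodeep-scrub commands already shown.

```shell
#!/bin/sh
# Return success if hour $1 lies in the window [$2, $3).
# The window may wrap past midnight (for example 22 to 6).
in_window() {
    hour=$1
    start=$2
    end=$3
    if [ "$start" -le "$end" ]; then
        [ "$hour" -ge "$start" ] && [ "$hour" -lt "$end" ]
    else
        [ "$hour" -ge "$start" ] || [ "$hour" -lt "$end" ]
    fi
}

# Print "unset" inside the quiet window (deep scrub allowed) and "set"
# outside it (deep scrub blocked). The 22-to-6 window is an assumed example.
scrub_flag_action() {
    if in_window "$1" 22 6; then
        echo "unset"
    else
        echo "set"
    fi
}

# Apply the decision for the current hour. Not called here; intended to be
# run from an hourly cron job on a host with the ceph CLI and admin access.
apply_scrub_window() {
    hour=$(date +%H)
    hour=${hour#0}    # strip a leading zero, so "08" becomes "8"
    ceph osd "$(scrub_flag_action "$hour")" nodeep-scrub
}
```

Keeping the window logic separate from the ceph call makes it easy to check
the schedule by hand, for example with scrub_flag_action 23, before letting
it touch the cluster.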


> I forget whether this pauses an in-progress PG scrub or just makes it stop
> when it gets to the next PG boundary.
>
> sage
>
> On Mon, 27 Jan 2014, Kyle Bader wrote:
>
>>> Are there any tools we are not aware of for controlling, possibly pausing,
>>> deep-scrub and/or getting some progress about the procedure ?
>>> Also since I believe it would be a bad practice to disable deep-scrubbing do you
>>> have any recommendations of how to work around (or even solve) this issue ?
>>
>> The periodicity of scrubs is controllable with these tunables:
>>
>> osd scrub max interval
>> osd deep scrub interval
>>
>> You may also be interested in adjusting:
>>
>> osd scrub load threshold
>>
>> More information on the docs page:
>>
>> http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing

I rarely run into a situation where the 1-minute load average is below 0.5
on a multi-core server running OSDs, so for me deep scrub is always
triggered by 'osd scrub max interval'. I filed a bug asking for the core
count to be taken into consideration:

http://tracker.ceph.com/issues/6296

The documentation used to say the "Default is 50%" implying that this 
feature should allow scrub to start with a much higher load than 0.5 
will allow on multi-core systems. The documentation has changed, but the 
default of 0.5 is still artificially suppressing deep-scrub from 
opportunistically starting on relatively idle multi-core systems.

That being said, deep-scrub may be better served with an 
osd_scrub_iops_threshold mechanism instead of (or in addition to) the 
osd_scrub_load_threshold.
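
To illustrate the point about core count, here is a small sketch that
compares a load average against a threshold after dividing it by the number
of cores. It only illustrates the idea behind the tracker issue; it is not
how the OSD evaluates osd scrub load threshold, and the function names are
hypothetical.

```shell
#!/bin/sh
# Succeed when (load / cores) is below the threshold.
# awk does the comparison because POSIX sh has no floating-point arithmetic.
load_allows_scrub() {
    load=$1
    cores=$2
    threshold=$3
    awk -v l="$load" -v c="$cores" -v t="$threshold" \
        'BEGIN { exit !((l / c) < t) }'
}

# Print a human-readable verdict for one load/cores/threshold combination.
check() {
    if load_allows_scrub "$1" "$2" "$3"; then
        echo "scrub allowed"
    else
        echo "scrub suppressed"
    fi
}

check 2.0 1 0.5    # raw load compared directly against 0.5
check 2.0 8 0.5    # same load spread over 8 cores, 0.25 per core
```

On Linux the live inputs could come from the first field of /proc/loadavg
and from getconf _NPROCESSORS_ONLN. With a raw comparison, a load of 2.0
blocks opportunistic scrubbing even though an 8-core machine at that load
is mostly idle.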

- Mike

>>
>> Hope that helps some!
>>
>> --
>>
>> Kyle

* Re: RADOS + deep scrubbing performance issues in production environment
  2014-01-27 18:10 ` Kyle Bader
  2014-01-27 18:45   ` Sage Weil
@ 2014-01-28 18:12   ` Filippos Giannakos
  1 sibling, 0 replies; 9+ messages in thread
From: Filippos Giannakos @ 2014-01-28 18:12 UTC (permalink / raw)
  To: Kyle Bader; +Cc: ceph-devel@vger.kernel.org, synnefo-devel

On Mon, Jan 27, 2014 at 01:10:23PM -0500, Kyle Bader wrote:
> > Are there any tools we are not aware of for controlling, possibly pausing,
> > deep-scrub and/or getting some progress about the procedure ?
> > Also since I believe it would be a bad practice to disable deep-scrubbing do you
> > have any recommendations of how to work around (or even solve) this issue ?
> 
> The periodicity of scrubs is controllable with these tunables:
> 
> osd scrub max interval
> osd deep scrub interval
> 
> You may also be interested in adjusting:
> 
> osd scrub load threshold
> 
> More information on the docs page:
> 
> http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing
> 
> Hope that helps some!
> 

Thanks Kyle, but this does not solve the performance degradation while deep
scrubbing is actually running. Plus, a deep scrub can take several days to
complete.

Kind Regards,
-- 
Filippos
<philipgian@grnet.gr>

* Re: RADOS + deep scrubbing performance issues in production environment
  2014-01-27 18:45   ` Sage Weil
  2014-01-28  6:30     ` Mike Dawson
@ 2014-01-28 18:13     ` Filippos Giannakos
       [not found]       ` <066498EC-D137-472A-85DB-93751E85C753@yahoo.com>
  1 sibling, 1 reply; 9+ messages in thread
From: Filippos Giannakos @ 2014-01-28 18:13 UTC (permalink / raw)
  To: Sage Weil; +Cc: Kyle Bader, ceph-devel@vger.kernel.org, synnefo-devel

On Mon, Jan 27, 2014 at 10:45:48AM -0800, Sage Weil wrote:
> There is also 
> 
>  ceph osd set noscrub
> 
> and then later
> 
>  ceph osd unset noscrub
> 
> I forget whether this pauses an in-progress PG scrub or just makes it stop 
> when it gets to the next PG boundary.
> 
> sage

I bumped into those settings but I couldn't find any documentation about them.
When I first tried them, they didn't do anything immediately, so I thought they
weren't the answer. After your mention, I tried them again, and after a while
the deep-scrubbing stopped. So I'm guessing they stop scrubbing on the next PG
boundary.
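
One rough way to watch whether deep scrubbing has actually stopped is to
count the placement groups whose state still includes a deep scrub. This is
a sketch, not a proper progress report: it assumes the ceph CLI with admin
access, and the sample rows below are made up to stand in for real output.

```shell
#!/bin/sh
# Count the lines on stdin whose PG state includes a deep scrub.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_deep_scrubbing() {
    grep -cF 'scrubbing+deep' || true
}

# Against a live cluster (assumed invocation):
#   ceph pg dump pgs_brief 2>/dev/null | count_deep_scrubbing

# Made-up sample rows in place of real cluster output:
printf '%s\n' \
    '1.0 active+clean' \
    '1.1 active+clean+scrubbing+deep' \
    '1.2 active+clean+scrubbing' \
    '1.3 active+clean+scrubbing+deep' | count_deep_scrubbing
```

With the sample rows this prints 2. Run periodically after setting the flag,
the count should fall to 0 once the in-progress PGs finish.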

I see from this thread and earlier ones that some people think it is a spindle
issue. I'm not sure that it is just that. Reproducing it on an idle cluster that
can do more than 250 MiB/s, and seeing a single request pause for 4-5 seconds,
sounds like an issue in itself. Maybe there is too much locking, or not enough
priority given to the actual I/O? Plus, the idea of throttling deep scrubbing
based on IOPS sounds appealing.

Kind Regards,
-- 
Filippos
<philipgian@grnet.gr>

* Re: RADOS + deep scrubbing performance issues in production environment
  2014-01-28  6:30     ` Mike Dawson
@ 2014-01-28 18:13       ` Filippos Giannakos
  0 siblings, 0 replies; 9+ messages in thread
From: Filippos Giannakos @ 2014-01-28 18:13 UTC (permalink / raw)
  To: Mike Dawson
  Cc: Sage Weil, Kyle Bader, ceph-devel@vger.kernel.org, synnefo-devel

On Tue, Jan 28, 2014 at 01:30:46AM -0500, Mike Dawson wrote:
> 
> On 1/27/2014 1:45 PM, Sage Weil wrote:
> >There is also
> >
> >  ceph osd set noscrub
> >
> >and then later
> >
> >  ceph osd unset noscrub
> >
> In my experience scrub isn't nearly as much of a problem as
> deep-scrub. On a IOPS constrained cluster with writes approaching
> the available aggregate spindle performance minus replication
> penalty and possibly co-located osd journal penalty, scrub may run
> without any disruption. But deep-scrub tends to make iowait on the
> spindles get ugly.
> 
> To disable/enable deep-scrub use:
> 
> ceph osd set nodeep-scrub
> ceph osd unset nodeep-scrub
>

Yes, deep-scrubbing is much worse than scrubbing, but I think fully disabling it
is not a good option. Having days of degraded performance isn't one either.
That's why I am bringing up the problem and looking for a solid solution.

Kind Regards,
-- 
Filippos
<philipgian@grnet.gr>

* Re: RADOS + deep scrubbing performance issues in production environment
       [not found]         ` <066498EC-D137-472A-85DB-93751E85C753-/E1597aS9LQAvxtiuMwx3w@public.gmane.org>
@ 2014-02-03 13:40           ` Guang
  2015-07-10 13:52             ` icq2206241
  0 siblings, 1 reply; 9+ messages in thread
From: Guang @ 2014-02-03 13:40 UTC (permalink / raw)
  To: Filippos Giannakos,
	ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Development,
	ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
  Cc: synnefo-devel-/JYPxA39Uh5TLH3MbocFFw


+ceph-users.

Does anybody have similar experience with scrubbing / deep-scrubbing?

Thanks,
Guang

On Jan 29, 2014, at 10:35 AM, Guang <yguang11-/E1597aS9LQAvxtiuMwx3w@public.gmane.org> wrote:

> Glad to see there is some discussion around scrubbing / deep-scrubbing.
> 
> We are experiencing the same thing: scrubbing can affect latency quite a bit.
> So far I have found two slow patterns (via dump_historic_ops):
> 
> 1) ops waiting to be dispatched
> 2) ops waiting in the op work queue to be fetched by an available op thread
> 
> For the first pattern, it looks like a lock is involved (the dispatcher stops
> working for 2 seconds and then resumes, and the same happens to the scrubber
> thread); that needs further investigation. For the second pattern, scrubbing
> brings more ops (for the scrub checks), which increases the op threads'
> workload (client ops have a lower priority). I think that could be improved
> by increasing the number of op threads. I will confirm this analysis by
> adding more op threads and turning on scrubbing on a per-OSD basis.
> 
> Does the above observation and analysis make sense?
> 
> Thanks,
> Guang
> 
> On Jan 29, 2014, at 2:13 AM, Filippos Giannakos <philipgian-Sqt7GMbKoOQ@public.gmane.org> wrote:
> 
>> On Mon, Jan 27, 2014 at 10:45:48AM -0800, Sage Weil wrote:
>>> There is also 
>>> 
>>> ceph osd set noscrub
>>> 
>>> and then later
>>> 
>>> ceph osd unset noscrub
>>> 
>>> I forget whether this pauses an in-progress PG scrub or just makes it stop 
>>> when it gets to the next PG boundary.
>>> 
>>> sage
>> 
>> I bumped into those settings but I couldn't find any documentation about them.
>> When I first tried them, they didn't do anything immediately, so I thought they
>> weren't the answer. After your mention, I tried them again, and after a while
>> the deep-scrubbing stopped. So I'm guessing they stop scrubbing on the next PG
>> boundary.
>> 
>> I see from this thread and others before, that some people think it is a spindle
>> issue. I'm not sure that it is just that. Replicating it to an idle cluster that
>> can do more than 250MiB/seconds and pausing for 4-5 seconds on a single request,
>> sounds like an issue by itself. Maybe there is too much locking or not enough
>> priority to the actual I/O ? Plus, that idea of throttling deep scrubbing based
>> on the iops sounds appealing.
>> 
>> Kind Regards,
>> -- 
>> Filippos
>> <philipgian-Sqt7GMbKoOQ@public.gmane.org>
> 


_______________________________________________
ceph-users mailing list
ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

* Re: RADOS + deep scrubbing performance issues in production environment
  2014-02-03 13:40           ` Guang
@ 2015-07-10 13:52             ` icq2206241
  0 siblings, 0 replies; 9+ messages in thread
From: icq2206241 @ 2015-07-10 13:52 UTC (permalink / raw)
  To: synnefo-devel
  Cc: ceph-users, kyle.bader, sage, guangyy, ceph-devel, philipgian

All IO drops to ZERO IOPS for 1-15 minutes during the deep-scrub on my cluster. There is clearly a locking bug!

I have VMs, and several times every day disk IO _completely_ stops, sometimes on all of them. The disk queue grows, 0 IOPS are performed, and services die with timeouts. At the same time Ceph (where the VM images are stored) is doing a deep scrub. No amount of fiddling with priorities or thread counts helps. In fact, making the scrub slower makes those delays longer, so there is clearly a bug with locking.

I have been experiencing this for two years. In that time we have tried everything and upgraded our cluster several times. Nothing helps!
