Increase number of PG

* Increase number of PG
@ 2012-07-20 15:31 Sławomir Skowron
  2012-07-20 20:15 ` Tommi Virtanen
  0 siblings, 1 reply; 7+ messages in thread
From: Sławomir Skowron @ 2012-07-20 15:31 UTC (permalink / raw)
  To: ceph-devel

I know that this feature is disabled. Are you planning to enable it
in the near future?

I have many drives, and my S3 installation uses only a few of them at
a time; I need to improve that.

When I use it as RBD, it uses all of them.

Regards

Slawomir Skowron


* Re: Increase number of PG
  2012-07-20 15:31 Increase number of PG Sławomir Skowron
@ 2012-07-20 20:15 ` Tommi Virtanen
  2012-07-21 17:13   ` Gregory Farnum
  0 siblings, 1 reply; 7+ messages in thread
From: Tommi Virtanen @ 2012-07-20 20:15 UTC (permalink / raw)
  To: Sławomir Skowron; +Cc: ceph-devel

On Fri, Jul 20, 2012 at 8:31 AM, Sławomir Skowron <szibis@gmail.com> wrote:
> I know that this feature is disabled, are you planning to enable this
> in near future ??

PG splitting/joining is the next major project for the OSD. It won't
be backported to argonaut, but it will be in the next stable release,
and will probably appear in our regular development release in 2-3
months.

> I have many of drives, and my S3 instalation use only few of them in
> one time, and i need to improve that.
>
> When i use it as rbd it use all of them.

Radosgw normally stores most of the data for a single S3-level object
in a single RADOS object, whereas RBD stripes disk images across
objects by default in 4MB chunks. If you have only a few S3 objects,
you will see an uneven distribution. It will get more balanced as you
upload more images. Also, if you use multi-part uploads, each part
goes into a separate RADOS object, so that'll spread the load more
evenly.

Now, if your problem comes from the rgw pools having too few PGs to
begin with, the distribution will be.. lumpy.. even with more objects.
Here's another mailing list thread that talks about what you can do
about that: http://article.gmane.org/gmane.comp.file-systems.ceph.devel/8069
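
The effect of a small PG count can be illustrated with a toy model. This is
only a sketch under stated assumptions: real Ceph maps objects to PGs with its
own hash and maps PGs to OSDs with CRUSH, neither of which is reproduced here.
The function names, the SHA-256 stand-in hash, and the figures of 60 OSDs and
2 replicas are all hypothetical. The point is only that an object can land on
no more OSDs than its pool's PGs cover.

```python
import hashlib


def stable_hash(text: str) -> int:
    """Deterministic stand-in hash (NOT the hash Ceph actually uses)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pg_for_object(name: str, pg_num: int) -> int:
    """Toy object-to-PG mapping: hash of the object name modulo pg_num."""
    return stable_hash(name) % pg_num


def osds_for_pg(pg: int, num_osds: int, replicas: int) -> list:
    """Toy stand-in for CRUSH: pick `replicas` consecutive OSD ids."""
    first = stable_hash(f"pg-{pg}") % num_osds
    return [(first + i) % num_osds for i in range(replicas)]


def osds_in_use(num_objects: int, pg_num: int, num_osds: int, replicas: int = 2) -> set:
    """Return the set of OSD ids that hold at least one object replica."""
    used = set()
    for i in range(num_objects):
        pg = pg_for_object(f"object-{i}", pg_num)
        used.update(osds_for_pg(pg, num_osds, replicas))
    return used


if __name__ == "__main__":
    # Hypothetical cluster: 60 OSDs, 2 replicas, 10000 small objects.
    for pg_num in (8, 64, 1024):
        used = osds_in_use(10000, pg_num, num_osds=60)
        print(f"pg_num={pg_num}: {len(used)} of 60 OSDs hold data")
```

With 8 PGs and 2 replicas, at most 16 OSDs can ever hold data in this model,
no matter how many objects are uploaded.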
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Increase number of PG
  2012-07-20 20:15 ` Tommi Virtanen
@ 2012-07-21 17:13   ` Gregory Farnum
  2012-07-21 18:08     ` Yehuda Sadeh
  0 siblings, 1 reply; 7+ messages in thread
From: Gregory Farnum @ 2012-07-21 17:13 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Sławomir Skowron, ceph-devel

On Fri, Jul 20, 2012 at 1:15 PM, Tommi Virtanen <tv@inktank.com> wrote:
> On Fri, Jul 20, 2012 at 8:31 AM, Sławomir Skowron <szibis@gmail.com> wrote:
> > I know that this feature is disabled, are you planning to enable this
> > in near future ??
>
> PG splitting/joining is the next major project for the OSD. It won't
> be backported to argonaut, but it will be in the next stable release,
> and will probably appear in our regular development release in 2-3
> months.
>  
> > I have many of drives, and my S3 instalation use only few of them in
> > one time, and i need to improve that.
> >  
> > When i use it as rbd it use all of them.
>  
> Radosgw normally stores most of the data for a single S3-level object
> in a single RADOS object, where as RBD stripes disk images across
> objects by default in 4MB chunks. If you have only a few S3 objects,
> you will see an uneven distribution. It will get more balanced as you
> upload more images. Also, if you use multi-part uploads, each part
> goes into a separate RADOS object, so that'll spread the load more
> evenly.
>  

RGW only does this for small objects — I believe its default chunk size is also 4MB.


But I'm pretty sure this is his problem:
> Now, if your problem comes from the rgw pools having too few PGs to
> begin with, the distribution will be.. lumpy.. even with more objects.
> Here's another mailing list thread that talks about what you can do
> about that: http://article.gmane.org/gmane.comp.file-systems.ceph.devel/8069
>  

-Greg


* Re: Increase number of PG
  2012-07-21 17:13   ` Gregory Farnum
@ 2012-07-21 18:08     ` Yehuda Sadeh
  2012-07-23  6:57       ` Sławomir Skowron
  0 siblings, 1 reply; 7+ messages in thread
From: Yehuda Sadeh @ 2012-07-21 18:08 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Tommi Virtanen, Sławomir Skowron, ceph-devel

On Sat, Jul 21, 2012 at 10:13 AM, Gregory Farnum <greg@inktank.com> wrote:
> On Fri, Jul 20, 2012 at 1:15 PM, Tommi Virtanen <tv@inktank.com> wrote:
>> On Fri, Jul 20, 2012 at 8:31 AM, Sławomir Skowron <szibis@gmail.com> wrote:
>> > I know that this feature is disabled, are you planning to enable this
>> > in near future ??
>> >
>>
>>
>> PG splitting/joining is the next major project for the OSD. It won't
>> be backported to argonaut, but it will be in the next stable release,
>> and will probably appear in our regular development release in 2-3
>> months.
>>
>> > I have many of drives, and my S3 instalation use only few of them in
>> > one time, and i need to improve that.
>> >
>> > When i use it as rbd it use all of them.
>>
>> Radosgw normally stores most of the data for a single S3-level object
>> in a single RADOS object, where as RBD stripes disk images across
>> objects by default in 4MB chunks. If you have only a few S3 objects,
>> you will see an uneven distribution. It will get more balanced as you
>> upload more images. Also, if you use multi-part uploads, each part
>> goes into a separate RADOS object, so that'll spread the load more
>> evenly.
>>
>
> RGW only does this for small objects — I believe its default chunk size is also 4MB.

Actually no. While the infrastructure is there, a regular object
upload currently does not create more than 2 rados objects: the head
object, which is capped at 512k, and the tail, which contains the
rest. As Tommi said, multipart upload chunks depend on the actual
upload.
There's no real reason anymore not to stripe, and it's easy enough to
implement, so it might be something that we do soon.
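
The layout described above can be sketched as a small calculation. This is an
illustration only, not radosgw code: the function names are hypothetical, and
it assumes that "512k" means 512 KiB and that RBD's "4MB chunks" means 4 MiB
objects.

```python
HEAD_MAX_BYTES = 512 * 1024          # assumption: "512k" means 512 KiB
RBD_OBJECT_BYTES = 4 * 1024 * 1024   # assumption: "4MB" means 4 MiB


def rgw_regular_upload_layout(object_size: int) -> list:
    """Split a regular (non-multipart) upload into head and optional tail.

    Returns a list of (part_name, size_in_bytes) tuples with at most two
    entries, matching the "not more than 2 rados objects" description.
    """
    head = min(object_size, HEAD_MAX_BYTES)
    parts = [("head", head)]
    tail = object_size - head
    if tail > 0:
        parts.append(("tail", tail))
    return parts


def rbd_object_count(image_size: int) -> int:
    """Number of fixed-size RADOS objects needed to cover an RBD image."""
    return -(-image_size // RBD_OBJECT_BYTES)  # ceiling division


if __name__ == "__main__":
    for size in (100 * 1024, 3 * 1024 * 1024):
        print(size, rgw_regular_upload_layout(size))
    print(rbd_object_count(10 * 1024 ** 3))
```

A 3 MiB S3 object therefore touches two RADOS objects, while an RBD image of
the same size or larger is spread over one object per 4 MiB.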

Yehuda


* Re: Increase number of PG
  2012-07-21 18:08     ` Yehuda Sadeh
@ 2012-07-23  6:57       ` Sławomir Skowron
  2012-07-23 16:00         ` Tommi Virtanen
  0 siblings, 1 reply; 7+ messages in thread
From: Sławomir Skowron @ 2012-07-23  6:57 UTC (permalink / raw)
  To: Yehuda Sadeh; +Cc: Gregory Farnum, Tommi Virtanen, ceph-devel@vger.kernel.org

On 21 Jul 2012, at 20:08, Yehuda Sadeh <yehuda@inktank.com> wrote:

> On Sat, Jul 21, 2012 at 10:13 AM, Gregory Farnum <greg@inktank.com> wrote:
>> On Fri, Jul 20, 2012 at 1:15 PM, Tommi Virtanen <tv@inktank.com> wrote:
>>> On Fri, Jul 20, 2012 at 8:31 AM, Sławomir Skowron <szibis@gmail.com> wrote:
>>>> I know that this feature is disabled, are you planning to enable this
>>>> in near future ??
>>>>
>>>
>>>
>>> PG splitting/joining is the next major project for the OSD. It won't
>>> be backported to argonaut, but it will be in the next stable release,
>>> and will probably appear in our regular development release in 2-3
>>> months.

OK, so I am waiting for this feature, but in the meantime can I move
my objects to a new, manually created pool with more PGs and use it
as the bucket pool in radosgw? How can I tell radosgw to use this
pool, or can't I?
At the moment my .rgw.buckets pool has the default 8 PGs, which is
too few.

>>>
>>>> I have many of drives, and my S3 instalation use only few of them in
>>>> one time, and i need to improve that.
>>>>
>>>> When i use it as rbd it use all of them.
>>>
>>> Radosgw normally stores most of the data for a single S3-level object
>>> in a single RADOS object, where as RBD stripes disk images across
>>> objects by default in 4MB chunks. If you have only a few S3 objects,
>>> you will see an uneven distribution. It will get more balanced as you
>>> upload more images. Also, if you use multi-part uploads, each part
>>> goes into a separate RADOS object, so that'll spread the load more
>>> evenly.
>>>
>>
>> RGW only does this for small objects — I believe its default chunk size is also 4MB.
>

Yes, I have lots of small objects (500k), from a few bytes to 2-3MB,
in the .rgw.buckets pool. They do not even hit multipart.

> Actually no. While the infrastructure is there, currently a regular
> object upload at the moment is not going to create more than 2 rados
> objects. The head object, which is capped at 512k and the tail, which
> will contain the rest. As Tommi specified, multipart upload chunks
> depend on the actual upload.
> There's actually no real reason anymore for not striping, and it's
> easy enough to implement, so it might be something that we're going to
> do soon.
>

This could be useful. But in my case the objects are too small, and
if I understand correctly, my only option is to have more PGs so that
new objects are balanced across more drives.

My workload looks like this:

- Max 20% are PUTs, with 99% of objects smaller than 4MB,
- 80% are GETs, and S3 metadata operations.

When the workload hits the worst-case scenario (a PUT followed by only
one GET), every GET misses the NGINX cache and is served from only a
few drives, and that hurts ;)

> Yehuda

* Re: Increase number of PG
  2012-07-23  6:57       ` Sławomir Skowron
@ 2012-07-23 16:00         ` Tommi Virtanen
  2012-07-23 16:46           ` Sławomir Skowron
  0 siblings, 1 reply; 7+ messages in thread
From: Tommi Virtanen @ 2012-07-23 16:00 UTC (permalink / raw)
  To: Sławomir Skowron
  Cc: Yehuda Sadeh, Gregory Farnum, ceph-devel@vger.kernel.org

On Sun, Jul 22, 2012 at 11:57 PM, Sławomir Skowron <szibis@gmail.com> wrote:
> My workload looks like this:
>
> - Max 20% are PUTs, with 99% of objects smaller than 4MB,
> - 80% are GETs, and S3 metadata operations.

Well, the good news is that that's actually the easy to fix part --
just increase the number of PGs (which you currently have to do the
awkward way, as explained earlier in this thread), nothing else
needed.
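
The thread does not say how many PGs to choose. As an assumption from outside
this thread, a commonly cited Ceph rule of thumb is roughly 100 PGs per OSD
divided by the replica count, rounded up to a power of two. A minimal sketch
of that arithmetic, with a hypothetical function name and example cluster
sizes:

```python
def suggested_pg_num(num_osds: int, replicas: int, target_per_osd: int = 100) -> int:
    """Rule-of-thumb PG count: (OSDs * target_per_osd) / replicas,
    rounded up to the next power of two.

    This is a rough guideline assumed for illustration, not a value
    taken from this thread.
    """
    if num_osds < 1 or replicas < 1:
        raise ValueError("num_osds and replicas must be positive")
    raw = num_osds * target_per_osd / replicas
    power = 1
    while power < raw:
        power *= 2
    return power


if __name__ == "__main__":
    # Hypothetical clusters, for illustration only.
    print(suggested_pg_num(60, 2))
    print(suggested_pg_num(10, 3))
```

Either result is far above the default of 8 PGs mentioned earlier for
.rgw.buckets.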

* Re: Increase number of PG
  2012-07-23 16:00         ` Tommi Virtanen
@ 2012-07-23 16:46           ` Sławomir Skowron
  0 siblings, 0 replies; 7+ messages in thread
From: Sławomir Skowron @ 2012-07-23 16:46 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Yehuda Sadeh, Gregory Farnum, ceph-devel@vger.kernel.org

OK, everything is clear now, thanks. I will try this during a planned maintenance window.

Regards

Slawomir Skowron.

On 23 Jul 2012, at 18:00, Tommi Virtanen <tv@inktank.com> wrote:

> On Sun, Jul 22, 2012 at 11:57 PM, Sławomir Skowron <szibis@gmail.com> wrote:
>> My workload looks like this:
>>
>> - Max 20% are PUTs, with 99% of objects smaller than 4MB,
>> - 80% are GETs, and S3 metadata operations.
>
> Well, the good news is that that's actually the easy to fix part --
> just increase the number of PGs (which you currently have to do the
> awkward way, as explained earlier in this thread), nothing else
> needed.
