* Re: Weekly performance meeting
2014-09-25 18:27 Weekly performance meeting Sage Weil
@ 2014-09-25 19:03 ` Matt W. Benjamin
2014-09-25 21:15 ` Vu Pham
` (6 subsequent siblings)
7 siblings, 0 replies; 27+ messages in thread
From: Matt W. Benjamin @ 2014-09-25 19:03 UTC
To: Sage Weil
Cc: Somnath Roy, Allen Samuels, dieter kasper, PVonStamwitz,
xinxin shu, haomaiwang, s priebe, xiaoxi chen, milosz,
zhiqiang wang, jianpeng ma, gdror, vuhuong, mark nelson,
ceph-devel
Hi Sage,
Great idea, we'd certainly be interested in attending.
We have a variety of potentially relevant work, some (but not all) at least tangentially related to RDMA. Our attention is necessarily split between upstream and internal branches (that currently don't have CRUSH, PGs, etc), but we'd like to upstream as much generally useful work as we can.
Thanks,
Matt
----- "Sage Weil" <sweil@redhat.com> wrote:
> Hi everyone,
>
> A number of people have approached me about how to get more involved
> with the current work on improving performance and how to better
> coordinate with other interested parties. A few meetings have taken
> place offline with good results but only a few interested parties were
> involved.
>
> Ideally, we'd like to move as much of this discussion into the public
> forums: ceph-devel@vger.kernel.org and #ceph-devel. That isn't always
> sufficient, however. I'd like to also set up a regular weekly meeting
> using google hangouts or bluejeans so that all interested parties can
> share progress. There are a lot of things we can do during the Hammer
> cycle to improve things but it will require some coordination of
> effort.
>
--
Matt Benjamin
The Linux Box
206 South Fifth Ave. Suite 150
Ann Arbor, MI 48104
http://linuxbox.com
tel. 734-761-4689
fax. 734-769-8938
cel. 734-216-5309
* Re: Weekly performance meeting
2014-09-25 18:27 Weekly performance meeting Sage Weil
2014-09-25 19:03 ` Matt W. Benjamin
@ 2014-09-25 21:15 ` Vu Pham
2014-09-26 6:42 ` Dror Goldenberg
2014-09-26 1:50 ` Paul Von-Stamwitz
` (5 subsequent siblings)
7 siblings, 1 reply; 27+ messages in thread
From: Vu Pham @ 2014-09-25 21:15 UTC
To: Sage Weil
Cc: ceph-devel, Somnath.Roy, Allen.Samuels, dieter.kasper,
PVonStamwitz, xinxin.shu, haomaiwang, s.priebe, xiaoxi.chen,
milosz, zhiqiang.wang, jianpeng.ma, gdror, mark.nelson
Hi Sage,
Thanks for this initiative.
I'll definitely attend this weekly call.
I'd like to discuss how to make XioMessenger/RDMA useful and
improve performance.
thanks
-vu
> Hi everyone,
>
> A number of people have approached me about how to get more involved with
> the current work on improving performance and how to better coordinate
> with other interested parties. A few meetings have taken place offline
> with good results but only a few interested parties were involved.
>
> Ideally, we'd like to move as much of this discussion into the public
> forums: ceph-devel@vger.kernel.org and #ceph-devel. That isn't always
> sufficient, however. I'd like to also set up a regular weekly meeting
> using google hangouts or bluejeans so that all interested parties can
> share progress. There are a lot of things we can do during the Hammer
> cycle to improve things but it will require some coordination of effort.
>
> Among other things, we can discuss:
>
> - observed performance limitations
> - high level strategies for addressing them
> - proposed patch sets and their performance impact
> - anything else that will move us forward
>
> One challenge is timezones: there are developers in the US, China, Europe,
> and Israel who may want to join. As a starting point, how about next
> Wednesday, 15:00 UTC? If I didn't do my tz math wrong, that's
>
> 8:00 (PDT, California)
> 15:00 (UTC)
> 18:00 (IDT, Israel)
> 23:00 (CST, China)
>
> That is surely not the ideal time for everyone but it can hopefully be a
> starting point.
>
> I've also created an etherpad for collecting discussion/agenda items at
>
> http://pad.ceph.com/p/performance_weekly
>
> Is there interest here? Please let everyone know if you are actively
> working in this area and/or would like to join, and update the pad above
> with the topics you would like to discuss.
>
> Thanks!
> sage
>
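The timezone table in the quoted proposal can be double-checked with a few lines of Python. This is only a minimal sketch: it assumes "next Wednesday" means 2014-10-01 and uses fixed UTC offsets (PDT = UTC-7, IDT = UTC+3, CST = UTC+8) rather than a full timezone database, so it is valid for that date only.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets for the zones named in the proposal. These are
# assumptions that hold for 2014-10-01, not general timezone rules.
OFFSETS_HOURS = {
    "PDT, California": -7,
    "UTC": 0,
    "IDT, Israel": 3,
    "CST, China": 8,
}

def local_times(meeting_utc):
    """Return {label: 'HH:MM'} for the meeting in each listed zone."""
    result = {}
    for label, hours in OFFSETS_HOURS.items():
        tz = timezone(timedelta(hours=hours))
        result[label] = meeting_utc.astimezone(tz).strftime("%H:%M")
    return result

# "Next Wednesday" after 2014-09-25 is taken to be 2014-10-01 (assumption).
meeting = datetime(2014, 10, 1, 15, 0, tzinfo=timezone.utc)
times = local_times(meeting)
for label, hhmm in times.items():
    print(f"{hhmm} ({label})")
```

Under those assumptions the output agrees with the four times listed in the proposal.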
* RE: Weekly performance meeting
2014-09-25 21:15 ` Vu Pham
@ 2014-09-26 6:42 ` Dror Goldenberg
0 siblings, 0 replies; 27+ messages in thread
From: Dror Goldenberg @ 2014-09-26 6:42 UTC
To: Vu Pham, Sage Weil
Cc: ceph-devel@vger.kernel.org, Somnath.Roy@sandisk.com,
Allen.Samuels@sandisk.com, dieter.kasper@ts.fujitsu.com,
PVonStamwitz@us.fujitsu.com, xinxin.shu@intel.com,
haomaiwang@gmail.com, s.priebe@profihost.ag,
xiaoxi.chen@intel.com, milosz@adfin.com, zhiqiang.wang@intel.com,
jianpeng.ma@intel.com, mark.nelson@inktank.com, Oren Duer
Great initiative Sage.
I will attend as well.
-Dror
Dror Goldenberg |VP Software Architecture| Mellanox Technologies Ltd.
Work: +972 74 7237324 | Cell +972 54 4478308 |Fax: +972 4 959 3245
* RE: Weekly performance meeting
2014-09-25 18:27 Weekly performance meeting Sage Weil
2014-09-25 19:03 ` Matt W. Benjamin
2014-09-25 21:15 ` Vu Pham
@ 2014-09-26 1:50 ` Paul Von-Stamwitz
2014-09-26 2:27 ` Haomai Wang
` (4 subsequent siblings)
7 siblings, 0 replies; 27+ messages in thread
From: Paul Von-Stamwitz @ 2014-09-26 1:50 UTC
To: Sage Weil, ceph-devel@vger.kernel.org
Cc: Somnath.Roy@sandisk.com, Allen.Samuels@sandisk.com,
dieter.kasper@ts.fujitsu.com, xinxin.shu@intel.com,
haomaiwang@gmail.com, s.priebe@profihost.ag,
xiaoxi.chen@intel.com, milosz@adfin.com, zhiqiang.wang@intel.com,
jianpeng.ma@intel.com, gdror@mellanox.com, vuhuong@mellanox.com,
mark.nelson@inktank.com
Thanks, Sage.
I'm in, too.
Paul
* Re: Weekly performance meeting
2014-09-25 18:27 Weekly performance meeting Sage Weil
` (2 preceding siblings ...)
2014-09-26 1:50 ` Paul Von-Stamwitz
@ 2014-09-26 2:27 ` Haomai Wang
2014-09-26 2:45 ` Haomai Wang
2014-09-26 6:30 ` Dong Yuan
2014-09-26 2:47 ` Guang Yang
` (3 subsequent siblings)
7 siblings, 2 replies; 27+ messages in thread
From: Haomai Wang @ 2014-09-26 2:27 UTC
To: Sage Weil
Cc: ceph-devel@vger.kernel.org, Somnath Roy, Allen Samuels,
dieter.kasper, PVonStamwitz, Shu, Xinxin,
Stefan Priebe - Profihost AG, xiaoxi.chen, Milosz Tanski,
zhiqiang.wang, jianpeng.ma, gdror, vuhuong, Mark Nelson
Thanks, Sage!
I'll be on a flight on Oct 1. :-(
My team is now working mainly on Ceph performance, and we have
observed these points:
1. Encode/decode adds significant latency, especially in
ObjectStore::Transaction. I'm keen to refactor the ObjectStore API to
avoid the encode/decode code. It seems this is already listed in the
notes (- remove serialization from ObjectStore::Transaction (ymmv)).
2. The threadpool/workqueue model adds obvious latency. Should we
consider implementing a performance-optimized workqueue to replace the
existing critical workqueues, such as op_wq in OSD.h and op_wq in
FileStore.h? In my AsyncMessenger implementation, I will try a simple
custom workqueue to improve performance.
3. Large locks in the client library, such as in ObjectCacher.
--
Best Regards,
Wheat
* Re: Weekly performance meeting
2014-09-26 2:27 ` Haomai Wang
@ 2014-09-26 2:45 ` Haomai Wang
2014-09-26 6:30 ` Dong Yuan
1 sibling, 0 replies; 27+ messages in thread
From: Haomai Wang @ 2014-09-26 2:45 UTC
To: Sage Weil
Cc: ceph-devel@vger.kernel.org, Somnath Roy, Allen Samuels,
dieter.kasper, PVonStamwitz, Shu, Xinxin,
Stefan Priebe - Profihost AG, xiaoxi.chen, Milosz Tanski,
zhiqiang.wang, jianpeng.ma, gdror, vuhuong, Mark Nelson
Some more detailed optimization points:
1. FileStore/KeyValueStore worker threads compete for a global
object ("meta" collection, "infos" oid) which is used only by omap_*
methods. (https://github.com/ceph/ceph/pull/2502)
2. Sparse recovery when using fiemap (https://github.com/ceph/ceph/pull/2137)
--
Best Regards,
Wheat
* Re: Weekly performance meeting
2014-09-26 2:27 ` Haomai Wang
2014-09-26 2:45 ` Haomai Wang
@ 2014-09-26 6:30 ` Dong Yuan
2014-09-26 6:40 ` Somnath Roy
2014-09-26 12:58 ` Milosz Tanski
1 sibling, 2 replies; 27+ messages in thread
From: Dong Yuan @ 2014-09-26 6:30 UTC
To: Haomai Wang
Cc: Sage Weil, ceph-devel@vger.kernel.org, Somnath Roy, Allen Samuels,
dieter.kasper, PVonStamwitz, Shu, Xinxin,
Stefan Priebe - Profihost AG, xiaoxi.chen, Milosz Tanski,
zhiqiang.wang, jianpeng.ma, gdror, vuhuong, Mark Nelson
Here is some data to support Haomai's points.
> 1. encode/decode plays remarkable latency, especially in
> ObjectStore::Transaction. I'm urgen in refactor ObjectStore API to
> avoid encode/decode codes. It seemed has be signed in note(- remove
> serialization from ObjectStore::Transaction (ymmv))
My environment: a single OSD on a single SSD with filestore_blackhole = true.
With all transactions encoded, 10000 4K WriteFull operations from a
single thread take about 14.3s. Without transaction encoding, the same
test finishes in about 11.5s.
Considering that the FileStore needs to decode the bufferlist too,
encode/decode costs more than 20% of the time!
Oprofile results confirm this problem too: methods used by
encode/decode sometimes take 9 of the top 10 spots.
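The arithmetic behind the 20% figure can be spelled out as a short sketch. It uses only the two timings and the operation count quoted above; the function and variable names are illustrative, not from the original test.

```python
# Figures quoted in the message above.
OPS = 10_000
WITH_ENCODE_S = 14.3      # all transactions encoded
WITHOUT_ENCODE_S = 11.5   # transaction encode skipped

def encode_overhead(with_s, without_s, ops):
    """Summarize how much of the run is attributable to transaction encode."""
    saved_s = with_s - without_s
    return {
        "per_op_with_ms": with_s / ops * 1000,
        "per_op_without_ms": without_s / ops * 1000,
        "saved_s": saved_s,
        "encode_share": saved_s / with_s,
    }

summary = encode_overhead(WITH_ENCODE_S, WITHOUT_ENCODE_S, OPS)
print(f"per op with encode:    {summary['per_op_with_ms']:.2f} ms")
print(f"per op without encode: {summary['per_op_without_ms']:.2f} ms")
print(f"encode share of run:   {summary['encode_share']:.1%}")
```

Encode alone accounts for roughly 19.6% of the run. The decode cost on the FileStore side is not captured by this test and comes on top of that, which is consistent with the "more than 20%" estimate.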
> 2. obvious latency for threadpool/workqueue model. Do we consider to
> impl performance optimization workqueue to replace existing critical
> workqueue such as op_wq in OSD.h and op_wq in FileStore.h. Now in my
> AsyncMessenger impl, I will try to use custom and simple workqueue
> impl to improve performance.
When analyzing the latency of a 4K object WriteFull operation, I put
static probes into the code to measure the time spent in OpWQ. I ran
10000 4K object WriteFull operations and averaged the results.
I found that each IO spends 158us in the OpWQ: 30us to enqueue,
108us in the queue, and 20us to dequeue. That is more than 20% of the
time spent in the PG layer (not including the msg and os layers) when
encode is ignored.
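A small sketch makes the breakdown easier to read. It uses only the three averages quoted above; the upper bound on PG-layer time is derived from the "more than 20%" statement and is not a measured value.

```python
# Average per-IO OpWQ timings quoted above, in microseconds.
OPWQ_US = {"enqueue": 30, "in queue": 108, "dequeue": 20}

total_us = sum(OPWQ_US.values())
shares = {stage: us / total_us for stage, us in OPWQ_US.items()}

# If the OpWQ total is more than 20% of PG-layer time, the PG-layer time
# per IO must be below total / 0.20 (a derived bound, not a measurement).
pg_layer_upper_bound_us = total_us * 100 / 20

for stage, us in OPWQ_US.items():
    print(f"{stage:>8}: {us:4d} us ({shares[stage]:.1%} of OpWQ time)")
print(f"   total: {total_us:4d} us")
print(f"PG layer time per IO is below {pg_layer_upper_bound_us:.0f} us")
```

Waiting in the queue makes up about two thirds of the OpWQ time, so it is the wait rather than the enqueue or dequeue step that dominates.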
Maybe a more efficient ThreadPool/WorkQueue model is needed, or at
least some improvement to the WorkQueues in the IO path to reduce the
latency.
--
Dong Yuan
Email:yuandong1222@gmail.com
* RE: Weekly performance meeting
2014-09-26 6:30 ` Dong Yuan
@ 2014-09-26 6:40 ` Somnath Roy
2014-09-26 7:20 ` Dong Yuan
2014-09-26 12:58 ` Milosz Tanski
1 sibling, 1 reply; 27+ messages in thread
From: Somnath Roy @ 2014-09-26 6:40 UTC
To: Dong Yuan, Haomai Wang
Cc: Sage Weil, ceph-devel@vger.kernel.org, Allen Samuels,
dieter.kasper@ts.fujitsu.com, PVonStamwitz@us.fujitsu.com,
Shu, Xinxin, Stefan Priebe - Profihost AG, xiaoxi.chen@intel.com,
Milosz Tanski, zhiqiang.wang@intel.com, jianpeng.ma@intel.com,
gdror@mellanox.com, vuhuong@mellanox.com, Mark Nelson
Haomai/Dong,
Have you tried this with the latest sharded pool/WQ model, which is already in the Giant branch?
IOs will go through this path in the latest code, not through op_wq.
Yes, we also saw encode/decode consuming a lot of CPU time, and if I remember correctly the profiler was pointing at bufferlist::append in many such cases.
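For readers unfamiliar with the idea, the general shape of a sharded work queue can be sketched as follows. This is a generic illustration, not Ceph's implementation: the class name, the modulo sharding rule, and the integer PG ids are all assumptions made for the example.

```python
import threading
from collections import deque

class ShardedWorkQueue:
    """Hypothetical sketch: one queue and one lock per shard, so threads
    working on different shards do not contend on a single global lock."""

    def __init__(self, num_shards):
        self._shards = [deque() for _ in range(num_shards)]
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def shard_for(self, pg_id):
        # Assumed rule: ops for the same PG always map to the same shard,
        # which keeps per-PG ordering intact.
        return pg_id % len(self._shards)

    def enqueue(self, pg_id, op):
        idx = self.shard_for(pg_id)
        with self._locks[idx]:
            self._shards[idx].append((pg_id, op))

    def dequeue(self, shard_idx):
        """Pop the oldest op from one shard, or return None if it is empty."""
        with self._locks[shard_idx]:
            if self._shards[shard_idx]:
                return self._shards[shard_idx].popleft()
            return None

wq = ShardedWorkQueue(num_shards=4)
wq.enqueue(1, "write A")
wq.enqueue(5, "write B")   # 5 % 4 == 1, so this shares a shard with PG 1
wq.enqueue(2, "write C")
first = wq.dequeue(1)
second = wq.dequeue(1)
other = wq.dequeue(2)
```

The design choice illustrated here is that a worker bound to one shard only ever takes that shard's lock, so contention drops as the number of shards grows, while ops for a given PG still come out in the order they went in.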
Thanks & Regards
Somnath
* Re: Weekly performance meeting
2014-09-26 6:40 ` Somnath Roy
@ 2014-09-26 7:20 ` Dong Yuan
0 siblings, 0 replies; 27+ messages in thread
From: Dong Yuan @ 2014-09-26 7:20 UTC
To: Somnath Roy
Cc: Haomai Wang, Sage Weil, ceph-devel@vger.kernel.org, Allen Samuels,
dieter.kasper@ts.fujitsu.com, PVonStamwitz@us.fujitsu.com,
Shu, Xinxin, Stefan Priebe - Profihost AG, xiaoxi.chen@intel.com,
Milosz Tanski, zhiqiang.wang@intel.com, jianpeng.ma@intel.com,
gdror@mellanox.com, vuhuong@mellanox.com, Mark Nelson
> Have you tried this with latest shardedpool/WQ model which is already in the Giant branch ?
> IOS will be going with this path in the latest code not with op_wq.
Not yet. I mainly work on Firefly now.
I will try it when I have time and report the results. :)
> Yes, we also saw encode/decode was consuming lot of cpu times and if I remember correctly profiler was pointing bufferlist::append in many of such cases.
As right as my glove. :)
On 26 September 2014 14:40, Somnath Roy <Somnath.Roy@sandisk.com> wrote:
> Haomai/Dong
> Have you tried this with latest shardedpool/WQ model which is already in the Giant branch ?
> IOS will be going with this path in the latest code not with op_wq.
> Yes, we also saw encode/decode was consuming lot of cpu times and if I remember correctly profiler was pointing bufferlist::append in many of such cases.
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: Dong Yuan [mailto:yuandong1222@gmail.com]
> Sent: Thursday, September 25, 2014 11:31 PM
> To: Haomai Wang
> Cc: Sage Weil; ceph-devel@vger.kernel.org; Somnath Roy; Allen Samuels; dieter.kasper@ts.fujitsu.com; PVonStamwitz@us.fujitsu.com; Shu, Xinxin; Stefan Priebe - Profihost AG; xiaoxi.chen@intel.com; Milosz Tanski; zhiqiang.wang@intel.com; jianpeng.ma@intel.com; gdror@mellanox.com; vuhuong@mellanox.com; Mark Nelson
> Subject: Re: Weekly performance meeting
>
> Here is some data that supports Haomai's points.
>
>> 1. encode/decode adds remarkable latency, especially in
>> ObjectStore::Transaction. I'm keen to refactor the ObjectStore API to
>> avoid the encode/decode code. It seems this is already listed in the
>> notes (- remove serialization from ObjectStore::Transaction (ymmv))
>
> My environment: a single OSD on a single SSD with filestore_blackhole = true.
>
> With full transaction encoding, 10000 4K WriteFull operations issued by a single thread take about 14.3s. Without transaction encoding, the same test finishes in about 11.5s.
>
> Considering that the FileStore needs to decode the bufferlist too, encode/decode costs more than 20% of the time!
>
> Oprofile results confirm this problem too: methods used by encode/decode sometimes take 9 of the top 10 spots.
>
>> 2. obvious latency in the threadpool/workqueue model. Should we consider
>> implementing a performance-optimized workqueue to replace the existing
>> critical workqueues, such as op_wq in OSD.h and op_wq in FileStore.h? In
>> my AsyncMessenger impl, I will try to use a simple custom workqueue
>> impl to improve performance.
>
> To analyze the latency of a 4K object WriteFull operation, I put static probes into the code to measure the time spent in OpWQ. I ran 10000 4K object WriteFull operations and averaged the results.
>
> I found that each IO spends 158us in the OpWQ: 30us to enqueue, 108us in the queue, and 20us to dequeue. That is more than 20% of the time spent in the PG layer (not including the msg and os layers) when encode is ignored.
>
> Maybe a more effective ThreadPool/WorkQueue model is needed, or at least some improvement to the WorkQueues in the IO path, to reduce the latency.
>
--
Dong Yuan
Email:yuandong1222@gmail.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
2014-09-26 6:30 ` Dong Yuan
2014-09-26 6:40 ` Somnath Roy
@ 2014-09-26 12:58 ` Milosz Tanski
2014-09-26 13:02 ` Christoph Hellwig
2014-09-26 15:37 ` Sage Weil
1 sibling, 2 replies; 27+ messages in thread
From: Milosz Tanski @ 2014-09-26 12:58 UTC (permalink / raw)
To: Dong Yuan
Cc: Haomai Wang, Sage Weil, ceph-devel@vger.kernel.org, Somnath Roy,
Allen Samuels, Kasper Dieter, PVonStamwitz, Shu, Xinxin,
Stefan Priebe - Profihost AG, xiaoxi.chen, Wang, Zhiqiang,
jianpeng.ma, gdror, vuhuong, Mark Nelson
On Fri, Sep 26, 2014 at 2:30 AM, Dong Yuan <yuandong1222@gmail.com> wrote:
> Here is some data that supports Haomai's points.
>
>> 1. encode/decode adds remarkable latency, especially in
>> ObjectStore::Transaction. I'm keen to refactor the ObjectStore API to
>> avoid the encode/decode code. It seems this is already listed in the
>> notes (- remove serialization from ObjectStore::Transaction (ymmv))
>
> My environment: a single OSD on a single SSD with
> filestore_blackhole = true.
>
> With full transaction encoding, 10000 4K WriteFull operations issued
> by a single thread take about 14.3s. Without transaction encoding, the
> same test finishes in about 11.5s.
>
> Considering that the FileStore needs to decode the bufferlist too,
> encode/decode costs more than 20% of the time!
>
> Oprofile results confirm this problem too: methods used by
> encode/decode sometimes take 9 of the top 10 spots.
>
>> 2. obvious latency in the threadpool/workqueue model. Should we consider
>> implementing a performance-optimized workqueue to replace the existing
>> critical workqueues, such as op_wq in OSD.h and op_wq in FileStore.h? In
>> my AsyncMessenger impl, I will try to use a simple custom workqueue
>> impl to improve performance.
>
> To analyze the latency of a 4K object WriteFull operation, I put
> static probes into the code to measure the time spent in OpWQ. I ran
> 10000 4K object WriteFull operations and averaged the results.
>
> I found that each IO spends 158us in the OpWQ: 30us to enqueue, 108us
> in the queue, and 20us to dequeue. That is more than 20% of the time
> spent in the PG layer (not including the msg and os layers) when
> encode is ignored.
>
> Maybe a more effective ThreadPool/WorkQueue model is needed, or at
> least some improvement to the WorkQueues in the IO path, to reduce the
> latency.
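A quick check of the figures quoted above, as a small Python sketch. It uses only the numbers from the message (14.3s, 11.5s, 10000 operations, 30/108/20us); the variable names are illustrative, and the PG-layer bound is simply what "more than 20%" would imply, not a measured value.

```python
# Encode overhead: 10000 4K WriteFull ops, with and without transaction encode.
OPS = 10000
with_encode_s = 14.3
without_encode_s = 11.5

# Share of the total run time that disappears when encode is skipped.
encode_share = (with_encode_s - without_encode_s) / with_encode_s
# The same difference expressed per operation, in microseconds.
per_op_encode_us = (with_encode_s - without_encode_s) / OPS * 1e6

# OpWQ cost per IO, as broken down in the message (microseconds).
enqueue_us, queued_us, dequeue_us = 30, 108, 20
opwq_us = enqueue_us + queued_us + dequeue_us

# If the OpWQ cost is "more than 20%" of the PG layer, the PG layer
# must take less than this many microseconds per IO.
pg_layer_upper_bound_us = opwq_us / 0.20

print("encode share of run time: %.1f%%" % (encode_share * 100))
print("encode cost per op: %.0fus" % per_op_encode_us)
print("OpWQ per IO: %dus, implied PG layer bound: %.0fus"
      % (opwq_us, pg_layer_upper_bound_us))
```

Skipping encode alone saves a little under 20% of the run; the quoted "more than 20%" also counts the decode on the FileStore side, which this test does not measure separately.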
There are a number of things here. I haven't looked at the code in Giant,
so take my statements here with a grain of salt.
First, I have recently submitted a series of patches to the kernel to add
a new preadv2 syscall that lets you do a "fast read" out of the page
cache. The point is that you can skip the whole disk IO queue in
user space in the cases where the data is already cached (thus reducing
the latency). Obviously this doesn't do much for writes (yet; Christoph
Hellwig is working on that). Samba has expressed an interest in using
these new syscalls as well.
LWN article about it: http://lwn.net/Articles/612483/
Here's the latest patch:
http://thread.gmane.org/gmane.linux.kernel.aio.general/4306
The architecture that would benefit from "fast reads":
http://i.imgur.com/f8Pla7j.png
Previous version of the patch (mostly because there was a lot more
conversation there): https://lkml.org/lkml/2014/9/17/671
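A rough sketch of the "fast read" pattern, assuming Python on a recent Linux. It uses os.preadv with os.RWF_NOWAIT, which is the interface available in later mainline kernels and in Python 3.7+, not the patch series linked above; whether the two behave identically is an assumption. The helper names try_cached_read and read_block are made up for illustration.

```python
import errno
import os


def try_cached_read(fd, length, offset):
    """Return bytes if the read can be served without blocking, else None."""
    nowait = getattr(os, "RWF_NOWAIT", None)
    if nowait is None or not hasattr(os, "preadv"):
        return None  # this platform does not expose the interface
    buf = bytearray(length)
    try:
        n = os.preadv(fd, [buf], offset, nowait)
    except BlockingIOError:
        return None  # EAGAIN: the data is not in the page cache
    except OSError as e:
        if e.errno in (errno.EOPNOTSUPP, errno.EINVAL, errno.ENOSYS):
            return None  # kernel or filesystem lacks support
        raise
    return bytes(buf[:n])


def read_block(fd, length, offset):
    """Try the fast path first, then fall back to an ordinary blocking read."""
    data = try_cached_read(fd, length, offset)
    if data is None or len(data) < length:
        # Slow path. In a real server this is where the request would be
        # handed to the IO thread pool instead of being read inline.
        data = os.pread(fd, length, offset)
    return data
```

The caller only pays for the queue and the thread hand-off when try_cached_read returns None; a cache hit is answered directly on the calling thread.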
Second, when you have a very fast SSD device that can do up to 100k
iops, the naive queueing/thread pool implementation becomes an issue.
Like you mentioned, it adds a lot of extra latency. The solution for this
is not easy. Thankfully people have done lots of research work for
you, but you'll still need lots of trial and error to get it figured out.
Here are some common strategies:
You're going to have to consider your queue. Obviously you're going to
want to get away from a single crude mutex. The first question is
multiple queues, each with its own mutex, versus non-locking queues.
Then, if you choose non-locking, how are you going to build your
queueing system? Is it going to be a single MPMC queue (slowest), MPSC
(faster, but things will get stuck behind slow requests), MPSC with work
stealing (complicated), or a FastFlow-style network of SPSC queues
(needs an arbiter thread)?
- How do you handle an empty queue? Spin with fallback reduces latency,
but it does waste CPU cycles which could be used by a different OSD
process / EC decoding.
- Eventcount versus Semaphore (for blocking / notification); after all,
you don't want to spin forever. You really want an Eventcount since
you don't want to have a mutex with your semaphore (since
that's what you tried getting rid of). Here you get into
platform-specific implementations (futexes).
- If the queue has priorities, is it okay if our priorities aren't
perfectly enforced? In general this really complicates things and you're
pretty much best off having a FastFlow-like queue so your arbiter
thread can do some kind of prioritization.
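A minimal sketch of two of the ideas above, assuming Python purely for illustration (Ceph itself is C++): one queue per shard, each with its own lock, and a consumer that spins briefly before falling back to blocking. The class names and the spins and timeout parameters are hypothetical. threading.Condition still carries a lock, so this shows the control flow only; it is not a lock-free or eventcount-based design.

```python
import threading
from collections import deque


class Shard:
    """One queue with its own private lock and wakeup condition."""

    def __init__(self):
        self.cond = threading.Condition()  # owns this shard's lock
        self.items = deque()

    def put(self, item):
        with self.cond:
            self.items.append(item)
            self.cond.notify()  # wake one blocked consumer, if any

    def get(self, spins=100, timeout=1.0):
        # Phase 1: poll a bounded number of times, hoping work arrives
        # before we have to pay for a sleep/wakeup cycle.
        for _ in range(spins):
            with self.cond:
                if self.items:
                    return self.items.popleft()
        # Phase 2: fall back to blocking so an idle worker frees the CPU.
        with self.cond:
            if self.cond.wait_for(lambda: bool(self.items), timeout=timeout):
                return self.items.popleft()
        return None  # nothing arrived within the timeout


class ShardedWorkQueue:
    """Spreads work over several shards so producers rarely share a lock."""

    def __init__(self, num_shards=4):
        self.shards = [Shard() for _ in range(num_shards)]

    def shard_index(self, key):
        # Items with the same key always land in the same shard,
        # which keeps their relative order.
        return hash(key) % len(self.shards)

    def enqueue(self, key, item):
        self.shards[self.shard_index(key)].put(item)

    def dequeue(self, index, **kwargs):
        return self.shards[index].get(**kwargs)
```

Each worker thread would call dequeue on its own shard index. The spin count trades CPU for latency, which is the tension described in the first bullet.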
>
> On 26 September 2014 10:27, Haomai Wang <haomaiwang@gmail.com> wrote:
>> Thanks, Sage!
>>
>> I'll be on a flight on Oct 1. :-(
>>
>> My team is now mainly working on the performance of Ceph, and we have
>> observed these points:
>>
>> 1. encode/decode adds remarkable latency, especially in
>> ObjectStore::Transaction. I'm keen to refactor the ObjectStore API to
>> avoid the encode/decode code. It seems this is already listed in the
>> notes (- remove serialization from ObjectStore::Transaction (ymmv))
>> 2. obvious latency in the threadpool/workqueue model. Should we consider
>> implementing a performance-optimized workqueue to replace the existing
>> critical workqueues, such as op_wq in OSD.h and op_wq in FileStore.h? In
>> my AsyncMessenger impl, I will try to use a simple custom workqueue
>> impl to improve performance.
>> 3. Large locks in client libraries such as ObjectCacher
>>
>>
>> On Fri, Sep 26, 2014 at 2:27 AM, Sage Weil <sweil@redhat.com> wrote:
>>> Hi everyone,
>>>
>>> A number of people have approached me about how to get more involved with
>>> the current work on improving performance and how to better coordinate
>>> with other interested parties. A few meetings have taken place offline
>>> with good results but only a few interested parties were involved.
>>>
>>> Ideally, we'd like to move as much of this discussion into the public
>>> forums: ceph-devel@vger.kernel.org and #ceph-devel. That isn't always
>>> sufficient, however. I'd like to also set up a regular weekly meeting
>>> using google hangouts or bluejeans so that all interested parties can
>>> share progress. There are a lot of things we can do during the Hammer
>>> cycle to improve things but it will require some coordination of effort.
>>>
>>> Among other things, we can discuss:
>>>
>>> - observed performance limitations
>>> - high level strategies for addressing them
>>> - proposed patch sets and their performance impact
>>> - anything else that will move us forward
>>>
>>> One challenge is timezones: there are developers in the US, China, Europe,
>>> and Israel who may want to join. As a starting point, how about next
>>> Wednesday, 15:00 UTC? If I didn't do my tz math wrong, that's
>>>
>>> 8:00 (PDT, California)
>>> 15:00 (UTC)
>>> 18:00 (IDT, Israel)
>>> 23:00 (CST, China)
I'd love to participate and contribute to the discussion and solution,
but due to my obligations it's hard to commit to a weekly time, so it's
my hope that a lot of this is done on the mailing list.
--
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016
p: 646-253-9055
e: milosz@adfin.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
2014-09-26 12:58 ` Milosz Tanski
@ 2014-09-26 13:02 ` Christoph Hellwig
2014-09-26 15:37 ` Sage Weil
1 sibling, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2014-09-26 13:02 UTC (permalink / raw)
To: Milosz Tanski
Cc: Dong Yuan, Haomai Wang, Sage Weil, ceph-devel@vger.kernel.org,
Somnath Roy, Allen Samuels, Kasper Dieter, PVonStamwitz,
Shu, Xinxin, Stefan Priebe - Profihost AG, xiaoxi.chen,
Wang, Zhiqiang, jianpeng.ma, gdror, vuhuong, Mark Nelson
On Fri, Sep 26, 2014 at 08:58:56AM -0400, Milosz Tanski wrote:
> First, I have recently submitted a series of patches to the kernel to add
> a new preadv2 syscall that lets you do a "fast read" out of the page
> cache. The point is that you can skip the whole disk IO queue in
> user space in the cases where the data is already cached (thus reducing
> the latency). Obviously this doesn't do much for writes (yet; Christoph
> Hellwig is working on that). Samba has expressed an interest in using
> these new syscalls as well.
We could also implement it for writes, but it would be a bit more
complicated. If there is a compelling use case it might be worth
exploring.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
2014-09-26 12:58 ` Milosz Tanski
2014-09-26 13:02 ` Christoph Hellwig
@ 2014-09-26 15:37 ` Sage Weil
2014-09-26 15:57 ` Mark Nelson
1 sibling, 1 reply; 27+ messages in thread
From: Sage Weil @ 2014-09-26 15:37 UTC (permalink / raw)
To: Milosz Tanski
Cc: Dong Yuan, Haomai Wang, ceph-devel@vger.kernel.org, Somnath Roy,
Allen Samuels, Kasper Dieter, PVonStamwitz, Shu, Xinxin,
Stefan Priebe - Profihost AG, xiaoxi.chen, Wang, Zhiqiang,
jianpeng.ma, gdror, vuhuong, Mark Nelson
On Fri, 26 Sep 2014, Milosz Tanski wrote:
> >>> One challenge is timezones: there are developers in the US, China, Europe,
> >>> and Israel who may want to join. As a starting point, how about next
> >>> Wednesday, 15:00 UTC? If I didn't do my tz math wrong, that's
> >>>
> >>> 8:00 (PDT, California)
> >>> 15:00 (UTC)
> >>> 18:00 (IDT, Israel)
> >>> 23:00 (CST, China)
>
> I'd love to participate and contribute to the discussion and solution,
> but due to my obligations it's hard to commit to a weekly time, so it's
> my hope that a lot of this is done on the mailing list.
I agree. One thing that has quickly become clear is that there are a
*lot* of areas to address and we can't possibly discuss them all in any
detail in a single meeting. I think the key will be to break things down
into specific areas of investigation that can be discussed in detail
on-list, and to use the meeting to coordinate activities, share latest
results, and so forth.
sage
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
2014-09-26 15:37 ` Sage Weil
@ 2014-09-26 15:57 ` Mark Nelson
0 siblings, 0 replies; 27+ messages in thread
From: Mark Nelson @ 2014-09-26 15:57 UTC (permalink / raw)
To: Sage Weil, Milosz Tanski
Cc: Dong Yuan, Haomai Wang, ceph-devel@vger.kernel.org, Somnath Roy,
Allen Samuels, Kasper Dieter, PVonStamwitz, Shu, Xinxin,
Stefan Priebe - Profihost AG, xiaoxi.chen, Wang, Zhiqiang,
jianpeng.ma, gdror, vuhuong, Mark Nelson
On 09/26/2014 10:37 AM, Sage Weil wrote:
> On Fri, 26 Sep 2014, Milosz Tanski wrote:
>>>>> One challenge is timezones: there are developers in the US, China, Europe,
>>>>> and Israel who may want to join. As a starting point, how about next
>>>>> Wednesday, 15:00 UTC? If I didn't do my tz math wrong, that's
>>>>>
>>>>> 8:00 (PDT, California)
>>>>> 15:00 (UTC)
>>>>> 18:00 (IDT, Israel)
>>>>> 23:00 (CST, China)
>>
>> I'd love to participate and contribute to the discussion and solution,
>> but due to my obligations it's hard to commit to a weekly time, so it's
>> my hope that a lot of this is done on the mailing list.
>
> I agree. One thing that has quickly become clear is that there are a
> *lot* of areas to address and we can't possibly discuss them all in any
> detail in a single meeting. I think the key will be to break things down
> into specific areas of investigation that can be discussed in detail
> on-list, and to use the meeting to coordinate activities, share latest
> results, and so forth.
That sounds like a great plan to me. There's a *ton* of work to do and
more than enough to go around! I'm really excited by the response we've
gotten. Not sure if google hangouts will be able to accommodate the
number of people!
>
> sage
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
2014-09-25 18:27 Weekly performance meeting Sage Weil
` (3 preceding siblings ...)
2014-09-26 2:27 ` Haomai Wang
@ 2014-09-26 2:47 ` Guang Yang
2014-09-26 13:12 ` Mark Nelson
2014-09-26 6:40 ` Zhang, Jian
` (2 subsequent siblings)
7 siblings, 1 reply; 27+ messages in thread
From: Guang Yang @ 2014-09-26 2:47 UTC (permalink / raw)
To: Sage Weil
Cc: Ceph-devel, Somnath.Roy, Allen.Samuels, dieter.kasper,
PVonStamwitz, xinxin.shu, haomaiwang, s.priebe, xiaoxi.chen,
milosz, zhiqiang.wang, jianpeng.ma, gdror, vuhuong, mark.nelson
Hi Sage,
We are very interested in joining (and contributing effort) as well. Following is a list of issues we are particularly interested in:
1> A large number of small files brings performance degradation, mostly due to file system lookups (even worse with EC).
2> The Messenger uses too many threads, which is a burden on high-density hardware (on which I believe Haomai has already made great progress).
Thanks,
Guang
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
2014-09-26 2:47 ` Guang Yang
@ 2014-09-26 13:12 ` Mark Nelson
2014-09-27 3:05 ` Guang Yang
0 siblings, 1 reply; 27+ messages in thread
From: Mark Nelson @ 2014-09-26 13:12 UTC (permalink / raw)
To: Guang Yang, Sage Weil
Cc: Ceph-devel, Somnath.Roy, Allen.Samuels, dieter.kasper,
PVonStamwitz, xinxin.shu, haomaiwang, s.priebe, xiaoxi.chen,
milosz, zhiqiang.wang, jianpeng.ma, gdror, vuhuong, mark.nelson
On 09/25/2014 09:47 PM, Guang Yang wrote:
> Hi Sage,
> We are very interested in joining (and contributing effort) as well. Following is a list of issues we are particularly interested in:
> 1> A large number of small files brings performance degradation, mostly due to file system lookups (even worse with EC).
Have you tried decreasing vfs_cache_pressure to retain dentries and
inodes in cache? I've had good luck improving performance for medium
sized IO workloads doing this.
> 2> The Messenger uses too many threads, which is a burden on high-density hardware (on which I believe Haomai has already made great progress).
Yes, the biggest thing on my personal wish list has been to move to a
hybrid threading/event processing model.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
2014-09-26 13:12 ` Mark Nelson
@ 2014-09-27 3:05 ` Guang Yang
0 siblings, 0 replies; 27+ messages in thread
From: Guang Yang @ 2014-09-27 3:05 UTC (permalink / raw)
To: Mark Nelson
Cc: Sage Weil, Ceph-devel, Somnath.Roy, Allen.Samuels, dieter.kasper,
PVonStamwitz, xinxin.shu, haomaiwang, s.priebe, xiaoxi.chen,
milosz, zhiqiang.wang, jianpeng.ma, gdror, vuhuong
On Sep 26, 2014, at 9:12 PM, Mark Nelson <mark.nelson@inktank.com> wrote:
> On 09/25/2014 09:47 PM, Guang Yang wrote:
>> Hi Sage,
>> We are very interested in joining (and contributing effort) as well. Following is a list of issues we are particularly interested in:
>> 1> A large number of small files brings performance degradation, mostly due to file system lookups (even worse with EC).
>
> Have you tried decreasing vfs_cache_pressure to retain dentries and inodes in cache? I've had good luck improving performance for medium sized IO workloads doing this.
Yeah, we changed the setting from its default value of 100 to 20 and it improved dentry/inode caching (we also tried setting it to 1 but got OOMs under some traffic patterns). Even with the setting change, given that the object size is several hundred KB, we still observed lookup misses, which increase latency. This became worse when we moved to EC because: 1) there are more files on each system, and 2) the long tail determines the latency.
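For reference, a minimal Python sketch of reading and changing the setting discussed above. /proc/sys/vm/vfs_cache_pressure is the standard procfs location of this sysctl; the function names and the path parameter (there so the code can be tried against an ordinary file) are illustrative, and writing the real file requires root.

```python
from pathlib import Path

# Standard procfs location of the vm.vfs_cache_pressure sysctl on Linux.
VFS_CACHE_PRESSURE = Path("/proc/sys/vm/vfs_cache_pressure")


def get_cache_pressure(path=VFS_CACHE_PRESSURE):
    """Return the current setting as an integer (the default is 100)."""
    return int(Path(path).read_text().strip())


def set_cache_pressure(value, path=VFS_CACHE_PRESSURE):
    """Write a new setting; lower values favour keeping dentries and inodes."""
    if value < 0:
        raise ValueError("vfs_cache_pressure must be non-negative")
    Path(path).write_text("%d\n" % value)
```

Calling set_cache_pressure(20) would apply the value reported above. A change made this way does not survive a reboot.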
^ permalink raw reply [flat|nested] 27+ messages in thread
* RE: Weekly performance meeting
2014-09-25 18:27 Weekly performance meeting Sage Weil
` (4 preceding siblings ...)
2014-09-26 2:47 ` Guang Yang
@ 2014-09-26 6:40 ` Zhang, Jian
2014-09-26 7:25 ` Loic Dachary
[not found] ` <20140925192728.GA22139@oder.mch.fsc.net>
7 siblings, 0 replies; 27+ messages in thread
From: Zhang, Jian @ 2014-09-26 6:40 UTC (permalink / raw)
To: Sage Weil, ceph-devel@vger.kernel.org
Cc: Somnath.Roy@sandisk.com, Allen.Samuels@sandisk.com,
dieter.kasper@ts.fujitsu.com, PVonStamwitz@us.fujitsu.com,
Shu, Xinxin, haomaiwang@gmail.com, s.priebe@profihost.ag,
Chen, Xiaoxi, milosz@adfin.com, Wang, Zhiqiang, Ma, Jianpeng,
gdror@mellanox.com, vuhuong@mellanox.com, mark.nelson@inktank.com,
Duan, Jiangang
Sage,
It is really great to have a performance meeting running. We'd like to join the meeting.
So far, we have a list of topics that could be discussed at the next meeting (it seems we already have many topics listed here, and it's a PRC holiday next week).
* Full SSD setup performance limitations
* Cache tiering optimization proposals
* EC performance enhancement
Thanks
Jian
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
2014-09-25 18:27 Weekly performance meeting Sage Weil
` (5 preceding siblings ...)
2014-09-26 6:40 ` Zhang, Jian
@ 2014-09-26 7:25 ` Loic Dachary
2014-09-26 8:37 ` Zhang, Jian
[not found] ` <20140925192728.GA22139@oder.mch.fsc.net>
7 siblings, 1 reply; 27+ messages in thread
From: Loic Dachary @ 2014-09-26 7:25 UTC (permalink / raw)
To: ceph-devel, Zhang, Jian
Hi,
I added a section to http://pad.ceph.com/p/performance_weekly about the current efforts to optimize/benchmark erasure code plugins. @Zhang, Jian : the ISA erasure code plugin Andreas Peters worked on is not mentioned because it does not need any optimization, as far as I can tell ;-)
Cheers
--
Loïc Dachary, Artisan Logiciel Libre
^ permalink raw reply [flat|nested] 27+ messages in thread
* RE: Weekly performance meeting
2014-09-26 7:25 ` Loic Dachary
@ 2014-09-26 8:37 ` Zhang, Jian
2014-09-26 8:56 ` Loic Dachary
0 siblings, 1 reply; 27+ messages in thread
From: Zhang, Jian @ 2014-09-26 8:37 UTC (permalink / raw)
To: Loic Dachary, ceph-devel@vger.kernel.org
Loic,
Sorry for the confusion. We have observed up to ~20% EC performance degradation and have some initial findings and want to have a discussion.
Thanks
Jian
-----Original Message-----
From: Loic Dachary [mailto:loic@dachary.org]
Sent: Friday, September 26, 2014 3:25 PM
To: ceph-devel@vger.kernel.org; Zhang, Jian
Subject: Re: Weekly performance meeting
Hi,
I added a section to http://pad.ceph.com/p/performance_weekly about the current efforts to optimize/benchmark erasure code plugins. @Zhang, Jian : the ISA erasure code plugin Andreas Peters worked on is not mentioned because it does not need any optimization, as far as I can tell ;-)
Cheers
On 25/09/2014 20:27, Sage Weil wrote:
> Hi everyone,
>
> A number of people have approached me about how to get more involved
> with the current work on improving performance and how to better
> coordinate with other interested parties. A few meetings have taken
> place offline with good results but only a few interested parties were involved.
>
> Ideally, we'd like to move as much of this discussion into the public
> forums: ceph-devel@vger.kernel.org and #ceph-devel. That isn't always
> sufficient, however. I'd like to also set up a regular weekly meeting
> using google hangouts or bluejeans so that all interested parties can
> share progress. There are a lot of things we can do during the Hammer
> cycle to improve things but it will require some coordination of effort.
>
> Among other things, we can discuss:
>
> - observed performance limitations
> - high level strategies for addressing them
> - proposed patch sets and their performance impact
> - anything else that will move us forward
>
> One challenge is timezones: there are developers in the US, China,
> Europe, and Israel who may want to join. As a starting point, how
> about next Wednesday, 15:00 UTC? If I didn't do my tz math wrong,
> that's
>
> 8:00 (PDT, California)
> 15:00 (UTC)
> 18:00 (IDT, Israel)
> 23:00 (CST, China)
>
> That is surely not the ideal time for everyone but it can hopefully be
> a starting point.
>
> I've also created an etherpad for collecting discussion/agenda items
> at
>
> http://pad.ceph.com/p/performance_weekly
>
> Is there interest here? Please let everyone know if you are actively
> working in this area and/or would like to join, and update the pad
> above with the topics you would like to discuss.
>
> Thanks!
> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel"
> in the body of a message to majordomo@vger.kernel.org More majordomo
> info at http://vger.kernel.org/majordomo-info.html
>
--
Loïc Dachary, Artisan Logiciel Libre
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
2014-09-26 8:37 ` Zhang, Jian
@ 2014-09-26 8:56 ` Loic Dachary
0 siblings, 0 replies; 27+ messages in thread
From: Loic Dachary @ 2014-09-26 8:56 UTC (permalink / raw)
To: Zhang, Jian, ceph-devel@vger.kernel.org
Hi,
I added
- isa : performances improvement expected ~20% [jian.zhang]
to http://pad.ceph.com/p/performance_weekly. Do you have more information about this degradation?
Cheers
On 26/09/2014 10:37, Zhang, Jian wrote:
> Loic,
> Sorry for the confusion. We have observed up to ~20% EC performance degradation and have some initial findings and want to have a discussion.
>
> Thanks
> Jian
>
>
> -----Original Message-----
> From: Loic Dachary [mailto:loic@dachary.org]
> Sent: Friday, September 26, 2014 3:25 PM
> To: ceph-devel@vger.kernel.org; Zhang, Jian
> Subject: Re: Weekly performance meeting
>
> Hi,
>
> I added a section to http://pad.ceph.com/p/performance_weekly about the current efforts to optimize/benchmark erasure code plugins. @Zhang, Jian : the ISA erasure code plugin Andreas Peters worked on is not mentioned because it does not need any optimization, as far as I can tell ;-)
>
> Cheers
>
> On 25/09/2014 20:27, Sage Weil wrote:
>> Hi everyone,
>>
>> A number of people have approached me about how to get more involved
>> with the current work on improving performance and how to better
>> coordinate with other interested parties. A few meetings have taken
>> place offline with good results but only a few interested parties were involved.
>>
>> Ideally, we'd like to move as much of this discussion into the public
>> forums: ceph-devel@vger.kernel.org and #ceph-devel. That isn't always
>> sufficient, however. I'd like to also set up a regular weekly meeting
>> using google hangouts or bluejeans so that all interested parties can
>> share progress. There are a lot of things we can do during the Hammer
>> cycle to improve things but it will require some coordination of effort.
>>
>> Among other things, we can discuss:
>>
>> - observed performance limitations
>> - high level strategies for addressing them
>> - proposed patch sets and their performance impact
>> - anything else that will move us forward
>>
>> One challenge is timezones: there are developers in the US, China,
>> Europe, and Israel who may want to join. As a starting point, how
>> about next Wednesday, 15:00 UTC? If I didn't do my tz math wrong,
>> that's
>>
>> 8:00 (PDT, California)
>> 15:00 (UTC)
>> 18:00 (IDT, Israel)
>> 23:00 (CST, China)
>>
>> That is surely not the ideal time for everyone but it can hopefully be
>> a starting point.
>>
>> I've also created an etherpad for collecting discussion/agenda items
>> at
>>
>> http://pad.ceph.com/p/performance_weekly
>>
>> Is there interest here? Please let everyone know if you are actively
>> working in this area and/or would like to join, and update the pad
>> above with the topics you would like to discuss.
>>
>> Thanks!
>> sage
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel"
>> in the body of a message to majordomo@vger.kernel.org More majordomo
>> info at http://vger.kernel.org/majordomo-info.html
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
Loïc Dachary, Artisan Logiciel Libre
^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <20140925192728.GA22139@oder.mch.fsc.net>]
* Re: Weekly performance meeting
[not found] ` <20140925192728.GA22139@oder.mch.fsc.net>
@ 2014-09-25 19:30 ` Mark Nelson
2014-09-25 19:31 ` Mark Nelson
2014-09-25 19:41 ` Sage Weil
2014-10-01 9:30 ` Andreas Bluemle
2 siblings, 1 reply; 27+ messages in thread
From: Mark Nelson @ 2014-09-25 19:30 UTC (permalink / raw)
To: Kasper Dieter, Sage Weil
Cc: ceph-devel@vger.kernel.org, Somnath.Roy@sandisk.com,
Andreas Bluemle, Allen.Samuels@sandisk.com,
PVonStamwitz@us.fujitsu.com, xinxin.shu@intel.com,
haomaiwang@gmail.com, s.priebe@profihost.ag,
xiaoxi.chen@intel.com, milosz@adfin.com, zhiqiang.wang@intel.com,
jianpeng.ma@intel.com, gdror@mellanox.com, vuhuong@mellanox.com,
mark.nelson@inktank.com
I'll be there too, and 100% agree with you Kasper! :)
Mark
On 09/25/2014 02:27 PM, Kasper Dieter wrote:
> Hi Sage,
>
> I'm definitely interested in joining this weekly call starting Oct 1st.
> Thanks for this initiative!
>
> Especially I'm interested in:
> - how can we reduce the number of threads in the system
> -- including to avoid the context switches in between
> -- including to avoid the queues and locks in between
> - how we can reduce the number of lines of code
> -- including the multiple system calls for each IO
> - how we can introduce a highly efficient timestamp collection of the most important FN check-points
> (see for example the attached file)
> to measure the change and effect of our actions
>
> Best Regards,
> -Dieter
>
>
> On Thu, Sep 25, 2014 at 08:27:00PM +0200, Sage Weil wrote:
>> Hi everyone,
>>
>> A number of people have approached me about how to get more involved with
>> the current work on improving performance and how to better coordinate
>> with other interested parties. A few meetings have taken place offline
>> with good results but only a few interested parties were involved.
>>
>> Ideally, we'd like to move as much of this discussion into the public
>> forums: ceph-devel@vger.kernel.org and #ceph-devel. That isn't always
>> sufficient, however. I'd like to also set up a regular weekly meeting
>> using google hangouts or bluejeans so that all interested parties can
>> share progress. There are a lot of things we can do during the Hammer
>> cycle to improve things but it will require some coordination of effort.
>>
>> Among other things, we can discuss:
>>
>> - observed performance limitations
>> - high level strategies for addressing them
>> - proposed patch sets and their performance impact
>> - anything else that will move us forward
>>
>> One challenge is timezones: there are developers in the US, China, Europe,
>> and Israel who may want to join. As a starting point, how about next
>> Wednesday, 15:00 UTC? If I didn't do my tz math wrong, that's
>>
>> 8:00 (PDT, California)
>> 15:00 (UTC)
>> 18:00 (IDT, Israel)
>> 23:00 (CST, China)
>>
>> That is surely not the ideal time for everyone but it can hopefully be a
>> starting point.
>>
>> I've also created an etherpad for collecting discussion/agenda items at
>>
>> http://pad.ceph.com/p/performance_weekly
>>
>> Is there interest here? Please let everyone know if you are actively
>> working in this area and/or would like to join, and update the pad above
>> with the topics you would like to discuss.
>>
>> Thanks!
>> sage
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
2014-09-25 19:30 ` Mark Nelson
@ 2014-09-25 19:31 ` Mark Nelson
0 siblings, 0 replies; 27+ messages in thread
From: Mark Nelson @ 2014-09-25 19:31 UTC (permalink / raw)
To: Kasper Dieter, Sage Weil
Cc: ceph-devel@vger.kernel.org, Somnath.Roy@sandisk.com,
Andreas Bluemle, Allen.Samuels@sandisk.com,
PVonStamwitz@us.fujitsu.com, xinxin.shu@intel.com,
haomaiwang@gmail.com, s.priebe@profihost.ag,
xiaoxi.chen@intel.com, milosz@adfin.com, zhiqiang.wang@intel.com,
jianpeng.ma@intel.com, gdror@mellanox.com, vuhuong@mellanox.com,
mark.nelson@inktank.com
Oops, replied too fast and meant to put Dieter! :)
Mark
On 09/25/2014 02:30 PM, Mark Nelson wrote:
> I'll be there too, and 100% agree with you Kasper! :)
>
> Mark
>
> On 09/25/2014 02:27 PM, Kasper Dieter wrote:
>> Hi Sage,
>>
>> I'm definitely interested in joining this weekly call starting Oct 1st.
>> Thanks for this initiative!
>>
>> Especially I'm interested in:
>> - how can we reduce the number of threads in the system
>> -- including to avoid the context switches in between
>> -- including to avoid the queues and locks in between
>> - how we can reduce the number of lines of code
>> -- including the multiple system calls for each IO
>> - how we can introduce a highly efficient timestamp collection of the
>> most important FN check-points
>> (see for example the attached file)
>> to measure the change and effect of our actions
>>
>> Best Regards,
>> -Dieter
>>
>>
>> On Thu, Sep 25, 2014 at 08:27:00PM +0200, Sage Weil wrote:
>>> Hi everyone,
>>>
>>> A number of people have approached me about how to get more involved
>>> with
>>> the current work on improving performance and how to better coordinate
>>> with other interested parties. A few meetings have taken place offline
>>> with good results but only a few interested parties were involved.
>>>
>>> Ideally, we'd like to move as much of this discussion into the public
>>> forums: ceph-devel@vger.kernel.org and #ceph-devel. That isn't always
>>> sufficient, however. I'd like to also set up a regular weekly meeting
>>> using google hangouts or bluejeans so that all interested parties can
>>> share progress. There are a lot of things we can do during the Hammer
>>> cycle to improve things but it will require some coordination of effort.
>>>
>>> Among other things, we can discuss:
>>>
>>> - observed performance limitations
>>> - high level strategies for addressing them
>>> - proposed patch sets and their performance impact
>>> - anything else that will move us forward
>>>
>>> One challenge is timezones: there are developers in the US, China,
>>> Europe,
>>> and Israel who may want to join. As a starting point, how about next
>>> Wednesday, 15:00 UTC? If I didn't do my tz math wrong, that's
>>>
>>> 8:00 (PDT, California)
>>> 15:00 (UTC)
>>> 18:00 (IDT, Israel)
>>> 23:00 (CST, China)
>>>
>>> That is surely not the ideal time for everyone but it can hopefully be a
>>> starting point.
>>>
>>> I've also created an etherpad for collecting discussion/agenda items at
>>>
>>> http://pad.ceph.com/p/performance_weekly
>>>
>>> Is there interest here? Please let everyone know if you are actively
>>> working in this area and/or would like to join, and update the pad above
>>> with the topics you would like to discuss.
>>>
>>> Thanks!
>>> sage
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
[not found] ` <20140925192728.GA22139@oder.mch.fsc.net>
2014-09-25 19:30 ` Mark Nelson
@ 2014-09-25 19:41 ` Sage Weil
2014-09-25 20:20 ` Somnath Roy
2014-10-01 9:30 ` Andreas Bluemle
2 siblings, 1 reply; 27+ messages in thread
From: Sage Weil @ 2014-09-25 19:41 UTC (permalink / raw)
To: Kasper Dieter
Cc: ceph-devel@vger.kernel.org, Somnath.Roy@sandisk.com,
Andreas Bluemle, Allen.Samuels@sandisk.com,
PVonStamwitz@us.fujitsu.com, xinxin.shu@intel.com,
haomaiwang@gmail.com, s.priebe@profihost.ag,
xiaoxi.chen@intel.com, milosz@adfin.com, zhiqiang.wang@intel.com,
jianpeng.ma@intel.com, gdror@mellanox.com, vuhuong@mellanox.com,
mark.nelson@inktank.com
Hi Dieter,
On Thu, 25 Sep 2014, Kasper Dieter wrote:
> Hi Sage,
>
> I'm definitely interested in joining this weekly call starting Oct 1st.
> Thanks for this initiative!
Great! Please add these notes to
http://pad.ceph.com/p/performance_weekly
> Especially I'm interested in:
> - how can we reduce the number of threads in the system
> -- including to avoid the context switches in between
> -- including to avoid the queues and locks in between
You should take a look at Haomai's AsyncMessenger implementation he posted
a few weeks back. I'm not sure how much testing it's seen, but it
essentially refactors SimpleMessenger into a state machine and uses
libevent to schedule work.
> - how we can reduce the number of lines of code
> -- including the multiple system calls for each IO
I have some planned changes to ObjectStore::Transaction that make it
handle-based. This will make it easier to cache and avoid lots of dup
lookups (and lay some of the groundwork for the proposed KeyFileStore).
> - how we can introduce a highly efficient timestamp collection of the most important FN check-points
> (see for example the attached file)
> to measure the change and effect of our actions
I think most of us are looking to LTTng or systemtap for this. You should
also check out the thread 'Instrumenting RADOS with Zipkin + LTTng' from
several weeks ago for a pretty promising tracing strategy.
sage
>
> Best Regards,
> -Dieter
>
>
> On Thu, Sep 25, 2014 at 08:27:00PM +0200, Sage Weil wrote:
> > Hi everyone,
> >
> > A number of people have approached me about how to get more involved with
> > the current work on improving performance and how to better coordinate
> > with other interested parties. A few meetings have taken place offline
> > with good results but only a few interested parties were involved.
> >
> > Ideally, we'd like to move as much of this discussion into the public
> > forums: ceph-devel@vger.kernel.org and #ceph-devel. That isn't always
> > sufficient, however. I'd like to also set up a regular weekly meeting
> > using google hangouts or bluejeans so that all interested parties can
> > share progress. There are a lot of things we can do during the Hammer
> > cycle to improve things but it will require some coordination of effort.
> >
> > Among other things, we can discuss:
> >
> > - observed performance limitations
> > - high level strategies for addressing them
> > - proposed patch sets and their performance impact
> > - anything else that will move us forward
> >
> > One challenge is timezones: there are developers in the US, China, Europe,
> > and Israel who may want to join. As a starting point, how about next
> > Wednesday, 15:00 UTC? If I didn't do my tz math wrong, that's
> >
> > 8:00 (PDT, California)
> > 15:00 (UTC)
> > 18:00 (IDT, Israel)
> > 23:00 (CST, China)
> >
> > That is surely not the ideal time for everyone but it can hopefully be a
> > starting point.
> >
> > I've also created an etherpad for collecting discussion/agenda items at
> >
> > http://pad.ceph.com/p/performance_weekly
> >
> > Is there interest here? Please let everyone know if you are actively
> > working in this area and/or would like to join, and update the pad above
> > with the topics you would like to discuss.
> >
> > Thanks!
> > sage
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* RE: Weekly performance meeting
2014-09-25 19:41 ` Sage Weil
@ 2014-09-25 20:20 ` Somnath Roy
0 siblings, 0 replies; 27+ messages in thread
From: Somnath Roy @ 2014-09-25 20:20 UTC (permalink / raw)
To: Sage Weil, Kasper Dieter
Cc: ceph-devel@vger.kernel.org, Andreas Bluemle, Allen Samuels,
PVonStamwitz@us.fujitsu.com, xinxin.shu@intel.com,
haomaiwang@gmail.com, s.priebe@profihost.ag,
xiaoxi.chen@intel.com, milosz@adfin.com, zhiqiang.wang@intel.com,
jianpeng.ma@intel.com, gdror@mellanox.com, vuhuong@mellanox.com,
mark.nelson@inktank.com
Sage,
It will be helpful, I am planning to attend too.
Thanks & Regards
Somnath
-----Original Message-----
From: Sage Weil [mailto:sweil@redhat.com]
Sent: Thursday, September 25, 2014 12:42 PM
To: Kasper Dieter
Cc: ceph-devel@vger.kernel.org; Somnath Roy; Andreas Bluemle; Allen Samuels; PVonStamwitz@us.fujitsu.com; xinxin.shu@intel.com; haomaiwang@gmail.com; s.priebe@profihost.ag; xiaoxi.chen@intel.com; milosz@adfin.com; zhiqiang.wang@intel.com; jianpeng.ma@intel.com; gdror@mellanox.com; vuhuong@mellanox.com; mark.nelson@inktank.com
Subject: Re: Weekly performance meeting
Hi Dieter,
On Thu, 25 Sep 2014, Kasper Dieter wrote:
> Hi Sage,
>
> I'm definitely interested in joining this weekly call starting Oct 1st.
> Thanks for this initiative!
Great! Please add these notes to
http://pad.ceph.com/p/performance_weekly
> Especially I'm interested in:
> - how can we reduce the number of threads in the system
> -- including to avoid the context switches in between
> -- including to avoid the queues and locks in between
You should take a look at Haomai's AsyncMessenger implementation he posted a few weeks back. I'm not sure how much testing it's seen, but it essentially refactors SimpleMessenger into a state machine and uses libevent to schedule work.
> - how we can reduce the number of lines of code
> -- including the multiple system calls for each IO
I have some planned changes to ObjectStore::Transaction that make it handle-based. This will make it easier to cache and avoid lots of dup lookups (and lay some of the groundwork for the proposed KeyFileStore).
> - how we can introduce a highly efficient timestamp collection of the most important FN check-points
> (see for example the attached file)
> to measure the change and effect of our actions
I think most of us are looking to LTTng or systemtap for this. You should also check out the thread 'Instrumenting RADOS with Zipkin + LTTng' from several weeks ago for a pretty promising tracing strategy.
sage
>
> Best Regards,
> -Dieter
>
>
> On Thu, Sep 25, 2014 at 08:27:00PM +0200, Sage Weil wrote:
> > Hi everyone,
> >
> > A number of people have approached me about how to get more involved
> > with the current work on improving performance and how to better
> > coordinate with other interested parties. A few meetings have taken
> > place offline with good results but only a few interested parties were involved.
> >
> > Ideally, we'd like to move as much of this discussion into the public
> > forums: ceph-devel@vger.kernel.org and #ceph-devel. That isn't
> > always sufficient, however. I'd like to also set up a regular
> > weekly meeting using google hangouts or bluejeans so that all
> > interested parties can share progress. There are a lot of things we
> > can do during the Hammer cycle to improve things but it will require some coordination of effort.
> >
> > Among other things, we can discuss:
> >
> > - observed performance limitations
> > - high level strategies for addressing them
> > - proposed patch sets and their performance impact
> > - anything else that will move us forward
> >
> > One challenge is timezones: there are developers in the US, China,
> > Europe, and Israel who may want to join. As a starting point, how
> > about next Wednesday, 15:00 UTC? If I didn't do my tz math wrong,
> > that's
> >
> > 8:00 (PDT, California)
> > 15:00 (UTC)
> > 18:00 (IDT, Israel)
> > 23:00 (CST, China)
> >
> > That is surely not the ideal time for everyone but it can hopefully
> > be a starting point.
> >
> > I've also created an etherpad for collecting discussion/agenda items
> > at
> >
> > http://pad.ceph.com/p/performance_weekly
> >
> > Is there interest here? Please let everyone know if you are
> > actively working in this area and/or would like to join, and update
> > the pad above with the topics you would like to discuss.
> >
> > Thanks!
> > sage
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Weekly performance meeting
[not found] ` <20140925192728.GA22139@oder.mch.fsc.net>
2014-09-25 19:30 ` Mark Nelson
2014-09-25 19:41 ` Sage Weil
@ 2014-10-01 9:30 ` Andreas Bluemle
2 siblings, 0 replies; 27+ messages in thread
From: Andreas Bluemle @ 2014-10-01 9:30 UTC (permalink / raw)
Cc: Sage Weil, ceph-devel@vger.kernel.org, Somnath.Roy@sandisk.com,
Allen.Samuels@sandisk.com, PVonStamwitz@us.fujitsu.com,
xinxin.shu@intel.com, haomaiwang@gmail.com, s.priebe@profihost.ag,
xiaoxi.chen@intel.com, milosz@adfin.com, zhiqiang.wang@intel.com,
jianpeng.ma@intel.com, gdror@mellanox.com, vuhuong@mellanox.com,
mark.nelson@inktank.com
Hi,
to illustrate the "number of threads issue":
we had set up a cluster with 11 storage nodes and a total of 375 OSDs,
i.e. roughly 30 to 40 OSDs per storage node.
Looking at one of the storage nodes when the
cluster is idle (no client I/O, no scrub) we encounter
- up to 82,000 ceph-osd threads,
or approx. 2,000 threads per OSD
- a CPU load of 20%:
this is on a storage node with 12 CPU cores,
which means that more than 2 CPU cores are busy
- a network load of almost 50,000 packets/second:
separate cluster and public network,
12,000 packets per second on each network interface,
outgoing and incoming (heartbeats?)
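As a rough cross-check of the figures above, here is a minimal Python sketch of the arithmetic. The variable names are illustrative only, and it assumes that the 12,000 packets per second apply per network and per direction. That reading is an interpretation of the numbers, not something measured separately.

```python
# Figures taken from the message above; names are illustrative, not Ceph identifiers.
storage_nodes = 11
total_osds = 375
threads_per_node = 82_000        # ceph-osd threads seen on one idle storage node
cpu_cores = 12
cpu_load = 0.20                  # 20% load on the idle node
packets_per_net_and_direction = 12_000
networks = 2                     # assumption: one interface each for cluster and public
directions = 2                   # outgoing and incoming

# Average OSD count per node (the message gives a range of 30 to 40).
osds_per_node = total_osds / storage_nodes

# Threads per OSD if the node carried the average number of OSDs.
threads_per_osd_avg = threads_per_node / osds_per_node

# Threads per OSD if the node carried 40 OSDs (upper end of the range).
threads_per_osd_at_40 = threads_per_node / 40

# CPU cores kept busy while the cluster is idle.
busy_cores = cpu_cores * cpu_load

# Total packet rate on the node across both networks and both directions.
packets_per_second = packets_per_net_and_direction * networks * directions

print(f"OSDs per node (average):      {osds_per_node:.1f}")
print(f"threads per OSD (average):    {threads_per_osd_avg:.0f}")
print(f"threads per OSD (40 per node): {threads_per_osd_at_40:.0f}")
print(f"busy CPU cores:               {busy_cores:.1f}")
print(f"packets per second:           {packets_per_second}")
```

Under these assumptions the totals are consistent with the message: somewhat more than 2 busy cores and just under 50,000 packets per second, with the per-OSD thread count landing between roughly 2,000 and 2,400 depending on how many OSDs the node carries.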
Regards
Andreas Bluemle
On Thu, 25 Sep 2014 21:27:28 +0200
Kasper Dieter <dieter.kasper@ts.fujitsu.com> wrote:
> Hi Sage,
>
> I'm definitely interested in joining this weekly call starting Oct
> 1st. Thanks for this initiative!
>
> Especially I'm interested in:
> - how can we reduce the number of threads in the system
> -- including to avoid the context switches in between
> -- including to avoid the queues and locks in between
> - how we can reduce the number of lines of code
> -- including the multiple system calls for each IO
> - how we can introduce a highly efficient timestamp collection of the
> most important FN check-points (see for example the attached file)
> to measure the change and effect of our actions
>
> Best Regards,
> -Dieter
>
>
> On Thu, Sep 25, 2014 at 08:27:00PM +0200, Sage Weil wrote:
> > Hi everyone,
> >
> > A number of people have approached me about how to get more
> > involved with the current work on improving performance and how to
> > better coordinate with other interested parties. A few meetings
> > have taken place offline with good results but only a few
> > interested parties were involved.
> >
> > Ideally, we'd like to move as much of this discussion into the
> > public forums: ceph-devel@vger.kernel.org and #ceph-devel. That
> > isn't always sufficient, however. I'd like to also set up a
> > regular weekly meeting using google hangouts or bluejeans so that
> > all interested parties can share progress. There are a lot of
> > things we can do during the Hammer cycle to improve things but it
> > will require some coordination of effort.
> >
> > Among other things, we can discuss:
> >
> > - observed performance limitations
> > - high level strategies for addressing them
> > - proposed patch sets and their performance impact
> > - anything else that will move us forward
> >
> > One challenge is timezones: there are developers in the US, China,
> > Europe, and Israel who may want to join. As a starting point, how
> > about next Wednesday, 15:00 UTC? If I didn't do my tz math wrong,
> > that's
> >
> > 8:00 (PDT, California)
> > 15:00 (UTC)
> > 18:00 (IDT, Israel)
> > 23:00 (CST, China)
> >
> > That is surely not the ideal time for everyone but it can hopefully
> > be a starting point.
> >
> > I've also created an etherpad for collecting discussion/agenda
> > items at
> >
> > http://pad.ceph.com/p/performance_weekly
> >
> > Is there interest here? Please let everyone know if you are
> > actively working in this area and/or would like to join, and update
> > the pad above with the topics you would like to discuss.
> >
> > Thanks!
> > sage
--
Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910
Company details: http://www.itxperts.de/imprint.htm
^ permalink raw reply [flat|nested] 27+ messages in thread