* libcrush.so
@ 2015-05-08 1:29 Zhou, Yuan
2015-05-08 4:37 ` libcrush.so Gregory Farnum
2015-05-08 17:40 ` libcrush.so James (Fei) Liu-SSI
0 siblings, 2 replies; 12+ messages in thread
From: Zhou, Yuan @ 2015-05-08 1:29 UTC (permalink / raw)
To: Ceph Development; +Cc: Cohen, David E, Yu, Zhidong
Ceph uses the CRUSH algorithm to map objects to OSD servers. This is great for clients, since they can talk to these OSDs directly. However, there are some scenarios where the application needs to access the CRUSH map itself, for load balancing as an example.
Currently Ceph doesn't provide any API to render the layout. If your application needs to access the CRUSH map, you have to rely on the command 'ceph osd map pool_name obj_name'. With a libcrush.so, we could let the application choose which nodes to access. Another advantage is that we could provide other bindings (Python, Go) based on it.
From the git log we see that libcrush existed before but has been removed since Argonaut. Could anyone kindly share the background of this change?
Thanks, -yuan
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: libcrush.so
2015-05-08 1:29 libcrush.so Zhou, Yuan
@ 2015-05-08 4:37 ` Gregory Farnum
2015-05-09 1:39 ` libcrush.so Zhou, Yuan
2015-05-08 17:40 ` libcrush.so James (Fei) Liu-SSI
1 sibling, 1 reply; 12+ messages in thread
From: Gregory Farnum @ 2015-05-08 4:37 UTC (permalink / raw)
To: Zhou, Yuan; +Cc: Ceph Development, Cohen, David E, Yu, Zhidong
On Thu, May 7, 2015 at 6:29 PM, Zhou, Yuan <yuan.zhou@intel.com> wrote:
> From the git log we see that libcrush existed before but has been removed since Argonaut. Could anyone kindly share the background of this change?
I don't think there was ever a libcrush that was friendly for external
use. There was a makefile-level "libcrush" but it got merged into
libcommon, presumably for ease of maintenance. The interfaces we use
around CRUSH are just not very clean, IIRC; the C interface is opaque
and the C++ CrushWrapper bits are...well, C++, and not easy to change
into something separable from the OSDMap, either. :/
That said, if somebody wanted to rework the code interfaces to be
nicer, PRs are always welcome. ;)
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: libcrush.so
2015-05-08 1:29 libcrush.so Zhou, Yuan
2015-05-08 4:37 ` libcrush.so Gregory Farnum
@ 2015-05-08 17:40 ` James (Fei) Liu-SSI
2015-05-08 17:57 ` libcrush.so Mark Nelson
2015-05-09 1:25 ` libcrush.so Zhou, Yuan
1 sibling, 2 replies; 12+ messages in thread
From: James (Fei) Liu-SSI @ 2015-05-08 17:40 UTC (permalink / raw)
To: Zhou, Yuan, Ceph Development; +Cc: Cohen, David E, Yu, Zhidong
Hi Yuan,
Very interesting. Would it be possible to know why the application needs to access the CRUSH map directly instead of going through the ceph tool?
Regards,
James
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: libcrush.so
2015-05-08 17:40 ` libcrush.so James (Fei) Liu-SSI
@ 2015-05-08 17:57 ` Mark Nelson
2015-05-08 18:13 ` libcrush.so James (Fei) Liu-SSI
2015-05-09 1:25 ` libcrush.so Zhou, Yuan
1 sibling, 1 reply; 12+ messages in thread
From: Mark Nelson @ 2015-05-08 17:57 UTC (permalink / raw)
To: James (Fei) Liu-SSI, Zhou, Yuan, Ceph Development
Cc: Cohen, David E, Yu, Zhidong
FWIW, an easily buildable libcrush would be fantastic for simulation
purposes (and things like avalanche analysis!) as well.
Mark
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: libcrush.so
2015-05-08 17:57 ` libcrush.so Mark Nelson
@ 2015-05-08 18:13 ` James (Fei) Liu-SSI
0 siblings, 0 replies; 12+ messages in thread
From: James (Fei) Liu-SSI @ 2015-05-08 18:13 UTC (permalink / raw)
To: Mark Nelson, Zhou, Yuan, Ceph Development; +Cc: Cohen, David E, Yu, Zhidong
Good to know, thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: libcrush.so
2015-05-08 17:40 ` libcrush.so James (Fei) Liu-SSI
2015-05-08 17:57 ` libcrush.so Mark Nelson
@ 2015-05-09 1:25 ` Zhou, Yuan
2015-05-11 2:34 ` libcrush.so Sage Weil
1 sibling, 1 reply; 12+ messages in thread
From: Zhou, Yuan @ 2015-05-09 1:25 UTC (permalink / raw)
To: James (Fei) Liu-SSI, Ceph Development; +Cc: Cohen, David E, Yu, Zhidong
Hi James,
This usually happens when the storage platform and the applications are in segmented networks. For example, in a cluster with multiple RGW instances, if we could know which RGW instance is closest to the primary copy, then we could do more efficient local reads/writes through some particular deployment.
There's a feature in OpenStack Swift [1] that can provide the location of objects inside a cluster.
Thanks, -yuan
[1] https://github.com/openstack/swift/blob/master/swift/common/middleware/list_endpoints.py
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: libcrush.so
2015-05-08 4:37 ` libcrush.so Gregory Farnum
@ 2015-05-09 1:39 ` Zhou, Yuan
0 siblings, 0 replies; 12+ messages in thread
From: Zhou, Yuan @ 2015-05-09 1:39 UTC (permalink / raw)
To: Gregory Farnum; +Cc: Ceph Development, Cohen, David E, Yu, Zhidong
Greg, Thanks a lot for the info!
Yes, the current CRUSH code relies heavily on internal data structures, and it's a bit difficult to extract.
For the APIs, do you have any ideas? Currently I think there should be:
map object to pg: give the object name, return the pg
map pg to osds: give the pg, return the osd lists
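The two calls listed above could have roughly the following shape. This is only a toy sketch: the function names, the SHA-256 based hash, and the rank-by-hash selection are all assumptions made for illustration. It is not CRUSH, it ignores weights and failure domains, and it will not reproduce the placements of a real cluster.

```python
import hashlib


def _hash32(*parts):
    """Stable 32-bit hash of the given parts (a stand-in, not Ceph's hash)."""
    data = "/".join(str(p) for p in parts).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def object_to_pg(pool_id, obj_name, pg_num):
    """Map object to pg: given the object name, return (pool_id, pg_seed)."""
    return (pool_id, _hash32(obj_name) % pg_num)


def pg_to_osds(pg, osd_ids, replicas):
    """Map pg to osds: given the pg, return an ordered list of OSD ids.

    Ranks every OSD by a per-(pg, osd) hash and keeps the top `replicas`.
    This is plain rendezvous hashing, used here only as a placeholder.
    """
    pool_id, seed = pg
    ranked = sorted(osd_ids,
                    key=lambda osd: _hash32(pool_id, seed, osd),
                    reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    pg = object_to_pg(pool_id=1, obj_name="my-object", pg_num=64)
    print(pg, pg_to_osds(pg, range(6), replicas=3))
```

Keeping the two steps separate lets a caller cache the object-to-pg result and recompute only the pg-to-osds step when the map changes.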
We have done some tests here. Let me see if I can clean it up for a PR.
Thanks, -yuan
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: libcrush.so
2015-05-09 1:25 ` libcrush.so Zhou, Yuan
@ 2015-05-11 2:34 ` Sage Weil
2015-05-11 11:41 ` libcrush.so Cohen, David E
0 siblings, 1 reply; 12+ messages in thread
From: Sage Weil @ 2015-05-11 2:34 UTC (permalink / raw)
To: Zhou, Yuan
Cc: James (Fei) Liu-SSI, Ceph Development, Cohen, David E,
Yu, Zhidong
On Sat, 9 May 2015, Zhou, Yuan wrote:
> Hi James,
>
> This usually happens when the storage platform and the applications are in segmented networks. For example, in a cluster with multiple RGW instances, if we could know which RGW instance is closest to the primary copy, then we could do more efficient local reads/writes through some particular deployment.
> There's a feature in OpenStack Swift [1] that can provide the location of objects inside a cluster.
I think the place for this is librados--and I believe there is already a
method to do this (I think... I know there is in libcephfs as the hadoop
bindings use it). That way you don't have to deal with the annoying
details of getting an up to date map and so on.
sage
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: libcrush.so
2015-05-11 2:34 ` libcrush.so Sage Weil
@ 2015-05-11 11:41 ` Cohen, David E
2015-05-11 17:01 ` libcrush.so Sage Weil
0 siblings, 1 reply; 12+ messages in thread
From: Cohen, David E @ 2015-05-11 11:41 UTC (permalink / raw)
To: Sage Weil; +Cc: James (Fei) Liu-SSI, Ceph Development, Yu, Zhidong, Zhou, Yuan
The "ceph_get_osd_crush_location" method is available via libcephfs. For deployment scenarios that don't include CephFS, it would be ideal if this method were also available elsewhere. However, there is no equivalent method in librados. Instead, it looks like you have to use "rados_mon_command_target" to mimic the functionality of the command-line tools.
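As a rough illustration of the monitor-command workaround, here is a minimal Python sketch. It assumes `cluster` is a connected handle from the Python rados bindings whose `mon_command(cmd, inbuf)` returns a `(ret, outbuf, outs)` tuple (the bindings also accept an optional `target` argument, the counterpart of the C call named above). The JSON keys sent (`prefix`, `pool`, `object`, `format`) and read back (`pgid`, `acting`) are assumptions based on what 'ceph osd map' prints and should be checked against your Ceph version. The helper name `osd_map_via_mon` is hypothetical.

```python
import json


def osd_map_via_mon(cluster, pool, obj):
    """Ask the monitors where `obj` in `pool` maps, like 'ceph osd map'."""
    cmd = json.dumps({
        "prefix": "osd map",   # same command the CLI sends
        "pool": pool,
        "object": obj,
        "format": "json",
    })
    ret, outbuf, outs = cluster.mon_command(cmd, b"")
    if ret != 0:
        raise RuntimeError("osd map failed (%d): %s" % (ret, outs))
    reply = json.loads(outbuf)
    # Assumed reply fields: the pg id string and the acting OSD list.
    return reply["pgid"], reply["acting"]


# Usage against a real cluster (requires the python rados bindings):
#   import rados
#   cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
#   cluster.connect()
#   print(osd_map_via_mon(cluster, "rbd", "my-object"))
```

Every lookup is a round trip to a monitor, which is the overhead a native librados call would avoid.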
/dave.
https://github.com/ceph/ceph/blob/master/debian/libcephfs-java.jlibs
https://github.com/ceph/ceph/blob/master/src/include/cephfs/libcephfs.h
https://github.com/ceph/ceph/blob/master/src/libcephfs.cc
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: libcrush.so
2015-05-11 11:41 ` libcrush.so Cohen, David E
@ 2015-05-11 17:01 ` Sage Weil
2015-05-11 22:39 ` libcrush.so Zhou, Yuan
0 siblings, 1 reply; 12+ messages in thread
From: Sage Weil @ 2015-05-11 17:01 UTC (permalink / raw)
To: Cohen, David E
Cc: James (Fei) Liu-SSI, Ceph Development, Yu, Zhidong, Zhou, Yuan
On Mon, 11 May 2015, Cohen, David E wrote:
> The "ceph_get_osd_crush_location" method is available via libcephfs. For
> deployment scenarios that don't include CephFS, it would be ideal if this
> method were also available elsewhere. However, there is no equivalent
> method in librados. Instead, it looks like you have to use
> "rados_mon_command_target" to mimic the functionality of the command-line
> tools.
Yeah, okay. Would adding those calls to librados address your use-case?
As Greg mentioned, putting together a separate libcrush.so is a bit of
work because the useful bits that encode/decode maps and so forth pull in
a bunch of generic Ceph code and there will be some annoying linking
issues to sort out if running alongside other Ceph code (like librados).
And even if we did all of that work, it'll push responsibility to the user
to make sure they have the latest osdmap. If the goal is to calculate
mappings for a running cluster, adding to librados seems like the
easiest path forward...
sage
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: libcrush.so
2015-05-11 17:01 ` libcrush.so Sage Weil
@ 2015-05-11 22:39 ` Zhou, Yuan
2015-05-11 22:52 ` libcrush.so Sage Weil
0 siblings, 1 reply; 12+ messages in thread
From: Zhou, Yuan @ 2015-05-11 22:39 UTC (permalink / raw)
To: Sage Weil, Cohen, David E
Cc: James (Fei) Liu-SSI, Ceph Development, Yu, Zhidong
Yes Sage, I think this will work for us.
In fact, we tested it this way: the first step is to make sure librados has the latest OSD map; then, based on the weights and the CRUSH rule, it can use crush_do_rule to get the acting set. Is this approach right?
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: libcrush.so
2015-05-11 22:39 ` libcrush.so Zhou, Yuan
@ 2015-05-11 22:52 ` Sage Weil
0 siblings, 0 replies; 12+ messages in thread
From: Sage Weil @ 2015-05-11 22:52 UTC (permalink / raw)
To: Zhou, Yuan
Cc: Cohen, David E, James (Fei) Liu-SSI, Ceph Development,
Yu, Zhidong
On Mon, 11 May 2015, Zhou, Yuan wrote:
> Yes Sage, I think this will work for us. In fact, we tested it this way:
> the first step is to make sure librados has the latest OSD map; then,
> based on the weights and the CRUSH rule, it can use crush_do_rule to get
> the acting set. Is this approach right?
Yeah, I think so.
There is already a wait_for_map() type call in librados, so I would make
the API call only calculate the mapping (using whatever the current map
is). The caller can then do one call to ensure a fresh map and then
calculate N mappings without any additional map sync overhead.
Sound right?
sage
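
The "sync once, then calculate N mappings" pattern described above can be sketched as follows. This is only an illustration under stated assumptions: the mapping call discussed in this thread does not exist yet, so both callables passed to map_objects() are hypothetical stand-ins. In a real client the refresh step would presumably correspond to the existing librados call rados_wait_for_latest_osdmap(), and fake_calc_mapping() below is a placeholder, not CRUSH.

```python
from typing import Callable, Dict, Iterable, List


def map_objects(
    wait_for_latest_map: Callable[[], None],
    calc_mapping: Callable[[str, str], List[int]],
    pool: str,
    object_names: Iterable[str],
) -> Dict[str, List[int]]:
    """Refresh the OSD map once, then compute every mapping against it."""
    wait_for_latest_map()  # single map sync for the whole batch
    # Each mapping is a pure calculation on whatever map is current.
    return {name: calc_mapping(pool, name) for name in object_names}


# Hypothetical stand-ins for illustration; neither is a real librados call.
sync_calls: List[int] = []


def fake_wait_for_latest_map() -> None:
    # Records that a map refresh was requested.
    sync_calls.append(1)


def fake_calc_mapping(pool: str, name: str) -> List[int]:
    # Deterministic placeholder, NOT CRUSH: derive three OSD ids from the name.
    base = sum(name.encode()) % 10
    return [base, (base + 1) % 10, (base + 2) % 10]


mappings = map_objects(
    fake_wait_for_latest_map,
    fake_calc_mapping,
    "rbd",
    ["obj-a", "obj-b", "obj-c"],
)
```

The point of the split is that the caller decides how fresh the map must be; the mapping call itself never triggers a map sync.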
>
> -----Original Message-----
> From: Sage Weil [mailto:sage@newdream.net]
> Sent: Tuesday, May 12, 2015 1:01 AM
> To: Cohen, David E
> Cc: James (Fei) Liu-SSI; Ceph Development; Yu, Zhidong; Zhou, Yuan
> Subject: RE: libcrush.so
>
> On Mon, 11 May 2015, Cohen, David E wrote:
> > The "ceph_get_osd_crush_location" method is made available via
> > libcephfs. In deployment scenarios that don't include CephFS, it would
> > be ideal if this method were also available. However, there is no
> > equivalent method available via librados. Instead, it looks like you
> > have to use "rados_mon_command_target" to mimic the functionality
> > of the command line tools.
>
> Yeah, okay. Would adding those calls to librados address your use-case?
>
> As Greg mentioned, putting together a separate libcrush.so is a bit of work because the useful bits that encode/decode maps and so forth pull in a bunch of generic Ceph code and there will be some annoying linking issues to sort out if running alongside other Ceph code (like librados).
> And even if we did all of that work, it'll push responsibility to the user to make sure they have the latest osdmap. If the goal is to calculate mappings for a running cluster, adding to librados seems like the easiest path forward...
>
> sage
>
> >
> > /dave.
> >
> > https://github.com/ceph/ceph/blob/master/debian/libcephfs-java.jlibs
> > https://github.com/ceph/ceph/blob/master/src/include/cephfs/libcephfs.h
> > https://github.com/ceph/ceph/blob/master/src/libcephfs.cc
> >
> >
> > -----Original Message-----
> > From: Sage Weil [mailto:sage@newdream.net]
> > Sent: Sunday, May 10, 2015 10:34 PM
> > To: Zhou, Yuan
> > Cc: James (Fei) Liu-SSI; Ceph Development; Cohen, David E; Yu, Zhidong
> > Subject: RE: libcrush.so
> >
> > On Sat, 9 May 2015, Zhou, Yuan wrote:
> > > Hi James,
> > >
> > > This usually happens when the storage platform and applications are in segmented networks. For example, in a cluster with multiple RGW instances, if we could know which RGW instance is closest to the primary copy, then we could do more efficient local reads/writes through a particular deployment.
> > > There's a feature in OpenStack Swift [1] that can provide the location of objects inside a cluster.
> >
> > I think the place for this is librados--and I believe there is already a method to do this (I think... I know there is in libcephfs as the hadoop bindings use it). That way you don't have to deal with the annoying details of getting an up to date map and so on.
> >
> > sage
> >
> >
> > >
> > > Thanks, -yuan
> > >
> > > [1] https://github.com/openstack/swift/blob/master/swift/common/middleware/list_endpoints.py
> > >
> > > -----Original Message-----
> > > From: James (Fei) Liu-SSI [mailto:james.liu@ssi.samsung.com]
> > > Sent: Saturday, May 9, 2015 1:40 AM
> > > To: Zhou, Yuan; Ceph Development
> > > Cc: Cohen, David E; Yu, Zhidong
> > > Subject: RE: libcrush.so
> > >
> > > Hi Yuan,
> > > Very interesting. Would it be possible to know why the application needs to access the CRUSH map directly instead of going through the ceph tool?
> > >
> > > Regards,
> > > James
> > >
> > > -----Original Message-----
> > > From: ceph-devel-owner@vger.kernel.org
> > > [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Zhou, Yuan
> > > Sent: Thursday, May 07, 2015 6:29 PM
> > > To: Ceph Development
> > > Cc: Cohen, David E; Yu, Zhidong
> > > Subject: libcrush.so
> > >
> > > > Ceph uses the CRUSH algorithm to map objects to OSD servers. This is great for clients, since they can talk to these OSDs directly. However, there are some scenarios where the application needs to access the CRUSH map, for load balancing as an example.
> > > >
> > > > Currently Ceph doesn't provide any API to render the layout. If your application needs to access the CRUSH map, you have to rely on the command 'ceph osd map pool_name obj_name'. With libcrush.so, we could let the application choose which nodes to access. Another advantage is that we could provide other bindings (Python, Go) based on it.
> > > >
> > > > From the git log we find that libcrush existed before but has been removed since Argonaut. Could anyone kindly share the background of this change?
> > >
> > >
> > > Thanks, -yuan
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel"
> > > in the body of a message to majordomo@vger.kernel.org More majordomo
> > > info at http://vger.kernel.org/majordomo-info.html
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel"
> > > in the body of a message to majordomo@vger.kernel.org More majordomo
> > > info at http://vger.kernel.org/majordomo-info.html
> > >
> > >
> >
> >
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2015-05-11 22:52 UTC | newest]
Thread overview: 12+ messages
2015-05-08 1:29 libcrush.so Zhou, Yuan
2015-05-08 4:37 ` libcrush.so Gregory Farnum
2015-05-09 1:39 ` libcrush.so Zhou, Yuan
2015-05-08 17:40 ` libcrush.so James (Fei) Liu-SSI
2015-05-08 17:57 ` libcrush.so Mark Nelson
2015-05-08 18:13 ` libcrush.so James (Fei) Liu-SSI
2015-05-09 1:25 ` libcrush.so Zhou, Yuan
2015-05-11 2:34 ` libcrush.so Sage Weil
2015-05-11 11:41 ` libcrush.so Cohen, David E
2015-05-11 17:01 ` libcrush.so Sage Weil
2015-05-11 22:39 ` libcrush.so Zhou, Yuan
2015-05-11 22:52 ` libcrush.so Sage Weil