* libcrush.so
@ 2015-05-08 1:29 Zhou, Yuan
  2015-05-08 4:37 ` libcrush.so Gregory Farnum
  2015-05-08 17:40 ` libcrush.so James (Fei) Liu-SSI
  0 siblings, 2 replies; 12+ messages in thread
From: Zhou, Yuan @ 2015-05-08 1:29 UTC (permalink / raw)
To: Ceph Development; +Cc: Cohen, David E, Yu, Zhidong

Ceph uses the CRUSH algorithm to map objects to OSD servers. This is great for clients, because they can talk to those OSDs directly. However, there are scenarios where the application itself needs to access the CRUSH map, for example for load balancing.

Currently Ceph doesn't provide any API to render the layout. If your application needs to access the CRUSH map, you have to rely on the command 'ceph osd map pool_name obj_name'. With a libcrush.so, we could let the application choose which nodes to access. Another advantage is that we could provide other bindings (Python, Go) on top of it.

From the git log we find that libcrush existed before but has been removed since Argonaut. Could anyone kindly share the background of this change?

Thanks, -yuan

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: libcrush.so
  2015-05-08 1:29 libcrush.so Zhou, Yuan
@ 2015-05-08 4:37 ` Gregory Farnum
  2015-05-09 1:39 ` libcrush.so Zhou, Yuan
  2015-05-08 17:40 ` libcrush.so James (Fei) Liu-SSI
  1 sibling, 1 reply; 12+ messages in thread
From: Gregory Farnum @ 2015-05-08 4:37 UTC (permalink / raw)
To: Zhou, Yuan; +Cc: Ceph Development, Cohen, David E, Yu, Zhidong

On Thu, May 7, 2015 at 6:29 PM, Zhou, Yuan <yuan.zhou@intel.com> wrote:
> From the git log we find that libcrush existed before but has been removed since Argonaut. Could anyone kindly share the background of this change?

I don't think there was ever a libcrush that was friendly for external use. There was a makefile-level "libcrush" but it got merged into libcommon, presumably for ease of maintenance.

The interfaces we use around CRUSH are just not very clean, IIRC; the C interface is opaque and the C++ CrushWrapper bits are...well, C++, and not easy to change into something separable from the OSDMap, either. :/

That said, if somebody wanted to rework the code interfaces to be nicer, PRs are always welcome. ;)
* RE: libcrush.so
  2015-05-08 4:37 ` libcrush.so Gregory Farnum
@ 2015-05-09 1:39 ` Zhou, Yuan
  0 siblings, 0 replies; 12+ messages in thread
From: Zhou, Yuan @ 2015-05-09 1:39 UTC (permalink / raw)
To: Gregory Farnum; +Cc: Ceph Development, Cohen, David E, Yu, Zhidong

Greg,

Thanks a lot for the info! Yes, the current CRUSH code relies heavily on internal data structures, and it's a bit difficult to extract.

For the APIs, do you have any ideas? Currently I think there should be:

  map object to pg: given the object name, return the pg
  map pg to osds: given the pg, return the osd list

We have done some tests here. Let me see if I can clean it up for a PR.

Thanks, -yuan
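The two calls proposed above (object to PG, then PG to OSD list) can be sketched as below. This is only a toy illustration of the interface shape, not Ceph code: the function names `object_to_pg` and `pg_to_osds` are hypothetical, SHA-256 stands in for Ceph's own hash, and rendezvous hashing stands in for CRUSH, which additionally honours the cluster hierarchy, weights and rules.

```python
import hashlib


def _h(*parts):
    """Stable 64-bit hash of the given parts (a stand-in, not Ceph's hash)."""
    data = "\0".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def object_to_pg(obj_name, pg_num):
    """Map an object name to a placement group number in [0, pg_num)."""
    return _h("obj", obj_name) % pg_num


def pg_to_osds(pg, osd_ids, replicas):
    """Pick `replicas` distinct OSDs for a PG by rendezvous hashing.

    Each OSD gets a pseudo-random score for this PG; the highest scores win.
    The first entry plays the role of the primary.
    """
    ranked = sorted(osd_ids, key=lambda osd: _h("pg", pg, osd), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    osds = list(range(6))
    pg = object_to_pg("my-object", pg_num=64)
    print(pg, pg_to_osds(pg, osds, replicas=3))
```

Keeping the two steps separate means an application can cache the PG for an object and only redo the second step when the set of OSDs changes.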
* RE: libcrush.so
  2015-05-08 1:29 libcrush.so Zhou, Yuan
  2015-05-08 4:37 ` libcrush.so Gregory Farnum
@ 2015-05-08 17:40 ` James (Fei) Liu-SSI
  2015-05-08 17:57 ` libcrush.so Mark Nelson
  2015-05-09 1:25 ` libcrush.so Zhou, Yuan
  1 sibling, 2 replies; 12+ messages in thread
From: James (Fei) Liu-SSI @ 2015-05-08 17:40 UTC (permalink / raw)
To: Zhou, Yuan, Ceph Development; +Cc: Cohen, David E, Yu, Zhidong

Hi Yuan,

Very interesting. Would it be possible to know why the application needs to access the CRUSH map directly instead of going through the ceph tool?

Regards,
James
* Re: libcrush.so
  2015-05-08 17:40 ` libcrush.so James (Fei) Liu-SSI
@ 2015-05-08 17:57 ` Mark Nelson
  2015-05-08 18:13 ` libcrush.so James (Fei) Liu-SSI
  2015-05-09 1:25 ` libcrush.so Zhou, Yuan
  1 sibling, 1 reply; 12+ messages in thread
From: Mark Nelson @ 2015-05-08 17:57 UTC (permalink / raw)
To: James (Fei) Liu-SSI, Zhou, Yuan, Ceph Development
Cc: Cohen, David E, Yu, Zhidong

FWIW, an easily buildable libcrush would be fantastic for simulation purposes (and things like avalanche analysis!) as well.

Mark
* RE: libcrush.so
  2015-05-08 17:57 ` libcrush.so Mark Nelson
@ 2015-05-08 18:13 ` James (Fei) Liu-SSI
  0 siblings, 0 replies; 12+ messages in thread
From: James (Fei) Liu-SSI @ 2015-05-08 18:13 UTC (permalink / raw)
To: Mark Nelson, Zhou, Yuan, Ceph Development; +Cc: Cohen, David E, Yu, Zhidong

Good to know. Thanks.
* RE: libcrush.so
  2015-05-08 17:40 ` libcrush.so James (Fei) Liu-SSI
  2015-05-08 17:57 ` libcrush.so Mark Nelson
@ 2015-05-09 1:25 ` Zhou, Yuan
  2015-05-11 2:34 ` libcrush.so Sage Weil
  1 sibling, 1 reply; 12+ messages in thread
From: Zhou, Yuan @ 2015-05-09 1:25 UTC (permalink / raw)
To: James (Fei) Liu-SSI, Ceph Development; +Cc: Cohen, David E, Yu, Zhidong

Hi James,

This usually happens when the storage platform and the applications are in segmented networks. For example, in a cluster with multiple RGW instances, if we knew which RGW instance is closest to the primary copy, we could do more efficient local reads/writes through a suitable deployment.

There's a feature in OpenStack Swift [1] that can provide the location of objects inside a cluster.

Thanks, -yuan

[1] https://github.com/openstack/swift/blob/master/swift/common/middleware/list_endpoints.py
* RE: libcrush.so
  2015-05-09 1:25 ` libcrush.so Zhou, Yuan
@ 2015-05-11 2:34 ` Sage Weil
  2015-05-11 11:41 ` libcrush.so Cohen, David E
  0 siblings, 1 reply; 12+ messages in thread
From: Sage Weil @ 2015-05-11 2:34 UTC (permalink / raw)
To: Zhou, Yuan
Cc: James (Fei) Liu-SSI, Ceph Development, Cohen, David E, Yu, Zhidong

On Sat, 9 May 2015, Zhou, Yuan wrote:
> This usually happens when the storage platform and the applications are in segmented networks. For example, in a cluster with multiple RGW instances, if we knew which RGW instance is closest to the primary copy, we could do more efficient local reads/writes through a suitable deployment.
> There's a feature in OpenStack Swift [1] that can provide the location of objects inside a cluster.

I think the place for this is librados--and I believe there is already a method to do this (I think... I know there is in libcephfs as the hadoop bindings use it). That way you don't have to deal with the annoying details of getting an up to date map and so on.

sage

> [1] https://github.com/openstack/swift/blob/master/swift/common/middleware/list_endpoints.py
* RE: libcrush.so
  2015-05-11 2:34 ` libcrush.so Sage Weil
@ 2015-05-11 11:41 ` Cohen, David E
  2015-05-11 17:01 ` libcrush.so Sage Weil
  0 siblings, 1 reply; 12+ messages in thread
From: Cohen, David E @ 2015-05-11 11:41 UTC (permalink / raw)
To: Sage Weil; +Cc: James (Fei) Liu-SSI, Ceph Development, Yu, Zhidong, Zhou, Yuan

The "ceph_get_osd_crush_location" method is made available via libcephfs. In deployment scenarios that don't include CephFS, it would be ideal if this method were also available. However, there is no equivalent method available via librados. Instead, it looks like you have to use "rados_mon_command_target" to mimic the functionality of the command line tools.

/dave.

https://github.com/ceph/ceph/blob/master/debian/libcephfs-java.jlibs
https://github.com/ceph/ceph/blob/master/src/include/cephfs/libcephfs.h
https://github.com/ceph/ceph/blob/master/src/libcephfs.cc
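The workaround described above, sending a monitor command to mimic 'ceph osd map pool_name obj_name', might look roughly like this. It is a sketch under assumptions: the helper name `osd_map` is hypothetical, the command is passed as JSON with the prefix "osd map" and "pool"/"object" arguments as the ceph tool is understood to send it, and the monitor call is injected as a callable so the helper does not depend on a running cluster.

```python
import json


def osd_map(mon_command, pool, obj):
    """Ask the monitors where `obj` in `pool` lives and return the decoded reply.

    `mon_command` is any callable shaped like the python-rados call
    Rados.mon_command(cmd, inbuf) -> (ret, outbuf, outs).
    """
    cmd = json.dumps({
        "prefix": "osd map",   # same command the ceph CLI issues
        "pool": pool,
        "object": obj,
        "format": "json",      # ask for machine-readable output
    })
    ret, outbuf, outs = mon_command(cmd, b"")
    if ret != 0:
        raise RuntimeError("osd map failed (%d): %s" % (ret, outs))
    return json.loads(outbuf)
```

With the python-rados binding this would be called as `osd_map(cluster.mon_command, "pool_name", "obj_name")` on a connected `rados.Rados` handle. The exact fields of the reply depend on the Ceph version, so they should be checked against a real cluster. Note that every lookup is a round trip to a monitor, which is the overhead a mapping call inside librados would avoid.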
* RE: libcrush.so
  2015-05-11 11:41 ` libcrush.so Cohen, David E
@ 2015-05-11 17:01 ` Sage Weil
  2015-05-11 22:39 ` libcrush.so Zhou, Yuan
  0 siblings, 1 reply; 12+ messages in thread
From: Sage Weil @ 2015-05-11 17:01 UTC (permalink / raw)
To: Cohen, David E
Cc: James (Fei) Liu-SSI, Ceph Development, Yu, Zhidong, Zhou, Yuan

On Mon, 11 May 2015, Cohen, David E wrote:
> The "ceph_get_osd_crush_location" method is made available via
> libcephfs. In deployment scenarios that don't include CephFS, it would
> be ideal if this method were also available. However, there is no
> equivalent method available via librados. Instead, it looks like you
> have to use "rados_mon_command_target" to mimic the functionality of
> the command line tools.

Yeah, okay. Would adding those calls to librados address your use-case?

As Greg mentioned, putting together a separate libcrush.so is a bit of work because the useful bits that encode/decode maps and so forth pull in a bunch of generic Ceph code and there will be some annoying linking issues to sort out if running alongside other Ceph code (like librados). And even if we did all of that work, it'll push responsibility to the user to make sure they have the latest osdmap. If the goal is to calculate mappings for a running cluster, adding to librados seems like the easiest path forward...

sage
* RE: libcrush.so
  2015-05-11 17:01 ` libcrush.so Sage Weil
@ 2015-05-11 22:39 ` Zhou, Yuan
  2015-05-11 22:52 ` libcrush.so Sage Weil
  0 siblings, 1 reply; 12+ messages in thread
From: Zhou, Yuan @ 2015-05-11 22:39 UTC (permalink / raw)
To: Sage Weil, Cohen, David E
Cc: James (Fei) Liu-SSI, Ceph Development, Yu, Zhidong

Yes Sage, I think this will work for us. In fact we tested it this way: the first step is to make sure librados has the latest OSD map; then, based on the weights and the CRUSH rule, it can use crush_do_rule to get the acting set. Is this approach right?
* RE: libcrush.so 2015-05-11 22:39 ` libcrush.so Zhou, Yuan @ 2015-05-11 22:52 ` Sage Weil 0 siblings, 0 replies; 12+ messages in thread From: Sage Weil @ 2015-05-11 22:52 UTC (permalink / raw) To: Zhou, Yuan Cc: Cohen, David E, James (Fei) Liu-SSI, Ceph Development, Yu, Zhidong On Mon, 11 May 2015, Zhou, Yuan wrote: > Yes Sage, I think this will work for us. In fact we did tested in this > way, the first step is to make librados have the latest osd map and then > based on the weight and the crush rule, it's able to use crush_do_rule > to get the acting set. Is this approach right? Yeah, I think so. There is already a wait_for_map() type call in librados, so I would make the API call only calculate the mapping (using whatever the current map is). The caller can then do one call to ensure a fresh map and then calculate N mappings without any additional map sync overhead. Sound right? sage > > -----Original Message----- > From: Sage Weil [mailto:sage@newdream.net] > Sent: Tuesday, May 12, 2015 1:01 AM > To: Cohen, David E > Cc: James (Fei) Liu-SSI; Ceph Development; Yu, Zhidong; Zhou, Yuan > Subject: RE: libcrush.so > > On Mon, 11 May 2015, Cohen, David E wrote: > > The "ceph_get_osd_crush_location" method is made available via > > libcephfs. In deployment scenarios that don't include CephFS it will > > be ideal if this method is also available. However, there is no > > equivalent method available via librados. Instead, it looks like you > > have to use the "rados_mon_command_target" to mimic the functionality > > of the command line tools. > > Yeah, okay. Would adding those calls to librados address your use-case? > > As Greg mentioned, putting together a separate libcrush.so is a bit of work because the useful bits that encode/decode maps and so forth pull in a bunch of generic Ceph code and there will be some annoying linking issues to sort out if running alongside other Ceph code (like librados). 
> And even if we did all of that work, it'll push responsibility to the user to make sure they have the latest osdmap. If the goal is to calculate mappings for a running cluster, adding to librados seems like the easiest path forward... > > sage > > > > > /dave. > > > > https://github.com/ceph/ceph/blob/master/debian/libcephfs-java.jlibs > > https://github.com/ceph/ceph/blob/master/src/include/cephfs/libcephfs.h > > https://github.com/ceph/ceph/blob/master/src/libcephfs.cc > > > > > > -----Original Message----- > > From: Sage Weil [mailto:sage@newdream.net] > > Sent: Sunday, May 10, 2015 10:34 PM > > To: Zhou, Yuan > > Cc: James (Fei) Liu-SSI; Ceph Development; Cohen, David E; Yu, Zhidong > > Subject: RE: libcrush.so > > > > On Sat, 9 May 2015, Zhou, Yuan wrote: > > > Hi James, > > > > > > This usually happens when the storage platform and applications are in segmented networks. For example, in a cluster with multiple RGW instances, if we could know which RGW instance is the closest to the primary copy, then we can do more efficient local read/write through some particular deployment. > > > There's one feature in OpenStack Swift [1] which is able to provide the location of objects inside a cluster. > > > > I think the place for this is librados--and I believe there is already a method to do this (I think... I know there is in libcephfs as the hadoop bindings use it). That way you don't have to deal with the annoying details of getting an up to date map and so on. > > > > sage > > > > > > > > > > Thanks, -yuan > > > > > > [1] https://github.com/openstack/swift/blob/master/swift/common/middleware/list_endpoints.py > > > > > > -----Original Message----- > > > From: James (Fei) Liu-SSI [mailto:james.liu@ssi.samsung.com] > > > Sent: Saturday, May 9, 2015 1:40 AM > > > To: Zhou, Yuan; Ceph Development > > > Cc: Cohen, David E; Yu, Zhidong > > > Subject: RE: libcrush.so > > > > > > Hi Yuan, > > > Very interesting.
Would it be possible to know why the application needs to access the crush map directly instead of accessing it through the ceph tool? > > > > > > Regards, > > > James > > > > > > -----Original Message----- > > > From: ceph-devel-owner@vger.kernel.org [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Zhou, Yuan > > > Sent: Thursday, May 07, 2015 6:29 PM > > > To: Ceph Development > > > Cc: Cohen, David E; Yu, Zhidong > > > Subject: libcrush.so > > > > > > Ceph uses the crush algorithm to provide the mapping of objects to OSD servers. This is great for clients so they could talk to these OSDs directly. However there are some scenarios where the application needs to access the crush map, for load-balancing as an example. > > > > > > Currently Ceph doesn't provide any API to render the layout. If your application needs to access the crush map you're going to rely on the command 'ceph osd map pool_name obj_name'. With this libcrush.so we could let the application choose which nodes to access. The other advantage is we could provide some other bindings (python, go) based on this also. > > > > > > From the git log we find libcrush was there before but has been removed since Argonaut. Can anyone kindly share the background of this change? > > > > > > > > > Thanks, -yuan > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
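Sage's suggestion in the message above (one call to ensure a fresh map, then N mapping calculations against the cached map) can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: `ToyCluster`, `wait_for_latest_map()` and `calc_mapping()` are hypothetical names, not librados calls, and the placement function is plain rendezvous hashing used as a stand-in, not CRUSH.

```python
import hashlib


class ToyCluster:
    """Hypothetical stand-in for a cluster handle; NOT the librados API."""

    def __init__(self, osds, replicas=3):
        self._live_osds = list(osds)  # what the monitors currently know
        self._cached_osds = []        # client-side copy of the map
        self.replicas = replicas
        self.map_syncs = 0            # number of map refreshes performed

    def wait_for_latest_map(self):
        """Refresh the cached map; plays the role of a wait_for_map()-type call."""
        self._cached_osds = list(self._live_osds)
        self.map_syncs += 1

    def calc_mapping(self, pool, obj):
        """Compute a placement from the cached map only, with no map sync.

        Rendezvous hashing is a placeholder here; it is NOT the CRUSH algorithm.
        """
        def score(osd):
            key = f"{pool}/{obj}/{osd}".encode()
            return hashlib.sha256(key).hexdigest()

        return sorted(self._cached_osds, key=score)[: self.replicas]


cluster = ToyCluster(osds=range(8))
cluster.wait_for_latest_map()  # one sync up front
names = [f"obj{i}" for i in range(100)]
# N purely local calculations against whatever map is cached
mappings = {n: cluster.calc_mapping("rbd", n) for n in names}
```

Keeping the sync and the calculation as separate calls is the design point: the caller decides how fresh the map needs to be, and the per-object cost is only the local computation.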
end of thread, other threads:[~2015-05-11 22:52 UTC | newest]
Thread overview: 12+ messages
2015-05-08 1:29 libcrush.so Zhou, Yuan
2015-05-08 4:37 ` libcrush.so Gregory Farnum
2015-05-09 1:39 ` libcrush.so Zhou, Yuan
2015-05-08 17:40 ` libcrush.so James (Fei) Liu-SSI
2015-05-08 17:57 ` libcrush.so Mark Nelson
2015-05-08 18:13 ` libcrush.so James (Fei) Liu-SSI
2015-05-09 1:25 ` libcrush.so Zhou, Yuan
2015-05-11 2:34 ` libcrush.so Sage Weil
2015-05-11 11:41 ` libcrush.so Cohen, David E
2015-05-11 17:01 ` libcrush.so Sage Weil
2015-05-11 22:39 ` libcrush.so Zhou, Yuan
2015-05-11 22:52 ` libcrush.so Sage Weil