* Extra daemons/servers reporting to mgr
From: John Spray @ 2017-06-11 12:04 UTC
To: Ceph Development

MgrClient instances (such as those in the daemons, and those in every
librados instance) open a session with ceph-mgr where they identify
themselves by entity name and type. ceph-mgr sends a MgrConfigure
message, which tells the client whether to bother sending stats and how
often. ceph-mgr also keeps a copy of the metadata (a la "ceph osd
metadata...") for OSD and MDS daemons -- it loads all that up at
startup, and then also freshens it when it sees a daemon restart or
sees a new daemon.

We would like to have something similar so that the mgr can be aware
of the existence of other services like RGW gateways, RBD mirror
services, perhaps also NFS gateways.

The information about each daemon would at a minimum be its identity,
type, and some static metadata. It might also include some dynamic
state/health structure. The challenging part here is how to expose
that to the various daemons, given that things like RGW are not known
in advance to core Ceph and that they just consume the librados
interface.

It doesn't feel like a particularly natural thing for librados, but
ultimately whatever we expose to rgw/rbd is de-facto librados, even if
we put it in a different library or whatever.

So far I've got as far as thinking we should have an extra call just
in the C++ bindings that lets callers say "Hi, I'm a service not just
a client, and here's a map of metadata", that they call one time
between creating their RadosClient and connecting to the cluster.

John
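
To make the "extra call" idea a bit more concrete, here is a minimal sketch of the calling convention only: register once, before connecting. It is a self-contained mock, not librados code. ClusterHandle, register_as_service() and announced_type() are hypothetical names invented for illustration, and the rule that a late or repeated registration is rejected is an assumption about how such a call might behave.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a cluster handle; this is NOT the real RadosClient.
class ClusterHandle {
public:
  typedef std::map<std::string, std::string> Metadata;

  explicit ClusterHandle(std::string entity_name)
      : entity_name_(std::move(entity_name)),
        connected_(false),
        is_service_(false) {}

  // "Hi, I'm a service not just a client, and here's a map of metadata."
  // Assumed rule: allowed exactly once, and only before connect().
  // Returns false if called too late or a second time.
  bool register_as_service(const std::string& service_type, Metadata metadata) {
    if (connected_ || is_service_) return false;
    is_service_ = true;
    service_type_ = service_type;
    metadata_ = std::move(metadata);
    return true;
  }

  // In this mock, connecting only freezes the identity chosen so far.
  void connect() { connected_ = true; }

  // Type the session would announce to the mgr: the registered service
  // type if there is one, otherwise plain "client".
  std::string announced_type() const {
    return is_service_ ? service_type_ : "client";
  }

  const std::string& entity_name() const { return entity_name_; }
  const Metadata& metadata() const { return metadata_; }

private:
  std::string entity_name_;
  bool connected_;
  bool is_service_;
  std::string service_type_;
  Metadata metadata_;
};
```

A gateway would construct the handle, call register_as_service("rgw", {...}) and then connect(); a plain client skips the middle step and keeps announcing itself as "client".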
* Re: Extra daemons/servers reporting to mgr
From: Daniel Gryniewicz @ 2017-06-12 14:14 UTC
To: John Spray, Ceph Development

On 06/11/2017 08:04 AM, John Spray wrote:
> So far I've got as far as thinking we should have an extra call just
> in the C++ bindings that lets callers say "Hi, I'm a service not just
> a client, and here's a map of metadata", that they call one time
> between creating their RadosClient and connecting to the cluster.

This seems fine, but if we want NFS gateways, too, then a C API would be
helpful.

Daniel
* Re: Extra daemons/servers reporting to mgr
From: Matt Benjamin @ 2017-06-12 14:26 UTC
To: dang; +Cc: John Spray, Ceph Development

The nfs gateways are clients of either libcephfs or librgw, so in the
first instance, I would say that librgw, for example, which creates an
RGW instance, should be the client of this API.

Matt

----- Original Message -----
> From: "Daniel Gryniewicz" <dang@redhat.com>
> To: "John Spray" <jspray@redhat.com>, "Ceph Development" <ceph-devel@vger.kernel.org>
> Sent: Monday, June 12, 2017 10:14:02 AM
> Subject: Re: Extra daemons/servers reporting to mgr
>
> This seems fine, but if we want NFS gateways, too, then a C API would be
> helpful.
>
> Daniel

--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309
* Re: Extra daemons/servers reporting to mgr
From: Yehuda Sadeh-Weinraub @ 2017-06-14 19:03 UTC
To: Matt Benjamin; +Cc: Daniel Gryniewicz, John Spray, Ceph Development

On Mon, Jun 12, 2017 at 7:26 AM, Matt Benjamin <mbenjamin@redhat.com> wrote:
> The nfs gateways are clients of either libcephfs or librgw, so in the
> first instance, I would say that librgw, for example, which creates an
> RGW instance, should be the client of this API.

Yes, but should be able to provide radosgw specific info that the nfs
gateway does not provide (e.g., port, ssl).

Yehuda
* Re: Extra daemons/servers reporting to mgr
From: Jason Dillaman @ 2017-06-12 14:47 UTC
To: John Spray; +Cc: Ceph Development

On Sun, Jun 11, 2017 at 8:04 AM, John Spray <jspray@redhat.com> wrote:
> So far I've got as far as thinking we should have an extra call just
> in the C++ bindings that lets callers say "Hi, I'm a service not just
> a client, and here's a map of metadata", that they call one time
> between creating their RadosClient and connecting to the cluster.

... plus an additional API for updating some key/value dynamic metadata?

--
Jason
* Re: Extra daemons/servers reporting to mgr
From: Casey Bodley @ 2017-06-12 18:09 UTC
To: John Spray, Ceph Development

On 06/11/2017 08:04 AM, John Spray wrote:
> So far I've got as far as thinking we should have an extra call just
> in the C++ bindings that lets callers say "Hi, I'm a service not just
> a client, and here's a map of metadata", that they call one time
> between creating their RadosClient and connecting to the cluster.

Hi John,

Is there a reason that radosgw can't instantiate its own MgrClient
instance and report status through that, instead of having to go through
librados? If we have to report an entity instance, we could choose from
one of our Rados clients.

Casey
* Re: Extra daemons/servers reporting to mgr
From: Matt Benjamin @ 2017-06-12 20:03 UTC
To: Casey Bodley; +Cc: John Spray, Ceph Development

I agree, though I'm unsure how much this matters either way?

Matt

----- Original Message -----
>
> Hi John,
>
> Is there a reason that radosgw can't instantiate its own MgrClient
> instance and report status through that, instead of having to go through
> librados? If we have to report an entity instance, we could choose from
> one of our Rados clients.
>
> Casey
* Re: Extra daemons/servers reporting to mgr
From: Casey Bodley @ 2017-06-12 20:24 UTC
To: Ceph Development

Just that it could obviate the need to expose and support it as a public
librados api.

On 06/12/2017 04:03 PM, Matt Benjamin wrote:
> I agree, though I'm unsure how much this matters either way?
>
> Matt
* Re: Extra daemons/servers reporting to mgr
From: John Spray @ 2017-06-12 20:33 UTC
To: Casey Bodley; +Cc: Ceph Development

On Mon, Jun 12, 2017 at 2:09 PM, Casey Bodley <cbodley@redhat.com> wrote:
> Hi John,
>
> Is there a reason that radosgw can't instantiate its own MgrClient instance
> and report status through that, instead of having to go through librados? If
> we have to report an entity instance, we could choose from one of our Rados
> clients.

Currently librados does internally have a MgrClient already, so I
think I was instinctively looking to reuse that.

However, the MgrClient in librados is only there so that it can issue
commands (i.e. for use in CLI), so one option would be to disable that
MgrClient by default (unless someone is actually calling
mgr_command()), and instantiate an external one.

For things like NFS-ganesha though, we can't use MgrClient unless we
formalize it as an external API...

John
* Re: Extra daemons/servers reporting to mgr
From: Gregory Farnum @ 2017-06-14 17:43 UTC
To: John Spray; +Cc: Casey Bodley, Ceph Development

On Mon, Jun 12, 2017 at 1:33 PM, John Spray <jspray@redhat.com> wrote:
> Currently librados does internally have a MgrClient already, so I
> think I was instinctively looking to reuse that.
>
> However, the MgrClient in librados is only there so that it can issue
> commands (i.e. for use in CLI), so one option would be to disable that
> MgrClient by default (unless someone is actually calling
> mgr_command()), and instantiate an external one.
>
> For things like NFS-ganesha though, we can't use MgrClient unless we
> formalize it as an external API...

Perhaps somebody can draw up what data we actually want to share from
NFS-Ganesha to the manager? (Or from RGW?)

It's not unreasonable to let RGW set up its own MgrClient, since it's
an in-tree service and can be kept more tightly bound than external
librados users. But if we can make a simple interface that serves both
needs, that would be better. Maybe something like

void share_metadata(map<string,string>);

that can be invoked whenever desired, and the MgrClient sends that in
whenever it's doing a report? Or maybe we want something that more
explicitly maps from static metadata (like kind and version info)
versus perfcounter-like state.

But I'm not sure we actually want to restrict shared data to going in
via services. It seems perfectly reasonable to collect IO and other
statistics from cooperating clients, if we can do that without
overwhelming the manager service.

-Greg
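
As a rough illustration of the share_metadata() idea above (set the map whenever you like, and whatever is current goes out with the next report), here is a small self-contained mock. Reporter, Report and build_report() are hypothetical names standing in for the reporting side of a MgrClient; nothing here is the real class, the sequence number is an invented detail, and locking is left out for brevity.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the reporting half of a MgrClient.
class Reporter {
public:
  typedef std::map<std::string, std::string> Metadata;

  // What one periodic report would carry in this sketch.
  struct Report {
    unsigned long seq;   // invented counter, 1 for the first report
    Metadata metadata;   // snapshot of the most recently shared map
  };

  Reporter() : next_seq_(1) {}

  // May be called at any time; replaces whatever was shared before.
  void share_metadata(Metadata m) { shared_ = std::move(m); }

  // Called on every reporting tick; copies the current map into the report.
  Report build_report() {
    Report r;
    r.seq = next_seq_++;
    r.metadata = shared_;
    return r;
  }

private:
  unsigned long next_seq_;
  Metadata shared_;
};
```

The caller never sends anything itself: it only updates the map, and the reporting cadence stays under the control of whatever the mgr configured.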
* Re: Extra daemons/servers reporting to mgr
From: Yehuda Sadeh-Weinraub @ 2017-06-14 19:35 UTC
To: John Spray; +Cc: Ceph Development

On Sun, Jun 11, 2017 at 5:04 AM, John Spray <jspray@redhat.com> wrote:
> The information about each daemon would at a minimum be its identity,
> type, and some static metadata. It might also include some dynamic
> state/health structure. The challenging part here is how to expose
> that to the various daemons, given that things like RGW are not known
> in advance to core Ceph and that they just consume the librados
> interface.

To start with, for radosgw process we could provide:

 - realm name (+ id)
 - current period
 - zonegroup name (+ id)
 - zone name (+ id )
 - instance id # (changes between runs)
 - listening ip:port
 - list of domains handled (maybe? can be quite large)
 - max number of concurrent requests
 - current # of requests (or max # of requests in the past X minutes)

There is some other info that falls under different aspects of the rgw
cluster, e.g., all the realm info / structure, the zonegroup config,
zone config. This kind of data should be a category of its own, and
more of a container to the specific radosgw process info. Does it make
sense in the ceph-mgr context?

> It doesn't feel like a particularly natural thing for librados, but
> ultimately whatever we expose to rgw/rbd is de-facto librados, even if
> we put it in a different library or whatever.
>
> So far I've got as far as thinking we should have an extra call just
> in the C++ bindings that lets callers say "Hi, I'm a service not just
> a client, and here's a map of metadata", that they call one time
> between creating their RadosClient and connecting to the cluster.

It'd be great if the client itself could provide a flexible schema
mapping of the info that it needs to expose. Or if there was some
other generic way to do it. Was thinking something like the way you
send a json to elasticsearch and it generates a doc out of it, but
there's also a way to create a fixed mapping for the stored data.

Yehuda
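
For reference, the radosgw items listed above would fit into a flat string map roughly as follows. This is only a sketch: RgwDaemonInfo, to_metadata() and every key name are hypothetical, chosen for illustration, and joining the domain list with commas is just one possible encoding.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical container for the per-process radosgw items listed above.
struct RgwDaemonInfo {
  std::string realm_name, realm_id;
  std::string current_period;
  std::string zonegroup_name, zonegroup_id;
  std::string zone_name, zone_id;
  std::string instance_id;            // changes between runs
  std::string listen_addr;            // "ip:port"
  std::vector<std::string> domains;   // may be large
  unsigned max_concurrent_requests;
  unsigned current_requests;

  RgwDaemonInfo() : max_concurrent_requests(0), current_requests(0) {}
};

// Flatten the struct into the map<string,string> shape discussed in the thread.
// All key names are made up for this example.
std::map<std::string, std::string> to_metadata(const RgwDaemonInfo& info) {
  std::map<std::string, std::string> md;
  md["realm_name"] = info.realm_name;
  md["realm_id"] = info.realm_id;
  md["current_period"] = info.current_period;
  md["zonegroup_name"] = info.zonegroup_name;
  md["zonegroup_id"] = info.zonegroup_id;
  md["zone_name"] = info.zone_name;
  md["zone_id"] = info.zone_id;
  md["instance_id"] = info.instance_id;
  md["listen_addr"] = info.listen_addr;

  // Join the domains with commas; an empty list yields an empty string.
  std::string joined;
  for (std::size_t i = 0; i < info.domains.size(); ++i) {
    if (i > 0) joined += ",";
    joined += info.domains[i];
  }
  md["domains"] = joined;

  md["max_concurrent_requests"] = std::to_string(info.max_concurrent_requests);
  md["current_requests"] = std::to_string(info.current_requests);
  return md;
}
```

Everything ends up as a string, which is the main limitation of a plain map: numbers and lists need an agreed encoding on both sides.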
* Re: Extra daemons/servers reporting to mgr
From: Sage Weil @ 2017-06-14 19:50 UTC
To: Yehuda Sadeh-Weinraub; +Cc: John Spray, Ceph Development

On Wed, 14 Jun 2017, Yehuda Sadeh-Weinraub wrote:
> To start with, for radosgw process we could provide:

It seems like these fall into two categories:

1/ static information about the daemon instance

> - realm name (+ id)
> - zonegroup name (+ id)
> - zone name (+ id )
> - instance id # (changes between runs)
> - listening ip:port
> - list of domains handled (maybe? can be quite large)
> - max number of concurrent requests

This stuff would populate a 'running rgw daemons' type view, and could be
used by an agent that dynamically configures a load balancer (haproxy or
whatever).

2/ dynamic values that can be exposed as perfcounters

> - current period
> - current # of requests (or max # of requests in the past X minutes)
 - current bandwidth
 - etc

Is there anything that is updated at runtime (not startup) that
doesn't fit into a perfcounter? If not, then a single call to something
like

 rados_register_daemon(name, metadata_map)

(and some mechanism for cleaning out dead people) ought to suffice?

> There is some other info that falls under different aspects of the rgw
> cluster, e.g., all the realm info / structure, the zonegroup config,
> zone config. This kind of data should be a category of its own, and
> more of a container to the specific radosgw process info. Does it make
> sense in the ceph-mgr context?

This stuff any potential mgr module can pull out of the zone metadata pool
itself to display, right? rgw doesn't need to translate it and
report it?

> It'd be great if the client itself could provide a flexible schema
> mapping of the info that it needs to expose. Or if there was some
> other generic way to do it. Was thinking something like the way you
> send a json to elasticsearch and it generates a doc out of it, but
> there's also a way to create a fixed mapping for the stored data.

Hmm, instead of a map<string,string> it could just be a json blob that
the mgr stores that it is up to the user to parse and interpret. Or, if
there is info that doesn't fit into a simple dict, then one dict item
("extra_stuff") can have a value consisting of encoded json.

sage
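
A minimal sketch of the second option above, where the metadata stays a flat dict and anything that does not fit goes into one item as encoded JSON. The helper names json_escape(), to_json_object() and make_metadata() are hypothetical, the encoder only handles a flat object of string values, and a real implementation would more likely use an existing JSON formatter than hand-rolled escaping.

```cpp
#include <cstdio>
#include <map>
#include <string>

typedef std::map<std::string, std::string> Dict;

// Escape a string so it can sit inside a JSON string literal.
std::string json_escape(const std::string& in) {
  std::string out;
  for (std::string::size_type i = 0; i < in.size(); ++i) {
    unsigned char c = static_cast<unsigned char>(in[i]);
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c < 0x20) {
          // Remaining control characters become \u00XX escapes.
          char buf[7];
          std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
          out += buf;
        } else {
          out += static_cast<char>(c);
        }
    }
  }
  return out;
}

// Encode a flat dict as a JSON object whose values are all strings.
std::string to_json_object(const Dict& d) {
  std::string out = "{";
  bool first = true;
  for (Dict::const_iterator it = d.begin(); it != d.end(); ++it) {
    if (!first) out += ",";
    first = false;
    out += "\"" + json_escape(it->first) + "\":\"" + json_escape(it->second) + "\"";
  }
  return out + "}";
}

// Keep the simple items as plain dict entries and tuck the rest into a
// single "extra_stuff" item holding encoded JSON.
Dict make_metadata(const Dict& simple, const Dict& extra) {
  Dict md = simple;
  md["extra_stuff"] = to_json_object(extra);
  return md;
}
```

The mgr would store "extra_stuff" as an opaque string here; only a consumer that knows the service would decode it.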
* Re: Extra daemons/servers reporting to mgr
2017-06-14 19:50 ` Sage Weil
@ 2017-06-14 20:45 ` Yehuda Sadeh-Weinraub
2017-06-14 21:20 ` Jason Dillaman
1 sibling, 0 replies; 20+ messages in thread
From: Yehuda Sadeh-Weinraub @ 2017-06-14 20:45 UTC (permalink / raw)
To: Sage Weil; +Cc: John Spray, Ceph Development

On Wed, Jun 14, 2017 at 12:50 PM, Sage Weil <sweil@redhat.com> wrote:
> On Wed, 14 Jun 2017, Yehuda Sadeh-Weinraub wrote:
>> On Sun, Jun 11, 2017 at 5:04 AM, John Spray <jspray@redhat.com> wrote:
>> > MgrClient instances (such as those in the daemons, and those in every
>> > librados instance) open a session with ceph-mgr where they identify
>> > themselves by entity name and type. ceph-mgr sends a MgrConfigure
>> > message, which tells the client whether to bother sending stats and how
>> > often. ceph-mgr also keeps a copy of the metadata (a la "ceph osd
>> > metadata...") for OSD and MDS daemons -- it loads all that up at
>> > startup, and then also freshens it when it sees a daemon restart or
>> > sees a new daemon.
>> >
>> > We would like to have something similar so that the mgr can be aware
>> > of the existence of other services like RGW gateways, RBD mirror
>> > services, perhaps also NFS gateways.
>> >
>> > The information about each daemon would at a minimum be its identity,
>> > type, and some static metadata. It might also include some dynamic
>> > state/health structure. The challenging part here is how to expose
>> > that to the various daemons, given that things like RGW are not known
>> > in advance to core Ceph and that they just consume the librados
>> > interface.
>>
>>
>> To start with, for radosgw process we could provide:
>
> It seems like these fall into two categories:
>
> 1/ static information about the daemon instance
>
>> - realm name (+ id)
>> - zonegroup name (+ id)
>> - zone name (+ id )
>> - instance id # (changes between runs)
>> - listening ip:port
>> - list of domains handled (maybe? can be quite large)
>> - max number of concurrent requests
>
> This stuff would populate a 'running rgw daemons' type view, and could be
> used by an agent that dynamically configures a load balancer (haproxy or
> whatever).
>
> 2/ dynamic values that can be exposed as perfcounters
>
>> - current period
>> - current # of requests (or max # of requests in the past X minutes)
> - current bandwidth
> - etc
>
> Is there anything that is updated at runtime (not startup) that
> doesn't fit into a perfcounter? If not, then a single call to something
> like rados_register_daemon(name, metadata_map) (and some mechanism
> for cleaning out dead people) ought to suffice?
>
>> There is some other info that falls under different aspects of the rgw
>> cluster, e.g., all the realm info / structure, the zonegroup config,
>> zone config. This kind of data should be a category of its own, and
>> more of a container to the specific radosgw process info. Does it make
>> sense in the ceph-mgr context?
>
> This stuff any potential mgr module can pull out of the zone metadata pool
> itself to display, right? rgw doesn't need to translate it and
> report it?

Right, that's another option.

>
>> > It doesn't feel like a particularly natural thing for librados, but
>> > ultimately whatever we expose to rgw/rbd is de-facto librados, even if
>> > we put it in a different library or whatever.
>> >
>> > So far I've got as far as thinking we should have an extra call just
>> > in the C++ bindings that lets callers say "Hi, I'm a service not just
>> > a client, and here's a map of metadata", that they call one time
>> > between creating their RadosClient and connecting to the cluster.
>> >
>>
>>
>> It'd be great if the client itself could provide a flexible schema
>> mapping of the info that it needs to expose. Or if there was some
>> other generic way to do it. Was thinking something like the way you
>> send a json to elasticsearch and it generates a doc out of it, but
>> there's also a way to create a fixed mapping for the stored data.
>
> Hmm, instead of a map<string,string> it could just be a json blob that
> the mgr stores that it is up to the user to parse and interpret. Or, if
> there is info that doesn't fit into a simple dict, then one dict item
> ("extra_stuff") can have a value consisting of encoded json.
>

Providing a json is probably easy and generic enough.

Yehuda
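To make the two encodings discussed in this message concrete, here is a minimal Python sketch of a flat string-to-string metadata map in which anything structured is packed into a single JSON-encoded item. Only the key name "extra_stuff" comes from the thread; the helper names and the sample rgw fields are hypothetical and are not part of any Ceph API.

```python
import json


def pack_metadata(simple, extra=None):
    """Build a flat str->str map; structured data goes into one JSON item."""
    metadata = {str(key): str(value) for key, value in simple.items()}
    if extra is not None:
        # "extra_stuff" is the item name suggested in the thread.
        metadata["extra_stuff"] = json.dumps(extra, sort_keys=True)
    return metadata


def unpack_metadata(metadata):
    """Split the flat map back into simple items and the decoded JSON part."""
    simple = {k: v for k, v in metadata.items() if k != "extra_stuff"}
    extra = None
    if "extra_stuff" in metadata:
        extra = json.loads(metadata["extra_stuff"])
    return simple, extra


# Hypothetical rgw-style metadata, for illustration only.
example = pack_metadata(
    {"zone": "us-east", "max_requests": 100},
    {"domains": ["a.example", "b.example"]},
)
```

The receiving side only ever stores strings, so it would not need to understand the nested structure; interpreting "extra_stuff" is left to whoever reads it.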
* Re: Extra daemons/servers reporting to mgr
2017-06-14 19:50 ` Sage Weil
2017-06-14 20:45 ` Yehuda Sadeh-Weinraub
@ 2017-06-14 21:20 ` Jason Dillaman
2017-06-19 19:26 ` Sage Weil
1 sibling, 1 reply; 20+ messages in thread
From: Jason Dillaman @ 2017-06-14 21:20 UTC (permalink / raw)
To: Sage Weil; +Cc: Yehuda Sadeh-Weinraub, John Spray, Ceph Development

On Wed, Jun 14, 2017 at 3:50 PM, Sage Weil <sweil@redhat.com> wrote:
> Is there anything that is updated at runtime (not startup) that
> doesn't fit into a perfcounter? If not, then a single call to something
> like rados_register_daemon(name, metadata_map) (and some mechanism
> for cleaning out dead people) ought to suffice?

My 2 cents: for rbd-mirror, I was thinking of injecting dynamic health
state like "X images in error state" so that things like dashboard
could have that data available for display without having to poll the
data from the backing status objects in the OSDs. Low-level details
about which images have issues would still require a pull.

--
Jason
* Re: Extra daemons/servers reporting to mgr
2017-06-14 21:20 ` Jason Dillaman
@ 2017-06-19 19:26 ` Sage Weil
2017-06-20 20:39 ` Gregory Farnum
0 siblings, 1 reply; 20+ messages in thread
From: Sage Weil @ 2017-06-19 19:26 UTC (permalink / raw)
To: dillaman; +Cc: Yehuda Sadeh-Weinraub, John Spray, Ceph Development

I wrote up a quick proposal at

http://pad.ceph.com/p/service-map

Basic idea:

- generic ServiceMap of service -> daemon -> metadata and status
- managed/persisted by mon
- librados interface to register as service X name Y (e.g., 'rgw.foo')
- librados will send regular beacon to mon to keep entry alive
- various mon commands to dump all or part of the service map

sage
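The shape of the proposal in this message (service -> daemon -> metadata and status, kept alive by a regular beacon) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class layout, the method names, and the beacon_grace timeout are hypothetical, and this is not the ServiceMap code that exists in Ceph.

```python
import time


class ServiceMap:
    """Toy map of service -> daemon -> metadata, status, last beacon time."""

    def __init__(self, beacon_grace=60.0, clock=time.monotonic):
        self.beacon_grace = beacon_grace  # assumed expiry window, in seconds
        self.clock = clock
        self.services = {}

    def register(self, service, daemon, metadata):
        """Register daemon `daemon` of service `service`, e.g. ('rgw', 'foo')."""
        entry = {
            "metadata": dict(metadata),
            "status": {},
            "last_beacon": self.clock(),
        }
        self.services.setdefault(service, {})[daemon] = entry

    def beacon(self, service, daemon, status=None):
        """Refresh the entry; raises KeyError if the daemon never registered."""
        entry = self.services[service][daemon]
        entry["last_beacon"] = self.clock()
        if status is not None:
            entry["status"] = dict(status)

    def prune(self):
        """Drop entries whose last beacon is older than the grace period."""
        now = self.clock()
        removed = []
        for service in list(self.services):
            daemons = self.services[service]
            for daemon in list(daemons):
                if now - daemons[daemon]["last_beacon"] > self.beacon_grace:
                    del daemons[daemon]
                    removed.append(f"{service}.{daemon}")
            if not daemons:
                del self.services[service]
        return removed
```

Injecting the clock keeps the expiry logic testable without sleeping; a real implementation would also have to decide where the map is persisted, which is the subject of the replies that follow.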
* Re: Extra daemons/servers reporting to mgr
2017-06-19 19:26 ` Sage Weil
@ 2017-06-20 20:39 ` Gregory Farnum
2017-06-20 21:00 ` Sage Weil
0 siblings, 1 reply; 20+ messages in thread
From: Gregory Farnum @ 2017-06-20 20:39 UTC (permalink / raw)
To: Sage Weil
Cc: Jason Dillaman, Yehuda Sadeh-Weinraub, John Spray, Ceph Development

On Mon, Jun 19, 2017 at 12:26 PM, Sage Weil <sweil@redhat.com> wrote:
> I wrote up a quick proposal at
>
> http://pad.ceph.com/p/service-map
>
> Basic idea:
>
> - generic ServiceMap of service -> daemon -> metadata and status
> - managed/persisted by mon
> - librados interface to register as service X name Y (e.g., 'rgw.foo')
> - librados will send regular beacon to mon to keep entry alive
> - various mon commands to dump all or part of the service map

I am deeply uncomfortable with putting this stuff into the monitor (at
least, directly). The main purpose we've discussed is to enable
manager dashboard display of these services, along with stats
collection, and there's no reason for that to go anywhere other than
the manager -- in fact, routing it through the monitor is inimical to
timely updates of statistics. Why do you want to do that instead of
letting it be handled by the manager, which can aggregate and persist
whatever data it likes in a convenient form -- and in ways which are
mindful of monitor IO abilities?

(The librados interface looks fine to me, though.)
-Greg
* Re: Extra daemons/servers reporting to mgr
2017-06-20 20:39 ` Gregory Farnum
@ 2017-06-20 21:00 ` Sage Weil
2017-06-20 21:24 ` Gregory Farnum
0 siblings, 1 reply; 20+ messages in thread
From: Sage Weil @ 2017-06-20 21:00 UTC (permalink / raw)
To: Gregory Farnum
Cc: Jason Dillaman, Yehuda Sadeh-Weinraub, John Spray, Ceph Development

On Tue, 20 Jun 2017, Gregory Farnum wrote:
> On Mon, Jun 19, 2017 at 12:26 PM, Sage Weil <sweil@redhat.com> wrote:
> > I wrote up a quick proposal at
> >
> > http://pad.ceph.com/p/service-map
> >
> > Basic idea:
> >
> > - generic ServiceMap of service -> daemon -> metadata and status
> > - managed/persisted by mon
> > - librados interface to register as service X name Y (e.g., 'rgw.foo')
> > - librados will send regular beacon to mon to keep entry alive
> > - various mon commands to dump all or part of the service map
>
> I am deeply uncomfortable with putting this stuff into the monitor (at
> least, directly). The main purpose we've discussed is to enable
> manager dashboard display of these services, along with stats
> collection, and there's no reason for that to go anywhere other than
> the manager -- in fact, routing it through the monitor is inimical to
> timely updates of statistics. Why do you want to do that instead of
> letting it be handled by the manager, which can aggregate and persist
> whatever data it likes in a convenient form -- and in ways which are
> mindful of monitor IO abilities?

Well, I argued for doing this in the mon this morning but after
implementing the first half of it I'm thinking the mgr makes more sense.
I thought the mon made sense because

- it's a persistent structure that should remain consistent across mgr
restarts etc,
- it looks just like OSDMap and FSMap, just a bit more freeform. those are
in the mon.
- if it's stored on the mon, there's no particular reason the mgr needs to
be involved at all

The main complaint was around the 'status' map which may update
semi-frequently; does that need to be persisted? (I'd argue that most
things that change very frequently are probably best covered by
perfcounters or something other than this globally visible service map.
But some ad hoc status information is definitely useful, so...)

But... after writing a ServiceMap and ServiceMonitor skeleton it's time
to implement beacon, and I'd prefer to do that using MMonCommand to (1)
make it usable and testable via the cli (i.e., a well-written bash
script could be a service if it wanted to), and (2) avoid writing new
messages that aren't really needed. And new commands can be trivially
implemented on the mgr. In python.

Also, the get_health etc hooks in ServiceMonitor made me think we will
want some per-service logic around this stuff. Like, issue a health
warning if < my target 5 radosgws are running. Writing per-service
pluggable logic is also a good fit for ceph-mgr.

Also, the contents of ServiceMap can just be a section of config-key and
trivially visible to all, without any special code. This also seems
convenient (albeit more fragile).

If it goes in mgr, though, I assume we'll have a split between what is
persisted (in config-key or elsewhere) and what is ephemeral status
information. I expect this whole thing is easiest to implement as a
mgr_module, but I'm not sure we have a way to share unpersisted state
between modules? Perhaps a config-key-like interface but local only to
the mgr instance is all we need there.

sage
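The per-service health logic mentioned in this message ("issue a health warning if < my target 5 radosgws are running") could look roughly like the Python sketch below. The function name, the input shapes, and the warning text are all assumptions made for illustration; this is not the ServiceMonitor get_health hook or any ceph-mgr module interface.

```python
def service_health(running, targets):
    """Return one warning per service running fewer daemons than its target.

    running: dict of service name -> iterable of running daemon names
    targets: dict of service name -> desired number of daemons
    """
    warnings = []
    for service, target in sorted(targets.items()):
        count = len(list(running.get(service, ())))
        if count < target:
            warnings.append(f"{service}: {count}/{target} daemons running")
    return warnings


# Example: three of five desired radosgw daemons are up, no rbd-mirror.
example_warnings = service_health(
    {"rgw": ["a", "b", "c"]},
    {"rgw": 5, "rbd-mirror": 1},
)
```

Keeping the target counts in a plain dict reflects the "per-service pluggable logic" idea only loosely; where those targets would actually be configured is not specified in the thread.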
* Re: Extra daemons/servers reporting to mgr
2017-06-20 21:00 ` Sage Weil
@ 2017-06-20 21:24 ` Gregory Farnum
2017-06-20 21:40 ` Sage Weil
2017-06-21 19:53 ` Sage Weil
0 siblings, 2 replies; 20+ messages in thread
From: Gregory Farnum @ 2017-06-20 21:24 UTC (permalink / raw)
To: Sage Weil
Cc: Jason Dillaman, Yehuda Sadeh-Weinraub, John Spray, Ceph Development

On Tue, Jun 20, 2017 at 2:00 PM, Sage Weil <sweil@redhat.com> wrote:
> On Tue, 20 Jun 2017, Gregory Farnum wrote:
>> On Mon, Jun 19, 2017 at 12:26 PM, Sage Weil <sweil@redhat.com> wrote:
>> > I wrote up a quick proposal at
>> >
>> > http://pad.ceph.com/p/service-map
>> >
>> > Basic idea:
>> >
>> > - generic ServiceMap of service -> daemon -> metadata and status
>> > - managed/persisted by mon
>> > - librados interface to register as service X name Y (e.g., 'rgw.foo')
>> > - librados will send regular beacon to mon to keep entry alive
>> > - various mon commands to dump all or part of the service map
>>
>> I am deeply uncomfortable with putting this stuff into the monitor (at
>> least, directly). The main purpose we've discussed is to enable
>> manager dashboard display of these services, along with stats
>> collection, and there's no reason for that to go anywhere other than
>> the manager -- in fact, routing it through the monitor is inimical to
>> timely updates of statistics. Why do you want to do that instead of
>> letting it be handled by the manager, which can aggregate and persist
>> whatever data it likes in a convenient form -- and in ways which are
>> mindful of monitor IO abilities?
>
> Well, I argued for doing this in the mon this morning but after
> implementing the first half of it I'm thinking the mgr makes more sense.
> I thought the mon made sense because
>
> - it's a persistent structure that should remain consistent across mgr
> restarts etc,
> - it looks just like OSDMap and FSMap, just a bit more freeform. those are
> in the mon.
> - if it's stored on the mon, there's no particular reason the mgr needs to
> be involved at all

I wrote out a whole email and then realized these 3 criteria are
actually the sticking point, so let's go through them in order:

* Why should the service map be a persistent structure? I mean, we
don't want to see stuff flapping in and out of existence if the
manager bounces, but that's a very different set of constraints than
something like "this must consistently move strictly forward in time",
which is what the monitor provides. I'd be inclined to persist a
snapshot of the static metadata every 30 seconds (if it's changed)
just so we don't gratuitously make graphs look weird, but otherwise it
seems entirely ephemeral to me.

* I guess at the moment I disagree about that. It looks like them in
the sense that it stores data, I guess. But the purpose ("displaying
things to administrators") is entirely different from the OSDMap/FSMap
("assign authority over data so we remain consistent").

* It's always nice to restrict the number of involved components, but
that can just as easily be flipped around: if it's stored on the
manager, there's no reason the mon needs to be involved at all! And
not involving the mon (with its requirement that any change touch
disk) is a lot bigger of a deal, unless you're worried about adding
new dependencies on a not-quite-as-HA service. But the service map as
I understand it is way less critical than stuff like some of the PG
and quota commands that already depend on the manager.

> The main complaint was around the 'status' map which may update
> semi-frequently; does that need to be persisted? (I'd argue that most
> things that change very frequently are probably best covered by
> perfcounters or something other than this globally visible service map.
> But some ad hoc status information is definitely useful, so...)

Are perfcounters available through the librados interface? I was sort
of assuming that punching a similar interface through librados was 50%
of the point here, although re-reading the thread I'm not sure how I
got that impression.
-Greg
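The suggestion in this message to "persist a snapshot of the static metadata every 30 seconds (if it's changed)" can be sketched as follows in Python. The class name, the store callback, and the choice to compare JSON-serialized snapshots are assumptions for illustration; the thread does not say how or where the manager would actually persist this.

```python
import json
import time


class SnapshotPersister:
    """Write a snapshot at most once per interval, and only when it changed."""

    def __init__(self, store, interval=30.0, clock=time.monotonic):
        self.store = store        # hypothetical backend: callable taking a str
        self.interval = interval  # seconds between checks, 30 as in the thread
        self.clock = clock
        self._last_saved = None   # serialized form of the last stored snapshot
        self._last_check = None   # time of the last check that was not skipped

    def maybe_persist(self, static_metadata):
        """Return True if a new snapshot was written, False otherwise."""
        now = self.clock()
        if self._last_check is not None and now - self._last_check < self.interval:
            return False  # too soon since the previous check
        self._last_check = now
        snapshot = json.dumps(static_metadata, sort_keys=True)
        if snapshot == self._last_saved:
            return False  # nothing changed, so avoid a pointless write
        self.store(snapshot)
        self._last_saved = snapshot
        return True
```

Serializing with sort_keys=True makes the change check independent of dict ordering. Ephemeral status would simply never be passed to maybe_persist.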
* Re: Extra daemons/servers reporting to mgr
2017-06-20 21:24 ` Gregory Farnum
@ 2017-06-20 21:40 ` Sage Weil
2017-06-21 19:53 ` Sage Weil
1 sibling, 0 replies; 20+ messages in thread
From: Sage Weil @ 2017-06-20 21:40 UTC (permalink / raw)
To: Gregory Farnum
Cc: Jason Dillaman, Yehuda Sadeh-Weinraub, John Spray, Ceph Development

On Tue, 20 Jun 2017, Gregory Farnum wrote:
> On Tue, Jun 20, 2017 at 2:00 PM, Sage Weil <sweil@redhat.com> wrote:
> > On Tue, 20 Jun 2017, Gregory Farnum wrote:
> >> On Mon, Jun 19, 2017 at 12:26 PM, Sage Weil <sweil@redhat.com> wrote:
> >> > I wrote up a quick proposal at
> >> >
> >> > http://pad.ceph.com/p/service-map
> >> >
> >> > Basic idea:
> >> >
> >> > - generic ServiceMap of service -> daemon -> metadata and status
> >> > - managed/persisted by mon
> >> > - librados interface to register as service X name Y (e.g., 'rgw.foo')
> >> > - librados will send regular beacon to mon to keep entry alive
> >> > - various mon commands to dump all or part of the service map
> >>
> >> I am deeply uncomfortable with putting this stuff into the monitor (at
> >> least, directly). The main purpose we've discussed is to enable
> >> manager dashboard display of these services, along with stats
> >> collection, and there's no reason for that to go anywhere other than
> >> the manager -- in fact, routing it through the monitor is inimical to
> >> timely updates of statistics. Why do you want to do that instead of
> >> letting it be handled by the manager, which can aggregate and persist
> >> whatever data it likes in a convenient form -- and in ways which are
> >> mindful of monitor IO abilities?
> >
> > Well, I argued for doing this in the mon this morning but after
> > implementing the first half of it I'm thinking the mgr makes more sense.
> > I thought the mon made sense because
> >
> > - it's a persistent structure that should remain consistent across mgr
> > restarts etc,
> > - it looks just like OSDMap and FSMap, just a bit more freeform. those are
> > in the mon.
> > - if it's stored on the mon, there's no particular reason the mgr needs to
> > be involved at all
>
> I wrote out a whole email and then realized these 3 criteria are
> actually the sticking point, so let's go through them in order:
>
> * Why should the service map be a persistent structure? I mean, we
> don't want to see stuff flapping in and out of existence if the
> manager bounces, but that's a very different set of constraints than
> something like "this must consistently move strictly forward in time",
> which is what the monitor provides. I'd be inclined to persist a
> snapshot of the static metadata every 30 seconds (if it's changed)
> just so we don't gratuitously make graphs look weird, but otherwise it
> seems entirely ephemeral to me.
>
> * I guess at the moment I disagree about that. It looks like them in
> the sense that it stores data, I guess. But the purpose ("displaying
> things to administrators") is entirely different from the OSDMap/FSMap
> ("assign authority over data so we remain consistent").
>
> * It's always nice to restrict the number of involved components, but
> that can just as easily be flipped around: if it's stored on the
> manager, there's no reason the mon needs to be involved at all! And
> not involving the mon (with its requirement that any change touch
> disk) is a lot bigger of a deal, unless you're worried about adding
> new dependencies on a not-quite-as-HA service. But the service map as
> I understand it is way less critical than stuff like some of the PG
> and quota commands that already depend on the manager.

Making something appear on the dashboard is goal #1, but the minute this
is in place it's going to be used for all the same sorts of things that
things like ZK are used for. Which rbd-mirror daemon is the leader?
Which rgw is doing gc? How should I auto-generate my haproxy config for
rgw? And so on. And for pretty much everything that isn't just the gui
display, making this progress forward in time in an orderly way makes
sense.

But I agree this implementation doesn't need to go in the mon. It just
needs to persist the important stuff there. I think if we segregate
per-daemon state into things that need to be consistent and persisted
(e.g., rgw's IP address) and things that don't (which bucket radosgw
multisite sync is working on, or current progress resharding a bucket)
we'll be fine.

> > The main complaint was around the 'status' map which may update
> > semi-frequently; does that need to be persisted? (I'd argue that most
> > things that change very frequently are probably best covered by
> > perfcounters or something other than this globally visible service map.
> > But some ad hoc status information is definitely useful, so...)
>
> Are perfcounters available through the librados interface? I was sort
> of assuming that punching a similar interface through librados was 50%
> of the point here, although re-reading the thread I'm not sure how I
> got that impression.

Yeah, I totally didn't think of that. I guess I'd say we probably want
that additional librados interface to funnel information into the same
metrics channel that perfcounters go through so that this data ends up
in whatever TSDB you're using. Then you can draw all the pretty graphs
of how many rbd images are currently replicating, how much bandwidth
they're consuming, what the lag is, and so on.

Unfortunately I also buy the argument that there is other ephemeral
stuff that doesn't look like a metric (like current rgw sync
position/bucket/object), so we probably need all three (static and/or
persistent service daemon metadata, ephemeral daemon metadata, and
additional perfcounter-like metrics)...

sage
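The three kinds of per-daemon state named at the end of this message (persistent metadata, ephemeral status, perfcounter-like metrics) can be illustrated with a short Python sketch. The function name and the rule used to tell metrics from status (numeric values count as metrics) are assumptions made purely for the example; the thread does not define how a daemon would label its fields.

```python
def split_report(raw, persistent_keys):
    """Sort a flat daemon report into metadata, status and metrics.

    raw:             dict of field name -> value reported by a daemon
    persistent_keys: names that must be kept consistent and persisted
    """
    report = {"metadata": {}, "status": {}, "metrics": {}}
    for key, value in raw.items():
        if key in persistent_keys:
            # e.g. an rgw's IP address: consistent and persisted
            report["metadata"][key] = value
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            # numeric, perfcounter-like value destined for a metrics channel
            report["metrics"][key] = value
        else:
            # ephemeral, non-numeric state such as the bucket being synced
            report["status"][key] = value
    return report


# Hypothetical rgw report, for illustration only.
example_report = split_report(
    {"addr": "10.0.0.1:8080", "sync_bucket": "photos", "bandwidth_bps": 1200},
    persistent_keys={"addr"},
)
```

In practice a daemon would more likely declare the category of each field explicitly rather than have it inferred from the value type; the inference here just keeps the example short.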
* Re: Extra daemons/servers reporting to mgr
2017-06-20 21:24 ` Gregory Farnum
2017-06-20 21:40 ` Sage Weil
@ 2017-06-21 19:53 ` Sage Weil
1 sibling, 0 replies; 20+ messages in thread
From: Sage Weil @ 2017-06-21 19:53 UTC (permalink / raw)
To: Gregory Farnum
Cc: Jason Dillaman, Yehuda Sadeh-Weinraub, John Spray, Ceph Development

I've updated the pad at

http://pad.ceph.com/p/service-map

- commands mediated by mgr
- (some) state persisted in config-key
- a few use-cases outlined explicitly

sage