* [Lustre-devel] Feature request: expand SNMP scope
@ 2008-03-11 18:12 Kilian CAVALOTTI
  0 siblings, 0 replies; 4+ messages in thread
From: Kilian CAVALOTTI @ 2008-03-11 18:12 UTC
  To: lustre-devel

Hi,

After a discussion started on lustre-discuss@ [1], I'd like to join 
other users [2] to make an official feature request about the Lustre 
SNMP module.

I believe it could be extremely useful for Lustre systems administrators
to get more than just the amount of free space and the number of
available objects from the SNMP module.  For instance, it could be
interesting to get the following live stats through SNMP:
on clients: /proc/fs/lustre/llite/*/stats
on OSSes:   /proc/fs/lustre/obdfilter/*/stats
on MDSes:   /proc/fs/lustre/mds/*/stats
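
A minimal sketch of reading one of these stats files, in Python. It
assumes each counter line has the layout `<name> <count> samples [<unit>]`,
optionally followed by min, max and sum, and that the snapshot_time line
can be skipped. The sample text is made up for illustration, and the
layout should be checked against the Lustre version in use.

```python
def parse_stats(text):
    """Parse the text of a Lustre 'stats' file into {counter_name: fields}."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Assumed layout: <name> <count> samples [<unit>] [<min> <max> <sum>]
        if len(parts) < 4 or parts[2] != "samples":
            continue  # skips snapshot_time and anything unrecognised
        entry = {"samples": int(parts[1]), "unit": parts[3].strip("[]")}
        if len(parts) >= 7:
            entry["min"], entry["max"], entry["sum"] = (
                int(p) for p in parts[4:7]
            )
        counters[parts[0]] = entry
    return counters


# Illustrative input only, not captured from a real system.
SAMPLE = """\
snapshot_time 1205259120.123456 secs.usecs
open 1024 samples [regs]
read_bytes 200 samples [bytes] 4096 1048576 52428800
"""

if __name__ == "__main__":
    for name, entry in parse_stats(SAMPLE).items():
        print(name, entry)
```

Each parsed counter could then be exposed as one SNMP value; how the
agent would do that is left open here.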

But it would be especially interesting to not limit the SNMPable values 
to just a subset of what's available in /proc/fs/lustre. Since it looks 
like some work has begun to rework the Lustre /proc structure [3], 
maybe it would be the right opportunity to incorporate SNMP more 
closely into the new UI. The idea would be to translate everything
available in /proc into SNMP variables, so that future variables could
be exported too, without having to add them explicitly to the SNMP
code.
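
To make the idea concrete, here is a rough Python sketch that walks a
directory tree and gives every file an OID, so that new files show up
without any change to the code. The base OID is a made-up placeholder,
not a registered Lustre subtree, and the function names are hypothetical.
GETNEXT is modelled as "first entry strictly after the given OID", which
works because Python compares integer tuples in the same lexicographic
order that SNMP uses for OIDs.

```python
import bisect
import os
import tempfile

# Hypothetical enterprise subtree, chosen only for illustration.
BASE_OID = (1, 3, 6, 1, 4, 1, 99999, 1)


def build_oid_table(root):
    """Walk `root` and give every regular file an OID under BASE_OID."""
    paths = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            paths.append(os.path.relpath(full, root))
    paths.sort()  # deterministic order, so the same tree gives the same OIDs
    return [(BASE_OID + (i,), path) for i, path in enumerate(paths, start=1)]


def get_next(table, oid):
    """Return the first (oid, path) entry strictly after `oid`, or None."""
    keys = [entry[0] for entry in table]
    pos = bisect.bisect_right(keys, tuple(oid))
    return table[pos] if pos < len(table) else None


if __name__ == "__main__":
    # Stand-in tree; on a real node `root` would be /proc/fs/lustre.
    with tempfile.TemporaryDirectory() as root:
        for rel in ("llite/fs0/stats", "mds/fs0/stats"):
            os.makedirs(os.path.join(root, os.path.dirname(rel)))
            open(os.path.join(root, rel), "w").close()
        for oid, path in build_oid_table(root):
            print(".".join(map(str, oid)), path)
```

One weakness of numbering by sorted position is that indexes shift when
entries appear or disappear, so a real agent would need a more stable
assignment scheme.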

I have little idea how easily this can be achieved, but it would be an
excellent foundation for future Lustre monitoring systems.

[1]http://lists.lustre.org/pipermail/lustre-discuss/2008-March/005277.html
[2]http://lists.lustre.org/pipermail/lustre-devel/2008-January/001504.html, 
and bug #14729
[3]http://lists.lustre.org/pipermail/lustre-devel/2008-January/001475.html

Thanks!
--
Kilian

PS: I also created bug #15197 to keep track of this.


* [Lustre-devel] Feature request: expand SNMP scope
@ 2008-03-12 14:00 patrice.lucas at cea.fr
  2008-03-12 19:51 ` Kilian CAVALOTTI
  0 siblings, 1 reply; 4+ messages in thread
From: patrice.lucas at cea.fr @ 2008-03-12 14:00 UTC
  To: lustre-devel

Hi,

> After a discussion started on lustre-discuss@ [1], I'd like to join 
> other users [2] to make an official feature request about the Lustre 
> SNMP module.
> 
> I believe it could be extremely useful for Lustre systems administrators
> to get more than just the amount of free space and the number of
> available objects from the SNMP module.  For instance, it could be
> interesting to get the following live stats through SNMP:
> on clients: /proc/fs/lustre/llite/*/stats
> on OSSes:   /proc/fs/lustre/obdfilter/*/stats
> on MDSes:   /proc/fs/lustre/mds/*/stats

Kilian, as you noticed from my previous mail and patch, I definitely 
agree with you.

> 
> But it would be especially interesting to not limit the SNMPable values 
> to just a subset of what's available in /proc/fs/lustre. Since it looks 
> like some work has begun to rework the Lustre /proc structure [3], 
> maybe it would be the right opportunity to incorporate SNMP more 
> closely into the new UI. The idea would be to translate everything
> available in /proc into SNMP variables, so that future variables could
> be exported too, without having to add them explicitly to the SNMP
> code.
> 
> I have little idea how easily this can be achieved, but it would be an
> excellent foundation for future Lustre monitoring systems.

In the patch for bug #14729, I just added a new external access from the
snmp agent to a /proc entry. I created this patch as an example of what
could easily be done. The goal was to start a discussion about the need
to improve access to monitoring data. The patch was accepted by the
Lustre team, but without discussion. This method is not integrated into
the inner Lustre code. If people change /proc entries, the snmp agent
code must clearly be rewritten. I agree with you when you emphasize the
need to link the snmp code to the rest of the Lustre development.

From a more integrated point of view, do you think it could be a good
idea to use Lustre itself to deliver monitoring data? Lustre is a
parallel filesystem, and data delivered by Lustre can be accessed by
remote clients. Instead of using "/proc", could Lustre take advantage
of its capabilities as a distributed filesystem to deliver monitoring
data? By doing that, however, we could lose the advantage snmp has of
interfacing with the many common snmp network monitoring tools
available.

> 
> [1]http://lists.lustre.org/pipermail/lustre-discuss/2008-March/005277.html
> [2]http://lists.lustre.org/pipermail/lustre-devel/2008-January/001504.html, 
> and bug #14729
> [3]http://lists.lustre.org/pipermail/lustre-devel/2008-January/001475.html
> 
> Thanks!
> --
> Kilian
> 
> PS: I also created bug #15197 to keep track of this.


Thanks,
Patrice LUCAS


* [Lustre-devel] Feature request: expand SNMP scope
  2008-03-12 14:00 [Lustre-devel] Feature request: expand SNMP scope patrice.lucas at cea.fr
@ 2008-03-12 19:51 ` Kilian CAVALOTTI
  2008-03-12 19:56   ` Peter Braam
  0 siblings, 1 reply; 4+ messages in thread
From: Kilian CAVALOTTI @ 2008-03-12 19:51 UTC
  To: lustre-devel

Bonjour Patrice,

On Wednesday 12 March 2008 07:00:05 am patrice.lucas at cea.fr wrote:
> This method is not
> integrated into the inner Lustre code. If people change /proc entries,
> the snmp agent code must clearly be rewritten. I agree with you when
> you emphasize the need to link the snmp code to the rest of the
> Lustre development.

Yes, that's what Brian first pointed out, and I think that's really the 
cornerstone here. Manually editing the SNMP code and the corresponding 
MIB files each time a new metric is added, removed or renamed, will 
rapidly get to be a nightmare.

> From a more integrated point of view, do you think it could be a
> good idea to use Lustre itself to deliver monitoring data? Lustre
> is a parallel filesystem, and data delivered by Lustre can be
> accessed by remote clients. Instead of using "/proc", could Lustre
> take advantage of its capabilities as a distributed filesystem to
> deliver monitoring data? By doing that, however, we could lose the
> advantage snmp has of interfacing with the many common snmp network
> monitoring tools available.

Well, yes, actually, that sounds like a very reasonable approach too.
The main advantages of SNMP, from my standpoint, are the following:

1. It's a network protocol, so the monitored system doesn't have to be 
   the same as the monitoring one. This allows remote collection of 
   metrics, aggregation, and central administration.

2. It's an industry standard (even if vendors sometimes tend to have a 
   proprietary interpretation of what is a 'standard'), so it can be 
   used across a large variety of monitoring systems. Interoperability 
   is always a good thing.

But only point 1 is really required to allow easier Lustre monitoring.
If all the lnet/client/oss/mds data could be accessed from clients,
that would be enough. One specific client (potentially patchless) could
be dedicated to monitoring, with almost the same advantages as an SNMP
host.
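
As a rough illustration of what such a dedicated monitoring client could
do once it can read every node's counters, here is a small Python sketch
that sums same-named counters across nodes. How the per-node data reaches
the client is deliberately left out, and the node and counter names in
the usage note are invented.

```python
from collections import Counter


def aggregate(per_node):
    """Sum same-named counters across nodes.

    `per_node` maps a node name to a {counter_name: value} dict.
    """
    total = Counter()
    for counters in per_node.values():
        total.update(counters)  # Counter.update adds values, it does not replace
    return total
```

For example, aggregate({"oss1": {"read_bytes": 10}, "oss2": {"read_bytes": 7}})
gives a total of 17 for read_bytes.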

That looks like the OFED approach: SNMP is not a priority for 
OpenFabrics, since the IB counters from all over the fabric can be 
gathered with a single perfquery, from a simple IB node.

And this may also be easier to implement than mapping SNMP exports to 
the Lustre stats files.

Cheers,
-- 
Kilian


* [Lustre-devel] Feature request: expand SNMP scope
  2008-03-12 19:51 ` Kilian CAVALOTTI
@ 2008-03-12 19:56   ` Peter Braam
  0 siblings, 0 replies; 4+ messages in thread
From: Peter Braam @ 2008-03-12 19:56 UTC
  To: lustre-devel




On 3/12/08 1:51 PM, "Kilian CAVALOTTI" <kilian@stanford.edu> wrote:

> Bonjour Patrice,
> 
> On Wednesday 12 March 2008 07:00:05 am patrice.lucas at cea.fr wrote:
>> This method is not
>> integrated into the inner Lustre code. If people change /proc entries,
>> the snmp agent code must clearly be rewritten. I agree with you when
>> you emphasize the need to link the snmp code to the rest of the
>> Lustre development.
> 
> Yes, that's what Brian first pointed out, and I think that's really the
> cornerstone here. Manually editing the SNMP code and the corresponding
> MIB files each time a new metric is added, removed or renamed, will
> rapidly get to be a nightmare.
> 
>> From a more integrated point of view, do you think it could be a
>> good idea to use Lustre itself to deliver monitoring data? Lustre
>> is a parallel filesystem, and data delivered by Lustre can be
>> accessed by remote clients. Instead of using "/proc", could Lustre
>> take advantage of its capabilities as a distributed filesystem to
>> deliver monitoring data? By doing that, however, we could lose the
>> advantage snmp has of interfacing with the many common snmp network
>> monitoring tools available.

There are already some /proc files for Lustre that actually make an RPC when
read.  We have often talked about greatly extending this and, in addition,
letting servers report on client state.

So a monitoring node would poll servers and servers would export their own
data including data for each client that is connected to the server.

Generating SNMP info from this is then easy, and it would hook very nicely
into the various management tools too, and work on non-IP networked
computers (if there are any left).
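
A small Python sketch of the monitoring-node side of this model, assuming
each server's report is already available as a {client: counters} mapping.
The report format and the function name are hypothetical. It regroups the
per-server reports so that everything known about one client can be read
in one place.

```python
def per_client_view(server_reports):
    """Pivot {server: {client: counters}} into {client: {server: counters}}."""
    view = {}
    for server, clients in server_reports.items():
        for client, counters in clients.items():
            # Copy the counters so the caller's report is not shared.
            view.setdefault(client, {})[server] = dict(counters)
    return view
```

A client connected to two servers then appears once in the result, with
one entry per server.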

- Peter -


> 
> Well, yes, actually, that sounds like a very reasonable approach too.
> The main advantages of SNMP, from my standpoint, are the following:
> 
> 1. It's a network protocol, so the monitored system doesn't have to be
>    the same as the monitoring one. This allows remote collection of
>    metrics, aggregation, and central administration.
> 
> 2. It's an industry standard (even if vendors sometimes tend to have a
>    proprietary interpretation of what is a 'standard'), so it can be
>    used across a large variety of monitoring systems. Interoperability
>    is always a good thing.
> 
> But only point 1 is really required to allow easier Lustre monitoring.
> If all the lnet/client/oss/mds data could be accessed from clients,
> that would be enough. One specific client (potentially patchless) could
> be dedicated to monitoring, with almost the same advantages as an SNMP
> host.
> 
> That looks like the OFED approach: SNMP is not a priority for
> OpenFabrics, since the IB counters from all over the fabric can be
> gathered with a single perfquery, from a simple IB node.
> 
> And this may also be easier to implement than mapping SNMP exports to
> the Lustre stats files.
> 
> Cheers,

