* [Lustre-devel] Feature request: expand SNMP scope
From: Kilian CAVALOTTI @ 2008-03-11 18:12 UTC
To: lustre-devel
Hi,
After a discussion started on lustre-discuss@ [1], I'd like to join
other users [2] to make an official feature request about the Lustre
SNMP module.
I believe it could be extremely useful for Lustre system administrators
to get more than just the amount of free space and the number of
available objects from the SNMP module. For instance, it could be
interesting to get the following live stats through SNMP:
on clients: /proc/fs/lustre/llite/*/stats
on OSSes: /proc/fs/lustre/obdfilter/*/stats
on MDSes: /proc/fs/lustre/mds/*/stats
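As an illustration of what exporting these files would involve, here is a minimal Python sketch that turns the text of one stats file into named counters. The line layout it expects ("<name> <count> samples [<unit>] <min> <max> <sum>", preceded by a snapshot_time line) is an assumption about the stats format, the sample text is made up, and parse_stats and Counter are hypothetical names, not part of Lustre or of its SNMP module.

```python
from typing import Dict, NamedTuple, Optional


class Counter(NamedTuple):
    samples: int
    unit: Optional[str]
    minimum: Optional[int]
    maximum: Optional[int]
    total: Optional[int]


def parse_stats(text: str) -> Dict[str, Counter]:
    """Parse the text of a stats file into a dict of named counters."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the timestamp line, and anything that is not
        # shaped like "<name> <count> samples [<unit>] ...".
        if len(fields) < 4 or fields[2] != "samples":
            continue
        name, samples, unit = fields[0], int(fields[1]), fields[3].strip("[]")
        # min, max and sum are optional; pad the missing ones with None.
        extra = [int(value) for value in fields[4:7]]
        extra += [None] * (3 - len(extra))
        counters[name] = Counter(samples, unit, *extra)
    return counters


# Made-up sample in the assumed layout, for illustration only.
SAMPLE = """\
snapshot_time 1205258000.123456 secs.usecs
read_bytes 12 samples [bytes] 4096 1048576 6291456
open 5 samples [regs]
"""

if __name__ == "__main__":
    for name, counter in parse_stats(SAMPLE).items():
        print(name, counter.samples, counter.unit, counter.total)
```

An SNMP agent extension could call something like this on each poll and publish one variable per counter field.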
But it would be especially interesting not to limit the SNMP-accessible
values to just a subset of what's available in /proc/fs/lustre. Since it
looks like some work has begun to rework the Lustre /proc structure [3],
this might be the right opportunity to integrate SNMP more closely into
the new UI. The idea would be to translate everything available in /proc
into SNMP variables, so that future variables are exported too, without
having to be added explicitly to the SNMP code.
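One way to picture such an automatic translation is the following Python sketch, which walks a directory tree and assigns one OID per file it finds, so that a newly added file gets an OID without any change to the code. It is only a sketch under stated assumptions: BASE_OID is a placeholder prefix (not the OID actually registered for Lustre), build_oid_map is a hypothetical helper, and numbering by sorted path is just the simplest scheme that is reproducible.

```python
import os
from typing import Dict

# Placeholder prefix for illustration only; NOT the OID registered for Lustre.
BASE_OID = "1.3.6.1.4.1.99999.1"


def build_oid_map(root: str, base_oid: str = BASE_OID) -> Dict[str, str]:
    """Assign one OID to every regular file found under root."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            paths.append(os.path.relpath(full_path, root))
    # Sorting makes the numbering reproducible for a given set of files.
    return {
        f"{base_oid}.{index}": path
        for index, path in enumerate(sorted(paths), start=1)
    }


if __name__ == "__main__":
    for oid, path in build_oid_map("/proc/fs/lustre").items():
        print(oid, path)
```

Note the limitation of this naive numbering: adding or removing a file shifts the index of every path that sorts after it. A real design would need identifiers that stay stable across such changes, for example table rows indexed by name.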
I have little idea how easily this can be achieved, but it would be an
excellent foundation for upcoming Lustre monitoring systems.
[1] http://lists.lustre.org/pipermail/lustre-discuss/2008-March/005277.html
[2] http://lists.lustre.org/pipermail/lustre-devel/2008-January/001504.html
    and bug #14729
[3] http://lists.lustre.org/pipermail/lustre-devel/2008-January/001475.html
Thanks!
--
Kilian
PS: I also created bug #15197 to keep track of this.
* [Lustre-devel] Feature request: expand SNMP scope
From: patrice.lucas at cea.fr @ 2008-03-12 14:00 UTC
To: lustre-devel
Hi,
> After a discussion started on lustre-discuss@ [1], I'd like to join
> other users [2] to make an official feature request about the Lustre
> SNMP module.
>
> I believe it could be extremely useful for Lustre system administrators
> to get more than just the amount of free space and the number of
> available objects from the SNMP module. For instance, it could be
> interesting to get the following live stats through SNMP:
> on clients: /proc/fs/lustre/llite/*/stats
> on OSSes: /proc/fs/lustre/obdfilter/*/stats
> on MDSes: /proc/fs/lustre/mds/*/stats
Kilian, as you noticed from my previous mail and patch, I definitely
agree with you.
>
> But it would be especially interesting not to limit the SNMP-accessible
> values to just a subset of what's available in /proc/fs/lustre. Since it
> looks like some work has begun to rework the Lustre /proc structure [3],
> this might be the right opportunity to integrate SNMP more closely into
> the new UI. The idea would be to translate everything available in /proc
> into SNMP variables, so that future variables are exported too, without
> having to be added explicitly to the SNMP code.
>
> I have little idea how easily this can be achieved, but it would be an
> excellent foundation for upcoming Lustre monitoring systems.
In the patch for bug #14729, I just added a new external access from the
SNMP agent to a /proc entry. I created this patch as an example of what
could easily be done. The goal was to start a discussion about the need
to improve access to monitoring data. The patch was accepted by the
Lustre team, but without discussion. This method is not integrated into
the core Lustre code: if people change /proc entries, the SNMP agent
code clearly has to be rewritten. I agree with you when you emphasize
the need to tie the SNMP code to the rest of Lustre development.
From a more integrated point of view, do you think it could be a good
idea to use Lustre itself to deliver monitoring data? Lustre is a
parallel filesystem, and data delivered by Lustre can be accessed by
remote clients. Instead of using /proc, could Lustre take advantage of
its capabilities as a distributed filesystem to deliver monitoring data?
By doing that, however, we could lose the advantage SNMP has of
interfacing with the many common SNMP network monitoring tools
available.
Thanks,
Patrice LUCAS
* [Lustre-devel] Feature request: expand SNMP scope
From: Kilian CAVALOTTI @ 2008-03-12 19:51 UTC
To: lustre-devel
Bonjour Patrice,
On Wednesday 12 March 2008 07:00:05 am patrice.lucas at cea.fr wrote:
> This method is not integrated into the core Lustre code: if people
> change /proc entries, the SNMP agent code clearly has to be rewritten.
> I agree with you when you emphasize the need to tie the SNMP code to
> the rest of Lustre development.
Yes, that's what Brian first pointed out, and I think that's really the
crux here. Manually editing the SNMP code and the corresponding MIB
files each time a new metric is added, removed or renamed will rapidly
become a nightmare.
> From a more integrated point of view, do you think it could be a
> good idea to use Lustre itself to deliver monitoring data? Lustre is
> a parallel filesystem, and data delivered by Lustre can be accessed
> by remote clients. Instead of using /proc, could Lustre take
> advantage of its capabilities as a distributed filesystem to deliver
> monitoring data? By doing that, however, we could lose the advantage
> SNMP has of interfacing with the many common SNMP network monitoring
> tools available.
Well, yes, actually, that sounds like a very reasonable approach too.
The main advantages of SNMP, from my standpoint, are the following:
1. It's a network protocol, so the monitored system doesn't have to be
the same as the monitoring one. This allows remote collection of
metrics, aggregation, and central administration.
2. It's an industry standard (even if vendors sometimes tend to have a
proprietary interpretation of what a 'standard' is), so it can be
used across a large variety of monitoring systems. Interoperability
is always a good thing.
But only point 1 is really required to make Lustre monitoring easier.
If all the lnet/client/oss/mds data could be accessed from clients,
that would be enough. One specific client (potentially patchless) could
be dedicated to monitoring, with almost the same advantages as an SNMP
host.
That looks like the OFED approach: SNMP is not a priority for
OpenFabrics, since the IB counters from all over the fabric can be
gathered with a single perfquery, from a simple IB node.
And this may also be easier to implement than mapping SNMP exports to
the Lustre stats files.
Cheers,
--
Kilian
* [Lustre-devel] Feature request: expand SNMP scope
From: Peter Braam @ 2008-03-12 19:56 UTC
To: lustre-devel
On 3/12/08 1:51 PM, "Kilian CAVALOTTI" <kilian@stanford.edu> wrote:
> Bonjour Patrice,
>
> On Wednesday 12 March 2008 07:00:05 am patrice.lucas at cea.fr wrote:
>> This method is not integrated into the core Lustre code: if people
>> change /proc entries, the SNMP agent code clearly has to be rewritten.
>> I agree with you when you emphasize the need to tie the SNMP code to
>> the rest of Lustre development.
>
> Yes, that's what Brian first pointed out, and I think that's really the
> crux here. Manually editing the SNMP code and the corresponding MIB
> files each time a new metric is added, removed or renamed will rapidly
> become a nightmare.
>
>> From a more integrated point of view, do you think it could be a
>> good idea to use Lustre itself to deliver monitoring data? Lustre is
>> a parallel filesystem, and data delivered by Lustre can be accessed
>> by remote clients. Instead of using /proc, could Lustre take
>> advantage of its capabilities as a distributed filesystem to deliver
>> monitoring data? By doing that, however, we could lose the advantage
>> SNMP has of interfacing with the many common SNMP network monitoring
>> tools available.
There are already some /proc files for Lustre that actually make an RPC when
read. We have talked often about greatly enlarging this and in addition
letting servers also report on the client state.
So a monitoring node would poll servers and servers would export their own
data including data for each client that is connected to the server.
Generating SNMP info from this is then easy, and it would hook very nicely
into the various management tools too, and work on non-IP networked
computers (if there are any left).
- Peter -
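The polling model described above (a monitoring node asks each server for its data, including what the server knows about each connected client) could be sketched roughly as follows in Python. Everything here is hypothetical: the report layout, the poll_servers helper, and the fetch callable, which stands in for whatever transport (an RPC-backed /proc read, SNMP, or something else) would really carry the data.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable

# Hypothetical report layout: client identifier -> counter name -> value,
# as seen by one server.
Report = Dict[str, Dict[str, int]]


def poll_servers(
    servers: Iterable[str],
    fetch: Callable[[str], Report],
) -> Dict[str, Dict[str, int]]:
    """Poll every server once and sum each client's counters across servers."""
    totals = defaultdict(lambda: defaultdict(int))
    for server in servers:
        for client, counters in fetch(server).items():
            for name, value in counters.items():
                totals[client][name] += value
    # Convert the nested defaultdicts into plain dicts for the caller.
    return {client: dict(counters) for client, counters in totals.items()}
```

The monitoring node then holds a per-client view without contacting any client directly, and an SNMP agent running on that node could publish the aggregated values.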