lttng-dev.lists.lttng.org archive mirror
From: Damien Berget via lttng-dev <lttng-dev@lists.lttng.org>
To: Kienan Stewart <kstewart@efficios.com>
Cc: lttng-dev@lists.lttng.org
Subject: Re: [lttng-dev] Trigger snapshots on a watchdog
Date: Thu, 12 Sep 2024 09:14:17 -0700
Message-ID: <CAA1MA5eTdaSt_fxuEFK1dbGZXZSFHwLRO22YiXU1hgsCDQKRLA@mail.gmail.com>
In-Reply-To: <39b62cc1-66e1-4962-a3a8-0d3ad6e151ef@efficios.com>



Thanks for the quick response Kienan,
Your proposal is exactly how we were thinking the monitor application could
work, so we'll go with that for now.
Reacting to the absence of an event (a watchdog) would be a good
complement to the existing trigger types.
It's a really useful feature for a flight recorder in embedded medium
real-time applications. Is the team open to feature requests?
Cheers
Damien

On Thu, Sep 12, 2024 at 12:57 AM Kienan Stewart <kstewart@efficios.com>
wrote:

> Hi Damien,
>
> On 2024-09-11 18:38, Damien Berget via lttng-dev wrote:
> > Good day,
> > We are trying to see what is the best way to monitor some applications
> > not hitting a deadline. Ideally something like a watchdog that needs
> > to be patted regularly and, if the timeout is reached, triggers a snapshot.
> >
> > Before we reinvent the wheel and code some userland applications, is
> > there a canonical way in LTTng to do it? I found this
> > <https://review.lttng.org/c/lttng-tools/+/9657/9> that is suspiciously
> > close maybe?
> >
> I don't think the proposed changes you linked to are useful or
> related to what you hope to achieve. The patch series is a concept about
> how some types of UST ring buffer stalls might be addressed by the
> session daemon. After a quick glance, the monitoring seems to be more
> closely related to the 'monitor timer', which is used to sample
> statistical information about channels[1].
>
>
> There is a concept of triggers[2]; however triggers react to the
> presence of events rather than the absence thereof.
>
>
> I think a small user space application that monitors the state of other
> applications is more the direction to head in. There are at least a
> couple of ways that a snapshot on unhealthy state could be achieved:
>
>
> * Use liblttng-ctl to trigger a snapshot from your watchdog
> application[3][4].
>
> * Have the watchdog application exec `lttng snapshot record`[5].
>
> * Have the watchdog application emit some sort of "health state" events
> with some data (e.g. health_okay, health_bad, ...) per your usage
> requirements, and configure a trigger[2] to take a snapshot on the
> "health state" events that have the non-okay state.
>
>
> Depending on your tracing configuration (channel overwrite/discard
> mode[6], buffer sizes, blocking mode, and number of events), it is
> possible that events may not be recorded. I would favor using
> liblttng-ctl or exec'ing `lttng snapshot record` if you want a stronger
> guarantee that your watchdog will cause a snapshot to be taken.
>
>
> I would love to hear if there are other ideas. Regardless, hope this helps!
>
>
> thanks,
>
> kienan
>
>
> [1]: https://lttng.org/docs/v2.13/#doc-channel-timers
>
> [2]: https://lttng.org/docs/v2.13/#doc-trigger
>
> [3]: https://lttng.org/docs/v2.13/#doc-liblttng-ctl-lttng
>
> [4]: https://github.com/lttng/lttng-tools/tree/master/src/lib/lttng-ctl
>
> [5]: https://lttng.org/man/1/lttng-snapshot/v2.13/
>
> [6]: https://lttng.org/docs/v2.13/#doc-channel-overwrite-mode-vs-discard-mode
>
>
> > Thanks,
> > Cheers
> >
> > --
> > *Damien Berget*
> > Embedded Platform Lead
> > damien.berget@flyzipline.com
> >
> > _______________________________________________
> > lttng-dev mailing list
> > lttng-dev@lists.lttng.org
> > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
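
For reference, the second option quoted above (a watchdog application
that execs `lttng snapshot record`) could be sketched roughly as follows.
This is only an illustration under assumptions, not code from LTTng or
from this thread: the `Watchdog` class, the `record_snapshot` helper and
the session name `my-session` are hypothetical, and how the monitored
applications deliver their "pat" to the monitor (socket, pipe, shared
memory, ...) is left out.

```python
import subprocess
import threading
import time


def record_snapshot(session_name):
    """Exec `lttng snapshot record` for a session; return True on success."""
    try:
        result = subprocess.run(
            ["lttng", "snapshot", "record", "--session", session_name],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # The lttng command-line tool is not installed or not in PATH.
        return False
    return result.returncode == 0


class Watchdog:
    """Hypothetical helper: fires a callback once per missed deadline."""

    def __init__(self, timeout_s, on_timeout, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._on_timeout = on_timeout
        self._clock = clock
        self._lock = threading.Lock()
        self._last_pat = clock()
        self._fired = False

    def pat(self):
        """Called whenever the monitored application reports it is alive."""
        with self._lock:
            self._last_pat = self._clock()
            self._fired = False

    def check(self):
        """Call periodically; invokes the callback once after a timeout."""
        with self._lock:
            expired = self._clock() - self._last_pat > self._timeout_s
            should_fire = expired and not self._fired
            if should_fire:
                self._fired = True
        if should_fire:
            self._on_timeout()
        return should_fire
```

A monitor could create `Watchdog(0.5, lambda: record_snapshot("my-session"))`,
call `pat()` each time the application checks in, and call `check()` from
a periodic loop. The callback fires only once per missed deadline, so a
stalled application does not produce a stream of snapshots.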


-- 
*Damien Berget*



Thread overview: 4+ messages
2024-09-11 22:38 [lttng-dev] Trigger snapshots on a watchdog Damien Berget via lttng-dev
2024-09-12  7:57 ` Kienan Stewart via lttng-dev
2024-09-12 16:14   ` Damien Berget via lttng-dev [this message]
2024-09-13  9:51     ` Kienan Stewart via lttng-dev
