public inbox for kdevops@lists.linux.dev
* [PATCH 0/6] Fix ordering for systemd remote journal support
@ 2024-01-25 20:35 Luis Chamberlain
From: Luis Chamberlain @ 2024-01-25 20:35 UTC
  To: kdevops, da.gomez, p.raghav; +Cc: Luis Chamberlain

For those that may have missed it, in December I added support for
systemd remote journals. This allows us to:

1) reduce the workflow watchdog time, since the journal is now on the
   localhost
2) always collect node logs, even when a node crashes and is gone
   afterwards so we can't inspect it
3) use journalctl to look for data on the target nodes

The implementation, however, was meant to address a series of issues
seen when enabling a SOAK_DURATION of 2.5 hours (so that is
CONFIG_FSTESTS_SOAK_DURATION=9900), and so it was designed to cure
an existing running system I was fed up with not being able to collect
logs for, as crashes are much more frequent, to the point that you
really do often lose access to the system / console.

Since I implemented support for a running kdevops system I simply
toggled on CONFIG_DEVCONFIG_ENABLE_SYSTEMD_JOURNAL_REMOTE=y and
ran 'make' and then:

make journal-server
make journal-client

And then used it. The other commands are:

journal-restart    - Restart client upload service
journal-status     - Ensure systemd-journal-remote works
journal-ls         - List journals available and sizes
journal-ln         - Add symlinks with hostnames
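
As a rough illustration of what listing journals and their sizes amounts
to, here is a minimal shell sketch. It is not the kdevops journal-ls
implementation. The helper name list_journals is hypothetical, and the
/var/log/journal/remote path is assumed to be the systemd-journal-remote
default output directory; adjust it for your setup.

```shell
#!/bin/sh
# Hypothetical helper, not part of kdevops: print "<bytes><TAB><name>"
# for every *.journal file found directly inside the given directory.
list_journals() {
    dir=$1
    for f in "$dir"/*.journal; do
        # If the glob matched nothing, $f is the literal pattern; skip it.
        [ -e "$f" ] || continue
        size=$(wc -c < "$f" | tr -d ' ')
        printf '%s\t%s\n' "$size" "$(basename "$f")"
    done
}

# Assumption: default systemd-journal-remote output directory.
JOURNAL_DIR="${JOURNAL_DIR:-/var/log/journal/remote}"

if [ -d "$JOURNAL_DIR" ] && command -v journalctl >/dev/null 2>&1; then
    list_journals "$JOURNAL_DIR"
    # Show the last 20 error-or-worse entries across the collected
    # journals. This may still need sudo, depending on permissions.
    journalctl -D "$JOURNAL_DIR" -p err -n 20 --no-pager
fi
```

On a host without remote journals the script prints nothing; the helper
can also be pointed at any other directory holding *.journal files.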

The issue with this is I had not tested a new cluster bringup. After
Daniel Gomez struggled with that, I looked into it and have
identified the issues. The fix is to enhance our semantics on bringup
by clearly defining what a provider is and by adding clear semantics
that allow us to install things *prior* to running the generic devconfig
playbook with all the bells and whistles.

The last few patches should demonstrate how easy it is now to add
new install targets for things we need in the pipeline for provisioning.

This should bring using CONFIG_DEVCONFIG_ENABLE_SYSTEMD_JOURNAL_REMOTE
closer to a reality for most folks. The only nagging thing I don't
like about it is that we require sudo on the local system to query
the client logs, and that pollutes your system's logs with a ton of
sudo messages.

So the next thing is to verify we can remove sudo from the watchdog
calls, since we already took care of creating the systemd remote journal
directory with a sticky bit, and then ensured our user is part of the
systemd-journal-remote group so that our user can read the remote
journals without sudo. We just need to test this more widely.
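
For anyone who wants to help test this, a quick local check could look
like the sketch below. The helper names are hypothetical, and the
directory path is assumed to be the systemd-journal-remote default
rather than something kdevops guarantees. The check uses the groups of
the current process, so a freshly added group only shows up after
logging in again.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given path is a directory that
# the current user can both list and enter without extra privileges.
can_read_dir() {
    [ -d "$1" ] && [ -r "$1" ] && [ -x "$1" ]
}

# Hypothetical helper: succeeds if the current process is a member of
# the named group (primary or supplementary).
in_group() {
    id -nG | tr ' ' '\n' | grep -qxF -- "$1"
}

# Assumptions: default output directory and the group named in this
# cover letter. Override through the environment if yours differ.
JOURNAL_DIR="${JOURNAL_DIR:-/var/log/journal/remote}"
JOURNAL_GROUP="${JOURNAL_GROUP:-systemd-journal-remote}"

if can_read_dir "$JOURNAL_DIR" && in_group "$JOURNAL_GROUP"; then
    echo "sudo should not be needed to read $JOURNAL_DIR"
else
    echo "sudo is likely still needed to read $JOURNAL_DIR"
fi
```

If this reports that sudo should not be needed, running journalctl
against the remote journal directory as your regular user is the real
test.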

So other than patch review, it's now a good time to ask for wider
testing with CONFIG_DEVCONFIG_ENABLE_SYSTEMD_JOURNAL_REMOTE=y so that
we can default to CONFIG_DEVCONFIG_ENABLE_SYSTEMD_JOURNAL_REMOTE=y later.

Please let me know what you think.

Luis Chamberlain (6):
  mirror: add a smart git check
  bringup: split provisioning into 2 steps
  bringup: share bringup method targe and agument by two steps
  provision: move all provisioning things to its own Makefile
  bringup: move journal-server setup early
  journal-server: fix by adjusting ordering

 .gitignore                               |  2 +-
 Makefile                                 | 31 ++--------
 kconfigs/Kconfig.mirror                  | 24 ++++++--
 playbooks/roles/devconfig/tasks/main.yml |  1 +
 scripts/bringup.Makefile                 | 56 ------------------
 scripts/guestfs.Makefile                 | 19 ++----
 scripts/journal-server.Makefile          | 56 ++++++++++++++++++
 scripts/provision.Makefile               | 74 ++++++++++++++++++++++++
 scripts/terraform.Makefile               | 10 ++--
 scripts/test_git_firewall.sh             | 17 ++++++
 scripts/vagrant.Makefile                 | 19 ++----
 11 files changed, 187 insertions(+), 122 deletions(-)
 create mode 100644 scripts/journal-server.Makefile
 create mode 100644 scripts/provision.Makefile
 create mode 100755 scripts/test_git_firewall.sh

-- 
2.42.0

