* Split archives to new repo - clean kdevops repo
From: Luis Chamberlain @ 2024-03-28 18:04 UTC
To: kdevops
At LSFMM long ago we had a discussion about keeping test results
around, back of the napkin calculation then for fstests ended up
with an estimated impact of about 730.40 GiB per year, removing the
good results and only keeping *.bad and *.dmesg for failed tests
was about 456.5 MiB per year, and compressing it ~44 KiB per year.
For 5 filesystems that is about 100 MiB / year.
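As a rough illustration of the trimming policy described above, here is a minimal Python sketch, not the actual kdevops code. It picks which result files would be archived: only *.bad and *.dmesg by default, or everything when good results are wanted as well. The function name files_to_archive and the keep_good flag are hypothetical, and the directory layout is only assumed to resemble an fstests results tree.

```python
from pathlib import Path

# Suffixes kept for failed tests, as described in the text above.
FAILURE_SUFFIXES = (".bad", ".dmesg")


def files_to_archive(results_dir, keep_good=False):
    """Return the sorted list of result files that would be archived.

    With keep_good=False only *.bad and *.dmesg files are selected.
    With keep_good=True every regular file under results_dir is selected.
    """
    paths = sorted(p for p in Path(results_dir).rglob("*") if p.is_file())
    if keep_good:
        return paths
    return [p for p in paths if p.suffix in FAILURE_SUFFIXES]
```

Calling files_to_archive(some_dir, keep_good=True) corresponds to the "keep good results" case that an ELK stack would want.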
Well, kdevops is a big fat pig now over 200 MiB, and even though the
above math was conservative it turns out that to make results more
useful for things like an ELK stack, we want to keep good results.
And in practice now git grep'ing is creating a lot of noise.
And so it seems the experiment of keeping test results has
come to a crux, and I'd like to suggest we spin off the results into
an optional archive directory per workflow, and we create a fresh
new kdevops git tree from scratch, and move the existing one to
kdevops-history.
The way I'd see this is, we'd look for ~/kdevops-archive/ and if present
a new future 'make kdevops-archive' would add directory symlinks to
workflows/fstests/results/archive/ --> ~/kdevops-archive/workflows/fstests/results/archive/
That would keep things like scripts/workflows/fstests/copy-results.sh
working, we'd just have to check if the directory exists first before
proceeding.
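The symlink step could be sketched roughly as follows. This is only an illustration in Python under stated assumptions, not how 'make kdevops-archive' is or will be implemented: the function name link_archive and its parameters are hypothetical, and only the two paths come from the proposal above.

```python
from pathlib import Path


def link_archive(kdevops_dir, archive_root, workflow="fstests"):
    """Symlink workflows/<workflow>/results/archive into the archive tree.

    Returns the symlink path, or None when the archive tree does not
    provide that directory (the archive is optional, so nothing is done).
    """
    rel = Path("workflows") / workflow / "results" / "archive"
    target = Path(archive_root).expanduser() / rel
    if not target.is_dir():
        # Check that the directory exists before proceeding.
        return None
    link = Path(kdevops_dir) / rel
    if link.is_symlink():
        # Replace a stale link so the call can be repeated safely.
        link.unlink()
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.parent.mkdir(parents=True, exist_ok=True)
    link.symlink_to(target, target_is_directory=True)
    return link
```

For example, link_archive(".", "~/kdevops-archive") run from the top of a kdevops tree would create the fstests link if the archive is checked out, and do nothing otherwise.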
This:
* lets us modify results to contain even *good* results so we can
help ELK stacks which want good results too
* puts the churn of the archive out to a separate tree
* keeps kdevops clean
If this is agreeable, perhaps we can make the switch Monday? Any
opposition to this?
Luis
* Re: Split archives to new repo - clean kdevops repo
From: Jeff Layton @ 2024-03-30 11:09 UTC
To: Luis Chamberlain, kdevops
On Thu, 2024-03-28 at 11:04 -0700, Luis Chamberlain wrote:
> At LSFMM long ago we had a discussion about keeping test results
> around, back of the napkin calculation then for fstests ended up
> with an estimated impact of about 730.40 GiB per year, removing the
> good results and only keeping *.bad and *.dmesg for failed tests
> was about 456.5 MiB per year, and compressing it ~44 KiB per year.
> For 5 filesystems that is about 100 MiB / year.
>
> Well, kdevops is a big fat pig now over 200 MiB, and even though the
> above math was conservative it turns out that to make results more
> useful for things like an ELK stack, we want to keep good results.
> And in practice now git grep'ing is creating a lot of noise.
>
> And so it seems the experiment of keeping test results has
> come to a crux, and I'd like to suggest we spin off the results into
> an optional archive directory per workflow, and we create a fresh
> new kdevops git tree from scratch, and move the existing one to
> kdevops-history.
>
We could just git rm the old results, but that does keep the history
around, which will always be huge.
ACK from me on this idea.
> The way I'd see this is, we'd look for ~/kdevops-archive/ and if present
> a new future 'make kdevops-archive' would add directory symlinks to
>
> workflows/fstests/results/archive/ --> ~/kdevops-archive/workflows/fstests/results/archive/
>
> That would keep things like scripts/workflows/fstests/copy-results.sh
> working, we'd just have to check if the directory exists first before
> proceeding.
>
> This:
>
> * lets us modify results to contain even *good* results so we can
> help ELK stacks which want good results too
> * puts the churn of the archive out to a separate tree
> * keeps kdevops clean
>
> If this is agreeable, perhaps we can make the switch Monday? Any
> opposition to this?
>
I like this idea. Do you intend to keep tracking the results with git,
just in another tree? You could consider using git-lfs to store archive
tarballs. github supports it, but they only give you 1GB by default:
https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-git-large-file-storage
Looks like it's only $60 a year for 50G though.
--
Jeff Layton <jlayton@kernel.org>
* Re: Split archives to new repo - clean kdevops repo
From: Luis Chamberlain @ 2024-04-03 1:34 UTC
To: Jeff Layton; +Cc: kdevops
On Sat, Mar 30, 2024 at 07:09:34AM -0400, Jeff Layton wrote:
> On Thu, 2024-03-28 at 11:04 -0700, Luis Chamberlain wrote:
> > And so it seems the experiment of keeping test results has
> > come to a crux, and I'd like to suggest we spin off the results into
> > an optional archive directory per workflow, and we create a fresh
> > new kdevops git tree from scratch, and move the existing one to
> > kdevops-history.
> >
>
> We could just git rm the old results, but that does keep the history
> around, which will always be huge.
>
> ACK from me on this idea.
OK this is all done:
The main repo is now just 32 MiB:
https://github.com/linux-kdevops/kdevops
The archive for results, for now using the same old style as before:
https://github.com/linux-kdevops/kdevops-results-archive
The old repo is renamed to kdevops-history:
https://github.com/linux-kdevops/kdevops-history
> > The way I'd see this is, we'd look for ~/kdevops-archive/ and if present
> > a new future 'make kdevops-archive' would add directory symlinks to
> >
> > workflows/fstests/results/archive/ --> ~/kdevops-archive/workflows/fstests/results/archive/
> >
> > That would keep things like scripts/workflows/fstests/copy-results.sh
> > working, we'd just have to check if the directory exists first before
> > proceeding.
> >
> > This:
> >
> > * lets us modify results to contain even *good* results so we can
> > help ELK stacks which want good results too
> > * puts the churn of the archive out to a separate tree
> > * keeps kdevops clean
> >
> > If this is agreeable, perhaps we can make the switch Monday? Any
> > opposition to this?
> >
>
> I like this idea. Do you intend to keep tracking the results with git,
> just in another tree?
Yes it has helped before, and also if we want ELK stack integration we
need to grow this to not exclude the "ok" tests, that is test output
that did not fail. That would grow the archive further too.
> You could consider using git-lfs to store archive
> tarballs. github supports it, but they only give you 1GB by default:
>
> https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-git-large-file-storage
>
> Looks like it's only $60 a year for 50G though.
It looks nice but I just wanted to chug along for now, we should
certainly evaluate the best strategy to scale this with all new
requirements in mind. The old archiving was just a test given that
we didn't know we should keep "ok" results, and frankly the results
just got out of hand. LBS work shows you can end up with many results
for a new feature by different developers, and this is just so much
data. Over time and over beers we should be able to craft this up to
something sensible.
Luis