From: Andrea Cervesato via ltp <ltp@lists.linux.it>
To: "Wei Gao via ltp" <ltp@lists.linux.it>
Cc: ltp@lists.linux.it, linuxtestproject.agent@gmail.com
Subject: Re: [LTP] min_free_kbytes: Handle transient memory drops in check_monitor
Date: Tue, 02 Jun 2026 16:07:02 +0000 [thread overview]
Message-ID: <6a1effa6.5e46ca1c.1e45c8.7955@mx.google.com> (raw)
In-Reply-To: <ah6KOWY2Xcb-grLr@autotest-wegao.qe.prg2.suse.org>
Hi Wei,
> > > Introduce a 10% tolerance (90% threshold) for the MemFree check. My
> > > measurements showed that under extreme pressure, MemFree can dip as low
> > > as ~50% to ~70% of the target. While it typically recovers above 90%
> > > within one second, hitting the exact 100% watermark sometimes can take
> > > significantly longer. This tolerance prevents false positives during the
> > > slow recovery tail while still ensuring memory is maintained near the
> > > required level.
> >
> > Please rewrite in impersonal form, e.g.:
> > "Measurements under extreme pressure show MemFree can dip as low as
> > ~50% to ~70% of the target."
> >
> > Commit messages become a permanent part of the project history and
> > should avoid first-person language.
> The measurement result is quite related with my local env, also is there any ltp
> rule forbids the use of "My"?
Yeah, this is quite common in the commit messages, but it's not a strict
rule don't worry about it.
> >
> > > -unsigned long tune;
> > > -unsigned long memfree;
> > > +unsigned long tune, threshold;
> > > +unsigned long memfree, min_memfree;
> > > +int i;
> > >
> > > while (!end) {
> > > memfree = SAFE_READ_MEMINFO("MemFree:");
> > > tune = TST_SYS_CONF_LONG_GET(MIN_FREE_KBYTES);
> > > +/*
> > > + * Allow 10% tolerance to account for transient states.
> > > + */
> > > +threshold = tune * 9 / 10;
> > >
> > > if (memfree < tune) {
> > > -tst_res(TINFO, "MemFree is %lu kB, "
> > > - "min_free_kbytes is %lu kB", memfree, tune);
> > > -tst_res(TFAIL, "MemFree < min_free_kbytes");
> > > +min_memfree = memfree;
> > > +/*
> > > + * Give it some time to reclaim. The kernel should keep
> > > + * MemFree above min_free_kbytes, but transient drops
> > > + * are possible under high pressure.
> > > + * Check every 10ms for up to 2 seconds for high accuracy.
> > > + */
> > > +for (i = 10; i <= 2000; i += 10) {
> > > +usleep(10000);
> > > +memfree = SAFE_READ_MEMINFO("MemFree:");
> > > +if (memfree < min_memfree)
> > > +min_memfree = memfree;
> > > +
> > > +if (memfree >= tune)
> > > +break;
> > > +}
> >
> > The inner polling loop does not check the 'end' flag, which is set to 1
> > by the SIGUSR1 handler when the parent finishes. In the original code,
> > the single sleep(2) at the bottom of the outer loop was interrupted by
> > the signal (usleep/sleep return EINTR), so the outer while (!end) check
> > fired promptly. With the new grace period loop, the signal can fire
> > while we are in any of the 200 inner usleep(10000) calls; the sleep
> > returns early but the loop body continues, and the outer while (!end)
> > cannot be reached until the full ~2 seconds elapse.
> >
> > Please add an early exit:
> >
> > for (i = 10; i <= 2000; i += 10) {
> > if (end)
> > return;
> > usleep(10000);
> > memfree = SAFE_READ_MEMINFO("MemFree:");
> > if (memfree < min_memfree)
> > min_memfree = memfree;
> > if (memfree >= tune)
> > break;
> > }
> >
> > Alternatively a `break` is sufficient if the outer loop check is
> > reached quickly, but `return` is cleaner given the function only has
> > one exit point anyway.
> Completeness is more important than saving 2 seconds of test time, especially
> for the final sub-test. If we jump out early, we might miss the final result of the test.
I'm not sure about this one. In general, once we have a result we
should return as fast as possible, so we can save time while running
other tests. 2 seconds might not seem enough, but when we sum up
various tests with the same logic, we might have a big delay.
> >
> > > +if (memfree < threshold) {
> > > +tst_res(TFAIL, "MemFree %lu < 90%% of min_free_kbytes %lu (MinSeen: %lu%%) after 2s",
> > > +memfree, tune, (min_memfree * 100 / tune));
> > > +} else if (memfree < tune) {
> > > +tst_res(TINFO, "MemFree (%lu) stayed within 10%% tolerance (min %lu%%) after ~2s",
> > > +memfree, (min_memfree * 100 / tune));
> > > +} else {
> > > +tst_res(TINFO, "MemFree recovered to %lu (min %lu%%) after %d ms",
> > > +memfree, (min_memfree * 100 / tune), i);
> > > +}
> >
> > Minor: the TINFO messages mix kB values and percentage values using the
> > same format specifier %lu, which can be confusing in the output. The
> > TFAIL line prints MemFree and tune as raw kB values but labels the third
> > argument "(MinSeen: N%)" — the intent is clear but adding "kB" units to
> > the first two values would be more consistent:
> >
> > "MemFree %lu kB < 90%% of min_free_kbytes %lu kB (MinSeen: %lu%%) after 2s"
> >
> > Similarly for the TINFO lines.
> Name is min_free_kbytes and the test context is very clear.
>
> Unless there's a logical error or an LTP rule violation, I won't be sending another patch.
I think the agent complains that the output message is not easy
to debug: we are printing values without the units. I would just
add MemFree unit `kB` in this case. The rest of the message looks
ok.
Regards,
--
Andrea Cervesato
SUSE QE Automation Engineer Linux
andrea.cervesato@suse.com
--
Mailing list info: https://lists.linux.it/listinfo/ltp
next prev parent reply other threads:[~2026-06-02 16:07 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-27 5:08 [LTP] [PATCH v2] mem/min_free_kbytes: Add grace period for memory reclaim Wei Gao via ltp
2026-05-27 5:31 ` [LTP] " linuxtestproject.agent
2026-05-27 15:40 ` [LTP] [PATCH v2] " Cyril Hrubis
2026-05-29 16:07 ` Petr Vorel
2026-05-31 13:51 ` Wei Gao via ltp
2026-05-31 13:40 ` [LTP] [PATCH v3] min_free_kbytes: Handle transient memory drops in check_monitor Wei Gao via ltp
2026-06-01 6:42 ` [LTP] " linuxtestproject.agent
2026-06-02 1:00 ` [LTP] [PATCH v4] " Wei Gao via ltp
2026-06-02 4:02 ` [LTP] " linuxtestproject.agent
2026-06-02 7:46 ` Wei Gao via ltp
2026-06-02 16:07 ` Andrea Cervesato via ltp [this message]
2026-06-03 3:07 ` Wei Gao via ltp
-- strict thread matches above, loose matches on Subject: below --
2026-05-26 8:36 [LTP] [PATCH v1] " Wei Gao via ltp
2026-05-26 15:29 ` [LTP] " linuxtestproject.agent
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6a1effa6.5e46ca1c.1e45c8.7955@mx.google.com \
--to=ltp@lists.linux.it \
--cc=andrea.cervesato@suse.com \
--cc=linuxtestproject.agent@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox