* Re: [LTP] [PATCH] mmap: fix intermittent OOM kill of test parent in mmap22
From: Li Wang via ltp @ 2026-04-13 3:25 UTC
To: Soma Das; +Cc: ltp
Hi Soma,
> static void stress_child(void)
> {
> for (;;) {
> @@ -63,9 +82,25 @@ static void test_mmap(void)
>
> vec = SAFE_MALLOC(npages);
>
> + /*
> + * Protect the parent (test harness) from the OOM killer. Both parent
> + * and child share the same memcg, so without an explicit hint the OOM
> + * killer picks based on heuristics that can favour the parent.
> + */
> + set_oom_score_adj(-1000);
> +
> child = SAFE_FORK();
> - if (!child)
> + if (!child) {
> + /*
> + * Make the child the preferred OOM victim. If OOM fires while
> + * the stress worker is filling memory, the kernel must kill the
> + * child (stress worker) and not the parent (test harness).
> + * oom_score_adj=1000 is the maximum, guaranteeing this process
> + * is chosen first within the cgroup.
> + */
> + set_oom_score_adj(1000);
Setting the child's oom_score_adj to 1000 severely compromises the
validity of the test. This would typically result in a false negative.
Once the child gets killed, the memory pressure disappears immediately,
but the parent keeps looping to check whether the kernel has reclaimed
that memory, so it will eventually report TFAIL when the time elapses.
Instead of modifying oom_score_adj to interfere with the OOM killer, a
better approach is to optimize how the child process generates memory
pressure, making it more reflective of real-world memory reclaim scenarios.
For example, the child process can allocate larger chunks (e.g., 1MB) to
rapidly build up memory pressure. Once the total allocated memory approaches
the cgroup limit (e.g., 80% capacity), a small delay can be introduced
into the allocation loop. This approach efficiently drives the system to
its memory limit while providing the kernel's reclaim mechanism a sufficient
time window to identify and drop MAP_DROPPABLE pages. It also effectively
avoids an instantaneous memory spike that would otherwise trigger the OOM
killer prematurely.
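#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK_SIZE (1024 * 1024)

static void stress_child(size_t cg_limit)
{
	size_t allocated = 0;
	/* Start throttling at 80% of the cgroup limit. */
	size_t threshold = cg_limit * 8 / 10;

	for (;;) {
		/* Never freed on purpose: the leak is the memory pressure. */
		char *buf = malloc(CHUNK_SIZE);

		if (!buf) {
			/* Allocation failed, back off for 10ms and retry. */
			usleep(10000);
			continue;
		}

		/* Touch every page so the memory is really charged. */
		memset(buf, 'B', CHUNK_SIZE);
		allocated += CHUNK_SIZE;

		/* Near the limit, pause 1ms per chunk to let reclaim run. */
		if (allocated >= threshold)
			usleep(1000);
	}
}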
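The cg_limit argument of stress_child() above has to come from somewhere.
A minimal sketch, assuming the test runs under cgroup v2 and has already
read the contents of the memory.max file into a string: the two helpers
below, parse_memory_max() and stress_threshold(), are hypothetical names
used for illustration only and are not LTP API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the text of a cgroup v2 "memory.max" file.
 * Returns 0 and stores the limit in *limit, or -1 when the text does not
 * start with a digit (this covers "max", i.e. no limit, and empty input)
 * or when the number does not fit.
 */
static int parse_memory_max(const char *text, size_t *limit)
{
	char *end;
	unsigned long long val;

	if (text[0] < '0' || text[0] > '9')
		return -1;

	errno = 0;
	val = strtoull(text, &end, 10);
	if (errno || end == text || val > SIZE_MAX)
		return -1;

	*limit = (size_t)val;
	return 0;
}

/*
 * Hypothetical helper: 80% of the limit. Dividing first avoids the
 * overflow that cg_limit * 8 could hit for very large limits, at the
 * cost of rounding down by a few bytes.
 */
static size_t stress_threshold(size_t cg_limit)
{
	return cg_limit / 10 * 8;
}
```

When parse_memory_max() returns -1 there is no usable limit, and the
caller would need some other way to pick a threshold.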
And some workflow issues:
- Please rebase your code on the latest branch before preparing a patch.
- Use git send-email to send patches to the LTP mailing list.
- FYI: LTP already provides OOM protection functions,
  see: lib/tst_memutils.h
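For background on what such OOM protection comes down to at the procfs
level, here is a minimal standalone sketch. write_oom_score_adj() is a
hypothetical name, not the LTP helper; in a real test the functions from
lib/tst_memutils.h should be used instead. The path is a parameter so it
can point at /proc/<pid>/oom_score_adj or at a scratch file.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write an oom_score_adj value to the given file.
 * Returns 0 on success, -1 on a bad value or an I/O error.
 */
static int write_oom_score_adj(const char *path, int value)
{
	FILE *f;
	int ok;

	/* The kernel only accepts values from -1000 to 1000. */
	if (value < -1000 || value > 1000)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", value) > 0;
	/* Write errors may only show up when the stream is flushed. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Note that lowering the value below its current setting normally requires
CAP_SYS_RESOURCE, so an unprivileged test cannot protect itself this way.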
Note:
This work-email will be disabled soon, reply to: wangli.ahau@gmail.com
--
Regards,
Li Wang
--
Mailing list info: https://lists.linux.it/listinfo/ltp
* Re: [LTP] [PATCH] mmap: fix intermittent OOM kill of test parent in mmap22
From: Soma Das @ 2026-04-13 5:27 UTC
To: Li Wang; +Cc: ltp
Hi Li Wang,
Thanks for the review.
Understood. Setting the child to +1000 removes memory pressure
immediately once it gets killed, causing the parent to time out with
TFAIL. I'll rework using your suggested stress_child() approach with
1MB chunks and the 80% threshold delay.
I'll also check lib/tst_memutils.h before finalizing.
v2 will be rebased on latest master and sent via git send-email.
Thanks,
Soma Das