* Re: [LTP] [PATCH] mmap: fix intermittent OOM kill of test parent in mmap22
2026-04-13 3:25 ` [LTP] [PATCH] mmap: fix intermittent OOM kill of test parent in mmap22 Li Wang via ltp
@ 2026-04-13 5:27 ` Soma Das
From: Soma Das @ 2026-04-13 5:27 UTC
To: Li Wang; +Cc: ltp
Hi Li Wang,
Thanks for the review.
Understood. Setting the child's oom_score_adj to +1000 means the memory
pressure disappears as soon as the child is killed, so the parent keeps
polling until it times out with TFAIL. I'll rework the patch using your
suggested stress_child() approach, with 1MB chunks and a short delay once
allocation reaches 80% of the cgroup limit.
I'll also check lib/tst_memutils.h before finalizing.
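As a rough sketch of how I read the pacing rule (not the final v2 code):
pace_delay_us() and stress_bounded() are hypothetical names that exist only
for this illustration, and the loop is bounded by max_chunks so it can be
tried outside a memcg. The 1000us delay and the 80% threshold are taken from
your example; the threshold is computed as cg_limit / 10 * 8 on the
assumption that dividing first is the safer order for very large limits.

```c
#define _DEFAULT_SOURCE	/* for usleep() under strict -std= modes */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK_SIZE (1024 * 1024)

/*
 * Hypothetical helper: how long to sleep (in microseconds) after an
 * allocation. Returns 0 below 80% of the limit, 1000 at or above it.
 */
static unsigned int pace_delay_us(size_t allocated, size_t cg_limit)
{
	size_t threshold = cg_limit / 10 * 8;

	return allocated >= threshold ? 1000 : 0;
}

/*
 * Bounded variant of the stress loop (assumption: the real child would
 * loop forever). Chunks are deliberately never freed, since the point
 * is to build memory pressure. Returns the number of bytes touched.
 */
static size_t stress_bounded(size_t cg_limit, size_t max_chunks)
{
	size_t allocated = 0;
	size_t i;
	unsigned int delay;

	for (i = 0; i < max_chunks; i++) {
		char *buf = malloc(CHUNK_SIZE);

		if (!buf) {
			/* Back off and give reclaim a chance to run. */
			usleep(10000);
			continue;
		}

		/* Touch every page so the memory is really charged. */
		memset(buf, 'B', CHUNK_SIZE);
		allocated += CHUNK_SIZE;

		delay = pace_delay_us(allocated, cg_limit);
		if (delay)
			usleep(delay);
	}

	return allocated;
}
```

For example, stress_bounded(10 * CHUNK_SIZE, 4) touches 4MB and never sleeps,
because the 8MB threshold is not reached.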
v2 will be rebased on latest master and sent via git send-email.
Thanks, Soma Das
On 13/04/26 8:55 AM, Li Wang wrote:
> Hi Soma,
>
>> static void stress_child(void)
>> {
>> for (;;) {
>> @@ -63,9 +82,25 @@ static void test_mmap(void)
>>
>> vec = SAFE_MALLOC(npages);
>>
>> + /*
>> + * Protect the parent (test harness) from the OOM killer. Both parent
>> + * and child share the same memcg, so without an explicit hint the OOM
>> + * killer picks based on heuristics that can favour the parent.
>> + */
>> + set_oom_score_adj(-1000);
>> +
>> child = SAFE_FORK();
>> - if (!child)
>> + if (!child) {
>> + /*
>> + * Make the child the preferred OOM victim. If OOM fires while
>> + * the stress worker is filling memory, the kernel must kill the
>> + * child (stress worker) and not the parent (test harness).
>> + * oom_score_adj=1000 is the maximum, guaranteeing this process
>> + * is chosen first within the cgroup.
>> + */
>> + set_oom_score_adj(1000);
> Setting the child's oom_score_adj to 1000 does severely compromise the
> validity of the test. This would typically result in a false negative.
>
> Because once the child gets killed the memory stress will disappear
> immediately, but the parent still keeps looping to check whether the kernel
> reclaims that memory, so it will eventually report TFAIL when the time elapses.
>
> Instead of modifying oom_score_adj to interfere with the OOM killer, a
> better approach is to optimize how the child process generates memory
> pressure, making it more reflective of real-world memory reclaim scenarios.
>
> For example, the child process can allocate larger chunks (e.g., 1MB) to
> rapidly build up memory pressure. Once the total allocated memory approaches
> the cgroup limit (e.g., 80% capacity), a small delay can be introduced
> into the allocation loop. This approach efficiently drives the system to
> its memory limit while providing the kernel's reclaim mechanism a sufficient
> time window to identify and drop MAP_DROPPABLE pages. It also effectively
> avoids an instantaneous memory spike that would otherwise trigger the OOM
> killer prematurely.
>
> #define CHUNK_SIZE (1024 * 1024)
>
> static void stress_child(size_t cg_limit)
> {
>         size_t allocated = 0;
>         size_t threshold = cg_limit * 8 / 10;
>
>         for (;;) {
>                 char *buf = malloc(CHUNK_SIZE);
>
>                 if (!buf) {
>                         usleep(10000);
>                         continue;
>                 }
>
>                 memset(buf, 'B', CHUNK_SIZE);
>                 allocated += CHUNK_SIZE;
>
>                 if (allocated >= threshold) {
>                         usleep(1000);
>                 }
>         }
> }
>
> And some workflow issues:
>
> - Please rebase your code on the latest branch before cooking a patch.
>
> - Use git send-email to send patches to the LTP mailing list.
>
> - FYI: LTP already provides OOM protection functions:
> see: lib/tst_memutils.h
>
> Note:
> This work-email will be disabled soon, reply to:wangli.ahau@gmail.com
>
--
Mailing list info: https://lists.linux.it/listinfo/ltp