* [BPF CI] OOMs on s390x runners
@ 2026-01-20 17:59 Ihor Solodrai
2026-01-21 22:56 ` Ilya Leoshkevich
0 siblings, 1 reply; 4+ messages in thread
From: Ihor Solodrai @ 2026-01-20 17:59 UTC (permalink / raw)
To: Ilya Leoshkevich
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Martin KaFai Lau
Hi Ilya,
The BPF selftests regularly fail on s390x runners with OOMs.
The s390x hosts that I've been maintaining have 16G of memory, hosting
2 runners each. VMs (qemu instances) running the tests currently get
5G of memory each.
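As a rough check of the figures above, the memory budget can be worked out in a few lines. This is only a sketch: the 16G, 2-runner, and 5G numbers come from this message, while the function name and the assumption that each VM grows to its full allocation are illustrative.

```python
def host_headroom_gib(host_gib: int, runners: int, vm_gib: int) -> int:
    """Memory left on the host if every VM uses its full allocation.

    Assumption (not from the thread): one qemu VM per runner, each
    eventually touching all of the memory it was given.
    """
    return host_gib - runners * vm_gib


# Figures quoted in the message: 16G host, 2 runners, 5G per VM.
headroom = host_headroom_gib(host_gib=16, runners=2, vm_gib=5)
print(f"headroom: {headroom}G")
```

Whatever remains is shared by the host OS, the runner processes, and qemu's own overhead.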
I noticed that the s390x runners didn't have swap set up, so I added
an ansible config to set up a swapfile. This seems to have helped with
OOM failures.
https://github.com/libbpf/ci/commit/8767dc05ab84c88da198af3c651511e731ddbac7
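One way to confirm that a host actually has swap after provisioning is to read SwapTotal from /proc/meminfo. The helper below is a hypothetical sketch, not part of the linked ansible config; the function name is made up for illustration.

```python
from pathlib import Path


def swap_total_kib(meminfo_text: str) -> int:
    """Return SwapTotal in KiB from /proc/meminfo-style text, or 0 if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Line format: "SwapTotal:  <number> kB"
            return int(line.split()[1])
    return 0


# Only meaningful on Linux, where /proc/meminfo exists.
meminfo = Path("/proc/meminfo")
if meminfo.exists():
    kib = swap_total_kib(meminfo.read_text())
    print("swap configured" if kib > 0 else "no swap configured")
```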
I spot-checked a few recent failures, and all of them happened
on the "ebpf1-worker-*" runners that, IIRC, you've been maintaining.
Here are a few examples:
- https://github.com/kernel-patches/bpf/actions/runs/21178791628/job/60916006746
- https://github.com/kernel-patches/bpf/actions/runs/21163448380/job/60863171062
- https://github.com/kernel-patches/bpf/actions/runs/21157785072/job/60846570064
Could you please re-provision the hosts with the swap setup?
Let's see if that helps.
Thanks!
* Re: [BPF CI] OOMs on s390x runners
2026-01-20 17:59 [BPF CI] OOMs on s390x runners Ihor Solodrai
@ 2026-01-21 22:56 ` Ilya Leoshkevich
2026-02-05 23:50 ` Ihor Solodrai
0 siblings, 1 reply; 4+ messages in thread
From: Ilya Leoshkevich @ 2026-01-21 22:56 UTC (permalink / raw)
To: Ihor Solodrai
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Martin KaFai Lau
On 1/20/26 18:59, Ihor Solodrai wrote:
> Hi Ilya,
>
> The BPF selftests regularly fail on s390x runners with OOMs.
>
> The s390x hosts that I've been maintaining have 16G of memory, hosting
> 2 runners each. VMs (qemu instances) running the tests currently get
> 5G of memory each.
>
> I noticed that the s390x runners didn't have swap set up, so I added
> an ansible config to set up a swapfile. This seems to have helped with
> OOM failures.
>
> https://github.com/libbpf/ci/commit/8767dc05ab84c88da198af3c651511e731ddbac7
>
> I spot checked a couple of recent failures, and all of them happened
> on the "ebpf1-worker-*" runners that you've been maintaining IIRC.
> Here is a couple of examples:
> - https://github.com/kernel-patches/bpf/actions/runs/21178791628/job/60916006746
> - https://github.com/kernel-patches/bpf/actions/runs/21163448380/job/60863171062
> - https://github.com/kernel-patches/bpf/actions/runs/21157785072/job/60846570064
>
> Could you please re-provision the hosts with the swap setup?
> Let's see if that helps.
>
> Thanks!
Hi Ihor,
Thanks for the scripts!
I reprovisioned all 5 builders, and everything looks healthy.
I will check on them again tomorrow morning.
Best regards,
Ilya
* Re: [BPF CI] OOMs on s390x runners
2026-01-21 22:56 ` Ilya Leoshkevich
@ 2026-02-05 23:50 ` Ihor Solodrai
2026-02-10 21:23 ` Ilya Leoshkevich
0 siblings, 1 reply; 4+ messages in thread
From: Ihor Solodrai @ 2026-02-05 23:50 UTC (permalink / raw)
To: Ilya Leoshkevich
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Martin KaFai Lau
On 1/21/26 2:56 PM, Ilya Leoshkevich wrote:
>
> On 1/20/26 18:59, Ihor Solodrai wrote:
>> Hi Ilya,
>>
>> The BPF selftests regularly fail on s390x runners with OOMs.
>>
>> The s390x hosts that I've been maintaining have 16G of memory, hosting
>> 2 runners each. VMs (qemu instances) running the tests currently get
>> 5G of memory each.
>>
>> I noticed that the s390x runners didn't have swap set up, so I added
>> an ansible config to set up a swapfile. This seems to have helped with
>> OOM failures.
>>
>> https://github.com/libbpf/ci/commit/8767dc05ab84c88da198af3c651511e731ddbac7
>>
>> I spot checked a couple of recent failures, and all of them happened
>> on the "ebpf1-worker-*" runners that you've been maintaining IIRC.
>> Here is a couple of examples:
>> - https://github.com/kernel-patches/bpf/actions/runs/21178791628/job/60916006746
>> - https://github.com/kernel-patches/bpf/actions/runs/21163448380/job/60863171062
>> - https://github.com/kernel-patches/bpf/actions/runs/21157785072/job/60846570064
>>
>> Could you please re-provision the hosts with the swap setup?
>> Let's see if that helps.
>>
>> Thanks!
>
> Hi Ihor,
>
>
> Thanks for the scripts!
>
> I reprovisioned all 5 builders, everything looks healthy.
>
> I will check on them again tomorrow morning.
Hi Ilya.
We ended up disabling KASAN for s390x and that helped with OOMs.
I noticed a couple of "no space left on device" errors on your runners
today. Could you please take a look? Example:
https://github.com/kernel-patches/bpf/actions/runs/21732338737/job/62690585484
Thanks!
>
>
> Best regards,
>
> Ilya
>
* Re: [BPF CI] OOMs on s390x runners
2026-02-05 23:50 ` Ihor Solodrai
@ 2026-02-10 21:23 ` Ilya Leoshkevich
0 siblings, 0 replies; 4+ messages in thread
From: Ilya Leoshkevich @ 2026-02-10 21:23 UTC (permalink / raw)
To: Ihor Solodrai
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Martin KaFai Lau
On 2/6/26 00:50, Ihor Solodrai wrote:
> On 1/21/26 2:56 PM, Ilya Leoshkevich wrote:
>> On 1/20/26 18:59, Ihor Solodrai wrote:
>>> Hi Ilya,
>>>
>>> The BPF selftests regularly fail on s390x runners with OOMs.
>>>
>>> The s390x hosts that I've been maintaining have 16G of memory, hosting
>>> 2 runners each. VMs (qemu instances) running the tests currently get
>>> 5G of memory each.
>>>
>>> I noticed that the s390x runners didn't have swap set up, so I added
>>> an ansible config to set up a swapfile. This seems to have helped with
>>> OOM failures.
>>>
>>> https://github.com/libbpf/ci/commit/8767dc05ab84c88da198af3c651511e731ddbac7
>>>
>>> I spot checked a couple of recent failures, and all of them happened
>>> on the "ebpf1-worker-*" runners that you've been maintaining IIRC.
>>> Here is a couple of examples:
>>> - https://github.com/kernel-patches/bpf/actions/runs/21178791628/job/60916006746
>>> - https://github.com/kernel-patches/bpf/actions/runs/21163448380/job/60863171062
>>> - https://github.com/kernel-patches/bpf/actions/runs/21157785072/job/60846570064
>>>
>>> Could you please re-provision the hosts with the swap setup?
>>> Let's see if that helps.
>>>
>>> Thanks!
>> Hi Ihor,
>>
>>
>> Thanks for the scripts!
>>
>> I reprovisioned all 5 builders, everything looks healthy.
>>
>> I will check on them again tomorrow morning.
> Hi Ilya.
>
> We ended up disabling KASAN for s390x and that helped with OOMs.
>
> I noticed a couple "no space left on device" errors on your runners
> today, could you please take a look? Example:
>
> https://github.com/kernel-patches/bpf/actions/runs/21732338737/job/62690585484
>
> Thanks!
>
>>
>> Best regards,
>>
>> Ilya
>>
Hi Ihor,
Thanks for letting me know.
I deleted the unused volumes actions-runner-kernel-patches-worker-02
and actions-runner-kernel-patches-worker-03 from all builders.
Now there should be enough space on all of them.
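A small check like the following could flag a builder that is running low on disk before jobs start failing with "no space left on device". It is a sketch only: the function names and the 10% threshold are assumptions, not something used on these builders.

```python
import shutil


def free_fraction(path: str = "/") -> float:
    """Fraction of the filesystem containing `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def is_low_on_space(path: str = "/", min_free: float = 0.10) -> bool:
    """True if less than `min_free` (a hypothetical threshold) is free."""
    return free_fraction(path) < min_free


print(f"free: {free_fraction('.'):.1%}, low: {is_low_on_space('.')}")
```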
Best regards,
Ilya