Date: Mon, 2 Feb 2026 04:06:32 +0000
From: Matt Bobrowski
To: Roman Gushchin
Cc: Michal Hocko, bpf@vger.kernel.org, Alexei Starovoitov, Shakeel Butt, JP Kobryn, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Suren Baghdasaryan, Johannes Weiner, Andrew Morton
Subject: Re: [PATCH bpf-next v3 07/17] mm: introduce BPF OOM struct ops
References: <20260127024421.494929-1-roman.gushchin@linux.dev> <20260127024421.494929-8-roman.gushchin@linux.dev> <7ia4tsw6hi93.fsf@castle.c.googlers.com>
In-Reply-To: <7ia4tsw6hi93.fsf@castle.c.googlers.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 27, 2026 at 09:12:56PM +0000, Roman Gushchin wrote:
> Michal Hocko writes:
>
> > On Mon 26-01-26 18:44:10, Roman Gushchin wrote:
> >> Introduce a bpf struct ops for implementing custom OOM handling
> >> policies.
> >>
> >> It's possible to load one bpf_oom_ops for the system and one
> >> bpf_oom_ops for every memory cgroup. In case of a memcg OOM, the
> >> cgroup tree is traversed from the OOM'ing memcg up to the root and
> >> corresponding BPF OOM handlers are executed until some memory is
> >> freed. If no memory is freed, the kernel OOM killer is invoked.
> >>
> >> The struct ops provides the bpf_handle_out_of_memory() callback,
> >> which is expected to return 1 if it was able to free some memory and 0
> >> otherwise.
> >> If 1 is returned, the kernel also checks the bpf_memory_freed
> >> field of the oom_control structure, which is expected to be set by
> >> kfuncs suitable for releasing memory (which will be introduced later
> >> in the patch series). If both are set, OOM is considered handled,
> >> otherwise the next OOM handler in the chain is executed: e.g. BPF OOM
> >> attached to the parent cgroup or the kernel OOM killer.
> >
> > I still find this dual reporting a bit confusing. I can see your
> > intention in having a pre-defined "releasers" of the memory to trust BPF
> > handlers more but they do have access to oc->bpf_memory_freed so they
> > can manipulate it. Therefore an additional level of protection is rather
> > weak.
>
> No, they can't. They have only read-only access.
>
> > It is also not really clear to me how this works while there is OOM
> > victim on the way out. (i.e. tsk_is_oom_victim() -> abort case). This
> > will result in no killing therefore no bpf_memory_freed, right? Handler
> > itself should consider its work done. How exactly is this handled?
>
> It's a good question, I see your point...
> Basically we want to give a handler an option to exit with "I promise,
> some memory will be freed soon" without doing anything destructive.
> But keeping it safe at the same time.
>
> I don't have a perfect answer out of my head, maybe some sort of a
> rate-limiter/counter might work? E.g. a handler can promise this N times
> before the kernel kicks in? Any ideas?
>
> > Also is there any way to handle the oom by increasing the memcg limit?
> > I do not see a callback for that.
>
> There is no kfunc yet, but it's a good idea (which we accidentally
> discussed a few days ago). I'll implement it.

Yes, please, this is something that I had mentioned to you the other day
too. With this kind of BPF kfunc, we'll basically be able to handle
memcg-scoped OOM events inline without necessarily being forced to kill
off anything.
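
For anyone following the dual-reporting discussion, here is a rough
userspace model of the handler chain as the commit message describes it:
walk from the OOM'ing memcg up to the root, and treat the OOM as handled
only when a handler returns 1 and bpf_memory_freed was set by a trusted
releaser. This is a sketch under my own assumptions, not the code from the
patch. Every `*_model` type and function name is hypothetical, and
resetting the flag before each handler runs is a guess on my part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct oom_control; only the flag is modelled. */
struct oom_control_model {
	bool bpf_memory_freed;	/* meant to be set only by trusted releasers */
};

/* Hypothetical stand-in for a memcg with an optional attached handler. */
struct memcg_model {
	const char *name;
	struct memcg_model *parent;	/* NULL for the root */
	/* Returns 1 if the handler claims it freed memory, 0 otherwise. */
	int (*handle_oom)(struct memcg_model *memcg,
			  struct oom_control_model *oc);
};

/* Stand-in for a memory-releasing kfunc: the only writer of the flag. */
static void model_release_memory(struct oom_control_model *oc)
{
	oc->bpf_memory_freed = true;
}

/* A handler that claims success without calling any releaser. */
static int model_handler_claims_only(struct memcg_model *memcg,
				     struct oom_control_model *oc)
{
	(void)memcg;
	(void)oc;
	return 1;
}

/* A handler that frees memory through the releaser and reports it. */
static int model_handler_frees(struct memcg_model *memcg,
			       struct oom_control_model *oc)
{
	(void)memcg;
	model_release_memory(oc);
	return 1;
}

/*
 * Walk from the OOM'ing memcg up to the root. Returns the memcg whose
 * handler handled the OOM, or NULL if the caller should fall back to the
 * kernel OOM killer.
 */
static struct memcg_model *model_run_oom_chain(struct memcg_model *memcg,
					       struct oom_control_model *oc)
{
	assert(oc != NULL);

	for (; memcg; memcg = memcg->parent) {
		if (!memcg->handle_oom)
			continue;

		/* Assumption: each handler starts with a cleared flag. */
		oc->bpf_memory_freed = false;

		/* Both the return value and the flag must agree. */
		if (memcg->handle_oom(memcg, oc) == 1 && oc->bpf_memory_freed)
			return memcg;
	}

	return NULL;
}
```

With a child memcg using model_handler_claims_only() and a root using
model_handler_frees(), the walk skips the child's unverified claim and
stops at the root. It also shows Michal's tsk_is_oom_victim() concern:
a handler that legitimately has nothing to kill looks exactly like the
claims-only handler in this model, so the chain moves on.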