From: Roman Gushchin
To: Michal Hocko
Cc: Andrew Morton, linux-kernel@vger.kernel.org, Alexei Starovoitov,
    Suren Baghdasaryan, Shakeel Butt, Johannes Weiner, Andrii Nakryiko,
    JP Kobryn, linux-mm@kvack.org, cgroups@vger.kernel.org,
    bpf@vger.kernel.org, Martin KaFai Lau, Song Liu,
    Kumar Kartikeya Dwivedi, Tejun Heo
Subject: Re: [PATCH v2 13/23] mm: introduce bpf_out_of_memory() BPF kfunc
In-Reply-To: (Michal Hocko's message of "Mon, 10 Nov 2025 10:46:15 +0100")
References: <20251027232206.473085-1-roman.gushchin@linux.dev>
 <20251027232206.473085-3-roman.gushchin@linux.dev>
Date: Tue, 11 Nov 2025 11:13:04 -0800
Message-ID: <87qzu4pem7.fsf@linux.dev>

Michal Hocko writes:

> On Mon 27-10-25 16:21:56, Roman Gushchin wrote:
>> Introduce the bpf_out_of_memory() BPF kfunc, which allows declaring
>> an out-of-memory event and triggering the corresponding kernel OOM
>> handling mechanism.
>>
>> It takes a trusted memcg pointer (or NULL for system-wide OOMs)
>> as an argument, as well as the page order.
>>
>> If the BPF_OOM_FLAGS_WAIT_ON_OOM_LOCK flag is not set, only one OOM
>> can be declared and handled in the system at once, so if the function
>> is called in parallel to another OOM handling, it bails out with -EBUSY.
>> This mode is suited for global OOMs: any concurrent OOM will likely
>> do the job and release some memory. In the blocking mode (which is
>> suited for memcg OOMs) the execution will wait on the oom_lock mutex.
>
> Rather than relying on BPF_OOM_FLAGS_WAIT_ON_OOM_LOCK would it make
> sense to take the oom_lock based on the oc->memcg so that this is
> completely transparent to specific oom bpf handlers?

Idk, I don't have a super-strong opinion here, but giving the user the
flexibility seems more future-proof. E.g. if we later split the oom_lock
so that we can have competing OOMs in different parts of the memcg tree,
will we change the behavior?
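
To make the two modes concrete, a rough sketch of a caller is below. The
kfunc prototype, the flag value, and the attach point are assumptions for
illustration and may not match the series exactly:

/*
 * Sketch only: the kfunc prototype, the flag value and the attach point
 * are assumptions for illustration and may differ from the actual series.
 */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

#define BPF_OOM_FLAGS_WAIT_ON_OOM_LOCK (1ULL << 0)	/* guessed value */

/* Assumed prototype of the kfunc introduced by this patch. */
extern int bpf_out_of_memory(struct mem_cgroup *memcg, int order,
			     u64 flags) __ksym;

SEC("fentry/some_hook")	/* hypothetical attach point */
int BPF_PROG(declare_oom, struct mem_cgroup *memcg)
{
	/*
	 * Memcg OOM: waiting on oom_lock is acceptable, because a
	 * concurrent OOM elsewhere in the tree won't necessarily free
	 * memory in this memcg.
	 */
	bpf_out_of_memory(memcg, 0, BPF_OOM_FLAGS_WAIT_ON_OOM_LOCK);

	/*
	 * Global OOM: pass NULL and don't wait; if another OOM is already
	 * being handled, this returns -EBUSY and we rely on it to free
	 * some memory.
	 */
	bpf_out_of_memory(NULL, 0, 0);

	return 0;
}

char LICENSE[] SEC("license") = "GPL";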
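
And a sketch of the alternative you describe, where the locking mode is
derived from oc->memcg on the kernel side so it stays transparent to BPF
handlers (the helper name is illustrative; oom_lock and oom_control->memcg
are the existing kernel symbols):

/* mm/oom_kill.c-style sketch; the helper name is made up. */
static bool bpf_oom_lock(struct oom_control *oc)
{
	if (oc->memcg) {
		/* memcg OOM: ok to sleep until the current OOM is done */
		mutex_lock(&oom_lock);
		return true;
	}

	/* global OOM: a concurrent OOM will likely free memory, bail out */
	return mutex_trylock(&oom_lock);
}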