Date: Tue, 30 Dec 2025 09:01:46 -0700
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH] io_uring: make overflowing cqe subject to OOM
To: Alexandre Negrel
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20251229201933.515797-1-alexandre@negrel.dev> <5YHjvAsQKKhRWwp95PB0tGlW7nmplpjVW0b5mruoUD73qmg89ntObcPe63oCPf1mhBUh-Y3ARNMcPueF2dUttoWCyWv_KiG3VMIbguuOJHY=@negrel.dev>
From: Jens Axboe
In-Reply-To: <5YHjvAsQKKhRWwp95PB0tGlW7nmplpjVW0b5mruoUD73qmg89ntObcPe63oCPf1mhBUh-Y3ARNMcPueF2dUttoWCyWv_KiG3VMIbguuOJHY=@negrel.dev>

On 12/30/25 7:50 AM, Alexandre Negrel wrote:
>> I'm assuming the issue here is that memcg will look at __GFP_HIGH
>> somehow and allow it to proceed?
>
> Exactly, the allocation succeed even though it exceed cgroup limits.
> After digging through try_charge_memcg(), it seems that OOM killer
> isn't involved unless __GFP_DIRECT_RECLAIM bit is set (see
> gfpflags_allow_blocking).
>
> https://github.com/torvalds/linux/blob/8640b74557fc8b4c300030f6ccb8cd078f665ec8/mm/memcontrol.c#L2329
> https://github.com/torvalds/linux/blob/8640b74557fc8b4c300030f6ccb8cd078f665ec8/include/linux/gfp.h#L38
>
>> In any case, then below should then do the same. Can you test?
>
> I tried it and it seems to fix the issue but in a different way.
> try_charge_memcg now returns -ENOMEM and the allocation failed. The
> completion queue entry is "dropped on the floor" in
> io_cqring_add_overflow.
>
> So I see 3 options here:
> * use GFP_NOWAIT if dropping CQE is ok

We're utterly out of memory at that point, so something has to give. We
can't invent memory out of thin air. Hence dropping the event, and
logging it as such, is imho the way to go. The same thing would've
happened with GFP_ATOMIC, just a bit earlier in the process.

It's worth noting that these are extreme circumstances: the kernel is
completely out of memory, and this will cause various spurious failures
to complete syscalls or other events. Additionally, this only affects
the non-DEFER_TASKRUN case, and DEFER_TASKRUN is what people should be
using anyway.

> * allocate using GFP_KERNEL_ACCOUNT without holding the lock then adding
> overflowing entries while holding the completion_lock (iterating twice over
> compl_reqs)

The only viable way to do that would be to allocate it upfront, which is
a huge waste of time for the normal case where the CQ ring isn't
overflowing. We should not optimize for the slow/broken case, where
userspace overflows the ring.

> * charge memory after releasing the lock. I don't know if this is possible but
> doing kfree(kmalloc(1, GFP_KERNEL_ACCOUNT)) after releasing the lock does the
> job (even though it's dirty).

And that's definitely a no-go as well.

-- 
Jens Axboe
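
For illustration only, here is a minimal userspace sketch of the behaviour discussed above: a charge against a memory limit that a high-priority (GFP_ATOMIC-like) request is allowed to exceed, while a GFP_NOWAIT-like request fails with -ENOMEM and the overflow entry is dropped and counted. This is not the io_uring or memcg code. Every `sk_`/`SK_` name, the flag bit values, and the `dropped` counter are hypothetical stand-ins, and reclaim and the OOM killer are not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bits; the real GFP bit values differ. */
#define SK_GFP_HIGH            0x1u  /* stands in for __GFP_HIGH */
#define SK_GFP_KSWAPD_RECLAIM  0x2u  /* stands in for __GFP_KSWAPD_RECLAIM */

#define SK_GFP_ATOMIC  (SK_GFP_HIGH | SK_GFP_KSWAPD_RECLAIM)
#define SK_GFP_NOWAIT  (SK_GFP_KSWAPD_RECLAIM)

/* Toy memory-cgroup budget (hypothetical). */
struct sk_memcg {
	size_t usage;
	size_t limit;
};

/* Toy overflow entry and ring state (hypothetical). */
struct sk_ocqe {
	struct sk_ocqe *next;
	unsigned long long user_data;
	int res;
};

struct sk_ring {
	struct sk_memcg cg;
	struct sk_ocqe *head;
	struct sk_ocqe *tail;
	unsigned int dropped;	/* number of completions lost */
};

/*
 * Charge 'size' bytes for a non-blocking caller. Under the limit the
 * charge succeeds. Over the limit, a request carrying SK_GFP_HIGH is
 * let through anyway (the behaviour reported in the thread); any other
 * request gets -ENOMEM.
 */
static int sk_try_charge(struct sk_memcg *cg, unsigned int gfp, size_t size)
{
	if (cg->usage + size <= cg->limit || (gfp & SK_GFP_HIGH)) {
		cg->usage += size;
		return 0;
	}
	return -ENOMEM;
}

/*
 * Queue an overflow entry. If the charge or the allocation fails, the
 * completion is dropped and counted rather than queued.
 */
static bool sk_add_overflow(struct sk_ring *ring, unsigned int gfp,
			    unsigned long long user_data, int res)
{
	struct sk_ocqe *ocqe = NULL;

	assert(ring != NULL);

	if (sk_try_charge(&ring->cg, gfp, sizeof(*ocqe)) == 0) {
		ocqe = malloc(sizeof(*ocqe));
		if (!ocqe)
			ring->cg.usage -= sizeof(*ocqe);	/* undo the charge */
	}
	if (!ocqe) {
		ring->dropped++;
		return false;
	}

	ocqe->next = NULL;
	ocqe->user_data = user_data;
	ocqe->res = res;
	if (ring->tail)
		ring->tail->next = ocqe;
	else
		ring->head = ocqe;
	ring->tail = ocqe;
	return true;
}
```

With a limit that fits exactly one entry, a second SK_GFP_NOWAIT request is dropped and counted, whereas an SK_GFP_ATOMIC request pushes usage past the limit. That is the trade-off between the two flags in this toy model.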