From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 20 Jan 2026 12:05:09 +0000
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] io_uring/rsrc: fix RLIMIT_MEMLOCK bypass by removing cross-buffer accounting
To: Yuhao Jiang, Jens Axboe
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org
References: <20260119071039.2113739-1-danisjiang@gmail.com> <2919f3c5-2510-4e97-ab7f-c9eef1c76a69@kernel.dk>
From: Pavel Begunkov
Content-Type: text/plain; charset=UTF-8; format=flowed

On 1/20/26 07:05, Yuhao Jiang wrote:
> Hi Jens,
>
> On Mon, Jan 19, 2026 at 5:40 PM Jens Axboe wrote:
>>
>> On 1/19/26 4:34 PM, Yuhao Jiang wrote:
>>> On Mon, Jan 19, 2026 at 11:03 AM Jens Axboe wrote:
>>>>
>>>> On 1/19/26 12:10 AM, Yuhao Jiang wrote:
>>>>> The trade-off is that memory accounting may be overestimated when
>>>>> multiple buffers share compound pages, but this is safe and prevents
>>>>> the security issue.
>>>>
>>>> I'd be worried that this would break existing setups.
>>>> We obviously need to get the unmap accounting correct, but in terms of
>>>> practicality, any user of registered buffers will have had to bump
>>>> distro limits manually anyway, and in that case it's usually just set
>>>> very high. Otherwise there's very little you can do with it.
>>>>
>>>> How about something else entirely - just track the accounted pages on
>>>> the side. If we ref those, then we can ensure that if a huge page is
>>>> accounted, it's only unaccounted when all existing "users" of it have
>>>> gone away. That means if you drop parts of it, it'll remain accounted.
>>>>
>>>> Something totally untested like the below... Yes it's not a trivial
>>>> amount of code, but it is actually fairly trivial code.
>>>
>>> Thanks, this approach makes sense. I'll send a v3 based on this.
>>
>> Great, thanks! I think the key is tracking this on the side, and then
>> a ref to tell when it's safe to unaccount it. The rest is just
>> implementation details.
>>
>> --
>> Jens Axboe
>>
>
> I've been implementing the xarray-based ref tracking approach for v3.
> While working on it, I discovered an issue with buffer cloning.
>
> If ctx1 has two buffers sharing a huge page, ctx1->hpage_acct[page] = 2.
> Clone to ctx2, now both have a refcount of 2. On cleanup both hit zero
> and unaccount, so we double-unaccount and user->locked_vm goes negative.
>
> The per-context xarray can't coordinate across clones - each context
> tracks its own refcount independently. I think we either need a global
> xarray (shared across all contexts), or just go back to v2. What do
> you think?

Jens' diff is functionally equivalent to your v1 and has exactly the
same problems. Global tracking won't work well. You can try to double
account clones, or wrap it all together with the xarray into an object
that you share between rings on clone. Just make sure it's protected
right.

-- 
Pavel Begunkov
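
For illustration only: a toy userspace model (Python, not kernel code)
of the three clone behaviours discussed above. hpage_acct and locked_vm
are taken from the thread; every class and function name below is
hypothetical, the 512-page huge page assumes 2 MiB huge pages with
4 KiB base pages, and there is no locking at all, which a real shared
object would need.

```python
from collections import Counter


class User:
    """Stand-in for the per-user locked_vm counter (hypothetical model)."""

    def __init__(self):
        self.locked_vm = 0


class HpageAcct:
    """Models hpage_acct: a refcount per huge page, charged once per page."""

    def __init__(self, user):
        self.user = user
        self.refs = Counter()

    def get(self, page, nr_pages):
        if self.refs[page] == 0:
            self.user.locked_vm += nr_pages  # first user charges the page
        self.refs[page] += 1

    def put(self, page, nr_pages):
        self.refs[page] -= 1
        if self.refs[page] == 0:
            del self.refs[page]
            self.user.locked_vm -= nr_pages  # last user uncharges it


class Ring:
    """A ring with registered buffers, each backed by one huge page."""

    def __init__(self, acct):
        self.acct = acct
        self.bufs = []  # list of (page, nr_pages)

    def register(self, page, nr_pages):
        self.acct.get(page, nr_pages)
        self.bufs.append((page, nr_pages))

    def teardown(self):
        for page, nr_pages in self.bufs:
            self.acct.put(page, nr_pages)
        self.bufs.clear()


def clone_copy(src):
    """Problem case: the clone gets its own table with copied counts."""
    acct = HpageAcct(src.acct.user)
    acct.refs = Counter(src.acct.refs)  # counts copied, nothing charged
    dst = Ring(acct)
    dst.bufs = list(src.bufs)
    return dst


def clone_double_account(src):
    """Option 1: the clone has its own table and charges the user again."""
    dst = Ring(HpageAcct(src.acct.user))
    for page, nr_pages in src.bufs:
        dst.register(page, nr_pages)
    return dst


def clone_shared(src):
    """Option 2: both rings take refs in one shared accounting object."""
    dst = Ring(src.acct)
    for page, nr_pages in src.bufs:
        dst.register(page, nr_pages)
    return dst


def run(clone):
    """Two buffers on one huge page, clone the ring, tear both down."""
    user = User()
    ring1 = Ring(HpageAcct(user))
    ring1.register("hpage0", 512)
    ring1.register("hpage0", 512)  # second buffer on the same huge page
    ring2 = clone(ring1)
    ring1.teardown()
    ring2.teardown()
    return user.locked_vm


if __name__ == "__main__":
    for clone in (clone_copy, clone_double_account, clone_shared):
        print(clone.__name__, run(clone))
```

In this model clone_copy leaves locked_vm at -512 after both teardowns,
while the other two variants return to 0. Double accounting briefly
charges the huge page twice, whereas the shared object charges it once
but has to outlive whichever ring is torn down first.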