From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88961C433FE for ; Mon, 3 Oct 2022 12:47:16 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E27618E0001; Mon, 3 Oct 2022 08:47:15 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DD5FB6B0073; Mon, 3 Oct 2022 08:47:15 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C9E928E0001; Mon, 3 Oct 2022 08:47:15 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id B49756B0072 for ; Mon, 3 Oct 2022 08:47:15 -0400 (EDT) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 79D1E1409F6 for ; Mon, 3 Oct 2022 12:47:15 +0000 (UTC) X-FDA: 79979613630.27.C9CBDE9 Received: from mail-lf1-f47.google.com (mail-lf1-f47.google.com [209.85.167.47]) by imf25.hostedemail.com (Postfix) with ESMTP id 01916A0009 for ; Mon, 3 Oct 2022 12:47:14 +0000 (UTC) Received: by mail-lf1-f47.google.com with SMTP id z4so16427672lft.2 for ; Mon, 03 Oct 2022 05:47:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:date:message-id:from:to:cc:subject:date; bh=bbW48lF7w7dqMd3ew4lAUYrRQXOK7chMR3ZlI/inaU4=; b=NGoUgbYfYtHEEYsy9dCs1A9gSkbNIQg2VSZnphma6IT1DMIuVPaEqNzqy37pZE9kvf B3h0E1O4KOfYhRvqwo0TMWPWdCo3ZYSNABgDYaOfhNSmhPfbKa0t9aVmQQFMO7/mnT7K fpUG2XHFQYIyeE8Nob6/UpNH7j/i/9XBYqsfsk0Elum5w90fFoCQLjcTMgtZybM/ByWo R2wGU5eW037JUB4mfk6j7/LezcpdYINK3+5dCPf4Edi7t8YLg+s1qpK39i94feFj65br R6HnacVAvceG4G3eBlnK2VAE7KVwAAWc8F6mXIcy9BojiZDLzbccZqISrA74KAG6rS7u f83A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date; bh=bbW48lF7w7dqMd3ew4lAUYrRQXOK7chMR3ZlI/inaU4=; b=tzVc2+Y+7MHYjoxJKFy+mQwibUNcQ3S1qrjML/AePlQzYwHVOpby3Z6lateBOHWfmW 6uyDSTd1G5sZGu98D7lfaKsVHREi35D5ihZarLoCIxZRhFv+wDjH9qS4wQ8PyFYswfXS VlgQB6KECR0qmBD9+cW+SoHoB+xkzM866aq4ra3hANWCNyrLKPSzNklWUm28BkxB0uV6 cMdVcyiUeebw0POgDlRt5ePu0+oFSDY6i+4W2g2E22qQVMyoNXNujkU2cKipRTPYJvbU 8o+qUEHtXuwizylbC1fE3dbX7z65FHquz6CupkK+kn0pGRxSXU7Mey1RrJsji1WE7Jbd QNZg== X-Gm-Message-State: ACrzQf3LENkxS+zmWGg/eBdsB1KuUcjHyrepfpVeJKCVQMjwR2ytGd1C NLrfseRc6IhyXw1RCTR8DTU= X-Google-Smtp-Source: AMsMyM5kO0YKeK5rgnUvXoxrmYFDf04/f0ZIbTHZOOm5UlTIgYDUorOWUnONufhYfVCnKSPqU9L0Jw== X-Received: by 2002:ac2:4f03:0:b0:49b:bc01:35d3 with SMTP id k3-20020ac24f03000000b0049bbc0135d3mr6960800lfr.467.1664801233190; Mon, 03 Oct 2022 05:47:13 -0700 (PDT) Received: from ?IPV6:2a02:2168:a11:244b::1? ([2a02:2168:a11:244b::1]) by smtp.gmail.com with ESMTPSA id bd21-20020a05651c169500b0026c037747bfsm890574ljb.3.2022.10.03.05.47.12 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 03 Oct 2022 05:47:12 -0700 (PDT) Message-ID: Date: Mon, 3 Oct 2022 15:47:10 +0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.3.1 Subject: Re: Possible race in obj_stock_flush_required() vs drain_obj_stock() To: Roman Gushchin Cc: Johannes Weiner , Michal Hocko , Shakeel Butt , Vladimir Davydov , Muchun Song , Sebastian Andrzej Siewior , cgroups@vger.kernel.org, linux-mm@kvack.org References: <1664546131660.1777662787.1655319815@gmail.com> From: Alexander Fedorov In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1664801235; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=bbW48lF7w7dqMd3ew4lAUYrRQXOK7chMR3ZlI/inaU4=; b=iVCl3RuzDHvuD22dpQBq+HBy8sTvjuiLPlONgkDpFdNydZUW1+wVimZ4EluM4ruFDz8SKj bx8VzgSbOOXhQDmgm8amJ5NoWKjeCwvoiO+/ebkl55FuBG26K1fGqn1qBnlNacBp4XiDMZ wRVEcb4dkOWTX5/gFFv1kEeQlnNHPdE= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=NGoUgbYf; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf25.hostedemail.com: domain of halcien@gmail.com designates 209.85.167.47 as permitted sender) smtp.mailfrom=halcien@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1664801235; a=rsa-sha256; cv=none; b=AxsmyaHZHxZtOKao3nBtSIPe7OLCVUop20gwY2JLDLE9c5b9c5v93+x2q3iqMQwaUdX3pd tSV5kPQ+TOzU9wq7aSgj3TSzzxb2/D3HQFZvLHmh6eB2DXQoD/TjRN/BOqQ58gTB3S9HXx pCLBNcOoaJBdo7A9akVEi3HGjmDEZbU= X-Rspamd-Queue-Id: 01916A0009 Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=gmail.com header.s=20210112 header.b=NGoUgbYf; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf25.hostedemail.com: domain of halcien@gmail.com designates 209.85.167.47 as permitted sender) smtp.mailfrom=halcien@gmail.com X-Rspam-User: X-Rspamd-Server: rspam10 X-Stat-Signature: kigemj7767j5abc531g6uuttn8e5tc3o X-HE-Tag: 1664801234-922225 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 02.10.2022 19:16, Roman Gushchin wrote: > On Sat, Oct 01, 2022 at 03:38:43PM +0300, Alexander Fedorov wrote: >> Tested READ_ONCE() patch and it works. > > Thank you! > >> But are rcu primitives an overkill? >> For me they are documenting how actually complex is synchronization here. > > I agree, however rcu primitives will add unnecessary barriers on hot paths. > In this particular case most accesses to stock->cached_objcg are done from > a local cpu, so no rcu primitives are needed. So in my opinion using a > READ_ONCE() is preferred. Understood, then here is patch that besides READ_ONCE() also fixes mentioned use-after-free that exists in 5.10. In mainline the drain_obj_stock() part of the patch should be skipped. Should probably be also Signed-off-by: Roman Gushchin but I am not sure if I have rights to add that :) mm/memcg: fix race in obj_stock_flush_required() vs drain_obj_stock() When obj_stock_flush_required() is called from drain_all_stock() it reads the `memcg_stock->cached_objcg` field twice for another CPU's per-cpu variable, leading to TOCTTOU race: another CPU can simultaneously enter drain_obj_stock() and clear its own instance of `memcg_stock->cached_objcg`. Another problem is in drain_obj_stock() which sets `cached_objcg` to NULL after freeing which might lead to use-after-free. To fix it use READ_ONCE() for TOCTTOU problem and also clear the `cached_objcg` pointer earlier in drain_obj_stock() for use-after-free problem. Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API") Signed-off-by: Alexander Fedorov diff --git a/mm/memcontrol.c b/mm/memcontrol.c index c1152f8747..56bd5ea6d3 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3197,17 +3197,30 @@ static void drain_obj_stock(struct memcg_stock_pcp *stock) stock->nr_bytes = 0; } - obj_cgroup_put(old); + /* + * Clear pointer before freeing memory so that + * drain_all_stock() -> obj_stock_flush_required() + * does not see a freed pointer. + */ stock->cached_objcg = NULL; + obj_cgroup_put(old); } static bool obj_stock_flush_required(struct memcg_stock_pcp *stock, struct mem_cgroup *root_memcg) { + struct obj_cgroup *objcg; struct mem_cgroup *memcg; - if (stock->cached_objcg) { - memcg = obj_cgroup_memcg(stock->cached_objcg); + /* + * stock->cached_objcg can be changed asynchronously, so read + * it using READ_ONCE(). The objcg can't go away though because + * obj_stock_flush_required() is called from within a rcu read + * section. + */ + objcg = READ_ONCE(stock->cached_objcg); + if (objcg) { + memcg = obj_cgroup_memcg(objcg); if (memcg && mem_cgroup_is_descendant(memcg, root_memcg)) return true; }