From mboxrd@z Thu Jan 1 00:00:00 1970
From: Suren Baghdasaryan
Date: Wed, 18 Jan 2023 19:06:10 -0800
Subject: Re: another use-after-free in ep_remove_wait_queue()
To: Munehisa Kamata
Cc: ebiggers@kernel.org, hannes@cmpxchg.org, hdanton@sina.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mengcc@amazon.com
References: <20230113022555.2467724-1-kamatam@amazon.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Fri, Jan 13, 2023 at 9:52 AM Suren Baghdasaryan wrote:
>
> On Thu, Jan 12, 2023 at 6:26 PM Munehisa Kamata wrote:
> >
> > On Thu, 2023-01-12 22:01:24 +0000, Suren Baghdasaryan wrote:
> > >
> > > On Mon, Jan 9, 2023 at 7:06 PM Suren Baghdasaryan wrote:
> > > >
> > > > On Mon, Jan 9, 2023 at 5:33 PM Suren Baghdasaryan wrote:
> > > > >
> > > > > On Sun, Jan 8, 2023 at 3:49 PM Hillf Danton wrote:
> > > > > >
> > > > > > On 8 Jan 2023 14:25:48 -0800 PM Munehisa Kamata wrote:
> > > > > > >
> > > > > > > That patch survived the repro in my original post, however, the waker
> > > > > > > (rmdir) was getting stuck until a file descriptor of the epoll instance or
> > > > > > > the pressure file got closed.
> > > > > > > So, if the following modified repro runs
> > > > > > > with the patch, the waker never returns (unless the sleeper gets killed)
> > > > > > > while holding cgroup_mutex. This doesn't seem to be what you expected to
> > > > > > > see with the patch, does it? Even wake_up_all() does not appear to empty
> > > > > > > the queue, but wake_up_pollfree() does.
> > > > > >
> > > > > > Thanks for your testing. And the debugging completes.
> > > > > >
> > > > > > Mind sending a patch with wake_up_pollfree() folded?
> > > > >
> > > > > I finally had some time to look into this issue. I don't think
> > > > > delaying destruction in psi_trigger_destroy() because there are still
> > > > > users of the trigger as Hillf suggested is a good way to go. Before
> > > > > [1] correct trigger destruction was handled using a
> > > > > psi_trigger.refcount. For some reason I thought it's not needed
> > > > > anymore when we placed one-trigger-per-file restriction in that patch,
> > > > > so I removed it. Obviously that was a wrong move, so I think the
> > > > > cleanest way would be to bring back the refcounting. That way the last
> > > > > user of the trigger (either psi_trigger_poll() or psi_fop_release())
> > > > > will free the trigger.
> > > > > I'll check once more to make sure I did not miss anything and if there
> > > > > are no objections, will post a fix.
> > > >
> > > > Uh, I recalled now why refcounting was not helpful here. I'm making
> > > > the same mistake of thinking that poll_wait() blocks until the call to
> > > > wake_up() which is not the case. Let me think if there is anything
> > > > better than wake_up_pollfree() for this case.
> > >
> > > Hi Munehisa,
> > > Sorry for the delay. I was trying to reproduce the issue but even
> > > after adding a delay before ep_remove_wait_queue() it did not happen.
> >
> > Hi Suren,
> >
> > Thank you for your help here.
> >
> > Just in case, do you have KASAN enabled in your config?
> > If not, this may
> > just silently corrupt a certain memory location and not immediately
> > followed by obvious messages or noticeable event like oops.
>
> Yes, KASAN was enabled in my build.
>
> >
> > > One thing about wake_up_pollfree() solution that does not seem right
> > > to me is this comment at
> > > https://elixir.bootlin.com/linux/latest/source/include/linux/wait.h#L253:
> > >
> > > `In the very rare cases where a ->poll() implementation uses a
> > > waitqueue whose lifetime is tied to a task rather than to the 'struct
> > > file' being polled, this function must be called before the waitqueue
> > > is freed...`
> > >
> > > In our case we free the waitqueue from cgroup_pressure_release(),
> > > which is the handler for `release` operation on cgroup psi files. The
> > > other place calling psi_trigger_destroy() is psi_fop_release(), which
> > > is also tied to the lifetime to the psi files. Therefore the lifetime
> > > of the trigger's waitqueue is tied to the lifetime of the files and
> > > IIUC, we should not be required to use wake_up_pollfree().
> > > Could you please post your .config file? I might be missing some
> > > configuration which prevents the issue from happening on my side.
> >
> > Sure, here is my config.
> >
> > https://gist.github.com/kamatam9/a078bdd9f695e7a0767b061c60e48d50
> >
> > I confirmed that it's reliably reproducible with v6.2-rc3 as shown below.
> >
> > https://gist.github.com/kamatam9/096a79cf59d8ed8785c4267e917b8675
>
> Thanks! I'll try to figure out the difference.

Sorry for the slow progress on this issue. I'm multiplexing between
several tasks ATM but I did not forget about this one. Even though I
still can't get the KASAN UAF report, I clearly see the wrong ordering
when tracing these functions and forcing a delay before
ep_remove_wait_queue(). That should not be happening, so I think
something in the release process needs fixing. Will update once I
figure out the root cause, hopefully sometime this week.
> Suren.
>
> >
> > Regards,
> > Munehisa
> >
> > > Thanks,
> > > Suren.
> > >
> > > > >
> > > > > [1] https://lore.kernel.org/lkml/20220111232309.1786347-1-surenb@google.com/
> > > > >
> > > > > Thanks,
> > > > > Suren.
> > > > >
> > > > > >
> > > > > > Hillf