All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chengming Zhou <chengming.zhou@linux.dev>
To: Yosry Ahmed <yosryahmed@google.com>,
	"Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Nhat Pham <nphamcs@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
	"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
	"21cnbao@gmail.com" <21cnbao@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
	"Gopal, Vinodh" <vinodh.gopal@intel.com>
Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in zswap_decompress().
Date: Wed, 20 Nov 2024 10:31:53 +0800	[thread overview]
Message-ID: <78374b54-1a68-4dc1-a220-dcef30a338c1@linux.dev> (raw)
In-Reply-To: <CAJD7tkZiE3PeRF=9_-ySMr7rDogGQtG9aHuwfZvpMF3uPN6aJQ@mail.gmail.com>

On 2024/11/20 07:44, Yosry Ahmed wrote:
> On Tue, Nov 19, 2024 at 2:35 PM Sridhar, Kanchana P
> <kanchana.p.sridhar@intel.com> wrote:
>>
>>
>>> -----Original Message-----
>>> From: Yosry Ahmed <yosryahmed@google.com>
>>> Sent: Tuesday, November 19, 2024 11:51 AM
>>> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
>>> Cc: Chengming Zhou <chengming.zhou@linux.dev>; Johannes Weiner
>>> <hannes@cmpxchg.org>; Nhat Pham <nphamcs@gmail.com>; linux-
>>> kernel@vger.kernel.org; linux-mm@kvack.org; usamaarif642@gmail.com;
>>> ryan.roberts@arm.com; 21cnbao@gmail.com; akpm@linux-foundation.org;
>>> Feghali, Wajdi K <wajdi.k.feghali@intel.com>; Gopal, Vinodh
>>> <vinodh.gopal@intel.com>
>>> Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in
>>> zswap_decompress().
>>>
>>> On Tue, Nov 19, 2024 at 11:42 AM Sridhar, Kanchana P
>>> <kanchana.p.sridhar@intel.com> wrote:
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Yosry Ahmed <yosryahmed@google.com>
>>>>> Sent: Tuesday, November 19, 2024 11:27 AM
>>>>> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
>>>>> Cc: Chengming Zhou <chengming.zhou@linux.dev>; Johannes Weiner
>>>>> <hannes@cmpxchg.org>; Nhat Pham <nphamcs@gmail.com>; linux-
>>>>> kernel@vger.kernel.org; linux-mm@kvack.org; usamaarif642@gmail.com;
>>>>> ryan.roberts@arm.com; Huang, Ying <ying.huang@intel.com>;
>>>>> 21cnbao@gmail.com; akpm@linux-foundation.org; Feghali, Wajdi K
>>>>> <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com>
>>>>> Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in
>>>>> zswap_decompress().
>>>>>
>>>>> On Tue, Nov 19, 2024 at 11:22 AM Sridhar, Kanchana P
>>>>> <kanchana.p.sridhar@intel.com> wrote:
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Yosry Ahmed <yosryahmed@google.com>
>>>>>>> Sent: Friday, November 15, 2024 1:49 PM
>>>>>>> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
>>>>>>> Cc: Chengming Zhou <chengming.zhou@linux.dev>; Johannes Weiner
>>>>>>> <hannes@cmpxchg.org>; Nhat Pham <nphamcs@gmail.com>; linux-
>>>>>>> kernel@vger.kernel.org; linux-mm@kvack.org;
>>> usamaarif642@gmail.com;
>>>>>>> ryan.roberts@arm.com; Huang, Ying <ying.huang@intel.com>;
>>>>>>> 21cnbao@gmail.com; akpm@linux-foundation.org; Feghali, Wajdi K
>>>>>>> <wajdi.k.feghali@intel.com>; Gopal, Vinodh
>>> <vinodh.gopal@intel.com>
>>>>>>> Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in
>>>>>>> zswap_decompress().
>>>>>>>
>>>>>>> On Fri, Nov 15, 2024 at 1:14 PM Sridhar, Kanchana P
>>>>>>> <kanchana.p.sridhar@intel.com> wrote:
>>>>>>>>
>>>>>>>> Hi Chengming,
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Chengming Zhou <chengming.zhou@linux.dev>
>>>>>>>>> Sent: Wednesday, November 13, 2024 11:24 PM
>>>>>>>>> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>;
>>> Johannes
>>>>>>> Weiner
>>>>>>>>> <hannes@cmpxchg.org>
>>>>>>>>> Cc: Nhat Pham <nphamcs@gmail.com>; Yosry Ahmed
>>>>>>>>> <yosryahmed@google.com>; linux-kernel@vger.kernel.org; linux-
>>>>>>>>> mm@kvack.org; usamaarif642@gmail.com;
>>> ryan.roberts@arm.com;
>>>>>>> Huang,
>>>>>>>>> Ying <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux-
>>>>>>>>> foundation.org; Feghali, Wajdi K <wajdi.k.feghali@intel.com>;
>>> Gopal,
>>>>>>> Vinodh
>>>>>>>>> <vinodh.gopal@intel.com>
>>>>>>>>> Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in
>>>>>>>>> zswap_decompress().
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> On 2024/11/14 14:37, Sridhar, Kanchana P wrote:
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Johannes Weiner <hannes@cmpxchg.org>
>>>>>>>>>>> Sent: Wednesday, November 13, 2024 9:12 PM
>>>>>>>>>>> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
>>>>>>>>>>> Cc: Nhat Pham <nphamcs@gmail.com>; Yosry Ahmed
>>>>>>>>>>> <yosryahmed@google.com>; linux-kernel@vger.kernel.org;
>>> linux-
>>>>>>>>>>> mm@kvack.org; chengming.zhou@linux.dev;
>>>>>>> usamaarif642@gmail.com;
>>>>>>>>>>> ryan.roberts@arm.com; Huang, Ying <ying.huang@intel.com>;
>>>>>>>>>>> 21cnbao@gmail.com; akpm@linux-foundation.org; Feghali,
>>> Wajdi K
>>>>>>>>>>> <wajdi.k.feghali@intel.com>; Gopal, Vinodh
>>>>> <vinodh.gopal@intel.com>
>>>>>>>>>>> Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory
>>> leak in
>>>>>>>>>>> zswap_decompress().
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Nov 14, 2024 at 01:56:16AM +0000, Sridhar, Kanchana
>>> P
>>>>>>> wrote:
>>>>>>>>>>>> So my question was, can we prevent the migration to a
>>> different
>>>>> cpu
>>>>>>>>>>>> by relinquishing the mutex lock after this conditional
>>>>>>>>>>>
>>>>>>>>>>> Holding the mutex doesn't prevent preemption/migration.
>>>>>>>>>>
>>>>>>>>>> Sure, however, is this also applicable to holding the mutex of a
>>> per-
>>>>> cpu
>>>>>>>>>> structure obtained via raw_cpu_ptr()?
>>>>>>>>>
>>>>>>>>> Yes, unless you use migration_disable() or cpus_read_lock() to
>>> protect
>>>>>>>>> this section.
>>>>>>>>
>>>>>>>> Ok.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Would holding the mutex prevent the acomp_ctx of the cpu prior
>>> to
>>>>>>>>>> the migration (in the UAF scenario you described) from being
>>>>> deleted?
>>>>>>>>>
>>>>>>>>> No, cpu offline can kick in anytime to free the acomp_ctx->buffer.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> If holding the per-cpu acomp_ctx's mutex isn't sufficient to
>>> prevent
>>>>> the
>>>>>>>>>> UAF, I agree, we might need a way to prevent the acomp_ctx
>>> from
>>>>> being
>>>>>>>>>> deleted, e.g. with refcounts as you've suggested, or to not use
>>> the
>>>>>>>>>
>>>>>>>>> Right, refcount solution from Johannes is very good IMHO.
>>>>>>>>>
>>>>>>>>>> acomp_ctx at all for the check, instead use a boolean.
>>>>>>>>>
>>>>>>>>> But this is not enough to just avoid using acomp_ctx for the check,
>>>>>>>>> the usage of acomp_ctx inside the mutex is also UAF, since cpu
>>> offline
>>>>>>>>> can kick in anytime to free the acomp_ctx->buffer.
>>>>>>>>
>>>>>>>> I see. How would the refcounts work? Would this add latency to
>>> zswap
>>>>>>>> ops? In low memory situations, could the cpu offlining code over-ride
>>>>>>>> the refcounts?
>>>>>>>
>>>>>>> I think what Johannes meant is that the zswap compress/decompress
>>>>>>> paths grab a ref on the acomp_ctx before using it, and the CPU
>>>>>>> offlining code only drops the initial ref, and does not free the
>>>>>>> buffer directly. The buffer is only freed when the ref drops to zero.
>>>>>>>
>>>>>>> I am not familiar with CPU hotplug, would it be simpler if we have a
>>>>>>> wrapper like get_acomp_ctx() that disables migration or calls
>>>>>>> cpus_read_lock() before grabbing the per-CPU acomp_ctx? A similar
>>>>>>> wrapper, put_acompt_ctx() will be used after we are done using the
>>>>>>> acomp_ctx.
>>>>>>
>>>>>> Would it be sufficient to add a check for mutex_is_locked() in
>>>>>> zswap_cpu_comp_dead() and if this returns true, to exit without
>>> deleting
>>>>>> the acomp?
>>>>>
>>>>> I don't think this works. First of all, it's racy. It's possible the
>>>>> mutex gets locked after we check mutex_is_locked() but before we
>>>>> delete the acomp_ctx. Also, if we find that the mutex is locked, then
>>>>> we do nothing and essentially leak the memory.
>>>>
>>>> Yes, this would assume the cpu offlining code retries at some interval,
>>>> which could prevent the memory leak.
>>>
>>> I am not sure about that, but even so, it wouldn't handle the first
>>> scenario where the mutex gets locked after we check mutex_is_locked().
>>>
>>>>
>>>>>
>>>>> Second, and probably more important, this only checks if anyone is
>>>>> currently holding the mutex. What about tasks that may be sleeping
>>>>> waiting for the mutex to be unlocked? The mutex will be deleted from
>>>>> under them as well.
>>>>
>>>> Wouldn't this and the race described above, also be issues for the
>>>> refcount based approach?
>>>
>>> I don't think so, at least if implemented correctly. There are a lot
>>> of examples around the kernel that use RCU + refcounts for such use
>>> cases. I think there are also some examples in kernel docs.
>>>
>>> That being said, I am wondering if we can get away with something
>>> simpler like holding the cpus read lock or disabling migration as I
>>> suggested earlier, but I am not quite sure.
>>
>> Another idea to consider is how zsmalloc avoids this issue through
>> its use of the local_lock() on the per-cpu mapping area. This disables
>> preemption from zs_map_object() through zs_unmap_object().
>> Would changing the acomp_ctx's mutex to a local_lock solve the
>> problem?
> 
> This is similar to disabling migration as I suggested, but disabling
> preemption means that we cannot sleep, we spin on a lock instead.
> 
> In zswap_compress(), we send the compression request and may sleep
> waiting for it. We also make a non-atomic allocation later that may
> also sleep but that's less of a problem.
> 
> In zswap_decompress() we may also sleep, which is why we sometimes
> copy the data into acomp_ctx->buffer and unmap the handle to begin
> with.
> 
> So I don't think we can just replace the mutex with a lock.
> 
>>
>>>
>>>>
>>>> Also, I am wondering if the mutex design already handles cases where
>>>> tasks are sleeping, waiting for a mutex that disappears?
>>>
>>> I don't believe so. It doesn't make sense for someone to free a mutex
>>> while someone is waiting for it. How would the waiter know if the
>>> memory backing the mutex was freed?
>>
>> Thanks Yosry, all good points. There would need to be some sort of
>> arbiter (for e.g., the cpu offlining code) that would reschedule tasks
>> running on a cpu before shutting it down, which could address
>> this specific issue. I was thinking these are not problems unique to
>> zswap's per-cpu acomp_ctx->mutex wrt the offlining?
> 
> There are a few reasons why zswap has this problem and other code may
> not have it. For example the data structure is dynamically allocated
> and is freed during offlining, it wouldn't be a problem if it was
> static. Also the fact that we don't disable preemption when accessing
> the per-CPU data, as I mentioned earlier, which would prevent the CPU
> we are running on from going offline while we access the per-CPU data.
> 
> I think we should either:
> (a) Use refcounts.
> (b) Disable migration.
> (c) Hold the CPUs read lock.
> 
> I was hoping someone with more knowledge about CPU offlining would
> confirm (b) and (c) would work, but I am pretty confident they would.

+1, I also think (b) or (c) should work with my limited knowledge about
CPU offlining :-) And they seem simpler than refcounts implementation.

Thanks.


      parent reply	other threads:[~2024-11-20  2:32 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-13  5:24 [PATCH v1] mm: zswap: Fix a potential memory leak in zswap_decompress() Kanchana P Sridhar
2024-11-13  5:34 ` Yosry Ahmed
2024-11-13  5:58   ` Sridhar, Kanchana P
2024-11-13  6:21     ` Yosry Ahmed
2024-11-13 19:12       ` Sridhar, Kanchana P
2024-11-13 20:11         ` Yosry Ahmed
2024-11-13 20:59           ` Sridhar, Kanchana P
2024-11-13 20:59             ` Yosry Ahmed
2024-11-13 21:12               ` Sridhar, Kanchana P
2024-11-13 21:30         ` Johannes Weiner
2024-11-13 22:01           ` Yosry Ahmed
2024-11-13 22:13           ` Sridhar, Kanchana P
2024-11-14  0:28             ` Nhat Pham
2024-11-14  1:56               ` Sridhar, Kanchana P
2024-11-14  5:11                 ` Johannes Weiner
2024-11-14  6:37                   ` Sridhar, Kanchana P
2024-11-14  7:24                     ` Chengming Zhou
2024-11-15 21:12                       ` Sridhar, Kanchana P
2024-11-15 21:49                         ` Yosry Ahmed
2024-11-19 19:22                           ` Sridhar, Kanchana P
2024-11-19 19:27                             ` Yosry Ahmed
2024-11-19 19:41                               ` Sridhar, Kanchana P
2024-11-19 19:51                                 ` Yosry Ahmed
2024-11-19 22:35                                   ` Sridhar, Kanchana P
2024-11-19 23:44                                     ` Yosry Ahmed
2024-11-20  0:00                                       ` Sridhar, Kanchana P
2024-11-20  2:31                                       ` Chengming Zhou [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=78374b54-1a68-4dc1-a220-dcef30a338c1@linux.dev \
    --to=chengming.zhou@linux.dev \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=kanchana.p.sridhar@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nphamcs@gmail.com \
    --cc=ryan.roberts@arm.com \
    --cc=usamaarif642@gmail.com \
    --cc=vinodh.gopal@intel.com \
    --cc=wajdi.k.feghali@intel.com \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.