Message-ID: <91246137-a3d2-689f-8ff6-eccc0e61c8fe@redhat.com>
Date: Tue, 16 May 2023 14:30:57 +0200
From: David Hildenbrand
Organization: Red Hat
To: Catalin Marinas
Cc: Peter Collingbourne, Qun-wei Lin (林群崴), linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "surenb@google.com", Chinwen Chang (張錦文), "kasan-dev@googlegroups.com", Kuan-Ying Lee (李冠穎), Casper Li (李中榮), "gregkh@linuxfoundation.org", vincenzo.frascino@arm.com, Alexandru Elisei, will@kernel.org, eugenis@google.com, Steven Price, stable@vger.kernel.org
Subject: Re: [PATCH 1/3] mm: Move arch_do_swap_page() call to before swap_free()
References: <20230512235755.1589034-1-pcc@google.com> <20230512235755.1589034-2-pcc@google.com> <7471013e-4afb-e445-5985-2441155fc82c@redhat.com>

On 15.05.23 19:34, Catalin Marinas wrote:
> On Sat, May 13, 2023 at 05:29:53AM +0200, David Hildenbrand wrote:
>> On 13.05.23 01:57, Peter Collingbourne wrote:
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index 01a23ad48a04..83268d287ff1 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -3914,19 +3914,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>>  		}
>>>  	}
>>> -	/*
>>> -	 * Remove the swap entry and conditionally try to free up the swapcache.
>>> -	 * We're already holding a reference on the page but haven't mapped it
>>> -	 * yet.
>>> -	 */
>>> -	swap_free(entry);
>>> -	if (should_try_to_free_swap(folio, vma, vmf->flags))
>>> -		folio_free_swap(folio);
>>> -
>>> -	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
>>> -	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
>>>  	pte = mk_pte(page, vma->vm_page_prot);
>>> -
>>>  	/*
>>>  	 * Same logic as in do_wp_page(); however, optimize for pages that are
>>>  	 * certainly not shared either because we just allocated them without
>>> @@ -3946,8 +3934,21 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>>  		pte = pte_mksoft_dirty(pte);
>>>  	if (pte_swp_uffd_wp(vmf->orig_pte))
>>>  		pte = pte_mkuffd_wp(pte);
>>> +	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
>>>  	vmf->orig_pte = pte;
>>> +	/*
>>> +	 * Remove the swap entry and conditionally try to free up the swapcache.
>>> +	 * We're already holding a reference on the page but haven't mapped it
>>> +	 * yet.
>>> +	 */
>>> +	swap_free(entry);
>>> +	if (should_try_to_free_swap(folio, vma, vmf->flags))
>>> +		folio_free_swap(folio);
>>> +
>>> +	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
>>> +	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
>>> +
>>>  	/* ksm created a completely new copy */
>>>  	if (unlikely(folio != swapcache && swapcache)) {
>>>  		page_add_new_anon_rmap(page, vma, vmf->address);
>>> @@ -3959,7 +3960,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>>  	VM_BUG_ON(!folio_test_anon(folio) ||
>>>  			(pte_write(pte) && !PageAnonExclusive(page)));
>>>  	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
>>> -	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
>>>  	folio_unlock(folio);
>>>  	if (folio != swapcache && swapcache) {
>>
>>
>> You are moving the folio_free_swap() call after the folio_ref_count(folio)
>> == 1 check, which means that such (previously) swapped pages that are
>> exclusive cannot be detected as exclusive.
>>
>> There must be a better way to handle MTE here.
>>
>> Where are the tags stored, how is the location identified, and when are they
>> effectively restored right now?
>
> I haven't gone through Peter's patches yet but a pretty good description
> of the problem is here:
> https://lore.kernel.org/all/5050805753ac469e8d727c797c2218a9d780d434.camel@mediatek.com/.
> I couldn't reproduce it with my swap setup but both Qun-wei and Peter
> triggered it.
>
> When a tagged page is swapped out, the arm64 code stores the metadata
> (tags) in a local xarray indexed by the swap pte. When restoring from
> swap, the arm64 set_pte_at() checks this xarray using the old swap pte
> and spills the tags onto the new page. Apparently something changed in
> the kernel recently that causes swap_range_free() to be called before
> set_pte_at(). The arm64 arch_swap_invalidate_page() frees the metadata
> from the xarray and the subsequent set_pte_at() won't find it.
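
The ordering problem described in the paragraph above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: a fixed-size array stands in for the xarray, and every name in it (tag_store_save, tag_store_invalidate, tag_store_restore, swapin_model, MAX_SLOTS, TAG_BYTES) is hypothetical and chosen for illustration.

```c
/*
 * Toy userspace model of the save/invalidate/restore ordering; NOT kernel
 * code. All names and sizes below are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 8	/* hypothetical number of swap slots */
#define TAG_BYTES 4	/* hypothetical size of the per-page metadata */

struct tag_slot {
	bool used;
	unsigned char tags[TAG_BYTES];
};

/* Stands in for the xarray indexed by the swap entry. */
static struct tag_slot tag_store[MAX_SLOTS];

/* Swap-out: save the page's tags under its swap entry. */
static void tag_store_save(size_t entry, const unsigned char *tags)
{
	assert(entry < MAX_SLOTS);
	memcpy(tag_store[entry].tags, tags, TAG_BYTES);
	tag_store[entry].used = true;
}

/* Models the invalidate hook on the swap-free path: drop the saved tags. */
static void tag_store_invalidate(size_t entry)
{
	assert(entry < MAX_SLOTS);
	tag_store[entry].used = false;
}

/* Models the restore step: copy saved tags to the new page, if any exist. */
static bool tag_store_restore(size_t entry, unsigned char *page_tags)
{
	assert(entry < MAX_SLOTS);
	if (!tag_store[entry].used)
		return false;
	memcpy(page_tags, tag_store[entry].tags, TAG_BYTES);
	return true;
}

/* Swap-in under either ordering; true if the new page got its tags back. */
static bool swapin_model(bool restore_before_free)
{
	const unsigned char saved[TAG_BYTES] = { 0xA, 0xB, 0xC, 0xD };
	unsigned char page_tags[TAG_BYTES] = { 0 };
	const size_t entry = 3;
	bool restored;

	tag_store_save(entry, saved);
	if (restore_before_free) {
		restored = tag_store_restore(entry, page_tags);
		tag_store_invalidate(entry);
	} else {
		/* The failing order: the slot is invalidated first. */
		tag_store_invalidate(entry);
		restored = tag_store_restore(entry, page_tags);
	}
	return restored && memcmp(page_tags, saved, TAG_BYTES) == 0;
}
```

In this model, swapin_model(true) recovers the tags and swapin_model(false) does not, which mirrors the report that set_pte_at() finds nothing once the swap slot has been freed first.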
>
> If we have the page, the metadata can be restored before set_pte_at()
> and I guess that's what Peter is trying to do (again, I haven't looked
> at the details yet; leaving it for tomorrow).

Thanks for the details! I was missing that we also have a hook in
swap_range_free().

>
> Is there any other way of handling this? E.g. not release the metadata
> in arch_swap_invalidate_page() but later in set_pte_at() once it was
> restored. But then we may leak this metadata if there's no set_pte_at()
> (the process mapping the swap entry died).

That was my immediate thought: do we really have to hook into
swap_range_free() at all? And I also wondered why we have to do this
from set_pte_at() and not do this explicitly (maybe that's the other
arch_* callback on the swapin path).

I'll have a look at v2, maybe it can be fixed easily without having to
shuffle around too much of the swapin code (which can easily break again
because the dependencies are not obvious at all and even undocumented in
the code).

-- 
Thanks,

David / dhildenb
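
The trade-off of the alternative raised in the quoted text (release the metadata on restore rather than on invalidate) can be sketched the same way. This is again a toy userspace model under stated assumptions, not kernel code; a single static slot stands in for the metadata store, and all names (meta_save, meta_invalidate, meta_restore_and_release, deferred_release_leaks) are hypothetical.

```c
/*
 * Toy userspace model of "release on restore"; NOT kernel code.
 * All names below are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>

struct meta_slot {
	bool used;
	unsigned char tag;
};

/* One swap entry is enough for this sketch. */
static struct meta_slot slot;

/* Swap-out: remember the metadata. */
static void meta_save(unsigned char tag)
{
	slot.tag = tag;
	slot.used = true;
}

/* Invalidate hook in this variant: deliberately leaves the metadata alone. */
static void meta_invalidate(void)
{
}

/* Restore hook in this variant: hand the tag back, then release the slot. */
static bool meta_restore_and_release(unsigned char *tag_out)
{
	if (!slot.used)
		return false;
	*tag_out = slot.tag;
	slot.used = false;
	return true;
}

/* True if metadata is still held after the swap entry is gone. */
static bool deferred_release_leaks(bool faulted_in)
{
	unsigned char tag = 0;

	slot.used = false;
	meta_save(0x5);
	meta_invalidate();		/* swap slot freed first */
	if (faulted_in) {
		/* Tags are still found, because invalidate kept them. */
		bool ok = meta_restore_and_release(&tag);

		assert(ok && tag == 0x5);
	}
	return slot.used;
}
```

Here deferred_release_leaks(true) reports no leftover metadata, while deferred_release_leaks(false), the case where the mapping process dies and no restore ever runs, leaves the slot occupied. That is the leak the quoted text warns about.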