From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <671d9bbb-0f19-2710-00ef-47734085dddc@redhat.com>
Date: Tue, 31 Jan 2023 09:47:01 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0
Subject: Re: [PATCH v1] sparc/mm: don't unconditionally set HW writable bit when setting PTE dirty on 64bit
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, sparclinux@vger.kernel.org, Andrew Morton, "David S. Miller", Peter Xu, Hev, Anatoly Pugachev, Raghavendra K T, Thorsten Leemhuis, Mike Kravetz, "Kirill A. Shutemov", Juergen Gross
References: <20221212130213.136267-1-david@redhat.com>
From: David Hildenbrand
Organization: Red Hat
In-Reply-To: <20221212130213.136267-1-david@redhat.com>
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 12.12.22 14:02, David Hildenbrand wrote:
> On sparc64, there is no HW modified bit, therefore, SW tracks via a SW
> bit if the PTE is dirty via pte_mkdirty(). However, pte_mkdirty()
> currently also unconditionally sets the HW writable bit, which is wrong.
>
> pte_mkdirty() is not supposed to make a PTE actually writable, unless the
> SW writable bit (pte_write()) indicates that the PTE is not
> write-protected. Fortunately, sparc64 also defines a SW writable bit.
>
> For example, this already turned into a problem in the context of
> THP splitting as documented in commit 624a2c94f5b7 ("Partly revert "mm/thp:
> carry over dirty bit when thp splits on pmd") and might be an issue during
> page migration in mm/migrate.c:remove_migration_pte() as well where we:
>	if (folio_test_dirty(folio) && is_migration_entry_dirty(entry))
>		pte = pte_mkdirty(pte);
>
> But more generally, anything like:
>	maybe_mkwrite(pte_mkdirty(pte), vma)
> code is broken on sparc64, because it will unconditionally set the HW
> writable bit even if the SW writable bit is not set.
>
> Simple reproducer that will result in a writable PTE after ptrace
> access, to highlight the problem and as an easy way to verify if it has
> been fixed:
>
> --------------------------------------------------------------------------
> #include <stdio.h>
> #include <stdlib.h>
> #include <stdint.h>
> #include <errno.h>
> #include <fcntl.h>
> #include <signal.h>
> #include <unistd.h>
> #include <sys/mman.h>
>
> static void signal_handler(int sig)
> {
>	if (sig == SIGSEGV)
>		printf("[PASS] SIGSEGV generated\n");
>	else
>		printf("[FAIL] wrong signal generated\n");
>	exit(0);
> }
>
> int main(void)
> {
>	size_t pagesize = getpagesize();
>	char data = 1;
>	off_t offs;
>	int mem_fd;
>	char *map;
>	int ret;
>
>	mem_fd = open("/proc/self/mem", O_RDWR);
>	if (mem_fd < 0) {
>		fprintf(stderr, "open(/proc/self/mem) failed: %d\n", errno);
>		return 1;
>	}
>
>	map = mmap(NULL, pagesize, PROT_READ, MAP_PRIVATE|MAP_ANON, -1, 0);
>	if (map == MAP_FAILED) {
>		fprintf(stderr, "mmap() failed: %d\n", errno);
>		return 1;
>	}
>
>	printf("original: %x\n", *map);
>
>	/* debug access */
>	offs = lseek(mem_fd, (uintptr_t) map, SEEK_SET);
>	ret = write(mem_fd, &data, 1);
>	if (ret != 1) {
>		fprintf(stderr, "pwrite(/proc/self/mem) failed with %d: %d\n", ret, errno);
>		return 1;
>	}
>	if (*map != data) {
>		fprintf(stderr, "pwrite(/proc/self/mem) not visible\n");
>		return 1;
>	}
>
>	printf("ptrace: %x\n", *map);
>
>	/* Install signal handler. */
>	if (signal(SIGSEGV, signal_handler) == SIG_ERR) {
>		fprintf(stderr, "signal() failed\n");
>		return 1;
>	}
>
>	/* Ordinary access. */
>	*map = 2;
>
>	printf("access: %x\n", *map);
>
>	printf("[FAIL] SIGSEGV not generated\n");
>
>	return 0;
> }
> --------------------------------------------------------------------------
>
> Without this commit (sun4u in QEMU):
>	# ./reproducer
>	original: 0
>	ptrace: 1
>	access: 2
>	[FAIL] SIGSEGV not generated
>
> Let's fix this by setting the HW writable bit only if both the SW dirty
> bit and the SW writable bit are set. This matches, for example, how
> s390x handles pte_mkwrite() and pte_mkdirty() -- except that they have
> to clear the _PAGE_PROTECT bit.
>
> We have to move pte_dirty() and pte_write() up. The code patching
> mechanism and handling constants > 22bit is a bit special on sparc64.
>
> With this commit (sun4u in QEMU):
>	# ./reproducer
>	original: 0
>	ptrace: 1
>	[PASS] SIGSEGV generated
>
> This handling seems to have been in place forever.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Cc: Andrew Morton
> Cc: "David S. Miller"
> Cc: Peter Xu
> Cc: Hev
> Cc: Anatoly Pugachev
> Cc: Raghavendra K T
> Cc: Thorsten Leemhuis
> Cc: Mike Kravetz
> Cc: "Kirill A. Shutemov"
> Cc: Juergen Gross
> Signed-off-by: David Hildenbrand
> ---

Ping

-- 
Thanks,

David / dhildenb