Message-ID: <0cc8a118-2522-f666-5bcc-af06263fd352@redhat.com>
Date: Mon, 11 Sep 2023 14:29:03 +0200
To: Catalin Marinas, Alexandru Elisei
Cc: will@kernel.org, oliver.upton@linux.dev, maz@kernel.org,
 james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com,
 arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com,
 peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org,
 dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
 mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
 mhiramat@kernel.org, rppt@kernel.org, hughd@google.com, pcc@google.com,
 steven.price@arm.com, anshuman.khandual@arm.com,
 vincenzo.frascino@arm.com, eugenis@google.com, kcc@google.com,
 hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev,
 linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org
References: <20230823131350.114942-1-alexandru.elisei@arm.com>
 <33def4fe-fdb8-6388-1151-fabd2adc8220@redhat.com>
 <0b9c122a-c05a-b3df-c69f-85f520294adc@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [PATCH RFC 00/37] Add support for arm64 MTE dynamic tag storage reuse

On 11.09.23 13:52, Catalin Marinas wrote:
> On Wed, Sep 06, 2023 at 12:23:21PM +0100, Alexandru Elisei wrote:
>> On Thu, Aug 24, 2023 at 04:24:30PM +0100, Catalin Marinas wrote:
>>> On Thu, Aug 24, 2023 at 01:25:41PM +0200, David Hildenbrand wrote:
>>>> On 24.08.23 13:06, David Hildenbrand wrote:
>>>>> Regarding one complication: "The kernel needs to know where to allocate
>>>>> a PROT_MTE page from or migrate a current page if it becomes PROT_MTE
>>>>> (mprotect()) and the range it is in does not support tagging.",
>>>>> simplified handling would be if it's in a MIGRATE_CMA pageblock, it
>>>>> doesn't support tagging. You have to migrate to a !CMA page (for
>>>>> example, not specifying GFP_MOVABLE as a quick way to achieve that).
>>>>
>>>> Okay, I now realize that this patch set effectively duplicates some CMA
>>>> behavior using a new migrate-type.
> [...]
>> I considered mixing the tag storage memory with normal memory and
>> adding it to MIGRATE_CMA. But since tag storage memory cannot be tagged,
>> this means that it's not enough anymore to have a __GFP_MOVABLE allocation
>> request to use MIGRATE_CMA.
>>
>> I considered two solutions to this problem:
>>
>> 1. Only allocate from MIGRATE_CMA if the requested memory is not tagged =>
>> this effectively means transforming all memory from MIGRATE_CMA into the
>> MIGRATE_METADATA migratetype that the series introduces. Not very
>> appealing, because that means treating normal memory that is also on the
>> MIGRATE_CMA lists as tagged memory.
>
> That's indeed not ideal. We could try this if it makes the patches
> significantly simpler, though I'm not so sure.
>
> Allocating metadata is the easier part as we know the correspondence
> from the tagged pages (32 PROT_MTE pages) to the metadata page (1 tag
> storage page), so alloc_contig_range() does this for us. Just adding it
> to the CMA range is sufficient.
>
> However, making sure that we don't allocate PROT_MTE pages from the
> metadata range is what led us to another migrate type. I guess we could
> achieve something similar with a new zone or a CPU-less NUMA node,

Ideally, we would avoid significant core-mm changes just to optimize for
an architecture oddity. That implies no new zones and no new
migratetypes, unless it is unavoidable and you are confident that you
can convince core-MM people that the use case (giving back at most 3% of
system RAM in some setups) is worth the trouble.

I also had CPU-less NUMA nodes in mind when thinking about that, but I'm
not sure how easy that would be to integrate. If the tag memory actually
has different performance characteristics as well, a NUMA node would be
the right choice.

If we could find some way to easily support this via either CMA or
CPU-less NUMA nodes, that would be much preferable, even if we cannot
cover each and every future use case right now. I expect some issues
with CXL+MTE either way, but am happy to be taught otherwise :)

Another thought I had was adding something like CMA memory
characteristics: for example, asking whether a given CMA area/page
supports tagging (i.e., is a flag set for the CMA area?).

When you need memory that supports tagging and have a page that does not
support tagging (CMA && !taggable), simply migrate to !MOVABLE memory
(eventually we could also try adding !CMA).

Was that discussed, and what would be the challenges with that? Page
migration due to compaction comes to mind, but it might also be easy to
handle if we can just avoid CMA memory for that.

> though the latter is not guaranteed not to allocate memory from the
> range, only make it less likely. Both these options are less flexible in
> terms of size/alignment/placement.
>
> Maybe as a quick hack - only allow PROT_MTE from ZONE_NORMAL and
> configure the metadata range in ZONE_MOVABLE but at some point I'd
> expect some CXL-attached memory to support MTE with additional carveout
> reserved.

I have no idea how we could cleanly support memory hotplug in virtual
environments (virtual DIMMs, virtio-mem) with MTE. In contrast to s390x
storage keys, the approach that arm64 took with MTE here (exposing tag
memory to the VM) makes that rather hard and complicated.

-- 
Cheers,

David / dhildenb