Message-ID: <2368d91f-8442-076f-f33a-64b51b44825c@redhat.com>
Date: Fri, 2 Sep 2022 08:32:42 +0200
To: Yang Shi, Peter Xu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 "Kirill A . Shutemov", "Aneesh Kumar K . V", Vlastimil Babka,
 Jerome Marchand, Andrea Arcangeli, Hugh Dickins, Jason Gunthorpe,
 John Hubbard
References: <20220901072119.37588-1-david@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [PATCH v1] mm/gup: adjust stale comment for RCU GUP-fast

On 01.09.22 20:35, Yang Shi wrote:
> On Thu, Sep 1, 2022 at 11:07 AM Peter Xu wrote:
>>
>> On Thu, Sep 01, 2022 at 10:50:48AM -0700, Yang Shi wrote:
>>> Yeah, because THP collapse does copy the data before clearing pte. If
>>> we want to remove pmdp_collapse_flush() by just clearing pmd, we
>>> should clear *AND* flush pte before copying the data IIRC.
>>
>> Yes tlb flush is still needed. IIUC the generic pmdp_collapse_flush() will
>> still be working (with the pte level flushing there) but it should just
>> start to work for all archs, so potentially we could drop the arch-specific
>> pmdp_collapse_flush()s, mostly the ppc impl.
>
> I don't know why powerpc needs to have its specific
> pmdp_collapse_flush() in the first place, not only the mandatory IPI
> broadcast, but also the specific implementation of pmd tlb flush. But
> anyway the IPI broadcast could be removed at least IMO.
>

pmdp_collapse_flush() is overridden on book3s only. It translates to
either radix__pmdp_collapse_flush() or hash__pmdp_collapse_flush().
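
To make the dispatch concrete, here is a minimal userspace sketch, not the
kernel code itself. pmd_t, current_mmu, last_variant, ipi_broadcasts and
model_ipi_broadcast() are hypothetical stand-ins invented for illustration,
and the real functions also take a vma and an address and flush the TLB.
The only point is that both book3s variants clear the pmd and then broadcast
an IPI before returning the old value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not the kernel's types. */
typedef struct { unsigned long val; } pmd_t;

enum mmu_mode { MMU_HASH, MMU_RADIX };
enum variant { VARIANT_NONE, VARIANT_HASH, VARIANT_RADIX };

static enum mmu_mode current_mmu = MMU_RADIX; /* assumed runtime MMU mode */
static enum variant last_variant = VARIANT_NONE;
static int ipi_broadcasts;                    /* counts modeled IPI broadcasts */

static bool radix_enabled(void)
{
	return current_mmu == MMU_RADIX;
}

/* Models the IPI broadcast that waits for IRQ-disabled walkers to finish. */
static void model_ipi_broadcast(void)
{
	ipi_broadcasts++;
}

/* Shared shape of both variants: save, clear, then serialize via IPI. */
static pmd_t clear_and_serialize(pmd_t *pmdp, enum variant which)
{
	pmd_t old;

	assert(pmdp != NULL);
	old = *pmdp;
	pmdp->val = 0;          /* clear the pmd first */
	model_ipi_broadcast();  /* then wait for parallel walkers */
	last_variant = which;
	return old;
}

static pmd_t radix__pmdp_collapse_flush(pmd_t *pmdp)
{
	return clear_and_serialize(pmdp, VARIANT_RADIX);
}

static pmd_t hash__pmdp_collapse_flush(pmd_t *pmdp)
{
	return clear_and_serialize(pmdp, VARIANT_HASH);
}

/* Dispatch on the MMU mode, as the book3s override does. */
static pmd_t pmdp_collapse_flush(pmd_t *pmdp)
{
	if (radix_enabled())
		return radix__pmdp_collapse_flush(pmdp);
	return hash__pmdp_collapse_flush(pmdp);
}
```

Whether the IPI is still needed in each variant is exactly the open question
here; the sketch only shows where it sits relative to the pmd clear.
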

radix__pmdp_collapse_flush() has a comment explaining the situation:

+	/*
+	 * pmdp collapse_flush need to ensure that there are no parallel gup
+	 * walk after this call. This is needed so that we can have stable
+	 * page ref count when collapsing a page. We don't allow a collapse page
+	 * if we have gup taken on the page. We can ensure that by sending IPI
+	 * because gup walk happens with IRQ disabled.
+	 */

The comment for hash__pmdp_collapse_flush() is a bit more involved:

	/*
	 * Wait for all pending hash_page to finish. This is needed
	 * in case of subpage collapse. When we collapse normal pages
	 * to hugepage, we first clear the pmd, then invalidate all
	 * the PTE entries. The assumption here is that any low level
	 * page fault will see a none pmd and take the slow path that
	 * will wait on mmap_lock. But we could very well be in a
	 * hash_page with local ptep pointer value. Such a hash page
	 * can result in adding new HPTE entries for normal subpages.
	 * That means we could be modifying the page content as we
	 * copy them to a huge page. So wait for parallel hash_page
	 * to finish before invalidating HPTE entries. We can do this
	 * by sending an IPI to all the cpus and executing a dummy
	 * function there.
	 */

I'm not sure if that implies that the IPI is needed for some other
hash-magic. Maybe Aneesh can clarify.

>>
>> This also reminded me that the s390 version of pmdp_collapse_flush() is a
>> bit weird, since it doesn't even have the tlb flush there. I feel like
>> it's broken but I can't really tell whether I've overlooked something.
>> Worth an eye on.
>
> I don't know why. But if s390 doesn't flush tlb in
> pmdp_collapse_flush(), then there may be a data integrity problem since
> the page is still writable when copying the data because pte is
> cleared after data copying. Or does s390 hardware flush the tlb
> automatically?

s390x does a pmdp_huge_get_and_clear().

pmdp_huge_get_and_clear() does a pmdp_xchg_direct().

pmdp_xchg_direct() does a pmdp_flush_direct().

pmdp_flush_direct() issues an IDTE, which is a TLB flush.

Note that this matches ptep_get_and_clear() behavior on s390x. Quoting
the comment in there:

/*
 * This is hard to understand. ptep_get_and_clear and ptep_clear_flush
 * both clear the TLB for the unmapped pte. The reason is that
 * ptep_get_and_clear is used in common code (e.g. change_pte_range)
 * to modify an active pte. The sequence is
 *   1) ptep_get_and_clear
 *   2) set_pte_at
 *   3) flush_tlb_range
 * On s390 the tlb needs to get flushed with the modification of the pte
 * if the pte is active. The only way how this can be implemented is to
 * have ptep_get_and_clear do the tlb flush. In exchange flush_tlb_range
 * is a nop.
 */

-- 
Thanks,

David / dhildenb