Date: Thu, 15 Aug 2024 14:52:56 -0400
From: Peter Xu
To: Jason Gunthorpe
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sean Christopherson, Oscar Salvador, Axel Rasmussen, linux-arm-kernel@lists.infradead.org, x86@kernel.org, Will Deacon, Gavin Shan, Paolo Bonzini, Zi Yan, Andrew Morton, Catalin Marinas, Ingo Molnar, Alistair Popple, Borislav Petkov, David Hildenbrand, Thomas Gleixner, kvm@vger.kernel.org, Dave Hansen, Alex Williamson, Yan Zhao
Subject: Re: [PATCH 09/19] mm: New follow_pfnmap API
References: <20240809160909.1023470-1-peterx@redhat.com> <20240809160909.1023470-10-peterx@redhat.com> <20240814131954.GK2032816@nvidia.com> <20240814221441.GB2032816@nvidia.com> <20240815161603.GH2032816@nvidia.com> <20240815172445.GK2032816@nvidia.com>
In-Reply-To: <20240815172445.GK2032816@nvidia.com>

On Thu, Aug 15, 2024 at 02:24:45PM -0300, Jason Gunthorpe wrote:
> On Thu, Aug 15, 2024 at 01:21:01PM -0400, Peter Xu wrote:
> > > Why? Either the function only returns PFN map no-struct page things or
> > > it returns struct page stuff too, in which case why bother to check
> > > the VMA flags if the caller already has to be correct for struct page
> > > backed results?
> > >
> > > This function is only safe to use under the proper locking, and under
> > > those rules it doesn't matter at all what the result is..
> >
> > Do you mean we should drop the PFNMAP|IO check?
>
> Yeah
>
> > I didn't see all the
> > callers to say that they won't rely on proper failing of !PFNMAP&&!IO vmas
> > to work alright. So I assume we should definitely keep them around.
>
> But as before, if we care about this we should be using vm_normal_page
> as that is sort of abusing the PFNMAP flags.

I can't say it's abusing.. Taking access_remote_vm() as an example again, it
can go back as far as 2008 with Rik's commit here:

        commit 28b2ee20c7cba812b6f2ccf6d722cf86d00a84dc
        Author: Rik van Riel
        Date:   Wed Jul 23 21:27:05 2008 -0700

        access_process_vm device memory infrastructure

So it starts with having GUP fail pfnmaps first for remote vm access, as
we also do right now with check_vma_flags(), then this whole walker is
a remedy for that.

It isn't used at all for normal VMAs, unless it's a private pfnmap mapping,
which should be extremely rare, or if it's IO+!PFNMAP, which is a world I
am not familiar with..

In all cases, I hope we can still leave this alone in the huge pfnmap
effort, as they do not yet seem closely relevant. From that POV, this
patch is as simple as "teach follow_pte() to know huge mappings"; it's
just that we can't simply modify the old interface in place, because it
won't work if we stick with pte_t. Most of the rest was inherited from
follow_pte(); there are still some trivial changes elsewhere, but here on
the vma flag check we stick with the old behavior.

>
> > >     Any physical address obtained through this API is only valid while
> > >     the @follow_pfnmap_args. Continuing to use the address after end(),
> > >     without some other means to synchronize with page table updates
> > >     will create a security bug.
> >
> > Some misuse on wordings here (e.g. we don't return PA but PFN), and some
> > sentence doesn't seem to be complete.. but I think I get the "scary" part
> > of it. How about this, appending the scary part to the end?
> >
> >  * During the start() and end() calls, the results in @args will be valid
> >  * as proper locks will be held. After the end() is called, all the fields
> >  * in @follow_pfnmap_args will be invalid to be further accessed. Further
> >  * use of such information after end() may require proper synchronizations
> >  * by the caller with page table updates, otherwise it can create a
> >  * security bug.
>
> I would specifically emphasis that the pfn may not be used after
> end. That is the primary mistake people have made.
>
> They think it is a PFN so it is safe.

I understand your concern. It's just that it still seems legal to me to use
it as long as proper action is taken.

I hope "require proper synchronizations" would be the best way to phrase
this matter, but maybe you have an even better way to put this, which
I'm definitely open to.

>
> > It sounds like we need some mmu notifiers when mapping the IOMMU pgtables,
> > as long as there's MMIO-region / P2P involved. It'll make sure when
> > tearing down the BAR mappings, the devices will at least see the same view
> > as the processors.
>
> I think the mmu notifiers can trigger too often for this to be
> practical for DMA :(

I guess the DMAs are fine, as normally the notifier will be a no-op, as long
as the BAR enable/disable happens rarely. But yeah, I see your point, and
that is probably a concern if those notifiers need to be kicked off and
walk a bunch of MMIO regions, even if in 99% of the cases they'll do nothing.

Thanks,

-- 
Peter Xu
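
To make the start()/end() lifetime rule discussed above concrete, here is a minimal userspace sketch in C. It is only a model: every name in it (model_pgtable, model_pfnmap_args, model_pfnmap_start(), model_pfnmap_end(), model_remap()) is hypothetical and is not the kernel API from the patch, and a plain boolean stands in for the real page table lock. It shows a PFN that is stable while the "lock" is held and goes stale once end() has been called.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one page table entry guarded by a lock. */
struct model_pgtable {
	bool locked;		/* stands in for the page table lock */
	bool present;		/* whether a mapping exists at all */
	unsigned long pfn;	/* the PFN currently mapped */
};

/* Hypothetical model of the args structure discussed in the thread. */
struct model_pfnmap_args {
	struct model_pgtable *pt;	/* input */
	unsigned long pfn;		/* output: meaningful only between start/end */
};

/* Take the "lock" and fill in the result; fails if nothing is mapped. */
static int model_pfnmap_start(struct model_pfnmap_args *args)
{
	struct model_pgtable *pt = args->pt;

	assert(!pt->locked);
	if (!pt->present)
		return -1;
	pt->locked = true;
	args->pfn = pt->pfn;
	return 0;
}

/* Drop the "lock"; from here on args->pfn is no longer protected. */
static void model_pfnmap_end(struct model_pfnmap_args *args)
{
	assert(args->pt->locked);
	args->pt->locked = false;
}

/* A page table update; in this model it is refused while the lock is held. */
static bool model_remap(struct model_pgtable *pt, unsigned long new_pfn)
{
	if (pt->locked)
		return false;
	pt->pfn = new_pfn;
	return true;
}

/* Returns true if a PFN saved across end() no longer matches the mapping. */
static bool model_saved_pfn_goes_stale(void)
{
	struct model_pgtable pt = { .present = true, .pfn = 0x1000 };
	struct model_pfnmap_args args = { .pt = &pt };
	unsigned long saved;

	if (model_pfnmap_start(&args))
		return false;

	/* Between start() and end() the update is blocked, so args.pfn is stable. */
	bool blocked = !model_remap(&pt, 0x2000);
	saved = args.pfn;
	model_pfnmap_end(&args);

	/* After end() nothing prevents the update, and the saved copy goes stale. */
	bool updated = model_remap(&pt, 0x2000);

	return blocked && updated && saved != pt.pfn;
}
```

In this model, the caller that kept `saved` past end() is the one that would need its own synchronization with page table updates, which is the case the proposed comment wording is trying to warn about.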