All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Jann Horn <jannh@google.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Mike Rapoport <rppt@kernel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Matthew Wilcox <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	Yang Shi <shy828301@gmail.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Peter Zijlstra <peterz@infradead.org>,
	Will Deacon <will@kernel.org>, Yu Zhao <yuzhao@google.com>,
	Alistair Popple <apopple@nvidia.com>,
	Ralph Campbell <rcampbell@nvidia.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Steven Price <steven.price@arm.com>,
	SeongJae Park <sj@kernel.org>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	Huang Ying <ying.huang@intel.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Zack Rusin <zackr@vmware.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Minchan Kim <minchan@kernel.org>,
	Christoph Hellwig <hch@infradead.org>, Song Liu <song@kernel.org>,
	Thomas Hellstrom <thomas.hellstrom@linux.intel.com>,
	Russell King <linux@armlinux.org.uk>,
	"David S. Miller" <davem@davemloft.net>,
	Michael Ellerman <mpe@ellerman.id.au>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Vishal Moola <vishal.moola@gmail.com>,
	Vlastimil Babka <vbabka@suse.cz>, Zi Yan <ziy@nvidia.com>,
	Zach O'Keefe <zokeefe@google.com>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	sparclinux@vger.kernel.org,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	linux-s390 <linux-s390@vger.kernel.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd
Date: Mon, 21 Aug 2023 17:26:41 -0400	[thread overview]
Message-ID: <ZOPWkS7joexET6q8@x1n> (raw)
In-Reply-To: <4d31abf5-56c0-9f3d-d12f-c9317936691@google.com>

On Mon, Aug 21, 2023 at 12:51:20PM -0700, Hugh Dickins wrote:
> Jann Horn demonstrated how userfaultfd ioctl UFFDIO_COPY into a private
> shmem mapping can add valid PTEs to page table collapse_pte_mapped_thp()
> thought it had emptied: page lock on the huge page is enough to protect
> against WP faults (which find the PTE has been cleared), but not enough
> to protect against userfaultfd.  "BUG: Bad rss-counter state" followed.
> 
> retract_page_tables() protects against this by checking !vma->anon_vma;
> but we know that MADV_COLLAPSE needs to be able to work on private shmem
> mappings, even those with an anon_vma prepared for another part of the
> mapping; and we know that MADV_COLLAPSE needs to work on shared shmem
> mappings which are userfaultfd_armed().  Whether it needs to work on
> private shmem mappings which are userfaultfd_armed(), I'm not so sure:
> but assume that it does.
> 
> Just for this case, take the pmd_lock() two steps earlier: not because
> it gives any protection against this case itself, but because ptlock
> nests inside it, and it's the dropping of ptlock which let the bug in.
> In other cases, continue to minimize the pmd_lock() hold time.
> 
> Reported-by: Jann Horn <jannh@google.com>
> Closes: https://lore.kernel.org/linux-mm/CAG48ez0FxiRC4d3VTu_a9h=rg5FW-kYD5Rg5xo_RDBM0LTTqZQ@mail.gmail.com/
> Fixes: 1043173eb5eb ("mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()")
> Signed-off-by: Hugh Dickins <hughd@google.com>

The locking is indeed slightly complicated.. but I didn't spot anything
wrong.

Acked-by: Peter Xu <peterx@redhat.com>

Thanks,

-- 
Peter Xu


WARNING: multiple messages have this Message-ID (diff)
From: Peter Xu <peterx@redhat.com>
To: Hugh Dickins <hughd@google.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	David Hildenbrand <david@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Yang Shi <shy828301@gmail.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	kernel list <linux-kernel@vger.kernel.org>,
	Song Liu <song@kernel.org>,
	sparclinux@vger.kernel.org,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	Will Deacon <will@kernel.org>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Yu Zhao <yuzhao@google.com>, Ira Weiny <ira.weiny@intel.com>,
	Alistair Popple <apopple@nvidia.com>,
	Russell King <linux@armlinux.org.uk>,
	Matthew Wilcox <willy@infradead.org>,
	Steven Price <steven.price@arm.com>,
	Christoph Hellwig <hch@infradead.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Zi Yan <ziy@nvidia.com>, Huang Ying <ying.huang@intel.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Christian Borntraege r <borntraeger@linux.ibm.com>,
	Thomas Hellstrom <thomas.hellstrom@linux.intel.com>,
	Ralph Campbell <rcampbell@nvidia.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Vasily Gorbik <gor@linux.ibm.com>, Jann Horn <jannh@google.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	SeongJae Park <sj@kernel.org>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	Linux-MM <linux-mm@kvack.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Zack Rusin <zackr@vmware.com>, Zach O'Keefe <zokeefe@google.com>,
	Vishal Moola <vishal.moola@gmail.com>,
	Minchan Kim <minchan@kernel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	"David S. Miller" <davem@davemloft.net>,
	Mike Rapoport <rppt@kernel.org>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd
Date: Mon, 21 Aug 2023 17:26:41 -0400	[thread overview]
Message-ID: <ZOPWkS7joexET6q8@x1n> (raw)
In-Reply-To: <4d31abf5-56c0-9f3d-d12f-c9317936691@google.com>

On Mon, Aug 21, 2023 at 12:51:20PM -0700, Hugh Dickins wrote:
> Jann Horn demonstrated how userfaultfd ioctl UFFDIO_COPY into a private
> shmem mapping can add valid PTEs to page table collapse_pte_mapped_thp()
> thought it had emptied: page lock on the huge page is enough to protect
> against WP faults (which find the PTE has been cleared), but not enough
> to protect against userfaultfd.  "BUG: Bad rss-counter state" followed.
> 
> retract_page_tables() protects against this by checking !vma->anon_vma;
> but we know that MADV_COLLAPSE needs to be able to work on private shmem
> mappings, even those with an anon_vma prepared for another part of the
> mapping; and we know that MADV_COLLAPSE needs to work on shared shmem
> mappings which are userfaultfd_armed().  Whether it needs to work on
> private shmem mappings which are userfaultfd_armed(), I'm not so sure:
> but assume that it does.
> 
> Just for this case, take the pmd_lock() two steps earlier: not because
> it gives any protection against this case itself, but because ptlock
> nests inside it, and it's the dropping of ptlock which let the bug in.
> In other cases, continue to minimize the pmd_lock() hold time.
> 
> Reported-by: Jann Horn <jannh@google.com>
> Closes: https://lore.kernel.org/linux-mm/CAG48ez0FxiRC4d3VTu_a9h=rg5FW-kYD5Rg5xo_RDBM0LTTqZQ@mail.gmail.com/
> Fixes: 1043173eb5eb ("mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()")
> Signed-off-by: Hugh Dickins <hughd@google.com>

The locking is indeed slightly complicated.. but I didn't spot anything
wrong.

Acked-by: Peter Xu <peterx@redhat.com>

Thanks,

-- 
Peter Xu


  reply	other threads:[~2023-08-21 21:27 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-21 19:51 [PATCH mm-unstable] mm/khugepaged: fix collapse_pte_mapped_thp() versus uffd Hugh Dickins
2023-08-21 19:51 ` Hugh Dickins
2023-08-21 21:26 ` Peter Xu [this message]
2023-08-21 21:26   ` Peter Xu
2023-08-21 21:59 ` Jann Horn
2023-08-21 21:59   ` Jann Horn
2023-08-22  2:51   ` Hugh Dickins
2023-08-22  2:51     ` Hugh Dickins
2023-08-22 14:31     ` Peter Xu
2023-08-22 14:31       ` Peter Xu
2023-08-22 18:34       ` Hugh Dickins
2023-08-22 18:34         ` Hugh Dickins
2023-08-22 18:45         ` Matthew Wilcox
2023-08-22 18:45           ` Matthew Wilcox
2023-08-22 19:10           ` Hugh Dickins
2023-08-22 19:10             ` Hugh Dickins
2023-08-22 14:39     ` Jann Horn
2023-08-22 14:39       ` Jann Horn
2023-08-22 15:22       ` Matthew Wilcox
2023-08-22 15:22         ` Matthew Wilcox
2023-08-22 15:30         ` Jann Horn
2023-08-22 15:30           ` Jann Horn
2023-08-22 15:33           ` David Hildenbrand
2023-08-22 15:33             ` David Hildenbrand
2023-08-22 15:28       ` David Hildenbrand
2023-08-22 15:28         ` David Hildenbrand
2023-08-22 18:54       ` Hugh Dickins
2023-08-22 18:54         ` Hugh Dickins
2023-08-22 19:07         ` Jann Horn
2023-08-22 19:07           ` Jann Horn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZOPWkS7joexET6q8@x1n \
    --to=peterx@redhat.com \
    --cc=agordeev@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=anshuman.khandual@arm.com \
    --cc=apopple@nvidia.com \
    --cc=axelrasmussen@google.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=davem@davemloft.net \
    --cc=david@redhat.com \
    --cc=gerald.schaefer@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=hch@infradead.org \
    --cc=hughd@google.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=ira.weiny@intel.com \
    --cc=jannh@google.com \
    --cc=jgg@ziepe.ca \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lstoakes@gmail.com \
    --cc=mgorman@techsingularity.net \
    --cc=mike.kravetz@oracle.com \
    --cc=minchan@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=naoya.horiguchi@nec.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=peterz@infradead.org \
    --cc=rcampbell@nvidia.com \
    --cc=rppt@kernel.org \
    --cc=shy828301@gmail.com \
    --cc=sj@kernel.org \
    --cc=song@kernel.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=steven.price@arm.com \
    --cc=surenb@google.com \
    --cc=thomas.hellstrom@linux.intel.com \
    --cc=vbabka@suse.cz \
    --cc=vishal.moola@gmail.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    --cc=yuzhao@google.com \
    --cc=zackr@vmware.com \
    --cc=zhengqi.arch@bytedance.com \
    --cc=ziy@nvidia.com \
    --cc=zokeefe@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.