From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9773CC433EF for ; Fri, 17 Dec 2021 22:30:11 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 040346B0072; Fri, 17 Dec 2021 17:30:01 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id F31B86B0073; Fri, 17 Dec 2021 17:30:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DAA9A6B0074; Fri, 17 Dec 2021 17:30:00 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0059.hostedemail.com [216.40.44.59]) by kanga.kvack.org (Postfix) with ESMTP id CB1B96B0072 for ; Fri, 17 Dec 2021 17:30:00 -0500 (EST) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 794D98249980 for ; Fri, 17 Dec 2021 22:29:50 +0000 (UTC) X-FDA: 78928729740.05.0907CC7 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf13.hostedemail.com (Postfix) with ESMTP id 0B49520031 for ; Fri, 17 Dec 2021 22:29:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1639780189; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qL/JIX8EEyFG7lDdEcEOcgCQUQYx1MZeqTzUDIdfXMM=; b=fy2uuB7xqaT1HBwqlcC9asm3mhqHl/caHSHCP+Q7H+pGE/xLwh6DNq42ZoS80yb2ukuodZ 2jqnlVaQZzX3+OHaYlYJhEMqmINPr2CMnxMoHkxBiJtPzS2mCWXkN9wfc3FR/3wgIzc7et XequAnpndNnRCUb9hMmFyBUjJMmVn4I= Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-657-pGeFM_OyPv6kzmtspPR6wg-1; Fri, 17 Dec 2021 17:29:48 -0500 X-MC-Unique: pGeFM_OyPv6kzmtspPR6wg-1 Received: by mail-wm1-f70.google.com with SMTP id j71-20020a1c234a000000b00342f418ae7cso3591573wmj.1 for ; Fri, 17 Dec 2021 14:29:47 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent :content-language:to:cc:references:from:organization:subject :in-reply-to:content-transfer-encoding; bh=qL/JIX8EEyFG7lDdEcEOcgCQUQYx1MZeqTzUDIdfXMM=; b=1zqedy0sCS3ofFyTlU7zwWn2kAJzE3/1lqa1XGRvVGw9BrsEX/HlOZsH6cHHKhI222 q0AC6+pV06aV0oyW9eqkkWlJEdAKZw7b44huad6Bve0DMkOY3owvw2XTVdhkrxgxqkhH wHKKieB5W2ghjhQBBefNQ54SBQb6nc/Pcf9Dl/2SVJPzz2RJYIL9UwdT8wtJnl1+SBHT KxMuHHt5IUZZ5nXCZcjz5NOU08O2xWFOVw3VFLJznpghEeiS/LXYn1Vek46DqEkJ+eaD gAPFvcDBe329t9Oc7PXt7JIbfs9WcTZVDms07qfHTDQ2YaiBc9/ugg5+OcH9Jzwbob1V K92w== X-Gm-Message-State: AOAM531887hiMENmWzl8HrTKhnVZTh/bALzMKSam8CMzdN38sgwyKIh5 LSfAKpzBkbAvJYOOqrO6UN5PI+u4jiEiL5gwlt9zDwKQdeRb1nZle88Fx+t6+9HHSCXNdrXnl4g Balg1HF3gg7s= X-Received: by 2002:adf:f6c1:: with SMTP id y1mr2672039wrp.351.1639780186950; Fri, 17 Dec 2021 14:29:46 -0800 (PST) X-Google-Smtp-Source: ABdhPJyY9P+G15WX8X9gRjoZbZF1GI9X6VoJGWLTrTMCgzZH7nR7syAHkYWmUCmVqISzyIQWE2QcUA== X-Received: by 2002:adf:f6c1:: with SMTP id y1mr2672021wrp.351.1639780186681; Fri, 17 Dec 2021 14:29:46 -0800 (PST) Received: from [192.168.3.132] (p4ff234b8.dip0.t-ipconnect.de. [79.242.52.184]) by smtp.gmail.com with ESMTPSA id e4sm7474338wrh.56.2021.12.17.14.29.45 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 17 Dec 2021 14:29:46 -0800 (PST) Message-ID: <0aa27d7d-0db6-94ee-ca16-91d19997286b@redhat.com> Date: Fri, 17 Dec 2021 23:29:44 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 To: Linus Torvalds Cc: Linux Kernel Mailing List , Andrew Morton , Hugh Dickins , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , Yang Shi , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Linux-MM , "open list:KERNEL SELFTEST FRAMEWORK" , "open list:DOCUMENTATION" References: <20211217113049.23850-1-david@redhat.com> <20211217113049.23850-7-david@redhat.com> <9c3ba92e-9e36-75a9-9572-a08694048c1d@redhat.com> <02cf4dcf-74e8-9cbd-ffbf-8888f18a9e8a@redhat.com> From: David Hildenbrand Organization: Red Hat Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb) In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Stat-Signature: m7qdpxz7ogm9gc6u4scpta9yun1expwt X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 0B49520031 Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=fy2uuB7x; spf=none (imf13.hostedemail.com: domain of david@redhat.com has no SPF policy when checking 170.10.129.124) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-HE-Tag: 1639780183-4351 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 17.12.21 22:50, Linus Torvalds wrote: > On Fri, Dec 17, 2021 at 1:47 PM David Hildenbrand wrote: >> >> For now I have not heard a compelling argument why the mapcount is >> dubious, I repeat: >> >> * mapcount can only increase due to fork() >> * mapcount can decrease due to unmap / zap >> >> We can protect from the transtition == 1 -> >1 using the mmap_lock. >> >> For COW the mapcount is the only thing that matters *if we take GUP* out >> of the equation. And that's exactly what we > > What do you have against just doing what we already do in other parts, > that a/b thing? Let me put it that way: I just want to get all of this fixed for good. I don't particularly care how. *But* I will fight for something that is superior, logically makes sense (at least to me :) ) and not super complicated. And I call also avoiding unnecessary COW "superior". I do know that what this series proposes fixes the CVE: GUP after fork. I do know that the part 2 we'll be sending out next year will fix everything else we discovered so far, and it will rely on this as a basis, to not reintroduce any other COW issues we've seen so far. If someone can propose something comparable that makes all discovered problems go away I'll be *extremely* happy. We have reproducers for all issues, so it's easy to verify, and I'm planning on extending the selftests to cover even more corner cases. So far, I am not convinced that using the mapcount is dubious or problematic, I just don't see how. COW is an about sharing pages between processes, each expressed in the mapcount. It's a pure optimization for exactly that purpose. GUP is the problem, not COW, not the mapcount. To me the mapcount is the only thing that makes sense in COW+unsharing logic, and GUP has to be taught to identify it and resolve it -> unshare when it detects a shared anaonymous page. > > Which avoids the whole mmap_sem issue. That was a big issue for the > rdma people, afaik. While I do care about future use cases, I cannot possibly see fork() not requiring the mmap_lock in the foreseeable future. Just so much depends on it as of now. And after all, fixing everything what we discovered so far is more important to me than something like that for the future. We have other problems to solve in that regard. ---------------------------------------------------------------------- I didn't want to talk about hugetlb here but I will just because it's a good example why using the refocunt is just wrong -- because unnecessary COW are just absolutely problematic. Assume you have a R/O mapped huge page that is only mapped into your process and you get a write fault. What should your COW logic do? a) Rely on the mapcount? Yes, iff GUP has been taught to unshare properly, because then it expresses exactly what we want to know. mapcount == 1 -> reuse. mapcount > 1 -> COW. b) Rely on the refocunt? If we have a speculative refrence on the page we would COW. As huge pages are a scarce resource we can easily just not have a free huge page anymore and crash the application. The app didn't do anything wrong. So teaching the hugetlb COW code to rely on the refount would just be highly fragile. -- Thanks, David / dhildenb