Date: Fri, 21 Jan 2022 09:22:02 +0100
To: Peter Zijlstra
Cc: mingo@redhat.com, tglx@linutronix.de, juri.lelli@redhat.com,
 vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org,
 bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-api@vger.kernel.org,
 x86@kernel.org, pjt@google.com, posk@google.com, avagin@google.com,
 jannh@google.com, tdelisle@uwaterloo.ca, mark.rutland@arm.com, posk@posk.io
References: <20220120155517.066795336@infradead.org>
 <20220120160822.666778608@infradead.org>
 <20220121075157.GA20638@worktop.programming.kicks-ass.net>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [RFC][PATCH v2 1/5] mm: Avoid unmapping pinned pages
In-Reply-To: <20220121075157.GA20638@worktop.programming.kicks-ass.net>

On 21.01.22 08:51, Peter Zijlstra wrote:
> On Thu, Jan 20, 2022 at 07:25:08PM +0100, David
> Hildenbrand wrote:
>> On 20.01.22 16:55, Peter Zijlstra wrote:
>>> Add a guarantee for Anon pages that pin_user_page*() ensures the
>>> user-mapping of these pages stay preserved. In order to ensure this
>>> all rmap users have been audited:
>>>
>>>  vmscan:       already fails eviction due to page_maybe_dma_pinned()
>>>
>>>  migrate:      migration will fail on pinned pages due to
>>>                expected_page_refs() not matching, however that is
>>>                *after* try_to_migrate() has already destroyed the
>>>                user mapping of these pages. Add an early exit for
>>>                this case.
>>>
>>>  numa-balance: as per the above, pinned pages cannot be migrated,
>>>                however numa balancing scanning will happily PROT_NONE
>>>                them to get usage information on these pages. Avoid
>>>                this for pinned pages.
>>
>> page_maybe_dma_pinned() can race with GUP-fast without
>> mm->write_protect_seq. This is a real problem for vmscan() with
>> concurrent GUP-fast as it can result in R/O mappings of pinned pages and
>> GUP will lose synchronicity to the page table on write faults due to
>> wrong COW.
>
> Urgh, so yeah, that might be a problem. Follow up code uses it like
> this:
>
> +/*
> + * Pinning a page inhibits rmap based unmap for Anon pages. Doing a load
> + * through the user mapping ensures the user mapping exists.
> + */
> +#define umcg_pin_and_load(_self, _pagep, _member)                          \
> +({                                                                         \
> +    __label__ __out;                                                       \
> +    int __ret = -EFAULT;                                                   \
> +                                                                           \
> +    if (pin_user_pages_fast((unsigned long)(_self), 1, 0, &(_pagep)) != 1) \
> +        goto __out;                                                        \
> +                                                                           \
> +    if (!PageAnon(_pagep) ||                                               \
> +        get_user(_member, &(_self)->_member)) {                            \
> +        unpin_user_page(_pagep);                                           \
> +        goto __out;                                                        \
> +    }                                                                      \
> +    __ret = 0;                                                             \
> +__out: __ret;                                                              \
> +})
>
> And after that hard assumes (on the penalty of SIGKILL) that direct user
> access works. Specifically it does RmW ops on it. So I suppose I'd
> better upgrade that load to a RmW at the very least.
>
> But is that sufficient? Let me go find that race you mention...
>

It's described in [1] under point 3.
After we put the page into the swapcache, it's still mapped into the
page tables, where GUP can find it. Only after that do we try to unmap
the page (placing swap entries). So it's racy.

Note also point 2 in [1], which concerns O_DIRECT: it does not yet use
FOLL_PIN and still uses FOLL_GET.

[1] https://lore.kernel.org/r/3ae33b08-d9ef-f846-56fb-645e3b9b4c66@redhat.com

-- 
Thanks,

David / dhildenb
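
To make the ordering problem above concrete, here is a minimal userspace
sketch of the interleaving. It is a toy model and not kernel code: every
toy_* name is hypothetical, TOY_PIN_BIAS is only assumed to play the role of
the kernel's pin-counting bias, and the model deliberately leaves out
whatever re-checks the real reclaim path performs after unmapping. It only
shows that a "maybe pinned" check made before the unmap can be invalidated
by a pin taken in between.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: names mirror kernel concepts, nothing here is kernel code. */
#define TOY_PIN_BIAS 1024 /* assumption: stands in for the pin-counting bias */

struct toy_page {
    int refcount;      /* ordinary references plus TOY_PIN_BIAS per pin */
    bool mapped;       /* true while a present PTE maps the page */
    bool in_swapcache; /* true once reclaim added it to the swapcache */
};

/* Heuristic pin check in the spirit of page_maybe_dma_pinned(). */
static bool toy_maybe_pinned(const struct toy_page *p)
{
    return p->refcount >= TOY_PIN_BIAS;
}

/* GUP-fast stand-in: it can pin the page for as long as it is mapped. */
static bool toy_pin_fast(struct toy_page *p)
{
    if (!p->mapped)
        return false;
    p->refcount += TOY_PIN_BIAS;
    return true;
}

/* Reclaim, step 1: check the pin state, then add to the swapcache.
 * The page stays mapped, so toy_pin_fast() can still find it. */
static bool toy_reclaim_check(struct toy_page *p)
{
    if (toy_maybe_pinned(p))
        return false;
    p->in_swapcache = true;
    return true;
}

/* Reclaim, step 2: unmap (place a swap entry) without re-checking pins. */
static void toy_reclaim_unmap(struct toy_page *p)
{
    assert(p->in_swapcache); /* in this model, unmap only follows step 1 */
    p->mapped = false;
}

/* Racy order: check, pin, unmap. True if the page ends pinned but unmapped. */
static bool toy_race_leaves_pinned_page_unmapped(void)
{
    struct toy_page page = { .refcount = 1, .mapped = true };

    if (!toy_reclaim_check(&page))
        return false;
    if (!toy_pin_fast(&page)) /* the pin lands between the two reclaim steps */
        return false;
    toy_reclaim_unmap(&page);
    return toy_maybe_pinned(&page) && !page.mapped;
}

/* Benign order: pin first, so the check refuses and the mapping survives. */
static bool toy_pin_before_reclaim_keeps_mapping(void)
{
    struct toy_page page = { .refcount = 1, .mapped = true };

    if (!toy_pin_fast(&page))
        return false;
    if (toy_reclaim_check(&page))
        toy_reclaim_unmap(&page);
    return toy_maybe_pinned(&page) && page.mapped;
}
```

Both helper functions return true in this model: the first because the pin
is taken after the check but before the unmap, the second because the check
already sees the pin. Closing the window would need the check and the unmap
to be serialized against GUP-fast, which the toy model does not attempt.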