From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0613EB64DA for ; Wed, 12 Jul 2023 15:01:54 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 300C56B0071; Wed, 12 Jul 2023 11:01:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 28A576B0074; Wed, 12 Jul 2023 11:01:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 12B1B6B0075; Wed, 12 Jul 2023 11:01:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 009136B0071 for ; Wed, 12 Jul 2023 11:01:53 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id BFF971A029F for ; Wed, 12 Jul 2023 15:01:50 +0000 (UTC) X-FDA: 81003274380.30.11671F3 Received: from mail-ot1-f50.google.com (mail-ot1-f50.google.com [209.85.210.50]) by imf13.hostedemail.com (Postfix) with ESMTP id 522F52007F for ; Wed, 12 Jul 2023 15:01:33 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b="BIn/rC1q"; spf=pass (imf13.hostedemail.com: domain of surenb@google.com designates 209.85.210.50 as permitted sender) smtp.mailfrom=surenb@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1689174095; a=rsa-sha256; cv=none; b=WffxS7Jf2IG5ejsN65uy70cDXTGsWr4kn54mvEj+DHi9FXkVIi/w6xLSxNxG7DVZVDuRtW mlB2rE/u9ciWD3qBbuP7+LAZteNlTqX37r3ClFHeNjXZksO7zuUCGegV1yW85OBv2cUCly ZyO0fjDeM5fofCUeFclAr/c/22DVlWc= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b="BIn/rC1q"; spf=pass (imf13.hostedemail.com: domain of surenb@google.com designates 209.85.210.50 as permitted sender) smtp.mailfrom=surenb@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1689174095; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=8TZfoI5p1MLH4yINGCkQiJulTygcvx8yQ1U7n1TSAqo=; b=niwvkAVJ3J7z0gMGsDb0l6xz1KHlPrrClgCWmPwaNq783qf6yqY+rxRpXkhaJ/8H5z5crn Ng7i4yTQ+tls8r6LDEqQF1aN/fi75fkNBcbwwoFxUqcH/M0Y49mDkXLwFvighyqEbBxJwZ 3kVaHp9zzDaSM09/Wj614atkUMjNuuk= Received: by mail-ot1-f50.google.com with SMTP id 46e09a7af769-6b74faaac3bso5635404a34.1 for ; Wed, 12 Jul 2023 08:01:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1689174092; x=1691766092; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=8TZfoI5p1MLH4yINGCkQiJulTygcvx8yQ1U7n1TSAqo=; b=BIn/rC1qRuTEaAvR0TMRIEVDuCCgsKAZux3zf6xSYAtamrkRZN0ulCQtdC+Q7ZNOCh 7Y6PDvdxqTESSRpSZ5XsmG1u8Xl/n7htX3wb8uiwzKmLmCoZGgE6kSbuknEaGvWNiMbn ra1oXMnpXBgq/VvIbwH7km0UDx4NAN7/lQhjcJB3G/sNqHHQsXby5QNcW3aDr8E+0MMG MdCUA+0LZ5XvZ633kKO4EAadymB8zEispG+UqWHmva9J0CzGAa1b95oC212VyWBQke4p QjoHxn4gN+wBjUD4/b42+COGnPESk7X08Yp2CySS7dzzV42y8s0FgD7ldQYJ0O+0X4l6 enJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689174092; x=1691766092; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8TZfoI5p1MLH4yINGCkQiJulTygcvx8yQ1U7n1TSAqo=; b=To/zYX8v+58Y44nytutE0MNrQ5x3m56BUY+a2mIVx5zWFpf6U69AcUZxSUQTfOOn4V ZubvZ0S5I5g9O6lFW/LQ6QatVL7TqBcVZuVSNy95dAowNJ9X+ReVhyOiCMitWSr3afVD Gzx7H7Ks7V/ZoFS1hbehq2ovjIUNenWubhzgtaW9E1SrSU5k9VoZDmasKBqYq2Epam76 1rqeAhLUhBTHxNKTY9qCc5gCooBN9ZXv3hGwyQjOZrtOACfE9vkLhMAn+tcErKQDdCz2 X3EtTQVxzDwGHjBEDdv7HUELcQ47bFILKXigBHbJLURh01Jo5DIJC9VloVZY5VkXG8aN ogEg== X-Gm-Message-State: ABy/qLaYlCBYVROVF7Ywzk4hUKSM5y4m7pjVWkGNeKPgg/+ANYElml4f kFGLFwjXFR3sGxcq+8jyQuGeTDHjb6nSR8B4ZHbB2A== X-Google-Smtp-Source: APBJJlHxXT3/eh8Uh67AjENPMVlYBJ7XUDrwiHsgKCBJlEGtC8b9JZcJDVfKlkvTgY2BVdiOEGi2uFi6l4E+Mzy9bvY= X-Received: by 2002:a9d:6d0f:0:b0:6b8:6785:ed0b with SMTP id o15-20020a9d6d0f000000b006b86785ed0bmr17945415otp.30.1689174092271; Wed, 12 Jul 2023 08:01:32 -0700 (PDT) MIME-Version: 1.0 References: <9704a138-60e6-4ede-91f0-844e1df2ad84@moroto.mountain> In-Reply-To: From: Suren Baghdasaryan Date: Wed, 12 Jul 2023 08:01:18 -0700 Message-ID: Subject: Re: [bug report] mm: replace vma->vm_flags direct modifications with modifier calls To: David Hildenbrand Cc: Matthew Wilcox , Dan Carpenter , linux-mm@kvack.org, Andrew Morton , "Liam R. Howlett" , Laurent Dufour , Michel Lespinasse , Jerome Glisse , Michal Hocko , Vlastimil Babka , Johannes Weiner , Peter Xu , Dimitri Sivanich , Mike Travis , Steve Wahl Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 522F52007F X-Stat-Signature: 5wn445a5cg1gdum4rpe8uyqt9bjbfz3p X-Rspam-User: X-HE-Tag: 1689174093-846366 X-HE-Meta: U2FsdGVkX1/nkM1k2rqTJyquo00Faj01vE+Vb9LXmMcfQkf2UoMQHjc3wexOE99PZ6hWCJu9PAKQE5VmlHJBPAqJaf9vmJ0CvBJTMY3W8yl+uSIdbKOYiFxCsbo67Dy/FK4oDRcOtQNmiIFkyaSAmoH0C5d2VOrVYXpCsU1hWjrB1jUQvqtvrVIfxAZJ3McdI4sKaYO6mphRw0nwPb5tadbXmOm2EJkUX97TYk6UpLotJJcBDdp+I+ZziOpt4hfU6GiFo8Buddvov1NW4v4ghPMoBovYCyC3Te/LeeHiG3Fzqbe4l48TeHvE1lHb+p+qZsnG6SBCJ8sn82PKGKHCkuA1WMuHmYJ9zF23YT7wIetZnNaE+S7bQ7IoZ6UrsUN9b0AmF6GC525B50wt9QjKpTskfzGyyaSuhAmhWP0xj0+l0hTvCD5YscCGzTw5NxKRyLRbG582pDXceX3B8fkRm3w9SSPgUs8y9dJXiedDdH18UoBcTJZZQeZXl7j62mZ93n8mrWt8Qyavm7MtCSorRed5MfDpQLHiLwpglXpsm8Ue2Ul6TDrtnG7kRA17rBsXxzLyQ5AGtMXSGnpC696kjqp6wCdogbjcyIkUDGICTG/blQwl1zrfTyFgCGZ/gKtqTVW+nM8McZa+J59zZzQ+ourGKwUQn52hoHN0XXn5OnTdCfso6Q2Ib1hO4ezjjnEwoa/J3P1ZFL9GN0GWGfLzbWjIApejoud5y3PNFZp09JJ8vaXB+k24NCeJpOz2Vn3MO/o8VYFBHGiVhtFE2HJhNIkHfA7k8Ll7Y71X2ebneEOctJzwTY4e0LAbBwZS2fo/z8VGrz/MAtPxPYNISKsd7Kxa8jYBZ+QU0g9J3Ppy2it3H5Xcz9kFeVyBEugmqQLyJZr8y42CcBz7o/xtL2e8ZtW+dA8cQ70hRYMnK/l5TQVFYalIuJjUn+8Yuu3IcoTV5L4R7ydveJSxFNRIXOs Oz9bJymj d44VF14VNqjMM6tZeLZBSWhqJM8PnIblwO9z2tRvoNM7k2WV5rFiUirVLugRR31O4SpSYQXWffV9Op/6DKAWQoDdqCPKFobbfLONzJCCTJJDOJO5WAZGmySE6DLfeWxLansmt2AVqxTnJN0m9Nqc5mKNp2ZFGK6xe3Hlh3S9uPQrGsRzRM3o/0O/2XpNcILJ2rrG8znVZhGbL5oKYyY9fJPbEbBiGNhJ6tjTVxd3j27CRuIDI4123OmZKIqdOLFhyICwBBzbxe5LvU9MdAkf6vo2QPtdLH4xkwOfVMDmhFZFWvW3zH0014cxK3Wg65fLhAS0ExZVqkC/50Zm51Vni6MLASDC8nhY6swwaKX6RVEmWqOc+XVpDd7MY3J4UST2QyuW3aj0LchCgCXXuF4Zg9QPck1ZoY+xh6fkI X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Jul 12, 2023 at 12:35=E2=80=AFAM David Hildenbrand wrote: > > On 12.07.23 01:45, Suren Baghdasaryan wrote: > > On Tue, Jul 11, 2023 at 3:21=E2=80=AFPM Matthew Wilcox wrote: > >> > >> On Tue, Jul 11, 2023 at 02:55:20PM -0700, Suren Baghdasaryan wrote: > >>> On Tue, Jul 11, 2023 at 12:21=E2=80=AFAM Dan Carpenter wrote: > >>>> > >>>> Hello Suren Baghdasaryan, > >>>> > >>>> The patch 1c71222e5f23: "mm: replace vma->vm_flags direct > >>>> modifications with modifier calls" from Jan 26, 2023, leads to the > >>>> following Smatch static checker warning: > >>>> > >>>> ./include/linux/mm.h:729 vma_start_write() > >>>> warn: sleeping in atomic context > >>>> > >>>> include/linux/mm.h > >>>> 722 static inline void vma_start_write(struct vm_area_struct *v= ma) > >>>> 723 { > >>>> 724 int mm_lock_seq; > >>>> 725 > >>>> 726 if (__is_vma_write_locked(vma, &mm_lock_seq)) > >>>> 727 return; > >>>> 728 > >>>> --> 729 down_write(&vma->vm_lock->lock); > >>>> 730 vma->vm_lock_seq =3D mm_lock_seq; > >>>> 731 up_write(&vma->vm_lock->lock); > >>>> 732 } > >>>> > >>>> The call tree is: > >>>> > >>>> gru_fault() <- disables preempt > >>>> -> remap_pfn_range() > >>>> -> track_pfn_remap() > >>>> -> remap_pfn_range_notrack() > >>>> -> vm_flags_set() > >>>> -> vma_start_write() > >>>> > >>>> Before track_pfn_remap() and remap_pfn_range_notrack() would just do= |=3D > >>>> to set the flags but now they use vm_flags_set() so there is a poten= tial > >>>> they could sleep. > >>> > >>> Hi Dan, > >>> Thanks for reporting! Looks like the page fault handler is modifying > >>> the VMA flags, which has to be done under write-locked mmap_lock and = I > >>> don't see that being done here... I wonder if that should be allowed. > >>> I'm CC'ing some MM folks to check if this is a valid VMA modification > >>> and should be allowed. Matthew, this might be especially interesting > >>> for you since gru_fault() handles file-backed page faults AFAIKT. > >> > >> I don't run the ->fault handler under RCU, only the ->map_pages() > >> method. I don't intend to change that. > >> > >>> Back to the issue at hand. If such modification should be indeed > >>> allowed then the simplest fix I think would be to add new > >>> remap_pfn_range_locked() function to be called from gru_fault() which > >>> would use __vm_flags_mod() instead of vm_flags_set(). __vm_flags_mod(= ) > >>> does not lock the VMA, so would not have this issue. If the conclusio= n > >>> is that this is a valid scenario then I can post a fix I described. > >> > >> I'm not certain, but calling remap_pfn_range() in the fault handler > >> is definitely weird. It's normally called _instead_ of having a fault > >> handler. The fault handler usually calls set_pte_at() directly. > > > > Hmm. Is it weird enough to be considered invalid or weird but still ok? > > Also, is it ok to modify VMA flags here without write-locking the > > mmap_lock (and without write-locking the VMA)? The fault handler is > > done under read-locked mmap_lock but I thought VMA modifications > > require stronger locking... > > > > The "easy" fix would be to have something like "remap_pfn_range_prepare" > that the remap_pfn_range() caller calls during mmap(). Are you suggesting to break remap_pfn_range() into two stages (remap_pfn_range_prepare() then remap_pfn_range())? If so, there are many places remap_pfn_range() is called and IIUC all of them would need to use that 2-stage approach (lots of code churn). In addition, this is an exported function, so many more drivers might expect the current behavior. My suggestion was to have another function called smth like remap_pfn_range_nolock() and internally use the same code with an additional flag that would tell us whether we should lock the vma or not. Such change should be quite simple and small. > > We can let that one set these flags, and then we can later let > remap_pfn_range() fail if the relevant flags are not set. > > It's certainly one of these "always done like that and suboptimal but > somehow it worked" thingies. > > I suspect, because only the first pagefault->remap_pfn_range() will > actually modify the VMA flags, that this is ok. Otherwise, there usually > wouldn't be any pagetables and nothing mapped, so who really cares about > these VMA flags (e.g., GUP cannot pin anything if nothing is mapped). Is my understanding correct that the reasoning remap_pfn_range() is allowed here without write-locking the VMA (or write-locking mmap_lock) is because it's done only if there are no pre-existing patables? Thanks, Suren. > > -- > Cheers, > > David / dhildenb >