From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.6 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_MUTT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0ACB6C2BCA1 for ; Sat, 8 Jun 2019 02:12:49 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B63ED208C3 for ; Sat, 8 Jun 2019 02:12:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=ziepe.ca header.i=@ziepe.ca header.b="KdtQ5fr+" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B63ED208C3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ziepe.ca Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 304226B0273; Fri, 7 Jun 2019 22:12:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2B4CF6B0275; Fri, 7 Jun 2019 22:12:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1A3686B0276; Fri, 7 Jun 2019 22:12:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by kanga.kvack.org (Postfix) with ESMTP id E97B16B0273 for ; Fri, 7 Jun 2019 22:12:47 -0400 (EDT) Received: by mail-qt1-f198.google.com with SMTP id e39so3488358qte.8 for ; Fri, 07 Jun 2019 19:12:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:dkim-signature:date:from:to:cc:subject :message-id:references:mime-version:content-disposition:in-reply-to :user-agent; bh=wpShQkLbA04ODKXzG1qE5Ho3NFfEe7GrNzFNdw8N1fw=; b=ogbpxmbkikJSP4GNY2xM+Df2iQFLStpPvknKXGkqSjllh5gJozkUux1YXQRh30vnfv +/2VyjsDQ2unsJmeCNa5dRSraaO1ByCUDNCiACpYg1sIVpa9Kp9V2bOUdG/4h8ANZNW5 X+bK5itY80587d75i97aHBaF0/wKo6a5PV4k2iYa5FosFf5DTZ5Ig91G51NEdS+Wi/Jx B4guecJAkPt21hWdscgTZtBGjSB39NWVPuWGh6O7xUFjyf0PQDwS1fmbOCxi9WVVngCL 8VyHD6cM58iGH6xoObJ/G3wzPoGsYe4dLE0CtkISZ9BISmtmqbam/ZfqLo1oBL9RKqNl nepw== X-Gm-Message-State: APjAAAUefinSsK5YpUB4Ddqi8QjvRn5iUumkntCB2OVAk9R5Z1DhT733 dB8F3ThRbWGOeFlWeCQhqRoTe8ZGX32RcpTOo4PKhHWzBWlqoN5kwlQkr8cGSXTmcGW2gaWdw1x z5E8NZZ+4iPurt1r1WdbuToCUm3uBN4p18bhqD/qrkUMpP8wnDJBu39TIB707PQ4QwA== X-Received: by 2002:a05:620a:12a2:: with SMTP id x2mr19417904qki.133.1559959967707; Fri, 07 Jun 2019 19:12:47 -0700 (PDT) X-Received: by 2002:a05:620a:12a2:: with SMTP id x2mr19417879qki.133.1559959966940; Fri, 07 Jun 2019 19:12:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1559959966; cv=none; d=google.com; s=arc-20160816; b=tiDu18YZrCzOXU59WdKJfYlowGuzpG0UX26OrfdL+oX3P4v1KE0TZLr3IK6FA3ruGg 0sXv+DZGj4RfuWqg+X6z3EFKNZFTiGfncq8c77XGL+P3acQVcyL8NQcFzS+mjG2gKW1M lWrtUtLUv0V+acuDjhDMbHE63WkwSE5/83+APlxgbJe6HeLELWR62RhbmkPSmCPe4/3M HRnva2dfqjsDD8YvhzJ14RvNAz2Pypoh3/juQzBLPIfNJHr/uLyRsWj9Q0rNlDCdgiOP ViI8fGZauJaDspsMIWK00nMUw2XxmkVBeW1uGBffcHX+fRtzkSd5eHT2n2VKgFUp/5BW pREw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=user-agent:in-reply-to:content-disposition:mime-version:references :message-id:subject:cc:to:from:date:dkim-signature; bh=wpShQkLbA04ODKXzG1qE5Ho3NFfEe7GrNzFNdw8N1fw=; b=p3UqpxR/+ZD3sBEBRYJ1WixFLboGSUSo1VHztXEjSNahrfbpcBI9O01s97hHUxAaC8 lDT2oHncVVIt/EXnIdnq9rS4PekOFYQjqiuLjNXttlk8zE/W/vUqdTxUBRk4E1djmCaR qDPdsGR/tkfkKuEVvwyHtvAyh82t1GdNvsGSZuPySUau7O3wGL+XQ5+es54q1m9xgrdU 9hkmmc4qFVF3/agCyzuu5AMDtEGR3CLxzA6NGvQJz7TgVm3hJEnHjnBPWmmn0aJZZpOv c02ok9fqrKztiHtJN3+SwfjmFP92Q7fxUSg1wfQhRtDR+C8HXPshKufhqDQRJ4mT3p6S MGVA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ziepe.ca header.s=google header.b=KdtQ5fr+; spf=pass (google.com: domain of jgg@ziepe.ca designates 209.85.220.65 as permitted sender) smtp.mailfrom=jgg@ziepe.ca Received: from mail-sor-f65.google.com (mail-sor-f65.google.com. [209.85.220.65]) by mx.google.com with SMTPS id u124sor2125234qkb.0.2019.06.07.19.12.46 for (Google Transport Security); Fri, 07 Jun 2019 19:12:46 -0700 (PDT) Received-SPF: pass (google.com: domain of jgg@ziepe.ca designates 209.85.220.65 as permitted sender) client-ip=209.85.220.65; Authentication-Results: mx.google.com; dkim=pass header.i=@ziepe.ca header.s=google header.b=KdtQ5fr+; spf=pass (google.com: domain of jgg@ziepe.ca designates 209.85.220.65 as permitted sender) smtp.mailfrom=jgg@ziepe.ca DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ziepe.ca; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=wpShQkLbA04ODKXzG1qE5Ho3NFfEe7GrNzFNdw8N1fw=; b=KdtQ5fr+UogFRN2W3gb87flKQku8+8a2uZ9C8b+1HCumdi0y4HCUEWHcLy66b1nq4X 4SF3bP3sHcLzah/aOrL8m3PeQIXdb3olf+OcEG8F0Zk0r1dByCU7dQ8gQWxZqd1j60eD H0U83e9gy23Ez5x9lmNcp/2PZehCzH6iXZwD864FqkleaoYKq7DKpUhTRd/PcAfMRBck iRixfqYtjrFpiHJCAu53qL/zRKhqJChLVgw/2BiGOLXoHCd7CuI4Ts0aPv/9PRdPQQMD /Xieh3rM2sehEaWUCQZ6m5stbB/HmTeC2fa3mc/tDRN0kLoaVFwSQ0Qo87NCLb/F4KIU D0pQ== X-Google-Smtp-Source: APXvYqyopt5iGds43Rcjcs2oGVXUS/CH/C4qYinnv741RxxtwZYKZ6rAvBmMWcCM3HDewJAkVTAcCw== X-Received: by 2002:a37:a0e:: with SMTP id 14mr28116214qkk.203.1559959966493; Fri, 07 Jun 2019 19:12:46 -0700 (PDT) Received: from ziepe.ca (hlfxns017vw-156-34-55-100.dhcp-dynamic.fibreop.ns.bellaliant.net. [156.34.55.100]) by smtp.gmail.com with ESMTPSA id e66sm2067066qtb.55.2019.06.07.19.12.46 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 07 Jun 2019 19:12:46 -0700 (PDT) Received: from jgg by mlx.ziepe.ca with local (Exim 4.90_1) (envelope-from ) id 1hZQqb-0002RD-KJ; Fri, 07 Jun 2019 23:12:45 -0300 Date: Fri, 7 Jun 2019 23:12:45 -0300 From: Jason Gunthorpe To: Ralph Campbell Cc: Jerome Glisse , John Hubbard , Felix.Kuehling@amd.com, linux-rdma@vger.kernel.org, linux-mm@kvack.org, Andrea Arcangeli , dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release Message-ID: <20190608021245.GD7844@ziepe.ca> References: <20190606184438.31646-1-jgg@ziepe.ca> <20190606184438.31646-12-jgg@ziepe.ca> <61ea869d-43d2-d1e5-dc00-cf5e3e139169@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <61ea869d-43d2-d1e5-dc00-cf5e3e139169@nvidia.com> User-Agent: Mutt/1.9.4 (2018-02-28) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Jun 07, 2019 at 02:37:07PM -0700, Ralph Campbell wrote: > > On 6/6/19 11:44 AM, Jason Gunthorpe wrote: > > From: Jason Gunthorpe > > > > hmm_release() is called exactly once per hmm. ops->release() cannot > > accidentally trigger any action that would recurse back onto > > hmm->mirrors_sem. > > > > This fixes a use after-free race of the form: > > > > CPU0 CPU1 > > hmm_release() > > up_write(&hmm->mirrors_sem); > > hmm_mirror_unregister(mirror) > > down_write(&hmm->mirrors_sem); > > up_write(&hmm->mirrors_sem); > > kfree(mirror) > > mirror->ops->release(mirror) > > > > The only user we have today for ops->release is an empty function, so this > > is unambiguously safe. > > > > As a consequence of plugging this race drivers are not allowed to > > register/unregister mirrors from within a release op. > > > > Signed-off-by: Jason Gunthorpe > > I agree with the analysis above but I'm not sure that release() will > always be an empty function. It might be more efficient to write back > all data migrated to a device "in one pass" instead of relying > on unmap_vmas() calling hmm_start_range_invalidate() per VMA. Sure, but it should not be allowed to recurse back to hmm_mirror_unregister. > I think the bigger issue is potential deadlocks while calling > sync_cpu_device_pagetables() and tasks calling hmm_mirror_unregister(): > > Say you have three threads: > - Thread A is in try_to_unmap(), either without holding mmap_sem or with > mmap_sem held for read. > - Thread B has some unrelated driver calling hmm_mirror_unregister(). > This doesn't require mmap_sem. > - Thread C is about to call migrate_vma(). > > Thread A Thread B Thread C > try_to_unmap hmm_mirror_unregister migrate_vma > hmm_invalidate_range_start > down_read(mirrors_sem) > down_write(mirrors_sem) > // Blocked on A > device_lock > device_lock > // Blocked on C > migrate_vma() > hmm_invalidate_range_s > down_read(mirrors_sem) > // Blocked on B > // Deadlock Oh... you know I didn't know this about rwsems in linux that they have a fairness policy for writes to block future reads.. Still, at least as things are designed, the driver cannot hold a lock it obtains under sync_cpu_device_pagetables() and nest other things in that lock. It certainly can't recurse back into any mmu notifiers while holding that lock. (as you point out) The lock in sync_cpu_device_pagetables() needs to be very narrowly focused on updating device state only. So, my first reaction is that the driver in thread C is wrong, and needs a different locking scheme. I think you'd have to make a really good case that there is no alternative for a driver.. > Perhaps we should consider using SRCU for walking the mirror->list? It means the driver has to deal with races like in this patch description. At that point there is almost no reason to insert hmm here, just use mmu notifiers directly. Drivers won't get this right, it is too hard. Jason