Date: Mon, 24 Oct 2022 19:18:46 +0100
From: Matthew Wilcox <willy@infradead.org>
To: Ira Weiny
Cc: Andrew Morton, Randy Dunlap, Peter Xu, Andrea Arcangeli,
 kernel test robot, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH for rc] mm/shmem: Ensure proper fallback if page faults
References: <20221024043305.1491403-1-ira.weiny@intel.com>

On Mon, Oct 24, 2022 at 09:54:30AM -0700, Ira Weiny wrote:
> On Sun, Oct 23, 2022 at 09:33:05PM -0700, Ira wrote:
> > From: Ira Weiny
> >
> > The kernel test robot flagged a recursive lock as a result of a
> > conversion from kmap_atomic() to kmap_local_folio()[Link]
> >
> > The cause was due to the code depending on the kmap_atomic() side effect
> > of disabling page faults. In that case the code expects the fault to
> > fail and take the fallback case.
> >
> > git archaeology implied that the recursion may not be an actual bug.[1]
> > However, the mmap_lock needed in the fault may be the one held.[2]
> >
> > Add an explicit pagefault_disable() and a big comment to explain this
> > for future souls looking at this code.
> >
> > [1] https://lore.kernel.org/all/Y1MymJ%2FINb45AdaY@iweiny-desk3/
> > [2] https://lore.kernel.org/all/Y1M2p9OtBGnKwGUE@x1n/
> >
> > Fixes: 7a7256d5f512 ("shmem: convert shmem_mfill_atomic_pte() to use a folio")
> > Cc: Andrew Morton
> > Cc: Randy Dunlap
> > Cc: Peter Xu
> > Cc: Andrea Arcangeli
> > Reported-by: Matthew Wilcox (Oracle)
> > Reported-by: kernel test robot
> > Link: https://lore.kernel.org/r/202210211215.9dc6efb5-yujie.liu@intel.com
> > Signed-off-by: Ira Weiny
> >
> > ---
> > Thanks to Matt and Andrew for initial diagnosis.
> > Thanks to Randy for pointing out C code needs ';'  :-D
> > Thanks to Andrew for suggesting an elaborate comment
> > Thanks to Peter for pointing out that the mm's may be the same.
> > ---
> >  mm/shmem.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 8280a5cb48df..c1bca31cd485 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -2424,9 +2424,16 @@ int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
> >
> >  	if (!zeropage) {	/* COPY */
> >  		page_kaddr = kmap_local_folio(folio, 0);
> > +		/*
> > +		 * The mmap_lock is held here. Disable page faults to
> > +		 * prevent deadlock should copy_from_user() fault. The
> > +		 * copy will be retried outside the mmap_lock.
> > +		 */
>
> Offline Dave Hansen and I were discussing this and he was concerned that this
> comment implies that a deadlock would always occur rather than might occur.
>
> I was not clear on this as I was thinking the read mmap_lock was
> non-recursive.
>
> So I think we have 3 cases only 1 of which will actually deadlock and is, as
> Dave puts it, currently theoretical.
>
> 1) Different mm's are in play (no issue)
> 2) Readlock implementation is recursive and same mm is in play (no issue)
> 3) Readlock implementation is _not_ recursive (issue)
>
> In both 1 and 2 lockdep is incorrectly flagging the issue but 3 is a problem
> and I think this is what Andrea was thinking.
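
(For context, the copy path that hunk sits in has roughly the shape
below. This is paraphrased rather than quoted, so treat names like
src_addr, *pagep and the -ENOENT fallback as illustrative:)

	page_kaddr = kmap_local_folio(folio, 0);
	/*
	 * mmap_lock is held for read here.  With page faults disabled,
	 * a faulting copy_from_user() just returns the number of bytes
	 * it could not copy instead of entering the fault handler (and
	 * possibly trying to take mmap_lock again).
	 */
	pagefault_disable();
	ret = copy_from_user(page_kaddr,
			     (const void __user *)src_addr, PAGE_SIZE);
	pagefault_enable();
	kunmap_local(page_kaddr);

	if (unlikely(ret)) {
		/*
		 * Fall back: hand the folio back so the caller can drop
		 * mmap_lock, redo the copy without it, and retry.
		 */
		*pagep = &folio->page;
		ret = -ENOENT;
		goto out;
	}

That is the fallback the new comment describes: any fault is taken
outside the lock, and the function is simply called again once the data
has been copied.
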
The readlock implementation is only recursive if nobody else has taken a
write lock. AIUI, no other process can take a write lock on the mmap_lock
(other processes can take read locks by examining /proc/$pid/maps, for
example), although maybe ptrace can take the mmap_lock for write? But if
you have a multithreaded process, one of the other threads can call mmap()
and that will prevent recursion (due to fairness).

Even if it's a different process whose mmap read lock you're trying to
acquire, you can still get into a deadly embrace. eg:

process A thread 1 takes read lock on own mmap_lock
process A thread 2 calls mmap, blocks taking write lock
process B thread 1 takes page fault, read lock on own mmap lock
process B thread 2 calls mmap, blocks taking write lock
process A thread 1 blocks taking read lock on process B
process B thread 1 blocks taking read lock on process A

Now all four threads are blocked waiting for each other.
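
If it helps to see the fairness point in isolation, here is a small
userspace analogue using a glibc writer-preferring rwlock (purely
illustrative; it has nothing to do with the kernel's rwsem internals).
Thread 1 takes the read lock, thread 2 queues for the write lock (the
mmap() stand-in), and thread 1's second read attempt then parks behind
the queued writer, which is the same shape as a page fault re-taking
mmap_lock for read:

	#define _GNU_SOURCE
	#include <pthread.h>
	#include <stdio.h>
	#include <unistd.h>

	static pthread_rwlock_t lock;

	static void *writer(void *arg)
	{
		/* Stands in for another thread calling mmap(): it queues
		 * for the write lock behind the existing reader. */
		pthread_rwlock_wrlock(&lock);
		pthread_rwlock_unlock(&lock);
		return arg;
	}

	int main(void)
	{
		pthread_rwlockattr_t attr;
		pthread_t w;

		pthread_rwlockattr_init(&attr);
		/* Writer-preferring, non-recursive: readers queue behind
		 * any waiting writer, i.e. a "fair" lock. */
		pthread_rwlockattr_setkind_np(&attr,
				PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
		pthread_rwlock_init(&lock, &attr);

		pthread_rwlock_rdlock(&lock);	/* the "fault path" read lock */
		pthread_create(&w, NULL, writer, NULL);
		sleep(1);			/* let the writer queue up */

		fprintf(stderr, "retaking the read lock (expect a hang)\n");
		pthread_rwlock_rdlock(&lock);	/* "recursion" blocks here */
		fprintf(stderr, "not reached with a fair lock\n");
		return 0;
	}

Build with "cc -pthread" and it hangs at the second rdlock; with glibc's
default reader-preferring kind the recursive rdlock succeeds, which is
Ira's case 2 above.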