Date: Wed, 19 Jul 2023 14:16:31 -0700
In-Reply-To: <79375b71-db2e-3e66-346b-254c90d915e2@cslab.ece.ntua.gr>
References: <79375b71-db2e-3e66-346b-254c90d915e2@cslab.ece.ntua.gr>
Message-ID: <20230719211631.890995-1-axelrasmussen@google.com>
Subject: Re: Using userfaultfd with KVM's async page fault handling causes processes to hung waiting for mmap_lock to be released
From: Axel Rasmussen
To: Dimitris Siakavaras
Cc: viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Xu, linux-mm@kvack.org, Axel Rasmussen

Thanks for the detailed report, Dimitris! I've CCed the MM mailing list and some folks who work on userfaultfd.

I took a look at this today, but I haven't quite come up with a solution.

I thought it might be as easy as changing userfaultfd_release() to set released *after* taking the lock. But no such luck: the ordering is what it is in order to deal with another subtle case:

	WRITE_ONCE(ctx->released, true);

	if (!mmget_not_zero(mm))
		goto wakeup;

	/*
	 * Flush page faults out of all CPUs. NOTE: all page faults
	 * must be retried without returning VM_FAULT_SIGBUS if
	 * userfaultfd_ctx_get() succeeds but vma->vma_userfault_ctx
	 * changes while handle_userfault released the mmap_lock. So
	 * it's critical that released is set to true (above), before
	 * taking the mmap_lock for writing.
	 */
	mmap_write_lock(mm);

I think perhaps the right thing to do is to have handle_userfault() release mmap_lock when it returns VM_FAULT_NOPAGE, and to have GUP deal with that appropriately? But some investigation is required to be sure that's okay to do in the other non-GUP ways we can end up in handle_userfault().
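
To make the idea above a bit more concrete, here is a toy userspace model of it. This is only a sketch, not kernel code: every toy_* name is made up for illustration, the "lock" is a plain bool rather than a real rwsem, and it assumes the NOPAGE return is the one taken when the fault handler observes released == true. It only shows the shape of the contract: the handler drops the lock before returning NOPAGE, and the GUP-style caller re-takes it before retrying, which gives the pending writer in the release path a window to get in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_FAULT_NOPAGE = 1 };   /* hypothetical stand-in for VM_FAULT_NOPAGE */

struct toy_ctx { bool released; };       /* models ctx->released */
struct toy_mm {
	bool read_locked;                /* models mmap_lock held for read */
	struct toy_ctx *vma_ctx;         /* models the VMA's userfaultfd context */
};

/* Release path, step 1: mirrors WRITE_ONCE(ctx->released, true). */
static void toy_release_begin(struct toy_ctx *ctx)
{
	ctx->released = true;
}

/*
 * Release path, step 2: stands in for mmap_write_lock() followed by
 * clearing the VMA's context. A writer cannot get in while a reader
 * holds the lock, so this reports failure instead of blocking.
 */
static bool toy_release_finish(struct toy_mm *mm)
{
	if (mm->read_locked)
		return false;
	mm->vma_ctx = NULL;
	return true;
}

/*
 * Fault handler with the proposed behaviour: drop the lock before
 * returning NOPAGE. The "wait for userspace" case is not modelled;
 * any other fault simply succeeds with the lock still held.
 */
static int toy_handle_userfault(struct toy_mm *mm)
{
	assert(mm->read_locked);
	if (mm->vma_ctx && mm->vma_ctx->released) {
		mm->read_locked = false;   /* the proposed change */
		return TOY_FAULT_NOPAGE;
	}
	return 0;
}

/*
 * GUP-style caller: on NOPAGE it knows the lock is gone and re-takes
 * it before retrying. Returns the number of attempts, or -1 on giving
 * up; the lock is held on return only when the result is positive.
 */
static int toy_gup(struct toy_mm *mm, int max_tries)
{
	for (int tries = 1; tries <= max_tries; tries++) {
		mm->read_locked = true;
		if (toy_handle_userfault(mm) != TOY_FAULT_NOPAGE)
			return tries;
		/* Lock was dropped, so the pending writer can make progress. */
		toy_release_finish(mm);
	}
	return -1;
}
```

In this model, a caller that keeps holding the read lock across the NOPAGE return would make toy_release_finish() fail every time, which is the kind of stall the proposal is meant to avoid. Whether the same contract is safe for the non-GUP callers of handle_userfault() is exactly the open question, and the model says nothing about that.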