Date: Thu, 14 Sep 2023 22:24:02 +0100
From: Matthew Wilcox
To: Suren Baghdasaryan
Cc: syzbot, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [mm?] kernel BUG in vma_replace_policy

On Thu, Sep 14, 2023 at
08:53:59PM +0000, Suren Baghdasaryan wrote:
> On Thu, Sep 14, 2023 at 8:00 PM Suren Baghdasaryan wrote:
> >
> > On Thu, Sep 14, 2023 at 7:09 PM Matthew Wilcox wrote:
> > >
> > > On Thu, Sep 14, 2023 at 06:20:56PM +0000, Suren Baghdasaryan wrote:
> > > > I think I found the problem and the explanation is much simpler. While
> > > > walking the page range, queue_folios_pte_range() encounters an
> > > > unmovable page and queue_folios_pte_range() returns 1. That causes a
> > > > break from the loop inside walk_page_range() and no more VMAs get
> > > > locked. After that the loop calling mbind_range() walks over all VMAs,
> > > > even the ones which were skipped by queue_folios_pte_range() and that
> > > > causes this BUG assertion.
> > > >
> > > > Thinking what's the right way to handle this situation (what's the
> > > > expected behavior here)...
> > > > I think the safest way would be to modify walk_page_range() and make
> > > > it continue calling process_vma_walk_lock() for all VMAs in the range
> > > > even when __walk_page_range() returns a positive err. Any objection or
> > > > alternative suggestions?
> > >
> > > So we only return 1 here if MPOL_MF_MOVE* & MPOL_MF_STRICT were
> > > specified. That means we're going to return an error, no matter what,
> > > and there's no point in calling mbind_range(). Right?
> > >
> > > +++ b/mm/mempolicy.c
> > > @@ -1334,6 +1334,8 @@ static long do_mbind(unsigned long start, unsigned long len,
> > >  	ret = queue_pages_range(mm, start, end, nmask,
> > >  			  flags | MPOL_MF_INVERT, &pagelist, true);
> > >
> > > +	if (ret == 1)
> > > +		ret = -EIO;
> > >  	if (ret < 0) {
> > >  		err = ret;
> > >  		goto up_out;
> > >
> > > (I don't really understand this code, so it can't be this simple, can
> > > it? Why don't we just return -EIO from queue_folios_pte_range() if
> > > this is the right answer?)
> >
> > Yeah, I'm trying to understand the expected behavior of this function
> > to make sure we are not missing anything. I tried a simple fix that I
> > suggested in my previous email and it works but I want to understand a
> > bit more about this function's logic before posting the fix.
>
> So, current functionality is that after queue_pages_range() encounters
> an unmovable page, terminates the loop and returns 1, mbind_range()
> will still be called for the whole range
> (https://elixir.bootlin.com/linux/latest/source/mm/mempolicy.c#L1345),
> all pages in the pagelist will be migrated
> (https://elixir.bootlin.com/linux/latest/source/mm/mempolicy.c#L1355)
> and only after that the -EIO code will be returned
> (https://elixir.bootlin.com/linux/latest/source/mm/mempolicy.c#L1362).
> So, if we follow Matthew's suggestion we will be altering the current
> behavior which I assume is not what we want to do.

Right, I'm intentionally changing the behaviour. My thinking is that
mbind(MPOL_MF_MOVE | MPOL_MF_STRICT) is going to fail. Should such a
failure actually move the movable pages before reporting that it failed?
I don't know.

> The simple fix I was thinking about that would not alter this behavior
> is smth like this:

I don't like it, but can we run it past syzbot to be sure it solves the
issue and we're not chasing a ghost here?
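
To make the two behaviours discussed above easier to compare, here is a
small userspace toy model of the control flow. It is only a sketch and not
kernel code: every toy_* name, the struct layouts, and the early_eio switch
are hypothetical, and the locking is reduced to a boolean. It assumes, as
described in the thread, that the walk locks each VMA it visits and stops
after the first one containing an unmovable page.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VMA; not the kernel's struct vm_area_struct. */
struct toy_vma {
	bool has_unmovable_page; /* walk returns 1 here (MOVE + STRICT case) */
	bool write_locked;       /* set by the walk when it visits the VMA */
	bool policy_replaced;    /* set when the policy update touches the VMA */
};

/* Outcome of one simulated do_mbind() call. */
struct toy_result {
	int err;              /* 0 or a negative errno */
	size_t unlocked_hits; /* VMAs reached by the policy update while unlocked */
};

/* Models the page walk: lock each VMA, stop at the first unmovable page. */
static int toy_queue_pages_range(struct toy_vma *vmas, size_t n)
{
	assert(vmas != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		vmas[i].write_locked = true;
		if (vmas[i].has_unmovable_page)
			return 1; /* the remaining VMAs are never locked */
	}
	return 0;
}

/* Models the mbind_range() loop, which covers the whole range. */
static size_t toy_mbind_range(struct toy_vma *vmas, size_t n)
{
	size_t unlocked = 0;

	for (size_t i = 0; i < n; i++) {
		if (!vmas[i].write_locked)
			unlocked++; /* the real kernel would hit the BUG here */
		else
			vmas[i].policy_replaced = true;
	}
	return unlocked;
}

/* early_eio=false: behaviour as described; true: the quoted two-line diff. */
static struct toy_result toy_do_mbind(struct toy_vma *vmas, size_t n,
				      bool early_eio)
{
	struct toy_result r = { 0, 0 };
	int ret = toy_queue_pages_range(vmas, n);

	if (early_eio && ret == 1)
		ret = -EIO;
	if (ret < 0) {
		r.err = ret; /* bail out before any policy is replaced */
		return r;
	}
	r.unlocked_hits = toy_mbind_range(vmas, n);
	if (ret > 0)
		r.err = -EIO; /* error reported only after the update ran */
	return r;
}
```

With three VMAs where only the middle one holds an unmovable page,
toy_do_mbind(vmas, 3, false) returns -EIO but reaches one unlocked VMA,
while toy_do_mbind(vmas, 3, true) returns -EIO without replacing any
policy. The model says nothing about page migration, which is the open
question in the reply above.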