From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 20FCFC74A5B for ; Wed, 29 Mar 2023 19:41:10 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 33E376B0072; Wed, 29 Mar 2023 15:41:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2EE676B0074; Wed, 29 Mar 2023 15:41:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1DD596B0075; Wed, 29 Mar 2023 15:41:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 0FDA96B0072 for ; Wed, 29 Mar 2023 15:41:10 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id CF24C1207A1 for ; Wed, 29 Mar 2023 19:41:09 +0000 (UTC) X-FDA: 80622954258.30.DBF1384 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf01.hostedemail.com (Postfix) with ESMTP id AA0F14000B for ; Wed, 29 Mar 2023 19:41:06 +0000 (UTC) Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=N2zwR5xV; spf=pass (imf01.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680118866; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=np8vWIlQ+v2ijN50x+Ku8EC8uzXFlwApSDs6+DUB4YQ=; b=ZN+OE0BpA3UuhaG9mxIm8cIsED6zc3JJDIUcbcT+xlF01K3NDByMqLt4FvAJ4mvqHZZ/L3 4nWFi3l+M9JVPqSucxj9FQPp/vBUu+PiBDzwbKXG63DHP2Z6bQN9hF7fi/N1Zrwq63VXHg BeFY/tzAIqJ8F7y7FhLWi0+pFFijrEA= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=N2zwR5xV; spf=pass (imf01.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680118866; a=rsa-sha256; cv=none; b=HI1wHqqCEkE3SViS3T+v2on18QTS+YeCkFNRFbxgO/Ym29ZQzJrEMqaY390TzTYBIsME2n TlItoA3624OtjKlzZUUVExc+4TAOkbrr55NOZIMHxra3PtSHiiWB7+dmcECQBkCVt71cjw IN6fFZV32IJ8fwvRS2vW+rMlcYqFGyw= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680118866; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=np8vWIlQ+v2ijN50x+Ku8EC8uzXFlwApSDs6+DUB4YQ=; b=N2zwR5xVOTXCgPm/0rjlX5NyYI6N5TkkczK9sbQI4WGKdzT9qdjtFjk4JUR/7Es8SLvOi+ 27roxi/EcR7A9BvQO1yY/yzbipiGPm3ECmgreRaIyN+ttLy3UKDL2Lc4EHaTT2l64lg+mf 87KFx4fk27ZQEYFg5o9ANfWTopgHTJo= Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-25-4zWUsnbHPdyR8x_2YxF1RQ-1; Wed, 29 Mar 2023 15:41:04 -0400 X-MC-Unique: 4zWUsnbHPdyR8x_2YxF1RQ-1 Received: by mail-qt1-f198.google.com with SMTP id h6-20020ac85846000000b003e3c23d562aso10946020qth.1 for ; Wed, 29 Mar 2023 12:41:04 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680118863; x=1682710863; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=np8vWIlQ+v2ijN50x+Ku8EC8uzXFlwApSDs6+DUB4YQ=; b=Z/KVf6OQxC+BYcKJ/Da1wdXw4FU9590iTDyey3n0zLKlA7R6GgakWMdU0J83d16kwb LiZkMgdqos7qWP/GeNOg8DCFE4HJu/sRBmf6oCzj/wYFxpLJEJnwpDk9OjBzOKGgJeRP NMMQISqhJvyB7FM9yJatI/9ATB8tm9F9pFWZimEocrV0sVCiz4oIKxV5MeDUNDGju8Cx bny9+TDCY4UuyfCmFN4KzVZ9TZXq1+bUxSe1muRvc/jIz7ZBJ7NryYw+kSZGALf4utjR tfSBHzeFT/nErdf4lv9Ax04dthSJ2wjrzUCOZ0O8tqlBDwZ9oe/YPKiT3IUh6MKHk0Ho LnFA== X-Gm-Message-State: AAQBX9dS0cpgB7vzVMtTqpWFUCWYeW0r/8sinYxlenHh2brUp5jHAW5M K0ww65s03oZGL5XVKjND1dXDpxZNRiD/kNTc6YcYsb3bujUKa9ruZT9lkIoqYP/3zbPDvSF/UzP vjtoOD0pBGCGcE8bqv7w= X-Received: by 2002:a05:6214:409:b0:56c:222d:427a with SMTP id z9-20020a056214040900b0056c222d427amr32374992qvx.1.1680118863439; Wed, 29 Mar 2023 12:41:03 -0700 (PDT) X-Google-Smtp-Source: AKy350beCRIu7VzJno1QTnNsCw1pZmRV6igDjjS/pY6b+5FxGAYv86IgYq9XMU67rDUGaXNwUgQxCw== X-Received: by 2002:a05:6214:409:b0:56c:222d:427a with SMTP id z9-20020a056214040900b0056c222d427amr32374970qvx.1.1680118863150; Wed, 29 Mar 2023 12:41:03 -0700 (PDT) Received: from x1n (bras-base-aurron9127w-grc-40-70-52-229-124.dsl.bell.ca. [70.52.229.124]) by smtp.gmail.com with ESMTPSA id 123-20020a370581000000b007456df35859sm14434041qkf.74.2023.03.29.12.41.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Mar 2023 12:41:02 -0700 (PDT) Date: Wed, 29 Mar 2023 15:41:01 -0400 From: Peter Xu To: Axel Rasmussen Cc: Alexander Viro , Andrew Morton , linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] userfaultfd: don't fail on unrecognized features Message-ID: References: <20220722201513.1624158-1-axelrasmussen@google.com> MIME-Version: 1.0 In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit X-Stat-Signature: uutuzorobd6879cwtayyxi1u7p6dc1k6 X-Rspam-User: X-Rspamd-Queue-Id: AA0F14000B X-Rspamd-Server: rspam06 X-HE-Tag: 1680118866-432906 X-HE-Meta: U2FsdGVkX1/pcL2LCjQgfLhU74A6EVA7/XoGXzWU3gs551DyK3QJ0VCmfX88nfL2scGjKCuRs3db27pS7UkxA6dcOr0i8yqlMivgliQdMPvKhbh1wsm4qQraKfQa2cW5G8knKb28aVxyBdFoXL+HxdJd89Nxi7uuClE1S3o0bnjG1oS1UixPbHDm8WLbg9d3mGhHiduOZStNtS3XFlzmYqDtPY3SeLrK+Ad3o45aTgLaitcna+Xle8nYRkRgNLC3TGCcAPWer6QBMh4r7QnX3kz/25UOxl++0KbX/Jyf6BqzPHTYV13/59/RPT0FVg7aJ12I+3r7N8RHYLumINBlSmuQ9/APQoPE0L/ZgJipEbYkClZ3XKdbJjggD8q57/w0R907HwCtEoQ2WhVPY+yjkX7g5p3uWmeunQAVsmWGZYqqpVFq8n5l57w1ZPWPxvc78qsRA1YcYdN+v4vEarLkGUcIFM1VW8qtUHWlZXo4eoK7j5/Qu/kYk8FmJ/Yrc0ItKtK9uMtjXy+q0xjkIh/gP88lROeYXwLdz5t3X7BHzgnmIw/faTlBKaniuJDzB/HjrHDrtYFzoIW4FUhZ7XSb2sdw4Uc+6IaeOQpCaarN+GnD3WtR6T1h7r1RcZMhSIzbi9CAEda8gu7yW1MSYLZhyvoL3xHF+XIrRyeT7kIpKlAarUHSo1zZGTUX1v/QN54u33TDCQAxD64zvAcLXWgI8Wz8CTUUQ7pahCvjgiEWWwIDOSRY5xv11rLmCtxochRlsYFhvgkRRMIA2TWDhpHgYCIjaNgfkTAFV801UN1NTXsqX+52zlqiNw7GIYJ1or7IExi3gbSVQss6q4ko78bTEC5ipaAAZE9r9qmTsO9S4ocpun410VmeorxMykIVZ3eWN+PIawk0pSfrEDQ9qtYNVeNZ5TbVUiXpS1oCHLIgbusU+uaTsejpoTa1yep5kgy9BrMMN6+cQEfh7Gazwbn BvV0os4i 8VvHBTUYgKFMOcVm28F6pioPVxtPf3OVleyUt65aVzgibbloW1mpi6VWAwrMcWkEOfEfpYmJUMLIIGFlcs9OWXoUqsncCtAzRvBa4SIbqL0NT6BS577NCQbUXxro3BniO9/G5e+nF+5y72b/YxT+r2Kl/h8iDurG23XhClmhdWK81zTnex77AKaKKKboJpz+ctHV+IGFUTPnPunx/zIJlU+dT7mD3sga2cMo9A0Jfx9t6U+C6rfCj/2s62rrXxybcPstaqfXeKfABAgomaXjocC4uLcmk2hkGH0JVTdHL/PIYR18DSqIJIsnhMDt1mxi4V3HVTOHbWfAthYLN7istIXpmASi3lmyxztcGOfBHDtuuyUfEE5qyJhBwFMZFYdxzWinW6NwpIuv49XzMmQOnoV464w== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Mar 29, 2023 at 10:53:38AM -0700, Axel Rasmussen wrote: > On Tue, Mar 28, 2023 at 3:34 PM Peter Xu wrote: > > > > On Tue, Mar 28, 2023 at 02:52:35PM -0700, Axel Rasmussen wrote: > > > I don't see being very strict here as useful. Another example might be > > > madvise() - for example trying to MADV_PAGEOUT on a kernel that > > > doesn't support it. There is no way the kernel can proceed here, since > > > it simply doesn't know how to do what you're asking for. In this case > > > an error makes sense. > > > > IMHO, PAGEOUT is not a great example. I wished we can have a way to probe > > what madvise() the system supports, and I know many people wanted that too. > > I even had a feeling that we'll have it some day. > > > > So now I'm going back to look at this patch assuming I'm reviewing it, I'm > > still not convinced the old API needs changing. > > > > Userfaultfd allows probing with features=0 with/without this patch, so I > > see this patch as something that doesn't bring a direct functional benefit, > > The benefit is we combine probing for features and creating a > userfaultfd into a single step, so userspace doesn't have to open + > manipulate a userfaultfd twice. In my mind, both approaches achieve > the same thing, it's just that one requires extra steps to get there. > > To me, it's still unclear why there is any harm in supporting the > simpler way? And, I also don't see any way in which the more complex > way is better? Because that's what the man page says? :) > > > but some kind of api change due to subjective preferences which I cannot > > say right or wrong. Now the patch is already merged. If we need to change > > either this patch or the man page to make them match again, again I'd > > prefer we simply revert it to keep everything like before and copy stable. > > I think we need to change documentation either way. But, I think the > changes needed are actually bigger if we want to revert. IIUC the man page doesn't need to update if we revert this patch. The man page described clearly on what will happen if we pass in feature bits that are not supported: To enable userfaultfd features the application should set a bit corresponding to each feature it wants to enable in the features field. If the kernel supports all the requested features it will enable them. Otherwise it will zero out the returned uffdio_api structure and return EINVAL. > With the simpler behavior, the selftest and the example program in the > man page are ~correct as-is; otherwise we would need to modify those > to use the two-step probing method. > > (By the way, I am excited about the selftest refactoring you talked > about! Thanks for doing that work. It definitely needs it, the > complexity there has gotten significantly worse as we've added more > things onto it [wp, minor faults].) I'll definitely copy you when I post it. It growed a bit larger than I thought, it'll be great if you can help have a look. In the test cases I added an UFFDIO_API test to be the 1st one and that's why I found this issue. To let all tests pass currently I'll need to revert this patch. If you want we can move the discussion there when I post it, I think that may need to be the 1st patch for the test suite change and to let current test suite pass. > I think the man page description of how to use the API is incomplete > in either case. Right now it sort of alludes to the fact that you can > probe with features==0, but it doesn't explicitly say "you need to > probe first, then close that userfaultfd and open the real one you > want to use, with a subset of the features reported in the first > step". If we want to keep the old behavior, it should be more explicit > about the steps needed to get a userfaultfd. To tell the truth, if I'm going to change the API anyway, I'll simply add a UFFDIO_FEATURES ioctl() returning the supported features, that'll be much, much easier than either the old one or the one this patch proposed, IMHO. Then we keep all the rest untouched. That should work perfectly and that'll not require open()/close() duplications either. But as I mentioned before, I don't think UFFDIO_FEATURES justifies itself either much on being worthwhile because it introduces a new ioctl without any major benefit. At least we'll need to keep the old behavior still working. > > You are right that it also doesn't describe "you can just ask for what > you want, and the kernel tells you what subset it can give you; you > need to check that the reported features are acceptable" - the new > behavior. That should be updated. -- Peter Xu