Date: Sun, 28 Feb 2021 08:56:04 -0500
From: "Drew DeVault"
To: "Al Viro"
Cc: "Aleksa Sarai"
Subject: Re: [RFC PATCH] fs: introduce mkdirat2 syscall for atomic mkdir
References: <20210228002500.11483-1-sir@cmpwn.com>
X-Mailing-List: linux-api@vger.kernel.org

On Sat Feb 27, 2021 at 9:58 PM EST, Al Viro wrote:
> open() *always* returns descriptor or an error, for one thing.
> And quite a few of open() flags are completely wrong for mkdir,
> starting with symlink following and truncation.

So does mkdirat2. Are you referring to the do_mkdirat2 function? I merged
mkdir/mkdirat/mkdirat2 into one function, with a flag to enable the
mkdirat2 behavior, to avoid copying and pasting much of the functionality.
However, the syscalls themselves don't overload their return value as you
expect: mkdir and mkdirat both still return 0 or an error, and mkdirat2
always returns an fd or an error. If you prefer, I can keep their
implementations separate so that this is clearer.

I suppose the flags might be wrong. Should I just introduce a new set of
flags containing only the ones that are useful (which I think is just
O_CLOEXEC)?

> What's more, your implementation is both racy and deadlock-prone -
> it repeats the entire pathwalk with no warranty that it'll
> arrive to the object you've created *AND* if you have
> something like /foo/bar/baz/../../splat and dentry of bar
> gets evicted on memory pressure, that pathwalk will end up
> trying to look bar up. In the already locked /foo, aka
> /foo/bar/baz/../..

This is down to my unfamiliarity with this code, I think. I'll try to give
it a closer look.

> TBH, I don't understand what are you trying to achieve -
> what will that mkdir+open combination buy you, especially
> since that atomicity goes straight out of window if you try
> to use that on e.g. NFS. How is the userland supposed to make
> use of that thing?

I'm trying to close what appears to be an oversight in the API. See the
previous threads:

https://lore.kernel.org/linux-fsdevel/C9KKYZ4T5O53.338Y48UIQ9W3H@taiga/T/#t
https://lore.kernel.org/linux-fsdevel/20200316142057.xo24zea3k5zwswra@yavin/

Userland would use it the same way it uses mkdir+open today, but in one
call, so that the directory can be used as soon as it's created. The
atomicity goal, if achievable, also means the open fd holds a reference to
the new directory, so it stays usable even if another process removes it.
That makes such applications less error-prone, albeit in a minor edge case.

I'm not sure what's involved with the NFS case, but I can look into it.
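
For reference, here is a minimal sketch of the two-step mkdir+open pattern that userland has to use today. It only uses the existing mkdirat(2) and openat(2) calls; the helper name mkdir_and_open is hypothetical and is not part of the patch, and the flag choice is just one reasonable assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not from the patch):
 * create directory `name` relative to `dirfd`, then open it.
 * Returns an fd on success, or -1 with errno set.
 */
static int mkdir_and_open(int dirfd, const char *name, mode_t mode)
{
	assert(name != NULL);

	/* Step 1: create the directory. Fails with EEXIST if it exists. */
	if (mkdirat(dirfd, name, mode) == -1)
		return -1;

	/*
	 * Step 2: open it. Between step 1 and step 2 another process may
	 * remove or replace `name`, so this can fail (e.g. ENOENT), or the
	 * name may now refer to a different object. O_DIRECTORY and
	 * O_NOFOLLOW make the open fail if `name` is no longer a directory
	 * or has become a symlink, but they cannot prove that it is the
	 * directory created in step 1.
	 */
	return openat(dirfd, name,
		      O_RDONLY | O_DIRECTORY | O_NOFOLLOW | O_CLOEXEC);
}
```

Once the fd is open, it keeps a reference to the directory, so calls such as fstat(2) on it keep working even after the name is removed. The gap between the two calls is the window a single combined call would aim to close.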