From: Christian Schoenebeck
To: Matthew Wilcox
Cc: Miklos Szeredi, "Theodore Y. Ts'o", Frank van der Linden, Dave Chinner, "Dr.
David Alan Gilbert", Greg Kurz, linux-fsdevel@vger.kernel.org, Stefan Hajnoczi, Miklos Szeredi, Vivek Goyal, Giuseppe Scrivano, Daniel J Walsh, Chirantan Ekbote
Subject: Re: file forks vs. xattr (was: xattr names for unprivileged stacking?)
Date: Thu, 27 Aug 2020 15:48:57 +0200
Message-ID: <3331978.UQhOATu6MC@silver>
In-Reply-To: <20200827122555.GD14765@casper.infradead.org>
References: <20200824222924.GF199705@mit.edu> <1803870.bTIpkxUbEX@silver> <20200827122555.GD14765@casper.infradead.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Thursday, 27 August 2020 14:25:55 CEST Matthew Wilcox wrote:
> On Thu, Aug 27, 2020 at 02:02:42PM +0200, Christian Schoenebeck wrote:
> > What I could imagine as delimiter instead; slash-caret:
> >     /var/foo.pdf/^/forkname
>
> Any ascii character is going to be used in some actual customer workload.

Not exactly. "/foo/^/bar" is already a valid path today. Every Linux system
(incl. all libs/apps) must therefore already be able to deal with that path,
so it would not introduce a tokenization problem.

The caret character is not reserved by any filesystem either:
https://en.wikipedia.org/wiki/Filename

The only change a caret delimiter would bring is a very minor change in
semantics: apps would no longer be allowed to create dirs/files named exactly
"^". But I find that a very small restriction compared to the negative impact
of other delimiter options, i.e.:

    touch /some/where/^          # error if forks enabled, OK otherwise
    touch /some/where/^whatever  # always OK

So if you have apps that need to access dirs/files called *exactly* "^", that
would be easy to fix. And if you don't want to fix them, you just keep the
kernel's support for forks disabled and preserve the old semantics of "^".

> I suggest we use a unicode character instead.
>
> /var/foo.pdf/💩/badidea

Like I mentioned before, if you'd pick a unicode character (or binary), then
each shell will map its own ASCII sequence on top of that, because shell
users want ASCII. That would defeat the primary purpose: a unified path
resolution.

And even if you'd pick unicode, that would raise new questions and problems,
e.g. utf-8, utf-16 or utf-32? Is character normalization required? How do you
ensure each layer will use the same encoding?

Best regards,
Christian Schoenebeck
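
The two arguments in the message above can be illustrated with a short Python
sketch. It is only an illustration under stated assumptions, not a proposed
kernel interface: split_fork() and encoded_forms() are hypothetical helper
names, and the fork semantics of "^" are the proposal under discussion, not
existing Linux behaviour. The first helper shows that a path component that is
exactly "^" can be found with ordinary "/" tokenization, while "^whatever" is
left alone. The second shows that one Unicode delimiter has a different byte
sequence in each encoding, which is the concern raised about every layer
agreeing on an encoding.

```python
import unicodedata

# Hypothetical: the delimiter proposed in the message, not an existing ABI.
FORK_DELIMITER = "^"


def split_fork(path):
    """Split 'file/^/fork' into (file_path, fork_name).

    Returns (path, None) when no component is exactly "^", so names that
    merely start with a caret (e.g. "^whatever") are untouched.
    """
    components = path.split("/")
    if FORK_DELIMITER not in components:
        return path, None
    i = components.index(FORK_DELIMITER)
    return "/".join(components[:i]), "/".join(components[i + 1:])


def encoded_forms(name):
    """Return the byte sequences of `name` in three Unicode encodings."""
    return {enc: name.encode(enc) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


def same_after_normalization(a, b, form="NFC"):
    """Compare two names after applying one Unicode normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
```

For example, split_fork("/var/foo.pdf/^/forkname") yields the file path
"/var/foo.pdf" and the fork name "forkname", whereas
split_fork("/some/where/^whatever") returns the path unchanged with None as
the fork name. For the normalization question, "\u00e9" and "e\u0301" are
different code point sequences for the same visible character, and
same_after_normalization() treats them as equal only because both sides are
normalized first.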