From: Miklos Szeredi
Date: Tue, 1 Mar 2022 10:41:01 +0100
Subject: Re: [PATCH 6/6] fuse: convert direct IO paths to use FOLL_PIN
To: John Hubbard
Cc: jhubbard.send.patches@gmail.com, Jens Axboe, Jan Kara, Christoph Hellwig, Dave Chinner, "Darrick J . Wong", "Theodore Ts'o", Alexander Viro, Andrew Morton, Chaitanya Kulkarni, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-xfs, linux-mm, LKML
References: <20220227093434.2889464-1-jhubbard@nvidia.com> <20220227093434.2889464-7-jhubbard@nvidia.com>
X-Mailing-List: linux-xfs@vger.kernel.org

On Mon, 28 Feb 2022 at 22:16, John Hubbard wrote:
>
> On 2/28/22 07:59, Miklos Szeredi wrote:
> > On Sun, 27 Feb 2022 at 10:34, wrote:
> >>
> >> From: John Hubbard
> >>
> >> Convert the fuse filesystem to support the new iov_iter_get_pages()
> >> behavior. That routine now invokes pin_user_pages_fast(), which means
> >> that such pages must be released via unpin_user_page(), rather than via
> >> put_page().
> >>
> >> This commit also removes any possibility of kernel pages being handled,
> >> in the fuse_get_user_pages() call. Although this may seem like a steep
> >> price to pay, Christoph Hellwig actually recommended it a few years ago
> >> for nearly the same situation [1].
> >
> > This might work for O_DIRECT, but fuse has this mode of operation
> > which turns normal "buffered" I/O into direct I/O. And that in turn
> > will break execve of such files.
> >
> > So AFAICS we need to keep kvec handling in some way.
> >
>
> Thanks for bringing that up! Do you have any hints for me, to jump start

How about just leaving that special code in place? It bypasses page
refs and copies directly to the kernel buffer, so it should not have
any effect on the user page code.

> a deeper look? And especially, sample programs that exercise this?

Here's one:

# uncomment as appropriate:
#sudo dnf install fuse3-devel
#sudo apt install libfuse3-dev

cat <<EOF > fuse-dio-exec.c
#define FUSE_USE_VERSION 31
#include <fuse.h>
#include <errno.h>
#include <unistd.h>

/* backing file that the mounted file mirrors */
static const char *filename = "/bin/true";

static int test_getattr(const char *path, struct stat *stbuf,
			struct fuse_file_info *fi)
{
	return lstat(filename, stbuf) == -1 ? -errno : 0;
}

static int test_open(const char *path, struct fuse_file_info *fi)
{
	int res;

	res = open(filename, fi->flags);
	if (res == -1)
		return -errno;

	fi->fh = res;
	/* turn normal "buffered" I/O on this file into direct I/O */
	fi->direct_io = 1;
	return 0;
}

static int test_read(const char *path, char *buf, size_t size, off_t offset,
		     struct fuse_file_info *fi)
{
	int res = pread(fi->fh, buf, size, offset);
	return res == -1 ? -errno : res;
}

static int test_release(const char *path, struct fuse_file_info *fi)
{
	close(fi->fh);
	return 0;
}

static const struct fuse_operations test_oper = {
	.getattr = test_getattr,
	.open = test_open,
	.release = test_release,
	.read = test_read,
};

int main(int argc, char *argv[])
{
	return fuse_main(argc, argv, &test_oper, NULL);
}
EOF

gcc -W fuse-dio-exec.c `pkg-config fuse3 --cflags --libs` -o fuse-dio-exec
touch /tmp/true

#run test:
./fuse-dio-exec /tmp/true
/tmp/true
umount /tmp/true