From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D08BC61DD8 for ; Mon, 16 Nov 2020 20:18:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E4065217A0 for ; Mon, 16 Nov 2020 20:18:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Xp/Jnh0Q" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732142AbgKPUSS (ORCPT ); Mon, 16 Nov 2020 15:18:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42868 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727305AbgKPUSR (ORCPT ); Mon, 16 Nov 2020 15:18:17 -0500 Received: from mail-il1-x141.google.com (mail-il1-x141.google.com [IPv6:2607:f8b0:4864:20::141]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2C090C0613CF; Mon, 16 Nov 2020 12:18:16 -0800 (PST) Received: by mail-il1-x141.google.com with SMTP id h6so12827084ilj.8; Mon, 16 Nov 2020 12:18:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=Y5tWyVQqyVOPnOnbKCKzBaCUixD0HHPocwip/42dsIc=; b=Xp/Jnh0Qrn4cKldL/3hsCP1XVq99BIBBI1I6GST2XcF9F534R29F/7KHEPIbD8fbLQ bLRA4NUMci+CGdK/a66oqkebK7y2uJ+bA6n848i2Xh3T0otHNhhzI/55hID+y0pXGx32 RfYUeBlcndLzmR4DbWBEx1tviDYBUcbJ9WXQUI1j3FALl6d9V1qN7BE0SgRcfmFpz53v qfLTIl6dsl6jYR4r4A+AXd83oAQlKi6EJ/D7S1ZzZbLP+MUJtRN5mDPqJ7tx6jyQ0kkq 4diyyzEahEz1PrMytrrMyTQCfP/P26VJ1S5IQzbxyxIjAHkuTSbXVo041THgly1TWvyC 9Xyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Y5tWyVQqyVOPnOnbKCKzBaCUixD0HHPocwip/42dsIc=; b=LO3TX8stXYPcicnyJ0RLx6kMddRKe50W34q4/eazcRrPkfEeTyNd3d8Eb6KVh4fd7j 9wFQVEC5LWDjEnc+2c8Ro1Cv/iCRG4VVCowtl8dSUmITRcbjpm3YK1jgBulUl+VALnV5 BXjunjWvHLjTMy6PxxkJTPwYDpzCdTx/mfY/8xRMqn5yHXLhFWG7+RTDFhUQd3989OPv K/k4GG8sZhG0lyHUk4vx4X7gScR3vK76vHBH6J8lciYAowk4h9peL2u0DTWIRPgnd81S dZiN+jYkG9PzDHuqH8pUjtwKPVflEfcVKfmHNjim9w0ayNdHieU109tgGjFNf3M2oEUP Ke9g== X-Gm-Message-State: AOAM533H7ylC+TSocqrTyuIb2s2PUKmJlA6cm8cRB1y8zWDMIXNqlJhY auXYwul9YBJc+g1XH7O6GZZBtFCfzU6BwelUKMmmDO4V X-Google-Smtp-Source: ABdhPJwN1DUWR9fjlvLzAAnRDb10aAPagg3wsxFP1tgMwwqjRPwz8kE0aVQnqwssOLqnyFRWFV/RIGhgjISa5pfNMAQ= X-Received: by 2002:a92:bac5:: with SMTP id t66mr9736955ill.250.1605557895397; Mon, 16 Nov 2020 12:18:15 -0800 (PST) MIME-Version: 1.0 References: <20201116045758.21774-1-sargun@sargun.me> <20201116045758.21774-4-sargun@sargun.me> <20201116144240.GA9190@redhat.com> <20201116163615.GA17680@redhat.com> In-Reply-To: <20201116163615.GA17680@redhat.com> From: Amir Goldstein Date: Mon, 16 Nov 2020 22:18:03 +0200 Message-ID: Subject: Re: [RFC PATCH 3/3] overlay: Add the ability to remount volatile directories when safe To: Vivek Goyal Cc: Sargun Dhillon , overlayfs , Miklos Szeredi , Alexander Viro , Giuseppe Scrivano , Daniel J Walsh , David Howells , linux-fsdevel , Chengguang Xu Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org On Mon, Nov 16, 2020 at 6:36 PM Vivek Goyal wrote: > > On Mon, Nov 16, 2020 at 05:20:04PM +0200, Amir Goldstein wrote: > > On Mon, Nov 16, 2020 at 4:42 PM Vivek Goyal wrote: > > > > > > On Sun, Nov 15, 2020 at 08:57:58PM -0800, Sargun Dhillon wrote: > > > > Overlayfs added the ability to setup mounts where all syncs could be > > > > short-circuted in (2a99ddacee43: ovl: provide a mount option "volatile"). > > > > > > > > A user might want to remount this fs, but we do not let the user because > > > > of the "incompat" detection feature. In the case of volatile, it is safe > > > > to do something like[1]: > > > > > > > > $ sync -f /root/upperdir > > > > $ rm -rf /root/workdir/incompat/volatile > > > > > > > > There are two ways to go about this. You can call sync on the underlying > > > > filesystem, check the error code, and delete the dirty file if everything > > > > is clean. If you're running lots of containers on the same filesystem, or > > > > you want to avoid all unnecessary I/O, this may be suboptimal. > > > > > > > > > > Hi Sargun, > > > > > > I had asked bunch of questions in previous mail thread to be more > > > clear on your requirements but never got any response. It would > > > have helped understanding your requirements better. > > > > > > How about following patch set which seems to sync only dirty inodes of > > > upper belonging to a particular overlayfs instance. > > > > > > https://lore.kernel.org/linux-unionfs/20201113065555.147276-1-cgxu519@mykernel.net/ > > > > > > So if could implement a mount option which ignores fsync but upon > > > syncfs, only syncs dirty inodes of that overlayfs instance, it will > > > make sure we are not syncing whole of the upper fs. And we could > > > do this syncing on unmount of overlayfs and remove dirty file upon > > > successful sync. > > > > > > Looks like this will be much simpler method and should be able to > > > meet your requirements (As long as you are fine with syncing dirty > > > upper inodes of this overlay instance on unmount). > > > > > > > Do note that the latest patch set by Chengguang not only syncs dirty > > inodes of this overlay instance, but also waits for in-flight writeback on > > all the upper fs inodes and I think that with !ovl_should_sync(ofs) > > we will not re-dirty the ovl inodes and lose track of the list of dirty > > inodes - maybe that can be fixed. > > > > Also, I am not sure anymore that we can safely remove the dirty file after > > sync dirty inodes sync_fs and umount. If someone did sync_fs before us > > and consumed the error, we may have a copied up file in upper whose > > data is not on disk, but when we sync_fs on unmount we won't get an > > error? not sure. > > May be we can save errseq_t when mounting overlay and compare with > errseq_t stored in upper sb after unmount. That will tell us whether > error has happened since we mounted overlay. (Similar to what Sargun > is doing). > I suppose so. > In fact, if this is a concern, we have this issue with user space > "sync " too? Other sync might fail and this one succeeds > and we will think upper is just fine. May be container tools can > keep a file/dir open at the time of mount and call syncfs using > that fd instead. (And that should catch errors since that fd > was opened, I am assuming). > Did not understand the problem with userspace sync. > > > > I am less concerned about ways to allow re-mount of volatile > > overlayfs than I am about turning volatile overlayfs into non-volatile. > > If we are not interested in converting volatile containers into > non-volatile, then whole point of these patch series is to detect > if any writeback error has happened or not. If writeback error has > happened, then we detect that at remount and possibly throw away > container. > > What happens today if writeback error has happened. Is that page thrown > away from page cache and read back from disk? IOW, will user lose > the data it had written in page cache because writeback failed. I am > assuming we can't keep the dirty page around for very long otherwise > it has potential to fill up all the available ram with dirty pages which > can't be written back. > Right. the resulting data is undefined after error. > Why is it important to detect writeback error only during remount. What > happens if container overlay instance is already mounted and writeback > error happens. We will not detct that, right? > > IOW, if capturing writeback error is important for volatile containers, > then capturing it only during remount time is not enough. Normally > fsync/syncfs should catch it and now we have skipped those, so in > the process we lost mechanism to detect writeback errrors for > volatile containers? > Yes, you are right. It's an issue with volatile that we should probably document. I think upper files data can "evaporate" even as the overlay is still mounted. Thanks, Amir.