Date: Thu, 15 Jun 2023 10:08:20 +1000
From: Dave Chinner
To: Christoph Hellwig
Cc: Sergei Shtepa, axboe@kernel.dk, corbet@lwn.net, snitzer@kernel.org,
 viro@zeniv.linux.org.uk, brauner@kernel.org, dchinner@redhat.com,
 willy@infradead.org, dlemoal@kernel.org, linux@weissschuh.net,
 jack@suse.cz, ming.lei@redhat.com, linux-block@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, Donald Buczek
Subject: Re: [PATCH v5 04/11] blksnap: header file of the module interface
References: <20230612135228.10702-1-sergei.shtepa@veeam.com>
 <20230612135228.10702-5-sergei.shtepa@veeam.com>
 <733f591e-0e8f-8668-8298-ddb11a74df81@veeam.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 14, 2023 at 07:07:16AM -0700, Christoph Hellwig wrote:
> On Wed, Jun 14, 2023 at 11:26:20AM +0200, Sergei Shtepa wrote:
> > This code worked quite successfully for the veeamsnap module, on the
> > basis of which blksnap was created. Indeed, such an allocation of an
> > area on a block device using a file does not look safe.
> >
> > We've already discussed this with Donald Buczek.
> > Link: https://github.com/veeam/blksnap/issues/57#issuecomment-1576569075
> > And I have planned work on moving to a more secure ioctl in the future.
> > Link: https://github.com/veeam/blksnap/issues/61
> >
> > Now, thanks to Dave, it becomes clear to me how to solve this problem best.
> > swapfile is a good example of how to do it right.
>
> I don't actually think swapfile is a very good idea, in fact the Linux
> swap code in general is not a very good place to look for inspirations
> :)

Yeah, the swapfile implementation isn't very nice, I was really just
using it as an example of how we can implement the requirements of
block mapping delegation in a safe manner to a kernel subsystem.

I think the important part is the swapfile inode flag, because that
is what keeps userspace from being able to screw with the file while
the kernel is using it and allows us to do read/write IO to
unwritten extents without converting them to written...

> IFF the usage is always to have a whole file for the diff storage the
> over all API is very simple - just pass a fd to the kernel for the area,
> and then use in-kernel direct I/O on it.

Yeah, I was thinking a fd is a better choice for the UAPI as it
frees up the kernel implementation, and it doesn't need us to pass a
separate bdev identifier in the ioctl. It also means we can pass a
regular file or a block device and the kernel code doesn't need to
care that they are different.
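
To make the fd-based idea concrete, here is a minimal userspace sketch of
what the caller's side might look like. It is only an illustration under
stated assumptions: `struct diff_storage_arg`, `diff_storage_classify()`
and `diff_storage_prepare()` are hypothetical names invented here, not part
of the blksnap UAPI, and no real ioctl is issued. It shows that one fd can
stand for either a regular file or a block device, and that a regular file
can be preallocated up front (on filesystems such as XFS or ext4 this
typically leaves unwritten extents).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* HYPOTHETICAL ioctl argument: not the real blksnap interface. */
struct diff_storage_arg {
	int fd;		/* regular file or block device opened by userspace */
};

enum diff_storage_kind {
	DIFF_STORAGE_INVALID = -1,
	DIFF_STORAGE_FILE = 0,
	DIFF_STORAGE_BDEV = 1,
};

/* Classify the fd by inode type; anything else is rejected. */
static enum diff_storage_kind diff_storage_classify(int fd)
{
	struct stat st;

	if (fstat(fd, &st) != 0)
		return DIFF_STORAGE_INVALID;
	if (S_ISREG(st.st_mode))
		return DIFF_STORAGE_FILE;
	if (S_ISBLK(st.st_mode))
		return DIFF_STORAGE_BDEV;
	return DIFF_STORAGE_INVALID;
}

/*
 * Open the backing object and fill in the (hypothetical) argument.
 * A regular file is preallocated to 'size' bytes; a block device is
 * used as-is. Returns 0 on success, -1 on failure.
 */
static int diff_storage_prepare(const char *path, off_t size,
				struct diff_storage_arg *arg)
{
	int fd;

	assert(path != NULL && arg != NULL);

	fd = open(path, O_RDWR | O_CLOEXEC);
	if (fd < 0)
		return -1;

	switch (diff_storage_classify(fd)) {
	case DIFF_STORAGE_FILE:
		/* posix_fallocate() returns 0 or an errno value. */
		if (posix_fallocate(fd, 0, size) != 0) {
			close(fd);
			return -1;
		}
		break;
	case DIFF_STORAGE_BDEV:
		break;
	default:
		close(fd);
		return -1;
	}

	arg->fd = fd;	/* the caller would now pass arg to the ioctl */
	return 0;
}
```

Whether the kernel side then pins the file swapfile-style or simply does
in-kernel direct I/O on it is a separate decision; the caller's side looks
the same either way.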

If you think direct IO is a better idea, then I have no objection to
that - I haven't looked into the implementation that deeply at this
point. I wanted to get an understanding of how all the pieces went
together first, so all I've read is the documentation and looked at
the UAPI.

I made a leap from that: the documentation keeps talking about using
files on the filesystem for the difference storage, but the only UAPI
for telling the kernel about storage regions it can use is this
physical bdev LBA mapping ioctl. Hence if file storage is being
used....

> Now if that file should also
> be able to reside on the same file system that the snapshot is taken
> of things get a little more complicated, because writes to it also need
> to automatically set the BIO_REFFED flag. I have some ideas for that
> and will share some draft code with you.

Cool, I look forward to the updates; I know of a couple of
applications that could make use of this functionality right away....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com