From: Damien Le Moal
Organization: Western Digital Research
Date: Tue, 25 Jun 2024 06:55:29 +0900
Subject: Re: [PATCH v20 02/12] Add infrastructure for copy offload in block and request layer.
To: Bart Van Assche, Nitesh Shetty, Christoph Hellwig
Cc: Jens Axboe, Jonathan Corbet, Alasdair Kergon, Mike Snitzer,
    Mikulas Patocka, Keith Busch, Sagi Grimberg, Chaitanya Kulkarni,
    Alexander Viro, Christian Brauner, Jan Kara,
    martin.petersen@oracle.com, david@fromorbit.com, hare@suse.de,
    damien.lemoal@opensource.wdc.com, anuj20.g@samsung.com,
    joshi.k@samsung.com, nitheshshetty@gmail.com, gost.dev@samsung.com,
    linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, dm-devel@lists.linux.dev,
    linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org
X-Mailing-List: linux-fsdevel@vger.kernel.org
In-Reply-To: <4ea90738-afd1-486c-a9a9-f7e2775298ff@acm.org>

On 2024/06/25 1:25, Bart Van Assche wrote:
> On 6/24/24 3:44 AM, Nitesh Shetty wrote:
>> For reference, I have listed the approaches we have taken in the past.
>>
>> a. Token/payload based approach:
>>    1. Here we allocate a buffer/payload.
>>    2. First source BIO is sent along with the buffer.
>>    3. Once the buffer reaches driver, it is filled with the source LBA
>>       and length and namespace info. And the request is completed.
>>    4. Then destination BIO is sent with same buffer.
>>    5. Once the buffer reaches driver, it retrieves the source information from
>>       the BIO and forms a copy command and sends it down to device.
>>
>> We received feedback that putting anything inside payload which is not
>> data, is not a good idea[1].
>
> A token-based approach (pairing copy_src and copy_dst based on a token)
> is completely different from a payload-based approach (copy offload
> parameters stored in the bio payload). From [1] (I agree with what has
> been quoted): "In general every time we tried to come up with a request
> payload that is not just data passed to the device it has been a
> nightmare." [ ... ] "The only thing we'd need is a sequence number / idr
> / etc to find an input and output side match up, as long as we
> stick to the proper namespace scope."
>
>> c. List/ctx based approach:
>> A new member is added to bio, bio_copy_ctx, which will be a union with
>> bi_integrity. Idea is once a copy bio reaches blk_mq_submit_bio, it will
>> add the bio to this list.
>>    1. Send the destination BIO, once this reaches blk_mq_submit_bio, this
>>       will add the destination BIO to the list inside bi_copy_ctx and return
>>       without forming any request.
>>    2. Send source BIO, once this reaches blk_mq_submit_bio, this will
>>       retrieve the destination BIO from bi_copy_ctx and form a request with
>>       destination BIO and source BIO. After this request will be sent to
>>       driver.
>>
>> This work is still in POC phase[2]. But this approach makes lifetime
>> management of BIO complicated, especially during failure cases.
>
> Associating src and dst operations by embedding a pointer to a third
> data structure in struct bio is an implementation choice and is not the
> only possibility for associating src and dst operations. Hence, the
> bio lifetime complexity mentioned above is not inherent to the list
> based approach but is a result of the implementation choice made for
> associating src and dst operations.
>
> Has it been considered to combine the list-based approach for managing
> unpaired copy operations with the token based approach for pairing copy
> src and copy dst operations?

I am still a little confused as to why we need 2 BIOs, one for src and one for
dst... Is it because of the overly complex SCSI extended copy support?

Given that the main use case is copy offload for data within the same device,
using a single BIO which somehow can carry a list of LBA sources and a single
destination LBA would be far simpler and would perfectly match NVMe simple
copy and ATA write gathered. And I think that this would also match the
simplest case for SCSI extended copy.

-- 
Damien Le Moal
Western Digital Research
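
The single-BIO idea in the reply above (a list of source LBA ranges plus one
destination LBA) can be sketched in userspace C. This is only an illustration
under stated assumptions: struct copy_range, struct copy_desc,
copy_desc_total_lbas() and copy_desc_apply() are hypothetical names, not
kernel or NVMe definitions, and the "device" is a plain byte array. The
sketch assumes the source ranges are written back to back starting at the
destination LBA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: one contiguous source range on the device. */
struct copy_range {
	uint64_t src_lba;	/* first source LBA of this range */
	uint32_t nr_lbas;	/* number of logical blocks in this range */
};

/* Hypothetical: what a single copy operation would need to carry. */
struct copy_desc {
	uint64_t dst_lba;			/* single destination start LBA */
	size_t nr_ranges;			/* number of entries in ranges[] */
	const struct copy_range *ranges;	/* list of source ranges */
};

/* Total number of blocks that will be written at the destination. */
static uint64_t copy_desc_total_lbas(const struct copy_desc *d)
{
	uint64_t total = 0;
	size_t i;

	assert(d != NULL);
	for (i = 0; i < d->nr_ranges; i++)
		total += d->ranges[i].nr_lbas;
	return total;
}

/*
 * Apply the descriptor to an in-memory "device" of dev_lbas blocks of
 * lba_size bytes each. Source ranges land back to back at dst_lba.
 * Assumes the source ranges do not overlap the destination area.
 * Returns 0 on success, -1 if any range falls outside the device.
 */
static int copy_desc_apply(uint8_t *dev, uint64_t dev_lbas, uint32_t lba_size,
			   const struct copy_desc *d)
{
	uint64_t total, dst;
	size_t i;

	assert(dev != NULL && d != NULL);

	total = copy_desc_total_lbas(d);
	if (d->dst_lba > dev_lbas || total > dev_lbas - d->dst_lba)
		return -1;

	/* Validate every source range before touching the destination. */
	for (i = 0; i < d->nr_ranges; i++) {
		const struct copy_range *r = &d->ranges[i];

		if (r->src_lba > dev_lbas || r->nr_lbas > dev_lbas - r->src_lba)
			return -1;
	}

	dst = d->dst_lba;
	for (i = 0; i < d->nr_ranges; i++) {
		const struct copy_range *r = &d->ranges[i];

		memmove(dev + dst * lba_size, dev + r->src_lba * lba_size,
			(size_t)r->nr_lbas * lba_size);
		dst += r->nr_lbas;
	}
	return 0;
}
```

With a descriptor like this, source and destination travel together, so
there is nothing to pair up later and no second object whose lifetime has to
be tracked. How such a list would actually be attached to a bio is left open
here, as it is in the reply.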