Date: Fri, 6 Oct 2023 15:31:05 +1100
From: Dave Chinner
To: Bart Van Assche
Cc: "Martin K. Petersen", John Garry, axboe@kernel.dk, kbusch@kernel.org,
	hch@lst.de, sagi@grimberg.me, jejb@linux.ibm.com, djwong@kernel.org,
	viro@zeniv.linux.org.uk, brauner@kernel.org, chandan.babu@oracle.com,
	dchinner@redhat.com, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, tytso@mit.edu,
	jbongio@google.com, linux-api@vger.kernel.org
Subject: Re: [PATCH 10/21] block: Add fops atomic write support
References: <5d26fa3b-ec34-bc39-ecfe-4616a04977ca@oracle.com>
	<34c08488-a288-45f9-a28f-a514a408541d@acm.org>
X-Mailing-List: linux-block@vger.kernel.org

On Thu, Oct 05, 2023 at 03:58:38PM -0700, Bart Van Assche wrote:
> On 10/5/23 15:36, Dave Chinner wrote:
> > $ lspci |grep -i nvme
> > 03:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
> > 06:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
> > $ cat /sys/block/nvme*n1/queue/write_cache
> > write back
> > write back
> > $
> >
> > That they have volatile writeback caches....
>
> It seems like what I wrote has been misunderstood completely. With
> "handling a power failure cleanly" I meant that power cycling a block device
> does not result in read errors nor in reading data that has never been written.

Then I don't see what your concern is. Single sector writes are
guaranteed atomic and have been for as long as I've worked in this
game. OTOH, multi-sector writes are not guaranteed to be atomic -
they can get torn on sector boundaries, but the individual sectors
within that write are guaranteed to be all-or-nothing.

Any hardware device that does not guarantee single sector write
atomicity (i.e. tears in the middle of a sector) is, by definition,
broken.
And we all know that broken hardware means nothing in the storage
stack works as it should, so I just don't see what point you are
trying to make...

> Although it is hard to find information about this topic, here is what I found
> online:
> * About certain SSDs with power loss protection:
>   https://us.transcend-info.com/embedded/technology/power-loss-protection-plp
> * About another class of SSDs with power loss protection:
>   https://www.kingston.com/en/blog/servers-and-data-centers/ssd-power-loss-protection
> * About yet another class of SSDs with power loss protection:
>   https://phisonblog.com/avoiding-ssd-data-loss-with-phisons-power-loss-protection-2/

Yup, devices that behave as if they have non-volatile write caches.
Such devices have been around for more than 30 years, they operate
the same as devices without caches at all.

> So far I haven't found any information about hard disks and power failure
> handling. What I found is that most current hard disks protect data with ECC.
> The ECC mechanism should provide good protection against reading data that
> has never been written. If a power failure occurs while a hard disk is writing
> a physical block, can this result in a read error after power is restored? If
> so, is this behavior allowed by storage standards?

If a power fail results in read errors from the storage media being
reported to the OS instead of the data that was present in the
sector before the power failure, then the device is broken. If there
is no data in the region being read because it has never been
written, then it should return zeros (no data) rather than stale
data or a media read error.

-Dave.
-- 
Dave Chinner
david@fromorbit.com
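
As an illustration of the torn-write behaviour described in the reply
above, here is a minimal sketch, not taken from the thread, that models
a multi-sector write interrupted by a power failure. The 512-byte
sector size, the in-memory "media" buffer and the helper names
torn_write() and classify_sectors() are all assumptions made for the
example; no real device or kernel interface is involved.

```python
SECTOR_SIZE = 512  # assumed logical sector size in bytes (hypothetical)


def torn_write(media: bytearray, offset: int, data: bytes,
               sectors_completed: int) -> None:
    """Model a power failure part-way through a multi-sector write.

    Only the first `sectors_completed` whole sectors of `data` reach the
    media; every later sector keeps its previous contents.
    """
    assert offset % SECTOR_SIZE == 0 and len(data) % SECTOR_SIZE == 0
    persisted = min(sectors_completed * SECTOR_SIZE, len(data))
    media[offset:offset + persisted] = data[:persisted]


def classify_sectors(media: bytearray, offset: int,
                     old: bytes, new: bytes) -> list:
    """Label each sector as holding the new data, the old data, or a mix."""
    states = []
    for i in range(0, len(new), SECTOR_SIZE):
        current = bytes(media[offset + i:offset + i + SECTOR_SIZE])
        if current == new[i:i + SECTOR_SIZE]:
            states.append("new")
        elif current == old[i:i + SECTOR_SIZE]:
            states.append("old")
        else:
            # A well-behaved device never produces this state.
            states.append("torn")
    return states


old = b"A" * (4 * SECTOR_SIZE)
new = b"B" * (4 * SECTOR_SIZE)
media = bytearray(old)

# Power is lost after two of the four sectors have been written.
torn_write(media, 0, new, sectors_completed=2)
states = classify_sectors(media, 0, old, new)
print(states)  # ['new', 'new', 'old', 'old']
```

The write as a whole is torn at a sector boundary, but each individual
sector is either entirely new or entirely old. A "torn" label on any
single sector would correspond to the hardware the reply calls broken.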