Date: Tue, 1 Nov 2022 17:45:07 -0700
From: "Darrick J. Wong"
To: Shiyang Ruan
Cc: Dan Williams, Dave Chinner, "yangx.jy@fujitsu.com", "Yasunori Gotou (Fujitsu)", Brian Foster, "hch@infradead.org", "linux-kernel@vger.kernel.org", "linux-xfs@vger.kernel.org", "nvdimm@lists.linux.dev", "linux-fsdevel@vger.kernel.org", "zwisler@kernel.org", Jeff Moyer, "dm-devel@redhat.com", "toshi.kani@hpe.com", Theodore Ts'o
Subject: Re: [PATCH] xfs: fail dax mount if reflink is enabled on a partition
References: <6a83a56e-addc-f3c4-2357-9589a49bf582@fujitsu.com> <20221023220018.GX3600936@dread.disaster.area> <20221024053109.GY3600936@dread.disaster.area> <635b325d25889_6be129446@dwillia2-xfh.jf.intel.com.notmuch> <7a3aac47-1492-a3cc-c53a-53c908f4f857@fujitsu.com>
In-Reply-To: <7a3aac47-1492-a3cc-c53a-53c908f4f857@fujitsu.com>
X-Mailing-List: linux-xfs@vger.kernel.org

On Sun, Oct 30, 2022 at 05:31:43PM +0800, Shiyang Ruan wrote:
>
>
> On 2022/10/28 9:37, Dan Williams wrote:
> > Darrick J. Wong wrote:
> > > [add tytso to cc since he asked about "How do you actually /get/ fsdax
> > > mode these days?" this morning]
> > >
> > > On Tue, Oct 25, 2022 at 10:56:19AM -0700, Darrick J. Wong wrote:
> > > > On Tue, Oct 25, 2022 at 02:26:50PM +0000, ruansy.fnst@fujitsu.com wrote:
>
> ...skip...
>
> > > >
> > > > Nope. Since the announcement of pmem as a product, I have had 15
> > > > minutes of access to one preproduction prototype server with actual
> > > > optane DIMMs in them.
> > > >
> > > > I have /never/ had access to real hardware to test any of this, so it's
> > > > all configured via libvirt to simulate pmem in qemu:
> > > > https://lore.kernel.org/linux-xfs/YzXsavOWMSuwTBEC@magnolia/
> > > >
> > > > /run/mtrdisk/[gh].mem are both regular files on a tmpfs filesystem:
> > > >
> > > > $ grep mtrdisk /proc/mounts
> > > > none /run/mtrdisk tmpfs rw,relatime,size=82894848k,inode64 0 0
> > > >
> > > > $ ls -la /run/mtrdisk/[gh].mem
> > > > -rw-r--r-- 1 libvirt-qemu kvm 10739515392 Oct 24 18:09 /run/mtrdisk/g.mem
> > > > -rw-r--r-- 1 libvirt-qemu kvm 10739515392 Oct 24 19:28 /run/mtrdisk/h.mem
> > >
> > > Also forgot to mention that the VM with the fake pmem attached has a
> > > script to do:
> > >
> > > ndctl create-namespace --mode fsdax --map dev -e namespace0.0 -f
> > > ndctl create-namespace --mode fsdax --map dev -e namespace1.0 -f
> > >
> > > Every time the pmem device gets recreated, because apparently that's the
> > > only way to get S_DAX mode nowadays?
> >
> > If you have noticed a change here it is due to VM configuration not
> > anything in the driver.
> >
> > If you are interested there are two ways to get pmem declared the legacy
> > way that predates any of the DAX work, the kernel calls it E820_PRAM,
> > and the modern way by platform firmware tables like ACPI NFIT. The
> > assumption with E820_PRAM is that it is dealing with battery backed
> > NVDIMMs of small capacity. In that case the /dev/pmem device can support
> > DAX operation by default because the necessary memory for the 'struct
> > page' array for that memory is likely small.
> >
> > Platform firmware defined PMEM can be terabytes. So the driver does not
> > enable DAX by default because the user needs to make policy choice about
> > burning gigabytes of DRAM for that metadata, or placing it in PMEM which
> > is abundant, but slower.
> > So what I suspect might be happening is your
> > configuration changed from something that auto-allocated the 'struct
> > page' array, to something that needed those commands you list above to
> > explicitly opt-in to reserving some PMEM capacity for the page metadata.
>
> I am using the same simulation environment as Darrick's and Dave's and have
> tested many times, but still cannot reproduce the failed cases they
> mentioned (dax+non_reflink mode, currently focusing) until now. Only a few
> cases randomly failed because of "target is busy". But IIRC, those failed
> cases you mentioned were failed with dmesg warning around the function
> "dax_associate_entry()" or "dax_disassociate_entry()". Since I cannot
> reproduce the failure, it's hard for me to continue solving the problem.

FWIW things have calmed down as of 6.1-rc3 -- if I disable reflink,
fstests runs without complaint. Now it only seems to be affecting
reflink=1 filesystems.

> And how is your recent test? Still failed with those dmesg warnings? If so,
> could you zip the test result and send it to me?

https://djwong.org/docs/kernel/daxbad.zip

--D

>
>
> --
> Thanks,
> Ruan
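
For a sense of scale on the 'struct page' trade-off Dan describes above, here is a rough back-of-the-envelope sketch. It assumes 4 KiB base pages and a 64-byte struct page; neither figure is stated in this thread. Both are typical for x86_64 but depend on kernel configuration, and the helper name struct_page_overhead() is made up for illustration.

```python
# Rough estimate of the 'struct page' metadata needed to map a pmem namespace.
# Assumptions (not from the thread): 4 KiB base pages and a 64-byte struct
# page, typical for x86_64 but dependent on kernel configuration.
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64


def struct_page_overhead(capacity_bytes: int) -> int:
    """Return the bytes of struct page metadata for capacity_bytes of pmem."""
    pages = capacity_bytes // PAGE_SIZE
    return pages * STRUCT_PAGE_SIZE


if __name__ == "__main__":
    examples = (
        ("g.mem-sized backing file", 10739515392),  # size from the ls output
        ("1 TiB of pmem", 1 << 40),
    )
    for label, capacity in examples:
        overhead_mib = struct_page_overhead(capacity) / (1 << 20)
        print(f"{label}: about {overhead_mib:.0f} MiB of struct page metadata")
```

Under those assumptions the overhead is 1/64 of the capacity, so the ~10 GiB test files cost very little while a multi-terabyte device runs into gigabytes, which is presumably why the placement is left as an explicit choice (the scripts quoted above pass --map dev to keep it on the pmem device).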