Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 2/3] btrfs: do not commit logs and transactions during link and rename operations
Date: Tue, 11 Aug 2020 14:33:17 -0400	[thread overview]
Message-ID: <32fd7959-4ea4-6930-74e5-e57c30a51733@toxicpanda.com> (raw)
In-Reply-To: <20200811114348.692115-1-fdmanana@kernel.org>

On 8/11/20 7:43 AM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> Since commit d4682ba03ef618 ("Btrfs: sync log after logging new name") we
> started to commit logs, and fallback to transaction commits when we failed
> to log the new names or commit the logs, after link and rename operations
> when the target inodes (or their parents) were previously logged in the
> current transaction. This was to avoid losing directories despite an
> explicit fsync on them when they are ancestors of some inode that got a
> new named logged, due to a link or rename operation. However that adds the
> cost of starting IO and waiting for it to complete, which can cause higher
> latencies for applications.
> 
> Instead of doing that, just make sure that when we log a new name for an
> inode we don't mark any of its ancestors as logged, so that if any one
> does an fsync against any of them, without doing any other change on them,
> the fsync commits the log. This way we only pay the cost of a log commit
> (or a transaction commit if something goes wrong or a new block group was
> created) if the application explicitly asks to fsync any of the parent
> directories.
> 
> Using dbench, which mixes several filesystems operations including renames,
> revealed some significant latency gains. The following script that uses
> dbench was used to test this:
> 
>    #!/bin/bash
> 
>    DEV=/dev/nvme0n1
>    MNT=/mnt/btrfs
>    MOUNT_OPTIONS="-o ssd -o space_cache=v2"
>    MKFS_OPTIONS="-m single -d single"
>    THREADS=16
> 
>    echo "performance" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
>    mkfs.btrfs -f $MKFS_OPTIONS $DEV
>    mount $MOUNT_OPTIONS $DEV $MNT
> 
>    dbench -t 300 -D $MNT $THREADS
> 
>    umount $MNT
> 
> The test was run on bare metal, no virtualization, on a box with 12 cores
> (Intel i7-8700), 64Gb of RAM and using a NVMe device, with a kernel
> configuration that is the default of typical distributions (debian in this
> case), without debug options enabled (kasan, kmemleak, slub debug, debug
> of page allocations, lock debugging, etc).
> 
> Results before this patch:
> 
>   Operation      Count    AvgLat    MaxLat
>   ----------------------------------------
>   NTCreateX    10750455     0.011   155.088
>   Close        7896674     0.001     0.243
>   Rename        455222     2.158  1101.947
>   Unlink       2171189     0.067   121.638
>   Deltree          256     2.425     7.816
>   Mkdir            128     0.002     0.003
>   Qpathinfo    9744323     0.006    21.370
>   Qfileinfo    1707092     0.001     0.146
>   Qfsinfo      1786756     0.001    11.228
>   Sfileinfo     875612     0.003    21.263
>   Find         3767281     0.025     9.617
>   WriteX       5356924     0.011   211.390
>   ReadX        16852694     0.003    9.442
>   LockX          35008     0.002     0.119
>   UnlockX        35008     0.001     0.138
>   Flush         753458     4.252  1102.249
> 
> Throughput 1128.35 MB/sec  16 clients  16 procs  max_latency=1102.255 ms
> 
> Results after this patch:
> 
> 16 clients, after
> 
>   Operation      Count    AvgLat    MaxLat
>   ----------------------------------------
>   NTCreateX    11471098     0.012   448.281
>   Close        8426396     0.001     0.925
>   Rename        485746     0.123   267.183
>   Unlink       2316477     0.080    63.433
>   Deltree          288     2.830    11.144
>   Mkdir            144     0.003     0.010
>   Qpathinfo    10397420     0.006    10.288
>   Qfileinfo    1822039     0.001     0.169
>   Qfsinfo      1906497     0.002    14.039
>   Sfileinfo     934433     0.004     2.438
>   Find         4019879     0.026    10.200
>   WriteX       5718932     0.011   200.985
>   ReadX        17981671     0.003    10.036
>   LockX          37352     0.002     0.076
>   UnlockX        37352     0.001     0.109
>   Flush         804018     5.015   778.033
> 
> Throughput 1201.98 MB/sec  16 clients  16 procs  max_latency=778.036 ms
> (+6.5% throughput, -29.4% max latency, -75.8% rename latency)
> 
> Test case generic/498 from fstests tests the scenario that the previously
> mentioned commit fixed.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Man this took me a while to grok, but I think I've got it.  The only thing is 
this patch doesn't apply to current (as of today) misc-next.  Thanks,

Josef

  reply	other threads:[~2020-08-11 18:33 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-11 11:43 [PATCH 2/3] btrfs: do not commit logs and transactions during link and rename operations fdmanana
2020-08-11 18:33 ` Josef Bacik [this message]
2020-08-12  8:29   ` Filipe Manana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=32fd7959-4ea4-6930-74e5-e57c30a51733@toxicpanda.com \
    --to=josef@toxicpanda.com \
    --cc=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox