From: Marc Lehmann <schmorp@schmorp.de>
To: linux-f2fs-devel@lists.sourceforge.net
Subject: f2fs for SMR drives
Date: Sat, 8 Aug 2015 15:51:17 +0200
Message-ID: <20150808135117.GA4594@schmorp.de>
Hi!
Sorry if this is the wrong address to ask about "user problems".
I am currently investigating various filesystems for use on drive-managed SMR
drives (e.g. the seagate 8TB disks). These drives have characteristics not
unlike flash (they want to be written in large batches), but are, of course,
still quite different.
I initially tried btrfs, ext4 and xfs, which, unsurprisingly, failed rather
miserably after a few hundred GB, dropping to ~30MB/s (or ~20MB/s in the
case of btrfs).
I also tried nilfs, which should be an almost perfect match for this
technology, but it performed even worse (I have no clue why; maybe nilfs
skips sectors when writing, which would explain it).
As a last resort, I tried f2fs, which initially performed absolutely great
(average write speed ~130MB/s over multiple terabytes).
However, I am running into a number of problems, and wonder if f2fs can
somehow be configured to work right.
First of all, I did most of my tests on linux-3.18.14, and recently
switched to 4.1.4. The filesystems were formatted with "-s7", the idea
being that writes occur in 256MB blocks as much as possible and, most
importantly, that space is freed in 256MB blocks, to keep fragmentation low.
Mount options included either plain noatime or
noatime,inline_xattr,inline_data,inline_dentry,flush_merge,extent_cache
(I suspect 4.1.4 doesn't implement flush_merge yet?).
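For reference, this is roughly the shape of the commands used - the device
name is taken from the status output below, the mount point is a stand-in:

    # format with 7 segments per section, then mount with the options above
    mkfs.f2fs -s7 /dev/dm-9
    mount -t f2fs \
        -o noatime,inline_xattr,inline_data,inline_dentry,flush_merge,extent_cache \
        /dev/dm-9 /mnt/smr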
My first problem concerns ENOSPC handling - on a 100% utilized filesystem,
cp and rsync happily continued to write without receiving any error, but no
write activity occurred (and the files never ended up on the filesystem).
Is this a known bug?
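To illustrate what I mean, a minimal sketch (the paths are hypothetical;
the filesystem is already at 100% utilization at this point):

    # on the 100% full f2fs mount - hypothetical path and file
    cp /tmp/testfile /mnt/smr/ && echo "cp exited 0"   # no ENOSPC reported
    sync
    ls -l /mnt/smr/testfile                            # yet the file never appears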
My second, much bigger problem concerns defragmentation. For testing, I
created a 128GB partition and kept writing an assortment of 200KB to
multiple-megabyte files to it. To stress-test it, I kept deleting random
files to create holes. After a while (around 84% utilisation), write
performance dropped to less than 1MB/s, and has stayed at this level ever
since for this filesystem.
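The workload was roughly of this shape (a sketch, not my actual script;
file sizes and the delete probability are assumptions matching the
description above):

    # write 200KB..~4MB files, randomly deleting some to punch holes
    cd /mnt/smr   # hypothetical mount point
    i=0
    while true; do
        dd if=/dev/urandom of=file.$i bs=1K \
           count=$((200 + RANDOM % 3800)) 2>/dev/null
        i=$((i + 1))
        # roughly every fourth iteration, delete a random existing file
        if (( RANDOM % 4 == 0 )); then
            rm -f "$(ls | shuf -n 1)"
        fi
    done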
I kept the filesystem idle for a night hoping for defragmentation, but
nothing happened. Suspecting in-place updates to be the culprit, I tried
various configurations in the hope of disabling them (such as setting
ipu_policy to 4 or 8, and/or setting min_ipu_util to 0 or 100), but that
also doesn't seem to have any effect whatsoever.
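Concretely, these are the knobs I poked (the dm-9 name matches the status
output below; whether these are even the right knobs is part of my
question):

    # per-device f2fs sysfs tunables, various combinations tried
    echo 4   > /sys/fs/f2fs/dm-9/ipu_policy
    echo 8   > /sys/fs/f2fs/dm-9/ipu_policy
    echo 0   > /sys/fs/f2fs/dm-9/min_ipu_util
    echo 100 > /sys/fs/f2fs/dm-9/min_ipu_util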
From the description of f2fs, it seems to be quite close to ideal for these
drives, as it should be possible to write mostly linearly and to keep
fragmentation low by freeing big sequential sections of data.
It's a pity that it comes so close and then fails so miserably after
performing so admirably initially - can anything be done about this by way
of configuration, or is my understanding of how f2fs writes and garbage
collects flawed?
Here is the output of /sys/kernel/debug/f2fs/status for the filesystem in
question. This was after keeping it idle for a night, then unmounting and
remounting the volume. Before the unmount, it had very high values in the
GC calls section, but no reads were observed during the night, just writes
(observed using dstat -Dsdx).
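(For completeness, the observation commands - the debugfs path is the one
quoted above, the disk name is a stand-in:

    cat /sys/kernel/debug/f2fs/status   # produces the dump below
    dstat -D sdx                        # per-disk read/write rates overnight
)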
=====[ partition info(dm-9). #1 ]=====
[SB: 1] [CP: 2] [SIT: 6] [NAT: 114] [SSA: 130] [MAIN: 65275(OverProv:2094 Resv:1456)]
Utilization: 84% (27320244 valid blocks)
- Node: 31936 (Inode: 5027, Other: 26909)
- Data: 27288308
- Inline_data Inode: 0
- Inline_dentry Inode: 0
Main area: 65275 segs, 9325 secs 9325 zones
- COLD data: 12063, 1723, 1723
- WARM data: 12075, 1725, 1725
- HOT data: 65249, 9321, 9321
- Dir dnode: 65269, 9324, 9324
- File dnode: 24455, 3493, 3493
- Indir nodes: 65260, 9322, 9322
- Valid: 52278
- Dirty: 9
- Prefree: 0
- Free: 12988 (126)
CP calls: 10843
GC calls: 91 (BG: 11)
- data segments : 21 (0)
- node segments : 70 (0)
Try to move 30355 blocks (BG: 0)
- data blocks : 7360 (0)
- node blocks : 22995 (0)
Extent Hit Ratio: 8267 / 24892
Extent Tree Count: 3130
Extent Node Count: 3138
Balancing F2FS Async:
- inmem: 0, wb: 0
- nodes: 0 in 5672
- dents: 0 in dirs: 0
- meta: 0 in 3567
- NATs: 0/ 9757
- SITs: 0/ 65275
- free_nids: 868
Distribution of User Blocks: [ valid | invalid | free ]
[------------------------------------------||--------]
IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 49114 blocks in 95 segments
BDF: 64, avg. vblocks: 1254
Memory: 48948 KB
- static: 11373 KB
- cached: 619 KB
- paged : 36956 KB
--
The choice of a Deliantra, the free code+content MORPG
-----==- _GNU_ http://www.deliantra.net
----==-- _ generation
---==---(_)__ __ ____ __ Marc Lehmann
--==---/ / _ \/ // /\ \/ / schmorp@schmorp.de
-=====/_/_//_/\_,_/ /_/\_\