From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Alexander Gordeev <alex@gordick.net>
Cc: Chao Yu <chao@kernel.org>,
"linux-f2fs-devel@lists.sourceforge.net"
<linux-f2fs-devel@lists.sourceforge.net>
Subject: Re: video archive on a microSD card
Date: Thu, 1 Sep 2016 13:07:18 -0700
Message-ID: <20160901200718.GB20281@jaegeuk>
In-Reply-To: <9581472749471@web24h.yandex.ru>

On Thu, Sep 01, 2016 at 08:04:31PM +0300, Alexander Gordeev wrote:
> Hi Jaegeuk,
>
> 29.08.2016, 21:24, "Jaegeuk Kim" <jaegeuk@kernel.org>:
> > On Mon, Aug 29, 2016 at 03:23:06PM +0300, Alexander Gordeev wrote:
> >> Hi Jaegeuk,
> >>
> >> 27.08.2016, 04:20, "Jaegeuk Kim" <jaegeuk@kernel.org>:
> >> > On Thu, Aug 25, 2016 at 11:14:03PM +0300, Alexander Gordeev wrote:
> >> >> Hi Jaegeuk,
> >> >>
> >> >> Thanks for all the help!
> >> >>...
> >> >> > This is the number of dirty segments, so it needs to consider section and
> >> >> > segment at the same time; a dirty section can consist of valid and free
> >> >> > segments.
> >> >> > How about testing a 2MB-sized section, which is the default option?
> >> >>
> >> >> I tried what you said. Still the majority of segments are dirty for some reason:
> >> >>
> >> >> =====[ partition info(sda). #0, RW]=====
> >> >> [SB: 1] [CP: 2] [SIT: 6] [NAT: 114] [SSA: 116] [MAIN: 59149(OverProv:3003 Resv:48)]
> >> >>
> >> >> Utilization: 10% (3013093 valid blocks)
> >> >> - Node: 3528 (Inode: 548, Other: 2980)
> >> >> - Data: 3009565
> >> >> - Inline_xattr Inode: 0
> >> >> - Inline_data Inode: 0
> >> >> - Inline_dentry Inode: 0
> >> >> - Orphan Inode: 0
> >> >>
> >> >> Main area: 59149 segs, 59149 secs 59149 zones
> >> >> - COLD data: 7183, 7183, 7183
> >> >> - WARM data: 6654, 6654, 6654
> >> >> - HOT data: 59134, 59134, 59134
> >> >> - Dir dnode: 59127, 59127, 59127
> >> >> - File dnode: 59125, 59125, 59125
> >> >> - Indir nodes: 59129, 59129, 59129
> >> >>
> >> >> - Valid: 300
> >> >> - Dirty: 6438
> >> >> - Prefree: 0
> >> >> - Free: 52411 (52411)
> >> >>
> >> >> CP calls: 1023 (BG: 473)
> >> >> GC calls: 470 (BG: 470)
> >> >> - data segments : 466 (466)
> >> >> - node segments : 4 (4)
> >> >> Try to move 152221 blocks (BG: 152221)
> >> >> - data blocks : 151417 (151417)
> >> >> - node blocks : 804 (804)
> >> >>
> >> >> Extent Cache:
> >> >> - Hit Count: L1-1:6262 L1-2:0 L2:0
> >> >> - Hit Ratio: 2% (6262 / 273606)
> >> >> - Inner Struct Count: tree: 292(0), node: 8
> >> >>
> >> >> Balancing F2FS Async:
> >> >> - inmem: 0, wb_bios: 0
> >> >> - nodes: 0 in 0
> >> >> - dents: 0 in dirs: 0 ( 0)
> >> >> - datas: 0 in files: 0
> >> >> - meta: 0 in 0
> >> >> - NATs: 0/ 43
> >> >> - SITs: 0/ 59149
> >> >> - free_nids: 3414
> >> >>
> >> >> Distribution of User Blocks: [ valid | invalid | free ]
> >> >> [-----|--|-------------------------------------------]
> >> >>
> >> >> IPU: 0 blocks
> >> >> SSR: 0 blocks in 0 segments
> >> >> LFS: 3691542 blocks in 7208 segments
> >> >>
> >> >> BDF: 95, avg. vblocks: 444
> >> >>
> >> >> Memory: 12662 KB
> >> >> - static: 12597 KB
> >> >> - cached: 64 KB
> >> >> - paged : 0 KB
> >> >>
> >> >> But the archive is working perfectly as before.
> >> >
> >> > Okay, so we need to gather more information about IO traces. :)
> >> >
> >> > Could you get them by:
> >> >
> >> > echo 1 > /sys/kernel/debug/tracing/events/f2fs/f2fs_submit_write_bio/enable
> >> > echo 1 > /sys/kernel/debug/tracing/events/f2fs/f2fs_submit_page_mbio/enable
> >> > echo 1 > /sys/kernel/debug/tracing/tracing_on
> >> > cat /sys/kernel/debug/tracing/trace_pipe
> >> >
> >> > You can get a script in f2fs-tools.git/scripts/tracepoint.sh
> >>
> >> I collected the trace. It is attached. Thanks!
> >
> > Thanks.
> >
> > What I've found from your trace:
> > - there are two files (ino=17690, ino=17691) which share the data log.
> > - ino=17690 writes data sequentially, and ino=17691 writes small data randomly.
> > - ino=17690 writes misaligned 4KB blocks about every 296KB, which produces
> > dirty segments.
> >
> > Could you check that all the writes and truncations in your app are aligned to 4KB?
> > And, if ino=17691 is sqlite, we need to check whether it is really using another
> > data log.
>
> I collected more logs from both kernel tracing and strace and tried to get more
> understanding of this. I think, I get what's wrong now.
>
> ino=17690 is a video file. ino=17691 is not SQLite, it is an index file. It is written
> 24 bytes per frame. Here is a small piece of strace log for writing a single frame:
>
> write(19, "...", 4) = 4
> write(19, "...", 4) = 4
> write(19, "...", 2432) = 2432
> write(20, "...", 24) = 24
>
> The first three writes go to the video file (a 4-byte stream id, then a 4-byte length,
> and then the actual frame); the fourth one writes to an index file. Yes, I know,
> this looks ugly. :)
> None of the writes are aligned to 4096, but there are no truncations, only
> appending.
>
> Then, I think, I see the f2fs worker thread wake up about every two seconds to
> write dirty pages. Unfortunately, it seems to write everything collected so far, even
> the most recent pages, which are not fully filled yet. I'd say it cannot be
> expected that every app will write data aligned to 4096 bytes. So this means
> more overhead and overwrites even in the more general case. Is it different in
> mode=adaptive?

No, the flushing time is controlled by the VM, and you can tune it through proc.
And, IMO, even if those files are append-only, it'd be worth splitting the index
and media files into different logs; using the cold log for the media files only
would be advisable.

> The 296KB size, probably, comes from my bitrate, which is about 142KB/s, times
> 2 seconds. It is roughly the right size.
> My video FPS is about 30, so the amount of data written to the index is about 1440
> bytes in two seconds. This is why it looks like random writes, I think.
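
As a rough check of those numbers, here is a small shell sketch. The variable
names are made up for illustration, and the 2-second interval is the flush
period observed in the trace, not a value read from the system:

```shell
#!/bin/sh
# Figures taken from the message above; variable names are illustrative only.
bitrate_kb_per_s=142      # video bitrate, KB/s
flush_interval_s=2        # observed writeback period, seconds
fps=30                    # frames per second
index_record_bytes=24     # index bytes written per frame
block_bytes=4096          # block size the writes are compared against

video_kb_per_flush=$((bitrate_kb_per_s * flush_interval_s))
index_bytes_per_flush=$((fps * index_record_bytes * flush_interval_s))
# Number of flushes that touch one 4KB index block before it is full (rounded up).
flushes_per_index_block=$(( (block_bytes + index_bytes_per_flush - 1) / index_bytes_per_flush ))

echo "video data per flush: ${video_kb_per_flush} KB"
echo "index data per flush: ${index_bytes_per_flush} bytes"
echo "flushes per 4KB index block: ${flushes_per_index_block}"
```

This gives 284 KB of video and 1440 bytes of index per flush, so the tail block
of each file is only partially filled at every flush, and one index block is
written out about three times before it fills.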
>
> Also, I see from my new traces that f2fs_submit_write_bio for other inodes
> is writing to completely different sectors. Looks like the "cold" data feature
> is working well.
>
> To conclude:
> 1. I think I can leave everything as is because (1) there is a small number of
> rewrites and (2) I start rotating the archive at 95% utilization so given the tiny
> amount of data in index and sqlite files, this should be ok, I hope.

If both the index and media files are deleted before cleaning kicks in, IMO,
it'd be fine. You can check the cleaning information in the status file.

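
A minimal sketch of pulling just the cleaning counters out of the status file.
It assumes debugfs is mounted at /sys/kernel/debug and that the status output
uses the same labels as the dump quoted earlier in this thread; the function
name is made up, and other kernel versions may label these lines differently:

```shell
#!/bin/sh
# Print the GC (cleaning) counters from an f2fs status dump.
# The path argument exists so the function can be pointed at a saved copy;
# by default it reads the live debugfs file (usually needs root).
show_cleaning_stats() {
    status="${1:-/sys/kernel/debug/f2fs/status}"
    grep -E 'GC calls|Try to move|(data|node) (segments|blocks)' "$status"
}

# Only read the live file if it is actually accessible.
if [ -r /sys/kernel/debug/f2fs/status ]; then
    show_cleaning_stats
fi
```

If the GC call and moved-block counters stay roughly flat while the archive
rotates, cleaning is not being triggered much.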
> 2. But I'd better write both video and index files at 4096-byte boundaries.
> 3. Or this should be fixed in f2fs. I think there should be a configurable amount
> of time to wait for a dirty page to expire. It should be written only after expiration.
> Unless a user calls fsync(), of course. Is there such a tunable?
>
> Does this make sense?

Yeah, I think you can tune the flush timing through proc entries
(e.g., /proc/sys/vm/dirty_writeback_centisecs).

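
A rough sketch for looking at the current values. dirty_expire_centisecs is a
related VM knob that is not named above; it is included here on the assumption
that it is the closest match to the "time to expire" being asked about. The
function name is made up, and the values are in hundredths of a second:

```shell
#!/bin/sh
# Print the VM writeback knobs (values are in centiseconds).
# The directory argument is only there so the function can be pointed at a
# scratch directory; on a real system it defaults to /proc/sys/vm.
show_writeback_knobs() {
    dir="${1:-/proc/sys/vm}"
    for knob in dirty_writeback_centisecs dirty_expire_centisecs; do
        if [ -r "$dir/$knob" ]; then
            printf '%s = %s\n' "$knob" "$(cat "$dir/$knob")"
        else
            printf '%s = (not readable)\n' "$knob"
        fi
    done
}

show_writeback_knobs

# To change a value (needs root; 3000, i.e. 30 seconds, is only an example):
#   echo 3000 > /proc/sys/vm/dirty_expire_centisecs
```

Note that these settings are system-wide, so they affect writeback for every
filesystem on the machine, not only f2fs.
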
Thanks,
>
> The new traces are attached for reference. They are filtered somewhat
> for easier side-by-side viewing.
>
> Thanks!
>
> --
> Alexander