From: Alexander Gordeev <alex@gordick.net>
To: Chao Yu <chao@kernel.org>,
"linux-f2fs-devel@lists.sourceforge.net"
<linux-f2fs-devel@lists.sourceforge.net>
Subject: Re: video archive on a microSD card
Date: Thu, 18 Aug 2016 14:04:55 +0300
Message-ID: <1184081471518295@web5m.yandex.ru>
In-Reply-To: <bd0fc9a6-03d3-e32c-f6c7-f64d4ba8506e@kernel.org>
Hi Chao,

Thanks for your continued interest!

17.08.2016, 18:54, "Chao Yu" <chao@kernel.org>:
> Hi Alexander,
>
> On 2016/8/17 17:47, Alexander Gordeev wrote:
>> Hi Chao,
>>
>>> On 2016/8/15 20:22, Alexander Gordeev wrote:
>>>> 15.08.2016, 14:58, "Chao Yu" <yuch...@huawei.com>:
>>>>> Hi Alexander,
>>>>>
>>>>> On 2016/8/15 18:47, Alexander Gordeev wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> 12.08.2016, 14:52, "Alexander Gordeev" <a...@gordick.net>:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I hope I'm writing to the right mailing list. If not please give me
>>>>>>> directions to the right place.
>>>>>>> I'm trying to write video archive to a microSD card on an ARM based IP
>>>>>>> camera. The camera's SDK uses Linux 3.10.
>>>>>>> The kernel is quite old and F2FS is there since 3.8 AFAIK so it's
>>>>>>> probably not mature yet. However, I decided to give it a try.
>>>>>>> The idea is to write video continuously in 5 minute chunks. I also have
>>>>>>> an index file for each archive chunk file, for faster seeks, and a single
>>>>>>> SQLite database.
>>>>>>> When utilization is about 95%, the chunks and their indexes from the
>>>>>>> archive tail are deleted. So it's like a ring buffer. Also the
>>>>>>> overprovision ratio is the default 5%.
>>>>>>> It worked quite well for several days with about 95% utilization, but
>>>>>>> then today it went bad. Writes quite often take several seconds, as
>>>>>>> shown by strace.
>>>>>>> vmstat shows that my process waits for IO most of the time:
>>>>>>>
>>>>>>> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>>>>>>> r b swpd free buff cache si so bi bo in cs us sy id wa
>>>>>>> 2 2 0 6352 4 35924 0 0 8 88 562 1316 41 7 0 52
>>>>>>> 1 2 0 6324 4 35928 0 0 4 44 553 1231 40 8 0 52
>>>>>>> 1 2 0 6324 4 35928 0 0 0 0 690 1471 36 10 0 54
>>>>>>> 1 3 0 6296 4 35932 0 0 0 0 530 1242 40 5 0 54
>>>>>>> 2 2 0 6296 4 35936 0 0 4 48 545 1244 40 6 0 54
>>>>>>> 1 2 0 6296 4 35940 0 0 4 44 549 1275 39 6 0 55
>>>>>>> 2 2 0 6288 4 35944 0 0 4 44 563 1315 39 8 0 53
>>>>>>> 3 2 0 6296 4 35944 0 0 0 0 502 1158 41 2 0 57
>>>>>>> 1 3 0 6296 4 35952 0 0 8 88 700 1527 40 9 0 51
>>>>>>> 1 2 0 6296 4 35952 0 0 0 0 482 1141 38 8 0 55
>>>>>>> 1 2 0 6296 4 35956 0 0 4 44 594 1383 38 13 0 49
>>>>>>> 1 2 0 6296 4 35956 0 0 0 0 489 1160 37 5 0 58
>>>>>>> 5 1 0 6268 4 35980 0 0 12 132 704 1565 42 9 0 49
>>>>>>> 2 1 0 6268 4 35984 0 0 4 44 531 1215 39 10 0 51
>>>>>>> 3 2 0 6268 4 35992 0 0 8 92 714 1574 36 9 0 55
>>>>>>> 1 1 0 6268 4 35992 0 0 0 0 485 1163 39 6 0 55
>>>>>>> 1 1 0 6240 4 36000 0 0 8 92 553 1282 38 9 0 53
>>>>>>> 1 2 0 6240 4 36000 0 0 0 0 488 1135 39 7 0 54
>>>>>>> 1 1 0 6240 4 36000 0 0 0 0 552 1264 39 9 0 52
>>>>>>> 3 1 0 6240 4 36000 0 0 0 0 510 1187 40 6 0 54
>>>>>>> 1 2 0 6240 4 36004 0 0 4 44 674 1496 43 8 0 49
>>>>>>> 1 1 0 6240 4 36012 0 0 8 88 572 1373 39 9 0 53
>>>>>>> 4 1 0 6232 4 36016 0 0 4 48 549 1248 41 4 0 55
>>>>>>> 3 1 0 6240 4 36016 0 0 0 0 520 1209 36 8 0 55
>>>>>>>
>>>>>>> Here is also /sys/kernel/debug/f2fs/status for reference:
>>>>>>> =====[ partition info(sda). #0 ]=====
>>>>>>> [SB: 1] [CP: 2] [SIT: 4] [NAT: 118] [SSA: 60] [MAIN: 29646(OverProv:1529 Resv:50)]
>>>>>>>
>>>>>>> Utilization: 94% (13597314 valid blocks)
>>>>>>> - Node: 16395 (Inode: 2913, Other: 13482)
>>>>>>> - Data: 13580919
>>>>>>>
>>>>>>> Main area: 29646 segs, 14823 secs 14823 zones
>>>>>>> - COLD data: 3468, 1734, 1734
>>>>>>> - WARM data: 12954, 6477, 6477
>>>>>>> - HOT data: 28105, 14052, 14052
>>>>>>> - Dir dnode: 29204, 14602, 14602
>>>>>>> - File dnode: 19960, 9980, 9980
>>>>>>> - Indir nodes: 29623, 14811, 14811
>>>>>>>
>>>>>>> - Valid: 13615
>>>>>>> - Dirty: 13309
>>>>>>> - Prefree: 0
>>>>>>> - Free: 2722 (763)
>>>>>>>
>>>>>>> GC calls: 8622 (BG: 4311)
>>>>>>> - data segments : 8560
>>>>>>> - node segments : 62
>>>>>>> Try to move 3552161 blocks
>>>>>>> - data blocks : 3540278
>>>>>>> - node blocks : 11883
>>>>>>>
>>>>>>> Extent Hit Ratio: 49 / 4171
>>>>>>>
>>>>>>> Balancing F2FS Async:
>>>>>>> - nodes 6 in 141
>>>>>>> - dents 0 in dirs: 0
>>>>>>> - meta 13 in 346
>>>>>>> - NATs 16983 > 29120
>>>>>>> - SITs: 17
>>>>>>> - free_nids: 1861
>>>>>>>
>>>>>>> Distribution of User Blocks: [ valid | invalid | free ]
>>>>>>> [-----------------------------------------------|-|--]
>>>>>>>
>>>>>>> SSR: 1230719 blocks in 14834 segments
>>>>>>> LFS: 15150190 blocks in 29589 segments
>>>>>>>
>>>>>>> BDF: 89, avg. vblocks: 949
>>>>>>>
>>>>>>> Memory: 6754 KB = static: 4763 + cached: 1990
>>>>>>>
>>>>>>> Please note that I tried to put all the archive and index files into the
>>>>>>> cold area using the file extensions mechanism.
>>>>>>> And indeed I saw numbers close to the total amount of segments in the
>>>>>>> cold area a couple of days ago.
>>>>>>> But now the cold area gets smaller very quickly. What am I doing wrong?
>>>>>
>>>>> How do you know the cold area is getting smaller? If it is as you said,
>>>>> the behavior of f2fs does not seem reasonable.
>>>>
>>>> I'm not sure about this. I just saw that the numbers after "COLD data: " above
>>>> were quite close to the total amount after "Main area:".
>>>
>>> Hmm... the numbers after "COLD data:" are the numbers of the current cold
>>> segment, section and zone.
>>>
>>>> This is quite reasonable because I tried to put the video files into the cold
>>>> area. So the cold area should take up almost the whole device.
>>>> The idea is to separate video files from SQLite database writes. I got to
>>>> this idea while reading some docs on f2fs.
>>>
>>> I think this would be a good idea if the SQLite db files and the index and
>>> video files have different update frequencies.
>>
>> Yes, their lifetimes are probably quite different. I don't know exactly how
>> SQLite manages its db files. I think it's not append-only. So I thought that it
>> would be the right idea to separate the other files, which I'm in total control
>> of, from the SQLite db file.
>>
>> My video archive and index files are written in an append-only manner. They are
>> never overwritten. They can only be deleted when the archive is rotated.
>
> Will the video archive and index files be appended to a size that is not
> aligned to 4k and then, later, be appended to again?
Well, the actual write() system calls are not aligned to 4k, and their sizes can be quite
random. But I think the job of issuing aligned writes to the hardware is done by the FS
buffers. The archive files are written at a steady rate of about 1.2Mb/s, so in my
understanding all their blocks should be ready before they are written to the card.
The index files are written at a much lower rate of about 8Kb/s, but this should
still be enough to always fill their blocks before they are written to the card.

Some sysctl settings from my embedded system:

vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500

Total amount of RAM is 106736 kB.
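
A rough sanity check of that reasoning, as a Python sketch. It assumes that "1.2Mb/s" and "8Kb/s" mean megabytes and kilobytes per second, and it applies the background ratio to total RAM, although the kernel applies it to available memory, so the real limit is lower. These are assumptions for illustration, not measurements.

```python
BLOCK = 4096                  # F2FS block size, bytes
RAM_BYTES = 106736 * 1024     # total RAM quoted above
BACKGROUND_RATIO = 10         # vm.dirty_background_ratio, percent
EXPIRE_S = 3000 // 100        # vm.dirty_expire_centisecs, in seconds
ARCHIVE_RATE = 1_200_000      # assumption: "1.2Mb/s" read as bytes per second
INDEX_RATE = 8_000            # assumption: "8Kb/s" read as bytes per second

# Upper bound: the kernel applies the ratio to available memory, which is
# smaller than total RAM.
BACKGROUND_LIMIT = RAM_BYTES * BACKGROUND_RATIO // 100

def full_blocks_per_flush(rate):
    """Full 4 KiB blocks one stream can dirty before writeback must start."""
    # Writeback starts when the oldest dirty page expires or when the
    # background limit is reached, whichever comes first.
    dirty = min(rate * EXPIRE_S, BACKGROUND_LIMIT)
    return dirty // BLOCK

print("archive:", full_blocks_per_flush(ARCHIVE_RATE))
print("index:  ", full_blocks_per_flush(INDEX_RATE))
```

Under these assumptions each flush carries thousands of full blocks for an archive file and dozens for an index file, plus at most one partially filled tail block per file. That tail block is the case the question about 4k alignment is concerned with.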
>> Per my understanding of f2fs internals, it should write these "cold" files and
>> usual "hot" files to different sections (that should map internally to
>> different allocation units). So the sections used by "cold" data should almost
>> never get "dirty" because most of the time all their blocks become free at
>> the same time. Of course, the files are not exactly 4MB in size so the last
>> section of the deleted file will become dirty. If it is moved by garbage
>> collector and becomes mixed with fresh "cold" data, then indeed it might cause
>> some problems, I think. What is your opinion?
>
> If your fs is not fragmented, it may be as you said; otherwise, SSR will
> still try to reuse invalid blocks in segments of other temperatures, and then
> your cold data will be mixed with warm data too.
>
> I guess, what you are facing is the latter one:
> SSR: 1230719 blocks in 14834 segments
I guess I need to somehow disable any cleaning or SSR for my archive and index
files, but keep the cleaning for other data and nodes.
I think the FS can get fragmented quite easily otherwise. The status above was
captured when the FS already had problems. I think it can become fragmented
this way:

1. The archive is written until the utilization is 95%. It is written separately from other
data and nodes thanks to the "cold" data feature.
2. After hitting 95%, my program starts to rotate the archive. The rotation
routine checks the free space, as reported by statfs(), once a minute. If it is below 5%
of the total, it deletes several of the oldest records in the archive.
3. The last deleted record leaves a dirty section. This section holds several blocks
from a record, which now becomes the oldest one.
4. This section is merged with fresh "cold" or even warmer data, by either GC or
SSR, in one or more newly used sections.
5. Then very soon the new oldest record is deleted in turn. Now we have one
or even several dirty sections filled with blocks from a not so old record. These are
again merged with other records.
6. All the records get fragmented after one full rotation. The fragmentation gets
worse and worse.

So I think the best thing to do is to keep sections with "cold" data completely
out of all the cleaning schemes. The cold area will clean itself through rotation.
Other data and nodes might still need some cleaning schemes.
Please correct me if I don't get it right.
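
For reference, the rotation check in step 2 can be sketched in Python roughly as follows. This is not my actual code: the directory layout, the ".idx" suffix for index files, the batch size and the assumption that record names sort chronologically are all hypothetical. os.statvfs() is used as the Python equivalent of statfs().

```python
import os

def free_fraction(path):
    """Fraction of the filesystem at `path` that is free, as statvfs() reports it."""
    st = os.statvfs(path)
    return st.f_bavail / st.f_blocks

def rotate(archive_dir, min_free=0.05, batch=4, free_fn=free_fraction):
    """Delete up to `batch` of the oldest records if free space is below `min_free`.

    Assumes (hypothetically) that `archive_dir` holds only record files and
    their "<record>.idx" index files, and that names sort chronologically.
    Returns the names of the deleted records.
    """
    if free_fn(archive_dir) >= min_free:
        return []
    records = sorted(n for n in os.listdir(archive_dir) if not n.endswith(".idx"))
    deleted = []
    for name in records[:batch]:
        # Remove the record together with its index file, if present.
        for victim in (name, name + ".idx"):
            full = os.path.join(archive_dir, victim)
            if os.path.exists(full):
                os.remove(full)
        deleted.append(name)
    return deleted
```

Such a routine would be called once a minute. Each call frees whole files, but the first and last sections of every deleted file can still be shared with neighbouring records, which is where the dirty sections in step 3 come from.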
> Maybe we can try to change the update policy from OPU to IPU for your case, to
> avoid the performance regression caused by SSR and more frequent FG-GC:
>
> echo 1 > /sys/fs/f2fs/"yourdevicename"/ipu_policy
Thanks, I'll try it!
--
Alexander