From: Chao Yu <chao2.yu@samsung.com>
To: 'Shuoran' <liushuoran@huawei.com>
Cc: jaegeuk@kernel.org, linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [PATCH] Introduce lifetime write IO statistics
Date: Wed, 27 Jan 2016 18:02:37 +0800
Message-ID: <00da01d158e9$f170a290$d451e7b0$@samsung.com>
In-Reply-To: <56A87462.7010502@huawei.com>
> -----Original Message-----
> From: Shuoran [mailto:liushuoran@huawei.com]
> Sent: Wednesday, January 27, 2016 3:40 PM
> To: Chao Yu
> Cc: jaegeuk@kernel.org; linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH] Introduce lifetime write IO statistics
>
> On 2016/1/27 15:20, Chao Yu wrote:
> >> -----Original Message-----
> >> From: Shuoran [mailto:liushuoran@huawei.com]
> >> Sent: Tuesday, January 26, 2016 6:12 PM
> >> To: Chao Yu
> >> Cc: jaegeuk@kernel.org; linux-f2fs-devel@lists.sourceforge.net
> >> Subject: Re: [f2fs-dev] [PATCH] Introduce lifetime write IO statistics
> >>
> >> On 2016/1/26 17:24, Chao Yu wrote:
> >>> Hi,
> >>>
> >>>> -----Original Message-----
> >>>> From: Liu Shuoran [mailto:liushuoran@huawei.com]
> >>>> Sent: Tuesday, January 26, 2016 3:41 PM
> >>>> To: jaegeuk@kernel.org
> >>>> Cc: linux-f2fs-devel@lists.sourceforge.net
> >>>> Subject: [f2fs-dev] [PATCH] Introduce lifetime write IO statistics
> >>>>
> >>>> This patch introduces lifetime IO write statistics exposed to the sysfs interface.
> >>>> The write IO amount is obtained from block layer, accumulated in the file system and
> >>>> stored in the hot node summary of checkpoint.
> >>>>
> >>>> Signed-off-by: Shuoran Liu <liushuoran@huawei.com>
> >>>> Signed-off-by: Pengyang Hou <houpengyang@huawei.com>
> >>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>> ---
> >>>> fs/f2fs/checkpoint.c | 12 ++++++++++++
> >>>> fs/f2fs/f2fs.h | 11 +++++++++++
> >>>> fs/f2fs/super.c | 35 +++++++++++++++++++++++++++++++++++
> >>>> include/linux/f2fs_fs.h | 14 +++++++++++++-
> >>>> 4 files changed, 71 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> >>>> index 3842af9..998de47 100644
> >>>> --- a/fs/f2fs/checkpoint.c
> >>>> +++ b/fs/f2fs/checkpoint.c
> >>>> @@ -921,6 +921,10 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control
> >> *cpc)
> >>>> int cp_payload_blks = __cp_payload(sbi);
> >>>> block_t discard_blk = NEXT_FREE_BLKADDR(sbi, curseg);
> >>>> bool invalidate = false;
> >>>> + struct super_block *sb = sbi->sb;
> >>>> + struct curseg_info *seg_i = CURSEG_I(sbi, CURSEG_HOT_NODE);
> >>>> + u64 kbytes_written;
> >>>> +
> >>>>
> >>>> /*
> >>>> * This avoids to conduct wrong roll-forward operations and uses
> >>>> @@ -1034,6 +1038,14 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control
> >>>> *cpc)
> >>>>
> >>>> write_data_summaries(sbi, start_blk);
> >>>> start_blk += data_sum_blocks;
> >>>> +
> >>>> + /* Record write statistics in the hot node summary */
> >>>> + kbytes_written = sbi->kbytes_written;
> >>>> + if (sb->s_bdev->bd_part)
> >>>> + kbytes_written += BD_PART_WRITTEN(sbi);
> >>>> +
> >>>> + seg_i->sum_blk->info.kbytes_written = cpu_to_le64(kbytes_written);
> >>>> +
> >>>> if (__remain_node_summaries(cpc->reason)) {
> >>>> write_node_summaries(sbi, start_blk);
> >>>> start_blk += NR_CURSEG_NODE_TYPE;
> >>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >>>> index ff79054..cee5fab0 100644
> >>>> --- a/fs/f2fs/f2fs.h
> >>>> +++ b/fs/f2fs/f2fs.h
> >>>> @@ -844,8 +844,19 @@ struct f2fs_sb_info {
> >>>> struct list_head s_list;
> >>>> struct mutex umount_mutex;
> >>>> unsigned int shrinker_run_no;
> >>>> +
> >>>> + /* For write statistics */
> >>>> + u64 sectors_written_start;
> >>>> + u64 kbytes_written;
> >>>> };
> >>>>
> >>>> +/* For write statistics. Suppose sector size is 512 bytes,
> >>>> + * and the return value is in kbytes. s is of struct f2fs_sb_info.
> >>>> + */
> >>>> +#define BD_PART_WRITTEN(s) \
> >>>> +(((u64)part_stat_read(s->sb->s_bdev->bd_part, sectors[1]) - \
> >>>> + s->sectors_written_start) >> 1)
> >>>> +
> >>>> static inline void f2fs_update_time(struct f2fs_sb_info *sbi, int type)
> >>>> {
> >>>> sbi->last_time[type] = jiffies;
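For reference, the arithmetic behind BD_PART_WRITTEN() and the sysfs show function can be sketched in user space as below. This is only an illustration under the patch's stated assumption of 512-byte sectors; `struct write_stats`, `kb_since_mount()` and `lifetime_kb()` are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two counters the patch adds to f2fs_sb_info. */
struct write_stats {
	uint64_t sectors_written_start; /* device write-sector counter sampled at mount */
	uint64_t kbytes_written;        /* lifetime total restored from the checkpoint */
};

/*
 * Mirrors BD_PART_WRITTEN(): sectors written since mount, converted to KB.
 * With 512-byte sectors, two sectors make one KB, hence the shift by 1.
 */
static uint64_t kb_since_mount(const struct write_stats *ws, uint64_t sectors_now)
{
	assert(sectors_now >= ws->sectors_written_start);
	return (sectors_now - ws->sectors_written_start) >> 1;
}

/* Value reported through sysfs and recorded at checkpoint time. */
static uint64_t lifetime_kb(const struct write_stats *ws, uint64_t sectors_now)
{
	return ws->kbytes_written + kb_since_mount(ws, sectors_now);
}
```

Note that the shift truncates, so a single odd sector written since mount is not counted until the next one arrives.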
> >>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> >>>> index 3bf990b..7ed10f1 100644
> >>>> --- a/fs/f2fs/super.c
> >>>> +++ b/fs/f2fs/super.c
> >>>> @@ -126,6 +126,19 @@ static unsigned char *__struct_ptr(struct f2fs_sb_info *sbi, int
> >>>> struct_type)
> >>>> return NULL;
> >>>> }
> >>>>
> >>>> +static ssize_t lifetime_write_kbytes_show(struct f2fs_attr *a,
> >>>> + struct f2fs_sb_info *sbi, char *buf)
> >>>> +{
> >>>> + struct super_block *sb = sbi->sb;
> >>>> +
> >>>> + if (!sb->s_bdev->bd_part)
> >>>> + return snprintf(buf, PAGE_SIZE, "0\n");
> >>>> +
> >>>> + return snprintf(buf, PAGE_SIZE, "%llu\n",
> >>>> + (unsigned long long)(sbi->kbytes_written +
> >>>> + BD_PART_WRITTEN(sbi)));
> >>>> +}
> >>>> +
> >>>> static ssize_t f2fs_sbi_show(struct f2fs_attr *a,
> >>>> struct f2fs_sb_info *sbi, char *buf)
> >>>> {
> >>>> @@ -204,6 +217,9 @@ static struct f2fs_attr f2fs_attr_##_name = { \
> >>>> f2fs_sbi_show, f2fs_sbi_store, \
> >>>> offsetof(struct struct_name, elname))
> >>>>
> >>>> +#define F2FS_GENERAL_RO_ATTR(name) \
> >>>> +static struct f2fs_attr f2fs_attr_##name = __ATTR(name, 0444, name##_show, NULL)
> >>>> +
> >>>> F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_min_sleep_time, min_sleep_time);
> >>>> F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_max_sleep_time, max_sleep_time);
> >>>> F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_no_gc_sleep_time, no_gc_sleep_time);
> >>>> @@ -220,6 +236,7 @@ F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_victim_search,
> >>>> max_victim_search);
> >>>> F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, dir_level, dir_level);
> >>>> F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, cp_interval, interval_time[CP_TIME]);
> >>>> F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, idle_interval, interval_time[REQ_TIME]);
> >>>> +F2FS_GENERAL_RO_ATTR(lifetime_write_kbytes);
> >>>>
> >>>> #define ATTR_LIST(name) (&f2fs_attr_##name.attr)
> >>>> static struct attribute *f2fs_attrs[] = {
> >>>> @@ -239,6 +256,7 @@ static struct attribute *f2fs_attrs[] = {
> >>>> ATTR_LIST(ra_nid_pages),
> >>>> ATTR_LIST(cp_interval),
> >>>> ATTR_LIST(idle_interval),
> >>>> + ATTR_LIST(lifetime_write_kbytes),
> >>>> NULL,
> >>>> };
> >>>>
> >>>> @@ -766,6 +784,11 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
> >>>> bool need_stop_gc = false;
> >>>> bool no_extent_cache = !test_opt(sbi, EXTENT_CACHE);
> >>>>
> >>>> + if (*flags & MS_RDONLY) {
> >>>> + set_opt(sbi, FASTBOOT);
> >>> Need to recover to original mode, otherwise it may change option user
> >>> set.
> >> Yes, maybe I can bring forward the action of saving old mount options.
> >>>> + set_sbi_flag(sbi, SBI_IS_DIRTY);
> >>>> + }
> >>>> +
> >>>> sync_filesystem(sb);
> >>>>
> >>>> /*
> >>>> @@ -1242,6 +1265,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int
> >> silent)
> >>>> bool retry = true, need_fsck = false;
> >>>> char *options = NULL;
> >>>> int recovery, i, valid_super_block;
> >>>> + struct curseg_info *seg_i;
> >>>>
> >>>> try_onemore:
> >>>> err = -EINVAL;
> >>>> @@ -1372,6 +1396,17 @@ try_onemore:
> >>>> goto free_nm;
> >>>> }
> >>>>
> >>>> + /* For write statistics */
> >>>> + if (sb->s_bdev->bd_part)
> >>>> + sbi->sectors_written_start =
> >>>> + (u64)part_stat_read(sb->s_bdev->bd_part, sectors[1]);
> >>>> +
> >>>> + /* Read accumulated write IO statistics if exists */
> >>>> + seg_i = CURSEG_I(sbi, CURSEG_HOT_NODE);
> >>> Why not CURSEG_WARM_DATA?
> >> Data summary might be compacted. So only hot/warm/cold node summary is
> >> available, and we just picked hot node summary.
> > Oh, you're right, warm data journal would not be persisted if data
> > summaries are compacted, my mistake.
> >
> > Wouldn't be better to persist this stat number into super block like
> > ext4?
> >
> > Thanks,
>
> I think super block in F2FS is not updated as frequent as that in Ext4
> (there are several places invoking ext4_commit_super, while few invoking
> f2fs_commit_super), although in this design the statistics is not
> updated very frequently either. Another concern is that, to some extent,
> the super blocks of f2fs is not supposed to update except for a few
> cases. So the stat is better stored in the checkpoint, and we just tried
> very hard finding a spare place.
Got it, so the key point is to avoid generating additional IO.

If the update points of the stat in your approach are enough, then by
comparison:
1) umount case: the commit sb approach updates two more blocks.
2) remount case: the commit sb approach updates one block fewer when
FASTBOOT is disabled.

Not a big regression. :)

IMO, maybe it's better to use the journal log in data segments to store
checkpoint related data, maybe more nat/sit entries.

Another thing: I doubt whether it is enough to use cp + data journal to
update the stat, since an abnormal power-cut after a normal cp would
make us lose all the stats saved before.
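
To make the concern concrete, here is a toy user-space model, not kernel code. It assumes, as I read __remain_node_summaries() and __exist_node_summaries(), that node summaries are written only by umount/fastboot style checkpoints and are treated as absent at mount after any other checkpoint. All names in it (`toy_disk`, `toy_fs`, `toy_mount`, `toy_checkpoint`, `cp_reason`) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified checkpoint reasons. */
enum cp_reason { CP_REASON_SYNC, CP_REASON_UMOUNT };

struct toy_disk {
	bool node_summaries_valid; /* what __exist_node_summaries() would see at next mount */
	uint64_t saved_kb;         /* models info.kbytes_written in the hot node summary */
};

struct toy_fs {
	uint64_t base_kb;    /* models sbi->kbytes_written */
	uint64_t session_kb; /* models BD_PART_WRITTEN(sbi) */
};

/* Mount: the stat is restored only if node summaries exist, else it starts at 0. */
static void toy_mount(struct toy_fs *fs, const struct toy_disk *d)
{
	fs->base_kb = d->node_summaries_valid ? d->saved_kb : 0;
	fs->session_kb = 0;
}

/* Checkpoint: only an umount-style checkpoint persists the node summaries. */
static void toy_checkpoint(const struct toy_fs *fs, struct toy_disk *d,
			   enum cp_reason reason)
{
	d->node_summaries_valid = (reason == CP_REASON_UMOUNT);
	if (d->node_summaries_valid)
		d->saved_kb = fs->base_kb + fs->session_kb;
}
```

In this model, a clean umount followed by a mount restores the total, but a normal checkpoint followed by a power-cut leaves the next mount with base_kb == 0, which is the loss described above.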
Thanks,
>
> >>> Thanks,
> >>>
> >>>> + if(__exist_node_summaries(sbi))
> >>>> + sbi->kbytes_written =
> >>>> + le64_to_cpu(seg_i->sum_blk->info.kbytes_written);
> >>>> +
> >>>> build_gc_manager(sbi);
> >>>>
> >>>> /* get an inode for node space */
> >>>> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
> >>>> index e59c3be..67aa01d 100644
> >>>> --- a/include/linux/f2fs_fs.h
> >>>> +++ b/include/linux/f2fs_fs.h
> >>>> @@ -358,6 +358,12 @@ struct summary_footer {
> >>>> sizeof(struct sit_journal_entry))
> >>>> #define SIT_JOURNAL_RESERVED ((SUM_JOURNAL_SIZE - 2) %\
> >>>> sizeof(struct sit_journal_entry))
> >>>> +
> >>>> +/* Reserved area should make size of f2fs_extra_info equals to
> >>>> + * that of nat_journal and sit_journal.
> >>>> + */
> >>>> +#define EXTRA_INFO_RESERVED (SUM_JOURNAL_SIZE - 2 - 8)
> >>>> +
> >>>> /*
> >>>> * frequently updated NAT/SIT entries can be stored in the spare area in
> >>>> * summary blocks
> >>>> @@ -387,6 +393,11 @@ struct sit_journal {
> >>>> __u8 reserved[SIT_JOURNAL_RESERVED];
> >>>> } __packed;
> >>>>
> >>>> +struct f2fs_extra_info {
> >>>> + __le64 kbytes_written;
> >>>> + __u8 reserved[EXTRA_INFO_RESERVED];
> >>>> +} __packed;
> >>>> +
> >>>> /* 4KB-sized summary block structure */
> >>>> struct f2fs_summary_block {
> >>>> struct f2fs_summary entries[ENTRIES_IN_SUM];
> >>>> @@ -394,10 +405,11 @@ struct f2fs_summary_block {
> >>>> __le16 n_nats;
> >>>> __le16 n_sits;
> >>>> };
> >>>> - /* spare area is used by NAT or SIT journals */
> >>>> + /* spare area is used by NAT or SIT journals or extra info */
> >>>> union {
> >>>> struct nat_journal nat_j;
> >>>> struct sit_journal sit_j;
> >>>> + struct f2fs_extra_info info;
> >>>> };
> >>>> struct summary_footer footer;
> >>>> } __packed;
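As a sanity check of the size constraint stated in the EXTRA_INFO_RESERVED comment, the layout can be verified at compile time in user space. This is a sketch only: the block size, summary entry size and footer size below are assumed values (4KB block, 512 entries of 7 bytes, 5-byte footer), and `struct extra_info` and `journal_payload_size()` are illustrative names rather than the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed on-disk constants, restated here so the sketch is self-contained. */
#define BLKSIZE             4096
#define SUM_ENTRIES_SIZE    (7 * 512) /* 512 summary entries of 7 bytes each */
#define SUM_FOOTER_SIZE     5
#define SUM_JOURNAL_SIZE    (BLKSIZE - SUM_FOOTER_SIZE - SUM_ENTRIES_SIZE)

/* Same formula as the patch: journal area minus the 2-byte count and the 8-byte stat. */
#define EXTRA_INFO_RESERVED (SUM_JOURNAL_SIZE - 2 - 8)

/* Illustrative copy of f2fs_extra_info using fixed-width user-space types. */
struct extra_info {
	uint64_t kbytes_written;
	uint8_t reserved[EXTRA_INFO_RESERVED];
} __attribute__((packed));

/* Bytes available to each union member after the 2-byte n_nats/n_sits field. */
static size_t journal_payload_size(void)
{
	return SUM_JOURNAL_SIZE - 2;
}

/* The new union member must not change the size of the summary block. */
_Static_assert(sizeof(struct extra_info) == SUM_JOURNAL_SIZE - 2,
	       "extra_info must fill exactly the journal payload area");
```

Under these assumed constants the payload area works out to 505 bytes, so the reserved array is 497 bytes.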
> >>>> --
> >>>> 1.9.1
> >>>>
> >>>>
> >>>> ------------------------------------------------------------------------------
> >>>> Site24x7 APM Insight: Get Deep Visibility into Application Performance
> >>>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
> >>>> Monitor end-to-end web transactions and take corrective actions now
> >>>> Troubleshoot faster and improve end-user experience. Signup Now!
> >>>> http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140
> >>>> _______________________________________________
> >>>> Linux-f2fs-devel mailing list
> >>>> Linux-f2fs-devel@lists.sourceforge.net
> >>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>> .
> >>>
> >
> >
> > .
> >
>
Thread overview: 9+ messages
2016-01-26 7:40 [PATCH] Introduce lifetime write IO statistics Liu Shuoran
2016-01-26 9:24 ` Chao Yu
2016-01-26 10:12 ` Shuoran
2016-01-27 7:20 ` Chao Yu
2016-01-27 7:40 ` Shuoran
2016-01-27 10:02 ` Chao Yu [this message]
2016-01-27 13:03 ` Chao Yu
2016-01-27 22:26 ` Jaegeuk Kim
2016-01-28 1:20 ` Chao Yu