All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
To: Ryusuke Konishi
	<konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH v2 7/9] nilfs2: ensure that all dirty blocks are written out
Date: Sat, 09 May 2015 22:18:06 +0200	[thread overview]
Message-ID: <554E6B7E.8070000@gmx.net> (raw)
In-Reply-To: <20150509.211741.1463241033923032068.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>

On 2015-05-09 14:17, Ryusuke Konishi wrote:
> On Sun,  3 May 2015 12:05:20 +0200, Andreas Rohner wrote:
>> This patch ensures, that all dirty blocks are written out if the segment
>> construction mode is SC_LSEG_SR. The scanning of the DAT file can cause
>> blocks in the SUFILE to be dirtied and newly dirtied blocks in the
>> SUFILE can in turn dirty more blocks in the DAT file. Since one of
>> these stages has to happen before the other during segment
>> construction, we end up with unwritten dirty blocks, that are lost
>> in case of a file system unmount.
>>
>> This patch introduces a new set of file scanning operations that
>> only propagate the changes to the bmap and do not add anything to the
>> segment buffer. The DAT file and SUFILE are scanned with these
>> operations. The function nilfs_sufile_flush_cache() is called in between
>> these scans with the parameter only_mark set. That way it can be called
>> repeatedly without actually writing anything to the SUFILE. If there are
>> no new blocks dirtied in the flush, the normal segment construction
>> stages can safely continue.
>>
>> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
>> ---
>>  fs/nilfs2/segment.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  fs/nilfs2/segment.h |  3 ++-
>>  2 files changed, 74 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
>> index 14e76c3..ab8df33 100644
>> --- a/fs/nilfs2/segment.c
>> +++ b/fs/nilfs2/segment.c
>> @@ -579,6 +579,12 @@ static int nilfs_collect_dat_data(struct nilfs_sc_info *sci,
>>  	return err;
>>  }
>>  
>> +static int nilfs_collect_prop_data(struct nilfs_sc_info *sci,
>> +				  struct buffer_head *bh, struct inode *inode)
>> +{
>> +	return nilfs_bmap_propagate(NILFS_I(inode)->i_bmap, bh);
>> +}
>> +
>>  static int nilfs_collect_dat_bmap(struct nilfs_sc_info *sci,
>>  				  struct buffer_head *bh, struct inode *inode)
>>  {
>> @@ -613,6 +619,14 @@ static struct nilfs_sc_operations nilfs_sc_dat_ops = {
>>  	.write_node_binfo = nilfs_write_dat_node_binfo,
>>  };
>>  
>> +static struct nilfs_sc_operations nilfs_sc_prop_ops = {
>> +	.collect_data = nilfs_collect_prop_data,
>> +	.collect_node = nilfs_collect_file_node,
>> +	.collect_bmap = NULL,
>> +	.write_data_binfo = NULL,
>> +	.write_node_binfo = NULL,
>> +};
>> +
>>  static struct nilfs_sc_operations nilfs_sc_dsync_ops = {
>>  	.collect_data = nilfs_collect_file_data,
>>  	.collect_node = NULL,
>> @@ -998,7 +1012,8 @@ static int nilfs_segctor_scan_file(struct nilfs_sc_info *sci,
>>  			err = nilfs_segctor_apply_buffers(
>>  				sci, inode, &data_buffers,
>>  				sc_ops->collect_data);
>> -			BUG_ON(!err); /* always receive -E2BIG or true error */
>> +			/* always receive -E2BIG or true error (NOT ANYMORE?)*/
>> +			/* BUG_ON(!err); */
>>  			goto break_or_fail;
>>  		}
>>  	}
> 
> If n > rest, this function will exit without scanning node buffers
> for nilfs_segctor_propagate_sufile().  This looks problem, right?
> 
> I think adding separate functions is better.  For instance,
> 
> static int nilfs_propagate_buffer(struct nilfs_sc_info *sci,
> 				  struct buffer_head *bh,
> 				  struct inode *inode)
> {
> 	return nilfs_bmap_propagate(NILFS_I(inode)->i_bmap, bh);
> }
> 
> static int nilfs_segctor_propagate_file(struct nilfs_sc_info *sci,
> 					struct inode *inode)
> {
> 	LIST_HEAD(buffers);
> 	size_t n;
> 	int err;
> 
> 	n = nilfs_lookup_dirty_data_buffers(inode, &buffers, SIZE_MAX, 0,
> 					    LLONG_MAX);
> 	if (n > 0) {
> 		ret = nilfs_segctor_apply_buffers(sci, inode, &buffers,
> 						  nilfs_propagate_buffer);
> 		if (unlikely(ret))
> 			goto fail;
> 	}
> 
> 	nilfs_lookup_dirty_node_buffers(inode, &buffers);
> 	ret = nilfs_segctor_apply_buffers(sci, inode, &buffers,
> 					  nilfs_propagate_buffer);
> fail:
> 	return ret;
> }
> 
> With this, you can also avoid defining nilfs_sc_prop_ops, nor touching
> the BUG_ON() in nilfs_segctor_scan_file.

I agree this is a much nicer solution.

>> @@ -1055,6 +1070,55 @@ static int nilfs_segctor_scan_file_dsync(struct nilfs_sc_info *sci,
>>  	return err;
>>  }
>>  
>> +/**
>> + * nilfs_segctor_propagate_sufile - dirties all needed SUFILE blocks
>> + * @sci: nilfs_sc_info
>> + *
>> + * Description: Dirties and propagates all SUFILE blocks that need to be
>> + * available later in the segment construction process, when the SUFILE cache
>> + * is flushed. Here the SUFILE cache is not actually flushed, but the blocks
>> + * that are needed for a later flush are marked as dirty. Since the propagation
>> + * of the SUFILE can dirty DAT entries and vice versa, the functions
>> + * are executed in a loop until no new blocks are dirtied.
>> + *
>> + * Return Value: On success, 0 is returned on error, one of the following
>> + * negative error codes is returned.
>> + *
>> + * %-ENOMEM - Insufficient memory available.
>> + *
>> + * %-EIO - I/O error
>> + *
>> + * %-EROFS - Read only filesystem (for create mode)
>> + */
>> +static int nilfs_segctor_propagate_sufile(struct nilfs_sc_info *sci)
>> +{
>> +	struct the_nilfs *nilfs = sci->sc_super->s_fs_info;
>> +	unsigned long ndirty_blks;
>> +	int ret, retrycount = NILFS_SC_SUFILE_PROP_RETRY;
>> +
>> +	do {
>> +		/* count changes to DAT file before flush */
>> +		ret = nilfs_segctor_scan_file(sci, nilfs->ns_dat,
>> +					      &nilfs_sc_prop_ops);
> 
> Use the previous nilfs_segctor_propagate_file() here.
> 
>> +		if (unlikely(ret))
>> +			return ret;
>> +
>> +		ret = nilfs_sufile_flush_cache(nilfs->ns_sufile, 1,
>> +					       &ndirty_blks);
>> +		if (unlikely(ret))
>> +			return ret;
>> +		if (!ndirty_blks)
>> +			break;
>> +
>> +		ret = nilfs_segctor_scan_file(sci, nilfs->ns_sufile,
>> +					      &nilfs_sc_prop_ops);
> 
> Ditto.
> 
>> +		if (unlikely(ret))
>> +			return ret;
>> +	} while (ndirty_blks && retrycount-- > 0);
>> +
> 
> Uum. This still looks to have potential for leak of dirty block
> collection between DAT and SUFILE since this retry is limited by
> the fixed retry count.

Yes unfortunately.

> How about adding function temporarily turning off the live block
> tracking and using it after this propagation loop until log write
> finishes ?

I think this is a great idea.

> It would reduce the accuracy of live block count, but is it enough ?
> How do you think ? 

I would suggest to iterate through the loop in
nilfs_segctor_propagate_sufile() at least once or twice, so that we can
count most of the DAT-File blocks. After that we temporarily turn off
the live block tracking until the end of the segment construction. This
should only lead to small inaccuracies.

> We have to eliminate the possibility of the leak
> because it can cause file system corruption.  Every checkpoint must be
> self-contained.

I didn't realize that it could cause file system corruption.

>> +	return 0;
>> +}
>> +
>>  static int nilfs_segctor_collect_blocks(struct nilfs_sc_info *sci, int mode)
>>  {
>>  	struct the_nilfs *nilfs = sci->sc_super->s_fs_info;
>> @@ -1160,6 +1224,13 @@ static int nilfs_segctor_collect_blocks(struct nilfs_sc_info *sci, int mode)
>>  		}
>>  		sci->sc_stage.flags |= NILFS_CF_SUFREED;
>>  
> 
>> +		if (mode == SC_LSEG_SR &&
> 
> This test ("mode == SC_LSEG_SR") can be removed.  When the thread
> comes here, it will always make a checkpoint.
> 
>> +		    nilfs_feature_track_live_blks(nilfs)) {
>> +			err = nilfs_segctor_propagate_sufile(sci);
>> +			if (unlikely(err))
>> +				break;
>> +		}
>> +
>>  		err = nilfs_segctor_scan_file(sci, nilfs->ns_sufile,
>>  					      &nilfs_sc_file_ops);
>>  		if (unlikely(err))
>> diff --git a/fs/nilfs2/segment.h b/fs/nilfs2/segment.h
>> index a48d6de..5aa7f91 100644
>> --- a/fs/nilfs2/segment.h
>> +++ b/fs/nilfs2/segment.h
>> @@ -208,7 +208,8 @@ enum {
>>   */
>>  #define NILFS_SC_CLEANUP_RETRY	    3  /* Retry count of construction when
>>  					  destroying segctord */
>> -
>> +#define NILFS_SC_SUFILE_PROP_RETRY  10 /* Retry count of the propagate
>> +					  sufile loop */
> 
> How many times does the propagation loop has to be repeated
> until it converges ?

Most of the time it runs only once, because all the blocks are already
dirty, but sometimes it can go on for more than 10 iterations.

Regards,
Andreas Rohner

> The current dirty block scanning function collects all dirty blocks of
> the specified file (i.e. SUFILE or DAT), traversing page cache, making
> and destructing list of dirty buffers, every time the propagation
> function is called.  It's so wasteful to repeat that many times.
> 
> Regards,
> Ryusuke Konishi
> 
>>  /*
>>   * Default values of timeout, in seconds.
>>   */
>> -- 
>> 2.3.7
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-05-09 20:18 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-03 10:05 [PATCH v2 0/9] nilfs2: implementation of cost-benefit GC policy Andreas Rohner
     [not found] ` <1430647522-14304-1-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-03 10:05   ` [PATCH v2 1/9] nilfs2: copy file system feature flags to the nilfs object Andreas Rohner
     [not found]     ` <1430647522-14304-2-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  1:54       ` Ryusuke Konishi
     [not found]         ` <20150509.105445.1816655707671265145.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 18:41           ` Andreas Rohner
2015-05-03 10:05   ` [PATCH v2 2/9] nilfs2: extend SUFILE on-disk format to enable tracking of live blocks Andreas Rohner
     [not found]     ` <1430647522-14304-3-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  2:24       ` Ryusuke Konishi
     [not found]         ` <20150509.112403.380867861504859109.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 18:47           ` Andreas Rohner
2015-05-03 10:05   ` [PATCH v2 3/9] nilfs2: introduce new feature flag for tracking " Andreas Rohner
     [not found]     ` <1430647522-14304-4-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  2:28       ` Ryusuke Konishi
     [not found]         ` <20150509.112814.2026089040966346261.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 18:53           ` Andreas Rohner
2015-05-03 10:05   ` [PATCH v2 4/9] nilfs2: add kmem_cache for SUFILE cache nodes Andreas Rohner
     [not found]     ` <1430647522-14304-5-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  2:41       ` Ryusuke Konishi
     [not found]         ` <20150509.114149.1643183669812667339.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 19:10           ` Andreas Rohner
     [not found]             ` <554E5B9D.7070807-hi6Y0CQ0nG0@public.gmane.org>
2015-05-10  0:05               ` Ryusuke Konishi
2015-05-03 10:05   ` [PATCH v2 5/9] nilfs2: add SUFILE cache for changes to su_nlive_blks field Andreas Rohner
     [not found]     ` <1430647522-14304-6-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  4:09       ` Ryusuke Konishi
     [not found]         ` <20150509.130900.223492430584220355.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 19:39           ` Andreas Rohner
     [not found]             ` <554E626A.2030503-hi6Y0CQ0nG0@public.gmane.org>
2015-05-10  2:09               ` Ryusuke Konishi
2015-05-03 10:05   ` [PATCH v2 6/9] nilfs2: add tracking of block deletions and updates Andreas Rohner
     [not found]     ` <1430647522-14304-7-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  7:05       ` Ryusuke Konishi
     [not found]         ` <20150509.160512.1087140271092828536.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 15:58           ` Ryusuke Konishi
2015-05-09 20:02           ` Andreas Rohner
     [not found]             ` <554E67C0.1050309-hi6Y0CQ0nG0@public.gmane.org>
2015-05-10  3:17               ` Ryusuke Konishi
2015-05-03 10:05   ` [PATCH v2 7/9] nilfs2: ensure that all dirty blocks are written out Andreas Rohner
     [not found]     ` <1430647522-14304-8-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09 12:17       ` Ryusuke Konishi
     [not found]         ` <20150509.211741.1463241033923032068.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 20:18           ` Andreas Rohner [this message]
     [not found]             ` <554E6B7E.8070000-hi6Y0CQ0nG0@public.gmane.org>
2015-05-10  3:31               ` Ryusuke Konishi
2015-05-10 11:04           ` Andreas Rohner
     [not found]             ` <554F3B32.5050004-hi6Y0CQ0nG0@public.gmane.org>
2015-06-01  4:13               ` Ryusuke Konishi
     [not found]                 ` <20150601.131320.1075202804382267027.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-06-01 14:33                   ` Andreas Rohner
2015-05-03 10:05   ` [PATCH v2 8/9] nilfs2: correct live block tracking for GC protection period Andreas Rohner
     [not found]     ` <1430647522-14304-9-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-10 18:15       ` Ryusuke Konishi
     [not found]         ` <20150511.031512.1036934606749624197.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-10 18:23           ` Ryusuke Konishi
     [not found]             ` <20150511.032323.1250231827423193240.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-11  2:07               ` Ryusuke Konishi
     [not found]                 ` <20150511.110726.725667075147435663.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-11 12:32                   ` Andreas Rohner
2015-05-11 13:00           ` Andreas Rohner
     [not found]             ` <5550A7FC.4050709-hi6Y0CQ0nG0@public.gmane.org>
2015-05-12 14:31               ` Ryusuke Konishi
     [not found]                 ` <20150512.233126.2206330706583570566.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-12 15:37                   ` Andreas Rohner
2015-05-03 10:05   ` [PATCH v2 9/9] nilfs2: prevent starvation of segments protected by snapshots Andreas Rohner
     [not found]     ` <1430647522-14304-10-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-20 14:43       ` Ryusuke Konishi
     [not found]         ` <20150520.234335.542615158366069430.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-20 15:49           ` Ryusuke Konishi
2015-05-22 18:10           ` Andreas Rohner
     [not found]             ` <555F70FD.6090500-hi6Y0CQ0nG0@public.gmane.org>
2015-05-31 16:45               ` Ryusuke Konishi
     [not found]                 ` <20150601.014550.269184778137708369.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-31 18:13                   ` Andreas Rohner
     [not found]                     ` <556B4F58.9080801-hi6Y0CQ0nG0@public.gmane.org>
2015-06-01  0:44                       ` Ryusuke Konishi
     [not found]                         ` <20150601.094441.24658496988941562.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-06-01 14:45                           ` Andreas Rohner
2015-05-03 10:07   ` [PATCH v2 1/5] nilfs-utils: extend SUFILE on-disk format to enable track live blocks Andreas Rohner
2015-05-03 10:07   ` [PATCH v2 2/5] nilfs-utils: add additional flags for nilfs_vdesc Andreas Rohner
2015-05-03 10:07   ` [PATCH v2 3/5] nilfs-utils: add support for tracking live blocks Andreas Rohner
2015-05-03 10:07   ` [PATCH v2 4/5] nilfs-utils: implement the tracking of live blocks for set_suinfo Andreas Rohner
2015-05-03 10:07   ` [PATCH v2 5/5] nilfs-utils: add support for greedy/cost-benefit policies Andreas Rohner
2015-05-05  3:09   ` [PATCH v2 0/9] nilfs2: implementation of cost-benefit GC policy Ryusuke Konishi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=554E6B7E.8070000@gmx.net \
    --to=andreas.rohner-hi6y0cq0ng0@public.gmane.org \
    --cc=konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org \
    --cc=linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.