linux-nilfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
To: Ryusuke Konishi
	<konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH v2 7/9] nilfs2: ensure that all dirty blocks are written out
Date: Sun, 10 May 2015 13:04:18 +0200	[thread overview]
Message-ID: <554F3B32.5050004@gmx.net> (raw)
In-Reply-To: <20150509.211741.1463241033923032068.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>

On 2015-05-09 14:17, Ryusuke Konishi wrote:
> On Sun,  3 May 2015 12:05:20 +0200, Andreas Rohner wrote:
>> This patch ensures, that all dirty blocks are written out if the segment
>> construction mode is SC_LSEG_SR. The scanning of the DAT file can cause
>> blocks in the SUFILE to be dirtied and newly dirtied blocks in the
>> SUFILE can in turn dirty more blocks in the DAT file. Since one of
>> these stages has to happen before the other during segment
>> construction, we end up with unwritten dirty blocks, that are lost
>> in case of a file system unmount.
>>
>> This patch introduces a new set of file scanning operations that
>> only propagate the changes to the bmap and do not add anything to the
>> segment buffer. The DAT file and SUFILE are scanned with these
>> operations. The function nilfs_sufile_flush_cache() is called in between
>> these scans with the parameter only_mark set. That way it can be called
>> repeatedly without actually writing anything to the SUFILE. If there are
>> no new blocks dirtied in the flush, the normal segment construction
>> stages can safely continue.
>>
>> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
>> ---
>>  fs/nilfs2/segment.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  fs/nilfs2/segment.h |  3 ++-
>>  2 files changed, 74 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
>> index 14e76c3..ab8df33 100644
>> --- a/fs/nilfs2/segment.c
>> +++ b/fs/nilfs2/segment.c
>> @@ -579,6 +579,12 @@ static int nilfs_collect_dat_data(struct nilfs_sc_info *sci,
>>  	return err;
>>  }
>>  
>> +static int nilfs_collect_prop_data(struct nilfs_sc_info *sci,
>> +				  struct buffer_head *bh, struct inode *inode)
>> +{
>> +	return nilfs_bmap_propagate(NILFS_I(inode)->i_bmap, bh);
>> +}
>> +
>>  static int nilfs_collect_dat_bmap(struct nilfs_sc_info *sci,
>>  				  struct buffer_head *bh, struct inode *inode)
>>  {
>> @@ -613,6 +619,14 @@ static struct nilfs_sc_operations nilfs_sc_dat_ops = {
>>  	.write_node_binfo = nilfs_write_dat_node_binfo,
>>  };
>>  
>> +static struct nilfs_sc_operations nilfs_sc_prop_ops = {
>> +	.collect_data = nilfs_collect_prop_data,
>> +	.collect_node = nilfs_collect_file_node,
>> +	.collect_bmap = NULL,
>> +	.write_data_binfo = NULL,
>> +	.write_node_binfo = NULL,
>> +};
>> +
>>  static struct nilfs_sc_operations nilfs_sc_dsync_ops = {
>>  	.collect_data = nilfs_collect_file_data,
>>  	.collect_node = NULL,
>> @@ -998,7 +1012,8 @@ static int nilfs_segctor_scan_file(struct nilfs_sc_info *sci,
>>  			err = nilfs_segctor_apply_buffers(
>>  				sci, inode, &data_buffers,
>>  				sc_ops->collect_data);
>> -			BUG_ON(!err); /* always receive -E2BIG or true error */
>> +			/* always receive -E2BIG or true error (NOT ANYMORE?)*/
>> +			/* BUG_ON(!err); */
>>  			goto break_or_fail;
>>  		}
>>  	}
> 
> If n > rest, this function will exit without scanning node buffers
> for nilfs_segctor_propagate_sufile().  This looks problem, right?
> 
> I think adding separate functions is better.  For instance,
> 
> static int nilfs_propagate_buffer(struct nilfs_sc_info *sci,
> 				  struct buffer_head *bh,
> 				  struct inode *inode)
> {
> 	return nilfs_bmap_propagate(NILFS_I(inode)->i_bmap, bh);
> }
> 
> static int nilfs_segctor_propagate_file(struct nilfs_sc_info *sci,
> 					struct inode *inode)
> {
> 	LIST_HEAD(buffers);
> 	size_t n;
> 	int err;
> 
> 	n = nilfs_lookup_dirty_data_buffers(inode, &buffers, SIZE_MAX, 0,
> 					    LLONG_MAX);
> 	if (n > 0) {
> 		ret = nilfs_segctor_apply_buffers(sci, inode, &buffers,
> 						  nilfs_propagate_buffer);
> 		if (unlikely(ret))
> 			goto fail;
> 	}
> 
> 	nilfs_lookup_dirty_node_buffers(inode, &buffers);
> 	ret = nilfs_segctor_apply_buffers(sci, inode, &buffers,
> 					  nilfs_propagate_buffer);
> fail:
> 	return ret;
> }
> 
> With this, you can also avoid defining nilfs_sc_prop_ops, nor touching
> the BUG_ON() in nilfs_segctor_scan_file.
> 
>> @@ -1055,6 +1070,55 @@ static int nilfs_segctor_scan_file_dsync(struct nilfs_sc_info *sci,
>>  	return err;
>>  }
>>  
>> +/**
>> + * nilfs_segctor_propagate_sufile - dirties all needed SUFILE blocks
>> + * @sci: nilfs_sc_info
>> + *
>> + * Description: Dirties and propagates all SUFILE blocks that need to be
>> + * available later in the segment construction process, when the SUFILE cache
>> + * is flushed. Here the SUFILE cache is not actually flushed, but the blocks
>> + * that are needed for a later flush are marked as dirty. Since the propagation
>> + * of the SUFILE can dirty DAT entries and vice versa, the functions
>> + * are executed in a loop until no new blocks are dirtied.
>> + *
>> + * Return Value: On success, 0 is returned on error, one of the following
>> + * negative error codes is returned.
>> + *
>> + * %-ENOMEM - Insufficient memory available.
>> + *
>> + * %-EIO - I/O error
>> + *
>> + * %-EROFS - Read only filesystem (for create mode)
>> + */
>> +static int nilfs_segctor_propagate_sufile(struct nilfs_sc_info *sci)
>> +{
>> +	struct the_nilfs *nilfs = sci->sc_super->s_fs_info;
>> +	unsigned long ndirty_blks;
>> +	int ret, retrycount = NILFS_SC_SUFILE_PROP_RETRY;
>> +
>> +	do {
>> +		/* count changes to DAT file before flush */
>> +		ret = nilfs_segctor_scan_file(sci, nilfs->ns_dat,
>> +					      &nilfs_sc_prop_ops);
> 
> Use the previous nilfs_segctor_propagate_file() here.
> 
>> +		if (unlikely(ret))
>> +			return ret;
>> +
>> +		ret = nilfs_sufile_flush_cache(nilfs->ns_sufile, 1,
>> +					       &ndirty_blks);
>> +		if (unlikely(ret))
>> +			return ret;
>> +		if (!ndirty_blks)
>> +			break;
>> +
>> +		ret = nilfs_segctor_scan_file(sci, nilfs->ns_sufile,
>> +					      &nilfs_sc_prop_ops);
> 
> Ditto.
> 
>> +		if (unlikely(ret))
>> +			return ret;
>> +	} while (ndirty_blks && retrycount-- > 0);
>> +
> 
> Uum. This still looks to have potential for leak of dirty block
> collection between DAT and SUFILE since this retry is limited by
> the fixed retry count.
> 
> How about adding function temporarily turning off the live block
> tracking and using it after this propagation loop until log write
> finishes ?
> 
> It would reduce the accuracy of live block count, but is it enough ?
> How do you think ?  We have to eliminate the possibility of the leak
> because it can cause file system corruption.  Every checkpoint must be
> self-contained.

How exactly could it lead to file system corruption? Maybe I miss
something important here, but it seems to me, that no corruption is
possible.

The nilfs_sufile_flush_cache_node() function only reads in already
existing blocks. No new blocks are created. If I mark those blocks
dirty, the btree is not changed at all. If I do not call
nilfs_bmap_propagate(), then the btree stays unchanged and there are no
dangling pointers. The resulting checkpoint should be self-contained.

The only problem would be, that I could lose some nlive_blks updates.

>> +	return 0;
>> +}
>> +
>>  static int nilfs_segctor_collect_blocks(struct nilfs_sc_info *sci, int mode)
>>  {
>>  	struct the_nilfs *nilfs = sci->sc_super->s_fs_info;
>> @@ -1160,6 +1224,13 @@ static int nilfs_segctor_collect_blocks(struct nilfs_sc_info *sci, int mode)
>>  		}
>>  		sci->sc_stage.flags |= NILFS_CF_SUFREED;
>>  
> 
>> +		if (mode == SC_LSEG_SR &&
> 
> This test ("mode == SC_LSEG_SR") can be removed.  When the thread
> comes here, it will always make a checkpoint.
> 
>> +		    nilfs_feature_track_live_blks(nilfs)) {
>> +			err = nilfs_segctor_propagate_sufile(sci);
>> +			if (unlikely(err))
>> +				break;
>> +		}
>> +
>>  		err = nilfs_segctor_scan_file(sci, nilfs->ns_sufile,
>>  					      &nilfs_sc_file_ops);
>>  		if (unlikely(err))
>> diff --git a/fs/nilfs2/segment.h b/fs/nilfs2/segment.h
>> index a48d6de..5aa7f91 100644
>> --- a/fs/nilfs2/segment.h
>> +++ b/fs/nilfs2/segment.h
>> @@ -208,7 +208,8 @@ enum {
>>   */
>>  #define NILFS_SC_CLEANUP_RETRY	    3  /* Retry count of construction when
>>  					  destroying segctord */
>> -
>> +#define NILFS_SC_SUFILE_PROP_RETRY  10 /* Retry count of the propagate
>> +					  sufile loop */
> 
> How many times does the propagation loop has to be repeated
> until it converges ?
> 
> The current dirty block scanning function collects all dirty blocks of
> the specified file (i.e. SUFILE or DAT), traversing page cache, making
> and destructing list of dirty buffers, every time the propagation
> function is called.  It's so wasteful to repeat that many times.
> 
> Regards,
> Ryusuke Konishi
> 
>>  /*
>>   * Default values of timeout, in seconds.
>>   */
>> -- 
>> 2.3.7
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-05-10 11:04 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-03 10:05 [PATCH v2 0/9] nilfs2: implementation of cost-benefit GC policy Andreas Rohner
     [not found] ` <1430647522-14304-1-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-03 10:05   ` [PATCH v2 1/9] nilfs2: copy file system feature flags to the nilfs object Andreas Rohner
     [not found]     ` <1430647522-14304-2-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  1:54       ` Ryusuke Konishi
     [not found]         ` <20150509.105445.1816655707671265145.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 18:41           ` Andreas Rohner
2015-05-03 10:05   ` [PATCH v2 2/9] nilfs2: extend SUFILE on-disk format to enable tracking of live blocks Andreas Rohner
     [not found]     ` <1430647522-14304-3-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  2:24       ` Ryusuke Konishi
     [not found]         ` <20150509.112403.380867861504859109.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 18:47           ` Andreas Rohner
2015-05-03 10:05   ` [PATCH v2 3/9] nilfs2: introduce new feature flag for tracking " Andreas Rohner
     [not found]     ` <1430647522-14304-4-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  2:28       ` Ryusuke Konishi
     [not found]         ` <20150509.112814.2026089040966346261.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 18:53           ` Andreas Rohner
2015-05-03 10:05   ` [PATCH v2 4/9] nilfs2: add kmem_cache for SUFILE cache nodes Andreas Rohner
     [not found]     ` <1430647522-14304-5-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  2:41       ` Ryusuke Konishi
     [not found]         ` <20150509.114149.1643183669812667339.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 19:10           ` Andreas Rohner
     [not found]             ` <554E5B9D.7070807-hi6Y0CQ0nG0@public.gmane.org>
2015-05-10  0:05               ` Ryusuke Konishi
2015-05-03 10:05   ` [PATCH v2 5/9] nilfs2: add SUFILE cache for changes to su_nlive_blks field Andreas Rohner
     [not found]     ` <1430647522-14304-6-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  4:09       ` Ryusuke Konishi
     [not found]         ` <20150509.130900.223492430584220355.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 19:39           ` Andreas Rohner
     [not found]             ` <554E626A.2030503-hi6Y0CQ0nG0@public.gmane.org>
2015-05-10  2:09               ` Ryusuke Konishi
2015-05-03 10:05   ` [PATCH v2 6/9] nilfs2: add tracking of block deletions and updates Andreas Rohner
     [not found]     ` <1430647522-14304-7-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09  7:05       ` Ryusuke Konishi
     [not found]         ` <20150509.160512.1087140271092828536.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 15:58           ` Ryusuke Konishi
2015-05-09 20:02           ` Andreas Rohner
     [not found]             ` <554E67C0.1050309-hi6Y0CQ0nG0@public.gmane.org>
2015-05-10  3:17               ` Ryusuke Konishi
2015-05-03 10:05   ` [PATCH v2 7/9] nilfs2: ensure that all dirty blocks are written out Andreas Rohner
     [not found]     ` <1430647522-14304-8-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-09 12:17       ` Ryusuke Konishi
     [not found]         ` <20150509.211741.1463241033923032068.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-09 20:18           ` Andreas Rohner
     [not found]             ` <554E6B7E.8070000-hi6Y0CQ0nG0@public.gmane.org>
2015-05-10  3:31               ` Ryusuke Konishi
2015-05-10 11:04           ` Andreas Rohner [this message]
     [not found]             ` <554F3B32.5050004-hi6Y0CQ0nG0@public.gmane.org>
2015-06-01  4:13               ` Ryusuke Konishi
     [not found]                 ` <20150601.131320.1075202804382267027.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-06-01 14:33                   ` Andreas Rohner
2015-05-03 10:05   ` [PATCH v2 8/9] nilfs2: correct live block tracking for GC protection period Andreas Rohner
     [not found]     ` <1430647522-14304-9-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-10 18:15       ` Ryusuke Konishi
     [not found]         ` <20150511.031512.1036934606749624197.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-10 18:23           ` Ryusuke Konishi
     [not found]             ` <20150511.032323.1250231827423193240.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-11  2:07               ` Ryusuke Konishi
     [not found]                 ` <20150511.110726.725667075147435663.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-11 12:32                   ` Andreas Rohner
2015-05-11 13:00           ` Andreas Rohner
     [not found]             ` <5550A7FC.4050709-hi6Y0CQ0nG0@public.gmane.org>
2015-05-12 14:31               ` Ryusuke Konishi
     [not found]                 ` <20150512.233126.2206330706583570566.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-12 15:37                   ` Andreas Rohner
2015-05-03 10:05   ` [PATCH v2 9/9] nilfs2: prevent starvation of segments protected by snapshots Andreas Rohner
     [not found]     ` <1430647522-14304-10-git-send-email-andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2015-05-20 14:43       ` Ryusuke Konishi
     [not found]         ` <20150520.234335.542615158366069430.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-20 15:49           ` Ryusuke Konishi
2015-05-22 18:10           ` Andreas Rohner
     [not found]             ` <555F70FD.6090500-hi6Y0CQ0nG0@public.gmane.org>
2015-05-31 16:45               ` Ryusuke Konishi
     [not found]                 ` <20150601.014550.269184778137708369.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-05-31 18:13                   ` Andreas Rohner
     [not found]                     ` <556B4F58.9080801-hi6Y0CQ0nG0@public.gmane.org>
2015-06-01  0:44                       ` Ryusuke Konishi
     [not found]                         ` <20150601.094441.24658496988941562.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2015-06-01 14:45                           ` Andreas Rohner
2015-05-03 10:07   ` [PATCH v2 1/5] nilfs-utils: extend SUFILE on-disk format to enable track live blocks Andreas Rohner
2015-05-03 10:07   ` [PATCH v2 2/5] nilfs-utils: add additional flags for nilfs_vdesc Andreas Rohner
2015-05-03 10:07   ` [PATCH v2 3/5] nilfs-utils: add support for tracking live blocks Andreas Rohner
2015-05-03 10:07   ` [PATCH v2 4/5] nilfs-utils: implement the tracking of live blocks for set_suinfo Andreas Rohner
2015-05-03 10:07   ` [PATCH v2 5/5] nilfs-utils: add support for greedy/cost-benefit policies Andreas Rohner
2015-05-05  3:09   ` [PATCH v2 0/9] nilfs2: implementation of cost-benefit GC policy Ryusuke Konishi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=554F3B32.5050004@gmx.net \
    --to=andreas.rohner-hi6y0cq0ng0@public.gmane.org \
    --cc=konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org \
    --cc=linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).