* [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations
@ 2014-01-21 13:59 Andreas Rohner
[not found] ` <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
0 siblings, 1 reply; 44+ messages in thread
From: Andreas Rohner @ 2014-01-21 13:59 UTC (permalink / raw)
To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner
Hi,
This is the second version of this patch set. It replaces the kind of
hacky use of v_flags with a proper implementation of
NILFS_IOCTL_SET_SUINFO ioctl.
v1->v2
* Implementation of NILFS_IOCTL_SET_SUINFO
* Added mc_min_free_blocks_threshold config option
(if clean segments < min_clean_segments)
* Added new command line param for nilfs-clean
* Update man- and config-files
* Simpler benchmark
This patch set implements a small new feature and there shouldn't be
any compatibility issues. It enables the GC to check how much free
space can be gained from cleaning a segment and if it is less than a
certain threshold it will abort the operation and try a different
segment. Although no blocks need to be moved, the SUFILE entry of the
corresponding segment needs to be updated to avoid an infinite loop.
This is potentially useful for all gc policies, but it is especially
beneficial for the timestamp policy. Lets assume for example a NILFS2
volume with 20% static files and lets assume these static files are in
the oldest segments. The current timestamp policy will select the oldest
segments and, since the data is static, move them mostly unchanged to
new segments. After a while they will become the oldest segments again.
Then timestamp will move them again. These moving operations are
expensive and unnecessary.
I used a simple benchmark to test the patch set (only a few lines of C).
I used a 100 GB partition and performed the following steps:
1. Write a 20 GB file
2. Write a 50 GB file
3. Overwrite chunks of 1 MB within the 50 GB file at random
4. Repeat step 3 until 60 GB of data is written
Steps 3 and 4 are only perfomed to get the GC started. So the benchmark
writes a 130 GB in total to a 100 GB partition.
HHD:
Timestamp GB Written: 340.7574
Timestamp GB Read: 208.2935
Timestamp Runtime: 7787.546s
Patched GB Written: 313.2566
Patched GB Read: 182.6389
Patched Runtime: 7410.892s
SSD:
Timestamp GB Written: 679.3901
Timestamp GB Read: 242.59
Timestamp Runtime: 3022.081s
Patched GB Written: 500.0095
Patched GB Read: 157.475
Patched Runtime: 2313.448
The results for the HDD clearly show, that about 20 GB less data has
been written and read in the patched version. It is reasonable to
assume, that these 20 GB are the static data.
The speed of the GC was tuned to the HDD. It was probably too aggressive
for the much faster SSD. That is probably the reason why the difference
in GB written and read is much higher than 20 GB.
Best regards,
Andreas Rohner
---
Andreas Rohner (5):
nilfs-utils: cldconfig add an option to set minimal free blocks
nilfs-utils: cleanerd: add custom error value to enable fast retry
nilfs-utils: refactoring of nilfs_reclaim_segment to add minblocks
param
nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl
nilfs-utils: man: add description of min_free_blocks_threshold
include/nilfs.h | 2 ++
include/nilfs2_fs.h | 41 ++++++++++++++++++++++++++++++++
include/nilfs_cleaner.h | 19 ++++++++-------
include/nilfs_gc.h | 6 +++--
lib/gc.c | 49 ++++++++++++++++++++++++++++++++++++---
lib/nilfs.c | 26 +++++++++++++++++++++
man/nilfs-clean.8 | 4 ++++
man/nilfs_cleanerd.conf.5 | 9 +++++++
sbin/cleanerd/cldconfig.c | 40 ++++++++++++++++++++++++++++++++
sbin/cleanerd/cldconfig.h | 4 ++++
sbin/cleanerd/cleanerd.c | 38 +++++++++++++++++++++++++++---
sbin/cleanerd/nilfs_cleanerd.conf | 9 +++++++
sbin/nilfs-clean/nilfs-clean.c | 18 ++++++++++----
sbin/nilfs-resize/nilfs-resize.c | 2 +-
14 files changed, 245 insertions(+), 22 deletions(-)
--
1.8.5.3
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 44+ messages in thread[parent not found: <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>]
* [PATCH v2 1/5] nilfs-utils: cldconfig add an option to set minimal free blocks [not found] ` <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-21 13:59 ` Andreas Rohner [not found] ` <22b5b3ac403052d3044dc8c1bebe323191376c03.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 13:59 ` [PATCH v2 2/5] nilfs-utils: cleanerd: add custom error value to enable fast retry Andreas Rohner ` (6 subsequent siblings) 7 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-21 13:59 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner With this option the user can specify the minimal number of free/dead blocks before a segment can be cleaned. This is a threshold for the GC to prevent needless moving of data. If there are less free blocks to gain from cleaning a segment than the specified number, the GC will abort and try again with a different segment. By setting it to 0 this feature is effectively disabled. Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> --- include/nilfs_cleaner.h | 19 ++++++++++--------- sbin/cleanerd/cldconfig.c | 40 +++++++++++++++++++++++++++++++++++++++ sbin/cleanerd/cldconfig.h | 4 ++++ sbin/cleanerd/cleanerd.c | 23 ++++++++++++++++++++++ sbin/cleanerd/nilfs_cleanerd.conf | 9 +++++++++ sbin/nilfs-clean/nilfs-clean.c | 18 ++++++++++++++---- 6 files changed, 100 insertions(+), 13 deletions(-) diff --git a/include/nilfs_cleaner.h b/include/nilfs_cleaner.h index 0bf02aa..42c4aa7 100644 --- a/include/nilfs_cleaner.h +++ b/include/nilfs_cleaner.h @@ -46,17 +46,18 @@ struct nilfs_cleaner_args { uint64_t start_segnum; /* start segment number */ uint64_t nsegs; /* number of segments */ uint32_t runtime; /* runtime in seconds */ - uint32_t pad2; + uint32_t min_free_blocks_threshold; }; /* valid flags */ -#define NILFS_CLEANER_ARG_PROTECTION_PERIOD (1 << 0) -#define NILFS_CLEANER_ARG_NSEGMENTS_PER_CLEAN (1 << 1) -#define NILFS_CLEANER_ARG_CLEANING_INTERVAL (1 << 2) -#define NILFS_CLEANER_ARG_USAGE_RATE_THRESHOLD (1 << 3) /* reserved */ -#define NILFS_CLEANER_ARG_START_SEGNUM (1 << 4) /* reserved */ -#define NILFS_CLEANER_ARG_NSEGS (1 << 5) /* reserved */ -#define NILFS_CLEANER_ARG_NPASSES (1 << 6) /* reserved */ -#define NILFS_CLEANER_ARG_RUNTIME (1 << 7) /* reserved */ +#define NILFS_CLEANER_ARG_PROTECTION_PERIOD (1 << 0) +#define NILFS_CLEANER_ARG_NSEGMENTS_PER_CLEAN (1 << 1) +#define NILFS_CLEANER_ARG_CLEANING_INTERVAL (1 << 2) +#define NILFS_CLEANER_ARG_USAGE_RATE_THRESHOLD (1 << 3) /* reserved */ +#define NILFS_CLEANER_ARG_START_SEGNUM (1 << 4) /* reserved */ +#define NILFS_CLEANER_ARG_NSEGS (1 << 5) /* reserved */ +#define NILFS_CLEANER_ARG_NPASSES (1 << 6) /* reserved */ +#define NILFS_CLEANER_ARG_RUNTIME (1 << 7) /* reserved */ +#define NILFS_CLEANER_ARG_MIN_FREE_BLOCKS_THRESHOLD (1 << 8) enum { NILFS_CLEANER_STATUS_IDLE, diff --git a/sbin/cleanerd/cldconfig.c b/sbin/cleanerd/cldconfig.c index 270d360..4d7589c 100644 --- a/sbin/cleanerd/cldconfig.c +++ b/sbin/cleanerd/cldconfig.c @@ -456,6 +456,34 @@ nilfs_cldconfig_handle_mc_nsegments_per_clean(struct nilfs_cldconfig *config, } static int +nilfs_cldconfig_handle_min_free_blocks(struct nilfs_cldconfig *config, + char **tokens, size_t ntoks, + struct nilfs *nilfs) +{ + unsigned long n; + + if (nilfs_cldconfig_get_ulong_argument(tokens, ntoks, &n) < 0) + return 0; + + config->cf_min_free_blocks_threshold = n; + return 0; +} + +static int +nilfs_cldconfig_handle_mc_min_free_blocks(struct nilfs_cldconfig *config, + char **tokens, size_t ntoks, + struct nilfs *nilfs) +{ + unsigned long n; + + if (nilfs_cldconfig_get_ulong_argument(tokens, ntoks, &n) < 0) + return 0; + + config->cf_mc_min_free_blocks_threshold = n; + return 0; +} + +static int nilfs_cldconfig_handle_cleaning_interval(struct nilfs_cldconfig *config, char **tokens, size_t ntoks, struct nilfs *nilfs) @@ -576,6 +604,14 @@ nilfs_cldconfig_keyword_table[] = { "log_priority", 2, 2, nilfs_cldconfig_handle_log_priority }, + { + "min_free_blocks_threshold", 2, 2, + nilfs_cldconfig_handle_min_free_blocks + }, + { + "mc_min_free_blocks_threshold", 2, 2, + nilfs_cldconfig_handle_mc_min_free_blocks + }, }; #define NILFS_CLDCONFIG_NKEYWORDS \ @@ -641,6 +677,10 @@ static void nilfs_cldconfig_set_default(struct nilfs_cldconfig *config, config->cf_retry_interval.tv_usec = 0; config->cf_use_mmap = NILFS_CLDCONFIG_USE_MMAP; config->cf_log_priority = NILFS_CLDCONFIG_LOG_PRIORITY; + config->cf_min_free_blocks_threshold = + NILFS_CLDCONFIG_MIN_FREE_BLOCKS_THRESHOLD; + config->cf_mc_min_free_blocks_threshold = + NILFS_CLDCONFIG_MC_MIN_FREE_BLOCKS_THRESHOLD; } static inline int iseol(int c) diff --git a/sbin/cleanerd/cldconfig.h b/sbin/cleanerd/cldconfig.h index 188ce9b..49ee6be 100644 --- a/sbin/cleanerd/cldconfig.h +++ b/sbin/cleanerd/cldconfig.h @@ -102,6 +102,8 @@ struct nilfs_cldconfig { struct timeval cf_retry_interval; int cf_use_mmap; int cf_log_priority; + unsigned long cf_min_free_blocks_threshold; + unsigned long cf_mc_min_free_blocks_threshold; }; #define NILFS_CLDCONFIG_SELECTION_POLICY_IMPORTANCE \ @@ -120,6 +122,8 @@ struct nilfs_cldconfig { #define NILFS_CLDCONFIG_RETRY_INTERVAL 60 #define NILFS_CLDCONFIG_USE_MMAP 1 #define NILFS_CLDCONFIG_LOG_PRIORITY LOG_INFO +#define NILFS_CLDCONFIG_MIN_FREE_BLOCKS_THRESHOLD 256 +#define NILFS_CLDCONFIG_MC_MIN_FREE_BLOCKS_THRESHOLD 128 #define NILFS_CLDCONFIG_NSEGMENTS_PER_CLEAN_MAX 32 diff --git a/sbin/cleanerd/cleanerd.c b/sbin/cleanerd/cleanerd.c index 86dfcf7..eb6eace 100644 --- a/sbin/cleanerd/cleanerd.c +++ b/sbin/cleanerd/cleanerd.c @@ -172,6 +172,7 @@ struct nilfs_cleanerd { struct timeval cleaning_interval; struct timeval target; struct timeval timeout; + unsigned long min_free_blocks_threshold; __u64 prev_nongc_ctime; mqd_t recvq; char *recvq_name; @@ -184,6 +185,7 @@ struct nilfs_cleanerd { long mm_ncleansegs; struct timeval mm_protection_period; struct timeval mm_cleaning_interval; + unsigned long mm_min_free_blocks_threshold; }; /** @@ -459,6 +461,14 @@ nilfs_cleanerd_protection_period(struct nilfs_cleanerd *cleanerd) &cleanerd->config.cf_protection_period; } +static unsigned long +nilfs_cleanerd_min_free_blocks_threshold(struct nilfs_cleanerd *cleanerd) +{ + return cleanerd->running == 2 ? + cleanerd->mm_min_free_blocks_threshold : + cleanerd->min_free_blocks_threshold; +} + static void nilfs_cleanerd_reduce_ncleansegs_for_retry(struct nilfs_cleanerd *cleanerd) { @@ -998,6 +1008,13 @@ static int nilfs_cleanerd_cmd_run(struct nilfs_cleanerd *cleanerd, cleanerd->mm_cleaning_interval = cleanerd->cleaning_interval; } + /* minimal free blocks threshold */ + if (req2->args.valid & NILFS_CLEANER_ARG_MIN_FREE_BLOCKS_THRESHOLD) { + cleanerd->mm_min_free_blocks_threshold = + req2->args.min_free_blocks_threshold; + } else { + cleanerd->mm_min_free_blocks_threshold = 0; + } /* number of passes */ if (req2->args.valid & NILFS_CLEANER_ARG_NPASSES) { if (!req2->args.npasses) @@ -1235,10 +1252,14 @@ static int nilfs_cleanerd_handle_clean_check(struct nilfs_cleanerd *cleanerd, /* disk space is close to limit -- accelerate cleaning */ cleanerd->ncleansegs = config->cf_mc_nsegments_per_clean; cleanerd->cleaning_interval = config->cf_mc_cleaning_interval; + cleanerd->min_free_blocks_threshold = + config->cf_mc_min_free_blocks_threshold; } else { /* continue to run */ cleanerd->ncleansegs = config->cf_nsegments_per_clean; cleanerd->cleaning_interval = config->cf_cleaning_interval; + cleanerd->min_free_blocks_threshold = + config->cf_min_free_blocks_threshold; } return 0; /* do gc */ @@ -1424,6 +1445,8 @@ static int nilfs_cleanerd_clean_loop(struct nilfs_cleanerd *cleanerd) cleanerd->ncleansegs = cleanerd->config.cf_nsegments_per_clean; cleanerd->cleaning_interval = cleanerd->config.cf_cleaning_interval; + cleanerd->min_free_blocks_threshold = + cleanerd->config.cf_min_free_blocks_threshold; if (nilfs_cleanerd_automatic_suspend(cleanerd)) diff --git a/sbin/cleanerd/nilfs_cleanerd.conf b/sbin/cleanerd/nilfs_cleanerd.conf index 26872aa..23ef8de 100644 --- a/sbin/cleanerd/nilfs_cleanerd.conf +++ b/sbin/cleanerd/nilfs_cleanerd.conf @@ -51,6 +51,15 @@ mc_cleaning_interval 1 # Retry interval in seconds. retry_interval 60 +# Minimal number of free blocks before a segment +# can be cleaned. Set to 0 to disable it. +min_free_blocks_threshold 256 + +# Minimal number of free blocks before a segment +# can be cleaned. Set to 0 to disable it. +# if clean segments < min_clean_segments +mc_min_free_blocks_threshold 128 + # Use mmap when reading segments if supported. use_mmap diff --git a/sbin/nilfs-clean/nilfs-clean.c b/sbin/nilfs-clean/nilfs-clean.c index 55abd55..23b7e61 100644 --- a/sbin/nilfs-clean/nilfs-clean.c +++ b/sbin/nilfs-clean/nilfs-clean.c @@ -77,6 +77,7 @@ const static struct option long_option[] = { {"stop", no_argument, NULL, 'b'}, {"suspend", no_argument, NULL, 's'}, {"speed", required_argument, NULL, 'S'}, + {"free-blocks-threshold", optional_argument, NULL, 't'}, {"verbose", no_argument, NULL, 'v'}, {"version", no_argument, NULL, 'V'}, {NULL, 0, NULL, 0} @@ -95,12 +96,15 @@ const static struct option long_option[] = { " -s, --suspend\t\tsuspend cleaner\n" \ " -S, --speed=COUNT[/SECONDS]\n" \ " \t\tset GC speed\n" \ + " -t, --free-blocks-threshold=COUNT\n" \ + " \t\tset minimal number of free blocks before\n" \ + " \t\ta segment can be cleaned\n" \ " -v, --verbose\t\tverbose mode\n" \ " -V, --version\t\tdisplay version and exit\n" #else #define NILFS_CLEAN_USAGE \ "Usage: %s [-b] [-c [conffile]] [-h] [-l] [-p protection-period]" \ - " [-q] [-r] [-s] [-S gc-speed] [-v] [-V] [device]\n" + " [-q] [-r] [-s] [-S gc-speed] [-t blocks][-v] [-V] [device]\n" #endif /* _GNU_SOURCE */ @@ -124,6 +128,7 @@ static const char *conffile = NULL; static unsigned long protection_period = ULONG_MAX; static int nsegments_per_clean = 2; static struct timespec cleaning_interval = { 0, 100000000 }; /* 100 msec */ +static unsigned long min_free_blocks_threshold; static sigjmp_buf nilfs_clean_env; static struct nilfs_cleaner *nilfs_cleaner; @@ -164,9 +169,11 @@ static int nilfs_clean_do_run(struct nilfs_cleaner *cleaner) args.nsegments_per_clean = nsegments_per_clean; args.cleaning_interval = cleaning_interval.tv_sec; args.cleaning_interval_nsec = cleaning_interval.tv_nsec; + args.min_free_blocks_threshold = min_free_blocks_threshold; args.valid = (NILFS_CLEANER_ARG_NPASSES | NILFS_CLEANER_ARG_CLEANING_INTERVAL | - NILFS_CLEANER_ARG_NSEGMENTS_PER_CLEAN); + NILFS_CLEANER_ARG_NSEGMENTS_PER_CLEAN | + NILFS_CLEANER_ARG_MIN_FREE_BLOCKS_THRESHOLD); if (protection_period != ULONG_MAX) { args.protection_period = protection_period; @@ -434,10 +441,10 @@ static void nilfs_clean_parse_options(int argc, char *argv[]) int c; #ifdef _GNU_SOURCE - while ((c = getopt_long(argc, argv, "bc::hlp:qrsS:vV", + while ((c = getopt_long(argc, argv, "bc::hlp:qrsS:t:vV", long_option, &option_index)) >= 0) { #else - while ((c = getopt(argc, argv, "bc::hlp:qrsS:vV")) >= 0) { + while ((c = getopt(argc, argv, "bc::hlp:qrsS:t:vV")) >= 0) { #endif /* _GNU_SOURCE */ switch (c) { case 'b': @@ -472,6 +479,9 @@ static void nilfs_clean_parse_options(int argc, char *argv[]) if (nilfs_clean_parse_gcspeed(optarg) < 0) exit(EXIT_FAILURE); break; + case 't': + min_free_blocks_threshold = strtoul(optarg, NULL, 10); + break; case 'v': verbose = 1; break; -- 1.8.5.3 -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 44+ messages in thread
[parent not found: <22b5b3ac403052d3044dc8c1bebe323191376c03.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 1/5] nilfs-utils: cldconfig add an option to set minimal free blocks [not found] ` <22b5b3ac403052d3044dc8c1bebe323191376c03.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-21 14:10 ` dexen deVries 2014-01-21 14:38 ` Andreas Rohner 2014-01-21 14:53 ` Andreas Rohner 2014-01-23 17:49 ` Vyacheslav Dubeyko 2 siblings, 1 reply; 44+ messages in thread From: dexen deVries @ 2014-01-21 14:10 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner Hi Andreas, I like the idea you're introducing; should lessen trashing by GC on my workstations. On Tuesday 21 of January 2014 14:59:40 you wrote: > With this option the user can specify the minimal number of free/dead > blocks before a segment can be cleaned. This is a threshold for the GC > to prevent needless moving of data. If there are less free blocks > to gain from cleaning a segment than the specified number, the GC will > abort and try again with a different segment. Will your patches be smart enough to re-claim space if the only available free blocks are under threshold? Scenario: FS is running out of free space, and the free blocks to be reclaimed are scattered through segments in small numbers. Several segments have a small number of free blocks, but in each segment the amount is under the threshold. Regards, -- dexen deVries [[[↓][→]]] -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v2 1/5] nilfs-utils: cldconfig add an option to set minimal free blocks 2014-01-21 14:10 ` dexen deVries @ 2014-01-21 14:38 ` Andreas Rohner 0 siblings, 0 replies; 44+ messages in thread From: Andreas Rohner @ 2014-01-21 14:38 UTC (permalink / raw) To: dexen deVries, linux-nilfs-u79uwXL29TY76Z2rM5mHXA Hi Dexen, On 2014-01-21 15:10, dexen deVries wrote: > Will your patches be smart enough to re-claim space if the only available free > blocks are under threshold? > > > Scenario: FS is running out of free space, and the free blocks to be reclaimed > are scattered through segments in small numbers. Several segments have a small > number of free blocks, but in each segment the amount is under the threshold. No my patches won't do that, but there are two new config options. Namely min_free_blocks_threshold and mc_min_free_blocks_threshold, whereby the latter is used if the number of clean segments is below min_clean_segments, similar to the other mc options. To accommodate your scenario you could simple set a very low value or even 0 to mc_min_free_blocks_threshold. So if the number of clean segments drops below min_clean_segments my patches would be disabled. Regards, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v2 1/5] nilfs-utils: cldconfig add an option to set minimal free blocks [not found] ` <22b5b3ac403052d3044dc8c1bebe323191376c03.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 14:10 ` dexen deVries @ 2014-01-21 14:53 ` Andreas Rohner 2014-01-23 17:49 ` Vyacheslav Dubeyko 2 siblings, 0 replies; 44+ messages in thread From: Andreas Rohner @ 2014-01-21 14:53 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 2014-01-21 14:59, Andreas Rohner wrote: > With this option the user can specify the minimal number of free/dead > blocks before a segment can be cleaned. This is a threshold for the GC > to prevent needless moving of data. If there are less free blocks > to gain from cleaning a segment than the specified number, the GC will > abort and try again with a different segment. Come to think of it, "blocks" is not a good unit. Maybe I should change that to "% of a segment" and/or "KB/MB". It shouldn't be hard... Regards, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v2 1/5] nilfs-utils: cldconfig add an option to set minimal free blocks [not found] ` <22b5b3ac403052d3044dc8c1bebe323191376c03.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 14:10 ` dexen deVries 2014-01-21 14:53 ` Andreas Rohner @ 2014-01-23 17:49 ` Vyacheslav Dubeyko [not found] ` <B1FCAEBD-EB58-4A06-BD6B-7D4FB533D9F1-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> 2 siblings, 1 reply; 44+ messages in thread From: Vyacheslav Dubeyko @ 2014-01-23 17:49 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Jan 21, 2014, at 4:59 PM, Andreas Rohner wrote: > With this option the user can specify the minimal number of free/dead > blocks before a segment can be cleaned. "Free or dead" or "free and dead"? What do you mean by free blocks? "Free blocks" sounds for me as blocks that I can allocate and to use immediately without any operations. If you mean invalid blocks (blocks that it was updated in this segment) then I need to do garbage collection operation for making it free. So, free blocks are really confusing term. And "dead blocks" sounds for me not very good. > This is a threshold for the GC > to prevent needless moving of data. If there are less free blocks > to gain from cleaning a segment than the specified number, the GC will > abort and try again with a different segment. > > By setting it to 0 this feature is effectively disabled. > > Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> > --- > include/nilfs_cleaner.h | 19 ++++++++++--------- > sbin/cleanerd/cldconfig.c | 40 +++++++++++++++++++++++++++++++++++++++ > sbin/cleanerd/cldconfig.h | 4 ++++ > sbin/cleanerd/cleanerd.c | 23 ++++++++++++++++++++++ > sbin/cleanerd/nilfs_cleanerd.conf | 9 +++++++++ > sbin/nilfs-clean/nilfs-clean.c | 18 ++++++++++++++---- > 6 files changed, 100 insertions(+), 13 deletions(-) > > diff --git a/include/nilfs_cleaner.h b/include/nilfs_cleaner.h > index 0bf02aa..42c4aa7 100644 > --- a/include/nilfs_cleaner.h > +++ b/include/nilfs_cleaner.h > @@ -46,17 +46,18 @@ struct nilfs_cleaner_args { > uint64_t start_segnum; /* start segment number */ > uint64_t nsegs; /* number of segments */ > uint32_t runtime; /* runtime in seconds */ > - uint32_t pad2; > + uint32_t min_free_blocks_threshold; > }; > /* valid flags */ > -#define NILFS_CLEANER_ARG_PROTECTION_PERIOD (1 << 0) > -#define NILFS_CLEANER_ARG_NSEGMENTS_PER_CLEAN (1 << 1) > -#define NILFS_CLEANER_ARG_CLEANING_INTERVAL (1 << 2) > -#define NILFS_CLEANER_ARG_USAGE_RATE_THRESHOLD (1 << 3) /* reserved */ > -#define NILFS_CLEANER_ARG_START_SEGNUM (1 << 4) /* reserved */ > -#define NILFS_CLEANER_ARG_NSEGS (1 << 5) /* reserved */ > -#define NILFS_CLEANER_ARG_NPASSES (1 << 6) /* reserved */ > -#define NILFS_CLEANER_ARG_RUNTIME (1 << 7) /* reserved */ > +#define NILFS_CLEANER_ARG_PROTECTION_PERIOD (1 << 0) > +#define NILFS_CLEANER_ARG_NSEGMENTS_PER_CLEAN (1 << 1) > +#define NILFS_CLEANER_ARG_CLEANING_INTERVAL (1 << 2) > +#define NILFS_CLEANER_ARG_USAGE_RATE_THRESHOLD (1 << 3) /* reserved */ > +#define NILFS_CLEANER_ARG_START_SEGNUM (1 << 4) /* reserved */ > +#define NILFS_CLEANER_ARG_NSEGS (1 << 5) /* reserved */ > +#define NILFS_CLEANER_ARG_NPASSES (1 << 6) /* reserved */ > +#define NILFS_CLEANER_ARG_RUNTIME (1 << 7) /* reserved */ > +#define NILFS_CLEANER_ARG_MIN_FREE_BLOCKS_THRESHOLD (1 << 8) What about NILFS_CLEANER_ARG_MIN_FBLK_THRESHOLD? But free blocks really confuse me. > > enum { > NILFS_CLEANER_STATUS_IDLE, > diff --git a/sbin/cleanerd/cldconfig.c b/sbin/cleanerd/cldconfig.c > index 270d360..4d7589c 100644 > --- a/sbin/cleanerd/cldconfig.c > +++ b/sbin/cleanerd/cldconfig.c > @@ -456,6 +456,34 @@ nilfs_cldconfig_handle_mc_nsegments_per_clean(struct nilfs_cldconfig *config, > } > > static int > +nilfs_cldconfig_handle_min_free_blocks(struct nilfs_cldconfig *config, > + char **tokens, size_t ntoks, > + struct nilfs *nilfs) > +{ > + unsigned long n; > + > + if (nilfs_cldconfig_get_ulong_argument(tokens, ntoks, &n) < 0) > + return 0; > + > + config->cf_min_free_blocks_threshold = n; > + return 0; > +} > + > +static int > +nilfs_cldconfig_handle_mc_min_free_blocks(struct nilfs_cldconfig *config, > + char **tokens, size_t ntoks, > + struct nilfs *nilfs) > +{ > + unsigned long n; > + > + if (nilfs_cldconfig_get_ulong_argument(tokens, ntoks, &n) < 0) > + return 0; > + > + config->cf_mc_min_free_blocks_threshold = n; > + return 0; > +} > + > +static int > nilfs_cldconfig_handle_cleaning_interval(struct nilfs_cldconfig *config, > char **tokens, size_t ntoks, > struct nilfs *nilfs) > @@ -576,6 +604,14 @@ nilfs_cldconfig_keyword_table[] = { > "log_priority", 2, 2, > nilfs_cldconfig_handle_log_priority > }, > + { > + "min_free_blocks_threshold", 2, 2, > + nilfs_cldconfig_handle_min_free_blocks > + }, > + { > + "mc_min_free_blocks_threshold", 2, 2, > + nilfs_cldconfig_handle_mc_min_free_blocks > + }, > }; > > #define NILFS_CLDCONFIG_NKEYWORDS \ > @@ -641,6 +677,10 @@ static void nilfs_cldconfig_set_default(struct nilfs_cldconfig *config, > config->cf_retry_interval.tv_usec = 0; > config->cf_use_mmap = NILFS_CLDCONFIG_USE_MMAP; > config->cf_log_priority = NILFS_CLDCONFIG_LOG_PRIORITY; > + config->cf_min_free_blocks_threshold = > + NILFS_CLDCONFIG_MIN_FREE_BLOCKS_THRESHOLD; > + config->cf_mc_min_free_blocks_threshold = > + NILFS_CLDCONFIG_MC_MIN_FREE_BLOCKS_THRESHOLD; > } > > static inline int iseol(int c) > diff --git a/sbin/cleanerd/cldconfig.h b/sbin/cleanerd/cldconfig.h > index 188ce9b..49ee6be 100644 > --- a/sbin/cleanerd/cldconfig.h > +++ b/sbin/cleanerd/cldconfig.h > @@ -102,6 +102,8 @@ struct nilfs_cldconfig { > struct timeval cf_retry_interval; > int cf_use_mmap; > int cf_log_priority; > + unsigned long cf_min_free_blocks_threshold; > + unsigned long cf_mc_min_free_blocks_threshold; > }; > > #define NILFS_CLDCONFIG_SELECTION_POLICY_IMPORTANCE \ > @@ -120,6 +122,8 @@ struct nilfs_cldconfig { > #define NILFS_CLDCONFIG_RETRY_INTERVAL 60 > #define NILFS_CLDCONFIG_USE_MMAP 1 > #define NILFS_CLDCONFIG_LOG_PRIORITY LOG_INFO > +#define NILFS_CLDCONFIG_MIN_FREE_BLOCKS_THRESHOLD 256 > +#define NILFS_CLDCONFIG_MC_MIN_FREE_BLOCKS_THRESHOLD 128 Why 256 or 128? What is good? What is bad? I am a user. How can I define what value is appropriate? How should I think for choosing some value? For example, I want to have the fastest performance as possible. How can I define an appropriate value? Or, maybe, I want to clean as many segments as possible. How should I think? Or, maybe, I want combination of GC policies. Thanks, Vyacheslav Dubeyko. -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <B1FCAEBD-EB58-4A06-BD6B-7D4FB533D9F1-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>]
* Re: [PATCH v2 1/5] nilfs-utils: cldconfig add an option to set minimal free blocks [not found] ` <B1FCAEBD-EB58-4A06-BD6B-7D4FB533D9F1-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> @ 2014-01-23 18:31 ` Andreas Rohner 0 siblings, 0 replies; 44+ messages in thread From: Andreas Rohner @ 2014-01-23 18:31 UTC (permalink / raw) To: Vyacheslav Dubeyko; +Cc: linux-nilfs On 2014-01-23 18:49, Vyacheslav Dubeyko wrote: > > On Jan 21, 2014, at 4:59 PM, Andreas Rohner wrote: > >> With this option the user can specify the minimal number of free/dead >> blocks before a segment can be cleaned. > > "Free or dead" or "free and dead"? What do you mean by free blocks? > "Free blocks" sounds for me as blocks that I can allocate and to use > immediately without any operations. If you mean invalid blocks (blocks > that it was updated in this segment) then I need to do garbage collection > operation for making it free. So, free blocks are really confusing term. And > "dead blocks" sounds for me not very good. I mean updated or deleted blocks, that need to be garbage collected. Here [1] they are called "dead blocks". >> This is a threshold for the GC >> to prevent needless moving of data. If there are less free blocks >> to gain from cleaning a segment than the specified number, the GC will >> abort and try again with a different segment. >> >> By setting it to 0 this feature is effectively disabled. >> >> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> >> --- >> include/nilfs_cleaner.h | 19 ++++++++++--------- >> sbin/cleanerd/cldconfig.c | 40 +++++++++++++++++++++++++++++++++++++++ >> sbin/cleanerd/cldconfig.h | 4 ++++ >> sbin/cleanerd/cleanerd.c | 23 ++++++++++++++++++++++ >> sbin/cleanerd/nilfs_cleanerd.conf | 9 +++++++++ >> sbin/nilfs-clean/nilfs-clean.c | 18 ++++++++++++++---- >> 6 files changed, 100 insertions(+), 13 deletions(-) >> >> diff --git a/include/nilfs_cleaner.h b/include/nilfs_cleaner.h >> index 0bf02aa..42c4aa7 100644 >> --- a/include/nilfs_cleaner.h >> +++ b/include/nilfs_cleaner.h >> @@ -46,17 +46,18 @@ struct nilfs_cleaner_args { >> uint64_t start_segnum; /* start segment number */ >> uint64_t nsegs; /* number of segments */ >> uint32_t runtime; /* runtime in seconds */ >> - uint32_t pad2; >> + uint32_t min_free_blocks_threshold; >> }; >> /* valid flags */ >> -#define NILFS_CLEANER_ARG_PROTECTION_PERIOD (1 << 0) >> -#define NILFS_CLEANER_ARG_NSEGMENTS_PER_CLEAN (1 << 1) >> -#define NILFS_CLEANER_ARG_CLEANING_INTERVAL (1 << 2) >> -#define NILFS_CLEANER_ARG_USAGE_RATE_THRESHOLD (1 << 3) /* reserved */ >> -#define NILFS_CLEANER_ARG_START_SEGNUM (1 << 4) /* reserved */ >> -#define NILFS_CLEANER_ARG_NSEGS (1 << 5) /* reserved */ >> -#define NILFS_CLEANER_ARG_NPASSES (1 << 6) /* reserved */ >> -#define NILFS_CLEANER_ARG_RUNTIME (1 << 7) /* reserved */ >> +#define NILFS_CLEANER_ARG_PROTECTION_PERIOD (1 << 0) >> +#define NILFS_CLEANER_ARG_NSEGMENTS_PER_CLEAN (1 << 1) >> +#define NILFS_CLEANER_ARG_CLEANING_INTERVAL (1 << 2) >> +#define NILFS_CLEANER_ARG_USAGE_RATE_THRESHOLD (1 << 3) /* reserved */ >> +#define NILFS_CLEANER_ARG_START_SEGNUM (1 << 4) /* reserved */ >> +#define NILFS_CLEANER_ARG_NSEGS (1 << 5) /* reserved */ >> +#define NILFS_CLEANER_ARG_NPASSES (1 << 6) /* reserved */ >> +#define NILFS_CLEANER_ARG_RUNTIME (1 << 7) /* reserved */ >> +#define NILFS_CLEANER_ARG_MIN_FREE_BLOCKS_THRESHOLD (1 << 8) > > What about NILFS_CLEANER_ARG_MIN_FBLK_THRESHOLD? > But free blocks really confuse me. Yes maybe the name is a bit confusing. >> #define NILFS_CLDCONFIG_SELECTION_POLICY_IMPORTANCE \ >> @@ -120,6 +122,8 @@ struct nilfs_cldconfig { >> #define NILFS_CLDCONFIG_RETRY_INTERVAL 60 >> #define NILFS_CLDCONFIG_USE_MMAP 1 >> #define NILFS_CLDCONFIG_LOG_PRIORITY LOG_INFO >> +#define NILFS_CLDCONFIG_MIN_FREE_BLOCKS_THRESHOLD 256 >> +#define NILFS_CLDCONFIG_MC_MIN_FREE_BLOCKS_THRESHOLD 128 > > Why 256 or 128? What is good? What is bad? > I am a user. How can I define what value is appropriate? > How should I think for choosing some value? > For example, I want to have the fastest performance as possible. How can I define > an appropriate value? Or, maybe, I want to clean as many segments as possible. > How should I think? > > Or, maybe, I want combination of GC policies. It is the minimal number of dead or free or invalid or however you want to call them blocks, before a segment is allowed to be cleaned by the GC. It is a tradeoff, but you are right it is not very intuitive. As I stated before I am going to change that in the next version. br, Andreas Rohner [1] http://lwn.net/Articles/522507/ -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH v2 2/5] nilfs-utils: cleanerd: add custom error value to enable fast retry [not found] ` <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 13:59 ` [PATCH v2 1/5] nilfs-utils: cldconfig add an option to set minimal free blocks Andreas Rohner @ 2014-01-21 13:59 ` Andreas Rohner [not found] ` <e9d3dd17318a994fe7e2c100368212e0b4029e13.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 13:59 ` [PATCH v2 3/5] nilfs-utils: refactoring of nilfs_reclaim_segment to add minblocks param Andreas Rohner ` (5 subsequent siblings) 7 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-21 13:59 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner This patch adds the custom error value EGCTRYAGAIN, which signals to cleanerd to retry the GC process as soon as possible with no timeout. If the GC decides not to do any real work and only changes a few metadata bytes, there is no need for a timeout. Furthermore if the GC is running, there is probably not enough free space on the device and it is crucial to retry quickly. Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> --- include/nilfs_gc.h | 2 ++ sbin/cleanerd/cleanerd.c | 10 +++++++++- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/include/nilfs_gc.h b/include/nilfs_gc.h index 72e9947..7628ce1 100644 --- a/include/nilfs_gc.h +++ b/include/nilfs_gc.h @@ -27,4 +27,6 @@ static inline int nilfs_suinfo_reclaimable(const struct nilfs_suinfo *si) extern void (*nilfs_gc_logger)(int priority, const char *fmt, ...); +#define EGCTRYAGAIN 513 + #endif /* NILFS_GC_H */ diff --git a/sbin/cleanerd/cleanerd.c b/sbin/cleanerd/cleanerd.c index eb6eace..575656b 100644 --- a/sbin/cleanerd/cleanerd.c +++ b/sbin/cleanerd/cleanerd.c @@ -167,6 +167,7 @@ struct nilfs_cleanerd { int running; int fallback; int retry_cleaning; + int no_timeout; int shutdown; long ncleansegs; struct timeval cleaning_interval; @@ -883,7 +884,7 @@ static int nilfs_cleanerd_recalc_interval(struct nilfs_cleanerd *cleanerd, interval = nilfs_cleanerd_cleaning_interval(cleanerd); /* timercmp() does not work for '>=' or '<='. */ /* curr >= target */ - if (!timercmp(&curr, &cleanerd->target, <)) { + if (!timercmp(&curr, &cleanerd->target, <) || cleanerd->no_timeout) { cleanerd->timeout.tv_sec = 0; cleanerd->timeout.tv_usec = 0; timeradd(&curr, interval, &cleanerd->target); @@ -1369,6 +1370,7 @@ static ssize_t nilfs_cleanerd_clean_segments(struct nilfs_cleanerd *cleanerd, "number: %m"); goto out; } + cleanerd->no_timeout = 0; ret = nilfs_reclaim_segment(cleanerd->nilfs, segnums, nsegs, protseq, protcno); @@ -1394,6 +1396,11 @@ static ssize_t nilfs_cleanerd_clean_segments(struct nilfs_cleanerd *cleanerd, cleanerd->retry_cleaning = 0; } + } else if (ret == -EGCTRYAGAIN) { + cleanerd->fallback = 0; + cleanerd->retry_cleaning = 1; + cleanerd->no_timeout = 1; + ret = 0; } else if (ret < 0 && errno == ENOMEM) { nilfs_cleanerd_reduce_ncleansegs_for_retry(cleanerd); cleanerd->fallback = 1; @@ -1436,6 +1443,7 @@ static int nilfs_cleanerd_clean_loop(struct nilfs_cleanerd *cleanerd) cleanerd->running = 1; cleanerd->fallback = 0; cleanerd->retry_cleaning = 0; + cleanerd->no_timeout = 0; nilfs_cnoconv_reset(cleanerd->cnoconv); nilfs_gc_logger = syslog; -- 1.8.5.3 -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 44+ messages in thread
[parent not found: <e9d3dd17318a994fe7e2c100368212e0b4029e13.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 2/5] nilfs-utils: cleanerd: add custom error value to enable fast retry [not found] ` <e9d3dd17318a994fe7e2c100368212e0b4029e13.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-23 17:49 ` Vyacheslav Dubeyko [not found] ` <FE7FB751-68F4-48C3-A97A-782F4F6E69FE-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Vyacheslav Dubeyko @ 2014-01-23 17:49 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Jan 21, 2014, at 4:59 PM, Andreas Rohner wrote: > This patch adds the custom error value EGCTRYAGAIN, which signals to > cleanerd to retry the GC process as soon as possible with no timeout. > > If the GC decides not to do any real work and only changes a few > metadata bytes, there is no need for a timeout. Furthermore if the GC is > running, there is probably not enough free space on the device and it is > crucial to retry quickly. > What does it defend us from eating 100% of CPU? > Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> > --- > include/nilfs_gc.h | 2 ++ > sbin/cleanerd/cleanerd.c | 10 +++++++++- > 2 files changed, 11 insertions(+), 1 deletion(-) > > diff --git a/include/nilfs_gc.h b/include/nilfs_gc.h > index 72e9947..7628ce1 100644 > --- a/include/nilfs_gc.h > +++ b/include/nilfs_gc.h > @@ -27,4 +27,6 @@ static inline int nilfs_suinfo_reclaimable(const struct nilfs_suinfo *si) > > extern void (*nilfs_gc_logger)(int priority, const char *fmt, ...); > > +#define EGCTRYAGAIN 513 Why do you declare a new constant? What does it means value of 513? What bad in EAGAIN? Thanks, Vyacheslav Dubeyko. -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <FE7FB751-68F4-48C3-A97A-782F4F6E69FE-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>]
* Re: [PATCH v2 2/5] nilfs-utils: cleanerd: add custom error value to enable fast retry [not found] ` <FE7FB751-68F4-48C3-A97A-782F4F6E69FE-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> @ 2014-01-23 19:08 ` Andreas Rohner 0 siblings, 0 replies; 44+ messages in thread From: Andreas Rohner @ 2014-01-23 19:08 UTC (permalink / raw) To: Vyacheslav Dubeyko; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 2014-01-23 18:49, Vyacheslav Dubeyko wrote: > > On Jan 21, 2014, at 4:59 PM, Andreas Rohner wrote: > >> This patch adds the custom error value EGCTRYAGAIN, which signals to >> cleanerd to retry the GC process as soon as possible with no timeout. >> >> If the GC decides not to do any real work and only changes a few >> metadata bytes, there is no need for a timeout. Furthermore if the GC is >> running, there is probably not enough free space on the device and it is >> crucial to retry quickly. >> > > What does it defend us from eating 100% of CPU? Nothing really, but I don't think it's a big problem, because there is no possibility that you have constant 100% CPU. For every segment the timestamp is updated, so it will be protected by the protection period for a while. But you could be right. Maybe it is better to use a smaller timout instead of no timeout. I think I will change that in the next version of my patch. >> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> >> --- >> include/nilfs_gc.h | 2 ++ >> sbin/cleanerd/cleanerd.c | 10 +++++++++- >> 2 files changed, 11 insertions(+), 1 deletion(-) >> >> diff --git a/include/nilfs_gc.h b/include/nilfs_gc.h >> index 72e9947..7628ce1 100644 >> --- a/include/nilfs_gc.h >> +++ b/include/nilfs_gc.h >> @@ -27,4 +27,6 @@ static inline int nilfs_suinfo_reclaimable(const struct nilfs_suinfo *si) >> >> extern void (*nilfs_gc_logger)(int priority, const char *fmt, ...); >> >> +#define EGCTRYAGAIN 513 > > Why do you declare a new constant? What does it means value of 513? > What bad in EAGAIN? I just thought that it's not a good idea to use an error code, that really could be returned from some system call. 513 is just 512+1. I wanted to make sure, that my error code does not conflict with any other existing errors < 512. Come to think of it, maybe you are right. I don't think that any of the calls in nilfs_reclaim_segment would return EAGAIN. So we could use it theoretically. But if a real error occurred, that returned EAGAIN for some reason, we could end up with very undesirable results. br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH v2 3/5] nilfs-utils: refactoring of nilfs_reclaim_segment to add minblocks param [not found] ` <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 13:59 ` [PATCH v2 1/5] nilfs-utils: cldconfig add an option to set minimal free blocks Andreas Rohner 2014-01-21 13:59 ` [PATCH v2 2/5] nilfs-utils: cleanerd: add custom error value to enable fast retry Andreas Rohner @ 2014-01-21 13:59 ` Andreas Rohner [not found] ` <36ef66ee15b3de8ca00815a6baa7bf6b62ca57d4.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 13:59 ` [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl Andreas Rohner ` (4 subsequent siblings) 7 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-21 13:59 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner nilfs_reclaim_segment gathers all the necessary information to decide, how many reclaimable blocks there are in the segments that are to be cleaned. By enabling the passing of the threshold value minblocks, nilfs_reclaim_segment can check the actual value against the threshold. Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> --- include/nilfs_gc.h | 4 ++-- lib/gc.c | 5 +++-- sbin/cleanerd/cleanerd.c | 5 +++-- sbin/nilfs-resize/nilfs-resize.c | 2 +- 4 files changed, 9 insertions(+), 7 deletions(-) diff --git a/include/nilfs_gc.h b/include/nilfs_gc.h index 7628ce1..f9fbde4 100644 --- a/include/nilfs_gc.h +++ b/include/nilfs_gc.h @@ -15,8 +15,8 @@ #include "nilfs.h" ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, - __u64 *segnums, size_t nsegs, - __u64 protseq, nilfs_cno_t protcno); + __u64 *segnums, size_t nsegs, __u64 protseq, + nilfs_cno_t protcno, unsigned long minblocks); static inline int nilfs_suinfo_reclaimable(const struct nilfs_suinfo *si) diff --git a/lib/gc.c b/lib/gc.c index 1152299..0b0e2d6 100644 --- a/lib/gc.c +++ b/lib/gc.c @@ -607,10 +607,11 @@ static int nilfs_toss_bdescs(struct nilfs_vector *bdescv) * @nsegs: size of the @segnums array * @protseq: start of sequence number of protected segments * @protcno: start checkpoint number of protected period + * @minblocks: minimal number of free blocks in a segment */ ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, - __u64 *segnums, size_t nsegs, - __u64 protseq, nilfs_cno_t protcno) + __u64 *segnums, size_t nsegs, __u64 protseq, + nilfs_cno_t protcno, unsigned long minblocks) { struct nilfs_vector *vdescv, *bdescv, *periodv, *vblocknrv; sigset_t sigset, oldset, waitset; diff --git a/sbin/cleanerd/cleanerd.c b/sbin/cleanerd/cleanerd.c index 575656b..6dd9d24 100644 --- a/sbin/cleanerd/cleanerd.c +++ b/sbin/cleanerd/cleanerd.c @@ -1372,8 +1372,9 @@ static ssize_t nilfs_cleanerd_clean_segments(struct nilfs_cleanerd *cleanerd, } cleanerd->no_timeout = 0; - ret = nilfs_reclaim_segment(cleanerd->nilfs, segnums, nsegs, - protseq, protcno); + ret = nilfs_reclaim_segment(cleanerd->nilfs, segnums, + nsegs, protseq, protcno, + nilfs_cleanerd_min_free_blocks_threshold(cleanerd)); if (ret > 0) { for (i = 0; i < ret; i++) syslog(LOG_DEBUG, "segment %llu cleaned", diff --git a/sbin/nilfs-resize/nilfs-resize.c b/sbin/nilfs-resize/nilfs-resize.c index c8b0868..2850268 100644 --- a/sbin/nilfs-resize/nilfs-resize.c +++ b/sbin/nilfs-resize/nilfs-resize.c @@ -595,7 +595,7 @@ static ssize_t nilfs_resize_move_segments(struct nilfs *nilfs, return -1; ret = nilfs_reclaim_segment(nilfs, snp, nc, - sustat.ss_prot_seq, 0); + sustat.ss_prot_seq, 0, 0); if (ret < 0) return -1; -- 1.8.5.3 -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 44+ messages in thread
[parent not found: <36ef66ee15b3de8ca00815a6baa7bf6b62ca57d4.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 3/5] nilfs-utils: refactoring of nilfs_reclaim_segment to add minblocks param [not found] ` <36ef66ee15b3de8ca00815a6baa7bf6b62ca57d4.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-24 18:04 ` Ryusuke Konishi 0 siblings, 0 replies; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-24 18:04 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Tue, 21 Jan 2014 14:59:42 +0100, Andreas Rohner wrote: > nilfs_reclaim_segment gathers all the necessary information to decide, > how many reclaimable blocks there are in the segments that are to be > cleaned. > > By enabling the passing of the threshold value > minblocks, nilfs_reclaim_segment can check the actual value against the > threshold. > > Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> > --- > include/nilfs_gc.h | 4 ++-- > lib/gc.c | 5 +++-- > sbin/cleanerd/cleanerd.c | 5 +++-- > sbin/nilfs-resize/nilfs-resize.c | 2 +- > 4 files changed, 9 insertions(+), 7 deletions(-) > > diff --git a/include/nilfs_gc.h b/include/nilfs_gc.h > index 7628ce1..f9fbde4 100644 > --- a/include/nilfs_gc.h > +++ b/include/nilfs_gc.h > @@ -15,8 +15,8 @@ > #include "nilfs.h" > > ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, > - __u64 *segnums, size_t nsegs, > - __u64 protseq, nilfs_cno_t protcno); > + __u64 *segnums, size_t nsegs, __u64 protseq, > + nilfs_cno_t protcno, unsigned long minblocks); This breaks compatibility of libnilfsgc library. Please add new function with the extended arguments, the old function can share implementation of the new function. Ryusuke Konishi > static inline int nilfs_suinfo_reclaimable(const struct nilfs_suinfo *si) > diff --git a/lib/gc.c b/lib/gc.c > index 1152299..0b0e2d6 100644 > --- a/lib/gc.c > +++ b/lib/gc.c > @@ -607,10 +607,11 @@ static int nilfs_toss_bdescs(struct nilfs_vector *bdescv) > * @nsegs: size of the @segnums array > * @protseq: start of sequence number of protected segments > * @protcno: start checkpoint number of protected period > + * @minblocks: minimal number of free blocks in a segment > */ > ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, > - __u64 *segnums, size_t nsegs, > - __u64 protseq, nilfs_cno_t protcno) > + __u64 *segnums, size_t nsegs, __u64 protseq, > + nilfs_cno_t protcno, unsigned long minblocks) > { > struct nilfs_vector *vdescv, *bdescv, *periodv, *vblocknrv; > sigset_t sigset, oldset, waitset; > diff --git a/sbin/cleanerd/cleanerd.c b/sbin/cleanerd/cleanerd.c > index 575656b..6dd9d24 100644 > --- a/sbin/cleanerd/cleanerd.c > +++ b/sbin/cleanerd/cleanerd.c > @@ -1372,8 +1372,9 @@ static ssize_t nilfs_cleanerd_clean_segments(struct nilfs_cleanerd *cleanerd, > } > cleanerd->no_timeout = 0; > > - ret = nilfs_reclaim_segment(cleanerd->nilfs, segnums, nsegs, > - protseq, protcno); > + ret = nilfs_reclaim_segment(cleanerd->nilfs, segnums, > + nsegs, protseq, protcno, > + nilfs_cleanerd_min_free_blocks_threshold(cleanerd)); > if (ret > 0) { > for (i = 0; i < ret; i++) > syslog(LOG_DEBUG, "segment %llu cleaned", > diff --git a/sbin/nilfs-resize/nilfs-resize.c b/sbin/nilfs-resize/nilfs-resize.c > index c8b0868..2850268 100644 > --- a/sbin/nilfs-resize/nilfs-resize.c > +++ b/sbin/nilfs-resize/nilfs-resize.c > @@ -595,7 +595,7 @@ static ssize_t nilfs_resize_move_segments(struct nilfs *nilfs, > return -1; > > ret = nilfs_reclaim_segment(nilfs, snp, nc, > - sustat.ss_prot_seq, 0); > + sustat.ss_prot_seq, 0, 0); > if (ret < 0) > return -1; > > -- > 1.8.5.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> ` (2 preceding siblings ...) 2014-01-21 13:59 ` [PATCH v2 3/5] nilfs-utils: refactoring of nilfs_reclaim_segment to add minblocks param Andreas Rohner @ 2014-01-21 13:59 ` Andreas Rohner [not found] ` <72f8c37258d08ba9793b0c1bb0374dd8efcd4756.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 13:59 ` [PATCH v2 5/5] nilfs-utils: man: add description of min_free_blocks_threshold Andreas Rohner ` (3 subsequent siblings) 7 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-21 13:59 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner With this ioctl the segment usage information in the SUFILE can be updated from userspace. This is useful, because it allows the GC to modify and update segment usage entries for specific segments, which enables it to avoid unnecessary write operations. If a segment needs to be cleaned, but there is no or very little free space to be gained, the cleaning operation basically degrades to needless expensive copying of data. In the end the only thing that changes is the location of the data and a timestamp in the segment usage info. With this ioctl the GC can skip the copying and update the segment usage entries directly instead. Additionally this patch implements a simple check in nilfs_reclaim_segment. If the number of free blocks that can be gained by cleaning the segments is below the threshold of minblocks, it simply updates the segment usage information instead. Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> --- include/nilfs.h | 2 ++ include/nilfs2_fs.h | 41 +++++++++++++++++++++++++++++++++++++++++ lib/gc.c | 44 +++++++++++++++++++++++++++++++++++++++++++- lib/nilfs.c | 26 ++++++++++++++++++++++++++ 4 files changed, 112 insertions(+), 1 deletion(-) diff --git a/include/nilfs.h b/include/nilfs.h index 56286a9..1e9b034 100644 --- a/include/nilfs.h +++ b/include/nilfs.h @@ -299,6 +299,8 @@ int nilfs_delete_checkpoint(struct nilfs *, nilfs_cno_t); int nilfs_get_cpstat(const struct nilfs *, struct nilfs_cpstat *); ssize_t nilfs_get_suinfo(const struct nilfs *, __u64, struct nilfs_suinfo *, size_t); +ssize_t nilfs_set_suinfo(const struct nilfs *, struct nilfs_suinfo_update *, + size_t); int nilfs_get_sustat(const struct nilfs *, struct nilfs_sustat *); ssize_t nilfs_get_vinfo(const struct nilfs *, struct nilfs_vinfo *, size_t); ssize_t nilfs_get_bdescs(const struct nilfs *, struct nilfs_bdesc *, size_t); diff --git a/include/nilfs2_fs.h b/include/nilfs2_fs.h index e674f44..181c482 100644 --- a/include/nilfs2_fs.h +++ b/include/nilfs2_fs.h @@ -713,6 +713,45 @@ static inline int nilfs_suinfo_clean(const struct nilfs_suinfo *si) return !si->sui_flags; } +/** + * nilfs_suinfo_update - segment usage information update + * @sup_segnum: segment number + * @sup_flags: flags for which fields are active in sup_sui + * @sup_sui: segment usage information + */ +struct nilfs_suinfo_update { + __u64 sup_segnum; + __u32 sup_flags; + struct nilfs_suinfo sup_sui; +}; + +enum { + NILFS_SUINFO_UPDATE_LASTMOD, + NILFS_SUINFO_UPDATE_NBLOCKS, + NILFS_SUINFO_UPDATE_FLAGS, +}; + +#define NILFS_SUINFO_UPDATE_FNS(flag, name) \ +static inline void \ +nilfs_suinfo_update_set_##name(struct nilfs_suinfo_update *sup) \ +{ \ + sup->sup_flags |= 1UL << NILFS_SUINFO_UPDATE_##flag; \ +} \ +static inline void \ +nilfs_suinfo_update_clear_##name(struct nilfs_suinfo_update *sup) \ +{ \ + sup->sup_flags &= ~(1UL << NILFS_SUINFO_UPDATE_##flag); \ +} \ +static inline int \ +nilfs_suinfo_update_##name(const struct nilfs_suinfo_update *sup) \ +{ \ + return !!(sup->sup_flags & (1UL << NILFS_SUINFO_UPDATE_##flag));\ +} + +NILFS_SUINFO_UPDATE_FNS(LASTMOD, lastmod) +NILFS_SUINFO_UPDATE_FNS(NBLOCKS, nblocks) +NILFS_SUINFO_UPDATE_FNS(FLAGS, flags) + /* ioctl */ enum { NILFS_CHECKPOINT, @@ -867,5 +906,7 @@ struct nilfs_bdesc { _IOW(NILFS_IOCTL_IDENT, 0x8B, __u64) #define NILFS_IOCTL_SET_ALLOC_RANGE \ _IOW(NILFS_IOCTL_IDENT, 0x8C, __u64[2]) +#define NILFS_IOCTL_SET_SUINFO \ + _IOW(NILFS_IOCTL_IDENT, 0x8D, struct nilfs_argv) #endif /* _LINUX_NILFS_FS_H */ diff --git a/lib/gc.c b/lib/gc.c index 0b0e2d6..ebbe0ca 100644 --- a/lib/gc.c +++ b/lib/gc.c @@ -29,6 +29,10 @@ #include <syslog.h> #endif /* HAVE_SYSLOG_H */ +#if HAVE_SYS_TIME_H +#include <sys/time.h> +#endif /* HAVE_SYS_TIME */ + #include <errno.h> #include <assert.h> #include <stdarg.h> @@ -615,7 +619,10 @@ ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, { struct nilfs_vector *vdescv, *bdescv, *periodv, *vblocknrv; sigset_t sigset, oldset, waitset; - ssize_t n, ret = -1; + ssize_t n, i, ret = -1; + __u32 freeblocks; + struct nilfs_suinfo_update *supv; + struct timeval tv; if (nsegs == 0) return 0; @@ -678,6 +685,41 @@ ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, goto out_lock; } + freeblocks = (nilfs_get_blocks_per_segment(nilfs) * n) + - (nilfs_vector_get_size(vdescv) + + nilfs_vector_get_size(bdescv)); + + /* if there are less free blocks than the + * minimal threshold try to update suinfo + * instead of cleaning */ + if (freeblocks < minblocks * n) { + ret = gettimeofday(&tv, NULL); + if (ret < 0) + goto out_lock; + + supv = malloc(sizeof(struct nilfs_suinfo_update) * n); + if (supv == NULL) { + ret = -1; + goto out_lock; + } + + for (i = 0; i < n; ++i) { + supv[i].sup_segnum = segnums[i]; + supv[i].sup_flags = 0; + nilfs_suinfo_update_set_lastmod(&supv[i]); + supv[i].sup_sui.sui_lastmod = tv.tv_sec; + } + + ret = nilfs_set_suinfo(nilfs, supv, n); + free(supv); + if (ret >= 0) { + /* success, tell caller + * to try another segment/s */ + ret = -EGCTRYAGAIN; + goto out_lock; + } + } + ret = nilfs_clean_segments(nilfs, nilfs_vector_get_data(vdescv), nilfs_vector_get_size(vdescv), diff --git a/lib/nilfs.c b/lib/nilfs.c index 7265830..dbd03e6 100644 --- a/lib/nilfs.c +++ b/lib/nilfs.c @@ -548,6 +548,32 @@ ssize_t nilfs_get_suinfo(const struct nilfs *nilfs, __u64 segnum, } /** + * nilfs_set_suinfo - + * @nilfs: + * @sup: an array of nilfs_suinfo_update structs + * @nsup: number of elements in sup + */ +ssize_t nilfs_set_suinfo(const struct nilfs *nilfs, + struct nilfs_suinfo_update *sup, size_t nsup) +{ + struct nilfs_argv argv; + + if (nilfs->n_iocfd < 0) { + errno = EBADF; + return -1; + } + + argv.v_base = (unsigned long)sup; + argv.v_nmembs = nsup; + argv.v_size = sizeof(struct nilfs_suinfo_update); + argv.v_index = 0; + argv.v_flags = 0; + if (ioctl(nilfs->n_iocfd, NILFS_IOCTL_SET_SUINFO, &argv) < 0) + return -1; + return argv.v_nmembs; +} + +/** * nilfs_get_sustat - * @nilfs: * @sustat: -- 1.8.5.3 -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 44+ messages in thread
[parent not found: <72f8c37258d08ba9793b0c1bb0374dd8efcd4756.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <72f8c37258d08ba9793b0c1bb0374dd8efcd4756.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-23 17:49 ` Vyacheslav Dubeyko [not found] ` <62FA32DB-83AC-4570-BD73-618C169390FE-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> 2014-01-24 18:26 ` Ryusuke Konishi 1 sibling, 1 reply; 44+ messages in thread From: Vyacheslav Dubeyko @ 2014-01-23 17:49 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Jan 21, 2014, at 4:59 PM, Andreas Rohner wrote: [snip] > diff --git a/lib/gc.c b/lib/gc.c > index 0b0e2d6..ebbe0ca 100644 > --- a/lib/gc.c > +++ b/lib/gc.c > @@ -29,6 +29,10 @@ > #include <syslog.h> > #endif /* HAVE_SYSLOG_H */ > > +#if HAVE_SYS_TIME_H > +#include <sys/time.h> > +#endif /* HAVE_SYS_TIME */ > + > #include <errno.h> > #include <assert.h> > #include <stdarg.h> > @@ -615,7 +619,10 @@ ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, > { > struct nilfs_vector *vdescv, *bdescv, *periodv, *vblocknrv; > sigset_t sigset, oldset, waitset; > - ssize_t n, ret = -1; > + ssize_t n, i, ret = -1; > + __u32 freeblocks; > + struct nilfs_suinfo_update *supv; > + struct timeval tv; > > if (nsegs == 0) > return 0; > @@ -678,6 +685,41 @@ ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, > goto out_lock; > } > > + freeblocks = (nilfs_get_blocks_per_segment(nilfs) * n) > + - (nilfs_vector_get_size(vdescv) > + + nilfs_vector_get_size(bdescv)); > + > + /* if there are less free blocks than the > + * minimal threshold try to update suinfo > + * instead of cleaning */ > + if (freeblocks < minblocks * n) { > + ret = gettimeofday(&tv, NULL); > + if (ret < 0) > + goto out_lock; > + > + supv = malloc(sizeof(struct nilfs_suinfo_update) * n); Is it enough to allocate memory without zeroing? Do you free allocated memory in the case of error? Thanks, Vyacheslav Dubeyko. -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <62FA32DB-83AC-4570-BD73-618C169390FE-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>]
* Re: [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <62FA32DB-83AC-4570-BD73-618C169390FE-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> @ 2014-01-23 19:17 ` Andreas Rohner [not found] ` <52E16AE4.4000303-hi6Y0CQ0nG0@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-23 19:17 UTC (permalink / raw) To: Vyacheslav Dubeyko; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 2014-01-23 18:49, Vyacheslav Dubeyko wrote: > > On Jan 21, 2014, at 4:59 PM, Andreas Rohner wrote: > > [snip] >> diff --git a/lib/gc.c b/lib/gc.c >> index 0b0e2d6..ebbe0ca 100644 >> --- a/lib/gc.c >> +++ b/lib/gc.c >> @@ -29,6 +29,10 @@ >> #include <syslog.h> >> #endif /* HAVE_SYSLOG_H */ >> >> +#if HAVE_SYS_TIME_H >> +#include <sys/time.h> >> +#endif /* HAVE_SYS_TIME */ >> + >> #include <errno.h> >> #include <assert.h> >> #include <stdarg.h> >> @@ -615,7 +619,10 @@ ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, >> { >> struct nilfs_vector *vdescv, *bdescv, *periodv, *vblocknrv; >> sigset_t sigset, oldset, waitset; >> - ssize_t n, ret = -1; >> + ssize_t n, i, ret = -1; >> + __u32 freeblocks; >> + struct nilfs_suinfo_update *supv; >> + struct timeval tv; >> >> if (nsegs == 0) >> return 0; >> @@ -678,6 +685,41 @@ ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, >> goto out_lock; >> } >> >> + freeblocks = (nilfs_get_blocks_per_segment(nilfs) * n) >> + - (nilfs_vector_get_size(vdescv) >> + + nilfs_vector_get_size(bdescv)); >> + >> + /* if there are less free blocks than the >> + * minimal threshold try to update suinfo >> + * instead of cleaning */ >> + if (freeblocks < minblocks * n) { >> + ret = gettimeofday(&tv, NULL); >> + if (ret < 0) >> + goto out_lock; >> + >> + supv = malloc(sizeof(struct nilfs_suinfo_update) * n); > > Is it enough to allocate memory without zeroing? I use the flags to indicate, which parts of the structure are valid. I don't need to zero everything. > Do you free allocated memory in the case of error? Yes I immediately free it before I check the return value. Could you please read the code a bit more carefully, before you ask such questions? br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <52E16AE4.4000303-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <52E16AE4.4000303-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-25 16:16 ` Vyacheslav Dubeyko 0 siblings, 0 replies; 44+ messages in thread From: Vyacheslav Dubeyko @ 2014-01-25 16:16 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Jan 23, 2014, at 10:17 PM, Andreas Rohner wrote: > >> Do you free allocated memory in the case of error? > > Yes I immediately free it before I check the return value. Could you > please read the code a bit more carefully, before you ask such questions? + freeblocks = (nilfs_get_blocks_per_segment(nilfs) * n) + - (nilfs_vector_get_size(vdescv) + + nilfs_vector_get_size(bdescv)); + + /* if there are less free blocks than the + * minimal threshold try to update suinfo + * instead of cleaning */ + if (freeblocks < minblocks * n) { + ret = gettimeofday(&tv, NULL); + if (ret < 0) + goto out_lock; + + supv = malloc(sizeof(struct nilfs_suinfo_update) * n); ^^^^^^ memory allocation + if (supv == NULL) { + ret = -1; + goto out_lock; + } + + for (i = 0; i < n; ++i) { + supv[i].sup_segnum = segnums[i]; + supv[i].sup_flags = 0; + nilfs_suinfo_update_set_lastmod(&supv[i]); + supv[i].sup_sui.sui_lastmod = tv.tv_sec; + } + + ret = nilfs_set_suinfo(nilfs, supv, n); + free(supv); ^^^^ memory freeing + if (ret >= 0) { + /* success, tell caller + * to try another segment/s */ + ret = -EGCTRYAGAIN; + goto out_lock; + } + } + Ok. But I can suggest to free memory in the place of error processing (goto out_lock) and at the end of function for success. Otherwise, it looks like hiding free() call. And it is very easy to forget about memory freeing. It is good practice, I suppose, to group memory freeing operations. Thanks, Vyacheslav Dubeyko. -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <72f8c37258d08ba9793b0c1bb0374dd8efcd4756.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-23 17:49 ` Vyacheslav Dubeyko @ 2014-01-24 18:26 ` Ryusuke Konishi [not found] ` <20140125.032633.184837411.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 1 sibling, 1 reply; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-24 18:26 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Tue, 21 Jan 2014 14:59:43 +0100, Andreas Rohner wrote: > With this ioctl the segment usage information in the SUFILE can be > updated from userspace. > > This is useful, because it allows the GC to modify and update segment > usage entries for specific segments, which enables it to avoid > unnecessary write operations. > > If a segment needs to be cleaned, but there is no or very little free > space to be gained, the cleaning operation basically degrades to > needless expensive copying of data. In the end the only thing that > changes is the location of the data and a timestamp in the segment > usage info. With this ioctl the GC can skip the copying and update the > segment usage entries directly instead. > > Additionally this patch implements a simple check in > nilfs_reclaim_segment. If the number of free blocks that can be gained > by cleaning the segments is below the threshold of minblocks, it simply > updates the segment usage information instead. > > Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> > --- > include/nilfs.h | 2 ++ > include/nilfs2_fs.h | 41 +++++++++++++++++++++++++++++++++++++++++ > lib/gc.c | 44 +++++++++++++++++++++++++++++++++++++++++++- > lib/nilfs.c | 26 ++++++++++++++++++++++++++ > 4 files changed, 112 insertions(+), 1 deletion(-) > > diff --git a/include/nilfs.h b/include/nilfs.h > index 56286a9..1e9b034 100644 > --- a/include/nilfs.h > +++ b/include/nilfs.h > @@ -299,6 +299,8 @@ int nilfs_delete_checkpoint(struct nilfs *, nilfs_cno_t); > int nilfs_get_cpstat(const struct nilfs *, struct nilfs_cpstat *); > ssize_t nilfs_get_suinfo(const struct nilfs *, __u64, struct nilfs_suinfo *, > size_t); > +ssize_t nilfs_set_suinfo(const struct nilfs *, struct nilfs_suinfo_update *, > + size_t); > int nilfs_get_sustat(const struct nilfs *, struct nilfs_sustat *); > ssize_t nilfs_get_vinfo(const struct nilfs *, struct nilfs_vinfo *, size_t); > ssize_t nilfs_get_bdescs(const struct nilfs *, struct nilfs_bdesc *, size_t); > diff --git a/include/nilfs2_fs.h b/include/nilfs2_fs.h > index e674f44..181c482 100644 > --- a/include/nilfs2_fs.h > +++ b/include/nilfs2_fs.h > @@ -713,6 +713,45 @@ static inline int nilfs_suinfo_clean(const struct nilfs_suinfo *si) > return !si->sui_flags; > } > > +/** > + * nilfs_suinfo_update - segment usage information update > + * @sup_segnum: segment number > + * @sup_flags: flags for which fields are active in sup_sui > + * @sup_sui: segment usage information > + */ > +struct nilfs_suinfo_update { > + __u64 sup_segnum; > + __u32 sup_flags; > + struct nilfs_suinfo sup_sui; > +}; > + > +enum { > + NILFS_SUINFO_UPDATE_LASTMOD, > + NILFS_SUINFO_UPDATE_NBLOCKS, > + NILFS_SUINFO_UPDATE_FLAGS, > +}; > + > +#define NILFS_SUINFO_UPDATE_FNS(flag, name) \ > +static inline void \ > +nilfs_suinfo_update_set_##name(struct nilfs_suinfo_update *sup) \ > +{ \ > + sup->sup_flags |= 1UL << NILFS_SUINFO_UPDATE_##flag; \ > +} \ > +static inline void \ > +nilfs_suinfo_update_clear_##name(struct nilfs_suinfo_update *sup) \ > +{ \ > + sup->sup_flags &= ~(1UL << NILFS_SUINFO_UPDATE_##flag); \ > +} \ > +static inline int \ > +nilfs_suinfo_update_##name(const struct nilfs_suinfo_update *sup) \ > +{ \ > + return !!(sup->sup_flags & (1UL << NILFS_SUINFO_UPDATE_##flag));\ > +} > + > +NILFS_SUINFO_UPDATE_FNS(LASTMOD, lastmod) > +NILFS_SUINFO_UPDATE_FNS(NBLOCKS, nblocks) > +NILFS_SUINFO_UPDATE_FNS(FLAGS, flags) > + > /* ioctl */ > enum { > NILFS_CHECKPOINT, > @@ -867,5 +906,7 @@ struct nilfs_bdesc { > _IOW(NILFS_IOCTL_IDENT, 0x8B, __u64) > #define NILFS_IOCTL_SET_ALLOC_RANGE \ > _IOW(NILFS_IOCTL_IDENT, 0x8C, __u64[2]) > +#define NILFS_IOCTL_SET_SUINFO \ > + _IOW(NILFS_IOCTL_IDENT, 0x8D, struct nilfs_argv) > > #endif /* _LINUX_NILFS_FS_H */ > diff --git a/lib/gc.c b/lib/gc.c > index 0b0e2d6..ebbe0ca 100644 > --- a/lib/gc.c > +++ b/lib/gc.c > @@ -29,6 +29,10 @@ > #include <syslog.h> > #endif /* HAVE_SYSLOG_H */ > > +#if HAVE_SYS_TIME_H > +#include <sys/time.h> > +#endif /* HAVE_SYS_TIME */ > + > #include <errno.h> > #include <assert.h> > #include <stdarg.h> > @@ -615,7 +619,10 @@ ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, > { > struct nilfs_vector *vdescv, *bdescv, *periodv, *vblocknrv; > sigset_t sigset, oldset, waitset; > - ssize_t n, ret = -1; > + ssize_t n, i, ret = -1; > + __u32 freeblocks; > + struct nilfs_suinfo_update *supv; > + struct timeval tv; > > if (nsegs == 0) > return 0; > @@ -678,6 +685,41 @@ ssize_t nilfs_reclaim_segment(struct nilfs *nilfs, > goto out_lock; > } > > + freeblocks = (nilfs_get_blocks_per_segment(nilfs) * n) > + - (nilfs_vector_get_size(vdescv) > + + nilfs_vector_get_size(bdescv)); > + > + /* if there are less free blocks than the > + * minimal threshold try to update suinfo > + * instead of cleaning */ > + if (freeblocks < minblocks * n) { > + ret = gettimeofday(&tv, NULL); > + if (ret < 0) > + goto out_lock; > + > + supv = malloc(sizeof(struct nilfs_suinfo_update) * n); > + if (supv == NULL) { > + ret = -1; > + goto out_lock; > + } > + > + for (i = 0; i < n; ++i) { > + supv[i].sup_segnum = segnums[i]; > + supv[i].sup_flags = 0; > + nilfs_suinfo_update_set_lastmod(&supv[i]); > + supv[i].sup_sui.sui_lastmod = tv.tv_sec; > + } > + > + ret = nilfs_set_suinfo(nilfs, supv, n); > + free(supv); > + if (ret >= 0) { > + /* success, tell caller > + * to try another segment/s */ > + ret = -EGCTRYAGAIN; > + goto out_lock; > + } You should design the new reclaim function so that it turns off the logic using nilfs_set_suinfo() once nilfs_set_suinfo() returned a status code < 0 and errno was ENOTTY. This fallback logic is needed to keep compatibility for old kernel. In addition, whether applying the optimization or not, should be selectable in /etc/nilfs_cleanerd.conf. > + } > + > ret = nilfs_clean_segments(nilfs, > nilfs_vector_get_data(vdescv), > nilfs_vector_get_size(vdescv), > diff --git a/lib/nilfs.c b/lib/nilfs.c > index 7265830..dbd03e6 100644 > --- a/lib/nilfs.c > +++ b/lib/nilfs.c > @@ -548,6 +548,32 @@ ssize_t nilfs_get_suinfo(const struct nilfs *nilfs, __u64 segnum, > } > > /** > + * nilfs_set_suinfo - > + * @nilfs: Please fill the summary line of this function. I'll continue reviewing this patchset tomorrow. Regards, Ryusuke Konishi > + * @sup: an array of nilfs_suinfo_update structs > + * @nsup: number of elements in sup > + */ > +ssize_t nilfs_set_suinfo(const struct nilfs *nilfs, > + struct nilfs_suinfo_update *sup, size_t nsup) > +{ > + struct nilfs_argv argv; > + > + if (nilfs->n_iocfd < 0) { > + errno = EBADF; > + return -1; > + } > + > + argv.v_base = (unsigned long)sup; > + argv.v_nmembs = nsup; > + argv.v_size = sizeof(struct nilfs_suinfo_update); > + argv.v_index = 0; > + argv.v_flags = 0; > + if (ioctl(nilfs->n_iocfd, NILFS_IOCTL_SET_SUINFO, &argv) < 0) > + return -1; > + return argv.v_nmembs; > +} > + > +/** > * nilfs_get_sustat - > * @nilfs: > * @sustat: > -- > 1.8.5.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20140125.032633.184837411.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <20140125.032633.184837411.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> @ 2014-01-24 20:00 ` Andreas Rohner [not found] ` <52E2C643.3070704-hi6Y0CQ0nG0@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-24 20:00 UTC (permalink / raw) To: Ryusuke Konishi; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 2014-01-24 19:26, Ryusuke Konishi wrote: >> + freeblocks = (nilfs_get_blocks_per_segment(nilfs) * n) >> + - (nilfs_vector_get_size(vdescv) >> + + nilfs_vector_get_size(bdescv)); >> + >> + /* if there are less free blocks than the >> + * minimal threshold try to update suinfo >> + * instead of cleaning */ >> + if (freeblocks < minblocks * n) { >> + ret = gettimeofday(&tv, NULL); >> + if (ret < 0) >> + goto out_lock; >> + >> + supv = malloc(sizeof(struct nilfs_suinfo_update) * n); >> + if (supv == NULL) { >> + ret = -1; >> + goto out_lock; >> + } >> + >> + for (i = 0; i < n; ++i) { >> + supv[i].sup_segnum = segnums[i]; >> + supv[i].sup_flags = 0; >> + nilfs_suinfo_update_set_lastmod(&supv[i]); >> + supv[i].sup_sui.sui_lastmod = tv.tv_sec; >> + } >> + >> + ret = nilfs_set_suinfo(nilfs, supv, n); >> + free(supv); >> + if (ret >= 0) { >> + /* success, tell caller >> + * to try another segment/s */ >> + ret = -EGCTRYAGAIN; >> + goto out_lock; >> + } > > You should design the new reclaim function so that it turns off the > logic using nilfs_set_suinfo() once nilfs_set_suinfo() returned a > status code < 0 and errno was ENOTTY. > > This fallback logic is needed to keep compatibility for old kernel. Thanks for your review. I am quite devastated how much needs to be fixed :). I implemented most of your suggestions. I will test them tomorrow and compose a new version of the patch set. > In addition, whether applying the optimization or not, should be > selectable in /etc/nilfs_cleanerd.conf. You can set the two options to 0, which would completely disable the optimization. # Minimum number of free blocks before a segment # can be cleaned. Set to 0 to disable it. min_free_blocks_threshold 0 # Minimum number of free blocks before a segment # can be cleaned. Set to 0 to disable it. # if clean segments < min_clean_segments mc_min_free_blocks_threshold 0 Is that enough or should I add an additional switch, like the use_mmap switch? br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <52E2C643.3070704-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <52E2C643.3070704-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-25 18:52 ` Ryusuke Konishi 2014-01-26 10:00 ` Ryusuke Konishi 1 sibling, 0 replies; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-25 18:52 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Fri, 24 Jan 2014 21:00:03 +0100, Andreas Rohner wrote: > On 2014-01-24 19:26, Ryusuke Konishi wrote: >> You should design the new reclaim function so that it turns off the >> logic using nilfs_set_suinfo() once nilfs_set_suinfo() returned a >> status code < 0 and errno was ENOTTY. >> >> This fallback logic is needed to keep compatibility for old kernel. > > Thanks for your review. I am quite devastated how much needs to be fixed > :). I implemented most of your suggestions. I will test them tomorrow > and compose a new version of the patch set. > >> In addition, whether applying the optimization or not, should be >> selectable in /etc/nilfs_cleanerd.conf. > > You can set the two options to 0, which would completely disable the > optimization. > > # Minimum number of free blocks before a segment > # can be cleaned. Set to 0 to disable it. > min_free_blocks_threshold 0 > > # Minimum number of free blocks before a segment > # can be cleaned. Set to 0 to disable it. > # if clean segments < min_clean_segments > mc_min_free_blocks_threshold 0 > > Is that enough or should I add an additional switch, like the use_mmap > switch? Yes, please do so. The flag name would be "use_set_suinfo" (NILFS_OPT_SET_SUINFO flag for nilfs object) If nilfs_set_suinfo() returned error with errno=ENOTTY, the gc library should automatically turns it off (with a warning message). If NILFS_OPT_SET_SUINFO was off from the start, this optimization should be never tried regardless of the above threshold values. And, for the above theshold declarations, proper default values should be written in nilfs_cleanerd.conf. Regards, Ryusuke Konishi -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <52E2C643.3070704-hi6Y0CQ0nG0@public.gmane.org> 2014-01-25 18:52 ` Ryusuke Konishi @ 2014-01-26 10:00 ` Ryusuke Konishi [not found] ` <20140126.190004.443429632.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 1 sibling, 1 reply; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-26 10:00 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Fri, 24 Jan 2014 21:00:03 +0100, Andreas Rohner wrote: > On 2014-01-24 19:26, Ryusuke Konishi wrote: >>> + ret = nilfs_set_suinfo(nilfs, supv, n); >>> + free(supv); >>> + if (ret >= 0) { >>> + /* success, tell caller >>> + * to try another segment/s */ >>> + ret = -EGCTRYAGAIN; >>> + goto out_lock; >>> + } Returning -EGCTRYAGAIN looks confusing, since posix error code Exxxxx is usually stored in errno variable in userland. It also looks a bit hacky. How about handling this as follows ? - Add a new structure to return result of GC, for instance, as follows: struct nilfs_reclaim_status { unsigned int ncleanedsegs; /* number of segments cleaned with nilfs_clean_segments() */ unsigned int ndeferredsegs;/* number of segments deferred with nilfs_set_suinfo() */ unsigned long nliveblocks; /* sum of live virtual blocks and live DAT blocks */ unsigned long ndeadblocks; /* number of reclaimable blocks */ unsigned long nlivevblocks; /* live virtual blocks */ unsigned long nfreevblocks; /* freed virtual block addresses */ ... }; - Use the structure as an argument of the "new" API function (i.e. variant of nilfs_reclaim_segment() ) - Rewrite the variant API function so that it returns a success value (0) also for the EGCTRYAGAIN case. - Store other status information in nilfs_reclaim_status structures. And, I think adding nilfs_set_suinfo() API should be a separate patch. Regards, Ryusuke Konishi -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20140126.190004.443429632.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <20140126.190004.443429632.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> @ 2014-01-26 11:24 ` Andreas Rohner 0 siblings, 0 replies; 44+ messages in thread From: Andreas Rohner @ 2014-01-26 11:24 UTC (permalink / raw) To: Ryusuke Konishi; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 2014-01-26 11:00, Ryusuke Konishi wrote: > On Fri, 24 Jan 2014 21:00:03 +0100, Andreas Rohner wrote: >> On 2014-01-24 19:26, Ryusuke Konishi wrote: >>>> + ret = nilfs_set_suinfo(nilfs, supv, n); >>>> + free(supv); >>>> + if (ret >= 0) { >>>> + /* success, tell caller >>>> + * to try another segment/s */ >>>> + ret = -EGCTRYAGAIN; >>>> + goto out_lock; >>>> + } > > Returning -EGCTRYAGAIN looks confusing, since posix error code Exxxxx > is usually stored in errno variable in userland. It also looks a bit > hacky. True. Vyacheslav had similar concerns. > How about handling this as follows ? > > - Add a new structure to return result of GC, for instance, as > follows: > > struct nilfs_reclaim_status { > unsigned int ncleanedsegs; /* number of segments cleaned > with nilfs_clean_segments() */ > unsigned int ndeferredsegs;/* number of segments deferred > with nilfs_set_suinfo() */ > unsigned long nliveblocks; /* sum of live virtual blocks and live > DAT blocks */ > unsigned long ndeadblocks; /* number of reclaimable blocks */ > unsigned long nlivevblocks; /* live virtual blocks */ > unsigned long nfreevblocks; /* freed virtual block addresses */ > ... > }; > > - Use the structure as an argument of the "new" API function > (i.e. variant of nilfs_reclaim_segment() ) > > - Rewrite the variant API function so that it returns a success value > (0) also for the EGCTRYAGAIN case. > > - Store other status information in nilfs_reclaim_status structures. > > > And, I think adding nilfs_set_suinfo() API should be a separate patch. Yes EGCTRYAGAIN is not necessary for the nilfs_set_suinfo() API to work. It is a good idea to use a separate patch for this. br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH v2 5/5] nilfs-utils: man: add description of min_free_blocks_threshold [not found] ` <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> ` (3 preceding siblings ...) 2014-01-21 13:59 ` [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl Andreas Rohner @ 2014-01-21 13:59 ` Andreas Rohner 2014-01-21 14:00 ` [PATCH v2 1/3] nilfs2: add new nilfs_suinfo_update struct Andreas Rohner ` (2 subsequent siblings) 7 siblings, 0 replies; 44+ messages in thread From: Andreas Rohner @ 2014-01-21 13:59 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner Adds a description of the new min_free_blocks_threshold and mc_min_free_blocks_threshold options in the config file and the new command line parameter -t for nilfs-clean. Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> --- man/nilfs-clean.8 | 4 ++++ man/nilfs_cleanerd.conf.5 | 9 +++++++++ 2 files changed, 13 insertions(+) diff --git a/man/nilfs-clean.8 b/man/nilfs-clean.8 index 04e11c6..5af836e 100644 --- a/man/nilfs-clean.8 +++ b/man/nilfs-clean.8 @@ -67,6 +67,10 @@ until user resumes the garbage collection. \fB\-S\fR, \fB\-\-speed=\fICOUNT[/SECONDS]\fR Set garbage collection speed for a cleaner run. .TP +\fB\-t\fR, \fB\-\-free\-blocks\-threshold=\fICOUNT\fR +Specify the minimal number of free blocks before a segment can be +cleaned +.TP \fB\-v\fR, \fB\-\-verbose\fR Verbose mode. .TP diff --git a/man/nilfs_cleanerd.conf.5 b/man/nilfs_cleanerd.conf.5 index 8f3fcd2..ccb0ade 100644 --- a/man/nilfs_cleanerd.conf.5 +++ b/man/nilfs_cleanerd.conf.5 @@ -79,6 +79,15 @@ Specify whether to use \fBmmap\fP(2) for reading segments. At present, this option is enabled if supported regardless of this directive. .TP +.B min_free_blocks_threshold +Specify the minimal number of free blocks before a segment can be +cleaned. Set to 0 to disable it. The default value is 256. +.TP +.B mc_min_free_blocks_threshold +Specify the minimal number of free blocks before a segment can be +cleaned if clean segments < min_clean_segments. Set to 0 to disable it. +The default value is 128. +.TP .B log_priority Gives the verbosity level that is used when logging messages from \fBnilfs_cleanerd\fP(8). The possible values are: \fBemerg\fP, -- 1.8.5.3 -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH v2 1/3] nilfs2: add new nilfs_suinfo_update struct [not found] ` <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> ` (4 preceding siblings ...) 2014-01-21 13:59 ` [PATCH v2 5/5] nilfs-utils: man: add description of min_free_blocks_threshold Andreas Rohner @ 2014-01-21 14:00 ` Andreas Rohner [not found] ` <cec6a449ddf5ae9da2928cdddfb96ebcb2789eee.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-22 23:46 ` [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations Michael L. Semon 2014-01-23 17:48 ` Vyacheslav Dubeyko 7 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-21 14:00 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner This adds struct nilfs_suinfo_update and some flags. It contains the information needed to update one entry of segment usage information in the SUFILE. The flags specify, which fields need to be updated. This can be used to update segment usage information from userspace with an ioctl. Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> --- include/linux/nilfs2_fs.h | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/include/linux/nilfs2_fs.h b/include/linux/nilfs2_fs.h index 9875576..2ee1594 100644 --- a/include/linux/nilfs2_fs.h +++ b/include/linux/nilfs2_fs.h @@ -709,6 +709,45 @@ static inline int nilfs_suinfo_clean(const struct nilfs_suinfo *si) return !si->sui_flags; } +/** + * nilfs_suinfo_update - segment usage information update + * @sup_segnum: segment number + * @sup_flags: flags for which fields are active in sup_sui + * @sup_sui: segment usage information + */ +struct nilfs_suinfo_update { + __u64 sup_segnum; + __u32 sup_flags; + struct nilfs_suinfo sup_sui; +}; + +enum { + NILFS_SUINFO_UPDATE_LASTMOD, + NILFS_SUINFO_UPDATE_NBLOCKS, + NILFS_SUINFO_UPDATE_FLAGS, +}; + +#define NILFS_SUINFO_UPDATE_FNS(flag, name) \ +static inline void \ +nilfs_suinfo_update_set_##name(struct nilfs_suinfo_update *sup) \ +{ \ + sup->sup_flags |= 1UL << NILFS_SUINFO_UPDATE_##flag; \ +} \ +static inline void \ +nilfs_suinfo_update_clear_##name(struct nilfs_suinfo_update *sup) \ +{ \ + sup->sup_flags &= ~(1UL << NILFS_SUINFO_UPDATE_##flag); \ +} \ +static inline int \ +nilfs_suinfo_update_##name(const struct nilfs_suinfo_update *sup) \ +{ \ + return !!(sup->sup_flags & (1UL << NILFS_SUINFO_UPDATE_##flag));\ +} + +NILFS_SUINFO_UPDATE_FNS(LASTMOD, lastmod) +NILFS_SUINFO_UPDATE_FNS(NBLOCKS, nblocks) +NILFS_SUINFO_UPDATE_FNS(FLAGS, flags) + /* ioctl */ enum { NILFS_CHECKPOINT, -- 1.8.5.3 -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 44+ messages in thread
[parent not found: <cec6a449ddf5ae9da2928cdddfb96ebcb2789eee.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>]
* [PATCH v2 2/3] nilfs2: add nilfs_sufile_set_suinfo to update segment usage info [not found] ` <cec6a449ddf5ae9da2928cdddfb96ebcb2789eee.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-21 14:00 ` Andreas Rohner [not found] ` <2fd48b2d524a59a02bdad13c0c194de3e5b55cc7.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 14:00 ` [PATCH v2 3/3] nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl Andreas Rohner 2014-01-24 4:56 ` [PATCH v2 1/3] nilfs2: add new nilfs_suinfo_update struct Ryusuke Konishi 2 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-21 14:00 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner This patch introduces the nilfs_sufile_set_suinfo function, which expects an array of nilfs_suinfo_update structures and updates the segment usage information accordingly. This is basically a helper function for the newly introduced NILFS_IOCTL_SET_SUINFO ioctl. Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> --- fs/nilfs2/sufile.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ fs/nilfs2/sufile.h | 2 +- 2 files changed, 92 insertions(+), 1 deletion(-) diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c index 3127e9f..81c3438 100644 --- a/fs/nilfs2/sufile.c +++ b/fs/nilfs2/sufile.c @@ -870,6 +870,97 @@ ssize_t nilfs_sufile_get_suinfo(struct inode *sufile, __u64 segnum, void *buf, } /** + * nilfs_sufile_set_suinfo - + * @sufile: inode of segment usage file + * @buf: array of suinfo + * @supsz: byte size of suinfo + * @nsup: size of suinfo array + * + * Description: Takes an array of nilfs_suinfo_update structs and updates + * segment usage accordingly. + * + * Return Value: On success, 0 is returned and .... On error, one of the + * following negative error codes is returned. + * + * %-EIO - I/O error. + * + * %-ENOMEM - Insufficient amount of memory available. + * + * %-EINVAL - Invalid segment number in input + */ +ssize_t nilfs_sufile_set_suinfo(struct inode *sufile, void *buf, + unsigned supsz, size_t nsup) +{ + struct buffer_head *bh; + struct nilfs_suinfo_update *sup, *supv = buf; + struct nilfs_segment_usage *su; + void *kaddr; + unsigned long blkoff, prev_blkoff; + size_t nerr = 0; + int ret = 0; + + if (unlikely(nsup == 0)) + goto out; + + down_write(&NILFS_MDT(sufile)->mi_sem); + for (sup = supv; sup < supv + nsup; sup++) { + if (unlikely(sup->sup_segnum >= + nilfs_sufile_get_nsegments(sufile))) { + printk(KERN_WARNING + "%s: invalid segment number: %llu\n", __func__, + (unsigned long long)sup->sup_segnum); + nerr++; + } + } + if (nerr > 0) { + ret = -EINVAL; + goto out_sem; + } + + sup = supv; + blkoff = nilfs_sufile_get_blkoff(sufile, sup->sup_segnum); + ret = nilfs_mdt_get_block(sufile, blkoff, 1, NULL, &bh); + if (ret < 0) + goto out_sem; + + for (;;) { + kaddr = kmap_atomic(bh->b_page); + su = nilfs_sufile_block_get_segment_usage( + sufile, sup->sup_segnum, bh, kaddr); + + if (nilfs_suinfo_update_lastmod(sup)) + su->su_lastmod = cpu_to_le64(sup->sup_sui.sui_lastmod); + + if (nilfs_suinfo_update_nblocks(sup)) + su->su_nblocks = cpu_to_le32(sup->sup_sui.sui_nblocks); + + if (nilfs_suinfo_update_flags(sup)) + su->su_flags = cpu_to_le32(sup->sup_sui.sui_flags); + + kunmap_atomic(kaddr); + + if (++sup >= supv + nsup) + break; + prev_blkoff = blkoff; + blkoff = nilfs_sufile_get_blkoff(sufile, sup->sup_segnum); + if (blkoff == prev_blkoff) + continue; + + /* get different block */ + brelse(bh); + ret = nilfs_mdt_get_block(sufile, blkoff, 1, NULL, &bh); + if (unlikely(ret < 0)) + goto out_sem; + } + brelse(bh); + + out_sem: + up_write(&NILFS_MDT(sufile)->mi_sem); + out: + return ret; +} + +/** * nilfs_sufile_read - read or get sufile inode * @sb: super block instance * @susize: size of a segment usage entry diff --git a/fs/nilfs2/sufile.h b/fs/nilfs2/sufile.h index e84bc5b..7aa9d3e 100644 --- a/fs/nilfs2/sufile.h +++ b/fs/nilfs2/sufile.h @@ -44,7 +44,7 @@ int nilfs_sufile_set_segment_usage(struct inode *sufile, __u64 segnum, int nilfs_sufile_get_stat(struct inode *, struct nilfs_sustat *); ssize_t nilfs_sufile_get_suinfo(struct inode *, __u64, void *, unsigned, size_t); - +ssize_t nilfs_sufile_set_suinfo(struct inode *, void *, unsigned , size_t); int nilfs_sufile_updatev(struct inode *, __u64 *, size_t, int, size_t *, void (*dofunc)(struct inode *, __u64, struct buffer_head *, -- 1.8.5.3 -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 44+ messages in thread
[parent not found: <2fd48b2d524a59a02bdad13c0c194de3e5b55cc7.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 2/3] nilfs2: add nilfs_sufile_set_suinfo to update segment usage info [not found] ` <2fd48b2d524a59a02bdad13c0c194de3e5b55cc7.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-24 11:56 ` Ryusuke Konishi [not found] ` <20140124.205623.400541300.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-24 11:56 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Tue, 21 Jan 2014 15:00:29 +0100, Andreas Rohner wrote: > This patch introduces the nilfs_sufile_set_suinfo function, which > expects an array of nilfs_suinfo_update structures and updates the > segment usage information accordingly. > > This is basically a helper function for the newly introduced > NILFS_IOCTL_SET_SUINFO ioctl. > > Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> > --- > fs/nilfs2/sufile.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ > fs/nilfs2/sufile.h | 2 +- > 2 files changed, 92 insertions(+), 1 deletion(-) > > diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c > index 3127e9f..81c3438 100644 > --- a/fs/nilfs2/sufile.c > +++ b/fs/nilfs2/sufile.c > @@ -870,6 +870,97 @@ ssize_t nilfs_sufile_get_suinfo(struct inode *sufile, __u64 segnum, void *buf, > } > > /** > + * nilfs_sufile_set_suinfo - Please fill the summary line of function even though nilfs_sufile_get_suinfo() function lacks it. * nilfs_sufile_set_suinfo - update segment usage information > + * @sufile: inode of segment usage file > + * @buf: array of suinfo > + * @supsz: byte size of suinfo > + * @nsup: size of suinfo array > + * > + * Description: Takes an array of nilfs_suinfo_update structs and updates > + * segment usage accordingly. > + * > + * Return Value: On success, 0 is returned and .... On error, one of the > + * following negative error codes is returned. > + * > + * %-EIO - I/O error. > + * > + * %-ENOMEM - Insufficient amount of memory available. > + * > + * %-EINVAL - Invalid segment number in input > + */ > +ssize_t nilfs_sufile_set_suinfo(struct inode *sufile, void *buf, > + unsigned supsz, size_t nsup) > +{ > + struct buffer_head *bh; > + struct nilfs_suinfo_update *sup, *supv = buf; > + struct nilfs_segment_usage *su; > + void *kaddr; > + unsigned long blkoff, prev_blkoff; > + size_t nerr = 0; > + int ret = 0; > + > + if (unlikely(nsup == 0)) > + goto out; > + > + down_write(&NILFS_MDT(sufile)->mi_sem); > + for (sup = supv; sup < supv + nsup; sup++) { > + if (unlikely(sup->sup_segnum >= > + nilfs_sufile_get_nsegments(sufile))) { > + printk(KERN_WARNING > + "%s: invalid segment number: %llu\n", __func__, > + (unsigned long long)sup->sup_segnum); > + nerr++; > + } > + } > + if (nerr > 0) { > + ret = -EINVAL; > + goto out_sem; > + } Seems that you can exit from inside the loop: for (sup = supv; sup < supv + nsup; sup++) { if (unlikely(sup->sup_segnum >= nilfs_sufile_get_nsegments(sufile))) { printk(KERN_WARNING "%s: invalid segment number: %llu\n", __func__, (unsigned long long)sup->sup_segnum); ret = -EINVAL; goto out_sem; } } > + > + sup = supv; > + blkoff = nilfs_sufile_get_blkoff(sufile, sup->sup_segnum); > + ret = nilfs_mdt_get_block(sufile, blkoff, 1, NULL, &bh); > + if (ret < 0) > + goto out_sem; > + > + for (;;) { > + kaddr = kmap_atomic(bh->b_page); > + su = nilfs_sufile_block_get_segment_usage( > + sufile, sup->sup_segnum, bh, kaddr); > + > + if (nilfs_suinfo_update_lastmod(sup)) > + su->su_lastmod = cpu_to_le64(sup->sup_sui.sui_lastmod); > + > + if (nilfs_suinfo_update_nblocks(sup)) > + su->su_nblocks = cpu_to_le32(sup->sup_sui.sui_nblocks); > + > + if (nilfs_suinfo_update_flags(sup)) > + su->su_flags = cpu_to_le32(sup->sup_sui.sui_flags); You must also update the counter value of sh_ncleansegs, sh_ndirtysegs in the nilfs_sufile_header struct of sufile header block based on the change of these flags. In addition, you should ignore NILFS_SEGMENT_USAGE_ACTIVE_FLAG as below because this flag is a virtual flag set by running nilfs kernel code. The flag bit should not be written to sufile on disk. (sui_flags & ~(1UL << NILFS_SEGMENT_USAGE_ACTIVE)) > + > + kunmap_atomic(kaddr); > + > + if (++sup >= supv + nsup) You must not use the automatic pointer operation of C language for nilfs_suinfo_update because it may be extended in a future release. This "++sup" breaks forward compatibility of the ioctl. You should do this by: sup = (void *)sup + supsz; Note that pointer type is automatically converted for "void *" type variables. > + break; > + prev_blkoff = blkoff; > + blkoff = nilfs_sufile_get_blkoff(sufile, sup->sup_segnum); > + if (blkoff == prev_blkoff) > + continue; > + > + /* get different block */ You must mark bh that this function changed as dirty with mark_buffer_dirty(bh). Otherwise, the change will be gone. > + brelse(bh); > + ret = nilfs_mdt_get_block(sufile, blkoff, 1, NULL, &bh); > + if (unlikely(ret < 0)) > + goto out_sem; > + } > + brelse(bh); > + The sufile should be marked as dirty (if any change happened): nilfs_mdt_mark_dirty(sufile); > + out_sem: > + up_write(&NILFS_MDT(sufile)->mi_sem); > + out: > + return ret; > +} > + > +/** > * nilfs_sufile_read - read or get sufile inode > * @sb: super block instance > * @susize: size of a segment usage entry > diff --git a/fs/nilfs2/sufile.h b/fs/nilfs2/sufile.h > index e84bc5b..7aa9d3e 100644 > --- a/fs/nilfs2/sufile.h > +++ b/fs/nilfs2/sufile.h > @@ -44,7 +44,7 @@ int nilfs_sufile_set_segment_usage(struct inode *sufile, __u64 segnum, > int nilfs_sufile_get_stat(struct inode *, struct nilfs_sustat *); > ssize_t nilfs_sufile_get_suinfo(struct inode *, __u64, void *, unsigned, > size_t); > - > +ssize_t nilfs_sufile_set_suinfo(struct inode *, void *, unsigned , size_t); > int nilfs_sufile_updatev(struct inode *, __u64 *, size_t, int, size_t *, > void (*dofunc)(struct inode *, __u64, > struct buffer_head *, > -- > 1.8.5.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20140124.205623.400541300.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: [PATCH v2 2/3] nilfs2: add nilfs_sufile_set_suinfo to update segment usage info [not found] ` <20140124.205623.400541300.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> @ 2014-01-24 12:34 ` Andreas Rohner [not found] ` <52E25DDC.2060502-hi6Y0CQ0nG0@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-24 12:34 UTC (permalink / raw) To: Ryusuke Konishi; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 2014-01-24 12:56, Ryusuke Konishi wrote: > On Tue, 21 Jan 2014 15:00:29 +0100, Andreas Rohner wrote: >> This patch introduces the nilfs_sufile_set_suinfo function, which >> expects an array of nilfs_suinfo_update structures and updates the >> segment usage information accordingly. >> >> This is basically a helper function for the newly introduced >> NILFS_IOCTL_SET_SUINFO ioctl. >> >> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> >> --- >> fs/nilfs2/sufile.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> fs/nilfs2/sufile.h | 2 +- >> 2 files changed, 92 insertions(+), 1 deletion(-) >> >> diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c >> index 3127e9f..81c3438 100644 >> --- a/fs/nilfs2/sufile.c >> +++ b/fs/nilfs2/sufile.c >> @@ -870,6 +870,97 @@ ssize_t nilfs_sufile_get_suinfo(struct inode *sufile, __u64 segnum, void *buf, >> } >> >> /** >> + * nilfs_sufile_set_suinfo - > > Please fill the summary line of function even though > nilfs_sufile_get_suinfo() function lacks it. > > * nilfs_sufile_set_suinfo - update segment usage information > >> + * @sufile: inode of segment usage file >> + * @buf: array of suinfo >> + * @supsz: byte size of suinfo >> + * @nsup: size of suinfo array >> + * >> + * Description: Takes an array of nilfs_suinfo_update structs and updates >> + * segment usage accordingly. >> + * >> + * Return Value: On success, 0 is returned and .... On error, one of the >> + * following negative error codes is returned. >> + * >> + * %-EIO - I/O error. >> + * >> + * %-ENOMEM - Insufficient amount of memory available. >> + * >> + * %-EINVAL - Invalid segment number in input >> + */ >> +ssize_t nilfs_sufile_set_suinfo(struct inode *sufile, void *buf, >> + unsigned supsz, size_t nsup) >> +{ >> + struct buffer_head *bh; >> + struct nilfs_suinfo_update *sup, *supv = buf; >> + struct nilfs_segment_usage *su; >> + void *kaddr; >> + unsigned long blkoff, prev_blkoff; >> + size_t nerr = 0; >> + int ret = 0; >> + >> + if (unlikely(nsup == 0)) >> + goto out; >> + >> + down_write(&NILFS_MDT(sufile)->mi_sem); >> + for (sup = supv; sup < supv + nsup; sup++) { >> + if (unlikely(sup->sup_segnum >= >> + nilfs_sufile_get_nsegments(sufile))) { >> + printk(KERN_WARNING >> + "%s: invalid segment number: %llu\n", __func__, >> + (unsigned long long)sup->sup_segnum); >> + nerr++; >> + } >> + } >> + if (nerr > 0) { >> + ret = -EINVAL; >> + goto out_sem; >> + } > > Seems that you can exit from inside the loop: > > for (sup = supv; sup < supv + nsup; sup++) { > if (unlikely(sup->sup_segnum >= > nilfs_sufile_get_nsegments(sufile))) { > printk(KERN_WARNING > "%s: invalid segment number: %llu\n", __func__, > (unsigned long long)sup->sup_segnum); > ret = -EINVAL; > goto out_sem; > } > } Yes. I copied that from nilfs_sufile_updatev. >> + >> + sup = supv; >> + blkoff = nilfs_sufile_get_blkoff(sufile, sup->sup_segnum); >> + ret = nilfs_mdt_get_block(sufile, blkoff, 1, NULL, &bh); >> + if (ret < 0) >> + goto out_sem; >> + >> + for (;;) { >> + kaddr = kmap_atomic(bh->b_page); >> + su = nilfs_sufile_block_get_segment_usage( >> + sufile, sup->sup_segnum, bh, kaddr); >> + >> + if (nilfs_suinfo_update_lastmod(sup)) >> + su->su_lastmod = cpu_to_le64(sup->sup_sui.sui_lastmod); >> + >> + if (nilfs_suinfo_update_nblocks(sup)) >> + su->su_nblocks = cpu_to_le32(sup->sup_sui.sui_nblocks); >> + >> + if (nilfs_suinfo_update_flags(sup)) >> + su->su_flags = cpu_to_le32(sup->sup_sui.sui_flags); > > You must also update the counter value of sh_ncleansegs, sh_ndirtysegs > in the nilfs_sufile_header struct of sufile header block based on the > change of these flags. > > In addition, you should ignore NILFS_SEGMENT_USAGE_ACTIVE_FLAG as > below because this flag is a virtual flag set by running nilfs kernel > code. The flag bit should not be written to sufile on disk. > > (sui_flags & ~(1UL << NILFS_SEGMENT_USAGE_ACTIVE)) Yes. I didn't think of that. >> + >> + kunmap_atomic(kaddr); >> + >> + if (++sup >= supv + nsup) > > You must not use the automatic pointer operation of C language for > nilfs_suinfo_update because it may be extended in a future release. > This "++sup" breaks forward compatibility of the ioctl. > > You should do this by: > > sup = (void *)sup + supsz; > > Note that pointer type is automatically converted for "void *" type > variables. Makes sense. I guess I blindly copied that from nilfs_sufile_updatev, but with a simple array of __u64 it works of course. With a structure, that is subject to change, not so much any more. >> + break; >> + prev_blkoff = blkoff; >> + blkoff = nilfs_sufile_get_blkoff(sufile, sup->sup_segnum); >> + if (blkoff == prev_blkoff) >> + continue; >> + >> + /* get different block */ > > You must mark bh that this function changed as dirty with > mark_buffer_dirty(bh). Otherwise, the change will be gone. > > >> + brelse(bh); >> + ret = nilfs_mdt_get_block(sufile, blkoff, 1, NULL, &bh); >> + if (unlikely(ret < 0)) >> + goto out_sem; >> + } >> + brelse(bh); >> + > > The sufile should be marked as dirty (if any change happened): > > nilfs_mdt_mark_dirty(sufile); Of course. I am sorry that's a really stupid error. I will fix it right away. Thanks for your review. I am really embarrassed about all the stupid avoidable errors. br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <52E25DDC.2060502-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 2/3] nilfs2: add nilfs_sufile_set_suinfo to update segment usage info [not found] ` <52E25DDC.2060502-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-24 16:34 ` Ryusuke Konishi 0 siblings, 0 replies; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-24 16:34 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Fri, 24 Jan 2014 13:34:36 +0100, Andreas Rohner wrote: > On 2014-01-24 12:56, Ryusuke Konishi wrote: >> On Tue, 21 Jan 2014 15:00:29 +0100, Andreas Rohner wrote: >>> This patch introduces the nilfs_sufile_set_suinfo function, which >>> expects an array of nilfs_suinfo_update structures and updates the >>> segment usage information accordingly. >>> >>> This is basically a helper function for the newly introduced >>> NILFS_IOCTL_SET_SUINFO ioctl. >>> >>> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> >>> --- >>> fs/nilfs2/sufile.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>> fs/nilfs2/sufile.h | 2 +- >>> 2 files changed, 92 insertions(+), 1 deletion(-) >>> >>> diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c >>> index 3127e9f..81c3438 100644 >>> --- a/fs/nilfs2/sufile.c >>> +++ b/fs/nilfs2/sufile.c >>> @@ -870,6 +870,97 @@ ssize_t nilfs_sufile_get_suinfo(struct inode *sufile, __u64 segnum, void *buf, >>> } >>> >>> /** >>> + * nilfs_sufile_set_suinfo - >> >> Please fill the summary line of function even though >> nilfs_sufile_get_suinfo() function lacks it. >> >> * nilfs_sufile_set_suinfo - update segment usage information >> >>> + * @sufile: inode of segment usage file >>> + * @buf: array of suinfo >>> + * @supsz: byte size of suinfo >>> + * @nsup: size of suinfo array >>> + * >>> + * Description: Takes an array of nilfs_suinfo_update structs and updates >>> + * segment usage accordingly. >>> + * >>> + * Return Value: On success, 0 is returned and .... On error, one of the >>> + * following negative error codes is returned. >>> + * >>> + * %-EIO - I/O error. >>> + * >>> + * %-ENOMEM - Insufficient amount of memory available. >>> + * >>> + * %-EINVAL - Invalid segment number in input >>> + */ >>> +ssize_t nilfs_sufile_set_suinfo(struct inode *sufile, void *buf, >>> + unsigned supsz, size_t nsup) >>> +{ >>> + struct buffer_head *bh; >>> + struct nilfs_suinfo_update *sup, *supv = buf; >>> + struct nilfs_segment_usage *su; >>> + void *kaddr; >>> + unsigned long blkoff, prev_blkoff; >>> + size_t nerr = 0; >>> + int ret = 0; >>> + >>> + if (unlikely(nsup == 0)) >>> + goto out; >>> + >>> + down_write(&NILFS_MDT(sufile)->mi_sem); >>> + for (sup = supv; sup < supv + nsup; sup++) { >>> + if (unlikely(sup->sup_segnum >= >>> + nilfs_sufile_get_nsegments(sufile))) { >>> + printk(KERN_WARNING >>> + "%s: invalid segment number: %llu\n", __func__, >>> + (unsigned long long)sup->sup_segnum); >>> + nerr++; >>> + } >>> + } >>> + if (nerr > 0) { >>> + ret = -EINVAL; >>> + goto out_sem; >>> + } >> >> Seems that you can exit from inside the loop: >> >> for (sup = supv; sup < supv + nsup; sup++) { >> if (unlikely(sup->sup_segnum >= >> nilfs_sufile_get_nsegments(sufile))) { >> printk(KERN_WARNING >> "%s: invalid segment number: %llu\n", __func__, >> (unsigned long long)sup->sup_segnum); >> ret = -EINVAL; >> goto out_sem; >> } An additional comment: I think we should reject unknown sup_flags with -EINVAL here. It may be used in a future version, but the current version cannot update the corresponding field in sufile. Rejecting it looks better behavior than ignoring it. Regards, Ryusuke Konishi >> } > > Yes. I copied that from nilfs_sufile_updatev. > >>> + >>> + sup = supv; >>> + blkoff = nilfs_sufile_get_blkoff(sufile, sup->sup_segnum); >>> + ret = nilfs_mdt_get_block(sufile, blkoff, 1, NULL, &bh); >>> + if (ret < 0) >>> + goto out_sem; >>> + >>> + for (;;) { >>> + kaddr = kmap_atomic(bh->b_page); >>> + su = nilfs_sufile_block_get_segment_usage( >>> + sufile, sup->sup_segnum, bh, kaddr); >>> + >>> + if (nilfs_suinfo_update_lastmod(sup)) >>> + su->su_lastmod = cpu_to_le64(sup->sup_sui.sui_lastmod); >>> + >>> + if (nilfs_suinfo_update_nblocks(sup)) >>> + su->su_nblocks = cpu_to_le32(sup->sup_sui.sui_nblocks); >>> + >>> + if (nilfs_suinfo_update_flags(sup)) >>> + su->su_flags = cpu_to_le32(sup->sup_sui.sui_flags); >> >> You must also update the counter value of sh_ncleansegs, sh_ndirtysegs >> in the nilfs_sufile_header struct of sufile header block based on the >> change of these flags. >> >> In addition, you should ignore NILFS_SEGMENT_USAGE_ACTIVE_FLAG as >> below because this flag is a virtual flag set by running nilfs kernel >> code. The flag bit should not be written to sufile on disk. >> >> (sui_flags & ~(1UL << NILFS_SEGMENT_USAGE_ACTIVE)) > > Yes. I didn't think of that. > >>> + >>> + kunmap_atomic(kaddr); >>> + >>> + if (++sup >= supv + nsup) >> >> You must not use the automatic pointer operation of C language for >> nilfs_suinfo_update because it may be extended in a future release. >> This "++sup" breaks forward compatibility of the ioctl. >> >> You should do this by: >> >> sup = (void *)sup + supsz; >> >> Note that pointer type is automatically converted for "void *" type >> variables. > > > Makes sense. I guess I blindly copied that from nilfs_sufile_updatev, > but with a simple array of __u64 it works of course. With a structure, > that is subject to change, not so much any more. > >>> + break; >>> + prev_blkoff = blkoff; >>> + blkoff = nilfs_sufile_get_blkoff(sufile, sup->sup_segnum); >>> + if (blkoff == prev_blkoff) >>> + continue; >>> + >>> + /* get different block */ >> >> You must mark bh that this function changed as dirty with >> mark_buffer_dirty(bh). Otherwise, the change will be gone. >> >> >>> + brelse(bh); >>> + ret = nilfs_mdt_get_block(sufile, blkoff, 1, NULL, &bh); >>> + if (unlikely(ret < 0)) >>> + goto out_sem; >>> + } >>> + brelse(bh); >>> + >> >> The sufile should be marked as dirty (if any change happened): >> >> nilfs_mdt_mark_dirty(sufile); > > Of course. I am sorry that's a really stupid error. I will fix it right > away. > > Thanks for your review. I am really embarrassed about all the stupid > avoidable errors. > > br, > Andreas Rohner > -- > To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH v2 3/3] nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <cec6a449ddf5ae9da2928cdddfb96ebcb2789eee.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 14:00 ` [PATCH v2 2/3] nilfs2: add nilfs_sufile_set_suinfo to update segment usage info Andreas Rohner @ 2014-01-21 14:00 ` Andreas Rohner [not found] ` <6fb5a6d45afca9ae2599c471c0e42805ed1b6c55.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-24 4:56 ` [PATCH v2 1/3] nilfs2: add new nilfs_suinfo_update struct Ryusuke Konishi 2 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-21 14:00 UTC (permalink / raw) To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner With this ioctl the segment usage information in the SUFILE can be updated from userspace. This is useful, because it allows the userspace GC to modify and update segment usage entries for specific segments, which enables it to avoid unnecessary write operations. If a segment needs to be cleaned, but there is no or very little free space to be gained, the cleaning operation basically degrades to needless expensive copying of data. In the end the only thing that changes is the location of the data and a timestamp in the segment usage info. With this ioctl the GC can skip the copying and update the segment usage entries directly instead. Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> --- fs/nilfs2/ioctl.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++ include/linux/nilfs2_fs.h | 2 ++ 2 files changed, 52 insertions(+) diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c index b44bdb2..c6a3faf 100644 --- a/fs/nilfs2/ioctl.c +++ b/fs/nilfs2/ioctl.c @@ -273,6 +273,13 @@ nilfs_ioctl_do_get_suinfo(struct the_nilfs *nilfs, __u64 *posp, int flags, return ret; } +static ssize_t +nilfs_ioctl_do_set_suinfo(struct the_nilfs *nilfs, __u64 *posp, int flags, + void *buf, size_t size, size_t nmembs) +{ + return nilfs_sufile_set_suinfo(nilfs->ns_sufile, buf, size, nmembs); +} + static int nilfs_ioctl_get_sustat(struct inode *inode, struct file *filp, unsigned int cmd, void __user *argp) { @@ -767,6 +774,44 @@ out: return ret; } +static int nilfs_ioctl_set_info(struct inode *inode, struct file *filp, + unsigned int cmd, void __user *argp, + size_t membsz, + ssize_t (*dofunc)(struct the_nilfs *, + __u64 *, int, + void *, size_t, size_t)) +{ + struct the_nilfs *nilfs = inode->i_sb->s_fs_info; + struct nilfs_transaction_info ti; + struct nilfs_argv argv; + int ret; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + ret = mnt_want_write_file(filp); + if (ret) + return ret; + + ret = -EFAULT; + if (copy_from_user(&argv, argp, sizeof(argv))) + goto out; + + if (argv.v_size < membsz) + return -EINVAL; + + nilfs_transaction_begin(inode->i_sb, &ti, 0); + ret = nilfs_ioctl_wrap_copy(nilfs, &argv, _IOC_DIR(cmd), dofunc); + if (unlikely(ret < 0)) + nilfs_transaction_abort(inode->i_sb); + else + nilfs_transaction_commit(inode->i_sb); /* never fails */ + +out: + mnt_drop_write_file(filp); + return ret; +} + static int nilfs_ioctl_get_info(struct inode *inode, struct file *filp, unsigned int cmd, void __user *argp, size_t membsz, @@ -820,6 +865,10 @@ long nilfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) return nilfs_ioctl_get_info(inode, filp, cmd, argp, sizeof(struct nilfs_suinfo), nilfs_ioctl_do_get_suinfo); + case NILFS_IOCTL_SET_SUINFO: + return nilfs_ioctl_set_info(inode, filp, cmd, argp, + sizeof(struct nilfs_suinfo_update), + nilfs_ioctl_do_set_suinfo); case NILFS_IOCTL_GET_SUSTAT: return nilfs_ioctl_get_sustat(inode, filp, cmd, argp); case NILFS_IOCTL_GET_VINFO: @@ -859,6 +908,7 @@ long nilfs_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) case NILFS_IOCTL_GET_CPINFO: case NILFS_IOCTL_GET_CPSTAT: case NILFS_IOCTL_GET_SUINFO: + case NILFS_IOCTL_SET_SUINFO: case NILFS_IOCTL_GET_SUSTAT: case NILFS_IOCTL_GET_VINFO: case NILFS_IOCTL_GET_BDESCS: diff --git a/include/linux/nilfs2_fs.h b/include/linux/nilfs2_fs.h index 2ee1594..528e2c5 100644 --- a/include/linux/nilfs2_fs.h +++ b/include/linux/nilfs2_fs.h @@ -902,5 +902,7 @@ struct nilfs_bdesc { _IOW(NILFS_IOCTL_IDENT, 0x8B, __u64) #define NILFS_IOCTL_SET_ALLOC_RANGE \ _IOW(NILFS_IOCTL_IDENT, 0x8C, __u64[2]) +#define NILFS_IOCTL_SET_SUINFO \ + _IOW(NILFS_IOCTL_IDENT, 0x8D, struct nilfs_argv) #endif /* _LINUX_NILFS_FS_H */ -- 1.8.5.3 -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 44+ messages in thread
[parent not found: <6fb5a6d45afca9ae2599c471c0e42805ed1b6c55.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 3/3] nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <6fb5a6d45afca9ae2599c471c0e42805ed1b6c55.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-24 13:20 ` Ryusuke Konishi [not found] ` <20140124.222016.110509397.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-24 13:20 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Tue, 21 Jan 2014 15:00:30 +0100, Andreas Rohner wrote: > With this ioctl the segment usage information in the SUFILE can be > updated from userspace. > > This is useful, because it allows the userspace GC to modify and update > segment usage entries for specific segments, which enables it to avoid > unnecessary write operations. > > If a segment needs to be cleaned, but there is no or very little free > space to be gained, the cleaning operation basically degrades to > needless expensive copying of data. In the end the only thing that > changes is the location of the data and a timestamp in the segment usage > info. With this ioctl the GC can skip the copying and update the segment > usage entries directly instead. > > Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> > --- > fs/nilfs2/ioctl.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++ > include/linux/nilfs2_fs.h | 2 ++ > 2 files changed, 52 insertions(+) > > diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c > index b44bdb2..c6a3faf 100644 > --- a/fs/nilfs2/ioctl.c > +++ b/fs/nilfs2/ioctl.c > @@ -273,6 +273,13 @@ nilfs_ioctl_do_get_suinfo(struct the_nilfs *nilfs, __u64 *posp, int flags, > return ret; > } > > +static ssize_t > +nilfs_ioctl_do_set_suinfo(struct the_nilfs *nilfs, __u64 *posp, int flags, > + void *buf, size_t size, size_t nmembs) > +{ > + return nilfs_sufile_set_suinfo(nilfs->ns_sufile, buf, size, nmembs); > +} > + > static int nilfs_ioctl_get_sustat(struct inode *inode, struct file *filp, > unsigned int cmd, void __user *argp) > { > @@ -767,6 +774,44 @@ out: > return ret; > } > > +static int nilfs_ioctl_set_info(struct inode *inode, struct file *filp, > + unsigned int cmd, void __user *argp, > + size_t membsz, > + ssize_t (*dofunc)(struct the_nilfs *, > + __u64 *, int, > + void *, size_t, size_t)) > +{ > + struct the_nilfs *nilfs = inode->i_sb->s_fs_info; > + struct nilfs_transaction_info ti; > + struct nilfs_argv argv; > + int ret; > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + ret = mnt_want_write_file(filp); > + if (ret) > + return ret; > + > + ret = -EFAULT; > + if (copy_from_user(&argv, argp, sizeof(argv))) > + goto out; > + > + if (argv.v_size < membsz) > + return -EINVAL; > + > + nilfs_transaction_begin(inode->i_sb, &ti, 0); > + ret = nilfs_ioctl_wrap_copy(nilfs, &argv, _IOC_DIR(cmd), dofunc); Hmm, we have a problem here. nilfs_ioctl_wrap_copy() cannot be used between nilfs_transcation_begin() and nilfs_transaction_end() since nilfs_ioctl_wrap_copy() calls copy_from_user() and/or copy_to_user(). Unfortunately there is a lock order restriction for these functions. Let's use vmalloc() like nilfs_ioctl_clean_segments(). The following is a sample code for this: size_t len; void __user *base; void *kbuf; int ret; ... ret = -EINVAL; if (argv.v_size < sizeof(struct nilfs_suinfo_update)) goto out; < range check of argv.v_nmembs > len = argv.v_size * argv.v_nmembs; base = (void __user *)(unsigned long)argv.v_base; kbuf = vmalloc(len); if (!kbuf) { ret = -ENOMEM; goto out; } if (copy_from_user(kbuf, base, len)) { ret = -EFAULT; goto out_free; } nilfs_transaction_begin(inode->i_sb, &ti, 0); ret = < call nilfs_sufile_set_suinfo > ; if (unlikely(ret < 0)) nilfs_transcation_abort(inode->i_sb); else nilfs_transcation_commit(inode->i_sb); out_free: vfree(kbuf); out: mnt_drop_write_file(filp); ... Thanks, Ryusuke Konishi > + if (unlikely(ret < 0)) > + nilfs_transaction_abort(inode->i_sb); > + else > + nilfs_transaction_commit(inode->i_sb); /* never fails */ > + > +out: > + mnt_drop_write_file(filp); > + return ret; > +} > + > static int nilfs_ioctl_get_info(struct inode *inode, struct file *filp, > unsigned int cmd, void __user *argp, > size_t membsz, > @@ -820,6 +865,10 @@ long nilfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > return nilfs_ioctl_get_info(inode, filp, cmd, argp, > sizeof(struct nilfs_suinfo), > nilfs_ioctl_do_get_suinfo); > + case NILFS_IOCTL_SET_SUINFO: > + return nilfs_ioctl_set_info(inode, filp, cmd, argp, > + sizeof(struct nilfs_suinfo_update), > + nilfs_ioctl_do_set_suinfo); > case NILFS_IOCTL_GET_SUSTAT: > return nilfs_ioctl_get_sustat(inode, filp, cmd, argp); > case NILFS_IOCTL_GET_VINFO: > @@ -859,6 +908,7 @@ long nilfs_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > case NILFS_IOCTL_GET_CPINFO: > case NILFS_IOCTL_GET_CPSTAT: > case NILFS_IOCTL_GET_SUINFO: > + case NILFS_IOCTL_SET_SUINFO: > case NILFS_IOCTL_GET_SUSTAT: > case NILFS_IOCTL_GET_VINFO: > case NILFS_IOCTL_GET_BDESCS: > diff --git a/include/linux/nilfs2_fs.h b/include/linux/nilfs2_fs.h > index 2ee1594..528e2c5 100644 > --- a/include/linux/nilfs2_fs.h > +++ b/include/linux/nilfs2_fs.h > @@ -902,5 +902,7 @@ struct nilfs_bdesc { > _IOW(NILFS_IOCTL_IDENT, 0x8B, __u64) > #define NILFS_IOCTL_SET_ALLOC_RANGE \ > _IOW(NILFS_IOCTL_IDENT, 0x8C, __u64[2]) > +#define NILFS_IOCTL_SET_SUINFO \ > + _IOW(NILFS_IOCTL_IDENT, 0x8D, struct nilfs_argv) > > #endif /* _LINUX_NILFS_FS_H */ > -- > 1.8.5.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20140124.222016.110509397.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: [PATCH v2 3/3] nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <20140124.222016.110509397.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> @ 2014-01-24 13:44 ` Andreas Rohner [not found] ` <52E26E46.3030702-hi6Y0CQ0nG0@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-24 13:44 UTC (permalink / raw) To: Ryusuke Konishi; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 2014-01-24 14:20, Ryusuke Konishi wrote: > On Tue, 21 Jan 2014 15:00:30 +0100, Andreas Rohner wrote: >> With this ioctl the segment usage information in the SUFILE can be >> updated from userspace. >> >> This is useful, because it allows the userspace GC to modify and update >> segment usage entries for specific segments, which enables it to avoid >> unnecessary write operations. >> >> If a segment needs to be cleaned, but there is no or very little free >> space to be gained, the cleaning operation basically degrades to >> needless expensive copying of data. In the end the only thing that >> changes is the location of the data and a timestamp in the segment usage >> info. With this ioctl the GC can skip the copying and update the segment >> usage entries directly instead. >> >> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> >> --- >> fs/nilfs2/ioctl.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++ >> include/linux/nilfs2_fs.h | 2 ++ >> 2 files changed, 52 insertions(+) >> >> diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c >> index b44bdb2..c6a3faf 100644 >> --- a/fs/nilfs2/ioctl.c >> +++ b/fs/nilfs2/ioctl.c >> @@ -273,6 +273,13 @@ nilfs_ioctl_do_get_suinfo(struct the_nilfs *nilfs, __u64 *posp, int flags, >> return ret; >> } >> >> +static ssize_t >> +nilfs_ioctl_do_set_suinfo(struct the_nilfs *nilfs, __u64 *posp, int flags, >> + void *buf, size_t size, size_t nmembs) >> +{ >> + return nilfs_sufile_set_suinfo(nilfs->ns_sufile, buf, size, nmembs); >> +} >> + >> static int nilfs_ioctl_get_sustat(struct inode *inode, struct file *filp, >> unsigned int cmd, void __user *argp) >> { >> @@ -767,6 +774,44 @@ out: >> return ret; >> } >> >> +static int nilfs_ioctl_set_info(struct inode *inode, struct file *filp, >> + unsigned int cmd, void __user *argp, >> + size_t membsz, >> + ssize_t (*dofunc)(struct the_nilfs *, >> + __u64 *, int, >> + void *, size_t, size_t)) >> +{ >> + struct the_nilfs *nilfs = inode->i_sb->s_fs_info; >> + struct nilfs_transaction_info ti; >> + struct nilfs_argv argv; >> + int ret; >> + >> + if (!capable(CAP_SYS_ADMIN)) >> + return -EPERM; >> + >> + ret = mnt_want_write_file(filp); >> + if (ret) >> + return ret; >> + >> + ret = -EFAULT; >> + if (copy_from_user(&argv, argp, sizeof(argv))) >> + goto out; >> + >> + if (argv.v_size < membsz) >> + return -EINVAL; >> + >> + nilfs_transaction_begin(inode->i_sb, &ti, 0); >> + ret = nilfs_ioctl_wrap_copy(nilfs, &argv, _IOC_DIR(cmd), dofunc); > > Hmm, we have a problem here. > > nilfs_ioctl_wrap_copy() cannot be used between > nilfs_transcation_begin() and nilfs_transaction_end() since > nilfs_ioctl_wrap_copy() calls copy_from_user() and/or copy_to_user(). > Unfortunately there is a lock order restriction for these functions. > > Let's use vmalloc() like nilfs_ioctl_clean_segments(). The following > is a sample code for this: > > size_t len; > void __user *base; > void *kbuf; > int ret; > > ... > ret = -EINVAL; > if (argv.v_size < sizeof(struct nilfs_suinfo_update)) > goto out; > > < range check of argv.v_nmembs > > > len = argv.v_size * argv.v_nmembs; > base = (void __user *)(unsigned long)argv.v_base; > kbuf = vmalloc(len); > if (!kbuf) { > ret = -ENOMEM; > goto out; > } > > if (copy_from_user(kbuf, base, len)) { > ret = -EFAULT; > goto out_free; > } > > nilfs_transaction_begin(inode->i_sb, &ti, 0); > ret = < call nilfs_sufile_set_suinfo > ; > if (unlikely(ret < 0)) > nilfs_transcation_abort(inode->i_sb); > else > nilfs_transcation_commit(inode->i_sb); > > out_free: > vfree(kbuf); > out: > mnt_drop_write_file(filp); > ... > Then we don't need the ugly nilfs_suinfo_update structure. I can just send an array of struct nilfs_argv[2]. First element has the segment numbers and second element has the nilfs_suinfo structures and flags are in v_flags. Would that be acceptable? br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <52E26E46.3030702-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 3/3] nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl [not found] ` <52E26E46.3030702-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-24 15:23 ` Ryusuke Konishi 0 siblings, 0 replies; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-24 15:23 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Fri, 24 Jan 2014 14:44:38 +0100, Andreas Rohner wrote: > On 2014-01-24 14:20, Ryusuke Konishi wrote: >> On Tue, 21 Jan 2014 15:00:30 +0100, Andreas Rohner wrote: >>> With this ioctl the segment usage information in the SUFILE can be >>> updated from userspace. >>> >>> This is useful, because it allows the userspace GC to modify and update >>> segment usage entries for specific segments, which enables it to avoid >>> unnecessary write operations. >>> >>> If a segment needs to be cleaned, but there is no or very little free >>> space to be gained, the cleaning operation basically degrades to >>> needless expensive copying of data. In the end the only thing that >>> changes is the location of the data and a timestamp in the segment usage >>> info. With this ioctl the GC can skip the copying and update the segment >>> usage entries directly instead. >>> >>> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> >>> --- >>> fs/nilfs2/ioctl.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++ >>> include/linux/nilfs2_fs.h | 2 ++ >>> 2 files changed, 52 insertions(+) >>> >>> diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c >>> index b44bdb2..c6a3faf 100644 >>> --- a/fs/nilfs2/ioctl.c >>> +++ b/fs/nilfs2/ioctl.c >>> @@ -273,6 +273,13 @@ nilfs_ioctl_do_get_suinfo(struct the_nilfs *nilfs, __u64 *posp, int flags, >>> return ret; >>> } >>> >>> +static ssize_t >>> +nilfs_ioctl_do_set_suinfo(struct the_nilfs *nilfs, __u64 *posp, int flags, >>> + void *buf, size_t size, size_t nmembs) >>> +{ >>> + return nilfs_sufile_set_suinfo(nilfs->ns_sufile, buf, size, nmembs); >>> +} >>> + >>> static int nilfs_ioctl_get_sustat(struct inode *inode, struct file *filp, >>> unsigned int cmd, void __user *argp) >>> { >>> @@ -767,6 +774,44 @@ out: >>> return ret; >>> } >>> >>> +static int nilfs_ioctl_set_info(struct inode *inode, struct file *filp, >>> + unsigned int cmd, void __user *argp, >>> + size_t membsz, >>> + ssize_t (*dofunc)(struct the_nilfs *, >>> + __u64 *, int, >>> + void *, size_t, size_t)) >>> +{ >>> + struct the_nilfs *nilfs = inode->i_sb->s_fs_info; >>> + struct nilfs_transaction_info ti; >>> + struct nilfs_argv argv; >>> + int ret; >>> + >>> + if (!capable(CAP_SYS_ADMIN)) >>> + return -EPERM; >>> + >>> + ret = mnt_want_write_file(filp); >>> + if (ret) >>> + return ret; >>> + >>> + ret = -EFAULT; >>> + if (copy_from_user(&argv, argp, sizeof(argv))) >>> + goto out; >>> + >>> + if (argv.v_size < membsz) >>> + return -EINVAL; >>> + >>> + nilfs_transaction_begin(inode->i_sb, &ti, 0); >>> + ret = nilfs_ioctl_wrap_copy(nilfs, &argv, _IOC_DIR(cmd), dofunc); >> >> Hmm, we have a problem here. >> >> nilfs_ioctl_wrap_copy() cannot be used between >> nilfs_transcation_begin() and nilfs_transaction_end() since >> nilfs_ioctl_wrap_copy() calls copy_from_user() and/or copy_to_user(). >> Unfortunately there is a lock order restriction for these functions. >> >> Let's use vmalloc() like nilfs_ioctl_clean_segments(). The following >> is a sample code for this: >> >> size_t len; >> void __user *base; >> void *kbuf; >> int ret; >> >> ... >> ret = -EINVAL; >> if (argv.v_size < sizeof(struct nilfs_suinfo_update)) >> goto out; >> >> < range check of argv.v_nmembs > >> >> len = argv.v_size * argv.v_nmembs; >> base = (void __user *)(unsigned long)argv.v_base; >> kbuf = vmalloc(len); >> if (!kbuf) { >> ret = -ENOMEM; >> goto out; >> } >> >> if (copy_from_user(kbuf, base, len)) { >> ret = -EFAULT; >> goto out_free; >> } >> >> nilfs_transaction_begin(inode->i_sb, &ti, 0); >> ret = < call nilfs_sufile_set_suinfo > ; >> if (unlikely(ret < 0)) >> nilfs_transcation_abort(inode->i_sb); >> else >> nilfs_transcation_commit(inode->i_sb); >> >> out_free: >> vfree(kbuf); >> out: >> mnt_drop_write_file(filp); >> ... >> > > Then we don't need the ugly nilfs_suinfo_update structure. I can just > send an array of struct nilfs_argv[2]. First element has the segment > numbers and second element has the nilfs_suinfo structures and flags are > in v_flags. Would that be acceptable? Personally, I don't like nilfs_argv[n]. Unfortunately, we couldn't find a good way to pass several array arguments together for GC. For NILFS_IOCTL_SET_SUINFO, using nilfs_suinfo_update struct looks better to me from the viewpoint of memory access locality and readability (and clarity) of source code. Regards, Ryusuke Konishi > br, > Andreas Rohner > -- > To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v2 1/3] nilfs2: add new nilfs_suinfo_update struct [not found] ` <cec6a449ddf5ae9da2928cdddfb96ebcb2789eee.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> 2014-01-21 14:00 ` [PATCH v2 2/3] nilfs2: add nilfs_sufile_set_suinfo to update segment usage info Andreas Rohner 2014-01-21 14:00 ` [PATCH v2 3/3] nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl Andreas Rohner @ 2014-01-24 4:56 ` Ryusuke Konishi [not found] ` <20140124.135635.27780504.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 2 siblings, 1 reply; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-24 4:56 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Tue, 21 Jan 2014 15:00:28 +0100, Andreas Rohner wrote: > This adds struct nilfs_suinfo_update and some flags. It contains the > information needed to update one entry of segment usage information in > the SUFILE. The flags specify, which fields need to be updated. > > This can be used to update segment usage information from userspace with an > ioctl. > > Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> > --- > include/linux/nilfs2_fs.h | 39 +++++++++++++++++++++++++++++++++++++++ > 1 file changed, 39 insertions(+) > > diff --git a/include/linux/nilfs2_fs.h b/include/linux/nilfs2_fs.h > index 9875576..2ee1594 100644 > --- a/include/linux/nilfs2_fs.h > +++ b/include/linux/nilfs2_fs.h > @@ -709,6 +709,45 @@ static inline int nilfs_suinfo_clean(const struct nilfs_suinfo *si) > return !si->sui_flags; > } > > +/** > + * nilfs_suinfo_update - segment usage information update > + * @sup_segnum: segment number > + * @sup_flags: flags for which fields are active in sup_sui > + * @sup_sui: segment usage information > + */ > +struct nilfs_suinfo_update { > + __u64 sup_segnum; > + __u32 sup_flags; > + struct nilfs_suinfo sup_sui; > +}; This structure is not aligned on 8 byte boundary. You need a 4 byte pad to make the alignment architecture independent since the structure already has some 64-bit variables. If you do not so, the ioctl will lose compatibility, for instance, between 32 bit userland programs and 64 bit kernel. struct nilfs_suinfo_update { __u64 sup_segnum; __u32 sup_flags; __u32 sup_reserved; struct nilfs_suinfo sup_sui; }; Do you really need different suinfo update flags per segment ? If it's not so, we don't have to add nilfs_suinfo_update structure. v_flags of nilfs_argv structure looks to be available for that purpose. It's just a confirmation. Basically, I think this extension is acceptable. Regards, Ryusuke Konishi > + > +enum { > + NILFS_SUINFO_UPDATE_LASTMOD, > + NILFS_SUINFO_UPDATE_NBLOCKS, > + NILFS_SUINFO_UPDATE_FLAGS, > +}; > + > +#define NILFS_SUINFO_UPDATE_FNS(flag, name) \ > +static inline void \ > +nilfs_suinfo_update_set_##name(struct nilfs_suinfo_update *sup) \ > +{ \ > + sup->sup_flags |= 1UL << NILFS_SUINFO_UPDATE_##flag; \ > +} \ > +static inline void \ > +nilfs_suinfo_update_clear_##name(struct nilfs_suinfo_update *sup) \ > +{ \ > + sup->sup_flags &= ~(1UL << NILFS_SUINFO_UPDATE_##flag); \ > +} \ > +static inline int \ > +nilfs_suinfo_update_##name(const struct nilfs_suinfo_update *sup) \ > +{ \ > + return !!(sup->sup_flags & (1UL << NILFS_SUINFO_UPDATE_##flag));\ > +} > + > +NILFS_SUINFO_UPDATE_FNS(LASTMOD, lastmod) > +NILFS_SUINFO_UPDATE_FNS(NBLOCKS, nblocks) > +NILFS_SUINFO_UPDATE_FNS(FLAGS, flags) > + > /* ioctl */ > enum { > NILFS_CHECKPOINT, > -- > 1.8.5.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20140124.135635.27780504.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: [PATCH v2 1/3] nilfs2: add new nilfs_suinfo_update struct [not found] ` <20140124.135635.27780504.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> @ 2014-01-24 12:11 ` Andreas Rohner [not found] ` <52E25883.3010307-hi6Y0CQ0nG0@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-24 12:11 UTC (permalink / raw) To: Ryusuke Konishi; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 2014-01-24 05:56, Ryusuke Konishi wrote: > On Tue, 21 Jan 2014 15:00:28 +0100, Andreas Rohner wrote: >> This adds struct nilfs_suinfo_update and some flags. It contains the >> information needed to update one entry of segment usage information in >> the SUFILE. The flags specify, which fields need to be updated. >> >> This can be used to update segment usage information from userspace with an >> ioctl. >> >> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> >> --- >> include/linux/nilfs2_fs.h | 39 +++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 39 insertions(+) >> >> diff --git a/include/linux/nilfs2_fs.h b/include/linux/nilfs2_fs.h >> index 9875576..2ee1594 100644 >> --- a/include/linux/nilfs2_fs.h >> +++ b/include/linux/nilfs2_fs.h >> @@ -709,6 +709,45 @@ static inline int nilfs_suinfo_clean(const struct nilfs_suinfo *si) >> return !si->sui_flags; >> } >> >> +/** >> + * nilfs_suinfo_update - segment usage information update >> + * @sup_segnum: segment number >> + * @sup_flags: flags for which fields are active in sup_sui >> + * @sup_sui: segment usage information >> + */ >> +struct nilfs_suinfo_update { >> + __u64 sup_segnum; >> + __u32 sup_flags; >> + struct nilfs_suinfo sup_sui; >> +}; > > This structure is not aligned on 8 byte boundary. You need a 4 byte > pad to make the alignment architecture independent since the structure > already has some 64-bit variables. > > If you do not so, the ioctl will lose compatibility, for instance, > between 32 bit userland programs and 64 bit kernel. > > struct nilfs_suinfo_update { > __u64 sup_segnum; > __u32 sup_flags; > __u32 sup_reserved; > struct nilfs_suinfo sup_sui; > }; Ok, I thought 4 byte alignment would be enough. But now I see the problem, nilfs_suinfo contains 64 bit values, that need to be 8 byte aligned on a 64 bit architecture. > Do you really need different suinfo update flags per segment ? No. > If it's not so, we don't have to add nilfs_suinfo_update structure. > v_flags of nilfs_argv structure looks to be available for that purpose. Yes v_flags is available, but I need nilfs_suinfo_update to contain the segment number. I can't use v_index for the segment number like with GET_SUINFO, because the segment numbers are not necessarily sequential. If I use v_index I could effectively only use the ioctl to update one segment at a time. So I thought, since I need nilfs_suinfo_update anyway, I might as well put the flags there. But with the extra alignment it gets a bit ugly, so I tend to agree with you to put the flags into v_flags now. Maybe I missed something, but how would you propose to send the segment numbers, only using nilfs_suinfo and nilfs_argv? It is certainly possible to use v_index of nilfs_argv and update one segment at a time. It shouldn't be a problem, because there are at most #define NILFS_CLDCONFIG_NSEGMENTS_PER_CLEAN_MAX 32 segments to be updated. > It's just a confirmation. Basically, I think this extension > is acceptable. Cool. Thanks! br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <52E25883.3010307-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 1/3] nilfs2: add new nilfs_suinfo_update struct [not found] ` <52E25883.3010307-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-24 12:37 ` Ryusuke Konishi 0 siblings, 0 replies; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-24 12:37 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Fri, 24 Jan 2014 13:11:47 +0100, Andreas Rohner wrote: > On 2014-01-24 05:56, Ryusuke Konishi wrote: >> On Tue, 21 Jan 2014 15:00:28 +0100, Andreas Rohner wrote: >> Do you really need different suinfo update flags per segment ? > > No. > >> If it's not so, we don't have to add nilfs_suinfo_update structure. >> v_flags of nilfs_argv structure looks to be available for that purpose. > > Yes v_flags is available, but I need nilfs_suinfo_update to contain the > segment number. I can't use v_index for the segment number like with > GET_SUINFO, because the segment numbers are not necessarily sequential. Ok, it sounds reasonable. Please go ahead. Ryusuke Konishi > If I use v_index I could effectively only use the ioctl to update one > segment at a time. So I thought, since I need nilfs_suinfo_update > anyway, I might as well put the flags there. But with the extra > alignment it gets a bit ugly, so I tend to agree with you to put the > flags into v_flags now. > > Maybe I missed something, but how would you propose to send the segment > numbers, only using nilfs_suinfo and nilfs_argv? > > It is certainly possible to use v_index of nilfs_argv and update one > segment at a time. It shouldn't be a problem, because there are at most > > #define NILFS_CLDCONFIG_NSEGMENTS_PER_CLEAN_MAX 32 > > segments to be updated. > >> It's just a confirmation. Basically, I think this extension >> is acceptable. > > Cool. Thanks! > > br, > Andreas Rohner > -- > To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations [not found] ` <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> ` (5 preceding siblings ...) 2014-01-21 14:00 ` [PATCH v2 1/3] nilfs2: add new nilfs_suinfo_update struct Andreas Rohner @ 2014-01-22 23:46 ` Michael L. Semon [not found] ` <52E05863.90604-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> 2014-01-23 17:48 ` Vyacheslav Dubeyko 7 siblings, 1 reply; 44+ messages in thread From: Michael L. Semon @ 2014-01-22 23:46 UTC (permalink / raw) To: Andreas Rohner, linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 01/21/2014 08:59 AM, Andreas Rohner wrote: > Hi, > > This is the second version of this patch set. It replaces the kind of > hacky use of v_flags with a proper implementation of > NILFS_IOCTL_SET_SUINFO ioctl. > > v1->v2 > * Implementation of NILFS_IOCTL_SET_SUINFO > * Added mc_min_free_blocks_threshold config option > (if clean segments < min_clean_segments) > * Added new command line param for nilfs-clean > * Update man- and config-files > * Simpler benchmark > > This patch set implements a small new feature and there shouldn't be > any compatibility issues. It enables the GC to check how much free > space can be gained from cleaning a segment and if it is less than a > certain threshold it will abort the operation and try a different > segment. Although no blocks need to be moved, the SUFILE entry of the > corresponding segment needs to be updated to avoid an infinite loop. As a user (not a NILFS2 developer), I'll have to live with this one for a while to see how well I like it. On x86, xfstests went well for my debug DEBUG_PAGEALLOC CONFIG_AIO=n kernel 3.13.0+, no obvious before/after changes in speed, but no crashes, either. I hit a glitch with Vyacheslav's xattr/ACL pages on 2k blocksize filesystems, so I'll have to test smaller block sizes at a later time. So that's your warning: small block sizes and POSIX AIO were not tested by me. If you don't have xfstests, get it. It's useful for finding bugs and regressions in filesystems. Not having your super-secret test suite or disk space to run it, I went the other direction, using the commonly available fs_mark utility to make many tiny writes with 16 threads. My initial opinion is that your new GC code fixes some obvious lag when a filesystem is populated and nilfs_cleanerd starts to do its work. However, for reasons of code or simple mathematics, the file system hits end-of-space a bit earlier than does the unpatched code. I'll have to build some kernels, live with the system, and otherwise generate lots of checkpoints to know if this is a problem. IOW, I need to find out for myself if I need to make a slightly larger filesystem to do the same things using a patched NILFS2. Thanks! Michael [sample collection of data from fs_mark below] File/sec Output from this fs_mark command (two different runs)... fs_mark -d /mnt/xfstests-test/test -F -D 16 -t 16 -n 150 -s 28672 -w 4096 ...generates the following numbers for comparison: FILES OLD NEW 2400 219.4 222.3 4800 210.5 217.9 7200 216.7 212 9600 216 213 12000 216.1 213.1 14400 215.5 213.4 16800 213.5 215.4 19200 212.5 214.9 21600 214.2 209.6 24000 212 200 26400 191.6 194.4 28800 211.3 208.6 31200 193.8 190.6 33600 188.2 174.6 36000 139.7 192.2 38400 78.9 204.9 40800 110.6 188.8 43200 73.8 205.9 45600 75.7 205.7 48000 76 190.4 50400 115.4 187 52800 180.8 190.6 55200 192.4 202 57600 158.3 206.8 60000 201.7 189.1 62400 174.8 200.7 64800 170 189.2 67200 203.3 187.7 69600 174.8 175 72000 150.9 174.3 74400 141.7 175.6 76800 199.5 174.6 79200 180.2 174.9 81600 66.8 77.1 84000 40.8 76.6 86400 67.3 86.9 88800 59.3 77.1 91200 127.1 93600 113.5 96000 110.1 98400 112.6 100800 75.5 103200 58 105600 53.8 108000 45.2 110400 45.9 112800 48.2 The test partition is a 4-GB MD RAID-0 partition, 64k chunk size, on old, damaged spinning rust HDDs and old x86 hardware. GC issues show up well when using small writes and small file sizes, at least on this hardware. If this test moves more quickly on your hardware (it should), try to increase the threads before increasing other parameters. -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <52E05863.90604-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations [not found] ` <52E05863.90604-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> @ 2014-01-23 10:25 ` Andreas Rohner [not found] ` <52E0EE08.3040703-hi6Y0CQ0nG0@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-23 10:25 UTC (permalink / raw) To: Michael L. Semon, linux-nilfs-u79uwXL29TY76Z2rM5mHXA Hi Michael, On 2014-01-23 00:46, Michael L. Semon wrote: > Not having your super-secret test suite or disk space to run it, I > went the other direction, using the commonly available fs_mark utility > to make many tiny writes with 16 threads. My initial opinion is that > your new GC code fixes some obvious lag when a filesystem is populated > and nilfs_cleanerd starts to do its work. Thanks for testing my code. It is not a super-secret test suite :), it's just a few 100 lines of crappy C code I am embarrassed to publish. Thanks for pointing out fs_mark, I didn't know it. From what I can see it seems to be the perfect tool to test the GC. I will repeat my measurements with fs_mark over the weekend. > However, for reasons of code > or simple mathematics, the file system hits end-of-space a bit earlier > than does the unpatched code. I'll have to build some kernels, live > with the system, and otherwise generate lots of checkpoints to know if > this is a problem. IOW, I need to find out for myself if I need to > make a slightly larger filesystem to do the same things using a patched > NILFS2. Maybe my default values are a bit too high. 256 blocks are about 1 MB with 4k blocks. So if you use the current default settings, you could potentially loose 1/8 of your free space. By the way "blocks" is not a very good unit for the threshold in the config file anyway. If am thinking of changing it to "% of a segment", which would be independent of the block size and a lot more intuitive. br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <52E0EE08.3040703-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations [not found] ` <52E0EE08.3040703-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-23 20:57 ` Michael L. Semon 0 siblings, 0 replies; 44+ messages in thread From: Michael L. Semon @ 2014-01-23 20:57 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Thu, 23 Jan 2014, Andreas Rohner wrote: > Hi Michael, > > On 2014-01-23 00:46, Michael L. Semon wrote: >> Not having your super-secret test suite or disk space to run it, I >> went the other direction, using the commonly available fs_mark utility >> to make many tiny writes with 16 threads. My initial opinion is that >> your new GC code fixes some obvious lag when a filesystem is populated >> and nilfs_cleanerd starts to do its work. > > Thanks for testing my code. It is not a super-secret test suite :), it's > just a few 100 lines of crappy C code I am embarrassed to publish. > Thanks for pointing out fs_mark, I didn't know it. From what I can see > it seems to be the perfect tool to test the GC. I will repeat my > measurements with fs_mark over the weekend. No problem, just making sure. I had some code to zero random bits on a partition, and I'll probably never post that in public, either. It's just hard to verify conclusions based on tools that I don't have. fs_mark is one tool of many. That use of fs_mark came from trying to figure out RAID-0 stripe sizes; the NILFS2 results were by accident but reflect exactly what I'm seeing in prolonged normal use. xfstests has fsstress, fsx, and a few other special-purpose tools. There's also Flexible I/O Tester, but it's one of those cool scriptable testers that are a bit beyond my needs. >> However, for reasons of code >> or simple mathematics, the file system hits end-of-space a bit earlier >> than does the unpatched code. I'll have to build some kernels, live >> with the system, and otherwise generate lots of checkpoints to know if >> this is a problem. IOW, I need to find out for myself if I need to >> make a slightly larger filesystem to do the same things using a patched >> NILFS2. > > Maybe my default values are a bit too high. 256 blocks are about 1 MB > with 4k blocks. So if you use the current default settings, you could > potentially loose 1/8 of your free space. As long as I can plan free space in advance, all is well, though 12.5% is a little much. I try to plan NILFS2 to where I can run it all day without causing too much garbage collection, then run `mkcp -s` and nilfs-clean as cron jobs in the middle of the night. Thanks! Michael -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations [not found] ` <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org> ` (6 preceding siblings ...) 2014-01-22 23:46 ` [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations Michael L. Semon @ 2014-01-23 17:48 ` Vyacheslav Dubeyko [not found] ` <85EBEC6B-CA69-472A-8DDD-8E056F809EC4-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> 7 siblings, 1 reply; 44+ messages in thread From: Vyacheslav Dubeyko @ 2014-01-23 17:48 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Jan 21, 2014, at 4:59 PM, Andreas Rohner wrote: > Hi, > > This is the second version of this patch set. It replaces the kind of > hacky use of v_flags with a proper implementation of > NILFS_IOCTL_SET_SUINFO ioctl. > > v1->v2 > * Implementation of NILFS_IOCTL_SET_SUINFO > * Added mc_min_free_blocks_threshold config option > (if clean segments < min_clean_segments) > * Added new command line param for nilfs-clean > * Update man- and config-files > * Simpler benchmark If you are talking about something then it should be in patchset. Otherwise, why do you mention about it? > This patch set implements a small new feature and there shouldn't be > any compatibility issues. It enables the GC to check how much free > space can be gained from cleaning a segment and if it is less than a > certain threshold it will abort the operation and try a different > segment. When you have cleaned a segment then you can use the whole one. So, if segment has 8 MB in size then it will be available 8 MB free space. The phrase "It enables the GC to check how much free space can be gained from cleaning a segment" really confuses me. Because I always know how much space I gain after cleaning every segment. I suppose that you mean something different. Am I correct? > Although no blocks need to be moved, the SUFILE entry of the > corresponding segment needs to be updated to avoid an infinite loop. > > This is potentially useful for all gc policies, but it is especially > beneficial for the timestamp policy. I completely misunderstand this statement. What do you mean? > Lets assume for example a NILFS2 > volume with 20% static files and lets assume these static files are in > the oldest segments. The current timestamp policy will select the oldest > segments and, since the data is static, move them mostly unchanged to > new segments. After a while they will become the oldest segments again. > Then timestamp will move them again. These moving operations are > expensive and unnecessary. > > I used a simple benchmark to test the patch set (only a few lines of C). > I used a 100 GB partition and performed the following steps: > > 1. Write a 20 GB file > 2. Write a 50 GB file > 3. Overwrite chunks of 1 MB within the 50 GB file at random > 4. Repeat step 3 until 60 GB of data is written > > Steps 3 and 4 are only perfomed to get the GC started. So the benchmark > writes a 130 GB in total to a 100 GB partition. How is it possible to save 130 GB in 100 GB partition? Are you magician? :) > > HHD: > Timestamp GB Written: 340.7574 > Timestamp GB Read: 208.2935 > Timestamp Runtime: 7787.546s > > Patched GB Written: 313.2566 > Patched GB Read: 182.6389 > Patched Runtime: 7410.892s > > SSD: > Timestamp GB Written: 679.3901 > Timestamp GB Read: 242.59 > Timestamp Runtime: 3022.081s > > Patched GB Written: 500.0095 > Patched GB Read: 157.475 > Patched Runtime: 2313.448 > > The results for the HDD clearly show, that about 20 GB less data has > been written and read in the patched version. It is reasonable to > assume, that these 20 GB are the static data. > > The speed of the GC was tuned to the HDD. It was probably too aggressive > for the much faster SSD. That is probably the reason why the difference > in GB written and read is much higher than 20 GB. I misunderstand completely what you mean here. Thanks, Vyacheslav Dubeyko. -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <85EBEC6B-CA69-472A-8DDD-8E056F809EC4-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>]
* Re: [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations [not found] ` <85EBEC6B-CA69-472A-8DDD-8E056F809EC4-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> @ 2014-01-23 18:12 ` Andreas Rohner [not found] ` <52E15B9E.6040307-hi6Y0CQ0nG0@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-23 18:12 UTC (permalink / raw) To: Vyacheslav Dubeyko; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 2014-01-23 18:48, Vyacheslav Dubeyko wrote: > > On Jan 21, 2014, at 4:59 PM, Andreas Rohner wrote: > >> Hi, >> >> This is the second version of this patch set. It replaces the kind of >> hacky use of v_flags with a proper implementation of >> NILFS_IOCTL_SET_SUINFO ioctl. >> >> v1->v2 >> * Implementation of NILFS_IOCTL_SET_SUINFO >> * Added mc_min_free_blocks_threshold config option >> (if clean segments < min_clean_segments) >> * Added new command line param for nilfs-clean >> * Update man- and config-files >> * Simpler benchmark > > If you are talking about something then it should be in patchset. > Otherwise, why do you mention about it? > >> This patch set implements a small new feature and there shouldn't be >> any compatibility issues. It enables the GC to check how much free >> space can be gained from cleaning a segment and if it is less than a >> certain threshold it will abort the operation and try a different >> segment. > > When you have cleaned a segment then you can use the whole one. > So, if segment has 8 MB in size then it will be available 8 MB free space. > The phrase "It enables the GC to check how much free space can be gained > from cleaning a segment" really confuses me. Because I always know > how much space I gain after cleaning every segment. I suppose that you > mean something different. Am I correct? You have to move the live blocks to a new segment, so you gain only (8 MB - live_blocks) of free space. >> Although no blocks need to be moved, the SUFILE entry of the >> corresponding segment needs to be updated to avoid an infinite loop. >> >> This is potentially useful for all gc policies, but it is especially >> beneficial for the timestamp policy. > > I completely misunderstand this statement. What do you mean? Well the timestamp policy always selects the oldest segment. If the oldest segment is below the threshold it won't be cleaned. If we don't change the timestamp it will be immediately selected again and it probably will still be below the threshold and so on in an infinite loop. >> Lets assume for example a NILFS2 >> volume with 20% static files and lets assume these static files are in >> the oldest segments. The current timestamp policy will select the oldest >> segments and, since the data is static, move them mostly unchanged to >> new segments. After a while they will become the oldest segments again. >> Then timestamp will move them again. These moving operations are >> expensive and unnecessary. >> >> I used a simple benchmark to test the patch set (only a few lines of C). >> I used a 100 GB partition and performed the following steps: >> >> 1. Write a 20 GB file >> 2. Write a 50 GB file >> 3. Overwrite chunks of 1 MB within the 50 GB file at random >> 4. Repeat step 3 until 60 GB of data is written >> >> Steps 3 and 4 are only perfomed to get the GC started. So the benchmark >> writes a 130 GB in total to a 100 GB partition. > > How is it possible to save 130 GB in 100 GB partition? Are you magician? :) I OVERWRITE the 50 GB file. 130 GB is the total amount of data written to the partition. >> >> HHD: >> Timestamp GB Written: 340.7574 >> Timestamp GB Read: 208.2935 >> Timestamp Runtime: 7787.546s >> >> Patched GB Written: 313.2566 >> Patched GB Read: 182.6389 >> Patched Runtime: 7410.892s >> >> SSD: >> Timestamp GB Written: 679.3901 >> Timestamp GB Read: 242.59 >> Timestamp Runtime: 3022.081s >> >> Patched GB Written: 500.0095 >> Patched GB Read: 157.475 >> Patched Runtime: 2313.448 >> >> The results for the HDD clearly show, that about 20 GB less data has >> been written and read in the patched version. It is reasonable to >> assume, that these 20 GB are the static data. >> >> The speed of the GC was tuned to the HDD. It was probably too aggressive >> for the much faster SSD. That is probably the reason why the difference >> in GB written and read is much higher than 20 GB. > > I misunderstand completely what you mean here. You need to be a bit more specific. These are the results of my benchmark. The GB Written and GB Read values are calculated by simply importing /proc/diskstats into R (You subtract the values before the benchmark from those after the benchmark). The patched version writes less and reads less. Pretty simple. br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <52E15B9E.6040307-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations [not found] ` <52E15B9E.6040307-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-24 8:02 ` Vyacheslav Dubeyko [not found] ` <FC7F25C5-1E72-4DF5-B860-FBCE36E91EAB-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Vyacheslav Dubeyko @ 2014-01-24 8:02 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > 23 ÑÎ×. 2014 Ç., × 22:12, Andreas Rohner <andreas.rohner@gmx.net>: >> >>> This patch set implements a small new feature and there shouldn't be >>> any compatibility issues. It enables the GC to check how much free >>> space can be gained from cleaning a segment and if it is less than a >>> certain threshold it will abort the operation and try a different >>> segment. >> >> When you have cleaned a segment then you can use the whole one. >> So, if segment has 8 MB in size then it will be available 8 MB free space. >> The phrase "It enables the GC to check how much free space can be gained >> from cleaning a segment" really confuses me. Because I always know >> how much space I gain after cleaning every segment. I suppose that you >> mean something different. Am I correct? > > You have to move the live blocks to a new segment, so you gain only (8 > MB - live_blocks) of free space. You always will have 8 MB after cleaning (garbage collection). So, all segments are identical from the free space point of view. What does GC check? And how can GC distinguish segments on the basis of free space? All cleaned segments return 8 MB free space for further allocation. So, all used segments will be over any threshold. >>> Although no blocks need to be moved, the SUFILE entry of the >>> corresponding segment needs to be updated to avoid an infinite loop. >>> >>> This is potentially useful for all gc policies, but it is especially >>> beneficial for the timestamp policy. >> >> I completely misunderstand this statement. What do you mean? > > Well the timestamp policy always selects the oldest segment. If the > oldest segment is below the threshold it won't be cleaned. If we don't > change the timestamp it will be immediately selected again and it > probably will still be below the threshold and so on in an infinite loop. I understand your approach. But I misunderstand about what usefulness you are talking. What is it? [snip] >>> >>> The speed of the GC was tuned to the HDD. It was probably too aggressive >>> for the much faster SSD. That is probably the reason why the difference >>> in GB written and read is much higher than 20 GB. >> >> I misunderstand completely what you mean here. > > You need to be a bit more specific. These are the results of my > benchmark. The GB Written and GB Read values are calculated by simply > importing /proc/diskstats into R (You subtract the values before the > benchmark from those after the benchmark). > > The patched version writes less and reads less. Pretty simple. Here I asked about aggressiveness related to SSD. What do you mean? Thanks, Vyacheslav Dubeyko. -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <FC7F25C5-1E72-4DF5-B860-FBCE36E91EAB-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>]
* Re: [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations [not found] ` <FC7F25C5-1E72-4DF5-B860-FBCE36E91EAB-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> @ 2014-01-24 14:18 ` Andreas Rohner [not found] ` <52E27651.6070207-hi6Y0CQ0nG0@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Andreas Rohner @ 2014-01-24 14:18 UTC (permalink / raw) To: Vyacheslav Dubeyko; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On 2014-01-24 09:02, Vyacheslav Dubeyko wrote: > >> 23 янв. 2014 г., в 22:12, Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>: >>> >>>> This patch set implements a small new feature and there shouldn't be >>>> any compatibility issues. It enables the GC to check how much free >>>> space can be gained from cleaning a segment and if it is less than a >>>> certain threshold it will abort the operation and try a different >>>> segment. >>> >>> When you have cleaned a segment then you can use the whole one. >>> So, if segment has 8 MB in size then it will be available 8 MB free space. >>> The phrase "It enables the GC to check how much free space can be gained >>> from cleaning a segment" really confuses me. Because I always know >>> how much space I gain after cleaning every segment. I suppose that you >>> mean something different. Am I correct? >> >> You have to move the live blocks to a new segment, so you gain only (8 >> MB - live_blocks) of free space. > > You always will have 8 MB after cleaning (garbage collection). So, all segments > are identical from the free space point of view. What does GC check? And how > can GC distinguish segments on the basis of free space? All cleaned segments > return 8 MB free space for further allocation. So, all used segments will be over > any threshold. I am sorry if I wasn't clear enough. The invalidated blocks in a segment are basically unusable free space, that needs to be garbage collected. If you clean a segment you only gain the space of the invalidated blocks, from the perspective of the whole file system. Of course the whole segment is free after cleaning, but the live blocks were moved somewhere else and occupy space there. So from the perspective of the whole file system you only really gained the amount of space occupied by the invalidated blocks. >>>> Although no blocks need to be moved, the SUFILE entry of the >>>> corresponding segment needs to be updated to avoid an infinite loop. >>>> >>>> This is potentially useful for all gc policies, but it is especially >>>> beneficial for the timestamp policy. >>> >>> I completely misunderstand this statement. What do you mean? >> >> Well the timestamp policy always selects the oldest segment. If the >> oldest segment is below the threshold it won't be cleaned. If we don't >> change the timestamp it will be immediately selected again and it >> probably will still be below the threshold and so on in an infinite loop. > > I understand your approach. But I misunderstand about what usefulness > you are talking. What is it? > It helps to avoid unnecessary copy operations of data. >>>> >>>> The speed of the GC was tuned to the HDD. It was probably too aggressive >>>> for the much faster SSD. That is probably the reason why the difference >>>> in GB written and read is much higher than 20 GB. >>> >>> I misunderstand completely what you mean here. >> >> You need to be a bit more specific. These are the results of my >> benchmark. The GB Written and GB Read values are calculated by simply >> importing /proc/diskstats into R (You subtract the values before the >> benchmark from those after the benchmark). >> >> The patched version writes less and reads less. Pretty simple. > > Here I asked about aggressiveness related to SSD. What do you mean? I meant the configuration of the GC. The benchmark writes a lot of data, and the GC needs to keep up with that. So I configured protection_period 30 mc_nsegments_per_clean = 4 mc_cleaning_interval = 0.25 This works for the hard drive, because it is too slow to really reach the 0.25 seconds anyway. The SSD on the other hand reaches the 0.25 seconds easily and that is probably too fast. So there is no time for the data to get invalidated and the GC moves too much data around. So my point is, that the results for the SSD are not to be trusted and that I need to repeat these measurements with different settings. br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <52E27651.6070207-hi6Y0CQ0nG0@public.gmane.org>]
* Re: [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations [not found] ` <52E27651.6070207-hi6Y0CQ0nG0@public.gmane.org> @ 2014-01-25 16:33 ` Vyacheslav Dubeyko [not found] ` <82ACC3FC-BE83-483F-99D8-D13F4D02C58F-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Vyacheslav Dubeyko @ 2014-01-25 16:33 UTC (permalink / raw) To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On Jan 24, 2014, at 5:18 PM, Andreas Rohner wrote: >>> You have to move the live blocks to a new segment, so you gain only (8 >>> MB - live_blocks) of free space. >> >> You always will have 8 MB after cleaning (garbage collection). So, all segments >> are identical from the free space point of view. What does GC check? And how >> can GC distinguish segments on the basis of free space? All cleaned segments >> return 8 MB free space for further allocation. So, all used segments will be over >> any threshold. > > I am sorry if I wasn't clear enough. The invalidated blocks in a segment > are basically unusable free space, that needs to be garbage collected. > If you clean a segment you only gain the space of the invalidated > blocks, from the perspective of the whole file system. Of course the > whole segment is free after cleaning, but the live blocks were moved > somewhere else and occupy space there. So from the perspective of the > whole file system you only really gained the amount of space occupied by > the invalidated blocks. > I suggest not to use "free blocks" term for the case of invalidated blocks. Because it is really confusing way. I suppose that we can use term "invalid blocks" or "dead blocks". But I think that "dead blocks" is not good way too. Because, from my point of view, invalidated blocks are blocks that keep previous state of actual blocks. Dead block sounds for me likewise unusable or bad block. >>>>> Although no blocks need to be moved, the SUFILE entry of the >>>>> corresponding segment needs to be updated to avoid an infinite loop. >>>>> >>>>> This is potentially useful for all gc policies, but it is especially >>>>> beneficial for the timestamp policy. >>>> >>>> I completely misunderstand this statement. What do you mean? >>> >>> Well the timestamp policy always selects the oldest segment. If the >>> oldest segment is below the threshold it won't be cleaned. If we don't >>> change the timestamp it will be immediately selected again and it >>> probably will still be below the threshold and so on in an infinite loop. >> >> I understand your approach. But I misunderstand about what usefulness >> you are talking. What is it? >> > > It helps to avoid unnecessary copy operations of data. Ok. But how your implementation and approach can be used for timestamp policy implementation? Is it possible to combine GC policies into hybrid one? Thanks, Vyacheslav Dubeyko. -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <82ACC3FC-BE83-483F-99D8-D13F4D02C58F-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>]
* Re: [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations [not found] ` <82ACC3FC-BE83-483F-99D8-D13F4D02C58F-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org> @ 2014-01-26 6:57 ` Ryusuke Konishi [not found] ` <20140126.155740.56352351.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Ryusuke Konishi @ 2014-01-26 6:57 UTC (permalink / raw) To: Andreas Rohner; +Cc: Vyacheslav Dubeyko, linux-nilfs-u79uwXL29TY76Z2rM5mHXA On Sat, 25 Jan 2014 19:33:42 +0300, Vyacheslav Dubeyko wrote: > > On Jan 24, 2014, at 5:18 PM, Andreas Rohner wrote: > >>>> You have to move the live blocks to a new segment, so you gain only (8 >>>> MB - live_blocks) of free space. >>> >>> You always will have 8 MB after cleaning (garbage collection). So, all segments >>> are identical from the free space point of view. What does GC check? And how >>> can GC distinguish segments on the basis of free space? All cleaned segments >>> return 8 MB free space for further allocation. So, all used segments will be over >>> any threshold. >> >> I am sorry if I wasn't clear enough. The invalidated blocks in a segment >> are basically unusable free space, that needs to be garbage collected. >> If you clean a segment you only gain the space of the invalidated >> blocks, from the perspective of the whole file system. Of course the >> whole segment is free after cleaning, but the live blocks were moved >> somewhere else and occupy space there. So from the perspective of the >> whole file system you only really gained the amount of space occupied by >> the invalidated blocks. >> > > I suggest not to use "free blocks" term for the case of invalidated blocks. > Because it is really confusing way. I suppose that we can use term > "invalid blocks" or "dead blocks". But I think that "dead blocks" is not > good way too. Because, from my point of view, invalidated blocks are > blocks that keep previous state of actual blocks. Dead block sounds for > me likewise unusable or bad block. "free blocks" term sounds confusing. We often use terms "live" and "dead" for garbage collection, but "dead blocks" also sounds unfit to me as Vyacheslav pointed out. How about "min reclaimable blocks" or "min collectable blocks" ? Regards, Ryusuke Konishi -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20140126.155740.56352351.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>]
* Re: [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations [not found] ` <20140126.155740.56352351.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org> @ 2014-01-26 10:36 ` Andreas Rohner 0 siblings, 0 replies; 44+ messages in thread From: Andreas Rohner @ 2014-01-26 10:36 UTC (permalink / raw) To: Ryusuke Konishi; +Cc: Vyacheslav Dubeyko, linux-nilfs-u79uwXL29TY76Z2rM5mHXA On 2014-01-26 07:57, Ryusuke Konishi wrote: > On Sat, 25 Jan 2014 19:33:42 +0300, Vyacheslav Dubeyko wrote: >> >> On Jan 24, 2014, at 5:18 PM, Andreas Rohner wrote: >> >>>>> You have to move the live blocks to a new segment, so you gain only (8 >>>>> MB - live_blocks) of free space. >>>> >>>> You always will have 8 MB after cleaning (garbage collection). So, all segments >>>> are identical from the free space point of view. What does GC check? And how >>>> can GC distinguish segments on the basis of free space? All cleaned segments >>>> return 8 MB free space for further allocation. So, all used segments will be over >>>> any threshold. >>> >>> I am sorry if I wasn't clear enough. The invalidated blocks in a segment >>> are basically unusable free space, that needs to be garbage collected. >>> If you clean a segment you only gain the space of the invalidated >>> blocks, from the perspective of the whole file system. Of course the >>> whole segment is free after cleaning, but the live blocks were moved >>> somewhere else and occupy space there. So from the perspective of the >>> whole file system you only really gained the amount of space occupied by >>> the invalidated blocks. >>> >> >> I suggest not to use "free blocks" term for the case of invalidated blocks. >> Because it is really confusing way. I suppose that we can use term >> "invalid blocks" or "dead blocks". But I think that "dead blocks" is not >> good way too. Because, from my point of view, invalidated blocks are >> blocks that keep previous state of actual blocks. Dead block sounds for >> me likewise unusable or bad block. > > "free blocks" term sounds confusing. We often use terms > "live" and "dead" for garbage collection, but "dead blocks" also > sounds unfit to me as Vyacheslav pointed out. > > How about "min reclaimable blocks" or "min collectable blocks" ? Ok I will also change the config options accordingly. I like the term "reclaimable blocks". Regards, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
end of thread, other threads:[~2014-01-26 11:24 UTC | newest]
Thread overview: 44+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-01-21 13:59 [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations Andreas Rohner
[not found] ` <cover.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-01-21 13:59 ` [PATCH v2 1/5] nilfs-utils: cldconfig add an option to set minimal free blocks Andreas Rohner
[not found] ` <22b5b3ac403052d3044dc8c1bebe323191376c03.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-01-21 14:10 ` dexen deVries
2014-01-21 14:38 ` Andreas Rohner
2014-01-21 14:53 ` Andreas Rohner
2014-01-23 17:49 ` Vyacheslav Dubeyko
[not found] ` <B1FCAEBD-EB58-4A06-BD6B-7D4FB533D9F1-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>
2014-01-23 18:31 ` Andreas Rohner
2014-01-21 13:59 ` [PATCH v2 2/5] nilfs-utils: cleanerd: add custom error value to enable fast retry Andreas Rohner
[not found] ` <e9d3dd17318a994fe7e2c100368212e0b4029e13.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-01-23 17:49 ` Vyacheslav Dubeyko
[not found] ` <FE7FB751-68F4-48C3-A97A-782F4F6E69FE-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>
2014-01-23 19:08 ` Andreas Rohner
2014-01-21 13:59 ` [PATCH v2 3/5] nilfs-utils: refactoring of nilfs_reclaim_segment to add minblocks param Andreas Rohner
[not found] ` <36ef66ee15b3de8ca00815a6baa7bf6b62ca57d4.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-01-24 18:04 ` Ryusuke Konishi
2014-01-21 13:59 ` [PATCH v2 4/5] nilfs-utils: add support for NILFS_IOCTL_SET_SUINFO ioctl Andreas Rohner
[not found] ` <72f8c37258d08ba9793b0c1bb0374dd8efcd4756.1390310509.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-01-23 17:49 ` Vyacheslav Dubeyko
[not found] ` <62FA32DB-83AC-4570-BD73-618C169390FE-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>
2014-01-23 19:17 ` Andreas Rohner
[not found] ` <52E16AE4.4000303-hi6Y0CQ0nG0@public.gmane.org>
2014-01-25 16:16 ` Vyacheslav Dubeyko
2014-01-24 18:26 ` Ryusuke Konishi
[not found] ` <20140125.032633.184837411.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-01-24 20:00 ` Andreas Rohner
[not found] ` <52E2C643.3070704-hi6Y0CQ0nG0@public.gmane.org>
2014-01-25 18:52 ` Ryusuke Konishi
2014-01-26 10:00 ` Ryusuke Konishi
[not found] ` <20140126.190004.443429632.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-01-26 11:24 ` Andreas Rohner
2014-01-21 13:59 ` [PATCH v2 5/5] nilfs-utils: man: add description of min_free_blocks_threshold Andreas Rohner
2014-01-21 14:00 ` [PATCH v2 1/3] nilfs2: add new nilfs_suinfo_update struct Andreas Rohner
[not found] ` <cec6a449ddf5ae9da2928cdddfb96ebcb2789eee.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-01-21 14:00 ` [PATCH v2 2/3] nilfs2: add nilfs_sufile_set_suinfo to update segment usage info Andreas Rohner
[not found] ` <2fd48b2d524a59a02bdad13c0c194de3e5b55cc7.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-01-24 11:56 ` Ryusuke Konishi
[not found] ` <20140124.205623.400541300.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-01-24 12:34 ` Andreas Rohner
[not found] ` <52E25DDC.2060502-hi6Y0CQ0nG0@public.gmane.org>
2014-01-24 16:34 ` Ryusuke Konishi
2014-01-21 14:00 ` [PATCH v2 3/3] nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl Andreas Rohner
[not found] ` <6fb5a6d45afca9ae2599c471c0e42805ed1b6c55.1390312777.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-01-24 13:20 ` Ryusuke Konishi
[not found] ` <20140124.222016.110509397.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-01-24 13:44 ` Andreas Rohner
[not found] ` <52E26E46.3030702-hi6Y0CQ0nG0@public.gmane.org>
2014-01-24 15:23 ` Ryusuke Konishi
2014-01-24 4:56 ` [PATCH v2 1/3] nilfs2: add new nilfs_suinfo_update struct Ryusuke Konishi
[not found] ` <20140124.135635.27780504.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-01-24 12:11 ` Andreas Rohner
[not found] ` <52E25883.3010307-hi6Y0CQ0nG0@public.gmane.org>
2014-01-24 12:37 ` Ryusuke Konishi
2014-01-22 23:46 ` [PATCH v2 0/5] nilfs-utils: skip inefficient gc operations Michael L. Semon
[not found] ` <52E05863.90604-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-01-23 10:25 ` Andreas Rohner
[not found] ` <52E0EE08.3040703-hi6Y0CQ0nG0@public.gmane.org>
2014-01-23 20:57 ` Michael L. Semon
2014-01-23 17:48 ` Vyacheslav Dubeyko
[not found] ` <85EBEC6B-CA69-472A-8DDD-8E056F809EC4-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>
2014-01-23 18:12 ` Andreas Rohner
[not found] ` <52E15B9E.6040307-hi6Y0CQ0nG0@public.gmane.org>
2014-01-24 8:02 ` Vyacheslav Dubeyko
[not found] ` <FC7F25C5-1E72-4DF5-B860-FBCE36E91EAB-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>
2014-01-24 14:18 ` Andreas Rohner
[not found] ` <52E27651.6070207-hi6Y0CQ0nG0@public.gmane.org>
2014-01-25 16:33 ` Vyacheslav Dubeyko
[not found] ` <82ACC3FC-BE83-483F-99D8-D13F4D02C58F-yeENwD64cLxBDgjK7y7TUQ@public.gmane.org>
2014-01-26 6:57 ` Ryusuke Konishi
[not found] ` <20140126.155740.56352351.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-01-26 10:36 ` Andreas Rohner
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.