linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 00/02] swap: allowing a more flexible DISCARD policy
@ 2013-05-21  0:04 Rafael Aquini
  2013-05-21  0:04 ` [RFC PATCH 01/02] swap: discard while swapping only if SWAP_FLAG_DISCARD_CLUSTER Rafael Aquini
  2013-05-21  0:04 ` [RFC PATCH 02/02] swapon: add "cluster-discard" support Rafael Aquini
  0 siblings, 2 replies; 12+ messages in thread
From: Rafael Aquini @ 2013-05-21  0:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hughd, shli, kzak, jmoyer, riel, lwoodman,
	mgorman

Howdy folks,

While working on a backport for the following changes:

3399446 swap: discard while swapping only if SWAP_FLAG_DISCARD
052b198 swap: don't do discard if no discard option added

We found ourselves around an interesting discussion on how limiting 
the behavior with regard to user-visible swap areas configuration has become
after applying the aforementioned changesets.

Before commit 3399446, if the swap backing device supported DISCARD,
then a batched discard was issued at swapon(8) time, and fine-grained DISCARDs
were issued in between freeing swap page-clusters and re-writing to them.
As noticed at 3399446's commit message, the fine-grained discards often
didn't help on improving performance as expected, and were potentially causing
more trouble than desired. So, commit 3399446 introduced a new swapon flag,
to make the fine-grained discards while swapping conditional.
However a batched discard would have been issued everytime swapon(8) was
turning a new swap area available.

This batched operation that remained at sys_swapon was considered troublesome
for some setups, and specially because a sysadmin was not flagging swapon(8)
to do discards -- http://www.spinics.net/lists/linux-mm/msg31741.html 
then, commit 052b198 got merged to address the scenario described above.
After this last commit, now we can either only do both batched and fine-grained
discards for swap, or none of them.

As depicted above, this seems to be not very flexible as it could be,
and the whole discussion we had (internally) left us wondering if does upstream
feel it would be useful to allow for both a batched discard as well as a
fine-grained discard option for swap? (By batched, here, it could mean just
the one time operation at swapon, or something similar to what fstrim does).

In fact, we all agreed with having no discards sent down at all as the default
behaviour. But thinking a little more about the use cases where device supports
discard:
a) and can do it quickly;
b) but it's slow to do in small granularities (or concurrent with other
   I/O);
c) but the implementation is so horrendous that you don't even want to
   send one down;

And assuming that the sysadmin considers it useful to send the discards down
at all, we would (probably) want the following solutions:

1) do the fine-grained discards if swap device is capable of doing so;
2) do batched discards, either at swapon or via something like fstrim; or
3) turn it off completely (default behavior nowadays)


i.e.: Today, if we have a swap device like (b), we cannot perform (2) even if
it might be regardeed as interesting, or necessary to the workload because it
would imply (1), and the device cannot do that and perform optimally.


With all that in mind, and in order to attempt to sort out the (un)flexibility
problem exposed above, I came up with the following patches:

01 (kernel)     swap: discard while swapping only if SWAP_FLAG_DISCARD_CLUSTER
02 (util-linux) swapon: add "cluster-discard" support


Sorry for the long email.

Your feedback is very much appreciated!
-- Rafael

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [RFC PATCH 01/02] swap: discard while swapping only if SWAP_FLAG_DISCARD_CLUSTER
  2013-05-21  0:04 [RFC PATCH 00/02] swap: allowing a more flexible DISCARD policy Rafael Aquini
@ 2013-05-21  0:04 ` Rafael Aquini
  2013-05-21  0:55   ` KOSAKI Motohiro
  2013-05-21  0:04 ` [RFC PATCH 02/02] swapon: add "cluster-discard" support Rafael Aquini
  1 sibling, 1 reply; 12+ messages in thread
From: Rafael Aquini @ 2013-05-21  0:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hughd, shli, kzak, jmoyer, riel, lwoodman,
	mgorman

Intruduce a new flag to make page-cluster fine-grained discards while swapping
conditional, as they can be considered detrimental to some setups. However,
keep allowing batched discards at sys_swapon() time, when enabled by the
system administrator. 

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 include/linux/swap.h |  8 +++++---
 mm/swapfile.c        | 12 ++++++++----
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1701ce4..ab2e742 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -19,10 +19,11 @@ struct bio;
 #define SWAP_FLAG_PREFER	0x8000	/* set if swap priority specified */
 #define SWAP_FLAG_PRIO_MASK	0x7fff
 #define SWAP_FLAG_PRIO_SHIFT	0
-#define SWAP_FLAG_DISCARD	0x10000 /* discard swap cluster after use */
+#define SWAP_FLAG_DISCARD	0x10000 /* enable discard for swap areas */
+#define SWAP_FLAG_DISCARD_CLUSTER 0x20000 /* discard swap clusters after use */
 
 #define SWAP_FLAGS_VALID	(SWAP_FLAG_PRIO_MASK | SWAP_FLAG_PREFER | \
-				 SWAP_FLAG_DISCARD)
+				 SWAP_FLAG_DISCARD | SWAP_FLAG_DISCARD_CLUSTER)
 
 static inline int current_is_kswapd(void)
 {
@@ -152,8 +153,9 @@ enum {
 	SWP_CONTINUED	= (1 << 5),	/* swap_map has count continuation */
 	SWP_BLKDEV	= (1 << 6),	/* its a block device */
 	SWP_FILE	= (1 << 7),	/* set after swap_activate success */
+	SWP_CLUSTERDISCARD = (1 << 8),	/* discard swap cluster after usage */
 					/* add others here before... */
-	SWP_SCANNING	= (1 << 8),	/* refcount in scan_swap_map */
+	SWP_SCANNING	= (1 << 9),	/* refcount in scan_swap_map */
 };
 
 #define SWAP_CLUSTER_MAX 32UL
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6c340d9..197461f 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -212,7 +212,7 @@ static unsigned long scan_swap_map(struct swap_info_struct *si,
 			si->cluster_nr = SWAPFILE_CLUSTER - 1;
 			goto checks;
 		}
-		if (si->flags & SWP_DISCARDABLE) {
+		if (si->flags & SWP_CLUSTERDISCARD) {
 			/*
 			 * Start range check on racing allocations, in case
 			 * they overlap the cluster we eventually decide on
@@ -322,7 +322,7 @@ checks:
 
 	if (si->lowest_alloc) {
 		/*
-		 * Only set when SWP_DISCARDABLE, and there's a scan
+		 * Only set when SWP_CLUSTERDISCARD, and there's a scan
 		 * for a free cluster in progress or just completed.
 		 */
 		if (found_free_cluster) {
@@ -2123,8 +2123,11 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 			p->flags |= SWP_SOLIDSTATE;
 			p->cluster_next = 1 + (prandom_u32() % p->highest_bit);
 		}
-		if ((swap_flags & SWAP_FLAG_DISCARD) && discard_swap(p) == 0)
+		if ((swap_flags & SWAP_FLAG_DISCARD) && discard_swap(p) == 0) {
 			p->flags |= SWP_DISCARDABLE;
+			if (swap_flags & SWAP_FLAG_DISCARD_CLUSTER)
+				p->flags |= SWP_CLUSTERDISCARD;
+		}
 	}
 
 	mutex_lock(&swapon_mutex);
@@ -2135,11 +2138,12 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 	enable_swap_info(p, prio, swap_map, frontswap_map);
 
 	printk(KERN_INFO "Adding %uk swap on %s.  "
-			"Priority:%d extents:%d across:%lluk %s%s%s\n",
+			"Priority:%d extents:%d across:%lluk %s%s%s%s\n",
 		p->pages<<(PAGE_SHIFT-10), name->name, p->prio,
 		nr_extents, (unsigned long long)span<<(PAGE_SHIFT-10),
 		(p->flags & SWP_SOLIDSTATE) ? "SS" : "",
 		(p->flags & SWP_DISCARDABLE) ? "D" : "",
+		(p->flags & SWP_CLUSTERDISCARD) ? "C" : "",
 		(frontswap_map) ? "FS" : "");
 
 	mutex_unlock(&swapon_mutex);
-- 
1.7.11.7

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 02/02] swapon: add "cluster-discard" support
  2013-05-21  0:04 [RFC PATCH 00/02] swap: allowing a more flexible DISCARD policy Rafael Aquini
  2013-05-21  0:04 ` [RFC PATCH 01/02] swap: discard while swapping only if SWAP_FLAG_DISCARD_CLUSTER Rafael Aquini
@ 2013-05-21  0:04 ` Rafael Aquini
  2013-05-21  1:02   ` KOSAKI Motohiro
  2013-05-21 10:13   ` Karel Zak
  1 sibling, 2 replies; 12+ messages in thread
From: Rafael Aquini @ 2013-05-21  0:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hughd, shli, kzak, jmoyer, riel, lwoodman,
	mgorman

Introduce a new swapon flag/option to support more flexible swap discard setup.
The --cluster-discard swapon(8) option can be used by a system admin to flag
sys_swapon() to perform page-cluster fine-grained discards.

This patch also changes the behaviour of swapon(8) --discard option, that now
will only be used to flag sys_swapon() batched discards will be issued at
swapon(8) time.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 sys-utils/swapon.8 | 19 ++++++++++++++++---
 sys-utils/swapon.c | 34 ++++++++++++++++++++++++++--------
 2 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/sys-utils/swapon.8 b/sys-utils/swapon.8
index 385bf5a..0c8ac69 100644
--- a/sys-utils/swapon.8
+++ b/sys-utils/swapon.8
@@ -112,10 +112,23 @@ All devices marked as ``swap'' in
 are made available, except for those with the ``noauto'' option.
 Devices that are already being used as swap are silently skipped.
 .TP
+.TP
+.B "\-c, \-\-cluster\-discard"
+Swapping will discard clusters of swap pages in between freeing them
+and re-writing to them, if the swap device supports that. This option
+also implies the
+.I \-d, \-\-discard
+swapon flag.
+The
+.I /etc/fstab
+mount option
+.BI cluster\-discard
+may be also used to enable this flag.
+
+.TP
 .B "\-d, \-\-discard"
-Discard freed swap pages before they are reused, if the swap
-device supports the discard or trim operation.  This may improve
-performance on some Solid State Devices, but often it does not.
+Enables swap discards, if the swap device supports that, and performs
+a batch discard operation for the swap device at swapon time.
 The
 .I /etc/fstab
 mount option
diff --git a/sys-utils/swapon.c b/sys-utils/swapon.c
index f1e2433..a71f69e 100644
--- a/sys-utils/swapon.c
+++ b/sys-utils/swapon.c
@@ -34,7 +34,11 @@
 #endif
 
 #ifndef SWAP_FLAG_DISCARD
-# define SWAP_FLAG_DISCARD	0x10000 /* discard swap cluster after use */
+# define SWAP_FLAG_DISCARD	0x10000 /* enable discard for swap */
+#endif
+
+#ifndef SWAP_FLAG_DISCARD_CLUSTER
+# define SWAP_FLAG_DISCARD_CLUSTER 0x20000 /* discard swap cluster after use */
 #endif
 
 #ifndef SWAP_FLAG_PREFER
@@ -70,7 +74,7 @@ enum {
 
 static int all;
 static int priority = -1;	/* non-prioritized swap by default */
-static int discard;
+static int discard = 0;		/* don't send swap discards by default */
 
 /* If true, don't complain if the device/file doesn't exist */
 static int ifexists;
@@ -570,8 +574,11 @@ static int do_swapon(const char *orig_special, int prio,
 			   << SWAP_FLAG_PRIO_SHIFT);
 	}
 #endif
-	if (fl_discard)
+	if (fl_discard) {
 		flags |= SWAP_FLAG_DISCARD;
+		if (fl_discard > 1)
+			flags |= SWAP_FLAG_DISCARD_CLUSTER;
+	}
 
 	status = swapon(special, flags);
 	if (status < 0)
@@ -615,8 +622,14 @@ static int swapon_all(void)
 
 		if (mnt_fs_get_option(fs, "noauto", NULL, NULL) == 0)
 			continue;
-		if (mnt_fs_get_option(fs, "discard", NULL, NULL) == 0)
-			dsc = 1;
+		if (mnt_fs_get_option(fs, "discard", NULL, NULL) == 0) {
+			if !(dsc)
+				dsc = 1;
+		}
+		if (mnt_fs_get_option(fs, "cluster-discard", NULL, NULL) == 0) {
+			if (!dsc || dsc == 1)
+				dsc = 2;
+		}
 		if (mnt_fs_get_option(fs, "nofail", NULL, NULL) == 0)
 			nofail = 1;
 		if (mnt_fs_get_option(fs, "pri", &p, NULL) == 0 && p)
@@ -647,7 +660,8 @@ static void __attribute__ ((__noreturn__)) usage(FILE * out)
 
 	fputs(USAGE_OPTIONS, out);
 	fputs(_(" -a, --all              enable all swaps from /etc/fstab\n"
-		" -d, --discard          discard freed pages before they are reused\n"
+		" -c, --cluster-discard  discard freed pages before they are reused, while swapping\n"
+		" -d, --discard          discard freed pages before they are reused, all at once, at swapon time\n"
 		" -e, --ifexists         silently skip devices that do not exist\n"
 		" -f, --fixpgsz          reinitialize the swap space if necessary\n"
 		" -p, --priority <prio>  specify the priority of the swap device\n"
@@ -696,6 +710,7 @@ int main(int argc, char *argv[])
 
 	static const struct option long_opts[] = {
 		{ "priority", 1, 0, 'p' },
+		{ "cluster-discard",  0, 0, 'c' },
 		{ "discard",  0, 0, 'd' },
 		{ "ifexists", 0, 0, 'e' },
 		{ "summary",  0, 0, 's' },
@@ -719,7 +734,7 @@ int main(int argc, char *argv[])
 	mnt_init_debug(0);
 	mntcache = mnt_new_cache();
 
-	while ((c = getopt_long(argc, argv, "ahdefp:svVL:U:",
+	while ((c = getopt_long(argc, argv, "ahcdefp:svVL:U:",
 				long_opts, NULL)) != -1) {
 		switch (c) {
 		case 'a':		/* all */
@@ -738,8 +753,11 @@ int main(int argc, char *argv[])
 		case 'U':
 			add_uuid(optarg);
 			break;
+		case 'c':
+			discard += 2;
+			break;
 		case 'd':
-			discard = 1;
+			discard += 1;
 			break;
 		case 'e':               /* ifexists */
 		        ifexists = 1;
-- 
1.7.11.7

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 01/02] swap: discard while swapping only if SWAP_FLAG_DISCARD_CLUSTER
  2013-05-21  0:04 ` [RFC PATCH 01/02] swap: discard while swapping only if SWAP_FLAG_DISCARD_CLUSTER Rafael Aquini
@ 2013-05-21  0:55   ` KOSAKI Motohiro
  2013-05-21 21:06     ` Rafael Aquini
  0 siblings, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2013-05-21  0:55 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: linux-kernel, linux-mm, akpm, hughd, shli, kzak, jmoyer, riel,
	lwoodman, mgorman, kosaki.motohiro

(5/20/13 8:04 PM), Rafael Aquini wrote:
> Intruduce a new flag to make page-cluster fine-grained discards while swapping
> conditional, as they can be considered detrimental to some setups. However,
> keep allowing batched discards at sys_swapon() time, when enabled by the
> system administrator. 
> 
> Signed-off-by: Rafael Aquini <aquini@redhat.com>
> ---
>  include/linux/swap.h |  8 +++++---
>  mm/swapfile.c        | 12 ++++++++----
>  2 files changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 1701ce4..ab2e742 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -19,10 +19,11 @@ struct bio;
>  #define SWAP_FLAG_PREFER	0x8000	/* set if swap priority specified */
>  #define SWAP_FLAG_PRIO_MASK	0x7fff
>  #define SWAP_FLAG_PRIO_SHIFT	0
> -#define SWAP_FLAG_DISCARD	0x10000 /* discard swap cluster after use */
> +#define SWAP_FLAG_DISCARD	0x10000 /* enable discard for swap areas */
> +#define SWAP_FLAG_DISCARD_CLUSTER 0x20000 /* discard swap clusters after use */

>From point of backward compatibility view, 0x10000 should be disable both discarding
when mount and when IO.
And, introducing new two flags, enable mount time discard and enable IO time discard.

IOW, Please consider newer kernel and older swapon(8) conbination.
Other than that, looks good to me.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 02/02] swapon: add "cluster-discard" support
  2013-05-21  0:04 ` [RFC PATCH 02/02] swapon: add "cluster-discard" support Rafael Aquini
@ 2013-05-21  1:02   ` KOSAKI Motohiro
  2013-05-21 10:26     ` Karel Zak
  2013-05-21 10:13   ` Karel Zak
  1 sibling, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2013-05-21  1:02 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: linux-kernel, linux-mm, akpm, hughd, shli, kzak, jmoyer, riel,
	lwoodman, mgorman, kosaki.motohiro

> +.B "\-c, \-\-cluster\-discard"
> +Swapping will discard clusters of swap pages in between freeing them
> +and re-writing to them, if the swap device supports that. This option
> +also implies the
> +.I \-d, \-\-discard
> +swapon flag.

I'm not sure this is good idea. Why can't we make these flags orthogonal?


>  /* If true, don't complain if the device/file doesn't exist */
>  static int ifexists;
> @@ -570,8 +574,11 @@ static int do_swapon(const char *orig_special, int prio,
>  			   << SWAP_FLAG_PRIO_SHIFT);
>  	}
>  #endif
> -	if (fl_discard)
> +	if (fl_discard) {
>  		flags |= SWAP_FLAG_DISCARD;
> +		if (fl_discard > 1)
> +			flags |= SWAP_FLAG_DISCARD_CLUSTER;

This is not enough, IMHO. When running this code on old kernel, swapon() return EINVAL.
At that time, we should fall back swapon(0x10000).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 02/02] swapon: add "cluster-discard" support
  2013-05-21  0:04 ` [RFC PATCH 02/02] swapon: add "cluster-discard" support Rafael Aquini
  2013-05-21  1:02   ` KOSAKI Motohiro
@ 2013-05-21 10:13   ` Karel Zak
  1 sibling, 0 replies; 12+ messages in thread
From: Karel Zak @ 2013-05-21 10:13 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: linux-kernel, linux-mm, akpm, hughd, shli, jmoyer, riel, lwoodman,
	mgorman

On Mon, May 20, 2013 at 09:04:25PM -0300, Rafael Aquini wrote:
> -	while ((c = getopt_long(argc, argv, "ahdefp:svVL:U:",
> +	while ((c = getopt_long(argc, argv, "ahcdefp:svVL:U:",
>  				long_opts, NULL)) != -1) {
>  		switch (c) {
>  		case 'a':		/* all */
> @@ -738,8 +753,11 @@ int main(int argc, char *argv[])
>  		case 'U':
>  			add_uuid(optarg);
>  			break;
> +		case 'c':
> +			discard += 2;
> +			break;
>  		case 'd':
> -			discard = 1;
> +			discard += 1;

 this is fragile, it would be better to use

        case 'c':
            discard |= SWAP_FLAG_DISCARD_CLUSTER;
            break;
        case 'd':
            discard |= SWAP_FLAG_DISCARD;
            break;

 and use directly the flags everywhere in the code than use magical
 numbers '1' and '2' etc.

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 02/02] swapon: add "cluster-discard" support
  2013-05-21  1:02   ` KOSAKI Motohiro
@ 2013-05-21 10:26     ` Karel Zak
  2013-05-21 20:17       ` KOSAKI Motohiro
  0 siblings, 1 reply; 12+ messages in thread
From: Karel Zak @ 2013-05-21 10:26 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Rafael Aquini, linux-kernel, linux-mm, akpm, hughd, shli, jmoyer,
	riel, lwoodman, mgorman

On Mon, May 20, 2013 at 09:02:43PM -0400, KOSAKI Motohiro wrote:
> > -	if (fl_discard)
> > +	if (fl_discard) {
> >  		flags |= SWAP_FLAG_DISCARD;
> > +		if (fl_discard > 1)
> > +			flags |= SWAP_FLAG_DISCARD_CLUSTER;
> 
> This is not enough, IMHO. When running this code on old kernel, swapon() return EINVAL.
> At that time, we should fall back swapon(0x10000).

 Hmm.. currently we don't use any fallback for any swap flag (e.g.
 0x10000) for compatibility with old kernels. Maybe it's better to
 keep it simple and stupid and return an error message than introduce
 any super-smart semantic to hide incompatible fstab configuration.

    Karel
 
-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 02/02] swapon: add "cluster-discard" support
  2013-05-21 10:26     ` Karel Zak
@ 2013-05-21 20:17       ` KOSAKI Motohiro
  2013-05-21 21:13         ` Rafael Aquini
  0 siblings, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2013-05-21 20:17 UTC (permalink / raw)
  To: Karel Zak
  Cc: KOSAKI Motohiro, Rafael Aquini, linux-kernel, linux-mm, akpm,
	hughd, shli, jmoyer, riel, lwoodman, mgorman

(5/21/13 6:26 AM), Karel Zak wrote:
> On Mon, May 20, 2013 at 09:02:43PM -0400, KOSAKI Motohiro wrote:
>>> -	if (fl_discard)
>>> +	if (fl_discard) {
>>>  		flags |= SWAP_FLAG_DISCARD;
>>> +		if (fl_discard > 1)
>>> +			flags |= SWAP_FLAG_DISCARD_CLUSTER;
>>
>> This is not enough, IMHO. When running this code on old kernel, swapon() return EINVAL.
>> At that time, we should fall back swapon(0x10000).
> 
>  Hmm.. currently we don't use any fallback for any swap flag (e.g.
>  0x10000) for compatibility with old kernels. Maybe it's better to
>  keep it simple and stupid and return an error message than introduce
>  any super-smart semantic to hide incompatible fstab configuration.

Hm. If so, I'd propose to revert the following change. 

> .B "\-d, \-\-discard"
>-Discard freed swap pages before they are reused, if the swap
>-device supports the discard or trim operation.  This may improve
>-performance on some Solid State Devices, but often it does not.
>+Enables swap discards, if the swap device supports that, and performs
>+a batch discard operation for the swap device at swapon time.


And instead, I suggest to make --discard-on-swapon like the following.
(better name idea is welcome) 

+--discard-on-swapon
+Enables swap discards, if the swap device supports that, and performs
+a batch discard operation for the swap device at swapon time.

I mean, preserving flags semantics removes the reason we need make a fallback.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 01/02] swap: discard while swapping only if SWAP_FLAG_DISCARD_CLUSTER
  2013-05-21  0:55   ` KOSAKI Motohiro
@ 2013-05-21 21:06     ` Rafael Aquini
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael Aquini @ 2013-05-21 21:06 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: linux-kernel, linux-mm, akpm, hughd, shli, kzak, jmoyer, riel,
	lwoodman, mgorman

Howdy Kosaki-san,

Thanks for your time over this one :)

On Mon, May 20, 2013 at 08:55:33PM -0400, KOSAKI Motohiro wrote:
> (5/20/13 8:04 PM), Rafael Aquini wrote:
> > Intruduce a new flag to make page-cluster fine-grained discards while swapping
> > conditional, as they can be considered detrimental to some setups. However,
> > keep allowing batched discards at sys_swapon() time, when enabled by the
> > system administrator. 
> > 
> > Signed-off-by: Rafael Aquini <aquini@redhat.com>
> > ---
> >  include/linux/swap.h |  8 +++++---
> >  mm/swapfile.c        | 12 ++++++++----
> >  2 files changed, 13 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/linux/swap.h b/include/linux/swap.h
> > index 1701ce4..ab2e742 100644
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -19,10 +19,11 @@ struct bio;
> >  #define SWAP_FLAG_PREFER	0x8000	/* set if swap priority specified */
> >  #define SWAP_FLAG_PRIO_MASK	0x7fff
> >  #define SWAP_FLAG_PRIO_SHIFT	0
> > -#define SWAP_FLAG_DISCARD	0x10000 /* discard swap cluster after use */
> > +#define SWAP_FLAG_DISCARD	0x10000 /* enable discard for swap areas */
> > +#define SWAP_FLAG_DISCARD_CLUSTER 0x20000 /* discard swap clusters after use */
> 
> From point of backward compatibility view, 0x10000 should be disable both discarding
> when mount and when IO.

I think you mean 0x10000 should be enable both here, then. That's a nice catch. I'll
try to think a way to accomplish it in a simple fashion.


> And, introducing new two flags, enable mount time discard and enable IO time discard.
> 
> IOW, Please consider newer kernel and older swapon(8) conbination.
> Other than that, looks good to me.
> 
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 02/02] swapon: add "cluster-discard" support
  2013-05-21 20:17       ` KOSAKI Motohiro
@ 2013-05-21 21:13         ` Rafael Aquini
  2013-05-21 22:01           ` KOSAKI Motohiro
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael Aquini @ 2013-05-21 21:13 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Karel Zak, linux-kernel, linux-mm, akpm, hughd, shli, jmoyer,
	riel, lwoodman, mgorman

Karel, Motohiro,

Thanks a lot for your time reviewing this patch and providing me with valuable
feedback.

On Tue, May 21, 2013 at 04:17:04PM -0400, KOSAKI Motohiro wrote:
> (5/21/13 6:26 AM), Karel Zak wrote:
> > On Mon, May 20, 2013 at 09:02:43PM -0400, KOSAKI Motohiro wrote:
> >>> -	if (fl_discard)
> >>> +	if (fl_discard) {
> >>>  		flags |= SWAP_FLAG_DISCARD;
> >>> +		if (fl_discard > 1)
> >>> +			flags |= SWAP_FLAG_DISCARD_CLUSTER;
> >>
> >> This is not enough, IMHO. When running this code on old kernel, swapon() return EINVAL.
> >> At that time, we should fall back swapon(0x10000).
> > 
> >  Hmm.. currently we don't use any fallback for any swap flag (e.g.
> >  0x10000) for compatibility with old kernels. Maybe it's better to
> >  keep it simple and stupid and return an error message than introduce
> >  any super-smart semantic to hide incompatible fstab configuration.
> 
> Hm. If so, I'd propose to revert the following change. 
> 
> > .B "\-d, \-\-discard"
> >-Discard freed swap pages before they are reused, if the swap
> >-device supports the discard or trim operation.  This may improve
> >-performance on some Solid State Devices, but often it does not.
> >+Enables swap discards, if the swap device supports that, and performs
> >+a batch discard operation for the swap device at swapon time.
> 
> 
> And instead, I suggest to make --discard-on-swapon like the following.
> (better name idea is welcome) 
> 
> +--discard-on-swapon
> +Enables swap discards, if the swap device supports that, and performs
> +a batch discard operation for the swap device at swapon time.
> 
> I mean, preserving flags semantics removes the reason we need make a fallback.
> 
>

Instead of reverting and renaming --discard, what about making it accept an
optional argument, so we could use --discard (to enable all thing and keep
backward compatibility); --discard=cluster & --discard=batch (or whatever we
think it should be named). I'll try to sort this approach out if you folks think
it's worthwhile. 

-- Rafael

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 02/02] swapon: add "cluster-discard" support
  2013-05-21 21:13         ` Rafael Aquini
@ 2013-05-21 22:01           ` KOSAKI Motohiro
  2013-05-22 13:52             ` Jeff Moyer
  0 siblings, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2013-05-21 22:01 UTC (permalink / raw)
  To: Rafael Aquini
  Cc: KOSAKI Motohiro, Karel Zak, linux-kernel, linux-mm, akpm, hughd,
	shli, jmoyer, riel, lwoodman, mgorman

> Instead of reverting and renaming --discard, what about making it accept an
> optional argument, so we could use --discard (to enable all thing and keep
> backward compatibility); --discard=cluster & --discard=batch (or whatever we
> think it should be named). I'll try to sort this approach out if you folks think
> it's worthwhile. 

Optional argument looks nice, at least to me. 

But hmm.. 

"cluster" and "batch" describes current kernel implementation, not user visible effect. 
Usually I suggest to pick up a word from man pages because it describe user visible action.
e.g. --discard=freed-pages or --discard=io or --discard=swapon or --discard=once, etc..

But this is not strong opinion. You can ignore it. I don't think I have good English sense. :-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 02/02] swapon: add "cluster-discard" support
  2013-05-21 22:01           ` KOSAKI Motohiro
@ 2013-05-22 13:52             ` Jeff Moyer
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff Moyer @ 2013-05-22 13:52 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Rafael Aquini, Karel Zak, linux-kernel, linux-mm, akpm, hughd,
	shli, riel, lwoodman, mgorman

KOSAKI Motohiro <kosaki.motohiro@gmail.com> writes:

>> Instead of reverting and renaming --discard, what about making it accept an
>> optional argument, so we could use --discard (to enable all thing and keep
>> backward compatibility); --discard=cluster & --discard=batch (or whatever we
>> think it should be named). I'll try to sort this approach out if you folks think
>> it's worthwhile. 
>
> Optional argument looks nice, at least to me. 
>
> But hmm.. 
>
> "cluster" and "batch" describes current kernel implementation, not user visible effect. 
> Usually I suggest to pick up a word from man pages because it describe user visible action.
> e.g. --discard=freed-pages or --discard=io or --discard=swapon or --discard=once, etc..
>
> But this is not strong opinion. You can ignore it. I don't think I have good English sense. :-)

I like discard=swapon.  For the fine-grained discards, I don't have a
strong opinion, but I guess I'd lean towards freed-pages.

Cheers,
Jeff

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-05-22 13:52 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-21  0:04 [RFC PATCH 00/02] swap: allowing a more flexible DISCARD policy Rafael Aquini
2013-05-21  0:04 ` [RFC PATCH 01/02] swap: discard while swapping only if SWAP_FLAG_DISCARD_CLUSTER Rafael Aquini
2013-05-21  0:55   ` KOSAKI Motohiro
2013-05-21 21:06     ` Rafael Aquini
2013-05-21  0:04 ` [RFC PATCH 02/02] swapon: add "cluster-discard" support Rafael Aquini
2013-05-21  1:02   ` KOSAKI Motohiro
2013-05-21 10:26     ` Karel Zak
2013-05-21 20:17       ` KOSAKI Motohiro
2013-05-21 21:13         ` Rafael Aquini
2013-05-21 22:01           ` KOSAKI Motohiro
2013-05-22 13:52             ` Jeff Moyer
2013-05-21 10:13   ` Karel Zak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).