From: Nitin Gupta <ngupta@vflare.org>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>,
Greg KH <greg@kroah.com>,
Andrew Morton <akpm@linux-foundation.org>,
Ed Tomlinson <edt@aei.ca>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
linux-mm-cc <linux-mm-cc@laptop.org>,
kamezawa.hiroyu@jp.fujitsu.com, nishimura@mxp.nes.nec.co.jp
Subject: Re: [PATCH 2/4] send callback when swap slot is freed
Date: Fri, 18 Sep 2009 20:34:11 +0530 [thread overview]
Message-ID: <4AB3A16B.90009@vflare.org> (raw)
In-Reply-To: <1253266391.4959.15.camel@penberg-laptop>
Hi Pekka,
On 09/18/2009 03:03 PM, Pekka Enberg wrote:
>
> On Fri, 2009-09-18 at 08:59 +0100, Hugh Dickins wrote:
>> On Fri, 18 Sep 2009, Pekka Enberg wrote:
>>>
>>> The *hook* looks OK to me but set_swap_free_notify() looks like an ugly
>>> hack. I don't understand why we're setting up the hook lazily in
>>> ramzswap_read() nor do I understand why we need to look up struct
>>> swap_info_struct with a bdev. Surely there's a cleaner way to do all
>>> this? Probably somewhere in sys_swapon()?
>>
>> Sounds like you have something in mind, may well be better,
>> but please show us a patch... (my mind is elsewhere!)
>
> Hey, you tricked me into a world of pain! Here's a totally untested
> patch that adds a "swapon" funciton to struct block_device_operations
> that the ramzswap driver can implement to setup ->swap_free_notify_fn.
>
It is understood that this swap notify callback is bit of a hack. I think
we will not gain much trying to beautify this hack. However, I agree with
Hugh's suggestion to rename this notify callback related function/variables
to make it explicit that its completely ramzswap related. I will send a path
that affects these renames as reply to patch 0/4.
A solution that is generic while satisfying ramzswap demands may be a bit more
involved. It will be easier to incrementally develop these additions.
Thanks,
Nitin
>
>> From 41827c57196f7eae66701a6ae6565c226c0b7674 Mon Sep 17 00:00:00 2001
> From: Nitin Gupta<ngupta@vflare.org>
> Date: Fri, 18 Sep 2009 04:13:30 +0530
> Subject: [PATCH] mm: swap slot free notifier
>
> Currently, we have "swap discard" mechanism which sends a discard bio request
> when we find a free cluster during scan_swap_map(). This callback can come a
> long time after swap slots are actually freed.
>
> This delay in callback is a great problem when (compressed) RAM [1] is used
> as a swap device. So, this change adds a callback which is called as
> soon as a swap slot becomes free. For above mentioned case of swapping
> over compressed RAM device, this is very useful since we can immediately
> free memory allocated for this swap page.
>
> This callback does not replace swap discard support. It is called with
> swap_lock held, so it is meant to trigger action that finishes quickly.
> However, swap discard is an I/O request and can be used for taking longer
> actions.
>
> It is preferred to use this callback for ramzswap case even if discard
> mechanism could be improved such that it can be called as often as required.
> This is because, allocation of 'bio'(s) is undesirable since ramzswap always
> operates under low memory conditions (its a swap device). Also, batching of
> discard bio requests is not optimal since stale data can accumulate very
> quickly in ramzswap devices, pushing system further into low memory state.
>
> Signed-off-by: Nitin Gupta<ngupta@vflare.org>
> Signed-off-by: Pekka Enberg<penberg@cs.helsinki.fi>
> ---
> include/linux/blkdev.h | 2 ++
> include/linux/swap.h | 16 ++++++++++++++++
> mm/swapfile.c | 5 +++++
> 3 files changed, 23 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index e23a86c..688e024 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -27,6 +27,7 @@ struct scsi_ioctl_command;
> struct request_queue;
> struct elevator_queue;
> struct request_pm_state;
> +struct swap_info_struct;
> struct blk_trace;
> struct request;
> struct sg_io_hdr;
> @@ -1239,6 +1240,7 @@ struct block_device_operations {
> unsigned long long);
> int (*revalidate_disk) (struct gendisk *);
> int (*getgeo)(struct block_device *, struct hd_geometry *);
> + int (*swapon)(struct block_device *, struct swap_info_struct *);
> struct module *owner;
> };
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 7c15334..6603009 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -8,6 +8,7 @@
> #include<linux/memcontrol.h>
> #include<linux/sched.h>
> #include<linux/node.h>
> +#include<linux/blkdev.h>
>
> #include<asm/atomic.h>
> #include<asm/page.h>
> @@ -20,6 +21,8 @@ struct bio;
> #define SWAP_FLAG_PRIO_MASK 0x7fff
> #define SWAP_FLAG_PRIO_SHIFT 0
>
> +typedef void (swap_free_notify_fn) (struct block_device *, unsigned long);
> +
> static inline int current_is_kswapd(void)
> {
> return current->flags& PF_KSWAPD;
> @@ -155,6 +158,7 @@ struct swap_info_struct {
> unsigned int max;
> unsigned int inuse_pages;
> unsigned int old_block_size;
> + swap_free_notify_fn *swap_free_notify_fn;
> };
>
> struct swap_list_t {
> @@ -297,6 +301,18 @@ extern int reuse_swap_page(struct page *);
> extern int try_to_free_swap(struct page *);
> struct backing_dev_info;
>
> +static inline int
> +blkdev_swapon(struct block_device *bdev, struct swap_info_struct *sis)
> +{
> + struct gendisk *disk = bdev->bd_disk;
> + int err = 0;
> +
> + if (disk->fops->swapon)
> + err = disk->fops->swapon(bdev, sis);
> +
> + return err;
> +}
> +
> /* linux/mm/thrash.c */
> extern struct mm_struct *swap_token_mm;
> extern void grab_swap_token(struct mm_struct *);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 74f1102..9d10f02 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -585,6 +585,8 @@ static int swap_entry_free(struct swap_info_struct *p,
> swap_list.next = p - swap_info;
> nr_swap_pages++;
> p->inuse_pages--;
> + if (p->swap_free_notify_fn)
> + p->swap_free_notify_fn(p->bdev, offset);
> }
> if (!swap_count(count))
> mem_cgroup_uncharge_swap(ent);
> @@ -1845,6 +1847,9 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
> if (error< 0)
> goto bad_swap;
> p->bdev = bdev;
> + error = blkdev_swapon(bdev, p);
> + if (error< 0)
> + goto bad_swap;
> } else if (S_ISREG(inode->i_mode)) {
> p->bdev = inode->i_sb->s_bdev;
> mutex_lock(&inode->i_mutex);
WARNING: multiple messages have this Message-ID (diff)
From: Nitin Gupta <ngupta@vflare.org>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>,
Greg KH <greg@kroah.com>,
Andrew Morton <akpm@linux-foundation.org>,
Ed Tomlinson <edt@aei.ca>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
linux-mm-cc <linux-mm-cc@laptop.org>,
kamezawa.hiroyu@jp.fujitsu.com, nishimura@mxp.nes.nec.co.jp
Subject: Re: [PATCH 2/4] send callback when swap slot is freed
Date: Fri, 18 Sep 2009 20:34:11 +0530 [thread overview]
Message-ID: <4AB3A16B.90009@vflare.org> (raw)
In-Reply-To: <1253266391.4959.15.camel@penberg-laptop>
Hi Pekka,
On 09/18/2009 03:03 PM, Pekka Enberg wrote:
>
> On Fri, 2009-09-18 at 08:59 +0100, Hugh Dickins wrote:
>> On Fri, 18 Sep 2009, Pekka Enberg wrote:
>>>
>>> The *hook* looks OK to me but set_swap_free_notify() looks like an ugly
>>> hack. I don't understand why we're setting up the hook lazily in
>>> ramzswap_read() nor do I understand why we need to look up struct
>>> swap_info_struct with a bdev. Surely there's a cleaner way to do all
>>> this? Probably somewhere in sys_swapon()?
>>
>> Sounds like you have something in mind, may well be better,
>> but please show us a patch... (my mind is elsewhere!)
>
> Hey, you tricked me into a world of pain! Here's a totally untested
> patch that adds a "swapon" funciton to struct block_device_operations
> that the ramzswap driver can implement to setup ->swap_free_notify_fn.
>
It is understood that this swap notify callback is bit of a hack. I think
we will not gain much trying to beautify this hack. However, I agree with
Hugh's suggestion to rename this notify callback related function/variables
to make it explicit that its completely ramzswap related. I will send a path
that affects these renames as reply to patch 0/4.
A solution that is generic while satisfying ramzswap demands may be a bit more
involved. It will be easier to incrementally develop these additions.
Thanks,
Nitin
>
>> From 41827c57196f7eae66701a6ae6565c226c0b7674 Mon Sep 17 00:00:00 2001
> From: Nitin Gupta<ngupta@vflare.org>
> Date: Fri, 18 Sep 2009 04:13:30 +0530
> Subject: [PATCH] mm: swap slot free notifier
>
> Currently, we have "swap discard" mechanism which sends a discard bio request
> when we find a free cluster during scan_swap_map(). This callback can come a
> long time after swap slots are actually freed.
>
> This delay in callback is a great problem when (compressed) RAM [1] is used
> as a swap device. So, this change adds a callback which is called as
> soon as a swap slot becomes free. For above mentioned case of swapping
> over compressed RAM device, this is very useful since we can immediately
> free memory allocated for this swap page.
>
> This callback does not replace swap discard support. It is called with
> swap_lock held, so it is meant to trigger action that finishes quickly.
> However, swap discard is an I/O request and can be used for taking longer
> actions.
>
> It is preferred to use this callback for ramzswap case even if discard
> mechanism could be improved such that it can be called as often as required.
> This is because, allocation of 'bio'(s) is undesirable since ramzswap always
> operates under low memory conditions (its a swap device). Also, batching of
> discard bio requests is not optimal since stale data can accumulate very
> quickly in ramzswap devices, pushing system further into low memory state.
>
> Signed-off-by: Nitin Gupta<ngupta@vflare.org>
> Signed-off-by: Pekka Enberg<penberg@cs.helsinki.fi>
> ---
> include/linux/blkdev.h | 2 ++
> include/linux/swap.h | 16 ++++++++++++++++
> mm/swapfile.c | 5 +++++
> 3 files changed, 23 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index e23a86c..688e024 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -27,6 +27,7 @@ struct scsi_ioctl_command;
> struct request_queue;
> struct elevator_queue;
> struct request_pm_state;
> +struct swap_info_struct;
> struct blk_trace;
> struct request;
> struct sg_io_hdr;
> @@ -1239,6 +1240,7 @@ struct block_device_operations {
> unsigned long long);
> int (*revalidate_disk) (struct gendisk *);
> int (*getgeo)(struct block_device *, struct hd_geometry *);
> + int (*swapon)(struct block_device *, struct swap_info_struct *);
> struct module *owner;
> };
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 7c15334..6603009 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -8,6 +8,7 @@
> #include<linux/memcontrol.h>
> #include<linux/sched.h>
> #include<linux/node.h>
> +#include<linux/blkdev.h>
>
> #include<asm/atomic.h>
> #include<asm/page.h>
> @@ -20,6 +21,8 @@ struct bio;
> #define SWAP_FLAG_PRIO_MASK 0x7fff
> #define SWAP_FLAG_PRIO_SHIFT 0
>
> +typedef void (swap_free_notify_fn) (struct block_device *, unsigned long);
> +
> static inline int current_is_kswapd(void)
> {
> return current->flags& PF_KSWAPD;
> @@ -155,6 +158,7 @@ struct swap_info_struct {
> unsigned int max;
> unsigned int inuse_pages;
> unsigned int old_block_size;
> + swap_free_notify_fn *swap_free_notify_fn;
> };
>
> struct swap_list_t {
> @@ -297,6 +301,18 @@ extern int reuse_swap_page(struct page *);
> extern int try_to_free_swap(struct page *);
> struct backing_dev_info;
>
> +static inline int
> +blkdev_swapon(struct block_device *bdev, struct swap_info_struct *sis)
> +{
> + struct gendisk *disk = bdev->bd_disk;
> + int err = 0;
> +
> + if (disk->fops->swapon)
> + err = disk->fops->swapon(bdev, sis);
> +
> + return err;
> +}
> +
> /* linux/mm/thrash.c */
> extern struct mm_struct *swap_token_mm;
> extern void grab_swap_token(struct mm_struct *);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 74f1102..9d10f02 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -585,6 +585,8 @@ static int swap_entry_free(struct swap_info_struct *p,
> swap_list.next = p - swap_info;
> nr_swap_pages++;
> p->inuse_pages--;
> + if (p->swap_free_notify_fn)
> + p->swap_free_notify_fn(p->bdev, offset);
> }
> if (!swap_count(count))
> mem_cgroup_uncharge_swap(ent);
> @@ -1845,6 +1847,9 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
> if (error< 0)
> goto bad_swap;
> p->bdev = bdev;
> + error = blkdev_swapon(bdev, p);
> + if (error< 0)
> + goto bad_swap;
> } else if (S_ISREG(inode->i_mode)) {
> p->bdev = inode->i_sb->s_bdev;
> mutex_lock(&inode->i_mutex);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-09-18 15:05 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-09-17 22:43 [PATCH 0/4] compcache: in-memory compressed swapping v3 Nitin Gupta
2009-09-17 22:43 ` Nitin Gupta
2009-09-17 22:43 ` [PATCH 1/4] xvmalloc memory allocator Nitin Gupta
2009-09-17 22:43 ` Nitin Gupta
2009-09-18 21:05 ` Marcin Slusarz
2009-09-18 21:05 ` Marcin Slusarz
2009-09-22 3:50 ` Nitin Gupta
2009-09-22 3:50 ` Nitin Gupta
2009-09-17 22:43 ` [PATCH 2/4] send callback when swap slot is freed Nitin Gupta
2009-09-17 22:43 ` Nitin Gupta
2009-09-18 6:53 ` Pekka Enberg
2009-09-18 6:53 ` Pekka Enberg
2009-09-18 7:17 ` Hugh Dickins
2009-09-18 7:17 ` Hugh Dickins
2009-09-18 7:55 ` Pekka Enberg
2009-09-18 7:55 ` Pekka Enberg
2009-09-18 7:59 ` Hugh Dickins
2009-09-18 7:59 ` Hugh Dickins
2009-09-18 9:33 ` Pekka Enberg
2009-09-18 9:33 ` Pekka Enberg
2009-09-18 15:04 ` Nitin Gupta [this message]
2009-09-18 15:04 ` Nitin Gupta
2009-09-19 7:27 ` Pekka Enberg
2009-09-19 7:27 ` Pekka Enberg
2009-09-20 15:02 ` Nitin Gupta
2009-09-20 15:02 ` Nitin Gupta
2009-09-21 11:17 ` Hugh Dickins
2009-09-21 11:17 ` Hugh Dickins
2009-09-21 11:07 ` Hugh Dickins
2009-09-21 11:07 ` Hugh Dickins
2009-09-21 11:12 ` Pekka Enberg
2009-09-21 11:12 ` Pekka Enberg
2009-09-21 11:55 ` Hugh Dickins
2009-09-21 11:55 ` Hugh Dickins
2009-09-21 12:01 ` Pekka Enberg
2009-09-21 12:01 ` Pekka Enberg
2009-09-22 3:04 ` Nitin Gupta
2009-09-22 3:04 ` Nitin Gupta
2009-09-21 12:08 ` Pekka Enberg
2009-09-21 12:08 ` Pekka Enberg
2009-09-21 12:29 ` Nitin Gupta
2009-09-21 12:29 ` Nitin Gupta
2009-09-18 9:59 ` Nitin Gupta
2009-09-18 9:59 ` Nitin Gupta
2009-09-19 5:47 ` Nitin Gupta
2009-09-19 5:47 ` Nitin Gupta
2009-09-24 1:39 ` KAMEZAWA Hiroyuki
2009-09-24 1:39 ` KAMEZAWA Hiroyuki
2009-09-17 22:43 ` [PATCH 3/4] virtual block device driver (ramzswap) Nitin Gupta
2009-09-17 22:43 ` Nitin Gupta
2009-09-18 20:48 ` Marcin Slusarz
2009-09-18 20:48 ` Marcin Slusarz
2009-09-17 22:43 ` [PATCH 4/4] documentation Nitin Gupta
2009-09-17 22:43 ` Nitin Gupta
2009-09-18 16:43 ` [PATCH] ramzswap prefix for swap free callback Nitin Gupta
2009-09-18 16:43 ` Nitin Gupta
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4AB3A16B.90009@vflare.org \
--to=ngupta@vflare.org \
--cc=akpm@linux-foundation.org \
--cc=edt@aei.ca \
--cc=greg@kroah.com \
--cc=hugh.dickins@tiscali.co.uk \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm-cc@laptop.org \
--cc=linux-mm@kvack.org \
--cc=nishimura@mxp.nes.nec.co.jp \
--cc=penberg@cs.helsinki.fi \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.