* [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory
@ 2010-03-05 10:22 Nitin Gupta
2010-03-09 19:07 ` Nitin Gupta
2010-03-11 7:22 ` Hugh Dickins
0 siblings, 2 replies; 13+ messages in thread
From: Nitin Gupta @ 2010-03-05 10:22 UTC (permalink / raw)
To: Hugh Dickins
Cc: Linus Torvalds, Andrew Morton, Greg KH, Pekka Enberg, andi,
linux-kernel
ramzswap driver creates RAM based block devices which can be
used (only) as swap disks. Pages swapped to these disks are
compressed and stored in memory itself.
However, these devices do not get any notification when a swap
slot is freed (swap_map[i] reaches 0). So, we cannot free memory
allocated corresponding to this swap slot. Such stale data can
quickly accumulate in (compressed) memory defeating the whole
purpose of such devices.
To overcome this problem, we now add a callback in 'struct swap_info_struct'
which is called as soon as a swap slot is freed.
Adding handler for this callback:
swapon notifier --> set_swap_free_notify(swap_type, fn)
Removing handler:
swapoff notifier --> set_swap_free_notify(swap_type, NULL)
Alternative approaches:
1) Add callback directly in 'struct block_device_operations' but
that is considered too hacky.
2) Use swap discard mechanism: It involves unnecessary overhead of
allocating 'discard bio' requests and it's too slow to serve ramzswap
needs.
drivers/staging/ramzswap/ramzswap_drv.c | 91 +++++++++++++++++++++++++++++
drivers/staging/ramzswap/ramzswap_drv.h | 1 +
drivers/staging/ramzswap/ramzswap_ioctl.h | 1 +
include/linux/swap.h | 16 +++++-
mm/swapfile.c | 78 ++++++++++++++++++++++++
5 files changed, 185 insertions(+), 2 deletions(-)
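The callback flow described above can be sketched in userspace C. This is only an illustrative model, not code from the series: the toy_* names, the fixed slot count and the opaque 'void *dev' argument are assumptions made for the example. It shows a per-swap-area callback that is installed at swapon time, removed by passing NULL, and invoked when a slot's reference count drops to 0.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 8

/* Hypothetical callback type: called with the index of the freed slot. */
typedef void (*slot_free_fn)(void *dev, unsigned long offset);

/* Toy stand-in for 'struct swap_info_struct'. */
struct toy_swap_info {
    unsigned char swap_map[NR_SLOTS]; /* per-slot reference counts */
    void *dev;                        /* backing device, opaque here */
    slot_free_fn slot_free_notify;    /* NULL when no handler is set */
};

/* Models set_swap_free_notify(): install or remove (fn == NULL) a handler. */
static void toy_set_swap_free_notify(struct toy_swap_info *si, slot_free_fn fn)
{
    si->slot_free_notify = fn;
}

/* Drop one reference; notify the device once the count reaches 0. */
static unsigned char toy_swap_entry_free(struct toy_swap_info *si,
                                         unsigned long offset)
{
    assert(offset < NR_SLOTS);
    assert(si->swap_map[offset] > 0);

    if (--si->swap_map[offset] == 0 && si->slot_free_notify)
        si->slot_free_notify(si->dev, offset);
    return si->swap_map[offset];
}

/* Example handler: a device that counts how many slots it has released. */
struct toy_dev {
    unsigned long freed;
    unsigned long last_offset;
};

static void toy_dev_slot_free(void *dev, unsigned long offset)
{
    struct toy_dev *d = dev;

    d->freed++;
    d->last_offset = offset;
}
```

In this model a device that never installs a handler sees no callbacks at all, which is the situation that lets stale data pile up.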
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory
2010-03-05 10:22 Nitin Gupta
@ 2010-03-09 19:07 ` Nitin Gupta
2010-03-11 7:22 ` Hugh Dickins
1 sibling, 0 replies; 13+ messages in thread
From: Nitin Gupta @ 2010-03-09 19:07 UTC (permalink / raw)
To: Hugh Dickins
Cc: Linus Torvalds, Andrew Morton, Greg KH, Pekka Enberg, andi,
linux-kernel
On 03/05/2010 03:52 PM, Nitin Gupta wrote:
> ramzswap driver creates RAM based block devices which can be
> used (only) as swap disks. Pages swapped to these disks are
> compressed and stored in memory itself.
>
> However, these devices do not get any notification when a swap
> slot is freed (swap_map[i] reaches 0). So, we cannot free memory
> allocated corresponding to this swap slot. Such stale data can
> quickly accumulate in (compressed) memory defeating the whole
> purpose of such devices.
>
> To overcome this problem, we now add a callback in 'struct swap_info_struct'
> which is called as soon as a swap slot is freed.
>
> Adding handler for this callback:
> swapon notifier --> set_swap_free_notify(swap_type, fn)
>
> Removing handler:
> swapoff notifier --> set_swap_free_notify(swap_type, NULL)
>
>
> Alternative approaches:
> 1) Add callback directly in 'struct block_device_operations' but
> that is considered too hacky.
> 2) Use swap discard mechanism: It involves unnecessary overhead of
> allocating 'discard bio' requests and it's too slow to serve ramzswap
> needs.
>
> drivers/staging/ramzswap/ramzswap_drv.c | 91 +++++++++++++++++++++++++++++
> drivers/staging/ramzswap/ramzswap_drv.h | 1 +
> drivers/staging/ramzswap/ramzswap_ioctl.h | 1 +
> include/linux/swap.h | 16 +++++-
> mm/swapfile.c | 78 ++++++++++++++++++++++++
> 5 files changed, 185 insertions(+), 2 deletions(-)
>
Any chance it can go into 2.6.34? Any comments?
Thanks,
Nitin
* Re: [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory
2010-03-05 10:22 Nitin Gupta
2010-03-09 19:07 ` Nitin Gupta
@ 2010-03-11 7:22 ` Hugh Dickins
2010-03-11 11:36 ` Nitin Gupta
1 sibling, 1 reply; 13+ messages in thread
From: Hugh Dickins @ 2010-03-11 7:22 UTC (permalink / raw)
To: Nitin Gupta
Cc: Linus Torvalds, Andrew Morton, Greg KH, Pekka Enberg, andi,
linux-kernel
On Fri, 5 Mar 2010, Nitin Gupta wrote:
> ramzswap driver creates RAM based block devices which can be
> used (only) as swap disks. Pages swapped to these disks are
> compressed and stored in memory itself.
>
> However, these devices do not get any notification when a swap
> slot is freed (swap_map[i] reaches 0). So, we cannot free memory
> allocated corresponding to this swap slot. Such stale data can
> quickly accumulate in (compressed) memory defeating the whole
> purpose of such devices.
>
> To overcome this problem, we now add a callback in 'struct swap_info_struct'
> which is called as soon as a swap slot is freed.
>
> Adding handler for this callback:
> swapon notifier --> set_swap_free_notify(swap_type, fn)
>
> Removing handler:
> swapoff notifier --> set_swap_free_notify(swap_type, NULL)
>
>
> Alternative approaches:
> 1) Add callback directly in 'struct block_device_operations' but
> that is considered too hacky.
> 2) Use swap discard mechanism: It involves unnecessary overhead of
> allocating 'discard bio' requests and it's too slow to serve ramzswap
> needs.
>
> drivers/staging/ramzswap/ramzswap_drv.c | 91 +++++++++++++++++++++++++++++
> drivers/staging/ramzswap/ramzswap_drv.h | 1 +
> drivers/staging/ramzswap/ramzswap_ioctl.h | 1 +
> include/linux/swap.h | 16 +++++-
> mm/swapfile.c | 78 ++++++++++++++++++++++++
> 5 files changed, 185 insertions(+), 2 deletions(-)
This is one of the various solutions I disliked already, isn't it?
To me this is just a more convoluted and obscure variant of the
block_device_operations swap_slot_free_notify patch you had in mmotm,
which Linus has rejected.
I admit, I did not understand at all what Linus was proposing with
readpage, writepage and address_space_operations: I kept quiet in
the hope that you'd understand where I didn't!
Hugh
* Re: [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory
2010-03-11 7:22 ` Hugh Dickins
@ 2010-03-11 11:36 ` Nitin Gupta
0 siblings, 0 replies; 13+ messages in thread
From: Nitin Gupta @ 2010-03-11 11:36 UTC (permalink / raw)
To: Hugh Dickins
Cc: Linus Torvalds, Andrew Morton, Greg KH, Pekka Enberg, andi,
linux-kernel
On Thu, Mar 11, 2010 at 12:52 PM, Hugh Dickins
<hugh.dickins@tiscali.co.uk> wrote:
> On Fri, 5 Mar 2010, Nitin Gupta wrote:
<snip>
>
>> Adding handler for this callback:
>> swapon notifier --> set_swap_free_notify(swap_type, fn)
>>
>> Removing handler:
>> swapoff notifier --> set_swap_free_notify(swap_type, NULL)
>>
>>
>> Alternative approaches:
>> 1) Add callback directly in 'struct block_device_operations' but
>> that is considered too hacky.
>> 2) Use swap discard mechanism: It involves unnecessary overhead of
>> allocating 'discard bio' requests and it's too slow to serve ramzswap
>> needs.
>>
>> drivers/staging/ramzswap/ramzswap_drv.c | 91 +++++++++++++++++++++++++++++
>> drivers/staging/ramzswap/ramzswap_drv.h | 1 +
>> drivers/staging/ramzswap/ramzswap_ioctl.h | 1 +
>> include/linux/swap.h | 16 +++++-
>> mm/swapfile.c | 78 ++++++++++++++++++++++++
>> 5 files changed, 185 insertions(+), 2 deletions(-)
>
> This is one of the various solutions I disliked already, isn't it?
>
This is a kind of hybrid of the previous approaches. We now have notifiers
for swapon/swapoff events only, which Linus seems to be okay with (earlier
we had a notifier for the slot free event also). The handler for swapon is
responsible for setting the callback for the slot free event (earlier we were
installing it from "ramzswap_make_request" directly, when the first I/O was
received).
> To me this is just a more convoluted and obscure variant of the
> block_device_operations swap_slot_free_notify patch you had in mmotm,
> which Linus has rejected.
>
This slot free notification is a sort of black area for which I have
really run out of ideas. I tried all the approaches I could think of
(and all you and Pekka suggested) but nothing seems to satisfy everyone.
This is so essential for ramzswap but now I don't know how to get this
done.
> I admit, I did not understand at all what Linus was proposing with
> readpage, writepage and address_space_operations: I kept quiet in
> the hope that you'd understand where I didn't!
>
I guess he is suggesting creating a fictional 'address_space' just like
we have 'struct address_space swapper_space'. This way, we can
intercept writepage naturally in pageout() with new mapping->a_ops->writepage().
And then similarly with mapping->a_ops->readpage().
I'm now looking into this approach. This might allow compressing page cache
pages too. However, figuring out all the details can take a significant
amount of time, and allowing ramzswap development to happen in parallel
should be okay.
Thanks,
Nitin
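One possible reading of the address_space idea discussed above can be modeled in userspace C. This is a guess at the shape of the approach, not kernel code: the toy_* types, the tiny page size and the plain memcpy standing in for compression are all assumptions. The point is only that a mapping with its own writepage/readpage hooks sees every page going out and coming back in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 16
#define TOY_NR_PAGES 4

struct toy_page {
    unsigned long index;
    unsigned char data[TOY_PAGE_SIZE];
};

struct toy_mapping;

/* Hypothetical analogue of address_space_operations. */
struct toy_a_ops {
    int (*writepage)(struct toy_mapping *m, const struct toy_page *page);
    int (*readpage)(struct toy_mapping *m, struct toy_page *page);
};

struct toy_mapping {
    const struct toy_a_ops *a_ops;
    unsigned char *store[TOY_NR_PAGES]; /* in-memory copies, one per index */
};

/* Keep a copy of the page in memory (a real driver would compress it). */
static int toy_mem_writepage(struct toy_mapping *m, const struct toy_page *page)
{
    if (page->index >= TOY_NR_PAGES)
        return -1;
    if (!m->store[page->index]) {
        m->store[page->index] = malloc(TOY_PAGE_SIZE);
        if (!m->store[page->index])
            return -1;
    }
    memcpy(m->store[page->index], page->data, TOY_PAGE_SIZE);
    return 0;
}

/* Restore a previously written page; fails if nothing was stored. */
static int toy_mem_readpage(struct toy_mapping *m, struct toy_page *page)
{
    if (page->index >= TOY_NR_PAGES || !m->store[page->index])
        return -1;
    memcpy(page->data, m->store[page->index], TOY_PAGE_SIZE);
    return 0;
}

static const struct toy_a_ops toy_mem_aops = {
    .writepage = toy_mem_writepage,
    .readpage = toy_mem_readpage,
};

/* Caller side, loosely analogous to pageout() using mapping->a_ops. */
static int toy_pageout(struct toy_mapping *m, const struct toy_page *page)
{
    assert(m->a_ops && m->a_ops->writepage);
    return m->a_ops->writepage(m, page);
}

static int toy_pagein(struct toy_mapping *m, struct toy_page *page)
{
    assert(m->a_ops && m->a_ops->readpage);
    return m->a_ops->readpage(m, page);
}
```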
* [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory
@ 2010-05-05 13:45 Nitin Gupta
2010-05-05 13:45 ` [PATCH 1/3] Add notifiers for swapon and swapoff events Nitin Gupta
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Nitin Gupta @ 2010-05-05 13:45 UTC (permalink / raw)
To: Greg KH
Cc: Minchan Kim, Pekka Enberg, Hugh Dickins, Cyp, Linus Torvalds,
driverdev, linux-kernel
(tested on mainline but should apply to linux-next cleanly)
ramzswap driver creates RAM based block devices which can be
used (only) as swap disks. Pages swapped to these disks are
compressed and stored in memory itself.
However, these devices do not get any notification when a swap
slot is freed (swap_map[i] reaches 0). So, we cannot free memory
allocated corresponding to this swap slot. Such stale data can
quickly accumulate in (compressed) memory defeating the whole
purpose of such devices.
To overcome this problem, we now add a callback in 'struct swap_info_struct'
which is called as soon as a swap slot is freed.
Adding handler for this callback:
swapon notifier --> set_swap_free_notify(swap_type, fn)
Removing handler:
swapoff notifier --> set_swap_free_notify(swap_type, NULL)
Alternative approaches:
1) Add callback directly in 'struct block_device_operations' but
that is considered too hacky (nacked by Linus).
2) Use swap discard mechanism: It involves unnecessary overhead of
allocating 'discard bio' requests and it's too slow to serve ramzswap
needs.
I also tried an approach similar to what Linus suggested:
Create an address_space like swapper_space, so we can catch
read/writes with a_ops->readpage() etc. However, this approach
turned out to be quite difficult and I could not get it to work.
Nitin Gupta (3):
Add notifiers for swapon and swapoff events
Send callback when a swap slot is freed
ramzswap: Register for swap event notifiers and callback
drivers/staging/ramzswap/TODO | 5 --
drivers/staging/ramzswap/ramzswap_drv.c | 79 ++++++++++++++++++++++++++++++-
include/linux/swap.h | 15 ++++++
mm/swapfile.c | 70 +++++++++++++++++++++++++++
4 files changed, 163 insertions(+), 6 deletions(-)
delete mode 100644 drivers/staging/ramzswap/TODO
* [PATCH 1/3] Add notifiers for swapon and swapoff events
2010-05-05 13:45 [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory Nitin Gupta
@ 2010-05-05 13:45 ` Nitin Gupta
2010-05-05 13:45 ` [PATCH 2/3] Send callback when a swap slot is freed Nitin Gupta
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: Nitin Gupta @ 2010-05-05 13:45 UTC (permalink / raw)
To: Greg KH
Cc: Minchan Kim, Pekka Enberg, Hugh Dickins, Cyp, Linus Torvalds,
driverdev, linux-kernel
This is required for ramzswap module which implements RAM based block
devices to be used as swap disks. These devices require a notification
on these events to function properly.
Currently, I'm not sure if any of these event notifiers have any other
users. However, adding ramzswap specific hooks instead of this generic
approach resulted in bad/hacky code.
Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
include/linux/swap.h | 9 ++++++++
mm/swapfile.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 60 insertions(+), 0 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1f59d93..f3c7378 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -150,6 +150,11 @@ enum {
SWP_SCANNING = (1 << 8), /* refcount in scan_swap_map */
};
+enum swap_event {
+ SWAP_EVENT_SWAPON,
+ SWAP_EVENT_SWAPOFF,
+};
+
#define SWAP_CLUSTER_MAX 32
#define SWAP_MAP_MAX 0x3e /* Max duplication count, in first swap_map */
@@ -329,6 +334,10 @@ extern sector_t map_swap_page(struct page *, struct block_device **);
extern sector_t swapdev_block(int, pgoff_t);
extern int reuse_swap_page(struct page *);
extern int try_to_free_swap(struct page *);
+extern int register_swap_event_notifier(struct notifier_block *nb,
+ enum swap_event event);
+extern int unregister_swap_event_notifier(struct notifier_block *nb,
+ enum swap_event event);
struct backing_dev_info;
/* linux/mm/thrash.c */
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6cd0a8f..aa474f3 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -57,6 +57,9 @@ static struct swap_list_t swap_list = {-1, -1};
static struct swap_info_struct *swap_info[MAX_SWAPFILES];
static DEFINE_MUTEX(swapon_mutex);
+static BLOCKING_NOTIFIER_HEAD(swapon_notify_list);
+static BLOCKING_NOTIFIER_HEAD(swapoff_notify_list);
+
static inline unsigned char swap_count(unsigned char ent)
{
@@ -1641,6 +1644,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
p->swap_map = NULL;
p->flags = 0;
spin_unlock(&swap_lock);
+ blocking_notifier_call_chain(&swapoff_notify_list, type, swap_file);
mutex_unlock(&swapon_mutex);
vfree(swap_map);
/* Destroy swap account informatin */
@@ -2060,6 +2064,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
else
swap_info[prev]->next = type;
spin_unlock(&swap_lock);
+ blocking_notifier_call_chain(&swapon_notify_list, type, swap_file);
mutex_unlock(&swapon_mutex);
error = 0;
goto out;
@@ -2487,3 +2492,49 @@ static void free_swap_count_continuations(struct swap_info_struct *si)
}
}
}
+
+static struct blocking_notifier_head *get_swap_notifier(enum swap_event event)
+{
+ struct blocking_notifier_head *nh;
+
+ switch (event) {
+ case SWAP_EVENT_SWAPON:
+ nh = &swapon_notify_list;
+ break;
+ case SWAP_EVENT_SWAPOFF:
+ nh = &swapoff_notify_list;
+ break;
+ default:
+ nh = NULL;
+ pr_err("Invalid swap event: %d\n", event);
+ }
+ return nh;
+}
+
+int register_swap_event_notifier(struct notifier_block *nb,
+ enum swap_event event)
+{
+ int ret = -EINVAL;
+ struct blocking_notifier_head *nh;
+
+ nh = get_swap_notifier(event);
+ if (nh)
+ ret = blocking_notifier_chain_register(nh, nb);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(register_swap_event_notifier);
+
+int unregister_swap_event_notifier(struct notifier_block *nb,
+ enum swap_event event)
+{
+ int ret = -EINVAL;
+ struct blocking_notifier_head *nh;
+
+ nh = get_swap_notifier(event);
+ if (nh)
+ ret = blocking_notifier_chain_unregister(nh, nb);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(unregister_swap_event_notifier);
--
1.6.6.1
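The split into one chain per event used by this patch can be illustrated with a small userspace model. It is a sketch under stated assumptions, not the kernel notifier API: the toy_* names are invented here, and locking, priorities and return-code handling of real blocking notifier chains are left out. It shows that a handler registered for one event is reached only when that event's own chain is walked.

```c
#include <assert.h>
#include <stddef.h>

enum toy_swap_event { TOY_SWAPON, TOY_SWAPOFF, TOY_NR_EVENTS };

struct toy_notifier {
    int (*call)(struct toy_notifier *nb, unsigned long type, void *data);
    struct toy_notifier *next;
};

/* One singly linked chain per event. */
static struct toy_notifier *toy_chains[TOY_NR_EVENTS];

static int toy_register(struct toy_notifier *nb, enum toy_swap_event ev)
{
    if ((unsigned)ev >= TOY_NR_EVENTS)
        return -1;
    nb->next = toy_chains[ev];
    toy_chains[ev] = nb;
    return 0;
}

static int toy_unregister(struct toy_notifier *nb, enum toy_swap_event ev)
{
    struct toy_notifier **p;

    if ((unsigned)ev >= TOY_NR_EVENTS)
        return -1;
    for (p = &toy_chains[ev]; *p; p = &(*p)->next) {
        if (*p == nb) {
            *p = nb->next;
            nb->next = NULL;
            return 0;
        }
    }
    return -1; /* not registered on this chain */
}

/* Walk the chain for 'ev'; returns the number of handlers called. */
static int toy_call_chain(enum toy_swap_event ev, unsigned long type,
                          void *data)
{
    struct toy_notifier *nb;
    int calls = 0;

    if ((unsigned)ev >= TOY_NR_EVENTS)
        return -1;
    for (nb = toy_chains[ev]; nb; nb = nb->next) {
        nb->call(nb, type, data);
        calls++;
    }
    return calls;
}

/* Example handlers that only count their invocations. */
static int toy_on_count, toy_off_count;

static int toy_on_handler(struct toy_notifier *nb, unsigned long type,
                          void *data)
{
    (void)nb; (void)type; (void)data;
    toy_on_count++;
    return 0;
}

static int toy_off_handler(struct toy_notifier *nb, unsigned long type,
                           void *data)
{
    (void)nb; (void)type; (void)data;
    toy_off_count++;
    return 0;
}
```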
* [PATCH 2/3] Send callback when a swap slot is freed
2010-05-05 13:45 [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory Nitin Gupta
2010-05-05 13:45 ` [PATCH 1/3] Add notifiers for swapon and swapoff events Nitin Gupta
@ 2010-05-05 13:45 ` Nitin Gupta
2010-05-05 13:45 ` [PATCH 3/3] ramzswap: Register for swap event notifiers and callback Nitin Gupta
2010-05-05 15:14 ` [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory Linus Torvalds
3 siblings, 0 replies; 13+ messages in thread
From: Nitin Gupta @ 2010-05-05 13:45 UTC (permalink / raw)
To: Greg KH
Cc: Minchan Kim, Pekka Enberg, Hugh Dickins, Cyp, Linus Torvalds,
driverdev, linux-kernel
This callback is required when RAM based devices are used as swap disks.
One such device is ramzswap[1] which is used as compressed in-memory swap
disk. For such devices, we need a callback as soon as a swap slot is no
longer used to allow freeing memory allocated for this slot. Without this
callback, stale data can quickly accumulate in memory defeating the whole
purpose of such devices.
Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
include/linux/swap.h | 6 ++++++
mm/swapfile.c | 19 +++++++++++++++++++
2 files changed, 25 insertions(+), 0 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index f3c7378..ffe0063 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -8,6 +8,7 @@
#include <linux/memcontrol.h>
#include <linux/sched.h>
#include <linux/node.h>
+#include <linux/blkdev.h>
#include <asm/atomic.h>
#include <asm/page.h>
@@ -20,6 +21,9 @@ struct bio;
#define SWAP_FLAG_PRIO_MASK 0x7fff
#define SWAP_FLAG_PRIO_SHIFT 0
+typedef void (ramzswap_slot_free_notify_fn)(struct block_device *,
+ unsigned long);
+
static inline int current_is_kswapd(void)
{
return current->flags & PF_KSWAPD;
@@ -185,6 +189,7 @@ struct swap_info_struct {
struct swap_extent *curr_swap_extent;
struct swap_extent first_swap_extent;
struct block_device *bdev; /* swap device or bdev of swap file */
+ ramzswap_slot_free_notify_fn *ramzswap_slot_free_notify_fn;
struct file *swap_file; /* seldom referenced */
unsigned int old_block_size; /* seldom referenced */
};
@@ -338,6 +343,7 @@ extern int register_swap_event_notifier(struct notifier_block *nb,
enum swap_event event);
extern int unregister_swap_event_notifier(struct notifier_block *nb,
enum swap_event event);
+extern void set_ramzswap_slot_free_notify(int, ramzswap_slot_free_notify_fn *);
struct backing_dev_info;
/* linux/mm/thrash.c */
diff --git a/mm/swapfile.c b/mm/swapfile.c
index aa474f3..aaea63d 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -586,6 +586,8 @@ static unsigned char swap_entry_free(struct swap_info_struct *p,
swap_list.next = p->type;
nr_swap_pages++;
p->inuse_pages--;
+ if (p->ramzswap_slot_free_notify_fn)
+ p->ramzswap_slot_free_notify_fn(p->bdev, offset);
}
return usage;
@@ -2538,3 +2540,20 @@ int unregister_swap_event_notifier(struct notifier_block *nb,
return ret;
}
EXPORT_SYMBOL_GPL(unregister_swap_event_notifier);
+
+/*
+ * Called from ramzswap driver. Sets swap slot free callback for
+ * the given ramzswap device. This callback will be given a more
+ * generic name if it finds more users.
+ */
+void set_ramzswap_slot_free_notify(int type, ramzswap_slot_free_notify_fn *fn)
+{
+ struct swap_info_struct *sis;
+
+ spin_lock(&swap_lock);
+ sis = swap_info[type];
+ sis->ramzswap_slot_free_notify_fn = fn;
+ spin_unlock(&swap_lock);
+}
+EXPORT_SYMBOL_GPL(set_ramzswap_slot_free_notify);
+
--
1.6.6.1
* [PATCH 3/3] ramzswap: Register for swap event notifiers and callback
2010-05-05 13:45 [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory Nitin Gupta
2010-05-05 13:45 ` [PATCH 1/3] Add notifiers for swapon and swapoff events Nitin Gupta
2010-05-05 13:45 ` [PATCH 2/3] Send callback when a swap slot is freed Nitin Gupta
@ 2010-05-05 13:45 ` Nitin Gupta
2010-05-05 15:14 ` [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory Linus Torvalds
3 siblings, 0 replies; 13+ messages in thread
From: Nitin Gupta @ 2010-05-05 13:45 UTC (permalink / raw)
To: Greg KH
Cc: Minchan Kim, Pekka Enberg, Hugh Dickins, Cyp, Linus Torvalds,
driverdev, linux-kernel
The SWAPON handler sets callback which frees memory associated
with given swap slot, eliminating any stale data in corresponding
ramzswap device.
Without this callback, ramzswap device does not get any notification
when a swap slot is freed. Thus, stale data can quickly accumulate in
(compressed) memory which can defeat the whole purpose of such a device.
With this callback, the device can free memory as soon as corresponding
swap slot is freed.
Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
drivers/staging/ramzswap/TODO | 5 --
drivers/staging/ramzswap/ramzswap_drv.c | 79 ++++++++++++++++++++++++++++++-
2 files changed, 78 insertions(+), 6 deletions(-)
delete mode 100644 drivers/staging/ramzswap/TODO
diff --git a/drivers/staging/ramzswap/TODO b/drivers/staging/ramzswap/TODO
deleted file mode 100644
index 8d64e28..0000000
--- a/drivers/staging/ramzswap/TODO
+++ /dev/null
@@ -1,5 +0,0 @@
-TODO:
- - Add support for swap notifiers
-
-Please send patches to Greg Kroah-Hartman <greg@kroah.com> and
-Nitin Gupta <ngupta@vflare.org>
diff --git a/drivers/staging/ramzswap/ramzswap_drv.c b/drivers/staging/ramzswap/ramzswap_drv.c
index ee5eb12..4be5bf7 100644
--- a/drivers/staging/ramzswap/ramzswap_drv.c
+++ b/drivers/staging/ramzswap/ramzswap_drv.c
@@ -1300,6 +1300,76 @@ static struct block_device_operations ramzswap_devops = {
.owner = THIS_MODULE,
};
+/*
+ * Returns ramzswap device for the given swap file. Also caches
+ * struct ramzswap in file->private_data.
+ *
+ * Returns NULL if the given file is not a ramzswap device.
+ */
+static struct ramzswap *ramzswap_find_device(struct file *swap_file)
+{
+ int i;
+ struct inode *inode;
+ struct ramzswap *rzs;
+ struct block_device *bdev;
+
+ inode = swap_file->f_mapping->host;
+ bdev = I_BDEV(inode);
+ rzs = bdev->bd_disk->private_data;
+
+ for (i = 0; i < num_devices; i++) {
+ if (rzs == &devices[i])
+ break;
+ }
+
+ if (i == num_devices) {
+ rzs = NULL;
+ goto out;
+ }
+
+out:
+ return rzs;
+}
+
+void ramzswap_slot_free_notify(struct block_device *bdev, unsigned long index)
+{
+ struct ramzswap *rzs = bdev->bd_disk->private_data;
+ ramzswap_free_page(rzs, index);
+ rzs_stat64_inc(rzs, &rzs->stats.notify_free);
+}
+
+int ramzswap_swapon_notify(struct notifier_block *nb, unsigned long type,
+ void *swap_file)
+{
+ int ret = -EINVAL;
+
+ if (ramzswap_find_device(swap_file)) {
+ set_ramzswap_slot_free_notify(type, ramzswap_slot_free_notify);
+ ret = 0;
+ }
+ return ret;
+}
+
+int ramzswap_swapoff_notify(struct notifier_block *nb, unsigned long type,
+ void *swap_file)
+{
+ int ret = -EINVAL;
+
+ if (ramzswap_find_device(swap_file)) {
+ set_ramzswap_slot_free_notify(type, NULL);
+ ret = 0;
+ }
+ return ret;
+}
+
+static struct notifier_block ramzswap_swapon_nb = {
+ .notifier_call = ramzswap_swapon_notify,
+};
+
+static struct notifier_block ramzswap_swapoff_nb = {
+ .notifier_call = ramzswap_swapoff_notify,
+};
+
static int create_device(struct ramzswap *rzs, int device_id)
{
int ret = 0;
@@ -1401,8 +1471,11 @@ static int __init ramzswap_init(void)
goto free_devices;
}
- return 0;
+ register_swap_event_notifier(&ramzswap_swapon_nb, SWAP_EVENT_SWAPON);
+ register_swap_event_notifier(&ramzswap_swapoff_nb,
+ SWAP_EVENT_SWAPOFF);
+ return 0;
free_devices:
while (dev_id)
destroy_device(&devices[--dev_id]);
@@ -1417,6 +1490,10 @@ static void __exit ramzswap_exit(void)
int i;
struct ramzswap *rzs;
+ unregister_swap_event_notifier(&ramzswap_swapon_nb, SWAP_EVENT_SWAPON);
+ unregister_swap_event_notifier(&ramzswap_swapoff_nb,
+ SWAP_EVENT_SWAPOFF);
+
for (i = 0; i < num_devices; i++) {
rzs = &devices[i];
--
1.6.6.1
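The ownership check performed by ramzswap_find_device() in this patch boils down to asking whether a disk's private pointer is the address of one of the driver's own devices. A minimal userspace sketch of that check, with a hypothetical toy_rzs type and the device array passed in explicitly rather than held in a global:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for 'struct ramzswap'. */
struct toy_rzs {
    int id;
};

/*
 * Returns the matching device if 'private_data' is the address of one of
 * the 'n' entries in 'devices', or NULL if the pointer belongs to some
 * other driver's disk.
 */
static struct toy_rzs *toy_find_device(struct toy_rzs *devices, size_t n,
                                       void *private_data)
{
    size_t i;

    assert(devices != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (private_data == &devices[i])
            return &devices[i];
    }
    return NULL;
}
```

Returning from inside the loop avoids the separate index test and goto, at the cost of a second return statement.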
* Re: [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory
2010-05-05 13:45 [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory Nitin Gupta
` (2 preceding siblings ...)
2010-05-05 13:45 ` [PATCH 3/3] ramzswap: Register for swap event notifiers and callback Nitin Gupta
@ 2010-05-05 15:14 ` Linus Torvalds
2010-05-05 16:05 ` Nitin Gupta
3 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2010-05-05 15:14 UTC (permalink / raw)
To: Nitin Gupta
Cc: Greg KH, Minchan Kim, Pekka Enberg, Hugh Dickins, Cyp, driverdev,
linux-kernel
On Wed, 5 May 2010, Nitin Gupta wrote:
>
> ramzswap driver creates RAM based block devices which can be
> used (only) as swap disks. Pages swapped to these disks are
> compressed and stored in memory itself.
Ok, this patch series looks way better, if only because it looks less
hacky.
That said, I absolutely _hate_ the f*cking notifier model that takes
"type" flags. It's a disgrace. It's a horrible horrible model.
I'd much rather bind a nice "swap_operations" structure to the device, and
have that structure have function pointers for the different operations.
No stupid "operation type codes". Real, honest-to-goodness function
pointers.
The notifier layer is a total piece of sh*t. I'm sorry I ever merged it,
and I'm _doubly_ sorry that its use is so horribly widespread. It's a
mistake.
Linus
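The contrast drawn above, one function pointer per operation instead of a single handler that decodes an event code, might look roughly like the following userspace sketch. No such 'swap_operations' structure is defined anywhere in this thread, so the struct name, its members and the toy_core_* helpers are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table; the name and members are illustrative only. */
struct toy_swap_operations {
    void (*swapon)(void *dev);
    void (*swapoff)(void *dev);
    void (*slot_free)(void *dev, unsigned long offset);
};

struct toy_device {
    const struct toy_swap_operations *swap_ops; /* NULL: not interested */
    int active;
    unsigned long slots_freed;
};

static void toy_dev_swapon(void *dev)
{
    struct toy_device *d = dev;

    d->active = 1;
}

static void toy_dev_swapoff(void *dev)
{
    struct toy_device *d = dev;

    d->active = 0;
}

static void toy_dev_slot_free(void *dev, unsigned long offset)
{
    struct toy_device *d = dev;

    (void)offset;
    assert(d->active); /* slot frees only make sense while swap is on */
    d->slots_freed++;
}

static const struct toy_swap_operations toy_dev_swap_ops = {
    .swapon = toy_dev_swapon,
    .swapoff = toy_dev_swapoff,
    .slot_free = toy_dev_slot_free,
};

/* Core side: each event calls its own pointer, nothing to decode. */
static void toy_core_swapon(struct toy_device *d)
{
    if (d->swap_ops && d->swap_ops->swapon)
        d->swap_ops->swapon(d);
}

static void toy_core_swapoff(struct toy_device *d)
{
    if (d->swap_ops && d->swap_ops->swapoff)
        d->swap_ops->swapoff(d);
}

static void toy_core_slot_free(struct toy_device *d, unsigned long offset)
{
    if (d->swap_ops && d->swap_ops->slot_free)
        d->swap_ops->slot_free(d, offset);
}
```

A device that leaves the table pointer NULL is simply skipped, so no registration or unregistration step is needed in this model.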
* Re: [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory
2010-05-05 15:14 ` [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory Linus Torvalds
@ 2010-05-05 16:05 ` Nitin Gupta
2010-05-05 16:22 ` Linus Torvalds
0 siblings, 1 reply; 13+ messages in thread
From: Nitin Gupta @ 2010-05-05 16:05 UTC (permalink / raw)
To: Linus Torvalds
Cc: Greg KH, Minchan Kim, Pekka Enberg, Hugh Dickins, Cyp, driverdev,
linux-kernel
On 05/05/2010 08:44 PM, Linus Torvalds wrote:
>
>
> On Wed, 5 May 2010, Nitin Gupta wrote:
>>
>> ramzswap driver creates RAM based block devices which can be
>> used (only) as swap disks. Pages swapped to these disks are
>> compressed and stored in memory itself.
>
> Ok, this patch series looks way better, if only because it looks less
> hacky.
>
> That said, I absolutely _hate_ the f*cking notifier model that takes
> "type" flags. It's a disgrace. It's a horrible horrible model.
>
You mean you didn't like the 'swap type' value passed around by notifier
calls, as here:
"blocking_notifier_call_chain(&swapon_notify_list, type, swap_file);" ?
> I'd much rather bind a nice "swap_operations" structure to the device, and
> have that structure have function pointers for the different operations.
> No stupid "operation type codes". Real, honest-to-goodness function
> pointers.
>
I think such 'swap_operations' structure will have to be part of
block_device_operations, so we may access it from swap_entry_free()
where a swap slot is freed. This will also get rid of all this notifier
stuff.
The patch you nacked did something similar: it added 'swap_slot_free_callback'
directly to block_device_operations. Without such change, I could not think
of any way to do away with notifiers.
> The notifier layer is a total piece of sh*t. I'm sorry I ever merged it,
> and I'm _doubly_ sorry that its use is so horribly widespread. It's a
> mistake.
Thanks,
Nitin
* Re: [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory
2010-05-05 16:05 ` Nitin Gupta
@ 2010-05-05 16:22 ` Linus Torvalds
2010-05-05 16:55 ` Nitin Gupta
0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2010-05-05 16:22 UTC (permalink / raw)
To: Nitin Gupta
Cc: Greg KH, Minchan Kim, Pekka Enberg, Hugh Dickins, Cyp, driverdev,
linux-kernel
On Wed, 5 May 2010, Nitin Gupta wrote:
>
> I think such 'swap_operations' structure will have to be part of
> block_device_operations, so we may access it from swap_entry_free()
> where a swap slot is freed. This will also get rid of all this notifier
> stuff.
Yes, I think adding it to block_device_operations would be fine. That
sounds like a sane layering, and would make it easy for a block device
driver to say "I want to know about swap events".
In fact, for regular block devices, a swap block free might well translate
into a TRIM command some day (where "some day" means when the SSDs
actually get their stuff together and there is real upside and not just
"most cases will be very slow and the upside is debatable").
> The patch you nacked did something similar: it added 'swap_slot_free_callback'
> directly to block_device_operations. Without such change, I could not think
> of any way to do away with notifiers.
Umm. No. IIRC, the patch I NAK'ed added it to the 'swap_info_struct', which
I said was the wrong level. The block device driver level would seem to be
the _right_ level, since that's what ramzswap is. No?
Also, the patch I NAK'ed also used those nasty notifier chains, making it
even uglier.
Linus
* Re: [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory
2010-05-05 16:22 ` Linus Torvalds
@ 2010-05-05 16:55 ` Nitin Gupta
2010-05-05 17:50 ` Linus Torvalds
0 siblings, 1 reply; 13+ messages in thread
From: Nitin Gupta @ 2010-05-05 16:55 UTC (permalink / raw)
To: Linus Torvalds
Cc: Greg KH, Minchan Kim, Pekka Enberg, Hugh Dickins, Cyp, driverdev,
linux-kernel
On 05/05/2010 09:52 PM, Linus Torvalds wrote:
>
>
> On Wed, 5 May 2010, Nitin Gupta wrote:
>>
>> I think such 'swap_operations' structure will have to be part of
>> block_device_operations, so we may access it from swap_entry_free()
>> where a swap slot is freed. This will also get rid of all this notifier
>> stuff.
>
> Yes, I think adding it to block_device_operations would be fine. That
> sounds like a sane layering, and would make it easy for a block device
> driver to say "I want to know about swap events".
>
> In fact, for regular block devices, a swap block free might well translate
> into a TRIM command some day (where "some day" means when the SSDs
> actually get their stuff together and there is real upside and not just
> "most cases will be very slow and the upside is debatable").
>
It's great if adding such a callback to block_device_operations is okay. Hugh
suggested this approach and I have been distributing it with compcache for quite
some time now:
http://code.google.com/p/compcache/source/browse/patches/patch_swap_notify_core_support_2.6.33.diff
Can you please have a look at the patch above and see if it's acceptable? Then I will
post it to lkml again.
>> The patch you nacked did something similar: it added 'swap_slot_free_callback'
>> directly to block_device_operations. Without such change, I could not think
>> of any way to do away with notifiers.
>
> Umm. No. IIRC, the patch I NAK'ed added it to the 'swap_info_struct', which
> I said was the wrong level. The block device driver level would seem to be
> the _right_ level, since that's what ramzswap is. No?
>
> Also, the patch I NAK'ed also used those nasty notifier chains, making it
> even uglier.
Please see the original mail below (patch you nacked). Maybe, at that time, I didn't
make it clear that ramzswap is really a *block device* :)
-------- Original Message --------
Subject: [nacked] mm-add-swap-slot-free-callback-to-block_device_operations.patch removed from -mm tree
Date: Tue, 09 Mar 2010 14:30:27 -0800
From: akpm@linux-foundation.org
To: <snip/>
The patch titled
mm: add swap slot free callback to block_device_operations
has been removed from the -mm tree. Its filename was
mm-add-swap-slot-free-callback-to-block_device_operations.patch
This patch was dropped because it was nacked
The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/
------------------------------------------------------
Subject: mm: add swap slot free callback to block_device_operations
From: Nitin Gupta <ngupta@vflare.org>
This callback is required when RAM based devices are used as swap disks.
One such device is ramzswap[1] which is used as compressed in-memory swap
disk. For such devices, we need a callback as soon as a swap slot is no
longer used to allow freeing memory allocated for this slot. Without this
callback, stale data can quickly accumulate in memory defeating the whole
purpose of such devices.
Another user of this callback will be "preswap" as introduced by
"Transcendent Memory" patches: http://lwn.net/Articles/367286/ (I intend
to integrate preswap with ramzswap).
[1] ramzswap: http://code.google.com/p/compcache/
Signed-off-by: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Greg KH <greg@kroah.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/blkdev.h | 2 ++
mm/swapfile.c | 3 +++
2 files changed, 5 insertions(+)
diff -puN include/linux/blkdev.h~mm-add-swap-slot-free-callback-to-block_device_operations include/linux/blkdev.h
--- a/include/linux/blkdev.h~mm-add-swap-slot-free-callback-to-block_device_operations
+++ a/include/linux/blkdev.h
@@ -1310,6 +1310,8 @@ struct block_device_operations {
unsigned long long);
int (*revalidate_disk) (struct gendisk *);
int (*getgeo)(struct block_device *, struct hd_geometry *);
+ /* this callback is with swap_lock and sometimes page table lock held */
+ void (*swap_slot_free_notify) (struct block_device *, unsigned long);
struct module *owner;
};
diff -puN mm/swapfile.c~mm-add-swap-slot-free-callback-to-block_device_operations mm/swapfile.c
--- a/mm/swapfile.c~mm-add-swap-slot-free-callback-to-block_device_operations
+++ a/mm/swapfile.c
@@ -574,6 +574,7 @@ static unsigned char swap_entry_free(str
/* free if no reference */
if (!usage) {
+ struct gendisk *disk = p->bdev->bd_disk;
if (offset < p->lowest_bit)
p->lowest_bit = offset;
if (offset > p->highest_bit)
@@ -583,6 +584,8 @@ static unsigned char swap_entry_free(str
swap_list.next = p->type;
nr_swap_pages++;
p->inuse_pages--;
+ if (disk->fops->swap_slot_free_notify)
+ disk->fops->swap_slot_free_notify(p->bdev, offset);
}
return usage;
_
Patches currently in -mm which might be from ngupta@vflare.org are
linux-next.patch
mm-add-swap-slot-free-callback-to-block_device_operations.patch
* Re: [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory
2010-05-05 16:55 ` Nitin Gupta
@ 2010-05-05 17:50 ` Linus Torvalds
0 siblings, 0 replies; 13+ messages in thread
From: Linus Torvalds @ 2010-05-05 17:50 UTC (permalink / raw)
To: Nitin Gupta
Cc: Greg KH, Minchan Kim, Pekka Enberg, Hugh Dickins, Cyp, driverdev,
linux-kernel
On Wed, 5 May 2010, Nitin Gupta wrote:
>
> Please see the original mail below (patch you nacked). Maybe, at that time, I didn't
> make it clear that ramzswap is really a *block device* :)
Oh, you're right. I looked up another patch of yours that had that
"swap_info" thing, and decided I hated that one with a passion, but it's
not the one I NAK'ed in the earlier discussion.
Now that I see the block-layer patch, my reaction is (a) it's so much
nicer than using that horrid nasty 'notifier' crap and (b) it reminds me
why I wasn't entirely happy: it doesn't work - or even make sense - for
filesystem-backed swap.
So when you do

    struct gendisk *disk = p->bdev->bd_disk;
    ..
    if (disk->fops->swap_slot_free_notify)
        disk->fops->swap_slot_free_notify(p->bdev, offset);

there's nothing that says that 'offset' makes any sense at all, because if
it's a swap-file on a device, it does all kinds of totally wrong things.
So I don't think that patch works either. I still suspect that the right
"level" for something like this should be the 'mapping' level (which is
how we actually do the write), but that seems to not work well with the
block device layer.
So at a _minimum_, that 'disk->fops' approach needs to verify that the
swap device is actually the whole bdev, and that the bdev isn't just the
backing store for a swap _file_.
I dunno how to best check that. Either add a new flag to
'swap_info_struct' that gets set on 'swapon()' whether it's a full device
or a file. Or possibly just something like

    static int swap_is_block_device(struct swap_info_struct *p)
    {
        /* S_ISBLK() tests a mode value, so pass the inode's i_mode */
        return S_ISBLK(p->swap_file->f_mapping->host->i_mode);
    }

instead.
Because doing that 'disk->fops' thing _really_ isn't right if it isn't a
disk.
Linus
Thread overview: 13+ messages
2010-05-05 13:45 [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory Nitin Gupta
2010-05-05 13:45 ` [PATCH 1/3] Add notifiers for swapon and swapoff events Nitin Gupta
2010-05-05 13:45 ` [PATCH 2/3] Send callback when a swap slot is freed Nitin Gupta
2010-05-05 13:45 ` [PATCH 3/3] ramzswap: Register for swap event notifiers and callback Nitin Gupta
2010-05-05 15:14 ` [PATCH 0/3] ramzswap: Eliminate stale data in compressed memory Linus Torvalds
2010-05-05 16:05 ` Nitin Gupta
2010-05-05 16:22 ` Linus Torvalds
2010-05-05 16:55 ` Nitin Gupta
2010-05-05 17:50 ` Linus Torvalds
-- strict thread matches above, loose matches on Subject: below --
2010-03-05 10:22 Nitin Gupta
2010-03-09 19:07 ` Nitin Gupta
2010-03-11 7:22 ` Hugh Dickins
2010-03-11 11:36 ` Nitin Gupta