[PATCH] vm - swap_prefetch-15

public inbox for linux-kernel@vger.kernel.org

* [PATCH] vm - swap_prefetch-15
@ 2005-10-06 14:01 Con Kolivas
  2005-10-06 14:13 ` [PATCH] vm - swap_prefetch-15 docs Con Kolivas
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Con Kolivas @ 2005-10-06 14:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: ck

[-- Attachment #1: Type: text/plain, Size: 519 bytes --]

The last known bugs were addressed in this latest version of the swap 
prefetching patch. Thanks to the testers out there who helped it get this 
far.

-Prefetched pages weren't handled properly by the lru lists.
-Prefetch groups are now 10 times larger when laptop_mode is enabled thus 
decreasing the amount of time spent prefetching and thus the disk spinning.
-Documentation as suggested by Ingo Oeser

Incremental patches and latest available here:
http://ck.kolivas.org/patches/swap-prefetch/

Cheers,
Con
---




[-- Attachment #2: vm-swap_prefetch-15.patch --]
[-- Type: text/x-diff, Size: 22821 bytes --]

This patch implements swap prefetching when the vm is relatively idle and
there is free ram available. The code is based on some early work by Thomas
Schlichter.

This stores swapped entries both in a list ordered by most recent use and in
a radix tree. It creates a low priority kernel thread running at nice 19
to do the prefetching at a later stage.

Once pages have been added to the swapped list, a timer is started that tests
every 5 seconds for conditions suitable to prefetch swap pages. Suitable
conditions are defined as no pages being swapped out or in, and no
watermark tests failing. Significant amounts of dirtied ram and changes in
free ram representing disk writes or reads also prevent prefetching.

It then checks that we have spare ram, looking for at least 3 * pages_high
free per zone, and if that succeeds it prefetches pages from swap. The pages
are prefetched in groups that are multiples of 128kb every 1 second until the
vm is busy by the tests above, the watermarks fail to detect adequate free
ram, or the list is emptied. The pages are copied to swap cache and kept on
backing store. This allows pressure on either physical ram or swap to readily
find free pages without further I/O.
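
The per-zone free ram check described above can be sketched in user space
roughly as follows. This is an illustration only: struct zone_model and
zones_have_spare_ram() are hypothetical names that do not appear in the patch,
and the real code walks the kernel's zones with for_each_zone().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the struct zone fields used here. */
struct zone_model {
	unsigned long present_pages;
	unsigned long free_pages;
	unsigned long pages_high;
};

/*
 * Returns true only if every populated zone has at least 3 * pages_high
 * pages free. On success *total_free holds the summed free pages of the
 * populated zones; on failure it is left untouched.
 */
static bool zones_have_spare_ram(const struct zone_model *zones, size_t n,
				 unsigned long *total_free)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (zones[i].present_pages == 0)
			continue;	/* empty zone, ignore it */
		if (zones[i].pages_high * 3 > zones[i].free_pages)
			return false;	/* too close to the reclaim watermark */
		total += zones[i].free_pages;
	}
	*total_free = total;
	return true;
}
```

A zone with no present pages is skipped rather than treated as a failure, and
a single zone below the threshold is enough to put prefetching off.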

The amount prefetched in each group is configurable in units of 128kb via the
tunable /proc/sys/vm/swap_prefetch , which is set to 2 by default (256kb).
When laptop_mode is enabled it prefetches in ten times larger blocks to
minimise the time spent reading.
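
The group size arithmetic can be worked through with a small sketch. It
assumes SWAP_CLUSTER_MAX is 32 pages (its value in 2.6.13) and 4kb pages, and
it assumes laptop_mode is either 0 or 1. The helper names prefetch_pages() and
prefetch_kb() are hypothetical; the patch itself uses the PREFETCH_PAGES()
macro.

```c
/* Assumed values, not taken from the patch description. */
#define SWAP_CLUSTER_MAX	32	/* pages per cluster in 2.6.13 */
#define PAGE_SIZE_KB		4	/* typical page size on x86 */

/* Same expression as the patch's PREFETCH_PAGES() macro. */
static int prefetch_pages(int swap_prefetch, int laptop_mode)
{
	return SWAP_CLUSTER_MAX * swap_prefetch * (1 + 9 * laptop_mode);
}

/* Group size in kb for a given tunable value. */
static int prefetch_kb(int swap_prefetch, int laptop_mode)
{
	return prefetch_pages(swap_prefetch, laptop_mode) * PAGE_SIZE_KB;
}
```

Under those assumptions the default of 2 gives 64 pages (256kb) per group,
laptop_mode raises that to 640 pages (2560kb), and a value of 0 gives an empty
group.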

In testing on modern pc hardware, this speeds up wall-clock activation of the
firefox browser 5 fold after a worst case complete swap-out of the browser on
a static web page.

Signed-off-by: Con Kolivas <kernel@kolivas.org>

 include/linux/swap.h   |   33 +++
 include/linux/sysctl.h |    1
 init/Kconfig           |   21 ++
 kernel/sysctl.c        |   12 +
 mm/Makefile            |    1
 mm/page_alloc.c        |   14 +
 mm/swap.c              |    3
 mm/swap_prefetch.c     |  445 +++++++++++++++++++++++++++++++++++++++++++++++++
 mm/swap_state.c        |   10 -
 mm/vmscan.c            |    5
 10 files changed, 540 insertions(+), 5 deletions(-)

Index: linux-2.6.13-sp/include/linux/swap.h
===================================================================
--- linux-2.6.13-sp.orig/include/linux/swap.h	2005-10-06 23:18:57.000000000 +1000
+++ linux-2.6.13-sp/include/linux/swap.h	2005-10-06 23:19:49.000000000 +1000
@@ -185,6 +185,38 @@ extern int shmem_unuse(swp_entry_t entry
 
 extern void swap_unplug_io_fn(struct backing_dev_info *, struct page *);
 
+#ifdef CONFIG_SWAP_PREFETCH
+/* only used by prefetch externally */
+/*	mm/swap_prefetch.c */
+extern void prepare_prefetch(void);
+extern void add_to_swapped_list(unsigned long index);
+extern void remove_from_swapped_list(unsigned long index);
+extern void delay_prefetch(void);
+/* linux/mm/page_alloc.c */
+extern struct page *
+buffered_rmqueue(struct zone *zone, int order, unsigned int __nocast gfp_flags);
+extern void zone_statistics(struct zonelist *zonelist, struct zone *z);
+extern int swap_prefetch;
+
+#else	/* CONFIG_SWAP_PREFETCH */
+static inline void add_to_swapped_list(unsigned long index)
+{
+}
+
+static inline void prepare_prefetch(void)
+{
+}
+
+static inline void remove_from_swapped_list(unsigned long index)
+{
+}
+
+static inline void delay_prefetch(void)
+{
+}
+
+#endif	/* CONFIG_SWAP_PREFETCH */
+
 #ifdef CONFIG_SWAP
 /* linux/mm/page_io.c */
 extern int swap_readpage(struct file *, struct page *);
@@ -206,6 +238,7 @@ extern void free_pages_and_swap_cache(st
 extern struct page * lookup_swap_cache(swp_entry_t);
 extern struct page * read_swap_cache_async(swp_entry_t, struct vm_area_struct *vma,
 					   unsigned long addr);
+extern int add_to_swap_cache(struct page *page, swp_entry_t entry);
 /* linux/mm/swapfile.c */
 extern long total_swap_pages;
 extern unsigned int nr_swapfiles;
Index: linux-2.6.13-sp/include/linux/sysctl.h
===================================================================
--- linux-2.6.13-sp.orig/include/linux/sysctl.h	2005-08-30 14:07:46.000000000 +1000
+++ linux-2.6.13-sp/include/linux/sysctl.h	2005-10-06 23:19:49.000000000 +1000
@@ -180,6 +180,7 @@ enum
 	VM_VFS_CACHE_PRESSURE=26, /* dcache/icache reclaim pressure */
 	VM_LEGACY_VA_LAYOUT=27, /* legacy/compatibility virtual address space layout */
 	VM_SWAP_TOKEN_TIMEOUT=28, /* default time for token time out */
+	VM_SWAP_PREFETCH=29,	/* int: amount to swap prefetch */
 };
 
 
Index: linux-2.6.13-sp/init/Kconfig
===================================================================
--- linux-2.6.13-sp.orig/init/Kconfig	2005-10-06 23:18:57.000000000 +1000
+++ linux-2.6.13-sp/init/Kconfig	2005-10-06 23:19:49.000000000 +1000
@@ -87,6 +87,27 @@ config SWAP
 	  used to provide more virtual memory than the actual RAM present
 	  in your computer.  If unsure say Y.
 
+config SWAP_PREFETCH
+	bool "Support for prefetching swapped memory"
+	depends on SWAP
+	default n
+	---help---
+	  This option will allow the kernel to prefetch swapped memory pages
+	  when idle. The pages will be kept on both swap and in swap_cache
+	  thus avoiding the need for further I/O if either ram or swap space
+	  is required.
+	  
+	  What this will do on workstations is slowly bring back applications
+	  that have swapped out after memory intensive workloads back into
+	  physical ram if you have free ram at a later stage and the machine
+	  is relatively idle. This means that when you come back to your
+	  computer after leaving it idle for a while, applications will come
+	  to life faster. Note that your swap usage will appear to increase
+	  but these are cached pages, can be dropped freely by the vm, and it
+	  should stabilise around 50% swap usage.
+	  
+	  Desktop users will most likely want to say Y.
+
 config SYSVIPC
 	bool "System V IPC"
 	depends on MMU
Index: linux-2.6.13-sp/kernel/sysctl.c
===================================================================
--- linux-2.6.13-sp.orig/kernel/sysctl.c	2005-08-30 14:07:46.000000000 +1000
+++ linux-2.6.13-sp/kernel/sysctl.c	2005-10-06 23:19:49.000000000 +1000
@@ -850,6 +850,18 @@ static ctl_table vm_table[] = {
 		.proc_handler	= &proc_dointvec_jiffies,
 		.strategy	= &sysctl_jiffies,
 	},
+#ifdef CONFIG_SWAP_PREFETCH
+	{
+		.ctl_name	= VM_SWAP_PREFETCH,
+		.procname	= "swap_prefetch",
+		.data		= &swap_prefetch,
+		.maxlen		= sizeof(swap_prefetch),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+		.strategy	= &sysctl_intvec,
+		.extra1		= &zero,
+	},
+#endif
 #endif
 	{ .ctl_name = 0 }
 };
Index: linux-2.6.13-sp/mm/Makefile
===================================================================
--- linux-2.6.13-sp.orig/mm/Makefile	2005-10-06 23:18:57.000000000 +1000
+++ linux-2.6.13-sp/mm/Makefile	2005-10-06 23:19:49.000000000 +1000
@@ -13,6 +13,7 @@ obj-y			:= bootmem.o filemap.o mempool.o
 			   prio_tree.o $(mmu-y)
 
 obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o thrash.o
+obj-$(CONFIG_SWAP_PREFETCH) += swap_prefetch.o
 obj-$(CONFIG_HUGETLBFS)	+= hugetlb.o
 obj-$(CONFIG_NUMA) 	+= mempolicy.o
 obj-$(CONFIG_SPARSEMEM)	+= sparse.o
Index: linux-2.6.13-sp/mm/page_alloc.c
===================================================================
--- linux-2.6.13-sp.orig/mm/page_alloc.c	2005-10-06 23:18:57.000000000 +1000
+++ linux-2.6.13-sp/mm/page_alloc.c	2005-10-06 23:19:49.000000000 +1000
@@ -607,7 +607,7 @@ void drain_local_pages(void)
 }
 #endif /* CONFIG_PM */
 
-static void zone_statistics(struct zonelist *zonelist, struct zone *z)
+void zone_statistics(struct zonelist *zonelist, struct zone *z)
 {
 #ifdef CONFIG_NUMA
 	unsigned long flags;
@@ -684,7 +684,7 @@ static inline void prep_zero_page(struct
  * we cheat by calling it from here, in the order > 0 path.  Saves a branch
  * or two.
  */
-static struct page *
+struct page *
 buffered_rmqueue(struct zone *zone, int order, unsigned int __nocast gfp_flags)
 {
 	unsigned long flags;
@@ -745,7 +745,7 @@ int zone_watermark_ok(struct zone *z, in
 		min -= min / 4;
 
 	if (free_pages <= min + z->lowmem_reserve[classzone_idx])
-		return 0;
+		goto out_failed;
 	for (o = 0; o < order; o++) {
 		/* At the next order, this order's pages become unavailable */
 		free_pages -= z->free_area[o].nr_free << o;
@@ -754,9 +754,15 @@ int zone_watermark_ok(struct zone *z, in
 		min >>= 1;
 
 		if (free_pages <= min)
-			return 0;
+			goto out_failed;
 	}
+
 	return 1;
+out_failed:
+	/* Swap prefetching is delayed if any watermark is low */
+	delay_prefetch();
+
+	return 0;	
 }
 
 static inline int
Index: linux-2.6.13-sp/mm/swap.c
===================================================================
--- linux-2.6.13-sp.orig/mm/swap.c	2005-10-06 23:18:57.000000000 +1000
+++ linux-2.6.13-sp/mm/swap.c	2005-10-06 23:19:49.000000000 +1000
@@ -481,5 +481,8 @@ void __init swap_setup(void)
 	 * Right now other parts of the system means that we
 	 * _really_ don't want to cluster much more
 	 */
+
+	prepare_prefetch();
+
 	hotcpu_notifier(cpu_swap_callback, 0);
 }
Index: linux-2.6.13-sp/mm/swap_prefetch.c
===================================================================
--- linux-2.6.13-sp.orig/mm/swap_prefetch.c	2005-10-06 22:39:25.000000000 +1000
+++ linux-2.6.13-sp/mm/swap_prefetch.c	2005-10-06 23:20:12.000000000 +1000
@@ -0,0 +1,445 @@
+/*
+ * linux/mm/swap_prefetch.c
+ *
+ * Copyright (C) 2005 Con Kolivas
+ *
+ * Written by Con Kolivas <kernel@kolivas.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/swap.h>
+#include <linux/fs.h>
+#include <linux/pagemap.h>
+#include <linux/syscalls.h>
+#include <linux/ioprio.h>
+#include <linux/writeback.h>
+
+/* Time to delay prefetching if vm is busy or prefetching unsuccessful */
+#define PREFETCH_DELAY	(HZ * 5)
+/* Time between attempting prefetching when vm is idle */
+#define PREFETCH_INTERVAL (HZ)
+
+/* sysctl - if/how much to prefetch at a time */
+int swap_prefetch = 2;
+
+/*
+ * How many pages to prefetch at a time. We prefetch SWAP_CLUSTER_MAX *
+ * swap_prefetch per PREFETCH_INTERVAL, but prefetch ten times as much at a
+ * time in laptop_mode to minimise the time we keep the disk spinning.
+ */
+#define PREFETCH_PAGES()	(SWAP_CLUSTER_MAX * swap_prefetch * \
+					(1 + 9 * laptop_mode))
+
+struct swapped_root_t {
+	unsigned long		busy;		/* vm busy */
+	spinlock_t		lock;		/* protects all data */
+	struct list_head	list;		/* MRU list of swapped pages */
+	struct radix_tree_root	swap_tree;	/* Lookup tree of pages */
+	unsigned int		count;		/* Number of entries */
+	unsigned int		maxcount;	/* Maximum entries allowed */
+	kmem_cache_t		*cache;
+};
+
+struct swapped_entry_t {
+	swp_entry_t		swp_entry;
+	struct list_head	swapped_list;
+};
+
+static struct swapped_root_t swapped = {
+	.busy 		= 0,
+	.list  		= LIST_HEAD_INIT(swapped.list),
+	.swap_tree	= RADIX_TREE_INIT(GFP_ATOMIC),
+	.count 		= 0,
+};
+
+static struct timer_list prefetch_timer;
+
+static DECLARE_WAIT_QUEUE_HEAD(kprefetchd_wait);
+
+static unsigned long mapped_limit;	/* Max mapped we will prefetch to */
+static unsigned long last_free = 0;	/* Last total free pages */
+static unsigned long temp_free = 0;
+
+/*
+ * Create kmem cache for swapped entries
+ */
+void __init prepare_prefetch(void)
+{
+	long total_memory = nr_free_pagecache_pages();
+
+	swapped.cache = kmem_cache_create("swapped_entry",
+		sizeof(struct swapped_entry_t), 0, 0, NULL, NULL);
+	if (unlikely(!swapped.cache))
+		panic("prepare_prefetch(): cannot create swapped_entry SLAB cache");
+
+	/* Set max number of entries to size of physical ram */
+	swapped.maxcount = total_memory;
+	/* Set maximum amount of mapped pages to prefetch to 2/3 ram */
+	mapped_limit = total_memory / 3 * 2;
+
+	spin_lock_init(&swapped.lock);
+}
+
+static inline void delay_prefetch_timer(void)
+{
+	mod_timer(&prefetch_timer, jiffies + PREFETCH_DELAY);
+}
+
+static inline void reset_prefetch_timer(void)
+{
+	mod_timer(&prefetch_timer, jiffies + PREFETCH_INTERVAL);
+}
+
+/*
+ * We check to see no part of the vm is busy. If it is this will interrupt
+ * trickle_swap and wait another PREFETCH_DELAY. Purposefully racy.
+ */
+void delay_prefetch(void)
+{
+	__set_bit(0, &swapped.busy);
+}
+
+/*
+ * Accounting is sloppy on purpose. As adding and removing entries from the
+ * list happens during swapping in and out we don't want to be spinning on
+ * locks. It is cheaper to just miss adding an entry since having a reference
+ * to every entry is not critical.
+ */
+void add_to_swapped_list(unsigned long index)
+{
+	struct swapped_entry_t *entry;
+	int error;
+
+	if (unlikely(!spin_trylock(&swapped.lock)))
+		goto out;
+
+	if (swapped.count >= swapped.maxcount) {
+		entry = list_entry(swapped.list.next,
+				struct swapped_entry_t, swapped_list);
+		radix_tree_delete(&swapped.swap_tree, entry->swp_entry.val);
+		list_del(&entry->swapped_list);
+		swapped.count--;
+	} else {
+		entry = kmem_cache_alloc(swapped.cache, GFP_ATOMIC);
+		if (unlikely(!entry))
+			/* bad, can't allocate more mem */
+			goto out_locked;
+	}
+
+	entry->swp_entry.val = index;
+
+	error = radix_tree_preload(GFP_ATOMIC);
+	if (likely(!error)) {
+		error = radix_tree_insert(&swapped.swap_tree, index, entry);
+		if (likely(!error)) {
+			/*
+			 * If this is the first entry the timer needs to be
+			 * (re)started
+			 */
+			if (list_empty(&swapped.list))
+				delay_prefetch_timer();
+			list_add(&entry->swapped_list, &swapped.list);
+			swapped.count++;
+		}
+		radix_tree_preload_end();
+	} else
+		kmem_cache_free(swapped.cache, entry);
+
+out_locked:
+	spin_unlock(&swapped.lock);
+out:
+	return;
+}
+
+/*
+ * Cheaper to not spin on the lock and remove the entry lazily via
+ * add_to_swap_cache when we hit it in trickle_swap_cache_async
+ */
+void remove_from_swapped_list(unsigned long index)
+{
+	struct swapped_entry_t *entry;
+	unsigned long flags;
+
+	if (unlikely(!spin_trylock_irqsave(&swapped.lock, flags)))
+		return;
+	entry = radix_tree_delete(&swapped.swap_tree, index);
+	if (likely(entry)) {
+		list_del_init(&entry->swapped_list);
+		swapped.count--;
+		kmem_cache_free(swapped.cache, entry);
+	}
+	spin_unlock_irqrestore(&swapped.lock, flags);
+}
+
+/*
+ * Find the zone with the most free pages, recheck the watermarks and
+ * then directly allocate the ram. We don't want prefetch to use
+ * __alloc_pages and go calling on reclaim.
+ */
+static struct page *prefetch_get_page(void)
+{
+	struct zone *zone = NULL, *z;
+	struct page *page = NULL;
+	long most_free = 0;
+
+	for_each_zone(z) {
+		long free;
+
+		if (z->present_pages == 0)
+			continue;
+
+		free = z->free_pages;
+
+		/* We don't prefetch into DMA */
+		if (zone_idx(z) == ZONE_DMA)
+			continue;
+
+		/* Select the zone with the most free ram */
+		if (free > most_free) {
+			most_free = free;
+			zone = z;
+		}
+	}
+
+	if (zone == NULL)
+		goto out;
+
+	page = buffered_rmqueue(zone, 0, GFP_HIGHUSER);
+	if (likely(page)) {
+		struct zonelist *zonelist;
+
+		zonelist = NODE_DATA(numa_node_id())->node_zonelists +
+		(GFP_HIGHUSER & GFP_ZONEMASK);
+
+		zone_statistics(zonelist, zone);
+	}
+out:
+	return page;
+}
+
+/*
+ * This tries to read a swp_entry_t into swap cache for swap prefetching.
+ * Returns 1 on success, 0 on failure, or -1 on a failure after which we
+ * should delay further prefetching.
+ */
+static int trickle_swap_cache_async(swp_entry_t entry)
+{
+	struct page *page = NULL;
+	int ret = 0;
+
+	if (unlikely(!read_trylock(&swapper_space.tree_lock))) {
+		ret = -1;
+		goto out;
+	}
+	/* Entry may already exist */
+	page = radix_tree_lookup(&swapper_space.page_tree, entry.val);
+	read_unlock(&swapper_space.tree_lock);
+	if (page) {
+		remove_from_swapped_list(entry.val);
+		goto out;
+	}
+
+	/* Get a new page to read from swap */
+	page = prefetch_get_page();
+	if (unlikely(!page)) {
+		ret = -1;
+		goto out;
+	}
+
+	if (add_to_swap_cache(page, entry))
+		/* Failed to add to swap cache */
+		goto out_release;
+
+	lru_cache_add(page);
+	if (unlikely(swap_readpage(NULL, page))) {
+		ret = -1;
+		goto out_release;
+	}
+
+	ret = 1;
+out_release:
+	page_cache_release(page);
+out:
+	return ret;
+}
+
+/*
+ * We want to be absolutely certain it's ok to start prefetching.
+ */
+static int prefetch_suitable(void)
+{
+	struct page_state ps;
+	unsigned long pending_writes, limit;
+	struct zone *z;
+	int ret = 0;
+
+	/* Purposefully racy and might return false positive which is ok */
+	if (__test_and_clear_bit(0, &swapped.busy))
+		goto out;
+
+	temp_free = 0;
+	/*
+	 * Have some hysteresis between where page reclaiming and prefetching
+	 * will occur to prevent ping-ponging between them.
+	 */
+	for_each_zone(z) {
+		unsigned long free;
+
+		if (z->present_pages == 0)
+			continue;
+		free = z->free_pages;
+		if (z->pages_high * 3 > free)
+			goto out;
+		temp_free += free;
+	}
+
+	/*
+	 * We check to see that pages are not being allocated elsewhere
+	 * at any significant rate implying any degree of memory pressure
+	 * (eg during file reads)
+	 */
+	if (last_free) {
+		if (temp_free + SWAP_CLUSTER_MAX + PREFETCH_PAGES() <
+			last_free) {
+				last_free = temp_free;
+				goto out;
+		}
+	} else
+		last_free = temp_free;
+
+	get_page_state(&ps);
+
+	/* We shouldn't prefetch when we are doing writeback */
+	if (ps.nr_writeback)
+		goto out;
+
+	/* Delay prefetching if we have significant amounts of dirty data */
+	pending_writes = ps.nr_dirty + ps.nr_unstable;
+	if (pending_writes > SWAP_CLUSTER_MAX)
+		goto out;
+
+	/* >2/3 of the ram is mapped, we need some free for pagecache */
+	limit = ps.nr_mapped + ps.nr_slab + pending_writes;
+	if (limit > mapped_limit)
+		goto out;
+
+	/*
+	 * Add swapcache to limit as well, but check this last since it needs
+	 * locking
+	 */
+	if (unlikely(!read_trylock(&swapper_space.tree_lock)))
+		goto out;
+	limit += total_swapcache_pages;
+	read_unlock(&swapper_space.tree_lock);
+	if (limit > mapped_limit)
+		goto out;
+
+	/* Survived all that? Hooray we can prefetch! */
+	ret = 1;
+out:
+	return ret;
+}
+
+/*
+ * trickle_swap is the main function that initiates the swap prefetching. It
+ * first checks to see if the busy flag is set, and does not prefetch if it
+ * is, as the flag implied we are low on memory or swapping in currently.
+ * Otherwise it runs till PREFETCH_PAGES() are prefetched.
+ * This function returns 1 if it succeeds in a cycle of prefetching, 0 if it
+ * is interrupted or -1 if there is nothing left to prefetch.
+ */
+static int trickle_swap(void)
+{
+	int ret = 0, pages = 0;
+	struct swapped_entry_t *entry;
+
+	while (pages < PREFETCH_PAGES()) {
+		int got_page;
+
+		if (!prefetch_suitable())
+			goto out;
+		/* Lock is held? We must be busy elsewhere */
+		if (unlikely(!spin_trylock(&swapped.lock)))
+			goto out;
+		if (list_empty(&swapped.list)) {
+			spin_unlock(&swapped.lock);
+			ret = -1;
+			goto out;
+		}
+		entry = list_entry(swapped.list.next,
+			struct swapped_entry_t, swapped_list);
+		spin_unlock(&swapped.lock);
+
+		got_page = trickle_swap_cache_async(entry->swp_entry);
+		if (unlikely(got_page == -1))
+			goto out;
+		pages += got_page;
+	}
+	ret = 1;
+
+out:
+	if (pages)
+		lru_add_drain();
+	return ret;
+}
+
+static int kprefetchd(void *data)
+{
+	DEFINE_WAIT(wait);
+
+	daemonize("kprefetchd");
+	set_user_nice(current, 19);
+	/* Set ioprio to lowest if supported by i/o scheduler */
+	sys_ioprio_set(IOPRIO_WHO_PROCESS, 0, IOPRIO_CLASS_IDLE);
+
+	for ( ; ; ) {
+		int prefetched;
+
+		try_to_freeze();
+		prepare_to_wait(&kprefetchd_wait, &wait, TASK_INTERRUPTIBLE);
+		schedule();
+		finish_wait(&kprefetchd_wait, &wait);
+
+		/* If trickle_swap() returns -1 the timer is not reset */
+		prefetched = trickle_swap();
+		if (prefetched == 1) {
+			last_free = temp_free;
+			reset_prefetch_timer();
+		} else {
+			last_free = 0;
+			if (!prefetched)
+				delay_prefetch_timer();
+		}
+	}
+	return 0;
+}
+
+/*
+ * Wake up kprefetchd. It will reset the timer itself appropriately so no
+ * need to do it here
+ */
+static void prefetch_wakeup(unsigned long data)
+{
+	if (waitqueue_active(&kprefetchd_wait))
+		wake_up_interruptible(&kprefetchd_wait);
+}
+
+static int __init kprefetchd_init(void)
+{
+	/*
+	 * Prepare the prefetch timer. It is inactive until entries are placed
+	 * on the swapped_list
+	 */
+	init_timer(&prefetch_timer);
+	prefetch_timer.data = 0;
+	prefetch_timer.function = prefetch_wakeup;
+
+	kernel_thread(kprefetchd, NULL, CLONE_KERNEL);
+
+	return 0;
+}
+
+module_init(kprefetchd_init)
Index: linux-2.6.13-sp/mm/swap_state.c
===================================================================
--- linux-2.6.13-sp.orig/mm/swap_state.c	2005-10-06 23:18:57.000000000 +1000
+++ linux-2.6.13-sp/mm/swap_state.c	2005-10-06 23:19:49.000000000 +1000
@@ -80,6 +80,7 @@ static int __add_to_swap_cache(struct pa
 		error = radix_tree_insert(&swapper_space.page_tree,
 						entry.val, page);
 		if (!error) {
+			remove_from_swapped_list(entry.val);
 			page_cache_get(page);
 			SetPageLocked(page);
 			SetPageSwapCache(page);
@@ -93,11 +94,12 @@ static int __add_to_swap_cache(struct pa
 	return error;
 }
 
-static int add_to_swap_cache(struct page *page, swp_entry_t entry)
+int add_to_swap_cache(struct page *page, swp_entry_t entry)
 {
 	int error;
 
 	if (!swap_duplicate(entry)) {
+		remove_from_swapped_list(entry.val);
 		INC_CACHE_INFO(noent_race);
 		return -ENOENT;
 	}
@@ -145,6 +147,9 @@ int add_to_swap(struct page * page)
 	swp_entry_t entry;
 	int err;
 
+	/* Swap prefetching is delayed if we're swapping pages */
+	delay_prefetch();
+
 	if (!PageLocked(page))
 		BUG();
 
@@ -325,6 +330,9 @@ struct page *read_swap_cache_async(swp_e
 	struct page *found_page, *new_page = NULL;
 	int err;
 
+	/* Swap prefetching is delayed if we're already reading from swap */
+	delay_prefetch();
+
 	do {
 		/*
 		 * First check the swap cache.  Since this is normally
Index: linux-2.6.13-sp/mm/vmscan.c
===================================================================
--- linux-2.6.13-sp.orig/mm/vmscan.c	2005-10-06 23:18:57.000000000 +1000
+++ linux-2.6.13-sp/mm/vmscan.c	2005-10-06 23:19:49.000000000 +1000
@@ -519,6 +519,7 @@ static int shrink_list(struct list_head 
 #ifdef CONFIG_SWAP
 		if (PageSwapCache(page)) {
 			swp_entry_t swap = { .val = page->private };
+			add_to_swapped_list(swap.val);
 			__delete_from_swap_cache(page);
 			write_unlock_irq(&mapping->tree_lock);
 			swap_free(swap);
@@ -929,6 +930,8 @@ int try_to_free_pages(struct zone **zone
 	unsigned long lru_pages = 0;
 	int i;
 
+	delay_prefetch();
+
 	sc.gfp_mask = gfp_mask;
 	sc.may_writepage = 0;
 	sc.may_swap = 1;
@@ -1275,6 +1278,8 @@ int shrink_all_memory(int nr_pages)
 		.reclaimed_slab = 0,
 	};
 
+	delay_prefetch();
+
 	current->reclaim_state = &reclaim_state;
 	for_each_pgdat(pgdat) {
 		int freed;


* [PATCH] vm - swap_prefetch-15 docs
  2005-10-06 14:01 [PATCH] vm - swap_prefetch-15 Con Kolivas
@ 2005-10-06 14:13 ` Con Kolivas
  2005-10-07 10:03 ` [PATCH] vm - swap_prefetch-15 Pekka Enberg
  2005-10-07 11:46 ` Paolo Ciarrocchi
  2 siblings, 0 replies; 14+ messages in thread
From: Con Kolivas @ 2005-10-06 14:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: ck

[-- Attachment #1: Type: text/plain, Size: 600 bytes --]

On Fri, 7 Oct 2005 12:01 am, Con Kolivas wrote:
> The last known bugs were addressed in this latest version of the swap
> prefetching patch. Thanks to the testers out there who helped it get this
> far.
>
> -Prefetched pages weren't handled properly by the lru lists.
> -Prefetch groups are now 10 times larger when laptop_mode is enabled thus
> decreasing the amount of time spent prefetching and thus the disk spinning.
> -Documentation as suggested by Ingo Oeser
>
> Incremental patches and latest available here:
> http://ck.kolivas.org/patches/swap-prefetch/

And the docs...

Cheers,
Con
---



[-- Attachment #2: sp15_docs.patch --]
[-- Type: text/x-diff, Size: 1118 bytes --]

Index: linux-2.6.13-ck7/Documentation/sysctl/vm.txt
===================================================================
--- linux-2.6.13-ck7.orig/Documentation/sysctl/vm.txt	2005-03-02 18:38:17.000000000 +1100
+++ linux-2.6.13-ck7/Documentation/sysctl/vm.txt	2005-10-06 23:10:54.000000000 +1000
@@ -26,6 +26,7 @@ Currently, these files are in /proc/sys/
 - min_free_kbytes
 - laptop_mode
 - block_dump
+- swap_prefetch
 
 ==============================================================
 
@@ -102,3 +103,14 @@ This is used to force the Linux VM to ke
 of kilobytes free.  The VM uses this number to compute a pages_min
 value for each lowmem zone in the system.  Each lowmem zone gets 
 a number of reserved free pages based proportionally on its size.
+
+==============================================================
+
+swap_prefetch
+
+This is the amount of data prefetched per prefetching interval when
+swap prefetching is compiled in. The value means multiples of 128K,
+except when laptop_mode is enabled and then it is ten times larger.
+Setting it to 0 disables prefetching entirely.
+
+The default value is 2.


* Re: [PATCH] vm - swap_prefetch-15
  2005-10-06 14:01 [PATCH] vm - swap_prefetch-15 Con Kolivas
  2005-10-06 14:13 ` [PATCH] vm - swap_prefetch-15 docs Con Kolivas
@ 2005-10-07 10:03 ` Pekka Enberg
  2005-10-07 10:54   ` Con Kolivas
  2005-10-07 11:46 ` Paolo Ciarrocchi
  2 siblings, 1 reply; 14+ messages in thread
From: Pekka Enberg @ 2005-10-07 10:03 UTC (permalink / raw)
  To: Con Kolivas; +Cc: linux-kernel, ck

Hi Con,

A teeny-weeny nitpick:

On 10/6/05, Con Kolivas <kernel@kolivas.org> wrote:
> +struct swapped_root_t {

[snip]

> +struct swapped_entry_t {

[snip]

Since these are not typedefs, please drop the _t postfix.

                                  Pekka


* Re: [PATCH] vm - swap_prefetch-15
  2005-10-07 10:03 ` [PATCH] vm - swap_prefetch-15 Pekka Enberg
@ 2005-10-07 10:54   ` Con Kolivas
  2005-10-07 11:31     ` Pekka Enberg
  0 siblings, 1 reply; 14+ messages in thread
From: Con Kolivas @ 2005-10-07 10:54 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-kernel, ck

On Fri, 7 Oct 2005 20:03, Pekka Enberg wrote:
> Hi Con,
>
> A teeny-weeny nitpick:
>
> On 10/6/05, Con Kolivas <kernel@kolivas.org> wrote:
> > +struct swapped_root_t {
>
> [snip]
>
> > +struct swapped_entry_t {
>
> [snip]
>
> Since these are not typedefs, please drop the _t postfix.

Good point, thanks! Any and all feedback is appreciated.

Cheers,
Con


* Re: [PATCH] vm - swap_prefetch-15
  2005-10-07 10:54   ` Con Kolivas
@ 2005-10-07 11:31     ` Pekka Enberg
  2005-10-07 12:08       ` Con Kolivas
  2005-10-07 14:44       ` [ck] " Gustavo Barbieri
  0 siblings, 2 replies; 14+ messages in thread
From: Pekka Enberg @ 2005-10-07 11:31 UTC (permalink / raw)
  To: Con Kolivas; +Cc: linux-kernel, ck

Hi,

On 10/7/05, Con Kolivas <kernel@kolivas.org> wrote:
> Good point, thanks! Any and all feedback is appreciated.

Well, since you asked :-)

> +/*
> + * How many pages to prefetch at a time. We prefetch SWAP_CLUSTER_MAX *
> + * swap_prefetch per PREFETCH_INTERVAL, but prefetch ten times as much at a
> + * time in laptop_mode to minimise the time we keep the disk spinning.
> + */
> +#define PREFETCH_PAGES()     (SWAP_CLUSTER_MAX * swap_prefetch * \
> +                                     (1 + 9 * laptop_mode))

This looks strange. Please either drop the parenthesis from PREFETCH_PAGES or
make it a real static inline function.

> +/*
> + * Find the zone with the most free pages, recheck the watermarks and
> + * then directly allocate the ram. We don't want prefetch to use
> + * __alloc_pages and go calling on reclaim.
> + */
> +static struct page *prefetch_get_page(void)
> +{

Should this be put in mm/page_alloc.c? It is, after all, a special-purpose
page allocator. That way you wouldn't have to export zone_statistics and
buffered_rmqueue.

> +/*
> + * trickle_swap is the main function that initiates the swap prefetching. It
> + * first checks to see if the busy flag is set, and does not prefetch if it
> + * is, as the flag implied we are low on memory or swapping in currently.
> + * Otherwise it runs till PREFETCH_PAGES() are prefetched.
> + * This function returns 1 if it succeeds in a cycle of prefetching, 0 if it
> + * is interrupted or -1 if there is nothing left to prefetch.
> + */
> +static int trickle_swap(void)
> +{

This could perhaps use a three-state enum as return value. I find return value
checks in kprefetchd() slightly confusing.
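
A minimal sketch of what such an enum could look like, for illustration only.
All names here are hypothetical and not from the posted patch, which returns
plain 1, 0 and -1; the mapping to timer actions follows what kprefetchd() does
with those values.

```c
/* Hypothetical replacement for trickle_swap()'s int return values. */
enum trickle_return {
	TRICKLE_SUCCESS,	/* a full cycle was prefetched (was 1) */
	TRICKLE_DELAY,		/* interrupted, vm busy (was 0) */
	TRICKLE_FAILED,		/* nothing left to prefetch (was -1) */
};

/* What the caller should do with the prefetch timer afterwards. */
enum timer_action {
	TIMER_RESET,	/* rearm after PREFETCH_INTERVAL */
	TIMER_DELAY,	/* rearm after PREFETCH_DELAY */
	TIMER_NONE,	/* leave the timer stopped */
};

static enum timer_action next_timer_action(enum trickle_return ret)
{
	switch (ret) {
	case TRICKLE_SUCCESS:
		return TIMER_RESET;
	case TRICKLE_DELAY:
		return TIMER_DELAY;
	case TRICKLE_FAILED:
	default:
		return TIMER_NONE;
	}
}
```

With named states the checks in the caller read as a switch on the result
rather than comparisons against 1 and 0.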

                                Pekka


* Re: [PATCH] vm - swap_prefetch-15
  2005-10-06 14:01 [PATCH] vm - swap_prefetch-15 Con Kolivas
  2005-10-06 14:13 ` [PATCH] vm - swap_prefetch-15 docs Con Kolivas
  2005-10-07 10:03 ` [PATCH] vm - swap_prefetch-15 Pekka Enberg
@ 2005-10-07 11:46 ` Paolo Ciarrocchi
  2005-10-07 12:18   ` Con Kolivas
  2 siblings, 1 reply; 14+ messages in thread
From: Paolo Ciarrocchi @ 2005-10-07 11:46 UTC (permalink / raw)
  To: Con Kolivas; +Cc: linux-kernel, ck

On 10/6/05, Con Kolivas <kernel@kolivas.org> wrote:
> The last known bugs were addressed in this latest version of the swap
> prefetching patch. Thanks to the testers out there who helped it get this
> far.
>
> -Prefetched pages weren't handled properly by the lru lists.
> -Prefetch groups are now 10 times larger when laptop_mode is enabled thus
> decreasing the amount of time spent prefetching and thus the disk spinning.
> -Documentation as suggested by Ingo Oeser
>
> Incremental patches and latest available here:
> http://ck.kolivas.org/patches/swap-prefetch/
>

Ciao Con,
I'm downloading kernel 2.6.14-rc3 and your latest patch (v15) right now.
Over the weekend I'll update my Ubuntu platform and I'd like to compare
the performance of vanilla vs vm-swap_prefetch.

Any hint about what kind of instrumentation I could use in order to
get interesting and useful numbers?

Thanks!

Regards,
--
Paolo
http://technologynews.altervista.org


* Re: [PATCH] vm - swap_prefetch-15
  2005-10-07 11:31     ` Pekka Enberg
@ 2005-10-07 12:08       ` Con Kolivas
  2005-10-07 12:26         ` Pekka J Enberg
  2005-10-07 14:44       ` [ck] " Gustavo Barbieri
  1 sibling, 1 reply; 14+ messages in thread
From: Con Kolivas @ 2005-10-07 12:08 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: linux-kernel, ck

On Fri, 7 Oct 2005 21:31, Pekka Enberg wrote:
> Hi,
>
> On 10/7/05, Con Kolivas <kernel@kolivas.org> wrote:
> > Good point, thanks! Any and all feedback is appreciated.
>
> Well, since you asked :-)
>
> > +/*
> > + * How many pages to prefetch at a time. We prefetch SWAP_CLUSTER_MAX *
> > + * swap_prefetch per PREFETCH_INTERVAL, but prefetch ten times as much at a
> > + * time in laptop_mode to minimise the time we keep the disk spinning.
> > + */
> > +#define PREFETCH_PAGES()     (SWAP_CLUSTER_MAX * swap_prefetch * \
> > +                                     (1 + 9 * laptop_mode))
>
> This looks strange. Please either drop the parentheses from PREFETCH_PAGES
> or make it a real static inline function.

I have seen this sort of macro style before in the kernel, where () just makes
it clear that it is a function, but a real static inline is ok with me.
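
For reference, a minimal userspace sketch of what the static inline version
could look like. swap_prefetch and laptop_mode are plain ints standing in for
the kernel sysctls, SWAP_CLUSTER_MAX is given the value 32 it has in
include/linux/swap.h, and the name prefetch_pages() is only a suggestion, not
something taken from the patch.

```c
#include <assert.h>

#define SWAP_CLUSTER_MAX 32	/* value from include/linux/swap.h */

/* Stand-ins for the kernel sysctls; both may change at runtime. */
static int swap_prefetch = 2;
static int laptop_mode;

/*
 * How many pages to prefetch at a time: same arithmetic as the
 * PREFETCH_PAGES() macro, but type checked and re-evaluated on every call.
 */
static inline unsigned long prefetch_pages(void)
{
	assert(swap_prefetch >= 0 && laptop_mode >= 0);
	return (unsigned long)SWAP_CLUSTER_MAX * swap_prefetch *
		(1 + 9 * laptop_mode);
}
```

With the defaults above this gives 64 pages per interval, and 640 once
laptop_mode is set to 1.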

> > +/*
> > + * Find the zone with the most free pages, recheck the watermarks and
> > + * then directly allocate the ram. We don't want prefetch to use
> > + * __alloc_pages and go calling on reclaim.
> > + */
> > +static struct page *prefetch_get_page(void)
> > +{
>
> Should this be put in mm/page_alloc.c? It is, after all, a special-purpose
> page allocator. That way you wouldn't have to export zone_statistics and
> buffered_rmqueue.

Makes sense, but it is only used in the CONFIG_SWAP_PREFETCH case, so it would
end up as a static inline in swap.h to avoid ending up #ifdefed in
page_alloc.c. Do you think that's preferable to having it in
swap_prefetch.c?

>
> > +/*
> > + * trickle_swap is the main function that initiates the swap prefetching. It
> > + * first checks to see if the busy flag is set, and does not prefetch if it
> > + * is, as the flag implied we are low on memory or swapping in currently.
> > + * Otherwise it runs till PREFETCH_PAGES() are prefetched.
> > + * This function returns 1 if it succeeds in a cycle of prefetching, 0 if it
> > + * is interrupted or -1 if there is nothing left to prefetch.
> > + */
> > +static int trickle_swap(void)
> > +{
>
> This could perhaps use a three-state enum as return value. I find return
> value checks in kprefetchd() slightly confusing.

Good idea.
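
Something along these lines, perhaps. This is only a userspace sketch: the
enum and its member names are hypothetical, vm_busy and entries_left are
stand-ins for the real busy flag and the swapped entry list, and
prefetch_until_done() merely imitates the return value checks in
kprefetchd().

```c
#include <assert.h>

/* Hypothetical names for the three outcomes of one trickle_swap() cycle. */
enum trickle_return {
	TRICKLE_SUCCESS,	/* a full cycle was prefetched (was 1) */
	TRICKLE_DELAY,		/* interrupted, the vm got busy (was 0) */
	TRICKLE_FAILED,		/* nothing left to prefetch (was -1) */
};

/* Stand-in state so the sketch compiles outside the kernel. */
static int vm_busy;
static int entries_left = 3;

static enum trickle_return trickle_swap(void)
{
	assert(entries_left >= 0);
	if (vm_busy)
		return TRICKLE_DELAY;
	if (!entries_left)
		return TRICKLE_FAILED;
	entries_left--;		/* pretend one group was prefetched */
	return TRICKLE_SUCCESS;
}

/*
 * What the caller could look like: keep prefetching while cycles succeed
 * and return how many succeeded before the thread should sleep again.
 */
static int prefetch_until_done(void)
{
	int cycles = 0;

	for (;;) {
		switch (trickle_swap()) {
		case TRICKLE_SUCCESS:
			cycles++;
			break;
		case TRICKLE_DELAY:
		case TRICKLE_FAILED:
			return cycles;
		}
	}
}
```

The switch makes it obvious at the call site which outcome is being handled,
and the compiler can warn if a case is ever left out.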

Thanks!
Con


* Re: [PATCH] vm - swap_prefetch-15
  2005-10-07 11:46 ` Paolo Ciarrocchi
@ 2005-10-07 12:18   ` Con Kolivas
  0 siblings, 0 replies; 14+ messages in thread
From: Con Kolivas @ 2005-10-07 12:18 UTC (permalink / raw)
  To: Paolo Ciarrocchi; +Cc: linux-kernel, ck

On Fri, 7 Oct 2005 21:46, Paolo Ciarrocchi wrote:
> On 10/6/05, Con Kolivas <kernel@kolivas.org> wrote:
> > The last known bugs were addressed in this latest version of the swap
> > prefetching patch. Thanks to the testers out there who helped it get this
> > far.
> >
> > -Prefetched pages weren't handled properly by the lru lists.
> > -Prefetch groups are now 10 times larger when laptop_mode is enabled thus
> > > decreasing the amount of time spent prefetching and thus the disk
> > > spinning.
> > > -Documentation as suggested by Ingo Oeser
> >
> > Incremental patches and latest available here:
> > http://ck.kolivas.org/patches/swap-prefetch/
>
> Ciao Con,
> I'm downloading kernel 2.6.14-rc3 and your latest patch (v15) right now.
> Over the weekend I'll update my Ubuntu platform and I'd like to compare
> the performance of vanilla vs vm-swap_prefetch.

Great!

> Any hint about what kind of instrumentation I could use in order to
> get interesting and useful numbers ?

To get some useful advantage from it you have to have a workload that swaps on 
your hardware in the first place. I have a simple mechanism to induce it 
reliably where I open a few large applications concurrently - browser and 
office suite come to mind, then create my swap load:

tail -f /dev/zero

works real nice. Either let it run to completion and hope that tail gets
oom-killed, or ctrl-c it after you've used up half your swapspace.
Then I usually let vmstat 1 run in the background. Leave the machine for 5 or
10 minutes, then take a simple wallclock time from the moment you click on the
application till it is available. Try changing swap_prefetch from 2 to 0
in /proc/sys/vm/swap_prefetch to compare the difference. This is a very
coarse and somewhat contrived example, but even in real world settings where
you hit swapspace during normal usage, it is slowly trickling in in the
background and I find it makes a noticeable difference.

If you watch vmstat with swap_prefetch enabled, you'll notice it doing SI of 
256KB (at default setting of swap_prefetch==2) every second when the machine 
is very idle. Then after all the pages have been prefetched, when you click 
on the application you shouldn't see anything in SI at all implying no swap 
in. You can see how many entries are currently in the prefetch swaplist 
easily enough with
cat /proc/slabinfo | grep swapped_entry

where the first numeric column of active objects shows the number of pages in 
the list (not all of them will result in a swapped in page).
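
If you would rather log that number from a small test program than eyeball
it, here is a minimal sketch of a helper that pulls the active object count
out of one /proc/slabinfo line. It assumes only the usual
"<name> <active_objs> <num_objs> ..." layout; the helper name is made up for
this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/slabinfo line of the form
 *   <name> <active_objs> <num_objs> ...
 * Returns 1 and stores active_objs if the line is for `cache`, else 0.
 * Header lines do not parse as "<name> <number>" and are rejected.
 */
static int slab_active_objs(const char *line, const char *cache,
			    unsigned long *active)
{
	char name[64];
	unsigned long objs;

	assert(line && cache && active);
	if (sscanf(line, "%63s %lu", name, &objs) != 2)
		return 0;
	if (strcmp(name, cache) != 0)
		return 0;
	*active = objs;
	return 1;
}
```

Feed it each line read with fgets() from /proc/slabinfo, with "swapped_entry"
as the cache name, once before and once after the idle period to see how far
the list has drained.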

Cheers,
Con


* Re: [PATCH] vm - swap_prefetch-15
  2005-10-07 12:08       ` Con Kolivas
@ 2005-10-07 12:26         ` Pekka J Enberg
  2005-10-07 12:33           ` Con Kolivas
  0 siblings, 1 reply; 14+ messages in thread
From: Pekka J Enberg @ 2005-10-07 12:26 UTC (permalink / raw)
  To: Con Kolivas; +Cc: linux-kernel, ck

On Fri, 7 Oct 2005, Con Kolivas wrote:
> Makes sense but it is only used in the CONFIG_SWAP_PREFETCH case so it would
> end up as a static inline in swap.h to avoid ending up #ifdefed in
> page_alloc.c. Do you think that's preferable to having it in
> swap_prefetch.c?

But then you would still have to open up buffered_rmqueue() and 
zone_statistics() to everyone, no? How about you implement a new gfp flag 
__GFP_NEVER_RECLAIM similar to __GFP_NORECLAIM instead so you don't have 
to duplicate __page_alloc()?

				Pekka


* Re: [PATCH] vm - swap_prefetch-15
  2005-10-07 12:26         ` Pekka J Enberg
@ 2005-10-07 12:33           ` Con Kolivas
  2005-10-07 12:49             ` Pekka J Enberg
  0 siblings, 1 reply; 14+ messages in thread
From: Con Kolivas @ 2005-10-07 12:33 UTC (permalink / raw)
  To: Pekka J Enberg; +Cc: linux-kernel, ck

On Fri, 7 Oct 2005 22:26, Pekka J Enberg wrote:
> On Fri, 7 Oct 2005, Con Kolivas wrote:
> > Makes sense but it is only used in the CONFIG_SWAP_PREFETCH case so it
> > would end up as a static inline in swap.h to avoid ending up #ifdefed
> > in page_alloc.c. Do you think that's preferable to having it in
> > swap_prefetch.c?
>
> But then you would still have to open up buffered_rmqueue() and
> zone_statistics() to everyone, no? 

bah of course..  /me slaps forehead

> How about you implement a new gfp flag 
> __GFP_NEVER_RECLAIM similar to __GFP_NORECLAIM instead so you don't have
> to duplicate __alloc_pages()?

That will end up being far more intrusive than this version. __alloc_pages
would need more tests that affect every call to __alloc_pages, which seems
much more expensive to me than exporting buffered_rmqueue and
zone_statistics, and the modified __alloc_pages will still be a much more
complicated function than prefetch_get_page.
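
To show how little prefetch_get_page needs to do, here is a rough userspace
sketch of just the zone selection step. struct zone_sketch is a simplified
stand-in for struct zone, the 3 * pages_high threshold is the one from the
patch description, and the real function would go on to take the page from
the chosen zone rather than return the zone itself.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the fields of struct zone used here. */
struct zone_sketch {
	unsigned long free_pages;
	unsigned long pages_high;
};

/*
 * Return the zone with the most free pages that still has at least
 * 3 * pages_high free, or NULL if none qualifies. No reclaim is ever
 * triggered; the caller simply skips prefetching on NULL.
 */
static struct zone_sketch *prefetch_pick_zone(struct zone_sketch *zones,
					      size_t nr_zones)
{
	struct zone_sketch *best = NULL;
	size_t i;

	assert(zones || !nr_zones);
	for (i = 0; i < nr_zones; i++) {
		struct zone_sketch *z = &zones[i];

		if (z->free_pages < 3 * z->pages_high)
			continue;
		if (!best || z->free_pages > best->free_pages)
			best = z;
	}
	return best;
}
```

None of this touches the hot path of __alloc_pages, which is the point I am
trying to make.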

Thanks,
Con


* Re: [PATCH] vm - swap_prefetch-15
  2005-10-07 12:33           ` Con Kolivas
@ 2005-10-07 12:49             ` Pekka J Enberg
  0 siblings, 0 replies; 14+ messages in thread
From: Pekka J Enberg @ 2005-10-07 12:49 UTC (permalink / raw)
  To: Con Kolivas; +Cc: linux-kernel, ck

On Fri, 7 Oct 2005, Con Kolivas wrote:
> That will end up being far more intrusive than this version and __alloc_pages 
> would need more tests that affect every call to __alloc_pages which seems 
> much more expensive to me than exporting buffered_rmqueue and 
> zone_statistics, and the modified __alloc_pages will still be a much more 
> complicated function than prefetch_get_page. 

Short-term, perhaps. However, what you are doing is inventing your own 
page allocator which, I suspect, is more expensive in the long term.

Up to you of course and I am probably the wrong person to talk to about
this. Nevertheless, here's a totally untested patch to do it.

			Pekka

Index: 2.6/include/linux/gfp.h
===================================================================
--- 2.6.orig/include/linux/gfp.h
+++ 2.6/include/linux/gfp.h
@@ -41,6 +41,7 @@ struct vm_area_struct;
 #define __GFP_NOMEMALLOC 0x10000u /* Don't use emergency reserves */
 #define __GFP_NORECLAIM  0x20000u /* No realy zone reclaim during allocation */
 #define __GFP_HARDWALL   0x40000u /* Enforce hardwall cpuset memory allocs */
+#define __GFP_NEVER_RECLAIM 0x80000u /* Never attempt to reclaim */
 
 #define __GFP_BITS_SHIFT 20	/* Room for 20 __GFP_FOO bits */
 #define __GFP_BITS_MASK ((1 << __GFP_BITS_SHIFT) - 1)
Index: 2.6/mm/page_alloc.c
===================================================================
--- 2.6.orig/mm/page_alloc.c
+++ 2.6/mm/page_alloc.c
@@ -778,6 +778,7 @@ __alloc_pages(unsigned int __nocast gfp_
 		struct zonelist *zonelist)
 {
 	const int wait = gfp_mask & __GFP_WAIT;
+	const int can_reclaim = !(gfp_mask & __GFP_NEVER_RECLAIM);
 	struct zone **zones, *z;
 	struct page *page;
 	struct reclaim_state reclaim_state;
@@ -812,7 +813,7 @@ restart:
 	 * See also cpuset_zone_allowed() comment in kernel/cpuset.c.
 	 */
 	for (i = 0; (z = zones[i]) != NULL; i++) {
-		int do_reclaim = should_reclaim_zone(z, gfp_mask);
+		int do_reclaim = can_reclaim && should_reclaim_zone(z, gfp_mask);
 
 		if (!cpuset_zone_allowed(z, __GFP_HARDWALL))
 			continue;
@@ -840,6 +841,9 @@ zone_reclaim_retry:
 			goto got_pg;
 	}
 
+	if (unlikely(!can_reclaim))
+		goto out;
+
 	for (i = 0; (z = zones[i]) != NULL; i++)
 		wakeup_kswapd(z, order);
 
@@ -966,6 +970,7 @@ nopage:
 		dump_stack();
 		show_mem();
 	}
+out:
 	return NULL;
 got_pg:
 	zone_statistics(zonelist, z);
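
The flag test itself can be tried outside the kernel. The following is only a
userspace sketch: the flag values are copied from the patch above, and
gfp_can_reclaim() is a hypothetical helper that mirrors the can_reclaim
expression added to __alloc_pages().

```c
#include <assert.h>

/* Flag values copied from the patch above. */
#define __GFP_NORECLAIM		0x20000u
#define __GFP_HARDWALL		0x40000u
#define __GFP_NEVER_RECLAIM	0x80000u
#define __GFP_BITS_SHIFT	20
#define __GFP_BITS_MASK		((1u << __GFP_BITS_SHIFT) - 1)

/* Mirrors the can_reclaim test added to __alloc_pages(). */
static int gfp_can_reclaim(unsigned int gfp_mask)
{
	/* Only bits below __GFP_BITS_SHIFT are valid gfp flags. */
	assert((gfp_mask & ~__GFP_BITS_MASK) == 0);
	return !(gfp_mask & __GFP_NEVER_RECLAIM);
}
```

0x80000 is bit 19, so the new flag still fits in the 20 bits reserved by
__GFP_BITS_SHIFT and does not overlap the neighbouring flags.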


* Re: [ck] Re: [PATCH] vm - swap_prefetch-15
  2005-10-07 11:31     ` Pekka Enberg
  2005-10-07 12:08       ` Con Kolivas
@ 2005-10-07 14:44       ` Gustavo Barbieri
  2005-10-07 18:28         ` Rudo Thomas
  1 sibling, 1 reply; 14+ messages in thread
From: Gustavo Barbieri @ 2005-10-07 14:44 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Con Kolivas, ck, linux-kernel

On 10/7/05, Pekka Enberg <penberg@cs.helsinki.fi> wrote:
> Hi,
>
> On 10/7/05, Con Kolivas <kernel@kolivas.org> wrote:
> > Good point, thanks! Any and all feedback is appreciated.
>
> Well, since you asked :-)
>
> > +/*
> > + * How many pages to prefetch at a time. We prefetch SWAP_CLUSTER_MAX *
> > + * swap_prefetch per PREFETCH_INTERVAL, but prefetch ten times as much at a
> > + * time in laptop_mode to minimise the time we keep the disk spinning.
> > + */
> > +#define PREFETCH_PAGES()     (SWAP_CLUSTER_MAX * swap_prefetch * \
> > +                                     (1 + 9 * laptop_mode))
>
> This looks strange. Please either drop the parentheses from PREFETCH_PAGES or
> make it a real static inline function.

Or make it a "const static" variable, so compiler will check types and
everything, but the symbol will not be present in the binary, causing
no overhead. So it could be:

const unsigned PREFETCH_PAGES = (SWAP_CLUSTER_MAX * swap_prefetch * \
        (1 + 9 * laptop_mode));

--
Gustavo Sverzut Barbieri
---------------------------------------
Computer Engineer 2001 - UNICAMP
GPSL - Grupo Pro Software Livre
Cell..: +55 (19) 9165 8010
Jabber: gsbarbieri@jabber.org
  ICQ#: 17249123
   MSN: barbieri@gmail.com
 Skype: gsbarbieri
   GPG: 0xB640E1A2 @ wwwkeys.pgp.net


* Re: Re: [PATCH] vm - swap_prefetch-15
  2005-10-07 14:44       ` [ck] " Gustavo Barbieri
@ 2005-10-07 18:28         ` Rudo Thomas
  2005-10-07 18:40           ` Gustavo Barbieri
  0 siblings, 1 reply; 14+ messages in thread
From: Rudo Thomas @ 2005-10-07 18:28 UTC (permalink / raw)
  To: Gustavo Barbieri; +Cc: Pekka Enberg, ck, linux-kernel

> Or make it a "const static" variable, so compiler will check types and
> everything, but the symbol will not be present in the binary, causing
> no overhead. So it could be:
> 
> const unsigned PREFETCH_PAGES = (SWAP_CLUSTER_MAX * swap_prefetch * \
>         (1 + 9 * laptop_mode));

This won't work, AFAICT. swap_prefetch and laptop_mode are variables,
but with the code above, they would be evaluated only once. And maybe
the compiler will reject that code immediately...
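
A small userspace sketch of the problem, with swap_prefetch and laptop_mode
as plain ints standing in for the sysctls and SWAP_CLUSTER_MAX assumed to be
32. At file scope C wants a constant expression as initializer, so the
declaration as posted should not even compile; the sketch therefore takes the
const snapshot inside a function to show the "evaluated only once" half.

```c
#include <assert.h>

#define SWAP_CLUSTER_MAX 32

static int swap_prefetch = 2;	/* stand-ins for the runtime sysctls */
static int laptop_mode;

/* Re-evaluated on every call, so it tracks the current sysctl values. */
static unsigned int prefetch_pages(void)
{
	assert(swap_prefetch >= 0 && laptop_mode >= 0);
	return SWAP_CLUSTER_MAX * swap_prefetch * (1 + 9 * laptop_mode);
}

/*
 * Returns 1 if a const snapshot taken before a sysctl change has gone
 * stale compared with a fresh evaluation.
 */
static int snapshot_goes_stale(void)
{
	const unsigned int snapshot = prefetch_pages();	/* evaluated once */

	laptop_mode = 1;	/* the admin flips the sysctl later */
	return snapshot != prefetch_pages();
}
```

The snapshot keeps the old value after laptop_mode changes, while the
function call picks up the new one, which is why a macro or an inline
function is needed here.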

Rudo.


* Re: Re: [PATCH] vm - swap_prefetch-15
  2005-10-07 18:28         ` Rudo Thomas
@ 2005-10-07 18:40           ` Gustavo Barbieri
  0 siblings, 0 replies; 14+ messages in thread
From: Gustavo Barbieri @ 2005-10-07 18:40 UTC (permalink / raw)
  To: Gustavo Barbieri, Pekka Enberg, ck, linux-kernel

On 10/7/05, Rudo Thomas <rudo@matfyz.cz> wrote:
> > Or make it a "const static" variable, so compiler will check types and
> > everything, but the symbol will not be present in the binary, causing
> > no overhead. So it could be:
> >
> > const unsigned PREFETCH_PAGES = (SWAP_CLUSTER_MAX * swap_prefetch * \
> >         (1 + 9 * laptop_mode));
>
> This won't work, AFAICT. swap_prefetch and laptop_mode are variables,
> but with the code above, they would be evaluated only once. And maybe
> the compiler will reject that code immediately...

Ah... yes, you're right! These may change in runtime.

--
Gustavo Sverzut Barbieri
---------------------------------------
Computer Engineer 2001 - UNICAMP
GPSL - Grupo Pro Software Livre
Cell..: +55 (19) 9165 8010
Jabber: gsbarbieri@jabber.org
  ICQ#: 17249123
   MSN: barbieri@gmail.com
 Skype: gsbarbieri
   GPG: 0xB640E1A2 @ wwwkeys.pgp.net


end of thread, other threads:[~2005-10-07 18:40 UTC | newest]

Thread overview: 14+ messages
2005-10-06 14:01 [PATCH] vm - swap_prefetch-15 Con Kolivas
2005-10-06 14:13 ` [PATCH] vm - swap_prefetch-15 docs Con Kolivas
2005-10-07 10:03 ` [PATCH] vm - swap_prefetch-15 Pekka Enberg
2005-10-07 10:54   ` Con Kolivas
2005-10-07 11:31     ` Pekka Enberg
2005-10-07 12:08       ` Con Kolivas
2005-10-07 12:26         ` Pekka J Enberg
2005-10-07 12:33           ` Con Kolivas
2005-10-07 12:49             ` Pekka J Enberg
2005-10-07 14:44       ` [ck] " Gustavo Barbieri
2005-10-07 18:28         ` Rudo Thomas
2005-10-07 18:40           ` Gustavo Barbieri
2005-10-07 11:46 ` Paolo Ciarrocchi
2005-10-07 12:18   ` Con Kolivas
