linux-mm.kvack.org archive mirror
* [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
@ 2008-11-17  2:20 Izik Eidus
  2008-11-20  7:44 ` Ryota OZAKI
  2008-11-28 12:57 ` Dmitri Monakhov
  0 siblings, 2 replies; 19+ messages in thread
From: Izik Eidus @ 2008-11-17  2:20 UTC
  To: akpm
  Cc: linux-kernel, linux-mm, kvm, aarcange, chrisw, avi, dlaor,
	kamezawa.hiroyu, cl, corbet, ieidus

(From v1 to v2 the main change is much more documentation)

KSM is a Linux driver that allows dynamically sharing identical memory
pages between one or more processes.

Unlike traditional page sharing, which is set up when the memory is
allocated, KSM does it dynamically after the memory has been created.
Memory is periodically scanned; identical pages are identified and
merged.
The sharing is unnoticeable to the processes that use this memory.
(The shared pages are marked read-only, and on a write do_wp_page()
takes care of creating a new copy of the page.)
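
To make the scan-and-merge idea concrete, here is a minimal user-space sketch
in C. It is a toy model, not the driver's code: the names (toy_frame,
toy_mapping, toy_frame_new, toy_scan_and_merge, toy_write) and the 64-byte
page size are made up for illustration, the comparison is a plain memcmp()
against every earlier page, and a reference count stands in for the read-only
mapping. A write to a shared frame makes a private copy first, which is the
role the text above gives to do_wp_page().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 64	/* hypothetical, tiny on purpose */

/* One "frame": page contents plus the number of slots that map it. */
struct toy_frame {
	unsigned char data[TOY_PAGE_SIZE];
	int refcount;
};

/* A "process view": slots[i] is the frame currently backing page i. */
struct toy_mapping {
	struct toy_frame **slots;
	size_t nslots;
};

/* Allocate a private frame filled with one byte value (NULL on failure). */
struct toy_frame *toy_frame_new(unsigned char fill)
{
	struct toy_frame *f = malloc(sizeof(*f));

	if (f) {
		memset(f->data, fill, TOY_PAGE_SIZE);
		f->refcount = 1;
	}
	return f;
}

/*
 * Scan pass: for each slot, look for an earlier slot with identical
 * contents and, if one exists, point both slots at the same frame.
 * Returns the number of frames freed by merging.
 */
size_t toy_scan_and_merge(struct toy_mapping *m)
{
	size_t freed = 0;

	for (size_t i = 1; i < m->nslots; i++) {
		for (size_t j = 0; j < i; j++) {
			struct toy_frame *a = m->slots[i];
			struct toy_frame *b = m->slots[j];

			if (a == b)
				break;		/* already shared */
			if (memcmp(a->data, b->data, TOY_PAGE_SIZE) != 0)
				continue;	/* contents differ */
			b->refcount++;		/* slot i now uses b */
			if (--a->refcount == 0) {
				free(a);
				freed++;
			}
			m->slots[i] = b;
			break;
		}
	}
	return freed;
}

/*
 * Write one byte. A shared frame is copied first so that the other
 * users keep seeing the old contents (copy on write).
 * Returns 0 on success, -1 if the private copy cannot be allocated.
 */
int toy_write(struct toy_mapping *m, size_t slot, size_t off, unsigned char v)
{
	struct toy_frame *f;

	assert(slot < m->nslots && off < TOY_PAGE_SIZE);
	f = m->slots[slot];
	if (f->refcount > 1) {
		struct toy_frame *copy = malloc(sizeof(*copy));

		if (!copy)
			return -1;
		memcpy(copy->data, f->data, TOY_PAGE_SIZE);
		copy->refcount = 1;
		f->refcount--;
		m->slots[slot] = copy;
		f = copy;
	}
	f->data[off] = v;
	return 0;
}
```

With three pages of which the first two hold the same bytes,
toy_scan_and_merge() frees one frame and leaves slots 0 and 1 pointing at the
same frame; a later toy_write() to slot 1 gives that slot a private copy and
leaves slot 0 unchanged.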

This driver is very useful for KVM when running multiple guest operating
systems of the same type.
(For desktop workloads we have achieved more than 2x memory overcommit,
closer to 3x.)

This driver has found users other than KVM, for example at CERN.
Fons Rademakers:
"on many-core machines we run one large detector simulation program per core.
These simulation programs are identical but run each in their own process and
need about 2 - 2.5 GB RAM.
We typically buy machines with 2GB RAM per core and so have a problem to run
one of these programs per core.
Of the 2 - 2.5 GB about 700MB is identical data in the form of magnetic field
maps, detector geometry, etc.
Currently people have been trying to start one program, initialize the geometry
and field maps and then fork it N times, to have the data shared.
With KSM this would be done automatically by the system so it sounded extremely
attractive when Andrea presented it."
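
The fork-based workaround mentioned in the quote can be sketched as follows.
This is only an illustration under assumptions: init_shared_data(), worker()
and run_workers() are hypothetical names, and a buffer filled with a constant
stands in for the field maps and geometry. Because the children only read the
buffer, they keep sharing the parent's pages after fork(); the drawback is
that the sharing has to be arranged by hand before the workers start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the expensive read-only data (field maps, geometry). */
unsigned char *init_shared_data(size_t len)
{
	unsigned char *buf = malloc(len);

	if (buf)
		memset(buf, 0x5a, len);	/* placeholder contents */
	return buf;
}

/* Worker body: only reads the data, so no page has to be copied. */
static int worker(const unsigned char *data, size_t len)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += data[i];
	return sum == 0x5aUL * len ? 0 : 1;
}

/*
 * Fork nworkers children after the data has been initialised.
 * Returns the number of children that exited with status 0,
 * or -1 if not all children could be started.
 */
int run_workers(const unsigned char *data, size_t len, int nworkers)
{
	int started = 0, ok = 0;

	assert(data != NULL);
	for (int i = 0; i < nworkers; i++) {
		pid_t pid = fork();

		if (pid == -1)
			break;
		if (pid == 0)
			_exit(worker(data, len));	/* child */
		started++;
	}
	/* Reap every child that was started, even after a failed fork(). */
	for (int i = 0; i < started; i++) {
		int status;

		if (wait(&status) > 0 && WIFEXITED(status) &&
		    WEXITSTATUS(status) == 0)
			ok++;
	}
	return started == nworkers ? ok : -1;
}
```

A caller would initialise the data once with init_shared_data() and then call
run_workers() with one worker per core.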

(We have already started testing KSM on their systems...)

KSM can run as a kernel thread, as a userspace application, or both.

Example of how to control the kernel thread:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include "ksm.h"

int main(int argc, char *argv[])
{
	int fd;
	int r;
	struct ksm_kthread_info info;

	if (argc < 2) {
		fprintf(stderr,
	"usage: %s {start npages_to_scan max_pages_to_merge sleep | stop | info}\n",
			argv[0]);
		exit(1);
	}

	fd = open("/dev/ksm", O_RDWR | O_TRUNC, (mode_t)0600);
	if (fd == -1) {
		fprintf(stderr, "could not open /dev/ksm\n");
		exit(1);
	}

	/* Start from a known state so no field is passed uninitialized. */
	memset(&info, 0, sizeof(info));

	/*
	 * Compare whole words: a prefix match would let "s" select both
	 * "start" and "stop".
	 */
	if (!strcmp(argv[1], "start")) {
		/* start takes three values: argv[2], argv[3] and argv[4] */
		if (argc < 5) {
			fprintf(stderr,
		    "usage: %s start npages_to_scan max_pages_to_merge sleep\n",
		    argv[0]);
			exit(1);
		}
		info.pages_to_scan = atoi(argv[2]);
		info.max_pages_to_merge = atoi(argv[3]);
		info.sleep = atoi(argv[4]);
		info.flags = ksm_control_flags_run;

		r = ioctl(fd, KSM_START_STOP_KTHREAD, &info);
		if (r == -1) {
			fprintf(stderr, "KSM_START_STOP_KTHREAD (start) failed\n");
			exit(1);
		}
		printf("created scanner\n");
	} else if (!strcmp(argv[1], "stop")) {
		info.flags = 0;
		r = ioctl(fd, KSM_START_STOP_KTHREAD, &info);
		if (r == -1) {
			fprintf(stderr, "KSM_START_STOP_KTHREAD (stop) failed\n");
			exit(1);
		}
		printf("stopped scanner\n");
	} else if (!strcmp(argv[1], "info")) {
		r = ioctl(fd, KSM_GET_INFO_KTHREAD, &info);
		if (r == -1) {
			fprintf(stderr, "KSM_GET_INFO_KTHREAD failed\n");
			exit(1);
		}
		printf("flags %d, pages_to_scan %d npages_merge %d, sleep_time %d\n",
		       info.flags, info.pages_to_scan,
		       info.max_pages_to_merge, info.sleep);
	} else {
		fprintf(stderr, "unknown command %s\n", argv[1]);
		close(fd);
		return 1;
	}

	close(fd);
	return 0;
}

Example of how to register qemu (or any userspace application) with ksm:

diff --git a/qemu/vl.c b/qemu/vl.c
index 4721fdd..7785bf9 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -21,6 +21,7 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
  * DEALINGS IN
  * THE SOFTWARE.
  */
+#include "ksm.h"
 #include "hw/hw.h"
 #include "hw/boards.h"
 #include "hw/usb.h"
@@ -5799,6 +5800,37 @@ static void termsig_setup(void)
 
 #endif
 
+int ksm_register_memory(void)
+{
+    int fd;
+    int ksm_fd;
+    int r = 1;
+    struct ksm_memory_region ksm_region;
+
+    fd = open("/dev/ksm", O_RDWR | O_TRUNC, (mode_t)0600);
+    if (fd == -1)
+        goto out;
+
+    ksm_fd = ioctl(fd, KSM_CREATE_SHARED_MEMORY_AREA);
+    if (ksm_fd == -1)
+        goto out_free;
+
+    ksm_region.npages = phys_ram_size / TARGET_PAGE_SIZE;
+    ksm_region.addr = phys_ram_base;
+    r = ioctl(ksm_fd, KSM_REGISTER_MEMORY_REGION, &ksm_region);
+    if (r)
+        goto out_free1;
+
+    return r;
+
+out_free1:
+    close(ksm_fd);
+out_free:
+    close(fd);
+out:
+    return r;
+}
+
 int main(int argc, char **argv)
 {
 #ifdef CONFIG_GDBSTUB
@@ -6735,6 +6767,8 @@ int main(int argc, char **argv)
     /* init the dynamic translator */
     cpu_exec_init_all(tb_size * 1024 * 1024);
 
+    ksm_register_memory();
+
     bdrv_init();
 
     /* we always create the cdrom drive, even if no disk is there */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2008-11-17  2:20 Izik Eidus
@ 2008-11-20  7:44 ` Ryota OZAKI
  2008-11-20  9:03   ` Izik Eidus
  2008-11-28 12:57 ` Dmitri Monakhov
  1 sibling, 1 reply; 19+ messages in thread
From: Ryota OZAKI @ 2008-11-20  7:44 UTC
  To: Izik Eidus
  Cc: akpm, linux-kernel, linux-mm, kvm, aarcange, chrisw, avi, dlaor,
	kamezawa.hiroyu, cl, corbet

[-- Attachment #1: Type: text/plain, Size: 8893 bytes --]

Hi Izik,

I've tried your patch set, but ksm doesn't work on my machine.

I compiled a Linux kernel patched with the four patches and configured with
KSM and KVM enabled. After booting that kernel, I ran two VMs running Linux
using QEMU with the patch from your mail and started the KSM scanner with
your script; the host kernel then panicked with the following oops.


== BEGINNING of OOPS
kernel BUG at arch/x86/mm/highmem_32.c:87!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/class/net/vnet-ssh2/address
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in: netconsole autofs4 nf_conntrack_ipv4 nf_defrag_ipv4
xt_state nf_conntrack xt_tcpudp ipt_REJECT iptable_filter ip_tables
x_tables loop kvm_intel kvm iTCO_wdt iTCO_vendor_support igb
netxen_nic button ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd usbcore
[last unloaded: microcode]

Pid: 343, comm: kksmd Not tainted (2.6.28-rc5-linus-head-20081119-sparsemem #1) X7DWA
EIP: 0060:[<c041eff9>] EFLAGS: 00010206 CPU: 6
EIP is at kmap_atomic_prot+0x7d/0xeb
EAX: c0008d94 EBX: c1ff6240 ECX: 00000163 EDX: 7e000000
ESI: 00000154 EDI: 00000055 EBP: f5cdbf10 ESP: f5cdbef8
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process kksmd (pid: 343, ti=f5cda000 task=f617b140 task.ti=f5cda000)
Stack:
 7fa12163 fffff000 c204efbc f50479e8 9eb7e000 c08a34d0 f5cdbf18 c041f07a
 f5cdbf28 c048339c 00000000 f5c271e0 f5cdbf30 c04833bc f5cdbfb0 c0483b0d
 f5cdbf50 c0425845 00000000 00000064 00000009 c08a34d0 f5cdbfb0 c06384c1
Call Trace:
 [<c041f07a>] ? kmap_atomic+0x13/0x15
 [<c048339c>] ? get_pte+0x50/0x63
 [<c04833bc>] ? is_present_pte+0xd/0x1f
 [<c0483b0d>] ? ksm_scan_start+0x9a/0x7ac
 [<c0425845>] ? finish_task_switch+0x29/0xa4
 [<c06384c1>] ? schedule+0x6bf/0x719
 [<c041b3fc>] ? default_spin_lock_flags+0x8/0xc
 [<c043bffa>] ? finish_wait+0x49/0x4e
 [<c04845f4>] ? kthread_ksm_scan_thread+0x0/0xdc
 [<c048462e>] ? kthread_ksm_scan_thread+0x3a/0xdc
 [<c043bf31>] ? autoremove_wake_function+0x0/0x38
 [<c043be3e>] ? kthread+0x40/0x66
 [<c043bdfe>] ? kthread+0x0/0x66
 [<c0404997>] ? kernel_thread_helper+0x7/0x10
Code: 86 00 00 00 64 a1 04 a0 82 c0 6b c0 0d 8d 3c 30 a1 78 b0 77 c0
8d 34 bd 00 00 00 00 89 45 ec a1 0c d0 84 c0 29 f0 83 38 00 74 04 <0f>
0b eb fe c1 ea 1a 8b 04 d5 80 32 8a c0 83 e0 fc 29 c3 c1 fb
EIP: [<c041eff9>] kmap_atomic_prot+0x7d/0xeb SS:ESP 0068:f5cdbef8
Kernel panic - not syncing: Fatal exception
== END of OOPS


I used a Linux kernel from Linus' tree (commit e14c8bf863..., Mon Nov 17)
with the configuration in the attached file, and kvm-userspace (commit
0cf57d569f3..., Mon Nov 10). The rest of the setup is as follows:

Host: Fedora 8
Arch: i686
CPU: Intel(R) Xeon(R) CPU E5410  @ 2.33GHz
Mem: 2 GB
Guest: Fedora Core 7, linux-2.6.27
Cmd: qemu-system-x86_64 -kernel vmlinuz-2.6.27 -initrd
initrd-2.6.27.img -append ro root=/dev/VolGroup00/LogVol00
kvm_paravirt=1 loglevel=4 rhgb quiet -hda -M pc -smp 1 -curses -clock
dynticks -net nic,vlan=0,macaddr=00:00:00:1:11:0,model=virtio -net
tap,vlan=0,ifname=vnet1,script=no -m 384


Note that if I disable HIGHMEM in the host kernel, ksm works well.
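
For anyone comparing setups, the highmem-related options can be checked
mechanically. The sketch below is a hypothetical helper (config_enabled() is
not part of any patch here) that reports whether a given option is set to y
or m in the text of a kernel config; in the attached config it would report
CONFIG_HIGHMEM, CONFIG_HIGHPTE and CONFIG_PAGE_SHARING as enabled. It only
reads the file and says nothing about which option triggers the oops.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if `text` (the contents of a kernel config file) contains a
 * line "<name>=y" or "<name>=m", 0 otherwise. Lines such as
 * "# CONFIG_FOO is not set" never match because they start with '#',
 * and a longer option name (CONFIG_HIGHMEM4G) does not match a shorter
 * one (CONFIG_HIGHMEM) because the character after the name must be '='.
 */
int config_enabled(const char *text, const char *name)
{
	size_t nlen = strlen(name);
	const char *line = text;

	assert(nlen > 0);
	while (line && *line) {
		if (strncmp(line, name, nlen) == 0 && line[nlen] == '=' &&
		    (line[nlen + 1] == 'y' || line[nlen + 1] == 'm'))
			return 1;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline */
	}
	return 0;
}
```

The caller is expected to read the config file into a NUL-terminated buffer
first and pass the full option name, for example "CONFIG_HIGHPTE".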

Any idea?

Thanks in advance,
  ozaki-r



[-- Attachment #2: config-2.6.28-rc5.i686 --]
[-- Type: application/octet-stream, Size: 50736 bytes --]

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.28-rc5
# Wed Nov 19 14:04:28 2008
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_DEFAULT_IDLE=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
# CONFIG_HAVE_CPUMASK_OF_CPU_MAP is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_X86_SMP=y
CONFIG_X86_32_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_KTIME_SCALAR=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION="-linus-head-20081119-sparsemem"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_TREE=y
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CGROUPS is not set
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_GROUP_SCHED=y
# CONFIG_FAIR_GROUP_SCHED is not set
# CONFIG_RT_GROUP_SCHED is not set
CONFIG_USER_SCHED=y
# CONFIG_CGROUP_SCHED is not set
CONFIG_SYSFS_DEPRECATED=y
CONFIG_SYSFS_DEPRECATED_V2=y
CONFIG_RELAY=y
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_IPC_NS is not set
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_COMPAT_BRK=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_PCI_QUIRKS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_PROFILING=y
CONFIG_TRACEPOINTS=y
CONFIG_MARKERS=y
CONFIG_OPROFILE=m
# CONFIG_OPROFILE_IBS is not set
CONFIG_HAVE_OPROFILE=y
# CONFIG_KPROBES is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_HAVE_GENERIC_DMA_COHERENT=y
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_LBD=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_LSF=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_BLK_DEV_INTEGRITY is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_CLASSIC_RCU=y
CONFIG_FREEZER=y

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_VSMP is not set
# CONFIG_X86_RDC321X is not set
# CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER is not set
CONFIG_PARAVIRT_GUEST=y
# CONFIG_VMI is not set
# CONFIG_KVM_CLOCK is not set
CONFIG_KVM_GUEST=y
CONFIG_LGUEST_GUEST=y
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_CLOCK is not set
# CONFIG_PARAVIRT_DEBUG is not set
# CONFIG_MEMTEST is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_MPSC is not set
CONFIG_MCORE2=y
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_GENERIC=y
CONFIG_X86_CPU=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_XADD=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=4
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_CYRIX_32=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR_32=y
CONFIG_CPU_SUP_TRANSMETA_32=y
CONFIG_CPU_SUP_UMC_32=y
CONFIG_X86_DS=y
CONFIG_X86_PTRACE_BTS=y
CONFIG_HPET_TIMER=y
CONFIG_DMI=y
# CONFIG_IOMMU_HELPER is not set
CONFIG_NR_CPUS=32
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
# CONFIG_X86_MCE_NONFATAL is not set
CONFIG_X86_MCE_P4THERMAL=y
CONFIG_VM86=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
CONFIG_MICROCODE=m
CONFIG_MICROCODE_INTEL=y
# CONFIG_MICROCODE_AMD is not set
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_PAGE_OFFSET=0xC0000000
CONFIG_HIGHMEM=y
# CONFIG_ARCH_PHYS_ADDR_T_64BIT is not set
CONFIG_NEED_NODE_MEMMAP_SIZE=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_SELECT_MEMORY_MODEL=y
# CONFIG_FLATMEM_MANUAL is not set
# CONFIG_DISCONTIGMEM_MANUAL is not set
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_STATIC=y
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
# CONFIG_RESOURCES_64BIT is not set
# CONFIG_PHYS_ADDR_T_64BIT is not set
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_UNEVICTABLE_LRU=y
CONFIG_MMU_NOTIFIER=y
CONFIG_PAGE_SHARING=y
CONFIG_HIGHPTE=y
# CONFIG_X86_CHECK_BIOS_CORRUPTION is not set
CONFIG_X86_RESERVE_LOW_64K=y
CONFIG_MATH_EMULATION=y
CONFIG_MTRR=y
# CONFIG_MTRR_SANITIZER is not set
CONFIG_X86_PAT=y
CONFIG_EFI=y
# CONFIG_SECCOMP is not set
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_PHYSICAL_ALIGN=0x400000
CONFIG_HOTPLUG_CPU=y
# CONFIG_COMPAT_VDSO is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y

#
# Power management and ACPI options
#
CONFIG_PM=y
CONFIG_PM_DEBUG=y
# CONFIG_PM_VERBOSE is not set
CONFIG_CAN_PM_TRACE=y
# CONFIG_PM_TRACE_RTC is not set
CONFIG_PM_SLEEP_SMP=y
CONFIG_PM_SLEEP=y
CONFIG_SUSPEND=y
# CONFIG_PM_TEST_SUSPEND is not set
CONFIG_SUSPEND_FREEZER=y
# CONFIG_HIBERNATION is not set
CONFIG_ACPI=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_PROCFS=y
# CONFIG_ACPI_PROCFS_POWER is not set
CONFIG_ACPI_SYSFS_POWER=y
CONFIG_ACPI_PROC_EVENT=y
CONFIG_ACPI_AC=m
CONFIG_ACPI_BATTERY=m
CONFIG_ACPI_BUTTON=m
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_WMI is not set
# CONFIG_ACPI_ASUS is not set
CONFIG_ACPI_TOSHIBA=m
# CONFIG_ACPI_CUSTOM_DSDT is not set
CONFIG_ACPI_BLACKLIST_YEAR=1999
# CONFIG_ACPI_DEBUG is not set
# CONFIG_ACPI_PCI_SLOT is not set
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_SBS=m
CONFIG_X86_APM_BOOT=y
CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
CONFIG_APM_CPU_IDLE=y
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_ALLOW_INTS is not set
# CONFIG_APM_REAL_MODE_POWER_OFF is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
CONFIG_CPU_FREQ_DEBUG=y
CONFIG_CPU_FREQ_STAT=m
CONFIG_CPU_FREQ_STAT_DETAILS=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=m
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m

#
# CPUFreq processor drivers
#
CONFIG_X86_ACPI_CPUFREQ=m
# CONFIG_X86_POWERNOW_K6 is not set
# CONFIG_X86_POWERNOW_K7 is not set
# CONFIG_X86_POWERNOW_K8 is not set
# CONFIG_X86_GX_SUSPMOD is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
CONFIG_X86_SPEEDSTEP_ICH=y
CONFIG_X86_SPEEDSTEP_SMI=y
# CONFIG_X86_P4_CLOCKMOD is not set
# CONFIG_X86_CPUFREQ_NFORCE2 is not set
# CONFIG_X86_LONGRUN is not set
# CONFIG_X86_LONGHAUL is not set
# CONFIG_X86_E_POWERSAVER is not set

#
# shared options
#
# CONFIG_X86_ACPI_CPUFREQ_PROC_INTF is not set
CONFIG_X86_SPEEDSTEP_LIB=y
# CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK is not set
# CONFIG_CPU_IDLE is not set

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
# CONFIG_PCI_GOOLPC is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
# CONFIG_PCIEAER is not set
# CONFIG_PCIEASPM is not set
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY=y
# CONFIG_PCI_DEBUG is not set
CONFIG_HT_IRQ=y
CONFIG_ISA_DMA_API=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
# CONFIG_OLPC is not set
# CONFIG_PCCARD is not set
# CONFIG_HOTPLUG_PCI is not set

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
CONFIG_HAVE_AOUT=y
# CONFIG_BINFMT_AOUT is not set
CONFIG_BINFMT_MISC=y
CONFIG_HAVE_ATOMIC_IOMAP=y
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=y
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_MIGRATE=y
# CONFIG_XFRM_STATISTICS is not set
CONFIG_XFRM_IPCOMP=m
CONFIG_NET_KEY=m
CONFIG_NET_KEY_MIGRATE=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_ASK_IP_FIB_HASH=y
# CONFIG_IP_FIB_TRIE is not set
CONFIG_IP_FIB_HASH=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
# CONFIG_ARPD is not set
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_XFRM_TUNNEL=m
CONFIG_INET_TUNNEL=m
CONFIG_INET_XFRM_MODE_TRANSPORT=m
CONFIG_INET_XFRM_MODE_TUNNEL=m
CONFIG_INET_XFRM_MODE_BEET=m
CONFIG_INET_LRO=m
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_MD5SIG=y
# CONFIG_IPV6 is not set
CONFIG_NETWORK_SECMARK=y
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=y

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_NETLINK=m
CONFIG_NETFILTER_NETLINK_QUEUE=m
CONFIG_NETFILTER_NETLINK_LOG=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CT_ACCT=y
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_SECMARK=y
CONFIG_NF_CONNTRACK_EVENTS=y
# CONFIG_NF_CT_PROTO_DCCP is not set
CONFIG_NF_CT_PROTO_GRE=m
CONFIG_NF_CT_PROTO_SCTP=m
# CONFIG_NF_CT_PROTO_UDPLITE is not set
CONFIG_NF_CONNTRACK_AMANDA=m
CONFIG_NF_CONNTRACK_FTP=m
CONFIG_NF_CONNTRACK_H323=m
CONFIG_NF_CONNTRACK_IRC=m
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
CONFIG_NF_CONNTRACK_PPTP=m
CONFIG_NF_CONNTRACK_SANE=m
CONFIG_NF_CONNTRACK_SIP=m
CONFIG_NF_CONNTRACK_TFTP=m
CONFIG_NF_CT_NETLINK=m
CONFIG_NETFILTER_TPROXY=m
CONFIG_NETFILTER_XTABLES=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
# CONFIG_NETFILTER_XT_TARGET_RATEEST is not set
CONFIG_NETFILTER_XT_TARGET_TPROXY=m
# CONFIG_NETFILTER_XT_TARGET_TRACE is not set
CONFIG_NETFILTER_XT_TARGET_SECMARK=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
# CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP is not set
CONFIG_NETFILTER_XT_MATCH_COMMENT=m
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
# CONFIG_NETFILTER_XT_MATCH_CONNLIMIT is not set
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ESP=m
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
CONFIG_NETFILTER_XT_MATCH_HELPER=m
# CONFIG_NETFILTER_XT_MATCH_IPRANGE is not set
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
# CONFIG_NETFILTER_XT_MATCH_OWNER is not set
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
# CONFIG_NETFILTER_XT_MATCH_RATEEST is not set
CONFIG_NETFILTER_XT_MATCH_REALM=m
# CONFIG_NETFILTER_XT_MATCH_RECENT is not set
CONFIG_NETFILTER_XT_MATCH_SCTP=m
# CONFIG_NETFILTER_XT_MATCH_SOCKET is not set
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
# CONFIG_NETFILTER_XT_MATCH_TIME is not set
# CONFIG_NETFILTER_XT_MATCH_U32 is not set
CONFIG_IP_VS=m
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12

#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_AH_ESP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y

#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m

#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m

#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=m
CONFIG_NF_CONNTRACK_IPV4=m
# CONFIG_NF_CONNTRACK_PROC_COMPAT is not set
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_NF_NAT_SNMP_BASIC=m
CONFIG_NF_NAT_PROTO_GRE=m
CONFIG_NF_NAT_PROTO_SCTP=m
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
CONFIG_NF_NAT_TFTP=m
CONFIG_NF_NAT_AMANDA=m
CONFIG_NF_NAT_PPTP=m
CONFIG_NF_NAT_H323=m
CONFIG_NF_NAT_SIP=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_CLUSTERIP=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_TTL=m
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_ULOG=m
# CONFIG_BRIDGE_EBT_NFLOG is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
CONFIG_STP=m
CONFIG_BRIDGE=m
# CONFIG_NET_DSA is not set
CONFIG_VLAN_8021Q=m
# CONFIG_VLAN_8021Q_GVRP is not set
# CONFIG_DECNET is not set
CONFIG_LLC=m
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
CONFIG_WAN_ROUTER=m
# CONFIG_NET_SCHED is not set
CONFIG_NET_CLS_ROUTE=y

#
# Network testing
#
CONFIG_NET_PKTGEN=m
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
# CONFIG_PHONET is not set
CONFIG_FIB_RULES=y
# CONFIG_WIRELESS is not set
CONFIG_RFKILL=m
# CONFIG_RFKILL_INPUT is not set
# CONFIG_NET_9P is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_SYS_HYPERVISOR is not set
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
CONFIG_MTD=m
# CONFIG_MTD_DEBUG is not set
CONFIG_MTD_CONCAT=m
CONFIG_MTD_PARTITIONS=y
CONFIG_MTD_REDBOOT_PARTS=m
CONFIG_MTD_REDBOOT_DIRECTORY_BLOCK=-1
# CONFIG_MTD_REDBOOT_PARTS_UNALLOCATED is not set
# CONFIG_MTD_REDBOOT_PARTS_READONLY is not set
# CONFIG_MTD_AR7_PARTS is not set

#
# User Modules And Translation Layers
#
CONFIG_MTD_CHAR=m
CONFIG_MTD_BLKDEVS=m
CONFIG_MTD_BLOCK=m
CONFIG_MTD_BLOCK_RO=m
CONFIG_FTL=m
CONFIG_NFTL=m
CONFIG_NFTL_RW=y
CONFIG_INFTL=m
CONFIG_RFD_FTL=m
CONFIG_SSFDC=m
# CONFIG_MTD_OOPS is not set

#
# RAM/ROM/Flash chip drivers
#
CONFIG_MTD_CFI=m
CONFIG_MTD_JEDECPROBE=m
CONFIG_MTD_GEN_PROBE=m
# CONFIG_MTD_CFI_ADV_OPTIONS is not set
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
# CONFIG_MTD_MAP_BANK_WIDTH_8 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_16 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_32 is not set
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
# CONFIG_MTD_CFI_I4 is not set
# CONFIG_MTD_CFI_I8 is not set
CONFIG_MTD_CFI_INTELEXT=m
CONFIG_MTD_CFI_AMDSTD=m
CONFIG_MTD_CFI_STAA=m
CONFIG_MTD_CFI_UTIL=m
CONFIG_MTD_RAM=m
CONFIG_MTD_ROM=m
CONFIG_MTD_ABSENT=m

#
# Mapping drivers for chip access
#
CONFIG_MTD_COMPLEX_MAPPINGS=y
# CONFIG_MTD_PHYSMAP is not set
CONFIG_MTD_SC520CDP=m
CONFIG_MTD_NETSC520=m
CONFIG_MTD_TS5500=m
# CONFIG_MTD_SBC_GXX is not set
# CONFIG_MTD_AMD76XROM is not set
# CONFIG_MTD_ICHXROM is not set
CONFIG_MTD_ESB2ROM=m
CONFIG_MTD_CK804XROM=m
CONFIG_MTD_SCB2_FLASH=m
# CONFIG_MTD_NETtel is not set
# CONFIG_MTD_DILNETPC is not set
# CONFIG_MTD_L440GX is not set
CONFIG_MTD_PCI=m
# CONFIG_MTD_INTEL_VR_NOR is not set
# CONFIG_MTD_PLATRAM is not set

#
# Self-contained MTD device drivers
#
CONFIG_MTD_PMC551=m
# CONFIG_MTD_PMC551_BUGFIX is not set
# CONFIG_MTD_PMC551_DEBUG is not set
# CONFIG_MTD_SLRAM is not set
# CONFIG_MTD_PHRAM is not set
CONFIG_MTD_MTDRAM=m
CONFIG_MTDRAM_TOTAL_SIZE=4096
CONFIG_MTDRAM_ERASE_SIZE=128
CONFIG_MTD_BLOCK2MTD=m

#
# Disk-On-Chip Device Drivers
#
# CONFIG_MTD_DOC2000 is not set
# CONFIG_MTD_DOC2001 is not set
# CONFIG_MTD_DOC2001PLUS is not set
CONFIG_MTD_NAND=m
# CONFIG_MTD_NAND_VERIFY_WRITE is not set
CONFIG_MTD_NAND_ECC_SMC=y
# CONFIG_MTD_NAND_MUSEUM_IDS is not set
CONFIG_MTD_NAND_IDS=m
CONFIG_MTD_NAND_DISKONCHIP=m
# CONFIG_MTD_NAND_DISKONCHIP_PROBE_ADVANCED is not set
CONFIG_MTD_NAND_DISKONCHIP_PROBE_ADDRESS=0
# CONFIG_MTD_NAND_DISKONCHIP_BBTWRITE is not set
CONFIG_MTD_NAND_CAFE=m
CONFIG_MTD_NAND_CS553X=m
CONFIG_MTD_NAND_NANDSIM=m
# CONFIG_MTD_NAND_PLATFORM is not set
# CONFIG_MTD_ALAUDA is not set
# CONFIG_MTD_ONENAND is not set

#
# UBI - Unsorted block images
#
# CONFIG_MTD_UBI is not set
# CONFIG_PARPORT is not set
CONFIG_PNP=y
CONFIG_PNP_DEBUG_MESSAGES=y

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
CONFIG_BLK_DEV_DAC960=m
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_SX8=m
# CONFIG_BLK_DEV_UB is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=16384
# CONFIG_BLK_DEV_XIP is not set
# CONFIG_CDROM_PKTCDVD is not set
# CONFIG_ATA_OVER_ETH is not set
CONFIG_VIRTIO_BLK=m
# CONFIG_BLK_DEV_HD is not set
CONFIG_MISC_DEVICES=y
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_SGI_IOC4 is not set
CONFIG_TIFM_CORE=m
# CONFIG_TIFM_7XX1 is not set
# CONFIG_FUJITSU_LAPTOP is not set
# CONFIG_TC1100_WMI is not set
# CONFIG_MSI_LAPTOP is not set
# CONFIG_PANASONIC_LAPTOP is not set
# CONFIG_COMPAL_LAPTOP is not set
# CONFIG_SONY_LAPTOP is not set
# CONFIG_THINKPAD_ACPI is not set
# CONFIG_INTEL_MENLOW is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_HP_ILO is not set
# CONFIG_C2PORT is not set
CONFIG_HAVE_IDE=y
CONFIG_IDE=y

#
# Please see Documentation/ide/ide.txt for help/info on IDE drives
#
CONFIG_IDE_ATAPI=y
# CONFIG_BLK_DEV_IDE_SATA is not set
CONFIG_IDE_GD=y
CONFIG_IDE_GD_ATA=y
# CONFIG_IDE_GD_ATAPI is not set
# CONFIG_BLK_DEV_IDECD is not set
# CONFIG_BLK_DEV_IDETAPE is not set
CONFIG_BLK_DEV_IDESCSI=y
# CONFIG_BLK_DEV_IDEACPI is not set
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_PROC_FS=y

#
# IDE chipset support/bugfixes
#
# CONFIG_IDE_GENERIC is not set
# CONFIG_BLK_DEV_PLATFORM is not set
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_IDEPNP is not set

#
# PCI IDE chipsets support
#
# CONFIG_BLK_DEV_GENERIC is not set
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_CS5535 is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_JMICRON is not set
# CONFIG_BLK_DEV_SC1200 is not set
# CONFIG_BLK_DEV_PIIX is not set
# CONFIG_BLK_DEV_IT8213 is not set
# CONFIG_BLK_DEV_IT821X is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_BLK_DEV_TC86C001 is not set
# CONFIG_BLK_DEV_IDEDMA is not set

#
# SCSI device support
#
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
# CONFIG_SCSI_TGT is not set
# CONFIG_SCSI_NETLINK is not set
# CONFIG_SCSI_PROC_FS is not set

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
CONFIG_CHR_DEV_SG=y
# CONFIG_CHR_DEV_SCH is not set

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SCAN_ASYNC=y
CONFIG_SCSI_WAIT_SCAN=m

#
# SCSI Transports
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
# CONFIG_SCSI_ISCSI_ATTRS is not set
CONFIG_SCSI_SAS_ATTRS=y
CONFIG_SCSI_SAS_LIBSAS=y
CONFIG_SCSI_SAS_ATA=y
CONFIG_SCSI_SAS_HOST_SMP=y
CONFIG_SCSI_SAS_LIBSAS_DEBUG=y
# CONFIG_SCSI_SRP_ATTRS is not set
# CONFIG_SCSI_LOWLEVEL is not set
# CONFIG_SCSI_DH is not set
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_ACPI=y
CONFIG_SATA_PMP=y
# CONFIG_SATA_AHCI is not set
# CONFIG_SATA_SIL24 is not set
CONFIG_ATA_SFF=y
# CONFIG_SATA_SVW is not set
CONFIG_ATA_PIIX=y
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SX4 is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_PATA_ACPI is not set
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CS5520 is not set
# CONFIG_PATA_CS5530 is not set
# CONFIG_PATA_CS5535 is not set
# CONFIG_PATA_CS5536 is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
# CONFIG_ATA_GENERIC is not set
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_MARVELL is not set
CONFIG_PATA_MPIIX=y
CONFIG_PATA_OLDPIIX=y
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RZ1000 is not set
# CONFIG_PATA_SC1200 is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set
# CONFIG_PATA_SCH is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_AUTODETECT=y
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID456=m
CONFIG_MD_RAID5_RESHAPE=y
# CONFIG_MD_MULTIPATH is not set
# CONFIG_MD_FAULTY is not set
CONFIG_BLK_DEV_DM=y
CONFIG_DM_DEBUG=y
CONFIG_DM_CRYPT=y
# CONFIG_DM_SNAPSHOT is not set
# CONFIG_DM_MIRROR is not set
# CONFIG_DM_ZERO is not set
# CONFIG_DM_MULTIPATH is not set
# CONFIG_DM_DELAY is not set
# CONFIG_DM_UEVENT is not set
CONFIG_FUSION=y
# CONFIG_FUSION_SPI is not set
# CONFIG_FUSION_FC is not set
CONFIG_FUSION_SAS=y
CONFIG_FUSION_MAX_SGE=128
# CONFIG_FUSION_CTL is not set
# CONFIG_FUSION_LOGGING is not set

#
# IEEE 1394 (FireWire) support
#

#
# Enable only one of the two stacks, unless you know what you are doing
#
# CONFIG_FIREWIRE is not set
# CONFIG_IEEE1394 is not set
# CONFIG_I2O is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_BONDING=m
# CONFIG_MACVLAN is not set
# CONFIG_EQUALIZER is not set
CONFIG_TUN=y
CONFIG_VETH=y
# CONFIG_NET_SB1000 is not set
# CONFIG_ARCNET is not set
CONFIG_PHYLIB=m

#
# MII PHY device drivers
#
# CONFIG_MARVELL_PHY is not set
# CONFIG_DAVICOM_PHY is not set
# CONFIG_QSEMI_PHY is not set
# CONFIG_LXT_PHY is not set
# CONFIG_CICADA_PHY is not set
# CONFIG_VITESSE_PHY is not set
# CONFIG_SMSC_PHY is not set
# CONFIG_BROADCOM_PHY is not set
# CONFIG_ICPLUS_PHY is not set
# CONFIG_REALTEK_PHY is not set
# CONFIG_MDIO_BITBANG is not set
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_CASSINI is not set
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_TULIP is not set
# CONFIG_HP100 is not set
# CONFIG_IBM_NEW_EMAC_ZMII is not set
# CONFIG_IBM_NEW_EMAC_RGMII is not set
# CONFIG_IBM_NEW_EMAC_TAH is not set
# CONFIG_IBM_NEW_EMAC_EMAC4 is not set
# CONFIG_IBM_NEW_EMAC_NO_FLOW_CTRL is not set
# CONFIG_IBM_NEW_EMAC_MAL_CLR_ICINTSTAT is not set
# CONFIG_IBM_NEW_EMAC_MAL_COMMON_ERR is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_EEPRO100 is not set
# CONFIG_E100 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
CONFIG_8139CP=m
CONFIG_8139TOO=m
CONFIG_8139TOO_PIO=y
# CONFIG_8139TOO_TUNE_TWISTER is not set
# CONFIG_8139TOO_8129 is not set
# CONFIG_8139_OLD_RX_RESET is not set
# CONFIG_R6040 is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_SC92031 is not set
# CONFIG_ATL2 is not set
CONFIG_NETDEV_1000=y
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
CONFIG_E1000=m
CONFIG_E1000E=m
# CONFIG_IP1000 is not set
CONFIG_IGB=m
# CONFIG_IGB_LRO is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SIS190 is not set
# CONFIG_SKGE is not set
# CONFIG_SKY2 is not set
# CONFIG_VIA_VELOCITY is not set
CONFIG_TIGON3=m
# CONFIG_BNX2 is not set
# CONFIG_QLA3XXX is not set
# CONFIG_ATL1 is not set
# CONFIG_ATL1E is not set
# CONFIG_JME is not set
CONFIG_NETDEV_10000=y
# CONFIG_CHELSIO_T1 is not set
# CONFIG_CHELSIO_T3 is not set
# CONFIG_ENIC is not set
CONFIG_IXGBE=m
CONFIG_IXGB=m
# CONFIG_S2IO is not set
# CONFIG_MYRI10GE is not set
CONFIG_NETXEN_NIC=m
# CONFIG_NIU is not set
# CONFIG_MLX4_EN is not set
# CONFIG_MLX4_CORE is not set
# CONFIG_TEHUTI is not set
CONFIG_BNX2X=m
# CONFIG_QLGE is not set
# CONFIG_SFC is not set
# CONFIG_TR is not set

#
# Wireless LAN
#
# CONFIG_WLAN_PRE80211 is not set
# CONFIG_WLAN_80211 is not set
# CONFIG_IWLWIFI_LEDS is not set

#
# USB Network Adapters
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set
# CONFIG_USB_HSO is not set
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
CONFIG_NETCONSOLE=m
# CONFIG_NETCONSOLE_DYNAMIC is not set
CONFIG_NETPOLL=y
CONFIG_NETPOLL_TRAP=y
CONFIG_NET_POLL_CONTROLLER=y
CONFIG_VIRTIO_NET=m
# CONFIG_ISDN is not set
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_FF_MEMLESS=y
CONFIG_INPUT_POLLDEV=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
# CONFIG_MOUSE_PS2_ELANTECH is not set
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_SERIAL=m
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_BCM5974 is not set
CONFIG_MOUSE_VSXXXAA=m
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
# CONFIG_GAMEPORT is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_DEVKMEM=y
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_COMPUTONE is not set
CONFIG_ROCKETPORT=m
CONFIG_CYCLADES=m
# CONFIG_CYZ_INTR is not set
# CONFIG_DIGIEPCA is not set
# CONFIG_MOXA_INTELLIO is not set
# CONFIG_MOXA_SMARTIO is not set
# CONFIG_ISI is not set
CONFIG_SYNCLINK=m
CONFIG_SYNCLINKMP=m
CONFIG_SYNCLINK_GT=m
CONFIG_N_HDLC=m
# CONFIG_RISCOM8 is not set
# CONFIG_SPECIALIX is not set
# CONFIG_SX is not set
# CONFIG_RIO is not set
# CONFIG_STALDRV is not set
# CONFIG_NOZOMI is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
CONFIG_SERIAL_8250_DETECT_IRQ=y
CONFIG_SERIAL_8250_RSA=y

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_SERIAL_JSM=m
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
CONFIG_HVC_DRIVER=y
CONFIG_VIRTIO_CONSOLE=y
# CONFIG_IPMI_HANDLER is not set
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_INTEL=m
# CONFIG_HW_RANDOM_AMD is not set
# CONFIG_HW_RANDOM_GEODE is not set
# CONFIG_HW_RANDOM_VIA is not set
CONFIG_HW_RANDOM_VIRTIO=m
CONFIG_NVRAM=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
# CONFIG_MWAVE is not set
# CONFIG_PC8736x_GPIO is not set
# CONFIG_NSC_GPIO is not set
# CONFIG_CS5535_GPIO is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HPET=y
# CONFIG_HPET_MMAP is not set
CONFIG_HANGCHECK_TIMER=m
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
# CONFIG_I2C is not set
# CONFIG_SPI is not set
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_BATTERY_DS2760 is not set
# CONFIG_HWMON is not set
CONFIG_THERMAL=y
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set

#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
CONFIG_ALIM1535_WDT=m
CONFIG_ALIM7101_WDT=m
# CONFIG_SC520_WDT is not set
# CONFIG_EUROTECH_WDT is not set
# CONFIG_IB700_WDT is not set
CONFIG_IBMASR=m
# CONFIG_WAFER_WDT is not set
CONFIG_I6300ESB_WDT=m
CONFIG_ITCO_WDT=m
CONFIG_ITCO_VENDOR_SUPPORT=y
# CONFIG_IT8712F_WDT is not set
# CONFIG_IT87_WDT is not set
# CONFIG_HP_WATCHDOG is not set
# CONFIG_SC1200_WDT is not set
# CONFIG_PC87413_WDT is not set
# CONFIG_60XX_WDT is not set
# CONFIG_SBC8360_WDT is not set
# CONFIG_SBC7240_WDT is not set
# CONFIG_CPU5_WDT is not set
# CONFIG_SMSC37B787_WDT is not set
CONFIG_W83627HF_WDT=m
CONFIG_W83697HF_WDT=m
# CONFIG_W83697UG_WDT is not set
CONFIG_W83877F_WDT=m
CONFIG_W83977F_WDT=m
CONFIG_MACHZ_WDT=m
# CONFIG_SBC_EPX_C3_WATCHDOG is not set

#
# PCI-based Watchdog Cards
#
CONFIG_PCIPCWATCHDOG=m
CONFIG_WDTPCI=m
CONFIG_WDT_501_PCI=y

#
# USB-based Watchdog Cards
#
# CONFIG_USBPCWATCHDOG is not set
CONFIG_SSB_POSSIBLE=y

#
# Sonics Silicon Backplane
#
# CONFIG_SSB is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_CORE is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_MFD_TMIO is not set
# CONFIG_REGULATOR is not set

#
# Multimedia devices
#

#
# Multimedia core support
#
# CONFIG_VIDEO_DEV is not set
# CONFIG_DVB_CORE is not set
# CONFIG_VIDEO_MEDIA is not set

#
# Multimedia drivers
#
# CONFIG_DAB is not set

#
# Graphics support
#
# CONFIG_AGP is not set
# CONFIG_DRM is not set
# CONFIG_VGASTATE is not set
# CONFIG_VIDEO_OUTPUT_CONTROL is not set
# CONFIG_FB is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
# CONFIG_LCD_CLASS_DEVICE is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_CORGI is not set
# CONFIG_BACKLIGHT_PROGEAR is not set
# CONFIG_BACKLIGHT_MBP_NVIDIA is not set
# CONFIG_BACKLIGHT_SAHARA is not set

#
# Display device support
#
# CONFIG_DISPLAY_SUPPORT is not set

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
CONFIG_DUMMY_CONSOLE=y
# CONFIG_SOUND is not set
CONFIG_HID_SUPPORT=y
CONFIG_HID=y
CONFIG_HID_DEBUG=y
# CONFIG_HIDRAW is not set

#
# USB Input Devices
#
CONFIG_USB_HID=m
# CONFIG_HID_PID is not set
# CONFIG_USB_HIDDEV is not set

#
# USB HID Boot Protocol drivers
#
# CONFIG_USB_KBD is not set
# CONFIG_USB_MOUSE is not set

#
# Special HID drivers
#
CONFIG_HID_COMPAT=y
CONFIG_HID_A4TECH=m
CONFIG_HID_APPLE=m
CONFIG_HID_BELKIN=m
CONFIG_HID_BRIGHT=m
CONFIG_HID_CHERRY=m
CONFIG_HID_CHICONY=m
CONFIG_HID_CYPRESS=m
CONFIG_HID_DELL=m
CONFIG_HID_EZKEY=m
CONFIG_HID_GYRATION=m
CONFIG_HID_LOGITECH=m
# CONFIG_LOGITECH_FF is not set
# CONFIG_LOGIRUMBLEPAD2_FF is not set
CONFIG_HID_MICROSOFT=m
CONFIG_HID_MONTEREY=m
CONFIG_HID_PANTHERLORD=m
# CONFIG_PANTHERLORD_FF is not set
CONFIG_HID_PETALYNX=m
CONFIG_HID_SAMSUNG=m
CONFIG_HID_SONY=m
CONFIG_HID_SUNPLUS=m
# CONFIG_THRUSTMASTER_FF is not set
# CONFIG_ZEROPLUS_FF is not set
CONFIG_USB_SUPPORT=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_USB=m
# CONFIG_USB_DEBUG is not set
# CONFIG_USB_ANNOUNCE_NEW_DEVICES is not set

#
# Miscellaneous USB options
#
# CONFIG_USB_DEVICEFS is not set
CONFIG_USB_DEVICE_CLASS=y
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_SUSPEND is not set
# CONFIG_USB_OTG is not set
CONFIG_USB_MON=y
# CONFIG_USB_WUSB is not set
# CONFIG_USB_WUSB_CBAF is not set

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
CONFIG_USB_EHCI_HCD=m
# CONFIG_USB_EHCI_ROOT_HUB_TT is not set
# CONFIG_USB_EHCI_TT_NEWSCHED is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1760_HCD is not set
CONFIG_USB_OHCI_HCD=m
# CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set
# CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_UHCI_HCD=m
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_WHCI_HCD is not set
# CONFIG_USB_HWA_HCD is not set

#
# Enable Host or Gadget support to see Inventra options
#

#
# USB Device Class drivers
#
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set
# CONFIG_USB_WDM is not set
# CONFIG_USB_TMC is not set

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may also be needed;
#

#
# see USB_STORAGE Help for more information
#
# CONFIG_USB_STORAGE is not set
# CONFIG_USB_LIBUSUAL is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set

#
# USB port drivers
#
# CONFIG_USB_SERIAL is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
# CONFIG_USB_SEVSEG is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_BERRY_CHARGE is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_PHIDGET is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_FTDI_ELAN is not set
# CONFIG_USB_APPLEDISPLAY is not set
# CONFIG_USB_SISUSBVGA is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
# CONFIG_USB_ISIGHTFW is not set
# CONFIG_USB_VST is not set
# CONFIG_USB_GADGET is not set
# CONFIG_UWB is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
# CONFIG_NEW_LEDS is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
# CONFIG_EDAC is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_DEBUG is not set

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set

#
# SPI RTC drivers
#

#
# Platform RTC drivers
#
# CONFIG_RTC_DRV_CMOS is not set
# CONFIG_RTC_DRV_DS1286 is not set
# CONFIG_RTC_DRV_DS1511 is not set
# CONFIG_RTC_DRV_DS1553 is not set
# CONFIG_RTC_DRV_DS1742 is not set
# CONFIG_RTC_DRV_STK17TA8 is not set
# CONFIG_RTC_DRV_M48T86 is not set
# CONFIG_RTC_DRV_M48T35 is not set
# CONFIG_RTC_DRV_M48T59 is not set
# CONFIG_RTC_DRV_BQ4802 is not set
# CONFIG_RTC_DRV_V3020 is not set

#
# on-CPU RTC drivers
#
CONFIG_DMADEVICES=y

#
# DMA Devices
#
# CONFIG_INTEL_IOATDMA is not set
CONFIG_UIO=m
# CONFIG_UIO_CIF is not set
# CONFIG_UIO_PDRV is not set
# CONFIG_UIO_PDRV_GENIRQ is not set
# CONFIG_UIO_SMX is not set
# CONFIG_UIO_SERCOS3 is not set
# CONFIG_STAGING is not set
CONFIG_STAGING_EXCLUDE_BUILD=y

#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_FIRMWARE_MEMMAP=y
# CONFIG_EFI_VARS is not set
# CONFIG_DELL_RBU is not set
# CONFIG_DCDBAS is not set
CONFIG_DMIID=y
# CONFIG_ISCSI_IBFT_FIND is not set

#
# File systems
#
# CONFIG_EXT2_FS is not set
CONFIG_EXT3_FS=m
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
# CONFIG_EXT4_FS is not set
CONFIG_JBD=m
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=m
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_FILE_LOCKING=y
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
CONFIG_DNOTIFY=y
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_QUOTA=y
# CONFIG_QUOTA_NETLINK_INTERFACE is not set
CONFIG_PRINT_QUOTA_WARNING=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_AUTOFS_FS=m
CONFIG_AUTOFS4_FS=m
CONFIG_FUSE_FS=y
CONFIG_GENERIC_ACL=y

#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set

#
# DOS/FAT/NT Filesystems
#
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_CONFIGFS_FS=m

#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_JFFS2_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
CONFIG_ROMFS_FS=m
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_NETWORK_FILESYSTEMS is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
# CONFIG_BSD_DISKLABEL is not set
# CONFIG_MINIX_SUBPARTITION is not set
# CONFIG_SOLARIS_X86_PARTITION is not set
# CONFIG_UNIXWARE_DISKLABEL is not set
# CONFIG_LDM_PARTITION is not set
# CONFIG_SGI_PARTITION is not set
# CONFIG_ULTRIX_PARTITION is not set
# CONFIG_SUN_PARTITION is not set
# CONFIG_KARMA_PARTITION is not set
# CONFIG_EFI_PARTITION is not set
# CONFIG_SYSV68_PARTITION is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
CONFIG_NLS_CODEPAGE_932=y
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
CONFIG_NLS_ASCII=y
# CONFIG_NLS_ISO8859_1 is not set
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
CONFIG_NLS_UTF8=y
# CONFIG_DLM is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
# CONFIG_PRINTK_TIME is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=1024
CONFIG_MAGIC_SYSRQ=y
# CONFIG_UNUSED_SYMBOLS is not set
CONFIG_DEBUG_FS=y
CONFIG_HEADERS_CHECK=y
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SHIRQ is not set
CONFIG_DETECT_SOFTLOCKUP=y
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
# CONFIG_TIMER_STATS is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_RT_MUTEX_TESTER is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
# CONFIG_DEBUG_HIGHMEM is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_INFO is not set
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VIRTUAL is not set
# CONFIG_DEBUG_WRITECOUNT is not set
CONFIG_DEBUG_MEMORY_INIT=y
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_SG is not set
CONFIG_FRAME_POINTER=y
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_CPU_STALL_DETECTOR is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_FAULT_INJECTION is not set
CONFIG_LATENCYTOP=y
CONFIG_SYSCTL_SYSCALL_CHECK=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_RING_BUFFER=y
CONFIG_TRACING=y

#
# Tracers
#
CONFIG_FUNCTION_TRACER=y
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_SYSPROF_TRACER is not set
# CONFIG_SCHED_TRACER is not set
CONFIG_CONTEXT_SWITCH_TRACER=y
# CONFIG_BOOT_TRACER is not set
# CONFIG_STACK_TRACER is not set
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_BUILD_DOCSRC is not set
# CONFIG_DYNAMIC_PRINTK_DEBUG is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
# CONFIG_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
# CONFIG_EARLY_PRINTK_DBGP is not set
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_DEBUG_PER_CPU_MAPS is not set
# CONFIG_X86_PTDUMP is not set
CONFIG_DEBUG_RODATA=y
# CONFIG_DEBUG_RODATA_TEST is not set
# CONFIG_DEBUG_NX_TEST is not set
# CONFIG_4KSTACKS is not set
CONFIG_DOUBLEFAULT=y
# CONFIG_MMIOTRACE is not set
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
# CONFIG_DEBUG_BOOT_PARAMS is not set
# CONFIG_CPA_DEBUG is not set
# CONFIG_OPTIMIZE_INLINING is not set

#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
# CONFIG_SECURITY_FILE_CAPABILITIES is not set
CONFIG_XOR_BLOCKS=m
CONFIG_ASYNC_CORE=m
CONFIG_ASYNC_MEMCPY=m
CONFIG_ASYNC_XOR=m
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
# CONFIG_CRYPTO_FIPS is not set
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_GF128MUL=m
CONFIG_CRYPTO_NULL=m
# CONFIG_CRYPTO_CRYPTD is not set
CONFIG_CRYPTO_AUTHENC=m
# CONFIG_CRYPTO_TEST is not set

#
# Authenticated Encryption with Associated Data
#
# CONFIG_CRYPTO_CCM is not set
# CONFIG_CRYPTO_GCM is not set
# CONFIG_CRYPTO_SEQIV is not set

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
# CONFIG_CRYPTO_CTR is not set
# CONFIG_CRYPTO_CTS is not set
CONFIG_CRYPTO_ECB=m
CONFIG_CRYPTO_LRW=m
CONFIG_CRYPTO_PCBC=m
# CONFIG_CRYPTO_XTS is not set

#
# Hash modes
#
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_XCBC=m

#
# Digest
#
CONFIG_CRYPTO_CRC32C=m
CONFIG_CRYPTO_CRC32C_INTEL=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
# CONFIG_CRYPTO_RMD128 is not set
# CONFIG_CRYPTO_RMD160 is not set
# CONFIG_CRYPTO_RMD256 is not set
# CONFIG_CRYPTO_RMD320 is not set
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_WP512=m

#
# Ciphers
#
CONFIG_CRYPTO_AES=m
# CONFIG_CRYPTO_AES_586 is not set
CONFIG_CRYPTO_ANUBIS=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_DES=m
CONFIG_CRYPTO_FCRYPT=m
CONFIG_CRYPTO_KHAZAD=m
# CONFIG_CRYPTO_SALSA20 is not set
# CONFIG_CRYPTO_SALSA20_586 is not set
# CONFIG_CRYPTO_SEED is not set
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
# CONFIG_CRYPTO_TWOFISH_586 is not set

#
# Compression
#
CONFIG_CRYPTO_DEFLATE=m
# CONFIG_CRYPTO_LZO is not set

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
# CONFIG_CRYPTO_HW is not set
CONFIG_HAVE_KVM=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
# CONFIG_KVM_AMD is not set
# CONFIG_KVM_TRACE is not set
# CONFIG_LGUEST is not set
CONFIG_VIRTIO=y
CONFIG_VIRTIO_RING=y
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_BALLOON=m

#
# Library routines
#
CONFIG_BITREVERSE=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_CRC_CCITT=m
CONFIG_CRC16=m
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=y
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=m
CONFIG_AUDIT_GENERIC=y
CONFIG_ZLIB_INFLATE=m
CONFIG_ZLIB_DEFLATE=m
CONFIG_REED_SOLOMON=m
CONFIG_REED_SOLOMON_DEC16=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
CONFIG_TEXTSEARCH_FSM=m
CONFIG_PLIST=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2008-11-20  7:44 ` Ryota OZAKI
@ 2008-11-20  9:03   ` Izik Eidus
  2008-11-20  9:13     ` Izik Eidus
  0 siblings, 1 reply; 19+ messages in thread
From: Izik Eidus @ 2008-11-20  9:03 UTC (permalink / raw)
  To: Ryota OZAKI
  Cc: akpm, linux-kernel, linux-mm, kvm, aarcange, chrisw, avi, dlaor,
	kamezawa.hiroyu, cl, corbet

[-- Attachment #1: Type: text/plain, Size: 2816 bytes --]

Quoting Ryota OZAKI:
> Hi Izik,
>
> I've tried your patch set, but ksm doesn't work in my machine.
>
> I compiled linux patched with the four patches and configured with KSM
> and KVM enabled. After boot with the linux, I run two VMs running linux
> using QEMU with a patch in your mail and started KSM scanner with your
> script, then the host linux caused panic with the following oops.
>   

Yes, you are right: we are missing pte_unmap(pte) in get_pte()!
That affects only 32-bit kernels with highmem, which is why you see it.
Thanks for reporting; I will fix it for v3.

The patch below should fix it (I can't test it now, I will test it for v3).

Can you report whether it fixes your problem? Thanks.
>
> == BEGINNING of OOPS
> kernel BUG at arch/x86/mm/highmem_32.c:87!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/class/net/vnet-ssh2/address
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in: netconsole autofs4 nf_conntrack_ipv4 nf_defrag_ipv4
> xt_state nf_conntrack xt_tcpudp ipt_REJECT iptable_filter ip_tables
> x_tables loop kvm_intel kvm iTCO_wdt iTCO_vendor_support igb
> netxen_nic button ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd usbcore
> [last unloaded: microcode]
>
> Pid: 343, comm: kksmd Not tainted
> (2.6.28-rc5-linus-head-20081119-sparsemem #1) X7DWA
> EIP: 0060:[<c041eff9>] EFLAGS: 00010206 CPU: 6
> EIP is at kmap_atomic_prot+0x7d/0xeb
> EAX: c0008d94 EBX: c1ff6240 ECX: 00000163 EDX: 7e000000
> ESI: 00000154 EDI: 00000055 EBP: f5cdbf10 ESP: f5cdbef8
>  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> Process kksmd (pid: 343, ti=f5cda000 task=f617b140 task.ti=f5cda000)
> Stack:
>  7fa12163 fffff000 c204efbc f50479e8 9eb7e000 c08a34d0 f5cdbf18 c041f07a
>  f5cdbf28 c048339c 00000000 f5c271e0 f5cdbf30 c04833bc f5cdbfb0 c0483b0d
>  f5cdbf50 c0425845 00000000 00000064 00000009 c08a34d0 f5cdbfb0 c06384c1
> Call Trace:
>  [<c041f07a>] ? kmap_atomic+0x13/0x15
>  [<c048339c>] ? get_pte+0x50/0x63
>  [<c04833bc>] ? is_present_pte+0xd/0x1f
>  [<c0483b0d>] ? ksm_scan_start+0x9a/0x7ac
>  [<c0425845>] ? finish_task_switch+0x29/0xa4
>  [<c06384c1>] ? schedule+0x6bf/0x719
>  [<c041b3fc>] ? default_spin_lock_flags+0x8/0xc
>  [<c043bffa>] ? finish_wait+0x49/0x4e
>  [<c04845f4>] ? kthread_ksm_scan_thread+0x0/0xdc
>  [<c048462e>] ? kthread_ksm_scan_thread+0x3a/0xdc
>  [<c043bf31>] ? autoremove_wake_function+0x0/0x38
>  [<c043be3e>] ? kthread+0x40/0x66
>  [<c043bdfe>] ? kthread+0x0/0x66
>  [<c0404997>] ? kernel_thread_helper+0x7/0x10
> Code: 86 00 00 00 64 a1 04 a0 82 c0 6b c0 0d 8d 3c 30 a1 78 b0 77 c0
> 8d 34 bd 00 00 00 00 89 45 ec a1 0c d0 84 c0 29 f0 83 38 00 74 04 <0f>
> 0b eb fe c1 ea 1a 8b 04 d5 80 32 8a c0 83 e0 fc 29 c3 c1 fb
> EIP: [<c041eff9>] kmap_atomic_prot+0x7d/0xeb SS:ESP 0068:f5cdbef8
> Kernel panic - not syncing: Fatal exception
> == END of OOPS
>   


[-- Attachment #2: fix_32highmem --]
[-- Type: text/plain, Size: 271 bytes --]

diff --git a/mm/ksm.c b/mm/ksm.c
index 707be52..e14448a 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -562,6 +562,7 @@ static pte_t *get_pte(struct mm_struct *mm, unsigned long addr)
 		goto out;
 
 	ptep = pte_offset_map(pmd, addr);
+	pte_unmap(ptep);
 out:
 	return ptep;
 }

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2008-11-20  9:03   ` Izik Eidus
@ 2008-11-20  9:13     ` Izik Eidus
  2008-11-20  9:44       ` Ryota OZAKI
  0 siblings, 1 reply; 19+ messages in thread
From: Izik Eidus @ 2008-11-20  9:13 UTC (permalink / raw)
  To: Ryota OZAKI
  Cc: akpm, linux-kernel, linux-mm, kvm, aarcange, chrisw, avi, dlaor,
	kamezawa.hiroyu, cl, corbet

[-- Attachment #1: Type: text/plain, Size: 892 bytes --]

Quoting Izik Eidus:
> Quoting Ryota OZAKI:
>> Hi Izik,
>>
>> I've tried your patch set, but ksm doesn't work in my machine.
>>
>> I compiled linux patched with the four patches and configured with KSM
>> and KVM enabled. After boot with the linux, I run two VMs running linux
>> using QEMU with a patch in your mail and started KSM scanner with your
>> script, then the host linux caused panic with the following oops.
>>   
>
> Yes you are right, we are missing pte_unmap(pte); in get_pte()!
> that will effect just 32bits with highmem so this why you see it
> thanks for the reporting, i will fix it for v3
>
> below patch should fix it (i cant test it now, will test it for v3)
>
> can you report if it fix your problem? thanks
>
Thinking about what I just did, it is wrong.
This patch is the right one (still untested), so if you are going
to apply something, use this one.

thanks

[-- Attachment #2: fix_highmem_2 --]
[-- Type: text/plain, Size: 676 bytes --]

diff --git a/mm/ksm.c b/mm/ksm.c
index 707be52..c842c29 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -569,14 +569,16 @@ out:
 static int is_present_pte(struct mm_struct *mm, unsigned long addr)
 {
 	pte_t *ptep;
+	int r;
 
 	ptep = get_pte(mm, addr);
 	if (!ptep)
 		return 0;
 
-	if (pte_present(*ptep))
-		return 1;
-	return 0;
+	r = pte_present(*ptep);
+	pte_unmap(ptep);
+
+	return r;
 }
 
 #define PAGEHASH_LEN 128
@@ -669,6 +671,7 @@ static int try_to_merge_one_page(struct mm_struct *mm,
 	if (!orig_ptep)
 		goto out_unlock;
 	orig_pte = *orig_ptep;
+	pte_unmap(orig_ptep);
 	if (!pte_present(orig_pte))
 		goto out_unlock;
 	if (page_to_pfn(oldpage) != pte_pfn(orig_pte))
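
For readers following the thread, here is a minimal userspace sketch of why
the missing unmap only bites on 32-bit highmem. It assumes, as the oops above
suggests, that mapping a pte there takes an atomic kmap slot that must be
released before the same slot is mapped again. The names atomic_slot,
slot_map, slot_unmap, present_leaky and present_fixed are hypothetical and are
not kernel APIs; the model only mirrors the shape of the fix.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for one per-CPU atomic kmap slot (hypothetical model). */
struct atomic_slot {
	int in_use;
};

/* Map a pte into the slot; returns NULL where the kernel would hit BUG(). */
static const unsigned long *slot_map(struct atomic_slot *s,
				     const unsigned long *pte)
{
	assert(s != NULL && pte != NULL);
	if (s->in_use)
		return NULL;
	s->in_use = 1;
	return pte;
}

/* Release the slot so that the next map can succeed. */
static void slot_unmap(struct atomic_slot *s)
{
	s->in_use = 0;
}

/* Shape of the v2 code: the mapping is never released. */
static int present_leaky(struct atomic_slot *s, const unsigned long *pte)
{
	const unsigned long *p = slot_map(s, pte);

	if (!p)
		return -1;
	return *p != 0;
}

/* Shape of the fix: read the value, unmap, then return it. */
static int present_fixed(struct atomic_slot *s, const unsigned long *pte)
{
	const unsigned long *p = slot_map(s, pte);
	int r;

	if (!p)
		return -1;
	r = (*p != 0);
	slot_unmap(s);
	return r;
}
```

Calling present_fixed() repeatedly on the same slot keeps working, while the
second call to present_leaky() fails because the slot is still busy. Without
highmem the map and unmap steps cost nothing, which would explain why the
problem was not seen on other configurations.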

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2008-11-20  9:13     ` Izik Eidus
@ 2008-11-20  9:44       ` Ryota OZAKI
  0 siblings, 0 replies; 19+ messages in thread
From: Ryota OZAKI @ 2008-11-20  9:44 UTC (permalink / raw)
  To: Izik Eidus
  Cc: akpm, linux-kernel, linux-mm, kvm, aarcange, chrisw, avi, dlaor,
	kamezawa.hiroyu, cl, corbet

2008/11/20 Izik Eidus <ieidus@redhat.com>:
> Quoting Izik Eidus:
>>
>> Quoting Ryota OZAKI:
>>>
>>> Hi Izik,
>>>
>>> I've tried your patch set, but ksm doesn't work in my machine.
>>>
>>> I compiled linux patched with the four patches and configured with KSM
>>> and KVM enabled. After boot with the linux, I run two VMs running linux
>>> using QEMU with a patch in your mail and started KSM scanner with your
>>> script, then the host linux caused panic with the following oops.
>>>
>>
>> Yes you are right, we are missing pte_unmap(pte); in get_pte()!
>> that will effect just 32bits with highmem so this why you see it
>> thanks for the reporting, i will fix it for v3
>>
>> below patch should fix it (i cant test it now, will test it for v3)
>>
>> can you report if it fix your problem? thanks
>>
> Thinking about what i just did, it is wrong,
> this patch is the right one (still wasnt tested), but if you are going to
> apply something then use this one.

Great! Applied the 2nd patch, ksm works with both HIGHMEM enabled and disabled.

Thanks for your quick response,
  ozaki-r

>
> thanks
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 707be52..c842c29 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -569,14 +569,16 @@ out:
>  static int is_present_pte(struct mm_struct *mm, unsigned long addr)
>  {
>        pte_t *ptep;
> +       int r;
>
>        ptep = get_pte(mm, addr);
>        if (!ptep)
>                return 0;
>
> -       if (pte_present(*ptep))
> -               return 1;
> -       return 0;
> +       r = pte_present(*ptep);
> +       pte_unmap(ptep);
> +
> +       return r;
>  }
>
>  #define PAGEHASH_LEN 128
> @@ -669,6 +671,7 @@ static int try_to_merge_one_page(struct mm_struct *mm,
>        if (!orig_ptep)
>                goto out_unlock;
>        orig_pte = *orig_ptep;
> +       pte_unmap(orig_ptep);
>        if (!pte_present(orig_pte))
>                goto out_unlock;
>        if (page_to_pfn(oldpage) != pte_pfn(orig_pte))
>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2008-11-17  2:20 Izik Eidus
  2008-11-20  7:44 ` Ryota OZAKI
@ 2008-11-28 12:57 ` Dmitri Monakhov
  2008-11-28 13:51   ` Alan Cox
  1 sibling, 1 reply; 19+ messages in thread
From: Dmitri Monakhov @ 2008-11-28 12:57 UTC (permalink / raw)
  To: Izik Eidus
  Cc: akpm, linux-kernel, linux-mm, kvm, aarcange, chrisw, avi, dlaor,
	kamezawa.hiroyu, cl, corbet

Izik Eidus <ieidus@redhat.com> writes:

> (From v1 to v2 the main change is much more documentation)
>
> KSM is a linux driver that allows dynamicly sharing identical memory
> pages between one or more processes.
>
> Unlike tradtional page sharing that is made at the allocation of the
> memory, ksm do it dynamicly after the memory was created.
> Memory is periodically scanned; identical pages are identified and
> merged.
> The sharing is unnoticeable by the process that use this memory.
> (the shared pages are marked as readonly, and in case of write
> do_wp_page() take care to create new copy of the page)
>
> This driver is very useful for KVM as in cases of runing multiple guests
> operation system of the same type.
Hi Izik, the approach used in this driver is commonly known as
content-based search. There are several variants of it; the most
common are:
1: with guest VM support
2: w/o guest VM support.
You have implemented the second one, but it seems it was already patented:
http://www.google.com/patents?vid=USPAT6789156
I'm not a lawyer, but IMHO we have a direct conflict here.
From another point of view they have patented the wheel, but at least we
have to know about this.
> (For desktop work loads we have achived more than x2 memory overcommit
> (more like x3))
>
> This driver have found users other than KVM, for example CERN,
> Fons Rademakers:
> "on many-core machines we run one large detector simulation program per core.
> These simulation programs are identical but run each in their own process and
> need about 2 - 2.5 GB RAM.
> We typically buy machines with 2GB RAM per core and so have a problem to run
> one of these programs per core.
> Of the 2 - 2.5 GB about 700MB is identical data in the form of magnetic field
> maps, detector geometry, etc.
> Currently people have been trying to start one program, initialize the geometry
> and field maps and then fork it N times, to have the data shared.
> With KSM this would be done automatically by the system so it sounded extremely
> attractive when Andrea presented it."
>
> (We have are already started to test KSM on their systems...)
>
> KSM can run as kernel thread or as userspace application or both
>
> example for how to control the kernel thread:
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <sys/ioctl.h>
> #include <fcntl.h>
> #include <sys/mman.h>
> #include <unistd.h>
> #include "ksm.h"
>
> int main(int argc, char *argv[])
> {
> 	int fd;
> 	int used = 0;
> 	int fd_start;
> 	struct ksm_kthread_info info;
> 	
>
> 	if (argc < 2) {
> 		fprintf(stderr,
> 			"usage: %s {start npages sleep | stop | info}\n",
> 			argv[0]);
> 		exit(1);
> 	}
>
> 	fd = open("/dev/ksm", O_RDWR | O_TRUNC, (mode_t)0600);
> 	if (fd == -1) {
> 		fprintf(stderr, "could not open /dev/ksm\n");
> 		exit(1);
> 	}
>
> 	if (!strncmp(argv[1], "start", strlen(argv[1]))) {
> 		used = 1;
> 		if (argc < 5) {	/* argv[2..4] are all read below */
> 			fprintf(stderr,
> 		    "usage: %s start npages_to_scan max_pages_to_merge sleep\n",
> 		    argv[0]);
> 			exit(1);
> 		}
> 		info.pages_to_scan = atoi(argv[2]);
> 		info.max_pages_to_merge = atoi(argv[3]);
> 		info.sleep = atoi(argv[4]);
> 		info.flags = ksm_control_flags_run;
>
> 		fd_start = ioctl(fd, KSM_START_STOP_KTHREAD, &info);
> 		if (fd_start == -1) {
> 			fprintf(stderr, "KSM_START_KTHREAD failed\n");
> 			exit(1);
> 		}
> 		printf("created scanner\n");
> 	}
>
> 	if (!strncmp(argv[1], "stop", strlen(argv[1]))) {
> 		used = 1;
> 		info.flags = 0;
> 		fd_start = ioctl(fd, KSM_START_STOP_KTHREAD, &info);
> 		printf("stopped scanner\n");
> 	}
>
> 	if (!strncmp(argv[1], "info", strlen(argv[1]))) {
> 		used = 1;
> 		ioctl(fd, KSM_GET_INFO_KTHREAD, &info);
> 	 printf("flags %d, pages_to_scan %d npages_merge %d, sleep_time %d\n",
> 	 info.flags, info.pages_to_scan, info.max_pages_to_merge, info.sleep);
> 	}
>
> 	if (!used)
> 		fprintf(stderr, "unknown command %s\n", argv[1]);
>
> 	return 0;
> }
>
> example of how to register qemu to ksm (or any userspace application)
>
> diff --git a/qemu/vl.c b/qemu/vl.c
> index 4721fdd..7785bf9 100644
> --- a/qemu/vl.c
> +++ b/qemu/vl.c
> @@ -21,6 +21,7 @@
>   * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>   * DEALINGS IN
>   * THE SOFTWARE.
>   */
> +#include "ksm.h"
>  #include "hw/hw.h"
>  #include "hw/boards.h"
>  #include "hw/usb.h"
> @@ -5799,6 +5800,37 @@ static void termsig_setup(void)
>  
>  #endif
>  
> +int ksm_register_memory(void)
> +{
> +    int fd;
> +    int ksm_fd;
> +    int r = 1;
> +    struct ksm_memory_region ksm_region;
> +
> +    fd = open("/dev/ksm", O_RDWR | O_TRUNC, (mode_t)0600);
> +    if (fd == -1)
> +        goto out;
> +
> +    ksm_fd = ioctl(fd, KSM_CREATE_SHARED_MEMORY_AREA);
> +    if (ksm_fd == -1)
> +        goto out_free;
> +
> +    ksm_region.npages = phys_ram_size / TARGET_PAGE_SIZE;
> +    ksm_region.addr = phys_ram_base;
> +    r = ioctl(ksm_fd, KSM_REGISTER_MEMORY_REGION, &ksm_region);
> +    if (r)
> +        goto out_free1;
> +
> +    return r;
> +
> +out_free1:
> +    close(ksm_fd);
> +out_free:
> +    close(fd);
> +out:
> +    return r;
> +}
> +
>  int main(int argc, char **argv)
>  {
>  #ifdef CONFIG_GDBSTUB
> @@ -6735,6 +6767,8 @@ int main(int argc, char **argv)
>      /* init the dynamic translator */
>      cpu_exec_init_all(tb_size * 1024 * 1024);
>  
> +    ksm_register_memory();
> +
>      bdrv_init();
>  
>      /* we always create the cdrom drive, even if no disk is there */
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2008-11-28 12:57 ` Dmitri Monakhov
@ 2008-11-28 13:51   ` Alan Cox
  0 siblings, 0 replies; 19+ messages in thread
From: Alan Cox @ 2008-11-28 13:51 UTC (permalink / raw)
  To: Dmitri Monakhov
  Cc: Izik Eidus, akpm, linux-kernel, linux-mm, kvm, aarcange, chrisw,
	avi, dlaor, kamezawa.hiroyu, cl, corbet

> You have implemented second one, but seems it already was patented
> http://www.google.com/patents?vid=USPAT6789156
> I'm not a lawyer but IMHO we have direct conflict here.
> > From other point of view they have patented the WEEL, but at least we
> have to know about this.

It's an old idea and appeared for Linux in March 1998: a little project from
Philipp Reisner called "mergemem".

http://groups.google.com/group/muc.lists.linux-kernel/browse_thread/thread/387af278089c7066?ie=utf-8&oe=utf-8&q=share+identical+pages#b3d4f68fb5dd4f88

So if there is a patent which is relevant (and that's a question for
lawyers and legal patent search people), perhaps the Linux Foundation and
some of the patent busters could take a look at mergemem and
re-examination.

Alan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
@ 2009-04-04 14:35 Izik Eidus
  2009-04-04 14:35 ` [PATCH 1/4] MMU_NOTIFIERS: add set_pte_at_notify() Izik Eidus
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Izik Eidus @ 2009-04-04 14:35 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, kvm, linux-mm, avi, aarcange, chrisw, mtosatti,
	hugh, kamezawa.hiroyu, Izik Eidus

From v1 to v2:

1) Fixed a security issue found by Chris Wright:
    Ksm was checking whether a page is a shared page by testing !PageAnon.
    Because ksm scans only anonymous memory, every !PageAnon page
    inside the ksm data structures is a shared page. However, there might
    be a case in do_wp_page() when VM_SHARED is used where
    do_wp_page(), instead of copying the page into a new anonymous
    page, would reuse the page. This was fixed by adding a check for the
    dirty bit of the virtual addresses pointing into the shared page.
    I could not find any VM code that would clear the dirty bit from
    this virtual address (due to the fact that we allocate the page
    using page_alloc() - kernel allocated pages), ~but I still want
    confirmation about this from the vm guys - thanks.~

2) Moved to sysfs to control ksm:
    It was requested as a better way to control the ksm scanning
    thread than ioctls.
    The sysfs api:
    dir: /sys/kernel/mm/ksm/

    kernel_pages_allocated - how many kernel pages ksm has allocated;
    these pages are not swappable, and each such page is used by ksm
    to share pages with identical content

    pages_shared - how many pages were shared by ksm

    run - set to 1 when you want ksm to run, 0 when not

    max_kernel_pages - the maximum number of kernel pages
    to be allocated by ksm; set 0 for unlimited.

    pages_to_scan - how many pages to scan before ksm sleeps

    sleep - how many usecs ksm will sleep.

3) Added a sysfs parameter to control the maximum number of kernel
pages to be allocated by ksm.

4) Added statistics about how many pages are really shared.
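
As an illustration of the sysfs control described above, here is a minimal
userspace sketch for reading and writing one attribute. The directory and
attribute names are the ones listed in this cover letter and may differ in
other trees; the helpers ksm_attr_path, ksm_attr_write and ksm_attr_read are
hypothetical names made up for this example, and the directory is passed in
as a parameter so the helpers can be exercised against any path.

```c
#include <assert.h>
#include <stdio.h>

/* Directory named in the cover letter; other trees may differ. */
#define KSM_SYSFS_DIR "/sys/kernel/mm/ksm"

/* Build "<dir>/<name>"; returns 0 on success, -1 if it does not fit. */
static int ksm_attr_path(char *buf, size_t len, const char *dir,
			 const char *name)
{
	int n = snprintf(buf, len, "%s/%s", dir, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write one decimal value to an attribute file; 0 on success, -1 on error. */
static int ksm_attr_write(const char *dir, const char *name, long value)
{
	char path[256];
	FILE *f;
	int ok;

	assert(dir != NULL && name != NULL);
	if (ksm_attr_path(path, sizeof(path), dir, name) < 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%ld\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read one decimal value from an attribute file; 0 on success, -1 on error. */
static int ksm_attr_read(const char *dir, const char *name, long *value)
{
	char path[256];
	FILE *f;
	int ok;

	assert(dir != NULL && name != NULL && value != NULL);
	if (ksm_attr_path(path, sizeof(path), dir, name) < 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%ld", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

With these helpers, starting the scanner would look roughly like
ksm_attr_write(KSM_SYSFS_DIR, "pages_to_scan", 256), then
ksm_attr_write(KSM_SYSFS_DIR, "sleep", 5000), then
ksm_attr_write(KSM_SYSFS_DIR, "run", 1). The values 256 and 5000 are
arbitrary placeholders, not recommendations.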


One issue still to be discussed:
There was a suggestion to use madvise(SHAREABLE) instead of
ioctls to register memory that needs to be scanned by ksm.
Such a change is outside the area of ksm.c and would require adding
a new madvise api and changing some parts of the vm and the kernel
code, so the first thing to do is to decide whether we really want this.

I don't know of any other open issues.

Thanks.

This is from the first post:
(The kvm part, together with the kvm-userspace part, was posted with V1
about a week ago; whoever wants to test ksm may download the
patch from the lkml archive)

KSM is a linux driver that allows dynamically sharing identical memory
pages between one or more processes.

Unlike traditional page sharing, which is set up when the memory is
allocated, ksm does it dynamically after the memory was created.
Memory is periodically scanned; identical pages are identified and
merged.
The sharing is unnoticeable by the processes that use this memory.
(the shared pages are marked as readonly, and in case of a write
do_wp_page() takes care of creating a new copy of the page)

To find identical pages ksm uses an algorithm that is split into three
primary levels:

1) Ksm starts scanning the memory and calculates a checksum for each
   page that is registered to be scanned.
   (In the first round of scanning, ksm only calculates
    this checksum for all the pages)

2) Ksm goes over the whole memory again and recalculates the
   checksum of the pages; pages that are found to have the same
   checksum value as in the previous round are considered "pages that
   most likely won't change".
   Ksm inserts these pages into an RB-tree sorted by page content,
   called the "unstable tree". The tree is called unstable because
   the page contents might change while the pages are still inside
   the tree, and therefore the tree would become corrupted.
   Because of this problem ksm takes two more steps in addition to the
   checksum calculation:
   a) Ksm throws away and recreates the entire unstable tree on each
      round of memory scanning, so if there is corruption, it is fixed
      when the tree is rebuilt.
   b) Ksm uses an RB-tree, whose balancing is done by node color
      and not by content, so even if a page's content changes and the
      tree becomes corrupted, a search still takes the same amount of
      time.

3) In addition to the unstable tree, ksm holds another tree called the
   "stable tree": an RB-tree sorted by page content whose pages are
   all write protected, and therefore it can't get corrupted.
   Each time ksm finds two identical pages using the unstable tree,
   it creates a new write-protected shared page, and this page is
   inserted into the stable tree and kept there. The
   stable tree, unlike the unstable tree, is never thrown away, so each
   page that we find is kept inside it.

Taking into account the three levels described above, the algorithm
works like this:

search primary tree (sorted by entire page contents, pages write protected)
- if match found, merge
- if no match found...
  - search secondary tree (sorted by entire page contents, pages not write
    protected)
    - if match found, merge
      - remove from secondary tree and insert merged page into primary tree
    - if no match found...
      - checksum
        - if checksum hasn't changed
          - insert into secondary tree
        - if it has, store updated checksum (note: first time this page
          is handled it won't have a checksum, so checksum will appear
          as "changed", so it takes two passes w/ no other matches to
          get into secondary tree)
          - do not insert into any tree, will see it again on next pass

The basic idea of this algorithm is that even if the unstable tree doesn't
promise that we will find two identical pages in the first round, we will
probably find them in the second or the third or the tenth round;
then, once we have found these two identical pages, we insert
them into the stable tree, where they are protected forever.
So the whole idea of the unstable tree is just to build the stable tree,
and then we find the identical pages using it.
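
The scan order above can be sketched in userspace as a small simulation.
This is only a toy model under stated assumptions: both trees are replaced by
linear arrays instead of RB-trees, the page size is 64 bytes, FNV-1a stands in
for the driver's checksum, and shared pages are never written to again. The
names page_item, scanner, page_init and scan_pass are hypothetical and do not
come from mm/ksm.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ   64	/* toy page size */
#define MAX_PAGES 32

struct page_item {
	unsigned char data[PAGE_SZ];
	uint32_t checksum;
	int has_checksum;
	int stable_id;		/* index into stable[], -1 while unshared */
};

struct scanner {
	unsigned char stable[MAX_PAGES][PAGE_SZ];	/* "stable tree" */
	int nstable;
	int unstable[MAX_PAGES];	/* "unstable tree": page indexes */
	int nunstable;
};

static void page_init(struct page_item *p, unsigned char fill)
{
	memset(p->data, fill, PAGE_SZ);
	p->checksum = 0;
	p->has_checksum = 0;
	p->stable_id = -1;
}

/* FNV-1a, standing in for whatever checksum the driver uses. */
static uint32_t page_checksum(const unsigned char *d)
{
	uint32_t h = 2166136261u;
	int i;

	for (i = 0; i < PAGE_SZ; i++)
		h = (h ^ d[i]) * 16777619u;
	return h;
}

static int stable_find(const struct scanner *s, const unsigned char *d)
{
	int i;

	for (i = 0; i < s->nstable; i++)
		if (!memcmp(s->stable[i], d, PAGE_SZ))
			return i;
	return -1;
}

static int unstable_find(const struct scanner *s, const struct page_item *pg,
			 const unsigned char *d)
{
	int i;

	for (i = 0; i < s->nunstable; i++)
		if (!memcmp(pg[s->unstable[i]].data, d, PAGE_SZ))
			return i;
	return -1;
}

/* One full scan round; returns how many pages became shared. */
static int scan_pass(struct scanner *s, struct page_item *pg, int npages)
{
	int i, id, slot, merged = 0;

	assert(npages <= MAX_PAGES);
	s->nunstable = 0;	/* the unstable tree is rebuilt every round */
	for (i = 0; i < npages; i++) {
		struct page_item *p = &pg[i];
		uint32_t sum;

		if (p->stable_id >= 0)
			continue;	/* already shared */
		id = stable_find(s, p->data);
		if (id >= 0) {		/* match in the stable tree: merge */
			p->stable_id = id;
			merged++;
			continue;
		}
		slot = unstable_find(s, pg, p->data);
		if (slot >= 0) {	/* match in the unstable tree: promote */
			memcpy(s->stable[s->nstable], p->data, PAGE_SZ);
			pg[s->unstable[slot]].stable_id = s->nstable;
			p->stable_id = s->nstable++;
			s->unstable[slot] = s->unstable[--s->nunstable];
			merged += 2;
			continue;
		}
		sum = page_checksum(p->data);
		if (p->has_checksum && p->checksum == sum) {
			s->unstable[s->nunstable++] = i;
		} else {		/* new or changed: just remember it */
			p->checksum = sum;
			p->has_checksum = 1;
		}
	}
	return merged;
}
```

In this model the first round only records checksums, so nothing can be
merged before the second round, which matches the note in the outline about
needing two passes before a page reaches the secondary tree.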

The current implementation can be improved a lot:
we don't have to calculate an expensive checksum; we can just use the host
dirty bit.

Currently we don't support swapping of shared pages (other pages that are
not shared can be swapped, i.e. all the pages that we didn't find to be
identical to other pages).

While walking the tree we keep calling get_user_pages(); we can optimize
this by saving the pfn and using mmu notifiers to know when the virtual
address mapping has changed.

We currently scan only programs that were registered to be used by ksm; we
would later want to add the ability to tell ksm to scan PIDs (so you can
scan closed binary applications as well).

Right now ksm scanning is done by just one thread; support for multiple
scanners might be needed.

This driver is very useful for KVM, for example when running multiple guest
operating systems of the same type.
(For desktop workloads we have achieved more than x2 memory overcommit
(more like x3))

This driver has found users other than KVM, for example CERN,
Fons Rademakers:
"on many-core machines we run one large detector simulation program per core.
These simulation programs are identical but run each in their own process and
need about 2 - 2.5 GB RAM.
We typically buy machines with 2GB RAM per core and so have a problem to run
one of these programs per core.
Of the 2 - 2.5 GB about 700MB is identical data in the form of magnetic field
maps, detector geometry, etc.
Currently people have been trying to start one program, initialize the geometry
and field maps and then fork it N times, to have the data shared.
With KSM this would be done automatically by the system so it sounded extremely
attractive when Andrea presented it."

I am sending another series of patches for the kvm kernel and kvm-userspace
that will allow users of kvm to test ksm with it.
The kvm patches apply to Avi's git tree.


Izik Eidus (4):
  MMU_NOTIFIERS: add set_pte_at_notify()
  add page_wrprotect(): write protecting page.
  add replace_page(): change the page pte is pointing to.
  add ksm kernel shared memory driver.

 include/linux/ksm.h          |   48 ++
 include/linux/miscdevice.h   |    1 +
 include/linux/mm.h           |    5 +
 include/linux/mmu_notifier.h |   34 +
 include/linux/rmap.h         |   11 +
 mm/Kconfig                   |    6 +
 mm/Makefile                  |    1 +
 mm/ksm.c                     | 1668 ++++++++++++++++++++++++++++++++++++++++++
 mm/memory.c                  |   90 +++-
 mm/mmu_notifier.c            |   20 +
 mm/rmap.c                    |  139 ++++
 11 files changed, 2021 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/ksm.h
 create mode 100644 mm/ksm.c


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] MMU_NOTIFIERS: add set_pte_at_notify()
  2009-04-04 14:35 [PATCH 0/4] ksm - dynamic page sharing driver for linux v2 Izik Eidus
@ 2009-04-04 14:35 ` Izik Eidus
  2009-04-04 14:35   ` [PATCH 2/4] add page_wrprotect(): write protecting page Izik Eidus
  2009-04-06  7:04 ` [PATCH 0/4] ksm - dynamic page sharing driver for linux v2 Nick Piggin
  2009-04-07 13:57 ` Andrea Arcangeli
  2 siblings, 1 reply; 19+ messages in thread
From: Izik Eidus @ 2009-04-04 14:35 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, kvm, linux-mm, avi, aarcange, chrisw, mtosatti,
	hugh, kamezawa.hiroyu, Izik Eidus

This macro allows setting the pte in the shadow page tables directly,
instead of flushing the shadow page table entry and then taking a vmexit in
order to set it.

This function is an optimization for kvm/users of mmu_notifiers for COW
pages. It is useful for kvm when ksm is used because it allows kvm
not to have to receive a VMEXIT and only then map the shared page into
the mmu shadow pages, but instead to map it directly at the same time
linux maps the page into the host page table.

This mmu notifier macro works by calling a callback that maps
the physical page directly into the shadow page tables.

(users of mmu_notifiers that didn't implement the change_pte() callback
used by set_pte_at_notify() will just receive the
mmu_notifier_invalidate_page callback)

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 include/linux/mmu_notifier.h |   34 ++++++++++++++++++++++++++++++++++
 mm/memory.c                  |   10 ++++++++--
 mm/mmu_notifier.c            |   20 ++++++++++++++++++++
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index b77486d..8bb245f 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -61,6 +61,15 @@ struct mmu_notifier_ops {
 				 struct mm_struct *mm,
 				 unsigned long address);
 
+	/* 
+	* change_pte is called when the pte mapping to a page is changed,
+	* for example when ksm remaps a pte to point to a new shared page.
+	*/
+	void (*change_pte)(struct mmu_notifier *mn,
+			   struct mm_struct *mm,
+			   unsigned long address,
+			   pte_t pte);
+
 	/*
 	 * Before this is invoked any secondary MMU is still ok to
 	 * read/write to the page previously pointed to by the Linux
@@ -154,6 +163,8 @@ extern void __mmu_notifier_mm_destroy(struct mm_struct *mm);
 extern void __mmu_notifier_release(struct mm_struct *mm);
 extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
 					  unsigned long address);
+extern void __mmu_notifier_change_pte(struct mm_struct *mm, 
+				      unsigned long address, pte_t pte);
 extern void __mmu_notifier_invalidate_page(struct mm_struct *mm,
 					  unsigned long address);
 extern void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
@@ -175,6 +186,13 @@ static inline int mmu_notifier_clear_flush_young(struct mm_struct *mm,
 	return 0;
 }
 
+static inline void mmu_notifier_change_pte(struct mm_struct *mm,
+					   unsigned long address, pte_t pte)
+{
+	if (mm_has_notifiers(mm))
+		__mmu_notifier_change_pte(mm, address, pte);
+}
+
 static inline void mmu_notifier_invalidate_page(struct mm_struct *mm,
 					  unsigned long address)
 {
@@ -236,6 +254,16 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
 	__young;							\
 })
 
+#define set_pte_at_notify(__mm, __address, __ptep, __pte)		\
+({									\
+	struct mm_struct *___mm = __mm;					\
+	unsigned long ___address = __address;				\
+	pte_t ___pte = __pte;						\
+									\
+	set_pte_at(___mm, ___address, __ptep, ___pte);			\
+	mmu_notifier_change_pte(___mm, ___address, ___pte);		\
+})
+
 #else /* CONFIG_MMU_NOTIFIER */
 
 static inline void mmu_notifier_release(struct mm_struct *mm)
@@ -248,6 +276,11 @@ static inline int mmu_notifier_clear_flush_young(struct mm_struct *mm,
 	return 0;
 }
 
+static inline void mmu_notifier_change_pte(struct mm_struct *mm,
+					   unsigned long address, pte_t pte)
+{
+}
+
 static inline void mmu_notifier_invalidate_page(struct mm_struct *mm,
 					  unsigned long address)
 {
@@ -273,6 +306,7 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
 
 #define ptep_clear_flush_young_notify ptep_clear_flush_young
 #define ptep_clear_flush_notify ptep_clear_flush
+#define set_pte_at_notify set_pte_at
 
 #endif /* CONFIG_MMU_NOTIFIER */
 
diff --git a/mm/memory.c b/mm/memory.c
index cf6873e..1e1a14b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2051,9 +2051,15 @@ gotten:
 		 * seen in the presence of one thread doing SMC and another
 		 * thread doing COW.
 		 */
-		ptep_clear_flush_notify(vma, address, page_table);
+		ptep_clear_flush(vma, address, page_table);
 		page_add_new_anon_rmap(new_page, vma, address);
-		set_pte_at(mm, address, page_table, entry);
+		/*
+		 * We call the notify macro here because when a secondary mmu
+		 * page table is in use, like the kvm shadow page tables, we want
+		 * the new page to be mapped directly into the secondary page
+		 * table
+		 */
+		set_pte_at_notify(mm, address, page_table, entry);
 		update_mmu_cache(vma, address, entry);
 		if (old_page) {
 			/*
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 5f4ef02..c3e8779 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -99,6 +99,26 @@ int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
 	return young;
 }
 
+void __mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address,
+			       pte_t pte)
+{
+	struct mmu_notifier *mn;
+	struct hlist_node *n;
+
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_mm->list, hlist) {
+		if (mn->ops->change_pte)
+			mn->ops->change_pte(mn, mm, address, pte);
+		/*
+		 * Some drivers don't have change_pte, and therefore we must
+		 * call invalidate_page in that case.
+		 */
+		else if (mn->ops->invalidate_page)
+			mn->ops->invalidate_page(mn, mm, address);
+	}
+	rcu_read_unlock();
+}
+
 void __mmu_notifier_invalidate_page(struct mm_struct *mm,
 					  unsigned long address)
 {
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/4] add page_wrprotect(): write protecting page.
  2009-04-04 14:35 ` [PATCH 1/4] MMU_NOTIFIERS: add set_pte_at_notify() Izik Eidus
@ 2009-04-04 14:35   ` Izik Eidus
  2009-04-04 14:35     ` [PATCH 3/4] add replace_page(): change the page pte is pointing to Izik Eidus
  0 siblings, 1 reply; 19+ messages in thread
From: Izik Eidus @ 2009-04-04 14:35 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, kvm, linux-mm, avi, aarcange, chrisw, mtosatti,
	hugh, kamezawa.hiroyu, Izik Eidus

This patch adds a new function called page_wrprotect().
page_wrprotect() takes a page and marks all the ptes that
point to it as read-only.

The function works by walking the rmap of the page and setting
each pte related to the page as read-only.

The odirect_sync parameter is used to protect against possible races
with O_DIRECT while we are marking the ptes as read-only,
as noted by Andrea Arcangeli:

"While thinking at get_user_pages_fast I figured another worse way
things can go wrong with ksm and o_direct: think a thread writing
constantly to the last 512bytes of a page, while another thread read
and writes to/from the first 512bytes of the page. We can lose
O_DIRECT reads, the very moment we mark any pte wrprotected..."
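
For illustration only (this is not part of the patch): the reference check
that page_wrprotect_one() performs can be modelled in user space roughly as
below. The names toy_page and may_wrprotect are hypothetical; only the
comparison of mapcount + count_offset against the page count mirrors the
code in this patch.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for the two counters of struct page. */
struct toy_page {
	int count;	/* all references: mappings, caller pins, I/O pins */
	int mapcount;	/* references held by ptes that map the page */
};

/*
 * Mirrors the test in page_wrprotect_one(): every reference must be
 * explained either by a pte mapping or by a get_page() done by the caller
 * (count_offset). Any extra reference may be an in-flight O_DIRECT, so in
 * that case the pte must not be write protected.
 */
static bool may_wrprotect(const struct toy_page *page, int count_offset)
{
	return page->mapcount + count_offset == page->count;
}
```

With two mappings and one caller reference, a count of 3 passes the check,
while a count of 4 (one unexplained reference) makes the function refuse,
which corresponds to *odirect_sync being set to 0 in the patch.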

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 include/linux/rmap.h |   11 ++++
 mm/rmap.c            |  139 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 150 insertions(+), 0 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b35bc0e..469376d 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -118,6 +118,10 @@ static inline int try_to_munlock(struct page *page)
 }
 #endif
 
+#if defined(CONFIG_KSM) || defined(CONFIG_KSM_MODULE)
+int page_wrprotect(struct page *page, int *odirect_sync, int count_offset);
+#endif
+
 #else	/* !CONFIG_MMU */
 
 #define anon_vma_init()		do {} while (0)
@@ -132,6 +136,13 @@ static inline int page_mkclean(struct page *page)
 	return 0;
 }
 
+#if defined(CONFIG_KSM) || defined(CONFIG_KSM_MODULE)
+static inline int page_wrprotect(struct page *page, int *odirect_sync,
+				 int count_offset)
+{
+	return 0;
+}
+#endif
 
 #endif	/* CONFIG_MMU */
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 1652166..95c55ea 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -585,6 +585,145 @@ int page_mkclean(struct page *page)
 }
 EXPORT_SYMBOL_GPL(page_mkclean);
 
+#if defined(CONFIG_KSM) || defined(CONFIG_KSM_MODULE)
+
+static int page_wrprotect_one(struct page *page, struct vm_area_struct *vma,
+			      int *odirect_sync, int count_offset)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	unsigned long address;
+	pte_t *pte;
+	spinlock_t *ptl;
+	int ret = 0;
+
+	address = vma_address(page, vma);
+	if (address == -EFAULT)
+		goto out;
+
+	pte = page_check_address(page, mm, address, &ptl, 0);
+	if (!pte)
+		goto out;
+
+	if (pte_write(*pte)) {
+		pte_t entry;
+
+		flush_cache_page(vma, address, pte_pfn(*pte));
+		/*
+		 * This is tricky: when get_user_pages_fast() runs it doesn't
+		 * take any lock, therefore the check that we are going to make
+		 * of the page count against the mapcount is racy and
+		 * O_DIRECT can happen right after the check.
+		 * So we clear the pte and flush the tlb before the check;
+		 * this assures us that no O_DIRECT can happen after the check
+		 * or in the middle of the check.
+		 */
+		entry = ptep_clear_flush(vma, address, pte);
+		/*
+		 * Check that no O_DIRECT or similar I/O is in progress on the
+		 * page
+		 */
+		if ((page_mapcount(page) + count_offset) != page_count(page)) {
+			*odirect_sync = 0;
+			set_pte_at_notify(mm, address, pte, entry);
+			goto out_unlock;
+		}
+		entry = pte_wrprotect(entry);
+		set_pte_at_notify(mm, address, pte, entry);
+	}
+	ret = 1;
+
+out_unlock:
+	pte_unmap_unlock(pte, ptl);
+out:
+	return ret;
+}
+
+static int page_wrprotect_file(struct page *page, int *odirect_sync,
+			       int count_offset)
+{
+	struct address_space *mapping;
+	struct prio_tree_iter iter;
+	struct vm_area_struct *vma;
+	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	int ret = 0;
+
+	mapping = page_mapping(page);
+	if (!mapping)
+		return ret;
+
+	spin_lock(&mapping->i_mmap_lock);
+
+	vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff)
+		ret += page_wrprotect_one(page, vma, odirect_sync,
+					  count_offset);
+
+	spin_unlock(&mapping->i_mmap_lock);
+
+	return ret;
+}
+
+static int page_wrprotect_anon(struct page *page, int *odirect_sync,
+			       int count_offset)
+{
+	struct vm_area_struct *vma;
+	struct anon_vma *anon_vma;
+	int ret = 0;
+
+	anon_vma = page_lock_anon_vma(page);
+	if (!anon_vma)
+		return ret;
+
+	/*
+	 * If the page is inside the swap cache, its _count number was
+	 * increased by one, therefore we have to increase count_offset by one.
+	 */
+	if (PageSwapCache(page))
+		count_offset++;
+
+	list_for_each_entry(vma, &anon_vma->head, anon_vma_node)
+		ret += page_wrprotect_one(page, vma, odirect_sync,
+					  count_offset);
+
+	page_unlock_anon_vma(anon_vma);
+
+	return ret;
+}
+
+/**
+ * page_wrprotect - set all ptes pointing to a page as readonly
+ * @page:         the page to set as readonly
+ * @odirect_sync: boolean value that is set to 0 when some of the ptes were not
+ *                marked as readonly because page_wrprotect_one() was not able
+ *                to mark these ptes as readonly without opening a window to a
+ *                race with odirect
+ * @count_offset: number of times page_wrprotect() caller had called get_page()
+ *                on the page
+ *
+ * returns the number of ptes which were marked as readonly.
+ * (ptes that were readonly before this function was called are counted as well)
+ */
+int page_wrprotect(struct page *page, int *odirect_sync, int count_offset)
+{
+	int ret = 0;
+
+	/*
+	 * Page lock is needed for anon pages for the PageSwapCache check,
+	 * and for page_mapping for filebacked pages
+	 */
+	BUG_ON(!PageLocked(page));
+
+	*odirect_sync = 1;
+	if (PageAnon(page))
+		ret = page_wrprotect_anon(page, odirect_sync, count_offset);
+	else
+		ret = page_wrprotect_file(page, odirect_sync, count_offset);
+
+	return ret;
+}
+EXPORT_SYMBOL(page_wrprotect);
+
+#endif
+
 /**
  * __page_set_anon_rmap - setup new anonymous rmap
  * @page:	the page to add the mapping to
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/4] add replace_page(): change the page pte is pointing to.
  2009-04-04 14:35   ` [PATCH 2/4] add page_wrprotect(): write protecting page Izik Eidus
@ 2009-04-04 14:35     ` Izik Eidus
  2009-04-04 14:35       ` [PATCH 4/4] add ksm kernel shared memory driver Izik Eidus
  0 siblings, 1 reply; 19+ messages in thread
From: Izik Eidus @ 2009-04-04 14:35 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, kvm, linux-mm, avi, aarcange, chrisw, mtosatti,
	hugh, kamezawa.hiroyu, Izik Eidus

replace_page() allows changing the mapping of a pte from one physical page
to a different physical page.

This function works by removing oldpage from the rmap and calling
put_page() on it, and by setting the pte to point to newpage and
inserting it into the rmap using page_add_file_rmap().

Note: newpage must be a non-anonymous page. The reason for this is:
replace_page() is built to allow mapping one page at more than one
virtual address, the mapping of this page can happen at different
offsets inside each vma, and therefore we cannot trust page->index
anymore.

The side effect of this issue is that newpage cannot be anything but
a kernel allocated page that is not swappable.
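
As a rough illustration (not part of the patch), the reference accounting
described above can be modelled in user space as follows. toy_page, toy_pte
and toy_replace_page are hypothetical names, and the model returns -1 where
the real function returns -EFAULT; the comments name the kernel call each
step stands in for.

```c
/* Hypothetical user-space stand-ins for struct page and a pte. */
struct toy_page {
	int count;	/* total references to the page */
	int mapcount;	/* number of ptes mapping the page */
};

struct toy_pte {
	struct toy_page *page;	/* the page this pte currently maps */
};

/*
 * Toy model of the accounting done by replace_page(): take a reference
 * and an rmap entry on newpage, switch the pte, then drop the rmap entry
 * and the reference that the pte held on oldpage.
 */
static int toy_replace_page(struct toy_pte *pte, struct toy_page *oldpage,
			    struct toy_page *newpage)
{
	if (pte->page != oldpage)
		return -1;	/* pte changed, like a failed pte_same() */

	newpage->count++;	/* get_page(newpage) */
	newpage->mapcount++;	/* page_add_file_rmap(newpage) */
	pte->page = newpage;	/* ptep_clear_flush() + set_pte_at_notify() */
	oldpage->mapcount--;	/* page_remove_rmap(oldpage) */
	oldpage->count--;	/* put_page(oldpage) */
	return 0;
}
```

The model leaves out locking, cache and TLB flushing, and the rss counter
update; it only shows which page gains and which page loses a reference.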

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 include/linux/mm.h |    5 +++
 mm/memory.c        |   80 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+), 0 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index bff1f0d..7a831ce 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1240,6 +1240,11 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
 
+#if defined(CONFIG_KSM) || defined(CONFIG_KSM_MODULE)
+int replace_page(struct vm_area_struct *vma, struct page *oldpage,
+		 struct page *newpage, pte_t orig_pte, pgprot_t prot);
+#endif
+
 struct page *follow_page(struct vm_area_struct *, unsigned long address,
 			unsigned int foll_flags);
 #define FOLL_WRITE	0x01	/* check pte is writable */
diff --git a/mm/memory.c b/mm/memory.c
index 1e1a14b..d6e53c2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1567,6 +1567,86 @@ int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vm_insert_mixed);
 
+#if defined(CONFIG_KSM) || defined(CONFIG_KSM_MODULE)
+
+/**
+ * replace_page - replace page in vma with new page
+ * @vma:      vma that holds the pte pointing to oldpage.
+ * @oldpage:  the page we are replacing with newpage
+ * @newpage:  the page we replace oldpage with
+ * @orig_pte: the original value of the pte
+ * @prot: page protection bits
+ *
+ * Returns 0 on success, -EFAULT on failure.
+ *
+ * Note: @newpage must not be an anonymous page because replace_page() does
+ * not change the mapping of @newpage to have the same values as @oldpage.
+ * @newpage can be mapped in several vmas at different offsets (page->index).
+ */
+int replace_page(struct vm_area_struct *vma, struct page *oldpage,
+		 struct page *newpage, pte_t orig_pte, pgprot_t prot)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *ptep;
+	spinlock_t *ptl;
+	unsigned long addr;
+	int ret;
+
+	BUG_ON(PageAnon(newpage));
+
+	ret = -EFAULT;
+	addr = page_address_in_vma(oldpage, vma);
+	if (addr == -EFAULT)
+		goto out;
+
+	pgd = pgd_offset(mm, addr);
+	if (!pgd_present(*pgd))
+		goto out;
+
+	pud = pud_offset(pgd, addr);
+	if (!pud_present(*pud))
+		goto out;
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd))
+		goto out;
+
+	ptep = pte_offset_map_lock(mm, pmd, addr, &ptl);
+	if (!ptep)
+		goto out;
+
+	if (!pte_same(*ptep, orig_pte)) {
+		pte_unmap_unlock(ptep, ptl);
+		goto out;
+	}
+
+	ret = 0;
+	get_page(newpage);
+	page_add_file_rmap(newpage);
+
+	flush_cache_page(vma, addr, pte_pfn(*ptep));
+	ptep_clear_flush(vma, addr, ptep);
+	set_pte_at_notify(mm, addr, ptep, mk_pte(newpage, prot));
+
+	page_remove_rmap(oldpage);
+	if (PageAnon(oldpage)) {
+		dec_mm_counter(mm, anon_rss);
+		inc_mm_counter(mm, file_rss);
+	}
+	put_page(oldpage);
+
+	pte_unmap_unlock(ptep, ptl);
+
+out:
+	return ret;
+}
+EXPORT_SYMBOL_GPL(replace_page);
+
+#endif
+
 /*
  * maps a range of physical memory into the requested pages. the old
  * mappings are removed. any references to nonexistent pages results
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/4] add ksm kernel shared memory driver.
  2009-04-04 14:35     ` [PATCH 3/4] add replace_page(): change the page pte is pointing to Izik Eidus
@ 2009-04-04 14:35       ` Izik Eidus
  2009-04-06  9:13         ` Andrey Panin
  0 siblings, 1 reply; 19+ messages in thread
From: Izik Eidus @ 2009-04-04 14:35 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, kvm, linux-mm, avi, aarcange, chrisw, mtosatti,
	hugh, kamezawa.hiroyu, Izik Eidus

Ksm is a driver that allows merging identical pages between one or more
applications in a way that is invisible to the applications that use it.
Pages that are merged are marked as read-only and are COWed when any
application tries to change them.

Ksm is used for cases where using fork() is not suitable;
one of these cases is where the pages of the application keep changing
dynamically and the application cannot know in advance which pages are
going to be identical.

Ksm works by walking over the memory pages of the applications it
scans in order to find identical pages.
It uses two sorted data structures, called the stable and unstable trees,
to find the identical pages efficiently.

When ksm finds two identical pages, it marks them as read-only and merges
them into a single page.
After the pages are marked as read-only and merged into one page, Linux
treats these pages as normal copy-on-write pages and gives the writer its
own copy when a write access happens to them.

Ksm scans only memory areas that were registered to be scanned by it.
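
To make the idea of a tree "sorted by page contents" concrete, here is a
minimal user-space sketch (not taken from the patch). It uses a plain,
unbalanced binary search tree instead of the kernel rbtree, and it assumes
that whole pages are ordered with memcmp(); toy_node, toy_tree_search,
toy_tree_insert and TOY_PAGE_SIZE are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096	/* assumed page size for the sketch */

struct toy_node {
	const unsigned char *page;	/* TOY_PAGE_SIZE bytes, not owned */
	struct toy_node *left, *right;
};

/* Return the node whose page has identical contents, or NULL. */
static struct toy_node *toy_tree_search(struct toy_node *root,
					const unsigned char *page)
{
	while (root) {
		int cmp = memcmp(page, root->page, TOY_PAGE_SIZE);

		if (cmp < 0)
			root = root->left;
		else if (cmp > 0)
			root = root->right;
		else
			return root;
	}
	return NULL;
}

/*
 * Insert node into the tree. If a node with identical contents is already
 * there, return it and leave the tree unchanged; otherwise return NULL.
 * The walk is the same as in toy_tree_search().
 */
static struct toy_node *toy_tree_insert(struct toy_node **root,
					struct toy_node *node)
{
	while (*root) {
		int cmp = memcmp(node->page, (*root)->page, TOY_PAGE_SIZE);

		if (cmp < 0)
			root = &(*root)->left;
		else if (cmp > 0)
			root = &(*root)->right;
		else
			return *root;
	}
	node->left = node->right = NULL;
	*root = node;
	return NULL;
}
```

In this model a scanner would first call toy_tree_search() on the stable
tree and, only when that misses, toy_tree_insert() on the unstable tree; a
non-NULL result from either call is a merge candidate.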

Ksm api:

KSM_GET_API_VERSION:
Gives userspace the api version of the module.

KSM_CREATE_SHARED_MEMORY_AREA:
Creates a shared memory region fd, which later allows the user to register
and remove the memory regions to scan by using:
KSM_REGISTER_MEMORY_REGION and KSM_REMOVE_MEMORY_REGION

KSM_REGISTER_MEMORY_REGION:
Registers a userspace virtual address range to be scanned by ksm.
This ioctl uses the ksm_memory_region structure:
ksm_memory_region:
__u32 npages;
        number of pages to share inside this memory region.
__u32 pad;
__u64 addr;
        the beginning of the virtual address of this region.
__u64 reserved_bits;
        reserved bits for future usage.

KSM_REMOVE_MEMORY_REGION:
Removes the memory region from ksm.
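
A possible userspace caller of this api might look like the sketch below.
It is only an illustration: the structure and ioctl numbers are copied from
include/linux/ksm.h in this patch so that the example builds without the
patched headers, ksm_register_region() is a hypothetical helper name, and
the code assumes the SMA fd stays usable after /dev/ksm itself is closed.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Definitions copied from include/linux/ksm.h in this patch. */
struct ksm_memory_region {
	uint32_t npages;	/* number of pages to share */
	uint32_t pad;
	uint64_t addr;		/* the beginning of the virtual address */
	uint64_t reserved_bits;
};

#define KSMIO 0xAB
#define KSM_CREATE_SHARED_MEMORY_AREA	_IO(KSMIO, 0x01)
#define KSM_REGISTER_MEMORY_REGION	_IOW(KSMIO, 0x20, \
					     struct ksm_memory_region)

/*
 * Hypothetical helper: register npages pages starting at addr for
 * scanning. Returns the SMA fd on success and -1 on failure. The caller
 * has to keep the returned fd open, since releasing it removes the
 * registered regions again.
 */
static int ksm_register_region(void *addr, uint32_t npages)
{
	struct ksm_memory_region region = {
		.npages = npages,
		.addr = (uint64_t)(uintptr_t)addr,
	};
	int ksm_fd, sma_fd;

	ksm_fd = open("/dev/ksm", O_RDWR);
	if (ksm_fd < 0)
		return -1;	/* driver not loaded or no permission */

	sma_fd = ioctl(ksm_fd, KSM_CREATE_SHARED_MEMORY_AREA);
	close(ksm_fd);		/* assumption: the SMA fd is independent */
	if (sma_fd < 0)
		return -1;

	if (ioctl(sma_fd, KSM_REGISTER_MEMORY_REGION, &region) < 0) {
		close(sma_fd);
		return -1;
	}
	return sma_fd;
}
```

On a kernel without this driver the open() of /dev/ksm fails and the helper
simply returns -1.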

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 include/linux/ksm.h        |   48 ++
 include/linux/miscdevice.h |    1 +
 mm/Kconfig                 |    6 +
 mm/Makefile                |    1 +
 mm/ksm.c                   | 1668 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 1724 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/ksm.h
 create mode 100644 mm/ksm.c

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
new file mode 100644
index 0000000..2c11e9a
--- /dev/null
+++ b/include/linux/ksm.h
@@ -0,0 +1,48 @@
+#ifndef __LINUX_KSM_H
+#define __LINUX_KSM_H
+
+/*
+ * Userspace interface for /dev/ksm - kernel shared memory
+ */
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+#include <asm/types.h>
+
+#define KSM_API_VERSION 1
+
+#define ksm_control_flags_run 1
+
+/* for KSM_REGISTER_MEMORY_REGION */
+struct ksm_memory_region {
+	__u32 npages; /* number of pages to share */
+	__u32 pad;
+	__u64 addr; /* the beginning of the virtual address */
+	__u64 reserved_bits;
+};
+
+#define KSMIO 0xAB
+
+/* ioctls for /dev/ksm */
+
+#define KSM_GET_API_VERSION              _IO(KSMIO,   0x00)
+/*
+ * KSM_CREATE_SHARED_MEMORY_AREA - create the shared memory region fd
+ */
+#define KSM_CREATE_SHARED_MEMORY_AREA    _IO(KSMIO,   0x01) /* return SMA fd */
+
+/* ioctls for SMA fds */
+
+/*
+ * KSM_REGISTER_MEMORY_REGION - register virtual address memory area to be
+ * scanned by ksm.
+ */
+#define KSM_REGISTER_MEMORY_REGION       _IOW(KSMIO,  0x20,\
+					      struct ksm_memory_region)
+/*
+ * KSM_REMOVE_MEMORY_REGION - remove virtual address memory area from ksm.
+ */
+#define KSM_REMOVE_MEMORY_REGION         _IO(KSMIO,   0x21)
+
+#endif
diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h
index beb6ec9..297c0bb 100644
--- a/include/linux/miscdevice.h
+++ b/include/linux/miscdevice.h
@@ -30,6 +30,7 @@
 #define HPET_MINOR		228
 #define FUSE_MINOR		229
 #define KVM_MINOR		232
+#define KSM_MINOR		233
 #define MISC_DYNAMIC_MINOR	255
 
 struct device;
diff --git a/mm/Kconfig b/mm/Kconfig
index b53427a..3f3fd04 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -223,3 +223,9 @@ config HAVE_MLOCKED_PAGE_BIT
 
 config MMU_NOTIFIER
 	bool
+
+config KSM
+	tristate "Enable KSM for page sharing"
+	help
+	  Enable the KSM kernel module to allow page sharing of equal pages
+	  among different tasks.
diff --git a/mm/Makefile b/mm/Makefile
index ec73c68..b885513 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -24,6 +24,7 @@ obj-$(CONFIG_SPARSEMEM_VMEMMAP) += sparse-vmemmap.o
 obj-$(CONFIG_TMPFS_POSIX_ACL) += shmem_acl.o
 obj-$(CONFIG_SLOB) += slob.o
 obj-$(CONFIG_MMU_NOTIFIER) += mmu_notifier.o
+obj-$(CONFIG_KSM) += ksm.o
 obj-$(CONFIG_PAGE_POISONING) += debug-pagealloc.o
 obj-$(CONFIG_SLAB) += slab.o
 obj-$(CONFIG_SLUB) += slub.o
diff --git a/mm/ksm.c b/mm/ksm.c
new file mode 100644
index 0000000..fb59a08
--- /dev/null
+++ b/mm/ksm.c
@@ -0,0 +1,1668 @@
+/*
+ * Memory merging driver for Linux
+ *
+ * This module enables dynamic sharing of identical pages found in different
+ * memory areas, even if they are not shared by fork()
+ *
+ * Copyright (C) 2008 Red Hat, Inc.
+ * Authors:
+ *	Izik Eidus
+ *	Andrea Arcangeli
+ *	Chris Wright
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+
+#include <linux/module.h>
+#include <linux/errno.h>
+#include <linux/mm.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+#include <linux/vmalloc.h>
+#include <linux/file.h>
+#include <linux/mman.h>
+#include <linux/sched.h>
+#include <linux/rwsem.h>
+#include <linux/pagemap.h>
+#include <linux/sched.h>
+#include <linux/rmap.h>
+#include <linux/spinlock.h>
+#include <linux/jhash.h>
+#include <linux/delay.h>
+#include <linux/kthread.h>
+#include <linux/wait.h>
+#include <linux/scatterlist.h>
+#include <linux/random.h>
+#include <linux/slab.h>
+#include <linux/swap.h>
+#include <linux/rbtree.h>
+#include <linux/anon_inodes.h>
+#include <linux/ksm.h>
+
+#include <asm/tlbflush.h>
+
+MODULE_AUTHOR("Red Hat, Inc.");
+MODULE_LICENSE("GPL");
+
+static int rmap_hash_size;
+module_param(rmap_hash_size, int, 0);
+MODULE_PARM_DESC(rmap_hash_size, "Hash table size for the reverse mapping");
+
+/*
+ * ksm_mem_slot - holds information for a userspace scanning range
+ * (the scanning for this region will be from addr until addr +
+ *  npages * PAGE_SIZE inside mm)
+ */
+struct ksm_mem_slot {
+	struct list_head link;
+	struct list_head sma_link;
+	struct mm_struct *mm;
+	unsigned long addr;	/* the beginning of the virtual address */
+	unsigned npages;	/* number of pages to share */
+};
+
+/*
+ * ksm_sma - shared memory area, each process has its own sma that contains the
+ * information about the slots that it owns
+ */
+struct ksm_sma {
+	struct list_head sma_slots;
+};
+
+/**
+ * struct ksm_scan - cursor for scanning
+ * @slot_index: the current slot we are scanning
+ * @page_index: the page inside the sma that is currently being scanned
+ *
+ * ksm uses it to know which pages it needs to scan next
+ */
+struct ksm_scan {
+	struct ksm_mem_slot *slot_index;
+	unsigned long page_index;
+};
+
+/*
+ * A few notes about the ksm scanning process (to make it easier to understand
+ * the data structures below):
+ *
+ * In order to reduce excessive scanning, ksm sorts the memory pages by their
+ * contents into a data structure that holds pointers to the pages.
+ *
+ * Since the contents of the pages may change at any moment, ksm can't just
+ * insert the pages into a normal sorted tree and expect it to find anything.
+ *
+ * For this purpose ksm uses two data structures: the stable and unstable
+ * trees. The stable tree holds pointers to all the merged pages (KsmPage)
+ * sorted by their contents. Because each such page has to be write-protected,
+ * searching this tree is guaranteed to work, and therefore this tree is
+ * called the stable tree.
+ *
+ * In addition to the stable tree, ksm uses another data structure called the
+ * unstable tree. This tree holds pointers to pages that have been found to
+ * be "unchanged for a period of time". The unstable tree sorts these pages by
+ * their contents, but given the fact that these pages are not
+ * write-protected, ksm can't trust a search of the unstable tree to always
+ * work.
+ * Because the unstable tree becomes corrupted when some of the pages inside
+ * it change, the tree is called unstable.
+ * Ksm deals with this problem in several ways:
+ * 1) The unstable tree is flushed every time ksm finishes scanning the whole
+ *    memory, and then the tree is rebuilt from the beginning.
+ * 2) Ksm only inserts into the unstable tree pages whose hash value has
+ *    not changed during the whole course of one circular scan of the
+ *    memory.
+ * 3) The unstable tree is a red-black tree, meaning its balancing is based on
+ *    the colors of the nodes and not their content. This assures that even
+ *    when the tree gets "corrupted" it won't get out of balance and the
+ *    timing of scanning stays the same. Also, searching and inserting nodes
+ *    in an rbtree use the same algorithm, therefore we have no overhead when
+ *    we flush the tree and rebuild it.
+ * 4) Ksm never flushes the stable tree. This means that even if it takes 10
+ *    attempts to find a page inside the unstable tree, as soon as we find it,
+ *    it will be secured inside the stable tree.
+ *    (When we scan a new page, we first compare it against the stable tree,
+ *     and then against the unstable tree.)
+ */
+
+struct rmap_item;
+
+/*
+ * tree_item - object of the stable and unstable trees
+ */
+struct tree_item {
+	struct rb_node node;
+	struct rmap_item *rmap_item;
+};
+
+/*
+ * rmap_item - object of the rmap_hash hash table
+ * (it is holding the previous hash value (oldindex),
+ *  pointer into the page_hash_item, and pointer into the tree_item)
+ */
+
+/**
+ * struct rmap_item - reverse mapping item for virtual addresses
+ * @link: link into the rmap_hash hash table.
+ * @mm: the memory structure the rmap_item is pointing to.
+ * @address: the virtual address the rmap_item is pointing to.
+ * @oldchecksum: old checksum of the page at the virtual address
+ * @stable_tree: when 1 rmap_item is used for stable_tree, 0 unstable tree
+ * @kpage_outside_tree: when 1 this rmap_item points to a kpage outside the tree
+ * @tree_item: pointer into the stable/unstable tree that holds the virtual
+ *             address that the rmap_item is pointing to.
+ * @next: the next rmap item inside the stable/unstable tree that is
+ *        found inside the same tree node.
+ */
+
+struct rmap_item {
+	struct hlist_node link;
+	struct mm_struct *mm;
+	unsigned long address;
+	unsigned int oldchecksum; /* old checksum value */
+	unsigned char stable_tree; /* 1 stable_tree 0 unstable tree */
+	unsigned char kpage_outside_tree;
+	struct tree_item *tree_item;
+	struct rmap_item *next;
+	struct rmap_item *prev;
+};
+
+/*
+ * slots is a linked list that holds all the memory regions that were
+ * registered to be scanned.
+ */
+static LIST_HEAD(slots);
+/*
+ * slots_lock protects against removing and adding memory regions while a
+ * scanner is in the middle of scanning.
+ */
+static DECLARE_RWSEM(slots_lock);
+
+/* The stable and unstable trees heads. */
+struct rb_root root_stable_tree = RB_ROOT;
+struct rb_root root_unstable_tree = RB_ROOT;
+
+
+/* The number of linked list members inside the hash table */
+static int nrmaps_hash;
+/* rmap_hash hash table */
+static struct hlist_head *rmap_hash;
+
+static struct kmem_cache *tree_item_cache;
+static struct kmem_cache *rmap_item_cache;
+
+/* the number of nodes inside the stable tree */
+static unsigned long nnodes_stable_tree;
+
+/* the number of kernel allocated pages outside the stable tree */
+static unsigned long nkpage_out_tree;
+
+static int kthread_sleep; /* sleep time of the kernel thread */
+static int kthread_pages_to_scan; /* npages to scan for the kernel thread */
+static int kthread_max_kernel_pages; /* number of unswappable pages allowed */
+static unsigned long ksm_pages_shared;
+static struct ksm_scan kthread_ksm_scan;
+static int ksmd_flags;
+static struct task_struct *kthread;
+static DECLARE_WAIT_QUEUE_HEAD(kthread_wait);
+static DECLARE_RWSEM(kthread_lock);
+
+
+static int ksm_slab_init(void)
+{
+	int ret = -ENOMEM;
+
+	tree_item_cache = KMEM_CACHE(tree_item, 0);
+	if (!tree_item_cache)
+		goto out;
+
+	rmap_item_cache = KMEM_CACHE(rmap_item, 0);
+	if (!rmap_item_cache)
+		goto out_free;
+
+	return 0;
+
+out_free:
+	kmem_cache_destroy(tree_item_cache);
+out:
+	return ret;
+}
+
+static void ksm_slab_free(void)
+{
+	kmem_cache_destroy(rmap_item_cache);
+	kmem_cache_destroy(tree_item_cache);
+}
+
+static inline struct tree_item *alloc_tree_item(void)
+{
+	return kmem_cache_zalloc(tree_item_cache, GFP_KERNEL);
+}
+
+static void free_tree_item(struct tree_item *tree_item)
+{
+	kmem_cache_free(tree_item_cache, tree_item);
+}
+
+static inline struct rmap_item *alloc_rmap_item(void)
+{
+	return kmem_cache_zalloc(rmap_item_cache, GFP_KERNEL);
+}
+
+static inline void free_rmap_item(struct rmap_item *rmap_item)
+{
+	kmem_cache_free(rmap_item_cache, rmap_item);
+}
+
+static unsigned long addr_in_vma(struct vm_area_struct *vma, struct page *page)
+{
+	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	unsigned long addr;
+
+	addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+	if (unlikely(addr < vma->vm_start || addr >= vma->vm_end))
+		return -EFAULT;
+	return addr;
+}
+
+static pte_t *get_pte(struct mm_struct *mm, unsigned long addr)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *ptep = NULL;
+
+	pgd = pgd_offset(mm, addr);
+	if (!pgd_present(*pgd))
+		goto out;
+
+	pud = pud_offset(pgd, addr);
+	if (!pud_present(*pud))
+		goto out;
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd))
+		goto out;
+
+	ptep = pte_offset_map(pmd, addr);
+out:
+	return ptep;
+}
+
+
+static int is_present_pte(struct mm_struct *mm, unsigned long addr)
+{
+	pte_t *ptep;
+	int r;
+
+	ptep = get_pte(mm, addr);
+	if (!ptep)
+		return 0;
+
+	r = pte_present(*ptep);
+	pte_unmap(ptep);
+
+	return r;
+}
+
+static int is_dirty_pte(struct mm_struct *mm, unsigned long addr)
+{
+	pte_t *ptep;
+	int r;
+
+	ptep = get_pte(mm, addr);
+	if (!ptep)
+		return 0;
+
+	r = pte_dirty(*ptep);
+	pte_unmap(ptep);
+
+	return r;
+}
+
+/*
+ * PageKsm - pages of this type are the write protected pages that ksm maps
+ * into multiple vmas (this is the "shared page").
+ * The page is allocated with alloc_page(), and every pte that points to it
+ * is always write protected (therefore its data content can't ever be changed)
+ * and this page can't be swapped.
+ */
+static inline int PageKsm(struct page *page, struct mm_struct *mm,
+			  unsigned long addr)
+{
+	/*
+	 * When ksm creates a new shared page, it creates a kernel allocated
+	 * page using alloc_page(), therefore this page is not anonymous. Taking
+	 * into account that ksm scans just anonymous pages, we can rely on the
+	 * fact that each time we see !PageAnon(page) we are hitting a shared
+	 * page. The !is_dirty_pte check protects against do_wp_page(), which
+	 * might keep the last COW page as file-backed, therefore we use the
+	 * dirty bit to know if such a thing happened to the page.
+	 * For more info about that look at reuse: in do_wp_page() memory.c
+	 * (It always sets the pte as dirty there)
+	 */
+	return !PageAnon(page) && !is_dirty_pte(mm, addr);
+}
+
+static int rmap_hash_init(void)
+{
+	if (!rmap_hash_size) {
+		struct sysinfo sinfo;
+
+		si_meminfo(&sinfo);
+		rmap_hash_size = sinfo.totalram / 10;
+	}
+	nrmaps_hash = rmap_hash_size;
+	rmap_hash = vmalloc(nrmaps_hash * sizeof(struct hlist_head));
+	if (!rmap_hash)
+		return -ENOMEM;
+	memset(rmap_hash, 0, nrmaps_hash * sizeof(struct hlist_head));
+	return 0;
+}
+
+static void rmap_hash_free(void)
+{
+	int i;
+	struct hlist_head *bucket;
+	struct hlist_node *node, *n;
+	struct rmap_item *rmap_item;
+
+	for (i = 0; i < nrmaps_hash; ++i) {
+		bucket = &rmap_hash[i];
+		hlist_for_each_entry_safe(rmap_item, node, n, bucket, link) {
+			hlist_del(&rmap_item->link);
+			free_rmap_item(rmap_item);
+		}
+	}
+	vfree(rmap_hash);
+}
+
+static inline u32 calc_checksum(struct page *page)
+{
+	u32 checksum;
+	void *addr = kmap_atomic(page, KM_USER0);
+	checksum = jhash(addr, PAGE_SIZE, 17);
+	kunmap_atomic(addr, KM_USER0);
+	return checksum;
+}
+
+/*
+ * Return rmap_item for a given virtual address.
+ */
+static struct rmap_item *get_rmap_item(struct mm_struct *mm, unsigned long addr)
+{
+	struct rmap_item *rmap_item;
+	struct hlist_head *bucket;
+	struct hlist_node *node;
+
+	bucket = &rmap_hash[addr % nrmaps_hash];
+	hlist_for_each_entry(rmap_item, node, bucket, link) {
+		if (mm == rmap_item->mm && rmap_item->address == addr) {
+			return rmap_item;
+		}
+	}
+	return NULL;
+}
+
+/*
+ * Remove an rmap_item from the stable or unstable tree.
+ * This function frees the rmap_item object, and if that rmap_item was
+ * inside the stable or unstable tree, it removes the link from there
+ * as well.
+ */
+static void remove_rmap_item_from_tree(struct rmap_item *rmap_item)
+{
+	struct tree_item *tree_item;
+
+	tree_item = rmap_item->tree_item;
+	rmap_item->tree_item = NULL;
+
+	if (rmap_item->stable_tree) {
+		ksm_pages_shared--;
+		if (rmap_item->prev) {
+			BUG_ON(rmap_item->prev->next != rmap_item);
+			rmap_item->prev->next = rmap_item->next;
+		}
+		if (rmap_item->next) {
+			BUG_ON(rmap_item->next->prev != rmap_item);
+			rmap_item->next->prev = rmap_item->prev;
+		}
+	} else if (rmap_item->kpage_outside_tree) {
+		ksm_pages_shared--;
+		nkpage_out_tree--;
+	}
+
+	if (tree_item) {
+		if (rmap_item->stable_tree) {
+			if (!rmap_item->next && !rmap_item->prev) {
+				rb_erase(&tree_item->node, &root_stable_tree);
+				free_tree_item(tree_item);
+				nnodes_stable_tree--;
+			} else if (!rmap_item->prev) {
+				tree_item->rmap_item = rmap_item->next;
+			} else {
+				tree_item->rmap_item = rmap_item->prev;
+			}
+		} else {
+			free_tree_item(tree_item);
+		}
+	}
+
+	hlist_del(&rmap_item->link);
+	free_rmap_item(rmap_item);
+}
+
+static void break_cow(struct mm_struct *mm, unsigned long addr)
+{
+	struct page *page[1];
+
+	down_read(&mm->mmap_sem);
+	if (get_user_pages(current, mm, addr, 1, 1, 0, page, NULL)) {
+			put_page(page[0]);
+	}
+	up_read(&mm->mmap_sem);
+}
+
+static void remove_page_from_tree(struct mm_struct *mm,
+				  unsigned long addr)
+{
+	struct rmap_item *rmap_item;
+
+	rmap_item = get_rmap_item(mm, addr);
+	if (!rmap_item)
+		return;
+
+	if (rmap_item->stable_tree) {
+		/* We are breaking all the KsmPages of area that is removed */
+		break_cow(mm, addr);
+	} else {
+		/*
+		 * If kpage_outside_tree is set, this item is a KsmPage outside
+		 * the stable tree, therefore we have to break the COW and
+		 * in addition we have to dec nkpage_out_tree.
+		 */
+		if (rmap_item->kpage_outside_tree)
+			break_cow(mm, addr);
+	}
+
+	remove_rmap_item_from_tree(rmap_item);
+}
+
+static int ksm_sma_ioctl_register_memory_region(struct ksm_sma *ksm_sma,
+						struct ksm_memory_region *mem)
+{
+	struct ksm_mem_slot *slot;
+	int ret = -EPERM;
+
+	slot = kzalloc(sizeof(struct ksm_mem_slot), GFP_KERNEL);
+	if (!slot) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	slot->mm = get_task_mm(current);
+	if (!slot->mm)
+		goto out_free;
+	slot->addr = mem->addr;
+	slot->npages = mem->npages;
+
+	down_write(&slots_lock);
+
+	list_add_tail(&slot->link, &slots);
+	list_add_tail(&slot->sma_link, &ksm_sma->sma_slots);
+
+	up_write(&slots_lock);
+	return 0;
+
+out_free:
+	kfree(slot);
+out:
+	return ret;
+}
+
+static void remove_mm_from_hash_and_tree(struct mm_struct *mm)
+{
+	struct ksm_mem_slot *slot;
+	int pages_count;
+
+	list_for_each_entry(slot, &slots, link)
+		if (slot->mm == mm)
+			break;
+	BUG_ON(!slot);
+
+	root_unstable_tree = RB_ROOT;
+	for (pages_count = 0; pages_count < slot->npages; ++pages_count)
+		remove_page_from_tree(mm, slot->addr +
+				      pages_count * PAGE_SIZE);
+	list_del(&slot->link);
+}
+
+static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma)
+{
+	struct ksm_mem_slot *slot, *node;
+
+	down_write(&slots_lock);
+	list_for_each_entry_safe(slot, node, &ksm_sma->sma_slots, sma_link) {
+		remove_mm_from_hash_and_tree(slot->mm);
+		mmput(slot->mm);
+		list_del(&slot->sma_link);
+		kfree(slot);
+	}
+	up_write(&slots_lock);
+	return 0;
+}
+
+static int ksm_sma_release(struct inode *inode, struct file *filp)
+{
+	struct ksm_sma *ksm_sma = filp->private_data;
+	int r;
+
+	r = ksm_sma_ioctl_remove_memory_region(ksm_sma);
+	kfree(ksm_sma);
+	return r;
+}
+
+static long ksm_sma_ioctl(struct file *filp,
+			  unsigned int ioctl, unsigned long arg)
+{
+	struct ksm_sma *sma = filp->private_data;
+	void __user *argp = (void __user *)arg;
+	int r = -EINVAL;
+
+	switch (ioctl) {
+	case KSM_REGISTER_MEMORY_REGION: {
+		struct ksm_memory_region ksm_memory_region;
+
+		r = -EFAULT;
+		if (copy_from_user(&ksm_memory_region, argp,
+				   sizeof(ksm_memory_region)))
+			goto out;
+		r = ksm_sma_ioctl_register_memory_region(sma,
+							 &ksm_memory_region);
+		break;
+	}
+	case KSM_REMOVE_MEMORY_REGION:
+		r = ksm_sma_ioctl_remove_memory_region(sma);
+		break;
+	}
+
+out:
+	return r;
+}
+
+static int memcmp_pages(struct page *page1, struct page *page2)
+{
+	char *addr1, *addr2;
+	int r;
+
+	addr1 = kmap_atomic(page1, KM_USER0);
+	addr2 = kmap_atomic(page2, KM_USER1);
+	r = memcmp(addr1, addr2, PAGE_SIZE);
+	kunmap_atomic(addr1, KM_USER0);
+	kunmap_atomic(addr2, KM_USER1);
+	return r;
+}
+
+/* pages_identical
+ * return 1 if identical, 0 otherwise.
+ */
+static inline int pages_identical(struct page *page1, struct page *page2)
+{
+	return !memcmp_pages(page1, page2);
+}
+
+/*
+ * try_to_merge_one_page - take two pages and merge them into one
+ * @mm: mm_struct that holds the vma pointing to oldpage
+ * @vma: the vma that holds the pte pointing to oldpage
+ * @oldpage: the page that we want to replace with newpage
+ * @newpage: the page that we want to map instead of oldpage
+ * @newprot: the new permission of the pte inside vma
+ * note:
+ * oldpage should be an anon page while newpage should be a file mapped page
+ *
+ * this function returns 0 if the pages were merged, 1 otherwise.
+ */
+static int try_to_merge_one_page(struct mm_struct *mm,
+				 struct vm_area_struct *vma,
+				 struct page *oldpage,
+				 struct page *newpage,
+				 pgprot_t newprot)
+{
+	int ret = 1;
+	int odirect_sync;
+	unsigned long page_addr_in_vma;
+	pte_t orig_pte, *orig_ptep;
+
+	get_page(newpage);
+	get_page(oldpage);
+
+	down_read(&mm->mmap_sem);
+
+	page_addr_in_vma = addr_in_vma(vma, oldpage);
+	if (page_addr_in_vma == -EFAULT)
+		goto out_unlock;
+
+	orig_ptep = get_pte(mm, page_addr_in_vma);
+	if (!orig_ptep)
+		goto out_unlock;
+	orig_pte = *orig_ptep;
+	pte_unmap(orig_ptep);
+	if (!pte_present(orig_pte))
+		goto out_unlock;
+	if (page_to_pfn(oldpage) != pte_pfn(orig_pte))
+		goto out_unlock;
+	/*
+	 * we need the page lock to read a stable PageSwapCache in
+	 * page_wrprotect()
+	 */
+	if (!trylock_page(oldpage))
+		goto out_unlock;
+	/*
+	 * page_wrprotect check if the page is swapped or in swap cache,
+	 * in the future we might want to run here if_present_pte and then
+	 * swap_free
+	 */
+	if (!page_wrprotect(oldpage, &odirect_sync, 2)) {
+		unlock_page(oldpage);
+		goto out_unlock;
+	}
+	unlock_page(oldpage);
+	if (!odirect_sync)
+		goto out_unlock;
+
+	orig_pte = pte_wrprotect(orig_pte);
+
+	if (pages_identical(oldpage, newpage))
+		ret = replace_page(vma, oldpage, newpage, orig_pte, newprot);
+
+out_unlock:
+	up_read(&mm->mmap_sem);
+	put_page(oldpage);
+	put_page(newpage);
+	return ret;
+}
+
+/*
+ * try_to_merge_two_pages - take two identical pages and prepare them to be
+ * merged into one page.
+ *
+ * this function returns 0 if we successfully mapped two identical pages into
+ * one page, 1 otherwise.
+ * (note: if we created a KsmPage and mapped one page into it but the second
+ *  page was not mapped, we consider it a failure and return 1)
+ */
+static int try_to_merge_two_pages(struct mm_struct *mm1, struct page *page1,
+				  struct mm_struct *mm2, struct page *page2,
+				  unsigned long addr1, unsigned long addr2)
+{
+	struct vm_area_struct *vma;
+	pgprot_t prot;
+	int ret = 1;
+
+	/*
+	 * If page2 isn't shared (it isn't PageKsm) we have to allocate a new
+	 * file mapped page and make the two ptes of mm1(page1) and mm2(page2)
+	 * point to it.  If page2 is shared, we can just make the pte of
+	 * mm1(page1) point to page2
+	 */
+	if (PageKsm(page2, mm2, addr2)) {
+		down_read(&mm1->mmap_sem);
+		vma = find_vma(mm1, addr1);
+		up_read(&mm1->mmap_sem);
+		if (!vma)
+			return ret;
+		prot = vma->vm_page_prot;
+		pgprot_val(prot) &= ~_PAGE_RW;
+		ret = try_to_merge_one_page(mm1, vma, page1, page2, prot);
+		if (!ret)
+			ksm_pages_shared++;
+	} else {
+		struct page *kpage;
+
+		/*
+		 * The number of the nodes inside the stable tree +
+		 * nkpage_out_tree is the same as the number kernel pages that
+		 * we hold.
+		 */
+		if (kthread_max_kernel_pages &&
+		    (nnodes_stable_tree + nkpage_out_tree) >=
+		    kthread_max_kernel_pages)
+			return ret;
+
+		kpage = alloc_page(GFP_HIGHUSER);
+		if (!kpage)
+			return ret;
+		down_read(&mm1->mmap_sem);
+		vma = find_vma(mm1, addr1);
+		up_read(&mm1->mmap_sem);
+		if (!vma) {
+			put_page(kpage);
+			return ret;
+		}
+		prot = vma->vm_page_prot;
+		pgprot_val(prot) &= ~_PAGE_RW;
+
+		copy_user_highpage(kpage, page1, addr1, vma);
+		ret = try_to_merge_one_page(mm1, vma, page1, kpage, prot);
+
+		if (!ret) {
+			down_read(&mm2->mmap_sem);
+			vma = find_vma(mm2, addr2);
+			up_read(&mm2->mmap_sem);
+			if (!vma) {
+				put_page(kpage);
+				break_cow(mm1, addr1);
+				ret = 1;
+				return ret;
+			}
+
+			prot = vma->vm_page_prot;
+			pgprot_val(prot) &= ~_PAGE_RW;
+
+			ret = try_to_merge_one_page(mm2, vma, page2, kpage,
+						    prot);
+			/*
+			 * If the second try_to_merge_one_page call failed,
+			 * we are left with a KsmPage that has just one pte
+			 * pointing to it; in this case we break it.
+			 */
+			if (ret) {
+				break_cow(mm1, addr1);
+			} else {
+				ksm_pages_shared += 2;
+			}
+		}
+		put_page(kpage);
+	}
+	return ret;
+}
+
+/*
+ * is_zapped_item - check if the page belonging to the rmap_item was zapped.
+ *
+ * This function checks whether the page that the virtual address inside
+ * rmap_item points to is still a KsmPage, and therefore whether we can trust
+ * the content of this page.
+ * Since this function already calls get_user_pages(), it returns the
+ * pointer to the page as an optimization.
+ */
+static int is_zapped_item(struct rmap_item *rmap_item,
+			  struct page **page)
+{
+	int ret = 0;
+
+	cond_resched();
+	if (is_present_pte(rmap_item->mm, rmap_item->address)) {
+		down_read(&rmap_item->mm->mmap_sem);
+		ret = get_user_pages(current, rmap_item->mm, rmap_item->address,
+				     1, 0, 0, page, NULL);
+		up_read(&rmap_item->mm->mmap_sem);
+	}
+
+	if (!ret)
+		return 1;
+
+	if (unlikely(!PageKsm(page[0], rmap_item->mm, rmap_item->address))) {
+		put_page(page[0]);
+		return 1;
+	}
+	return 0;
+}
+
+/*
+ * stable_tree_search - search for a page inside the stable tree
+ * @page: the page that we are searching for identical pages to.
+ * @page2: pointer to the identical page that we found inside the stable
+ *	   tree.
+ * @rmap_item: the reverse mapping item
+ *
+ * this function checks whether there is a page inside the stable tree
+ * with identical content to the page that we are scanning right now.
+ *
+ * this function returns the rmap_item pointer of the identical item if found,
+ * NULL otherwise.
+ */
+static struct rmap_item *stable_tree_search(struct page *page,
+					    struct page **page2,
+					    struct rmap_item *rmap_item)
+{
+	struct rb_node *node = root_stable_tree.rb_node;
+	struct tree_item *tree_item;
+	struct rmap_item *found_rmap_item;
+
+	while (node) {
+		int ret;
+
+		tree_item = rb_entry(node, struct tree_item, node);
+		found_rmap_item = tree_item->rmap_item;
+		while (found_rmap_item) {
+			BUG_ON(!found_rmap_item->stable_tree);
+			BUG_ON(!found_rmap_item->tree_item);
+			if (!rmap_item ||
+			     !(found_rmap_item->mm == rmap_item->mm &&
+			      found_rmap_item->address == rmap_item->address)) {
+				if (!is_zapped_item(found_rmap_item, page2))
+					break;
+				remove_rmap_item_from_tree(found_rmap_item);
+			}
+			found_rmap_item = found_rmap_item->next;
+		}
+		if (!found_rmap_item)
+			goto out_didnt_find;
+
+		/*
+		 * We can trust the value of the memcmp as we know the pages
+		 * are write protected.
+		 */
+		ret = memcmp_pages(page, page2[0]);
+
+		if (ret < 0) {
+			put_page(page2[0]);
+			node = node->rb_left;
+		} else if (ret > 0) {
+			put_page(page2[0]);
+			node = node->rb_right;
+		} else {
+			goto out_found;
+		}
+	}
+out_didnt_find:
+	found_rmap_item = NULL;
+out_found:
+	return found_rmap_item;
+}
+
+/*
+ * stable_tree_insert - insert into the stable tree, new rmap_item that is
+ * pointing into a new KsmPage.
+ *
+ * @page: the page that we are searching identical page to inside the stable
+ *	  tree.
+ * @new_tree_item: the new tree item we are going to link into the stable tree.
+ * @rmap_item: pointer into the reverse mapping item.
+ *
+ * this function returns 0 on success, 1 otherwise.
+ */
+static int stable_tree_insert(struct page *page,
+			      struct tree_item *new_tree_item,
+			      struct rmap_item *rmap_item)
+{
+	struct rb_node **new = &(root_stable_tree.rb_node);
+	struct rb_node *parent = NULL;
+	struct tree_item *tree_item;
+	struct page *page2[1];
+
+	while (*new) {
+		int ret;
+		struct rmap_item *insert_rmap_item;
+
+		tree_item = rb_entry(*new, struct tree_item, node);
+		BUG_ON(!tree_item);
+		BUG_ON(!tree_item->rmap_item);
+
+		insert_rmap_item = tree_item->rmap_item;
+		while (insert_rmap_item) {
+			BUG_ON(!insert_rmap_item->stable_tree);
+			BUG_ON(!insert_rmap_item->tree_item);
+			if (!(insert_rmap_item->mm == rmap_item->mm &&
+			     insert_rmap_item->address == rmap_item->address)) {
+				if (!is_zapped_item(insert_rmap_item, page2))
+					break;
+				remove_rmap_item_from_tree(insert_rmap_item);
+			}
+			insert_rmap_item = insert_rmap_item->next;
+		}
+		if (!insert_rmap_item)
+			return 1;
+
+		ret = memcmp_pages(page, page2[0]);
+
+		parent = *new;
+		if (ret < 0) {
+			put_page(page2[0]);
+			new = &((*new)->rb_left);
+		} else if (ret > 0) {
+			put_page(page2[0]);
+			new = &((*new)->rb_right);
+		} else {
+			/*
+			 * It isn't a bug when we are here (the fact that we
+			 * didn't find the page inside the stable tree), because:
+			 * when we searched for the page inside the stable tree
+			 * it was still not write protected, and therefore it
+			 * could have changed later.
+			 */
+			return 1;
+		}
+	}
+
+	rb_link_node(&new_tree_item->node, parent, new);
+	rb_insert_color(&new_tree_item->node, &root_stable_tree);
+	nnodes_stable_tree++;
+	rmap_item->stable_tree = 1;
+	rmap_item->tree_item = new_tree_item;
+
+	return 0;
+}
+
+/*
+ * unstable_tree_search_insert - search and insert items into the unstable tree.
+ *
+ * @page: the page that we are going to search for identical page or to insert
+ *	  into the unstable tree
+ * @page2: pointer into identical page that was found inside the unstable tree
+ * @page_rmap_item: the reverse mapping item of page
+ *
+ * this function searches the unstable tree for a page identical to the page
+ * that we are scanning right now; if no page with identical content exists
+ * inside the unstable tree, we insert page_rmap_item as a new object into
+ * the unstable tree.
+ *
+ * this function returns a pointer to the tree_item of the item that is found
+ * to be identical to the page that we are scanning right now, NULL otherwise.
+ *
+ * (this function does both searching and inserting, because searching and
+ *  inserting share the same walking algorithm in rbtrees)
+ */
+static struct tree_item *unstable_tree_search_insert(struct page *page,
+					struct page **page2,
+					struct rmap_item *page_rmap_item)
+{
+	struct rb_node **new = &(root_unstable_tree.rb_node);
+	struct rb_node *parent = NULL;
+	struct tree_item *tree_item;
+	struct tree_item *new_tree_item;
+	struct rmap_item *rmap_item;
+
+	while (*new) {
+		int ret;
+
+		tree_item = rb_entry(*new, struct tree_item, node);
+		BUG_ON(!tree_item);
+		rmap_item = tree_item->rmap_item;
+		BUG_ON(!rmap_item);
+
+		/*
+		 * We don't want to swap in pages
+		 */
+		if (!is_present_pte(rmap_item->mm, rmap_item->address))
+			return NULL;
+
+		down_read(&rmap_item->mm->mmap_sem);
+		ret = get_user_pages(current, rmap_item->mm, rmap_item->address,
+				     1, 0, 0, page2, NULL);
+		up_read(&rmap_item->mm->mmap_sem);
+		if (!ret)
+			return NULL;
+
+		ret = memcmp_pages(page, page2[0]);
+
+		parent = *new;
+		if (ret < 0) {
+			put_page(page2[0]);
+			new = &((*new)->rb_left);
+		} else if (ret > 0) {
+			put_page(page2[0]);
+			new = &((*new)->rb_right);
+		} else {
+			return tree_item;
+		}
+	}
+
+	if (!page_rmap_item)
+		return NULL;
+
+	new_tree_item = alloc_tree_item();
+	if (!new_tree_item)
+		return NULL;
+
+	page_rmap_item->tree_item = new_tree_item;
+	page_rmap_item->stable_tree = 0;
+	new_tree_item->rmap_item = page_rmap_item;
+	rb_link_node(&new_tree_item->node, parent, new);
+	rb_insert_color(&new_tree_item->node, &root_unstable_tree);
+
+	return NULL;
+}
+
+/*
+ * update_tree - check if the page inside the tree got zapped,
+ * and if it got zapped, kick it from the tree.
+ *
+ * we return 1 in case we removed the rmap_item.
+ */
+int update_tree(struct rmap_item *rmap_item)
+{
+	if (!rmap_item->stable_tree) {
+		if (unlikely(rmap_item->kpage_outside_tree)) {
+			remove_rmap_item_from_tree(rmap_item);
+			return 1;
+		}
+		/*
+		 * If the rmap_item is !stable_tree and in addition
+		 * it has tree_item != NULL, it means this rmap_item
+		 * was inside the unstable tree, therefore we have to free
+		 * the tree_item from it (because the unstable tree was already
+		 * flushed by the time we are here).
+		 */
+		if (rmap_item->tree_item) {
+			free_tree_item(rmap_item->tree_item);
+			rmap_item->tree_item = NULL;
+			return 0;
+		}
+		return 0;
+	}
+	/*
+	 * If we are here it means the rmap_item was zapped, because the
+	 * rmap_item was pointing into the stable_tree and there all the pages
+	 * should be KsmPages, so it shouldn't have come here in the first
+	 * place. (cmp_and_merge_page() shouldn't have been called)
+	 */
+	remove_rmap_item_from_tree(rmap_item);
+	return 1;
+}
+
+static void create_new_rmap_item(struct rmap_item *rmap_item,
+				 struct mm_struct *mm,
+				 unsigned long addr,
+				 unsigned int checksum)
+{
+	struct hlist_head *bucket;
+
+	rmap_item->mm = mm;
+	rmap_item->address = addr;
+	rmap_item->oldchecksum = checksum;
+	rmap_item->stable_tree = 0;
+	rmap_item->kpage_outside_tree = 0;
+	rmap_item->tree_item = NULL;
+
+	bucket = &rmap_hash[addr % nrmaps_hash];
+	hlist_add_head(&rmap_item->link, bucket);
+}
+
+/*
+ * cmp_and_merge_page - take a page, compute its hash value and check if there
+ * is a similar hash value for a different page;
+ * in case we find a different page with a similar hash we call
+ * try_to_merge_two_pages().
+ *
+ * @ksm_scan: the ksm scanner structure.
+ * @page: the page that we are searching for an identical page to.
+ */
+static int cmp_and_merge_page(struct ksm_scan *ksm_scan, struct page *page)
+{
+	struct page *page2[1];
+	struct ksm_mem_slot *slot;
+	struct tree_item *tree_item;
+	struct rmap_item *rmap_item;
+	struct rmap_item *tree_rmap_item;
+	unsigned int checksum;
+	unsigned long addr;
+	int wait = 0;
+	int ret = 0;
+
+	slot = ksm_scan->slot_index;
+	addr = slot->addr + ksm_scan->page_index * PAGE_SIZE;
+	rmap_item = get_rmap_item(slot->mm, addr);
+	if (rmap_item) {
+		if (update_tree(rmap_item)) {
+			rmap_item = NULL;
+			wait = 1;
+		}
+	}
+
+	/* We first start with searching the page inside the stable tree */
+	tree_rmap_item = stable_tree_search(page, page2, rmap_item);
+	if (tree_rmap_item) {
+		struct rmap_item *tmp_rmap_item = NULL;
+
+		if (!rmap_item) {
+			tmp_rmap_item = alloc_rmap_item();
+			if (!tmp_rmap_item)
+				return ret;
+		}
+
+		BUG_ON(!tree_rmap_item->tree_item);
+		ret = try_to_merge_two_pages(slot->mm, page, tree_rmap_item->mm,
+					     page2[0], addr,
+					     tree_rmap_item->address);
+		put_page(page2[0]);
+		if (!ret) {
+			/*
+			 * The page was successfully merged, let's insert its
+			 * rmap_item into the stable tree.
+			 */
+
+			if (!rmap_item) {
+				create_new_rmap_item(tmp_rmap_item, slot->mm,
+						     addr, 0);
+				rmap_item = tmp_rmap_item;
+			}
+
+			rmap_item->next = tree_rmap_item->next;
+			rmap_item->prev = tree_rmap_item;
+
+			if (tree_rmap_item->next)
+				tree_rmap_item->next->prev = rmap_item;
+
+			tree_rmap_item->next = rmap_item;
+
+			rmap_item->stable_tree = 1;
+			rmap_item->tree_item = tree_rmap_item->tree_item;
+		} else {
+			if (tmp_rmap_item)
+				free_rmap_item(tmp_rmap_item);
+		}
+		ret = !ret;
+		goto out;
+	}
+
+	/*
+	 * In case the hash value of the page has changed since the last time
+	 * we calculated it, this page is probably changing frequently;
+	 * therefore we don't want to insert it into the unstable tree, and we
+	 * don't want to waste our time searching there for something identical
+	 * to it.
+	 */
+	if (rmap_item) {
+		checksum = calc_checksum(page);
+		if (rmap_item->oldchecksum != checksum) {
+			rmap_item->oldchecksum = checksum;
+			goto out;
+		}
+	}
+
+	tree_item = unstable_tree_search_insert(page, page2, rmap_item);
+	if (tree_item) {
+		struct rmap_item *tmp_rmap_item = NULL;
+		struct rmap_item *merge_rmap_item;
+
+		merge_rmap_item = tree_item->rmap_item;
+		BUG_ON(!merge_rmap_item);
+
+		if (!rmap_item) {
+			tmp_rmap_item = alloc_rmap_item();
+			if (!tmp_rmap_item)
+				return ret;
+		}
+
+		ret = try_to_merge_two_pages(slot->mm, page,
+					     merge_rmap_item->mm,
+					     page2[0], addr,
+					     merge_rmap_item->address);
+		/*
+		 * As soon as we have successfully merged this page, we want to
+		 * remove the rmap_item object of the page that we have merged
+		 * with from the unstable_tree and instead insert it as a new
+		 * stable tree node.
+		 */
+		if (!ret) {
+			rb_erase(&tree_item->node, &root_unstable_tree);
+			/*
+			 * In case we fail to insert the page into
+			 * the stable tree, we will have 2 virtual addresses
+			 * that are pointing into a KsmPage that won't be inside
+			 * the stable tree, therefore we have to mark both of
+			 * their rmap_items as kpage_outside_tree = 1
+			 * and to inc nkpage_out_tree by 2.
+			 */
+			if (stable_tree_insert(page2[0],
+					       tree_item, merge_rmap_item)) {
+				merge_rmap_item->kpage_outside_tree = 1;
+				if (!rmap_item) {
+					create_new_rmap_item(tmp_rmap_item,
+							     slot->mm,
+							     addr, 0);
+					rmap_item = tmp_rmap_item;
+				}
+				rmap_item->kpage_outside_tree = 1;
+				nkpage_out_tree += 2;
+			} else {
+				if (tmp_rmap_item) {
+					create_new_rmap_item(tmp_rmap_item,
+							     slot->mm, addr, 0);
+					rmap_item = tmp_rmap_item;
+				}
+				rmap_item->stable_tree = 1;
+			}
+		} else {
+			if (tmp_rmap_item)
+				free_rmap_item(tmp_rmap_item);
+		}
+		put_page(page2[0]);
+		ret = !ret;
+		goto out;
+	}
+	/*
+	 * When wait is 1, we don't want to calculate the hash value of the page
+	 * right now, instead we prefer to wait.
+	 */
+	if (!wait && !rmap_item) {
+		rmap_item = alloc_rmap_item();
+		if (!rmap_item)
+			return ret;
+		checksum = calc_checksum(page);
+		create_new_rmap_item(rmap_item, slot->mm, addr, checksum);
+	}
+out:
+	return ret;
+}
+
+/* return -EAGAIN - no slots registered, nothing to be done */
+static int scan_get_next_index(struct ksm_scan *ksm_scan)
+{
+	struct ksm_mem_slot *slot;
+
+	if (list_empty(&slots))
+		return -EAGAIN;
+
+	slot = ksm_scan->slot_index;
+
+	/* Are there pages left in this slot to scan? */
+	if ((slot->npages - ksm_scan->page_index - 1) > 0) {
+		ksm_scan->page_index++;
+		return 0;
+	}
+
+	list_for_each_entry_from(slot, &slots, link) {
+		if (slot == ksm_scan->slot_index)
+			continue;
+		ksm_scan->page_index = 0;
+		ksm_scan->slot_index = slot;
+		return 0;
+	}
+
+	/* looks like we finished scanning the whole memory, starting again */
+	root_unstable_tree = RB_ROOT;
+	ksm_scan->page_index = 0;
+	ksm_scan->slot_index = list_first_entry(&slots,
+						struct ksm_mem_slot, link);
+	return 0;
+}
+
+/*
+ * scan_update_old_index - make sure ksm_scan points to valid data;
+ * it is possible that by the time we are here the data that ksm_scan
+ * pointed to was released, so we have to call this function every time after
+ * taking the slots_lock
+ */
+static void scan_update_old_index(struct ksm_scan *ksm_scan)
+{
+	struct ksm_mem_slot *slot;
+
+	if (list_empty(&slots))
+		return;
+
+	list_for_each_entry(slot, &slots, link) {
+		if (ksm_scan->slot_index == slot)
+			return;
+	}
+
+	ksm_scan->slot_index = list_first_entry(&slots,
+						struct ksm_mem_slot, link);
+	ksm_scan->page_index = 0;
+}
+
+/**
+ * ksm_scan_start - the ksm scanner main worker function.
+ * @ksm_scan:    the scanner.
+ * @scan_npages: number of pages we want to scan before we return from this
+ *               function.
+ *
+ * (this function can be called from the kernel thread scanner, or from
+ *  userspace ioctl context scanner)
+ *
+ *  The function returns -EAGAIN in case there are no slots to scan.
+ */
+static int ksm_scan_start(struct ksm_scan *ksm_scan, unsigned int scan_npages)
+{
+	struct ksm_mem_slot *slot;
+	struct page *page[1];
+	int val;
+	int ret = 0;
+
+	down_read(&slots_lock);
+
+	scan_update_old_index(ksm_scan);
+
+	while (scan_npages > 0) {
+		ret = scan_get_next_index(ksm_scan);
+		if (ret)
+			goto out;
+
+		slot = ksm_scan->slot_index;
+
+		cond_resched();
+
+		/*
+		 * If the page is swapped out or in swap cache, we don't want to
+		 * scan it (it is just for performance).
+		 */
+		if (is_present_pte(slot->mm, slot->addr +
+				   ksm_scan->page_index * PAGE_SIZE)) {
+			down_read(&slot->mm->mmap_sem);
+			val = get_user_pages(current, slot->mm, slot->addr +
+					     ksm_scan->page_index * PAGE_SIZE,
+					     1, 0, 0, page, NULL);
+			up_read(&slot->mm->mmap_sem);
+			if (val == 1) {
+				if (!PageKsm(page[0], slot->mm,
+					     slot->addr + ksm_scan->page_index *
+					     PAGE_SIZE))
+					cmp_and_merge_page(ksm_scan, page[0]);
+				put_page(page[0]);
+			}
+		}
+		scan_npages--;
+	}
+	scan_get_next_index(ksm_scan);
+out:
+	up_read(&slots_lock);
+	return ret;
+}
+
+static struct file_operations ksm_sma_fops = {
+	.release        = ksm_sma_release,
+	.unlocked_ioctl = ksm_sma_ioctl,
+	.compat_ioctl   = ksm_sma_ioctl,
+};
+
+static int ksm_dev_ioctl_create_shared_memory_area(void)
+{
+	int fd = -1;
+	struct ksm_sma *ksm_sma;
+
+	ksm_sma = kmalloc(sizeof(struct ksm_sma), GFP_KERNEL);
+	if (!ksm_sma)
+		goto out;
+
+	INIT_LIST_HEAD(&ksm_sma->sma_slots);
+
+	fd = anon_inode_getfd("ksm-sma", &ksm_sma_fops, ksm_sma, 0);
+	if (fd < 0)
+		goto out_free;
+
+	return fd;
+out_free:
+	kfree(ksm_sma);
+out:
+	return fd;
+}
+
+static long ksm_dev_ioctl(struct file *filp,
+			  unsigned int ioctl, unsigned long arg)
+{
+	long r = -EINVAL;
+
+	switch (ioctl) {
+	case KSM_GET_API_VERSION:
+		r = KSM_API_VERSION;
+		break;
+	case KSM_CREATE_SHARED_MEMORY_AREA:
+		r = ksm_dev_ioctl_create_shared_memory_area();
+		break;
+	default:
+		break;
+	}
+	return r;
+}
+
+static struct file_operations ksm_chardev_ops = {
+	.unlocked_ioctl = ksm_dev_ioctl,
+	.compat_ioctl   = ksm_dev_ioctl,
+	.owner          = THIS_MODULE,
+};
+
+static struct miscdevice ksm_dev = {
+	KSM_MINOR,
+	"ksm",
+	&ksm_chardev_ops,
+};
+
+int kthread_ksm_scan_thread(void *nothing)
+{
+	while (!kthread_should_stop()) {
+		if (ksmd_flags & ksm_control_flags_run) {
+			down_read(&kthread_lock);
+			ksm_scan_start(&kthread_ksm_scan,
+				       kthread_pages_to_scan);
+			up_read(&kthread_lock);
+			schedule_timeout_interruptible(
+					usecs_to_jiffies(kthread_sleep));
+		} else {
+			wait_event_interruptible(kthread_wait,
+					ksmd_flags & ksm_control_flags_run ||
+					kthread_should_stop());
+		}
+	}
+	return 0;
+}
+
+#define KSM_ATTR_RO(_name) \
+	static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
+#define KSM_ATTR(_name) \
+	static struct kobj_attribute _name##_attr = \
+		__ATTR(_name, 0644, _name##_show, _name##_store)
+
+static ssize_t sleep_show(struct kobject *kobj, struct kobj_attribute *attr,
+			  char *buf)
+{
+	unsigned int usecs;
+
+	down_read(&kthread_lock);
+	usecs = kthread_sleep;
+	up_read(&kthread_lock);
+
+	return sprintf(buf, "%u\n", usecs);
+}
+
+static ssize_t sleep_store(struct kobject *kobj,
+				   struct kobj_attribute *attr,
+				   const char *buf, size_t count)
+{
+	unsigned long usecs;
+	int err;
+
+	err = strict_strtoul(buf, 10, &usecs);
+	if (err)
+		return -EINVAL;
+
+	/* TODO sanitize usecs */
+
+	down_write(&kthread_lock);
+	kthread_sleep = usecs;
+	up_write(&kthread_lock);
+
+	return count;
+}
+KSM_ATTR(sleep);
+
+static ssize_t pages_to_scan_show(struct kobject *kobj,
+				  struct kobj_attribute *attr, char *buf)
+{
+	unsigned long nr_pages;
+
+	down_read(&kthread_lock);
+	nr_pages = kthread_pages_to_scan;
+	up_read(&kthread_lock);
+
+	return sprintf(buf, "%lu\n", nr_pages);
+}
+
+static ssize_t pages_to_scan_store(struct kobject *kobj,
+				   struct kobj_attribute *attr,
+				   const char *buf, size_t count)
+{
+	int err;
+	unsigned long nr_pages;
+
+	err = strict_strtoul(buf, 10, &nr_pages);
+	if (err)
+		return -EINVAL;
+
+	down_write(&kthread_lock);
+	kthread_pages_to_scan = nr_pages;
+	up_write(&kthread_lock);
+
+	return count;
+}
+KSM_ATTR(pages_to_scan);
+
+static ssize_t run_show(struct kobject *kobj, struct kobj_attribute *attr,
+			char *buf)
+{
+	unsigned long run;
+
+	down_read(&kthread_lock);
+	run = ksmd_flags;
+	up_read(&kthread_lock);
+
+	return sprintf(buf, "%lu\n", run);
+}
+
+static ssize_t run_store(struct kobject *kobj, struct kobj_attribute *attr,
+			 const char *buf, size_t count)
+{
+	int err;
+	unsigned long k_flags;
+
+	err = strict_strtoul(buf, 10, &k_flags);
+	if (err)
+		return -EINVAL;
+
+	down_write(&kthread_lock);
+	ksmd_flags = k_flags;
+	up_write(&kthread_lock);
+
+	if (ksmd_flags)
+		wake_up_interruptible(&kthread_wait);
+
+	return count;
+}
+KSM_ATTR(run);
+
+static ssize_t pages_shared_show(struct kobject *kobj,
+				 struct kobj_attribute *attr, char *buf)
+{
+	/*
+	 * Note: this number does not include the shared pages outside the
+	 * stable tree.
+	 */
+	return sprintf(buf, "%lu\n", ksm_pages_shared - nnodes_stable_tree);
+}
+KSM_ATTR_RO(pages_shared);
+
+static ssize_t kernel_pages_allocated_show(struct kobject *kobj,
+					   struct kobj_attribute *attr,
+					   char *buf)
+{
+	return sprintf(buf, "%lu\n", nnodes_stable_tree);
+}
+KSM_ATTR_RO(kernel_pages_allocated);
+
+static ssize_t max_kernel_pages_store(struct kobject *kobj,
+				      struct kobj_attribute *attr,
+				      const char *buf, size_t count)
+{
+	int err;
+	unsigned long nr_pages;
+
+	err = strict_strtoul(buf, 10, &nr_pages);
+	if (err)
+		return -EINVAL;
+
+	down_write(&kthread_lock);
+	kthread_max_kernel_pages = nr_pages;
+	up_write(&kthread_lock);
+
+	return count;
+}
+
+static ssize_t max_kernel_pages_show(struct kobject *kobj,
+				     struct kobj_attribute *attr, char *buf)
+{
+	unsigned long nr_pages;
+
+	down_read(&kthread_lock);
+	nr_pages = kthread_max_kernel_pages;
+	up_read(&kthread_lock);
+
+	return sprintf(buf, "%lu\n", nr_pages);
+}
+KSM_ATTR(max_kernel_pages);
+
+static struct attribute *ksm_attrs[] = {
+	&sleep_attr.attr,
+	&pages_to_scan_attr.attr,
+	&run_attr.attr,
+	&pages_shared_attr.attr,
+	&kernel_pages_allocated_attr.attr,
+	&max_kernel_pages_attr.attr,
+	NULL,
+};
+
+static struct attribute_group ksm_attr_group = {
+	.attrs = ksm_attrs,
+	.name = "ksm",
+};
+
+
+static int __init ksm_init(void)
+{
+	int r;
+
+	r = ksm_slab_init();
+	if (r)
+		goto out;
+
+	r = rmap_hash_init();
+	if (r)
+		goto out_free1;
+
+	kthread = kthread_run(kthread_ksm_scan_thread, NULL, "kksmd");
+	if (IS_ERR(kthread)) {
+		printk(KERN_ERR "ksm: creating kthread failed\n");
+		r = PTR_ERR(kthread);
+		goto out_free2;
+	}
+
+	r = misc_register(&ksm_dev);
+	if (r) {
+		printk(KERN_ERR "ksm: misc device register failed\n");
+		goto out_free3;
+	}
+
+	r = sysfs_create_group(mm_kobj, &ksm_attr_group);
+	if (r) {
+		printk(KERN_ERR "ksm: register sysfs failed\n");
+		goto out_free4;
+	}
+
+	printk(KERN_WARNING "ksm loaded\n");
+	return 0;
+
+out_free4:
+	misc_deregister(&ksm_dev);
+out_free3:
+	kthread_stop(kthread);
+out_free2:
+	rmap_hash_free();
+out_free1:
+	ksm_slab_free();
+out:
+	return r;
+}
+
+static void __exit ksm_exit(void)
+{
+	sysfs_remove_group(mm_kobj, &ksm_attr_group);
+	misc_deregister(&ksm_dev);
+	ksmd_flags = ksm_control_flags_run;
+	kthread_stop(kthread);
+	rmap_hash_free();
+	ksm_slab_free();
+}
+
+module_init(ksm_init)
+module_exit(ksm_exit)
-- 
1.5.6.5
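
For readers following the patch: stable_tree_search() and
unstable_tree_search_insert() above order tree nodes by the memcmp()
result of whole page contents. A minimal user-space sketch of that idea
follows. It is only an illustration under stated assumptions: it uses a
plain unbalanced binary tree instead of the kernel rbtree, a fixed
4096-byte page, and hypothetical names (SKETCH_PAGE_SIZE, struct
page_node, page_tree_search_insert) that do not appear in the patch.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size for this sketch */

/* Hypothetical node type: one tree node per distinct page content. */
struct page_node {
	const unsigned char *page;	/* borrowed pointer to page contents */
	struct page_node *left, *right;
};

/*
 * Walk the tree comparing full page contents with memcmp().
 * Returns the node holding identical contents if one exists;
 * otherwise links a new node for `page` and returns NULL.
 * On allocation failure it also returns NULL, without inserting.
 */
static struct page_node *page_tree_search_insert(struct page_node **root,
						 const unsigned char *page)
{
	struct page_node **link = root;

	while (*link) {
		int cmp = memcmp(page, (*link)->page, SKETCH_PAGE_SIZE);

		if (cmp < 0)
			link = &(*link)->left;
		else if (cmp > 0)
			link = &(*link)->right;
		else
			return *link;	/* identical content found */
	}

	*link = calloc(1, sizeof(**link));
	if (*link)
		(*link)->page = page;
	return NULL;
}
```

A caller would pass each scanned page in turn: a NULL return means the
page was recorded as a new candidate, and a non-NULL return names an
earlier page with identical contents, which is the point where the patch
attempts a merge. The sketch ignores the write-protection and locking
that make the comparison trustworthy in the kernel.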

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2009-04-04 14:35 [PATCH 0/4] ksm - dynamic page sharing driver for linux v2 Izik Eidus
  2009-04-04 14:35 ` [PATCH 1/4] MMU_NOTIFIERS: add set_pte_at_notify() Izik Eidus
@ 2009-04-06  7:04 ` Nick Piggin
  2009-04-06  7:33   ` Avi Kivity
                     ` (2 more replies)
  2009-04-07 13:57 ` Andrea Arcangeli
  2 siblings, 3 replies; 19+ messages in thread
From: Nick Piggin @ 2009-04-06  7:04 UTC (permalink / raw)
  To: Izik Eidus
  Cc: akpm, linux-kernel, kvm, linux-mm, avi, aarcange, chrisw,
	mtosatti, hugh, kamezawa.hiroyu

On Sunday 05 April 2009 01:35:18 Izik Eidus wrote:

> This driver is very useful for KVM in cases of running multiple guest
> operating systems of the same type.
> (For desktop workloads we have achieved more than x2 memory overcommit
> (more like x3))

Interesting that it is a desirable workload to have multiple guests each
running MS Office.

I wonder, can Windows enter a paravirtualised guest mode for KVM? And can
you detect page allocation/freeing events?

 
> This driver has found users other than KVM, for example CERN,
> Fons Rademakers:
> "on many-core machines we run one large detector simulation program per core.
> These simulation programs are identical but run each in their own process and
> need about 2 - 2.5 GB RAM.
> We typically buy machines with 2GB RAM per core and so have a problem to run
> one of these programs per core.
> Of the 2 - 2.5 GB about 700MB is identical data in the form of magnetic field
> maps, detector geometry, etc.
> Currently people have been trying to start one program, initialize the geometry
> and field maps and then fork it N times, to have the data shared.
> With KSM this would be done automatically by the system so it sounded extremely
> attractive when Andrea presented it."

They should use a shared memory segment, or MAP_ANONYMOUS|MAP_SHARED etc.
Presumably they will want to control it, interleaving it over all NUMA
nodes and using hugepages for it. It would be very little work.
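
A minimal sketch of that approach, assuming Linux with glibc. The helper
names alloc_shared() and run_workers() are made up for illustration, and the
one-int-per-worker slot is only a stand-in for real simulation data. NUMA
interleaving and hugepages are left out.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* expose MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map len bytes that stay shared across fork(). */
static void *alloc_shared(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical driver: fork nworkers children, let each store i + 1 into
 * its own slot of the shared region, and return how many of those stores
 * the parent can see afterwards (or -1 on setup failure).
 */
static int run_workers(int nworkers)
{
	size_t len;
	int *slots;
	int i, started = 0, seen = 0;

	if (nworkers <= 0)
		return -1;
	len = (size_t)nworkers * sizeof(*slots);
	slots = alloc_shared(len);	/* zero-filled by the kernel */
	if (!slots)
		return -1;

	for (i = 0; i < nworkers; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;		/* stop forking, reap what was started */
		if (pid == 0) {
			slots[i] = i + 1;	/* store lands in the shared pages */
			_exit(0);
		}
		started++;
	}
	while (started-- > 0)
		wait(NULL);

	/* Count the child stores that are visible in the parent. */
	for (i = 0; i < nworkers; i++)
		if (slots[i] == i + 1)
			seen++;

	munmap(slots, len);
	return seen;
}
```

Called from main(), for example as assert(run_workers(4) == 4), the check
should hold whenever all four fork() calls succeed. With MAP_PRIVATE in place
of MAP_SHARED, each child's store would go to its own copy-on-write page and
the parent would count 0.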

 
> I am sending another seires of patchs for kvm kernel and kvm-userspace
> that would allow users of kvm to test ksm with it.
> The kvm patchs would apply to Avi git tree.


* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2009-04-06  7:04 ` [PATCH 0/4] ksm - dynamic page sharing driver for linux v2 Nick Piggin
@ 2009-04-06  7:33   ` Avi Kivity
  2009-04-06 11:19   ` Izik Eidus
  2009-04-06 13:42   ` Andrea Arcangeli
  2 siblings, 0 replies; 19+ messages in thread
From: Avi Kivity @ 2009-04-06  7:33 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Izik Eidus, akpm, linux-kernel, kvm, linux-mm, aarcange, chrisw,
	mtosatti, hugh, kamezawa.hiroyu

Nick Piggin wrote:
> On Sunday 05 April 2009 01:35:18 Izik Eidus wrote:
>
>   
>> This driver is very useful for KVM as in cases of runing multiple guests
>> operation system of the same type.
>> (For desktop work loads we have achived more than x2 memory overcommit
>> (more like x3))
>>     
>
> Interesting that it is a desirable workload to have multiple guests each
> running MS office.
>
> I wonder, can windows enter a paravirtualised guest mode for KVM?

Windows has some support for paravirtualization; for example, it can use
hypercalls instead of TLB flush IPIs.

>  And can
> you detect page allocation/freeing events?
>   

Not that I know of.

-- 
error compiling committee.c: too many arguments to function


* Re: [PATCH 4/4] add ksm kernel shared memory driver.
  2009-04-04 14:35       ` [PATCH 4/4] add ksm kernel shared memory driver Izik Eidus
@ 2009-04-06  9:13         ` Andrey Panin
  2009-04-06 10:58           ` Izik Eidus
  0 siblings, 1 reply; 19+ messages in thread
From: Andrey Panin @ 2009-04-06  9:13 UTC (permalink / raw)
  To: Izik Eidus
  Cc: akpm, linux-kernel, kvm, linux-mm, avi, aarcange, chrisw,
	mtosatti, hugh, kamezawa.hiroyu

On 094, 04 04, 2009 at 05:35:22PM +0300, Izik Eidus wrote:

<SNIP>

> +static inline u32 calc_checksum(struct page *page)
> +{
> +	u32 checksum;
> +	void *addr = kmap_atomic(page, KM_USER0);
> +	checksum = jhash(addr, PAGE_SIZE, 17);

Why is jhash2() not used here? It is faster and leads to smaller code size.

> +	kunmap_atomic(addr, KM_USER0);
> +	return checksum;
> +}
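
For reference, a sketch of what the suggested change might look like. jhash2()
hashes an array of u32 words, so the length is passed in words rather than
bytes. The PAGE_SIZE / sizeof(u32) conversion and keeping the seed 17 are
assumptions based on the posted code, not a tested patch. This is
kernel-context code and does not build standalone.

```c
#include <linux/highmem.h>
#include <linux/jhash.h>
#include <linux/mm.h>

/* Sketch only: calc_checksum() from the posted patch, switched to jhash2(). */
static inline u32 calc_checksum(struct page *page)
{
	u32 checksum;
	void *addr = kmap_atomic(page, KM_USER0);

	/* jhash2() takes its length in u32 words, not bytes. */
	checksum = jhash2(addr, PAGE_SIZE / sizeof(u32), 17);
	kunmap_atomic(addr, KM_USER0);
	return checksum;
}
```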


* Re: [PATCH 4/4] add ksm kernel shared memory driver.
  2009-04-06  9:13         ` Andrey Panin
@ 2009-04-06 10:58           ` Izik Eidus
  0 siblings, 0 replies; 19+ messages in thread
From: Izik Eidus @ 2009-04-06 10:58 UTC (permalink / raw)
  To: Izik Eidus, akpm, linux-kernel, kvm, linux-mm, avi, aarcange,
	chrisw, mtosatti, hugh, kamezawa.hiroyu

Andrey Panin wrote:
> On 094, 04 04, 2009 at 05:35:22PM +0300, Izik Eidus wrote:
>
> <SNIP>
>
>   
>> +static inline u32 calc_checksum(struct page *page)
>> +{
>> +	u32 checksum;
>> +	void *addr = kmap_atomic(page, KM_USER0);
>> +	checksum = jhash(addr, PAGE_SIZE, 17);
>>     
>
> Why jhash2() is not used here ? It's faster and leads to smaller code size.
>   

Because I didn't know; I will check that and change it.

Thanks.

(We should really use the in-CPU CRC on Intel Nehalem, and the dirty bit on
the rest of the architectures...)

>   
>> +	kunmap_atomic(addr, KM_USER0);
>> +	return checksum;
>> +}
>>     
>
>   


* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2009-04-06  7:04 ` [PATCH 0/4] ksm - dynamic page sharing driver for linux v2 Nick Piggin
  2009-04-06  7:33   ` Avi Kivity
@ 2009-04-06 11:19   ` Izik Eidus
  2009-04-06 13:42   ` Andrea Arcangeli
  2 siblings, 0 replies; 19+ messages in thread
From: Izik Eidus @ 2009-04-06 11:19 UTC (permalink / raw)
  To: Nick Piggin
  Cc: akpm, linux-kernel, kvm, linux-mm, avi, aarcange, chrisw,
	mtosatti, hugh, kamezawa.hiroyu

Nick Piggin wrote:
> On Sunday 05 April 2009 01:35:18 Izik Eidus wrote:
>
>   
>> This driver is very useful for KVM as in cases of runing multiple guests
>> operation system of the same type.
>> (For desktop work loads we have achived more than x2 memory overcommit
>> (more like x3))
>>     
>
> Interesting that it is a desirable workload to have multiple guests each
> running MS office.
>   

These numbers were taken from exactly such a workload: it is some kind of
weird script that keeps opening Word / Excel and typing there like a user...
I think in addition it opens Internet Explorer and visits random sites...
I can search for the script if wanted...

> I wonder, can windows enter a paravirtualised guest mode for KVM? And can
> you detect page allocation/freeing events?
>   

I don't know.

>  
>   
>> This driver have found users other than KVM, for example CERN,
>> Fons Rademakers:
>> "on many-core machines we run one large detector simulation program per core.
>> These simulation programs are identical but run each in their own process and
>> need about 2 - 2.5 GB RAM.
>> We typically buy machines with 2GB RAM per core and so have a problem to run
>> one of these programs per core.
>> Of the 2 - 2.5 GB about 700MB is identical data in the form of magnetic field
>> maps, detector geometry, etc.
>> Currently people have been trying to start one program, initialize the geometry
>> and field maps and then fork it N times, to have the data shared.
>> With KSM this would be done automatically by the system so it sounded extremely
>> attractive when Andrea presented it."
>>     
>
> They should use a shared memory segment, or MAP_ANONYMOUS|MAP_SHARED etc.
> Presumably they will probably want to control it to interleave it over
> all numa nodes and use hugepages for it. It would be very little work.
>   

Agreed. I don't know their application too well, but I know they had
problems doing it.

>  
>   
>> I am sending another seires of patchs for kvm kernel and kvm-userspace
>> that would allow users of kvm to test ksm with it.
>> The kvm patchs would apply to Avi git tree.
>>     


* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2009-04-06  7:04 ` [PATCH 0/4] ksm - dynamic page sharing driver for linux v2 Nick Piggin
  2009-04-06  7:33   ` Avi Kivity
  2009-04-06 11:19   ` Izik Eidus
@ 2009-04-06 13:42   ` Andrea Arcangeli
  2 siblings, 0 replies; 19+ messages in thread
From: Andrea Arcangeli @ 2009-04-06 13:42 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Izik Eidus, akpm, linux-kernel, kvm, linux-mm, avi, chrisw,
	mtosatti, hugh, kamezawa.hiroyu

On Mon, Apr 06, 2009 at 05:04:49PM +1000, Nick Piggin wrote:
> They should use a shared memory segment, or MAP_ANONYMOUS|MAP_SHARED etc.
> Presumably they will probably want to control it to interleave it over
> all numa nodes and use hugepages for it. It would be very little work.

I thought it's the intermediate result of the computations that leads
to lots of equal data too, in which case ksm is the only way to share
it all.


* Re: [PATCH 0/4] ksm - dynamic page sharing driver for linux v2
  2009-04-04 14:35 [PATCH 0/4] ksm - dynamic page sharing driver for linux v2 Izik Eidus
  2009-04-04 14:35 ` [PATCH 1/4] MMU_NOTIFIERS: add set_pte_at_notify() Izik Eidus
  2009-04-06  7:04 ` [PATCH 0/4] ksm - dynamic page sharing driver for linux v2 Nick Piggin
@ 2009-04-07 13:57 ` Andrea Arcangeli
  2 siblings, 0 replies; 19+ messages in thread
From: Andrea Arcangeli @ 2009-04-07 13:57 UTC (permalink / raw)
  To: Izik Eidus
  Cc: akpm, linux-kernel, kvm, linux-mm, avi, chrisw, mtosatti, hugh,
	kamezawa.hiroyu

On Sat, Apr 04, 2009 at 05:35:18PM +0300, Izik Eidus wrote:
> From v1 to v2:
> 
> 1)Fixed security issue found by Chris Wright:
>     Ksm was checking if page is a shared page by running !PageAnon.
>     Beacuse that Ksm scan only anonymous memory, all !PageAnons
>     inside ksm data strctures are shared page, however there might
>     be a case for do_wp_page() when the VM_SHARED is used where
>     do_wp_page() would instead of copying the page into new anonymos
>     page, would reuse the page, it was fixed by adding check for the
>     dirty_bit of the virtual addresses pointing into the shared page.
>     I was not finding any VM code tha would clear the dirty bit from
>     this virtual address (due to the fact that we allocate the page
>     using page_alloc() - kernel allocated pages), ~but i still want
>     confirmation about this from the vm guys - thanks.~

As far as I can tell this wasn't a bug and this change is
unnecessary. I already checked this bit but I may have missed
something, so I ask here to be sure.

As far as I can tell, when VM_SHARED is set no anonymous page can ever
be allocated in that vma range, hence no KSM page can ever be
generated in that vma either. MAP_SHARED|MAP_ANONYMOUS is only a
different API for /dev/shm, IPCSHM backing; no anonymous pages can
live there. It surely worked like that in older 2.6, and reading the
latest code it seems to still work like that, but if something has
changed Hugh will surely correct me in a jiffy ;).

I still see this in the file=null path.
  
	} else if (vm_flags & VM_SHARED) {
		error = shmem_zero_setup(vma);
		if (error)
			goto free_vma;
	}


So you can revert your change for now.


Thread overview: 19+ messages
2009-04-04 14:35 [PATCH 0/4] ksm - dynamic page sharing driver for linux v2 Izik Eidus
2009-04-04 14:35 ` [PATCH 1/4] MMU_NOTIFIERS: add set_pte_at_notify() Izik Eidus
2009-04-04 14:35   ` [PATCH 2/4] add page_wrprotect(): write protecting page Izik Eidus
2009-04-04 14:35     ` [PATCH 3/4] add replace_page(): change the page pte is pointing to Izik Eidus
2009-04-04 14:35       ` [PATCH 4/4] add ksm kernel shared memory driver Izik Eidus
2009-04-06  9:13         ` Andrey Panin
2009-04-06 10:58           ` Izik Eidus
2009-04-06  7:04 ` [PATCH 0/4] ksm - dynamic page sharing driver for linux v2 Nick Piggin
2009-04-06  7:33   ` Avi Kivity
2009-04-06 11:19   ` Izik Eidus
2009-04-06 13:42   ` Andrea Arcangeli
2009-04-07 13:57 ` Andrea Arcangeli
  -- strict thread matches above, loose matches on Subject: below --
2008-11-17  2:20 Izik Eidus
2008-11-20  7:44 ` Ryota OZAKI
2008-11-20  9:03   ` Izik Eidus
2008-11-20  9:13     ` Izik Eidus
2008-11-20  9:44       ` Ryota OZAKI
2008-11-28 12:57 ` Dmitri Monakhov
2008-11-28 13:51   ` Alan Cox
