linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] ksm changes
@ 2009-05-02 22:16 Izik Eidus
  2009-05-02 22:16 ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Izik Eidus
  0 siblings, 1 reply; 14+ messages in thread
From: Izik Eidus @ 2009-05-02 22:16 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, aarcange, chrisw, alan, device, linux-mm, hugh,
	nickpiggin, Izik Eidus

Following patchs touch 4 diffrent areas inside ksm:

1) Patchs 1 - 3: Change the api to be more robust and make more sense.
                 This include:
                     * Limiting the number of memory regions user can
                       register inside ksm per file descriptor that
                       he open.

                     * Reject overlap memory addresses registrations.

                     * change KSM_REMOVE_MEMORY_REGION ioctl to make
                       more sense, untill this patchs user was able
                       to register servel memory regions per file
                       descriptor, but when he called to
                       KSM_REMOVE_MEMORY_REGION, he had no way to tell
                       what memory region he want to remove, as a
                       result each call to KSM_REMOVE_MEMORY_REGION
                       nuked all the regions inside the fd.
                       Now KSM_REMOVE_MEMORY_REGION is working on
                       specific addresses.

2) Patch 4: Use generic helper functions to deal with the vma prot.
            
3) Patch 5: Return ksm to be build on all archs (Now after patch 4,
            ksm shouldnt break any arch).

4) Patch 6: change the miscdevice minor number - lets wait to Alan
            saying he is happy with this change before we apply.

Thanks.

Izik Eidus (6):
  ksm: limiting the num of mem regions user can register per fd.
  ksm: dont allow overlap memory addresses registrations.
  ksm: change the KSM_REMOVE_MEMORY_REGION ioctl.
  ksm: change the prot handling to use the generic helper functions
  ksm: build system make it compile for all archs
  ksm: use another miscdevice minor number.

 Documentation/devices.txt  |    1 +
 include/linux/miscdevice.h |    2 +-
 mm/Kconfig                 |    1 -
 mm/ksm.c                   |  112 +++++++++++++++++++++++++++++++++++++-------
 4 files changed, 96 insertions(+), 20 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd.
  2009-05-02 22:16 [PATCH 0/6] ksm changes Izik Eidus
@ 2009-05-02 22:16 ` Izik Eidus
  2009-05-02 22:16   ` [PATCH 2/6] ksm: dont allow overlap memory addresses registrations Izik Eidus
                     ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Izik Eidus @ 2009-05-02 22:16 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, aarcange, chrisw, alan, device, linux-mm, hugh,
	nickpiggin, Izik Eidus

Right now user can open /dev/ksm fd and register unlimited number of
regions, such behavior may allocate unlimited amount of kernel memory
and get the whole host into out of memory situation.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 mm/ksm.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 6165276..d58db6b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -48,6 +48,9 @@ static int rmap_hash_size;
 module_param(rmap_hash_size, int, 0);
 MODULE_PARM_DESC(rmap_hash_size, "Hash table size for the reverse mapping");
 
+static int regions_per_fd;
+module_param(regions_per_fd, int, 0);
+
 /*
  * ksm_mem_slot - hold information for an userspace scanning range
  * (the scanning for this region will be from addr untill addr +
@@ -67,6 +70,7 @@ struct ksm_mem_slot {
  */
 struct ksm_sma {
 	struct list_head sma_slots;
+	int nregions;
 };
 
 /**
@@ -453,6 +457,11 @@ static int ksm_sma_ioctl_register_memory_region(struct ksm_sma *ksm_sma,
 	struct ksm_mem_slot *slot;
 	int ret = -EPERM;
 
+	if ((ksm_sma->nregions + 1) > regions_per_fd) {
+		ret = -EBUSY;
+		goto out;
+	}
+
 	slot = kzalloc(sizeof(struct ksm_mem_slot), GFP_KERNEL);
 	if (!slot) {
 		ret = -ENOMEM;
@@ -473,6 +482,7 @@ static int ksm_sma_ioctl_register_memory_region(struct ksm_sma *ksm_sma,
 
 	list_add_tail(&slot->link, &slots);
 	list_add_tail(&slot->sma_link, &ksm_sma->sma_slots);
+	ksm_sma->nregions++;
 
 	up_write(&slots_lock);
 	return 0;
@@ -511,6 +521,7 @@ static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma)
 		mmput(slot->mm);
 		list_del(&slot->sma_link);
 		kfree(slot);
+		ksm_sma->nregions--;
 	}
 	up_write(&slots_lock);
 	return 0;
@@ -1389,6 +1400,7 @@ static int ksm_dev_ioctl_create_shared_memory_area(void)
 	}
 
 	INIT_LIST_HEAD(&ksm_sma->sma_slots);
+	ksm_sma->nregions = 0;
 
 	fd = anon_inode_getfd("ksm-sma", &ksm_sma_fops, ksm_sma, 0);
 	if (fd < 0)
@@ -1631,6 +1643,9 @@ static int __init ksm_init(void)
 	if (r)
 		goto out_free1;
 
+	if (!regions_per_fd)
+		regions_per_fd = 1024;
+
 	ksm_thread = kthread_run(ksm_scan_thread, NULL, "kksmd");
 	if (IS_ERR(ksm_thread)) {
 		printk(KERN_ERR "ksm: creating kthread failed\n");
-- 
1.5.6.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/6] ksm: dont allow overlap memory addresses registrations.
  2009-05-02 22:16 ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Izik Eidus
@ 2009-05-02 22:16   ` Izik Eidus
  2009-05-02 22:16     ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Izik Eidus
  2009-05-03  2:08   ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Rik van Riel
  2009-05-03  2:08   ` Rik van Riel
  2 siblings, 1 reply; 14+ messages in thread
From: Izik Eidus @ 2009-05-02 22:16 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, aarcange, chrisw, alan, device, linux-mm, hugh,
	nickpiggin, Izik Eidus

subjects say it all.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 mm/ksm.c |   58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index d58db6b..982dfff 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -451,21 +451,71 @@ static void remove_page_from_tree(struct mm_struct *mm,
 	remove_rmap_item_from_tree(rmap_item);
 }
 
+static inline int is_intersecting_address(unsigned long addr,
+					  unsigned long begin,
+					  unsigned long end)
+{
+	if (addr >= begin && addr < end)
+		return 1;
+	return 0;
+}
+
+/*
+ * is_overlap_mem - check if there is overlapping with memory that was already
+ * registred.
+ *
+ * note - this function must to be called under slots_lock
+ */
+static int is_overlap_mem(struct ksm_memory_region *mem)
+{
+	struct ksm_mem_slot *slot;
+
+	list_for_each_entry(slot, &slots, link) {
+		unsigned long mem_end;
+		unsigned long slot_end;
+
+		cond_resched();
+
+		if (current->mm != slot->mm)
+			continue;
+
+		mem_end = mem->addr + (unsigned long)mem->npages * PAGE_SIZE;
+		slot_end = slot->addr + (unsigned long)slot->npages * PAGE_SIZE;
+
+		if (is_intersecting_address(mem->addr, slot->addr, slot_end) ||
+		    is_intersecting_address(mem_end - 1, slot->addr, slot_end))
+			return 1;
+		if (is_intersecting_address(slot->addr, mem->addr, mem_end) ||
+		    is_intersecting_address(slot_end - 1, mem->addr, mem_end))
+			return 1;
+	}
+
+	return 0;
+}
+
 static int ksm_sma_ioctl_register_memory_region(struct ksm_sma *ksm_sma,
 						struct ksm_memory_region *mem)
 {
 	struct ksm_mem_slot *slot;
 	int ret = -EPERM;
 
+	if (!mem->npages)
+		goto out;
+
+	down_write(&slots_lock);
+
 	if ((ksm_sma->nregions + 1) > regions_per_fd) {
 		ret = -EBUSY;
-		goto out;
+		goto out_unlock;
 	}
 
+	if (is_overlap_mem(mem))
+		goto out_unlock;
+
 	slot = kzalloc(sizeof(struct ksm_mem_slot), GFP_KERNEL);
 	if (!slot) {
 		ret = -ENOMEM;
-		goto out;
+		goto out_unlock;
 	}
 
 	/*
@@ -478,8 +528,6 @@ static int ksm_sma_ioctl_register_memory_region(struct ksm_sma *ksm_sma,
 	slot->addr = mem->addr;
 	slot->npages = mem->npages;
 
-	down_write(&slots_lock);
-
 	list_add_tail(&slot->link, &slots);
 	list_add_tail(&slot->sma_link, &ksm_sma->sma_slots);
 	ksm_sma->nregions++;
@@ -489,6 +537,8 @@ static int ksm_sma_ioctl_register_memory_region(struct ksm_sma *ksm_sma,
 
 out_free:
 	kfree(slot);
+out_unlock:
+	up_write(&slots_lock);
 out:
 	return ret;
 }
-- 
1.5.6.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl.
  2009-05-02 22:16   ` [PATCH 2/6] ksm: dont allow overlap memory addresses registrations Izik Eidus
@ 2009-05-02 22:16     ` Izik Eidus
  2009-05-02 22:16       ` [PATCH 4/6] ksm: change the prot handling to use the generic helper functions Izik Eidus
  2009-05-04 19:43       ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Hugh Dickins
  0 siblings, 2 replies; 14+ messages in thread
From: Izik Eidus @ 2009-05-02 22:16 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, aarcange, chrisw, alan, device, linux-mm, hugh,
	nickpiggin, Izik Eidus

This patch change the KSM_REMOVE_MEMORY_REGION ioctl to be specific per
memory region (instead of flushing all the registred memory regions inside
the file descriptor like it happen now)

The previoes api was:
user register memory regions using KSM_REGISTER_MEMORY_REGION inside the fd,
and then when he wanted to remove just one memory region, he had to remove them
all using KSM_REMOVE_MEMORY_REGION.

This patch change this beahivor by chaning the KSM_REMOVE_MEMORY_REGION
ioctl to recive another paramter that it is the begining of the virtual
address that is wanted to be removed.

(user can still remove all the memory regions all at once, by just closing
the file descriptor)

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 mm/ksm.c |   31 +++++++++++++++++++++----------
 1 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 982dfff..c14019f 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -561,17 +561,20 @@ static void remove_mm_from_hash_and_tree(struct mm_struct *mm)
 	list_del(&slot->link);
 }
 
-static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma)
+static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma,
+					      unsigned long addr)
 {
 	struct ksm_mem_slot *slot, *node;
 
 	down_write(&slots_lock);
 	list_for_each_entry_safe(slot, node, &ksm_sma->sma_slots, sma_link) {
-		remove_mm_from_hash_and_tree(slot->mm);
-		mmput(slot->mm);
-		list_del(&slot->sma_link);
-		kfree(slot);
-		ksm_sma->nregions--;
+		if (addr == slot->addr) {
+			remove_mm_from_hash_and_tree(slot->mm);
+			mmput(slot->mm);
+			list_del(&slot->sma_link);
+			kfree(slot);
+			ksm_sma->nregions--;
+		}
 	}
 	up_write(&slots_lock);
 	return 0;
@@ -579,12 +582,20 @@ static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma)
 
 static int ksm_sma_release(struct inode *inode, struct file *filp)
 {
+	struct ksm_mem_slot *slot, *node;
 	struct ksm_sma *ksm_sma = filp->private_data;
-	int r;
 
-	r = ksm_sma_ioctl_remove_memory_region(ksm_sma);
+	down_write(&slots_lock);
+	list_for_each_entry_safe(slot, node, &ksm_sma->sma_slots, sma_link) {
+		remove_mm_from_hash_and_tree(slot->mm);
+		mmput(slot->mm);
+		list_del(&slot->sma_link);
+		kfree(slot);
+	}
+	up_write(&slots_lock);
+
 	kfree(ksm_sma);
-	return r;
+	return 0;
 }
 
 static long ksm_sma_ioctl(struct file *filp,
@@ -607,7 +618,7 @@ static long ksm_sma_ioctl(struct file *filp,
 		break;
 	}
 	case KSM_REMOVE_MEMORY_REGION:
-		r = ksm_sma_ioctl_remove_memory_region(sma);
+		r = ksm_sma_ioctl_remove_memory_region(sma, arg);
 		break;
 	}
 
-- 
1.5.6.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/6] ksm: change the prot handling to use the generic helper functions
  2009-05-02 22:16     ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Izik Eidus
@ 2009-05-02 22:16       ` Izik Eidus
  2009-05-02 22:16         ` [PATCH 5/6] ksm: build system make it compile for all archs Izik Eidus
  2009-05-04 19:43       ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Hugh Dickins
  1 sibling, 1 reply; 14+ messages in thread
From: Izik Eidus @ 2009-05-02 22:16 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, aarcange, chrisw, alan, device, linux-mm, hugh,
	nickpiggin, Izik Eidus

This is needed to avoid breaking some architectures.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 mm/ksm.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index c14019f..bfbbe1d 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -766,8 +766,8 @@ static int try_to_merge_two_pages_alloc(struct mm_struct *mm1,
 		up_read(&mm1->mmap_sem);
 		return ret;
 	}
-	prot = vma->vm_page_prot;
-	pgprot_val(prot) &= ~_PAGE_RW;
+
+	prot = vm_get_page_prot(vma->vm_flags & ~VM_WRITE);
 
 	copy_user_highpage(kpage, page1, addr1, vma);
 	ret = try_to_merge_one_page(mm1, vma, page1, kpage, prot);
@@ -784,8 +784,7 @@ static int try_to_merge_two_pages_alloc(struct mm_struct *mm1,
 			return ret;
 		}
 
-		prot = vma->vm_page_prot;
-		pgprot_val(prot) &= ~_PAGE_RW;
+		prot = vm_get_page_prot(vma->vm_flags & ~VM_WRITE);
 
 		ret = try_to_merge_one_page(mm2, vma, page2, kpage,
 					    prot);
@@ -831,8 +830,9 @@ static int try_to_merge_two_pages_noalloc(struct mm_struct *mm1,
 		up_read(&mm1->mmap_sem);
 		return ret;
 	}
-	prot = vma->vm_page_prot;
-	pgprot_val(prot) &= ~_PAGE_RW;
+
+	prot = vm_get_page_prot(vma->vm_flags & ~VM_WRITE);
+
 	ret = try_to_merge_one_page(mm1, vma, page1, page2, prot);
 	up_read(&mm1->mmap_sem);
 	if (!ret)
-- 
1.5.6.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/6] ksm: build system make it compile for all archs
  2009-05-02 22:16       ` [PATCH 4/6] ksm: change the prot handling to use the generic helper functions Izik Eidus
@ 2009-05-02 22:16         ` Izik Eidus
  2009-05-02 22:16           ` [PATCH 6/6] ksm: use another miscdevice minor number Izik Eidus
  0 siblings, 1 reply; 14+ messages in thread
From: Izik Eidus @ 2009-05-02 22:16 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, aarcange, chrisw, alan, device, linux-mm, hugh,
	nickpiggin, Izik Eidus

The known issues with cross platform support were fixed,
so we return it back to compile on all archs.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 mm/Kconfig |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index f59b1e4..fb8ac63 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -228,7 +228,6 @@ config MMU_NOTIFIER
 
 config KSM
 	tristate "Enable KSM for page sharing"
-	depends on X86
 	help
 	  Enable the KSM kernel module to allow page sharing of equal pages
 	  among different tasks.
-- 
1.5.6.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/6] ksm: use another miscdevice minor number.
  2009-05-02 22:16         ` [PATCH 5/6] ksm: build system make it compile for all archs Izik Eidus
@ 2009-05-02 22:16           ` Izik Eidus
  0 siblings, 0 replies; 14+ messages in thread
From: Izik Eidus @ 2009-05-02 22:16 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, aarcange, chrisw, alan, device, linux-mm, hugh,
	nickpiggin, Izik Eidus

The old number was registered already by another project.
The new number is #234.

Thanks.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 Documentation/devices.txt  |    1 +
 include/linux/miscdevice.h |    2 +-
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/Documentation/devices.txt b/Documentation/devices.txt
index 53d64d3..a0c3259 100644
--- a/Documentation/devices.txt
+++ b/Documentation/devices.txt
@@ -443,6 +443,7 @@ Your cooperation is appreciated.
 		231 = /dev/snapshot	System memory snapshot device
 		232 = /dev/kvm		Kernel-based virtual machine (hardware virtualization extensions)
 		233 = /dev/kmview	View-OS A process with a view
+		234 = /dev/ksm		Dynamic page sharing
 		240-254			Reserved for local use
 		255			Reserved for MISC_DYNAMIC_MINOR
 
diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h
index 297c0bb..c7b8e9b 100644
--- a/include/linux/miscdevice.h
+++ b/include/linux/miscdevice.h
@@ -30,7 +30,7 @@
 #define HPET_MINOR		228
 #define FUSE_MINOR		229
 #define KVM_MINOR		232
-#define KSM_MINOR		233
+#define KSM_MINOR		234
 #define MISC_DYNAMIC_MINOR	255
 
 struct device;
-- 
1.5.6.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd.
  2009-05-02 22:16 ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Izik Eidus
  2009-05-02 22:16   ` [PATCH 2/6] ksm: dont allow overlap memory addresses registrations Izik Eidus
@ 2009-05-03  2:08   ` Rik van Riel
  2009-05-03  9:06     ` Izik Eidus
  2009-05-03  2:08   ` Rik van Riel
  2 siblings, 1 reply; 14+ messages in thread
From: Rik van Riel @ 2009-05-03  2:08 UTC (permalink / raw)
  To: Izik Eidus
  Cc: akpm, linux-kernel, aarcange, chrisw, alan, device, linux-mm,
	hugh, nickpiggin

On Sun,  3 May 2009 01:16:07 +0300
Izik Eidus <ieidus@redhat.com> wrote:

> Right now user can open /dev/ksm fd and register unlimited number of
> regions, such behavior may allocate unlimited amount of kernel memory
> and get the whole host into out of memory situation.

How many times can a process open /dev/ksm?

If a process can open /dev/ksm a thousand times and then
register 1000 regions through each file descriptor, this
patch does not help all that much...

-- 
All rights reversed.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd.
  2009-05-02 22:16 ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Izik Eidus
  2009-05-02 22:16   ` [PATCH 2/6] ksm: dont allow overlap memory addresses registrations Izik Eidus
  2009-05-03  2:08   ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Rik van Riel
@ 2009-05-03  2:08   ` Rik van Riel
  2 siblings, 0 replies; 14+ messages in thread
From: Rik van Riel @ 2009-05-03  2:08 UTC (permalink / raw)
  To: Izik Eidus
  Cc: akpm, linux-kernel, aarcange, chrisw, alan, device, linux-mm,
	hugh, nickpiggin

On Sun,  3 May 2009 01:16:07 +0300
Izik Eidus <ieidus@redhat.com> wrote:

> Right now user can open /dev/ksm fd and register unlimited number of
> regions, such behavior may allocate unlimited amount of kernel memory
> and get the whole host into out of memory situation.
> 
> Signed-off-by: Izik Eidus <ieidus@redhat.com>
> ---
>  mm/ksm.c |   15 +++++++++++++++
>  1 files changed, 15 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 6165276..d58db6b 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -48,6 +48,9 @@ static int rmap_hash_size;
>  module_param(rmap_hash_size, int, 0);
>  MODULE_PARM_DESC(rmap_hash_size, "Hash table size for the reverse
> mapping"); 
> +static int regions_per_fd;
> +module_param(regions_per_fd, int, 0);
> +
>  /*
>   * ksm_mem_slot - hold information for an userspace scanning range
>   * (the scanning for this region will be from addr untill addr +
> @@ -67,6 +70,7 @@ struct ksm_mem_slot {
>   */
>  struct ksm_sma {
>  	struct list_head sma_slots;
> +	int nregions;
>  };
>  
>  /**
> @@ -453,6 +457,11 @@ static int
> ksm_sma_ioctl_register_memory_region(struct ksm_sma *ksm_sma, struct
> ksm_mem_slot *slot; int ret = -EPERM;
>  
> +	if ((ksm_sma->nregions + 1) > regions_per_fd) {
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
>  	slot = kzalloc(sizeof(struct ksm_mem_slot), GFP_KERNEL);
>  	if (!slot) {
>  		ret = -ENOMEM;
> @@ -473,6 +482,7 @@ static int
> ksm_sma_ioctl_register_memory_region(struct ksm_sma *ksm_sma, 
>  	list_add_tail(&slot->link, &slots);
>  	list_add_tail(&slot->sma_link, &ksm_sma->sma_slots);
> +	ksm_sma->nregions++;
>  
>  	up_write(&slots_lock);
>  	return 0;
> @@ -511,6 +521,7 @@ static int
> ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma)
> mmput(slot->mm); list_del(&slot->sma_link);
>  		kfree(slot);
> +		ksm_sma->nregions--;
>  	}
>  	up_write(&slots_lock);
>  	return 0;
> @@ -1389,6 +1400,7 @@ static int
> ksm_dev_ioctl_create_shared_memory_area(void) }
>  
>  	INIT_LIST_HEAD(&ksm_sma->sma_slots);
> +	ksm_sma->nregions = 0;
>  
>  	fd = anon_inode_getfd("ksm-sma", &ksm_sma_fops, ksm_sma, 0);
>  	if (fd < 0)
> @@ -1631,6 +1643,9 @@ static int __init ksm_init(void)
>  	if (r)
>  		goto out_free1;
>  
> +	if (!regions_per_fd)
> +		regions_per_fd = 1024;
> +
>  	ksm_thread = kthread_run(ksm_scan_thread, NULL, "kksmd");
>  	if (IS_ERR(ksm_thread)) {
>  		printk(KERN_ERR "ksm: creating kthread failed\n");


-- 
All rights reversed.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd.
  2009-05-03  2:08   ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Rik van Riel
@ 2009-05-03  9:06     ` Izik Eidus
  0 siblings, 0 replies; 14+ messages in thread
From: Izik Eidus @ 2009-05-03  9:06 UTC (permalink / raw)
  To: Rik van Riel
  Cc: akpm, linux-kernel, aarcange, chrisw, alan, device, linux-mm,
	hugh, nickpiggin

Rik van Riel wrote:
> On Sun,  3 May 2009 01:16:07 +0300
> Izik Eidus <ieidus@redhat.com> wrote:
>
>   
>> Right now user can open /dev/ksm fd and register unlimited number of
>> regions, such behavior may allocate unlimited amount of kernel memory
>> and get the whole host into out of memory situation.
>>     
>
> How many times can a process open /dev/ksm?
>
> If a process can open /dev/ksm a thousand times and then
> register 1000 regions through each file descriptor, this
> patch does not help all that much...
>
>   
The idea is that the limitation is now on the maximum file descriptors 
user can open.
So for each such file descriptor user can open 1024 structures that are 
just few bytes each.

The whole propose of this patch is to avoid while (1) { 
IOCTL(REGISTER_MEMORY_REGION) } and oom the host.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl.
  2009-05-02 22:16     ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Izik Eidus
  2009-05-02 22:16       ` [PATCH 4/6] ksm: change the prot handling to use the generic helper functions Izik Eidus
@ 2009-05-04 19:43       ` Hugh Dickins
  2009-05-04 20:37         ` Izik Eidus
  1 sibling, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2009-05-04 19:43 UTC (permalink / raw)
  To: Izik Eidus
  Cc: akpm, linux-kernel, aarcange, chrisw, alan, device, linux-mm,
	nickpiggin

On Sun, 3 May 2009, Izik Eidus wrote:

> This patch change the KSM_REMOVE_MEMORY_REGION ioctl to be specific per
> memory region (instead of flushing all the registred memory regions inside
> the file descriptor like it happen now)
> 
> The previoes api was:
> user register memory regions using KSM_REGISTER_MEMORY_REGION inside the fd,
> and then when he wanted to remove just one memory region, he had to remove them
> all using KSM_REMOVE_MEMORY_REGION.
> 
> This patch change this beahivor by chaning the KSM_REMOVE_MEMORY_REGION
> ioctl to recive another paramter that it is the begining of the virtual
> address that is wanted to be removed.
> 
> (user can still remove all the memory regions all at once, by just closing
> the file descriptor)
> 
> Signed-off-by: Izik Eidus <ieidus@redhat.com>

I realize that it's ridiculous to break my silence with a comment
on this particular patch, when I've not yet commented on KSM as a
whole.  (In the last few days I have at last managed to set aside
some time to give KSM the attention it deserves, but I'm still
not yet through and ready to comment.)

However, although this patch is on the right lines (certainly you
should be allowing to remove individual regions rather than just
all at once), I believe the patch is seriously broken and corrupting
as is, so thought I'd better speak up now.

remove_mm_from_hash_and_tree(slot->mm) is still doing its own
silly loop through the slots:
	list_for_each_entry(slot, &slots, link)
		if (slot->mm == mm)
			break;
So it will be operating on whatever it finds first, in general
the wrong slot, and I expect havoc to follow once you kfree(slot).

Easily fixed: replace remove_mm_from_hash_and_tree(mm)
by remove_slot_from_hash_and_tree(slot).

Hugh

> ---
>  mm/ksm.c |   31 +++++++++++++++++++++----------
>  1 files changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 982dfff..c14019f 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -561,17 +561,20 @@ static void remove_mm_from_hash_and_tree(struct mm_struct *mm)
>  	list_del(&slot->link);
>  }
>  
> -static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma)
> +static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma,
> +					      unsigned long addr)
>  {
>  	struct ksm_mem_slot *slot, *node;
>  
>  	down_write(&slots_lock);
>  	list_for_each_entry_safe(slot, node, &ksm_sma->sma_slots, sma_link) {
> -		remove_mm_from_hash_and_tree(slot->mm);
> -		mmput(slot->mm);
> -		list_del(&slot->sma_link);
> -		kfree(slot);
> -		ksm_sma->nregions--;
> +		if (addr == slot->addr) {
> +			remove_mm_from_hash_and_tree(slot->mm);
> +			mmput(slot->mm);
> +			list_del(&slot->sma_link);
> +			kfree(slot);
> +			ksm_sma->nregions--;
> +		}
>  	}
>  	up_write(&slots_lock);
>  	return 0;
> @@ -579,12 +582,20 @@ static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma)
>  
>  static int ksm_sma_release(struct inode *inode, struct file *filp)
>  {
> +	struct ksm_mem_slot *slot, *node;
>  	struct ksm_sma *ksm_sma = filp->private_data;
> -	int r;
>  
> -	r = ksm_sma_ioctl_remove_memory_region(ksm_sma);
> +	down_write(&slots_lock);
> +	list_for_each_entry_safe(slot, node, &ksm_sma->sma_slots, sma_link) {
> +		remove_mm_from_hash_and_tree(slot->mm);
> +		mmput(slot->mm);
> +		list_del(&slot->sma_link);
> +		kfree(slot);
> +	}
> +	up_write(&slots_lock);
> +
>  	kfree(ksm_sma);
> -	return r;
> +	return 0;
>  }
>  
>  static long ksm_sma_ioctl(struct file *filp,
> @@ -607,7 +618,7 @@ static long ksm_sma_ioctl(struct file *filp,
>  		break;
>  	}
>  	case KSM_REMOVE_MEMORY_REGION:
> -		r = ksm_sma_ioctl_remove_memory_region(sma);
> +		r = ksm_sma_ioctl_remove_memory_region(sma, arg);
>  		break;
>  	}
>  
> -- 
> 1.5.6.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl.
  2009-05-04 19:43       ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Hugh Dickins
@ 2009-05-04 20:37         ` Izik Eidus
  0 siblings, 0 replies; 14+ messages in thread
From: Izik Eidus @ 2009-05-04 20:37 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: akpm, linux-kernel, aarcange, chrisw, alan, device, linux-mm,
	nickpiggin

Hugh Dickins wrote:
> On Sun, 3 May 2009, Izik Eidus wrote:
>
>   
>> This patch change the KSM_REMOVE_MEMORY_REGION ioctl to be specific per
>> memory region (instead of flushing all the registred memory regions inside
>> the file descriptor like it happen now)
>>
>> The previoes api was:
>> user register memory regions using KSM_REGISTER_MEMORY_REGION inside the fd,
>> and then when he wanted to remove just one memory region, he had to remove them
>> all using KSM_REMOVE_MEMORY_REGION.
>>
>> This patch change this beahivor by chaning the KSM_REMOVE_MEMORY_REGION
>> ioctl to recive another paramter that it is the begining of the virtual
>> address that is wanted to be removed.
>>
>> (user can still remove all the memory regions all at once, by just closing
>> the file descriptor)
>>
>> Signed-off-by: Izik Eidus <ieidus@redhat.com>
>>     
>
> I realize that it's ridiculous to break my silence

Hello ! :)

>  with a comment
> on this particular patch, when I've not yet commented on KSM as a
> whole.  (In the last few days I have at last managed to set aside
> some time to give KSM the attention it deserves, but I'm still
> not yet through and ready to comment.)
>
> However, although this patch is on the right lines (certainly you
> should be allowing to remove individual regions rather than just
> all at once), I believe the patch is seriously broken and corrupting
> as is, so thought I'd better speak up now.
>
> remove_mm_from_hash_and_tree(slot->mm) is still doing its own
> silly loop through the slots:
> 	list_for_each_entry(slot, &slots, link)
> 		if (slot->mm == mm)
> 			break;
> So it will be operating on whatever it finds first

I just started to write big answer that go over the code path to show 
why you are wrong, and then found the problem.

Thanks i will fix it and resend...


> , in general
> the wrong slot, and I expect havoc to follow once you kfree(slot).
>
> Easily fixed: replace remove_mm_from_hash_and_tree(mm)
> by remove_slot_from_hash_and_tree(slot).
>   

Yea, remove_mm_from_hash_and_tree(mm) is surely something that we dont 
need that was left from the old code base.

Thanks.
> Hugh
>
>   
>> ---
>>  mm/ksm.c |   31 +++++++++++++++++++++----------
>>  1 files changed, 21 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/ksm.c b/mm/ksm.c
>> index 982dfff..c14019f 100644
>> --- a/mm/ksm.c
>> +++ b/mm/ksm.c
>> @@ -561,17 +561,20 @@ static void remove_mm_from_hash_and_tree(struct mm_struct *mm)
>>  	list_del(&slot->link);
>>  }
>>  
>> -static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma)
>> +static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma,
>> +					      unsigned long addr)
>>  {
>>  	struct ksm_mem_slot *slot, *node;
>>  
>>  	down_write(&slots_lock);
>>  	list_for_each_entry_safe(slot, node, &ksm_sma->sma_slots, sma_link) {
>> -		remove_mm_from_hash_and_tree(slot->mm);
>> -		mmput(slot->mm);
>> -		list_del(&slot->sma_link);
>> -		kfree(slot);
>> -		ksm_sma->nregions--;
>> +		if (addr == slot->addr) {
>> +			remove_mm_from_hash_and_tree(slot->mm);
>> +			mmput(slot->mm);
>> +			list_del(&slot->sma_link);
>> +			kfree(slot);
>> +			ksm_sma->nregions--;
>> +		}
>>  	}
>>  	up_write(&slots_lock);
>>  	return 0;
>> @@ -579,12 +582,20 @@ static int ksm_sma_ioctl_remove_memory_region(struct ksm_sma *ksm_sma)
>>  
>>  static int ksm_sma_release(struct inode *inode, struct file *filp)
>>  {
>> +	struct ksm_mem_slot *slot, *node;
>>  	struct ksm_sma *ksm_sma = filp->private_data;
>> -	int r;
>>  
>> -	r = ksm_sma_ioctl_remove_memory_region(ksm_sma);
>> +	down_write(&slots_lock);
>> +	list_for_each_entry_safe(slot, node, &ksm_sma->sma_slots, sma_link) {
>> +		remove_mm_from_hash_and_tree(slot->mm);
>> +		mmput(slot->mm);
>> +		list_del(&slot->sma_link);
>> +		kfree(slot);
>> +	}
>> +	up_write(&slots_lock);
>> +
>>  	kfree(ksm_sma);
>> -	return r;
>> +	return 0;
>>  }
>>  
>>  static long ksm_sma_ioctl(struct file *filp,
>> @@ -607,7 +618,7 @@ static long ksm_sma_ioctl(struct file *filp,
>>  		break;
>>  	}
>>  	case KSM_REMOVE_MEMORY_REGION:
>> -		r = ksm_sma_ioctl_remove_memory_region(sma);
>> +		r = ksm_sma_ioctl_remove_memory_region(sma, arg);
>>  		break;
>>  	}
>>  
>> -- 
>> 1.5.6.5
>>     

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/6] ksm: change the prot handling to use the generic helper functions
  2009-05-04 22:25     ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Izik Eidus
@ 2009-05-04 22:25       ` Izik Eidus
  2009-05-06  0:54         ` Rik van Riel
  0 siblings, 1 reply; 14+ messages in thread
From: Izik Eidus @ 2009-05-04 22:25 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, aarcange, chrisw, alan, device, linux-mm, hugh,
	nickpiggin, Izik Eidus

This is needed to avoid breaking some architectures.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 mm/ksm.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 6e8b24b..8a0489b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -762,8 +762,8 @@ static int try_to_merge_two_pages_alloc(struct mm_struct *mm1,
 		up_read(&mm1->mmap_sem);
 		return ret;
 	}
-	prot = vma->vm_page_prot;
-	pgprot_val(prot) &= ~_PAGE_RW;
+
+	prot = vm_get_page_prot(vma->vm_flags & ~VM_WRITE);
 
 	copy_user_highpage(kpage, page1, addr1, vma);
 	ret = try_to_merge_one_page(mm1, vma, page1, kpage, prot);
@@ -780,8 +780,7 @@ static int try_to_merge_two_pages_alloc(struct mm_struct *mm1,
 			return ret;
 		}
 
-		prot = vma->vm_page_prot;
-		pgprot_val(prot) &= ~_PAGE_RW;
+		prot = vm_get_page_prot(vma->vm_flags & ~VM_WRITE);
 
 		ret = try_to_merge_one_page(mm2, vma, page2, kpage,
 					    prot);
@@ -827,8 +826,9 @@ static int try_to_merge_two_pages_noalloc(struct mm_struct *mm1,
 		up_read(&mm1->mmap_sem);
 		return ret;
 	}
-	prot = vma->vm_page_prot;
-	pgprot_val(prot) &= ~_PAGE_RW;
+
+	prot = vm_get_page_prot(vma->vm_flags & ~VM_WRITE);
+
 	ret = try_to_merge_one_page(mm1, vma, page1, page2, prot);
 	up_read(&mm1->mmap_sem);
 	if (!ret)
-- 
1.5.6.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/6] ksm: change the prot handling to use the generic helper functions
  2009-05-04 22:25       ` [PATCH 4/6] ksm: change the prot handling to use the generic helper functions Izik Eidus
@ 2009-05-06  0:54         ` Rik van Riel
  0 siblings, 0 replies; 14+ messages in thread
From: Rik van Riel @ 2009-05-06  0:54 UTC (permalink / raw)
  To: Izik Eidus
  Cc: akpm, linux-kernel, aarcange, chrisw, alan, device, linux-mm,
	hugh, nickpiggin

Izik Eidus wrote:
> This is needed to avoid breaking some architectures.
> 
> Signed-off-by: Izik Eidus <ieidus@redhat.com>

Acked-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2009-05-06  0:53 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-05-02 22:16 [PATCH 0/6] ksm changes Izik Eidus
2009-05-02 22:16 ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Izik Eidus
2009-05-02 22:16   ` [PATCH 2/6] ksm: dont allow overlap memory addresses registrations Izik Eidus
2009-05-02 22:16     ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Izik Eidus
2009-05-02 22:16       ` [PATCH 4/6] ksm: change the prot handling to use the generic helper functions Izik Eidus
2009-05-02 22:16         ` [PATCH 5/6] ksm: build system make it compile for all archs Izik Eidus
2009-05-02 22:16           ` [PATCH 6/6] ksm: use another miscdevice minor number Izik Eidus
2009-05-04 19:43       ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Hugh Dickins
2009-05-04 20:37         ` Izik Eidus
2009-05-03  2:08   ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Rik van Riel
2009-05-03  9:06     ` Izik Eidus
2009-05-03  2:08   ` Rik van Riel
  -- strict thread matches above, loose matches on Subject: below --
2009-05-04 22:25 [PATCH 0/6] ksm changes (v2) Izik Eidus
2009-05-04 22:25 ` [PATCH 1/6] ksm: limiting the num of mem regions user can register per fd Izik Eidus
2009-05-04 22:25   ` [PATCH 2/6] ksm: dont allow overlap memory addresses registrations Izik Eidus
2009-05-04 22:25     ` [PATCH 3/6] ksm: change the KSM_REMOVE_MEMORY_REGION ioctl Izik Eidus
2009-05-04 22:25       ` [PATCH 4/6] ksm: change the prot handling to use the generic helper functions Izik Eidus
2009-05-06  0:54         ` Rik van Riel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).