* [ANNOUNCE] kmemcheck v7
@ 2008-04-04 13:44 Vegard Nossum
  2008-04-04 13:45 ` [PATCH 1/3] kmemcheck: add the kmemcheck core Vegard Nossum
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Vegard Nossum @ 2008-04-04 13:44 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Pekka Enberg, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Christoph Lameter, Daniel Walker, Andi Kleen, Randy Dunlap,
	Josh Aune, Pekka Paalanen

Hi,

I skipped the public announcements for versions 5 and 6, but here is 7 :)

General description: kmemcheck is a patch to the linux kernel that
detects use of uninitialized memory. It does this by trapping every
read and write to memory that was allocated dynamically (e.g. using
kmalloc()). If a memory address is read that has not previously been
written to, a message is printed to the kernel log.
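
To illustrate the principle, here is a minimal userspace model of the
shadow-memory idea (plain C, not part of the patch; the real kmemcheck does
this bookkeeping transparently through page faults and single-stepping, as
described in Documentation/kmemcheck.txt below):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum shadow { SHADOW_UNINITIALIZED, SHADOW_INITIALIZED };

struct tracked {
	unsigned char *data;
	unsigned char *shadow;	/* one status byte per data byte */
	size_t size;
};

static struct tracked tracked_alloc(size_t n)
{
	struct tracked t = { malloc(n), malloc(n), n };

	/* Freshly allocated memory starts out uninitialized. */
	memset(t.shadow, SHADOW_UNINITIALIZED, n);
	return t;
}

static void tracked_write(struct tracked *t, size_t i, unsigned char v)
{
	t->data[i] = v;
	t->shadow[i] = SHADOW_INITIALIZED;
}

static unsigned char tracked_read(struct tracked *t, size_t i)
{
	if (t->shadow[i] != SHADOW_INITIALIZED)
		fprintf(stderr, "read of uninitialized byte %zu\n", i);
	return t->data[i];
}

int main(void)
{
	struct tracked t = tracked_alloc(4);

	tracked_write(&t, 0, 42);
	tracked_read(&t, 0);	/* ok: written before it is read */
	tracked_read(&t, 1);	/* warns: never written to */
	return 0;
}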

Changes since v4 (rough list):
- SLUB parts were broken out into their own file to avoid cluttering the main
   SLUB code.
- Quite a lot of cleanups, including removing #ifdefs from arch code.
- Some preparation in anticipation of an x86_64 port.
- Make reporting safer by using a periodic timer to inspect the error queue.
- Fix hang due to page flags changing too early on free().
- Fix hang due to kprobes incompatibility.
- Allow CONFIG_SMP, but limit number of CPUs to 1 at run-time.
- Add kmemcheck=0|1 boot option.
- Add /proc/sys/kernel/kmemcheck for run-time enabling/disabling.


These patches apply to Linus's v2.6.25-rc8. The latest patchset can also be
found here: http://folk.uio.no/vegardno/linux/kmemcheck/

(I will try to submit this for inclusion in 2.6.26, and testing and feedback
are of course very welcome!)

I would like to thank the following people, who provided patches or helped
in various ways:

Ingo Molnar
Paul McKenney
Pekka Enberg
Pekka Paalanen
Peter Zijlstra
Randy Dunlap


Kind regards,
Vegard Nossum


* [PATCH 1/3] kmemcheck: add the kmemcheck core
  2008-04-04 13:44 [ANNOUNCE] kmemcheck v7 Vegard Nossum
@ 2008-04-04 13:45 ` Vegard Nossum
  2008-04-04 13:46 ` [PATCH 2/3] x86: add hooks for kmemcheck Vegard Nossum
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Vegard Nossum @ 2008-04-04 13:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Pekka Enberg, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Christoph Lameter, Daniel Walker, Andi Kleen, Randy Dunlap,
	Josh Aune, Pekka Paalanen

 From bb63a1de75a67ecd88c962c2616f0ab4217f27fe Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@gmail.com>
Date: Fri, 4 Apr 2008 00:51:41 +0200
Subject: [PATCH] kmemcheck: add the kmemcheck core

General description: kmemcheck is a patch to the linux kernel that
detects use of uninitialized memory. It does this by trapping every
read and write to memory that was allocated dynamically (e.g. using
kmalloc()). If a memory address is read that has not previously been
written to, a message is printed to the kernel log.

Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
  Documentation/kmemcheck.txt  |   93 +++++
  arch/x86/Kconfig.debug       |   47 +++
  arch/x86/kernel/Makefile     |    2 +
  arch/x86/kernel/kmemcheck.c  |  893 ++++++++++++++++++++++++++++++++++++++++++
  include/asm-x86/kmemcheck.h  |   30 ++
  include/asm-x86/pgtable.h    |    4 +-
  include/asm-x86/pgtable_32.h |    6 +
  include/linux/kmemcheck.h    |   27 ++
  include/linux/page-flags.h   |    6 +
  init/main.c                  |    2 +
  kernel/sysctl.c              |   12 +
  11 files changed, 1120 insertions(+), 2 deletions(-)
  create mode 100644 Documentation/kmemcheck.txt
  create mode 100644 arch/x86/kernel/kmemcheck.c
  create mode 100644 include/asm-x86/kmemcheck.h
  create mode 100644 include/linux/kmemcheck.h

diff --git a/Documentation/kmemcheck.txt b/Documentation/kmemcheck.txt
new file mode 100644
index 0000000..9d359d2
--- /dev/null
+++ b/Documentation/kmemcheck.txt
@@ -0,0 +1,93 @@
+Technical description
+=====================
+
+kmemcheck works by marking memory pages non-present. This means that whenever
+somebody attempts to access the page, a page fault is generated. The page
+fault handler notices that the page was in fact only hidden, and so it calls
+on the kmemcheck code to make further investigations.
+
+When the investigations are completed, kmemcheck "shows" the page by marking
+it present (as it would be under normal circumstances). This way, the
+interrupted code can continue as usual.
+
+But after the instruction has been executed, we should hide the page again, so
+that we can catch the next access too! Now kmemcheck makes use of a debugging
+feature of the processor, namely single-stepping. When the processor has
+finished the one instruction that generated the memory access, a debug
+exception is raised. From here, we simply hide the page again and continue
+execution, this time with the single-stepping feature turned off.
+
+
+Changes to the memory allocator (SLUB)
+======================================
+
+kmemcheck requires some assistance from the memory allocator in order to work.
+The memory allocator needs to
+
+1. Request twice as much memory as would normally be needed. The bottom half
+   of the memory is what the user actually sees and uses; the upper half
+   contains the so-called shadow memory, which stores the status of each byte
+   in the bottom half, e.g. initialized or uninitialized.
+2. Tell kmemcheck which parts of memory should be marked uninitialized. There
+   are actually a few more states, such as "not yet allocated" and "recently
+   freed".
+
+If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
+memory that can take page faults because of kmemcheck.
+
+If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
+request memory with the __GFP_NOTRACK flag. This does not prevent the page
+faults from occurring, but it marks the object in question as being
+initialized so that no warnings will ever be produced for this object.
+
+
+Problems
+========
+
+The most prominent problem seems to be that of bit-fields. kmemcheck can only
+track memory with byte granularity. Therefore, when gcc generates code to
+access only one bit in a bit-field, there is really no way for kmemcheck to
+know which of the other bits will be used or thrown away. Consequently, there
+may be bogus warnings for bit-field accesses. There is some experimental
+support to detect this automatically, though it is probably better to work
+around this by explicitly initializing whole bit-fields at once.
+
+Some allocations are used for DMA. As DMA doesn't go through the paging
+mechanism, we have absolutely no way to detect DMA writes. This means that
+spurious warnings may be seen on access to DMA memory. DMA allocations should
+be annotated with the __GFP_NOTRACK flag or allocated from caches marked
+SLAB_NOTRACK to work around this problem.
+
+
+Parameters
+==========
+
+In addition to enabling CONFIG_KMEMCHECK before the kernel is compiled, the
+parameter kmemcheck=1 must be passed to the kernel when it is started in order
+to actually do the tracking. So by default, there is only a very small
+(probably negligible) overhead for enabling the config option.
+
+Similarly, kmemcheck may be turned on or off at run-time using, respectively:
+
+echo 1 > /proc/sys/kernel/kmemcheck
+	and
+echo 0 > /proc/sys/kernel/kmemcheck
+
+Note that this is a lazy setting; once turned off, the old allocations will
+still have to take a single page fault exception before tracking is turned off
+for that particular page. Turning kmemcheck on will only enable tracking for
+allocations made from that point onwards.
+
+
+Future enhancements
+===================
+
+There is already some preliminary support for catching use-after-free errors.
+What still needs to be done is delaying kfree() so that memory is not
+reallocated immediately after freeing it. [Suggested by Pekka Enberg.]
+
+It should be possible to allow SMP systems by duplicating the page tables for
+each processor in the system. This is probably extremely difficult, however.
+[Suggested by Ingo Molnar.]
+
+Support for instruction set extensions like XMM, SSE2, etc.
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 702eb39..a33683f 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -134,6 +134,53 @@ config IOMMU_LEAK
  	  Add a simple leak tracer to the IOMMU code. This is useful when you
  	  are debugging a buggy device driver that leaks IOMMU mappings.

+config KMEMCHECK
+	bool "kmemcheck: trap use of uninitialized memory"
+	depends on X86_32
+	depends on !X86_USE_3DNOW
+	depends on !CC_OPTIMIZE_FOR_SIZE
+	depends on !DEBUG_PAGEALLOC && SLUB
+	select FRAME_POINTER
+	select STACKTRACE
+	default n
+	help
+	  This option enables tracing of dynamically allocated kernel memory
+	  to see if memory is used before it has been given an initial value.
+	  Be aware that this requires half of your memory for bookkeeping and
+	  will insert extra code at *every* read and write to tracked memory,
+	  thus slowing down the kernel code (but user code is unaffected).
+
+	  The kernel may be started with kmemcheck=0 or kmemcheck=1 to disable
+	  or enable kmemcheck at boot-time. If the kernel is started with
+	  kmemcheck=0, the large memory and CPU overhead is not incurred.
+
+config KMEMCHECK_ENABLED_BY_DEFAULT
+	bool "kmemcheck: enable at boot by default"
+	depends on KMEMCHECK
+	default y
+	help
+	  This option controls the default behaviour of kmemcheck when the
+	  kernel boots and no kmemcheck= parameter is given.
+
+config KMEMCHECK_PARTIAL_OK
+	bool "kmemcheck: allow partially uninitialized memory"
+	depends on KMEMCHECK
+	default y
+	help
+	  This option works around certain GCC optimizations that produce
+	  32-bit reads from 16-bit variables where the upper 16 bits are
+	  thrown away afterwards. This may of course also hide some real
+	  bugs.
+
+config KMEMCHECK_BITOPS_OK
+	bool "kmemcheck: allow bit-field manipulation"
+	depends on KMEMCHECK
+	default n
+	help
+	  This option silences warnings that would be generated for bit-field
+	  accesses where not all the bits are initialized at the same time.
+	  This may also hide some real bugs.
+
  #
  # IO delay types:
  #
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 4eb5ce8..e1fcc1e 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -86,6 +86,8 @@ endif
  obj-$(CONFIG_SCx200)		+= scx200.o
  scx200-y			+= scx200_32.o

+obj-$(CONFIG_KMEMCHECK)		+= kmemcheck.o
+
  ###
  # 64 bit specific files
  ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/kmemcheck.c b/arch/x86/kernel/kmemcheck.c
new file mode 100644
index 0000000..1dd79f5
--- /dev/null
+++ b/arch/x86/kernel/kmemcheck.c
@@ -0,0 +1,893 @@
+/**
+ * kmemcheck - a heavyweight memory checker for the linux kernel
+ * Copyright (C) 2007, 2008  Vegard Nossum <vegardno@ifi.uio.no>
+ * (With a lot of help from Ingo Molnar and Pekka Enberg.)
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License (version 2) as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/init.h>
+#include <linux/kallsyms.h>
+#include <linux/kdebug.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/page-flags.h>
+#include <linux/stacktrace.h>
+#include <linux/timer.h>
+
+#include <asm/cacheflush.h>
+#include <asm/kmemcheck.h>
+#include <asm/pgtable.h>
+#include <asm/string.h>
+#include <asm/tlbflush.h>
+
+enum shadow {
+	SHADOW_UNALLOCATED,
+	SHADOW_UNINITIALIZED,
+	SHADOW_INITIALIZED,
+	SHADOW_FREED,
+};
+
+enum kmemcheck_error_type {
+	ERROR_INVALID_ACCESS,
+	ERROR_BUG,
+};
+
+struct kmemcheck_error {
+	enum kmemcheck_error_type type;
+
+	union {
+		/* ERROR_INVALID_ACCESS */
+		struct {
+			/* Kind of access that caused the error */
+			enum shadow		state;
+			/* Address and size of the erroneous read */
+			uint32_t		address;
+			unsigned int		size;
+		};
+	};
+
+	struct pt_regs		regs;
+	struct stack_trace	trace;
+	unsigned long		trace_entries[32];
+};
+
+/*
+ * Create a ring queue of errors to output. We can't call printk() directly
+ * from the kmemcheck traps, since this may call the console drivers and
+ * result in a recursive fault.
+ */
+static struct kmemcheck_error error_fifo[32];
+static unsigned int error_count;
+static unsigned int error_rd;
+static unsigned int error_wr;
+
+static struct timer_list kmemcheck_timer;
+
+static struct kmemcheck_error *
+error_next_wr(void)
+{
+	struct kmemcheck_error *e;
+
+	if (error_count == ARRAY_SIZE(error_fifo))
+		return NULL;
+
+	e = &error_fifo[error_wr];
+	if (++error_wr == ARRAY_SIZE(error_fifo))
+		error_wr = 0;
+	++error_count;
+	return e;
+}
+
+static struct kmemcheck_error *
+error_next_rd(void)
+{
+	struct kmemcheck_error *e;
+
+	if (error_count == 0)
+		return NULL;
+
+	e = &error_fifo[error_rd];
+	if (++error_rd == ARRAY_SIZE(error_fifo))
+		error_rd = 0;
+	--error_count;
+	return e;
+}
+
+/*
+ * Save the context of an error.
+ */
+static void
+error_save(enum shadow state, uint32_t address, unsigned int size,
+	struct pt_regs *regs)
+{
+	static uint32_t prev_ip;
+
+	struct kmemcheck_error *e;
+
+	/* Don't report several adjacent errors from the same EIP. */
+	if (regs->ip == prev_ip)
+		return;
+	prev_ip = regs->ip;
+
+	e = error_next_wr();
+	if (!e)
+		return;
+
+	e->type = ERROR_INVALID_ACCESS;
+
+	e->state = state;
+	e->address = address;
+	e->size = size;
+
+	/* Save regs */
+	memcpy(&e->regs, regs, sizeof(*regs));
+
+	/* Save stack trace */
+	e->trace.nr_entries = 0;
+	e->trace.entries = e->trace_entries;
+	e->trace.max_entries = ARRAY_SIZE(e->trace_entries);
+	e->trace.skip = 1;
+	save_stack_trace(&e->trace);
+}
+
+/*
+ * Save the context of a kmemcheck bug.
+ */
+static void
+error_save_bug(struct pt_regs *regs)
+{
+	struct kmemcheck_error *e;
+
+	e = error_next_wr();
+	if (!e)
+		return;
+
+	e->type = ERROR_BUG;
+
+	memcpy(&e->regs, regs, sizeof(*regs));
+
+	e->trace.nr_entries = 0;
+	e->trace.entries = e->trace_entries;
+	e->trace.max_entries = ARRAY_SIZE(e->trace_entries);
+	e->trace.skip = 1;
+	save_stack_trace(&e->trace);
+}
+
+static void
+error_recall(void)
+{
+	static const char *desc[] = {
+		[SHADOW_UNALLOCATED]	= "unallocated",
+		[SHADOW_UNINITIALIZED]	= "uninitialized",
+		[SHADOW_INITIALIZED]	= "initialized",
+		[SHADOW_FREED]		= "freed",
+	};
+
+	struct kmemcheck_error *e;
+
+	e = error_next_rd();
+	if (!e)
+		return;
+
+	switch (e->type) {
+	case ERROR_INVALID_ACCESS:
+		printk(KERN_ERR  "kmemcheck: Caught %d-bit read "
+			"from %s memory (%08x)\n",
+			e->size, desc[e->state], e->address);
+		break;
+	case ERROR_BUG:
+		printk(KERN_EMERG "kmemcheck: Fatal error\n");
+		break;
+	}
+
+	__show_registers(&e->regs, 1);
+	print_stack_trace(&e->trace, 0);
+}
+
+static void
+do_wakeup(unsigned long data)
+{
+	while (error_count > 0)
+		error_recall();
+	mod_timer(&kmemcheck_timer, kmemcheck_timer.expires + HZ);
+}
+
+void __init
+kmemcheck_init(void)
+{
+	printk(KERN_INFO "kmemcheck: \"Bugs, beware!\"\n");
+
+#ifdef CONFIG_SMP
+	/* Limit SMP to use a single CPU. We rely on the fact that this code
+	 * runs before SMP is set up. */
+	if (setup_max_cpus > 1) {
+		printk(KERN_INFO
+			"kmemcheck: Limiting number of CPUs to 1.\n");
+		setup_max_cpus = 1;
+	}
+#endif
+
+	setup_timer(&kmemcheck_timer, &do_wakeup, 0);
+	mod_timer(&kmemcheck_timer, jiffies + HZ);
+}
+
+#ifdef CONFIG_KMEMCHECK_ENABLED_BY_DEFAULT
+int kmemcheck_enabled = 1;
+#else
+int kmemcheck_enabled = 0;
+#endif
+
+static int __init
+param_kmemcheck(char *str)
+{
+	if (!str)
+		return -EINVAL;
+
+	switch (str[0]) {
+	case '0':
+		kmemcheck_enabled = 0;
+		return 0;
+	case '1':
+		kmemcheck_enabled = 1;
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
+early_param("kmemcheck", param_kmemcheck);
+
+/*
+ * Return the shadow address for the given address. Returns NULL if the
+ * address is not tracked.
+ */
+static void *
+address_get_shadow(unsigned long address)
+{
+	struct page *page;
+	struct page *head;
+
+	if (address < PAGE_OFFSET)
+		return NULL;
+	page = virt_to_page(address);
+	if (!page)
+		return NULL;
+	head = compound_head(page);
+	if (!PageHead(head))
+		return NULL;
+	if (!PageSlab(head))
+		return NULL;
+	if (!PageTracked(head))
+		return NULL;
+
+	return (void *) address + (PAGE_SIZE << (compound_order(head) - 1));
+}
+
+static int
+show_addr(uint32_t addr)
+{
+	pte_t *pte;
+	int level;
+
+	if (!address_get_shadow(addr))
+		return 0;
+
+	pte = lookup_address(addr, &level);
+	BUG_ON(!pte);
+	BUG_ON(level != PG_LEVEL_4K);
+
+	set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
+	__flush_tlb_one(addr);
+	return 1;
+}
+
+/*
+ * In case there's something seriously wrong with kmemcheck (like a recursive
+ * or looping page fault), we should disable tracking for the page as a last
+ * attempt to not hang the machine.
+ */
+static void
+emergency_show_addr(uint32_t address)
+{
+	pte_t *pte;
+	int level;
+
+	pte = lookup_address(address, &level);
+	if (!pte)
+		return;
+	if (level != PG_LEVEL_4K)
+		return;
+
+	/* Don't change pages that weren't hidden in the first place -- they
+	 * aren't ours to modify. */
+	if (!(pte_val(*pte) & _PAGE_HIDDEN))
+		return;
+
+	set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
+	__flush_tlb_one(address);
+}
+
+static int
+hide_addr(uint32_t addr)
+{
+	pte_t *pte;
+	int level;
+
+	if (!address_get_shadow(addr))
+		return 0;
+
+	pte = lookup_address(addr, &level);
+	BUG_ON(!pte);
+	BUG_ON(level != PG_LEVEL_4K);
+
+	set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
+	__flush_tlb_one(addr);
+	return 1;
+}
+
+struct kmemcheck_context {
+	bool busy;
+	int balance;
+
+	uint32_t addr1;
+	uint32_t addr2;
+	uint32_t flags;
+};
+
+DEFINE_PER_CPU(struct kmemcheck_context, kmemcheck_context);
+
+bool
+kmemcheck_active(struct pt_regs *regs)
+{
+	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+
+	return data->balance > 0;
+}
+
+/*
+ * Called from the #PF handler.
+ */
+void
+kmemcheck_show(struct pt_regs *regs)
+{
+	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	int n;
+
+	BUG_ON(!irqs_disabled());
+
+	if (unlikely(data->balance != 0)) {
+		emergency_show_addr(data->addr1);
+		emergency_show_addr(data->addr2);
+		error_save_bug(regs);
+		data->balance = 0;
+		return;
+	}
+
+	n = 0;
+	n += show_addr(data->addr1);
+	n += show_addr(data->addr2);
+
+	/* None of the addresses actually belonged to kmemcheck. Note that
+	 * this is not an error. */
+	if (n == 0)
+		return;
+
+	++data->balance;
+
+	/*
+	 * The IF needs to be cleared as well, so that the faulting
+	 * instruction can run "uninterrupted". Otherwise, we might take
+	 * an interrupt and start executing that before we've had a chance
+	 * to hide the page again.
+	 *
+	 * NOTE: In the rare case of multiple faults, we must not override
+	 * the original flags:
+	 */
+	if (!(regs->flags & TF_MASK))
+		data->flags = regs->flags;
+
+	regs->flags |= TF_MASK;
+	regs->flags &= ~IF_MASK;
+}
+
+/*
+ * Called from the #DB handler.
+ */
+void
+kmemcheck_hide(struct pt_regs *regs)
+{
+	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+	int n;
+
+	BUG_ON(!irqs_disabled());
+
+	if (data->balance == 0)
+		return;
+
+	if (unlikely(data->balance != 1)) {
+		emergency_show_addr(data->addr1);
+		emergency_show_addr(data->addr2);
+		error_save_bug(regs);
+		data->addr1 = 0;
+		data->addr2 = 0;
+		data->balance = 0;
+
+		if (!(data->flags & TF_MASK))
+			regs->flags &= ~TF_MASK;
+		if (data->flags & IF_MASK)
+			regs->flags |= IF_MASK;
+		return;
+	}
+
+	n = 0;
+	if (kmemcheck_enabled) {
+		n += hide_addr(data->addr1);
+		n += hide_addr(data->addr2);
+	} else {
+		n += show_addr(data->addr1);
+		n += show_addr(data->addr2);
+	}
+
+	if (n == 0)
+		return;
+
+	--data->balance;
+
+	data->addr1 = 0;
+	data->addr2 = 0;
+
+	if (!(data->flags & TF_MASK))
+		regs->flags &= ~TF_MASK;
+	if (data->flags & IF_MASK)
+		regs->flags |= IF_MASK;
+}
+
+void
+kmemcheck_show_pages(struct page *p, unsigned int n)
+{
+	unsigned int i;
+	struct page *head;
+
+	head = compound_head(p);
+	BUG_ON(!PageHead(head));
+
+	ClearPageTracked(head);
+
+	for (i = 0; i < n; ++i) {
+		unsigned long address;
+		pte_t *pte;
+		int level;
+
+		address = (unsigned long) page_address(&p[i]);
+		pte = lookup_address(address, &level);
+		BUG_ON(!pte);
+		BUG_ON(level != PG_LEVEL_4K);
+
+		set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
+		set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_HIDDEN));
+		__flush_tlb_one(address);
+	}
+}
+
+void
+kmemcheck_hide_pages(struct page *p, unsigned int n)
+{
+	unsigned int i;
+	struct page *head;
+
+	head = compound_head(p);
+	BUG_ON(!PageHead(head));
+
+	SetPageTracked(head);
+
+	for (i = 0; i < n; ++i) {
+		unsigned long address;
+		pte_t *pte;
+		int level;
+
+		address = (unsigned long) page_address(&p[i]);
+		pte = lookup_address(address, &level);
+		BUG_ON(!pte);
+		BUG_ON(level != PG_LEVEL_4K);
+
+		set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
+		set_pte(pte, __pte(pte_val(*pte) | _PAGE_HIDDEN));
+		__flush_tlb_one(address);
+	}
+}
+
+static void
+mark_shadow(void *address, unsigned int n, enum shadow status)
+{
+	void *shadow;
+
+	shadow = address_get_shadow((unsigned long) address);
+	if (!shadow)
+		return;
+	__memset(shadow, status, n);
+}
+
+void
+kmemcheck_mark_unallocated(void *address, unsigned int n)
+{
+	mark_shadow(address, n, SHADOW_UNALLOCATED);
+}
+
+void
+kmemcheck_mark_uninitialized(void *address, unsigned int n)
+{
+	mark_shadow(address, n, SHADOW_UNINITIALIZED);
+}
+
+/*
+ * Fill the shadow memory of the given address such that the memory at that
+ * address is marked as being initialized.
+ */
+void
+kmemcheck_mark_initialized(void *address, unsigned int n)
+{
+	mark_shadow(address, n, SHADOW_INITIALIZED);
+}
+
+void
+kmemcheck_mark_freed(void *address, unsigned int n)
+{
+	mark_shadow(address, n, SHADOW_FREED);
+}
+
+void
+kmemcheck_mark_unallocated_pages(struct page *p, unsigned int n)
+{
+	unsigned int i;
+
+	for (i = 0; i < n; ++i)
+		kmemcheck_mark_unallocated(page_address(&p[i]), PAGE_SIZE);
+}
+
+void
+kmemcheck_mark_uninitialized_pages(struct page *p, unsigned int n)
+{
+	unsigned int i;
+
+	for (i = 0; i < n; ++i)
+		kmemcheck_mark_uninitialized(page_address(&p[i]), PAGE_SIZE);
+}
+
+static bool
+opcode_is_prefix(uint8_t b)
+{
+	return
+		/* Group 1 */
+		b == 0xf0 || b == 0xf2 || b == 0xf3
+		/* Group 2 */
+		|| b == 0x2e || b == 0x36 || b == 0x3e || b == 0x26
+		|| b == 0x64 || b == 0x65 || b == 0x2e || b == 0x3e
+		/* Group 3 */
+		|| b == 0x66
+		/* Group 4 */
+		|| b == 0x67;
+}
+
+/* This is a VERY crude opcode decoder. We only need to find the size of the
+ * load/store that caused our #PF and this should work for all the opcodes
+ * that we care about. Moreover, the ones who invented this instruction set
+ * should be shot. */
+static unsigned int
+opcode_get_size(const uint8_t *op)
+{
+	/* Default operand size */
+	int operand_size_override = 32;
+
+	/* prefixes */
+	for (; opcode_is_prefix(*op); ++op) {
+		if (*op == 0x66)
+			operand_size_override = 16;
+	}
+
+	/* escape opcode */
+	if (*op == 0x0f) {
+		++op;
+
+		if (*op == 0xb6)
+			return operand_size_override >> 1;
+		if (*op == 0xb7)
+			return 16;
+	}
+
+	return (*op & 1) ? operand_size_override : 8;
+}
+
+static const uint8_t *
+opcode_get_primary(const uint8_t *op)
+{
+	/* skip prefixes */
+	for (; opcode_is_prefix(*op); ++op);
+	return op;
+}
+
+static inline enum shadow
+test(void *shadow, unsigned int size)
+{
+	uint8_t *x;
+
+	x = shadow;
+
+#ifdef CONFIG_KMEMCHECK_PARTIAL_OK
+	/*
+	 * Make sure _some_ bytes are initialized. Gcc frequently generates
+	 * code to access neighboring bytes.
+	 */
+	switch (size) {
+	case 32:
+		if (x[3] == SHADOW_INITIALIZED)
+			return x[3];
+		if (x[2] == SHADOW_INITIALIZED)
+			return x[2];
+	case 16:
+		if (x[1] == SHADOW_INITIALIZED)
+			return x[1];
+	case 8:
+		if (x[0] == SHADOW_INITIALIZED)
+			return x[0];
+	}
+#else
+	switch (size) {
+	case 32:
+		if (x[3] != SHADOW_INITIALIZED)
+			return x[3];
+		if (x[2] != SHADOW_INITIALIZED)
+			return x[2];
+	case 16:
+		if (x[1] != SHADOW_INITIALIZED)
+			return x[1];
+	case 8:
+		if (x[0] != SHADOW_INITIALIZED)
+			return x[0];
+	}
+#endif
+
+	return x[0];
+}
+
+static inline void
+set(void *shadow, unsigned int size)
+{
+	uint8_t *x;
+
+	x = shadow;
+
+	switch (size) {
+	case 32:
+		x[3] = SHADOW_INITIALIZED;
+		x[2] = SHADOW_INITIALIZED;
+	case 16:
+		x[1] = SHADOW_INITIALIZED;
+	case 8:
+		x[0] = SHADOW_INITIALIZED;
+	}
+
+	return;
+}
+
+static void
+kmemcheck_read(struct pt_regs *regs, uint32_t address, unsigned int size)
+{
+	void *shadow;
+	enum shadow status;
+
+	shadow = address_get_shadow(address);
+	if (!shadow)
+		return;
+
+	status = test(shadow, size);
+	if (status == SHADOW_INITIALIZED)
+		return;
+
+	/* Don't warn about it again. */
+	set(shadow, size);
+
+	error_save(status, address, size, regs);
+}
+
+static void
+kmemcheck_write(struct pt_regs *regs, uint32_t address, unsigned int size)
+{
+	void *shadow;
+
+	shadow = address_get_shadow(address);
+	if (!shadow)
+		return;
+	set(shadow, size);
+}
+
+void
+kmemcheck_access(struct pt_regs *regs,
+	unsigned long fallback_address, enum kmemcheck_method fallback_method)
+{
+	const uint8_t *insn;
+	const uint8_t *insn_primary;
+	unsigned int size;
+
+	struct kmemcheck_context *data = &__get_cpu_var(kmemcheck_context);
+
+	/* Recursive fault -- ouch. */
+	if (data->busy) {
+		emergency_show_addr(fallback_address);
+		error_save_bug(regs);
+		return;
+	}
+
+	data->busy = true;
+
+	insn = (const uint8_t *) regs->ip;
+	insn_primary = opcode_get_primary(insn);
+
+	size = opcode_get_size(insn);
+
+	switch (insn_primary[0]) {
+#ifdef CONFIG_KMEMCHECK_BITOPS_OK
+		/* AND, OR, XOR */
+		/*
+		 * Unfortunately, these instructions have to be excluded from
+		 * our regular checking since they access only some (and not
+		 * all) bits. This clears out "bogus" bitfield-access warnings.
+		 */
+	case 0x80:
+	case 0x81:
+	case 0x82:
+	case 0x83:
+		switch ((insn_primary[1] >> 3) & 7) {
+			/* OR */
+		case 1:
+			/* AND */
+		case 4:
+			/* XOR */
+		case 6:
+			kmemcheck_write(regs, fallback_address, size);
+			data->addr1 = fallback_address;
+			data->addr2 = 0;
+			data->busy = false;
+			return;
+
+			/* ADD */
+		case 0:
+			/* ADC */
+		case 2:
+			/* SBB */
+		case 3:
+			/* SUB */
+		case 5:
+			/* CMP */
+		case 7:
+			break;
+		}
+		break;
+#endif
+
+		/* MOVS, MOVSB, MOVSW, MOVSD */
+	case 0xa4:
+	case 0xa5:
+		/* These instructions are special because they take two
+		 * addresses, but we only get one page fault. */
+		kmemcheck_read(regs, regs->si, size);
+		kmemcheck_write(regs, regs->di, size);
+		data->addr1 = regs->si;
+		data->addr2 = regs->di;
+		data->busy = false;
+		return;
+
+		/* CMPS, CMPSB, CMPSW, CMPSD */
+	case 0xa6:
+	case 0xa7:
+		kmemcheck_read(regs, regs->si, size);
+		kmemcheck_read(regs, regs->di, size);
+		data->addr1 = regs->si;
+		data->addr2 = regs->di;
+		data->busy = false;
+		return;
+	}
+
+	/* If the opcode isn't special in any way, we use the data from the
+	 * page fault handler to determine the address and type of memory
+	 * access. */
+	switch (fallback_method) {
+	case KMEMCHECK_READ:
+		kmemcheck_read(regs, fallback_address, size);
+		data->addr1 = fallback_address;
+		data->addr2 = 0;
+		data->busy = false;
+		return;
+	case KMEMCHECK_WRITE:
+		kmemcheck_write(regs, fallback_address, size);
+		data->addr1 = fallback_address;
+		data->addr2 = 0;
+		data->busy = false;
+		return;
+	}
+}
+
+/*
+ * A faster implementation of memset() when tracking is enabled where the
+ * whole memory area is within a single page.
+ */
+static void
+memset_one_page(void *s, int c, size_t n)
+{
+	unsigned long addr;
+	void *x;
+	unsigned long flags;
+
+	addr = (unsigned long) s;
+
+	x = address_get_shadow(addr);
+	if (!x) {
+		/* The page isn't being tracked. */
+		__memset(s, c, n);
+		return;
+	}
+
+	/* While we are not guarding the page in question, nobody else
+	 * should be able to change it. */
+	local_irq_save(flags);
+
+	show_addr(addr);
+	__memset(s, c, n);
+	__memset(x, SHADOW_INITIALIZED, n);
+	if (kmemcheck_enabled)
+		hide_addr(addr);
+
+	local_irq_restore(flags);
+}
+
+/*
+ * A faster implementation of memset() when tracking is enabled. We cannot
+ * assume that all pages within the range are tracked, so copying has to be
+ * split into page-sized (or smaller, for the ends) chunks.
+ */
+void *
+kmemcheck_memset(void *s, int c, size_t n)
+{
+	unsigned long addr;
+	unsigned long start_page, start_offset;
+	unsigned long end_page, end_offset;
+	unsigned long i;
+
+	if (!n)
+		return s;
+
+	if (!slab_is_available()) {
+		__memset(s, c, n);
+		return s;
+	}
+
+	addr = (unsigned long) s;
+
+	start_page = addr & PAGE_MASK;
+	end_page = (addr + n) & PAGE_MASK;
+
+	if (start_page == end_page) {
+		/* The entire area is within the same page. Good, we only
+		 * need one memset(). */
+		memset_one_page(s, c, n);
+		return s;
+	}
+
+	start_offset = addr & ~PAGE_MASK;
+	end_offset = (addr + n) & ~PAGE_MASK;
+
+	/* Clear the head, body, and tail of the memory area. */
+	if (start_offset < PAGE_SIZE)
+		memset_one_page(s, c, PAGE_SIZE - start_offset);
+	for (i = start_page + PAGE_SIZE; i < end_page; i += PAGE_SIZE)
+		memset_one_page((void *) i, c, PAGE_SIZE);
+	if (end_offset > 0)
+		memset_one_page((void *) end_page, c, end_offset);
+
+	return s;
+}
+
+EXPORT_SYMBOL(kmemcheck_memset);
diff --git a/include/asm-x86/kmemcheck.h b/include/asm-x86/kmemcheck.h
new file mode 100644
index 0000000..885b107
--- /dev/null
+++ b/include/asm-x86/kmemcheck.h
@@ -0,0 +1,30 @@
+#ifndef ASM_X86_KMEMCHECK_32_H
+#define ASM_X86_KMEMCHECK_32_H
+
+#include <linux/percpu.h>
+#include <asm/pgtable.h>
+
+enum kmemcheck_method {
+	KMEMCHECK_READ,
+	KMEMCHECK_WRITE,
+};
+
+#ifdef CONFIG_KMEMCHECK
+bool kmemcheck_active(struct pt_regs *regs);
+
+void kmemcheck_show(struct pt_regs *regs);
+void kmemcheck_hide(struct pt_regs *regs);
+
+void kmemcheck_access(struct pt_regs *regs,
+	unsigned long address, enum kmemcheck_method method);
+#else
+static inline bool kmemcheck_active(struct pt_regs *regs) { return false; }
+
+static inline void kmemcheck_show(struct pt_regs *regs) { }
+static inline void kmemcheck_hide(struct pt_regs *regs) { }
+
+static inline void kmemcheck_access(struct pt_regs *regs,
+	unsigned long address, enum kmemcheck_method method) { }
+#endif /* CONFIG_KMEMCHECK */
+
+#endif
diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
index 9cf472a..ebf285d 100644
--- a/include/asm-x86/pgtable.h
+++ b/include/asm-x86/pgtable.h
@@ -17,8 +17,8 @@
  #define _PAGE_BIT_GLOBAL	8	/* Global TLB entry PPro+ */
  #define _PAGE_BIT_UNUSED1	9	/* available for programmer */
  #define _PAGE_BIT_UNUSED2	10
-#define _PAGE_BIT_UNUSED3	11
  #define _PAGE_BIT_PAT_LARGE	12	/* On 2MB or 1GB pages */
+#define _PAGE_BIT_HIDDEN	11
  #define _PAGE_BIT_NX           63       /* No execute: only valid after cpuid check */

  /*
@@ -37,9 +37,9 @@
  #define _PAGE_GLOBAL	(_AC(1, L)<<_PAGE_BIT_GLOBAL)	/* Global TLB entry */
  #define _PAGE_UNUSED1	(_AC(1, L)<<_PAGE_BIT_UNUSED1)
  #define _PAGE_UNUSED2	(_AC(1, L)<<_PAGE_BIT_UNUSED2)
-#define _PAGE_UNUSED3	(_AC(1, L)<<_PAGE_BIT_UNUSED3)
  #define _PAGE_PAT	(_AC(1, L)<<_PAGE_BIT_PAT)
  #define _PAGE_PAT_LARGE (_AC(1, L)<<_PAGE_BIT_PAT_LARGE)
+#define _PAGE_HIDDEN	(_AC(1, L)<<_PAGE_BIT_HIDDEN)

  #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
  #define _PAGE_NX	(_AC(1, ULL) << _PAGE_BIT_NX)
diff --git a/include/asm-x86/pgtable_32.h b/include/asm-x86/pgtable_32.h
index 4e6a0fc..266a0f5 100644
--- a/include/asm-x86/pgtable_32.h
+++ b/include/asm-x86/pgtable_32.h
@@ -87,6 +87,12 @@ extern unsigned long pg0[];

  #define pte_present(x)	((x).pte_low & (_PAGE_PRESENT | _PAGE_PROTNONE))

+#ifdef CONFIG_KMEMCHECK
+#define pte_hidden(x)	((x).pte_low & (_PAGE_HIDDEN))
+#else
+#define pte_hidden(x)	0
+#endif
+
  /* To avoid harmful races, pmd_none(x) should check only the lower when PAE */
  #define pmd_none(x)	(!(unsigned long)pmd_val(x))
  #define pmd_present(x)	(pmd_val(x) & _PAGE_PRESENT)
diff --git a/include/linux/kmemcheck.h b/include/linux/kmemcheck.h
new file mode 100644
index 0000000..801da50
--- /dev/null
+++ b/include/linux/kmemcheck.h
@@ -0,0 +1,27 @@
+#ifndef LINUX_KMEMCHECK_H
+#define LINUX_KMEMCHECK_H
+
+#ifdef CONFIG_KMEMCHECK
+extern int kmemcheck_enabled;
+
+void kmemcheck_init(void);
+
+void kmemcheck_show_pages(struct page *p, unsigned int n);
+void kmemcheck_hide_pages(struct page *p, unsigned int n);
+
+void kmemcheck_mark_unallocated(void *address, unsigned int n);
+void kmemcheck_mark_uninitialized(void *address, unsigned int n);
+void kmemcheck_mark_initialized(void *address, unsigned int n);
+void kmemcheck_mark_freed(void *address, unsigned int n);
+
+void kmemcheck_mark_unallocated_pages(struct page *p, unsigned int n);
+void kmemcheck_mark_uninitialized_pages(struct page *p, unsigned int n);
+#endif /* CONFIG_KMEMCHECK */
+
+
+#ifndef CONFIG_KMEMCHECK
+#define kmemcheck_enabled 0
+static inline void kmemcheck_init(void) { }
+#endif /* CONFIG_KMEMCHECK */
+
+#endif /* LINUX_KMEMCHECK_H */
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index b5b30f1..63f5fd8 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -89,6 +89,7 @@
  #define PG_mappedtodisk		16	/* Has blocks allocated on-disk */
  #define PG_reclaim		17	/* To be reclaimed asap */
  #define PG_buddy		19	/* Page is free, on buddy lists */
+#define PG_tracked		20	/* Tracked by kmemcheck */

  /* PG_readahead is only used for file reads; PG_reclaim is only for writes */
  #define PG_readahead		PG_reclaim /* Reminder to do async read-ahead */
@@ -296,6 +297,11 @@ static inline void __ClearPageTail(struct page *page)
  #define SetPageUncached(page)	set_bit(PG_uncached, &(page)->flags)
  #define ClearPageUncached(page)	clear_bit(PG_uncached, &(page)->flags)

+#define PageTracked(page)	test_bit(PG_tracked, &(page)->flags)
+#define SetPageTracked(page)	set_bit(PG_tracked, &(page)->flags)
+#define ClearPageTracked(page)	clear_bit(PG_tracked, &(page)->flags)
+
+
  struct page;	/* forward declaration */

  extern void cancel_dirty_page(struct page *page, unsigned int account_size);
diff --git a/init/main.c b/init/main.c
index 99ce949..7f85ea2 100644
--- a/init/main.c
+++ b/init/main.c
@@ -58,6 +58,7 @@
  #include <linux/kthread.h>
  #include <linux/sched.h>
  #include <linux/signal.h>
+#include <linux/kmemcheck.h>

  #include <asm/io.h>
  #include <asm/bugs.h>
@@ -751,6 +752,7 @@ static void __init do_pre_smp_initcalls(void)
  {
  	extern int spawn_ksoftirqd(void);

+	kmemcheck_init();
  	migration_init();
  	spawn_ksoftirqd();
  	if (!nosoftlockup)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b2a2d68..5381eb7 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -45,6 +45,7 @@
  #include <linux/nfs_fs.h>
  #include <linux/acpi.h>
  #include <linux/reboot.h>
+#include <linux/kmemcheck.h>

  #include <asm/uaccess.h>
  #include <asm/processor.h>
@@ -820,6 +821,17 @@ static struct ctl_table kern_table[] = {
  		.proc_handler	= &proc_dostring,
  		.strategy	= &sysctl_string,
  	},
+#ifdef CONFIG_KMEMCHECK
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "kmemcheck",
+		.data		= &kmemcheck_enabled,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
+#endif
+
  /*
   * NOTE: do not add new entries to this table unless you have read
   * Documentation/sysctl/ctl_unnumbered.txt
-- 
1.5.4.1




* [PATCH 2/3] x86: add hooks for kmemcheck
  2008-04-04 13:44 [ANNOUNCE] kmemcheck v7 Vegard Nossum
  2008-04-04 13:45 ` [PATCH 1/3] kmemcheck: add the kmemcheck core Vegard Nossum
@ 2008-04-04 13:46 ` Vegard Nossum
  2008-04-04 13:47 ` [PATCH 3/3] slub: " Vegard Nossum
  2008-05-10  9:07 ` [ANNOUNCE] kmemcheck v7 Bart Van Assche
  3 siblings, 0 replies; 22+ messages in thread
From: Vegard Nossum @ 2008-04-04 13:46 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Pekka Enberg, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Christoph Lameter, Daniel Walker, Andi Kleen, Randy Dunlap,
	Josh Aune, Pekka Paalanen

 From caf2b1b7198bd4e0d690555be389a3444523162d Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@gmail.com>
Date: Fri, 4 Apr 2008 00:53:23 +0200
Subject: [PATCH] x86: add hooks for kmemcheck

The hooks that we modify are:
- Page fault handler (to handle kmemcheck faults)
- Debug exception handler (to hide pages after single-stepping
   the instruction that caused the page fault)

Also redefine memset() to use the optimized version if kmemcheck
is enabled.
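
To make the control flow easier to follow, here is a condensed restatement
of what the fault-path hunk below does (simplified; the function name is
made up for illustration, the rest mirrors the arch/x86/mm/fault.c hunk in
this patch):

/* Called when a kernel page fault hits a non-present PTE. */
static int kmemcheck_fault(struct pt_regs *regs, pte_t *pte,
	unsigned long address, unsigned long error_code)
{
	/* Only pages that kmemcheck itself hid are interesting. */
	if (!pte_hidden(*pte))
		return 0;

	/* Decode the faulting instruction and check/update the shadow. */
	if (error_code & 2)		/* bit 1 set: the access was a write */
		kmemcheck_access(regs, address, KMEMCHECK_WRITE);
	else
		kmemcheck_access(regs, address, KMEMCHECK_READ);

	/* Map the page back in and set TF; the debug exception handler
	 * calls kmemcheck_hide() once the instruction has executed. */
	kmemcheck_show(regs);
	return 1;
}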

Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
  arch/x86/kernel/cpu/common.c |    7 +++++++
  arch/x86/kernel/entry_32.S   |    8 ++++----
  arch/x86/kernel/traps_32.c   |   16 +++++++++++++++-
  arch/x86/mm/fault.c          |   25 +++++++++++++++++++++----
  include/asm-x86/string_32.h  |    8 ++++++++
  5 files changed, 55 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a38aafa..040c650 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -634,6 +634,13 @@ void __init early_cpu_init(void)
  	nexgen_init_cpu();
  	umc_init_cpu();
  	early_cpu_detect();
+
+#ifdef CONFIG_KMEMCHECK
+	/*
+	 * We need 4K granular PTEs for kmemcheck:
+	 */
+	setup_clear_cpu_cap(X86_FEATURE_PSE);
+#endif
  }

  /* Make sure %fs is initialized properly in idle threads */
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 4b87c32..54f477c 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -289,7 +289,7 @@ ENTRY(ia32_sysenter_target)
  	CFI_DEF_CFA esp, 0
  	CFI_REGISTER esp, ebp
  	movl TSS_sysenter_sp0(%esp),%esp
-sysenter_past_esp:
+ENTRY(sysenter_past_esp)
  	/*
  	 * No need to follow this irqs on/off section: the syscall
  	 * disabled irqs and here we enable it straight after entry:
@@ -767,7 +767,7 @@ label:						\
  	CFI_ADJUST_CFA_OFFSET 4;		\
  	CFI_REL_OFFSET eip, 0

-KPROBE_ENTRY(debug)
+KPROBE_ENTRY(x86_debug)
  	RING0_INT_FRAME
  	cmpl $ia32_sysenter_target,(%esp)
  	jne debug_stack_correct
@@ -781,7 +781,7 @@ debug_stack_correct:
  	call do_debug
  	jmp ret_from_exception
  	CFI_ENDPROC
-KPROBE_END(debug)
+KPROBE_END(x86_debug)

  /*
   * NMI is doubly nasty. It can happen _while_ we're handling
@@ -835,7 +835,7 @@ nmi_debug_stack_check:
  	/* We have a RING0_INT_FRAME here */
  	cmpw $__KERNEL_CS,16(%esp)
  	jne nmi_stack_correct
-	cmpl $debug,(%esp)
+	cmpl $x86_debug,(%esp)
  	jb nmi_stack_correct
  	cmpl $debug_esp_fix_insn,(%esp)
  	ja nmi_stack_correct
diff --git a/arch/x86/kernel/traps_32.c b/arch/x86/kernel/traps_32.c
index b22c01e..898796e 100644
--- a/arch/x86/kernel/traps_32.c
+++ b/arch/x86/kernel/traps_32.c
@@ -56,6 +56,7 @@
  #include <asm/arch_hooks.h>
  #include <linux/kdebug.h>
  #include <asm/stacktrace.h>
+#include <asm/kmemcheck.h>

  #include <linux/module.h>

@@ -841,6 +842,10 @@ void __kprobes do_int3(struct pt_regs *regs, long error_code)
  }
  #endif

+extern void ia32_sysenter_target(void);
+extern void sysenter_past_esp(void);
+extern void x86_debug(void);
+
  /*
   * Our handling of the processor debug registers is non-trivial.
   * We do not clear them on entry and exit from the kernel. Therefore
@@ -872,6 +877,14 @@ void __kprobes do_debug(struct pt_regs * regs, long error_code)

  	get_debugreg(condition, 6);

+	/* Catch kmemcheck conditions first of all! */
+	if (condition & DR_STEP) {
+		if (kmemcheck_active(regs)) {
+			kmemcheck_hide(regs);
+			return;
+		}
+	}
+
  	/*
  	 * The processor cleared BTF, so don't mark that we need it set.
  	 */
@@ -881,6 +894,7 @@ void __kprobes do_debug(struct pt_regs * regs, long error_code)
  	if (notify_die(DIE_DEBUG, "debug", regs, condition, error_code,
  					SIGTRAP) == NOTIFY_STOP)
  		return;
+
  	/* It's safe to allow irq's after DR6 has been saved */
  	if (regs->flags & X86_EFLAGS_IF)
  		local_irq_enable();
@@ -1154,7 +1168,7 @@ void __init trap_init(void)
  #endif

  	set_trap_gate(0,&divide_error);
-	set_intr_gate(1,&debug);
+	set_intr_gate(1,&x86_debug);
  	set_intr_gate(2,&nmi);
  	set_system_intr_gate(3, &int3); /* int3/4 can be called from all */
  	set_system_gate(4,&overflow);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index ec08d83..fe493b4 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -33,6 +33,7 @@
  #include <asm/smp.h>
  #include <asm/tlbflush.h>
  #include <asm/proto.h>
+#include <asm/kmemcheck.h>
  #include <asm-generic/sections.h>

  /*
@@ -491,7 +492,8 @@ static int spurious_fault(unsigned long address,
   *
   * This assumes no large pages in there.
   */
-static int vmalloc_fault(unsigned long address)
+static int vmalloc_fault(struct pt_regs *regs, unsigned long address,
+	unsigned long error_code)
  {
  #ifdef CONFIG_X86_32
  	unsigned long pgd_paddr;
@@ -509,8 +511,16 @@ static int vmalloc_fault(unsigned long address)
  	if (!pmd_k)
  		return -1;
  	pte_k = pte_offset_kernel(pmd_k, address);
-	if (!pte_present(*pte_k))
-		return -1;
+	if (!pte_present(*pte_k)) {
+		if (!pte_hidden(*pte_k))
+			return -1;
+
+		if (error_code & 2)
+			kmemcheck_access(regs, address, KMEMCHECK_WRITE);
+		else
+			kmemcheck_access(regs, address, KMEMCHECK_READ);
+		kmemcheck_show(regs);
+	}
  	return 0;
  #else
  	pgd_t *pgd, *pgd_ref;
@@ -599,6 +609,13 @@ void __kprobes do_page_fault(struct pt_regs *regs, unsigned long error_code)

  	si_code = SEGV_MAPERR;

+	/*
+	 * Detect and handle instructions that would cause a page fault for
+	 * both a tracked kernel page and a userspace page.
+	 */
+	if (kmemcheck_active(regs))
+		kmemcheck_hide(regs);
+
  	if (notify_page_fault(regs))
  		return;

@@ -621,7 +638,7 @@ void __kprobes do_page_fault(struct pt_regs *regs, unsigned long error_code)
  	if (unlikely(address >= TASK_SIZE64)) {
  #endif
  		if (!(error_code & (PF_RSVD|PF_USER|PF_PROT)) &&
-		    vmalloc_fault(address) >= 0)
+		    vmalloc_fault(regs, address, error_code) >= 0)
  			return;

  		/* Can handle a stale RO->RW TLB */
diff --git a/include/asm-x86/string_32.h b/include/asm-x86/string_32.h
index c5d13a8..6138aa4 100644
--- a/include/asm-x86/string_32.h
+++ b/include/asm-x86/string_32.h
@@ -262,6 +262,14 @@ __asm__  __volatile__( \
   __constant_c_x_memset((s),(0x01010101UL*(unsigned char)(c)),(count)) : \
   __memset((s),(c),(count)))

+/* If kmemcheck is enabled, our best bet is a custom memset() that disables
+ * checking in order to save a whole lot of (unnecessary) page faults. */
+#ifdef CONFIG_KMEMCHECK
+void *kmemcheck_memset(void *s, int c, size_t n);
+#undef memset
+#define memset(s, c, n) kmemcheck_memset((s), (c), (n))
+#endif
+
  /*
   * find the first occurrence of byte 'c', or 1 past the area if none
   */
-- 
1.5.4.1




* [PATCH 3/3] slub: add hooks for kmemcheck
  2008-04-04 13:44 [ANNOUNCE] kmemcheck v7 Vegard Nossum
  2008-04-04 13:45 ` [PATCH 1/3] kmemcheck: add the kmemcheck core Vegard Nossum
  2008-04-04 13:46 ` [PATCH 2/3] x86: add hooks for kmemcheck Vegard Nossum
@ 2008-04-04 13:47 ` Vegard Nossum
  2008-05-10  9:07 ` [ANNOUNCE] kmemcheck v7 Bart Van Assche
  3 siblings, 0 replies; 22+ messages in thread
From: Vegard Nossum @ 2008-04-04 13:47 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Pekka Enberg, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Christoph Lameter, Daniel Walker, Andi Kleen, Randy Dunlap,
	Josh Aune, Pekka Paalanen

 From d3844118edba5548cce8d27a78bb15b8d6aded66 Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@gmail.com>
Date: Fri, 4 Apr 2008 00:54:48 +0200
Subject: [PATCH] slub: add hooks for kmemcheck

With kmemcheck enabled, SLUB needs to do this:

1. Request twice as much memory as would normally be needed. The bottom half
    of the memory is what the user actually sees and uses; the upper half
    contains the so-called shadow memory, which stores the status of each byte
    in the bottom half, e.g. initialized or uninitialized.
2. Tell kmemcheck which parts of memory should be marked uninitialized.
    There are actually a few more states, such as "not yet allocated" and
    "recently freed".

If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
memory that can take page faults because of kmemcheck.

If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
request memory with the __GFP_NOTRACK flag. This does not prevent the page
faults from occurring, but it marks the object in question as being
initialized so that no warnings will ever be produced for this object.
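
For illustration, a caller of the two opt-out mechanisms added by this
series might look like the sketch below (the cache name and init function
are hypothetical; only SLAB_NOTRACK and __GFP_NOTRACK come from these
patches):

#include <linux/slab.h>
#include <linux/gfp.h>

/* A cache whose objects kmemcheck never hides behind page faults: */
static struct kmem_cache *my_cachep;	/* hypothetical */

static int __init my_init(void)
{
	my_cachep = kmem_cache_create("my_cache", 256, 0,
				      SLAB_NOTRACK, NULL);
	if (!my_cachep)
		return -ENOMEM;

	/* A single allocation from an ordinary (tracked) cache that must
	 * never warn; it still takes the page faults, though: */
	kfree(kmalloc(128, GFP_KERNEL | __GFP_NOTRACK));

	return 0;
}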

Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
  include/linux/gfp.h      |    3 +-
  include/linux/slab.h     |    7 +++
  include/linux/slub_def.h |   17 ++++++++
  kernel/fork.c            |   15 ++++---
  mm/Makefile              |    3 +
  mm/slub.c                |   36 ++++++++++++-----
  mm/slub_kmemcheck.c      |   99 ++++++++++++++++++++++++++++++++++++++++++++++
  7 files changed, 161 insertions(+), 19 deletions(-)
  create mode 100644 mm/slub_kmemcheck.c

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 164be9d..0faeedc 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -50,8 +50,9 @@ struct vm_area_struct;
  #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
  #define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
  #define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
+#define __GFP_NOTRACK	((__force gfp_t)0x200000u)  /* Don't track with kmemcheck */

-#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
+#define __GFP_BITS_SHIFT 22	/* Room for 22 __GFP_FOO bits */
  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))

  /* This equals 0, but use constants in case they ever change */
diff --git a/include/linux/slab.h b/include/linux/slab.h
index f62caaa..d5505b1 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -29,6 +29,13 @@
  #define SLAB_MEM_SPREAD		0x00100000UL	/* Spread some memory over cpuset */
  #define SLAB_TRACE		0x00200000UL	/* Trace allocations and frees */

+#ifdef CONFIG_KMEMCHECK
+/* Don't track use of uninitialized memory */
+# define SLAB_NOTRACK		0x00400000UL
+#else
+# define SLAB_NOTRACK		0
+#endif
+
  /* The following flags affect the page allocator grouping pages by mobility */
  #define SLAB_RECLAIM_ACCOUNT	0x00020000UL		/* Objects are reclaimable */
  #define SLAB_TEMPORARY		SLAB_RECLAIM_ACCOUNT	/* Objects are short-lived */
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index b00c1c7..e0b9a39 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -231,4 +231,21 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
  }
  #endif

+#ifdef CONFIG_KMEMCHECK
+struct page *kmemcheck_allocate_slab(struct kmem_cache *s,
+	gfp_t flags, int node, int pages);
+void kmemcheck_free_slab(struct kmem_cache *s, struct page *page, int pages);
+
+void kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags, void *object);
+void kmemcheck_slab_free(struct kmem_cache *s, void *object);
+#else
+static inline struct page *kmemcheck_allocate_slab(struct kmem_cache *s,
+	gfp_t flags, int node, int pages) { return NULL; }
+static inline void kmemcheck_free_slab(struct kmem_cache *s,
+	struct page *page, int pages) { }
+static inline void kmemcheck_slab_alloc(struct kmem_cache *s,
+	gfp_t gfpflags, void *object) { }
+static inline void kmemcheck_slab_free(struct kmem_cache *s, void *object) { }
+#endif /* CONFIG_KMEMCHECK */
+
  #endif /* _LINUX_SLUB_DEF_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 9c042f9..1318da2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -141,7 +141,7 @@ void __init fork_init(unsigned long mempages)
  	/* create a slab on which task_structs can be allocated */
  	task_struct_cachep =
  		kmem_cache_create("task_struct", sizeof(struct task_struct),
-			ARCH_MIN_TASKALIGN, SLAB_PANIC, NULL);
+			ARCH_MIN_TASKALIGN, SLAB_PANIC | SLAB_NOTRACK, NULL);
  #endif

  	/*
@@ -1547,23 +1547,24 @@ void __init proc_caches_init(void)
  {
  	sighand_cachep = kmem_cache_create("sighand_cache",
  			sizeof(struct sighand_struct), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_DESTROY_BY_RCU,
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_DESTROY_BY_RCU
+			|SLAB_NOTRACK,
  			sighand_ctor);
  	signal_cachep = kmem_cache_create("signal_cache",
  			sizeof(struct signal_struct), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
  	files_cachep = kmem_cache_create("files_cache",
  			sizeof(struct files_struct), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
  	fs_cachep = kmem_cache_create("fs_cache",
  			sizeof(struct fs_struct), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
  	vm_area_cachep = kmem_cache_create("vm_area_struct",
  			sizeof(struct vm_area_struct), 0,
-			SLAB_PANIC, NULL);
+			SLAB_PANIC|SLAB_NOTRACK, NULL);
  	mm_cachep = kmem_cache_create("mm_struct",
  			sizeof(struct mm_struct), ARCH_MIN_MMSTRUCT_ALIGN,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
  }

  /*
diff --git a/mm/Makefile b/mm/Makefile
index a5b0dd9..ae65439 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -34,3 +34,6 @@ obj-$(CONFIG_SMP) += allocpercpu.o
  obj-$(CONFIG_QUICKLIST) += quicklist.o
  obj-$(CONFIG_CGROUP_MEM_RES_CTLR) += memcontrol.o

+ifeq ($(CONFIG_KMEMCHECK),y)
+obj-$(CONFIG_SLUB) += slub_kmemcheck.o
+endif
diff --git a/mm/slub.c b/mm/slub.c
index acc975f..325f2e4 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -21,6 +21,7 @@
  #include <linux/ctype.h>
  #include <linux/kallsyms.h>
  #include <linux/memory.h>
+#include <linux/kmemcheck.h>

  /*
   * Lock order:
@@ -191,7 +192,7 @@ static inline void ClearSlabDebug(struct page *page)
  		SLAB_TRACE | SLAB_DESTROY_BY_RCU)

  #define SLUB_MERGE_SAME (SLAB_DEBUG_FREE | SLAB_RECLAIM_ACCOUNT | \
-		SLAB_CACHE_DMA)
+		SLAB_CACHE_DMA | SLAB_NOTRACK)

  #ifndef ARCH_KMALLOC_MINALIGN
  #define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long long)
@@ -1039,6 +1040,9 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)

  	flags |= s->allocflags;

+	if (kmemcheck_enabled && !(s->flags & SLAB_NOTRACK))
+		return kmemcheck_allocate_slab(s, flags, node, pages);
+
  	if (node == -1)
  		page = alloc_pages(flags, s->order);
  	else
@@ -1120,6 +1124,13 @@ static void __free_slab(struct kmem_cache *s, struct page *page)
  		ClearSlabDebug(page);
  	}

+	if (PageTracked(page) && !(s->flags & SLAB_NOTRACK)) {
+		kmemcheck_free_slab(s, page, pages);
+		return;
+	}
+
+	__ClearPageSlab(page);
+
  	mod_zone_page_state(page_zone(page),
  		(s->flags & SLAB_RECLAIM_ACCOUNT) ?
  		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
@@ -1155,7 +1166,6 @@ static void discard_slab(struct kmem_cache *s, struct page *page)

  	atomic_long_dec(&n->nr_slabs);
  	reset_page_mapcount(page);
-	__ClearPageSlab(page);
  	free_slab(s, page);
  }

@@ -1592,6 +1602,7 @@ static __always_inline void *slab_alloc(struct kmem_cache *s,
  	if (unlikely((gfpflags & __GFP_ZERO) && object))
  		memset(object, 0, c->objsize);

+	kmemcheck_slab_alloc(s, gfpflags, object);
  	return object;
  }

@@ -1694,6 +1705,8 @@ static __always_inline void slab_free(struct kmem_cache *s,
  	struct kmem_cache_cpu *c;
  	unsigned long flags;

+	kmemcheck_slab_free(s, object);
+
  	local_irq_save(flags);
  	c = get_cpu_slab(s, smp_processor_id());
  	debug_check_no_locks_freed(object, c->objsize);
@@ -2449,12 +2462,10 @@ static int __init setup_slub_nomerge(char *str)
  __setup("slub_nomerge", setup_slub_nomerge);

  static struct kmem_cache *create_kmalloc_cache(struct kmem_cache *s,
-		const char *name, int size, gfp_t gfp_flags)
+	const char *name, int size, gfp_t gfp_flags, unsigned int flags)
  {
-	unsigned int flags = 0;
-
  	if (gfp_flags & SLUB_DMA)
-		flags = SLAB_CACHE_DMA;
+		flags |= SLAB_CACHE_DMA;

  	down_write(&slub_lock);
  	if (!kmem_cache_open(s, gfp_flags, name, size, ARCH_KMALLOC_MINALIGN,
@@ -2517,7 +2528,8 @@ static noinline struct kmem_cache *dma_kmalloc_cache(int index, gfp_t flags)

  	if (!s || !text || !kmem_cache_open(s, flags, text,
  			realsize, ARCH_KMALLOC_MINALIGN,
-			SLAB_CACHE_DMA|__SYSFS_ADD_DEFERRED, NULL)) {
+			SLAB_CACHE_DMA|SLAB_NOTRACK|__SYSFS_ADD_DEFERRED,
+			NULL)) {
  		kfree(s);
  		kfree(text);
  		goto unlock_out;
@@ -2910,7 +2922,7 @@ void __init kmem_cache_init(void)
  	 * kmem_cache_open for slab_state == DOWN.
  	 */
  	create_kmalloc_cache(&kmalloc_caches[0], "kmem_cache_node",
-		sizeof(struct kmem_cache_node), GFP_KERNEL);
+		sizeof(struct kmem_cache_node), GFP_KERNEL, 0);
  	kmalloc_caches[0].refcount = -1;
  	caches++;

@@ -2923,18 +2935,18 @@ void __init kmem_cache_init(void)
  	/* Caches that are not of the two-to-the-power-of size */
  	if (KMALLOC_MIN_SIZE <= 64) {
  		create_kmalloc_cache(&kmalloc_caches[1],
-				"kmalloc-96", 96, GFP_KERNEL);
+				"kmalloc-96", 96, GFP_KERNEL, 0);
  		caches++;
  	}
  	if (KMALLOC_MIN_SIZE <= 128) {
  		create_kmalloc_cache(&kmalloc_caches[2],
-				"kmalloc-192", 192, GFP_KERNEL);
+				"kmalloc-192", 192, GFP_KERNEL, 0);
  		caches++;
  	}

  	for (i = KMALLOC_SHIFT_LOW; i <= PAGE_SHIFT; i++) {
  		create_kmalloc_cache(&kmalloc_caches[i],
-			"kmalloc", 1 << i, GFP_KERNEL);
+			"kmalloc", 1 << i, GFP_KERNEL, 0);
  		caches++;
  	}

@@ -4167,6 +4179,8 @@ static char *create_unique_id(struct kmem_cache *s)
  		*p++ = 'a';
  	if (s->flags & SLAB_DEBUG_FREE)
  		*p++ = 'F';
+	if (!(s->flags & SLAB_NOTRACK))
+		*p++ = 't';
  	if (p != name + 1)
  		*p++ = '-';
  	p += sprintf(p, "%07d", s->size);
diff --git a/mm/slub_kmemcheck.c b/mm/slub_kmemcheck.c
new file mode 100644
index 0000000..ca5f1a9
--- /dev/null
+++ b/mm/slub_kmemcheck.c
@@ -0,0 +1,99 @@
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/kmemcheck.h>
+
+struct page *
+kmemcheck_allocate_slab(struct kmem_cache *s, gfp_t flags, int node, int pages)
+{
+	struct page *page;
+
+	/*
+	 * With kmemcheck enabled, we actually allocate twice as much. The
+	 * upper half of the allocation is used as our shadow memory where
+	 * the status (e.g. initialized/uninitialized) of each byte is
+	 * stored.
+	 */
+
+	flags |= __GFP_COMP;
+
+	if (node == -1)
+		page = alloc_pages(flags, s->order + 1);
+	else
+		page = alloc_pages_node(node, flags, s->order + 1);
+
+	if (!page)
+		return NULL;
+
+	/*
+	 * Mark it as non-present for the MMU so that our accesses to
+	 * this memory will trigger a page fault and let us analyze
+	 * the memory accesses.
+	 */
+	kmemcheck_hide_pages(page, pages);
+
+	/*
+	 * Objects from caches that have a constructor don't get
+	 * cleared when they're allocated, so we need to do it here.
+	 */
+	if (s->ctor)
+		kmemcheck_mark_uninitialized_pages(page, pages);
+	else
+		kmemcheck_mark_unallocated_pages(page, pages);
+
+	mod_zone_page_state(page_zone(page),
+		(s->flags & SLAB_RECLAIM_ACCOUNT) ?
+		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
+		pages + pages);
+
+	return page;
+}
+
+void
+kmemcheck_free_slab(struct kmem_cache *s, struct page *page, int pages)
+{
+	kmemcheck_show_pages(page, pages);
+
+	__ClearPageSlab(page);
+
+	mod_zone_page_state(page_zone(page),
+		(s->flags & SLAB_RECLAIM_ACCOUNT) ?
+		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
+		-pages - pages);
+
+	__free_pages(page, s->order + 1);
+}
+
+void
+kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags, void *object)
+{
+	if (gfpflags & __GFP_ZERO)
+		return;
+	if (s->flags & SLAB_NOTRACK)
+		return;
+
+	if (!kmemcheck_enabled || gfpflags & __GFP_NOTRACK) {
+		/*
+		 * Allow notracked objects to be allocated from
+		 * tracked caches. Note however that these objects
+		 * will still get page faults on access, they just
+		 * won't ever be flagged as uninitialized. If page
+		 * faults are not acceptable, the slab cache itself
+		 * should be marked NOTRACK.
+		 */
+		kmemcheck_mark_initialized(object, s->objsize);
+	} else if (!s->ctor) {
+		/*
+		 * New objects should be marked uninitialized before
+		 * they're returned to the caller.
+		 */
+		kmemcheck_mark_uninitialized(object, s->objsize);
+	}
+}
+
+void
+kmemcheck_slab_free(struct kmem_cache *s, void *object)
+{
+	/* TODO: RCU freeing is unsupported for now; hide false positives. */
+	if (!s->ctor && !(s->flags & SLAB_DESTROY_BY_RCU))
+		kmemcheck_mark_freed(object, s->objsize);
+}
-- 
1.5.4.1



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10  9:07 ` [ANNOUNCE] kmemcheck v7 Bart Van Assche
@ 2008-05-10  9:06   ` Pekka Enberg
  2008-05-10 11:04     ` Bart Van Assche
  0 siblings, 1 reply; 22+ messages in thread
From: Pekka Enberg @ 2008-05-10  9:06 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Vegard Nossum, Linux Kernel Mailing List, Ingo Molnar,
	Peter Zijlstra, Paul E. McKenney, Christoph Lameter,
	Daniel Walker, Andi Kleen, Randy Dunlap, Josh Aune,
	Pekka Paalanen

Bart Van Assche wrote:
> It's a bit late but I finally found out about your announcement of
> kmemcheck version 7. Are you familiar with the patch that adds support
> to Valgrind for User Mode Linux ? I'm not sure what the best approach
> is -- letting the kernel do its own checking like kmemcheck or extend
> Valgrind such that it supports UML. Anyway, the techniques applied in
> Valgrind may be useful for kmemcheck too, such as the algorithms used
> in Valgrind to compress the memory state information.

It's better to do it with the native kernel so you can "valgrind" all 
the interesting driver code.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-04-04 13:44 [ANNOUNCE] kmemcheck v7 Vegard Nossum
                   ` (2 preceding siblings ...)
  2008-04-04 13:47 ` [PATCH 3/3] slub: " Vegard Nossum
@ 2008-05-10  9:07 ` Bart Van Assche
  2008-05-10  9:06   ` Pekka Enberg
  3 siblings, 1 reply; 22+ messages in thread
From: Bart Van Assche @ 2008-05-10  9:07 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Linux Kernel Mailing List, Pekka Enberg, Ingo Molnar,
	Peter Zijlstra, Paul E. McKenney, Christoph Lameter,
	Daniel Walker, Andi Kleen, Randy Dunlap, Josh Aune,
	Pekka Paalanen

On Fri, Apr 4, 2008 at 3:44 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
> I skipped the public announcements for versions 5 and 6, but here is 7 :)
>
> General description: kmemcheck is a patch to the linux kernel that
> detects use of uninitialized memory. It does this by trapping every
> read and write to memory that was allocated dynamically (e.g. using
> kmalloc()). If a memory address is read that has not previously been
> written to, a message is printed to the kernel log.
>
> Changes since v4 (rough list):
> - SLUB parts were broken-out into its own file to avoid cluttering the main
>  SLUB code.
> - A rather lot of cleanups, including removing #ifdefs from arch code.
> - Some preparation in anticipation of an x86_64 port.
> - Make reporting safer by using a periodic timer to inspect the error queue.
> - Fix hang due to page flags changing too early on free().
> - Fix hang due to kprobes incompatibility.
> - Allow CONFIG_SMP, but limit number of CPUs to 1 at run-time.
> - Add kmemcheck=0|1 boot option.
> - Add /proc/sys/kernel/kmemcheck for run-time enabling/disabling.
>
>
> These patches apply to Linus's v2.6.25-rc8. The latest patchset can also be
> found here: http://folk.uio.no/vegardno/linux/kmemcheck/

(reply to an e-mail of one month ago)

Hello Vegard,

It's a bit late but I finally found out about your announcement of
kmemcheck version 7. Are you familiar with the patch that adds support
to Valgrind for User Mode Linux ? I'm not sure what the best approach
is -- letting the kernel do its own checking like kmemcheck or extend
Valgrind such that it supports UML. Anyway, the techniques applied in
Valgrind may be useful for kmemcheck too, such as the algorithms used
in Valgrind to compress the memory state information.

See also:
http://www.mail-archive.com/user-mode-linux-devel@lists.sourceforge.net/msg05602.html

Bart.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10  9:06   ` Pekka Enberg
@ 2008-05-10 11:04     ` Bart Van Assche
  2008-05-10 12:02       ` Vegard Nossum
  0 siblings, 1 reply; 22+ messages in thread
From: Bart Van Assche @ 2008-05-10 11:04 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Vegard Nossum, Linux Kernel Mailing List, Ingo Molnar,
	Peter Zijlstra, Paul E. McKenney, Christoph Lameter,
	Daniel Walker, Andi Kleen, Randy Dunlap, Josh Aune,
	Pekka Paalanen

On Sat, May 10, 2008 at 11:06 AM, Pekka Enberg <penberg@cs.helsinki.fi> wrote:
> Bart Van Assche wrote:
>>
>> It's a bit late but I finally found out about your announcement of
>> kmemcheck version 7. Are you familiar with the patch that adds support
>> to Valgrind for User Mode Linux ? I'm not sure what the best approach
>> is -- letting the kernel do its own checking like kmemcheck or extend
>> Valgrind such that it supports UML. Anyway, the techniques applied in
>> Valgrind may be useful for kmemcheck too, such as the algorithms used
>> in Valgrind to compress the memory state information.
>
> It's better to do it with the native kernel so you can "valgrind" all the
> interesting driver code.

That's right. This is the paper I was referring to that details how to
minimize the memory consumption when tracking state information:
http://www.valgrind.org/docs/shadow-memory2007.pdf

Bart.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 11:04     ` Bart Van Assche
@ 2008-05-10 12:02       ` Vegard Nossum
  2008-05-10 12:37         ` Andi Kleen
                           ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Vegard Nossum @ 2008-05-10 12:02 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: John Reiser, Pekka Enberg, Linux Kernel Mailing List, Ingo Molnar,
	Peter Zijlstra, Paul E. McKenney, Christoph Lameter,
	Daniel Walker, Andi Kleen, Randy Dunlap, Josh Aune,
	Pekka Paalanen

Hi!

On Sat, May 10, 2008 at 1:04 PM, Bart Van Assche
<bart.vanassche@gmail.com> wrote:
>> Bart Van Assche wrote:
>>>
>>> It's a bit late but I finally found out about your announcement of
>>> kmemcheck version 7. Are you familiar with the patch that adds support
>>> to Valgrind for User Mode Linux ? I'm not sure what the best approach
>>> is -- letting the kernel do its own checking like kmemcheck or extend
>>> Valgrind such that it supports UML. Anyway, the techniques applied in
>>> Valgrind may be useful for kmemcheck too, such as the algorithms used
>>> in Valgrind to compress the memory state information.

Yes, I learned of it not so long ago, around January or so. I
wanted to stop kmemcheck development back then, but Ingo and Pekka
convinced me that it could still be useful :-)

(The link is http://bitwagon.com/valgrind+uml/index.html)

I guess the main disadvantages of using kmemcheck over valgrind-memcheck are:
 - kmemcheck can only warn eagerly, whereas memcheck will wait until
the uninitialized bits are actually used. This means that kmemcheck
will report many false positives. (We have some workarounds but this
is obviously not perfect.)
 - kmemcheck can only warn for dynamic memory, whereas memcheck I
believe will also work for local variables, static variables, etc.

It would be interesting to compare the output of kmemcheck vs. the
output of memcheck, though.
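
To make the first point concrete, here is a made-up kernel snippet (the
struct and the function are purely illustrative): kmemcheck warns as soon
as memcpy() reads the bytes that were never written, whereas memcheck
would just propagate the undefined bits into the destination and only
complain if they later reached a conditional, an address or a syscall.

#include <linux/slab.h>
#include <linux/string.h>

struct example {		/* hypothetical struct, for illustration only */
	int initialized;
	int never_written;
};

void example_copy(void)
{
	struct example *src, dst;

	src = kmalloc(sizeof(*src), GFP_KERNEL);
	if (!src)
		return;
	src->initialized = 1;

	/*
	 * src->never_written still has no defined value, so kmemcheck
	 * flags the read below immediately; memcheck would stay silent
	 * until the copied bits actually influenced execution.
	 */
	memcpy(&dst, src, sizeof(dst));
	kfree(src);
}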

> On Sat, May 10, 2008 at 11:06 AM, Pekka Enberg <penberg@cs.helsinki.fi> wrote:
>> It's better to do it with the native kernel so you can "valgrind" all the
>> interesting driver code.
>
> That's right. This is the paper I was referring to that details how to
> minimize the memory consumption when tracking state information:
> http://www.valgrind.org/docs/shadow-memory2007.pdf

Thanks. I have actually seen the paper before, but not read all of it.
From a quick glance, it seems that the optimizations described there
apply to the tracking of individual bits within a byte, but since we
are tracking by byte granularity (as opposed to bit granularity), it
also seems irrelevant to kmemcheck. (I am not saying that it isn't
interesting, however.)

Currently, we are using a full byte for each shadowed byte. Since we
actually only use two bits out of eight, we could save three fourths
compared to what we use today.
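
A packed encoding could look roughly like the sketch below (the state
names and helpers are invented here just to illustrate the idea; today we
really do keep one whole shadow byte per tracked byte):

#include <linux/types.h>

/* Four states fit in two bits, so one shadow byte covers four tracked bytes. */
enum shadow_state {
	SHADOW_UNALLOCATED	= 0,
	SHADOW_UNINITIALIZED	= 1,
	SHADOW_INITIALIZED	= 2,
	SHADOW_FREED		= 3,
};

static inline enum shadow_state
shadow_get(const u8 *shadow, unsigned long n)
{
	/* n is the index of the tracked byte. */
	return (shadow[n / 4] >> (2 * (n % 4))) & 3;
}

static inline void
shadow_set(u8 *shadow, unsigned long n, enum shadow_state state)
{
	unsigned int shift = 2 * (n % 4);

	shadow[n / 4] &= ~(3U << shift);
	shadow[n / 4] |= state << shift;
}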

However, memory usage doesn't seem to be much of a problem. I actually
think it might be worth saving the CPU cycles that are needed for the
lookups/bit operations (memory is cheap, cycles aren't). How is the
speed of Valgrind+UML, does anybody know? Isn't there a problem that
Valgrind will have to emulate all the userspace programs as well?
That, I believe, would make the Valgrinded system painfully slow to
work with. I have no benchmarks or profiler results to refer to, but
kmemcheck at least boots to full userspace+X and is still quite
usable.


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 12:02       ` Vegard Nossum
@ 2008-05-10 12:37         ` Andi Kleen
  2008-05-10 13:22           ` Bart Van Assche
  2008-05-10 17:17           ` Jeremy Fitzhardinge
  2008-05-10 13:29         ` Bart Van Assche
                           ` (2 subsequent siblings)
  3 siblings, 2 replies; 22+ messages in thread
From: Andi Kleen @ 2008-05-10 12:37 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Bart Van Assche, John Reiser, Pekka Enberg,
	Linux Kernel Mailing List, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Christoph Lameter, Daniel Walker, Andi Kleen,
	Randy Dunlap, Josh Aune, Pekka Paalanen

>  - kmemcheck can only warn for dynamic memory, whereas memcheck I
> believe will also work for local variables, static variables, etc.

I don't think that's true. valgrind can only detect uninitialized 
local variables in one special case (first use of the stack region).
But as soon as you reuse stack which is pretty common it won't 
be able to detect the next uninitialized use in a stack frame. 

Luckily the compilers do a reasonable job at detecting them at build time.

And static/global variables are never uninitialized in C.

-Andi

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 12:37         ` Andi Kleen
@ 2008-05-10 13:22           ` Bart Van Assche
  2008-05-10 17:17           ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 22+ messages in thread
From: Bart Van Assche @ 2008-05-10 13:22 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Vegard Nossum, John Reiser, Pekka Enberg,
	Linux Kernel Mailing List, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Christoph Lameter, Daniel Walker, Randy Dunlap,
	Josh Aune, Pekka Paalanen

On Sat, May 10, 2008 at 2:37 PM, Andi Kleen <andi@firstfloor.org> wrote:
>>  - kmemcheck can only warn for dynamic memory, whereas memcheck I
>> believe will also work for local variables, static variables, etc.
>
> I don't think that's true. valgrind can only detect uninitialized
> local variables in one special case (first use of the stack region).
> But as soon as you reuse stack which is pretty common it won't
> be able to detect the next uninitialized use in a stack frame.

As long as the compiler is not told to optimize the compiled code,
Valgrind's memcheck tool is able to detect uninitialized local
variables. Valgrind, among other things, tracks all updates of the stack pointer. If
the stack pointer is increased, the memory range between the old and
the new stack pointer is marked as undefined. This works as long as
gcc doesn't optimize away individual stack pointer updates. (I'm one
of the Valgrind developers.)
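
In pseudo-C the rule is roughly the following (mark_undefined() is a
made-up helper here, not Valgrind's real interface):

/* Hypothetical helper: marks [addr, addr + len) as undefined. */
void mark_undefined(unsigned long addr, unsigned long len);

/* Called for every stack pointer update the tool observes. */
static void track_sp_update(unsigned long old_sp, unsigned long new_sp)
{
	if (new_sp > old_sp) {
		/*
		 * The stack shrank (x86 stacks grow downwards): whatever
		 * was popped off is dead, so reading it again later
		 * counts as use of an uninitialized value.
		 */
		mark_undefined(old_sp, new_sp - old_sp);
	} else if (new_sp < old_sp) {
		/*
		 * The stack grew: the newly reserved frame has no
		 * defined contents yet either.
		 */
		mark_undefined(new_sp, old_sp - new_sp);
	}
}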

Bart.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 12:02       ` Vegard Nossum
  2008-05-10 12:37         ` Andi Kleen
@ 2008-05-10 13:29         ` Bart Van Assche
  2008-05-10 17:17         ` Jeremy Fitzhardinge
  2008-05-11 12:08         ` John Reiser
  3 siblings, 0 replies; 22+ messages in thread
From: Bart Van Assche @ 2008-05-10 13:29 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: John Reiser, Pekka Enberg, Linux Kernel Mailing List, Ingo Molnar,
	Peter Zijlstra, Paul E. McKenney, Christoph Lameter,
	Daniel Walker, Andi Kleen, Randy Dunlap, Josh Aune,
	Pekka Paalanen

On Sat, May 10, 2008 at 2:02 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
> However, memory usage doesn't seem to be much of a problem. I actually
> think it might be worth saving the CPU cycles that are needed for the
> lookups/bit operations (memory is cheap, cycles aren't).

Keep in mind that a reduction in memory usage may reduce the number of
cache misses, and that the improved caching behavior may outweigh the
extra CPU cycles needed for the bit operations.

Bart.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 12:02       ` Vegard Nossum
  2008-05-10 12:37         ` Andi Kleen
  2008-05-10 13:29         ` Bart Van Assche
@ 2008-05-10 17:17         ` Jeremy Fitzhardinge
  2008-05-10 20:35           ` Jeff Dike
  2008-05-11 12:08         ` John Reiser
  3 siblings, 1 reply; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-05-10 17:17 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Bart Van Assche, John Reiser, Pekka Enberg,
	Linux Kernel Mailing List, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Christoph Lameter, Daniel Walker, Andi Kleen,
	Randy Dunlap, Josh Aune, Pekka Paalanen

Vegard Nossum wrote:
> Hi!
>
> On Sat, May 10, 2008 at 1:04 PM, Bart Van Assche
> <bart.vanassche@gmail.com> wrote:
>   
>>> Bart Van Assche wrote:
>>>       
>>>> It's a bit late but I finally found out about your announcement of
>>>> kmemcheck version 7. Are you familiar with the patch that adds support
>>>> to Valgrind for User Mode Linux ? I'm not sure what the best approach
>>>> is -- letting the kernel do its own checking like kmemcheck or extend
>>>> Valgrind such that it supports UML. Anyway, the techniques applied in
>>>> Valgrind may be useful for kmemcheck too, such as the algorithms used
>>>> in Valgrind to compress the memory state information.
>>>>         
>
> Yes, I learned of it not so long ago, around January or so. I
> wanted to stop kmemcheck development back then, but Ingo and Pekka
> convinced me that it could still be useful :-)
>
> (The link is http://bitwagon.com/valgrind+uml/index.html)
>
> I guess the main disadvantages of using kmemcheck over valgrind-memcheck are:
>  - kmemcheck can only warn eagerly, whereas memcheck will wait until
> the uninitialized bits are actually used. This means that kmemcheck
> will report many false positives. (We have some workarounds but this
> is obviously not perfect.)
>  - kmemcheck can only warn for dynamic memory, whereas memcheck I
> believe will also work for local variables, static variables, etc.
>
> It would be interesting to compare the output of kmemcheck vs. the
> output of memcheck, though.
>
>   
>> On Sat, May 10, 2008 at 11:06 AM, Pekka Enberg <penberg@cs.helsinki.fi> wrote:
>>     
>>> It's better to do it with the native kernel so you can "valgrind" all the
>>> interesting driver code.
>>>       
>> That's right. This is the paper I was referring to that details how to
>> minimize the memory consumption when tracking state information:
>> http://www.valgrind.org/docs/shadow-memory2007.pdf
>>     
>
> Thanks. I have actually seen the paper before, but not read all of it.
> From a quick glance, it seems that the optimizations described there
> apply to the tracking of individual bits within a byte, but since we
> are tracking by byte granularity (as opposed to bit granularity), it
> also seems irrelevant to kmemcheck. (I am not saying that it isn't
> interesting, however.)
>
> Currently, we are using a full byte for each shadowed byte. Since we
> actually only use two bits out of eight, we could save three fourths
> compared to what we use today.
>
> However, memory usage doesn't seem to be much of a problem. I actually
> think it might be worth saving the CPU cycles that are needed for the
> lookups/bit operations (memory is cheap, cycles aren't). How is the
> speed of Valgrind+UML, does anybody know? Isn't there a problem that
> Valgrind will have to emulate all the userspace programs as well?
> That, I believe, would make the Valgrinded system painfully slow to
> work with. I have no benchmarks or profiler results to refer to, but
> kmemcheck at least boots to full userspace+X and is still quite
> usable.

No, I think valgrind+uml deliberately lets usermode code run directly on 
the cpu, not under valgrind.  Having the option to run everything under 
Valgrind would be interesting, since it would allow you to trace 
uninitialized values crossing the user-kernel boundary (both ways) 
indicating either usermode or kernel bugs (also user to user via the 
kernel, such as via a pipe).

I've thought about, but not actually implemented, running valgrind as a 
Xen guest, and then running a sub-guest under it, allowing you to run an 
entire virtual machine under Valgrind.  I think people have done vaguely 
similar stuff with qemu.

    J


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 12:37         ` Andi Kleen
  2008-05-10 13:22           ` Bart Van Assche
@ 2008-05-10 17:17           ` Jeremy Fitzhardinge
  2008-05-10 17:48             ` Andi Kleen
  1 sibling, 1 reply; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-05-10 17:17 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Vegard Nossum, Bart Van Assche, John Reiser, Pekka Enberg,
	Linux Kernel Mailing List, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Christoph Lameter, Daniel Walker, Randy Dunlap,
	Josh Aune, Pekka Paalanen

Andi Kleen wrote:
>>  - kmemcheck can only warn for dynamic memory, whereas memcheck I
>> believe will also work for local variables, static variables, etc.
>>     
>
> I don't think that's true. valgrind can only detect uninitialized 
> local variables in one special case (first use of the stack region).
> But as soon as you reuse stack which is pretty common it won't 
> be able to detect the next uninitialized use in a stack frame. 
>   

It tracks changes to the stack pointer, and any memory below it is 
considered uninitialized.  But, yes, you are right that if you use the 
variable (or slot) once in a function and then again later, it will still 
be considered initialized.  But that's no different from any other memory.

    J

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 17:17           ` Jeremy Fitzhardinge
@ 2008-05-10 17:48             ` Andi Kleen
  2008-05-10 20:45               ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2008-05-10 17:48 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andi Kleen, Vegard Nossum, Bart Van Assche, John Reiser,
	Pekka Enberg, Linux Kernel Mailing List, Ingo Molnar,
	Peter Zijlstra, Paul E. McKenney, Christoph Lameter,
	Daniel Walker, Randy Dunlap, Josh Aune, Pekka Paalanen

> It tracks changes to the stack pointer, and any memory below it is 
> considered uninitialized.  But, yes, if you mean that if you use the 

But it does not invalidate anything below the stack pointer as soon
as it changes right ?

> variable (or slot) once in a function, then again later, it will still 
> be considered initialized.  But that's no different from any other memory.

What I meant is e.g. 

	f1();
	f2();

If both f1 and f2 use the same stack memory, but f2 uses it uninitialized,
then I think valgrind would still think it is initialized in f2 because of
the execution of f1. It would only detect such things in f1 (assuming there
were no other users of the stack before that).

In theory it could throw away all stack-related initialization state on each
SP change, but that would likely be prohibitively expensive, and it might
also be hard to know the exact boundaries of the stack.

BTW, running a test program here, it doesn't seem to detect any uninitialized
stack frames with 3.2.3. The test program is http://halobates.de/t10.c
(it should be compiled without optimization).

-Andi


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 17:17         ` Jeremy Fitzhardinge
@ 2008-05-10 20:35           ` Jeff Dike
  2008-05-11 11:23             ` John Reiser
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff Dike @ 2008-05-10 20:35 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Vegard Nossum, Bart Van Assche, John Reiser, Pekka Enberg,
	Linux Kernel Mailing List, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Christoph Lameter, Daniel Walker, Andi Kleen,
	Randy Dunlap, Josh Aune, Pekka Paalanen

On Sat, May 10, 2008 at 06:17:21PM +0100, Jeremy Fitzhardinge wrote:
> No, I think valgrind+uml deliberately lets usermode code run directly on 
> the cpu, not under valgrind.  

It can be done either way.  Grinding userspace code as well is more
uniform, as there's no need to say "this clone should not be followed,
as it will become a UML process".  On the other hand, not grinding
processes means you don't need to figure out how to get the valgrind
engine into your processes.

			Jeff

-- 
Work email - jdike at linux dot intel dot com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 17:48             ` Andi Kleen
@ 2008-05-10 20:45               ` Jeremy Fitzhardinge
  2008-05-10 21:29                 ` John Reiser
  2008-05-10 21:31                 ` Andi Kleen
  0 siblings, 2 replies; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-05-10 20:45 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Vegard Nossum, Bart Van Assche, John Reiser, Pekka Enberg,
	Linux Kernel Mailing List, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Christoph Lameter, Daniel Walker, Randy Dunlap,
	Josh Aune, Pekka Paalanen

[-- Attachment #1: Type: text/plain, Size: 4891 bytes --]

Andi Kleen wrote:
>> It tracks changes to the stack pointer, and any memory below it is 
>> considered uninitialized.  But, yes, if you mean that if you use the 
>>     
>
> But it does not invalidate anything below the stack pointer as soon
> as it changes right ?
>   

Yeah, as soon as the stack pointer changes, everything below it is 
invalidated (except if the stack-pointer change was actually determined 
to be a stack switch).

>> variable (or slot) once in a function, then again later, it will still 
>> be considered initialized.  But that's no different from any other memory.
>>     
>
> What I meant is e.g. 
>
> 	f1();
> 	f2();
>
> both f1 and f2 use the same stack memory, but f2 uses it uninitialized,
> then I think valgrind would still think it is initialized in f2 from the
> execution of f1. It would only detect such things in f1 (assuming there
> were no other users of the stack before that)
>   

No, it won't.  If the stack pointer goes up then down between f1 and f2, 
then f2 will get fresh values.

The big thing Valgrind hasn't traditionally helped with is overruns of 
on-stack arrays.  You may be thinking of that.

> In theory it could throw away all stack related uninitizedness on each
> SP change, but that would be likely prohibitively expensive and also
> it might be hard to know the exact boundaries of the stack.
>   

No, it's not all that expensive compared to the overall cost of valgrind and 
the amount of diagnostic power it provides.  Determining stack 
boundaries has always been a bit fraught.  Typically a stack switch has 
been determined heuristically by looking for a "large" change in stack 
pointer, but there's a callback to specifically mark a range of memory 
as a stack, so that movements into and out of a stack can be determined 
as a switch (added specifically to deal with small densely packed stacks 
in uml).

> BTW on running a test program here it doesn't seem to detect any uninitialized
> stack frames here with 3.2.3. Test program is http://halobates.de/t10.c 
> (should be compiled without optimization) 
>   

Hm, I'd expect it to.  Oh, your test program doesn't use the value.  
Valgrind doesn't complain about uninitialized values unless they 
actually affect execution (ie, a conditional depends on one, you use it 
as an address for a dereference, or pass it to a syscall).

The attached version emits errors as I'd expect:

$ valgrind t10
==30474== Memcheck, a memory error detector.
==30474== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==30474== Using LibVEX rev 1804, a library for dynamic binary translation.
==30474== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==30474== Using valgrind-3.3.0, a dynamic binary instrumentation framework.
==30474== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==30474== For more details, rerun with: -v
==30474== 
f1 set y to 1
==30474== Conditional jump or move depends on uninitialised value(s)
==30474==    at 0x8048420: test (t10.c:22)
==30474==    by 0x8048451: main (t10.c:29)
==30474== 
==30474== Use of uninitialised value of size 4
==30474==    at 0xB5C5B6: _itoa_word (in /lib/libc-2.8.so)
==30474==    by 0xB5FF90: vfprintf (in /lib/libc-2.8.so)
==30474==    by 0xB6769F: printf (in /lib/libc-2.8.so)
==30474==    by 0x8048436: test (t10.c:23)
==30474==    by 0x8048451: main (t10.c:29)
==30474== 
==30474== Conditional jump or move depends on uninitialised value(s)
==30474==    at 0xB5C5BE: _itoa_word (in /lib/libc-2.8.so)
==30474==    by 0xB5FF90: vfprintf (in /lib/libc-2.8.so)
==30474==    by 0xB6769F: printf (in /lib/libc-2.8.so)
==30474==    by 0x8048436: test (t10.c:23)
==30474==    by 0x8048451: main (t10.c:29)
==30474== 
==30474== Conditional jump or move depends on uninitialised value(s)
==30474==    at 0xB5EADE: vfprintf (in /lib/libc-2.8.so)
==30474==    by 0xB6769F: printf (in /lib/libc-2.8.so)
==30474==    by 0x8048436: test (t10.c:23)
==30474==    by 0x8048451: main (t10.c:29)
==30474== 
==30474== Conditional jump or move depends on uninitialised value(s)
==30474==    at 0xB60828: vfprintf (in /lib/libc-2.8.so)
==30474==    by 0xB6769F: printf (in /lib/libc-2.8.so)
==30474==    by 0x8048436: test (t10.c:23)
==30474==    by 0x8048451: main (t10.c:29)
==30474== 
==30474== Conditional jump or move depends on uninitialised value(s)
==30474==    at 0xB5EB88: vfprintf (in /lib/libc-2.8.so)
==30474==    by 0xB6769F: printf (in /lib/libc-2.8.so)
==30474==    by 0x8048436: test (t10.c:23)
==30474==    by 0x8048451: main (t10.c:29)
f2 set y to 13123572
==30474== 
==30474== ERROR SUMMARY: 20 errors from 6 contexts (suppressed: 13 from 1)
==30474== malloc/free: in use at exit: 0 bytes in 0 blocks.
==30474== malloc/free: 0 allocs, 0 frees, 0 bytes allocated.
==30474== For counts of detected errors, rerun with: -v
==30474== All heap blocks were freed -- no leaks are possible.



    J

[-- Attachment #2: t10.c --]
[-- Type: text/x-csrc, Size: 258 bytes --]

#include <stdio.h>
int y;

void f1(void)
{	
	int x = 1;
	y = x;
}

void f2(void)
{	
	int x;
	y = x;
}

void test()
{
	f1();
	if (y)
		printf("f1 set y to %d\n", y);
	f2();
	if (y)
		printf("f2 set y to %d\n", y);
}

main()
{
	char buf[16 * 1024];
	test();
}

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 20:45               ` Jeremy Fitzhardinge
@ 2008-05-10 21:29                 ` John Reiser
  2008-05-10 23:05                   ` Jeremy Fitzhardinge
  2008-05-10 21:31                 ` Andi Kleen
  1 sibling, 1 reply; 22+ messages in thread
From: John Reiser @ 2008-05-10 21:29 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andi Kleen, Vegard Nossum, Bart Van Assche, Pekka Enberg,
	Linux Kernel Mailing List, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Christoph Lameter, Daniel Walker, Randy Dunlap,
	Josh Aune, Pekka Paalanen

Jeremy Fitzhardinge wrote:

> Determining stack
> boundaries has always been a bit fraught.  Typically a stack switch has
> been determined heuristically by looking for a "large" change in stack
> pointer, but there's a callback to specifically mark a range of memory
> as a stack, so that movements into and out of a stack can be determined
> as a switch (added specifically to deal with small densely packed stacks
> in uml).

The valgrind+uml patches added a callback, "I am switching stacks >NOW<."
If possible then it is better to tell an interpreter what is happening,
rather than requiring that the interpreter [try to] figure it out.

-- 
John Reiser, jreiser@BitWagon.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 20:45               ` Jeremy Fitzhardinge
  2008-05-10 21:29                 ` John Reiser
@ 2008-05-10 21:31                 ` Andi Kleen
  2008-05-10 22:59                   ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2008-05-10 21:31 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andi Kleen, Vegard Nossum, Bart Van Assche, John Reiser,
	Pekka Enberg, Linux Kernel Mailing List, Ingo Molnar,
	Peter Zijlstra, Paul E. McKenney, Christoph Lameter,
	Daniel Walker, Randy Dunlap, Josh Aune, Pekka Paalanen

> Yeah, as soon as the stack pointer changes, everything below it is 
> invalidated (except if the stack-pointer change was actually determined 
> to be a stack switch).

It might in theory, but at least it doesn't for my test program.

-Andi

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 21:31                 ` Andi Kleen
@ 2008-05-10 22:59                   ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-05-10 22:59 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Vegard Nossum, Bart Van Assche, John Reiser, Pekka Enberg,
	Linux Kernel Mailing List, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Christoph Lameter, Daniel Walker, Randy Dunlap,
	Josh Aune, Pekka Paalanen

Andi Kleen wrote:
>> Yeah, as soon as the stack pointer changes, everything below it is 
>> invalidated (except if the stack-pointer change was actually determined 
>> to be a stack switch).
>>     
>
> It might in theory, but at least it doesn't for my test program.
>   

If you'd read a tiny bit further down my mail, you'd have seen my 
explanation of why your test program isn't testing what you think it is, 
and a variant which does.

    J

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 21:29                 ` John Reiser
@ 2008-05-10 23:05                   ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 22+ messages in thread
From: Jeremy Fitzhardinge @ 2008-05-10 23:05 UTC (permalink / raw)
  To: John Reiser
  Cc: Andi Kleen, Vegard Nossum, Bart Van Assche, Pekka Enberg,
	Linux Kernel Mailing List, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Christoph Lameter, Daniel Walker, Randy Dunlap,
	Josh Aune, Pekka Paalanen

John Reiser wrote:
> The valgrind+uml patches added a callback, "I am switching stacks >NOW<."
>   
Hm, I never particularly liked that approach because unless you do the 
whole thing in assembly it was never certain that there wasn't a 
basic-block break between the callback and the actual stack-pointer change 
(i.e., that the two were atomic with respect to valgrind).  For the kernel 
that may be possible, but I was thinking of the general case where you 
might want to use setjmp or something.

> If possible then it is better to tell an interpreter what is happening,
> rather than requiring that the interpreter [try to] figure it out.
>   

Matter of taste really, but I tend to disagree.  If you say something 
like "addresses A-B, C-D, E-F are stacks", then the stack pointer 
changing from the range A-B to C-D is a pretty clear indication of stack 
switch, regardless of the mechanism you use to do it.  Of course, an 
explicit hint prevents an accidental push/pop of 32k onto an 8K stack 
from being considered a stack switch, but unless you actually know where 
the stacks are, you can't warn about it or prevent it from 
validating/invalidating a pile of innocent memory.
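
In other words, something like the sketch below (the range registry and
its lookup are invented names, not Valgrind's actual hooks):

#include <stdbool.h>

struct stack_range {
	unsigned long lo, hi;	/* one registered stack: [lo, hi) */
};

/* Hypothetical lookup into the ranges the client has registered. */
const struct stack_range *find_registered_stack(unsigned long sp);

static bool is_stack_switch(unsigned long old_sp, unsigned long new_sp)
{
	const struct stack_range *from = find_registered_stack(old_sp);
	const struct stack_range *to = find_registered_stack(new_sp);

	/*
	 * Moving from one registered stack to a different one is a
	 * switch; a large jump within a single stack is just an
	 * oversized push or pop and should invalidate memory as usual.
	 */
	return from && to && from != to;
}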

    J

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 20:35           ` Jeff Dike
@ 2008-05-11 11:23             ` John Reiser
  0 siblings, 0 replies; 22+ messages in thread
From: John Reiser @ 2008-05-11 11:23 UTC (permalink / raw)
  To: Jeff Dike
  Cc: Jeremy Fitzhardinge, Vegard Nossum, Bart Van Assche, Pekka Enberg,
	Linux Kernel Mailing List, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Christoph Lameter, Daniel Walker, Andi Kleen,
	Randy Dunlap, Josh Aune, Pekka Paalanen

Jeff Dike wrote:

> ... not grinding processes means you don't need to figure out
> how to get the valgrind engine into your processes.

One easy way to force valgrind into a process is for load_elf_binary()
in fs/binfmt_elf.c to force a PT_INTERP which loads memcheck via
true user-mode calls, then chains to the original PT_INTERP.

-- 
John Reiser, jreiser@BitWagon.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] kmemcheck v7
  2008-05-10 12:02       ` Vegard Nossum
                           ` (2 preceding siblings ...)
  2008-05-10 17:17         ` Jeremy Fitzhardinge
@ 2008-05-11 12:08         ` John Reiser
  3 siblings, 0 replies; 22+ messages in thread
From: John Reiser @ 2008-05-11 12:08 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Bart Van Assche, Pekka Enberg, Linux Kernel Mailing List,
	Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Christoph Lameter,
	Daniel Walker, Andi Kleen, Randy Dunlap, Josh Aune,
	Pekka Paalanen

Vegard Nossum wrote:
> How is the speed of Valgrind+UML, does anybody know?

The speed of Valgrind+UML is the same as the speed of valgrind
on any application.  On a 2GHz box it took about 2.5 minutes
to reach "login:" from a cold boot of UML (includes udev, etc.).
So if normal boot takes 15 seconds, then that's a factor of 10
slowdown: slow for interactivity, yet bearable for checking.
The memory-intensive portions (linear search, pointer chasing,
etc.) can be slower still, but loops that concentrate on
register arithmetic or conditional branching go faster.
There is almost no system wait time: normal device delays (disk,
network) get totally overlapped by CPU usage for grinding :-)

I'd like to have both kmemcheck and valgrind+UML, and use them
differently.  Run kmemcheck all the time on a box or two as
"background trolling" for infrequent cases.  Use valgrind+UML
for interactivity and programmable flexibility when hunting
specific bugs, or when hardware cannot be dedicated.

-- 
John Reiser, jreiser@BitWagon.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2008-05-11 12:36 UTC | newest]

Thread overview: 22+ messages
2008-04-04 13:44 [ANNOUNCE] kmemcheck v7 Vegard Nossum
2008-04-04 13:45 ` [PATCH 1/3] kmemcheck: add the kmemcheck core Vegard Nossum
2008-04-04 13:46 ` [PATCH 2/3] x86: add hooks for kmemcheck Vegard Nossum
2008-04-04 13:47 ` [PATCH 3/3] slub: " Vegard Nossum
2008-05-10  9:07 ` [ANNOUNCE] kmemcheck v7 Bart Van Assche
2008-05-10  9:06   ` Pekka Enberg
2008-05-10 11:04     ` Bart Van Assche
2008-05-10 12:02       ` Vegard Nossum
2008-05-10 12:37         ` Andi Kleen
2008-05-10 13:22           ` Bart Van Assche
2008-05-10 17:17           ` Jeremy Fitzhardinge
2008-05-10 17:48             ` Andi Kleen
2008-05-10 20:45               ` Jeremy Fitzhardinge
2008-05-10 21:29                 ` John Reiser
2008-05-10 23:05                   ` Jeremy Fitzhardinge
2008-05-10 21:31                 ` Andi Kleen
2008-05-10 22:59                   ` Jeremy Fitzhardinge
2008-05-10 13:29         ` Bart Van Assche
2008-05-10 17:17         ` Jeremy Fitzhardinge
2008-05-10 20:35           ` Jeff Dike
2008-05-11 11:23             ` John Reiser
2008-05-11 12:08         ` John Reiser
