public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [patch 0/9] 2.6.22-stable review
@ 2007-11-02 17:37 ` Greg KH
  2007-11-02 17:37   ` [patch 1/9] genirq: cleanup mismerge artifact Greg KH
                     ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:37 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
	Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
	Chuck Ebbert, Domenico Andreoli, torvalds, akpm, alan

This is the start of the stable review cycle for the 2.6.22.12 release.
There are 9 patches in this series, all will be posted as a response to
this one.  If anyone has any issues with these being applied, please let
us know.  If anyone is a maintainer of the proper subsystem, and wants
to add a Signed-off-by: line to the patch, please respond with it.

This should be the last 2.6.22-stable release unless we missed something
major and people complain loudly that we need to do another release.

These patches are sent out with a number of different people on the Cc:
line.  If you wish to be a reviewer, please email stable@kernel.org
to add your name to the list.  If you want to be off the reviewer list,
also email us.

Responses should be made by Monday, Nov 5, 2007, 16:00:00 UTC.  Anything
received after that time might be too late.

thanks,

the -stable release team

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [patch 1/9] genirq: cleanup mismerge artifact
  2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
@ 2007-11-02 17:37   ` Greg KH
  2007-11-02 17:37   ` [patch 2/9] genirq: suppress resend of level interrupts Greg KH
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:37 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
	Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
	Chuck Ebbert, Domenico Andreoli, torvalds, akpm, alan,
	Thomas Gleixner

[-- Attachment #1: genirq-cleanup-mismerge-artifact.patch --]
[-- Type: text/plain, Size: 1267 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------
From: Thomas Gleixner <tglx@linutronix.de>

patch 496634217e5671ed876a0348e9f5b7165e830b20 in mainline.

Commit 5a43a066b11ac2fe84cf67307f20b83bea390f83: "genirq: Allow fasteoi
handler to retrigger disabled interrupts" was erroneously applied to
handle_level_irq().  This added the irq retrigger / resend functionality
to the level irq handler.

Revert the offending bits.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 kernel/irq/chip.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -352,13 +352,10 @@ handle_level_irq(unsigned int irq, struc
 	 * keep it masked and get out of here
 	 */
 	action = desc->action;
-	if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
-		desc->status |= IRQ_PENDING;
+	if (unlikely(!action || (desc->status & IRQ_DISABLED)))
 		goto out_unlock;
-	}
 
 	desc->status |= IRQ_INPROGRESS;
-	desc->status &= ~IRQ_PENDING;
 	spin_unlock(&desc->lock);
 
 	action_ret = handle_IRQ_event(irq, action);

-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [patch 2/9] genirq: suppress resend of level interrupts
  2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
  2007-11-02 17:37   ` [patch 1/9] genirq: cleanup mismerge artifact Greg KH
@ 2007-11-02 17:37   ` Greg KH
  2007-11-02 17:37   ` [patch 3/9] genirq: mark io_apic level interrupts to avoid resend Greg KH
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:37 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
	Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
	Chuck Ebbert, Domenico Andreoli, torvalds, akpm, alan,
	Thomas Gleixner

[-- Attachment #1: genirq-suppress-resend-of-level-interrupts.patch --]
[-- Type: text/plain, Size: 1197 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------
From: Thomas Gleixner <tglx@linutronix.de>

patch 2464286ace55b3abddfb9cc30ab95e2dac1de9a6 in mainline.

Level type interrupts are resent by the interrupt hardware when they are
still active at irq_enable().

Suppress the resend mechanism for interrupts marked as level.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 kernel/irq/resend.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/kernel/irq/resend.c
+++ b/kernel/irq/resend.c
@@ -62,7 +62,12 @@ void check_irq_resend(struct irq_desc *d
 	 */
 	desc->chip->enable(irq);
 
-	if ((status & (IRQ_PENDING | IRQ_REPLAY)) == IRQ_PENDING) {
+	/*
+	 * We do not resend level type interrupts. Level type
+	 * interrupts are resent by hardware when they are still
+	 * active.
+	 */
+	if ((status & (IRQ_LEVEL | IRQ_PENDING | IRQ_REPLAY)) == IRQ_PENDING) {
 		desc->status = (status & ~IRQ_PENDING) | IRQ_REPLAY;
 
 		if (!desc->chip || !desc->chip->retrigger ||

-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [patch 3/9] genirq: mark io_apic level interrupts to avoid resend
  2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
  2007-11-02 17:37   ` [patch 1/9] genirq: cleanup mismerge artifact Greg KH
  2007-11-02 17:37   ` [patch 2/9] genirq: suppress resend of level interrupts Greg KH
@ 2007-11-02 17:37   ` Greg KH
  2007-11-02 17:37   ` [patch 4/9] IB/uverbs: Fix checking of userspace object ownership Greg KH
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:37 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
	Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
	Chuck Ebbert, Domenico Andreoli, torvalds, akpm, alan,
	Thomas Gleixner

[-- Attachment #1: genirq-mark-io_apic-level-interrupts-to-avoid-resend.patch --]
[-- Type: text/plain, Size: 2029 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------
From: Thomas Gleixner <tglx@linutronix.de>

patch cc75b92d11384ba14f93828a2a0040344ae872e7 in mainline.

Level type interrupts do not need to be resent.  It was also found that
some chipsets get confused in case of the resend.

Mark the ioapic level type interrupts as such to avoid the resend
functionality in the generic irq code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 arch/i386/kernel/io_apic.c   |    7 +++++--
 arch/x86_64/kernel/io_apic.c |    7 +++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

--- a/arch/i386/kernel/io_apic.c
+++ b/arch/i386/kernel/io_apic.c
@@ -1275,12 +1275,15 @@ static struct irq_chip ioapic_chip;
 static void ioapic_register_intr(int irq, int vector, unsigned long trigger)
 {
 	if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) ||
-			trigger == IOAPIC_LEVEL)
+	    trigger == IOAPIC_LEVEL) {
+		irq_desc[irq].status |= IRQ_LEVEL;
 		set_irq_chip_and_handler_name(irq, &ioapic_chip,
 					 handle_fasteoi_irq, "fasteoi");
-	else
+	} else {
+		irq_desc[irq].status &= ~IRQ_LEVEL;
 		set_irq_chip_and_handler_name(irq, &ioapic_chip,
 					 handle_edge_irq, "edge");
+	}
 	set_intr_gate(vector, interrupt[irq]);
 }
 
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -774,12 +774,15 @@ static struct irq_chip ioapic_chip;
 
 static void ioapic_register_intr(int irq, unsigned long trigger)
 {
-	if (trigger)
+	if (trigger) {
+		irq_desc[irq].status |= IRQ_LEVEL;
 		set_irq_chip_and_handler_name(irq, &ioapic_chip,
 					      handle_fasteoi_irq, "fasteoi");
-	else
+	} else {
+		irq_desc[irq].status &= ~IRQ_LEVEL;
 		set_irq_chip_and_handler_name(irq, &ioapic_chip,
 					      handle_edge_irq, "edge");
+	}
 }
 
 static void setup_IO_APIC_irq(int apic, int pin, unsigned int irq,

-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [patch 4/9] IB/uverbs: Fix checking of userspace object ownership
  2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
                     ` (2 preceding siblings ...)
  2007-11-02 17:37   ` [patch 3/9] genirq: mark io_apic level interrupts to avoid resend Greg KH
@ 2007-11-02 17:37   ` Greg KH
  2007-11-02 17:38   ` [patch 5/9] minixfs: limit minixfs printks on corrupted dir i_size (CVE-2006-6058) Greg KH
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:37 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
	Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
	Chuck Ebbert, Domenico Andreoli, torvalds, akpm, alan,
	Roland Dreier

[-- Attachment #1: ib-uverbs-fix-checking-of-userspace-object-ownership.patch --]
[-- Type: text/plain, Size: 1333 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------
From: Roland Dreier <rolandd@cisco.com>

Upstream as cbfb50e6e2e9c580848c0f51d37c24cdfb1cb704

Commit 9ead190b ("IB/uverbs: Don't serialize with ib_uverbs_idr_mutex")
rewrote how userspace objects are looked up in the uverbs module's
idrs, and introduced a severe bug in the process: there is no checking
that an operation is being performed by the right process any more.
Fix this by adding the missing check of uobj->context in __idr_get_uobj().

Apparently everyone is being very careful to only touch their own
objects, because this bug was introduced in June 2006 in 2.6.18, and
has gone undetected until now.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/infiniband/core/uverbs_cmd.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -147,8 +147,12 @@ static struct ib_uobject *__idr_get_uobj
 
 	spin_lock(&ib_uverbs_idr_lock);
 	uobj = idr_find(idr, id);
-	if (uobj)
-		kref_get(&uobj->ref);
+	if (uobj) {
+		if (uobj->context == context)
+			kref_get(&uobj->ref);
+		else
+			uobj = NULL;
+	}
 	spin_unlock(&ib_uverbs_idr_lock);
 
 	return uobj;

-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [patch 5/9] minixfs: limit minixfs printks on corrupted dir i_size (CVE-2006-6058)
  2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
                     ` (3 preceding siblings ...)
  2007-11-02 17:37   ` [patch 4/9] IB/uverbs: Fix checking of userspace object ownership Greg KH
@ 2007-11-02 17:38   ` Greg KH
  2007-11-02 17:38   ` [patch 6/9] param_sysfs_builtin memchr argument fix Greg KH
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:38 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
	Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
	Chuck Ebbert, Domenico Andreoli, torvalds, akpm, alan,
	Eric Sandeen, Bodo Eggert

[-- Attachment #1: minixfs-limit-minixfs-printks-on-corrupted-dir-i_size.patch --]
[-- Type: text/plain, Size: 2722 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------
From: Eric Sandeen <sandeen@redhat.com>

patch 44ec6f3f89889a469773b1fd894f8fcc07c29cf in mainline

This attempts to address CVE-2006-6058
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-6058

first reported at http://projects.info-pull.com/mokb/MOKB-17-11-2006.html

Essentially a corrupted minix dir inode reporting a very large
i_size will loop for a very long time in minix_readdir, minix_find_entry,
etc, because on EIO they just move on to try the next page.  This is
under the BKL, printk-storming as well.  This can lock up the machine
for a very long time.  Simply ratelimiting the printks gets things back
under control.  Make the message a bit more informative while we're here.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: Bodo Eggert <7eggert@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 fs/minix/itree_v1.c |    9 +++++++--
 fs/minix/itree_v2.c |    9 +++++++--
 2 files changed, 14 insertions(+), 4 deletions(-)

--- a/fs/minix/itree_v1.c
+++ b/fs/minix/itree_v1.c
@@ -23,11 +23,16 @@ static inline block_t *i_data(struct ino
 static int block_to_path(struct inode * inode, long block, int offsets[DEPTH])
 {
 	int n = 0;
+	char b[BDEVNAME_SIZE];
 
 	if (block < 0) {
-		printk("minix_bmap: block<0\n");
+		printk("MINIX-fs: block_to_path: block %ld < 0 on dev %s\n",
+			block, bdevname(inode->i_sb->s_bdev, b));
 	} else if (block >= (minix_sb(inode->i_sb)->s_max_size/BLOCK_SIZE)) {
-		printk("minix_bmap: block>big\n");
+		if (printk_ratelimit())
+			printk("MINIX-fs: block_to_path: "
+			       "block %ld too big on dev %s\n",
+				block, bdevname(inode->i_sb->s_bdev, b));
 	} else if (block < 7) {
 		offsets[n++] = block;
 	} else if ((block -= 7) < 512) {
--- a/fs/minix/itree_v2.c
+++ b/fs/minix/itree_v2.c
@@ -23,12 +23,17 @@ static inline block_t *i_data(struct ino
 static int block_to_path(struct inode * inode, long block, int offsets[DEPTH])
 {
 	int n = 0;
+	char b[BDEVNAME_SIZE];
 	struct super_block *sb = inode->i_sb;
 
 	if (block < 0) {
-		printk("minix_bmap: block<0\n");
+		printk("MINIX-fs: block_to_path: block %ld < 0 on dev %s\n",
+			block, bdevname(sb->s_bdev, b));
 	} else if (block >= (minix_sb(inode->i_sb)->s_max_size/sb->s_blocksize)) {
-		printk("minix_bmap: block>big\n");
+		if (printk_ratelimit())
+			printk("MINIX-fs: block_to_path: "
+			       "block %ld too big on dev %s\n",
+				block, bdevname(sb->s_bdev, b));
 	} else if (block < 7) {
 		offsets[n++] = block;
 	} else if ((block -= 7) < 256) {

-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [patch 6/9] param_sysfs_builtin memchr argument fix
  2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
                     ` (4 preceding siblings ...)
  2007-11-02 17:38   ` [patch 5/9] minixfs: limit minixfs printks on corrupted dir i_size (CVE-2006-6058) Greg KH
@ 2007-11-02 17:38   ` Greg KH
  2007-11-02 17:38   ` [patch 7/9] x86: fix global_flush_tlb() bug Greg KH
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:38 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
	Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
	Chuck Ebbert, Domenico Andreoli, torvalds, akpm, alan, Dave Young,
	Greg KH

[-- Attachment #1: param_sysfs_builtin-memchr-argument-fix.patch --]
[-- Type: text/plain, Size: 3038 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------
From: Dave Young <hidave.darkstar@gmail.com>

patch faf8c714f4508207a9c81cc94dafc76ed6680b44 in mainline.

If memchr argument is longer than strlen(kp->name), there will be some
weird result.

It will casuse duplicate filenames in sysfs for the "nousb".  kernel
warning messages are as bellow:

sysfs: duplicate filename 'usbcore' can not be created
WARNING: at fs/sysfs/dir.c:416 sysfs_add_one()
 [<c01c4750>] sysfs_add_one+0xa0/0xe0
 [<c01c4ab8>] create_dir+0x48/0xb0
 [<c01c4b69>] sysfs_create_dir+0x29/0x50
 [<c024e0fb>] create_dir+0x1b/0x50
 [<c024e3b6>] kobject_add+0x46/0x150
 [<c024e2da>] kobject_init+0x3a/0x80
 [<c053b880>] kernel_param_sysfs_setup+0x50/0xb0
 [<c053b9ce>] param_sysfs_builtin+0xee/0x130
 [<c053ba33>] param_sysfs_init+0x23/0x60
 [<c024d062>] __next_cpu+0x12/0x20
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052a856>] do_initcalls+0x46/0x1e0
 [<c01bdb12>] create_proc_entry+0x52/0x90
 [<c0158d4c>] register_irq_proc+0x9c/0xc0
 [<c01bda94>] proc_mkdir_mode+0x34/0x50
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052aa92>] kernel_init+0x62/0xb0
 [<c0104f83>] kernel_thread_helper+0x7/0x14
 =======================
kobject_add failed for usbcore with -EEXIST, don't try to register things with the same name in the same directory.
 [<c024e466>] kobject_add+0xf6/0x150
 [<c053b880>] kernel_param_sysfs_setup+0x50/0xb0
 [<c053b9ce>] param_sysfs_builtin+0xee/0x130
 [<c053ba33>] param_sysfs_init+0x23/0x60
 [<c024d062>] __next_cpu+0x12/0x20
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052a856>] do_initcalls+0x46/0x1e0
 [<c01bdb12>] create_proc_entry+0x52/0x90
 [<c0158d4c>] register_irq_proc+0x9c/0xc0
 [<c01bda94>] proc_mkdir_mode+0x34/0x50
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052aa92>] kernel_init+0x62/0xb0
 [<c0104f83>] kernel_thread_helper+0x7/0x14
 =======================
Module 'usbcore' failed to be added to sysfs, error number -17
The system will be unstable now.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 kernel/params.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/kernel/params.c
+++ b/kernel/params.c
@@ -591,11 +591,17 @@ static void __init param_sysfs_builtin(v
 
 	for (i=0; i < __stop___param - __start___param; i++) {
 		char *dot;
+		size_t kplen;
 
 		kp = &__start___param[i];
+		kplen = strlen(kp->name);
 
 		/* We do not handle args without periods. */
-		dot = memchr(kp->name, '.', MAX_KBUILD_MODNAME);
+		if (kplen > MAX_KBUILD_MODNAME) {
+			DEBUGP("kernel parameter name is too long: %s\n", kp->name);
+			continue;
+		}
+		dot = memchr(kp->name, '.', kplen);
 		if (!dot) {
 			DEBUGP("couldn't find period in %s\n", kp->name);
 			continue;

-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [patch 7/9] x86: fix global_flush_tlb() bug
  2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
                     ` (5 preceding siblings ...)
  2007-11-02 17:38   ` [patch 6/9] param_sysfs_builtin memchr argument fix Greg KH
@ 2007-11-02 17:38   ` Greg KH
  2007-11-02 17:38   ` [patch 8/9] dm snapshot: fix invalidation deadlock Greg KH
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:38 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
	Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
	Chuck Ebbert, Domenico Andreoli, torvalds, akpm, alan,
	Ingo Molnar, Thomas Gleixner, Andi Kleen

[-- Attachment #1: x86-fix-global_flush_tlb-bug.patch --]
[-- Type: text/plain, Size: 2186 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------
From: Ingo Molnar <mingo@elte.hu>

patch 9a24d04a3c26c223f22493492c5c9085b8773d4a upstream

While we were reviewing pageattr_32/64.c for unification,
Thomas Gleixner noticed the following serious SMP bug in
global_flush_tlb():

	down_read(&init_mm.mmap_sem);
	list_replace_init(&deferred_pages, &l);
	up_read(&init_mm.mmap_sem);

this is SMP-unsafe because list_replace_init() done on two CPUs in
parallel can corrupt the list.

This bug has been introduced about a year ago in the 64-bit tree:

       commit ea7322decb974a4a3e804f96a0201e893ff88ce3
       Author: Andi Kleen <ak@suse.de>
       Date:   Thu Dec 7 02:14:05 2006 +0100

       [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr

                down_read(&init_mm.mmap_sem);
        -       dpage = xchg(&deferred_pages, NULL);
        +       list_replace_init(&deferred_pages, &l);
                up_read(&init_mm.mmap_sem);

the xchg() based version was SMP-safe, but list_replace_init() is not.
So this "cleanup" introduced a nasty bug.

why this bug never become prominent is a mystery - it can probably be
explained with the (still) relative obscurity of the x86_64 architecture.

the safe fix for now is to write-lock init_mm.mmap_sem.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 arch/x86_64/mm/pageattr.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/arch/x86_64/mm/pageattr.c
+++ b/arch/x86_64/mm/pageattr.c
@@ -227,9 +227,14 @@ void global_flush_tlb(void)
 	struct page *pg, *next;
 	struct list_head l;
 
-	down_read(&init_mm.mmap_sem);
+	/*
+	 * Write-protect the semaphore, to exclude two contexts
+	 * doing a list_replace_init() call in parallel and to
+	 * exclude new additions to the deferred_pages list:
+	 */
+	down_write(&init_mm.mmap_sem);
 	list_replace_init(&deferred_pages, &l);
-	up_read(&init_mm.mmap_sem);
+	up_write(&init_mm.mmap_sem);
 
 	flush_map(&l);
 

-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [patch 8/9] dm snapshot: fix invalidation deadlock
  2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
                     ` (6 preceding siblings ...)
  2007-11-02 17:38   ` [patch 7/9] x86: fix global_flush_tlb() bug Greg KH
@ 2007-11-02 17:38   ` Greg KH
  2007-11-02 17:38   ` [patch 9/9] Revert "x86_64: allocate sparsemem memmap above 4G" Greg KH
  2007-11-02 17:41   ` [patch 0/9] 2.6.22-stable review Greg KH
  9 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:38 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
	Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
	Chuck Ebbert, Domenico Andreoli, torvalds, akpm, alan, Milan Broz,
	Alasdair G Kergon

[-- Attachment #1: dm-snapshot-fix-invalidation-deadlock.patch --]
[-- Type: text/plain, Size: 3859 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------
From: Milan Broz <mbroz@redhat.com>

patch fcac03abd325e4f7a4cc8fe05fea2793b1c8eb75 in mainline

Process persistent exception store metadata IOs in a separate thread.

A snapshot may become invalid while inside generic_make_request().
A synchronous write is then needed to update the metadata while still
inside that function.  Since the introduction of
md-dm-reduce-stack-usage-with-stacked-block-devices.patch this has to
be performed by a separate thread to avoid deadlock.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/md/dm-exception-store.c |   48 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 43 insertions(+), 5 deletions(-)

--- a/drivers/md/dm-exception-store.c
+++ b/drivers/md/dm-exception-store.c
@@ -125,6 +125,8 @@ struct pstore {
 	uint32_t callback_count;
 	struct commit_callback *callbacks;
 	struct dm_io_client *io_client;
+
+	struct workqueue_struct *metadata_wq;
 };
 
 static inline unsigned int sectors_to_pages(unsigned int sectors)
@@ -156,10 +158,24 @@ static void free_area(struct pstore *ps)
 	ps->area = NULL;
 }
 
+struct mdata_req {
+	struct io_region *where;
+	struct dm_io_request *io_req;
+	struct work_struct work;
+	int result;
+};
+
+static void do_metadata(struct work_struct *work)
+{
+	struct mdata_req *req = container_of(work, struct mdata_req, work);
+
+	req->result = dm_io(req->io_req, 1, req->where, NULL);
+}
+
 /*
  * Read or write a chunk aligned and sized block of data from a device.
  */
-static int chunk_io(struct pstore *ps, uint32_t chunk, int rw)
+static int chunk_io(struct pstore *ps, uint32_t chunk, int rw, int metadata)
 {
 	struct io_region where = {
 		.bdev = ps->snap->cow->bdev,
@@ -173,8 +189,23 @@ static int chunk_io(struct pstore *ps, u
 		.client = ps->io_client,
 		.notify.fn = NULL,
 	};
+	struct mdata_req req;
+
+	if (!metadata)
+		return dm_io(&io_req, 1, &where, NULL);
 
-	return dm_io(&io_req, 1, &where, NULL);
+	req.where = &where;
+	req.io_req = &io_req;
+
+	/*
+	 * Issue the synchronous I/O from a different thread
+	 * to avoid generic_make_request recursion.
+	 */
+	INIT_WORK(&req.work, do_metadata);
+	queue_work(ps->metadata_wq, &req.work);
+	flush_workqueue(ps->metadata_wq);
+
+	return req.result;
 }
 
 /*
@@ -189,7 +220,7 @@ static int area_io(struct pstore *ps, ui
 	/* convert a metadata area index to a chunk index */
 	chunk = 1 + ((ps->exceptions_per_area + 1) * area);
 
-	r = chunk_io(ps, chunk, rw);
+	r = chunk_io(ps, chunk, rw, 0);
 	if (r)
 		return r;
 
@@ -230,7 +261,7 @@ static int read_header(struct pstore *ps
 	if (r)
 		return r;
 
-	r = chunk_io(ps, 0, READ);
+	r = chunk_io(ps, 0, READ, 1);
 	if (r)
 		goto bad;
 
@@ -292,7 +323,7 @@ static int write_header(struct pstore *p
 	dh->version = cpu_to_le32(ps->version);
 	dh->chunk_size = cpu_to_le32(ps->snap->chunk_size);
 
-	return chunk_io(ps, 0, WRITE);
+	return chunk_io(ps, 0, WRITE, 1);
 }
 
 /*
@@ -409,6 +440,7 @@ static void persistent_destroy(struct ex
 {
 	struct pstore *ps = get_info(store);
 
+	destroy_workqueue(ps->metadata_wq);
 	dm_io_client_destroy(ps->io_client);
 	vfree(ps->callbacks);
 	free_area(ps);
@@ -589,6 +621,12 @@ int dm_create_persistent(struct exceptio
 	atomic_set(&ps->pending_count, 0);
 	ps->callbacks = NULL;
 
+	ps->metadata_wq = create_singlethread_workqueue("ksnaphd");
+	if (!ps->metadata_wq) {
+		DMERR("couldn't start header metadata update thread");
+		return -ENOMEM;
+	}
+
 	store->destroy = persistent_destroy;
 	store->read_metadata = persistent_read_metadata;
 	store->prepare_exception = persistent_prepare;

-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [patch 9/9] Revert "x86_64: allocate sparsemem memmap above 4G"
  2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
                     ` (7 preceding siblings ...)
  2007-11-02 17:38   ` [patch 8/9] dm snapshot: fix invalidation deadlock Greg KH
@ 2007-11-02 17:38   ` Greg KH
  2007-11-02 17:41   ` [patch 0/9] 2.6.22-stable review Greg KH
  9 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:38 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Justin Forbes, Zwane Mwaikambo, Theodore Ts'o, Randy Dunlap,
	Dave Jones, Chuck Wolber, Chris Wedgwood, Michael Krufky,
	Chuck Ebbert, Domenico Andreoli, torvalds, akpm, alan,
	Zou Nan hai, Suresh Siddha, Andy Whitcroft

[-- Attachment #1: revert-x86_64-allocate-sparsemem-memmap-above-4g.patch --]
[-- Type: text/plain, Size: 2798 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------
From: Linus Torvalds <torvalds@linux-foundation.org>

patch 6a22c57b8d2a62dea7280a6b2ac807a539ef0716 in mainline.

This reverts commit 2e1c49db4c640b35df13889b86b9d62215ade4b6.

First off, testing in Fedora has shown it to cause boot failures,
bisected down by Martin Ebourne, and reported by Dave Jobes.  So the
commit will likely be reverted in the 2.6.23 stable kernels.

Secondly, in the 2.6.24 model, x86-64 has now grown support for
SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the
bug is not visible any more, it's become invisible due to the code just
being irrelevant and no longer enabled on the only architecture that
this ever affected.

backported to 2.6.22 by Chuck Ebbert

Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Martin Ebourne <fedora@ebourne.me.uk>
Cc: Zou Nan hai <nanhai.zou@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


---
 arch/x86_64/mm/init.c   |    5 -----
 include/linux/bootmem.h |    1 -
 mm/sparse.c             |   11 -----------
 3 files changed, 17 deletions(-)

--- a/arch/x86_64/mm/init.c
+++ b/arch/x86_64/mm/init.c
@@ -769,8 +769,3 @@ int in_gate_area_no_task(unsigned long a
 	return (addr >= VSYSCALL_START) && (addr < VSYSCALL_END);
 }
 
-void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
-{
-	return __alloc_bootmem_core(pgdat->bdata, size,
-			SMP_CACHE_BYTES, (4UL*1024*1024*1024), 0);
-}
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -59,7 +59,6 @@ extern void *__alloc_bootmem_core(struct
 				  unsigned long align,
 				  unsigned long goal,
 				  unsigned long limit);
-extern void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size);
 
 #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
 extern void reserve_bootmem(unsigned long addr, unsigned long size);
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -209,12 +209,6 @@ static int __meminit sparse_init_one_sec
 	return 1;
 }
 
-__attribute__((weak))
-void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
-{
-	return NULL;
-}
-
 static struct page __init *sparse_early_mem_map_alloc(unsigned long pnum)
 {
 	struct page *map;
@@ -225,11 +219,6 @@ static struct page __init *sparse_early_
 	if (map)
 		return map;
 
-  	map = alloc_bootmem_high_node(NODE_DATA(nid),
-                       sizeof(struct page) * PAGES_PER_SECTION);
-	if (map)
-		return map;
-
 	map = alloc_bootmem_node(NODE_DATA(nid),
 			sizeof(struct page) * PAGES_PER_SECTION);
 	if (map)

-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 0/9] 2.6.22-stable review
  2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
                     ` (8 preceding siblings ...)
  2007-11-02 17:38   ` [patch 9/9] Revert "x86_64: allocate sparsemem memmap above 4G" Greg KH
@ 2007-11-02 17:41   ` Greg KH
  9 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2007-11-02 17:41 UTC (permalink / raw)
  To: linux-kernel, stable, Justin Forbes, Zwane Mwaikambo,
	Theodore Ts'o, Randy Dunlap, Dave Jones, Chuck Wolber,
	Chris Wedgwood, Michael Krufky, Chuck Ebbert, Domenico Andreoli,
	torvalds, akpm, alan

On Fri, Nov 02, 2007 at 10:37:33AM -0700, Greg KH wrote:
> This is the start of the stable review cycle for the 2.6.22.12 release.
> There are 9 patches in this series, all will be posted as a response to
> this one.  If anyone has any issues with these being applied, please let
> us know.  If anyone is a maintainer of the proper subsystem, and wants
> to add a Signed-off-by: line to the patch, please respond with it.
> 
> This should be the last 2.6.22-stable release unless we missed something
> major and people complain loudly that we need to do another release.

The full patch all in one tarball can be found at:
	kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.22.12-rc1.gz

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2007-11-02 18:27 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20071102173359.709442489@mini.kroah.org>
2007-11-02 17:37 ` [patch 0/9] 2.6.22-stable review Greg KH
2007-11-02 17:37   ` [patch 1/9] genirq: cleanup mismerge artifact Greg KH
2007-11-02 17:37   ` [patch 2/9] genirq: suppress resend of level interrupts Greg KH
2007-11-02 17:37   ` [patch 3/9] genirq: mark io_apic level interrupts to avoid resend Greg KH
2007-11-02 17:37   ` [patch 4/9] IB/uverbs: Fix checking of userspace object ownership Greg KH
2007-11-02 17:38   ` [patch 5/9] minixfs: limit minixfs printks on corrupted dir i_size (CVE-2006-6058) Greg KH
2007-11-02 17:38   ` [patch 6/9] param_sysfs_builtin memchr argument fix Greg KH
2007-11-02 17:38   ` [patch 7/9] x86: fix global_flush_tlb() bug Greg KH
2007-11-02 17:38   ` [patch 8/9] dm snapshot: fix invalidation deadlock Greg KH
2007-11-02 17:38   ` [patch 9/9] Revert "x86_64: allocate sparsemem memmap above 4G" Greg KH
2007-11-02 17:41   ` [patch 0/9] 2.6.22-stable review Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox