public inbox for linux-kernel@vger.kernel.org
* Linux 2.6.19-rc6
@ 2006-11-16  4:21 Linus Torvalds
  2006-11-16 21:37 ` 2.6.19-rc6: known regressions Adrian Bunk
                   ` (5 more replies)
  0 siblings, 6 replies; 36+ messages in thread
From: Linus Torvalds @ 2006-11-16  4:21 UTC (permalink / raw)
  To: Linux Kernel Mailing List


Ok,
 there's nothing earth-shattering here (and there shouldn't be), but we've 
hopefully made good progress on the regression list (and thanks again to 
Adrian Bunk for reminding people, especially when they thought *cough* 
that some particular regression had already been fixed).

So with -rc6, we hopefully should leave the irq-related regressions behind 
us. There were issues both with devices that started enabling MSI (which 
seem to trigger hardware bugs, although there's also been discussion about 
what we should do to make things safer) and with the new genirq layer that 
showed problems with edge-triggered IRQs (notably legacy ISA interrupts,
or, more commonly these days, the 16-bit PCMCIA interrupts that are
basically just ISA in another form factor).

Thanks to everybody involved in whittling down that regression list.

Also, apart from the regression tracking, we've had some other updates, e.g.
infiniband and DVB fixes, network driver fixes, some networking fixes, etc.

The ShortLog is appended, and gives a mostly readable picture of what has 
been going on. But the main thing to take away is: regressions fixed, and 
not a whole lot of changes since -rc5 (it may not look that way, but a lot 
of these things are essentially one-liners or close to it, so the total 
diff between -rc5 and -rc6 is actually just about 5k lines, which is not 
a whole lot, considering).

		Linus

---
Aaron Durbin (1):
      x86-64: Fix partial page check to ensure unusable memory is not being marked usable.

Adrian Bunk (2):
      bcm43xx: Add error checking in bcm43xx_sprom_write()
      drivers/telephony/ixj: fix an array overrun

Alan Cox (1):
      hpt37x: Check the enablebits

Alan Stern (1):
      SCSI core: always store >= 36 bytes of INQUIRY data

Alasdair G Kergon (2):
      dm: fix find_device race
      dm: suspend: fix error path

Alexey Dobriyan (4):
      ipmi_si_intf.c: fix "&& 0xff" typos
      V4L/DVB (4795): Tda826x: use correct max frequency
      V4L/DVB (4818): Flexcop-usb: fix debug printk
      pata_artop: fix "& (1 >>" typo

Andi Kleen (6):
      Revert "MMCONFIG and new Intel motherboards"
      x86-64: Fix PTRACE_[SG]ET_THREAD_AREA regression with ia32 emulation.
      x86-64: Handle reserve_bootmem_generic beyond end_pfn
      x86: Add acpi_user_timer_override option for Asus boards
      x86-64: Fix vgetcpu when CONFIG_HOTPLUG_CPU is disabled
      x86-64: Fix race in exit_idle

Andrew Morton (2):
      setup_irq(): better mismatch debugging
      revert "PCI: quirk for IBM Dock II cardbus controllers"

Arjan van de Ven (1):
      Regression in 2.6.19-rc microcode driver

Benjamin Herrenschmidt (2):
      [POWERPC] Fix cell "new style" mapping and add debug
      powerpc: windfarm shall request it's sub modules

Brian King (1):
      libata: Convert from module_init to subsys_initcall

Bryan O'Sullivan (1):
      IB/ipath - program intconfig register using new HT irq hook

Chris Lalancette (1):
      [NETPOLL]: Compute checksum properly in netpoll_send_udp().

Corey Minyard (3):
      IPMI: Clean up the waiting message queue properly on unload
      IPMI: retry messages on certain error returns
      IPMI: Fix more && typos

Daniel Ritz (1):
      fix via586 irq routing for pirq 5

Darrick J. Wong (1):
      libata: fix double-completion on error

David Brownell (1):
      usb: MAINTAINERS updates

David Chinner (3):
      [XFS] Clean up i_flags and i_flags_lock handling.
      [XFS] Prevent a deadlock when xfslogd unpins inodes.
      [XFS] Remove KERNEL_VERSION macros from xfs_dmapi.h

David Gibson (1):
      hugetlb: check for brk() entering a hugepage region

David Miller (1):
      pci: don't try to remove sysfs files before they are setup.

David Rientjes (1):
      drivers cris: return on NULL dev_alloc_skb()

Eric Dumazet (1):
      vmalloc: optimization, cleanup, bugfixes

Eric W. Biederman (4):
      sysctl: Undeprecate sys_sysctl
      htirq: refactor so we only have one function that writes to the chip
      htirq: allow buggy drivers of buggy hardware to write the registers
      Use delayed disable mode of ioapic edge triggered interrupts

Franck Bui-Huu (1):
      .gitignore: add miscellaneous files

Geoff Levand (1):
      [POWERPC] cell: set ARCH_SPARSEMEM_DEFAULT in Kconfig

Herbert Xu (1):
      [NET]: Set truesize in pskb_copy

Hermann Pitton (1):
      V4L/DVB (4802): Cx88: fix remote control on WinFast 2000XP Expert

Hoang-Nam Nguyen (3):
      IB/ehca: Assure 4K alignment for firmware control blocks
      IB/ehca: Use named constant for max mtu
      IB/ehca: Activate scaling code by default

Hugh Dickins (2):
      hugetlb: prepare_hugepage_range check offset too
      hugetlb: fix error return for brk() entering a hugepage region

Ian Kent (1):
      autofs4: panic after mount fail

J. Bruce Fields (3):
      nfsd4: reindent do_open_lookup()
      nfsd4: fix open-create permissions
      nfsd: fix spurious error return from nfsd_create in async case

Jean Delvare (2):
      V4L/DVB (4817): Fix uses of "&&" where "&" was intended
      RDMA/amso1100: Fix && typo

Jeff Garzik (1):
      [libata] sata_via: fix obvious typo

Jens Axboe (4):
      Fix bad data direction in SG_IO
      ide-cd: only set rq->errors SCSI style for block pc requests
      cciss: fix iostat
      cpqarray: fix iostat

Jes Sorensen (1):
      mspec driver build fix

Jiri Slaby (2):
      [NET]: kconfig, correct traffic shaper
      Char: isicom, fix close bug

John Heffner (1):
      [TCP]: Don't use highmem in tcp hash size calculation.

John Rose (1):
      [POWERPC] pseries: Force 4k update_flash block and list sizes

Jonathan E Brassow (2):
      dm: multipath: fix rr_add_path order
      dm: raid1: fix waiting for io on suspend

Julian Anastasov (1):
      [IPVS]: More endianness fixed.

Kalle Pokki (2):
      [POWERPC] CPM_UART: Fix non-console transmit
      [POWERPC] CPM_UART: Fix non-console initialisation

KAMEZAWA Hiroyuki (1):
      ia64: select ACPI_NUMA if ACPI

Linus Torvalds (6):
      Revert "i386: Add MMCFG resources to i386 too"
      x86-64: clean up io-apic accesses
      x86-64: write IO APIC irq routing entries in correct order
      [dvb saa7134] Fix missing 'break' for avermedia card case
      Revert "fix Data Acess error in dup_fd"
      Linux 2.6.19-rc6

Magnus Damm (1):
      x86-64: setup saved_max_pfn correctly (kdump)

Masami Hiramatsu (1):
      kretprobe: fix kretprobe-booster to save regs and set status

Mauro Carvalho Chehab (1):
      V4L/DVB (4804): Fix missing i2c dependency for saa7110

Michael Buesch (1):
      bcm43xx: Drain TX status before starting IRQs

Michael Chan (1):
      [TG3]: Fix array overrun in tg3_read_partno().

Nathan Lynch (1):
      nvidiafb: fix unreachable code in nv10GetConfig

NeilBrown (2):
      md: change ONLINE/OFFLINE events to a single CHANGE event
      md: fix sizing problem with raid5-reshape and CONFIG_LBD=n

Nicolas Kaiser (1):
      drivers/ide: stray bracket

Oleg Nesterov (1):
      A minor fix for set_mb() in Documentation/memory-barriers.txt

pasky@ucw.cz (3):
      V4L/DVB (4814): Remote support for Avermedia 777
      V4L/DVB (4815): Remote support for Avermedia A16AR
      V4L/DVB (4816): Change tuner type for Avermedia A16AR

Paul Mackerras (1):
      [POWERPC] Make sure initrd and dtb sections get into zImage correctly

Pavel Emelianov (1):
      Fix misrouted interrupts deadlocks

Peter Zijlstra (1):
      bonding: lockdep annotation

Rafael J. Wysocki (1):
      md: do not freeze md threads for suspend

Randy Dunlap (1):
      com20020 build fix

Roland Dreier (1):
      IB/mad: Fix race between cancel and receive completion

Russell King (1):
      Fix missing parens in set_personality()

Sharyathi Nagesh (1):
      fix Data Acess error in dup_fd

Simon Horman (1):
      [IPVS]: Compile fix for annotations in userland.

Stephen Hemminger (1):
      [PKT_SCHED] sch_htb: Use hlist_del_init().

Stephen Rothwell (2):
      [POWERPC] Add the thread_siblings files to sysfs
      [POWERPC] Wire up sys_move_pages

Steve French (4):
      [CIFS] NFS stress test generates flood of "close with pending write" messages
      [CIFS] Explicitly set stat->blksize
      [CIFS]  Fix mount failure when domain not specified
      [CIFS] Fix minor problem with previous patch

Steven Rostedt (1):
      x86-64: shorten the x86_64 boot setup GDT to what the comment says

Steven Whitehouse (1):
      [DECNET]: Endianess fixes (try #2)

Takashi Iwai (1):
      ALSA: hda-intel - Disable MSI support by default

Tigran Aivazian (1):
      Tigran has moved

Tim Shimmin (1):
      [XFS] Keep lockdep happy.

Timo Teras (2):
      MMC: Poll card status after rescanning cards
      MMC: Do not set unsupported bits in OCR response

Tom Tucker (1):
      RDMA/amso1100: Fix unitialized pseudo_netdev accessed in c2_register_device

Vivek Goyal (1):
      i386: Force data segment to be 4K aligned

Vlad Apostolov (3):
      [XFS] 956618: Linux crashes on boot with XFS-DMAPI filesystem when
      [XFS] rename uio_read() to xfs_uio_read()
      [XFS] 956664: dm_read_invis() changes i_atime

Wink Saville (1):
      Patch for nvidia divide by zero error for 7600 pci-express card
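
Several shortlog entries above fix single-character operator typos, judging
by their titles (ipmi_si_intf.c "&& 0xff", V4L/DVB "&&" where "&" was
intended, pata_artop "& (1 >>"). The following is only a minimal user-space
sketch of why such typos matter. The helper names are hypothetical and are
not taken from any of those drivers, and the assumption that "<<" was the
intended operator in the pata_artop case is a guess from the title alone.

```c
/* Hypothetical helpers for illustration only; none of this is kernel code. */

/* Buggy: "&&" is logical AND, so the result is only ever 0 or 1. */
static unsigned int low_byte_buggy(unsigned int v)
{
	return v && 0xff;
}

/* Fixed: "&" is bitwise AND and really masks out the low 8 bits. */
static unsigned int low_byte_fixed(unsigned int v)
{
	return v & 0xff;
}

/*
 * Buggy: "1u >> n" is 0 for every n >= 1, so the test can only ever
 * succeed for bit 0. Callers must pass n < 32 in both variants.
 */
static int bit_is_set_buggy(unsigned int v, unsigned int n)
{
	return (v & (1u >> n)) != 0;
}

/* Fixed (assumed intent): "1u << n" builds the mask for bit n. */
static int bit_is_set_fixed(unsigned int v, unsigned int n)
{
	return (v & (1u << n)) != 0;
}
```

With v = 0x1234 the buggy mask helper returns 1 while the fixed one returns
0x34. Both variants compile cleanly with many compilers, which is one reason
this class of bug tends to survive until someone greps for it.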



* 2.6.19-rc6: known regressions
  2006-11-16  4:21 Linux 2.6.19-rc6 Linus Torvalds
@ 2006-11-16 21:37 ` Adrian Bunk
  2006-11-16 21:43   ` Greg KH
  2006-11-17 20:40 ` 2.6.19-rc6: known regressions (v2) Adrian Bunk
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 36+ messages in thread
From: Adrian Bunk @ 2006-11-16 21:37 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton
  Cc: Linux Kernel Mailing List, Ray Lee, Michael Buesch, Larry Finger,
	st3, linville, netdev, David Brownell, Len Brown,
	Alexey Starikovskiy, linux-acpi, Ernst Herzberg, Ingo Molnar,
	Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el,
	oprofile-list, Dennis Stosberg, Greg Kroah-Hartman, ecashin,
	Andrey Borzenkov, Alan Stern, linux-usb-devel

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, the maintainer of an affected subsystem or driver, the author of
a patch that caused a breakage, or someone I consider possibly involved in
some other way with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : bcm43xx: serious problems
References : http://lkml.org/lkml/2006/11/15/296
Submitter  : Ray Lee <ray-lk@madrabbit.org>
Handled-By : Michael Buesch <mb@bu3sch.de>
             Larry Finger <Larry.Finger@lwfinger.net>
Status     : problem is being debugged


Subject    : nasty ACPI regression, AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter  : David Brownell <david-b@pacbell.net>
Handled-By : Len Brown <len.brown@intel.com>
             Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Status     : problem is being debugged


Subject    : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter  : Ernst Herzberg <earny@net4u.de>
Handled-By : Len Brown <len.brown@intel.com>
Status     : problem is being debugged


Subject    : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
             http://lkml.org/lkml/2006/11/10/208
Submitter  : Andre Noll <maan@systemlinux.org>
Handled-By : Andi Kleen <ak@suse.de>
Status     : Andi is investigating


Subject    : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
             http://lkml.org/lkml/2006/11/15/92
Submitter  : Prakash Punnoor <prakash@punnoor.de>
Status     : problem is being discussed


Subject    : x86_64 UP compile error
References : http://lkml.org/lkml/2006/11/16/29
Submitter  : Ingo Molnar <mingo@elte.hu>
Caused-By  : Andi Kleen <ak@suse.de>
             commit 8c131af1db510793f87dc43edbc8950a35370df3
Handled-By : Andi Kleen <ak@suse.de>
             Ingo Molnar <mingo@elte.hu>
Patch      : http://lkml.org/lkml/2006/11/16/36
Status     : patch available


Subject    : aoe: Add forgotten NULL at end of attribute list in aoeblk.c
References : http://lkml.org/lkml/2006/11/13/26
Submitter  : Dennis Stosberg <dennis@stosberg.net>
Caused-By  : Greg Kroah-Hartman <gregkh@suse.de>
             commit 4ca5224f3ea4779054d96e885ca9b3980801ce13
Handled-By : Dennis Stosberg <dennis@stosberg.net>
Patch      : http://lkml.org/lkml/2006/11/13/26
Status     : patch available


Subject    : can't disable OHCI wakeup via sysfs
References : http://lkml.org/lkml/2006/11/11/33
Submitter  : Andrey Borzenkov <arvidjaar@mail.ru>
Handled-By : Alan Stern <stern@rowland.harvard.edu>
Patch      : http://lkml.org/lkml/2006/11/13/261
Status     : patch available



* Re: 2.6.19-rc6: known regressions
  2006-11-16 21:37 ` 2.6.19-rc6: known regressions Adrian Bunk
@ 2006-11-16 21:43   ` Greg KH
  0 siblings, 0 replies; 36+ messages in thread
From: Greg KH @ 2006-11-16 21:43 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, Ray Lee,
	Michael Buesch, Larry Finger, st3, linville, netdev,
	David Brownell, Len Brown, Alexey Starikovskiy, linux-acpi,
	Ernst Herzberg, Ingo Molnar, Andre Noll, Andi Kleen, discuss,
	Prakash Punnoor, phil.el, oprofile-list, Dennis Stosberg, ecashin,
	Andrey Borzenkov, Alan Stern, linux-usb-devel

On Thu, Nov 16, 2006 at 10:37:18PM +0100, Adrian Bunk wrote:
> Subject    : aoe: Add forgotten NULL at end of attribute list in aoeblk.c
> References : http://lkml.org/lkml/2006/11/13/26
> Submitter  : Dennis Stosberg <dennis@stosberg.net>
> Caused-By  : Greg Kroah-Hartman <gregkh@suse.de>
>              commit 4ca5224f3ea4779054d96e885ca9b3980801ce13
> Handled-By : Dennis Stosberg <dennis@stosberg.net>
> Patch      : http://lkml.org/lkml/2006/11/13/26
> Status     : patch available
> 
> 
> Subject    : can't disable OHCI wakeup via sysfs
> References : http://lkml.org/lkml/2006/11/11/33
> Submitter  : Andrey Borzenkov <arvidjaar@mail.ru>
> Handled-By : Alan Stern <stern@rowland.harvard.edu>
> Patch      : http://lkml.org/lkml/2006/11/13/261
> Status     : patch available

I'll be sending Linus both of these patches later today.

thanks,

greg k-h


* 2.6.19-rc6: known regressions (v2)
  2006-11-16  4:21 Linux 2.6.19-rc6 Linus Torvalds
  2006-11-16 21:37 ` 2.6.19-rc6: known regressions Adrian Bunk
@ 2006-11-17 20:40 ` Adrian Bunk
  2006-11-18  8:02   ` [PATCH] mm: do not call bad_page on PG_reserved check David Rientjes
  2006-11-18  4:04 ` Linux 2.6.19-rc6 - NFSD working again Christian Kujau
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 36+ messages in thread
From: Adrian Bunk @ 2006-11-17 20:40 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton
  Cc: Linux Kernel Mailing List, Thomas Gleixner, Alan Stern,
	Ingo Molnar, davej, cpufreq, Alexey Starikovskiy, Mattia Dongili,
	Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el,
	oprofile-list, Ray Lee, Michael Buesch, Larry Finger, st3,
	linville, netdev, David Brownell, Len Brown, linux-acpi,
	Ernst Herzberg

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, the maintainer of an affected subsystem or driver, the author of
a patch that caused a breakage, or someone I consider possibly involved in
some other way with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : cpufreq notification broken
References : http://lkml.org/lkml/2006/11/16/177
Submitter  : Thomas Gleixner <tglx@timesys.com>
Caused-By  : Alan Stern <stern@rowland.harvard.edu>
             commit b4dfdbb3c707474a2254c5b4d7e62be31a4b7da9
Handled-By : Ingo Molnar <mingo@elte.hu>
             Linus Torvalds <torvalds@osdl.org>
Status     : patches are being discussed


Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
References : http://lkml.org/lkml/2006/11/17/198
Submitter  : alex1000@comcast.net
Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
             commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
Handled-By : Mattia Dongili <malattia@linux.it>
Patch      : http://lkml.org/lkml/2006/11/17/236
Status     : patch available


Subject    : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
             http://lkml.org/lkml/2006/11/10/208
Submitter  : Andre Noll <maan@systemlinux.org>
Handled-By : Andi Kleen <ak@suse.de>
Status     : Andi is investigating


Subject    : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
             http://lkml.org/lkml/2006/11/15/92
Submitter  : Prakash Punnoor <prakash@punnoor.de>
Status     : problem is being discussed


Subject    : bcm43xx: serious problems
References : http://lkml.org/lkml/2006/11/15/296
Submitter  : Ray Lee <ray-lk@madrabbit.org>
Handled-By : Michael Buesch <mb@bu3sch.de>
             Larry Finger <Larry.Finger@lwfinger.net>
Status     : problem is being debugged


Subject    : nasty ACPI regression, AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter  : David Brownell <david-b@pacbell.net>
Handled-By : Len Brown <len.brown@intel.com>
             Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Status     : problem is being debugged


Subject    : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter  : Ernst Herzberg <earny@net4u.de>
Handled-By : Len Brown <len.brown@intel.com>
Status     : problem is being debugged



* Re: Linux 2.6.19-rc6 - NFSD working again
  2006-11-16  4:21 Linux 2.6.19-rc6 Linus Torvalds
  2006-11-16 21:37 ` 2.6.19-rc6: known regressions Adrian Bunk
  2006-11-17 20:40 ` 2.6.19-rc6: known regressions (v2) Adrian Bunk
@ 2006-11-18  4:04 ` Christian Kujau
  2006-11-20 19:53 ` 2.6.19-rc6: known regressions (v3) Adrian Bunk
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 36+ messages in thread
From: Christian Kujau @ 2006-11-18  4:04 UTC (permalink / raw)
  To: neilb; +Cc: Linux Kernel Mailing List

Hi,

I just wanted to report an 'it works again' for -rc6: after encountering
the very same problems with -rc3 that Jeff Garzik described in [0], I
upgraded to -rc5 and applied the proposed[1] patch[2].
Now, the knfsd behaved a bit better (nfs-mounted /home, X11 
applications created thousands of empty 'configuration'-files),
however 'mkdir' and 'touch' still failed too often:

  $ mkdir /mnt/nfs/compile-farm/foo
  mkdir: /mnt/nfs/compile-farm/foo: Operation not permitted
  $ mkdir /mnt/nfs/compile-farm/foo
  mkdir: /mnt/nfs/compile-farm/foo: File exists

...and things like that.

With -rc6 this seems to be gone. However, I noticed this message in the 
server's (192.168.10.10) syslog:

nfs4_cb: server 127.0.1.1/192.168.10.10 AUTH_UNIX 0 not responding, timed out
nfs4_cb: server 127.0.1.1/192.168.10.10 AUTH_UNIX 0 not responding, timed out

The NFS server is running on 0.0.0.0:2049; what does this message mean?
It occurs once in a while. I'm not sure what triggers it, and I did not
find much in the archives...

Thanks,
Christian.

[0] http://uwsg.iu.edu/hypermail/linux/kernel/0611.0/1418.html
[1] http://uwsg.iu.edu/hypermail/linux/kernel/0611.0/1491.html
[2] http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.19-rc3-2/linux-2.6.19-rc3-CITI_NFS4_ALL-2.diff
-- 
BOFH excuse #106:

The electrician didn't know what the yellow cable was so he yanked the ethernet out.


* [PATCH] mm: do not call bad_page on PG_reserved check
  2006-11-17 20:40 ` 2.6.19-rc6: known regressions (v2) Adrian Bunk
@ 2006-11-18  8:02   ` David Rientjes
  2006-11-18 13:37     ` Hugh Dickins
  0 siblings, 1 reply; 36+ messages in thread
From: David Rientjes @ 2006-11-18  8:02 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton
  Cc: Andi Kleen, Nick Piggin, Andre Noll, linux-kernel

The return value of free_pages_check() indicates if PG_reserved was set.
If so, the calling functions return immediately and no pages are freed so
there is no need to call bad_page().

Cc: Andi Kleen <ak@suse.de>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: David Rientjes <rientjes@cs.washington.edu>
---
 mm/page_alloc.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bf2f6cf..99bc29d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -439,7 +439,6 @@ static inline int free_pages_check(struc
 			1 << PG_slab	|
 			1 << PG_swapcache |
 			1 << PG_writeback |
-			1 << PG_reserved |
 			1 << PG_buddy ))))
 		bad_page(page);
 	if (PageDirty(page))


* Re: [PATCH] mm: do not call bad_page on PG_reserved check
  2006-11-18  8:02   ` [PATCH] mm: do not call bad_page on PG_reserved check David Rientjes
@ 2006-11-18 13:37     ` Hugh Dickins
  0 siblings, 0 replies; 36+ messages in thread
From: Hugh Dickins @ 2006-11-18 13:37 UTC (permalink / raw)
  To: David Rientjes
  Cc: Linus Torvalds, Andrew Morton, Andi Kleen, Nick Piggin,
	Andre Noll, linux-kernel

On Sat, 18 Nov 2006, David Rientjes wrote:

> The return value of free_pages_check() indicates if PG_reserved was set.
> If so, the calling functions return immediately and no pages are freed so
> there is no need to call bad_page().
> 
> Cc: Andi Kleen <ak@suse.de>
> Cc: Nick Piggin <npiggin@suse.de>
> Signed-off-by: David Rientjes <rientjes@cs.washington.edu>

NAK.  You're missing the point.  If an attempt is made to free a
reserved page, it implies that the page reference counting has
gone wrong: we want to hear about that (so call bad_page),
and we dare not reuse the page (so skip freeing it).

What might be a good change is to avoid freeing a page which meets
_any_ of the criteria for calling bad_page: I often wonder whether
to do that, alongside abandoning that hopeless page_mapcount BUG in
page_remove_rmap, which has almost(?) never helped lead us to any fix.

Hugh

> ---
>  mm/page_alloc.c |    1 -
>  1 files changed, 0 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index bf2f6cf..99bc29d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -439,7 +439,6 @@ static inline int free_pages_check(struc
>  			1 << PG_slab	|
>  			1 << PG_swapcache |
>  			1 << PG_writeback |
> -			1 << PG_reserved |
>  			1 << PG_buddy ))))
>  		bad_page(page);
>  	if (PageDirty(page))
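
The behaviour Hugh describes (report a reserved page via bad_page, and also
refuse to free it) can be illustrated with a small user-space model. This is
only a sketch under stated assumptions: the model_* names, the flag values
and the reduced flag set are hypothetical, and the real free_pages_check()
in mm/page_alloc.c performs further checks that are left out here.

```c
#include <stdio.h>

/*
 * Illustrative flag bits. The names and values are assumptions for this
 * model and do not match the kernel's real PG_* numbering.
 */
#define MODEL_PG_SLAB      (1UL << 0)
#define MODEL_PG_WRITEBACK (1UL << 1)
#define MODEL_PG_RESERVED  (1UL << 2)
#define MODEL_BAD_FLAGS \
	(MODEL_PG_SLAB | MODEL_PG_WRITEBACK | MODEL_PG_RESERVED)

struct model_page {
	unsigned long flags;
};

/* Counts how often the model reported a bad page. */
static int model_bad_page_reports;

/* Stand-in for bad_page(): make the problem visible. */
static void model_bad_page(const struct model_page *page)
{
	model_bad_page_reports++;
	fprintf(stderr, "bad page state: flags=0x%lx\n", page->flags);
}

/*
 * Stand-in for free_pages_check(): complain about any unexpected flag,
 * reserved included, and return nonzero if the page is reserved.
 */
static int model_free_pages_check(const struct model_page *page)
{
	if (page->flags & MODEL_BAD_FLAGS)
		model_bad_page(page);
	return (page->flags & MODEL_PG_RESERVED) != 0;
}

/* Stand-in for a caller: returns 1 if the page was freed, 0 if skipped. */
static int model_free_page(const struct model_page *page)
{
	if (model_free_pages_check(page))
		return 0;	/* reserved: reported above, never reused */
	return 1;		/* page goes back to the allocator */
}
```

In this model a reserved page is both reported and skipped, which is the
combination Hugh argues for. A page with only the slab bit set is reported
but still freed, which corresponds to the case his second paragraph
suggests might be worth changing.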


* 2.6.19-rc6: known regressions (v3)
  2006-11-16  4:21 Linux 2.6.19-rc6 Linus Torvalds
                   ` (2 preceding siblings ...)
  2006-11-18  4:04 ` Linux 2.6.19-rc6 - NFSD working again Christian Kujau
@ 2006-11-20 19:53 ` Adrian Bunk
  2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
  2006-11-23  0:54 ` 2.6.19-rc6: known regressions with patches available Adrian Bunk
  5 siblings, 0 replies; 36+ messages in thread
From: Adrian Bunk @ 2006-11-20 19:53 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton
  Cc: Linux Kernel Mailing List, Vivek Goyal, Andre Noll,
	David Rientjes, ak, discuss, Prakash Punnoor, phil.el,
	oprofile-list, Thomas Gleixner, Alan Stern, Ingo Molnar,
	Oleg Nesterov, Paul E. McKenney, davej, cpufreq,
	Alexey Starikovskiy, Mattia Dongili, David Brownell, Len Brown,
	linux-acpi, Ernst Herzberg

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, the maintainer of an affected subsystem or driver, the author of
a patch that caused a breakage, or someone I consider possibly involved in
some other way with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : kernel hangs when booting with irqpoll
References : http://lkml.org/lkml/2006/11/20/233
Submitter  : Vivek Goyal <vgoyal@in.ibm.com>
Status     : unknown


Subject    : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
             http://lkml.org/lkml/2006/11/10/208
Submitter  : Andre Noll <maan@systemlinux.org>
Handled-By : David Rientjes <rientjes@cs.washington.edu>
Status     : problem is being debugged


Subject    : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
             http://lkml.org/lkml/2006/11/15/92
Submitter  : Prakash Punnoor <prakash@punnoor.de>
Status     : problem is being discussed


Subject    : cpufreq notification broken
References : http://lkml.org/lkml/2006/11/16/177
Submitter  : Thomas Gleixner <tglx@timesys.com>
Caused-By  : Alan Stern <stern@rowland.harvard.edu>
             commit b4dfdbb3c707474a2254c5b4d7e62be31a4b7da9
Handled-By : Ingo Molnar <mingo@elte.hu>
             Linus Torvalds <torvalds@osdl.org>
             Oleg Nesterov <oleg@tv-sign.ru>
             Paul E. McKenney <paulmck@us.ibm.com>
Status     : patches are being discussed


Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
References : http://lkml.org/lkml/2006/11/17/198
Submitter  : alex1000@comcast.net
Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
             commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
Handled-By : Mattia Dongili <malattia@linux.it>
Patch      : http://lkml.org/lkml/2006/11/17/236
Status     : patch available


Subject    : nasty ACPI regression, AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter  : David Brownell <david-b@pacbell.net>
Handled-By : Len Brown <len.brown@intel.com>
             Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Status     : problem is being debugged


Subject    : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter  : Ernst Herzberg <earny@net4u.de>
Handled-By : Len Brown <len.brown@intel.com>
Status     : problem is being debugged




* 2.6.19-rc6: known regressions (v4)
  2006-11-16  4:21 Linux 2.6.19-rc6 Linus Torvalds
                   ` (3 preceding siblings ...)
  2006-11-20 19:53 ` 2.6.19-rc6: known regressions (v3) Adrian Bunk
@ 2006-11-21 21:24 ` Adrian Bunk
  2006-11-21 21:31   ` [discuss] " Dave Jones
                     ` (3 more replies)
  2006-11-23  0:54 ` 2.6.19-rc6: known regressions with patches available Adrian Bunk
  5 siblings, 4 replies; 36+ messages in thread
From: Adrian Bunk @ 2006-11-21 21:24 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton
  Cc: Linux Kernel Mailing List, Vivek Goyal, Pavel Emelianov,
	Andre Noll, David Rientjes, ak, discuss, Prakash Punnoor, phil.el,
	oprofile-list, David Brownell, Len Brown, Alexey Starikovskiy,
	linux-acpi, Ernst Herzberg, Kumar Gala, Joakim Tjernlund,
	Kim Phillips, paulus, linuxppc-dev, a.zummo, Randy Dunlap,
	Roman Zippel, Phil Oester, Sam Ravnborg, Mattia Dongili, davej,
	cpufreq

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, the maintainer of an affected subsystem or driver, the author of
a patch that caused a breakage, or someone I consider possibly involved in
some other way with one or more of these issues.

Due to the large number of recipients, please trim the Cc when answering.


Subject    : kernel hangs when booting with irqpoll
References : http://lkml.org/lkml/2006/11/20/233
Submitter  : Vivek Goyal <vgoyal@in.ibm.com>
Caused-By  : Pavel Emelianov <xemul@openvz.org>
             commit f72fa707604c015a6625e80f269506032d5430dc
Handled-By : Vivek Goyal <vgoyal@in.ibm.com>
Status     : problem is being debugged


Subject    : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
             http://lkml.org/lkml/2006/11/10/208
Submitter  : Andre Noll <maan@systemlinux.org>
Handled-By : David Rientjes <rientjes@cs.washington.edu>
Status     : problem is being debugged


Subject    : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
             http://lkml.org/lkml/2006/11/15/92
Submitter  : Prakash Punnoor <prakash@punnoor.de>
Status     : problem is being discussed


Subject    : ACPI: AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter  : David Brownell <david-b@pacbell.net>
Handled-By : Len Brown <len.brown@intel.com>
             Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Status     : problem is being debugged


Subject    : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter  : Ernst Herzberg <earny@net4u.de>
Handled-By : Len Brown <len.brown@intel.com>
Status     : problem is being debugged


Subject    : powerpc: serious RTC problems
References : http://lkml.org/lkml/2006/11/17/187
             http://lkml.org/lkml/2006/11/18/99
Submitter  : Kumar Gala <galak@kernel.crashing.org>
             Joakim Tjernlund <joakim.tjernlund@transmode.se>
Caused-By  : Kim Phillips <kim.phillips@freescale.com>
             commit 7a69af63e788a324d162201a0b23df41bcf158dd
             commit a8ed4f7ec3aa472134d7de6176f823b2667e450b
Handled-By : David Brownell <david-b@pacbell.net>
             Kim Phillips <kim.phillips@freescale.com>
Patch      : http://lkml.org/lkml/2006/11/20/320
             http://lkml.org/lkml/2006/11/20/321
Status     : patches available


Subject    : xconfig crashes on x86_64
References : http://lkml.org/lkml/2006/11/19/177
Submitter  : Randy Dunlap <randy.dunlap@oracle.com>
Handled-By : Roman Zippel <zippel@linux-m68k.org>
Patch      : http://lkml.org/lkml/2006/11/20/340
Status     : patch available


Subject    : menuconfig problems with TERM=vt100
References : http://lkml.org/lkml/2006/11/13/369
Submitter  : Phil Oester <kernel@linuxace.com>
Caused-By  : Sam Ravnborg <sam@mars.ravnborg.org>
             commit 350b5b76384e77bcc58217f00455fdbec5cac594
Handled-By : Roman Zippel <zippel@linux-m68k.org>
Patch      : http://lkml.org/lkml/2006/11/20/341
Status     : patch available


Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
References : http://lkml.org/lkml/2006/11/17/198
Submitter  : alex1000@comcast.net
Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
             commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
Handled-By : Mattia Dongili <malattia@linux.it>
Patch      : http://lkml.org/lkml/2006/11/17/236
Status     : patch available




* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
@ 2006-11-21 21:31   ` Dave Jones
  2006-11-21 21:39     ` Adrian Bunk
  2006-11-21 21:33   ` Vivek Goyal
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 36+ messages in thread
From: Dave Jones @ 2006-11-21 21:31 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List,
	Mattia Dongili, cpufreq

On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:

 > Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
 > References : http://lkml.org/lkml/2006/11/17/198
 > Submitter  : alex1000@comcast.net
 > Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
 >              commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
 > Handled-By : Mattia Dongili <malattia@linux.it>
 > Patch      : http://lkml.org/lkml/2006/11/17/236
 > Status     : patch available

not a regression, easily worked around, queued for .20

		Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.19-rc6: known regressions (v4)
  2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
  2006-11-21 21:31   ` [discuss] " Dave Jones
@ 2006-11-21 21:33   ` Vivek Goyal
  2006-11-21 21:41     ` Adrian Bunk
  2006-11-21 22:18     ` Linus Torvalds
  2006-11-22 10:42   ` [discuss] " Andi Kleen
  2006-11-23  0:04   ` David Brownell
  3 siblings, 2 replies; 36+ messages in thread
From: Vivek Goyal @ 2006-11-21 21:33 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: linux kernel mailing list, Linus Torvalds, Morton Andrew Morton,
	Pavel Emelianov, mingo, dev

On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
> This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
> that are not yet fixed in Linus' tree.
> 
> If you find your name in the Cc header, you are either submitter of one
> of the bugs, maintainer of an affected subsystem or driver, a patch
> of yours caused a breakage or I'm considering you in any other way possibly
> involved with one or more of these issues.
> 
> Due to the huge amount of recipients, please trim the Cc when answering.
> 
> 
> Subject    : kernel hangs when booting with irqpoll
> References : http://lkml.org/lkml/2006/11/20/233
> Submitter  : Vivek Goyal <vgoyal@in.ibm.com>
> Caused-By  : Pavel Emelianov <xemul@openvz.org>
>              commit f72fa707604c015a6625e80f269506032d5430dc
> Handled-By : Vivek Goyal <vgoyal@in.ibm.com>
> Status     : problem is being debugged
> 

Adrian,

Pavel already provided a fix for this issue.

http://marc.theaimsgroup.com/?l=linux-kernel&m=116409933100117&w=2

Thanks
Vivek

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-21 21:31   ` [discuss] " Dave Jones
@ 2006-11-21 21:39     ` Adrian Bunk
  2006-11-21 21:56       ` Dave Jones
  0 siblings, 1 reply; 36+ messages in thread
From: Adrian Bunk @ 2006-11-21 21:39 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, Mattia Dongili, cpufreq

On Tue, Nov 21, 2006 at 04:31:39PM -0500, Dave Jones wrote:
> On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
> 
>  > Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
>  > References : http://lkml.org/lkml/2006/11/17/198
>  > Submitter  : alex1000@comcast.net
>  > Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
>  >              commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
>  > Handled-By : Mattia Dongili <malattia@linux.it>
>  > Patch      : http://lkml.org/lkml/2006/11/17/236
>  > Status     : patch available
> 
> not a regression, easily worked around, queued for .20

It is a regression since commit 05ca0350e8caa91a5ec9961c585c98005b6934ea 
was merged after 2.6.18.

Considering that the fix is trivial, why shouldn't it be merged before 
2.6.19?

> 		Dave

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.19-rc6: known regressions (v4)
  2006-11-21 21:33   ` Vivek Goyal
@ 2006-11-21 21:41     ` Adrian Bunk
  2006-11-21 22:18     ` Linus Torvalds
  1 sibling, 0 replies; 36+ messages in thread
From: Adrian Bunk @ 2006-11-21 21:41 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux kernel mailing list, Linus Torvalds, Morton Andrew Morton,
	Pavel Emelianov, mingo, dev

On Tue, Nov 21, 2006 at 04:33:35PM -0500, Vivek Goyal wrote:
> On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
> > This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
> > that are not yet fixed in Linus' tree.
> > 
> > If you find your name in the Cc header, you are either submitter of one
> > of the bugs, maintainer of an affected subsystem or driver, a patch
> > of yours caused a breakage or I'm considering you in any other way possibly
> > involved with one or more of these issues.
> > 
> > Due to the huge amount of recipients, please trim the Cc when answering.
> > 
> > 
> > Subject    : kernel hangs when booting with irqpoll
> > References : http://lkml.org/lkml/2006/11/20/233
> > Submitter  : Vivek Goyal <vgoyal@in.ibm.com>
> > Caused-By  : Pavel Emelianov <xemul@openvz.org>
> >              commit f72fa707604c015a6625e80f269506032d5430dc
> > Handled-By : Vivek Goyal <vgoyal@in.ibm.com>
> > Status     : problem is being debugged
> > 
> 
> Adrian,
> 
> Pavel already provided a fix for this issue.
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=116409933100117&w=2

Thanks for the information, I missed this patch.

> Thanks
> Vivek

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-21 21:39     ` Adrian Bunk
@ 2006-11-21 21:56       ` Dave Jones
  0 siblings, 0 replies; 36+ messages in thread
From: Dave Jones @ 2006-11-21 21:56 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List,
	Mattia Dongili, cpufreq

On Tue, Nov 21, 2006 at 10:39:00PM +0100, Adrian Bunk wrote:
 > On Tue, Nov 21, 2006 at 04:31:39PM -0500, Dave Jones wrote:
 > > On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
 > > 
 > >  > Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
 > >  > References : http://lkml.org/lkml/2006/11/17/198
 > >  > Submitter  : alex1000@comcast.net
 > >  > Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
 > >  >              commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
 > >  > Handled-By : Mattia Dongili <malattia@linux.it>
 > >  > Patch      : http://lkml.org/lkml/2006/11/17/236
 > >  > Status     : patch available
 > > 
 > > not a regression, easily worked around, queued for .20
 > 
 > It is a regression since commit 05ca0350e8caa91a5ec9961c585c98005b6934ea 
 > was merged after 2.6.18.

Ah, I misinterpreted when that cset went in (I read the commit date,
which was back in June, not the merge date, which was in September).

 > Considering that the fix is trivial, why shouldn't it be merged before 
 > 2.6.19?

Yes, I'll push it on.

		Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.19-rc6: known regressions (v4)
  2006-11-21 21:33   ` Vivek Goyal
  2006-11-21 21:41     ` Adrian Bunk
@ 2006-11-21 22:18     ` Linus Torvalds
  2006-11-22  9:44       ` Pavel Emelianov
  1 sibling, 1 reply; 36+ messages in thread
From: Linus Torvalds @ 2006-11-21 22:18 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Adrian Bunk, linux kernel mailing list, Morton Andrew Morton,
	Pavel Emelianov, mingo, dev



On Tue, 21 Nov 2006, Vivek Goyal wrote:

> On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
> > This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
> > that are not yet fixed in Linus' tree.
> > 
> > If you find your name in the Cc header, you are either submitter of one
> > of the bugs, maintainer of an affected subsystem or driver, a patch
> > of yours caused a breakage or I'm considering you in any other way possibly
> > involved with one or more of these issues.
> > 
> > Due to the huge amount of recipients, please trim the Cc when answering.
> > 
> > 
> > Subject    : kernel hangs when booting with irqpoll
> > References : http://lkml.org/lkml/2006/11/20/233
> > Submitter  : Vivek Goyal <vgoyal@in.ibm.com>
> > Caused-By  : Pavel Emelianov <xemul@openvz.org>
> >              commit f72fa707604c015a6625e80f269506032d5430dc
> > Handled-By : Vivek Goyal <vgoyal@in.ibm.com>
> > Status     : problem is being debugged
> > 
> 
> Adrian,
> 
> Pavel already provided a fix for this issue.
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=116409933100117&w=2

I really think this is wrong.

The original patch was wrong, and the _real_ problem is in __do_IRQ() that 
got the desc->lock too early.

I _think_ the correct fix is to simply revert the broken commit, and fix 
the _one_ place that called "note_interrupt()" with the lock held.

Something like this..

I also think that the real fix will be to move the whole

	if (!noirqdebug)
		note_interrupt(irq, desc, action_ret);


into handle_IRQ_event itself, since every caller (except for 
"misrouted_irq()" itself, and that should probably be done separately) 
should always do it. Right now we have a lot of people that just do

	action_ret = handle_IRQ_event(irq, action);
	if (!noirqdebug)
		note_interrupt(irq, desc, action_ret);

explicitly.

The only thing that keeps us from doing that is that we don't pass in 
"desc", but we should just do that.
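
The refactoring described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel's code: the types, the `_sketch` names, and the simplified bookkeeping in `note_interrupt_sketch()` are all hypothetical stand-ins. The one point it shows is that once the event handler receives the descriptor, it can do the `noirqdebug` check itself, so callers no longer repeat it.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; every name here is hypothetical. */
typedef int irqreturn_sketch_t;
#define IRQ_NONE_SK    0
#define IRQ_HANDLED_SK 1

struct irqaction_sketch {
	irqreturn_sketch_t (*handler)(int irq, void *dev_id);
	void *dev_id;
	struct irqaction_sketch *next;
};

struct irq_desc_sketch {
	struct irqaction_sketch *action;
	unsigned int irq_count;
	unsigned int irqs_unhandled;
};

int noirqdebug_sketch;	/* plays the role of the "noirqdebug" switch */

/* Two toy handlers: one claims the interrupt, one does not. */
irqreturn_sketch_t handler_claims_sketch(int irq, void *dev_id)
{
	(void)irq; (void)dev_id;
	return IRQ_HANDLED_SK;
}

irqreturn_sketch_t handler_ignores_sketch(int irq, void *dev_id)
{
	(void)irq; (void)dev_id;
	return IRQ_NONE_SK;
}

/* Simplified bookkeeping: count every interrupt and every unhandled one. */
void note_interrupt_sketch(unsigned int irq, struct irq_desc_sketch *desc,
			   irqreturn_sketch_t action_ret)
{
	(void)irq;
	desc->irq_count++;
	if (action_ret == IRQ_NONE_SK)
		desc->irqs_unhandled++;
}

/*
 * Assumed shape of the proposal: take "desc" instead of only the action
 * list, run all handlers, then do the note_interrupt step in one place.
 */
irqreturn_sketch_t handle_irq_event_sketch(unsigned int irq,
					   struct irq_desc_sketch *desc)
{
	irqreturn_sketch_t retval = IRQ_NONE_SK;
	struct irqaction_sketch *action;

	assert(desc != NULL);
	for (action = desc->action; action; action = action->next)
		retval |= action->handler((int)irq, action->dev_id);

	if (!noirqdebug_sketch)
		note_interrupt_sketch(irq, desc, retval);
	return retval;
}
```

A caller would then reduce to a single `handle_irq_event_sketch(irq, desc)` call, with no separate bookkeeping step to forget or to run under the wrong lock.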

But in the meantime, this appears to be the minimal fix. Can people please 
test and verify?

		Linus

---
diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index 42aa6f1..a681912 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -231,10 +231,10 @@ fastcall unsigned int __do_IRQ(unsigned
 		spin_unlock(&desc->lock);
 
 		action_ret = handle_IRQ_event(irq, action);
-
-		spin_lock(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
+
+		spin_lock(&desc->lock);
 		if (likely(!(desc->status & IRQ_PENDING)))
 			break;
 		desc->status &= ~IRQ_PENDING;
diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c
index 9c7e2e4..543ea2e 100644
--- a/kernel/irq/spurious.c
+++ b/kernel/irq/spurious.c
@@ -147,11 +147,7 @@ void note_interrupt(unsigned int irq, st
 	if (unlikely(irqfixup)) {
 		/* Don't punish working computers */
 		if ((irqfixup == 2 && irq == 0) || action_ret == IRQ_NONE) {
-			int ok;
-
-			spin_unlock(&desc->lock);
-			ok = misrouted_irq(irq);
-			spin_lock(&desc->lock);
+			int ok = misrouted_irq(irq);
 			if (action_ret == IRQ_NONE)
 				desc->irqs_unhandled -= ok;
 		}

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: 2.6.19-rc6: known regressions (v4)
  2006-11-21 22:18     ` Linus Torvalds
@ 2006-11-22  9:44       ` Pavel Emelianov
  2006-11-22 14:58         ` Vivek Goyal
  2006-11-22 17:28         ` Linus Torvalds
  0 siblings, 2 replies; 36+ messages in thread
From: Pavel Emelianov @ 2006-11-22  9:44 UTC (permalink / raw)
  To: Linus Torvalds, Morton Andrew Morton, mingo
  Cc: Vivek Goyal, Adrian Bunk, linux kernel mailing list, dev

> I really think this is wrong.
> 
> The original patch was wrong, and the _real_ problem is in __do_IRQ() that 
> got the desc->lock too early.
> 
> I _think_ the correct fix is to simply revert the broken commit, and fix 
> the _one_ place that called "note_interrupt()" with the lock held.
> 
> Something like this..
> 
> I also think that the real fix will be to move the whole
> 
> 	if (!noirqdebug)
> 		note_interrupt(irq, desc, action_ret);
> 
> 
> into handle_IRQ_event itself, since every caller (except for 
> "misrouted_irq()" itself, and that should probably be done separately) 
> should always do it. Right now we have a lot of people that just do
> 
> 	action_ret = handle_IRQ_event(irq, action);
> 	if (!noirqdebug)
> 		note_interrupt(irq, desc, action_ret);
> 
> explicitly.
> 
> The only thing that keeps us from doing that is that we don't pass in 
> "desc", but we should just do that.
> 
> But in the meantime, this appears to be the minimal fix. Can people please 
> test and verify?

This works for me, but is it normal that desc's fields are
modified non-atomically in note_interrupt()?

And one more thing - report_bad_irq() traverses desc->action
list without any locking either.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
  2006-11-21 21:31   ` [discuss] " Dave Jones
  2006-11-21 21:33   ` Vivek Goyal
@ 2006-11-22 10:42   ` Andi Kleen
  2006-11-22 15:52     ` Mel Gorman
  2006-11-22 16:05     ` Andre Noll
  2006-11-23  0:04   ` David Brownell
  3 siblings, 2 replies; 36+ messages in thread
From: Andi Kleen @ 2006-11-22 10:42 UTC (permalink / raw)
  To: discuss
  Cc: Adrian Bunk, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, Andre Noll, David Rientjes, Mel Gorman

> Subject    : x86_64: Bad page state in process 'swapper'
> References : http://lkml.org/lkml/2006/11/10/135
>              http://lkml.org/lkml/2006/11/10/208
> Submitter  : Andre Noll <maan@systemlinux.org>
> Handled-By : David Rientjes <rientjes@cs.washington.edu>
> Status     : problem is being debugged

Does this still happen with -rc6? 

It's probably another bug in the memmap parsing rewrite (Mel cc'ed) 
but the debugging information in the standard kernel unfortunately
doesn't give enough output to find out where it happens.

-Andi

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.19-rc6: known regressions (v4)
  2006-11-22  9:44       ` Pavel Emelianov
@ 2006-11-22 14:58         ` Vivek Goyal
  2006-11-22 17:28         ` Linus Torvalds
  1 sibling, 0 replies; 36+ messages in thread
From: Vivek Goyal @ 2006-11-22 14:58 UTC (permalink / raw)
  To: Pavel Emelianov, Linus Torvalds
  Cc: Morton Andrew Morton, mingo, Adrian Bunk,
	linux kernel mailing list, dev

On Wed, Nov 22, 2006 at 12:44:14PM +0300, Pavel Emelianov wrote:
> > I really think this is wrong.
> >
> > The original patch was wrong, and the _real_ problem is in __do_IRQ() that
> > got the desc->lock too early.
> >
> > I _think_ the correct fix is to simply revert the broken commit, and fix
> > the _one_ place that called "note_interrupt()" with the lock held.
> >
> > Something like this..
> >
> > I also think that the real fix will be to move the whole
> >
> > 	if (!noirqdebug)
> > 		note_interrupt(irq, desc, action_ret);
> >
> >
> > into handle_IRQ_event itself, since every caller (except for
> > "misrouted_irq()" itself, and that should probably be done separately)
> > should always do it. Right now we have a lot of people that just do
> >
> > 	action_ret = handle_IRQ_event(irq, action);
> > 	if (!noirqdebug)
> > 		note_interrupt(irq, desc, action_ret);
> >
> > explicitly.
> >
> > The only thing that keeps us from doing that is that we don't pass in
> > "desc", but we should just do that.
> >
> > But in the meantime, this appears to be the minimal fix. Can people please
> > test and verify?
> 
> This works for me, but is it normal that desc's fields are
> modified non-atomically in note_interrupt()?
> 
> And one more thing - report_bad_irq() traverses desc->action
> list without any locking either.

Works for me too. But Pavel's concern looks genuine. Maybe we should take
the lock again in note_interrupt()/report_bad_irq() whenever we are
accessing/modifying desc.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-22 10:42   ` [discuss] " Andi Kleen
@ 2006-11-22 15:52     ` Mel Gorman
  2006-11-22 17:42       ` Andre Noll
  2006-11-22 16:05     ` Andre Noll
  1 sibling, 1 reply; 36+ messages in thread
From: Mel Gorman @ 2006-11-22 15:52 UTC (permalink / raw)
  To: Andi Kleen
  Cc: discuss, Adrian Bunk, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, Andre Noll, David Rientjes

On (22/11/06 11:42), Andi Kleen didst pronounce:
> > Subject    : x86_64: Bad page state in process 'swapper'
> > References : http://lkml.org/lkml/2006/11/10/135
> >              http://lkml.org/lkml/2006/11/10/208
> > Submitter  : Andre Noll <maan@systemlinux.org>
> > Handled-By : David Rientjes <rientjes@cs.washington.edu>
> > Status     : problem is being debugged
> 
> Does this still happen with -rc6? 
> 
> It's probably another bug in the memmap parsing rewrite (Mel cc'ed) 
> but the debugging information in the standard kernel unfortunately
> doesn't give enough output to find out where it happens.
> 

Right, so I took a closer look to see what the story was.

According to the thread, this was the E820 map with the corresponding
PFNs appended to the usable regions.

BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)    (  0-159)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)  
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)  
 BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable)    (256-1032176)
 BIOS-e820: 00000000fbff0000 - 00000000fbfff000 (ACPI data) 
 BIOS-e820: 00000000fbfff000 - 00000000fc000000 (ACPI NVS)
 BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000200000000 (usable)    (1048576-2097152)

This is what the PFN ranges look like to arch-independent zone-sizing
reading the map without node awareness

Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 1 entries of 3200 used
Entering add_active_range(0, 1048576, 2097152) 2 entries of 3200 used

That matches exactly. So far so good. Later with node awareness, we get

SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 1 -> APIC 1 -> Node 1
SRAT: Node 0 PXM 0 100000-fc000000
Entering add_active_range(0, 256, 1032176) 0 entries of 3200 used
SRAT: Node 1 PXM 1 100000000-200000000
Entering add_active_range(1, 1048576, 2097152) 1 entries of 3200 used
SRAT: Node 0 PXM 0 0-fc000000
Entering add_active_range(0, 0, 159) 2 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 3 entries of 3200 used

Unusual ordering, but the information is still correct. The final sorted
map looks like:

early_node_map[3] active PFN ranges
    0:        0 ->      159
    0:      256 ->  1032176
    1:  1048576 ->  2097152

Again, everything there looks like what the E820 map reports so I don't
believe this is the zone-sizing code's fault, although it may be exposing a
bug from elsewhere. According to bootmem, things look like

Bootmem setup node 0 0000000000000000-00000000fc000000
Bootmem setup node 1 0000000100000000-0000000200000000

That's node 0 PFN 0->1032192 and node 1 PFN 1048576->2097152.

That is showing an additional 16 page frames that are not in the E820 map
(although I have seen this before and it didn't show up as a bad page). I
would be very interested in finding out what the bad_page PFNs are if this
bug still exists to see if it is those 16 frames. I've included a patch
below that might help.
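
The PFN figures above can be reproduced with a few lines of C. This is a minimal sketch, assuming 4 KiB pages (a shift of 12) and using made-up helper names (`pfn_up_sketch`, `pfn_down_sketch`, `bootmem_extra_frames_sketch`) that are not kernel functions. It shows where the 16 extra frames come from: the usable E820 range ends at 0xfbff0000, while the bootmem setup for node 0 ends at 0xfc000000.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12	/* assumption: 4 KiB pages */

/* First PFN of a range: round the start address up to a page boundary. */
uint64_t pfn_up_sketch(uint64_t addr)
{
	return (addr + (1ULL << PAGE_SHIFT_SKETCH) - 1) >> PAGE_SHIFT_SKETCH;
}

/* End PFN (exclusive) of a range: round the end address down. */
uint64_t pfn_down_sketch(uint64_t addr)
{
	return addr >> PAGE_SHIFT_SKETCH;
}

/* Page frames covered by bootmem beyond the end of the usable E820 range. */
uint64_t bootmem_extra_frames_sketch(uint64_t e820_end, uint64_t bootmem_end)
{
	assert(bootmem_end >= e820_end);
	return pfn_down_sketch(bootmem_end) - pfn_down_sketch(e820_end);
}
```

With the values from the map, `pfn_down_sketch(0xfbff0000ULL)` gives 1032176, `pfn_down_sketch(0xfc000000ULL)` gives 1032192, and `bootmem_extra_frames_sketch(0xfbff0000ULL, 0xfc000000ULL)` gives 16, which matches the ACPI data and ACPI NVS regions that sit between the two addresses.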

Andre, if the bug still exists for you, can you please apply Andi's patch
to reduce the log size together with the following patch, and post the
output with loglevel=8? Thanks

Signed-off-by: Mel Gorman <mel@csn.ul.ie>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.19-rc6-clean/arch/x86_64/mm/numa.c linux-2.6.19-rc6-debug_bootmem_init_issues/arch/x86_64/mm/numa.c
--- linux-2.6.19-rc6-clean/arch/x86_64/mm/numa.c	2006-11-22 15:08:20.000000000 +0000
+++ linux-2.6.19-rc6-debug_bootmem_init_issues/arch/x86_64/mm/numa.c	2006-11-22 15:07:47.000000000 +0000
@@ -192,6 +192,9 @@ void __init setup_node_zones(int nodeid)
 				memmapsize, SMP_CACHE_BYTES, 
 				round_down(limit - memmapsize, PAGE_SIZE), 
 				limit);
+	printk(KERN_DEBUG "Node %d memmap at 0x%p size %lu first pfn 0x%p\n",
+			nodeid, NODE_DATA(nodeid)->node_mem_map,
+			memmapsize, NODE_DATA(nodeid)->node_mem_map);
 #endif
 } 
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-debug_bootmem_init_issues/mm/page_alloc.c
--- linux-2.6.19-rc6-clean/mm/page_alloc.c	2006-11-16 04:03:40.000000000 +0000
+++ linux-2.6.19-rc6-debug_bootmem_init_issues/mm/page_alloc.c	2006-11-22 14:16:46.000000000 +0000
@@ -2453,6 +2453,9 @@ static void __init alloc_node_mem_map(st
 		if (!map)
 			map = alloc_bootmem_node(pgdat, size);
 		pgdat->node_mem_map = map + (pgdat->node_start_pfn - start);
+		printk(KERN_DEBUG
+			"Node %d memmap at 0x%p size %lu first pfn 0x%p\n",
+			pgdat->node_id, map, size, pgdat->node_mem_map);
 	}
 #ifdef CONFIG_FLATMEM
 	/*
@@ -2683,6 +2686,9 @@ void __init free_area_init_nodes(unsigne
 	/* Regions in the early_node_map can be in any order */
 	sort_node_map();
 
+	/* Print out the page size for debugging meminit problems */
+	printk(KERN_DEBUG "sizeof(struct page) = %zu\n", sizeof(struct page));
+
 	/* Print out the zone ranges */
 	printk("Zone PFN ranges:\n");
 	for (i = 0; i < MAX_NR_ZONES; i++)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-22 10:42   ` [discuss] " Andi Kleen
  2006-11-22 15:52     ` Mel Gorman
@ 2006-11-22 16:05     ` Andre Noll
  2006-11-22 17:03       ` Mel Gorman
  2006-11-22 17:08       ` Andi Kleen
  1 sibling, 2 replies; 36+ messages in thread
From: Andre Noll @ 2006-11-22 16:05 UTC (permalink / raw)
  To: Andi Kleen
  Cc: discuss, Adrian Bunk, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, David Rientjes, Mel Gorman

On 11:42, Andi Kleen wrote:
> > Subject    : x86_64: Bad page state in process 'swapper'
> > References : http://lkml.org/lkml/2006/11/10/135
> >              http://lkml.org/lkml/2006/11/10/208
> > Submitter  : Andre Noll <maan@systemlinux.org>
> > Handled-By : David Rientjes <rientjes@cs.washington.edu>
> > Status     : problem is being debugged
> 
> Does this still happen with -rc6? 

Unfortunately, yes. I tried rc6, current git, and current git + David
Rientjes' patch. They all show the same behaviour.

> It's probably another bug in the memmap parsing rewrite (Mel cc'ed) 
> but the debugging information in the standard kernel unfortunately
> doesn't give enough output to find out where it happens.

Feel free to send me a debugging patch..

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-22 16:05     ` Andre Noll
@ 2006-11-22 17:03       ` Mel Gorman
  2006-11-22 17:08       ` Andi Kleen
  1 sibling, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2006-11-22 17:03 UTC (permalink / raw)
  To: Andre Noll
  Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, David Rientjes, Mel Gorman

On Wed, 22 Nov 2006, Andre Noll wrote:

> On 11:42, Andi Kleen wrote:
>>> Subject    : x86_64: Bad page state in process 'swapper'
>>> References : http://lkml.org/lkml/2006/11/10/135
>>>              http://lkml.org/lkml/2006/11/10/208
>>> Submitter  : Andre Noll <maan@systemlinux.org>
>>> Handled-By : David Rientjes <rientjes@cs.washington.edu>
>>> Status     : problem is being debugged
>>
>> Does this still happen with -rc6?
>
> Unfortunately, yes. I tried rc6, current git, and current git + David
> Rientjes' patch. They all show the same behaviour.
>
>> It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
>> but the debugging information in the standard kernel unfortunately
>> doesn't give enough output to find out where it happens.
>
> Feel free to send me a debugging patch..
>



You should have received such a patch from me later in the thread. In 
combination with the patch at http://lkml.org/lkml/2006/11/10/198 and a 
copy of the dmesg, I might be able to guess what is going wrong. Thanks

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-22 16:05     ` Andre Noll
  2006-11-22 17:03       ` Mel Gorman
@ 2006-11-22 17:08       ` Andi Kleen
  2006-11-22 18:00         ` Andre Noll
  1 sibling, 1 reply; 36+ messages in thread
From: Andi Kleen @ 2006-11-22 17:08 UTC (permalink / raw)
  To: Andre Noll
  Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, David Rientjes, Mel Gorman

On Wed, Nov 22, 2006 at 05:05:49PM +0100, Andre Noll wrote:
> Unfortunately, yes. I tried rc6, current git, and current git + David
> Rientjes' patch. They all show the same behaviour.

I must have missed that patch.
> 
> > It's probably another bug in the memmap parsing rewrite (Mel cc'ed) 
> > but the debugging information in the standard kernel unfortunately
> > doesn't give enough output to find out where it happens.
> 
> Feel free to send me a debugging patch..

Here's one. Please send output (unless Mel finds the problem first..)

-Andi

Index: linux-2.6.19-rc6-hack/mm/page_alloc.c
===================================================================
--- linux-2.6.19-rc6-hack/mm/page_alloc.c
+++ linux-2.6.19-rc6-hack/mm/page_alloc.c
@@ -188,6 +188,10 @@ static inline int bad_range(struct zone 
 
 static void bad_page(struct page *page)
 {
+	static int warned; 
+	if (!warned) { 
+	warned = 1;
+	printk(KERN_EMERG "page address %p\n", page_address(page));
 	printk(KERN_EMERG "Bad page state in process '%s'\n"
 		KERN_EMERG "page:%p flags:0x%0*lx mapping:%p mapcount:%d count:%d\n"
 		KERN_EMERG "Trying to fix it up, but a reboot is needed\n"
@@ -196,6 +200,7 @@ static void bad_page(struct page *page)
 		(unsigned long)page->flags, page->mapping,
 		page_mapcount(page), page_count(page));
 	dump_stack();
+	}
 	page->flags &= ~(1 << PG_lru	|
 			1 << PG_private |
 			1 << PG_locked	|


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 2.6.19-rc6: known regressions (v4)
  2006-11-22  9:44       ` Pavel Emelianov
  2006-11-22 14:58         ` Vivek Goyal
@ 2006-11-22 17:28         ` Linus Torvalds
  1 sibling, 0 replies; 36+ messages in thread
From: Linus Torvalds @ 2006-11-22 17:28 UTC (permalink / raw)
  To: Pavel Emelianov
  Cc: Morton Andrew Morton, mingo, Vivek Goyal, Adrian Bunk,
	linux kernel mailing list, dev



On Wed, 22 Nov 2006, Pavel Emelianov wrote:
> 
> This works for me, but is it normal that desc's fields are
> modified non-atomically in note_interrupt()?

This is all inside the normal interrupt handling logic, so it should be 
exactly as safe as any interrupt is: we don't allow the _same_ interrupt 
to be entered recursively at the same time.

So yes, the counts etc are done non-atomically, but the code around it all 
guarantees that only one concurrent invocation happens per irq descriptor, 
so it's all ok.

(The one exception to that may be the "desc->status" modification in case 
the irq is determined to have screamed, since "status" can be modified by 
a recursive interrupt coming in, but (a) that's a "this irq is dead" 
scenario _anyway_ and (b) if we ever care, we should lock it _there_, not 
somewhere else).
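
The "only one concurrent invocation per irq descriptor" guarantee can be illustrated with a small single-threaded simulation. This is a sketch under loud assumptions: the `_SK` flags and `_sk` names are invented for the example, the real code does its status updates under desc->lock, and a second arrival would normally come from another CPU rather than from a nested call as simulated here. A second arrival only marks the descriptor as pending and returns, and the instance already running loops to pick it up, so the plain counter is only ever touched by one instance at a time.

```c
#include <assert.h>

/* Hypothetical flag values for the simulation only. */
#define IRQ_INPROGRESS_SK 0x1u
#define IRQ_PENDING_SK    0x2u

struct irq_desc_sk {
	unsigned int status;
	unsigned int irq_count;	/* plain, non-atomic counter */
	int depth;		/* handler instances currently live */
	int max_depth;		/* highest depth ever observed */
	int reentries_left;	/* simulated extra arrivals still to inject */
};

void do_irq_sk(struct irq_desc_sk *desc);

/* Handler body: may simulate the same irq arriving again mid-handler. */
void run_handler_sk(struct irq_desc_sk *desc)
{
	desc->depth++;
	if (desc->depth > desc->max_depth)
		desc->max_depth = desc->depth;
	if (desc->reentries_left > 0) {
		desc->reentries_left--;
		do_irq_sk(desc);	/* second arrival of the same irq */
	}
	desc->depth--;
}

void do_irq_sk(struct irq_desc_sk *desc)
{
	if (desc->status & IRQ_INPROGRESS_SK) {
		/* Someone is already handling it: just leave a note. */
		desc->status |= IRQ_PENDING_SK;
		return;
	}
	assert(desc->depth == 0);	/* no handler instance is live here */
	desc->status |= IRQ_INPROGRESS_SK;
	for (;;) {
		run_handler_sk(desc);
		desc->irq_count++;	/* only the owning instance gets here */
		if (!(desc->status & IRQ_PENDING_SK))
			break;
		desc->status &= ~IRQ_PENDING_SK;
	}
	desc->status &= ~IRQ_INPROGRESS_SK;
}
```

Starting from a zeroed descriptor with `reentries_left = 2`, one call to `do_irq_sk()` runs the handler three times in sequence, never nested: `irq_count` ends at 3 and `max_depth` stays at 1.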

			Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-22 15:52     ` Mel Gorman
@ 2006-11-22 17:42       ` Andre Noll
  2006-11-23 12:01         ` Mel Gorman
  0 siblings, 1 reply; 36+ messages in thread
From: Andre Noll @ 2006-11-22 17:42 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, David Rientjes

On 15:52, Mel Gorman wrote:

> Right, so I took a closer look to see what the story was.

Thanks a lot, Mel.

> Bootmem setup node 0 0000000000000000-00000000fc000000
> Bootmem setup node 1 0000000100000000-0000000200000000
> 
> That's node 0 PFN 0->1032192 and node 1 PFN 1048576->2097152.
> 
> That is showing an additional 16 page frames that are not in the E820 map
> (although I have seen this before and it didn't show up as a bad page). I
> would be very interested in finding out what the bad_page PFNs are if this
> bug still exists to see if it is those 16 frames. I've included a patch
> below that might help.
> 
> Andre, if the bug still exists for you, can you apply Andi's patch to
> reduce the log size and the following patch please and post us the
> output with loglevel=8 please? Thanks

Done. Here's the output of dmesg with your and Andi's patch applied.

Andre

Linux version 2.6.19-rc6-mel-tt64-6-g0f9005a6-dirty (maan@congo) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #11 SMP Wed Nov 22 17:11:44 CET 2006
Command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel 
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable)
 BIOS-e820: 00000000fbff0000 - 00000000fbfff000 (ACPI data)
 BIOS-e820: 00000000fbfff000 - 00000000fc000000 (ACPI NVS)
 BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000200000000 (usable)
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 1 entries of 3200 used
Entering add_active_range(0, 1048576, 2097152) 2 entries of 3200 used
end_pfn_map = 2097152
DMI 2.3 present.
ACPI: RSDP (v000 ACPIAM                                ) @ 0x00000000000f6bc0
ACPI: RSDT (v001 A M I  OEMRSDT  0x01000510 MSFT 0x00000097) @ 0x00000000fbff0000
ACPI: FADT (v001 A M I  OEMFACP  0x01000510 MSFT 0x00000097) @ 0x00000000fbff0200
ACPI: MADT (v001 A M I  OEMAPIC  0x01000510 MSFT 0x00000097) @ 0x00000000fbff0380
ACPI: OEMB (v001 A M I  OEMBIOS  0x01000510 MSFT 0x00000097) @ 0x00000000fbfff040
ACPI: SRAT (v001 A M I  OEMSRAT  0x01000510 MSFT 0x00000097) @ 0x00000000fbff34e0
ACPI: ASF! (v001 AMIASF AMDSTRET 0x00000001 INTL 0x02002026) @ 0x00000000fbff35f0
ACPI: DSDT (v001  0AAAA 0AAAA000 0x00000000 INTL 0x02002026) @ 0x0000000000000000
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 1 -> APIC 1 -> Node 1
SRAT: Node 0 PXM 0 100000-fc000000
Entering add_active_range(0, 256, 1032176) 0 entries of 3200 used
SRAT: Node 1 PXM 1 100000000-200000000
Entering add_active_range(1, 1048576, 2097152) 1 entries of 3200 used
SRAT: Node 0 PXM 0 0-fc000000
Entering add_active_range(0, 0, 159) 2 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 3 entries of 3200 used
NUMA: Using 32 for the hash shift.
Bootmem setup node 0 0000000000000000-00000000fc000000
Bootmem setup node 1 0000000100000000-0000000200000000
Node 0 memmap at 0xffff810000893000 size 57802752 first pfn 0xffff810000893000
Node 1 memmap at 0xffff8101fc800000 size 58720256 first pfn 0xffff8101fc800000
sizeof(struct page) = 56
Zone PFN ranges:
  DMA           256 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  2097152
early_node_map[3] active PFN ranges
    0:        0 ->      159
    0:      256 ->  1032176
    1:  1048576 ->  2097152
On node 0 totalpages: 1031920
  DMA zone: 52 pages used for memmap
  DMA zone: 1953 pages reserved
  DMA zone: 1835 pages, LIFO batch:0
  DMA32 zone: 14055 pages used for memmap
  DMA32 zone: 1014025 pages, LIFO batch:31
  Normal zone: 0 pages used for memmap
On node 1 totalpages: 1048576
  DMA zone: 0 pages used for memmap
  DMA32 zone: 0 pages used for memmap
  Normal zone: 14336 pages used for memmap
  Normal zone: 1034240 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x5008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x03] address[0xfebff000] gsi_base[24])
IOAPIC[1]: apic_id 3, address 0xfebff000, GSI 24-27
ACPI: IOAPIC (id[0x04] address[0xfebfe000] gsi_base[28])
IOAPIC[2]: apic_id 4, address 0xfebfe000, GSI 28-31
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009f000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000fbff0000 - 00000000fbfff000
Nosave address range: 00000000fbfff000 - 00000000fc000000
Nosave address range: 00000000fc000000 - 00000000ff780000
Nosave address range: 00000000ff780000 - 0000000100000000
Allocating PCI resources starting at fc400000 (gap: fc000000:3780000)
PERCPU: Allocating 25728 bytes of per cpu data
Built 2 zonelists.  Total pages: 2050100
Kernel command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel 
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Checking aperture...
CPU 0: aperture @ f5cc000000 size 32 MB
Aperture too small (32 MB)
No AGP bridge found
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 8000000
Bad page state in process 'swapper'
page:ffff810003faf480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:

Call Trace:
 [<ffffffff8014f1dd>] bad_page+0x71/0x9f
 [<ffffffff8014f6be>] __free_pages_ok+0x78/0xf9
 [<ffffffff805cd878>] free_all_bootmem_core+0xce/0x1c2
 [<ffffffff805cad99>] numa_free_all_bootmem+0x39/0x78
 [<ffffffff805ca603>] mem_init+0x59/0x16c
 [<ffffffff805bb75c>] start_kernel+0x165/0x1e7
 [<ffffffff805bb195>] x86_64_start_kernel+0x12b/0x130

Memory: 8122880k/8388608k available (3184k kernel code, 199740k reserved, 1490k data, 2612k init)
Calibrating delay using timer specific routine.. 4784.66 BogoMIPS (lpj=9569329)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 0/0 -> Node 0
Freeing SMP alternatives: 32k freed
ACPI: Core revision 20060707
Using local APIC timer interrupts.
result 12447006
Detected 12.447 MHz APIC timer.
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4780.00 BogoMIPS (lpj=9560010)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1/1 -> Node 1
AMD Opteron(tm) Processor 250 stepping 0a
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -14 cycles, maxerr 1190 cycles)
Brought up 2 CPUs
testing NMI watchdog ... OK.
Disabling vsyscall due to use of PM timer
time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
time.c: Detected 2389.823 MHz processor.
migration_cost=569
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:03:06.0
PCI: Firmware left 0000:03:08.0 e100 interrupts enabled, disabling
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLB._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
AMD768 RNG detected
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 8000000 size 65536 KB
PCI-DMA: using GART IOMMU.
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
PCI: Bridge: 0000:00:06.0
  IO window: a000-bfff
  MEM window: fc900000-feafffff
  PREFETCH window: disabled.
PCI: Bridge: 0000:00:0a.0
  IO window: 9000-9fff
  MEM window: fc600000-fc8fffff
  PREFETCH window: ff500000-ff5fffff
PCI: Bridge: 0000:00:0b.0
  IO window: disabled.
  MEM window: disabled.
  PREFETCH window: disabled.
NET: Registered protocol family 2
IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
microcode: CPU0 not a capable Intel processor
microcode: CPU1 not a capable Intel processor
IA-32 Microcode Update Driver: v1.14a <tigran@veritas.com>
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ACPI: Processor [CPU1] (supports 8 throttling states)
ACPI: Getting cpuindex for acpiid 0x3
ACPI: Getting cpuindex for acpiid 0x4
Real Time Clock Driver v1.12ac
Linux agpgart interface v0.101 (c) Dave Jones
ipmi message handler version 39.0
ipmi device interface
IPMI System Interface driver.
ipmi_si: Unable to find any System Interface(s)
IPMI Watchdog: driver initialized
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
loop: loaded (max 8 devices)
Intel(R) PRO/1000 Network Driver - version 7.2.9-k4
Copyright (c) 1999-2006 Intel Corporation.
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others
ACPI: PCI Interrupt 0000:03:08.0[A] -> GSI 18 (level, low) -> IRQ 18
eth0: 0000:03:08.0, 00:E0:81:2E:78:F7, IRQ 18.
  Board assembly 567812-052, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0xd0a6c714).
e100: Intel(R) PRO/100 Network Driver, 3.5.17-k2-NAPI
e100: Copyright(c) 1999-2006 Intel Corporation
tg3.c:v3.69 (November 15, 2006)
ACPI: PCI Interrupt 0000:02:09.0[A] -> GSI 24 (level, low) -> IRQ 24
eth1: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:26
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] 
eth1: dma_rwctrl[769f4000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:02:09.1[B] -> GSI 25 (level, low) -> IRQ 25
eth2: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:27
eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] 
eth2: dma_rwctrl[769f4000] dma_mask[64-bit]
Linux video capture interface: v2.00
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD8111: IDE controller at PCI slot 0000:00:07.1
AMD8111: chipset revision 3
AMD8111: not 100% native mode: will probe irqs later
AMD8111: 0000:00:07.1 (rev 03) UDMA133 controller
    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:pio, hdb:pio
    ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
Probing IDE interface ide0...
Probing IDE interface ide1...
ACPI: PCI Interrupt 0000:02:06.0[A] -> GSI 24 (level, low) -> IRQ 24
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs

scsi 0:0:0:0: Direct-Access     FUJITSU  MAT3073NP        0105 PQ: 0 ANSI: 3
 target0:0:0: asynchronous
scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
 target0:0:0: Beginning Domain Validation
 target0:0:0: wide asynchronous
 target0:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127)
 target0:0:0: Ending Domain Validation
scsi 0:0:1:0: Direct-Access     FUJITSU  MAT3073NP        0105 PQ: 0 ANSI: 3
 target0:0:1: asynchronous
scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
 target0:0:1: Beginning Domain Validation
 target0:0:1: wide asynchronous
 target0:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127)
 target0:0:1: Ending Domain Validation
ACPI: PCI Interrupt 0000:02:06.1[B] -> GSI 25 (level, low) -> IRQ 25
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs

3ware Storage Controller device driver for Linux v1.26.02.001.
3ware 9000 Storage Controller device driver for Linux v2.26.02.008.
SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB)
sda: Write Protect is off
sda: Mode Sense: b3 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB)
sda: Write Protect is off
sda: Mode Sense: b3 00 00 08
SCSI device sda: drive cache: write back
 sda: sda1 sda2
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB)
sdb: Write Protect is off
sdb: Mode Sense: b3 00 00 08
SCSI device sdb: drive cache: write back
SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB)
sdb: Write Protect is off
sdb: Mode Sense: b3 00 00 08
SCSI device sdb: drive cache: write back
 sdb: sdb1 sdb2
sd 0:0:1:0: Attached scsi disk sdb
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:1:0: Attached scsi generic sg1 type 0
Fusion MPT base driver 3.04.02
Copyright (c) 1999-2005 LSI Logic Corporation
Fusion MPT SPI Host driver 3.04.02
usbmon: debugfs is not available
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt 0000:03:00.0[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:03:00.0: OHCI Host Controller
ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:03:00.0: irq 19, io mem 0xfeafc000
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
ACPI: PCI Interrupt 0000:03:00.1[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:03:00.1: OHCI Host Controller
ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 2
ohci_hcd 0000:03:00.1: irq 19, io mem 0xfeafd000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
USB Universal Host Controller Interface driver v3.0
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
/.amd_mnt/huangho/export/kwaid0/home/maan/scm/torvalds/linux-2.6/drivers/usb/input/hid-core.c: v2.6:USB HID core driver
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input0
md: raid0 personality registered for level 0
md: multipath personality registered for level -4
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 15
CCID: Registered CCID 3 (ccid3)
CCID: Registered CCID 2 (ccid2)
SCTP: Hash tables configured (established 65536 bind 65536)
powernow-k8: Found 2 AMD Opteron(tm) Processor 250 processors (version 2.00.00)
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
PM: Writing back config space on device 0000:02:09.0 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.0 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.0 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.0 at offset 1 (was 2b00000, writing 2b00146)
PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106)
Sending DHCP requests .<6>tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is on for TX and on for RX.
., OK
IP-Config: Got DHCP answer from 192.168.1.254, my address is 192.168.1.120
PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106)
IP-Config: Complete:
      device=eth1, addr=192.168.1.120, mask=255.255.0.0, gw=192.168.1.254,
     host=node120, domain=, nis-domain=(none),
     bootserver=192.168.1.254, rootserver=192.168.1.254, rootpath=
Freeing unused kernel memory: 2612k freed
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: md0 stopped.
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: bind<sda2>
md: bind<sdb2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0:   comparing sdb2(55038592) with sdb2(55038592)
raid0:   END
raid0:   ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0:   comparing sda2(55038592) with sdb2(55038592)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 110077184 blocks.
raid0 : conf->hash_spacing is 110077184 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: md0 stopped.
md: unbind<sdb2>
md: export_rdev(sdb2)
md: unbind<sda2>
md: export_rdev(sda2)
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: bind<sda2>
md: bind<sdb2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0:   comparing sdb2(55038592) with sdb2(55038592)
raid0:   END
raid0:   ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0:   comparing sda2(55038592) with sdb2(55038592)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 110077184 blocks.
raid0 : conf->hash_spacing is 110077184 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
Adding 16779852k swap on /dev/sda1.  Priority:42 extents:1 across:16779852k
Adding 16779852k swap on /dev/sdb1.  Priority:42 extents:1 across:16779852k
warning: process `sensors' used the removed sysctl system call with 7.2.1.
warning: process `sensors' used the removed sysctl system call with 7.2.1.
process `syslogd' is using obsolete setsockopt SO_BSDCOMPAT
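
As a cross-check on the "Bad page state" report in the log above, the struct page address can be turned back into a page frame number using the memmap base and sizeof(struct page) that the debug output prints. The following is only a rough sketch: it assumes node 0's memmap begins at PFN 0 and that pages are 4 KiB, neither of which is stated in the log, and the helper name page_to_pfn is purely illustrative.

```python
# Values copied from the boot log above.
MEMMAP_BASE = 0xffff810000893000   # "Node 0 memmap at 0xffff810000893000"
STRUCT_PAGE_SIZE = 56              # "sizeof(struct page) = 56"
BAD_PAGE = 0xffff810003faf480      # "page:ffff810003faf480"

# Active PFN ranges for node 0, from "early_node_map[3] active PFN ranges".
NODE0_ACTIVE_RANGES = [(0, 159), (256, 1032176)]

# Assumptions that are NOT taken from the log.
NODE0_START_PFN = 0                # node 0's memmap is assumed to start at PFN 0
PAGE_SHIFT = 12                    # 4 KiB pages


def page_to_pfn(page_addr, memmap_base=MEMMAP_BASE,
                struct_size=STRUCT_PAGE_SIZE, start_pfn=NODE0_START_PFN):
    """Convert a struct page address into a PFN (hypothetical helper)."""
    index, remainder = divmod(page_addr - memmap_base, struct_size)
    if remainder:
        raise ValueError("address is not aligned to a struct page boundary")
    return start_pfn + index


pfn = page_to_pfn(BAD_PAGE)
phys = pfn << PAGE_SHIFT
in_active_range = any(lo <= pfn < hi for lo, hi in NODE0_ACTIVE_RANGES)

print("pfn=%d phys=%#x in_active_range=%s" % (pfn, phys, in_active_range))
```

Under those assumptions the page corresponds to PFN 1031920 (physical address 0xfbef0000), which lies inside the 256 -> 1032176 active range of node 0, 256 pages below its end.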
-- 
The only person who always got his work done by Friday was Robinson Crusoe


* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-22 17:08       ` Andi Kleen
@ 2006-11-22 18:00         ` Andre Noll
  0 siblings, 0 replies; 36+ messages in thread
From: Andre Noll @ 2006-11-22 18:00 UTC (permalink / raw)
  To: Andi Kleen
  Cc: discuss, Adrian Bunk, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, David Rientjes, Mel Gorman


On 18:08, Andi Kleen wrote:
> On Wed, Nov 22, 2006 at 05:05:49PM +0100, Andre Noll wrote:
> > Unfortunately, yes. I tried rc6, current git, and current git + David
> > Rientjes' patch. They all show the same behaviour.
> 
> I must have missed that patch.

He sent it to me privately. In fact, he sent several patches. This is
the one I tried today, and it didn't work:


	Hi Andre,

	Please try the following patch to your 2.6.19-rc5 and see if it corrects 
	the problem (it should also apply to 2.6.19-rc6 cleanly).

			David
	---
	 mm/memory.c |   33 ++++++++-------------------------
	 1 files changed, 8 insertions(+), 25 deletions(-)

	diff --git a/mm/memory.c b/mm/memory.c
	index 156861f..74aa08b 100644
	--- a/mm/memory.c
	+++ b/mm/memory.c
	@@ -1483,29 +1483,14 @@ static int do_wp_page(struct mm_struct *
	 {
		struct page *old_page, *new_page;
		pte_t entry;
	-	int reuse = 0, ret = VM_FAULT_MINOR;
	-	struct page *dirty_page = NULL;
	+	int reuse, ret = VM_FAULT_MINOR;
	 
		old_page = vm_normal_page(vma, address, orig_pte);
		if (!old_page)
			goto gotten;
	 
	-	/*
	-	 * Take out anonymous pages first, anonymous shared vmas are
	-	 * not dirty accountable.
	-	 */
	-	if (PageAnon(old_page)) {
	-		if (!TestSetPageLocked(old_page)) {
	-			reuse = can_share_swap_page(old_page);
	-			unlock_page(old_page);
	-		}
	-	} else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
	-					(VM_WRITE|VM_SHARED))) {
	-		/*
	-		 * Only catch write-faults on shared writable pages,
	-		 * read-only shared pages can get COWed by
	-		 * get_user_pages(.write=1, .force=1).
	-		 */
	+	if (unlikely((vma->vm_flags & (VM_SHARED | VM_WRITE)) ==
	+			(VM_SHARED | VM_WRITE))) {
			if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
				/*
				 * Notify the address space that the page is about to
	@@ -1534,10 +1519,12 @@ static int do_wp_page(struct mm_struct *
				if (!pte_same(*page_table, orig_pte))
					goto unlock;
			}
	-		dirty_page = old_page;
	-		get_page(dirty_page);
			reuse = 1;
	-	}
	+	} else if (PageAnon(old_page) && !TestSetPageLocked(old_page)) {
	+		reuse = can_share_swap_page(old_page);
	+		unlock_page(old_page);
	+	} else
	+		reuse = 0;
	 
		if (reuse) {
			flush_cache_page(vma, address, pte_pfn(orig_pte));
	@@ -1609,10 +1596,6 @@ gotten:
			page_cache_release(old_page);
	 unlock:
		pte_unmap_unlock(page_table, ptl);
	-	if (dirty_page) {
	-		set_page_dirty_balance(dirty_page);
	-		put_page(dirty_page);
	-	}
		return ret;
	 oom:
		if (old_page)



> > Feel free to send me a debugging patch..

> Here's one. Please send output (unless Mel finds the problem first..)

Here comes the output.
Andre


Linux version 2.6.19-rc6-andi-v2-tt64-6-g0f9005a6-dirty (maan@congo) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #12 SMP Wed Nov 22 18:54:11 CET 2006
Command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel 
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable)
 BIOS-e820: 00000000fbff0000 - 00000000fbfff000 (ACPI data)
 BIOS-e820: 00000000fbfff000 - 00000000fc000000 (ACPI NVS)
 BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000200000000 (usable)
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 1 entries of 3200 used
Entering add_active_range(0, 1048576, 2097152) 2 entries of 3200 used
end_pfn_map = 2097152
DMI 2.3 present.
ACPI: RSDP (v000 ACPIAM                                ) @ 0x00000000000f6bc0
ACPI: RSDT (v001 A M I  OEMRSDT  0x01000510 MSFT 0x00000097) @ 0x00000000fbff0000
ACPI: FADT (v001 A M I  OEMFACP  0x01000510 MSFT 0x00000097) @ 0x00000000fbff0200
ACPI: MADT (v001 A M I  OEMAPIC  0x01000510 MSFT 0x00000097) @ 0x00000000fbff0380
ACPI: OEMB (v001 A M I  OEMBIOS  0x01000510 MSFT 0x00000097) @ 0x00000000fbfff040
ACPI: SRAT (v001 A M I  OEMSRAT  0x01000510 MSFT 0x00000097) @ 0x00000000fbff34e0
ACPI: ASF! (v001 AMIASF AMDSTRET 0x00000001 INTL 0x02002026) @ 0x00000000fbff35f0
ACPI: DSDT (v001  0AAAA 0AAAA000 0x00000000 INTL 0x02002026) @ 0x0000000000000000
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 1 -> APIC 1 -> Node 1
SRAT: Node 0 PXM 0 100000-fc000000
Entering add_active_range(0, 256, 1032176) 0 entries of 3200 used
SRAT: Node 1 PXM 1 100000000-200000000
Entering add_active_range(1, 1048576, 2097152) 1 entries of 3200 used
SRAT: Node 0 PXM 0 0-fc000000
Entering add_active_range(0, 0, 159) 2 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 3 entries of 3200 used
NUMA: Using 32 for the hash shift.
Bootmem setup node 0 0000000000000000-00000000fc000000
Bootmem setup node 1 0000000100000000-0000000200000000
Zone PFN ranges:
  DMA           256 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  2097152
early_node_map[3] active PFN ranges
    0:        0 ->      159
    0:      256 ->  1032176
    1:  1048576 ->  2097152
On node 0 totalpages: 1031920
  DMA zone: 52 pages used for memmap
  DMA zone: 1953 pages reserved
  DMA zone: 1835 pages, LIFO batch:0
  DMA32 zone: 14055 pages used for memmap
  DMA32 zone: 1014025 pages, LIFO batch:31
  Normal zone: 0 pages used for memmap
On node 1 totalpages: 1048576
  DMA zone: 0 pages used for memmap
  DMA32 zone: 0 pages used for memmap
  Normal zone: 14336 pages used for memmap
  Normal zone: 1034240 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x5008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x03] address[0xfebff000] gsi_base[24])
IOAPIC[1]: apic_id 3, address 0xfebff000, GSI 24-27
ACPI: IOAPIC (id[0x04] address[0xfebfe000] gsi_base[28])
IOAPIC[2]: apic_id 4, address 0xfebfe000, GSI 28-31
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009f000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000fbff0000 - 00000000fbfff000
Nosave address range: 00000000fbfff000 - 00000000fc000000
Nosave address range: 00000000fc000000 - 00000000ff780000
Nosave address range: 00000000ff780000 - 0000000100000000
Allocating PCI resources starting at fc400000 (gap: fc000000:3780000)
PERCPU: Allocating 25728 bytes of per cpu data
Built 2 zonelists.  Total pages: 2050100
Kernel command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel 
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Checking aperture...
CPU 0: aperture @ f4cc000000 size 32 MB
Aperture too small (32 MB)
No AGP bridge found
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 8000000
page address ffff8100fbef0000
Bad page state in process 'swapper'
page:ffff810003faf480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:

Call Trace:
 [<ffffffff8014f200>] bad_page+0x94/0xbe
 [<ffffffff8014f6dd>] __free_pages_ok+0x78/0xf9
 [<ffffffff805cd83c>] free_all_bootmem_core+0xce/0x1c2
 [<ffffffff805cad5d>] numa_free_all_bootmem+0x39/0x78
 [<ffffffff805ca603>] mem_init+0x59/0x16c
 [<ffffffff805bb75c>] start_kernel+0x165/0x1e7
 [<ffffffff805bb195>] x86_64_start_kernel+0x12b/0x130

Memory: 8122880k/8388608k available (3184k kernel code, 199740k reserved, 1490k data, 2612k init)
Calibrating delay using timer specific routine.. 4782.31 BogoMIPS (lpj=9564629)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 0/0 -> Node 0
Freeing SMP alternatives: 32k freed
ACPI: Core revision 20060707
Using local APIC timer interrupts.
result 12441507
Detected 12.441 MHz APIC timer.
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4777.69 BogoMIPS (lpj=9555388)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1/1 -> Node 1
AMD Opteron(tm) Processor 250 stepping 0a
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -177 cycles, maxerr 928 cycles)
Brought up 2 CPUs
testing NMI watchdog ... OK.
Disabling vsyscall due to use of PM timer
time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
time.c: Detected 2388.767 MHz processor.
migration_cost=574
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:03:06.0
PCI: Firmware left 0000:03:08.0 e100 interrupts enabled, disabling
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLB._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
AMD768 RNG detected
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 8000000 size 65536 KB
PCI-DMA: using GART IOMMU.
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
PCI: Bridge: 0000:00:06.0
  IO window: a000-bfff
  MEM window: fc900000-feafffff
  PREFETCH window: disabled.
PCI: Bridge: 0000:00:0a.0
  IO window: 9000-9fff
  MEM window: fc600000-fc8fffff
  PREFETCH window: ff500000-ff5fffff
PCI: Bridge: 0000:00:0b.0
  IO window: disabled.
  MEM window: disabled.
  PREFETCH window: disabled.
NET: Registered protocol family 2
IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
microcode: CPU0 not a capable Intel processor
microcode: CPU1 not a capable Intel processor
IA-32 Microcode Update Driver: v1.14a <tigran@veritas.com>
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ACPI: Processor [CPU1] (supports 8 throttling states)
ACPI: Getting cpuindex for acpiid 0x3
ACPI: Getting cpuindex for acpiid 0x4
Real Time Clock Driver v1.12ac
Linux agpgart interface v0.101 (c) Dave Jones
ipmi message handler version 39.0
ipmi device interface
IPMI System Interface driver.
ipmi_si: Unable to find any System Interface(s)
IPMI Watchdog: driver initialized
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
loop: loaded (max 8 devices)
Intel(R) PRO/1000 Network Driver - version 7.2.9-k4
Copyright (c) 1999-2006 Intel Corporation.
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others
ACPI: PCI Interrupt 0000:03:08.0[A] -> GSI 18 (level, low) -> IRQ 18
eth0: 0000:03:08.0, 00:E0:81:2E:78:F7, IRQ 18.
  Board assembly 567812-052, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0xd0a6c714).
e100: Intel(R) PRO/100 Network Driver, 3.5.17-k2-NAPI
e100: Copyright(c) 1999-2006 Intel Corporation
tg3.c:v3.69 (November 15, 2006)
ACPI: PCI Interrupt 0000:02:09.0[A] -> GSI 24 (level, low) -> IRQ 24
eth1: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:26
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] 
eth1: dma_rwctrl[769f4000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:02:09.1[B] -> GSI 25 (level, low) -> IRQ 25
eth2: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:27
eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] 
eth2: dma_rwctrl[769f4000] dma_mask[64-bit]
Linux video capture interface: v2.00
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD8111: IDE controller at PCI slot 0000:00:07.1
AMD8111: chipset revision 3
AMD8111: not 100% native mode: will probe irqs later
AMD8111: 0000:00:07.1 (rev 03) UDMA133 controller
    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:pio, hdb:pio
    ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
Probing IDE interface ide0...
Probing IDE interface ide1...
ACPI: PCI Interrupt 0000:02:06.0[A] -> GSI 24 (level, low) -> IRQ 24
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs

scsi 0:0:0:0: Direct-Access     FUJITSU  MAT3073NP        0105 PQ: 0 ANSI: 3
 target0:0:0: asynchronous
scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
 target0:0:0: Beginning Domain Validation
 target0:0:0: wide asynchronous
 target0:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127)
 target0:0:0: Ending Domain Validation
scsi 0:0:1:0: Direct-Access     FUJITSU  MAT3073NP        0105 PQ: 0 ANSI: 3
 target0:0:1: asynchronous
scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
 target0:0:1: Beginning Domain Validation
 target0:0:1: wide asynchronous
 target0:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127)
 target0:0:1: Ending Domain Validation
ACPI: PCI Interrupt 0000:02:06.1[B] -> GSI 25 (level, low) -> IRQ 25
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs

3ware Storage Controller device driver for Linux v1.26.02.001.
3ware 9000 Storage Controller device driver for Linux v2.26.02.008.
SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB)
sda: Write Protect is off
sda: Mode Sense: b3 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB)
sda: Write Protect is off
sda: Mode Sense: b3 00 00 08
SCSI device sda: drive cache: write back
 sda: sda1 sda2
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB)
sdb: Write Protect is off
sdb: Mode Sense: b3 00 00 08
SCSI device sdb: drive cache: write back
SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB)
sdb: Write Protect is off
sdb: Mode Sense: b3 00 00 08
SCSI device sdb: drive cache: write back
 sdb: sdb1 sdb2
sd 0:0:1:0: Attached scsi disk sdb
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:1:0: Attached scsi generic sg1 type 0
Fusion MPT base driver 3.04.02
Copyright (c) 1999-2005 LSI Logic Corporation
Fusion MPT SPI Host driver 3.04.02
usbmon: debugfs is not available
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt 0000:03:00.0[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:03:00.0: OHCI Host Controller
ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:03:00.0: irq 19, io mem 0xfeafc000
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
ACPI: PCI Interrupt 0000:03:00.1[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:03:00.1: OHCI Host Controller
ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 2
ohci_hcd 0000:03:00.1: irq 19, io mem 0xfeafd000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
USB Universal Host Controller Interface driver v3.0
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
/.amd_mnt/huangho/export/kwaid0/home/maan/scm/torvalds/linux-2.6/drivers/usb/input/hid-core.c: v2.6:USB HID core driver
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input0
md: raid0 personality registered for level 0
md: multipath personality registered for level -4
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 15
CCID: Registered CCID 3 (ccid3)
CCID: Registered CCID 2 (ccid2)
SCTP: Hash tables configured (established 65536 bind 65536)
powernow-k8: Found 2 AMD Opteron(tm) Processor 250 processors (version 2.00.00)
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
PM: Writing back config space on device 0000:02:09.0 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.0 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.0 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.0 at offset 1 (was 2b00000, writing 2b00146)
PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106)
Sending DHCP requests .<6>tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is on for TX and on for RX.
., OK
IP-Config: Got DHCP answer from 192.168.1.254, my address is 192.168.1.120
PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106)
IP-Config: Complete:
      device=eth1, addr=192.168.1.120, mask=255.255.0.0, gw=192.168.1.254,
     host=node120, domain=, nis-domain=(none),
     bootserver=192.168.1.254, rootserver=192.168.1.254, rootpath=
Freeing unused kernel memory: 2612k freed
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: md0 stopped.
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: bind<sda2>
md: bind<sdb2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0:   comparing sdb2(55038592) with sdb2(55038592)
raid0:   END
raid0:   ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0:   comparing sda2(55038592) with sdb2(55038592)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 110077184 blocks.
raid0 : conf->hash_spacing is 110077184 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: md0 stopped.
md: unbind<sdb2>
md: export_rdev(sdb2)
md: unbind<sda2>
md: export_rdev(sda2)
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: bind<sda2>
md: bind<sdb2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0:   comparing sdb2(55038592) with sdb2(55038592)
raid0:   END
raid0:   ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0:   comparing sda2(55038592) with sdb2(55038592)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 110077184 blocks.
raid0 : conf->hash_spacing is 110077184 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
Adding 16779852k swap on /dev/sda1.  Priority:42 extents:1 across:16779852k
Adding 16779852k swap on /dev/sdb1.  Priority:42 extents:1 across:16779852k
warning: process `sensors' used the removed sysctl system call with 7.2.1.
warning: process `sensors' used the removed sysctl system call with 7.2.1.
process `syslogd' is using obsolete setsockopt SO_BSDCOMPAT

-- 
The only person who always got his work done by Friday was Robinson Crusoe



* Re: 2.6.19-rc6: known regressions (v4)
  2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
                     ` (2 preceding siblings ...)
  2006-11-22 10:42   ` [discuss] " Andi Kleen
@ 2006-11-23  0:04   ` David Brownell
  3 siblings, 0 replies; 36+ messages in thread
From: David Brownell @ 2006-11-23  0:04 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Alexey Starikovskiy, Andrew Morton, Len Brown, Linus Torvalds,
	linux-acpi, Linux Kernel Mailing List

On Tuesday 21 November 2006 1:24 pm, Adrian Bunk wrote:

> Subject    : ACPI: AE_TIME errors
> References : http://lkml.org/lkml/2006/11/15/12
> Submitter  : David Brownell <david-b@pacbell.net>
> Handled-By : Len Brown <len.brown@intel.com>
>              Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
> Status     : problem is being debugged

I've not seen this in over 3 days now, and am willing to believe that
the previous instance (after manually reverting the patch identified
by Linus) was a fluke ... it's certainly not the critical/blocking kind
of issue it had previously been.

- Dave


* 2.6.19-rc6: known regressions with patches available
  2006-11-16  4:21 Linux 2.6.19-rc6 Linus Torvalds
                   ` (4 preceding siblings ...)
  2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
@ 2006-11-23  0:54 ` Adrian Bunk
  2006-11-23  1:08   ` Andrew Morton
  5 siblings, 1 reply; 36+ messages in thread
From: Adrian Bunk @ 2006-11-23  0:54 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton
  Cc: Linux Kernel Mailing List, Randy Dunlap, Roman Zippel,
	Phil Oester, Sam Ravnborg

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
with patches available.

The first issue (for an unknown reason it never occurred before - it seems some 
random Kconfig change has triggered this latent bug) seems to have the 
potential of affecting more users.

The second issue is so exotic that I wouldn't have listed it if there 
was no patch, but considering that the patch looks safe I don't see why 
this regression shouldn't be fixed in 2.6.19.


Subject    : xconfig crashes on x86_64
References : http://lkml.org/lkml/2006/11/19/177
Submitter  : Randy Dunlap <randy.dunlap@oracle.com>
Handled-By : Roman Zippel <zippel@linux-m68k.org>
Patch      : http://lkml.org/lkml/2006/11/20/340
Status     : patch available


Subject    : menuconfig problems with TERM=vt100
References : http://lkml.org/lkml/2006/11/13/369
Submitter  : Phil Oester <kernel@linuxace.com>
Caused-By  : Sam Ravnborg <sam@ravnborg.org>
             commit 350b5b76384e77bcc58217f00455fdbec5cac594
Handled-By : Roman Zippel <zippel@linux-m68k.org>
Patch      : http://lkml.org/lkml/2006/11/20/341
Status     : patch available



* Re: 2.6.19-rc6: known regressions with patches available
  2006-11-23  0:54 ` 2.6.19-rc6: known regressions with patches available Adrian Bunk
@ 2006-11-23  1:08   ` Andrew Morton
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2006-11-23  1:08 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Linus Torvalds, Linux Kernel Mailing List, Randy Dunlap,
	Roman Zippel, Phil Oester, Sam Ravnborg

On Thu, 23 Nov 2006 01:54:57 +0100
Adrian Bunk <bunk@stusta.de> wrote:

> This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
> with patches available.
> 
> The first issue (for an unknown reason it never occurred before - it seems some 
> random Kconfig change has triggered this latent bug) seems to have the 
> potential of affecting more users.
> 
> The second issue is so exotic that I wouldn't have listed it if there 
> was no patch, but considering that the patch looks safe I don't see why 
> this regression shouldn't be fixed in 2.6.19.
> 
> 
> Subject    : xconfig crashes on x86_64
> References : http://lkml.org/lkml/2006/11/19/177
> Submitter  : Randy Dunlap <randy.dunlap@oracle.com>
> Handled-By : Roman Zippel <zippel@linux-m68k.org>
> Patch      : http://lkml.org/lkml/2006/11/20/340
> Status     : patch available
> 
> 
> Subject    : menuconfig problems with TERM=vt100
> References : http://lkml.org/lkml/2006/11/13/369
> Submitter  : Phil Oester <kernel@linuxace.com>
> Caused-By  : Sam Ravnborg <sam@ravnborg.org>
>              commit 350b5b76384e77bcc58217f00455fdbec5cac594
> Handled-By : Roman Zippel <zippel@linux-m68k.org>
> Patch      : http://lkml.org/lkml/2006/11/20/341
> Status     : patch available

I have both these queued for 2.6.19, thanks.


* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-22 17:42       ` Andre Noll
@ 2006-11-23 12:01         ` Mel Gorman
  2006-11-23 13:08           ` Andre Noll
  2006-11-23 19:09           ` Andrew Morton
  0 siblings, 2 replies; 36+ messages in thread
From: Mel Gorman @ 2006-11-23 12:01 UTC (permalink / raw)
  To: Andre Noll
  Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, David Rientjes

On (22/11/06 18:42), Andre Noll didst pronounce:
> On 15:52, Mel Gorman wrote:
> 
> > Right, so I took a closer look to see what the story was.
> 
> Thanks a lot, Mel.
> 

Thank you for getting back promptly.

> > Bootmem setup node 0 0000000000000000-00000000fc000000
> > Bootmem setup node 1 0000000100000000-0000000200000000
> > 
> > That's node 0 PFN 0->1032192 and node 1 PFN 1048576->2097152.
> > 
> > That is showing an additional 16 page frames that are not in the E820 map
> > (although I have seen this before and it didn't show up as a bad page). I
> > would be very interested in finding out what the bad_page PFNs are if this
> > bug still exists to see if it is those 16 frames. I've included a patch
> > below that might help.
> > 
> > Andre, if the bug still exists for you, can you apply Andi's patch to
> > reduce the log size and the following patch please and post us the
> > output with loglevel=8 please? Thanks
> 
> Done. Here's the output of dmesg with your and Andi's patch applied.
>

ahhh, I believe I see the problem now. Please try out the following patch.

====

find_min_pfn_for_node() and find_min_pfn_with_active_regions() both depend
on a sorted early_node_map[]. However, sort_node_map() is being called after
find_min_pfn_with_active_regions() in free_area_init_nodes(). In most cases,
this is ok, but on at least one x86_64 machine, the SRAT table caused the E820 ranges
to be registered out of order. This gave the wrong values for the min PFN
range resulting in some pages not being initialised.

This patch sorts the early_node_map in find_min_pfn_for_node(). It has
been boot tested on x86, x86_64, ppc64 and ia64.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>

diff -rup linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c
--- linux-2.6.19-rc6-clean/mm/page_alloc.c	2006-11-15 20:03:40.000000000 -0800
+++ linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c	2006-11-23 02:23:57.000000000 -0800
@@ -2612,6 +2612,9 @@ unsigned long __init find_min_pfn_for_no
 {
 	int i;
 
+	/* Regions in the early_node_map can be in any order */
+	sort_node_map();
+
 	/* Assuming a sorted map, the first range found has the starting pfn */
 	for_each_active_range_index_in_nid(i, nid)
 		return early_node_map[i].start_pfn;
@@ -2680,9 +2683,6 @@ void __init free_area_init_nodes(unsigne
 			max(max_zone_pfn[i], arch_zone_lowest_possible_pfn[i]);
 	}
 
-	/* Regions in the early_node_map can be in any order */
-	sort_node_map();
-
 	/* Print out the zone ranges */
 	printk("Zone PFN ranges:\n");
 	for (i = 0; i < MAX_NR_ZONES; i++)
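
To illustrate the failure mode described above, here is a minimal userspace
sketch. It is not the kernel code: the struct layout, the function signatures
and the PFN values in the usage note are assumptions made up for illustration,
and cmp_node_active_region() and sort_node_map() are simplified stand-ins that
only borrow the kernel names. It shows why a "first range found" lookup only
gives the lowest PFN once the map is sorted.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for one early_node_map[] entry (assumed layout). */
struct node_active_region {
	unsigned long start_pfn;
	unsigned long end_pfn;
	int nid;
};

/* Order regions by ascending start_pfn. */
int cmp_node_active_region(const void *a, const void *b)
{
	const struct node_active_region *x = a;
	const struct node_active_region *y = b;

	if (x->start_pfn < y->start_pfn)
		return -1;
	if (x->start_pfn > y->start_pfn)
		return 1;
	return 0;
}

/* Userspace stand-in for sort_node_map(): sort the whole map in place. */
void sort_node_map(struct node_active_region *map, size_t n)
{
	qsort(map, n, sizeof(*map), cmp_node_active_region);
}

/* Return the start_pfn of the first range for nid; correct only if sorted. */
unsigned long first_match_min_pfn(const struct node_active_region *map,
				  size_t n, int nid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (map[i].nid == nid)
			return map[i].start_pfn;
	return 0;
}

/* Shape of the fix: sort first, then take the first range for nid. */
unsigned long sorted_min_pfn(struct node_active_region *map, size_t n, int nid)
{
	sort_node_map(map, n);
	return first_match_min_pfn(map, n, nid);
}
```

With a hypothetical map whose node 0 ranges were registered out of order, say
{256, 1032192, 0}, {0, 159, 0}, {1048576, 2097152, 1}, first_match_min_pfn()
reports 256 for node 0 while sorted_min_pfn() reports 0.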


* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-23 12:01         ` Mel Gorman
@ 2006-11-23 13:08           ` Andre Noll
  2006-11-23 13:28             ` Mel Gorman
  2006-11-23 19:09           ` Andrew Morton
  1 sibling, 1 reply; 36+ messages in thread
From: Andre Noll @ 2006-11-23 13:08 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, David Rientjes


On 12:01, Mel Gorman wrote:

> > > Andre, if the bug still exists for you, can you apply Andi's patch to
> > > reduce the log size and the following patch please and post us the
> > > output with loglevel=8 please? Thanks
> > 
> > Done. Here's the output of dmesg with your and Andi's patch applied.
> >
> 
> ahhh, I believe I see the problem now. Please try out the following patch.

[...]

> This patch sorts the early_node_map in find_min_pfn_for_node(). It has
> been boot tested on x86, x86_64, ppc64 and ia64.

That did the trick, you're the man!

Thanks a lot
Andre

-- 
The only person who always got his work done by Friday was Robinson Crusoe



* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-23 13:08           ` Andre Noll
@ 2006-11-23 13:28             ` Mel Gorman
  0 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2006-11-23 13:28 UTC (permalink / raw)
  To: Andre Noll
  Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton,
	Linux Kernel Mailing List, David Rientjes

On Thu, 23 Nov 2006, Andre Noll wrote:

> On 12:01, Mel Gorman wrote:
>
>>>> Andre, if the bug still exists for you, can you apply Andi's patch to
>>>> reduce the log size and the following patch please and post us the
>>>> output with loglevel=8 please? Thanks
>>>
>>> Done. Here's the output of dmesg with your and Andi's patch applied.
>>>
>>
>> ahhh, I believe I see the problem now. Please try out the following patch.
>
> [...]
>
>> This patch sorts the early_node_map in find_min_pfn_for_node(). It has
>> been boot tested on x86, x86_64, ppc64 and ia64.
>
> That did the trick, you're the man!
>

heh, I was also the problem. Thanks a lot for reporting and testing.


-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-23 12:01         ` Mel Gorman
  2006-11-23 13:08           ` Andre Noll
@ 2006-11-23 19:09           ` Andrew Morton
  2006-11-23 21:55             ` Mel Gorman
  1 sibling, 1 reply; 36+ messages in thread
From: Andrew Morton @ 2006-11-23 19:09 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andre Noll, Andi Kleen, discuss, Adrian Bunk, Linus Torvalds,
	Linux Kernel Mailing List, David Rientjes

On Thu, 23 Nov 2006 12:01:41 +0000
mel@skynet.ie (Mel Gorman) wrote:

> find_min_pfn_for_node() and find_min_pfn_with_active_regions() both depend
> on a sorted early_node_map[]. However, sort_node_map() is being called after
> find_min_pfn_with_active_regions() in free_area_init_nodes(). In most cases,
> this is ok, but on at least one x86_64 machine, the SRAT table caused the E820 ranges
> to be registered out of order. This gave the wrong values for the min PFN
> range resulting in some pages not being initialised.
> 
> This patch sorts the early_node_map in find_min_pfn_for_node(). It has
> been boot tested on x86, x86_64, ppc64 and ia64.
> 
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> 
> diff -rup linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c
> --- linux-2.6.19-rc6-clean/mm/page_alloc.c	2006-11-15 20:03:40.000000000 -0800
> +++ linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c	2006-11-23 02:23:57.000000000 -0800
> @@ -2612,6 +2612,9 @@ unsigned long __init find_min_pfn_for_no
>  {
>  	int i;
>  
> +	/* Regions in the early_node_map can be in any order */
> +	sort_node_map();
> +
>  	/* Assuming a sorted map, the first range found has the starting pfn */
>  	for_each_active_range_index_in_nid(i, nid)
>  		return early_node_map[i].start_pfn;
> @@ -2680,9 +2683,6 @@ void __init free_area_init_nodes(unsigne
>  			max(max_zone_pfn[i], arch_zone_lowest_possible_pfn[i]);
>  	}
>  
> -	/* Regions in the early_node_map can be in any order */
> -	sort_node_map();
> -
>  	/* Print out the zone ranges */
>  	printk("Zone PFN ranges:\n");
>  	for (i = 0; i < MAX_NR_ZONES; i++)

Doesn't this mean that we can sort that map multiple times?

Seems a bit ... ungainly?


* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-23 19:09           ` Andrew Morton
@ 2006-11-23 21:55             ` Mel Gorman
  2006-11-24  9:51               ` Andre Noll
  2006-11-24  9:58               ` Andi Kleen
  0 siblings, 2 replies; 36+ messages in thread
From: Mel Gorman @ 2006-11-23 21:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andre Noll, Andi Kleen, discuss, Adrian Bunk, Linus Torvalds,
	Linux Kernel Mailing List, David Rientjes

On (23/11/06 11:09), Andrew Morton didst pronounce:
> On Thu, 23 Nov 2006 12:01:41 +0000
> mel@skynet.ie (Mel Gorman) wrote:
> 
> > find_min_pfn_for_node() and find_min_pfn_with_active_regions() both depend
> > on a sorted early_node_map[]. However, sort_node_map() is being called after
> > find_min_pfn_with_active_regions() in free_area_init_nodes(). In most cases,
> > this is ok, but on at least one x86_64 machine, the SRAT table caused the E820 ranges
> > to be registered out of order. This gave the wrong values for the min PFN
> > range resulting in some pages not being initialised.
> > 
> > This patch sorts the early_node_map in find_min_pfn_for_node(). It has
> > been boot tested on x86, x86_64, ppc64 and ia64.
> > 
> > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> > 
> > diff -rup linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c
> > --- linux-2.6.19-rc6-clean/mm/page_alloc.c	2006-11-15 20:03:40.000000000 -0800
> > +++ linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c	2006-11-23 02:23:57.000000000 -0800
> > @@ -2612,6 +2612,9 @@ unsigned long __init find_min_pfn_for_no
> >  {
> >  	int i;
> >  
> > +	/* Regions in the early_node_map can be in any order */
> > +	sort_node_map();
> > +
> >  	/* Assuming a sorted map, the first range found has the starting pfn */
> >  	for_each_active_range_index_in_nid(i, nid)
> >  		return early_node_map[i].start_pfn;
> > @@ -2680,9 +2683,6 @@ void __init free_area_init_nodes(unsigne
> >  			max(max_zone_pfn[i], arch_zone_lowest_possible_pfn[i]);
> >  	}
> >  
> > -	/* Regions in the early_node_map can be in any order */
> > -	sort_node_map();
> > -
> >  	/* Print out the zone ranges */
> >  	printk("Zone PFN ranges:\n");
> >  	for (i = 0; i < MAX_NR_ZONES; i++)
> 
> Doesn't this mean that we can sort that map multiple times?
> 

yes, once per active node.
          
> Seems a bit ... ungainly?
>


It is, but this late in the cycle, I was going for the
obviously-correct-and-will-definitely-work solution.

It would be sufficient to call sort_node_map() in
find_min_pfn_with_active_regions(), but I was worried that someone might call
find_min_pfn_for_node() directly at some future time, causing another fun bug.

A slightly smarter, but not quite as obviously correct, patch is below if
you prefer it. It removes the assumption about early_node_map being sorted
for find_min_pfns and friends by always searching the whole map.  The map
is then only sorted once when it is required. Andre, I'd appreciate it if
you could give it a spin to be 100% sure it's ok. It passed a boot-test on
a few machines here.

===========

find_min_pfn_for_node() and find_min_pfn_with_active_regions() both
depend on a sorted early_node_map[] to find the correct values. However,
sort_node_map() is being called after find_min_pfn_with_active_regions()
in free_area_init_nodes(). In most cases, this is ok, but on an x86_64 machine,
the SRAT table caused the E820 ranges to be registered out of order. This gave
the wrong values for the min PFN range resulting in some pages not being
initialised.

This patch works by always searching the whole early_node_map[] in
find_min_pfn_for_node().

Signed-off-by: Mel Gorman <mel@csn.ul.ie>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.19-rc5-mm2-clean/mm/page_alloc.c linux-2.6.19-rc5-mm2-sort_in_find_min/mm/page_alloc.c
--- linux-2.6.19-rc5-mm2-clean/mm/page_alloc.c	2006-11-14 14:01:37.000000000 +0000
+++ linux-2.6.19-rc5-mm2-sort_in_find_min/mm/page_alloc.c	2006-11-23 20:37:18.000000000 +0000
@@ -2945,17 +2945,22 @@ static void __init sort_node_map(void)
 			cmp_node_active_region, NULL);
 }
 
-/* Find the lowest pfn for a node. This depends on a sorted early_node_map */
+/* Find the lowest pfn for a node */
 unsigned long __init find_min_pfn_for_node(unsigned long nid)
 {
 	int i;
+	unsigned long min_pfn = -1UL;
 
 	/* Assuming a sorted map, the first range found has the starting pfn */
 	for_each_active_range_index_in_nid(i, nid)
-		return early_node_map[i].start_pfn;
+		min_pfn = min(min_pfn, early_node_map[i].start_pfn);
 
-	printk(KERN_WARNING "Could not find start_pfn for node %lu\n", nid);
-	return 0;
+	if (min_pfn == -1UL) {
+		printk(KERN_WARNING "Could not find start_pfn for node %lu\n", nid);
+		return 0;
+	}
+	
+	return min_pfn;
 }
 
 /**
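
For comparison, the order-independent lookup that this second patch goes for
can be sketched in userspace as below. This is only an illustration under
assumptions: the struct layout and the function name scan_min_pfn() are made
up here, and ULONG_MAX plays the role of the -1UL sentinel in the diff. The
point is that scanning every range for the node and keeping the minimum gives
the same answer whether or not the map has been sorted yet.

```c
#include <limits.h>
#include <stddef.h>

/* Simplified stand-in for one early_node_map[] entry (assumed layout). */
struct node_active_region {
	unsigned long start_pfn;
	unsigned long end_pfn;
	int nid;
};

/*
 * Hypothetical helper: return the lowest start_pfn registered for nid,
 * looking at every entry so that the order of the map does not matter.
 * Returns 0 if nid has no ranges (the patch also prints a warning there).
 */
unsigned long scan_min_pfn(const struct node_active_region *map,
			   size_t n, int nid)
{
	unsigned long min_pfn = ULONG_MAX;	/* sentinel, like -1UL */
	size_t i;

	for (i = 0; i < n; i++) {
		if (map[i].nid != nid)
			continue;
		if (map[i].start_pfn < min_pfn)
			min_pfn = map[i].start_pfn;
	}

	if (min_pfn == ULONG_MAX)
		return 0;

	return min_pfn;
}
```

The trade-off is the one discussed in this thread: no repeated sorting, at the
cost of a lookup that is slightly less obviously correct than "sort, then take
the first range".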


* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-23 21:55             ` Mel Gorman
@ 2006-11-24  9:51               ` Andre Noll
  2006-11-24  9:58               ` Andi Kleen
  1 sibling, 0 replies; 36+ messages in thread
From: Andre Noll @ 2006-11-24  9:51 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Andi Kleen, discuss, Adrian Bunk, Linus Torvalds,
	Linux Kernel Mailing List, David Rientjes


On 21:55, Mel Gorman wrote:

> A slightly smarter, but not quite as obviously correct, patch is below if
> you prefer it. It removes the assumption about early_node_map being sorted
> for find_min_pfns and friends by always searching the whole map.  The map
> is then only sorted once when it is required. Andre, I'd appreciate it if
> you could give it a spin to be 100% sure it's ok. It passed a boot-test on
> a few machines here.

Yes, this one also works for me.

Acked-by: Andre Noll <maan@systemlinux.org>
-- 
The only person who always got his work done by Friday was Robinson Crusoe



* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-23 21:55             ` Mel Gorman
  2006-11-24  9:51               ` Andre Noll
@ 2006-11-24  9:58               ` Andi Kleen
  2006-11-24 20:43                 ` Andrew Morton
  1 sibling, 1 reply; 36+ messages in thread
From: Andi Kleen @ 2006-11-24  9:58 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Andre Noll, discuss, Adrian Bunk, Linus Torvalds,
	Linux Kernel Mailing List, David Rientjes


> A slightly smarter, but not quite as obviously correct, 

I think it's better to go for the "obviously correct" approach right now.
And sorting multiple times should be fine.

-Andi


* Re: [discuss] 2.6.19-rc6: known regressions (v4)
  2006-11-24  9:58               ` Andi Kleen
@ 2006-11-24 20:43                 ` Andrew Morton
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2006-11-24 20:43 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mel Gorman, Andre Noll, discuss, Adrian Bunk, Linus Torvalds,
	Linux Kernel Mailing List, David Rientjes

On Fri, 24 Nov 2006 10:58:55 +0100
Andi Kleen <ak@suse.de> wrote:

> 
> > A slightly smarter, but not quite as obviously correct, 
> 
> I think it's better to go for the "obviously correct" approach right now.
> And sorting multiple times should be fine.
> 

yup, that's what I'd decided.


end of thread

Thread overview: 36+ messages
2006-11-16  4:21 Linux 2.6.19-rc6 Linus Torvalds
2006-11-16 21:37 ` 2.6.19-rc6: known regressions Adrian Bunk
2006-11-16 21:43   ` Greg KH
2006-11-17 20:40 ` 2.6.19-rc6: known regressions (v2) Adrian Bunk
2006-11-18  8:02   ` [PATCH] mm: do not call bad_page on PG_reserved check David Rientjes
2006-11-18 13:37     ` Hugh Dickins
2006-11-18  4:04 ` Linux 2.6.19-rc6 - NFSD working again Christian Kujau
2006-11-20 19:53 ` 2.6.19-rc6: known regressions (v3) Adrian Bunk
2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
2006-11-21 21:31   ` [discuss] " Dave Jones
2006-11-21 21:39     ` Adrian Bunk
2006-11-21 21:56       ` Dave Jones
2006-11-21 21:33   ` Vivek Goyal
2006-11-21 21:41     ` Adrian Bunk
2006-11-21 22:18     ` Linus Torvalds
2006-11-22  9:44       ` Pavel Emelianov
2006-11-22 14:58         ` Vivek Goyal
2006-11-22 17:28         ` Linus Torvalds
2006-11-22 10:42   ` [discuss] " Andi Kleen
2006-11-22 15:52     ` Mel Gorman
2006-11-22 17:42       ` Andre Noll
2006-11-23 12:01         ` Mel Gorman
2006-11-23 13:08           ` Andre Noll
2006-11-23 13:28             ` Mel Gorman
2006-11-23 19:09           ` Andrew Morton
2006-11-23 21:55             ` Mel Gorman
2006-11-24  9:51               ` Andre Noll
2006-11-24  9:58               ` Andi Kleen
2006-11-24 20:43                 ` Andrew Morton
2006-11-22 16:05     ` Andre Noll
2006-11-22 17:03       ` Mel Gorman
2006-11-22 17:08       ` Andi Kleen
2006-11-22 18:00         ` Andre Noll
2006-11-23  0:04   ` David Brownell
2006-11-23  0:54 ` 2.6.19-rc6: known regressions with patches available Adrian Bunk
2006-11-23  1:08   ` Andrew Morton
