* Linux 2.6.19-rc6
From: Linus Torvalds @ 2006-11-16 4:21 UTC
To: Linux Kernel Mailing List
Ok,
there's nothing earth-shattering here (and there shouldn't be), but we've
hopefully made good progress on the regression list (and thanks again to
Adrian Bunk for reminding people, especially when they thought *cough*
that some particular regression had already been fixed)..
So with -rc6, we hopefully should leave the irq-related regressions behind
us. There were issues both with devices that started enabling MSI (which
seem to trigger hardware bugs, although there's also been discussion about
what we should do to make things safer) and with the new genirq layer that
showed problems with edge-triggered irq's (notably legacy ISA interrupts,
or, more commonly these days, the 16-bit PCMCIA interrupts that are
basically just ISA in another formfactor).
Thanks to everybody involved in whittling down that regression list.
Also, apart from the regression tracking, we've had some other updates, e.g.
infiniband and DVB fixes, network driver fixes, some networking fixes etc.
The ShortLog is appended, and gives a mostly readable picture of what has
been going on. But the main thing to take away is: regressions fixed, and
not a whole lot of changes since -rc5 (it may not look that way, but a lot
of these things are essentially one-liners or close to it, so the total
diff between -rc5 and -rc6 is actually just about 5k lines, which is not
a whole lot, considering).
Linus
---
Aaron Durbin (1):
x86-64: Fix partial page check to ensure unusable memory is not being marked usable.
Adrian Bunk (2):
bcm43xx: Add error checking in bcm43xx_sprom_write()
drivers/telephony/ixj: fix an array overrun
Alan Cox (1):
hpt37x: Check the enablebits
Alan Stern (1):
SCSI core: always store >= 36 bytes of INQUIRY data
Alasdair G Kergon (2):
dm: fix find_device race
dm: suspend: fix error path
Alexey Dobriyan (4):
ipmi_si_intf.c: fix "&& 0xff" typos
V4L/DVB (4795): Tda826x: use correct max frequency
V4L/DVB (4818): Flexcop-usb: fix debug printk
pata_artop: fix "& (1 >>" typo
Andi Kleen (6):
Revert "MMCONFIG and new Intel motherboards"
x86-64: Fix PTRACE_[SG]ET_THREAD_AREA regression with ia32 emulation.
x86-64: Handle reserve_bootmem_generic beyond end_pfn
x86: Add acpi_user_timer_override option for Asus boards
x86-64: Fix vgetcpu when CONFIG_HOTPLUG_CPU is disabled
x86-64: Fix race in exit_idle
Andrew Morton (2):
setup_irq(): better mismatch debugging
revert "PCI: quirk for IBM Dock II cardbus controllers"
Arjan van de Ven (1):
Regression in 2.6.19-rc microcode driver
Benjamin Herrenschmidt (2):
[POWERPC] Fix cell "new style" mapping and add debug
powerpc: windfarm shall request it's sub modules
Brian King (1):
libata: Convert from module_init to subsys_initcall
Bryan O'Sullivan (1):
IB/ipath - program intconfig register using new HT irq hook
Chris Lalancette (1):
[NETPOLL]: Compute checksum properly in netpoll_send_udp().
Corey Minyard (3):
IPMI: Clean up the waiting message queue properly on unload
IPMI: retry messages on certain error returns
IPMI: Fix more && typos
Daniel Ritz (1):
fix via586 irq routing for pirq 5
Darrick J. Wong (1):
libata: fix double-completion on error
David Brownell (1):
usb: MAINTAINERS updates
David Chinner (3):
[XFS] Clean up i_flags and i_flags_lock handling.
[XFS] Prevent a deadlock when xfslogd unpins inodes.
[XFS] Remove KERNEL_VERSION macros from xfs_dmapi.h
David Gibson (1):
hugetlb: check for brk() entering a hugepage region
David Miller (1):
pci: don't try to remove sysfs files before they are setup.
David Rientjes (1):
drivers cris: return on NULL dev_alloc_skb()
Eric Dumazet (1):
vmalloc: optimization, cleanup, bugfixes
Eric W. Biederman (4):
sysctl: Undeprecate sys_sysctl
htirq: refactor so we only have one function that writes to the chip
htirq: allow buggy drivers of buggy hardware to write the registers
Use delayed disable mode of ioapic edge triggered interrupts
Franck Bui-Huu (1):
.gitignore: add miscellaneous files
Geoff Levand (1):
[POWERPC] cell: set ARCH_SPARSEMEM_DEFAULT in Kconfig
Herbert Xu (1):
[NET]: Set truesize in pskb_copy
Hermann Pitton (1):
V4L/DVB (4802): Cx88: fix remote control on WinFast 2000XP Expert
Hoang-Nam Nguyen (3):
IB/ehca: Assure 4K alignment for firmware control blocks
IB/ehca: Use named constant for max mtu
IB/ehca: Activate scaling code by default
Hugh Dickins (2):
hugetlb: prepare_hugepage_range check offset too
hugetlb: fix error return for brk() entering a hugepage region
Ian Kent (1):
autofs4: panic after mount fail
J. Bruce Fields (3):
nfsd4: reindent do_open_lookup()
nfsd4: fix open-create permissions
nfsd: fix spurious error return from nfsd_create in async case
Jean Delvare (2):
V4L/DVB (4817): Fix uses of "&&" where "&" was intended
RDMA/amso1100: Fix && typo
Jeff Garzik (1):
[libata] sata_via: fix obvious typo
Jens Axboe (4):
Fix bad data direction in SG_IO
ide-cd: only set rq->errors SCSI style for block pc requests
cciss: fix iostat
cpqarray: fix iostat
Jes Sorensen (1):
mspec driver build fix
Jiri Slaby (2):
[NET]: kconfig, correct traffic shaper
Char: isicom, fix close bug
John Heffner (1):
[TCP]: Don't use highmem in tcp hash size calculation.
John Rose (1):
[POWERPC] pseries: Force 4k update_flash block and list sizes
Jonathan E Brassow (2):
dm: multipath: fix rr_add_path order
dm: raid1: fix waiting for io on suspend
Julian Anastasov (1):
[IPVS]: More endianness fixed.
Kalle Pokki (2):
[POWERPC] CPM_UART: Fix non-console transmit
[POWERPC] CPM_UART: Fix non-console initialisation
KAMEZAWA Hiroyuki (1):
ia64: select ACPI_NUMA if ACPI
Linus Torvalds (6):
Revert "i386: Add MMCFG resources to i386 too"
x86-64: clean up io-apic accesses
x86-64: write IO APIC irq routing entries in correct order
[dvb saa7134] Fix missing 'break' for avermedia card case
Revert "fix Data Acess error in dup_fd"
Linux 2.6.19-rc6
Magnus Damm (1):
x86-64: setup saved_max_pfn correctly (kdump)
Masami Hiramatsu (1):
kretprobe: fix kretprobe-booster to save regs and set status
Mauro Carvalho Chehab (1):
V4L/DVB (4804): Fix missing i2c dependency for saa7110
Michael Buesch (1):
bcm43xx: Drain TX status before starting IRQs
Michael Chan (1):
[TG3]: Fix array overrun in tg3_read_partno().
Nathan Lynch (1):
nvidiafb: fix unreachable code in nv10GetConfig
NeilBrown (2):
md: change ONLINE/OFFLINE events to a single CHANGE event
md: fix sizing problem with raid5-reshape and CONFIG_LBD=n
Nicolas Kaiser (1):
drivers/ide: stray bracket
Oleg Nesterov (1):
A minor fix for set_mb() in Documentation/memory-barriers.txt
pasky@ucw.cz (3):
V4L/DVB (4814): Remote support for Avermedia 777
V4L/DVB (4815): Remote support for Avermedia A16AR
V4L/DVB (4816): Change tuner type for Avermedia A16AR
Paul Mackerras (1):
[POWERPC] Make sure initrd and dtb sections get into zImage correctly
Pavel Emelianov (1):
Fix misrouted interrupts deadlocks
Peter Zijlstra (1):
bonding: lockdep annotation
Rafael J. Wysocki (1):
md: do not freeze md threads for suspend
Randy Dunlap (1):
com20020 build fix
Roland Dreier (1):
IB/mad: Fix race between cancel and receive completion
Russell King (1):
Fix missing parens in set_personality()
Sharyathi Nagesh (1):
fix Data Acess error in dup_fd
Simon Horman (1):
[IPVS]: Compile fix for annotations in userland.
Stephen Hemminger (1):
[PKT_SCHED] sch_htb: Use hlist_del_init().
Stephen Rothwell (2):
[POWERPC] Add the thread_siblings files to sysfs
[POWERPC] Wire up sys_move_pages
Steve French (4):
[CIFS] NFS stress test generates flood of "close with pending write" messages
[CIFS] Explicitly set stat->blksize
[CIFS] Fix mount failure when domain not specified
[CIFS] Fix minor problem with previous patch
Steven Rostedt (1):
x86-64: shorten the x86_64 boot setup GDT to what the comment says
Steven Whitehouse (1):
[DECNET]: Endianess fixes (try #2)
Takashi Iwai (1):
ALSA: hda-intel - Disable MSI support by default
Tigran Aivazian (1):
Tigran has moved
Tim Shimmin (1):
[XFS] Keep lockdep happy.
Timo Teras (2):
MMC: Poll card status after rescanning cards
MMC: Do not set unsupported bits in OCR response
Tom Tucker (1):
RDMA/amso1100: Fix unitialized pseudo_netdev accessed in c2_register_device
Vivek Goyal (1):
i386: Force data segment to be 4K aligned
Vlad Apostolov (3):
[XFS] 956618: Linux crashes on boot with XFS-DMAPI filesystem when
[XFS] rename uio_read() to xfs_uio_read()
[XFS] 956664: dm_read_invis() changes i_atime
Wink Saville (1):
Patch for nvidia divide by zero error for 7600 pci-express card
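
Several shortlog entries above fix the same class of typo (`&&` where `&` was intended, and `1 >>` where `1 <<` was presumably intended). The following is a minimal sketch of why those typos matter; the helper names and values are hypothetical and are not taken from the ipmi, pata_artop, V4L/DVB or amso1100 drivers.

```c
#include <stdint.h>

/* Hypothetical helper: meant to keep only the low byte of a status word. */
static inline uint32_t low_byte_buggy(uint32_t status)
{
	return status && 0xff;	/* logical AND: result is only ever 0 or 1 */
}

static inline uint32_t low_byte_fixed(uint32_t status)
{
	return status & 0xff;	/* bitwise AND: masks to the low 8 bits */
}

/* Hypothetical helper: meant to test whether bit n (n < 32) is set in reg. */
static inline int bit_set_buggy(uint32_t reg, unsigned int n)
{
	return (reg & (1u >> n)) != 0;	/* 1u >> n is 0 for every n >= 1 */
}

static inline int bit_set_fixed(uint32_t reg, unsigned int n)
{
	return (reg & (1u << n)) != 0;	/* 1u << n selects exactly bit n */
}
```

Both buggy variants compile without error, which is presumably why such typos survive until someone audits for them: the logical AND collapses any nonzero value to 1, and the right shift produces an empty mask for every bit except bit 0.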

* 2.6.19-rc6: known regressions
From: Adrian Bunk @ 2006-11-16 21:37 UTC
To: Linus Torvalds, Andrew Morton
Cc: Linux Kernel Mailing List, Ray Lee, Michael Buesch, Larry Finger, st3, linville, netdev, David Brownell, Len Brown, Alexey Starikovskiy, linux-acpi, Ernst Herzberg, Ingo Molnar, Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el, oprofile-list, Dennis Stosberg, Greg Kroah-Hartman, ecashin, Andrey Borzenkov, Alan Stern, linux-usb-devel

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : bcm43xx: serious problems
References : http://lkml.org/lkml/2006/11/15/296
Submitter  : Ray Lee <ray-lk@madrabbit.org>
Handled-By : Michael Buesch <mb@bu3sch.de>
             Larry Finger <Larry.Finger@lwfinger.net>
Status     : problem is being debugged


Subject    : nasty ACPI regression, AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter  : David Brownell <david-b@pacbell.net>
Handled-By : Len Brown <len.brown@intel.com>
             Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Status     : problem is being debugged


Subject    : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter  : Ernst Herzberg <earny@net4u.de>
Handled-By : Len Brown <len.brown@intel.com>
Status     : problem is being debugged


Subject    : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
             http://lkml.org/lkml/2006/11/10/208
Submitter  : Andre Noll <maan@systemlinux.org>
Handled-By : Andi Kleen <ak@suse.de>
Status     : Andi is investigating


Subject    : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
             http://lkml.org/lkml/2006/11/15/92
Submitter  : Prakash Punnoor <prakash@punnoor.de>
Status     : problem is being discussed


Subject    : x86_64 UP compile error
References : http://lkml.org/lkml/2006/11/16/29
Submitter  : Ingo Molnar <mingo@elte.hu>
Caused-By  : Andi Kleen <ak@suse.de>
             commit 8c131af1db510793f87dc43edbc8950a35370df3
Handled-By : Andi Kleen <ak@suse.de>
             Ingo Molnar <mingo@elte.hu>
Patch      : http://lkml.org/lkml/2006/11/16/36
Status     : patch available


Subject    : aoe: Add forgotten NULL at end of attribute list in aoeblk.c
References : http://lkml.org/lkml/2006/11/13/26
Submitter  : Dennis Stosberg <dennis@stosberg.net>
Caused-By  : Greg Kroah-Hartman <gregkh@suse.de>
             commit 4ca5224f3ea4779054d96e885ca9b3980801ce13
Handled-By : Dennis Stosberg <dennis@stosberg.net>
Patch      : http://lkml.org/lkml/2006/11/13/26
Status     : patch available


Subject    : can't disable OHCI wakeup via sysfs
References : http://lkml.org/lkml/2006/11/11/33
Submitter  : Andrey Borzenkov <arvidjaar@mail.ru>
Handled-By : Alan Stern <stern@rowland.harvard.edu>
Patch      : http://lkml.org/lkml/2006/11/13/261
Status     : patch available

* Re: 2.6.19-rc6: known regressions
From: Greg KH @ 2006-11-16 21:43 UTC
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, Ray Lee, Michael Buesch, Larry Finger, st3, linville, netdev, David Brownell, Len Brown, Alexey Starikovskiy, linux-acpi, Ernst Herzberg, Ingo Molnar, Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el, oprofile-list, Dennis Stosberg, ecashin, Andrey Borzenkov, Alan Stern, linux-usb-devel

On Thu, Nov 16, 2006 at 10:37:18PM +0100, Adrian Bunk wrote:
> Subject    : aoe: Add forgotten NULL at end of attribute list in aoeblk.c
> References : http://lkml.org/lkml/2006/11/13/26
> Submitter  : Dennis Stosberg <dennis@stosberg.net>
> Caused-By  : Greg Kroah-Hartman <gregkh@suse.de>
>              commit 4ca5224f3ea4779054d96e885ca9b3980801ce13
> Handled-By : Dennis Stosberg <dennis@stosberg.net>
> Patch      : http://lkml.org/lkml/2006/11/13/26
> Status     : patch available
>
>
> Subject    : can't disable OHCI wakeup via sysfs
> References : http://lkml.org/lkml/2006/11/11/33
> Submitter  : Andrey Borzenkov <arvidjaar@mail.ru>
> Handled-By : Alan Stern <stern@rowland.harvard.edu>
> Patch      : http://lkml.org/lkml/2006/11/13/261
> Status     : patch available

I'll be sending Linus both of these patches later today.

thanks,

greg k-h

* 2.6.19-rc6: known regressions (v2)
From: Adrian Bunk @ 2006-11-17 20:40 UTC
To: Linus Torvalds, Andrew Morton
Cc: Linux Kernel Mailing List, Thomas Gleixner, Alan Stern, Ingo Molnar, davej, cpufreq, Alexey Starikovskiy, Mattia Dongili, Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el, oprofile-list, Ray Lee, Michael Buesch, Larry Finger, st3, linville, netdev, David Brownell, Len Brown, linux-acpi, Ernst Herzberg

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : cpufreq notification broken
References : http://lkml.org/lkml/2006/11/16/177
Submitter  : Thomas Gleixner <tglx@timesys.com>
Caused-By  : Alan Stern <stern@rowland.harvard.edu>
             commit b4dfdbb3c707474a2254c5b4d7e62be31a4b7da9
Handled-By : Ingo Molnar <mingo@elte.hu>
             Linus Torvalds <torvalds@osdl.org>
Status     : patches are being discussed


Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
References : http://lkml.org/lkml/2006/11/17/198
Submitter  : alex1000@comcast.net
Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
             commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
Handled-By : Mattia Dongili <malattia@linux.it>
Patch      : http://lkml.org/lkml/2006/11/17/236
Status     : patch available


Subject    : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
             http://lkml.org/lkml/2006/11/10/208
Submitter  : Andre Noll <maan@systemlinux.org>
Handled-By : Andi Kleen <ak@suse.de>
Status     : Andi is investigating


Subject    : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
             http://lkml.org/lkml/2006/11/15/92
Submitter  : Prakash Punnoor <prakash@punnoor.de>
Status     : problem is being discussed


Subject    : bcm43xx: serious problems
References : http://lkml.org/lkml/2006/11/15/296
Submitter  : Ray Lee <ray-lk@madrabbit.org>
Handled-By : Michael Buesch <mb@bu3sch.de>
             Larry Finger <Larry.Finger@lwfinger.net>
Status     : problem is being debugged


Subject    : nasty ACPI regression, AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter  : David Brownell <david-b@pacbell.net>
Handled-By : Len Brown <len.brown@intel.com>
             Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Status     : problem is being debugged


Subject    : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter  : Ernst Herzberg <earny@net4u.de>
Handled-By : Len Brown <len.brown@intel.com>
Status     : problem is being debugged

* [PATCH] mm: do not call bad_page on PG_reserved check
From: David Rientjes @ 2006-11-18 8:02 UTC
To: Linus Torvalds, Andrew Morton
Cc: Andi Kleen, Nick Piggin, Andre Noll, linux-kernel

The return value of free_pages_check() indicates if PG_reserved was set.
If so, the calling functions return immediately and no pages are freed so
there is no need to call bad_page().

Cc: Andi Kleen <ak@suse.de>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: David Rientjes <rientjes@cs.washington.edu>
---
 mm/page_alloc.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bf2f6cf..99bc29d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -439,7 +439,6 @@ static inline int free_pages_check(struc
 			1 << PG_slab	|
 			1 << PG_swapcache |
 			1 << PG_writeback |
-			1 << PG_reserved |
 			1 << PG_buddy ))))
 		bad_page(page);
 	if (PageDirty(page))

* Re: [PATCH] mm: do not call bad_page on PG_reserved check
From: Hugh Dickins @ 2006-11-18 13:37 UTC
To: David Rientjes
Cc: Linus Torvalds, Andrew Morton, Andi Kleen, Nick Piggin, Andre Noll, linux-kernel

On Sat, 18 Nov 2006, David Rientjes wrote:

> The return value of free_pages_check() indicates if PG_reserved was set.
> If so, the calling functions return immediately and no pages are freed so
> there is no need to call bad_page().
>
> Cc: Andi Kleen <ak@suse.de>
> Cc: Nick Piggin <npiggin@suse.de>
> Signed-off-by: David Rientjes <rientjes@cs.washington.edu>

NAK. You're missing the point. If an attempt is made to free a
reserved page, it implies that the page reference counting has gone
wrong: we want to hear about that (so call bad_page), and we dare
not reuse the page (so skip freeing it).

What might be a good change, is to avoid freeing a page which meets
_any_ of the criteria for calling bad_page: I often wonder whether to
do that, alongside abandoning that hopeless page_mapcount BUG in
page_remove_rmap, which has almost(?) never helped lead us to any fix.

Hugh

> ---
>  mm/page_alloc.c |    1 -
>  1 files changed, 0 insertions(+), 1 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index bf2f6cf..99bc29d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -439,7 +439,6 @@ static inline int free_pages_check(struc
>  			1 << PG_slab	|
>  			1 << PG_swapcache |
>  			1 << PG_writeback |
> -			1 << PG_reserved |
>  			1 << PG_buddy ))))
>  		bad_page(page);
>  	if (PageDirty(page))
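
The behaviour Hugh describes (report a reserved page that is being freed, and also refuse to reuse it) can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: the `PGF_*` flag values, `struct fake_page`, and all function names are hypothetical stand-ins for illustration and are not the kernel's definitions in mm/page_alloc.c.

```c
#include <stdio.h>

/* Hypothetical flag bits for this sketch; not the kernel's PG_* values. */
enum {
	PGF_SLAB      = 1 << 0,
	PGF_WRITEBACK = 1 << 1,
	PGF_RESERVED  = 1 << 2,
	PGF_BUDDY     = 1 << 3,
};

struct fake_page {
	unsigned int flags;
};

/* Counts how often a suspicious page was reported. */
static int bad_page_reports;

/* Stand-in for a bad_page()-style diagnostic: make the problem visible. */
static void report_bad_page(const struct fake_page *page)
{
	bad_page_reports++;
	fprintf(stderr, "bad page state: flags=0x%x\n", page->flags);
}

/*
 * Report the page if any "must not be set at free time" flag is set,
 * and return nonzero when the page is reserved so the caller skips it.
 */
static int check_page_before_free(const struct fake_page *page)
{
	if (page->flags & (PGF_SLAB | PGF_WRITEBACK | PGF_RESERVED | PGF_BUDDY))
		report_bad_page(page);
	return (page->flags & PGF_RESERVED) != 0;
}

/* Returns 1 if the page was handed back, 0 if freeing was skipped. */
static int try_free_page(struct fake_page *page)
{
	if (check_page_before_free(page))
		return 0;	/* reserved: reported above, never reused */
	page->flags = 0;	/* pretend the page went back to the free list */
	return 1;
}
```

In this model, dropping `PGF_RESERVED` from the mask (as the proposed patch does for `PG_reserved`) would keep the "skip freeing" half but silently lose the report, which is the objection raised in the NAK. Hugh's suggested follow-up would correspond to returning nonzero whenever any of the checked flags is set, not only the reserved one.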

* Re: Linux 2.6.19-rc6 - NFSD working again
From: Christian Kujau @ 2006-11-18 4:04 UTC
To: neilb
Cc: Linux Kernel Mailing List

Hi,

I just wanted to report an 'it works again' for rc6: after encountering
the very same problems with -rc3 Jeff Garzik described in [0], I upgraded
to -rc5 and applied the proposed[1] patch[2]. Now, the knfsd behaved a bit
better (nfs-mounted /home, X11 applications created thousands of empty
'configuration'-files), however 'mkdir' and 'touch' still failed too often:

$ mkdir /mnt/nfs/compile-farm/foo
mkdir: /mnt/nfs/compile-farm/foo: Operation not permitted
$ mkdir /mnt/nfs/compile-farm/foo
mkdir: /mnt/nfs/compile-farm/foo: File exists

...and things like that. With -rc6 this seems to be gone.

However, I noticed this message in the server's (192.168.10.10) syslog:

nfs4_cb: server 127.0.1.1/192.168.10.10 AUTH_UNIX 0 not responding, timed out
nfs4_cb: server 127.0.1.1/192.168.10.10 AUTH_UNIX 0 not responding, timed out

The NFS server is running on 0.0.0.0:2049, what does this mean? The
message occurs once in a while, not sure what triggers it, found not
much in the archives...

Thanks,
Christian.

[0] http://uwsg.iu.edu/hypermail/linux/kernel/0611.0/1418.html
[1] http://uwsg.iu.edu/hypermail/linux/kernel/0611.0/1491.html
[2] http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.19-rc3-2/linux-2.6.19-rc3-CITI_NFS4_ALL-2.diff

-- 
BOFH excuse #106:

The electrician didn't know what the yellow cable was so he yanked the ethernet out.

* 2.6.19-rc6: known regressions (v3)
From: Adrian Bunk @ 2006-11-20 19:53 UTC
To: Linus Torvalds, Andrew Morton
Cc: Linux Kernel Mailing List, Vivek Goyal, Andre Noll, David Rientjes, ak, discuss, Prakash Punnoor, phil.el, oprofile-list, Thomas Gleixner, Alan Stern, Ingo Molnar, Oleg Nesterov, Paul E. McKenney, davej, cpufreq, Alexey Starikovskiy, Mattia Dongili, David Brownell, Len Brown, linux-acpi, Ernst Herzberg

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : kernel hangs when booting with irqpoll
References : http://lkml.org/lkml/2006/11/20/233
Submitter  : Vivek Goyal <vgoyal@in.ibm.com>
Status     : unknown


Subject    : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
             http://lkml.org/lkml/2006/11/10/208
Submitter  : Andre Noll <maan@systemlinux.org>
Handled-By : David Rientjes <rientjes@cs.washington.edu>
Status     : problem is being debugged


Subject    : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
             http://lkml.org/lkml/2006/11/15/92
Submitter  : Prakash Punnoor <prakash@punnoor.de>
Status     : problem is being discussed


Subject    : cpufreq notification broken
References : http://lkml.org/lkml/2006/11/16/177
Submitter  : Thomas Gleixner <tglx@timesys.com>
Caused-By  : Alan Stern <stern@rowland.harvard.edu>
             commit b4dfdbb3c707474a2254c5b4d7e62be31a4b7da9
Handled-By : Ingo Molnar <mingo@elte.hu>
             Linus Torvalds <torvalds@osdl.org>
             Oleg Nesterov <oleg@tv-sign.ru>
             Paul E. McKenney <paulmck@us.ibm.com>
Status     : patches are being discussed


Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
References : http://lkml.org/lkml/2006/11/17/198
Submitter  : alex1000@comcast.net
Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
             commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
Handled-By : Mattia Dongili <malattia@linux.it>
Patch      : http://lkml.org/lkml/2006/11/17/236
Status     : patch available


Subject    : nasty ACPI regression, AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter  : David Brownell <david-b@pacbell.net>
Handled-By : Len Brown <len.brown@intel.com>
             Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Status     : problem is being debugged


Subject    : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter  : Ernst Herzberg <earny@net4u.de>
Handled-By : Len Brown <len.brown@intel.com>
Status     : problem is being debugged

* 2.6.19-rc6: known regressions (v4)
From: Adrian Bunk @ 2006-11-21 21:24 UTC
To: Linus Torvalds, Andrew Morton
Cc: Linux Kernel Mailing List, Vivek Goyal, Pavel Emelianov, Andre Noll, David Rientjes, ak, discuss, Prakash Punnoor, phil.el, oprofile-list, David Brownell, Len Brown, Alexey Starikovskiy, linux-acpi, Ernst Herzberg, Kumar Gala, Joakim Tjernlund, Kim Phillips, paulus, linuxppc-dev, a.zummo, Randy Dunlap, Roman Zippel, Phil Oester, Sam Ravnborg, Mattia Dongili, davej, cpufreq

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : kernel hangs when booting with irqpoll
References : http://lkml.org/lkml/2006/11/20/233
Submitter  : Vivek Goyal <vgoyal@in.ibm.com>
Caused-By  : Pavel Emelianov <xemul@openvz.org>
             commit f72fa707604c015a6625e80f269506032d5430dc
Handled-By : Vivek Goyal <vgoyal@in.ibm.com>
Status     : problem is being debugged


Subject    : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
             http://lkml.org/lkml/2006/11/10/208
Submitter  : Andre Noll <maan@systemlinux.org>
Handled-By : David Rientjes <rientjes@cs.washington.edu>
Status     : problem is being debugged


Subject    : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
             http://lkml.org/lkml/2006/11/15/92
Submitter  : Prakash Punnoor <prakash@punnoor.de>
Status     : problem is being discussed


Subject    : ACPI: AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter  : David Brownell <david-b@pacbell.net>
Handled-By : Len Brown <len.brown@intel.com>
             Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Status     : problem is being debugged


Subject    : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter  : Ernst Herzberg <earny@net4u.de>
Handled-By : Len Brown <len.brown@intel.com>
Status     : problem is being debugged


Subject    : powerpc: serious RTC problems
References : http://lkml.org/lkml/2006/11/17/187
             http://lkml.org/lkml/2006/11/18/99
Submitter  : Kumar Gala <galak@kernel.crashing.org>
             Joakim Tjernlund <joakim.tjernlund@transmode.se>
Caused-By  : Kim Phillips <kim.phillips@freescale.com>
             commit 7a69af63e788a324d162201a0b23df41bcf158dd
             commit a8ed4f7ec3aa472134d7de6176f823b2667e450b
Handled-By : David Brownell <david-b@pacbell.net>
             Kim Phillips <kim.phillips@freescale.com>
Patch      : http://lkml.org/lkml/2006/11/20/320
             http://lkml.org/lkml/2006/11/20/321
Status     : patches available


Subject    : xconfig crashes on x86_64
References : http://lkml.org/lkml/2006/11/19/177
Submitter  : Randy Dunlap <randy.dunlap@oracle.com>
Handled-By : Roman Zippel <zippel@linux-m68k.org>
Patch      : http://lkml.org/lkml/2006/11/20/340
Status     : patch available


Subject    : menuconfig problems with TERM=vt100
References : http://lkml.org/lkml/2006/11/13/369
Submitter  : Phil Oester <kernel@linuxace.com>
Caused-By  : Sam Ravnborg <sam@mars.ravnborg.org>
             commit 350b5b76384e77bcc58217f00455fdbec5cac594
Handled-By : Roman Zippel <zippel@linux-m68k.org>
Patch      : http://lkml.org/lkml/2006/11/20/341
Status     : patch available


Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
References : http://lkml.org/lkml/2006/11/17/198
Submitter  : alex1000@comcast.net
Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
             commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
Handled-By : Mattia Dongili <malattia@linux.it>
Patch      : http://lkml.org/lkml/2006/11/17/236
Status     : patch available

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
From: Dave Jones @ 2006-11-21 21:31 UTC
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, Mattia Dongili, cpufreq

On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:

> Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
> References : http://lkml.org/lkml/2006/11/17/198
> Submitter  : alex1000@comcast.net
> Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
>              commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
> Handled-By : Mattia Dongili <malattia@linux.it>
> Patch      : http://lkml.org/lkml/2006/11/17/236
> Status     : patch available

not a regression, easily worked around, queued for .20

Dave

-- 
http://www.codemonkey.org.uk

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
From: Adrian Bunk @ 2006-11-21 21:39 UTC
To: Dave Jones, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, Mattia Dongili, cpufreq

On Tue, Nov 21, 2006 at 04:31:39PM -0500, Dave Jones wrote:
> On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
>
> > Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
> > References : http://lkml.org/lkml/2006/11/17/198
> > Submitter  : alex1000@comcast.net
> > Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
> >              commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
> > Handled-By : Mattia Dongili <malattia@linux.it>
> > Patch      : http://lkml.org/lkml/2006/11/17/236
> > Status     : patch available
>
> not a regression, easily worked around, queued for .20

It is a regression since commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
was merged after 2.6.18.

Considering that the fix is trivial, why shouldn't it be merged before
2.6.19?

> Dave

cu
Adrian

-- 

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

* Re: [discuss] 2.6.19-rc6: known regressions (v4)
From: Dave Jones @ 2006-11-21 21:56 UTC
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, Mattia Dongili, cpufreq

On Tue, Nov 21, 2006 at 10:39:00PM +0100, Adrian Bunk wrote:
> On Tue, Nov 21, 2006 at 04:31:39PM -0500, Dave Jones wrote:
> > On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
> >
> > > Subject    : CPU_FREQ_GOV_ONDEMAND=y compile error
> > > References : http://lkml.org/lkml/2006/11/17/198
> > > Submitter  : alex1000@comcast.net
> > > Caused-By  : Alexey Starikovskiy <alexey_y_starikovskiy@linux.intel.com>
> > >              commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
> > > Handled-By : Mattia Dongili <malattia@linux.it>
> > > Patch      : http://lkml.org/lkml/2006/11/17/236
> > > Status     : patch available
> >
> > not a regression, easily worked around, queued for .20
>
> It is a regression since commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
> was merged after 2.6.18.

Ah, I misinterpreted when that cset went in (I read the commit date,
which was back in June, not the merge date, which was September).

> Considering that the fix is trivial, why shouldn't it be merged before
> 2.6.19?

Yes, I'll push it on.

Dave

-- 
http://www.codemonkey.org.uk
* Re: 2.6.19-rc6: known regressions (v4) 2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk 2006-11-21 21:31 ` [discuss] " Dave Jones @ 2006-11-21 21:33 ` Vivek Goyal 2006-11-21 21:41 ` Adrian Bunk 2006-11-21 22:18 ` Linus Torvalds 2006-11-22 10:42 ` [discuss] " Andi Kleen 2006-11-23 0:04 ` David Brownell 3 siblings, 2 replies; 36+ messages in thread From: Vivek Goyal @ 2006-11-21 21:33 UTC (permalink / raw) To: Adrian Bunk Cc: linux kernel mailing list, Linus Torvalds, Morton Andrew Morton, Pavel Emelianov, mingo, dev On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote: > This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18 > that are not yet fixed in Linus' tree. > > If you find your name in the Cc header, you are either submitter of one > of the bugs, maintainer of an affectected subsystem or driver, a patch > of you caused a breakage or I'm considering you in any other way possibly > involved with one or more of these issues. > > Due to the huge amount of recipients, please trim the Cc when answering. > > > Subject : kernel hangs when booting with irqpoll > References : http://lkml.org/lkml/2006/11/20/233 > Submitter : Vivek Goyal <vgoyal@in.ibm.com> > Caused-By : Pavel Emelianov <xemul@openvz.org> > commit f72fa707604c015a6625e80f269506032d5430dc > Handled-By : Vivek Goyal <vgoyal@in.ibm.com> > Status : problem is being debugged > Adrian, Pavel already provided a fix for this issue. http://marc.theaimsgroup.com/?l=linux-kernel&m=116409933100117&w=2 Thanks Vivek ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.19-rc6: known regressions (v4) 2006-11-21 21:33 ` Vivek Goyal @ 2006-11-21 21:41 ` Adrian Bunk 2006-11-21 22:18 ` Linus Torvalds 1 sibling, 0 replies; 36+ messages in thread From: Adrian Bunk @ 2006-11-21 21:41 UTC (permalink / raw) To: Vivek Goyal Cc: linux kernel mailing list, Linus Torvalds, Morton Andrew Morton, Pavel Emelianov, mingo, dev On Tue, Nov 21, 2006 at 04:33:35PM -0500, Vivek Goyal wrote: > On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote: > > This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18 > > that are not yet fixed in Linus' tree. > > > > If you find your name in the Cc header, you are either submitter of one > > of the bugs, maintainer of an affectected subsystem or driver, a patch > > of you caused a breakage or I'm considering you in any other way possibly > > involved with one or more of these issues. > > > > Due to the huge amount of recipients, please trim the Cc when answering. > > > > > > Subject : kernel hangs when booting with irqpoll > > References : http://lkml.org/lkml/2006/11/20/233 > > Submitter : Vivek Goyal <vgoyal@in.ibm.com> > > Caused-By : Pavel Emelianov <xemul@openvz.org> > > commit f72fa707604c015a6625e80f269506032d5430dc > > Handled-By : Vivek Goyal <vgoyal@in.ibm.com> > > Status : problem is being debugged > > > > Adrian, > > Pavel already provided a fix for this issue. > > http://marc.theaimsgroup.com/?l=linux-kernel&m=116409933100117&w=2 Thanks for the information, I missed this patch. > Thanks > Vivek cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.19-rc6: known regressions (v4)
2006-11-21 21:33 ` Vivek Goyal
2006-11-21 21:41 ` Adrian Bunk
@ 2006-11-21 22:18 ` Linus Torvalds
2006-11-22 9:44 ` Pavel Emelianov
1 sibling, 1 reply; 36+ messages in thread
From: Linus Torvalds @ 2006-11-21 22:18 UTC (permalink / raw)
To: Vivek Goyal
Cc: Adrian Bunk, linux kernel mailing list, Morton Andrew Morton, Pavel Emelianov, mingo, dev
On Tue, 21 Nov 2006, Vivek Goyal wrote:
> On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
> > This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
> > that are not yet fixed in Linus' tree.
> >
> > If you find your name in the Cc header, you are either submitter of one
> > of the bugs, maintainer of an affected subsystem or driver, a patch
> > of you caused a breakage or I'm considering you in any other way possibly
> > involved with one or more of these issues.
> >
> > Due to the huge amount of recipients, please trim the Cc when answering.
> >
> >
> > Subject : kernel hangs when booting with irqpoll
> > References : http://lkml.org/lkml/2006/11/20/233
> > Submitter : Vivek Goyal <vgoyal@in.ibm.com>
> > Caused-By : Pavel Emelianov <xemul@openvz.org>
> > commit f72fa707604c015a6625e80f269506032d5430dc
> > Handled-By : Vivek Goyal <vgoyal@in.ibm.com>
> > Status : problem is being debugged
> >
>
> Adrian,
>
> Pavel already provided a fix for this issue.
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=116409933100117&w=2

I really think this is wrong.

The original patch was wrong, and the _real_ problem is in __do_IRQ() that
got the desc->lock too early.

I _think_ the correct fix is to simply revert the broken commit, and fix
the _one_ place that called "misnote_interrupt()" with the lock held.

Something like this..

I also think that the real fix will be to move the whole

	if (!noirqdebug)
		note_interrupt(irq, desc, action_ret);

into handle_IRQ_event itself, since every caller (except for
"misrouted_irq()" itself, and that should probably be done separately)
should always do it. Right now we have a lot of people that just do

	action_ret = handle_IRQ_event(irq, action);
	if (!noirqdebug)
		note_interrupt(irq, desc, action_ret);

explicitly.

The only thing that keeps us from doing that is that we don't pass in
"desc", but we should just do that.

But in the meantime, this appears to be the minimal fix. Can people please
test and verify?

Linus
---
diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index 42aa6f1..a681912 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -231,10 +231,10 @@ fastcall unsigned int __do_IRQ(unsigned
 		spin_unlock(&desc->lock);
 
 		action_ret = handle_IRQ_event(irq, action);
-
-		spin_lock(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
+
+		spin_lock(&desc->lock);
 		if (likely(!(desc->status & IRQ_PENDING)))
 			break;
 		desc->status &= ~IRQ_PENDING;
diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c
index 9c7e2e4..543ea2e 100644
--- a/kernel/irq/spurious.c
+++ b/kernel/irq/spurious.c
@@ -147,11 +147,7 @@ void note_interrupt(unsigned int irq, st
 	if (unlikely(irqfixup)) {
 		/* Don't punish working computers */
 		if ((irqfixup == 2 && irq == 0) || action_ret == IRQ_NONE) {
-			int ok;
-
-			spin_unlock(&desc->lock);
-			ok = misrouted_irq(irq);
-			spin_lock(&desc->lock);
+			int ok = misrouted_irq(irq);
 			if (action_ret == IRQ_NONE)
 				desc->irqs_unhandled -= ok;
 		}
^ permalink raw reply related [flat|nested] 36+ messages in thread
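The longer-term idea in the message above (have the event-dispatch function do the spurious-interrupt accounting itself once it is handed "desc") can be sketched as a small user-space model. This is only an illustration under stated assumptions: every `model_` name is hypothetical, the bookkeeping is a simplified stand-in for note_interrupt(), and nothing here is the kernel's actual code or the change that was eventually merged.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

typedef enum model_irqreturn (*model_handler_t)(unsigned int irq);

struct model_action {
	model_handler_t handler;
	struct model_action *next;	/* next handler sharing the line */
};

struct model_desc {
	struct model_action *action;	/* chain of handlers for this line */
	unsigned int irq_count;		/* events accounted so far */
	unsigned int irqs_unhandled;	/* events that no handler claimed */
};

/* Plays the role that noirqdebug plays in the quoted snippet. */
static int model_noirqdebug;

/* Simplified stand-in for note_interrupt(): count events and misses. */
static void model_note_interrupt(struct model_desc *desc,
				 enum model_irqreturn ret)
{
	desc->irq_count++;
	if (ret != MODEL_IRQ_HANDLED)
		desc->irqs_unhandled++;
}

/*
 * Runs every handler on the line and then does the accounting itself,
 * so an individual caller can no longer forget the
 * "if (!noirqdebug) note_interrupt(...)" step.
 */
static enum model_irqreturn model_handle_irq_event(unsigned int irq,
						   struct model_desc *desc)
{
	enum model_irqreturn ret = MODEL_IRQ_NONE;
	struct model_action *a;

	for (a = desc->action; a != NULL; a = a->next)
		if (a->handler(irq) == MODEL_IRQ_HANDLED)
			ret = MODEL_IRQ_HANDLED;

	if (!model_noirqdebug)
		model_note_interrupt(desc, ret);
	return ret;
}

/* Two toy handlers: one never claims the interrupt, one always does. */
static enum model_irqreturn model_ignore(unsigned int irq)
{
	(void)irq;
	return MODEL_IRQ_NONE;
}

static enum model_irqreturn model_claim(unsigned int irq)
{
	(void)irq;
	return MODEL_IRQ_HANDLED;
}
```

The point of the design is that the accounting happens in exactly one place, and, as in the minimal patch, it runs without the descriptor lock being held by the caller.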
* Re: 2.6.19-rc6: known regressions (v4) 2006-11-21 22:18 ` Linus Torvalds @ 2006-11-22 9:44 ` Pavel Emelianov 2006-11-22 14:58 ` Vivek Goyal 2006-11-22 17:28 ` Linus Torvalds 0 siblings, 2 replies; 36+ messages in thread From: Pavel Emelianov @ 2006-11-22 9:44 UTC (permalink / raw) To: Linus Torvalds, Morton Andrew Morton, mingo Cc: Vivek Goyal, Adrian Bunk, linux kernel mailing list, dev > I really think this is wrong. > > The original patch was wrong, and the _real_ problem is in __do_IRQ() that > got the desc->lock too early. > > I _think_ the correct fix is to simply revert the broken commit, and fix > the _one_ place that called "misnote_interrupt()" with the lock held. > > Something like this.. > > I also think that the real fix will be to move the whole > > if (!noirqdebug) > note_interrupt(irq, desc, action_ret); > > > into handle_IRQ_event itself, since every caller (except for > "misrouted_irq()" itself, and that should probably be done separately) > should always do it. Right now we have a lot of people that just do > > action_ret = handle_IRQ_event(irq, action); > if (!noirqdebug) > note_interrupt(irq, desc, action_ret); > > explicitly. > > The only thing that keeps us from doing that is that we don't pass in > "desc", but we should just do that. > > But in the meantime, this appears to be the minimal fix. Can people please > test and verify? This works for me, but is this normal that desc's fields are modified non-atomically in note_interrupt()? And one more thing - report_bad_irq() traverses desc->action list without any locking either. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.19-rc6: known regressions (v4)
2006-11-22 9:44 ` Pavel Emelianov
@ 2006-11-22 14:58 ` Vivek Goyal
2006-11-22 17:28 ` Linus Torvalds
1 sibling, 0 replies; 36+ messages in thread
From: Vivek Goyal @ 2006-11-22 14:58 UTC (permalink / raw)
To: Pavel Emelianov, Linus Torvalds
Cc: Morton Andrew Morton, mingo, Adrian Bunk, linux kernel mailing list, dev
On Wed, Nov 22, 2006 at 12:44:14PM +0300, Pavel Emelianov wrote:
> > I really think this is wrong.
> >
> > The original patch was wrong, and the _real_ problem is in __do_IRQ() that
> > got the desc->lock too early.
> >
> > I _think_ the correct fix is to simply revert the broken commit, and fix
> > the _one_ place that called "misnote_interrupt()" with the lock held.
> >
> > Something like this..
> >
> > I also think that the real fix will be to move the whole
> >
> > 	if (!noirqdebug)
> > 		note_interrupt(irq, desc, action_ret);
> >
> >
> > into handle_IRQ_event itself, since every caller (except for
> > "misrouted_irq()" itself, and that should probably be done separately)
> > should always do it. Right now we have a lot of people that just do
> >
> > 	action_ret = handle_IRQ_event(irq, action);
> > 	if (!noirqdebug)
> > 		note_interrupt(irq, desc, action_ret);
> >
> > explicitly.
> >
> > The only thing that keeps us from doing that is that we don't pass in
> > "desc", but we should just do that.
> >
> > But in the meantime, this appears to be the minimal fix. Can people please
> > test and verify?
>
> This works for me, but is this normal that desc's fields are
> modified non-atomically in note_interrupt()?
>
> And one more thing - report_bad_irq() traverses desc->action
> list without any locking either.

Works for me too. But Pavel's concerns look genuine. Maybe we should take
the lock again in note_interrupt()/report_bad_irq() whenever we are
accessing/modifying desc.

Thanks
Vivek
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.19-rc6: known regressions (v4)
2006-11-22 9:44 ` Pavel Emelianov
2006-11-22 14:58 ` Vivek Goyal
@ 2006-11-22 17:28 ` Linus Torvalds
1 sibling, 0 replies; 36+ messages in thread
From: Linus Torvalds @ 2006-11-22 17:28 UTC (permalink / raw)
To: Pavel Emelianov
Cc: Morton Andrew Morton, mingo, Vivek Goyal, Adrian Bunk, linux kernel mailing list, dev
On Wed, 22 Nov 2006, Pavel Emelianov wrote:
>
> This works for me, but is this normal that desc's fields are
> modified non-atomically in note_interrupt()?

This is all inside the normal interrupt handling logic, so it should be
exactly as safe as any interrupt is: we don't allow the _same_ interrupt
to be entered recursively at the same time.

So yes, the counts etc are done non-atomically, but the code around it all
guarantees that only one concurrent invocation happens per irq descriptor,
so it's all ok.

(The one exception to that may be the "desc->status" modification in case
the irq is determined to have screamed, since "status" can be modified by
a recursive interrupt coming in, but (a) that's a "this irq is dead"
scenario _anyway_ and (b) if we ever care, we should lock it _there_, not
somewhere else).

Linus
^ permalink raw reply [flat|nested] 36+ messages in thread
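The "only one concurrent invocation per irq descriptor" guarantee described above can be illustrated with a single-threaded toy model. It is a sketch under assumptions, not the kernel's code: the `MODEL_PENDING` flag mirrors the IRQ_PENDING test visible in the patch earlier in the thread, while the `MODEL_INPROGRESS` flag, the `model_` names and the `inject` counter (which fakes the same line firing again while its handler runs) are all hypothetical. The real code makes these status updates under desc->lock, possibly from several CPUs.

```c
/* Hypothetical status bits, modelled on the flags used by __do_IRQ(). */
#define MODEL_INPROGRESS 0x1u	/* a handler for this line is running */
#define MODEL_PENDING    0x2u	/* the line fired again in the meantime */

struct model_line {
	unsigned int status;
	int depth;	/* handler invocations active right now */
	int max_depth;	/* highest nesting ever observed */
	int handled;	/* total handler runs */
	int inject;	/* arrivals to simulate from inside the handler */
};

static void model_do_irq(struct model_line *l);

/* Handler body: non-atomic bookkeeping, as in note_interrupt(). */
static void model_handler(struct model_line *l)
{
	l->depth++;
	if (l->depth > l->max_depth)
		l->max_depth = l->depth;
	l->handled++;
	if (l->inject > 0) {
		l->inject--;
		model_do_irq(l);	/* same line fires mid-handler */
	}
	l->depth--;
}

/* Entry point for one arrival on the line. */
static void model_do_irq(struct model_line *l)
{
	l->status |= MODEL_PENDING;
	if (l->status & MODEL_INPROGRESS)
		return;		/* current owner will replay it */

	l->status &= ~MODEL_PENDING;
	l->status |= MODEL_INPROGRESS;
	for (;;) {
		model_handler(l);
		if (!(l->status & MODEL_PENDING))
			break;
		l->status &= ~MODEL_PENDING;	/* replay the missed event */
	}
	l->status &= ~MODEL_INPROGRESS;
}
```

In this model a second arrival only marks the line pending and returns, so the handler body never nests and its plain counter updates cannot race with themselves.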
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
2006-11-21 21:31 ` [discuss] " Dave Jones
2006-11-21 21:33 ` Vivek Goyal
@ 2006-11-22 10:42 ` Andi Kleen
2006-11-22 15:52 ` Mel Gorman
2006-11-22 16:05 ` Andre Noll
2006-11-23 0:04 ` David Brownell
3 siblings, 2 replies; 36+ messages in thread
From: Andi Kleen @ 2006-11-22 10:42 UTC (permalink / raw)
To: discuss
Cc: Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, Andre Noll, David Rientjes, Mel Gorman
> Subject : x86_64: Bad page state in process 'swapper'
> References : http://lkml.org/lkml/2006/11/10/135
> http://lkml.org/lkml/2006/11/10/208
> Submitter : Andre Noll <maan@systemlinux.org>
> Handled-By : David Rientjes <rientjes@cs.washington.edu>
> Status : problem is being debugged

Does this still happen with -rc6?

It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
but the debugging information in the standard kernel unfortunately
doesn't give enough output to find out where it happens.

-Andi
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-22 10:42 ` [discuss] " Andi Kleen
@ 2006-11-22 15:52 ` Mel Gorman
2006-11-22 17:42 ` Andre Noll
2006-11-22 16:05 ` Andre Noll
1 sibling, 1 reply; 36+ messages in thread
From: Mel Gorman @ 2006-11-22 15:52 UTC (permalink / raw)
To: Andi Kleen
Cc: discuss, Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, Andre Noll, David Rientjes
On (22/11/06 11:42), Andi Kleen didst pronounce:
> > Subject : x86_64: Bad page state in process 'swapper'
> > References : http://lkml.org/lkml/2006/11/10/135
> > http://lkml.org/lkml/2006/11/10/208
> > Submitter : Andre Noll <maan@systemlinux.org>
> > Handled-By : David Rientjes <rientjes@cs.washington.edu>
> > Status : problem is being debugged
>
> Does this still happen with -rc6?
>
> It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
> but the debugging information in the standard kernel unfortunately
> doesn't give enough output to find out where it happens.
>

Right, so I took a closer look to see what the story was. According to the
thread, this was the E820 map with the corresponding PFNs appended to the
usable regions.

BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) ( 0-159)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable) (256-1032176)
 BIOS-e820: 00000000fbff0000 - 00000000fbfff000 (ACPI data)
 BIOS-e820: 00000000fbfff000 - 00000000fc000000 (ACPI NVS)
 BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000200000000 (usable) (1048576-2097152)

This is what the PFN ranges look like to arch-independent zone-sizing
reading the map without node awareness

Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 1 entries of 3200 used
Entering add_active_range(0, 1048576, 2097152) 2 entries of 3200 used

That matches exactly. So far so good. Later with node awareness, we get

SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 1 -> APIC 1 -> Node 1
SRAT: Node 0 PXM 0 100000-fc000000
Entering add_active_range(0, 256, 1032176) 0 entries of 3200 used
SRAT: Node 1 PXM 1 100000000-200000000
Entering add_active_range(1, 1048576, 2097152) 1 entries of 3200 used
SRAT: Node 0 PXM 0 0-fc000000
Entering add_active_range(0, 0, 159) 2 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 3 entries of 3200 used

Unusual ordering, but the information is still correct. The final sorted
map looks like:

early_node_map[3] active PFN ranges
    0:        0 ->      159
    0:      256 ->  1032176
    1:  1048576 ->  2097152

Again, everything there looks like what the E820 map reports so I don't
believe this is the zone-sizing code's fault although it may be exposing
a bug from elsewhere. According to bootmap, things look like

Bootmem setup node 0 0000000000000000-00000000fc000000
Bootmem setup node 1 0000000100000000-0000000200000000

That's node 0 PFN 0->1032192 and node 1 PFN 1048576->2097152.

That is showing an additional 16 page frames that are not in the E820 map
(although I have seen this before and it didn't show up as a bad page). I
would be very interested in finding out what the bad_page PFNs are if this
bug still exists to see if it is those 16 frames. I've included a patch
below that might help.

Andre, if the bug still exists for you, can you apply Andi's patch to
reduce the log size and the following patch please and post us the
output with loglevel=8 please? Thanks

Signed-off-by: Mel Gorman <mel@csn.ul.ie>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.19-rc6-clean/arch/x86_64/mm/numa.c linux-2.6.19-rc6-debug_bootmem_init_issues/arch/x86_64/mm/numa.c
--- linux-2.6.19-rc6-clean/arch/x86_64/mm/numa.c	2006-11-22 15:08:20.000000000 +0000
+++ linux-2.6.19-rc6-debug_bootmem_init_issues/arch/x86_64/mm/numa.c	2006-11-22 15:07:47.000000000 +0000
@@ -192,6 +192,9 @@ void __init setup_node_zones(int nodeid)
 				memmapsize, SMP_CACHE_BYTES,
 				round_down(limit - memmapsize, PAGE_SIZE),
 				limit);
+	printk(KERN_DEBUG "Node %d memmap at 0x%p size %lu first pfn 0x%p\n",
+			nodeid, NODE_DATA(nodeid)->node_mem_map,
+			memmapsize, NODE_DATA(nodeid)->node_mem_map);
 #endif
 }
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-debug_bootmem_init_issues/mm/page_alloc.c
--- linux-2.6.19-rc6-clean/mm/page_alloc.c	2006-11-16 04:03:40.000000000 +0000
+++ linux-2.6.19-rc6-debug_bootmem_init_issues/mm/page_alloc.c	2006-11-22 14:16:46.000000000 +0000
@@ -2453,6 +2453,9 @@ static void __init alloc_node_mem_map(st
 		if (!map)
 			map = alloc_bootmem_node(pgdat, size);
 		pgdat->node_mem_map = map + (pgdat->node_start_pfn - start);
+		printk(KERN_DEBUG
+			"Node %d memmap at 0x%p size %lu first pfn 0x%p\n",
+			pgdat->node_id, map, size, pgdat->node_mem_map);
 	}
 #ifdef CONFIG_FLATMEM
 	/*
@@ -2683,6 +2686,9 @@ void __init free_area_init_nodes(unsigne
 	/* Regions in the early_node_map can be in any order */
 	sort_node_map();
 
+	/* Print out the page size for debugging meminit problems */
+	printk(KERN_DEBUG "sizeof(struct page) = %d\n", sizeof(struct page));
+
 	/* Print out the zone ranges */
 	printk("Zone PFN ranges:\n");
 	for (i = 0; i < MAX_NR_ZONES; i++)
^ permalink raw reply [flat|nested] 36+ messages in thread
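The PFN figures in the message above follow from plain address arithmetic, and the debug patch prints what is needed to turn a "struct page" address back into a PFN. The helpers below are a rough sketch of both steps, assuming 4 KiB pages and a memmap that is one flat array whose first entry describes the node's first PFN. All `model_` names are hypothetical and none of this is kernel code.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* First whole page frame at or above a physical address (range start). */
static uint64_t model_pfn_up(uint64_t addr)
{
	return (addr + (1ULL << MODEL_PAGE_SHIFT) - 1) >> MODEL_PAGE_SHIFT;
}

/* Page frame containing a physical address (exclusive range end). */
static uint64_t model_pfn_down(uint64_t addr)
{
	return addr >> MODEL_PAGE_SHIFT;
}

/*
 * Translate the address of a page descriptor into a PFN, given the
 * memmap base address and sizeof(struct page) as printed by the debug
 * patch. Assumes the descriptor at memmap_base describes first_pfn.
 */
static uint64_t model_memmap_pfn(uint64_t page_addr, uint64_t memmap_base,
				 uint64_t sizeof_page, uint64_t first_pfn)
{
	return first_pfn + (page_addr - memmap_base) / sizeof_page;
}
```

With these, the usable E820 range ending at 0xfbff0000 stops at PFN 1032176 while the bootmem range ending at 0xfc000000 stops at PFN 1032192, which is where the 16 extra page frames come from.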
* Re: [discuss] 2.6.19-rc6: known regressions (v4) 2006-11-22 15:52 ` Mel Gorman @ 2006-11-22 17:42 ` Andre Noll 2006-11-23 12:01 ` Mel Gorman 0 siblings, 1 reply; 36+ messages in thread From: Andre Noll @ 2006-11-22 17:42 UTC (permalink / raw) To: Mel Gorman Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, David Rientjes [-- Attachment #1: Type: text/plain, Size: 22063 bytes --] On 15:52, Mel Gorman wrote: > Right, so I took a closer look to see what the story was. Thanks a lot, Mel. > Bootmem setup node 0 0000000000000000-00000000fc000000 > Bootmem setup node 1 0000000100000000-0000000200000000 > > That's node 0 PFN 0->1032192 and node 1 PFN 1048576->2097152. > > That is showing an additional 16 page frames that are not in the E820 map > (although I have seen this before and it didn't show up as a bad page). I > would be very interested in finding out what the bad_page PFNs are if this > bug still exists to see if it is those 16 frames. I've included a patch > below that might help. > > Andre, if the bug still exists for you, can you apply Andi's patch to > reduce the log size and the following patch please and post us the > output with loglevel=8 please? Thanks Done. Here's the output of dmesg with your and Andi's patch applied. 
Andre Linux version 2.6.19-rc6-mel-tt64-6-g0f9005a6-dirty (maan@congo) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #11 SMP Wed Nov 22 17:11:44 CET 2006 Command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable) BIOS-e820: 00000000fbff0000 - 00000000fbfff000 (ACPI data) BIOS-e820: 00000000fbfff000 - 00000000fc000000 (ACPI NVS) BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved) BIOS-e820: 0000000100000000 - 0000000200000000 (usable) Entering add_active_range(0, 0, 159) 0 entries of 3200 used Entering add_active_range(0, 256, 1032176) 1 entries of 3200 used Entering add_active_range(0, 1048576, 2097152) 2 entries of 3200 used end_pfn_map = 2097152 DMI 2.3 present. ACPI: RSDP (v000 ACPIAM ) @ 0x00000000000f6bc0 ACPI: RSDT (v001 A M I OEMRSDT 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0000 ACPI: FADT (v001 A M I OEMFACP 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0200 ACPI: MADT (v001 A M I OEMAPIC 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0380 ACPI: OEMB (v001 A M I OEMBIOS 0x01000510 MSFT 0x00000097) @ 0x00000000fbfff040 ACPI: SRAT (v001 A M I OEMSRAT 0x01000510 MSFT 0x00000097) @ 0x00000000fbff34e0 ACPI: ASF! 
(v001 AMIASF AMDSTRET 0x00000001 INTL 0x02002026) @ 0x00000000fbff35f0 ACPI: DSDT (v001 0AAAA 0AAAA000 0x00000000 INTL 0x02002026) @ 0x0000000000000000 SRAT: PXM 0 -> APIC 0 -> Node 0 SRAT: PXM 1 -> APIC 1 -> Node 1 SRAT: Node 0 PXM 0 100000-fc000000 Entering add_active_range(0, 256, 1032176) 0 entries of 3200 used SRAT: Node 1 PXM 1 100000000-200000000 Entering add_active_range(1, 1048576, 2097152) 1 entries of 3200 used SRAT: Node 0 PXM 0 0-fc000000 Entering add_active_range(0, 0, 159) 2 entries of 3200 used Entering add_active_range(0, 256, 1032176) 3 entries of 3200 used NUMA: Using 32 for the hash shift. Bootmem setup node 0 0000000000000000-00000000fc000000 Bootmem setup node 1 0000000100000000-0000000200000000 Node 0 memmap at 0xffff810000893000 size 57802752 first pfn 0xffff810000893000 Node 1 memmap at 0xffff8101fc800000 size 58720256 first pfn 0xffff8101fc800000 sizeof(struct page) = 56 Zone PFN ranges: DMA 256 -> 4096 DMA32 4096 -> 1048576 Normal 1048576 -> 2097152 early_node_map[3] active PFN ranges 0: 0 -> 159 0: 256 -> 1032176 1: 1048576 -> 2097152 On node 0 totalpages: 1031920 DMA zone: 52 pages used for memmap DMA zone: 1953 pages reserved DMA zone: 1835 pages, LIFO batch:0 DMA32 zone: 14055 pages used for memmap DMA32 zone: 1014025 pages, LIFO batch:31 Normal zone: 0 pages used for memmap On node 1 totalpages: 1048576 DMA zone: 0 pages used for memmap DMA32 zone: 0 pages used for memmap Normal zone: 14336 pages used for memmap Normal zone: 1034240 pages, LIFO batch:31 ACPI: PM-Timer IO Port: 0x5008 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) Processor #0 (Bootup-CPU) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled) Processor #1 ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23 ACPI: IOAPIC (id[0x03] address[0xfebff000] gsi_base[24]) IOAPIC[1]: 
apic_id 3, address 0xfebff000, GSI 24-27 ACPI: IOAPIC (id[0x04] address[0xfebfe000] gsi_base[28]) IOAPIC[2]: apic_id 4, address 0xfebfe000, GSI 28-31 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. Setting APIC routing to flat Using ACPI (MADT) for SMP configuration information Nosave address range: 000000000009f000 - 00000000000a0000 Nosave address range: 00000000000a0000 - 00000000000e0000 Nosave address range: 00000000000e0000 - 0000000000100000 Nosave address range: 00000000fbff0000 - 00000000fbfff000 Nosave address range: 00000000fbfff000 - 00000000fc000000 Nosave address range: 00000000fc000000 - 00000000ff780000 Nosave address range: 00000000ff780000 - 0000000100000000 Allocating PCI resources starting at fc400000 (gap: fc000000:3780000) PERCPU: Allocating 25728 bytes of per cpu data Built 2 zonelists. Total pages: 2050100 Kernel command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel Initializing CPU#0 PID hash table entries: 4096 (order: 12, 32768 bytes) Console: colour VGA+ 80x25 Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes) Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes) Checking aperture... 
CPU 0: aperture @ f5cc000000 size 32 MB Aperture too small (32 MB) No AGP bridge found Your BIOS doesn't leave a aperture memory hole Please enable the IOMMU option in the BIOS setup This costs you 64 MB of RAM Mapping aperture over 65536 KB of RAM @ 8000000 Bad page state in process 'swapper' page:ffff810003faf480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0 Trying to fix it up, but a reboot is needed Backtrace: Call Trace: [<ffffffff8014f1dd>] bad_page+0x71/0x9f [<ffffffff8014f6be>] __free_pages_ok+0x78/0xf9 [<ffffffff805cd878>] free_all_bootmem_core+0xce/0x1c2 [<ffffffff805cad99>] numa_free_all_bootmem+0x39/0x78 [<ffffffff805ca603>] mem_init+0x59/0x16c [<ffffffff805bb75c>] start_kernel+0x165/0x1e7 [<ffffffff805bb195>] x86_64_start_kernel+0x12b/0x130 Memory: 8122880k/8388608k available (3184k kernel code, 199740k reserved, 1490k data, 2612k init) Calibrating delay using timer specific routine.. 4784.66 BogoMIPS (lpj=9569329) Mount-cache hash table entries: 256 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 1024K (64 bytes/line) CPU 0/0 -> Node 0 Freeing SMP alternatives: 32k freed ACPI: Core revision 20060707 Using local APIC timer interrupts. result 12447006 Detected 12.447 MHz APIC timer. Booting processor 1/2 APIC 0x1 Initializing CPU#1 Calibrating delay using timer specific routine.. 4780.00 BogoMIPS (lpj=9560010) CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 1024K (64 bytes/line) CPU 1/1 -> Node 1 AMD Opteron(tm) Processor 250 stepping 0a CPU 1: Syncing TSC to CPU 0. CPU 1: synchronized TSC with CPU 0 (last diff -14 cycles, maxerr 1190 cycles) Brought up 2 CPUs testing NMI watchdog ... OK. Disabling vsyscall due to use of PM timer time.c: Using 3.579545 MHz WALL PM GTOD PM timer. time.c: Detected 2389.823 MHz processor. 
migration_cost=569 NET: Registered protocol family 16 ACPI: bus type pci registered PCI: Using configuration type 1 ACPI: Interpreter enabled ACPI: Using IOAPIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (0000:00) PCI: Probing PCI hardware (bus 00) Boot video device is 0000:03:06.0 PCI: Firmware left 0000:03:08.0 e100 interrupts enabled, disabling ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLA._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLB._PRT] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 9 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 12 14 15) ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15) AMD768 RNG detected SCSI subsystem initialized usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub usbcore: registered new device driver usb PCI: Using ACPI for IRQ routing PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report PCI-DMA: Disabling AGP. PCI-DMA: aperture base @ 8000000 size 65536 KB PCI-DMA: using GART IOMMU. PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture PCI: Bridge: 0000:00:06.0 IO window: a000-bfff MEM window: fc900000-feafffff PREFETCH window: disabled. PCI: Bridge: 0000:00:0a.0 IO window: 9000-9fff MEM window: fc600000-fc8fffff PREFETCH window: ff500000-ff5fffff PCI: Bridge: 0000:00:0b.0 IO window: disabled. MEM window: disabled. PREFETCH window: disabled. 
NET: Registered protocol family 2 IP route cache hash table entries: 262144 (order: 9, 2097152 bytes) TCP established hash table entries: 262144 (order: 10, 4194304 bytes) TCP bind hash table entries: 65536 (order: 8, 1048576 bytes) TCP: Hash tables configured (established 262144 bind 65536) TCP reno registered microcode: CPU0 not a capable Intel processor microcode: CPU1 not a capable Intel processor IA-32 Microcode Update Driver: v1.14a <tigran@veritas.com> io scheduler noop registered io scheduler anticipatory registered (default) io scheduler deadline registered io scheduler cfq registered ACPI: Power Button (FF) [PWRF] ACPI: Power Button (CM) [PWRB] ACPI: Processor [CPU1] (supports 8 throttling states) ACPI: Getting cpuindex for acpiid 0x3 ACPI: Getting cpuindex for acpiid 0x4 Real Time Clock Driver v1.12ac Linux agpgart interface v0.101 (c) Dave Jones ipmi message handler version 39.0 ipmi device interface IPMI System Interface driver. ipmi_si: Unable to find any System Interface(s) IPMI Watchdog: driver initialized Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A loop: loaded (max 8 devices) Intel(R) PRO/1000 Network Driver - version 7.2.9-k4 Copyright (c) 1999-2006 Intel Corporation. eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others ACPI: PCI Interrupt 0000:03:08.0[A] -> GSI 18 (level, low) -> IRQ 18 eth0: 0000:03:08.0, 00:E0:81:2E:78:F7, IRQ 18. Board assembly 567812-052, Physical connectors present: RJ45 Primary interface chip i82555 PHY #1. General self-test: passed. Serial sub-system self-test: passed. Internal registers self-test: passed. ROM checksum self-test: passed (0xd0a6c714). 
e100: Intel(R) PRO/100 Network Driver, 3.5.17-k2-NAPI e100: Copyright(c) 1999-2006 Intel Corporation tg3.c:v3.69 (November 15, 2006) ACPI: PCI Interrupt 0000:02:09.0[A] -> GSI 24 (level, low) -> IRQ 24 eth1: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:26 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] eth1: dma_rwctrl[769f4000] dma_mask[64-bit] ACPI: PCI Interrupt 0000:02:09.1[B] -> GSI 25 (level, low) -> IRQ 25 eth2: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:27 eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] eth2: dma_rwctrl[769f4000] dma_mask[64-bit] Linux video capture interface: v2.00 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx AMD8111: IDE controller at PCI slot 0000:00:07.1 AMD8111: chipset revision 3 AMD8111: not 100% native mode: will probe irqs later AMD8111: 0000:00:07.1 (rev 03) UDMA133 controller ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:pio, hdb:pio ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio Probing IDE interface ide0... Probing IDE interface ide1... Probing IDE interface ide0... Probing IDE interface ide1... ACPI: PCI Interrupt 0000:02:06.0[A] -> GSI 24 (level, low) -> IRQ 24 scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0 <Adaptec AIC7902 Ultra320 SCSI adapter> aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs scsi 0:0:0:0: Direct-Access FUJITSU MAT3073NP 0105 PQ: 0 ANSI: 3 target0:0:0: asynchronous scsi0:A:0:0: Tagged Queuing enabled. 
Depth 32 target0:0:0: Beginning Domain Validation target0:0:0: wide asynchronous target0:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127) target0:0:0: Ending Domain Validation scsi 0:0:1:0: Direct-Access FUJITSU MAT3073NP 0105 PQ: 0 ANSI: 3 target0:0:1: asynchronous scsi0:A:1:0: Tagged Queuing enabled. Depth 32 target0:0:1: Beginning Domain Validation target0:0:1: wide asynchronous target0:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127) target0:0:1: Ending Domain Validation ACPI: PCI Interrupt 0000:02:06.1[B] -> GSI 25 (level, low) -> IRQ 25 scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0 <Adaptec AIC7902 Ultra320 SCSI adapter> aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs 3ware Storage Controller device driver for Linux v1.26.02.001. 3ware 9000 Storage Controller device driver for Linux v2.26.02.008. SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB) sda: Write Protect is off sda: Mode Sense: b3 00 00 08 SCSI device sda: drive cache: write back SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB) sda: Write Protect is off sda: Mode Sense: b3 00 00 08 SCSI device sda: drive cache: write back sda: sda1 sda2 sd 0:0:0:0: Attached scsi disk sda SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB) sdb: Write Protect is off sdb: Mode Sense: b3 00 00 08 SCSI device sdb: drive cache: write back SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB) sdb: Write Protect is off sdb: Mode Sense: b3 00 00 08 SCSI device sdb: drive cache: write back sdb: sdb1 sdb2 sd 0:0:1:0: Attached scsi disk sdb sd 0:0:0:0: Attached scsi generic sg0 type 0 sd 0:0:1:0: Attached scsi generic sg1 type 0 Fusion MPT base driver 3.04.02 Copyright (c) 1999-2005 LSI Logic Corporation Fusion MPT SPI Host driver 3.04.02 usbmon: debugfs is not available ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI) ACPI: PCI Interrupt 
0000:03:00.0[D] -> GSI 19 (level, low) -> IRQ 19 ohci_hcd 0000:03:00.0: OHCI Host Controller ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1 ohci_hcd 0000:03:00.0: irq 19, io mem 0xfeafc000 usb usb1: configuration #1 chosen from 1 choice hub 1-0:1.0: USB hub found hub 1-0:1.0: 3 ports detected ACPI: PCI Interrupt 0000:03:00.1[D] -> GSI 19 (level, low) -> IRQ 19 ohci_hcd 0000:03:00.1: OHCI Host Controller ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 2 ohci_hcd 0000:03:00.1: irq 19, io mem 0xfeafd000 usb usb2: configuration #1 chosen from 1 choice hub 2-0:1.0: USB hub found hub 2-0:1.0: 3 ports detected USB Universal Host Controller Interface driver v3.0 Initializing USB Mass Storage driver... usbcore: registered new interface driver usb-storage USB Mass Storage support registered. usbcore: registered new interface driver hiddev usbcore: registered new interface driver usbhid /.amd_mnt/huangho/export/kwaid0/home/maan/scm/torvalds/linux-2.6/drivers/usb/input/hid-core.c: v2.6:USB HID core driver serio: i8042 KBD port at 0x60,0x64 irq 1 serio: i8042 AUX port at 0x60,0x64 irq 12 mice: PS/2 mouse device common for all mice input: PC Speaker as /class/input/input0 md: raid0 personality registered for level 0 md: multipath personality registered for level -4 TCP cubic registered NET: Registered protocol family 1 NET: Registered protocol family 17 NET: Registered protocol family 15 CCID: Registered CCID 3 (ccid3) CCID: Registered CCID 2 (ccid2) SCTP: Hash tables configured (established 65536 bind 65536) powernow-k8: Found 2 AMD Opteron(tm) Processor 250 processors (version 2.00.00) powernow-k8: MP systems not supported by PSB BIOS structure powernow-k8: MP systems not supported by PSB BIOS structure PM: Writing back config space on device 0000:02:09.0 at offset b (was 164814e4, writing 164414e4) PM: Writing back config space on device 0000:02:09.0 at offset 3 (was 804000, writing 804010) PM: Writing back config space on device 
0000:02:09.0 at offset 2 (was 2000000, writing 2000003) PM: Writing back config space on device 0000:02:09.0 at offset 1 (was 2b00000, writing 2b00146) PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4) PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010) PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003) PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106) Sending DHCP requests .<6>tg3: eth1: Link is up at 1000 Mbps, full duplex. tg3: eth1: Flow control is on for TX and on for RX. ., OK IP-Config: Got DHCP answer from 192.168.1.254, my address is 192.168.1.120 PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4) PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010) PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003) PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106) IP-Config: Complete: device=eth1, addr=192.168.1.120, mask=255.255.0.0, gw=192.168.1.254, host=node120, domain=, nis-domain=(none), bootserver=192.168.1.254, rootserver=192.168.1.254, rootpath= Freeing unused kernel memory: 2612k freed program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO md: md0 stopped. 
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO md: bind<sda2> md: bind<sdb2> md0: setting max_sectors to 128, segment boundary to 32767 raid0: looking at sdb2 raid0: comparing sdb2(55038592) with sdb2(55038592) raid0: END raid0: ==> UNIQUE raid0: 1 zones raid0: looking at sda2 raid0: comparing sda2(55038592) with sdb2(55038592) raid0: EQUAL raid0: FINAL 1 zones raid0: done. raid0 : md_size is 110077184 blocks. raid0 : conf->hash_spacing is 110077184 blocks. raid0 : nb_zone is 1. raid0 : Allocating 8 bytes for hash. program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO md: md0 stopped. 
md: unbind<sdb2> md: export_rdev(sdb2) md: unbind<sda2> md: export_rdev(sda2) program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO program parted is using a deprecated SCSI ioctl, please convert it to SG_IO md: bind<sda2> md: bind<sdb2> md0: setting max_sectors to 128, segment boundary to 32767 raid0: looking at sdb2 raid0: comparing sdb2(55038592) with sdb2(55038592) raid0: END raid0: ==> UNIQUE raid0: 1 zones raid0: looking at sda2 raid0: comparing sda2(55038592) with sdb2(55038592) raid0: EQUAL raid0: FINAL 1 zones raid0: done. raid0 : md_size is 110077184 blocks. raid0 : conf->hash_spacing is 110077184 blocks. raid0 : nb_zone is 1. raid0 : Allocating 8 bytes for hash. Adding 16779852k swap on /dev/sda1. Priority:42 extents:1 across:16779852k Adding 16779852k swap on /dev/sdb1. Priority:42 extents:1 across:16779852k warning: process `sensors' used the removed sysctl system call with 7.2.1. warning: process `sensors' used the removed sysctl system call with 7.2.1. process `syslogd' is using obsolete setsockopt SO_BSDCOMPAT -- The only person who always got his work done by Friday was Robinson Crusoe [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-22 17:42 ` Andre Noll
@ 2006-11-23 12:01 ` Mel Gorman
2006-11-23 13:08 ` Andre Noll
2006-11-23 19:09 ` Andrew Morton
0 siblings, 2 replies; 36+ messages in thread
From: Mel Gorman @ 2006-11-23 12:01 UTC (permalink / raw)
To: Andre Noll
Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, David Rientjes

On (22/11/06 18:42), Andre Noll didst pronounce:
> On 15:52, Mel Gorman wrote:
>
> > Right, so I took a closer look to see what the story was.
>
> Thanks a lot, Mel.
>

Thank you for getting back promptly.

> > Bootmem setup node 0 0000000000000000-00000000fc000000
> > Bootmem setup node 1 0000000100000000-0000000200000000
> >
> > That's node 0 PFN 0->1032192 and node 1 PFN 1048576->2097152.
> >
> > That is showing an additional 16 page frames that are not in the E820 map
> > (although I have seen this before and it didn't show up as a bad page). I
> > would be very interested in finding out what the bad_page PFNs are if this
> > bug still exists to see if it is those 16 frames. I've included a patch
> > below that might help.
> >
> > Andre, if the bug still exists for you, can you apply Andi's patch to
> > reduce the log size and the following patch please and post us the
> > output with loglevel=8 please? Thanks
>
> Done. Here's the output of dmesg with your and Andi's patch applied.
>

ahhh, I believe I see the problem now. Please try out the following patch.

====

find_min_pfn_for_node() and find_min_pfn_with_active_regions() both depend
on a sorted early_node_map[]. However, sort_node_map() is being called after
find_min_pfn_with_active_regions() in free_area_init_nodes(). In most cases,
this is ok, but on at least one x86_64, the SRAT table caused the E820 ranges
to be registered out of order. This gave the wrong values for the min PFN
range resulting in some pages not being initialised.

This patch sorts the early_node_map in find_min_pfn_for_node(). It has
been boot tested on x86, x86_64, ppc64 and ia64.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>

diff -rup linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c
--- linux-2.6.19-rc6-clean/mm/page_alloc.c	2006-11-15 20:03:40.000000000 -0800
+++ linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c	2006-11-23 02:23:57.000000000 -0800
@@ -2612,6 +2612,9 @@ unsigned long __init find_min_pfn_for_no
 {
 	int i;
 
+	/* Regions in the early_node_map can be in any order */
+	sort_node_map();
+
 	/* Assuming a sorted map, the first range found has the starting pfn */
 	for_each_active_range_index_in_nid(i, nid)
 		return early_node_map[i].start_pfn;
@@ -2680,9 +2683,6 @@ void __init free_area_init_nodes(unsigne
 			max(max_zone_pfn[i], arch_zone_lowest_possible_pfn[i]);
 	}
 
-	/* Regions in the early_node_map can be in any order */
-	sort_node_map();
-
 	/* Print out the zone ranges */
 	printk("Zone PFN ranges:\n");
 	for (i = 0; i < MAX_NR_ZONES; i++)
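The ordering problem described in the patch can be reproduced in user space. What follows is only a minimal sketch, not the kernel code: `struct range`, `first_start_pfn()` and the two `node0_start_*()` helpers are hypothetical names, and the comparison assumes the map is ordered by start PFN, which is consistent with the sorted early_node_map dump in the dmesg posted in this thread. The three ranges are the ones registered through add_active_range() in that dmesg, in SRAT registration order.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one early_node_map[] entry. */
struct range {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* Order entries by start_pfn (assumed sort key for this sketch). */
static int cmp_range(const void *a, const void *b)
{
	const struct range *ra = a;
	const struct range *rb = b;

	if (ra->start_pfn < rb->start_pfn)
		return -1;
	return ra->start_pfn > rb->start_pfn;
}

/*
 * Return the start of the first range found for nid, or 0 if there is
 * none. This is the node's lowest PFN only when the map is sorted.
 */
unsigned long first_start_pfn(const struct range *map, size_t n, int nid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (map[i].nid == nid)
			return map[i].start_pfn;
	return 0;
}

/* Ranges in the order the SRAT code registered them in the posted dmesg. */
static const struct range srat_order[3] = {
	{ 0, 256, 1032176 },
	{ 1, 1048576, 2097152 },
	{ 0, 0, 159 },
};

/* Lookup before any sort: picks up the 256 -> 1032176 range first. */
unsigned long node0_start_unsorted(void)
{
	return first_start_pfn(srat_order, 3, 0);
}

/* Sort a copy first, then look up: the 0 -> 159 range comes first. */
unsigned long node0_start_sorted(void)
{
	struct range copy[3];

	memcpy(copy, srat_order, sizeof(copy));
	qsort(copy, 3, sizeof(copy[0]), cmp_range);
	return first_start_pfn(copy, 3, 0);
}
```

With these inputs node0_start_unsorted() yields 256 and node0_start_sorted() yields 0, so a lookup made before the sort misses the 0 -> 159 range of node 0.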
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-23 12:01 ` Mel Gorman
@ 2006-11-23 13:08 ` Andre Noll
2006-11-23 13:28 ` Mel Gorman
2006-11-23 19:09 ` Andrew Morton
1 sibling, 1 reply; 36+ messages in thread
From: Andre Noll @ 2006-11-23 13:08 UTC (permalink / raw)
To: Mel Gorman
Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, David Rientjes

[-- Attachment #1: Type: text/plain, Size: 676 bytes --]

On 12:01, Mel Gorman wrote:
> > > Andre, if the bug still exists for you, can you apply Andi's patch to
> > > reduce the log size and the following patch please and post us the
> > > output with loglevel=8 please? Thanks
> >
> > Done. Here's the output of dmesg with your and Andi's patch applied.
> >
>
> ahhh, I believe I see the problem now. Please try out the following patch.

[...]

> This patch sorts the early_node_map in find_min_pfn_for_node(). It has
> been boot tested on x86, x86_64, ppc64 and ia64.

That did the trick, you're the man!

Thanks a lot
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-23 13:08 ` Andre Noll
@ 2006-11-23 13:28 ` Mel Gorman
0 siblings, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2006-11-23 13:28 UTC (permalink / raw)
To: Andre Noll
Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, David Rientjes

On Thu, 23 Nov 2006, Andre Noll wrote:

> On 12:01, Mel Gorman wrote:
>
>>>> Andre, if the bug still exists for you, can you apply Andi's patch to
>>>> reduce the log size and the following patch please and post us the
>>>> output with loglevel=8 please? Thanks
>>>
>>> Done. Here's the output of dmesg with your and Andi's patch applied.
>>>
>>
>> ahhh, I believe I see the problem now. Please try out the following patch.
>
> [...]
>
>> This patch sorts the early_node_map in find_min_pfn_for_node(). It has
>> been boot tested on x86, x86_64, ppc64 and ia64.
>
> That did the trick, you're the man!
>

heh, I was also the problem. Thanks a lot for reporting and testing.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-23 12:01 ` Mel Gorman
2006-11-23 13:08 ` Andre Noll
@ 2006-11-23 19:09 ` Andrew Morton
2006-11-23 21:55 ` Mel Gorman
1 sibling, 1 reply; 36+ messages in thread
From: Andrew Morton @ 2006-11-23 19:09 UTC (permalink / raw)
To: Mel Gorman
Cc: Andre Noll, Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Linux Kernel Mailing List, David Rientjes

On Thu, 23 Nov 2006 12:01:41 +0000
mel@skynet.ie (Mel Gorman) wrote:

> find_min_pfn_for_node() and find_min_pfn_with_active_regions() both depend
> on a sorted early_node_map[]. However, sort_node_map() is being called after
> find_min_pfn_with_active_regions() in free_area_init_nodes(). In most cases,
> this is ok, but on at least one x86_64, the SRAT table caused the E820 ranges
> to be registered out of order. This gave the wrong values for the min PFN
> range resulting in some pages not being initialised.
>
> This patch sorts the early_node_map in find_min_pfn_for_node(). It has
> been boot tested on x86, x86_64, ppc64 and ia64.
>
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
>
> diff -rup linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c
> --- linux-2.6.19-rc6-clean/mm/page_alloc.c	2006-11-15 20:03:40.000000000 -0800
> +++ linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c	2006-11-23 02:23:57.000000000 -0800
> @@ -2612,6 +2612,9 @@ unsigned long __init find_min_pfn_for_no
>  {
>  	int i;
>
> +	/* Regions in the early_node_map can be in any order */
> +	sort_node_map();
> +
>  	/* Assuming a sorted map, the first range found has the starting pfn */
>  	for_each_active_range_index_in_nid(i, nid)
>  		return early_node_map[i].start_pfn;
> @@ -2680,9 +2683,6 @@ void __init free_area_init_nodes(unsigne
>  			max(max_zone_pfn[i], arch_zone_lowest_possible_pfn[i]);
>  	}
>
> -	/* Regions in the early_node_map can be in any order */
> -	sort_node_map();
> -
>  	/* Print out the zone ranges */
>  	printk("Zone PFN ranges:\n");
>  	for (i = 0; i < MAX_NR_ZONES; i++)

Doesn't this mean that we can sort that map multiple times?

Seems a bit ... ungainly?
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-23 19:09 ` Andrew Morton
@ 2006-11-23 21:55 ` Mel Gorman
2006-11-24 9:51 ` Andre Noll
2006-11-24 9:58 ` Andi Kleen
0 siblings, 2 replies; 36+ messages in thread
From: Mel Gorman @ 2006-11-23 21:55 UTC (permalink / raw)
To: Andrew Morton
Cc: Andre Noll, Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Linux Kernel Mailing List, David Rientjes

On (23/11/06 11:09), Andrew Morton didst pronounce:
> On Thu, 23 Nov 2006 12:01:41 +0000
> mel@skynet.ie (Mel Gorman) wrote:
>
> > find_min_pfn_for_node() and find_min_pfn_with_active_regions() both depend
> > on a sorted early_node_map[]. However, sort_node_map() is being called after
> > find_min_pfn_with_active_regions() in free_area_init_nodes(). In most cases,
> > this is ok, but on at least one x86_64, the SRAT table caused the E820 ranges
> > to be registered out of order. This gave the wrong values for the min PFN
> > range resulting in some pages not being initialised.
> >
> > This patch sorts the early_node_map in find_min_pfn_for_node(). It has
> > been boot tested on x86, x86_64, ppc64 and ia64.
> >
> > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> >
> > diff -rup linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c
> > --- linux-2.6.19-rc6-clean/mm/page_alloc.c	2006-11-15 20:03:40.000000000 -0800
> > +++ linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c	2006-11-23 02:23:57.000000000 -0800
> > @@ -2612,6 +2612,9 @@ unsigned long __init find_min_pfn_for_no
> >  {
> >  	int i;
> >
> > +	/* Regions in the early_node_map can be in any order */
> > +	sort_node_map();
> > +
> >  	/* Assuming a sorted map, the first range found has the starting pfn */
> >  	for_each_active_range_index_in_nid(i, nid)
> >  		return early_node_map[i].start_pfn;
> > @@ -2680,9 +2683,6 @@ void __init free_area_init_nodes(unsigne
> >  			max(max_zone_pfn[i], arch_zone_lowest_possible_pfn[i]);
> >  	}
> >
> > -	/* Regions in the early_node_map can be in any order */
> > -	sort_node_map();
> > -
> >  	/* Print out the zone ranges */
> >  	printk("Zone PFN ranges:\n");
> >  	for (i = 0; i < MAX_NR_ZONES; i++)
>

yes, once per active node.

> Seems a bit ... ungainly?
>

It is, but this late in the cycle, I was going for the
obviously-correct-and-will-definitely-work solution. It would be sufficient
to call sort_node_map() in find_min_pfn_with_active_regions() but I wasn't
sure someone would call find_min_pfn_for_node() at some future time causing
another fun bug.

A slightly smarter, but not quite as obviously correct, patch is below if
you prefer it. It removes the assumption about early_node_map being sorted
for find_min_pfns and friends by always searching the whole map. The map
is then only sorted once when it is required. Andre, I'd appreciate it if
you could give it a spin to be 100% sure it's ok. It passed a boot-test on
a few machines here.

===========

find_min_pfn_for_node() and find_min_pfn_with_active_regions() both depend
on a sorted early_node_map[] to find the correct values. However,
sort_node_map() is being called after find_min_pfn_with_active_regions() in
free_area_init_nodes(). In most cases, this is ok, but on an x86_64, the SRAT
table caused the E820 ranges to be registered out of order. This gave the
wrong values for the min PFN range resulting in some pages not being
initialised.

This patch works by always searching the whole early_node_map[] in
find_min_pfn_for_node().

Signed-off-by: Mel Gorman <mel@csn.ul.ie>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.19-rc5-mm2-clean/mm/page_alloc.c linux-2.6.19-rc5-mm2-sort_in_find_min/mm/page_alloc.c
--- linux-2.6.19-rc5-mm2-clean/mm/page_alloc.c	2006-11-14 14:01:37.000000000 +0000
+++ linux-2.6.19-rc5-mm2-sort_in_find_min/mm/page_alloc.c	2006-11-23 20:37:18.000000000 +0000
@@ -2945,17 +2945,22 @@ static void __init sort_node_map(void)
 			cmp_node_active_region, NULL);
 }
 
-/* Find the lowest pfn for a node. This depends on a sorted early_node_map */
+/* Find the lowest pfn for a node */
 unsigned long __init find_min_pfn_for_node(unsigned long nid)
 {
 	int i;
+	unsigned long min_pfn = -1UL;
 
 	/* Assuming a sorted map, the first range found has the starting pfn */
 	for_each_active_range_index_in_nid(i, nid)
-		return early_node_map[i].start_pfn;
+		min_pfn = min(min_pfn, early_node_map[i].start_pfn);
 
-	printk(KERN_WARNING "Could not find start_pfn for node %lu\n", nid);
-	return 0;
+	if (min_pfn == -1UL) {
+		printk(KERN_WARNING "Could not find start_pfn for node %lu\n", nid);
+		return 0;
+	}
+
+	return min_pfn;
 }
 
 /**
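The second patch drops the dependency on sort order by scanning every entry and keeping the smallest start PFN. A minimal user-space sketch of that idea follows; it is not the kernel code. `struct range`, `min_start_pfn_for_node()` and `srat_order_map` are hypothetical names, `ULONG_MAX` plays the role of the patch's `-1UL` sentinel, and the ranges are again the ones from the dmesg posted in this thread, in SRAT registration order.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for one early_node_map[] entry. */
struct range {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/*
 * Scan the whole map and keep the lowest start_pfn seen for nid.
 * The result does not depend on the order of the entries.
 */
unsigned long min_start_pfn_for_node(const struct range *map, size_t n,
				     int nid)
{
	unsigned long min_pfn = ULONG_MAX;	/* sentinel: nothing found yet */
	size_t i;

	for (i = 0; i < n; i++)
		if (map[i].nid == nid && map[i].start_pfn < min_pfn)
			min_pfn = map[i].start_pfn;

	/* No range registered for this node: report 0, as the patch does. */
	if (min_pfn == ULONG_MAX)
		return 0;

	return min_pfn;
}

/* Unsorted ranges, in the order seen in the posted dmesg. */
const struct range srat_order_map[3] = {
	{ 0, 256, 1032176 },
	{ 1, 1048576, 2097152 },
	{ 0, 0, 159 },
};
```

The trade-off matches the discussion in the thread: the scan visits every entry on each call, but it cannot return a wrong minimum when a caller runs before the map has been sorted.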
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-23 21:55 ` Mel Gorman
@ 2006-11-24 9:51 ` Andre Noll
2006-11-24 9:58 ` Andi Kleen
1 sibling, 0 replies; 36+ messages in thread
From: Andre Noll @ 2006-11-24 9:51 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Linux Kernel Mailing List, David Rientjes

[-- Attachment #1: Type: text/plain, Size: 606 bytes --]

On 21:55, Mel Gorman wrote:
> A slightly smarter, but not quite as obviously correct, patch is below if
> you prefer it. It removes the assumption about early_node_map being sorted
> for find_min_pfns and friends by always searching the whole map. The map
> is then only sorted once when it is required. Andre, I'd appreciate it if
> you could give it a spin to be 100% sure it's ok. It passed a boot-test on
> a few machines here.

Yes, this one also works for me.

Acked-by: Andre Noll <maan@systemlinux.org>
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-23 21:55 ` Mel Gorman
2006-11-24 9:51 ` Andre Noll
@ 2006-11-24 9:58 ` Andi Kleen
2006-11-24 20:43 ` Andrew Morton
1 sibling, 1 reply; 36+ messages in thread
From: Andi Kleen @ 2006-11-24 9:58 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Andre Noll, discuss, Adrian Bunk, Linus Torvalds, Linux Kernel Mailing List, David Rientjes

> A slightly smarter, but not quite as obviously correct,

I think it's better to go for the "obviously correct" approach right now
And sorting multiple times should be fine

-Andi
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-24 9:58 ` Andi Kleen
@ 2006-11-24 20:43 ` Andrew Morton
0 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2006-11-24 20:43 UTC (permalink / raw)
To: Andi Kleen
Cc: Mel Gorman, Andre Noll, discuss, Adrian Bunk, Linus Torvalds, Linux Kernel Mailing List, David Rientjes

On Fri, 24 Nov 2006 10:58:55 +0100
Andi Kleen <ak@suse.de> wrote:

>
> > A slightly smarter, but not quite as obviously correct,
>
> I think it's better to go for the "obviously correct" approach right now
> And sorting multiple times should be fine
>

yup, that's what I'd decided.
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-22 10:42 ` [discuss] " Andi Kleen
2006-11-22 15:52 ` Mel Gorman
@ 2006-11-22 16:05 ` Andre Noll
2006-11-22 17:03 ` Mel Gorman
2006-11-22 17:08 ` Andi Kleen
1 sibling, 2 replies; 36+ messages in thread
From: Andre Noll @ 2006-11-22 16:05 UTC (permalink / raw)
To: Andi Kleen
Cc: discuss, Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, David Rientjes, Mel Gorman

[-- Attachment #1: Type: text/plain, Size: 858 bytes --]

On 11:42, Andi Kleen wrote:
> ject : x86_64: Bad page state in process 'swapper'
> > References : http://lkml.org/lkml/2006/11/10/135
> > http://lkml.org/lkml/2006/11/10/208
> > Submitter : Andre Noll <maan@systemlinux.org>
> > Handled-By : David Rientjes <rientjes@cs.washington.edu>
> > Status : problem is being debugged
>
> Does this still happen with -rc6?

Unfortunately, yes. I tried rc6, current git, and current git + David
Rientjes' patch. They all show the same behaviour.

> It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
> but the debugging information in the standard kernel unfortunately
> doesn't give enough output to find out where it happens.

Feel free to send me a debugging patch..

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-22 16:05 ` Andre Noll
@ 2006-11-22 17:03 ` Mel Gorman
2006-11-22 17:08 ` Andi Kleen
1 sibling, 0 replies; 36+ messages in thread
From: Mel Gorman @ 2006-11-22 17:03 UTC (permalink / raw)
To: Andre Noll
Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, David Rientjes, Mel Gorman

On Wed, 22 Nov 2006, Andre Noll wrote:

> On 11:42, Andi Kleen wrote:
>> ject : x86_64: Bad page state in process 'swapper'
>>> References : http://lkml.org/lkml/2006/11/10/135
>>> http://lkml.org/lkml/2006/11/10/208
>>> Submitter : Andre Noll <maan@systemlinux.org>
>>> Handled-By : David Rientjes <rientjes@cs.washington.edu>
>>> Status : problem is being debugged
>>
>> Does this still happen with -rc6?
>
> Unfortunately, yes. I tried rc6, current git, and current git + David
> Rientjes' patch. They all show the same behaviour.
>
>> It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
>> but the debugging information in the standard kernel unfortunately
>> doesn't give enough output to find out where it happens.
>
> Feel free to send me a debugging patch..
>

You should have received such a patch from me later in the thread. In
combination with the patch at http://lkml.org/lkml/2006/11/10/198 and a
copy of the dmesg, I might be able to guess what is going wrong.

Thanks

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
* Re: [discuss] 2.6.19-rc6: known regressions (v4)
2006-11-22 16:05 ` Andre Noll
2006-11-22 17:03 ` Mel Gorman
@ 2006-11-22 17:08 ` Andi Kleen
2006-11-22 18:00 ` Andre Noll
1 sibling, 1 reply; 36+ messages in thread
From: Andi Kleen @ 2006-11-22 17:08 UTC (permalink / raw)
To: Andre Noll
Cc: Andi Kleen, discuss, Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, David Rientjes, Mel Gorman

On Wed, Nov 22, 2006 at 05:05:49PM +0100, Andre Noll wrote:
> Unfortunately, yes. I tried rc6, current git, and current git + David
> Rientjes' patch. They all show the same behaviour.

I must have missed that patch.

>
> > It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
> > but the debugging information in the standard kernel unfortunately
> > doesn't give enough output to find out where it happens.
>
> Feel free to send me a debugging patch..

Here's one. Please send output (unless Mel finds the problem first..)

-Andi

Index: linux-2.6.19-rc6-hack/mm/page_alloc.c
===================================================================
--- linux-2.6.19-rc6-hack/mm/page_alloc.c
+++ linux-2.6.19-rc6-hack/mm/page_alloc.c
@@ -188,6 +188,10 @@ static inline int bad_range(struct zone
 
 static void bad_page(struct page *page)
 {
+	static int warned;
+	if (!warned) {
+	warned = 1;
+	printk(KERN_EMERG "page address %lx\n", page_address(page));
 	printk(KERN_EMERG "Bad page state in process '%s'\n"
 		KERN_EMERG "page:%p flags:0x%0*lx mapping:%p mapcount:%d count:%d\n"
 		KERN_EMERG "Trying to fix it up, but a reboot is needed\n"
@@ -196,6 +200,7 @@ static void bad_page(struct page *page)
 		(unsigned long)page->flags, page->mapping,
 		page_mapcount(page), page_count(page));
 	dump_stack();
+	}
 	page->flags &= ~(1 << PG_lru |
 			1 << PG_private |
 			1 << PG_locked |
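The debugging patch in this message cuts the log down by reporting only the first bad page, using a function-local static flag. A minimal user-space sketch of that print-once pattern is below. The function name `report_bad_once()` and its return value are hypothetical, and `fprintf()` with `%p` stands in for the kernel's `printk()`.

```c
#include <stdio.h>

/*
 * Print details for the first reported address only.
 * Returns 1 if a message was printed, 0 if it was suppressed.
 */
int report_bad_once(const void *addr)
{
	/* Zero-initialised once and kept across calls. */
	static int warned;

	if (warned)
		return 0;

	warned = 1;
	fprintf(stderr, "bad page at %p\n", addr);
	return 1;
}
```

The flag is not protected against concurrent callers, which is acceptable for a throwaway debugging aid but would need a lock or an atomic in code meant to stay.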
* Re: [discuss] 2.6.19-rc6: known regressions (v4) 2006-11-22 17:08 ` Andi Kleen @ 2006-11-22 18:00 ` Andre Noll 0 siblings, 0 replies; 36+ messages in thread From: Andre Noll @ 2006-11-22 18:00 UTC (permalink / raw) To: Andi Kleen Cc: discuss, Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, David Rientjes, Mel Gorman [-- Attachment #1: Type: text/plain, Size: 23730 bytes --] On 18:08, Andi Kleen wrote: > On Wed, Nov 22, 2006 at 05:05:49PM +0100, Andre Noll wrote: > > Unfortunately, yes. I tried rc6, current git, and currrent git + David > > Rientjes' patch. They all show the same behaviour. > > I must have missed that patch. He sent it to me in private. In fact, he sent several patches. This is the one I tried today and which didn't work: Hi Andre, Please try the following patch to your 2.6.19-rc5 and see if it corrects the problem (it should also apply to 2.6.19-rc6 cleanly). David --- mm/memory.c | 33 ++++++++------------------------- 1 files changed, 8 insertions(+), 25 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 156861f..74aa08b 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1483,29 +1483,14 @@ static int do_wp_page(struct mm_struct * { struct page *old_page, *new_page; pte_t entry; - int reuse = 0, ret = VM_FAULT_MINOR; - struct page *dirty_page = NULL; + int reuse, ret = VM_FAULT_MINOR; old_page = vm_normal_page(vma, address, orig_pte); if (!old_page) goto gotten; - /* - * Take out anonymous pages first, anonymous shared vmas are - * not dirty accountable. - */ - if (PageAnon(old_page)) { - if (!TestSetPageLocked(old_page)) { - reuse = can_share_swap_page(old_page); - unlock_page(old_page); - } - } else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) == - (VM_WRITE|VM_SHARED))) { - /* - * Only catch write-faults on shared writable pages, - * read-only shared pages can get COWed by - * get_user_pages(.write=1, .force=1). 
- */ + if (unlikely((vma->vm_flags & (VM_SHARED | VM_WRITE)) == + (VM_SHARED | VM_WRITE))) { if (vma->vm_ops && vma->vm_ops->page_mkwrite) { /* * Notify the address space that the page is about to @@ -1534,10 +1519,12 @@ static int do_wp_page(struct mm_struct * if (!pte_same(*page_table, orig_pte)) goto unlock; } - dirty_page = old_page; - get_page(dirty_page); reuse = 1; - } + } else if (PageAnon(old_page) && !TestSetPageLocked(old_page)) { + reuse = can_share_swap_page(old_page); + unlock_page(old_page); + } else + reuse = 0; if (reuse) { flush_cache_page(vma, address, pte_pfn(orig_pte)); @@ -1609,10 +1596,6 @@ gotten: page_cache_release(old_page); unlock: pte_unmap_unlock(page_table, ptl); - if (dirty_page) { - set_page_dirty_balance(dirty_page); - put_page(dirty_page); - } return ret; oom: if (old_page) > > Feel free to send me a debugging patch.. > Here's one. Please send output (unless Mel finds the problem first..) Here comes the output. Andre Linux version 2.6.19-rc6-andi-v2-tt64-6-g0f9005a6-dirty (maan@congo) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #12 SMP Wed Nov 22 18:54:11 CET 2006 Command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable) BIOS-e820: 00000000fbff0000 - 00000000fbfff000 (ACPI data) BIOS-e820: 00000000fbfff000 - 00000000fc000000 (ACPI NVS) BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved) BIOS-e820: 0000000100000000 - 0000000200000000 (usable) Entering add_active_range(0, 0, 159) 0 entries of 3200 used Entering add_active_range(0, 256, 1032176) 1 entries of 3200 used Entering add_active_range(0, 1048576, 2097152) 2 entries of 3200 used end_pfn_map = 2097152 DMI 2.3 present. 
ACPI: RSDP (v000 ACPIAM ) @ 0x00000000000f6bc0 ACPI: RSDT (v001 A M I OEMRSDT 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0000 ACPI: FADT (v001 A M I OEMFACP 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0200 ACPI: MADT (v001 A M I OEMAPIC 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0380 ACPI: OEMB (v001 A M I OEMBIOS 0x01000510 MSFT 0x00000097) @ 0x00000000fbfff040 ACPI: SRAT (v001 A M I OEMSRAT 0x01000510 MSFT 0x00000097) @ 0x00000000fbff34e0 ACPI: ASF! (v001 AMIASF AMDSTRET 0x00000001 INTL 0x02002026) @ 0x00000000fbff35f0 ACPI: DSDT (v001 0AAAA 0AAAA000 0x00000000 INTL 0x02002026) @ 0x0000000000000000 SRAT: PXM 0 -> APIC 0 -> Node 0 SRAT: PXM 1 -> APIC 1 -> Node 1 SRAT: Node 0 PXM 0 100000-fc000000 Entering add_active_range(0, 256, 1032176) 0 entries of 3200 used SRAT: Node 1 PXM 1 100000000-200000000 Entering add_active_range(1, 1048576, 2097152) 1 entries of 3200 used SRAT: Node 0 PXM 0 0-fc000000 Entering add_active_range(0, 0, 159) 2 entries of 3200 used Entering add_active_range(0, 256, 1032176) 3 entries of 3200 used NUMA: Using 32 for the hash shift. 
Bootmem setup node 0 0000000000000000-00000000fc000000 Bootmem setup node 1 0000000100000000-0000000200000000 Zone PFN ranges: DMA 256 -> 4096 DMA32 4096 -> 1048576 Normal 1048576 -> 2097152 early_node_map[3] active PFN ranges 0: 0 -> 159 0: 256 -> 1032176 1: 1048576 -> 2097152 On node 0 totalpages: 1031920 DMA zone: 52 pages used for memmap DMA zone: 1953 pages reserved DMA zone: 1835 pages, LIFO batch:0 DMA32 zone: 14055 pages used for memmap DMA32 zone: 1014025 pages, LIFO batch:31 Normal zone: 0 pages used for memmap On node 1 totalpages: 1048576 DMA zone: 0 pages used for memmap DMA32 zone: 0 pages used for memmap Normal zone: 14336 pages used for memmap Normal zone: 1034240 pages, LIFO batch:31 ACPI: PM-Timer IO Port: 0x5008 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) Processor #0 (Bootup-CPU) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled) Processor #1 ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23 ACPI: IOAPIC (id[0x03] address[0xfebff000] gsi_base[24]) IOAPIC[1]: apic_id 3, address 0xfebff000, GSI 24-27 ACPI: IOAPIC (id[0x04] address[0xfebfe000] gsi_base[28]) IOAPIC[2]: apic_id 4, address 0xfebfe000, GSI 28-31 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. 
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009f000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000fbff0000 - 00000000fbfff000
Nosave address range: 00000000fbfff000 - 00000000fc000000
Nosave address range: 00000000fc000000 - 00000000ff780000
Nosave address range: 00000000ff780000 - 0000000100000000
Allocating PCI resources starting at fc400000 (gap: fc000000:3780000)
PERCPU: Allocating 25728 bytes of per cpu data
Built 2 zonelists. Total pages: 2050100
Kernel command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Checking aperture...
CPU 0: aperture @ f4cc000000 size 32 MB
Aperture too small (32 MB)
No AGP bridge found
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 8000000
page address ffff8100fbef0000
Bad page state in process 'swapper'
page:ffff810003faf480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:

Call Trace:
 [<ffffffff8014f200>] bad_page+0x94/0xbe
 [<ffffffff8014f6dd>] __free_pages_ok+0x78/0xf9
 [<ffffffff805cd83c>] free_all_bootmem_core+0xce/0x1c2
 [<ffffffff805cad5d>] numa_free_all_bootmem+0x39/0x78
 [<ffffffff805ca603>] mem_init+0x59/0x16c
 [<ffffffff805bb75c>] start_kernel+0x165/0x1e7
 [<ffffffff805bb195>] x86_64_start_kernel+0x12b/0x130

Memory: 8122880k/8388608k available (3184k kernel code, 199740k reserved, 1490k data, 2612k init)
Calibrating delay using timer specific routine.. 4782.31 BogoMIPS (lpj=9564629)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 0/0 -> Node 0
Freeing SMP alternatives: 32k freed
ACPI: Core revision 20060707
Using local APIC timer interrupts.
result 12441507
Detected 12.441 MHz APIC timer.
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4777.69 BogoMIPS (lpj=9555388)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1/1 -> Node 1
AMD Opteron(tm) Processor 250 stepping 0a
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -177 cycles, maxerr 928 cycles)
Brought up 2 CPUs
testing NMI watchdog ... OK.
Disabling vsyscall due to use of PM timer
time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
time.c: Detected 2388.767 MHz processor.
migration_cost=574
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:03:06.0
PCI: Firmware left 0000:03:08.0 e100 interrupts enabled, disabling
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLB._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
AMD768 RNG detected
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 8000000 size 65536 KB
PCI-DMA: using GART IOMMU.
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
PCI: Bridge: 0000:00:06.0
  IO window: a000-bfff
  MEM window: fc900000-feafffff
  PREFETCH window: disabled.
PCI: Bridge: 0000:00:0a.0
  IO window: 9000-9fff
  MEM window: fc600000-fc8fffff
  PREFETCH window: ff500000-ff5fffff
PCI: Bridge: 0000:00:0b.0
  IO window: disabled.
  MEM window: disabled.
  PREFETCH window: disabled.
NET: Registered protocol family 2
IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
microcode: CPU0 not a capable Intel processor
microcode: CPU1 not a capable Intel processor
IA-32 Microcode Update Driver: v1.14a <tigran@veritas.com>
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ACPI: Processor [CPU1] (supports 8 throttling states)
ACPI: Getting cpuindex for acpiid 0x3
ACPI: Getting cpuindex for acpiid 0x4
Real Time Clock Driver v1.12ac
Linux agpgart interface v0.101 (c) Dave Jones
ipmi message handler version 39.0
ipmi device interface
IPMI System Interface driver.
ipmi_si: Unable to find any System Interface(s)
IPMI Watchdog: driver initialized
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
loop: loaded (max 8 devices)
Intel(R) PRO/1000 Network Driver - version 7.2.9-k4
Copyright (c) 1999-2006 Intel Corporation.
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others
ACPI: PCI Interrupt 0000:03:08.0[A] -> GSI 18 (level, low) -> IRQ 18
eth0: 0000:03:08.0, 00:E0:81:2E:78:F7, IRQ 18.
  Board assembly 567812-052, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0xd0a6c714).
e100: Intel(R) PRO/100 Network Driver, 3.5.17-k2-NAPI
e100: Copyright(c) 1999-2006 Intel Corporation
tg3.c:v3.69 (November 15, 2006)
ACPI: PCI Interrupt 0000:02:09.0[A] -> GSI 24 (level, low) -> IRQ 24
eth1: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:26
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth1: dma_rwctrl[769f4000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:02:09.1[B] -> GSI 25 (level, low) -> IRQ 25
eth2: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:27
eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth2: dma_rwctrl[769f4000] dma_mask[64-bit]
Linux video capture interface: v2.00
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD8111: IDE controller at PCI slot 0000:00:07.1
AMD8111: chipset revision 3
AMD8111: not 100% native mode: will probe irqs later
AMD8111: 0000:00:07.1 (rev 03) UDMA133 controller
  ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:pio, hdb:pio
  ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
Probing IDE interface ide0...
Probing IDE interface ide1...
ACPI: PCI Interrupt 0000:02:06.0[A] -> GSI 24 (level, low) -> IRQ 24
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
  <Adaptec AIC7902 Ultra320 SCSI adapter>
  aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
scsi 0:0:0:0: Direct-Access FUJITSU MAT3073NP 0105 PQ: 0 ANSI: 3
target0:0:0: asynchronous
scsi0:A:0:0: Tagged Queuing enabled. Depth 32
target0:0:0: Beginning Domain Validation
target0:0:0: wide asynchronous
target0:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127)
target0:0:0: Ending Domain Validation
scsi 0:0:1:0: Direct-Access FUJITSU MAT3073NP 0105 PQ: 0 ANSI: 3
target0:0:1: asynchronous
scsi0:A:1:0: Tagged Queuing enabled. Depth 32
target0:0:1: Beginning Domain Validation
target0:0:1: wide asynchronous
target0:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127)
target0:0:1: Ending Domain Validation
ACPI: PCI Interrupt 0000:02:06.1[B] -> GSI 25 (level, low) -> IRQ 25
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
  <Adaptec AIC7902 Ultra320 SCSI adapter>
  aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
3ware Storage Controller device driver for Linux v1.26.02.001.
3ware 9000 Storage Controller device driver for Linux v2.26.02.008.
SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB)
sda: Write Protect is off
sda: Mode Sense: b3 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB)
sda: Write Protect is off
sda: Mode Sense: b3 00 00 08
SCSI device sda: drive cache: write back
 sda: sda1 sda2
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB)
sdb: Write Protect is off
sdb: Mode Sense: b3 00 00 08
SCSI device sdb: drive cache: write back
SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB)
sdb: Write Protect is off
sdb: Mode Sense: b3 00 00 08
SCSI device sdb: drive cache: write back
 sdb: sdb1 sdb2
sd 0:0:1:0: Attached scsi disk sdb
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:1:0: Attached scsi generic sg1 type 0
Fusion MPT base driver 3.04.02
Copyright (c) 1999-2005 LSI Logic Corporation
Fusion MPT SPI Host driver 3.04.02
usbmon: debugfs is not available
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt 0000:03:00.0[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:03:00.0: OHCI Host Controller
ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:03:00.0: irq 19, io mem 0xfeafc000
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
ACPI: PCI Interrupt 0000:03:00.1[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:03:00.1: OHCI Host Controller
ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 2
ohci_hcd 0000:03:00.1: irq 19, io mem 0xfeafd000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
USB Universal Host Controller Interface driver v3.0
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
/.amd_mnt/huangho/export/kwaid0/home/maan/scm/torvalds/linux-2.6/drivers/usb/input/hid-core.c: v2.6:USB HID core driver
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input0
md: raid0 personality registered for level 0
md: multipath personality registered for level -4
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 15
CCID: Registered CCID 3 (ccid3)
CCID: Registered CCID 2 (ccid2)
SCTP: Hash tables configured (established 65536 bind 65536)
powernow-k8: Found 2 AMD Opteron(tm) Processor 250 processors (version 2.00.00)
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
PM: Writing back config space on device 0000:02:09.0 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.0 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.0 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.0 at offset 1 (was 2b00000, writing 2b00146)
PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106)
Sending DHCP requests .<6>tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is on for TX and on for RX.
., OK
IP-Config: Got DHCP answer from 192.168.1.254, my address is 192.168.1.120
PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106)
IP-Config: Complete:
  device=eth1, addr=192.168.1.120, mask=255.255.0.0, gw=192.168.1.254,
  host=node120, domain=, nis-domain=(none),
  bootserver=192.168.1.254, rootserver=192.168.1.254, rootpath=
Freeing unused kernel memory: 2612k freed
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: md0 stopped.
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: bind<sda2>
md: bind<sdb2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0: comparing sdb2(55038592) with sdb2(55038592)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0: comparing sda2(55038592) with sdb2(55038592)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 110077184 blocks.
raid0 : conf->hash_spacing is 110077184 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: md0 stopped.
md: unbind<sdb2>
md: export_rdev(sdb2)
md: unbind<sda2>
md: export_rdev(sda2)
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: bind<sda2>
md: bind<sdb2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0: comparing sdb2(55038592) with sdb2(55038592)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0: comparing sda2(55038592) with sdb2(55038592)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 110077184 blocks.
raid0 : conf->hash_spacing is 110077184 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
Adding 16779852k swap on /dev/sda1. Priority:42 extents:1 across:16779852k
Adding 16779852k swap on /dev/sdb1. Priority:42 extents:1 across:16779852k
warning: process `sensors' used the removed sysctl system call with 7.2.1.
warning: process `sensors' used the removed sysctl system call with 7.2.1.
process `syslogd' is using obsolete setsockopt SO_BSDCOMPAT

--
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.19-rc6: known regressions (v4)
2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
` (2 preceding siblings ...)
2006-11-22 10:42 ` [discuss] " Andi Kleen
@ 2006-11-23 0:04 ` David Brownell
3 siblings, 0 replies; 36+ messages in thread
From: David Brownell @ 2006-11-23 0:04 UTC (permalink / raw)
To: Adrian Bunk
Cc: Alexey Starikovskiy, Andrew Morton, Len Brown, Linus Torvalds, linux-acpi, Linux Kernel Mailing List

On Tuesday 21 November 2006 1:24 pm, Adrian Bunk wrote:
> Subject : ACPI: AE_TIME errors
> References : http://lkml.org/lkml/2006/11/15/12
> Submitter : David Brownell <david-b@pacbell.net>
> Handled-By : Len Brown <len.brown@intel.com>
> Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
> Status : problem is being debugged

I've not seen this in over 3 days now, and am willing to believe that
the previous instance (after manually reverting the patch identified by
Linus) was a fluke ... it's certainly not the critical/blocking kind of
issue it had previously been.

- Dave

^ permalink raw reply [flat|nested] 36+ messages in thread
* 2.6.19-rc6: known regressions with patches available
2006-11-16 4:21 Linux 2.6.19-rc6 Linus Torvalds
` (4 preceding siblings ...)
2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
@ 2006-11-23 0:54 ` Adrian Bunk
2006-11-23 1:08 ` Andrew Morton
5 siblings, 1 reply; 36+ messages in thread
From: Adrian Bunk @ 2006-11-23 0:54 UTC (permalink / raw)
To: Linus Torvalds, Andrew Morton
Cc: Linux Kernel Mailing List, Randy Dunlap, Roman Zippel, Phil Oester, Sam Ravnborg

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
with patches available.

The first issue (for an unknown reason it never occurred before - it seems
some random Kconfig change has triggered this latent bug) seems to have the
potential of affecting more users.

The second issue is so exotic that I wouldn't have listed it if there
was no patch, but considering that the patch looks safe I don't see why
this regression shouldn't be fixed in 2.6.19.


Subject : xconfig crashes on x86_64
References : http://lkml.org/lkml/2006/11/19/177
Submitter : Randy Dunlap <randy.dunlap@oracle.com>
Handled-By : Roman Zippel <zippel@linux-m68k.org>
Patch : http://lkml.org/lkml/2006/11/20/340
Status : patch available


Subject : menuconfig problems with TERM=vt100
References : http://lkml.org/lkml/2006/11/13/369
Submitter : Phil Oester <kernel@linuxace.com>
Caused-By : Sam Ravnborg <sam@ravnborg.org>
commit 350b5b76384e77bcc58217f00455fdbec5cac594
Handled-By : Roman Zippel <zippel@linux-m68k.org>
Patch : http://lkml.org/lkml/2006/11/20/341
Status : patch available

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.19-rc6: known regressions with patches available
2006-11-23 0:54 ` 2.6.19-rc6: known regressions with patches available Adrian Bunk
@ 2006-11-23 1:08 ` Andrew Morton
0 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2006-11-23 1:08 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Linux Kernel Mailing List, Randy Dunlap, Roman Zippel, Phil Oester, Sam Ravnborg

On Thu, 23 Nov 2006 01:54:57 +0100
Adrian Bunk <bunk@stusta.de> wrote:

> This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
> with patches available.
>
> The first issue (for an unknown reason it never occurred before - it seems
> some random Kconfig change has triggered this latent bug) seems to have the
> potential of affecting more users.
>
> The second issue is so exotic that I wouldn't have listed it if there
> was no patch, but considering that the patch looks safe I don't see why
> this regression shouldn't be fixed in 2.6.19.
>
>
> Subject : xconfig crashes on x86_64
> References : http://lkml.org/lkml/2006/11/19/177
> Submitter : Randy Dunlap <randy.dunlap@oracle.com>
> Handled-By : Roman Zippel <zippel@linux-m68k.org>
> Patch : http://lkml.org/lkml/2006/11/20/340
> Status : patch available
>
>
> Subject : menuconfig problems with TERM=vt100
> References : http://lkml.org/lkml/2006/11/13/369
> Submitter : Phil Oester <kernel@linuxace.com>
> Caused-By : Sam Ravnborg <sam@ravnborg.org>
> commit 350b5b76384e77bcc58217f00455fdbec5cac594
> Handled-By : Roman Zippel <zippel@linux-m68k.org>
> Patch : http://lkml.org/lkml/2006/11/20/341
> Status : patch available

I have both these queued for 2.6.19, thanks.

^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2006-11-24 20:48 UTC | newest]

Thread overview: 36+ messages
2006-11-16 4:21 Linux 2.6.19-rc6 Linus Torvalds
2006-11-16 21:37 ` 2.6.19-rc6: known regressions Adrian Bunk
2006-11-16 21:43 ` Greg KH
2006-11-17 20:40 ` 2.6.19-rc6: known regressions (v2) Adrian Bunk
2006-11-18 8:02 ` [PATCH] mm: do not call bad_page on PG_reserved check David Rientjes
2006-11-18 13:37 ` Hugh Dickins
2006-11-18 4:04 ` Linux 2.6.19-rc6 - NFSD working again Christian Kujau
2006-11-20 19:53 ` 2.6.19-rc6: known regressions (v3) Adrian Bunk
2006-11-21 21:24 ` 2.6.19-rc6: known regressions (v4) Adrian Bunk
2006-11-21 21:31 ` [discuss] " Dave Jones
2006-11-21 21:39 ` Adrian Bunk
2006-11-21 21:56 ` Dave Jones
2006-11-21 21:33 ` Vivek Goyal
2006-11-21 21:41 ` Adrian Bunk
2006-11-21 22:18 ` Linus Torvalds
2006-11-22 9:44 ` Pavel Emelianov
2006-11-22 14:58 ` Vivek Goyal
2006-11-22 17:28 ` Linus Torvalds
2006-11-22 10:42 ` [discuss] " Andi Kleen
2006-11-22 15:52 ` Mel Gorman
2006-11-22 17:42 ` Andre Noll
2006-11-23 12:01 ` Mel Gorman
2006-11-23 13:08 ` Andre Noll
2006-11-23 13:28 ` Mel Gorman
2006-11-23 19:09 ` Andrew Morton
2006-11-23 21:55 ` Mel Gorman
2006-11-24 9:51 ` Andre Noll
2006-11-24 9:58 ` Andi Kleen
2006-11-24 20:43 ` Andrew Morton
2006-11-22 16:05 ` Andre Noll
2006-11-22 17:03 ` Mel Gorman
2006-11-22 17:08 ` Andi Kleen
2006-11-22 18:00 ` Andre Noll
2006-11-23 0:04 ` David Brownell
2006-11-23 0:54 ` 2.6.19-rc6: known regressions with patches available Adrian Bunk
2006-11-23 1:08 ` Andrew Morton