* Linux 2.6.19-rc5
@ 2006-11-08 2:33 Linus Torvalds
[not found] ` <20061108085235.GT4729@stusta.de>
` (4 more replies)
0 siblings, 5 replies; 91+ messages in thread
From: Linus Torvalds @ 2006-11-08 2:33 UTC (permalink / raw)
To: Linux Kernel Mailing List
Ok, things are finally calming down, it seems.
The -rc5 thing is mainly a few random architecture updates (arm, mips,
uml, avr, power) and the only really noticeable one there is likely some
fixes to the local APIC accesses on x86, which apparently fixes a few
machines.
The rest is really mostly one-liners (or close) to various subsystems. New
PCI ID's, trivial fixes, cifs, dvb, things like that. I'm feeling better
about this - there may be a -rc6, but maybe we don't even need one.
As usual, thanks to everybody who tested and chased down some of the
regressions,
Linus
---
Adrian Bunk (2):
[TIPC] net/tipc/port.c: fix NULL dereference
PCI: Let PCI_MULTITHREAD_PROBE depend on BROKEN
Akinobu Mita (4):
tokenring: fix module_init error handling
n2: fix confusing error code
edac_mc: fix error handling
sunrpc: add missing spin_unlock
Al Viro (8):
[IPV6]: File the fingerprints off ah6->spi/esp6->spi
[IPX]: Trivial parts of endianness annotations
[IPX]: Annotate and fix IPX checksum
[IPV6]: Fix ECN bug on big-endian
[NETFILTER] bug: NFULA_CFG_QTHRESH uses 32bit
[NETFILTER] bug: nfulnl_msg_config_mode ->copy_range is 32bit
[NETFILTER] bug: skb->protocol is already net-endian
[PKTGEN]: TCI endianness fixes
Alexey Dobriyan (1):
[GFS2] don't panic needlessly
Amol Lad (1):
drivers/isdn/hysdn/hysdn_sched.c: sleep after taking spinlock fix
Andreas Gruenbacher (1):
Fix user.* xattr permission check for sticky dirs
Andrew Morton (6):
find_bd_holder() fix
tidy "md: check bio address after mapping through partitions"
Add printk_timed_ratelimit()
schedule removal of FUTEX_FD
acpi_noirq section fix
spi section fix
Andy Fleming (2):
[POWERPC] Fix rmb() for e500-based machines it
[POWERPC] Fix oprofile support for e500 in arch/powerpc
Ankita Garg (1):
Fix for LKDTM MEM_SWAPOUT crashpoint
Atsushi Nemoto (2):
[MIPS] Fixup migration to GENERIC_TIME
[MIPS] Do not use -msym32 option for modules.
Auke Kok (1):
e1000: Fix regression: garbled stats and irq allocation during swsusp
Ben Dooks (5):
[ARM] 3915/1: S3C2412: Add s3c2410_gpio_getirq() to general gpio.c
[ARM] 3920/1: S3C24XX: Remove smdk2410_defconfig
[ARM] 3921/1: S3C24XX: remove bast_defconfig
[ARM] 3922/1: S3C24XX: update s3c2410_defconfig to 2.6.19-rc4
[ARM] 3923/1: S3C24XX: update s3c2410_defconfig with new drivers
Benjamin Herrenschmidt (2):
[POWERPC] Fix various offb issues
[POWERPC] Make alignment exception always check exception table
Bjorn Schneider (1):
USB: new VID/PID-combos for cp2101
Brice Goglin (1):
myri10ge: ServerWorks HT2000 PCI id is already defined in pci_ids.h
Daniel Drake (1):
jfs: Add splice support
Daniel Ritz (1):
usbtouchscreen: use endpoint address from endpoint descriptor
Daniel Yeisley (1):
init_reap_node() initialization fix
Dave Kleikamp (1):
JFS: Remove redundant xattr permission checking
David Brownell (3):
USB: fix compiler issues with newer gcc versions
USB: use MII hooks only if CONFIG_MII is enabled
[ARM] 3926/1: make timer led handle HZ != 100
David Härdeman (1):
V4L/DVB (4785): Budget-ci: Change DEBIADDR_IR to a safer default
David Rientjes (1):
net s2io: return on NULL dev_alloc_skb()
David S. Miller (7):
[APPLETALK]: Fix potential OOPS in atalk_sendmsg().
[XFRM] xfrm_user: Fix unaligned accesses.
[ETH1394]: Fix unaligned accesses.
[SPARC64]: Fix Tomatillo/Schizo IRQ handling.
[SPARC64]: Add some missing print_symbol() calls.
[SPARC64]: Fix futex_atomic_cmpxchg_inatomic implementation.
[SPARC]: Fix robust futex syscalls and wire up migrate_pages.
Dmitry Mishin (3):
[NETFILTER]: Missed and reordered checks in {arp,ip,ip6}_tables
[NETFILTER]: ip_tables: compat code module refcounting fix
[IPV6]: Add ndisc_netdev_notifier unregister.
Dominic Cerquetti (1):
USB: xpad: additional USB id's added
Enrico Scholz (1):
[ARM] 3919/1: Fixed definition of some PXA270 CIF related registers
Erez Zilber (1):
IB/iser: Start connection after enabling iSER
Eric Sandeen (1):
fix UFS superblock alignment issues
Eric W. Biederman (3):
Improve the removed sysctl warnings
sysctl: allow a zero ctl_name in the middle of a sysctl table
sysctl: implement CTL_UNNUMBERED
Gautham R Shenoy (1):
Fix the spurious unlock_cpu_hotplug false warnings
Grant Grundler (1):
hid-core: big-endian fix fix
Greg Kroah-Hartman (2):
PCI: Revert "PCI: i386/x86_84: disable PCI resource decode on device disable"
USB: add another sierra wireless device id
Gui,Jian (1):
[POWERPC] Disallow kprobes on emulate_step and branch_taken
Haavard Skinnemoen (4):
AVR32: Get rid of board_early_init
AVR32: Fix thinko in generic_find_next_zero_le_bit()
AVR32: Wire up sys_epoll_pwait
AVR32: Add missing return instruction in __raw_writesb
Hartmut Hackmann (1):
V4L/DVB (4770): Fix mode switch of Compro Videomate T300
Heiko Carstens (4):
[NET]: fix uaccess handling
sys_pselect7 vs compat_sys_pselect7 uaccess error handling
[S390] revert add_active_range() usage patch.
[S390] IRQs too early enabled.
Herbert Xu (2):
[NET]: Fix segmentation of linear packets
[SCTP]: Always linearise packet on input
Hugh Dickins (3):
[POWERPC] Make current preempt-safe
[POWERPC] Make high hugepage areas preempt safe
[POWERPC] Make mmiowb's io_sync preempt safe
Jack Morgenstein (1):
IB/uverbs: Return sq_draining value in query_qp response
James Morris (3):
[IPV6]: fix lockup via /proc/net/ip6_flowlabel
[IPV6]: return EINVAL for invalid address with flowlabel lease request
[IPV6]: fix flowlabel seqfile handling
Jamie Lenehan (2):
sh: Fix IPR-IRQ's for IRQ-chip change breakage.
sh: Titan defconfig update.
Jan Luebbe (1):
USB: sierra: Fix id for Sierra Wireless MC8755 in new table
Jan Mate (1):
USB Storage: unusual_devs.h entry for Sony Ericsson P990i
Jan-Benedict Glaw (1):
Update for the srm_env driver.
Jan-Bernd Themann (1):
ehea: kzalloc GFP_ATOMIC fix
Jeff Dike (4):
uml: add _text definition to linker scripts
uml: add INITCALLS
uml: fix I/O hang
uml: include tidying
Jeff Garzik (1):
Revert "Add 0x7110 piix to ata_piix.c"
Jeff Mahoney (1):
reiserfs: reset errval after initializing bitmap cache
Jens Axboe (3):
CFQ: request <-> request merging rr_list fixup
Add 0x7110 piix to ata_piix.c
splice: fix problem introduced with inode diet
Jes Sorensen (1):
[IA64] don't double >> PAGE_SHIFT pointer for /dev/kmem access
Jiri Benc (1):
ieee80211: don't flood log with errors
Johannes Berg (1):
b44: change comment about irq mask register
Keith Owens (1):
[IA64] Correct definition of handle_IPI
Kenji Kaneshige (1):
[IA64] cpu-hotplug: Fixing confliction between CPU hot-add and IPI
Kevin Hilman (2):
[ARM] 3917/1: Fix dmabounce symbol exports
[ARM] 3918/1: ixp4xx irq-chip rework
Krishna Kumar (1):
RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count
Kristoffer Ericson (1):
video: Fix include in hp680_bl.
Larry Finger (1):
bcm43xx: fix unexpected LED control values in BCM4303 sprom
Larry Woodman (1):
[NET]: __alloc_pages() failures reported due to fragmentation
Lennert Buytenhek (3):
ep93xx_eth: fix RX/TXstatus ring full handling
ep93xx_eth: fix unlikely(x) > y test
ep93xx_eth: don't report RX errors
Linas Vepstas (1):
[POWERPC] Use 4kB iommu pages even on 64kB-page systems
Linus Torvalds (6):
i386: clean up io-apic accesses
i386: write IO APIC irq routing entries in correct order
Revert unintentional "volatile" changes in ipc/msg.c
Fix unlikely (but possible) race condition on task->user access
Make sure "user->sigpending" count is in sync
Linux 2.6.19-rc5
Manish Lachwani (1):
[MIPS] Add missing file for support of backplane on TX4927 based board
Martin Josefsson (1):
[NETFILTER]: nf_conntrack: add missing unlock in get_next_corpse()
Meelis Roos (1):
[NETFILTER]: silence a warning in ebtables
Michael Buesch (1):
bcm43xx: Fix low-traffic netdev watchdog TX timeouts
Michael Chan (1):
[TG3]: Fix 2nd ifup failure on 5752M.
Michael Halcrow (7):
eCryptfs: Clean up crypto initialization
eCryptfs: Hash code to new crypto API
eCryptfs: Cipher code to new crypto API
eCryptfs: Consolidate lower dentry_open's
eCryptfs: Remove ecryptfs_umount_begin
eCryptfs: Fix handling of lower d_count
eCryptfs: Fix pointer deref
Michael S. Tsirkin (1):
IB/mthca: Fix MAD extended header format for MAD_IFC firmware command
Naranjo Manuel Francisco (1):
USB: HID: add blacklist AIRcable USB, little beautification
NeilBrown (2):
md: check bio address after mapping through partitions.
md: send online/offline uevents when an md array starts/stops
nkalmala (1):
mm: un-needed add-store operation wastes a few bytes
OGAWA Hirofumi (4):
Cleanup read_pages()
cifs: ->readpages() fixes
fuse: ->readpages() cleanup
gfs2: ->readpages() fixes
Oleg Nesterov (2):
taskstats: fix sub-threads accounting
fix Documentation/accounting/getdelays.c buf size
Oliver Endriss (1):
V4L/DVB (4784): [saa7146_i2c] short_delay mode fixed for fast machines
Oliver Neukum (2):
USB: failure in usblp's error path
USB: usblp: fix system suspend for some systems
Paolo 'Blaisorblade' Giarrusso (11):
uml ubd driver: allow using up to 16 UBD devices
uml ubd driver: document some struct fields
uml ubd driver: var renames
uml ubd driver: give better names to some functions.
uml ubd driver: change ubd_lock to be a mutex
uml ubd driver: ubd_io_lock usage fixup
uml ubd driver: convert do_ubd to a boolean variable
uml ubd driver: reformat ubd_config
uml ubd driver: use bitfields where possible
uml ubd driver: do not store error codes as ->fd
uml ubd driver: various little changes
Patrick Caulfield (2):
[DLM] Fix kref_put oops
[DLM] fix oops in kref_put when removing a lockspace
Patrick McHardy (2):
[NETFILTER]: remove masq/NAT from ip6tables Kconfig help
[IPV6]: Give sit driver an appropriate module alias.
Paul Gortmaker (1):
[ARM] 3912/1: Make PXA270 advertise HWCAP_IWMMXT capability
Paul Mackerras (2):
IB/ehca: Fix eHCA driver compilation for uniprocessor
powerpc: Eliminate "exceeds stub group size" linker warning
Paul Moore (2):
[NetLabel]: protect the CIPSOv4 socket option from setsockopt()
[NETLABEL]: Fix build failure.
Paul Mundt (2):
sh: Wire up new syscalls.
sh: Update r7780rp_defconfig.
Pavel Emelianov (1):
Fix ipc entries removal
Pavel Roskin (1):
hostap_plx: fix CIS verification
Peer Chen (5):
[libata] sata_nv: Add PCI IDs
[libata] Add support for PATA controllers of MCP67 to pata_amd.c.
[libata] Add support for AHCI controllers of MCP67.
pci_ids.h: Add NVIDIA PCI ID
IDE: Add the support of nvidia PATA controllers of MCP67 to amd74xx.c
Peter Zijlstra (1):
lockdep: fix delayacct locking bug
Phil Dibowitz (1):
USB: usb-storage: Unusual_dev update
Rafael J. Wysocki (1):
swsusp: debugging
Ralf Baechle (26):
[MIPS] TX4927: Remove indent error message that somehow ended in the code.
[MIPS] Sort out missuse of __init for prom_getcmdline()
[MIPS] VSMP: Fix initialization ordering bug.
[MIPS] Flags must be unsigned long.
[MIPS] VSMP: Synchronize cp0 counters on bootup.
[MIPS] 16K & 64K page size fixes
[MIPS] SMTC: Fix crash if # of TC's > # of VPE's after pt_regs irq cleanup.
[MIPS] SMTC: Synchronize cp0 counters on bootup.
Revert "[MIPS] Make SPARSEMEM selectable on QEMU."
[MIPS] Fix merge screwup by patch(1)
[MIPS] IP27: Allow SMP ;-) Another changeset messed up by patch.
[MIPS] Fix warning about init_initrd() call if !CONFIG_BLK_DEV_INITRD.
[MIPS] Ocelot G: Fix : "CURRENTLY_UNUSED" is not defined warning.
[MIPS] Don't use R10000 llsc workaround version for all llsc-full processors.
[MIPS] Ocelot C: Fix large number of warnings.
[MIPS] Ocelot C: fix eth registration after conversion to platform_device
[MIPS] Ocelot C: Fix warning about missmatching format string.
[MIPS] Ocelot C: Fix mapping of ioport address range.
[MIPS] Ocelot 3: Fix large number of warnings.
[MIPS] SB1: On bootup only flush cache on local CPU.
[MIPS] Ocelot C: Fix MAC address detection after platform_device conversion.
[MIPS] Ocelot 3: Fix MAC address detection after platform_device conversion.
[MIPS] EV64120: Fix timer initialization for HZ != 100.
[MIPS] Make irq number allocator generally available for fixing EV64120.
[MIPS] EV64120: Fix PCI interrupt allocation.
[MIPS] Fix EV64120 and Ocelot builds by providing a plat_timer_setup().
Randy Dunlap (8):
[NET] sealevel: uses arp_broken_ops
[DCCP]: fix printk format warnings
SCSI: ISCSI build failure
V4L/DVB (4786): Pvrusb2: use NULL instead of 0
update some docbook comments
docbook: merge journal-api into filesystems.tmpl
lkdtm: cleanup headers and module_param/MODULE_PARM_DESC
Kconfig: remove redundant NETDEVICES depends
Ray Lehtiniemi (1):
[ARM] 3927/1: Allow show_mem() to work with holes in memory map.
Raymond Mantchala (1):
V4L/DVB (4787): Budget-ci: Inversion setting fixed for Technotrend 1500 T
Russ Anderson (1):
[IA64] MCA recovery: Montecito support
Sean Hefty (1):
RDMA/addr: Use client registration to fix module unload race
Srinivasa Ds (1):
NFS4: fix for recursive locking problem
Stephen Hemminger (4):
sky2: not experimental
skge, sky2, et all. gplv2 only
sky2: netpoll on dual port cards
[TCP]: Set default congestion control when no sysctl.
Stephen Rothwell (3):
Create compat_sys_migrate_pages
powerpc: wire up sys_migrate_pages
Fix sys_move_pages when a NULL node list is passed
Steve French (3):
[CIFS] Fix readdir breakage when blocksize set too small
[CIFS] Allow null user connections
[CIFS] report rename failure when target file is locked by Windows
Steve Wise (2):
IB/amso1100: Use dma_alloc_coherent() instead of kmalloc/dma_map_single
IB/amso1100: Fix incorrect pr_debug()
Steven Whitehouse (2):
[GFS2] Fix incorrect fs sync behaviour.
[GFS2] Fix OOM error handling
Tejun Heo (4):
sata_sis: fix flags handling for the secondary port
libata: unexport ata_dev_revalidate()
ata_piix: allow 01b MAP for both ICH6M and ICH7M
ahci: fix status register check in ahci_softreset
Thomas Klein (3):
ehea: Nullpointer dereferencation fix
ehea: Removed redundant define
ehea: 64K page support fix
Tilman Schmidt (1):
isdn/gigaset: convert warning message
Timur Tabi (1):
[POWERPC] qe_lib: qe_issue_cmd writes wrong value to CECDR
Trent Piepho (2):
V4L/DVB (4752): DVB: Add DVB_FE_CUSTOMISE support for MT2060
V4L/DVB (4751): Fix DBV_FE_CUSTOMISE for card drivers compiled into kernel
Troy Heber (1):
[IA64] move SAL_CACHE_FLUSH check later in boot
Vasily Averin (1):
[NETFILTER]: ip_tables: compat error way cleanup
Vlad Yasevich (2):
[SCTP]: Correctly set IP id for SCTP traffic
[SCTP]: Remove temporary associations from backlog and hash.
Yoichi Yuasa (3):
[MIPS] Yosemite: fix uninitialized variable in titan_i2c_xfer()
[MIPS] Fix warning of printk format in mips_srs_init()
[MIPS] Fix warning in mips-boards generic PCI
Yvan Seth (1):
ipmi_si_intf.c sets bad class_mask with PCI_DEVICE_CLASS
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] 2.6.19-rc5: known regressions
[not found] ` <20061108085235.GT4729@stusta.de>
@ 2006-11-08 9:29 ` Jan Beulich
2006-11-08 10:21 ` Adrian Bunk
2006-11-08 9:34 ` Jens Axboe
` (4 subsequent siblings)
5 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2006-11-08 9:29 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Linux Kernel Mailing List, discuss
>Subject : i386: more DWARFs and strange messages
>References : http://lkml.org/lkml/2006/10/29/127
>Submitter : Martin Lorenz <martin@lorenz.eu.org>
>Status : should be fixed by
> commit 4b96b1a10cb00c867103b21f0f2a6c91b705db11
This commit should be related only to the 'strange messages'; I'm
yet to look into the DWARFs.
Jan
* Re: 2.6.19-rc5: known regressions
[not found] ` <20061108085235.GT4729@stusta.de>
2006-11-08 9:29 ` [discuss] 2.6.19-rc5: known regressions Jan Beulich
@ 2006-11-08 9:34 ` Jens Axboe
2006-11-08 19:09 ` Alex Romosan
2006-11-08 11:04 ` Eric W. Biederman
` (3 subsequent siblings)
5 siblings, 1 reply; 91+ messages in thread
From: Jens Axboe @ 2006-11-08 9:34 UTC (permalink / raw)
To: romosan; +Cc: linux-kernel
On Wed, Nov 08 2006, Adrian Bunk wrote:
> Subject : unable to rip cd
> References : http://lkml.org/lkml/2006/10/13/100
> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> Status : unknown
Alex, was/is this repeatable? If so I'd like you to repeat with this
debug patch applied, I cannot reproduce it locally.
diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index bddfebd..ad03e19 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -1726,8 +1726,10 @@ static ide_startstop_t cdrom_newpc_intr(
/*
* write to drive
*/
- if (cdrom_write_check_ireason(drive, len, ireason))
+ if (cdrom_write_check_ireason(drive, len, ireason)) {
+ blk_dump_rq_flags(rq, "cdrom_newpc");
return ide_stopped;
+ }
xferfunc = HWIF(drive)->atapi_output_bytes;
} else {
@@ -1859,8 +1861,10 @@ static ide_startstop_t cdrom_write_intr(
}
/* Check that the drive is expecting to do the same thing we are. */
- if (cdrom_write_check_ireason(drive, len, ireason))
+ if (cdrom_write_check_ireason(drive, len, ireason)) {
+ blk_dump_rq_flags(rq, "cdrom_pc");
return ide_stopped;
+ }
sectors_to_transfer = len / SECTOR_SIZE;
--
Jens Axboe
* Re: Linux 2.6.19-rc5
2006-11-08 2:33 Linux 2.6.19-rc5 Linus Torvalds
[not found] ` <20061108085235.GT4729@stusta.de>
@ 2006-11-08 9:43 ` Nigel Cunningham
2006-11-08 9:59 ` Alessandro Suardi
2006-11-08 15:43 ` Linus Torvalds
[not found] ` <20061111015035.GU4729@stusta.de>
` (2 subsequent siblings)
4 siblings, 2 replies; 91+ messages in thread
From: Nigel Cunningham @ 2006-11-08 9:43 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Linux Kernel Mailing List
Gidday.
On Tue, 2006-11-07 at 18:33 -0800, Linus Torvalds wrote:
> Ok, things are finally calming down, it seems.
>
> The -rc5 thing is mainly a few random architecture updates (arm, mips,
> uml, avr, power) and the only really noticeable one there is likely some
> fixes to the local APIC accesses on x86, which apparently fixes a few
> machines.
>
> The rest is really mostly one-liners (or close) to various subsystems. New
> PCI ID's, trivial fixes, cifs, dvb, things like that. I'm feeling better
> about this - there may be a -rc6, but maybe we don't even need one.
>
> As usual, thanks to everybody who tested and chased down some of the
> regressions,
>
> Linus
The patch etc doesn't seem to be available yet. (The front page is still
showing -rc4, for example).
Regards,
Nigel
* Re: Linux 2.6.19-rc5
2006-11-08 9:43 ` Linux 2.6.19-rc5 Nigel Cunningham
@ 2006-11-08 9:59 ` Alessandro Suardi
2006-11-08 10:04 ` Nigel Cunningham
2006-11-08 14:19 ` Gene Heskett
2006-11-08 15:43 ` Linus Torvalds
1 sibling, 2 replies; 91+ messages in thread
From: Alessandro Suardi @ 2006-11-08 9:59 UTC (permalink / raw)
To: Nigel Cunningham; +Cc: Linus Torvalds, Linux Kernel Mailing List
On 11/8/06, Nigel Cunningham <ncunningham@linuxmail.org> wrote:
> Gidday.
>
> On Tue, 2006-11-07 at 18:33 -0800, Linus Torvalds wrote:
> > Ok, things are finally calming down, it seems.
> >
> > The -rc5 thing is mainly a few random architecture updates (arm, mips,
> > uml, avr, power) and the only really noticeable one there is likely some
> > fixes to the local APIC accesses on x86, which apparently fixes a few
> > machines.
> >
> > The rest is really mostly one-liners (or close) to various subsystems. New
> > PCI ID's, trivial fixes, cifs, dvb, things like that. I'm feeling better
> > about this - there may be a -rc6, but maybe we don't even need one.
> >
> > As usual, thanks to everybody who tested and chased down some of the
> > regressions,
> >
> > Linus
>
> The patch etc doesn't seem to be available yet. (The front page is still
> showing -rc4, for example).
The patch is available, it's just the kernel.org home that
isn't updated.
http://www.kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.19-rc5.bz2
--alessandro
"...when I get it, I _get_ it"
(Lara Eidemiller)
* Re: Linux 2.6.19-rc5
2006-11-08 9:59 ` Alessandro Suardi
@ 2006-11-08 10:04 ` Nigel Cunningham
2006-11-08 14:19 ` Gene Heskett
1 sibling, 0 replies; 91+ messages in thread
From: Nigel Cunningham @ 2006-11-08 10:04 UTC (permalink / raw)
To: Alessandro Suardi; +Cc: Linus Torvalds, Linux Kernel Mailing List
Hi.
On Wed, 2006-11-08 at 10:59 +0100, Alessandro Suardi wrote:
> On 11/8/06, Nigel Cunningham <ncunningham@linuxmail.org> wrote:
> > Gidday.
> >
> > On Tue, 2006-11-07 at 18:33 -0800, Linus Torvalds wrote:
> > > Ok, things are finally calming down, it seems.
> > >
> > > The -rc5 thing is mainly a few random architecture updates (arm, mips,
> > > uml, avr, power) and the only really noticeable one there is likely some
> > > fixes to the local APIC accesses on x86, which apparently fixes a few
> > > machines.
> > >
> > > The rest is really mostly one-liners (or close) to various subsystems. New
> > > PCI ID's, trivial fixes, cifs, dvb, things like that. I'm feeling better
> > > about this - there may be a -rc6, but maybe we don't even need one.
> > >
> > > As usual, thanks to everybody who tested and chased down some of the
> > > regressions,
> > >
> > > Linus
> >
> > The patch etc doesn't seem to be available yet. (The front page is still
> > showing -rc4, for example).
>
> The patch is available, it's just the kernel.org home that
> isn't updated.
>
> http://www.kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.19-rc5.bz2
Ta. I was more concerned that whoever needs to fix whatever's broken
knows the issue exists.
Regards,
Nigel
* Re: [discuss] 2.6.19-rc5: known regressions
2006-11-08 9:29 ` [discuss] 2.6.19-rc5: known regressions Jan Beulich
@ 2006-11-08 10:21 ` Adrian Bunk
0 siblings, 0 replies; 91+ messages in thread
From: Adrian Bunk @ 2006-11-08 10:21 UTC (permalink / raw)
To: Jan Beulich; +Cc: Linux Kernel Mailing List, discuss, Martin Lorenz
On Wed, Nov 08, 2006 at 10:29:36AM +0100, Jan Beulich wrote:
> >Subject : i386: more DWARFs and strange messages
> >References : http://lkml.org/lkml/2006/10/29/127
> >Submitter : Martin Lorenz <martin@lorenz.eu.org>
> >Status : should be fixed by
> > commit 4b96b1a10cb00c867103b21f0f2a6c91b705db11
>
> This commit should be related only to the 'strange messages'; I'm
> yet to look into the DWARFs.
Thanks for the information, I've updated it in my list.
> Jan
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
* Re: 2.6.19-rc5: known regressions
[not found] ` <20061108085235.GT4729@stusta.de>
2006-11-08 9:29 ` [discuss] 2.6.19-rc5: known regressions Jan Beulich
2006-11-08 9:34 ` Jens Axboe
@ 2006-11-08 11:04 ` Eric W. Biederman
2006-11-08 11:32 ` Thomas Gleixner
` (2 subsequent siblings)
5 siblings, 0 replies; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-08 11:04 UTC (permalink / raw)
To: Adrian Bunk
Cc: Andrew Morton, Linux Kernel Mailing List, Bryan O'Sullivan
Adrian Bunk <bunk@stusta.de> writes:
> Subject : ipath driver MCEs system on load when HT chip present
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7455
> Submitter : Bryan O'Sullivan <bos@serpentine.com>
> Caused-By : Eric W. Biederman <ebiederm@xmission.com>
> Handled-By : Bryan O'Sullivan <bos@serpentine.com>
> Eric W. Biederman <ebiederm@xmission.com>
> Status : Bryan and Eric are working on fixing the ipath driver
Except for some stupid little issues the fixes are now agreed to. Just
final code reviews and testing are needed.
Eric
* Re: 2.6.19-rc5: known regressions
[not found] ` <20061108085235.GT4729@stusta.de>
` (2 preceding siblings ...)
2006-11-08 11:04 ` Eric W. Biederman
@ 2006-11-08 11:32 ` Thomas Gleixner
[not found] ` <7813413.118221162987983254.komurojun-mbn@nifty.com>
[not found] ` <m1y7qm425l.fsf@ebiederm.dsl.xmission.com>
5 siblings, 0 replies; 91+ messages in thread
From: Thomas Gleixner @ 2006-11-08 11:32 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, mingo,
Komuro
On Wed, 2006-11-08 at 09:52 +0100, Adrian Bunk wrote:
> Subject : SMP kernel can not generate ISA irq properly
> References : http://lkml.org/lkml/2006/10/22/15
> Submitter : Komuro <komurojun-mbn@nifty.com>
> Handled-By : Thomas Gleixner <tglx@linutronix.de>
> Status : Thomas is investigating
Problem is not reproducible on any of my boxen.
Komuro,
is this still happening on -rc5 ? If yes, can you please provide the
boot log with "apic=verbose" on the commandline ?
tglx
* Re: Linux 2.6.19-rc5
2006-11-08 9:59 ` Alessandro Suardi
2006-11-08 10:04 ` Nigel Cunningham
@ 2006-11-08 14:19 ` Gene Heskett
1 sibling, 0 replies; 91+ messages in thread
From: Gene Heskett @ 2006-11-08 14:19 UTC (permalink / raw)
To: linux-kernel
On Wednesday 08 November 2006 04:59, Alessandro Suardi wrote:
>On 11/8/06, Nigel Cunningham <ncunningham@linuxmail.org> wrote:
>> Gidday.
>>
>> On Tue, 2006-11-07 at 18:33 -0800, Linus Torvalds wrote:
>> > Ok, things are finally calming down, it seems.
>> >
>> > The -rc5 thing is mainly a few random architecture updates (arm,
>> > mips, uml, avr, power) and the only really noticeable one there is
>> > likely some fixes to the local APIC accesses on x86, which apparently
>> > fixes a few machines.
>> >
>> > The rest is really mostly one-liners (or close) to various
>> > subsystems. New PCI ID's, trivial fixes, cifs, dvb, things like that.
>> > I'm feeling better about this - there may be a -rc6, but maybe we
>> > don't even need one.
>> >
>> > As usual, thanks to everybody who tested and chased down some of the
>> > regressions,
>> >
>> > Linus
>>
>> The patch etc doesn't seem to be available yet. (The front page is
>> still showing -rc4, for example).
>
>The patch is available, it's just the kernel.org home that
> isn't updated.
>
Tis now, I have it building.
>http://www.kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.19-rc5.bz2
>
>--alessandro
>
>"...when I get it, I _get_ it"
>
> (Lara Eidemiller)
--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2006 by Maurice Eugene Heskett, all rights reserved.
* Re: Linux 2.6.19-rc5
2006-11-08 9:43 ` Linux 2.6.19-rc5 Nigel Cunningham
2006-11-08 9:59 ` Alessandro Suardi
@ 2006-11-08 15:43 ` Linus Torvalds
1 sibling, 0 replies; 91+ messages in thread
From: Linus Torvalds @ 2006-11-08 15:43 UTC (permalink / raw)
To: Nigel Cunningham; +Cc: Linux Kernel Mailing List
On Wed, 8 Nov 2006, Nigel Cunningham wrote:
>
> The patch etc doesn't seem to be available yet. (The front page is still
> showing -rc4, for example).
It seems that mirroring is taking forever again. The patch and tar-balls
are definitely there on the master site, and even gitweb has mirrored out
(at least to one of the mirrors), but it looks like the mirroring hasn't
gotten to the kernel source "testing" directory yet.
Linus
* Re: Re: 2.6.19-rc5: known regressions
[not found] ` <7813413.118221162987983254.komurojun-mbn@nifty.com>
@ 2006-11-08 16:00 ` Linus Torvalds
2006-11-10 12:42 ` Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq Komuro
0 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-08 16:00 UTC (permalink / raw)
To: Komuro; +Cc: tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
mingo
On Wed, 8 Nov 2006, Komuro wrote:
>
> Intel ISA PCIC probe:
> Intel i82365sl B step ISA-to-PCMCIA at port 0x3e0 ofs 0x00, 2 sockets
> host opts [0]: none
> host opts [1]: none
> ISA irqs (scanned) = 3,4,5,7,9,11,15 status change on irq 15
This definitely means that the IRQ subsystem works, at least here. That
"scanned" means that the PCMCIA driver actually tested those interrupts,
and they worked.
At that point, at least.
Of course, the "they worked" test is fairly simple, so it's by no means
foolproof, but in general, it does sound like it all really should be ok.
Komuro, if you're a git user (or are willing to learn), and it's reliable
with one particular card, it really would make most sense to bisect it.
Just start off with
git bisect start
git bisect good v2.6.18
git bisect bad v2.6.19-rc1
and off you go. That's a lot of commits (about 5000), but even if you
don't want to do the 12 or 13 kernel compiles and reboots that are needed
for a full bisection, doing just 4-5 would cut the number down a lot, and
then you can send the bisection log out.
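As a rough sanity check on those numbers, here is a minimal sketch of the halving arithmetic. It assumes a simple linear history that is cut in half on every round, which is only an approximation of what git bisect does on a real commit graph; the function names are made up for illustration.

```python
def bisect_steps(commits: int) -> int:
    """Worst-case build-and-test rounds to isolate one bad commit,
    assuming a linear history that is halved each round."""
    if commits < 1:
        raise ValueError("need at least one candidate commit")
    # Integer form of ceil(log2(commits)).
    return (commits - 1).bit_length()


def remaining_after(commits: int, rounds: int) -> int:
    """Upper bound on candidate commits left after `rounds` halvings."""
    # Ceiling division without floating point.
    return -(-commits // (2 ** rounds))


if __name__ == "__main__":
    total = 5000  # the approximate figure quoted above
    print("full bisection:", bisect_steps(total), "rounds")
    for rounds in (4, 5):
        left = remaining_after(total, rounds)
        print(f"after {rounds} rounds: at most {left} commits left")
```

Under that assumption a partial bisection of 4-5 rounds already narrows about 5000 candidates down to a few hundred, which is why even an incomplete bisection log is worth sending.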
But testing 2.6.19-rc5 is still worth it. The APIC fixes might fix it, or
some other changes might.
Linus
> warning: process `date' used the removed sysctl system call
> EXT3 FS on hda1, internal journal
> Adding 257032k swap on /dev/hda2. Priority:-1 extents:1 across:257032k
> warning: process `ls' used the removed sysctl system call
> warning: process `sleep' used the removed sysctl system call
> cs: IO port probe 0x100-0x3af: excluding 0x170-0x177 0x290-0x297 0x370-0x37f
> cs: IO port probe 0x3e0-0x4ff: excluding 0x4d0-0x4d7
> cs: IO port probe 0x820-0x8ff: clean.
> cs: IO port probe 0xc00-0xcf7: clean.
> cs: IO port probe 0xa00-0xaff: clean.
> cs: IO port probe 0x100-0x3af: excluding 0x170-0x177 0x290-0x297 0x370-0x37f
> cs: IO port probe 0x3e0-0x4ff: excluding 0x4d0-0x4d7
> cs: IO port probe 0x820-0x8ff: clean.
> cs: IO port probe 0xc00-0xcf7: clean.
> cs: IO port probe 0xa00-0xaff: clean.
>
> Best Regards
> Komuro
>
>
* Re: 2.6.19-rc5: known regressions
[not found] ` <Pine.LNX.4.64.0611080745150.3667@g5.osdl.org>
@ 2006-11-08 16:22 ` Adrian Bunk
2006-11-08 23:11 ` Tim Chen
0 siblings, 1 reply; 91+ messages in thread
From: Adrian Bunk @ 2006-11-08 16:22 UTC (permalink / raw)
To: Linus Torvalds
Cc: Eric W. Biederman, Andrew Morton, Linux Kernel Mailing List,
Tim Chen
On Wed, Nov 08, 2006 at 07:47:07AM -0800, Linus Torvalds wrote:
>
>
> On Wed, 8 Nov 2006, Eric W. Biederman wrote:
> >
> > I haven't seen anyone reproduce this but Tim Chen, and Tim wasn't
> > able to root cause the problem so I believe we are going to have
> > this regression :(
>
> Note that you really shouldn't look too closely at lmbench scheduling
> fluctuations. They can fluctuate a _lot_, especially under SMP, and it can
> depend on things like cache layout that has nothing to do with the
> scheduler (ie just code movement can make the lmbench numbers change).
>
> So there are "regressions" and there are "shit happens". It can sometimes
> be hard to tell the two apart, of course ;)
There's perhaps one thing that might help us to see whether it's just a
benchmark effect or a real problem:
With Tim's CONFIG_NR_CPUS=8, NR_IRQS only increases from 224 in 2.6.18
to 512 in 2.6.19-rc.
With CONFIG_NR_CPUS=255, NR_IRQS increases from 224 in 2.6.18
to 8416 in 2.6.19-rc.
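The two 2.6.19-rc figures above lie on a straight line in CONFIG_NR_CPUS. The sketch below fits that line from the two quoted data points only; it does not read any kernel header, so the resulting formula is an inference from the numbers in this message, and the helper names are hypothetical.

```python
from fractions import Fraction

# (CONFIG_NR_CPUS, NR_IRQS) pairs quoted above for 2.6.19-rc.
QUOTED = [(8, 512), (255, 8416)]
OLD_NR_IRQS = 224  # the 2.6.18 value quoted above


def fit_line(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = Fraction(y2 - y1, x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


def nr_irqs(nr_cpus, slope, intercept):
    """Evaluate the fitted line for a given CPU count."""
    return intercept + slope * nr_cpus


if __name__ == "__main__":
    slope, intercept = fit_line(*QUOTED)
    print(f"inferred: NR_IRQS = {intercept} + {slope} * NR_CPUS")
    for cpus, _ in QUOTED:
        value = nr_irqs(cpus, slope, intercept)
        growth = float(value / OLD_NR_IRQS)
        print(f"NR_CPUS={cpus}: NR_IRQS={value} ({growth:.1f}x the 2.6.18 value)")
```

The fit works out to 256 + 32 * NR_CPUS, so any per-IRQ cost grows with the configured CPU count, which is the effect the CONFIG_NR_CPUS=255 test is meant to expose.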
@Tim:
Can you try CONFIG_NR_CPUS=255 with both 2.6.18 and 2.6.19-rc5?
> Linus
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
* Re: 2.6.19-rc5: known regressions
2006-11-08 9:34 ` Jens Axboe
@ 2006-11-08 19:09 ` Alex Romosan
2006-11-08 19:29 ` Jens Axboe
0 siblings, 1 reply; 91+ messages in thread
From: Alex Romosan @ 2006-11-08 19:09 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel
Jens Axboe <jens.axboe@oracle.com> writes:
> On Wed, Nov 08 2006, Adrian Bunk wrote:
>> Subject : unable to rip cd
>> References : http://lkml.org/lkml/2006/10/13/100
>> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
>> Status : unknown
>
> Alex, was/is this repeatable? If so I'd like you to repeat with this
> debug patch applied, I cannot reproduce it locally.
>
> diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> index bddfebd..ad03e19 100644
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -1726,8 +1726,10 @@ static ide_startstop_t cdrom_newpc_intr(
> /*
> * write to drive
> */
> - if (cdrom_write_check_ireason(drive, len, ireason))
> + if (cdrom_write_check_ireason(drive, len, ireason)) {
> + blk_dump_rq_flags(rq, "cdrom_newpc");
> return ide_stopped;
> + }
>
> xferfunc = HWIF(drive)->atapi_output_bytes;
> } else {
> @@ -1859,8 +1861,10 @@ static ide_startstop_t cdrom_write_intr(
> }
>
> /* Check that the drive is expecting to do the same thing we are. */
> - if (cdrom_write_check_ireason(drive, len, ireason))
> + if (cdrom_write_check_ireason(drive, len, ireason)) {
> + blk_dump_rq_flags(rq, "cdrom_pc");
> return ide_stopped;
> + }
>
> sectors_to_transfer = len / SECTOR_SIZE;
>
i've tried it again with the above patch applied and when i start
cdparanoia i get:
kernel: hdc: write_intr: wrong transfer direction!
kernel: cdrom_newpc: dev hdc: type=2, flags=114c9
kernel:
kernel: sector 59534648, nr/cnr 0/0
kernel: bio 00000000, biotail c14b2800, buffer 00000000, data 00000000, len 56
kernel: cdb: 12 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00
as for the lock up, the ripping process never completes, it starts and
then it hangs somewhere in the middle of the track. it could be that
the disk has some problems. anyway, abort execution doesn't work until
i physically eject the cd from the drive (which seems to be an
improvement from a couple of rc's ago). hope this helps.
--alex--
--
| I believe the moment is at hand when, by a paranoiac and active |
| advance of the mind, it will be possible (simultaneously with |
| automatism and other passive states) to systematize confusion |
| and thus to help to discredit completely the world of reality. |
* Re: 2.6.19-rc5: known regressions
2006-11-08 19:09 ` Alex Romosan
@ 2006-11-08 19:29 ` Jens Axboe
2006-11-08 19:38 ` Alex Romosan
2006-11-08 20:03 ` Arjan van de Ven
0 siblings, 2 replies; 91+ messages in thread
From: Jens Axboe @ 2006-11-08 19:29 UTC (permalink / raw)
To: Alex Romosan; +Cc: linux-kernel
On Wed, Nov 08 2006, Alex Romosan wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
>
> > On Wed, Nov 08 2006, Adrian Bunk wrote:
> >> Subject : unable to rip cd
> >> References : http://lkml.org/lkml/2006/10/13/100
> >> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> >> Status : unknown
> >
> > Alex, was/is this repeatable? If so I'd like you to repeat with this
> > debug patch applied, I cannot reproduce it locally.
> >
> > diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> > index bddfebd..ad03e19 100644
> > --- a/drivers/ide/ide-cd.c
> > +++ b/drivers/ide/ide-cd.c
> > @@ -1726,8 +1726,10 @@ static ide_startstop_t cdrom_newpc_intr(
> > /*
> > * write to drive
> > */
> > - if (cdrom_write_check_ireason(drive, len, ireason))
> > + if (cdrom_write_check_ireason(drive, len, ireason)) {
> > + blk_dump_rq_flags(rq, "cdrom_newpc");
> > return ide_stopped;
> > + }
> >
> > xferfunc = HWIF(drive)->atapi_output_bytes;
> > } else {
> > @@ -1859,8 +1861,10 @@ static ide_startstop_t cdrom_write_intr(
> > }
> >
> > /* Check that the drive is expecting to do the same thing we are. */
> > - if (cdrom_write_check_ireason(drive, len, ireason))
> > + if (cdrom_write_check_ireason(drive, len, ireason)) {
> > + blk_dump_rq_flags(rq, "cdrom_pc");
> > return ide_stopped;
> > + }
> >
> > sectors_to_transfer = len / SECTOR_SIZE;
> >
>
> i've tried it again with the above patch applied and when i start
> cdparanoia i get:
>
> kernel: hdc: write_intr: wrong transfer direction!
> kernel: cdrom_newpc: dev hdc: type=2, flags=114c9
> kernel:
> kernel: sector 59534648, nr/cnr 0/0
> kernel: bio 00000000, biotail c14b2800, buffer 00000000, data 00000000, len 56
> kernel: cdb: 12 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00
Wonderful! So this is an INQUIRY command, yet the WRITE bit is set. The
drive gets really confused about that, for good reason. The question is
where that write bit comes from, it looks really odd. Additionally, we
have killed ->bio but ->biotail still looks valid. Perhaps it's some of
the error handling that got screwed.
> as for the lock up, the ripping process never completes, it starts and
> then it hangs somewhere in the middle of the track. it could be that
> the disk has some problems. anyway, abort execution doesn't work until
> i physically eject the cd from the drive (which seems to be an
> improvement from a couple of rc's ago). hope this helps.
It helps a lot, thanks! I may ask you to retest with another patch, if
you don't mind.
--
Jens Axboe
* Re: 2.6.19-rc5: known regressions
2006-11-08 19:29 ` Jens Axboe
@ 2006-11-08 19:38 ` Alex Romosan
2006-11-08 19:45 ` Jens Axboe
2006-11-08 20:03 ` Arjan van de Ven
1 sibling, 1 reply; 91+ messages in thread
From: Alex Romosan @ 2006-11-08 19:38 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel
Jens Axboe <jens.axboe@oracle.com> writes:
> It helps a lot, thanks! I may ask you to retest with another patch,
> if you don't mind.
send the patches, i'll test them all. thanks.
--alex--
--
| I believe the moment is at hand when, by a paranoiac and active |
| advance of the mind, it will be possible (simultaneously with |
| automatism and other passive states) to systematize confusion |
| and thus to help to discredit completely the world of reality. |
* Re: 2.6.19-rc5: known regressions
2006-11-08 19:38 ` Alex Romosan
@ 2006-11-08 19:45 ` Jens Axboe
2006-11-08 21:40 ` Alex Romosan
0 siblings, 1 reply; 91+ messages in thread
From: Jens Axboe @ 2006-11-08 19:45 UTC (permalink / raw)
To: Alex Romosan; +Cc: linux-kernel
On Wed, Nov 08 2006, Alex Romosan wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
>
> > It helps a lot, thanks! I may ask you to retest with another patch,
> > if you don't mind.
>
> send the patches, i'll test them all. thanks.
If you could retest with something crazy like this, then that would
likely help:
diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index 7c47e62..010acfa 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -630,6 +630,9 @@ static void cdrom_end_request (ide_drive
struct request *rq = HWGROUP(drive)->rq;
int nsectors = rq->hard_cur_sectors;
+ if (blk_pc_request(rq) && rq->cmd[0] == 0x12)
+ printk("ide-cd: end INQ rq %p\n", rq);
+
if (blk_sense_request(rq) && uptodate) {
/*
* For REQ_TYPE_SENSE, "rq->buffer" points to the original
@@ -1671,6 +1674,9 @@ static ide_startstop_t cdrom_newpc_intr(
xfer_func_t *xferfunc;
unsigned long flags;
+ if (rq->cmd[0] == 0x12)
+ printk("ide-cd: newpc %p\n", rq);
+
/* Check for errors. */
dma_error = 0;
dma = info->dma;
@@ -1789,6 +1795,8 @@ static ide_startstop_t cdrom_newpc_intr(
return ide_started;
end_request:
+ if (rq->cmd[0] == 0x12)
+ printk("ide-cd: newpc end INQ %p\n", rq);
if (!rq->data_len)
post_transform_command(rq);
@@ -1959,7 +1967,13 @@ static ide_startstop_t cdrom_do_block_pc
{
struct cdrom_info *info = drive->driver_data;
- rq->cmd_flags |= REQ_QUIET;
+ if (rq->cmd[0] == 0x12) {
+ printk("ide-cd: starting INQ %p\n", rq);
+ if (rq_data_dir(rq) == WRITE)
+ printk("ide-cd: INQ with write set seen\n");
+ }
+ if (!rq->bio && rq->biotail)
+ printk("ide-cd: no bio, but biotail\n");
info->dma = 0;
--
Jens Axboe
* Re: 2.6.19-rc5: known regressions
2006-11-08 19:29 ` Jens Axboe
2006-11-08 19:38 ` Alex Romosan
@ 2006-11-08 20:03 ` Arjan van de Ven
2006-11-08 20:19 ` Jens Axboe
1 sibling, 1 reply; 91+ messages in thread
From: Arjan van de Ven @ 2006-11-08 20:03 UTC (permalink / raw)
To: Jens Axboe; +Cc: Alex Romosan, linux-kernel
> Wonderful! So this is an INQUIRY command, yet the WRITE bit is set. The
> drive gets really confused about that, for good reason. The question is
> where that write bit comes from, it looks really odd. Additionally, we
it could be a userspace command; some userspace tools send inquiry via
sg...
* Re: 2.6.19-rc5: known regressions
2006-11-08 20:03 ` Arjan van de Ven
@ 2006-11-08 20:19 ` Jens Axboe
0 siblings, 0 replies; 91+ messages in thread
From: Jens Axboe @ 2006-11-08 20:19 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Alex Romosan, linux-kernel
On Wed, Nov 08 2006, Arjan van de Ven wrote:
> > Wonderful! So this is an INQUIRY command, yet the WRITE bit is set. The
> > drive gets really confused about that, for good reason. The question is
> > where that write bit comes from, it looks really odd. Additionally, we
>
> it could be a userspace command; some userspace tools send inquiry via
> sg...
it is a userspace command, it originates from SG_IO. So that is a given.
The question is where the write bit comes from, I'd be puzzled if the
user app sets it - cdparanoia in this case. Seeing as there's other
request mangling, I hope the new debug patch can shed some light on
that.
--
Jens Axboe
* Re: 2.6.19-rc5: known regressions
2006-11-08 19:45 ` Jens Axboe
@ 2006-11-08 21:40 ` Alex Romosan
0 siblings, 0 replies; 91+ messages in thread
From: Alex Romosan @ 2006-11-08 21:40 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel
Jens Axboe <jens.axboe@oracle.com> writes:
> If you could retest with something crazy like this, then that would
> likely help:
>
> diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> index 7c47e62..010acfa 100644
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -630,6 +630,9 @@ static void cdrom_end_request (ide_drive
> struct request *rq = HWGROUP(drive)->rq;
> int nsectors = rq->hard_cur_sectors;
>
> + if (blk_pc_request(rq) && rq->cmd[0] == 0x12)
> + printk("ide-cd: end INQ rq %p\n", rq);
> +
> if (blk_sense_request(rq) && uptodate) {
> /*
> * For REQ_TYPE_SENSE, "rq->buffer" points to the original
> @@ -1671,6 +1674,9 @@ static ide_startstop_t cdrom_newpc_intr(
> xfer_func_t *xferfunc;
> unsigned long flags;
>
> + if (rq->cmd[0] == 0x12)
> + printk("ide-cd: newpc %p\n", rq);
> +
> /* Check for errors. */
> dma_error = 0;
> dma = info->dma;
> @@ -1789,6 +1795,8 @@ static ide_startstop_t cdrom_newpc_intr(
> return ide_started;
>
> end_request:
> + if (rq->cmd[0] == 0x12)
> + printk("ide-cd: newpc end INQ %p\n", rq);
> if (!rq->data_len)
> post_transform_command(rq);
>
> @@ -1959,7 +1967,13 @@ static ide_startstop_t cdrom_do_block_pc
> {
> struct cdrom_info *info = drive->driver_data;
>
> - rq->cmd_flags |= REQ_QUIET;
> + if (rq->cmd[0] == 0x12) {
> + printk("ide-cd: starting INQ %p\n", rq);
> + if (rq_data_dir(rq) == WRITE)
> + printk("ide-cd: INQ with write set seen\n");
> + }
> + if (!rq->bio && rq->biotail)
> + printk("ide-cd: no bio, but biotail\n");
>
> info->dma = 0;
i applied this patch on top of the old one. this is what i get now:
kernel: ide-cd: starting INQ df5ad074
kernel: ide-cd: INQ with write set seen
kernel: ide-cd: newpc df5ad074
kernel: hdc: write_intr: wrong transfer direction!
kernel: ide-cd: end INQ rq df5ad074
kernel: cdrom_newpc: dev hdc: type=2, flags=104c9
kernel:
kernel: sector 59534648, nr/cnr 0/0
kernel: bio 00000000, biotail dee57c80, buffer 00000000, data 00000000, len 56
kernel: cdb: 12 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00
kernel: ide-cd: starting INQ df5ad074
kernel: ide-cd: newpc df5ad074
kernel: ide-cd: newpc df5ad074
kernel: ide-cd: newpc end INQ df5ad074
kernel: hdc: packet command error: status=0x51 { DriveReady SeekComplete Error }
kernel: hdc: packet command error: error=0xb4 { AbortedCommand LastFailedSense=0x0b }
kernel: ide: failed opcode was: unknown
kernel: ATAPI device hdc:
kernel: Error: Aborted command -- (Sense key=0x0b)
kernel: (reserved error code) -- (asc=0x11, ascq=0x11)
kernel: The failed "Read CD" packet command was:
kernel: "be 00 00 00 51 93 00 00 0d f8 00 00 00 00 00 00 "
kernel: hdc: packet command error: status=0x51 { DriveReady SeekComplete Error }
kernel: hdc: packet command error: error=0x30 { LastFailedSense=0x03 }
kernel: ide: failed opcode was: unknown
kernel: ATAPI device hdc:
kernel: Error: Medium error -- (Sense key=0x03)
kernel: Unrecovered read error -- (asc=0x11, ascq=0x00)
kernel: The failed "Read CD" packet command was:
kernel: "be 00 00 00 51 a0 00 00 07 f8 00 00 00 00 00 00 "
kernel: hdc: packet command error: status=0x51 { DriveReady SeekComplete Error }
kernel: hdc: packet command error: error=0xb4 { AbortedCommand LastFailedSense=0x0b }
kernel: ide: failed opcode was: unknown
kernel: ATAPI device hdc:
kernel: Error: Aborted command -- (Sense key=0x0b)
kernel: (reserved error code) -- (asc=0x11, ascq=0x11)
kernel: The failed "Read CD" packet command was:
kernel: "be 00 00 00 51 9b 00 00 0d f8 00 00 00 00 00 00 "
hdc is the cdrom drive and the errors started showing up when
cdparanoia hung.
--alex--
--
| I believe the moment is at hand when, by a paranoiac and active |
| advance of the mind, it will be possible (simultaneously with |
| automatism and other passive states) to systematize confusion |
| and thus to help to discredit completely the world of reality. |
* Re: 2.6.19-rc5: known regressions
2006-11-08 16:22 ` 2.6.19-rc5: known regressions Adrian Bunk
@ 2006-11-08 23:11 ` Tim Chen
2006-11-09 2:49 ` Tim Chen
0 siblings, 1 reply; 91+ messages in thread
From: Tim Chen @ 2006-11-08 23:11 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Eric W. Biederman, Andrew Morton,
Linux Kernel Mailing List
On Wed, 2006-11-08 at 17:22 +0100, Adrian Bunk wrote:
> There's perhaps one thing that might help us to see whether it's just a
> benchmark effect or a real problem:
>
> With Tim's CONFIG_NR_CPUS=8, NR_IRQS only increases from 224 in 2.6.18
> to 512 in 2.6.19-rc.
>
> With CONFIG_NR_CPUS=255, NR_IRQS increases from 224 in 2.6.18
> to 8416 in 2.6.19-rc.
>
> @Tim:
> Can you try CONFIG_NR_CPUS=255 with both 2.6.18 and 2.6.19-rc5?
>
With CONFIG_NR_CPUS increased from 8 to 64:
2.6.18 sees no change in fork time measured.
2.6.19-rc5 sees a 138% increase in fork time.
When I increase CONFIG_NR_CPUS to 128, the child process
from fork got killed when it executes sched_getaffinity call
in the routine to pin the process onto a processor.
This happened for both 2.6.18 and 2.6.19-rc5.
I'll need to check more carefully what lmbench is doing
there.
Tim
* Re: 2.6.19-rc5: known regressions
2006-11-08 23:11 ` Tim Chen
@ 2006-11-09 2:49 ` Tim Chen
2006-11-09 5:10 ` Eric W. Biederman
0 siblings, 1 reply; 91+ messages in thread
From: Tim Chen @ 2006-11-09 2:49 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Eric W. Biederman, Andrew Morton,
Linux Kernel Mailing List
On Wed, 2006-11-08 at 15:11 -0800, Tim Chen wrote:
> On Wed, 2006-11-08 at 17:22 +0100, Adrian Bunk wrote:
>
> > There's perhaps one thing that might help us to see whether it's just a
> > benchmark effect or a real problem:
> >
> > With Tim's CONFIG_NR_CPUS=8, NR_IRQS only increases from 224 in 2.6.18
> > to 512 in 2.6.19-rc.
> >
> > With CONFIG_NR_CPUS=255, NR_IRQS increases from 224 in 2.6.18
> > to 8416 in 2.6.19-rc.
> >
> > @Tim:
> > Can you try CONFIG_NR_CPUS=255 with both 2.6.18 and 2.6.19-rc5?
> >
>
> With CONFIG_NR_CPUS increased from 8 to 64:
> 2.6.18 sees no change in fork time measured.
> 2.6.19-rc5 sees a 138% increase in fork time.
>
Lmbench is broken in its fork time measurement.
It includes overhead time when it is pinning processes onto
specific cpu. The actual fork time is not affected by NR_IRQS.
Lmbench calls the following C library function to determine the
number of processors online before it pins the processes:
sysconf(_SC_NPROCESSORS_ONLN);
This function takes the same order of time to run as
fork itself. In addition, runtime of this function
increases with NR_IRQS. This resulted in the change in
time measured.
After hardcoding the number of online processors in lmbench,
the fork time measured now does not change with CONFIG_NR_CPUS
for both 2.6.18 and 2.6.19-rc5. So we can now conclude that
NR_IRQS does not affect fork. We can remove this particular
issue from the known regression.
Tim
* Re: 2.6.19-rc5: known regressions
2006-11-09 2:49 ` Tim Chen
@ 2006-11-09 5:10 ` Eric W. Biederman
2006-11-13 22:46 ` Tim Chen
0 siblings, 1 reply; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-09 5:10 UTC (permalink / raw)
To: tim.c.chen
Cc: Adrian Bunk, Linus Torvalds, Andrew Morton,
Linux Kernel Mailing List
Tim Chen <tim.c.chen@linux.intel.com> writes:
> On Wed, 2006-11-08 at 15:11 -0800, Tim Chen wrote:
>> On Wed, 2006-11-08 at 17:22 +0100, Adrian Bunk wrote:
>>
>> With CONFIG_NR_CPUS increased from 8 to 64:
>> 2.6.18 sees no change in fork time measured.
CONFIG_NR_CPUS has no effect on NR_IRQS in 2.6.18.
So this test unfortunately told us nothing.
>> 2.6.19-rc5 sees a 138% increase in fork time.
>>
>
> Lmbench is broken in its fork time measurement.
> It includes overhead time when it is pinning processes onto
> specific cpu. The actual fork time is not affected by NR_IRQS.
>
> Lmbench calls the following C library function to determine the
> number of processors online before it pins the processes:
> sysconf(_SC_NPROCESSORS_ONLN);
>
> This function takes the same order of time to run as
> fork itself. In addition, runtime of this function
> increases with NR_IRQS. This resulted in the change in
> time measured.
>
> After hardcoding the number of online processors in lmbench,
> the fork time measured now does not change with CONFIG_NR_CPUS
> for both 2.6.18 and 2.6.19-rc5. So we can now conclude that
> NR_IRQS does not affect fork. We can remove this particular
> issue from the known regression.
Cool. I'm glad to know it was simply a buggy lmbench.
What is sysconf(_SC_NPROCESSORS_ONLN) doing that it slows down as the
number of irqs increases? It is a slow path certainly but possibly
something we should fix. My hunch is cat /proc/cpuinfo...
Eric
* Re: Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq
2006-11-08 16:00 ` Linus Torvalds
@ 2006-11-10 12:42 ` Komuro
2006-11-13 16:02 ` Linus Torvalds
0 siblings, 1 reply; 91+ messages in thread
From: Komuro @ 2006-11-10 12:42 UTC (permalink / raw)
To: Linus Torvalds
Cc: tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
mingo
Hi,
>> Intel ISA PCIC probe:
>> Intel i82365sl B step ISA-to-PCMCIA at port 0x3e0 ofs 0x00, 2 sockets
>> host opts [0]: none
>> host opts [1]: none
>> ISA irqs (scanned) = 3,4,5,7,9,11,15 status change on irq 15
>
>This definitely means that the IRQ subsystem works, at least here. That
>"scanned" means that the PCMCIA driver actually tested those interrupts,
>and they worked.
>
>At that point, at least.
>
>Of course, the "they worked" test is fairly simple, so it's by no means
>foolproof, but in general, it does sound like it all really should be ok.
>
>
>But testing 2.6.19-rc5 is still worth it. The APIC fixes might fix it, or
>some other changes might.
>
> Linus
I tried 2.6.19-rc5; the problem still happens.
But if I remove the disable_irq_nosync() and enable_irq() calls
from linux/drivers/net/pcmcia/axnet_cs.c,
the interrupt is generated properly.
So I think enable_irq() does not enable the irq.
Thanks!
Best Regards
Komuro
* Re: [discuss] 2.6.19-rc5: known regressions (v2)
[not found] ` <20061111015035.GU4729@stusta.de>
@ 2006-11-11 9:08 ` Rafael J. Wysocki
2006-11-11 9:25 ` Paolo Ornati
0 siblings, 1 reply; 91+ messages in thread
From: Rafael J. Wysocki @ 2006-11-11 9:08 UTC (permalink / raw)
To: Adrian Bunk; +Cc: LKML, Paolo Ornati
Hi,
On Saturday, 11 November 2006 02:50, Adrian Bunk wrote:
> This email lists some known regressions in 2.6.19-rc5 compared to 2.6.18
> that are not yet fixed in Linus' tree.
>
> If you find your name in the Cc header, you are either submitter of one
> of the bugs, maintainer of an affected subsystem or driver, a patch
> of yours caused a breakage or I'm considering you in any other way possibly
> involved with one or more of these issues.
>
> Due to the huge amount of recipients, please trim the Cc when answering.
>
>
> Subject : PCI MSI setting corrupted during resume
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7479
> Submitter : Stephen Hemminger <shemminger@osdl.org>
> Status : unknown
>
>
> Subject : x86_64 boot failure: irq 22: nobody cared (hda_intel MSI)
> References : http://lkml.org/lkml/2006/11/8/98
> Submitter : Olivier Nicolas <olivn@trollprod.org>
> Status : unknown
>
>
> Subject : SMP kernel can not generate ISA irq properly
> References : http://lkml.org/lkml/2006/10/22/15
> http://lkml.org/lkml/2006/11/10/142
> Submitter : Komuro <komurojun-mbn@nifty.com>
> Handled-By : Thomas Gleixner <tglx@linutronix.de>
> Status : Thomas is investigating
>
>
> Subject : x86_64: Fix partial page check to ensure unusable memory
> is not being marked usable
> References : http://lkml.org/lkml/2006/11/9/239
> Submitter : Aaron Durbin <adurbin@google.com>
> Caused-By : Mel Gorman <mel@csn.ul.ie>
> commit 5cb248abf5ab65ab543b2d5fc16c738b28031fc0
> Patch : http://lkml.org/lkml/2006/11/9/239
> Status : patch available
>
>
> Subject : x86_64: Bad page state in process 'swapper'
> References : http://lkml.org/lkml/2006/11/10/135
> http://lkml.org/lkml/2006/11/10/208
> Submitter : Andre Noll <maan@systemlinux.org>
> Handled-By : Andi Kleen <ak@suse.de>
> Status : Andi is investigating
>
>
> Subject : x86_64: oprofile doesn't work
> References : http://lkml.org/lkml/2006/10/27/3
> Submitter : Prakash Punnoor <prakash@punnoor.de>
> Status : unknown
>
>
> Subject : weird battery charge level reported
> ACPI Error method parse / execution failed
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7466
> Submitter : Olivier Mondoloni <olivier.mondoloni@waika9.com>
> Status : unknown
>
>
> Subject : ThinkPad R50p: boot fail with (lapic && on_battery)
> References : http://lkml.org/lkml/2006/10/31/333
> Submitter : Ernst Herzberg <earny@net4u.de>
> Handled-By : Len Brown <len.brown@intel.com>
> Status : problem is being debugged
>
>
> Subject : BUG: scheduling while atomic: events/0/0x00000001/4
> after resume
> References : http://lkml.org/lkml/2006/11/2/209
> Submitter : Paolo Ornati <ornati@fastwebnet.it>
> Status : unknown
I couldn't find anything in the report that would indicate the problem occurred
after a resume. Was it really the case?
> Subject : sata-via doesn't detect anymore disks attached to VIA vt6421
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7255
> Submitter : Thierry Vignaud <tvignaud@mandriva.com>
> Status : unknown
>
>
> Subject : libata must be initialized earlier
> References : http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027945.html
> Submitter : Paul Mackerras <paulus@samba.org>
> Handled-By : Brian King <brking@us.ibm.com>
> Patch : http://marc.theaimsgroup.com/?l=linux-ide&m=116169938407596&w=2
> Status : patch available
>
>
> Subject : unable to rip cd
> References : http://lkml.org/lkml/2006/10/13/100
> http://lkml.org/lkml/2006/11/8/42
> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> Handled-By : Jens Axboe <jens.axboe@oracle.com>
> Status : Jens is investigating
Greetings,
Rafael
--
You never change things by fighting the existing reality.
R. Buckminster Fuller
* Re: [discuss] 2.6.19-rc5: known regressions (v2)
2006-11-11 9:08 ` [discuss] 2.6.19-rc5: known regressions (v2) Rafael J. Wysocki
@ 2006-11-11 9:25 ` Paolo Ornati
2006-11-11 10:49 ` Rafael J. Wysocki
0 siblings, 1 reply; 91+ messages in thread
From: Paolo Ornati @ 2006-11-11 9:25 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Adrian Bunk, LKML
On Sat, 11 Nov 2006 10:08:37 +0100
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > Subject : BUG: scheduling while atomic: events/0/0x00000001/4
> > after resume
> > References : http://lkml.org/lkml/2006/11/2/209
> > Submitter : Paolo Ornati <ornati@fastwebnet.it>
> > Status : unknown
>
> I couldn't find anything in the report that would indicate the problem occurred
> after a resume. Was it really the case?
Ahh, I've written that in another email but I trimmed LKML from CC by
mistake ;)
Relevant portion of that mail follows... anyway it seems that "-rc5" is
_OK_ since I've been running it for 2 days and it survived 9 suspend/resume
cycles.
------------------------------------------------------------------
I've reproduced it (with rc4-g4b1c46a3), and I think it is
suspend/resume related since the messages start flooding dmesg just
after a resume...
I'll see if it is reproducible just doing suspend/resume a couple of
times... and if so I'll try with -rc5.
dmesg (stripped at the end):
[ 0.000000] Linux version 2.6.19-rc4-g4b1c46a3 (paolo@tux) (gcc version 4.1.1 (Gentoo 4.1.1)) #17 PREEMPT Wed Nov 1 18:36:28 CET 2006
[ 0.000000] Command line: root=/dev/sda6 elevator=cfq video=radeonfb:1024x768@60
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000001ff30000 (usable)
[ 0.000000] BIOS-e820: 000000001ff30000 - 000000001ff40000 (ACPI data)
[ 0.000000] BIOS-e820: 000000001ff40000 - 000000001fff0000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000001fff0000 - 0000000020000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[ 0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used
[ 0.000000] Entering add_active_range(0, 256, 130864) 1 entries of 256 used
[ 0.000000] end_pfn_map = 1048576
[ 0.000000] DMI 2.3 present.
[ 0.000000] ACPI: RSDP (v000 ACPIAM ) @ 0x00000000000fa850
[ 0.000000] ACPI: RSDT (v001 A M I OEMRSDT 0x06000517 MSFT 0x00000097) @ 0x000000001ff30000
[ 0.000000] ACPI: FADT (v001 A M I OEMFACP 0x06000517 MSFT 0x00000097) @ 0x000000001ff30200
[ 0.000000] ACPI: MADT (v001 A M I OEMAPIC 0x06000517 MSFT 0x00000097) @ 0x000000001ff30390
[ 0.000000] ACPI: OEMB (v001 A M I OEMBIOS 0x06000517 MSFT 0x00000097) @ 0x000000001ff40040
[ 0.000000] ACPI: DSDT (v001 A0058 A0058002 0x00000002 MSFT 0x0100000d) @ 0x0000000000000000
[ 0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used
[ 0.000000] Entering add_active_range(0, 256, 130864) 1 entries of 256 used
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0 -> 4096
[ 0.000000] DMA32 4096 -> 1048576
[ 0.000000] Normal 1048576 -> 1048576
[ 0.000000] early_node_map[2] active PFN ranges
[ 0.000000] 0: 0 -> 159
[ 0.000000] 0: 256 -> 130864
[ 0.000000] On node 0 totalpages: 130767
[ 0.000000] DMA zone: 56 pages used for memmap
[ 0.000000] DMA zone: 1183 pages reserved
[ 0.000000] DMA zone: 2760 pages, LIFO batch:0
[ 0.000000] DMA32 zone: 1733 pages used for memmap
[ 0.000000] DMA32 zone: 125035 pages, LIFO batch:31
[ 0.000000] Normal zone: 0 pages used for memmap
[ 0.000000] ACPI: PM-Timer IO Port: 0x808
[ 0.000000] ACPI: Local APIC address 0xfee00000
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
[ 0.000000] Processor #0 (Bootup-CPU)
[ 0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 1, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: IRQ0 used by override.
[ 0.000000] ACPI: IRQ2 used by override.
[ 0.000000] ACPI: IRQ9 used by override.
[ 0.000000] Setting APIC routing to flat
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] Nosave address range: 000000000009f000 - 00000000000a0000
[ 0.000000] Nosave address range: 00000000000a0000 - 00000000000e4000
[ 0.000000] Nosave address range: 00000000000e4000 - 0000000000100000
[ 0.000000] Allocating PCI resources starting at 30000000 (gap: 20000000:dff80000)
[ 0.000000] Built 1 zonelists. Total pages: 127795
[ 0.000000] Kernel command line: root=/dev/sda6 elevator=cfq video=radeonfb:1024x768@60
[ 0.000000] Initializing CPU#0
[ 0.000000] PID hash table entries: 2048 (order: 11, 16384 bytes)
[ 32.727602] time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
[ 32.727605] time.c: Detected 2202.943 MHz processor.
[ 32.730265] Console: colour VGA+ 80x25
[ 32.733073] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 32.733509] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 32.733646] Checking aperture...
[ 32.733705] CPU 0: aperture @ f8000000 size 64 MB
[ 32.740581] Memory: 507900k/523456k available (2708k kernel code, 14716k reserved, 1343k data, 200k init)
[ 32.800438] Calibrating delay using timer specific routine.. 4409.08 BogoMIPS (lpj=2204542)
[ 32.800591] Mount-cache hash table entries: 256
[ 32.800771] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[ 32.800833] CPU: L2 Cache: 512K (64 bytes/line)
[ 32.800913] CPU: AMD Athlon(tm) 64 Processor 3200+ stepping 00
[ 32.801063] ACPI: Core revision 20060707
[ 32.814145] Using local APIC timer interrupts.
[ 32.859518] result 12516743
[ 32.859573] Detected 12.516 MHz APIC timer.
[ 32.860328] testing NMI watchdog ... OK.
[ 32.870515] checking if image is initramfs...it isn't (bad gzip magic numbers); looks like an initrd
[ 32.873517] Freeing initrd memory: 2000k freed
[ 32.875609] NET: Registered protocol family 16
[ 32.875754] ACPI: bus type pci registered
[ 32.875818] PCI: Using configuration type 1
[ 32.881215] ACPI: Interpreter enabled
[ 32.881276] ACPI: Using IOAPIC for interrupt routing
[ 32.882250] ACPI: PCI Root Bridge [PCI0] (0000:00)
[ 32.882315] PCI: Probing PCI hardware (bus 00)
[ 32.884489] PCI: enabled onboard AC97/MC97 devices
[ 32.884765] Boot video device is 0000:01:00.0
[ 32.884843] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
[ 32.896169] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 10 *11 14 15)
[ 32.896831] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 *10 11 14 15)
[ 32.897488] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 *7 10 11 14 15)
[ 32.898143] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 7 10 11 14 15)
[ 32.898809] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
[ 32.899560] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
[ 32.900312] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
[ 32.901052] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
[ 32.901700] Linux Plug and Play Support v0.97 (c) Adam Belay
[ 32.901768] pnp: PnP ACPI init
[ 32.904855] pnp: PnP ACPI: found 13 devices
[ 32.905045] SCSI subsystem initialized
[ 32.905158] usbcore: registered new interface driver usbfs
[ 32.905247] usbcore: registered new interface driver hub
[ 32.905335] usbcore: registered new device driver usb
[ 32.905452] PCI: Using ACPI for IRQ routing
[ 32.905511] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
[ 32.905592] PCI: Cannot allocate resource region 0 of device 0000:00:00.0
[ 32.905754] agpgart: Detected AGP bridge 0
[ 32.908913] agpgart: AGP aperture is 64M @ 0xf8000000
[ 32.908997] PCI-DMA: Disabling IOMMU.
[ 32.909754] PCI: Bridge: 0000:00:01.0
[ 32.909812] IO window: a000-afff
[ 32.909871] MEM window: fd100000-fd6fffff
[ 32.909931] PREFETCH window: d5000000-f4ffffff
[ 32.910006] PCI: Setting latency timer of device 0000:00:01.0 to 64
[ 32.910026] NET: Registered protocol family 2
[ 32.918232] IP route cache hash table entries: 4096 (order: 3, 32768 bytes)
[ 32.918368] TCP established hash table entries: 16384 (order: 5, 131072 bytes)
[ 32.918516] TCP bind hash table entries: 8192 (order: 4, 65536 bytes)
[ 32.918624] TCP: Hash tables configured (established 16384 bind 8192)
[ 32.918684] TCP reno registered
[ 32.919334] io scheduler noop registered
[ 32.919440] io scheduler cfq registered (default)
[ 32.920310] ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
[ 32.920477] radeonfb: Found Intel x86 BIOS ROM Image
[ 32.920539] radeonfb: Retrieved PLL infos from BIOS
[ 32.920598] radeonfb: Reference=27.00 MHz (RefDiv=12) Memory=200.00 Mhz, System=166.00 MHz
[ 32.920671] radeonfb: PLL min 20000 max 40000
[ 32.921735] radeonfb: Monitor 1 type CRT found
[ 32.921793] radeonfb: Monitor 2 type no found
[ 32.955438] Console: switching to colour frame buffer device 128x48
[ 32.975175] radeonfb (0000:01:00.0): ATI Radeon Yd
[ 32.975352] ACPI: Power Button (FF) [PWRF]
[ 32.975477] ACPI: Power Button (CM) [PWRB]
[ 32.975586] ACPI: Sleep Button (CM) [SLPB]
[ 32.977202] Real Time Clock Driver v1.12ac
[ 32.977313] Linux agpgart interface v0.101 (c) Dave Jones
[ 32.977457] [drm] Initialized drm 1.0.1 20051102
[ 32.978126] [drm] Initialized radeon 1.25.0 20060524 on minor 0
[ 32.978294] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
[ 32.978587] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 32.978834] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 32.979259] 00:0a: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 32.979511] 00:0b: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 32.979773] Floppy drive(s): fd0 is 1.44M
[ 32.994906] FDC 0 is a post-1991 82077
[ 32.996382] RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
[ 32.996766] loop: loaded (max 8 devices)
[ 32.996906] ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 17
[ 32.999471] skge 1.9 addr 0xfdc00000 irq 17 chip Yukon-Lite rev 9
[ 33.001981] skge eth0: addr 00:11:d8:1c:a0:7a
[ 33.004520] 8139too Fast Ethernet driver 0.9.28
[ 33.007047] ACPI: PCI Interrupt 0000:00:0e.0[A] -> GSI 19 (level, low) -> IRQ 19
[ 33.010258] eth1: RealTek RTL8139 at 0xffffc20000004000, 00:e0:4c:f0:ab:b8, IRQ 19
[ 33.013135] eth1: Identified 8139 chip type 'RTL-8139C'
[ 33.013151] Linux video capture interface: v2.00
[ 33.016118] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
[ 33.019236] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
[ 33.022583] VP_IDE: IDE controller at PCI slot 0000:00:0f.1
[ 33.026030] ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 20
[ 33.029676] VP_IDE: chipset revision 6
[ 33.033379] VP_IDE: not 100% native mode: will probe irqs later
[ 33.037248] VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
[ 33.041286] ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio
[ 33.045502] ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio
[ 33.049751] Probing IDE interface ide0...
[ 33.841448] hda: HL-DT-ST DVDRAM GSA-4167B, ATAPI CD/DVD-ROM drive
[ 34.152000] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[ 34.156490] Probing IDE interface ide1...
[ 34.948226] hdc: HL-DT-ST GCE-8400B, ATAPI CD/DVD-ROM drive
[ 35.257686] ide1 at 0x170-0x177,0x376 on irq 15
[ 35.264947] hda: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA(33)
[ 35.269964] Uniform CD-ROM driver Revision: 3.20
[ 35.283252] hdc: ATAPI 40X CD-ROM CD-R/RW drive, 2048kB Cache, DMA
[ 35.289395] libata version 2.00 loaded.
[ 35.289419] sata_via 0000:00:0f.0: version 2.0
[ 35.289430] ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 20
[ 35.294798] sata_via 0000:00:0f.0: routed to hard irq line 10
[ 35.300148] ata1: SATA max UDMA/133 cmd 0xE800 ctl 0xE402 bmdma 0xD400 irq 20
[ 35.305530] ata2: SATA max UDMA/133 cmd 0xE000 ctl 0xD802 bmdma 0xD408 irq 20
[ 35.310844] scsi0 : sata_via
[ 35.515983] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 35.673125] ata1.00: ATA-6, max UDMA/133, 156301488 sectors: LBA48 NCQ (depth 0/32)
[ 35.678757] ata1.00: ata1: dev 0 multi count 16
[ 35.686006] ata1.00: configured for UDMA/133
[ 35.691568] scsi1 : sata_via
[ 35.897218] ata2: SATA link down 1.5 Gbps (SStatus 0 SControl 300)
[ 35.913718] ATA: abnormal status 0x7F on port 0xE007
[ 35.919332] scsi 0:0:0:0: Direct-Access ATA ST380817AS 3.42 PQ: 0 ANSI: 5
[ 35.925134] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
[ 35.930882] sda: Write Protect is off
[ 35.936662] sda: Mode Sense: 00 3a 00 00
[ 35.936675] SCSI device sda: drive cache: write back
[ 35.942505] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
[ 35.948370] sda: Write Protect is off
[ 35.954136] sda: Mode Sense: 00 3a 00 00
[ 35.954148] SCSI device sda: drive cache: write back
[ 35.959889] sda: sda1 sda2 < sda5 sda6 sda7 sda8 >
[ 36.027032] sd 0:0:0:0: Attached scsi disk sda
[ 36.032710] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 36.038359] ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21
[ 36.044338] ehci_hcd 0000:00:10.4: EHCI Host Controller
[ 36.050131] ehci_hcd 0000:00:10.4: new USB bus registered, assigned bus number 1
[ 36.055834] ehci_hcd 0000:00:10.4: irq 21, io mem 0xfd900000
[ 36.061409] ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
[ 36.067109] usb usb1: configuration #1 chosen from 1 choice
[ 36.072686] hub 1-0:1.0: USB hub found
[ 36.078141] hub 1-0:1.0: 8 ports detected
[ 36.183732] USB Universal Host Controller Interface driver v3.0
[ 36.189136] ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21
[ 36.194595] uhci_hcd 0000:00:10.0: UHCI Host Controller
[ 36.200026] uhci_hcd 0000:00:10.0: new USB bus registered, assigned bus number 2
[ 36.205457] uhci_hcd 0000:00:10.0: irq 21, io base 0x0000b000
[ 36.211000] usb usb2: configuration #1 chosen from 1 choice
[ 36.216430] hub 2-0:1.0: USB hub found
[ 36.221776] hub 2-0:1.0: 2 ports detected
[ 36.327443] ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21
[ 36.332864] uhci_hcd 0000:00:10.1: UHCI Host Controller
[ 36.338237] uhci_hcd 0000:00:10.1: new USB bus registered, assigned bus number 3
[ 36.343600] uhci_hcd 0000:00:10.1: irq 21, io base 0x0000b400
[ 36.348985] usb usb3: configuration #1 chosen from 1 choice
[ 36.354262] hub 3-0:1.0: USB hub found
[ 36.359430] hub 3-0:1.0: 2 ports detected
[ 36.465147] ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21
[ 36.470440] uhci_hcd 0000:00:10.2: UHCI Host Controller
[ 36.475698] uhci_hcd 0000:00:10.2: new USB bus registered, assigned bus number 4
[ 36.480965] uhci_hcd 0000:00:10.2: irq 21, io base 0x0000b800
[ 36.486227] usb usb4: configuration #1 chosen from 1 choice
[ 36.491386] hub 4-0:1.0: USB hub found
[ 36.496528] hub 4-0:1.0: 2 ports detected
[ 36.601875] ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21
[ 36.607110] uhci_hcd 0000:00:10.3: UHCI Host Controller
[ 36.612333] uhci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 5
[ 36.617549] uhci_hcd 0000:00:10.3: irq 21, io base 0x0000c000
[ 36.622729] usb usb5: configuration #1 chosen from 1 choice
[ 36.627812] hub 5-0:1.0: USB hub found
[ 36.632845] hub 5-0:1.0: 2 ports detected
[ 37.161683] usbcore: registered new interface driver cdc_acm
[ 37.166690] drivers/usb/class/cdc-acm.c: v0.25:USB Abstract Control Model driver for USB modems and ISDN adapters
[ 37.171947] usbcore: registered new interface driver usblp
[ 37.177212] drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver
[ 37.182561] Initializing USB Mass Storage driver...
[ 37.187966] usbcore: registered new interface driver usb-storage
[ 37.193343] USB Mass Storage support registered.
[ 37.198710] PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
[ 37.204404] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 37.209839] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 37.215222] mice: PS/2 mouse device common for all mice
[ 37.220524] i2c /dev entries driver
[ 37.226206] Advanced Linux Sound Architecture Driver Version 1.0.13 (Sun Oct 22 08:56:16 2006 UTC).
[ 37.231946] ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 22
[ 37.239539] PCI: Setting latency timer of device 0000:00:11.5 to 64
[ 37.251526] input: AT Translated Set 2 keyboard as /class/input/input0
[ 37.751566] codec_read: codec 0 is not valid [0xfe0000]
[ 37.764869] codec_read: codec 0 is not valid [0xfe0000]
[ 37.778121] codec_read: codec 0 is not valid [0xfe0000]
[ 37.791242] codec_read: codec 0 is not valid [0xfe0000]
[ 37.808500] ALSA device list:
[ 37.813631] #0: VIA 8237 with AD1980 at 0xec00, irq 22
[ 37.818899] oprofile: using NMI interrupt.
[ 37.824105] TCP cubic registered
[ 37.829169] NET: Registered protocol family 1
[ 37.834229] NET: Registered protocol family 17
[ 37.839196] NET: Registered protocol family 15
[ 37.844134] ACPI: (supports S0 S1 S3 S4 S5)
[ 37.961630] input: ImPS/2 Logitech Wheel Mouse as /class/input/input1
[ 38.050012] RAMDISK: ext2 filesystem found at block 0
[ 38.054928] RAMDISK: Loading 2000KiB [1 disk] into ram disk... done.
[ 38.066647] VFS: Mounted root (ext2 filesystem).
[ 38.129633] kjournald starting. Commit interval 5 seconds
[ 38.134550] EXT3-fs: mounted filesystem with ordered data mode.
[ 38.139493] VFS: Mounted root (ext3 filesystem) readonly.
[ 38.144469] Trying to move old root to /initrd ... /initrd does not exist. Ignored.
[ 38.149670] Unmounting old root
[ 38.154701] Trying to free ramdisk memory ... okay
[ 38.159878] Freeing unused kernel memory: 200k freed
[ 38.164962] Write protecting the kernel read-only data: 560k
[ 40.454619] warning: process `touch' used the removed sysctl system call
[ 40.890041] warning: process `sleep' used the removed sysctl system call
[ 40.999232] warning: process `sleep' used the removed sysctl system call
[ 41.104730] warning: process `sleep' used the removed sysctl system call
[ 41.873966] warning: process `sleep' used the removed sysctl system call
[ 43.453372] EXT3 FS on sda6, internal journal
[ 43.999154] kjournald starting. Commit interval 5 seconds
[ 43.999164] EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
[ 43.999323] EXT3 FS on sda8, internal journal
[ 43.999328] EXT3-fs: mounted filesystem with ordered data mode.
[ 44.117933] Adding 1004016k swap on /dev/sda7. Priority:-1 extents:1 across:1004016k
[ 50.175883] skge eth0: enabling interface
[ 62.993341] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
[ 62.993361] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
[ 62.993436] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
[ 63.288516] [drm] Setting GART location based on new memory map
[ 63.288604] [drm] Loading R200 Microcode
[ 63.288694] [drm] writeback test succeeded in 1 usecs
[ 79.495845] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
[ 81.127324] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 81.170715] ip_conntrack version 2.4 (2044 buckets, 16352 max) - 248 bytes per conntrack
[ 180.334016] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[ 306.826671] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
[ 306.826693] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
[ 306.826768] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
[ 306.826780] [drm] Loading R200 Microcode
[ 340.827814] Stopping tasks: ==========================================================================================================================================|
[ 340.828658] Shrinking memory... done (51711 pages freed)
[ 340.925349] Suspending console(s)
[ 341.809883] pnp: Device 00:0b disabled.
[ 341.810124] pnp: Device 00:0a disabled.
[ 341.810148] radeonfb (0000:01:00.0): suspending for event: 1...
[ 341.887207] skge eth0: disabling interface
[ 341.899219] pci_set_power_state(): 0000:00:00.0: state=3, current state=5
[ 341.912910] swsusp: Need to copy 63368 pages
[ 26.062737] APIC error on CPU0: 00(00)
[ 26.062827] PCI: Setting latency timer of device 0000:00:01.0 to 64
[ 26.146592] PM: Writing back config space on device 0000:00:0a.0 at offset f (was 1f170100, writing 1f17010a)
[ 26.146599] PM: Writing back config space on device 0000:00:0a.0 at offset c (was 0, writing fdb00000)
[ 26.146609] PM: Writing back config space on device 0000:00:0a.0 at offset 5 (was 1, writing c801)
[ 26.146614] PM: Writing back config space on device 0000:00:0a.0 at offset 4 (was 0, writing fdc00000)
[ 26.146619] PM: Writing back config space on device 0000:00:0a.0 at offset 3 (was 0, writing 4010)
[ 26.146626] PM: Writing back config space on device 0000:00:0a.0 at offset 1 (was 2b00000, writing 2b00117)
[ 26.146658] skge eth0: enabling interface
[ 26.160202] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
[ 26.171176] ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21
[ 26.171218] usb usb2: root hub lost power or was reset
[ 26.182150] ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21
[ 26.182188] usb usb3: root hub lost power or was reset
[ 26.193128] ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21
[ 26.193165] usb usb4: root hub lost power or was reset
[ 26.204106] ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21
[ 26.204143] usb usb5: root hub lost power or was reset
[ 26.215084] ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21
[ 26.215109] usb usb1: root hub lost power or was reset
[ 26.215127] ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
[ 26.226082] ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 22
[ 26.226088] PCI: Setting latency timer of device 0000:00:11.5 to 64
[ 26.229463] radeonfb (0000:01:00.0): resuming from state: 1...
[ 26.247263] pnp: Failed to activate device 00:03.
[ 26.247391] pnp: Failed to activate device 00:04.
[ 26.248318] pnp: Device 00:0a activated.
[ 26.249004] pnp: Device 00:0b activated.
[ 27.134110] Restarting tasks... done
[ 27.565554] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
[ 27.565593] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
[ 27.565670] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
[ 27.565682] [drm] Loading R200 Microcode
[ 28.443446] ip_conntrack version 2.4 (2044 buckets, 16352 max) - 248 bytes per conntrack
[ 752.692523] Stopping tasks: ======================================================================================================================================|
[ 752.693363] Shrinking memory... done (58183 pages freed)
[ 756.669812] Suspending console(s)
[ 757.578446] pnp: Device 00:0b disabled.
[ 757.578702] pnp: Device 00:0a disabled.
[ 757.578727] radeonfb (0000:01:00.0): suspending for event: 1...
[ 757.655322] skge eth0: disabling interface
[ 757.695225] swsusp: Need to copy 58533 pages
[ 25.139916] APIC error on CPU0: 00(00)
[ 25.293551] PCI: Setting latency timer of device 0000:00:01.0 to 64
[ 25.377319] PM: Writing back config space on device 0000:00:0a.0 at offset f (was 1f170100, writing 1f17010a)
[ 25.377326] PM: Writing back config space on device 0000:00:0a.0 at offset c (was 0, writing fdb00000)
[ 25.377338] PM: Writing back config space on device 0000:00:0a.0 at offset 5 (was 1, writing c801)
[ 25.377343] PM: Writing back config space on device 0000:00:0a.0 at offset 4 (was 0, writing fdc00000)
[ 25.377348] PM: Writing back config space on device 0000:00:0a.0 at offset 3 (was 0, writing 4010)
[ 25.377353] PM: Writing back config space on device 0000:00:0a.0 at offset 1 (was 2b00000, writing 2b00117)
[ 25.377384] skge eth0: enabling interface
[ 25.382084] BUG: scheduling while atomic: events/0/0x00000001/4
[ 25.382086]
[ 25.382087] Call Trace:
[ 25.382097] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
[ 25.382102] [<ffffffff802f34b6>] list_add+0xc/0xe
[ 25.382107] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 25.382110] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
[ 25.382115] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 25.382119] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 25.382124] [<ffffffff80239269>] kthread+0xce/0x101
[ 25.382128] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 25.382132] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 25.382137] [<ffffffff8023919b>] kthread+0x0/0x101
[ 25.382140] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 25.382141]
[ 25.387073] BUG: scheduling while atomic: events/0/0x00000001/4
[ 25.387074]
[ 25.387075] Call Trace:
[ 25.387078] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
[ 25.387081] [<ffffffff802f34b6>] list_add+0xc/0xe
[ 25.387085] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 25.387088] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
[ 25.387092] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 25.387096] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 25.387099] [<ffffffff80239269>] kthread+0xce/0x101
[ 25.387103] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 25.387106] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 25.387111] [<ffffffff8023919b>] kthread+0x0/0x101
[ 25.387114] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 25.387115]
[ 25.391072] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
[ 25.392063] BUG: scheduling while atomic: events/0/0x00000001/4
[ 25.392065]
[ 25.392065] Call Trace:
[ 25.392068] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
[ 25.392072] [<ffffffff802f34b6>] list_add+0xc/0xe
[ 25.392075] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 25.392079] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
[ 25.392082] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 25.392086] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 25.392090] [<ffffffff80239269>] kthread+0xce/0x101
[ 25.392093] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 25.392097] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 25.392101] [<ffffffff8023919b>] kthread+0x0/0x101
[ 25.392104] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 25.392106]
[ 25.402046] ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21
[ 25.402095] usb usb2: root hub lost power or was reset
[ 25.413022] ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21
[ 25.413067] usb usb3: root hub lost power or was reset
[ 25.423998] ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21
[ 25.424043] usb usb4: root hub lost power or was reset
[ 25.434977] ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21
[ 25.435020] usb usb5: root hub lost power or was reset
[ 25.445955] ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21
[ 25.445987] usb usb1: root hub lost power or was reset
[ 25.446006] ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
[ 25.456957] ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 22
[ 25.456963] PCI: Setting latency timer of device 0000:00:11.5 to 64
[ 25.460339] radeonfb (0000:01:00.0): resuming from state: 1...
[ 25.478130] pnp: Failed to activate device 00:03.
[ 25.478258] pnp: Failed to activate device 00:04.
[ 25.479252] pnp: Device 00:0a activated.
[ 25.479961] pnp: Device 00:0b activated.
[ 26.365424] Restarting tasks... done
[ 26.441032] BUG: sleeping function called from invalid context at include/asm/semaphore.h:105
[ 26.441035] in_atomic():1, irqs_disabled():0
[ 26.441037]
[ 26.441038] Call Trace:
[ 26.441046] [<ffffffff8049ff6c>] thread_return+0x0/0xf9
[ 26.441054] [<ffffffff802226ae>] __might_sleep+0xb2/0xb4
[ 26.441058] [<ffffffff8022749a>] acquire_console_sem+0x66/0x90
[ 26.441064] [<ffffffff80358086>] console_callback+0xe/0xde
[ 26.441068] [<ffffffff80235fce>] run_workqueue+0xb6/0x126
[ 26.441072] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 26.441075] [<ffffffff802365ff>] worker_thread+0xe6/0x11b
[ 26.441079] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 26.441083] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 26.441088] [<ffffffff80239269>] kthread+0xce/0x101
[ 26.441112] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 26.441117] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 26.441121] [<ffffffff8023919b>] kthread+0x0/0x101
[ 26.441125] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 26.441126]
[ 26.441160] BUG: scheduling while atomic: events/0/0x00000001/4
[ 26.441162]
[ 26.441162] Call Trace:
[ 26.441166] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
[ 26.441171] [<ffffffff802f34b6>] list_add+0xc/0xe
[ 26.441174] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 26.441178] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
[ 26.441181] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 26.441185] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 26.441189] [<ffffffff80239269>] kthread+0xce/0x101
[ 26.441193] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 26.441196] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 26.441201] [<ffffffff8023919b>] kthread+0x0/0x101
[ 26.441204] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 26.441206]
[ 27.289287] BUG: scheduling while atomic: events/0/0x00000001/4
[ 27.289292]
[ 27.289293] Call Trace:
[ 27.289305] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
[ 27.289310] [<ffffffff802f34b6>] list_add+0xc/0xe
[ 27.289315] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 27.289319] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
[ 27.289324] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 27.289328] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 27.289333] [<ffffffff80239269>] kthread+0xce/0x101
[ 27.289337] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 27.289342] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 27.289346] [<ffffffff8023919b>] kthread+0x0/0x101
[ 27.289349] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 27.289351]
[ 27.427130] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
[ 27.427152] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
[ 27.427228] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
[ 27.427240] [drm] Loading R200 Microcode
[ 29.285334] BUG: scheduling while atomic: events/0/0x00000001/4
[ 29.285339]
[ 29.285340] Call Trace:
[ 29.285352] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
[ 29.285358] [<ffffffff802f34b6>] list_add+0xc/0xe
[ 29.285363] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 29.285366] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
[ 29.285372] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 29.285376] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 29.285380] [<ffffffff80239269>] kthread+0xce/0x101
[ 29.285384] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 29.285389] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 29.285394] [<ffffffff8023919b>] kthread+0x0/0x101
[ 29.285397] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 29.285399]
[ 31.281290] BUG: scheduling while atomic: events/0/0x00000001/4
[ 31.281295]
[ 31.281296] Call Trace:
[ 31.281309] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
[ 31.281314] [<ffffffff802f34b6>] list_add+0xc/0xe
[ 31.281319] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 31.281323] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
[ 31.281329] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 31.281333] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 31.281338] [<ffffffff80239269>] kthread+0xce/0x101
[ 31.281342] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 31.281346] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 31.281351] [<ffffffff8023919b>] kthread+0x0/0x101
[ 31.281354] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 31.281356]
[ 32.949597] ip_conntrack version 2.4 (2044 buckets, 16352 max) - 248 bytes per conntrack
[ 33.277294] BUG: scheduling while atomic: events/0/0x00000001/4
[ 33.277298]
[ 33.277299] Call Trace:
[ 33.277311] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
[ 33.277317] [<ffffffff802f34b6>] list_add+0xc/0xe
[ 33.277322] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 33.277325] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
[ 33.277331] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 33.277335] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 33.277340] [<ffffffff80239269>] kthread+0xce/0x101
[ 33.277344] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 33.277348] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 33.277353] [<ffffffff8023919b>] kthread+0x0/0x101
[ 33.277357] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 33.277359]
[ 35.273273] BUG: scheduling while atomic: events/0/0x00000001/4
[ 35.273278]
[ 35.273279] Call Trace:
[ 35.273291] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
[ 35.273296] [<ffffffff802f34b6>] list_add+0xc/0xe
[ 35.273302] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 35.273305] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
[ 35.273311] [<ffffffff802233e2>] default_wake_function+0x0/0xf
[ 35.273315] [<ffffffff80236519>] worker_thread+0x0/0x11b
[ 35.273320] [<ffffffff80239269>] kthread+0xce/0x101
[ 35.273324] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
[ 35.273329] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 35.273334] [<ffffffff8023919b>] kthread+0x0/0x101
[ 35.273337] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 35.273339]
[...]
------------------------------------------------------------------
--
Paolo Ornati
Linux 2.6.19-rc5 on x86_64
* Re: [discuss] 2.6.19-rc5: known regressions (v2)
2006-11-11 9:25 ` Paolo Ornati
@ 2006-11-11 10:49 ` Rafael J. Wysocki
2006-11-11 12:29 ` Paolo Ornati
0 siblings, 1 reply; 91+ messages in thread
From: Rafael J. Wysocki @ 2006-11-11 10:49 UTC (permalink / raw)
To: Paolo Ornati; +Cc: Adrian Bunk, LKML
On Saturday, 11 November 2006 10:25, Paolo Ornati wrote:
> On Sat, 11 Nov 2006 10:08:37 +0100
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> > > Subject : BUG: scheduling while atomic: events/0/0x00000001/4
> > > after resume
> > > References : http://lkml.org/lkml/2006/11/2/209
> > > Submitter : Paolo Ornati <ornati@fastwebnet.it>
> > > Status : unknown
> >
> > I couldn't find anything in the report that would indicate the problem occurred
> > after a resume. Was it really the case?
>
> Ahh, I've written that in another email but I trimmed LKML from CC by
> mistake ;)
>
>
> Relevant portion of that mail follows... anyway it seems that "-rc5" is
> _OK_ since I've been running it for 2 days and it has survived 9
> suspend/resume cycles.
Okay, please let us know if it survives the next several cycles.
OTOH, the problem may be hiding.
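
To make "survives the next several cycles" easier to check, here is a rough sketch (not part of the original report) that counts the "scheduling while atomic" messages per suspend/resume cycle in a saved dmesg dump. It assumes that a printk timestamp jumping backwards marks a resume boundary, as it does in the log in this thread; the function name, the regular expression, and the sample lines are only illustrative.

```python
import re

# Matches the leading "[ seconds.micros]" stamp that printk timestamps add.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")
MARKER = "BUG: scheduling while atomic"


def count_per_segment(lines):
    """Split a dmesg dump wherever the timestamp jumps backwards
    (assumed here to mean a resume) and count MARKER lines per segment."""
    counts = [0]
    last = None
    for line in lines:
        m = STAMP.match(line)
        if not m:
            continue  # skip lines without a timestamp
        stamp = float(m.group(1))
        if last is not None and stamp < last:
            counts.append(0)  # timestamp went backwards: new segment
        last = stamp
        if MARKER in m.group(2):
            counts[-1] += 1
    return counts


# A few lines taken from the log in this thread.
sample = [
    "[ 341.912910] swsusp: Need to copy 63368 pages",
    "[ 26.062737] APIC error on CPU0: 00(00)",
    "[ 27.134110] Restarting tasks... done",
    "[ 757.695225] swsusp: Need to copy 58533 pages",
    "[ 25.139916] APIC error on CPU0: 00(00)",
    "[ 25.382084] BUG: scheduling while atomic: events/0/0x00000001/4",
    "[ 25.387073] BUG: scheduling while atomic: events/0/0x00000001/4",
]
print(count_per_segment(sample))
```

Feeding it the lines of a saved dmesg file gives one count per segment, so a resume that starts flooding the log should stand out as a non-zero entry.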
> ------------------------------------------------------------------
>
> I've reproduced it (with rc4-g4b1c46a3), and I think it is
> suspend/resume related since the messages start flooding dmesg just
> after a resume...
>
> I'll see if it is reproducible just doing suspend/resume a couple of
> times... and if so I'll try with -rc5.
>
>
> dmesg (stripped at the end):
>
> [ 0.000000] Linux version 2.6.19-rc4-g4b1c46a3 (paolo@tux) (gcc version 4.1.1 (Gentoo 4.1.1)) #17 PREEMPT Wed Nov 1 18:36:28 CET 2006
> [ 0.000000] Command line: root=/dev/sda6 elevator=cfq video=radeonfb:1024x768@60
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
> [ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
> [ 0.000000] BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
> [ 0.000000] BIOS-e820: 0000000000100000 - 000000001ff30000 (usable)
> [ 0.000000] BIOS-e820: 000000001ff30000 - 000000001ff40000 (ACPI data)
> [ 0.000000] BIOS-e820: 000000001ff40000 - 000000001fff0000 (ACPI NVS)
> [ 0.000000] BIOS-e820: 000000001fff0000 - 0000000020000000 (reserved)
> [ 0.000000] BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
> [ 0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used
> [ 0.000000] Entering add_active_range(0, 256, 130864) 1 entries of 256 used
> [ 0.000000] end_pfn_map = 1048576
> [ 0.000000] DMI 2.3 present.
> [ 0.000000] ACPI: RSDP (v000 ACPIAM ) @ 0x00000000000fa850
> [ 0.000000] ACPI: RSDT (v001 A M I OEMRSDT 0x06000517 MSFT 0x00000097) @ 0x000000001ff30000
> [ 0.000000] ACPI: FADT (v001 A M I OEMFACP 0x06000517 MSFT 0x00000097) @ 0x000000001ff30200
> [ 0.000000] ACPI: MADT (v001 A M I OEMAPIC 0x06000517 MSFT 0x00000097) @ 0x000000001ff30390
> [ 0.000000] ACPI: OEMB (v001 A M I OEMBIOS 0x06000517 MSFT 0x00000097) @ 0x000000001ff40040
> [ 0.000000] ACPI: DSDT (v001 A0058 A0058002 0x00000002 MSFT 0x0100000d) @ 0x0000000000000000
> [ 0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used
> [ 0.000000] Entering add_active_range(0, 256, 130864) 1 entries of 256 used
> [ 0.000000] Zone PFN ranges:
> [ 0.000000] DMA 0 -> 4096
> [ 0.000000] DMA32 4096 -> 1048576
> [ 0.000000] Normal 1048576 -> 1048576
> [ 0.000000] early_node_map[2] active PFN ranges
> [ 0.000000] 0: 0 -> 159
> [ 0.000000] 0: 256 -> 130864
> [ 0.000000] On node 0 totalpages: 130767
> [ 0.000000] DMA zone: 56 pages used for memmap
> [ 0.000000] DMA zone: 1183 pages reserved
> [ 0.000000] DMA zone: 2760 pages, LIFO batch:0
> [ 0.000000] DMA32 zone: 1733 pages used for memmap
> [ 0.000000] DMA32 zone: 125035 pages, LIFO batch:31
> [ 0.000000] Normal zone: 0 pages used for memmap
> [ 0.000000] ACPI: PM-Timer IO Port: 0x808
> [ 0.000000] ACPI: Local APIC address 0xfee00000
> [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> [ 0.000000] Processor #0 (Bootup-CPU)
> [ 0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
> [ 0.000000] IOAPIC[0]: apic_id 1, address 0xfec00000, GSI 0-23
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [ 0.000000] ACPI: IRQ0 used by override.
> [ 0.000000] ACPI: IRQ2 used by override.
> [ 0.000000] ACPI: IRQ9 used by override.
> [ 0.000000] Setting APIC routing to flat
> [ 0.000000] Using ACPI (MADT) for SMP configuration information
> [ 0.000000] Nosave address range: 000000000009f000 - 00000000000a0000
> [ 0.000000] Nosave address range: 00000000000a0000 - 00000000000e4000
> [ 0.000000] Nosave address range: 00000000000e4000 - 0000000000100000
> [ 0.000000] Allocating PCI resources starting at 30000000 (gap: 20000000:dff80000)
> [ 0.000000] Built 1 zonelists. Total pages: 127795
> [ 0.000000] Kernel command line: root=/dev/sda6 elevator=cfq video=radeonfb:1024x768@60
> [ 0.000000] Initializing CPU#0
> [ 0.000000] PID hash table entries: 2048 (order: 11, 16384 bytes)
> [ 32.727602] time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
> [ 32.727605] time.c: Detected 2202.943 MHz processor.
> [ 32.730265] Console: colour VGA+ 80x25
> [ 32.733073] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
> [ 32.733509] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
> [ 32.733646] Checking aperture...
> [ 32.733705] CPU 0: aperture @ f8000000 size 64 MB
> [ 32.740581] Memory: 507900k/523456k available (2708k kernel code, 14716k reserved, 1343k data, 200k init)
> [ 32.800438] Calibrating delay using timer specific routine.. 4409.08 BogoMIPS (lpj=2204542)
> [ 32.800591] Mount-cache hash table entries: 256
> [ 32.800771] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
> [ 32.800833] CPU: L2 Cache: 512K (64 bytes/line)
> [ 32.800913] CPU: AMD Athlon(tm) 64 Processor 3200+ stepping 00
> [ 32.801063] ACPI: Core revision 20060707
> [ 32.814145] Using local APIC timer interrupts.
> [ 32.859518] result 12516743
> [ 32.859573] Detected 12.516 MHz APIC timer.
> [ 32.860328] testing NMI watchdog ... OK.
> [ 32.870515] checking if image is initramfs...it isn't (bad gzip magic numbers); looks like an initrd
> [ 32.873517] Freeing initrd memory: 2000k freed
> [ 32.875609] NET: Registered protocol family 16
> [ 32.875754] ACPI: bus type pci registered
> [ 32.875818] PCI: Using configuration type 1
> [ 32.881215] ACPI: Interpreter enabled
> [ 32.881276] ACPI: Using IOAPIC for interrupt routing
> [ 32.882250] ACPI: PCI Root Bridge [PCI0] (0000:00)
> [ 32.882315] PCI: Probing PCI hardware (bus 00)
> [ 32.884489] PCI: enabled onboard AC97/MC97 devices
> [ 32.884765] Boot video device is 0000:01:00.0
> [ 32.884843] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> [ 32.896169] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 10 *11 14 15)
> [ 32.896831] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 *10 11 14 15)
> [ 32.897488] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 *7 10 11 14 15)
> [ 32.898143] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 7 10 11 14 15)
> [ 32.898809] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
> [ 32.899560] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
> [ 32.900312] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
> [ 32.901052] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
> [ 32.901700] Linux Plug and Play Support v0.97 (c) Adam Belay
> [ 32.901768] pnp: PnP ACPI init
> [ 32.904855] pnp: PnP ACPI: found 13 devices
> [ 32.905045] SCSI subsystem initialized
> [ 32.905158] usbcore: registered new interface driver usbfs
> [ 32.905247] usbcore: registered new interface driver hub
> [ 32.905335] usbcore: registered new device driver usb
> [ 32.905452] PCI: Using ACPI for IRQ routing
> [ 32.905511] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
> [ 32.905592] PCI: Cannot allocate resource region 0 of device 0000:00:00.0
> [ 32.905754] agpgart: Detected AGP bridge 0
> [ 32.908913] agpgart: AGP aperture is 64M @ 0xf8000000
> [ 32.908997] PCI-DMA: Disabling IOMMU.
> [ 32.909754] PCI: Bridge: 0000:00:01.0
> [ 32.909812] IO window: a000-afff
> [ 32.909871] MEM window: fd100000-fd6fffff
> [ 32.909931] PREFETCH window: d5000000-f4ffffff
> [ 32.910006] PCI: Setting latency timer of device 0000:00:01.0 to 64
> [ 32.910026] NET: Registered protocol family 2
> [ 32.918232] IP route cache hash table entries: 4096 (order: 3, 32768 bytes)
> [ 32.918368] TCP established hash table entries: 16384 (order: 5, 131072 bytes)
> [ 32.918516] TCP bind hash table entries: 8192 (order: 4, 65536 bytes)
> [ 32.918624] TCP: Hash tables configured (established 16384 bind 8192)
> [ 32.918684] TCP reno registered
> [ 32.919334] io scheduler noop registered
> [ 32.919440] io scheduler cfq registered (default)
> [ 32.920310] ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
> [ 32.920477] radeonfb: Found Intel x86 BIOS ROM Image
> [ 32.920539] radeonfb: Retrieved PLL infos from BIOS
> [ 32.920598] radeonfb: Reference=27.00 MHz (RefDiv=12) Memory=200.00 Mhz, System=166.00 MHz
> [ 32.920671] radeonfb: PLL min 20000 max 40000
> [ 32.921735] radeonfb: Monitor 1 type CRT found
> [ 32.921793] radeonfb: Monitor 2 type no found
> [ 32.955438] Console: switching to colour frame buffer device 128x48
> [ 32.975175] radeonfb (0000:01:00.0): ATI Radeon Yd
> [ 32.975352] ACPI: Power Button (FF) [PWRF]
> [ 32.975477] ACPI: Power Button (CM) [PWRB]
> [ 32.975586] ACPI: Sleep Button (CM) [SLPB]
> [ 32.977202] Real Time Clock Driver v1.12ac
> [ 32.977313] Linux agpgart interface v0.101 (c) Dave Jones
> [ 32.977457] [drm] Initialized drm 1.0.1 20051102
> [ 32.978126] [drm] Initialized radeon 1.25.0 20060524 on minor 0
> [ 32.978294] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
> [ 32.978587] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> [ 32.978834] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> [ 32.979259] 00:0a: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> [ 32.979511] 00:0b: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> [ 32.979773] Floppy drive(s): fd0 is 1.44M
> [ 32.994906] FDC 0 is a post-1991 82077
> [ 32.996382] RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
> [ 32.996766] loop: loaded (max 8 devices)
> [ 32.996906] ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 17
> [ 32.999471] skge 1.9 addr 0xfdc00000 irq 17 chip Yukon-Lite rev 9
> [ 33.001981] skge eth0: addr 00:11:d8:1c:a0:7a
> [ 33.004520] 8139too Fast Ethernet driver 0.9.28
> [ 33.007047] ACPI: PCI Interrupt 0000:00:0e.0[A] -> GSI 19 (level, low) -> IRQ 19
> [ 33.010258] eth1: RealTek RTL8139 at 0xffffc20000004000, 00:e0:4c:f0:ab:b8, IRQ 19
> [ 33.013135] eth1: Identified 8139 chip type 'RTL-8139C'
> [ 33.013151] Linux video capture interface: v2.00
> [ 33.016118] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> [ 33.019236] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> [ 33.022583] VP_IDE: IDE controller at PCI slot 0000:00:0f.1
> [ 33.026030] ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 20
> [ 33.029676] VP_IDE: chipset revision 6
> [ 33.033379] VP_IDE: not 100% native mode: will probe irqs later
> [ 33.037248] VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
> [ 33.041286] ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio
> [ 33.045502] ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio
> [ 33.049751] Probing IDE interface ide0...
> [ 33.841448] hda: HL-DT-ST DVDRAM GSA-4167B, ATAPI CD/DVD-ROM drive
> [ 34.152000] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> [ 34.156490] Probing IDE interface ide1...
> [ 34.948226] hdc: HL-DT-ST GCE-8400B, ATAPI CD/DVD-ROM drive
> [ 35.257686] ide1 at 0x170-0x177,0x376 on irq 15
> [ 35.264947] hda: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA(33)
> [ 35.269964] Uniform CD-ROM driver Revision: 3.20
> [ 35.283252] hdc: ATAPI 40X CD-ROM CD-R/RW drive, 2048kB Cache, DMA
> [ 35.289395] libata version 2.00 loaded.
> [ 35.289419] sata_via 0000:00:0f.0: version 2.0
> [ 35.289430] ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 20
> [ 35.294798] sata_via 0000:00:0f.0: routed to hard irq line 10
> [ 35.300148] ata1: SATA max UDMA/133 cmd 0xE800 ctl 0xE402 bmdma 0xD400 irq 20
> [ 35.305530] ata2: SATA max UDMA/133 cmd 0xE000 ctl 0xD802 bmdma 0xD408 irq 20
> [ 35.310844] scsi0 : sata_via
> [ 35.515983] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [ 35.673125] ata1.00: ATA-6, max UDMA/133, 156301488 sectors: LBA48 NCQ (depth 0/32)
> [ 35.678757] ata1.00: ata1: dev 0 multi count 16
> [ 35.686006] ata1.00: configured for UDMA/133
> [ 35.691568] scsi1 : sata_via
> [ 35.897218] ata2: SATA link down 1.5 Gbps (SStatus 0 SControl 300)
> [ 35.913718] ATA: abnormal status 0x7F on port 0xE007
> [ 35.919332] scsi 0:0:0:0: Direct-Access ATA ST380817AS 3.42 PQ: 0 ANSI: 5
> [ 35.925134] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
> [ 35.930882] sda: Write Protect is off
> [ 35.936662] sda: Mode Sense: 00 3a 00 00
> [ 35.936675] SCSI device sda: drive cache: write back
> [ 35.942505] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
> [ 35.948370] sda: Write Protect is off
> [ 35.954136] sda: Mode Sense: 00 3a 00 00
> [ 35.954148] SCSI device sda: drive cache: write back
> [ 35.959889] sda: sda1 sda2 < sda5 sda6 sda7 sda8 >
> [ 36.027032] sd 0:0:0:0: Attached scsi disk sda
> [ 36.032710] sd 0:0:0:0: Attached scsi generic sg0 type 0
> [ 36.038359] ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21
> [ 36.044338] ehci_hcd 0000:00:10.4: EHCI Host Controller
> [ 36.050131] ehci_hcd 0000:00:10.4: new USB bus registered, assigned bus number 1
> [ 36.055834] ehci_hcd 0000:00:10.4: irq 21, io mem 0xfd900000
> [ 36.061409] ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
> [ 36.067109] usb usb1: configuration #1 chosen from 1 choice
> [ 36.072686] hub 1-0:1.0: USB hub found
> [ 36.078141] hub 1-0:1.0: 8 ports detected
> [ 36.183732] USB Universal Host Controller Interface driver v3.0
> [ 36.189136] ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21
> [ 36.194595] uhci_hcd 0000:00:10.0: UHCI Host Controller
> [ 36.200026] uhci_hcd 0000:00:10.0: new USB bus registered, assigned bus number 2
> [ 36.205457] uhci_hcd 0000:00:10.0: irq 21, io base 0x0000b000
> [ 36.211000] usb usb2: configuration #1 chosen from 1 choice
> [ 36.216430] hub 2-0:1.0: USB hub found
> [ 36.221776] hub 2-0:1.0: 2 ports detected
> [ 36.327443] ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21
> [ 36.332864] uhci_hcd 0000:00:10.1: UHCI Host Controller
> [ 36.338237] uhci_hcd 0000:00:10.1: new USB bus registered, assigned bus number 3
> [ 36.343600] uhci_hcd 0000:00:10.1: irq 21, io base 0x0000b400
> [ 36.348985] usb usb3: configuration #1 chosen from 1 choice
> [ 36.354262] hub 3-0:1.0: USB hub found
> [ 36.359430] hub 3-0:1.0: 2 ports detected
> [ 36.465147] ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21
> [ 36.470440] uhci_hcd 0000:00:10.2: UHCI Host Controller
> [ 36.475698] uhci_hcd 0000:00:10.2: new USB bus registered, assigned bus number 4
> [ 36.480965] uhci_hcd 0000:00:10.2: irq 21, io base 0x0000b800
> [ 36.486227] usb usb4: configuration #1 chosen from 1 choice
> [ 36.491386] hub 4-0:1.0: USB hub found
> [ 36.496528] hub 4-0:1.0: 2 ports detected
> [ 36.601875] ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21
> [ 36.607110] uhci_hcd 0000:00:10.3: UHCI Host Controller
> [ 36.612333] uhci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 5
> [ 36.617549] uhci_hcd 0000:00:10.3: irq 21, io base 0x0000c000
> [ 36.622729] usb usb5: configuration #1 chosen from 1 choice
> [ 36.627812] hub 5-0:1.0: USB hub found
> [ 36.632845] hub 5-0:1.0: 2 ports detected
> [ 37.161683] usbcore: registered new interface driver cdc_acm
> [ 37.166690] drivers/usb/class/cdc-acm.c: v0.25:USB Abstract Control Model driver for USB modems and ISDN adapters
> [ 37.171947] usbcore: registered new interface driver usblp
> [ 37.177212] drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver
> [ 37.182561] Initializing USB Mass Storage driver...
> [ 37.187966] usbcore: registered new interface driver usb-storage
> [ 37.193343] USB Mass Storage support registered.
> [ 37.198710] PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
> [ 37.204404] serio: i8042 KBD port at 0x60,0x64 irq 1
> [ 37.209839] serio: i8042 AUX port at 0x60,0x64 irq 12
> [ 37.215222] mice: PS/2 mouse device common for all mice
> [ 37.220524] i2c /dev entries driver
> [ 37.226206] Advanced Linux Sound Architecture Driver Version 1.0.13 (Sun Oct 22 08:56:16 2006 UTC).
> [ 37.231946] ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 22
> [ 37.239539] PCI: Setting latency timer of device 0000:00:11.5 to 64
> [ 37.251526] input: AT Translated Set 2 keyboard as /class/input/input0
> [ 37.751566] codec_read: codec 0 is not valid [0xfe0000]
> [ 37.764869] codec_read: codec 0 is not valid [0xfe0000]
> [ 37.778121] codec_read: codec 0 is not valid [0xfe0000]
> [ 37.791242] codec_read: codec 0 is not valid [0xfe0000]
> [ 37.808500] ALSA device list:
> [ 37.813631] #0: VIA 8237 with AD1980 at 0xec00, irq 22
> [ 37.818899] oprofile: using NMI interrupt.
> [ 37.824105] TCP cubic registered
> [ 37.829169] NET: Registered protocol family 1
> [ 37.834229] NET: Registered protocol family 17
> [ 37.839196] NET: Registered protocol family 15
> [ 37.844134] ACPI: (supports S0 S1 S3 S4 S5)
> [ 37.961630] input: ImPS/2 Logitech Wheel Mouse as /class/input/input1
> [ 38.050012] RAMDISK: ext2 filesystem found at block 0
> [ 38.054928] RAMDISK: Loading 2000KiB [1 disk] into ram disk... done.
> [ 38.066647] VFS: Mounted root (ext2 filesystem).
> [ 38.129633] kjournald starting. Commit interval 5 seconds
> [ 38.134550] EXT3-fs: mounted filesystem with ordered data mode.
> [ 38.139493] VFS: Mounted root (ext3 filesystem) readonly.
> [ 38.144469] Trying to move old root to /initrd ... /initrd does not exist. Ignored.
> [ 38.149670] Unmounting old root
> [ 38.154701] Trying to free ramdisk memory ... okay
> [ 38.159878] Freeing unused kernel memory: 200k freed
> [ 38.164962] Write protecting the kernel read-only data: 560k
> [ 40.454619] warning: process `touch' used the removed sysctl system call
> [ 40.890041] warning: process `sleep' used the removed sysctl system call
> [ 40.999232] warning: process `sleep' used the removed sysctl system call
> [ 41.104730] warning: process `sleep' used the removed sysctl system call
> [ 41.873966] warning: process `sleep' used the removed sysctl system call
> [ 43.453372] EXT3 FS on sda6, internal journal
> [ 43.999154] kjournald starting. Commit interval 5 seconds
> [ 43.999164] EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
> [ 43.999323] EXT3 FS on sda8, internal journal
> [ 43.999328] EXT3-fs: mounted filesystem with ordered data mode.
> [ 44.117933] Adding 1004016k swap on /dev/sda7. Priority:-1 extents:1 across:1004016k
> [ 50.175883] skge eth0: enabling interface
> [ 62.993341] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
> [ 62.993361] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
> [ 62.993436] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
> [ 63.288516] [drm] Setting GART location based on new memory map
> [ 63.288604] [drm] Loading R200 Microcode
> [ 63.288694] [drm] writeback test succeeded in 1 usecs
> [ 79.495845] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
> [ 81.127324] ip_tables: (C) 2000-2006 Netfilter Core Team
> [ 81.170715] ip_conntrack version 2.4 (2044 buckets, 16352 max) - 248 bytes per conntrack
> [ 180.334016] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
> [ 306.826671] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
> [ 306.826693] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
> [ 306.826768] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
> [ 306.826780] [drm] Loading R200 Microcode
> [ 340.827814] Stopping tasks: ==========================================================================================================================================|
> [ 340.828658] Shrinking memory... done (51711 pages freed)
> [ 340.925349] Suspending console(s)
> [ 341.809883] pnp: Device 00:0b disabled.
> [ 341.810124] pnp: Device 00:0a disabled.
> [ 341.810148] radeonfb (0000:01:00.0): suspending for event: 1...
> [ 341.887207] skge eth0: disabling interface
> [ 341.899219] pci_set_power_state(): 0000:00:00.0: state=3, current state=5
> [ 341.912910] swsusp: Need to copy 63368 pages
> [ 26.062737] APIC error on CPU0: 00(00)
> [ 26.062827] PCI: Setting latency timer of device 0000:00:01.0 to 64
> [ 26.146592] PM: Writing back config space on device 0000:00:0a.0 at offset f (was 1f170100, writing 1f17010a)
> [ 26.146599] PM: Writing back config space on device 0000:00:0a.0 at offset c (was 0, writing fdb00000)
> [ 26.146609] PM: Writing back config space on device 0000:00:0a.0 at offset 5 (was 1, writing c801)
> [ 26.146614] PM: Writing back config space on device 0000:00:0a.0 at offset 4 (was 0, writing fdc00000)
> [ 26.146619] PM: Writing back config space on device 0000:00:0a.0 at offset 3 (was 0, writing 4010)
> [ 26.146626] PM: Writing back config space on device 0000:00:0a.0 at offset 1 (was 2b00000, writing 2b00117)
> [ 26.146658] skge eth0: enabling interface
> [ 26.160202] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
> [ 26.171176] ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21
> [ 26.171218] usb usb2: root hub lost power or was reset
> [ 26.182150] ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21
> [ 26.182188] usb usb3: root hub lost power or was reset
> [ 26.193128] ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21
> [ 26.193165] usb usb4: root hub lost power or was reset
> [ 26.204106] ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21
> [ 26.204143] usb usb5: root hub lost power or was reset
> [ 26.215084] ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21
> [ 26.215109] usb usb1: root hub lost power or was reset
> [ 26.215127] ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
> [ 26.226082] ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 22
> [ 26.226088] PCI: Setting latency timer of device 0000:00:11.5 to 64
> [ 26.229463] radeonfb (0000:01:00.0): resuming from state: 1...
> [ 26.247263] pnp: Failed to activate device 00:03.
> [ 26.247391] pnp: Failed to activate device 00:04.
> [ 26.248318] pnp: Device 00:0a activated.
> [ 26.249004] pnp: Device 00:0b activated.
> [ 27.134110] Restarting tasks... done
> [ 27.565554] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0.
> [ 27.565593] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
> [ 27.565670] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
> [ 27.565682] [drm] Loading R200 Microcode
> [ 28.443446] ip_conntrack version 2.4 (2044 buckets, 16352 max) - 248 bytes per conntrack
> [ 752.692523] Stopping tasks: ======================================================================================================================================|
> [ 752.693363] Shrinking memory... done (58183 pages freed)
> [ 756.669812] Suspending console(s)
> [ 757.578446] pnp: Device 00:0b disabled.
> [ 757.578702] pnp: Device 00:0a disabled.
> [ 757.578727] radeonfb (0000:01:00.0): suspending for event: 1...
> [ 757.655322] skge eth0: disabling interface
> [ 757.695225] swsusp: Need to copy 58533 pages
> [ 25.139916] APIC error on CPU0: 00(00)
> [ 25.293551] PCI: Setting latency timer of device 0000:00:01.0 to 64
> [ 25.377319] PM: Writing back config space on device 0000:00:0a.0 at offset f (was 1f170100, writing 1f17010a)
> [ 25.377326] PM: Writing back config space on device 0000:00:0a.0 at offset c (was 0, writing fdb00000)
> [ 25.377338] PM: Writing back config space on device 0000:00:0a.0 at offset 5 (was 1, writing c801)
> [ 25.377343] PM: Writing back config space on device 0000:00:0a.0 at offset 4 (was 0, writing fdc00000)
> [ 25.377348] PM: Writing back config space on device 0000:00:0a.0 at offset 3 (was 0, writing 4010)
> [ 25.377353] PM: Writing back config space on device 0000:00:0a.0 at offset 1 (was 2b00000, writing 2b00117)
> [ 25.377384] skge eth0: enabling interface
> [ 25.382084] BUG: scheduling while atomic: events/0/0x00000001/4
> [ 25.382086]
> [ 25.382087] Call Trace:
> [ 25.382097] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
> [ 25.382102] [<ffffffff802f34b6>] list_add+0xc/0xe
> [ 25.382107] [<ffffffff80236519>] worker_thread+0x0/0x11b
> [ 25.382110] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
> [ 25.382115] [<ffffffff802233e2>] default_wake_function+0x0/0xf
> [ 25.382119] [<ffffffff80236519>] worker_thread+0x0/0x11b
> [ 25.382124] [<ffffffff80239269>] kthread+0xce/0x101
> [ 25.382128] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
> [ 25.382132] [<ffffffff8020a238>] child_rip+0xa/0x12
> [ 25.382137] [<ffffffff8023919b>] kthread+0x0/0x101
> [ 25.382140] [<ffffffff8020a22e>] child_rip+0x0/0x12
Apparently, the kernel thinks that worker_thread() is running in the atomic
context, so there may be a problem with preempt_count(), for example.
Is preemption enabled in your kernel(s)?
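For reference, the three fields in the BUG line can be pulled apart as sketched below. This is only an illustration and assumes the 2.6-era message format "comm/preempt_count/pid"; the struct and function names are hypothetical, and the bit layout used by the two helpers (preemption depth in bits 0-7, softirq count in bits 8-15) is an assumption as well.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three fields of the BUG line. */
struct sched_bug {
    char comm[32];               /* task name, e.g. "events/0" */
    unsigned long preempt_count; /* printed in hex in the log */
    int pid;
};

/*
 * Parse "... scheduling while atomic: <comm>/<count>/<pid>".
 * The task name may itself contain '/', so split from the right.
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_sched_bug(const char *line, struct sched_bug *out)
{
    static const char tag[] = "scheduling while atomic: ";
    char buf[128];
    char *pid_sep, *cnt_sep;
    const char *p;

    assert(line != NULL && out != NULL);

    p = strstr(line, tag);
    if (p == NULL)
        return -1;
    p += sizeof(tag) - 1;
    if (strlen(p) >= sizeof(buf))
        return -1;
    strcpy(buf, p);

    pid_sep = strrchr(buf, '/');      /* last '/' precedes the pid */
    if (pid_sep == NULL)
        return -1;
    *pid_sep = '\0';

    cnt_sep = strrchr(buf, '/');      /* next one precedes the count */
    if (cnt_sep == NULL)
        return -1;
    *cnt_sep = '\0';

    if (strlen(buf) >= sizeof(out->comm))
        return -1;
    strcpy(out->comm, buf);
    out->preempt_count = strtoul(cnt_sep + 1, NULL, 16);
    out->pid = atoi(pid_sep + 1);
    return 0;
}

/* Assumed layout: preemption depth in bits 0-7, softirq count in bits 8-15. */
static unsigned preempt_depth(unsigned long count) { return count & 0xff; }
static unsigned softirq_depth(unsigned long count) { return (count >> 8) & 0xff; }
```

Applied to the line above, this gives the task name "events/0", a count of 0x1 and pid 4. Under the assumed layout that is one level of disabled preemption and no softirq nesting.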
Greetings,
Rafael
--
You never change things by fighting the existing reality.
R. Buckminster Fuller
* Re: [discuss] 2.6.19-rc5: known regressions (v2)
2006-11-11 10:49 ` Rafael J. Wysocki
@ 2006-11-11 12:29 ` Paolo Ornati
2006-11-14 16:44 ` Paolo Ornati
0 siblings, 1 reply; 91+ messages in thread
From: Paolo Ornati @ 2006-11-11 12:29 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Adrian Bunk, LKML
[-- Attachment #1: Type: text/plain, Size: 2699 bytes --]
On Sat, 11 Nov 2006 11:49:26 +0100
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > Subject : BUG: scheduling while atomic: events/0/0x00000001/4
> > > > after resume
> > > > References : http://lkml.org/lkml/2006/11/2/209
> > > > Submitter : Paolo Ornati <ornati@fastwebnet.it>
> > > > Status : unknown
> > >
> > > I couldn't find anything in the report that would indicate the problem occurred
> > > after a resume. Was it really the case?
> >
> > Ahh, I've written that in another email but I trimmed LKML from CC by
> > mistake ;)
> >
> >
> > Relevant portion of that mail follows... anyway it seems that "-rc5" is
> > _OK_ since I've been running it for 2 days and it survived 9 suspend/resume
> > cycles.
>
> Okay, please let us know if it survives the next several cycles.
>
> OTOH, the problem may be hiding.
Ok, and if it survives again and again I can do a partial bisection...
so that someone could guess the change that hides/fixes this and I can
revert it on top of "-rc5" to confirm.
>
> > ------------------------------------------------------------------
> >
> > I've reproduced it (with rc4-g4b1c46a3), and I think it is
> > suspend/resume related since the messages start flooding dmesg just
> > after a resume...
> >
> > I'll see if it is reproducible just doing suspend/resume a couple of
> > times... and if so I'll try with -rc5.
> >
> >
> > dmesg (stripped at the end):
> >
> > [ 0.000000] Linux version 2.6.19-rc4-g4b1c46a3 (paolo@tux) (gcc version 4.1.1 (Gentoo 4.1.1)) #17 PREEMPT Wed Nov 1 18:36:28 CET 2006
[CUT]
> > [ 25.382084] BUG: scheduling while atomic: events/0/0x00000001/4
> > [ 25.382086]
> > [ 25.382087] Call Trace:
> > [ 25.382097] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
> > [ 25.382102] [<ffffffff802f34b6>] list_add+0xc/0xe
> > [ 25.382107] [<ffffffff80236519>] worker_thread+0x0/0x11b
> > [ 25.382110] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
> > [ 25.382115] [<ffffffff802233e2>] default_wake_function+0x0/0xf
> > [ 25.382119] [<ffffffff80236519>] worker_thread+0x0/0x11b
> > [ 25.382124] [<ffffffff80239269>] kthread+0xce/0x101
> > [ 25.382128] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
> > [ 25.382132] [<ffffffff8020a238>] child_rip+0xa/0x12
> > [ 25.382137] [<ffffffff8023919b>] kthread+0x0/0x101
> > [ 25.382140] [<ffffffff8020a22e>] child_rip+0x0/0x12
>
> Apparently, the kernel thinks that worker_thread() is running in the atomic
> context, so there may be a problem with preempt_count(), for example.
>
> Is preemption enabled in your kernel(s)?
YES (see first line of dmesg) - full config attached
--
Paolo Ornati
Linux 2.6.19-rc5 on x86_64
[-- Attachment #2: CONFIG.gz --]
[-- Type: application/x-gzip, Size: 9060 bytes --]
* Re: Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq
2006-11-10 12:42 ` Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq Komuro
@ 2006-11-13 16:02 ` Linus Torvalds
2006-11-13 17:11 ` Eric W. Biederman
0 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-13 16:02 UTC (permalink / raw)
To: Komuro
Cc: tglx, Eric W. Biederman, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List, mingo
On Fri, 10 Nov 2006, Komuro wrote:
>
> I tried the 2.6.19-rc5, the problem still happens.
Ok, that's good data, and especially:
> But,
> I remove the disable_irq_nosync() , enable_irq()
> from the linux/drivers/net/pcmcia/axnet_cs.c
> the interrupt is generated properly.
All RIGHT. That's a very good clue. The major difference between PCI and
ISA irq's is that they have different trigger types (they also have
different polarity, but that tends to be just a small detail). In
particular, ISA IRQ's are edge-triggered, and PCI IRQ's are level-
triggered.
Now, edge-triggered interrupts are a _lot_ harder to mask, because the
Intel APIC is an unbelievable piece of sh*t, and has the edge-detect logic
_before_ the mask logic, so if an edge happens _while_ the device is
masked, you'll never ever see the edge ever again (unmasking will not
cause a new edge, so you simply lost the interrupt).
So when you "mask" an edge-triggered IRQ, you can't really mask it at all,
because if you did that, you'd lose it forever if the IRQ comes in while
you masked it. Instead, we're supposed to leave it active, and set a flag,
and IF the IRQ comes in, we just remember it, and mask it at that point
instead, and then on unmasking, we have to replay it by sending a
self-IPI.
Maybe that part got broken by some of the IRQ changes by Eric.
Eric, can you please double-check this all? I suspect you disable
edge-triggered interrupts when moving them, or something, and maybe you
didn't realize that if you disable them on the IO-APIC level, they can be
gone forever.
[ Note: this is true EVEN IF we are in the interrupt handler right then -
if we get another edge while in the interrupt handler, the interrupt
will normally be _delayed_ until we've ACK'ed it, but if we have
_masked_ it, it will simply be lost entirely. So a simple "mask"
operation is always incorrect for edge-triggered interrupts.
One option might be to do a simple mask, and on unmask, turn the edge
trigger into a level trigger at the same time. Then, the first time you
get the interrupt, you turn it back into an edge trigger _before_ you
call the interrupt handlers. That might actually be simpler than doing
the "irq replay" dance with self-IPI, because we can't actually just
fake the IRQ handling - when enable_irq() is called, irq's are normally
disabled on the CPU, so we can't just call the irq handler at that
point: we really do need to "replay" the dang thing.
Did I mention that the Intel APIC's are a piece of cr*p already? ]
> So I think enable_irq does not enable the irq.
It probably does enable it (that's the easy part), but see above: if any
of the support structure for the APIC crapola is subtly broken, we'll have
lost the IRQ anyway.
(Many other IRQ controllers get this right: the "old and broken" Intel
i8259 interrupt controller was a much better IRQ controller than the APIC
in this regard, because it simply had the edge-detect logic after the
masking logic, so if you unmasked an active interrupt that had been
masked, you would always see it as an edge, and the i8259 controller needs
none of the subtle code at _all_. It just works.)
Anyway, if you _can_ bisect the exact point where this started happening,
that would be good. But I would not be surprised in the least if this is
all introduced by Eric Biederman's dynamic IRQ handling.
Eric?
Linus
* Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq
2006-11-13 16:02 ` Linus Torvalds
@ 2006-11-13 17:11 ` Eric W. Biederman
2006-11-13 20:44 ` Ingo Molnar
0 siblings, 1 reply; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-13 17:11 UTC (permalink / raw)
To: Linus Torvalds
Cc: Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List, mingo
Linus Torvalds <torvalds@osdl.org> writes:
> On Fri, 10 Nov 2006, Komuro wrote:
>>
>> I tried the 2.6.19-rc5, the problem still happens.
>
> Ok, that's good data, and especially:
>
>> But,
>> I remove the disable_irq_nosync() , enable_irq()
>> from the linux/drivers/net/pcmcia/axnet_cs.c
>> the interrupt is generated properly.
>
> All RIGHT. That's a very good clue. The major difference between PCI and
> ISA irq's is that they have different trigger types (they also have
> different polarity, but that tends to be just a small detail). In
> particular, ISA IRQ's are edge-triggered, and PCI IRQ's are level-
> triggered.
>
> Now, edge-triggered interrupts are a _lot_ harder to mask, because the
> Intel APIC is an unbelievable piece of sh*t, and has the edge-detect logic
> _before_ the mask logic, so if an edge happens _while_ the device is
> masked, you'll never ever see the edge ever again (unmasking will not
> cause a new edge, so you simply lost the interrupt).
>
> So when you "mask" an edge-triggered IRQ, you can't really mask it at all,
> because if you did that, you'd lose it forever if the IRQ comes in while
> you masked it. Instead, we're supposed to leave it active, and set a flag,
> and IF the IRQ comes in, we just remember it, and mask it at that point
> instead, and then on unmasking, we have to replay it by sending a
> self-IPI.
>
> Maybe that part got broken by some of the IRQ changes by Eric.
Hmm. The other possibility is that this is a genirq migration issue.
Yep. That looks like it. In the genirq migration the edge and
level triggered cases got merged and previously disable_edge_ioapic
was a noop. Ouch.
Darn, I missed this one in my review of Ingo's changes.
I'm not at all certain what the correct fix here is.
- Do we make the generic code aware of this messed up
case? I believe it is already aware of part of the "don't disable
edge triggered interrupts" logic.
- Do we modify the disable logic so it doesn't actually disable the
irq?
- Do we do as Linus suggests and make the enable logic pass through a
level triggered state?
- Do we split the edge and level triggered cases apart again on
i386 and x86_64?
And how do we make it drop dead clear what we are doing so that
someone doesn't break this in the future by accident? That I
suspect was the real problem. That stupid vector == irq case had
introduced so many levels of abstraction it was nearly impossible
to read the code.
Can we get this abstraction right so that we can make obviously
correct code here and still handle all of the weird code bugs?
> Eric, can you please double-check this all? I suspect you disable
> edge-triggered interrupts when moving them, or something, and maybe you
> didn't realize that if you disable them on the IO-APIC level, they can be
> gone forever.
Sure. So the hypothesis is that it is somewhere near commit
e7b946e98a456077dd6897f726f3d6197bd7e3b9 causing the problem.
Anything I have changed in this area should affect both i386 and
x86_64.
> [ Note: this is true EVEN IF we are in the interrupt handler right then -
> if we get another edge while in the interrupt handler, the interrupt
> will normally be _delayed_ until we've ACK'ed it, but if we have
> _masked_ it, it will simply be lost entirely. So a simple "mask"
> operation is always incorrect for edge-triggered interrupts.
>
> One option might be to do a simple mask, and on unmask, turn the edge
> trigger into a level trigger at the same time. Then, the first time you
> get the interrupt, you turn it back into an edge trigger _before_ you
> call the interrupt handlers. That might actually be simpler than doing
> the "irq replay" dance with self-IPI, because we can't actually just
> fake the IRQ handling - when enable_irq() is called, irq's are normally
> disabled on the CPU, so we can't just call the irq handler at that
> point: we really do need to "replay" the dang thing.
>
> Did I mention that the Intel APIC's are a piece of cr*p already? ]
Ok. After a quick skim it appears that there is a disable/enable pair
in the irq migration path for edge triggered interrupts. But we
do that work while the irq is pending and it doesn't look like I
changed that part of the code. Just the level triggered irq
migration.
>> So I think enable_irq does not enable the irq.
>
> It probably does enable it (that's the easy part), but see above: if any
> of the support structure for the APIC crapola is subtly broken, we'll have
> lost the IRQ anyway.
>
> (Many other IRQ controllers get this right: the "old and broken" Intel
> i8259 interrupt controller was a much better IRQ controller than the APIC
> in this regard, because it simply had the edge-detect logic after the
> masking logic, so if you unmasked an active interrupt that had been
> masked, you would always see it as an edge, and the i8259 controller needs
> none of the subtle code at _all_. It just works.)
>
> Anyway, if you _can_ bisect the exact point where this started happening,
> that would be good. But I would not be surprised in the least if this is
> all introduced by Eric Biederman's dynamic IRQ handling.
I will share the credit because I missed this in code review but this
is really Ingo's generic irq code.
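
The lost-edge behaviour described in the quoted text can be sketched as a
toy model. This is only an illustration, not kernel code: the types and
function names are made up for the example, and the two controller kinds
simply encode the description above (edge detection before the mask for
the IO-APIC-like case, after the mask for the i8259-like case).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one interrupt input; not kernel code, all names hypothetical. */
enum toy_kind {
    EDGE_BEFORE_MASK,   /* edge detector sees the raw line (IO-APIC-like) */
    EDGE_AFTER_MASK     /* edge detector sees line AND NOT mask (i8259-like) */
};

struct toy_ctrl {
    enum toy_kind kind;
    bool line;          /* level currently driven by the device */
    bool masked;        /* mask bit in the controller */
    int delivered;      /* edges that reached the CPU */
};

/* What the edge detector is looking at for this kind of controller. */
static bool detector_input(const struct toy_ctrl *c)
{
    if (c->kind == EDGE_AFTER_MASK)
        return c->line && !c->masked;
    return c->line;
}

/* Apply a new line level and mask state, delivering any rising edge. */
static void toy_update(struct toy_ctrl *c, bool line, bool masked)
{
    bool before, after;

    assert(c);
    before = detector_input(c);
    c->line = line;
    c->masked = masked;
    after = detector_input(c);

    /* A rising edge seen while masked is simply dropped. */
    if (!before && after && !c->masked)
        c->delivered++;
}

/* Mask, let the device raise its line, unmask; count delivered edges. */
static int edges_seen_across_mask(enum toy_kind kind)
{
    struct toy_ctrl c = { kind, false, false, 0 };

    toy_update(&c, false, true);   /* mask the line */
    toy_update(&c, true, true);    /* device raises it while masked */
    toy_update(&c, true, false);   /* unmask with the line still active */
    return c.delivered;
}
```

In this model the first kind never sees the edge that arrived under the
mask, while the second kind sees it as soon as the line is unmasked.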
Eric
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq
2006-11-13 17:11 ` Eric W. Biederman
@ 2006-11-13 20:44 ` Ingo Molnar
2006-11-13 21:11 ` Eric W. Biederman
0 siblings, 1 reply; 91+ messages in thread
From: Ingo Molnar @ 2006-11-13 20:44 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linus Torvalds, Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List
On Mon, 2006-11-13 at 10:11 -0700, Eric W. Biederman wrote:
> > So when you "mask" an edge-triggered IRQ, you can't really mask it at all,
> > because if you did that, you'd lose it forever if the IRQ comes in while
> > you masked it. Instead, we're supposed to leave it active, and set a flag,
> > and IF the IRQ comes in, we just remember it, and mask it at that point
> > instead, and then on unmasking, we have to replay it by sending a
> > self-IPI.
> >
> > Maybe that part got broken by some of the IRQ changes by Eric.
>
> Hmm. The other possibility is that this is a genirq migration issue.
>
> Yep. That looks like it. In the genirq migration the edge and
> level triggered cases got merged and previously disable_edge_ioapic
> was a noop. Ouch.
hm, that should be solved by the generic edge-triggered flow handler as
well: we never mask an IRQ first time around, we only mask it if
we /already/ have the 'soft' IRQ_PENDING flag set. (in that case the
lost edge is not an issue because we have the information already - and
the masking will prevent a screaming edge source)
but maybe this concept has not been pushed through to the disable/enable
irq logic itself? (it's only present in the flow handler) Thomas, do you
concur?
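
A minimal sketch of the soft-disable idea, assuming a single line and a
single pending bit. The struct and function names are hypothetical and
are not the genirq API; the replay in toy_enable() stands in for the
self-IPI mentioned in the quoted text.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: names are hypothetical, not the genirq API. */
struct toy_irq {
    bool soft_disabled; /* disable requested, hardware left alone */
    bool pending;       /* an edge arrived while soft-disabled    */
    bool hw_masked;     /* line really masked at the controller   */
    int handled;        /* times the driver's handler ran         */
};

/* Disabling only sets the soft flag; the controller is not touched. */
static void toy_disable(struct toy_irq *d)
{
    d->soft_disabled = true;
}

/* Called for every edge the device generates. */
static void toy_edge_arrives(struct toy_irq *d)
{
    if (d->hw_masked)
        return;                 /* dropped by the controller */
    if (d->soft_disabled) {
        d->pending = true;      /* remember that it happened */
        d->hw_masked = true;    /* now mask, to stop a screaming source */
        return;
    }
    d->handled++;
}

/* Unmask and replay one remembered edge, if there was one. */
static void toy_enable(struct toy_irq *d)
{
    d->soft_disabled = false;
    d->hw_masked = false;
    if (d->pending) {
        d->pending = false;
        d->handled++;           /* stands in for the self-IPI replay */
    }
}

/* Disable, receive a burst of edges, enable; return handler run count. */
static int handled_after_disabled_burst(int edges)
{
    struct toy_irq d = { false, false, false, 0 };
    int i;

    assert(edges >= 0);
    toy_disable(&d);
    for (i = 0; i < edges; i++)
        toy_edge_arrives(&d);
    toy_enable(&d);
    return d.handled;
}
```

However many edges arrive while the line is disabled, the handler runs
once after enable, and not at all if nothing arrived.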
Ingo
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq
2006-11-13 20:44 ` Ingo Molnar
@ 2006-11-13 21:11 ` Eric W. Biederman
2006-11-14 8:14 ` [patch] irq: do not mask interrupts by default Ingo Molnar
0 siblings, 1 reply; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-13 21:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List
Ingo Molnar <mingo@redhat.com> writes:
> On Mon, 2006-11-13 at 10:11 -0700, Eric W. Biederman wrote:
>> > So when you "mask" an edge-triggered IRQ, you can't really mask it at all,
>> > because if you did that, you'd lose it forever if the IRQ comes in while
>> > you masked it. Instead, we're supposed to leave it active, and set a flag,
>> > and IF the IRQ comes in, we just remember it, and mask it at that point
>> > instead, and then on unmasking, we have to replay it by sending a
>> > self-IPI.
>> >
>> > Maybe that part got broken by some of the IRQ changes by Eric.
>>
>> Hmm. The other possibility is that this is a genirq migration issue.
>>
>> Yep. That looks like it. In the genirq migration the edge and
>> level triggered cases got merged and previously disable_edge_ioapic
>> was a noop. Ouch.
>
> hm, that should be solved by the generic edge-triggered flow handler as
> well: we never mask an IRQ first time around, we only mask it if
> we /already/ have the 'soft' IRQ_PENDING flag set. (in that case the
> lost edge is not an issue because we have the information already - and
> the masking will prevent a screaming edge source)
>
> but maybe this concept has not been pushed through to the disable/enable
> irq logic itself? (it's only present in the flow handler) Thomas, do you
> concur?
I just looked. I think the logic is actually in there as well.
I keep forgetting disable != mask.
It looks like what is really missing is that we aren't setting
IRQ_DELAYED_DISABLE.
So I think what we really need to do is just set IRQ_DELAYED_DISABLE.
Does the patch below look right?
Eric
diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index 41bfc49..14654e6 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -790,9 +790,11 @@ static void ioapic_register_intr(int irq
trigger == IOAPIC_LEVEL)
set_irq_chip_and_handler_name(irq, &ioapic_chip,
handle_fasteoi_irq, "fasteoi");
- else
+ else {
+ irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
set_irq_chip_and_handler_name(irq, &ioapic_chip,
handle_edge_irq, "edge");
+ }
}
static void __init setup_IO_APIC_irqs(void)
^ permalink raw reply related [flat|nested] 91+ messages in thread
* 2.6.19-rc5: known regressions with patches
2006-11-08 2:33 Linux 2.6.19-rc5 Linus Torvalds
` (2 preceding siblings ...)
[not found] ` <20061111015035.GU4729@stusta.de>
@ 2006-11-13 22:14 ` Adrian Bunk
2006-11-13 22:56 ` Brian King
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
4 siblings, 1 reply; 91+ messages in thread
From: Adrian Bunk @ 2006-11-13 22:14 UTC (permalink / raw)
To: Linus Torvalds, Andrew Morton
Cc: Linux Kernel Mailing List, Aaron Durbin, Mel Gorman, ak, discuss,
Paul Mackerras, Brian King, jgarzik, linux-ide
This email lists some known regressions in 2.6.19-rc5 compared to 2.6.18
with patches available.
If you find your name in the Cc header, you are either the submitter of one
of the bugs, a maintainer of an affected subsystem or driver, the author of
a patch that caused a breakage, or someone I consider possibly involved
with one or more of these issues in some other way.
Due to the large number of recipients, please trim the Cc when answering.
Subject : x86_64: Fix partial page check to ensure unusable memory
is not being marked usable
References : http://lkml.org/lkml/2006/11/9/239
Submitter : Aaron Durbin <adurbin@google.com>
Caused-By : Mel Gorman <mel@csn.ul.ie>
commit 5cb248abf5ab65ab543b2d5fc16c738b28031fc0
Patch : http://lkml.org/lkml/2006/11/9/239
Status : patch available
Subject : libata must be initialized earlier
References : http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027945.html
Submitter : Paul Mackerras <paulus@samba.org>
Handled-By : Brian King <brking@us.ibm.com>
Patch : http://marc.theaimsgroup.com/?l=linux-ide&m=116169938407596&w=2
Status : patch available
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions
2006-11-09 5:10 ` Eric W. Biederman
@ 2006-11-13 22:46 ` Tim Chen
2006-11-14 0:03 ` Eric W. Biederman
0 siblings, 1 reply; 91+ messages in thread
From: Tim Chen @ 2006-11-13 22:46 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Adrian Bunk, Linus Torvalds, Andrew Morton,
Linux Kernel Mailing List
On Wed, 2006-11-08 at 22:10 -0700, Eric W. Biederman wrote:
>
> Cool. I'm glad to know it was simply a buggy lmbench.
>
> What is sysconf(_SC_NPROCESSORS_ONLN) doing that it slows down as the
> number of irqs increases? It is a slow path certainly but possibly
> something we should fix. My hunch is cat /proc/cpuinfo...
>
Looking at profiling data, the extra time in the
sysconf(_SC_NPROCESSORS_ONLN) call is spent in the "show_stat" function.
There are a couple of loops that iterate over the kstat_irqs
interrupt statistics and depend on NR_IRQS. Doesn't
look like something we need to fix.
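
A rough sketch of why the reader pays for NR_IRQS, under the assumption
that the C library gets the online CPU count by reading /proc/stat and
counting the per-CPU lines (the show_stat profile points that way, but
this is not taken from any libc source). The helper name is
hypothetical. The kernel still has to generate the whole file, including
the per-irq counters on the "intr" line, even though only the "cpuN"
lines are of interest.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Count the per-CPU lines ("cpu0", "cpu1", ...) in text laid out like
 * /proc/stat. The aggregate "cpu" line and everything else are skipped.
 */
static int count_cpu_lines(const char *text)
{
    const char *line = text;
    int n = 0;

    assert(text);
    while (line && *line) {
        if (strncmp(line, "cpu", 3) == 0 &&
            isdigit((unsigned char)line[3]))
            n++;
        line = strchr(line, '\n');  /* advance to the next line */
        if (line)
            line++;
    }
    return n;
}
```

For example, given the text "cpu  1 2 3\ncpu0 1 2 3\ncpu1 1 2 3\nintr 10 0 0\n"
the function counts two CPUs; the length of the "intr" line does not
change the answer, only the work needed to produce and scan the file.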
Tim
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions with patches
2006-11-13 22:14 ` 2.6.19-rc5: known regressions with patches Adrian Bunk
@ 2006-11-13 22:56 ` Brian King
2006-11-13 23:15 ` Linus Torvalds
0 siblings, 1 reply; 91+ messages in thread
From: Brian King @ 2006-11-13 22:56 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List,
Paul Mackerras, jgarzik, linux-ide
Adrian Bunk wrote:
> Subject : libata must be initialized earlier
> References : http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027945.html
> Submitter : Paul Mackerras <paulus@samba.org>
> Handled-By : Brian King <brking@us.ibm.com>
> Patch : http://marc.theaimsgroup.com/?l=linux-ide&m=116169938407596&w=2
> Status : patch available
I just resubmitted this patch a few minutes ago.
Brian
--
Brian King
eServer Storage I/O
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions with patches
2006-11-13 22:56 ` Brian King
@ 2006-11-13 23:15 ` Linus Torvalds
2006-11-14 2:35 ` Jeff Garzik
0 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-13 23:15 UTC (permalink / raw)
To: Brian King
Cc: Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
Paul Mackerras, jgarzik, linux-ide
On Mon, 13 Nov 2006, Brian King wrote:
> Adrian Bunk wrote:
> > Subject : libata must be initialized earlier
> > References : http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027945.html
> > Submitter : Paul Mackerras <paulus@samba.org>
> > Handled-By : Brian King <brking@us.ibm.com>
> > Patch : http://marc.theaimsgroup.com/?l=linux-ide&m=116169938407596&w=2
> > Status : patch available
>
> I just resubmitted this patch a few minutes ago.
I definitely want an ACK on this from Jeff - I'll take a few broken ppc64
machines any day over the worry that there might be problems elsewhere.
Jeff? Ack, Nack, or "I'll push it to you through my git tree", please..
Linus
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions
2006-11-13 22:46 ` Tim Chen
@ 2006-11-14 0:03 ` Eric W. Biederman
0 siblings, 0 replies; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-14 0:03 UTC (permalink / raw)
To: tim.c.chen
Cc: Adrian Bunk, Linus Torvalds, Andrew Morton,
Linux Kernel Mailing List
Tim Chen <tim.c.chen@linux.intel.com> writes:
> On Wed, 2006-11-08 at 22:10 -0700, Eric W. Biederman wrote:
>
>>
>> Cool. I'm glad to know it was simply a buggy lmbench.
>>
>> What is sysconf(_SC_NPROCESSORS_ONLN) doing that it slows down as the
>> number of irqs increases? It is a slow path certainly but possibly
>> something we should fix. My hunch is cat /proc/cpuinfo...
>>
>
> Looking at profiling data, the extra time in the
> sysconf(_SC_NPROCESSORS_ONLN) call is spent in the "show_stat" function.
> There are a couple of loops that iterate over kstat_irqs
> interrupt statistics and depend on NR_IRQS. Doesn't
> look like something we need to fix.
Thanks.
Eric
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions with patches
2006-11-13 23:15 ` Linus Torvalds
@ 2006-11-14 2:35 ` Jeff Garzik
0 siblings, 0 replies; 91+ messages in thread
From: Jeff Garzik @ 2006-11-14 2:35 UTC (permalink / raw)
To: Linus Torvalds
Cc: Brian King, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
Paul Mackerras, linux-ide
Linus Torvalds wrote:
>
> On Mon, 13 Nov 2006, Brian King wrote:
>
>> Adrian Bunk wrote:
>>> Subject : libata must be initialized earlier
>>> References : http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027945.html
>>> Submitter : Paul Mackerras <paulus@samba.org>
>>> Handled-By : Brian King <brking@us.ibm.com>
>>> Patch : http://marc.theaimsgroup.com/?l=linux-ide&m=116169938407596&w=2
>>> Status : patch available
>> I just resubmitted this patch a few minutes ago.
>
> I definitely want an ACK on this from Jeff - I'll take a few broken ppc64
> machines any day over the worry that there might be problems elsewhere.
>
> Jeff? Ack, Nack, or "I'll push it to you through my git tree", please..
Reluctant ACK. But this whole subsys_init() mess is highly fragile, and
this is going to change again once a new dependency arises :/
Jeff
^ permalink raw reply [flat|nested] 91+ messages in thread
* [patch] irq: do not mask interrupts by default
2006-11-13 21:11 ` Eric W. Biederman
@ 2006-11-14 8:14 ` Ingo Molnar
2006-11-14 8:20 ` Arjan van de Ven
` (2 more replies)
0 siblings, 3 replies; 91+ messages in thread
From: Ingo Molnar @ 2006-11-14 8:14 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linus Torvalds, Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List
On Mon, 2006-11-13 at 14:11 -0700, Eric W. Biederman wrote:
> - else
> + else {
> + irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
> set_irq_chip_and_handler_name(irq, &ioapic_chip,
> handle_edge_irq, "edge");
> + }
> }
yeah. Komuro, could you try my patch below - Eric's patch only updates
x86_64 while your failure was on the i386 kernel.
Note, i also took another approach to fix this problem, that should
cover both the case found by Komuro, and some other cases as well. The
theory is this:
1) disable_irq() is relatively rare (used in about 10% of drivers, and
there it's overwhelmingly used in some slowpath), so it's not
performance critical.
2) missing an IRQ while the line is masked is often a lethal regression
to the user. An IRQ could be missed even if we think that the IRQ line
is 'level-triggered'.
so my patch changes the default irq-disable logic of /all/ controllers
to "delayed disable". (IRQ chips can still override this by providing a
different chip->disable method that just clones their ->mask method, if
it is absolutely sure that no IRQs can be lost while masked)
So this patch has the worst-case effect of getting at most one 'extra'
interrupt after the IRQ line has been 'disabled' - at which point the
line will be masked for real (by the flow handler). (I updated the
fasteoi and the simple irq flow handlers to mask the IRQ for real if an
IRQ triggers and the line was disabled.)
It's a bit late in the -rc cycle for a change like this, but i'm fairly
positive about it. I booted it on a couple of boxes and saw no badness.
(neither did i see any increase in IRQ rates)
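
The "at most one extra interrupt" claim can be illustrated with a toy
model. This is a sketch under simplifying assumptions (one line, no
replay, hypothetical names), comparing a disable that masks immediately
with one that leaves masking to the flow handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
enum toy_policy {
    EAGER_MASK,     /* disable masks the line at the controller at once */
    LAZY_DISABLE    /* disable only sets a flag; mask on next arrival   */
};

struct toy_line {
    enum toy_policy policy;
    bool disabled;      /* soft "disabled" flag                  */
    bool hw_masked;     /* line masked at the controller         */
    int cpu_hits;       /* interrupts taken while disabled       */
};

static void toy_disable(struct toy_line *l)
{
    l->disabled = true;
    if (l->policy == EAGER_MASK)
        l->hw_masked = true;
}

/* The device signals an interrupt. */
static void toy_device_fires(struct toy_line *l)
{
    if (l->hw_masked)
        return;             /* never reaches the CPU */
    if (l->disabled) {
        l->cpu_hits++;      /* the one "extra" interrupt */
        l->hw_masked = true;/* flow handler masks it for real */
    }
    /* the enabled case (running the handler) is left out here */
}

/* Disable the line, let the device fire 'fires' times, count CPU hits. */
static int extra_irqs_after_disable(enum toy_policy policy, int fires)
{
    struct toy_line l = { policy, false, false, 0 };
    int i;

    assert(fires >= 0);
    toy_disable(&l);
    for (i = 0; i < fires; i++)
        toy_device_fires(&l);
    return l.cpu_hits;
}
```

With the lazy policy the CPU takes one interrupt after the disable and
then the line stays masked; with the eager policy it takes none.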
Ingo
NOTE: this also means that the old IRQ_DELAYED_DISABLE bit can probably
be scrapped - i'll do that later on in a separate mail, if this patch
works out fine.
------------>
Subject: irq: do not mask interrupts by default
From: Ingo Molnar <mingo@elte.hu>
never mask interrupts immediately upon request. Disabling
interrupts in high-performance codepaths is rare, and on
the other hand this change could recover lost edges (or
even other types of lost interrupts) by conservatively
only masking interrupts after they happen. (NOTE: with
this change the highlevel irq-disable code still soft-disables
this IRQ line - and if such an interrupt happens then the
IRQ flow handler keeps the IRQ masked.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/irq/chip.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
Index: linux/kernel/irq/chip.c
===================================================================
--- linux.orig/kernel/irq/chip.c
+++ linux/kernel/irq/chip.c
@@ -202,10 +202,6 @@ static void default_enable(unsigned int
*/
static void default_disable(unsigned int irq)
{
- struct irq_desc *desc = irq_desc + irq;
-
- if (!(desc->status & IRQ_DELAYED_DISABLE))
- desc->chip->mask(irq);
}
/*
@@ -272,8 +268,11 @@ handle_simple_irq(unsigned int irq, stru
kstat_cpu(cpu).irqs[irq]++;
action = desc->action;
- if (unlikely(!action || (desc->status & IRQ_DISABLED)))
+ if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
+ if (desc->chip->mask)
+ desc->chip->mask(irq);
goto out_unlock;
+ }
desc->status |= IRQ_INPROGRESS;
spin_unlock(&desc->lock);
@@ -366,11 +365,13 @@ handle_fasteoi_irq(unsigned int irq, str
/*
* If its disabled or no action available
- * keep it masked and get out of here
+ * then mask it and get out of here:
*/
action = desc->action;
if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
desc->status |= IRQ_PENDING;
+ if (desc->chip->mask)
+ desc->chip->mask(irq);
goto out;
}
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [patch] irq: do not mask interrupts by default
2006-11-14 8:14 ` [patch] irq: do not mask interrupts by default Ingo Molnar
@ 2006-11-14 8:20 ` Arjan van de Ven
2006-11-14 12:43 ` Komuro
2006-11-14 16:10 ` Linus Torvalds
2 siblings, 0 replies; 91+ messages in thread
From: Arjan van de Ven @ 2006-11-14 8:20 UTC (permalink / raw)
To: Ingo Molnar
Cc: Eric W. Biederman, Linus Torvalds, Komuro, tglx, Adrian Bunk,
Andrew Morton, Linux Kernel Mailing List
> so my patch changes the default irq-disable logic of /all/ controllers
> to "delayed disable". (IRQ chips can still override this by providing a
> different chip->disable method that just clones their ->mask method, if
> it is absolutely sure that no IRQs can be lost while masked)
>
> So this patch has the worst-case effect of getting at most one 'extra'
> interrupt after the IRQ line has been 'disabled' - at which point the
> line will be masked for real (by the flow handler). (I updated the
> fasteoi and the simple irq flow handlers to mask the IRQ for real if an
> IRQ triggers and the line was disabled.)
since disable_irq() is used as locking against interrupt context by
several drivers (*cough* ne2000 *cough*) I am not entirely convinced
this is a good idea....
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [patch] irq: do not mask interrupts by default
2006-11-14 8:14 ` [patch] irq: do not mask interrupts by default Ingo Molnar
2006-11-14 8:20 ` Arjan van de Ven
@ 2006-11-14 12:43 ` Komuro
2006-11-14 16:10 ` Linus Torvalds
2 siblings, 0 replies; 91+ messages in thread
From: Komuro @ 2006-11-14 12:43 UTC (permalink / raw)
To: Ingo Molnar
Cc: Eric W. Biederman, Linus Torvalds, Komuro, tglx, Adrian Bunk,
Andrew Morton, Linux Kernel Mailing List
Dear Ingo
I tried your patch with 2.6.19-rc5.
The irq is generated properly.
Thanks!
Best Regards
Komuro
>>
>------------>
>Subject: irq: do not mask interrupts by default
>From: Ingo Molnar <mingo@elte.hu>
>
>never mask interrupts immediately upon request. Disabling
>interrupts in high-performance codepaths is rare, and on
>the other hand this change could recover lost edges (or
>even other types of lost interrupts) by conservatively
>only masking interrupts after they happen. (NOTE: with
>this change the highlevel irq-disable code still soft-disables
>this IRQ line - and if such an interrupt happens then the
>IRQ flow handler keeps the IRQ masked.)
>
>Signed-off-by: Ingo Molnar <mingo@elte.hu>
>---
> kernel/irq/chip.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
>Index: linux/kernel/irq/chip.c
>===================================================================
>--- linux.orig/kernel/irq/chip.c
>+++ linux/kernel/irq/chip.c
>@@ -202,10 +202,6 @@ static void default_enable(unsigned int
> */
> static void default_disable(unsigned int irq)
> {
>- struct irq_desc *desc = irq_desc + irq;
>-
>- if (!(desc->status & IRQ_DELAYED_DISABLE))
>- desc->chip->mask(irq);
> }
>
> /*
>@@ -272,8 +268,11 @@ handle_simple_irq(unsigned int irq, stru
> kstat_cpu(cpu).irqs[irq]++;
>
> action = desc->action;
>- if (unlikely(!action || (desc->status & IRQ_DISABLED)))
>+ if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
>+ if (desc->chip->mask)
>+ desc->chip->mask(irq);
> goto out_unlock;
>+ }
>
> desc->status |= IRQ_INPROGRESS;
> spin_unlock(&desc->lock);
>@@ -366,11 +365,13 @@ handle_fasteoi_irq(unsigned int irq, str
>
> /*
> * If its disabled or no action available
>- * keep it masked and get out of here
>+ * then mask it and get out of here:
> */
> action = desc->action;
> if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
> desc->status |= IRQ_PENDING;
>+ if (desc->chip->mask)
>+ desc->chip->mask(irq);
> goto out;
> }
>
>
>
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [patch] irq: do not mask interrupts by default
2006-11-14 8:14 ` [patch] irq: do not mask interrupts by default Ingo Molnar
2006-11-14 8:20 ` Arjan van de Ven
2006-11-14 12:43 ` Komuro
@ 2006-11-14 16:10 ` Linus Torvalds
2006-11-14 17:52 ` [PATCH] Use delayed disable mode of ioapic edge triggered interrupts Eric W. Biederman
[not found] ` <20061115090427.GA16173@elte.hu>
2 siblings, 2 replies; 91+ messages in thread
From: Linus Torvalds @ 2006-11-14 16:10 UTC (permalink / raw)
To: Ingo Molnar
Cc: Eric W. Biederman, Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List
On Tue, 14 Nov 2006, Ingo Molnar wrote:
>
> 1) disable_irq() is relatively rare (used in about 10% of drivers, and
> there it's overwhelmingly used in some slowpath), so it's not
> performance critical.
Well, the thing is, the _replay_ if it does happen, is going to be really
really slow compared to the masking. So at that point, it may well be a
net performance downside if the masking is going to almost always have an
interrupt happen while the thing is masked. I dunno.
There's another thing too:
For level-triggered interrupts, I _really_ don't think we should do this.
The code inside the masked region is sometimes "setup code", which will do
things that _will_ raise an interrupt, but may read the status register or
whatever to then unraise it. So in that case, your patch will generate
different behaviour, something that I really don't want to introduce at
this point in the 2.6.19 series.
> 2) missing an IRQ while the line is masked is often a lethal regression
> to the user. An IRQ could be missed even if we think that the IRQ line
> is 'level-triggered'.
If it's level-triggered, it's going to be missed only if it's de-asserted
by code inside the masked region, and that is what we have always done on
purpose, so "missing" it is the right thing to do. It's what we have
tested all level-triggered interrupts with for the last 15+ years, and
it's been part of the semantics for masking.
So I absolutely do _not_ think your change is improved semantics. It's new
semantics, and illogical. If the driver masked the irq line, did some
testing that raises and clears it again ("let's check if this version of
the chip raises the interrupt when we do XYZZY"), then the logical thing
to do would be to not cause the interrupt to happen.
Of course, for edge-triggered APIC interrupts, we _have_ to replay the irq
(since we don't have any way of even *knowing* whether we might get it
again), but for level-triggered and for the old legacy i8259 controller
that gets it right for edges anyway, we should _not_ send the spurious
interrupt that is no longer active.
And a lot of code has been tested with either just the i8259 (old machines
without any APIC) or with PCI-only devices (which are always level-
triggered), so the fact that edge-triggered things have always seen the
potential for spurious interrupts is not a reason to say "well, they have
to handle it anyway". True PCI drivers generally do _not_ have to handle
the crazy case, and generally have never seen it.
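
The level-triggered "setup code" case can be sketched as a toy model.
It assumes a driver that disables the line, makes the device raise and
then clear its interrupt, and re-enables the line. All names are
hypothetical, and the replay on enable is a stand-in for whatever resend
mechanism the real code uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; names are hypothetical. */
enum toy_mode {
    MASK_IMMEDIATELY,       /* disable masks the line right away      */
    DELAYED_WITH_REPLAY     /* disable is soft; pending irq is replayed */
};

struct toy_level_line {
    enum toy_mode mode;
    bool asserted;      /* device is holding the line active */
    bool disabled;
    bool hw_masked;
    bool pending;
    int handler_runs;
};

/* The controller looks at the line and delivers if it can. */
static void toy_deliver(struct toy_level_line *l)
{
    if (!l->asserted || l->hw_masked)
        return;
    if (l->disabled) {          /* arrived during a soft disable */
        l->pending = true;
        l->hw_masked = true;
        return;
    }
    l->handler_runs++;
}

static void toy_disable(struct toy_level_line *l)
{
    l->disabled = true;
    if (l->mode == MASK_IMMEDIATELY)
        l->hw_masked = true;
}

static void toy_set_line(struct toy_level_line *l, bool level)
{
    l->asserted = level;
    toy_deliver(l);
}

static void toy_enable(struct toy_level_line *l)
{
    l->disabled = false;
    l->hw_masked = false;
    if (l->pending) {
        l->pending = false;
        l->handler_runs++;      /* replay, even if the line is now idle */
    } else {
        toy_deliver(l);         /* a still-active level fires normally */
    }
}

/* Driver probe: disable, raise and clear the irq, enable again. */
static int handler_runs_after_probe(enum toy_mode mode)
{
    struct toy_level_line l = { mode, false, false, false, false, 0 };

    assert(mode == MASK_IMMEDIATELY || mode == DELAYED_WITH_REPLAY);
    toy_disable(&l);
    toy_set_line(&l, true);     /* device raises the interrupt...   */
    toy_set_line(&l, false);    /* ...and the driver clears it      */
    toy_enable(&l);
    return l.handler_runs;
}
```

In this model immediate masking gives no handler call after the probe,
while the delayed mode calls the handler once for an interrupt that is
no longer active.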
> so my patch changes the default irq-disable logic of /all/ controllers
> to "delayed disable". (IRQ chips can still override this by providing a
> different chip->disable method that just clones their ->mask method, if
> it is absolutely sure that no IRQs can be lost while masked)
I really think we should do this just for APIC edge triggered interrupts,
ie keep the old behaviour.
Also, I worry a bit about the patch:
> @@ -272,8 +268,11 @@ handle_simple_irq(unsigned int irq, stru
> kstat_cpu(cpu).irqs[irq]++;
>
> action = desc->action;
> - if (unlikely(!action || (desc->status & IRQ_DISABLED)))
> + if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
> + if (desc->chip->mask)
> + desc->chip->mask(irq);
> goto out_unlock;
> + }
The simple-irq case too? That's not even going to replay the thing? So now
you just mask (without replaying) simple irqs, but then the other irqs you
mask and replay.. See above on why I don't think this is necessarily a bug
(since masking is almost always the right thing _anyway_), but now it will
*STILL* depend on some internal implementation decision on whether the
replay happens at all. I'd much rather have the replay decision be based
on hard physical data: we replay _only_ for edge-triggered interrupts, and
_only_ for controllers that need it.
In other words, I think we should just make APIC-edge have the "please
delay masking and replay" bit, and nobody else.
Can you send that patch (for both x86 and x86-64), and we can ask Komuro
to test it. That would be the "same behaviour as we've always had" thing,
which I think is also the _right_ behaviour.
Linus
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] 2.6.19-rc5: known regressions (v2)
2006-11-11 12:29 ` Paolo Ornati
@ 2006-11-14 16:44 ` Paolo Ornati
2006-11-29 10:10 ` [SOLVED] " Paolo Ornati
0 siblings, 1 reply; 91+ messages in thread
From: Paolo Ornati @ 2006-11-14 16:44 UTC (permalink / raw)
To: Paolo Ornati; +Cc: Rafael J. Wysocki, Adrian Bunk, LKML
On Sat, 11 Nov 2006 13:29:29 +0100
Paolo Ornati <ornati@fastwebnet.it> wrote:
> > Okay, please let us know if it survives the next several cycles.
> >
> > OTOH, the problem may be hiding.
>
> Ok, and if it survives again and again I can do a partial bisection...
"-rc5" is still alive: 6 days of uptime using suspend/resume many times
every day...
so if the problem is there it's hiding very well.
Now I'll slowly go back with older kernels and see what happens...
--
Paolo Ornati
Linux 2.6.19-rc5 on x86_64
^ permalink raw reply [flat|nested] 91+ messages in thread
* [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-14 16:10 ` Linus Torvalds
@ 2006-11-14 17:52 ` Eric W. Biederman
2006-11-14 23:35 ` Linus Torvalds
` (2 more replies)
[not found] ` <20061115090427.GA16173@elte.hu>
1 sibling, 3 replies; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-14 17:52 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List
Linus Torvalds <torvalds@osdl.org> writes:
> Of course, for edge-triggered APIC interrupts, we _have_ to replay the irq
> (since we don't have any way of even *knowing* whether we might get it
> again), but for level-triggered and for the old legacy i8259 controller
> that gets it right for edges anyway, we should _not_ send the spurious
> interrupt that is no longer active.
>
> And a lot of code has been tested with either just the i8259 (old machines
> without any APIC) or with PCI-only devices (which are always level-
> triggered), so the fact that edge-triggered things have always seen the
> potential for spurious interrupts is not a reason to say "well, they have
> to handle it anyway". True PCI drivers generally do _not_ have to handle
> the crazy case, and generally have never seen it.
>
> In other words, I think we should just make APIC-edge have the "please
> delay masking and replay" bit, and nobody else.
>
> Can you send that patch (for both x86 and x86-64), and we can ask Komuro
> to test it. That would be the "same behaviour as we've always had" thing,
> which I think is also the _right_ behaviour.
Hopefully this is the trivial patch that solves the problem.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
diff --git a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
index ad84bc2..3b7a63e 100644
--- a/arch/i386/kernel/io_apic.c
+++ b/arch/i386/kernel/io_apic.c
@@ -1287,9 +1287,11 @@ static void ioapic_register_intr(int irq
trigger == IOAPIC_LEVEL)
set_irq_chip_and_handler_name(irq, &ioapic_chip,
handle_fasteoi_irq, "fasteoi");
- else
+ else {
+ irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
set_irq_chip_and_handler_name(irq, &ioapic_chip,
handle_edge_irq, "edge");
+ }
set_intr_gate(vector, interrupt[irq]);
}
diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index 41bfc49..14654e6 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -790,9 +790,11 @@ static void ioapic_register_intr(int irq
trigger == IOAPIC_LEVEL)
set_irq_chip_and_handler_name(irq, &ioapic_chip,
handle_fasteoi_irq, "fasteoi");
- else
+ else {
+ irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
set_irq_chip_and_handler_name(irq, &ioapic_chip,
handle_edge_irq, "edge");
+ }
}
static void __init setup_IO_APIC_irqs(void)
^ permalink raw reply related [flat|nested] 91+ messages in thread
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-14 17:52 ` [PATCH] Use delayed disable mode of ioapic edge triggered interrupts Eric W. Biederman
@ 2006-11-14 23:35 ` Linus Torvalds
2006-11-15 1:17 ` Linus Torvalds
2006-11-15 12:40 ` Komuro
2 siblings, 0 replies; 91+ messages in thread
From: Linus Torvalds @ 2006-11-14 23:35 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List
On Tue, 14 Nov 2006, Eric W. Biederman wrote:
>
> Hopefully this is the trivial patch that solves the problem.
Komuro, can you check this patch _instead_ of the one from Ingo (ie not
together with, since that combination won't tell us anything new - if
Ingo's patch is there too, the new patch will basically be a no-op).
Linus
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-14 17:52 ` [PATCH] Use delayed disable mode of ioapic edge triggered interrupts Eric W. Biederman
2006-11-14 23:35 ` Linus Torvalds
@ 2006-11-15 1:17 ` Linus Torvalds
2006-11-15 5:14 ` Eric W. Biederman
2006-11-15 12:40 ` Komuro
2 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-15 1:17 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List
On Tue, 14 Nov 2006, Eric W. Biederman wrote:
>
> Hopefully this is the trivial patch that solves the problem.
Ok, having looked more at this, I have to say that the whole
"IRQ_DELAYED_DISABLE" thing seems very fragile indeed.
It looks like we should do it not only for APIC edge-triggered interrupts,
but for HT and MSI interrupts too, as far as I can tell (at least they
also use the "handle_edge_irq" routine)
So I'm wondering how many other cases there are that are missing this.
In that sense, Ingo's patch was a lot safer, although I still dislike it
for all the other reasons I mentioned - it's simply wrong to re-send a
level-triggered irq.
I don't know MSI and HT interrupts well enough to tell whether they will
re-trigger on their own when we unmask them, but the point is, this
_looks_ like it might be incomplete.
I think part of the problem is a bad interface. We should simply never set
the IRQ handler on its own. It should be a field in the "irq_chip"
structure, and we should use _different_ irq chip structures for level and
edge-triggered. Then we should also add the "flags" thing there, and you
could do something like
static struct irq_chip level_ioapic_chip = {
..
instead of making the insane decision to use the "same" chip for all
ioapic things.
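
One way the interface suggested above might look, purely as a sketch.
The struct layout, the flag name and the handler names are all
hypothetical and are not the existing irq_chip definition; the point is
only that the flow handler and the delayed-disable property travel with
the chip, so picking the chip picks both.

```c
#include <assert.h>

/* Hypothetical flag: ask for delayed masking plus replay. */
#define TOY_DELAYED_DISABLE 0x1u

typedef void (*toy_flow_fn)(unsigned int irq);

/* Hypothetical chip: flow handler and flags are fields of the chip. */
struct toy_irq_chip {
    const char *name;
    toy_flow_fn handler;
    unsigned int flags;
};

/* Counters so the example can show which flow handler ran. */
static int level_runs, edge_runs;

static void toy_handle_level(unsigned int irq) { (void)irq; level_runs++; }
static void toy_handle_edge(unsigned int irq)  { (void)irq; edge_runs++; }

static const struct toy_irq_chip level_ioapic_chip = {
    .name    = "IO-APIC-level",
    .handler = toy_handle_level,
    .flags   = 0,
};

static const struct toy_irq_chip edge_ioapic_chip = {
    .name    = "IO-APIC-edge",
    .handler = toy_handle_edge,
    .flags   = TOY_DELAYED_DISABLE,
};

/* Registration then reduces to choosing the right chip. */
static const struct toy_irq_chip *toy_chip_for(int level_triggered)
{
    const struct toy_irq_chip *chip =
        level_triggered ? &level_ioapic_chip : &edge_ioapic_chip;

    assert(chip->handler);
    return chip;
}
```

With this shape there is no separate step in which a caller could set
the edge handler and forget the flag.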
Ingo? Eric? Comments?
Linus
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-15 1:17 ` Linus Torvalds
@ 2006-11-15 5:14 ` Eric W. Biederman
2006-11-15 16:06 ` Linus Torvalds
0 siblings, 1 reply; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-15 5:14 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List
Linus Torvalds <torvalds@osdl.org> writes:
> On Tue, 14 Nov 2006, Eric W. Biederman wrote:
>>
>> Hopefully this is the trivial patch that solves the problem.
>
> Ok, having looked more at this, I have to say that the whole
> "IRQ_DELAYED_DISABLE" thing seems very fragile indeed.
>
> It looks like we should do it not only for APIC edge-triggered interrupts,
> but for HT and MSI interrupts too, as far as I can tell (at least they
> also use the "handle_edge_irq" routine)
>
> So I'm wondering how many other cases there are that are missing this.
I think it is a good question.
The big case I did not set it on is an interrupt that comes in
through ExtInt. I assume the 8259 is sane but I may be wrong.
> In that sense, Ingo's patch was a lot safer, although I still dislike it
> for all the other reasons I mentioned - it's simply wrong to re-send a
> level-triggered irq.
>
> I don't know MSI and HT interrupts well enough to tell whether they will
> re-trigger on their own when we unmask them, but the point is, this
> _looks_ like it might be incomplete.
Yes. I think there is an interrupt status bit there.
For at least one case in MSI we don't have a disable at all.
The truth is in practice I don't think it matters because I don't
think anyone actually disables MSI or hypertransport interrupts.
If it was going to change it would probably change per card.
But the real truth is that the hardware device knows what is going on.
The interrupt message is sent by the hardware device or it is not.
This isn't a case of can we detect an interrupt being raised by the
device while we disabled the interrupt at the device. This is a
case of we disable the interrupt at the device. So I think the whole
question of do we detect an interrupt raised by the device while
we have disabled interrupts on the device is silly.
So until I learn more I am going to assume that MSI and hypertransport
interrupts are sane like 8259 interrupts. If that makes sense.
> I think part of the problem is a bad interface. We should simply never set
> the IRQ handler on its own. It should be a field in the "irq_chip"
> structure, and we should use _different_ irq chip structures for level and
> edge-triggered. Then we should also add the "flags" thing there, and you
> could do something like
>
> static struct irq_chip level_ioapic_chip = {
> ..
>
> instead of making the insane decision to use the "same" chip for all
> ioapic things.
I think there is probably a sensible case for a separate structure.
At this point I have two questions.
- What is the easiest path to get us to a stable 2.6.19 where
everything works?
I don't think that is backing out genirq. But I haven't looked at all of
these corner cases.
I think for 2.6.19 we can get away with just my stupid patch, or
some simple variation of it.
- What is the sanest thing for long term maintenance, of irqs?
genirq is less code to maintain overall (a plus).
genirq helps us do things across architectures, which is nice.
genirq is also a little convoluted to read and to use (a downside).
My gut feel is that there is room for a lot more cleanup in this
area but we probably need to stabilize what we have.
Since you aren't complaining about what the code actually does but
rather how the interface looks, I have a proposal. I assert that
the interface for registering an irq is much too general and broad.
Instead of having:
irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
set_irq_chip_and_handler_name(irq, &ioapic_chip,
handle_edge_irq, "edge");
We should have a set of helper functions one for each common type
of interrupt.
set_irq_edge_lossy(irq, &ioapic_chip);
set_irq_edge(irq, &ioapic_chip);
set_irq_level(irq, &ioapic_chip);
The more stupid parameters we have to set the more likely
an implementor is to get it wrong.
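
A minimal user-space sketch of such helpers, under stated assumptions: the descriptor table, the `_sketch` types and the handler stubs below are simplified stand-ins, not the kernel's definitions, and reading "lossy" as "edge-triggered and needs IRQ_DELAYED_DISABLE" is an interpretation of the proposal rather than something spelled out in it.

```c
#include <assert.h>
#include <stddef.h>

#define NR_IRQS_SKETCH             16
#define IRQ_DELAYED_DISABLE_SKETCH 0x1u	/* stand-in for IRQ_DELAYED_DISABLE */

struct irq_chip_sketch { const char *name; };
typedef void (*flow_handler_t)(unsigned int irq);

struct irq_desc_sketch {
	const struct irq_chip_sketch *chip;
	flow_handler_t handle_irq;
	const char *name;
	unsigned int status;
};

struct irq_desc_sketch irq_table[NR_IRQS_SKETCH];

/* Stub flow handlers; the real ones do the ack/mask/eoi work. */
void handle_edge_sketch(unsigned int irq)  { (void)irq; }
void handle_level_sketch(unsigned int irq) { (void)irq; }

/* The single place where chip, handler, name and status are combined. */
static void set_irq_common(unsigned int irq, const struct irq_chip_sketch *chip,
			   flow_handler_t handler, const char *name,
			   unsigned int status)
{
	assert(irq < NR_IRQS_SKETCH);
	irq_table[irq].chip = chip;
	irq_table[irq].handle_irq = handler;
	irq_table[irq].name = name;
	irq_table[irq].status = status;
}

/* Edge-triggered, controller may drop edges while masked. */
void set_irq_edge_lossy(unsigned int irq, const struct irq_chip_sketch *chip)
{
	set_irq_common(irq, chip, handle_edge_sketch, "edge",
		       IRQ_DELAYED_DISABLE_SKETCH);
}

/* Edge-triggered, controller keeps edges across a mask/unmask. */
void set_irq_edge(unsigned int irq, const struct irq_chip_sketch *chip)
{
	set_irq_common(irq, chip, handle_edge_sketch, "edge", 0);
}

/* Level-triggered: the line re-asserts by itself after unmask. */
void set_irq_level(unsigned int irq, const struct irq_chip_sketch *chip)
{
	set_irq_common(irq, chip, handle_level_sketch, "level", 0);
}
```

The point of the shape is that the delayed-disable decision is made once, inside a helper, instead of at every call site.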
Although I do agree to some extent it has been a bit of a strain
having both edge and level triggered interrupts with the same
methods. So if our goal is to make an even simpler interface
than what we have now I will be happy. Hopefully we can do all
of this in helper functions instead of having to rip up all of
the interrupt infrastructure one more time.
I really don't know. I'm tired and I want to see this code work.
Eric
* 2.6.19-rc5: known regressions (v3)
2006-11-08 2:33 Linux 2.6.19-rc5 Linus Torvalds
` (3 preceding siblings ...)
2006-11-13 22:14 ` 2.6.19-rc5: known regressions with patches Adrian Bunk
@ 2006-11-15 10:21 ` Adrian Bunk
2006-11-15 10:35 ` Jens Axboe
` (4 more replies)
4 siblings, 5 replies; 91+ messages in thread
From: Adrian Bunk @ 2006-11-15 10:21 UTC (permalink / raw)
To: Linus Torvalds, Andrew Morton
Cc: Linux Kernel Mailing List, Stephen Hemminger, gregkh, linux-pci,
Komuro, Eric W. Biederman, Ingo Molnar, Ernst Herzberg, Len Brown,
Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el,
oprofile-list, Alex Romosan, Jens Axboe, Andrey Borzenkov,
Alan Stern, linux-usb-devel
This email lists some known regressions in 2.6.19-rc5 compared to 2.6.18
that are not yet fixed in Linus' tree.
If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.
Due to the huge number of recipients, please trim the Cc when answering.
Subject : PCI MSI setting corrupted during resume
References : http://bugzilla.kernel.org/show_bug.cgi?id=7479
Submitter : Stephen Hemminger <shemminger@osdl.org>
Status : unknown
Subject : SMP kernel can not generate ISA irq properly
References : http://lkml.org/lkml/2006/10/22/15
http://lkml.org/lkml/2006/11/10/142
Submitter : Komuro <komurojun-mbn@nifty.com>
Handled-By : "Eric W. Biederman" <ebiederm@xmission.com>
Ingo Molnar <mingo@redhat.com>
Status : problem is being debugged
Subject : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter : Ernst Herzberg <earny@net4u.de>
Handled-By : Len Brown <len.brown@intel.com>
Status : problem is being debugged
Subject : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
http://lkml.org/lkml/2006/11/10/208
Submitter : Andre Noll <maan@systemlinux.org>
Handled-By : Andi Kleen <ak@suse.de>
Status : Andi is investigating
Subject : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
Submitter : Prakash Punnoor <prakash@punnoor.de>
Status : unknown
Subject : unable to rip cd
References : http://lkml.org/lkml/2006/10/13/100
http://lkml.org/lkml/2006/11/8/42
Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
Handled-By : Jens Axboe <jens.axboe@oracle.com>
Status : Jens is investigating
Subject : can't disable OHCI wakeup via sysfs
References : http://lkml.org/lkml/2006/11/11/33
Submitter : Andrey Borzenkov <arvidjaar@mail.ru>
Handled-By : Alan Stern <stern@rowland.harvard.edu>
Patch : http://lkml.org/lkml/2006/11/13/261
Status : patch available
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
@ 2006-11-15 10:35 ` Jens Axboe
2006-11-15 10:53 ` Adrian Bunk
2006-11-15 10:35 ` Eric Dumazet
` (3 subsequent siblings)
4 siblings, 1 reply; 91+ messages in thread
From: Jens Axboe @ 2006-11-15 10:35 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Linux Kernel
On Wed, Nov 15 2006, Adrian Bunk wrote:
> Subject : unable to rip cd
> References : http://lkml.org/lkml/2006/10/13/100
> http://lkml.org/lkml/2006/11/8/42
> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> Handled-By : Jens Axboe <jens.axboe@oracle.com>
> Status : Jens is investigating
It's fixed and the patch has been merged.
--
Jens Axboe
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
2006-11-15 10:35 ` Jens Axboe
@ 2006-11-15 10:35 ` Eric Dumazet
2006-11-15 10:50 ` Andi Kleen
2006-11-22 10:28 ` Eric Dumazet
2006-11-15 11:06 ` Brice Goglin
` (2 subsequent siblings)
4 siblings, 2 replies; 91+ messages in thread
From: Eric Dumazet @ 2006-11-15 10:35 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List,
Stephen Hemminger, gregkh, linux-pci, Komuro, Eric W. Biederman,
Ingo Molnar, Ernst Herzberg, Len Brown, Andre Noll, Andi Kleen,
discuss, Prakash Punnoor, phil.el, oprofile-list, Alex Romosan,
Jens Axboe, Andrey Borzenkov, Alan Stern, linux-usb-devel
On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
> Subject : x86_64: oprofile doesn't work
> References : http://lkml.org/lkml/2006/10/27/3
> Submitter : Prakash Punnoor <prakash@punnoor.de>
> Status : unknown
>
I confirm I got this one too.
On a working kernel on an Opteron, we have normally 4 directories
in /dev/oprofile :
# ls -ld /dev/oprofile/?
drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/0
drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/1
drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/2
drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/3
With linux-2.6.19-rc5, the first one (0) is missing and we get 1,2,3
Maybe the 'bug' is in oprofile tools, that currently expect to find '0'
Eric
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:35 ` Eric Dumazet
@ 2006-11-15 10:50 ` Andi Kleen
2006-11-15 16:40 ` William Cohen
2006-11-22 10:28 ` Eric Dumazet
1 sibling, 1 reply; 91+ messages in thread
From: Andi Kleen @ 2006-11-15 10:50 UTC (permalink / raw)
To: Eric Dumazet
Cc: Adrian Bunk, Linus Torvalds, Andrew Morton,
Linux Kernel Mailing List, Stephen Hemminger, gregkh, linux-pci,
Komuro, Eric W. Biederman, Ingo Molnar, Ernst Herzberg, Len Brown,
Andre Noll, discuss, Prakash Punnoor, phil.el, oprofile-list,
Alex Romosan, Jens Axboe, Andrey Borzenkov, Alan Stern,
linux-usb-devel
> On a working kernel on an Opteron, we have normally 4 directories
> in /dev/oprofile :
>
> # ls -ld /dev/oprofile/?
> drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/0
> drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/1
> drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/2
> drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/3
>
> With linux-2.6.19-rc5, the first one (0) is missing and we get 1,2,3
That's because 0 was never available. It is used by the NMI watchdog.
The new kernel doesn't give it to oprofile anymore.
> Maybe the 'bug' is in oprofile tools, that currently expect to find '0'
Yes, it's likely a user space issue.
-Andi
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:35 ` Jens Axboe
@ 2006-11-15 10:53 ` Adrian Bunk
0 siblings, 0 replies; 91+ messages in thread
From: Adrian Bunk @ 2006-11-15 10:53 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linux Kernel, Alex Romosan
On Wed, Nov 15, 2006 at 11:35:05AM +0100, Jens Axboe wrote:
> On Wed, Nov 15 2006, Adrian Bunk wrote:
> > Subject : unable to rip cd
> > References : http://lkml.org/lkml/2006/10/13/100
> > http://lkml.org/lkml/2006/11/8/42
> > Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> > Handled-By : Jens Axboe <jens.axboe@oracle.com>
> > Status : Jens is investigating
>
> it's fixed and patched has been merged.
Thanks for the information, I've removed it from my list.
> Jens Axboe
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
2006-11-15 10:35 ` Jens Axboe
2006-11-15 10:35 ` Eric Dumazet
@ 2006-11-15 11:06 ` Brice Goglin
2006-11-15 22:32 ` Adrian Bunk
2006-11-15 12:07 ` Alan
2006-11-15 15:52 ` Stephen Hemminger
4 siblings, 1 reply; 91+ messages in thread
From: Brice Goglin @ 2006-11-15 11:06 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List
Adrian Bunk wrote:
> Subject : unable to rip cd
> References : http://lkml.org/lkml/2006/10/13/100
> http://lkml.org/lkml/2006/11/8/42
> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> Handled-By : Jens Axboe <jens.axboe@oracle.com>
> Status : Jens is investigating
I think this one is already fixed.
Brice
commit 616e8a091a035c0bd9b871695f4af191df123caa
author Jens Axboe <jens.axboe@oracle.com> 1163437499 +0100
committer Linus Torvalds <torvalds@g5.osdl.org> 1163440020 -0800
[PATCH] Fix bad data direction in SG_IO
Contrary to what the name misleads you to believe, SG_DXFER_TO_FROM_DEV
is really just a normal read seen from the device side.
This patch fixes http://lkml.org/lkml/2006/10/13/100
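
The direction rule described in that commit message can be sketched as follows. This is an illustration only, not the code from the commit: the enum names are hypothetical stand-ins for the SG_DXFER_* constants from <scsi/sg.h>, and the function name is invented here.

```c
#include <assert.h>

/* Hypothetical stand-ins for the SG_DXFER_* values in <scsi/sg.h>. */
enum sketch_dxfer {
	SKETCH_DXFER_NONE,
	SKETCH_DXFER_TO_DEV,
	SKETCH_DXFER_FROM_DEV,
	SKETCH_DXFER_TO_FROM_DEV,
};

enum sketch_dir {
	SKETCH_DIR_NONE,
	SKETCH_DIR_WRITE,
	SKETCH_DIR_READ,
	SKETCH_DIR_INVALID,
};

/*
 * Map the requested transfer type to a data direction.
 * TO_FROM_DEV is treated like FROM_DEV, i.e. as a read.
 */
enum sketch_dir sketch_data_direction(enum sketch_dxfer dxfer, unsigned int len)
{
	if (len == 0)
		return SKETCH_DIR_NONE;	/* nothing to transfer */

	switch (dxfer) {
	case SKETCH_DXFER_TO_DEV:
		return SKETCH_DIR_WRITE;
	case SKETCH_DXFER_FROM_DEV:
	case SKETCH_DXFER_TO_FROM_DEV:
		return SKETCH_DIR_READ;
	default:
		return SKETCH_DIR_INVALID;
	}
}
```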
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
` (2 preceding siblings ...)
2006-11-15 11:06 ` Brice Goglin
@ 2006-11-15 12:07 ` Alan
2006-11-15 15:52 ` Stephen Hemminger
4 siblings, 0 replies; 91+ messages in thread
From: Alan @ 2006-11-15 12:07 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List,
Stephen Hemminger, gregkh, linux-pci, Komuro, Eric W. Biederman,
Ingo Molnar, Ernst Herzberg, Len Brown, Andre Noll, Andi Kleen,
discuss, Prakash Punnoor, phil.el, oprofile-list, Alex Romosan,
Jens Axboe, Andrey Borzenkov, Alan Stern, linux-usb-devel
> Subject : PCI MSI setting corrupted during resume
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7479
> Submitter : Stephen Hemminger <shemminger@osdl.org>
> Status : unknown
This is one of the minor resume problems as far as I can tell. I believe
the patches I posted for having a resume quirk run on each device if
appropriate should correctly resolve these. See the patch I sent to l/k.
There are a variety of other resume quirks we definitely require.
Alan
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-14 17:52 ` [PATCH] Use delayed disable mode of ioapic edge triggered interrupts Eric W. Biederman
2006-11-14 23:35 ` Linus Torvalds
2006-11-15 1:17 ` Linus Torvalds
@ 2006-11-15 12:40 ` Komuro
2 siblings, 0 replies; 91+ messages in thread
From: Komuro @ 2006-11-15 12:40 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linus Torvalds, Ingo Molnar, Komuro, tglx, Adrian Bunk,
Andrew Morton, Linux Kernel Mailing List
Hi,
I tried Eric's patch instead of Ingo's
with 2.6.19-rc5.
The interrupt is generated properly.
Thanks!
Best Regards
Komuro
>
>Hopefully this is the trivial patch that solves the problem.
>
>Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>
>diff --git a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
>index ad84bc2..3b7a63e 100644
>--- a/arch/i386/kernel/io_apic.c
>+++ b/arch/i386/kernel/io_apic.c
>@@ -1287,9 +1287,11 @@ static void ioapic_register_intr(int irq
> trigger == IOAPIC_LEVEL)
> set_irq_chip_and_handler_name(irq, &ioapic_chip,
> handle_fasteoi_irq, "fasteoi");
>- else
>+ else {
>+ irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
> set_irq_chip_and_handler_name(irq, &ioapic_chip,
> handle_edge_irq, "edge");
>+ }
> set_intr_gate(vector, interrupt[irq]);
> }
>
>diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
>index 41bfc49..14654e6 100644
>--- a/arch/x86_64/kernel/io_apic.c
>+++ b/arch/x86_64/kernel/io_apic.c
>@@ -790,9 +790,11 @@ static void ioapic_register_intr(int irq
> trigger == IOAPIC_LEVEL)
> set_irq_chip_and_handler_name(irq, &ioapic_chip,
> handle_fasteoi_irq, "fasteoi");
>- else
>+ else {
>+ irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
> set_irq_chip_and_handler_name(irq, &ioapic_chip,
> handle_edge_irq, "edge");
>+ }
> }
>
> static void __init setup_IO_APIC_irqs(void)
>
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
` (3 preceding siblings ...)
2006-11-15 12:07 ` Alan
@ 2006-11-15 15:52 ` Stephen Hemminger
2006-11-15 16:35 ` Eric W. Biederman
4 siblings, 1 reply; 91+ messages in thread
From: Stephen Hemminger @ 2006-11-15 15:52 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, gregkh,
linux-pci, Komuro, Eric W. Biederman, Ingo Molnar, Ernst Herzberg,
Len Brown, Andre Noll, Andi Kleen, discuss, Prakash Punnoor,
phil.el, oprofile-list, Alex Romosan, Jens Axboe,
Andrey Borzenkov, Alan Stern, linux-usb-devel
>
> Subject : PCI MSI setting corrupted during resume
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7479
> Submitter : Stephen Hemminger <shemminger@osdl.org>
> Status : unknown
>
Turns out this isn't a regression, it was always there. It has to do with ACPI
clearing state on resume. MSI wasn't being used the same in older kernels so
it didn't show up.
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-15 5:14 ` Eric W. Biederman
@ 2006-11-15 16:06 ` Linus Torvalds
2006-11-15 16:58 ` Eric W. Biederman
0 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-15 16:06 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List
On Tue, 14 Nov 2006, Eric W. Biederman wrote:
>
> The big one I did not set it on is the interrupt if it comes in
> through ExtInt. I assume the 8259 is sane but I may be wrong.
Yes, ExtInt is ok, if you actually mask it at the 8259. As mentioned
earlier in the thread, the i8259 has its edge detect logic _after_ the
masking logic, so if the irq is still active, and you unmask it, it will
see an edge, and re-assert the interrupt in hardware.
So the i8259 is a good interrupt controller, and does not need delayed
disable and software logic to re-assert the irq.
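
A toy model of why the ordering matters, as a sketch only. It is not a description of any real chip's registers: `mask_first` models a controller whose edge detector sits after the mask (the i8259 behaviour described above), while the other setting models a hypothetical controller that detects edges on the raw line and throws them away while masked, which is the case delayed disable is meant to cover.

```c
#include <assert.h>
#include <stdbool.h>

struct pic_model {
	bool mask_first;	/* true: edge detector sits after the mask */
	bool masked;
	bool line;		/* current level of the device's irq line */
	bool detector_in;	/* last value the edge detector saw */
	bool pending;		/* an interrupt has been latched */
};

/* Re-evaluate the edge detector after any input change. */
static void pic_update(struct pic_model *p)
{
	bool in = p->mask_first ? (p->line && !p->masked) : p->line;
	bool rising = in && !p->detector_in;

	p->detector_in = in;
	if (rising && (p->mask_first || !p->masked))
		p->pending = true;	/* otherwise the edge is dropped */
}

void pic_set_line(struct pic_model *p, bool level)
{
	p->line = level;
	pic_update(p);
}

void pic_set_mask(struct pic_model *p, bool masked)
{
	p->masked = masked;
	pic_update(p);
}

/* Mask, let the device raise its line, unmask: is the irq seen? */
bool irq_seen_after_masked_edge(bool mask_first)
{
	struct pic_model p = { .mask_first = mask_first };

	pic_set_mask(&p, true);
	pic_set_line(&p, true);
	pic_set_mask(&p, false);
	return p.pending;
}
```

In the mask-first model the unmask itself produces the rising edge at the detector, so nothing is lost. In the other model the edge happened while masked and is gone by the time the mask is lifted.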
> The truth is in practice I don't think it matters because I don't
> think anyone actually disables MSI or hypertransport interrupts.
Fair enough, at least for a 2.6.19 kind of release timeframe (and that is
what I worry about most, at least right now).
> At this point I have two questions.
> - What is the easiest path to get us to a stable 2.6.19 where
> everything works?
If people don't expect HT and MSI interrupts to be masked (and I can well
imagine that), then I think your two-liner patch is good to go. Komuro
seems to have acked it already, and in many ways that's the "minimal
change" for 2.6.19 right now.
I do like Ingo's patch because it seems "safe" (even if I think it might
be a bit _overly_ safe), but it changes semantics enough that I don't like
it for 2.6.19. Even his second version definitely changes semantics for
level-triggered PCI interrupts, even though he fixed ExtInt/i8259 ones.
So I think I'll go with your patch for now, and we can re-visit Ingo's
thing after 2.6.19.
> - What is the sanest thing for long term maintenance, of irqs?
>
> genirq is less code to maintain overall (a plus).
Oh, I absolutely think genirq is the right thing to do. No question at
all. I just think that we might want to refactor the code somewhat, and in
particular I suspect that many irq controller drivers should use separate
"struct irq_chip" entries for edge and level, because they are
fundamentally different.
> My gut feel is that there is room for a lot more cleanup in this
> area but we probably need to stabilize what we have.
Exactly. Baby steps. Make it work. Then clean up. Slowly.
> Since you aren't complaining about what the code actually does but
> rather how the interface looks, I have a proposal. I assert that
> the interface for registering an irq is much to general, and broad.
>
> Instead of having:
>
> irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
> set_irq_chip_and_handler_name(irq, &ioapic_chip,
> handle_edge_irq, "edge");
>
> We should have a set of helper functions one for each common type
> of interrupt.
>
> set_irq_edge_lossy(irq, &ioapic_chip);
> set_irq_edge(irq, &ioapic_chip);
> set_irq_level(irq, &ioapic_chip);
Yeah, that might be a fine way too. That's largely what we do for the IO
schedulers, and it's been fairly successful. Start out by setting common
defaults, and then allow chip drivers to specify particular details
explicitly.
Ingo?
Linus
* Re: [patch] genirq: do not mask interrupts by default
[not found] ` <20061115090427.GA16173@elte.hu>
@ 2006-11-15 16:13 ` Linus Torvalds
2006-11-15 17:46 ` Ingo Molnar
0 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-15 16:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: Ingo Molnar, Eric W. Biederman, Komuro, tglx, Adrian Bunk,
Andrew Morton, Linux Kernel Mailing List
On Wed, 15 Nov 2006, Ingo Molnar wrote:
>
> problem is, we dont know /for a fact/ that something is "APIC-edge". We
> only know that the BIOS claims it that it's so.
This is incorrect. We will have _programmed_ the APIC with whatever the
BIOS said in the MP tables, so if we think it's level triggered, it _is_
level triggered.
So I really think that all the arguments for i8259 not wanting replay
weigh equally on level-triggered PCI irq's too.
Now, the one thing that makes me think your approach is the right one is
that it's potentially going to be better performance - if people disable
irq's and the normal case is that no irq will actually happen, then
optimistically not doing anything at all (except marking the irq disabled,
of course) is always good.
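
The optimistic scheme can be modelled in a few lines. This is a simplified sketch, not the genirq code: the struct, the function names and the `hw_mask_writes` counter are invented here, and it assumes that an interrupt arriving while software-disabled is masked at that point and replayed on enable.

```c
#include <assert.h>
#include <stdbool.h>

struct lazy_irq {
	bool disabled;		/* software state only */
	bool hw_masked;		/* state of the controller's mask bit */
	bool pending;		/* arrived while disabled, replay later */
	int handled;		/* how often the handler has run */
	int hw_mask_writes;	/* how often we touched the hardware */
};

/* Disable is just a flag write; the controller is left alone. */
void lazy_disable(struct lazy_irq *d)
{
	d->disabled = true;
}

/* Called when the controller delivers the interrupt. */
void lazy_irq_arrives(struct lazy_irq *d)
{
	if (d->hw_masked)
		return;			/* controller would not deliver it */

	if (d->disabled) {
		d->hw_masked = true;	/* only now pay for the mask */
		d->hw_mask_writes++;
		d->pending = true;
		return;
	}
	d->handled++;
}

/* Enable undoes the mask if one was needed and replays a pending irq. */
void lazy_enable(struct lazy_irq *d)
{
	d->disabled = false;
	if (d->hw_masked) {
		d->hw_masked = false;
		d->hw_mask_writes++;
	}
	if (d->pending) {
		d->pending = false;
		d->handled++;		/* software replay */
	}
}
```

In the common case, where nothing arrives between disable and enable, the model never writes to the controller at all, which is the performance argument made above.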
However, because it's a semantic change, I _really_ don't want to do it
right now. We're maybe a week away from 2.6.19, and the "ISA irq's don't
work" report is one of the things that is holding things up right now.
So that's why I'd much rather go with Eric's patch for now - because it
keeps the semantics that we've always had.
Linus
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 15:52 ` Stephen Hemminger
@ 2006-11-15 16:35 ` Eric W. Biederman
0 siblings, 0 replies; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-15 16:35 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Adrian Bunk, Linus Torvalds, Andrew Morton,
Linux Kernel Mailing List, gregkh, linux-pci, Komuro,
Eric W. Biederman, Ingo Molnar, Ernst Herzberg, Len Brown,
Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el,
oprofile-list, Alex Romosan, Jens Axboe, Andrey Borzenkov,
Alan Stern, linux-usb-devel
Stephen Hemminger <shemminger@osdl.org> writes:
>>
>> Subject : PCI MSI setting corrupted during resume
>> References : http://bugzilla.kernel.org/show_bug.cgi?id=7479
>> Submitter : Stephen Hemminger <shemminger@osdl.org>
>> Status : unknown
>>
> Turns out this isn't a regression, it was always there. It has to do with ACPI
> clearing state on resume. MSI wasn't being used the same in older kernels so
> it didn't show up.
Ok. Do we know enough to fix the MSI case?
Eric
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:50 ` Andi Kleen
@ 2006-11-15 16:40 ` William Cohen
2006-11-15 16:48 ` [discuss] " Andi Kleen
0 siblings, 1 reply; 91+ messages in thread
From: William Cohen @ 2006-11-15 16:40 UTC (permalink / raw)
To: Andi Kleen
Cc: Eric Dumazet, Andrew Morton, Komuro, Ernst Herzberg, Andre Noll,
oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk,
Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger,
Prakash Punnoor, Len Brown, Alex Romosan, Linus Torvalds, discuss,
gregkh, Linux Kernel Mailing List, Eric W. Biederman,
Andrey Borzenkov
Andi Kleen wrote:
>>On a working kernel on an Opteron, we have normally 4 directories
>>in /dev/oprofile :
>>
>># ls -ld /dev/oprofile/?
>>drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/0
>>drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/1
>>drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/2
>>drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/3
>>
>>With linux-2.6.19-rc5, the first one (0) is missing and we get 1,2,3
>
>
> That's because 0 was never available. It is used by the NMI watchdog.
> The new kernel doesn't give it to oprofile anymore.
>
>
>>Maybe the 'bug' is in oprofile tools, that currently expect to find '0'
>
>
> Yes, it's likely a user space issue.
>
> -Andi
OProfile has a simplistic view of the performance monitoring hardware. The
routines in libop/op_alloc_counter.c determine what set of performance registers
is available from the processor in use. There is no check to see what registers
are actually available in the /dev/oprofile directory.
opcontrol executes ophelp to determine which specific counters to count which
events. The function map_event_to_counter() in libop/op_alloc_counter.c does the
actual selection. It seems what is needed is for map_event_to_counter() to check
to see which counters are available and mark the others as unavailable.
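
One possible shape for that check, as a sketch only and not the oprofile source: the function name `available_counter_mask()` is hypothetical, and it assumes that a counter is usable exactly when a numbered directory for it exists under the oprofilefs mount point.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Return a bitmask with bit N set when <base>/N exists and is a
 * directory, for N in [0, nr_counters). The allocator could then
 * skip every counter whose bit is clear.
 */
unsigned long available_counter_mask(const char *base, int nr_counters)
{
	unsigned long mask = 0;
	char path[4096];
	struct stat st;
	int i;

	for (i = 0; i < nr_counters && i < (int)(8 * sizeof(mask)); i++) {
		int n = snprintf(path, sizeof(path), "%s/%d", base, i);

		if (n < 0 || (size_t)n >= sizeof(path))
			continue;	/* path did not fit, treat as absent */
		if (stat(path, &st) == 0 && S_ISDIR(st.st_mode))
			mask |= 1UL << i;
	}
	return mask;
}
```

On the 2.6.19-rc5 Opteron case reported above, where only 1, 2 and 3 are present, a call like `available_counter_mask("/dev/oprofile", 4)` would be expected to leave bit 0 clear.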
-Will
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 16:40 ` William Cohen
@ 2006-11-15 16:48 ` Andi Kleen
2006-11-15 18:39 ` Andrew Morton
0 siblings, 1 reply; 91+ messages in thread
From: Andi Kleen @ 2006-11-15 16:48 UTC (permalink / raw)
To: discuss
Cc: William Cohen, Eric Dumazet, Andrew Morton, Komuro,
Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, Linus Torvalds, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
> OProfile has a simplistic view of the performance monitoring hardware. The
> routines in libop/op_alloc_counter.c determine what set of performance registers
> is available from the processor in use. There is no check to see what registers
> are actually available in the /dev/oprofile directory.
>
> opcontrol executes ophelp to determine which specific counters to count which
> events. The function map_event_to_counter() in libop/op_alloc_counter.c does the
> actual selection. It seems what is needed is for map_event_to_counter() to check
> to see which counters are available and mark the others as unavailable
Thanks for the explanation. Can you please fix it and release a new version?
Documentation/Changes could be adapted then.
-Andi
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-15 16:06 ` Linus Torvalds
@ 2006-11-15 16:58 ` Eric W. Biederman
0 siblings, 0 replies; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-15 16:58 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton,
Linux Kernel Mailing List
Linus Torvalds <torvalds@osdl.org> writes:
> On Tue, 14 Nov 2006, Eric W. Biederman wrote:
>
>> The truth is in practice I don't think it matters because I don't
>> think anyone actually disables MSI or hypertransport interrupts.
>
> Fair enough, at least for a 2.6.19 kind of release timeframe (and that is
> what I worry about most, at least right now).
>
>> At this point I have two questions.
>> - What is the easiest path to get us to a stable 2.6.19 where
>> everything works?
>
> If people don't expect HT and MSI interrupts to be masked (and I can well
> imagine that), then I think your two-liner patch is good to go. Komuro
> seems to have acked it already, and in many ways that's the "minimal
> change" for 2.6.19 right now.
Well, I just double-checked this assertion. The one driver that uses
the hypertransport irqs doesn't call disable_irq. On the msi side
at least the forcedeth driver does call disable_irq when in msi mode.
I just double-checked the historical behavior of the msi code and
it has never done the delayed disable thing. So not doing it there
is not a regression.
The MSI case is different. MSI is fundamentally about non-shared
interrupts, and interrupts that don't race with your DMAs. So with
MSI you don't need a status register read to process the interrupt.
In the context of Ingo's patch I don't like the idea of saddling MSI
interrupts down with the best in class work arounds for a completely
different hardware interrupt model. Although I don't doubt MSI will
get its own set of work arounds as we come to know it better.
> I do like Ingo's patch because it seems "safe" (even if I think it might
> be a bit _overly_ safe), but it changes semantics enough that I don't like
> it for 2.6.19. Even his second version definitely changes semantics for
> level-triggered PCI interrupts, even though he fixed ExtInt/i8259 ones.
>
> So I think I'll go with your patch for now, and we can re-visit Ingo's
> thing after 2.6.19.
Sounds like a plan.
Eric
* Re: [patch] genirq: do not mask interrupts by default
2006-11-15 16:13 ` [patch] genirq: do not mask interrupts by default Linus Torvalds
@ 2006-11-15 17:46 ` Ingo Molnar
0 siblings, 0 replies; 91+ messages in thread
From: Ingo Molnar @ 2006-11-15 17:46 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, Eric W. Biederman, Komuro, tglx, Adrian Bunk,
Andrew Morton, Linux Kernel Mailing List
* Linus Torvalds <torvalds@osdl.org> wrote:
> On Wed, 15 Nov 2006, Ingo Molnar wrote:
> >
> > problem is, we dont know /for a fact/ that something is "APIC-edge".
> > We only know that the BIOS claims it that it's so.
>
> This is incorrect. We will have _programmed_ the APIC with whatever
> the BIOS said in the MP tables, so if we think it's level triggered,
> it _is_ level triggered.
yeah. I was thinking about the low 16 irqs (those are really the problem
spots most of the time, not the normal IO-APIC irqs) - which are routed
all across the southbridge and might end up being handled by a
i8259A-lookalike entity. Right now we default to level-triggered IRQ
flow handling:
if (i < 16) {
/*
* 16 old-style INTA-cycle interrupts:
*/
set_irq_chip_and_handler_name(i, &i8259A_chip,
handle_level_irq, "XT");
because that's the best we can do (it's also what our i8259 code did
historically). But it would be one step safer to also do the
lazy-disable. Just in case things might get lost while masked. Or is
that an absolutely horrible hardware breakage that i shouldnt worry
about?
> So I really think that all the arguments for i8259 not wanting replay
> weigh equally on level-triggered PCI irq's too.
>
> Now, the one thing that makes me think your approach is the right one
> is that it's potentially going to be better performance - if people
> disable irq's and the normal case is that no irq will actually happen,
> then optimistically not doing anything at all (except marking the irq
> disabled, of course) is always good.
>
> However, because it's a semantic change, I _really_ don't want to do
> it right now. We're maybe a week away from 2.6.19, and the "ISA irq's
> don't work" report is one of the things that is holding things up
> right now.
>
> So that's why I'd much rather go with Eric's patch for now - because
> it keeps the semantics that we've always had.
ok, i'm fine with Eric's patch too, if it solves Komuro's problem:
Acked-by: Ingo Molnar <mingo@elte.hu>
and we dont have to worry about the present ugliness of the
delayed-disabled flag either, as it would just go away in 2.6.20.
Ingo
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 16:48 ` [discuss] " Andi Kleen
@ 2006-11-15 18:39 ` Andrew Morton
2006-11-15 18:45 ` Andi Kleen
0 siblings, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-15 18:39 UTC (permalink / raw)
To: Andi Kleen
Cc: discuss, William Cohen, Eric Dumazet, Komuro, Ernst Herzberg,
Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el,
Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci,
Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan,
Linus Torvalds, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
On Wed, 15 Nov 2006 17:48:05 +0100
Andi Kleen <ak@suse.de> wrote:
>
> > OProfile has a simplistic view of the performance monitoring hardware. The
> > routines in libop/op_alloc_counter.c determine what set of performance registers
> > is available from the processor in use. There is no check to see what registers
> > are actually available in the /dev/oprofile directory.
> >
> > opcontrol executes ophelp to determine which specific counters to count which
> > events. The function map_event_to_counter() in libop/op_alloc_counter.c does the
> > actual selection. It seems what is needed is for map_event_to_counter() to check
> > to see which counters are available and mark the others as unavailable
>
> Thanks for the explanation. Can you please fix it and release a new version?
> Documentation/Changes could be adapted then.
>
Meanwhile we should restore the NMI counter to fix this bug.
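As a rough illustration of the fix William describes in the quoted text (have map_event_to_counter() look at which counters actually exist under /dev/oprofile before assigning events), here is a small Python model. It is a sketch under stated assumptions, not the libop code: available_counters(), map_events_to_counters() and the per-event sets of usable counters are hypothetical names and data. The only thing taken from the thread is that counters show up as numbered directories under /dev/oprofile.

```python
import os


def available_counters(oprofile_root="/dev/oprofile"):
    """Return the counter numbers that exist as numbered directories."""
    try:
        names = os.listdir(oprofile_root)
    except OSError:
        return []  # oprofilefs not mounted or not readable
    return sorted(
        int(name)
        for name in names
        if name.isdecimal() and os.path.isdir(os.path.join(oprofile_root, name))
    )


def map_events_to_counters(event_masks, available):
    """Assign one distinct available counter to each event, or return None.

    event_masks is a list of sets; each set holds the counter numbers
    (hypothetical data) that the corresponding event can be counted on.
    """
    available = set(available)
    assignment = []

    def place(index, used):
        # Simple backtracking: try each usable free counter for this event.
        if index == len(event_masks):
            return True
        candidates = (event_masks[index] & available) - used
        for counter in sorted(candidates):
            assignment.append(counter)
            if place(index + 1, used | {counter}):
                return True
            assignment.pop()
        return False

    return assignment if place(0, set()) else None
```

With a check like this, an event that can only run on a counter the kernel does not expose would be rejected up front instead of failing later when the per-counter files are written.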
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 18:39 ` Andrew Morton
@ 2006-11-15 18:45 ` Andi Kleen
2006-11-15 19:07 ` Linus Torvalds
0 siblings, 1 reply; 91+ messages in thread
From: Andi Kleen @ 2006-11-15 18:45 UTC (permalink / raw)
To: Andrew Morton
Cc: discuss, William Cohen, Eric Dumazet, Komuro, Ernst Herzberg,
Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el,
Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci,
Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan,
Linus Torvalds, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
On Wednesday 15 November 2006 19:39, Andrew Morton wrote:
> On Wed, 15 Nov 2006 17:48:05 +0100
> Andi Kleen <ak@suse.de> wrote:
>
> >
> > > OProfile has a simplistic view of the performance monitoring hardware. The
> > > routines in libop/op_alloc_counter.c determine what set of performance registers
> > > is available from the processor in use. There is no check to see what registers
> > > are actually available in the /dev/oprofile directory.
> > >
> > > opcontrol executes ophelp to determine which specific counters to count which
> > > events. The function map_event_to_counter() in libop/op_alloc_counter.c does the
> > > actual selection. It seems what is needed is for map_event_to_counter() to check
> > > to see which counters are available and mark the others as unavailable
> >
> > Thanks for the explanation. Can you please fix it and release a new version?
> > Documentation/Changes could be adapted then.
> >
>
> Meanwhile we should restore the NMI counter to fix this bug.
No, it was always oprofile who was buggy here, silently taking
the nmi watchdog away.
-Andi
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 18:45 ` Andi Kleen
@ 2006-11-15 19:07 ` Linus Torvalds
2006-11-15 19:23 ` Andi Kleen
0 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-15 19:07 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, discuss, William Cohen, Eric Dumazet, Komuro,
Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
On Wed, 15 Nov 2006, Andi Kleen wrote:
> >
> > Meanwhile we should restore the NMI counter to fix this bug.
>
> No, it was always oprofile who was buggy here, silently taking
> the nmi watchdog away.
Andi, your "blame game" doesn't matter.
The fact is, it used to work, and the kernel changed interfaces, so now it
doesn't.
In other words, a kernel interface to user land changed. THAT IS ALWAYS A
BUG. We don't change UI.
Yes, "oprofile" should be fixed to not depend on that, but the kernel
shouldn't change the interfaces, and we should add back the zero entry.
Linus
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 19:07 ` Linus Torvalds
@ 2006-11-15 19:23 ` Andi Kleen
2006-11-15 20:21 ` Andrew Morton
0 siblings, 1 reply; 91+ messages in thread
From: Andi Kleen @ 2006-11-15 19:23 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, discuss, William Cohen, Eric Dumazet, Komuro,
Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
> The fact is, it used to work, and the kernel changed interfaces, so now it
> doesn't.
No, it didn't work. oprofile may have done something, but it
just silently killed the NMI watchdog in the process.
That was never acceptable.
Now we do proper accounting of NMI sources and also proper allocation
of performance counters.
> Yes, "oprofile" should be fixed to not depend on that, but the kernel
> shouldn't change the interfaces, and we should add back the zero entry.
That would break the nmi watchdog again.
Anyways, there is a sysctl to disable the nmi watchdog if someone
is desperate.
But I think it is clearly oprofile who did wrong here and needs
to be fixed.
-Andi
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 19:23 ` Andi Kleen
@ 2006-11-15 20:21 ` Andrew Morton
2006-11-15 21:18 ` Eric W. Biederman
2006-11-16 3:21 ` Andi Kleen
0 siblings, 2 replies; 91+ messages in thread
From: Andrew Morton @ 2006-11-15 20:21 UTC (permalink / raw)
To: Andi Kleen
Cc: Linus Torvalds, discuss, William Cohen, Eric Dumazet, Komuro,
Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
On Wed, 15 Nov 2006 20:23:53 +0100
Andi Kleen <ak@suse.de> wrote:
>
> > The fact is, it used to work, and the kernel changed interfaces, so now it
> > doesn't.
>
> No, it didn't work. oprofile may have done something, but it
> just silently killed the NMI watchdog in the process.
> That was never acceptable.
But people could get profiles out. I know, I've seen them!
> Now we do proper accounting of NMI sources and also proper allocation
> of performance counters.
>
>
> > Yes, "oprofile" should be fixed to not depend on that, but the kernel
> > shouldn't change the interfaces, and we should add back the zero entry.
>
> That would break the nmi watchdog again.
>
> Anyways, there is a sysctl to disable the nmi watchdog if someone
> is desperate.
>
> But I think it is clearly oprofile who did wrong here and needs
> to be fixed.
>
Is it correct to say that oprofile-on-2.6.18 works, and that
oprofile-on-2.6.19-rc5 does not?
Or is there some sort of workaround for this, or does 2.6.19-rc5 only fail
in some particular scenarios?
If it's really true that oprofile is simply busted then that's a serious
problem and we should find some way of unbusting it. If that means just
adding a dummy "0" entry which always returns zero or something like that,
then fine.
But we can't just go and bust it.
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 20:21 ` Andrew Morton
@ 2006-11-15 21:18 ` Eric W. Biederman
2006-11-15 21:31 ` Andrew Morton
2006-11-16 3:21 ` Andi Kleen
1 sibling, 1 reply; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-15 21:18 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Linus Torvalds, discuss, William Cohen, Komuro,
Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, gregkh, Linux Kernel Mailing List, Andrey Borzenkov
Andrew Morton <akpm@osdl.org> writes:
> Is it correct to say that oprofile-on-2.6.18 works, and that
> oprofile-on-2.6.19-rc5 does not?
>
> Or is there some sort of workaround for this, or does 2.6.19-rc5 only fail
> in some particular scenarios?
>
> If it's really true that oprofile is simply busted then that's a serious
> problem and we should find some way of unbusting it. If that means just
> adding a dummy "0" entry which always returns zero or something like that,
> then fine.
>
> But we can't just go and bust it.
The simple question: if we turn off the NMI watchdog on 2.6.19-rc5,
does oprofile work? I believe that is what Andi said.
The description I read was a resource conflict. The resources oprofile
just expects it can use are already in use, so we tell it no and
the user space oprofile doesn't cope.
Now I don't know whether the interface allows us to rename the interfaces
from 1 2 3 to 0 1 2. If we can, then that looks like something we can
fix. Otherwise, from the description, I tend to agree with Andi.
The user space application assumed it owned hardware that it did not.
Hmm. I bet if nothing else we could move the NMI watchdog from 0 to 3
and make things work that way...
Eric
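The two ideas in this message (renumbering the exposed counters from 1 2 3 to 0 1 2, or moving the NMI watchdog from counter 0 to counter 3) can be pictured with a tiny Python model. This is only a sketch: logical_to_physical() is a hypothetical helper, and the figure of four counters is an assumption read off the "1 2 3" in the text, not something the thread states.

```python
def logical_to_physical(num_counters, reserved):
    """Map logical counter names 0, 1, 2, ... to the free physical counters.

    num_counters is the number of physical counters (assumed), and
    reserved is the set of physical counters held by the NMI watchdog.
    """
    free = [c for c in range(num_counters) if c not in reserved]
    return dict(enumerate(free))


# Watchdog on physical counter 0: user space sees 0 1 2, backed by 1 2 3.
renamed = logical_to_physical(4, {0})

# Watchdog moved to physical counter 3: user space sees 0 1 2 unchanged.
moved = logical_to_physical(4, {3})
```

Either way user space would again find an entry named 0. Whether renumbering is safe depends on whether events are tied to specific physical counters, which the thread does not settle.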
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 21:18 ` Eric W. Biederman
@ 2006-11-15 21:31 ` Andrew Morton
2006-11-16 10:55 ` Mikael Pettersson
0 siblings, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-15 21:31 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Andi Kleen, Linus Torvalds, discuss, William Cohen, Komuro,
Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, gregkh, Linux Kernel Mailing List, Andrey Borzenkov
On Wed, 15 Nov 2006 14:18:24 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:
> Andrew Morton <akpm@osdl.org> writes:
>
> > Is it correct to say that oprofile-on-2.6.18 works, and that
> > oprofile-on-2.6.19-rc5 does not?
> >
> > Or is there some sort of workaround for this, or does 2.6.19-rc5 only fail
> > in some particular scenarios?
> >
> > If it's really true that oprofile is simply busted then that's a serious
> > problem and we should find some way of unbusting it. If that means just
> > adding a dummy "0" entry which always returns zero or something like that,
> > then fine.
> >
> > But we can't just go and bust it.
>
> The simple question: if we turn off the NMI watchdog on 2.6.19-rc5,
> does oprofile work? I believe that is what Andi said.
>
> The description I read was a resource conflict. The resources oprofile
> just expects it can use are already in use, so we tell it no and
> the user space oprofile doesn't cope.
That would have been a bug in earlier kernels.
> Now I don't know whether the interface allows us to rename the interfaces
> from 1 2 3 to 0 1 2. If we can, then that looks like something we can
> fix. Otherwise, from the description, I tend to agree with Andi.
>
> The user space application assumed it owned hardware that it did not.
>
> Hmm. I bet if nothing else we could move the NMI watchdog from 0 to 3
> and make things work that way...
Surely the appropriate behaviour is to allow oprofile to steal the NMI and
to then put the NMI back to doing the watchdog thing after oprofile has
finished with it.
If that's not a feasible thing to do for 2.6.19 then some short-term
hack which makes oprofile work again is needed.
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 11:06 ` Brice Goglin
@ 2006-11-15 22:32 ` Adrian Bunk
0 siblings, 0 replies; 91+ messages in thread
From: Adrian Bunk @ 2006-11-15 22:32 UTC (permalink / raw)
To: Brice Goglin; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List
On Wed, Nov 15, 2006 at 12:06:22PM +0100, Brice Goglin wrote:
> Adrian Bunk wrote:
> > Subject : unable to rip cd
> > References : http://lkml.org/lkml/2006/10/13/100
> > http://lkml.org/lkml/2006/11/8/42
> > Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> > Handled-By : Jens Axboe <jens.axboe@oracle.com>
> > Status : Jens is investigating
>
> I think this one is already fixed.
Thanks for this information (Jens already told me the same).
> Brice
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 20:21 ` Andrew Morton
2006-11-15 21:18 ` Eric W. Biederman
@ 2006-11-16 3:21 ` Andi Kleen
2006-11-16 5:05 ` Andrew Morton
1 sibling, 1 reply; 91+ messages in thread
From: Andi Kleen @ 2006-11-16 3:21 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Linus Torvalds, discuss, William Cohen, Eric Dumazet,
Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
On Wed, Nov 15, 2006 at 12:21:18PM -0800, Andrew Morton wrote:
> Andi Kleen <ak@suse.de> wrote:
>
> >
> > > The fact is, it used to work, and the kernel changed interfaces, so now it
> > > doesn't.
> >
> > No, it didn't work. oprofile may have done something, but it
> > just silently killed the NMI watchdog in the process.
> > That was never acceptable.
>
> But people could get profiles out. I know, I've seen them!
Just the nmi watchdog was gone then.
>
> > Now we do proper accounting of NMI sources and also proper allocation
> > of performance counters.
> >
> >
> > > Yes, "oprofile" should be fixed to not depend on that, but the kernel
> > > shouldn't change the interfaces, and we should add back the zero entry.
> >
> > That would break the nmi watchdog again.
> >
> > Anyways, there is a sysctl to disable the nmi watchdog if someone
> > is desperate.
> >
> > But I think it is clearly oprofile who did wrong here and needs
> > to be fixed.
> >
>
> Is it correct to say that oprofile-on-2.6.18 works, and that
> oprofile-on-2.6.19-rc5 does not?
>
> Or is there some sort of workaround for this, or does 2.6.19-rc5 only fail
echo 0 > /proc/sys/kernel/nmi_watchdog before the oprofile module is loaded.
With built-in oprofile, probably boot with nmi_watchdog=0.
> in some particular scenarios?
On x86-64 and on newer i386 machines (based on DMI year)
>
> If it's really true that oprofile is simply busted then that's a serious
> problem and we should find some way of unbusting it. If that means just
> adding a dummy "0" entry which always returns zero or something like that,
> then fine.
That could probably be done.
> But we can't just go and bust it.
It just did something unbelievably broken before. I would say it busted
itself.
-Andi
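For anyone scripting the sysctl workaround given above, a minimal Python sketch follows. The path /proc/sys/kernel/nmi_watchdog comes from this message; the function name disable_nmi_watchdog() is hypothetical, and the sketch assumes it runs as root before the oprofile module is loaded.

```python
def disable_nmi_watchdog(sysctl_path="/proc/sys/kernel/nmi_watchdog"):
    """Write 0 to the sysctl file and return the previous setting as text."""
    with open(sysctl_path) as f:
        previous = f.read().strip()
    with open(sysctl_path, "w") as f:
        f.write("0\n")
    return previous
```

Returning the old value lets a wrapper script write it back once profiling is finished.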
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 3:21 ` Andi Kleen
@ 2006-11-16 5:05 ` Andrew Morton
2006-11-16 7:04 ` Andi Kleen
0 siblings, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-16 5:05 UTC (permalink / raw)
To: Andi Kleen
Cc: Linus Torvalds, discuss, William Cohen, Eric Dumazet, Komuro,
Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
On Thu, 16 Nov 2006 04:21:09 +0100
Andi Kleen <ak@suse.de> wrote:
> >
> > If it's really true that oprofile is simply busted then that's a serious
> > problem and we should find some way of unbusting it. If that means just
> > adding a dummy "0" entry which always returns zero or something like that,
> > then fine.
>
> That could probably be done.
I'm told that this is exactly what it was doing before it got changed.
> > But we can't just go and bust it.
>
> It just did something unbelievably broken before.
What did it do?
> I would say it busted
> itself.
It gave profiles, which was fairly handy.
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 5:05 ` Andrew Morton
@ 2006-11-16 7:04 ` Andi Kleen
2006-11-16 15:34 ` William Cohen
0 siblings, 1 reply; 91+ messages in thread
From: Andi Kleen @ 2006-11-16 7:04 UTC (permalink / raw)
To: Andrew Morton
Cc: Linus Torvalds, discuss, William Cohen, Eric Dumazet, Komuro,
Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
On Thursday 16 November 2006 06:05, Andrew Morton wrote:
> On Thu, 16 Nov 2006 04:21:09 +0100
> Andi Kleen <ak@suse.de> wrote:
>
> > >
> > > If it's really true that oprofile is simply busted then that's a serious
> > > problem and we should find some way of unbusting it. If that means just
> > > adding a dummy "0" entry which always returns zero or something like that,
> > > then fine.
> >
> > That could probably be done.
>
> I'm told that this is exactly what it was doing before it got changed.
Hmm, ok perhaps that can be arranged again.
The trouble is that I want to use this performance counter for
other purposes too, so we would run into trouble again
if oprofile keeps stealing it.
> > > But we can't just go and bust it.
> >
> > It just did something unbelievably broken before.
>
> What did it do?
Silently kill the nmi watchdog.
>
> > I would say it busted
> > itself.
>
> It gave profiles, which was fairly handy.
I'm sure it can be fixed there. Ok ok I keep sounding like a sysfs maintainer
now @)
-Andi
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 21:31 ` Andrew Morton
@ 2006-11-16 10:55 ` Mikael Pettersson
2006-11-16 20:23 ` Andrew Morton
0 siblings, 1 reply; 91+ messages in thread
From: Mikael Pettersson @ 2006-11-16 10:55 UTC (permalink / raw)
To: Andrew Morton
Cc: Eric W. Biederman, Andi Kleen, Linus Torvalds, discuss,
William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list,
Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar,
Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor,
Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List,
Andrey Borzenkov
Andrew Morton writes:
> Surely the appropriate behaviour is to allow oprofile to steal the NMI and
> to then put the NMI back to doing the watchdog thing after oprofile has
> finished with it.
Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
the in-kernel API allowing real performance counter drivers like
oprofile (and perfctr) to claim the HW from the NMI watchdog,
do their work, and then release it which resumed the watchdog.
Note that oprofile (and perfctr) didn't do anything behind the
NMI watchdog's back. They went via the API. Nothing dodgy going on.
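The claim/release behaviour described here can be modelled in a few lines of Python. This is a toy sketch of the intended semantics only: the class and method names are hypothetical and do not reproduce the in-kernel API, and it says nothing about whether the watchdog really resumed in practice, which Andi disputes elsewhere in the thread.

```python
class LapicNmiModel:
    """Toy model of one NMI source shared between the watchdog and a profiler."""

    def __init__(self):
        self.owner = "watchdog"  # the watchdog holds the hardware by default

    def reserve(self, client):
        """Hand the hardware to client; fail if another client already has it."""
        if self.owner != "watchdog":
            return False
        self.owner = client
        return True

    def release(self, client):
        """Give the hardware back, which resumes the watchdog."""
        if self.owner != client:
            raise ValueError(f"{client} does not hold the NMI hardware")
        self.owner = "watchdog"

    def watchdog_running(self):
        return self.owner == "watchdog"
```

In this model a profiler that calls reserve() and later release() leaves the watchdog exactly as it found it, which is the behaviour Andrew asks for.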
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 7:04 ` Andi Kleen
@ 2006-11-16 15:34 ` William Cohen
2006-11-16 15:47 ` Andi Kleen
2006-11-16 21:32 ` Stephane Eranian
0 siblings, 2 replies; 91+ messages in thread
From: William Cohen @ 2006-11-16 15:34 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Linus Torvalds, discuss, Eric Dumazet, Komuro,
Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
Andi Kleen wrote:
> On Thursday 16 November 2006 06:05, Andrew Morton wrote:
>
>>On Thu, 16 Nov 2006 04:21:09 +0100
>>Andi Kleen <ak@suse.de> wrote:
>>
>>
>>>>If it's really true that oprofile is simply busted then that's a serious
>>>>problem and we should find some way of unbusting it. If that means just
>>>>adding a dummy "0" entry which always returns zero or something like that,
>>>>then fine.
>>>
>>>That could probably be done.
>>
>>I'm told that this is exactly what it was doing before it got changed.
>
>
> Hmm, ok perhaps that can be arranged again.
>
> The trouble is that I want to use this performance counter for
> other purposes too, so we would run into trouble again
> if oprofile keeps stealing it.
What other purposes do you see the performance counters useful for? To collect
information on process characteristics so they can be scheduled more efficiently?
Is this going to require sharing the nmi interrupt and knowing which perfcounter
register triggered the interrupt to get the correct action? Currently the
oprofile interrupt handler assumes any performance monitoring counter it sees
overflowing is something it should count.
-Will
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 15:34 ` William Cohen
@ 2006-11-16 15:47 ` Andi Kleen
2006-11-16 21:32 ` Stephane Eranian
1 sibling, 0 replies; 91+ messages in thread
From: Andi Kleen @ 2006-11-16 15:47 UTC (permalink / raw)
To: William Cohen
Cc: Andrew Morton, Linus Torvalds, discuss, Eric Dumazet, Komuro,
Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
Alex Romosan, gregkh, Linux Kernel Mailing List,
Eric W. Biederman, Andrey Borzenkov
> What other purposes do you see the performance counters useful for?
Export one to user space as a cycle counter for benchmarking. RDTSC doesn't
do this job anymore.
> To collect information on process characteristics so they can be scheduled more efficiently?
That might happen at some point in the future, but I would expect
us to wait for CPUs with more performance counters first.
> Is this going to require sharing the nmi interrupt and knowing which perfcounter
> register triggered the interrupt to get the correct action? Currently the
> oprofile interrupt handler assumes any performance monitoring counter it sees
> overflowing is something it should count.
Yes. That needs to be fixed.
-Andi
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 10:55 ` Mikael Pettersson
@ 2006-11-16 20:23 ` Andrew Morton
2006-11-17 9:59 ` Mikael Pettersson
0 siblings, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-16 20:23 UTC (permalink / raw)
To: Mikael Pettersson
Cc: Eric W. Biederman, Andi Kleen, Linus Torvalds, discuss,
William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list,
Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar,
Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor,
Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List,
Andrey Borzenkov
On Thu, 16 Nov 2006 11:55:46 +0100
Mikael Pettersson <mikpe@it.uu.se> wrote:
> Andrew Morton writes:
> > Surely the appropriate behaviour is to allow oprofile to steal the NMI and
> > to then put the NMI back to doing the watchdog thing after oprofile has
> > finished with it.
>
> Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
> the in-kernel API allowing real performance counter drivers like
> oprofile (and perfctr) to claim the HW from the NMI watchdog,
> do their work, and then release it which resumed the watchdog.
OK. But from Andi's comments it seems that the NMI watchdog was failing to
resume its operation.
> Note that oprofile (and perfctr) didn't do anything behind the
> NMI watchdog's back. They went via the API. Nothing dodgy going on.
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 15:34 ` William Cohen
2006-11-16 15:47 ` Andi Kleen
@ 2006-11-16 21:32 ` Stephane Eranian
1 sibling, 0 replies; 91+ messages in thread
From: Stephane Eranian @ 2006-11-16 21:32 UTC (permalink / raw)
To: William Cohen
Cc: Andi Kleen, Andrew Morton, Komuro, Ernst Herzberg, Andre Noll,
oprofile-list, Jens Axboe, Adrian Bunk, linux-usb-devel, phil.el,
Eric Dumazet, Ingo Molnar, Alan Stern, linux-pci, Prakash Punnoor,
Eric W. Biederman, Len Brown, Alex Romosan, Linus Torvalds,
discuss, gregkh, Linux Kernel Mailing List, Stephen Hemminger,
Andrey Borzenkov
Hello,
On Thu, Nov 16, 2006 at 10:34:56AM -0500, William Cohen wrote:
>
> Is this going to require sharing the nmi interrupt and knowing which perfcounter
> register triggered the interrupt to get the correct action? Currently the
> oprofile interrupt handler assumes any performance monitoring counter it sees
> overflowing is something it should count.
>
Yes, you need to share the NMI interrupt. In my next perfmon patch you will
see that this can be made to work. You just need to add one check in the
NMI handler callback: is it for me or else try perfmon? Perfmon can auto-detect
if NMI is active and give up the right counter (there is an API to check
what is reserved). The interface propagates the list of available counters
to apps which then pass the information onto libpfm which tries to use
the remaining counters.
--
-Stephane
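The "is it for me, or else try the next user" check described in this message can be sketched as a small dispatch routine in Python. It is an illustrative model only: dispatch_nmi(), the handler names and the counter ownership sets are hypothetical, and real NMI handlers are kernel C code, not Python.

```python
def dispatch_nmi(overflowed, handlers):
    """Offer each overflowed counter to the handlers in order.

    handlers is a list of (name, owned_counters) pairs; the first handler
    that owns a counter takes it, otherwise the counter is reported as
    unclaimed under the key None.
    """
    handled = {}
    for counter in sorted(overflowed):
        for name, owned in handlers:
            if counter in owned:
                handled.setdefault(name, []).append(counter)
                break
        else:
            # No handler owns this counter.
            handled.setdefault(None, []).append(counter)
    return handled
```

The point of the ownership test is that a profiler no longer treats every overflowing counter as its own, which is the assumption William identifies in the current oprofile interrupt handler.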
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 20:23 ` Andrew Morton
@ 2006-11-17 9:59 ` Mikael Pettersson
2006-11-17 10:13 ` Andrew Morton
2006-11-17 10:29 ` Andi Kleen
0 siblings, 2 replies; 91+ messages in thread
From: Mikael Pettersson @ 2006-11-17 9:59 UTC (permalink / raw)
To: Andrew Morton
Cc: Mikael Pettersson, Eric W. Biederman, Andi Kleen, Linus Torvalds,
discuss, William Cohen, Komuro, Ernst Herzberg, Andre Noll,
oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk,
Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger,
Prakash Punnoor, Len Brown, Alex Romosan, gregkh,
Linux Kernel Mailing List, Andrey Borzenkov
Andrew Morton writes:
> On Thu, 16 Nov 2006 11:55:46 +0100
> Mikael Pettersson <mikpe@it.uu.se> wrote:
>
> > Andrew Morton writes:
> > > Surely the appropriate behaviour is to allow oprofile to steal the NMI and
> > > to then put the NMI back to doing the watchdog thing after oprofile has
> > > finished with it.
> >
> > Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
> > the in-kernel API allowing real performance counter drivers like
> > oprofile (and perfctr) to claim the HW from the NMI watchdog,
> > do their work, and then release it which resumed the watchdog.
>
> OK. But from Andi's comments it seems that the NMI watchdog was failing to
> resume its operation.
It certainly worked when I originally implemented it. If it didn't work
that way before 2.6.19-rc1 butchered it then that would have been a bug
that should have been fixed.
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-17 9:59 ` Mikael Pettersson
@ 2006-11-17 10:13 ` Andrew Morton
2006-11-19 3:05 ` Bill Davidsen
2006-11-17 10:29 ` Andi Kleen
1 sibling, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-17 10:13 UTC (permalink / raw)
To: Mikael Pettersson
Cc: Eric W. Biederman, Andi Kleen, Linus Torvalds, discuss,
William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list,
Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar,
Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor,
Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List,
Andrey Borzenkov
On Fri, 17 Nov 2006 10:59:07 +0100
Mikael Pettersson <mikpe@it.uu.se> wrote:
> Andrew Morton writes:
> > On Thu, 16 Nov 2006 11:55:46 +0100
> > Mikael Pettersson <mikpe@it.uu.se> wrote:
> >
> > > Andrew Morton writes:
> > > > Surely the appropriate behaviour is to allow oprofile to steal the NMI and
> > > > to then put the NMI back to doing the watchdog thing after oprofile has
> > > > finished with it.
> > >
> > > Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
> > > the in-kernel API allowing real performance counter drivers like
> > > oprofile (and perfctr) to claim the HW from the NMI watchdog,
> > > do their work, and then release it which resumed the watchdog.
> >
> > OK. But from Andi's comments it seems that the NMI watchdog was failing to
> > resume its operation.
>
> It certainly worked when I originally implemented it. If it didn't work
> that way before 2.6.19-rc1 butchered it then that would have been a bug
> that should have been fixed.
Oh. OK.
Meanwhile, 2.6.19-rc6 remains unfixed.
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-17 9:59 ` Mikael Pettersson
2006-11-17 10:13 ` Andrew Morton
@ 2006-11-17 10:29 ` Andi Kleen
1 sibling, 0 replies; 91+ messages in thread
From: Andi Kleen @ 2006-11-17 10:29 UTC (permalink / raw)
To: Mikael Pettersson
Cc: Andrew Morton, Eric W. Biederman, Linus Torvalds, discuss,
William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list,
Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar,
Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor,
Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List,
Andrey Borzenkov
On Friday 17 November 2006 10:59, Mikael Pettersson wrote:
> It certainly worked when I originally implemented it.
I don't think so. The NMI watchdog never recovered, regardless of whether
oprofile used the counter.
-Andi
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-17 10:13 ` Andrew Morton
@ 2006-11-19 3:05 ` Bill Davidsen
0 siblings, 0 replies; 91+ messages in thread
From: Bill Davidsen @ 2006-11-19 3:05 UTC (permalink / raw)
To: Andrew Morton
Cc: Eric W. Biederman, Andi Kleen, Linus Torvalds, discuss,
William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list,
Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar,
Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor,
Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List,
Andrey Borzenkov
Andrew Morton wrote:
> On Fri, 17 Nov 2006 10:59:07 +0100
> Mikael Pettersson <mikpe@it.uu.se> wrote:
>
>> Andrew Morton writes:
>> > On Thu, 16 Nov 2006 11:55:46 +0100
>> > Mikael Pettersson <mikpe@it.uu.se> wrote:
>> >
>> > > Andrew Morton writes:
>> > > > Surely the appropriate behaviour is to allow oprofile to steal the NMI and
>> > > > to then put the NMI back to doing the watchdog thing after oprofile has
>> > > > finished with it.
>> > >
>> > > Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
>> > > the in-kernel API allowing real performance counter drivers like
>> > > oprofile (and perfctr) to claim the HW from the NMI watchdog,
>> > > do their work, and then release it which resumed the watchdog.
>> >
>> > OK. But from Andi's comments it seems that the NMI watchdog was failing to
>> > resume its operation.
>>
>> It certainly worked when I originally implemented it. If it didn't work
>> that way before 2.6.19-rc1 butchered it then that would have been a bug
>> that should have been fixed.
>
> Oh. OK.
>
> Meanwhile, 2.6.19-rc6 remains unfixed.
>
Has anyone verified that nmi watchdog works at all in 2.6.19-rc6? I
haven't built a kernel since rc2, other things have been taking my time.
--
Bill Davidsen <davidsen@tmr.com>
Obscure bug of 2004: BASH BUFFER OVERFLOW - if bash is being run by a
normal user and is setuid root, with the "vi" line edit mode selected,
and the character set is "big5," an off-by-one error occurs during
wildcard (glob) expansion.
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:35 ` Eric Dumazet
2006-11-15 10:50 ` Andi Kleen
@ 2006-11-22 10:28 ` Eric Dumazet
2006-11-22 10:36 ` Andi Kleen
` (2 more replies)
1 sibling, 3 replies; 91+ messages in thread
From: Eric Dumazet @ 2006-11-22 10:28 UTC (permalink / raw)
To: Adrian Bunk
Cc: Andrew Morton, Linux Kernel Mailing List, Stephen Hemminger,
gregkh, Ingo Molnar, Len Brown, Andi Kleen, phil.el,
oprofile-list
On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
> On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
> > Subject : x86_64: oprofile doesn't work
> > References : http://lkml.org/lkml/2006/10/27/3
> > Submitter : Prakash Punnoor <prakash@punnoor.de>
> > Status : unknown
>
I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.
# opcontrol --setup --event=RESOURCE_STALLS:1000 --vmlinux=$VMFILE
# opcontrol --start
/usr/bin/opcontrol: line 911: /dev/oprofile/0/enabled: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/event: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/count: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/kernel: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/user: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/unit_mask: No such file or directory
Using 2.6+ OProfile kernel interface.
Reading module info.
Using log file /var/lib/oprofile/oprofiled.log
Daemon started.
Profiler running.
# ls -l /dev/oprofile/
total 0
drwxr-xr-x 1 root root 0 Nov 22 11:18 1
-rw-r--r-- 1 root root 0 Nov 22 11:18 backtrace_depth
-rw-r--r-- 1 root root 0 Nov 22 11:18 buffer
-rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_size
-rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_watershed
-rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_buffer_size
-rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_type
-rw-rw-rw- 1 root root 0 Nov 22 11:18 dump
-rw-r--r-- 1 root root 0 Nov 22 11:18 enable
-rw-r--r-- 1 root root 0 Nov 22 11:18 pointer_size
drwxr-xr-x 1 root root 0 Nov 22 11:18 stats
# dmesg | grep oprofile
oprofile: using NMI interrupt.
# opcontrol --version
opcontrol: oprofile 0.9.2 compiled on Nov 22 2006 11:24:09
Eric
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 10:28 ` Eric Dumazet
@ 2006-11-22 10:36 ` Andi Kleen
2006-11-22 18:42 ` Andrew Morton
2006-11-22 17:59 ` William Cohen
2006-11-22 18:05 ` William Cohen
2 siblings, 1 reply; 91+ messages in thread
From: Andi Kleen @ 2006-11-22 10:36 UTC (permalink / raw)
To: Eric Dumazet
Cc: Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
Stephen Hemminger, gregkh, Ingo Molnar, Len Brown, phil.el,
oprofile-list
On Wednesday 22 November 2006 11:28, Eric Dumazet wrote:
> On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
> > On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
> > > Subject : x86_64: oprofile doesn't work
> > > References : http://lkml.org/lkml/2006/10/27/3
> > > Submitter : Prakash Punnoor <prakash@punnoor.de>
> > > Status : unknown
> >
>
> I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.
oprofile is still broken because it cannot deal with the lack of perfctr 0.
You can disable the nmi watchdog as a workaround.
-Andi
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 10:28 ` Eric Dumazet
2006-11-22 10:36 ` Andi Kleen
@ 2006-11-22 17:59 ` William Cohen
2006-11-22 18:05 ` William Cohen
2 siblings, 0 replies; 91+ messages in thread
From: William Cohen @ 2006-11-22 17:59 UTC (permalink / raw)
To: Eric Dumazet
Cc: Adrian Bunk, Andrew Morton, Len Brown, phil.el, gregkh,
Linux Kernel Mailing List, Andi Kleen, Ingo Molnar, oprofile-list,
Stephen Hemminger
Eric Dumazet wrote:
> On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
>
>>On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
>>
>>>Subject : x86_64: oprofile doesn't work
>>>References : http://lkml.org/lkml/2006/10/27/3
>>>Submitter : Prakash Punnoor <prakash@punnoor.de>
>>>Status : unknown
>>
>
> I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.
>
> # opcontrol --setup --event=RESOURCE_STALLS:1000 --vmlinux=$VMFILE
> # opcontrol --start
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/enabled: No such file or
> directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/event: No such file or directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/count: No such file or directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/kernel: No such file or
> directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/user: No such file or directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/unit_mask: No such file or
> directory
> Using 2.6+ OProfile kernel interface.
> Reading module info.
> Using log file /var/lib/oprofile/oprofiled.log
> Daemon started.
> Profiler running.
>
> # ls -l /dev/oprofile/
> total 0
> drwxr-xr-x 1 root root 0 Nov 22 11:18 1
> -rw-r--r-- 1 root root 0 Nov 22 11:18 backtrace_depth
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_size
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_watershed
> -rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_buffer_size
> -rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_type
> -rw-rw-rw- 1 root root 0 Nov 22 11:18 dump
> -rw-r--r-- 1 root root 0 Nov 22 11:18 enable
> -rw-r--r-- 1 root root 0 Nov 22 11:18 pointer_size
> drwxr-xr-x 1 root root 0 Nov 22 11:18 stats
> # dmesg | grep oprofile
> oprofile: using NMI interrupt.
> # opcontrol --version
> opcontrol: oprofile 0.9.2 compiled on Nov 22 2006 11:24:09
>
> Eric
Could you try the patch that I posted on the oprofile mailing list last week
(November 17, 2006) for op_alloc_counter.c and see if that resolves the problem
you are having?
http://sourceforge.net/mailarchive/message.php?msg_id=37316102
-Will
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 10:28 ` Eric Dumazet
2006-11-22 10:36 ` Andi Kleen
2006-11-22 17:59 ` William Cohen
@ 2006-11-22 18:05 ` William Cohen
2006-11-22 18:26 ` Eric Dumazet
2 siblings, 1 reply; 91+ messages in thread
From: William Cohen @ 2006-11-22 18:05 UTC (permalink / raw)
To: Eric Dumazet
Cc: Adrian Bunk, Andrew Morton, Len Brown, phil.el, gregkh,
Linux Kernel Mailing List, Andi Kleen, Ingo Molnar, oprofile-list,
Stephen Hemminger
Eric Dumazet wrote:
> On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
>
>>On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
>>
>>>Subject : x86_64: oprofile doesn't work
>>>References : http://lkml.org/lkml/2006/10/27/3
>>>Submitter : Prakash Punnoor <prakash@punnoor.de>
>>>Status : unknown
>>
>
> I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.
>
> # opcontrol --setup --event=RESOURCE_STALLS:1000 --vmlinux=$VMFILE
> # opcontrol --start
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/enabled: No such file or
> directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/event: No such file or directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/count: No such file or directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/kernel: No such file or
> directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/user: No such file or directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/unit_mask: No such file or
> directory
> Using 2.6+ OProfile kernel interface.
> Reading module info.
> Using log file /var/lib/oprofile/oprofiled.log
> Daemon started.
> Profiler running.
>
> # ls -l /dev/oprofile/
> total 0
> drwxr-xr-x 1 root root 0 Nov 22 11:18 1
> -rw-r--r-- 1 root root 0 Nov 22 11:18 backtrace_depth
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_size
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_watershed
> -rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_buffer_size
> -rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_type
> -rw-rw-rw- 1 root root 0 Nov 22 11:18 dump
> -rw-r--r-- 1 root root 0 Nov 22 11:18 enable
> -rw-r--r-- 1 root root 0 Nov 22 11:18 pointer_size
> drwxr-xr-x 1 root root 0 Nov 22 11:18 stats
> # dmesg | grep oprofile
> oprofile: using NMI interrupt.
> # opcontrol --version
> opcontrol: oprofile 0.9.2 compiled on Nov 22 2006 11:24:09
>
> Eric
You will also need another patch, checked into the oprofile CVS last week and
mentioned here:
http://sourceforge.net/mailarchive/message.php?msg_id=35422937
-Will
[-- Attachment #2: opalloc.diff --]
[-- Type: text/x-patch, Size: 538 bytes --]
Index: libop/op_alloc_counter.c
===================================================================
RCS file: /cvsroot/oprofile/oprofile/libop/op_alloc_counter.c,v
retrieving revision 1.6
diff -u -r1.6 op_alloc_counter.c
--- libop/op_alloc_counter.c 1 Oct 2003 21:53:46 -0000 1.6
+++ libop/op_alloc_counter.c 17 Nov 2006 17:03:04 -0000
@@ -130,7 +130,7 @@
counter_arc const * arc = list_entry(pos, counter_arc, next);
if (allocated_mask & (1 << arc->counter))
- return 0;
+ continue;
counter_map[depth] = arc->counter;
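
To illustrate what the one-line change above does, here is a minimal,
self-contained sketch of a backtracking counter allocator. It is not the
oprofile source: the function name allocate_counters(), the bitmask
representation of "counters an event may use", the NR_COUNTERS constant and
the skip_busy switch are all assumptions made for the example. It also
assumes that a counter already claimed by something else (for instance
counter 0) is marked in the initial allocated mask.

```c
#define NR_COUNTERS 4   /* hypothetical number of hardware counters */

/*
 * Hypothetical simplified model of a backtracking counter allocator.
 * allowed[i] is a bitmask of the counters that event i may use.
 * allocated_mask marks counters that are already taken.
 * skip_busy selects the behaviour on a busy counter:
 *   0 = give up on this event immediately (the old "return 0"),
 *   1 = try the next candidate counter (the patched "continue").
 * Returns 1 and fills counter_map[] on success, 0 on failure.
 */
static int allocate_counters(const unsigned *allowed, int nr_events,
                             int depth, unsigned allocated_mask,
                             int *counter_map, int skip_busy)
{
    int c;

    if (depth == nr_events)
        return 1;               /* every event has a counter */

    for (c = 0; c < NR_COUNTERS; c++) {
        if (!(allowed[depth] & (1u << c)))
            continue;           /* event cannot use this counter */

        if (allocated_mask & (1u << c)) {
            if (skip_busy)
                continue;       /* patched: look at the next candidate */
            return 0;           /* old: abandon the whole event */
        }

        counter_map[depth] = c;

        /* Recurse for the remaining events; backtrack if that fails. */
        if (allocate_counters(allowed, nr_events, depth + 1,
                              allocated_mask | (1u << c),
                              counter_map, skip_busy))
            return 1;
    }

    return 0;
}
```

With one event that may use counter 0 or 1 and counter 0 already marked
busy, the skip_busy=0 variant fails on the first candidate, while the
skip_busy=1 variant moves on and assigns counter 1.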
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 18:05 ` William Cohen
@ 2006-11-22 18:26 ` Eric Dumazet
0 siblings, 0 replies; 91+ messages in thread
From: Eric Dumazet @ 2006-11-22 18:26 UTC (permalink / raw)
To: William Cohen
Cc: Adrian Bunk, Andrew Morton, Len Brown, phil.el, gregkh,
Linux Kernel Mailing List, Andi Kleen, Ingo Molnar, oprofile-list,
Stephen Hemminger
On Wednesday 22 November 2006 19:05, William Cohen wrote:
> You will also need another patch checked into the oprofile cvs last week
> mentioned:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=35422937
>
> -Will
Thank you William.
I confirm that the CVS version of oprofile plus the patches you gave here
works with linux-2.6.19-rc6 on i386, whether or not the NMI watchdog is
disabled (with or without nmi_watchdog=0 in the boot params).
Eric
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 10:36 ` Andi Kleen
@ 2006-11-22 18:42 ` Andrew Morton
2006-12-16 11:20 ` Ray Lee
0 siblings, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-22 18:42 UTC (permalink / raw)
To: Andi Kleen
Cc: Eric Dumazet, Adrian Bunk, Linux Kernel Mailing List,
Stephen Hemminger, gregkh, Ingo Molnar, Len Brown, phil.el,
oprofile-list
On Wed, 22 Nov 2006 11:36:14 +0100
Andi Kleen <ak@suse.de> wrote:
> On Wednesday 22 November 2006 11:28, Eric Dumazet wrote:
> > On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
> > > On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
> > > > Subject : x86_64: oprofile doesn't work
> > > > References : http://lkml.org/lkml/2006/10/27/3
> > > > Submitter : Prakash Punnoor <prakash@punnoor.de>
> > > > Status : unknown
> > >
> >
> > I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.
>
> oprofile is still broken because it cannot deal with the lack of perfctr 0.
The kernel is still broken because we changed the interface.
> You can disable the nmi watchdog as a workaround.
I don't understand why you think this is acceptable.
* [SOLVED] Re: [discuss] 2.6.19-rc5: known regressions (v2)
2006-11-14 16:44 ` Paolo Ornati
@ 2006-11-29 10:10 ` Paolo Ornati
0 siblings, 0 replies; 91+ messages in thread
From: Paolo Ornati @ 2006-11-29 10:10 UTC (permalink / raw)
To: Paolo Ornati; +Cc: Rafael J. Wysocki, Adrian Bunk, LKML
On Tue, 14 Nov 2006 17:44:51 +0100
Paolo Ornati <ornati@fastwebnet.it> wrote:
> > > Okay, please let us know if it survives the next several cycles.
> > >
> > > OTOH, the problem may be hiding.
> >
> > Ok, and if it survives again and again I can do a partial bisection...
>
> "-rc5" is still alive: 6 days of uptime using suspend/resume many times
> every day...
>
> so if the problem is there it's hiding very well.
>
>
> Now I'll slowly go back with older kernels and see what happens...
SHORT CONCLUSION: it was just a kernel miscompilation (I usually do
"make oldconfig; make clean; make" so I don't know if I missed "make
clean" or if it was caused by ccache...).
The fact that it's a miscompilation is "proved" by 3 simple things:
1) I've only seen the problem with that particular version
2) a slow bisection indicated that the hypothetical bug was fixed between
4b1c46a3..d1ed6a3e, but I don't see any change that matters (on x86_64).
3) I'm running a cleanly recompiled 2.6.19-rc4-g4b1c46a3, which doesn't
have any problem.
:D
--
Paolo Ornati
Linux 2.6.19-rc4-g4b1c46a3 on x86_64
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 18:42 ` Andrew Morton
@ 2006-12-16 11:20 ` Ray Lee
0 siblings, 0 replies; 91+ messages in thread
From: Ray Lee @ 2006-12-16 11:20 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Eric Dumazet, Adrian Bunk, Linux Kernel Mailing List,
Stephen Hemminger, gregkh, Ingo Molnar, Len Brown, phil.el,
oprofile-list
On 11/22/06, Andrew Morton <akpm@osdl.org> wrote:
> On Wed, 22 Nov 2006 11:36:14 +0100
> Andi Kleen <ak@suse.de> wrote:
>
> > On Wednesday 22 November 2006 11:28, Eric Dumazet wrote:
> > > On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
> > > > On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
> > > > > Subject : x86_64: oprofile doesn't work
> > > > > References : http://lkml.org/lkml/2006/10/27/3
> > > > > Submitter : Prakash Punnoor <prakash@punnoor.de>
> > > > > Status : unknown
> > > >
> > >
> > > I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.
> >
> > oprofile is still broken because it cannot deal with the lack of perfctr 0.
>
> The kernel is still broken because we changed the interface.
I just got bit by this on 2.6.20-latest (well, of two days ago anyway)
while trying to debug another transient 'kacpid sucks all available
cpu time'. But that's okay, I'm sure it will happen again in a week or
two.
In the meantime, who won this pis^H^H^H discussion?
Mikael Pettersson wrote:
> Andrew Morton writes:
> > Surely the appropriate behaviour is to allow oprofile to steal the NMI and
> > to then put the NMI back to doing the watchdog thing after oprofile has
> > finished with it.
>
> Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
> the in-kernel API allowing real performance counter drivers like
> oprofile (and perfctr) to claim the HW from the NMI watchdog,
> do their work, and then release it which resumed the watchdog.
>
> Note that oprofile (and perfctr) didn't do anything behind the
> NMI watchdog's back. They went via the API. Nothing dodgy going on.
Well, that seems clear.
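
The claim/release behaviour Mikael describes can be modelled in a few lines.
This is only a user-space sketch of the ownership handoff, not kernel code:
the struct, the enum and the function names watchdog_start(),
profiler_claim() and profiler_release() are hypothetical and are not the
names of the in-kernel API.

```c
#include <stdbool.h>

/* Hypothetical model: who currently drives the performance counter NMI. */
enum perfctr_owner { OWNER_NONE, OWNER_WATCHDOG, OWNER_PROFILER };

struct perfctr_hw {
    enum perfctr_owner owner;
    bool watchdog_wanted;   /* watchdog should run whenever hw is free */
};

/* The watchdog takes the hardware if nobody else holds it. */
static void watchdog_start(struct perfctr_hw *hw)
{
    hw->watchdog_wanted = true;
    if (hw->owner == OWNER_NONE)
        hw->owner = OWNER_WATCHDOG;
}

/*
 * A profiler (oprofile, perfctr) claims the hardware through the API.
 * The watchdog is suspended, not forgotten. Returns 0 on success and
 * -1 if another profiler already holds the hardware.
 */
static int profiler_claim(struct perfctr_hw *hw)
{
    if (hw->owner == OWNER_PROFILER)
        return -1;
    hw->owner = OWNER_PROFILER;
    return 0;
}

/* On release the watchdog resumes if it was wanted before the claim. */
static void profiler_release(struct perfctr_hw *hw)
{
    if (hw->owner != OWNER_PROFILER)
        return;
    hw->owner = hw->watchdog_wanted ? OWNER_WATCHDOG : OWNER_NONE;
}
```

The point of the model is the last function: after the profiler is done,
ownership goes back to the watchdog rather than to nobody, which is the
behaviour Andrew asks for in the quoted text.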
Thread overview: 91+ messages
2006-11-08 2:33 Linux 2.6.19-rc5 Linus Torvalds
[not found] ` <20061108085235.GT4729@stusta.de>
2006-11-08 9:29 ` [discuss] 2.6.19-rc5: known regressions Jan Beulich
2006-11-08 10:21 ` Adrian Bunk
2006-11-08 9:34 ` Jens Axboe
2006-11-08 19:09 ` Alex Romosan
2006-11-08 19:29 ` Jens Axboe
2006-11-08 19:38 ` Alex Romosan
2006-11-08 19:45 ` Jens Axboe
2006-11-08 21:40 ` Alex Romosan
2006-11-08 20:03 ` Arjan van de Ven
2006-11-08 20:19 ` Jens Axboe
2006-11-08 11:04 ` Eric W. Biederman
2006-11-08 11:32 ` Thomas Gleixner
[not found] ` <7813413.118221162987983254.komurojun-mbn@nifty.com>
2006-11-08 16:00 ` Linus Torvalds
2006-11-10 12:42 ` Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq Komuro
2006-11-13 16:02 ` Linus Torvalds
2006-11-13 17:11 ` Eric W. Biederman
2006-11-13 20:44 ` Ingo Molnar
2006-11-13 21:11 ` Eric W. Biederman
2006-11-14 8:14 ` [patch] irq: do not mask interrupts by default Ingo Molnar
2006-11-14 8:20 ` Arjan van de Ven
2006-11-14 12:43 ` Komuro
2006-11-14 16:10 ` Linus Torvalds
2006-11-14 17:52 ` [PATCH] Use delayed disable mode of ioapic edge triggered interrupts Eric W. Biederman
2006-11-14 23:35 ` Linus Torvalds
2006-11-15 1:17 ` Linus Torvalds
2006-11-15 5:14 ` Eric W. Biederman
2006-11-15 16:06 ` Linus Torvalds
2006-11-15 16:58 ` Eric W. Biederman
2006-11-15 12:40 ` Komuro
[not found] ` <20061115090427.GA16173@elte.hu>
2006-11-15 16:13 ` [patch] genirq: do not mask interrupts by default Linus Torvalds
2006-11-15 17:46 ` Ingo Molnar
[not found] ` <m1y7qm425l.fsf@ebiederm.dsl.xmission.com>
[not found] ` <Pine.LNX.4.64.0611080745150.3667@g5.osdl.org>
2006-11-08 16:22 ` 2.6.19-rc5: known regressions Adrian Bunk
2006-11-08 23:11 ` Tim Chen
2006-11-09 2:49 ` Tim Chen
2006-11-09 5:10 ` Eric W. Biederman
2006-11-13 22:46 ` Tim Chen
2006-11-14 0:03 ` Eric W. Biederman
2006-11-08 9:43 ` Linux 2.6.19-rc5 Nigel Cunningham
2006-11-08 9:59 ` Alessandro Suardi
2006-11-08 10:04 ` Nigel Cunningham
2006-11-08 14:19 ` Gene Heskett
2006-11-08 15:43 ` Linus Torvalds
[not found] ` <20061111015035.GU4729@stusta.de>
2006-11-11 9:08 ` [discuss] 2.6.19-rc5: known regressions (v2) Rafael J. Wysocki
2006-11-11 9:25 ` Paolo Ornati
2006-11-11 10:49 ` Rafael J. Wysocki
2006-11-11 12:29 ` Paolo Ornati
2006-11-14 16:44 ` Paolo Ornati
2006-11-29 10:10 ` [SOLVED] " Paolo Ornati
2006-11-13 22:14 ` 2.6.19-rc5: known regressions with patches Adrian Bunk
2006-11-13 22:56 ` Brian King
2006-11-13 23:15 ` Linus Torvalds
2006-11-14 2:35 ` Jeff Garzik
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
2006-11-15 10:35 ` Jens Axboe
2006-11-15 10:53 ` Adrian Bunk
2006-11-15 10:35 ` Eric Dumazet
2006-11-15 10:50 ` Andi Kleen
2006-11-15 16:40 ` William Cohen
2006-11-15 16:48 ` [discuss] " Andi Kleen
2006-11-15 18:39 ` Andrew Morton
2006-11-15 18:45 ` Andi Kleen
2006-11-15 19:07 ` Linus Torvalds
2006-11-15 19:23 ` Andi Kleen
2006-11-15 20:21 ` Andrew Morton
2006-11-15 21:18 ` Eric W. Biederman
2006-11-15 21:31 ` Andrew Morton
2006-11-16 10:55 ` Mikael Pettersson
2006-11-16 20:23 ` Andrew Morton
2006-11-17 9:59 ` Mikael Pettersson
2006-11-17 10:13 ` Andrew Morton
2006-11-19 3:05 ` Bill Davidsen
2006-11-17 10:29 ` Andi Kleen
2006-11-16 3:21 ` Andi Kleen
2006-11-16 5:05 ` Andrew Morton
2006-11-16 7:04 ` Andi Kleen
2006-11-16 15:34 ` William Cohen
2006-11-16 15:47 ` Andi Kleen
2006-11-16 21:32 ` Stephane Eranian
2006-11-22 10:28 ` Eric Dumazet
2006-11-22 10:36 ` Andi Kleen
2006-11-22 18:42 ` Andrew Morton
2006-12-16 11:20 ` Ray Lee
2006-11-22 17:59 ` William Cohen
2006-11-22 18:05 ` William Cohen
2006-11-22 18:26 ` Eric Dumazet
2006-11-15 11:06 ` Brice Goglin
2006-11-15 22:32 ` Adrian Bunk
2006-11-15 12:07 ` Alan
2006-11-15 15:52 ` Stephen Hemminger
2006-11-15 16:35 ` Eric W. Biederman