* Linux 2.6.19-rc5
@ 2006-11-08 2:33 Linus Torvalds
2006-11-08 9:43 ` Nigel Cunningham
` (4 more replies)
0 siblings, 5 replies; 91+ messages in thread
From: Linus Torvalds @ 2006-11-08 2:33 UTC (permalink / raw)
To: Linux Kernel Mailing List
Ok, things are finally calming down, it seems.
The -rc5 thing is mainly a few random architecture updates (arm, mips,
uml, avr, power) and the only really noticeable one there is likely some
fixes to the local APIC accesses on x86, which apparently fixes a few
machines.
The rest is really mostly one-liners (or close) to various subsystems. New
PCI ID's, trivial fixes, cifs, dvb, things like that. I'm feeling better
about this - there may be a -rc6, but maybe we don't even need one.
As usual, thanks to everybody who tested and chased down some of the
regressions,
Linus
---
Adrian Bunk (2):
[TIPC] net/tipc/port.c: fix NULL dereference
PCI: Let PCI_MULTITHREAD_PROBE depend on BROKEN
Akinobu Mita (4):
tokenring: fix module_init error handling
n2: fix confusing error code
edac_mc: fix error handling
sunrpc: add missing spin_unlock
Al Viro (8):
[IPV6]: File the fingerprints off ah6->spi/esp6->spi
[IPX]: Trivial parts of endianness annotations
[IPX]: Annotate and fix IPX checksum
[IPV6]: Fix ECN bug on big-endian
[NETFILTER] bug: NFULA_CFG_QTHRESH uses 32bit
[NETFILTER] bug: nfulnl_msg_config_mode ->copy_range is 32bit
[NETFILTER] bug: skb->protocol is already net-endian
[PKTGEN]: TCI endianness fixes
Alexey Dobriyan (1):
[GFS2] don't panic needlessly
Amol Lad (1):
drivers/isdn/hysdn/hysdn_sched.c: sleep after taking spinlock fix
Andreas Gruenbacher (1):
Fix user.* xattr permission check for sticky dirs
Andrew Morton (6):
find_bd_holder() fix
tidy "md: check bio address after mapping through partitions"
Add printk_timed_ratelimit()
schedule removal of FUTEX_FD
acpi_noirq section fix
spi section fix
Andy Fleming (2):
[POWERPC] Fix rmb() for e500-based machines it
[POWERPC] Fix oprofile support for e500 in arch/powerpc
Ankita Garg (1):
Fix for LKDTM MEM_SWAPOUT crashpoint
Atsushi Nemoto (2):
[MIPS] Fixup migration to GENERIC_TIME
[MIPS] Do not use -msym32 option for modules.
Auke Kok (1):
e1000: Fix regression: garbled stats and irq allocation during swsusp
Ben Dooks (5):
[ARM] 3915/1: S3C2412: Add s3c2410_gpio_getirq() to general gpio.c
[ARM] 3920/1: S3C24XX: Remove smdk2410_defconfig
[ARM] 3921/1: S3C24XX: remove bast_defconfig
[ARM] 3922/1: S3C24XX: update s3c2410_defconfig to 2.6.19-rc4
[ARM] 3923/1: S3C24XX: update s3c2410_defconfig with new drivers
Benjamin Herrenschmidt (2):
[POWERPC] Fix various offb issues
[POWERPC] Make alignment exception always check exception table
Bjorn Schneider (1):
USB: new VID/PID-combos for cp2101
Brice Goglin (1):
myri10ge: ServerWorks HT2000 PCI id is already defined in pci_ids.h
Daniel Drake (1):
jfs: Add splice support
Daniel Ritz (1):
usbtouchscreen: use endpoint address from endpoint descriptor
Daniel Yeisley (1):
init_reap_node() initialization fix
Dave Kleikamp (1):
JFS: Remove redundant xattr permission checking
David Brownell (3):
USB: fix compiler issues with newer gcc versions
USB: use MII hooks only if CONFIG_MII is enabled
[ARM] 3926/1: make timer led handle HZ != 100
David Härdeman (1):
V4L/DVB (4785): Budget-ci: Change DEBIADDR_IR to a safer default
David Rientjes (1):
net s2io: return on NULL dev_alloc_skb()
David S. Miller (7):
[APPLETALK]: Fix potential OOPS in atalk_sendmsg().
[XFRM] xfrm_user: Fix unaligned accesses.
[ETH1394]: Fix unaligned accesses.
[SPARC64]: Fix Tomatillo/Schizo IRQ handling.
[SPARC64]: Add some missing print_symbol() calls.
[SPARC64]: Fix futex_atomic_cmpxchg_inatomic implementation.
[SPARC]: Fix robust futex syscalls and wire up migrate_pages.
Dmitry Mishin (3):
[NETFILTER]: Missed and reordered checks in {arp,ip,ip6}_tables
[NETFILTER]: ip_tables: compat code module refcounting fix
[IPV6]: Add ndisc_netdev_notifier unregister.
Dominic Cerquetti (1):
USB: xpad: additional USB id's added
Enrico Scholz (1):
[ARM] 3919/1: Fixed definition of some PXA270 CIF related registers
Erez Zilber (1):
IB/iser: Start connection after enabling iSER
Eric Sandeen (1):
fix UFS superblock alignment issues
Eric W. Biederman (3):
Improve the removed sysctl warnings
sysctl: allow a zero ctl_name in the middle of a sysctl table
sysctl: implement CTL_UNNUMBERED
Gautham R Shenoy (1):
Fix the spurious unlock_cpu_hotplug false warnings
Grant Grundler (1):
hid-core: big-endian fix fix
Greg Kroah-Hartman (2):
PCI: Revert "PCI: i386/x86_84: disable PCI resource decode on device disable"
USB: add another sierra wireless device id
Gui,Jian (1):
[POWERPC] Disallow kprobes on emulate_step and branch_taken
Haavard Skinnemoen (4):
AVR32: Get rid of board_early_init
AVR32: Fix thinko in generic_find_next_zero_le_bit()
AVR32: Wire up sys_epoll_pwait
AVR32: Add missing return instruction in __raw_writesb
Hartmut Hackmann (1):
V4L/DVB (4770): Fix mode switch of Compro Videomate T300
Heiko Carstens (4):
[NET]: fix uaccess handling
sys_pselect7 vs compat_sys_pselect7 uaccess error handling
[S390] revert add_active_range() usage patch.
[S390] IRQs too early enabled.
Herbert Xu (2):
[NET]: Fix segmentation of linear packets
[SCTP]: Always linearise packet on input
Hugh Dickins (3):
[POWERPC] Make current preempt-safe
[POWERPC] Make high hugepage areas preempt safe
[POWERPC] Make mmiowb's io_sync preempt safe
Jack Morgenstein (1):
IB/uverbs: Return sq_draining value in query_qp response
James Morris (3):
[IPV6]: fix lockup via /proc/net/ip6_flowlabel
[IPV6]: return EINVAL for invalid address with flowlabel lease request
[IPV6]: fix flowlabel seqfile handling
Jamie Lenehan (2):
sh: Fix IPR-IRQ's for IRQ-chip change breakage.
sh: Titan defconfig update.
Jan Luebbe (1):
USB: sierra: Fix id for Sierra Wireless MC8755 in new table
Jan Mate (1):
USB Storage: unusual_devs.h entry for Sony Ericsson P990i
Jan-Benedict Glaw (1):
Update for the srm_env driver.
Jan-Bernd Themann (1):
ehea: kzalloc GFP_ATOMIC fix
Jeff Dike (4):
uml: add _text definition to linker scripts
uml: add INITCALLS
uml: fix I/O hang
uml: include tidying
Jeff Garzik (1):
Revert "Add 0x7110 piix to ata_piix.c"
Jeff Mahoney (1):
reiserfs: reset errval after initializing bitmap cache
Jens Axboe (3):
CFQ: request <-> request merging rr_list fixup
Add 0x7110 piix to ata_piix.c
splice: fix problem introduced with inode diet
Jes Sorensen (1):
[IA64] don't double >> PAGE_SHIFT pointer for /dev/kmem access
Jiri Benc (1):
ieee80211: don't flood log with errors
Johannes Berg (1):
b44: change comment about irq mask register
Keith Owens (1):
[IA64] Correct definition of handle_IPI
Kenji Kaneshige (1):
[IA64] cpu-hotplug: Fixing confliction between CPU hot-add and IPI
Kevin Hilman (2):
[ARM] 3917/1: Fix dmabounce symbol exports
[ARM] 3918/1: ixp4xx irq-chip rework
Krishna Kumar (1):
RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count
Kristoffer Ericson (1):
video: Fix include in hp680_bl.
Larry Finger (1):
bcm43xx: fix unexpected LED control values in BCM4303 sprom
Larry Woodman (1):
[NET]: __alloc_pages() failures reported due to fragmentation
Lennert Buytenhek (3):
ep93xx_eth: fix RX/TXstatus ring full handling
ep93xx_eth: fix unlikely(x) > y test
ep93xx_eth: don't report RX errors
Linas Vepstas (1):
[POWERPC] Use 4kB iommu pages even on 64kB-page systems
Linus Torvalds (6):
i386: clean up io-apic accesses
i386: write IO APIC irq routing entries in correct order
Revert unintentional "volatile" changes in ipc/msg.c
Fix unlikely (but possible) race condition on task->user access
Make sure "user->sigpending" count is in sync
Linux 2.6.19-rc5
Manish Lachwani (1):
[MIPS] Add missing file for support of backplane on TX4927 based board
Martin Josefsson (1):
[NETFILTER]: nf_conntrack: add missing unlock in get_next_corpse()
Meelis Roos (1):
[NETFILTER]: silence a warning in ebtables
Michael Buesch (1):
bcm43xx: Fix low-traffic netdev watchdog TX timeouts
Michael Chan (1):
[TG3]: Fix 2nd ifup failure on 5752M.
Michael Halcrow (7):
eCryptfs: Clean up crypto initialization
eCryptfs: Hash code to new crypto API
eCryptfs: Cipher code to new crypto API
eCryptfs: Consolidate lower dentry_open's
eCryptfs: Remove ecryptfs_umount_begin
eCryptfs: Fix handling of lower d_count
eCryptfs: Fix pointer deref
Michael S. Tsirkin (1):
IB/mthca: Fix MAD extended header format for MAD_IFC firmware command
Naranjo Manuel Francisco (1):
USB: HID: add blacklist AIRcable USB, little beautification
NeilBrown (2):
md: check bio address after mapping through partitions.
md: send online/offline uevents when an md array starts/stops
nkalmala (1):
mm: un-needed add-store operation wastes a few bytes
OGAWA Hirofumi (4):
Cleanup read_pages()
cifs: ->readpages() fixes
fuse: ->readpages() cleanup
gfs2: ->readpages() fixes
Oleg Nesterov (2):
taskstats: fix sub-threads accounting
fix Documentation/accounting/getdelays.c buf size
Oliver Endriss (1):
V4L/DVB (4784): [saa7146_i2c] short_delay mode fixed for fast machines
Oliver Neukum (2):
USB: failure in usblp's error path
USB: usblp: fix system suspend for some systems
Paolo 'Blaisorblade' Giarrusso (11):
uml ubd driver: allow using up to 16 UBD devices
uml ubd driver: document some struct fields
uml ubd driver: var renames
uml ubd driver: give better names to some functions.
uml ubd driver: change ubd_lock to be a mutex
uml ubd driver: ubd_io_lock usage fixup
uml ubd driver: convert do_ubd to a boolean variable
uml ubd driver: reformat ubd_config
uml ubd driver: use bitfields where possible
uml ubd driver: do not store error codes as ->fd
uml ubd driver: various little changes
Patrick Caulfield (2):
[DLM] Fix kref_put oops
[DLM] fix oops in kref_put when removing a lockspace
Patrick McHardy (2):
[NETFILTER]: remove masq/NAT from ip6tables Kconfig help
[IPV6]: Give sit driver an appropriate module alias.
Paul Gortmaker (1):
[ARM] 3912/1: Make PXA270 advertise HWCAP_IWMMXT capability
Paul Mackerras (2):
IB/ehca: Fix eHCA driver compilation for uniprocessor
powerpc: Eliminate "exceeds stub group size" linker warning
Paul Moore (2):
[NetLabel]: protect the CIPSOv4 socket option from setsockopt()
[NETLABEL]: Fix build failure.
Paul Mundt (2):
sh: Wire up new syscalls.
sh: Update r7780rp_defconfig.
Pavel Emelianov (1):
Fix ipc entries removal
Pavel Roskin (1):
hostap_plx: fix CIS verification
Peer Chen (5):
[libata] sata_nv: Add PCI IDs
[libata] Add support for PATA controllers of MCP67 to pata_amd.c.
[libata] Add support for AHCI controllers of MCP67.
pci_ids.h: Add NVIDIA PCI ID
IDE: Add the support of nvidia PATA controllers of MCP67 to amd74xx.c
Peter Zijlstra (1):
lockdep: fix delayacct locking bug
Phil Dibowitz (1):
USB: usb-storage: Unusual_dev update
Rafael J. Wysocki (1):
swsusp: debugging
Ralf Baechle (26):
[MIPS] TX4927: Remove indent error message that somehow ended in the code.
[MIPS] Sort out missuse of __init for prom_getcmdline()
[MIPS] VSMP: Fix initialization ordering bug.
[MIPS] Flags must be unsigned long.
[MIPS] VSMP: Synchronize cp0 counters on bootup.
[MIPS] 16K & 64K page size fixes
[MIPS] SMTC: Fix crash if # of TC's > # of VPE's after pt_regs irq cleanup.
[MIPS] SMTC: Synchronize cp0 counters on bootup.
Revert "[MIPS] Make SPARSEMEM selectable on QEMU."
[MIPS] Fix merge screwup by patch(1)
[MIPS] IP27: Allow SMP ;-) Another changeset messed up by patch.
[MIPS] Fix warning about init_initrd() call if !CONFIG_BLK_DEV_INITRD.
[MIPS] Ocelot G: Fix : "CURRENTLY_UNUSED" is not defined warning.
[MIPS] Don't use R10000 llsc workaround version for all llsc-full processors.
[MIPS] Ocelot C: Fix large number of warnings.
[MIPS] Ocelot C: fix eth registration after conversion to platform_device
[MIPS] Ocelot C: Fix warning about missmatching format string.
[MIPS] Ocelot C: Fix mapping of ioport address range.
[MIPS] Ocelot 3: Fix large number of warnings.
[MIPS] SB1: On bootup only flush cache on local CPU.
[MIPS] Ocelot C: Fix MAC address detection after platform_device conversion.
[MIPS] Ocelot 3: Fix MAC address detection after platform_device conversion.
[MIPS] EV64120: Fix timer initialization for HZ != 100.
[MIPS] Make irq number allocator generally available for fixing EV64120.
[MIPS] EV64120: Fix PCI interrupt allocation.
[MIPS] Fix EV64120 and Ocelot builds by providing a plat_timer_setup().
Randy Dunlap (8):
[NET] sealevel: uses arp_broken_ops
[DCCP]: fix printk format warnings
SCSI: ISCSI build failure
V4L/DVB (4786): Pvrusb2: use NULL instead of 0
update some docbook comments
docbook: merge journal-api into filesystems.tmpl
lkdtm: cleanup headers and module_param/MODULE_PARM_DESC
Kconfig: remove redundant NETDEVICES depends
Ray Lehtiniemi (1):
[ARM] 3927/1: Allow show_mem() to work with holes in memory map.
Raymond Mantchala (1):
V4L/DVB (4787): Budget-ci: Inversion setting fixed for Technotrend 1500 T
Russ Anderson (1):
[IA64] MCA recovery: Montecito support
Sean Hefty (1):
RDMA/addr: Use client registration to fix module unload race
Srinivasa Ds (1):
NFS4: fix for recursive locking problem
Stephen Hemminger (4):
sky2: not experimental
skge, sky2, et all. gplv2 only
sky2: netpoll on dual port cards
[TCP]: Set default congestion control when no sysctl.
Stephen Rothwell (3):
Create compat_sys_migrate_pages
powerpc: wire up sys_migrate_pages
Fix sys_move_pages when a NULL node list is passed
Steve French (3):
[CIFS] Fix readdir breakage when blocksize set too small
[CIFS] Allow null user connections
[CIFS] report rename failure when target file is locked by Windows
Steve Wise (2):
IB/amso1100: Use dma_alloc_coherent() instead of kmalloc/dma_map_single
IB/amso1100: Fix incorrect pr_debug()
Steven Whitehouse (2):
[GFS2] Fix incorrect fs sync behaviour.
[GFS2] Fix OOM error handling
Tejun Heo (4):
sata_sis: fix flags handling for the secondary port
libata: unexport ata_dev_revalidate()
ata_piix: allow 01b MAP for both ICH6M and ICH7M
ahci: fix status register check in ahci_softreset
Thomas Klein (3):
ehea: Nullpointer dereferencation fix
ehea: Removed redundant define
ehea: 64K page support fix
Tilman Schmidt (1):
isdn/gigaset: convert warning message
Timur Tabi (1):
[POWERPC] qe_lib: qe_issue_cmd writes wrong value to CECDR
Trent Piepho (2):
V4L/DVB (4752): DVB: Add DVB_FE_CUSTOMISE support for MT2060
V4L/DVB (4751): Fix DBV_FE_CUSTOMISE for card drivers compiled into kernel
Troy Heber (1):
[IA64] move SAL_CACHE_FLUSH check later in boot
Vasily Averin (1):
[NETFILTER]: ip_tables: compat error way cleanup
Vlad Yasevich (2):
[SCTP]: Correctly set IP id for SCTP traffic
[SCTP]: Remove temporary associations from backlog and hash.
Yoichi Yuasa (3):
[MIPS] Yosemite: fix uninitialized variable in titan_i2c_xfer()
[MIPS] Fix warning of printk format in mips_srs_init()
[MIPS] Fix warning in mips-boards generic PCI
Yvan Seth (1):
ipmi_si_intf.c sets bad class_mask with PCI_DEVICE_CLASS
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Linux 2.6.19-rc5
2006-11-08 2:33 Linux 2.6.19-rc5 Linus Torvalds
@ 2006-11-08 9:43 ` Nigel Cunningham
2006-11-08 9:59 ` Alessandro Suardi
2006-11-08 15:43 ` Linus Torvalds
[not found] ` <20061108085235.GT4729@stusta.de>
` (3 subsequent siblings)
4 siblings, 2 replies; 91+ messages in thread
From: Nigel Cunningham @ 2006-11-08 9:43 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Linux Kernel Mailing List

Gidday.

On Tue, 2006-11-07 at 18:33 -0800, Linus Torvalds wrote:
> Ok, things are finally calming down, it seems.
>
> The -rc5 thing is mainly a few random architecture updates (arm, mips,
> uml, avr, power) and the only really noticeable one there is likely some
> fixes to the local APIC accesses on x86, which apparently fixes a few
> machines.
>
> The rest is really mostly one-liners (or close) to various subsystems. New
> PCI ID's, trivial fixes, cifs, dvb, things like that. I'm feeling better
> about this - there may be a -rc6, but maybe we don't even need one.
>
> As usual, thanks to everybody who tested and chased down some of the
> regressions,
>
> Linus

The patch etc doesn't seem to be available yet. (The front page is still
showing -rc4, for example).

Regards,

Nigel
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Linux 2.6.19-rc5
2006-11-08 9:43 ` Nigel Cunningham
@ 2006-11-08 9:59 ` Alessandro Suardi
2006-11-08 10:04 ` Nigel Cunningham
2006-11-08 14:19 ` Gene Heskett
2006-11-08 15:43 ` Linus Torvalds
1 sibling, 2 replies; 91+ messages in thread
From: Alessandro Suardi @ 2006-11-08 9:59 UTC (permalink / raw)
To: Nigel Cunningham; +Cc: Linus Torvalds, Linux Kernel Mailing List

On 11/8/06, Nigel Cunningham <ncunningham@linuxmail.org> wrote:
> Gidday.
>
> On Tue, 2006-11-07 at 18:33 -0800, Linus Torvalds wrote:
> > Ok, things are finally calming down, it seems.
> >
> > The -rc5 thing is mainly a few random architecture updates (arm, mips,
> > uml, avr, power) and the only really noticeable one there is likely some
> > fixes to the local APIC accesses on x86, which apparently fixes a few
> > machines.
> >
> > The rest is really mostly one-liners (or close) to various subsystems. New
> > PCI ID's, trivial fixes, cifs, dvb, things like that. I'm feeling better
> > about this - there may be a -rc6, but maybe we don't even need one.
> >
> > As usual, thanks to everybody who tested and chased down some of the
> > regressions,
> >
> > Linus
>
> The patch etc doesn't seem to be available yet. (The front page is still
> showing -rc4, for example).

The patch is available, it's just the kernel.org home that
isn't updated.

http://www.kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.19-rc5.bz2

--alessandro

"...when I get it, I _get_ it"

(Lara Eidemiller)
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Linux 2.6.19-rc5
2006-11-08 9:59 ` Alessandro Suardi
@ 2006-11-08 10:04 ` Nigel Cunningham
2006-11-08 14:19 ` Gene Heskett
1 sibling, 0 replies; 91+ messages in thread
From: Nigel Cunningham @ 2006-11-08 10:04 UTC (permalink / raw)
To: Alessandro Suardi; +Cc: Linus Torvalds, Linux Kernel Mailing List

Hi.

On Wed, 2006-11-08 at 10:59 +0100, Alessandro Suardi wrote:
> On 11/8/06, Nigel Cunningham <ncunningham@linuxmail.org> wrote:
> > Gidday.
> >
> > On Tue, 2006-11-07 at 18:33 -0800, Linus Torvalds wrote:
> > > Ok, things are finally calming down, it seems.
> > >
> > > The -rc5 thing is mainly a few random architecture updates (arm, mips,
> > > uml, avr, power) and the only really noticeable one there is likely some
> > > fixes to the local APIC accesses on x86, which apparently fixes a few
> > > machines.
> > >
> > > The rest is really mostly one-liners (or close) to various subsystems. New
> > > PCI ID's, trivial fixes, cifs, dvb, things like that. I'm feeling better
> > > about this - there may be a -rc6, but maybe we don't even need one.
> > >
> > > As usual, thanks to everybody who tested and chased down some of the
> > > regressions,
> > >
> > > Linus
> >
> > The patch etc doesn't seem to be available yet. (The front page is still
> > showing -rc4, for example).
>
> The patch is available, it's just the kernel.org home that
> isn't updated.
>
> http://www.kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.19-rc5.bz2

Ta. I was more concerned that whoever needs to fix whatever's broken
knows the issue exists.

Regards,

Nigel
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Linux 2.6.19-rc5
2006-11-08 9:59 ` Alessandro Suardi
2006-11-08 10:04 ` Nigel Cunningham
@ 2006-11-08 14:19 ` Gene Heskett
1 sibling, 0 replies; 91+ messages in thread
From: Gene Heskett @ 2006-11-08 14:19 UTC (permalink / raw)
To: linux-kernel

On Wednesday 08 November 2006 04:59, Alessandro Suardi wrote:
>On 11/8/06, Nigel Cunningham <ncunningham@linuxmail.org> wrote:
>> Gidday.
>>
>> On Tue, 2006-11-07 at 18:33 -0800, Linus Torvalds wrote:
>> > Ok, things are finally calming down, it seems.
>> >
>> > The -rc5 thing is mainly a few random architecture updates (arm,
>> > mips, uml, avr, power) and the only really noticeable one there is
>> > likely some fixes to the local APIC accesses on x86, which apparently
>> > fixes a few machines.
>> >
>> > The rest is really mostly one-liners (or close) to various
>> > subsystems. New PCI ID's, trivial fixes, cifs, dvb, things like that.
>> > I'm feeling better about this - there may be a -rc6, but maybe we
>> > don't even need one.
>> >
>> > As usual, thanks to everybody who tested and chased down some of the
>> > regressions,
>> >
>> > Linus
>>
>> The patch etc doesn't seem to be available yet. (The front page is
>> still showing -rc4, for example).
>
>The patch is available, it's just the kernel.org home that
> isn't updated.
>
Tis now, I have it building.

>http://www.kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.19-rc5.bz2
>
>--alessandro
>
>"...when I get it, I _get_ it"
>
> (Lara Eidemiller)
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel"
> in the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2006 by Maurice Eugene Heskett, all rights reserved.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: Linux 2.6.19-rc5
2006-11-08 9:43 ` Nigel Cunningham
2006-11-08 9:59 ` Alessandro Suardi
@ 2006-11-08 15:43 ` Linus Torvalds
1 sibling, 0 replies; 91+ messages in thread
From: Linus Torvalds @ 2006-11-08 15:43 UTC (permalink / raw)
To: Nigel Cunningham; +Cc: Linux Kernel Mailing List

On Wed, 8 Nov 2006, Nigel Cunningham wrote:
>
> The patch etc doesn't seem to be available yet. (The front page is still
> showing -rc4, for example).

It seems that mirroring is taking forever again. The patch and tar-balls
are definitely there on the master site, and even gitweb has mirrored out
(at least to one of the mirrors), but it looks like the mirroring hasn't
gotten to the kernel source "testing" directory yet.

Linus
^ permalink raw reply [flat|nested] 91+ messages in thread
[parent not found: <20061108085235.GT4729@stusta.de>]
* Re: [discuss] 2.6.19-rc5: known regressions
[not found] ` <20061108085235.GT4729@stusta.de>
@ 2006-11-08 9:29 ` Jan Beulich
2006-11-08 10:21 ` Adrian Bunk
2006-11-08 9:34 ` Jens Axboe
` (4 subsequent siblings)
5 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2006-11-08 9:29 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Linux Kernel Mailing List, discuss

>Subject : i386: more DWARFs and strange messages
>References : http://lkml.org/lkml/2006/10/29/127
>Submitter : Martin Lorenz <martin@lorenz.eu.org>
>Status : should be fixed by
> commit 4b96b1a10cb00c867103b21f0f2a6c91b705db11

This commit should be related only to the 'strange messages'; I'm
yet to look into the DWARFs.

Jan
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] 2.6.19-rc5: known regressions
2006-11-08 9:29 ` [discuss] 2.6.19-rc5: known regressions Jan Beulich
@ 2006-11-08 10:21 ` Adrian Bunk
0 siblings, 0 replies; 91+ messages in thread
From: Adrian Bunk @ 2006-11-08 10:21 UTC (permalink / raw)
To: Jan Beulich; +Cc: Linux Kernel Mailing List, discuss, Martin Lorenz

On Wed, Nov 08, 2006 at 10:29:36AM +0100, Jan Beulich wrote:
> >Subject : i386: more DWARFs and strange messages
> >References : http://lkml.org/lkml/2006/10/29/127
> >Submitter : Martin Lorenz <martin@lorenz.eu.org>
> >Status : should be fixed by
> > commit 4b96b1a10cb00c867103b21f0f2a6c91b705db11
>
> This commit should be related only to the 'strange messages'; I'm
> yet to look into the DWARFs.

Thanks for the information, I've updated it in my list.

> Jan

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions
[not found] ` <20061108085235.GT4729@stusta.de>
2006-11-08 9:29 ` [discuss] 2.6.19-rc5: known regressions Jan Beulich
@ 2006-11-08 9:34 ` Jens Axboe
2006-11-08 19:09 ` Alex Romosan
2006-11-08 11:04 ` Eric W. Biederman
` (3 subsequent siblings)
5 siblings, 1 reply; 91+ messages in thread
From: Jens Axboe @ 2006-11-08 9:34 UTC (permalink / raw)
To: romosan; +Cc: linux-kernel

On Wed, Nov 08 2006, Adrian Bunk wrote:
> Subject : unable to rip cd
> References : http://lkml.org/lkml/2006/10/13/100
> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> Status : unknown

Alex, was/is this repeatable? If so I'd like you to repeat with this
debug patch applied, I cannot reproduce it locally.

diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index bddfebd..ad03e19 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -1726,8 +1726,10 @@ static ide_startstop_t cdrom_newpc_intr(
 		/*
 		 * write to drive
 		 */
-		if (cdrom_write_check_ireason(drive, len, ireason))
+		if (cdrom_write_check_ireason(drive, len, ireason)) {
+			blk_dump_rq_flags(rq, "cdrom_newpc");
 			return ide_stopped;
+		}

 		xferfunc = HWIF(drive)->atapi_output_bytes;
 	} else {
@@ -1859,8 +1861,10 @@ static ide_startstop_t cdrom_write_intr(
 	}

 	/* Check that the drive is expecting to do the same thing we are. */
-	if (cdrom_write_check_ireason(drive, len, ireason))
+	if (cdrom_write_check_ireason(drive, len, ireason)) {
+		blk_dump_rq_flags(rq, "cdrom_pc");
 		return ide_stopped;
+	}

 	sectors_to_transfer = len / SECTOR_SIZE;

--
Jens Axboe
^ permalink raw reply related [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions
2006-11-08 9:34 ` Jens Axboe
@ 2006-11-08 19:09 ` Alex Romosan
2006-11-08 19:29 ` Jens Axboe
0 siblings, 1 reply; 91+ messages in thread
From: Alex Romosan @ 2006-11-08 19:09 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel

Jens Axboe <jens.axboe@oracle.com> writes:

> On Wed, Nov 08 2006, Adrian Bunk wrote:
>> Subject : unable to rip cd
>> References : http://lkml.org/lkml/2006/10/13/100
>> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
>> Status : unknown
>
> Alex, was/is this repeatable? If so I'd like you to repeat with this
> debug patch applied, I cannot reproduce it locally.
>
> diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> index bddfebd..ad03e19 100644
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -1726,8 +1726,10 @@ static ide_startstop_t cdrom_newpc_intr(
>  		/*
>  		 * write to drive
>  		 */
> -		if (cdrom_write_check_ireason(drive, len, ireason))
> +		if (cdrom_write_check_ireason(drive, len, ireason)) {
> +			blk_dump_rq_flags(rq, "cdrom_newpc");
>  			return ide_stopped;
> +		}
>
>  		xferfunc = HWIF(drive)->atapi_output_bytes;
>  	} else {
> @@ -1859,8 +1861,10 @@ static ide_startstop_t cdrom_write_intr(
>  	}
>
>  	/* Check that the drive is expecting to do the same thing we are. */
> -	if (cdrom_write_check_ireason(drive, len, ireason))
> +	if (cdrom_write_check_ireason(drive, len, ireason)) {
> +		blk_dump_rq_flags(rq, "cdrom_pc");
>  		return ide_stopped;
> +	}
>
>  	sectors_to_transfer = len / SECTOR_SIZE;
>

i've tried it again with the above patch applied and when i start
cdparanoia i get:

kernel: hdc: write_intr: wrong transfer direction!
kernel: cdrom_newpc: dev hdc: type=2, flags=114c9
kernel:
kernel: sector 59534648, nr/cnr 0/0
kernel: bio 00000000, biotail c14b2800, buffer 00000000, data 00000000, len 56
kernel: cdb: 12 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00

as for the lock up, the ripping process never completes, it starts and
then it hangs somewhere in the middle of the track. it could be that
the disk has some problems. anyway, abort execution doesn't work until
i physically eject the cd from the drive (which seems to be an
improvement from a couple of rc's ago). hope this helps.

--alex--

--
| I believe the moment is at hand when, by a paranoiac and active |
| advance of the mind, it will be possible (simultaneously with |
| automatism and other passive states) to systematize confusion |
| and thus to help to discredit completely the world of reality. |
^ permalink raw reply [flat|nested] 91+ messages in thread
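The "wrong transfer direction!" line in the report above is printed on the path guarded by cdrom_write_check_ireason(), which compares what the host is about to do with what the drive says it expects. The following is only a user-space sketch of that kind of check, assuming the standard ATAPI interrupt reason layout (bit 0 is CoD, bit 1 is IO); the enum and check_write_ireason() are hypothetical names for illustration and are not the kernel's implementation.

```c
#include <stdint.h>

/* ATAPI interrupt reason bits (assumed standard ATA/ATAPI layout). */
#define IREASON_COD 0x01 /* 1 = command packet phase, 0 = data phase */
#define IREASON_IO  0x02 /* 1 = device-to-host, 0 = host-to-device */

/* Hypothetical result codes, made up for this sketch. */
enum xfer_check {
	XFER_OK,              /* drive is ready to receive data from the host */
	XFER_WRONG_DIRECTION, /* drive wants to send data to the host instead */
	XFER_BAD_REASON       /* drive is not in a data phase at all */
};

/*
 * Classify the interrupt reason seen while the host intends to WRITE.
 * Only the two low bits take part in the decision.
 */
enum xfer_check check_write_ireason(uint8_t ireason)
{
	uint8_t phase = ireason & (IREASON_COD | IREASON_IO);

	if (phase == 0)
		return XFER_OK;
	if (phase == IREASON_IO)
		return XFER_WRONG_DIRECTION;
	return XFER_BAD_REASON;
}
```

With a request marked as a write while the drive is answering a data-in command, a check of this shape would see IO set with CoD clear and report the direction mismatch, which is consistent with the log line quoted above.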
* Re: 2.6.19-rc5: known regressions
2006-11-08 19:09 ` Alex Romosan
@ 2006-11-08 19:29 ` Jens Axboe
2006-11-08 19:38 ` Alex Romosan
2006-11-08 20:03 ` Arjan van de Ven
0 siblings, 2 replies; 91+ messages in thread
From: Jens Axboe @ 2006-11-08 19:29 UTC (permalink / raw)
To: Alex Romosan; +Cc: linux-kernel

On Wed, Nov 08 2006, Alex Romosan wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
>
> > On Wed, Nov 08 2006, Adrian Bunk wrote:
> >> Subject : unable to rip cd
> >> References : http://lkml.org/lkml/2006/10/13/100
> >> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> >> Status : unknown
> >
> > Alex, was/is this repeatable? If so I'd like you to repeat with this
> > debug patch applied, I cannot reproduce it locally.
> >
> > diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> > index bddfebd..ad03e19 100644
> > --- a/drivers/ide/ide-cd.c
> > +++ b/drivers/ide/ide-cd.c
> > @@ -1726,8 +1726,10 @@ static ide_startstop_t cdrom_newpc_intr(
> >  		/*
> >  		 * write to drive
> >  		 */
> > -		if (cdrom_write_check_ireason(drive, len, ireason))
> > +		if (cdrom_write_check_ireason(drive, len, ireason)) {
> > +			blk_dump_rq_flags(rq, "cdrom_newpc");
> >  			return ide_stopped;
> > +		}
> >
> >  		xferfunc = HWIF(drive)->atapi_output_bytes;
> >  	} else {
> > @@ -1859,8 +1861,10 @@ static ide_startstop_t cdrom_write_intr(
> >  	}
> >
> >  	/* Check that the drive is expecting to do the same thing we are. */
> > -	if (cdrom_write_check_ireason(drive, len, ireason))
> > +	if (cdrom_write_check_ireason(drive, len, ireason)) {
> > +		blk_dump_rq_flags(rq, "cdrom_pc");
> >  		return ide_stopped;
> > +	}
> >
> >  	sectors_to_transfer = len / SECTOR_SIZE;
> >
>
> i've tried it again with the above patch applied and when i start
> cdparanoia i get:
>
> kernel: hdc: write_intr: wrong transfer direction!
> kernel: cdrom_newpc: dev hdc: type=2, flags=114c9
> kernel:
> kernel: sector 59534648, nr/cnr 0/0
> kernel: bio 00000000, biotail c14b2800, buffer 00000000, data 00000000, len 56
> kernel: cdb: 12 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00

Wonderful! So this is an INQUIRY command, yet the WRITE bit is set. The
drive gets really confused about that, for good reason. The question is
where that write bit comes from, it looks really odd. Additionally, we
have killed ->bio but ->biotail still looks valid. Perhaps it's some of
the error handling that got screwed.

> as for the lock up, the ripping process never completes, it starts and
> then it hangs somewhere in the middle of the track. it could be that
> the disk has some problems. anyway, abort execution doesn't work until
> i physically eject the cd from the drive (which seems to be an
> improvement from a couple of rc's ago). hope this helps.

It helps a lot, thanks! I may ask you to retest with another patch, if
you don't mind.

--
Jens Axboe
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions
2006-11-08 19:29 ` Jens Axboe
@ 2006-11-08 19:38 ` Alex Romosan
2006-11-08 19:45 ` Jens Axboe
2006-11-08 20:03 ` Arjan van de Ven
1 sibling, 1 reply; 91+ messages in thread
From: Alex Romosan @ 2006-11-08 19:38 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel

Jens Axboe <jens.axboe@oracle.com> writes:

> It helps a lot, thanks! I may ask you to retest with another patch,
> if you don't mind.

send the patches, i'll test them all. thanks.

--alex--

--
| I believe the moment is at hand when, by a paranoiac and active |
| advance of the mind, it will be possible (simultaneously with |
| automatism and other passive states) to systematize confusion |
| and thus to help to discredit completely the world of reality. |
^ permalink raw reply [flat|nested] 91+ messages in thread
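For readers following the dump earlier in the thread: the "cdb:" line starts with opcode 0x12 (INQUIRY), and bytes 3..4 hold the allocation length, 0x0038 = 56, which matches the "len 56" printed next to it. INQUIRY only returns data from the device, so a request that is also flagged as a write contradicts the command. Below is a minimal user-space sketch of that reading. It is not kernel code, and struct inquiry_cdb, parse_inquiry_cdb() and inquiry_direction_mismatch() are hypothetical names made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCSI_OP_INQUIRY 0x12 /* INQUIRY opcode, as seen in byte 0 of the dump */

/* Hypothetical summary of a 6-byte INQUIRY CDB; not a kernel structure. */
struct inquiry_cdb {
	bool is_inquiry;        /* true when byte 0 is 0x12 */
	unsigned int alloc_len; /* bytes 3..4, big-endian */
};

/* Parse the leading 6 bytes of a command block as an INQUIRY CDB. */
struct inquiry_cdb parse_inquiry_cdb(const uint8_t *cdb, size_t len)
{
	struct inquiry_cdb out = { false, 0 };

	if (cdb == NULL || len < 6 || cdb[0] != SCSI_OP_INQUIRY)
		return out;

	out.is_inquiry = true;
	out.alloc_len = ((unsigned int)cdb[3] << 8) | cdb[4];
	return out;
}

/*
 * INQUIRY moves data from the device to the host only, so a request
 * that carries this opcode and is marked as a write is inconsistent.
 */
bool inquiry_direction_mismatch(const uint8_t *cdb, size_t len, bool marked_write)
{
	return parse_inquiry_cdb(cdb, len).is_inquiry && marked_write;
}
```

Feeding it the 16 bytes from the log (12 00 00 00 38 followed by zeros) with the write flag set reports a mismatch, which is the condition the debugging in this thread is trying to trace back to its source.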
* Re: 2.6.19-rc5: known regressions
2006-11-08 19:38 ` Alex Romosan
@ 2006-11-08 19:45 ` Jens Axboe
2006-11-08 21:40 ` Alex Romosan
0 siblings, 1 reply; 91+ messages in thread
From: Jens Axboe @ 2006-11-08 19:45 UTC (permalink / raw)
To: Alex Romosan; +Cc: linux-kernel

On Wed, Nov 08 2006, Alex Romosan wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
>
> > It helps a lot, thanks! I may ask you to retest with another patch,
> > if you don't mind.
>
> send the patches, i'll test them all. thanks.

If you could retest with something crazy like this, then that would
likely help:

diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index 7c47e62..010acfa 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -630,6 +630,9 @@ static void cdrom_end_request (ide_drive
 	struct request *rq = HWGROUP(drive)->rq;
 	int nsectors = rq->hard_cur_sectors;

+	if (blk_pc_request(rq) && rq->cmd[0] == 0x12)
+		printk("ide-cd: end INQ rq %p\n", rq);
+
 	if (blk_sense_request(rq) && uptodate) {
 		/*
 		 * For REQ_TYPE_SENSE, "rq->buffer" points to the original
@@ -1671,6 +1674,9 @@ static ide_startstop_t cdrom_newpc_intr(
 	xfer_func_t *xferfunc;
 	unsigned long flags;

+	if (rq->cmd[0] == 0x12)
+		printk("ide-cd: newpc %p\n", rq);
+
 	/* Check for errors. */
 	dma_error = 0;
 	dma = info->dma;
@@ -1789,6 +1795,8 @@ static ide_startstop_t cdrom_newpc_intr(
 	return ide_started;

 end_request:
+	if (rq->cmd[0] == 0x12)
+		printk("ide-cd: newpc end INQ %p\n", rq);
 	if (!rq->data_len)
 		post_transform_command(rq);

@@ -1959,7 +1967,13 @@ static ide_startstop_t cdrom_do_block_pc
 {
 	struct cdrom_info *info = drive->driver_data;

-	rq->cmd_flags |= REQ_QUIET;
+	if (rq->cmd[0] == 0x12) {
+		printk("ide-cd: starting INQ %p\n", rq);
+		if (rq_data_dir(rq) == WRITE)
+			printk("ide-cd: INQ with write set seen\n");
+	}
+	if (!rq->bio && rq->biotail)
+		printk("ide-cd: no bio, but biotail\n");

 	info->dma = 0;

--
Jens Axboe
^ permalink raw reply related [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions
2006-11-08 19:45 ` Jens Axboe
@ 2006-11-08 21:40 ` Alex Romosan
0 siblings, 0 replies; 91+ messages in thread
From: Alex Romosan @ 2006-11-08 21:40 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel

Jens Axboe <jens.axboe@oracle.com> writes:

> If you could retest with something crazy like this, then that would
> likely help:
>
> diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> index 7c47e62..010acfa 100644
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -630,6 +630,9 @@ static void cdrom_end_request (ide_drive
>  	struct request *rq = HWGROUP(drive)->rq;
>  	int nsectors = rq->hard_cur_sectors;
>  
> +	if (blk_pc_request(rq) && rq->cmd[0] == 0x12)
> +		printk("ide-cd: end INQ rq %p\n", rq);
> +
>  	if (blk_sense_request(rq) && uptodate) {
>  		/*
>  		 * For REQ_TYPE_SENSE, "rq->buffer" points to the original
> @@ -1671,6 +1674,9 @@ static ide_startstop_t cdrom_newpc_intr(
>  	xfer_func_t *xferfunc;
>  	unsigned long flags;
>  
> +	if (rq->cmd[0] == 0x12)
> +		printk("ide-cd: newpc %p\n", rq);
> +
>  	/* Check for errors. */
>  	dma_error = 0;
>  	dma = info->dma;
> @@ -1789,6 +1795,8 @@ static ide_startstop_t cdrom_newpc_intr(
>  	return ide_started;
>  
>  end_request:
> +	if (rq->cmd[0] == 0x12)
> +		printk("ide-cd: newpc end INQ %p\n", rq);
>  	if (!rq->data_len)
>  		post_transform_command(rq);
>  
> @@ -1959,7 +1967,13 @@ static ide_startstop_t cdrom_do_block_pc
>  {
>  	struct cdrom_info *info = drive->driver_data;
>  
> -	rq->cmd_flags |= REQ_QUIET;
> +	if (rq->cmd[0] == 0x12) {
> +		printk("ide-cd: starting INQ %p\n", rq);
> +		if (rq_data_dir(rq) == WRITE)
> +			printk("ide-cd: INQ with write set seen\n");
> +	}
> +	if (!rq->bio && rq->biotail)
> +		printk("ide-cd: no bio, but biotail\n");
>  
>  	info->dma = 0;

i applied this patch on top of the old one. this is what i get now:

kernel: ide-cd: starting INQ df5ad074
kernel: ide-cd: INQ with write set seen
kernel: ide-cd: newpc df5ad074
kernel: hdc: write_intr: wrong transfer direction!
kernel: ide-cd: end INQ rq df5ad074
kernel: cdrom_newpc: dev hdc: type=2, flags=104c9
kernel:
kernel: sector 59534648, nr/cnr 0/0
kernel: bio 00000000, biotail dee57c80, buffer 00000000, data 00000000, len 56
kernel: cdb: 12 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00
kernel: ide-cd: starting INQ df5ad074
kernel: ide-cd: newpc df5ad074
kernel: ide-cd: newpc df5ad074
kernel: ide-cd: newpc end INQ df5ad074
kernel: hdc: packet command error: status=0x51 { DriveReady SeekComplete Error }
kernel: hdc: packet command error: error=0xb4 { AbortedCommand LastFailedSense=0x0b }
kernel: ide: failed opcode was: unknown
kernel: ATAPI device hdc:
kernel: Error: Aborted command -- (Sense key=0x0b)
kernel: (reserved error code) -- (asc=0x11, ascq=0x11)
kernel: The failed "Read CD" packet command was:
kernel: "be 00 00 00 51 93 00 00 0d f8 00 00 00 00 00 00 "
kernel: hdc: packet command error: status=0x51 { DriveReady SeekComplete Error }
kernel: hdc: packet command error: error=0x30 { LastFailedSense=0x03 }
kernel: ide: failed opcode was: unknown
kernel: ATAPI device hdc:
kernel: Error: Medium error -- (Sense key=0x03)
kernel: Unrecovered read error -- (asc=0x11, ascq=0x00)
kernel: The failed "Read CD" packet command was:
kernel: "be 00 00 00 51 a0 00 00 07 f8 00 00 00 00 00 00 "
kernel: hdc: packet command error: status=0x51 { DriveReady SeekComplete Error }
kernel: hdc: packet command error: error=0xb4 { AbortedCommand LastFailedSense=0x0b }
kernel: ide: failed opcode was: unknown
kernel: ATAPI device hdc:
kernel: Error: Aborted command -- (Sense key=0x0b)
kernel: (reserved error code) -- (asc=0x11, ascq=0x11)
kernel: The failed "Read CD" packet command was:
kernel: "be 00 00 00 51 9b 00 00 0d f8 00 00 00 00 00 00 "

hdc is the cdrom drive and the errors started showing up when
cdparanoia hung.

--alex--

-- 
| I believe the moment is at hand when, by a paranoiac and active |
| advance of the mind, it will be possible (simultaneously with |
| automatism and other passive states) to systematize confusion |
| and thus to help to discredit completely the world of reality. |
* Re: 2.6.19-rc5: known regressions
2006-11-08 19:29 ` Jens Axboe
2006-11-08 19:38 ` Alex Romosan
@ 2006-11-08 20:03 ` Arjan van de Ven
2006-11-08 20:19 ` Jens Axboe
1 sibling, 1 reply; 91+ messages in thread
From: Arjan van de Ven @ 2006-11-08 20:03 UTC (permalink / raw)
To: Jens Axboe; +Cc: Alex Romosan, linux-kernel

> Wonderful! So this is an INQUIRY command, yet the WRITE bit is set. The
> drive gets really confused about that, for good reason. The question is
> where that write bit comes from, it looks really odd. Additionally, we

it could be a userspace command; some userspace tools send inquiry via
sg...
* Re: 2.6.19-rc5: known regressions
2006-11-08 20:03 ` Arjan van de Ven
@ 2006-11-08 20:19 ` Jens Axboe
0 siblings, 0 replies; 91+ messages in thread
From: Jens Axboe @ 2006-11-08 20:19 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Alex Romosan, linux-kernel

On Wed, Nov 08 2006, Arjan van de Ven wrote:
> > Wonderful! So this is an INQUIRY command, yet the WRITE bit is set. The
> > drive gets really confused about that, for good reason. The question is
> > where that write bit comes from, it looks really odd. Additionally, we
>
> it could be a userspace command; some userspace tools send inquiry via
> sg...

it is a userspace command, it originates from SG_IO. So that is a given.
The question is where the write bit comes from, I'd be puzzled if the
user app sets it - cdparanoia in this case. Seeing as there's other
request mangling, I hope the new debug patch can shed some light on
that.

-- 
Jens Axboe
* Re: 2.6.19-rc5: known regressions
[not found] ` <20061108085235.GT4729@stusta.de>
2006-11-08 9:29 ` [discuss] 2.6.19-rc5: known regressions Jan Beulich
2006-11-08 9:34 ` Jens Axboe
@ 2006-11-08 11:04 ` Eric W. Biederman
2006-11-08 11:32 ` Thomas Gleixner
` (2 subsequent siblings)
5 siblings, 0 replies; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-08 11:04 UTC (permalink / raw)
To: Adrian Bunk
Cc: Andrew Morton, Linux Kernel Mailing List, Bryan O'Sullivan

Adrian Bunk <bunk@stusta.de> writes:

> Subject    : ipath driver MCEs system on load when HT chip present
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7455
> Submitter  : Bryan O'Sullivan <bos@serpentine.com>
> Caused-By  : Eric W. Biederman <ebiederm@xmission.com>
> Handled-By : Bryan O'Sullivan <bos@serpentine.com>
>              Eric W. Biederman <ebiederm@xmission.com>
> Status     : Bryan and Eric are working on fixing the ipath driver

Except for some stupid little issues the fixes are now agreed to.
Just final code reviews and testing are needed.

Eric
* Re: 2.6.19-rc5: known regressions
[not found] ` <20061108085235.GT4729@stusta.de>
` (2 preceding siblings ...)
2006-11-08 11:04 ` Eric W. Biederman
@ 2006-11-08 11:32 ` Thomas Gleixner
[not found] ` <7813413.118221162987983254.komurojun-mbn@nifty.com>
[not found] ` <m1y7qm425l.fsf@ebiederm.dsl.xmission.com>
5 siblings, 0 replies; 91+ messages in thread
From: Thomas Gleixner @ 2006-11-08 11:32 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, mingo, Komuro

On Wed, 2006-11-08 at 09:52 +0100, Adrian Bunk wrote:
> Subject    : SMP kernel can not generate ISA irq properly
> References : http://lkml.org/lkml/2006/10/22/15
> Submitter  : Komuro <komurojun-mbn@nifty.com>
> Handled-By : Thomas Gleixner <tglx@linutronix.de>
> Status     : Thomas is investigating

The problem is not reproducible on any of my boxen.

Komuro, is this still happening on -rc5? If yes, can you please provide
the boot log with "apic=verbose" on the command line?

	tglx
[parent not found: <7813413.118221162987983254.komurojun-mbn@nifty.com>]
* Re: Re: 2.6.19-rc5: known regressions
[not found] ` <7813413.118221162987983254.komurojun-mbn@nifty.com>
@ 2006-11-08 16:00 ` Linus Torvalds
2006-11-10 12:42 ` Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq Komuro
0 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-08 16:00 UTC (permalink / raw)
To: Komuro; +Cc: tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List, mingo

On Wed, 8 Nov 2006, Komuro wrote:
>
> Intel ISA PCIC probe:
> Intel i82365sl B step ISA-to-PCMCIA at port 0x3e0 ofs 0x00, 2 sockets
> host opts [0]: none
> host opts [1]: none
> ISA irqs (scanned) = 3,4,5,7,9,11,15 status change on irq 15

This definitely means that the IRQ subsystem works, at least here. That
"scanned" means that the PCMCIA driver actually tested those interrupts,
and they worked.

At that point, at least.

Of course, the "they worked" test is fairly simple, so it's by no means
foolproof, but in general, it does sound like it all really should be ok.

Komuro, if you're a git user (or are willing to learn), and it's reliable
with one particular card, it really would make most sense to bisect it.
Just start off with

	git bisect start
	git bisect good v2.6.18
	git bisect bad v2.6.19-rc1

and off you go. That's a lot of commits (about 5000), but even if you
don't want to do the 12 or 13 kernel compiles and reboots that are needed
for a full bisection, doing just 4-5 would cut the number down a lot, and
then you can send the bisection log out.

But testing 2.6.19-rc5 is still worth it. The APIC fixes might fix it, or
some other changes might.

		Linus

> warning: process `date' used the removed sysctl system call
> EXT3 FS on hda1, internal journal
> Adding 257032k swap on /dev/hda2. Priority:-1 extents:1 across:257032k
> warning: process `ls' used the removed sysctl system call
> warning: process `sleep' used the removed sysctl system call
> cs: IO port probe 0x100-0x3af: excluding 0x170-0x177 0x290-0x297 0x370-0x37f
> cs: IO port probe 0x3e0-0x4ff: excluding 0x4d0-0x4d7
> cs: IO port probe 0x820-0x8ff: clean.
> cs: IO port probe 0xc00-0xcf7: clean.
> cs: IO port probe 0xa00-0xaff: clean.
> cs: IO port probe 0x100-0x3af: excluding 0x170-0x177 0x290-0x297 0x370-0x37f
> cs: IO port probe 0x3e0-0x4ff: excluding 0x4d0-0x4d7
> cs: IO port probe 0x820-0x8ff: clean.
> cs: IO port probe 0xc00-0xcf7: clean.
> cs: IO port probe 0xa00-0xaff: clean.
>
> Best Regards
> Komuro
>
>
* Re: Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq
2006-11-08 16:00 ` Linus Torvalds
@ 2006-11-10 12:42 ` Komuro
2006-11-13 16:02 ` Linus Torvalds
0 siblings, 1 reply; 91+ messages in thread
From: Komuro @ 2006-11-10 12:42 UTC (permalink / raw)
To: Linus Torvalds
Cc: tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List, mingo

Hi,

>> Intel ISA PCIC probe:
>> Intel i82365sl B step ISA-to-PCMCIA at port 0x3e0 ofs 0x00, 2 sockets
>> host opts [0]: none
>> host opts [1]: none
>> ISA irqs (scanned) = 3,4,5,7,9,11,15 status change on irq 15
>
>This definitely means that the IRQ subsystem works, at least here. That
>"scanned" means that the PCMCIA driver actually tested those interrupts,
>and they worked.
>
>At that point, at least.
>
>Of course, the "they worked" test is fairly simple, so it's by no means
>foolproof, but in general, it does sound like it all really should be ok.
>
>
>But testing 2.6.19-rc5 is still worth it. The APIC fixes might fix it, or
>some other changes might.
>
> Linus

I tried 2.6.19-rc5, and the problem still happens.

But if I remove the disable_irq_nosync() and enable_irq() calls
from linux/drivers/net/pcmcia/axnet_cs.c,
the interrupt is generated properly.

So I think enable_irq() does not enable the irq.

Thanks!

Best Regards
Komuro
* Re: Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq
2006-11-10 12:42 ` Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq Komuro
@ 2006-11-13 16:02 ` Linus Torvalds
2006-11-13 17:11 ` Eric W. Biederman
0 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-13 16:02 UTC (permalink / raw)
To: Komuro
Cc: tglx, Eric W. Biederman, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List, mingo

On Fri, 10 Nov 2006, Komuro wrote:
>
> I tried the 2.6.19-rc5, the problem still happens.

Ok, that's good data, and especially:

> But,
> I remove the disable_irq_nosync() , enable_irq()
> from the linux/drivers/net/pcmcia/axnet_cs.c
> the interrupt is generated properly.

All RIGHT. That's a very good clue. The major difference between PCI and
ISA irq's is that they have different trigger types (they also have
different polarity, but that tends to be just a small detail). In
particular, ISA IRQ's are edge-triggered, and PCI IRQ's are
level-triggered.

Now, edge-triggered interrupts are a _lot_ harder to mask, because the
Intel APIC is an unbelievable piece of sh*t, and has the edge-detect logic
_before_ the mask logic, so if an edge happens _while_ the device is
masked, you'll never ever see the edge ever again (unmasking will not
cause a new edge, so you simply lost the interrupt).

So when you "mask" an edge-triggered IRQ, you can't really mask it at all,
because if you did that, you'd lose it forever if the IRQ comes in while
you masked it. Instead, we're supposed to leave it active, and set a flag,
and IF the IRQ comes in, we just remember it, and mask it at that point
instead, and then on unmasking, we have to replay it by sending a
self-IPI.

Maybe that part got broken by some of the IRQ changes by Eric.

Eric, can you please double-check this all? I suspect you disable
edge-triggered interrupts when moving them, or something, and maybe you
didn't realize that if you disable them on the IO-APIC level, they can be
gone forever.

[ Note: this is true EVEN IF we are in the interrupt handler right then -
  if we get another edge while in the interrupt handler, the interrupt
  will normally be _delayed_ until we've ACK'ed it, but if we have
  _masked_ it, it will simply be lost entirely. So a simple "mask"
  operation is always incorrect for edge-triggered interrupts.

  One option might be to do a simple mask, and on unmask, turn the edge
  trigger into a level trigger at the same time. Then, the first time you
  get the interrupt, you turn it back into an edge trigger _before_ you
  call the interrupt handlers. That might actually be simpler than doing
  the "irq replay" dance with self-IPI, because we can't actually just
  fake the IRQ handling - when enable_irq() is called, irq's are normally
  disabled on the CPU, so we can't just call the irq handler at that
  point: we really do need to "replay" the dang thing.

  Did I mention that the Intel APIC's are a piece of cr*p already? ]

> So I think enable_irq does not enable the irq.

It probably does enable it (that's the easy part), but see above: if any
of the support structure for the APIC crapola is subtly broken, we'll have
lost the IRQ anyway.

(Many other IRQ controllers get this right: the "old and broken" Intel
i8259 interrupt controller was a much better IRQ controller than the APIC
in this regard, because it simply had the edge-detect logic after the
masking logic, so if you unmasked an active interrupt that had been
masked, you would always see it as an edge, and the i8259 controller needs
none of the subtle code at _all_. It just works.)

Anyway, if you _can_ bisect the exact point where this started happening,
that would be good. But I would not be surprised in the least if this is
all introduced by Eric Biederman's dynamic IRQ handling.

Eric?

		Linus
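
The ordering difference described in the message above (edge detection before the mask on the IO-APIC, after the mask on the i8259) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `struct toy_pic`, `toy_update()` and `toy_edge_while_masked()` are hypothetical names invented for this illustration, and the model does not reflect real APIC or i8259 registers or any kernel code.

```c
#include <stdbool.h>

/* Toy model of a single edge-triggered interrupt line.
 * All names here are hypothetical; this is not kernel or hardware code. */
enum detect_order {
	EDGE_BEFORE_MASK,	/* ordering described above for the IO-APIC */
	EDGE_AFTER_MASK		/* ordering described above for the i8259 */
};

struct toy_pic {
	enum detect_order order;
	bool masked;		/* mask bit in the controller */
	bool line_high;		/* level currently driven by the device */
	int delivered;		/* interrupts that reached the CPU */
};

/* Signal as seen by the edge detector. */
static bool detector_input(const struct toy_pic *p)
{
	if (p->order == EDGE_AFTER_MASK)
		return p->line_high && !p->masked;	/* mask gates the input */
	return p->line_high;				/* detector sees the raw line */
}

/* Apply a new line level and mask state, delivering on a rising edge. */
static void toy_update(struct toy_pic *p, bool line_high, bool masked)
{
	bool before = detector_input(p);

	p->line_high = line_high;
	p->masked = masked;

	if (!before && detector_input(p)) {
		/* Edge detected first, then dropped by the mask: lost for good. */
		if (p->order == EDGE_BEFORE_MASK && p->masked)
			return;
		p->delivered++;
	}
}

/* Scenario: mask, device raises the line, unmask. Returns deliveries. */
static int toy_edge_while_masked(enum detect_order order)
{
	struct toy_pic p = { .order = order };

	toy_update(&p, false, true);	/* mask the line */
	toy_update(&p, true, true);	/* edge arrives while masked */
	toy_update(&p, true, false);	/* unmask, line still asserted */
	return p.delivered;
}
```

In this model the edge-before-mask ordering never delivers the interrupt, while the edge-after-mask ordering sees the unmask itself as a rising edge and delivers it once.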
* Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq
2006-11-13 16:02 ` Linus Torvalds
@ 2006-11-13 17:11 ` Eric W. Biederman
2006-11-13 20:44 ` Ingo Molnar
0 siblings, 1 reply; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-13 17:11 UTC (permalink / raw)
To: Linus Torvalds
Cc: Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List, mingo

Linus Torvalds <torvalds@osdl.org> writes:

> On Fri, 10 Nov 2006, Komuro wrote:
>>
>> I tried the 2.6.19-rc5, the problem still happens.
>
> Ok, that's good data, and especially:
>
>> But,
>> I remove the disable_irq_nosync() , enable_irq()
>> from the linux/drivers/net/pcmcia/axnet_cs.c
>> the interrupt is generated properly.
>
> All RIGHT. That's a very good clue. The major difference between PCI and
> ISA irq's is that they have different trigger types (they also have
> different polarity, but that tends to be just a small detail). In
> particular, ISA IRQ's are edge-triggered, and PCI IRQ's are level-
> triggered.
>
> Now, edge-triggered interrupts are a _lot_ harder to mask, because the
> Intel APIC is an unbelievable piece of sh*t, and has the edge-detect logic
> _before_ the mask logic, so if a edge happens _while_ the device is
> masked, you'll never ever see the edge ever again (unmasking will not
> cause a new edge, so you simply lost the interrupt).
>
> So when you "mask" an edge-triggered IRQ, you can't really mask it at all,
> because if you did that, you'd lose it forever if the IRQ comes in while
> you masked it. Instead, we're supposed to leave it active, and set a flag,
> and IF the IRQ comes in, we just remember it, and mask it at that point
> instead, and then on unmasking, we have to replay it by sending a
> self-IPI.
>
> Maybe that part got broken by some of the IRQ changes by Eric.

Hmm. The other possibility is that this is a genirq migration issue.

Yep. That looks like it. In the genirq migration the edge and
level triggered cases got merged and previously disable_edge_ioapic
was a noop. Ouch.

Darn, I missed this one in my review of Ingo's changes.

I'm not at all certain what the correct fix here is.

- Do we make the generic code aware of this messed up case?
  I believe it is aware of part of the "don't disable edge triggered
  interrupts" logic already.

- Do we modify the disable logic so it doesn't actually disable the irq?

- Do we do as Linus suggests and make the enable logic pass through
  a level triggered state?

- Do we split the edge and level triggered cases apart again on
  i386 and x86_64?

And how do we make it drop dead clear what we are doing so that someone
doesn't break this in the future by accident? That, I suspect, was the
real problem. That stupid vector == irq case had introduced so many
levels of abstraction it was nearly impossible to read the code.

Can we get this abstraction right so that we can make obviously correct
code here and still handle all of the weird code bugs?

> Eric, can you please double-check this all? I suspect you disable
> edge-triggered interrupts when moving them, or something, and maybe you
> didn't realize that if you disable them on the IO-APIC level, they can be
> gone forever.

Sure. So the hypothesis is that it is somewhere near
commit e7b946e98a456077dd6897f726f3d6197bd7e3b9 causing the problem.
Anything I have changed in this area should affect both i386 and x86_64.

> [ Note: this is true EVEN IF we are in the interrupt handler right then -
>   if we get another edge while in the interrupt handler, the interrupt
>   will normally be _delayed_ until we've ACK'ed it, but if we have
>   _masked_ it, it will simply be lost entirely. So a simple "mask"
>   operation is always incorrect for edge-triggered interrupts.
>
>   One option might be to do a simple mask, and on unmask, turn the edge
>   trigger into a level trigger at the same time. Then, the first time you
>   get the interrupt, you turn it back into an edge trigger _before_ you
>   call the interrupt handlers. That might actually be simpler than doing
>   the "irq replay" dance with self-IPI, because we can't actually just
>   fake the IRQ handling - when enable_irq() is called, irq's are normally
>   disabled on the CPU, so we can't just call the irq handler at that
>   point: we really do need to "replay" the dang thing.
>
>   Did I mention that the Intel APIC's are a piece of cr*p already? ]

Ok. After a quick skim it appears that there is a disable/enable pair in
the irq migration path for edge triggered interrupts. But we do that work
while the irq is pending and it doesn't look like I changed that part of
the code. Just the level triggered irq migration.

>> So I think enable_irq does not enable the irq.
>
> It probably does enable it (that's the easy part), but see above: if any
> of the support structure for the APIC crapola is subtly broken, we'll have
> lost the IRQ anyway.
>
> (Many other IRQ controllers get this right: the "old and broken" Intel
> i8259 interrupt controller was a much better IRQ controller than the APIC
> in this regard, because it simply had the edge-detect logic after the
> masking logic, so if you unmasked an active interrupt that had been
> masked, you would always see it as an edge, and the i8259 controller needs
> none of the subtle code at _all_. It just works.)
>
> Anyway, if you _can_ bisect the exact point where this started happening,
> that would be good. But I would not be surprised in the least if this is
> all introduced by Eric Biedermans dynamic IRQ handling.

I will share the credit because I missed this in code review, but this is
really Ingo's generic irq code.

Eric
* Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq
2006-11-13 17:11 ` Eric W. Biederman
@ 2006-11-13 20:44 ` Ingo Molnar
2006-11-13 21:11 ` Eric W. Biederman
0 siblings, 1 reply; 91+ messages in thread
From: Ingo Molnar @ 2006-11-13 20:44 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linus Torvalds, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

On Mon, 2006-11-13 at 10:11 -0700, Eric W. Biederman wrote:
> > So when you "mask" an edge-triggered IRQ, you can't really mask it
> at all,
> > because if you did that, you'd lose it forever if the IRQ comes in
> while
> > you masked it. Instead, we're supposed to leave it active, and set a
> flag,
> > and IF the IRQ comes in, we just remember it, and mask it at that
> point
> > instead, and then on unmasking, we have to replay it by sending a
> > self-IPI.
> >
> > Maybe that part got broken by some of the IRQ changes by Eric.
>
> Hmm. The other possibility is that this is a genirq migration issue.
>
> Yep. That looks like it. In the genirq migration the edge and
> level triggered cases got merged and previously disable_edge_ioapic
> was a noop. Ouch.

hm, that should be solved by the generic edge-triggered flow handler as
well: we never mask an IRQ first time around, we only mask it if
we /already/ have the 'soft' IRQ_PENDING flag set. (in that case the
lost edge is not an issue because we have the information already - and
the masking will prevent a screaming edge source)

but maybe this concept has not been pushed through to the disable/enable
irq logic itself? (it's only present in the flow handler) Thomas, do you
concur?

	Ingo
* Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq
2006-11-13 20:44 ` Ingo Molnar
@ 2006-11-13 21:11 ` Eric W. Biederman
2006-11-14 8:14 ` [patch] irq: do not mask interrupts by default Ingo Molnar
0 siblings, 1 reply; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-13 21:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

Ingo Molnar <mingo@redhat.com> writes:

> On Mon, 2006-11-13 at 10:11 -0700, Eric W. Biederman wrote:
>> > So when you "mask" an edge-triggered IRQ, you can't really mask it
>> at all,
>> > because if you did that, you'd lose it forever if the IRQ comes in
>> while
>> > you masked it. Instead, we're supposed to leave it active, and set a
>> flag,
>> > and IF the IRQ comes in, we just remember it, and mask it at that
>> point
>> > instead, and then on unmasking, we have to replay it by sending a
>> > self-IPI.
>> >
>> > Maybe that part got broken by some of the IRQ changes by Eric.
>>
>> Hmm. The other possibility is that this is a genirq migration issue.
>>
>> Yep. That looks like it. In the genirq migration the edge and
>> level triggered cases got merged and previously disable_edge_ioapic
>> was a noop. Ouch.
>
> hm, that should be solved by the generic edge-triggered flow handler as
> well: we never mask an IRQ first time around, we only mask it if
> we /already/ have the 'soft' IRQ_PENDING flag set. (in that case the
> lost edge is not an issue because we have the information already - and
> the masking will prevent a screaming edge source)
>
> but maybe this concept has not been pushed through to the disable/enable
> irq logic itself? (it's only present in the flow handler) Thomas, do you
> concur?

I just looked. I think the logic is actually in there as well.
I keep forgetting disable != mask.

It looks like what is really missing is that we aren't setting
IRQ_DELAYED_DISABLE. So I think what we really need to do is just
set IRQ_DELAYED_DISABLE.

Does the patch below look right?

Eric

diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index 41bfc49..14654e6 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -790,9 +790,11 @@ static void ioapic_register_intr(int irq
 			trigger == IOAPIC_LEVEL)
 		set_irq_chip_and_handler_name(irq, &ioapic_chip,
 					      handle_fasteoi_irq, "fasteoi");
-	else
+	else {
+		irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
 		set_irq_chip_and_handler_name(irq, &ioapic_chip,
 					      handle_edge_irq, "edge");
+	}
 }
 
 static void __init setup_IO_APIC_irqs(void)
* [patch] irq: do not mask interrupts by default
2006-11-13 21:11 ` Eric W. Biederman
@ 2006-11-14 8:14 ` Ingo Molnar
2006-11-14 8:20 ` Arjan van de Ven
` (2 more replies)
0 siblings, 3 replies; 91+ messages in thread
From: Ingo Molnar @ 2006-11-14 8:14 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linus Torvalds, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

On Mon, 2006-11-13 at 14:11 -0700, Eric W. Biederman wrote:
> -	else
> +	else {
> +		irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
> 		set_irq_chip_and_handler_name(irq, &ioapic_chip,
> 					      handle_edge_irq,
> "edge");
> +	}
> }

yeah. Komuro, could you try my patch below - Eric's patch only updates
x86_64 while your failure was on the i386 kernel.

Note, i also took another approach to fix this problem, that should
cover both the case found by Komuro, and some other cases as well. The
theory is this:

1) disable_irq() is relatively rare (used in about 10% of drivers, but
there it's overwhelmingly used in some slowpath) so it's performance
uncritical.

2) missing an IRQ while the line is masked is often a lethal regression
to the user. An IRQ could be missed even if we think that the IRQ line
is 'level-triggered'.

so my patch changes the default irq-disable logic of /all/ controllers
to "delayed disable". (IRQ chips can still override this by providing a
different chip->disable method that just clones their ->mask method, if
it is absolutely sure that no IRQs can be lost while masked)

So this patch has the worst-case effect of getting at most one 'extra'
interrupt after the IRQ line has been 'disabled' - at which point the
line will be masked for real (by the flow handler). (I updated the
fasteoi and the simple irq flow handlers to mask the IRQ for real if an
IRQ triggers and the line was disabled.)

It's a bit late in the -rc cycle for a change like this, but i'm fairly
positive about it. I booted it on a couple of boxes and saw no badness.
(neither did i see any increase in IRQ rates)

	Ingo

NOTE: this also means that the old IRQ_DELAYED_DISABLE bit can probably
be scrapped - i'll do that later on in a separate mail, if this patch
works out fine.

------------>
Subject: irq: do not mask interrupts by default
From: Ingo Molnar <mingo@elte.hu>

never mask interrupts immediately upon request. Disabling
interrupts in high-performance codepaths is rare, and on
the other hand this change could recover lost edges (or
even other types of lost interrupts) by conservatively
only masking interrupts after they happen. (NOTE: with
this change the highlevel irq-disable code still soft-disables
this IRQ line - and if such an interrupt happens then the
IRQ flow handler keeps the IRQ masked.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/irq/chip.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

Index: linux/kernel/irq/chip.c
===================================================================
--- linux.orig/kernel/irq/chip.c
+++ linux/kernel/irq/chip.c
@@ -202,10 +202,6 @@ static void default_enable(unsigned int
  */
 static void default_disable(unsigned int irq)
 {
-	struct irq_desc *desc = irq_desc + irq;
-
-	if (!(desc->status & IRQ_DELAYED_DISABLE))
-		desc->chip->mask(irq);
 }
 
 /*
@@ -272,8 +268,11 @@ handle_simple_irq(unsigned int irq, stru
 	kstat_cpu(cpu).irqs[irq]++;
 
 	action = desc->action;
-	if (unlikely(!action || (desc->status & IRQ_DISABLED)))
+	if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
+		if (desc->chip->mask)
+			desc->chip->mask(irq);
 		goto out_unlock;
+	}
 
 	desc->status |= IRQ_INPROGRESS;
 	spin_unlock(&desc->lock);
@@ -366,11 +365,13 @@ handle_fasteoi_irq(unsigned int irq, str
 
 	/*
 	 * If its disabled or no action available
-	 * keep it masked and get out of here
+	 * then mask it and get out of here:
 	 */
 	action = desc->action;
 	if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
 		desc->status |= IRQ_PENDING;
+		if (desc->chip->mask)
+			desc->chip->mask(irq);
 		goto out;
 	}
 
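
As a rough illustration of the "delayed disable" idea discussed in this part of the thread, the toy model below compares masking in hardware right away with only soft-disabling the line and masking when an interrupt actually arrives. It is a sketch under stated assumptions: `struct toy_irq` and the `toy_*` functions are hypothetical names for this illustration, the hardware is assumed to drop edges that arrive while masked, and the replay is modelled as a direct handler call rather than the resend the thread describes. It is not the kernel's genirq code.

```c
#include <stdbool.h>

/* Toy model of one edge-triggered IRQ descriptor (hypothetical names). */
struct toy_irq {
	bool soft_disabled;	/* stands in for the IRQ_DISABLED status bit */
	bool pending;		/* stands in for the IRQ_PENDING status bit */
	bool hw_masked;		/* mask bit in the toy controller */
	int handled;		/* times the driver's handler ran */
};

/* disable: always soft-disable; mask in hardware only if not delayed. */
static void toy_disable(struct toy_irq *d, bool delayed)
{
	d->soft_disabled = true;
	if (!delayed)
		d->hw_masked = true;
}

/* An edge arrives from the device. */
static void toy_edge(struct toy_irq *d)
{
	if (d->hw_masked)
		return;			/* assumed hardware: edge is lost */
	if (d->soft_disabled) {
		d->pending = true;	/* remember it for later */
		d->hw_masked = true;	/* now mask for real */
		return;
	}
	d->handled++;
}

/* enable: unmask and replay a remembered interrupt, if any. */
static void toy_enable(struct toy_irq *d)
{
	d->soft_disabled = false;
	d->hw_masked = false;
	if (d->pending) {
		d->pending = false;
		d->handled++;		/* stand-in for the replay */
	}
}

/* Scenario: disable, one edge arrives, enable. Returns handler runs. */
static int toy_run(bool delayed)
{
	struct toy_irq d = {0};

	toy_disable(&d, delayed);
	toy_edge(&d);
	toy_enable(&d);
	return d.handled;
}
```

With an immediate hardware mask the edge is never handled in this model; with the delayed variant it is remembered and handled once after enable, at the cost of one interrupt being taken while the line is nominally disabled.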
* Re: [patch] irq: do not mask interrupts by default
2006-11-14 8:14 ` [patch] irq: do not mask interrupts by default Ingo Molnar
@ 2006-11-14 8:20 ` Arjan van de Ven
2006-11-14 12:43 ` Komuro
2006-11-14 16:10 ` Linus Torvalds
2 siblings, 0 replies; 91+ messages in thread
From: Arjan van de Ven @ 2006-11-14 8:20 UTC (permalink / raw)
To: Ingo Molnar
Cc: Eric W. Biederman, Linus Torvalds, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

> so my patch changes the default irq-disable logic of /all/ controllers
> to "delayed disable". (IRQ chips can still override this by providing a
> different chip->disable method that just clones their ->mask method, if
> it is absolutely sure that no IRQs can be lost while masked)
>
> So this patch has the worst-case effect of getting at most one 'extra'
> interrupt after the IRQ line has been 'disabled' - at which point the
> line will be masked for real (by the flow handler). (I updated the
> fasteoi and the simple irq flow handlers to mask the IRQ for real if an
> IRQ triggers and the line was disabled.)

since disable_irq() is used as locking against interrupt context by
several drivers (*cough* ne2000 *cough*) I am not entirely convinced this
is a good idea....
* Re: [patch] irq: do not mask interrupts by default
2006-11-14 8:14 ` [patch] irq: do not mask interrupts by default Ingo Molnar
2006-11-14 8:20 ` Arjan van de Ven
@ 2006-11-14 12:43 ` Komuro
2006-11-14 16:10 ` Linus Torvalds
2 siblings, 0 replies; 91+ messages in thread
From: Komuro @ 2006-11-14 12:43 UTC (permalink / raw)
To: Ingo Molnar
Cc: Eric W. Biederman, Linus Torvalds, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

Dear Ingo

I tried your patch with 2.6.19-rc5.
The irq is generated properly.

Thanks!

Best Regards
Komuro

>>
>------------>
>Subject: irq: do not mask interrupts by default
>From: Ingo Molnar <mingo@elte.hu>
>
>never mask interrupts immediately upon request. Disabling
>interrupts in high-performance codepaths is rare, and on
>the other hand this change could recover lost edges (or
>even other types of lost interrupts) by conservatively
>only masking interrupts after they happen. (NOTE: with
>this change the highlevel irq-disable code still soft-disables
>this IRQ line - and if such an interrupt happens then the
>IRQ flow handler keeps the IRQ masked.)
>
>Signed-off-by: Ingo Molnar <mingo@elte.hu>
>---
> kernel/irq/chip.c |   13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
>Index: linux/kernel/irq/chip.c
>===================================================================
>--- linux.orig/kernel/irq/chip.c
>+++ linux/kernel/irq/chip.c
>@@ -202,10 +202,6 @@ static void default_enable(unsigned int
>  */
> static void default_disable(unsigned int irq)
> {
>-	struct irq_desc *desc = irq_desc + irq;
>-
>-	if (!(desc->status & IRQ_DELAYED_DISABLE))
>-		desc->chip->mask(irq);
> }
> 
> /*
>@@ -272,8 +268,11 @@ handle_simple_irq(unsigned int irq, stru
> 	kstat_cpu(cpu).irqs[irq]++;
> 
> 	action = desc->action;
>-	if (unlikely(!action || (desc->status & IRQ_DISABLED)))
>+	if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
>+		if (desc->chip->mask)
>+			desc->chip->mask(irq);
> 		goto out_unlock;
>+	}
> 
> 	desc->status |= IRQ_INPROGRESS;
> 	spin_unlock(&desc->lock);
>@@ -366,11 +365,13 @@ handle_fasteoi_irq(unsigned int irq, str
> 
> 	/*
> 	 * If its disabled or no action available
>-	 * keep it masked and get out of here
>+	 * then mask it and get out of here:
> 	 */
> 	action = desc->action;
> 	if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
> 		desc->status |= IRQ_PENDING;
>+		if (desc->chip->mask)
>+			desc->chip->mask(irq);
> 		goto out;
> 	}
>
>
>
* Re: [patch] irq: do not mask interrupts by default
2006-11-14 8:14 ` [patch] irq: do not mask interrupts by default Ingo Molnar
2006-11-14 8:20 ` Arjan van de Ven
2006-11-14 12:43 ` Komuro
@ 2006-11-14 16:10 ` Linus Torvalds
2006-11-14 17:52 ` [PATCH] Use delayed disable mode of ioapic edge triggered interrupts Eric W. Biederman
[not found] ` <20061115090427.GA16173@elte.hu>
2 siblings, 2 replies; 91+ messages in thread
From: Linus Torvalds @ 2006-11-14 16:10 UTC (permalink / raw)
To: Ingo Molnar
Cc: Eric W. Biederman, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

On Tue, 14 Nov 2006, Ingo Molnar wrote:
>
> 1) disable_irq() is relatively rare (used in about 10% of drivers, but
> there it's overwhelmingly used in some slowpath) so it's performance
> uncritical.

Well, the thing is, the _replay_ if it does happen, is going to be really
really slow compared to the masking. So at that point, it may well be a
net performance downside if the masking is going to almost always have an
interrupt happen while the thing is masked. I dunno.

There's another thing too: For level-triggered interrupts, I _really_
don't think we should do this. The code inside the masked region is
sometimes "setup code", which will do things that _will_ raise an
interrupt, but may read the status register or whatever to then unraise
it.

So in that case, your patch will generate different behaviour, something
that I really don't want to introduce at this point in the 2.6.19 series.

> 2) missing an IRQ while the line is masked is often a lethal regression
> to the user. An IRQ could be missed even if we think that the IRQ line
> is 'level-triggered'.

If it's level-triggered, it's going to be missed only if it's de-asserted
by code inside the masked region, and that is what we have always done on
purpose, so "missing" it is the right thing to do. It's what we have
tested all level-triggered interrupts with for the last 15+ years, and
it's been part of the semantics for masking.

So I absolutely do _not_ think your change is improved semantics. It's
new semantics, and illogical. If the driver masked the irq line, did some
testing that raises and clears it again ("let's check if this version of
the chip raises the interrupt when we do XYZZY"), then the logical thing
to do would be to not cause the interrupt to happen.

Of course, for edge-triggered APIC interrupts, we _have_ to replay the irq
(since we don't have any way of even *knowing* whether we might get it
again), but for level-triggered and for the old legacy i8259 controller
that gets it right for edges anyway, we should _not_ send the spurious
interrupt that is no longer active.

And a lot of code has been tested with either just the i8259 (old machines
without any APIC) or with PCI-only devices (which are always
level-triggered), so the fact that edge-triggered things have always seen
the potential for spurious interrupts is not a reason to say "well, they
have to handle it anyway". True PCI drivers generally do _not_ have to
handle the crazy case, and generally have never seen it.

> so my patch changes the default irq-disable logic of /all/ controllers
> to "delayed disable". (IRQ chips can still override this by providing a
> different chip->disable method that just clones their ->mask method, if
> it is absolutely sure that no IRQs can be lost while masked)

I really think we should do this just for APIC edge triggered interrupts,
ie keep the old behaviour.

Also, I worry a bit about the patch:

> @@ -272,8 +268,11 @@ handle_simple_irq(unsigned int irq, stru
>         kstat_cpu(cpu).irqs[irq]++;
>
>         action = desc->action;
> -       if (unlikely(!action || (desc->status & IRQ_DISABLED)))
> +       if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
> +               if (desc->chip->mask)
> +                       desc->chip->mask(irq);
>                 goto out_unlock;
> +       }

The simple-irq case too? That's not even going to replay the thing? So
now you just mask (without replaying) simple irqs, but then the other
irqs you mask and replay..

See above on why I don't think this is necessarily a bug (since masking is
almost always the right thing _anyway_), but now it will *STILL* depend on
some internal implementation decision on whether the replay happens at
all.

I'd much rather have the replay decision be based on hard physical data:
we replay _only_ for edge-triggered interrupts, and _only_ for controllers
that need it.

In other words, I think we should just make APIC-edge have the "please
delay masking and replay" bit, and nobody else.

Can you send that patch (for both x86 and x86-64), and we can ask Komuro
to test it. That would be the "same behaviour as we've always had" thing,
which I think is also the _right_ behaviour.

Linus

^ permalink raw reply [flat|nested] 91+ messages in thread
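The difference between masking immediately and "delayed disable" on an
edge-triggered line can be made concrete with a small userspace model.
This is only a sketch under stated assumptions, not kernel code: the
struct and the model_* functions are hypothetical names, and the
controller is assumed to drop any edge that arrives while the line is
masked in hardware (the APIC-edge case described above).

```c
#include <stdbool.h>

/* Hypothetical userspace model of one edge-triggered IRQ line. */
struct model_irq {
    bool soft_disabled; /* the driver asked for the line to be disabled */
    bool hw_masked;     /* the line is masked at the controller */
    bool pending;       /* an edge arrived while disabled; replay later */
    int handled;        /* number of times the handler ran */
    int dropped;        /* edges the controller discarded while masked */
};

/* delayed == true: only mark the line disabled.
 * delayed == false: also mask it at the controller right away. */
static void model_disable(struct model_irq *d, bool delayed)
{
    d->soft_disabled = true;
    if (!delayed)
        d->hw_masked = true;
}

/* The device raises an edge. */
static void model_edge(struct model_irq *d)
{
    if (d->hw_masked) {
        /* Assumption: a masked edge is simply gone. */
        d->dropped++;
        return;
    }
    if (d->soft_disabled) {
        /* Flow-handler step: remember the edge, mask for real now. */
        d->pending = true;
        d->hw_masked = true;
        return;
    }
    d->handled++;
}

/* Re-enable the line and replay one remembered edge, if any. */
static void model_enable(struct model_irq *d)
{
    d->soft_disabled = false;
    d->hw_masked = false;
    if (d->pending) {
        d->pending = false;
        d->handled++;
    }
}
```

In this model, two edges during a delayed-disable window end in one
replayed handler call, while the same two edges under immediate masking
are both dropped and nothing is replayed. A level-triggered line would
not need the replay at all, because a still-asserted level shows up again
by itself once the line is unmasked.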
* [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-14 16:10 ` Linus Torvalds
@ 2006-11-14 17:52 ` Eric W. Biederman
2006-11-14 23:35 ` Linus Torvalds
` (2 more replies)
[not found] ` <20061115090427.GA16173@elte.hu>
1 sibling, 3 replies; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-14 17:52 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

Linus Torvalds <torvalds@osdl.org> writes:

> Of course, for edge-triggered APIC interrupts, we _have_ to replay the irq
> (since we don't have any way of even *knowing* whether we might get it
> again), but for level-triggered and for the old legacy i8259 controller
> that gets it right for edges anyway, we should _not_ send the spurious
> interrupt that is no longer active.
>
> And a lot of code has been tested with either just the i8259 (old machines
> without any APIC) or with PCI-only devices (which are always
> level-triggered), so the fact that edge-triggered things have always seen
> the potential for spurious interrupts is not a reason to say "well, they
> have to handle it anyway". True PCI drivers generally do _not_ have to
> handle the crazy case, and generally have never seen it.
>
> In other words, I think we should just make APIC-edge have the "please
> delay masking and replay" bit, and nobody else.
>
> Can you send that patch (for both x86 and x86-64), and we can ask Komuro
> to test it. That would be the "same behaviour as we've always had" thing,
> which I think is also the _right_ behaviour.

Hopefully this is the trivial patch that solves the problem.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

diff --git a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
index ad84bc2..3b7a63e 100644
--- a/arch/i386/kernel/io_apic.c
+++ b/arch/i386/kernel/io_apic.c
@@ -1287,9 +1287,11 @@ static void ioapic_register_intr(int irq
                        trigger == IOAPIC_LEVEL)
                set_irq_chip_and_handler_name(irq, &ioapic_chip,
                                         handle_fasteoi_irq, "fasteoi");
-       else
+       else {
+               irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
                set_irq_chip_and_handler_name(irq, &ioapic_chip,
                                         handle_edge_irq, "edge");
+       }
        set_intr_gate(vector, interrupt[irq]);
 }

diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index 41bfc49..14654e6 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -790,9 +790,11 @@ static void ioapic_register_intr(int irq
                        trigger == IOAPIC_LEVEL)
                set_irq_chip_and_handler_name(irq, &ioapic_chip,
                                         handle_fasteoi_irq, "fasteoi");
-       else
+       else {
+               irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
                set_irq_chip_and_handler_name(irq, &ioapic_chip,
                                         handle_edge_irq, "edge");
+       }
 }

 static void __init setup_IO_APIC_irqs(void)

^ permalink raw reply related [flat|nested] 91+ messages in thread
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-14 17:52 ` [PATCH] Use delayed disable mode of ioapic edge triggered interrupts Eric W. Biederman
@ 2006-11-14 23:35 ` Linus Torvalds
2006-11-15 1:17 ` Linus Torvalds
2006-11-15 12:40 ` Komuro
2 siblings, 0 replies; 91+ messages in thread
From: Linus Torvalds @ 2006-11-14 23:35 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

On Tue, 14 Nov 2006, Eric W. Biederman wrote:
>
> Hopefully this is the trivial patch that solves the problem.

Komuro, can you check this patch _instead_ of the one from Ingo (ie not
together with, since that combination won't tell us anything new - if
Ingo's patch is there too, the new patch will basically be a no-op).

Linus

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-14 17:52 ` [PATCH] Use delayed disable mode of ioapic edge triggered interrupts Eric W. Biederman
2006-11-14 23:35 ` Linus Torvalds
@ 2006-11-15 1:17 ` Linus Torvalds
2006-11-15 5:14 ` Eric W. Biederman
2006-11-15 12:40 ` Komuro
2 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-15 1:17 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

On Tue, 14 Nov 2006, Eric W. Biederman wrote:
>
> Hopefully this is the trivial patch that solves the problem.

Ok, having looked more at this, I have to say that the whole
"IRQ_DELAYED_DISABLE" thing seems very fragile indeed.

It looks like we should do it not only for APIC edge-triggered interrupts,
but for HT and MSI interrupts too, as far as I can tell (at least they
also use the "handle_edge_irq" routine)

So I'm wondering how many other cases there are that are missing this.

In that sense, Ingo's patch was a lot safer, although I still dislike it
for all the other reasons I mentioned - it's simply wrong to re-send a
level-triggered irq.

I don't know MSI and HT interrupts well enough to tell whether they will
re-trigger on their own when we unmask them, but the point is, this
_looks_ like it might be incomplete.

I think part of the problem is a bad interface. We should simply never set
the IRQ handler on its own. It should be a field in the "irq_chip"
structure, and we should use _different_ irq chip structures for level and
edge-triggered. Then we should also add the "flags" thing there, and you
could do something like

        static struct irq_chip level_ioapic_chip = {
                ..

instead of making the insane decision to use the "same" chip for all
ioapic things.

Ingo? Eric? Comments?

Linus

^ permalink raw reply [flat|nested] 91+ messages in thread
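The "separate chip structures for level and edge" idea from the message
above could look roughly like the following userspace sketch. Everything
here except the name level_ioapic_chip (which appears in the message) is
an assumption made for illustration: the struct layout, the flag name,
the handler signatures and sketch_pick_chip() are hypothetical and are
not the kernel's struct irq_chip.

```c
#include <stddef.h>

/* Hypothetical flag: delay the hardware mask and replay on enable. */
#define SKETCH_DELAYED_DISABLE 0x1u

/* Hypothetical chip descriptor that carries its own flow handler and
 * flags, so the handler is never set separately from the chip. */
struct sketch_irq_chip {
    const char *name;
    void (*handle)(unsigned int irq);
    unsigned int flags;
};

static int sketch_level_calls;
static int sketch_edge_calls;

static void sketch_handle_level(unsigned int irq)
{
    (void)irq;
    sketch_level_calls++;
}

static void sketch_handle_edge(unsigned int irq)
{
    (void)irq;
    sketch_edge_calls++;
}

/* One chip per trigger type: level never replays, edge always does. */
static const struct sketch_irq_chip level_ioapic_chip = {
    .name = "IO-APIC-level",
    .handle = sketch_handle_level,
    .flags = 0,
};

static const struct sketch_irq_chip edge_ioapic_chip = {
    .name = "IO-APIC-edge",
    .handle = sketch_handle_edge,
    .flags = SKETCH_DELAYED_DISABLE,
};

/* Registration picks a chip; handler and flags come along with it. */
static const struct sketch_irq_chip *sketch_pick_chip(int level_triggered)
{
    return level_triggered ? &level_ioapic_chip : &edge_ioapic_chip;
}
```

The point of the layout is that a caller cannot combine the edge handler
with a missing delayed-disable flag by accident, which is the kind of
omission the message worries about for HT and MSI.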
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-15 1:17 ` Linus Torvalds
@ 2006-11-15 5:14 ` Eric W. Biederman
2006-11-15 16:06 ` Linus Torvalds
0 siblings, 1 reply; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-15 5:14 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 14 Nov 2006, Eric W. Biederman wrote:
>>
>> Hopefully this is the trivial patch that solves the problem.
>
> Ok, having looked more at this, I have to say that the whole
> "IRQ_DELAYED_DISABLE" thing seems very fragile indeed.
>
> It looks like we should do it not only for APIC edge-triggered interrupts,
> but for HT and MSI interrupts too, as far as I can tell (at least they
> also use the "handle_edge_irq" routine)
>
> So I'm wondering how many other cases there are that are missing this.

I think it is a good question.

The big one I did not set it on is the interrupt if it comes in
through ExtInt. I assume the 8259 is sane but I may be wrong.

> In that sense, Ingo's patch was a lot safer, although I still dislike it
> for all the other reasons I mentioned - it's simply wrong to re-send a
> level-triggered irq.
>
> I don't know MSI and HT interrupts well enough to tell whether they will
> re-trigger on their own when we unmask them, but the point is, this
> _looks_ like it might be incomplete.

Yes. I think there is an interrupt status bit there. For at least
one case in MSI we don't have a disable at all.

The truth is in practice I don't think it matters because I don't
think anyone actually disables MSI or hypertransport interrupts.
If it was going to change it would probably change per card.

But the real truth is that the hardware device knows what is going
on. The interrupt message is sent by the hardware device or it
is not. This isn't a case of can we detect an interrupt being raised
by the device while we disabled the interrupt at the device. This
is a case of we disable the interrupt at the device. So I think the
whole question of do we detect an interrupt raised by the device while
we have disabled interrupts on the device is silly.

So until I learn more I am going to assume that MSI and hypertransport
interrupts are sane like 8259 interrupts. If that makes sense.

> I think part of the problem is a bad interface. We should simply never set
> the IRQ handler on its own. It should be a field in the "irq_chip"
> structure, and we should use _different_ irq chip structures for level and
> edge-triggered. Then we should also add the "flags" thing there, and you
> could do something like
>
>       static struct irq_chip level_ioapic_chip = {
>               ..
>
> instead of making the insane decision to use the "same" chip for all
> ioapic things.

I think there is probably a sensible case for a separate structure.

At this point I have two questions.
- What is the easiest path to get us to a stable 2.6.19 where
  everything works?

  I don't think that is backing out genirq. But I haven't looked at all
  of these corner cases.

  I think for 2.6.19 we can get away with just my stupid patch,
  or some simple variation of it.

- What is the sanest thing for long term maintenance, of irqs?

  genirq is less code to maintain overall (a plus).
  genirq helps us do things across architectures, which is nice.
  genirq is also a little convoluted to read and to use (a downside).

  My gut feel is that there is room for a lot more cleanup in this
  area but we probably need to stabilize what we have.

Since you aren't complaining about what the code actually does but
rather how the interface looks, I have a proposal. I assert that
the interface for registering an irq is much too general, and broad.

Instead of having:

        irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
        set_irq_chip_and_handler_name(irq, &ioapic_chip,
                                      handle_edge_irq, "edge");

We should have a set of helper functions one for each common type
of interrupt.

        set_irq_edge_lossy(irq, &ioapic_chip);
        set_irq_edge(irq, &ioapic_chip);
        set_irq_level(irq, &ioapic_chip);

The more stupid parameters we have to set the more likely an implementor
is to get it wrong.

Although I do agree to some extent it has been a bit of a strain having
both edge and level triggered interrupts with the same methods.

So if our goal is to make an even simpler interface than what we have
now I will be happy. Hopefully we can do all of this in helper
functions instead of having to rip up all of the interrupt
infrastructure one more time.

I really don't know. I'm tired and I want to see this code work.

Eric

^ permalink raw reply [flat|nested] 91+ messages in thread
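The three proposed helpers can be sketched as thin wrappers over one
registration routine. This is a userspace mock, not kernel code: the
mock_* types, the descriptor table and the flag value are assumptions,
and the helper bodies are a guess at what the proposal implies. In
particular, treating "lossy" as "the controller drops edges while masked,
so delayed disable is needed" is an interpretation of the names, which
the message does not spell out.

```c
#include <stddef.h>

#define MOCK_NR_IRQS 16
#define MOCK_IRQ_DELAYED_DISABLE 0x1u /* stand-in for IRQ_DELAYED_DISABLE */

struct mock_chip {
    const char *name;
};

typedef void (*mock_flow_handler)(unsigned int irq);

/* Hypothetical per-IRQ descriptor holding what registration sets up. */
struct mock_desc {
    const struct mock_chip *chip;
    mock_flow_handler handler;
    const char *handler_name;
    unsigned int status;
};

static struct mock_desc mock_irq_desc[MOCK_NR_IRQS];
static const struct mock_chip mock_ioapic_chip = { "IO-APIC" };

static void mock_handle_edge(unsigned int irq) { (void)irq; }
static void mock_handle_level(unsigned int irq) { (void)irq; }

/* The general routine: every parameter is the caller's responsibility. */
static void mock_set_chip_and_handler(unsigned int irq,
                                      const struct mock_chip *chip,
                                      mock_flow_handler handler,
                                      const char *name,
                                      unsigned int status)
{
    mock_irq_desc[irq].chip = chip;
    mock_irq_desc[irq].handler = handler;
    mock_irq_desc[irq].handler_name = name;
    mock_irq_desc[irq].status = status;
}

/* Edge-triggered, controller may lose edges while masked. */
static void set_irq_edge_lossy(unsigned int irq, const struct mock_chip *chip)
{
    mock_set_chip_and_handler(irq, chip, mock_handle_edge, "edge",
                              MOCK_IRQ_DELAYED_DISABLE);
}

/* Edge-triggered, controller latches edges correctly while masked. */
static void set_irq_edge(unsigned int irq, const struct mock_chip *chip)
{
    mock_set_chip_and_handler(irq, chip, mock_handle_edge, "edge", 0);
}

/* Level-triggered: a still-asserted line re-raises by itself. */
static void set_irq_level(unsigned int irq, const struct mock_chip *chip)
{
    mock_set_chip_and_handler(irq, chip, mock_handle_level, "level", 0);
}
```

With wrappers like these, the caller names the kind of interrupt once and
the flag, handler and handler name follow from that choice, which is the
"fewer parameters to get wrong" argument made above.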
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-15 5:14 ` Eric W. Biederman
@ 2006-11-15 16:06 ` Linus Torvalds
2006-11-15 16:58 ` Eric W. Biederman
0 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-15 16:06 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

On Tue, 14 Nov 2006, Eric W. Biederman wrote:
>
> The big one I did not set it on is the interrupt if it comes in
> through ExtInt. I assume the 8259 is sane but I may be wrong.

Yes, ExtInt is ok, if you actually mask it at the 8259. As mentioned
earlier in the thread, the i8259 has its edge detect logic _after_ the
masking logic, so if the irq is still active, and you unmask it, it will
see an edge, and re-assert the interrupt in hardware.

So the i8259 is a good interrupt controller, and does not need delayed
disable and software logic to re-assert the irq.

> The truth is in practice I don't think it matters because I don't
> think anyone actually disables MSI or hypertransport interrupts.

Fair enough, at least for a 2.6.19 kind of release timeframe (and that is
what I worry about most, at least right now).

> At this point I have two questions.
> - What is the easiest path to get us to a stable 2.6.19 where
>   everything works?

If people don't expect HT and MSI interrupts to be masked (and I can well
imagine that), then I think your two-liner patch is good to go. Komuro
seems to have acked it already, and in many ways that's the "minimal
change" for 2.6.19 right now.

I do like Ingo's patch because it seems "safe" (even if I think it might
be a bit _overly_ safe), but it changes semantics enough that I don't like
it for 2.6.19. Even his second version definitely changes semantics for
level-triggered PCI interrupts, even though he fixed ExtInt/i8259 ones.

So I think I'll go with your patch for now, and we can re-visit Ingo's
thing after 2.6.19.

> - What is the sanest thing for long term maintenance, of irqs?
>
>   genirq is less code to maintain overall (a plus).

Oh, I absolutely think genirq is the right thing to do. No question at
all. I just think that we might want to refactor the code somewhat, and in
particular I suspect that many irq controller drivers should use separate
"struct irq_chip" entries for edge and level, because they are
fundamentally different.

>   My gut feel is that there is room for a lot more cleanup in this
>   area but we probably need to stabilize what we have.

Exactly. Baby steps. Make it work. Then clean up. Slowly.

> Since you aren't complaining about what the code actually does but
> rather how the interface looks, I have a proposal. I assert that
> the interface for registering an irq is much too general, and broad.
>
> Instead of having:
>
>       irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
>       set_irq_chip_and_handler_name(irq, &ioapic_chip,
>                                     handle_edge_irq, "edge");
>
> We should have a set of helper functions one for each common type
> of interrupt.
>
>       set_irq_edge_lossy(irq, &ioapic_chip);
>       set_irq_edge(irq, &ioapic_chip);
>       set_irq_level(irq, &ioapic_chip);

Yeah, that might be a fine way too. That's largely what we do for the IO
schedulers, and it's been fairly successful. Start out by setting common
defaults, and then allow chip drivers to specify particular details
explicitly.

Ingo?

Linus

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-15 16:06 ` Linus Torvalds
@ 2006-11-15 16:58 ` Eric W. Biederman
0 siblings, 0 replies; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-15 16:58 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 14 Nov 2006, Eric W. Biederman wrote:
>
>> The truth is in practice I don't think it matters because I don't
>> think anyone actually disables MSI or hypertransport interrupts.
>
> Fair enough, at least for a 2.6.19 kind of release timeframe (and that is
> what I worry about most, at least right now).
>
>> At this point I have two questions.
>> - What is the easiest path to get us to a stable 2.6.19 where
>>   everything works?
>
> If people don't expect HT and MSI interrupts to be masked (and I can well
> imagine that), then I think your two-liner patch is good to go. Komuro
> seems to have acked it already, and in many ways that's the "minimal
> change" for 2.6.19 right now.

Well I just double-checked this assertion.

The one driver that uses the hypertransport irqs doesn't call
disable_irq.

On the msi side at least the forcedeth driver does call disable_irq
when in msi mode. I just double-checked the historical behavior of
the msi code and it has never done the delayed disable thing. So
not doing it there is not a regression.

The MSI case is different. MSI is fundamentally about non-shared
interrupts, and interrupts that don't race with your DMAs. So with
MSI you don't need a status register read to process the interrupt.

In the context of Ingo's patch I don't like the idea of saddling MSI
interrupts down with the best in class workarounds for a completely
different hardware interrupt model. Although I don't doubt MSI will
get its own set of workarounds as we come to know it better.

> I do like Ingo's patch because it seems "safe" (even if I think it might
> be a bit _overly_ safe), but it changes semantics enough that I don't like
> it for 2.6.19. Even his second version definitely changes semantics for
> level-triggered PCI interrupts, even though he fixed ExtInt/i8259 ones.
>
> So I think I'll go with your patch for now, and we can re-visit Ingo's
> thing after 2.6.19.

Sounds like a plan.

Eric

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [PATCH] Use delayed disable mode of ioapic edge triggered interrupts
2006-11-14 17:52 ` [PATCH] Use delayed disable mode of ioapic edge triggered interrupts Eric W. Biederman
2006-11-14 23:35 ` Linus Torvalds
2006-11-15 1:17 ` Linus Torvalds
@ 2006-11-15 12:40 ` Komuro
2 siblings, 0 replies; 91+ messages in thread
From: Komuro @ 2006-11-15 12:40 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linus Torvalds, Ingo Molnar, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

Hi,

I tried Eric's patch instead of Ingo's with 2.6.19-rc5.
The interrupt is generated properly.

Thanks!

Best Regards
Komuro

>
>Hopefully this is the trivial patch that solves the problem.
>
>Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>
>diff --git a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
>index ad84bc2..3b7a63e 100644
>--- a/arch/i386/kernel/io_apic.c
>+++ b/arch/i386/kernel/io_apic.c
>@@ -1287,9 +1287,11 @@ static void ioapic_register_intr(int irq
>                        trigger == IOAPIC_LEVEL)
>                set_irq_chip_and_handler_name(irq, &ioapic_chip,
>                                         handle_fasteoi_irq, "fasteoi");
>-       else
>+       else {
>+               irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
>                set_irq_chip_and_handler_name(irq, &ioapic_chip,
>                                         handle_edge_irq, "edge");
>+       }
>        set_intr_gate(vector, interrupt[irq]);
> }
>
>diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
>index 41bfc49..14654e6 100644
>--- a/arch/x86_64/kernel/io_apic.c
>+++ b/arch/x86_64/kernel/io_apic.c
>@@ -790,9 +790,11 @@ static void ioapic_register_intr(int irq
>                        trigger == IOAPIC_LEVEL)
>                set_irq_chip_and_handler_name(irq, &ioapic_chip,
>                                         handle_fasteoi_irq, "fasteoi");
>-       else
>+       else {
>+               irq_desc[irq].status |= IRQ_DELAYED_DISABLE;
>                set_irq_chip_and_handler_name(irq, &ioapic_chip,
>                                         handle_edge_irq, "edge");
>+       }
> }
>
> static void __init setup_IO_APIC_irqs(void)
>

^ permalink raw reply [flat|nested] 91+ messages in thread
[parent not found: <20061115090427.GA16173@elte.hu>]
* Re: [patch] genirq: do not mask interrupts by default
[not found] ` <20061115090427.GA16173@elte.hu>
@ 2006-11-15 16:13 ` Linus Torvalds
2006-11-15 17:46 ` Ingo Molnar
0 siblings, 1 reply; 91+ messages in thread
From: Linus Torvalds @ 2006-11-15 16:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: Ingo Molnar, Eric W. Biederman, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

On Wed, 15 Nov 2006, Ingo Molnar wrote:
>
> problem is, we don't know /for a fact/ that something is "APIC-edge". We
> only know that the BIOS claims that it's so.

This is incorrect. We will have _programmed_ the APIC with whatever the
BIOS said in the MP tables, so if we think it's level triggered, it _is_
level triggered.

So I really think that all the arguments for i8259 not wanting replay
weigh equally on level-triggered PCI irq's too.

Now, the one thing that makes me think your approach is the right one is
that it's potentially going to be better performance - if people disable
irq's and the normal case is that no irq will actually happen, then
optimistically not doing anything at all (except marking the irq disabled,
of course) is always good.

However, because it's a semantic change, I _really_ don't want to do it
right now. We're maybe a week away from 2.6.19, and the "ISA irq's don't
work" report is one of the things that is holding things up right now.

So that's why I'd much rather go with Eric's patch for now - because it
keeps the semantics that we've always had.

Linus

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [patch] genirq: do not mask interrupts by default
2006-11-15 16:13 ` [patch] genirq: do not mask interrupts by default Linus Torvalds
@ 2006-11-15 17:46 ` Ingo Molnar
0 siblings, 0 replies; 91+ messages in thread
From: Ingo Molnar @ 2006-11-15 17:46 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, Eric W. Biederman, Komuro, tglx, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List

* Linus Torvalds <torvalds@osdl.org> wrote:

> On Wed, 15 Nov 2006, Ingo Molnar wrote:
> >
> > problem is, we don't know /for a fact/ that something is "APIC-edge".
> > We only know that the BIOS claims that it's so.
>
> This is incorrect. We will have _programmed_ the APIC with whatever
> the BIOS said in the MP tables, so if we think it's level triggered,
> it _is_ level triggered.

yeah. I was thinking about the low 16 irqs (those are really the problem
spots most of the time, not the normal IO-APIC irqs) - which are routed
all across the southbridge and might end up being handled by a
i8259A-lookalike entity. Right now we default to level-triggered IRQ
flow handling:

        if (i < 16) {
                /*
                 * 16 old-style INTA-cycle interrupts:
                 */
                set_irq_chip_and_handler_name(i, &i8259A_chip,
                                              handle_level_irq, "XT");

because that's the best we can do (it's also what our i8259 code did
historically). But it would be one step safer to also do the
lazy-disable. Just in case things might get lost while masked. Or is
that an absolutely horrible hardware breakage that i shouldn't worry
about?

> So I really think that all the arguments for i8259 not wanting replay
> weigh equally on level-triggered PCI irq's too.
>
> Now, the one thing that makes me think your approach is the right one
> is that it's potentially going to be better performance - if people
> disable irq's and the normal case is that no irq will actually happen,
> then optimistically not doing anything at all (except marking the irq
> disabled, of course) is always good.
>
> However, because it's a semantic change, I _really_ don't want to do
> it right now. We're maybe a week away from 2.6.19, and the "ISA irq's
> don't work" report is one of the things that is holding things up
> right now.
>
> So that's why I'd much rather go with Eric's patch for now - because
> it keeps the semantics that we've always had.

ok, i'm fine with Eric's patch too, if it solves Komuro's problem:

Acked-by: Ingo Molnar <mingo@elte.hu>

and we don't have to worry about the present ugliness of the
delayed-disabled flag either, as it would just go away in 2.6.20.

Ingo

^ permalink raw reply [flat|nested] 91+ messages in thread
[parent not found: <m1y7qm425l.fsf@ebiederm.dsl.xmission.com>]
[parent not found: <Pine.LNX.4.64.0611080745150.3667@g5.osdl.org>]
* Re: 2.6.19-rc5: known regressions
[not found] ` <Pine.LNX.4.64.0611080745150.3667@g5.osdl.org>
@ 2006-11-08 16:22 ` Adrian Bunk
2006-11-08 23:11 ` Tim Chen
0 siblings, 1 reply; 91+ messages in thread
From: Adrian Bunk @ 2006-11-08 16:22 UTC (permalink / raw)
To: Linus Torvalds
Cc: Eric W. Biederman, Andrew Morton, Linux Kernel Mailing List, Tim Chen

On Wed, Nov 08, 2006 at 07:47:07AM -0800, Linus Torvalds wrote:
>
>
> On Wed, 8 Nov 2006, Eric W. Biederman wrote:
> >
> > I haven't seen anyone reproduce this but Tim Chen, and Tim wasn't
> > able to root cause the problem so I believe we are going to have
> > this regression :(
>
> Note that you really shouldn't look too closely at lmbench scheduling
> fluctuations. They can fluctuate a _lot_, especially under SMP, and it can
> depend on things like cache layout that has nothing to do with the
> scheduler (ie just code movement can make the lmbench numbers change).
>
> So there are "regressions" and there are "shit happens". It can sometimes
> be hard to tell the two apart, of course ;)

There's perhaps one thing that might help us to see whether it's just a
benchmark effect or a real problem:

With Tim's CONFIG_NR_CPUS=8, NR_IRQS only increases from 224 in 2.6.18
to 512 in 2.6.19-rc.

With CONFIG_NR_CPUS=255, NR_IRQS increases from 224 in 2.6.18
to 8416 in 2.6.19-rc.

@Tim:
Can you try CONFIG_NR_CPUS=255 with both 2.6.18 and 2.6.19-rc5?

> Linus

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions
2006-11-08 16:22 ` 2.6.19-rc5: known regressions Adrian Bunk
@ 2006-11-08 23:11 ` Tim Chen
2006-11-09 2:49 ` Tim Chen
0 siblings, 1 reply; 91+ messages in thread
From: Tim Chen @ 2006-11-08 23:11 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Eric W. Biederman, Andrew Morton, Linux Kernel Mailing List

On Wed, 2006-11-08 at 17:22 +0100, Adrian Bunk wrote:

> There's perhaps one thing that might help us to see whether it's just a
> benchmark effect or a real problem:
>
> With Tim's CONFIG_NR_CPUS=8, NR_IRQS only increases from 224 in 2.6.18
> to 512 in 2.6.19-rc.
>
> With CONFIG_NR_CPUS=255, NR_IRQS increases from 224 in 2.6.18
> to 8416 in 2.6.19-rc.
>
> @Tim:
> Can you try CONFIG_NR_CPUS=255 with both 2.6.18 and 2.6.19-rc5?
>

With CONFIG_NR_CPUS increased from 8 to 64:
2.6.18 sees no change in fork time measured.
2.6.19-rc5 sees a 138% increase in fork time.

When I increase CONFIG_NR_CPUS to 128, the child process from fork
gets killed when it executes the sched_getaffinity call in the routine
that pins the process onto a processor. This happened for both
2.6.18 and 2.6.19-rc5. I'll need to check more carefully what
lmbench is doing there.

Tim

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions
2006-11-08 23:11 ` Tim Chen
@ 2006-11-09 2:49 ` Tim Chen
2006-11-09 5:10 ` Eric W. Biederman
0 siblings, 1 reply; 91+ messages in thread
From: Tim Chen @ 2006-11-09 2:49 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Eric W. Biederman, Andrew Morton, Linux Kernel Mailing List

On Wed, 2006-11-08 at 15:11 -0800, Tim Chen wrote:
> On Wed, 2006-11-08 at 17:22 +0100, Adrian Bunk wrote:
>
> > There's perhaps one thing that might help us to see whether it's just a
> > benchmark effect or a real problem:
> >
> > With Tim's CONFIG_NR_CPUS=8, NR_IRQS only increases from 224 in 2.6.18
> > to 512 in 2.6.19-rc.
> >
> > With CONFIG_NR_CPUS=255, NR_IRQS increases from 224 in 2.6.18
> > to 8416 in 2.6.19-rc.
> >
> > @Tim:
> > Can you try CONFIG_NR_CPUS=255 with both 2.6.18 and 2.6.19-rc5?
> >
>
> With CONFIG_NR_CPUS increased from 8 to 64:
> 2.6.18 sees no change in fork time measured.
> 2.6.19-rc5 sees a 138% increase in fork time.
>

Lmbench is broken in its fork time measurement.
It includes the overhead of pinning processes onto a
specific cpu. The actual fork time is not affected by NR_IRQS.

Lmbench calls the following C library function to determine the
number of processors online before it pins the processes:
sysconf(_SC_NPROCESSORS_ONLN);

This function takes the same order of time to run as
fork itself. In addition, the runtime of this function
increases with NR_IRQS. This resulted in the change in
time measured.

After hardcoding the number of online processors in lmbench,
the fork time measured now does not change with CONFIG_NR_CPUS
for both 2.6.18 and 2.6.19-rc5. So we can now conclude that
NR_IRQS does not affect fork. We can remove this particular
issue from the known regressions.

Tim

^ permalink raw reply [flat|nested] 91+ messages in thread
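The cost described above can be checked in isolation by timing the
sysconf() call, and the benchmark-side fix amounts to looking the value
up once instead of on every iteration. The following is a minimal sketch
and not lmbench code: the function names are made up for illustration,
and the iteration count is left to the caller. Only
sysconf(_SC_NPROCESSORS_ONLN) and clock_gettime() are real library calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>
#include <unistd.h>

/* Number of processors currently online, as the C library reports it. */
static long online_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Average wall-clock cost of one sysconf(_SC_NPROCESSORS_ONLN) call,
 * in nanoseconds, measured over the given number of iterations. */
static double avg_sysconf_ns(int iterations)
{
    struct timespec start, end;
    volatile long sink = 0; /* keeps the calls from being optimised out */

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++)
        sink += sysconf(_SC_NPROCESSORS_ONLN);
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void)sink;

    double ns = (double)(end.tv_sec - start.tv_sec) * 1e9
              + (double)(end.tv_nsec - start.tv_nsec);
    return ns / iterations;
}

/* Look the value up once and reuse it, so a timed loop around fork()
 * does not also pay for the lookup each time. */
static long cached_online_cpus(void)
{
    static long cached = -1;

    if (cached < 0)
        cached = sysconf(_SC_NPROCESSORS_ONLN);
    return cached;
}
```

Comparing avg_sysconf_ns() on two kernels built with different
CONFIG_NR_CPUS values would show whether the lookup, rather than fork
itself, is what moved. The cached variant trades freshness for speed: it
would not notice a CPU going offline during the run.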
* Re: 2.6.19-rc5: known regressions
2006-11-09 2:49 ` Tim Chen
@ 2006-11-09 5:10 ` Eric W. Biederman
2006-11-13 22:46 ` Tim Chen
0 siblings, 1 reply; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-09 5:10 UTC (permalink / raw)
To: tim.c.chen
Cc: Adrian Bunk, Linus Torvalds, Andrew Morton,
Linux Kernel Mailing List

Tim Chen <tim.c.chen@linux.intel.com> writes:

> On Wed, 2006-11-08 at 15:11 -0800, Tim Chen wrote:
>> On Wed, 2006-11-08 at 17:22 +0100, Adrian Bunk wrote:
>>
>> With CONFIG_NR_CPUS increased from 8 to 64:
>> 2.6.18 sees no change in fork time measured.

CONFIG_NR_CPUS has no effect on NR_IRQS in 2.6.18. So this test
unfortunately told us nothing.

>> 2.6.19-rc5 sees a 138% increase in fork time.
>>
>
> Lmbench is broken in its fork time measurement.
> It includes overhead time when it is pinning processes onto
> a specific CPU. The actual fork time is not affected by NR_IRQS.
>
> Lmbench calls the following C library function to determine the
> number of processors online before it pins the processes:
> sysconf(_SC_NPROCESSORS_ONLN);
>
> This function takes the same order of time to run as
> fork itself. In addition, the runtime of this function
> increases with NR_IRQS. This resulted in the change in
> time measured.
>
> After hardcoding the number of online processors in lmbench,
> the fork time measured now does not change with CONFIG_NR_CPUS
> for both 2.6.18 and 2.6.19-rc5. So we can now conclude that
> NR_IRQS does not affect fork. We can remove this particular
> issue from the known regressions.

Cool. I'm glad to know it was simply a buggy lmbench.

What is sysconf(_SC_NPROCESSORS_ONLN) doing that it slows down as the
number of irqs increases? It is a slow path certainly but possibly
something we should fix. My hunch is cat /proc/cpuinfo...

Eric

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions
2006-11-09 5:10 ` Eric W. Biederman
@ 2006-11-13 22:46 ` Tim Chen
2006-11-14 0:03 ` Eric W. Biederman
0 siblings, 1 reply; 91+ messages in thread
From: Tim Chen @ 2006-11-13 22:46 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Adrian Bunk, Linus Torvalds, Andrew Morton,
Linux Kernel Mailing List

On Wed, 2006-11-08 at 22:10 -0700, Eric W. Biederman wrote:
>
> Cool. I'm glad to know it was simply a buggy lmbench.
>
> What is sysconf(_SC_NPROCESSORS_ONLN) doing that it slows down as the
> number of irqs increases? It is a slow path certainly but possibly
> something we should fix. My hunch is cat /proc/cpuinfo...
>

After looking at profiling data, the increase in time of the
sysconf(_SC_NPROCESSORS_ONLN) call is within the "show_stat" function.
There are a couple of loops that iterate over the kstat_irqs
interrupt statistics and depend on NR_IRQS. Doesn't
look like something we need to fix.

Tim

^ permalink raw reply [flat|nested] 91+ messages in thread
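Tim's profile points at show_stat, the routine behind /proc/stat. One plausible reading, which is an assumption here and not confirmed in the thread, is that the C library counts online CPUs by reading that file, whose "intr" line carries per-IRQ counters. The sketch below just counts the per-CPU lines and the per-IRQ fields in /proc/stat-style text, to show how the amount of output can grow with the number of IRQs. The function name is hypothetical, and the layout of the "intr" line is assumed to be a total followed by one counter per IRQ.

```python
import os


def summarize_proc_stat(text):
    """Return (per-CPU line count, per-IRQ counter count) for /proc/stat text."""
    cpu_lines = 0
    irq_fields = 0
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        if name.startswith("cpu") and name[3:].isdigit():
            # "cpu0", "cpu1", ...; the aggregate "cpu" line is skipped.
            cpu_lines += 1
        elif name == "intr":
            # Assumed layout: "intr <total> <irq0> <irq1> ...".
            irq_fields = max(len(fields) - 2, 0)
    return cpu_lines, irq_fields


if __name__ == "__main__":
    if os.path.exists("/proc/stat"):
        with open("/proc/stat") as stat_file:
            cpus, irqs = summarize_proc_stat(stat_file.read())
        print("per-CPU lines:", cpus, "per-IRQ counters:", irqs)
```

On a kernel of that era the per-IRQ counter count would be expected to track NR_IRQS, which is consistent with the loops Tim mentions.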
* Re: 2.6.19-rc5: known regressions
2006-11-13 22:46 ` Tim Chen
@ 2006-11-14 0:03 ` Eric W. Biederman
0 siblings, 0 replies; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-14 0:03 UTC (permalink / raw)
To: tim.c.chen
Cc: Adrian Bunk, Linus Torvalds, Andrew Morton,
Linux Kernel Mailing List

Tim Chen <tim.c.chen@linux.intel.com> writes:

> On Wed, 2006-11-08 at 22:10 -0700, Eric W. Biederman wrote:
>
>>
>> Cool. I'm glad to know it was simply a buggy lmbench.
>>
>> What is sysconf(_SC_NPROCESSORS_ONLN) doing that it slows down as the
>> number of irqs increases? It is a slow path certainly but possibly
>> something we should fix. My hunch is cat /proc/cpuinfo...
>>
>
> After looking at profiling data, the increase in time of the
> sysconf(_SC_NPROCESSORS_ONLN) call is within the "show_stat" function.
> There are a couple of loops that iterate over the kstat_irqs
> interrupt statistics and depend on NR_IRQS. Doesn't
> look like something we need to fix.

Thanks.

Eric

^ permalink raw reply [flat|nested] 91+ messages in thread
[parent not found: <20061111015035.GU4729@stusta.de>]
* Re: [discuss] 2.6.19-rc5: known regressions (v2)
[not found] ` <20061111015035.GU4729@stusta.de>
@ 2006-11-11 9:08 ` Rafael J. Wysocki
2006-11-11 9:25 ` Paolo Ornati
0 siblings, 1 reply; 91+ messages in thread
From: Rafael J. Wysocki @ 2006-11-11 9:08 UTC (permalink / raw)
To: Adrian Bunk; +Cc: LKML, Paolo Ornati

Hi,

On Saturday, 11 November 2006 02:50, Adrian Bunk wrote:
> This email lists some known regressions in 2.6.19-rc5 compared to 2.6.18
> that are not yet fixed in Linus' tree.
>
> If you find your name in the Cc header, you are either submitter of one
> of the bugs, maintainer of an affected subsystem or driver, a patch
> of yours caused a breakage or I'm considering you in any other way possibly
> involved with one or more of these issues.
>
> Due to the huge amount of recipients, please trim the Cc when answering.
>
>
> Subject    : PCI MSI setting corrupted during resume
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7479
> Submitter  : Stephen Hemminger <shemminger@osdl.org>
> Status     : unknown
>
>
> Subject    : x86_64 boot failure: irq 22: nobody cared (hda_intel MSI)
> References : http://lkml.org/lkml/2006/11/8/98
> Submitter  : Olivier Nicolas <olivn@trollprod.org>
> Status     : unknown
>
>
> Subject    : SMP kernel can not generate ISA irq properly
> References : http://lkml.org/lkml/2006/10/22/15
>              http://lkml.org/lkml/2006/11/10/142
> Submitter  : Komuro <komurojun-mbn@nifty.com>
> Handled-By : Thomas Gleixner <tglx@linutronix.de>
> Status     : Thomas is investigating
>
>
> Subject    : x86_64: Fix partial page check to ensure unusable memory
>              is not being marked usable
> References : http://lkml.org/lkml/2006/11/9/239
> Submitter  : Aaron Durbin <adurbin@google.com>
> Caused-By  : Mel Gorman <mel@csn.ul.ie>
>              commit 5cb248abf5ab65ab543b2d5fc16c738b28031fc0
> Patch      : http://lkml.org/lkml/2006/11/9/239
> Status     : patch available
>
>
> Subject    : x86_64: Bad page state in process 'swapper'
> References : http://lkml.org/lkml/2006/11/10/135
>              http://lkml.org/lkml/2006/11/10/208
> Submitter  : Andre Noll <maan@systemlinux.org>
> Handled-By : Andi Kleen <ak@suse.de>
> Status     : Andi is investigating
>
>
> Subject    : x86_64: oprofile doesn't work
> References : http://lkml.org/lkml/2006/10/27/3
> Submitter  : Prakash Punnoor <prakash@punnoor.de>
> Status     : unknown
>
>
> Subject    : weird battery charge level reported
>              ACPI Error method parse / execution failed
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7466
> Submitter  : Olivier Mondoloni <olivier.mondoloni@waika9.com>
> Status     : unknown
>
>
> Subject    : ThinkPad R50p: boot fail with (lapic && on_battery)
> References : http://lkml.org/lkml/2006/10/31/333
> Submitter  : Ernst Herzberg <earny@net4u.de>
> Handled-By : Len Brown <len.brown@intel.com>
> Status     : problem is being debugged
>
>
> Subject    : BUG: scheduling while atomic: events/0/0x00000001/4
>              after resume
> References : http://lkml.org/lkml/2006/11/2/209
> Submitter  : Paolo Ornati <ornati@fastwebnet.it>
> Status     : unknown

I couldn't find anything in the report that would indicate the problem occurred
after a resume. Was it really the case?

> Subject    : sata-via doesn't detect anymore disks attached to VIA vt6421
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7255
> Submitter  : Thierry Vignaud <tvignaud@mandriva.com>
> Status     : unknown
>
>
> Subject    : libata must be initialized earlier
> References : http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027945.html
> Submitter  : Paul Mackerras <paulus@samba.org>
> Handled-By : Brian King <brking@us.ibm.com>
> Patch      : http://marc.theaimsgroup.com/?l=linux-ide&m=116169938407596&w=2
> Status     : patch available
>
>
> Subject    : unable to rip cd
> References : http://lkml.org/lkml/2006/10/13/100
>              http://lkml.org/lkml/2006/11/8/42
> Submitter  : Alex Romosan <romosan@sycorax.lbl.gov>
> Handled-By : Jens Axboe <jens.axboe@oracle.com>
> Status     : Jens is investigating

Greetings,
Rafael

--
You never change things by fighting the existing reality.
R. Buckminster Fuller

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] 2.6.19-rc5: known regressions (v2)
2006-11-11 9:08 ` [discuss] 2.6.19-rc5: known regressions (v2) Rafael J. Wysocki
@ 2006-11-11 9:25 ` Paolo Ornati
2006-11-11 10:49 ` Rafael J. Wysocki
0 siblings, 1 reply; 91+ messages in thread
From: Paolo Ornati @ 2006-11-11 9:25 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Adrian Bunk, LKML

On Sat, 11 Nov 2006 10:08:37 +0100
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> > Subject    : BUG: scheduling while atomic: events/0/0x00000001/4
> >              after resume
> > References : http://lkml.org/lkml/2006/11/2/209
> > Submitter  : Paolo Ornati <ornati@fastwebnet.it>
> > Status     : unknown
>
> I couldn't find anything in the report that would indicate the problem occurred
> after a resume. Was it really the case?

Ahh, I've written that in another email but I trimmed LKML from CC by
mistake ;)

Relevant portion of that mail follows... anyway it seems that "-rc5" is
_OK_ since I've been running it for 2 days and it survived 9
suspend/resume cycles.

------------------------------------------------------------------

I've reproduced it (with rc4-g4b1c46a3), and I think it is
suspend/resume related since the messages start flooding dmesg just
after a resume...

I'll see if it is reproducible just doing suspend/resume a couple of
times... and if so I'll try with -rc5.

dmesg (stripped at the end): [ 0.000000] Linux version 2.6.19-rc4-g4b1c46a3 (paolo@tux) (gcc version 4.1.1 (Gentoo 4.1.1)) #17 PREEMPT Wed Nov 1 18:36:28 CET 2006 [ 0.000000] Command line: root=/dev/sda6 elevator=cfq video=radeonfb:1024x768@60 [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) [ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) [ 0.000000] BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved) [ 0.000000] BIOS-e820: 0000000000100000 - 000000001ff30000 (usable) [ 0.000000] BIOS-e820: 000000001ff30000 - 000000001ff40000 (ACPI data) [ 0.000000] BIOS-e820: 000000001ff40000 - 000000001fff0000 (ACPI NVS) [ 0.000000] BIOS-e820: 000000001fff0000 - 0000000020000000 (reserved) [ 0.000000] BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved) [ 0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used [ 0.000000] Entering add_active_range(0, 256, 130864) 1 entries of 256 used [ 0.000000] end_pfn_map = 1048576 [ 0.000000] DMI 2.3 present. 
[ 0.000000] ACPI: RSDP (v000 ACPIAM ) @ 0x00000000000fa850 [ 0.000000] ACPI: RSDT (v001 A M I OEMRSDT 0x06000517 MSFT 0x00000097) @ 0x000000001ff30000 [ 0.000000] ACPI: FADT (v001 A M I OEMFACP 0x06000517 MSFT 0x00000097) @ 0x000000001ff30200 [ 0.000000] ACPI: MADT (v001 A M I OEMAPIC 0x06000517 MSFT 0x00000097) @ 0x000000001ff30390 [ 0.000000] ACPI: OEMB (v001 A M I OEMBIOS 0x06000517 MSFT 0x00000097) @ 0x000000001ff40040 [ 0.000000] ACPI: DSDT (v001 A0058 A0058002 0x00000002 MSFT 0x0100000d) @ 0x0000000000000000 [ 0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used [ 0.000000] Entering add_active_range(0, 256, 130864) 1 entries of 256 used [ 0.000000] Zone PFN ranges: [ 0.000000] DMA 0 -> 4096 [ 0.000000] DMA32 4096 -> 1048576 [ 0.000000] Normal 1048576 -> 1048576 [ 0.000000] early_node_map[2] active PFN ranges [ 0.000000] 0: 0 -> 159 [ 0.000000] 0: 256 -> 130864 [ 0.000000] On node 0 totalpages: 130767 [ 0.000000] DMA zone: 56 pages used for memmap [ 0.000000] DMA zone: 1183 pages reserved [ 0.000000] DMA zone: 2760 pages, LIFO batch:0 [ 0.000000] DMA32 zone: 1733 pages used for memmap [ 0.000000] DMA32 zone: 125035 pages, LIFO batch:31 [ 0.000000] Normal zone: 0 pages used for memmap [ 0.000000] ACPI: PM-Timer IO Port: 0x808 [ 0.000000] ACPI: Local APIC address 0xfee00000 [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) [ 0.000000] Processor #0 (Bootup-CPU) [ 0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0]) [ 0.000000] IOAPIC[0]: apic_id 1, address 0xfec00000, GSI 0-23 [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) [ 0.000000] ACPI: IRQ0 used by override. [ 0.000000] ACPI: IRQ2 used by override. [ 0.000000] ACPI: IRQ9 used by override. 
[ 0.000000] Setting APIC routing to flat [ 0.000000] Using ACPI (MADT) for SMP configuration information [ 0.000000] Nosave address range: 000000000009f000 - 00000000000a0000 [ 0.000000] Nosave address range: 00000000000a0000 - 00000000000e4000 [ 0.000000] Nosave address range: 00000000000e4000 - 0000000000100000 [ 0.000000] Allocating PCI resources starting at 30000000 (gap: 20000000:dff80000) [ 0.000000] Built 1 zonelists. Total pages: 127795 [ 0.000000] Kernel command line: root=/dev/sda6 elevator=cfq video=radeonfb:1024x768@60 [ 0.000000] Initializing CPU#0 [ 0.000000] PID hash table entries: 2048 (order: 11, 16384 bytes) [ 32.727602] time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer. [ 32.727605] time.c: Detected 2202.943 MHz processor. [ 32.730265] Console: colour VGA+ 80x25 [ 32.733073] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes) [ 32.733509] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes) [ 32.733646] Checking aperture... [ 32.733705] CPU 0: aperture @ f8000000 size 64 MB [ 32.740581] Memory: 507900k/523456k available (2708k kernel code, 14716k reserved, 1343k data, 200k init) [ 32.800438] Calibrating delay using timer specific routine.. 4409.08 BogoMIPS (lpj=2204542) [ 32.800591] Mount-cache hash table entries: 256 [ 32.800771] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) [ 32.800833] CPU: L2 Cache: 512K (64 bytes/line) [ 32.800913] CPU: AMD Athlon(tm) 64 Processor 3200+ stepping 00 [ 32.801063] ACPI: Core revision 20060707 [ 32.814145] Using local APIC timer interrupts. [ 32.859518] result 12516743 [ 32.859573] Detected 12.516 MHz APIC timer. [ 32.860328] testing NMI watchdog ... OK. 
[ 32.870515] checking if image is initramfs...it isn't (bad gzip magic numbers); looks like an initrd [ 32.873517] Freeing initrd memory: 2000k freed [ 32.875609] NET: Registered protocol family 16 [ 32.875754] ACPI: bus type pci registered [ 32.875818] PCI: Using configuration type 1 [ 32.881215] ACPI: Interpreter enabled [ 32.881276] ACPI: Using IOAPIC for interrupt routing [ 32.882250] ACPI: PCI Root Bridge [PCI0] (0000:00) [ 32.882315] PCI: Probing PCI hardware (bus 00) [ 32.884489] PCI: enabled onboard AC97/MC97 devices [ 32.884765] Boot video device is 0000:01:00.0 [ 32.884843] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] [ 32.896169] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 10 *11 14 15) [ 32.896831] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 *10 11 14 15) [ 32.897488] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 *7 10 11 14 15) [ 32.898143] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 7 10 11 14 15) [ 32.898809] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 10 11 14 15) *0, disabled. [ 32.899560] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 10 11 14 15) *0, disabled. [ 32.900312] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 10 11 14 15) *0, disabled. [ 32.901052] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 10 11 14 15) *0, disabled. [ 32.901700] Linux Plug and Play Support v0.97 (c) Adam Belay [ 32.901768] pnp: PnP ACPI init [ 32.904855] pnp: PnP ACPI: found 13 devices [ 32.905045] SCSI subsystem initialized [ 32.905158] usbcore: registered new interface driver usbfs [ 32.905247] usbcore: registered new interface driver hub [ 32.905335] usbcore: registered new device driver usb [ 32.905452] PCI: Using ACPI for IRQ routing [ 32.905511] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report [ 32.905592] PCI: Cannot allocate resource region 0 of device 0000:00:00.0 [ 32.905754] agpgart: Detected AGP bridge 0 [ 32.908913] agpgart: AGP aperture is 64M @ 0xf8000000 [ 32.908997] PCI-DMA: Disabling IOMMU. 
[ 32.909754] PCI: Bridge: 0000:00:01.0 [ 32.909812] IO window: a000-afff [ 32.909871] MEM window: fd100000-fd6fffff [ 32.909931] PREFETCH window: d5000000-f4ffffff [ 32.910006] PCI: Setting latency timer of device 0000:00:01.0 to 64 [ 32.910026] NET: Registered protocol family 2 [ 32.918232] IP route cache hash table entries: 4096 (order: 3, 32768 bytes) [ 32.918368] TCP established hash table entries: 16384 (order: 5, 131072 bytes) [ 32.918516] TCP bind hash table entries: 8192 (order: 4, 65536 bytes) [ 32.918624] TCP: Hash tables configured (established 16384 bind 8192) [ 32.918684] TCP reno registered [ 32.919334] io scheduler noop registered [ 32.919440] io scheduler cfq registered (default) [ 32.920310] ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16 [ 32.920477] radeonfb: Found Intel x86 BIOS ROM Image [ 32.920539] radeonfb: Retrieved PLL infos from BIOS [ 32.920598] radeonfb: Reference=27.00 MHz (RefDiv=12) Memory=200.00 Mhz, System=166.00 MHz [ 32.920671] radeonfb: PLL min 20000 max 40000 [ 32.921735] radeonfb: Monitor 1 type CRT found [ 32.921793] radeonfb: Monitor 2 type no found [ 32.955438] Console: switching to colour frame buffer device 128x48 [ 32.975175] radeonfb (0000:01:00.0): ATI Radeon Yd [ 32.975352] ACPI: Power Button (FF) [PWRF] [ 32.975477] ACPI: Power Button (CM) [PWRB] [ 32.975586] ACPI: Sleep Button (CM) [SLPB] [ 32.977202] Real Time Clock Driver v1.12ac [ 32.977313] Linux agpgart interface v0.101 (c) Dave Jones [ 32.977457] [drm] Initialized drm 1.0.1 20051102 [ 32.978126] [drm] Initialized radeon 1.25.0 20060524 on minor 0 [ 32.978294] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled [ 32.978587] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A [ 32.978834] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A [ 32.979259] 00:0a: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A [ 32.979511] 00:0b: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A [ 32.979773] Floppy drive(s): fd0 is 1.44M [ 32.994906] FDC 
0 is a post-1991 82077 [ 32.996382] RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize [ 32.996766] loop: loaded (max 8 devices) [ 32.996906] ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 17 [ 32.999471] skge 1.9 addr 0xfdc00000 irq 17 chip Yukon-Lite rev 9 [ 33.001981] skge eth0: addr 00:11:d8:1c:a0:7a [ 33.004520] 8139too Fast Ethernet driver 0.9.28 [ 33.007047] ACPI: PCI Interrupt 0000:00:0e.0[A] -> GSI 19 (level, low) -> IRQ 19 [ 33.010258] eth1: RealTek RTL8139 at 0xffffc20000004000, 00:e0:4c:f0:ab:b8, IRQ 19 [ 33.013135] eth1: Identified 8139 chip type 'RTL-8139C' [ 33.013151] Linux video capture interface: v2.00 [ 33.016118] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 [ 33.019236] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx [ 33.022583] VP_IDE: IDE controller at PCI slot 0000:00:0f.1 [ 33.026030] ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 20 [ 33.029676] VP_IDE: chipset revision 6 [ 33.033379] VP_IDE: not 100% native mode: will probe irqs later [ 33.037248] VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1 [ 33.041286] ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio [ 33.045502] ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio [ 33.049751] Probing IDE interface ide0... [ 33.841448] hda: HL-DT-ST DVDRAM GSA-4167B, ATAPI CD/DVD-ROM drive [ 34.152000] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 [ 34.156490] Probing IDE interface ide1... [ 34.948226] hdc: HL-DT-ST GCE-8400B, ATAPI CD/DVD-ROM drive [ 35.257686] ide1 at 0x170-0x177,0x376 on irq 15 [ 35.264947] hda: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA(33) [ 35.269964] Uniform CD-ROM driver Revision: 3.20 [ 35.283252] hdc: ATAPI 40X CD-ROM CD-R/RW drive, 2048kB Cache, DMA [ 35.289395] libata version 2.00 loaded. 
[ 35.289419] sata_via 0000:00:0f.0: version 2.0 [ 35.289430] ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 20 [ 35.294798] sata_via 0000:00:0f.0: routed to hard irq line 10 [ 35.300148] ata1: SATA max UDMA/133 cmd 0xE800 ctl 0xE402 bmdma 0xD400 irq 20 [ 35.305530] ata2: SATA max UDMA/133 cmd 0xE000 ctl 0xD802 bmdma 0xD408 irq 20 [ 35.310844] scsi0 : sata_via [ 35.515983] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 35.673125] ata1.00: ATA-6, max UDMA/133, 156301488 sectors: LBA48 NCQ (depth 0/32) [ 35.678757] ata1.00: ata1: dev 0 multi count 16 [ 35.686006] ata1.00: configured for UDMA/133 [ 35.691568] scsi1 : sata_via [ 35.897218] ata2: SATA link down 1.5 Gbps (SStatus 0 SControl 300) [ 35.913718] ATA: abnormal status 0x7F on port 0xE007 [ 35.919332] scsi 0:0:0:0: Direct-Access ATA ST380817AS 3.42 PQ: 0 ANSI: 5 [ 35.925134] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB) [ 35.930882] sda: Write Protect is off [ 35.936662] sda: Mode Sense: 00 3a 00 00 [ 35.936675] SCSI device sda: drive cache: write back [ 35.942505] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB) [ 35.948370] sda: Write Protect is off [ 35.954136] sda: Mode Sense: 00 3a 00 00 [ 35.954148] SCSI device sda: drive cache: write back [ 35.959889] sda: sda1 sda2 < sda5 sda6 sda7 sda8 > [ 36.027032] sd 0:0:0:0: Attached scsi disk sda [ 36.032710] sd 0:0:0:0: Attached scsi generic sg0 type 0 [ 36.038359] ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21 [ 36.044338] ehci_hcd 0000:00:10.4: EHCI Host Controller [ 36.050131] ehci_hcd 0000:00:10.4: new USB bus registered, assigned bus number 1 [ 36.055834] ehci_hcd 0000:00:10.4: irq 21, io mem 0xfd900000 [ 36.061409] ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004 [ 36.067109] usb usb1: configuration #1 chosen from 1 choice [ 36.072686] hub 1-0:1.0: USB hub found [ 36.078141] hub 1-0:1.0: 8 ports detected [ 36.183732] USB Universal Host Controller Interface driver 
v3.0 [ 36.189136] ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21 [ 36.194595] uhci_hcd 0000:00:10.0: UHCI Host Controller [ 36.200026] uhci_hcd 0000:00:10.0: new USB bus registered, assigned bus number 2 [ 36.205457] uhci_hcd 0000:00:10.0: irq 21, io base 0x0000b000 [ 36.211000] usb usb2: configuration #1 chosen from 1 choice [ 36.216430] hub 2-0:1.0: USB hub found [ 36.221776] hub 2-0:1.0: 2 ports detected [ 36.327443] ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21 [ 36.332864] uhci_hcd 0000:00:10.1: UHCI Host Controller [ 36.338237] uhci_hcd 0000:00:10.1: new USB bus registered, assigned bus number 3 [ 36.343600] uhci_hcd 0000:00:10.1: irq 21, io base 0x0000b400 [ 36.348985] usb usb3: configuration #1 chosen from 1 choice [ 36.354262] hub 3-0:1.0: USB hub found [ 36.359430] hub 3-0:1.0: 2 ports detected [ 36.465147] ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21 [ 36.470440] uhci_hcd 0000:00:10.2: UHCI Host Controller [ 36.475698] uhci_hcd 0000:00:10.2: new USB bus registered, assigned bus number 4 [ 36.480965] uhci_hcd 0000:00:10.2: irq 21, io base 0x0000b800 [ 36.486227] usb usb4: configuration #1 chosen from 1 choice [ 36.491386] hub 4-0:1.0: USB hub found [ 36.496528] hub 4-0:1.0: 2 ports detected [ 36.601875] ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21 [ 36.607110] uhci_hcd 0000:00:10.3: UHCI Host Controller [ 36.612333] uhci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 5 [ 36.617549] uhci_hcd 0000:00:10.3: irq 21, io base 0x0000c000 [ 36.622729] usb usb5: configuration #1 chosen from 1 choice [ 36.627812] hub 5-0:1.0: USB hub found [ 36.632845] hub 5-0:1.0: 2 ports detected [ 37.161683] usbcore: registered new interface driver cdc_acm [ 37.166690] drivers/usb/class/cdc-acm.c: v0.25:USB Abstract Control Model driver for USB modems and ISDN adapters [ 37.171947] usbcore: registered new interface driver usblp [ 37.177212] drivers/usb/class/usblp.c: 
v0.13: USB Printer Device Class driver [ 37.182561] Initializing USB Mass Storage driver... [ 37.187966] usbcore: registered new interface driver usb-storage [ 37.193343] USB Mass Storage support registered. [ 37.198710] PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12 [ 37.204404] serio: i8042 KBD port at 0x60,0x64 irq 1 [ 37.209839] serio: i8042 AUX port at 0x60,0x64 irq 12 [ 37.215222] mice: PS/2 mouse device common for all mice [ 37.220524] i2c /dev entries driver [ 37.226206] Advanced Linux Sound Architecture Driver Version 1.0.13 (Sun Oct 22 08:56:16 2006 UTC). [ 37.231946] ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 22 [ 37.239539] PCI: Setting latency timer of device 0000:00:11.5 to 64 [ 37.251526] input: AT Translated Set 2 keyboard as /class/input/input0 [ 37.751566] codec_read: codec 0 is not valid [0xfe0000] [ 37.764869] codec_read: codec 0 is not valid [0xfe0000] [ 37.778121] codec_read: codec 0 is not valid [0xfe0000] [ 37.791242] codec_read: codec 0 is not valid [0xfe0000] [ 37.808500] ALSA device list: [ 37.813631] #0: VIA 8237 with AD1980 at 0xec00, irq 22 [ 37.818899] oprofile: using NMI interrupt. [ 37.824105] TCP cubic registered [ 37.829169] NET: Registered protocol family 1 [ 37.834229] NET: Registered protocol family 17 [ 37.839196] NET: Registered protocol family 15 [ 37.844134] ACPI: (supports S0 S1 S3 S4 S5) [ 37.961630] input: ImPS/2 Logitech Wheel Mouse as /class/input/input1 [ 38.050012] RAMDISK: ext2 filesystem found at block 0 [ 38.054928] RAMDISK: Loading 2000KiB [1 disk] into ram disk... |\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\bdone. 
[ 38.066647] VFS: Mounted root (ext2 filesystem). [ 38.129633] kjournald starting. Commit interval 5 seconds [ 38.134550] EXT3-fs: mounted filesystem with ordered data mode. [ 38.139493] VFS: Mounted root (ext3 filesystem) readonly. [ 38.144469] Trying to move old root to /initrd ... /initrd does not exist. Ignored. [ 38.149670] Unmounting old root [ 38.154701] Trying to free ramdisk memory ... okay [ 38.159878] Freeing unused kernel memory: 200k freed [ 38.164962] Write protecting the kernel read-only data: 560k [ 40.454619] warning: process `touch' used the removed sysctl system call [ 40.890041] warning: process `sleep' used the removed sysctl system call [ 40.999232] warning: process `sleep' used the removed sysctl system call [ 41.104730] warning: process `sleep' used the removed sysctl system call [ 41.873966] warning: process `sleep' used the removed sysctl system call [ 43.453372] EXT3 FS on sda6, internal journal [ 43.999154] kjournald starting. Commit interval 5 seconds [ 43.999164] EXT3-fs warning: maximal mount count reached, running e2fsck is recommended [ 43.999323] EXT3 FS on sda8, internal journal [ 43.999328] EXT3-fs: mounted filesystem with ordered data mode. [ 44.117933] Adding 1004016k swap on /dev/sda7. Priority:-1 extents:1 across:1004016k [ 50.175883] skge eth0: enabling interface [ 62.993341] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0. [ 62.993361] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode [ 62.993436] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode [ 63.288516] [drm] Setting GART location based on new memory map [ 63.288604] [drm] Loading R200 Microcode [ 63.288694] [drm] writeback test succeeded in 1 usecs [ 79.495845] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1 [ 81.127324] ip_tables: (C) 2000-2006 Netfilter Core Team [ 81.170715] ip_conntrack version 2.4 (2044 buckets, 16352 max) - 248 bytes per conntrack [ 180.334016] Installing knfsd (copyright (C) 1996 okir@monad.swb.de). 
[ 306.826671] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0. [ 306.826693] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode [ 306.826768] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode [ 306.826780] [drm] Loading R200 Microcode [ 340.827814] Stopping tasks: ==========================================================================================================================================| [ 340.828658] Shrinking memory... \b-\b\\b|\b/\b-\bdone (51711 pages freed) [ 340.925349] Suspending console(s) [ 341.809883] pnp: Device 00:0b disabled. [ 341.810124] pnp: Device 00:0a disabled. [ 341.810148] radeonfb (0000:01:00.0): suspending for event: 1... [ 341.887207] skge eth0: disabling interface [ 341.899219] pci_set_power_state(): 0000:00:00.0: state=3, current state=5 [ 341.912910] swsusp: Need to copy 63368 pages [ 26.062737] APIC error on CPU0: 00(00) [ 26.062827] PCI: Setting latency timer of device 0000:00:01.0 to 64 [ 26.146592] PM: Writing back config space on device 0000:00:0a.0 at offset f (was 1f170100, writing 1f17010a) [ 26.146599] PM: Writing back config space on device 0000:00:0a.0 at offset c (was 0, writing fdb00000) [ 26.146609] PM: Writing back config space on device 0000:00:0a.0 at offset 5 (was 1, writing c801) [ 26.146614] PM: Writing back config space on device 0000:00:0a.0 at offset 4 (was 0, writing fdc00000) [ 26.146619] PM: Writing back config space on device 0000:00:0a.0 at offset 3 (was 0, writing 4010) [ 26.146626] PM: Writing back config space on device 0000:00:0a.0 at offset 1 (was 2b00000, writing 2b00117) [ 26.146658] skge eth0: enabling interface [ 26.160202] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1 [ 26.171176] ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21 [ 26.171218] usb usb2: root hub lost power or was reset [ 26.182150] ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21 [ 26.182188] usb usb3: root hub lost power or was reset [ 26.193128] 
ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21 [ 26.193165] usb usb4: root hub lost power or was reset [ 26.204106] ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21 [ 26.204143] usb usb5: root hub lost power or was reset [ 26.215084] ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21 [ 26.215109] usb usb1: root hub lost power or was reset [ 26.215127] ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004 [ 26.226082] ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 22 [ 26.226088] PCI: Setting latency timer of device 0000:00:11.5 to 64 [ 26.229463] radeonfb (0000:01:00.0): resuming from state: 1... [ 26.247263] pnp: Failed to activate device 00:03. [ 26.247391] pnp: Failed to activate device 00:04. [ 26.248318] pnp: Device 00:0a activated. [ 26.249004] pnp: Device 00:0b activated. [ 27.134110] Restarting tasks... done [ 27.565554] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0. [ 27.565593] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode [ 27.565670] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode [ 27.565682] [drm] Loading R200 Microcode [ 28.443446] ip_conntrack version 2.4 (2044 buckets, 16352 max) - 248 bytes per conntrack [ 752.692523] Stopping tasks: ======================================================================================================================================| [ 752.693363] Shrinking memory... \b-\b\\b|\b/\b-\b\\bdone (58183 pages freed) [ 756.669812] Suspending console(s) [ 757.578446] pnp: Device 00:0b disabled. [ 757.578702] pnp: Device 00:0a disabled. [ 757.578727] radeonfb (0000:01:00.0): suspending for event: 1... 
[ 757.655322] skge eth0: disabling interface [ 757.695225] swsusp: Need to copy 58533 pages [ 25.139916] APIC error on CPU0: 00(00) [ 25.293551] PCI: Setting latency timer of device 0000:00:01.0 to 64 [ 25.377319] PM: Writing back config space on device 0000:00:0a.0 at offset f (was 1f170100, writing 1f17010a) [ 25.377326] PM: Writing back config space on device 0000:00:0a.0 at offset c (was 0, writing fdb00000) [ 25.377338] PM: Writing back config space on device 0000:00:0a.0 at offset 5 (was 1, writing c801) [ 25.377343] PM: Writing back config space on device 0000:00:0a.0 at offset 4 (was 0, writing fdc00000) [ 25.377348] PM: Writing back config space on device 0000:00:0a.0 at offset 3 (was 0, writing 4010) [ 25.377353] PM: Writing back config space on device 0000:00:0a.0 at offset 1 (was 2b00000, writing 2b00117) [ 25.377384] skge eth0: enabling interface [ 25.382084] BUG: scheduling while atomic: events/0/0x00000001/4 [ 25.382086] [ 25.382087] Call Trace: [ 25.382097] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc [ 25.382102] [<ffffffff802f34b6>] list_add+0xc/0xe [ 25.382107] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 25.382110] [<ffffffff802365ce>] worker_thread+0xb5/0x11b [ 25.382115] [<ffffffff802233e2>] default_wake_function+0x0/0xf [ 25.382119] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 25.382124] [<ffffffff80239269>] kthread+0xce/0x101 [ 25.382128] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 [ 25.382132] [<ffffffff8020a238>] child_rip+0xa/0x12 [ 25.382137] [<ffffffff8023919b>] kthread+0x0/0x101 [ 25.382140] [<ffffffff8020a22e>] child_rip+0x0/0x12 [ 25.382141] [ 25.387073] BUG: scheduling while atomic: events/0/0x00000001/4 [ 25.387074] [ 25.387075] Call Trace: [ 25.387078] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc [ 25.387081] [<ffffffff802f34b6>] list_add+0xc/0xe [ 25.387085] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 25.387088] [<ffffffff802365ce>] worker_thread+0xb5/0x11b [ 25.387092] [<ffffffff802233e2>] 
default_wake_function+0x0/0xf [ 25.387096] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 25.387099] [<ffffffff80239269>] kthread+0xce/0x101 [ 25.387103] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 [ 25.387106] [<ffffffff8020a238>] child_rip+0xa/0x12 [ 25.387111] [<ffffffff8023919b>] kthread+0x0/0x101 [ 25.387114] [<ffffffff8020a22e>] child_rip+0x0/0x12 [ 25.387115] [ 25.391072] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1 [ 25.392063] BUG: scheduling while atomic: events/0/0x00000001/4 [ 25.392065] [ 25.392065] Call Trace: [ 25.392068] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc [ 25.392072] [<ffffffff802f34b6>] list_add+0xc/0xe [ 25.392075] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 25.392079] [<ffffffff802365ce>] worker_thread+0xb5/0x11b [ 25.392082] [<ffffffff802233e2>] default_wake_function+0x0/0xf [ 25.392086] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 25.392090] [<ffffffff80239269>] kthread+0xce/0x101 [ 25.392093] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 [ 25.392097] [<ffffffff8020a238>] child_rip+0xa/0x12 [ 25.392101] [<ffffffff8023919b>] kthread+0x0/0x101 [ 25.392104] [<ffffffff8020a22e>] child_rip+0x0/0x12 [ 25.392106] [ 25.402046] ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21 [ 25.402095] usb usb2: root hub lost power or was reset [ 25.413022] ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21 [ 25.413067] usb usb3: root hub lost power or was reset [ 25.423998] ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21 [ 25.424043] usb usb4: root hub lost power or was reset [ 25.434977] ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21 [ 25.435020] usb usb5: root hub lost power or was reset [ 25.445955] ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21 [ 25.445987] usb usb1: root hub lost power or was reset [ 25.446006] ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004 [ 25.456957] ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 
22 (level, low) -> IRQ 22 [ 25.456963] PCI: Setting latency timer of device 0000:00:11.5 to 64 [ 25.460339] radeonfb (0000:01:00.0): resuming from state: 1... [ 25.478130] pnp: Failed to activate device 00:03. [ 25.478258] pnp: Failed to activate device 00:04. [ 25.479252] pnp: Device 00:0a activated. [ 25.479961] pnp: Device 00:0b activated. [ 26.365424] Restarting tasks... done [ 26.441032] BUG: sleeping function called from invalid context at include/asm/semaphore.h:105 [ 26.441035] in_atomic():1, irqs_disabled():0 [ 26.441037] [ 26.441038] Call Trace: [ 26.441046] [<ffffffff8049ff6c>] thread_return+0x0/0xf9 [ 26.441054] [<ffffffff802226ae>] __might_sleep+0xb2/0xb4 [ 26.441058] [<ffffffff8022749a>] acquire_console_sem+0x66/0x90 [ 26.441064] [<ffffffff80358086>] console_callback+0xe/0xde [ 26.441068] [<ffffffff80235fce>] run_workqueue+0xb6/0x126 [ 26.441072] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 26.441075] [<ffffffff802365ff>] worker_thread+0xe6/0x11b [ 26.441079] [<ffffffff802233e2>] default_wake_function+0x0/0xf [ 26.441083] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 26.441088] [<ffffffff80239269>] kthread+0xce/0x101 [ 26.441112] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 [ 26.441117] [<ffffffff8020a238>] child_rip+0xa/0x12 [ 26.441121] [<ffffffff8023919b>] kthread+0x0/0x101 [ 26.441125] [<ffffffff8020a22e>] child_rip+0x0/0x12 [ 26.441126] [ 26.441160] BUG: scheduling while atomic: events/0/0x00000001/4 [ 26.441162] [ 26.441162] Call Trace: [ 26.441166] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc [ 26.441171] [<ffffffff802f34b6>] list_add+0xc/0xe [ 26.441174] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 26.441178] [<ffffffff802365ce>] worker_thread+0xb5/0x11b [ 26.441181] [<ffffffff802233e2>] default_wake_function+0x0/0xf [ 26.441185] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 26.441189] [<ffffffff80239269>] kthread+0xce/0x101 [ 26.441193] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 [ 26.441196] [<ffffffff8020a238>] 
child_rip+0xa/0x12 [ 26.441201] [<ffffffff8023919b>] kthread+0x0/0x101 [ 26.441204] [<ffffffff8020a22e>] child_rip+0x0/0x12 [ 26.441206] [ 27.289287] BUG: scheduling while atomic: events/0/0x00000001/4 [ 27.289292] [ 27.289293] Call Trace: [ 27.289305] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc [ 27.289310] [<ffffffff802f34b6>] list_add+0xc/0xe [ 27.289315] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 27.289319] [<ffffffff802365ce>] worker_thread+0xb5/0x11b [ 27.289324] [<ffffffff802233e2>] default_wake_function+0x0/0xf [ 27.289328] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 27.289333] [<ffffffff80239269>] kthread+0xce/0x101 [ 27.289337] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 [ 27.289342] [<ffffffff8020a238>] child_rip+0xa/0x12 [ 27.289346] [<ffffffff8023919b>] kthread+0x0/0x101 [ 27.289349] [<ffffffff8020a22e>] child_rip+0x0/0x12 [ 27.289351] [ 27.427130] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0. [ 27.427152] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode [ 27.427228] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode [ 27.427240] [drm] Loading R200 Microcode [ 29.285334] BUG: scheduling while atomic: events/0/0x00000001/4 [ 29.285339] [ 29.285340] Call Trace: [ 29.285352] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc [ 29.285358] [<ffffffff802f34b6>] list_add+0xc/0xe [ 29.285363] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 29.285366] [<ffffffff802365ce>] worker_thread+0xb5/0x11b [ 29.285372] [<ffffffff802233e2>] default_wake_function+0x0/0xf [ 29.285376] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 29.285380] [<ffffffff80239269>] kthread+0xce/0x101 [ 29.285384] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 [ 29.285389] [<ffffffff8020a238>] child_rip+0xa/0x12 [ 29.285394] [<ffffffff8023919b>] kthread+0x0/0x101 [ 29.285397] [<ffffffff8020a22e>] child_rip+0x0/0x12 [ 29.285399] [ 31.281290] BUG: scheduling while atomic: events/0/0x00000001/4 [ 31.281295] [ 31.281296] Call Trace: [ 31.281309] 
[<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc [ 31.281314] [<ffffffff802f34b6>] list_add+0xc/0xe [ 31.281319] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 31.281323] [<ffffffff802365ce>] worker_thread+0xb5/0x11b [ 31.281329] [<ffffffff802233e2>] default_wake_function+0x0/0xf [ 31.281333] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 31.281338] [<ffffffff80239269>] kthread+0xce/0x101 [ 31.281342] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 [ 31.281346] [<ffffffff8020a238>] child_rip+0xa/0x12 [ 31.281351] [<ffffffff8023919b>] kthread+0x0/0x101 [ 31.281354] [<ffffffff8020a22e>] child_rip+0x0/0x12 [ 31.281356] [ 32.949597] ip_conntrack version 2.4 (2044 buckets, 16352 max) - 248 bytes per conntrack [ 33.277294] BUG: scheduling while atomic: events/0/0x00000001/4 [ 33.277298] [ 33.277299] Call Trace: [ 33.277311] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc [ 33.277317] [<ffffffff802f34b6>] list_add+0xc/0xe [ 33.277322] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 33.277325] [<ffffffff802365ce>] worker_thread+0xb5/0x11b [ 33.277331] [<ffffffff802233e2>] default_wake_function+0x0/0xf [ 33.277335] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 33.277340] [<ffffffff80239269>] kthread+0xce/0x101 [ 33.277344] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 [ 33.277348] [<ffffffff8020a238>] child_rip+0xa/0x12 [ 33.277353] [<ffffffff8023919b>] kthread+0x0/0x101 [ 33.277357] [<ffffffff8020a22e>] child_rip+0x0/0x12 [ 33.277359] [ 35.273273] BUG: scheduling while atomic: events/0/0x00000001/4 [ 35.273278] [ 35.273279] Call Trace: [ 35.273291] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc [ 35.273296] [<ffffffff802f34b6>] list_add+0xc/0xe [ 35.273302] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 35.273305] [<ffffffff802365ce>] worker_thread+0xb5/0x11b [ 35.273311] [<ffffffff802233e2>] default_wake_function+0x0/0xf [ 35.273315] [<ffffffff80236519>] worker_thread+0x0/0x11b [ 35.273320] [<ffffffff80239269>] kthread+0xce/0x101 [ 35.273324] [<ffffffff802234b1>] 
schedule_tail+0x30/0xa2
[ 35.273329] [<ffffffff8020a238>] child_rip+0xa/0x12
[ 35.273334] [<ffffffff8023919b>] kthread+0x0/0x101
[ 35.273337] [<ffffffff8020a22e>] child_rip+0x0/0x12
[ 35.273339]
[...]
------------------------------------------------------------------

--
Paolo Ornati
Linux 2.6.19-rc5 on x86_64

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] 2.6.19-rc5: known regressions (v2)
2006-11-11 9:25 ` Paolo Ornati
@ 2006-11-11 10:49 ` Rafael J. Wysocki
2006-11-11 12:29 ` Paolo Ornati
0 siblings, 1 reply; 91+ messages in thread
From: Rafael J. Wysocki @ 2006-11-11 10:49 UTC (permalink / raw)
To: Paolo Ornati; +Cc: Adrian Bunk, LKML

On Saturday, 11 November 2006 10:25, Paolo Ornati wrote:
> On Sat, 11 Nov 2006 10:08:37 +0100
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> > > Subject : BUG: scheduling while atomic: events/0/0x00000001/4
> > > after resume
> > > References : http://lkml.org/lkml/2006/11/2/209
> > > Submitter : Paolo Ornati <ornati@fastwebnet.it>
> > > Status : unknown
> >
> > I couldn't find anything in the report that would indicate the problem occurred
> > after a resume. Was it really the case?
>
> Ahh, I've written that in another email but I trimmed LKML from CC by
> mistake ;)
>
>
> Relevant portion of that mail follows... anyway it seems that "-rc5" is
> _OK_ since I've been running it for 2 days and it survived 9 suspend/resume
> cycles.

Okay, please let us know if it survives the next several cycles.

OTOH, the problem may be hiding.

> ------------------------------------------------------------------
>
> I've reproduced it (with rc4-g4b1c46a3), and I think it is
> suspend/resume related since the messages start flooding dmesg just
> after a resume...
>
> I'll see if it is reproducible just doing suspend/resume a couple of
> times... and if so I'll try with -rc5.
> > > dmesg (stripped at the end): > > [ 0.000000] Linux version 2.6.19-rc4-g4b1c46a3 (paolo@tux) (gcc version 4.1.1 (Gentoo 4.1.1)) #17 PREEMPT Wed Nov 1 18:36:28 CET 2006 > [ 0.000000] Command line: root=/dev/sda6 elevator=cfq video=radeonfb:1024x768@60 > [ 0.000000] BIOS-provided physical RAM map: > [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) > [ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) > [ 0.000000] BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved) > [ 0.000000] BIOS-e820: 0000000000100000 - 000000001ff30000 (usable) > [ 0.000000] BIOS-e820: 000000001ff30000 - 000000001ff40000 (ACPI data) > [ 0.000000] BIOS-e820: 000000001ff40000 - 000000001fff0000 (ACPI NVS) > [ 0.000000] BIOS-e820: 000000001fff0000 - 0000000020000000 (reserved) > [ 0.000000] BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved) > [ 0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used > [ 0.000000] Entering add_active_range(0, 256, 130864) 1 entries of 256 used > [ 0.000000] end_pfn_map = 1048576 > [ 0.000000] DMI 2.3 present. 
> [ 0.000000] ACPI: RSDP (v000 ACPIAM ) @ 0x00000000000fa850 > [ 0.000000] ACPI: RSDT (v001 A M I OEMRSDT 0x06000517 MSFT 0x00000097) @ 0x000000001ff30000 > [ 0.000000] ACPI: FADT (v001 A M I OEMFACP 0x06000517 MSFT 0x00000097) @ 0x000000001ff30200 > [ 0.000000] ACPI: MADT (v001 A M I OEMAPIC 0x06000517 MSFT 0x00000097) @ 0x000000001ff30390 > [ 0.000000] ACPI: OEMB (v001 A M I OEMBIOS 0x06000517 MSFT 0x00000097) @ 0x000000001ff40040 > [ 0.000000] ACPI: DSDT (v001 A0058 A0058002 0x00000002 MSFT 0x0100000d) @ 0x0000000000000000 > [ 0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used > [ 0.000000] Entering add_active_range(0, 256, 130864) 1 entries of 256 used > [ 0.000000] Zone PFN ranges: > [ 0.000000] DMA 0 -> 4096 > [ 0.000000] DMA32 4096 -> 1048576 > [ 0.000000] Normal 1048576 -> 1048576 > [ 0.000000] early_node_map[2] active PFN ranges > [ 0.000000] 0: 0 -> 159 > [ 0.000000] 0: 256 -> 130864 > [ 0.000000] On node 0 totalpages: 130767 > [ 0.000000] DMA zone: 56 pages used for memmap > [ 0.000000] DMA zone: 1183 pages reserved > [ 0.000000] DMA zone: 2760 pages, LIFO batch:0 > [ 0.000000] DMA32 zone: 1733 pages used for memmap > [ 0.000000] DMA32 zone: 125035 pages, LIFO batch:31 > [ 0.000000] Normal zone: 0 pages used for memmap > [ 0.000000] ACPI: PM-Timer IO Port: 0x808 > [ 0.000000] ACPI: Local APIC address 0xfee00000 > [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) > [ 0.000000] Processor #0 (Bootup-CPU) > [ 0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0]) > [ 0.000000] IOAPIC[0]: apic_id 1, address 0xfec00000, GSI 0-23 > [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) > [ 0.000000] ACPI: IRQ0 used by override. > [ 0.000000] ACPI: IRQ2 used by override. > [ 0.000000] ACPI: IRQ9 used by override. 
> [ 0.000000] Setting APIC routing to flat > [ 0.000000] Using ACPI (MADT) for SMP configuration information > [ 0.000000] Nosave address range: 000000000009f000 - 00000000000a0000 > [ 0.000000] Nosave address range: 00000000000a0000 - 00000000000e4000 > [ 0.000000] Nosave address range: 00000000000e4000 - 0000000000100000 > [ 0.000000] Allocating PCI resources starting at 30000000 (gap: 20000000:dff80000) > [ 0.000000] Built 1 zonelists. Total pages: 127795 > [ 0.000000] Kernel command line: root=/dev/sda6 elevator=cfq video=radeonfb:1024x768@60 > [ 0.000000] Initializing CPU#0 > [ 0.000000] PID hash table entries: 2048 (order: 11, 16384 bytes) > [ 32.727602] time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer. > [ 32.727605] time.c: Detected 2202.943 MHz processor. > [ 32.730265] Console: colour VGA+ 80x25 > [ 32.733073] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes) > [ 32.733509] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes) > [ 32.733646] Checking aperture... > [ 32.733705] CPU 0: aperture @ f8000000 size 64 MB > [ 32.740581] Memory: 507900k/523456k available (2708k kernel code, 14716k reserved, 1343k data, 200k init) > [ 32.800438] Calibrating delay using timer specific routine.. 4409.08 BogoMIPS (lpj=2204542) > [ 32.800591] Mount-cache hash table entries: 256 > [ 32.800771] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) > [ 32.800833] CPU: L2 Cache: 512K (64 bytes/line) > [ 32.800913] CPU: AMD Athlon(tm) 64 Processor 3200+ stepping 00 > [ 32.801063] ACPI: Core revision 20060707 > [ 32.814145] Using local APIC timer interrupts. > [ 32.859518] result 12516743 > [ 32.859573] Detected 12.516 MHz APIC timer. > [ 32.860328] testing NMI watchdog ... OK. 
> [ 32.870515] checking if image is initramfs...it isn't (bad gzip magic numbers); looks like an initrd > [ 32.873517] Freeing initrd memory: 2000k freed > [ 32.875609] NET: Registered protocol family 16 > [ 32.875754] ACPI: bus type pci registered > [ 32.875818] PCI: Using configuration type 1 > [ 32.881215] ACPI: Interpreter enabled > [ 32.881276] ACPI: Using IOAPIC for interrupt routing > [ 32.882250] ACPI: PCI Root Bridge [PCI0] (0000:00) > [ 32.882315] PCI: Probing PCI hardware (bus 00) > [ 32.884489] PCI: enabled onboard AC97/MC97 devices > [ 32.884765] Boot video device is 0000:01:00.0 > [ 32.884843] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] > [ 32.896169] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 10 *11 14 15) > [ 32.896831] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 *10 11 14 15) > [ 32.897488] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 *7 10 11 14 15) > [ 32.898143] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 7 10 11 14 15) > [ 32.898809] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 10 11 14 15) *0, disabled. > [ 32.899560] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 10 11 14 15) *0, disabled. > [ 32.900312] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 10 11 14 15) *0, disabled. > [ 32.901052] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 10 11 14 15) *0, disabled. > [ 32.901700] Linux Plug and Play Support v0.97 (c) Adam Belay > [ 32.901768] pnp: PnP ACPI init > [ 32.904855] pnp: PnP ACPI: found 13 devices > [ 32.905045] SCSI subsystem initialized > [ 32.905158] usbcore: registered new interface driver usbfs > [ 32.905247] usbcore: registered new interface driver hub > [ 32.905335] usbcore: registered new device driver usb > [ 32.905452] PCI: Using ACPI for IRQ routing > [ 32.905511] PCI: If a device doesn't work, try "pci=routeirq". 
If it helps, post a report > [ 32.905592] PCI: Cannot allocate resource region 0 of device 0000:00:00.0 > [ 32.905754] agpgart: Detected AGP bridge 0 > [ 32.908913] agpgart: AGP aperture is 64M @ 0xf8000000 > [ 32.908997] PCI-DMA: Disabling IOMMU. > [ 32.909754] PCI: Bridge: 0000:00:01.0 > [ 32.909812] IO window: a000-afff > [ 32.909871] MEM window: fd100000-fd6fffff > [ 32.909931] PREFETCH window: d5000000-f4ffffff > [ 32.910006] PCI: Setting latency timer of device 0000:00:01.0 to 64 > [ 32.910026] NET: Registered protocol family 2 > [ 32.918232] IP route cache hash table entries: 4096 (order: 3, 32768 bytes) > [ 32.918368] TCP established hash table entries: 16384 (order: 5, 131072 bytes) > [ 32.918516] TCP bind hash table entries: 8192 (order: 4, 65536 bytes) > [ 32.918624] TCP: Hash tables configured (established 16384 bind 8192) > [ 32.918684] TCP reno registered > [ 32.919334] io scheduler noop registered > [ 32.919440] io scheduler cfq registered (default) > [ 32.920310] ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16 > [ 32.920477] radeonfb: Found Intel x86 BIOS ROM Image > [ 32.920539] radeonfb: Retrieved PLL infos from BIOS > [ 32.920598] radeonfb: Reference=27.00 MHz (RefDiv=12) Memory=200.00 Mhz, System=166.00 MHz > [ 32.920671] radeonfb: PLL min 20000 max 40000 > [ 32.921735] radeonfb: Monitor 1 type CRT found > [ 32.921793] radeonfb: Monitor 2 type no found > [ 32.955438] Console: switching to colour frame buffer device 128x48 > [ 32.975175] radeonfb (0000:01:00.0): ATI Radeon Yd > [ 32.975352] ACPI: Power Button (FF) [PWRF] > [ 32.975477] ACPI: Power Button (CM) [PWRB] > [ 32.975586] ACPI: Sleep Button (CM) [SLPB] > [ 32.977202] Real Time Clock Driver v1.12ac > [ 32.977313] Linux agpgart interface v0.101 (c) Dave Jones > [ 32.977457] [drm] Initialized drm 1.0.1 20051102 > [ 32.978126] [drm] Initialized radeon 1.25.0 20060524 on minor 0 > [ 32.978294] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled > [ 
32.978587] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A > [ 32.978834] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A > [ 32.979259] 00:0a: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A > [ 32.979511] 00:0b: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A > [ 32.979773] Floppy drive(s): fd0 is 1.44M > [ 32.994906] FDC 0 is a post-1991 82077 > [ 32.996382] RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize > [ 32.996766] loop: loaded (max 8 devices) > [ 32.996906] ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 17 > [ 32.999471] skge 1.9 addr 0xfdc00000 irq 17 chip Yukon-Lite rev 9 > [ 33.001981] skge eth0: addr 00:11:d8:1c:a0:7a > [ 33.004520] 8139too Fast Ethernet driver 0.9.28 > [ 33.007047] ACPI: PCI Interrupt 0000:00:0e.0[A] -> GSI 19 (level, low) -> IRQ 19 > [ 33.010258] eth1: RealTek RTL8139 at 0xffffc20000004000, 00:e0:4c:f0:ab:b8, IRQ 19 > [ 33.013135] eth1: Identified 8139 chip type 'RTL-8139C' > [ 33.013151] Linux video capture interface: v2.00 > [ 33.016118] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 > [ 33.019236] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx > [ 33.022583] VP_IDE: IDE controller at PCI slot 0000:00:0f.1 > [ 33.026030] ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 20 > [ 33.029676] VP_IDE: chipset revision 6 > [ 33.033379] VP_IDE: not 100% native mode: will probe irqs later > [ 33.037248] VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1 > [ 33.041286] ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio > [ 33.045502] ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio > [ 33.049751] Probing IDE interface ide0... > [ 33.841448] hda: HL-DT-ST DVDRAM GSA-4167B, ATAPI CD/DVD-ROM drive > [ 34.152000] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 > [ 34.156490] Probing IDE interface ide1... 
> [ 34.948226] hdc: HL-DT-ST GCE-8400B, ATAPI CD/DVD-ROM drive > [ 35.257686] ide1 at 0x170-0x177,0x376 on irq 15 > [ 35.264947] hda: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA(33) > [ 35.269964] Uniform CD-ROM driver Revision: 3.20 > [ 35.283252] hdc: ATAPI 40X CD-ROM CD-R/RW drive, 2048kB Cache, DMA > [ 35.289395] libata version 2.00 loaded. > [ 35.289419] sata_via 0000:00:0f.0: version 2.0 > [ 35.289430] ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 20 > [ 35.294798] sata_via 0000:00:0f.0: routed to hard irq line 10 > [ 35.300148] ata1: SATA max UDMA/133 cmd 0xE800 ctl 0xE402 bmdma 0xD400 irq 20 > [ 35.305530] ata2: SATA max UDMA/133 cmd 0xE000 ctl 0xD802 bmdma 0xD408 irq 20 > [ 35.310844] scsi0 : sata_via > [ 35.515983] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) > [ 35.673125] ata1.00: ATA-6, max UDMA/133, 156301488 sectors: LBA48 NCQ (depth 0/32) > [ 35.678757] ata1.00: ata1: dev 0 multi count 16 > [ 35.686006] ata1.00: configured for UDMA/133 > [ 35.691568] scsi1 : sata_via > [ 35.897218] ata2: SATA link down 1.5 Gbps (SStatus 0 SControl 300) > [ 35.913718] ATA: abnormal status 0x7F on port 0xE007 > [ 35.919332] scsi 0:0:0:0: Direct-Access ATA ST380817AS 3.42 PQ: 0 ANSI: 5 > [ 35.925134] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB) > [ 35.930882] sda: Write Protect is off > [ 35.936662] sda: Mode Sense: 00 3a 00 00 > [ 35.936675] SCSI device sda: drive cache: write back > [ 35.942505] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB) > [ 35.948370] sda: Write Protect is off > [ 35.954136] sda: Mode Sense: 00 3a 00 00 > [ 35.954148] SCSI device sda: drive cache: write back > [ 35.959889] sda: sda1 sda2 < sda5 sda6 sda7 sda8 > > [ 36.027032] sd 0:0:0:0: Attached scsi disk sda > [ 36.032710] sd 0:0:0:0: Attached scsi generic sg0 type 0 > [ 36.038359] ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21 > [ 36.044338] ehci_hcd 0000:00:10.4: EHCI Host Controller > [ 
36.050131] ehci_hcd 0000:00:10.4: new USB bus registered, assigned bus number 1 > [ 36.055834] ehci_hcd 0000:00:10.4: irq 21, io mem 0xfd900000 > [ 36.061409] ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004 > [ 36.067109] usb usb1: configuration #1 chosen from 1 choice > [ 36.072686] hub 1-0:1.0: USB hub found > [ 36.078141] hub 1-0:1.0: 8 ports detected > [ 36.183732] USB Universal Host Controller Interface driver v3.0 > [ 36.189136] ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21 > [ 36.194595] uhci_hcd 0000:00:10.0: UHCI Host Controller > [ 36.200026] uhci_hcd 0000:00:10.0: new USB bus registered, assigned bus number 2 > [ 36.205457] uhci_hcd 0000:00:10.0: irq 21, io base 0x0000b000 > [ 36.211000] usb usb2: configuration #1 chosen from 1 choice > [ 36.216430] hub 2-0:1.0: USB hub found > [ 36.221776] hub 2-0:1.0: 2 ports detected > [ 36.327443] ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21 > [ 36.332864] uhci_hcd 0000:00:10.1: UHCI Host Controller > [ 36.338237] uhci_hcd 0000:00:10.1: new USB bus registered, assigned bus number 3 > [ 36.343600] uhci_hcd 0000:00:10.1: irq 21, io base 0x0000b400 > [ 36.348985] usb usb3: configuration #1 chosen from 1 choice > [ 36.354262] hub 3-0:1.0: USB hub found > [ 36.359430] hub 3-0:1.0: 2 ports detected > [ 36.465147] ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21 > [ 36.470440] uhci_hcd 0000:00:10.2: UHCI Host Controller > [ 36.475698] uhci_hcd 0000:00:10.2: new USB bus registered, assigned bus number 4 > [ 36.480965] uhci_hcd 0000:00:10.2: irq 21, io base 0x0000b800 > [ 36.486227] usb usb4: configuration #1 chosen from 1 choice > [ 36.491386] hub 4-0:1.0: USB hub found > [ 36.496528] hub 4-0:1.0: 2 ports detected > [ 36.601875] ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21 > [ 36.607110] uhci_hcd 0000:00:10.3: UHCI Host Controller > [ 36.612333] uhci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 5 
> [ 36.617549] uhci_hcd 0000:00:10.3: irq 21, io base 0x0000c000 > [ 36.622729] usb usb5: configuration #1 chosen from 1 choice > [ 36.627812] hub 5-0:1.0: USB hub found > [ 36.632845] hub 5-0:1.0: 2 ports detected > [ 37.161683] usbcore: registered new interface driver cdc_acm > [ 37.166690] drivers/usb/class/cdc-acm.c: v0.25:USB Abstract Control Model driver for USB modems and ISDN adapters > [ 37.171947] usbcore: registered new interface driver usblp > [ 37.177212] drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver > [ 37.182561] Initializing USB Mass Storage driver... > [ 37.187966] usbcore: registered new interface driver usb-storage > [ 37.193343] USB Mass Storage support registered. > [ 37.198710] PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12 > [ 37.204404] serio: i8042 KBD port at 0x60,0x64 irq 1 > [ 37.209839] serio: i8042 AUX port at 0x60,0x64 irq 12 > [ 37.215222] mice: PS/2 mouse device common for all mice > [ 37.220524] i2c /dev entries driver > [ 37.226206] Advanced Linux Sound Architecture Driver Version 1.0.13 (Sun Oct 22 08:56:16 2006 UTC). > [ 37.231946] ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 22 > [ 37.239539] PCI: Setting latency timer of device 0000:00:11.5 to 64 > [ 37.251526] input: AT Translated Set 2 keyboard as /class/input/input0 > [ 37.751566] codec_read: codec 0 is not valid [0xfe0000] > [ 37.764869] codec_read: codec 0 is not valid [0xfe0000] > [ 37.778121] codec_read: codec 0 is not valid [0xfe0000] > [ 37.791242] codec_read: codec 0 is not valid [0xfe0000] > [ 37.808500] ALSA device list: > [ 37.813631] #0: VIA 8237 with AD1980 at 0xec00, irq 22 > [ 37.818899] oprofile: using NMI interrupt. 
> [ 37.824105] TCP cubic registered > [ 37.829169] NET: Registered protocol family 1 > [ 37.834229] NET: Registered protocol family 17 > [ 37.839196] NET: Registered protocol family 15 > [ 37.844134] ACPI: (supports S0 S1 S3 S4 S5) > [ 37.961630] input: ImPS/2 Logitech Wheel Mouse as /class/input/input1 > [ 38.050012] RAMDISK: ext2 filesystem found at block 0 > [ 38.054928] RAMDISK: Loading 2000KiB [1 disk] into ram disk... |\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\b/\b-\b\\b|\bdone. > [ 38.066647] VFS: Mounted root (ext2 filesystem). > [ 38.129633] kjournald starting. Commit interval 5 seconds > [ 38.134550] EXT3-fs: mounted filesystem with ordered data mode. > [ 38.139493] VFS: Mounted root (ext3 filesystem) readonly. > [ 38.144469] Trying to move old root to /initrd ... /initrd does not exist. Ignored. > [ 38.149670] Unmounting old root > [ 38.154701] Trying to free ramdisk memory ... okay > [ 38.159878] Freeing unused kernel memory: 200k freed > [ 38.164962] Write protecting the kernel read-only data: 560k > [ 40.454619] warning: process `touch' used the removed sysctl system call > [ 40.890041] warning: process `sleep' used the removed sysctl system call > [ 40.999232] warning: process `sleep' used the removed sysctl system call > [ 41.104730] warning: process `sleep' used the removed sysctl system call > [ 41.873966] warning: process `sleep' used the removed sysctl system call > [ 43.453372] EXT3 FS on sda6, internal journal > [ 43.999154] kjournald starting. 
Commit interval 5 seconds > [ 43.999164] EXT3-fs warning: maximal mount count reached, running e2fsck is recommended > [ 43.999323] EXT3 FS on sda8, internal journal > [ 43.999328] EXT3-fs: mounted filesystem with ordered data mode. > [ 44.117933] Adding 1004016k swap on /dev/sda7. Priority:-1 extents:1 across:1004016k > [ 50.175883] skge eth0: enabling interface > [ 62.993341] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0. > [ 62.993361] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode > [ 62.993436] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode > [ 63.288516] [drm] Setting GART location based on new memory map > [ 63.288604] [drm] Loading R200 Microcode > [ 63.288694] [drm] writeback test succeeded in 1 usecs > [ 79.495845] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1 > [ 81.127324] ip_tables: (C) 2000-2006 Netfilter Core Team > [ 81.170715] ip_conntrack version 2.4 (2044 buckets, 16352 max) - 248 bytes per conntrack > [ 180.334016] Installing knfsd (copyright (C) 1996 okir@monad.swb.de). > [ 306.826671] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0. > [ 306.826693] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode > [ 306.826768] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode > [ 306.826780] [drm] Loading R200 Microcode > [ 340.827814] Stopping tasks: ==========================================================================================================================================| > [ 340.828658] Shrinking memory... \b-\b\\b|\b/\b-\bdone (51711 pages freed) > [ 340.925349] Suspending console(s) > [ 341.809883] pnp: Device 00:0b disabled. > [ 341.810124] pnp: Device 00:0a disabled. > [ 341.810148] radeonfb (0000:01:00.0): suspending for event: 1... 
> [ 341.887207] skge eth0: disabling interface > [ 341.899219] pci_set_power_state(): 0000:00:00.0: state=3, current state=5 > [ 341.912910] swsusp: Need to copy 63368 pages > [ 26.062737] APIC error on CPU0: 00(00) > [ 26.062827] PCI: Setting latency timer of device 0000:00:01.0 to 64 > [ 26.146592] PM: Writing back config space on device 0000:00:0a.0 at offset f (was 1f170100, writing 1f17010a) > [ 26.146599] PM: Writing back config space on device 0000:00:0a.0 at offset c (was 0, writing fdb00000) > [ 26.146609] PM: Writing back config space on device 0000:00:0a.0 at offset 5 (was 1, writing c801) > [ 26.146614] PM: Writing back config space on device 0000:00:0a.0 at offset 4 (was 0, writing fdc00000) > [ 26.146619] PM: Writing back config space on device 0000:00:0a.0 at offset 3 (was 0, writing 4010) > [ 26.146626] PM: Writing back config space on device 0000:00:0a.0 at offset 1 (was 2b00000, writing 2b00117) > [ 26.146658] skge eth0: enabling interface > [ 26.160202] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1 > [ 26.171176] ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21 > [ 26.171218] usb usb2: root hub lost power or was reset > [ 26.182150] ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21 > [ 26.182188] usb usb3: root hub lost power or was reset > [ 26.193128] ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21 > [ 26.193165] usb usb4: root hub lost power or was reset > [ 26.204106] ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21 > [ 26.204143] usb usb5: root hub lost power or was reset > [ 26.215084] ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21 > [ 26.215109] usb usb1: root hub lost power or was reset > [ 26.215127] ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004 > [ 26.226082] ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 22 > [ 26.226088] PCI: Setting latency timer of device 0000:00:11.5 to 64 > [ 26.229463] 
radeonfb (0000:01:00.0): resuming from state: 1... > [ 26.247263] pnp: Failed to activate device 00:03. > [ 26.247391] pnp: Failed to activate device 00:04. > [ 26.248318] pnp: Device 00:0a activated. > [ 26.249004] pnp: Device 00:0b activated. > [ 27.134110] Restarting tasks... done > [ 27.565554] agpgart: Found an AGP 3.5 compliant device at 0000:00:00.0. > [ 27.565593] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode > [ 27.565670] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode > [ 27.565682] [drm] Loading R200 Microcode > [ 28.443446] ip_conntrack version 2.4 (2044 buckets, 16352 max) - 248 bytes per conntrack > [ 752.692523] Stopping tasks: ======================================================================================================================================| > [ 752.693363] Shrinking memory... \b-\b\\b|\b/\b-\b\\bdone (58183 pages freed) > [ 756.669812] Suspending console(s) > [ 757.578446] pnp: Device 00:0b disabled. > [ 757.578702] pnp: Device 00:0a disabled. > [ 757.578727] radeonfb (0000:01:00.0): suspending for event: 1... 
> [ 757.655322] skge eth0: disabling interface > [ 757.695225] swsusp: Need to copy 58533 pages > [ 25.139916] APIC error on CPU0: 00(00) > [ 25.293551] PCI: Setting latency timer of device 0000:00:01.0 to 64 > [ 25.377319] PM: Writing back config space on device 0000:00:0a.0 at offset f (was 1f170100, writing 1f17010a) > [ 25.377326] PM: Writing back config space on device 0000:00:0a.0 at offset c (was 0, writing fdb00000) > [ 25.377338] PM: Writing back config space on device 0000:00:0a.0 at offset 5 (was 1, writing c801) > [ 25.377343] PM: Writing back config space on device 0000:00:0a.0 at offset 4 (was 0, writing fdc00000) > [ 25.377348] PM: Writing back config space on device 0000:00:0a.0 at offset 3 (was 0, writing 4010) > [ 25.377353] PM: Writing back config space on device 0000:00:0a.0 at offset 1 (was 2b00000, writing 2b00117) > [ 25.377384] skge eth0: enabling interface > [ 25.382084] BUG: scheduling while atomic: events/0/0x00000001/4 > [ 25.382086] > [ 25.382087] Call Trace: > [ 25.382097] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc > [ 25.382102] [<ffffffff802f34b6>] list_add+0xc/0xe > [ 25.382107] [<ffffffff80236519>] worker_thread+0x0/0x11b > [ 25.382110] [<ffffffff802365ce>] worker_thread+0xb5/0x11b > [ 25.382115] [<ffffffff802233e2>] default_wake_function+0x0/0xf > [ 25.382119] [<ffffffff80236519>] worker_thread+0x0/0x11b > [ 25.382124] [<ffffffff80239269>] kthread+0xce/0x101 > [ 25.382128] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 > [ 25.382132] [<ffffffff8020a238>] child_rip+0xa/0x12 > [ 25.382137] [<ffffffff8023919b>] kthread+0x0/0x101 > [ 25.382140] [<ffffffff8020a22e>] child_rip+0x0/0x12 Apparently, the kernel thinks that worker_thread() is running in the atomic context, so there may be a problem with preempt_count(), for example. Is preemption enabled in your kernel(s)? Greetings, Rafael -- You never change things by fighting the existing reality. R. Buckminster Fuller ^ permalink raw reply [flat|nested] 91+ messages in thread
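Editorial aside on reading the BUG line quoted above: a minimal sketch, assuming the 2.6.19-era message layout "comm/preempt_count/pid" and assuming that the low 8 bits of preempt_count hold the preemption-disable depth. The function name parse_sched_atomic is hypothetical and is not part of any kernel tool. The task name itself may contain a slash (as in "events/0"), so the fields are split from the right.

```python
import re

# Matches the tail of a "scheduling while atomic" line as printed in the log above.
_BUG_RE = re.compile(r"BUG: scheduling while atomic: (?P<rest>\S+)")


def parse_sched_atomic(line):
    """Split 'comm/0xCOUNT/pid' into its fields, or return None if absent.

    Assumption: the low 8 bits of preempt_count are the preemption depth.
    """
    match = _BUG_RE.search(line)
    if match is None:
        return None
    parts = match.group("rest").rsplit("/", 2)  # comm may itself contain '/'
    if len(parts) != 3:
        return None
    comm, count_hex, pid = parts
    count = int(count_hex, 16)
    return {
        "comm": comm,
        "preempt_count": count,
        "pid": int(pid),
        "preempt_depth": count & 0xFF,
    }


if __name__ == "__main__":
    sample = "[ 25.382084] BUG: scheduling while atomic: events/0/0x00000001/4"
    print(parse_sched_atomic(sample))
```

Under those assumptions, the reported value 0x00000001 reads as a preemption depth of 1 for the events/0 worker, which is consistent with the suspicion about preempt_count() raised in the message above.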
* Re: [discuss] 2.6.19-rc5: known regressions (v2) 2006-11-11 10:49 ` Rafael J. Wysocki @ 2006-11-11 12:29 ` Paolo Ornati 2006-11-14 16:44 ` Paolo Ornati 0 siblings, 1 reply; 91+ messages in thread From: Paolo Ornati @ 2006-11-11 12:29 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Adrian Bunk, LKML [-- Attachment #1: Type: text/plain, Size: 2699 bytes --] On Sat, 11 Nov 2006 11:49:26 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > > > > Subject : BUG: scheduling while atomic: events/0/0x00000001/4 > > > > after resume > > > > References : http://lkml.org/lkml/2006/11/2/209 > > > > Submitter : Paolo Ornati <ornati@fastwebnet.it> > > > > Status : unknown > > > > > > I couldn't find anything in the report that would indicate the problem occured > > > after a resume. Was it really the case? > > > > Ahh, I've written that in another email but I trimmed LKML from CC by > > mistake ;) > > > > > > Relevant portion of that mail follows... anyway it seems that "-rc5" is > > _OK_ since I'm running it by 2 days and it survived 9 suspend/resume > > cycles. > > Okay, please let us know if it survives the next several cycles. > > OTOH, the problem may be hiding. Ok, and if it survives againg and again I can do a partial bisection... so that someone could guess the change that hides/fixes this and I can revert it on top of "-rc5" to confirm. > > > ------------------------------------------------------------------ > > > > I've reproduced it (with rc4-g4b1c46a3), and I think it is > > suspend/resume related sice the messages start flooding dmesg just > > after a resume... > > > > I'll see if it is reproducible just doing suspend/resume a couple of > > times... and if so I'll try with -rc5. 
> > > > > > dmesg (stripped at the end): > > > > [ 0.000000] Linux version 2.6.19-rc4-g4b1c46a3 (paolo@tux) (gcc version 4.1.1 (Gentoo 4.1.1)) #17 PREEMPT Wed Nov 1 18:36:28 CET 2006 [CUT] > > [ 25.382084] BUG: scheduling while atomic: events/0/0x00000001/4 > > [ 25.382086] > > [ 25.382087] Call Trace: > > [ 25.382097] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc > > [ 25.382102] [<ffffffff802f34b6>] list_add+0xc/0xe > > [ 25.382107] [<ffffffff80236519>] worker_thread+0x0/0x11b > > [ 25.382110] [<ffffffff802365ce>] worker_thread+0xb5/0x11b > > [ 25.382115] [<ffffffff802233e2>] default_wake_function+0x0/0xf > > [ 25.382119] [<ffffffff80236519>] worker_thread+0x0/0x11b > > [ 25.382124] [<ffffffff80239269>] kthread+0xce/0x101 > > [ 25.382128] [<ffffffff802234b1>] schedule_tail+0x30/0xa2 > > [ 25.382132] [<ffffffff8020a238>] child_rip+0xa/0x12 > > [ 25.382137] [<ffffffff8023919b>] kthread+0x0/0x101 > > [ 25.382140] [<ffffffff8020a22e>] child_rip+0x0/0x12 > > Apparently, the kernel thinks that worker_thread() is running in the atomic > context, so there may be a problem with preempt_count(), for example. > > Is preemption enabled in your kernel(s)? YES (see first line of dmesg) - full config attached -- Paolo Ornati Linux 2.6.19-rc5 on x86_64 [-- Attachment #2: CONFIG.gz --] [-- Type: application/x-gzip, Size: 9060 bytes --] ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] 2.6.19-rc5: known regressions (v2) 2006-11-11 12:29 ` Paolo Ornati @ 2006-11-14 16:44 ` Paolo Ornati 2006-11-29 10:10 ` [SOLVED] " Paolo Ornati 0 siblings, 1 reply; 91+ messages in thread From: Paolo Ornati @ 2006-11-14 16:44 UTC (permalink / raw) To: Paolo Ornati; +Cc: Rafael J. Wysocki, Adrian Bunk, LKML On Sat, 11 Nov 2006 13:29:29 +0100 Paolo Ornati <ornati@fastwebnet.it> wrote: > > Okay, please let us know if it survives the next several cycles. > > > > OTOH, the problem may be hiding. > > Ok, and if it survives againg and again I can do a partial bisection... "-rc5" is still alive: 6 days of uptime using suspend/resume many times every day... so if the problem is there it's hiding very well. Now I'll slowly go back with older kernels and see what happens... -- Paolo Ornati Linux 2.6.19-rc5 on x86_64 ^ permalink raw reply [flat|nested] 91+ messages in thread
* [SOLVED] Re: [discuss] 2.6.19-rc5: known regressions (v2)
2006-11-14 16:44 ` Paolo Ornati
@ 2006-11-29 10:10 ` Paolo Ornati
0 siblings, 0 replies; 91+ messages in thread
From: Paolo Ornati @ 2006-11-29 10:10 UTC (permalink / raw)
To: Paolo Ornati; +Cc: Rafael J. Wysocki, Adrian Bunk, LKML

On Tue, 14 Nov 2006 17:44:51 +0100
Paolo Ornati <ornati@fastwebnet.it> wrote:

> > > Okay, please let us know if it survives the next several cycles.
> > >
> > > OTOH, the problem may be hiding.
> >
> > Ok, and if it survives again and again I can do a partial bisection...
>
> "-rc5" is still alive: 6 days of uptime using suspend/resume many times
> every day...
>
> so if the problem is there it's hiding very well.
>
>
> Now I'll slowly go back with older kernels and see what happens...

SHORT CONCLUSION: it was just a kernel miscompilation (I usually do
"make oldconfig; make clean; make", so I don't know whether I missed
"make clean" or whether it was caused by ccache...).

The fact that it's a miscompilation is "proved" by 3 simple things:

1) I've only seen the problem with that particular version.

2) Slow bisection indicated that the hypothetical bug was fixed between
4b1c46a3..d1ed6a3e, but I don't see any change that matters (on x86_64).

3) I'm running a cleanly recompiled 2.6.19-rc4-g4b1c46a3, which doesn't
have any problem.

:D

--
Paolo Ornati
Linux 2.6.19-rc4-g4b1c46a3 on x86_64

^ permalink raw reply [flat|nested] 91+ messages in thread
* 2.6.19-rc5: known regressions with patches
2006-11-08 2:33 Linux 2.6.19-rc5 Linus Torvalds
` (2 preceding siblings ...)
[not found] ` <20061111015035.GU4729@stusta.de>
@ 2006-11-13 22:14 ` Adrian Bunk
2006-11-13 22:56 ` Brian King
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
4 siblings, 1 reply; 91+ messages in thread
From: Adrian Bunk @ 2006-11-13 22:14 UTC (permalink / raw)
To: Linus Torvalds, Andrew Morton
Cc: Linux Kernel Mailing List, Aaron Durbin, Mel Gorman, ak, discuss,
    Paul Mackerras, Brian King, jgarzik, linux-ide

This email lists some known regressions in 2.6.19-rc5 compared to 2.6.18
with patches available.

If you find your name in the Cc header, you are either the submitter of
one of the bugs, the maintainer of an affected subsystem or driver, the
author of a patch that caused a breakage, or someone I consider possibly
involved in some other way with one or more of these issues.

Due to the huge number of recipients, please trim the Cc when answering.

Subject    : x86_64: Fix partial page check to ensure unusable memory is
             not being marked usable
References : http://lkml.org/lkml/2006/11/9/239
Submitter  : Aaron Durbin <adurbin@google.com>
Caused-By  : Mel Gorman <mel@csn.ul.ie>
             commit 5cb248abf5ab65ab543b2d5fc16c738b28031fc0
Patch      : http://lkml.org/lkml/2006/11/9/239
Status     : patch available

Subject    : libata must be initialized earlier
References : http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027945.html
Submitter  : Paul Mackerras <paulus@samba.org>
Handled-By : Brian King <brking@us.ibm.com>
Patch      : http://marc.theaimsgroup.com/?l=linux-ide&m=116169938407596&w=2
Status     : patch available

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions with patches 2006-11-13 22:14 ` 2.6.19-rc5: known regressions with patches Adrian Bunk @ 2006-11-13 22:56 ` Brian King 2006-11-13 23:15 ` Linus Torvalds 0 siblings, 1 reply; 91+ messages in thread From: Brian King @ 2006-11-13 22:56 UTC (permalink / raw) To: Adrian Bunk Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, Paul Mackerras, jgarzik, linux-ide Adrian Bunk wrote: > Subject : libata must be initialized earlier > References : http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027945.html > Submitter : Paul Mackerras <paulus@samba.org> > Handled-By : Brian King <brking@us.ibm.com> > Patch : http://marc.theaimsgroup.com/?l=linux-ide&m=116169938407596&w=2 > Status : patch available I just resubmitted this patch a few minutes ago. Brian -- Brian King eServer Storage I/O IBM Linux Technology Center ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions with patches 2006-11-13 22:56 ` Brian King @ 2006-11-13 23:15 ` Linus Torvalds 2006-11-14 2:35 ` Jeff Garzik 0 siblings, 1 reply; 91+ messages in thread From: Linus Torvalds @ 2006-11-13 23:15 UTC (permalink / raw) To: Brian King Cc: Adrian Bunk, Andrew Morton, Linux Kernel Mailing List, Paul Mackerras, jgarzik, linux-ide On Mon, 13 Nov 2006, Brian King wrote: > Adrian Bunk wrote: > > Subject : libata must be initialized earlier > > References : http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027945.html > > Submitter : Paul Mackerras <paulus@samba.org> > > Handled-By : Brian King <brking@us.ibm.com> > > Patch : http://marc.theaimsgroup.com/?l=linux-ide&m=116169938407596&w=2 > > Status : patch available > > I just resubmitted this patch a few minutes ago. I definitely want an ACK on this from Jeff - I'll take a few broken ppc64 machines any day over the worry that there might be problems elsewhere. Jeff? Ack, Nack, or "I'll push it to you through my git tree", please.. Linus ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions with patches 2006-11-13 23:15 ` Linus Torvalds @ 2006-11-14 2:35 ` Jeff Garzik 0 siblings, 0 replies; 91+ messages in thread From: Jeff Garzik @ 2006-11-14 2:35 UTC (permalink / raw) To: Linus Torvalds Cc: Brian King, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List, Paul Mackerras, linux-ide Linus Torvalds wrote: > > On Mon, 13 Nov 2006, Brian King wrote: > >> Adrian Bunk wrote: >>> Subject : libata must be initialized earlier >>> References : http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027945.html >>> Submitter : Paul Mackerras <paulus@samba.org> >>> Handled-By : Brian King <brking@us.ibm.com> >>> Patch : http://marc.theaimsgroup.com/?l=linux-ide&m=116169938407596&w=2 >>> Status : patch available >> I just resubmitted this patch a few minutes ago. > > I definitely want an ACK on this from Jeff - I'll take a few broken ppc64 > machines any day over the worry that there might be problems elsewhere. > > Jeff? Ack, Nack, or "I'll push it to you through my git tree", please.. Reluctant ACK. But this whole subsys_init() mess is highly fragile, and this is going to change again once a new dependency arises :/ Jeff ^ permalink raw reply [flat|nested] 91+ messages in thread
* 2.6.19-rc5: known regressions (v3)
2006-11-08 2:33 Linux 2.6.19-rc5 Linus Torvalds
` (3 preceding siblings ...)
2006-11-13 22:14 ` 2.6.19-rc5: known regressions with patches Adrian Bunk
@ 2006-11-15 10:21 ` Adrian Bunk
2006-11-15 10:35 ` Jens Axboe
` (4 more replies)
4 siblings, 5 replies; 91+ messages in thread
From: Adrian Bunk @ 2006-11-15 10:21 UTC (permalink / raw)
To: Linus Torvalds, Andrew Morton
Cc: Linux Kernel Mailing List, Stephen Hemminger, gregkh, linux-pci,
    Komuro, Eric W. Biederman, Ingo Molnar, Ernst Herzberg, Len Brown,
    Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el,
    oprofile-list, Alex Romosan, Jens Axboe, Andrey Borzenkov,
    Alan Stern, linux-usb-devel

This email lists some known regressions in 2.6.19-rc5 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either the submitter of
one of the bugs, the maintainer of an affected subsystem or driver, the
author of a patch that caused a breakage, or someone I consider possibly
involved in some other way with one or more of these issues.

Due to the huge number of recipients, please trim the Cc when answering.

Subject    : PCI MSI setting corrupted during resume
References : http://bugzilla.kernel.org/show_bug.cgi?id=7479
Submitter  : Stephen Hemminger <shemminger@osdl.org>
Status     : unknown

Subject    : SMP kernel can not generate ISA irq properly
References : http://lkml.org/lkml/2006/10/22/15
             http://lkml.org/lkml/2006/11/10/142
Submitter  : Komuro <komurojun-mbn@nifty.com>
Handled-By : "Eric W. Biederman" <ebiederm@xmission.com>
             Ingo Molnar <mingo@redhat.com>
Status     : problem is being debugged

Subject    : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter  : Ernst Herzberg <earny@net4u.de>
Handled-By : Len Brown <len.brown@intel.com>
Status     : problem is being debugged

Subject    : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
             http://lkml.org/lkml/2006/11/10/208
Submitter  : Andre Noll <maan@systemlinux.org>
Handled-By : Andi Kleen <ak@suse.de>
Status     : Andi is investigating

Subject    : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
Submitter  : Prakash Punnoor <prakash@punnoor.de>
Status     : unknown

Subject    : unable to rip cd
References : http://lkml.org/lkml/2006/10/13/100
             http://lkml.org/lkml/2006/11/8/42
Submitter  : Alex Romosan <romosan@sycorax.lbl.gov>
Handled-By : Jens Axboe <jens.axboe@oracle.com>
Status     : Jens is investigating

Subject    : can't disable OHCI wakeup via sysfs
References : http://lkml.org/lkml/2006/11/11/33
Submitter  : Andrey Borzenkov <arvidjaar@mail.ru>
Handled-By : Alan Stern <stern@rowland.harvard.edu>
Patch      : http://lkml.org/lkml/2006/11/13/261
Status     : patch available

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
@ 2006-11-15 10:35 ` Jens Axboe
2006-11-15 10:53 ` Adrian Bunk
2006-11-15 10:35 ` Eric Dumazet
` (3 subsequent siblings)
4 siblings, 1 reply; 91+ messages in thread
From: Jens Axboe @ 2006-11-15 10:35 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Linux Kernel

On Wed, Nov 15 2006, Adrian Bunk wrote:
> Subject : unable to rip cd
> References : http://lkml.org/lkml/2006/10/13/100
> http://lkml.org/lkml/2006/11/8/42
> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> Handled-By : Jens Axboe <jens.axboe@oracle.com>
> Status : Jens is investigating

It's fixed and the patch has been merged.

--
Jens Axboe

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3) 2006-11-15 10:35 ` Jens Axboe @ 2006-11-15 10:53 ` Adrian Bunk 0 siblings, 0 replies; 91+ messages in thread From: Adrian Bunk @ 2006-11-15 10:53 UTC (permalink / raw) To: Jens Axboe; +Cc: Linux Kernel, Alex Romosan On Wed, Nov 15, 2006 at 11:35:05AM +0100, Jens Axboe wrote: > On Wed, Nov 15 2006, Adrian Bunk wrote: > > Subject : unable to rip cd > > References : http://lkml.org/lkml/2006/10/13/100 > > http://lkml.org/lkml/2006/11/8/42 > > Submitter : Alex Romosan <romosan@sycorax.lbl.gov> > > Handled-By : Jens Axboe <jens.axboe@oracle.com> > > Status : Jens is investigating > > it's fixed and patched has been merged. Thanks for the information, I've removed it from my list. > Jens Axboe cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
2006-11-15 10:35 ` Jens Axboe
@ 2006-11-15 10:35 ` Eric Dumazet
2006-11-15 10:50 ` Andi Kleen
2006-11-22 10:28 ` Eric Dumazet
2006-11-15 11:06 ` Brice Goglin
` (2 subsequent siblings)
4 siblings, 2 replies; 91+ messages in thread
From: Eric Dumazet @ 2006-11-15 10:35 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List,
    Stephen Hemminger, gregkh, linux-pci, Komuro, Eric W. Biederman,
    Ingo Molnar, Ernst Herzberg, Len Brown, Andre Noll, Andi Kleen,
    discuss, Prakash Punnoor, phil.el, oprofile-list, Alex Romosan,
    Jens Axboe, Andrey Borzenkov, Alan Stern, linux-usb-devel

On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
> Subject : x86_64: oprofile doesn't work
> References : http://lkml.org/lkml/2006/10/27/3
> Submitter : Prakash Punnoor <prakash@punnoor.de>
> Status : unknown
>

I confirm I got this one too.

On a working kernel on an Opteron, we normally have 4 directories
in /dev/oprofile:

# ls -ld /dev/oprofile/?
drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/0
drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/1
drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/2
drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/3

With linux-2.6.19-rc5, the first one (0) is missing and we get 1,2,3.

Maybe the 'bug' is in the oprofile tools, which currently expect to find '0'.

Eric

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3) 2006-11-15 10:35 ` Eric Dumazet @ 2006-11-15 10:50 ` Andi Kleen 2006-11-15 16:40 ` William Cohen 2006-11-22 10:28 ` Eric Dumazet 1 sibling, 1 reply; 91+ messages in thread From: Andi Kleen @ 2006-11-15 10:50 UTC (permalink / raw) To: Eric Dumazet Cc: Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, Stephen Hemminger, gregkh, linux-pci, Komuro, Eric W. Biederman, Ingo Molnar, Ernst Herzberg, Len Brown, Andre Noll, discuss, Prakash Punnoor, phil.el, oprofile-list, Alex Romosan, Jens Axboe, Andrey Borzenkov, Alan Stern, linux-usb-devel > On a working kernel on an Opteron, we have normally 4 directories > in /dev/oprofile : > > # ls -ld /dev/oprofile/? > drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/0 > drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/1 > drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/2 > drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/3 > > With linux-2.6.19-rc5, the first one (0) is missing and we get 1,2,3 That's because 0 was never available. It is used by the NMI watchdog. The new kernel doesn't give it to oprofile anymore. > Maybe the 'bug' is in oprofile tools, that currently expect to find '0' Yes, it's likely a user space issue. -Andi ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3) 2006-11-15 10:50 ` Andi Kleen @ 2006-11-15 16:40 ` William Cohen 2006-11-15 16:48 ` [discuss] " Andi Kleen 0 siblings, 1 reply; 91+ messages in thread From: William Cohen @ 2006-11-15 16:40 UTC (permalink / raw) To: Andi Kleen Cc: Eric Dumazet, Andrew Morton, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, Linus Torvalds, discuss, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov Andi Kleen wrote: >>On a working kernel on an Opteron, we have normally 4 directories >>in /dev/oprofile : >> >># ls -ld /dev/oprofile/? >>drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/0 >>drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/1 >>drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/2 >>drwxr-xr-x 1 root root 0 15. Nov 12:38 /dev/oprofile/3 >> >>With linux-2.6.19-rc5, the first one (0) is missing and we get 1,2,3 > > > That's because 0 was never available. It is used by the NMI watchdog. > The new kernel doesn't give it to oprofile anymore. > > >>Maybe the 'bug' is in oprofile tools, that currently expect to find '0' > > > Yes, it's likely a user space issue. > > -Andi OProfile has a simplistic view of the performance monitoring hardware. The routines in libop/op_alloc_counter.c determine what set of performance registers is available from the processor in use. There is no check to see what registers are actually available in the /dev/oprofile directory. opcontrol executes ophelp to determine which specific counters to count which events. The function map_event_to_counter() in libop/op_alloc_counter.c does the actual selection. It seems what is needed is for map_event_to_counter() to check to see which counters are available and mark the others as unavailable. -Will ^ permalink raw reply [flat|nested] 91+ messages in thread
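Editorial aside on the fix proposed in the message above: a minimal sketch of the idea, not OProfile's actual code. The real map_event_to_counter() is C code in libop/op_alloc_counter.c; the Python names below (available_counters, map_events_to_counters) and the event names in the demo are hypothetical. The sketch assumes that each usable counter shows up as a numeric subdirectory of /dev/oprofile, as in the directory listings quoted earlier in the thread, and it only hands events to counters that actually appear there.

```python
import os


def available_counters(oprofile_dir="/dev/oprofile"):
    """Return the counter numbers exposed as numeric subdirectories."""
    try:
        entries = os.listdir(oprofile_dir)
    except OSError:
        return []  # oprofilefs not mounted or not readable
    return sorted(
        int(name)
        for name in entries
        if name.isdecimal() and os.path.isdir(os.path.join(oprofile_dir, name))
    )


def map_events_to_counters(events, allowed, available):
    """Give each event its own counter, using only counters in `available`.

    events:    list of event names
    allowed:   dict mapping event name -> set of counters the hardware permits
    available: counters the kernel actually exposes
    Returns a dict event -> counter, or None if no assignment exists.
    """
    usable = set(available)
    assignment = {}

    def place(index):
        if index == len(events):
            return True
        event = events[index]
        for counter in sorted(allowed[event] & usable):
            if counter in assignment.values():
                continue  # already taken by an earlier event
            assignment[event] = counter
            if place(index + 1):
                return True
            del assignment[event]  # backtrack and try the next counter
        return False

    return dict(assignment) if place(0) else None


if __name__ == "__main__":
    # Counter 0 is held by the NMI watchdog, so only 1, 2 and 3 are exposed.
    exposed = [1, 2, 3]
    allowed = {"EVENT_A": {0, 1}, "EVENT_B": {1, 2}}
    print(map_events_to_counters(["EVENT_A", "EVENT_B"], allowed, exposed))
```

With counter 0 unavailable, an event that the hardware permits only on counter 0 yields None rather than a silent grab of the watchdog's counter, which is the behaviour the thread is arguing about.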
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3) 2006-11-15 16:40 ` William Cohen @ 2006-11-15 16:48 ` Andi Kleen 2006-11-15 18:39 ` Andrew Morton 0 siblings, 1 reply; 91+ messages in thread From: Andi Kleen @ 2006-11-15 16:48 UTC (permalink / raw) To: discuss Cc: William Cohen, Eric Dumazet, Andrew Morton, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, Linus Torvalds, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov > OProfile has a simplistic view of the performance monitoring hardware. The > routines in libop/op_alloc_counter.c determine what set of performance registers > is available from the processor in use. There is no check to see what registers > are actually available in the /dev/oprofile directory. > > opcontrol executes ophelp to determine which specific counters to count which > events. The function map_event_to_counter() in libop/op_alloc_counter.c does the > actual selection. It seems what is needed is for map_event_to_counter() to check > to see which counters are available and mark the others as unavailable Thanks for the explanation. Can you please fix it and release a new version? Documentation/Changes could be adapted then. -Andi ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3) 2006-11-15 16:48 ` [discuss] " Andi Kleen @ 2006-11-15 18:39 ` Andrew Morton 2006-11-15 18:45 ` Andi Kleen 0 siblings, 1 reply; 91+ messages in thread From: Andrew Morton @ 2006-11-15 18:39 UTC (permalink / raw) To: Andi Kleen Cc: discuss, William Cohen, Eric Dumazet, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, Linus Torvalds, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov On Wed, 15 Nov 2006 17:48:05 +0100 Andi Kleen <ak@suse.de> wrote: > > > OProfile has a simplistic view of the performance monitoring hardware. The > > routines in libop/op_alloc_counter.c determine what set of performance registers > > is available from the processor in use. There is no check to see what registers > > are actually available in the /dev/oprofile directory. > > > > opcontrol executes ophelp to determine which specific counters to count which > > events. The function map_event_to_counter() in libop/op_alloc_counter.c does the > > actual selection. It seems what is needed is for map_event_to_counter() to check > > to see which counters are available and mark the others as unavailable > > Thanks for the explanation. Can you please fix it and release a new version? > Documentation/Changes could be adapted then. > Meanwhile we should restore the NMI counter to fix this bug. ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3) 2006-11-15 18:39 ` Andrew Morton @ 2006-11-15 18:45 ` Andi Kleen 2006-11-15 19:07 ` Linus Torvalds 0 siblings, 1 reply; 91+ messages in thread From: Andi Kleen @ 2006-11-15 18:45 UTC (permalink / raw) To: Andrew Morton Cc: discuss, William Cohen, Eric Dumazet, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, Linus Torvalds, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov On Wednesday 15 November 2006 19:39, Andrew Morton wrote: > On Wed, 15 Nov 2006 17:48:05 +0100 > Andi Kleen <ak@suse.de> wrote: > > > > > > OProfile has a simplistic view of the performance monitoring hardware. The > > > routines in libop/op_alloc_counter.c determine what set of performance registers > > > is available from the processor in use. There is no check to see what registers > > > are actually available in the /dev/oprofile directory. > > > > > > opcontrol executes ophelp to determine which specific counters to count which > > > events. The function map_event_to_counter() in libop/op_alloc_counter.c does the > > > actual selection. It seems what is needed is for map_event_to_counter() to check > > > to see which counters are available and mark the others as unavailable > > > > Thanks for the explanation. Can you please fix it and release a new version? > > Documentation/Changes could be adapted then. > > > > Meanwhile we should restore the NMI counter to fix this bug. No, it was always oprofile who was buggy here, silently taking the nmi watchdog away. -Andi ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3) 2006-11-15 18:45 ` Andi Kleen @ 2006-11-15 19:07 ` Linus Torvalds 2006-11-15 19:23 ` Andi Kleen 0 siblings, 1 reply; 91+ messages in thread From: Linus Torvalds @ 2006-11-15 19:07 UTC (permalink / raw) To: Andi Kleen Cc: Andrew Morton, discuss, William Cohen, Eric Dumazet, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov On Wed, 15 Nov 2006, Andi Kleen wrote: > > > > Meanwhile we should restore the NMI counter to fix this bug. > > No, it was always oprofile who was buggy here, silently taking > the nmi watchdog away. Andi, your "blame game" doesn't matter. The fact is, it used to work, and the kernel changed interfaces, so now it doesn't. In other words, a kernel interface to user land changed. THAT IS ALWAYS A BUG. We don't change UI. Yes, "oprofile" should be fixed to not depend on that, but the kernel shouldn't change the interfaces, and we should add back the zero entry. Linus ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3) 2006-11-15 19:07 ` Linus Torvalds @ 2006-11-15 19:23 ` Andi Kleen 2006-11-15 20:21 ` Andrew Morton 0 siblings, 1 reply; 91+ messages in thread From: Andi Kleen @ 2006-11-15 19:23 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, discuss, William Cohen, Eric Dumazet, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov > The fact is, it used to work, and the kernel changed interfaces, so now it > doesn't. No, it didn't work. oprofile may have done something, but it just silently killed the NMI watchdog in the process. That was never acceptable. Now we do proper accounting of NMI sources and also proper allocation of performance counters. > Yes, "oprofile" should be fixed to not depend on that, but the kernel > shouldn't change the interfaces, and we should add back the zero entry. That would break the nmi watchdog again. Anyways, there is a sysctl to disable the nmi watchdog if someone is desperate. But I think it is clearly oprofile who did wrong here and needs to be fixed. -Andi ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3) 2006-11-15 19:23 ` Andi Kleen @ 2006-11-15 20:21 ` Andrew Morton 2006-11-15 21:18 ` Eric W. Biederman 2006-11-16 3:21 ` Andi Kleen 0 siblings, 2 replies; 91+ messages in thread From: Andrew Morton @ 2006-11-15 20:21 UTC (permalink / raw) To: Andi Kleen Cc: Linus Torvalds, discuss, William Cohen, Eric Dumazet, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov On Wed, 15 Nov 2006 20:23:53 +0100 Andi Kleen <ak@suse.de> wrote: > > > The fact is, it used to work, and the kernel changed interfaces, so now it > > doesn't. > > No, it didn't work. oprofile may have done something, but it > just silently killed the NMI watchdog in the process. > That was never acceptable. But people could get profiles out. I know, I've seen them! > Now we do proper accounting of NMI sources and also proper allocation > of performance counters. > > > > Yes, "oprofile" should be fixed to not depend on that, but the kernel > > shouldn't change the interfaces, and we should add back the zero entry. > > That would break the nmi watchdog again. > > Anyways, there is a sysctl to disable the nmi watchdog if someone > is desperate. > > But I think it is clearly oprofile who did wrong here and needs > to be fixed. > Is it correct to say that oprofile-on-2.6.18 works, and that oprofile-on-2.6.19-rc5 does not? Or is there some sort of workaround for this, or does 2.6.19-rc5 only fail in some particular scenarios? If it's really true that oprofile is simply busted then that's a serious problem and we should find some way of unbusting it. If that means just adding a dummy "0" entry which always returns zero or something like that, then fine. But we can't just go and bust it. ^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 20:21 ` Andrew Morton
@ 2006-11-15 21:18 ` Eric W. Biederman
2006-11-15 21:31 ` Andrew Morton
2006-11-16 3:21 ` Andi Kleen
1 sibling, 1 reply; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-15 21:18 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Linus Torvalds, discuss, William Cohen, Komuro,
    Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe,
    linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern,
    linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown,
    Alex Romosan, gregkh, Linux Kernel Mailing List, Andrey Borzenkov

Andrew Morton <akpm@osdl.org> writes:

> Is it correct to say that oprofile-on-2.6.18 works, and that
> oprofile-on-2.6.19-rc5 does not?
>
> Or is there some sort of workaround for this, or does 2.6.19-rc5 only fail
> in some particular scenarios?
>
> If it's really true that oprofile is simply busted then that's a serious
> problem and we should find some way of unbusting it. If that means just
> adding a dummy "0" entry which always returns zero or something like that,
> then fine.
>
> But we can't just go and bust it.

The simple question: if we turn off the NMI watchdog on 2.6.19-rc5,
does oprofile work? I believe that is what Andi said.

The description I read was a resource conflict. The resources oprofile
just expects it can use are already in use, so we tell it no and
the user space oprofile doesn't cope.

Now I don't know whether the interface allows us to rename the interfaces
from 1 2 3 to 0 1 2. If we can, then that looks like something we can
fix. Otherwise, from the description, I tend to agree with Andi.

The user space application assumed it owned hardware that it did not.

Hmm. I bet if nothing else we could move the NMI watchdog from 0 to 3
and make things work that way...

Eric

^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 21:18 ` Eric W. Biederman
@ 2006-11-15 21:31 ` Andrew Morton
2006-11-16 10:55 ` Mikael Pettersson
0 siblings, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-15 21:31 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Andi Kleen, Linus Torvalds, discuss, William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Andrey Borzenkov

On Wed, 15 Nov 2006 14:18:24 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Andrew Morton <akpm@osdl.org> writes:
>
> > Is it correct to say that oprofile-on-2.6.18 works, and that
> > oprofile-on-2.6.19-rc5 does not?
> >
> > Or is there some sort of workaround for this, or does 2.6.19-rc5 only fail
> > in some particular scenarios?
> >
> > If it's really true that oprofile is simply busted then that's a serious
> > problem and we should find some way of unbusting it. If that means just
> > adding a dummy "0" entry which always returns zero or something like that,
> > then fine.
> >
> > But we can't just go and bust it.
>
> The simple question. If we turn off the NMI watchdog on 2.6.19-rc5
> does oprofile work? I believe that is what Andi said.
>
> The description I read was a resource conflict. The resources oprofile
> just expects it can use are already in use so we tell it no and
> the user space oprofile doesn't cope.

That would have been a bug in earlier kernels.

> Now I don't know if the interface allows us to rename the interfaces
> from 1 2 3 to 0 1 2. If we can then that looks like something we can
> fix. Otherwise from the description I tend to agree with Andi.
>
> The user space application assumed it owned hardware that it did not.
>
> Hmm. I bet if nothing else we could move the NMI watchdog from 0 to 3
> and make things work that way...

Surely the appropriate behaviour is to allow oprofile to steal the NMI and
to then put the NMI back to doing the watchdog thing after oprofile has
finished with it.

If that's not a feasible thing to do for 2.6.19 then some short-term hack
which makes oprofile work again is needed.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 21:31 ` Andrew Morton
@ 2006-11-16 10:55 ` Mikael Pettersson
2006-11-16 20:23 ` Andrew Morton
0 siblings, 1 reply; 91+ messages in thread
From: Mikael Pettersson @ 2006-11-16 10:55 UTC (permalink / raw)
To: Andrew Morton
Cc: Eric W. Biederman, Andi Kleen, Linus Torvalds, discuss, William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Andrey Borzenkov

Andrew Morton writes:
> Surely the appropriate behaviour is to allow oprofile to steal the NMI and
> to then put the NMI back to doing the watchdog thing after oprofile has
> finished with it.

Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
the in-kernel API allowing real performance counter drivers like
oprofile (and perfctr) to claim the HW from the NMI watchdog,
do their work, and then release it which resumed the watchdog.

Note that oprofile (and perfctr) didn't do anything behind the
NMI watchdog's back. They went via the API. Nothing dodgy going on.
^ permalink raw reply [flat|nested] 91+ messages in thread
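The claim/release behaviour described in the message above can be modelled as a tiny ownership state machine. The thread does not show the actual kernel interface, so this is only a user-space sketch under assumptions: `struct nmi_source`, `profiler_claim` and `profiler_release` are hypothetical names, and "resuming the watchdog" is reduced to flipping a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical owner tags; these are not kernel identifiers. */
enum ctr_owner { OWNER_WATCHDOG, OWNER_PROFILER };

struct nmi_source {
    enum ctr_owner owner;     /* who currently drives the NMI hardware */
    bool watchdog_enabled;    /* was a watchdog requested at all?      */
    bool watchdog_ticking;    /* is the watchdog counter armed now?    */
};

void nmi_source_init(struct nmi_source *s, bool watchdog_enabled)
{
    s->owner = OWNER_WATCHDOG;
    s->watchdog_enabled = watchdog_enabled;
    s->watchdog_ticking = watchdog_enabled;
}

/* A profiler asks for the hardware. Returns 0 on success, -1 if busy. */
int profiler_claim(struct nmi_source *s)
{
    if (s->owner == OWNER_PROFILER)
        return -1;                 /* already claimed by a profiler */
    s->watchdog_ticking = false;   /* pause the watchdog, do not forget it */
    s->owner = OWNER_PROFILER;
    return 0;
}

/* The profiler is done: hand the hardware back and resume the watchdog. */
void profiler_release(struct nmi_source *s)
{
    assert(s->owner == OWNER_PROFILER);
    s->owner = OWNER_WATCHDOG;
    s->watchdog_ticking = s->watchdog_enabled;
}
```

The point of contention later in the thread is the last line of `profiler_release`: whether the watchdog really came back after the profiler let go.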
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 10:55 ` Mikael Pettersson
@ 2006-11-16 20:23 ` Andrew Morton
2006-11-17 9:59 ` Mikael Pettersson
0 siblings, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-16 20:23 UTC (permalink / raw)
To: Mikael Pettersson
Cc: Eric W. Biederman, Andi Kleen, Linus Torvalds, discuss, William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Andrey Borzenkov

On Thu, 16 Nov 2006 11:55:46 +0100
Mikael Pettersson <mikpe@it.uu.se> wrote:

> Andrew Morton writes:
> > Surely the appropriate behaviour is to allow oprofile to steal the NMI and
> > to then put the NMI back to doing the watchdog thing after oprofile has
> > finished with it.
>
> Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
> the in-kernel API allowing real performance counter drivers like
> oprofile (and perfctr) to claim the HW from the NMI watchdog,
> do their work, and then release it which resumed the watchdog.

OK. But from Andi's comments it seems that the NMI watchdog was failing to
resume its operation.

> Note that oprofile (and perfctr) didn't do anything behind the
> NMI watchdog's back. They went via the API. Nothing dodgy going on.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 20:23 ` Andrew Morton
@ 2006-11-17 9:59 ` Mikael Pettersson
2006-11-17 10:13 ` Andrew Morton
2006-11-17 10:29 ` Andi Kleen
0 siblings, 2 replies; 91+ messages in thread
From: Mikael Pettersson @ 2006-11-17 9:59 UTC (permalink / raw)
To: Andrew Morton
Cc: Mikael Pettersson, Eric W. Biederman, Andi Kleen, Linus Torvalds, discuss, William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Andrey Borzenkov

Andrew Morton writes:
> On Thu, 16 Nov 2006 11:55:46 +0100
> Mikael Pettersson <mikpe@it.uu.se> wrote:
>
> > Andrew Morton writes:
> > > Surely the appropriate behaviour is to allow oprofile to steal the NMI and
> > > to then put the NMI back to doing the watchdog thing after oprofile has
> > > finished with it.
> >
> > Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
> > the in-kernel API allowing real performance counter drivers like
> > oprofile (and perfctr) to claim the HW from the NMI watchdog,
> > do their work, and then release it which resumed the watchdog.
>
> OK. But from Andi's comments it seems that the NMI watchdog was failing to
> resume its operation.

It certainly worked when I originally implemented it. If it didn't work
that way before 2.6.19-rc1 butchered it then that would have been a bug
that should have been fixed.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-17 9:59 ` Mikael Pettersson
@ 2006-11-17 10:13 ` Andrew Morton
2006-11-19 3:05 ` Bill Davidsen
2006-11-17 10:29 ` Andi Kleen
1 sibling, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-17 10:13 UTC (permalink / raw)
To: Mikael Pettersson
Cc: Eric W. Biederman, Andi Kleen, Linus Torvalds, discuss, William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Andrey Borzenkov

On Fri, 17 Nov 2006 10:59:07 +0100
Mikael Pettersson <mikpe@it.uu.se> wrote:

> Andrew Morton writes:
> > On Thu, 16 Nov 2006 11:55:46 +0100
> > Mikael Pettersson <mikpe@it.uu.se> wrote:
> >
> > > Andrew Morton writes:
> > > > Surely the appropriate behaviour is to allow oprofile to steal the NMI and
> > > > to then put the NMI back to doing the watchdog thing after oprofile has
> > > > finished with it.
> > >
> > > Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
> > > the in-kernel API allowing real performance counter drivers like
> > > oprofile (and perfctr) to claim the HW from the NMI watchdog,
> > > do their work, and then release it which resumed the watchdog.
> >
> > OK. But from Andi's comments it seems that the NMI watchdog was failing to
> > resume its operation.
>
> It certainly worked when I originally implemented it. If it didn't work
> that way before 2.6.19-rc1 butchered it then that would have been a bug
> that should have been fixed.

Oh. OK.

Meanwhile, 2.6.19-rc6 remains unfixed.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-17 10:13 ` Andrew Morton
@ 2006-11-19 3:05 ` Bill Davidsen
0 siblings, 0 replies; 91+ messages in thread
From: Bill Davidsen @ 2006-11-19 3:05 UTC (permalink / raw)
To: Andrew Morton
Cc: Eric W. Biederman, Andi Kleen, Linus Torvalds, discuss, William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Andrey Borzenkov

Andrew Morton wrote:
> On Fri, 17 Nov 2006 10:59:07 +0100
> Mikael Pettersson <mikpe@it.uu.se> wrote:
>
>> Andrew Morton writes:
>> > On Thu, 16 Nov 2006 11:55:46 +0100
>> > Mikael Pettersson <mikpe@it.uu.se> wrote:
>> >
>> > > Andrew Morton writes:
>> > > > Surely the appropriate behaviour is to allow oprofile to steal the NMI and
>> > > > to then put the NMI back to doing the watchdog thing after oprofile has
>> > > > finished with it.
>> > >
>> > > Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
>> > > the in-kernel API allowing real performance counter drivers like
>> > > oprofile (and perfctr) to claim the HW from the NMI watchdog,
>> > > do their work, and then release it which resumed the watchdog.
>> >
>> > OK. But from Andi's comments it seems that the NMI watchdog was failing to
>> > resume its operation.
>>
>> It certainly worked when I originally implemented it. If it didn't work
>> that way before 2.6.19-rc1 butchered it then that would have been a bug
>> that should have been fixed.
>
> Oh. OK.
>
> Meanwhile, 2.6.19-rc6 remains unfixed.
>
Has anyone verified that nmi watchdog works at all in 2.6.19-rc6? I
haven't built a kernel since rc2, other things have been taking my time.

--
Bill Davidsen <davidsen@tmr.com>
Obscure bug of 2004: BASH BUFFER OVERFLOW - if bash is being run by a
normal user and is setuid root, with the "vi" line edit mode selected,
and the character set is "big5," an off-by-one error occurs during
wildcard (glob) expansion.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-17 9:59 ` Mikael Pettersson
2006-11-17 10:13 ` Andrew Morton
@ 2006-11-17 10:29 ` Andi Kleen
1 sibling, 0 replies; 91+ messages in thread
From: Andi Kleen @ 2006-11-17 10:29 UTC (permalink / raw)
To: Mikael Pettersson
Cc: Andrew Morton, Eric W. Biederman, Linus Torvalds, discuss, William Cohen, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Andrey Borzenkov

On Friday 17 November 2006 10:59, Mikael Pettersson wrote:

> It certainly worked when I originally implemented it.

I don't think so. NMI watchdog never recovered no matter if oprofile
used the counter or not.

-Andi
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 20:21 ` Andrew Morton
2006-11-15 21:18 ` Eric W. Biederman
@ 2006-11-16 3:21 ` Andi Kleen
2006-11-16 5:05 ` Andrew Morton
1 sibling, 1 reply; 91+ messages in thread
From: Andi Kleen @ 2006-11-16 3:21 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Linus Torvalds, discuss, William Cohen, Eric Dumazet, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov

On Wed, Nov 15, 2006 at 12:21:18PM -0800, Andrew Morton wrote:
> Andi Kleen <ak@suse.de> wrote:
>
> >
> > > The fact is, it used to work, and the kernel changed interfaces, so now it
> > > doesn't.
> >
> > No, it didn't work. oprofile may have done something, but it
> > just silently killed the NMI watchdog in the process.
> > That was never acceptable.
>
> But people could get profiles out. I know, I've seen them!

Just the nmi watchdog was gone then.

>
> > Now we do proper accounting of NMI sources and also proper allocation
> > of performance counters.
> >
> >
> > > Yes, "oprofile" should be fixed to not depend on that, but the kernel
> > > shouldn't change the interfaces, and we should add back the zero entry.
> >
> > That would break the nmi watchdog again.
> >
> > Anyways, there is a sysctl to disable the nmi watchdog if someone
> > is desperate.
> >
> > But I think it is clearly oprofile who did wrong here and needs
> > to be fixed.
> >
>
> Is it correct to say that oprofile-on-2.6.18 works, and that
> oprofile-on-2.6.19-rc5 does not?
>
> Or is there some sort of workaround for this, or does 2.6.19-rc5 only fail

echo 0 > /proc/sys/kernel/nmi_watchdog before the oprofile module is loaded.
With builtin oprofile probably nmi_watchdog=0

> in some particular scenarios?

On x86-64 and on newer i386 machines (based on DMI year)

>
> If it's really true that oprofile is simply busted then that's a serious
> problem and we should find some way of unbusting it. If that means just
> adding a dummy "0" entry which always returns zero or something like that,
> then fine.

That could be probably done.

> But we can't just go and bust it.

It just did something unbelievable broken before.
I would say it busted
itself.

-Andi
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 3:21 ` Andi Kleen
@ 2006-11-16 5:05 ` Andrew Morton
2006-11-16 7:04 ` Andi Kleen
0 siblings, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-16 5:05 UTC (permalink / raw)
To: Andi Kleen
Cc: Linus Torvalds, discuss, William Cohen, Eric Dumazet, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov

On Thu, 16 Nov 2006 04:21:09 +0100
Andi Kleen <ak@suse.de> wrote:

> >
> > If it's really true that oprofile is simply busted then that's a serious
> > problem and we should find some way of unbusting it. If that means just
> > adding a dummy "0" entry which always returns zero or something like that,
> > then fine.
>
> That could be probably done.

I'm told that this is exactly what it was doing before it got changed.

> > But we can't just go and bust it.
>
> It just did something unbelievable broken before.

What did it do?

> I would say it busted
> itself.

It gave profiles, which was fairly handy.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 5:05 ` Andrew Morton
@ 2006-11-16 7:04 ` Andi Kleen
2006-11-16 15:34 ` William Cohen
0 siblings, 1 reply; 91+ messages in thread
From: Andi Kleen @ 2006-11-16 7:04 UTC (permalink / raw)
To: Andrew Morton
Cc: Linus Torvalds, discuss, William Cohen, Eric Dumazet, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov

On Thursday 16 November 2006 06:05, Andrew Morton wrote:
> On Thu, 16 Nov 2006 04:21:09 +0100
> Andi Kleen <ak@suse.de> wrote:
>
> > >
> > > If it's really true that oprofile is simply busted then that's a serious
> > > problem and we should find some way of unbusting it. If that means just
> > > adding a dummy "0" entry which always returns zero or something like that,
> > > then fine.
> >
> > That could be probably done.
>
> I'm told that this is exactly what it was doing before it got changed.

Hmm, ok perhaps that can be arranged again.

The trouble is that I want to use this performance counter for
other purposes too, so we would run into trouble again
if oprofile keeps stealing it.

> > > But we can't just go and bust it.
> >
> > It just did something unbelievable broken before.
>
> What did it do?

Silently kill the nmi watchdog.

>
> > I would say it busted
> > itself.
>
> It gave profiles, which was fairly handy.

I'm sure it can be fixed there. Ok ok I keep sounding like a sysfs
maintainer now @)

-Andi
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 7:04 ` Andi Kleen
@ 2006-11-16 15:34 ` William Cohen
2006-11-16 15:47 ` Andi Kleen
2006-11-16 21:32 ` Stephane Eranian
0 siblings, 2 replies; 91+ messages in thread
From: William Cohen @ 2006-11-16 15:34 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Linus Torvalds, discuss, Eric Dumazet, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov

Andi Kleen wrote:
> On Thursday 16 November 2006 06:05, Andrew Morton wrote:
>
>>On Thu, 16 Nov 2006 04:21:09 +0100
>>Andi Kleen <ak@suse.de> wrote:
>>
>>
>>>>If it's really true that oprofile is simply busted then that's a serious
>>>>problem and we should find some way of unbusting it. If that means just
>>>>adding a dummy "0" entry which always returns zero or something like that,
>>>>then fine.
>>>
>>>That could be probably done.
>>
>>I'm told that this is exactly what it was doing before it got changed.
>
>
> Hmm, ok perhaps that can be arranged again.
>
> The trouble is that I want to use this performance counter for
> other purposes too, so we would run into trouble again
> if oprofile keeps stealing it.

What other purposes do you see the performance counters useful for? To
collect information on process characteristics so they can be scheduled
more efficiently?

Is this going to require sharing the nmi interrupt and knowing which perfcounter
register triggered the interrupt to get the correct action? Currently the
oprofile interrupt handler assumes any performance monitoring counter it sees
overflowing is something it should count.

-Will
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 15:34 ` William Cohen
@ 2006-11-16 15:47 ` Andi Kleen
2006-11-16 21:32 ` Stephane Eranian
1 sibling, 0 replies; 91+ messages in thread
From: Andi Kleen @ 2006-11-16 15:47 UTC (permalink / raw)
To: William Cohen
Cc: Andrew Morton, Linus Torvalds, discuss, Eric Dumazet, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, linux-usb-devel, phil.el, Adrian Bunk, Ingo Molnar, Alan Stern, linux-pci, Stephen Hemminger, Prakash Punnoor, Len Brown, Alex Romosan, gregkh, Linux Kernel Mailing List, Eric W. Biederman, Andrey Borzenkov

> What other purposes do you see the performance counters useful for?

Export one to user space as a cycle counter for benchmarking. RDTSC doesn't do
this job anymore.

> To collect information on process characteristics so they can be scheduled more efficiently?

That might happen at some point in the future, but i would expect
us to wait for CPUs with more performance counters first.

> Is this going to require sharing the nmi interrupt and knowing which perfcounter
> register triggered the interrupt to get the correct action? Currently the
> oprofile interrupt handler assumes any performance monitoring counter it sees
> overflowing is something it should count.

Yes. That needs to be fixed.

-Andi
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: [discuss] Re: 2.6.19-rc5: known regressions (v3)
2006-11-16 15:34 ` William Cohen
2006-11-16 15:47 ` Andi Kleen
@ 2006-11-16 21:32 ` Stephane Eranian
1 sibling, 0 replies; 91+ messages in thread
From: Stephane Eranian @ 2006-11-16 21:32 UTC (permalink / raw)
To: William Cohen
Cc: Andi Kleen, Andrew Morton, Komuro, Ernst Herzberg, Andre Noll, oprofile-list, Jens Axboe, Adrian Bunk, linux-usb-devel, phil.el, Eric Dumazet, Ingo Molnar, Alan Stern, linux-pci, Prakash Punnoor, Eric W. Biederman, Len Brown, Alex Romosan, Linus Torvalds, discuss, gregkh, Linux Kernel Mailing List, Stephen Hemminger, Andrey Borzenkov

Hello,

On Thu, Nov 16, 2006 at 10:34:56AM -0500, William Cohen wrote:
>
> Is this going to require sharing the nmi interrupt and knowing which perfcounter
> register triggered the interrupt to get the correct action? Currently the
> oprofile interrupt handler assumes any performance monitoring counter it sees
> overflowing is something it should count.
>
Yes, you need to share the NMI interrupt. In my next perfmon patch you will
see that this can be made to work. You just need to add one check in the NMI
handler callback: is it for me or else try perfmon?

Perfmon can auto-detect if NMI is active and give up the right counter (there
is an API to check what is reserved). The interface propagates the list of
available counters to apps which then pass the information onto libpfm which
tries to use the remaining counters.

--
-Stephane
^ permalink raw reply [flat|nested] 91+ messages in thread
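The "is it for me or else try perfmon" check from the message above can be sketched as a dispatcher that looks at which counter overflowed before deciding who handles the NMI. This is a user-space model under assumptions, not the perfmon or oprofile handler: `struct pmu_state`, `dispatch_nmi` and the `HANDLED_*` flags are hypothetical names, and overflow status is modelled as a plain bitmask rather than read from hardware registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result flags for illustration only. */
#define HANDLED_WATCHDOG 0x1u
#define HANDLED_PROFILER 0x2u

struct pmu_state {
    uint32_t overflowed;        /* bit i set: counter i has overflowed  */
    uint32_t watchdog_mask;     /* counters reserved for the watchdog   */
    uint32_t profiler_mask;     /* counters handed to the profiler      */
    unsigned watchdog_ticks;    /* watchdog NMIs seen so far            */
    unsigned profiler_samples;  /* profiler samples recorded so far     */
};

/*
 * Each party only consumes overflows on counters it owns, instead of
 * assuming every overflow belongs to it. Returns a bitmask of HANDLED_*
 * flags; 0 means the NMI came from somewhere else.
 */
unsigned dispatch_nmi(struct pmu_state *p)
{
    unsigned handled = 0;
    uint32_t hits;
    int i;

    /* A counter must never be owned by both parties at once. */
    assert((p->watchdog_mask & p->profiler_mask) == 0);

    hits = p->overflowed & p->watchdog_mask;
    if (hits) {
        p->watchdog_ticks++;
        p->overflowed &= ~hits;
        handled |= HANDLED_WATCHDOG;
    }

    hits = p->overflowed & p->profiler_mask;
    for (i = 0; i < 32; i++) {
        if (hits & (UINT32_C(1) << i))
            p->profiler_samples++;   /* one sample per overflowed counter */
    }
    if (hits) {
        p->overflowed &= ~hits;
        handled |= HANDLED_PROFILER;
    }

    return handled;
}
```

An overflow on a counter that neither party owns is left untouched and reported as unhandled, which is the case the older "count everything that overflows" behaviour could not distinguish.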
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:35 ` Eric Dumazet
2006-11-15 10:50 ` Andi Kleen
@ 2006-11-22 10:28 ` Eric Dumazet
2006-11-22 10:36 ` Andi Kleen
` (2 more replies)
1 sibling, 3 replies; 91+ messages in thread
From: Eric Dumazet @ 2006-11-22 10:28 UTC (permalink / raw)
To: Adrian Bunk
Cc: Andrew Morton, Linux Kernel Mailing List, Stephen Hemminger, gregkh, Ingo Molnar, Len Brown, Andi Kleen, phil.el, oprofile-list

On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
> On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
> > Subject : x86_64: oprofile doesn't work
> > References : http://lkml.org/lkml/2006/10/27/3
> > Submitter : Prakash Punnoor <prakash@punnoor.de>
> > Status : unknown
>

I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.

# opcontrol --setup --event=RESOURCE_STALLS:1000 --vmlinux=$VMFILE
# opcontrol --start
/usr/bin/opcontrol: line 911: /dev/oprofile/0/enabled: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/event: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/count: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/kernel: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/user: No such file or directory
/usr/bin/opcontrol: line 911: /dev/oprofile/0/unit_mask: No such file or directory
Using 2.6+ OProfile kernel interface.
Reading module info.
Using log file /var/lib/oprofile/oprofiled.log
Daemon started.
Profiler running.

# ls -l /dev/oprofile/
total 0
drwxr-xr-x 1 root root 0 Nov 22 11:18 1
-rw-r--r-- 1 root root 0 Nov 22 11:18 backtrace_depth
-rw-r--r-- 1 root root 0 Nov 22 11:18 buffer
-rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_size
-rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_watershed
-rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_buffer_size
-rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_type
-rw-rw-rw- 1 root root 0 Nov 22 11:18 dump
-rw-r--r-- 1 root root 0 Nov 22 11:18 enable
-rw-r--r-- 1 root root 0 Nov 22 11:18 pointer_size
drwxr-xr-x 1 root root 0 Nov 22 11:18 stats
# dmesg | grep oprofile
oprofile: using NMI interrupt.
# opcontrol --version
opcontrol: oprofile 0.9.2 compiled on Nov 22 2006 11:24:09

Eric
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 10:28 ` Eric Dumazet
@ 2006-11-22 10:36 ` Andi Kleen
2006-11-22 18:42 ` Andrew Morton
2006-11-22 17:59 ` William Cohen
2006-11-22 18:05 ` William Cohen
2 siblings, 1 reply; 91+ messages in thread
From: Andi Kleen @ 2006-11-22 10:36 UTC (permalink / raw)
To: Eric Dumazet
Cc: Adrian Bunk, Andrew Morton, Linux Kernel Mailing List, Stephen Hemminger, gregkh, Ingo Molnar, Len Brown, phil.el, oprofile-list

On Wednesday 22 November 2006 11:28, Eric Dumazet wrote:
> On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
> > On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
> > > Subject : x86_64: oprofile doesn't work
> > > References : http://lkml.org/lkml/2006/10/27/3
> > > Submitter : Prakash Punnoor <prakash@punnoor.de>
> > > Status : unknown
> >
>
> I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.

oprofile is still broken because it cannot deal with the lack of perfctr 0.

You can disable the nmi watchdog as a workaround.

-Andi
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 10:36 ` Andi Kleen
@ 2006-11-22 18:42 ` Andrew Morton
2006-12-16 11:20 ` Ray Lee
0 siblings, 1 reply; 91+ messages in thread
From: Andrew Morton @ 2006-11-22 18:42 UTC (permalink / raw)
To: Andi Kleen
Cc: Eric Dumazet, Adrian Bunk, Linux Kernel Mailing List, Stephen Hemminger, gregkh, Ingo Molnar, Len Brown, phil.el, oprofile-list

On Wed, 22 Nov 2006 11:36:14 +0100
Andi Kleen <ak@suse.de> wrote:

> On Wednesday 22 November 2006 11:28, Eric Dumazet wrote:
> > On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
> > > On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
> > > > Subject : x86_64: oprofile doesn't work
> > > > References : http://lkml.org/lkml/2006/10/27/3
> > > > Submitter : Prakash Punnoor <prakash@punnoor.de>
> > > > Status : unknown
> > >
> >
> > I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.
>
> oprofile is still broken because it cannot deal with the lack of perfctr 0.

The kernel is still broken because we changed the interface.

> You can disable the nmi watchdog as a workaround.

I don't understand why you think this is acceptable.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 18:42 ` Andrew Morton
@ 2006-12-16 11:20 ` Ray Lee
0 siblings, 0 replies; 91+ messages in thread
From: Ray Lee @ 2006-12-16 11:20 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Eric Dumazet, Adrian Bunk, Linux Kernel Mailing List, Stephen Hemminger, gregkh, Ingo Molnar, Len Brown, phil.el, oprofile-list

On 11/22/06, Andrew Morton <akpm@osdl.org> wrote:
> On Wed, 22 Nov 2006 11:36:14 +0100
> Andi Kleen <ak@suse.de> wrote:
>
> > On Wednesday 22 November 2006 11:28, Eric Dumazet wrote:
> > > On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
> > > > On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
> > > > > Subject : x86_64: oprofile doesn't work
> > > > > References : http://lkml.org/lkml/2006/10/27/3
> > > > > Submitter : Prakash Punnoor <prakash@punnoor.de>
> > > > > Status : unknown
> > > >
> > >
> > > I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.
> >
> > oprofile is still broken because it cannot deal with the lack of perfctr 0.
>
> The kernel is still broken because we changed the interface.

I just got bit by this on 2.6.20-latest (well, of two days ago anyway)
while trying to debug another transient 'kacpid sucks all available cpu
time'. But that's okay, I'm sure it will happen again in a week or two.

In the meantime, who won this pis^H^H^H discussion?

Mikael Pettersson wrote:
> Andrew Morton writes:
> > Surely the appropriate behaviour is to allow oprofile to steal the NMI and
> > to then put the NMI back to doing the watchdog thing after oprofile has
> > finished with it.
>
> Which is _exactly_ what pre-2.6.19-rc1 kernels did. I implemented
> the in-kernel API allowing real performance counter drivers like
> oprofile (and perfctr) to claim the HW from the NMI watchdog,
> do their work, and then release it which resumed the watchdog.
>
> Note that oprofile (and perfctr) didn't do anything behind the
> NMI watchdog's back. They went via the API. Nothing dodgy going on.

Well, that seems clear.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 10:28 ` Eric Dumazet
2006-11-22 10:36 ` Andi Kleen
@ 2006-11-22 17:59 ` William Cohen
2006-11-22 18:05 ` William Cohen
2 siblings, 0 replies; 91+ messages in thread
From: William Cohen @ 2006-11-22 17:59 UTC (permalink / raw)
To: Eric Dumazet
Cc: Adrian Bunk, Andrew Morton, Len Brown, phil.el, gregkh, Linux Kernel Mailing List, Andi Kleen, Ingo Molnar, oprofile-list, Stephen Hemminger

Eric Dumazet wrote:
> On Wednesday 15 November 2006 11:35, Eric Dumazet wrote:
>
>>On Wednesday 15 November 2006 11:21, Adrian Bunk wrote:
>>
>>>Subject : x86_64: oprofile doesn't work
>>>References : http://lkml.org/lkml/2006/10/27/3
>>>Submitter : Prakash Punnoor <prakash@punnoor.de>
>>>Status : unknown
>>
>
> I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set.
>
> # opcontrol --setup --event=RESOURCE_STALLS:1000 --vmlinux=$VMFILE
> # opcontrol --start
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/enabled: No such file or
> directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/event: No such file or directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/count: No such file or directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/kernel: No such file or
> directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/user: No such file or directory
> /usr/bin/opcontrol: line 911: /dev/oprofile/0/unit_mask: No such file or
> directory
> Using 2.6+ OProfile kernel interface.
> Reading module info.
> Using log file /var/lib/oprofile/oprofiled.log
> Daemon started.
> Profiler running.
>
> # ls -l /dev/oprofile/
> total 0
> drwxr-xr-x 1 root root 0 Nov 22 11:18 1
> -rw-r--r-- 1 root root 0 Nov 22 11:18 backtrace_depth
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_size
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_watershed
> -rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_buffer_size
> -rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_type
> -rw-rw-rw- 1 root root 0 Nov 22 11:18 dump
> -rw-r--r-- 1 root root 0 Nov 22 11:18 enable
> -rw-r--r-- 1 root root 0 Nov 22 11:18 pointer_size
> drwxr-xr-x 1 root root 0 Nov 22 11:18 stats
> # dmesg | grep oprofile
> oprofile: using NMI interrupt.
> # opcontrol --version
> opcontrol: oprofile 0.9.2 compiled on Nov 22 2006 11:24:09
>
> Eric

Could you try the patch that I posted on the oprofile mailing list last week,
November 17 2006, for op_allocate.c and see if that resolves the problem you
are having?

http://sourceforge.net/mailarchive/message.php?msg_id=37316102

-Will
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3) 2006-11-22 10:28 ` Eric Dumazet 2006-11-22 10:36 ` Andi Kleen 2006-11-22 17:59 ` William Cohen @ 2006-11-22 18:05 ` William Cohen 2006-11-22 18:26 ` Eric Dumazet 2 siblings, 1 reply; 91+ messages in thread From: William Cohen @ 2006-11-22 18:05 UTC (permalink / raw) To: Eric Dumazet Cc: Adrian Bunk, Andrew Morton, Len Brown, phil.el, gregkh, Linux Kernel Mailing List, Andi Kleen, Ingo Molnar, oprofile-list, Stephen Hemminger [-- Attachment #1: Type: text/plain, Size: 2044 bytes --] Eric Dumazet wrote: > On Wednesday 15 November 2006 11:35, Eric Dumazet wrote: > >>On Wednesday 15 November 2006 11:21, Adrian Bunk wrote: >> >>>Subject : x86_64: oprofile doesn't work >>>References : http://lkml.org/lkml/2006/10/27/3 >>>Submitter : Prakash Punnoor <prakash@punnoor.de> >>>Status : unknown >> > > I hit the same problem on i386 architecture too, if CONFIG_ACPI is not set. > > # opcontrol --setup --event=RESOURCE_STALLS:1000 --vmlinux=$VMFILE > # opcontrol --start > /usr/bin/opcontrol: line 911: /dev/oprofile/0/enabled: No such file or > directory > /usr/bin/opcontrol: line 911: /dev/oprofile/0/event: No such file or directory > /usr/bin/opcontrol: line 911: /dev/oprofile/0/count: No such file or directory > /usr/bin/opcontrol: line 911: /dev/oprofile/0/kernel: No such file or > directory > /usr/bin/opcontrol: line 911: /dev/oprofile/0/user: No such file or directory > /usr/bin/opcontrol: line 911: /dev/oprofile/0/unit_mask: No such file or > directory > Using 2.6+ OProfile kernel interface. > Reading module info. > Using log file /var/lib/oprofile/oprofiled.log > Daemon started. > Profiler running. 
>
> # ls -l /dev/oprofile/
> total 0
> drwxr-xr-x 1 root root 0 Nov 22 11:18 1
> -rw-r--r-- 1 root root 0 Nov 22 11:18 backtrace_depth
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_size
> -rw-r--r-- 1 root root 0 Nov 22 11:18 buffer_watershed
> -rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_buffer_size
> -rw-r--r-- 1 root root 0 Nov 22 11:18 cpu_type
> -rw-rw-rw- 1 root root 0 Nov 22 11:18 dump
> -rw-r--r-- 1 root root 0 Nov 22 11:18 enable
> -rw-r--r-- 1 root root 0 Nov 22 11:18 pointer_size
> drwxr-xr-x 1 root root 0 Nov 22 11:18 stats
> # dmesg | grep oprofile
> oprofile: using NMI interrupt.
> # opcontrol --version
> opcontrol: oprofile 0.9.2 compiled on Nov 22 2006 11:24:09
>
> Eric
You will also need another patch checked into the oprofile cvs last week mentioned:
http://sourceforge.net/mailarchive/message.php?msg_id=35422937
-Will
[-- Attachment #2: opalloc.diff --]
[-- Type: text/x-patch, Size: 538 bytes --]
Index: libop/op_alloc_counter.c
===================================================================
RCS file: /cvsroot/oprofile/oprofile/libop/op_alloc_counter.c,v
retrieving revision 1.6
diff -u -r1.6 op_alloc_counter.c
--- libop/op_alloc_counter.c	1 Oct 2003 21:53:46 -0000	1.6
+++ libop/op_alloc_counter.c	17 Nov 2006 17:03:04 -0000
@@ -130,7 +130,7 @@
 		counter_arc const * arc = list_entry(pos, counter_arc, next);

 		if (allocated_mask & (1 << arc->counter))
-			return 0;
+			continue;

 		counter_map[depth] = arc->counter;
^ permalink raw reply [flat|nested] 91+ messages in thread
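The one-line change in opalloc.diff swaps `return 0` for `continue` inside a loop over candidate counters. What follows is a minimal sketch of that kind of backtracking search, not the oprofile source: the function name `assign_counters`, the per-event bitmask layout in `allowed`, and every parameter name are assumptions made for illustration. It shows why skipping a busy counter and trying the next candidate can succeed where bailing out on the first busy counter would report failure.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: allowed[i] is a bitmask of the hardware counters
 * that event i may be programmed on. On success, counter_map[i] holds
 * the counter chosen for event i and the function returns 1.
 */
int assign_counters(const unsigned *allowed, size_t nr_events,
                    size_t nr_counters, size_t depth,
                    unsigned allocated_mask, int *counter_map)
{
    assert(nr_counters <= 32);      /* masks are plain unsigned ints */

    if (depth == nr_events)
        return 1;                   /* every event has a counter */

    for (size_t c = 0; c < nr_counters; c++) {
        if (!(allowed[depth] & (1u << c)))
            continue;               /* this event cannot use counter c */

        /*
         * Counter already taken: move on to the next candidate.
         * Returning 0 here instead would abandon the remaining
         * candidates for this event.
         */
        if (allocated_mask & (1u << c))
            continue;

        counter_map[depth] = (int)c;
        if (assign_counters(allowed, nr_events, nr_counters, depth + 1,
                            allocated_mask | (1u << c), counter_map))
            return 1;
    }

    return 0;                       /* nothing fits here: backtrack */
}
```

With `allowed = { 0x1, 0x3 }` the first event can only use counter 0, so the second event must skip the busy counter 0 and take counter 1. An early `return 0` at the busy check would make that case fail even though a valid assignment exists.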
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-22 18:05 ` William Cohen
@ 2006-11-22 18:26 ` Eric Dumazet
0 siblings, 0 replies; 91+ messages in thread
From: Eric Dumazet @ 2006-11-22 18:26 UTC (permalink / raw)
To: William Cohen
Cc: Adrian Bunk, Andrew Morton, Len Brown, phil.el, gregkh, Linux Kernel Mailing List, Andi Kleen, Ingo Molnar, oprofile-list, Stephen Hemminger
On Wednesday 22 November 2006 19:05, William Cohen wrote:
> You will also need another patch checked into the oprofile cvs last week
> mentioned:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=35422937
>
> -Will
Thank you William.
I confirm that CVS oprofile version + patches you gave here works with linux-2.6.16-rc6 on i386, regardless of disabling nmi_watchdog (adding or not nmi_watchdog=0 in boot params)
Eric
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
2006-11-15 10:35 ` Jens Axboe
2006-11-15 10:35 ` Eric Dumazet
@ 2006-11-15 11:06 ` Brice Goglin
2006-11-15 22:32 ` Adrian Bunk
2006-11-15 12:07 ` Alan
2006-11-15 15:52 ` Stephen Hemminger
4 siblings, 1 reply; 91+ messages in thread
From: Brice Goglin @ 2006-11-15 11:06 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List
Adrian Bunk wrote:
> Subject : unable to rip cd
> References : http://lkml.org/lkml/2006/10/13/100
> http://lkml.org/lkml/2006/11/8/42
> Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> Handled-By : Jens Axboe <jens.axboe@oracle.com>
> Status : Jens is investigating
I think this one is already fixed.
Brice
commit 616e8a091a035c0bd9b871695f4af191df123caa
author Jens Axboe <jens.axboe@oracle.com> 1163437499 +0100
committer Linus Torvalds <torvalds@g5.osdl.org> 1163440020 -0800
[PATCH] Fix bad data direction in SG_IO
Contrary to what the name misleads you to believe, SG_DXFER_TO_FROM_DEV is really just a normal read seen from the device side.
This patch fixes http://lkml.org/lkml/2006/10/13/100
^ permalink raw reply [flat|nested] 91+ messages in thread
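The commit message above says that SG_DXFER_TO_FROM_DEV should be handled as a read. The sketch below illustrates that classification only and is not the kernel's SG_IO code. The `EX_` names and the helper `ex_request_is_write` are hypothetical; the numeric values are assumed to mirror the SG_DXFER_* constants in `<scsi/sg.h>` and are redefined locally so the example builds without that header.

```c
#include <assert.h>

/* Illustrative stand-ins for SG_DXFER_* (values assumed, see lead-in). */
enum ex_dxfer {
    EX_DXFER_NONE        = -1,  /* no data transfer */
    EX_DXFER_TO_DEV      = -2,  /* host writes data to the device */
    EX_DXFER_FROM_DEV    = -3,  /* host reads data from the device */
    EX_DXFER_TO_FROM_DEV = -4   /* per the commit message: a plain read */
};

/*
 * Hypothetical helper: returns 1 if the request is a write,
 * 0 if it is a read or carries no data, -1 for an unknown direction.
 */
int ex_request_is_write(int dxfer_direction)
{
    switch (dxfer_direction) {
    case EX_DXFER_TO_DEV:
        return 1;
    case EX_DXFER_TO_FROM_DEV:  /* grouped with reads, not with writes */
    case EX_DXFER_FROM_DEV:
    case EX_DXFER_NONE:
        return 0;
    default:
        return -1;
    }
}
```

Grouping the bidirectional-sounding value with the write case is the kind of mistake that would plausibly break read-heavy users such as CD ripping, which is the regression this commit is reported to fix.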
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 11:06 ` Brice Goglin
@ 2006-11-15 22:32 ` Adrian Bunk
0 siblings, 0 replies; 91+ messages in thread
From: Adrian Bunk @ 2006-11-15 22:32 UTC (permalink / raw)
To: Brice Goglin; +Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List
On Wed, Nov 15, 2006 at 12:06:22PM +0100, Brice Goglin wrote:
> Adrian Bunk wrote:
> > Subject : unable to rip cd
> > References : http://lkml.org/lkml/2006/10/13/100
> > http://lkml.org/lkml/2006/11/8/42
> > Submitter : Alex Romosan <romosan@sycorax.lbl.gov>
> > Handled-By : Jens Axboe <jens.axboe@oracle.com>
> > Status : Jens is investigating
>
> I think this one is already fixed.
Thanks for this information (Jens already told me the same).
> Brice
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
` (2 preceding siblings ...)
2006-11-15 11:06 ` Brice Goglin
@ 2006-11-15 12:07 ` Alan
2006-11-15 15:52 ` Stephen Hemminger
4 siblings, 0 replies; 91+ messages in thread
From: Alan @ 2006-11-15 12:07 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, Stephen Hemminger, gregkh, linux-pci, Komuro, Eric W. Biederman, Ingo Molnar, Ernst Herzberg, Len Brown, Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el, oprofile-list, Alex Romosan, Jens Axboe, Andrey Borzenkov, Alan Stern, linux-usb-devel
> Subject : PCI MSI setting corrupted during resume
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7479
> Submitter : Stephen Hemminger <shemminger@osdl.org>
> Status : unknown
This is one of the minor resume problems as far as I can tell. I believe the patches I posted for having a resume quirk run on each device if appropriate should correctly resolve these. See the patch I sent to l/k. There are a variety of other resume quirks we definitely require.
Alan
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
` (3 preceding siblings ...)
2006-11-15 12:07 ` Alan
@ 2006-11-15 15:52 ` Stephen Hemminger
2006-11-15 16:35 ` Eric W. Biederman
4 siblings, 1 reply; 91+ messages in thread
From: Stephen Hemminger @ 2006-11-15 15:52 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, gregkh, linux-pci, Komuro, Eric W. Biederman, Ingo Molnar, Ernst Herzberg, Len Brown, Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el, oprofile-list, Alex Romosan, Jens Axboe, Andrey Borzenkov, Alan Stern, linux-usb-devel
>
> Subject : PCI MSI setting corrupted during resume
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7479
> Submitter : Stephen Hemminger <shemminger@osdl.org>
> Status : unknown
>
Turns out this isn't a regression, it was always there. It has to do with ACPI clearing state on resume. MSI wasn't being used the same in older kernels so it didn't show up.
^ permalink raw reply [flat|nested] 91+ messages in thread
* Re: 2.6.19-rc5: known regressions (v3)
2006-11-15 15:52 ` Stephen Hemminger
@ 2006-11-15 16:35 ` Eric W. Biederman
0 siblings, 0 replies; 91+ messages in thread
From: Eric W. Biederman @ 2006-11-15 16:35 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Adrian Bunk, Linus Torvalds, Andrew Morton, Linux Kernel Mailing List, gregkh, linux-pci, Komuro, Eric W. Biederman, Ingo Molnar, Ernst Herzberg, Len Brown, Andre Noll, Andi Kleen, discuss, Prakash Punnoor, phil.el, oprofile-list, Alex Romosan, Jens Axboe, Andrey Borzenkov, Alan Stern, linux-usb-devel
Stephen Hemminger <shemminger@osdl.org> writes:
>>
>> Subject : PCI MSI setting corrupted during resume
>> References : http://bugzilla.kernel.org/show_bug.cgi?id=7479
>> Submitter : Stephen Hemminger <shemminger@osdl.org>
>> Status : unknown
>>
> Turns out this isn't a regression, it was always there. It has to do with ACPI
> clearing state on resume. MSI wasn't being used the same in older kernels so
> it didn't show up.
Ok. Do we know enough to fix the MSI case?
Eric
^ permalink raw reply [flat|nested] 91+ messages in thread
Thread overview: 91+ messages
2006-11-08 2:33 Linux 2.6.19-rc5 Linus Torvalds
2006-11-08 9:43 ` Nigel Cunningham
2006-11-08 9:59 ` Alessandro Suardi
2006-11-08 10:04 ` Nigel Cunningham
2006-11-08 14:19 ` Gene Heskett
2006-11-08 15:43 ` Linus Torvalds
[not found] ` <20061108085235.GT4729@stusta.de>
2006-11-08 9:29 ` [discuss] 2.6.19-rc5: known regressions Jan Beulich
2006-11-08 10:21 ` Adrian Bunk
2006-11-08 9:34 ` Jens Axboe
2006-11-08 19:09 ` Alex Romosan
2006-11-08 19:29 ` Jens Axboe
2006-11-08 19:38 ` Alex Romosan
2006-11-08 19:45 ` Jens Axboe
2006-11-08 21:40 ` Alex Romosan
2006-11-08 20:03 ` Arjan van de Ven
2006-11-08 20:19 ` Jens Axboe
2006-11-08 11:04 ` Eric W. Biederman
2006-11-08 11:32 ` Thomas Gleixner
[not found] ` <7813413.118221162987983254.komurojun-mbn@nifty.com>
2006-11-08 16:00 ` Linus Torvalds
2006-11-10 12:42 ` Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq Komuro
2006-11-13 16:02 ` Linus Torvalds
2006-11-13 17:11 ` Eric W. Biederman
2006-11-13 20:44 ` Ingo Molnar
2006-11-13 21:11 ` Eric W. Biederman
2006-11-14 8:14 ` [patch] irq: do not mask interrupts by default Ingo Molnar
2006-11-14 8:20 ` Arjan van de Ven
2006-11-14 12:43 ` Komuro
2006-11-14 16:10 ` Linus Torvalds
2006-11-14 17:52 ` [PATCH] Use delayed disable mode of ioapic edge triggered interrupts Eric W. Biederman
2006-11-14 23:35 ` Linus Torvalds
2006-11-15 1:17 ` Linus Torvalds
2006-11-15 5:14 ` Eric W. Biederman
2006-11-15 16:06 ` Linus Torvalds
2006-11-15 16:58 ` Eric W. Biederman
2006-11-15 12:40 ` Komuro
[not found] ` <20061115090427.GA16173@elte.hu>
2006-11-15 16:13 ` [patch] genirq: do not mask interrupts by default Linus Torvalds
2006-11-15 17:46 ` Ingo Molnar
[not found] ` <m1y7qm425l.fsf@ebiederm.dsl.xmission.com>
[not found] ` <Pine.LNX.4.64.0611080745150.3667@g5.osdl.org>
2006-11-08 16:22 ` 2.6.19-rc5: known regressions Adrian Bunk
2006-11-08 23:11 ` Tim Chen
2006-11-09 2:49 ` Tim Chen
2006-11-09 5:10 ` Eric W. Biederman
2006-11-13 22:46 ` Tim Chen
2006-11-14 0:03 ` Eric W. Biederman
[not found] ` <20061111015035.GU4729@stusta.de>
2006-11-11 9:08 ` [discuss] 2.6.19-rc5: known regressions (v2) Rafael J. Wysocki
2006-11-11 9:25 ` Paolo Ornati
2006-11-11 10:49 ` Rafael J. Wysocki
2006-11-11 12:29 ` Paolo Ornati
2006-11-14 16:44 ` Paolo Ornati
2006-11-29 10:10 ` [SOLVED] " Paolo Ornati
2006-11-13 22:14 ` 2.6.19-rc5: known regressions with patches Adrian Bunk
2006-11-13 22:56 ` Brian King
2006-11-13 23:15 ` Linus Torvalds
2006-11-14 2:35 ` Jeff Garzik
2006-11-15 10:21 ` 2.6.19-rc5: known regressions (v3) Adrian Bunk
2006-11-15 10:35 ` Jens Axboe
2006-11-15 10:53 ` Adrian Bunk
2006-11-15 10:35 ` Eric Dumazet
2006-11-15 10:50 ` Andi Kleen
2006-11-15 16:40 ` William Cohen
2006-11-15 16:48 ` [discuss] " Andi Kleen
2006-11-15 18:39 ` Andrew Morton
2006-11-15 18:45 ` Andi Kleen
2006-11-15 19:07 ` Linus Torvalds
2006-11-15 19:23 ` Andi Kleen
2006-11-15 20:21 ` Andrew Morton
2006-11-15 21:18 ` Eric W. Biederman
2006-11-15 21:31 ` Andrew Morton
2006-11-16 10:55 ` Mikael Pettersson
2006-11-16 20:23 ` Andrew Morton
2006-11-17 9:59 ` Mikael Pettersson
2006-11-17 10:13 ` Andrew Morton
2006-11-19 3:05 ` Bill Davidsen
2006-11-17 10:29 ` Andi Kleen
2006-11-16 3:21 ` Andi Kleen
2006-11-16 5:05 ` Andrew Morton
2006-11-16 7:04 ` Andi Kleen
2006-11-16 15:34 ` William Cohen
2006-11-16 15:47 ` Andi Kleen
2006-11-16 21:32 ` Stephane Eranian
2006-11-22 10:28 ` Eric Dumazet
2006-11-22 10:36 ` Andi Kleen
2006-11-22 18:42 ` Andrew Morton
2006-12-16 11:20 ` Ray Lee
2006-11-22 17:59 ` William Cohen
2006-11-22 18:05 ` William Cohen
2006-11-22 18:26 ` Eric Dumazet
2006-11-15 11:06 ` Brice Goglin
2006-11-15 22:32 ` Adrian Bunk
2006-11-15 12:07 ` Alan
2006-11-15 15:52 ` Stephen Hemminger
2006-11-15 16:35 ` Eric W. Biederman