* 2.6.12-rc6-mm1
@ 2005-06-07 11:29 Andrew Morton
2005-06-07 14:24 ` 2.6.12-rc6-mm1 Wolfgang Wander
` (9 more replies)
0 siblings, 10 replies; 101+ messages in thread
From: Andrew Morton @ 2005-06-07 11:29 UTC (permalink / raw)
To: linux-kernel
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
- Added v9fs
- Various random fixes
- Probably a similar number of breakages
Changes since 2.6.12-rc5-mm2:
-fix-ide-scsi-eh-locking.patch
-ext3-fix-log_do_checkpoint-assertion-failure.patch
-ext3-fix-list-scanning-in-__cleanup_transaction.patch
-namei-fixes-01-19.patch
-namei-fixes-02-19.patch
-namei-fixes-03-19.patch
-namei-fixes-04-19.patch
-namei-fixes-05-19.patch
-namei-fixes-06-19.patch
-namei-fixes-07-19.patch
-namei-fixes-08-19.patch
-namei-fixes-09-19.patch
-namei-fixes-10-19.patch
-namei-fixes-11-19.patch
-namei-fixes-12-19.patch
-namei-fixes-13-19.patch
-namei-fixes-14-19.patch
-namei-fixes-15-19.patch
-namei-fixes-16-19.patch
-namei-fixes-17-19.patch
-namei-fixes-18-19.patch
-namei-fixes-19-19.patch
-ipmi-class_simple-fixes.patch
-gregkh-i2c-i2c-ali1563.patch
-git-ocfs-fix-for-shemminger-tcp-stuff.patch
-gregkh-pci-pci-hotplug-shpchp-_HPP-fix.patch
-gregkh-pci-pci-hotplug-shpchp-PERR-fix.patch
-gregkh-pci-pci-amd74xx-ids.patch
-gregkh-pci-pci-cpci-update.patch
-gregkh-usb-usb-sl811-hcd-fixes.patch
-gregkh-usb-usb-sl811_cs.patch
-gregkh-usb-usb-ftdi_sio-new-id.patch
-gregkh-usb-usb-serial-generic-init-fix.patch
-gregkh-usb-usb-ub_multi_lun.patch
-gregkh-usb-usb-remove_pwc_changelog.patch
-gregkh-usb-usb-add-new-wacom-device-to-usb-hid-core-list.patch
-gregkh-usb-usb-urb_documentation.patch
-gregkh-usb-usb-earthmate-hid-blacklist.patch
-gregkh-usb-usb-storage-trumpion.patch
-gregkh-usb-usb-modalias-shrink.patch
-gregkh-usb-usb-cp2101-flow-control.patch
-gregkh-usb-usb-usbatm-reduce-log-spam.patch
-gregkh-usb-usb-usbatm-avoid-oops-on-bind-failure.patch
-gregkh-usb-usb-usbatm-1-fix.patch
-usb-option-card-driver.patch
-usb-wacom-tablet-driver.patch
-atm-nicstar-remove-a-bunch-of-pointless-casts-of-null.patch
-fix-atm-build-with-o=.patch
-drivers-net-hamradio-baycom_eppc-cleanups.patch
-ppc32-apple-device-tree-bug-fix.patch
-ppc32-ppc64-cleanup-proc-device-tree.patch
-ppc64-cleanup-spr-definitions.patch
-ppc64-cleanup-iseries-runlight-support.patch
-ppc64-remove-decr_overclock.patch
-ppc64-fix-a-device-tree-bug-on-apples.patch
-i386-collect-host-bridge-resources.patch
-x86_64-collect-host-bridge-resources.patch
-allow-ev_abs-to-work-in-uinputc.patch
-serial-update-nec-vr4100-series-serial-support.patch
Merged
+ppc32-add-linux-compilerh-to-asm-sigcontexth.patch
+include-linux-configh-before-testing-config_acpi.patch
+uml-make-the-emulated-iomem-driver-work-on-26.patch
+uml-compile-fixes-for-gcc-4.patch
+uml-fix-strace-f.patch
+uml-clean-up-error-path.patch
+uml-link-tt-mode-against-nptl.patch
+send_ipi_mask_sequence-warning-fix.patch
+ppc32-add-405ep-cpu_spec-entry.patch
+input-disable-scroll-feature-on-at-keyboards.patch
Planned for 2.6.12
+x86_64-task_size-fixes-for-compatibility-mode-processes.patch
x86_64 critical fixes (needs work)
+ia64-disable-preempt.patch
Disable CONFIG_PREEMPT on ia64 (it has problems with floating-point
save/restore)
+fix-up-macro-abuse-in-drivers-acpi-sleep-procc.patch
ACPI cleanup
+git-arm.patch
+git-arm-smp.patch
ARM git trees
-git-cpufreq.patch
Empty
+fix-warning-in-powernow-k8c.patch
Fix a cpufreq warning
+gregkh-driver-ipmi-class_simple-fixes.patch
+gregkh-driver-sysfs-permissions-01.patch
+gregkh-driver-sysfs-permissions-02.patch
+gregkh-driver-sysfs-permissions-03.patch
+gregkh-driver-dont-loose-devices-on-suspend-failure.patch
New driver core patches
-bk-drm.patch
-bk-drm-via.patch
DRM is moving to git
-update-drm-ioctl-compatibility-to-new-world-order.patch
The code which this patches isn't there any more (it will come back)
+git-drm-initmap.patch
+git-drm-via.patch
Some DRM git trees
+gregkh-i2c-i2c-Kconfig-update.patch
+gregkh-i2c-i2c-pcf8574-cleanup.patch
+gregkh-i2c-i2c-adm9240-docs.patch
+gregkh-i2c-i2c-device-attr-lm90.patch
+gregkh-i2c-i2c-device-attr-lm83.patch
+gregkh-i2c-i2c-device-attr-lm63.patch
+gregkh-i2c-i2c-device-attr-it87.patch
+gregkh-i2c-hwmon-01.patch
+gregkh-i2c-hwmon-02.patch
+gregkh-i2c-hwmon-03.patch
i2c tree updates
+i2c-chips-need-hwmon.patch
+gregkh-i2c-hwmon-02-sparc64-fix.patch
Fix a few things in the i2c tree
+sonypi-make-sure-that-input_work-is-not-running-when-unloading.patch
sonypi fix
-git-libata-adma.patch
-git-libata-ahci-msi.patch
-git-libata-bridge-detect.patch
-git-libata-chs-support.patch
-git-libata-docs.patch
-git-libata-svw.patch
-git-libata-promise-sata-pata.patch
-git-libata-pdc2027x.patch
Dropped the libata tree - it changes all the time and I can't work out wtf
is going on.
-git-netdev-r8169.patch
Too many rejects from this one.
+fix-recursive-ipw2200-dependencies.patch
+drivers-net-chelsio-cxgb2-use-the-dma_3264bit_mask-constants.patch
+drivers-net-wireless-ipw2100-use-the-dma_32bit_mask-constant.patch
+drivers-net-wireless-ipw2200-use-the-dma_32bit_mask-constant.patch
+fix-tulip-suspend-resume.patch
Net driver fixes
+scalable-tcp-cleaned.patch
"scalable TCP"
+git-serial.patch
Serial subsystem tree
+gregkh-pci-pci-fix-routing-in-parent-bridge.patch
+gregkh-pci-pci-dma-bursting-advice.patch
+gregkh-pci-pci-collect-host-bridge-resources-01.patch
+gregkh-pci-pci-collect-host-bridge-resources-02.patch
PCI subsystem tree updates
+gregkh-pci-pci-dma-bursting-advice-fix.patch
Fix it
-git-scsi-rc-fixes.patch
This is empty
+gregkh-usb-usb-usbatm-reduce-log-spam.patch
+gregkh-usb-usb-usbatm-avoid-oops-on-bind-failure.patch
+gregkh-usb-usb-usbatm-fix-gcc-2.95.x.patch
+gregkh-usb-usb-usbatm-kcalloc.patch
+gregkh-usb-usb-uhci-detect-invalid-ports.patch
+gregkh-usb-usb-export-getput_intf.patch
+gregkh-usb-usb-cdc-acm-reference-count-fix.patch
+gregkh-usb-usb-ehci-fix-page-pointer-allocate.patch
+gregkh-usb-usb-wireless-definitions.patch
+gregkh-usb-usb-usblp-race-fix.patch
+gregkh-usb-usb-stv680-creative-mini.patch
+gregkh-usb-usb-atiremote-sysfs-links.patch
+gregkh-usb-usb-gotemp.patch
USB tree updates
+sparsemem-memory-model-fix-4.patch
+sparsemem-memory-model-fix-5.patch
Fix sparsemem-memory-model.patch even more
+sparsemem-hotplug-base-fix.patch
Fix sparsemem-hotplug-base.patch
-vm-merge_lru_pages.patch
-vm-page-cache-reclaim-core.patch
-vm-page-cache-reclaim-core-tidy.patch
-vm-reclaim_page_cache_node-syscall.patch
-vm-reclaim_page_cache_node-syscall-x86.patch
-vm-automatic-reclaim-through-mempolicy.patch
+vm-add-may_swap-flag-to-scan_control.patch
+vm-early-zone-reclaim.patch
+vm-early-zone-reclaim-tidy.patch
+vm-add-__gfp_noreclaim.patch
+vm-rate-limit-early-reclaim.patch
These patches were updated
+node-local-per-cpu-pages-tidy-2-fix.patch
Fix node-local-per-cpu-pages.patch some more.
+avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes.patch
Fix a patch clash
+__mod_page_state-pass-unsigned-long-instead-of-unsigned.patch
+__read_page_state-pass-unsigned-long-instead-of-unsigned.patch
Warning fixes
+add-oom-debug.patch
Additional debug output when the box goes oom.
+periodically-drain-non-local-pagesets.patch
+periodically-drain-non-local-pagesets-fix.patch
Shrink the per-cpu-pages caches occasionally
+ia64-uncached-alloc.patch
+sn2-xpc-build-patches.patch
Special allocator for uncached pages
+shmem-restore-superblock-info.patch
+mbind-fix-verify_pages-pte_page.patch
+mbind-check_range-use-standard-ptwalk.patch
+dup_mmap-update-comment-on-new-vma.patch
+bad_page-clear-reclaim-and-slab.patch
+rme96xx-fix-pagereserved-range.patch
+get_user_pages-kill-get_page_map.patch
+do_wp_page-cannot-share-file-page.patch
+can_share_swap_page-use-page_mapcount.patch
+msync-check-pte-dirty-earlier.patch
Various mm fixes
+sunzilog-warning-fixes.patch
+ppp-handle-misaligned-accesses.patch
Net fixes
+ppc32-removed-dependency-on-config_cpm2-for-building.patch
+ppc32-converted-mpc10x-bridge-to-use-platform.patch
+cpm_uart-route-scc2-pins-for-the-stx-gp3-board.patch
ppc32 updates
+ppc64-iseries-remove-iseries_proch.patch
+ppc64-iseries-header-file-white-space-cleanups.patch
+ppc64-iseries-more-header-file-white-space-cleanups.patch
+ppc64-iseries-obvious-code-simplifications.patch
+ppc64-iseries-remove-lpardatah.patch
+ppc64-iseries-eliminate-some-unused-inline-functions.patch
+ppc64-iseries-remove-hvcallcfgh.patch
+ppc64-iseries-cleanup-itlpqueueh.patch
+ppc64-iseries-tidy-up-some-includes-and-hvcallh.patch
+ppc64-iseries-misc-header-cleanups.patch
+update-ppc64-defconfig.patch
+ppc64-iseries-remove-iseries_pci_resetc.patch
+ppc64-iseries-iommuh-cleanups.patch
+ppc64-iseries-iseries_vpdinfoc-cleanups.patch
+ppc64-iseries-iseries_pcih-cleanups.patch
+ppc64-iseries-remove-ioretry-from-iseries_device_node.patch
+ppc64-iseries-remove-some-more-members-of.patch
ppc64 updates
+x86-x86_64-pcibus_to_node-fix.patch
Fix x86-x86_64-pcibus_to_node.patch
+mempool-bounce-buffer-restriction.patch
Limit the amount of memory which can be used for bounce buffers
+arm-irqs_disabled-type-fix.patch
ARM warning fix
+variable-overflow-after-hundreds-round-of-hotplug-cpu.patch
CPU hotplug fix
+x86_64-change-init-sections-for-cpu-hotplug-support.patch
+x86_64-change-init-sections-for-cpu-hotplug-support-fix.patch
+x86_64-cpu-hotplug-support.patch
+x86_64-cpu-hotplug-sibling-map-cleanup.patch
+x86_64-dont-use-broadcast-shortcut-to-make-it-cpu-hotplug-safe.patch
+x86_64-provide-ability-to-choose-using-shortcuts-for-ipi-in-flat-mode.patch
CPU hotplug for x86_64
+m32r-support-m3a-2170mappi-iii-platform-fix.patch
+m32r-support-m3a-2170mappi-iii-platform-fix-2.patch
+m32r-update-setup_xxxxxc.patch
+m32r-update-m32r_cfc-to-support-mappi-iii-fix.patch
+m32r-cleanup-arch-m32r-mm-extablec.patch
+m32r-remove-include-asm-m32r-m32102perih.patch
+m32r-update-defconfig-files.patch
+m32r-use-asm-generic-div64h.patch
m32r fixes and updates
+s390-cio-max-channels-checks.patch
+s390-cio-documentation.patch
+s390-ifdefs-in-compat_ioctls.patch
+s390-kernel-stack-overflow-panic.patch
+s390-cmm-sender-parameter-visibility.patch
+s390-memory-detection-32gb.patch
+s390-pending-interrupt-after-ipl-from-reader.patch
s/390 updates
+ecryptfs-export-user-key-type.patch
Export a symbol
+x86_64-specific-function-return-probes.patch
+kprobes-ia64-cleanup-2.patch
+kprobes-ia64-cmp-ctype-unc-support.patch
+kprobes-ia64-safe-register-kprobe.patch
+kprobes-temporary-disarming-of-reentrant-probe-for-x86_64-fix.patch
+allow-a-jprobe-to-coexist-with-muliple-kprobes.patch
kprobes updates
+cs4236-irq-handling-fix.patch
OSS driver fix
+block-add-unlocked_ioctl-support-for-block-devices.patch
Support lock_kernel-less ioctls on blockdevs
+pcdp-handle-tables-that-dont-supply-baud-rate.patch
serial driver update
+stop-arch-i386-kernel-vsyscall-noteo-being-rebuilt-every-time.patch
kbuild fix
+remove-f_error-field-from-struct-file.patch
cleanup
+autofs4-avoid-panic-on-bind-mount-of-autofs-owned-directory.patch
+autofs4-post-expire-race-fix.patch
+autofs4-bad-lookup-fix.patch
+autofs4-subversion-bump-to-identify-these-changes.patch
autofs4 updates
+rapidio-support-core-base.patch
+rapidio-support-core-includes.patch
+rapidio-support-core-enum.patch
+rapidio-support-ppc32.patch
+rapidio-support-net-driver.patch
RapidIO driver
+dlm-lockspaces-callbacks-directory-dlm-consistent-ifdefs.patch
+dlm-lockspaces-callbacks-directory-fix-2-dlm-dont-repeat-include.patch
+dlm-lockspaces-callbacks-directory-fix-3.patch
+dlm-lockspaces-callbacks-directory-dlm-dont-free-lvb-twice.patch
+dlm-communication-dlm-dont-add-duplicate-node-addresses.patch
+dlm-recovery-dlm-timer-cant-be-global.patch
+dlm-recovery-dlm-clear-recovery-flags.patch
+dlm-device-interface-dlm-uncomment-unregister_lockspace.patch
+dlm-device-interface-dlm-newline-in-printks.patch
+dlm-debug-fs-dlm-consistent-ifdefs.patch
Various fixes and updates to the DLM driver
+tuner-corec-improvments-and-ymec-tvision-tvf8533mf.patch
v4l update
+oprofile-report-anonymous-region-samples.patch
oprofile feature
+lockd-flush-signals-on-shutdown.patch
+nfs4-hold-filp-while-reading-or-writing.patch
+nfsd4-fix-probe_callback.patch
+nfsd4-nfs4_check_open_reclaim-cleanup.patch
+nfsd4-create-separate-laundromat-workqueue.patch
+nfsd4-simplify-lease-changing.patch
+nfsd4-delegation-recovery.patch
+nfsd4-rename-nfs4_state_init.patch
+nfsd4-clean-up-state-initialization.patch
+nfsd4-remove-nfs4_reclaim_init.patch
+nfsd4-idmap-initialization.patch
+nfsd4-setclientid-simplification.patch
+nfsd4-reboot-hash.patch
+nfsd4-add-find_unconf_by_str-functions-to-simplify-setclientid.patch
+nfsd4-grace-period-end.patch
+nfsd4-make-needlessly-global-code-static.patch
+nfsd4-fix-uncomfirmed-list.patch
+nfsd4-fix-setclientid_confirm-cases.patch
+nfsd4-fix-setclientid_confirm-error-return.patch
+nfsd4-setclientid_confirm-gotoectomy.patch
+nfsd4-setclientid_confirm-comments.patch
+nfsd4-miscellaneous-setclientid_confirm-cleanup.patch
+nfsd4-rename-state-list-fields.patch
+nfsd4-allow-multiple-lockowners.patch
+nfsd4-remove-cb_parsed.patch
+nfsd4-initialize-recovery-directory.patch
+nfsd4-reboot-recovery.patch
+nfsd4-reboot-dirname.patch
nfsd updates
+isofs-show-hidden-files-add-granularity-for-assoc-hidden-files-flags.patch
+isofs-show-hidden-files-add-granularity-for-assoc-hidden-files-flags-tidy.patch
+isofs-show-hidden-files-add-granularity-for-assoc-hidden-files-flags-fix.patch
isofs feature work
+numa-aware-slab-allocator-v5.patch
The NUMA-aware slab allocator is back. Needs ifdef-reduction work.
-periodically-scan-redzone-entries-and-slab-control-structures.patch
-slab-leak-detector.patch
-slab-leak-detector-warning-fixes.patch
It broke these.
+numa-aware-slab-allocator-v3-__bad_size-fix.patch
Fix it.
+sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
CPU scheduler fix
+v4l-add-support-for-pixelview-ultra-pro.patch
+dvico-fusionhdtv3-gold-t-documentation-fix.patch
v4l updates
+kexec-code-cleanup.patch
Make all the kexec patches resemble CodingStyle.
+v9fs-documentation-makefiles-configuration.patch
+v9fs-documentation-makefiles-configuration-fix.patch
+v9fs-vfs-file-dentry-and-directory-operations.patch
+v9fs-vfs-file-dentry-and-directory-operations-fix.patch
+v9fs-vfs-inode-operations.patch
+v9fs-vfs-superblock-operations-and-glue.patch
+v9fs-9p-protocol-implementation.patch
+v9fs-transport-modules.patch
+v9fs-debug-and-support-routines.patch
+v9fs-debug-and-support-routines-fix.patch
The plan9 networked filesystem
+framebuffer-driver-for-arc-lcd-board.patch
+framebuffer-driver-for-arc-lcd-board-tidy.patch
+framebuffer-driver-for-arc-lcd-board-update.patch
+new-pci-id-for-chipsfb.patch
fbdev updates
+modules-add-version-and-srcversion-to-sysfs-fix.patch
+modules-add-version-and-srcversion-to-sysfs-fix-2.patch
Fix modules-add-version-and-srcversion-to-sysfs.patch
+fuse-device-functions-fuse-serious-information-leak-fix.patch
FUSE fix
+remove-redundant-info-from-submittingpatches.patch
Documentation update
-unexport-slab_reclaim_pages.patch
Drop this due to some reject.
number of patches in -mm: 1397
number of changesets in external trees: 53
number of patches in -mm only: 1395
total patches: 1448
All 1397 patches:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/patch-list
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-07 14:24 ` Wolfgang Wander
2005-06-07 14:49 ` 2.6.12-rc6-mm1 Wolfgang Wander
2005-06-07 14:48 ` 2.6.12-rc6-mm1 Brice Goglin
` (8 subsequent siblings)
9 siblings, 1 reply; 101+ messages in thread
From: Wolfgang Wander @ 2005-06-07 14:24 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Chen, Kenneth W
Andrew Morton wrote:
> +avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes.patch
As a heads-up.
This one breaks the fragmentation reduction patch in 32 bit emulation mode.
Our test case shows the standard 17 fragmented regions in /proc/self/maps (as in
the 2.6 standard kernel) vs the 2 regions in 2.6.12-rc5-mm2 (and before).
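A minimal sketch of how region counts like these can be read off `/proc/self/maps`-style text. This is not the test case mentioned above; the function names and the sample mappings are hypothetical, and it assumes one mapping per line with the address range in the first field.

```python
def count_regions(maps_text):
    """Count mappings: one non-empty line per region in maps-style text."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def count_gaps(maps_text):
    """Count holes between consecutive mappings, sorted by start address."""
    ranges = []
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        # First field looks like "08048000-08049000" (hex start-end).
        start_s, end_s = line.split()[0].split("-")
        ranges.append((int(start_s, 16), int(end_s, 16)))
    ranges.sort()
    return sum(
        1
        for (_, prev_end), (start, _) in zip(ranges, ranges[1:])
        if start > prev_end
    )


# Hypothetical sample data, not output from any kernel discussed here.
SAMPLE_MAPS = """\
08048000-08049000 r-xp 00000000 03:01 1234 /bin/example
08049000-0804a000 rw-p 00000000 03:01 1234 /bin/example
40000000-40001000 rw-p 00000000 00:00 0
40003000-40004000 rw-p 00000000 00:00 0
"""

print(count_regions(SAMPLE_MAPS))  # 4
print(count_gaps(SAMPLE_MAPS))     # 2
```

On a Linux system the same functions can be fed the contents of `/proc/self/maps` read from the process under test.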
Somehow the new way of detecting 32 bit emulation mode seems to fail here.
I'll try to figure out a fix.
Wolfgang
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
2005-06-07 14:24 ` 2.6.12-rc6-mm1 Wolfgang Wander
@ 2005-06-07 14:48 ` Brice Goglin
2005-06-07 20:59 ` 2.6.12-rc6-mm1: rio confusion Adrian Bunk
` (7 subsequent siblings)
9 siblings, 0 replies; 101+ messages in thread
From: Brice Goglin @ 2005-06-07 14:48 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
>
> - Added v9fs
>
> - Various random fixes
>
> - Probably a similar number of breakages
Hi Andrew,
I didn't see any breakage. But I get these two lines during boot:
yenta 0000:02:03.1: no resource of type 100 available, trying to continue...
yenta 0000:02:03.1: no resource of type 100 available, trying to continue...
Anyway, my PCMCIA slots seem to still work.
Brice
* Re: 2.6.12-rc6-mm1
2005-06-07 14:24 ` 2.6.12-rc6-mm1 Wolfgang Wander
@ 2005-06-07 14:49 ` Wolfgang Wander
0 siblings, 0 replies; 101+ messages in thread
From: Wolfgang Wander @ 2005-06-07 14:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Chen, Kenneth W
[-- Attachment #1: Type: text/plain, Size: 1053 bytes --]
Wolfgang Wander wrote:
> Andrew Morton wrote:
>
>> +avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes.patch
>>
>
>
> As a heads-up.
>
> This one breaks the fragmentation reduction patch in 32 bit emulation mode.
> Our test case shows the standard 17 fragmented regions in /proc/self/maps (as in
> the 2.6 standard kernel) vs the 2 regions in 2.6.12-rc5-mm2 (and before).
>
> Somehow the new way of detecting 32 bit emulation mode seems to fail here.
>
> I'll try to figure out a fix.
>
Here is one possibility:
Since rc6 the difference between TASK_UNMAPPED_64 and TASK_UNMAPPED_32 is gone
and both are now merged into TASK_UNMAPPED_BASE. Therefore we can no longer
check our local base against TASK_UNMAPPED_BASE to see if we are running in 32bit
emulation mode. The appended patch uses other (hopefully the right) means.
Tested on x86_64 in 32-bit and 64-bit mode (64-bit fragments as desired, 32-bit
collapses as desired).
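The change in the condition can be modelled outside the kernel. The sketch below is only an illustration of the two boolean tests, not kernel code: the function names, the `is_ia32_task` parameter (standing in for `test_thread_flag(TIF_IA32)`) and the base address used in the example are assumptions. `MAP_32BIT` is given its x86_64 Linux value.

```python
MAP_32BIT = 0x40  # value of MAP_32BIT on x86_64 Linux


def old_check(begin, task_unmapped_base, length, cached_hole_size):
    """Pre-patch test: only fires when begin differs from TASK_UNMAPPED_BASE."""
    return begin != task_unmapped_base and length <= cached_hole_size


def new_check(flags, is_ia32_task, length, cached_hole_size):
    """Patched test: a 32-bit request is MAP_32BIT or a task with TIF_IA32 set."""
    is_32bit = bool(flags & MAP_32BIT) or is_ia32_task
    return is_32bit and length <= cached_hole_size


# Hypothetical numbers: once a single TASK_UNMAPPED_BASE is used, an emulated
# 32-bit task searches from that same base, so the old comparison is false.
base = 0x1000
print(old_check(base, base, 4096, 8192))   # False
print(new_check(0, True, 4096, 8192))      # True
```

With the patched test, a plain 64-bit request (`flags` without `MAP_32BIT`, `is_ia32_task=False`) still skips the cache reset, which matches the "64-bit fragments, 32-bit collapses" result reported above.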
Signed-off-by: Wolfgang Wander <wwc@rentec.com>
[-- Attachment #2: avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes-fix.patch --]
[-- Type: text/x-patch, Size: 511 bytes --]
--- arch/x86_64/kernel/sys_x86_64.c~ 2005-06-07 09:12:31.000000000 -0400
+++ arch/x86_64/kernel/sys_x86_64.c 2005-06-07 10:32:07.000000000 -0400
@@ -105,7 +105,8 @@ arch_get_unmapped_area(struct file *filp
 		    (!vma || addr + len <= vma->vm_start))
 			return addr;
 	}
-	if (begin != TASK_UNMAPPED_BASE && len <= mm->cached_hole_size) {
+	if (((flags & MAP_32BIT) || test_thread_flag(TIF_IA32))
+		&& len <= mm->cached_hole_size) {
 		mm->cached_hole_size = 0;
 		mm->free_area_cache = begin;
 	}
* 2.6.12-rc6-mm1: rio confusion
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
2005-06-07 14:24 ` 2.6.12-rc6-mm1 Wolfgang Wander
2005-06-07 14:48 ` 2.6.12-rc6-mm1 Brice Goglin
@ 2005-06-07 20:59 ` Adrian Bunk
2005-06-07 21:38 ` Matt Porter
2005-06-07 23:15 ` 2.6.12-rc6-mm1 Francois Romieu
` (6 subsequent siblings)
9 siblings, 1 reply; 101+ messages in thread
From: Adrian Bunk @ 2005-06-07 20:59 UTC (permalink / raw)
To: Andrew Morton, Matt Porter; +Cc: linux-kernel
On Tue, Jun 07, 2005 at 04:29:31AM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.12-rc5-mm2:
>...
> +rapidio-support-core-base.patch
> +rapidio-support-core-includes.patch
> +rapidio-support-core-enum.patch
> +rapidio-support-ppc32.patch
> +rapidio-support-net-driver.patch
>
> RapidIO driver
>...
That we do now have both drivers/rio/ and drivers/char/rio/ and that
they are for completely different things is confusing.
What about drivers/rapidio/ ?
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
* Re: 2.6.12-rc6-mm1: rio confusion
2005-06-07 20:59 ` 2.6.12-rc6-mm1: rio confusion Adrian Bunk
@ 2005-06-07 21:38 ` Matt Porter
0 siblings, 0 replies; 101+ messages in thread
From: Matt Porter @ 2005-06-07 21:38 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Andrew Morton, linux-kernel
On Tue, Jun 07, 2005 at 10:59:06PM +0200, Adrian Bunk wrote:
> On Tue, Jun 07, 2005 at 04:29:31AM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.12-rc5-mm2:
> >...
> > +rapidio-support-core-base.patch
> > +rapidio-support-core-includes.patch
> > +rapidio-support-core-enum.patch
> > +rapidio-support-ppc32.patch
> > +rapidio-support-net-driver.patch
> >
> > RapidIO driver
> >...
>
> That we do now have both drivers/rio/ and drivers/char/rio/ and that
> they are for completely different things is confusing.
>
> What about drivers/rapidio/ ?
Fine with me. I'll roll it into my next update.
-Matt
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (2 preceding siblings ...)
2005-06-07 20:59 ` 2.6.12-rc6-mm1: rio confusion Adrian Bunk
@ 2005-06-07 23:15 ` Francois Romieu
2005-06-08 1:59 ` 2.6.12-rc6-mm1 Søren Lott
` (5 subsequent siblings)
9 siblings, 0 replies; 101+ messages in thread
From: Francois Romieu @ 2005-06-07 23:15 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
Andrew Morton <akpm@osdl.org> :
[...]
> -git-netdev-r8169.patch
>
> Too many rejects from this one.
How did you generate git-netdev-r8169.patch?
Jeff's 'upstream-2.6.13' includes all the pending r8169 changes and
nothing will be merged before 2.6.12 is out. IMHO you can safely
ignore any r8169 change until 2.6.12 appears.
--
Ueimor
* 2.6.12-rc6-mm1
@ 2005-06-07 23:50 Martin J. Bligh
2005-06-07 23:56 ` 2.6.12-rc6-mm1 Andrew Morton
0 siblings, 1 reply; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-07 23:50 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
Wheeee! it actually compiles and boots for me on x86 ;-)
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
Seems to show that perf is rather sucky on kernbench though.
baseline (-rc6) data is here:
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/4760/kernbench.test/
-mm1 is here:
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/4876/kernbench.test/
Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
If I factor it by 4x, I get:
47796   10.9% total
16644   30.5% buffered_rmqueue
15574    7.7% default_idle
 2229  239.4% kmem_cache_free
 1782   11.1% zap_pte_range
 1752    0.0% inotify_inode_queue_event
 1467   36.3% release_pages
 1281   73.3% set_page_dirty
 1155   12.8% do_wp_page
  924    8.3% _spin_lock
  896    0.0% find_idlest_group
  828   21.7% free_hot_cold_page
  780    0.0% drain_remote_pages
  772    0.0% dput_recursive
  464    0.0% inotify_dentry_parent_queue_event
...
 -412   -8.1% __d_lookup
 -508  -98.4% find_idlest_cpu
 -542  -24.5% do_anonymous_page
 -549  -47.5% current_fs_time
 -580 -100.0% del_timer_sync
 -594  -86.6% dput
 -695  -31.4% __copy_user_intel
-1461  -13.9% strnlen_user
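The 4x factor is the ratio of the two timer rates, 1000 / 250. A minimal sketch of that normalisation, assuming profile samples are collected on the timer tick so that counts scale linearly with HZ. This is not diffprofile itself; the function names and sample counts are hypothetical, and reporting 0.0% for a symbol with no baseline samples is just a convention chosen for this sketch.

```python
def scale_samples(samples, hz_measured, hz_reference):
    """Rescale tick-based sample counts to what hz_reference would have seen."""
    factor = hz_reference / hz_measured
    return {symbol: count * factor for symbol, count in samples.items()}


def diff_profile(baseline, other):
    """Return {symbol: (delta, percent_change)} for other relative to baseline."""
    result = {}
    for symbol in sorted(set(baseline) | set(other)):
        before = baseline.get(symbol, 0)
        after = other.get(symbol, 0)
        delta = after - before
        # No baseline samples: report 0.0% rather than divide by zero.
        percent = 100.0 * delta / before if before else 0.0
        result[symbol] = (delta, percent)
    return result


# Hypothetical sample counts, not taken from the runs discussed above.
baseline_1000hz = {"func_a": 1000, "func_b": 400}
mm_250hz = {"func_a": 300, "func_b": 90, "func_c": 10}

scaled = scale_samples(mm_250hz, hz_measured=250, hz_reference=1000)
for symbol, (delta, percent) in diff_profile(baseline_1000hz, scaled).items():
    print(f"{delta:8.0f} {percent:7.1f}% {symbol}")
```

Rescaling only corrects the sample counts; any real behavioural difference caused by the lower tick rate still needs a rerun at the same HZ to separate out.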
Buggered if I know what that is from. I'm guessing scheduler, or the
HZ change. I guess I can rerun with the HZ set to 1000 ... you got any
experimental scheduler stuff in your tree?
Else I guess it's some memory allocator stuff maybe?
* Re: 2.6.12-rc6-mm1
2005-06-07 23:50 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-07 23:56 ` Andrew Morton
2005-06-08 0:02 ` 2.6.12-rc6-mm1 Christoph Lameter
2005-06-08 0:02 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 2 replies; 101+ messages in thread
From: Andrew Morton @ 2005-06-07 23:56 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: linux-kernel, Christoph Lameter
"Martin J. Bligh" <mbligh@mbligh.org> wrote:
>
> Wheeee! it actually compiles and boots for me on x86 ;-)
We aim to please.
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>
> Seems to show that perf is rather sucky on kernbench though.
CPU scheduler.
> baseline (-rc6) data is here:
>
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/4760/kernbench.test/
>
> -mm1 is here:
>
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/4876/kernbench.test/
>
> Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
Oh crap, so it does. That's wrong.
> If I factor it by 4x, I get:
Would it be possible to set it back to 100Hz, retest?
* Re: 2.6.12-rc6-mm1
2005-06-07 23:56 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-08 0:02 ` Christoph Lameter
2005-06-08 0:08 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-09 1:58 ` 2.6.12-rc6-mm1 Lee Revell
2005-06-08 0:02 ` 2.6.12-rc6-mm1 Martin J. Bligh
1 sibling, 2 replies; 101+ messages in thread
From: Christoph Lameter @ 2005-06-08 0:02 UTC (permalink / raw)
To: Andrew Morton; +Cc: Martin J. Bligh, linux-kernel
On Tue, 7 Jun 2005, Andrew Morton wrote:
> > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
>
> Oh crap, so it does. That's wrong.
Email by you and Linus indicated that 250 should be the default.
* Re: 2.6.12-rc6-mm1
2005-06-07 23:56 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-08 0:02 ` 2.6.12-rc6-mm1 Christoph Lameter
@ 2005-06-08 0:02 ` Martin J. Bligh
1 sibling, 0 replies; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-08 0:02 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Christoph Lameter
>> Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
>
> Oh crap, so it does. That's wrong.
>
>> If I factor it by 4x, I get:
>
> Would it be possible to set it back to 100Hz, retest?
Sure, but you mean 1000, right?
M.
* Re: 2.6.12-rc6-mm1
2005-06-08 0:02 ` 2.6.12-rc6-mm1 Christoph Lameter
@ 2005-06-08 0:08 ` Andrew Morton
2005-06-08 3:17 ` 2.6.12-rc6-mm1 Nick Piggin
` (2 more replies)
2005-06-09 1:58 ` 2.6.12-rc6-mm1 Lee Revell
1 sibling, 3 replies; 101+ messages in thread
From: Andrew Morton @ 2005-06-08 0:08 UTC (permalink / raw)
To: Christoph Lameter; +Cc: mbligh, linux-kernel
Christoph Lameter <clameter@engr.sgi.com> wrote:
>
> On Tue, 7 Jun 2005, Andrew Morton wrote:
>
> > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
> >
> > Oh crap, so it does. That's wrong.
>
> Email by you and Linus indicated that 250 should be the default.
Oh, OK. hrm.
Martin, it would be useful if you could determine whether the kernbench
slowdown was due to the 1000Hz->250Hz change, thanks.
I'm assuming it was the CPU scheduler patches. There are 36 of them ;)
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (3 preceding siblings ...)
2005-06-07 23:15 ` 2.6.12-rc6-mm1 Francois Romieu
@ 2005-06-08 1:59 ` Søren Lott
2005-06-08 5:53 ` 2.6.12-rc6-mm1 Jean Delvare
2005-06-08 14:22 ` 2.6.12-rc6-mm1 Andy Whitcroft
` (4 subsequent siblings)
9 siblings, 1 reply; 101+ messages in thread
From: Søren Lott @ 2005-06-08 1:59 UTC (permalink / raw)
To: Andrew Morton, gregkh; +Cc: linux-kernel
On Tuesday 07 June 2005 08:29, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
[snip]
> +gregkh-i2c-i2c-Kconfig-update.patch
> +gregkh-i2c-i2c-pcf8574-cleanup.patch
> +gregkh-i2c-i2c-adm9240-docs.patch
> +gregkh-i2c-i2c-device-attr-lm90.patch
> +gregkh-i2c-i2c-device-attr-lm83.patch
> +gregkh-i2c-i2c-device-attr-lm63.patch
> +gregkh-i2c-i2c-device-attr-it87.patch
> +gregkh-i2c-hwmon-01.patch
> +gregkh-i2c-hwmon-02.patch
> +gregkh-i2c-hwmon-03.patch
>
> i2c tree updates
>
> +i2c-chips-need-hwmon.patch
> +gregkh-i2c-hwmon-02-sparc64-fix.patch
>
> Fix a few things in the i2c tree
[snip]
After those changes I don't get entries in /sys for my W83627THF chip
(P4C800-D, i875, ICH5).
Relevant config parts:
CONFIG_HWMON=y
CONFIG_I2C=y
CONFIG_I2C_ISA=y
CONFIG_I2C_SENSOR=y
CONFIG_SENSORS_W83627HF=y
thanks.
-SL
* Re: 2.6.12-rc6-mm1
2005-06-08 0:08 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-08 3:17 ` Nick Piggin
2005-06-08 3:33 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-08 14:15 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-09 23:56 ` 2.6.12-rc6-mm1 Martin J. Bligh
2 siblings, 1 reply; 101+ messages in thread
From: Nick Piggin @ 2005-06-08 3:17 UTC (permalink / raw)
To: Andrew Morton; +Cc: Christoph Lameter, mbligh, lkml, Con Kolivas
On Tue, 2005-06-07 at 17:08 -0700, Andrew Morton wrote:
> Christoph Lameter <clameter@engr.sgi.com> wrote:
> >
> > On Tue, 7 Jun 2005, Andrew Morton wrote:
> >
> > > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
> > >
> > > Oh crap, so it does. That's wrong.
> >
> > Email by you and Linus indicated that 250 should be the default.
>
> Oh, OK. hrm.
>
> Martin, it would be useful if you could determine whether the kernbench
> slowdown was due to the 1000Hz->250Hz change, thanks.
>
> I'm assuming it was the CPU scheduler patches. There are 36 of them ;)
I'm looking at some issues with the scheduler patches.
To start with, it looks like the smp-nice patches are broken. Even if
they weren't, I think it might be a good idea just to put them on hold
until we work out what to do with the other sched patches... we're
only just starting to get some interesting tests (i.e. regressions) being
run on -mm (at least that I've been made aware of). So give me a bit of
time to work through that.
Anyway, Con, this is what it is doing on a 64-way Altix running aim7:
(compare imbalances, task move rates, wakeup move rates, etc).
--- wakeup statistics ---
  269.174 task wakes / s
  31.704% of them from the local CPU
  14.190% of remote wakeups come from domain0
    0.000% are moved to the local CPU via passive load balancing
    26.660% are moved to the local CPU via affine wakeups
  46.672% of remote wakeups come from domain1
    10.359% are moved to the local CPU via passive load balancing
    0.000% are moved to the local CPU via affine wakeups
  39.012% of remote wakeups come from domain2
    10.659% are moved to the local CPU via passive load balancing
    0.000% are moved to the local CPU via affine wakeups
--- load balancing statistics ---
for domain0
  4368.652 load balance calls / s move 137.083 tasks / s
    96.456% calls and 1.174% task moves came from idle balancing
      0.042% were imbalanced with an average imbalance of 566.708
      0.038% found an imbalance but failed
      6.165% of tasks moved were cache hot
    1.818% calls and 73.086% task moves came from busy balancing
      47.694% were imbalanced with an average imbalance of 335.932
      4.704% found an imbalance but failed
      0.140% of tasks moved were cache hot
    1.726% calls and 25.740% task moves came from new-idle balancing
      26.938% were imbalanced with an average imbalance of 198.054
      9.136% found an imbalance but failed
      0.151% of tasks moved were cache hot
  0.000 active balances / s move 0.000 tasks / s
  0.000 exec balances / s move 0.000 tasks / s
  0.000 fork balances / s move 0.000 tasks / s
for domain1
  102.002 load balance calls / s move 180.344 tasks / s
    85.398% calls and 17.496% task moves came from idle balancing
      5.920% were imbalanced with an average imbalance of 386.172
      2.103% found an imbalance but failed
      0.920% of tasks moved were cache hot
    14.602% calls and 82.504% task moves came from busy balancing
      69.017% were imbalanced with an average imbalance of 702.928
      5.849% found an imbalance but failed
      0.075% of tasks moved were cache hot
    0.000% calls and 0.000% task moves came from new-idle balancing
  0.048 active balances / s move 0.002 tasks / s
    %95.000 attempts failed
  0.000 exec balances / s move 0.000 tasks / s
  0.000 fork balances / s move 0.000 tasks / s
for domain2
  9.496 load balance calls / s move 13.070 tasks / s
    91.335% calls and 32.327% task moves came from idle balancing
      21.094% were imbalanced with an average imbalance of 115.513
      16.936% found an imbalance but failed
      2.978% of tasks moved were cache hot
    8.665% calls and 67.673% task moves came from busy balancing
      64.118% were imbalanced with an average imbalance of 503.867
      17.353% found an imbalance but failed
      0.383% of tasks moved were cache hot
    0.000% calls and 0.000% task moves came from new-idle balancing
  0.007 active balances / s move 0.007 tasks / s
    %0.000 attempts failed
  0.000 exec balances / s move 0.000 tasks / s
  0.000 fork balances / s move 0.000 tasks / s
And this is what it looks like with smpnice #if'ed out:
--- wakeup statistics ---
331.734 task wakes / s
25.492% of them from the local CPU
13.601% of remote wakeups come from domain0
0.000% are moved to the local CPU via passive load balancing
1.674% are moved to the local CPU via affine wakeups
44.484% of remote wakeups come from domain1
3.139% are moved to the local CPU via passive load balancing
0.000% are moved to the local CPU via affine wakeups
42.088% of remote wakeups come from domain2
0.000% are moved to the local CPU via passive load balancing
0.000% are moved to the local CPU via affine wakeups
--- load balancing statistics ---
for domain0
3940.070 load balance calls / s move 3.671 tasks / s
96.488% calls and 48.889% task moves came from idle balancing
0.068% were imbalanced with an average imbalance of 1.132
0.029% found an imbalance but failed
3.135% of tasks moved were cache hot
1.339% calls and 33.563% task moves came from busy balancing
2.319% were imbalanced with an average imbalance of 1.037
0.069% found an imbalance but failed
0.228% of tasks moved were cache hot
2.173% calls and 17.548% task moves came from new-idle balancing
1.259% were imbalanced with an average imbalance of 1.008
0.516% found an imbalance but failed
3.057% of tasks moved were cache hot
0.006 active balances / s move 0.006 tasks / s
%0.000 attempts failed
0.000 exec balances / s move 0.000 tasks / s
0.000 fork balances / s move 0.000 tasks / s
for domain1
86.378 load balance calls / s move 2.644 tasks / s
94.236% calls and 89.468% task moves came from idle balancing
4.116% were imbalanced with an average imbalance of 1.123
1.597% found an imbalance but failed
4.281% of tasks moved were cache hot
5.764% calls and 10.532% task moves came from busy balancing
6.667% were imbalanced with an average imbalance of 1.008
1.130% found an imbalance but failed
0.000% of tasks moved were cache hot
0.000% calls and 0.000% task moves came from new-idle balancing
0.082 active balances / s move 0.017 tasks / s
%79.310 attempts failed
0.000 exec balances / s move 0.000 tasks / s
0.000 fork balances / s move 0.000 tasks / s
for domain2
9.024 load balance calls / s move 0.343 tasks / s
95.293% calls and 88.525% task moves came from idle balancing
12.103% were imbalanced with an average imbalance of 1.003
8.701% found an imbalance but failed
14.815% of tasks moved were cache hot
4.707% calls and 11.475% task moves came from busy balancing
16.556% were imbalanced with an average imbalance of 1.000
7.285% found an imbalance but failed
21.429% of tasks moved were cache hot
0.000% calls and 0.000% task moves came from new-idle balancing
0.008 active balances / s move 0.008 tasks / s
%0.000 attempts failed
0.000 exec balances / s move 0.000 tasks / s
0.000 fork balances / s move 0.000 tasks / s
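As a rough aid for reading the two dumps side by side, here is a minimal sketch (not part of the original mail) that compares the per-domain task move rates. The numbers are copied from the "load balance calls / s move N tasks / s" lines above; the dictionary and function names are made up for illustration.

```python
# Task move rates (tasks / s) copied from the two aim7 runs above.
moves_with_smpnice = {"domain0": 137.083, "domain1": 180.344, "domain2": 13.070}
moves_without_smpnice = {"domain0": 3.671, "domain1": 2.644, "domain2": 0.343}


def move_ratio(with_rate, without_rate):
    """Return how many times more tasks move per second with smpnice enabled."""
    return with_rate / without_rate


# Ratio of task move rates (smpnice enabled / smpnice compiled out), per domain.
ratios = {
    domain: move_ratio(moves_with_smpnice[domain], moves_without_smpnice[domain])
    for domain in moves_with_smpnice
}

for domain, ratio in ratios.items():
    print(f"{domain}: {ratio:.1f}x more task moves per second with smpnice")
```

The ratio says nothing about whether the extra migrations help or hurt throughput; it only makes the difference in balancing activity easier to see.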
--
SUSE Labs, Novell Inc.
* Re: 2.6.12-rc6-mm1
2005-06-08 3:17 ` 2.6.12-rc6-mm1 Nick Piggin
@ 2005-06-08 3:33 ` Con Kolivas
2005-06-08 3:50 ` 2.6.12-rc6-mm1 Nick Piggin
0 siblings, 1 reply; 101+ messages in thread
From: Con Kolivas @ 2005-06-08 3:33 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, Christoph Lameter, mbligh, lkml
On Wed, 8 Jun 2005 01:17 pm, Nick Piggin wrote:
> On Tue, 2005-06-07 at 17:08 -0700, Andrew Morton wrote:
> > Christoph Lameter <clameter@engr.sgi.com> wrote:
> > > On Tue, 7 Jun 2005, Andrew Morton wrote:
> > > > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
> > > >
> > > > Oh crap, so it does. That's wrong.
> > >
> > > Email by you and Linus indicated that 250 should be the default.
> >
> > Oh, OK. hrm.
> >
> > Martin, it would be useful if you could determine whether the kernbench
> > slowdown was due to the 1000Hz->250Hz change, thanks.
> >
> > I'm assuming it was the CPU scheduler patches. There are 36 of them ;)
>
> I'm looking at some issues with the scheduler patches.
>
> To start with, it looks like the smp-nice patches are broken. Even if
> they weren't I think it might be a good idea just to put them on hold
> until we work out what to do with the other sched patches...
I originally said I'd wait till the sched patches settled down before tackling
it, but it didn't look like that was ever going to happen, and broken nice on
SMP is a real bug biting people now, so I figured I should just tackle it
anyway. I don't mind if we just work on it later, though.
> Anyway, Con, this is what it is doing on a 64-way Altix running aim7:
> (compare imbalances, task move rates, wakeup move rates, etc).
Definitely different, I agree. As for the performance impact, the statistics
alone don't tell us if they're for good or evil, but we can look at it again
separately when we tackle smp nice again. It is a real issue for users now,
though, so it would be good if we can have a calmer period in the future to do
this (smp nice) by itself.
These are the four patches, Andrew:
sched-implement-nice-support-across-physical-cpus-on-smp.patch
sched-change_prio_bias_only_if_queued.patch
sched-account_rt_tasks_in_prio_bias.patch
sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
The other HT patch by me is separate and a bugfix so please leave that in.
Cheers,
Con
* Re: 2.6.12-rc6-mm1
2005-06-08 3:33 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-08 3:50 ` Nick Piggin
0 siblings, 0 replies; 101+ messages in thread
From: Nick Piggin @ 2005-06-08 3:50 UTC (permalink / raw)
To: Con Kolivas; +Cc: Andrew Morton, Christoph Lameter, mbligh, lkml
On Wed, 2005-06-08 at 13:33 +1000, Con Kolivas wrote:
> On Wed, 8 Jun 2005 01:17 pm, Nick Piggin wrote:
> > To start with, it looks like the smp-nice patches are broken. Even if
> > they weren't I think it might be a good idea just to put them on hold
> > until we work out what to do with the other sched patches...
>
> I originally said I'd wait till the sched patches settled down before tackling
> it, but it didn't look like that was ever going to happen, and broken nice on
> SMP is a real bug biting people now, so I figured I should just tackle it
> anyway. I don't mind if we just work on it later, though.
>
Well I agree with you that it would be nice to fix it. I
think your approach has good potential, and it is along
the same lines as what I had in mind.
> > Anyway, Con, this is what it is doing on a 64-way Altix running aim7:
> > (compare imbalances, task move rates, wakeup move rates, etc).
>
> Definitely different, I agree. As for the performance impact, the statistics
> alone don't tell us if they're for good or evil, but we can look at it again
> separately when we tackle smp nice again. It is a real issue for users now,
> though, so it would be good if we can have a calmer period in the future to do
> this (smp nice) by itself.
>
True. Fortunately this seems to only come up once a year or so.
Although I guess with the rise and rise of multi threaded and
multi cored CPUs it could become a bigger issue.
> These are the four patches, Andrew:
> sched-implement-nice-support-across-physical-cpus-on-smp.patch
> sched-change_prio_bias_only_if_queued.patch
> sched-account_rt_tasks_in_prio_bias.patch
> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
>
Thanks.
> The other HT patch by me is separate and a bugfix so please leave that in.
>
Yep.
Nick
--
SUSE Labs, Novell Inc.
* Re: 2.6.12-rc6-mm1
2005-06-08 1:59 ` 2.6.12-rc6-mm1 Søren Lott
@ 2005-06-08 5:53 ` Jean Delvare
2005-06-08 7:08 ` 2.6.12-rc6-mm1 Søren Lott
0 siblings, 1 reply; 101+ messages in thread
From: Jean Delvare @ 2005-06-08 5:53 UTC (permalink / raw)
To: Søren Lott; +Cc: Andrew Morton, Greg KH, LKML, LM Sensors
Hi Soren,
> [snip]
>
> > +gregkh-i2c-i2c-Kconfig-update.patch
> > +gregkh-i2c-i2c-pcf8574-cleanup.patch
> > +gregkh-i2c-i2c-adm9240-docs.patch
> > +gregkh-i2c-i2c-device-attr-lm90.patch
> > +gregkh-i2c-i2c-device-attr-lm83.patch
> > +gregkh-i2c-i2c-device-attr-lm63.patch
> > +gregkh-i2c-i2c-device-attr-it87.patch
> > +gregkh-i2c-hwmon-01.patch
> > +gregkh-i2c-hwmon-02.patch
> > +gregkh-i2c-hwmon-03.patch
> >
> > i2c tree updates
> >
> > +i2c-chips-need-hwmon.patch
> > +gregkh-i2c-hwmon-02-sparc64-fix.patch
> >
> > Fix a few things in the i2c tree
>
> [snip]
>
> after those changes i don't get entries in /sys for my W83627THF chip.
>
> (p4c800-D, i875,ICH5)
>
> relevant config parts:
>
> CONFIG_HWMON=y
> CONFIG_I2C=y
> CONFIG_I2C_ISA=y
> CONFIG_I2C_SENSOR=y
> CONFIG_SENSORS_W83627HF=y
Which kernel are you upgrading from?
Is CONFIG_PNPACPI set? If it is, try without it.
If it doesn't work, please try reverting (in reverse order):
gregkh-i2c-hwmon-01.patch
gregkh-i2c-hwmon-02.patch
gregkh-i2c-hwmon-03.patch
i2c-chips-need-hwmon.patch
gregkh-i2c-hwmon-02-sparc64-fix.patch
and see how it goes.
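To make "in reverse order" concrete, here is a small sketch (an illustration, not something from the original mail). It assumes the list above is in the order the patches were applied, and that each patch file is reverted from the top of the kernel tree with `patch -p1 -R`; `revert_commands` is a hypothetical helper name.

```python
# Patches as listed above, assumed to be in the order they were applied.
applied = [
    "gregkh-i2c-hwmon-01.patch",
    "gregkh-i2c-hwmon-02.patch",
    "gregkh-i2c-hwmon-03.patch",
    "i2c-chips-need-hwmon.patch",
    "gregkh-i2c-hwmon-02-sparc64-fix.patch",
]


def revert_commands(patches):
    """Build `patch -R` commands, undoing the most recently applied patch first."""
    return [f"patch -p1 -R < {name}" for name in reversed(patches)]


commands = revert_commands(applied)
for command in commands:
    print(command)
```

Reverting the last-applied patch first matters because later patches may touch lines that earlier ones introduced.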
Thanks,
--
Jean Delvare
* Re: 2.6.12-rc6-mm1
2005-06-08 5:53 ` 2.6.12-rc6-mm1 Jean Delvare
@ 2005-06-08 7:08 ` Søren Lott
2005-06-09 3:47 ` [lm-sensors] 2.6.12-rc6-mm1 Mark M. Hoffman
0 siblings, 1 reply; 101+ messages in thread
From: Søren Lott @ 2005-06-08 7:08 UTC (permalink / raw)
To: Jean Delvare; +Cc: Andrew Morton, Greg KH, LKML, LM Sensors
On Wednesday 08 June 2005 02:53, Jean Delvare wrote:
> Hi Soren,
Hi,
> Which kernel are you upgrading from?
from 2.6.12-rc5-mm2
> Is CONFIG_PNPACPI set? If it is, try without it.
nope, don't even have CONFIG_PNP set.
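For checking such options mechanically, a minimal sketch follows (not part of the original exchange). It relies only on the two line shapes a kernel .config uses, `CONFIG_FOO=value` and `# CONFIG_FOO is not set`; `config_value` is a hypothetical helper, and the sample text is the config fragment quoted earlier in this thread.

```python
def config_value(config_text, option):
    """Return the value of `option` in .config text, or None if unset or absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return None
        # Match on "OPTION=" so CONFIG_I2C does not match CONFIG_I2C_ISA.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


sample = """\
CONFIG_HWMON=y
CONFIG_I2C=y
CONFIG_I2C_ISA=y
CONFIG_I2C_SENSOR=y
CONFIG_SENSORS_W83627HF=y
"""

for name in ("CONFIG_I2C", "CONFIG_PNP", "CONFIG_PNPACPI"):
    print(name, "->", config_value(sample, name))
```

An option that is missing from the file and one marked "is not set" both come back as None here, which is enough for a yes/no check like the one asked above.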
> If it doesn't work, please try reverting (in reverse order):
> gregkh-i2c-hwmon-01.patch
> gregkh-i2c-hwmon-02.patch
> gregkh-i2c-hwmon-03.patch
> i2c-chips-need-hwmon.patch
> gregkh-i2c-hwmon-02-sparc64-fix.patch
> and see how it goes.
yeap, reverting these did the trick, all i2c entries in sysfs are back. :)
> Thanks,
thanks a lot.
cheers.
-SL
* Re: 2.6.12-rc6-mm1
2005-06-08 0:08 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-08 3:17 ` 2.6.12-rc6-mm1 Nick Piggin
@ 2005-06-08 14:15 ` Martin J. Bligh
2005-06-09 23:56 ` 2.6.12-rc6-mm1 Martin J. Bligh
2 siblings, 0 replies; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-08 14:15 UTC (permalink / raw)
To: Andrew Morton, Christoph Lameter; +Cc: linux-kernel
--Andrew Morton <akpm@osdl.org> wrote (on Tuesday, June 07, 2005 17:08:53 -0700):
> Christoph Lameter <clameter@engr.sgi.com> wrote:
>>
>> On Tue, 7 Jun 2005, Andrew Morton wrote:
>>
>> > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
>> >
>> > Oh crap, so it does. That's wrong.
>>
>> Email by you and Linus indicated that 250 should be the default.
>
> Oh, OK. hrm.
>
> Martin, it would be useful if you could determine whether the kernbench
> slowdown was due to the 1000Hz->250Hz change, thanks.
>
> I'm assuming it was the CPU scheduler patches. There are 36 of them ;)
It's actually worse with HZ=1000 ... so I think we still have another problem,
probably with the scheduler patches. (The one marked -mm1+p4947 in blue is the
patched one.)
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
I can back out various patches ... are all the scheduler patches starting
in sched.* or something equally obvious? If not, a list of what to blat
would help me ... or I'll do a crapshoot, and see what falls out ;-)
M.
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (4 preceding siblings ...)
2005-06-08 1:59 ` 2.6.12-rc6-mm1 Søren Lott
@ 2005-06-08 14:22 ` Andy Whitcroft
2005-06-08 20:01 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-09 4:27 ` 2.6.12-rc6-mm1 Andrey Panin
2005-06-08 14:33 ` BUG in i2c_detach_client Andrew James Wade
` (3 subsequent siblings)
9 siblings, 2 replies; 101+ messages in thread
From: Andy Whitcroft @ 2005-06-08 14:22 UTC (permalink / raw)
To: Andrew Morton, Andrey Panin; +Cc: linux-kernel
We've been seeing an early boot hang on IBM x-series (at least on an
x440) with -rc6-mm1. Finally got hold of a box to go search for this
and it seems that backing out the three patches below fixes it.
515 dmi-move-acpi-boot-quirk.patch
516 dmi-move-acpi-sleep-quirk.patch
517 dmi-remove-central-blacklist.patch
I am pretty sure it is actually the first one (that's where my bisection
search pointed) but I had to drop the other two to back it out. Anyhow,
2.6.12-rc6-mm1 boots on an x440 with these backed out.
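For readers unfamiliar with bisecting a patch series, the idea can be sketched as below. This is an illustration only, not the procedure actually used here: the patch names, the culprit position and the `is_good` callback (standing in for "build a kernel with the first n patches and see whether it boots") are all hypothetical.

```python
def first_bad_patch(patches, is_good):
    """Return the index of the first patch whose inclusion makes the kernel fail.

    `is_good(n)` reports whether a kernel built with the first `n` patches boots.
    Assumes is_good(0) is True, is_good(len(patches)) is False, and that there is
    a single transition from good to bad.
    """
    lo, hi = 0, len(patches)  # first `lo` patches: good; first `hi` patches: bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(mid):
            lo = mid
        else:
            hi = mid
    return hi - 1


# Hypothetical series of 600 patches with the culprit at index 514.
series = [f"patch-{i:03d}.patch" for i in range(600)]
culprit_index = 514
tested = []


def boots(n):
    """Pretend test: the kernel boots as long as the culprit is not applied."""
    tested.append(n)
    return n <= culprit_index


found = first_bad_patch(series, boots)
print(series[found], "found after", len(tested), "test boots")
```

Each step halves the range of suspects, so a series of a few hundred patches needs only around ten test boots. If later patches depend on the culprit, they have to be dropped with it, as described above.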
Cheers.
-apw
* BUG in i2c_detach_client
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (5 preceding siblings ...)
2005-06-08 14:22 ` 2.6.12-rc6-mm1 Andy Whitcroft
@ 2005-06-08 14:33 ` Andrew James Wade
2005-06-08 16:21 ` Jean Delvare
2005-06-08 21:26 ` Andrew Morton
2005-06-11 11:51 ` 2.6.12-rc6-mm1 Benoit Boissinot
` (2 subsequent siblings)
9 siblings, 2 replies; 101+ messages in thread
From: Andrew James Wade @ 2005-06-08 14:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1348 bytes --]
2.6.12-rc5-mm1 didn't crash.
kernel BUG at include/linux/list.h:166!
invalid operand: 0000 [#1]
PREEMPT
CPU: 0
EIP: 0060:[<c0319cd4>] Not tainted VLI
EFLAGS: 00010a83 (2.6.12-rc6-mm1)
EIP is at i2c_detach_client+0xb4/0x110
eax: dfc0bcc0 ebx: c15fc26c ecx: c15fc264 edx: c04378d0
esi: c15fc14c edi: c0437720 ebp: 00000000 esp: dff81f10
ds: 007b es: 007b ss: 0068
Process swapper (pid: 1, threadinfo=dff80000 task=c14dca00)
Stack: dfff6110 dfc0bdb4 00000286 00000286 c15fc26c c15fc14c c15fc160 ffffffed
c031d512 c15fc160 c03edac1 c15fc26c 00000000 0000002d 00000001 0000002d
c0437720 00000000 c0437c5c 00000001 00000000 c031b100 00000000 00000000
Call Trace:
[<c031d512>] asb100_detect+0x442/0x520
[<c031b100>] i2c_detect+0x240/0x380
[<c031d0d0>] asb100_detect+0x0/0x520
[<c0319889>] i2c_add_driver+0x89/0xc0
[<c047e7eb>] do_initcalls+0x2b/0xc0
[<c015a915>] kern_mount+0x15/0x19
[<c01002b0>] init+0x0/0x110
[<c01002df>] init+0x2f/0x110
[<c0100f28>] kernel_thread_helper+0x0/0x18
[<c0100f2d>] kernel_thread_helper+0x5/0x18
Code: 89 40 04 89 f0 e8 8d 31 fa ff 89 f0 e8 16 34 fa ff ff 47 2c 0f 8e 25 11
00 00 89 d8 e8 56 53 09 00 89 e8 83 c4 10 5b 5e 5f 5d c3 <0f> 0b a6 00 44
56 3c c0 eb 91 0f 0b a5 00 44 56 3c c0 e9 79 ff
<0>Kernel panic - not syncing: Attempted to kill init!
.config attached.
[-- Attachment #2: .config --]
[-- Type: text/plain, Size: 30440 bytes --]
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.12-rc6-mm1
# Wed Jun 8 01:45:54 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
# CONFIG_IKCONFIG is not set
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
#
# Loadable module support
#
# CONFIG_MODULES is not set
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
# CONFIG_SMP is not set
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
# CONFIG_MICROCODE is not set
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_HAVE_DEC_LOCK=y
CONFIG_REGPARM=y
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
#
# Performance-monitoring counters support
#
CONFIG_PERFCTR=y
# CONFIG_PERFCTR_INIT_TESTS is not set
CONFIG_PERFCTR_VIRTUAL=y
CONFIG_PERFCTR_INTERRUPT_SUPPORT=y
CONFIG_PHYSICAL_START=0x100000
# CONFIG_KEXEC is not set
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set
#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
# CONFIG_ACPI_SLEEP is not set
# CONFIG_ACPI_AC is not set
# CONFIG_ACPI_BATTERY is not set
CONFIG_ACPI_BUTTON=y
# CONFIG_ACPI_VIDEO is not set
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_ASUS is not set
# CONFIG_ACPI_IBM is not set
# CONFIG_ACPI_TOSHIBA is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_BUS=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
CONFIG_ACPI_SYSTEM=y
# CONFIG_X86_PM_TIMER is not set
# CONFIG_ACPI_CONTAINER is not set
#
# APM (Advanced Power Management) BIOS Support
#
CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
CONFIG_APM_CPU_IDLE=y
# CONFIG_APM_DISPLAY_BLANK is not set
CONFIG_APM_RTC_IS_GMT=y
# CONFIG_APM_ALLOW_INTS is not set
# CONFIG_APM_REAL_MODE_POWER_OFF is not set
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
# CONFIG_PCIEPORTBUS is not set
# CONFIG_PCI_MSI is not set
# CONFIG_PCI_LEGACY_PROC is not set
# CONFIG_PCI_NAMES is not set
# CONFIG_PCI_DEBUG is not set
CONFIG_ISA_DMA_API=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
#
# PCCARD (PCMCIA/CardBus) support
#
# CONFIG_PCCARD is not set
#
# PCI Hotplug Support
#
# CONFIG_HOTPLUG_PCI is not set
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
# CONFIG_BINFMT_AOUT is not set
CONFIG_BINFMT_MISC=y
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_DEBUG_DRIVER is not set
#
# Connector - unified userspace <-> kernelspace linker
#
CONFIG_CONNECTOR=y
CONFIG_FORK_CONNECTOR=y
#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set
#
# Parallel port support
#
# CONFIG_PARPORT is not set
#
# Plug and Play support
#
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set
#
# Protocols
#
CONFIG_PNPACPI=y
#
# Block devices
#
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=y
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SX8 is not set
# CONFIG_BLK_DEV_UB is not set
# CONFIG_BLK_DEV_RAM is not set
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_LBD is not set
# CONFIG_CDROM_PKTCDVD is not set
#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_AS is not set
# CONFIG_IOSCHED_DEADLINE is not set
CONFIG_IOSCHED_CFQ=y
# CONFIG_ATA_OVER_ETH is not set
#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set
#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
# CONFIG_BLK_DEV_PIIX is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
CONFIG_BLK_DEV_VIA82CXXX=y
# CONFIG_IDE_ARM is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set
#
# SCSI device support
#
CONFIG_SCSI=y
# CONFIG_SCSI_PROC_FS is not set
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
# CONFIG_CHR_DEV_SG is not set
# CONFIG_CHR_DEV_SCH is not set
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_LOGGING is not set
#
# SCSI Transport Attributes
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
# CONFIG_SCSI_ISCSI_ATTRS is not set
#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
# CONFIG_SCSI_SATA is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
CONFIG_SCSI_QLA2XXX=y
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set
#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
# CONFIG_FUSION_SPI is not set
# CONFIG_FUSION_FC is not set
#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set
#
# I2O device support
#
# CONFIG_I2O is not set
#
# Networking support
#
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_TUNNEL is not set
CONFIG_IP_TCPDIAG=y
# CONFIG_IP_TCPDIAG_IPV6 is not set
#
# TCP congestion control
#
CONFIG_TCP_CONG_BIC=y
CONFIG_TCP_CONG_WESTWOOD=y
# CONFIG_TCP_CONG_HTCP is not set
# CONFIG_TCP_CONG_HSTCP is not set
# CONFIG_TCP_CONG_HYBLA is not set
# CONFIG_TCP_CONG_VEGAS is not set
# CONFIG_TCP_CONG_SCALABLE is not set
# CONFIG_IPV6 is not set
# CONFIG_NETFILTER is not set
#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set
# CONFIG_NET_CLS_ROUTE is not set
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_KGDBOE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_IEEE80211 is not set
CONFIG_NETDEVICES=y
# CONFIG_DUMMY is not set
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_NET_SB1000 is not set
#
# ARCnet devices
#
# CONFIG_ARCNET is not set
#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_HP100 is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
# CONFIG_E100 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139CP is not set
CONFIG_8139TOO=y
# CONFIG_8139TOO_PIO is not set
# CONFIG_8139TOO_TUNE_TWISTER is not set
# CONFIG_8139TOO_8129 is not set
# CONFIG_8139_OLD_RX_RESET is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SKGE is not set
# CONFIG_SK98LIN is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2 is not set
#
# Ethernet (10000 Mbit)
#
# CONFIG_CHELSIO_T1 is not set
# CONFIG_IXGB is not set
# CONFIG_S2IO is not set
#
# Token Ring devices
#
# CONFIG_TR is not set
#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set
#
# Wan interfaces
#
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
# CONFIG_NETCONSOLE is not set
#
# ISDN subsystem
#
# CONFIG_ISDN is not set
#
# Telephony Support
#
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1280
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=1024
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_TSDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_INPUT_MOUSE is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
# CONFIG_GAMEPORT is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set
#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
# CONFIG_SERIAL_8250_ACPI is not set
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set
#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_NVRAM is not set
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=y
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
# CONFIG_AGP_INTEL is not set
# CONFIG_AGP_NVIDIA is not set
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_SWORKS is not set
CONFIG_AGP_VIA=y
# CONFIG_AGP_EFFICEON is not set
CONFIG_DRM=y
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_R128 is not set
CONFIG_DRM_RADEON=y
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
# CONFIG_DRM_VIA is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
CONFIG_HPET_MMAP=y
# CONFIG_HANGCHECK_TIMER is not set
#
# TPM devices
#
# CONFIG_TCG_TPM is not set
#
# Hardware Monitoring (Sensors) support
#
CONFIG_HWMON=y
#
# I2C support
#
CONFIG_I2C=y
# CONFIG_I2C_CHARDEV is not set
#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=y
# CONFIG_I2C_ALGOPCF is not set
# CONFIG_I2C_ALGOPCA is not set
#
# I2C Hardware Bus support
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
# CONFIG_I2C_I801 is not set
# CONFIG_I2C_I810 is not set
# CONFIG_I2C_PIIX4 is not set
# CONFIG_I2C_ISA is not set
# CONFIG_I2C_NFORCE2 is not set
# CONFIG_I2C_PARPORT_LIGHT is not set
# CONFIG_I2C_PROSAVAGE is not set
# CONFIG_I2C_SAVAGE4 is not set
# CONFIG_SCx200_ACB is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_VIA is not set
CONFIG_I2C_VIAPRO=y
# CONFIG_I2C_VOODOO3 is not set
# CONFIG_I2C_PCA_ISA is not set
#
# Hardware Sensors Chip support
#
CONFIG_I2C_SENSOR=y
# CONFIG_SENSORS_ADM1021 is not set
# CONFIG_SENSORS_ADM1025 is not set
# CONFIG_SENSORS_ADM1026 is not set
# CONFIG_SENSORS_ADM1031 is not set
# CONFIG_SENSORS_ADM9240 is not set
CONFIG_SENSORS_ASB100=y
# CONFIG_SENSORS_ATXP1 is not set
# CONFIG_SENSORS_DS1621 is not set
# CONFIG_SENSORS_FSCHER is not set
# CONFIG_SENSORS_FSCPOS is not set
# CONFIG_SENSORS_GL518SM is not set
# CONFIG_SENSORS_GL520SM is not set
# CONFIG_SENSORS_IT87 is not set
# CONFIG_SENSORS_LM63 is not set
# CONFIG_SENSORS_LM75 is not set
# CONFIG_SENSORS_LM77 is not set
# CONFIG_SENSORS_LM78 is not set
# CONFIG_SENSORS_LM80 is not set
# CONFIG_SENSORS_LM83 is not set
# CONFIG_SENSORS_LM85 is not set
# CONFIG_SENSORS_LM87 is not set
# CONFIG_SENSORS_LM90 is not set
# CONFIG_SENSORS_LM92 is not set
# CONFIG_SENSORS_MAX1619 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_SMSC47B397 is not set
# CONFIG_SENSORS_SIS5595 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_VIA686A is not set
# CONFIG_SENSORS_W83781D is not set
# CONFIG_SENSORS_W83L785TS is not set
# CONFIG_SENSORS_W83627HF is not set
# CONFIG_SENSORS_W83627EHF is not set
#
# Other I2C Chip support
#
# CONFIG_SENSORS_DS1337 is not set
# CONFIG_SENSORS_EEPROM is not set
# CONFIG_SENSORS_PCF8574 is not set
# CONFIG_SENSORS_PCF8591 is not set
# CONFIG_SENSORS_RTC8564 is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set
#
# Dallas's 1-wire bus
#
# CONFIG_W1 is not set
#
# Misc devices
#
# CONFIG_IBM_ASM is not set
#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# Graphics support
#
CONFIG_FB=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_SOFT_CURSOR=y
# CONFIG_FB_MACMODES is not set
CONFIG_FB_MODE_HELPERS=y
# CONFIG_FB_TILEBLITTING is not set
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_VESA is not set
# CONFIG_VIDEO_SELECT is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I810 is not set
# CONFIG_FB_INTEL is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON_OLD is not set
CONFIG_FB_RADEON=y
CONFIG_FB_RADEON_I2C=y
# CONFIG_FB_RADEON_DEBUG is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_GEODE is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_VIRTUAL is not set
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FONTS=y
# CONFIG_FONT_8x8 is not set
CONFIG_FONT_8x16=y
# CONFIG_FONT_6x11 is not set
# CONFIG_FONT_PEARL_8x8 is not set
# CONFIG_FONT_ACORN_8x8 is not set
# CONFIG_FONT_MINI_4x6 is not set
# CONFIG_FONT_SUN8x16 is not set
# CONFIG_FONT_SUN12x22 is not set
#
# Logo configuration
#
# CONFIG_LOGO is not set
# CONFIG_BACKLIGHT_LCD_SUPPORT is not set
#
# Sound
#
CONFIG_SOUND=y
#
# Advanced Linux Sound Architecture
#
CONFIG_SND=y
CONFIG_SND_TIMER=y
CONFIG_SND_PCM=y
CONFIG_SND_HWDEP=y
CONFIG_SND_RAWMIDI=y
CONFIG_SND_SEQUENCER=y
# CONFIG_SND_SEQ_DUMMY is not set
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=y
CONFIG_SND_PCM_OSS=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_RTCTIMER=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
#
# Generic devices
#
CONFIG_SND_MPU401_UART=y
CONFIG_SND_OPL3_LIB=y
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
#
# PCI devices
#
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_EMU10K1X is not set
# CONFIG_SND_CA0106 is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_ALS4000 is not set
CONFIG_SND_CMIPCI=y
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VX222 is not set
# CONFIG_SND_HDA_INTEL is not set
#
# USB devices
#
# CONFIG_SND_USB_AUDIO is not set
# CONFIG_SND_USB_USX2Y is not set
#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
#
# USB support
#
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB=y
# CONFIG_USB_DEBUG is not set
#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
# CONFIG_USB_BANDWIDTH is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_SUSPEND is not set
# CONFIG_USB_OTG is not set
#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=y
# CONFIG_USB_EHCI_SPLIT_ISO is not set
# CONFIG_USB_EHCI_ROOT_HUB_TT is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_OHCI_HCD is not set
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_SL811_HCD is not set
#
# USB Device Class drivers
#
# CONFIG_USB_AUDIO is not set
# CONFIG_USB_BLUETOOTH_TTY is not set
# CONFIG_USB_MIDI is not set
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set
#
# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support' may also be needed; see USB_STORAGE Help for more information
#
CONFIG_USB_STORAGE=y
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_DATAFAB is not set
# CONFIG_USB_STORAGE_FREECOM is not set
# CONFIG_USB_STORAGE_ISD200 is not set
# CONFIG_USB_STORAGE_DPCM is not set
# CONFIG_USB_STORAGE_USBAT is not set
# CONFIG_USB_STORAGE_SDDR09 is not set
# CONFIG_USB_STORAGE_SDDR55 is not set
# CONFIG_USB_STORAGE_JUMPSHOT is not set
#
# USB Input Devices
#
CONFIG_USB_HID=y
CONFIG_USB_HIDINPUT=y
# CONFIG_HID_FF is not set
# CONFIG_USB_HIDDEV is not set
# CONFIG_USB_AIPTEK is not set
# CONFIG_USB_WACOM is not set
# CONFIG_USB_ACECAD is not set
# CONFIG_USB_KBTAB is not set
# CONFIG_USB_POWERMATE is not set
# CONFIG_USB_MTOUCH is not set
# CONFIG_USB_ITMTOUCH is not set
# CONFIG_USB_EGALAX is not set
# CONFIG_USB_XPAD is not set
# CONFIG_USB_ATI_REMOTE is not set
#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
#
# USB Multimedia devices
#
# CONFIG_USB_DABUSB is not set
#
# Video4Linux support is needed for USB Multimedia device support
#
#
# USB Network Adapters
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set
# CONFIG_USB_MON is not set
#
# USB port drivers
#
#
# USB Serial Converter support
#
# CONFIG_USB_SERIAL is not set
#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_AUERSWALD is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_GOTEMP is not set
# CONFIG_USB_PHIDGETKIT is not set
# CONFIG_USB_PHIDGETSERVO is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_SISUSBVGA is not set
# CONFIG_USB_TEST is not set
#
# USB DSL modem support
#
#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
#
# MMC/SD Card support
#
# CONFIG_MMC is not set
#
# InfiniBand support
#
# CONFIG_INFINIBAND is not set
#
# SN Devices
#
#
# Distributed Lock Manager
#
# CONFIG_DLM is not set
#
# File systems
#
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
# CONFIG_JBD is not set
CONFIG_REISER4_FS=y
# CONFIG_REISER4_DEBUG is not set
CONFIG_REISERFS_FS=y
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
# CONFIG_REISERFS_FS_XATTR is not set
# CONFIG_JFS_FS is not set
#
# XFS support
#
# CONFIG_XFS_FS is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_INOTIFY=y
# CONFIG_QUOTA is not set
CONFIG_DNOTIFY=y
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set
#
# Caches
#
# CONFIG_FSCACHE is not set
# CONFIG_FUSE_FS is not set
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
# CONFIG_ZISOFS is not set
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
# CONFIG_NTFS_FS is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
# CONFIG_DEVFS_FS is not set
# CONFIG_DEVPTS_FS_XATTR is not set
CONFIG_TMPFS=y
# CONFIG_TMPFS_XATTR is not set
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_RAMFS=y
# CONFIG_CONFIGFS_FS is not set
# CONFIG_RELAYFS_FS is not set
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
#
# Network File Systems
#
# CONFIG_NFS_FS is not set
# CONFIG_NFSD is not set
# CONFIG_SMB_FS is not set
CONFIG_CIFS=y
# CONFIG_CIFS_STATS is not set
# CONFIG_CIFS_XATTR is not set
# CONFIG_CIFS_EXPERIMENTAL is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
# CONFIG_9P_FS is not set
#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
CONFIG_NLS_UTF8=y
#
# Profiling support
#
# CONFIG_PROFILING is not set
#
# Kernel hacking
#
# CONFIG_PRINTK_TIME is not set
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_LOG_BUF_SHIFT=14
CONFIG_DETECT_SOFTLOCKUP=y
# CONFIG_SCHEDSTATS is not set
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_PREEMPT=y
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
# CONFIG_PAGE_OWNER is not set
# CONFIG_DEBUG_FS is not set
# CONFIG_FRAME_POINTER is not set
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_STACKOVERFLOW=y
# CONFIG_KPROBES is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_4KSTACKS is not set
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
# CONFIG_KGDB is not set
#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY is not set
#
# Cryptographic options
#
# CONFIG_CRYPTO is not set
#
# Hardware crypto devices
#
#
# Library routines
#
# CONFIG_CRC_CCITT is not set
CONFIG_CRC32=y
# CONFIG_LIBCRC32C is not set
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_PC=y
* Re: BUG in i2c_detach_client
2005-06-08 14:33 ` BUG in i2c_detach_client Andrew James Wade
@ 2005-06-08 16:21 ` Jean Delvare
2005-06-08 21:26 ` Andrew Morton
1 sibling, 0 replies; 101+ messages in thread
From: Jean Delvare @ 2005-06-08 16:21 UTC (permalink / raw)
To: Andrew James Wade; +Cc: Andrew Morton, LKML
Hi Andrew,
> 2.6.12-rc5-mm1 didn't crash.
>
> kernel BUG at include/linux/list.h:166!
> invalid operand: 0000 [#1]
> PREEMPT
> CPU: 0
> EIP: 0060:[<c0319cd4>] Not tainted VLI
> EFLAGS: 00010a83 (2.6.12-rc6-mm1)
> EIP is at i2c_detach_client+0xb4/0x110
> eax: dfc0bcc0 ebx: c15fc26c ecx: c15fc264 edx: c04378d0
> esi: c15fc14c edi: c0437720 ebp: 00000000 esp: dff81f10
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 1, threadinfo=dff80000 task=c14dca00)
> Stack: dfff6110 dfc0bdb4 00000286 00000286 c15fc26c c15fc14c c15fc160 ffffffed
> c031d512 c15fc160 c03edac1 c15fc26c 00000000 0000002d 00000001 0000002d
> c0437720 00000000 c0437c5c 00000001 00000000 c031b100 00000000 00000000
> Call Trace:
> [<c031d512>] asb100_detect+0x442/0x520
> [<c031b100>] i2c_detect+0x240/0x380
> [<c031d0d0>] asb100_detect+0x0/0x520
> [<c0319889>] i2c_add_driver+0x89/0xc0
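For anyone decoding the trace above: each entry has the form symbol+0xoffset/0xsize, so "i2c_detach_client+0xb4/0x110" means the address lies 0xb4 bytes into a function that is 0x110 bytes long. A minimal sketch of splitting such an entry follows; parse_trace_sym() and struct trace_sym are hypothetical names made up for illustration, not kernel interfaces.

```c
#include <assert.h>
#include <stdio.h>

/* One decoded "symbol+0xoffset/0xsize" entry from an oops or call trace. */
struct trace_sym {
	char name[64];		/* function name, e.g. "i2c_detach_client" */
	unsigned long offset;	/* bytes from the start of the function */
	unsigned long size;	/* total size of the function in bytes */
};

/*
 * Hypothetical helper, not a kernel API.
 * Returns 0 on success, -1 if the text does not match the expected form.
 */
int parse_trace_sym(const char *text, struct trace_sym *out)
{
	assert(text != NULL && out != NULL);

	/* %63[^+] reads the name up to '+'; %lx accepts an optional 0x prefix. */
	if (sscanf(text, "%63[^+]+%lx/%lx",
		   out->name, &out->offset, &out->size) != 3)
		return -1;
	return 0;
}
```

Feeding it "asb100_detect+0x442/0x520" would give offset 0x442 and size 0x520, i.e. a return address fairly late in asb100_detect().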
I suspect you didn't "make oldconfig" before compiling 2.6.12-rc6-mm1.
You should have CONFIG_HWMON=y in .config, and I don't see it. Note that
I can't explain why it results in the BUG right above, but it must be
related.
If "make oldconfig" doesn't help, try reverting:
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/broken-out/gregkh-i2c-hwmon-03.patch
Thanks,
--
Jean Delvare
* Re: 2.6.12-rc6-mm1
2005-06-08 14:22 ` 2.6.12-rc6-mm1 Andy Whitcroft
@ 2005-06-08 20:01 ` Andrew Morton
2005-06-08 23:14 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-09 4:27 ` 2.6.12-rc6-mm1 Andrey Panin
1 sibling, 1 reply; 101+ messages in thread
From: Andrew Morton @ 2005-06-08 20:01 UTC (permalink / raw)
To: Andy Whitcroft; +Cc: pazke, linux-kernel
Andy Whitcroft <apw@shadowen.org> wrote:
>
> We've been seeing an early boot hang on IBM x-series (at least on an
> x440) with -rc6-mm1. Finally got hold of a box to go search for this
> and it seems that backing out the three patches below fixes it.
>
> 515 dmi-move-acpi-boot-quirk.patch
> 516 dmi-move-acpi-sleep-quirk.patch
> 517 dmi-remove-central-blacklist.patch
Thanks for taking the time to do that - it helps enormously.
The patches aren't terribly important - I'll drop them if nobody sees the
problem. It might be an incorrect __init/__initdata/etc marking. But that
wouldn't cause an "early" boot hang...
* Re: BUG in i2c_detach_client
2005-06-08 14:33 ` BUG in i2c_detach_client Andrew James Wade
2005-06-08 16:21 ` Jean Delvare
@ 2005-06-08 21:26 ` Andrew Morton
2005-06-08 22:56 ` Andrew James Wade
2005-06-09 7:47 ` Jean Delvare
1 sibling, 2 replies; 101+ messages in thread
From: Andrew Morton @ 2005-06-08 21:26 UTC (permalink / raw)
To: Andrew James Wade; +Cc: linux-kernel, Jean Delvare, Greg KH
Andrew James Wade <ajwade@cpe00095b3131a0-cm0011ae8cd564.cpe.net.cable.rogers.com> wrote:
>
> 2.6.12-rc5-mm1 didn't crash.
>
> kernel BUG at include/linux/list.h:166!
> invalid operand: 0000 [#1]
> PREEMPT
> CPU: 0
> EIP: 0060:[<c0319cd4>] Not tainted VLI
> EFLAGS: 00010a83 (2.6.12-rc6-mm1)
> EIP is at i2c_detach_client+0xb4/0x110
> eax: dfc0bcc0 ebx: c15fc26c ecx: c15fc264 edx: c04378d0
> esi: c15fc14c edi: c0437720 ebp: 00000000 esp: dff81f10
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 1, threadinfo=dff80000 task=c14dca00)
> Stack: dfff6110 dfc0bdb4 00000286 00000286 c15fc26c c15fc14c c15fc160 ffffffed
> c031d512 c15fc160 c03edac1 c15fc26c 00000000 0000002d 00000001 0000002d
> c0437720 00000000 c0437c5c 00000001 00000000 c031b100 00000000 00000000
> Call Trace:
> [<c031d512>] asb100_detect+0x442/0x520
Were there no interesting printks before this BUG hit?
It's due to the kernel running list_del() on a list_head which isn't on a list.
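To illustrate that failure mode, here is a userspace sketch of a circular doubly-linked list whose delete refuses entries that are not linked. It is only an approximation under stated assumptions: the real check at list.h:166 in this tree is not reproduced here, the kernel poisons deleted pointers rather than setting them to NULL, and list_entry_is_linked() and list_del_checked() are hypothetical names.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. */
void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical check: do both neighbours point back at entry? */
int list_entry_is_linked(const struct list_head *entry)
{
	return entry->prev != NULL && entry->next != NULL &&
	       entry->prev->next == entry && entry->next->prev == entry;
}

/*
 * Hypothetical checked delete: returns 0 on success, or -1 in the
 * situation where a debugging kernel would hit BUG() instead.
 */
int list_del_checked(struct list_head *entry)
{
	if (!list_entry_is_linked(entry))
		return -1;
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	/* Simplification: the kernel stores poison values, not NULL. */
	entry->next = NULL;
	entry->prev = NULL;
	return 0;
}
```

Deleting an entry that was never added, or deleting the same entry twice, returns -1 here. That is the kind of double detach an error path can produce when it undoes a step that never happened or was already undone.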
Seems there is an error-path bug in that driver, but I don't think the fix
will fix it. Please test?
From: Andrew Morton <akpm@osdl.org>
Fix error backing-out code in asb100.c
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---
drivers/i2c/chips/asb100.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff -puN drivers/i2c/chips/asb100.c~asb100-fix drivers/i2c/chips/asb100.c
--- 25/drivers/i2c/chips/asb100.c~asb100-fix 2005-06-08 14:23:52.000000000 -0700
+++ 25-akpm/drivers/i2c/chips/asb100.c 2005-06-08 14:24:13.000000000 -0700
@@ -690,18 +690,20 @@ static int asb100_detect_subclients(stru
if ((err = i2c_attach_client(data->lm75[0]))) {
dev_err(&new_client->dev, "subclient %d registration "
"at address 0x%x failed.\n", i, data->lm75[0]->addr);
- goto ERROR_SC_2;
+ goto ERROR_SC_3;
}
if ((err = i2c_attach_client(data->lm75[1]))) {
dev_err(&new_client->dev, "subclient %d registration "
"at address 0x%x failed.\n", i, data->lm75[1]->addr);
- goto ERROR_SC_3;
+ goto ERROR_SC_4;
}
return 0;
/* Undo inits in case of errors */
+ERROR_SC_4:
+ i2c_detach_client(data->lm75[1]);
ERROR_SC_3:
i2c_detach_client(data->lm75[0]);
ERROR_SC_2:
_
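The patch above changes which label each failure jumps to. As a general illustration of the goto-unwind convention it relies on (and not a statement about which version of asb100.c is right), the sketch below sets up three steps in order and, on failure, jumps to a label that tears down only the steps that had already succeeded. The names setup(), teardown() and bring_up() are hypothetical.

```c
enum { STEP_A, STEP_B, STEP_C, NSTEPS };

/* Records which steps are currently set up. */
struct state {
	int up[NSTEPS];
};

/* Pretend to set up one step; fails when step == fail_at. */
int setup(struct state *s, int step, int fail_at)
{
	if (step == fail_at)
		return -1;
	s->up[step] = 1;
	return 0;
}

void teardown(struct state *s, int step)
{
	s->up[step] = 0;
}

/* Returns 0 with all steps up, or an error with nothing left up. */
int bring_up(struct state *s, int fail_at)
{
	int err;

	if ((err = setup(s, STEP_A, fail_at)))
		goto err_none;		/* nothing to undo yet */
	if ((err = setup(s, STEP_B, fail_at)))
		goto err_a;		/* only A succeeded */
	if ((err = setup(s, STEP_C, fail_at)))
		goto err_b;		/* A and B succeeded */
	return 0;

	/* Labels fall through, undoing steps in reverse order. */
err_b:
	teardown(s, STEP_B);
err_a:
	teardown(s, STEP_A);
err_none:
	return err;
}
```

The invariant worth checking in any such function is that a failing step jumps to the label that undoes everything before it and nothing at or after it.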
* Re: BUG in i2c_detach_client
2005-06-08 21:26 ` Andrew Morton
@ 2005-06-08 22:56 ` Andrew James Wade
2005-06-08 23:32 ` Andrew Morton
2005-06-09 7:47 ` Jean Delvare
1 sibling, 1 reply; 101+ messages in thread
From: Andrew James Wade @ 2005-06-08 22:56 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Jean Delvare, Greg KH
On June 8, 2005 05:26 pm, Andrew Morton wrote:
> Were there no interesting printks before this BUG hit?
Nope :-(
> It's due to the kernel running list_del() on a list_head which isn't on a list.
>
> Seems there is an error-path bug in that driver, but I don't think the fix
> will fix it. Please test?
Will do. But I don't think that's it. I've been adding printks to determine the
execution path and it goes through the ERROR3 path in asb100_detect(), which means
AFAICT that the error path in asb100_detect_subclients() isn't taken:
ERROR3:
i2c_detach_client(data->lm75[0]);
kfree(data->lm75[1]);
kfree(data->lm75[0]);
ERROR2:
i2c_detach_client(new_client); // <--- BUG() in here.
ERROR1:
kfree(data);
ERROR0:
return err;
But the ERROR2 path does work despite the location of the bug. If I apply:
--- 2.6.12-rc6-mm1/drivers/i2c/chips/asb100.c 2005-06-08 17:46:02.123864000 -0400
+++ linux/drivers/i2c/chips/asb100.c 2005-06-08 17:59:21.461819500 -0400
@@ -811,6 +811,7 @@ static int asb100_detect(struct i2c_adap
if ((err = i2c_attach_client(new_client)))
goto ERROR1;
+ goto ERROR2;
/* Attach secondary lm75 clients */
if ((err = asb100_detect_subclients(adapter, address, kind,
new_client)))
@@ -874,7 +875,6 @@ static int asb100_detach_client(struct i
{
int err;
- hwmon_device_unregister(client->class_dev);
if ((err = i2c_detach_client(client))) {
dev_err(&client->dev, "client deregistration failed; "
No bug(). But the ERROR3 path doesn't:
--- 2.6.12-rc6-mm1/drivers/i2c/chips/asb100.c 2005-06-08 17:46:02.123864000 -0400
+++ linux/drivers/i2c/chips/asb100.c 2005-06-08 17:58:15.749712750 -0400
@@ -815,6 +815,7 @@ static int asb100_detect(struct i2c_adap
if ((err = asb100_detect_subclients(adapter, address, kind,
new_client)))
goto ERROR2;
+ goto ERROR3;
/* Initialize the chip */
asb100_init_client(new_client);
@@ -874,7 +875,6 @@ static int asb100_detach_client(struct i
{
int err;
- hwmon_device_unregister(client->class_dev);
if ((err = i2c_detach_client(client))) {
dev_err(&client->dev, "client deregistration failed; "
causes a BUG(). I've yet to track the problem down further. Unfortunately
I have no more time today; I'll play with it again tomorrow.
Regards,
Andrew
* Re: 2.6.12-rc6-mm1
2005-06-08 20:01 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-08 23:14 ` Martin J. Bligh
2005-06-08 23:22 ` 2.6.12-rc6-mm1 Andrew Morton
0 siblings, 1 reply; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-08 23:14 UTC (permalink / raw)
To: Andrew Morton, Andy Whitcroft; +Cc: pazke, linux-kernel
--On Wednesday, June 08, 2005 13:01:17 -0700 Andrew Morton <akpm@osdl.org> wrote:
> Andy Whitcroft <apw@shadowen.org> wrote:
>>
>> We've been seeing an early boot hang on IBM x-series (at least on an
>> x440) with -rc6-mm1. Finally got hold of a box to go search for this
>> and it seems that backing out the three patches below fixes it.
>>
>> 515 dmi-move-acpi-boot-quirk.patch
>> 516 dmi-move-acpi-sleep-quirk.patch
>> 517 dmi-remove-central-blacklist.patch
>
> Thanks for taking the time to do that - it helps enormously.
>
> The patches aren't terribly important - I'll drop them if nobody sees the
> problem. It might be an incorrect __init/__initdata/etc marking. But that
> wouldn't cause an "early" boot hang...
That does indeed make it boot. However ... once it's booted it seems
to hit another problem, a hang condition ;-( I suspect it's unrelated.
The box is still up and responsive, but cp spins.
I'm still chasing the other boot/hang double problem (amd64), so can't
really look at this right now, but if anyone has any bright ideas they
want me to try, or wants more info, let me know (machine is still hung
in that state).
Some snippets:
ps -ef:
root 10980 10979 0 09:02 ? 00:00:00 /bin/bash /usr/local/autobench/scripts/run test kernbench 32 5 -m 2
root 11060 10980 0 09:02 ? 00:00:00 /bin/bash /usr/local/autobench/scripts/getsysinfo before /usr/local/autobench/logs/k
root 11219 11060 0 09:02 ? 00:00:00 /bin/bash /usr/local/autobench/scripts/archive_dir /proc/scsi /usr/local/autobench/l
root 11221 11219 99 09:02 ? 04:13:26 cp -r /proc/scsi/aic7xxx /proc/scsi/device_info /proc/scsi/scsi /usr/local/autobench
alt+sysrq+t
getsysinfo S CB5260CC 0 11060 10980 11219 (NOTLB)
d5fc1f40 00000082 fffffe00 cb5260cc 00000000 c011259b 2691b900 003d08e4
 080fa558 00000001 d5fc1f38 c04715c0 c0473080 bfcb43b8 d740e000 cb526020
 00000001 cb526020 00000007 d5fc1fbc 0008b824 26cec200 003d08e4 c02fc928
Call Trace:
 [<c011259b>] do_page_fault+0x193/0x60f
 [<c011d584>] do_wait+0x2a4/0x358
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c011d6c6>] sys_wait4+0x26/0x38
 [<c011d6ee>] sys_waitpid+0x16/0x1a
 [<c0102a19>] syscall_call+0x7/0xb
archive_dir S CBB810CC 0 11219 11060 11221 (NOTLB)
d7793f40 00000082 fffffe00 cbb810cc 00000000 c011259b 28b70a00 003d08e4
 080fa158 00000001 d7793f38 c04715c0 c0473080 bfc51a68 c040e000 cbb81020
 00000001 cbb81020 00000007 d7793fbc 00000000 28b70a00 003d08e4 c02fc928
Call Trace:
 [<c011259b>] do_page_fault+0x193/0x60f
 [<c011d584>] do_wait+0x2a4/0x358
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c011d6c6>] sys_wait4+0x26/0x38
 [<c011d6ee>] sys_waitpid+0x16/0x1a
 [<c0102a19>] syscall_call+0x7/0xb
cp R running 0 11221 11219 (NOTLB)
sleep S D77A1F68 0 11906 1409 (NOTLB)
d77a1f58 00000086 0039a67c d77a1f68 bfade9d8 272d8698 b605a700 003d16b7
 d5c1e804 d6ecdbac d77a1f50 c04715c0 c0473080 d77a1fbc d6ecd814 d76d3020
 00000282 c0121f31 0039a67c c107d0e0 00000000 b605a700 003d16b7 d77a1f68
Call Trace:
 [<c0121f31>] lock_timer_base+0x19/0x3c
 [<c02ef4db>] schedule_timeout+0x7b/0x9c
 [<c0122904>] process_timeout+0x0/0xc
 [<c01229fb>] sys_nanosleep+0xdb/0x158
 [<c0102a19>] syscall_call+0x7/0xb
BUG: soft lockup detected on CPU#0!
Pid: 0, comm: swapper
EIP: 0060:[<c02efcd9>] CPU: 0
EIP is at _spin_unlock_irqrestore+0x5/0x8
 EFLAGS: 00000292 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: c03b9b84 EBX: c03b9ad4 ECX: 0a000000 EDX: 00000292
ESI: 00000074 EDI: c040ffa4 EBP: d5c16000 DS: 007b ES: 007b
CR0: 8005003b CR2: 080f9008 CR3: 16dd0300 CR4: 000006b0
 [<c020e729>] __handle_sysrq+0x121/0x128
 [<c020e74f>] handle_sysrq+0x1f/0x24
 [<c021dda4>] receive_chars+0x16c/0x270
 [<c021e0a2>] serial8250_interrupt+0x66/0xe4
 [<c01320f0>] handle_IRQ_event+0x28/0x58
 [<c0132203>] __do_IRQ+0xe3/0x134
 [<c0104b4b>] do_IRQ+0x1b/0x28
 [<c01033d6>] common_interrupt+0x1a/0x20
 [<c0100bb0>] default_idle+0x0/0x2c
 [<c0100bd3>] default_idle+0x23/0x2c
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c01002c8>] rest_init+0x28/0x2c
 [<c0410899>] start_kernel+0x19d/0x1a0
alt+sysrq+p does weird stuff (is that new patch in your tree, Andrew?
It doesn't seem to interact well with the other NMI code)
Command> break
SysRq : Show Regs
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 0
EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: c040e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: b7e3f5a0 CR3: 16dd0300 CR4: 000006b0
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c01002c8>] rest_init+0x28/0x2c
 [<c0410899>] start_kernel+0x19d/0x1a0
 Uhhuh. NMI received for unknown reason 00 on CPU 1.
Uhhuh. NMI received for unknown reason 00 on CPU 16.
Dazed and confused, but trying to continue
Uhhuh. NMI received for unknown reason 00 on CPU 3.
Do you have a strange power saving mode enabled?
Uhhuh. NMI received for unknown reason 00 on CPU 17.
----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 16
EIP is at default_idle+0x23/0x2c
Dazed and confused, but trying to continue
Uhhuh. NMI received for unknown reason 00 on CPU 2.
Uhhuh. NMI received for unknown reason 00 on CPU 18.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
Uhhuh. NMI received for unknown reason 00 on CPU 19.
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7420000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
CR0: 8005003b CR2: 00000000 CR3: 17771800 CR4: 000006b0
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 6.
Uhhuh. NMI received for unknown reason 00 on CPU 20.
 start_secondary+0x13d/0x140
Dazed and confused, but trying to continue
 ----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 18
Uhhuh. NMI received for unknown reason 00 on CPU 10.
EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7426000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: b7f25d9c CR3: 00474000 CR4: 000006b0
 [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 29.
 cpu_idle+0x7b/0x8c
 [<c010e79d>] start_secondary+0x13d/0x140
----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 2
 EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7400000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: b7edeb00 CR3: 00474000 CR4: 000006b0
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 23.
 start_secondary+0x13d/0x140
Uhhuh. NMI received for unknown reason 00 on CPU 7.
 ----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 3
EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7402000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
Do you have a strange power saving mode enabled?
CR0: 8005003b CR2: b7f95438 CR3: 17771800 CR4: 000006b0
 [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 4.
 cpu_idle+0x7b/0x8c
 [<c010e79d>] start_secondary+0x13d/0x140
 ----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 17
Uhhuh. NMI received for unknown reason 00 on CPU 5.
Dazed and confused, but trying to continue
EIP is at default_idle+0x23/0x2c
Uhhuh. NMI received for unknown reason 00 on CPU 14.
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7424000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
 [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 9.
 cpu_idle+0x7b/0x8c
Uhhuh. NMI received for unknown reason 00 on CPU 25.
 [<c010e79d>] start_secondary+0x13d/0x140Dazed and confused, but trying to continue
----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 19
 EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7428000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: b7f30d9c CR3: 00474000 CR4: 000006b0
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 13.
 start_secondary+0x13d/0x140
 Do you have a strange power saving mode enabled?
----------- IPI show regs -----------Uhhuh. NMI received for unknown reason 00 on CPU 8.
Uhhuh. NMI received for unknown reason 00 on CPU 11.
Dazed and confused, but trying to continue
Dazed and confused, but trying to continue
Dazed and confused, but trying to continue
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 20
Dazed and confused, but trying to continue
Uhhuh. NMI received for unknown reason 00 on CPU 22.
Uhhuh. NMI received for unknown reason 00 on CPU 26.
EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
ESI: d742a000 EDI: c0470300 EBP: c0470300Uhhuh. NMI received for unknown reason 00 on CPU 30.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
Do you have a strange power saving mode enabled?
Do you have a strange power saving mode enabled?
 DS: 007b ES: 007b
CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
Uhhuh. NMI received for unknown reason 00 on CPU 21.
 [<c0100ca3>]Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
 cpu_idle+0x7b/0x8c
Dazed and confused, but trying to continue
 [<c010e79d>]Do you have a strange power saving mode enabled?
 start_secondary+0x13d/0x140
Do you have a strange power saving mode enabled?
Uhhuh. NMI received for unknown reason 00 on CPU 27.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
Uhhuh. NMI received for unknown reason 00 on CPU 24.
 ----------- IPI show regs -----------
Pid: 11221, comm: cp
EIP: 0060:[<c02efbdc>] CPU: 5
Do you have a strange power saving mode enabled?
EIP is at _spin_lock_irqsave+0x14/0x20
 EFLAGS: 00000286 Not tainted (2.6.12-rc6-mm1-autokern1)
Dazed and confused, but trying to continue
EAX: 00000286 EBX: d6ce4800 ECX: c03cabe0 EDX: c049ba84
^M^@ESI: ffffffea EDI: d55f8000 EBP: d55f8000 DS: 007b ES: 007b
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@CR0: 80050033 CR2: bfc7d2fc CR3: 16dd02e0 CR4: 000006b0
^M^@ [<c0270377>]Uhhuh. NMI received for unknown reason 00 on CPU 31.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 12.
^M^@ ahc_linux_proc_info+0x27/0x212
^M^@Do you have a strange power saving mode enabled?
^M^@Do you have a strange power saving mode enabled?
^M^@ [<c0149052>]Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ page_add_anon_rmap+0x62/0x68
^M^@ [<c0144358>]Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 15.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 28.
^M^@Dazed and confused, but trying to continue
^M^@ do_anonymous_page+0x1f0/0x21c
^M^@ [<c0144370>]Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ do_anonymous_page+0x208/0x21c
^M^@Dazed and confused, but trying to continue
^M^@ [<c01443d9>]Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Do you have a strange power saving mode enabled?
^M^@ do_no_page+0x55/0x3e8
^M^@ [<c01372b5>] prep_new_page+0x49/0x50
^M^@ [<c0137973>] buffered_rmqueue+0x16f/0x1d0
^M^@ [<c0137e1b>] __alloc_pages+0x3bb/0x3c8
^M^@ [<c0257cdb>] proc_scsi_read+0x2b/0x44
^M^@ [<c0182f28>] proc_file_read+0xec/0x200
^M^@ [<c0152ff9>] vfs_read+0x91/0x12c
^M^@ [<c01532e4>] sys_read+0x40/0x6c
^M^@ [<c0102a19>] syscall_call+0x7/0xb
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 7
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d740c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 4
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7404000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 080f9c48 CR3: 17771320 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c02efcae>] CPU: 30
^M^@EIP is at _spin_lock+0xa/0x10
^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: c1050aa0 EBX: c1050aa0 ECX: d7463ea8 EDX: 00000003
^M^@ESI: c10d9620 EDI: c10d9fe0 EBP: d7463eb0 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7eea900 CR3: 00474000 CR4: 000006b0
^M^@ [<c011583b>] load_balance+0xcf/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c02efb6a>] CPU: 15
^M^@EIP is at _spin_trylock+0x6/0x14
^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0
^M^@ESI: c10875a0 EDI: c1087f60 EBP: d741fe84 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0114fda>] double_lock_balance+0x12/0x48
^M^@ [<c01157e4>] load_balance+0x78/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 21
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d742c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 14
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d741c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7e64070 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 27
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d745c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f66d9c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 8
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d740e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 080f133c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 25
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7436000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f74d9c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c02efb6a>] CPU: 29
^M^@EIP is at _spin_trylock+0x6/0x14
^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000001 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0
^M^@ESI: c10d3ea0 EDI: c10d4860 EBP: d7461e84 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0114fda>] double_lock_balance+0x12/0x48
^M^@ [<c01157e4>] load_balance+0x78/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 31
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7464000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 24
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7434000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 10
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7412000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7ea6920 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 26
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7438000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c01154a3>] CPU: 13
^M^@EIP is at find_busiest_group+0x103/0x2f8
^M^@ EFLAGS: 00000086 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000005 EBX: 00000005 ECX: c1050aa0 EDX: 00000000
^M^@ESI: c04813ac EDI: 00000200 EBP: d741be7c DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7e7e070 CR3: 00474000 CR4: 000006b0
^M^@ [<c01157a2>] load_balance+0x36/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 28
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d745e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c011e897>] CPU: 12
^M^@EIP is at __do_softirq+0x47/0x100
^M^@ EFLAGS: 00000006 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: c0470380 EBX: c0476020 ECX: 00000030 EDX: c1075ce0
^M^@ESI: 00000002 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f54000 CR3: 00474000 CR4: 000006b0
^M^@ [<c011e97f>] do_softirq+0x2f/0x34
^M^@ [<c011ea24>] irq_exit+0x34/0x38
^M^@ [<c010f601>] smp_apic_timer_interrupt+0xdd/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 9
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7410000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f1d900 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 11
^M^@ EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7414000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 23
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7432000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 6
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7408000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 22
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7430000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 1
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: c13fc000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7ee1d9c CR3: 17771640 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ^M
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 23:14 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-08 23:22 ` Andrew Morton
2005-06-08 23:34 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 1 reply; 101+ messages in thread
From: Andrew Morton @ 2005-06-08 23:22 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: apw, pazke, linux-kernel
"Martin J. Bligh" <mbligh@mbligh.org> wrote:
>
> alt+sysrq+p does weird stuff (is that new patch in your tree Andrew?
> doesn't seem to interact with the other NMI code well)
What patch?
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: BUG in i2c_detach_client
2005-06-08 22:56 ` Andrew James Wade
@ 2005-06-08 23:32 ` Andrew Morton
2005-06-09 7:52 ` Jean Delvare
0 siblings, 1 reply; 101+ messages in thread
From: Andrew Morton @ 2005-06-08 23:32 UTC (permalink / raw)
To: Andrew James Wade; +Cc: linux-kernel, khali, greg
Andrew James Wade <ajwade@cpe00095b3131a0-cm0011ae8cd564.cpe.net.cable.rogers.com> wrote:
>
> On June 8, 2005 05:26 pm, Andrew Morton wrote:
> > Were there no interesting printks before this BUG hit?
> Nope :-(
>
> > It's due to the kernel running list_del() on a list_head which isn't on a list.
> >
> > Seems there is an error-path bug in that driver, but I don't think the fix
> > will fix it. Please test?
> Will do. But I don't think that's it. I've been adding printks to determine the
> execution path and it goes through the ERROR3 path in asb100_detect(), which means
> AFAICT that the error path in asb100_detect_subclients() isn't taken:
>
> ERROR3:
> i2c_detach_client(data->lm75[0]);
> kfree(data->lm75[1]);
> kfree(data->lm75[0]);
> ERROR2:
> i2c_detach_client(new_client); // <--- BUG() in here.
> ERROR1:
> kfree(data);
> ERROR0:
> return err;
hm, the tree I have here doesn't do that. What kernel do you have there?
I suggest you work against
http://www.zip.com.au/~akpm/linux/patches/stuff/x.bz2 which is a patch
against 2.6.12-rc6 containing everybody's latest everything.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 23:22 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-08 23:34 ` Martin J. Bligh
2005-06-09 7:17 ` 2.6.12-rc6-mm1 Kirill Korotaev
0 siblings, 1 reply; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-08 23:34 UTC (permalink / raw)
To: Andrew Morton; +Cc: apw, pazke, linux-kernel, dev
--On Wednesday, June 08, 2005 16:22:47 -0700 Andrew Morton <akpm@osdl.org> wrote:
> "Martin J. Bligh" <mbligh@mbligh.org> wrote:
>>
>> alt+sysrq+p does weird stuff (is that new patch in your tree Andrew?
>> doesn't seem to interact with the other NMI code well)
>
> What patch?
Sorry.
nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch
It does seem to work. But probably needs some cleanup for the NMI
errors.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 0:02 ` 2.6.12-rc6-mm1 Christoph Lameter
2005-06-08 0:08 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-09 1:58 ` Lee Revell
1 sibling, 0 replies; 101+ messages in thread
From: Lee Revell @ 2005-06-09 1:58 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andrew Morton, Martin J. Bligh, linux-kernel
On Tue, 2005-06-07 at 17:02 -0700, Christoph Lameter wrote:
> On Tue, 7 Jun 2005, Andrew Morton wrote:
>
> > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
> >
> > Oh crap, so it does. That's wrong.
>
> Email by you and Linus indicated that 250 should be the default.
Wait, does that mean the default HZ is going to be changed in the 2.6.x
timeframe? That's a big user-visible regression, as it makes the
sleep() resolution worse, and would force apps with tight timing
requirements to go back to using the RTC like on 2.4.
Unless, of course, the plan is to merge the high-res timers patch at the
same time.
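
To put rough numbers on the resolution concern, here is a minimal sketch in C. It assumes the usual tick rates of 100 for 2.4 and 1000 for 2.6 on i386 so far, and it assumes a tick-driven sleep is simply rounded up to whole ticks. The helper names tick_usec() and rounded_sleep_usec() are made up for illustration and are not kernel functions.

```c
#include <assert.h>

/* Length of one timer tick in microseconds for a given HZ value. */
static long tick_usec(long hz)
{
    return 1000000L / hz;
}

/*
 * Duration of a tick-driven sleep when the request is rounded up to whole
 * ticks. Assumption: real kernels usually add one more tick so a sleep is
 * never shorter than requested; that extra tick is left out here.
 */
static long rounded_sleep_usec(long request_usec, long hz)
{
    long tick = tick_usec(hz);
    long ticks = (request_usec + tick - 1) / tick;

    return ticks * tick;
}
```

With these assumptions a 1 ms request lasts about 1 ms at HZ=1000, about 4 ms at HZ=250, and about 10 ms at HZ=100.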
Lee
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [lm-sensors] Re: 2.6.12-rc6-mm1
2005-06-08 7:08 ` 2.6.12-rc6-mm1 Søren Lott
@ 2005-06-09 3:47 ` Mark M. Hoffman
0 siblings, 0 replies; 101+ messages in thread
From: Mark M. Hoffman @ 2005-06-09 3:47 UTC (permalink / raw)
To: Søren Lott; +Cc: Jean Delvare, Andrew Morton, Greg KH, LKML, LM Sensors
Hi Soren, et al.:
> On Wednesday 08 June 2005 02:53, Jean Delvare wrote:
> > If it doesn't work, please try reverting (in reverse order):
> > gregkh-i2c-hwmon-01.patch
> > gregkh-i2c-hwmon-02.patch
> > gregkh-i2c-hwmon-03.patch
> > i2c-chips-need-hwmon.patch
> > gregkh-i2c-hwmon-02-sparc64-fix.patch
> > and see how it goes.
* Søren Lott <soren3@gmail.com> [2005-06-08 04:08:04 -0300]:
> yeap, reverting these did the trick, all i2c entries in sysfs are back. :)
My bad. Although I will redo the hwmon patches soon anyway, here is a
patch that you can apply (after reapplying the above) that should get
you working again. BTW: I tested it on almost identical h/w as yours,
this time with the same relevant config options, against 2.6.12-rc5-mm1.
This applies to -rc6-mm1.
---------------
This patch fixes an init order bug between hwmon and i2c/chips,
without which many sensors drivers will not initialize properly
(in non-modular systems).
Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
Index: linux-2.6.12-rc6-mm1/drivers/Makefile
===================================================================
--- linux-2.6.12-rc6-mm1.orig/drivers/Makefile
+++ linux-2.6.12-rc6-mm1/drivers/Makefile
@@ -53,8 +53,11 @@ obj-$(CONFIG_USB_GADGET) += usb/gadget/
obj-$(CONFIG_GAMEPORT) += input/gameport/
obj-$(CONFIG_INPUT) += input/
obj-$(CONFIG_I2O) += message/
-obj-$(CONFIG_I2C) += i2c/
+
+# most of i2c/chips depends on hwmon/
obj-$(CONFIG_HWMON) += hwmon/
+obj-$(CONFIG_I2C) += i2c/
+
obj-$(CONFIG_W1) += w1/
obj-$(CONFIG_PHONE) += telephony/
obj-$(CONFIG_MD) += md/
--
Mark M. Hoffman
mhoffman@lightlink.com
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 14:22 ` 2.6.12-rc6-mm1 Andy Whitcroft
2005-06-08 20:01 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-09 4:27 ` Andrey Panin
2005-06-09 13:12 ` 2.6.12-rc6-mm1 Andy Whitcroft
1 sibling, 1 reply; 101+ messages in thread
From: Andrey Panin @ 2005-06-09 4:27 UTC (permalink / raw)
To: Andy Whitcroft; +Cc: Andrew Morton, linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 794 bytes --]
On 159, 06 08, 2005 at 03:22:57 +0100, Andy Whitcroft wrote:
> We've been seeing an early boot hang on IBM x-series (at least on an
> x440) with -rc6-mm1. Finally got hold of a box to go search for this
> and it seems that backing out the three patches below fixes it.
>
> 515 dmi-move-acpi-boot-quirk.patch
> 516 dmi-move-acpi-sleep-quirk.patch
> 517 dmi-remove-central-blacklist.patch
>
> I am pretty sure it is actually the first one (that's where my bisection
> search pointed) but I had to drop the other two to back it out. Anyhow,
> 2.6.12-rc6-mm1 boots on an x440 with these backed out.
Yeah, probably brown paper bag time... Please try the attached patch.
--
Andrey Panin | Linux and UNIX system administrator
pazke@donpac.ru | PGP key: wwwkeys.pgp.net
[-- Attachment #1.2: patch-stupid-dmi-bug --]
[-- Type: text/plain, Size: 978 bytes --]
diff -urdpNX /usr/share/dontdiff linux-2.6.12-rc6-mm1.vanilla/arch/i386/kernel/acpi/boot.c linux-2.6.12-rc6-mm1/arch/i386/kernel/acpi/boot.c
--- linux-2.6.12-rc6-mm1.vanilla/arch/i386/kernel/acpi/boot.c 2005-06-09 08:02:06.000000000 +0400
+++ linux-2.6.12-rc6-mm1/arch/i386/kernel/acpi/boot.c 2005-06-09 08:24:01.000000000 +0400
@@ -1040,6 +1040,7 @@ static struct dmi_system_id __initdata a
},
},
#endif
+ { }
};
#endif /* __i386__ */
diff -urdpNX /usr/share/dontdiff linux-2.6.12-rc6-mm1.vanilla/arch/i386/kernel/acpi/sleep.c linux-2.6.12-rc6-mm1/arch/i386/kernel/acpi/sleep.c
--- linux-2.6.12-rc6-mm1.vanilla/arch/i386/kernel/acpi/sleep.c 2005-06-09 08:02:06.000000000 +0400
+++ linux-2.6.12-rc6-mm1/arch/i386/kernel/acpi/sleep.c 2005-06-09 08:24:15.000000000 +0400
@@ -108,6 +108,7 @@ static __initdata struct dmi_system_id a
DMI_MATCH(DMI_PRODUCT_NAME, "S4030CDT/4.3"),
},
},
+ { }
};
static int __init acpisleep_dmi_init(void)
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
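
Why the added "{ }" entries matter can be sketched in standalone C. This assumes the table walker stops at the first entry whose ident pointer is NULL, which an all-zero entry provides. The struct quirk_entry, the quirks table and count_matches() below are hypothetical stand-ins, not the kernel's dmi_system_id API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a quirk table entry. */
struct quirk_entry {
    const char *ident;    /* NULL marks the end of the table */
    const char *product;
};

static const struct quirk_entry quirks[] = {
    { "Example laptop A", "MODEL-A" },
    { "Example laptop B", "MODEL-B" },
    { NULL, NULL }        /* terminator: the role "{ }" plays in the patch */
};

/* Walk the table until the terminator and count matching products. */
static int count_matches(const struct quirk_entry *table, const char *product)
{
    const struct quirk_entry *q;
    int count = 0;

    for (q = table; q->ident != NULL; q++)
        if (strcmp(q->product, product) == 0)
            count++;
    return count;
}
```

Without the final entry the loop has no stopping condition and keeps reading whatever follows the array in memory, which would be consistent with the early boot hang reported above.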
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 23:34 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-09 7:17 ` Kirill Korotaev
2005-06-09 13:38 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 1 reply; 101+ messages in thread
From: Kirill Korotaev @ 2005-06-09 7:17 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Andrew Morton, apw, pazke, linux-kernel
> --On Wednesday, June 08, 2005 16:22:47 -0700 Andrew Morton <akpm@osdl.org> wrote:
>
>
>>"Martin J. Bligh" <mbligh@mbligh.org> wrote:
>>
>>>alt+sysrq+p does weird stuff (is that new patch in your tree Andrew?
>>> doesn't seem to interact with the other NMI code well)
>>
>>What patch?
>
>
> Sorry.
>
> nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch
>
> It does seem to work. But probably needs some cleanup for the NMI
> errors.
If you let me know where the problem comes from, I can fix it and clean
it up.
Kirill
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: BUG in i2c_detach_client
2005-06-08 21:26 ` Andrew Morton
2005-06-08 22:56 ` Andrew James Wade
@ 2005-06-09 7:47 ` Jean Delvare
2005-06-09 11:05 ` Andrew James Wade
2005-06-09 13:32 ` Andrew James Wade
1 sibling, 2 replies; 101+ messages in thread
From: Jean Delvare @ 2005-06-09 7:47 UTC (permalink / raw)
To: akpm, ajwade; +Cc: linux-kernel, Greg KH, Mark M. Hoffman
Hi Andrew, Andrew, all,
[Adding Mark M. Hoffman in the loop, as the author and recent modifier of
the asb100 driver.]
> From: Andrew Morton <akpm@osdl.org>
>
> Fix error backing-out code in asb100.c
>
> Cc: Greg KH <greg@kroah.com>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> (...)
> --- 25/drivers/i2c/chips/asb100.c~asb100-fix
> +++ 25-akpm/drivers/i2c/chips/asb100.c
> @@ -690,18 +690,20 @@ static int asb100_detect_subclients(stru
> if ((err = i2c_attach_client(data->lm75[0]))) {
> dev_err(&new_client->dev, "subclient %d registration "
> "at address 0x%x failed.\n", i, data->lm75[0]->addr);
> - goto ERROR_SC_2;
> + goto ERROR_SC_3;
> }
>
> if ((err = i2c_attach_client(data->lm75[1]))) {
> dev_err(&new_client->dev, "subclient %d registration "
> "at address 0x%x failed.\n", i, data->lm75[1]->addr);
> - goto ERROR_SC_3;
> + goto ERROR_SC_4;
> }
>
> return 0;
>
> /* Undo inits in case of errors */
> +ERROR_SC_4:
> + i2c_detach_client(data->lm75[1]);
> ERROR_SC_3:
> i2c_detach_client(data->lm75[0]);
> ERROR_SC_2:
This patch looks broken to me, please discard it. You do not want to call
i2c_detach_client when the corresponding i2c_attach_client failed. The
original code was fine in that respect. I don't think there is any
problem in the asb100_detect_subclients() function.
I do however think that there is a problem in the asb100_detect()
function, where a call to i2c_detach_client() is missing:
ERROR3:
i2c_detach_client(data->lm75[1]); <-- HERE
i2c_detach_client(data->lm75[0]);
kfree(data->lm75[1]);
kfree(data->lm75[0]);
If we take that error path, it means that both subclients have been
successfully attached, thus need proper detach.
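
The unwinding rule described here (undo only what succeeded, in reverse order) can be sketched in standalone C. Everything below is a simplified model: struct client, attach(), detach(), probe() and the fail_at switch are hypothetical names for illustration, not the i2c core API or the asb100 code.

```c
#include <assert.h>

struct client { int attached; };
struct device { struct client sub[2]; };

static int attached_count;    /* stands in for the bus's client list */

static int attach(struct client *c, int fail)
{
    if (fail)
        return -1;            /* a failed attach leaves nothing to undo */
    c->attached = 1;
    attached_count++;
    return 0;
}

static void detach(struct client *c)
{
    assert(c->attached);      /* detaching an unattached client is a bug */
    c->attached = 0;
    attached_count--;
}

/* fail_at: 0 = success, 1 or 2 = that attach fails, 3 = a later step fails */
static int probe(struct device *d, int fail_at)
{
    if (attach(&d->sub[0], fail_at == 1))
        goto err0;
    if (attach(&d->sub[1], fail_at == 2))
        goto err1;
    if (fail_at == 3)         /* e.g. a registration step after both attaches */
        goto err2;
    return 0;

err2:
    detach(&d->sub[1]);       /* both attached by now, so both are detached */
err1:
    detach(&d->sub[0]);
err0:
    return -1;
}
```

In this model the ERROR3 case corresponds to fail_at == 3: both subclients were attached, so the label has to detach both before anything is freed.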
The reason why the bug triggered on Andrew (James Wade) is probably that
hwmon_device_register() failed, due to an order problem in a Makefile.
See http://lkml.org/lkml/2005/6/8/338, which has an explanation and a
patch fixing it (I think).
This still doesn't explain why the error path triggers the BUG(), and
although applying the aforementioned patch will probably get the driver
working, I'd really like to understand what's going on there.
Thanks,
--
Jean Delvare
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: BUG in i2c_detach_client
2005-06-08 23:32 ` Andrew Morton
@ 2005-06-09 7:52 ` Jean Delvare
0 siblings, 0 replies; 101+ messages in thread
From: Jean Delvare @ 2005-06-09 7:52 UTC (permalink / raw)
To: akpm, ajwade; +Cc: linux-kernel@vger.kernel.org, Greg KH
Hi Andrew,
> > Will do. But I don't think that's it. I've been adding printks to
> > determine the execution path and it goes through the ERROR3 path in
> > asb100_detect(), which means AFAICT that the error path in
> > asb100_detect_subclients() isn't taken:
> >
> > ERROR3:
> > i2c_detach_client(data->lm75[0]);
> > kfree(data->lm75[1]);
> > kfree(data->lm75[0]);
> > ERROR2:
> > i2c_detach_client(new_client); // <--- BUG() in here.
> > ERROR1:
> > kfree(data);
> > ERROR0:
> > return err;
>
> hm, the tree I have here doesn't do that. What kernel do you have there?
I suspect that the bug will only show when the i2c-core and asb100
drivers (and the relevant i2c bus driver) are built into the kernel.
(See my previous post.)
Thanks,
--
Jean Delvare
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: BUG in i2c_detach_client
2005-06-09 7:47 ` Jean Delvare
@ 2005-06-09 11:05 ` Andrew James Wade
2005-06-09 13:32 ` Andrew James Wade
1 sibling, 0 replies; 101+ messages in thread
From: Andrew James Wade @ 2005-06-09 11:05 UTC (permalink / raw)
To: Jean Delvare; +Cc: akpm, linux-kernel, Greg KH, Mark M. Hoffman
On June 9, 2005 03:47 am, Jean Delvare wrote:
> The reason why the bug triggered on Andrew (James Wade) is probably that
> hwmon_device_register() failed, due to an order problem in a Makefile.
> See http://lkml.org/lkml/2005/6/8/338, which has an explanation and a
> patch fixing it (I think).
Yup, the kernel now boots.
> This still doesn't explain why the error path triggers the BUG(), and
> although applying the aforementioned patch will probably get the driver
> working, I'd really like to understand what's going on there.
Ok, I'll keep playing around with the kernel to see what I can find out.
(and I'll take a look at
http://www.zip.com.au/~akpm/linux/patches/stuff/x.bz2 as Andrew Morton
suggested)
Thanks,
Andrew
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-09 4:27 ` 2.6.12-rc6-mm1 Andrey Panin
@ 2005-06-09 13:12 ` Andy Whitcroft
0 siblings, 0 replies; 101+ messages in thread
From: Andy Whitcroft @ 2005-06-09 13:12 UTC (permalink / raw)
To: Andrey Panin; +Cc: Andrew Morton, linux-kernel, Martin J. Bligh
Andrey Panin wrote:
> Yeah, probably brown paper bag time... Please try the attached patch.
Ok. I can confirm that linux-2.6.12-rc6-mm1 + just this fix boots fine
and works. And yes I said works? I can't understand why backing the
others out left us with the odd spin hang and this combination doesn't.
I've managed to run 4 sets of boot and kernbench (10 runs) without a hang.
/me feels there is something else ugly in here we don't want but
unrelated to this patch.
-apw
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: BUG in i2c_detach_client
2005-06-09 7:47 ` Jean Delvare
2005-06-09 11:05 ` Andrew James Wade
@ 2005-06-09 13:32 ` Andrew James Wade
2005-06-09 15:57 ` Jean Delvare
1 sibling, 1 reply; 101+ messages in thread
From: Andrew James Wade @ 2005-06-09 13:32 UTC (permalink / raw)
To: Jean Delvare; +Cc: akpm, linux-kernel, Greg KH, Mark M. Hoffman
Mystery solved.
ERROR3:
i2c_detach_client(data->lm75[1]); <-- HERE
i2c_detach_client(data->lm75[0]);
kfree(data->lm75[1]);
kfree(data->lm75[0]);
The missing i2c_detach_client call meant that data->lm75[1] was still on
the list of i2c devices when it was freed. This was corrupting the list.
The ERROR3 path now works on my kernel.
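A minimal userspace sketch of why the ordering matters, assuming a simplified
circular list rather than the real i2c core. The names struct client,
client_attach and client_detach_and_free are hypothetical and are not kernel
APIs; the point is only that a node has to be unlinked before it is freed,
otherwise its neighbours keep pointers into freed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified circular doubly linked list, loosely modelled on list_head. */
struct node {
    struct node *prev, *next;
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void list_add_tail_node(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink n so that no neighbour points at it any more. */
static void list_unlink(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

static size_t list_count(const struct node *head)
{
    size_t k = 0;
    for (const struct node *p = head->next; p != head; p = p->next)
        k++;
    return k;
}

/* Hypothetical stand-in for a registered sub-client. */
struct client {
    struct node link;
    int addr;
};

static struct client *client_attach(struct node *head, int addr)
{
    struct client *c = malloc(sizeof(*c));
    if (!c)
        return NULL;
    c->addr = addr;
    list_add_tail_node(&c->link, head);
    return c;
}

/* Correct teardown: detach first, free second. */
static void client_detach_and_free(struct client *c)
{
    list_unlink(&c->link);
    free(c);
}

/* Error-path teardown of two sub-clients; returns how many are still listed. */
size_t demo_error_path(void)
{
    struct node head;
    list_init(&head);

    struct client *a = client_attach(&head, 0x48);
    struct client *b = client_attach(&head, 0x49);

    /* Both clients are detached before being freed, newest first. */
    if (b)
        client_detach_and_free(b);
    if (a)
        client_detach_and_free(a);

    /* The list is empty again and holds no dangling pointers. */
    assert(head.next == &head && head.prev == &head);
    return list_count(&head);
}
```

Dropping the list_unlink() call for one of the two clients in this sketch
would leave the list pointing at freed memory, which is the same kind of
corruption described above.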
Thanks for your help.
Andrew
* Re: 2.6.12-rc6-mm1
2005-06-09 7:17 ` 2.6.12-rc6-mm1 Kirill Korotaev
@ 2005-06-09 13:38 ` Martin J. Bligh
2005-06-10 12:12 ` 2.6.12-rc6-mm1 Kirill Korotaev
0 siblings, 1 reply; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-09 13:38 UTC (permalink / raw)
To: Kirill Korotaev; +Cc: Andrew Morton, apw, pazke, linux-kernel
--Kirill Korotaev <dev@sw.ru> wrote (on Thursday, June 09, 2005 11:17:43 +0400):
>> --On Wednesday, June 08, 2005 16:22:47 -0700 Andrew Morton <akpm@osdl.org> wrote:
>>
>>
>>> "Martin J. Bligh" <mbligh@mbligh.org> wrote:
>>>
>>>> alt+sysrq+p does weird stuff (is that new patch in your tree Andrew?
>>>> doesn't seem to interact with the other NMI code well)
>>>
>>> What patch?
>>
>>
>> Sorry.
>>
>> nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch
>>
>> It does seem to work. But probably needs some cleanup for the NMI
>> errors.
> If you let me know where the problem comes from, I can fix it and clean it up.
It gets a lot of the "dazed and confused" errors. Possibly you just need
to disable that part of the handler?
Command> break
^@SysRq : Show Regs
^M
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 0
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: c040e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7e3f5a0 CR3: 16dd0300 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c01002c8>] rest_init+0x28/0x2c
^M^@ [<c0410899>] start_kernel+0x19d/0x1a0
^M^@ Uhhuh. NMI received for unknown reason 00 on CPU 1.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 16.
^M^@Dazed and confused, but trying to continue
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 3.
^M^@Do you have a strange power saving mode enabled?
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 17.
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 16
^M^@EIP is at default_idle+0x23/0x2c
^M^@Dazed and confused, but trying to continue
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 2.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 18.
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 19.
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7420000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@CR0: 8005003b CR2: 00000000 CR3: 17771800 CR4: 000006b0
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 6.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 20.
^M^@ start_secondary+0x13d/0x140
^M^@Dazed and confused, but trying to continue
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 18
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 10.
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7426000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f25d9c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 29.
^M^@ cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 2
^M^@ EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7400000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7edeb00 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 23.
^M^@ start_secondary+0x13d/0x140
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 7.
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 3
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7402000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@Do you have a strange power saving mode enabled?
^M^@CR0: 8005003b CR2: b7f95438 CR3: 17771800 CR4: 000006b0
^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 4.
^M^@ cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 17
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 5.
^M^@Dazed and confused, but trying to continue
^M^@EIP is at default_idle+0x23/0x2c
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 14.
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7424000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 9.
^M^@ cpu_idle+0x7b/0x8c
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 25.
^M^@ [<c010e79d>] start_secondary+0x13d/0x140Dazed and confused, but trying to continue
^M
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 19
^M^@ EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7428000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f30d9c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 13.
^M^@ start_secondary+0x13d/0x140
^M^@ Do you have a strange power saving mode enabled?
^M^@----------- IPI show regs -----------Uhhuh. NMI received for unknown reason 00 on CPU 8.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 11.
^M^@Dazed and confused, but trying to continue
^M^@Dazed and confused, but trying to continue
^M^@Dazed and confused, but trying to continue
^M
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 20
^M^@Dazed and confused, but trying to continue
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 22.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 26.
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ESI: d742a000 EDI: c0470300 EBP: c0470300Uhhuh. NMI received for unknown reason 00 on CPU 30.
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Do you have a strange power saving mode enabled?
^M^@Do you have a strange power saving mode enabled?
^M^@ DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 21.
^M^@ [<c0100ca3>]Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ cpu_idle+0x7b/0x8c
^M^@Dazed and confused, but trying to continue
^M^@ [<c010e79d>]Do you have a strange power saving mode enabled?
^M^@ start_secondary+0x13d/0x140
^M^@Do you have a strange power saving mode enabled?
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 27.
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 24.
^M^@ ----------- IPI show regs -----------
^M^@Pid: 11221, comm: cp
^M^@EIP: 0060:[<c02efbdc>] CPU: 5
^M^@Do you have a strange power saving mode enabled?
^M^@EIP is at _spin_lock_irqsave+0x14/0x20
^M^@ EFLAGS: 00000286 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@Dazed and confused, but trying to continue
^M^@EAX: 00000286 EBX: d6ce4800 ECX: c03cabe0 EDX: c049ba84
^M^@ESI: ffffffea EDI: d55f8000 EBP: d55f8000 DS: 007b ES: 007b
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@CR0: 80050033 CR2: bfc7d2fc CR3: 16dd02e0 CR4: 000006b0
^M^@ [<c0270377>]Uhhuh. NMI received for unknown reason 00 on CPU 31.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 12.
^M^@ ahc_linux_proc_info+0x27/0x212
^M^@Do you have a strange power saving mode enabled?
^M^@Do you have a strange power saving mode enabled?
^M^@ [<c0149052>]Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ page_add_anon_rmap+0x62/0x68
^M^@ [<c0144358>]Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 15.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 28.
^M^@Dazed and confused, but trying to continue
^M^@ do_anonymous_page+0x1f0/0x21c
^M^@ [<c0144370>]Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ do_anonymous_page+0x208/0x21c
^M^@Dazed and confused, but trying to continue
^M^@ [<c01443d9>]Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Do you have a strange power saving mode enabled?
^M^@ do_no_page+0x55/0x3e8
^M^@ [<c01372b5>] prep_new_page+0x49/0x50
^M^@ [<c0137973>] buffered_rmqueue+0x16f/0x1d0
^M^@ [<c0137e1b>] __alloc_pages+0x3bb/0x3c8
^M^@ [<c0257cdb>] proc_scsi_read+0x2b/0x44
^M^@ [<c0182f28>] proc_file_read+0xec/0x200
^M^@ [<c0152ff9>] vfs_read+0x91/0x12c
^M^@ [<c01532e4>] sys_read+0x40/0x6c
^M^@ [<c0102a19>] syscall_call+0x7/0xb
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 7
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d740c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 4
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7404000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 080f9c48 CR3: 17771320 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c02efcae>] CPU: 30
^M^@EIP is at _spin_lock+0xa/0x10
^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: c1050aa0 EBX: c1050aa0 ECX: d7463ea8 EDX: 00000003
^M^@ESI: c10d9620 EDI: c10d9fe0 EBP: d7463eb0 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7eea900 CR3: 00474000 CR4: 000006b0
^M^@ [<c011583b>] load_balance+0xcf/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c02efb6a>] CPU: 15
^M^@EIP is at _spin_trylock+0x6/0x14
^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0
^M^@ESI: c10875a0 EDI: c1087f60 EBP: d741fe84 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0114fda>] double_lock_balance+0x12/0x48
^M^@ [<c01157e4>] load_balance+0x78/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 21
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d742c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 14
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d741c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7e64070 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 27
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d745c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f66d9c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 8
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d740e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 080f133c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 25
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7436000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f74d9c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c02efb6a>] CPU: 29
^M^@EIP is at _spin_trylock+0x6/0x14
^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000001 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0
^M^@ESI: c10d3ea0 EDI: c10d4860 EBP: d7461e84 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0114fda>] double_lock_balance+0x12/0x48
^M^@ [<c01157e4>] load_balance+0x78/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 31
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7464000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 24
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7434000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 10
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7412000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7ea6920 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 26
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7438000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c01154a3>] CPU: 13
^M^@EIP is at find_busiest_group+0x103/0x2f8
^M^@ EFLAGS: 00000086 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000005 EBX: 00000005 ECX: c1050aa0 EDX: 00000000
^M^@ESI: c04813ac EDI: 00000200 EBP: d741be7c DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7e7e070 CR3: 00474000 CR4: 000006b0
^M^@ [<c01157a2>] load_balance+0x36/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 28
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d745e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c011e897>] CPU: 12
^M^@EIP is at __do_softirq+0x47/0x100
^M^@ EFLAGS: 00000006 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: c0470380 EBX: c0476020 ECX: 00000030 EDX: c1075ce0
^M^@ESI: 00000002 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f54000 CR3: 00474000 CR4: 000006b0
^M^@ [<c011e97f>] do_softirq+0x2f/0x34
^M^@ [<c011ea24>] irq_exit+0x34/0x38
^M^@ [<c010f601>] smp_apic_timer_interrupt+0xdd/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 9
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7410000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f1d900 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 11
^M^@ EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7414000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 23
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7432000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 6
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7408000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 22
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7430000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 1
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: c13fc000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7ee1d9c CR3: 17771640 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ^M
* Re: BUG in i2c_detach_client
2005-06-09 13:32 ` Andrew James Wade
@ 2005-06-09 15:57 ` Jean Delvare
2005-06-10 5:58 ` Greg KH
0 siblings, 1 reply; 101+ messages in thread
From: Jean Delvare @ 2005-06-09 15:57 UTC (permalink / raw)
To: Andrew James Wade; +Cc: Andrew Morton, linux-kernel, Greg KH, Mark M. Hoffman
Hi Andrew,
> Mystery solved.
>
> ERROR3:
> i2c_detach_client(data->lm75[1]); <-- HERE
> i2c_detach_client(data->lm75[0]);
> kfree(data->lm75[1]);
> kfree(data->lm75[0]);
>
> The missing i2c_detach_client call meant that data->lm75[1] was still
> on the list of i2c devices when it was freed. This was corrupting the
> list. The ERROR3 path now works on my kernel.
Oh my, I had it right under my nose and didn't see it ;) Thanks for the
clarification.
Greg, please apply the following patch on top of the hwmon patches until
Mark submits an updated version of the whole thing.
----------------------------------
Fix a broken error path in the asb100 driver.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
--- linux-2.6.12-rc6/drivers/i2c/chips/asb100.c.orig Wed Jun 8 09:47:53 2005
+++ linux-2.6.12-rc6/drivers/i2c/chips/asb100.c Thu Jun 9 11:58:34 2005
@@ -859,6 +859,7 @@
return 0;
ERROR3:
+ i2c_detach_client(data->lm75[1]);
i2c_detach_client(data->lm75[0]);
kfree(data->lm75[1]);
kfree(data->lm75[0]);
--
Jean Delvare
* Re: 2.6.12-rc6-mm1
2005-06-08 0:08 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-08 3:17 ` 2.6.12-rc6-mm1 Nick Piggin
2005-06-08 14:15 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-09 23:56 ` Martin J. Bligh
2005-06-10 7:02 ` 2.6.12-rc6-mm1 Ingo Molnar
2 siblings, 1 reply; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-09 23:56 UTC (permalink / raw)
To: Andrew Morton, Christoph Lameter; +Cc: linux-kernel
--On Tuesday, June 07, 2005 17:08:53 -0700 Andrew Morton <akpm@osdl.org> wrote:
> Christoph Lameter <clameter@engr.sgi.com> wrote:
>>
>> On Tue, 7 Jun 2005, Andrew Morton wrote:
>>
>> > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
>> >
>> > Oh crap, so it does. That's wrong.
>>
>> Email by you and Linus indicated that 250 should be the default.
>
> Oh, OK. hrm.
>
> Martin, it would be useful if you could determine whether the kernbench
> slowdown was due to the 1000Hz->250Hz change, thanks.
>
> I'm assuming it was the CPU scheduler patches. There are 36 of them ;)
Backed them all out ... performance thunks down to earth again, and is actually
the best I've seen it ever (probably 250Hz is helping, I used to run 100 in
-mjb for better benefit).
the +5081 item is the one to look at
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
Patch I used was here:
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/patches/nosched
But it was just everything under the "CPU scheduler" section of your series
file.
M.
* Re: BUG in i2c_detach_client
2005-06-09 15:57 ` Jean Delvare
@ 2005-06-10 5:58 ` Greg KH
2005-06-10 7:08 ` Jean Delvare
0 siblings, 1 reply; 101+ messages in thread
From: Greg KH @ 2005-06-10 5:58 UTC (permalink / raw)
To: Jean Delvare
Cc: Andrew James Wade, Andrew Morton, linux-kernel, Mark M. Hoffman
On Thu, Jun 09, 2005 at 05:57:44PM +0200, Jean Delvare wrote:
> Hi Andrew,
>
> > Mystery solved.
> >
> > ERROR3:
> > i2c_detach_client(data->lm75[1]); <-- HERE
> > i2c_detach_client(data->lm75[0]);
> > kfree(data->lm75[1]);
> > kfree(data->lm75[0]);
> >
> > The missing i2c_detach_client call meant that data->lm75[1] was still
> > on the list of i2c devices when it was freed. This was corrupting the
> > list. The ERROR3 path now works on my kernel.
>
> Oh my, I had it right under my nose and didn't see it ;) Thanks for the
> clarification.
>
> Greg, please apply the following patch on top of the hwmon patches until
> Mark submits an updated version of the whole thing.
>
> ----------------------------------
>
> Fix a broken error path in the asb100 driver.
>
> Signed-off-by: Jean Delvare <khali@linux-fr.org>
>
> --- linux-2.6.12-rc6/drivers/i2c/chips/asb100.c.orig Wed Jun 8 09:47:53 2005
> +++ linux-2.6.12-rc6/drivers/i2c/chips/asb100.c Thu Jun 9 11:58:34 2005
> @@ -859,6 +859,7 @@
> return 0;
>
> ERROR3:
> + i2c_detach_client(data->lm75[1]);
> i2c_detach_client(data->lm75[0]);
> kfree(data->lm75[1]);
> kfree(data->lm75[0]);
Hm, what tree is this against? Am I missing some in-between patch here?
thanks,
greg k-h
* Re: 2.6.12-rc6-mm1
2005-06-09 23:56 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-10 7:02 ` Ingo Molnar
2005-06-10 12:03 ` 2.6.12-rc6-mm1 Con Kolivas
0 siblings, 1 reply; 101+ messages in thread
From: Ingo Molnar @ 2005-06-10 7:02 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Andrew Morton, Christoph Lameter, linux-kernel
* Martin J. Bligh <mbligh@mbligh.org> wrote:
> > I'm assuming it was the CPU scheduler patches. There are 36 of them ;)
>
> Backed them all out ... performance thunks down to earth again, and is
> actually the best I've seen it ever (probably 250Hz is helping, I used
> to run 100 in -mjb for better benefit).
>
> the +5081 item is the one to look at
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>
> Patch I used was here:
>
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/patches/nosched
>
> But it was just everything under the "CPU scheduler" section of your
> series file.
we know from Nick's testing that the patches up to and including
dynamic-sched-domains-ia64-changes.patch are probably OK. So the
candidates for the regression are:
sched-implement-nice-support-across-physical-cpus-on-smp.patch
sched-change_prio_bias_only_if_queued.patch
sched-account_rt_tasks_in_prio_bias.patch
consolidate-preempt-options-into-kernel-kconfigpreempt.patch
enable-preempt_bkl-on-preemptsmp-too.patch
sched-tweak-idle-thread-setup-semantics.patch
sched-voluntary-kernel-preemption.patch
sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
sched-task_noninteractive.patch
sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
there are two feature patches in this:
enable-preempt_bkl-on-preemptsmp-too.patch
sched-voluntary-kernel-preemption.patch
so make sure you have PREEMPT_BKL and PREEMPT_VOLUNTARY disabled.
these ones should not impact your workload's functionality (unless they
are buggy):
sched-account_rt_tasks_in_prio_bias.patch
consolidate-preempt-options-into-kernel-kconfigpreempt.patch
sched-tweak-idle-thread-setup-semantics.patch
sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
and unless you are using separate nice levels, this one shouldn't make a
difference in theory:
sched-implement-nice-support-across-physical-cpus-on-smp.patch
which leaves the following 3 likely candidates:
sched-change_prio_bias_only_if_queued.patch
sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
sched-task_noninteractive.patch
so if you could do a run with all 3 of the above unapplied, that would
be a good starting point. (But any of the others might be it too, if
they contain some sort of bug.)
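If removing the three likely candidates does not settle it, one mechanical
fallback is to bisect the ordered series. The following is only a sketch under
stated assumptions: a single culprit, a series that can be applied as a prefix,
and a hypothetical is_bad callback standing in for "build, boot and run
kernbench with the first n patches applied". None of these names come from the
-mm tooling.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical predicate: nonzero if a kernel built with the first
 * n_applied patches of the series shows the regression.
 */
typedef int (*is_bad_fn)(size_t n_applied, void *ctx);

/*
 * Return the 1-based index of the first patch that introduces the
 * regression. Assumes that applying 0 patches is good and that applying
 * all `total` patches is bad.
 */
size_t bisect_first_bad(size_t total, is_bad_fn is_bad, void *ctx)
{
    size_t good = 0;      /* largest prefix known to be good */
    size_t bad = total;   /* smallest prefix known to be bad */

    assert(total > 0);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid, ctx))
            bad = mid;
        else
            good = mid;
    }
    return bad;
}

/* Toy predicate: the regression appears once patch number *ctx is applied. */
int toy_is_bad(size_t n_applied, void *ctx)
{
    size_t culprit = *(size_t *)ctx;
    return n_applied >= culprit;
}
```

With the ten candidates listed above this needs at most four test runs instead
of ten, at the cost of assuming the patches are independent enough to be
applied as a prefix.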
Ingo
* Re: BUG in i2c_detach_client
2005-06-10 5:58 ` Greg KH
@ 2005-06-10 7:08 ` Jean Delvare
0 siblings, 0 replies; 101+ messages in thread
From: Jean Delvare @ 2005-06-10 7:08 UTC (permalink / raw)
To: greg; +Cc: Andrew James Wade, Andrew Morton, LKML, Mark M. Hoffman
Hi Greg,
> > --- linux-2.6.12-rc6/drivers/i2c/chips/asb100.c.orig
> > +++ linux-2.6.12-rc6/drivers/i2c/chips/asb100.c
> > @@ -859,6 +859,7 @@
> > return 0;
> >
> > ERROR3:
> > + i2c_detach_client(data->lm75[1]);
> > i2c_detach_client(data->lm75[0]);
> > kfree(data->lm75[1]);
> > kfree(data->lm75[0]);
>
> Hm, what tree is this against? Am I missing some in-between patch here?
2.6.12-rc6-mm1, but that was a fix to Mark's hwmon patches, which you
just backed out from your tree - so this fix is no longer needed (and
should unsurprisingly fail to apply).
Thanks,
--
Jean Delvare
* Re: 2.6.12-rc6-mm1
2005-06-10 7:02 ` 2.6.12-rc6-mm1 Ingo Molnar
@ 2005-06-10 12:03 ` Con Kolivas
2005-06-10 14:19 ` 2.6.12-rc6-mm1 Con Kolivas
0 siblings, 1 reply; 101+ messages in thread
From: Con Kolivas @ 2005-06-10 12:03 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Martin J. Bligh, Andrew Morton, Christoph Lameter
On Fri, 10 Jun 2005 17:02, Ingo Molnar wrote:
> * Martin J. Bligh <mbligh@mbligh.org> wrote:
> > > I'm assuming it was the CPU scheduler patches. There are 36 of them ;)
> >
> > Backed them all out ... performance thunks down to earth again, and is
> > actually the best I've seen it ever (probably 250Hz is helping, I used
> > to run 100 in -mjb for better benefit).
> >
> > the +5081 item is the one to look at
> > > http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
> >
> > Patch I used was here:
> >
> > http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/patches/nosched
> >
> > But it was just everything under the "CPU scheduler" section of your
> > series file.
>
> we know from Nick's testing that the patches up to and including
> dynamic-sched-domains-ia64-changes.patch are probably OK. So the
> candidates for the regression are:
>
> sched-implement-nice-support-across-physical-cpus-on-smp.patch
> sched-change_prio_bias_only_if_queued.patch
> sched-account_rt_tasks_in_prio_bias.patch
> consolidate-preempt-options-into-kernel-kconfigpreempt.patch
> enable-preempt_bkl-on-preemptsmp-too.patch
> sched-tweak-idle-thread-setup-semantics.patch
> sched-voluntary-kernel-preemption.patch
> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
> sched-task_noninteractive.patch
> sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
>
> there are two feature patches in this:
>
> enable-preempt_bkl-on-preemptsmp-too.patch
> sched-voluntary-kernel-preemption.patch
>
> so make sure you have PREEMPT_BKL and PREEMPT_VOLUNTARY disabled.
>
> these ones should not impact your workload's functionality (unless they
> are buggy):
>
> sched-account_rt_tasks_in_prio_bias.patch
> consolidate-preempt-options-into-kernel-kconfigpreempt.patch
> sched-tweak-idle-thread-setup-semantics.patch
> sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
>
> and unless you are using separate nice levels, this one shouldn't make a
> difference in theory:
>
> sched-implement-nice-support-across-physical-cpus-on-smp.patch
>
> which leaves the following 3 likely candidates:
>
> sched-change_prio_bias_only_if_queued.patch
> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
These tend to run together, so just try adding my four patches as a group. In
retrospect, I guess they're likely candidates because they also change the
_ratio_ of balance, which they should not, so they are currently buggy as a
group. That is easy enough to fix, but testing them together will make it easy
to pinpoint the problem if they're responsible.
sched-implement-nice-support-across-physical-cpus-on-smp.patch
sched-change_prio_bias_only_if_queued.patch
sched-account_rt_tasks_in_prio_bias.patch
sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
Con
* Re: 2.6.12-rc6-mm1
2005-06-09 13:38 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-10 12:12 ` Kirill Korotaev
0 siblings, 0 replies; 101+ messages in thread
From: Kirill Korotaev @ 2005-06-10 12:12 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Andrew Morton, apw, pazke, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 24361 bytes --]
>>>>>alt+sysrq+p does weird stuff (is that a new patch in your tree, Andrew?
>>>>>doesn't seem to interact with the other NMI code well)
>>>>
>>>>What patch?
>>>
>>>
>>>Sorry.
>>>
>>>nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch
>>>
>>>It does seem to work. But probably needs some cleanup for the NMI
>>>errors.
>>
>>If you let me know where the problem comes from, I can fix it and do a cleanup.
>
>
> It gets a lot of the "dazed and confused" errors. Possibly you just need
> to disable that part of the handler?
Can you try this cleanup patch?
This fixes the problem for me, though I do not like the way it does so
very much...
Kirill
> Command> break
> ^@SysRq : Show Regs
> ^M
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 0
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: c040e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7e3f5a0 CR3: 16dd0300 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c01002c8>] rest_init+0x28/0x2c
> ^M^@ [<c0410899>] start_kernel+0x19d/0x1a0
> ^M^@ Uhhuh. NMI received for unknown reason 00 on CPU 1.
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 16.
> ^M^@Dazed and confused, but trying to continue
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 3.
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 17.
> ^M^@----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 16
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@Dazed and confused, but trying to continue
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 2.
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 18.
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Dazed and confused, but trying to continue
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 19.
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7420000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 17771800 CR4: 000006b0
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 6.
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 20.
> ^M^@ start_secondary+0x13d/0x140
> ^M^@Dazed and confused, but trying to continue
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 18
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 10.
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7426000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7f25d9c CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 29.
> ^M^@ cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 2
> ^M^@ EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7400000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7edeb00 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 23.
> ^M^@ start_secondary+0x13d/0x140
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 7.
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 3
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7402000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@CR0: 8005003b CR2: b7f95438 CR3: 17771800 CR4: 000006b0
> ^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 4.
> ^M^@ cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 17
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 5.
> ^M^@Dazed and confused, but trying to continue
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 14.
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7424000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 9.
> ^M^@ cpu_idle+0x7b/0x8c
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 25.
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140Dazed and confused, but trying to continue
> ^M
> ^M^@----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 19
> ^M^@ EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7428000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7f30d9c CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 13.
> ^M^@ start_secondary+0x13d/0x140
> ^M^@ Do you have a strange power saving mode enabled?
> ^M^@----------- IPI show regs -----------Uhhuh. NMI received for unknown reason 00 on CPU 8.
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 11.
> ^M^@Dazed and confused, but trying to continue
> ^M^@Dazed and confused, but trying to continue
> ^M^@Dazed and confused, but trying to continue
> ^M
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 20
> ^M^@Dazed and confused, but trying to continue
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 22.
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 26.
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@ESI: d742a000 EDI: c0470300 EBP: c0470300Uhhuh. NMI received for unknown reason 00 on CPU 30.
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@ DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 21.
> ^M^@ [<c0100ca3>]Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@ cpu_idle+0x7b/0x8c
> ^M^@Dazed and confused, but trying to continue
> ^M^@ [<c010e79d>]Do you have a strange power saving mode enabled?
> ^M^@ start_secondary+0x13d/0x140
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 27.
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 24.
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 11221, comm: cp
> ^M^@EIP: 0060:[<c02efbdc>] CPU: 5
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@EIP is at _spin_lock_irqsave+0x14/0x20
> ^M^@ EFLAGS: 00000286 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@Dazed and confused, but trying to continue
> ^M^@EAX: 00000286 EBX: d6ce4800 ECX: c03cabe0 EDX: c049ba84
> ^M^@ESI: ffffffea EDI: d55f8000 EBP: d55f8000 DS: 007b ES: 007b
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Dazed and confused, but trying to continue
> ^M^@CR0: 80050033 CR2: bfc7d2fc CR3: 16dd02e0 CR4: 000006b0
> ^M^@ [<c0270377>]Uhhuh. NMI received for unknown reason 00 on CPU 31.
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 12.
> ^M^@ ahc_linux_proc_info+0x27/0x212
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@ [<c0149052>]Do you have a strange power saving mode enabled?
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@ page_add_anon_rmap+0x62/0x68
> ^M^@ [<c0144358>]Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 15.
> ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 28.
> ^M^@Dazed and confused, but trying to continue
> ^M^@ do_anonymous_page+0x1f0/0x21c
> ^M^@ [<c0144370>]Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@ do_anonymous_page+0x208/0x21c
> ^M^@Dazed and confused, but trying to continue
> ^M^@ [<c01443d9>]Do you have a strange power saving mode enabled?
> ^M^@Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@ do_no_page+0x55/0x3e8
> ^M^@ [<c01372b5>] prep_new_page+0x49/0x50
> ^M^@ [<c0137973>] buffered_rmqueue+0x16f/0x1d0
> ^M^@ [<c0137e1b>] __alloc_pages+0x3bb/0x3c8
> ^M^@ [<c0257cdb>] proc_scsi_read+0x2b/0x44
> ^M^@ [<c0182f28>] proc_file_read+0xec/0x200
> ^M^@ [<c0152ff9>] vfs_read+0x91/0x12c
> ^M^@ [<c01532e4>] sys_read+0x40/0x6c
> ^M^@ [<c0102a19>] syscall_call+0x7/0xb
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 7
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d740c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 4
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7404000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 080f9c48 CR3: 17771320 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c02efcae>] CPU: 30
> ^M^@EIP is at _spin_lock+0xa/0x10
> ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: c1050aa0 EBX: c1050aa0 ECX: d7463ea8 EDX: 00000003
> ^M^@ESI: c10d9620 EDI: c10d9fe0 EBP: d7463eb0 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7eea900 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c011583b>] load_balance+0xcf/0x170
> ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
> ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
> ^M^@ [<c01225b3>] update_process_times+0xef/0x100
> ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
> ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
> ^M^@ [<c0100bb0>] default_idle+0x0/0x2c
> ^M^@ [<c0100bd3>] default_idle+0x23/0x2c
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c02efb6a>] CPU: 15
> ^M^@EIP is at _spin_trylock+0x6/0x14
> ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0
> ^M^@ESI: c10875a0 EDI: c1087f60 EBP: d741fe84 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0114fda>] double_lock_balance+0x12/0x48
> ^M^@ [<c01157e4>] load_balance+0x78/0x170
> ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
> ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
> ^M^@ [<c01225b3>] update_process_times+0xef/0x100
> ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
> ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
> ^M^@ [<c0100bb0>] default_idle+0x0/0x2c
> ^M^@ [<c0100bd3>] default_idle+0x23/0x2c
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 21
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d742c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 14
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d741c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7e64070 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 27
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d745c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7f66d9c CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 8
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d740e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 080f133c CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 25
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7436000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7f74d9c CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c02efb6a>] CPU: 29
> ^M^@EIP is at _spin_trylock+0x6/0x14
> ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000001 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0
> ^M^@ESI: c10d3ea0 EDI: c10d4860 EBP: d7461e84 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0114fda>] double_lock_balance+0x12/0x48
> ^M^@ [<c01157e4>] load_balance+0x78/0x170
> ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
> ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
> ^M^@ [<c01225b3>] update_process_times+0xef/0x100
> ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
> ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
> ^M^@ [<c0100bb0>] default_idle+0x0/0x2c
> ^M^@ [<c0100bd3>] default_idle+0x23/0x2c
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 31
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7464000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 24
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7434000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 10
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7412000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7ea6920 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 26
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7438000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c01154a3>] CPU: 13
> ^M^@EIP is at find_busiest_group+0x103/0x2f8
> ^M^@ EFLAGS: 00000086 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000005 EBX: 00000005 ECX: c1050aa0 EDX: 00000000
> ^M^@ESI: c04813ac EDI: 00000200 EBP: d741be7c DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7e7e070 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c01157a2>] load_balance+0x36/0x170
> ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
> ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
> ^M^@ [<c01225b3>] update_process_times+0xef/0x100
> ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
> ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
> ^M^@ [<c0100bb0>] default_idle+0x0/0x2c
> ^M^@ [<c0100bd3>] default_idle+0x23/0x2c
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 28
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d745e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c011e897>] CPU: 12
> ^M^@EIP is at __do_softirq+0x47/0x100
> ^M^@ EFLAGS: 00000006 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: c0470380 EBX: c0476020 ECX: 00000030 EDX: c1075ce0
> ^M^@ESI: 00000002 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7f54000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c011e97f>] do_softirq+0x2f/0x34
> ^M^@ [<c011ea24>] irq_exit+0x34/0x38
> ^M^@ [<c010f601>] smp_apic_timer_interrupt+0xdd/0xe4
> ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
> ^M^@ [<c0100bb0>] default_idle+0x0/0x2c
> ^M^@ [<c0100bd3>] default_idle+0x23/0x2c
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 9
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7410000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7f1d900 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 11
> ^M^@ EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7414000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 23
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7432000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 6
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7408000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 22
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: d7430000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ Dazed and confused, but trying to continue
> ^M^@Do you have a strange power saving mode enabled?
> ^M^@----------- IPI show regs -----------
> ^M^@Pid: 0, comm: swapper
> ^M^@EIP: 0060:[<c0100bd3>] CPU: 1
> ^M^@EIP is at default_idle+0x23/0x2c
> ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
> ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
> ^M^@ESI: c13fc000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
> ^M^@CR0: 8005003b CR2: b7ee1d9c CR3: 17771640 CR4: 000006b0
> ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
> ^M^@ [<c010e79d>] start_secondary+0x13d/0x140
> ^M^@ ^M
>
>
>
>
>
>
[-- Attachment #2: altsysrq-p-cleanup --]
[-- Type: text/plain, Size: 1080 bytes --]
--- ./arch/i386/kernel/traps.c.xxx 2005-05-10 18:27:04.000000000 +0400
+++ ./arch/i386/kernel/traps.c 2005-06-10 14:18:32.000000000 +0400
@@ -574,6 +574,14 @@ void die_nmi (struct pt_regs *regs, cons
do_exit(SIGSEGV);
}
+static int dummy_nmi_callback(struct pt_regs * regs, int cpu)
+{
+ return 0;
+}
+
+static nmi_callback_t nmi_callback = dummy_nmi_callback;
+static nmi_callback_t nmi_ipi_callback = dummy_nmi_callback;
+
static void default_do_nmi(struct pt_regs * regs)
{
unsigned char reason = 0;
@@ -596,6 +604,9 @@ static void default_do_nmi(struct pt_reg
return;
}
#endif
+ if (nmi_ipi_callback != dummy_nmi_callback)
+ return;
+
unknown_nmi_error(reason, regs);
return;
}
@@ -612,14 +623,6 @@ static void default_do_nmi(struct pt_reg
reassert_nmi();
}
-static int dummy_nmi_callback(struct pt_regs * regs, int cpu)
-{
- return 0;
-}
-
-static nmi_callback_t nmi_callback = dummy_nmi_callback;
-static nmi_callback_t nmi_ipi_callback = dummy_nmi_callback;
-
fastcall void do_nmi(struct pt_regs * regs, long error_code)
{
int cpu;
* Re: 2.6.12-rc6-mm1
2005-06-10 12:03 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-10 14:19 ` Con Kolivas
2005-06-10 23:14 ` 2.6.12-rc6-mm1 J.A. Magallon
2005-06-10 23:50 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 2 replies; 101+ messages in thread
From: Con Kolivas @ 2005-06-10 14:19 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Martin J. Bligh, Andrew Morton, Christoph Lameter,
Nick Piggin
[-- Attachment #1.1: Type: text/plain, Size: 2194 bytes --]
On Fri, 10 Jun 2005 22:03, Con Kolivas wrote:
> On Fri, 10 Jun 2005 17:02, Ingo Molnar wrote:
> > * Martin J. Bligh <mbligh@mbligh.org> wrote:
> > > > I'm assuming it was the CPU scheduler patches. There are 36 of them
> > > > ;)
> > So the
> > candidates for the regression are:
> >
> > sched-implement-nice-support-across-physical-cpus-on-smp.patch
> > sched-change_prio_bias_only_if_queued.patch
> > sched-account_rt_tasks_in_prio_bias.patch
> > consolidate-preempt-options-into-kernel-kconfigpreempt.patch
> > enable-preempt_bkl-on-preemptsmp-too.patch
> > sched-tweak-idle-thread-setup-semantics.patch
> > sched-voluntary-kernel-preemption.patch
> > sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
> > sched-task_noninteractive.patch
> > sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
> These tend to run together, so just try adding my four patches as a group. In
> retrospect, I guess they're likely candidates because they also change the
> _ratio_ of balance, which they should not, so they are currently buggy as a
> group. That is easy enough to fix, but testing them together will make it easy
> to pinpoint the problem if they're responsible.
>
> sched-implement-nice-support-across-physical-cpus-on-smp.patch
> sched-change_prio_bias_only_if_queued.patch
> sched-account_rt_tasks_in_prio_bias.patch
> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
By the way it has already been decided to remove these patches from -mm
pending the completion of current scheduler work. If they turn out to be
responsible for this regression I apologise profusely :-|.
It is clearer to me now that I have made a mistake with the priority biasing,
and the following patch corrects it to the planned behaviour. This is
academic at this stage as we won't be looking at this particular feature
again in earnest until the other 32 scheduler patches (and any followups) go
upstream.
It's already known that schedstats data will be off without further code to
understand smp nice as well (thanks Nick for pointing out the data)... more
academic stuff but obviously something to consider when/if we get there.
Cheers,
Con
[-- Attachment #1.2: sched-correct_smp_nice_bias.patch --]
[-- Type: text/x-diff, Size: 1621 bytes --]
The priority biasing was off: it multiplied the total load by the total
priority bias, which ruins the ratio of loads between runqueues. This
patch should correct the ratios of loads between runqueues to be proportional
to overall load.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Index: linux-2.6.12-rc6-mm1/kernel/sched.c
===================================================================
--- linux-2.6.12-rc6-mm1.orig/kernel/sched.c 2005-06-10 23:56:56.000000000 +1000
+++ linux-2.6.12-rc6-mm1/kernel/sched.c 2005-06-10 23:59:57.000000000 +1000
@@ -978,7 +978,7 @@ static inline unsigned long __source_loa
else
source_load = min(cpu_load, load_now);
- if (idle == NOT_IDLE || rq->nr_running > 1)
+ if (idle == NOT_IDLE || rq->nr_running > 1) {
/*
* If we are busy rebalancing the load is biased by
* priority to create 'nice' support across cpus. When
@@ -987,7 +987,10 @@ static inline unsigned long __source_loa
* prevent idle rebalance from trying to pull tasks from a
* queue with only one running task.
*/
- source_load *= rq->prio_bias;
+ unsigned long prio_bias = rq->prio_bias / rq->nr_running;
+
+ source_load *= prio_bias;
+ }
return source_load;
}
@@ -1011,8 +1014,11 @@ static inline unsigned long __target_loa
else
target_load = max(cpu_load, load_now);
- if (idle == NOT_IDLE || rq->nr_running > 1)
- target_load *= rq->prio_bias;
+ if (idle == NOT_IDLE || rq->nr_running > 1) {
+ unsigned long prio_bias = rq->prio_bias / rq->nr_running;
+
+ target_load *= prio_bias;
+ }
return target_load;
}
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-10 14:19 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-10 23:14 ` J.A. Magallon
2005-06-10 23:59 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-10 23:50 ` 2.6.12-rc6-mm1 Martin J. Bligh
1 sibling, 1 reply; 101+ messages in thread
From: J.A. Magallon @ 2005-06-10 23:14 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Martin J. Bligh, Andrew Morton,
Christoph Lameter, Nick Piggin
On 06.10, Con Kolivas wrote:
> The priority biasing was off by multiplying the total load by the total
> priority bias and this ruins the ratio of loads between runqueues. This
> patch should correct the ratios of loads between runqueues to be proportional
> to overall load.
>
2.6.12-rc6-mm1 + this patch just oopses nicely on boot.
I did not have a digital camera handy, but the first oops that fit on the
screen was this call chain:
kernel_thread_helper
init
init
do_basic_setup
usermodehelper_init
__create_workqueue
EIP in try_to_wake_up
After this, there was another with some do_div_error calls...
Something looks uninitialized the first time, or the integer arithmetic
is wrong. I really don't like a*(b/c); I much prefer (a*b)/c. It is more
common for b/c to be 0 (because b<c) than for (a*b) to overflow.
So I tried both. With this, it boots again:
--- linux-2.6.11-jam24/kernel/sched.c.orig 2005-06-11 00:59:44.000000000 +0200
+++ linux-2.6.11-jam24/kernel/sched.c 2005-06-11 01:03:32.000000000 +0200
@@ -987,9 +987,10 @@
* prevent idle rebalance from trying to pull tasks from a
* queue with only one running task.
*/
- unsigned long prio_bias = rq->prio_bias / rq->nr_running;
+ unsigned long prio_scale = (rq->nr_running > 0 ?
+ rq->nr_running : 1);
- source_load *= prio_bias;
+ source_load = (source_load*rq->prio_bias) / prio_scale;
}
return source_load;
@@ -1015,9 +1016,10 @@
target_load = max(cpu_load, load_now);
if (idle == NOT_IDLE || rq->nr_running > 1) {
- unsigned long prio_bias = rq->prio_bias / rq->nr_running;
+ unsigned long prio_scale = (rq->nr_running > 0 ?
+ rq->nr_running : 1);
- target_load *= prio_bias;
+ target_load = (target_load*rq->prio_bias) / prio_scale;
}
return target_load;
Perhaps this:
if (idle == NOT_IDLE || rq->nr_running > 1)
should be
if (idle == NOT_IDLE && rq->nr_running > 1)
???
Hope this helps, thanks.
> Signed-off-by: Con Kolivas <kernel@kolivas.org>
>
> Index: linux-2.6.12-rc6-mm1/kernel/sched.c
> ===================================================================
> --- linux-2.6.12-rc6-mm1.orig/kernel/sched.c 2005-06-10 23:56:56.000000000 +1000
> +++ linux-2.6.12-rc6-mm1/kernel/sched.c 2005-06-10 23:59:57.000000000 +1000
> @@ -978,7 +978,7 @@ static inline unsigned long __source_loa
> else
> source_load = min(cpu_load, load_now);
>
> - if (idle == NOT_IDLE || rq->nr_running > 1)
> + if (idle == NOT_IDLE || rq->nr_running > 1) {
> /*
> * If we are busy rebalancing the load is biased by
> * priority to create 'nice' support across cpus. When
> @@ -987,7 +987,10 @@ static inline unsigned long __source_loa
> * prevent idle rebalance from trying to pull tasks from a
> * queue with only one running task.
> */
> - source_load *= rq->prio_bias;
> + unsigned long prio_bias = rq->prio_bias / rq->nr_running;
> +
> + source_load *= prio_bias;
> + }
>
> return source_load;
> }
> @@ -1011,8 +1014,11 @@ static inline unsigned long __target_loa
> else
> target_load = max(cpu_load, load_now);
>
> - if (idle == NOT_IDLE || rq->nr_running > 1)
> - target_load *= rq->prio_bias;
> + if (idle == NOT_IDLE || rq->nr_running > 1) {
> + unsigned long prio_bias = rq->prio_bias / rq->nr_running;
> +
> + target_load *= prio_bias;
> + }
>
> return target_load;
> }
>
--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandriva Linux release 2006.0 (Cooker) for i586
Linux 2.6.11-jam24 (gcc 4.0.0 (4.0.0-3mdk for Mandriva Linux release 2006.0))
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-10 14:19 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-10 23:14 ` 2.6.12-rc6-mm1 J.A. Magallon
@ 2005-06-10 23:50 ` Martin J. Bligh
2005-06-11 4:14 ` 2.6.12-rc6-mm1 Martin J. Bligh
1 sibling, 1 reply; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-10 23:50 UTC (permalink / raw)
To: Con Kolivas, linux-kernel
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin
>> These tend to run together so just try adding my four patches together. In
>> retrospect I guess they're likely candidates because they also change the
>> _ratio_ of balance which they should not so they are buggy as a group
>> currently. Easy enough to fix but it will make it easy to pinpoint the
>> problem if they're responsible.
>>
>> sched-implement-nice-support-across-physical-cpus-on-smp.patch
>> sched-change_prio_bias_only_if_queued.patch
>> sched-account_rt_tasks_in_prio_bias.patch
>> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
>
> By the way it has already been decided to remove these patches from -mm
> pending the completion of current scheduler work. If they turn out to be
> responsible for this regression I apologise profusely :-|.
>
> It is clearer to me now that I have made a mistake with the priority biasing,
> and the following patch corrects it to the planned behaviour. This is
> academic at this stage as we won't be looking at this particular feature
> again in earnest until the other 32 scheduler patches (and any followups) go
> upstream.
>
> It's already known that schedstats data will be off without further code to
> understand smp nice as well (thanks Nick for pointing out the data)... more
> academic stuff but obviously something to consider when/if we get there.
OK, I backed out those 4, and the degradation mostly went away.
See http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
and more specifically, see the +p5150 near the right hand side.
I don't think it's quite as good as mainline, but much closer.
I did this run with HZ=1000, and the one with no scheduler
patches at all with HZ=250, so I'll try to do a run that's more
directly comparable as well.
Thanks,
M.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-10 23:14 ` 2.6.12-rc6-mm1 J.A. Magallon
@ 2005-06-10 23:59 ` Con Kolivas
2005-06-11 0:18 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 0:32 ` 2.6.12-rc6-mm1 J.A. Magallon
0 siblings, 2 replies; 101+ messages in thread
From: Con Kolivas @ 2005-06-10 23:59 UTC (permalink / raw)
To: J.A. Magallon
Cc: linux-kernel, Ingo Molnar, Martin J. Bligh, Andrew Morton,
Christoph Lameter, Nick Piggin
[-- Attachment #1.1: Type: text/plain, Size: 1816 bytes --]
On Sat, 11 Jun 2005 09:14, J.A. Magallon wrote:
> On 06.10, Con Kolivas wrote:
> > The priority biasing was off by multiplying the total load by the total
> > priority bias and this ruins the ratio of loads between runqueues. This
> > patch should correct the ratios of loads between runqueues to be
> > proportional to overall load.
>
> 2.6.12-rc6-mm1 + this patch just oopses nicely on boot.
> I did not have a digital camera handy, but the first oops that fit on the
> screen was this call chain:
>
> kernel_thread_helper
> init
> init
> do_basic_setup
> usermodehelper_init
> __create_workqueue
> EIP in try_to_wake_up
>
> After this, there was another with some do_div_error calls...
>
> Something looks uninitialized the first time, or the integer arithmetic
> is wrong. I really don't like a*(b/c); I much prefer (a*b)/c. It is more
> common for b/c to be 0 (because b<c) than for (a*b) to overflow.
>
> So I tried both. With this, it boots again:
Doh Doh DOH DOH!
I need a real swift kick up the bum. The point of the patch was to show what
was wrong with the math, and I shouldn't have posted it without actually
trying it.
> - unsigned long prio_bias = rq->prio_bias / rq->nr_running;
rq->nr_running can often be 0 and rq->prio_bias by definition has to be larger
than or equal to rq->nr_running.
> Perhaps this:
>
> if (idle == NOT_IDLE || rq->nr_running > 1)
>
> should be
>
> if (idle == NOT_IDLE && rq->nr_running > 1)
No, testing for rq->nr_running > 1 is only needed if we are balancing in an
idle balance.
> Hope this helps, thanks.
Yes it does :\
Here is what the patch _should_ have been. (*same warnings with this patch
about math demonstration and untested as should have been posted with the
earlier one*)
Con
[-- Attachment #1.2: sched-correct_smp_nice_bias.patch --]
[-- Type: text/x-diff, Size: 1720 bytes --]
The priority biasing was off by multiplying the total load by the total
priority bias and this ruins the ratio of loads between runqueues. This
patch should correct the ratios of loads between runqueues to be proportional
to overall load. -2nd attempt.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Index: linux-2.6.12-rc6-mm1/kernel/sched.c
===================================================================
--- linux-2.6.12-rc6-mm1.orig/kernel/sched.c 2005-06-10 23:56:56.000000000 +1000
+++ linux-2.6.12-rc6-mm1/kernel/sched.c 2005-06-11 09:55:56.000000000 +1000
@@ -978,7 +978,8 @@ static inline unsigned long __source_loa
else
source_load = min(cpu_load, load_now);
- if (idle == NOT_IDLE || rq->nr_running > 1)
+ if (idle == NOT_IDLE || rq->nr_running > 1) {
+ unsigned long prio_bias = 1;
/*
* If we are busy rebalancing the load is biased by
* priority to create 'nice' support across cpus. When
@@ -987,7 +988,10 @@ static inline unsigned long __source_loa
* prevent idle rebalance from trying to pull tasks from a
* queue with only one running task.
*/
- source_load *= rq->prio_bias;
+ if (rq->nr_running)
+ prio_bias = rq->prio_bias / rq->nr_running;
+ source_load *= prio_bias;
+ }
return source_load;
}
@@ -1011,8 +1015,13 @@ static inline unsigned long __target_loa
else
target_load = max(cpu_load, load_now);
- if (idle == NOT_IDLE || rq->nr_running > 1)
- target_load *= rq->prio_bias;
+ if (idle == NOT_IDLE || rq->nr_running > 1) {
+ unsigned long prio_bias = 1;
+
+ if (rq->nr_running)
+ prio_bias = rq->prio_bias / rq->nr_running;
+ target_load *= prio_bias;
+ }
return target_load;
}
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-10 23:59 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-11 0:18 ` Con Kolivas
2005-06-11 0:32 ` 2.6.12-rc6-mm1 J.A. Magallon
1 sibling, 0 replies; 101+ messages in thread
From: Con Kolivas @ 2005-06-11 0:18 UTC (permalink / raw)
To: linux-kernel
Cc: J.A. Magallon, Ingo Molnar, Martin J. Bligh, Andrew Morton,
Christoph Lameter, Nick Piggin
[-- Attachment #1: Type: text/plain, Size: 447 bytes --]
On Sat, 11 Jun 2005 09:59, Con Kolivas wrote:
> Here is what the patch _should_ have been. (*same warnings with this patch
> about math demonstration and untested as should have been posted with the
> earlier one*)
Ok I booted this patch and all seems fine. Thanks to those that tracked down
this regression and the bugs, and apologies for the inconvenience. Looks like
Martin's automated testbed is already paying off.
Cheers,
Con
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-10 23:59 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 0:18 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-11 0:32 ` J.A. Magallon
2005-06-11 0:48 ` 2.6.12-rc6-mm1 Con Kolivas
1 sibling, 1 reply; 101+ messages in thread
From: J.A. Magallon @ 2005-06-11 0:32 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Martin J. Bligh, Andrew Morton,
Christoph Lameter, Nick Piggin
On 06.11, Con Kolivas wrote:
>
> Here is what the patch _should_ have been. (*same warnings with this patch
> about math demonstration and untested as should have been posted with the
> earlier one*)
>
> + if (idle == NOT_IDLE || rq->nr_running > 1) {
> + unsigned long prio_bias = 1;
> + if (rq->nr_running)
> + prio_bias = rq->prio_bias / rq->nr_running;
> + source_load *= prio_bias;
> + }
>
Again... sorry, I'm not trying to be picky, I just want to know whether it's
worth it or not...
Would something like this not be better:
if (idle == NOT_IDLE || rq->nr_running > 1) {
if (rq->nr_running)
source_load = (source_load*rq->prio_bias) / rq->nr_running;
}
wrt the integer math? Think of
100*( 5/5) vs 500/5
100*( 6/5) vs 600/5
100*( 7/5) vs 700/5
100*( 8/5) vs 800/5
100*( 9/5) vs 900/5
100*(10/5) vs 1000/5
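
The table above can be checked with a small standalone C sketch. The helper
names are made up for illustration and are not taken from kernel/sched.c;
returning the load unchanged when nr_running is 0 is likewise only an
assumption made here so the sketch never divides by zero.

```c
/* Hypothetical helpers for illustration; not code from kernel/sched.c. */

/* a*(b/c): the quotient is truncated before it scales the load. */
unsigned long scale_divide_first(unsigned long load,
				 unsigned long prio_bias,
				 unsigned long nr_running)
{
	if (!nr_running)
		return load;	/* assumption: leave the load unbiased */
	return load * (prio_bias / nr_running);
}

/* (a*b)/c: truncation happens once, after the multiplication. */
unsigned long scale_multiply_first(unsigned long load,
				   unsigned long prio_bias,
				   unsigned long nr_running)
{
	if (!nr_running)
		return load;	/* assumption: leave the load unbiased */
	return load * prio_bias / nr_running;
}
```

With load=100 and nr_running=5, the divide-first form stays at 100 for every
bias from 5 to 9 and only jumps to 200 at a bias of 10, while the
multiply-first form follows the right-hand column of the table. The price of
multiplying first is that load*prio_bias has to fit in an unsigned long.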
--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandriva Linux release 2006.0 (Cooker) for i586
Linux 2.6.11-jam24 (gcc 4.0.0 (4.0.0-3mdk for Mandriva Linux release 2006.0))
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 0:32 ` 2.6.12-rc6-mm1 J.A. Magallon
@ 2005-06-11 0:48 ` Con Kolivas
2005-06-11 0:52 ` 2.6.12-rc6-mm1 Con Kolivas
0 siblings, 1 reply; 101+ messages in thread
From: Con Kolivas @ 2005-06-11 0:48 UTC (permalink / raw)
To: J.A. Magallon
Cc: linux-kernel, Ingo Molnar, Martin J. Bligh, Andrew Morton,
Christoph Lameter, Nick Piggin
[-- Attachment #1: Type: text/plain, Size: 931 bytes --]
On Sat, 11 Jun 2005 10:32, J.A. Magallon wrote:
> On 06.11, Con Kolivas wrote:
> > Here is what the patch _should_ have been. (*same warnings with this
> > patch about math demonstration and untested as should have been posted
> > with the earlier one*)
> >
> > + if (idle == NOT_IDLE || rq->nr_running > 1) {
> > + unsigned long prio_bias = 1;
> > + if (rq->nr_running)
> > + prio_bias = rq->prio_bias / rq->nr_running;
> > + source_load *= prio_bias;
> > + }
>
> Again... sorry, I'm not trying to be picky, I just want to know whether it's
> worth it or not...
>
> Would something like this not be better:
>
> if (idle == NOT_IDLE || rq->nr_running > 1) {
> if (rq->nr_running)
> source_load = (source_load*rq->prio_bias) / rq->nr_running;
> }
I understand your concern, but by definition rq->nr_running will always be a
factor of rq->prio_bias so integer math should be fine. Either will do.
Cheers,
Con
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 0:48 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-11 0:52 ` Con Kolivas
0 siblings, 0 replies; 101+ messages in thread
From: Con Kolivas @ 2005-06-11 0:52 UTC (permalink / raw)
To: J.A. Magallon
Cc: linux-kernel, Ingo Molnar, Martin J. Bligh, Andrew Morton,
Christoph Lameter, Nick Piggin
On Sat, 11 Jun 2005 10:48, Con Kolivas wrote:
> On Sat, 11 Jun 2005 10:32, J.A. Magallon wrote:
> > On 06.11, Con Kolivas wrote:
> > > Here is what the patch _should_ have been. (*same warnings with this
> > > patch about math demonstration and untested as should have been posted
> > > with the earlier one*)
> > >
> > > + if (idle == NOT_IDLE || rq->nr_running > 1) {
> > > + unsigned long prio_bias = 1;
> > > + if (rq->nr_running)
> > > + prio_bias = rq->prio_bias / rq->nr_running;
> > > + source_load *= prio_bias;
> > > + }
> >
> > Again... sorry, I'm not trying to be picky, I just want to know whether
> > it's worth it or not...
> >
> > Would something like this not be better:
> >
> > if (idle == NOT_IDLE || rq->nr_running > 1) {
> > if (rq->nr_running)
> > source_load = (source_load*rq->prio_bias) / rq->nr_running;
> > }
>
> I understand your concern, but by definition rq->nr_running will always be
> a factor of rq->prio_bias so integer math should be fine. Either will do.
Hmm. No you are right and I'm smoking crack, but integer math should still be
accurate enough here. Let me think about the accuracy before spraying more
patches like a fool.
Cheers,
Con
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-10 23:50 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-11 4:14 ` Martin J. Bligh
2005-06-11 5:22 ` 2.6.12-rc6-mm1 Con Kolivas
0 siblings, 1 reply; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-11 4:14 UTC (permalink / raw)
To: Con Kolivas, linux-kernel
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin
--"Martin J. Bligh" <mbligh@mbligh.org> wrote (on Friday, June 10, 2005 16:50:40 -0700):
>>> These tend to run together so just try adding my four patches together. In
>>> retrospect I guess they're likely candidates because they also change the
>>> _ratio_ of balance which they should not so they are buggy as a group
>>> currently. Easy enough to fix but it will make it easy to pinpoint the
>>> problem if they're responsible.
>>>
>>> sched-implement-nice-support-across-physical-cpus-on-smp.patch
>>> sched-change_prio_bias_only_if_queued.patch
>>> sched-account_rt_tasks_in_prio_bias.patch
>>> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
>>
>> By the way it has already been decided to remove these patches from -mm
>> pending the completion of current scheduler work. If they turn out to be
>> responsible for this regression I apologise profusely :-|.
>>
>> It is clearer to me now that I have made a mistake with the priority biasing,
>> and the following patch corrects it to the planned behaviour. This is
>> academic at this stage as we won't be looking at this particular feature
>> again in earnest until the other 32 scheduler patches (and any followups) go
>> upstream.
>>
>> It's already known that schedstats data will be off without further code to
>> understand smp nice as well (thanks Nick for pointing out the data)... more
>> academic stuff but obviously something to consider when/if we get there.
>
> OK, I backed out those 4, and the degradation mostly went away.
> See http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>
> and more specifically, see the +p5150 near the right hand side.
> I don't think it's quite as good as mainline, but much closer.
> I did this run with HZ=1000, and the one with no scheduler
> patches at all with HZ=250, so I'll try to do a run that's more
> directly comparable as well.
OK, that makes it look much more like mainline. Looks like you were still
revising the details of your patch Con ... once you're ready, drop me a
URL for it, and I'll make the system whack on that too.
M.
PS. Hmmm. I need to get better at identifying what +p5150 means in the
graphs, etc ;-( Maybe HTML explanation with embedded png image.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 4:14 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-11 5:22 ` Con Kolivas
2005-06-11 5:56 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-11 20:13 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 2 replies; 101+ messages in thread
From: Con Kolivas @ 2005-06-11 5:22 UTC (permalink / raw)
To: Martin J. Bligh
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter,
Nick Piggin
[-- Attachment #1: Type: text/plain, Size: 1126 bytes --]
On Sat, 11 Jun 2005 14:14, Martin J. Bligh wrote:
> --"Martin J. Bligh" <mbligh@mbligh.org> wrote (on Friday, June 10, 2005 16:50:40 -0700):
> > OK, I backed out those 4, and the degradation mostly went away.
> > See http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
> >
> > and more specifically, see the +p5150 near the right hand side.
> > I don't think it's quite as good as mainline, but much closer.
> > I did this run with HZ=1000, and the one with no scheduler
> > patches at all with HZ=250, so I'll try to do a run that's more
> > directly comparable as well.
>
> OK, that makes it look much more like mainline. Looks like you were still
> revising the details of your patch Con ... once you're ready, drop me a
> URL for it, and I'll make the system whack on that too.
Great thanks. Here are rolled up all the reconsidered changes that apply
directly to 2.6.12-rc6-mm1 -purely for testing purposes-. I'd be very
grateful to see how this performed; it has been boot and stress tested at
this end. If it shows detriment I'll have to make the smp nice changes more
complex.
Cheers,
Con
[-- Attachment #2: 2.6.12-rc6-mm1-mjbtest.patch --]
[-- Type: text/x-diff, Size: 1253 bytes --]
Index: linux-2.6.12-rc6-mm1/kernel/sched.c
===================================================================
--- linux-2.6.12-rc6-mm1.orig/kernel/sched.c 2005-06-10 23:56:56.000000000 +1000
+++ linux-2.6.12-rc6-mm1/kernel/sched.c 2005-06-11 11:48:09.000000000 +1000
@@ -978,7 +978,7 @@ static inline unsigned long __source_loa
else
source_load = min(cpu_load, load_now);
- if (idle == NOT_IDLE || rq->nr_running > 1)
+ if (rq->nr_running > 1 || (idle == NOT_IDLE && rq->nr_running))
/*
* If we are busy rebalancing the load is biased by
* priority to create 'nice' support across cpus. When
@@ -987,7 +987,7 @@ static inline unsigned long __source_loa
* prevent idle rebalance from trying to pull tasks from a
* queue with only one running task.
*/
- source_load *= rq->prio_bias;
+ source_load = source_load * rq->prio_bias / rq->nr_running;
return source_load;
}
@@ -1011,8 +1011,8 @@ static inline unsigned long __target_loa
else
target_load = max(cpu_load, load_now);
- if (idle == NOT_IDLE || rq->nr_running > 1)
- target_load *= rq->prio_bias;
+ if (rq->nr_running > 1 || (idle == NOT_IDLE && rq->nr_running))
+ target_load = target_load * rq->prio_bias / rq->nr_running;
return target_load;
}
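
Read outside the diff, the rolled-up change amounts to roughly the following.
This is a standalone sketch rather than kernel code: struct rq_sketch,
enum idle_state, IDLE_BALANCE and biased_load() are stand-ins invented for
illustration, and only NOT_IDLE, nr_running and prio_bias are names taken
from the patch.

```c
/* Stand-in types for illustration; the real ones live in kernel/sched.c. */
enum idle_state { IDLE_BALANCE, NOT_IDLE };	/* IDLE_BALANCE is hypothetical */

struct rq_sketch {
	unsigned long nr_running;	/* tasks queued on this runqueue */
	unsigned long prio_bias;	/* summed priority bias of those tasks */
};

unsigned long biased_load(unsigned long load, enum idle_state idle,
			  const struct rq_sketch *rq)
{
	/*
	 * Both arms of the condition imply nr_running != 0, so the
	 * division below can never be a division by zero.
	 */
	if (rq->nr_running > 1 || (idle == NOT_IDLE && rq->nr_running))
		load = load * rq->prio_bias / rq->nr_running;

	return load;
}
```

An empty runqueue, or a queue with a single task during an idle balance,
passes its load through unbiased. Every other case is scaled by the average
bias per task, multiplying before dividing so the quotient is truncated only
once.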
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 5:22 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-11 5:56 ` Martin J. Bligh
2005-06-11 20:13 ` 2.6.12-rc6-mm1 Martin J. Bligh
1 sibling, 0 replies; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-11 5:56 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter,
Nick Piggin
--Con Kolivas <kernel@kolivas.org> wrote (on Saturday, June 11, 2005 15:22:30 +1000):
> On Sat, 11 Jun 2005 14:14, Martin J. Bligh wrote:
>> --"Martin J. Bligh" <mbligh@mbligh.org> wrote (on Friday, June 10, 2005 16:50:40 -0700):
>> > OK, I backed out those 4, and the degradation mostly went away.
>> > See http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>> >
>> > and more specifically, see the +p5150 near the right hand side.
>> > I don't think it's quite as good as mainline, but much closer.
>> > I did this run with HZ=1000, and the one with no scheduler
>> > patches at all with HZ=250, so I'll try to do a run that's more
>> > directly comparable as well.
>>
>> OK, that makes it look much more like mainline. Looks like you were still
>> revising the details of your patch Con ... once you're ready, drop me a
>> URL for it, and I'll make the system whack on that too.
>
> Great thanks. Here are rolled up all the reconsidered changes that apply
> directly to 2.6.12-rc6-mm1 -purely for testing purposes-. I'd be very
> grateful to see how this performed; it has been boot and stress tested at
> this end. If it shows detriment I'll have to make the smp nice changes more
> complex.
Kicked it off - should appear in a few hours as
http://mbligh.org/abat/con_sched_test
M.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (6 preceding siblings ...)
2005-06-08 14:33 ` BUG in i2c_detach_client Andrew James Wade
@ 2005-06-11 11:51 ` Benoit Boissinot
2005-06-18 22:39 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-21 13:20 ` 2.6.12-rc6-mm1 Dominik Karall
9 siblings, 0 replies; 101+ messages in thread
From: Benoit Boissinot @ 2005-06-11 11:51 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
On 6/7/05, Andrew Morton <akpm@osdl.org> wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
>
> - Added v9fs
>
> - Various random fixes
>
> - Probably a similar number of breakages
>
I just had the following Oopses:
Unable to handle kernel paging request at virtual address 901a1960
printing eip:
c0139251
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: radeon drm tun snd_seq snd_pcm_oss snd_mixer_oss
snd_via82xx snd_ac97_codec snd_pcm snd_timer snd_page_alloc
snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore ipt_multiport
ipt_state ipt_limit ipt_MASQUERADE ipt_mark iptable_mangle ipt_MARK
ipt_REJECT iptable_filter iptable_nat ip_tables ip_conntrack_irc
ip_conntrack_ftp ip_conntrack skge 8139too mii usbcore ide_cd cdrom
CPU: 0
EIP: 0060:[<c0139251>] Not tainted VLI
EFLAGS: 00010086 (2.6.12-rc6-mm1-arakou)
EIP is at find_lock_page+0x21/0xb0
eax: 901a195c ebx: 901a195c ecx: d8a3b094 edx: 00000003
esi: 00109380 edi: c18e4b08 ebp: d822cb10 esp: d822cb00
ds: 007b es: 007b ss: 0068
Process emerge (pid: 31977, threadinfo=d822c000 task=cbb9d040)
Stack: c18e4b04 c1218060 00000000 00000050 d822cb34 c013930e 00000050 00109380
c18e4b04 c0333d04 00109380 c18e4a00 00001000 d822cb50 c0157986 d822cb50
00109380 00109380 00001000 c18e4a00 d822cb70 c0157af5 00001000 d822cb70
Call Trace:
[<c0103d17>] show_stack+0x97/0xd0
[<c0103ec5>] show_registers+0x155/0x1f0
[<c01040c1>] die+0xc1/0x140
[<c01157ec>] do_page_fault+0x23c/0x6b5
[<c010395f>] error_code+0x4f/0x54
[<c013930e>] find_or_create_page+0x2e/0xd0
[<c0157986>] grow_dev_page+0x26/0x110
[<c0157af5>] __getblk_slow+0x85/0x130
[<c0157e8b>] __getblk+0x3b/0x50
[<c01a788b>] search_by_key+0x9b/0xf40
[<c0195095>] reiserfs_read_locked_inode+0x65/0x110
[<c01951e9>] reiserfs_iget+0x79/0xa0
[<c0190330>] reiserfs_lookup+0xd0/0x130
[<c0161f80>] real_lookup+0xb0/0xd0
[<c01622be>] do_lookup+0x7e/0x90
[<c0162a06>] __link_path_walk+0x736/0xd50
[<c016306a>] link_path_walk+0x4a/0x110
[<c01633b4>] path_lookup+0x74/0x120
[<c01635ee>] __user_walk+0x2e/0x50
[<c015e240>] vfs_stat+0x20/0x50
[<c015e834>] sys_stat64+0x14/0x30
[<c0102e0f>] sysenter_past_esp+0x54/0x75
Code: c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 57 56 89 d6 53 83 ec 04
89 45 f0 fa 8d 78 04 89 f2 89 f8 e8 35 04 0b 00 85 c0 89 c3 74 56 <ff>
40 04 0f ba 28 00 19 c0 85 c0 74 49 fb 0f ba 2b 00 19 c0 85
<1>Unable to handle kernel paging request at virtual address 71ef2710
printing eip:
c0157140
*pde = 00000000
Oops: 0000 [#2]
Modules linked in: radeon drm tun snd_seq snd_pcm_oss snd_mixer_oss
snd_via82xx snd_ac97_codec snd_pcm snd_timer snd_page_alloc
snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore ipt_multiport
ipt_state ipt_limit ipt_MASQUERADE ipt_mark iptable_mangle ipt_MARK
ipt_REJECT iptable_filter iptable_nat ip_tables ip_conntrack_irc
ip_conntrack_ftp ip_conntrack skge 8139too mii usbcore ide_cd cdrom
CPU: 0
EIP: 0060:[<c0157140>] Not tainted VLI
EFLAGS: 00010a16 (2.6.12-rc6-mm1-arakou)
EIP is at __find_get_block_slow+0x90/0x140
eax: 00000000 ebx: 71ef26fc ecx: cb96f0e7 edx: 00000001
esi: c1309f20 edi: 000f9df5 ebp: e35d5b98 esp: e35d5b74
ds: 007b es: 007b ss: 0068
Process vim (pid: 32081, threadinfo=e35d5000 task=c2cc55b0)
Stack: df7fb6bc f6de8a4c f7cf12fc f6de8cec 00000002 c18e4584 dcb43d7c c18e4520
000f9df5 e35d5bac c0157e1c 00001000 000f9df5 c18e4520 e35d5bc0 c0157e6c
00003e94 0000003e 000f9df5 e35d5ce0 c01a788b 0000001e 0000001f e35d5bf0
Call Trace:
[<c0103d17>] show_stack+0x97/0xd0
[<c0103ec5>] show_registers+0x155/0x1f0
[<c01040c1>] die+0xc1/0x140
[<c01157ec>] do_page_fault+0x23c/0x6b5
[<c010395f>] error_code+0x4f/0x54
[<c0157e1c>] __find_get_block+0x6c/0xa0
[<c0157e6c>] __getblk+0x1c/0x50
[<c01a788b>] search_by_key+0x9b/0xf40
[<c018fc2c>] search_by_entry_key+0x1c/0x1f0
[<c01901e0>] reiserfs_find_entry+0x90/0x110
[<c01902d2>] reiserfs_lookup+0x72/0x130
[<c0161f80>] real_lookup+0xb0/0xd0
[<c01622be>] do_lookup+0x7e/0x90
[<c0162a06>] __link_path_walk+0x736/0xd50
[<c016306a>] link_path_walk+0x4a/0x110
[<c01633b4>] path_lookup+0x74/0x120
[<c0163a09>] open_namei+0x79/0x5f0
[<c0154c29>] filp_open+0x29/0x50
[<c0154fac>] sys_open+0x3c/0xc0
[<c0102e0f>] sysenter_past_esp+0x54/0x75
Code: 89 f0 e8 34 b8 fe ff 89 d8 83 c4 18 5b 5e 5f c9 c3 8b 06 f6 c4
08 0f 84 a4 00 00 00 8b 5e 0c ba 01 00 00 00 89 d9 90 8d 74 26 00 <3b>
7b 14 74 7b 8b 03 8b 5b 04 a8 10 b8 00 00 00 00 0f 44 d0 39
Bad page state at free_hot_cold_page (in process 'firefox-bin', page c1309360)
flags:0x40000000 mapping:00000000 mapcount:-1 count:0
Backtrace:
[<c0103d67>] dump_stack+0x17/0x20
[<c013cb52>] bad_page+0x72/0xb0
[<c013d2da>] free_hot_cold_page+0x4a/0xe0
[<c013da81>] __pagevec_free+0x31/0x40
[<c0142a9d>] release_pages+0x9d/0x150
[<c0142b68>] __pagevec_release+0x18/0x30
[<c01430bb>] truncate_inode_pages_range+0x13b/0x300
[<c014329a>] truncate_inode_pages+0x1a/0x20
[<c016d8e2>] generic_delete_inode+0xb2/0xd0
[<c016da1f>] generic_drop_inode+0xf/0x20
[<c016da92>] iput+0x62/0x90
[<c016494f>] sys_unlink+0xdf/0x110
[<c0102e0f>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed
regards,
Benoit Boissinot
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 5:22 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 5:56 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-11 20:13 ` Martin J. Bligh
2005-06-11 22:20 ` 2.6.12-rc6-mm1 Con Kolivas
1 sibling, 1 reply; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-11 20:13 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter,
Nick Piggin
--Con Kolivas <kernel@kolivas.org> wrote (on Saturday, June 11, 2005 15:22:30 +1000):
> On Sat, 11 Jun 2005 14:14, Martin J. Bligh wrote:
>> --"Martin J. Bligh" <mbligh@mbligh.org> wrote (on Friday, June 10, 2005 16:50:40 -0700):
>> > OK, I backed out those 4, and the degradation mostly went away.
>> > See http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>> >
>> > and more specifically, see the +p5150 near the right hand side.
>> > I don't think it's quite as good as mainline, but much closer.
>> > I did this run with HZ=1000, and the one with no scheduler
>> > patches at all with HZ=250, so I'll try to do a run that's more
>> > directly comparable as well.
>>
>> OK, that makes it look much more like mainline. Looks like you were still
>> revising the details of your patch Con ... once you're ready, drop me a
>> URL for it, and I'll make the system whack on that too.
>
> Great thanks. Here are rolled up all the reconsidered changes that apply
> directly to 2.6.12-rc6-mm1 -purely for testing purposes-. I'd be very
> grateful to see how this performed; it has been boot and stress tested at
> this end. If it shows detriment I'll have to make the smp nice changes more
> complex.
It's much better ... but still a degradation - see point p5181 on:
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
Only really seems to hurt the NUMA box (the x440 one ... elm3b67 ... is
still trying to find its ass with both hands). I'm not necessarily saying
it's a problem ... not sure what the benefits of the patch are, but it's a
data point, at least?
M.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 20:13 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-11 22:20 ` Con Kolivas
2005-06-11 23:27 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 1 reply; 101+ messages in thread
From: Con Kolivas @ 2005-06-11 22:20 UTC (permalink / raw)
To: Martin J. Bligh
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter,
Nick Piggin
On Sun, 12 Jun 2005 06:13, Martin J. Bligh wrote:
> --Con Kolivas <kernel@kolivas.org> wrote (on Saturday, June 11, 2005 15:22:30 +1000):
> > Great thanks. Here are rolled up all the reconsidered changes that apply
> > directly to 2.6.12-rc6-mm1 -purely for testing purposes-. I'd be very
> > grateful to see how this performed; it has been boot and stress tested at
> > this end. If it shows detriment I'll have to make the smp nice changes
> > more complex.
>
> It's much better ... but still a degradation - see point p5181 on:
>
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>
> Only really seems to hurt the NUMA box (the x440 one ... elm3b67 ... is
> still trying to find its ass with both hands). I'm not necessarily saying
> it's a problem ... not sure what the benefits of the patch are, but it's a
> data point, at least?
Thanks a lot!
Just checking the numbering of the test runs with you. This is the blue line
order as plotted on the graph:
5181 is with this patch
4947 is mm1?
5150 is mm1 with the 4 patches backed out
5081 is mm1 with the 4 patches backed out and Hz changed to 100?
5169 is ?
Cheers,
Con
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 22:20 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-11 23:27 ` Martin J. Bligh
2005-06-11 23:47 ` 2.6.12-rc6-mm1 Con Kolivas
0 siblings, 1 reply; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-11 23:27 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter,
Nick Piggin
--Con Kolivas <kernel@kolivas.org> wrote (on Sunday, June 12, 2005 08:20:05 +1000):
> On Sun, 12 Jun 2005 06:13, Martin J. Bligh wrote:
>> --Con Kolivas <kernel@kolivas.org> wrote (on Saturday, June 11, 2005
>> > Great thanks. Here are rolled up all the reconsidered changes that apply
>> > directly to 2.6.12-rc6-mm1 -purely for testing purposes-. I'd be very
>> > grateful to see how this performed; it has been boot and stress tested at
>> > this end. If it shows detriment I'll have to make the smp nice changes
>> > more complex.
>>
>> It's much better ... but still a degradation - see point p5181 on:
>>
>> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>>
>> Only really seems to hurt the NUMA box (the x440 one ... elm3b67 ... is
>> still trying to find its ass with both hands). I'm not necessarily saying
>> it's a problem ... not sure what the benefits of the patch are, but it's a
>> data point, at least ?
>
> Thanks a lot!
>
> Just checking the numbering of the test runs with you. This is the blue line
> order as plotted on the graph:
>
> 5181 is with this patch
> 4947 is mm1?
> 5150 is mm1 with the 4 patches backed out
> 5081 is mm1 with the 4 patches backed out and Hz changed to 100?
> 5169 is ?
Until I get off my ass and write an html wrapper for the graphs, easiest
thing to do is just cross-reference to here:
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/regression_matrix.html
The +pXXXX numbers on the graph match the job numbers in the boxes. You
can click on the patches down the left side, and see exactly what they
were if you want.
M.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 23:27 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-11 23:47 ` Con Kolivas
2005-06-12 0:23 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 1 reply; 101+ messages in thread
From: Con Kolivas @ 2005-06-11 23:47 UTC (permalink / raw)
To: Martin J. Bligh
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter,
Nick Piggin
[-- Attachment #1: Type: text/plain, Size: 1001 bytes --]
On Sun, 12 Jun 2005 09:27, Martin J. Bligh wrote:
> >> not sure what the benefits of the patch are,
I should have answered this. Since we moved to one runqueue per cpu with the
current scheduler, 'nice' levels basically fall apart on SMP. Balancing tends
to group tasks together in a way that defeats any meaningful 'nice' support.
Often on a 2 cpu machine, if we run 4 tasks (2 nice 0 and 2 nice 19), we
end up with:
cpu 1: nice 19 + nice 19
cpu 2: nice 0 + nice 0
which means each nice 19 task gets half a cpu and each nice 0 task gets half a
cpu which is lousy fairness.
The smp nice patches should end up with
cpu 1: nice 0 + nice 19
cpu 2: nice 0 + nice 19
so that the nice 0 tasks get 95% of a cpu and nice 19 tasks get 5% of a cpu.
The patches should balance things as fairly as possible according to nice
levels across cpus. As you can see this is clearly a bug in behaviour and has
been a showstopper for many trying to move from 2.4.
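To put rough numbers on the two layouts above, here is a toy model in C. It is only a sketch: it assumes a task's share of a cpu is proportional to its timeslice, and that timeslices run from about 100 ms at nice 0 down to about 5 ms at nice 19. The function names model_timeslice_ms() and share_percent() are made up for illustration and are not kernel code.

```c
#include <assert.h>

/*
 * Toy model, not kernel code: assume a task's timeslice shrinks linearly
 * from 100 ms at nice 0 to 5 ms at nice 19 (assumed figures).
 */
static int model_timeslice_ms(int nice)
{
	assert(nice >= 0 && nice <= 19);
	return 100 * (20 - nice) / 20;
}

/*
 * Percentage of one cpu that a task at nice_a gets when it shares a
 * runqueue with a single task at nice_b, rounded to the nearest percent.
 */
static int share_percent(int nice_a, int nice_b)
{
	int a = model_timeslice_ms(nice_a);
	int b = model_timeslice_ms(nice_b);

	return (100 * a + (a + b) / 2) / (a + b);
}
```

Under these assumptions share_percent(0, 19) is 95 and share_percent(19, 0) is 5, while share_percent(0, 0) and share_percent(19, 19) are both 50, which is the unfair grouping described above.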
Cheers,
Con
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 23:47 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-12 0:23 ` Martin J. Bligh
2005-06-12 5:19 ` 2.6.12-rc6-mm1 Con Kolivas
0 siblings, 1 reply; 101+ messages in thread
From: Martin J. Bligh @ 2005-06-12 0:23 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter,
Nick Piggin
--Con Kolivas <kernel@kolivas.org> wrote (on Sunday, June 12, 2005 09:47:08 +1000):
> On Sun, 12 Jun 2005 09:27, Martin J. Bligh wrote:
>> >> not sure what the benefits of the patch are,
>
> I should have answered this. Since we moved to one runqueue per cpu with the
> current scheduler, 'nice' levels basically fall apart on SMP. Balancing tends
> to group together all the wrong tasks to have any meaningful 'nice' support
> where often on a 2 cpu machine if we run 4 tasks, 2 nice 0 and 2 nice 19 we
> end up with:
>
> cpu 1: nice 19 + nice 19
> cpu 2: nice 0 + nice 0
>
> which means each nice 19 task gets half a cpu and each nice 0 task gets half a
> cpu which is lousy fairness.
>
> The smp nice patches should end up with
> cpu 1: nice 0 + nice 19
> cpu 2: nice 0 + nice 19
>
> so that the nice 0 tasks get 95% of a cpu and nice 19 tasks get 5% of a cpu.
>
> The patches should balance things as fairly as possible according to nice
> levels across cpus. As you can see this is clearly a bug in behaviour and has
> been a showstopper for many trying to move from 2.4.
Oh, right. that makes a lot of sense ... maybe just let it have an error
factor when migrating cross numa nodes (ie not be as strict)? Not sure
that's really the problem, as I doubt anything in my test is actually
niced anyway (assuming you're meaning static prio, not dynamic). In that
case, your changes should have no effect, right (from explanation, not
looking at the code ;-))
M.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-12 0:23 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-12 5:19 ` Con Kolivas
0 siblings, 0 replies; 101+ messages in thread
From: Con Kolivas @ 2005-06-12 5:19 UTC (permalink / raw)
To: Martin J. Bligh
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter,
Nick Piggin
On Sun, 12 Jun 2005 10:23, Martin J. Bligh wrote:
> --Con Kolivas <kernel@kolivas.org> wrote (on Sunday, June 12, 2005 09:47:08 +1000):
> > The patches should balance things as fairly as possible according to nice
> > levels across cpus. As you can see this is clearly a bug in behaviour and
> > has been a showstopper for many trying to move from 2.4.
>
> Oh, right. that makes a lot of sense ... maybe just let it have an error
> factor when migrating cross numa nodes (ie not be as strict)? Not sure
> that's really the problem, as I doubt anything in my test is actually
> niced anyway (assuming you're meaning static prio, not dynamic). In that
> case, your changes should have no effect, right (from explanation, not
> looking at the code ;-))
The balancing code is not really aware that the loads being returned are being
altered and it was not clear whether this would be needed or not as it
usually bases its decisions on ratios of load rather than absolute amounts.
The tricky part is idle balancing where we don't want to try and pull from a
queue that only has one running task and the patch has a "if single task
running and idle balancing tell it only one task running and don't bias"
feature. This may cause slight performance effects on numa as I guess the
other nodes suddenly seem much more loaded and we normally wouldn't try
balancing between nodes until there was a larger load discrepancy than
between cpus. I'll think on this and see how much more nice-aware the
balancing code needs to be for this to not have any effect.
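In toy form, the idle balancing special case looks roughly like the sketch below. All names here (balance_kind, reported_load(), the bias argument) are hypothetical and this is not the code from the patch; bias simply stands in for whatever nice-derived weighting gets applied to the load.

```c
#include <assert.h>

/* Hypothetical names throughout; this is not the code from the patch. */
enum balance_kind { BALANCE_BUSY, BALANCE_IDLE };

/*
 * Load a runqueue reports to the balancer. 'bias' stands in for whatever
 * nice-derived weighting is applied (1 means no weighting).
 */
static unsigned long reported_load(unsigned long nr_running,
				   unsigned long bias,
				   enum balance_kind kind)
{
	assert(bias >= 1);

	/* An idle cpu must not be tempted to pull a queue's only task. */
	if (kind == BALANCE_IDLE && nr_running <= 1)
		return nr_running;

	return nr_running * bias;
}
```

With a bias of 4, a queue holding one task reports 1 to an idle balancer but 4 otherwise, so the same queue can look much more loaded depending on who is asking.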
Cheers,
Con
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (7 preceding siblings ...)
2005-06-11 11:51 ` 2.6.12-rc6-mm1 Benoit Boissinot
@ 2005-06-18 22:39 ` Richard Purdie
2005-06-18 22:44 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-18 23:18 ` 2.6.12-rc6-mm1 Russell King
2005-06-21 13:20 ` 2.6.12-rc6-mm1 Dominik Karall
9 siblings, 2 replies; 101+ messages in thread
From: Richard Purdie @ 2005-06-18 22:39 UTC (permalink / raw)
To: Russell King; +Cc: LKML, Andrew Morton
On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> +git-arm-smp.patch
>
> ARM git trees
The arm pxa255 based Zaurus won't resume from a suspend with the patches
from the above tree applied. The suspend looks normal and gets at least
as far as pxa_pm_enter(). After that, the device appears to be dead and
needs a battery removal to reset. I'm unsure if it actually suspends and
is failing to resume or is crashing in the latter suspend stages.
Is there some documentation on what the above patch is aiming to do
anywhere?
Richard
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-18 22:39 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-18 22:44 ` Andrew Morton
2005-06-18 22:57 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-18 23:18 ` 2.6.12-rc6-mm1 Russell King
1 sibling, 1 reply; 101+ messages in thread
From: Andrew Morton @ 2005-06-18 22:44 UTC (permalink / raw)
To: Richard Purdie; +Cc: linux, linux-kernel
Richard Purdie <rpurdie@rpsys.net> wrote:
>
> On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > +git-arm-smp.patch
> >
> > ARM git trees
>
> The arm pxa255 based Zaurus won't resume from a suspend with the patches
> from the above tree applied. The suspend looks normal and gets at least
> as far as pxa_pm_enter(). After that, the device appears to be dead and
> needs a battery removal to reset. I'm unsure if it actually suspends and
> is failing to resume or is crashing in the latter suspend stages.
>
> Is there some documentation on what the above patch is aiming to do
> anywhere?
Did you apply just that patch, or are you talking about the whole -mm lineup?
If the latter, please test with only git-arm-smp.patch.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-18 22:44 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-18 22:57 ` Richard Purdie
2005-06-18 23:11 ` 2.6.12-rc6-mm1 Richard Purdie
0 siblings, 1 reply; 101+ messages in thread
From: Richard Purdie @ 2005-06-18 22:57 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux, linux-kernel
On Sat, 2005-06-18 at 15:44 -0700, Andrew Morton wrote:
> Richard Purdie <rpurdie@rpsys.net> wrote:
> >
> > On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > > +git-arm-smp.patch
> > >
> > > ARM git trees
> >
> > The arm pxa255 based Zaurus won't resume from a suspend with the patches
> > from the above tree applied. The suspend looks normal and gets at least
> > as far as pxa_pm_enter(). After that, the device appears to be dead and
> > needs a battery removal to reset. I'm unsure if it actually suspends and
> > is failing to resume or is crashing in the latter suspend stages.
> >
> > Is there some documentation on what the above patch is aiming to do
> > anywhere?
>
> Did you apply just that patch, or are you talking about the whole -mm lineup?
>
> If the latter, please test with only git-arm-smp.patch.
Sorry, I wasn't clear. I had problems with the -mm lineup and tracked it
down to the above patch. With the above patch removed, -mm works fine.
(I know there's a number of changes to the arm pxa suspend/resume code
in git-arm.patch but they're definitely not causing the problem.)
Richard
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-18 22:57 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-18 23:11 ` Richard Purdie
0 siblings, 0 replies; 101+ messages in thread
From: Richard Purdie @ 2005-06-18 23:11 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux, linux-kernel
On Sat, 2005-06-18 at 23:57 +0100, Richard Purdie wrote:
> On Sat, 2005-06-18 at 15:44 -0700, Andrew Morton wrote:
> > > > +git-arm-smp.patch
> > > > ARM git trees
> > >
> > > The arm pxa255 based Zaurus won't resume from a suspend with the patches
> > > from the above tree applied. The suspend looks normal and gets at least
> > > as far as pxa_pm_enter(). After that, the device appears to be dead and
> > > needs a battery removal to reset. I'm unsure if it actually suspends and
> > > is failing to resume or is crashing in the latter suspend stages.
> > >
> > > Is there some documentation on what the above patch is aiming to do
> > > anywhere?
> >
> > Did you apply just that patch, or are you talking about the whole -mm lineup?
> >
> > If the latter, please test with only git-arm-smp.patch.
>
> Sorry, I wasn't clear. I had problems with the -mm lineup and tracked it
> down to the above patch. With the above patch removed, -mm works fine.
>
> (I know there's a number of changes to the arm pxa suspend/resume code
> in git-arm.patch but they're definitely not causing the problem.)
I meant to add that git-arm-smp.patch breaks suspend/resume, even
applied in isolation against 2.6.12-rc6.
Richard
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-18 22:39 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-18 22:44 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-18 23:18 ` Russell King
2005-06-19 1:20 ` 2.6.12-rc6-mm1 Richard Purdie
1 sibling, 1 reply; 101+ messages in thread
From: Russell King @ 2005-06-18 23:18 UTC (permalink / raw)
To: Richard Purdie; +Cc: LKML, Andrew Morton
On Sat, Jun 18, 2005 at 11:39:18PM +0100, Richard Purdie wrote:
> On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > +git-arm-smp.patch
> >
> > ARM git trees
>
> The arm pxa255 based Zaurus won't resume from a suspend with the patches
> from the above tree applied. The suspend looks normal and gets at least
> as far as pxa_pm_enter(). After that, the device appears to be dead and
> needs a battery removal to reset. I'm unsure if it actually suspends and
> is failing to resume or is crashing in the latter suspend stages.
<grumble>Well, it's a bit late for this since (a) stuff has rapidly
moved on at rmk towers since 2.6.12 was released this morning, and
(b) I've just asked Linus to pull this.</grumble>
Thinking about what's probably happening, I suspect all the ARM suspend
and resume code needs to be reworked to save more state. I'll try to
cook up a patch tomorrow to fix it, but I'll need you to provide
feedback.
Please note that you may see other ARM breakage over the next month
or so - I'm going to be concentrating on merging ARM SMP support,
and whatever bashing other people like yourself can give the kernel
will help ensure that problems are picked up quickly.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-18 23:18 ` 2.6.12-rc6-mm1 Russell King
@ 2005-06-19 1:20 ` Richard Purdie
2005-06-19 9:02 ` 2.6.12-rc6-mm1 Russell King
0 siblings, 1 reply; 101+ messages in thread
From: Richard Purdie @ 2005-06-19 1:20 UTC (permalink / raw)
To: Russell King; +Cc: LKML, Andrew Morton
On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> On Sat, Jun 18, 2005 at 11:39:18PM +0100, Richard Purdie wrote:
> > On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > > +git-arm-smp.patch
> > >
> > > ARM git trees
> >
> > The arm pxa255 based Zaurus won't resume from a suspend with the patches
> > from the above tree applied. The suspend looks normal and gets at least
> > as far as pxa_pm_enter(). After that, the device appears to be dead and
> > needs a battery removal to reset. I'm unsure if it actually suspends and
> > is failing to resume or is crashing in the latter suspend stages.
>
> <grumble>Well, it's a bit late for this since (a) stuff has rapidly
> moved on at rmk towers since 2.6.12 was released this morning, and
> (b) I've just asked Linus to pull this.</grumble>
Please don't underestimate the time it takes to wade through all the
patches in the -mm tree, find the one causing the breakage, investigate
the patch and report it to the person concerned. I'm doing the Zaurus
work in my spare time and don't get paid for it. Just reflashing and
booting a new kernel probably takes ~15mins on the Zaurus. The
copy/clearpage problem took a complete weekend to track down (as it was
showing up randomly) and then needed further evenings to debug your
patch which is a large chunk of my free time. The Checked-By: line
didn't quite give the full picture.
I realise it's taken me a while to find enough time to test/debug this
kernel but at least you now know there's a problem...
> Thinking about what's probably happening, I suspect all the ARM suspend
> and resume code needs to be reworked to save more state. I'll try to
> cook up a patch tomorrow to fix it, but I'll need you to provide
> feedback.
Ok, thanks. I'm happy to test any fixes/patches.
> Please note that you may see other ARM breakage over the next month
> or so - I'm going to be concentrating on merging ARM SMP support,
> and whatever bashing other people like yourself can give the kernel
> will help ensure that problems are picked up quickly.
In order to assist with that, can you publish these patches somewhere?
That way, I can apply them against a known good Zaurus kernel tree and
know straight away if they break anything (diff/patch format would be
preferable as my Zaurus trees are all patch based).
On a positive note, something in the later 2.6.12-rc kernels has made a
massive difference to the speed on the Zaurus - I suspect the removal of
the preempt locks on copy/clearpage. It boots up ~1.5x faster and the
speed gain will make a lot of people very happy :)
Richard
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-19 1:20 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-19 9:02 ` Russell King
2005-06-19 9:11 ` 2.6.12-rc6-mm1 Russell King
0 siblings, 1 reply; 101+ messages in thread
From: Russell King @ 2005-06-19 9:02 UTC (permalink / raw)
To: Richard Purdie; +Cc: LKML, Andrew Morton
On Sun, Jun 19, 2005 at 02:20:48AM +0100, Richard Purdie wrote:
> On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> > Thinking about what's probably happening, I suspect all the ARM suspend
> > and resume code needs to be reworked to save more state. I'll try to
> > cook up a patch tomorrow to fix it, but I'll need you to provide
> > feedback.
>
> Ok, thanks. I'm happy to test any fixes/patches.
This should resolve the problem - we now rely on the stack pointer for
each CPU mode to remain constant throughout the running time of the
kernel, which includes across suspend/resume cycles.
--- a/arch/arm/mach-pxa/sleep.S
+++ b/arch/arm/mach-pxa/sleep.S
@@ -38,6 +38,16 @@ ENTRY(pxa_cpu_suspend)
#endif
stmfd sp!, {r2 - r12, lr} @ save registers on stack
+ @ preserve IRQ, abort and undefined mode stack pointers
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE
+ mov r4, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE
+ mov r5, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE
+ mov r6, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE
+ stmfd sp!, {r4 - r6}
+
@ get coprocessor registers
mrc p14, 0, r3, c6, c0, 0 @ clock configuration, for turbo mode
mrc p15, 0, r4, c15, c1, 0 @ CP access reg
@@ -229,6 +239,17 @@ resume_after_mmu:
#ifdef CONFIG_XSCALE_CACHE_ERRATA
bl cpu_xscale_proc_init
#endif
+
+ @ restore IRQ, abort and undefined mode stack pointers
+ ldmfd sp!, {r4 - r6}
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE
+ mov sp, r4
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE
+ mov sp, r5
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE
+ mov sp, r6
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE
+
ldmfd sp!, {r2, r3}
#ifndef CONFIG_IWMMXT
mar acc0, r2, r3
--- a/arch/arm/mach-sa1100/sleep.S
+++ b/arch/arm/mach-sa1100/sleep.S
@@ -37,6 +37,16 @@ ENTRY(sa1100_cpu_suspend)
stmfd sp!, {r4 - r12, lr} @ save registers on stack
+ @ preserve IRQ, abort and undefined mode stack pointers
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE
+ mov r4, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE
+ mov r5, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE
+ mov r6, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE
+ stmfd sp!, {r4 - r6}
+
@ get coprocessor registers
mrc p15, 0, r4, c3, c0, 0 @ domain ID
mrc p15, 0, r5, c2, c0, 0 @ translation table base addr
@@ -210,6 +220,17 @@ sleep_save_sp:
.text
resume_after_mmu:
mcr p15, 0, r1, c15, c1, 2 @ enable clock switching
+
+ @ restore IRQ, abort and undefined mode stack pointers
+ ldmfd sp!, {r4 - r6}
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE
+ mov sp, r4
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE
+ mov sp, r5
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE
+ mov sp, r6
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE
+
ldmfd sp!, {r4 - r12, pc} @ return to caller
> > Please note that you may see other ARM breakage over the next month
> > or so - I'm going to be concentrating on merging ARM SMP support,
> > and whatever bashing other people like yourself can give the kernel
> > will help ensure that problems are picked up quickly.
>
> In order to assist with that, can you publish these patches somewhere?
> That way, I can apply them against a known good Zaurus kernel tree and
> know straight away if they break anything (diff/patch format would be
> preferable as my Zaurus trees are all patch based).
I'll see what I can do, but I'm going to be working fairly rapidly on
merging this, so expect roughly a patch each day. Hopefully though,
the later patches will only affect the Integrator platform.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-19 9:02 ` 2.6.12-rc6-mm1 Russell King
@ 2005-06-19 9:11 ` Russell King
2005-06-19 17:12 ` 2.6.12-rc6-mm1 Richard Purdie
0 siblings, 1 reply; 101+ messages in thread
From: Russell King @ 2005-06-19 9:11 UTC (permalink / raw)
To: Richard Purdie, LKML, Andrew Morton
On Sun, Jun 19, 2005 at 10:02:26AM +0100, Russell King wrote:
> On Sun, Jun 19, 2005 at 02:20:48AM +0100, Richard Purdie wrote:
> > On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> > > Thinking about what's probably happening, I suspect all the ARM suspend
> > > and resume code needs to be reworked to save more state. I'll try to
> > > cook up a patch tomorrow to fix it, but I'll need you to provide
> > > feedback.
> >
> > Ok, thanks. I'm happy to test any fixes/patches.
>
> This should resolve the problem - we now rely on the stack pointer for
> each CPU mode to remain constant throughout the running time of the
> kernel, which includes across suspend/resume cycles.
Actually, this patch is probably an all-round better solution.
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -328,7 +328,7 @@ static void __init setup_processor(void)
* cpu_init dumps the cache information, initialises SMP specific
* information, and sets up the per-CPU stacks.
*/
-void __init cpu_init(void)
+void cpu_init(void)
{
unsigned int cpu = smp_processor_id();
struct stack *stk = &stacks[cpu];
--- a/arch/arm/mach-pxa/pm.c
+++ b/arch/arm/mach-pxa/pm.c
@@ -133,6 +133,8 @@ static int pxa_pm_enter(suspend_state_t
/* *** go zzz *** */
pxa_cpu_pm_enter(state);
+ cpu_init();
+
/* after sleeping, validate the checksum */
checksum = 0;
for (i = 0; i < SLEEP_SAVE_SIZE - 1; i++)
--- a/arch/arm/mach-sa1100/pm.c
+++ b/arch/arm/mach-sa1100/pm.c
@@ -88,6 +88,8 @@ static int sa11x0_pm_enter(suspend_state
/* go zzz */
sa1100_cpu_suspend();
+ cpu_init();
+
/*
* Ensure not to come back here if it wasn't intended
*/
--- a/include/asm-arm/system.h
+++ b/include/asm-arm/system.h
@@ -104,6 +104,7 @@ extern void show_pte(struct mm_struct *m
extern void __show_regs(struct pt_regs *);
extern int cpu_architecture(void);
+extern void cpu_init(void);
#define set_cr(x) \
__asm__ __volatile__( \
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-19 9:11 ` 2.6.12-rc6-mm1 Russell King
@ 2005-06-19 17:12 ` Richard Purdie
2005-06-19 17:39 ` 2.6.12-rc6-mm1 Russell King
0 siblings, 1 reply; 101+ messages in thread
From: Richard Purdie @ 2005-06-19 17:12 UTC (permalink / raw)
To: Russell King; +Cc: LKML, Andrew Morton
On Sun, 2005-06-19 at 10:11 +0100, Russell King wrote:
> On Sun, Jun 19, 2005 at 10:02:26AM +0100, Russell King wrote:
> > On Sun, Jun 19, 2005 at 02:20:48AM +0100, Richard Purdie wrote:
> > > On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> > > > Thinking about what's probably happening, I suspect all the ARM suspend
> > > > and resume code needs to be reworked to save more state. I'll try to
> > > > cook up a patch tomorrow to fix it, but I'll need you to provide
> > > > feedback.
> > >
> > > Ok, thanks. I'm happy to test any fixes/patches.
> >
> > This should resolve the problem - we now rely on the stack pointer for
> > each CPU mode to remain constant throughout the running time of the
> > kernel, which includes across suspend/resume cycles.
>
> Actually, this patch is probably an all-round better solution.
This patch (the simpler of the two using cpu_init()) allows the pxa to
suspend/resume happily with the git-arm-smp.patch applied.
Richard
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-19 17:12 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-19 17:39 ` Russell King
2005-06-19 18:25 ` 2.6.12-rc6-mm1 Richard Purdie
0 siblings, 1 reply; 101+ messages in thread
From: Russell King @ 2005-06-19 17:39 UTC (permalink / raw)
To: Richard Purdie; +Cc: LKML, Andrew Morton
On Sun, Jun 19, 2005 at 06:12:38PM +0100, Richard Purdie wrote:
> This patch (the simpler of the two using cpu_init()) allows the pxa to
> suspend/resume happily with the git-arm-smp.patch applied.
Good. Fix committed.
Next batched smp patch can be found at www.home.arm.../~rmk/nightly
which I'm currently planning to go to Linus tonight.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-19 17:39 ` 2.6.12-rc6-mm1 Russell King
@ 2005-06-19 18:25 ` Richard Purdie
2005-06-19 18:56 ` 2.6.12-rc6-mm1 Russell King
0 siblings, 1 reply; 101+ messages in thread
From: Richard Purdie @ 2005-06-19 18:25 UTC (permalink / raw)
To: Russell King; +Cc: LKML, Andrew Morton
On Sun, 2005-06-19 at 18:39 +0100, Russell King wrote:
> Good. Fix committed.
Thanks.
> Next batched smp patch can be found at www.home.arm.../~rmk/nightly
> which I'm currently planning to go to Linus tonight.
I applied smp-20050619.patch to 2.6.12-rc6-mm1 + the last fix and the
Zaurus seems perfectly happy with it. Let me know as and when you have
further releases that need testing (a message to linux-arm-kernel might
be the best way to announce them?).
Richard
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-19 18:25 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-19 18:56 ` Russell King
0 siblings, 0 replies; 101+ messages in thread
From: Russell King @ 2005-06-19 18:56 UTC (permalink / raw)
To: Richard Purdie; +Cc: LKML, Andrew Morton
On Sun, Jun 19, 2005 at 07:25:59PM +0100, Richard Purdie wrote:
> On Sun, 2005-06-19 at 18:39 +0100, Russell King wrote:
> > Next batched smp patch can be found at www.home.arm.../~rmk/nightly
> > which I'm currently planning to go to Linus tonight.
>
> I applied smp-20050619.patch to 2.6.12-rc6-mm1 + the last fix and the
> Zaurus seems perfectly happy with it. Let me know as and when you have
> further releases that need testing (a message to linux-arm-kernel might
> be the best way to announce them?).
Thanks for testing. Most of the other patches are platform specific
so this may not be required. However, if there are other changes to
non-platform-specific code, I'll try to point them out a couple of days
before they get merged.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (8 preceding siblings ...)
2005-06-18 22:39 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-21 13:20 ` Dominik Karall
2005-06-24 21:27 ` 2.6.12-rc6-mm1 Alexey Dobriyan
2005-07-29 4:54 ` 2.6.12-rc6-mm1 Andrew Morton
9 siblings, 2 replies; 101+ messages in thread
From: Dominik Karall @ 2005-06-21 13:20 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1169 bytes --]
On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
After looking at my dmesg output today, I saw the following error with
2.6.12-rc6-mm1; maybe it's useful to you. I don't know exactly when it
happened, because I haven't used mono lately. I just did an emerge mono on my
Gentoo system, so maybe that triggered the failure.
note: mono[26736] exited with preempt_count 1
scheduling while atomic: mono/0x10000001/26736
Call Trace:<ffffffff803e13ea>{schedule+122} <ffffffff8013197b>{vprintk+635}
<ffffffff803e2738>{cond_resched+56} <ffffffff80164de3>{unmap_vmas+1587}
<ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31}
<ffffffff80133466>{do_exit+438}
<ffffffff8013bf25>{__dequeue_signal+501}
<ffffffff801340c8>{do_group_exit+280}
<ffffffff8013e147>{get_signal_to_deliver+1575}
<ffffffff8010de92>{do_signal+162}
<ffffffff8012d1e0>{default_wake_function+0}
<ffffffff8010e8e1>{sys_rt_sigreturn+577}
<ffffffff8010eb3f>{sysret_signal+28}
<ffffffff8010ee27>{ptregscall_common+103}
cheers,
dominik
[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-21 13:20 ` 2.6.12-rc6-mm1 Dominik Karall
@ 2005-06-24 21:27 ` Alexey Dobriyan
2005-07-29 4:54 ` 2.6.12-rc6-mm1 Andrew Morton
1 sibling, 0 replies; 101+ messages in thread
From: Alexey Dobriyan @ 2005-06-24 21:27 UTC (permalink / raw)
To: Dominik Karall; +Cc: Andrew Morton, linux-kernel
On Tuesday 21 June 2005 17:20, Dominik Karall wrote:
> After looking at my dmesg output today, I saw the following error with
> 2.6.12-rc6-mm1; maybe it's useful to you. I don't know exactly when it
> happened, because I haven't used mono lately. I just did an emerge mono on my
> Gentoo system, so maybe that triggered the failure.
>
> note: mono[26736] exited with preempt_count 1
> scheduling while atomic: mono/0x10000001/26736
I've filed a bug at kernel bugzilla, so your report won't be lost.
See http://bugme.osdl.org/show_bug.cgi?id=4794
You can register at http://bugme.osdl.org/createaccount.cgi and add yourself
to CC list.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-21 13:20 ` 2.6.12-rc6-mm1 Dominik Karall
2005-06-24 21:27 ` 2.6.12-rc6-mm1 Alexey Dobriyan
@ 2005-07-29 4:54 ` Andrew Morton
2005-07-29 13:39 ` 2.6.12-rc6-mm1 Dominik Karall
1 sibling, 1 reply; 101+ messages in thread
From: Andrew Morton @ 2005-07-29 4:54 UTC (permalink / raw)
To: Dominik Karall; +Cc: linux-kernel
Dominik Karall <dominik.karall@gmx.net> wrote:
>
> On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
>
> After looking at my dmesg output today, I saw the following error with
> 2.6.12-rc6-mm1; maybe it's useful to you. I don't know exactly when it
> happened, because I haven't used mono lately. I just did an emerge mono on my
> Gentoo system, so maybe that triggered the failure.
>
> note: mono[26736] exited with preempt_count 1
> scheduling while atomic: mono/0x10000001/26736
>
> Call Trace:<ffffffff803e13ea>{schedule+122} <ffffffff8013197b>{vprintk+635}
> <ffffffff803e2738>{cond_resched+56} <ffffffff80164de3>{unmap_vmas+1587}
> <ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31}
> <ffffffff80133466>{do_exit+438}
> <ffffffff8013bf25>{__dequeue_signal+501}
> <ffffffff801340c8>{do_group_exit+280}
> <ffffffff8013e147>{get_signal_to_deliver+1575}
> <ffffffff8010de92>{do_signal+162}
> <ffffffff8012d1e0>{default_wake_function+0}
> <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> <ffffffff8010eb3f>{sysret_signal+28}
> <ffffffff8010ee27>{ptregscall_common+103}
>
A couple of people reported this, but all seems to have gone quiet. Is it
fixed in later -mm's? Is 2.6.13-rc4 running OK?
Thanks.
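For readers decoding the "scheduling while atomic" line: the middle field (0x10000001) is the raw preempt count. Below is a minimal sketch of how it splits into fields. The bit layout is an assumption based on the usual 2.6-era arrangement (8 preempt bits, 8 softirq bits, 12 hardirq bits, PREEMPT_ACTIVE at bit 28); it is not stated anywhere in this thread, and the helper name is made up for illustration.

```python
# Assumed 2.6-era preempt_count layout (an assumption, not from the thread).
PREEMPT_MASK = 0x000000FF    # bits 0-7: preemption-disable nesting depth
SOFTIRQ_MASK = 0x0000FF00    # bits 8-15: softirq nesting
HARDIRQ_MASK = 0x0FFF0000    # bits 16-27: hardirq nesting
PREEMPT_ACTIVE = 0x10000000  # bit 28: a preemption is in progress


def decode_preempt_count(value: int) -> dict:
    """Split a raw preempt count into its assumed fields (hypothetical helper)."""
    return {
        "preempt_depth": value & PREEMPT_MASK,
        "softirq_depth": (value & SOFTIRQ_MASK) >> 8,
        "hardirq_depth": (value & HARDIRQ_MASK) >> 16,
        "preempt_active": bool(value & PREEMPT_ACTIVE),
    }


# The value reported for mono[26736] in the quoted log.
decoded = decode_preempt_count(0x10000001)
print(decoded)
```

Under that assumed layout the low field is 1, which matches the "exited with preempt_count 1" note: one preemption-disable was never paired with an enable.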
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-07-29 4:54 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-07-29 13:39 ` Dominik Karall
2005-07-29 18:22 ` 2.6.12-rc6-mm1 Andrew Morton
0 siblings, 1 reply; 101+ messages in thread
From: Dominik Karall @ 2005-07-29 13:39 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2179 bytes --]
On Friday 29 July 2005 06:54, Andrew Morton wrote:
> Dominik Karall <dominik.karall@gmx.net> wrote:
> > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
> >
> > After looking in my dmesg output today, I saw following error with
> > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it exactly
> > happens, cause I never used mono last time, I just did an emerge mono on
> > my gentoo system, maybe this forced the failure.
> >
> > note: mono[26736] exited with preempt_count 1
> > scheduling while atomic: mono/0x10000001/26736
> >
> > Call Trace:<ffffffff803e13ea>{schedule+122}
> > <ffffffff8013197b>{vprintk+635} <ffffffff803e2738>{cond_resched+56}
> > <ffffffff80164de3>{unmap_vmas+1587} <ffffffff8016a560>{exit_mmap+128}
> > <ffffffff8012e7bf>{mmput+31} <ffffffff80133466>{do_exit+438}
> > <ffffffff8013bf25>{__dequeue_signal+501}
> > <ffffffff801340c8>{do_group_exit+280}
> > <ffffffff8013e147>{get_signal_to_deliver+1575}
> > <ffffffff8010de92>{do_signal+162}
> > <ffffffff8012d1e0>{default_wake_function+0}
> > <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> > <ffffffff8010eb3f>{sysret_signal+28}
> > <ffffffff8010ee27>{ptregscall_common+103}
>
> A couple of people reported this, but all seems to have gone quiet. Is it
> fixed in later -mm's? Is 2.6.13-rc4 running OK?
>
> Thanks.
hi andrew!
I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an emerge mono
right now to test it, and I got this one:
Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1
Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip
00002aaaaaf652cf rsp 00007fffffe43b50 error 4
Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip
00002aaaaaf652cf rsp 00007fffffcc87a0 error 4
Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip
00002aaaaaf652cf rsp 00007fffff905f80 error 4
DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more info about
the bug. Did I forget any debug option?
greets,
dominik
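As a side note on the segfault lines above: the trailing "error 4" is the x86 page-fault error code. A small sketch of decoding its three low bits follows; the function name is hypothetical and only the low bits are handled.

```python
def decode_page_fault_error(code: int) -> dict:
    """Decode the three low bits of an x86 page-fault error code."""
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = fault taken in kernel mode, 1 = in user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


fault = decode_page_fault_error(4)
print(fault)
```

So each of the three `file` processes faulted in user mode while reading an address that had no page mapped.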
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-07-29 13:39 ` 2.6.12-rc6-mm1 Dominik Karall
@ 2005-07-29 18:22 ` Andrew Morton
2005-07-29 21:19 ` 2.6.12-rc6-mm1 Dominik Karall
0 siblings, 1 reply; 101+ messages in thread
From: Andrew Morton @ 2005-07-29 18:22 UTC (permalink / raw)
To: Dominik Karall; +Cc: linux-kernel
Dominik Karall <dominik.karall@gmx.net> wrote:
>
> On Friday 29 July 2005 06:54, Andrew Morton wrote:
> > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
> > >
> > > After looking in my dmesg output today, I saw following error with
> > > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it exactly
> > > happens, cause I never used mono last time, I just did an emerge mono on
> > > my gentoo system, maybe this forced the failure.
> > >
> > > note: mono[26736] exited with preempt_count 1
> > > scheduling while atomic: mono/0x10000001/26736
> > >
> > > Call Trace:<ffffffff803e13ea>{schedule+122}
> > > <ffffffff8013197b>{vprintk+635} <ffffffff803e2738>{cond_resched+56}
> > > <ffffffff80164de3>{unmap_vmas+1587} <ffffffff8016a560>{exit_mmap+128}
> > > <ffffffff8012e7bf>{mmput+31} <ffffffff80133466>{do_exit+438}
> > > <ffffffff8013bf25>{__dequeue_signal+501}
> > > <ffffffff801340c8>{do_group_exit+280}
> > > <ffffffff8013e147>{get_signal_to_deliver+1575}
> > > <ffffffff8010de92>{do_signal+162}
> > > <ffffffff8012d1e0>{default_wake_function+0}
> > > <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> > > <ffffffff8010eb3f>{sysret_signal+28}
> > > <ffffffff8010ee27>{ptregscall_common+103}
> >
> > A couple of people reported this, but all seems to have gone quiet. Is it
> > fixed in later -mm's? Is 2.6.13-rc4 running OK?
> >
> > Thanks.
>
> hi andrew!
>
> I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an emerge mono
> right now to test it, and I got this one:
> Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1
> Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip
> 00002aaaaaf652cf rsp 00007fffffe43b50 error 4
> Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip
> 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4
> Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip
> 00002aaaaaf652cf rsp 00007fffff905f80 error 4
>
> DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more info about
> the bug. Did I forget any debug option?
Gee, I don't know how to find this one. Do you know if the problem is
specific to -mm?
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-07-29 18:22 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-07-29 21:19 ` Dominik Karall
2005-07-29 21:27 ` 2.6.12-rc6-mm1 Andrew Morton
0 siblings, 1 reply; 101+ messages in thread
From: Dominik Karall @ 2005-07-29 21:19 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2706 bytes --]
On Friday 29 July 2005 20:22, Andrew Morton wrote:
> Dominik Karall <dominik.karall@gmx.net> wrote:
> > On Friday 29 July 2005 06:54, Andrew Morton wrote:
> > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
> > > >
> > > > After looking in my dmesg output today, I saw following error with
> > > > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it
> > > > exactly happens, cause I never used mono last time, I just did an
> > > > emerge mono on my gentoo system, maybe this forced the failure.
> > > >
> > > > note: mono[26736] exited with preempt_count 1
> > > > scheduling while atomic: mono/0x10000001/26736
> > > >
> > > > Call Trace:<ffffffff803e13ea>{schedule+122}
> > > > <ffffffff8013197b>{vprintk+635} <ffffffff803e2738>{cond_resched+56}
> > > > <ffffffff80164de3>{unmap_vmas+1587} <ffffffff8016a560>{exit_mmap+128}
> > > > <ffffffff8012e7bf>{mmput+31} <ffffffff80133466>{do_exit+438}
> > > > <ffffffff8013bf25>{__dequeue_signal+501}
> > > > <ffffffff801340c8>{do_group_exit+280}
> > > > <ffffffff8013e147>{get_signal_to_deliver+1575}
> > > > <ffffffff8010de92>{do_signal+162}
> > > > <ffffffff8012d1e0>{default_wake_function+0}
> > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> > > > <ffffffff8010eb3f>{sysret_signal+28}
> > > > <ffffffff8010ee27>{ptregscall_common+103}
> > >
> > > A couple of people reported this, but all seems to have gone quiet. Is
> > > it fixed in later -mm's? Is 2.6.13-rc4 running OK?
> > >
> > > Thanks.
> >
> > hi andrew!
> >
> > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an emerge
> > mono right now to test it, and I got this one:
> > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1
> > Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip
> > 00002aaaaaf652cf rsp 00007fffffe43b50 error 4
> > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip
> > 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4
> > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip
> > 00002aaaaaf652cf rsp 00007fffff905f80 error 4
> >
> > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more info
> > about the bug. Did I forget any debug option?
>
> Gee, I don't know how to find this one. Do you know if the problem is
> specific to -mm?
Tested with 2.6.13-rc4 and it seems to work. Didn't get any error.
So it seems to be -mm related. Do you suspect any patch which could cause the
error?
dominik
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-07-29 21:19 ` 2.6.12-rc6-mm1 Dominik Karall
@ 2005-07-29 21:27 ` Andrew Morton
2005-07-29 21:37 ` 2.6.12-rc6-mm1 Dominik Karall
0 siblings, 1 reply; 101+ messages in thread
From: Andrew Morton @ 2005-07-29 21:27 UTC (permalink / raw)
To: Dominik Karall; +Cc: linux-kernel
Dominik Karall <dominik.karall@gmx.net> wrote:
>
> On Friday 29 July 2005 20:22, Andrew Morton wrote:
> > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > On Friday 29 July 2005 06:54, Andrew Morton wrote:
> > > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
> > > > >
> > > > > After looking in my dmesg output today, I saw following error with
> > > > > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it
> > > > > exactly happens, cause I never used mono last time, I just did an
> > > > > emerge mono on my gentoo system, maybe this forced the failure.
> > > > >
> > > > > note: mono[26736] exited with preempt_count 1
> > > > > scheduling while atomic: mono/0x10000001/26736
> > > > >
> > > > > Call Trace:<ffffffff803e13ea>{schedule+122}
> > > > > <ffffffff8013197b>{vprintk+635} <ffffffff803e2738>{cond_resched+56}
> > > > > <ffffffff80164de3>{unmap_vmas+1587} <ffffffff8016a560>{exit_mmap+128}
> > > > > <ffffffff8012e7bf>{mmput+31} <ffffffff80133466>{do_exit+438}
> > > > > <ffffffff8013bf25>{__dequeue_signal+501}
> > > > > <ffffffff801340c8>{do_group_exit+280}
> > > > > <ffffffff8013e147>{get_signal_to_deliver+1575}
> > > > > <ffffffff8010de92>{do_signal+162}
> > > > > <ffffffff8012d1e0>{default_wake_function+0}
> > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> > > > > <ffffffff8010eb3f>{sysret_signal+28}
> > > > > <ffffffff8010ee27>{ptregscall_common+103}
> > > >
> > > > A couple of people reported this, but all seems to have gone quiet. Is
> > > > it fixed in later -mm's? Is 2.6.13-rc4 running OK?
> > > >
> > > > Thanks.
> > >
> > > hi andrew!
> > >
> > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an emerge
> > > mono right now to test it, and I got this one:
> > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1
> > > Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip
> > > 00002aaaaaf652cf rsp 00007fffffe43b50 error 4
> > > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip
> > > 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4
> > > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip
> > > 00002aaaaaf652cf rsp 00007fffff905f80 error 4
> > >
> > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more info
> > > about the bug. Did I forget any debug option?
> >
> > Gee, I don't know how to find this one. Do you know if the problem is
> > specific to -mm?
>
> Tested with 2.6.13-rc4 and it seems to work. Didn't get any error.
Great, thanks for that.
> So it seems to be -mm related. Do you suspect any patch which could cause the
> error?
I wouldn't know, sorry. Possibly the scheduler patches, possibly an
x86_64-specific patch. Is the problem repeatable? If so, a binary search
would only take ten build-n-boots ;)
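The "ten build-n-boots" figure follows from halving the patch series at each step: roughly log2(N) tests for N patches. Here is a minimal sketch of that search, assuming a single culprit and that a kernel is bad exactly when the culprit is among the applied patches. The patch names and the `is_bad` callback are hypothetical stand-ins for "apply this prefix of the series, build, boot, run the test".

```python
from typing import Callable, Sequence, Tuple


def first_bad_patch(
    patches: Sequence[str],
    is_bad: Callable[[Sequence[str]], bool],
) -> Tuple[str, int]:
    """Return (first bad patch, number of build-and-boot tests used)."""
    # Invariant: applying patches[:lo] is good, applying patches[:hi] is bad.
    lo, hi = 0, len(patches)
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1  # one build-n-boot per probe
        if is_bad(patches[:mid]):
            hi = mid
        else:
            lo = mid
    return patches[hi - 1], tests


# Hypothetical series of 1000 patches with the culprit at index 636.
series = [f"patch-{i:04d}" for i in range(1000)]
culprit_index = 636
bad, tests = first_bad_patch(series, lambda applied: len(applied) > culprit_index)
print(bad, tests)
```

With about a thousand patches in the series this never needs more than ten probes, which is where the estimate comes from. It only works if the failure reproduces reliably on every bad build.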
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-07-29 21:27 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-07-29 21:37 ` Dominik Karall
2005-08-04 19:44 ` 2.6.12-rc6-mm1 Andrew Morton
0 siblings, 1 reply; 101+ messages in thread
From: Dominik Karall @ 2005-07-29 21:37 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 3597 bytes --]
On Friday 29 July 2005 23:27, Andrew Morton wrote:
> Dominik Karall <dominik.karall@gmx.net> wrote:
> > On Friday 29 July 2005 20:22, Andrew Morton wrote:
> > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > On Friday 29 July 2005 06:54, Andrew Morton wrote:
> > > > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
> > > > > >
> > > > > > After looking in my dmesg output today, I saw following error
> > > > > > with 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when
> > > > > > it exactly happens, cause I never used mono last time, I just did
> > > > > > an emerge mono on my gentoo system, maybe this forced the
> > > > > > failure.
> > > > > >
> > > > > > note: mono[26736] exited with preempt_count 1
> > > > > > scheduling while atomic: mono/0x10000001/26736
> > > > > >
> > > > > > Call Trace:<ffffffff803e13ea>{schedule+122}
> > > > > > <ffffffff8013197b>{vprintk+635}
> > > > > > <ffffffff803e2738>{cond_resched+56}
> > > > > > <ffffffff80164de3>{unmap_vmas+1587}
> > > > > > <ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31}
> > > > > > <ffffffff80133466>{do_exit+438}
> > > > > > <ffffffff8013bf25>{__dequeue_signal+501}
> > > > > > <ffffffff801340c8>{do_group_exit+280}
> > > > > > <ffffffff8013e147>{get_signal_to_deliver+1575}
> > > > > > <ffffffff8010de92>{do_signal+162}
> > > > > > <ffffffff8012d1e0>{default_wake_function+0}
> > > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> > > > > > <ffffffff8010eb3f>{sysret_signal+28}
> > > > > > <ffffffff8010ee27>{ptregscall_common+103}
> > > > >
> > > > > A couple of people reported this, but all seems to have gone quiet.
> > > > > Is it fixed in later -mm's? Is 2.6.13-rc4 running OK?
> > > > >
> > > > > Thanks.
> > > >
> > > > hi andrew!
> > > >
> > > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an
> > > > emerge mono right now to test it, and I got this one:
> > > > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1
> > > > > Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip
> > > > > 00002aaaaaf652cf rsp 00007fffffe43b50 error 4
> > > > > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip
> > > > > 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4
> > > > > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip
> > > > > 00002aaaaaf652cf rsp 00007fffff905f80 error 4
> > > >
> > > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more
> > > > info about the bug. Did I forget any debug option?
> > >
> > > Gee, I don't know how to find this one. Do you know if the problem is
> > > specific to -mm?
> >
> > Tested with 2.6.13-rc4 and it seems to work. Didn't get any error.
>
> Great, thanks for that.
>
> > So it seems to be -mm related. Do you suspect any patch which could cause
> > the error?
>
> I wouldn't know, sorry. Possible the scheduler patches, possibly an
> x86_64-specific patch. Is the problem repeatable? If so, a binary search
> would only take ten build-n-boots ;)
Yes, it is repeatable. I tested on the latest -mm about 4 times. OK, I will try
to find the right patch tomorrow; 10 build-n-boots would end up in the morning ;)
btw, as the error occurred in 2.6.12-rc6-mm1 too, it must be an old patch which
hasn't been merged into Linus' tree yet... hope there aren't a lot of them :)
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-07-29 21:37 ` 2.6.12-rc6-mm1 Dominik Karall
@ 2005-08-04 19:44 ` Andrew Morton
2005-08-04 22:28 ` 2.6.12-rc6-mm1 Andrew Morton
0 siblings, 1 reply; 101+ messages in thread
From: Andrew Morton @ 2005-08-04 19:44 UTC (permalink / raw)
To: Dominik Karall; +Cc: linux-kernel
Dominik Karall <dominik.karall@gmx.net> wrote:
>
> On Friday 29 July 2005 23:27, Andrew Morton wrote:
> > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > On Friday 29 July 2005 20:22, Andrew Morton wrote:
> > > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > > On Friday 29 July 2005 06:54, Andrew Morton wrote:
> > > > > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
> > > > > > >
> > > > > > > After looking in my dmesg output today, I saw following error
> > > > > > > with 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when
> > > > > > > it exactly happens, cause I never used mono last time, I just did
> > > > > > > an emerge mono on my gentoo system, maybe this forced the
> > > > > > > failure.
> > > > > > >
> > > > > > > note: mono[26736] exited with preempt_count 1
> > > > > > > scheduling while atomic: mono/0x10000001/26736
> > > > > > >
> > > > > > > Call Trace:<ffffffff803e13ea>{schedule+122}
> > > > > > > <ffffffff8013197b>{vprintk+635}
> > > > > > > <ffffffff803e2738>{cond_resched+56}
> > > > > > > <ffffffff80164de3>{unmap_vmas+1587}
> > > > > > > <ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31}
> > > > > > > <ffffffff80133466>{do_exit+438}
> > > > > > > <ffffffff8013bf25>{__dequeue_signal+501}
> > > > > > > <ffffffff801340c8>{do_group_exit+280}
> > > > > > > <ffffffff8013e147>{get_signal_to_deliver+1575}
> > > > > > > <ffffffff8010de92>{do_signal+162}
> > > > > > > <ffffffff8012d1e0>{default_wake_function+0}
> > > > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> > > > > > > <ffffffff8010eb3f>{sysret_signal+28}
> > > > > > > <ffffffff8010ee27>{ptregscall_common+103}
> > > > > >
> > > > > > A couple of people reported this, but all seems to have gone quiet.
> > > > > > Is it fixed in later -mm's? Is 2.6.13-rc4 running OK?
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > > hi andrew!
> > > > >
> > > > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an
> > > > > emerge mono right now to test it, and I got this one:
> > > > > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1
> > > > > > Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip
> > > > > > 00002aaaaaf652cf rsp 00007fffffe43b50 error 4
> > > > > > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip
> > > > > > 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4
> > > > > > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip
> > > > > > 00002aaaaaf652cf rsp 00007fffff905f80 error 4
> > > > >
> > > > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more
> > > > > info about the bug. Did I forget any debug option?
> > > >
> > > > Gee, I don't know how to find this one. Do you know if the problem is
> > > > specific to -mm?
> > >
> > > Tested with 2.6.13-rc4 and it seems to work. Didn't get any error.
> >
> > Great, thanks for that.
> >
> > > So it seems to be -mm related. Do you suspect any patch which could cause
> > > the error?
> >
> > I wouldn't know, sorry. Possible the scheduler patches, possibly an
> > x86_64-specific patch. Is the problem repeatable? If so, a binary search
> > would only take ten build-n-boots ;)
>
> Yes, it is repeatable. I tested on lastest -mm about 4 times. Ok, I will try
> to find the right patch tomorrow, 10 build-n-boots would end up in morning ;)
>
> btw, as the error occured in 2.6.12-rc6-mm1 too, it must be an old patch which
> wasn't merged to linus tree till now...hope there aren't a lot of them :)
>
Any progress on this? It kinda means that the whole of the -mm lineup is
stuck until we can identify the offending patch. We have a couple of weeks
in which to do this but if you can identify the bad patch it'd help
enormously, thanks.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-08-04 19:44 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-08-04 22:28 ` Andrew Morton
2005-08-04 22:44 ` 2.6.12-rc6-mm1 Dominik Karall
2005-08-05 10:48 ` [patch] preempt-trace.patch Ingo Molnar
0 siblings, 2 replies; 101+ messages in thread
From: Andrew Morton @ 2005-08-04 22:28 UTC (permalink / raw)
To: dominik.karall, linux-kernel; +Cc: Ingo Molnar
Andrew Morton <akpm@osdl.org> wrote:
>
> Dominik Karall <dominik.karall@gmx.net> wrote:
> >
> > On Friday 29 July 2005 23:27, Andrew Morton wrote:
> > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > On Friday 29 July 2005 20:22, Andrew Morton wrote:
> > > > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > > > On Friday 29 July 2005 06:54, Andrew Morton wrote:
> > > > > > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
> > > > > > > >
> > > > > > > > After looking in my dmesg output today, I saw following error
> > > > > > > > with 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when
> > > > > > > > it exactly happens, cause I never used mono last time, I just did
> > > > > > > > an emerge mono on my gentoo system, maybe this forced the
> > > > > > > > failure.
> > > > > > > >
> > > > > > > > note: mono[26736] exited with preempt_count 1
> > > > > > > > scheduling while atomic: mono/0x10000001/26736
> > > > > > > >
> > > > > > > > Call Trace:<ffffffff803e13ea>{schedule+122}
> > > > > > > > <ffffffff8013197b>{vprintk+635}
> > > > > > > > <ffffffff803e2738>{cond_resched+56}
> > > > > > > > <ffffffff80164de3>{unmap_vmas+1587}
> > > > > > > > <ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31}
> > > > > > > > <ffffffff80133466>{do_exit+438}
> > > > > > > > <ffffffff8013bf25>{__dequeue_signal+501}
> > > > > > > > <ffffffff801340c8>{do_group_exit+280}
> > > > > > > > <ffffffff8013e147>{get_signal_to_deliver+1575}
> > > > > > > > <ffffffff8010de92>{do_signal+162}
> > > > > > > > <ffffffff8012d1e0>{default_wake_function+0}
> > > > > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> > > > > > > > <ffffffff8010eb3f>{sysret_signal+28}
> > > > > > > > <ffffffff8010ee27>{ptregscall_common+103}
> > > > > > >
> > > > > > > A couple of people reported this, but all seems to have gone quiet.
> > > > > > > Is it fixed in later -mm's? Is 2.6.13-rc4 running OK?
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > hi andrew!
> > > > > >
> > > > > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an
> > > > > > emerge mono right now to test it, and I got this one:
> > > > > > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1
> > > > > > > Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip
> > > > > > > 00002aaaaaf652cf rsp 00007fffffe43b50 error 4
> > > > > > > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip
> > > > > > > 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4
> > > > > > > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip
> > > > > > > 00002aaaaaf652cf rsp 00007fffff905f80 error 4
> > > > > >
> > > > > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more
> > > > > > info about the bug. Did I forget any debug option?
> > > > >
> > > > > Gee, I don't know how to find this one. Do you know if the problem is
> > > > > specific to -mm?
> > > >
> > > > Tested with 2.6.13-rc4 and it seems to work. Didn't get any error.
> > >
> > > Great, thanks for that.
> > >
> > > > So it seems to be -mm related. Do you suspect any patch which could cause
> > > > the error?
> > >
> > > I wouldn't know, sorry. Possible the scheduler patches, possibly an
> > > x86_64-specific patch. Is the problem repeatable? If so, a binary search
> > > would only take ten build-n-boots ;)
> >
> > Yes, it is repeatable. I tested on lastest -mm about 4 times. Ok, I will try
> > to find the right patch tomorrow, 10 build-n-boots would end up in morning ;)
> >
> > btw, as the error occured in 2.6.12-rc6-mm1 too, it must be an old patch which
> > wasn't merged to linus tree till now...hope there aren't a lot of them :)
> >
>
> Any progress on this? It kinda measn that the whole of the -mm lineup is
> stuck until we can identify the offending patch. We have a couple of weeks
> in which to do this but if you can identify the bad patch it'd help
> enormously, thanks.
>
OK, Bartosz Taudul tells me that he's occasionally seeing this on stock
2.6.12 (thanks!). So there's not a lot of point in doing the -mm bisection
search.
I think Ingo was planning on coming up with some infrastructure which would
allow us to debug this further.
^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-08-04 22:28 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-08-04 22:44 ` Dominik Karall
2005-08-05 10:48 ` [patch] preempt-trace.patch Ingo Molnar
1 sibling, 0 replies; 101+ messages in thread
From: Dominik Karall @ 2005-08-04 22:44 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Ingo Molnar
[-- Attachment #1: Type: text/plain, Size: 5121 bytes --]
On Friday 05 August 2005 00:28, Andrew Morton wrote:
> Andrew Morton <akpm@osdl.org> wrote:
> > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > On Friday 29 July 2005 23:27, Andrew Morton wrote:
> > > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > > On Friday 29 July 2005 20:22, Andrew Morton wrote:
> > > > > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > > > > On Friday 29 July 2005 06:54, Andrew Morton wrote:
> > > > > > > > Dominik Karall <dominik.karall@gmx.net> wrote:
> > > > > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > > > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
> > > > > > > > >
> > > > > > > > > After looking in my dmesg output today, I saw following
> > > > > > > > > error with 2.6.12-rc6-mm1, maybe it's usefull to you. I
> > > > > > > > > don't know when it exactly happens, cause I never used mono
> > > > > > > > > last time, I just did an emerge mono on my gentoo system,
> > > > > > > > > maybe this forced the failure.
> > > > > > > > >
> > > > > > > > > note: mono[26736] exited with preempt_count 1
> > > > > > > > > scheduling while atomic: mono/0x10000001/26736
> > > > > > > > >
> > > > > > > > > Call Trace:<ffffffff803e13ea>{schedule+122}
> > > > > > > > > <ffffffff8013197b>{vprintk+635}
> > > > > > > > > <ffffffff803e2738>{cond_resched+56}
> > > > > > > > > <ffffffff80164de3>{unmap_vmas+1587}
> > > > > > > > > <ffffffff8016a560>{exit_mmap+128}
> > > > > > > > > <ffffffff8012e7bf>{mmput+31}
> > > > > > > > > <ffffffff80133466>{do_exit+438}
> > > > > > > > > <ffffffff8013bf25>{__dequeue_signal+501}
> > > > > > > > > <ffffffff801340c8>{do_group_exit+280}
> > > > > > > > > <ffffffff8013e147>{get_signal_to_deliver+1575}
> > > > > > > > > <ffffffff8010de92>{do_signal+162}
> > > > > > > > > <ffffffff8012d1e0>{default_wake_function+0}
> > > > > > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> > > > > > > > > <ffffffff8010eb3f>{sysret_signal+28}
> > > > > > > > > <ffffffff8010ee27>{ptregscall_common+103}
> > > > > > > >
> > > > > > > > A couple of people reported this, but all seems to have gone
> > > > > > > > quiet. Is it fixed in later -mm's? Is 2.6.13-rc4 running
> > > > > > > > OK?
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > >
> > > > > > > hi andrew!
> > > > > > >
> > > > > > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did
> > > > > > > an emerge mono right now to test it, and I got this one:
> > > > > > > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1
> > > > > > > > Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip
> > > > > > > > 00002aaaaaf652cf rsp 00007fffffe43b50 error 4
> > > > > > > > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip
> > > > > > > > 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4
> > > > > > > > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip
> > > > > > > > 00002aaaaaf652cf rsp 00007fffff905f80 error 4
> > > > > > >
> > > > > > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get
> > > > > > > more info about the bug. Did I forget any debug option?
> > > > > >
> > > > > > Gee, I don't know how to find this one. Do you know if the
> > > > > > problem is specific to -mm?
> > > > >
> > > > > Tested with 2.6.13-rc4 and it seems to work. Didn't get any error.
> > > >
> > > > Great, thanks for that.
> > > >
> > > > > So it seems to be -mm related. Do you suspect any patch which could
> > > > > cause the error?
> > > >
> > > > I wouldn't know, sorry. Possible the scheduler patches, possibly an
> > > > x86_64-specific patch. Is the problem repeatable? If so, a binary
> > > > search would only take ten build-n-boots ;)
> > >
> > > Yes, it is repeatable. I tested on lastest -mm about 4 times. Ok, I
> > > will try to find the right patch tomorrow, 10 build-n-boots would end
> > > up in morning ;)
> > >
> > > btw, as the error occured in 2.6.12-rc6-mm1 too, it must be an old
> > > patch which wasn't merged to linus tree till now...hope there aren't a
> > > lot of them :)
> >
> > Any progress on this? It kinda measn that the whole of the -mm lineup is
> > stuck until we can identify the offending patch. We have a couple of
> > weeks in which to do this but if you can identify the bad patch it'd help
> > enormously, thanks.
>
> OK, Bartosz Taudul tells me that he's occasionally seeing this on stock
> 2.6.12 (thanks!). So there's not a lot of point in doing the -mm bisection
> search.
>
> I think Ingo was planning on coming up with some infrastructure which would
> allow us to debug this further.
I'm sorry that I couldn't do the tests earlier, but I had no time this week. I
did some tests now and noticed that the bug only occurs when KDE is
running... weird.
I'm going to continue testing tomorrow after work, exactly in 12 hours ;)
I will let you know if I have any news!
dominik
^ permalink raw reply [flat|nested] 101+ messages in thread
* [patch] preempt-trace.patch
2005-08-04 22:28 ` 2.6.12-rc6-mm1 Andrew Morton
2005-08-04 22:44 ` 2.6.12-rc6-mm1 Dominik Karall
@ 2005-08-05 10:48 ` Ingo Molnar
2005-08-05 11:44 ` Dominik Karall
2005-08-05 14:26 ` [patch] preempt-trace.patch (mono preempt-trace) Dominik Karall
1 sibling, 2 replies; 101+ messages in thread
From: Ingo Molnar @ 2005-08-05 10:48 UTC (permalink / raw)
To: Andrew Morton; +Cc: dominik.karall, linux-kernel
* Andrew Morton <akpm@osdl.org> wrote:
> I think Ingo was planning on coming up with some infrastructure which
> would allow us to debug this further.
yeah. I've done this today and have split it out of the -RT tree, see
the patch below. After some exposure in -mm i'd like this feature to go
upstream too.
the patch is against recent Linus trees, 2.6.13-rc4 or later should all
work. Dominik, could you try it and send us the new kernel logs whenever
you happen to hit that warning message again? (Please also enable
CONFIG_KALLSYMS_ALL, so that we get as much symbolic data as possible.)
Ingo
------
this patch implements the "non-preemptible section trace" feature, which
prints out a "critical section nesting" trace after stackdumps:
Call Trace:
[<c0103db1>] show_stack+0x7a/0x90
[<c0103f36>] show_registers+0x156/0x1ce
[<c010412e>] die+0xe8/0x172
[<c010422e>] do_trap+0x76/0xa3
[<c01044fe>] do_invalid_op+0xa3/0xad
[<c01039ef>] error_code+0x4f/0x54
[<c0120be9>] test+0x8/0xa
[<c0120c41>] sys_gettimeofday+0x56/0x74
[<c0102eeb>] sysenter_past_esp+0x54/0x75
---------------------------
| preempt count: 00000004 ]
| 4 levels deep critical section nesting:
-----------------------------------------
.. [<c0120bbe>] .... test3+0xd/0xf
.....[<c0120bc8>] .. ( <= test2+0x8/0x21)
.. [<c0120bbe>] .... test3+0xd/0xf
.....[<c0120bcd>] .. ( <= test2+0xd/0x21)
.. [<c0120bd7>] .... test2+0x17/0x21
.....[<c0120be9>] .. ( <= test+0x8/0xa)
.. [<c010407f>] .... die+0x39/0x172
.....[<c010422e>] .. ( <= do_trap+0x76/0xa3)
this feature is implemented via a low-overhead mechanism by keeping
the caller and caller-parent addresses for each preempt_disable()
call site, and printing them upon crashes. Note that every other
locking API is thus traced too, such as spinlocks, rwlocks, per-cpu
variables, etc. This feature is especially useful in identifying
leaked preemption counts, as the missing count is displayed as an
extra entry in the stack.
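As a rough userspace illustration of the mechanism described above (not the kernel code itself), the nesting stack can be modelled as an array indexed by the current depth. The names `task_model`, `model_disable`, `model_enable` and `model_held` are hypothetical, and call sites are passed as strings rather than return addresses:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACE 25    /* mirrors MAX_PREEMPT_TRACE in the patch */

/* Hypothetical stand-in for the per-task state the patch adds. */
struct task_model {
	unsigned int count;             /* current nesting depth */
	const char *caller[MAX_TRACE];  /* who disabled preemption at depth i */
};

/* Record the call site at the current depth, then go one level deeper. */
static void model_disable(struct task_model *t, const char *site)
{
	if (t->count < MAX_TRACE)
		t->caller[t->count] = site;
	t->count++;
}

/*
 * Leave one level. The slot is not cleared; it is simply overwritten
 * by the next disable that happens at the same depth.
 */
static void model_enable(struct task_model *t)
{
	assert(t->count > 0);   /* underflow check, in the spirit of BUG_ON() */
	t->count--;
}

/* Slots below the current depth describe sections that are still held. */
static const char *model_held(const struct task_model *t, unsigned int depth)
{
	if (depth < t->count && depth < MAX_TRACE)
		return t->caller[depth];
	return NULL;
}
```

If a call site disables without ever enabling, every later balanced pair runs one level deeper, so the leaked site stays recorded in slot 0 and is what would be printed when the task exits.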
the feature is active when CONFIG_DEBUG_PREEMPT is enabled.
i've also cleaned up preemption-count debugging by moving the debug
functions out of sched.c into lib/preempt.c.
also, i have added preemption-counter-imbalance checks to the hardirq
and softirq processing codepaths. The behavior of preemption-counter
checks is now uniform: a warning is printed with all info we have at
that point, and the preemption counter is then restored to the old
value.
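The uniform check described above can be sketched in plain userspace C as follows. This is only a model: `model_count` stands in for preempt_count(), and `run_checked` and the two sample handlers are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the value behind preempt_count(). */
static unsigned int model_count;

typedef void (*handler_fn)(void);

/*
 * Run a handler and compare the counter before and after.
 * Returns 1 if the handler left the counter unbalanced, 0 otherwise.
 */
static int run_checked(const char *name, handler_fn fn)
{
	unsigned int in_count = model_count, out_count;

	fn();
	out_count = model_count;
	if (in_count != out_count) {
		fprintf(stderr,
			"BUG: %s preempt-count imbalance: in=%08x, out=%08x!\n",
			name, in_count, out_count);
		/* everything useful is printed; restore the old value */
		model_count = in_count;
		return 1;
	}
	return 0;
}

/* A handler that pairs its disable with an enable. */
static void balanced_handler(void)
{
	model_count++;
	model_count--;
}

/* A handler that forgets the matching enable. */
static void leaky_handler(void)
{
	model_count++;
}
```

Restoring the counter after reporting lets the system keep running, so a single leaking handler produces one diagnostic instead of cascading failures.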
on x86 i have changed the 4KSTACKS feature to inherit the low bits of
the preemption-count across hardirq/softirq-context switching, so that
the preemption trace entries of interrupts do not overwrite process
level preemption trace entries.
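A small model of the low-bits inheritance, assuming the preempt-disable depth lives in the low 8 bits of the counter (`MODEL_PREEMPT_MASK` is that assumption; `MODEL_IRQ_BASE` is an arbitrary value with only higher bits set, and all function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PREEMPT_MASK 0x000000ffu  /* assumed: low 8 bits = disable depth */
#define MODEL_IRQ_BASE     0x00010000u  /* arbitrary: counter of the irq stack */

/*
 * On entry to the separate irq stack, carry over the process-level depth.
 * Returns the amount added so that it can be removed again on exit.
 */
static uint32_t irq_stack_enter(uint32_t *irq_count, uint32_t task_count)
{
	uint32_t inherited = task_count & MODEL_PREEMPT_MASK;

	*irq_count += inherited;
	return inherited;
}

/* Undo the offset when switching back to the process stack. */
static void irq_stack_leave(uint32_t *irq_count, uint32_t inherited)
{
	*irq_count -= inherited;
}

/* Trace slot that the next disable would write to. */
static uint32_t trace_index(uint32_t count)
{
	return count & MODEL_PREEMPT_MASK;
}
```

With a process-level depth of 2, the first disable in interrupt context writes slot 2 rather than slot 0, so the two process-level entries survive.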
boot-tested on x86. Should work on all architectures, but only x86 and
x86_64 have been updated to print the trace-stack out at stackdump time.
This feature was part of the PREEMPT_RT tree for some time and was very
useful in debugging preempt-counter leaks and deadlock/lockup
situations.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
arch/i386/kernel/irq.c | 28 ++++++++++++
arch/i386/kernel/traps.c | 1
arch/x86_64/kernel/traps.c | 1
include/linux/sched.h | 13 +++++
kernel/exit.c | 9 ++--
kernel/irq/handle.c | 17 +++++++
kernel/sched.c | 33 --------------
kernel/softirq.c | 16 +++++++
kernel/timer.c | 27 +++++++-----
lib/Makefile | 2
lib/preempt.c | 101 +++++++++++++++++++++++++++++++++++++++++++++
11 files changed, 200 insertions(+), 48 deletions(-)
Index: linux/arch/i386/kernel/irq.c
===================================================================
--- linux.orig/arch/i386/kernel/irq.c
+++ linux/arch/i386/kernel/irq.c
@@ -55,6 +55,9 @@ fastcall unsigned int do_IRQ(struct pt_r
{
/* high bits used in ret_from_ code */
int irq = regs->orig_eax & 0xff;
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 count = preempt_count() & PREEMPT_MASK;
+#endif
#ifdef CONFIG_4KSTACKS
union irq_ctx *curctx, *irqctx;
u32 *isp;
@@ -95,6 +98,14 @@ fastcall unsigned int do_IRQ(struct pt_r
irqctx->tinfo.task = curctx->tinfo.task;
irqctx->tinfo.previous_esp = current_stack_pointer;
+ /*
+ * Keep the preemption-count offset, so that the
+ * process-level preemption-trace entries do not
+ * get overwritten by the hardirq context:
+ */
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count += count;
+#endif
asm volatile(
" xchgl %%ebx,%%esp \n"
" call __do_IRQ \n"
@@ -103,6 +114,9 @@ fastcall unsigned int do_IRQ(struct pt_r
: "0" (irq), "1" (regs), "2" (isp)
: "memory", "cc", "ecx"
);
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count -= count;
+#endif
} else
#endif
__do_IRQ(irq, regs);
@@ -165,6 +179,9 @@ extern asmlinkage void __do_softirq(void
asmlinkage void do_softirq(void)
{
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 count = preempt_count() & PREEMPT_MASK;
+#endif
unsigned long flags;
struct thread_info *curctx;
union irq_ctx *irqctx;
@@ -181,6 +198,14 @@ asmlinkage void do_softirq(void)
irqctx->tinfo.task = curctx->task;
irqctx->tinfo.previous_esp = current_stack_pointer;
+ /*
+ * Keep the preemption-count offset, so that the
+ * process-level preemption-trace entries do not
+ * get overwritten by the softirq context:
+ */
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count += count;
+#endif
/* build the stack frame on the softirq stack */
isp = (u32*) ((char*)irqctx + sizeof(*irqctx));
@@ -192,6 +217,9 @@ asmlinkage void do_softirq(void)
: "0"(isp)
: "memory", "cc", "edx", "ecx", "eax"
);
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count -= count;
+#endif
}
local_irq_restore(flags);
Index: linux/arch/i386/kernel/traps.c
===================================================================
--- linux.orig/arch/i386/kernel/traps.c
+++ linux/arch/i386/kernel/traps.c
@@ -164,6 +164,7 @@ void show_trace(struct task_struct *task
break;
printk(" =======================\n");
}
+ print_preempt_trace(task, preempt_count());
}
void show_stack(struct task_struct *task, unsigned long *esp)
Index: linux/arch/x86_64/kernel/traps.c
===================================================================
--- linux.orig/arch/x86_64/kernel/traps.c
+++ linux/arch/x86_64/kernel/traps.c
@@ -221,6 +221,7 @@ void show_trace(unsigned long *stack)
HANDLE_STACK (((long) stack & (THREAD_SIZE-1)) != 0);
#undef HANDLE_STACK
printk("\n");
+ print_traces(task);
}
void show_stack(struct task_struct *tsk, unsigned long * rsp)
Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -592,6 +592,14 @@ extern int groups_search(struct group_in
#define GROUP_AT(gi, i) \
((gi)->blocks[(i)/NGROUPS_PER_BLOCK][(i)%NGROUPS_PER_BLOCK])
+#ifdef CONFIG_DEBUG_PREEMPT
+# define MAX_PREEMPT_TRACE 25
+extern void print_preempt_trace(struct task_struct *task, u32 count);
+#else
+static inline void print_preempt_trace(struct task_struct *task, u32 count)
+{
+}
+#endif
struct audit_context; /* See audit.c */
struct mempolicy;
@@ -770,6 +778,11 @@ struct task_struct {
int cpuset_mems_generation;
#endif
atomic_t fs_excl; /* holding fs exclusive resources */
+
+#ifdef CONFIG_DEBUG_PREEMPT
+ void *preempt_off_caller[MAX_PREEMPT_TRACE];
+ void *preempt_off_parent[MAX_PREEMPT_TRACE];
+#endif
};
static inline pid_t process_group(struct task_struct *tsk)
Index: linux/kernel/exit.c
===================================================================
--- linux.orig/kernel/exit.c
+++ linux/kernel/exit.c
@@ -821,10 +821,11 @@ fastcall NORET_TYPE void do_exit(long co
tsk->it_prof_expires = cputime_zero;
tsk->it_sched_expires = 0;
- if (unlikely(in_atomic()))
- printk(KERN_INFO "note: %s[%d] exited with preempt_count %d\n",
- current->comm, current->pid,
- preempt_count());
+ if (unlikely(in_atomic())) {
+ printk(KERN_ERR "BUG: %s[%d] exited with nonzero preempt_count %d!\n",
+ tsk->comm, tsk->pid, preempt_count());
+ print_preempt_trace(tsk, preempt_count());
+ }
acct_update_integrals(tsk);
update_mem_hiwater(tsk);
Index: linux/kernel/irq/handle.c
===================================================================
--- linux.orig/kernel/irq/handle.c
+++ linux/kernel/irq/handle.c
@@ -85,7 +85,24 @@ fastcall int handle_IRQ_event(unsigned i
local_irq_enable();
do {
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 in_count = preempt_count(), out_count;
+#endif
ret = action->handler(irq, action->dev_id, regs);
+#ifdef CONFIG_DEBUG_PREEMPT
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ printk(KERN_ERR "BUG: irq %d [%s] preempt-count "
+ "imbalance: in=%08x, out=%08x!\n",
+ irq, action->name, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * We already printed all the useful info,
+ * fix up the preemption count now:
+ */
+ preempt_count() = in_count;
+ }
+#endif
if (ret == IRQ_HANDLED)
status |= action->flags;
retval |= ret;
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -47,6 +47,7 @@
#include <linux/syscalls.h>
#include <linux/times.h>
#include <linux/acct.h>
+#include <linux/kallsyms.h>
#include <asm/tlb.h>
#include <asm/unistd.h>
@@ -2707,38 +2708,6 @@ static inline int dependent_sleeper(int
}
#endif
-#if defined(CONFIG_PREEMPT) && defined(CONFIG_DEBUG_PREEMPT)
-
-void fastcall add_preempt_count(int val)
-{
- /*
- * Underflow?
- */
- BUG_ON((preempt_count() < 0));
- preempt_count() += val;
- /*
- * Spinlock count overflowing soon?
- */
- BUG_ON((preempt_count() & PREEMPT_MASK) >= PREEMPT_MASK-10);
-}
-EXPORT_SYMBOL(add_preempt_count);
-
-void fastcall sub_preempt_count(int val)
-{
- /*
- * Underflow?
- */
- BUG_ON(val > preempt_count());
- /*
- * Is the spinlock portion underflowing?
- */
- BUG_ON((val < PREEMPT_MASK) && !(preempt_count() & PREEMPT_MASK));
- preempt_count() -= val;
-}
-EXPORT_SYMBOL(sub_preempt_count);
-
-#endif
-
/*
* schedule() is the main scheduler function.
*/
Index: linux/kernel/softirq.c
===================================================================
--- linux.orig/kernel/softirq.c
+++ linux/kernel/softirq.c
@@ -92,7 +92,23 @@ restart:
do {
if (pending & 1) {
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 in_count = preempt_count();
+#endif
h->action(h);
+#ifdef CONFIG_DEBUG_PREEMPT
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ printk(KERN_ERR "BUG: softirq %d preempt-count "
+ "imbalance: in=%08x, out=%08x!\n",
+ h - softirq_vec, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * Fix up the bad preemption count:
+ */
+ preempt_count() = in_count;
+ }
+#endif
rcu_bh_qsctr_inc(cpu);
}
h++;
Index: linux/kernel/timer.c
===================================================================
--- linux.orig/kernel/timer.c
+++ linux/kernel/timer.c
@@ -33,6 +33,7 @@
#include <linux/posix-timers.h>
#include <linux/cpu.h>
#include <linux/syscalls.h>
+#include <linux/kallsyms.h>
#include <asm/uaccess.h>
#include <asm/unistd.h>
@@ -480,6 +481,7 @@ static inline void __run_timers(tvec_bas
while (!list_empty(head)) {
void (*fn)(unsigned long);
unsigned long data;
+ int in_count, out_count;
timer = list_entry(head->next,struct timer_list,entry);
fn = timer->function;
@@ -488,17 +490,20 @@ static inline void __run_timers(tvec_bas
set_running_timer(base, timer);
detach_timer(timer, 1);
spin_unlock_irq(&base->t_base.lock);
- {
- int preempt_count = preempt_count();
- fn(data);
- if (preempt_count != preempt_count()) {
- printk(KERN_WARNING "huh, entered %p "
- "with preempt_count %08x, exited"
- " with %08x?\n",
- fn, preempt_count,
- preempt_count());
- BUG();
- }
+
+ in_count = preempt_count();
+ fn(data);
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ print_symbol(KERN_ERR "BUG: %s", (long)fn);
+ printk(KERN_ERR "(%p) preempt-count imbalance: "
+ "in=%08x, out=%08x!",
+ fn, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * Fix up the bad preemption count:
+ */
+ preempt_count() = in_count;
}
spin_lock_irq(&base->t_base.lock);
}
Index: linux/lib/Makefile
===================================================================
--- linux.orig/lib/Makefile
+++ linux/lib/Makefile
@@ -20,7 +20,7 @@ lib-$(CONFIG_RWSEM_GENERIC_SPINLOCK) +=
lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
lib-$(CONFIG_GENERIC_FIND_NEXT_BIT) += find_next_bit.o
obj-$(CONFIG_LOCK_KERNEL) += kernel_lock.o
-obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o
+obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o preempt.o
ifneq ($(CONFIG_HAVE_DEC_LOCK),y)
lib-y += dec_and_lock.o
Index: linux/lib/preempt.c
===================================================================
--- /dev/null
+++ linux/lib/preempt.c
@@ -0,0 +1,101 @@
+/*
+ * lib/preempt.c
+ *
+ * DEBUG_PREEMPT variant of add_preempt_count() and sub_preempt_count().
+ * Preemption tracing.
+ *
+ * (C) 2005 Ingo Molnar, Red Hat
+ */
+#include <linux/module.h>
+#include <linux/hardirq.h>
+#include <linux/kallsyms.h>
+
+/*
+ * Add a value to the preemption count, and check for overflows,
+ * underflows and maintain a small stack of callers that gets
+ * printed upon crashes.
+ */
+void fastcall add_preempt_count(int val)
+{
+ unsigned int count = preempt_count(), idx = count & PREEMPT_MASK;
+
+ /*
+ * Underflow?
+ */
+ BUG_ON(count < 0);
+
+ preempt_count() += val;
+
+ /*
+ * Spinlock count overflowing soon?
+ */
+ BUG_ON(idx >= PREEMPT_MASK-10);
+
+ /*
+ * Maintain the per-task preemption-nesting stack (which
+ * will be printed upon crashes). It's a low-overhead thing,
+ * constant overhead per preempt-disable.
+ */
+ if (idx < MAX_PREEMPT_TRACE) {
+ void *caller = __builtin_return_address(0), *parent = NULL;
+
+#ifdef CONFIG_FRAME_POINTER
+ parent = __builtin_return_address(1);
+ if (in_lock_functions(parent)) {
+ parent = __builtin_return_address(2);
+ if (in_lock_functions(parent))
+ parent = __builtin_return_address(3);
+ }
+#endif
+ current->preempt_off_caller[idx] = caller;
+ current->preempt_off_parent[idx] = parent;
+ }
+}
+EXPORT_SYMBOL(add_preempt_count);
+
+void fastcall sub_preempt_count(int val)
+{
+ unsigned int count = preempt_count();
+
+ /*
+ * Underflow?
+ */
+ BUG_ON(val > count);
+ /*
+ * Is the spinlock portion underflowing?
+ */
+ BUG_ON((val < PREEMPT_MASK) && !(count & PREEMPT_MASK));
+
+ preempt_count() -= val;
+}
+EXPORT_SYMBOL(sub_preempt_count);
+
+void print_preempt_trace(struct task_struct *task, u32 count)
+{
+ unsigned int i, idx = count & PREEMPT_MASK;
+
+ preempt_disable();
+
+ printk("---------------------------\n");
+ printk("| preempt count: %08x ]\n", count);
+ if (count) {
+ printk("| %d level deep critical section nesting:\n", idx);
+ printk("----------------------------------------\n");
+ } else
+ printk("---------------------------\n");
+ for (i = 0; i < idx; i++) {
+ printk(".. [<%p>] .... ", task->preempt_off_caller[i]);
+ print_symbol("%s\n", (long)task->preempt_off_caller[i]);
+ printk(".....[<%p>] .. ( <= ",
+ task->preempt_off_parent[i]);
+ print_symbol("%s)\n", (long)task->preempt_off_parent[i]);
+ if (i == MAX_PREEMPT_TRACE-1) {
+ printk("[rest truncated, reached MAX_PREEMPT_TRACE]\n");
+ break;
+ }
+ }
+ printk("\n");
+
+ preempt_enable();
+}
+
* Re: [patch] preempt-trace.patch
2005-08-05 10:48 ` [patch] preempt-trace.patch Ingo Molnar
@ 2005-08-05 11:44 ` Dominik Karall
2005-08-05 15:13 ` [patch] preempt-trace-fixes.patch Ingo Molnar
2005-08-05 14:26 ` [patch] preempt-trace.patch (mono preempt-trace) Dominik Karall
1 sibling, 1 reply; 101+ messages in thread
From: Dominik Karall @ 2005-08-05 11:44 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Andrew Morton, linux-kernel
On Friday 05 August 2005 12:48, Ingo Molnar wrote:
> * Andrew Morton <akpm@osdl.org> wrote:
> > I think Ingo was planning on coming up with some infrastructure which
> > would allow us to debug this further.
>
> yeah. I've done this today and have split it out of the -RT tree, see
> the patch below. After some exposure in -mm i'd like this feature to go
> upstream too.
>
> the patch is against recent Linus trees, 2.6.13-rc4 or later should all
> work. Dominik, could you try it and send us the new kernel logs whenever
> you happen to hit that warning message again? (Please also enable
> CONFIG_KALLSYMS_ALL, so that we get as much symbolic data as possible.)
I tried to compile the patch on top of 2.6.13-rc4-mm1, it applied with a few
offsets, but it looked ok.
Here is the error I got when I compiled it:
CC arch/x86_64/kernel/traps.o
arch/x86_64/kernel/traps.c: In function `show_trace':
arch/x86_64/kernel/traps.c:228: warning: implicit declaration of function
`print_traces'
arch/x86_64/kernel/traps.c:228: error: `task' undeclared (first use in this
function)
arch/x86_64/kernel/traps.c:228: error: (Each undeclared identifier is reported
only once
arch/x86_64/kernel/traps.c:228: error: for each function it appears in.)
make[1]: *** [arch/x86_64/kernel/traps.o] Error 1
I took a look at the traps.c file, but couldn't find any solution, as there is
neither a print_traces function nor a task variable in this section.
dominik
* Re: [patch] preempt-trace.patch (mono preempt-trace)
2005-08-05 10:48 ` [patch] preempt-trace.patch Ingo Molnar
2005-08-05 11:44 ` Dominik Karall
@ 2005-08-05 14:26 ` Dominik Karall
2005-08-05 15:22 ` Ingo Molnar
1 sibling, 1 reply; 101+ messages in thread
From: Dominik Karall @ 2005-08-05 14:26 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Andrew Morton, linux-kernel
On Friday 05 August 2005 12:48, Ingo Molnar wrote:
> * Andrew Morton <akpm@osdl.org> wrote:
> > I think Ingo was planning on coming up with some infrastructure which
> > would allow us to debug this further.
>
> yeah. I've done this today and have split it out of the -RT tree, see
> the patch below. After some exposure in -mm i'd like this feature to go
> upstream too.
>
> the patch is against recent Linus trees, 2.6.13-rc4 or later should all
> work. Dominik, could you try it and send us the new kernel logs whenever
> you happen to hit that warning message again? (Please also enable
> CONFIG_KALLSYMS_ALL, so that we get as much symbolic data as possible.)
Here's a preempt trace output from mono. To compile preempt-trace.patch I
removed the traps.c patch and added a u32 definition for out_count in handle.c.
After those changes, the kernel compiled fine.
Now here's the output; let me know if it is ok, or if it reveals
where the bug is located.
BUG: mono[10011] exited with nonzero preempt_count 1!
---------------------------
| preempt count: 00000001 ]
| 1 level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f791e>] .... _spin_lock+0xe/0x70
.....[<0000000000000000>] .. ( <= 0x0)
If there is anything I should test, let me know!
dominik
* [patch] preempt-trace-fixes.patch
2005-08-05 11:44 ` Dominik Karall
@ 2005-08-05 15:13 ` Ingo Molnar
2005-08-05 18:14 ` Dominik Karall
0 siblings, 1 reply; 101+ messages in thread
From: Ingo Molnar @ 2005-08-05 15:13 UTC (permalink / raw)
To: Dominik Karall; +Cc: Andrew Morton, linux-kernel
* Dominik Karall <dominik.karall@gmx.net> wrote:
> > yeah. I've done this today and have split it out of the -RT tree, see
> > the patch below. After some exposure in -mm i'd like this feature to go
> > upstream too.
> >
> > the patch is against recent Linus trees, 2.6.13-rc4 or later should all
> > work. Dominik, could you try it and send us the new kernel logs whenever
> > you happen to hit that warning message again? (Please also enable
> > CONFIG_KALLSYMS_ALL, so that we get as much symbolic data as possible.)
>
> I tried to compile the patch on top of 2.6.13-rc4-mm1, it applied with a few
> offsets, but it looked ok.
> Here is the error I get when I compiled it:
ok, does the additional patch below fix things for you?
Ingo
------
- fix the x86_64 build
- get the preempt_count from the right task on x86 (it's usually
'current', but not always.)
- fix compiler warning in kernel/softirq.c on 64-bit platforms
Signed-off-by: Ingo Molnar <mingo@elte.hu>
arch/i386/kernel/traps.c | 2 +-
arch/x86_64/kernel/process.c | 2 +-
arch/x86_64/kernel/traps.c | 9 +++++----
include/asm-x86_64/proto.h | 2 +-
kernel/softirq.c | 2 +-
5 files changed, 9 insertions(+), 8 deletions(-)
Index: linux-preempt-trace/arch/i386/kernel/traps.c
===================================================================
--- linux-preempt-trace.orig/arch/i386/kernel/traps.c
+++ linux-preempt-trace/arch/i386/kernel/traps.c
@@ -164,7 +164,7 @@ void show_trace(struct task_struct *task
break;
printk(" =======================\n");
}
- print_preempt_trace(task, preempt_count());
+ print_preempt_trace(task, task->thread_info->preempt_count);
}
void show_stack(struct task_struct *task, unsigned long *esp)
Index: linux-preempt-trace/arch/x86_64/kernel/process.c
===================================================================
--- linux-preempt-trace.orig/arch/x86_64/kernel/process.c
+++ linux-preempt-trace/arch/x86_64/kernel/process.c
@@ -311,7 +311,7 @@ void __show_regs(struct pt_regs * regs)
void show_regs(struct pt_regs *regs)
{
__show_regs(regs);
- show_trace(&regs->rsp);
+ show_trace(current, &regs->rsp);
}
/*
Index: linux-preempt-trace/arch/x86_64/kernel/traps.c
===================================================================
--- linux-preempt-trace.orig/arch/x86_64/kernel/traps.c
+++ linux-preempt-trace/arch/x86_64/kernel/traps.c
@@ -29,6 +29,7 @@
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/nmi.h>
+#include <linux/sched.h>
#include <asm/system.h>
#include <asm/uaccess.h>
@@ -156,7 +157,7 @@ static unsigned long *in_exception_stack
* severe exception (double fault, nmi, stack fault, debug, mce) hardware stack
*/
-void show_trace(unsigned long *stack)
+void show_trace(struct task_struct *task, unsigned long *stack)
{
unsigned long addr;
const unsigned cpu = safe_smp_processor_id();
@@ -221,7 +222,7 @@ void show_trace(unsigned long *stack)
HANDLE_STACK (((long) stack & (THREAD_SIZE-1)) != 0);
#undef HANDLE_STACK
printk("\n");
- print_traces(task);
+ print_preempt_trace(task, task->thread_info->preempt_count);
}
void show_stack(struct task_struct *tsk, unsigned long * rsp)
@@ -258,7 +259,7 @@ void show_stack(struct task_struct *tsk,
printk("%016lx ", *stack++);
touch_nmi_watchdog();
}
- show_trace((unsigned long *)rsp);
+ show_trace(tsk, (unsigned long *)rsp);
}
/*
@@ -267,7 +268,7 @@ void show_stack(struct task_struct *tsk,
void dump_stack(void)
{
unsigned long dummy;
- show_trace(&dummy);
+ show_trace(current, &dummy);
}
EXPORT_SYMBOL(dump_stack);
Index: linux-preempt-trace/include/asm-x86_64/proto.h
===================================================================
--- linux-preempt-trace.orig/include/asm-x86_64/proto.h
+++ linux-preempt-trace/include/asm-x86_64/proto.h
@@ -66,7 +66,7 @@ extern unsigned long end_pfn_map;
extern cpumask_t cpu_initialized;
-extern void show_trace(unsigned long * rsp);
+extern void show_trace(struct task_struct *task, unsigned long *rsp);
extern void show_registers(struct pt_regs *regs);
extern void exception_table_check(void);
Index: linux-preempt-trace/kernel/softirq.c
===================================================================
--- linux-preempt-trace.orig/kernel/softirq.c
+++ linux-preempt-trace/kernel/softirq.c
@@ -99,7 +99,7 @@ restart:
#ifdef CONFIG_DEBUG_PREEMPT
out_count = preempt_count();
if (in_count != out_count) {
- printk(KERN_ERR "BUG: softirq %d preempt-count "
+ printk(KERN_ERR "BUG: softirq %ld preempt-count "
"imbalance: in=%08x, out=%08x!\n",
h - softirq_vec, in_count, out_count);
print_preempt_trace(current, out_count);
* Re: [patch] preempt-trace.patch (mono preempt-trace)
2005-08-05 14:26 ` [patch] preempt-trace.patch (mono preempt-trace) Dominik Karall
@ 2005-08-05 15:22 ` Ingo Molnar
2005-08-05 17:58 ` Dominik Karall
2005-08-05 18:05 ` Andrew Morton
0 siblings, 2 replies; 101+ messages in thread
From: Ingo Molnar @ 2005-08-05 15:22 UTC (permalink / raw)
To: Dominik Karall; +Cc: Andrew Morton, linux-kernel
* Dominik Karall <dominik.karall@gmx.net> wrote:
> BUG: mono[10011] exited with nonzero preempt_count 1!
> ---------------------------
> | preempt count: 00000001 ]
> | 1 level deep critical section nesting:
> ----------------------------------------
> .. [<ffffffff803f791e>] .... _spin_lock+0xe/0x70
> .....[<0000000000000000>] .. ( <= 0x0)
>
> If there is anything I should test, let me know!
please enable CONFIG_FRAME_POINTER!
we now know that it's a spin_lock reference that got leaked, but we don't
(yet) know the parent.
Ingo
* Re: [patch] preempt-trace.patch (mono preempt-trace)
2005-08-05 15:22 ` Ingo Molnar
@ 2005-08-05 17:58 ` Dominik Karall
2005-08-05 18:46 ` Hugh Dickins
2005-08-05 18:05 ` Andrew Morton
1 sibling, 1 reply; 101+ messages in thread
From: Dominik Karall @ 2005-08-05 17:58 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Andrew Morton, linux-kernel
On Friday 05 August 2005 17:22, Ingo Molnar wrote:
> * Dominik Karall <dominik.karall@gmx.net> wrote:
> > BUG: mono[10011] exited with nonzero preempt_count 1!
> > ---------------------------
> >
> > | preempt count: 00000001 ]
> > | 1 level deep critical section nesting:
> >
> > ----------------------------------------
> > .. [<ffffffff803f791e>] .... _spin_lock+0xe/0x70
> > .....[<0000000000000000>] .. ( <= 0x0)
> >
> > If there is anything I should test, let me know!
>
> please enable CONFIG_FRAME_POINTERS!
>
> we now know that it's a spin_lock reference that got leaked, but we dont
> (yet) know the parent.
I'm sorry, but I think I can't enable CONFIG_FRAME_POINTERS.
Depends on: DEBUG_KERNEL && (X86 && !X86_64 || CRIS || M68K || M68KNOMMU ||
FRV || UML)
Seems to be disabled for x86_64.
dominik
* Re: [patch] preempt-trace.patch (mono preempt-trace)
2005-08-05 15:22 ` Ingo Molnar
2005-08-05 17:58 ` Dominik Karall
@ 2005-08-05 18:05 ` Andrew Morton
2005-08-05 20:08 ` Ingo Molnar
2005-08-05 20:13 ` Ingo Molnar
1 sibling, 2 replies; 101+ messages in thread
From: Andrew Morton @ 2005-08-05 18:05 UTC (permalink / raw)
To: Ingo Molnar; +Cc: dominik.karall, linux-kernel
Ingo Molnar <mingo@elte.hu> wrote:
>
>
> * Dominik Karall <dominik.karall@gmx.net> wrote:
>
> > BUG: mono[10011] exited with nonzero preempt_count 1!
> > ---------------------------
> > | preempt count: 00000001 ]
> > | 1 level deep critical section nesting:
> > ----------------------------------------
> > .. [<ffffffff803f791e>] .... _spin_lock+0xe/0x70
> > .....[<0000000000000000>] .. ( <= 0x0)
> >
> > If there is anything I should test, let me know!
Thanks, Dominik.
> please enable CONFIG_FRAME_POINTERS!
Seems a bit tricky. Wouldn't it be best if enabling CONFIG_DEBUG_PREEMPT
autoselected CONFIG_KALLSYMS_ALL, CONFIG_FRAME_POINTER and whatever else
we need?
* Re: [patch] preempt-trace-fixes.patch
2005-08-05 15:13 ` [patch] preempt-trace-fixes.patch Ingo Molnar
@ 2005-08-05 18:14 ` Dominik Karall
0 siblings, 0 replies; 101+ messages in thread
From: Dominik Karall @ 2005-08-05 18:14 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Andrew Morton, linux-kernel
On Friday 05 August 2005 17:13, Ingo Molnar wrote:
> * Dominik Karall <dominik.karall@gmx.net> wrote:
> > > yeah. I've done this today and have split it out of the -RT tree, see
> > > the patch below. After some exposure in -mm i'd like this feature to go
> > > upstream too.
> > >
> > > the patch is against recent Linus trees, 2.6.13-rc4 or later should all
> > > work. Dominik, could you try it and send us the new kernel logs
> > > whenever you happen to hit that warning message again? (Please also
> > > enable CONFIG_KALLSYMS_ALL, so that we get as much symbolic data as
> > > possible.)
> >
> > I tried to compile the patch on top of 2.6.13-rc4-mm1, it applied with a
> > few offsets, but it looked ok.
> > Here is the error I get when I compiled it:
>
> ok, does the additional patch below fix things for you?
Yes, only out_count wasn't defined in softirq.c, here's the patch to fix it.
The first patch in traps.c failed on rc4-mm1, but it doesn't matter, as
sched.h seems to be already included there. I think it is even included in
-rc4 too.
dominik
-----
--- linux/kernel/softirq.c.orig 2005-08-05 20:00:28.000000000 +0200
+++ linux/kernel/softirq.c 2005-08-05 20:02:40.000000000 +0200
@@ -93,7 +93,7 @@ restart:
do {
if (pending & 1) {
#ifdef CONFIG_DEBUG_PREEMPT
- u32 in_count = preempt_count();
+ u32 in_count = preempt_count(), out_count;
#endif
h->action(h);
#ifdef CONFIG_DEBUG_PREEMPT
* Re: [patch] preempt-trace.patch (mono preempt-trace)
2005-08-05 17:58 ` Dominik Karall
@ 2005-08-05 18:46 ` Hugh Dickins
2005-08-05 19:23 ` Dominik Karall
0 siblings, 1 reply; 101+ messages in thread
From: Hugh Dickins @ 2005-08-05 18:46 UTC (permalink / raw)
To: Dominik Karall; +Cc: Ingo Molnar, Andrew Morton, linux-kernel
On Fri, 5 Aug 2005, Dominik Karall wrote:
> On Friday 05 August 2005 17:22, Ingo Molnar wrote:
> >
> > please enable CONFIG_FRAME_POINTERS!
>
> I'm sorry, but I think I can't enable CONFIG_FRAME_POINTERS.
> Depends on: DEBUG_KERNEL && (X86 && !X86_64 || CRIS || M68K || M68KNOMMU ||
> FRV || UML)
>
> Seems to be disabled for x86_64.
It is disabled for x86_64, but not for any very good reason (beyond
reducing the test matrix). I work with CONFIG_FRAME_POINTERS on x86_64
with no trouble (just add in the patch below, make oldconfig, choose
frame pointers and rebuild). But I can't guarantee it'll actually
reveal the info Ingo and all are longing to see.
Hugh
--- 2.6.13-rc5/lib/Kconfig.debug 2005-06-17 20:48:29.000000000 +0100
+++ linux/lib/Kconfig.debug 2005-07-29 18:40:28.000000000 +0100
@@ -151,7 +151,7 @@ config DEBUG_FS
config FRAME_POINTER
bool "Compile the kernel with frame pointers"
- depends on DEBUG_KERNEL && ((X86 && !X86_64) || CRIS || M68K || M68KNOMMU || FRV || UML)
+ depends on DEBUG_KERNEL && (X86 || CRIS || M68K || M68KNOMMU || FRV || UML)
default y if DEBUG_INFO && UML
help
If you say Y here the resulting kernel image will be slightly larger
* Re: [patch] preempt-trace.patch (mono preempt-trace)
2005-08-05 18:46 ` Hugh Dickins
@ 2005-08-05 19:23 ` Dominik Karall
2005-08-05 20:04 ` Ingo Molnar
0 siblings, 1 reply; 101+ messages in thread
From: Dominik Karall @ 2005-08-05 19:23 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Ingo Molnar, Andrew Morton, linux-kernel
On Friday 05 August 2005 20:46, Hugh Dickins wrote:
> On Fri, 5 Aug 2005, Dominik Karall wrote:
> > On Friday 05 August 2005 17:22, Ingo Molnar wrote:
> > > please enable CONFIG_FRAME_POINTERS!
> >
> > I'm sorry, but I think I can't enable CONFIG_FRAME_POINTERS.
> > Depends on: DEBUG_KERNEL && (X86 && !X86_64 || CRIS || M68K || M68KNOMMU
> > || FRV || UML)
> >
> > Seems to be disabled for x86_64.
>
> It is disabled for x86_64, but not for any very good reason (beyond
> reducing the test matrix). I work with CONFIG_FRAME_POINTERS on x86_64
> with no trouble, just add in the patch below, make oldconfig, choose
> frame pointers and rebuild). But I can't guarantee it'll actually
> reveal the info Ingo and all are longing to see.
With FRAME_POINTERS enabled:
BUG: mono[3193] exited with nonzero preempt_count 1!
---------------------------
| preempt count: 00000001 ]
| 1 level deep critical section nesting:
----------------------------------------
.. [<ffffffff80400a46>] .... _spin_lock+0x16/0x80
.....[<ffffffff801ed30c>] .. ( <= sys_semtimedop+0x28c/0x7c0)
hth, let me know!
dominik
* Re: [patch] preempt-trace.patch (mono preempt-trace)
2005-08-05 19:23 ` Dominik Karall
@ 2005-08-05 20:04 ` Ingo Molnar
2005-08-05 20:48 ` Dominik Karall
0 siblings, 1 reply; 101+ messages in thread
From: Ingo Molnar @ 2005-08-05 20:04 UTC (permalink / raw)
To: Dominik Karall; +Cc: Hugh Dickins, Andrew Morton, linux-kernel
* Dominik Karall <dominik.karall@gmx.net> wrote:
> With FRAME_POINTERS enabled:
>
> BUG: mono[3193] exited with nonzero preempt_count 1!
> ---------------------------
> | preempt count: 00000001 ]
> | 1 level deep critical section nesting:
> ----------------------------------------
> .. [<ffffffff80400a46>] .... _spin_lock+0x16/0x80
> .....[<ffffffff801ed30c>] .. ( <= sys_semtimedop+0x28c/0x7c0)
thanks. It seems semundo->lock somehow leaked. One possibility would be
semundo->refcnt going from 2 to 1 while another thread has it
locked. I don't see what prevents this scenario from happening. To test
this theory, could you apply the patch below, which makes semundo
locking unconditional instead of dependent on the refcount - does it fix the bug?
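The suspected interleaving can be replayed in a small userspace model. This is only a sketch under stated assumptions: struct undo_list_model, its lock_held counter, and replay_race() are hypothetical stand-ins, not kernel code, and the "race" is replayed sequentially rather than with real threads. In the kernel, spin_lock() also raises the preempt count, so a lock that is never released would show up as the nonzero preempt_count reported at exit.

```c
#include <assert.h>

/* Hypothetical stand-in for struct sem_undo_list: a refcount plus a
 * counter that models how many times the spinlock is currently held. */
struct undo_list_model {
	int refcnt;
	int lock_held;
};

/* Old behaviour: lock only while the list is shared (refcnt != 1). */
static void lock_conditional(struct undo_list_model *ul)
{
	if (ul && ul->refcnt != 1)
		ul->lock_held++;
}

static void unlock_conditional(struct undo_list_model *ul)
{
	if (ul && ul->refcnt != 1) {
		assert(ul->lock_held > 0);
		ul->lock_held--;
	}
}

/* Patched behaviour: lock whenever the list exists. */
static void lock_always(struct undo_list_model *ul)
{
	if (ul)
		ul->lock_held++;
}

static void unlock_always(struct undo_list_model *ul)
{
	if (ul) {
		assert(ul->lock_held > 0);
		ul->lock_held--;
	}
}

/* Replay the suspected interleaving and return the number of locks
 * still held afterwards. use_patch selects the unconditional variant. */
static int replay_race(int use_patch)
{
	struct undo_list_model ul = { .refcnt = 2, .lock_held = 0 };

	if (use_patch)
		lock_always(&ul);
	else
		lock_conditional(&ul);

	ul.refcnt--;	/* the other sharer exits while we hold the lock */

	if (use_patch)
		unlock_always(&ul);
	else
		unlock_conditional(&ul);

	return ul.lock_held;
}
```

In this model the conditional variant skips the unlock once the refcount has dropped to 1, so one lock stays held; the unconditional variant releases it.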
Ingo
ipc/sem.c | 10 +++-------
1 files changed, 3 insertions(+), 7 deletions(-)
Index: linux-preempt-trace/ipc/sem.c
===================================================================
--- linux-preempt-trace.orig/ipc/sem.c
+++ linux-preempt-trace/ipc/sem.c
@@ -895,7 +895,7 @@ static inline void lock_semundo(void)
struct sem_undo_list *undo_list;
undo_list = current->sysvsem.undo_list;
- if ((undo_list != NULL) && (atomic_read(&undo_list->refcnt) != 1))
+ if (undo_list)
spin_lock(&undo_list->lock);
}
@@ -915,7 +915,7 @@ static inline void unlock_semundo(void)
struct sem_undo_list *undo_list;
undo_list = current->sysvsem.undo_list;
- if ((undo_list != NULL) && (atomic_read(&undo_list->refcnt) != 1))
+ if (undo_list)
spin_unlock(&undo_list->lock);
}
@@ -943,9 +943,7 @@ static inline int get_undo_list(struct s
if (undo_list == NULL)
return -ENOMEM;
memset(undo_list, 0, size);
- /* don't initialize unodhd->lock here. It's done
- * in copy_semundo() instead.
- */
+ spin_lock_init(&undo_list->lock);
atomic_set(&undo_list->refcnt, 1);
current->sysvsem.undo_list = undo_list;
}
@@ -1231,8 +1229,6 @@ int copy_semundo(unsigned long clone_fla
error = get_undo_list(&undo_list);
if (error)
return error;
- if (atomic_read(&undo_list->refcnt) == 1)
- spin_lock_init(&undo_list->lock);
atomic_inc(&undo_list->refcnt);
tsk->sysvsem.undo_list = undo_list;
} else
* Re: [patch] preempt-trace.patch (mono preempt-trace)
2005-08-05 18:05 ` Andrew Morton
@ 2005-08-05 20:08 ` Ingo Molnar
2005-08-05 20:13 ` Ingo Molnar
1 sibling, 0 replies; 101+ messages in thread
From: Ingo Molnar @ 2005-08-05 20:08 UTC (permalink / raw)
To: Andrew Morton; +Cc: dominik.karall, linux-kernel
* Andrew Morton <akpm@osdl.org> wrote:
> > please enable CONFIG_FRAME_POINTERS!
>
> Seems a bit tricky. Wouldn't it be best if enabling
> CONFIG_DEBUG_PREEMPT autoselected CONFIG_KALLSYMS_ALL,
> CONFIG_FRAME_POINTER and whatever else we need?
ok, agreed:
-----
when DEBUG_PREEMPT is enabled, select FRAME_POINTER and KALLSYMS_ALL
as well, to make the debug output more useful.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
lib/Kconfig.debug | 3 +++
1 files changed, 3 insertions(+)
Index: linux-preempt-trace/lib/Kconfig.debug
===================================================================
--- linux-preempt-trace.orig/lib/Kconfig.debug
+++ linux-preempt-trace/lib/Kconfig.debug
@@ -70,6 +70,9 @@ config DEBUG_PREEMPT
bool "Debug preemptible kernel"
depends on DEBUG_KERNEL && PREEMPT
default y
+ select FRAME_POINTER
+ select KALLSYMS
+ select KALLSYMS_ALL
help
If you say Y here then the kernel will use a debug variant of the
commonly used smp_processor_id() function and will print warnings
* Re: [patch] preempt-trace.patch (mono preempt-trace)
2005-08-05 18:05 ` Andrew Morton
2005-08-05 20:08 ` Ingo Molnar
@ 2005-08-05 20:13 ` Ingo Molnar
1 sibling, 0 replies; 101+ messages in thread
From: Ingo Molnar @ 2005-08-05 20:13 UTC (permalink / raw)
To: Andrew Morton; +Cc: dominik.karall, linux-kernel
here's a full patch again of all things preempt-trace (excludes the sysv
semaphores change):
--------
this patch implements the "non-preemptible section trace" feature, which
prints out a "critical section nesting" trace after stackdumps:
Call Trace:
[<c0103db1>] show_stack+0x7a/0x90
[<c0103f36>] show_registers+0x156/0x1ce
[<c010412e>] die+0xe8/0x172
[<c010422e>] do_trap+0x76/0xa3
[<c01044fe>] do_invalid_op+0xa3/0xad
[<c01039ef>] error_code+0x4f/0x54
[<c0120be9>] test+0x8/0xa
[<c0120c41>] sys_gettimeofday+0x56/0x74
[<c0102eeb>] sysenter_past_esp+0x54/0x75
---------------------------
| preempt count: 00000004 ]
| 4 levels deep critical section nesting:
-----------------------------------------
.. [<c0120bbe>] .... test3+0xd/0xf
.....[<c0120bc8>] .. ( <= test2+0x8/0x21)
.. [<c0120bbe>] .... test3+0xd/0xf
.....[<c0120bcd>] .. ( <= test2+0xd/0x21)
.. [<c0120bd7>] .... test2+0x17/0x21
.....[<c0120be9>] .. ( <= test+0x8/0xa)
.. [<c010407f>] .... die+0x39/0x172
.....[<c010422e>] .. ( <= do_trap+0x76/0xa3)
this feature is implemented via a low-overhead mechanism by keeping
the caller and caller-parent addresses for each preempt_disable()
call site, and printing them upon crashes. Note that every other
locking API is thus traced too, such as spinlocks, rwlocks, per-cpu
variables, etc. This feature is especially useful in identifying
leaked preemption counts, as the missing count is displayed as an
extra entry in the stack.
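A minimal userspace sketch of that bookkeeping follows. It is an illustration only: struct task_model and the model_* functions are hypothetical names, strings stand in for the return addresses that the real patch records via __builtin_return_address(), and the count always changes by exactly 1 here, whereas add_preempt_count() takes an arbitrary value.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_TRACE 25	/* mirrors MAX_PREEMPT_TRACE in the patch */

/* Hypothetical per-task state: a nesting counter plus one recorded
 * call site per nesting level. Strings stand in for return addresses. */
struct task_model {
	unsigned int count;
	const char *off_caller[MAX_TRACE];
	const char *off_parent[MAX_TRACE];
};

/* Record the call site at the current depth, then go one level deeper. */
static void model_preempt_disable(struct task_model *t,
				  const char *caller, const char *parent)
{
	unsigned int idx = t->count;

	if (idx < MAX_TRACE) {
		t->off_caller[idx] = caller;
		t->off_parent[idx] = parent;
	}
	t->count++;
}

/* Leaving a section only drops the depth; a stale slot is overwritten
 * by the next disable that happens at the same depth. */
static void model_preempt_enable(struct task_model *t)
{
	assert(t->count > 0);
	t->count--;
}

/* Print one entry per level still held; returns the number printed. */
static unsigned int model_print_trace(const struct task_model *t)
{
	unsigned int i, n = t->count < MAX_TRACE ? t->count : MAX_TRACE;

	for (i = 0; i < n; i++)
		printf(".. %s ( <= %s)\n", t->off_caller[i], t->off_parent[i]);
	return n;
}
```

A section that is entered but never left keeps its slot below the current depth, which is why a leaked count appears as an extra entry when the trace is printed.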
the feature is active when DEBUG_PREEMPT is enabled.
i've also cleaned up preemption-count debugging by moving the debug
functions out of sched.c into lib/preempt.c.
also, i have added preemption-counter-imbalance checks to the hardirq
and softirq processing codepaths. The behavior of preemption-counter
checks is now uniform: a warning is printed with all info we have at
that point, and the preemption counter is then restored to the old
value.
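The check-warn-restore pattern can be sketched in userspace as below. The names model_count, run_checked(), balanced_handler() and leaky_handler() are assumptions made for illustration; in the patch the same steps are open-coded around the irq, softirq and timer handler calls, with preempt_count() as the counter.

```c
#include <assert.h>
#include <stdio.h>

static int model_count;	/* hypothetical stand-in for preempt_count() */

typedef void (*handler_fn)(void);

/* Run a handler; if it returns with a different count, report the
 * imbalance and restore the entry value. Returns 1 on imbalance. */
static int run_checked(const char *name, handler_fn fn)
{
	int in_count = model_count, out_count;

	assert(fn != NULL);
	fn();
	out_count = model_count;
	if (in_count != out_count) {
		fprintf(stderr, "BUG: %s preempt-count imbalance: "
			"in=%08x, out=%08x!\n", name,
			(unsigned int)in_count, (unsigned int)out_count);
		model_count = in_count;	/* fix up the bad count */
		return 1;
	}
	return 0;
}

/* Example handlers: one balanced, one that leaks a count. */
static void balanced_handler(void) { model_count++; model_count--; }
static void leaky_handler(void) { model_count++; }
```

Restoring the entry value after the warning lets the caller continue with a sane count instead of carrying the leak forward.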
on x86 i have changed the 4KSTACKS feature to inherit the low bits of
the preemption-count across hardirq/softirq-context switching, so that
the preemption trace entries of interrupts do not overwrite process
level preemption trace entries.
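The offset arithmetic can be illustrated as follows. This is a sketch under assumptions: MODEL_PREEMPT_MASK is taken to be the low byte of the count (as in the 2.6-era layout), and irq_entry_count()/irq_exit_count() are hypothetical helpers, not functions from the patch, which adjusts irqctx->tinfo.preempt_count in place.

```c
#include <assert.h>

/* Assumed layout: the low byte of the count holds the nesting depth. */
#define MODEL_PREEMPT_MASK 0x000000ffu

/* Count to seed into the separate irq-stack context before switching,
 * so trace slots used by the interrupted task are not reused. */
static unsigned int irq_entry_count(unsigned int irq_ctx_count,
				    unsigned int task_count)
{
	return irq_ctx_count + (task_count & MODEL_PREEMPT_MASK);
}

/* Undo the offset once the handler has run on the irq stack. */
static unsigned int irq_exit_count(unsigned int irq_ctx_count,
				   unsigned int task_count)
{
	unsigned int offset = task_count & MODEL_PREEMPT_MASK;

	assert(irq_ctx_count >= offset);
	return irq_ctx_count - offset;
}
```

With a task nested 3 levels deep, the irq context starts with depth 3 in its low bits, so its first recorded call site lands in slot 3 and slots 0 to 2 keep the process-level entries.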
boot-tested on x86. Should work on all architectures.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Index: linux-preempt-trace/arch/i386/kernel/irq.c
===================================================================
--- linux-preempt-trace.orig/arch/i386/kernel/irq.c
+++ linux-preempt-trace/arch/i386/kernel/irq.c
@@ -55,6 +55,9 @@ fastcall unsigned int do_IRQ(struct pt_r
{
/* high bits used in ret_from_ code */
int irq = regs->orig_eax & 0xff;
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 count = preempt_count() & PREEMPT_MASK;
+#endif
#ifdef CONFIG_4KSTACKS
union irq_ctx *curctx, *irqctx;
u32 *isp;
@@ -95,6 +98,14 @@ fastcall unsigned int do_IRQ(struct pt_r
irqctx->tinfo.task = curctx->tinfo.task;
irqctx->tinfo.previous_esp = current_stack_pointer;
+ /*
+ * Keep the preemption-count offset, so that the
+ * process-level preemption-trace entries do not
+ * get overwritten by the hardirq context:
+ */
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count += count;
+#endif
asm volatile(
" xchgl %%ebx,%%esp \n"
" call __do_IRQ \n"
@@ -103,6 +114,9 @@ fastcall unsigned int do_IRQ(struct pt_r
: "0" (irq), "1" (regs), "2" (isp)
: "memory", "cc", "ecx"
);
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count -= count;
+#endif
} else
#endif
__do_IRQ(irq, regs);
@@ -165,6 +179,9 @@ extern asmlinkage void __do_softirq(void
asmlinkage void do_softirq(void)
{
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 count = preempt_count() & PREEMPT_MASK;
+#endif
unsigned long flags;
struct thread_info *curctx;
union irq_ctx *irqctx;
@@ -181,6 +198,14 @@ asmlinkage void do_softirq(void)
irqctx->tinfo.task = curctx->task;
irqctx->tinfo.previous_esp = current_stack_pointer;
+ /*
+ * Keep the preemption-count offset, so that the
+ * process-level preemption-trace entries do not
+ * get overwritten by the softirq context:
+ */
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count += count;
+#endif
/* build the stack frame on the softirq stack */
isp = (u32*) ((char*)irqctx + sizeof(*irqctx));
@@ -192,6 +217,9 @@ asmlinkage void do_softirq(void)
: "0"(isp)
: "memory", "cc", "edx", "ecx", "eax"
);
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count -= count;
+#endif
}
local_irq_restore(flags);
Index: linux-preempt-trace/arch/i386/kernel/traps.c
===================================================================
--- linux-preempt-trace.orig/arch/i386/kernel/traps.c
+++ linux-preempt-trace/arch/i386/kernel/traps.c
@@ -164,6 +164,7 @@ void show_trace(struct task_struct *task
break;
printk(" =======================\n");
}
+ print_preempt_trace(task, task->thread_info->preempt_count);
}
void show_stack(struct task_struct *task, unsigned long *esp)
Index: linux-preempt-trace/arch/x86_64/kernel/process.c
===================================================================
--- linux-preempt-trace.orig/arch/x86_64/kernel/process.c
+++ linux-preempt-trace/arch/x86_64/kernel/process.c
@@ -311,7 +311,7 @@ void __show_regs(struct pt_regs * regs)
void show_regs(struct pt_regs *regs)
{
__show_regs(regs);
- show_trace(&regs->rsp);
+ show_trace(current, &regs->rsp);
}
/*
Index: linux-preempt-trace/arch/x86_64/kernel/traps.c
===================================================================
--- linux-preempt-trace.orig/arch/x86_64/kernel/traps.c
+++ linux-preempt-trace/arch/x86_64/kernel/traps.c
@@ -29,6 +29,7 @@
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/nmi.h>
+#include <linux/sched.h>
#include <asm/system.h>
#include <asm/uaccess.h>
@@ -156,7 +157,7 @@ static unsigned long *in_exception_stack
* severe exception (double fault, nmi, stack fault, debug, mce) hardware stack
*/
-void show_trace(unsigned long *stack)
+void show_trace(struct task_struct *task, unsigned long *stack)
{
unsigned long addr;
const unsigned cpu = safe_smp_processor_id();
@@ -221,6 +222,7 @@ void show_trace(unsigned long *stack)
HANDLE_STACK (((long) stack & (THREAD_SIZE-1)) != 0);
#undef HANDLE_STACK
printk("\n");
+ print_preempt_trace(task, task->thread_info->preempt_count);
}
void show_stack(struct task_struct *tsk, unsigned long * rsp)
@@ -257,7 +259,7 @@ void show_stack(struct task_struct *tsk,
printk("%016lx ", *stack++);
touch_nmi_watchdog();
}
- show_trace((unsigned long *)rsp);
+ show_trace(tsk, (unsigned long *)rsp);
}
/*
@@ -266,7 +268,7 @@ void show_stack(struct task_struct *tsk,
void dump_stack(void)
{
unsigned long dummy;
- show_trace(&dummy);
+ show_trace(current, &dummy);
}
EXPORT_SYMBOL(dump_stack);
Index: linux-preempt-trace/include/asm-x86_64/proto.h
===================================================================
--- linux-preempt-trace.orig/include/asm-x86_64/proto.h
+++ linux-preempt-trace/include/asm-x86_64/proto.h
@@ -66,7 +66,7 @@ extern unsigned long end_pfn_map;
extern cpumask_t cpu_initialized;
-extern void show_trace(unsigned long * rsp);
+extern void show_trace(struct task_struct *task, unsigned long *rsp);
extern void show_registers(struct pt_regs *regs);
extern void exception_table_check(void);
Index: linux-preempt-trace/include/linux/sched.h
===================================================================
--- linux-preempt-trace.orig/include/linux/sched.h
+++ linux-preempt-trace/include/linux/sched.h
@@ -592,6 +592,14 @@ extern int groups_search(struct group_in
#define GROUP_AT(gi, i) \
((gi)->blocks[(i)/NGROUPS_PER_BLOCK][(i)%NGROUPS_PER_BLOCK])
+#ifdef CONFIG_DEBUG_PREEMPT
+# define MAX_PREEMPT_TRACE 25
+extern void print_preempt_trace(struct task_struct *task, u32 count);
+#else
+static inline void print_preempt_trace(struct task_struct *task, u32 count)
+{
+}
+#endif
struct audit_context; /* See audit.c */
struct mempolicy;
@@ -770,6 +778,11 @@ struct task_struct {
int cpuset_mems_generation;
#endif
atomic_t fs_excl; /* holding fs exclusive resources */
+
+#ifdef CONFIG_DEBUG_PREEMPT
+ void *preempt_off_caller[MAX_PREEMPT_TRACE];
+ void *preempt_off_parent[MAX_PREEMPT_TRACE];
+#endif
};
static inline pid_t process_group(struct task_struct *tsk)
Index: linux-preempt-trace/kernel/exit.c
===================================================================
--- linux-preempt-trace.orig/kernel/exit.c
+++ linux-preempt-trace/kernel/exit.c
@@ -821,10 +821,11 @@ fastcall NORET_TYPE void do_exit(long co
tsk->it_prof_expires = cputime_zero;
tsk->it_sched_expires = 0;
- if (unlikely(in_atomic()))
- printk(KERN_INFO "note: %s[%d] exited with preempt_count %d\n",
- current->comm, current->pid,
- preempt_count());
+ if (unlikely(in_atomic())) {
+ printk(KERN_ERR "BUG: %s[%d] exited with nonzero preempt_count %d!\n",
+ tsk->comm, tsk->pid, preempt_count());
+ print_preempt_trace(tsk, preempt_count());
+ }
acct_update_integrals(tsk);
update_mem_hiwater(tsk);
Index: linux-preempt-trace/kernel/irq/handle.c
===================================================================
--- linux-preempt-trace.orig/kernel/irq/handle.c
+++ linux-preempt-trace/kernel/irq/handle.c
@@ -85,7 +85,24 @@ fastcall int handle_IRQ_event(unsigned i
local_irq_enable();
do {
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 in_count = preempt_count(), out_count;
+#endif
ret = action->handler(irq, action->dev_id, regs);
+#ifdef CONFIG_DEBUG_PREEMPT
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ printk(KERN_ERR "BUG: irq %d [%s] preempt-count "
+ "imbalance: in=%08x, out=%08x!\n",
+ irq, action->name, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * We already printed all the useful info,
+ * fix up the preemption count now:
+ */
+ preempt_count() = in_count;
+ }
+#endif
if (ret == IRQ_HANDLED)
status |= action->flags;
retval |= ret;
Index: linux-preempt-trace/kernel/sched.c
===================================================================
--- linux-preempt-trace.orig/kernel/sched.c
+++ linux-preempt-trace/kernel/sched.c
@@ -47,6 +47,7 @@
#include <linux/syscalls.h>
#include <linux/times.h>
#include <linux/acct.h>
+#include <linux/kallsyms.h>
#include <asm/tlb.h>
#include <asm/unistd.h>
@@ -2707,38 +2708,6 @@ static inline int dependent_sleeper(int
}
#endif
-#if defined(CONFIG_PREEMPT) && defined(CONFIG_DEBUG_PREEMPT)
-
-void fastcall add_preempt_count(int val)
-{
- /*
- * Underflow?
- */
- BUG_ON((preempt_count() < 0));
- preempt_count() += val;
- /*
- * Spinlock count overflowing soon?
- */
- BUG_ON((preempt_count() & PREEMPT_MASK) >= PREEMPT_MASK-10);
-}
-EXPORT_SYMBOL(add_preempt_count);
-
-void fastcall sub_preempt_count(int val)
-{
- /*
- * Underflow?
- */
- BUG_ON(val > preempt_count());
- /*
- * Is the spinlock portion underflowing?
- */
- BUG_ON((val < PREEMPT_MASK) && !(preempt_count() & PREEMPT_MASK));
- preempt_count() -= val;
-}
-EXPORT_SYMBOL(sub_preempt_count);
-
-#endif
-
/*
* schedule() is the main scheduler function.
*/
Index: linux-preempt-trace/kernel/softirq.c
===================================================================
--- linux-preempt-trace.orig/kernel/softirq.c
+++ linux-preempt-trace/kernel/softirq.c
@@ -92,7 +92,23 @@ restart:
do {
if (pending & 1) {
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 in_count = preempt_count(), out_count;
+#endif
h->action(h);
+#ifdef CONFIG_DEBUG_PREEMPT
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ printk(KERN_ERR "BUG: softirq %ld preempt-count "
+ "imbalance: in=%08x, out=%08x!\n",
+ h - softirq_vec, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * Fix up the bad preemption count:
+ */
+ preempt_count() = in_count;
+ }
+#endif
rcu_bh_qsctr_inc(cpu);
}
h++;
Index: linux-preempt-trace/kernel/timer.c
===================================================================
--- linux-preempt-trace.orig/kernel/timer.c
+++ linux-preempt-trace/kernel/timer.c
@@ -33,6 +33,7 @@
#include <linux/posix-timers.h>
#include <linux/cpu.h>
#include <linux/syscalls.h>
+#include <linux/kallsyms.h>
#include <asm/uaccess.h>
#include <asm/unistd.h>
@@ -480,6 +481,7 @@ static inline void __run_timers(tvec_bas
while (!list_empty(head)) {
void (*fn)(unsigned long);
unsigned long data;
+ int in_count, out_count;
timer = list_entry(head->next,struct timer_list,entry);
fn = timer->function;
@@ -488,17 +490,20 @@ static inline void __run_timers(tvec_bas
set_running_timer(base, timer);
detach_timer(timer, 1);
spin_unlock_irq(&base->t_base.lock);
- {
- int preempt_count = preempt_count();
- fn(data);
- if (preempt_count != preempt_count()) {
- printk(KERN_WARNING "huh, entered %p "
- "with preempt_count %08x, exited"
- " with %08x?\n",
- fn, preempt_count,
- preempt_count());
- BUG();
- }
+
+ in_count = preempt_count();
+ fn(data);
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ print_symbol(KERN_ERR "BUG: %s", (long)fn);
+ printk(KERN_ERR "(%p) preempt-count imbalance: "
+ "in=%08x, out=%08x!\n",
+ fn, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * Fix up the bad preemption count:
+ */
+ preempt_count() = in_count;
}
spin_lock_irq(&base->t_base.lock);
}
@@ -914,6 +919,10 @@ static void run_timer_softirq(struct sof
if (time_after_eq(jiffies, base->timer_jiffies))
__run_timers(base);
+ if (panic_timeout == 2) {
+ panic_timeout = 0;
+ preempt_disable();
+ }
}
/*
@@ -922,6 +931,10 @@ static void run_timer_softirq(struct sof
void run_local_timers(void)
{
raise_softirq(TIMER_SOFTIRQ);
+ if (panic_timeout == 1) {
+ panic_timeout = 0;
+ preempt_disable();
+ }
}
/*
Index: linux-preempt-trace/lib/Kconfig.debug
===================================================================
--- linux-preempt-trace.orig/lib/Kconfig.debug
+++ linux-preempt-trace/lib/Kconfig.debug
@@ -70,6 +70,9 @@ config DEBUG_PREEMPT
bool "Debug preemptible kernel"
depends on DEBUG_KERNEL && PREEMPT
default y
+ select FRAME_POINTER
+ select KALLSYMS
+ select KALLSYMS_ALL
help
If you say Y here then the kernel will use a debug variant of the
commonly used smp_processor_id() function and will print warnings
Index: linux-preempt-trace/lib/Makefile
===================================================================
--- linux-preempt-trace.orig/lib/Makefile
+++ linux-preempt-trace/lib/Makefile
@@ -20,7 +20,7 @@ lib-$(CONFIG_RWSEM_GENERIC_SPINLOCK) +=
lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
lib-$(CONFIG_GENERIC_FIND_NEXT_BIT) += find_next_bit.o
obj-$(CONFIG_LOCK_KERNEL) += kernel_lock.o
-obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o
+obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o preempt.o
ifneq ($(CONFIG_HAVE_DEC_LOCK),y)
lib-y += dec_and_lock.o
Index: linux-preempt-trace/lib/preempt.c
===================================================================
--- /dev/null
+++ linux-preempt-trace/lib/preempt.c
@@ -0,0 +1,101 @@
+/*
+ * lib/preempt.c
+ *
+ * DEBUG_PREEMPT variant of add_preempt_count() and sub_preempt_count().
+ * Preemption tracing.
+ *
+ * (C) 2005 Ingo Molnar, Red Hat
+ */
+#include <linux/module.h>
+#include <linux/hardirq.h>
+#include <linux/kallsyms.h>
+
+/*
+ * Add a value to the preemption count, and check for overflows,
+ * underflows and maintain a small stack of callers that gets
+ * printed upon crashes.
+ */
+void fastcall add_preempt_count(int val)
+{
+ unsigned int count = preempt_count(), idx = count & PREEMPT_MASK;
+
+ /*
+ * Underflow?
+ */
+ BUG_ON((int)count < 0);
+
+ preempt_count() += val;
+
+ /*
+ * Spinlock count overflowing soon?
+ */
+ BUG_ON(idx >= PREEMPT_MASK-10);
+
+ /*
+ * Maintain the per-task preemption-nesting stack (which
+ * will be printed upon crashes). It's a low-overhead thing,
+ * constant overhead per preempt-disable.
+ */
+ if (idx < MAX_PREEMPT_TRACE) {
+ void *caller = __builtin_return_address(0), *parent = NULL;
+
+#ifdef CONFIG_FRAME_POINTER
+ parent = __builtin_return_address(1);
+ if (in_lock_functions(parent)) {
+ parent = __builtin_return_address(2);
+ if (in_lock_functions(parent))
+ parent = __builtin_return_address(3);
+ }
+#endif
+ current->preempt_off_caller[idx] = caller;
+ current->preempt_off_parent[idx] = parent;
+ }
+}
+EXPORT_SYMBOL(add_preempt_count);
+
+void fastcall sub_preempt_count(int val)
+{
+ unsigned int count = preempt_count();
+
+ /*
+ * Underflow?
+ */
+ BUG_ON(val > count);
+ /*
+ * Is the spinlock portion underflowing?
+ */
+ BUG_ON((val < PREEMPT_MASK) && !(count & PREEMPT_MASK));
+
+ preempt_count() -= val;
+}
+EXPORT_SYMBOL(sub_preempt_count);
+
+void print_preempt_trace(struct task_struct *task, u32 count)
+{
+ unsigned int i, idx = count & PREEMPT_MASK;
+
+ preempt_disable();
+
+ printk("---------------------------\n");
+ printk("| preempt count: %08x ]\n", count);
+ if (count) {
+ printk("| %d level deep critical section nesting:\n", idx);
+ printk("----------------------------------------\n");
+ } else
+ printk("---------------------------\n");
+ for (i = 0; i < idx; i++) {
+ printk(".. [<%p>] .... ", task->preempt_off_caller[i]);
+ print_symbol("%s\n", (long)task->preempt_off_caller[i]);
+ printk(".....[<%p>] .. ( <= ",
+ task->preempt_off_parent[i]);
+ print_symbol("%s)\n", (long)task->preempt_off_parent[i]);
+ if (i == MAX_PREEMPT_TRACE-1) {
+ printk("[rest truncated, reached MAX_PREEMPT_TRACE]\n");
+ break;
+ }
+ }
+ printk("\n");
+
+ preempt_enable();
+}
+
* Re: [patch] preempt-trace.patch (mono preempt-trace)
2005-08-05 20:04 ` Ingo Molnar
@ 2005-08-05 20:48 ` Dominik Karall
0 siblings, 0 replies; 101+ messages in thread
From: Dominik Karall @ 2005-08-05 20:48 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Hugh Dickins, Andrew Morton, linux-kernel
On Friday 05 August 2005 22:04, Ingo Molnar wrote:
> * Dominik Karall <dominik.karall@gmx.net> wrote:
> > With FRAME_POINTERS enabled:
> >
> > BUG: mono[3193] exited with nonzero preempt_count 1!
> > ---------------------------
> >
> > | preempt count: 00000001 ]
> > | 1 level deep critical section nesting:
> >
> > ----------------------------------------
> > .. [<ffffffff80400a46>] .... _spin_lock+0x16/0x80
> > .....[<ffffffff801ed30c>] .. ( <= sys_semtimedop+0x28c/0x7c0)
>
> thanks. It seems semundo->lock somehow leaked. One possibility would be
> of semundo->refcount going from 2 to 1 while another thread has it
> locked. I dont see what prevents this scenario from happening. To test
> this theory, could you apply the patch below, which will do semundo
> locking not conditional on the refcount - does it fix the bug?
yeah! it works, great job! :)
dominik
Thread overview: 101+ messages
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
2005-06-07 14:24 ` 2.6.12-rc6-mm1 Wolfgang Wander
2005-06-07 14:49 ` 2.6.12-rc6-mm1 Wolfgang Wander
2005-06-07 14:48 ` 2.6.12-rc6-mm1 Brice Goglin
2005-06-07 20:59 ` 2.6.12-rc6-mm1: rio confusion Adrian Bunk
2005-06-07 21:38 ` Matt Porter
2005-06-07 23:15 ` 2.6.12-rc6-mm1 Francois Romieu
2005-06-08 1:59 ` 2.6.12-rc6-mm1 Søren Lott
2005-06-08 5:53 ` 2.6.12-rc6-mm1 Jean Delvare
2005-06-08 7:08 ` 2.6.12-rc6-mm1 Søren Lott
2005-06-09 3:47 ` [lm-sensors] 2.6.12-rc6-mm1 Mark M. Hoffman
2005-06-08 14:22 ` 2.6.12-rc6-mm1 Andy Whitcroft
2005-06-08 20:01 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-08 23:14 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-08 23:22 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-08 23:34 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-09 7:17 ` 2.6.12-rc6-mm1 Kirill Korotaev
2005-06-09 13:38 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-10 12:12 ` 2.6.12-rc6-mm1 Kirill Korotaev
2005-06-09 4:27 ` 2.6.12-rc6-mm1 Andrey Panin
2005-06-09 13:12 ` 2.6.12-rc6-mm1 Andy Whitcroft
2005-06-08 14:33 ` BUG in i2c_detach_client Andrew James Wade
2005-06-08 16:21 ` Jean Delvare
2005-06-08 21:26 ` Andrew Morton
2005-06-08 22:56 ` Andrew James Wade
2005-06-08 23:32 ` Andrew Morton
2005-06-09 7:52 ` Jean Delvare
2005-06-09 7:47 ` Jean Delvare
2005-06-09 11:05 ` Andrew James Wade
2005-06-09 13:32 ` Andrew James Wade
2005-06-09 15:57 ` Jean Delvare
2005-06-10 5:58 ` Greg KH
2005-06-10 7:08 ` Jean Delvare
2005-06-11 11:51 ` 2.6.12-rc6-mm1 Benoit Boissinot
2005-06-18 22:39 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-18 22:44 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-18 22:57 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-18 23:11 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-18 23:18 ` 2.6.12-rc6-mm1 Russell King
2005-06-19 1:20 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-19 9:02 ` 2.6.12-rc6-mm1 Russell King
2005-06-19 9:11 ` 2.6.12-rc6-mm1 Russell King
2005-06-19 17:12 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-19 17:39 ` 2.6.12-rc6-mm1 Russell King
2005-06-19 18:25 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-19 18:56 ` 2.6.12-rc6-mm1 Russell King
2005-06-21 13:20 ` 2.6.12-rc6-mm1 Dominik Karall
2005-06-24 21:27 ` 2.6.12-rc6-mm1 Alexey Dobriyan
2005-07-29 4:54 ` 2.6.12-rc6-mm1 Andrew Morton
2005-07-29 13:39 ` 2.6.12-rc6-mm1 Dominik Karall
2005-07-29 18:22 ` 2.6.12-rc6-mm1 Andrew Morton
2005-07-29 21:19 ` 2.6.12-rc6-mm1 Dominik Karall
2005-07-29 21:27 ` 2.6.12-rc6-mm1 Andrew Morton
2005-07-29 21:37 ` 2.6.12-rc6-mm1 Dominik Karall
2005-08-04 19:44 ` 2.6.12-rc6-mm1 Andrew Morton
2005-08-04 22:28 ` 2.6.12-rc6-mm1 Andrew Morton
2005-08-04 22:44 ` 2.6.12-rc6-mm1 Dominik Karall
2005-08-05 10:48 ` [patch] preempt-trace.patch Ingo Molnar
2005-08-05 11:44 ` Dominik Karall
2005-08-05 15:13 ` [patch] preempt-trace-fixes.patch Ingo Molnar
2005-08-05 18:14 ` Dominik Karall
2005-08-05 14:26 ` [patch] preempt-trace.patch (mono preempt-trace) Dominik Karall
2005-08-05 15:22 ` Ingo Molnar
2005-08-05 17:58 ` Dominik Karall
2005-08-05 18:46 ` Hugh Dickins
2005-08-05 19:23 ` Dominik Karall
2005-08-05 20:04 ` Ingo Molnar
2005-08-05 20:48 ` Dominik Karall
2005-08-05 18:05 ` Andrew Morton
2005-08-05 20:08 ` Ingo Molnar
2005-08-05 20:13 ` Ingo Molnar
-- strict thread matches above, loose matches on Subject: below --
2005-06-07 23:50 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-07 23:56 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-08 0:02 ` 2.6.12-rc6-mm1 Christoph Lameter
2005-06-08 0:08 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-08 3:17 ` 2.6.12-rc6-mm1 Nick Piggin
2005-06-08 3:33 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-08 3:50 ` 2.6.12-rc6-mm1 Nick Piggin
2005-06-08 14:15 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-09 23:56 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-10 7:02 ` 2.6.12-rc6-mm1 Ingo Molnar
2005-06-10 12:03 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-10 14:19 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-10 23:14 ` 2.6.12-rc6-mm1 J.A. Magallon
2005-06-10 23:59 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 0:18 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 0:32 ` 2.6.12-rc6-mm1 J.A. Magallon
2005-06-11 0:48 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 0:52 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-10 23:50 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-11 4:14 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-11 5:22 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 5:56 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-11 20:13 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-11 22:20 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 23:27 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-11 23:47 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-12 0:23 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-12 5:19 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-09 1:58 ` 2.6.12-rc6-mm1 Lee Revell
2005-06-08 0:02 ` 2.6.12-rc6-mm1 Martin J. Bligh