linux-kernel.vger.kernel.org archive mirror
* 2.6.17-rc5-mm3
@ 2006-06-04  6:20 Andrew Morton
  2006-06-04  9:38 ` 2.6.17-rc5-mm3 Barry K. Nathan
                   ` (7 more replies)
  0 siblings, 8 replies; 52+ messages in thread
From: Andrew Morton @ 2006-06-04  6:20 UTC (permalink / raw)
  To: linux-kernel


ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/

- Lots of PCI and USB updates

- The various lock validator, stack backtracing and IRQ management problems
  are converging, but we're not quite there yet.



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.
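
The "around ten rebuilds" estimate for the bisection search follows from
halving the patch series on every test. A minimal Python sketch, assuming a
hypothetical is_bad(k) callback that applies the first k patches, builds,
boots and reports whether the bug appears (the patch names and the culprit
index are made up for illustration):

```python
import math

def first_bad_patch(patches, is_bad):
    """Return (index of the first bad patch, number of builds tested).

    Assumes is_bad(0) is False, is_bad(len(patches)) is True, and that
    the bug stays present once the faulty patch has been applied.
    """
    lo, hi = 0, len(patches)   # bug absent with lo patches, present with hi
    builds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        builds += 1            # one rebuild and reboot per probe
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi - 1, builds      # patches[hi - 1] introduced the bug

# Hypothetical series the same size as this release (1492 patches).
patches = [f"patch-{i:04d}" for i in range(1492)]
culprit = 1000                 # made-up position of the faulty patch
index, builds = first_bad_patch(patches, lambda k: k > culprit)
print(patches[index], builds)

# Upper bound on the number of builds for a series of this size.
print(math.ceil(math.log2(len(patches))))
```

Each probe costs one rebuild and reboot, so a series of this size needs at
most 11 of them, which is why reporting the bug first can be cheaper.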




Changes since 2.6.17-rc5-mm2:


 origin.patch
 git-acpi.patch
 git-agpgart.patch
 git-alsa.patch
 git-audit-master.patch
 git-block.patch
 git-cifs.patch
 git-cpufreq.patch
 git-cpufreq-fixup.patch
 git-dvb.patch
 git-gfs2.patch
 git-ia64.patch
 git-infiniband.patch
 git-intelfb.patch
 git-klibc.patch
 git-hdrcleanup.patch
 git-hdrinstall.patch
 git-libata-all.patch
 git-mips.patch
 git-mtd.patch
 git-netdev-all.patch
 git-net.patch
 git-nfs.patch
 git-powerpc.patch
 git-rbtree.patch
 git-sas.patch
 git-pcmcia.patch
 git-scsi-target.patch
 git-supertrak.patch
 git-watchdog.patch
 git-cryptodev.patch

 git trees

-drivers-usb-core-devioc-dereference-userspace-pointer.patch
-nbd-endian-annotations.patch
-cifs-build-fix.patch
-git-cifs-kconfig-fix.patch
-cifs-do-not-overwrite-aops-elements.patch
-scx200_acb-use-pci-i-o-resource-when-appropriate-fix.patch
-git-mtd-cs553x_nand-build-fix.patch
-pmf_register_irq_client-gives-sleep-with-locks-held-warning.patch
-64-bit-resources-arch-powerpc-changes-update.patch
-fix-pciehp-compile-issue-when-config_acpi-is-not.patch
-gregkh-pci-pci-64-bit-resources-drivers-others-changes-amba-fix.patch
-i386-export-memory-more-than-4g-through-proc-iomem.patch
-pci-pci-64-bit-resources-arch-changes-update.patch
-improve-pci-config-space-writeback.patch
-reverse-pci-config-space-restore-order.patch
-pci-add-pci_assign_resource_fixed-allow-fixed-address.patch
-add-a-enable-sysfs-attribute-to-the-pci-devices-to-allow.patch
-fix-recovery-path-from-errors-during-pcie_init.patch
-move-various-pci-ids-to-header-file.patch
-kconfigurable-resources-core-changes.patch
-kconfigurable-resources-core-changes-i386-fix.patch
-kconfigurable-resources-core-changes-fix.patch
-kconfigurable-resources-driver-pci-changes.patch
-kconfigurable-resources-driver-others-changes.patch
-kconfigurable-resources-arch-dependent-changes-arch-a-i.patch
-kconfigurable-resources-arch-dependent-changes-arch-a-i-fix.patch
-kconfigurable-resources-arch-dependent-changes-arch-j-p.patch
-kconfigurable-resources-arch-dependent-changes-arch-q-z.patch
-typesh-sector_t-and-blkcnt_t-arent-for-userspace.patch
-allow-msi-to-work-on-kexec-kernel.patch
-pci-disable-msi-mode-in-pci_disable_device.patch
-pci-dont-move-ioapics-below-pci-bridge.patch
-git-scsi-rc-fixes.patch
-scsi-properly-count-the-number-of-pages-in-scsi_req_map_sg-fix.patch
-revert-gregkh-usb-usb-ohci-avoids-root-hub-timer-polling.patch
-gregkh-usb-usb-serial-mos7720-powerpc-wrokaround.patch
-usb-add-sierra-wireless-mc5720-id-to-airprimec.patch
-usb-negative-index-in-drivers-usb-host-isp116x-hcdc.patch
-xfs-sparc32-build-fix.patch
-add-pci_cap_id_vndr.patch

 Merged into mainline or a subsystem tree

+alpha-smp-irq-routing-fix.patch
+fs-nameic-call-to-file_permission-under-a-spinlock-in-do_lookup_path.patch
+fs-nameic-call-to-file_permission-under-a-spinlock-in-do_lookup_path-fix.patch
+pmf_register_irq_client-gives-sleep-with-locks-held-warning.patch
+implement-get--set-tso-for-forcedeth-driver.patch
+fix-hpet-operation-on-32-bit-nvidia-platforms.patch
+fix-hpet-operation-on-32-bit-nvidia-platforms-build-fix.patch
+fix-hpet-operation-on-64-bit-nvidia-platforms.patch
+maintainers-add-entries-for-bnx2-and-tg3.patch
+sbp2-fix-check-of-return-value-of.patch
+sata_sil24-sii3124-sata-driver-endian-problem.patch
+m48t86-ia64-build-fix.patch
+m68k-get_user-build-fix.patch
+uml-add-asm-irqflagsh.patch
+uml-fix-wall_to_monotonic-initialization.patch
+uml-fix-a-typo-in-do_uml_initcalls.patch
+uml-__user-annotation-in-arch_prctl.patch
+uml-more-__user-annotations.patch
+uml-add-ffreestanding-to-cflags.patch

 2.6.17 queue

+kevent-add-new-uevent.patch

 Required for acpi-dock-driver.patch

-acpi-dock-driver-v3.patch
-acpi-dock-driver-v4.patch
-acpi-dock-driver-interface-fixups.patch

 Folded into acpi-dock-driver.patch

-acpiphp-use-new-dock-driver-fix.patch
-acpiphp-use-new-dock-driver-v2.patch

 Folded into acpiphp-use-new-dock-driver.patch

-acpi-atlas-acpi-driver-v2-tidy.patch

 Folded into acpi-atlas-acpi-driver.patch

+acpi-atlas-acpi-driver-fix.patch

 Fix acpi-atlas-acpi-driver.patch

+ieee1394-video1394-be-quiet.patch
+ieee1394-ohci1394c-function-calls-without.patch
+ieee1394-sbp2-make-tsb42aa9-workaround-specific.patch
+ieee1394-semaphore-to-mutex-conversion.patch
+ieee1394-raw1394-fix-whitespace-after-x86_64.patch
+ieee1394-ieee1394-ohci1394-cycletoolong.patch
+ieee1394-ieee1394-support-for-slow-links-or-slow.patch
+ieee1394-ieee1394-save-ram-by-using-a-single.patch
+ieee1394-sbp2-remove-manipulation-of-inquiry.patch
+ieee1394-sbp2-log-number-of-supported-concurrent.patch
+ieee1394-ieee1394-extend-lowlevel-api-for.patch
+ieee1394-ohci1394-set-address-range-properties.patch
+ieee1394-ohci1394-make-phys_dma-parameter.patch
+ieee1394-sbp2-sbp2-remove-ohci1394-specific.patch
+ieee1394-sbp2-fix-s800-transfers-if-phys_dma-is.patch
+ieee1394-update-feature-removal-of-obsolete.patch
+ieee1394-sbp2-provide-helptext-for.patch
+ieee1394-sbp2-kconfig-fix.patch
+ieee1394-sbp2-use-__attribute__packed-for.patch
+ieee1394-sbp2-fix-deregistration-of-status-fifo-address-space.patch
+ieee1394-add-preprocessor-constant-for-invalid-csr.patch
+eth1394-endian-fixes.patch

 ieee1394 updates

-ieee1394_core-switch-to-kthread-api-fix.patch

 Folded into ieee1394_core-switch-to-kthread-api.patch

-input-fix-oops-on-mk712-load.patch

 Dropped

-via-pmu-add-input-device-tidy.patch

 Folded into via-pmu-add-input-device.patch

-input-powermac-cleanup-of-mac_hid-and-support-for-ctrlclick-and-commandclick-update.patch

 Folded into input-powermac-cleanup-of-mac_hid-and-support-for-ctrlclick-and-commandclick.patch

-input-logitech-trackman-trackball-support.patch

 Dropped (I think)

-input-new-force-feedback-interface-fix.patch

 Folded into input-new-force-feedback-interface.patch

+kconfig-integrate-split-config-into-silentoldconfig-fix.patch

 Folded into kconfig-integrate-split-config-into-silentoldconfig.patch

+kbuild-obj-dirs-is-calculated-incorrectly-if-hostprogs-y-is-defined.patch
+fix-make-rpm-for-powerpc.patch

 kbuild fixes

+revert-sata_sil24-sii3124-sata-driver-endian-problem.patch

 Revert earlier patch so that git-libata-all applies OK.

+libata-add-missing-data_xfer-for-pata_pdc2027x-and-pdc_adma-fix.patch

 Fix libata-add-missing-data_xfer-for-pata_pdc2027x-and-pdc_adma.patch

+prevent-au1xmmcc-breakage-on-non-au1200-alchemy.patch

 mmc driver fix

+myri10ge-alpha-build-fix.patch

 Fix git-netdev-all.patch

+forcedeth-config-ring-sizes.patch
+forcedeth-config-flow-control.patch
+forcedeth-config-phy.patch
+forcedeth-config-wol.patch
+forcedeth-config-csum.patch
+forcedeth-config-statistics.patch
+forcedeth-config-diagnostics.patch
+forcedeth-config-module-parameters.patch
+forcedeth-config-version.patch
+forcedeth-new-device-ids.patch
+forcedeth-typecast-cleanup.patch

 forcedeth updates

+lock-validator-netlinkc-netlink_table_grab-fix.patch

 netlink locking fix

+recent-match-fix-sleeping-function-called-from-invalid-context.patch
+recent-match-missing-refcnt-initialization.patch

 netfilter fixes

-fix-for-serial-uart-lockup.patch

 Dropped.

+gregkh-pci-pci-add-pci_cap_id_vndr.patch
+gregkh-pci-pci-fix-pciehp-compile-issue-when-config_acpi-is-not-enabled.patch
+gregkh-pci-pci-64-bit-resource-fixup-pci-resource-dbg-code-to-handle-size-change.patch
+gregkh-pci-pci-64-bit-resource-fix-amba-build-warning.patch
+gregkh-pci-pci-64-bit-resources-fix-pnp-sysfs-interface.patch
+gregkh-pci-pci-64-bit-resources-arch-powerpc-changes-update.patch
+gregkh-pci-kconfigurable-resources-core-changes.patch
+gregkh-pci-kconfigurable-resources-driver-pci-changes.patch
+gregkh-pci-kconfigurable-resources-driver-others-changes.patch
+gregkh-pci-kconfigurable-resources-arch-dependent-changes.patch
+gregkh-pci-kconfigurable-resources-arch-dependent-changes-arch.patch
+gregkh-pci-kconfigurable-resources-arch-dependent-changes-arch-q-z.patch
+gregkh-pci-i386-export-memory-more-than-4g-through-proc-iomem.patch
+gregkh-pci-pci-error-handling-on-pci-device-resume.patch
+gregkh-pci-pci-ignore-pre-set-64-bit-bars-on-32-bit-platforms.patch
+gregkh-pci-pciehp-dont-call-pci_enable_dev.patch
+gregkh-pci-pci-improve-pci-config-space-writeback.patch
+gregkh-pci-pci-reverse-pci-config-space-restore-order.patch
+gregkh-pci-pci-add-pci_assign_resource_fixed-allow-fixed-address-assignments.patch
+gregkh-pci-pci-add-a-enable-sysfs-attribute-to-the-pci-devices-to-allow-userspace-to-enable-devices-without-doing-foul-direct-access.patch
+gregkh-pci-pci-don-t-enable-device-if-already-enabled.patch
+gregkh-pci-pci-acpi-rename-the-functions-to-avoid-multiple-instances.patch
+gregkh-pci-acpi_pcihp-fix-programming-_hpp-values.patch
+gregkh-pci-acpi_pcihp-remove-improper-error-message-about-oshp.patch
+gregkh-pci-acpi_pcihp-add-support-for-_hpx.patch
+gregkh-pci-pciehp-fix-programming-hotplug-parameters.patch
+gregkh-pci-shpc-cleanup-shpc-register-access.patch
+gregkh-pci-shpc-cleanup-shpc-logical-slot-register-access.patch
+gregkh-pci-shpc-cleanup-shpc-logical-slot-register-bits-access.patch
+gregkh-pci-shpc-fix-shpc-logical-slot-register-bits-access.patch
+gregkh-pci-shpc-fix-shpc-contoller-serr-int-register-bits-access.patch
+gregkh-pci-shpchp-mask-global-serr-and-intr-at-controller-release-time.patch
+gregkh-pci-shpchp-create-shpchpd-at-controller-probe-time.patch
+gregkh-pci-pci-i386-x86_84-disable-pci-resource-decode-on-device-disable.patch
+gregkh-pci-sgi-hotplug-incorrect-power-status.patch
+gregkh-pci-pci-bus-parity-status-broken-hardware-attribute-edac-foundation.patch
+gregkh-pci-pci-hotplug-fix-recovery-path-from-errors-during-pcie_init.patch
+gregkh-pci-pciehp-replace-pci_find_slot-with-pci_get_slot.patch
+gregkh-pci-pciehp-add-missing-pci_dev_put.patch
+gregkh-pci-pciehp-implement-get_address-callback.patch
+gregkh-pci-shpchp-remove-unnecessary-hpc_ctlr_handle-check.patch
+gregkh-pci-shpchp-cleanup-interrupt-handler.patch
+gregkh-pci-shpchp-cleanup-shpc-commands.patch
+gregkh-pci-shpchp-cleanup-interrupt-polling-timer.patch
+gregkh-pci-shpchp-remove-unused-hpc_evelnt_lock.patch
+gregkh-pci-shpchp-cleanup-improper-info-messages.patch
+gregkh-pci-pci-move-various-pci-ids-to-header-file.patch
+gregkh-pci-pci-amd-8131-msi-quirk-called-too-late-bus_flags-not-inherited.patch
+gregkh-pci-pci-allow-msi-to-work-on-kexec-kernel.patch
+gregkh-pci-pci-disable-msi-mode-in-pci_disable_device.patch
+gregkh-pci-pci-hotplug-fake-null-pointer-dereferences-in-ibm-hot-plug-controller-driver.patch
+gregkh-pci-pci-cleanup-unused-variable-about-msi-driver.patch
+gregkh-pci-pci-don-t-move-ioapics-below-pci-bridge.patch
+gregkh-pci-pci-remove-unneeded-msi-code.patch
+gregkh-pci-pci-clean-up-pci-documentation-to-be-more-specific.patch
+gregkh-pci-pci-fix-race-with-pci_walk_bus-and-pci_destroy_dev.patch
+gregkh-pci-pci-test-that-drivers-properly-call-pci_set_master.patch

 PCI tree updates

+revert-gregkh-pci-pci-test-that-drivers-properly-call-pci_set_master.patch
+gregkh-pci-kconfigurable-resources-arch-dependent-changes-arm-fix.patch
+gregkh-pci-pci-64-bit-resources-core-changes-mips-fix.patch

 Unbreak it.

-bogus-disk-geometry-on-large-disks-warning-fix.patch

 Folded into bogus-disk-geometry-on-large-disks.patch

-areca-raid-linux-scsi-driver-update6-for-2617-rc1-mm3.patch
-areca-raid-linux-scsi-driver-update6-for-2617-rc1-mm3-externs-go-in-headers.patch

 Folded into areca-raid-linux-scsi-driver.patch

+git-scsi-target-fixup.patch

 Fix reject in git-scsi-target.patch.

+gregkh-usb-usb-whiteheat-fix-firmware-spurious-errors.patch
+gregkh-usb-usb-add-sierra-wireless-mc5720-id-to-airprime.c.patch
+gregkh-usb-usb-negative-index-in-drivers-usb-host-isp116x-hcd.c.patch
+gregkh-usb-usb-cdc_ether-recognize-olympus-r1000.patch
+gregkh-usb-usbcore-port-reset-for-composite-devices.patch
+gregkh-usb-usb-hub-use-usb_reset_composite_device.patch
+gregkh-usb-usb-storage-use-usb_reset_composite_device.patch
+gregkh-usb-usbhid-use-usb_reset_composite_device.patch
+gregkh-usb-usbcore-recovery-from-set-configuration-failure.patch
+gregkh-usb-usb-drivers-usb-core-devio.c-dereferences-a-userspace-pointer.patch
+gregkh-usb-usb-new-devices-for-the-option-driver.patch

 USB tree updates

+x86_64-mm-acpi-blacklist-xw9300.patch
+x86_64-mm-apic-support-for-extended-apic-interrupt.patch
+x86_64-mm-mce_amd-relocate-sysfs-files.patch
+x86_64-mm-mce_amd-support-for-family-0x10-processors.patch
+x86_64-mm-mce_amd-cleanup.patch
+x86_64-mm-miscellaneous-mm-initc-fixes.patch

 x86_64 tree updates

+fall-back-to-old-style-call-trace-if-no-unwinding.patch
+allow-unwinder-to-build-without-module-support.patch

 Fix it.

+mm-slabc-fix-early-init-assumption.patch

 slab fix

+tiacx-ia64-fix.patch

 wireless driver fix

+selinux-add-hooks-for-key-subsystem.patch

 Wire the key management subsystem into selinux.

+powerpc-vdso-updates.patch

 powerpc update

+remove-empty-node-at-boot-time.patch

 NUMA fixlet.

+jbd-fix-bug-in-journal_commit_transaction-fix.patch

 Fix jbd-fix-bug-in-journal_commit_transaction.patch

+ufs-easy-debug.patch
+ufs-little-directory-lookup-optimization.patch
+ufs-i_blocks-wrong-count.patch
+ufs-unlock_super-without-lock.patch
+ufs-zero-metadata.patch
+ufs-printk-warning-fixes.patch

 More UFS fixes

-inotify-kernel-api.patch
-inotify-kernel-api-fix.patch
+inotify-split-kernel-api-from-userspace-support.patch
+inotify-add-names-inode-to-event-handler.patch
+inotify-add-interfaces-to-kernel-api.patch
+inotify-allow-watch-removal-from-event-handler.patch
+inotify-update-kernel-documentation.patch

 Updated inotify patch series

+lock-validator-introduce-warn_on_oncecond-speedup.patch

 Fix lock-validator-introduce-warn_on_oncecond.patch

+add-max6902-rtc-support-update.patch

 Fix add-max6902-rtc-support.patch

+nbd-endian-annotations.patch
+epoll-use-unlocked-wqueue-operations.patch

 misc updates

+per-task-delay-accounting-taskstats-interface-fix-2.patch

 Fix per-task-delay-accounting-taskstats-interface-fix-1.patch

+sched-fix-smt-nice-lock-contention-and-optimization.patch
+sched-fix-smt-nice-lock-contention-and-optimization-tidy.patch

 CPU scheduler scalability improvements.

+namespaces-utsname-sysctl-hack-cleanup-2-fix.patch

 Fix namespaces-utsname-sysctl-hack-cleanup-2.patch

+reiser4-hardirq-include-fix.patch

 Fix reiser4.patch

+skeletonfb-remove-duplicate-module-init-exit-license-lines.patch
+neofb-fix-unblank-logic-interfering-with-lid-toggled-backlight.patch

 fbdev updates

+statistics-infrastructure-prerequisite-timestamp-fix.patch

 Fix statistics-infrastructure-prerequisite-timestamp.patch

+genirq-msi-fixes-2.patch

 Fix genirq-core.patch

+genirq-add-irq-chip-support-fix.patch

 Fix genirq-add-irq-chip-support.patch

+genirq-add-chip-eoi-fastack-fasteoi-fix.patch

 Fix genirq-add-chip-eoi-fastack-fasteoi.patch

+lock-validator-floppyc-irq-release-fix-fix.patch

 Fix lock-validator-floppyc-irq-release-fix.patch

+lock-validator-locking-api-self-tests-self-test-fix.patch

 Fix lock-validator-locking-api-self-tests.patch

+lock-validator-beautify-x86_64-stacktraces-fix-2.patch
+lock-validator-beautify-x86_64-stacktraces-fix-3.patch
+lock-validator-beautify-x86_64-stacktraces-fix-4.patch

 Fix lock-validator-beautify-x86_64-stacktraces.patch some more.

+lock-validator-x86_64-irqflags-trace-entrys-fix.patch

 Fix lock-validator-irqtrace-cleanup-include-asm-x86_64-irqflagsh.patch

+lock-validator-core-early_boot_irqs_-build-fix.patch
+lock-validator-core-fix-compiler-warning.patch

 Fix lock-validator-core.patch

+lock-validator-special-locking-serio.patch
+lockdep-add-i_mutex-ordering-annotations-to-the-sunrpc.patch
+lockdep-add-parent-child-annotations-to-usbfs.patch

 lockdep workarounds

+i386-remove-multi-entry-backtraces.patch

 More work on x86 backtraces.


All 1492 patches:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/patch-list


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04  6:20 2.6.17-rc5-mm3 Andrew Morton
@ 2006-06-04  9:38 ` Barry K. Nathan
  2006-06-04  9:49   ` 2.6.17-rc5-mm3 Andrew Morton
  2006-06-04 18:20 ` 2.6.17-rc5-mm3 Rafael J. Wysocki
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Barry K. Nathan @ 2006-06-04  9:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

When I build ACPI processor support as a module, I get this:

  MODPOST
WARNING: drivers/acpi/processor.o - Section mismatch: reference to
.init.data: from .text between 'acpi_processor_power_init' (at offset
0xfb0) and 'acpi_safe_halt'

(This is also true of -mm2, but I didn't get a chance to report it
before -mm3 was released. Before then, I built it into the kernel and
not as a module.)

and I still get this:
WARNING: "scsi_tgt_queue_command" [drivers/scsi/libsrp.ko] undefined!
-- 
-Barry K. Nathan <barryn@pobox.com>


* Re: 2.6.17-rc5-mm3
  2006-06-04  9:38 ` 2.6.17-rc5-mm3 Barry K. Nathan
@ 2006-06-04  9:49   ` Andrew Morton
  2006-06-04 10:08     ` 2.6.17-rc5-mm3 Michal Piotrowski
  0 siblings, 1 reply; 52+ messages in thread
From: Andrew Morton @ 2006-06-04  9:49 UTC (permalink / raw)
  To: Barry K. Nathan; +Cc: linux-kernel

On Sun, 4 Jun 2006 02:38:03 -0700
"Barry K. Nathan" <barryn@pobox.com> wrote:

> When I build ACPI processor support as a module, I get this:
> 
>   MODPOST
> WARNING: drivers/acpi/processor.o - Section mismatch: reference to
> .init.data: from .text between 'acpi_processor_power_init' (at offset
> 0xfb0) and 'acpi_safe_halt'

yup.  The code in there is actually correct (assuming
acpi_processor_power_init()'s first invocation is at initcall-time).

Maybe we'll do something to kill the warning, once we're down to the last
few thousand of them ;)


> (This is also true of -mm2, but I didn't get a chance to report it
> before -mm3 was released. Before then, I built it into the kernel and
> not as a module.)
> 
> and I still get this:
> WARNING: "scsi_tgt_queue_command" [drivers/scsi/libsrp.ko] undefined!

git-scsi-target Kconfig snafu.  I passed it over to James the other day. 
He might have fixed it - I get my git-scsi-misc via git-infiniband (don't
ask) and it's a bit laggy.




* Re: 2.6.17-rc5-mm3
  2006-06-04  9:49   ` 2.6.17-rc5-mm3 Andrew Morton
@ 2006-06-04 10:08     ` Michal Piotrowski
  2006-06-04 10:41       ` 2.6.17-rc5-mm3 Ingo Molnar
  0 siblings, 1 reply; 52+ messages in thread
From: Michal Piotrowski @ 2006-06-04 10:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, Arjan van de Ven, linux-kernel

Hi Andrew,

On 04/06/06, Andrew Morton <akpm@osdl.org> wrote:
> On Sun, 4 Jun 2006 02:38:03 -0700
> "Barry K. Nathan" <barryn@pobox.com> wrote:
>
> > When I build ACPI processor support as a module, I get this:
> >
> >   MODPOST
> > WARNING: drivers/acpi/processor.o - Section mismatch: reference to
> > .init.data: from .text between 'acpi_processor_power_init' (at offset
> > 0xfb0) and 'acpi_safe_halt'
>
> yup.  The code in there is actually correct (assuming
> acpi_processor_power_init()'s first invocation is at initcall-time).
>
> Maybe we'll do something to kill the warning, once we're down to the last
> few thousand of them ;)

I have got something similar
WARNING: drivers/usb/storage/usb-storage.o - Section mismatch:
reference to .exit.text: from .smp_locks after '' (at offset 0x3c)
WARNING: net/ipv4/netfilter/ip_conntrack.o - Section mismatch:
reference to .init.text: from .smp_locks after '' (at offset 0x8)
WARNING: net/ipv6/ipv6.o - Section mismatch: reference to .init.text:
from .smp_locks after '' (at offset 0x14c)
WARNING: net/ipv6/ipv6.o - Section mismatch: reference to .init.text:
from .smp_locks after '' (at offset 0x17c)
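
The warnings quoted so far share one pattern: a section that stays resident
(.text, .smp_locks) refers to a section that may be discarded (.init.*,
.exit.*). The sketch below is a toy Python model of that rule, not modpost's
actual algorithm; the function name and the prefix list are assumptions made
for illustration.

```python
# Sections whose contents may be thrown away (after init, or never kept).
DISCARDABLE_PREFIXES = (".init.", ".exit.")

def is_mismatch(from_section, to_section):
    """Flag a reference from a resident section into a discardable one."""
    source_discardable = from_section.startswith(DISCARDABLE_PREFIXES)
    target_discardable = to_section.startswith(DISCARDABLE_PREFIXES)
    return target_discardable and not source_discardable

references = [
    (".text", ".init.data"),       # drivers/acpi/processor.o
    (".smp_locks", ".exit.text"),  # drivers/usb/storage/usb-storage.o
    (".smp_locks", ".init.text"),  # ip_conntrack.o and ipv6.o
    (".init.text", ".init.data"),  # init code using init data: not flagged
    (".text", ".data"),            # ordinary reference: not flagged
]
for src, dst in references:
    verdict = "mismatch" if is_mismatch(src, dst) else "ok"
    print(f"{src} -> {dst}: {verdict}")
```

A check of this kind only sees the reference, not when it is used, which is
how code that is correct at runtime can still draw the warning.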

BTW. I still get this bug
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm2/bug_1.jpg
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm2/mm-config

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)


* Re: 2.6.17-rc5-mm3
  2006-06-04 10:08     ` 2.6.17-rc5-mm3 Michal Piotrowski
@ 2006-06-04 10:41       ` Ingo Molnar
  2006-06-04 20:38         ` 2.6.17-rc5-mm3 Valdis.Kletnieks
       [not found]         ` <6bffcb0e0606040407u4f56f7fdyf5ec479314afc082@mail.gmail.com>
  0 siblings, 2 replies; 52+ messages in thread
From: Ingo Molnar @ 2006-06-04 10:41 UTC (permalink / raw)
  To: Michal Piotrowski; +Cc: Andrew Morton, Arjan van de Ven, linux-kernel


* Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:

> BTW. I still get this bug
> http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm2/bug_1.jpg
> http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm2/mm-config

could you please apply the following patches on top of -mm3:

  http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-rc5-mm3.patch
  http://redhat.com/~mingo/lockdep-patches/lockdep-tracer-2.6.17-rc5-mm3.patch

accept all the default 'make oldconfig' options and reboot into the 
patched kernel. If everything goes well then the system should still 
boot up fine and you should still get the lockdep warning - but this 
time there should be a long trace in /proc/latency_trace. Please upload 
that trace - it gives us the kernel's function trace, leading up to the 
warning.

	Ingo


* Re: 2.6.17-rc5-mm3
  2006-06-04  6:20 2.6.17-rc5-mm3 Andrew Morton
  2006-06-04  9:38 ` 2.6.17-rc5-mm3 Barry K. Nathan
@ 2006-06-04 18:20 ` Rafael J. Wysocki
  2006-06-04 23:15 ` 2.6.17-rc5-mm3 J.A. Magallón
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Rafael J. Wysocki @ 2006-06-04 18:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Sunday 04 June 2006 08:20, Andrew Morton wrote:
> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/

Small compilation fix needed for x86_64 without SMP:

 arch/x86_64/kernel/mce_amd.c |    4 ++++
 1 files changed, 4 insertions(+)

Index: linux-2.6.17-rc5-mm3/arch/x86_64/kernel/mce_amd.c
===================================================================
--- linux-2.6.17-rc5-mm3.orig/arch/x86_64/kernel/mce_amd.c
+++ linux-2.6.17-rc5-mm3/arch/x86_64/kernel/mce_amd.c
@@ -494,7 +494,11 @@ static __cpuinit int threshold_create_ba
 
 	kobject_set_name(&b->kobj, "threshold_bank%i", bank);
 	b->kobj.parent = &per_cpu(device_mce, cpu).kobj;
+#ifdef CONFIG_SMP
 	b->cpus = cpu_core_map[cpu];
+#else
+	b->cpus = CPU_MASK_CPU0;
+#endif
 
 	err = kobject_register(&b->kobj);
 	if (err)


* Re: 2.6.17-rc5-mm3
  2006-06-04 10:41       ` 2.6.17-rc5-mm3 Ingo Molnar
@ 2006-06-04 20:38         ` Valdis.Kletnieks
       [not found]         ` <6bffcb0e0606040407u4f56f7fdyf5ec479314afc082@mail.gmail.com>
  1 sibling, 0 replies; 52+ messages in thread
From: Valdis.Kletnieks @ 2006-06-04 20:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Michal Piotrowski, Andrew Morton, Arjan van de Ven, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1279 bytes --]

On Sun, 04 Jun 2006 12:41:21 +0200, Ingo Molnar said:

> could you please apply the following patches on top of -mm3:
> 
>   http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-rc5-mm3.patch
>   http://redhat.com/~mingo/lockdep-patches/lockdep-tracer-2.6.17-rc5-mm3.patch

Just for grins, I tried building this, and got this error:

  CC      kernel/irq/handle.o
kernel/irq/handle.c:246:35: error: macro "early_init_irq_lock_type" passed 1 arguments, but takes just 0
kernel/irq/handle.c:247: error: expected '=', ',', ';', 'asm' or '__attribute__' before '{' token
make[2]: *** [kernel/irq/handle.o] Error 1

It won't build if you don't have CONFIG_TRACE_IRQFLAGS defined - and that
is defined like this:

config TRACE_IRQFLAGS
        bool
        default y
        depends on TRACE_IRQFLAGS_SUPPORT
        depends on PROVE_SPIN_LOCKING || PROVE_RW_LOCKING

but my config has:
% grep PROVE .config
# CONFIG_PROVE_SPIN_LOCKING is not set
# CONFIG_PROVE_RW_LOCKING is not set
# CONFIG_PROVE_MUTEX_LOCKING is not set
# CONFIG_PROVE_RWSEM_LOCKING is not set

So using the defaults for the PROVE_* won't compile clean.  Yes, probably
a stupid setting for anybody applying the patches, but.. ;)

(I'm off to go build kernels without the patch, and with the PROVE_* set)..
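
The dependency chain described above can be evaluated by hand. A minimal
Python sketch of the quoted TRACE_IRQFLAGS entry, assuming that
TRACE_IRQFLAGS_SUPPORT is set on the architecture in question (the report
does not show it); the helper name and the dict representation are
illustrative only:

```python
def trace_irqflags(config):
    """Evaluate the quoted TRACE_IRQFLAGS entry for a dict of bool symbols.

    A promptless bool with 'default y' is enabled only when every
    'depends on' line holds; otherwise it stays unset.
    """
    def enabled(symbol):
        return config.get(symbol, False)

    return enabled("TRACE_IRQFLAGS_SUPPORT") and (
        enabled("PROVE_SPIN_LOCKING") or enabled("PROVE_RW_LOCKING")
    )

# The reported .config: every PROVE_* option is "not set".
reported = {
    "TRACE_IRQFLAGS_SUPPORT": True,   # assumption, not shown in the report
    "PROVE_SPIN_LOCKING": False,
    "PROVE_RW_LOCKING": False,
}
print(trace_irqflags(reported))       # TRACE_IRQFLAGS stays off

# Enabling either PROVE_SPIN_LOCKING or PROVE_RW_LOCKING turns it on.
print(trace_irqflags({**reported, "PROVE_SPIN_LOCKING": True}))
```

With both PROVE_* options unset the symbol stays off, which matches the
configuration in which the build error shows up.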



* Re: 2.6.17-rc5-mm3
       [not found]         ` <6bffcb0e0606040407u4f56f7fdyf5ec479314afc082@mail.gmail.com>
@ 2006-06-04 21:38           ` Ingo Molnar
  2006-06-04 22:35             ` 2.6.17-rc5-mm3 Michal Piotrowski
  0 siblings, 1 reply; 52+ messages in thread
From: Ingo Molnar @ 2006-06-04 21:38 UTC (permalink / raw)
  To: Michal Piotrowski; +Cc: Andrew Morton, Arjan van de Ven, linux-kernel


* Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:

> Unfortunately I can't compile this
> Here is output from my build log

> /usr/src/linux-mm/kernel/sched.c:3040: error: 'p' redeclared as

i've uploaded a fixed version - does that work for you?

	Ingo


* Re: 2.6.17-rc5-mm3
  2006-06-04 21:38           ` 2.6.17-rc5-mm3 Ingo Molnar
@ 2006-06-04 22:35             ` Michal Piotrowski
  0 siblings, 0 replies; 52+ messages in thread
From: Michal Piotrowski @ 2006-06-04 22:35 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andrew Morton, Arjan van de Ven, linux-kernel

On 04/06/06, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>
> > Unfortunately I can't compile this
> > Here is output from my build log
>
> > /usr/src/linux-mm/kernel/sched.c:3040: error: 'p' redeclared as
>
> i've uploaded a fixed version - does that work for you?

Yes, thanks.

Here is dmesg http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm3/mm-dmesg
Here is latency trace
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm3/mm-latency.bz2
Here is config http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm3/mm-config

Here is something new
Jun  4 23:59:44 ltg01-fedora kernel: hdd: set_drive_speed_status:
status=0x51 { DriveReady SeekComplete Error }
Jun  4 23:59:44 ltg01-fedora kernel: hdd: set_drive_speed_status:
error=0xb4 { AbortedCommand LastFailedSense=0x0b }
Jun  4 23:59:44 ltg01-fedora kernel: (          hdparm-1821 |#0): new
164424143 us user-latency.
Jun  4 23:59:44 ltg01-fedora kernel: stopped custom tracer.
Jun  4 23:59:44 ltg01-fedora kernel:
Jun  4 23:59:44 ltg01-fedora kernel: ============================
Jun  4 23:59:44 ltg01-fedora kernel: [ BUG: illegal lock usage! ]
Jun  4 23:59:44 ltg01-fedora kernel: ----------------------------
Jun  4 23:59:44 ltg01-fedora kernel: illegal {in-hardirq-W} ->
{hardirq-on-W} usage.
Jun  4 23:59:44 ltg01-fedora kernel: hdparm/1821 [HC0[0]:SC0[0]:HE1:SE1] takes:
Jun  4 23:59:44 ltg01-fedora kernel:  (ide_lock){++..}, at:
[<c0268388>] ide_dump_opcode+0x13/0x9b
Jun  4 23:59:44 ltg01-fedora kernel: {in-hardirq-W} state was registered at:
Jun  4 23:59:44 ltg01-fedora kernel:   [<c013b536>] lockdep_acquire+0x67/0x7f
Jun  4 23:59:44 ltg01-fedora kernel:   [<c0305755>] _spin_lock_irqsave+0x2d/0x3c
Jun  4 23:59:44 ltg01-fedora kernel:   [<c0265fff>] ide_intr+0x18/0x1ab
Jun  4 23:59:44 ltg01-fedora kernel:   [<c015062c>] handle_IRQ_event+0x1d/0x52
Jun  4 23:59:44 ltg01-fedora kernel:   [<c015169c>] handle_edge_irq+0x113/0x15a
Jun  4 23:59:44 ltg01-fedora kernel:   [<c0105857>] do_IRQ+0xa2/0xc7
Jun  4 23:59:44 ltg01-fedora kernel: irq event stamp: 2011
Jun  4 23:59:44 ltg01-fedora kernel: hardirqs last  enabled at (2011):
[<c0305b29>] _spin_unlock_irq+0x24/0x58
Jun  4 23:59:44 ltg01-fedora kernel: hardirqs last disabled at (2010):
[<c03056c9>] _spin_lock_irq+0x11/0x38
Jun  4 23:59:44 ltg01-fedora kernel: softirqs last  enabled at (2008):
[<c012630c>] __do_softirq+0xf0/0xf8
Jun  4 23:59:44 ltg01-fedora kernel: softirqs last disabled at (2001):
[<c0105741>] do_softirq+0x5e/0xd2
Jun  4 23:59:44 ltg01-fedora kernel:
Jun  4 23:59:44 ltg01-fedora kernel: other info that might help us debug this:
Jun  4 23:59:44 ltg01-fedora kernel: no locks held by hdparm/1821.
Jun  4 23:59:44 ltg01-fedora kernel:
Jun  4 23:59:44 ltg01-fedora kernel: stack backtrace:
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0104513>] show_trace+0x1b/0x20
Jun  4 23:59:44 ltg01-fedora kernel:  [<c01045f1>] dump_stack+0x1f/0x24
Jun  4 23:59:44 ltg01-fedora kernel:  [<c013976c>] print_usage_bug+0x1a5/0x1b1
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0139e90>] mark_lock+0x2ca/0x4f7
Jun  4 23:59:44 ltg01-fedora kernel:  [<c013aa96>] __lockdep_acquire+0x47e/0xaa4
Jun  4 23:59:44 ltg01-fedora kernel:  [<c013b536>] lockdep_acquire+0x67/0x7f
Jun  4 23:59:44 ltg01-fedora kernel:  [<c030552d>] _spin_lock+0x24/0x32
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0268388>] ide_dump_opcode+0x13/0x9b
Jun  4 23:59:44 ltg01-fedora kernel:  [<c02688b6>] ide_dump_status+0x4a6/0x4cc
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0267ae6>]
ide_config_drive_speed+0x32a/0x33a
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0262dc5>] piix_tune_chipset+0x2ed/0x2f8
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0262e31>]
piix_config_drive_xfer_rate+0x61/0xb5
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0263a82>] set_using_dma+0x2f/0x60
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0263bee>] ide_write_setting+0x4a/0xc3
Jun  4 23:59:44 ltg01-fedora kernel:  [<c02647ca>] generic_ide_ioctl+0x8a/0x47f
Jun  4 23:59:44 ltg01-fedora kernel:  [<f886003a>]
idecd_ioctl+0xfd/0x133 [ide_cd]
Jun  4 23:59:44 ltg01-fedora kernel:  [<c01f1fff>] blkdev_driver_ioctl+0x4b/0x5f
Jun  4 23:59:44 ltg01-fedora kernel:  [<c01f2783>] blkdev_ioctl+0x770/0x7bd
Jun  4 23:59:44 ltg01-fedora kernel:  [<c017dc0d>] block_ioctl+0x1f/0x21
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0189353>] do_ioctl+0x27/0x6e
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0189604>] vfs_ioctl+0x26a/0x280
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0189667>] sys_ioctl+0x4d/0x7e
Jun  4 23:59:44 ltg01-fedora kernel:  [<c0305ed2>] sysenter_past_esp+0x63/0xa1
Jun  4 23:59:44 ltg01-fedora kernel: ---------------------------
Jun  4 23:59:44 ltg01-fedora kernel: | preempt count: 00000001 ]
Jun  4 23:59:44 ltg01-fedora kernel: | 1-level deep critical section nesting:
Jun  4 23:59:44 ltg01-fedora kernel: ----------------------------------------
Jun  4 23:59:44 ltg01-fedora kernel: .. [<c030551b>] .... _spin_lock+0x12/0x32
Jun  4 23:59:44 ltg01-fedora kernel: .....[<c0268388>] ..   ( <= ide_dump_opcode+0x13/0x9b)
Jun  4 23:59:44 ltg01-fedora kernel:
Jun  4 23:59:44 ltg01-fedora kernel: ide: failed opcode was: unknown

I get this when I do "hdparm -c 1 -d 1 /dev/hd{c,d}"

>
>         Ingo
>

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04  6:20 2.6.17-rc5-mm3 Andrew Morton
  2006-06-04  9:38 ` 2.6.17-rc5-mm3 Barry K. Nathan
  2006-06-04 18:20 ` 2.6.17-rc5-mm3 Rafael J. Wysocki
@ 2006-06-04 23:15 ` J.A. Magallón
  2006-06-04 23:42   ` 2.6.17-rc5-mm3 Andrew Morton
                     ` (2 more replies)
  2006-06-04 23:28 ` 2.6.17-rc5-mm3 J.A. Magallón
                   ` (4 subsequent siblings)
  7 siblings, 3 replies; 52+ messages in thread
From: J.A. Magallón @ 2006-06-04 23:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Sat, 3 Jun 2006 23:20:04 -0700, Andrew Morton <akpm@osdl.org> wrote:

> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
> 
> - Lots of PCI and USB updates
> 
> - The various lock validator, stack backtracing and IRQ management problems
>   are converging, but we're not quite there yet.
> 

Got this on boot. Looks like another locking bug in firewire:

ACPI: PCI Interrupt 0000:03:03.0[A] -> GSI 20 (level, low) -> IRQ 20
ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[20]  MMIO=[ec024000-ec0247ff]  Max Packet=[2048]  IR/IT contexts=[4/8]
stopped custom tracer.

============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {hardirq-on-W} -> {in-hardirq-R} usage.
idle/0 [HC1[1]:SC1[0]:HE0:SE0] takes:
 (hl_irqs_lock){--+.}, at: [<f8835cb9>] highlevel_host_reset+0x11/0x5b [ieee1394]
{hardirq-on-W} state was registered at:
  [<c0133fe4>] lockdep_acquire+0x4d/0x63
  [<c02f3421>] _write_lock+0x2e/0x3b
  [<f88365ab>] hpsb_register_highlevel+0xac/0xea [ieee1394]
  [<f8836d6a>] init_csr+0x28/0x3f [ieee1394]
  [<f880617d>] 0xf880617d
  [<c01398df>] sys_init_module+0x12a/0x1b7b
  [<c02f3b2d>] sysenter_past_esp+0x56/0x8d
irq event stamp: 258193
hardirqs last  enabled at (258192): [<c011fab5>] __do_softirq+0x67/0xf7
hardirqs last disabled at (258193): [<c0102eb7>] common_interrupt+0x1b/0x2c
softirqs last  enabled at (258186): [<c011fb34>] __do_softirq+0xe6/0xf7
softirqs last disabled at (258191): [<c0104cec>] do_softirq+0x5a/0xc9

other info that might help us debug this:
no locks held by idle/0.

stack backtrace:
 [<c01034ba>] show_trace+0x12/0x14
 [<c0103b8d>] dump_stack+0x19/0x1b
 [<c0132025>] print_usage_bug+0x20b/0x215
 [<c01329cc>] mark_lock+0x4fa/0x5b4
 [<c0133399>] __lockdep_acquire+0x310/0xbc0
 [<c0133fe4>] lockdep_acquire+0x4d/0x63
 [<c02f3153>] _read_lock+0x2e/0x3b
 [<f8835cb9>] highlevel_host_reset+0x11/0x5b [ieee1394]
 [<f8833867>] hpsb_selfid_complete+0x286/0x307 [ieee1394]
 [<f884ec30>] ohci_irq_handler+0x6c9/0x995 [ohci1394]
 [<c013d3a2>] handle_IRQ_event+0x2e/0x63
 [<c013e4c3>] handle_fasteoi_irq+0x6b/0xac
 [<c0104dc7>] do_IRQ+0x6c/0xa5
 =======================
 [<c0102ec1>] common_interrupt+0x25/0x2c
 [<c0104cec>] do_softirq+0x5a/0xc9
 =======================
 [<c011fb90>] irq_exit+0x4b/0x4d
 [<c0104dce>] do_IRQ+0x73/0xa5
 [<c0102ec1>] common_interrupt+0x25/0x2c
 [<c010164e>] cpu_idle+0x63/0x80
 [<c0100599>] rest_init+0x33/0x3a
 [<c03d97af>] start_kernel+0x339/0x3aa
 [<c0100210>] 0xc0100210
ieee1394: Host added: ID:BUS[0-00:1023]  GUID[00e018000063814f]

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2007.0 (Cooker) for i586
Linux 2.6.16-jam18 (gcc 4.1.1 20060518 (prerelease)) #2 SMP PREEMPT Mon

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04  6:20 2.6.17-rc5-mm3 Andrew Morton
                   ` (2 preceding siblings ...)
  2006-06-04 23:15 ` 2.6.17-rc5-mm3 J.A. Magallón
@ 2006-06-04 23:28 ` J.A. Magallón
  2006-06-05  0:06   ` 2.6.17-rc5-mm3 Barry K. Nathan
                     ` (3 more replies)
  2006-06-05 17:56 ` 2.6.17-rc5-mm3 Mel Gorman
                   ` (3 subsequent siblings)
  7 siblings, 4 replies; 52+ messages in thread
From: J.A. Magallón @ 2006-06-04 23:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Sat, 3 Jun 2006 23:20:04 -0700, Andrew Morton <akpm@osdl.org> wrote:

> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
> 
> - Lots of PCI and USB updates
> 
> - The various lock validator, stack backtracing and IRQ management problems
>   are converging, but we're not quite there yet.
> 

I got this with -mm2; is it supposed to be cured in -mm3? I have yet to
try -mm3:

Jun  2 14:34:39 annwn kernel: Console: colour VGA+ 80x60
Jun  2 14:34:39 annwn kernel: ------------------------
Jun  2 14:34:39 annwn kernel: | Locking API testsuite:
Jun  2 14:34:39 annwn kernel: ----------------------------------------------------------------------------
Jun  2 14:34:39 annwn kernel:                                  | spin |wlock |rlock |mutex | wsem | rsem |
Jun  2 14:34:39 annwn kernel:   --------------------------------------------------------------------------
Jun  2 14:34:39 annwn kernel:                      A-A deadlock:failed|failed|failed|failed|failed|failed|
Jun  2 14:34:39 annwn kernel:                  A-B-B-A deadlock:failed|failed|  ok  |failed|failed|failed|
Jun  2 14:34:39 annwn kernel:              A-B-B-C-C-A deadlock:failed|failed|  ok  |failed|failed|failed|
Jun  2 14:34:39 annwn kernel:              A-B-C-A-B-C deadlock:failed|failed|  ok  |failed|failed|failed|
Jun  2 14:34:39 annwn kernel:          A-B-B-C-C-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
Jun  2 14:34:39 annwn kernel:          A-B-C-D-B-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
Jun  2 14:34:39 annwn kernel:          A-B-C-D-B-C-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
Jun  2 14:34:39 annwn kernel:                     double unlock:failed|failed|failed|failed|failed|failed|
Jun  2 14:34:39 annwn kernel:                  bad unlock order:failed|failed|failed|failed|failed|failed|
Jun  2 14:34:39 annwn kernel:   --------------------------------------------------------------------------
Jun  2 14:34:39 annwn kernel:               recursive read-lock:             |  ok  |             |failed|
Jun  2 14:34:39 annwn kernel:   --------------------------------------------------------------------------
Jun  2 14:34:39 annwn kernel:      hard-irqs-on + irq-safe-A/12:failed|failed|  ok  |
Jun  2 14:34:39 annwn kernel:      soft-irqs-on + irq-safe-A/12:failed|failed|  ok  |
Jun  2 14:34:39 annwn kernel:      hard-irqs-on + irq-safe-A/21:failed|failed|  ok  |
Jun  2 14:34:39 annwn kernel:      soft-irqs-on + irq-safe-A/21:failed|failed|  ok  |
Jun  2 14:34:39 annwn kernel:        sirq-safe-A => hirqs-on/12:failed|failed|  ok  |
Jun  2 14:34:39 annwn kernel:        sirq-safe-A => hirqs-on/21:failed|failed|  ok  |
Jun  2 14:34:39 annwn kernel:          hard-safe-A + irqs-on/12:failed|failed|  ok  |
Jun  2 14:34:39 annwn kernel:          soft-safe-A + irqs-on/12:failed|failed|  ok  |

(all tests failed like this...)

Jun  2 14:34:39 annwn kernel: --------------------------------------------------------
Jun  2 14:34:39 annwn kernel: 141 out of 206 testcases failed, as expected. |
Jun  2 14:34:39 annwn kernel: ----------------------------------------------------

Expected ? Uh ?

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2007.0 (Cooker) for i586
Linux 2.6.16-jam18 (gcc 4.1.1 20060518 (prerelease)) #2 SMP PREEMPT Mon

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04 23:15 ` 2.6.17-rc5-mm3 J.A. Magallón
@ 2006-06-04 23:42   ` Andrew Morton
  2006-06-05  6:02   ` 2.6.17-rc5-mm3 Valdis.Kletnieks
  2006-06-05  8:04   ` 2.6.17-rc5-mm3 Arjan van de Ven
  2 siblings, 0 replies; 52+ messages in thread
From: Andrew Morton @ 2006-06-04 23:42 UTC (permalink / raw)
  To: J.A. Magallón
  Cc: linux-kernel, Stefan Richter

On Mon, 5 Jun 2006 01:15:31 +0200
"J.A. Magallón" <jamagallon@ono.com> wrote:

> On Sat, 3 Jun 2006 23:20:04 -0700, Andrew Morton <akpm@osdl.org> wrote:
> 
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
> > 
> > - Lots of PCI and USB updates
> > 
> > - The various lock validator, stack backtracing and IRQ management problems
> >   are converging, but we're not quite there yet.
> > 
> 
> Got this on boot. Looks like another locking bug in firewire:
> 
> ACPI: PCI Interrupt 0000:03:03.0[A] -> GSI 20 (level, low) -> IRQ 20
> ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[20]  MMIO=[ec024000-ec0247ff]  Max Packet=[2048]  IR/IT contexts=[4/8]
> stopped custom tracer.
> 
> ============================
> [ BUG: illegal lock usage! ]
> ----------------------------
> illegal {hardirq-on-W} -> {in-hardirq-R} usage.

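So we have an rwlock which was acquired for writing with local interrupts
enabled, but which we later acquired for reading inside an interrupt
handler.  If that interrupt arrives on the CPU which holds the write lock,
the handler spins forever.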


> idle/0 [HC1[1]:SC1[0]:HE0:SE0] takes:
>  (hl_irqs_lock){--+.}, at: [<f8835cb9>] highlevel_host_reset+0x11/0x5b [ieee1394]
> {hardirq-on-W} state was registered at:
>   [<c0133fe4>] lockdep_acquire+0x4d/0x63
>   [<c02f3421>] _write_lock+0x2e/0x3b
>   [<f88365ab>] hpsb_register_highlevel+0xac/0xea [ieee1394]
>   [<f8836d6a>] init_csr+0x28/0x3f [ieee1394]
>   [<f880617d>] 0xf880617d
>   [<c01398df>] sys_init_module+0x12a/0x1b7b
>   [<c02f3b2d>] sysenter_past_esp+0x56/0x8d

Here's the irqs-on write_lock.

> irq event stamp: 258193
> hardirqs last  enabled at (258192): [<c011fab5>] __do_softirq+0x67/0xf7
> hardirqs last disabled at (258193): [<c0102eb7>] common_interrupt+0x1b/0x2c
> softirqs last  enabled at (258186): [<c011fb34>] __do_softirq+0xe6/0xf7
> softirqs last disabled at (258191): [<c0104cec>] do_softirq+0x5a/0xc9
> 
> other info that might help us debug this:
> no locks held by idle/0.
> 
> stack backtrace:
>  [<c01034ba>] show_trace+0x12/0x14
>  [<c0103b8d>] dump_stack+0x19/0x1b
>  [<c0132025>] print_usage_bug+0x20b/0x215
>  [<c01329cc>] mark_lock+0x4fa/0x5b4
>  [<c0133399>] __lockdep_acquire+0x310/0xbc0
>  [<c0133fe4>] lockdep_acquire+0x4d/0x63
>  [<c02f3153>] _read_lock+0x2e/0x3b
>  [<f8835cb9>] highlevel_host_reset+0x11/0x5b [ieee1394]
>  [<f8833867>] hpsb_selfid_complete+0x286/0x307 [ieee1394]
>  [<f884ec30>] ohci_irq_handler+0x6c9/0x995 [ohci1394]
>  [<c013d3a2>] handle_IRQ_event+0x2e/0x63
>  [<c013e4c3>] handle_fasteoi_irq+0x6b/0xac
>  [<c0104dc7>] do_IRQ+0x6c/0xa5

And here's the in-irq read_lock().

Simple fix would be to take hl_irqs_lock in an irq-safe manner everywhere.
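
The rule behind this report can be illustrated with a toy model.  This is
only a sketch, not the kernel's lock validator: the names LockClass and
acquire are hypothetical, and the model tracks just the hardirq usage
states that appear in the report.  It assumes that two opposite-context
uses conflict unless both are reads, which is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class LockClass:
    """Toy stand-in for one lock class; NOT the kernel's data structure."""
    name: str
    usage: set = field(default_factory=set)  # e.g. {"hardirq-on-W"}


def acquire(lock, *, in_hardirq, irqs_enabled, write):
    """Record one acquisition and return a bug description, or None if OK.

    in_hardirq:   the caller is running inside a hardirq handler
    irqs_enabled: hardirqs are enabled on this CPU while the lock is taken
    write:        write_lock() rather than read_lock()
    """
    mode = "W" if write else "R"
    if in_hardirq:
        new = f"in-hardirq-{mode}"
    elif irqs_enabled:
        new = f"hardirq-on-{mode}"
    else:
        return None  # taken with irqs off outside a handler: irq-safe usage

    for old in sorted(lock.usage):
        # Opposite contexts conflict unless both sides only read: a reader
        # interrupted by another reader of the same rwlock does not deadlock.
        different_context = old.rsplit("-", 1)[0] != new.rsplit("-", 1)[0]
        both_read = old.endswith("R") and new.endswith("R")
        if different_context and not both_read:
            return f"illegal {{{old}}} -> {{{new}}} usage on {lock.name}"
    lock.usage.add(new)
    return None
```

Feeding the model a write acquisition with interrupts enabled followed by
a read acquisition from a hardirq handler reproduces the message quoted
above.  If the write side is instead taken with interrupts disabled (in
kernel terms, for example with write_lock_irqsave()), no hardirq-on state
is recorded and the later in-interrupt read is accepted.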


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04 23:28 ` 2.6.17-rc5-mm3 J.A. Magallón
@ 2006-06-05  0:06   ` Barry K. Nathan
  2006-06-05  0:25   ` 2.6.17-rc5-mm3 Grant Coady
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 52+ messages in thread
From: Barry K. Nathan @ 2006-06-05  0:06 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: Andrew Morton, linux-kernel

On 6/4/06, J.A. Magallón <jamagallon@ono.com> wrote:
> Jun  2 14:34:39 annwn kernel: --------------------------------------------------------
> Jun  2 14:34:39 annwn kernel: 141 out of 206 testcases failed, as expected. |
> Jun  2 14:34:39 annwn kernel: ----------------------------------------------------
>
> Expected ? Uh ?

grep PROVE .config

Make sure all 4 of them are set to Y; if any of them are N, then test
case failures would in fact be expected.
-- 
-Barry K. Nathan <barryn@pobox.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04 23:28 ` 2.6.17-rc5-mm3 J.A. Magallón
  2006-06-05  0:06   ` 2.6.17-rc5-mm3 Barry K. Nathan
@ 2006-06-05  0:25   ` Grant Coady
  2006-06-05  0:45   ` 2.6.17-rc5-mm3 Grant Coady
  2006-06-05  9:12   ` 2.6.17-rc5-mm3 Ingo Molnar
  3 siblings, 0 replies; 52+ messages in thread
From: Grant Coady @ 2006-06-05  0:25 UTC (permalink / raw)
  Cc: Andrew Morton, linux-kernel

On Mon, 5 Jun 2006 01:28:42 +0200, "J.A. Magallón" <jamagallon@ono.com> wrote:

>On Sat, 3 Jun 2006 23:20:04 -0700, Andrew Morton <akpm@osdl.org> wrote:
>
>> 
>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
>> 
>> - Lots of PCI and USB updates
>> 
>> - The various lock validator, stack backtracing and IRQ management problems
>>   are converging, but we're not quite there yet.
>> 
>
>I got this with -mm2, is it supposed to be cured in -mm3 ? I still have to
>try with mm3:
>
>Jun  2 14:34:39 annwn kernel: Console: colour VGA+ 80x60
>Jun  2 14:34:39 annwn kernel: ------------------------
>Jun  2 14:34:39 annwn kernel: | Locking API testsuite:
>Jun  2 14:34:39 annwn kernel: ----------------------------------------------------------------------------
>Jun  2 14:34:39 annwn kernel:                                  | spin |wlock |rlock |mutex | wsem | rsem |
>Jun  2 14:34:39 annwn kernel:   --------------------------------------------------------------------------
>Jun  2 14:34:39 annwn kernel:                      A-A deadlock:failed|failed|failed|failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:                  A-B-B-A deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:              A-B-B-C-C-A deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:              A-B-C-A-B-C deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:          A-B-B-C-C-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:          A-B-C-D-B-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:          A-B-C-D-B-C-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:                     double unlock:failed|failed|failed|failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:                  bad unlock order:failed|failed|failed|failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:   --------------------------------------------------------------------------
>Jun  2 14:34:39 annwn kernel:               recursive read-lock:             |  ok  |             |failed|
>Jun  2 14:34:39 annwn kernel:   --------------------------------------------------------------------------
>Jun  2 14:34:39 annwn kernel:      hard-irqs-on + irq-safe-A/12:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:      soft-irqs-on + irq-safe-A/12:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:      hard-irqs-on + irq-safe-A/21:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:      soft-irqs-on + irq-safe-A/21:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:        sirq-safe-A => hirqs-on/12:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:        sirq-safe-A => hirqs-on/21:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:          hard-safe-A + irqs-on/12:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:          soft-safe-A + irqs-on/12:failed|failed|  ok  |
>
>(all tests failed like this...)
>
>Jun  2 14:34:39 annwn kernel: --------------------------------------------------------
>Jun  2 14:34:39 annwn kernel: 141 out of 206 testcases failed, as expected. |
>Jun  2 14:34:39 annwn kernel: ----------------------------------------------------
>
>Expected ? Uh ?

I got something like that here before turning on all the test options.
I suggest printing '  --  ' for non-selected tests.  More info, first four files:
<http://bugsplatter.mine.nu/test/linux-2.6/sempro/?M=D>

dmesg, false positives:

<http://bugsplatter.mine.nu/test/linux-2.6/sempro/dmesg-2.6.17-rc5-mm3a.gz>:

------------------------
| Locking API testsuite:
----------------------------------------------------------------------------
                                 | spin |wlock |rlock |mutex | wsem | rsem |
  --------------------------------------------------------------------------
                     A-A deadlock:failed|failed|failed|failed|failed|failed|
                 A-B-B-A deadlock:failed|failed|  ok  |failed|failed|failed|
             A-B-B-C-C-A deadlock:failed|failed|  ok  |failed|failed|failed|
             A-B-C-A-B-C deadlock:failed|failed|  ok  |failed|failed|failed|
         A-B-B-C-C-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
         A-B-C-D-B-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
         A-B-C-D-B-C-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
                    double unlock:failed|failed|failed|failed|failed|failed|
                 bad unlock order:failed|failed|failed|failed|failed|failed|
  --------------------------------------------------------------------------
              recursive read-lock:             |  ok  |             |failed|
  --------------------------------------------------------------------------
     hard-irqs-on + irq-safe-A/12:failed|failed|  ok  |
     soft-irqs-on + irq-safe-A/12:failed|failed|  ok  |
     hard-irqs-on + irq-safe-A/21:failed|failed|  ok  |
     soft-irqs-on + irq-safe-A/21:failed|failed|  ok  |

and dmesg, okay:
<http://bugsplatter.mine.nu/test/linux-2.6/sempro/dmesg-2.6.17-rc5-mm3a-2.gz>:
------------------------
| Locking API testsuite:
----------------------------------------------------------------------------
                                 | spin |wlock |rlock |mutex | wsem | rsem |
  --------------------------------------------------------------------------
                     A-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 A-B-B-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-B-C-C-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-C-A-B-C deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-B-C-C-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-C-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                    double unlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 bad unlock order:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
  --------------------------------------------------------------------------
              recursive read-lock:             |  ok  |             |  ok  |
  --------------------------------------------------------------------------
                non-nested unlock:  ok  |  ok  |  ok  |  ok  |
  ------------------------------------------------------------
     hard-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
     soft-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
     hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
     soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |

Grant.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04 23:28 ` 2.6.17-rc5-mm3 J.A. Magallón
  2006-06-05  0:06   ` 2.6.17-rc5-mm3 Barry K. Nathan
  2006-06-05  0:25   ` 2.6.17-rc5-mm3 Grant Coady
@ 2006-06-05  0:45   ` Grant Coady
  2006-06-05  9:12   ` 2.6.17-rc5-mm3 Ingo Molnar
  3 siblings, 0 replies; 52+ messages in thread
From: Grant Coady @ 2006-06-05  0:45 UTC (permalink / raw)
  To: <unlisted-recipients; +Cc: Andrew Morton, linux-kernel

On Mon, 5 Jun 2006 01:28:42 +0200, "J.A. Magallón" <jamagallon@ono.com> wrote:

>On Sat, 3 Jun 2006 23:20:04 -0700, Andrew Morton <akpm@osdl.org> wrote:
>
>> 
>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
>> 
>> - Lots of PCI and USB updates
>> 
>> - The various lock validator, stack backtracing and IRQ management problems
>>   are converging, but we're not quite there yet.
>> 
>
>I got this with -mm2, is it supposed to be cured in -mm3 ? I still have to
>try with mm3:
>
>Jun  2 14:34:39 annwn kernel: Console: colour VGA+ 80x60
>Jun  2 14:34:39 annwn kernel: ------------------------
>Jun  2 14:34:39 annwn kernel: | Locking API testsuite:
>Jun  2 14:34:39 annwn kernel: ----------------------------------------------------------------------------
>Jun  2 14:34:39 annwn kernel:                                  | spin |wlock |rlock |mutex | wsem | rsem |
>Jun  2 14:34:39 annwn kernel:   --------------------------------------------------------------------------
>Jun  2 14:34:39 annwn kernel:                      A-A deadlock:failed|failed|failed|failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:                  A-B-B-A deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:              A-B-B-C-C-A deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:              A-B-C-A-B-C deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:          A-B-B-C-C-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:          A-B-C-D-B-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:          A-B-C-D-B-C-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:                     double unlock:failed|failed|failed|failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:                  bad unlock order:failed|failed|failed|failed|failed|failed|
>Jun  2 14:34:39 annwn kernel:   --------------------------------------------------------------------------
>Jun  2 14:34:39 annwn kernel:               recursive read-lock:             |  ok  |             |failed|
>Jun  2 14:34:39 annwn kernel:   --------------------------------------------------------------------------
>Jun  2 14:34:39 annwn kernel:      hard-irqs-on + irq-safe-A/12:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:      soft-irqs-on + irq-safe-A/12:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:      hard-irqs-on + irq-safe-A/21:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:      soft-irqs-on + irq-safe-A/21:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:        sirq-safe-A => hirqs-on/12:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:        sirq-safe-A => hirqs-on/21:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:          hard-safe-A + irqs-on/12:failed|failed|  ok  |
>Jun  2 14:34:39 annwn kernel:          soft-safe-A + irqs-on/12:failed|failed|  ok  |
>
>(all tests failed like this...)
>
>Jun  2 14:34:39 annwn kernel: --------------------------------------------------------
>Jun  2 14:34:39 annwn kernel: 141 out of 206 testcases failed, as expected. |
>Jun  2 14:34:39 annwn kernel: ----------------------------------------------------
>
>Expected ? Uh ?

I got something like that here before turning on all the test options.
I suggest printing '  --  ' for non-selected tests.  More info, first four files:
<http://bugsplatter.mine.nu/test/linux-2.6/sempro/?M=D>

dmesg, false positives:

<http://bugsplatter.mine.nu/test/linux-2.6/sempro/dmesg-2.6.17-rc5-mm3a.gz>:

------------------------
| Locking API testsuite:
----------------------------------------------------------------------------
                                 | spin |wlock |rlock |mutex | wsem | rsem |
  --------------------------------------------------------------------------
                     A-A deadlock:failed|failed|failed|failed|failed|failed|
                 A-B-B-A deadlock:failed|failed|  ok  |failed|failed|failed|
             A-B-B-C-C-A deadlock:failed|failed|  ok  |failed|failed|failed|
             A-B-C-A-B-C deadlock:failed|failed|  ok  |failed|failed|failed|
         A-B-B-C-C-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
         A-B-C-D-B-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
         A-B-C-D-B-C-D-A deadlock:failed|failed|  ok  |failed|failed|failed|
                    double unlock:failed|failed|failed|failed|failed|failed|
                 bad unlock order:failed|failed|failed|failed|failed|failed|
  --------------------------------------------------------------------------
              recursive read-lock:             |  ok  |             |failed|
  --------------------------------------------------------------------------
     hard-irqs-on + irq-safe-A/12:failed|failed|  ok  |
     soft-irqs-on + irq-safe-A/12:failed|failed|  ok  |
     hard-irqs-on + irq-safe-A/21:failed|failed|  ok  |
     soft-irqs-on + irq-safe-A/21:failed|failed|  ok  |

and dmesg, okay:
<http://bugsplatter.mine.nu/test/linux-2.6/sempro/dmesg-2.6.17-rc5-mm3a-2.gz>:
------------------------
| Locking API testsuite:
----------------------------------------------------------------------------
                                 | spin |wlock |rlock |mutex | wsem | rsem |
  --------------------------------------------------------------------------
                     A-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 A-B-B-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-B-C-C-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-C-A-B-C deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-B-C-C-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-C-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                    double unlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 bad unlock order:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
  --------------------------------------------------------------------------
              recursive read-lock:             |  ok  |             |  ok  |
  --------------------------------------------------------------------------
                non-nested unlock:  ok  |  ok  |  ok  |  ok  |
  ------------------------------------------------------------
     hard-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
     soft-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
     hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
     soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |

Grant.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04 23:15 ` 2.6.17-rc5-mm3 J.A. Magallón
  2006-06-04 23:42   ` 2.6.17-rc5-mm3 Andrew Morton
@ 2006-06-05  6:02   ` Valdis.Kletnieks
  2006-06-05  8:04   ` 2.6.17-rc5-mm3 Arjan van de Ven
  2 siblings, 0 replies; 52+ messages in thread
From: Valdis.Kletnieks @ 2006-06-05  6:02 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: Andrew Morton, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 970 bytes --]

On Mon, 05 Jun 2006 01:15:31 +0200, "J.A. Magallón" said:
> On Sat, 3 Jun 2006 23:20:04 -0700, Andrew Morton <akpm@osdl.org> wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/

> ============================
> [ BUG: illegal lock usage! ]
> ----------------------------
> illegal {hardirq-on-W} -> {in-hardirq-R} usage.
> idle/0 [HC1[1]:SC1[0]:HE0:SE0] takes:
>  (hl_irqs_lock){--+.}, at: [<f8835cb9>] highlevel_host_reset+0x11/0x5b [ieee1394]
> {hardirq-on-W} state was registered at:
>   [<c0133fe4>] lockdep_acquire+0x4d/0x63
>   [<c02f3421>] _write_lock+0x2e/0x3b
>   [<f88365ab>] hpsb_register_highlevel+0xac/0xea [ieee1394]
>   [<f8836d6a>] init_csr+0x28/0x3f [ieee1394]
>   [<f880617d>] 0xf880617d
>   [<c01398df>] sys_init_module+0x12a/0x1b7b
>   [<c02f3b2d>] sysenter_past_esp+0x56/0x8d

ACK.  I saw this same one too, while udevd was trying to get its act
together in very early rc.sysinit....


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04 23:15 ` 2.6.17-rc5-mm3 J.A. Magallón
  2006-06-04 23:42   ` 2.6.17-rc5-mm3 Andrew Morton
  2006-06-05  6:02   ` 2.6.17-rc5-mm3 Valdis.Kletnieks
@ 2006-06-05  8:04   ` Arjan van de Ven
  2 siblings, 0 replies; 52+ messages in thread
From: Arjan van de Ven @ 2006-06-05  8:04 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: Andrew Morton, linux-kernel

On Mon, 2006-06-05 at 01:15 +0200, J.A. Magallón wrote:
> On Sat, 3 Jun 2006 23:20:04 -0700, Andrew Morton <akpm@osdl.org> wrote:
> 
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
> > 
> > - Lots of PCI and USB updates
> > 
> > - The various lock validator, stack backtracing and IRQ management problems
> >   are converging, but we're not quite there yet.
> > 
> 
> Got this on boot. Looks like another locking bug in firewire:
> 
> ACPI: PCI Interrupt 0000:03:03.0[A] -> GSI 20 (level, low) -> IRQ 20
> ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[20]  MMIO=[ec024000-ec0247ff]  Max Packet=[2048]  IR/IT contexts=[4/8]
> stopped custom tracer.
> 
> ============================
> [ BUG: illegal lock usage! ]
> ----------------------------
> illegal {hardirq-on-W} -> {in-hardirq-R} usage.
> idle/0 [HC1[1]:SC1[0]:HE0:SE0] takes:
>  (hl_irqs_lock){--+.}, at: [<f8835cb9>] highlevel_host_reset+0x11/0x5b [ieee1394]

This one was reported a few days ago and acknowledged by the firewire
people as real. It seems they haven't sent Andrew a fix yet.
If they don't do that today, I'll send a provisional fix.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04 23:28 ` 2.6.17-rc5-mm3 J.A. Magallón
                     ` (2 preceding siblings ...)
  2006-06-05  0:45   ` 2.6.17-rc5-mm3 Grant Coady
@ 2006-06-05  9:12   ` Ingo Molnar
  3 siblings, 0 replies; 52+ messages in thread
From: Ingo Molnar @ 2006-06-05  9:12 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: Andrew Morton, linux-kernel


* J.A. Magallón <jamagallon@ono.com> wrote:

> I got this with -mm2, is it supposed to be cured in -mm3 ? I still 
> have to try with mm3:

> (all tests failed like this...)
> 
> Jun  2 14:34:39 annwn kernel: --------------------------------------------------------
> Jun  2 14:34:39 annwn kernel: 141 out of 206 testcases failed, as expected. |
> Jun  2 14:34:39 annwn kernel: ----------------------------------------------------
> 
> Expected ? Uh ?

to have lock validation you should have these options enabled:

 CONFIG_PROVE_SPIN_LOCKING=y
 CONFIG_PROVE_RW_LOCKING=y
 CONFIG_PROVE_MUTEX_LOCKING=y
 CONFIG_PROVE_RWSEM_LOCKING=y

otherwise the tests are still run, but the deadlocks are not detected. 
That's why those 141 testcases are 'expected' failures.
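
The four options listed above can be checked mechanically instead of by
eye.  The following is a minimal sketch, assuming the usual .config
syntax (NAME=y for enabled options, "# NAME is not set" for disabled
ones); the function name missing_prove_options is made up for this
example.

```python
REQUIRED = (
    "CONFIG_PROVE_SPIN_LOCKING",
    "CONFIG_PROVE_RW_LOCKING",
    "CONFIG_PROVE_MUTEX_LOCKING",
    "CONFIG_PROVE_RWSEM_LOCKING",
)


def missing_prove_options(config_text):
    """Return the required options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as '# CONFIG_FOO is not set' and are skipped.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [opt for opt in REQUIRED if opt not in enabled]
```

Calling missing_prove_options(open(".config").read()) in the build
directory should return an empty list when all four are enabled; any
name it returns is an option whose testcases will be reported as
expected failures.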

and definitely try -mm3 plus the current combo patch:

  http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-rc5-mm3.patch

	Ingo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
@ 2006-06-05 16:30 Martin Bligh
  2006-06-05 19:44 ` 2.6.17-rc5-mm3 Ingo Molnar
  0 siblings, 1 reply; 52+ messages in thread
From: Martin Bligh @ 2006-06-05 16:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Andy Whitcroft, LKML, Ingo Molnar

panic on NUMA-Q during LTP. Was fine in -mm2.

BUG: unable to handle kernel paging request at virtual address 22222232
  printing eip:
c012cf84
*pde = 25b5a001
*pte = 00000000
Oops: 0000 [#1]
SMP
last sysfs file: /devices/pci0000:00/0000:00:0a.0/resource
Modules linked in:
CPU:    12
EIP:    0060:[<c012cf84>]    Not tainted VLI
EFLAGS: 00010002   (2.6.17-rc5-mm3-autokern1 #1)
EIP is at check_deadlock+0x19/0xe1
eax: 00000001   ebx: e4453030   ecx: 00000000   edx: e4008000
esi: 22222222   edi: 00000001   ebp: 22222222   esp: e47ebec0
ds: 007b   es: 007b   ss: 0068
Process mkdir09 (pid: 18319, threadinfo=e47ea000 task=e5f91ab0)
Stack: e4453030 22222222 00000000 e459231c c012d015 22222222 00000001 e4008000
        e459231c e47ea000 e47ebf1c e5f91ab0 c012d1ce e459231c 00000000 e47ea000
        e47ebf1c e459231c 00000246 c02f1d74 e459231c e47ebf1c e47ea000 e47ebf1c
Call Trace:
  [<c012d015>] check_deadlock+0xaa/0xe1
  [<c012d1ce>] debug_mutex_add_waiter+0x4a/0x5c
  [<c02f1d74>] __mutex_lock_slowpath+0x9e/0x1cb
  [<c01648a9>] do_rmdir+0x67/0xc2
  [<c02001da>] __put_user_4+0x12/0x18
  [<c016490f>] sys_rmdir+0xb/0xe
  [<c02f2f1f>] syscall_call+0x7/0xb
Code: 0c 68 60 07 31 c0 e8 22 c0 fe ff 58 fa 5b 5e 5f 5d c3 55 83 3d cc 11 36 c0 00 57 56 53 8b 6c 24 14 8b 7c 24 18 0f 84 c1 00 00 00 <8b> 55 10 31 c0 85 d2 0f 84 b6 00 00 00 8b 1a 31 f6 8b 83 c4 04
EIP: [<c012cf84>] check_deadlock+0x19/0xe1 SS:ESP 0068:e47ebec0

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04  6:20 2.6.17-rc5-mm3 Andrew Morton
                   ` (3 preceding siblings ...)
  2006-06-04 23:28 ` 2.6.17-rc5-mm3 J.A. Magallón
@ 2006-06-05 17:56 ` Mel Gorman
  2006-06-05 18:54   ` 2.6.17-rc5-mm3 Andrew Morton
  2006-06-05 19:48 ` 2.6.17-rc5-mm3 Dave Jones
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Mel Gorman @ 2006-06-05 17:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel


I am seeing more networking-related funniness with 2.6.17-rc5-mm3 on the
same machine previously fixed by git-net-llc-fix.patch. The console log is
below. I have not investigated yet, in case it is a known problem.

kernel /vmlinuz-autobench ro root=/dev/VolGroup00/LogVol00 rhgb quiet console=ttyS1,19200 autobench_args: root=30726124 ABAT:1149529388 earlyprintk=serial,ttyS1,19200
   [Linux-bzImage, setup=0x1e00, size=0x1e0687]
initrd /initrd-autobench.img
   [Linux-initrd @ 0x37e60000, 0x18fbd9 bytes]
Bootdata ok (command line is ro root=/dev/VolGroup00/LogVol00 rhgb quiet console=ttyS1,19200 autobench_args: root=30726124 ABAT:1149529388 earlyprintk=serial,ttyS1,19200)
Linux version 2.6.17-rc5-mm2-autokern1 (root@bl6-13.ltc.austin.ibm.com) (/usr/local/autobench/var/tmp/build/scripts/mkcompile_h: line 61: /usr/local/autobench/sources/x86_64-cross/*/bin/x86_64-unknown-linux-gnu-gcc: No such file or directory) #1 SMP Mon Jun 5 12:36:09 CDT 2006
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009d400 (usable)
 BIOS-e820: 000000000009d400 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003ffcddc0 (usable)
 BIOS-e820: 000000003ffcddc0 - 000000003ffd0000 (ACPI data)
 BIOS-e820: 000000003ffd0000 - 0000000040000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
kernel direct mapping tables up to 100000000 @ 8000-8000
DMI 2.3 present.
ACPI: PM-Timer IO Port: 0x2208
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
Processor #2 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
Processor #3 15:1 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 14, version 17, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x0d] address[0xfec10000] gsi_base[24])
IOAPIC[1]: apic_id 13, version 17, address 0xfec10000, GSI 24-27
ACPI: IOAPIC (id[0x0c] address[0xfec20000] gsi_base[48])
IOAPIC[2]: apic_id 12, version 17, address 0xfec20000, GSI 48-51
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 low level)
Setting APIC routing to physical flat
ACPI: HPET id: 0x10228203 base: 0xfecff000
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
SMP: Allowing 4 CPUs, 0 hotplug CPUs
Built 1 zonelists
Kernel command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet console=ttyS1,19200 autobench_args: root=30726124 ABAT:1149529388 earlyprintk=serial,ttyS1,19200
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
Red Hat nash version 5.0.32 starting
  Reading all physical volumes.  This may take a while...
  Found volume group "VolGroup00" using metadata type lvm2
  2 logical volume(s) in volume group "VolGroup00" now active
INIT: version 2.86 booting
                Welcome to Fedora Core
                press 'I' to enter interactive startup.
Setting clock  (localtime): Mon Jun  5 12:47:49 CDT 2006 [  OK  ]
Starting udev: [  OK  ]
Setting hostname bl6-13.ltc.austin.ibm.com:  [  OK  ]
Setting up Logical Volume Management:   2 logical volume(s) in volume group "VolGroup00" now active
[  OK  ]
Checking filesystems
Checking all file systems.
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/VolGroup00/LogVol00 
/dev/VolGroup00/LogVol00: clean, 285228/7929856 files, 2745851/7929856 blocks
[/sbin/fsck.ext3 (1) -- /boot] fsck.ext3 -a /dev/sda1 
/boot: clean, 63/512512 files, 43614/512064 blocks
[  OK  ]
Remounting root filesystem in read-write mode:  [  OK  ]
Mounting local filesystems:  [  OK  ]
Enabling local filesystem quotas:  [  OK  ]
Enabling swap space:  [  OK  ]
INIT: Entering runlevel: 3
Entering non-interactive startup
Starting readahead_early:  Starting background readahead: [  OK  ]
[  OK  ]
FATAL: Error inserting acpi_cpufreq (/lib/modules/2.6.17-rc5-mm2-autokern1/kernel/arch/x86_64/kernel/cpufreq/acpi-cpufreq.ko): No such device
Bringing up loopback interface:  [  OK  ]
Bringing up interface eth1:  [  OK  ]
Starting system logger: [  OK  ]
Starting kernel logger: [  OK  ]
Starting irqbalance: [  OK  ]
Starting portmap: [  OK  ]
Starting NFS statd: [  OK  ]
Starting RPC idmapd: FATAL: Module sunrpc not found.
FATAL: Error running install command for sunrpc
Starting system message bus: [  OK  ]
Starting Bluetooth services:[  OK  ]
[  OK  ]
Mounting other filesystems:  [  OK  ]
Starting hidd: [  OK  ]
Starting automount: [  OK  ]
Starting smartd: [  OK  ]
Starting acpi daemon: [  OK  ]
Starting hpiod: [  OK  ]
Starting hpssd: [  OK  ]
Starting cups: [  OK  ]
Starting sshd: [  OK  ]
Starting sendmail: [  OK  ]
Starting sm-client: [  OK  ]
Starting console mouse services: [  OK  ]
Starting crond: [  OK  ]
Starting xfs: [  OK  ]
Starting anacron: [  OK  ]
Starting atd: [  OK  ]
Starting Avahi daemon: [  OK  ]
Starting cups-config-daemon: [  OK  ]
Starting HAL daemon: [  OK  ]
Fedora Core release 5 (Bordeaux)
Kernel 2.6.17-rc5-mm2-autokern1 on an x86_64
bl6-13.ltc.austin.ibm.com login: -- 0:conmux-control -- time-stamp -- Jun/05/06 10:47:46 --
-- 0:conmux-control -- time-stamp -- Jun/05/06 10:51:12 --
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81268fc4>] icmp_rcv+0x17c/0x184
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160<7>Losing some ticks... checking if CPU frequency changed.
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff8125923f>] tcp_rcv_established+0xe3/0x71a
 [<ffffffff8126079d>] tcp_v4_do_rcv+0x2b/0x2ff
 [<ffffffff8126106d>] tcp_v4_rcv+0x5fc/0x996
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81268fc4>] icmp_rcv+0x17c/0x184
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP: 
 [<ffffffff8108b063>] prepare_binprm+0xb/0xf4
PGD ccd9067 PUD e0ce067 PMD 0 
Oops: 0000 [1] SMP 
last sysfs file: /block/sda/sda1/size
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff8125923f>] tcp_rcv_established+0xe3/0x71a
 [<ffffffff8126079d>] tcp_v4_do_rcv+0x2b/0x2ff
 [<ffffffff8126106d>] tcp_v4_rcv+0x5fc/0x996
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
CPU 2 
Modules linked in: ipv6 ppdev hidp rfcomm l2cap bluetooth video sony_acpi button battery asus_acpi ac lp parport_pc parport nvram
Pid: 18763, comm: sh Not tainted 2.6.17-rc5-mm2-autokern1 #1
RIP: 0010:[<ffffffff8108b063>]  [<ffffffff8108b063>] prepare_binprm+0xb/0xf4
RSP: 0018:ffff81000cb3ded8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff81003eae6800 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000003fff RDI: ffff81003eae6800
RBP: ffff810029649c00 R08: ffff81000cb3c000 R09: 000000000000afa7
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff81001a618000 R14: 000000000070a560 R15: 000000000070b5c0
FS:  00002aba68ba6d30(0000) GS:ffff81003ffbe8c0(0000) knlGS:00000000f7f5a6b0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 000000000cdab000 CR4: 00000000000006e0
Process sh (pid: 18763, threadinfo ffff81000cb3c000, task ffff8100285437c0)
Stack: ffff81003eae6800 BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
ffffffff8108b63f ffff81000cb3df58 ffff81001a618000 
       000000000070a560 000000000070b5c0 ffff81001a618000 0000000000709620 
       0000000000000000 ffffffff81007f14 
Call Trace:
 [<ffffffff8108b63f>] do_execve+0x11d/0x24b
 [<ffffffff81007f14>] sys_execve+0x34/0x87
 [<ffffffff81009677>] stub_execve+0x67/0xb0
Code: 48 8b 41 10 48 8b 70 20 b8 f3 ff ff ff 0f b7 56 4c f6 c2 49 
RIP  [<ffffffff8108b063>] prepare_binprm+0xb/0xf4 RSP <ffff81000cb3ded8>
CR2: 0000000000000010
 BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff8125923f>] tcp_rcv_established+0xe3/0x71a
 [<ffffffff8126079d>] tcp_v4_do_rcv+0x2b/0x2ff
 [<ffffffff8126106d>] tcp_v4_rcv+0x5fc/0x996
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff811d94a3>] sd_rw_intr+0x2a2/0x2b1
 [<ffffffff811cbe47>] scsi_device_unbusy+0x5d/0x77
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff8125923f>] tcp_rcv_established+0xe3/0x71a
 [<ffffffff8126079d>] tcp_v4_do_rcv+0x2b/0x2ff
 [<ffffffff8126106d>] tcp_v4_rcv+0x5fc/0x996
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff8125923f>] tcp_rcv_established+0xe3/0x71a
 [<ffffffff8126079d>] tcp_v4_do_rcv+0x2b/0x2ff
 [<ffffffff8126106d>] tcp_v4_rcv+0x5fc/0x996
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff8125923f>] tcp_rcv_established+0xe3/0x71a
 [<ffffffff8126079d>] tcp_v4_do_rcv+0x2b/0x2ff
 [<ffffffff8126106d>] tcp_v4_rcv+0x5fc/0x996
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff8125923f>] tcp_rcv_established+0xe3/0x71a
 [<ffffffff8126079d>] tcp_v4_do_rcv+0x2b/0x2ff
 [<ffffffff8126106d>] tcp_v4_rcv+0x5fc/0x996
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff811d94a3>] sd_rw_intr+0x2a2/0x2b1
 [<ffffffff811cbe47>] scsi_device_unbusy+0x5d/0x77
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff8125923f>] tcp_rcv_established+0xe3/0x71a
 [<ffffffff8126079d>] tcp_v4_do_rcv+0x2b/0x2ff
 [<ffffffff8126106d>] tcp_v4_rcv+0x5fc/0x996
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff811d94a3>] sd_rw_intr+0x2a2/0x2b1
 [<ffffffff811cbe47>] scsi_device_unbusy+0x5d/0x77
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff8125923f>] tcp_rcv_established+0xe3/0x71a
 [<ffffffff8126079d>] tcp_v4_do_rcv+0x2b/0x2ff
 [<ffffffff8126106d>] tcp_v4_rcv+0x5fc/0x996
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81252ce2>] tcp_recvmsg+0x622/0x7fb
 [<ffffffff8122720e>] sock_common_recvmsg+0x2d/0x44
 [<ffffffff81223aaf>] do_sock_read+0xc6/0xd1
 [<ffffffff81223bff>] sock_aio_read+0x4f/0x5e
 [<ffffffff8102bed5>] __wake_up+0x36/0x4d
 [<ffffffff81080bc8>] do_sync_read+0xc9/0x106
 [<ffffffff81045d20>] autoremove_wake_function+0x0/0x2e
 [<ffffffff81170577>] tty_ldisc_deref+0x65/0x77
 [<ffffffff81080ce9>] vfs_read+0xe4/0x172
 [<ffffffff81081037>] sys_read+0x45/0x6e
 [<ffffffff810092be>] system_call+0x7e/0x83
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81252ce2>] tcp_recvmsg+0x622/0x7fb
 [<ffffffff8122720e>] sock_common_recvmsg+0x2d/0x44
 [<ffffffff81223aaf>] do_sock_read+0xc6/0xd1
 [<ffffffff81223bff>] sock_aio_read+0x4f/0x5e
 [<ffffffff8102bed5>] __wake_up+0x36/0x4d
 [<ffffffff81080bc8>] do_sync_read+0xc9/0x106
 [<ffffffff81045d20>] autoremove_wake_function+0x0/0x2e
 [<ffffffff81170577>] tty_ldisc_deref+0x65/0x77
 [<ffffffff81080ce9>] vfs_read+0xe4/0x172
 [<ffffffff81081037>] sys_read+0x45/0x6e
 [<ffffffff810092be>] system_call+0x7e/0x83
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff811d94a3>] sd_rw_intr+0x2a2/0x2b1
 [<ffffffff811cbe47>] scsi_device_unbusy+0x5d/0x77
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff8125923f>] tcp_rcv_established+0xe3/0x71a
 [<ffffffff8126079d>] tcp_v4_do_rcv+0x2b/0x2ff
 [<ffffffff8126106d>] tcp_v4_rcv+0x5fc/0x996
 [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
 [<ffffffff812489bf>] ip_rcv+0x434/0x475
 [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
 [<ffffffff81199add>] tg3_poll+0x716/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
BUG: warning at include/net/dst.h:153/dst_release()
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff811d94a3>] sd_rw_intr+0x2a2/0x2b1
 [<ffffffff811cbe47>] scsi_device_unbusy+0x5d/0x77
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
Call Trace:
 [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81252ce2>] tcp_recvmsg+0x622/0x7fb
 [<ffffffff8122720e>] sock_common_recvmsg+0x2d/0x44
 [<ffffffff81223aaf>] do_sock_read+0xc6/0xd1
 [<ffffffff81223bff>] sock_aio_read+0x4f/0x5e
 [<ffffffff8102bed5>] __wake_up+0x36/0x4d
 [<ffffffff81080bc8>] do_sync_read+0xc9/0x106BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
 [<ffffffff81045d20>] autoremove_wake_function+0x0/0x2e
 [<ffffffff81092210>] do_ioctl+0x64/0x6f
 [<ffffffff81080ce9>] vfs_read+0xe4/0x172
 [<ffffffff81081037>] sys_read+0x45/0x6e
 [<ffffffff810092be>] system_call+0x7e/0x83
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81252ce2>] tcp_recvmsg+0x622/0x7fb
 [<ffffffff8122720e>] sock_common_recvmsg+0x2d/0x44
 [<ffffffff81223aaf>] do_sock_read+0xc6/0xd1
 [<ffffffff81223bff>] sock_aio_read+0x4f/0x5e
 [<ffffffff8102bed5>] __wake_up+0x36/0x4d
 [<ffffffff81080bc8>] do_sync_read+0xc9/0x106
 [<ffffffff81045d20>] autoremove_wake_function+0x0/0x2e
 [<ffffffff81092210>] do_ioctl+0x64/0x6f
 [<ffffffff81080ce9>] vfs_read+0xe4/0x172
 [<ffffffff81081037>] sys_read+0x45/0x6e
 [<ffffffff810092be>] system_call+0x7e/0x83
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81252ce2>] tcp_recvmsg+0x622/0x7fb
 [<ffffffff8122720e>] sock_common_recvmsg+0x2d/0x44
 [<ffffffff81223aaf>] do_sock_read+0xc6/0xd1
 [<ffffffff81223bff>] sock_aio_read+0x4f/0x5e
 [<ffffffff8102bed5>] __wake_up+0x36/0x4d
 [<ffffffff81080bc8>] do_sync_read+0xc9/0x106
 [<ffffffff81045d20>] autoremove_wake_function+0x0/0x2e
 [<ffffffff81092210>] do_ioctl+0x64/0x6f
 [<ffffffff81080ce9>] vfs_read+0xe4/0x172BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff81007807>] default_idle+0x0/0x54
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
 [<ffffffff81081037>] sys_read+0x45/0x6e
 [<ffffffff810092be>] system_call+0x7e/0x83
general protection fault: 0000 [2] SMP 
last sysfs file: /block/sda/sda1/size
CPU 3 
Modules linked in: ipv6 ppdev hidp rfcomm l2cap bluetooth video sony_acpi button battery asus_acpi ac lp parport_pc parport nvram
Pid: 17887, comm: sshd Not tainted 2.6.17-rc5-mm2-autokern1 #1
RIP: 0010:[<ffffffff81228334>]  [<ffffffff81228334>] skb_drop_fraglist+0x17/0x26
RSP: 0018:ffff81000ef8dc48  EFLAGS: 00010206
RAX: 00000000026b2300 RBX: 4000000000000060 RCX: 000000000000b56c
RDX: ffff8100162f6900 RSI: ffffffff81321250 RDI: 4000000000000060
RBP: 0000000000000090 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: ffff81000ea78800 R12: 00000000ffffff95
R13: 0000000000000000 R14: ffff810034b84ac0 R15: 0000000000003f70
FS:  00002ae22a4babe0(0000) GS:ffff810037e0cdc0(0000) knlGS:00000000f7f0b6b0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000005b9e5c CR3: 000000000f2ef000 CR4: 00000000000006e0
Process sshd (pid: 17887, threadinfo ffff81000ef8c000, task ffff810015501840)
Stack: ffff810034b84ac0 ffffffff812283c7 00000000000001d8 ffff810034b84ac0 
       0000000000000090 ffffffff812281c2 ffff81000ea78800 ffffffff81252ce2 
       0000000000000246 0000000100000000 
Call Trace:
 [<ffffffff812283c7>] skb_release_data+0x84/0x97
 [<ffffffff812281c2>] kfree_skbmem+0x9/0x7fBUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8103b598>] do_timer+0x9b/0x4bd
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
 [<ffffffff81252ce2>] tcp_recvmsg+0x622/0x7fb
 [<ffffffff8122720e>] sock_common_recvmsg+0x2d/0x44
 [<ffffffff81223aaf>] do_sock_read+0xc6/0xd1
 [<ffffffff81223bff>] sock_aio_read+0x4f/0x5e
 [<ffffffff8102bed5>] __wake_up+0x36/0x4d
 [<ffffffff81080bc8>] do_sync_read+0xc9/0x106
 [<ffffffff81045d20>] autoremove_wake_function+0x0/0x2e
 [<ffffffff81092210>] do_ioctl+0x64/0x6f
 [<ffffffff81080ce9>] vfs_read+0xe4/0x172
 [<ffffffff81081037>] sys_read+0x45/0x6e
 [<ffffffff810092be>] system_call+0x7e/0x83
Code: 48 8b 1b e8 b9 ff ff ff 48 85 db 75 f0 5b c3 55 53 48 89 fb 
RIP  [<ffffffff81228334>] skb_drop_fraglist+0x17/0x26 RSP <ffff81000ef8dc48>
 ----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/slab.c:3430
invalid opcode: 0000 [3] SMP 
last sysfs file: /block/sda/sda1/size
CPU 1 
Modules linked in: ipv6 ppdev hidp rfcomm l2cap bluetooth video sony_acpi button battery asus_acpi ac lp parport_pc parport nvram
Pid: 6, comm: ksoftirqd/1 Not tainted 2.6.17-rc5-mm2-autokern1 #1
RIP: 0010:[<ffffffff8107d0a4>]  [<ffffffff8107d0a4>] kmem_cache_free+0x5f/0x77
RSP: 0018:ffff810037e1ff28  EFLAGS: 00010287
RAX: 0000000000000080 RBX: ffff81000cfc1480 RCX: 000000000000000a
RDX: ffff81000185c980 RSI: ffff81000e3a6c00 RDI: ffff8100026b2340
RBP: ffff8100024e7d40 R08: 0000000000000008 R09: 0000000000000000
BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
R10: 0000000000000000 R11: ffff81003ff950d0 R12: 0000000000000008
R13: 0000000000000001 R14: ffffffff812b1104 R15: 0000000000000000
FS:  00002ae22a4babe0(0000) GS:ffff81003ff81340(0000) knlGS:00000000f7f5c6b0
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000005b9e5c CR3: 0000000001001000 CR4: 00000000000006e0
Process ksoftirqd/1 (pid: 6, threadinfo ffff810037e14000, task ffff81003ff950d0)
Stack: ffff81000cfc1480 ffffffff8104394c ffff8100024e7dc0 0000000000000000 
       ffffffff814cbc90 ffffffff810439ee 0000000000000000 ffffffff81037bc1 
       0000000000000001 ffffffff81476f90 
Call Trace:
 <IRQ> [<ffffffff8104394c>] __rcu_process_callbacks+0x12a/0x1ab
 [<ffffffff810439ee>] rcu_process_callbacks+0x21/0x42
 [<ffffffff81037bc1>] tasklet_action+0x69/0xa8
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff81037d1b>] ksoftirqd+0x0/0xbf
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 <EOI> [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff81037d84>] ksoftirqd+0x69/0xbf
 [<ffffffff81045806>] kthread+0x107/0x133
 [<ffffffff81037d1b>] ksoftirqd+0x0/0xbf
 [<ffffffff8100a146>] child_rip+0x8/0x12
 [<ffffffff81037d1b>] ksoftirqd+0x0/0xbf
 [<ffffffff810456ff>] kthread+0x0/0x133
 [<ffffffff8100a13e>] child_rip+0x0/0x12
Code: 0f 0b 68 41 56 2b 81 c2 66 0d 9c 5b fa 31 d2 e8 99 f9 ff ff 
RIP  [<ffffffff8107d0a4>] kmem_cache_free+0x5f/0x77BUG: warning at include/net/dst.h:153/dst_release()
Call Trace:
 <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
 [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
 [<ffffffff8122d80c>] net_rx_action+0xac/0x160
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
 [<ffffffff810097b8>] ret_from_intr+0x0/0xb
 <EOI>
Attempt to release alive inet socket ffff81003f8b2780
 RSP <ffff810037e1ff28>
 <3>BUG: sleeping function called from invalid context at include/linux/rwsem.h:53
in_atomic():1, irqs_disabled():0
Call Trace:
 <IRQ> [<ffffffff810299a0>] __might_sleep+0xc0/0xc2
 [<ffffffff8103f5a1>] blocking_notifier_call_chain+0x1f/0x4e
 [<ffffffff81034c96>] do_exit+0x22/0x8b2
 [<ffffffff8128a3a7>] _spin_unlock_irqrestore+0xb/0xd
 [<ffffffff8100aa61>] do_divide_error+0x0/0xa2
 [<ffffffff8128ad5e>] do_trap+0xe6/0xf3
 [<ffffffff8100ac90>] do_invalid_op+0x9b/0xa5
 [<ffffffff8107d0a4>] kmem_cache_free+0x5f/0x77
 [<ffffffff81009f8d>] error_exit+0x0/0x84
 [<ffffffff8107d0a4>] kmem_cache_free+0x5f/0x77
 [<ffffffff8104394c>] __rcu_process_callbacks+0x12a/0x1ab
 [<ffffffff810439ee>] rcu_process_callbacks+0x21/0x42
 [<ffffffff81037bc1>] tasklet_action+0x69/0xa8
 [<ffffffff81037904>] __do_softirq+0x48/0xb4
 [<ffffffff81037d1b>] ksoftirqd+0x0/0xbf
 [<ffffffff8100a496>] call_softirq+0x1e/0x28
 <EOI> [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
 [<ffffffff81037d84>] ksoftirqd+0x69/0xbf
 [<ffffffff81045806>] kthread+0x107/0x133
 [<ffffffff81037d1b>] ksoftirqd+0x0/0xbf
 [<ffffffff8100a146>] child_rip+0x8/0x12
 [<ffffffff81037d1b>] ksoftirqd+0x0/0xbf
 [<ffffffff810456ff>] kthread+0x0/0x133
 [<ffffffff8100a13e>] child_rip+0x0/0x12
Kernel panic - not syncing: Aiee, killing interrupt handler!
-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 17:56 ` 2.6.17-rc5-mm3 Mel Gorman
@ 2006-06-05 18:54   ` Andrew Morton
  2006-06-06  9:43     ` 2.6.17-rc5-mm3 Mel Gorman
  2006-06-06 10:57     ` 2.6.17-rc5-mm3 Mel Gorman
  0 siblings, 2 replies; 52+ messages in thread
From: Andrew Morton @ 2006-06-05 18:54 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-kernel, netdev

On Mon, 5 Jun 2006 18:56:37 +0100
mel@csn.ul.ie (Mel Gorman) wrote:

> 
> I am seeing more networking-related funniness with 2.6.17-rc5-mm3 on the
> same machine previously fixed by git-net-llc-fix.patch. The console log is
> below. I've done no investigation work in case it's a known problem.

It's not a known problem, afaik.

> ...
> Starting anacron: [  OK  ]
> Starting atd: [  OK  ]
> Starting Avahi daemon: [  OK  ]
> Starting cups-config-daemon: [  OK  ]
> Starting HAL daemon: [  OK  ]
> Fedora Core release 5 (Bordeaux)
> Kernel 2.6.17-rc5-mm2-autokern1 on an x86_64
> bl6-13.ltc.austin.ibm.com login: -- 0:conmux-control -- time-stamp -- Jun/05/06 10:47:46 --
> -- 0:conmux-control -- time-stamp -- Jun/05/06 10:51:12 --
> BUG: warning at include/net/dst.h:153/dst_release()
> Call Trace:
>  <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
>  [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
>  [<ffffffff8122d80c>] net_rx_action+0xac/0x160
>  [<ffffffff81037904>] __do_softirq+0x48/0xb4
>  [<ffffffff8100a496>] call_softirq+0x1e/0x28
>  [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
>  [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
>  [<ffffffff81007807>] default_idle+0x0/0x54
>  [<ffffffff810097b8>] ret_from_intr+0x0/0xb
>  <EOI>
> Attempt to release alive inet socket ffff81003f8b2780
> BUG: warning at include/net/dst.h:153/dst_release()
> Call Trace:
>  <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
>  [<ffffffff81268fc4>] icmp_rcv+0x17c/0x184
>  [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
>  [<ffffffff812489bf>] ip_rcv+0x434/0x475
>  [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
>  [<ffffffff81199add>] tg3_poll+0x716/0x94f
>  [<ffffffff8122d80c>] net_rx_action+0xac/0x160<7>Losing some ticks... checking if CPU frequency changed.
>  [<ffffffff81037904>] __do_softirq+0x48/0xb4
>  [<ffffffff8100a496>] call_softirq+0x1e/0x28
>  [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
>  [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
>  [<ffffffff81007807>] default_idle+0x0/0x54
>  [<ffffffff810097b8>] ret_from_intr+0x0/0xb

There are quite a few changes in the net tree.  I guess the first thing to
investigate would be 2.6.17-rc5+origin.patch+git-net.patch.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 16:30 2.6.17-rc5-mm3 Martin Bligh
@ 2006-06-05 19:44 ` Ingo Molnar
  2006-06-05 20:00   ` 2.6.17-rc5-mm3 Randy.Dunlap
  0 siblings, 1 reply; 52+ messages in thread
From: Ingo Molnar @ 2006-06-05 19:44 UTC (permalink / raw)
  To: Martin Bligh; +Cc: Andrew Morton, Andy Whitcroft, LKML


* Martin Bligh <mbligh@google.com> wrote:

> panic on NUMA-Q during LTP. Was fine in -mm2.
> 
> BUG: unable to handle kernel paging request at virtual address 22222232

> EIP is at check_deadlock+0x19/0xe1
> eax: 00000001   ebx: e4453030   ecx: 00000000   edx: e4008000
> esi: 22222222   edi: 00000001   ebp: 22222222   esp: e47ebec0

again these 0x22222222 entries on the stack. What on earth does this? 
Andy got a similar crash on x86_64, with a 0x2222222222222222 entry ...

none of our magic values are 0x22 or 0x22222222.

	Ingo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-04  6:20 2.6.17-rc5-mm3 Andrew Morton
                   ` (4 preceding siblings ...)
  2006-06-05 17:56 ` 2.6.17-rc5-mm3 Mel Gorman
@ 2006-06-05 19:48 ` Dave Jones
  2006-06-05 20:06   ` 2.6.17-rc5-mm3 Andrew Morton
  2006-06-05 23:02 ` 2.6.17-rc5-mm3 Dave Jones
  2006-06-06  8:03 ` 2.6.17-rc5-mm3 J.A. Magallón
  7 siblings, 1 reply; 52+ messages in thread
From: Dave Jones @ 2006-06-05 19:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Sat, Jun 03, 2006 at 11:20:04PM -0700, Andrew Morton wrote:
 > 
 > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
 > 
 > - Lots of PCI and USB updates
 > 
 > - The various lock validator, stack backtracing and IRQ management problems
 >   are converging, but we're not quite there yet.

Thought I'd try my bi-annual "poke at -mm". Results were less
than spectacular.

http://www.codemonkey.org.uk/junk/DSC00347.JPG
First the sound driver oopsed.

Then, the whole thing locked up after probing the parallel port.
I disabled it in the BIOS, and then it locked up probing the floppy drive..
http://www.codemonkey.org.uk/junk/DSC00348.JPG

System is still alive, and responds to keyboard, but makes no forward progress.

(sysrq-B spewed a lockdep trace and then rebooted. I'll try and get
that hooked up to a serial console)

On a whim, I enabled the floppy drive in the BIOS, and rebooted.
That got me here. http://www.codemonkey.org.uk/junk/DSC00349.JPG
Same dead userspace.

Off to find a serial cable.

		Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 19:44 ` 2.6.17-rc5-mm3 Ingo Molnar
@ 2006-06-05 20:00   ` Randy.Dunlap
  2006-06-05 20:05     ` 2.6.17-rc5-mm3 Ingo Molnar
                       ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: Randy.Dunlap @ 2006-06-05 20:00 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: mbligh, akpm, apw, linux-kernel

On Mon, 5 Jun 2006 21:44:22 +0200 Ingo Molnar wrote:

> 
> * Martin Bligh <mbligh@google.com> wrote:
> 
> > panic on NUMA-Q during LTP. Was fine in -mm2.
> > 
> > BUG: unable to handle kernel paging request at virtual address 22222232
> 
> > EIP is at check_deadlock+0x19/0xe1
> > eax: 00000001   ebx: e4453030   ecx: 00000000   edx: e4008000
> > esi: 22222222   edi: 00000001   ebp: 22222222   esp: e47ebec0
> 
> again these 0x22222222 entries on the stack. What on earth does this? 
> Andy got a similar crash on x86_64, with a 0x2222222222222222 entry ...
> 
> nothing of our magic values are 0x22 or 0x222222222.

kernel/mutex-debug.c:
void debug_mutex_free_waiter(struct mutex_waiter *waiter)
{
	DEBUG_WARN_ON(!list_empty(&waiter->list));
	memset(waiter, 0x22, sizeof(*waiter));
}

---
~Randy

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 20:00   ` 2.6.17-rc5-mm3 Randy.Dunlap
@ 2006-06-05 20:05     ` Ingo Molnar
  2006-06-05 20:05     ` 2.6.17-rc5-mm3 Dave Jones
  2006-06-06  8:56     ` [patch, -rc5-mm3] better lock debugging: remove mutex deadlock checking code Ingo Molnar
  2 siblings, 0 replies; 52+ messages in thread
From: Ingo Molnar @ 2006-06-05 20:05 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: mbligh, akpm, apw, linux-kernel


* Randy.Dunlap <rdunlap@xenotime.net> wrote:

> > > panic on NUMA-Q during LTP. Was fine in -mm2.
> > > 
> > > BUG: unable to handle kernel paging request at virtual address 22222232
> > 
> > > EIP is at check_deadlock+0x19/0xe1
> > > eax: 00000001   ebx: e4453030   ecx: 00000000   edx: e4008000
> > > esi: 22222222   edi: 00000001   ebp: 22222222   esp: e47ebec0
> > 
> > again these 0x22222222 entries on the stack. What on earth does this? 
> > Andy got a similar crash on x86_64, with a 0x2222222222222222 entry ...
> > 
> > nothing of our magic values are 0x22 or 0x222222222.
> 
> kernel/mutex-debug.c:
> void debug_mutex_free_waiter(struct mutex_waiter *waiter)
> {
> 	DEBUG_WARN_ON(!list_empty(&waiter->list));
> 	memset(waiter, 0x22, sizeof(*waiter));
> }

ah!!! that's indeed a hint. Will take a look tomorrow.

	Ingo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 20:00   ` 2.6.17-rc5-mm3 Randy.Dunlap
  2006-06-05 20:05     ` 2.6.17-rc5-mm3 Ingo Molnar
@ 2006-06-05 20:05     ` Dave Jones
  2006-06-05 20:08       ` 2.6.17-rc5-mm3 Ingo Molnar
  2006-06-05 20:14       ` 2.6.17-rc5-mm3 Randy.Dunlap
  2006-06-06  8:56     ` [patch, -rc5-mm3] better lock debugging: remove mutex deadlock checking code Ingo Molnar
  2 siblings, 2 replies; 52+ messages in thread
From: Dave Jones @ 2006-06-05 20:05 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: Ingo Molnar, mbligh, akpm, apw, linux-kernel

On Mon, Jun 05, 2006 at 01:00:39PM -0700, Randy.Dunlap wrote:
 > On Mon, 5 Jun 2006 21:44:22 +0200 Ingo Molnar wrote:
 > 
 > > 
 > > * Martin Bligh <mbligh@google.com> wrote:
 > > 
 > > > panic on NUMA-Q during LTP. Was fine in -mm2.
 > > > 
 > > > BUG: unable to handle kernel paging request at virtual address 22222232
 > > 
 > > > EIP is at check_deadlock+0x19/0xe1
 > > > eax: 00000001   ebx: e4453030   ecx: 00000000   edx: e4008000
 > > > esi: 22222222   edi: 00000001   ebp: 22222222   esp: e47ebec0
 > > 
 > > again these 0x22222222 entries on the stack. What on earth does this? 
 > > Andy got a similar crash on x86_64, with a 0x2222222222222222 entry ...
 > > 
 > > nothing of our magic values are 0x22 or 0x222222222.
 > 
 > kernel/mutex-debug.c:
 > void debug_mutex_free_waiter(struct mutex_waiter *waiter)
 > {
 > 	DEBUG_WARN_ON(!list_empty(&waiter->list));
 > 	memset(waiter, 0x22, sizeof(*waiter));
 > }

Documentation/magic-number.txt sounds so promising, but we scatter definitions
of numbers all over the place. (No mention of the slab poison values,
or similar numbers there for eg, and various pointers to _other_ lists
of magic numbers).

		Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 19:48 ` 2.6.17-rc5-mm3 Dave Jones
@ 2006-06-05 20:06   ` Andrew Morton
  2006-06-05 20:09     ` 2.6.17-rc5-mm3 Dave Jones
  2006-06-06 10:15     ` 2.6.17-rc5-mm3 Takashi Iwai
  0 siblings, 2 replies; 52+ messages in thread
From: Andrew Morton @ 2006-06-05 20:06 UTC (permalink / raw)
  To: Dave Jones; +Cc: linux-kernel, Jaroslav Kysela, Takashi Iwai

On Mon, 5 Jun 2006 15:48:45 -0400
Dave Jones <davej@redhat.com> wrote:

> On Sat, Jun 03, 2006 at 11:20:04PM -0700, Andrew Morton wrote:
>  > 
>  > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
>  > 
>  > - Lots of PCI and USB updates
>  > 
>  > - The various lock validator, stack backtracing and IRQ management problems
>  >   are converging, but we're not quite there yet.
> 
> Thought I'd try my bi-annual "poke at -mm". Results were less
> than spectacular.
> 
> http://www.codemonkey.org.uk/junk/DSC00347.JPG
> First the sound driver oopsed.

That's a bug in sound/pci/cs4281.c.

There's a debug patch in -mm
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/broken-out/debug-shared-irqs.patch
which trips up drivers which request an IRQ before their IRQ handler is
ready to accept IRQs (they'll crash in real life if the IRQ is shared).

> Then, the whole thing locked up after probing the parallel port.
> I disabled it in the BIOS, and then it locked up probing the floppy drive..
> http://www.codemonkey.org.uk/junk/DSC00348.JPG

That looks like the same thing?

> System is still alive, and responds to keyboard, but makes no forward progress.
> 
> (sysrq-B spewed a lockdep trace and then rebooted. I'll try and get
> that hooked up to a serial console)
> 
> On a whim, I enabled the floppy drive in the BIOS, and rebooted.
> That got me here. http://www.codemonkey.org.uk/junk/DSC00349.JPG
> Same dead userspace.

So does that.

> Off to find a serial cable.

Try reverting debug-shared-irqs.patch, or disable the sound driver?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 20:05     ` 2.6.17-rc5-mm3 Dave Jones
@ 2006-06-05 20:08       ` Ingo Molnar
  2006-06-05 20:14       ` 2.6.17-rc5-mm3 Randy.Dunlap
  1 sibling, 0 replies; 52+ messages in thread
From: Ingo Molnar @ 2006-06-05 20:08 UTC (permalink / raw)
  To: Dave Jones, Randy.Dunlap, mbligh, akpm, apw, linux-kernel


* Dave Jones <davej@redhat.com> wrote:

>  > kernel/mutex-debug.c:
>  > void debug_mutex_free_waiter(struct mutex_waiter *waiter)
>  > {
>  > 	DEBUG_WARN_ON(!list_empty(&waiter->list));
>  > 	memset(waiter, 0x22, sizeof(*waiter));
>  > }
> 
> Documentation/magic-number.txt sounds so promising, but we scatter 
> definitions of numbers all over the place. (No mention of the slab 
> poison values, or similar numbers there for eg, and various pointers 
> to _other_ lists of magic numbers).

we've also got include/linux/poison.h - i'll move this value there.

	Ingo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 20:06   ` 2.6.17-rc5-mm3 Andrew Morton
@ 2006-06-05 20:09     ` Dave Jones
  2006-06-05 20:44       ` 2.6.17-rc5-mm3 Dave Jones
  2006-06-06 10:15     ` 2.6.17-rc5-mm3 Takashi Iwai
  1 sibling, 1 reply; 52+ messages in thread
From: Dave Jones @ 2006-06-05 20:09 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Jaroslav Kysela, Takashi Iwai

On Mon, Jun 05, 2006 at 01:06:26PM -0700, Andrew Morton wrote:

 > > Then, the whole thing locked up after probing the parallel port.
 > > I disabled it in the BIOS, and then it locked up probing the floppy drive..
 > > http://www.codemonkey.org.uk/junk/DSC00348.JPG
 > 
 > That looks like the same thing?
 > 
 > > System is still alive, and responds to keyboard, but makes no forward progress.
 > > 
 > > (sysrq-B spewed a lockdep trace and then rebooted. I'll try and get
 > > that hooked up to a serial console)
 > > 
 > > On a whim, I enabled the floppy drive in the BIOS, and rebooted.
 > > That got me here. http://www.codemonkey.org.uk/junk/DSC00349.JPG
 > > Same dead userspace.
 > 
 > So does that.

The top half of the screen is the same as the first pic yes, but the purpose
of those latter two pics was to show that we're locking up (in apparently
different places) shortly afterwards.

 > Try reverting debug-shared-irqs.patch, or disable the sound driver?

Will turn off the sound driver, and see what happens.

		Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 20:05     ` 2.6.17-rc5-mm3 Dave Jones
  2006-06-05 20:08       ` 2.6.17-rc5-mm3 Ingo Molnar
@ 2006-06-05 20:14       ` Randy.Dunlap
  2006-06-05 20:54         ` [PATCH] poison: add & use more constants Randy.Dunlap
  1 sibling, 1 reply; 52+ messages in thread
From: Randy.Dunlap @ 2006-06-05 20:14 UTC (permalink / raw)
  To: Dave Jones; +Cc: mingo, mbligh, akpm, apw, linux-kernel

On Mon, 5 Jun 2006 16:05:54 -0400 Dave Jones wrote:

> On Mon, Jun 05, 2006 at 01:00:39PM -0700, Randy.Dunlap wrote:
>  > On Mon, 5 Jun 2006 21:44:22 +0200 Ingo Molnar wrote:
>  > 
>  > > 
>  > > * Martin Bligh <mbligh@google.com> wrote:
>  > > 
>  > > > panic on NUMA-Q during LTP. Was fine in -mm2.
>  > > > 
>  > > > BUG: unable to handle kernel paging request at virtual address 22222232
>  > > 
>  > > > EIP is at check_deadlock+0x19/0xe1
>  > > > eax: 00000001   ebx: e4453030   ecx: 00000000   edx: e4008000
>  > > > esi: 22222222   edi: 00000001   ebp: 22222222   esp: e47ebec0
>  > > 
>  > > again these 0x22222222 entries on the stack. What on earth does this? 
>  > > Andy got a similar crash on x86_64, with a 0x2222222222222222 entry ...
>  > > 
>  > > nothing of our magic values are 0x22 or 0x222222222.
>  > 
>  > kernel/mutex-debug.c:
>  > void debug_mutex_free_waiter(struct mutex_waiter *waiter)
>  > {
>  > 	DEBUG_WARN_ON(!list_empty(&waiter->list));
>  > 	memset(waiter, 0x22, sizeof(*waiter));
>  > }
> 
> Documentation/magic-number.txt sounds so promising, but we scatter definitions
> of numbers all over the place. (No mention of the slab poison values,
> or similar numbers there for eg, and various pointers to _other_ lists
> of magic numbers).

I have a few more that I can add to include/linux/poison.h, like this one
above (only in -mm at present).

./include/linux/libata.h:#define ATA_TAG_POISON		0xfafbfcfdU

./arch/ppc/8260_io/fcc_enet.c:1918:	memset((char *)(&(immap->im_dprambase[(mem_addr+64)])), 0x88, 32);
./drivers/usb/mon/mon_text.c:429:	memset(mem, 0xe5, sizeof(struct mon_event_text));

./kernel/mutex-debug.c:384:	memset(waiter, 0x11, sizeof(*waiter));
./kernel/mutex-debug.c:400:	memset(waiter, 0x22, sizeof(*waiter));

./security/keys/key.c:985:			memset(&key->payload, 0xbd, sizeof(key->payload));

./drivers/char/ftape/lowlevel/ftape-ctl.c:738:		memset(ft_buffer[i]->address, 0xAA, FT_BUFF_SIZE);

./drivers/block/sx8.c:/* 0xf is just arbitrary, non-zero noise; this is sorta like poisoning */


---
~Randy

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 20:09     ` 2.6.17-rc5-mm3 Dave Jones
@ 2006-06-05 20:44       ` Dave Jones
  2006-06-05 20:53         ` 2.6.17-rc5-mm3 Andrew Morton
  2006-06-05 21:03         ` 2.6.17-rc5-mm3 Arjan van de Ven
  0 siblings, 2 replies; 52+ messages in thread
From: Dave Jones @ 2006-06-05 20:44 UTC (permalink / raw)
  To: Andrew Morton, linux-kernel; +Cc: arjan, mingo

On Mon, Jun 05, 2006 at 04:09:47PM -0400, Dave Jones wrote:

 >  > Try reverting debug-shared-irqs.patch, or disable the sound driver?
 > Will turn off the sound driver, and see what happens.

Win! It now boots.   I blew it up really easy with a socket-fuzzer though.
(http://people.redhat.com/davej/sfuzz.c)

[  874.865028] ======================================
[  874.943738] [ BUG: bad unlock ordering detected! ]
[  875.002919] --------------------------------------
[  875.062134] sfuzz/23915 is trying to release lock (&sctp_port_alloc_lock) at:
[  875.149619]  [<d128ed4e>] sctp_get_port_local+0xd0/0x285 [sctp]
[  875.222636] but the next lock to release is:
[  875.276019]  (&sctp_port_hashtable[i].lock){-...}, at: [<d128ed0e>] sctp_get_port_local+0x90/0x285 [sctp]
[  875.393031]
[  875.393032] other info that might help us debug this:
[  875.476583] 1 locks held by sfuzz/23915:
[  875.526247]  #0:  (&sctp_port_alloc_lock){-...}, at: [<d128ecd9>] sctp_get_port_local+0x5b/0x285 [sctp]
[  875.641621]
[  875.641623] stack backtrace:
[  875.699891]  [<c0104966>] show_trace_log_lvl+0x54/0xfd
[  875.764425]  [<c0104f1a>] show_trace+0xd/0x10
[  875.819622]  [<c010502f>] dump_stack+0x19/0x1b
[  875.875924]  [<c013b4af>] lockdep_release+0x150/0x2d1
[  875.939610]  [<c032341e>] _spin_unlock+0x16/0x20
[  875.998171]  [<d128ed4e>] sctp_get_port_local+0xd0/0x285 [sctp]
[  876.072345]  [<d128efd4>] sctp_do_bind+0x9a/0x158 [sctp]
[  876.139315]  [<d128f0ce>] sctp_autobind+0x3c/0x44 [sctp]
[  876.206310]  [<d129253d>] sctp_inet_listen+0xe9/0x139 [sctp]
[  876.277539]  [<c02c20af>] sys_listen+0x4a/0x65
[  876.334730]  [<c02c308d>] sys_socketcall+0x98/0x186
[  876.397175]  [<c03239cb>] syscall_call+0x7/0xb


-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 20:44       ` 2.6.17-rc5-mm3 Dave Jones
@ 2006-06-05 20:53         ` Andrew Morton
  2006-06-05 21:02           ` 2.6.17-rc5-mm3 Dave Jones
  2006-06-05 21:03         ` 2.6.17-rc5-mm3 Arjan van de Ven
  1 sibling, 1 reply; 52+ messages in thread
From: Andrew Morton @ 2006-06-05 20:53 UTC (permalink / raw)
  To: Dave Jones; +Cc: linux-kernel, arjan, mingo

On Mon, 5 Jun 2006 16:44:56 -0400
Dave Jones <davej@redhat.com> wrote:

> On Mon, Jun 05, 2006 at 04:09:47PM -0400, Dave Jones wrote:
> 
>  >  > Try reverting debug-shared-irqs.patch, or disable the sound driver?
>  > Will turn off the sound driver, and see what happens.
> 
> Win! It now boots.

So does Windows 95.

>   I blew it up really easy with a socket-fuzzer though.
> (http://people.redhat.com/davej/sfuzz.c)

But it kept running OK, yes?

> [  874.865028] ======================================
> [  874.943738] [ BUG: bad unlock ordering detected! ]
> [  875.002919] --------------------------------------
> [  875.062134] sfuzz/23915 is trying to release lock (&sctp_port_alloc_lock) at:
> [  875.149619]  [<d128ed4e>] sctp_get_port_local+0xd0/0x285 [sctp]
> [  875.222636] but the next lock to release is:
> [  875.276019]  (&sctp_port_hashtable[i].lock){-...}, at: [<d128ed0e>] sctp_get_port_local+0x90/0x285 [sctp]
> [  875.393031]
> [  875.393032] other info that might help us debug this:
> [  875.476583] 1 locks held by sfuzz/23915:
> [  875.526247]  #0:  (&sctp_port_alloc_lock){-...}, at: [<d128ecd9>] sctp_get_port_local+0x5b/0x285 [sctp]
> [  875.641621]
> [  875.641623] stack backtrace:
> [  875.699891]  [<c0104966>] show_trace_log_lvl+0x54/0xfd
> [  875.764425]  [<c0104f1a>] show_trace+0xd/0x10
> [  875.819622]  [<c010502f>] dump_stack+0x19/0x1b
> [  875.875924]  [<c013b4af>] lockdep_release+0x150/0x2d1
> [  875.939610]  [<c032341e>] _spin_unlock+0x16/0x20
> [  875.998171]  [<d128ed4e>] sctp_get_port_local+0xd0/0x285 [sctp]
> [  876.072345]  [<d128efd4>] sctp_do_bind+0x9a/0x158 [sctp]
> [  876.139315]  [<d128f0ce>] sctp_autobind+0x3c/0x44 [sctp]
> [  876.206310]  [<d129253d>] sctp_inet_listen+0xe9/0x139 [sctp]
> [  876.277539]  [<c02c20af>] sys_listen+0x4a/0x65
> [  876.334730]  [<c02c308d>] sys_socketcall+0x98/0x186
> [  876.397175]  [<c03239cb>] syscall_call+0x7/0xb

This is a really really fussy "BUG", IMO.  So we undid the locks in an
inappropriate order - big deal.

But often these _are_ things which we should tune up, as an efficiency
thing, so it is interesting to hear about them.  But calling it a "BUG" is
a bit alarmist.

Thanks for booting it.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH] poison: add & use more constants
  2006-06-05 20:14       ` 2.6.17-rc5-mm3 Randy.Dunlap
@ 2006-06-05 20:54         ` Randy.Dunlap
  2006-06-06 13:33           ` Steven Rostedt
  0 siblings, 1 reply; 52+ messages in thread
From: Randy.Dunlap @ 2006-06-05 20:54 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: davej, mingo, mbligh, akpm, apw, linux-kernel

From: Randy Dunlap <rdunlap@xenotime.net>

Add more poison values to include/linux/poison.h.
It's not clear to me whether some others should be added or not,
so I haven't added any of these:

./include/linux/libata.h:#define ATA_TAG_POISON		0xfafbfcfdU
./arch/ppc/8260_io/fcc_enet.c:1918:	memset((char *)(&(immap->im_dprambase[(mem_addr+64)])), 0x88, 32);
./drivers/usb/mon/mon_text.c:429:	memset(mem, 0xe5, sizeof(struct mon_event_text));
./drivers/char/ftape/lowlevel/ftape-ctl.c:738:		memset(ft_buffer[i]->address, 0xAA, FT_BUFF_SIZE);
./drivers/block/sx8.c:/* 0xf is just arbitrary, non-zero noise; this is sorta like poisoning */

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
---
 include/linux/poison.h |    7 +++++++
 kernel/mutex-debug.c   |    5 +++--
 security/keys/key.c    |    3 ++-
 3 files changed, 12 insertions(+), 3 deletions(-)

--- linux-2617-rc5mm3.orig/include/linux/poison.h
+++ linux-2617-rc5mm3/include/linux/poison.h
@@ -45,6 +45,13 @@
 /********** drivers/atm/ **********/
 #define ATM_POISON_FREE		0x12
 
+/********** kernel/mutexes **********/
+#define MUTEX_DEBUG_INIT	0x11
+#define MUTEX_DEBUG_FREE	0x22
+
+/********** security/ **********/
+#define KEY_DESTROY		0xbd
+
 /********** sound/oss/ **********/
 #define OSS_POISON_FREE		0xAB
 
--- linux-2617-rc5mm3.orig/kernel/mutex-debug.c
+++ linux-2617-rc5mm3/kernel/mutex-debug.c
@@ -16,6 +16,7 @@
 #include <linux/sched.h>
 #include <linux/delay.h>
 #include <linux/module.h>
+#include <linux/poison.h>
 #include <linux/spinlock.h>
 #include <linux/kallsyms.h>
 #include <linux/interrupt.h>
@@ -155,7 +156,7 @@ void debug_mutex_set_owner(struct mutex 
 
 void debug_mutex_lock_common(struct mutex *lock, struct mutex_waiter *waiter)
 {
-	memset(waiter, 0x11, sizeof(*waiter));
+	memset(waiter, MUTEX_DEBUG_INIT, sizeof(*waiter));
 	waiter->magic = waiter;
 	INIT_LIST_HEAD(&waiter->list);
 }
@@ -171,7 +172,7 @@ void debug_mutex_wake_waiter(struct mute
 void debug_mutex_free_waiter(struct mutex_waiter *waiter)
 {
 	DEBUG_WARN_ON(!list_empty(&waiter->list));
-	memset(waiter, 0x22, sizeof(*waiter));
+	memset(waiter, MUTEX_DEBUG_FREE, sizeof(*waiter));
 }
 
 void debug_mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
--- linux-2617-rc5mm3.orig/security/keys/key.c
+++ linux-2617-rc5mm3/security/keys/key.c
@@ -11,6 +11,7 @@
 
 #include <linux/module.h>
 #include <linux/init.h>
+#include <linux/poison.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/security.h>
@@ -986,7 +987,7 @@ void unregister_key_type(struct key_type
 		if (key->type == ktype) {
 			if (ktype->destroy)
 				ktype->destroy(key);
-			memset(&key->payload, 0xbd, sizeof(key->payload));
+			memset(&key->payload, KEY_DESTROY, sizeof(key->payload));
 		}
 	}
 



---

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 20:53         ` 2.6.17-rc5-mm3 Andrew Morton
@ 2006-06-05 21:02           ` Dave Jones
  0 siblings, 0 replies; 52+ messages in thread
From: Dave Jones @ 2006-06-05 21:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, arjan, mingo

On Mon, Jun 05, 2006 at 01:53:54PM -0700, Andrew Morton wrote:
 
 > >  >  > Try reverting debug-shared-irqs.patch, or disable the sound driver?
 > >  > Will turn off the sound driver, and see what happens.
 > > Win! It now boots.
 > So does Windows 95.

Hey, it's my turn to play "optimist" today. :)

 > >   I blew it up really easy with a socket-fuzzer though.
 > > (http://people.redhat.com/davej/sfuzz.c)
 > 
 > But it kept running OK, yes?

Yep, still ticking along (for now).

 > > [  874.865028] ======================================
 > > [  874.943738] [ BUG: bad unlock ordering detected! ]
 > > [  875.002919] --------------------------------------
 > > [  875.062134] sfuzz/23915 is trying to release lock (&sctp_port_alloc_lock) at:
 > > [  875.149619]  [<d128ed4e>] sctp_get_port_local+0xd0/0x285 [sctp]
 > > [  875.222636] but the next lock to release is:
 > > [  875.276019]  (&sctp_port_hashtable[i].lock){-...}, at: [<d128ed0e>] sctp_get_port_local+0x90/0x285 [sctp]
 > > [  875.393031]
 > > [  875.393032] other info that might help us debug this:
 > > [  875.476583] 1 locks held by sfuzz/23915:
 > > [  875.526247]  #0:  (&sctp_port_alloc_lock){-...}, at: [<d128ecd9>] sctp_get_port_local+0x5b/0x285 [sctp]
 > > [  875.641621]
 > > [  875.641623] stack backtrace:
 > > [  875.699891]  [<c0104966>] show_trace_log_lvl+0x54/0xfd
 > > [  875.764425]  [<c0104f1a>] show_trace+0xd/0x10
 > > [  875.819622]  [<c010502f>] dump_stack+0x19/0x1b
 > > [  875.875924]  [<c013b4af>] lockdep_release+0x150/0x2d1
 > > [  875.939610]  [<c032341e>] _spin_unlock+0x16/0x20
 > > [  875.998171]  [<d128ed4e>] sctp_get_port_local+0xd0/0x285 [sctp]
 > > [  876.072345]  [<d128efd4>] sctp_do_bind+0x9a/0x158 [sctp]
 > > [  876.139315]  [<d128f0ce>] sctp_autobind+0x3c/0x44 [sctp]
 > > [  876.206310]  [<d129253d>] sctp_inet_listen+0xe9/0x139 [sctp]
 > > [  876.277539]  [<c02c20af>] sys_listen+0x4a/0x65
 > > [  876.334730]  [<c02c308d>] sys_socketcall+0x98/0x186
 > > [  876.397175]  [<c03239cb>] syscall_call+0x7/0xb
 > 
 > This is a really really fussy "BUG", IMO.  So we undid the locks in an
 > inappropriate order - big deal.
 > 
 > But often these _are_ things which we should tune up, as an efficiency
 > thing, so it is interesting to hear about them.  But calling it a "BUG" is
 > a bit alarmist.

Maybe so, but it's still pretty grotty though.

		Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: 2.6.17-rc5-mm3
  2006-06-05 20:44       ` 2.6.17-rc5-mm3 Dave Jones
  2006-06-05 20:53         ` 2.6.17-rc5-mm3 Andrew Morton
@ 2006-06-05 21:03         ` Arjan van de Ven
  1 sibling, 0 replies; 52+ messages in thread
From: Arjan van de Ven @ 2006-06-05 21:03 UTC (permalink / raw)
  To: Dave Jones; +Cc: Andrew Morton, linux-kernel, mingo

On Mon, 2006-06-05 at 16:44 -0400, Dave Jones wrote:
> On Mon, Jun 05, 2006 at 04:09:47PM -0400, Dave Jones wrote:
> 
>  >  > Try reverting debug-shared-irqs.patch, or disable the sound driver?
>  > Will turn off the sound driver, and see what happens.
> 
> Win! It now boots.   I blew it up really easy with a socket-fuzzer though.
> (http://people.redhat.com/davej/sfuzz.c)
> 
> [  874.865028] ======================================
> [  874.943738] [ BUG: bad unlock ordering detected! ]
> [  875.002919] --------------------------------------
> [  875.062134] sfuzz/23915 is trying to release lock (&sctp_port_alloc_lock) at:
> [  875.149619]  [<d128ed4e>] sctp_get_port_local+0xd0/0x285 [sctp]
> [  875.222636] but the next lock to release is:
> [  875.276019]  (&sctp_port_hashtable[i].lock){-...}, at: [<d128ed0e>] sctp_get_port_local+0x90/0x285 [sctp]
> [  875.393031]

this is "interesting" code to follow but it looks like an honest case of
deliberate out-of-order unlock

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

---
 net/sctp/socket.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.17-rc5-mm3/net/sctp/socket.c
===================================================================
--- linux-2.6.17-rc5-mm3.orig/net/sctp/socket.c
+++ linux-2.6.17-rc5-mm3/net/sctp/socket.c
@@ -4597,7 +4597,7 @@ static long sctp_get_port_local(struct s
 			sctp_spin_unlock(&head->lock);
 		} while (--remaining > 0);
 		sctp_port_rover = rover;
-		sctp_spin_unlock(&sctp_port_alloc_lock);
+		spin_unlock_non_nested(&sctp_port_alloc_lock);
 
 		/* Exhausted local port range during search? */
 		ret = 1;
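
For readers following the report above: the function takes
sctp_port_alloc_lock first and a hash bucket lock second, then drops the
outer lock while the bucket lock is still held. The following is a minimal
userspace sketch of why a validator that expects last-in-first-out releases
would flag that, and what a "non nested" release would need to tolerate.
Every name in it (held_locks, track_lock, track_unlock, the strict flag) is
made up for illustration; it is not the lock validator's code and it only
assumes what the patch's spin_unlock_non_nested() is meant to express.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HELD 8

/* Hypothetical per-task record of held locks, in acquisition order. */
struct held_locks {
	const void *lock[MAX_HELD];
	int depth;
};

/* Record an acquisition; returns 0 on success, -1 if the table is full. */
static int track_lock(struct held_locks *h, const void *lock)
{
	if (h->depth >= MAX_HELD)
		return -1;
	h->lock[h->depth++] = lock;
	return 0;
}

/*
 * Record a release and drop the lock from the table.
 *
 * Returns -1 if the lock was not held at all.
 * Returns  1 if strict is nonzero and the lock was not the most recently
 *            acquired one (the "bad unlock ordering" case).
 * Returns  0 otherwise; with strict == 0 an out-of-order release is
 *            accepted, which is the behaviour assumed for a non-nested unlock.
 */
static int track_unlock(struct held_locks *h, const void *lock, int strict)
{
	int i, j, out_of_order;

	/* Search from the top of the stack for the lock being released. */
	for (i = h->depth - 1; i >= 0; i--)
		if (h->lock[i] == lock)
			break;
	if (i < 0)
		return -1;

	out_of_order = (i != h->depth - 1);

	/* Close the gap so the remaining entries stay in acquisition order. */
	for (j = i; j < h->depth - 1; j++)
		h->lock[j] = h->lock[j + 1];
	h->depth--;

	return (out_of_order && strict) ? 1 : 0;
}
```

Taking an "alloc" lock, then a "bucket" lock, and releasing the alloc lock
first makes track_unlock() return 1 in strict mode and 0 otherwise; in both
cases the bucket lock remains recorded as held.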




* Re: 2.6.17-rc5-mm3
  2006-06-04  6:20 2.6.17-rc5-mm3 Andrew Morton
                   ` (5 preceding siblings ...)
  2006-06-05 19:48 ` 2.6.17-rc5-mm3 Dave Jones
@ 2006-06-05 23:02 ` Dave Jones
  2006-06-06  1:44   ` 2.6.17-rc5-mm3 Randy.Dunlap
  2006-06-06  8:03 ` 2.6.17-rc5-mm3 J.A. Magallón
  7 siblings, 1 reply; 52+ messages in thread
From: Dave Jones @ 2006-06-05 23:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, paulkf

On Sat, Jun 03, 2006 at 11:20:04PM -0700, Andrew Morton wrote:
 > 
 > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/

WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol alloc_hdlcdev
WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol hdlc_close
WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol hdlc_set_carrier
WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol register_hdlc_device
WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol hdlc_open
WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol hdlc_ioctl
WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol unregister_hdlc_device

(19:02:21:root@northwood:mm3)# grep SYNCLINK .config
CONFIG_SYNCLINK_CS=m
CONFIG_SYNCLINK_CS_HDLC=y
(19:02:25:root@northwood:mm3)# grep HDLC .config
CONFIG_HDLC=m
# CONFIG_HDLC_RAW is not set
# CONFIG_HDLC_RAW_ETH is not set
# CONFIG_HDLC_CISCO is not set
# CONFIG_HDLC_FR is not set
# CONFIG_HDLC_PPP is not set
CONFIG_HISAX_HDLC=y
CONFIG_SYNCLINK_CS_HDLC=y


		Dave

-- 
http://www.codemonkey.org.uk


* Re: 2.6.17-rc5-mm3
  2006-06-05 23:02 ` 2.6.17-rc5-mm3 Dave Jones
@ 2006-06-06  1:44   ` Randy.Dunlap
  2006-06-06  1:54     ` 2.6.17-rc5-mm3 Paul Fulghum
  0 siblings, 1 reply; 52+ messages in thread
From: Randy.Dunlap @ 2006-06-06  1:44 UTC (permalink / raw)
  To: Dave Jones; +Cc: akpm, linux-kernel, paulkf, zippel

On Mon, 5 Jun 2006 19:02:48 -0400 Dave Jones wrote:

> On Sat, Jun 03, 2006 at 11:20:04PM -0700, Andrew Morton wrote:
>  > 
>  > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
> 
> WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol alloc_hdlcdev
> WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol hdlc_close
> WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol hdlc_set_carrier
> WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol register_hdlc_device
> WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol hdlc_open
> WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol hdlc_ioctl
> WARNING: /lib/modules/2.6.17-rc5-mm3/kernel/drivers/char/pcmcia/synclink_cs.ko needs unknown symbol unregister_hdlc_device
> 
> (19:02:21:root@northwood:mm3)# grep SYNCLINK .config
> CONFIG_SYNCLINK_CS=m
> CONFIG_SYNCLINK_CS_HDLC=y
> (19:02:25:root@northwood:mm3)# grep HDLC .config
> CONFIG_HDLC=m
> # CONFIG_HDLC_RAW is not set
> # CONFIG_HDLC_RAW_ETH is not set
> # CONFIG_HDLC_CISCO is not set
> # CONFIG_HDLC_FR is not set
> # CONFIG_HDLC_PPP is not set
> CONFIG_HISAX_HDLC=y
> CONFIG_SYNCLINK_CS_HDLC=y

Those Kconfig + Makefiles are quite ugly to me.  I would rather see
SYNCLINK depend on HDLC rather than using some tricks to SELECT HDLC.
And then it selects HDLC (and HDLC depends on WAN), but (in my case)
WAN was not enabled, and doing "SELECT HDLC" did not enable WAN.

Adding SELECT WAN and changing the hdlc (wan) Makefile to use
obj-m or obj-y (it was ONLY obj-y for hdlc) fixes^W makes it build
with no missing symbols.  However, I'll also see about a fix
that uses "depends on HDLC" instead of "selects HDLC".

---
From: Randy Dunlap <rdunlap@xenotime.net>

Fix many missing hdlc_generic symbols when CONFIG_HDLC=m.
When Selecting HDLC, also Select WAN.
Fix Makefile to build for HDLC=y or HDLC=m.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
---
 drivers/char/Kconfig     |    3 +++
 drivers/net/wan/Makefile |    8 ++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

--- linux-2617-rc5mm3.orig/drivers/net/wan/Makefile
+++ linux-2617-rc5mm3/drivers/net/wan/Makefile
@@ -9,14 +9,18 @@ cyclomx-y                       := cycx_
 cyclomx-$(CONFIG_CYCLOMX_X25)	+= cycx_x25.o
 cyclomx-objs			:= $(cyclomx-y)  
 
-hdlc-y				:= hdlc_generic.o
+hdlc-$(CONFIG_HDLC)		:= hdlc_generic.o
 hdlc-$(CONFIG_HDLC_RAW)		+= hdlc_raw.o
 hdlc-$(CONFIG_HDLC_RAW_ETH)	+= hdlc_raw_eth.o
 hdlc-$(CONFIG_HDLC_CISCO)	+= hdlc_cisco.o
 hdlc-$(CONFIG_HDLC_FR)		+= hdlc_fr.o
 hdlc-$(CONFIG_HDLC_PPP)		+= hdlc_ppp.o
 hdlc-$(CONFIG_HDLC_X25)		+= hdlc_x25.o
-hdlc-objs			:= $(hdlc-y)
+ifeq ($(CONFIG_HDLC),y)
+  hdlc-objs			:= $(hdlc-y)
+else
+  hdlc-objs			:= $(hdlc-m)
+endif
 
 pc300-y				:= pc300_drv.o
 pc300-$(CONFIG_PC300_MLPPP)	+= pc300_tty.o
--- linux-2617-rc5mm3.orig/drivers/char/Kconfig
+++ linux-2617-rc5mm3/drivers/char/Kconfig
@@ -197,6 +197,7 @@ config ISI
 config SYNCLINK
 	tristate "SyncLink PCI/ISA support"
 	depends on SERIAL_NONSTANDARD && PCI && ISA_DMA_API
+	select WAN if SYNCLINK_HDLC
 	select HDLC if SYNCLINK_HDLC
 	help
 	  Driver for SyncLink ISA and PCI synchronous serial adapters.
@@ -214,6 +215,7 @@ config SYNCLINK_HDLC
 config SYNCLINKMP
 	tristate "SyncLink Multiport support"
 	depends on SERIAL_NONSTANDARD && PCI
+	select WAN if SYNCLINKMP_HDLC
 	select HDLC if SYNCLINKMP_HDLC
 	help
 	  Driver for SyncLink Multiport (2 or 4 ports) PCI synchronous serial adapter.
@@ -231,6 +233,7 @@ config SYNCLINKMP_HDLC
 config SYNCLINK_GT
 	tristate "SyncLink GT/AC support"
 	depends on SERIAL_NONSTANDARD && PCI
+	select WAN if SYNCLINK_GT_HDLC
 	select HDLC if SYNCLINK_GT_HDLC
 	help
 	  Support for SyncLink GT and SyncLink AC families of
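
The difference between the two approaches discussed in this mail can be
sketched with a simplified Kconfig fragment. The prompts below are
placeholders and the entries are trimmed down; only the symbol names and the
"HDLC depends on WAN" relationship come from the thread. The point it
illustrates is that "select" forces the selected symbol on without looking
at that symbol's own "depends on" line, which matches the observation that
selecting HDLC did not enable WAN.

```kconfig
config WAN
	bool "WAN support (placeholder prompt)"

config HDLC
	tristate "Generic HDLC (placeholder prompt)"
	depends on WAN

# Variant A: reverse dependency. HDLC is forced on when SYNCLINK_HDLC is
# set, even if WAN is disabled, so WAN has to be selected by hand as well.
config SYNCLINK
	tristate "SyncLink (placeholder prompt)"
	select WAN if SYNCLINK_HDLC
	select HDLC if SYNCLINK_HDLC

# Variant B: forward dependency. The option is only offered once the user
# has enabled HDLC (and therefore WAN) themselves; nothing is forced.
config SYNCLINK_HDLC
	bool "Generic HDLC support (placeholder prompt)"
	depends on SYNCLINK && HDLC
```

A real tree would use only one of the two variants; they are shown side by
side here purely for comparison.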


* Re: 2.6.17-rc5-mm3
  2006-06-06  1:44   ` 2.6.17-rc5-mm3 Randy.Dunlap
@ 2006-06-06  1:54     ` Paul Fulghum
  2006-06-06  2:03       ` 2.6.17-rc5-mm3 Randy.Dunlap
  0 siblings, 1 reply; 52+ messages in thread
From: Paul Fulghum @ 2006-06-06  1:54 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: Dave Jones, akpm, linux-kernel, zippel

Randy.Dunlap wrote:
> Those Kconfig + Makefiles are quite ugly to me.  I would rather see
> SYNCLINK depend on HDLC rather than using some tricks to SELECT HDLC.
> And then it selects HDLC (and HDLC depends on WAN), but (in my case)
> WAN was not enabled, and doing "SELECT HDLC" did not enable WAN.
> 
> Adding SELECT WAN and changing the hdlc (wan) Makefile to use
> obj-m or obj-y (it was ONLY obj-y for hdlc) fixes^W makes it build
> with no missing symbols.  However, I'll also see about a fix
> that uses "depends on HDLC" instead of "selects HDLC".

Generic HDLC support in the synclink drivers is optional.
Should the generic HDLC code be enabled even if it is not used?

Some of our customers would scream if we started forcing
them to compile and load unused code.

> Fix many missing hdlc_generic symbols when CONFIG_HDLC=m.
> When Selecting HDLC, also Select WAN.
> Fix Makefile to build for HDLC=y or HDLC=m.
> 
> +	select WAN if SYNCLINK_HDLC

If this is the accepted approach, then synclink_cs should be added also.
(drivers/char/pcmcia)

What about select WAN if HDLC instead?
Or does kbuild not propagate the reverse dependency?
(SYNCLINK_HDLC selects HDLC, HDLC selects WAN)

--
Paul


* Re: 2.6.17-rc5-mm3
  2006-06-06  1:54     ` 2.6.17-rc5-mm3 Paul Fulghum
@ 2006-06-06  2:03       ` Randy.Dunlap
  2006-06-06  2:19         ` 2.6.17-rc5-mm3 Randy.Dunlap
  0 siblings, 1 reply; 52+ messages in thread
From: Randy.Dunlap @ 2006-06-06  2:03 UTC (permalink / raw)
  To: Paul Fulghum; +Cc: davej, akpm, linux-kernel, zippel

On Mon, 05 Jun 2006 20:54:51 -0500 Paul Fulghum wrote:

> Randy.Dunlap wrote:
> > Those Kconfig + Makefiles are quite ugly to me.  I would rather see
> > SYNCLINK depend on HDLC rather than using some tricks to SELECT HDLC.
> > And then it selects HDLC (and HDLC depends on WAN), but (in my case)
> > WAN was not enabled, and doing "SELECT HDLC" did not enable WAN.
> > 
> > Adding SELECT WAN and changing the hdlc (wan) Makefile to use
> > obj-m or obj-y (it was ONLY obj-y for hdlc) fixes^W makes it build
> > with no missing symbols.  However, I'll also see about a fix
> > that uses "depends on HDLC" instead of "selects HDLC".
> 
> Generic HDLC support in the synclink drivers is optional.
> Should the generic HDLC code be enabled even if it is not used?
> 
> Some of our customers would scream if we started forcing
> them to compile and load unused code.

OK, I'll try to allow for that.

> > Fix many missing hdlc_generic symbols when CONFIG_HDLC=m.
> > When Selecting HDLC, also Select WAN.
> > Fix Makefile to build for HDLC=y or HDLC=m.
> > 
> > +	select WAN if SYNCLINK_HDLC
> 
> If this is the accepted approach, then synclink_cs should be added also.
> (drivers/char/pcmcia)

It's not the desired approach AFAIK, but it may be the only
reasonable one.  I'm still testing alternatives, but you are welcome
to take over and fix it.  :)

> What about select WAN if HDLC instead?
> Or does kbuild not propagate the reverse dependency?
> (SYNCLINK_HDLC selects HDLC, HDLC selects WAN)

OK.

---
~Randy


* Re: 2.6.17-rc5-mm3
  2006-06-06  2:03       ` 2.6.17-rc5-mm3 Randy.Dunlap
@ 2006-06-06  2:19         ` Randy.Dunlap
  2006-06-06  2:35           ` 2.6.17-rc5-mm3 Paul Fulghum
  2006-06-06 13:30           ` 2.6.17-rc5-mm3 Paul Fulghum
  0 siblings, 2 replies; 52+ messages in thread
From: Randy.Dunlap @ 2006-06-06  2:19 UTC (permalink / raw)
  To: paulkf; +Cc: davej, akpm, linux-kernel, zippel

On Mon, 5 Jun 2006 19:03:55 -0700 Randy.Dunlap wrote:

> On Mon, 05 Jun 2006 20:54:51 -0500 Paul Fulghum wrote:
> 
> > Randy.Dunlap wrote:
> > > Those Kconfig + Makefiles are quite ugly to me.  I would rather see
> > > SYNCLINK depend on HDLC rather than using some tricks to SELECT HDLC.
> > > And then it selects HDLC (and HDLC depends on WAN), but (in my case)
> > > WAN was not enabled, and doing "SELECT HDLC" did not enable WAN.
> > > 
> > > Adding SELECT WAN and changing the hdlc (wan) Makefile to use
> > > obj-m or obj-y (it was ONLY obj-y for hdlc) fixes^W makes it build
> > > with no missing symbols.  However, I'll also see about a fix
> > > that uses "depends on HDLC" instead of "selects HDLC".
> > 
> > Generic HDLC support in the synclink drivers is optional.
> > Should the generic HDLC code be enabled even if it is not used?
> > 
> > Some of our customers would scream if we started forcing
> > them to compile and load unused code.
> 
> OK, I'll try to allow for that.
> 
> > > Fix many missing hdlc_generic symbols when CONFIG_HDLC=m.
> > > When Selecting HDLC, also Select WAN.
> > > Fix Makefile to build for HDLC=y or HDLC=m.
> > > 
> > > +	select WAN if SYNCLINK_HDLC
> > 
> > If this is the accepted approach, then synclink_cs should be added also.
> > (drivers/char/pcmcia)
> 
> It's not the desired approach AFAIK, but it may be the only
> reasonable one.  I'm still testing alternatives, but you are welcome
> to take over and fix it.  :)
> 
> > What about select WAN if HDLC instead?
> > Or does kbuild not propagate the reverse dependency?
> > (SYNCLINK_HDLC selects HDLC, HDLC selects WAN)
> 
> OK.

Hi Paul,
Here's another version of the patch for you to consider.
---

From: Randy Dunlap <rdunlap@xenotime.net>

Fix missing symbol references to hdlc_generic functions.
Switch SYNCLINK drivers from using SELECT to using DEPENDS for HDLC.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
---
 drivers/char/Kconfig        |    9 +++------
 drivers/char/pcmcia/Kconfig |    3 +--
 drivers/net/wan/Makefile    |    8 ++++++--
 3 files changed, 10 insertions(+), 10 deletions(-)

--- linux-2617-rc5mm3.orig/drivers/char/Kconfig
+++ linux-2617-rc5mm3/drivers/char/Kconfig
@@ -197,7 +197,6 @@ config ISI
 config SYNCLINK
 	tristate "SyncLink PCI/ISA support"
 	depends on SERIAL_NONSTANDARD && PCI && ISA_DMA_API
-	select HDLC if SYNCLINK_HDLC
 	help
 	  Driver for SyncLink ISA and PCI synchronous serial adapters.
 	  These adapters are no longer in production and have
@@ -205,7 +204,7 @@ config SYNCLINK
 
 config SYNCLINK_HDLC
 	bool "Generic HDLC support for SyncLink driver"
-	depends on SYNCLINK
+	depends on SYNCLINK && HDLC
 	help
 	  Enable generic HDLC support for the SyncLink PCI/ISA driver.
 	  Generic HDLC implements multiple higher layer networking
@@ -214,7 +213,6 @@ config SYNCLINK_HDLC
 config SYNCLINKMP
 	tristate "SyncLink Multiport support"
 	depends on SERIAL_NONSTANDARD && PCI
-	select HDLC if SYNCLINKMP_HDLC
 	help
 	  Driver for SyncLink Multiport (2 or 4 ports) PCI synchronous serial adapter.
 	  These adapters are no longer in production and have
@@ -222,7 +220,7 @@ config SYNCLINKMP
 
 config SYNCLINKMP_HDLC
 	bool "Generic HDLC support for SyncLink Multiport"
-	depends on SYNCLINKMP
+	depends on SYNCLINKMP && HDLC
 	help
 	  Enable generic HDLC support for the SyncLink Multiport driver.
 	  Generic HDLC implements multiple higher layer networking
@@ -231,7 +229,6 @@ config SYNCLINKMP_HDLC
 config SYNCLINK_GT
 	tristate "SyncLink GT/AC support"
 	depends on SERIAL_NONSTANDARD && PCI
-	select HDLC if SYNCLINK_GT_HDLC
 	help
 	  Support for SyncLink GT and SyncLink AC families of
 	  synchronous and asynchronous serial adapters
@@ -239,7 +236,7 @@ config SYNCLINK_GT
 
 config SYNCLINK_GT_HDLC
 	bool "Generic HDLC support for SyncLink GT/AC"
-	depends on SYNCLINK_GT
+	depends on SYNCLINK_GT && HDLC
 	help
 	  Enable generic HDLC support for the SyncLink GT/AC driver.
 	  Generic HDLC implements multiple higher layer networking
--- linux-2617-rc5mm3.orig/drivers/char/pcmcia/Kconfig
+++ linux-2617-rc5mm3/drivers/char/pcmcia/Kconfig
@@ -8,13 +8,12 @@ menu "PCMCIA character devices"
 config SYNCLINK_CS
 	tristate "SyncLink PC Card support"
 	depends on PCMCIA
-	select HDLC if SYNCLINK_CS_HDLC
 	help
 	  Driver for SyncLink PC Card synchronous serial adapter.
 
 config SYNCLINK_CS_HDLC
 	bool "Generic HDLC support for SyncLink Multiport"
-	depends on SYNCLINK_CS
+	depends on SYNCLINK_CS && HDLC
 	help
 	  Enable generic HDLC support for the SyncLink PC Card driver.
 	  Generic HDLC implements multiple higher layer networking
--- linux-2617-rc5mm3.orig/drivers/net/wan/Makefile
+++ linux-2617-rc5mm3/drivers/net/wan/Makefile
@@ -9,14 +9,18 @@ cyclomx-y                       := cycx_
 cyclomx-$(CONFIG_CYCLOMX_X25)	+= cycx_x25.o
 cyclomx-objs			:= $(cyclomx-y)  
 
-hdlc-y				:= hdlc_generic.o
+hdlc-$(CONFIG_HDLC)		:= hdlc_generic.o
 hdlc-$(CONFIG_HDLC_RAW)		+= hdlc_raw.o
 hdlc-$(CONFIG_HDLC_RAW_ETH)	+= hdlc_raw_eth.o
 hdlc-$(CONFIG_HDLC_CISCO)	+= hdlc_cisco.o
 hdlc-$(CONFIG_HDLC_FR)		+= hdlc_fr.o
 hdlc-$(CONFIG_HDLC_PPP)		+= hdlc_ppp.o
 hdlc-$(CONFIG_HDLC_X25)		+= hdlc_x25.o
-hdlc-objs			:= $(hdlc-y)
+ifeq ($(CONFIG_HDLC),y)
+  hdlc-objs			:= $(hdlc-y)
+else
+  hdlc-objs			:= $(hdlc-m)
+endif
 
 pc300-y				:= pc300_drv.o
 pc300-$(CONFIG_PC300_MLPPP)	+= pc300_tty.o


* Re: 2.6.17-rc5-mm3
  2006-06-06  2:19         ` 2.6.17-rc5-mm3 Randy.Dunlap
@ 2006-06-06  2:35           ` Paul Fulghum
  2006-06-06 13:30           ` 2.6.17-rc5-mm3 Paul Fulghum
  1 sibling, 0 replies; 52+ messages in thread
From: Paul Fulghum @ 2006-06-06  2:35 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: davej, akpm, linux-kernel, zippel

Randy.Dunlap wrote:
> On Mon, 5 Jun 2006 19:03:55 -0700 Randy.Dunlap wrote:
> Here's another version of the patch for you to consider.

This looks like the correct implementation of what I was trying (unsuccessfully)
to do when the random config errors were first reported.

I'll do testing tomorrow to make sure I understand this completely.

Thanks,
Paul


* Re: 2.6.17-rc5-mm3
  2006-06-04  6:20 2.6.17-rc5-mm3 Andrew Morton
                   ` (6 preceding siblings ...)
  2006-06-05 23:02 ` 2.6.17-rc5-mm3 Dave Jones
@ 2006-06-06  8:03 ` J.A. Magallón
  7 siblings, 0 replies; 52+ messages in thread
From: J.A. Magallón @ 2006-06-06  8:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Ingo Molnar

On Sat, 3 Jun 2006 23:20:04 -0700, Andrew Morton <akpm@osdl.org> wrote:

> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
> 
> - Lots of PCI and USB updates
> 
> - The various lock validator, stack backtracing and IRQ management problems
>   are converging, but we're not quite there yet.
> 

One more, could not find it already reported (if yes, sorry for the noise).
It is not in lockdep-combo as of 20060606.

ide-floppy driver 0.99.newide
stopped custom tracer.
BUG: warning at kernel/lockdep.c:1856/trace_hardirqs_on()
 [<c01034ba>] show_trace+0x12/0x14
 [<c0103b8d>] dump_stack+0x19/0x1b
 [<c0133c56>] trace_hardirqs_on+0x14d/0x152
 [<f88bcaa9>] idefloppy_pc_intr+0x192/0x6ca [ide_floppy]
 [<f89402e2>] ide_intr+0x74/0x1c7 [ide_core]
 [<c013d212>] handle_IRQ_event+0x2e/0x63
 [<c013e451>] handle_edge_irq+0xad/0x132
 [<c0104dc7>] do_IRQ+0x6c/0xa5
 =======================
 [<c0102ec1>] common_interrupt+0x25/0x2c
 [<c0103029>] error_code+0x39/0x40
hdb: No disk in drive
hdb: 244736kB, 239/64/32 CHS, 4096 kBps, 512 sector size, 2941 rpm
hdb: No disk in drive

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2007.0 (Cooker) for i586
Linux 2.6.16-jam19 (gcc 4.1.1 20060518 (prerelease)) #1 SMP PREEMPT Tue


* [patch, -rc5-mm3] better lock debugging: remove mutex deadlock checking code
  2006-06-05 20:00   ` 2.6.17-rc5-mm3 Randy.Dunlap
  2006-06-05 20:05     ` 2.6.17-rc5-mm3 Ingo Molnar
  2006-06-05 20:05     ` 2.6.17-rc5-mm3 Dave Jones
@ 2006-06-06  8:56     ` Ingo Molnar
  2006-06-06 11:40       ` Andy Whitcroft
  2006-06-07  9:17       ` Andy Whitcroft
  2 siblings, 2 replies; 52+ messages in thread
From: Ingo Molnar @ 2006-06-06  8:56 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: mbligh, akpm, apw, linux-kernel


* Randy.Dunlap <rdunlap@xenotime.net> wrote:

> BUG: unable to handle kernel paging request at virtual address 22222232

ok, this was a big thinko on my part, and it was right before our eyes. 
Mutex deadlock checking relied on the 'big mutex debugging lock', but 
that one is gone now - so mutex deadlock checking became racy (as your 
crashes nicely pinpointed). The races are more likely with an 
increasing number of CPUs.

so the patch below finishes the cleanup i started: it removes deadlock 
checking from the mutex code and lets the lock validator do that. This 
should also be (much) faster on SMP, because the lock validator is 
lockless in the fastpath. (if CONFIG_DEBUG_LOCKDEP is disabled)

	Ingo

----------------
Subject: better lock debugging: remove mutex deadlock checking code
From: Ingo Molnar <mingo@elte.hu>

with the lock validator we detect mutex deadlocks (and more), the mutex
deadlock checking code is both redundant and slower. So remove it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/mutex-debug.c |  126 ---------------------------------------------------
 lib/Kconfig.debug    |    8 ---
 2 files changed, 1 insertion(+), 133 deletions(-)

Index: linux/kernel/mutex-debug.c
===================================================================
--- linux.orig/kernel/mutex-debug.c
+++ linux/kernel/mutex-debug.c
@@ -23,128 +23,6 @@
 
 #include "mutex-debug.h"
 
-static void printk_task(struct task_struct *p)
-{
-	if (p)
-		printk("%16s:%5d [%p, %3d]", p->comm, p->pid, p, p->prio);
-	else
-		printk("<none>");
-}
-
-static void printk_ti(struct thread_info *ti)
-{
-	if (ti)
-		printk_task(ti->task);
-	else
-		printk("<none>");
-}
-
-static void printk_lock(struct mutex *lock, int print_owner)
-{
-#ifdef CONFIG_PROVE_MUTEX_LOCKING
-	printk(" [%p] {%s}\n", lock, lock->dep_map.name);
-#else
-	printk(" [%p]\n", lock);
-#endif
-
-	if (print_owner && lock->owner) {
-		printk(".. held by:  ");
-		printk_ti(lock->owner);
-		printk("\n");
-	}
-}
-
-static void report_deadlock(struct task_struct *task, struct mutex *lock,
-			    struct mutex *lockblk)
-{
-	printk("\n%s/%d is trying to acquire this lock:\n",
-		current->comm, current->pid);
-	printk_lock(lock, 1);
-	debug_show_held_locks(current);
-
-	if (lockblk) {
-		printk("but %s/%d is deadlocking current task %s/%d!\n\n",
-			task->comm, task->pid, current->comm, current->pid);
-		printk("\n%s/%d is blocked on this lock:\n",
-			task->comm, task->pid);
-		printk_lock(lockblk, 1);
-
-		debug_show_held_locks(task);
-
-		printk("\n%s/%d's [blocked] stackdump:\n\n",
-			task->comm, task->pid);
-		show_stack(task, NULL);
-	}
-
-	printk("\n%s/%d's [current] stackdump:\n\n",
-		current->comm, current->pid);
-	dump_stack();
-	debug_show_all_locks();
-	printk("[ turning off deadlock detection. Please report this. ]\n\n");
-	local_irq_disable();
-}
-
-/*
- * Recursively check for mutex deadlocks:
- */
-static int check_deadlock(struct mutex *lock, int depth, struct thread_info *ti)
-{
-	struct mutex *lockblk;
-	struct task_struct *task;
-
-	if (!debug_locks)
-		return 0;
-
-	ti = lock->owner;
-	if (!ti)
-		return 0;
-
-	task = ti->task;
-	/*
-	 * In the PROVE_MUTEX_LOCKING we are tracking all held
-	 * locks already, which allows us to optimize this:
-	 */
-#ifdef CONFIG_PROVE_MUTEX_LOCKING
-	if (!task->lockdep_depth)
-		return 0;
-#endif
-	lockblk = NULL;
-	if (task->blocked_on)
-		lockblk = task->blocked_on->lock;
-
-	/* Self-deadlock: */
-	if (current == task) {
-		debug_locks_off();
-		if (depth)
-			return 1;
-		printk("\n==========================================\n");
-		printk(  "[ BUG: lock recursion deadlock detected! |\n");
-		printk(  "------------------------------------------\n");
-		report_deadlock(task, lock, NULL);
-		return 0;
-	}
-
-	/* Ugh, something corrupted the lock data structure? */
-	if (depth > 20) {
-		debug_locks_off();
-		printk("\n===========================================\n");
-		printk(  "[ BUG: infinite lock dependency detected!? |\n");
-		printk(  "-------------------------------------------\n");
-		report_deadlock(task, lock, lockblk);
-		return 0;
-	}
-
-	/* Recursively check for dependencies: */
-	if (lockblk && check_deadlock(lockblk, depth+1, ti)) {
-		printk("\n============================================\n");
-		printk(  "[ BUG: circular locking deadlock detected! ]\n");
-		printk(  "--------------------------------------------\n");
-		report_deadlock(task, lock, lockblk);
-		return 0;
-	}
-	return 0;
-}
-
 /*
  * Must be called with lock->wait_lock held.
  */
@@ -178,9 +56,7 @@ void debug_mutex_add_waiter(struct mutex
 			    struct thread_info *ti)
 {
 	SMP_DEBUG_WARN_ON(!spin_is_locked(&lock->wait_lock));
-#ifdef CONFIG_DEBUG_MUTEX_DEADLOCKS
-	check_deadlock(lock, 0, ti);
-#endif
+
 	/* Mark the current thread as blocked on the lock: */
 	ti->task->blocked_on = waiter;
 	waiter->lock = lock;
Index: linux/lib/Kconfig.debug
===================================================================
--- linux.orig/lib/Kconfig.debug
+++ linux/lib/Kconfig.debug
@@ -164,14 +164,6 @@ config DEBUG_MUTEX_ALLOC
 	 (kfree(), kmem_cache_free(), free_pages(), vfree(), etc.),
 	 or whether there is any lock held during task exit.
 
-config DEBUG_MUTEX_DEADLOCKS
-	bool "Detect mutex related deadlocks"
-	default y
-	depends on DEBUG_MUTEXES
-	help
-	 This feature will automatically detect and report mutex related
-	 deadlocks, as they happen.
-
 config DEBUG_RT_MUTEXES
 	bool "RT Mutex debugging, deadlock detection"
 	default y
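
As an illustration of what the removed check_deadlock() was doing, here is a
minimal userspace sketch of the same idea: follow the chain "lock -> owner
-> lock the owner is blocked on" and see whether it leads back to the task
that is about to wait. The types and names (toy_task, toy_lock,
would_deadlock, MAX_CHAIN) are invented for this sketch and it is written
iteratively rather than recursively. It is single-threaded on purpose: the
walk is only meaningful if nothing modifies the chain while it is being
followed, which is the guarantee the 'big mutex debugging lock' mentioned
above used to provide.

```c
#include <assert.h>
#include <stddef.h>

struct toy_lock;

/* Hypothetical stand-in for a task: it may be waiting for one lock. */
struct toy_task {
	struct toy_lock *blocked_on;	/* lock this task waits for, or NULL */
};

/* Hypothetical stand-in for a mutex: it may be held by one task. */
struct toy_lock {
	struct toy_task *owner;		/* task holding the lock, or NULL */
};

#define MAX_CHAIN 20	/* give up after this many hops, as a sanity limit */

/*
 * Would 'waiter' deadlock by blocking on 'lock'?
 *
 * Returns  1 if the ownership chain leads back to 'waiter',
 *          0 if the chain ends at a free lock or a runnable owner,
 *         -1 if the chain is longer than MAX_CHAIN hops (for example a
 *            cycle that does not include 'waiter', or corrupted data).
 */
static int would_deadlock(const struct toy_task *waiter,
			  const struct toy_lock *lock)
{
	int depth;

	for (depth = 0; depth < MAX_CHAIN; depth++) {
		const struct toy_task *owner;

		if (!lock)
			return 0;	/* previous owner is not blocked */
		owner = lock->owner;
		if (!owner)
			return 0;	/* lock is free */
		if (owner == waiter)
			return 1;	/* chain closes on the waiter */
		lock = owner->blocked_on;
	}
	return -1;
}
```

With task A holding lock X and waiting for lock Y, and task B holding Y,
would_deadlock(&B, &X) reports 1. If another CPU could change owner or
blocked_on in the middle of that loop, the walk could follow stale pointers,
which is the kind of race described in the mail.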


* Re: 2.6.17-rc5-mm3
  2006-06-05 18:54   ` 2.6.17-rc5-mm3 Andrew Morton
@ 2006-06-06  9:43     ` Mel Gorman
  2006-06-06 10:57     ` 2.6.17-rc5-mm3 Mel Gorman
  1 sibling, 0 replies; 52+ messages in thread
From: Mel Gorman @ 2006-06-06  9:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, netdev

On Mon, 5 Jun 2006, Andrew Morton wrote:

> On Mon, 5 Jun 2006 18:56:37 +0100
> mel@csn.ul.ie (Mel Gorman) wrote:
>
>>
>> I am seeing more networking-related funniness with 2.6.17-rc5-mm3 on the
>> same machine previously fixed by git-net-llc-fix.patch. The console log is
>> below. I've done no investigation work in case it's a known problem.
>
> It's not a known problem, afaik.
>
>> ...
>> Starting anacron: [  OK  ]
>> Starting atd: [  OK  ]
>> Starting Avahi daemon: [  OK  ]
>> Starting cups-config-daemon: [  OK  ]
>> Starting HAL daemon: [  OK  ]
>> Fedora Core release 5 (Bordeaux)
>> Kernel 2.6.17-rc5-mm2-autokern1 on an x86_64
>> bl6-13.ltc.austin.ibm.com login: -- 0:conmux-control -- time-stamp -- Jun/05/06 10:47:46 --
>> -- 0:conmux-control -- time-stamp -- Jun/05/06 10:51:12 --
>> BUG: warning at include/net/dst.h:153/dst_release()
>> Call Trace:
>>  <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
>>  [<ffffffff81199568>] tg3_poll+0x1a1/0x94f
>>  [<ffffffff8122d80c>] net_rx_action+0xac/0x160
>>  [<ffffffff81037904>] __do_softirq+0x48/0xb4
>>  [<ffffffff8100a496>] call_softirq+0x1e/0x28
>>  [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
>>  [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
>>  [<ffffffff81007807>] default_idle+0x0/0x54
>>  [<ffffffff810097b8>] ret_from_intr+0x0/0xb
>>  <EOI>
>> Attempt to release alive inet socket ffff81003f8b2780
>> BUG: warning at include/net/dst.h:153/dst_release()
>> Call Trace:
>>  <IRQ> [<ffffffff81228274>] __kfree_skb+0x3c/0xbd
>>  [<ffffffff81268fc4>] icmp_rcv+0x17c/0x184
>>  [<ffffffff812484ca>] ip_local_deliver+0xfe/0x1bf
>>  [<ffffffff812489bf>] ip_rcv+0x434/0x475
>>  [<ffffffff8122d615>] netif_receive_skb+0x2c6/0x2e5
>>  [<ffffffff81199add>] tg3_poll+0x716/0x94f
>>  [<ffffffff8122d80c>] net_rx_action+0xac/0x160<7>Losing some ticks... checking if CPU frequency changed.
>>  [<ffffffff81037904>] __do_softirq+0x48/0xb4
>>  [<ffffffff8100a496>] call_softirq+0x1e/0x28
>>  [<ffffffff8100b84e>] do_softirq+0x2c/0x7e
>>  [<ffffffff8100b6c8>] do_IRQ+0x50/0x59
>>  [<ffffffff81007807>] default_idle+0x0/0x54
>>  [<ffffffff810097b8>] ret_from_intr+0x0/0xb
>
> There are quite a few changes in the net tree.  I guess the first thing to
> investigate would be 2.6.17-rc5+origin.patch+git-net.patch.
>

That survived long enough to build a kernel, but backing out git-net on 
top of mm like I did for the LLC bug also survived. Not sure what is going 
on.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: 2.6.17-rc5-mm3
  2006-06-05 20:06   ` 2.6.17-rc5-mm3 Andrew Morton
  2006-06-05 20:09     ` 2.6.17-rc5-mm3 Dave Jones
@ 2006-06-06 10:15     ` Takashi Iwai
  1 sibling, 0 replies; 52+ messages in thread
From: Takashi Iwai @ 2006-06-06 10:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dave Jones, linux-kernel, Jaroslav Kysela

At Mon, 5 Jun 2006 13:06:26 -0700,
Andrew Morton wrote:
> 
> On Mon, 5 Jun 2006 15:48:45 -0400
> Dave Jones <davej@redhat.com> wrote:
> 
> > On Sat, Jun 03, 2006 at 11:20:04PM -0700, Andrew Morton wrote:
> >  > 
> >  > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/
> >  > 
> >  > - Lots of PCI and USB updates
> >  > 
> >  > - The various lock validator, stack backtracing and IRQ management problems
> >  >   are converging, but we're not quite there yet.
> > 
> > Thought I'd try my bi-annual "poke at -mm". Results were less
> > than spectacular.
> > 
> > http://www.codemonkey.org.uk/junk/DSC00347.JPG
> > First the sound driver oopsed.
> 
> That's a bug in sound/pci/cs4281.c.
> 
> There's a debug patch in -mm
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm3/broken-out/debug-shared-irqs.patch
> which trips up drivers which request an IRQ before their IRQ handler is
> ready to accept IRQs (they'll crash in real life if the IRQ is shared).

I guess the bug in cs4281 is that ioremap is issued too late, after the
irq handler has been registered.

Does the patch below fix the problem?

Takashi


[PATCH] Fix possible Oops in cs4281 irq handler

Call ioremap before request_irq to avoid a possible Oops
in the cs4281 driver.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
diff -r 84d14cbbd713 sound/pci/cs4281.c
--- a/sound/pci/cs4281.c	Fri Jun 02 09:15:44 2006 +0200
+++ b/sound/pci/cs4281.c	Tue Jun 06 12:11:56 2006 +0200
@@ -1379,6 +1379,13 @@ static int __devinit snd_cs4281_create(s
 	chip->ba0_addr = pci_resource_start(pci, 0);
 	chip->ba1_addr = pci_resource_start(pci, 1);
 
+	chip->ba0 = ioremap_nocache(chip->ba0_addr, pci_resource_len(pci, 0));
+	chip->ba1 = ioremap_nocache(chip->ba1_addr, pci_resource_len(pci, 1));
+	if (!chip->ba0 || !chip->ba1) {
+		snd_cs4281_free(chip);
+		return -ENOMEM;
+	}
+	
 	if (request_irq(pci->irq, snd_cs4281_interrupt, SA_INTERRUPT|SA_SHIRQ,
 			"CS4281", chip)) {
 		snd_printk(KERN_ERR "unable to grab IRQ %d\n", pci->irq);
@@ -1387,13 +1394,6 @@ static int __devinit snd_cs4281_create(s
 	}
 	chip->irq = pci->irq;
 
-	chip->ba0 = ioremap_nocache(chip->ba0_addr, pci_resource_len(pci, 0));
-	chip->ba1 = ioremap_nocache(chip->ba1_addr, pci_resource_len(pci, 1));
-	if (!chip->ba0 || !chip->ba1) {
-		snd_cs4281_free(chip);
-		return -ENOMEM;
-	}
-	
 	tmp = snd_cs4281_chip_init(chip);
 	if (tmp) {
 		snd_cs4281_free(chip);
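
The ordering problem this patch addresses can be shown with a small
userspace model. It assumes, for the sake of the sketch, that a handler on a
shared line may be invoked as soon as it is registered. All names here
(toy_chip, toy_request_irq, toy_interrupt and the two probe functions) are
hypothetical and do not correspond to the driver's or the kernel's
functions; the handler records the problem instead of dereferencing an
unmapped pointer so that the example can run safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device state. */
struct toy_chip {
	unsigned int *regs;	/* stands in for the ioremap()ed registers */
	int saw_unmapped;	/* set if the handler ran before regs was valid */
};

typedef void (*toy_handler_t)(void *dev);

/*
 * Toy registration: calls the handler once straight away, the way another
 * device sharing the line could trigger it right after registration.
 */
static int toy_request_irq(toy_handler_t handler, void *dev)
{
	handler(dev);
	return 0;
}

/* Toy handler: needs the register window to do its job. */
static void toy_interrupt(void *dev)
{
	struct toy_chip *chip = dev;

	if (!chip->regs) {
		/* A real handler would dereference the pointer and oops. */
		chip->saw_unmapped = 1;
		return;
	}
	(void)chip->regs[0];	/* read the "status register" */
}

static unsigned int toy_register_window[4];

/* Order after the fix: map the registers, then register the handler. */
static int toy_probe_map_first(struct toy_chip *chip)
{
	chip->regs = toy_register_window;
	return toy_request_irq(toy_interrupt, chip);
}

/* Order before the fix: the handler can run while regs is still NULL. */
static int toy_probe_irq_first(struct toy_chip *chip)
{
	int err = toy_request_irq(toy_interrupt, chip);

	chip->regs = toy_register_window;
	return err;
}
```

Running both probe variants on a zeroed toy_chip leaves saw_unmapped at 0
for the map-first order and sets it to 1 for the irq-first order.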


* Re: 2.6.17-rc5-mm3
  2006-06-05 18:54   ` 2.6.17-rc5-mm3 Andrew Morton
  2006-06-06  9:43     ` 2.6.17-rc5-mm3 Mel Gorman
@ 2006-06-06 10:57     ` Mel Gorman
  1 sibling, 0 replies; 52+ messages in thread
From: Mel Gorman @ 2006-06-06 10:57 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, netdev

On Mon, 5 Jun 2006, Andrew Morton wrote:

> On Mon, 5 Jun 2006 18:56:37 +0100
> mel@csn.ul.ie (Mel Gorman) wrote:
>
>>
>> I am seeing more networking-related funniness with 2.6.17-rc5-mm3 on the
>> same machine previously fixed by git-net-llc-fix.patch. The console log is
>> below. I've done no investigation work in case it's a known problem.
>
> It's not a known problem, afaik.
>
>> ...
>> Starting anacron: [  OK  ]
>> Starting atd: [  OK  ]
>> Starting Avahi daemon: [  OK  ]
>> Starting cups-config-daemon: [  OK  ]
>> Starting HAL daemon: [  OK  ]
>> Fedora Core release 5 (Bordeaux)
>> Kernel 2.6.17-rc5-mm2-autokern1 on an x86_64

Bah, I'm a spanner. The patches I was testing were rebased to the latest 
-mm, but the kernel version they were then tested on was not changed. This 
was probably the LLC bug with a different shaped error and the first set 
of tests are passing with -mm3. Sorry for the noise.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: [patch, -rc5-mm3] better lock debugging: remove mutex deadlock checking code
  2006-06-06  8:56     ` [patch, -rc5-mm3] better lock debugging: remove mutex deadlock checking code Ingo Molnar
@ 2006-06-06 11:40       ` Andy Whitcroft
  2006-06-06 17:17         ` Andy Whitcroft
  2006-06-07  9:17       ` Andy Whitcroft
  1 sibling, 1 reply; 52+ messages in thread
From: Andy Whitcroft @ 2006-06-06 11:40 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Randy.Dunlap, mbligh, akpm, linux-kernel

Ingo Molnar wrote:
> * Randy.Dunlap <rdunlap@xenotime.net> wrote:
> 
> 
>>BUG: unable to handle kernel paging request at virtual address 22222232
> 
> 
> ok, this was a big thinko on my part, and it was right before our eyes. 
> Mutex deadlock checking relied on the 'big mutex debugging lock', but 
> that one is gone now - so mutex deadlock checking became racy (as your 
> crashes nicely pinpointed that). The races are more likely with an 
> increasing number of CPUs.
> 
> so the patch below finishes the cleanup i started: it removes deadlock 
> checking from the mutex code and lets the lock validator do that. This 
> should also be (much) faster on SMP, because the lock validator is 
> lockless in the fastpath. (if CONFIG_DEBUG_LOCKDEP is disabled)
> 
> 	Ingo
> 
> ----------------
> Subject: better lock debugging: remove mutex deadlock checking code
> From: Ingo Molnar <mingo@elte.hu>
> 
> with the lock validator we detect mutex deadlocks (and more), the mutex
> deadlock checking code is both redundant and slower. So remove it.
> 
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  kernel/mutex-debug.c |  126 ---------------------------------------------------
>  lib/Kconfig.debug    |    8 ---
>  2 files changed, 1 insertion(+), 133 deletions(-)
> 
> Index: linux/kernel/mutex-debug.c
> ===================================================================
> --- linux.orig/kernel/mutex-debug.c
> +++ linux/kernel/mutex-debug.c
> @@ -23,128 +23,6 @@
>  
>  #include "mutex-debug.h"
>  
> -static void printk_task(struct task_struct *p)
> -{
> -	if (p)
> -		printk("%16s:%5d [%p, %3d]", p->comm, p->pid, p, p->prio);
> -	else
> -		printk("<none>");
> -}
> -
> -static void printk_ti(struct thread_info *ti)
> -{
> -	if (ti)
> -		printk_task(ti->task);
> -	else
> -		printk("<none>");
> -}
> -
> -static void printk_lock(struct mutex *lock, int print_owner)
> -{
> -#ifdef CONFIG_PROVE_MUTEX_LOCKING
> -	printk(" [%p] {%s}\n", lock, lock->dep_map.name);
> -#else
> -	printk(" [%p]\n", lock);
> -#endif
> -
> -	if (print_owner && lock->owner) {
> -		printk(".. held by:  ");
> -		printk_ti(lock->owner);
> -		printk("\n");
> -	}
> -}
> -
> -static void report_deadlock(struct task_struct *task, struct mutex *lock,
> -			    struct mutex *lockblk)
> -{
> -	printk("\n%s/%d is trying to acquire this lock:\n",
> -		current->comm, current->pid);
> -	printk_lock(lock, 1);
> -	debug_show_held_locks(current);
> -
> -	if (lockblk) {
> -		printk("but %s/%d is deadlocking current task %s/%d!\n\n",
> -			task->comm, task->pid, current->comm, current->pid);
> -		printk("\n%s/%d is blocked on this lock:\n",
> -			task->comm, task->pid);
> -		printk_lock(lockblk, 1);
> -
> -		debug_show_held_locks(task);
> -
> -		printk("\n%s/%d's [blocked] stackdump:\n\n",
> -			task->comm, task->pid);
> -		show_stack(task, NULL);
> -	}
> -
> -	printk("\n%s/%d's [current] stackdump:\n\n",
> -		current->comm, current->pid);
> -	dump_stack();
> -	debug_show_all_locks();
> -	printk("[ turning off deadlock detection. Please report this. ]\n\n");
> -	local_irq_disable();
> -}
> -
> -/*
> - * Recursively check for mutex deadlocks:
> - */
> -static int check_deadlock(struct mutex *lock, int depth, struct thread_info *ti)
> -{
> -	struct mutex *lockblk;
> -	struct task_struct *task;
> -
> -	if (!debug_locks)
> -		return 0;
> -
> -	ti = lock->owner;
> -	if (!ti)
> -		return 0;
> -
> -	task = ti->task;
> -	/*
> -	 * In the PROVE_MUTEX_LOCKING we are tracking all held
> -	 * locks already, which allows us to optimize this:
> -	 */
> -#ifdef CONFIG_PROVE_MUTEX_LOCKING
> -	if (!task->lockdep_depth)
> -		return 0;
> -#endif
> -	lockblk = NULL;
> -	if (task->blocked_on)
> -		lockblk = task->blocked_on->lock;
> -
> -	/* Self-deadlock: */
> -	if (current == task) {
> -		debug_locks_off();
> -		if (depth)
> -			return 1;
> -		printk("\n==========================================\n");
> -		printk(  "[ BUG: lock recursion deadlock detected! |\n");
> -		printk(  "------------------------------------------\n");
> -		report_deadlock(task, lock, NULL);
> -		return 0;
> -	}
> -
> -	/* Ugh, something corrupted the lock data structure? */
> -	if (depth > 20) {
> -		debug_locks_off();
> -		printk("\n===========================================\n");
> -		printk(  "[ BUG: infinite lock dependency detected!? |\n");
> -		printk(  "-------------------------------------------\n");
> -		report_deadlock(task, lock, lockblk);
> -		return 0;
> -	}
> -
> -	/* Recursively check for dependencies: */
> -	if (lockblk && check_deadlock(lockblk, depth+1, ti)) {
> -		printk("\n============================================\n");
> -		printk(  "[ BUG: circular locking deadlock detected! ]\n");
> -		printk(  "--------------------------------------------\n");
> -		report_deadlock(task, lock, lockblk);
> -		return 0;
> -	}
> -	return 0;
> -}
> -
>  /*
>   * Must be called with lock->wait_lock held.
>   */
> @@ -178,9 +56,7 @@ void debug_mutex_add_waiter(struct mutex
>  			    struct thread_info *ti)
>  {
>  	SMP_DEBUG_WARN_ON(!spin_is_locked(&lock->wait_lock));
> -#ifdef CONFIG_DEBUG_MUTEX_DEADLOCKS
> -	check_deadlock(lock, 0, ti);
> -#endif
> +
>  	/* Mark the current thread as blocked on the lock: */
>  	ti->task->blocked_on = waiter;
>  	waiter->lock = lock;
> Index: linux/lib/Kconfig.debug
> ===================================================================
> --- linux.orig/lib/Kconfig.debug
> +++ linux/lib/Kconfig.debug
> @@ -164,14 +164,6 @@ config DEBUG_MUTEX_ALLOC
>  	 (kfree(), kmem_cache_free(), free_pages(), vfree(), etc.),
>  	 or whether there is any lock held during task exit.
>  
> -config DEBUG_MUTEX_DEADLOCKS
> -	bool "Detect mutex related deadlocks"
> -	default y
> -	depends on DEBUG_MUTEXES
> -	help
> -	 This feature will automatically detect and report mutex related
> -	 deadlocks, as they happen.
> -
>  config DEBUG_RT_MUTEXES
>  	bool "RT Mutex debugging, deadlock detection"
>  	default y

I'll shove this one in for testing too.  Results on TKO as I have them.

-apw
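
For readers following the thread, the check that the patch above removes
boils down to walking a chain: the lock's owner, the lock that owner is
blocked on, that lock's owner, and so on, until the chain ends or comes
back to the current task. Below is a minimal user-space sketch of that
walk in C. It is not the kernel code: struct toy_task, struct toy_mutex
and toy_would_deadlock() are hypothetical names, the walk is iterative
where check_deadlock() recursed, blocked_on points straight at the mutex
rather than at a waiter, and it assumes nothing else modifies the chain
while it is being read (the assumption that, per Ingo's note, stopped
holding once the big mutex debugging lock went away).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 20	/* same cut-off as the removed check_deadlock() */

struct toy_mutex;

/* Hypothetical stand-in for task_struct: only the field the walk needs. */
struct toy_task {
	struct toy_mutex *blocked_on;	/* mutex this task sleeps on, or NULL */
};

/* Hypothetical stand-in for struct mutex. */
struct toy_mutex {
	struct toy_task *owner;		/* current holder, or NULL if free */
};

/*
 * Would 'self' deadlock by blocking on 'lock'?
 * Returns 1 if the owner chain leads back to 'self', -1 if the chain is
 * longer than TOY_MAX_DEPTH (treated as corruption), 0 otherwise.
 */
static int toy_would_deadlock(const struct toy_task *self,
			      const struct toy_mutex *lock)
{
	int depth;

	assert(self != NULL);

	for (depth = 0; lock != NULL; depth++) {
		const struct toy_task *owner = lock->owner;

		if (depth > TOY_MAX_DEPTH)
			return -1;	/* chain never ends: give up */
		if (owner == NULL)
			return 0;	/* lock is free: nobody to wait for */
		if (owner == self)
			return 1;	/* chain closed on ourselves */
		lock = owner->blocked_on;	/* follow what the owner waits for */
	}
	return 0;	/* last owner is not blocked, so the chain ends */
}
```

With task A holding m1 and task B holding m2 while blocked on m1,
toy_would_deadlock(&A, &m2) returns 1: the chain m2 -> B -> m1 -> A
leads back to the caller.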


* Re: 2.6.17-rc5-mm3
  2006-06-06  2:19         ` 2.6.17-rc5-mm3 Randy.Dunlap
  2006-06-06  2:35           ` 2.6.17-rc5-mm3 Paul Fulghum
@ 2006-06-06 13:30           ` Paul Fulghum
  1 sibling, 0 replies; 52+ messages in thread
From: Paul Fulghum @ 2006-06-06 13:30 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: davej, akpm, linux-kernel, zippel

Randy.Dunlap wrote:
> Here's another version of the patch for you to consider.
> ---
> --- linux-2617-rc5mm3.orig/drivers/char/Kconfig
> +++ linux-2617-rc5mm3/drivers/char/Kconfig
> @@ -197,7 +197,6 @@ config ISI
>  config SYNCLINK
>  	tristate "SyncLink PCI/ISA support"
>  	depends on SERIAL_NONSTANDARD && PCI && ISA_DMA_API
> -	select HDLC if SYNCLINK_HDLC
>  	help
>  	  Driver for SyncLink ISA and PCI synchronous serial adapters.
>  	  These adapters are no longer in production and have
> @@ -205,7 +204,7 @@ config SYNCLINK
>  
>  config SYNCLINK_HDLC
>  	bool "Generic HDLC support for SyncLink driver"
> -	depends on SYNCLINK
> +	depends on SYNCLINK && HDLC
>  	help
>  	  Enable generic HDLC support for the SyncLink PCI/ISA driver.
>  	  Generic HDLC implements multiple higher layer networking

Now I remember that this was tried before and does
not work: SYNCLINK_HDLC is a bool, so it will
force the HDLC module to 'y' even if the synclink
driver is 'm', which results in build errors.

I also tried 'depends on HDLC if SYNCLINK_HDLC'
in the config SYNCLINK section, but that causes a
cyclic dependency error. I suppose I could do that if I
remove 'depends on SYNCLINK' from config SYNCLINK_HDLC.
The only downside of that is the way the
SYNCLINK_HDLC option would be displayed.

I'll review this again to find the best solution.

-- 
Paul Fulghum
Microgate Systems, Ltd.
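
As an aside for readers less familiar with Kconfig, the effect described
above (a bool symbol pulling HDLC up to 'y') follows from how "select"
sets a lower limit on the selected symbol. The sketch below is a
simplified model in C, not the kconfig implementation: the enum and the
toy_* helpers are hypothetical names, and only the n < m < y ordering,
"&&" as minimum and "select" as a floor are modelled.

```c
/* Tristate values, ordered so that n < m < y. */
enum toy_tristate { TOY_N = 0, TOY_M = 1, TOY_Y = 2 };

/* Kconfig "A && B" evaluates to the smaller of the two values. */
static enum toy_tristate toy_and(enum toy_tristate a, enum toy_tristate b)
{
	return a < b ? a : b;
}

/* Kconfig "A || B" evaluates to the larger of the two values. */
static enum toy_tristate toy_or(enum toy_tristate a, enum toy_tristate b)
{
	return a > b ? a : b;
}

/*
 * "select X if COND" written inside symbol SEL gives X a lower limit
 * of SEL && COND.  An unconditional select is the case COND == y.
 */
static enum toy_tristate toy_select_floor(enum toy_tristate sel,
					  enum toy_tristate cond)
{
	return toy_and(sel, cond);
}

/* Value X ends up with: the user's choice, raised to the floor. */
static enum toy_tristate toy_effective(enum toy_tristate user,
				       enum toy_tristate floor)
{
	return toy_or(user, floor);
}
```

In this model a select placed in a bool symbol that is set to y gives a
floor of y, so a user choice of m for HDLC becomes y. A
"select HDLC if SYNCLINK_HDLC" placed in the tristate SYNCLINK symbol
with SYNCLINK=m gives a floor of only m.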


* Re: [PATCH] poison: add & use more constants
  2006-06-05 20:54         ` [PATCH] poison: add & use more constants Randy.Dunlap
@ 2006-06-06 13:33           ` Steven Rostedt
  0 siblings, 0 replies; 52+ messages in thread
From: Steven Rostedt @ 2006-06-06 13:33 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: davej, mingo, mbligh, akpm, apw, linux-kernel

On Mon, 2006-06-05 at 13:54 -0700, Randy.Dunlap wrote:
> From: Randy Dunlap <rdunlap@xenotime.net>
> 
> Add more poison values to include/linux/poison.h.
> It's not clear to me whether some others should be added or not,
> so I haven't added any of these:
> 
> ./include/linux/libata.h:#define ATA_TAG_POISON		0xfafbfcfdU
> ./arch/ppc/8260_io/fcc_enet.c:1918:	memset((char *)(&(immap->im_dprambase[(mem_addr+64)])), 0x88, 32);
> ./drivers/usb/mon/mon_text.c:429:	memset(mem, 0xe5, sizeof(struct mon_event_text));
> ./drivers/char/ftape/lowlevel/ftape-ctl.c:738:		memset(ft_buffer[i]->address, 0xAA, FT_BUFF_SIZE);
> ./drivers/block/sx8.c:/* 0xf is just arbitrary, non-zero noise; this is sorta like poisoning */

You don't have my personal favorite?  AIX would poison pages
with 0xdeadbeef  :)

-- Steve
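
For readers new to the idea, poisoning means filling memory that should
not be in use with a recognisable pattern, so that a stray write (or a
use of stale data) stands out. A minimal user-space sketch in C follows,
using the 0xdeadbeef value mentioned above. TOY_POISON_WORD, toy_poison()
and toy_first_damaged() are hypothetical names and are not taken from
include/linux/poison.h.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_POISON_WORD 0xdeadbeefU	/* hypothetical name, value from the mail above */

/* Fill a buffer of 32-bit words with the poison pattern. */
static void toy_poison(uint32_t *buf, size_t nwords)
{
	size_t i;

	for (i = 0; i < nwords; i++)
		buf[i] = TOY_POISON_WORD;
}

/*
 * Return the index of the first word that no longer holds the pattern,
 * or nwords if the whole buffer is still intact.
 */
static size_t toy_first_damaged(const uint32_t *buf, size_t nwords)
{
	size_t i;

	for (i = 0; i < nwords; i++)
		if (buf[i] != TOY_POISON_WORD)
			break;
	return i;
}
```

A caller would poison a buffer when releasing it and check it again
before reuse; any index below nwords points at the first word that was
written to in between.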




* Re: [patch, -rc5-mm3] better lock debugging: remove mutex deadlock checking code
  2006-06-06 11:40       ` Andy Whitcroft
@ 2006-06-06 17:17         ` Andy Whitcroft
  2006-06-06 19:29           ` Ingo Molnar
  0 siblings, 1 reply; 52+ messages in thread
From: Andy Whitcroft @ 2006-06-06 17:17 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: Ingo Molnar, Randy.Dunlap, mbligh, akpm, linux-kernel

Andy Whitcroft wrote:
> 
> I'll shove this one in for testing too.  Results on TKO as I have them.
> 
> -apw
> 

This is definitely clearing up a bunch of problems with the current -mm.

-apw


* Re: [patch, -rc5-mm3] better lock debugging: remove mutex deadlock checking code
  2006-06-06 17:17         ` Andy Whitcroft
@ 2006-06-06 19:29           ` Ingo Molnar
  0 siblings, 0 replies; 52+ messages in thread
From: Ingo Molnar @ 2006-06-06 19:29 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: Randy.Dunlap, mbligh, akpm, linux-kernel


* Andy Whitcroft <apw@shadowen.org> wrote:

> > I'll shove this one in for testing too.  Results on TKO as I have them.
> 
> This is definitely clearing up a bunch of problems with the current 
> -mm.

great! Thanks for testing this out, this bug was the scariest pending 
one.

	Ingo


* Re: [patch, -rc5-mm3] better lock debugging: remove mutex deadlock checking code
  2006-06-06  8:56     ` [patch, -rc5-mm3] better lock debugging: remove mutex deadlock checking code Ingo Molnar
  2006-06-06 11:40       ` Andy Whitcroft
@ 2006-06-07  9:17       ` Andy Whitcroft
  1 sibling, 0 replies; 52+ messages in thread
From: Andy Whitcroft @ 2006-06-07  9:17 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Randy.Dunlap, mbligh, akpm, linux-kernel

Ingo Molnar wrote:
> * Randy.Dunlap <rdunlap@xenotime.net> wrote:
> 
> 
>>BUG: unable to handle kernel paging request at virtual address 22222232
> 
> 
> ok, this was a big thinko on my part, and it was right before our eyes. 
> Mutex deadlock checking relied on the 'big mutex debugging lock', but 
> that one is gone now - so mutex deadlock checking became racy (as your 
> crashes nicely pinpointed that). The races are more likely with an 
> increasing number of CPUs.
> 
> so the patch below finishes the cleanup i started: it removes deadlock 
> checking from the mutex code and lets the lock validator do that. This 
> should also be (much) faster on SMP, because the lock validator is 
> lockless in the fastpath. (if CONFIG_DEBUG_LOCKDEP is disabled)
> 
> 	Ingo
> 
> ----------------
> Subject: better lock debugging: remove mutex deadlock checking code
> From: Ingo Molnar <mingo@elte.hu>
> 
> with the lock validator we detect mutex deadlocks (and more), the mutex
> deadlock checking code is both redundant and slower. So remove it.
> 
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---

Ok, this patch in combination with either fix for the swap max bug is
showing passes across the board.

Acked-by: Andy Whitcroft <apw@shadowen.org>

-apw


end of thread, other threads:[~2006-06-07  9:17 UTC | newest]

Thread overview: 52+ messages
2006-06-05 16:30 2.6.17-rc5-mm3 Martin Bligh
2006-06-05 19:44 ` 2.6.17-rc5-mm3 Ingo Molnar
2006-06-05 20:00   ` 2.6.17-rc5-mm3 Randy.Dunlap
2006-06-05 20:05     ` 2.6.17-rc5-mm3 Ingo Molnar
2006-06-05 20:05     ` 2.6.17-rc5-mm3 Dave Jones
2006-06-05 20:08       ` 2.6.17-rc5-mm3 Ingo Molnar
2006-06-05 20:14       ` 2.6.17-rc5-mm3 Randy.Dunlap
2006-06-05 20:54         ` [PATCH] poison: add & use more constants Randy.Dunlap
2006-06-06 13:33           ` Steven Rostedt
2006-06-06  8:56     ` [patch, -rc5-mm3] better lock debugging: remove mutex deadlock checking code Ingo Molnar
2006-06-06 11:40       ` Andy Whitcroft
2006-06-06 17:17         ` Andy Whitcroft
2006-06-06 19:29           ` Ingo Molnar
2006-06-07  9:17       ` Andy Whitcroft
  -- strict thread matches above, loose matches on Subject: below --
2006-06-04  6:20 2.6.17-rc5-mm3 Andrew Morton
2006-06-04  9:38 ` 2.6.17-rc5-mm3 Barry K. Nathan
2006-06-04  9:49   ` 2.6.17-rc5-mm3 Andrew Morton
2006-06-04 10:08     ` 2.6.17-rc5-mm3 Michal Piotrowski
2006-06-04 10:41       ` 2.6.17-rc5-mm3 Ingo Molnar
2006-06-04 20:38         ` 2.6.17-rc5-mm3 Valdis.Kletnieks
     [not found]         ` <6bffcb0e0606040407u4f56f7fdyf5ec479314afc082@mail.gmail.com>
2006-06-04 21:38           ` 2.6.17-rc5-mm3 Ingo Molnar
2006-06-04 22:35             ` 2.6.17-rc5-mm3 Michal Piotrowski
2006-06-04 18:20 ` 2.6.17-rc5-mm3 Rafael J. Wysocki
2006-06-04 23:15 ` 2.6.17-rc5-mm3 J.A. Magallón
2006-06-04 23:42   ` 2.6.17-rc5-mm3 Andrew Morton
2006-06-05  6:02   ` 2.6.17-rc5-mm3 Valdis.Kletnieks
2006-06-05  8:04   ` 2.6.17-rc5-mm3 Arjan van de Ven
2006-06-04 23:28 ` 2.6.17-rc5-mm3 J.A. Magallón
2006-06-05  0:06   ` 2.6.17-rc5-mm3 Barry K. Nathan
2006-06-05  0:25   ` 2.6.17-rc5-mm3 Grant Coady
2006-06-05  0:45   ` 2.6.17-rc5-mm3 Grant Coady
2006-06-05  9:12   ` 2.6.17-rc5-mm3 Ingo Molnar
2006-06-05 17:56 ` 2.6.17-rc5-mm3 Mel Gorman
2006-06-05 18:54   ` 2.6.17-rc5-mm3 Andrew Morton
2006-06-06  9:43     ` 2.6.17-rc5-mm3 Mel Gorman
2006-06-06 10:57     ` 2.6.17-rc5-mm3 Mel Gorman
2006-06-05 19:48 ` 2.6.17-rc5-mm3 Dave Jones
2006-06-05 20:06   ` 2.6.17-rc5-mm3 Andrew Morton
2006-06-05 20:09     ` 2.6.17-rc5-mm3 Dave Jones
2006-06-05 20:44       ` 2.6.17-rc5-mm3 Dave Jones
2006-06-05 20:53         ` 2.6.17-rc5-mm3 Andrew Morton
2006-06-05 21:02           ` 2.6.17-rc5-mm3 Dave Jones
2006-06-05 21:03         ` 2.6.17-rc5-mm3 Arjan van de Ven
2006-06-06 10:15     ` 2.6.17-rc5-mm3 Takashi Iwai
2006-06-05 23:02 ` 2.6.17-rc5-mm3 Dave Jones
2006-06-06  1:44   ` 2.6.17-rc5-mm3 Randy.Dunlap
2006-06-06  1:54     ` 2.6.17-rc5-mm3 Paul Fulghum
2006-06-06  2:03       ` 2.6.17-rc5-mm3 Randy.Dunlap
2006-06-06  2:19         ` 2.6.17-rc5-mm3 Randy.Dunlap
2006-06-06  2:35           ` 2.6.17-rc5-mm3 Paul Fulghum
2006-06-06 13:30           ` 2.6.17-rc5-mm3 Paul Fulghum
2006-06-06  8:03 ` 2.6.17-rc5-mm3 J.A. Magallón
