More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6

public inbox for linux-kernel@vger.kernel.org

* More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
@ 2006-03-12  2:04 Krzysztof Oledzki
  2006-03-12  5:03 ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: Krzysztof Oledzki @ 2006-03-12  2:04 UTC
  To: Linux Kernel Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2175 bytes --]

Hello,

After upgrading to 2.6.16-rc6 I noticed this strange message:

More than 8 CPUs detected and CONFIG_X86_PC cannot handle it.
Use CONFIG_X86_GENERICARCH or CONFIG_X86_BIGSMP.

This is a Dell PowerEdge SC1425 with two P4 Xeons with HT enabled (so with
a total of 4 logical CPUs).

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
CONFIG_MPENTIUM4=y
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_SMP=y
CONFIG_NR_CPUS=4
CONFIG_SCHED_SMT=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MCE_P4THERMAL=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
CONFIG_MICROCODE=y
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set


Best regards,

 			Krzysztof Olędzki


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-12  2:04 More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6 Krzysztof Oledzki
@ 2006-03-12  5:03 ` Andrew Morton
  2006-03-12 11:04   ` Krzysztof Oledzki
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-03-12  5:03 UTC
  To: Krzysztof Oledzki; +Cc: linux-kernel

Krzysztof Oledzki <olel@ans.pl> wrote:
>
> After upgrading to 2.6.16-rc6 I noticed this strange message:
> 
>  More than 8 CPUs detected and CONFIG_X86_PC cannot handle it.
>  Use CONFIG_X86_GENERICARCH or CONFIG_X86_BIGSMP.
>
> This is a Dell PowerEdge SC1425 with two P4 Xeons with HT enabled (so with 
>  a total of 4 logical CPUs).

Please send full dmesg output for the failing kernel, thanks.

Which is the most-recently-tested kernel which behaved correctly?


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-12  5:03 ` Andrew Morton
@ 2006-03-12 11:04   ` Krzysztof Oledzki
  2006-03-12 11:25     ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: Krzysztof Oledzki @ 2006-03-12 11:04 UTC
  To: Andrew Morton; +Cc: linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 603 bytes --]



On Sat, 11 Mar 2006, Andrew Morton wrote:

> Krzysztof Oledzki <olel@ans.pl> wrote:
>>
>> After upgrading to 2.6.16-rc6 I noticed this strange message:
>>
>>  More than 8 CPUs detected and CONFIG_X86_PC cannot handle it.
>>  Use CONFIG_X86_GENERICARCH or CONFIG_X86_BIGSMP.
>>
>> This is a Dell PowerEdge SC1425 with two P4 Xeons with HT enabled (so with
>>  a total of 4 logical CPUs).
>
> Please send full dmesg output for the failing kernel, thanks.
Attached.

> Which is the most-recently-tested kernel which behaved correctly?
2.6.15.6

Best regards,

 					Krzysztof Olędzki

[-- Attachment #2: Type: TEXT/PLAIN, Size: 23230 bytes --]

Linux version 2.6.16-rc6 (root@fw2) (gcc version 3.4.5) #1 SMP PREEMPT Sun Mar 12 02:43:26 CET 2006
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 000000007ffc0000 (usable)
 BIOS-e820: 000000007ffc0000 - 000000007ffcfc00 (ACPI data)
 BIOS-e820: 000000007ffcfc00 - 000000007ffff000 (reserved)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec90000 (reserved)
 BIOS-e820: 00000000fed00000 - 00000000fed00400 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
2047MB LOWMEM available.
found SMP MP-table at 000fe710
On node 0 totalpages: 524224
  DMA zone: 4096 pages, LIFO batch:0
  DMA32 zone: 0 pages, LIFO batch:0
  Normal zone: 520128 pages, LIFO batch:31
  HighMem zone: 0 pages, LIFO batch:0
DMI 2.3 present.
ACPI: RSDP (v000 DELL                                  ) @ 0x000fd650
ACPI: RSDT (v001 DELL   PESC1425 0x00000001 MSFT 0x0100000a) @ 0x000fd664
ACPI: FADT (v001 DELL   PESC1425 0x00000001 MSFT 0x0100000a) @ 0x000fd6b0
ACPI: MADT (v001 DELL   PESC1425 0x00000001 MSFT 0x0100000a) @ 0x000fd724
ACPI: SPCR (v001 DELL   PESC1425 0x00000001 MSFT 0x0100000a) @ 0x000fd7c0
ACPI: HPET (v001 DELL   PESC1425 0x00000001 MSFT 0x0100000a) @ 0x000fd810
ACPI: MCFG (v001 DELL   PESC1425 0x00000001 MSFT 0x0100000a) @ 0x000fd848
ACPI: DSDT (v001 DELL   PESC1425 0x00000001 MSFT 0x0100000e) @ 0x00000000
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:4 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
Processor #6 15:4 APIC version 20
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled)
Processor #1 15:4 APIC version 20
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x07] enabled)
Processor #7 15:4 APIC version 20
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x09] address[0xfec80000] gsi_base[32])
IOAPIC[1]: apic_id 9, version 32, address 0xfec80000, GSI 32-55
ACPI: IOAPIC (id[0x0a] address[0xfec80800] gsi_base[64])
IOAPIC[2]: apic_id 10, version 32, address 0xfec80800, GSI 64-87
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 3 I/O APICs
ACPI: HPET id: 0xffffffff base: 0xfed00000
More than 8 CPUs detected and CONFIG_X86_PC cannot handle it.
Use CONFIG_X86_GENERICARCH or CONFIG_X86_BIGSMP.
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 80000000 (gap: 7ffff000:60001000)
Built 1 zonelists
Kernel command line: auto BOOT_IMAGE=Linux-2.6.16r6 ro root=900 rootflags=data=journal ip_conntrack.hashsize=131072
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
mapped IOAPIC to ffffb000 (fec80000)
mapped IOAPIC to ffffa000 (fec80800)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 65536 bytes)
Console: colour VGA+ 80x30
Dentry cache hash table entries: 524288 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 262144 (order: 8, 1048576 bytes)
Memory: 2072088k/2096896k available (2967k kernel code, 24404k reserved, 1041k data, 216k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
hpet0: at MMIO 0xfed00000 (virtual 0xf8800000), IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
Using HPET for base-timer
Using HPET for gettimeofday
Detected 3200.687 MHz processor.
Using hpet for high-res timesource
Calibrating delay using timer specific routine.. 6404.87 BogoMIPS (lpj=3202439)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000
monitor/mwait feature present.
using mwait in idle threads.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (24) available
CPU0: Thermal monitoring enabled
Checking 'hlt' instruction... OK.
CPU0: Intel(R) Xeon(TM) CPU 3.20GHz stepping 03
Booting processor 1/1 eip 3000
Initializing CPU#1
Calibrating delay using timer specific routine.. 6400.28 BogoMIPS (lpj=3200140)
CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (24) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Xeon(TM) CPU 3.20GHz stepping 03
Booting processor 2/6 eip 3000
Initializing CPU#2
Calibrating delay using timer specific routine.. 6400.31 BogoMIPS (lpj=3200157)
CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 3
CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#2.
CPU2: Intel P4/Xeon Extended MCE MSRs (24) available
CPU2: Thermal monitoring enabled
CPU2: Intel(R) Xeon(TM) CPU 3.20GHz stepping 03
Booting processor 3/7 eip 3000
Initializing CPU#3
Calibrating delay using timer specific routine.. 6400.32 BogoMIPS (lpj=3200161)
CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 3
CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#3.
CPU3: Intel P4/Xeon Extended MCE MSRs (24) available
CPU3: Thermal monitoring enabled
CPU3: Intel(R) Xeon(TM) CPU 3.20GHz stepping 03
Total of 4 processors activated (25605.79 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
checking TSC synchronization across 4 CPUs: passed.
Brought up 4 CPUs
migration_cost=1000,1000
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfc3de, last bus=4
PCI: Using MMCONFIG
ACPI: Subsystem revision 20060127
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
PCI quirk: region 0800-087f claimed by ICH4 ACPI/GPIO/TCO
PCI quirk: region 0880-08bf claimed by ICH4 GPIO
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
PCI: PXH quirk detected, disabling MSI for SHPC device
PCI: PXH quirk detected, disabling MSI for SHPC device
Boot video device is 0000:04:0d.0
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PALO._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PALO.PXHB._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PALO.PXHA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PICH._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 *11 12)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 6 7 10 11 12)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 *10 11 12)
ACPI: PCI Interrupt Link [LNKE] (IRQs *3 4 5 6 7 10 11 12)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 *6 7 10 11 12)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 11 devices
SCSI subsystem initialized
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
TC classifier action (bugs to netdev@vger.kernel.org cc hadi@cyberus.ca)
pnp: 00:08: ioport range 0x800-0x87f could not be reserved
pnp: 00:08: ioport range 0x880-0x8bf has been reserved
pnp: 00:08: ioport range 0x8c0-0x8df has been reserved
pnp: 00:08: ioport range 0x8e0-0x8e3 has been reserved
pnp: 00:08: ioport range 0xc00-0xc0f has been reserved
pnp: 00:08: ioport range 0xc10-0xc1f has been reserved
pnp: 00:08: ioport range 0xca0-0xcaf has been reserved
pnp: 00:08: ioport range 0xc20-0xc3f has been reserved
PCI: Bridge: 0000:01:00.0
  IO window: e000-efff
  MEM window: fe900000-feafffff
  PREFETCH window: disabled.
PCI: Bridge: 0000:01:00.2
  IO window: disabled.
  MEM window: disabled.
  PREFETCH window: disabled.
PCI: Bridge: 0000:00:02.0
  IO window: e000-efff
  MEM window: fe700000-feafffff
  PREFETCH window: disabled.
PCI: Bridge: 0000:00:1e.0
  IO window: d000-dfff
  MEM window: fe500000-fe6fffff
  PREFETCH window: f0000000-f7ffffff
ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 16 (level, low) -> IRQ 169
PCI: Setting latency timer of device 0000:00:02.0 to 64
PCI: Setting latency timer of device 0000:01:00.0 to 64
PCI: Setting latency timer of device 0000:01:00.2 to 64
PCI: Setting latency timer of device 0000:00:1e.0 to 64
Machine check exception polling timer started.
IA-32 Microcode Update Driver: v1.14 <tigran@veritas.com>
audit: initializing netlink socket (disabled)
audit(1142131816.949:1): initialized
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered (default)
io scheduler cfq registered
Intel E7520/7320/7525 detected.<6>ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 16 (level, low) -> IRQ 169
PCI: Setting latency timer of device 0000:00:02.0 to 64
Allocate Port Service[0000:00:02.0:pcie00]
Allocate Port Service[0000:00:02.0:pcie01]
ACPI: Power Button (FF) [PWRF]
ACPI: Video Device [EVGA] (multi-head: no  rom: yes  post: no)
Real Time Clock Driver v1.12ac
hpet_resources: 0xfed00000 is busy
ipmi message handler version 38.0
ipmi device interface
IPMI System Interface driver.
ipmi_si: Found SMBIOS-specified state machine at I/O address 0xca8, slave address 0x20
 IPMI kcs interface initialized
IPMI Watchdog: driver initialized
Copyright (C) 2004 MontaVista Software - IPMI Powerdown via sys_reboot.
IPMI poweroff: Found a chassis style poweroff function
PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
floppy0: no floppy controllers found
loop: loaded (max 8 devices)
Intel(R) PRO/1000 Network Driver - version 6.3.9-k4-NAPI
Copyright (c) 1999-2005 Intel Corporation.
ACPI: PCI Interrupt 0000:02:04.0[A] -> GSI 32 (level, low) -> IRQ 177
e1000: 0000:02:04.0: e1000_probe: (PCI:66MHz:32-bit) 00:14:22:b0:cb:52
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
ACPI: PCI Interrupt 0000:04:03.0[A] -> GSI 20 (level, low) -> IRQ 185
e1000: 0000:04:03.0: e1000_probe: (PCI:33MHz:32-bit) 00:14:22:b0:cb:53
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH5: IDE controller at PCI slot 0000:00:1f.1
PCI: Enabling device 0000:00:1f.1 (0005 -> 0007)
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 193
ICH5: chipset revision 2
ICH5: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio
Probing IDE interface ide0...
hda: HL-DT-ST DVD-ROM GDR-8084N, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ATAPI 24X DVD-ROM drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
libata version 1.20 loaded.
ata_piix 0000:00:1f.2: version 1.05
ACPI: PCI Interrupt 0000:00:1f.2[A] -> GSI 18 (level, low) -> IRQ 193
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ata1: SATA max UDMA/133 cmd 0xCCB8 ctl 0xCCB2 bmdma 0xCC80 irq 193
ata2: SATA max UDMA/133 cmd 0xCCA0 ctl 0xCC9A bmdma 0xCC88 irq 193
ata1: dev 0 cfg 49:2f00 82:746b 83:7f01 84:4023 85:7469 86:3c01 87:4023 88:207f
ata1: dev 0 ATA-7, max UDMA/133, 156250000 sectors: LBA48
ata1: dev 0 configured for UDMA/133
scsi0 : ata_piix
ata2: dev 0 cfg 49:2f00 82:746b 83:7f01 84:4023 85:7469 86:3c01 87:4023 88:207f
ata2: dev 0 ATA-7, max UDMA/133, 156250000 sectors: LBA48
ata2: dev 0 configured for UDMA/133
scsi1 : ata_piix
  Vendor: ATA       Model: WDC WD800JD-75MS  Rev: 10.0
  Type:   Direct-Access                      ANSI SCSI revision: 05
  Vendor: ATA       Model: WDC WD800JD-75MS  Rev: 10.0
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 156250000 512-byte hdwr sectors (80000 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 156250000 512-byte hdwr sectors (80000 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 >
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 156250000 512-byte hdwr sectors (80000 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 156250000 512-byte hdwr sectors (80000 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
 sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 sdb8 sdb9 >
sd 1:0:0:0: Attached scsi disk sdb
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 1:0:0:0: Attached scsi generic sg1 type 0
ACPI: PCI Interrupt 0000:00:1d.7[D] -> GSI 23 (level, low) -> IRQ 201
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: debug port 1
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1d.7: irq 201, io mem 0xfeb00000
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 4 ports detected
USB Universal Host Controller Interface driver v2.3
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 169
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.0: irq 169, io base 0x0000cce0
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 209
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.1: irq 209, io base 0x0000ccc0
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
usbcore: registered new driver usblp
drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard as /class/input/input0
input: PC Speaker as /class/input/input1
md: raid1 personality registered for level 1
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
 dcdbas: Dell Systems Management Base Driver (version 5.6.0-2)
GACT probability on
Mirror/redirect action on
Simple TC action Loaded
netem: version 1.2
u32 classifier
    Perfomance counters on
    input device check on 
    Actions configured 
Netfilter messages via NETLINK v0.30.
NET: Registered protocol family 2
IP route cache hash table entries: 131072 (order: 7, 524288 bytes)
TCP established hash table entries: 262144 (order: 9, 3145728 bytes)
TCP bind hash table entries: 65536 (order: 7, 786432 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
IPv4 over IPv4 tunneling driver
GRE over IPv4 tunneling driver
ip_conntrack version 2.4 (131072 buckets, 1048576 max) - 232 bytes per conntrack
ctnetlink v0.90: registering with nfnetlink.
ip_conntrack_pptp version 3.1 loaded
ip_nat_pptp version 3.0 loaded
ip_tables: (C) 2000-2006 Netfilter Core Team
ipt_time loading
ipt_random match loaded
ipt_recent v0.3.1: Stephen Frost <sfrost@snowman.net>.  http://snowman.net/projects/ipt_recent/
IPP2P v0.8.1_rc1 loading
ClusterIP Version 0.8 loaded successfully
arp_tables: (C) 2002 David S. Miller
IPVS: Registered protocols (TCP, UDP)
IPVS: Connection hash table configured (size=65536, memory=512Kbytes)
IPVS: ipvs loaded.
IPVS: [rr] scheduler registered.
IPVS: [wrr] scheduler registered.
IPVS: [lc] scheduler registered.
IPVS: [wlc] scheduler registered.
IPVS: [lblc] scheduler registered.
IPVS: [lblcr] scheduler registered.
IPVS: [dh] scheduler registered.
IPVS: [sh] scheduler registered.
IPVS: [sed] scheduler registered.
IPVS: [nq] scheduler registered.
TCP bic registered
TCP cubic registered
TCP westwood registered
TCP highspeed registered
TCP hybla registered
TCP htcp registered
TCP vegas registered
TCP scalable registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
ip6_tables: (C) 2000-2006 Netfilter Core Team
NET: Registered protocol family 17
NET: Registered protocol family 15
802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
All bugs added by David S. Miller <davem@redhat.com>
p4-clockmod: P4/Xeon(TM) CPU On-Demand Clock Modulation available
Using IPI No-Shortcut mode
ACPI wakeup devices: 
PCI0 PALO  PXH PXHB PXHA PICH 
ACPI: (supports S0 S4 S5)
BIOS EDD facility v0.16 2004-Jun-25, 2 devices found
md: Autodetecting RAID arrays.
input: ImExPS/2 Logitech Explorer Mouse as /class/input/input2
md: autorun ...
md: considering sdb9 ...
md:  adding sdb9 ...
md: sdb8 has different UUID to sdb9
md: sdb7 has different UUID to sdb9
md: sdb6 has different UUID to sdb9
md: sdb5 has different UUID to sdb9
md: sdb3 has different UUID to sdb9
md: sdb2 has different UUID to sdb9
md:  adding sda9 ...
md: sda8 has different UUID to sdb9
md: sda7 has different UUID to sdb9
md: sda6 has different UUID to sdb9
md: sda5 has different UUID to sdb9
md: sda3 has different UUID to sdb9
md: sda2 has different UUID to sdb9
md: created md5
md: bind<sda9>
md: bind<sdb9>
md: running: <sdb9><sda9>
raid1: raid set md5 active with 2 out of 2 mirrors
md: considering sdb8 ...
md:  adding sdb8 ...
md: sdb7 has different UUID to sdb8
md: sdb6 has different UUID to sdb8
md: sdb5 has different UUID to sdb8
md: sdb3 has different UUID to sdb8
md: sdb2 has different UUID to sdb8
md:  adding sda8 ...
md: sda7 has different UUID to sdb8
md: sda6 has different UUID to sdb8
md: sda5 has different UUID to sdb8
md: sda3 has different UUID to sdb8
md: sda2 has different UUID to sdb8
md: created md4
md: bind<sda8>
md: bind<sdb8>
md: running: <sdb8><sda8>
raid1: raid set md4 active with 2 out of 2 mirrors
md: considering sdb7 ...
md:  adding sdb7 ...
md: sdb6 has different UUID to sdb7
md: sdb5 has different UUID to sdb7
md: sdb3 has different UUID to sdb7
md: sdb2 has different UUID to sdb7
md:  adding sda7 ...
md: sda6 has different UUID to sdb7
md: sda5 has different UUID to sdb7
md: sda3 has different UUID to sdb7
md: sda2 has different UUID to sdb7
md: created md3
md: bind<sda7>
md: bind<sdb7>
md: running: <sdb7><sda7>
raid1: raid set md3 active with 2 out of 2 mirrors
md: considering sdb6 ...
md:  adding sdb6 ...
md: sdb5 has different UUID to sdb6
md: sdb3 has different UUID to sdb6
md: sdb2 has different UUID to sdb6
md:  adding sda6 ...
md: sda5 has different UUID to sdb6
md: sda3 has different UUID to sdb6
md: sda2 has different UUID to sdb6
md: created md2
md: bind<sda6>
md: bind<sdb6>
md: running: <sdb6><sda6>
raid1: raid set md2 active with 2 out of 2 mirrors
md: considering sdb5 ...
md:  adding sdb5 ...
md: sdb3 has different UUID to sdb5
md: sdb2 has different UUID to sdb5
md:  adding sda5 ...
md: sda3 has different UUID to sdb5
md: sda2 has different UUID to sdb5
md: created md1
md: bind<sda5>
md: bind<sdb5>
md: running: <sdb5><sda5>
raid1: raid set md1 active with 2 out of 2 mirrors
md: considering sdb3 ...
md:  adding sdb3 ...
md: sdb2 has different UUID to sdb3
md:  adding sda3 ...
md: sda2 has different UUID to sdb3
md: created md15
md: bind<sda3>
md: bind<sdb3>
md: running: <sdb3><sda3>
raid1: raid set md15 active with 2 out of 2 mirrors
md: considering sdb2 ...
md:  adding sdb2 ...
md:  adding sda2 ...
md: created md0
md: bind<sda2>
md: bind<sdb2>
md: running: <sdb2><sda2>
raid1: raid set md0 active with 2 out of 2 mirrors
md: ... autorun DONE.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with journal data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 216k freed
Adding 2007992k swap on /dev/md15.  Priority:8192 extents:1 across:2007992k
EXT3 FS on md0, internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS on md1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS on md2, internal journal
EXT3-fs: mounted filesystem with journal data mode.
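
The log above lists the local APICs that the MADT reports. As a cross-check against the "More than 8 CPUs" message, here is a minimal sketch in C that counts the enabled LAPIC entries in such a log. The struct `lapic_summary` and the function `count_lapics` are hypothetical names introduced for illustration, and the line format is assumed to match the `ACPI: LAPIC (...)` lines shown in this dmesg.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Result of scanning a boot log for enabled local APIC entries. */
struct lapic_summary {
	int enabled;      /* number of enabled LAPIC entries seen */
	unsigned max_id;  /* highest lapic_id among them */
};

/*
 * Scan an array of log lines. Only lines of the form
 * "ACPI: LAPIC (acpi_id[0x..] lapic_id[0x..] enabled)" are counted;
 * LAPIC_NMI lines and everything else are skipped.
 */
static struct lapic_summary count_lapics(const char *const *lines, int n)
{
	struct lapic_summary s = { 0, 0 };

	assert(lines != NULL || n == 0);
	for (int i = 0; i < n; i++) {
		unsigned acpi_id, lapic_id;

		if (sscanf(lines[i], "ACPI: LAPIC (acpi_id[0x%x] lapic_id[0x%x]",
			   &acpi_id, &lapic_id) != 2)
			continue;
		if (!strstr(lines[i], "enabled)"))
			continue;
		s.enabled++;
		if (lapic_id > s.max_id)
			s.max_id = lapic_id;
	}
	return s;
}
```

Fed the four `ACPI: LAPIC` lines from this log, the sketch counts 4 enabled entries with a highest lapic_id of 7, which agrees with the "Brought up 4 CPUs" line and suggests the warning is not triggered by the CPU count itself.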


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-12 11:04   ` Krzysztof Oledzki
@ 2006-03-12 11:25     ` Andrew Morton
  2006-03-12 13:05       ` Krzysztof Oledzki
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-03-12 11:25 UTC
  To: Krzysztof Oledzki
  Cc: linux-kernel, Ashok Raj, Venkatesh Pallipadi, Suresh B Siddha,
	Rajesh Shah

Krzysztof Oledzki <olel@ans.pl> wrote:
>
> On Sat, 11 Mar 2006, Andrew Morton wrote:
> 
>  > Krzysztof Oledzki <olel@ans.pl> wrote:
>  >>
>  >> After upgrading to 2.6.16-rc6 I noticed this strange message:
>  >>
>  >>  More than 8 CPUs detected and CONFIG_X86_PC cannot handle it.
>  >>  Use CONFIG_X86_GENERICARCH or CONFIG_X86_BIGSMP.
>  >>
>  >> This is a Dell PowerEdge SC1425 with two P4 Xeons with HT enabled (so with
>  >>  a total of 4 logical CPUs).
>  >
>  > Please send full dmesg output for the failing kernel, thanks.
>  Attached.
> 
>  > Which is the most-recently-tested kernel which behaved correctly?
>  2.6.15.6

OK, thanks.  I assume the machine's working OK?

From my reading, you have CONFIG_HOTPLUG_CPU enabled and the machine has an
APIC.  I'd expect that lots of people would hit that warning but for some
reason they don't - possibly because most APICs don't have sufficiently
high version numbers?

Anyway, various people cc'ed.  I _think_ it's harmless, although the way in
which def_to_bigsmp propagates into the DMI and APIC code might be a
problem, depending upon config options.

Certainly the warning is incorrect, but I'm not sure what is the best thing
to do about it?



* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-12 11:25     ` Andrew Morton
@ 2006-03-12 13:05       ` Krzysztof Oledzki
  2006-03-12 15:35         ` Venkatesh Pallipadi
  0 siblings, 1 reply; 25+ messages in thread
From: Krzysztof Oledzki @ 2006-03-12 13:05 UTC
  To: Andrew Morton
  Cc: linux-kernel, Ashok Raj, Venkatesh Pallipadi, Suresh B Siddha,
	Rajesh Shah

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1441 bytes --]



On Sun, 12 Mar 2006, Andrew Morton wrote:

> Krzysztof Oledzki <olel@ans.pl> wrote:
>>
>> On Sat, 11 Mar 2006, Andrew Morton wrote:
>>
>> > Krzysztof Oledzki <olel@ans.pl> wrote:
>> >>
>> >> After upgrading to 2.6.16-rc6 I noticed this strange message:
>> >>
>> >>  More than 8 CPUs detected and CONFIG_X86_PC cannot handle it.
>> >>  Use CONFIG_X86_GENERICARCH or CONFIG_X86_BIGSMP.
>> >>
>> >> This is a Dell PowerEdge SC1425 with two P4 Xeons with HT enabled (so with
>> >>  a total of 4 logical CPUs).
>> >
>> > Please send full dmesg output for the failing kernel, thanks.
>>  Attached.
>>
>> > Which is the most-recently-tested kernel which behaved correctly?
>>  2.6.15.6
>
> OK, thanks.  I assume the machine's working OK?

Yes. So far no problems, only this warning.

> From my reading, you have CONFIG_HOTPLUG_CPU enabled and the machine has an
> APIC.
That is correct.

> I'd expect that lots of people would hit that warning but for some
> reason they don't - possibly because most APICs don't have sufficiently
> high version numbers?
>
> Anyway, various people cc'ed.  I _think_ it's harmless, although the way in
> which def_to_bigsmp propagates into the DMI and APIC code might be a
> problem, depending upon config options.
>
> Certainly the warning is incorrect, but I'm not sure what is the best thing
> to do about it?

OK. Thank you.

Best regards,

 				Krzysztof Olędzki


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-12 13:05       ` Krzysztof Oledzki
@ 2006-03-12 15:35         ` Venkatesh Pallipadi
  2006-03-12 21:13           ` Krzysztof Oledzki
  0 siblings, 1 reply; 25+ messages in thread
From: Venkatesh Pallipadi @ 2006-03-12 15:35 UTC
  To: Krzysztof Oledzki
  Cc: Andrew Morton, linux-kernel, Ashok Raj, Venkatesh Pallipadi,
	Suresh B Siddha, Rajesh Shah

On Sun, Mar 12, 2006 at 02:05:00PM +0100, Krzysztof Oledzki wrote:
> 
> 
> On Sun, 12 Mar 2006, Andrew Morton wrote:
> 
> > Krzysztof Oledzki <olel@ans.pl> wrote:
> >>
> >> On Sat, 11 Mar 2006, Andrew Morton wrote:
> >>
> >> > Krzysztof Oledzki <olel@ans.pl> wrote:
> >> >>
> >> >> After upgrading to 2.6.16-rc6 I noticed this strange message:
> >> >>
> >> >>  More than 8 CPUs detected and CONFIG_X86_PC cannot handle it.
> >> >>  Use CONFIG_X86_GENERICARCH or CONFIG_X86_BIGSMP.
> >> >>
> >> >> This is a Dell PowerEdge SC1425 with two P4 Xeons with HT enabled (so with
> >> >>  total of 4 logical CPUs).
> >> >
> >> > Please send full dmesg output for the failing kernel, thanks.
> >>  Attached.
> >>
> >> > Which is the most-recently-tested kernel which behaved correctly?
> >>  2.6.15.6
> >
> > OK, thanks.  I assume the machine's working OK?
> 
> Yes. So far no problems, only this warning.
> 
> > From my reading, you have CONFIG_HOTPLUG_CPU enabled and the machine has an
> > APIC.
> That is correct.
> 
> > I'd expect that lots of people would hit that warning but for some
> > reason they don't - possibly because most APICs don't have sufficiently
> > high version numbers?
> >

Actually, this warning should be seen on many other systems as well. We
use bigsmp when there are more than 8 CPUs _or_ when CPU_HOTPLUG is used.
So, in that sense the message is wrong; it should also mention CPU_HOTPLUG.
Or we should make CPU_HOTPLUG depend on GENERIC_ARCH, or auto-select
GENERIC_ARCH with hotplug at the CONFIG level.

Will defer to Ashok for a proper fix.

Thanks,
Venki


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-12 15:35         ` Venkatesh Pallipadi
@ 2006-03-12 21:13           ` Krzysztof Oledzki
  2006-03-12 22:30             ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: Krzysztof Oledzki @ 2006-03-12 21:13 UTC (permalink / raw)
  To: Venkatesh Pallipadi
  Cc: Andrew Morton, linux-kernel, Ashok Raj, Suresh B Siddha,
	Rajesh Shah

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1999 bytes --]



On Sun, 12 Mar 2006, Venkatesh Pallipadi wrote:

> On Sun, Mar 12, 2006 at 02:05:00PM +0100, Krzysztof Oledzki wrote:
>>
>>
>> On Sun, 12 Mar 2006, Andrew Morton wrote:
>>
>>> Krzysztof Oledzki <olel@ans.pl> wrote:
>>>>
>>>> On Sat, 11 Mar 2006, Andrew Morton wrote:
>>>>
>>>>> Krzysztof Oledzki <olel@ans.pl> wrote:
>>>>>>
>>>>>> After upgrading to 2.6.16-rc6 I noticed this strange message:
>>>>>>
>>>>>>  More than 8 CPUs detected and CONFIG_X86_PC cannot handle it.
>>>>>>  Use CONFIG_X86_GENERICARCH or CONFIG_X86_BIGSMP.
>>>>>>
>>>>>> This is a Dell PowerEdge SC1425 with two P4 Xeons with HT enabled (so with
>>>>>>  total of 4 logical CPUs).
>>>>>
>>>>> Please send full dmesg output for the failing kernel, thanks.
>>>>  Attached.
>>>>
>>>>> Which is the most-recently-tested kernel which behaved correctly?
>>>>  2.6.15.6
>>>
>>> OK, thanks.  I assume the machine's working OK?
>>
>> Yes. So far no problems, only this warning.
>>
>>> From my reading, you have CONFIG_HOTPLUG_CPU enabled and the machine has an
>>> APIC.
>> That is correct.
>>
>>> I'd expect that lots of people would hit that warning but for some
>>> reason they don't - possibly because most APICs don't have sufficiently
>>> high version numbers?
>>>
>
> Actually, this warning should be seen on many other systems as well. We
> use bigsmp when there are more than 8 CPUs _or_ when CPU_HOTPLUG is used.
> So, in that sense the message is wrong; it should also mention CPU_HOTPLUG.
> Or we should make CPU_HOTPLUG depend on GENERIC_ARCH, or auto-select
> GENERIC_ARCH with hotplug at the CONFIG level.

Why? I have exactly 4 HT CPUs (2 cores), no more. I use CPU hotplug so I 
can disable or enable any of them when I want to. So, this is a classic 
SMP system and 2.6.15 is totally happy with this. Or is there any other 
(better?) way to disable/enable CPU (especially second logical CPU from 
HT) on running systems?
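
For reference, the logical CPU offlining described above goes through the
sysfs files under /sys/devices/system/cpu. What follows is only a minimal C
sketch, assuming the per-CPU "online" attribute is present and the caller has
root privileges; the helper names cpu_online_path() and set_cpu_online() are
made up for illustration and are not part of any kernel or library API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "/sys/devices/system/cpu/cpuN/online" into buf.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int cpu_online_path(char *buf, size_t len, unsigned int cpu)
{
	int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%u/online", cpu);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write "1" (bring online) or "0" (take offline) to the CPU's online file.
 * Returns 0 on success, -1 on any error (no such CPU, no permission, ...).
 */
static int set_cpu_online(unsigned int cpu, int online)
{
	char path[64];
	FILE *f;
	int ret = 0;

	if (cpu_online_path(path, sizeof(path), cpu) < 0)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	if (fputs(online ? "1" : "0", f) == EOF)
		ret = -1;
	if (fclose(f) == EOF)
		ret = -1;

	return ret;
}
```

Calling set_cpu_online(3, 0) would be roughly equivalent to
"echo 0 > /sys/devices/system/cpu/cpu3/online" from a root shell.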

Best regards,

 				Krzysztof Olędzki


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-12 21:13           ` Krzysztof Oledzki
@ 2006-03-12 22:30             ` Andrew Morton
  2006-03-13 19:36               ` Ashok Raj
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-03-12 22:30 UTC (permalink / raw)
  To: Krzysztof Oledzki
  Cc: venkatesh.pallipadi, linux-kernel, ashok.raj, suresh.b.siddha,
	rajesh.shah

Krzysztof Oledzki <olel@ans.pl> wrote:
>
> > Actually, this warning should be seen on many other systems as well. We
>  > use bigsmp when there are more than 8 CPUs _or_ when CPU_HOTPLUG is used.
>  > So, in that sense the message is wrong; it should also mention CPU_HOTPLUG.
>  > Or we should make CPU_HOTPLUG depend on GENERIC_ARCH, or auto-select
>  > GENERIC_ARCH with hotplug at the CONFIG level.
> 
>  Why? I have exactly 4 HT CPUs (2 cores), no more. I use CPU hotplug so I 
>  can disable or enable any of them when I want to. So, this is a classic 
>  SMP system and 2.6.15 is totally happy with this. Or is there any other 
>  (better?) way to disable/enable CPU (especially second logical CPU from 
>  HT) on running systems?

Maybe we should have:

	if (num_possible_cpus() <= 8)
		dont_do_any_of_that_stuff();

That's assuming that hotplug-cpu-capable platforms are correctly setting
cpu_possible_map.  Do they?
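
To make the difference concrete, here is a small userspace model of the two
policies. It is only a sketch: struct cpu_setup and both function names are
hypothetical, possible_cpus merely stands in for num_possible_cpus(), and
nothing here is taken from the actual mpparse/APIC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant boot-time facts. */
struct cpu_setup {
	unsigned int possible_cpus;	/* models num_possible_cpus() */
	bool hotplug_cpu;		/* models CONFIG_HOTPLUG_CPU=y */
};

/*
 * Behaviour as described in this thread: bigsmp is selected when there
 * are more than 8 CPUs or when CPU hotplug is configured.
 */
static bool wants_bigsmp_described(const struct cpu_setup *s)
{
	return s->possible_cpus > 8 || s->hotplug_cpu;
}

/*
 * The suggested guard: with at most 8 possible CPUs, do none of that,
 * regardless of the hotplug setting.
 */
static bool wants_bigsmp_guarded(const struct cpu_setup *s)
{
	if (s->possible_cpus <= 8)
		return false;
	return true;
}
```

For the reported machine (4 logical CPUs, hotplug enabled) the first function
returns true and the second returns false, which is exactly the case the guard
is meant to change.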


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-12 22:30             ` Andrew Morton
@ 2006-03-13 19:36               ` Ashok Raj
  2006-03-13 19:51                 ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: Ashok Raj @ 2006-03-13 19:36 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Krzysztof Oledzki, venkatesh.pallipadi, linux-kernel, ashok.raj,
	suresh.b.siddha, rajesh.shah

On Sun, Mar 12, 2006 at 02:30:53PM -0800, Andrew Morton wrote:
> 
> Maybe we should have:
> 
> 	if (num_possible_cpus() <= 8)
> 		dont_do_any_of_that_stuff();
> 
> That's assuming that hotplug-cpu-capable platforms are correctly setting
> cpu_possible_map.  Do they?

That won't work, since we use HOTPLUG_CPU for suspend/resume as well. We
switched to using bigsmp (which uses physflat for IPIs) just to avoid
sending IPIs to offline CPUs. When we use logical flat mode we use shortcuts
that have ill effects on CPUs that are offline.

I think making CONFIG_HOTPLUG_CPU depend on X86_GENERICARCH or X86_BIGSMP
is the better choice.

-- 
Cheers,
Ashok Raj
- Open Source Technology Center


When CONFIG_HOTPLUG_CPU is turned on we always use physflat mode (bigsmp), even
when there are fewer than 8 CPUs, to avoid sending IPIs to offline processors.

Without BIGSMP enabled, the kernel prints a warning during boot that is
misleading, since it complains even on systems that have fewer than 8 CPUs.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
---------------------------------------------------------

 arch/i386/Kconfig |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.16-rc6-mm1/arch/i386/Kconfig
===================================================================
--- linux-2.6.16-rc6-mm1.orig/arch/i386/Kconfig
+++ linux-2.6.16-rc6-mm1/arch/i386/Kconfig
@@ -760,7 +760,7 @@ config PHYSICAL_START
 
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs (EXPERIMENTAL)"
-	depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER
+	depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER && (X86_GENERICARCH || X86_BIGSMP)
 	---help---
 	  Say Y here to experiment with turning CPUs off and on.  CPUs
 	  can be controlled through /sys/devices/system/cpu.


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-13 19:36               ` Ashok Raj
@ 2006-03-13 19:51                 ` Andrew Morton
  2006-03-13 20:05                   ` Ashok Raj
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-03-13 19:51 UTC (permalink / raw)
  To: Ashok Raj
  Cc: olel, venkatesh.pallipadi, linux-kernel, ashok.raj,
	suresh.b.siddha, rajesh.shah

Ashok Raj <ashok.raj@intel.com> wrote:
>
> When CONFIG_HOTPLUG_CPU is turned on we always use physflat mode (bigsmp), even
>  when there are fewer than 8 CPUs, to avoid sending IPIs to offline processors.
> 
>  Without BIGSMP enabled, the kernel prints a warning during boot that is
>  misleading, since it complains even on systems that have fewer than 8 CPUs.
> 
> ...
>
>  --- linux-2.6.16-rc6-mm1.orig/arch/i386/Kconfig
>  +++ linux-2.6.16-rc6-mm1/arch/i386/Kconfig
>  @@ -760,7 +760,7 @@ config PHYSICAL_START
>   
>   config HOTPLUG_CPU
>   	bool "Support for hot-pluggable CPUs (EXPERIMENTAL)"
>  -	depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER
>  +	depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER && (X86_GENERICARCH || X86_BIGSMP)
>   	---help---
>   	  Say Y here to experiment with turning CPUs off and on.  CPUs
>   	  can be controlled through /sys/devices/system/cpu.

One of the main reasons for turning on CONFIG_HOTPLUG_CPU on x86 is
actually for suspend-to-disk on SMP.  I don't think it's desirable to force
all those little machines to use X86_GENERICARCH || X86_BIGSMP.  And it'd
be good to make that warning go away for 2.6.16.


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-13 19:51                 ` Andrew Morton
@ 2006-03-13 20:05                   ` Ashok Raj
  2006-03-13 22:22                     ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: Ashok Raj @ 2006-03-13 20:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ashok Raj, olel, venkatesh.pallipadi, linux-kernel,
	suresh.b.siddha, rajesh.shah, ak

On Mon, Mar 13, 2006 at 11:51:55AM -0800, Andrew Morton wrote:
> 
> One of the main reasons for turning on CONFIG_HOTPLUG_CPU on x86 is
> actually for suspend-to-disk on SMP.  I don't think it's desirable to force
> all those little machines to use X86_GENERICARCH || X86_BIGSMP.  And it'd
> be good to make that warning go away for 2.6.16.

But we can't use X86_PC, since it uses logical flat mode for IPIs, which could
cause hangs if we deliver IPIs using the IPI broadcast shortcut.

On i386 we do have an alternative that would use a mask value to deliver IPIs,
but Andi's recommendation was to use flat physical mode, just like what we
do for X86_64.

Other than the IPI mode, are there any other things that hurt small systems
by choosing bigsmp mode?

Venki suggested we could make it !X86_PC instead of listing 
GENERICARCH or BIGSMP separately.

-- 
Cheers,
Ashok Raj
- Open Source Technology Center

When CONFIG_HOTPLUG_CPU is turned on we always use physflat mode (bigsmp), even
when there are fewer than 8 CPUs, to avoid sending IPIs to offline processors.

Without BIGSMP enabled, the kernel prints a warning during boot that is
misleading, since it complains even on systems that have fewer than 8 CPUs.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
---------------------------------------------------------

 arch/i386/Kconfig |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.16-rc6-mm1/arch/i386/Kconfig
===================================================================
--- linux-2.6.16-rc6-mm1.orig/arch/i386/Kconfig
+++ linux-2.6.16-rc6-mm1/arch/i386/Kconfig
@@ -760,7 +760,7 @@ config PHYSICAL_START

 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs (EXPERIMENTAL)"
-	depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER
+	depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER && !X86_PC
 	---help---
 	  Say Y here to experiment with turning CPUs off and on.  CPUs
 	  can be controlled through /sys/devices/system/cpu.


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-13 20:05                   ` Ashok Raj
@ 2006-03-13 22:22                     ` Andrew Morton
  2006-03-13 23:04                       ` Ashok Raj
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-03-13 22:22 UTC (permalink / raw)
  To: Ashok Raj
  Cc: ashok.raj, olel, venkatesh.pallipadi, linux-kernel,
	suresh.b.siddha, rajesh.shah, ak

Ashok Raj <ashok.raj@intel.com> wrote:
>
> 
> 
>  When CONFIG_HOTPLUG_CPU is turned on we always use physflat mode (bigsmp), even
>  when there are fewer than 8 CPUs, to avoid sending IPIs to offline processors.
> 
>  Without BIGSMP enabled, the kernel prints a warning during boot that is
>  misleading, since it complains even on systems that have fewer than 8 CPUs.
> 
>  Signed-off-by: Ashok Raj <ashok.raj@intel.com>
>  ---------------------------------------------------------
> 
>   arch/i386/Kconfig |    2 +-
>   1 files changed, 1 insertion(+), 1 deletion(-)
> 
>  Index: linux-2.6.16-rc6-mm1/arch/i386/Kconfig
>  ===================================================================
>  --- linux-2.6.16-rc6-mm1.orig/arch/i386/Kconfig
>  +++ linux-2.6.16-rc6-mm1/arch/i386/Kconfig
>  @@ -760,7 +760,7 @@ config PHYSICAL_START
>   
>   config HOTPLUG_CPU
>   	bool "Support for hot-pluggable CPUs (EXPERIMENTAL)"
>  -	depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER
>  +	depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER && !X86_PC
>   	---help---
>   	  Say Y here to experiment with turning CPUs off and on.  CPUs
>   	  can be controlled through /sys/devices/system/cpu.

Still seems wrong.  People _do_ use HOTPLUG_CPU on X86_PCs so they can get
software suspend.  The number of people who do this is probably 100000x
the number of people who have physically hotpluggable CPUs.  And I don't
think we can churn their config requirements this much so late in the game.

So for now I suggest we're best off simply killing the printk (or doing
something smarter, like comparing cpu_online_map with cpu_possible_map
(which isn't right)).

Longer term, it appears that we need to do some Kconfig and C work to
separate out the HOTPLUG_CPU infrastructure which swsusp needs from actual
CPU hotplugging.

What _is_ this IPI problem anyway?  Can't send point-to-point IPIs to
offlined CPUs?  (Don't do that then?) Or do broadcast IPIs go wrong, or
what?

And does it affect pretend-x86-hotplug, or is it only affecting real hotplug?

Thanks.


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-13 22:22                     ` Andrew Morton
@ 2006-03-13 23:04                       ` Ashok Raj
  2006-03-15  5:44                         ` Nathan Lynch
  0 siblings, 1 reply; 25+ messages in thread
From: Ashok Raj @ 2006-03-13 23:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ashok Raj, olel, venkatesh.pallipadi, linux-kernel,
	suresh.b.siddha, rajesh.shah, ak

On Mon, Mar 13, 2006 at 02:22:23PM -0800, Andrew Morton wrote:
> >   config HOTPLUG_CPU
> >   	bool "Support for hot-pluggable CPUs (EXPERIMENTAL)"
> >  -	depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER
> >  +	depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER && !X86_PC
> >   	---help---
> >   	  Say Y here to experiment with turning CPUs off and on.  CPUs
> >   	  can be controlled through /sys/devices/system/cpu.
> 
> Longer term, it appears that we need to do some Kconfig and C work to
> separate out the HOTPLUG_CPU infrastructure which swsusp needs from actual
> CPU hotplugging.

The needs are not any different. Both swsusp and cpu hotplug require
logical cpu offlining, which is what CONFIG_HOTPLUG_CPU does.

Physical cpu hotplug is enabled by CONFIG_ACPI_HOTPLUG_CPU.

> 
> What _is_ this IPI problem anyway?  Can't send point-to-point IPIs to
> offlined CPUs?  (Don't do that then?) Or do broadcast IPIs go wrong, or
> what?

It's not the point-to-point IPIs. We send those only to wake a CPU, and that is
always done in flat physical mode.

When we do smp_call_function() under X86_PC we use logical flat mode.
This sends a broadcast IPI by using a shortcut message. This is bad, since
the offline cpu may also receive it and process it just when we bring the cpu
online.

send_IPI_allbutself() and send_IPI_all() versions that use the shortcut
values are the ones to avoid. 
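
A toy userspace model may help show why the shortcut is the problem. This is
only a sketch under stated assumptions: CPUs are bits in a 32-bit mask, the
type cpumask_model_t and all three function names are invented for
illustration, and the "shortcut" is modelled as reaching every CPU that is
physically present, since the broadcast itself has no notion of the kernel's
online map.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t cpumask_model_t;	/* bit n set = CPU n (n < 32) */

/*
 * Model of the "all but self" shortcut: every present CPU except the
 * sender receives the IPI, whether or not the kernel considers it online.
 */
static cpumask_model_t shortcut_allbutself(cpumask_model_t present,
					   unsigned int self)
{
	return present & ~((cpumask_model_t)1 << self);
}

/*
 * Model of mask-based delivery: only CPUs in the online mask (minus the
 * sender) are targeted.
 */
static cpumask_model_t mask_allbutself(cpumask_model_t online,
				       unsigned int self)
{
	return online & ~((cpumask_model_t)1 << self);
}

/* CPUs that get the IPI via the shortcut although they are offline. */
static cpumask_model_t stray_targets(cpumask_model_t present,
				     cpumask_model_t online,
				     unsigned int self)
{
	return shortcut_allbutself(present, self) &
	       ~mask_allbutself(online, self);
}
```

With four present CPUs (mask 0xF), CPU 3 offline (online mask 0x7) and CPU 0
sending, the shortcut reaches 0xE while the mask-based path reaches 0x6; the
stray target is CPU 3 (0x8), the offline one.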

> 
> And does it affect pretend-x86-hotplug, or is it only affecting real hotplug?
> 
It's no longer pretend-x86. In the past we used to put the cpu in idle();
now we put the cpu in halt and bring it back with another startup IPI, just like
the boot sequence, both for x86 and x86_64.

-- 
Cheers,
Ashok Raj
- Open Source Technology Center


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-13 23:04                       ` Ashok Raj
@ 2006-03-15  5:44                         ` Nathan Lynch
  2006-03-15  6:18                           ` Shaohua Li
  0 siblings, 1 reply; 25+ messages in thread
From: Nathan Lynch @ 2006-03-15  5:44 UTC (permalink / raw)
  To: Ashok Raj
  Cc: Andrew Morton, olel, venkatesh.pallipadi, linux-kernel,
	suresh.b.siddha, rajesh.shah, ak

Ashok Raj wrote:
> On Mon, Mar 13, 2006 at 02:22:23PM -0800, Andrew Morton wrote:
> > 
> > And does it affect pretend-x86-hotplug, or is it only affecting real hotplug?
> > 
> It's no longer pretend-x86. In the past we used to put the cpu in idle();
> now we put the cpu in halt and bring it back with another startup IPI, just like
> the boot sequence, both for x86 and x86_64.

That's actually broken since 2.6.14 (at least on my P3 box); please
see:

Subject: i386 cpu hotplug bug - instant reboot when onlining secondary

http://lkml.org/lkml/2006/2/19/186




* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-15  5:44                         ` Nathan Lynch
@ 2006-03-15  6:18                           ` Shaohua Li
  2006-03-15  7:31                             ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: Shaohua Li @ 2006-03-15  6:18 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: Raj, Ashok, Andrew Morton, olel, Pallipadi, Venkatesh,
	linux-kernel, Siddha, Suresh B, Shah, Rajesh, ak

On Wed, 2006-03-15 at 13:44 +0800, Nathan Lynch wrote:
> Ashok Raj wrote: 
> > On Mon, Mar 13, 2006 at 02:22:23PM -0800, Andrew Morton wrote: 
> > >  
> > > And does it affect pretend-x86-hotplug, or is it only affecting real hotplug?
> > >
> > It's no longer pretend-x86. In the past we used to put the cpu in idle();
> > now we put the cpu in halt and bring it back with another startup IPI, just like
> > the boot sequence, both for x86 and x86_64.
> 
> That's actually broken since 2.6.14 (at least on my P3 box); please 
> see:
> 
> Subject: i386 cpu hotplug bug - instant reboot when onlining secondary
> 
> http://lkml.org/lkml/2006/2/19/186
Works for me. But I saw a warning.

---

 linux-2.6.15-root/arch/i386/kernel/cpu/common.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -puN arch/i386/kernel/cpu/common.c~cpuhp arch/i386/kernel/cpu/common.c
--- linux-2.6.15/arch/i386/kernel/cpu/common.c~cpuhp	2006-03-14 12:13:43.000000000 +0800
+++ linux-2.6.15-root/arch/i386/kernel/cpu/common.c	2006-03-14 12:14:12.000000000 +0800
@@ -605,7 +605,7 @@ void __devinit cpu_init(void)
 		/* alloc_bootmem_pages panics on failure, so no check */
 		memset(gdt, 0, PAGE_SIZE);
 	} else {
-		gdt = (struct desc_struct *)get_zeroed_page(GFP_KERNEL);
+		gdt = (struct desc_struct *)get_zeroed_page(GFP_ATOMIC);
 		if (unlikely(!gdt)) {
 			printk(KERN_CRIT "CPU%d failed to allocate GDT\n", cpu);
 			for (;;)
_




* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-15  6:18                           ` Shaohua Li
@ 2006-03-15  7:31                             ` Andrew Morton
  2006-03-15  9:37                               ` [PATCH] No need to protect current->group_info in sys_getgroups(), in_group_p() and in_egroup_p() Eric Dumazet
  2006-03-15 18:09                               ` More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6 Ashok Raj
  0 siblings, 2 replies; 25+ messages in thread
From: Andrew Morton @ 2006-03-15  7:31 UTC (permalink / raw)
  To: Shaohua Li
  Cc: ntl, ashok.raj, olel, venkatesh.pallipadi, linux-kernel,
	suresh.b.siddha, rajesh.shah, ak

Shaohua Li <shaohua.li@intel.com> wrote:
>
> On Wed, 2006-03-15 at 13:44 +0800, Nathan Lynch wrote:
>  > Ashok Raj wrote: 
>  > > On Mon, Mar 13, 2006 at 02:22:23PM -0800, Andrew Morton wrote: 
>  > > >  
>  > > > And does it affect pretend-x86-hotplug, or is it only affecting real hotplug?
>  > > >
>  > > It's no longer pretend-x86. In the past we used to put the cpu in idle();
>  > > now we put the cpu in halt and bring it back with another startup IPI, just like
>  > > the boot sequence, both for x86 and x86_64.
>  > 
>  > That's actually broken since 2.6.14 (at least on my P3 box); please 
>  > see:
>  > 
>  > Subject: i386 cpu hotplug bug - instant reboot when onlining secondary
>  > 
>  > http://lkml.org/lkml/2006/2/19/186
>  Works for me. But I saw a warning.

Guys, will you please stop being so cryptic?  What worked for you?  What
warning?  wtf is going on?  Who owns this problem, whatever it is?
<head spins>

>   linux-2.6.15-root/arch/i386/kernel/cpu/common.c |    2 +-
>   1 files changed, 1 insertion(+), 1 deletion(-)
> 
>  diff -puN arch/i386/kernel/cpu/common.c~cpuhp arch/i386/kernel/cpu/common.c
>  --- linux-2.6.15/arch/i386/kernel/cpu/common.c~cpuhp	2006-03-14 12:13:43.000000000 +0800
>  +++ linux-2.6.15-root/arch/i386/kernel/cpu/common.c	2006-03-14 12:14:12.000000000 +0800
>  @@ -605,7 +605,7 @@ void __devinit cpu_init(void)
>   		/* alloc_bootmem_pages panics on failure, so no check */
>   		memset(gdt, 0, PAGE_SIZE);
>   	} else {
>  -		gdt = (struct desc_struct *)get_zeroed_page(GFP_KERNEL);
>  +		gdt = (struct desc_struct *)get_zeroed_page(GFP_ATOMIC);

That would be rather a sad thing to have to do.  OK if it's during initial
bootup, less OK if it's during CPU hot-add.


* [PATCH] No need to protect current->group_info in sys_getgroups(), in_group_p() and in_egroup_p()
  2006-03-15  7:31                             ` Andrew Morton
@ 2006-03-15  9:37                               ` Eric Dumazet
  2006-03-20 19:09                                 ` [PATCH] Use unsigned int types for a faster bsearch Eric Dumazet
  2006-03-15 18:09                               ` More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6 Ashok Raj
  1 sibling, 1 reply; 25+ messages in thread
From: Eric Dumazet @ 2006-03-15  9:37 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 887 bytes --]


While doing some benchmarks of an Apache/PHP SMP server, I noticed high 
oprofile numbers in in_group_p() and _atomic_dec_and_lock().

rank  percent
  1     4.8911 % __link_path_walk
  2     4.8503 % __d_lookup
*3     4.2911 % _atomic_dec_and_lock
  4     3.9307 % __copy_to_user_ll
  5     4.9004 % sysenter_past_esp
*6     3.3248 % in_group_p

It appears that in_group_p() does an unnecessary

get_group_info(current->group_info); /* atomic_inc() */
  ... /* access current->group_info */
put_group_info(current->group_info); /* _atomic_dec_and_lock */


It is not necessary to do this, because the current task holds a reference on 
its own group_info, and this reference cannot change during the lookup.

This patch deletes the get_group_info()/put_group_info() pair from 
sys_getgroups(), in_group_p() and in_egroup_p() functions.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
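
The reasoning above can be illustrated with a small userspace model. It is a
sketch only, written with C11 atomics; struct group_info_model, struct
task_model and the three functions are invented names that mimic the shape of
the kernel code but are not the kernel code. The point is that the task already
owns one reference for as long as it runs, so a lookup done by the task on its
own group_info can simply borrow the pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of a reference-counted group list. */
struct group_info_model {
	atomic_int usage;		/* reference count */
	size_t ngroups;
	const unsigned int *gids;
};

/* Hypothetical model of a task that owns one reference. */
struct task_model {
	struct group_info_model *group_info;
};

/* Plain linear search; returns 1 if grp is in the list, else 0. */
static int search_model(const struct group_info_model *gi, unsigned int grp)
{
	size_t i;

	for (i = 0; i < gi->ngroups; i++)
		if (gi->gids[i] == grp)
			return 1;
	return 0;
}

/* Old shape: take and drop an extra reference around the lookup. */
static int in_group_with_ref(struct task_model *t, unsigned int grp)
{
	struct group_info_model *gi = t->group_info;
	int found;

	atomic_fetch_add(&gi->usage, 1);	/* models get_group_info() */
	found = search_model(gi, grp);
	atomic_fetch_sub(&gi->usage, 1);	/* models put_group_info() */
	return found;
}

/* New shape: rely on the reference the task itself already holds. */
static int in_group_borrowed(const struct task_model *t, unsigned int grp)
{
	return search_model(t->group_info, grp);
}
```

Both variants return the same answer; the second one just avoids two atomic
read-modify-write operations per call.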


[-- Attachment #2: kernel_sys.patch --]
[-- Type: text/plain, Size: 884 bytes --]

--- a/kernel/sys.c	2006-03-15 10:14:37.000000000 +0100
+++ b/kernel/sys.c	2006-03-15 10:15:55.000000000 +0100
@@ -1433,7 +1433,6 @@
 		return -EINVAL;
 
 	/* no need to grab task_lock here; it cannot change */
-	get_group_info(current->group_info);
 	i = current->group_info->ngroups;
 	if (gidsetsize) {
 		if (i > gidsetsize) {
@@ -1446,7 +1445,6 @@
 		}
 	}
 out:
-	put_group_info(current->group_info);
 	return i;
 }
 
@@ -1487,9 +1485,7 @@
 {
 	int retval = 1;
 	if (grp != current->fsgid) {
-		get_group_info(current->group_info);
 		retval = groups_search(current->group_info, grp);
-		put_group_info(current->group_info);
 	}
 	return retval;
 }
@@ -1500,9 +1496,7 @@
 {
 	int retval = 1;
 	if (grp != current->egid) {
-		get_group_info(current->group_info);
 		retval = groups_search(current->group_info, grp);
-		put_group_info(current->group_info);
 	}
 	return retval;
 }


* [PATCH] Use unsigned int types for a faster bsearch
  2006-03-15  9:37                               ` [PATCH] No need to protect current->group_info in sys_getgroups(), in_group_p() and in_egroup_p() Eric Dumazet
@ 2006-03-20 19:09                                 ` Eric Dumazet
  2006-03-22  5:06                                   ` [PATCH] Use __read_mostly on some hot fs variables Eric Dumazet
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Dumazet @ 2006-03-20 19:09 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 574 bytes --]

This patch avoids arithmetic on 'signed' types, which is slower than
arithmetic on 'unsigned' types. This saves space and cpu cycles.

size of kernel/sys.o before the patch (gcc-3.4.5)

    text    data     bss     dec     hex filename
   10924     252       4   11180    2bac kernel/sys.o

size of kernel/sys.o after the patch
    text    data     bss     dec     hex filename
   10903     252       4   11159    2b97 kernel/sys.o

I noticed that gcc-4.1.0 (from Fedora Core 5) even uses idiv instruction for 
(a+b)/2 if a and b are signed.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
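
For illustration, here is a standalone version of the same kind of search over
a sorted array with unsigned indices. It is a sketch, not the kernel function:
groups_search_model() is a made-up name, it takes a plain array instead of a
struct group_info, and it compares with < and > rather than subtracting as the
real code does. With unsigned operands, (left + right) / 2 can be a plain
logical shift; signed division by 2 must round toward zero, so the compiler has
to emit extra fix-up code (or a real division).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Binary search for grp in the sorted array gids[0..ngroups-1].
 * Returns 1 if found, 0 otherwise.
 */
static int groups_search_model(const unsigned int *gids,
			       unsigned int ngroups, unsigned int grp)
{
	unsigned int left = 0;
	unsigned int right = ngroups;	/* half-open range [left, right) */

	if (!gids)
		return 0;

	while (left < right) {
		/* unsigned, so the division is a simple right shift */
		unsigned int mid = (left + right) / 2;

		if (grp > gids[mid])
			left = mid + 1;
		else if (grp < gids[mid])
			right = mid;	/* never mid - 1, so no underflow */
		else
			return 1;
	}
	return 0;
}
```

Because the range is half-open and right is only ever set to mid, the indices
never need to go below zero, which is what makes the switch to unsigned safe.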


[-- Attachment #2: groups_search.patch --]
[-- Type: text/plain, Size: 540 bytes --]

--- a/kernel/sys.c	2006-03-20 18:42:41.000000000 +0100
+++ b/kernel/sys.c	2006-03-20 19:00:43.000000000 +0100
@@ -1375,7 +1375,7 @@
 /* a simple bsearch */
 int groups_search(struct group_info *group_info, gid_t grp)
 {
-	int left, right;
+	unsigned int left, right;
 
 	if (!group_info)
 		return 0;
@@ -1383,7 +1383,7 @@
 	left = 0;
 	right = group_info->ngroups;
 	while (left < right) {
-		int mid = (left+right)/2;
+		unsigned int mid = (left+right)/2;
 		int cmp = grp - GROUP_AT(group_info, mid);
 		if (cmp > 0)
 			left = mid + 1;


* [PATCH] Use __read_mostly on some hot fs variables
  2006-03-20 19:09                                 ` [PATCH] Use unsigned int types for a faster bsearch Eric Dumazet
@ 2006-03-22  5:06                                   ` Eric Dumazet
  2006-03-22  5:15                                     ` Nick Piggin
  2006-03-22  6:23                                     ` [RFC, PATCH] avoid some atomics in open()/close() for monothreaded processes Eric Dumazet
  0 siblings, 2 replies; 25+ messages in thread
From: Eric Dumazet @ 2006-03-22  5:06 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 773 bytes --]

While hunting with oprofile on an SMP platform, I discovered that dentry lookups
were slowed down because d_hash_mask, d_hash_shift and dentry_hashtable were in a
cache line that also contained inodes_stat. So each time inodes_stat is changed by
a cpu, other cpus have to refill their cache line.

This patch moves some variables to the __read_mostly section, in order to 
avoid false sharing. RCU dentry lookups can go full speed.

Before someone asks, it is valid to declare a pointer as 'read mostly', even 
if the data pointed to by the pointer is heavily modified. Hash table pointers 
and kmem_cache pointers are set up at boot time, so they are perfect candidates 
for the 'read_mostly' section. The same applies to 'struct vfsmount *'.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
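
The layout effect can be sketched in userspace with two structs. This is an
illustration only: the kernel's __read_mostly places variables in a separate
data section rather than aligning them, the 64-byte line size is an assumption,
and every name below (CACHE_LINE_MODEL, struct packed_together, struct
separated, same_line()) is invented for the example. Offsets are taken relative
to a struct start that is itself assumed to be line-aligned.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE_MODEL 64	/* assumed cache line size in bytes */

/* Read-mostly lookup fields and a frequently written counter, adjacent. */
struct packed_together {
	unsigned int hash_mask;
	unsigned int hash_shift;
	void *hashtable;
	long stat_counter;	/* written often: dirties the shared line */
};

/* Same fields, but the hot counter is pushed onto its own line. */
struct separated {
	unsigned int hash_mask;
	unsigned int hash_shift;
	void *hashtable;
	alignas(CACHE_LINE_MODEL) long stat_counter;
};

/* 1 if two byte offsets fall into the same (modelled) cache line. */
static int same_line(size_t a, size_t b)
{
	return a / CACHE_LINE_MODEL == b / CACHE_LINE_MODEL;
}
```

In the first struct a write to stat_counter invalidates the line holding
hash_mask on every other CPU; in the second it does not, which is the effect
the patch is after for the dentry hash variables.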



[-- Attachment #2: fs_readmostly.patch --]
[-- Type: text/plain, Size: 7330 bytes --]

--- a/fs/dcache.c	2006-03-21 13:48:13.000000000 +0100
+++ b/fs/dcache.c	2006-03-21 13:55:00.000000000 +0100
@@ -36,7 +36,7 @@
 
 /* #define DCACHE_DEBUG 1 */
 
-int sysctl_vfs_cache_pressure = 100;
+int sysctl_vfs_cache_pressure __read_mostly = 100;
 EXPORT_SYMBOL_GPL(sysctl_vfs_cache_pressure);
 
  __cacheline_aligned_in_smp DEFINE_SPINLOCK(dcache_lock);
@@ -44,7 +44,7 @@
 
 EXPORT_SYMBOL(dcache_lock);
 
-static kmem_cache_t *dentry_cache; 
+static kmem_cache_t *dentry_cache __read_mostly;
 
 #define DNAME_INLINE_LEN (sizeof(struct dentry)-offsetof(struct dentry,d_iname))
 
@@ -59,9 +59,9 @@
 #define D_HASHBITS     d_hash_shift
 #define D_HASHMASK     d_hash_mask
 
-static unsigned int d_hash_mask;
-static unsigned int d_hash_shift;
-static struct hlist_head *dentry_hashtable;
+static unsigned int d_hash_mask __read_mostly;
+static unsigned int d_hash_shift __read_mostly;
+static struct hlist_head *dentry_hashtable __read_mostly;
 static LIST_HEAD(dentry_unused);
 
 /* Statistics gathering. */
@@ -1706,10 +1706,10 @@
 }
 
 /* SLAB cache for __getname() consumers */
-kmem_cache_t *names_cachep;
+kmem_cache_t *names_cachep __read_mostly;
 
 /* SLAB cache for file structures */
-kmem_cache_t *filp_cachep;
+kmem_cache_t *filp_cachep __read_mostly;
 
 EXPORT_SYMBOL(d_genocide);
 
--- a/fs/inode.c	2006-03-21 13:50:19.000000000 +0100
+++ b/fs/inode.c	2006-03-21 13:54:39.000000000 +0100
@@ -56,8 +56,8 @@
 #define I_HASHBITS	i_hash_shift
 #define I_HASHMASK	i_hash_mask
 
-static unsigned int i_hash_mask;
-static unsigned int i_hash_shift;
+static unsigned int i_hash_mask __read_mostly;
+static unsigned int i_hash_shift __read_mostly;
 
 /*
  * Each inode can be on two separate lists. One is
@@ -73,7 +73,7 @@
 
 LIST_HEAD(inode_in_use);
 LIST_HEAD(inode_unused);
-static struct hlist_head *inode_hashtable;
+static struct hlist_head *inode_hashtable __read_mostly;
 
 /*
  * A simple spinlock to protect the list manipulations.
@@ -98,7 +98,7 @@
  */
 struct inodes_stat_t inodes_stat;
 
-static kmem_cache_t * inode_cachep;
+static kmem_cache_t * inode_cachep __read_mostly;
 
 static struct inode *alloc_inode(struct super_block *sb)
 {
--- a/fs/namespace.c	2006-03-22 05:20:33.000000000 +0100
+++ b/fs/namespace.c	2006-03-22 05:23:40.000000000 +0100
@@ -43,9 +43,9 @@
 
 static int event;
 
-static struct list_head *mount_hashtable;
+static struct list_head *mount_hashtable __read_mostly;
 static int hash_mask __read_mostly, hash_bits __read_mostly;
-static kmem_cache_t *mnt_cache;
+static kmem_cache_t *mnt_cache __read_mostly;
 static struct rw_semaphore namespace_sem;
 
 /* /sys/fs */
--- a/fs/dcookies.c	2006-03-22 05:35:46.000000000 +0100
+++ b/fs/dcookies.c	2006-03-22 05:36:55.000000000 +0100
@@ -37,9 +37,9 @@
 
 static LIST_HEAD(dcookie_users);
 static DECLARE_MUTEX(dcookie_sem);
-static kmem_cache_t * dcookie_cache;
-static struct list_head * dcookie_hashtable;
-static size_t hash_size;
+static kmem_cache_t * dcookie_cache __read_mostly;
+static struct list_head * dcookie_hashtable __read_mostly;
+static size_t hash_size __read_mostly;
 
 static inline int is_live(void)
 {
--- a/fs/fcntl.c	2006-03-22 05:37:40.000000000 +0100
+++ b/fs/fcntl.c	2006-03-22 05:43:49.000000000 +0100
@@ -413,7 +413,7 @@
 
 /* Table to convert sigio signal codes into poll band bitmaps */
 
-static long band_table[NSIGPOLL] = {
+static const long band_table[NSIGPOLL] = {
 	POLLIN | POLLRDNORM,			/* POLL_IN */
 	POLLOUT | POLLWRNORM | POLLWRBAND,	/* POLL_OUT */
 	POLLIN | POLLRDNORM | POLLMSG,		/* POLL_MSG */
@@ -532,7 +532,7 @@
 }
 
 static DEFINE_RWLOCK(fasync_lock);
-static kmem_cache_t *fasync_cache;
+static kmem_cache_t *fasync_cache __read_mostly;
 
 /*
  * fasync_helper() is used by some character device drivers (mainly mice)
--- a/fs/eventpoll.c	2006-03-22 05:39:06.000000000 +0100
+++ b/fs/eventpoll.c	2006-03-22 05:43:49.000000000 +0100
@@ -280,13 +280,13 @@
 static struct poll_safewake psw;
 
 /* Slab cache used to allocate "struct epitem" */
-static kmem_cache_t *epi_cache;
+static kmem_cache_t *epi_cache __read_mostly;
 
 /* Slab cache used to allocate "struct eppoll_entry" */
-static kmem_cache_t *pwq_cache;
+static kmem_cache_t *pwq_cache __read_mostly;
 
 /* Virtual fs used to allocate inodes for eventpoll files */
-static struct vfsmount *eventpoll_mnt;
+static struct vfsmount *eventpoll_mnt __read_mostly;
 
 /* File callbacks that implement the eventpoll file behaviour */
 static struct file_operations eventpoll_fops = {
--- a/fs/inotify.c	2006-03-22 05:40:44.000000000 +0100
+++ b/fs/inotify.c	2006-03-22 05:43:49.000000000 +0100
@@ -40,15 +40,15 @@
 static atomic_t inotify_cookie;
 static atomic_t inotify_watches;
 
-static kmem_cache_t *watch_cachep;
-static kmem_cache_t *event_cachep;
+static kmem_cache_t *watch_cachep __read_mostly;
+static kmem_cache_t *event_cachep __read_mostly;
 
-static struct vfsmount *inotify_mnt;
+static struct vfsmount *inotify_mnt __read_mostly;
 
 /* these are configurable via /proc/sys/fs/inotify/ */
-int inotify_max_user_instances;
-int inotify_max_user_watches;
-int inotify_max_queued_events;
+int inotify_max_user_instances __read_mostly;
+int inotify_max_user_watches __read_mostly;
+int inotify_max_queued_events __read_mostly;
 
 /*
  * Lock ordering:
--- a/fs/locks.c	2006-03-22 05:43:34.000000000 +0100
+++ b/fs/locks.c	2006-03-22 05:43:49.000000000 +0100
@@ -145,7 +145,7 @@
 
 static LIST_HEAD(blocked_list);
 
-static kmem_cache_t *filelock_cache;
+static kmem_cache_t *filelock_cache __read_mostly;
 
 /* Allocate an empty lock structure. */
 static struct file_lock *locks_alloc_lock(void)
--- a/fs/bio.c	2006-03-22 05:44:20.000000000 +0100
+++ b/fs/bio.c	2006-03-22 05:50:37.000000000 +0100
@@ -29,7 +29,7 @@
 
 #define BIO_POOL_SIZE 256
 
-static kmem_cache_t *bio_slab;
+static kmem_cache_t *bio_slab __read_mostly;
 
 #define BIOVEC_NR_POOLS 6
 
@@ -38,7 +38,7 @@
  * basically we just need to survive
  */
 #define BIO_SPLIT_ENTRIES 8	
-mempool_t *bio_split_pool;
+mempool_t *bio_split_pool __read_mostly;
 
 struct biovec_slab {
 	int nr_vecs;
--- a/fs/dnotify.c	2006-03-22 05:46:33.000000000 +0100
+++ b/fs/dnotify.c	2006-03-22 05:50:37.000000000 +0100
@@ -21,9 +21,9 @@
 #include <linux/spinlock.h>
 #include <linux/slab.h>
 
-int dir_notify_enable = 1;
+int dir_notify_enable __read_mostly = 1;
 
-static kmem_cache_t *dn_cache;
+static kmem_cache_t *dn_cache __read_mostly;
 
 static void redo_inode_mask(struct inode *inode)
 {
--- a/fs/block_dev.c	2006-03-22 05:48:29.000000000 +0100
+++ b/fs/block_dev.c	2006-03-22 05:50:37.000000000 +0100
@@ -238,7 +238,7 @@
  */
 
 static  __cacheline_aligned_in_smp DEFINE_SPINLOCK(bdev_lock);
-static kmem_cache_t * bdev_cachep;
+static kmem_cache_t * bdev_cachep __read_mostly;
 
 static struct inode *bdev_alloc_inode(struct super_block *sb)
 {
@@ -312,7 +312,7 @@
 	.kill_sb	= kill_anon_super,
 };
 
-static struct vfsmount *bd_mnt;
+static struct vfsmount *bd_mnt __read_mostly;
 struct super_block *blockdev_superblock;
 
 void __init bdev_cache_init(void)
--- a/fs/pipe.c	2006-03-22 05:51:16.000000000 +0100
+++ b/fs/pipe.c	2006-03-22 05:54:11.000000000 +0100
@@ -676,7 +676,7 @@
 	return NULL;
 }
 
-static struct vfsmount *pipe_mnt;
+static struct vfsmount *pipe_mnt __read_mostly;
 static int pipefs_delete_dentry(struct dentry *dentry)
 {
 	return 1;


* Re: [PATCH] Use __read_mostly on some hot fs variables
  2006-03-22  5:06                                   ` [PATCH] Use __read_mostly on some hot fs variables Eric Dumazet
@ 2006-03-22  5:15                                     ` Nick Piggin
  2006-03-22  6:23                                     ` [RFC, PATCH] avoid some atomics in open()/close() for monothreaded processes Eric Dumazet
  1 sibling, 0 replies; 25+ messages in thread
From: Nick Piggin @ 2006-03-22  5:15 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Andrew Morton, linux-kernel

Eric Dumazet wrote:

>
> Before someone asks, it is valid to declare a pointer as 'read 
> mostly', even if the data pointed to by the pointer is heavily modified. 
> Hash table pointers and kmem_cache pointers are set up at boot time, so 
> they are perfect candidates for the 'read_mostly' section. The same applies 
> to 'struct vfsmount *'.
>

Yes... why wouldn't it be, if the variable is only written to once? This
is _exactly_ what the __read_mostly section is for, isn't it?

Patch looks good though.

Nick
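
The point made above (the pointer is written once, while the data behind it changes constantly) can be sketched in user space. This is only an illustration, assuming GCC or Clang on an ELF target; the macro below is a stand-in modelled on the kernel annotation, and `struct item`, `pool`, `item_table`, `table_init()` and `table_hit()` are hypothetical names, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * User-space stand-in for the kernel annotation (assumption: an ELF
 * toolchain). It only asks the compiler to place the variable in a
 * separate named section, away from frequently written data.
 */
#define __read_mostly __attribute__((__section__(".data.read_mostly")))

struct item {
	int hits;
};

/* The pointed-to data is modified all the time... */
static struct item pool[4];

/* ...but the pointer itself is written exactly once, at init time. */
static struct item *item_table __read_mostly;

/* One-time setup, comparable to a boot-time init call. */
static void table_init(void)
{
	item_table = pool;
}

/* Hot path: only reads the pointer, then updates what it points to. */
static int table_hit(size_t idx)
{
	assert(item_table != NULL);
	return ++item_table[idx].hits;
}
```

Whether grouping such variables actually helps depends on what else shares their cache lines, so treat this as a layout hint rather than a guaranteed gain.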


* [RFC, PATCH] avoid some atomics in open()/close() for monothreaded processes
  2006-03-22  5:06                                   ` [PATCH] Use __read_mostly on some hot fs variables Eric Dumazet
  2006-03-22  5:15                                     ` Nick Piggin
@ 2006-03-22  6:23                                     ` Eric Dumazet
  2006-03-22  6:41                                       ` Andrew Morton
  1 sibling, 1 reply; 25+ messages in thread
From: Eric Dumazet @ 2006-03-22  6:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 812 bytes --]

Goal: avoid some locking/unlocking of 'struct files_struct'->file_lock for
mono-threaded processes.

We define a files_multithreaded() function:

static inline int files_multithreaded(const struct files_struct *files)
{
        return sizeof(files->file_lock) > 0 && atomic_read(&files->count) > 1;
}

On a plain UP kernel (neither preemptible nor built with spinlock debugging),
the spinlock is an empty structure, so this function is a constant 0 and gcc
can discard the locking code.

On preemptible, SMP, or spinlock-debug kernels, the result is true only if
the reference count is greater than 1 (a multi-threaded process, or
/proc/{pid}/fd is being examined by another task).

This patch increases kernel size, but the pros outweigh the cons: as Linus
himself said, we should improve the performance of mono-threaded tasks.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

[-- Attachment #2: files_multithreaded.patch --]
[-- Type: text/plain, Size: 1790 bytes --]

--- a/include/linux/file.h	2006-03-22 06:23:02.000000000 +0100
+++ b/include/linux/file.h	2006-03-22 07:11:08.000000000 +0100
@@ -42,6 +42,11 @@
 	spinlock_t file_lock;     /* Protects concurrent writers.  Nests inside tsk->alloc_lock */
 };
 
+static inline int files_multithreaded(const struct files_struct *files)
+{
+	return sizeof(files->file_lock) > 0 && atomic_read(&files->count) > 1;
+}
+
 #define files_fdtable(files) (rcu_dereference((files)->fdt))
 
 extern void FASTCALL(__fput(struct file *));
--- a/fs/open.c.orig	2006-03-22 06:24:34.000000000 +0100
+++ b/fs/open.c	2006-03-22 06:30:54.000000000 +0100
@@ -1050,11 +1050,17 @@
 {
 	struct files_struct *files = current->files;
 	struct fdtable *fdt;
-	spin_lock(&files->file_lock);
+	int fl_locked = 0;
+
+	if (files_multithreaded(files)) {
+		spin_lock(&files->file_lock);
+		fl_locked = 1;
+	}
 	fdt = files_fdtable(files);
 	BUG_ON(fdt->fd[fd] != NULL);
 	rcu_assign_pointer(fdt->fd[fd], file);
-	spin_unlock(&files->file_lock);
+	if (fl_locked)
+		spin_unlock(&files->file_lock);
 }
 
 EXPORT_SYMBOL(fd_install);
@@ -1147,8 +1153,12 @@
 	struct file * filp;
 	struct files_struct *files = current->files;
 	struct fdtable *fdt;
+	int fl_locked = 0;
 
-	spin_lock(&files->file_lock);
+	if (files_multithreaded(files)) {
+		spin_lock(&files->file_lock);
+		fl_locked = 1;
+	}
 	fdt = files_fdtable(files);
 	if (fd >= fdt->max_fds)
 		goto out_unlock;
@@ -1158,11 +1168,13 @@
 	rcu_assign_pointer(fdt->fd[fd], NULL);
 	FD_CLR(fd, fdt->close_on_exec);
 	__put_unused_fd(files, fd);
-	spin_unlock(&files->file_lock);
+	if (fl_locked)
+		spin_unlock(&files->file_lock);
 	return filp_close(filp, files);
 
 out_unlock:
-	spin_unlock(&files->file_lock);
+	if (fl_locked)
+		spin_unlock(&files->file_lock);
 	return -EBADF;
 }
 


* Re: [RFC, PATCH] avoid some atomics in open()/close() for monothreaded processes
  2006-03-22  6:23                                     ` [RFC, PATCH] avoid some atomics in open()/close() for monothreaded processes Eric Dumazet
@ 2006-03-22  6:41                                       ` Andrew Morton
  2006-03-22  6:59                                         ` Eric Dumazet
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-03-22  6:41 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel

Eric Dumazet <dada1@cosmosbay.com> wrote:
>
> Goal : Avoid some locking/unlocking 'struct files_struct'->file_lock for mono 
> threaded processes.
> 
> We define files_multithreaded() function .
> 
> static inline int files_multithreaded(const struct files_struct *files)
> {
>         return sizeof(files->file_lock) > 0 && atomic_read(&files->count) > 1;
> }

That's basically sizeof(spinlock_t).  That's architecture-dependent and
varies wildly according to the day of the week.

It _might_ work in all situations - probably you checked that.  But I still
wouldn't do it because it might break in the future.  Let's be explicit and
stick the appropriate ifdefs in there.

I'd also consider renaming it to files_shared() - processes are
multithreaded, not data structures.

Once you're done with that we should change fget_light() and fput_light() to
use this helper.  Separate patch.
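
A rough sketch of what the suggested helper might look like with explicit ifdefs in place of the sizeof() test. This is a user-space mock, not the resubmitted patch: `struct files_struct_mock` stands in for the kernel structure, a plain int replaces atomic_t, and the exact set of config symbols is an assumption based on the cases listed in the patch description (SMP, preemptible, spinlock debug).

```c
#include <assert.h>

/* Mock of the single field the helper needs; the kernel uses atomic_t. */
struct files_struct_mock {
	int count;
};

/*
 * Assumption: these are the configurations in which file_lock is a real
 * lock. Build with e.g. -DCONFIG_SMP to enable the runtime test.
 */
#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT) || \
    defined(CONFIG_DEBUG_SPINLOCK)
#define FILES_LOCK_IS_REAL 1
#else
#define FILES_LOCK_IS_REAL 0
#endif

/*
 * Returns nonzero when the files structure has more than one user.
 * Folds to a compile-time 0 when none of the symbols above is defined.
 */
static inline int files_shared(const struct files_struct_mock *files)
{
	return FILES_LOCK_IS_REAL && files->count > 1;
}
```

The name follows the files_shared() suggestion; keeping the condition in one macro means callers do not need their own ifdefs.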


* Re: [RFC, PATCH] avoid some atomics in open()/close() for monothreaded processes
  2006-03-22  6:41                                       ` Andrew Morton
@ 2006-03-22  6:59                                         ` Eric Dumazet
  2006-03-22  7:03                                           ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Dumazet @ 2006-03-22  6:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Andrew Morton wrote:
> Eric Dumazet <dada1@cosmosbay.com> wrote:
>> Goal : Avoid some locking/unlocking 'struct files_struct'->file_lock for mono 
>> threaded processes.
>>
>> We define files_multithreaded() function .
>>
>> static inline int files_multithreaded(const struct files_struct *files)
>> {
>>         return sizeof(files->file_lock) > 0 && atomic_read(&files->count) > 1;
>> }
> 
> That's basically sizeof(spinlock_t).  That's architecture-dependent and
> varies wildly according to the day of the week.

I used sizeof(files->file_lock) instead of sizeof(spinlock_t) because I found 
it more explicit, and it avoids ugly ifdefs.

> 
> It _might_ work in all situations - probably you checked that.  But I still
> wouldn't do it because it might break in the future.  Let's be explicit and
> stick the appropriate ifdefs in there.
> 
> I'd also consider renaming it to files_shared() - processes are
> multithreaded, not data structures.

Thanks for the feedback. I will redo the patch and test it on various 
platforms before resubmitting (including performance data :) )

> 
> Once you're done with that we should change fget_light() and fput_light() to
> use this helper.  Separate patch.

Hmm... this discussion is not relevant to fget_light(), unless I am mistaken.

Nowadays, that function doesn't take the spinlock, thanks to RCU.

Eric


* Re: [RFC, PATCH] avoid some atomics in open()/close() for monothreaded processes
  2006-03-22  6:59                                         ` Eric Dumazet
@ 2006-03-22  7:03                                           ` Andrew Morton
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Morton @ 2006-03-22  7:03 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel

Eric Dumazet <dada1@cosmosbay.com> wrote:
>
> > 
>  > Once you're done with that we should change fget_light() and fput_light() to
>  > use this helper.  Separate patch.
> 
>  Hmm... this discussion is not relevant to fget_light(), unless I am mistaken.

Take a look.  fget_light() uses essentially the same test as you do, for
the same reason.  So it's appropriate that fget_light() use this new helper
rather than open-coding it.
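
To illustrate the point about sharing the helper, here is a simplified user-space mock of an fget_light()-style lookup. It is not the kernel function: `struct file_mock`, `struct files_mock`, `fget_light_mock()` and the fixed table size are hypothetical, and plain ints replace the kernel's atomics and RCU. It only shows the same "is this files structure shared?" test deciding whether a reference has to be taken.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_FDS 8

struct file_mock {
	int f_count;			/* reference count */
};

struct files_mock {
	int count;			/* users of this table */
	struct file_mock *fd[MOCK_MAX_FDS];
};

/* The shared helper, reduced to the reference-count test. */
static int files_shared(const struct files_mock *files)
{
	return files->count > 1;
}

/*
 * Look up fd. When the table has a single user, nobody else can close
 * the file behind our back, so no reference is taken and *fput_needed
 * stays 0. Otherwise take a reference and tell the caller to drop it.
 */
static struct file_mock *fget_light_mock(struct files_mock *files,
					 unsigned int fd, int *fput_needed)
{
	struct file_mock *file;

	*fput_needed = 0;
	if (fd >= MOCK_MAX_FDS)
		return NULL;

	file = files->fd[fd];
	if (file && files_shared(files)) {
		file->f_count++;	/* an atomic operation in the kernel */
		*fput_needed = 1;
	}
	return file;
}
```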


* Re: More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6
  2006-03-15  7:31                             ` Andrew Morton
  2006-03-15  9:37                               ` [PATCH] No need to protect current->group_info in sys_getgroups(), in_group_p() and in_egroup_p() Eric Dumazet
@ 2006-03-15 18:09                               ` Ashok Raj
  1 sibling, 0 replies; 25+ messages in thread
From: Ashok Raj @ 2006-03-15 18:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Shaohua Li, ntl, ashok.raj, olel, venkatesh.pallipadi,
	linux-kernel, suresh.b.siddha, rajesh.shah, ak, zwane

On Tue, Mar 14, 2006 at 11:31:38PM -0800, Andrew Morton wrote:
> >  > see:
> >  > 
> >  > Subject: i386 cpu hotplug bug - instant reboot when onlining secondary
> >  > 
> >  > http://lkml.org/lkml/2006/2/19/186
> >  Works for me. But I saw a warning.
> 
> Guys, will you please stop being so cryptic?  What worked for you?  What
> warning?  wtf is going on?  Who owns this problem, whatever it is?

Nathan's problem is different; it is not related to this thread.

It appears that a PIII box had trouble bringing a CPU back online after it had
just been offlined. I am not able to reproduce it with the systems I have here.
I have tried a PIII box, and also an x86_64 system booting an i386 kernel,
and everything seems to work OK.

Zwane was attempting to trace Nathan's issue with some experimental patches,
but I don't think that has gotten very far yet.


> >  -		gdt = (struct desc_struct *)get_zeroed_page(GFP_KERNEL);
> >  +		gdt = (struct desc_struct *)get_zeroed_page(GFP_ATOMIC);
> 

It might help to post the actual warning, so we can understand why this
fix is necessary.

-- 
Cheers,
Ashok Raj
- Open Source Technology Center


Thread overview: 25+ messages
2006-03-12  2:04 More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6 Krzysztof Oledzki
2006-03-12  5:03 ` Andrew Morton
2006-03-12 11:04   ` Krzysztof Oledzki
2006-03-12 11:25     ` Andrew Morton
2006-03-12 13:05       ` Krzysztof Oledzki
2006-03-12 15:35         ` Venkatesh Pallipadi
2006-03-12 21:13           ` Krzysztof Oledzki
2006-03-12 22:30             ` Andrew Morton
2006-03-13 19:36               ` Ashok Raj
2006-03-13 19:51                 ` Andrew Morton
2006-03-13 20:05                   ` Ashok Raj
2006-03-13 22:22                     ` Andrew Morton
2006-03-13 23:04                       ` Ashok Raj
2006-03-15  5:44                         ` Nathan Lynch
2006-03-15  6:18                           ` Shaohua Li
2006-03-15  7:31                             ` Andrew Morton
2006-03-15  9:37                               ` [PATCH] No need to protect current->group_info in sys_getgroups(), in_group_p() and in_egroup_p() Eric Dumazet
2006-03-20 19:09                                 ` [PATCH] Use unsigned int types for a faster bsearch Eric Dumazet
2006-03-22  5:06                                   ` [PATCH] Use __read_mostly on some hot fs variables Eric Dumazet
2006-03-22  5:15                                     ` Nick Piggin
2006-03-22  6:23                                     ` [RFC, PATCH] avoid some atomics in open()/close() for monothreaded processes Eric Dumazet
2006-03-22  6:41                                       ` Andrew Morton
2006-03-22  6:59                                         ` Eric Dumazet
2006-03-22  7:03                                           ` Andrew Morton
2006-03-15 18:09                               ` More than 8 CPUs detected and CONFIG_X86_PC cannot handle it on 2.6.16-rc6 Ashok Raj
