public inbox for linux-kernel@vger.kernel.org
* Re: HELP: Is writeq an atomic operation??
       [not found] <0631C836DBF79F42B5A60C8C8D4E822901047B1D@NAMAIL2.ad.lsil.com>
@ 2008-05-02 22:32 ` David Miller
  2008-05-02 22:43   ` Roland Dreier
  0 siblings, 1 reply; 23+ messages in thread
From: David Miller @ 2008-05-02 22:32 UTC (permalink / raw)
  To: Eric.Moore; +Cc: linux-scsi, linux-kernel

From: "Moore, Eric" <Eric.Moore@lsi.com>
Date: Fri, 2 May 2008 16:19:49 -0600

> Is a 64bit write to MMIO registers an atomic operation when using the
> writeq API?  

The answer to this question is platform dependent.

On most 64-bit platforms, it is.  On some 32-bit ones, it is not.

This is not a SCSI layer question, so it belongs at minimum on
linux-kernel, which I've CC'd.


* HELP:  Is writeq an atomic operation??
@ 2008-05-02 22:40 Moore, Eric
  2008-05-02 22:46 ` Roland Dreier
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Moore, Eric @ 2008-05-02 22:40 UTC (permalink / raw)
  To: linux-kernel

Is a 64bit write to MMIO registers an atomic operation when using the
writeq API?  

My concern is whether 64-bit data sent via writeq can go out as two
32-bit writes.  If so, could another CPU be sending its data at the
same time?  That is, could CPU-A write its first 32 bits while CPU-B
writes its own 32 bits, before CPU-A has completed the full 64-bit
write in one shot?  If this can occur, is there an API I can use to
make sure the entire value is sent in one atomic operation?
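
The interleaving being asked about can be sketched with a small userspace
model.  Everything here is hypothetical: `struct fake_dev`, the helper
names, and the assumption that the device consumes the register when the
high half is written.  It only illustrates what a torn 64-bit write would
look like if the store really were split into two 32-bit halves.

```c
#include <stdint.h>

/* Hypothetical device model: one 64-bit register exposed as two 32-bit
 * halves.  Assumption: the device latches the full value when the high
 * half is written. */
struct fake_dev {
    uint32_t lo;
    uint32_t hi;
    uint64_t latched;   /* value the device actually acts on */
};

static void dev_write_lo(struct fake_dev *d, uint32_t v)
{
    d->lo = v;
}

static void dev_write_hi(struct fake_dev *d, uint32_t v)
{
    d->hi = v;
    d->latched = ((uint64_t)d->hi << 32) | d->lo;
}

/* Replays one bad ordering: CPU-A low half, CPU-B low half, CPU-A high
 * half.  Returns what the device latched. */
static uint64_t torn_write_demo(uint64_t a, uint64_t b)
{
    struct fake_dev d = { 0, 0, 0 };

    dev_write_lo(&d, (uint32_t)a);          /* CPU-A, first half  */
    dev_write_lo(&d, (uint32_t)b);          /* CPU-B sneaks in    */
    dev_write_hi(&d, (uint32_t)(a >> 32));  /* CPU-A, second half */
    return d.latched;
}
```

With a = 0xAAAAAAAA11111111 and b = 0xBBBBBBBB22222222 the model latches
0xAAAAAAAA22222222, which neither CPU intended to send.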


Here is a trace from a PCI Express analyzer.  I'm sending
0x0800010000000000 to the address DD1400C0 using writeq.  Notice that in
the TLP header it sent a 32-bit Memory Write with a data length of two.

Trace follows:

Link Tra(597) Downstream 2.5(x1) TLP(1992) Mem MWr(32)(10:00000) TC(0)
TD(0) 
_______| EP(0) Attributes(01) Length(2) RequesterID(000:02:0) Tag(8) 
_______| Address(DD1400C0) 1st BE(1111) Last BE(1111) Data(08000100
00000000) 
_______| VC ID(0) Explicit ACK(Packet #1195) Metrics # Packets(2) 
_______| Time Stamp(0003 . 120 181 840 s) 


* Re: HELP: Is writeq an atomic operation??
  2008-05-02 22:32 ` HELP: Is writeq an atomic operation?? David Miller
@ 2008-05-02 22:43   ` Roland Dreier
  2008-05-02 22:49     ` David Miller
  2008-05-02 22:49     ` Moore, Eric
  0 siblings, 2 replies; 23+ messages in thread
From: Roland Dreier @ 2008-05-02 22:43 UTC (permalink / raw)
  To: David Miller; +Cc: Eric.Moore, linux-scsi, linux-kernel

 > > Is a 64bit write to MMIO registers an atomic operation when using the
 > > writeq API?  
 > 
 > The answer to this question is platform dependent.
 > 
 > On most 64-bit platforms, it is.  On some 32-bit ones, it is not.

Are there any 32-bit platforms with writeq()?  A quick grep suggests not.

Are there any 64-bit platforms where writeq() allows the MMIO to be
split into multiple cycles from the target device's view?  I've been
coding assuming that at least no other MMIO writes will reach the device
in the middle of a writeq().

 - R.


* Re: HELP:  Is writeq an atomic operation??
  2008-05-02 22:40 Moore, Eric
@ 2008-05-02 22:46 ` Roland Dreier
  2008-05-03  0:42   ` H. Peter Anvin
  2008-05-03 22:37   ` Benjamin Herrenschmidt
  2008-05-02 22:50 ` Andi Kleen
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 23+ messages in thread
From: Roland Dreier @ 2008-05-02 22:46 UTC (permalink / raw)
  To: Moore, Eric; +Cc: linux-kernel

 > Is a 64bit write to MMIO registers an atomic operation when using the
 > writeq API?  
 > 
 > My concern is when I send 64bit data via writeq, will it be sent out as
 > two 32 bit writes?  If so, is it possible that another CPU be sending
 > the data at the same time.  Meaning can I write the 1st 32bit data from
 > CPU-A, meanwhile CPU-B is writing his 32bit data at the same time, and
 > CPU-A didn't complete the full 64bit in one shot.  If this could occur,
 > is there an API that I can use to make sure the entire data sent in one
 > atomic operation?

I don't have an authoritative answer, but I can say that I coded
drivers/infiniband/hw/mthca and .../mlx4 assuming that writeq() is
atomic in the sense that you say, and no one has reported any problems.

But I'm sure no one has stressed the drivers on 64-bit mips or anything
unusual like that.

 - R.


* Re: HELP: Is writeq an atomic operation??
  2008-05-02 22:43   ` Roland Dreier
@ 2008-05-02 22:49     ` David Miller
  2008-05-02 22:49     ` Moore, Eric
  1 sibling, 0 replies; 23+ messages in thread
From: David Miller @ 2008-05-02 22:49 UTC (permalink / raw)
  To: rdreier; +Cc: Eric.Moore, linux-scsi, linux-kernel

From: Roland Dreier <rdreier@cisco.com>
Date: Fri, 02 May 2008 15:43:32 -0700

> Are there any 32-bit platforms with writeq()?  A quick grep suggests not.

Right, I guess there aren't, but what drivers do currently is roll
their own 64-bit MMIO for such cases.

I noticed this when writing drivers/net/niu.c

I suppose this is on purpose, so the driver can set up any
such protection and handling, as needed.


* RE: HELP: Is writeq an atomic operation??
  2008-05-02 22:43   ` Roland Dreier
  2008-05-02 22:49     ` David Miller
@ 2008-05-02 22:49     ` Moore, Eric
  2008-05-02 22:53       ` Roland Dreier
  1 sibling, 1 reply; 23+ messages in thread
From: Moore, Eric @ 2008-05-02 22:49 UTC (permalink / raw)
  To: Roland Dreier, David Miller; +Cc: linux-scsi, linux-kernel

>  > 
>  > The answer to this question is platform dependent.
>  > 
>  > On most 64-bit platforms, it is.  On some 32-bit ones, it is not.
> 
> Are there any 32-bit platforms with writeq()?  A quick grep 
> suggests not.

I think writeq is defined in include/asm-x86/io_64.h

> 
> Are there any 64-bit platforms where writeq() allows the MMIO to be
> split into multiple cycles from the target device's view?  I've been
> coding assuming that at least no other MMIO writes will reach 
> the device
> in the middle of a writeq().
> 

I hope that is the case.


* Re: HELP:  Is writeq an atomic operation??
  2008-05-02 22:40 Moore, Eric
  2008-05-02 22:46 ` Roland Dreier
@ 2008-05-02 22:50 ` Andi Kleen
  2008-05-02 23:03   ` Moore, Eric
  2008-05-02 23:04 ` Roland Dreier
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2008-05-02 22:50 UTC (permalink / raw)
  To: Moore, Eric; +Cc: linux-kernel

"Moore, Eric" <Eric.Moore@lsi.com> writes:

> Here is a trace from pci express analyzer.   I'm sending

With what CPU/chipset was that?

-Andi


* Re: HELP: Is writeq an atomic operation??
  2008-05-02 22:49     ` Moore, Eric
@ 2008-05-02 22:53       ` Roland Dreier
  2008-05-02 23:13         ` Moore, Eric
  0 siblings, 1 reply; 23+ messages in thread
From: Roland Dreier @ 2008-05-02 22:53 UTC (permalink / raw)
  To: Moore, Eric; +Cc: David Miller, linux-scsi, linux-kernel

 > > Are there any 32-bit platforms with writeq()?  A quick grep 
 > > suggests not.
 > 
 > I think writeq defined in include/asm-x86/io_64.h

Umm... io_64.h is 64-bit only (look at asm-x86/io.h if you don't believe me ;)

 - R.


* RE: HELP:  Is writeq an atomic operation??
  2008-05-02 22:50 ` Andi Kleen
@ 2008-05-02 23:03   ` Moore, Eric
  2008-05-02 23:13     ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: Moore, Eric @ 2008-05-02 23:03 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

On Friday, May 02, 2008 4:50 PM, Andi Kleen wrote: 
> 
> > Here is a trace from pci express analyzer.   I'm sending
> 
> With what CPU/chipset was that?
> 

I'm developing a SAS driver for next-generation controllers.  They should
work at least on x86, x86_64, ia64, and ppc64.  The host I was using for
gathering that trace was an x86_64 Intel platform.  Here is the lspci
output:

00:00.0 Host bridge: Intel Corporation E7520 Memory Controller Hub (rev
0c)
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: bus master, fast devsel, latency 0
	Capabilities: [40] Vendor Specific Information

00:00.1 Class ff00: Intel Corporation E7525/E7520 Error Reporting
Registers (rev 0c)
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: fast devsel

00:01.0 System peripheral: Intel Corporation E7520 DMA Controller (rev
0c)
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: fast devsel, IRQ 5
	Memory at dd000000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [b0] Message Signalled Interrupts: Mask- 64bit-
Queue=0/1 Enable-

00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port
A (rev 0c) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 00002000-00002fff
	Memory behind bridge: dd100000-dd1fffff
	Prefetchable memory behind bridge:
0000000088000000-00000000880fffff
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: Mask- 64bit-
Queue=0/1 Enable-
	Capabilities: [64] Express Root Port (Slot-) IRQ 0
	Capabilities: [100] Advanced Error Reporting

00:03.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port
A1 (rev 0c) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=02, subordinate=04, sec-latency=0
	I/O behind bridge: 00003000-00003fff
	Memory behind bridge: dd200000-dd3fffff
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: Mask- 64bit-
Queue=0/1 Enable-
	Capabilities: [64] Express Root Port (Slot-) IRQ 0
	Capabilities: [100] Advanced Error Reporting

00:04.0 PCI bridge: Intel Corporation E7525/E7520 PCI Express Port B
(rev 0c) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: Mask- 64bit-
Queue=0/1 Enable-
	Capabilities: [64] Express Root Port (Slot-) IRQ 0
	Capabilities: [100] Advanced Error Reporting

00:06.0 PCI bridge: Intel Corporation E7520 PCI Express Port C (rev 0c)
(prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=06, subordinate=06, sec-latency=0
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: Mask- 64bit-
Queue=0/1 Enable-
	Capabilities: [64] Express Root Port (Slot-) IRQ 0
	Capabilities: [100] Advanced Error Reporting

00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #1 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: bus master, medium devsel, latency 0, IRQ 16
	I/O ports at 1400 [size=32]

00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #2 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: bus master, medium devsel, latency 0, IRQ 19
	I/O ports at 1420 [size=32]

00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #3 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: bus master, medium devsel, latency 0, IRQ 20
	I/O ports at 1440 [size=32]

00:1d.3 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #4 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: bus master, medium devsel, latency 0, IRQ 16
	I/O ports at 1460 [size=32]

00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2
EHCI Controller (rev 02) (prog-if 20 [EHCI])
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: bus master, medium devsel, latency 0, IRQ 21
	Memory at dd001000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Debug port

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2) (prog-if
00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=07, subordinate=07, sec-latency=32
	I/O behind bridge: 00004000-00004fff
	Memory behind bridge: dd400000-deffffff
	Prefetchable memory behind bridge: 88100000-881fffff

00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC
Interface Bridge (rev 02)
	Flags: bus master, medium devsel, latency 0

00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller
(rev 02) (prog-if 8a [Master SecP PriP])
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 20
	I/O ports at 01f0 [size=8]
	I/O ports at 03f4 [size=1]
	I/O ports at 0170 [size=8]
	I/O ports at 0374 [size=1]
	I/O ports at 14b0 [size=16]

00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus
Controller (rev 02)
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: medium devsel, IRQ 22
	I/O ports at 1100 [size=32]

01:00.0 Mass storage controller [0108]: LSI Logic / Symbios Logic
Unknown device 0077 (rev ff)
	Subsystem: Compaq Computer Corporation Unknown device 40a0
	Flags: bus master, fast devsel, latency 0, IRQ 5
	I/O ports at 2000 [size=256]
	Memory at dd140000 (64-bit, non-prefetchable) [size=16K]
	[virtual] Expansion ROM at 88000000 [disabled] [size=64K]
	Capabilities: [50] Power Management version 3
	Capabilities: [68] Express Endpoint IRQ 0
	Capabilities: [d0] Vital Product Data
	Capabilities: [a8] Message Signalled Interrupts: Mask- 64bit+
Queue=0/0 Enable-
	Capabilities: [c0] MSI-X: Enable- Mask- TabSize=15
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [138] Power Budgeting

02:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge
A (rev 09) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=02, secondary=03, subordinate=03, sec-latency=64
	Capabilities: [44] Express PCI/PCI-X Bridge IRQ 0
	Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+
Queue=0/0 Enable-
	Capabilities: [6c] Power Management version 2
	Capabilities: [d8] PCI-X bridge device
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [300] Power Budgeting

02:00.1 PIC: Intel Corporation 6700/6702PXH I/OxAPIC Interrupt
Controller A (rev 09) (prog-if 20 [IO(X)-APIC])
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: bus master, fast devsel, latency 0
	Memory at dd200000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [44] Express Endpoint IRQ 0
	Capabilities: [6c] Power Management version 2

02:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge
B (rev 09) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=02, secondary=04, subordinate=04, sec-latency=64
	I/O behind bridge: 00003000-00003fff
	Memory behind bridge: dd300000-dd3fffff
	Capabilities: [44] Express PCI/PCI-X Bridge IRQ 0
	Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+
Queue=0/0 Enable-
	Capabilities: [6c] Power Management version 2
	Capabilities: [d8] PCI-X bridge device
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [300] Power Budgeting

02:00.3 PIC: Intel Corporation 6700PXH I/OxAPIC Interrupt Controller B
(rev 09) (prog-if 20 [IO(X)-APIC])
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: bus master, fast devsel, latency 0
	Memory at dd201000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [44] Express Endpoint IRQ 0
	Capabilities: [6c] Power Management version 2

04:02.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet
Controller (rev 03)
	Subsystem: Intel Corporation PRO/1000 MT Dual Port Server
Adapter
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 17
	Memory at dd300000 (64-bit, non-prefetchable) [size=128K]
	I/O ports at 3000 [size=64]
	Capabilities: [dc] Power Management version 2
	Capabilities: [e4] PCI-X non-bridge device
	Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+
Queue=0/0 Enable-

04:02.1 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet
Controller (rev 03)
	Subsystem: Intel Corporation PRO/1000 MT Dual Port Server
Adapter
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 18
	Memory at dd320000 (64-bit, non-prefetchable) [size=128K]
	I/O ports at 3040 [size=64]
	Capabilities: [dc] Power Management version 2
	Capabilities: [e4] PCI-X non-bridge device
	Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+
Queue=0/0 Enable-

07:01.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
(prog-if 00 [VGA])
	Subsystem: Super Micro Computer Inc Unknown device 5480
	Flags: bus master, stepping, medium devsel, latency 66, IRQ 10
	Memory at de000000 (32-bit, non-prefetchable) [size=16M]
	I/O ports at 4000 [size=256]
	Memory at dd400000 (32-bit, non-prefetchable) [size=4K]
	[virtual] Expansion ROM at 88100000 [disabled] [size=128K]
	Capabilities: [5c] Power Management version 2


* Re: HELP:  Is writeq an atomic operation??
  2008-05-02 22:40 Moore, Eric
  2008-05-02 22:46 ` Roland Dreier
  2008-05-02 22:50 ` Andi Kleen
@ 2008-05-02 23:04 ` Roland Dreier
  2008-05-02 23:20   ` Moore, Eric
  2008-05-02 23:12 ` Jesse Barnes
  2008-05-03  0:41 ` H. Peter Anvin
  4 siblings, 1 reply; 23+ messages in thread
From: Roland Dreier @ 2008-05-02 23:04 UTC (permalink / raw)
  To: Moore, Eric; +Cc: linux-kernel

 > Here is a trace from pci express analyzer.   I'm sending
 > 0x0800010000000000 to the address DD1400C0 using writeq.   Notice that in
 > the TLP header it sent a 32bit Memory write with data length of two.

By the way, are you worried that there is something wrong with this
trace?  Your write went out in a single PCIe packet, so it looks perfect
to me.  Does the "Mem MWr(32)" worry you?  I can't see why it would be a
problem -- the PCIe TLP only has one type of memory write transaction
(well, except for 32-bit or 64-bit addressing, but you don't care about
that), and memory write transactions are sent as a string of 32-bit
chunks, so there's no other way a 64-bit write could be sent.

 > Trace follows:
 > 
 > Link Tra(597) Downstream 2.5(x1) TLP(1992) Mem MWr(32)(10:00000) TC(0)
 > TD(0) 
 > _______| EP(0) Attributes(01) Length(2) RequesterID(000:02:0) Tag(8) 
 > _______| Address(DD1400C0) 1st BE(1111) Last BE(1111) Data(08000100
 > 00000000) 
 > _______| VC ID(0) Explicit ACK(Packet #1195) Metrics # Packets(2) 
 > _______| Time Stamp(0003 . 120 181 840 s) 
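
To make the reading of the trace concrete, here is a rough sketch of how
the two fields in question decode.  It assumes the 2-bit Fmt encoding
shown in the analyzer output "(10:00000)" (bit 1 = TLP carries data,
bit 0 = 4DW header, i.e. 64-bit addressing) and a Length counted in
32-bit dwords, with an encoded 0 meaning 1024 dwords.  `struct tlp_info`
and `decode_mem_tlp()` are made-up names for illustration, not anything
from the kernel or from an analyzer API.

```c
#include <stdint.h>

/* Decoded view of the fields discussed above (illustrative only). */
struct tlp_info {
    int      has_data;       /* 1 if the TLP carries a payload        */
    int      addr_bits;      /* 32 or 64: size of the ADDRESS field   */
    unsigned payload_bytes;  /* payload size derived from Length      */
};

/* fmt: 2-bit Fmt field as printed by the analyzer (0..3).
 * length_dw: Length field, in 32-bit dwords (encoded 0 means 1024). */
static struct tlp_info decode_mem_tlp(unsigned fmt, unsigned length_dw)
{
    struct tlp_info info;

    info.has_data  = (fmt & 0x2) != 0;        /* Fmt bit 1: with data  */
    info.addr_bits = (fmt & 0x1) ? 64 : 32;   /* Fmt bit 0: 4DW header */

    if (length_dw == 0)
        length_dw = 1024;                     /* maximum: 4096 bytes   */

    info.payload_bytes = info.has_data ? length_dw * 4 : 0;
    return info;
}
```

For the trace above, decode_mem_tlp(2, 2) gives a write with a 32-bit
address and an 8-byte payload, i.e. the whole 64-bit value in one packet.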


* Re: HELP:  Is writeq an atomic operation??
  2008-05-02 22:40 Moore, Eric
                   ` (2 preceding siblings ...)
  2008-05-02 23:04 ` Roland Dreier
@ 2008-05-02 23:12 ` Jesse Barnes
  2008-05-03  0:41 ` H. Peter Anvin
  4 siblings, 0 replies; 23+ messages in thread
From: Jesse Barnes @ 2008-05-02 23:12 UTC (permalink / raw)
  To: Moore, Eric; +Cc: linux-kernel

On Friday, May 02, 2008 3:40 pm Moore, Eric wrote:
> Is a 64bit write to MMIO registers an atomic operation when using the
> writeq API?
>
> My concern is when I send 64bit data via writeq, will it be sent out as
> two 32 bit writes?  If so, is it possible that another CPU be sending
> the data at the same time.  Meaning can I write the 1st 32bit data from
> CPU-A, meanwhile CPU-B is writing his 32bit data at the same time, and
> CPU-A didn't complete the full 64bit in one shot.  If this could occur,
> is there an API that I can use to make sure the entire data sent in one
> atomic operation?
>
>
> Here is a trace from pci express analyzer.   I'm sending
> 0x0800010000000000 to the address DD1400C0 using writeq.   Notice that in
> the TLP header it sent a 32bit Memory write with data length of two.
>
> Trace follows:
>
> Link Tra(597) Downstream 2.5(x1) TLP(1992) Mem MWr(32)(10:00000) TC(0)
> TD(0)
> _______| EP(0) Attributes(01) Length(2) RequesterID(000:02:0) Tag(8)
> _______| Address(DD1400C0) 1st BE(1111) Last BE(1111) Data(08000100
> 00000000)
> _______| VC ID(0) Explicit ACK(Packet #1195) Metrics # Packets(2)
> _______| Time Stamp(0003 . 120 181 840 s)

I think this is normal; PCIe defines transactions in terms of dwords, so a
64-bit write would indeed be a transaction packet with a length of two (it
can go up to 4k).  AFAIK, though, transactions are processed as a whole, so
even a 4k write (as long as it's generated as a single transaction) won't
result in the device seeing e.g. 2x2k writes.  I'd have to double check the
routing rules to be 100% sure, though; maybe in some cases the fabric is
allowed to break up transactions (?).

Jesse


* RE: HELP: Is writeq an atomic operation??
  2008-05-02 22:53       ` Roland Dreier
@ 2008-05-02 23:13         ` Moore, Eric
  2008-05-02 23:21           ` Roland Dreier
  0 siblings, 1 reply; 23+ messages in thread
From: Moore, Eric @ 2008-05-02 23:13 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Miller, linux-scsi, linux-kernel

On Friday, May 02, 2008 4:53 PM, Roland Dreier wrote: 
> 
>  > > Are there any 32-bit platforms with writeq()?  A quick grep 
>  > > suggests not.
>  > 
>  > I think writeq defined in include/asm-x86/io_64.h
> 
> Umm... io_64.h is 64-bit only (look at asm-x86/io.h if you 
> don't believe me ;)
> 

Yeah, I forgot I have an #ifndef writeq, and then define the x86_64 version
of it.  I've not tested on x86, so I'm not sure whether it works.
How are you handling writeq when it's not defined, as is the case on x86?

Eric



* Re: HELP:  Is writeq an atomic operation??
  2008-05-02 23:03   ` Moore, Eric
@ 2008-05-02 23:13     ` Andi Kleen
  0 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2008-05-02 23:13 UTC (permalink / raw)
  To: Moore, Eric; +Cc: linux-kernel

Moore, Eric wrote:
> On Friday, May 02, 2008 4:50 PM, Andi Kleen wrote: 
>>> Here is a trace from pci express analyzer.   I'm sending
>> With what CPU/chipset was that?
>>
> 
> I'm developing sas driver for next generation controllers.  They should
> work at least on x86, x86_64, ia64, and ppc64.  The host I was using for
> gathering that trace was  x86_64 Intel Platform.  Here is the lspci
> output:

I doubt the CPU was to blame. Probably the chipset split it up for some
reason. Did you check the data sheets?

At the CPU level writeq is simply a 64bit store on x86-64 Linux.

-Andi


* RE: HELP:  Is writeq an atomic operation??
  2008-05-02 23:04 ` Roland Dreier
@ 2008-05-02 23:20   ` Moore, Eric
  2008-05-03  0:10     ` Roland Dreier
  0 siblings, 1 reply; 23+ messages in thread
From: Moore, Eric @ 2008-05-02 23:20 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-kernel

> to me.  Does the "Mem MWr(32)" worry you?  I can't see why it 
> would be a
> problem -- the PCIe TLP only has one type of memory write transaction

The concern was raised in a code review we had earlier; it sounds like
we are good.

Under platforms where writeq is not defined, what should I do?



* Re: HELP: Is writeq an atomic operation??
  2008-05-02 23:13         ` Moore, Eric
@ 2008-05-02 23:21           ` Roland Dreier
  2008-05-02 23:31             ` Moore, Eric
  0 siblings, 1 reply; 23+ messages in thread
From: Roland Dreier @ 2008-05-02 23:21 UTC (permalink / raw)
  To: Moore, Eric; +Cc: David Miller, linux-scsi, linux-kernel

 > Yeah,  I forgot I have a #ifndef writeq, then defined the x86_64 version
 > of that.   I've not tested on x86, so I'm not sure whether it works.
 > How are you handling writeq when its not defined, as the case in x86?

Write two writel() inside a spinlock to avoid any transactions in the
middle (the HW I'm dealing with can deal with two 32-bit transactions,
as long as nothing comes in the middle).  If your hardware demands a
single 64-bit transaction, you may be in trouble, because I'm not sure
all 32-bit systems can generate such a PCIe transaction.

You can see include/linux/mlx4/doorbell.h for exactly what I did.
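
As a rough illustration of the pattern (not the actual contents of
doorbell.h), here is a userspace model that compiles outside the kernel.
`model_lock()`, `model_unlock()`, `model_writel()` and `model_write64()`
are hypothetical stand-ins for a kernel spinlock and writel(); in a real
driver you would presumably use spin_lock_irqsave() and
spin_unlock_irqrestore() around two writel() calls instead.  Writing the
low half first at offset 0 is also an assumption; the order your device
expects may differ.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a kernel spinlock. */
static atomic_flag doorbell_lock = ATOMIC_FLAG_INIT;

static void model_lock(void)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&doorbell_lock,
                                             memory_order_acquire))
        ;
}

static void model_unlock(void)
{
    atomic_flag_clear_explicit(&doorbell_lock, memory_order_release);
}

/* Hypothetical stand-in for writel(): one 32-bit store to a register. */
static void model_writel(uint32_t val, volatile uint32_t *addr)
{
    *addr = val;
}

/* Emulate a 64-bit register write as two 32-bit writes that no other
 * holder of the same lock can get in between. */
static void model_write64(uint64_t val, volatile uint32_t *reg)
{
    model_lock();
    model_writel((uint32_t)val, &reg[0]);          /* low half, offset 0  */
    model_writel((uint32_t)(val >> 32), &reg[1]);  /* high half, offset 4 */
    model_unlock();
}
```

Note that this only keeps other writers that take the same lock out of
the middle; the device still sees two separate 32-bit transactions.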

 - R.


* RE: HELP: Is writeq an atomic operation??
  2008-05-02 23:21           ` Roland Dreier
@ 2008-05-02 23:31             ` Moore, Eric
  0 siblings, 0 replies; 23+ messages in thread
From: Moore, Eric @ 2008-05-02 23:31 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Miller, linux-scsi, linux-kernel

On Friday, May 02, 2008 5:22 PM,  Roland Dreier wrote:
>  > Yeah,  I forgot I have a #ifndef writeq, then defined the 
> x86_64 version
>  > of that.   I've not tested on x86, so I'm not sure whether 
> it works.
>  > How are you handling writeq when its not defined, as the 
> case in x86?
> 
> Write two writel() inside a spinlock to avoid any transactions in the
> middle (the HW I'm dealing with can deal with two 32-bit transactions,
> as long as nothing comes in the middle).  If your hardware demands a
> single 64-bit transaction, you may be in trouble, because I'm not sure
> all 32-bit systems can generate such a PCIe transaction.
> 
> You can see include/linux/mlx4/doorbell.h for exactly what I did.
> 

Thanks for the code sample.   Yes, I need to send a single atomic 64-bit
transaction.

Eric


* Re: HELP:  Is writeq an atomic operation??
  2008-05-02 23:20   ` Moore, Eric
@ 2008-05-03  0:10     ` Roland Dreier
  0 siblings, 0 replies; 23+ messages in thread
From: Roland Dreier @ 2008-05-03  0:10 UTC (permalink / raw)
  To: Moore, Eric; +Cc: linux-kernel

 > Under platforms where writeq is not defined, what should I do?

Umm... is it too late to change your chip design?

Seriously, for example on PowerPC 440SPe (which has a PCIe bus on a
32-bit PowerPC core), I don't know any way you can generate a 64-bit PCI
transaction.  Even on 32-bit x86 I think you're stuck using something
ugly like MMX to do it.

Basically it's going to be architecture-dependent at best.

 - R.


* Re: HELP:  Is writeq an atomic operation??
  2008-05-02 22:40 Moore, Eric
                   ` (3 preceding siblings ...)
  2008-05-02 23:12 ` Jesse Barnes
@ 2008-05-03  0:41 ` H. Peter Anvin
  4 siblings, 0 replies; 23+ messages in thread
From: H. Peter Anvin @ 2008-05-03  0:41 UTC (permalink / raw)
  To: Moore, Eric; +Cc: linux-kernel

Moore, Eric wrote:
> Is a 64bit write to MMIO registers an atomic operation when using the
> writeq API?  
> 
> My concern is when I send 64bit data via writeq, will it be sent out as
> two 32 bit writes?  If so, is it possible that another CPU be sending
> the data at the same time.  Meaning can I write the 1st 32bit data from
> CPU-A, meanwhile CPU-B is writing his 32bit data at the same time, and
> CPU-A didn't complete the full 64bit in one shot.  If this could occur,
> is there an API that I can use to make sure the entire data sent in one
> atomic operation?
> 
> 
> Here is a trace from pci express analyzer.   I'm sending
> 0x0800010000000000 to the address DD1400C0 using writeq.   Notice that in
> the TLP header it sent a 32bit Memory write with data length of two.
> 
> Trace follows:
> 
> Link Tra(597) Downstream 2.5(x1) TLP(1992) Mem MWr(32)(10:00000) TC(0)
> TD(0) 
> _______| EP(0) Attributes(01) Length(2) RequesterID(000:02:0) Tag(8) 
> _______| Address(DD1400C0) 1st BE(1111) Last BE(1111) Data(08000100
> 00000000) 
> _______| VC ID(0) Explicit ACK(Packet #1195) Metrics # Packets(2) 
> _______| Time Stamp(0003 . 120 181 840 s) 

That is how that operation is represented on the PCI Express bus.

"32 bit" refers to the size of the address, not the size of the data.

However, for 32-bit systems, including x86 in 32-bit mode, there isn't 
any 64-bit atomic operation without using MMX/SSE.

	-hpa


* Re: HELP:  Is writeq an atomic operation??
  2008-05-02 22:46 ` Roland Dreier
@ 2008-05-03  0:42   ` H. Peter Anvin
  2008-05-03 14:35     ` Alan Cox
  2008-05-03 22:37   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 23+ messages in thread
From: H. Peter Anvin @ 2008-05-03  0:42 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Moore, Eric, linux-kernel

Roland Dreier wrote:
>  > Is a 64bit write to MMIO registers an atomic operation when using the
>  > writeq API?  
>  > 
>  > My concern is when I send 64bit data via writeq, will it be sent out as
>  > two 32 bit writes?  If so, is it possible that another CPU be sending
>  > the data at the same time.  Meaning can I write the 1st 32bit data from
>  > CPU-A, meanwhile CPU-B is writing his 32bit data at the same time, and
>  > CPU-A didn't complete the full 64bit in one shot.  If this could occur,
>  > is there an API that I can use to make sure the entire data sent in one
>  > atomic operation?
> 
> I don't have an authoritative answer, but I can say that I coded
> drivers/infiniband/hw/mthca and .../mlx4 assuming that writeq() is
> atomic in the sense that you say, and no one has reported any problems.
> 

If you're not under lock you're screwed on a 32-bit platform.

	-hpa


* Re: HELP:  Is writeq an atomic operation??
  2008-05-03  0:42   ` H. Peter Anvin
@ 2008-05-03 14:35     ` Alan Cox
  2008-05-03 17:40       ` H. Peter Anvin
  0 siblings, 1 reply; 23+ messages in thread
From: Alan Cox @ 2008-05-03 14:35 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Roland Dreier, Moore, Eric, linux-kernel

> > I don't have an authoritative answer, but I can say that I coded
> > drivers/infiniband/hw/mthca and .../mlx4 assuming that writeq() is
> > atomic in the sense that you say, and no one has reported any problems.
> > 
> 
> If you're not under lock you're screwed on a 32-bit platform.

So what cycles does an MMX, SSE or double float store generate on the
bus ?

Alan


* Re: HELP:  Is writeq an atomic operation??
  2008-05-03 14:35     ` Alan Cox
@ 2008-05-03 17:40       ` H. Peter Anvin
  0 siblings, 0 replies; 23+ messages in thread
From: H. Peter Anvin @ 2008-05-03 17:40 UTC (permalink / raw)
  To: Alan Cox; +Cc: Roland Dreier, Moore, Eric, linux-kernel

Alan Cox wrote:
>>> I don't have an authoritative answer, but I can say that I coded
>>> drivers/infiniband/hw/mthca and .../mlx4 assuming that writeq() is
>>> atomic in the sense that you say, and no one has reported any problems.
>>>
>> If you're not under lock you're screwed on a 32-bit platform.
> 
> So what cycles does an MMX, SSE or double float store generate on the
> bus ?
> 

Those do generate 64-bit stores; it's just *really* expensive to do it 
in the kernel.  I have used that trick to test 64-bit hardware in a 
32-bit only system, though.
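
For illustration only, a userspace sketch of the SSE2 variant of that
trick might look like the following.  `store64_single()` is a made-up
name; the two intrinsics are the standard SSE2 ones from <emmintrin.h>.
On builds without SSE2 the sketch falls back to memcpy(), which makes no
single-store promise at all.  Inside the kernel the SIMD registers cannot
simply be used like this; as far as I know the store would have to be
wrapped in kernel_fpu_begin()/kernel_fpu_end(), which is presumably where
the expense comes from.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Store 64 bits to dst using one 8-byte SIMD store where SSE2 is
 * available.  This is a userspace sketch, not kernel code. */
static void store64_single(void *dst, uint64_t val)
{
#if defined(__SSE2__)
    /* Load the low 64 bits into an XMM register ... */
    __m128i x = _mm_loadl_epi64((const __m128i *)&val);
    /* ... and write those 64 bits out with a single store. */
    _mm_storel_epi64((__m128i *)dst, x);
#else
    /* Fallback with no atomicity or single-transaction guarantee. */
    memcpy(dst, &val, sizeof(val));
#endif
}
```

Whether the store reaches the device as one bus transaction still depends
on the CPU and chipset, so this is something to verify with an analyzer
rather than assume.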

	-hpa


* Re: HELP:  Is writeq an atomic operation??
  2008-05-02 22:46 ` Roland Dreier
  2008-05-03  0:42   ` H. Peter Anvin
@ 2008-05-03 22:37   ` Benjamin Herrenschmidt
  2008-05-04 17:01     ` Roland Dreier
  1 sibling, 1 reply; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2008-05-03 22:37 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Moore, Eric, linux-kernel


On Fri, 2008-05-02 at 15:46 -0700, Roland Dreier wrote:
> > Is a 64bit write to MMIO registers an atomic operation when using the
>  > writeq API?  
>  > 
>  > My concern is when I send 64bit data via writeq, will it be sent out as
>  > two 32 bit writes?  If so, is it possible that another CPU be sending
>  > the data at the same time.  Meaning can I write the 1st 32bit data from
>  > CPU-A, meanwhile CPU-B is writing his 32bit data at the same time, and
>  > CPU-A didn't complete the full 64bit in one shot.  If this could occur,
>  > is there an API that I can use to make sure the entire data sent in one
>  > atomic operation?
> 
> I don't have an authoritative answer, but I can say that I coded
> drivers/infiniband/hw/mthca and .../mlx4 assuming that writeq() is
> atomic in the sense that you say, and no one has reported any problems.
> 
> But I'm sure no one has stressed the drivers on 64-bit mips or anything
> unusual like that.

Surely only on 64-bit archs, right?

Ben.




* Re: HELP:  Is writeq an atomic operation??
  2008-05-03 22:37   ` Benjamin Herrenschmidt
@ 2008-05-04 17:01     ` Roland Dreier
  0 siblings, 0 replies; 23+ messages in thread
From: Roland Dreier @ 2008-05-04 17:01 UTC (permalink / raw)
  To: benh; +Cc: Moore, Eric, linux-kernel

 > > I don't have an authoritative answer, but I can say that I coded
 > > drivers/infiniband/hw/mthca and .../mlx4 assuming that writeq() is
 > > atomic in the sense that you say, and no one has reported any problems.
 > > 
 > > But I'm sure no one has stressed the drivers on 64-bit mips or anything
 > > unusual like that.
 > 
 > Surely only on 64 bits archs right ?

Your question is a bit too terse for me to know exactly what you're
asking, but it is true that these IB drivers use writeq() only on 64-bit
architectures (since no 32-bit architectures even define writeq()!).

The hardware I'm dealing with is smart enough to cope with a driver that
does a write to these 64-bit registers in two 32-bit chunks, as long as
no other writes come in the middle.  So on 32-bit architectures I just
have a spinlock around two writel()s.

The assumption I'm making is that no locking or anything is needed on
64-bit architectures to avoid the writeq() being split into two
transactions with a third unrelated transaction in the middle.

It sounds as though Eric's hardware is much harder to deal with in that
it requires the write to a 64-bit register to be done in a single
transaction, and I'm not sure there is a way to do that on all 32-bit
architectures; certainly we have nothing clean and portable that a
driver can use to do that.

 - R.

