Linux MIPS Architecture development
* RFH:  What are the semantics of writeb() and friends?
@ 2005-07-01  5:22 David Daney
  2005-07-01  5:22 ` David Daney
  2005-07-01  9:33 ` Maciej W. Rozycki
  0 siblings, 2 replies; 9+ messages in thread
From: David Daney @ 2005-07-01  5:22 UTC
  To: linux-mips

In this thread:
 
http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mips&i=42C1C6EA.5080709%40avtrex.com
 
I relate the problems I was having with the Intel e100 driver on a new 2.6.12 port to a 4KE-based system.
 
My new question is:  What are the semantics of writeb(), writel() et al.?  I would assume that the effects of these must be in the same order that they were issued, and that any hardware write back queue cannot combine or merge them in any way.  Is that correct?
 
 
A second question I have is:  What is the difference in the semantics of wbflush() and wmb()?  For my CPU they both evaluate to the same thing (the 'sync' instruction).  So for my own sake I could use either, but depending on the situation I assume that one would be used over the other.
 
Thanks,
David Daney.


* Re: RFH:  What are the semantics of writeb() and friends?
  2005-07-01  5:22 RFH: What are the semantics of writeb() and friends? David Daney
  2005-07-01  5:22 ` David Daney
@ 2005-07-01  9:33 ` Maciej W. Rozycki
  2005-07-01 11:46   ` Alan Cox
  1 sibling, 1 reply; 9+ messages in thread
From: Maciej W. Rozycki @ 2005-07-01  9:33 UTC
  To: David Daney; +Cc: linux-mips

On Thu, 30 Jun 2005, David Daney wrote:

> My new question is:  What are the semantics of writeb(), writel() et 
> al.?  I would assume that the effects of these must be in the same order 
> that they were issued, and that any hardware write back queue cannot 
> combine or merge them in any way.  Is that correct?

 No it's not.  You need to insert appropriate barriers, one of: wmb(), 
mb() or rmb().  In rare cases you may need to use iob(), which is 
currently non-portable (which reminds me I should really push it 
upstream).

 For historical reasons only inb(), outb(), inl(), outl(), etc. are meant 
to imply an mb() beforehand and afterwards (IOW, their resulting cycles 
always appear externally in the programmed order).  You may still need 
iob(), though.

> A second question I have is:  What is the difference in the semantics of 
> wbflush() and wmb()?  For my CPU they both evaluate to the same thing 
> (the 'sync' instruction).  So for my own sake I could use either, but 
> depending on the situation I assume that one would be used over the 
> other.

 wbflush() is an old name for iob() -- it will probably vanish one day as 
the name is somewhat inadequate for non-MIPS systems.  They are meant as a 
read/write barrier combined with write completion from the host's point of 
view (i.e. external write cycles are actually issued by the CPU and its 
system controller; they may still be posted e.g. in an I/O bus bridge and 
require another bridge-specific operation to proceed further).  wmb() is 
just a write barrier -- it assures later writes won't be combined with or 
reordered before earlier ones.

 Depending on your needs iob() may be too strong, resulting in an 
unnecessary performance penalty, or wmb() may be too weak, resulting in 
cycles appearing in the wrong order or being delayed for too long.  If 
your CPU happens to use "sync" for both, then it probably has an overkill 
implementation for this instruction, resulting in performance loss in 
some places.
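
 The difference between the two barriers can be sketched with a small 
user-space model.  This is only an illustration under stated assumptions: 
toy_cpu, toy_write(), toy_wmb() and toy_iob() are hypothetical names made 
up for the example, not kernel interfaces, and the combining policy is 
just one plausible write buffer behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_WB_SLOTS 8

/* One pending write sitting in the (hypothetical) CPU write buffer. */
struct toy_wb_entry {
	uint32_t addr;
	uint32_t data;
};

struct toy_cpu {
	struct toy_wb_entry wb[TOY_WB_SLOTS];
	size_t n;		/* pending entries */
	size_t fence;		/* entries below this index must not be merged into */
	unsigned int cycles;	/* write cycles that have left the CPU domain */
};

/* Models iob()/wbflush(): barrier plus write completion as seen by the host. */
static void toy_iob(struct toy_cpu *c)
{
	c->cycles += (unsigned int)c->n;	/* each pending write becomes a bus cycle */
	c->n = 0;
	c->fence = 0;
}

/* Models wmb(): nothing is pushed out; later writes may not merge with earlier ones. */
static void toy_wmb(struct toy_cpu *c)
{
	c->fence = c->n;
}

/* Models a store to device space passing through a combining write buffer. */
static void toy_write(struct toy_cpu *c, uint32_t addr, uint32_t data)
{
	size_t i;

	assert(c != NULL);

	/* Combine with a pending write to the same address, unless fenced off. */
	for (i = c->fence; i < c->n; i++) {
		if (c->wb[i].addr == addr) {
			c->wb[i].data = data;
			return;
		}
	}

	if (c->n == TOY_WB_SLOTS)
		toy_iob(c);	/* buffer full: this model simply drains it */

	c->wb[c->n].addr = addr;
	c->wb[c->n].data = data;
	c->n++;
}
```

 In this model two back-to-back toy_write() calls to the same address 
collapse into a single pending entry.  With toy_wmb() in between they stay 
separate, yet the cycles counter does not move until toy_iob() runs.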

  Maciej


* Re: RFH:  What are the semantics of writeb() and friends?
  2005-07-01  9:33 ` Maciej W. Rozycki
@ 2005-07-01 11:46   ` Alan Cox
  2005-07-01 12:54     ` Maciej W. Rozycki
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Cox @ 2005-07-01 11:46 UTC
  To: Maciej W. Rozycki; +Cc: David Daney, linux-mips

On Fri, 2005-07-01 at 10:33, Maciej W. Rozycki wrote:
> > al.?  I would assume that the effects of these must be in the same order 
> > that they were issued, and that any hardware write back queue cannot 
> > combine or merge them in any way.  Is that correct?
> 
>  No it's not.  You need to insert appropriate barriers, one of: wmb(), 
> mb() or rmb().  In rare cases you may need to use iob(), which is 
> currently non-portable (which reminds me I should really push it 
> upstream).

It's even more complicated than that 8)

writeb/writel may be merged in some cases (but not re-ordered) for I/O
devices but a simple mb() will only synchronize them as viewed from the
cpu/memory interface. There are two other synchronization points: the
bridge to the I/O device (typically the PCI root bridge), where ordering
is not enforced automatically across processors on some large NUMA boxes
(though that is not usually a problem), and the PCI bus itself.

PCI permits posting (delaying writes) and some forms of merging (but not
re-ordering). Thus if you need an I/O to hit a device on the PCI bus and
know it arrived you must follow it by a read from the same device. So
for example if you want to shut down a DMA transfer and free the buffer
for a PCI device you need to do:

		writel(TURN_DMA_OFF, dev->control);
		readl(dev->something);
		/* Only now is the free safe */
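
A minimal user-space model of that posting behaviour, offered as a sketch
only: toy_pci_dev, toy_pci_bridge, toy_writel() and toy_readl() are
hypothetical stand-ins invented for illustration, and the single rule it
encodes (a read from the device drains the writes posted ahead of it) is
the assumption being demonstrated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_POST_DEPTH 8

/* Hypothetical device with a single control register. */
struct toy_pci_dev {
	uint32_t control;	/* the value the device has actually seen */
};

/* Hypothetical bridge holding writes that are posted but not yet delivered. */
struct toy_pci_bridge {
	struct toy_pci_dev *dev;
	uint32_t posted[TOY_POST_DEPTH];
	size_t count;
};

/* Deliver all posted writes to the device, in the order they were issued. */
static void toy_drain(struct toy_pci_bridge *b)
{
	size_t i;

	for (i = 0; i < b->count; i++)
		b->dev->control = b->posted[i];
	b->count = 0;
}

/* A write returns to the caller as soon as it has been queued in the bridge. */
static void toy_writel(struct toy_pci_bridge *b, uint32_t val)
{
	assert(b != NULL && b->dev != NULL);
	if (b->count == TOY_POST_DEPTH)
		toy_drain(b);
	b->posted[b->count++] = val;
}

/* A read cannot pass posted writes, so it forces them out first. */
static uint32_t toy_readl(struct toy_pci_bridge *b)
{
	assert(b != NULL && b->dev != NULL);
	toy_drain(b);
	return b->dev->control;
}
```

In this model the device still holds its old control value right after
toy_writel() returns; only the following toy_readl() guarantees that the
new value has arrived.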

Alan


* Re: RFH:  What are the semantics of writeb() and friends?
  2005-07-01 11:46   ` Alan Cox
@ 2005-07-01 12:54     ` Maciej W. Rozycki
  2005-07-01 13:31       ` Alan Cox
  0 siblings, 1 reply; 9+ messages in thread
From: Maciej W. Rozycki @ 2005-07-01 12:54 UTC
  To: Alan Cox; +Cc: David Daney, linux-mips

On Fri, 1 Jul 2005, Alan Cox wrote:

> >  No it's not.  You need to insert appropriate barriers, one of: wmb(), 
> > mb() or rmb().  In rare cases you may need to use iob(), which is 
> > currently non-portable (which reminds me I should really push it 
> > upstream).
> 
> It's even more complicated than that 8)
> 
> writeb/writel may be merged in some cases (but not re-ordered) for I/O

 Is that non-reordering specified anywhere for the API or does it just 
happen to be satisfied by most implementations?  Ours (for MIPS, that is) 
for example does nothing to ensure that.

> devices but a simple mb() will only synchronize them as viewed from
> cpu/memory interface. There are two other synchronization points. From

 That's true -- which is why I mentioned bridge-specific operations may be 
required.

> the bridge with the I/O device (typically the PCI root bridge) which is
> not enforced automatically across processors on some large numa boxes
> but is not usually a problem and on the PCI bus itself.

 What if the host I/O bus is not PCI?  For this kind of stuff I tend to 
think in the terms of TURBOchannel systems, just to be sure not to get 
influenced by the most common hardware. ;-)

 E.g. I have this R4400-based TURBOchannel system with aggressive 
buffering in the CPU's MB (memory buffer) ASIC which requires a read-back 
(RAM is OK for that) after a write and a memory barrier just to make 
writes propagate to the I/O bridge.  It may be worse yet with TURBOchannel 
Alpha and VAX systems.  With the latter, TURBOchannel is behind two 
bridges, with two intermediate buses on the way.

> PCI permits posting (delaying writes) and some forms of merging (but not
> re-ordering). Thus if you need an I/O to hit a device on the PCI bus and
> know it arrived you must follow it by a read from the same device. So
> for example if you want to shut down a DMA transfer and free the buffer
> for a PCI device you need to do
> 
> 		writel(TURN_DMA_OFF, dev->control);
> 		readl(dev->something);
> 		/* Only now is the free safe */

 Again, the I/O bus your host is attached to need not be PCI and you may 
need a bridge specific operation to make your write be completed, possibly 
combined with your quoted sequence (if there is actually PCI somewhere in 
the system; think AlphaServer 8400).

  Maciej


* Re: RFH:  What are the semantics of writeb() and friends?
  2005-07-01 12:54     ` Maciej W. Rozycki
@ 2005-07-01 13:31       ` Alan Cox
  2005-07-01 14:43         ` Maciej W. Rozycki
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Cox @ 2005-07-01 13:31 UTC
  To: Maciej W. Rozycki; +Cc: David Daney, linux-mips

On Fri, 2005-07-01 at 13:54, Maciej W. Rozycki wrote:
> > writeb/writel may be merged in some cases (but not re-ordered) for I/O
> 
>  Is that non-reordering specified anywhere for the API or does it just 
> happen to be satisfied by most implementations?  Ours (for MIPS, that is) 
> for example does nothing to ensure that.

It is defined by the device I/O document as follows:


        The read and write functions are defined to be ordered. That is
        the compiler is not permitted to reorder the I/O sequence. When
        the ordering can be compiler optimised, you can use __readb and
        friends to indicate the relaxed ordering. Use this with care.

Note order - not synchronicity. On that it says

        While the basic functions are defined to be synchronous with
        respect to each other and ordered with respect to each other the
        busses the devices sit on may themselves have asynchronicity. In
        particular many authors are burned by the fact that PCI bus
        writes are posted asynchronously. A driver author must issue a
        read from the same device to ensure that writes have occurred in
        the specific cases the author cares. This kind of property
        cannot be hidden from driver writers in the API.  In some cases,
        the read used to flush the device may be expected to fail (if
        the card is resetting, for example).  In that case, the read
        should be done from config space, which is guaranteed to
        soft-fail if the card doesn't respond.

>  What if the host I/O bus is not PCI?  For this kind of stuff I tend to 
> think in the terms of TURBOchannel systems, just to be sure not to get 
> influenced by the most common hardware. ;-)

The bus behaviour is bus defined.

>  Again, the I/O bus your host is attached to need not be PCI and you may 
> need a bridge specific operation to make your write be completed, possibly 
> combined with your quoted sequence (if there is actually PCI somewhere in 
> the system; think AlphaServer 8400).

We don't currently have cross bridge "io_write_and_be_synchronous()"
type functions. So far drivers have always known what to do. Your
example might break that of course.

Alan
--
        "In flight refueling scares me. It's like two elephants
                        mating at mach one"
                                -- Arjan van de Ven


* Re: RFH:  What are the semantics of writeb() and friends?
  2005-07-01 13:31       ` Alan Cox
@ 2005-07-01 14:43         ` Maciej W. Rozycki
  2005-07-01 19:53           ` Alan Cox
  0 siblings, 1 reply; 9+ messages in thread
From: Maciej W. Rozycki @ 2005-07-01 14:43 UTC
  To: Alan Cox; +Cc: David Daney, linux-mips

On Fri, 1 Jul 2005, Alan Cox wrote:

> >  Is that non-reordering specified anywhere for the API or does it just 
> > happen to be satisfied by most implementations?  Ours (for MIPS, that is) 
> > for example does nothing to ensure that.
> 
> It is defined by the device I/O document as follows:
> 
> 
>         The read and write functions are defined to be ordered. That is
>         the compiler is not permitted to reorder the I/O sequence. When
>         the ordering can be compiler optimised, you can use __readb and
>         friends to indicate the relaxed ordering. Use this with care.

 Oh, wonderful! -- another set of three functions per each operation, for 
direct, CPU-endian and memory-endian accesses.  Sigh...

> Note order - not synchronicity. On that it says

 But that mentions compiler only, not CPU ordering!  I understand the BIU 
of the issuing CPU and any external hardware is still permitted to 
merge/reorder these accesses unless separated by wmb()/rmb()/mb() as 
appropriate.  Note that there are MIPS-based systems that e.g. retrieve 
data pending in the write-back buffer (which is logically external to the 
CPU; sometimes even physically) for reads, e.g. with:

	writel(COMMAND, dev->csr);
	status = readl(dev->csr);

you'll likely get COMMAND in status, rather than any actual value of 
dev->csr and no read cycle ever reaches that device at all!  You need an 
mb() in between so that COMMAND leaves the CPU domain before issuing a 
read for this code to work as expected.  And of course an arbitrary number 
of read cycles to dev->irq_status placed after readl() above may bypass 
the write as well.
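
 The read-forwarding effect can be modelled in user space.  The sketch 
below is an assumption-laden toy, not driver code: toy_csr_dev, toy_wb, 
toy_csr_write(), toy_csr_read(), toy_mb() and TOY_STATUS_DONE are all 
hypothetical names, and the write-back buffer is reduced to one entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_STATUS_DONE 0x80000000u	/* made-up status value */

/* Hypothetical device whose CSR reads back a status, not the last command. */
struct toy_csr_dev {
	uint32_t last_cmd;	/* last command that really reached the device */
	uint32_t status;	/* what a real read cycle would return */
	unsigned int reads_seen;	/* read cycles that reached the device */
};

/* Hypothetical one-entry write-back buffer sitting outside the CPU core. */
struct toy_wb {
	struct toy_csr_dev *dev;
	bool pending;
	uint32_t data;
};

/* Push the pending write, if any, out to the device. */
static void toy_wb_drain(struct toy_wb *wb)
{
	assert(wb != NULL && wb->dev != NULL);
	if (wb->pending) {
		wb->dev->last_cmd = wb->data;
		wb->dev->status = TOY_STATUS_DONE;	/* device reacts to the command */
		wb->pending = false;
	}
}

/* Models mb(): earlier writes leave the CPU domain before anything later. */
static void toy_mb(struct toy_wb *wb)
{
	toy_wb_drain(wb);
}

static void toy_csr_write(struct toy_wb *wb, uint32_t cmd)
{
	toy_wb_drain(wb);	/* only one slot: an older write has to leave first */
	wb->data = cmd;
	wb->pending = true;
}

static uint32_t toy_csr_read(struct toy_wb *wb)
{
	assert(wb != NULL && wb->dev != NULL);
	if (wb->pending)
		return wb->data;	/* served from the buffer, no bus read at all */
	wb->dev->reads_seen++;
	return wb->dev->status;
}
```

 Without toy_mb() between the write and the read, the read returns the 
command just written and reads_seen stays at zero; with it, the read 
reaches the device and returns the status.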

 We have that iob() macro/call too, so that you can push cycles out of 
the CPU domain immediately, which is equivalent to:

	mb(); 
	make_host_complete_writes();

>         While the basic functions are defined to be synchronous with
>         respect to each other and ordered with respect to each other the
>         busses the devices sit on may themselves have asynchronicity. In
>         particular many authors are burned by the fact that PCI bus
>         writes are posted asynchronously. A driver author must issue a
>         read from the same device to ensure that writes have occurred in
>         the specific cases the author cares. This kind of property
>         cannot be hidden from driver writers in the API.  In some cases,
>         the read used to flush the device may be expected to fail (if
>         the card is resetting, for example).  In that case, the read
>         should be done from config space, which is guaranteed to
>         soft-fail if the card doesn't respond.

 True and obvious once cycles actually reach your I/O bus of choice -- 
rules for that bus apply from then on.

> >  What if the host I/O bus is not PCI?  For this kind of stuff I tend to 
> > think in the terms of TURBOchannel systems, just to be sure not to get 
> > influenced by the most common hardware. ;-)
> 
> The bus behaviour is bus defined.

 Certainly -- does it apply to host buses as well from the Linux point of 
view?  I don't think drivers should be made aware of them -- they should 
be abstracted by means of these barriers.

> >  Again, the I/O bus your host is attached to need not be PCI and you may 
> > need a bridge specific operation to make your write be completed, possibly 
> > combined with your quoted sequence (if there is actually PCI somewhere in 
> > the system; think AlphaServer 8400).
> 
> We don't currently have cross bridge "io_write_and_be_synchronous()"
> type functions. So far drivers have always known what to do. Your
> example might break that of course.

 So far I've been able to get away with that iob() function, but if the 
bus and buffering hierarchy gets even more complicated, there may be more 
barriers like this needed.

  Maciej


* Re: RFH:  What are the semantics of writeb() and friends?
  2005-07-01 14:43         ` Maciej W. Rozycki
@ 2005-07-01 19:53           ` Alan Cox
  2005-07-04 13:08             ` Maciej W. Rozycki
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Cox @ 2005-07-01 19:53 UTC
  To: Maciej W. Rozycki; +Cc: David Daney, linux-mips

On Fri, 2005-07-01 at 15:43, Maciej W. Rozycki wrote:
>  But that mentions compiler only, not CPU ordering!  I understand the BIU 
> of the issuing CPU and any external hardware is still permitted to 
> merge/reorder these accesses unless separated by wmb()/rmb()/mb() as 

I think the practical situation is that this implies ordering to the bus
interface. It might be interesting to ask the powerpc people their
experience but looking at most PCI drivers they assume this and it would
be expensive not to do so on x86.

>  We have that iob() macro/call as well, so that you can push cycles out of 
> the CPU domain immediately as well, which is equivalent to:

> 	mb(); 
> 	make_host_complete_writes();

My feeling is the default readb etc are __readb + mb + make_hos...

>  So far I've been able to get away with that iob() function, but if the 
> bus and buffering hierarchy gets even more complicated, there may be more 
> barriers like this needed.

Agreed - and we now have the device model so we can actually do that by
passing a device pointer.


* Re: RFH:  What are the semantics of writeb() and friends?
  2005-07-01 19:53           ` Alan Cox
@ 2005-07-04 13:08             ` Maciej W. Rozycki
  0 siblings, 0 replies; 9+ messages in thread
From: Maciej W. Rozycki @ 2005-07-04 13:08 UTC
  To: Alan Cox; +Cc: David Daney, linux-mips

On Fri, 1 Jul 2005, Alan Cox wrote:

> >  But that mentions compiler only, not CPU ordering!  I understand the BIU 
> > of the issuing CPU and any external hardware is still permitted to 
> > merge/reorder these accesses unless separated by wmb()/rmb()/mb() as 
> 
> I think the practical situation is that this implies ordering to the bus
> interface. It might be interesting to ask the powerpc people their
> experience but looking at most PCI drivers they assume this and it would
> be expensive not to do so on x86.

 Hmm, doing this OTOH would be expensive on platforms actually requiring 
explicit barriers for this to be the case.  The problem is only drivers 
know what they expect, e.g. you may need as much as:

	writel();
	mb();
	readl();

but only:

	readl();
	rmb();
	readl();

With barriers coded explicitly in drivers, you may control this; with ones 
inside these mmio functions/macros you need to use mb() everywhere, as you 
don't know what the surrounding operations are going to be.  And mb() may 
be significantly more expensive than rmb().
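
 The trade-off can be put into numbers with a toy cost model.  Everything 
here is an assumption made for illustration: the toy_* names are 
hypothetical and the relative costs (1 for rmb()/wmb(), 4 for mb()) are 
invented, since real costs are platform-specific.

```c
#include <assert.h>
#include <stddef.h>

enum toy_op { TOY_READ, TOY_WRITE };
enum toy_barrier { TOY_RMB, TOY_WMB, TOY_MB };

/* Invented relative costs; real figures depend entirely on the platform. */
static unsigned int toy_cost(enum toy_barrier b)
{
	switch (b) {
	case TOY_RMB:
		return 1;
	case TOY_WMB:
		return 1;
	default:
		return 4;	/* TOY_MB */
	}
}

/* Weakest barrier that still orders `prev` before `next`. */
static enum toy_barrier toy_needed(enum toy_op prev, enum toy_op next)
{
	if (prev == TOY_READ && next == TOY_READ)
		return TOY_RMB;
	if (prev == TOY_WRITE && next == TOY_WRITE)
		return TOY_WMB;
	return TOY_MB;	/* mixed read/write pairs need the full barrier */
}

/* Driver places the weakest sufficient barrier between consecutive accesses. */
static unsigned int toy_explicit_cost(const enum toy_op *ops, size_t n)
{
	unsigned int total = 0;
	size_t i;

	assert(ops != NULL || n == 0);
	for (i = 1; i < n; i++)
		total += toy_cost(toy_needed(ops[i - 1], ops[i]));
	return total;
}

/* Accessors cannot see their neighbours, so each pair is separated by mb(). */
static unsigned int toy_implicit_cost(size_t n)
{
	return n > 1 ? (unsigned int)(n - 1) * toy_cost(TOY_MB) : 0;
}
```

 For a read followed by a read this model charges 1 with an explicit 
rmb() against 4 with an implied mb(); for a write followed by a read both 
approaches cost 4, because the full barrier is needed either way.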

 Of course, to facilitate such explicit barriers for platforms where 
inter-processor ordering rules are different to ones for mmio, a different 
set of operations would have to be defined -- actually we've already got 
one, mmiowb(), as a starting point.

> >  We have that iob() macro/call as well, so that you can push cycles out of 
> > the CPU domain immediately as well, which is equivalent to:
> 
> > 	mb(); 
> > 	make_host_complete_writes();
> 
> My feeling is the default readb etc are __readb + mb + make_hos...

 Hmm, barriers are normally expected to happen *before* affected 
operations, which is natural and often much faster, as in the case of 
traditional MIPS write-back buffers, where there is no "flush" operation 
and mb() is just a tight loop spinning on the WB non-empty condition, 
e.g.: "0: bc0f 0b" till the buffer empties itself.  So I'd rather make 
readb() be mb() + make_host_complete_writes() + __readb().  But it 
would be more painful performance-wise than necessary for many cases, 
calling the whole idea into question, as any sane driver writer would 
prefer to use these double-underscore calls and schedule barriers as 
necessary manually anyway.

 But if it's indeed what's intended I'd prefer it to be documented 
somewhere in a reasonable place, as there are people outside the Intel 
world who may not necessarily know which interfaces imply Intel 
semantics and which do not. ;-)

  Maciej
