public inbox for linux-ia64@vger.kernel.org
* ia64 implementation of lib/iomap.c
@ 2004-10-21 14:34 David Mosberger
  2004-10-21 17:34 ` Bjorn Helgaas
                   ` (11 more replies)
  0 siblings, 12 replies; 15+ messages in thread
From: David Mosberger @ 2004-10-21 14:34 UTC (permalink / raw)
  To: linux-ia64

Is anybody already working on an ia64-version of lib/iomap.c?  If not:
just be aware that as more drivers are starting to use the API, it
becomes increasingly likely that drivers will start to misbehave.  At
least I think that's the case because the default lib/iomap.c makes
some rather arbitrary assumptions about where the I/O ports can live.

	--david


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
@ 2004-10-21 17:34 ` Bjorn Helgaas
  2004-10-21 17:38 ` David Mosberger
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Bjorn Helgaas @ 2004-10-21 17:34 UTC (permalink / raw)
  To: linux-ia64

On Thursday 21 October 2004 8:34 am, David Mosberger wrote:
> Is anybody already working on an ia64-version of lib/iomap.c?  If not:
> just be aware that as more drivers are starting to use the API, it
> becomes increasingly likely that drivers will start to misbehave.  At
> least I think that's the case because the default lib/iomap.c makes
> some rather arbitrary assumptions about where the I/O ports can live.

For example, tulip is now busted on sx1000 systems.  If you're
working on it, let me know, otherwise I'll poke at it :-)  I'd
like to get generic_defconfig and zx1_defconfig working again
on those boxes.


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
  2004-10-21 17:34 ` Bjorn Helgaas
@ 2004-10-21 17:38 ` David Mosberger
  2004-10-25 16:48 ` Bjorn Helgaas
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: David Mosberger @ 2004-10-21 17:38 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Thu, 21 Oct 2004 11:34:17 -0600, Bjorn Helgaas <bjorn.helgaas@hp.com> said:

  Bjorn> For example, tulip is now busted on sx1000 systems.  If
  Bjorn> you're working on it, let me know, otherwise I'll poke at it
  Bjorn> :-)

I'm not.  I was planning to, but have some other fires to extinguish first...

	--david


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
  2004-10-21 17:34 ` Bjorn Helgaas
  2004-10-21 17:38 ` David Mosberger
@ 2004-10-25 16:48 ` Bjorn Helgaas
  2004-10-26  7:48 ` David Mosberger
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Bjorn Helgaas @ 2004-10-25 16:48 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: text/plain, Size: 2320 bytes --]

On Thursday 21 October 2004 8:34 am, David Mosberger wrote:
> Is anybody already working on an ia64-version of lib/iomap.c?

Here's a start (also attached, because of the kmail bug that
corrupts whitespace).

The idea is that all MMIO iomem cookies are in region 6, so
anything less than that must be a PIO cookie.  So we have:

 0xCxxxxxxxxxxxxxxx MMIO cookie (return from ioremap)
 0xRxxxxxxx1SPPPPPP PIO cookie (R=[0-9AB], S=space num, P..P=port)

I heard a rumor that ioreadX() on PIO cookies is supposed to
have looser semantics than inX() on the port, so we might be
able to get away without the memory fence in inb().  But I
can't substantiate that, so this keeps the generic behavior
of ioreadX() and inX() having identical semantics for PIO.
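
As a rough illustration of the cookie layout above (a user-space sketch,
not the posted patch): the macro values are the ones from the patch in
this message, PIO_RESERVED is hard-coded to 0xc000000000000000 on the
assumption that this is what __IA64_UNCACHED_OFFSET (start of region 6)
expands to, and the helper names pio_cookie(), cookie_is_pio() and
pio_cookie_to_port() are hypothetical.  The assert in the last helper is
modeled loosely on the kind of range check the generic code performs
before masking the cookie.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the patch in this message. */
#define MAX_IO_SPACES_BITS 4
#define IO_SPACE_BITS      24
#define PIO_OFFSET   (UINT64_C(1) << (MAX_IO_SPACES_BITS + IO_SPACE_BITS))
#define PIO_MASK     (PIO_OFFSET - 1)
/* Assumption: stands in for __IA64_UNCACHED_OFFSET (start of region 6). */
#define PIO_RESERVED UINT64_C(0xc000000000000000)

/* Hypothetical helper: build a PIO cookie from an I/O space number and port. */
static inline uint64_t pio_cookie(unsigned space, uint32_t port)
{
	assert(space < (1u << MAX_IO_SPACES_BITS));
	assert(port < (UINT32_C(1) << IO_SPACE_BITS));
	return PIO_OFFSET | ((uint64_t)space << IO_SPACE_BITS) | port;
}

/* Anything below the uncached region is treated as a PIO cookie. */
static inline int cookie_is_pio(uint64_t cookie)
{
	return cookie < PIO_RESERVED;
}

/* Strip the marker bit, leaving the space number and port (the "SPPPPPP" part). */
static inline uint64_t pio_cookie_to_port(uint64_t cookie)
{
	/* The marker bit must be set and nothing may sit above it. */
	assert((cookie & ~PIO_MASK) == PIO_OFFSET);
	return cookie & PIO_MASK;
}
```

With these definitions, space 2 / port 0x3f8 encodes to 0x120003f8, which
is classified as PIO, while a value such as 0xc000000000100000 falls in
the MMIO range.  A bare port number that never went through the mapping
step lacks the marker bit and trips the assert.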

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>

===== include/asm-ia64/io.h 1.21 vs edited =====
--- 1.21/include/asm-ia64/io.h 2004-10-05 12:30:39 -06:00
+++ edited/include/asm-ia64/io.h 2004-10-25 10:06:00 -06:00
@@ -32,7 +32,8 @@
  */
 #define IO_SPACE_LIMIT  0xffffffffffffffffUL
 
-#define MAX_IO_SPACES   16
+#define MAX_IO_SPACES_BITS  4
+#define MAX_IO_SPACES   (1UL << MAX_IO_SPACES_BITS)
 #define IO_SPACE_BITS   24
 #define IO_SPACE_SIZE   (1UL << IO_SPACE_BITS)
 
@@ -52,10 +53,16 @@
 
 # ifdef __KERNEL__
 
+#define PIO_OFFSET  (1UL << (MAX_IO_SPACES_BITS + IO_SPACE_BITS))
+#define PIO_MASK  (PIO_OFFSET - 1)
+#define PIO_RESERVED  __IA64_UNCACHED_OFFSET
+#define HAVE_ARCH_PIO_SIZE
+
 #include <asm/intrinsics.h>
 #include <asm/machvec.h>
 #include <asm/page.h>
 #include <asm/system.h>
+#include <asm-generic/iomap.h>
 
 /*
  * Change virtual addresses to physical addresses and vv.
===== lib/iomap.c 1.5 vs edited =====
--- 1.5/lib/iomap.c 2004-10-18 23:27:35 -06:00
+++ edited/lib/iomap.c 2004-10-25 10:09:26 -06:00
@@ -19,7 +19,10 @@
  *
  * Architectures for which this is not true can't use this generic
  * implementation and should do their own copy.
- *
+ */
+
+#ifndef HAVE_ARCH_PIO_SIZE
+/*
  * We encode the physical PIO addresses (0-0xffff) into the
  * pointer by offsetting them with a constant (0x10000) and
  * assuming that all the low addresses are always PIO. That means
@@ -29,6 +32,7 @@
 #define PIO_OFFSET 0x10000UL
 #define PIO_MASK 0x0ffffUL
 #define PIO_RESERVED 0x40000UL
+#endif
 
 /*
  * Ugly macros are a way of life.

[-- Attachment #2: diffs --]
[-- Type: text/x-diff, Size: 1515 bytes --]

===== include/asm-ia64/io.h 1.21 vs edited =====
--- 1.21/include/asm-ia64/io.h	2004-10-05 12:30:39 -06:00
+++ edited/include/asm-ia64/io.h	2004-10-25 10:06:00 -06:00
@@ -32,7 +32,8 @@
  */
 #define IO_SPACE_LIMIT		0xffffffffffffffffUL
 
-#define MAX_IO_SPACES			16
+#define MAX_IO_SPACES_BITS		4
+#define MAX_IO_SPACES			(1UL << MAX_IO_SPACES_BITS)
 #define IO_SPACE_BITS			24
 #define IO_SPACE_SIZE			(1UL << IO_SPACE_BITS)
 
@@ -52,10 +53,16 @@
 
 # ifdef __KERNEL__
 
+#define PIO_OFFSET		(1UL << (MAX_IO_SPACES_BITS + IO_SPACE_BITS))
+#define PIO_MASK		(PIO_OFFSET - 1)
+#define PIO_RESERVED		__IA64_UNCACHED_OFFSET
+#define HAVE_ARCH_PIO_SIZE
+
 #include <asm/intrinsics.h>
 #include <asm/machvec.h>
 #include <asm/page.h>
 #include <asm/system.h>
+#include <asm-generic/iomap.h>
 
 /*
  * Change virtual addresses to physical addresses and vv.
===== lib/iomap.c 1.5 vs edited =====
--- 1.5/lib/iomap.c	2004-10-18 23:27:35 -06:00
+++ edited/lib/iomap.c	2004-10-25 10:09:26 -06:00
@@ -19,7 +19,10 @@
  *
  * Architectures for which this is not true can't use this generic
  * implementation and should do their own copy.
- *
+ */
+
+#ifndef HAVE_ARCH_PIO_SIZE
+/*
  * We encode the physical PIO addresses (0-0xffff) into the
  * pointer by offsetting them with a constant (0x10000) and
  * assuming that all the low addresses are always PIO. That means
@@ -29,6 +32,7 @@
 #define PIO_OFFSET	0x10000UL
 #define PIO_MASK	0x0ffffUL
 #define PIO_RESERVED	0x40000UL
+#endif
 
 /*
  * Ugly macros are a way of life.


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
                   ` (2 preceding siblings ...)
  2004-10-25 16:48 ` Bjorn Helgaas
@ 2004-10-26  7:48 ` David Mosberger
  2004-10-26 15:21   ` Linus Torvalds
  2004-10-26 16:23 ` Jesse Barnes
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 15+ messages in thread
From: David Mosberger @ 2004-10-26  7:48 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Mon, 25 Oct 2004 10:48:48 -0600, Bjorn Helgaas <bjorn.helgaas@hp.com> said:

  Bjorn> On Thursday 21 October 2004 8:34 am, David Mosberger wrote:
  >> Is anybody already working on an ia64-version of lib/iomap.c?

  Bjorn> Here's a start (also attached, because of the kmail bug that
  Bjorn> corrupts whitespace).

Nice!

  Bjorn> The idea is that all MMIO iomem cookies are in region 6, so
  Bjorn> anything less than that must be a PIO cookie.  So we have:

  Bjorn> 0xCxxxxxxxxxxxxxxx MMIO cookie (return from ioremap)
  Bjorn> 0xRxxxxxxx1SPPPPPP PIO cookie (R=[0-9AB], S=space num, P..P=port)

In reality, `R' is always 0 though, right?  Would it be useful to add
the above two lines to asm-ia64/io.h?  I think they really help
understanding the code.  Perhaps it would also be useful to point out
that the "1" bit is there to catch old/buggy code which attempts to do
an I/O operation on a port without the prerequisite iomap()?

  Bjorn> I heard a rumor that ioreadX() on PIO cookies is supposed to
  Bjorn> have looser semantics than inX() on the port, so we might be
  Bjorn> able to get away without the memory fence in inb().  But I
  Bjorn> can't substantiate that, so this keeps the generic behavior
  Bjorn> of ioreadX() and inX() having identical semantics for PIO.

Can somebody confirm?  Dropping the mf.a from ioreadX() for I/O port
accesses would save lots of cycles.  Though I guess most
high-performance devices are smart enough to stay away from I/O port
space nowadays, so perhaps it doesn't matter in reality.

Thanks,

	--david


* Re: ia64 implementation of lib/iomap.c
  2004-10-26  7:48 ` David Mosberger
@ 2004-10-26 15:21   ` Linus Torvalds
  2004-10-26 16:26     ` David Mosberger
  0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2004-10-26 15:21 UTC (permalink / raw)
  To: davidm; +Cc: Bjorn Helgaas, linux-ia64, linux-arch



On Tue, 26 Oct 2004, David Mosberger wrote:
> 
>   Bjorn> I heard a rumor that ioreadX() on PIO cookies is supposed to
>   Bjorn> have looser semantics than inX() on the port, so we might be
>   Bjorn> able to get away without the memory fence in inb().  But I
>   Bjorn> can't substantiate that, so this keeps the generic behavior
>   Bjorn> of ioreadX() and inX() having identical semantics for PIO.
> 
> Can somebody confirm?  Dropping the mf.a from ioreadX() for I/O port
> accesses would save lots of cycles.

My personal opinion is (but I don't know that everybody will buy into
this) that the _CPU_ serialization of "io_read/io_write()" has to be
independent of whether the argument points to an IO port or memory-mapped
IO.

That's simply because I believe that we should encourage implementations
to be able to have a straight-line path in the accessor functions if the
hardware just supports it (ie there should be no need to check the type at
run-time unless the hardware _forces_ that check on you due to the
ioremap() phase not being able to do everything once-and-for-all).

So historically, on x86, an IO port access would be totally synchronous,
in that an outb() will actually _wait_ for the out to hit the bus. Quite
frankly, I don't think most other architectures ever implemented this even
for IO ports, and we should not even _try_ to do so for io_writeX().

HOWEVER. From a _bus_ standpoint, an IO port access is still very 
different from a memory-mapped IO access. In particular, we still have to 
guarantee that IO port accesses never get merged, and that they never get 
re-ordered. If the CPU needs to do those guarantees by hand, then they 
need to be there in the code. But I guess that those are really just the 
same guarantees as for a non-prefetchable MMIO region, so again this 
implies that the _software_ side should be the same for MMIO as for PIO.

And I don't know what "mf.a" means on ia64. Magic hardware.

I assume it's just a "total ordering", and no, I don't think we need it.  
As long as the page tables (or something else in the actual hardware)  
guarantees that reads and writes don't get re-ordered, I think we're fine.

If it's the old "ordering between CPU's" issue, then I think it's back to 
the CPU serialization side, and we should just say "if you use io_write(), 
you get the same CPU serialization for both IO ports and MMIO, and if you 
want inter-CPU ordering, you need to use a spinlock".

			Linus


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
                   ` (3 preceding siblings ...)
  2004-10-26  7:48 ` David Mosberger
@ 2004-10-26 16:23 ` Jesse Barnes
  2004-10-26 17:06 ` Linus Torvalds
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Jesse Barnes @ 2004-10-26 16:23 UTC (permalink / raw)
  To: linux-ia64

On Tuesday, October 26, 2004 12:48 am, David Mosberger wrote:
>   Bjorn> I heard a rumor that ioreadX() on PIO cookies is supposed to
>   Bjorn> have looser semantics than inX() on the port, so we might be
>   Bjorn> able to get away without the memory fence in inb().  But I
>   Bjorn> can't substantiate that, so this keeps the generic behavior
>   Bjorn> of ioreadX() and inX() having identical semantics for PIO.
>
> Can somebody confirm?  Dropping the mf.a from ioreadX() for I/O port
> accesses would save lots of cycles.  Though I guess most
> high-performance devices are smart enough to stay away from I/O port
> space nowadays, so perhaps it doesn't matter in reality.

I'm pretty sure this is the case.  In fact when I last discussed this with 
Linus he indicated that an ioread shouldn't guarantee DMA completion either, 
which would mean we could reuse the read_relaxed stuff to implement it.

Jesse


* Re: ia64 implementation of lib/iomap.c
  2004-10-26 15:21   ` Linus Torvalds
@ 2004-10-26 16:26     ` David Mosberger
  0 siblings, 0 replies; 15+ messages in thread
From: David Mosberger @ 2004-10-26 16:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: davidm, Bjorn Helgaas, linux-ia64, linux-arch

>>>>> On Tue, 26 Oct 2004 08:21:15 -0700 (PDT), Linus Torvalds <torvalds@osdl.org> said:

  Linus> So historically, on x86, an IO port access would be totally
  Linus> synchronous, in that an outb() will actually _wait_ for the
  Linus> out to hit the bus. Quite frankly, I don't think most other
  Linus> architectures ever implemented this even for IO ports, and we
  Linus> should not even _try_ to do so for io_writeX().

Well, ia64 does.  It's precisely what "mf.a" gives you:

  [mf.a] prevents any subsequent data memory accesses by the processor
  from initiating transactions to the external platform until:

    o all prior loads to sequential pages have returned data, and
    o all prior stores to sequential pages have been accepted by
      the external platform

(sequential pages are basically pages mapped uncached).  We use this
to emulate INx/OUTx semantics via memory-mapped I/O.
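
A minimal sketch of that pattern, assuming a memory-mapped port address:
do the uncached access, then fence.  The helper names io_accept_fence(),
pio_style_read8() and pio_style_write8() are hypothetical, and on
anything other than ia64 the fence is replaced by a generic full fence
purely so the sketch builds; that stand-in does not give the
"accepted by the external platform" guarantee quoted above.

```c
#include <stdint.h>

/*
 * Hypothetical fence helper.  On ia64 this issues "mf.a"; elsewhere a
 * generic full fence is used only as a placeholder and is NOT
 * equivalent to mf.a.
 */
static inline void io_accept_fence(void)
{
#if defined(__ia64__)
	__asm__ __volatile__ ("mf.a" ::: "memory");
#else
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Port-style read: load from the mapped address, then fence. */
static inline uint8_t pio_style_read8(const volatile uint8_t *addr)
{
	uint8_t val = *addr;
	io_accept_fence();
	return val;
}

/* Port-style write: store to the mapped address, then fence. */
static inline void pio_style_write8(volatile uint8_t *addr, uint8_t val)
{
	*addr = val;
	io_accept_fence();
}
```

Dropping the fence from the write path is the change under discussion;
the access itself would stay the same.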

By your argument, it should be safe to drop the "mf.a" from the I/O
port-based writes.

OTOH, I'm not sure it's worth the bother: if you have an I/O device
that does lots of pokes through I/O port space, it's gonna be slow no
matter what and the extra 1000 or so cycles the CPU stalls may not
make any difference (even though it would make any compiler-writer
cringe! ;-).  Also, if x86 gives stronger ordering guarantees, I suspect
there is _some_ broken driver out there that may rely on that
property, so it may just be safer to leave the "mf.a" there.

	--david


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
                   ` (4 preceding siblings ...)
  2004-10-26 16:23 ` Jesse Barnes
@ 2004-10-26 17:06 ` Linus Torvalds
  2004-10-26 17:49 ` Jesse Barnes
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Linus Torvalds @ 2004-10-26 17:06 UTC (permalink / raw)
  To: linux-ia64



On Tue, 26 Oct 2004, Jesse Barnes wrote:
> 
> I'm pretty sure this is the case.  In fact when I last discussed this with 
> Linus he indicated that an ioread shouldn't guarantee DMA completion either, 
> which would mean we could reuse the read_relaxed stuff to implement it.

.. but other people disagreed with me.  I think the consensus was that DMA 
completion _should_ be honoured, but if SGI knows that their machines are
not doing it right, and take on the responsibility for fixing drivers, 
that's _their_ problem. You only need to care about a few drivers, after 
all.

In short, I think of that DMA completion issue as a SGI-private
optimization, and _not_ a general rule.

			Linus


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
                   ` (5 preceding siblings ...)
  2004-10-26 17:06 ` Linus Torvalds
@ 2004-10-26 17:49 ` Jesse Barnes
  2004-10-26 17:55 ` Grant Grundler
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Jesse Barnes @ 2004-10-26 17:49 UTC (permalink / raw)
  To: linux-ia64

On Tuesday, October 26, 2004 10:06 am, Linus Torvalds wrote:
> On Tue, 26 Oct 2004, Jesse Barnes wrote:
> > I'm pretty sure this is the case.  In fact when I last discussed this
> > with Linus he indicated that an ioread shouldn't guarantee DMA completion
> > either, which would mean we could reuse the read_relaxed stuff to
> > implement it.
>
> .. but other people disagreed with me.  I think the consensus was that DMA
> completion _should_ be honoured, but if SGI knows that their machines are
> not doing it right, and take on the responsibility for fixing drivers,
> that's _their_ problem. You only need to care about a few drivers, after
> all.
>
> In short, I think of that DMA completion issue as a SGI-private
> optimization, and _not_ a general rule.

What about the relaxed read then?  Should we have ioread_relaxed?  I thought 
we had agreed that it was easier to assume relaxed semantics for ioread and 
add a dma_sync interface.  Since PCI-X and PCI-Express have optional relaxed 
semantics that might make sense...

Thanks,
Jesse


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
                   ` (6 preceding siblings ...)
  2004-10-26 17:49 ` Jesse Barnes
@ 2004-10-26 17:55 ` Grant Grundler
  2004-10-26 18:05 ` Grant Grundler
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Grant Grundler @ 2004-10-26 17:55 UTC (permalink / raw)
  To: linux-ia64

On Tue, Oct 26, 2004 at 09:23:24AM -0700, Jesse Barnes wrote:
> I'm pretty sure this is the case.

Me too. (Re perf sensitive devices NOT using IO Port address space)

> In fact when I last discussed this with 
> Linus he indicated that an ioread shouldn't guarantee DMA completion either, 
> which would mean we could reuse the read_relaxed stuff to implement it.

How can a device driver guarantee all in-flight DMA has completed
before unmapping control data?
(ie buffers allocated with pci_alloc_consistent()).

PCI ordering rules dictate that an MMIO read flushes in-flight inbound DMA.
I'm just looking for a replacement if there is going to be
a difference in semantics between readl() and io_readl().

thanks,
grant


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
                   ` (7 preceding siblings ...)
  2004-10-26 17:55 ` Grant Grundler
@ 2004-10-26 18:05 ` Grant Grundler
  2004-10-26 18:12 ` Grant Grundler
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Grant Grundler @ 2004-10-26 18:05 UTC (permalink / raw)
  To: linux-ia64

On Tue, Oct 26, 2004 at 10:55:52AM -0700, Grant Grundler wrote:
> How can a device driver guarantee all in-flight DMA has completed
> before unmapping control data?
> (ie buffers allocated with pci_alloc_consistent()).

nevermind...linus just answered that question.

grant


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
                   ` (8 preceding siblings ...)
  2004-10-26 18:05 ` Grant Grundler
@ 2004-10-26 18:12 ` Grant Grundler
  2004-10-26 18:19 ` Jesse Barnes
  2004-10-26 18:37 ` Grant Grundler
  11 siblings, 0 replies; 15+ messages in thread
From: Grant Grundler @ 2004-10-26 18:12 UTC (permalink / raw)
  To: linux-ia64

On Tue, Oct 26, 2004 at 10:49:07AM -0700, Jesse Barnes wrote:
> What about the relaxed read then?  Should we have ioread_relaxed?
> I thought we had agreed that it was easier to assume relaxed semantics
> for ioread and add a dma_sync interface.

I would expect that requires fixing PCI drivers that depend on it.
Adding a dma_sync interface would probably make it easier to support
non-coherent (DMA and CPU caches are not coherent) platforms.

> Since PCI-X and PCI-Express have optional relaxed semantics that
> might make sense...

Jesse, you keep mixing up PCI-X Relaxed Ordering with readX() interface
and the two are NOT (directly) related.
The device driver can enable PCI-X Relaxed Ordering hints in general.
But the IO device controls "RO" hint use on individual bus transactions
it masters.

I'm not sure about PCI-Express (whole new bus protocol).

grant


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
                   ` (9 preceding siblings ...)
  2004-10-26 18:12 ` Grant Grundler
@ 2004-10-26 18:19 ` Jesse Barnes
  2004-10-26 18:37 ` Grant Grundler
  11 siblings, 0 replies; 15+ messages in thread
From: Jesse Barnes @ 2004-10-26 18:19 UTC (permalink / raw)
  To: linux-ia64

On Tuesday, October 26, 2004 11:12 am, Grant Grundler wrote:
> On Tue, Oct 26, 2004 at 10:49:07AM -0700, Jesse Barnes wrote:
> > What about the relaxed read then?  Should we have ioread_relaxed?
> > I thought we had agreed that it was easier to assume relaxed semantics
> > for ioread and add a dma_sync interface.
>
> I would expect that requires fixing PCI drivers that depend on it.
> Adding a dma_sync interface would probably make it easier to support
> non-coherent (DMA and CPU caches are not coherent) platforms.

Yep.  And it has to sync both consistent and non-consistent memory (flush 
might be a better term since coherence isn't really the issue).

> > Since PCI-X and PCI-Express have optional relaxed semantics that
> > might make sense...
>
> Jesse, you keep mixing up PCI-X Relaxed Ordering with readX() interface
> and the two are NOT (directly) related.
> The device driver can enable PCI-X Relaxed Ordering hints in general.
> But the IO device controls "RO" hint use on individual bus transactions 
> it masters.

I don't believe you.  The spec makes it look like the I/O address *coming from 
the CPU* has to contain a bit to indicate relaxed ordering.  But as I've said 
before, we won't know until we see chipsets that support all aspects of this 
feature.

Jesse


* Re: ia64 implementation of lib/iomap.c
  2004-10-21 14:34 ia64 implementation of lib/iomap.c David Mosberger
                   ` (10 preceding siblings ...)
  2004-10-26 18:19 ` Jesse Barnes
@ 2004-10-26 18:37 ` Grant Grundler
  11 siblings, 0 replies; 15+ messages in thread
From: Grant Grundler @ 2004-10-26 18:37 UTC (permalink / raw)
  To: linux-ia64

On Tue, Oct 26, 2004 at 11:19:20AM -0700, Jesse Barnes wrote:
> Yep.  And it has to sync both consistent and non-consistent memory (flush 
> might be a better term since coherence isn't really the issue).

coherence is the issue. data is coherent until it reaches
whatever domain the chipset defines to be coherent.
Normally that means the data has to reach some path common to
CPU (cache) and memory controller.

> > Jesse, you keep mixing up PCI-X Relaxed Ordering with readX() interface
> > and the two are NOT (directly) related.
> > The device driver can enable PCI-X Relaxed Ordering hints in general.
> > But the IO device controls "RO" hint use on individual bus transactions 
> > it masters.
> 
> I don't believe you.  The spec makes it look like the I/O address *coming
> from the CPU* has to contain a bit to indicate relaxed ordering.

I agree that's true for outbound DMA but not inbound DMA.
Whoever masters the transaction gets to set the attribute.
For outbound (to device), CPU/Chipset get to decide.

I quote from section 11.1 "Relaxed Write Ordering" of
"PCI-X Addendum to the PCI Local Bus Specification, Revision 1.0a,
July 24, 2000":
    ...Thus, individual write transactions to that buffer area can be
    allowed to complete out of order as long as the status write pushes all
    previous writes ahead of it. An I/O device can easily accomplish this by
    setting the Relaxed Ordering attribute for all payload write transactions
    but always generating a separate transaction for the status write(s) with
    the Relaxed Ordering attribute not set.

The above description makes it pretty clear the device is setting the
attribute.

I'm not worried about ordering on outbound (aka DMA reads vs MMIO
writes) since I know parisc violates that rule and it works fine
for the IO devices that are commonly used there. Some HP IA64 platforms
also violate the outbound order (DMA reads can bypass MMIO writes).

> But as I've said 
> before, we won't know until we see chipsets that support all aspects of this 
> feature.

Agreed. We might never enable it. But in any case,  I'll assert again
PCI-X RO is orthogonal to the read_relaxed() discussion.

grant
