* RE: [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed
@ 2004-03-02 0:45 Andrew Vasquez
2004-03-02 1:02 ` Jeremy Higdon
2004-03-03 10:10 ` Jeremy Higdon
0 siblings, 2 replies; 9+ messages in thread
From: Andrew Vasquez @ 2004-03-02 0:45 UTC (permalink / raw)
To: Jeremy Higdon; +Cc: linux-scsi
On Wednesday, February 25, 2004 8:06 PM,
linux-scsi-owner@vger.kernel.org wrote:
> For those to whom this is new (it was discussed on linux-kernel and
> linux-ia64 I believe), normal PCI register reads imply that PCI DMA
> writes that occured prior to the PCI MMR (memory mapped register)
> read (on the PCI bus) will be reflected in system memory once the
> MMR read is complete.
>
> On our platforms, we can speed up the MMR read significantly if that
> ordering requirement is "relaxed".
>
Interesting...but this implementation seems to be applying a different
set of semantic rules to the term 'relaxed' in comparison to the
'relaxed ordering' rules defined by PCI-X and PCI-Express, no?
> So I attempted to find the common register reads that don't have a
> need for this ordering so that I could make them use this faster
> read.
>
> I did not change this line (111 of drivers/scsi/qla2xxx/qla_isr.c),
> because it may in some cases imply that a DMA write has completed.
>
> stat = RD_REG_DWORD(®->u.isp2300.host_status);
>
Yes in several cases it would.
> Andrew, if you have a chance to look at this and incorporate it in
> the driver, it would be great. Also, any comments would be welcome.
>
Sounds good, I just wish I had an Altix machine to test with. You
seem to have covered most of the fast-path cases, I'll look some more
tonight for any others. I guess my only nit-pick is the lowercase
_relaxed() suffix applied to RD_REG* #defines.
Thanks,
Andrew Vasquez
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed
2004-03-02 0:45 [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed Andrew Vasquez
@ 2004-03-02 1:02 ` Jeremy Higdon
2004-03-02 17:19 ` Jesse Barnes
2004-03-03 10:10 ` Jeremy Higdon
1 sibling, 1 reply; 9+ messages in thread
From: Jeremy Higdon @ 2004-03-02 1:02 UTC (permalink / raw)
To: Andrew Vasquez; +Cc: linux-scsi, jbarnes
On Mon, Mar 01, 2004 at 04:45:49PM -0800, Andrew Vasquez wrote:
> On Wednesday, February 25, 2004 8:06 PM,
> linux-scsi-owner@vger.kernel.org wrote: (really me)
> > For those to whom this is new (it was discussed on linux-kernel and
> > linux-ia64 I believe), normal PCI register reads imply that PCI DMA
> > writes that occured prior to the PCI MMR (memory mapped register)
> > read (on the PCI bus) will be reflected in system memory once the
> > MMR read is complete.
> >
> > On our platforms, we can speed up the MMR read significantly if that
> > ordering requirement is "relaxed".
> >
>
> Interesting...but this implementation seems to be applying a different
> set of semantic rules to the term 'relaxed' in comparison to the
> 'relaxed ordering' rules defined by PCI-X and PCI-Express, no?
It is similar, but not the same, unfortunately.
We could have used, for example, readb_nodmasync, or readb_nosync. I
think I like _relaxed better, but ultimately, I'd be willing to bend.
> > So I attempted to find the common register reads that don't have a
> > need for this ordering so that I could make them use this faster
> > read.
> >
> > I did not change this line (111 of drivers/scsi/qla2xxx/qla_isr.c),
> > because it may in some cases imply that a DMA write has completed.
> >
> > stat = RD_REG_DWORD(®->u.isp2300.host_status);
> >
>
> Yes in several cases it would.
>
> > Andrew, if you have a chance to look at this and incorporate it in
> > the driver, it would be great. Also, any comments would be welcome.
> >
>
> Sounds good, I just wish I had an Altix machine to test with. You
> seem to have covered most of the fast-path cases, I'll look some more
> tonight for any others. I guess my only nit-pick is the lowercase
> _relaxed() suffix applied to RD_REG* #defines.
You can certainly change the #defines in the driver to match what you
want. If you want to change the underlying pci codes, we'd have to
patch the kernel a bit. If there's a good consensus that _relaxed
is bad and something else is better, we could do that, right Jesse? :-)
In any case, thanks for taking a look at this.
jeremy
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed
2004-03-02 1:02 ` Jeremy Higdon
@ 2004-03-02 17:19 ` Jesse Barnes
0 siblings, 0 replies; 9+ messages in thread
From: Jesse Barnes @ 2004-03-02 17:19 UTC (permalink / raw)
To: Jeremy Higdon; +Cc: Andrew Vasquez, linux-scsi
On Mon, Mar 01, 2004 at 05:02:28PM -0800, Jeremy Higdon wrote:
> On Mon, Mar 01, 2004 at 04:45:49PM -0800, Andrew Vasquez wrote:
> > Sounds good, I just wish I had an Altix machine to test with. You
> > seem to have covered most of the fast-path cases, I'll look some more
> > tonight for any others. I guess my only nit-pick is the lowercase
> > _relaxed() suffix applied to RD_REG* #defines.
Cool, that would be nice.
> You can certainly change the #defines in the driver to match what you
> want. If you want to change the underlying pci codes, we'd have to
> patch the kernel a bit. If there's a good consensus that _relaxed
> is bad and something else is better, we could do that, right Jesse? :-)
Sure, we can do a search and replace before there are too many users if
there's enough demand.
Jesse
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed
2004-03-02 0:45 [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed Andrew Vasquez
2004-03-02 1:02 ` Jeremy Higdon
@ 2004-03-03 10:10 ` Jeremy Higdon
2004-03-05 5:51 ` Andrew Vasquez
1 sibling, 1 reply; 9+ messages in thread
From: Jeremy Higdon @ 2004-03-03 10:10 UTC (permalink / raw)
To: Andrew Vasquez; +Cc: linux-scsi, jbarnes
On Mon, Mar 01, 2004 at 04:45:49PM -0800, Andrew Vasquez wrote:
> On Wednesday, February 25, 2004 8:06 PM,
> linux-scsi-owner@vger.kernel.org wrote:
> > For those to whom this is new (it was discussed on linux-kernel and
> > linux-ia64 I believe), normal PCI register reads imply that PCI DMA
> > writes that occured prior to the PCI MMR (memory mapped register)
> > read (on the PCI bus) will be reflected in system memory once the
> > MMR read is complete.
> >
> > On our platforms, we can speed up the MMR read significantly if that
> > ordering requirement is "relaxed".
> >
>
> Interesting...but this implementation seems to be applying a different
> set of semantic rules to the term 'relaxed' in comparison to the
> 'relaxed ordering' rules defined by PCI-X and PCI-Express, no?
I'm going to retract my previous agreement to this :-). I took a good
look at the PCI Express specification today, and it appears to me as
though this implementation should be applicable to PCI Express.
Relaxed Ordering on PCI Express can be set on memory reads (i.e. PIO
reads originated by the processor) and posted memory writes (i.e. DMA
writes originated by the Qlogic chip).
When set, it indicates that the posted write or the read completion
(reply from the device to the PIO read) may pass other posted writes,
read completions, or messages.
For optimal efficiency, the Qlogic chip should set the relaxed order
bit for DMA writes of data (not response queue DMA writes), and the
host bridge should set the relaxed order bit on PIO reads in which
it is safe for the reply to pass posted writes, read completions, and
messages. Of course, this all assume that the host bridge and the
PCI device implement this bit and can turn it on and off on a
per PCI transaction basis.
In theory, I think this would mean you could turn on relaxed ordering
for all PIO reads, save for the read of the response queue out (or
host status on the 2300). You could make that read relaxed also with
a little bit of driver work. You'd need to detect when the response
queue DMA write for IOCB X completes after a PIO read indicates that
IOCB X is done. When you saw that, you could issue a non-relaxed read
of the response queue out (or host status). Since mailbox completions
are relatively rare, you could issue a non-relaxed read for those
if necessary, before examing data that is implied by the mailbox
completion.
jeremy
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed
2004-03-03 10:10 ` Jeremy Higdon
@ 2004-03-05 5:51 ` Andrew Vasquez
2004-03-05 6:52 ` Jeremy Higdon
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Vasquez @ 2004-03-05 5:51 UTC (permalink / raw)
To: linux-scsi; +Cc: Jeremy Higdon
On Wed, 03 Mar 2004, Jeremy Higdon wrote:
> On Mon, Mar 01, 2004 at 04:45:49PM -0800, Andrew Vasquez wrote:
> > Interesting...but this implementation seems to be applying a different
> > set of semantic rules to the term 'relaxed' in comparison to the
> > 'relaxed ordering' rules defined by PCI-X and PCI-Express, no?
>
>
> I'm going to retract my previous agreement to this :-). I took a good
> look at the PCI Express specification today, and it appears to me as
> though this implementation should be applicable to PCI Express.
>
The intent of my original statement wasn't to imply that the
readX_relaxed() routines wouldn't be applicable to PCI-[X|Express],
but instead to note my own concerns with the differing semantics of
readX_relaxed() and 'relaxed ordering' of PCI-[X|Express], which you
mention below.
> When set, it indicates that the posted write or the read completion
> (reply from the device to the PIO read) may pass other posted writes,
> read completions, or messages.
>
I've read through the original readX_relaxed() thread posted on LK and
no-one else seemed to express similiar concerns. My own personal
slant leans towards the readX_nosyncdma() name, but again I seem to
always be in the minority...
Finally, in looking through other locations where we could apply this
technique (though i must admit your patch covers almost all fast-path
uses), the old calls to qla2x00_debounce_register(ISP_REQ_Q_OUT())
in qla2x00_req_pkt() and qla2x00_ms_req_pkt() should be replaced with
the standard RD_REG_WORD(). Through not in the fast, the
RD_REG_WORD() could then be extended to the *_relaxed() calls.
Thanks again for the patches.
Regards,
Andrew Vasquez
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed
2004-03-05 5:51 ` Andrew Vasquez
@ 2004-03-05 6:52 ` Jeremy Higdon
0 siblings, 0 replies; 9+ messages in thread
From: Jeremy Higdon @ 2004-03-05 6:52 UTC (permalink / raw)
To: Andrew Vasquez, linux-scsi
On Thu, Mar 04, 2004 at 09:51:17PM -0800, Andrew Vasquez wrote:
>
> The intent of my original statement wasn't to imply that the
> readX_relaxed() routines wouldn't be applicable to PCI-[X|Express],
> but instead to note my own concerns with the differing semantics of
> readX_relaxed() and 'relaxed ordering' of PCI-[X|Express], which you
> mention below.
I understand. I think if you have a bridge that allows you to set
the RO bit, then you need a way to control when it is set. I think
the readX_relaxed works very well for that.
> > When set, it indicates that the posted write or the read completion
> > (reply from the device to the PIO read) may pass other posted writes,
> > read completions, or messages.
> >
>
> I've read through the original readX_relaxed() thread posted on LK and
> no-one else seemed to express similiar concerns. My own personal
> slant leans towards the readX_nosyncdma() name, but again I seem to
> always be in the minority...
That was a confusing thread that wandered a bit. In PCI-Express (the
PCI-X writeup is a little different, though I think it explains
similar semantics), because it is no longer a bus, read requests and
read completions are always separate events. So . . .
My understanding is that the host bridge will set (or clear) the RO
bit on the read request. Then the target device copies the setting
of the RO bit on its read completion. If the RO bit is set on the
completion, it is allowed to pass posted writes, other read completions,
and messages. In that respect, it no longer "pushes" DMA writes to
complete before it does. I don't think you'd want a host bridge that
unconditionally sets the bit, so if you're going to support setting
the RO bit as a bridge (or Root Complex in PCI-Express parlance), it
has to be under software control.
> Finally, in looking through other locations where we could apply this
> technique (though i must admit your patch covers almost all fast-path
> uses), the old calls to qla2x00_debounce_register(ISP_REQ_Q_OUT())
> in qla2x00_req_pkt() and qla2x00_ms_req_pkt() should be replaced with
> the standard RD_REG_WORD(). Through not in the fast, the
> RD_REG_WORD() could then be extended to the *_relaxed() calls.
Right. Just about all register reads can actually be changed to
relaxed reads, since there are very few that imply DMA write completions.
I was trying to keep the patch small :-)
> Thanks again for the patches.
Not at all. Thank you for taking the time to look at them and
comment.
> Regards,
> Andrew Vasquez
thanks
jeremy
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed
@ 2004-02-26 4:05 Jeremy Higdon
2004-02-26 8:52 ` Arjan van de Ven
0 siblings, 1 reply; 9+ messages in thread
From: Jeremy Higdon @ 2004-02-26 4:05 UTC (permalink / raw)
To: linux-scsi, jbarnes
I'm submitting a patch for comment and consideration. The relaxed read
interface was added a few weeks back, and this patch lets the qla2xxx driver
take advantage of it.
For those to whom this is new (it was discussed on linux-kernel and linux-ia64
I believe), normal PCI register reads imply that PCI DMA writes that occured
prior to the PCI MMR (memory mapped register) read (on the PCI bus) will be
reflected in system memory once the MMR read is complete.
On our platforms, we can speed up the MMR read significantly if that ordering
requirement is "relaxed".
So I attempted to find the common register reads that don't have a need
for this ordering so that I could make them use this faster read.
I did not change this line (111 of drivers/scsi/qla2xxx/qla_isr.c),
because it may in some cases imply that a DMA write has completed.
stat = RD_REG_DWORD(®->u.isp2300.host_status);
Andrew, if you have a chance to look at this and incorporate it in the
driver, it would be great. Also, any comments would be welcome.
thanks
jeremy
===== drivers/scsi/qla2xxx/qla_def.h 1.5 vs edited =====
--- 1.5/drivers/scsi/qla2xxx/qla_def.h Mon Feb 2 08:02:02 2004
+++ edited/drivers/scsi/qla2xxx/qla_def.h Wed Feb 25 16:14:18 2004
@@ -154,6 +154,9 @@
#define RD_REG_BYTE(addr) readb(addr)
#define RD_REG_WORD(addr) readw(addr)
#define RD_REG_DWORD(addr) readl(addr)
+#define RD_REG_BYTE_relaxed(addr) readb_relaxed(addr)
+#define RD_REG_WORD_relaxed(addr) readw_relaxed(addr)
+#define RD_REG_DWORD_relaxed(addr) readl_relaxed(addr)
#define WRT_REG_BYTE(addr, data) writeb(data,addr)
#define WRT_REG_WORD(addr, data) writew(data,addr)
#define WRT_REG_DWORD(addr, data) writel(data,addr)
===== drivers/scsi/qla2xxx/qla_iocb.c 1.2 vs edited =====
--- 1.2/drivers/scsi/qla2xxx/qla_iocb.c Mon Feb 2 07:41:39 2004
+++ edited/drivers/scsi/qla2xxx/qla_iocb.c Wed Feb 25 01:23:36 2004
@@ -388,7 +388,7 @@
if (ha->req_q_cnt < (req_cnt + 2)) {
/* Calculate number of free request entries */
- cnt = RD_REG_WORD(ISP_REQ_Q_OUT(ha, reg));
+ cnt = RD_REG_WORD_relaxed(ISP_REQ_Q_OUT(ha, reg));
if (ha->req_ring_index < cnt)
ha->req_q_cnt = cnt - ha->req_ring_index;
else
@@ -496,7 +496,7 @@
/* Set chip new ring index. */
WRT_REG_WORD(ISP_REQ_Q_IN(ha, reg), ha->req_ring_index);
- RD_REG_WORD(ISP_REQ_Q_IN(ha, reg)); /* PCI Posting. */
+ RD_REG_WORD_relaxed(ISP_REQ_Q_IN(ha, reg)); /* PCI Posting. */
spin_unlock_irqrestore(&ha->hardware_lock, flags);
return (QLA_SUCCESS);
@@ -748,5 +748,5 @@
/* Set chip new ring index. */
WRT_REG_WORD(ISP_REQ_Q_IN(ha, reg), ha->req_ring_index);
- RD_REG_WORD(ISP_REQ_Q_IN(ha, reg)); /* PCI Posting. */
+ RD_REG_WORD_relaxed(ISP_REQ_Q_IN(ha, reg)); /* PCI Posting. */
}
===== drivers/scsi/qla2xxx/qla_isr.c 1.5 vs edited =====
--- 1.5/drivers/scsi/qla2xxx/qla_isr.c Tue Feb 10 13:51:30 2004
+++ edited/drivers/scsi/qla2xxx/qla_isr.c Wed Feb 25 01:18:04 2004
@@ -165,7 +165,7 @@
break;
}
WRT_REG_WORD(®->hccr, HCCR_CLR_RISC_INT);
- RD_REG_WORD(®->hccr);
+ RD_REG_WORD_relaxed(®->hccr);
}
}
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed
2004-02-26 4:05 Jeremy Higdon
@ 2004-02-26 8:52 ` Arjan van de Ven
2004-02-26 16:47 ` Jesse Barnes
0 siblings, 1 reply; 9+ messages in thread
From: Arjan van de Ven @ 2004-02-26 8:52 UTC (permalink / raw)
To: Jeremy Higdon; +Cc: linux-scsi, jbarnes
[-- Attachment #1: Type: text/plain, Size: 323 bytes --]
On Thu, 2004-02-26 at 05:05, Jeremy Higdon wrote:
> I'm submitting a patch for comment and consideration. The relaxed read
> interface was added a few weeks back, and this patch lets the qla2xxx driver
> take advantage of it.
please tell me we don't have to pee all over the drivers to add this
everywhere now;(
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed
2004-02-26 8:52 ` Arjan van de Ven
@ 2004-02-26 16:47 ` Jesse Barnes
0 siblings, 0 replies; 9+ messages in thread
From: Jesse Barnes @ 2004-02-26 16:47 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Jeremy Higdon, linux-scsi, jbarnes
On Thu, Feb 26, 2004 at 09:52:30AM +0100, Arjan van de Ven wrote:
> On Thu, 2004-02-26 at 05:05, Jeremy Higdon wrote:
> > I'm submitting a patch for comment and consideration. The relaxed read
> > interface was added a few weeks back, and this patch lets the qla2xxx driver
> > take advantage of it.
>
> please tell me we don't have to pee all over the drivers to add this
> everywhere now;(
Not everywhere. Drivers will work fine without any changes, but will
work faster if readX is turned into readX_relaxed in the fast paths
where DMA completion isn't important. In the lkml thread that discussed
this feature everyone seemed to agree on that approach. Would you
rather have a different interface?
Thanks,
Jesse
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2004-03-05 6:52 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-03-02 0:45 [PATCH] 2.6.3 qla2xxx driver -- use readX_relaxed Andrew Vasquez
2004-03-02 1:02 ` Jeremy Higdon
2004-03-02 17:19 ` Jesse Barnes
2004-03-03 10:10 ` Jeremy Higdon
2004-03-05 5:51 ` Andrew Vasquez
2004-03-05 6:52 ` Jeremy Higdon
-- strict thread matches above, loose matches on Subject: below --
2004-02-26 4:05 Jeremy Higdon
2004-02-26 8:52 ` Arjan van de Ven
2004-02-26 16:47 ` Jesse Barnes
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox