public inbox for linux-scsi@vger.kernel.org
* [patch resend] firewire: ieee1394: Move away from SG_ALL
@ 2008-08-04 18:08 Stefan Richter
  2008-08-04 19:21 ` James Bottomley
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Richter @ 2008-08-04 18:08 UTC (permalink / raw)
  To: linux1394-devel; +Cc: linux-scsi, Boaz Harrosh

As a discussion reminded me today, I believe I should merge the
following patch (could have done so much earlier in fact).  Or is there
anything speaking against it?

Meanwhile, SG_ALL has been redefined from 255 to SCSI_MAX_SG_SEGMENTS =
128 without me noticing it.  But since several popular architectures
define ARCH_HAS_SG_CHAIN, we may want to go back to 255 in (fw-)sbp2.
(Besides, we should also do some tests to figure out an actual optimum.
And fw-sbp2.c's struct sbp2_command_orb should become variably sized.)

Does the most relevant user of large transfers with SBP-2 devices,
sd-mod, use chained s/g lists?



Date: Thu, 17 Jan 2008 18:41:47 +0200
From: Boaz Harrosh <bharrosh@panasas.com>
Subject: [PATCH 1/8] firewire: ieee1394: Move away from SG_ALL

  SG_ALL wants to be ~0, meaning "any future size".
  The group of drivers below preallocates a scatter-list
  buffer of max_size, so set that size to 255 (like before).

  A better scheme could be devised with more dynamic allocation,
  perhaps from a kmem_cache.

  List of drivers/files:
	drivers/firewire/fw-sbp2.c
	drivers/ieee1394/sbp2.[ch]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 drivers/firewire/fw-sbp2.c |    6 ++++--
 drivers/ieee1394/sbp2.c    |    2 +-
 drivers/ieee1394/sbp2.h    |    4 +++-
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/firewire/fw-sbp2.c b/drivers/firewire/fw-sbp2.c
index c2169d2..078501c 100644
--- a/drivers/firewire/fw-sbp2.c
+++ b/drivers/firewire/fw-sbp2.c
@@ -151,6 +151,7 @@ struct sbp2_target {
 };
 
 #define SBP2_MAX_SG_ELEMENT_LENGTH	0xf000
+#define SBP2_MAX_SG_COUNT		255
 #define SBP2_MAX_SECTORS		255	/* Max sectors supported */
 #define SBP2_ORB_TIMEOUT		2000	/* Timeout in ms */
 
@@ -272,7 +273,8 @@ struct sbp2_command_orb {
 	scsi_done_fn_t done;
 	struct sbp2_logical_unit *lu;
 
-	struct sbp2_pointer page_table[SG_ALL] __attribute__((aligned(8)));
+	struct sbp2_pointer page_table[SBP2_MAX_SG_COUNT]
+						__attribute__((aligned(8)));
 	dma_addr_t page_table_bus;
 };
 
@@ -1329,7 +1331,7 @@ static struct scsi_host_template scsi_driver_template = {
 	.slave_configure	= sbp2_scsi_slave_configure,
 	.eh_abort_handler	= sbp2_scsi_abort,
 	.this_id		= -1,
-	.sg_tablesize		= SG_ALL,
+	.sg_tablesize		= SBP2_MAX_SG_COUNT,
 	.use_clustering		= ENABLE_CLUSTERING,
 	.cmd_per_lun		= 1,
 	.can_queue		= 1,
diff --git a/drivers/ieee1394/sbp2.c b/drivers/ieee1394/sbp2.c
index 1eda11a..070763a 100644
--- a/drivers/ieee1394/sbp2.c
+++ b/drivers/ieee1394/sbp2.c
@@ -324,7 +324,7 @@ static struct scsi_host_template sbp2_shost_template = {
 	.slave_configure	 = sbp2scsi_slave_configure,
 	.slave_destroy		 = sbp2scsi_slave_destroy,
 	.this_id		 = -1,
-	.sg_tablesize		 = SG_ALL,
+	.sg_tablesize		 = SBP2_MAX_SG_COUNT,
 	.use_clustering		 = ENABLE_CLUSTERING,
 	.cmd_per_lun		 = SBP2_MAX_CMDS,
 	.can_queue		 = SBP2_MAX_CMDS,
diff --git a/drivers/ieee1394/sbp2.h b/drivers/ieee1394/sbp2.h
index 333a4bb..ae31788 100644
--- a/drivers/ieee1394/sbp2.h
+++ b/drivers/ieee1394/sbp2.h
@@ -222,6 +222,7 @@ struct sbp2_status_block {
  */
 
 #define SBP2_MAX_SG_ELEMENT_LENGTH		0xf000
+#define SBP2_MAX_SG_COUNT			255
 #define SBP2_MAX_SECTORS			255
 /* There is no real limitation of the queue depth (i.e. length of the linked
  * list of command ORBs) at the target. The chosen depth is merely an
@@ -257,7 +258,8 @@ struct sbp2_command_info {
 
 	/* Also need s/g structure for each sbp2 command */
 	struct sbp2_unrestricted_page_table
-		scatter_gather_element[SG_ALL] __attribute__((aligned(8)));
+		scatter_gather_element[SBP2_MAX_SG_COUNT]
+						__attribute__((aligned(8)));
 	dma_addr_t sge_dma;
 	void *sge_buffer;
 	dma_addr_t cmd_dma;

-- 
Stefan Richter
-=====-==--- =--- --=--
http://arcgraph.de/sr/



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [patch resend] firewire: ieee1394: Move away from SG_ALL
  2008-08-04 18:08 [patch resend] firewire: ieee1394: Move away from SG_ALL Stefan Richter
@ 2008-08-04 19:21 ` James Bottomley
  2008-08-04 19:58   ` Stefan Richter
  0 siblings, 1 reply; 14+ messages in thread
From: James Bottomley @ 2008-08-04 19:21 UTC (permalink / raw)
  To: Stefan Richter; +Cc: linux1394-devel, linux-scsi, Boaz Harrosh

On Mon, 2008-08-04 at 20:08 +0200, Stefan Richter wrote:
> As a discussion reminded me today, I believe I should merge the
> following patch (could have done so much earlier in fact).  Or is there
> anything speaking against it?

The value 255 is chosen because it worked before.  What you need to do
is establish what the upper limit for firewire is (or indeed if it has
one).

> Meanwhile, SG_ALL has been redefined from 255 to SCSI_MAX_SG_SEGMENTS =
> 128 without me noticing it.  But since several popular architectures
> define ARCH_HAS_SG_CHAIN, we may want to go back to 255 in (fw-)sbp2.
> (Besides, we should also do some tests to figure out an actual optimum.
> And fw-sbp2.c's struct sbp2_command_orb should become variably sized.)

Don't bother with optimum ... that's block's job based on what it sees
from the completions.  All we need to know is maximum.

> Does the most relevant user of large transfers with SBP-2 devices,
> sd-mod, use chained s/g lists?

pass, but firewire is a reasonably slow bus by modern standards, and you
have the protocol overhead for each ORB, so I'd guess there's a point at
which increasing the transaction size doesn't buy anything.

James



-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch resend] firewire: ieee1394: Move away from SG_ALL
  2008-08-04 19:21 ` James Bottomley
@ 2008-08-04 19:58   ` Stefan Richter
  2008-08-05 12:22     ` Boaz Harrosh
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Richter @ 2008-08-04 19:58 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux1394-devel, linux-scsi, Boaz Harrosh

James Bottomley wrote:
> On Mon, 2008-08-04 at 20:08 +0200, Stefan Richter wrote:
>> As a discussion reminded me today, I believe I should merge the
>> following patch (could have done so much earlier in fact).  Or is there
>> anything speaking against it?
> 
> The value 255 is chosen because it worked before.  What you need to do
> is establish what the upper limit for firewire is (or indeed if it has
> one).

The limit is 65535, following from SBP-2 clause 5.1.2, definition of 
data_size.

[Side note:  The SBP-2 s/g list (a.k.a. page table) consists of 64bit 
wide entries and needs to be contiguous in memory from the POV of the 
FireWire PCI or PCIe controller, and the SBP-2 target reads the table 
from the initiator's memory.  The (fw-)sbp2 driver builds this table as 
a copy of the kernel's s/g list; but this was certainly already clear 
to the reader from the context in the diff.]

>> Meanwhile, SG_ALL has been redefined from 255 to SCSI_MAX_SG_SEGMENTS =
>> 128 without me noticing it.  But since several popular architectures
>> define ARCH_HAS_SG_CHAIN, we may want to go back to 255 in (fw-)sbp2.
>> (Besides, we should also do some tests to figure out an actual optimum.
>> And fw-sbp2.c's struct sbp2_command_orb should become variably sized.)
> 
> Don't bother with optimum ... that's block's job based on what it sees
> from the completions.  All we need to know is maximum.

OK, with the small caveat that the current (fw-)sbp2 driver code is 
somewhat simplistic WRT page table handling and geared towards rather 
short page tables.  But this may be possible to enhance without too much 
difficulty.

>> Does the most relevant user of large transfers with SBP-2 devices,
>> sd-mod, use chained s/g lists?
> 
> pass, but firewire is a reasonably slow bus by modern standards, and you

A 3200 Mb/s spec exists :-) (though no silicon yet, to my knowledge).

> have the protocol overhead for each ORB, so I'd guess there's a point at
> which increasing the transaction size doesn't buy anything.
> 
> James

-- 
Stefan Richter
-=====-==--- =--- --=--
http://arcgraph.de/sr/


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch resend] firewire: ieee1394: Move away from SG_ALL
  2008-08-04 19:58   ` Stefan Richter
@ 2008-08-05 12:22     ` Boaz Harrosh
  2008-08-05 15:01       ` Stefan Richter
  2008-08-05 19:57       ` [patch resend] firewire: ieee1394: Move away from SG_ALL Stefan Richter
  0 siblings, 2 replies; 14+ messages in thread
From: Boaz Harrosh @ 2008-08-05 12:22 UTC (permalink / raw)
  To: Stefan Richter; +Cc: James Bottomley, linux1394-devel, linux-scsi

Stefan Richter wrote:
> James Bottomley wrote:
>> On Mon, 2008-08-04 at 20:08 +0200, Stefan Richter wrote:
>>> As a discussion reminded me today, I believe I should merge the
>>> following patch (could have done so much earlier in fact).  Or is there
>>> anything speaking against it?
>> The value 255 is chosen because it worked before.  What you need to do
>> is establish what the upper limit for firewire is (or indeed if it has
>> one).
> 
> The limit is 65535, following from SBP-2 clause 5.1.2, definition of 
> data_size.
> 
> [Side note:  The SBP-2 s/g list (a.k.a. page table) consists of 64bit 
> wide entries and needs to be contiguous in memory from the POV of the 
> FireWire PCI or PCIe controller, and the SBP-2 target reads the table 
> from the initiator's memory.  The (fw-)sbp2 driver builds this table as 
> a copy of the kernel's s/g list; but this was certainly already clear 
> to the reader from the context in the diff.]
> 

Does the above mean you need contiguous physical memory to hold
sbp2's s/g table?  If so, it should be allocated with alloc_coherent,
with a total size of maybe up to a PAGE_SIZE.

>>> Meanwhile, SG_ALL has been redefined from 255 to SCSI_MAX_SG_SEGMENTS =
>>> 128 without me noticing it.  But since several popular architectures
>>> define ARCH_HAS_SG_CHAIN, we may want to go back to 255 in (fw-)sbp2.
>>> (Besides, we should also do some tests to figure out an actual optimum.
>>> And fw-sbp2.c's struct sbp2_command_orb should become variably sized.)
>> Don't bother with optimum ... that's block's job based on what it sees
>> from the completions.  All we need to know is maximum.
> 

From my tests, the block layer will very quickly send you chained commands.
If you set it to 255 you'll see two chained fragments.  In iSCSI we have
4096 and it is very good for performance.  However, since you are pre-allocating
the memory per can_queue commands, this can get big, and the memory overhead
can impact performance.

> OK, with the small caveat that the current (fw-)sbp2 driver code is 
> somewhat simplistic WRT page table handling and geared towards rather 
> short page tables.  But this may be possible to enhance without too much 
> difficulty.
> 
>>> Does the most relevant user of large transfers with SBP-2 devices,
>>> sd-mod, use chained s/g lists?
>> pass, but firewire is a reasonably slow bus by modern standards, and you
> 
> A 3200 Mb/s spec exists :-) (though no silicon yet, to my knowledge).
> 

As you said above, and since bitrates are increasing, 255 is a good
preallocated value, but perhaps it could also be a module parameter so
that fast cards can have a bigger pre-allocated s/g table (per can_queue
command).  Embedded systems, on the other hand, might want to lower this
value.

>> have the protocol overhead for each ORB, so I'd guess there's a point at
>> which increasing the transaction size doesn't buy anything.
>>
>> James
> 

From re-inspecting the code I can see that struct sbp2_command_orb
might have problems on non-DMA-coherent architectures.  I think for safety
and flexibility the sbp2_command_orb.page_table array should be
dynamically allocated at init time, maybe even with alloc_coherent.

Do you want me to send you a sketch of what I mean?

Boaz

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch resend] firewire: ieee1394: Move away from SG_ALL
  2008-08-05 12:22     ` Boaz Harrosh
@ 2008-08-05 15:01       ` Stefan Richter
  2008-08-05 15:28         ` Boaz Harrosh
  2008-08-05 19:57       ` [patch resend] firewire: ieee1394: Move away from SG_ALL Stefan Richter
  1 sibling, 1 reply; 14+ messages in thread
From: Stefan Richter @ 2008-08-05 15:01 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: James Bottomley, linux1394-devel, linux-scsi

Boaz Harrosh wrote:
> Stefan Richter wrote:
>> James Bottomley wrote:
>>> On Mon, 2008-08-04 at 20:08 +0200, Stefan Richter wrote:
>>>> As a discussion reminded me today, I believe I should merge the
>>>> following patch (could have done so much earlier in fact).  Or is there
>>>> anything speaking against it?
>>> The value 255 is chosen because it worked before.  What you need to do
>>> is establish what the upper limit for firewire is (or indeed if it has
>>> one).
>> 
>> The limit is 65535, following from SBP-2 clause 5.1.2, definition of 
>> data_size.
>> 
>> [Side note:  The SBP-2 s/g list (a.k.a. page table) consists of 64bit 
>> wide entries and needs to be contiguous in memory from the POV of the 
>> FireWire PCI or PCIe controller, and the SBP-2 target reads the table 
>> from the initiator's memory.  The (fw-)sbp2 driver builds this table as 
>> a copy of the kernel's s/g list; but this was certainly already clear 
>> to the reader from the context in the diff.]
> 
> Does the above mean you need contiguous physical memory to hold
> sbp2's s/g table?  If so, it should be allocated with alloc_coherent,
> with a total size of maybe up to a PAGE_SIZE.

It doesn't have to be cache-coherent, but it does need to be memory
that can be DMA-mapped in one piece.

(DMA direction is DMA_TO_DEVICE.  Lifetime of the mapping is from before
the ORB is committed until after we received completion status, or the
request timed out and its ORB was aborted.  fw-sbp2 implements exactly
this lifetime, while sbp2 keeps the DMA mapping around from before login
to the target until after logout.)


>>>> Meanwhile, SG_ALL has been redefined from 255 to SCSI_MAX_SG_SEGMENTS =
>>>> 128 without me noticing it.  But since several popular architectures
>>>> define ARCH_HAS_SG_CHAIN, we may want to go back to 255 in (fw-)sbp2.
>>>> (Besides, we should also do some tests to figure out an actual optimum.
>>>> And fw-sbp2.c's struct sbp2_command_orb should become variably sized.)
>>> Don't bother with optimum ... that's block's job based on what it sees
>>> from the completions.  All we need to know is maximum.
> 
> From my tests, the block layer will very quickly send you chained commands.
> If you set it to 255 you'll see two chained fragments.  In iSCSI we have
> 4096 and it is very good for performance.  However, since you are pre-allocating
> the memory per can_queue commands, this can get big, and the memory overhead
> can impact performance.

OK.  This gives me a better picture.


>> OK, with the small caveat that the current (fw-)sbp2 driver code is 
>> somewhat simplistic WRT page table handling and geared towards rather 
>> short page tables.  But this may be possible to enhance without too much 
>> difficulty.
>> 
>>>> Does the most relevant user of large transfers with SBP-2 devices,
>>>> sd-mod, use chained s/g lists?
>>> pass, but firewire is a reasonably slow bus by modern standards, and you
>> 
>> A 3200 Mb/s spec exists :-) (though no silicon yet, to my knowledge).
> 
> As you said above, and since bitrates are increasing, 255 is a good
> preallocated value, but perhaps it could also be a module parameter so
> that fast cards can have a bigger pre-allocated s/g table (per can_queue
> command).  Embedded systems, on the other hand, might want to lower this
> value.

A preset according to connection speed could also be done by the driver
without user intervention.


>>> have the protocol overhead for each ORB, so I'd guess there's a point at
>>> which increasing the transaction size doesn't buy anything.
>>>
>>> James
>> 
> 
> From re-inspecting the code I can see that struct sbp2_command_orb
> might have problems on non-DMA-coherent architectures.

Hmm.  This would be a bug.  But from what I see while scrolling through
fw-sbp2 and through sbp2, they both do it right.  Well, to be very
strict, sbp2 should dma_sync_single_for_cpu() at a slightly different
place than it does now.  I shall patch this.


> I think for safety
> and flexibility the sbp2_command_orb.page_table array should be
> dynamically allocated at init time

The older driver sbp2 uses a fixed-size pool of ORBs and s/g tables for
the whole time that it is logged in into a target.  The author of the
newer fw-sbp2 driver chose to keep ORB and s/g table only around for the
lifetime of a request, and I think it's a valid approach.

> maybe even with alloc_coherent.

alloc_coherent isn't necessary.  Even code simplicity wouldn't profit
too much IMO; we already have to DMA-map the data buffer for each
request anyway.
-- 
Stefan Richter
-=====-==--- =--- --=-=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch resend] firewire: ieee1394: Move away from SG_ALL
  2008-08-05 15:01       ` Stefan Richter
@ 2008-08-05 15:28         ` Boaz Harrosh
  2008-08-08 11:59           ` [PATCH] ieee1394: sbp2: stricter dma_sync Stefan Richter
  0 siblings, 1 reply; 14+ messages in thread
From: Boaz Harrosh @ 2008-08-05 15:28 UTC (permalink / raw)
  To: Stefan Richter; +Cc: James Bottomley, linux1394-devel, linux-scsi

Stefan Richter wrote:
> Boaz Harrosh wrote:
>> Stefan Richter wrote:
>>> James Bottomley wrote:
>>>> On Mon, 2008-08-04 at 20:08 +0200, Stefan Richter wrote:
>>>>> As a discussion reminded me today, I believe I should merge the
>>>>> following patch (could have done so much earlier in fact).  Or is there
>>>>> anything speaking against it?
>>>> The value 255 is chosen because it worked before.  What you need to do
>>>> is establish what the upper limit for firewire is (or indeed if it has
>>>> one).
>>> The limit is 65535, following from SBP-2 clause 5.1.2, definition of 
>>> data_size.
>>>
>>> [Side note:  The SBP-2 s/g list (a.k.a. page table) consists of 64bit 
>>> wide entries and needs to be contiguous in memory from the POV of the 
>>> FireWire PCI or PCIe controller, and the SBP-2 target reads the table 
>>> from the initiator's memory.  The (fw-)sbp2 driver builds this table as 
>>> a copy of the kernel's s/g list; but this was certainly already clear 
>>> to the reader from the context in the diff.]
>> Does the above mean you need contiguous physical memory to hold
>> sbp2's s/g table?  If so, it should be allocated with alloc_coherent,
>> with a total size of maybe up to a PAGE_SIZE.
> 
> It doesn't have to be cache-coherent, but it does need to be memory
> that can be DMA-mapped in one piece.
> 
> (DMA direction is DMA_TO_DEVICE.  Lifetime of the mapping is from before
> the ORB is committed until after we received completion status, or the
> request timed out and its ORB was aborted.  fw-sbp2 implements exactly
> this lifetime, while sbp2 keeps the DMA mapping around from before login
> to the target until after logout.)
> 

OK, you are right: DMA_TO_DEVICE is less problematic with regard to
CPU-modified variables that share a cacheline with DMAed memory, as
long as the code does a cache sync before the start of the transfer.
But I see you'll take care of that below.

> 
>>>>> Meanwhile, SG_ALL has been redefined from 255 to SCSI_MAX_SG_SEGMENTS =
>>>>> 128 without me noticing it.  But since several popular architectures
>>>>> define ARCH_HAS_SG_CHAIN, we may want to go back to 255 in (fw-)sbp2.
>>>>> (Besides, we should also do some tests to figure out an actual optimum.
>>>>> And fw-sbp2.c's struct sbp2_command_orb should become variably sized.)
>>>> Don't bother with optimum ... that's block's job based on what it sees
>>>> from the completions.  All we need to know is maximum.
>> From my tests, the block layer will very quickly send you chained commands.
>> If you set it to 255 you'll see two chained fragments.  In iSCSI we have
>> 4096 and it is very good for performance.  However, since you are pre-allocating
>> the memory per can_queue commands, this can get big, and the memory overhead
>> can impact performance.
> 
> OK.  This gives me a better picture.
> 
> 
>>> OK, with the small caveat that the current (fw-)sbp2 driver code is 
>>> somewhat simplistic WRT page table handling and geared towards rather 
>>> short page tables.  But this may be possible to enhance without too much 
>>> difficulty.
>>>
>>>>> Does the most relevant user of large transfers with SBP-2 devices,
>>>>> sd-mod, use chained s/g lists?
>>>> pass, but firewire is a reasonably slow bus by modern standards, and you
>>> A 3200 Mb/s spec exists :-) (though no silicon yet, to my knowledge).
>> As you said above, and since bitrates are increasing, 255 is a good
>> preallocated value, but perhaps it could also be a module parameter so
>> that fast cards can have a bigger pre-allocated s/g table (per can_queue
>> command).  Embedded systems, on the other hand, might want to lower this
>> value.
> 
> A preset according to connection speed could also be done by the driver
> without user intervention.
> 

Even better, thanks; that would be ideal.

(Except for embedded systems, which would want to keep it low, perhaps
set at configuration time.)

> 
>>>> have the protocol overhead for each ORB, so I'd guess there's a point at
>>>> which increasing the transaction size doesn't buy anything.
>>>>
>>>> James
>> From re-inspecting the code I can see that struct sbp2_command_orb
>> might have problems on non-DMA-coherent architectures.
> 
> Hmm.  This would be a bug.  But from what I see while scrolling through
> fw-sbp2 and through sbp2, they both do it right.  Well, to be very
> strict, sbp2 should dma_sync_single_for_cpu() at a slightly different
> place than it does now.  I shall patch this.
> 

Yes, I agree with you.  Keep me posted; I'll review the code.

> 
>> I think for safety
>> and flexibility the sbp2_command_orb.page_table array should be
>> dynamically allocated at init time
> 
> The older driver sbp2 uses a fixed-size pool of ORBs and s/g tables for
> the whole time that it is logged in into a target.  The author of the
> newer fw-sbp2 driver chose to keep ORB and s/g table only around for the
> lifetime of a request, and I think it's a valid approach.
> 
>> maybe even with alloc_coherent.
> 
> alloc_coherent isn't necessary.  Even code simplicity wouldn't profit
> too much IMO; we already have to DMA-map the data buffer for each
> request anyway.

OK, I'm convinced.  I think the proposed patch (funny that I wrote it)
is good as a first measure, to take matters into the driver's hands and
avoid any unintentional side effects from SCSI constants.

Thanks
Boaz

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch resend] firewire: ieee1394: Move away from SG_ALL
  2008-08-05 12:22     ` Boaz Harrosh
  2008-08-05 15:01       ` Stefan Richter
@ 2008-08-05 19:57       ` Stefan Richter
  2008-08-06  9:10         ` Boaz Harrosh
  1 sibling, 1 reply; 14+ messages in thread
From: Stefan Richter @ 2008-08-05 19:57 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: James Bottomley, linux1394-devel, linux-scsi

Boaz Harrosh wrote:
> From my tests, the block layer will very quickly send you chained commands.
> If you set it to 255 you'll see two chained fragments.  In iSCSI we have
> 4096 and it is very good for performance.  However, since you are pre-allocating
> the memory per can_queue commands, this can get big, and the memory overhead
> can impact performance.

I did repeated read tests with dd now, with a 3.5" 750G SATA/ FireWire 
800 disk and with a 2.5" 250G IDE/ FireWire 400 disk, each with either 
FireWire stack.  I used dd with and without iflag=direct, just because I 
don't have a clue how to capture the influence of .sg_tablesize best.

Tests with .sg_tablesize = 128, 255, and 5000 all yield the same 
transfer rates.  So I guess I simply leave it at the current SG_ALL for now.
-- 
Stefan Richter
-=====-==--- =--- --=-=
http://arcgraph.de/sr/


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch resend] firewire: ieee1394: Move away from SG_ALL
  2008-08-05 19:57       ` [patch resend] firewire: ieee1394: Move away from SG_ALL Stefan Richter
@ 2008-08-06  9:10         ` Boaz Harrosh
  2008-08-06 20:49           ` Stefan Richter
  0 siblings, 1 reply; 14+ messages in thread
From: Boaz Harrosh @ 2008-08-06  9:10 UTC (permalink / raw)
  To: Stefan Richter; +Cc: James Bottomley, linux1394-devel, linux-scsi, Jens Axboe

Stefan Richter wrote:
> Boaz Harrosh wrote:
>> From my tests, the block layer will very quickly send you chained commands.
>> If you set it to 255 you'll see two chained fragments.  In iSCSI we have
>> 4096 and it is very good for performance.  However, since you are pre-allocating
>> the memory per can_queue commands, this can get big, and the memory overhead
>> can impact performance.
> 
> I did repeated read tests with dd now, with a 3.5" 750G SATA/ FireWire 
> 800 disk and with a 2.5" 250G IDE/ FireWire 400 disk, each with either 
> FireWire stack.  I used dd with and without iflag=direct, just because I 
> don't have a clue how to capture the influence of .sg_tablesize best.
> 
> Tests with .sg_tablesize = 128, 255, and 5000 all yield the same 
> transfer rates.  So I guess I simply leave it at the current SG_ALL for now.

You will need to enable sg chaining for your device.  I'm not sure
where the latest documentation is. (CC: Jens)

Here is the script I used in my tests:

	#!/bin/bash
	sdx=sdb

	# mess with sglist chaining
	cd /sys/block/$sdx/queue
	echo 4096 > max_segments
	cat max_hw_sectors_kb  > max_sectors_kb
	echo "max_hw_sectors_kb="$(cat max_hw_sectors_kb) 
	echo "max_sectors_kb="$(cat max_sectors_kb) 
	echo "max_segments="$(cat max_segments)

Try to view the content of max_segments before and after
to see if it changed.


Also, with dd you need a large block size; here is how I calculated it.
(Other people might have better tests.)  I did a write test, but a read
test should be better:

	if=/dev/zero
	of=/dev/sdb

	outputfile=$1.txt
	echo "Testing $1"

	# send 32G in transfers of $1 s/g segments each
	do_dd()
	{
		# blocks of one sector
		bs=512
		#memory page in blocks
		page=8
		#number of scatterlist elements in a transfer
		sgs=$1
		#calculate the bpt param
		bpt=$(($sgs*$page))
		#total blocks to transfer 32 Giga bytes
		count=64M


		echo $3: "bpt=$bpt"

	\time bash -c \
		"sg_dd blk_sgio=1 dio=1 if=$if of=$of bpt=$bpt bs=$bs count=$count 2>/dev/null" \
		2>> $2
	}

	echo "BEGIN RUN $1" >> $outputfile

	echo "run with 128-sgilist"
	echo "run with 128-sgilist" >> $outputfile
	for i in {1..20}; do
		do_dd 128 $outputfile $i;
	done

	echo "run with 2048-sgilist"
	echo "run with 2048-sgilist" >> $outputfile
	for i in {1..20}; do
		do_dd 2048 $outputfile $i;
	done

I think that in the second case, with 2048, dd will fail if your device
does not support it (if I recall correctly).

Boaz

PS: OK, I used sg_dd from sg3_utils; sg_read might be even better.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch resend] firewire: ieee1394: Move away from SG_ALL
  2008-08-06  9:10         ` Boaz Harrosh
@ 2008-08-06 20:49           ` Stefan Richter
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Richter @ 2008-08-06 20:49 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: James Bottomley, linux1394-devel, linux-scsi, Jens Axboe

Boaz Harrosh wrote:
> You will need to enable sg chaining for your device.

Probably, because...

> I'm not sure where the latest documentation is. (CC: Jens)
> 
> Here is the script I used in my tests:
> 
> 	#!/bin/bash
> 	sdx=sdb
> 
> 	# mess with sglist chaining
> 	cd /sys/block/$sdx/queue
> 	echo 4096 > max_segments
> 	cat max_hw_sectors_kb  > max_sectors_kb
> 	echo "max_hw_sectors_kb="$(cat max_hw_sectors_kb) 
> 	echo "max_sectors_kb="$(cat max_sectors_kb) 
> 	echo "max_segments="$(cat max_segments)
> 
> Try to view the content of max_segments before and after
> to see if it changed.

...right now there is no max_segments attribute.

Anyway, this already fully answers what my ideal table-size limit is. 
Thanks so far,
-- 
Stefan Richter
-=====-==--- =--- --==-
http://arcgraph.de/sr/


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] ieee1394: sbp2: stricter dma_sync
  2008-08-05 15:28         ` Boaz Harrosh
@ 2008-08-08 11:59           ` Stefan Richter
  2008-08-08 12:00             ` [PATCH] ieee1394: sbp2: avoid DMA bounce buffers for ORBs Stefan Richter
  2008-08-09 18:13             ` [PATCH update] ieee1394: sbp2: stricter dma_sync Stefan Richter
  0 siblings, 2 replies; 14+ messages in thread
From: Stefan Richter @ 2008-08-08 11:59 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: James Bottomley, linux1394-devel, linux-scsi

On  5 Aug, Boaz Harrosh wrote:
> Stefan Richter wrote:
>> Boaz Harrosh wrote:
>>> From re-inspecting the code I can see that struct sbp2_command_orb
>>> might have problems on non-DMA-coherent architectures.
>> 
>> Hmm.  This would be a bug.  But from what I see while scrolling through
>> fw-sbp2 and through sbp2, they both do it right.  Well, to be very
>> strict, sbp2 should dma_sync_single_for_cpu() at a slightly different
>> place than it does now.  I shall patch this.
>> 
> 
> Yes, I agree with you.  Keep me posted; I'll review the code.


Two dma_sync_single_for_cpu() were called in the wrong place.
Luckily they were merely for DMA_TO_DEVICE, hence nobody noticed.

Also reorder the matching dma_sync_single_for_device() a little bit
so that they reside in the same functions as their counterparts.
This also avoids syncing the s/g table for requests which don't use it.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/ieee1394/sbp2.c |   33 ++++++++++++++++-----------------
 1 file changed, 16 insertions(+), 17 deletions(-)

On a related note, sbp2 lacks error checks for the DMA mapping when the
ORB pool is created and when the request buffer is mapped.  I leave that
as a ToDo for sometime later.
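
Such a check would follow the usual pattern, roughly (a sketch with
hypothetical dev, buf and BUF_SIZE names; the real fix came later in
this thread):

```c
dma_addr_t addr = dma_map_single(dev, buf, BUF_SIZE, DMA_TO_DEVICE);

if (dma_mapping_error(dev, addr)) {
	/* e.g. IOMMU space exhausted; unwind and report the failure */
	kfree(buf);
	return -ENOMEM;
}
```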

Index: linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
===================================================================
--- linux-2.6.27-rc2.orig/drivers/ieee1394/sbp2.c
+++ linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
@@ -1522,6 +1522,10 @@ static void sbp2_prep_command_orb_sg(str
 		orb->misc |= ORB_SET_PAGE_TABLE_PRESENT(0x1);
 		orb->data_descriptor_lo = cmd->sge_dma;
 
+		dma_sync_single_for_cpu(hi->host->device.parent, cmd->sge_dma,
+					sizeof(cmd->scatter_gather_element),
+					DMA_TO_DEVICE);
+
 		/* loop through and fill out our SBP-2 page tables
 		 * (and split up anything too large) */
 		for (i = 0, sg_count = 0; i < count; i++, sg = sg_next(sg)) {
@@ -1548,6 +1552,11 @@ static void sbp2_prep_command_orb_sg(str
 		sbp2util_cpu_to_be32_buffer(sg_element,
 				(sizeof(struct sbp2_unrestricted_page_table)) *
 				sg_count);
+
+		dma_sync_single_for_device(hi->host->device.parent,
+					   cmd->sge_dma,
+					   sizeof(cmd->scatter_gather_element),
+					   DMA_TO_DEVICE);
 	}
 }
 
@@ -1555,12 +1564,14 @@ static void sbp2_create_command_orb(stru
 				    struct sbp2_command_info *cmd,
 				    struct scsi_cmnd *SCpnt)
 {
-	struct sbp2_fwhost_info *hi = lu->hi;
+	struct device *dev = lu->hi->host->device.parent;
 	struct sbp2_command_orb *orb = &cmd->command_orb;
 	u32 orb_direction;
 	unsigned int scsi_request_bufflen = scsi_bufflen(SCpnt);
 	enum dma_data_direction dma_dir = SCpnt->sc_data_direction;
 
+	dma_sync_single_for_cpu(dev, cmd->command_orb_dma,
+				sizeof(struct sbp2_command_orb), DMA_TO_DEVICE);
 	/*
 	 * Set-up our command ORB.
 	 *
@@ -1592,7 +1603,7 @@ static void sbp2_create_command_orb(stru
 		orb->data_descriptor_lo = 0x0;
 		orb->misc |= ORB_SET_DIRECTION(1);
 	} else
-		sbp2_prep_command_orb_sg(orb, hi, cmd, scsi_sg_count(SCpnt),
+		sbp2_prep_command_orb_sg(orb, lu->hi, cmd, scsi_sg_count(SCpnt),
 					 scsi_sglist(SCpnt),
 					 orb_direction, dma_dir);
 
@@ -1600,6 +1611,9 @@ static void sbp2_create_command_orb(stru
 
 	memset(orb->cdb, 0, sizeof(orb->cdb));
 	memcpy(orb->cdb, SCpnt->cmnd, SCpnt->cmd_len);
+
+	dma_sync_single_for_device(dev, cmd->command_orb_dma,
+			sizeof(struct sbp2_command_orb), DMA_TO_DEVICE);
 }
 
 static void sbp2_link_orb_command(struct sbp2_lu *lu,
@@ -1613,14 +1627,6 @@ static void sbp2_link_orb_command(struct
 	size_t length;
 	unsigned long flags;
 
-	dma_sync_single_for_device(hi->host->device.parent,
-				   cmd->command_orb_dma,
-				   sizeof(struct sbp2_command_orb),
-				   DMA_TO_DEVICE);
-	dma_sync_single_for_device(hi->host->device.parent, cmd->sge_dma,
-				   sizeof(cmd->scatter_gather_element),
-				   DMA_TO_DEVICE);
-
 	/* check to see if there are any previous orbs to use */
 	spin_lock_irqsave(&lu->cmd_orb_lock, flags);
 	last_orb = lu->last_orb;
@@ -1778,13 +1784,6 @@ static int sbp2_handle_status_write(stru
 	else
 		cmd = sbp2util_find_command_for_orb(lu, sb->ORB_offset_lo);
 	if (cmd) {
-		dma_sync_single_for_cpu(hi->host->device.parent,
-					cmd->command_orb_dma,
-					sizeof(struct sbp2_command_orb),
-					DMA_TO_DEVICE);
-		dma_sync_single_for_cpu(hi->host->device.parent, cmd->sge_dma,
-					sizeof(cmd->scatter_gather_element),
-					DMA_TO_DEVICE);
 		/* Grab SCSI command pointers and check status. */
 		/*
 		 * FIXME: If the src field in the status is 1, the ORB DMA must

-- 
Stefan Richter
-=====-==--- =--- -=---
http://arcgraph.de/sr/




* [PATCH] ieee1394: sbp2: avoid DMA bounce buffers for ORBs
  2008-08-08 11:59           ` [PATCH] ieee1394: sbp2: stricter dma_sync Stefan Richter
@ 2008-08-08 12:00             ` Stefan Richter
  2008-08-08 17:54               ` Stefan Richter
  2008-08-09 18:13             ` [PATCH update] ieee1394: sbp2: stricter dma_sync Stefan Richter
  1 sibling, 1 reply; 14+ messages in thread
From: Stefan Richter @ 2008-08-08 12:00 UTC (permalink / raw)
  To: linux1394-devel; +Cc: James Bottomley, linux-scsi, Boaz Harrosh

The command ORB pool is used frequently during the whole lifetime of an
SBP-2 logical unit.  It doesn't make sense to let it live outside the
32-bit DMA zone.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/ieee1394/sbp2.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
===================================================================
--- linux-2.6.27-rc2.orig/drivers/ieee1394/sbp2.c
+++ linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
@@ -531,7 +531,7 @@ static int sbp2util_create_command_orb_p
 	int i, orbs = sbp2_serialize_io ? 2 : SBP2_MAX_CMDS;
 
 	for (i = 0; i < orbs; i++) {
-		cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
+		cmd = kzalloc(sizeof(*cmd), GFP_KERNEL | GFP_DMA32);
 		if (!cmd)
 			return -ENOMEM;
 		cmd->command_orb_dma = dma_map_single(hi->host->device.parent,

-- 
Stefan Richter
-=====-==--- =--- -=---
http://arcgraph.de/sr/




* Re: [PATCH] ieee1394: sbp2: avoid DMA bounce buffers for ORBs
  2008-08-08 12:00             ` [PATCH] ieee1394: sbp2: avoid DMA bounce buffers for ORBs Stefan Richter
@ 2008-08-08 17:54               ` Stefan Richter
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Richter @ 2008-08-08 17:54 UTC (permalink / raw)
  To: linux1394-devel; +Cc: Boaz Harrosh, James Bottomley, linux-scsi

On  8 Aug, Stefan Richter wrote:
> --- linux-2.6.27-rc2.orig/drivers/ieee1394/sbp2.c
> +++ linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
> @@ -531,7 +531,7 @@ static int sbp2util_create_command_orb_p
>  	int i, orbs = sbp2_serialize_io ? 2 : SBP2_MAX_CMDS;
>  
>  	for (i = 0; i < orbs; i++) {
> -		cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
> +		cmd = kzalloc(sizeof(*cmd), GFP_KERNEL | GFP_DMA32);
>  		if (!cmd)
>  			return -ENOMEM;
>  		cmd->command_orb_dma = dma_map_single(hi->host->device.parent,
> 

This patch is bogus.

I cannot use GFP_DMA32 in slab allocations.
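
GFP_DMA32 is honoured by the page allocator but not by slab (there are
no DMA32 kmalloc caches), so a below-4GB allocation would have to
bypass kzalloc(), roughly like this (a sketch only, not what was
merged):

```c
/* allocate whole pages below 4 GB; slab ignores GFP_DMA32 */
cmd = (void *)__get_free_pages(GFP_KERNEL | GFP_DMA32,
			       get_order(sizeof(*cmd)));
if (!cmd)
	return -ENOMEM;
memset(cmd, 0, sizeof(*cmd));

/* ... and later, instead of kfree(cmd): */
free_pages((unsigned long)cmd, get_order(sizeof(*cmd)));
```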
-- 
Stefan Richter
-=====-==--- =--- -=---
http://arcgraph.de/sr/



* [PATCH update] ieee1394: sbp2: stricter dma_sync
  2008-08-08 11:59           ` [PATCH] ieee1394: sbp2: stricter dma_sync Stefan Richter
  2008-08-08 12:00             ` [PATCH] ieee1394: sbp2: avoid DMA bounce buffers for ORBs Stefan Richter
@ 2008-08-09 18:13             ` Stefan Richter
  2008-08-09 18:16               ` [PATCH] ieee1394: sbp2: check for DMA mapping failures Stefan Richter
  1 sibling, 1 reply; 14+ messages in thread
From: Stefan Richter @ 2008-08-09 18:13 UTC (permalink / raw)
  To: linux1394-devel; +Cc: linux-scsi, Boaz Harrosh

Two dma_sync_single_for_cpu() calls were made in the wrong place.
Luckily they were merely for DMA_TO_DEVICE, hence nobody noticed.

Also reorder the matching dma_sync_single_for_device() calls slightly
so that they reside in the same functions as their counterparts.
This also avoids syncing the s/g table for requests which don't use it.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---

Update:
  - I forgot to remove four calls to dma_sync_single_for_cpu().
  - Name the pointer to the controller dmadev instead of dev
    to avoid confusion with the logical unit's device pointer.

 drivers/ieee1394/sbp2.c |   50 ++++++++++++----------------------------
 1 file changed, 16 insertions(+), 34 deletions(-)

Index: linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
===================================================================
--- linux-2.6.27-rc2.orig/drivers/ieee1394/sbp2.c
+++ linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
@@ -1522,6 +1522,10 @@ static void sbp2_prep_command_orb_sg(str
 		orb->misc |= ORB_SET_PAGE_TABLE_PRESENT(0x1);
 		orb->data_descriptor_lo = cmd->sge_dma;
 
+		dma_sync_single_for_cpu(hi->host->device.parent, cmd->sge_dma,
+					sizeof(cmd->scatter_gather_element),
+					DMA_TO_DEVICE);
+
 		/* loop through and fill out our SBP-2 page tables
 		 * (and split up anything too large) */
 		for (i = 0, sg_count = 0; i < count; i++, sg = sg_next(sg)) {
@@ -1548,6 +1552,11 @@ static void sbp2_prep_command_orb_sg(str
 		sbp2util_cpu_to_be32_buffer(sg_element,
 				(sizeof(struct sbp2_unrestricted_page_table)) *
 				sg_count);
+
+		dma_sync_single_for_device(hi->host->device.parent,
+					   cmd->sge_dma,
+					   sizeof(cmd->scatter_gather_element),
+					   DMA_TO_DEVICE);
 	}
 }
 
@@ -1555,12 +1564,14 @@ static void sbp2_create_command_orb(stru
 				    struct sbp2_command_info *cmd,
 				    struct scsi_cmnd *SCpnt)
 {
-	struct sbp2_fwhost_info *hi = lu->hi;
+	struct device *dmadev = lu->hi->host->device.parent;
 	struct sbp2_command_orb *orb = &cmd->command_orb;
 	u32 orb_direction;
 	unsigned int scsi_request_bufflen = scsi_bufflen(SCpnt);
 	enum dma_data_direction dma_dir = SCpnt->sc_data_direction;
 
+	dma_sync_single_for_cpu(dmadev, cmd->command_orb_dma,
+				sizeof(struct sbp2_command_orb), DMA_TO_DEVICE);
 	/*
 	 * Set-up our command ORB.
 	 *
@@ -1592,7 +1603,7 @@ static void sbp2_create_command_orb(stru
 		orb->data_descriptor_lo = 0x0;
 		orb->misc |= ORB_SET_DIRECTION(1);
 	} else
-		sbp2_prep_command_orb_sg(orb, hi, cmd, scsi_sg_count(SCpnt),
+		sbp2_prep_command_orb_sg(orb, lu->hi, cmd, scsi_sg_count(SCpnt),
 					 scsi_sglist(SCpnt),
 					 orb_direction, dma_dir);
 
@@ -1600,6 +1611,9 @@ static void sbp2_create_command_orb(stru
 
 	memset(orb->cdb, 0, sizeof(orb->cdb));
 	memcpy(orb->cdb, SCpnt->cmnd, SCpnt->cmd_len);
+
+	dma_sync_single_for_device(dmadev, cmd->command_orb_dma,
+			sizeof(struct sbp2_command_orb), DMA_TO_DEVICE);
 }
 
 static void sbp2_link_orb_command(struct sbp2_lu *lu,
@@ -1613,14 +1627,6 @@ static void sbp2_link_orb_command(struct
 	size_t length;
 	unsigned long flags;
 
-	dma_sync_single_for_device(hi->host->device.parent,
-				   cmd->command_orb_dma,
-				   sizeof(struct sbp2_command_orb),
-				   DMA_TO_DEVICE);
-	dma_sync_single_for_device(hi->host->device.parent, cmd->sge_dma,
-				   sizeof(cmd->scatter_gather_element),
-				   DMA_TO_DEVICE);
-
 	/* check to see if there are any previous orbs to use */
 	spin_lock_irqsave(&lu->cmd_orb_lock, flags);
 	last_orb = lu->last_orb;
@@ -1778,13 +1784,6 @@ static int sbp2_handle_status_write(stru
 	else
 		cmd = sbp2util_find_command_for_orb(lu, sb->ORB_offset_lo);
 	if (cmd) {
-		dma_sync_single_for_cpu(hi->host->device.parent,
-					cmd->command_orb_dma,
-					sizeof(struct sbp2_command_orb),
-					DMA_TO_DEVICE);
-		dma_sync_single_for_cpu(hi->host->device.parent, cmd->sge_dma,
-					sizeof(cmd->scatter_gather_element),
-					DMA_TO_DEVICE);
 		/* Grab SCSI command pointers and check status. */
 		/*
 		 * FIXME: If the src field in the status is 1, the ORB DMA must
@@ -1901,7 +1900,6 @@ done:
 
 static void sbp2scsi_complete_all_commands(struct sbp2_lu *lu, u32 status)
 {
-	struct sbp2_fwhost_info *hi = lu->hi;
 	struct list_head *lh;
 	struct sbp2_command_info *cmd;
 	unsigned long flags;
@@ -1910,13 +1908,6 @@ static void sbp2scsi_complete_all_comman
 	while (!list_empty(&lu->cmd_orb_inuse)) {
 		lh = lu->cmd_orb_inuse.next;
 		cmd = list_entry(lh, struct sbp2_command_info, list);
-		dma_sync_single_for_cpu(hi->host->device.parent,
-				        cmd->command_orb_dma,
-					sizeof(struct sbp2_command_orb),
-					DMA_TO_DEVICE);
-		dma_sync_single_for_cpu(hi->host->device.parent, cmd->sge_dma,
-					sizeof(cmd->scatter_gather_element),
-					DMA_TO_DEVICE);
 		sbp2util_mark_command_completed(lu, cmd);
 		if (cmd->Current_SCpnt) {
 			cmd->Current_SCpnt->result = status << 16;
@@ -2038,7 +2029,6 @@ static void sbp2scsi_slave_destroy(struc
 static int sbp2scsi_abort(struct scsi_cmnd *SCpnt)
 {
 	struct sbp2_lu *lu = (struct sbp2_lu *)SCpnt->device->host->hostdata[0];
-	struct sbp2_fwhost_info *hi = lu->hi;
 	struct sbp2_command_info *cmd;
 	unsigned long flags;
 
@@ -2052,14 +2042,6 @@ static int sbp2scsi_abort(struct scsi_cm
 		spin_lock_irqsave(&lu->cmd_orb_lock, flags);
 		cmd = sbp2util_find_command_for_SCpnt(lu, SCpnt);
 		if (cmd) {
-			dma_sync_single_for_cpu(hi->host->device.parent,
-					cmd->command_orb_dma,
-					sizeof(struct sbp2_command_orb),
-					DMA_TO_DEVICE);
-			dma_sync_single_for_cpu(hi->host->device.parent,
-					cmd->sge_dma,
-					sizeof(cmd->scatter_gather_element),
-					DMA_TO_DEVICE);
 			sbp2util_mark_command_completed(lu, cmd);
 			if (cmd->Current_SCpnt) {
 				cmd->Current_SCpnt->result = DID_ABORT << 16;

-- 
Stefan Richter
-=====-==--- =--- -=--=
http://arcgraph.de/sr/




* [PATCH] ieee1394: sbp2: check for DMA mapping failures
  2008-08-09 18:13             ` [PATCH update] ieee1394: sbp2: stricter dma_sync Stefan Richter
@ 2008-08-09 18:16               ` Stefan Richter
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Richter @ 2008-08-09 18:16 UTC (permalink / raw)
  To: linux1394-devel; +Cc: linux-scsi, Boaz Harrosh

On  8 Aug, Stefan Richter wrote:
> On a related note, sbp2 lacks error checks for the DMA mapping when the
> ORB pool is created and when the request buffer is mapped.  I leave that
> as a ToDo for sometime later.


From: Stefan Richter <stefanr@s5r6.in-berlin.de>
Subject: ieee1394: sbp2: check for DMA mapping failures

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/ieee1394/sbp2.c |   94 +++++++++++++++++++++++++---------------
 1 file changed, 59 insertions(+), 35 deletions(-)

Index: linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
===================================================================
--- linux-2.6.27-rc2.orig/drivers/ieee1394/sbp2.c
+++ linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
@@ -526,26 +526,41 @@ static void sbp2util_write_doorbell(stru
 
 static int sbp2util_create_command_orb_pool(struct sbp2_lu *lu)
 {
-	struct sbp2_fwhost_info *hi = lu->hi;
 	struct sbp2_command_info *cmd;
+	struct device *dmadev = lu->hi->host->device.parent;
 	int i, orbs = sbp2_serialize_io ? 2 : SBP2_MAX_CMDS;
 
 	for (i = 0; i < orbs; i++) {
 		cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
 		if (!cmd)
-			return -ENOMEM;
-		cmd->command_orb_dma = dma_map_single(hi->host->device.parent,
-						&cmd->command_orb,
-						sizeof(struct sbp2_command_orb),
-						DMA_TO_DEVICE);
-		cmd->sge_dma = dma_map_single(hi->host->device.parent,
-					&cmd->scatter_gather_element,
-					sizeof(cmd->scatter_gather_element),
-					DMA_TO_DEVICE);
+			goto failed_alloc;
+
+		cmd->command_orb_dma =
+		    dma_map_single(dmadev, &cmd->command_orb,
+				   sizeof(struct sbp2_command_orb),
+				   DMA_TO_DEVICE);
+		if (dma_mapping_error(dmadev, cmd->command_orb_dma))
+			goto failed_orb;
+
+		cmd->sge_dma =
+		    dma_map_single(dmadev, &cmd->scatter_gather_element,
+				   sizeof(cmd->scatter_gather_element),
+				   DMA_TO_DEVICE);
+		if (dma_mapping_error(dmadev, cmd->sge_dma))
+			goto failed_sge;
+
 		INIT_LIST_HEAD(&cmd->list);
 		list_add_tail(&cmd->list, &lu->cmd_orb_completed);
 	}
 	return 0;
+
+failed_sge:
+	dma_unmap_single(dmadev, cmd->command_orb_dma,
+			 sizeof(struct sbp2_command_orb), DMA_TO_DEVICE);
+failed_orb:
+	kfree(cmd);
+failed_alloc:
+	return -ENOMEM;
 }
 
 static void sbp2util_remove_command_orb_pool(struct sbp2_lu *lu,
@@ -1483,14 +1498,16 @@ static int sbp2_agent_reset(struct sbp2_
 	return 0;
 }
 
-static void sbp2_prep_command_orb_sg(struct sbp2_command_orb *orb,
-				     struct sbp2_fwhost_info *hi,
-				     struct sbp2_command_info *cmd,
-				     unsigned int scsi_use_sg,
-				     struct scatterlist *sg,
-				     u32 orb_direction,
-				     enum dma_data_direction dma_dir)
+static int sbp2_prep_command_orb_sg(struct sbp2_command_orb *orb,
+				    struct sbp2_fwhost_info *hi,
+				    struct sbp2_command_info *cmd,
+				    unsigned int scsi_use_sg,
+				    struct scatterlist *sg,
+				    u32 orb_direction,
+				    enum dma_data_direction dma_dir)
 {
+	struct device *dmadev = hi->host->device.parent;
+
 	cmd->dma_dir = dma_dir;
 	orb->data_descriptor_hi = ORB_SET_NODE_ID(hi->host->node_id);
 	orb->misc |= ORB_SET_DIRECTION(orb_direction);
@@ -1500,9 +1517,12 @@ static void sbp2_prep_command_orb_sg(str
 
 		cmd->dma_size = sg->length;
 		cmd->dma_type = CMD_DMA_PAGE;
-		cmd->cmd_dma = dma_map_page(hi->host->device.parent,
-					    sg_page(sg), sg->offset,
+		cmd->cmd_dma = dma_map_page(dmadev, sg_page(sg), sg->offset,
 					    cmd->dma_size, cmd->dma_dir);
+		if (dma_mapping_error(dmadev, cmd->cmd_dma)) {
+			cmd->cmd_dma = 0;
+			return -ENOMEM;
+		}
 
 		orb->data_descriptor_lo = cmd->cmd_dma;
 		orb->misc |= ORB_SET_DATA_SIZE(cmd->dma_size);
@@ -1512,8 +1532,7 @@ static void sbp2_prep_command_orb_sg(str
 						&cmd->scatter_gather_element[0];
 		u32 sg_count, sg_len;
 		dma_addr_t sg_addr;
-		int i, count = dma_map_sg(hi->host->device.parent, sg,
-					  scsi_use_sg, dma_dir);
+		int i, count = dma_map_sg(dmadev, sg, scsi_use_sg, dma_dir);
 
 		cmd->dma_size = scsi_use_sg;
 		cmd->sge_buffer = sg;
@@ -1522,7 +1541,7 @@ static void sbp2_prep_command_orb_sg(str
 		orb->misc |= ORB_SET_PAGE_TABLE_PRESENT(0x1);
 		orb->data_descriptor_lo = cmd->sge_dma;
 
-		dma_sync_single_for_cpu(hi->host->device.parent, cmd->sge_dma,
+		dma_sync_single_for_cpu(dmadev, cmd->sge_dma,
 					sizeof(cmd->scatter_gather_element),
 					DMA_TO_DEVICE);
 
@@ -1553,22 +1572,23 @@ static void sbp2_prep_command_orb_sg(str
 				(sizeof(struct sbp2_unrestricted_page_table)) *
 				sg_count);
 
-		dma_sync_single_for_device(hi->host->device.parent,
-					   cmd->sge_dma,
+		dma_sync_single_for_device(dmadev, cmd->sge_dma,
 					   sizeof(cmd->scatter_gather_element),
 					   DMA_TO_DEVICE);
 	}
+	return 0;
 }
 
-static void sbp2_create_command_orb(struct sbp2_lu *lu,
-				    struct sbp2_command_info *cmd,
-				    struct scsi_cmnd *SCpnt)
+static int sbp2_create_command_orb(struct sbp2_lu *lu,
+				   struct sbp2_command_info *cmd,
+				   struct scsi_cmnd *SCpnt)
 {
 	struct device *dmadev = lu->hi->host->device.parent;
 	struct sbp2_command_orb *orb = &cmd->command_orb;
-	u32 orb_direction;
 	unsigned int scsi_request_bufflen = scsi_bufflen(SCpnt);
 	enum dma_data_direction dma_dir = SCpnt->sc_data_direction;
+	u32 orb_direction;
+	int ret;
 
 	dma_sync_single_for_cpu(dmadev, cmd->command_orb_dma,
 				sizeof(struct sbp2_command_orb), DMA_TO_DEVICE);
@@ -1602,11 +1622,13 @@ static void sbp2_create_command_orb(stru
 		orb->data_descriptor_hi = 0x0;
 		orb->data_descriptor_lo = 0x0;
 		orb->misc |= ORB_SET_DIRECTION(1);
-	} else
-		sbp2_prep_command_orb_sg(orb, lu->hi, cmd, scsi_sg_count(SCpnt),
-					 scsi_sglist(SCpnt),
-					 orb_direction, dma_dir);
-
+		ret = 0;
+	} else {
+		ret = sbp2_prep_command_orb_sg(orb, lu->hi, cmd,
+					       scsi_sg_count(SCpnt),
+					       scsi_sglist(SCpnt),
+					       orb_direction, dma_dir);
+	}
 	sbp2util_cpu_to_be32_buffer(orb, sizeof(*orb));
 
 	memset(orb->cdb, 0, sizeof(orb->cdb));
@@ -1614,6 +1636,7 @@ static void sbp2_create_command_orb(stru
 
 	dma_sync_single_for_device(dmadev, cmd->command_orb_dma,
 			sizeof(struct sbp2_command_orb), DMA_TO_DEVICE);
+	return ret;
 }
 
 static void sbp2_link_orb_command(struct sbp2_lu *lu,
@@ -1694,9 +1717,10 @@ static int sbp2_send_command(struct sbp2
 	if (!cmd)
 		return -EIO;
 
-	sbp2_create_command_orb(lu, cmd, SCpnt);
-	sbp2_link_orb_command(lu, cmd);
+	if (sbp2_create_command_orb(lu, cmd, SCpnt))
+		return -ENOMEM;
 
+	sbp2_link_orb_command(lu, cmd);
 	return 0;
 }
 
-- 
Stefan Richter
-=====-==--- =--- -=--=
http://arcgraph.de/sr/




end of thread, other threads:[~2008-08-09 18:16 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-08-04 18:08 [patch resend] firewire: ieee1394: Move away from SG_ALL Stefan Richter
2008-08-04 19:21 ` James Bottomley
2008-08-04 19:58   ` Stefan Richter
2008-08-05 12:22     ` Boaz Harrosh
2008-08-05 15:01       ` Stefan Richter
2008-08-05 15:28         ` Boaz Harrosh
2008-08-08 11:59           ` [PATCH] ieee1394: sbp2: stricter dma_sync Stefan Richter
2008-08-08 12:00             ` [PATCH] ieee1394: sbp2: avoid DMA bounce buffers for ORBs Stefan Richter
2008-08-08 17:54               ` Stefan Richter
2008-08-09 18:13             ` [PATCH update] ieee1394: sbp2: stricter dma_sync Stefan Richter
2008-08-09 18:16               ` [PATCH] ieee1394: sbp2: check for DMA mapping failures Stefan Richter
2008-08-05 19:57       ` [patch resend] firewire: ieee1394: Move away from SG_ALL Stefan Richter
2008-08-06  9:10         ` Boaz Harrosh
2008-08-06 20:49           ` Stefan Richter

This is a public inbox; see mirroring instructions
for how to clone and mirror all data and code used for this inbox.