public inbox for linux-kernel@vger.kernel.org
* [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request
  2014-03-12 16:05 [PATCH 0/2] mtip32xx: Fix arch-specific driver issues Felipe Franciosi
@ 2014-03-12 16:05 ` Felipe Franciosi
  2014-03-12 16:16   ` Jens Axboe
  2014-03-12 19:04   ` Sam Bradshaw
  0 siblings, 2 replies; 7+ messages in thread
From: Felipe Franciosi @ 2014-03-12 16:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe, Sam Bradshaw, Felipe Franciosi

If the buffers are unmapped after completing a request, then stale data
might be in the request.

Signed-off-by: Felipe Franciosi <felipe@paradoxo.org>
---
 drivers/block/mtip32xx/mtip32xx.c |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 24c87fdb..7e8fe0d 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -266,6 +266,12 @@ static void mtip_async_complete(struct mtip_port *port,
 			"Command tag %d failed due to TFE\n", tag);
 	}
 
+	/* Unmap the DMA scatter list entries */
+	dma_unmap_sg(&dd->pdev->dev,
+		command->sg,
+		command->scatter_ents,
+		command->direction);
+
 	/* Upper layer callback */
 	if (likely(command->async_callback))
 		command->async_callback(command->async_data, cb_status);
@@ -273,12 +279,6 @@ static void mtip_async_complete(struct mtip_port *port,
 	command->async_callback = NULL;
 	command->comp_func = NULL;
 
-	/* Unmap the DMA scatter list entries */
-	dma_unmap_sg(&dd->pdev->dev,
-		command->sg,
-		command->scatter_ents,
-		command->direction);
-
 	/* Clear the allocated and active bits for the command */
 	atomic_set(&port->commands[tag].active, 0);
 	release_slot(port, tag);
@@ -709,6 +709,12 @@ static void mtip_timeout_function(unsigned long int data)
 			 */
 			writel(1 << bit, port->completed[group]);
 
+			/* Unmap the DMA scatter list entries */
+			dma_unmap_sg(&port->dd->pdev->dev,
+					command->sg,
+					command->scatter_ents,
+					command->direction);
+
 			/* Call the async completion callback. */
 			if (likely(command->async_callback))
 				command->async_callback(command->async_data,
@@ -716,12 +722,6 @@ static void mtip_timeout_function(unsigned long int data)
 			command->async_callback = NULL;
 			command->comp_func = NULL;
 
-			/* Unmap the DMA scatter list entries */
-			dma_unmap_sg(&port->dd->pdev->dev,
-					command->sg,
-					command->scatter_ents,
-					command->direction);
-
 			/*
 			 * Clear the allocated bit and active tag for the
 			 * command.
-- 
1.7.10.4
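
One way the stale data described in the commit message can arise is when the DMA mapping goes through a bounce buffer (as swiotlb does), so that data written by the device only reaches the caller's buffer at unmap time. The following is a minimal userspace sketch of that ordering problem, not driver code. Every name in it (`struct mapping`, `device_write`, `unmap`, `on_complete`, `run_io`) is hypothetical and only models the behaviour under that assumption.

```c
#include <assert.h>
#include <string.h>

#define BUF_LEN 16

/* Hypothetical stand-in for a DMA mapping backed by a bounce buffer. */
struct mapping {
    char *orig;            /* buffer owned by the upper layer */
    char bounce[BUF_LEN];  /* buffer the "device" actually writes to */
};

/* Model of the device's DMA write: data lands in the bounce buffer only. */
static void device_write(struct mapping *m, const char *data)
{
    strncpy(m->bounce, data, BUF_LEN - 1);
    m->bounce[BUF_LEN - 1] = '\0';
}

/* Model of unmapping: copy the bounced data back to the caller's buffer. */
static void unmap(struct mapping *m)
{
    assert(m->orig != NULL);
    memcpy(m->orig, m->bounce, BUF_LEN);
}

/* Model of the completion callback: record what the upper layer sees. */
static void on_complete(const char *buf, char *seen)
{
    memcpy(seen, buf, BUF_LEN);
}

/* Returns 1 if the completion callback observed the device's data. */
static int run_io(int unmap_first)
{
    char buf[BUF_LEN] = "stale";
    char seen[BUF_LEN];
    struct mapping m = { .orig = buf };

    device_write(&m, "fresh");
    if (unmap_first) {
        unmap(&m);              /* order used by the patch */
        on_complete(buf, seen);
    } else {
        on_complete(buf, seen); /* order before the patch */
        unmap(&m);
    }
    return strcmp(seen, "fresh") == 0;
}
```

In this model `run_io(1)` reports that the callback saw the device's data, while `run_io(0)` reports that it saw the old contents of the buffer.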



* Re: [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request
  2014-03-12 16:05 ` [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request Felipe Franciosi
@ 2014-03-12 16:16   ` Jens Axboe
  2014-03-12 19:04   ` Sam Bradshaw
  1 sibling, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2014-03-12 16:16 UTC (permalink / raw)
  To: Felipe Franciosi, linux-kernel; +Cc: Sam Bradshaw

On 03/12/2014 10:05 AM, Felipe Franciosi wrote:
> If the buffers are unmapped after completing a request, then stale data
> might be in the request.

This is unfortunate, and a real bug. Good catch!

-- 
Jens Axboe



* Re: [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request
  2014-03-12 16:05 ` [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request Felipe Franciosi
  2014-03-12 16:16   ` Jens Axboe
@ 2014-03-12 19:04   ` Sam Bradshaw
  1 sibling, 0 replies; 7+ messages in thread
From: Sam Bradshaw @ 2014-03-12 19:04 UTC (permalink / raw)
  To: Felipe Franciosi
  Cc: linux-kernel, Jens Axboe,
	Asai Thambi Samymuthu Pattrayasamy (asamymuthupa)

On 03/12/2014 09:05 AM, Felipe Franciosi wrote:
> If the buffers are unmapped after completing a request, then stale data
> might be in the request.

Good find, Felipe, thank you.  I would prefer something along the lines
of the patch below, which also avoids double completions and double
dma_unmap_sg() calls during surprise removal and/or timeout conditions.

Jens: note that this patch also fixes a regression in the unaligned 
workaround implementation that was introduced by the SRSI patch.

Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>
diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 5160269..390ac6f 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -252,38 +252,45 @@ static void mtip_async_complete(struct mtip_port *port,
  				void *data,
  				int status)
  {
-	struct mtip_cmd *command;
+	struct mtip_cmd *cmd;
  	struct driver_data *dd = data;
-	int cb_status = status ? -EIO : 0;
+	int unaligned, cb_status = status ? -EIO : 0;
+	void (*func)(void *, int);

  	if (unlikely(!dd) || unlikely(!port))
  		return;

-	command = &port->commands[tag];
+	cmd = &port->commands[tag];

  	if (unlikely(status == PORT_IRQ_TF_ERR)) {
  		dev_warn(&port->dd->pdev->dev,
  			"Command tag %d failed due to TFE\n", tag);
  	}

+	/* Clear the active flag */
+	atomic_set(&port->commands[tag].active, 0);
+
  	/* Upper layer callback */
-	if (likely(command->async_callback))
-		command->async_callback(command->async_data, cb_status);
+	func = cmd->async_callback;
+	if (likely(func && cmpxchg(&cmd->async_callback, func, 0) == func)) {

-	command->async_callback = NULL;
-	command->comp_func = NULL;
+		/* Unmap the DMA scatter list entries */
+		dma_unmap_sg(&dd->pdev->dev,
+			cmd->sg,
+			cmd->scatter_ents,
+			cmd->direction);

-	/* Unmap the DMA scatter list entries */
-	dma_unmap_sg(&dd->pdev->dev,
-		command->sg,
-		command->scatter_ents,
-		command->direction);
+		func(cmd->async_data, cb_status);
+		unaligned = cmd->unaligned;

-	/* Clear the allocated and active bits for the command */
-	atomic_set(&port->commands[tag].active, 0);
-	release_slot(port, tag);
+		/* Clear the allocated bit for the command */
+		release_slot(port, tag);

-	up(&port->cmd_slot);
+		if (unlikely(unaligned))
+			up(&port->cmd_slot_unal);
+		else
+			up(&port->cmd_slot);
+	}
  }

  /*
@@ -660,11 +667,12 @@ static void mtip_timeout_function(unsigned long int data)
  {
  	struct mtip_port *port = (struct mtip_port *) data;
  	struct host_to_dev_fis *fis;
-	struct mtip_cmd *command;
-	int tag, cmdto_cnt = 0;
+	struct mtip_cmd *cmd;
+	int unaligned, tag, cmdto_cnt = 0;
  	unsigned int bit, group;
  	unsigned int num_command_slots;
  	unsigned long to, tagaccum[SLOTBITS_IN_LONGS];
+	void (*func)(void *, int);

  	if (unlikely(!port))
  		return;
@@ -694,8 +702,8 @@ static void mtip_timeout_function(unsigned long int data)
  			group = tag >> 5;
  			bit = tag & 0x1F;

-			command = &port->commands[tag];
-			fis = (struct host_to_dev_fis *) command->command;
+			cmd = &port->commands[tag];
+			fis = (struct host_to_dev_fis *) cmd->command;

  			set_bit(tag, tagaccum);
  			cmdto_cnt++;
@@ -709,27 +717,30 @@ static void mtip_timeout_function(unsigned long int data)
  			 */
  			writel(1 << bit, port->completed[group]);

-			/* Call the async completion callback. */
-			if (likely(command->async_callback))
-				command->async_callback(command->async_data,
-							 -EIO);
-			command->async_callback = NULL;
-			command->comp_func = NULL;
+			/* Clear the active flag for the command */
+			atomic_set(&port->commands[tag].active, 0);

-			/* Unmap the DMA scatter list entries */
-			dma_unmap_sg(&port->dd->pdev->dev,
-					command->sg,
-					command->scatter_ents,
-					command->direction);
+			func = cmd->async_callback;
+			if (func &&
+			    cmpxchg(&cmd->async_callback, func, 0) == func) {

-			/*
-			 * Clear the allocated bit and active tag for the
-			 * command.
-			 */
-			atomic_set(&port->commands[tag].active, 0);
-			release_slot(port, tag);
+				/* Unmap the DMA scatter list entries */
+				dma_unmap_sg(&port->dd->pdev->dev,
+						cmd->sg,
+						cmd->scatter_ents,
+						cmd->direction);

-			up(&port->cmd_slot);
+				func(cmd->async_data, -EIO);
+				unaligned = cmd->unaligned;
+
+				/* Clear the allocated bit for the command. */
+				release_slot(port, tag);
+
+				if (unaligned)
+					up(&port->cmd_slot_unal);
+				else
+					up(&port->cmd_slot);
+			}
  		}
  	}
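
The patch above guards completion with `cmpxchg()` on the callback pointer, so that whichever path (interrupt completion or timeout) clears the pointer first is the only one that unmaps, calls back, and releases the slot. The same claim-once idea can be sketched in userspace with C11 atomics. This is an illustration under assumptions, not the driver's code: `struct cmd`, `struct slots`, `cmd_init`, `complete_once` and `count_call` are hypothetical names, and plain counters stand in for the `cmd_slot` and `cmd_slot_unal` semaphores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*done_fn)(void *data, int status);

/* Hypothetical, simplified command slot; not the driver's struct mtip_cmd. */
struct cmd {
    _Atomic(done_fn) callback;  /* non-NULL while a completion is pending */
    void *data;
    int unaligned;
};

/* Hypothetical counters standing in for the two slot semaphores. */
struct slots {
    int free_aligned;
    int free_unaligned;
};

static void cmd_init(struct cmd *c, done_fn fn, void *data, int unaligned)
{
    atomic_init(&c->callback, fn);
    c->data = data;
    c->unaligned = unaligned;
}

/* Returns 1 if this caller claimed and ran the completion, 0 otherwise. */
static int complete_once(struct cmd *c, struct slots *s, int status)
{
    done_fn fn, none = NULL;
    int unaligned;

    assert(c != NULL && s != NULL);
    fn = atomic_load(&c->callback);
    if (!fn || !atomic_compare_exchange_strong(&c->callback, &fn, none))
        return 0;  /* another path already claimed this command */

    fn(c->data, status);
    unaligned = c->unaligned;  /* read before the slot is handed back */

    /* Give the slot back to the pool it was taken from. */
    if (unaligned)
        s->free_unaligned++;
    else
        s->free_aligned++;
    return 1;
}

/* Example callback: counts how many times it was invoked. */
static void count_call(void *data, int status)
{
    (void)status;
    ++*(int *)data;
}
```

Calling `complete_once()` twice on the same command runs the callback once and returns the slot to a single pool. The second call finds the pointer already cleared and does nothing.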




* [PATCH 0/2 v2] mtip32xx: Fix arch-specific performance issues
@ 2014-03-13 14:34 Felipe Franciosi
  2014-03-13 14:34 ` [PATCH 1/2] mtip32xx: Set queue bounce limit Felipe Franciosi
  2014-03-13 14:34 ` [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request Felipe Franciosi
  0 siblings, 2 replies; 7+ messages in thread
From: Felipe Franciosi @ 2014-03-13 14:34 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe, Sam Bradshaw, David Vrabel

This V2 series reintroduces blk_queue_bounce(), but also sets a correct
blk_queue_bounce_limit() during the device initialisation. I have tested
this series and confirmed that performance is good on a 32-bit dom0 on Xen.



* [PATCH 1/2] mtip32xx: Set queue bounce limit
  2014-03-13 14:34 [PATCH 0/2 v2] mtip32xx: Fix arch-specific performance issues Felipe Franciosi
@ 2014-03-13 14:34 ` Felipe Franciosi
  2014-03-13 14:34 ` [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request Felipe Franciosi
  1 sibling, 0 replies; 7+ messages in thread
From: Felipe Franciosi @ 2014-03-13 14:34 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe, Sam Bradshaw, David Vrabel, Felipe Franciosi

We need to set the queue bounce limit during device initialization to
prevent excessive bouncing on 32-bit architectures.

Signed-off-by: Felipe Franciosi <felipe@paradoxo.org>
---
 drivers/block/mtip32xx/mtip32xx.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 5160269..787c9d3 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -4213,6 +4213,7 @@ skip_create_disk:
 	blk_queue_max_hw_sectors(dd->queue, 0xffff);
 	blk_queue_max_segment_size(dd->queue, 0x400000);
 	blk_queue_io_min(dd->queue, 4096);
+	blk_queue_bounce_limit(dd->queue, dd->pdev->dma_mask);
 
 	/*
 	 * write back cache is not supported in the device. FUA depends on
-- 
1.7.10.4
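
As a rough illustration of why the limit matters: a buffer only needs a bounce copy when it lies above the limit set on the queue, so raising the limit to the device's DMA mask means fewer buffers are bounced. The helper below is a simplified, hypothetical model (`needs_bounce` is not a kernel function, and the block layer's own check works on page frame numbers rather than byte addresses).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if a buffer of `len` bytes starting at
 * physical address `phys_addr` extends above `limit` and would therefore
 * need a bounce copy, 0 otherwise.
 */
static int needs_bounce(uint64_t phys_addr, uint64_t len, uint64_t limit)
{
    assert(len > 0);
    return phys_addr + len - 1 > limit;
}
```

With a 32-bit limit (`0xFFFFFFFF`), a page at the 4 GiB boundary is bounced in this model. With a limit equal to a 64-bit DMA mask, the same page is not.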



* [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request
  2014-03-13 14:34 [PATCH 0/2 v2] mtip32xx: Fix arch-specific performance issues Felipe Franciosi
  2014-03-13 14:34 ` [PATCH 1/2] mtip32xx: Set queue bounce limit Felipe Franciosi
@ 2014-03-13 14:34 ` Felipe Franciosi
  2014-03-13 15:30   ` Jens Axboe
  1 sibling, 1 reply; 7+ messages in thread
From: Felipe Franciosi @ 2014-03-13 14:34 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe, Sam Bradshaw, David Vrabel, Felipe Franciosi

If the buffers are unmapped after completing a request, then stale data
might be in the request.

Signed-off-by: Felipe Franciosi <felipe@paradoxo.org>
---
 drivers/block/mtip32xx/mtip32xx.c |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 787c9d3..4dd2642 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -266,6 +266,12 @@ static void mtip_async_complete(struct mtip_port *port,
 			"Command tag %d failed due to TFE\n", tag);
 	}
 
+	/* Unmap the DMA scatter list entries */
+	dma_unmap_sg(&dd->pdev->dev,
+		command->sg,
+		command->scatter_ents,
+		command->direction);
+
 	/* Upper layer callback */
 	if (likely(command->async_callback))
 		command->async_callback(command->async_data, cb_status);
@@ -273,12 +279,6 @@ static void mtip_async_complete(struct mtip_port *port,
 	command->async_callback = NULL;
 	command->comp_func = NULL;
 
-	/* Unmap the DMA scatter list entries */
-	dma_unmap_sg(&dd->pdev->dev,
-		command->sg,
-		command->scatter_ents,
-		command->direction);
-
 	/* Clear the allocated and active bits for the command */
 	atomic_set(&port->commands[tag].active, 0);
 	release_slot(port, tag);
@@ -709,6 +709,12 @@ static void mtip_timeout_function(unsigned long int data)
 			 */
 			writel(1 << bit, port->completed[group]);
 
+			/* Unmap the DMA scatter list entries */
+			dma_unmap_sg(&port->dd->pdev->dev,
+					command->sg,
+					command->scatter_ents,
+					command->direction);
+
 			/* Call the async completion callback. */
 			if (likely(command->async_callback))
 				command->async_callback(command->async_data,
@@ -716,12 +722,6 @@ static void mtip_timeout_function(unsigned long int data)
 			command->async_callback = NULL;
 			command->comp_func = NULL;
 
-			/* Unmap the DMA scatter list entries */
-			dma_unmap_sg(&port->dd->pdev->dev,
-					command->sg,
-					command->scatter_ents,
-					command->direction);
-
 			/*
 			 * Clear the allocated bit and active tag for the
 			 * command.
-- 
1.7.10.4



* Re: [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request
  2014-03-13 14:34 ` [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request Felipe Franciosi
@ 2014-03-13 15:30   ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2014-03-13 15:30 UTC (permalink / raw)
  To: Felipe Franciosi, linux-kernel; +Cc: Sam Bradshaw, David Vrabel

On 03/13/2014 08:34 AM, Felipe Franciosi wrote:
> If the buffers are unmapped after completing a request, then stale data
> might be in the request.

Both should be marked for stable, I'll add that.

Sam, I'll queue these two up. Please re-send your fix for the bug
introduced with the unaligned reduction as a proper patch, and I'll get
that in too.

-- 
Jens Axboe


