* [PATCH 0/2 v2] mtip32xx: Fix arch-specific performance issues
From: Felipe Franciosi @ 2014-03-13 14:34 UTC
To: linux-kernel; +Cc: Jens Axboe, Sam Bradshaw, David Vrabel

This V2 series reintroduces blk_queue_bounce(), but also sets a correct
blk_queue_bounce_limit() during the device initialisation.

I have tested this series and confirmed that performance is good on a
32-bit dom0 on Xen.

* [PATCH 1/2] mtip32xx: Set queue bounce limit
From: Felipe Franciosi @ 2014-03-13 14:34 UTC
To: linux-kernel; +Cc: Jens Axboe, Sam Bradshaw, David Vrabel, Felipe Franciosi

We need to set the queue bounce limit during the device initialization to
prevent excessive bouncing on 32 bit architectures.

Signed-off-by: Felipe Franciosi <felipe@paradoxo.org>
---
 drivers/block/mtip32xx/mtip32xx.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 5160269..787c9d3 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -4213,6 +4213,7 @@ skip_create_disk:
 	blk_queue_max_hw_sectors(dd->queue, 0xffff);
 	blk_queue_max_segment_size(dd->queue, 0x400000);
 	blk_queue_io_min(dd->queue, 4096);
+	blk_queue_bounce_limit(dd->queue, dd->pdev->dma_mask);

 	/*
 	 * write back cache is not supported in the device. FUA depends on
-- 
1.7.10.4

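For readers unfamiliar with bounce limits, here is a minimal userspace sketch, not kernel code. It assumes a simplified model in which any buffer whose last byte lies above the queue's limit has to be copied through a low bounce page. All `model_*` names are hypothetical and do not exist in the driver or the block layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a request queue; not the kernel's struct. */
struct model_queue {
	uint64_t bounce_limit;	/* highest address the device may DMA to directly */
};

/* Models the effect of the one-line patch: take the limit from the DMA mask. */
static void model_set_bounce_limit(struct model_queue *q, uint64_t dma_mask)
{
	q->bounce_limit = dma_mask;
}

/* A buffer needs a bounce copy if its last byte lies above the limit. */
static bool model_needs_bounce(const struct model_queue *q,
			       uint64_t addr, uint64_t len)
{
	if (len == 0)
		return false;
	return addr + len - 1 > q->bounce_limit;
}

/* Count how many of n equally sized buffers would be bounced. */
static unsigned int model_count_bounces(const struct model_queue *q,
					const uint64_t *addrs,
					unsigned int n, uint64_t len)
{
	unsigned int i, count = 0;

	for (i = 0; i < n; i++)
		if (model_needs_bounce(q, addrs[i], len))
			count++;
	return count;
}
```

With an assumed low limit of 0x37ffffff, 4 KiB buffers at 0x40000000 and 0x80000000 would be bounced in this model, while one at 0x10000000 would not. After raising the limit to an all-ones 64-bit mask, none of them bounce.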
* [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request
From: Felipe Franciosi @ 2014-03-13 14:34 UTC
To: linux-kernel; +Cc: Jens Axboe, Sam Bradshaw, David Vrabel, Felipe Franciosi

If the buffers are unmapped after completing a request, then stale data
might be in the request.

Signed-off-by: Felipe Franciosi <felipe@paradoxo.org>
---
 drivers/block/mtip32xx/mtip32xx.c |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 787c9d3..4dd2642 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -266,6 +266,12 @@ static void mtip_async_complete(struct mtip_port *port,
 			"Command tag %d failed due to TFE\n", tag);
 	}

+	/* Unmap the DMA scatter list entries */
+	dma_unmap_sg(&dd->pdev->dev,
+		command->sg,
+		command->scatter_ents,
+		command->direction);
+
 	/* Upper layer callback */
 	if (likely(command->async_callback))
 		command->async_callback(command->async_data, cb_status);
@@ -273,12 +279,6 @@ static void mtip_async_complete(struct mtip_port *port,
 	command->async_callback = NULL;
 	command->comp_func = NULL;

-	/* Unmap the DMA scatter list entries */
-	dma_unmap_sg(&dd->pdev->dev,
-		command->sg,
-		command->scatter_ents,
-		command->direction);
-
 	/* Clear the allocated and active bits for the command */
 	atomic_set(&port->commands[tag].active, 0);
 	release_slot(port, tag);
@@ -709,6 +709,12 @@ static void mtip_timeout_function(unsigned long int data)
 			 */
 			writel(1 << bit, port->completed[group]);

+			/* Unmap the DMA scatter list entries */
+			dma_unmap_sg(&port->dd->pdev->dev,
+					command->sg,
+					command->scatter_ents,
+					command->direction);
+
 			/* Call the async completion callback. */
 			if (likely(command->async_callback))
 				command->async_callback(command->async_data,
@@ -716,12 +722,6 @@ static void mtip_timeout_function(unsigned long int data)
 			command->async_callback = NULL;
 			command->comp_func = NULL;

-			/* Unmap the DMA scatter list entries */
-			dma_unmap_sg(&port->dd->pdev->dev,
-					command->sg,
-					command->scatter_ents,
-					command->direction);
-
 			/*
 			 * Clear the allocated bit and active tag for the
 			 * command.
-- 
1.7.10.4

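The ordering problem described in this patch can be illustrated with a small userspace sketch. It assumes a DMA layer that stages device data in a bounce buffer and copies it back to the caller's buffer only at unmap time; that is an assumption about the affected platforms, not something stated in the patch. The `model_*` names are hypothetical and are not driver code.

```c
#include <assert.h>
#include <string.h>

#define MODEL_BUF_LEN 8

/* Hypothetical model of a command with a bounce-buffered read. */
struct model_cmd {
	char *buf;				/* the caller's buffer */
	char bounce[MODEL_BUF_LEN];		/* where the device actually wrote */
	void (*callback)(struct model_cmd *cmd);/* upper layer completion */
	char seen[MODEL_BUF_LEN];		/* what the callback observed */
};

/* The device deposits its data in the bounce buffer only. */
static void model_device_write(struct model_cmd *cmd, const char *data)
{
	memcpy(cmd->bounce, data, MODEL_BUF_LEN);
}

/* Assumed side effect of unmapping: copy the bounce buffer back. */
static void model_unmap(struct model_cmd *cmd)
{
	memcpy(cmd->buf, cmd->bounce, MODEL_BUF_LEN);
}

/* Completion callback that records what the caller's buffer holds. */
static void model_record(struct model_cmd *cmd)
{
	memcpy(cmd->seen, cmd->buf, MODEL_BUF_LEN);
}

/* Old order: complete first, unmap afterwards. */
static void model_complete_old(struct model_cmd *cmd)
{
	cmd->callback(cmd);
	model_unmap(cmd);
}

/* Patched order: unmap first, then complete. */
static void model_complete_new(struct model_cmd *cmd)
{
	model_unmap(cmd);
	cmd->callback(cmd);
}
```

In this model the old ordering lets the callback read the buffer before the copy-back has happened, so it sees whatever was there before the I/O. The patched ordering lets it see the device's data.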
* Re: [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request
From: Jens Axboe @ 2014-03-13 15:30 UTC
To: Felipe Franciosi, linux-kernel; +Cc: Sam Bradshaw, David Vrabel

On 03/13/2014 08:34 AM, Felipe Franciosi wrote:
> If the buffers are unmapped after completing a request, then stale data
> might be in the request.

Both should be marked for stable, I'll add that.

Sam, I'll queue these two up. Please re-send your fix for bug introduced
with the unaligned reduction as a proper patch, and I'll get that in too.

-- 
Jens Axboe

* [PATCH 0/2] mtip32xx: Fix arch-specific driver issues
From: Felipe Franciosi @ 2014-03-12 16:05 UTC
To: linux-kernel; +Cc: Jens Axboe, Sam Bradshaw

This patch series fixes a couple of issues with the Micron P320 driver.

The first is an issue on 32-bit architectures where unnecessary bouncing of
requests causes very poor performance.

The second regards a problem on architectures where DMA unmapping has side
effects.

* [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request
From: Felipe Franciosi @ 2014-03-12 16:05 UTC
To: linux-kernel; +Cc: Jens Axboe, Sam Bradshaw, Felipe Franciosi

If the buffers are unmapped after completing a request, then stale data
might be in the request.

Signed-off-by: Felipe Franciosi <felipe@paradoxo.org>
---
 drivers/block/mtip32xx/mtip32xx.c |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 24c87fdb..7e8fe0d 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -266,6 +266,12 @@ static void mtip_async_complete(struct mtip_port *port,
 			"Command tag %d failed due to TFE\n", tag);
 	}

+	/* Unmap the DMA scatter list entries */
+	dma_unmap_sg(&dd->pdev->dev,
+		command->sg,
+		command->scatter_ents,
+		command->direction);
+
 	/* Upper layer callback */
 	if (likely(command->async_callback))
 		command->async_callback(command->async_data, cb_status);
@@ -273,12 +279,6 @@ static void mtip_async_complete(struct mtip_port *port,
 	command->async_callback = NULL;
 	command->comp_func = NULL;

-	/* Unmap the DMA scatter list entries */
-	dma_unmap_sg(&dd->pdev->dev,
-		command->sg,
-		command->scatter_ents,
-		command->direction);
-
 	/* Clear the allocated and active bits for the command */
 	atomic_set(&port->commands[tag].active, 0);
 	release_slot(port, tag);
@@ -709,6 +709,12 @@ static void mtip_timeout_function(unsigned long int data)
 			 */
 			writel(1 << bit, port->completed[group]);

+			/* Unmap the DMA scatter list entries */
+			dma_unmap_sg(&port->dd->pdev->dev,
+					command->sg,
+					command->scatter_ents,
+					command->direction);
+
 			/* Call the async completion callback. */
 			if (likely(command->async_callback))
 				command->async_callback(command->async_data,
@@ -716,12 +722,6 @@ static void mtip_timeout_function(unsigned long int data)
 			command->async_callback = NULL;
 			command->comp_func = NULL;

-			/* Unmap the DMA scatter list entries */
-			dma_unmap_sg(&port->dd->pdev->dev,
-					command->sg,
-					command->scatter_ents,
-					command->direction);
-
 			/*
 			 * Clear the allocated bit and active tag for the
 			 * command.
-- 
1.7.10.4

* Re: [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request
From: Jens Axboe @ 2014-03-12 16:16 UTC
To: Felipe Franciosi, linux-kernel; +Cc: Sam Bradshaw

On 03/12/2014 10:05 AM, Felipe Franciosi wrote:
> If the buffers are unmapped after completing a request, then stale data
> might be in the request.

This is unfortunate, and a real bug. Good catch!

-- 
Jens Axboe

* Re: [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request
From: Sam Bradshaw @ 2014-03-12 19:04 UTC
To: Felipe Franciosi
Cc: linux-kernel, Jens Axboe, Asai Thambi Samymuthu Pattrayasamy (asamymuthupa)

On 03/12/2014 09:05 AM, Felipe Franciosi wrote:
> If the buffers are unmapped after completing a request, then stale data
> might be in the request.

Good find, Felipe, thank you.

I would prefer something along the lines of this patch to make sure to
avoid double completions / dma_unmap_sg() calls during surprise removal
and/or timeout conditions.

Jens: note that this patch also fixes a regression in the unaligned
workaround implementation that was introduced by the SRSI patch.

Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>

diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 5160269..390ac6f 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -252,38 +252,45 @@ static void mtip_async_complete(struct mtip_port *port,
 				void *data,
 				int status)
 {
-	struct mtip_cmd *command;
+	struct mtip_cmd *cmd;
 	struct driver_data *dd = data;
-	int cb_status = status ? -EIO : 0;
+	int unaligned, cb_status = status ? -EIO : 0;
+	void (*func)(void *, int);

 	if (unlikely(!dd) || unlikely(!port))
 		return;

-	command = &port->commands[tag];
+	cmd = &port->commands[tag];

 	if (unlikely(status == PORT_IRQ_TF_ERR)) {
 		dev_warn(&port->dd->pdev->dev,
 			"Command tag %d failed due to TFE\n", tag);
 	}

+	/* Clear the active flag */
+	atomic_set(&port->commands[tag].active, 0);
+
 	/* Upper layer callback */
-	if (likely(command->async_callback))
-		command->async_callback(command->async_data, cb_status);
+	func = cmd->async_callback;
+	if (likely(func && cmpxchg(&cmd->async_callback, func, 0) == func)) {

-	command->async_callback = NULL;
-	command->comp_func = NULL;
+		/* Unmap the DMA scatter list entries */
+		dma_unmap_sg(&dd->pdev->dev,
+			cmd->sg,
+			cmd->scatter_ents,
+			cmd->direction);

-	/* Unmap the DMA scatter list entries */
-	dma_unmap_sg(&dd->pdev->dev,
-		command->sg,
-		command->scatter_ents,
-		command->direction);
+		func(cmd->async_data, cb_status);
+		unaligned = cmd->unaligned;

-	/* Clear the allocated and active bits for the command */
-	atomic_set(&port->commands[tag].active, 0);
-	release_slot(port, tag);
+		/* Clear the allocated bit for the command */
+		release_slot(port, tag);

-	up(&port->cmd_slot);
+		if (unlikely(unaligned))
+			up(&port->cmd_slot_unal);
+		else
+			up(&port->cmd_slot);
+	}
 }

 /*
@@ -660,11 +667,12 @@ static void mtip_timeout_function(unsigned long int data)
 {
 	struct mtip_port *port = (struct mtip_port *) data;
 	struct host_to_dev_fis *fis;
-	struct mtip_cmd *command;
-	int tag, cmdto_cnt = 0;
+	struct mtip_cmd *cmd;
+	int unaligned, tag, cmdto_cnt = 0;
 	unsigned int bit, group;
 	unsigned int num_command_slots;
 	unsigned long to, tagaccum[SLOTBITS_IN_LONGS];
+	void (*func)(void *, int);

 	if (unlikely(!port))
 		return;
@@ -694,8 +702,8 @@ static void mtip_timeout_function(unsigned long int data)
 			group = tag >> 5;
 			bit = tag & 0x1F;

-			command = &port->commands[tag];
-			fis = (struct host_to_dev_fis *) command->command;
+			cmd = &port->commands[tag];
+			fis = (struct host_to_dev_fis *) cmd->command;

 			set_bit(tag, tagaccum);
 			cmdto_cnt++;
@@ -709,27 +717,30 @@ static void mtip_timeout_function(unsigned long int data)
 			 */
 			writel(1 << bit, port->completed[group]);

-			/* Call the async completion callback. */
-			if (likely(command->async_callback))
-				command->async_callback(command->async_data,
-							 -EIO);
-			command->async_callback = NULL;
-			command->comp_func = NULL;
+			/* Clear the active flag for the command */
+			atomic_set(&port->commands[tag].active, 0);

-			/* Unmap the DMA scatter list entries */
-			dma_unmap_sg(&port->dd->pdev->dev,
-					command->sg,
-					command->scatter_ents,
-					command->direction);
+			func = cmd->async_callback;
+			if (func &&
+			    cmpxchg(&cmd->async_callback, func, 0) == func) {

-			/*
-			 * Clear the allocated bit and active tag for the
-			 * command.
-			 */
-			atomic_set(&port->commands[tag].active, 0);
-			release_slot(port, tag);
+				/* Unmap the DMA scatter list entries */
+				dma_unmap_sg(&port->dd->pdev->dev,
+						cmd->sg,
+						cmd->scatter_ents,
+						cmd->direction);

-			up(&port->cmd_slot);
+				func(cmd->async_data, -EIO);
+				unaligned = cmd->unaligned;
+
+				/* Clear the allocated bit for the command. */
+				release_slot(port, tag);
+
+				if (unaligned)
+					up(&port->cmd_slot_unal);
+				else
+					up(&port->cmd_slot);
+			}
 		}
 	}

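The claim-once idea behind the cmpxchg() test above can be sketched in portable C11, with `atomic_compare_exchange_strong()` standing in for the kernel's cmpxchg(). This is a userspace illustration under that substitution, not the driver's code. The `model_*` names and the `unmap_count` field are hypothetical and exist only to make the effect visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*model_cb_t)(void *data, int status);

/* Hypothetical model of a command slot; not the driver's struct mtip_cmd. */
struct model_cmd {
	_Atomic(model_cb_t) callback;	/* non-NULL while completion is pending */
	void *data;			/* opaque argument for the callback */
	int unmap_count;		/* stands in for dma_unmap_sg() calls */
};

/*
 * Complete a command at most once. Whichever path (interrupt, timeout,
 * removal) swaps the callback pointer to NULL first owns the completion;
 * every other path sees NULL or loses the exchange and backs off.
 */
static bool model_complete_once(struct model_cmd *cmd, int status)
{
	model_cb_t func = atomic_load(&cmd->callback);

	if (func == NULL)
		return false;
	if (!atomic_compare_exchange_strong(&cmd->callback, &func,
					    (model_cb_t)NULL))
		return false;

	cmd->unmap_count++;		/* unmap before calling the upper layer */
	func(cmd->data, status);
	return true;
}

/* Example callback: counts how many times it was invoked. */
static void model_count_cb(void *data, int status)
{
	(void)status;
	(*(int *)data)++;
}
```

Calling `model_complete_once()` twice on the same command runs the callback and the modelled unmap only once. The second call returns false, which is the property the patch is after for the timeout and surprise-removal paths.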