linux-mmc.vger.kernel.org archive mirror
* [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency
@ 2011-04-06 19:07 Per Forlin
  2011-04-06 19:07 ` [PATCH v2 01/12] mmc: add none blocking mmc request function Per Forlin
                   ` (6 more replies)
  0 siblings, 7 replies; 58+ messages in thread
From: Per Forlin @ 2011-04-06 19:07 UTC (permalink / raw)
  To: linux-mmc, linux-arm-kernel, linux-kernel, linaro-dev
  Cc: Chris Ball, Per Forlin

How significant is the cache maintenance overhead?
It depends. eMMC devices are much faster now than they were a few years
ago, and cache maintenance costs more because of multiple cache levels and
speculative cache prefetch. In relative terms the cost of handling the
caches has increased, and it is now a bottleneck when fast eMMC is used
together with DMA.

The intention of introducing non-blocking mmc requests is to minimize the
time between the end of one mmc request and the start of the next. In the
current implementation the MMC controller is idle while dma_map_sg and
dma_unmap_sg are processing. Non-blocking mmc requests make it possible to
prepare the caches for the next job in parallel with an active
mmc request.

This is done by making issue_rw_rq() non-blocking.
The increase in throughput is proportional to the time it takes to
prepare a request (the major part of the preparation is dma_map_sg and
dma_unmap_sg) and to how fast the memory is. The faster the MMC/SD is,
the more significant the prepare time becomes. Measurements on U5500
and Panda with eMMC and SD show a significant performance gain for large
reads when running in DMA mode. In the PIO case the performance is unchanged.
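
The relationship between prepare time and throughput can be illustrated
with a simplified timing model. The sketch below is not part of the patch
set: the function names are hypothetical, and it assumes every request has
the same prepare, transfer and cleanup cost.

```c
#include <stdint.h>

/* Hypothetical helper: the larger of two durations in microseconds. */
static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/*
 * Blocking model: the controller is idle while the CPU prepares
 * (dma_map_sg) and cleans up (dma_unmap_sg) every request.
 */
static uint64_t blocking_time_us(uint64_t n, uint64_t prep_us,
				 uint64_t xfer_us, uint64_t post_us)
{
	return n * (prep_us + xfer_us + post_us);
}

/*
 * Non-blocking model: only the first prepare and the last cleanup are
 * exposed. In between, each step costs whichever is longer, the
 * transfer or the CPU work that runs in parallel with it.
 */
static uint64_t nonblocking_time_us(uint64_t n, uint64_t prep_us,
				    uint64_t xfer_us, uint64_t post_us)
{
	if (n == 0)
		return 0;
	return prep_us + (n - 1) * max_u64(xfer_us, prep_us + post_us) +
	       xfer_us + post_us;
}
```

In this model, as long as the transfer is longer than the CPU work, the
saving is (n - 1) * (prep_us + post_us). The saving stays constant while
the total shrinks as the card gets faster, so the relative gain grows.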

There are two optional hooks, pre_req() and post_req(), that the host driver
may implement in order to move work to before and after the actual mmc_request
function is called. In the DMA case pre_req() may run dma_map_sg() and prepare
the DMA descriptor, and post_req() runs dma_unmap_sg().
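
The call order of the two hooks can be sketched with a small user-space
mock. Everything below is hypothetical (struct mock_req, struct
mock_host_ops and the mock_* functions are not kernel types), and the mock
runs sequentially, so it shows only the ordering and not the actual overlap
with a hardware transfer.

```c
#include <stddef.h>

/* Hypothetical stand-in for a request; not the kernel's mmc_request. */
struct mock_req {
	int mapped;	/* 1 while the buffers count as "DMA mapped" */
	int done;	/* 1 once the transfer has run */
};

/* Hypothetical ops table modelled on the hooks described above. */
struct mock_host_ops {
	void (*pre_req)(struct mock_req *req);	/* optional, may be NULL */
	void (*request)(struct mock_req *req);	/* mandatory */
	void (*post_req)(struct mock_req *req);	/* optional, may be NULL */
};

static void mock_pre_req(struct mock_req *req)
{
	req->mapped = 1;		/* stands in for dma_map_sg() */
}

static void mock_request(struct mock_req *req)
{
	int mapped_here = 0;

	if (!req->mapped) {		/* nobody prepared it: map now */
		req->mapped = 1;
		mapped_here = 1;
	}
	req->done = 1;			/* stands in for the transfer */
	if (mapped_here)
		req->mapped = 0;	/* undo only our own mapping */
}

static void mock_post_req(struct mock_req *req)
{
	req->mapped = 0;		/* stands in for dma_unmap_sg() */
}

/*
 * Issue n requests. With an asynchronous request() the pre_req() of
 * the next request would overlap with the running transfer; this
 * sequential mock only reproduces the order of the calls.
 */
static void mock_run_queue(const struct mock_host_ops *ops,
			   struct mock_req *reqs, size_t n)
{
	size_t i;

	if (n > 0 && ops->pre_req)
		ops->pre_req(&reqs[0]);
	for (i = 0; i < n; i++) {
		ops->request(&reqs[i]);
		if (i + 1 < n && ops->pre_req)
			ops->pre_req(&reqs[i + 1]);
		if (ops->post_req)
			ops->post_req(&reqs[i]);
	}
}
```

A host that leaves both hooks NULL still works in this mock, because
mock_request() maps and unmaps by itself when nothing was prepared.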

Details on measurements from IOZone and mmc_test:
https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Specs/StoragePerfMMC-async-req

Changes since v1:
 * Add support for omap_hsmmc
 * Add test in mmc_test to compare performance with
   and without non-blocking requests.
 * Add random fault injection in mmc core to exercise error
   handling in the mmc block code.
 * Fix several issues in the mmc block error handling.
 * Add a host_cookie member in mmc_data to be used by
   pre_req to mark the data. The host driver will then
   check this mark to see if the data is prepared or not.
 * Previous patch subject was
   "add double buffering for mmc block requests".
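
The host_cookie idea from the list above can be sketched as follows. This
is a simplified user-space illustration under assumed names (struct
cookie_data, struct cookie_host and the cookie_* functions are
hypothetical), not the code from the patches.

```c
#include <limits.h>

/* Hypothetical, simplified stand-ins; not the kernel structures. */
struct cookie_data {
	int host_cookie;	/* 0 means "not prepared" */
	int map_calls;		/* how often this data was "mapped" */
};

struct cookie_host {
	int next_cookie;	/* cookie handed out by the last pre_req */
};

static void cookie_map(struct cookie_data *data)
{
	data->map_calls++;	/* stands in for dma_map_sg() */
}

/* pre_req: map ahead of time and mark the data as prepared. */
static void cookie_pre_req(struct cookie_host *host,
			   struct cookie_data *data)
{
	cookie_map(data);
	/* Advance the cookie, wrapping back to 1 so it is never 0. */
	if (host->next_cookie < 1 || host->next_cookie == INT_MAX)
		host->next_cookie = 1;
	else
		host->next_cookie++;
	data->host_cookie = host->next_cookie;
}

/*
 * Request path: map only if pre_req did not already do so.
 * Returns 1 if the mapping happened here, 0 if it was prepared.
 */
static int cookie_start_transfer(struct cookie_host *host,
				 struct cookie_data *data)
{
	if (data->host_cookie != 0 &&
	    data->host_cookie == host->next_cookie)
		return 0;	/* already prepared by pre_req */
	data->host_cookie = 0;	/* stale or missing mark */
	cookie_map(data);
	return 1;
}
```

The point of the mark is that the data is mapped exactly once whichever
path is taken: prepared data skips the mapping in the request path, and
unprepared data is mapped there instead.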

Per Forlin (12):
  mmc: add none blocking mmc request function
  mmc: mmc_test: add debugfs file to list all tests
  mmc: mmc_test: add test for none blocking transfers
  mmc: add member in mmc queue struct to hold request data
  mmc: add a block request prepare function
  mmc: move error code in mmc_block_issue_rw_rq to a separate function.
  mmc: add a second mmc queue request member
  mmc: add handling for two parallel block requests in issue_rw_rq
  mmc: test: add random fault injection in core.c
  omap_hsmmc: use original sg_len for dma_unmap_sg
  omap_hsmmc: add support for pre_req and post_req
  mmci: implement pre_req() and post_req()

 drivers/mmc/card/block.c      |  493 +++++++++++++++++++++++++++--------------
 drivers/mmc/card/mmc_test.c   |  342 ++++++++++++++++++++++++++++-
 drivers/mmc/card/queue.c      |  171 +++++++++------
 drivers/mmc/card/queue.h      |   31 ++-
 drivers/mmc/core/core.c       |  132 ++++++++++-
 drivers/mmc/core/debugfs.c    |    5 +
 drivers/mmc/host/mmci.c       |  146 +++++++++++-
 drivers/mmc/host/mmci.h       |    8 +
 drivers/mmc/host/omap_hsmmc.c |   90 +++++++-
 include/linux/mmc/core.h      |    9 +-
 include/linux/mmc/host.h      |   13 +-
 lib/Kconfig.debug             |   11 +
 12 files changed, 1172 insertions(+), 279 deletions(-)

-- 
1.7.4.1


* [PATCH] mmc: sdhci: add support for pre_req and post_req
@ 2013-09-06  6:01 Chanho Min
  2013-09-06  8:05 ` Jaehoon Chung
  0 siblings, 1 reply; 58+ messages in thread
From: Chanho Min @ 2013-09-06  6:01 UTC (permalink / raw)
  To: Chris Ball, Per Forlin
  Cc: HyoJun Im, linux-mmc, linux-kernel, Gunho Lee, Chanho Min

This patch adds support for the non-blocking mmc request function to the
sdhci driver.
(commit: aa8b683a7d392271ed349c6ab9f36b8c313794b7)

pre_req() runs dma_map_sg() and post_req() runs dma_unmap_sg().  If
pre_req() is not called before sdhci_request(), dma_map_sg() is issued
just before the transfer starts.  Using pre_req() is optional, but if
pre_req() is issued, post_req() must be called as well.

benchmark results:
 ARM CA9 1GHz, UHS DDR50 mode

 Before:
 dd if=/dev/mmcblk0p15 of=/dev/null bs=64k count=1024
 67108864 bytes (64.0MB) copied, 1.188846 seconds, 53.8MB/s

 After:
 dd if=/dev/mmcblk0p15 of=/dev/null bs=64k count=1024
 67108864 bytes (64.0MB) copied, 0.993098 seconds, 64.4MB/s
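
The gain implied by the two dd runs above can be checked with a few lines
of arithmetic. The helper names below are hypothetical and exist only for
this calculation.

```c
/* Throughput in MiB/s for a transfer of `bytes` in `seconds`. */
static double mib_per_sec(double bytes, double seconds)
{
	return bytes / (1024.0 * 1024.0) / seconds;
}

/* Relative throughput gain for the same amount of data, in percent. */
static double gain_percent(double before_s, double after_s)
{
	return (before_s / after_s - 1.0) * 100.0;
}
```

With 67108864 bytes, 1.188846 s before and 0.993098 s after, this gives
about 53.8 MiB/s and 64.4 MiB/s, a gain of roughly 19.7%. The "MB"
printed by dd here are binary megabytes.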

Signed-off-by: Chanho Min <chanho.min@lge.com>
---
 drivers/mmc/host/sdhci.c  |   96 +++++++++++++++++++++++++++++++++++++++------
 include/linux/mmc/sdhci.h |    6 +++
 2 files changed, 90 insertions(+), 12 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 2ea429c..0465a9a 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -465,6 +465,42 @@ static void sdhci_set_adma_desc(u8 *desc, u32 addr, int len, unsigned cmd)
 	dataddr[0] = cpu_to_le32(addr);
 }
 
+static int sdhci_pre_dma_transfer(struct sdhci_host *host,
+				       struct mmc_data *data,
+				       struct sdhci_next *next)
+{
+	int sg_count = 0;
+
+	if (!next && data->host_cookie &&
+	    data->host_cookie != host->next_data.cookie) {
+		pr_warn("[%s] invalid cookie: data->host_cookie %d"
+			" host->next_data.cookie %d\n",
+			__func__, data->host_cookie, host->next_data.cookie);
+		data->host_cookie = 0;
+	}
+
+	/* Check if next job is already prepared */
+	if (next ||
+	    (!next && data->host_cookie != host->next_data.cookie)) {
+		sg_count = dma_map_sg(mmc_dev(host->mmc), data->sg,
+					data->sg_len,
+					(data->flags & MMC_DATA_READ) ?
+						DMA_FROM_DEVICE :
+						DMA_TO_DEVICE);
+	} else {
+		sg_count = host->next_data.sg_count;
+		host->next_data.sg_count = 0;
+	}
+
+	if (next) {
+		next->sg_count = sg_count;
+		data->host_cookie = ++next->cookie < 0 ? 1 : next->cookie;
+	} else
+		host->sg_count = sg_count;
+
+	return sg_count;
+}
+
 static int sdhci_adma_table_pre(struct sdhci_host *host,
 	struct mmc_data *data)
 {
@@ -502,8 +538,8 @@ static int sdhci_adma_table_pre(struct sdhci_host *host,
 		goto fail;
 	BUG_ON(host->align_addr & 0x3);
 
-	host->sg_count = dma_map_sg(mmc_dev(host->mmc),
-		data->sg, data->sg_len, direction);
+	host->sg_count = sdhci_pre_dma_transfer(host, data, NULL);
+
 	if (host->sg_count == 0)
 		goto unmap_align;
 
@@ -643,9 +679,10 @@ static void sdhci_adma_table_post(struct sdhci_host *host,
 			}
 		}
 	}
-
-	dma_unmap_sg(mmc_dev(host->mmc), data->sg,
-		data->sg_len, direction);
+	if (!data->host_cookie) {
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+			data->sg_len, direction);
+	}
 }
 
 static u8 sdhci_calc_timeout(struct sdhci_host *host, struct mmc_command *cmd)
@@ -824,12 +861,8 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_command *cmd)
 			}
 		} else {
 			int sg_cnt;
+			sg_cnt = sdhci_pre_dma_transfer(host, data, NULL);
 
-			sg_cnt = dma_map_sg(mmc_dev(host->mmc),
-					data->sg, data->sg_len,
-					(data->flags & MMC_DATA_READ) ?
-						DMA_FROM_DEVICE :
-						DMA_TO_DEVICE);
 			if (sg_cnt == 0) {
 				/*
 				 * This only happens when someone fed
@@ -928,9 +961,12 @@ static void sdhci_finish_data(struct sdhci_host *host)
 		if (host->flags & SDHCI_USE_ADMA)
 			sdhci_adma_table_post(host, data);
 		else {
-			dma_unmap_sg(mmc_dev(host->mmc), data->sg,
-				data->sg_len, (data->flags & MMC_DATA_READ) ?
+			if (!data->host_cookie) {
+				dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+					data->sg_len,
+					(data->flags & MMC_DATA_READ) ?
 					DMA_FROM_DEVICE : DMA_TO_DEVICE);
+			}
 		}
 	}
 
@@ -2066,8 +2102,42 @@ static void sdhci_card_event(struct mmc_host *mmc)
 	spin_unlock_irqrestore(&host->lock, flags);
 }
 
+static void sdhci_post_req(struct mmc_host *mmc, struct mmc_request *mrq,
+				int err)
+{
+	struct sdhci_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+
+	if (host->flags & SDHCI_REQ_USE_DMA) {
+		dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
+			(data->flags & MMC_DATA_READ) ?
+			DMA_FROM_DEVICE :
+			DMA_TO_DEVICE);
+		data->host_cookie = 0;
+	}
+}
+
+static void sdhci_pre_req(struct mmc_host *mmc, struct mmc_request *mrq,
+			       bool is_first_req)
+{
+	struct sdhci_host *host = mmc_priv(mmc);
+	struct mmc_data *data = mrq->data;
+
+	if (mrq->data->host_cookie) {
+		mrq->data->host_cookie = 0;
+		return;
+	}
+
+	if (host->flags & SDHCI_REQ_USE_DMA) {
+		if (!sdhci_pre_dma_transfer(host, data, &host->next_data))
+			mrq->data->host_cookie = 0;
+	}
+}
+
 static const struct mmc_host_ops sdhci_ops = {
 	.request	= sdhci_request,
+	.post_req	= sdhci_post_req,
+	.pre_req	= sdhci_pre_req,
 	.set_ios	= sdhci_set_ios,
 	.get_cd		= sdhci_get_cd,
 	.get_ro		= sdhci_get_ro,
@@ -3204,6 +3274,8 @@ int sdhci_add_host(struct sdhci_host *host)
 	}
 #endif
 
+	host->next_data.cookie = 1;
+
 	mmiowb();
 
 	mmc_add_host(mmc);
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index b838ffc..220a515 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -17,6 +17,11 @@
 #include <linux/io.h>
 #include <linux/mmc/host.h>
 
+struct sdhci_next {
+	unsigned int	sg_count;
+	s32		cookie;
+};
+
 struct sdhci_host {
 	/* Data set by hardware interface driver */
 	const char *hw_name;	/* Hardware bus name */
@@ -177,5 +182,6 @@ struct sdhci_host {
 	struct timer_list	tuning_timer;	/* Timer for tuning */
 
+	struct sdhci_next	next_data;
 	unsigned long private[0] ____cacheline_aligned;
 };
 #endif /* LINUX_MMC_SDHCI_H */
-- 
1.7.9.5


Thread overview: 58+ messages
2011-04-06 19:07 [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency Per Forlin
2011-04-06 19:07 ` [PATCH v2 01/12] mmc: add none blocking mmc request function Per Forlin
2011-04-15 10:34   ` David Vrabel
2011-04-20  7:17     ` Per Forlin
2011-04-26 13:29       ` David Vrabel
     [not found]         ` <4DB6C89F.10903-kQvG35nSl+M@public.gmane.org>
2011-04-26 14:22           ` Per Forlin
     [not found] ` <1302116833-24540-1-git-send-email-per.forlin-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2011-04-06 19:07   ` [PATCH v2 02/12] mmc: mmc_test: add debugfs file to list all tests Per Forlin
2011-04-06 19:07   ` [PATCH v2 03/12] mmc: mmc_test: add test for none blocking transfers Per Forlin
2011-04-17  7:09     ` Lin Tony-B19295
2011-04-20  7:30       ` Per Forlin
     [not found]     ` <1302116833-24540-4-git-send-email-per.forlin-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2011-04-17 15:46       ` Shawn Guo
2011-04-20  7:41         ` Per Forlin
2011-04-06 19:07   ` [PATCH v2 04/12] mmc: add member in mmc queue struct to hold request data Per Forlin
2011-04-06 19:07   ` [PATCH v2 05/12] mmc: add a block request prepare function Per Forlin
2011-04-06 19:07   ` [PATCH v2 06/12] mmc: move error code in mmc_block_issue_rw_rq to a separate function Per Forlin
2011-04-06 19:07   ` [PATCH v2 07/12] mmc: add a second mmc queue request member Per Forlin
2011-04-06 19:07   ` [PATCH v2 08/12] mmc: add handling for two parallel block requests in issue_rw_rq Per Forlin
2011-04-20 11:32     ` Per Forlin
2011-04-06 19:07   ` [PATCH v2 09/12] mmc: test: add random fault injection in core.c Per Forlin
2011-04-06 19:07   ` [PATCH v2 10/12] omap_hsmmc: use original sg_len for dma_unmap_sg Per Forlin
2011-04-06 19:07   ` [PATCH v2 11/12] omap_hsmmc: add support for pre_req and post_req Per Forlin
2011-04-06 19:07 ` [PATCH v2 12/12] mmci: implement pre_req() and post_req() Per Forlin
2011-04-08 16:49 ` [PATCH v2 00/12] mmc: use nonblock mmc requests to minimize latency Linus Walleij
2011-04-09 11:55   ` Jae hoon Chung
     [not found]     ` <BANLkTikVeXvfSBS-xLDXVdesKJpKdtUVqg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-04-10  3:33       ` anish singh
2011-04-11  9:03         ` Per Forlin
     [not found]           ` <BANLkTikoj6UTx08ntZaMM15taKRXjrU_Mg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-04-11  9:07             ` Sachin Nikam
2011-04-11  9:08     ` Per Forlin
2011-04-19 14:30       ` Jae hoon Chung
2011-04-16 15:48 ` Shawn Guo
2011-04-20  8:19   ` Per Forlin
2011-04-16 16:48 ` [PATCH] mmc: sdhci: add support for pre_req and post_req Shawn Guo
2011-04-16 23:06   ` Andrei Warkentin
2011-04-22 11:01     ` Jaehoon Chung
2011-04-27  0:59       ` Andrei Warkentin
2011-04-26  1:26     ` Jaehoon Chung
2011-04-26  2:47       ` Shawn Guo
2011-04-26 10:21         ` Per Forlin
2011-04-17 16:33 ` [PATCH] mmc: mxs-mmc: " Shawn Guo
2011-04-17 16:48   ` Shawn Guo
2011-04-20  8:01     ` Per Forlin
2011-04-20 14:01       ` Shawn Guo
2011-04-20 15:22         ` Per Forlin
2011-04-21  6:25           ` Shawn Guo
2011-04-21  8:52             ` Per Forlin
2011-04-20 15:30         ` Per Forlin
2011-04-21  6:29           ` Shawn Guo
2011-04-21  8:46             ` Per Forlin
2011-04-21  9:11               ` Shawn Guo
2011-04-21  9:47                 ` Per Forlin
2011-04-21 10:15                   ` Per Forlin
2011-04-28  7:52                   ` Per Forlin
2011-04-28 10:10                     ` Russell King - ARM Linux
2011-04-20  7:58   ` Per Forlin
2011-04-20  8:17     ` Shawn Guo
2011-04-20 13:51   ` [PATCH v2] " Shawn Guo
  -- strict thread matches above, loose matches on Subject: below --
2013-09-06  6:01 [PATCH] mmc: sdhci: " Chanho Min
2013-09-06  8:05 ` Jaehoon Chung
