* [PATCH V3 0/2] mmc: hsq: dynamically adjust hsq_depth to improve performance
@ 2023-08-29 2:04 Wenchao Chen
2023-08-29 2:04 ` [PATCH V3 1/2] mmc: queue: replace immediate with hsq->depth Wenchao Chen
2023-08-29 2:04 ` [PATCH V3 2/2] mmc: hsq: dynamic adjustment of hsq->depth Wenchao Chen
0 siblings, 2 replies; 6+ messages in thread
From: Wenchao Chen @ 2023-08-29 2:04 UTC
To: ulf.hansson
Cc: linux-mmc, linux-kernel, wenchao.chen666, zhenxiong.lai,
yuelin.tang, Wenchao Chen
Change in v3:
- Use "mrq->data->blksz * mrq->data->blocks == 4096" for 4K.
- Add explanation for "HSQ_PERFORMANCE_DEPTH".
Change in v2:
- Support for dynamic adjustment of hsq_depth.
Test
=====
Each case was run 3 times and the average speed is reported.
Ran 'fio' to evaluate the performance (sequential cases use bs=512K and
random cases use bs=4K, see "Test cmd" below):
1. Fixed hsq_depth
1) Sequential write:
Speed: 168 164 165
Average speed: 165.67MB/S
2) Sequential read:
Speed: 326 326 326
Average speed: 326MB/S
3) Random write:
Speed: 82.6 83 83
Average speed: 82.87MB/S
4) Random read:
Speed: 48.2 48.3 47.6
Average speed: 48.03MB/S
2. Dynamic hsq_depth
1) Sequential write:
Speed: 167 166 166
Average speed: 166.33MB/S
2) Sequential read:
Speed: 327 326 326
Average speed: 326.3MB/S
3) Random write:
Speed: 86.1 86.2 87.7
Average speed: 86.67MB/S
4) Random read:
Speed: 48.1 48 48
Average speed: 48.03MB/S
Based on the above data, dynamic hsq_depth improves 4K random write
performance by about 4.6% (82.87MB/S -> 86.67MB/S). Sequential read/write
and random read are essentially unchanged.
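As a cross-check of the figure above, the averages and the relative gain can be recomputed from the three random-write samples. This is only an illustrative userspace sketch; the helper names mean() and improvement_pct() are made up for this note and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of n throughput samples (MB/s). Hypothetical helper, not kernel code. */
static double mean(const double *samples, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += samples[i];
	return sum / (double)n;
}

/* Relative change from baseline to candidate, in percent. */
static double improvement_pct(double baseline, double candidate)
{
	assert(baseline > 0.0);
	return (candidate - baseline) / baseline * 100.0;
}
```

Feeding in 82.6/83/83 for the fixed depth and 86.1/86.2/87.7 for the dynamic depth gives averages of about 82.87 and 86.67 MB/s, which is a gain of roughly 4.6%.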
In addition, we tested random read/write with 8K and 16K block sizes.
1. Fixed hsq_depth
1) Random write(bs=8K):
Speed: 116 114 115
Average speed: 115MB/S
2) Random read(bs=8K):
Speed: 83 83 82.5
Average speed: 82.8MB/S
3) Random write(bs=16K):
Speed: 141 142 141
Average speed: 141.3MB/S
4) Random read(bs=16K):
Speed: 132 132 132
Average speed: 132MB/S
2. Dynamic hsq_depth (with the size check set to mrq->data->blksz * mrq->data->blocks == 8192 or 16384)
1) Random write(bs=8K):
Speed: 115 115 115
Average speed: 115MB/S
2) Random read(bs=8K):
Speed: 82.7 82.9 82.8
Average speed: 82.8MB/S
3) Random write(bs=16K):
Speed: 143 141 141
Average speed: 141.7MB/S
4) Random read(bs=16K):
Speed: 132 132 132
Average speed: 132MB/S
Increasing hsq_depth does not improve 8K or 16K random read/write performance.
To keep latency low, hsq_depth is therefore increased dynamically only for
4K random writes.
Test cmd
=========
1)write: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=write -bs=512K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
2)read: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=read -bs=512K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
3)randwrite: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randwrite -bs=4K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
4)randread: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randread -bs=4K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
5)randwrite: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randwrite -bs=8K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
6)randread: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randread -bs=8K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
7)randwrite: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randwrite -bs=16K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
8)randread: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randread -bs=16K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
Wenchao Chen (2):
mmc: queue: replace immediate with hsq->depth
mmc: hsq: dynamic adjustment of hsq->depth
drivers/mmc/core/queue.c | 6 +-----
drivers/mmc/host/mmc_hsq.c | 28 ++++++++++++++++++++++++++++
drivers/mmc/host/mmc_hsq.h | 11 +++++++++++
include/linux/mmc/host.h | 1 +
4 files changed, 41 insertions(+), 5 deletions(-)
--
2.17.1
* [PATCH V3 1/2] mmc: queue: replace immediate with hsq->depth
2023-08-29 2:04 [PATCH V3 0/2] mmc: hsq: dynamically adjust hsq_depth to improve performance Wenchao Chen
@ 2023-08-29 2:04 ` Wenchao Chen
2023-08-29 2:04 ` [PATCH V3 2/2] mmc: hsq: dynamic adjustment of hsq->depth Wenchao Chen
1 sibling, 0 replies; 6+ messages in thread
From: Wenchao Chen @ 2023-08-29 2:04 UTC
To: ulf.hansson
Cc: linux-mmc, linux-kernel, wenchao.chen666, zhenxiong.lai,
yuelin.tang, Wenchao Chen
HSQ is similar to CQE. Replace the hard-coded limit of 2 in-flight requests
in mmc_mq_queue_rq() with host->hsq_depth, which represents the maximum
processing capacity of HSQ.
Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com>
---
drivers/mmc/core/queue.c | 6 +-----
drivers/mmc/host/mmc_hsq.c | 1 +
drivers/mmc/host/mmc_hsq.h | 6 ++++++
include/linux/mmc/host.h | 1 +
4 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index b396e3900717..a0a2412f62a7 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -260,11 +260,7 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
}
break;
case MMC_ISSUE_ASYNC:
- /*
- * For MMC host software queue, we only allow 2 requests in
- * flight to avoid a long latency.
- */
- if (host->hsq_enabled && mq->in_flight[issue_type] > 2) {
+ if (host->hsq_enabled && mq->in_flight[issue_type] > host->hsq_depth) {
spin_unlock_irq(&mq->lock);
return BLK_STS_RESOURCE;
}
diff --git a/drivers/mmc/host/mmc_hsq.c b/drivers/mmc/host/mmc_hsq.c
index 424dc7b07858..8556cacb21a1 100644
--- a/drivers/mmc/host/mmc_hsq.c
+++ b/drivers/mmc/host/mmc_hsq.c
@@ -337,6 +337,7 @@ int mmc_hsq_init(struct mmc_hsq *hsq, struct mmc_host *mmc)
hsq->mmc = mmc;
hsq->mmc->cqe_private = hsq;
mmc->cqe_ops = &mmc_hsq_ops;
+ mmc->hsq_depth = HSQ_NORMAL_DEPTH;
for (i = 0; i < HSQ_NUM_SLOTS; i++)
hsq->tag_slot[i] = HSQ_INVALID_TAG;
diff --git a/drivers/mmc/host/mmc_hsq.h b/drivers/mmc/host/mmc_hsq.h
index 1808024fc6c5..aa5c4543b55f 100644
--- a/drivers/mmc/host/mmc_hsq.h
+++ b/drivers/mmc/host/mmc_hsq.h
@@ -5,6 +5,12 @@
#define HSQ_NUM_SLOTS 64
#define HSQ_INVALID_TAG HSQ_NUM_SLOTS
+/*
+ * For MMC host software queue, we only allow 2 requests in
+ * flight to avoid a long latency.
+ */
+#define HSQ_NORMAL_DEPTH 2
+
struct hsq_slot {
struct mmc_request *mrq;
};
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 461d1543893b..1fd8b1dd8698 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -520,6 +520,7 @@ struct mmc_host {
/* Host Software Queue support */
bool hsq_enabled;
+ int hsq_depth;
u32 err_stats[MMC_ERR_MAX];
unsigned long private[] ____cacheline_aligned;
--
2.17.1
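For readers following along, the admission check that patch 1/2 changes in mmc_mq_queue_rq() can be modelled in userspace roughly as below. This is a sketch under assumptions: struct host_model and hsq_should_defer() are hypothetical names standing in for the two struct mmc_host fields and the inline condition; only HSQ_NORMAL_DEPTH and its value come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define HSQ_NORMAL_DEPTH 2	/* value from the patch */

/* Hypothetical stand-in for the struct mmc_host fields used by the check. */
struct host_model {
	bool hsq_enabled;
	int hsq_depth;
};

/*
 * Models the condition in mmc_mq_queue_rq(): a true result corresponds to
 * the request being pushed back with BLK_STS_RESOURCE.
 */
static bool hsq_should_defer(const struct host_model *host, int in_flight)
{
	assert(in_flight >= 0);
	return host->hsq_enabled && in_flight > host->hsq_depth;
}
```

The comparison is strict, so with the default depth of 2 a request is deferred only once the in-flight counter has already reached 3, which matches the behaviour of the old hard-coded "> 2" test.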
* [PATCH V3 2/2] mmc: hsq: dynamic adjustment of hsq->depth
2023-08-29 2:04 [PATCH V3 0/2] mmc: hsq: dynamically adjust hsq_depth to improve performance Wenchao Chen
2023-08-29 2:04 ` [PATCH V3 1/2] mmc: queue: replace immediate with hsq->depth Wenchao Chen
@ 2023-08-29 2:04 ` Wenchao Chen
2023-09-14 12:57 ` Ulf Hansson
1 sibling, 1 reply; 6+ messages in thread
From: Wenchao Chen @ 2023-08-29 2:04 UTC
To: ulf.hansson
Cc: linux-mmc, linux-kernel, wenchao.chen666, zhenxiong.lai,
yuelin.tang, Wenchao Chen
Increasing hsq_depth improves random write performance.
Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com>
---
drivers/mmc/host/mmc_hsq.c | 27 +++++++++++++++++++++++++++
drivers/mmc/host/mmc_hsq.h | 5 +++++
2 files changed, 32 insertions(+)
diff --git a/drivers/mmc/host/mmc_hsq.c b/drivers/mmc/host/mmc_hsq.c
index 8556cacb21a1..0984c39108ba 100644
--- a/drivers/mmc/host/mmc_hsq.c
+++ b/drivers/mmc/host/mmc_hsq.c
@@ -21,6 +21,31 @@ static void mmc_hsq_retry_handler(struct work_struct *work)
mmc->ops->request(mmc, hsq->mrq);
}
+static void mmc_hsq_modify_threshold(struct mmc_hsq *hsq)
+{
+ struct mmc_host *mmc = hsq->mmc;
+ struct mmc_request *mrq;
+ struct hsq_slot *slot;
+ int need_change = 0;
+ int tag;
+
+ for (tag = 0; tag < HSQ_NUM_SLOTS; tag++) {
+ slot = &hsq->slot[tag];
+ mrq = slot->mrq;
+ if (mrq && mrq->data &&
+ (mrq->data->blksz * mrq->data->blocks == 4096) &&
+ (mrq->data->flags & MMC_DATA_WRITE))
+ need_change++;
+ else
+ break;
+ }
+
+ if (need_change > 1)
+ mmc->hsq_depth = HSQ_PERFORMANCE_DEPTH;
+ else
+ mmc->hsq_depth = HSQ_NORMAL_DEPTH;
+}
+
static void mmc_hsq_pump_requests(struct mmc_hsq *hsq)
{
struct mmc_host *mmc = hsq->mmc;
@@ -42,6 +67,8 @@ static void mmc_hsq_pump_requests(struct mmc_hsq *hsq)
return;
}
+ mmc_hsq_modify_threshold(hsq);
+
slot = &hsq->slot[hsq->next_tag];
hsq->mrq = slot->mrq;
hsq->qcnt--;
diff --git a/drivers/mmc/host/mmc_hsq.h b/drivers/mmc/host/mmc_hsq.h
index aa5c4543b55f..dd352a6ac32a 100644
--- a/drivers/mmc/host/mmc_hsq.h
+++ b/drivers/mmc/host/mmc_hsq.h
@@ -10,6 +10,11 @@
* flight to avoid a long latency.
*/
#define HSQ_NORMAL_DEPTH 2
+/*
+ * For 4k random writes, we allow hsq_depth to increase to 5
+ * for better performance.
+ */
+#define HSQ_PERFORMANCE_DEPTH 5
struct hsq_slot {
struct mmc_request *mrq;
--
2.17.1
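The counting logic of mmc_hsq_modify_threshold() from patch 2/2 can be tried outside the kernel with a simplified model. The sketch below assumes a hypothetical struct req_model in place of struct mmc_request and a helper hsq_pick_depth() that returns the depth instead of writing mmc->hsq_depth; the three constants are taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HSQ_NUM_SLOTS		64
#define HSQ_NORMAL_DEPTH	2
#define HSQ_PERFORMANCE_DEPTH	5

/* Hypothetical, simplified stand-in for a queued mmc_request. */
struct req_model {
	bool queued;		/* models slot->mrq and mrq->data being non-NULL */
	unsigned int blksz;
	unsigned int blocks;
	bool is_write;		/* models data->flags & MMC_DATA_WRITE */
};

static bool is_4k_write(const struct req_model *r)
{
	return r->queued && r->blksz * r->blocks == 4096 && r->is_write;
}

/* Same counting as in the patch: scan from tag 0, stop at the first miss. */
static int hsq_pick_depth(const struct req_model *slots)
{
	int need_change = 0;
	int tag;

	assert(slots != NULL);
	for (tag = 0; tag < HSQ_NUM_SLOTS; tag++) {
		if (!is_4k_write(&slots[tag]))
			break;	/* first slot that is not a 4K write ends the scan */
		need_change++;
	}
	return need_change > 1 ? HSQ_PERFORMANCE_DEPTH : HSQ_NORMAL_DEPTH;
}
```

In this model the depth is raised only when slots 0 and 1 both hold 4K writes; an empty slot 0, a read, or an 8K request in either of them keeps the depth at HSQ_NORMAL_DEPTH.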
* Re: [PATCH V3 2/2] mmc: hsq: dynamic adjustment of hsq->depth
2023-08-29 2:04 ` [PATCH V3 2/2] mmc: hsq: dynamic adjustment of hsq->depth Wenchao Chen
@ 2023-09-14 12:57 ` Ulf Hansson
2023-09-15 11:38 ` Wenchao Chen
0 siblings, 1 reply; 6+ messages in thread
From: Ulf Hansson @ 2023-09-14 12:57 UTC
To: Wenchao Chen
Cc: linux-mmc, linux-kernel, wenchao.chen666, zhenxiong.lai,
yuelin.tang
On Tue, 29 Aug 2023 at 04:05, Wenchao Chen <wenchao.chen@unisoc.com> wrote:
>
> Increasing hsq_depth improves random write performance.
>
> Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com>
> ---
> drivers/mmc/host/mmc_hsq.c | 27 +++++++++++++++++++++++++++
> drivers/mmc/host/mmc_hsq.h | 5 +++++
> 2 files changed, 32 insertions(+)
>
> diff --git a/drivers/mmc/host/mmc_hsq.c b/drivers/mmc/host/mmc_hsq.c
> index 8556cacb21a1..0984c39108ba 100644
> --- a/drivers/mmc/host/mmc_hsq.c
> +++ b/drivers/mmc/host/mmc_hsq.c
> @@ -21,6 +21,31 @@ static void mmc_hsq_retry_handler(struct work_struct *work)
> mmc->ops->request(mmc, hsq->mrq);
> }
>
> +static void mmc_hsq_modify_threshold(struct mmc_hsq *hsq)
> +{
> + struct mmc_host *mmc = hsq->mmc;
> + struct mmc_request *mrq;
> + struct hsq_slot *slot;
> + int need_change = 0;
Rather than using a variable to keep track of this, why not just do
the below here?
mmc->hsq_depth = HSQ_NORMAL_DEPTH;
> + int tag;
> +
> + for (tag = 0; tag < HSQ_NUM_SLOTS; tag++) {
> + slot = &hsq->slot[tag];
> + mrq = slot->mrq;
> + if (mrq && mrq->data &&
> + (mrq->data->blksz * mrq->data->blocks == 4096) &&
> + (mrq->data->flags & MMC_DATA_WRITE))
> + need_change++;
And following above, then we can do the below here:
mmc->hsq_depth = HSQ_PERFORMANCE_DEPTH;
break;
That should simplify and make this more efficient too, right?
> + else
> + break;
> + }
> +
> + if (need_change > 1)
> + mmc->hsq_depth = HSQ_PERFORMANCE_DEPTH;
> + else
> + mmc->hsq_depth = HSQ_NORMAL_DEPTH;
> +}
> +
[...]
Kind regards
Uffe
* Re: [PATCH V3 2/2] mmc: hsq: dynamic adjustment of hsq->depth
2023-09-14 12:57 ` Ulf Hansson
@ 2023-09-15 11:38 ` Wenchao Chen
0 siblings, 0 replies; 6+ messages in thread
From: Wenchao Chen @ 2023-09-15 11:38 UTC
To: Ulf Hansson
Cc: Wenchao Chen, linux-mmc, linux-kernel, zhenxiong.lai, yuelin.tang
On Thu, 14 Sept 2023 at 20:57, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Tue, 29 Aug 2023 at 04:05, Wenchao Chen <wenchao.chen@unisoc.com> wrote:
> >
> > Increasing hsq_depth improves random write performance.
> >
> > Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com>
> > ---
> > drivers/mmc/host/mmc_hsq.c | 27 +++++++++++++++++++++++++++
> > drivers/mmc/host/mmc_hsq.h | 5 +++++
> > 2 files changed, 32 insertions(+)
> >
> > diff --git a/drivers/mmc/host/mmc_hsq.c b/drivers/mmc/host/mmc_hsq.c
> > index 8556cacb21a1..0984c39108ba 100644
> > --- a/drivers/mmc/host/mmc_hsq.c
> > +++ b/drivers/mmc/host/mmc_hsq.c
> > @@ -21,6 +21,31 @@ static void mmc_hsq_retry_handler(struct work_struct *work)
> > mmc->ops->request(mmc, hsq->mrq);
> > }
> >
> > +static void mmc_hsq_modify_threshold(struct mmc_hsq *hsq)
> > +{
> > + struct mmc_host *mmc = hsq->mmc;
> > + struct mmc_request *mrq;
> > + struct hsq_slot *slot;
> > + int need_change = 0;
>
> Rather than using a variable to keep track of this, why not just do
> the below here?
>
> mmc->hsq_depth = HSQ_NORMAL_DEPTH;
>
> > + int tag;
> > +
> > + for (tag = 0; tag < HSQ_NUM_SLOTS; tag++) {
> > + slot = &hsq->slot[tag];
> > + mrq = slot->mrq;
> > + if (mrq && mrq->data &&
> > + (mrq->data->blksz * mrq->data->blocks == 4096) &&
> > + (mrq->data->flags & MMC_DATA_WRITE))
> > + need_change++;
>
> And following above, then we can do the below here:
> mmc->hsq_depth = HSQ_PERFORMANCE_DEPTH;
> break;
>
> That should simplify and make this more efficient too, right?
>
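Yes, you are right. However, more requests should only be allowed once
need_change reaches 2, i.e. when at least two 4K write requests are queued.
Alternatively, the final if/else can be simplified like this:
mmc->hsq_depth = (need_change > 1) ? HSQ_PERFORMANCE_DEPTH : HSQ_NORMAL_DEPTH;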
> > + else
> > + break;
> > + }
> > +
> > + if (need_change > 1)
> > + mmc->hsq_depth = HSQ_PERFORMANCE_DEPTH;
> > + else
> > + mmc->hsq_depth = HSQ_NORMAL_DEPTH;
> > +}
> > +
>
> [...]
>
> Kind regards
> Uffe
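The difference between the two proposals in this sub-thread can be made concrete with a small userspace model. This is one literal reading of each suggestion, not code from either mail: depth_early_exit() and depth_counted() are hypothetical names, and the queue is reduced to an array of flags where true means "this slot holds a 4K write".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HSQ_NUM_SLOTS		64
#define HSQ_NORMAL_DEPTH	2
#define HSQ_PERFORMANCE_DEPTH	5

/*
 * Early-exit reading of the review suggestion: default to the normal depth
 * and raise it as soon as one 4K write is found. With the existing
 * "else break" kept, the decision is made on slot 0 alone.
 */
static int depth_early_exit(const bool *is_4k_write)
{
	int depth = HSQ_NORMAL_DEPTH;
	int tag;

	assert(is_4k_write != NULL);
	for (tag = 0; tag < HSQ_NUM_SLOTS; tag++) {
		if (!is_4k_write[tag])
			break;
		depth = HSQ_PERFORMANCE_DEPTH;
		break;
	}
	return depth;
}

/* Counted variant from the reply: require more than one leading 4K write. */
static int depth_counted(const bool *is_4k_write)
{
	int need_change = 0;
	int tag;

	assert(is_4k_write != NULL);
	for (tag = 0; tag < HSQ_NUM_SLOTS; tag++) {
		if (!is_4k_write[tag])
			break;
		need_change++;
	}
	return (need_change > 1) ? HSQ_PERFORMANCE_DEPTH : HSQ_NORMAL_DEPTH;
}
```

Under these assumptions the two variants disagree in exactly one situation: a single 4K write in slot 0 followed by something else. The early-exit form already switches to HSQ_PERFORMANCE_DEPTH there, while the counted form stays at HSQ_NORMAL_DEPTH, which appears to be the point of the "need_change = 2" remark.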
Thread overview: 6+ messages
2023-08-29 2:04 [PATCH V3 0/2] mmc: hsq: dynamically adjust hsq_depth to improve performance Wenchao Chen
2023-08-29 2:04 ` [PATCH V3 1/2] mmc: queue: replace immediate with hsq->depth Wenchao Chen
2023-08-29 2:04 ` [PATCH V3 2/2] mmc: hsq: dynamic adjustment of hsq->depth Wenchao Chen
2023-09-14 12:57 ` Ulf Hansson
2023-09-15 11:38 ` Wenchao Chen
-- strict thread matches above, loose matches on Subject: below --
2023-09-05 2:39 [PATCH V3 0/2 RESEND] mmc: hsq: dynamically adjust hsq_depth to improve performance Wenchao Chen
2023-09-05 2:39 ` [PATCH V3 1/2] mmc: queue: replace immediate with hsq->depth Wenchao Chen