* [PATCH 0/2] block: improve the share tag set performance @ 2023-05-09 6:52 Ed Tsai 2023-05-09 6:52 ` [PATCH 1/2] block: make the fair sharing of tag configurable Ed Tsai ` (2 more replies) 0 siblings, 3 replies; 25+ messages in thread From: Ed Tsai @ 2023-05-09 6:52 UTC (permalink / raw) To: axboe Cc: linux-block, linux-scsi, linux-kernel, martin.petersen, bvanassche, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, Ed Tsai The tag allocation is limited by the number of active queues and a queue is marked as inactive by the queue timeout worker after up to 30Hz by default. UFS devices have multiple logical units, and they can limit the depth of data LUNs by the fair tag sharing algorithm. Make the fair tag sharing configurable and improve the performance for UFS devices. See also https://lore.kernel.org/all/20230103195337.158625-1-bvanassche@acm.org Ed Tsai (2): block: make the fair sharing of tag configurable ufs: don't use the fair tag sharings block/blk-mq-debugfs.c | 1 + block/blk-mq-tag.c | 1 + block/blk-mq.c | 3 ++- drivers/ufs/core/ufshcd.c | 3 +++ include/linux/blkdev.h | 6 +++++- 5 files changed, 12 insertions(+), 2 deletions(-) -- 2.18.0 ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 1/2] block: make the fair sharing of tag configurable 2023-05-09 6:52 [PATCH 0/2] block: improve the share tag set performance Ed Tsai @ 2023-05-09 6:52 ` Ed Tsai 2023-05-09 21:33 ` kernel test robot 2023-05-11 15:33 ` Christoph Hellwig 2023-05-09 6:52 ` [PATCH 2/2] ufs: don't use the fair tag sharings Ed Tsai 2023-05-10 22:56 ` [PATCH 0/2] block: improve the share tag set performance Bart Van Assche 2 siblings, 2 replies; 25+ messages in thread From: Ed Tsai @ 2023-05-09 6:52 UTC (permalink / raw) To: axboe Cc: linux-block, linux-scsi, linux-kernel, martin.petersen, bvanassche, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, Ed Tsai Add a new queue flag QUEUE_FLAG_FAIR_TAG_SHARING to make the fair tag sharing configurable. Signed-off-by: Ed Tsai <ed.tsai@mediatek.com> --- block/blk-mq-debugfs.c | 1 + block/blk-mq-tag.c | 1 + block/blk-mq.c | 3 ++- include/linux/blkdev.h | 6 +++++- 4 files changed, 9 insertions(+), 2 deletions(-) diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c index d23a8554ec4a..f03b8bfe63be 100644 --- a/block/blk-mq-debugfs.c +++ b/block/blk-mq-debugfs.c @@ -103,6 +103,7 @@ static const char *const blk_queue_flag_name[] = { QUEUE_FLAG_NAME(RQ_ALLOC_TIME), QUEUE_FLAG_NAME(HCTX_ACTIVE), QUEUE_FLAG_NAME(NOWAIT), + QUEUE_FLAG_NAME(FAIR_TAG_SHARING), }; #undef QUEUE_FLAG_NAME diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index d6af9d431dc6..b8b36823f5f5 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -97,6 +97,7 @@ static int __blk_mq_get_tag(struct blk_mq_alloc_data *data, struct sbitmap_queue *bt) { if (!data->q->elevator && !(data->flags & BLK_MQ_REQ_RESERVED) && + blk_queue_fair_tag_sharing(data->q) && !hctx_may_queue(data->hctx, bt)) return BLK_MQ_NO_TAG; diff --git a/block/blk-mq.c b/block/blk-mq.c index f6dad0886a2f..f903107759f7 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1746,7 +1746,8 @@ static bool __blk_mq_alloc_driver_tag(struct request *rq) bt = &rq->mq_hctx->tags->breserved_tags; tag_offset = 0; } else { - if (!hctx_may_queue(rq->mq_hctx, bt)) + if (blk_queue_fair_tag_sharing(rq->q) && + !hctx_may_queue(rq->mq_hctx, bt)) return false; } diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index b441e633f4dd..7fcb2356860d 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -561,10 +561,12 @@ struct request_queue { #define QUEUE_FLAG_NOWAIT 29 /* device supports NOWAIT */ #define QUEUE_FLAG_SQ_SCHED 30 /* single queue style io dispatch */ #define QUEUE_FLAG_SKIP_TAGSET_QUIESCE 31 /* quiesce_tagset skip the queue*/ +#define QUEUE_FLAG_FAIR_TAG_SHARING 32 /* fair allocation of shared tags */ #define QUEUE_FLAG_MQ_DEFAULT ((1UL << QUEUE_FLAG_IO_STAT) | \ (1UL << QUEUE_FLAG_SAME_COMP) | \ - (1UL << QUEUE_FLAG_NOWAIT)) + (1UL << QUEUE_FLAG_NOWAIT) | \ + (1UL << QUEUE_FLAG_FAIR_TAG_SHARING)) void blk_queue_flag_set(unsigned int flag, struct request_queue *q); void blk_queue_flag_clear(unsigned int flag, struct request_queue *q); @@ -602,6 +604,8 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q); #define blk_queue_sq_sched(q) test_bit(QUEUE_FLAG_SQ_SCHED, &(q)->queue_flags) #define blk_queue_skip_tagset_quiesce(q) \ test_bit(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, &(q)->queue_flags) +#define blk_queue_fair_tag_sharing(q) \ + test_bit(QUEUE_FLAG_FAIR_TAG_SHARING, &(q)->queue_flags) extern void blk_set_pm_only(struct request_queue *q); extern void blk_clear_pm_only(struct request_queue *q); -- 2.18.0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] block: make the fair sharing of tag configurable 2023-05-09 6:52 ` [PATCH 1/2] block: make the fair sharing of tag configurable Ed Tsai @ 2023-05-09 21:33 ` kernel test robot 2023-05-11 15:33 ` Christoph Hellwig 1 sibling, 0 replies; 25+ messages in thread From: kernel test robot @ 2023-05-09 21:33 UTC (permalink / raw) To: Ed Tsai, axboe Cc: oe-kbuild-all, linux-block, linux-scsi, linux-kernel, martin.petersen, bvanassche, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, Ed Tsai Hi Ed, kernel test robot noticed the following build warnings: [auto build test WARNING on axboe-block/for-next] [also build test WARNING on jejb-scsi/for-next mkp-scsi/for-next linus/master v6.4-rc1 next-20230509] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Ed-Tsai/block-make-the-fair-sharing-of-tag-configurable/20230509-145439 base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next patch link: https://lore.kernel.org/r/20230509065230.32552-2-ed.tsai%40mediatek.com patch subject: [PATCH 1/2] block: make the fair sharing of tag configurable config: openrisc-randconfig-r022-20230509 (https://download.01.org/0day-ci/archive/20230510/202305100557.gdIvlzRS-lkp@intel.com/config) compiler: or1k-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/b1081024bc6d1cdaf5b39994b19040cd8e6099ec git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Ed-Tsai/block-make-the-fair-sharing-of-tag-configurable/20230509-145439 git checkout b1081024bc6d1cdaf5b39994b19040cd8e6099ec # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=openrisc olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=openrisc SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> | Link: https://lore.kernel.org/oe-kbuild-all/202305100557.gdIvlzRS-lkp@intel.com/ All warnings (new ones prefixed by >>): In file included from block/blk-mq.c:12: block/blk-mq.c: In function 'blk_mq_init_allocated_queue': >> include/linux/blkdev.h:569:39: warning: left shift count >= width of type [-Wshift-count-overflow] 569 | (1UL << QUEUE_FLAG_FAIR_TAG_SHARING)) | ^~ block/blk-mq.c:4232:27: note: in expansion of macro 'QUEUE_FLAG_MQ_DEFAULT' 4232 | q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT; | ^~~~~~~~~~~~~~~~~~~~~ vim +569 include/linux/blkdev.h 565 566 #define QUEUE_FLAG_MQ_DEFAULT ((1UL << QUEUE_FLAG_IO_STAT) | \ 567 (1UL << QUEUE_FLAG_SAME_COMP) | \ 568 (1UL << QUEUE_FLAG_NOWAIT) | \ > 569 (1UL << QUEUE_FLAG_FAIR_TAG_SHARING)) 570 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/2] block: make the fair sharing of tag configurable 2023-05-09 6:52 ` [PATCH 1/2] block: make the fair sharing of tag configurable Ed Tsai 2023-05-09 21:33 ` kernel test robot @ 2023-05-11 15:33 ` Christoph Hellwig 1 sibling, 0 replies; 25+ messages in thread From: Christoph Hellwig @ 2023-05-11 15:33 UTC (permalink / raw) To: Ed Tsai Cc: axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, bvanassche, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream On Tue, May 09, 2023 at 02:52:29PM +0800, Ed Tsai wrote: > Add a new queue flag QUEUE_FLAG_FAIR_TAG_SHARING to make the fair tag > sharing configurable. Why? ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-09 6:52 [PATCH 0/2] block: improve the share tag set performance Ed Tsai 2023-05-09 6:52 ` [PATCH 1/2] block: make the fair sharing of tag configurable Ed Tsai @ 2023-05-09 6:52 ` Ed Tsai 2023-05-09 8:03 ` Avri Altman 2023-05-11 15:34 ` Christoph Hellwig 2023-05-10 22:56 ` [PATCH 0/2] block: improve the share tag set performance Bart Van Assche 2 siblings, 2 replies; 25+ messages in thread From: Ed Tsai @ 2023-05-09 6:52 UTC (permalink / raw) To: axboe Cc: linux-block, linux-scsi, linux-kernel, martin.petersen, bvanassche, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, Ed Tsai The tags allocation is limited by the fair sharing algorithm. It hurts the performance for UFS devices, because the queue depth of general I/O is reduced by half once the UFS send a control command. Signed-off-by: Ed Tsai <ed.tsai@mediatek.com> --- drivers/ufs/core/ufshcd.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index 17d7bb875fee..e96a50265285 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -5149,6 +5149,9 @@ static int ufshcd_slave_configure(struct scsi_device *sdev) blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1); if (hba->quirks & UFSHCD_QUIRK_4KB_DMA_ALIGNMENT) blk_queue_update_dma_alignment(q, 4096 - 1); + + blk_queue_flag_clear(QUEUE_FLAG_FAIR_TAG_SHARING, q); + /* * Block runtime-pm until all consumers are added. * Refer ufshcd_setup_links(). -- 2.18.0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* RE: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-09 6:52 ` [PATCH 2/2] ufs: don't use the fair tag sharings Ed Tsai @ 2023-05-09 8:03 ` Avri Altman 2023-05-09 14:04 ` Bart Van Assche 2023-05-11 15:34 ` Christoph Hellwig 1 sibling, 1 reply; 25+ messages in thread From: Avri Altman @ 2023-05-09 8:03 UTC (permalink / raw) To: Ed Tsai, axboe@kernel.dk Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org, martin.petersen@oracle.com, bvanassche@acm.org, stanley.chu@mediatek.com, peter.wang@mediatek.com, chun-hung.wu@mediatek.com, alice.chao@mediatek.com, powen.kao@mediatek.com, naomi.chu@mediatek.com, wsd_upstream@mediatek.com > The tags allocation is limited by the fair sharing algorithm. It hurts > the performance for UFS devices, because the queue depth of general I/O > is reduced by half once the UFS send a control command. Ack. However, I think the decision of that should be of the platform owner, And not in the core driver. Thanks, Avri > > Signed-off-by: Ed Tsai <ed.tsai@mediatek.com> > --- > drivers/ufs/core/ufshcd.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c > index 17d7bb875fee..e96a50265285 100644 > --- a/drivers/ufs/core/ufshcd.c > +++ b/drivers/ufs/core/ufshcd.c > @@ -5149,6 +5149,9 @@ static int ufshcd_slave_configure(struct scsi_device > *sdev) > blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1); > if (hba->quirks & UFSHCD_QUIRK_4KB_DMA_ALIGNMENT) > blk_queue_update_dma_alignment(q, 4096 - 1); > + > + blk_queue_flag_clear(QUEUE_FLAG_FAIR_TAG_SHARING, q); > + > /* > * Block runtime-pm until all consumers are added. > * Refer ufshcd_setup_links(). > -- > 2.18.0 ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-09 8:03 ` Avri Altman @ 2023-05-09 14:04 ` Bart Van Assche 2023-05-09 16:19 ` Avri Altman 0 siblings, 1 reply; 25+ messages in thread From: Bart Van Assche @ 2023-05-09 14:04 UTC (permalink / raw) To: Avri Altman, Ed Tsai, axboe@kernel.dk Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org, martin.petersen@oracle.com, stanley.chu@mediatek.com, peter.wang@mediatek.com, chun-hung.wu@mediatek.com, alice.chao@mediatek.com, powen.kao@mediatek.com, naomi.chu@mediatek.com, wsd_upstream@mediatek.com On 5/9/23 01:03, Avri Altman wrote: > However, I think the decision of that should be of the platform owner, > And not in the core driver. Hi Avri, I don't see any use case in which performance of a UFS device would be improved by leaving QUEUE_FLAG_FAIR_TAG_SHARING enabled. Are you perhaps aware of such a use case? Thanks, Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-09 14:04 ` Bart Van Assche @ 2023-05-09 16:19 ` Avri Altman 2023-05-09 16:30 ` Bart Van Assche 0 siblings, 1 reply; 25+ messages in thread From: Avri Altman @ 2023-05-09 16:19 UTC (permalink / raw) To: Bart Van Assche, Ed Tsai, axboe@kernel.dk Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org, martin.petersen@oracle.com, stanley.chu@mediatek.com, peter.wang@mediatek.com, chun-hung.wu@mediatek.com, alice.chao@mediatek.com, powen.kao@mediatek.com, naomi.chu@mediatek.com, wsd_upstream@mediatek.com > On 5/9/23 01:03, Avri Altman wrote: > > However, I think the decision of that should be of the platform owner, > > And not in the core driver. > > Hi Avri, > > I don't see any use case in which performance of a UFS device would be > improved > by leaving QUEUE_FLAG_FAIR_TAG_SHARING enabled. Are you perhaps aware > of such a > use case? Following your argument, then why fair allocation exists in the first place? When running benchmarks I am hacking the scheduler's "fair" tag allocation as well. That's why I acked this change. Since this change may affect the IO profile as a whole, I think the platform owners should have the flexibility not to use it, Should they choose to. Thanks, Avri > > Thanks, > > Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-09 16:19 ` Avri Altman @ 2023-05-09 16:30 ` Bart Van Assche 2023-05-10 5:21 ` Avri Altman 0 siblings, 1 reply; 25+ messages in thread From: Bart Van Assche @ 2023-05-09 16:30 UTC (permalink / raw) To: Avri Altman, Ed Tsai, axboe@kernel.dk Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org, martin.petersen@oracle.com, stanley.chu@mediatek.com, peter.wang@mediatek.com, chun-hung.wu@mediatek.com, alice.chao@mediatek.com, powen.kao@mediatek.com, naomi.chu@mediatek.com, wsd_upstream@mediatek.com On 5/9/23 09:19, Avri Altman wrote: > Following your argument, then why fair allocation exists in the first place? Does this patch answer your question? [PATCH v2] block: Improve shared tag set performance https://lore.kernel.org/linux-block/20230103195337.158625-1-bvanassche@acm.org/ Thanks, Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-09 16:30 ` Bart Van Assche @ 2023-05-10 5:21 ` Avri Altman 2023-05-10 15:56 ` Bart Van Assche 0 siblings, 1 reply; 25+ messages in thread From: Avri Altman @ 2023-05-10 5:21 UTC (permalink / raw) To: Bart Van Assche, Ed Tsai, axboe@kernel.dk Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org, martin.petersen@oracle.com, stanley.chu@mediatek.com, peter.wang@mediatek.com, chun-hung.wu@mediatek.com, alice.chao@mediatek.com, powen.kao@mediatek.com, naomi.chu@mediatek.com, wsd_upstream@mediatek.com > > On 5/9/23 09:19, Avri Altman wrote: > > Following your argument, then why fair allocation exists in the first place? > > Does this patch answer your question? > > [PATCH v2] block: Improve shared tag set performance > https://lore.kernel.org/linux-block/20230103195337.158625-1- > bvanassche@acm.org/ OK. Why it was needed to invent a new flag instead of just clear BLK_MQ_F_TAG_QUEUE_SHARED? Thanks, Avri > > Thanks, > > Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-10 5:21 ` Avri Altman @ 2023-05-10 15:56 ` Bart Van Assche 0 siblings, 0 replies; 25+ messages in thread From: Bart Van Assche @ 2023-05-10 15:56 UTC (permalink / raw) To: Avri Altman, Ed Tsai, axboe@kernel.dk Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org, martin.petersen@oracle.com, stanley.chu@mediatek.com, peter.wang@mediatek.com, chun-hung.wu@mediatek.com, alice.chao@mediatek.com, powen.kao@mediatek.com, naomi.chu@mediatek.com, wsd_upstream@mediatek.com On 5/9/23 22:21, Avri Altman wrote: > Why it was needed to invent a new flag instead of just clear > BLK_MQ_F_TAG_QUEUE_SHARED? Hi Avri, The meaning of BLK_MQ_F_TAG_QUEUE_SHARED is "the tag set is shared across multiple request queues". Clearing BLK_MQ_F_TAG_QUEUE_SHARED if the tag set is shared would be wrong: it would break all the code outside the tag allocation code that tests this flag. Thanks, Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-09 6:52 ` [PATCH 2/2] ufs: don't use the fair tag sharings Ed Tsai 2023-05-09 8:03 ` Avri Altman @ 2023-05-11 15:34 ` Christoph Hellwig 2023-05-11 15:38 ` Bart Van Assche 1 sibling, 1 reply; 25+ messages in thread From: Christoph Hellwig @ 2023-05-11 15:34 UTC (permalink / raw) To: Ed Tsai Cc: axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, bvanassche, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream On Tue, May 09, 2023 at 02:52:30PM +0800, Ed Tsai wrote: > The tags allocation is limited by the fair sharing algorithm. It hurts > the performance for UFS devices, because the queue depth of general I/O > is reduced by half once the UFS send a control command. But it is there for a reason. You completely fail to explain why you think your change is safe, and also why you did not try to even explain where the overhead is and how else you tried to mitigate it. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-11 15:34 ` Christoph Hellwig @ 2023-05-11 15:38 ` Bart Van Assche 2023-05-12 14:02 ` Christoph Hellwig 0 siblings, 1 reply; 25+ messages in thread From: Bart Van Assche @ 2023-05-11 15:38 UTC (permalink / raw) To: Christoph Hellwig, Ed Tsai Cc: axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream On 5/11/23 08:34, Christoph Hellwig wrote: > On Tue, May 09, 2023 at 02:52:30PM +0800, Ed Tsai wrote: >> The tags allocation is limited by the fair sharing algorithm. It hurts >> the performance for UFS devices, because the queue depth of general I/O >> is reduced by half once the UFS send a control command. > > But it is there for a reason. You completely fail to explain why you > think your change is safe, and also why you did not try to even explain > where the overhead is and how else you tried to mitigate it. Hi Christoph, For which devices is the fair sharing algorithm useful? As far as I know the legacy block layer did not have an equivalent of the fair sharing algorithm and I'm not aware of any complaints about the legacy block layer regarding to fairness. This is why I proposed in January to remove the fair sharing code entirely. See also https://lore.kernel.org/linux-block/20230103195337.158625-1-bvanassche@acm.org/. Thanks, Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-11 15:38 ` Bart Van Assche @ 2023-05-12 14:02 ` Christoph Hellwig 2023-05-12 18:12 ` Bart Van Assche 0 siblings, 1 reply; 25+ messages in thread From: Christoph Hellwig @ 2023-05-12 14:02 UTC (permalink / raw) To: Bart Van Assche Cc: Christoph Hellwig, Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream On Thu, May 11, 2023 at 08:38:04AM -0700, Bart Van Assche wrote: > For which devices is the fair sharing algorithm useful? As far as I know the > legacy block layer did not have an equivalent of the fair sharing algorithm > and I'm not aware of any complaints about the legacy block layer regarding > to fairness. This is why I proposed in January to remove the fair sharing > code entirely. See also https://lore.kernel.org/linux-block/20230103195337.158625-1-bvanassche@acm.org/. Because the old code did not do tag allocation itself? Either way I don't think a "I'll opt out for a random driver" is the proper approach when you think it's not needed. Especially not without any data explaining why just that driver is a special snowflake. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-12 14:02 ` Christoph Hellwig @ 2023-05-12 18:12 ` Bart Van Assche 2023-05-13 3:09 ` Yu Kuai 0 siblings, 1 reply; 25+ messages in thread From: Bart Van Assche @ 2023-05-12 18:12 UTC (permalink / raw) To: Christoph Hellwig Cc: Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream On 5/12/23 07:02, Christoph Hellwig wrote: > On Thu, May 11, 2023 at 08:38:04AM -0700, Bart Van Assche wrote: >> For which devices is the fair sharing algorithm useful? As far as I know the >> legacy block layer did not have an equivalent of the fair sharing algorithm >> and I'm not aware of any complaints about the legacy block layer regarding >> to fairness. This is why I proposed in January to remove the fair sharing >> code entirely. See also https://lore.kernel.org/linux-block/20230103195337.158625-1-bvanassche@acm.org/. > > Because the old code did not do tag allocation itself? Either way I > don't think a "I'll opt out for a random driver" is the proper approach > when you think it's not needed. Especially not without any data > explaining why just that driver is a special snowflake. Hi Christoph, I'm still wondering whether there are any drivers that benefit from the fair tag sharing algorithm. If the number of tags is large enough (NVMe), the number of tags exceeds the number of requests in flight and hence the fair tag sharing algorithm is not necessary. The fair tag sharing algorithm has a negative impact on all SCSI devices with multiple logical units. This is because logical units are considered active until (request timeout) seconds have elapsed after the logical unit stopped being used (see also the blk_mq_tag_idle() call in blk_mq_timeout_work()). UFS users are hit by this because UFS 3.0 devices have a limited queue depth (32) and because power management commands are submitted to a logical unit (WLUN). Hence, it happens often that the block layer "active queue" counter is equal to 2 while only one logical unit is being used actively (a logical unit backed by NAND flash). The performance difference between queue depths 16 and 32 for UFS devices is significant. Is my understanding correct that in the legacy block layer implementation blk_queue_start_tag() had to be called to assign a tag to a request? I haven't found any code in the Linux kernel v4.20 implementation of blk_queue_start_tag() that implements fairness in case a request tag map (struct blk_queue_tag) is shared across request queues (one request queue per logical unit in case of SCSI). Do you agree with my conclusion that from the point of view of the SCSI core in general and the UFS driver in particular the fair tag sharing algorithm in the blk-mq code introduced a performance regression? Thanks, Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-12 18:12 ` Bart Van Assche @ 2023-05-13 3:09 ` Yu Kuai 2023-05-16 15:12 ` Bart Van Assche 0 siblings, 1 reply; 25+ messages in thread From: Yu Kuai @ 2023-05-13 3:09 UTC (permalink / raw) To: Bart Van Assche, Christoph Hellwig Cc: Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, yukuai (C) Hi, 在 2023/05/13 2:12, Bart Van Assche 写道: > The fair tag sharing algorithm has a negative impact on all SCSI devices > with multiple logical units. This is because logical units are > considered active until (request timeout) seconds have elapsed after the > logical unit stopped being used (see also the blk_mq_tag_idle() call in > blk_mq_timeout_work()). UFS users are hit by this because UFS 3.0 > devices have a limited queue depth (32) and because power management > commands are submitted to a logical unit (WLUN). Hence, it happens often > that the block layer "active queue" counter is equal to 2 while only one > logical unit is being used actively (a logical unit backed by NAND > flash). The performance difference between queue depths 16 and 32 for > UFS devices is significant. We meet similiar problem before, but I think remove tag fair sharing might cause some problems, because get tag is not fair currently, for example 2 devices share 32 tag, while device a issue large amount of io concurrently, and device b only issue one io, in this case, if fair tag sharing is removed, device b can get bad io latency. By the way, I tried to propose a way to workaround this by following: 1) disable fair tag sharing untill get tag found no tag is avaiable; 2) enable fair tag sharing again if the disk donesn't faild to get tag for a period of time; Can this approch be considered? Thanks, Kuai ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-13 3:09 ` Yu Kuai @ 2023-05-16 15:12 ` Bart Van Assche 2023-05-17 7:49 ` Yu Kuai 0 siblings, 1 reply; 25+ messages in thread From: Bart Van Assche @ 2023-05-16 15:12 UTC (permalink / raw) To: Yu Kuai, Christoph Hellwig Cc: Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, yukuai (C) On 5/12/23 20:09, Yu Kuai wrote: > 在 2023/05/13 2:12, Bart Van Assche 写道: >> The fair tag sharing algorithm has a negative impact on all SCSI >> devices with multiple logical units. This is because logical units are >> considered active until (request timeout) seconds have elapsed after >> the logical unit stopped being used (see also the blk_mq_tag_idle() >> call in blk_mq_timeout_work()). UFS users are hit by this because UFS >> 3.0 devices have a limited queue depth (32) and because power >> management commands are submitted to a logical unit (WLUN). Hence, it >> happens often that the block layer "active queue" counter is equal to >> 2 while only one logical unit is being used actively (a logical unit >> backed by NAND flash). The performance difference between queue depths >> 16 and 32 for UFS devices is significant. > > We meet similiar problem before, but I think remove tag fair sharing > might cause some problems, because get tag is not fair currently, for > example 2 devices share 32 tag, while device a issue large amount of > io concurrently, and device b only issue one io, in this case, if fair > tag sharing is removed, device b can get bad io latency. > > By the way, I tried to propose a way to workaround this by following: > > 1) disable fair tag sharing untill get tag found no tag is avaiable; > 2) enable fair tag sharing again if the disk donesn't faild to get tag > for a period of time; > > Can this approch be considered? I'm afraid that this approach won't help for the UFS driver since it is likely that all tags are in use by a single logical unit during an IOPS test. Hence, fair sharing would be enabled even when we don't want it to be enabled. I propose that we switch to one of these two approaches: * Either remove the fair tag sharing code entirely and rely on the fairness mechanism provided by the sbitmap code. I'm referring to how __sbitmap_queue_wake_up() uses the wake_index member variable. * Or make the behavior of the fairness algorithm configurable from user space. One possible approach is to make the proportion of tags for a logical unit / NVMe namespace configurable via sysfs. This will allow to reduce the number of tags for the WLUN of UFS devices. Thanks, Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-16 15:12 ` Bart Van Assche @ 2023-05-17 7:49 ` Yu Kuai 2023-05-17 18:23 ` Bart Van Assche 0 siblings, 1 reply; 25+ messages in thread From: Yu Kuai @ 2023-05-17 7:49 UTC (permalink / raw) To: Bart Van Assche, Yu Kuai, Christoph Hellwig Cc: Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, yukuai (C) Hi, 在 2023/05/16 23:12, Bart Van Assche 写道: > On 5/12/23 20:09, Yu Kuai wrote: >> 在 2023/05/13 2:12, Bart Van Assche 写道: >>> The fair tag sharing algorithm has a negative impact on all SCSI >>> devices with multiple logical units. This is because logical units >>> are considered active until (request timeout) seconds have elapsed >>> after the logical unit stopped being used (see also the >>> blk_mq_tag_idle() call in blk_mq_timeout_work()). UFS users are hit >>> by this because UFS 3.0 devices have a limited queue depth (32) and >>> because power management commands are submitted to a logical unit >>> (WLUN). Hence, it happens often that the block layer "active queue" >>> counter is equal to 2 while only one logical unit is being used >>> actively (a logical unit backed by NAND flash). The performance >>> difference between queue depths 16 and 32 for UFS devices is >>> significant. >> >> We meet similiar problem before, but I think remove tag fair sharing >> might cause some problems, because get tag is not fair currently, for >> example 2 devices share 32 tag, while device a issue large amount of >> io concurrently, and device b only issue one io, in this case, if fair >> tag sharing is removed, device b can get bad io latency. >> >> By the way, I tried to propose a way to workaround this by following: >> >> 1) disable fair tag sharing untill get tag found no tag is avaiable; >> 2) enable fair tag sharing again if the disk donesn't faild to get tag >> for a period of time; >> >> Can this approch be considered? > > I'm afraid that this approach won't help for the UFS driver since it is > likely that all tags are in use by a single logical unit during an IOPS > test. Hence, fair sharing would be enabled even when we don't want it to > be enabled. It's right my original method is not flexible. > > I propose that we switch to one of these two approaches: How about a smoothing method that the device with more io will share more tag, and each device will get at least one tag? Thanks, Kuai > * Either remove the fair tag sharing code entirely and rely on the > fairness mechanism provided by the sbitmap code. I'm referring to how > __sbitmap_queue_wake_up() uses the wake_index member variable. > * Or make the behavior of the fairness algorithm configurable from user > space. One possible approach is to make the proportion of tags for a > logical unit / NVMe namespace configurable via sysfs. This will allow to > reduce the number of tags for the WLUN of UFS devices. > > Thanks, > > Bart. > > > . > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-17 7:49 ` Yu Kuai @ 2023-05-17 18:23 ` Bart Van Assche 2023-05-18 1:49 ` Yu Kuai 0 siblings, 1 reply; 25+ messages in thread From: Bart Van Assche @ 2023-05-17 18:23 UTC (permalink / raw) To: Yu Kuai, Christoph Hellwig Cc: Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, yukuai (C) On 5/17/23 00:49, Yu Kuai wrote: > 在 2023/05/16 23:12, Bart Van Assche 写道: >> I propose that we switch to one of these two approaches: > > How about a smoothing method that the device with more io will share > more tag, and each device will get at least one tag? Hi Yu, hctx_may_queue() is called from the hot path (blk_mq_get_tag()). I'm pretty sure that adding any nontrivial code in that path will cause a performance (IOPS) regression. So I don't think that adding a smoothing method in hctx_may_queue() is a realistic option. Thanks, Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-17 18:23 ` Bart Van Assche @ 2023-05-18 1:49 ` Yu Kuai 2023-05-18 2:23 ` Bart Van Assche 0 siblings, 1 reply; 25+ messages in thread From: Yu Kuai @ 2023-05-18 1:49 UTC (permalink / raw) To: Bart Van Assche, Yu Kuai, Christoph Hellwig Cc: Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, yukuai (C) Hi, 在 2023/05/18 2:23, Bart Van Assche 写道: > On 5/17/23 00:49, Yu Kuai wrote: >> 在 2023/05/16 23:12, Bart Van Assche 写道: >>> I propose that we switch to one of these two approaches: >> >> How about a smoothing method that the device with more io will share >> more tag, and each device will get at least one tag? > > Hi Yu, > > hctx_may_queue() is called from the hot path (blk_mq_get_tag()). I'm > pretty sure that adding any nontrivial code in that path will cause a > performance (IOPS) regression. So I don't think that adding a smoothing > method in hctx_may_queue() is a realistic option. > Currently, fair share from hctx_may_queue() requires two atomic_read(active_queues and active_requests), I think this smoothing method can be placed into get_tag fail path, for example, the more times a disk failed to get tag in a period of time, the more tag this disk can get, and all the information can be updated here(perhaps directly record how many tags a disk can get, then hctx_may_queue() still only require 2 atomic_read()). Thanks, Bart ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-18 1:49 ` Yu Kuai @ 2023-05-18 2:23 ` Bart Van Assche 2023-05-18 7:55 ` Yu Kuai 0 siblings, 1 reply; 25+ messages in thread From: Bart Van Assche @ 2023-05-18 2:23 UTC (permalink / raw) To: Yu Kuai, Christoph Hellwig Cc: Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, yukuai (C) On 5/17/23 18:49, Yu Kuai wrote: > Currently, fair share from hctx_may_queue() requires two > atomic_read(active_queues and active_requests), I think this smoothing > method can be placed into get_tag fail path, for example, the more times > a disk failed to get tag in a period of time, the more tag this disk can > get, and all the information can be updated here(perhaps directly > record how many tags a disk can get, then hctx_may_queue() still only > require 2 atomic_read()). That sounds interesting to me. Do you perhaps plan to implement this approach and to post it as a patch? Thanks, Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-18 2:23 ` Bart Van Assche @ 2023-05-18 7:55 ` Yu Kuai 2023-06-13 14:07 ` Bart Van Assche 0 siblings, 1 reply; 25+ messages in thread From: Yu Kuai @ 2023-05-18 7:55 UTC (permalink / raw) To: Bart Van Assche, Yu Kuai, Christoph Hellwig Cc: Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, yukuai (C) Hi, 在 2023/05/18 10:23, Bart Van Assche 写道: > On 5/17/23 18:49, Yu Kuai wrote: >> Currently, fair share from hctx_may_queue() requires two >> atomic_read(active_queues and active_requests), I think this smoothing >> method can be placed into get_tag fail path, for example, the more times >> a disk failed to get tag in a period of time, the more tag this disk can >> get, and all the information can be updated here(perhaps directly >> record how many tags a disk can get, then hctx_may_queue() still only >> require 2 atomic_read()). > > That sounds interesting to me. Do you perhaps plan to implement this > approach and to post it as a patch? Of course, I'll try to send a RFC patch. Thanks, Kuai > > Thanks, > > Bart. > > > . > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-05-18 7:55 ` Yu Kuai @ 2023-06-13 14:07 ` Bart Van Assche 2023-06-14 1:58 ` Yu Kuai 0 siblings, 1 reply; 25+ messages in thread From: Bart Van Assche @ 2023-06-13 14:07 UTC (permalink / raw) To: Yu Kuai, Christoph Hellwig Cc: Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, yukuai (C) On 5/18/23 00:55, Yu Kuai wrote: > 在 2023/05/18 10:23, Bart Van Assche 写道: >> On 5/17/23 18:49, Yu Kuai wrote: >>> Currently, fair share from hctx_may_queue() requires two >>> atomic_read(active_queues and active_requests), I think this smoothing >>> method can be placed into get_tag fail path, for example, the more times >>> a disk failed to get tag in a period of time, the more tag this disk can >>> get, and all the information can be updated here(perhaps directly >>> record how many tags a disk can get, then hctx_may_queue() still only >>> require 2 atomic_read()). >> >> That sounds interesting to me. Do you perhaps plan to implement this >> approach and to post it as a patch? > > Of course, I'll try to send a RFC patch. Hi Kuai, Has this RFC patch already been posted or did I perhaps miss it? Thanks, Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/2] ufs: don't use the fair tag sharings 2023-06-13 14:07 ` Bart Van Assche @ 2023-06-14 1:58 ` Yu Kuai 0 siblings, 0 replies; 25+ messages in thread From: Yu Kuai @ 2023-06-14 1:58 UTC (permalink / raw) To: Bart Van Assche, Yu Kuai, Christoph Hellwig Cc: Ed Tsai, axboe, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream, yukuai (C) Hi, 在 2023/06/13 22:07, Bart Van Assche 写道: > On 5/18/23 00:55, Yu Kuai wrote: >> 在 2023/05/18 10:23, Bart Van Assche 写道: >>> On 5/17/23 18:49, Yu Kuai wrote: >>>> Currently, fair share from hctx_may_queue() requires two >>>> atomic_read(active_queues and active_requests), I think this smoothing >>>> method can be placed into get_tag fail path, for example, the more >>>> times >>>> a disk failed to get tag in a period of time, the more tag this disk >>>> can >>>> get, and all the information can be updated here(perhaps directly >>>> record how many tags a disk can get, then hctx_may_queue() still only >>>> require 2 atomic_read()). >>> >>> That sounds interesting to me. Do you perhaps plan to implement this >>> approach and to post it as a patch? >> >> Of course, I'll try to send a RFC patch. > > Hi Kuai, > > Has this RFC patch already been posted or did I perhaps miss it? Sorry for the delay, I finished switch from not share to fair share directly verion, however, I found that it's not that simple to add a smoothing method, and I'm stuck here for now. I'll try to send a RFC verion soon, the smoothing method may not be flexible, but I'll make sure it can work for your simple case that 2 device share tags, and one only issue few io while the other issue lots of io. Thanks, Kuai > > Thanks, > > Bart. > > > . > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/2] block: improve the share tag set performance 2023-05-09 6:52 [PATCH 0/2] block: improve the share tag set performance Ed Tsai 2023-05-09 6:52 ` [PATCH 1/2] block: make the fair sharing of tag configurable Ed Tsai 2023-05-09 6:52 ` [PATCH 2/2] ufs: don't use the fair tag sharings Ed Tsai @ 2023-05-10 22:56 ` Bart Van Assche 2 siblings, 0 replies; 25+ messages in thread From: Bart Van Assche @ 2023-05-10 22:56 UTC (permalink / raw) To: axboe Cc: Ed Tsai, linux-block, linux-scsi, linux-kernel, martin.petersen, stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao, naomi.chu, wsd_upstream On 5/8/23 23:52, Ed Tsai wrote: > The tag allocation is limited by the number of active queues and a > queue is marked as inactive by the queue timeout worker after up to 30Hz > by default. > > UFS devices have multiple logical units, and they can limit the depth of > data LUNs by the fair tag sharing algorithm. Make the fair tag sharing > configurable and improve the performance for UFS devices. > > See also https://lore.kernel.org/all/20230103195337.158625-1-bvanassche@acm.org Hi Jens, This patch series is slightly more complicated than the patch that I posted in January. Do you prefer the approach of this patch series or rather the approach of the patch that I posted in January? Thank you, Bart. ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2023-06-14 1:58 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-05-09 6:52 [PATCH 0/2] block: improve the share tag set performance Ed Tsai 2023-05-09 6:52 ` [PATCH 1/2] block: make the fair sharing of tag configurable Ed Tsai 2023-05-09 21:33 ` kernel test robot 2023-05-11 15:33 ` Christoph Hellwig 2023-05-09 6:52 ` [PATCH 2/2] ufs: don't use the fair tag sharings Ed Tsai 2023-05-09 8:03 ` Avri Altman 2023-05-09 14:04 ` Bart Van Assche 2023-05-09 16:19 ` Avri Altman 2023-05-09 16:30 ` Bart Van Assche 2023-05-10 5:21 ` Avri Altman 2023-05-10 15:56 ` Bart Van Assche 2023-05-11 15:34 ` Christoph Hellwig 2023-05-11 15:38 ` Bart Van Assche 2023-05-12 14:02 ` Christoph Hellwig 2023-05-12 18:12 ` Bart Van Assche 2023-05-13 3:09 ` Yu Kuai 2023-05-16 15:12 ` Bart Van Assche 2023-05-17 7:49 ` Yu Kuai 2023-05-17 18:23 ` Bart Van Assche 2023-05-18 1:49 ` Yu Kuai 2023-05-18 2:23 ` Bart Van Assche 2023-05-18 7:55 ` Yu Kuai 2023-06-13 14:07 ` Bart Van Assche 2023-06-14 1:58 ` Yu Kuai 2023-05-10 22:56 ` [PATCH 0/2] block: improve the share tag set performance Bart Van Assche
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).