From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yufen Yu
Subject: Re: [PATCH v5 00/16] md/raid5: set STRIPE_SIZE as a configurable value
Date: Thu, 9 Jul 2020 21:27:42 +0800
Message-ID: <21aaf87f-157b-c37a-f16b-4e981268eeda@huawei.com>
References: <20200702120628.777303-1-yuyufen@huawei.com>
 <0dd1ebed-2802-2bef-48f0-87bbdd2ee8e5@huawei.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Content-Language: en-US
Sender: linux-raid-owner@vger.kernel.org
To: Song Liu
Cc: linux-raid, NeilBrown, Guoqing Jiang, Hou Tao
List-Id: linux-raid.ids

On 2020/7/9 7:55, Song Liu wrote:
> On Wed, Jul 8, 2020 at 6:15 AM Yufen Yu wrote:
>>
>>
>>
>> On 2020/7/3 7:00, Song Liu wrote:
>>> On Thu, Jul 2, 2020 at 5:05 AM Yufen Yu wrote:
>>>>
>>>> Hi, all
>>>>
>>>> For now, STRIPE_SIZE is equal to the value of PAGE_SIZE. That means
>>>> RAID5 will issue each bio to disk at a granularity of at least 64KB
>>>> when PAGE_SIZE is 64KB on arm64. However, filesystems usually issue
>>>> bios in units of 4KB, so RAID5 may waste disk bandwidth.
>>>>
>>>> To solve the problem, this patchset makes stripe_size a configurable
>>>> value. The default value is 4096. We add a new sysfs entry, and the
>>>> value can be set by writing to it, e.g.:
>>>>
>>>> echo 16384 > /sys/block/md1/md/stripe_size
>>>
>>> Higher level question: do we need to support a page size that is NOT
>>> 4kB times a power of 2? Meaning, do we need to support 12kB, 20kB,
>>> 24kB, etc.? If we only support 4kB, 8kB, 16kB, 32kB, etc., some of
>>> the logic can be simpler.
>>
>> Yeah, I think supporting just 4kB, 8kB, 16kB, 32kB... is enough.
>> But sorry, I don't see which logic can be simpler in the current
>> implementation. It still needs to allocate pages and record page
>> offsets.
>
> I was thinking about replacing multiplication/division with bit
> operations (shift left/right). But I am not very sure how much that
> matters on modern ARM CPUs.
> Would you mind running some benchmarks with this?

To compare multiplication/division against bit operations, I wrote a
simple test case:

$ cat normal.c
#include <stdlib.h>

int page_size = 65536;   /* 64KB */
int stripe_size = 32768; /* 32KB */

int main(int argc, char *argv[])
{
	int i, j, count;
	int page, offset;

	if (argc != 2)
		return -1;
	count = atol(argv[1]);
	for (i = 0; i < count; i++) {
		for (j = 0; j < 4; j++) {
			page = page_size / stripe_size;
			offset = j * stripe_size;
		}
	}
	return 0;
}

$ cat shift.c
#include <stdlib.h>

int page_shift = 16;   /* 64KB */
int stripe_shift = 15; /* 32KB */

int main(int argc, char *argv[])
{
	int i, j, count;
	int page, offset;

	if (argc != 2)
		return -1;
	count = atol(argv[1]);
	for (i = 0; i < count; i++) {
		for (j = 0; j < 4; j++) {
			page = 1 << (page_shift - stripe_shift);
			offset = j << stripe_shift;
		}
	}
	return 0;
}

Running them on an arm64 server, the results show only a minor
performance gap between multiplication/division and shift operations:

[root@localhost shift]# time ./normal 104857600

real	0m1.199s
user	0m1.198s
sys	0m0.000s

[root@localhost shift]# time ./shift 104857600

real	0m1.166s
user	0m1.166s
sys	0m0.000s

In our implementation, the page address and page offset are computed
only once, when the stripe is allocated. After that, we just use the
values recorded in sh->dev[i].page and sh->dev[i].offset. So I think the
current implementation should not cause much overhead.