* Bad raid0 bio too large problem @ 2015-09-22 15:30 Jes Sorensen 2015-09-22 16:07 ` Jes Sorensen 2015-09-23 2:25 ` Neil Brown 0 siblings, 2 replies; 10+ messages in thread From: Jes Sorensen @ 2015-09-22 15:30 UTC (permalink / raw) To: NeilBrown; +Cc: Xiao Ni, linux-raid, yizhan Hi Neil, I think we have some bad side effects with this patch: commit 199dc6ed5179251fa6158a461499c24bdd99c836 Author: NeilBrown <neilb@suse.com> Date: Mon Aug 3 13:11:47 2015 +1000 md/raid0: update queue parameter in a safer location. When a (e.g.) RAID5 array is reshaped to RAID0, the updating of queue parameters (e.g. max number of sectors per bio) is done in the wrong place. It should be part of ->run, but it is actually part of ->takeover. This means it happens before level_store() calls: blk_set_stacking_limits(&mddev->queue->limits); Running the '03r0assem' test suite fills my kernel log with output like below. Yi Zhang also had issues where writes failed too. robably something we need to resolve for 4.2-final or revert the offending patch. Cheers, Jes md: bind<loop0> md: bind<loop1> md: bind<loop2> md/raid0:md2: md_size is 116736 sectors. md: RAID0 configuration for md2 - 1 zone md: zone0=[loop0/loop1/loop2] zone-offset= 0KB, device-offset= 0KB, size= 58368KB md2: detected capacity change from 0 to 59768832 bio too big device loop0 (296 > 255) bio too big device loop0 (272 > 255) bio too big device loop1 (672 > 255) bio too big device loop1 (352 > 255) bio too big device loop2 (1024 > 255) bio too big device loop0 (672 > 255) bio too big device loop1 (256 > 255) bio too big device loop2 (288 > 255) bio too big device loop2 (736 > 255) bio too big device loop0 (288 > 255) bio too big device loop2 (256 > 255) bio too big device loop2 (368 > 255) bio too big device loop0 (488 > 255) bio too big device loop0 (360 > 255) bio too big device loop0 (256 > 255) bio too big device loop1 (288 > 255) bio too big device loop1 (736 > 255) bio too big device loop2 (288 > 255) bio too big device loop1 (256 > 255) bio too big device loop1 (512 > 255) bio too big device loop2 (856 > 255) bio too big device loop2 (256 > 255) bio too big device loop0 (288 > 255) bio too big device loop0 (736 > 255) bio too big device loop1 (288 > 255) bio too big device loop0 (256 > 255) bio too big device loop0 (512 > 255) bio too big device loop1 (856 > 255) bio too big device loop1 (256 > 255) bio too big device loop2 (288 > 255) bio too big device loop2 (736 > 255) md2: detected capacity change from 59768832 to 0 md: md2 stopped. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem 2015-09-22 15:30 Bad raid0 bio too large problem Jes Sorensen @ 2015-09-22 16:07 ` Jes Sorensen 2015-09-23 2:25 ` Neil Brown 1 sibling, 0 replies; 10+ messages in thread From: Jes Sorensen @ 2015-09-22 16:07 UTC (permalink / raw) To: NeilBrown; +Cc: Xiao Ni, linux-raid, yizhan Jes Sorensen <Jes.Sorensen@redhat.com> writes: > Hi Neil, > > I think we have some bad side effects with this patch: > > commit 199dc6ed5179251fa6158a461499c24bdd99c836 > Author: NeilBrown <neilb@suse.com> > Date: Mon Aug 3 13:11:47 2015 +1000 > > md/raid0: update queue parameter in a safer location. > > When a (e.g.) RAID5 array is reshaped to RAID0, the updating > of queue parameters (e.g. max number of sectors per bio) is > done in the wrong place. > It should be part of ->run, but it is actually part of ->takeover. > This means it happens before level_store() calls: > > blk_set_stacking_limits(&mddev->queue->limits); > > Running the '03r0assem' test suite fills my kernel log with output like > below. Yi Zhang also had issues where writes failed too. > > robably something we need to resolve for 4.2-final or revert the > offending patch. Obviously I meant 4.3 here :) Jes ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem 2015-09-22 15:30 Bad raid0 bio too large problem Jes Sorensen 2015-09-22 16:07 ` Jes Sorensen @ 2015-09-23 2:25 ` Neil Brown 2015-09-23 11:18 ` Jes Sorensen 1 sibling, 1 reply; 10+ messages in thread From: Neil Brown @ 2015-09-23 2:25 UTC (permalink / raw) To: Jes Sorensen; +Cc: Xiao Ni, linux-raid, yizhan [-- Attachment #1: Type: text/plain, Size: 1727 bytes --] Jes Sorensen <Jes.Sorensen@redhat.com> writes: > Hi Neil, > > I think we have some bad side effects with this patch: > > commit 199dc6ed5179251fa6158a461499c24bdd99c836 > Author: NeilBrown <neilb@suse.com> > Date: Mon Aug 3 13:11:47 2015 +1000 > > md/raid0: update queue parameter in a safer location. > > When a (e.g.) RAID5 array is reshaped to RAID0, the updating > of queue parameters (e.g. max number of sectors per bio) is > done in the wrong place. > It should be part of ->run, but it is actually part of ->takeover. > This means it happens before level_store() calls: > > blk_set_stacking_limits(&mddev->queue->limits); > > Running the '03r0assem' test suite fills my kernel log with output like > below. Yi Zhang also had issues where writes failed too. > > robably something we need to resolve for 4.2-final or revert the > offending patch. > > Cheers, > Jes > > md: bind<loop0> > md: bind<loop1> > md: bind<loop2> > md/raid0:md2: md_size is 116736 sectors. > md: RAID0 configuration for md2 - 1 zone > md: zone0=[loop0/loop1/loop2] > zone-offset= 0KB, device-offset= 0KB, size= 58368KB > > md2: detected capacity change from 0 to 59768832 > bio too big device loop0 (296 > 255) > bio too big device loop0 (272 > 255) 1/ Why do you blame that particular patch? 2/ Where is that error message coming from? I cannot find "bio too big" in the kernel (except in a comment). Commit: 54efd50bfd87 ("block: make generic_make_request handle arbitrarily sized bios") removed the only instance of the error message that I know of. Which kernel exactly are you testing? Thanks, NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 818 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem 2015-09-23 2:25 ` Neil Brown @ 2015-09-23 11:18 ` Jes Sorensen 2015-09-23 11:20 ` Jes Sorensen 2015-09-24 2:53 ` Neil Brown 0 siblings, 2 replies; 10+ messages in thread From: Jes Sorensen @ 2015-09-23 11:18 UTC (permalink / raw) To: Neil Brown; +Cc: Xiao Ni, linux-raid, yizhan Neil Brown <neilb@suse.de> writes: > Jes Sorensen <Jes.Sorensen@redhat.com> writes: > >> Hi Neil, >> >> I think we have some bad side effects with this patch: >> >> commit 199dc6ed5179251fa6158a461499c24bdd99c836 >> Author: NeilBrown <neilb@suse.com> >> Date: Mon Aug 3 13:11:47 2015 +1000 >> >> md/raid0: update queue parameter in a safer location. >> >> When a (e.g.) RAID5 array is reshaped to RAID0, the updating >> of queue parameters (e.g. max number of sectors per bio) is >> done in the wrong place. >> It should be part of ->run, but it is actually part of ->takeover. >> This means it happens before level_store() calls: >> >> blk_set_stacking_limits(&mddev->queue->limits); >> >> Running the '03r0assem' test suite fills my kernel log with output like >> below. Yi Zhang also had issues where writes failed too. >> >> robably something we need to resolve for 4.2-final or revert the >> offending patch. >> >> Cheers, >> Jes >> >> md: bind<loop0> >> md: bind<loop1> >> md: bind<loop2> >> md/raid0:md2: md_size is 116736 sectors. >> md: RAID0 configuration for md2 - 1 zone >> md: zone0=[loop0/loop1/loop2] >> zone-offset= 0KB, device-offset= 0KB, size= 58368KB >> >> md2: detected capacity change from 0 to 59768832 >> bio too big device loop0 (296 > 255) >> bio too big device loop0 (272 > 255) > > 1/ Why do you blame that particular patch? > > 2/ Where is that error message coming from? I cannot find "bio too big" > in the kernel (except in a comment). > Commit: 54efd50bfd87 ("block: make generic_make_request handle > arbitrarily sized bios") > removed the only instance of the error message that I know of. > > Which kernel exactly are you testing? I blame it because of bisect - I revert that patch and the issue goes away. I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree, see the bio too large. I revert it and it goes away. Cheers, Jes ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem 2015-09-23 11:18 ` Jes Sorensen @ 2015-09-23 11:20 ` Jes Sorensen 2015-09-23 11:42 ` Jes Sorensen 2015-09-24 2:53 ` Neil Brown 1 sibling, 1 reply; 10+ messages in thread From: Jes Sorensen @ 2015-09-23 11:20 UTC (permalink / raw) To: Neil Brown; +Cc: Xiao Ni, linux-raid, yizhan Jes Sorensen <Jes.Sorensen@redhat.com> writes: > Neil Brown <neilb@suse.de> writes: >> Jes Sorensen <Jes.Sorensen@redhat.com> writes: >> >>> Hi Neil, >>> >>> I think we have some bad side effects with this patch: >>> >>> commit 199dc6ed5179251fa6158a461499c24bdd99c836 >>> Author: NeilBrown <neilb@suse.com> >>> Date: Mon Aug 3 13:11:47 2015 +1000 >>> >>> md/raid0: update queue parameter in a safer location. >>> >>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating >>> of queue parameters (e.g. max number of sectors per bio) is >>> done in the wrong place. >>> It should be part of ->run, but it is actually part of ->takeover. >>> This means it happens before level_store() calls: >>> >>> blk_set_stacking_limits(&mddev->queue->limits); >>> >>> Running the '03r0assem' test suite fills my kernel log with output like >>> below. Yi Zhang also had issues where writes failed too. >>> >>> robably something we need to resolve for 4.2-final or revert the >>> offending patch. >>> >>> Cheers, >>> Jes >>> >>> md: bind<loop0> >>> md: bind<loop1> >>> md: bind<loop2> >>> md/raid0:md2: md_size is 116736 sectors. >>> md: RAID0 configuration for md2 - 1 zone >>> md: zone0=[loop0/loop1/loop2] >>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB >>> >>> md2: detected capacity change from 0 to 59768832 >>> bio too big device loop0 (296 > 255) >>> bio too big device loop0 (272 > 255) >> >> 1/ Why do you blame that particular patch? >> >> 2/ Where is that error message coming from? I cannot find "bio too big" >> in the kernel (except in a comment). >> Commit: 54efd50bfd87 ("block: make generic_make_request handle >> arbitrarily sized bios") >> removed the only instance of the error message that I know of. >> >> Which kernel exactly are you testing? > > I blame it because of bisect - I revert that patch and the issue goes > away. > > I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree, > see the bio too large. I revert it and it goes away. Hmmm Xiao tells me that the warning message is not in upstream, but I was sure I reproduced it in the upstream kernel as well. If I screwed up, my apologies, I am going back to my cave and will investigate further. Jes ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem 2015-09-23 11:20 ` Jes Sorensen @ 2015-09-23 11:42 ` Jes Sorensen 0 siblings, 0 replies; 10+ messages in thread From: Jes Sorensen @ 2015-09-23 11:42 UTC (permalink / raw) To: Neil Brown; +Cc: Xiao Ni, linux-raid, yizhan Jes Sorensen <Jes.Sorensen@redhat.com> writes: > Jes Sorensen <Jes.Sorensen@redhat.com> writes: >> Neil Brown <neilb@suse.de> writes: >>> Jes Sorensen <Jes.Sorensen@redhat.com> writes: >>> >>>> Hi Neil, >>>> >>>> I think we have some bad side effects with this patch: >>>> >>>> commit 199dc6ed5179251fa6158a461499c24bdd99c836 >>>> Author: NeilBrown <neilb@suse.com> >>>> Date: Mon Aug 3 13:11:47 2015 +1000 >>>> >>>> md/raid0: update queue parameter in a safer location. >>>> >>>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating >>>> of queue parameters (e.g. max number of sectors per bio) is >>>> done in the wrong place. >>>> It should be part of ->run, but it is actually part of ->takeover. >>>> This means it happens before level_store() calls: >>>> >>>> blk_set_stacking_limits(&mddev->queue->limits); >>>> >>>> Running the '03r0assem' test suite fills my kernel log with output like >>>> below. Yi Zhang also had issues where writes failed too. >>>> >>>> robably something we need to resolve for 4.2-final or revert the >>>> offending patch. >>>> >>>> Cheers, >>>> Jes >>>> >>>> md: bind<loop0> >>>> md: bind<loop1> >>>> md: bind<loop2> >>>> md/raid0:md2: md_size is 116736 sectors. >>>> md: RAID0 configuration for md2 - 1 zone >>>> md: zone0=[loop0/loop1/loop2] >>>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB >>>> >>>> md2: detected capacity change from 0 to 59768832 >>>> bio too big device loop0 (296 > 255) >>>> bio too big device loop0 (272 > 255) >>> >>> 1/ Why do you blame that particular patch? >>> >>> 2/ Where is that error message coming from? I cannot find "bio too big" >>> in the kernel (except in a comment). >>> Commit: 54efd50bfd87 ("block: make generic_make_request handle >>> arbitrarily sized bios") >>> removed the only instance of the error message that I know of. >>> >>> Which kernel exactly are you testing? >> >> I blame it because of bisect - I revert that patch and the issue goes >> away. >> >> I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree, >> see the bio too large. I revert it and it goes away. > > Hmmm Xiao tells me that the warning message is not in upstream, but I > was sure I reproduced it in the upstream kernel as well. If I screwed > up, my apologies, I am going back to my cave and will investigate > further. Digging a bit further, I was testing on a 4.2.0-rc5+ where I saw the issue. That does raise the question of what this patch does to older kernels, since it was submitted it for 2.6.35+. "bio too big" message went away with this commit: commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e Author: Kent Overstreet <kent.overstreet@gmail.com> Date: Thu Apr 23 22:37:18 2015 -0700 block: make generic_make_request handle arbitrarily sized bios Cheers, Jes ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem 2015-09-23 11:18 ` Jes Sorensen 2015-09-23 11:20 ` Jes Sorensen @ 2015-09-24 2:53 ` Neil Brown 2015-09-24 8:48 ` Xiao Ni 2015-09-24 12:59 ` Jes Sorensen 1 sibling, 2 replies; 10+ messages in thread From: Neil Brown @ 2015-09-24 2:53 UTC (permalink / raw) To: Jes Sorensen; +Cc: Xiao Ni, linux-raid, yizhan [-- Attachment #1: Type: text/plain, Size: 3671 bytes --] Jes Sorensen <Jes.Sorensen@redhat.com> writes: > Neil Brown <neilb@suse.de> writes: >> Jes Sorensen <Jes.Sorensen@redhat.com> writes: >> >>> Hi Neil, >>> >>> I think we have some bad side effects with this patch: >>> >>> commit 199dc6ed5179251fa6158a461499c24bdd99c836 >>> Author: NeilBrown <neilb@suse.com> >>> Date: Mon Aug 3 13:11:47 2015 +1000 >>> >>> md/raid0: update queue parameter in a safer location. >>> >>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating >>> of queue parameters (e.g. max number of sectors per bio) is >>> done in the wrong place. >>> It should be part of ->run, but it is actually part of ->takeover. >>> This means it happens before level_store() calls: >>> >>> blk_set_stacking_limits(&mddev->queue->limits); >>> >>> Running the '03r0assem' test suite fills my kernel log with output like >>> below. Yi Zhang also had issues where writes failed too. >>> >>> robably something we need to resolve for 4.2-final or revert the >>> offending patch. >>> >>> Cheers, >>> Jes >>> >>> md: bind<loop0> >>> md: bind<loop1> >>> md: bind<loop2> >>> md/raid0:md2: md_size is 116736 sectors. >>> md: RAID0 configuration for md2 - 1 zone >>> md: zone0=[loop0/loop1/loop2] >>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB >>> >>> md2: detected capacity change from 0 to 59768832 >>> bio too big device loop0 (296 > 255) >>> bio too big device loop0 (272 > 255) >> >> 1/ Why do you blame that particular patch? >> >> 2/ Where is that error message coming from? I cannot find "bio too big" >> in the kernel (except in a comment). >> Commit: 54efd50bfd87 ("block: make generic_make_request handle >> arbitrarily sized bios") >> removed the only instance of the error message that I know of. >> >> Which kernel exactly are you testing? > > I blame it because of bisect - I revert that patch and the issue goes > away. > > I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree, > see the bio too large. I revert it and it goes away. Well that's pretty convincing - thanks. And as you say - it is tagged for -stable so really needs to be fixed. Stares at the code again. And again. Ahhh. that patch moved the blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors); to after disk_stack_limits(...); That is wrong. Could you confirm that this fixes your test? Thanks, NeilBrown diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c index 4a13c3cb940b..0875e5e7e09a 100644 --- a/drivers/md/raid0.c +++ b/drivers/md/raid0.c @@ -431,12 +431,6 @@ static int raid0_run(struct mddev *mddev) struct md_rdev *rdev; bool discard_supported = false; - rdev_for_each(rdev, mddev) { - disk_stack_limits(mddev->gendisk, rdev->bdev, - rdev->data_offset << 9); - if (blk_queue_discard(bdev_get_queue(rdev->bdev))) - discard_supported = true; - } blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors); blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors); blk_queue_max_discard_sectors(mddev->queue, mddev->chunk_sectors); @@ -445,6 +439,12 @@ static int raid0_run(struct mddev *mddev) blk_queue_io_opt(mddev->queue, (mddev->chunk_sectors << 9) * mddev->raid_disks); + rdev_for_each(rdev, mddev) { + disk_stack_limits(mddev->gendisk, rdev->bdev, + rdev->data_offset << 9); + if (blk_queue_discard(bdev_get_queue(rdev->bdev))) + discard_supported = true; + } if (!discard_supported) queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, mddev->queue); else [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 818 bytes --] ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem 2015-09-24 2:53 ` Neil Brown @ 2015-09-24 8:48 ` Xiao Ni 2015-09-24 12:59 ` Jes Sorensen 1 sibling, 0 replies; 10+ messages in thread From: Xiao Ni @ 2015-09-24 8:48 UTC (permalink / raw) To: Neil Brown; +Cc: Jes Sorensen, linux-raid, yizhan ----- Original Message ----- > From: "Neil Brown" <neilb@suse.de> > To: "Jes Sorensen" <Jes.Sorensen@redhat.com> > Cc: "Xiao Ni" <xni@redhat.com>, "linux-raid" <linux-raid@vger.kernel.org>, yizhan@redhat.com > Sent: Thursday, September 24, 2015 10:53:06 AM > Subject: Re: Bad raid0 bio too large problem > > Jes Sorensen <Jes.Sorensen@redhat.com> writes: > > > Neil Brown <neilb@suse.de> writes: > >> Jes Sorensen <Jes.Sorensen@redhat.com> writes: > >> > >>> Hi Neil, > >>> > >>> I think we have some bad side effects with this patch: > >>> > >>> commit 199dc6ed5179251fa6158a461499c24bdd99c836 > >>> Author: NeilBrown <neilb@suse.com> > >>> Date: Mon Aug 3 13:11:47 2015 +1000 > >>> > >>> md/raid0: update queue parameter in a safer location. > >>> > >>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating > >>> of queue parameters (e.g. max number of sectors per bio) is > >>> done in the wrong place. > >>> It should be part of ->run, but it is actually part of ->takeover. > >>> This means it happens before level_store() calls: > >>> > >>> blk_set_stacking_limits(&mddev->queue->limits); > >>> > >>> Running the '03r0assem' test suite fills my kernel log with output like > >>> below. Yi Zhang also had issues where writes failed too. > >>> > >>> robably something we need to resolve for 4.2-final or revert the > >>> offending patch. > >>> > >>> Cheers, > >>> Jes > >>> > >>> md: bind<loop0> > >>> md: bind<loop1> > >>> md: bind<loop2> > >>> md/raid0:md2: md_size is 116736 sectors. > >>> md: RAID0 configuration for md2 - 1 zone > >>> md: zone0=[loop0/loop1/loop2] > >>> zone-offset= 0KB, device-offset= 0KB, size= > >>> 58368KB > >>> > >>> md2: detected capacity change from 0 to 59768832 > >>> bio too big device loop0 (296 > 255) > >>> bio too big device loop0 (272 > 255) > >> > >> 1/ Why do you blame that particular patch? > >> > >> 2/ Where is that error message coming from? I cannot find "bio too big" > >> in the kernel (except in a comment). > >> Commit: 54efd50bfd87 ("block: make generic_make_request handle > >> arbitrarily sized bios") > >> removed the only instance of the error message that I know of. > >> > >> Which kernel exactly are you testing? > > > > I blame it because of bisect - I revert that patch and the issue goes > > away. > > > > I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree, > > see the bio too large. I revert it and it goes away. > > Well that's pretty convincing - thanks. > And as you say - it is tagged for -stable so really needs to be fixed. > > Stares at the code again. And again. > > Ahhh. that patch moved the > blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors); > to after > disk_stack_limits(...); > > That is wrong. > > Could you confirm that this fixes your test? > > Thanks, > NeilBrown > > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c > index 4a13c3cb940b..0875e5e7e09a 100644 > --- a/drivers/md/raid0.c > +++ b/drivers/md/raid0.c > @@ -431,12 +431,6 @@ static int raid0_run(struct mddev *mddev) > struct md_rdev *rdev; > bool discard_supported = false; > > - rdev_for_each(rdev, mddev) { > - disk_stack_limits(mddev->gendisk, rdev->bdev, > - rdev->data_offset << 9); > - if (blk_queue_discard(bdev_get_queue(rdev->bdev))) > - discard_supported = true; > - } > blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors); > blk_queue_max_write_same_sectors(mddev->queue, mddev->chunk_sectors); > blk_queue_max_discard_sectors(mddev->queue, mddev->chunk_sectors); > @@ -445,6 +439,12 @@ static int raid0_run(struct mddev *mddev) > blk_queue_io_opt(mddev->queue, > (mddev->chunk_sectors << 9) * mddev->raid_disks); > > + rdev_for_each(rdev, mddev) { > + disk_stack_limits(mddev->gendisk, rdev->bdev, > + rdev->data_offset << 9); > + if (blk_queue_discard(bdev_get_queue(rdev->bdev))) > + discard_supported = true; > + } > if (!discard_supported) > queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, mddev->queue); > else > Hi Neil, Jes The problem is fixed with this patch. Best Regards Xiao ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem 2015-09-24 2:53 ` Neil Brown 2015-09-24 8:48 ` Xiao Ni @ 2015-09-24 12:59 ` Jes Sorensen 2015-09-25 4:23 ` Neil Brown 1 sibling, 1 reply; 10+ messages in thread From: Jes Sorensen @ 2015-09-24 12:59 UTC (permalink / raw) To: Neil Brown; +Cc: Xiao Ni, linux-raid, yizhan Neil Brown <neilb@suse.de> writes: > Jes Sorensen <Jes.Sorensen@redhat.com> writes: > >> Neil Brown <neilb@suse.de> writes: >>> Jes Sorensen <Jes.Sorensen@redhat.com> writes: >>> >>>> Hi Neil, >>>> >>>> I think we have some bad side effects with this patch: >>>> >>>> commit 199dc6ed5179251fa6158a461499c24bdd99c836 >>>> Author: NeilBrown <neilb@suse.com> >>>> Date: Mon Aug 3 13:11:47 2015 +1000 >>>> >>>> md/raid0: update queue parameter in a safer location. >>>> >>>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating >>>> of queue parameters (e.g. max number of sectors per bio) is >>>> done in the wrong place. >>>> It should be part of ->run, but it is actually part of ->takeover. >>>> This means it happens before level_store() calls: >>>> >>>> blk_set_stacking_limits(&mddev->queue->limits); >>>> >>>> Running the '03r0assem' test suite fills my kernel log with output like >>>> below. Yi Zhang also had issues where writes failed too. >>>> >>>> robably something we need to resolve for 4.2-final or revert the >>>> offending patch. >>>> >>>> Cheers, >>>> Jes >>>> >>>> md: bind<loop0> >>>> md: bind<loop1> >>>> md: bind<loop2> >>>> md/raid0:md2: md_size is 116736 sectors. >>>> md: RAID0 configuration for md2 - 1 zone >>>> md: zone0=[loop0/loop1/loop2] >>>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB >>>> >>>> md2: detected capacity change from 0 to 59768832 >>>> bio too big device loop0 (296 > 255) >>>> bio too big device loop0 (272 > 255) >>> >>> 1/ Why do you blame that particular patch? >>> >>> 2/ Where is that error message coming from? I cannot find "bio too big" >>> in the kernel (except in a comment). >>> Commit: 54efd50bfd87 ("block: make generic_make_request handle >>> arbitrarily sized bios") >>> removed the only instance of the error message that I know of. >>> >>> Which kernel exactly are you testing? >> >> I blame it because of bisect - I revert that patch and the issue goes >> away. >> >> I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree, >> see the bio too large. I revert it and it goes away. > > Well that's pretty convincing - thanks. > And as you say - it is tagged for -stable so really needs to be fixed. > > Stares at the code again. And again. > > Ahhh. that patch moved the > blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors); > to after > disk_stack_limits(...); > > That is wrong. > > Could you confirm that this fixes your test? I refuse to do that! .... since Xiao beat me to it! Thanks! I was half way bisecting my way through it last night. For some reason the problem was reproducible in 4.2 if I applied the offending patch, but not in 4.3-rc2. Any chance you'll push these to your git tree in the near future? Thanks! Jes ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bad raid0 bio too large problem 2015-09-24 12:59 ` Jes Sorensen @ 2015-09-25 4:23 ` Neil Brown 0 siblings, 0 replies; 10+ messages in thread From: Neil Brown @ 2015-09-25 4:23 UTC (permalink / raw) To: Jes Sorensen; +Cc: Xiao Ni, linux-raid, yizhan [-- Attachment #1: Type: text/plain, Size: 3060 bytes --] Jes Sorensen <Jes.Sorensen@redhat.com> writes: > Neil Brown <neilb@suse.de> writes: >> Jes Sorensen <Jes.Sorensen@redhat.com> writes: >> >>> Neil Brown <neilb@suse.de> writes: >>>> Jes Sorensen <Jes.Sorensen@redhat.com> writes: >>>> >>>>> Hi Neil, >>>>> >>>>> I think we have some bad side effects with this patch: >>>>> >>>>> commit 199dc6ed5179251fa6158a461499c24bdd99c836 >>>>> Author: NeilBrown <neilb@suse.com> >>>>> Date: Mon Aug 3 13:11:47 2015 +1000 >>>>> >>>>> md/raid0: update queue parameter in a safer location. >>>>> >>>>> When a (e.g.) RAID5 array is reshaped to RAID0, the updating >>>>> of queue parameters (e.g. max number of sectors per bio) is >>>>> done in the wrong place. >>>>> It should be part of ->run, but it is actually part of ->takeover. >>>>> This means it happens before level_store() calls: >>>>> >>>>> blk_set_stacking_limits(&mddev->queue->limits); >>>>> >>>>> Running the '03r0assem' test suite fills my kernel log with output like >>>>> below. Yi Zhang also had issues where writes failed too. >>>>> >>>>> robably something we need to resolve for 4.2-final or revert the >>>>> offending patch. >>>>> >>>>> Cheers, >>>>> Jes >>>>> >>>>> md: bind<loop0> >>>>> md: bind<loop1> >>>>> md: bind<loop2> >>>>> md/raid0:md2: md_size is 116736 sectors. >>>>> md: RAID0 configuration for md2 - 1 zone >>>>> md: zone0=[loop0/loop1/loop2] >>>>> zone-offset= 0KB, device-offset= 0KB, size= 58368KB >>>>> >>>>> md2: detected capacity change from 0 to 59768832 >>>>> bio too big device loop0 (296 > 255) >>>>> bio too big device loop0 (272 > 255) >>>> >>>> 1/ Why do you blame that particular patch? >>>> >>>> 2/ Where is that error message coming from? I cannot find "bio too big" >>>> in the kernel (except in a comment). >>>> Commit: 54efd50bfd87 ("block: make generic_make_request handle >>>> arbitrarily sized bios") >>>> removed the only instance of the error message that I know of. >>>> >>>> Which kernel exactly are you testing? >>> >>> I blame it because of bisect - I revert that patch and the issue goes >>> away. >>> >>> I checked out 199dc6ed5179251fa6158a461499c24bdd99c836 in Linus' tree, >>> see the bio too large. I revert it and it goes away. >> >> Well that's pretty convincing - thanks. >> And as you say - it is tagged for -stable so really needs to be fixed. >> >> Stares at the code again. And again. >> >> Ahhh. that patch moved the >> blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors); >> to after >> disk_stack_limits(...); >> >> That is wrong. >> >> Could you confirm that this fixes your test? > > I refuse to do that! .... since Xiao beat me to it! Thanks! > > I was half way bisecting my way through it last night. For some reason > the problem was reproducible in 4.2 if I applied the offending patch, > but not in 4.3-rc2. > > Any chance you'll push these to your git tree in the near future? Pushed. Thanks, NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 818 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2015-09-25 4:23 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-09-22 15:30 Bad raid0 bio too large problem Jes Sorensen 2015-09-22 16:07 ` Jes Sorensen 2015-09-23 2:25 ` Neil Brown 2015-09-23 11:18 ` Jes Sorensen 2015-09-23 11:20 ` Jes Sorensen 2015-09-23 11:42 ` Jes Sorensen 2015-09-24 2:53 ` Neil Brown 2015-09-24 8:48 ` Xiao Ni 2015-09-24 12:59 ` Jes Sorensen 2015-09-25 4:23 ` Neil Brown
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).