* Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
@ 2017-08-08 10:23 Tom Yan
2017-08-09 13:38 ` hch
0 siblings, 1 reply; 13+ messages in thread
From: Tom Yan @ 2017-08-08 10:23 UTC (permalink / raw)
To: dm-devel@redhat.com, gmazyland, axboe@fb.com, axboe, hch@lst.de,
linux-block
(Sorry for the spam. Having hard time moving the thread from
outlook.com to gmail.com for linux-block. Please reply to this, thank
you.)
Hi again,
I think I have sort of identified the problem. It appears to me that
both the block layer and dm-crypt is defected on handling this.
First of all, check out the "fallback" zero out implementation, which
is used in this case, here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/blk-lib.c?h=v4.12#n309
>From the outer loop, it seem to imply that this should be done in
multiple "bio"s, if the request (original "nr_sects") is larger than
BIO_MAX_PAGES:
...
while (nr_sects != 0) {
bio = next_bio(bio, min(nr_sects, (sector_t)BIO_MAX_PAGES),
gfp_mask);
...
}
...
However, there is a inner loop:
...
while (nr_sects != 0) {
sz = min((sector_t) PAGE_SIZE >> 9 , nr_sects);
bi_size = bio_add_page(bio, ZERO_PAGE(0), sz << 9, 0);
nr_sects -= bi_size >> 9;
sector += bi_size >> 9;
if (bi_size < (sz << 9))
break;
}
...
which apparently would loop over the whole request on its own, making
the outer loop a bogus one.
The request ends up being done in a single (huge) bio. When the bio is
passed on to dm-crypt, it appears that dm-crypt will not split the bio
either when it allocates buffer for conversion/encryption:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/md/dm-crypt.c?h=v4.12#n1694
which leads to possible enormous uptake of memory, causing OOM / kernel panic.
There seems to be some measure that is suppose to split large bio though:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/md/dm-crypt.c?h=v4.12#n2787
Apparently it is called before kcryptd_crypt_write_convert() /
crypt_alloc_buffer(). However, I don't really parse
dm_accept_partial_bio() (or the comment about it) so I don't really
know what it actually does or how it does it. Neither can I see it
helps in reality anyway.
Here is another test case that shows the problem:
https://ptpb.pw/BWWo.png
https://ptpb.pw/aRTE.png
Regards,
Tom Yan
> From: Tom Yan
> Sent: Monday, August 7, 2017 9:58 AM
> To: dm-devel@redhat.com
> Subject: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
>
> Hi all,
>
> When I do the following:
>
> cryptsetup open /dev/sdX[Y] rand --type plain --key-file /dev/random;
> blkdiscard -z /dev/mapper/rand
>
> Some OOM killings and a kernel panic occur. Here is a screenshot:
> https://gitlab.com/cryptsetup/cryptsetup/uploads/207ffdada52f3172f54a014c67159625/DSC_0129.JPG
>
> Similar issue is NOT found in doing `cat /dev/zero > /dev/mapper/rand`, or `blkdiscard -z /dev/sdX[Y]` (without opening a dm-crypt on it), on the same hardware and configuration.
>
> Note that `blkdiscard -z` does not trigger the BLKDISCARD ioctl but the BLKZEROOUT ioctl. And the devices I have been testing on are USB devices that does NOT support WRITE SAME or UNMAP.
>
> Regards,
> Tom Yan
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
2017-08-08 10:23 [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic Tom Yan
@ 2017-08-09 13:38 ` hch
2017-08-09 18:21 ` [dm-devel] " Ondrej Kozina
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: hch @ 2017-08-09 13:38 UTC (permalink / raw)
To: Tom Yan
Cc: dm-devel@redhat.com, gmazyland, axboe@fb.com, axboe, hch@lst.de,
linux-block
Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
"block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
2017-08-09 13:38 ` hch
@ 2017-08-09 18:21 ` Ondrej Kozina
2017-08-09 23:27 ` Tom Yan
2017-08-14 2:47 ` [dm-devel] " Mikulas Patocka
2 siblings, 0 replies; 13+ messages in thread
From: Ondrej Kozina @ 2017-08-09 18:21 UTC (permalink / raw)
To: hch@lst.de, Tom Yan
Cc: axboe, linux-block, axboe@fb.com, dm-devel@redhat.com, gmazyland
On 08/09/2017 03:38 PM, hch@lst.de wrote:
> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
I crashed the 4.13-rc4 with the test above (blkdiscard -z on dm-crypt
dev). Unfortunately I hadn't have my VM configured properly so the only
info I got before it crashed was basically OOM. I'm going to play with
it tomorrow and will let you know.
O.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
2017-08-09 13:38 ` hch
2017-08-09 18:21 ` [dm-devel] " Ondrej Kozina
@ 2017-08-09 23:27 ` Tom Yan
2017-08-10 10:14 ` Tom Yan
2017-08-14 2:47 ` [dm-devel] " Mikulas Patocka
2 siblings, 1 reply; 13+ messages in thread
From: Tom Yan @ 2017-08-09 23:27 UTC (permalink / raw)
To: hch@lst.de
Cc: dm-devel@redhat.com, Milan Broz, axboe@fb.com, axboe, linux-block,
okozina
I haven't really tested but I was aware of the commit before I send my
last email. It doesn't seem relevant to be honest, because it doesn't
change the fact that the inner loop wil only end if the whole request
has been looped over. So still one big bio.
There are a few things that seem suspicious to me. First of all, the
inner loop has an if-break at its end that seem to practically do
nothing, especially in terms of making the inner loop end when the bio
reach its expected size (BIO_MAX_PAGES?).
bi_size = bio_add_page(bio, ZERO_PAGE(0), sz, 0);
If you take a look at bio_add_page():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/bio.c?h=v4.13-rc4#n820
it will basically always return "len", which is "sz" here, unchanged.
if (bi_size < sz)
break;
So this pretty much never happens, with two exceptions:
if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
return 0;
which is most likely irrelevant in this case,
if (bio->bi_vcnt >= bio->bi_max_vecs)
return 0;
which seem to matter in this case. However:
if (page == bv->bv_page &&
offset == bv->bv_offset + bv->bv_len) {
bv->bv_len += len;
goto done;
}
...
if (bio->bi_vcnt >= bio->bi_max_vecs)
return 0;
...
bio->bi_vcnt++;
done:
bio->bi_iter.bi_size += len;
return len;
So if the first condition in the above quote is matched, the exception
would never happen either.
On 9 August 2017 at 21:38, hch@lst.de <hch@lst.de> wrote:
> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
2017-08-09 23:27 ` Tom Yan
@ 2017-08-10 10:14 ` Tom Yan
0 siblings, 0 replies; 13+ messages in thread
From: Tom Yan @ 2017-08-10 10:14 UTC (permalink / raw)
To: hch@lst.de
Cc: dm-devel@redhat.com, Milan Broz, axboe@fb.com, axboe, linux-block,
okozina
Never mind, I guess I was wrong, coz bio_add_page() is always called
with "offset" = 0 here, so:
offset == bv->bv_offset + bv->bv_len
is never matched. Hence,
if (bio->bi_vcnt >= bio->bi_max_vecs)
return 0;
will trigger the if-break.
So maybe this is after all just a dm-crypt issue on handling multiple
bios created with next_bio (a bio chain)?
On 10 August 2017 at 07:27, Tom Yan <tom.ty89@gmail.com> wrote:
> I haven't really tested but I was aware of the commit before I send my
> last email. It doesn't seem relevant to be honest, because it doesn't
> change the fact that the inner loop wil only end if the whole request
> has been looped over. So still one big bio.
>
> There are a few things that seem suspicious to me. First of all, the
> inner loop has an if-break at its end that seem to practically do
> nothing, especially in terms of making the inner loop end when the bio
> reach its expected size (BIO_MAX_PAGES?).
>
> bi_size = bio_add_page(bio, ZERO_PAGE(0), sz, 0);
>
> If you take a look at bio_add_page():
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/bio.c?h=v4.13-rc4#n820
>
> it will basically always return "len", which is "sz" here, unchanged.
>
> if (bi_size < sz)
> break;
>
> So this pretty much never happens, with two exceptions:
>
> if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
> return 0;
>
> which is most likely irrelevant in this case,
>
> if (bio->bi_vcnt >= bio->bi_max_vecs)
> return 0;
>
> which seem to matter in this case. However:
>
> if (page == bv->bv_page &&
> offset == bv->bv_offset + bv->bv_len) {
> bv->bv_len += len;
> goto done;
> }
> ...
> if (bio->bi_vcnt >= bio->bi_max_vecs)
> return 0;
> ...
> bio->bi_vcnt++;
> done:
> bio->bi_iter.bi_size += len;
> return len;
>
> So if the first condition in the above quote is matched, the exception
> would never happen either.
>
> On 9 August 2017 at 21:38, hch@lst.de <hch@lst.de> wrote:
>> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
>> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
2017-08-09 13:38 ` hch
2017-08-09 18:21 ` [dm-devel] " Ondrej Kozina
2017-08-09 23:27 ` Tom Yan
@ 2017-08-14 2:47 ` Mikulas Patocka
2017-08-14 4:04 ` Damien Le Moal
2 siblings, 1 reply; 13+ messages in thread
From: Mikulas Patocka @ 2017-08-14 2:47 UTC (permalink / raw)
To: hch@lst.de, Damien Le Moal
Cc: Tom Yan, axboe, linux-block, axboe@fb.com, dm-devel@redhat.com,
gmazyland
On Wed, 9 Aug 2017, hch@lst.de wrote:
> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
I think that patch is incorrect. sector_t may be a 32-bit type and
nr_sects << 9 may overflow.
static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
{
sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
}
Mikulas
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
2017-08-14 2:47 ` [dm-devel] " Mikulas Patocka
@ 2017-08-14 4:04 ` Damien Le Moal
2017-08-15 0:01 ` [PATCH] fix an integer overflow in __blkdev_sectors_to_bio_pages Mikulas Patocka
0 siblings, 1 reply; 13+ messages in thread
From: Damien Le Moal @ 2017-08-14 4:04 UTC (permalink / raw)
To: hch@lst.de, mpatocka@redhat.com
Cc: tom.ty89@gmail.com, dm-devel@redhat.com,
linux-block@vger.kernel.org, gmazyland@gmail.com, axboe@fb.com,
axboe@kernel.dk
T24gU3VuLCAyMDE3LTA4LTEzIGF0IDIyOjQ3IC0wNDAwLCBNaWt1bGFzIFBhdG9ja2Egd3JvdGU6
DQo+IA0KPiBPbiBXZWQsIDkgQXVnIDIwMTcsIGhjaEBsc3QuZGUgd3JvdGU6DQo+IA0KPiA+IERv
ZXMgY29tbWl0IDYxNWQyMmE1MWMwNDg1NmVmZTYyYWY2ZTFkNWI0NTBhYWY1Y2MyYzANCj4gPiAi
YmxvY2s6IEZpeCBfX2Jsa2Rldl9pc3N1ZV96ZXJvb3V0IGxvb3AiIGZpeCB0aGUgaXNzdWUgZm9y
IHlvdT8NCj4gPiANCj4gPiAtLQ0KPiA+IGRtLWRldmVsIG1haWxpbmcgbGlzdA0KPiA+IGRtLWRl
dmVsQHJlZGhhdC5jb20NCj4gPiBodHRwczovL3d3dy5yZWRoYXQuY29tL21haWxtYW4vbGlzdGlu
Zm8vZG0tZGV2ZWwNCj4gDQo+IEkgdGhpbmsgdGhhdCBwYXRjaCBpcyBpbmNvcnJlY3QuIHNlY3Rv
cl90IG1heSBiZSBhIDMyLWJpdCB0eXBlIGFuZCANCj4gbnJfc2VjdHMgPDwgOSBtYXkgb3ZlcmZs
b3cuDQo+IA0KPiBzdGF0aWMgdW5zaWduZWQgaW50IF9fYmxrZGV2X3NlY3RvcnNfdG9fYmlvX3Bh
Z2VzKHNlY3Rvcl90IG5yX3NlY3RzKQ0KPiB7DQo+ICAgICAgICBzZWN0b3JfdCBieXRlcyA9IChu
cl9zZWN0cyA8PCA5KSArIFBBR0VfU0laRSAtIDE7DQo+IA0KPiAgICAgICAgcmV0dXJuIG1pbihi
eXRlcyA+PiBQQUdFX1NISUZULCAoc2VjdG9yX3QpQklPX01BWF9QQUdFUyk7DQo+IH0NCj4gDQo+
IE1pa3VsYXMNCg0KTWlrdWxhcywNCg0KRG9lcyB0aGUgZm9sbHdpbmcgcGF0Y2ggZml4IHRoZSBw
cm9ibGVtID8NCg0KRnJvbSA5NDdiM2NmNDFlNzU5YjJiMjNmNjg0ZTIxNWU2NTFkMGM4MDM3Zjg4
IE1vbiBTZXAgMTcgMDA6MDA6MDAgMjAwMQ0KRnJvbTogRGFtaWVuIExlIE1vYWwgPGRhbWllbi5s
ZW1vYWxAd2RjLmNvbT4NCkRhdGU6IE1vbiwgMTQgQXVnIDIwMTcgMTM6MDE6MTYgKzA5MDANClN1
YmplY3Q6IFtQQVRDSF0gYmxvY2s6IEZpeCBfX2Jsa2Rldl9zZWN0b3JzX3RvX2Jpb19wYWdlcygp
DQoNCk9uIDMyYml0IHN5c3RlbXMgd2hlcmUgc2VjdG9yX3QgaXMgYSAzMmJpdHMgdHlwZSwgdGhl
IGNhbGN1bGF0aW9uIG9mDQpieXRlcyBtYXkgb3ZlcmZsb3cuIFVzZSB0aGUgdTY0IHR5cGUgZm9y
IHRoZSBsb2NhbCBjYWxjdWxhdGlvbiB0byBhdm9pZA0Kb3ZlcmZsb3dzLg0KDQpTaWduZWQtb2Zm
LWJ5OiBEYW1pZW4gTGUgTW9hbCA8ZGFtaWVuLmxlbW9hbEB3ZGMuY29tPg0KLS0tDQogYmxvY2sv
YmxrLWxpYi5jIHwgNCArKy0tDQogMSBmaWxlIGNoYW5nZWQsIDIgaW5zZXJ0aW9ucygrKSwgMiBk
ZWxldGlvbnMoLSkNCg0KZGlmZiAtLWdpdCBhL2Jsb2NrL2Jsay1saWIuYyBiL2Jsb2NrL2Jsay1s
aWIuYw0KaW5kZXggM2ZlMGFlYzkwNTk3Li5jY2YyMmRiYTIxZjAgMTAwNjQ0DQotLS0gYS9ibG9j
ay9ibGstbGliLmMNCisrKyBiL2Jsb2NrL2Jsay1saWIuYw0KQEAgLTI2OSw5ICsyNjksOSBAQCBz
dGF0aWMgaW50IF9fYmxrZGV2X2lzc3VlX3dyaXRlX3plcm9lcyhzdHJ1Y3QgYmxvY2tfZGV2aWNl
DQoqYmRldiwNCiAgKi8NCiBzdGF0aWMgdW5zaWduZWQgaW50IF9fYmxrZGV2X3NlY3RvcnNfdG9f
YmlvX3BhZ2VzKHNlY3Rvcl90IG5yX3NlY3RzKQ0KIHsNCi0Jc2VjdG9yX3QgYnl0ZXMgPSAobnJf
c2VjdHMgPDwgOSkgKyBQQUdFX1NJWkUgLSAxOw0KKwl1NjQgYnl0ZXMgPSAoKHU2NClucl9zZWN0
cyA8PCA5KSArIFBBR0VfU0laRSAtIDE7DQogDQotCXJldHVybiBtaW4oYnl0ZXMgPj4gUEFHRV9T
SElGVCwgKHNlY3Rvcl90KUJJT19NQVhfUEFHRVMpOw0KKwlyZXR1cm4gbWluKGJ5dGVzID4+IFBB
R0VfU0hJRlQsICh1NjQpQklPX01BWF9QQUdFUyk7DQogfQ0KIA0KIC8qKg0KLS0gDQoyLjEzLjQN
Cg0KDQotLSANCkRhbWllbiBMZSBNb2FsDQpXZXN0ZXJuIERpZ2l0YWw=
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] fix an integer overflow in __blkdev_sectors_to_bio_pages
2017-08-14 4:04 ` Damien Le Moal
@ 2017-08-15 0:01 ` Mikulas Patocka
2017-08-15 3:43 ` Damien Le Moal
2017-09-11 14:58 ` Mike Snitzer
0 siblings, 2 replies; 13+ messages in thread
From: Mikulas Patocka @ 2017-08-15 0:01 UTC (permalink / raw)
To: Damien Le Moal
Cc: hch@lst.de, tom.ty89@gmail.com, dm-devel@redhat.com,
linux-block@vger.kernel.org, gmazyland@gmail.com, axboe@fb.com,
axboe@kernel.dk
On Mon, 14 Aug 2017, Damien Le Moal wrote:
> On Sun, 2017-08-13 at 22:47 -0400, Mikulas Patocka wrote:
> >
> > On Wed, 9 Aug 2017, hch@lst.de wrote:
> >
> > > Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> > > "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
> > >
> > > --
> > > dm-devel mailing list
> > > dm-devel@redhat.com
> > > https://www.redhat.com/mailman/listinfo/dm-devel
> >
> > I think that patch is incorrect. sector_t may be a 32-bit type and
> > nr_sects << 9 may overflow.
> >
> > static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
> > {
> > sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
> >
> > return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
> > }
> >
> > Mikulas
>
> Mikulas,
>
> Does the follwing patch fix the problem ?
>
> From 947b3cf41e759b2b23f684e215e651d0c8037f88 Mon Sep 17 00:00:00 2001
> From: Damien Le Moal <damien.lemoal@wdc.com>
> Date: Mon, 14 Aug 2017 13:01:16 +0900
> Subject: [PATCH] block: Fix __blkdev_sectors_to_bio_pages()
>
> On 32bit systems where sector_t is a 32bits type, the calculation of
> bytes may overflow. Use the u64 type for the local calculation to avoid
> overflows.
>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
> block/blk-lib.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 3fe0aec90597..ccf22dba21f0 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(struct block_device
> *bdev,
> */
> static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
> {
> - sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
> + u64 bytes = ((u64)nr_sects << 9) + PAGE_SIZE - 1;
>
> - return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
> + return min(bytes >> PAGE_SHIFT, (u64)BIO_MAX_PAGES);
> }
>
It's OK, but it is not needed to use 64-bit arithmetic here if all we need
is to shift the value right. Here I submit a simplified patch, using the
macro DIV_ROUND_UP_SECTOR_T (the macro gets optimized to just an addition
and right shift).
From: Mikulas Patocka <mpatocka@redhat.com>
Fix possible integer overflow in __blkdev_sectors_to_bio_pages if sector_t
is 32-bit.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: 615d22a51c04 ("block: Fix __blkdev_issue_zeroout loop")
---
block/blk-lib.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: linux-2.6/block/blk-lib.c
===================================================================
--- linux-2.6.orig/block/blk-lib.c
+++ linux-2.6/block/blk-lib.c
@@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(s
*/
static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
{
- sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
+ sector_t pages = DIV_ROUND_UP_SECTOR_T(nr_sects, PAGE_SIZE / 512);
- return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
+ return min(pages, (sector_t)BIO_MAX_PAGES);
}
/**
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH] fix an integer overflow in __blkdev_sectors_to_bio_pages
2017-08-15 0:01 ` [PATCH] fix an integer overflow in __blkdev_sectors_to_bio_pages Mikulas Patocka
@ 2017-08-15 3:43 ` Damien Le Moal
2017-09-11 14:58 ` Mike Snitzer
1 sibling, 0 replies; 13+ messages in thread
From: Damien Le Moal @ 2017-08-15 3:43 UTC (permalink / raw)
To: Mikulas Patocka
Cc: hch@lst.de, tom.ty89@gmail.com, dm-devel@redhat.com,
linux-block@vger.kernel.org, gmazyland@gmail.com, axboe@fb.com,
axboe@kernel.dk
On 8/15/17 09:01, Mikulas Patocka wrote:
>
>
> On Mon, 14 Aug 2017, Damien Le Moal wrote:
>
>> On Sun, 2017-08-13 at 22:47 -0400, Mikulas Patocka wrote:
>>>
>>> On Wed, 9 Aug 2017, hch@lst.de wrote:
>>>
>>>> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
>>>> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
>>>>
>>>> --
>>>> dm-devel mailing list
>>>> dm-devel@redhat.com
>>>> https://www.redhat.com/mailman/listinfo/dm-devel
>>>
>>> I think that patch is incorrect. sector_t may be a 32-bit type and
>>> nr_sects << 9 may overflow.
>>>
>>> static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
>>> {
>>> sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
>>>
>>> return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
>>> }
>>>
>>> Mikulas
>>
>> Mikulas,
>>
>> Does the follwing patch fix the problem ?
>>
>> From 947b3cf41e759b2b23f684e215e651d0c8037f88 Mon Sep 17 00:00:00 2001
>> From: Damien Le Moal <damien.lemoal@wdc.com>
>> Date: Mon, 14 Aug 2017 13:01:16 +0900
>> Subject: [PATCH] block: Fix __blkdev_sectors_to_bio_pages()
>>
>> On 32bit systems where sector_t is a 32bits type, the calculation of
>> bytes may overflow. Use the u64 type for the local calculation to avoid
>> overflows.
>>
>> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
>> ---
>> block/blk-lib.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/blk-lib.c b/block/blk-lib.c
>> index 3fe0aec90597..ccf22dba21f0 100644
>> --- a/block/blk-lib.c
>> +++ b/block/blk-lib.c
>> @@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(struct block_device
>> *bdev,
>> */
>> static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
>> {
>> - sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
>> + u64 bytes = ((u64)nr_sects << 9) + PAGE_SIZE - 1;
>>
>> - return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
>> + return min(bytes >> PAGE_SHIFT, (u64)BIO_MAX_PAGES);
>> }
>>
>
> It's OK, but it is not needed to use 64-bit arithmetic here if all we need
> is to shift the value right. Here I submit a simplified patch, using the
> macro DIV_ROUND_UP_SECTOR_T (the macro gets optimized to just an addition
> and right shift).
>
>
>
> From: Mikulas Patocka <mpatocka@redhat.com>
>
> Fix possible integer overflow in __blkdev_sectors_to_bio_pages if sector_t
> is 32-bit.
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Fixes: 615d22a51c04 ("block: Fix __blkdev_issue_zeroout loop")
>
> ---
> block/blk-lib.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/block/blk-lib.c
> ===================================================================
> --- linux-2.6.orig/block/blk-lib.c
> +++ linux-2.6/block/blk-lib.c
> @@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(s
> */
> static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
> {
> - sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
> + sector_t pages = DIV_ROUND_UP_SECTOR_T(nr_sects, PAGE_SIZE / 512);
>
> - return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
> + return min(pages, (sector_t)BIO_MAX_PAGES);
> }
>
> /**
>
Nice ! Thank you.
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
--
Damien Le Moal,
Western Digital
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: fix an integer overflow in __blkdev_sectors_to_bio_pages
2017-08-15 0:01 ` [PATCH] fix an integer overflow in __blkdev_sectors_to_bio_pages Mikulas Patocka
2017-08-15 3:43 ` Damien Le Moal
@ 2017-09-11 14:58 ` Mike Snitzer
2017-09-11 15:47 ` Jens Axboe
1 sibling, 1 reply; 13+ messages in thread
From: Mike Snitzer @ 2017-09-11 14:58 UTC (permalink / raw)
To: axboe, Mikulas Patocka
Cc: Damien Le Moal, axboe@kernel.dk, axboe@fb.com, hch@lst.de,
linux-block@vger.kernel.org, dm-devel@redhat.com,
tom.ty89@gmail.com, gmazyland@gmail.com
On Mon, Aug 14 2017 at 8:01pm -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:
>
>
> On Mon, 14 Aug 2017, Damien Le Moal wrote:
>
> > On Sun, 2017-08-13 at 22:47 -0400, Mikulas Patocka wrote:
> > >
> > > On Wed, 9 Aug 2017, hch@lst.de wrote:
> > >
> > > > Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> > > > "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
> > > >
> > > > --
> > > > dm-devel mailing list
> > > > dm-devel@redhat.com
> > > > https://www.redhat.com/mailman/listinfo/dm-devel
> > >
> > > I think that patch is incorrect. sector_t may be a 32-bit type and
> > > nr_sects << 9 may overflow.
> > >
> > > static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
> > > {
> > > sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
> > >
> > > return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
> > > }
> > >
> > > Mikulas
> >
> > Mikulas,
> >
> > Does the follwing patch fix the problem ?
> >
> > From 947b3cf41e759b2b23f684e215e651d0c8037f88 Mon Sep 17 00:00:00 2001
> > From: Damien Le Moal <damien.lemoal@wdc.com>
> > Date: Mon, 14 Aug 2017 13:01:16 +0900
> > Subject: [PATCH] block: Fix __blkdev_sectors_to_bio_pages()
> >
> > On 32bit systems where sector_t is a 32bits type, the calculation of
> > bytes may overflow. Use the u64 type for the local calculation to avoid
> > overflows.
> >
> > Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> > ---
> > block/blk-lib.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/block/blk-lib.c b/block/blk-lib.c
> > index 3fe0aec90597..ccf22dba21f0 100644
> > --- a/block/blk-lib.c
> > +++ b/block/blk-lib.c
> > @@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(struct block_device
> > *bdev,
> > */
> > static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
> > {
> > - sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
> > + u64 bytes = ((u64)nr_sects << 9) + PAGE_SIZE - 1;
> >
> > - return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
> > + return min(bytes >> PAGE_SHIFT, (u64)BIO_MAX_PAGES);
> > }
> >
>
> It's OK, but it is not needed to use 64-bit arithmetic here if all we need
> is to shift the value right. Here I submit a simplified patch, using the
> macro DIV_ROUND_UP_SECTOR_T (the macro gets optimized to just an addition
> and right shift).
>
>
>
> From: Mikulas Patocka <mpatocka@redhat.com>
>
> Fix possible integer overflow in __blkdev_sectors_to_bio_pages if sector_t
> is 32-bit.
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Fixes: 615d22a51c04 ("block: Fix __blkdev_issue_zeroout loop")
>
> ---
> block/blk-lib.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/block/blk-lib.c
> ===================================================================
> --- linux-2.6.orig/block/blk-lib.c
> +++ linux-2.6/block/blk-lib.c
> @@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(s
> */
> static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
> {
> - sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
> + sector_t pages = DIV_ROUND_UP_SECTOR_T(nr_sects, PAGE_SIZE / 512);
>
> - return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
> + return min(pages, (sector_t)BIO_MAX_PAGES);
> }
>
> /**
>
Jens can you pick this up?
Also here:
https://patchwork.kernel.org/patch/9900407/
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: fix an integer overflow in __blkdev_sectors_to_bio_pages
2017-09-11 14:58 ` Mike Snitzer
@ 2017-09-11 15:47 ` Jens Axboe
0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2017-09-11 15:47 UTC (permalink / raw)
To: Mike Snitzer, axboe, Mikulas Patocka
Cc: Damien Le Moal, hch@lst.de, linux-block@vger.kernel.org,
dm-devel@redhat.com, tom.ty89@gmail.com, gmazyland@gmail.com
On 09/11/2017 08:58 AM, Mike Snitzer wrote:
> On Mon, Aug 14 2017 at 8:01pm -0400,
> Mikulas Patocka <mpatocka@redhat.com> wrote:
>
>>
>>
>> On Mon, 14 Aug 2017, Damien Le Moal wrote:
>>
>>> On Sun, 2017-08-13 at 22:47 -0400, Mikulas Patocka wrote:
>>>>
>>>> On Wed, 9 Aug 2017, hch@lst.de wrote:
>>>>
>>>>> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
>>>>> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
>>>>>
>>>>> --
>>>>> dm-devel mailing list
>>>>> dm-devel@redhat.com
>>>>> https://www.redhat.com/mailman/listinfo/dm-devel
>>>>
>>>> I think that patch is incorrect. sector_t may be a 32-bit type and
>>>> nr_sects << 9 may overflow.
>>>>
>>>> static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
>>>> {
>>>> sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
>>>>
>>>> return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
>>>> }
>>>>
>>>> Mikulas
>>>
>>> Mikulas,
>>>
>>> Does the follwing patch fix the problem ?
>>>
>>> From 947b3cf41e759b2b23f684e215e651d0c8037f88 Mon Sep 17 00:00:00 2001
>>> From: Damien Le Moal <damien.lemoal@wdc.com>
>>> Date: Mon, 14 Aug 2017 13:01:16 +0900
>>> Subject: [PATCH] block: Fix __blkdev_sectors_to_bio_pages()
>>>
>>> On 32bit systems where sector_t is a 32bits type, the calculation of
>>> bytes may overflow. Use the u64 type for the local calculation to avoid
>>> overflows.
>>>
>>> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
>>> ---
>>> block/blk-lib.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block/blk-lib.c b/block/blk-lib.c
>>> index 3fe0aec90597..ccf22dba21f0 100644
>>> --- a/block/blk-lib.c
>>> +++ b/block/blk-lib.c
>>> @@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(struct block_device
>>> *bdev,
>>> */
>>> static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
>>> {
>>> - sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
>>> + u64 bytes = ((u64)nr_sects << 9) + PAGE_SIZE - 1;
>>>
>>> - return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
>>> + return min(bytes >> PAGE_SHIFT, (u64)BIO_MAX_PAGES);
>>> }
>>>
>>
>> It's OK, but it is not needed to use 64-bit arithmetic here if all we need
>> is to shift the value right. Here I submit a simplified patch, using the
>> macro DIV_ROUND_UP_SECTOR_T (the macro gets optimized to just an addition
>> and right shift).
>>
>>
>>
>> From: Mikulas Patocka <mpatocka@redhat.com>
>>
>> Fix possible integer overflow in __blkdev_sectors_to_bio_pages if sector_t
>> is 32-bit.
>>
>> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
>> Fixes: 615d22a51c04 ("block: Fix __blkdev_issue_zeroout loop")
>>
>> ---
>> block/blk-lib.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> Index: linux-2.6/block/blk-lib.c
>> ===================================================================
>> --- linux-2.6.orig/block/blk-lib.c
>> +++ linux-2.6/block/blk-lib.c
>> @@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(s
>> */
>> static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
>> {
>> - sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
>> + sector_t pages = DIV_ROUND_UP_SECTOR_T(nr_sects, PAGE_SIZE / 512);
>>
>> - return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
>> + return min(pages, (sector_t)BIO_MAX_PAGES);
>> }
>>
>> /**
>>
>
> Jens can you pick this up?
Yep added, thanks.
--
Jens Axboe
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
@ 2017-08-08 9:57 Tom Yan
0 siblings, 0 replies; 13+ messages in thread
From: Tom Yan @ 2017-08-08 9:57 UTC (permalink / raw)
To: dm-devel@redhat.com, gmazyland, axboe@fb.com, axboe, hch@lst.de,
linux-block
Hi again,
I think I have sort of identified the problem. It appears to me that both t=
he block layer and dm-crypt is defected on handling this.
First of all, check out the "fallback" zero out implementation, which is us=
ed in this case, here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/blo=
ck/blk-lib.c?h=3Dv4.12#n309
>From the outer loop, it seem to imply that this should be done in multiple =
"bio"s, if the request (original "nr_sects") is larger than BIO_MAX_PAGES:
...
while (nr_sects !=3D 0) {
bio =3D next_bio(bio, min(nr_sects, (sector_t)BIO_MAX_PAGES),
gfp_mask);
...
}
...
However, there is a inner loop:
...
while (nr_sects !=3D 0) {
sz =3D min((sector_t) PAGE_SIZE >> 9 , nr_sects);
bi_size =3D bio_add_page(bio, ZERO_PAGE(0), sz << 9, 0);
nr_sects -=3D bi_size >> 9;
sector +=3D bi_size >> 9;
if (bi_size < (sz << 9))
break;
}
...
which apparently would loop over the whole request on its own, making the o=
uter loop a bogus one.
The request ends up being done in a single (huge) bio. When the bio is pass=
ed on to dm-crypt, it appears that dm-crypt will not split the bio either w=
hen it allocates buffer for conversion/encryption:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/dri=
vers/md/dm-crypt.c?h=3Dv4.12#n1694
which leads to possible enormous uptake of memory, causing OOM / kernel pan=
ic.
There seems to be some measure that is suppose to split large bio though:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/dri=
vers/md/dm-crypt.c?h=3Dv4.12#n2787
Apparently it is called before kcryptd_crypt_write_convert() / crypt_alloc_=
buffer(). However, I don't really parse dm_accept_partial_bio() (or the com=
ment about it) so I don't really know what it actually does or how it does =
it. Neither can I see it helps in reality anyway.
Here is another test case that shows the problem:
https://ptpb.pw/BWWo.png
https://ptpb.pw/aRTE.png
Regards,
Tom Yan
> From: Tom Yan
> Sent: Monday, August 7, 2017 9:58 AM
> To: dm-devel@redhat.com
> Subject: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
>
> Hi all,
>
> When I do the following:
>
> cryptsetup open /dev/sdX[Y] rand --type plain --key-file /dev/random;
> blkdiscard -z /dev/mapper/rand
>
> Some OOM killings and a kernel panic occur. Here is a screenshot:
> https://gitlab.com/cryptsetup/cryptsetup/uploads/207ffdada52f3172f54a014c67=
> 159625/DSC_0129.JPGhttps://gitlab.com/cryptsetup/cryptsetup/uploads/207ffda=
> da52f3172f54a014c67159625/DSC_0129.JPG=20
>
>
> Similar issue is NOT found in doing `cat /dev/zero > /dev/mapper/rand`, or =
> `blkdiscard -z /dev/sdX[Y]` (without opening a dm-crypt on it), on the same=
> hardware and configuration.
>
> Note that `blkdiscard -z` does not trigger the BLKDISCARD ioctl but the BLK=
> ZEROOUT ioctl. And the devices I have been testing on are USB devices that =
> does NOT support WRITE SAME or UNMAP.
>
> Regards,
> Tom Yan
^ permalink raw reply [flat|nested] 13+ messages in thread[parent not found: <HK2PR04MB064468C0371547D1D17C9C5488B50@HK2PR04MB0644.apcprd04.prod.outlook.com>]
* Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
[not found] <HK2PR04MB064468C0371547D1D17C9C5488B50@HK2PR04MB0644.apcprd04.prod.outlook.com>
@ 2017-08-08 9:42 ` Tom Yan
0 siblings, 0 replies; 13+ messages in thread
From: Tom Yan @ 2017-08-08 9:42 UTC (permalink / raw)
To: dm-devel@redhat.com, gmazyland@gmail.com, axboe@fb.com,
axboe@kernel.dk, hch@lst.de, linux-block@vger.kernel.org
Hi again,
I think I have sort of identified the problem. It appears to me that both t=
he block layer and dm-crypt is defected on handling this.
First of all, check out the "fallback" zero out implementation, which is us=
ed in this case, here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/blo=
ck/blk-lib.c?h=3Dv4.12#n309
>From the outer loop, it seem to imply that this should be done in multiple =
"bio"s, if the request (original "nr_sects") is larger than BIO_MAX_PAGES:
...
while (nr_sects !=3D 0) {
bio =3D next_bio(bio, min(nr_sects, (sector_t)BIO_MAX_PAGES),
gfp_mask);
...
}
...
However, there is a inner loop:
...
while (nr_sects !=3D 0) {
sz =3D min((sector_t) PAGE_SIZE >> 9 , nr_sects);
bi_size =3D bio_add_page(bio, ZERO_PAGE(0), sz << 9, 0);
nr_sects -=3D bi_size >> 9;
sector +=3D bi_size >> 9;
if (bi_size < (sz << 9))
break;
}
...
which apparently would loop over the whole request on its own, making the o=
uter loop a bogus one.
The request ends up being done in a single (huge) bio. When the bio is pass=
ed on to dm-crypt, it appears that dm-crypt will not split the bio either w=
hen it allocates buffer for conversion/encryption:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/dri=
vers/md/dm-crypt.c?h=3Dv4.12#n1694
which leads to possible enormous uptake of memory, causing OOM / kernel pan=
ic.
There seems to be some measure that is suppose to split large bio though:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/dri=
vers/md/dm-crypt.c?h=3Dv4.12#n2787
Apparently it is called before kcryptd_crypt_write_convert() / crypt_alloc_=
buffer(). However, I don't really parse dm_accept_partial_bio() (or the com=
ment about it) so I don't really know what it actually does or how it does =
it. Neither can I see it helps in reality anyway.
Here is another test case that shows the problem:
https://ptpb.pw/BWWo.png
https://ptpb.pw/aRTE.png
Regards,
Tom Yan
From: Tom Yan
Sent: Monday, August 7, 2017 9:58 AM
To: dm-devel@redhat.com
Subject: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
=A0 =20
Hi all,
When I do the following:
cryptsetup open /dev/sdX[Y] rand --type plain --key-file /dev/random;
blkdiscard -z /dev/mapper/rand
Some OOM killings and a kernel panic occur. Here is a screenshot:
https://gitlab.com/cryptsetup/cryptsetup/uploads/207ffdada52f3172f54a014c67=
159625/DSC_0129.JPGhttps://gitlab.com/cryptsetup/cryptsetup/uploads/207ffda=
da52f3172f54a014c67159625/DSC_0129.JPG=20
Similar issue is NOT found in doing `cat /dev/zero > /dev/mapper/rand`, or =
`blkdiscard -z /dev/sdX[Y]` (without opening a dm-crypt on it), on the same=
hardware and configuration.
Note that `blkdiscard -z` does not trigger the BLKDISCARD ioctl but the BLK=
ZEROOUT ioctl. And the devices I have been testing on are USB devices that =
does NOT support WRITE SAME or UNMAP.
Regards,
Tom Yan =
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2017-09-11 15:47 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-08 10:23 [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic Tom Yan
2017-08-09 13:38 ` hch
2017-08-09 18:21 ` [dm-devel] " Ondrej Kozina
2017-08-09 23:27 ` Tom Yan
2017-08-10 10:14 ` Tom Yan
2017-08-14 2:47 ` [dm-devel] " Mikulas Patocka
2017-08-14 4:04 ` Damien Le Moal
2017-08-15 0:01 ` [PATCH] fix an integer overflow in __blkdev_sectors_to_bio_pages Mikulas Patocka
2017-08-15 3:43 ` Damien Le Moal
2017-09-11 14:58 ` Mike Snitzer
2017-09-11 15:47 ` Jens Axboe
-- strict thread matches above, loose matches on Subject: below --
2017-08-08 9:57 [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic Tom Yan
[not found] <HK2PR04MB064468C0371547D1D17C9C5488B50@HK2PR04MB0644.apcprd04.prod.outlook.com>
2017-08-08 9:42 ` Tom Yan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox