public inbox for linux-block@vger.kernel.org
* [io_uring]: fio's io_uring engine causing general protection fault
@ 2019-06-21 12:40 Stephen  Bates
  2019-06-21 13:13 ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: Stephen  Bates @ 2019-06-21 12:40 UTC (permalink / raw)
  To: linux-block@vger.kernel.org; +Cc: Jens Axboe, fio-owner@vger.kernel.org

Hi

I hit the following general protection fault when testing io_uring via the io_uring engine in fio. This was on a VM running 5.2-rc5 and the latest version of fio. The issue occurs with both null_blk and fake NVMe drives. I have not tested on bare metal or with real NVMe SSDs. The fio job file used is given below.

[io_uring]
time_based=1
runtime=60
# note: /dev/nullb0 also fails
filename=/dev/nvme2n1
ioengine=io_uring
bs=4k
rw=readwrite
direct=1
fixedbufs=1
sqthread_poll=1
sqthread_poll_cpu=0

[  964.540374] general protection fault: 0000 [#1] SMP PTI
[  964.542041] CPU: 0 PID: 872 Comm: io_uring-sq Not tainted 5.2.0-rc5-cpacket-io-uring #1
[  964.545589] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  964.549761] RIP: 0010:fput_many+0x7/0x90
[  964.551522] Code: 01 48 85 ff 74 17 55 48 89 e5 53 48 8b 1f e8 a0 f9 ff ff 48 85 db 48 89 df 75 f0 5b 5d f3 c3 0f 1f 40 00 0f 1f 44 00 00 89 f6 <f0> 48 29 77 38 74 01 c3 55 48 89 e5 53 48 89 fb 65 48 8b 3c 25 c0
[  964.559031] RSP: 0018:ffffadeb817ebc50 EFLAGS: 00010246
[  964.561112] RAX: 0000000000000004 RBX: ffff8f46ad477480 RCX: 0000000000001805
[  964.563911] RDX: 0000000000000000 RSI: 0000000000000001 RDI: f18b51b9a39552b5
[  964.566580] RBP: ffffadeb817ebc58 R08: ffff8f46b7a318c0 R09: 000000000000015d
[  964.569109] R10: ffffadeb817ebce8 R11: 0000000000000020 R12: ffff8f46ad4cd000
[  964.571623] R13: 00000000fffffff7 R14: ffffadeb817ebe30 R15: 0000000000000004
[  964.574153] FS:  0000000000000000(0000) GS:ffff8f46b7a00000(0000) knlGS:0000000000000000
[  964.577020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  964.578917] CR2: 000055828f0bbbf0 CR3: 0000000232176004 CR4: 00000000003606f0
[  964.581221] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  964.583511] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  964.585808] Call Trace:
[  964.586626]  ? fput+0x13/0x20
[  964.587613]  io_free_req+0x20/0x40
[  964.588733]  io_put_req+0x1b/0x20
[  964.589795]  io_submit_sqe+0x40a/0x680
[  964.590919]  ? __switch_to_asm+0x34/0x70
[  964.592090]  ? __switch_to_asm+0x40/0x70
[  964.593270]  io_submit_sqes+0xb9/0x160
[  964.594392]  ? io_submit_sqes+0xb9/0x160
[  964.595564]  ? __switch_to_asm+0x40/0x70
[  964.596737]  ? __switch_to_asm+0x34/0x70
[  964.597918]  ? __schedule+0x3f2/0x6a0
[  964.599015]  ? __switch_to_asm+0x34/0x70
[  964.600444]  io_sq_thread+0x1af/0x470
[  964.601568]  ? __switch_to_asm+0x34/0x70
[  964.602655]  ? wait_woken+0x80/0x80
[  964.603625]  ? __switch_to+0x85/0x410
[  964.604638]  ? __switch_to_asm+0x40/0x70
[  964.605726]  ? __switch_to_asm+0x34/0x70
[  964.606811]  ? __schedule+0x3f2/0x6a0
[  964.607827]  kthread+0x105/0x140
[  964.608725]  ? io_submit_sqes+0x160/0x160
[  964.609836]  ? kthread+0x105/0x140
[  964.610780]  ? io_submit_sqes+0x160/0x160
[  964.611887]  ? kthread_destroy_worker+0x50/0x50
[  964.613158]  ret_from_fork+0x35/0x40
[  964.614148] Modules linked in: crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper joydev input_leds serio_raw mac_hid sch_fq_codel sunrpc null_blk ip_tables x_tables autofs4 8139too psmouse 8139cp floppy mii i2c_piix4 pata_acpi
[  964.620856] ---[ end trace bdbba818b310272c ]---
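A note on reading the trace: the faulting bytes `<f0> 48 29 77 38` appear to decode to a locked subtract on `0x38(%rdi)`, and RDI holds f18b51b9a39552b5. On x86-64 with 48-bit virtual addresses (an assumption here, though CR4 in the dump suggests 5-level paging is off), an address whose bits 63..47 are not all equal is non-canonical, and touching it raises a general protection fault rather than a page fault. The helper below is a hypothetical userspace sketch for checking register values from an oops, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if addr is canonical under 48-bit virtual addressing,
 * i.e. bits 63..47 are all zeros or all ones.
 * is_canonical48 is an illustrative name, not a kernel API.
 */
static bool is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}
```

Applied to the dump, RDI (f18b51b9a39552b5) fails this check, while RBX (ffff8f46ad477480) and CR2 (000055828f0bbbf0) pass it, which is consistent with fput_many() being handed a garbage file pointer.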

Cheers
 
Stephen 



* Re: [io_uring]: fio's io_uring engine causing general protection fault
  2019-06-21 12:40 [io_uring]: fio's io_uring engine causing general protection fault Stephen  Bates
@ 2019-06-21 13:13 ` Jens Axboe
  2019-06-21 15:15   ` Stephen  Bates
  0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2019-06-21 13:13 UTC (permalink / raw)
  To: Stephen Bates, linux-block@vger.kernel.org; +Cc: fio-owner@vger.kernel.org

On 6/21/19 6:40 AM, Stephen  Bates wrote:
> Hi
> 
> I hit the following general protection fault when testing io_uring via the io_uring engine in fio. This was on a VM running 5.2-rc5 and the latest version of fio. The issue occurs with both null_blk and fake NVMe drives. I have not tested on bare metal or with real NVMe SSDs. The fio job file used is given below.
> 
> [io_uring]
> time_based=1
> runtime=60
> # note: /dev/nullb0 also fails
> filename=/dev/nvme2n1
> ioengine=io_uring
> bs=4k
> rw=readwrite
> direct=1
> fixedbufs=1
> sqthread_poll=1
> sqthread_poll_cpu=0
> 
> [  964.540374] general protection fault: 0000 [#1] SMP PTI
> [  964.542041] CPU: 0 PID: 872 Comm: io_uring-sq Not tainted 5.2.0-rc5-cpacket-io-uring #1
> [  964.545589] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
> [  964.549761] RIP: 0010:fput_many+0x7/0x90
> [  964.551522] Code: 01 48 85 ff 74 17 55 48 89 e5 53 48 8b 1f e8 a0 f9 ff ff 48 85 db 48 89 df 75 f0 5b 5d f3 c3 0f 1f 40 00 0f 1f 44 00 00 89 f6 <f0> 48 29 77 38 74 01 c3 55 48 89 e5 53 48 89 fb 65 48 8b 3c 25 c0
> [  964.559031] RSP: 0018:ffffadeb817ebc50 EFLAGS: 00010246
> [  964.561112] RAX: 0000000000000004 RBX: ffff8f46ad477480 RCX: 0000000000001805
> [  964.563911] RDX: 0000000000000000 RSI: 0000000000000001 RDI: f18b51b9a39552b5
> [  964.566580] RBP: ffffadeb817ebc58 R08: ffff8f46b7a318c0 R09: 000000000000015d
> [  964.569109] R10: ffffadeb817ebce8 R11: 0000000000000020 R12: ffff8f46ad4cd000
> [  964.571623] R13: 00000000fffffff7 R14: ffffadeb817ebe30 R15: 0000000000000004
> [  964.574153] FS:  0000000000000000(0000) GS:ffff8f46b7a00000(0000) knlGS:0000000000000000
> [  964.577020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  964.578917] CR2: 000055828f0bbbf0 CR3: 0000000232176004 CR4: 00000000003606f0
> [  964.581221] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  964.583511] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  964.585808] Call Trace:
> [  964.586626]  ? fput+0x13/0x20
> [  964.587613]  io_free_req+0x20/0x40
> [  964.588733]  io_put_req+0x1b/0x20
> [  964.589795]  io_submit_sqe+0x40a/0x680
> [  964.590919]  ? __switch_to_asm+0x34/0x70
> [  964.592090]  ? __switch_to_asm+0x40/0x70
> [  964.593270]  io_submit_sqes+0xb9/0x160
> [  964.594392]  ? io_submit_sqes+0xb9/0x160
> [  964.595564]  ? __switch_to_asm+0x40/0x70
> [  964.596737]  ? __switch_to_asm+0x34/0x70
> [  964.597918]  ? __schedule+0x3f2/0x6a0
> [  964.599015]  ? __switch_to_asm+0x34/0x70
> [  964.600444]  io_sq_thread+0x1af/0x470
> [  964.601568]  ? __switch_to_asm+0x34/0x70
> [  964.602655]  ? wait_woken+0x80/0x80
> [  964.603625]  ? __switch_to+0x85/0x410
> [  964.604638]  ? __switch_to_asm+0x40/0x70
> [  964.605726]  ? __switch_to_asm+0x34/0x70
> [  964.606811]  ? __schedule+0x3f2/0x6a0
> [  964.607827]  kthread+0x105/0x140
> [  964.608725]  ? io_submit_sqes+0x160/0x160
> [  964.609836]  ? kthread+0x105/0x140
> [  964.610780]  ? io_submit_sqes+0x160/0x160
> [  964.611887]  ? kthread_destroy_worker+0x50/0x50
> [  964.613158]  ret_from_fork+0x35/0x40
> [  964.614148] Modules linked in: crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper joydev input_leds serio_raw mac_hid sch_fq_codel sunrpc null_blk ip_tables x_tables autofs4 8139too psmouse 8139cp floppy mii i2c_piix4 pata_acpi
> [  964.620856] ---[ end trace bdbba818b310272c ]---

Try this patch. Technically, it's not valid to use sqthread without
fixed files registered through io_uring_register(), and this case
looks to me like we're just not initializing ->file before we end
up failing the request due to a violation of that requirement.

Not tested, on vacation...


diff --git a/fs/io_uring.c b/fs/io_uring.c
index 86a2bd721900..485832deb7ea 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -579,6 +579,7 @@ static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
 		state->cur_req++;
 	}
 
+	req->file = NULL;
 	req->ctx = ctx;
 	req->flags = 0;
 	/* one is dropped after submission, the other at completion */
@@ -1801,10 +1802,8 @@ static int io_req_set_file(struct io_ring_ctx *ctx, const struct sqe_submit *s,
 		req->sequence = ctx->cached_sq_head - 1;
 	}
 
-	if (!io_op_needs_file(s->sqe)) {
-		req->file = NULL;
+	if (!io_op_needs_file(s->sqe))
 		return 0;
-	}
 
 	if (flags & IOSQE_FIXED_FILE) {
 		if (unlikely(!ctx->user_files ||

-- 
Jens Axboe
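
The failure mode described in this message can be modelled in userspace. The sketch below is only an illustration under stated assumptions: every `mock_*` name is hypothetical, the reference count is a plain int, and the stale pointer is made to point at a valid object so the example stays well defined (in the reported crash it was garbage). It shows why clearing ->file when the request is handed out matters if the request fails before ->file is assigned.

```c
#include <errno.h>
#include <stddef.h>

struct mock_file { int refs; };			/* stand-in for struct file */
struct mock_req  { struct mock_file *file; };	/* stand-in for struct io_kiocb */

/* Hand out a recycled request; memory is not zeroed unless init_file is set. */
static void mock_get_req(struct mock_req *req, int init_file)
{
	if (init_file)
		req->file = NULL;	/* models the one-line addition in the patch */
}

/* Fails with -EBADF before ->file is assigned when no fixed files exist. */
static int mock_set_file(struct mock_req *req, struct mock_file *f,
			 int have_fixed_files)
{
	if (!have_fixed_files)
		return -EBADF;
	req->file = f;
	f->refs++;
	return 0;
}

/* Drops a reference on whatever ->file holds, stale or not. */
static void mock_free_req(struct mock_req *req)
{
	if (req->file)
		req->file->refs--;
}

/* Submission path: on failure the request is freed immediately. */
static int mock_submit(struct mock_req *req, struct mock_file *f,
		       int have_fixed_files, int init_file)
{
	int ret;

	mock_get_req(req, init_file);
	ret = mock_set_file(req, f, have_fixed_files);
	if (ret)
		mock_free_req(req);
	return ret;
}
```

With `init_file` set to 0 and a stale pointer left in the request, the failing submission drops a reference it never took; with `init_file` set to 1 the same submission still returns -EBADF but leaves the unrelated object alone.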



* Re: [io_uring]: fio's io_uring engine causing general protection fault
  2019-06-21 13:13 ` Jens Axboe
@ 2019-06-21 15:15   ` Stephen  Bates
  2019-06-21 16:23     ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: Stephen  Bates @ 2019-06-21 15:15 UTC (permalink / raw)
  To: Jens Axboe, linux-block@vger.kernel.org; +Cc: fio-owner@vger.kernel.org

> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 86a2bd721900..485832deb7ea 100644

Jens

Thanks! I tested that and it seems to resolve the GPF. I now get a Bad file descriptor error in fio (which I think is what we'd expect). If you turn that patch into an official kernel patch, feel free to add

Tested-by: Stephen Bates <sbates@raithlin.com>

batesste@io_uring-vm1:~/io_uring-test$ sudo fio fio/io_uring.fio
io_uring: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=1
fio-3.14-7-g7184a
Starting 1 process
fio: io_u error on file /dev/nullb0: Bad file descriptor: read offset=0, buflen=4096

Cheers
 
Stephen
 


* Re: [io_uring]: fio's io_uring engine causing general protection fault
  2019-06-21 15:15   ` Stephen  Bates
@ 2019-06-21 16:23     ` Jens Axboe
  0 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2019-06-21 16:23 UTC (permalink / raw)
  To: Stephen Bates, linux-block@vger.kernel.org; +Cc: fio-owner@vger.kernel.org

On 6/21/19 9:15 AM, Stephen  Bates wrote:
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 86a2bd721900..485832deb7ea 100644
> 
> Jens
> 
> Thanks! I tested that and it seems to resolve the GPF. I now get a Bad
> file descriptor error in fio (which I think is what we'd expect). If
> you turn that patch into an official kernel patch, feel free to add

Yep indeed, it's supposed to just result in an EBADF cqe. Thanks for
testing, I'll queue this up.

-- 
Jens Axboe
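
For readers unfamiliar with the term, an "EBADF cqe" is a completion queue entry whose result field carries a negative errno instead of a byte count. The helpers below are a minimal sketch of how an application might interpret that field; the function names are hypothetical, and the result is passed as a plain int rather than read from a real completion entry.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* True if a completion result reports a bad file descriptor. */
static int cqe_res_is_ebadf(int res)
{
	return res == -EBADF;
}

/*
 * Map a completion result to an error string, or NULL on success.
 * Non-negative results are treated as success (e.g. bytes transferred).
 */
static const char *cqe_res_error(int res)
{
	return res < 0 ? strerror(-res) : NULL;
}
```

The "Bad file descriptor" text in the fio output earlier in the thread is likely this errno rendered as a string.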

