All of lore.kernel.org
 help / color / mirror / Atom feed
* ceph_read_iter NULL pointer dereference
@ 2024-08-02 16:39 Luis Henriques
  2024-08-05  5:11 ` Xiubo Li
  0 siblings, 1 reply; 3+ messages in thread
From: Luis Henriques @ 2024-08-02 16:39 UTC (permalink / raw)
  To: Xiubo Li; +Cc: ceph-devel

Hi Xiubo,

I was wondering if you ever seen the BUG below.  I've debugged it a bit
and the issue seems occurs here, while doing the SetPageUptodate():

		if (ret <= 0)
			left = 0;
		else if (off + ret > i_size)
			left = i_size - off;
		else
			left = ret;
		while (left > 0) {
			size_t plen, copied;

			plen = min_t(size_t, left, PAGE_SIZE - page_off);
			SetPageUptodate(pages[idx]);
			copied = copy_page_to_iter(pages[idx++],
						   page_off, plen, to);
			off += copied;
			left -= copied;
			page_off = 0;
			if (copied < plen) {
				ret = -EFAULT;
				break;
			}
		}

So, the issue is that we have idx > num_pages.  And I'm almost sure that's
because of i_size being '0' and 'left' ending up with a huge value.  But
haven't managed to figure out yet why i_size is '0'.

(Note: I'll be offline next week, but I'll continue looking into this the
week after.  But I figured I should report the bug anyway, in case you've
seen something similar.)

Cheers,
-- 
Luís

BUG: kernel NULL pointer dereference, address: 0000000000000002  
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 1032b4067 P4D 1032b4067 PUD 1032ce067 PMD 0 
Oops: Oops: 0002 [#1] PREEMPT SMP   
CPU: 0 UID: 0 PID: 427 Comm: python3 Not tainted 6.11.0-rc1+ #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014                                                              
RIP: 0010:__ceph_sync_read+0x4fd/0xa60     
Code: 31 f6 4d 89 e7 4d 85 ff 0f 84 ed 03 00 00 ba 00 10 00 00 49 63 c6 48 29 f2 48 8d 04 c3 4c 39 fa 48 8b 08 49 0f 47 d7 49 89 d4 <f0> 80 09 08 48 8b 38 48 8b 4d b8 41 836
RSP: 0018:ffffc900008dfcb0 EFLAGS: 00010207                                           
RAX: ffff888101858000 RBX: ffff888101856000 RCX: 0000000000000002
RDX: 0000000000001000 RSI: 0000000000000000 RDI: 00000000282c61c0
RBP: ffffc900008dfd80 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000012c00 R12: 0000000000001000           
R13: 0000000000421000 R14: 0000000000000400 R15: ffffffffffbdf000
FS:  00007f2eccd69040(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000002 CR3: 0000000103e0e000 CR4: 00000000000006b0
Call Trace:
 <TASK>
 ? __die+0x23/0x60
 ? page_fault_oops+0x158/0x450
 ? __ceph_sync_read+0x4fd/0xa60
 ? search_module_extables+0x4e/0x70
 ? exc_page_fault+0x2ab/0x880
 ? asm_exc_page_fault+0x26/0x30
 ? __ceph_sync_read+0x4fd/0xa60
 ceph_read_iter+0x3eb/0x8f0
 ? rcu_core+0x997/0xa60
 ? lock_release+0x148/0x2b0
 ? vfs_read+0x244/0x310
 vfs_read+0x244/0x310
 ksys_read+0x6d/0xf0
 do_syscall_64+0x71/0x140
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f2ecce6209d
Code: 31 c0 e9 c6 fe ff ff 50 48 8d 3d 66 55 0a 00 e8 89 fe 01 00 66 0f 1f 84 00 00 00 00 00 80 3d 41 25 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1fc
RSP: 002b:00007ffcdaa10578 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000a840d8 RCX: 00007f2ecce6209d
RDX: 0000000000400000 RSI: 0000000027ec61c0 RDI: 0000000000000004
RBP: 00007f2eccd68fc0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000400000
R13: 0000000027ec61c0 R14: 0000000000000004 R15: 0000000000000000
 </TASK>
Modules linked in:
CR2: 0000000000000002
---[ end trace 0000000000000000 ]---
RIP: 0010:__ceph_sync_read+0x4fd/0xa60
Code: 31 f6 4d 89 e7 4d 85 ff 0f 84 ed 03 00 00 ba 00 10 00 00 49 63 c6 48 29 f2 48 8d 04 c3 4c 39 fa 48 8b 08 49 0f 47 d7 49 89 d4 <f0> 80 09 08 48 8b 38 48 8b 4d b8 41 836
RSP: 0018:ffffc900008dfcb0 EFLAGS: 00010207
RAX: ffff888101858000 RBX: ffff888101856000 RCX: 0000000000000002
RDX: 0000000000001000 RSI: 0000000000000000 RDI: 00000000282c61c0
RBP: ffffc900008dfd80 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000012c00 R12: 0000000000001000
R13: 0000000000421000 R14: 0000000000000400 R15: ffffffffffbdf000
FS:  00007f2eccd69040(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fac71216a40 CR3: 0000000103e0e000 CR4: 00000000000006b0

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: ceph_read_iter NULL pointer dereference
  2024-08-02 16:39 ceph_read_iter NULL pointer dereference Luis Henriques
@ 2024-08-05  5:11 ` Xiubo Li
  2024-08-12  9:35   ` Luis Henriques
  0 siblings, 1 reply; 3+ messages in thread
From: Xiubo Li @ 2024-08-05  5:11 UTC (permalink / raw)
  To: Luis Henriques; +Cc: ceph-devel

Hi Luis,

Thanks for your reporting, BTW, could this be reproduceable ?

This is also the first time I see this crash BUG.


The 'i_size == 0' could be easy to reproduce, please see my following 
debug logs:

++++++++++++++++++++++++++++

  ceph_read_iter: 0~1024 trying to get caps on 000000006a438277 
100000001f7.fffffffffffffffe
  try_get_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe need 
Fr want Fc
  __ceph_caps_issued: 000000006a438277 100000001f7.fffffffffffffffe cap 
000000001a8b6d16 issued pAsLsXsFrw
  try_get_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe have 
pAsLsXsFrw but not Fc (revoking -)
  try_get_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe ret 1 
got Fr
  ceph_read_iter: sync 000000006a438277 100000001f7.fffffffffffffffe 
0~1024 got cap refs on Fr
  ceph_sync_read: on file 00000000e029b65e 0~400
  __ceph_sync_read: on inode 000000006a438277 
100000001f7.fffffffffffffffe 0~400
  __ceph_sync_read: orig 0~1024 reading 0~1024
  __ceph_sync_read: 0~1024 got -2 i_size 0
  __ceph_sync_read: result 0 retry_op 0
  ceph_read_iter: 000000006a438277 100000001f7.fffffffffffffffe dropping 
cap refs on Fr = 0
  __ceph_put_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe had 
Fr last
  __ceph_caps_issued: 000000006a438277 100000001f7.fffffffffffffffe cap 
000000001a8b6d16 issued pAsLsXsFrw
+++++++++++++++++++++++++++++++++

I just created one empty file and then in Client.A open it for r/w, and 
then open the same file in Client.B and did a simple read.

Currently ceph kclient won't check the 'i_size' before sending out the 
sync read request to Rados, but will do it after it getting the contents 
back, As I remembered this logic comply to the "MIX" filelock state in MDS:

[LOCK_MIX]       = { 0,         false, LOCK_MIX,  0,    0,   REQ, ANY, 
0,   0,   0, CEPH_CAP_GRD|CEPH_CAP_GWR|CEPH_CAP_GLAZYIO,0,0,CEPH_CAP_GRD },

You can raise one ceph tracker for this.

Thanks

- Xiubo

On 8/3/24 00:39, Luis Henriques wrote:
> Hi Xiubo,
>
> I was wondering if you ever seen the BUG below.  I've debugged it a bit
> and the issue seems occurs here, while doing the SetPageUptodate():
>
> 		if (ret <= 0)
> 			left = 0;
> 		else if (off + ret > i_size)
> 			left = i_size - off;
> 		else
> 			left = ret;
> 		while (left > 0) {
> 			size_t plen, copied;
>
> 			plen = min_t(size_t, left, PAGE_SIZE - page_off);
> 			SetPageUptodate(pages[idx]);
> 			copied = copy_page_to_iter(pages[idx++],
> 						   page_off, plen, to);
> 			off += copied;
> 			left -= copied;
> 			page_off = 0;
> 			if (copied < plen) {
> 				ret = -EFAULT;
> 				break;
> 			}
> 		}
>
> So, the issue is that we have idx > num_pages.  And I'm almost sure that's
> because of i_size being '0' and 'left' ending up with a huge value.  But
> haven't managed to figure out yet why i_size is '0'.
>
> (Note: I'll be offline next week, but I'll continue looking into this the
> week after.  But I figured I should report the bug anyway, in case you've
> seen something similar.)
>
> Cheers,


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: ceph_read_iter NULL pointer dereference
  2024-08-05  5:11 ` Xiubo Li
@ 2024-08-12  9:35   ` Luis Henriques
  0 siblings, 0 replies; 3+ messages in thread
From: Luis Henriques @ 2024-08-12  9:35 UTC (permalink / raw)
  To: Xiubo Li; +Cc: ceph-devel

On Mon, Aug 05 2024, Xiubo Li wrote:

> Hi Luis,
>
> Thanks for your reporting, BTW, could this be reproduceable ?
>
> This is also the first time I see this crash BUG.
>
>
> The 'i_size == 0' could be easy to reproduce, please see my following debug
> logs:
>
> ++++++++++++++++++++++++++++
>
>  ceph_read_iter: 0~1024 trying to get caps on 000000006a438277
> 100000001f7.fffffffffffffffe
>  try_get_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe need Fr want Fc
>  __ceph_caps_issued: 000000006a438277 100000001f7.fffffffffffffffe cap
> 000000001a8b6d16 issued pAsLsXsFrw
>  try_get_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe have pAsLsXsFrw
> but not Fc (revoking -)
>  try_get_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe ret 1 got Fr
>  ceph_read_iter: sync 000000006a438277 100000001f7.fffffffffffffffe 0~1024 got
> cap refs on Fr
>  ceph_sync_read: on file 00000000e029b65e 0~400
>  __ceph_sync_read: on inode 000000006a438277 100000001f7.fffffffffffffffe 0~400
>  __ceph_sync_read: orig 0~1024 reading 0~1024
>  __ceph_sync_read: 0~1024 got -2 i_size 0
>  __ceph_sync_read: result 0 retry_op 0
>  ceph_read_iter: 000000006a438277 100000001f7.fffffffffffffffe dropping cap refs
> on Fr = 0
>  __ceph_put_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe had Fr last
>  __ceph_caps_issued: 000000006a438277 100000001f7.fffffffffffffffe cap
> 000000001a8b6d16 issued pAsLsXsFrw
> +++++++++++++++++++++++++++++++++
>
> I just created one empty file and then in Client.A open it for r/w, and then
> open the same file in Client.B and did a simple read.
>
> Currently ceph kclient won't check the 'i_size' before sending out the sync read
> request to Rados, but will do it after it getting the contents back, As I
> remembered this logic comply to the "MIX" filelock state in MDS:
>
> [LOCK_MIX]       = { 0,         false, LOCK_MIX,  0,    0,   REQ, ANY, 0,   0,  
> 0, CEPH_CAP_GRD|CEPH_CAP_GWR|CEPH_CAP_GLAZYIO,0,0,CEPH_CAP_GRD },
>
> You can raise one ceph tracker for this.

I'll do that, and thanks for analysis.  I'll need to catch-up with a few
things first after being a week offline, but I'll get back to this bug
shortly.

Cheers,
-- 
Luís


>
> Thanks
>
> - Xiubo
>
> On 8/3/24 00:39, Luis Henriques wrote:
>> Hi Xiubo,
>>
>> I was wondering if you ever seen the BUG below.  I've debugged it a bit
>> and the issue seems occurs here, while doing the SetPageUptodate():
>>
>> 		if (ret <= 0)
>> 			left = 0;
>> 		else if (off + ret > i_size)
>> 			left = i_size - off;
>> 		else
>> 			left = ret;
>> 		while (left > 0) {
>> 			size_t plen, copied;
>>
>> 			plen = min_t(size_t, left, PAGE_SIZE - page_off);
>> 			SetPageUptodate(pages[idx]);
>> 			copied = copy_page_to_iter(pages[idx++],
>> 						   page_off, plen, to);
>> 			off += copied;
>> 			left -= copied;
>> 			page_off = 0;
>> 			if (copied < plen) {
>> 				ret = -EFAULT;
>> 				break;
>> 			}
>> 		}
>>
>> So, the issue is that we have idx > num_pages.  And I'm almost sure that's
>> because of i_size being '0' and 'left' ending up with a huge value.  But
>> haven't managed to figure out yet why i_size is '0'.
>>
>> (Note: I'll be offline next week, but I'll continue looking into this the
>> week after.  But I figured I should report the bug anyway, in case you've
>> seen something similar.)
>>
>> Cheers,
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-08-12  9:35 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-02 16:39 ceph_read_iter NULL pointer dereference Luis Henriques
2024-08-05  5:11 ` Xiubo Li
2024-08-12  9:35   ` Luis Henriques

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.