All of lore.kernel.org
 help / color / mirror / Atom feed
* rbd: assertion failure in rbd_img_obj_callback()
@ 2013-06-20  8:40 Andreas Bluemle
  2013-06-20 15:45 ` Sage Weil
  0 siblings, 1 reply; 2+ messages in thread
From: Andreas Bluemle @ 2013-06-20  8:40 UTC (permalink / raw)
  To: Ceph Development

[-- Attachment #1: Type: text/plain, Size: 1022 bytes --]

Hi,

when working on a large rbd image, I am hitting an
assertion failure in rbd kernel module; the
assertion is in rbd_img_obj_callback()

            rbd_assert(img_request != NULL);

The test is a sequential write on an rbd
image, format 1. The size of the image is 1 TByte,
order is 22 (i.e. default object size).

The code of the rbd (and libceph) kernel modules is
from kernel.org, 3.10-rc6, running on a 3.0.80-0.5 kernel.

The problem looks like the rbd_obj_request structure in
question has been released already when processing of
the reponse to the original request takes place.

Please find attached some more details on this problem.

Is this a known issue already?



Best Regards

Andreas Bluemle

-- 
Andreas Bluemle                     mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH                       http://www.itxperts.de
Balanstrasse 73, Geb. 08            Phone: (+49) 89 89044917
D-81541 Muenchen (Germany)          Fax:   (+49) 89 89044910

Company details: http://www.itxperts.de/imprint.htm

[-- Attachment #2: README.rbd_img_obj_callback-panic --]
[-- Type: application/octet-stream, Size: 12483 bytes --]

Panic in rbd_img_obj_callback(): assertion failure

static void rbd_img_obj_callback(struct rbd_obj_request *obj_request)
{
        struct rbd_img_request *img_request;
        u32 which = obj_request->which;
        bool more = true;

        rbd_assert(obj_request_img_data_test(obj_request));
        img_request = obj_request->img_request;

        dout("%s: img %p obj %p\n", __func__, img_request, obj_request);
>>>>>   rbd_assert(img_request != NULL);
        rbd_assert(img_request->obj_request_count > 0);
        rbd_assert(which != BAD_WHICH);
        rbd_assert(which < img_request->obj_request_count);
        rbd_assert(which >= img_request->next_completion);
....


Test: single thread sequential write to rbd image of size 1 TB.


From crash:
      KERNEL: ./vmlinux-3.0.80-0.5-default      
   DEBUGINFO: .//vmlinux-3.0.80-0.5-default.debug
    DUMPFILE: vmcore
        CPUS: 32
        DATE: Tue Jun 18 18:09:40 2013
      UPTIME: 02:55:38
LOAD AVERAGE: 3.05, 3.04, 4.24
       TASKS: 445
    NODENAME: cibdb2
     RELEASE: 3.0.80-0.5-default
     VERSION: #1 SMP Fri May 31 20:04:37 UTC 2013 (f5a0e1f)
     MACHINE: x86_64  (2599 Mhz)
      MEMORY: 256 GB
       PANIC: "[10516.847816] kernel BUG at /home/afrie/rpm/BUILD/ceph-kmp-3.10-rc6/default/rbd.c:2146!"
         PID: 7597
     COMMAND: "kworker/u:16"
        TASK: ffff8813ce302440  [THREAD_INFO: ffff8813ce304000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 7597   TASK: ffff8813ce302440  CPU: 0   COMMAND: "kworker/u:16"
 #0 [ffff8813ce305990] machine_kexec at ffffffff810267be
 #1 [ffff8813ce3059e0] crash_kexec at ffffffff810a3e4a
 #2 [ffff8813ce305ab0] oops_end at ffffffff814484a8
 #3 [ffff8813ce305ad0] do_invalid_op at ffffffff810035f4
 #4 [ffff8813ce305b70] invalid_op at ffffffff8144fcbb
    [exception RIP: rbd_img_obj_callback+711]
    RIP: ffffffffa06ec0e7  RSP: ffff8813ce305c20  RFLAGS: 00010286
    RAX: 00000000000000bc  RBX: ffff8815d9120a28  RCX: 0000000000001a6d
    RDX: 0000000000001a6d  RSI: 0000000000000007  RDI: 0000000000000246
    RBP: ffff88081300c118   R8: ffffffff81d8fac0   R9: 0000000000000000
    R10: 0000000000000003  R11: 00000000ffffffff  R12: 0000000000000000
    R13: 0000000000000001  R14: 000000000243c1ca  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #5 [ffff8813ce305c58] handle_reply at ffffffffa06a86c5 [libceph]
 #6 [ffff8813ce305c78] kernel_recvmsg at ffffffff813772fc
 #7 [ffff8813ce305cf8] dispatch at ffffffffa06aa613 [libceph]
 #8 [ffff8813ce305d18] process_message at ffffffffa069f975 [libceph]
 #9 [ffff8813ce305da8] try_read at ffffffffa06a268b [libceph]
#10 [ffff8813ce305df8] con_work at ffffffffa06a34b5 [libceph]
#11 [ffff8813ce305e28] process_one_work at ffffffff810747cc
#12 [ffff8813ce305e78] worker_thread at ffffffff8107730a
#13 [ffff8813ce305ee8] kthread at ffffffff8107b646
#14 [ffff8813ce305f48] kernel_thread_helper at ffffffff8144fe44


Analysis:
rbd code had set up an rbd_obj_request, wrapped with an
ceph_osd_request and passed that to osd_client code.

The answer for that request is handled at the osd_client layer,
but the reference to the rbd_obj_request seems to be invalid
at the time of processing the answer.


tail of crash log output BEGIN {

[10516.847816] kernel BUG at /home/afrie/rpm/BUILD/ceph-kmp-3.10-rc6/default/rbd.c:2146!
[10516.865227] invalid opcode: 0000 [#1] SMP 
[10516.874346] CPU 0 
[10516.878380] Modules linked in: rbd libceph edd af_packet rdma_ucm ib_srp scsi_transport_srp scsi_tgt ib_sdp(N) rdma_
cm iw_cm ib_addr ib_ipoib ib_cm ib_sa ib_uverbs ib_umad iw_cxgb3 cxgb3 mdio mlx4_en mlx4_ib ib_mthca ib_mad ib_core cpu
freq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse xfs loop dm_mod btrfs zlib_defl
ate crc32c libcrc32c igb iTCO_wdt ipv6 sr_mod ipv6_lib ac i2c_i801 joydev iTCO_vendor_support dca rtc_cmos cdrom mlx4_c
ore sg i2c_core container button pcspkr acpi_power_meter wmi ext3 jbd mbcache usbhid hid usb_storage ehci_hcd isci(X) u
sbcore libsas scsi_transport_sas usb_common sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_emc scsi_dh_hp_sw scs
i_dh_alua scsi_dh_rdac scsi_dh ahci libahci libata megaraid_sas scsi_mod
[10517.032149] Supported: No, Unsupported modules are loaded
[10517.044120] 
[10517.047467] Pid: 7597, comm: kworker/u:16 Tainted: G           NX 3.0.80-0.5-default #1 FUJITSU PRIMERGY RX300 S7/D2
939-A1
[10517.072005] RIP: 0010:[<ffffffffa06ec0e7>]  [<ffffffffa06ec0e7>] rbd_img_obj_callback+0x2c7/0x390 [rbd]
[10517.092887] RSP: 0018:ffff8813ce305c20  EFLAGS: 00010286
[10517.104672] RAX: 00000000000000bc RBX: ffff8815d9120a28 RCX: 0000000000001a6d
[10517.120500] RDX: 0000000000001a6d RSI: 0000000000000007 RDI: 0000000000000246
[10517.136326] RBP: ffff88081300c118 R08: ffffffff81d8fac0 R09: 0000000000000000
[10517.152143] R10: 0000000000000003 R11: 00000000ffffffff R12: 0000000000000000
[10517.167961] R13: 0000000000000001 R14: 000000000243c1ca R15: 0000000000000000
[10517.183770] FS:  0000000000000000(0000) GS:ffff88207fc00000(0000) knlGS:0000000000000000
[10517.201759] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[10517.214496] CR2: 00007f586eaef000 CR3: 0000001fb2faf000 CR4: 00000000000406f0
[10517.230300] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[10517.246103] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[10517.261906] Process kworker/u:16 (pid: 7597, threadinfo ffff8813ce304000, task ffff8813ce302440)
[10517.281422] Stack:
[10517.285919]  ffff8815d9120a40 0000000000000000 ffff88081300c118 ffff881ebcdb2a70
[10517.302417]  0000000000000025 000000000243c1ca 0000000000000000 ffffffffa06a86c5
[10517.318919]  ffffffffffffffff 000000000000000d 0000000000000000 ffffffff813772fc
[10517.335439] Call Trace:
[10517.340914]  [<ffffffffa06a86c5>] handle_reply+0x405/0x660 [libceph]
[10517.355000]  [<ffffffffa06aa613>] dispatch+0x93/0xb0 [libceph]
[10517.367941]  [<ffffffffa069f975>] process_message+0x95/0x160 [libceph]
[10517.382425]  [<ffffffffa06a268b>] try_read+0x38b/0x720 [libceph]
[10517.395744]  [<ffffffffa06a34b5>] con_work+0xb5/0x390 [libceph]
[10517.408869]  [<ffffffff810747cc>] process_one_work+0x16c/0x350
[10517.421801]  [<ffffffff8107730a>] worker_thread+0x17a/0x410
[10517.434156]  [<ffffffff8107b646>] kthread+0x96/0xa0
[10517.444979]  [<ffffffff8144fe44>] kernel_thread_helper+0x4/0x10
[10517.458102] Code: c0 e8 88 81 d5 e0 0f 0b eb fe 48 c7 c1 af 10 6f a0 ba 62 08 00 00 48 c7 c6 c0 33 6f a0 48 c7 c7 90
 1a 6f a0 31 c0 e8 63 81 d5 e0 <0f> 0b eb fe 49 89 f8 4c 89 e1 48 c7 c2 c0 33 6f a0 48 c7 c6 1a 
[10517.500821] RIP  [<ffffffffa06ec0e7>] rbd_img_obj_callback+0x2c7/0x390 [rbd]
[10517.516443]  RSP <ffff8813ce305c20>

}  END tail of crash log output


call flow and data analysis  BEGIN {
disassembly of handle_reply shows

0xffffffffa06a86ad <handle_reply+1005>: mov    0x280(%rbp),%rax
0xffffffffa06a86b4 <handle_reply+1012>: test   %rax,%rax
0xffffffffa06a86b7 <handle_reply+1015>: je     0xffffffffa06a87cc
0xffffffffa06a86bd <handle_reply+1021>: mov    %r12,%rsi
0xffffffffa06a86c0 <handle_reply+1024>: mov    %rbp,%rdi
0xffffffffa06a86c3 <handle_reply+1027>: callq  *%rax
0xffffffffa06a86c5 <handle_reply+1029>: and    $0x4,%r13d

handle_reply+1029 == handle_reply+0x405 as shown in the stack backtrace.
This is the return address after the call to r_callback in
the current ceph_osd_request structure.


RBP register holds the reference to the ceph_osd_request structure,
with offset 0x280 pointing at the r_callback element.

0xffff88081300c118:     0x000000000243c1ca      0xffff8814f0f8d180
0xffff88081300c128:     0x0000000000000000      0x0000000000000000
0xffff88081300c138:     0xffff88081300c138      0xffff88081300c138
0xffff88081300c148:     0xffff88081300c148      0xffff88081300c148
0xffff88081300c158:     0xffff88081300c158      0xffff88081300c158
0xffff88081300c168:     0xffff88081300c168      0xffff88081300c168
0xffff88081300c178:     0x0000000000000000      0x0000000000000002
0xffff88081300c188:     0xffff881ffa78acf8      0x000000d90000007b
0xffff88081300c198:     0x0000000000000000      0x0000000000000000
0xffff88081300c1a8:     0x0000000000000000      0x0000000000000000
0xffff88081300c1b8:     0x0000000000000000      0x0000000000000000
0xffff88081300c1c8:     0x0000000000000000      0x0000000000000002
0xffff88081300c1d8:     0xffff88135f47f988      0xffff881ebcdb2a70
0xffff88081300c1e8:     0x0000000100000024      0x0000000000000001
0xffff88081300c1f8:     0x0000100000002201      0x0000000000000000
0xffff88081300c208:     0x0000000000001000      0x0000000000000000
0xffff88081300c218:     0x0000000000000000      0x0000000000000003
0xffff88081300c228:     0xffff8814f5590c80      0x0000000000001000
0xffff88081300c238:     0x0000000000000000      0x0000000000000000
0xffff88081300c248:     0x0000000000000000      0x0000000000000000
0xffff88081300c258:     0x0000000000000000      0x0000000000000000
0xffff88081300c268:     0x0000000000000000      0x0000000000000000
0xffff88081300c278:     0x0000000000000000      0x0000000000000000
0xffff88081300c288:     0x0000000000000000      0x0000000000000000
0xffff88081300c298:     0x0000000000000000      0x0000000000000000
0xffff88081300c2a8:     0x0000000000000000      0x0000000000000000
0xffff88081300c2b8:     0x0000000000000000      0x0000000000000000
0xffff88081300c2c8:     0x0000000000000000      0x0000000000000000
0xffff88081300c2d8:     0x0000000000000000      0x0000000000000000
0xffff88081300c2e8:     0x0000000000000000      0x0000000000000000
0xffff88081300c2f8:     0xffff880e5142f884      0xffff880e5142f888
0xffff88081300c308:     0xffff880e5142f8a6      0xffff880e5142f8b7
0xffff88081300c318:     0xffff880e5142f926      0xffff880e5142f894
0xffff88081300c328:     0x0000000000000000      0x0000000000000000
0xffff88081300c338:     0x0000000100000000      0x0000000100000000
0xffff88081300c348:     0xffff881fb2fac718      0x0000000000000001
0xffff88081300c358:     0x0000000000000000      0x0000000000000000
0xffff88081300c368:     0xffff88081300c368      0xffff88081300c368
0xffff88081300c378:     0x0000000000000000      0x0000000000000000
0xffff88081300c388:     0xffff88081300c388      0xffff88081300c388
0xffff88081300c398:     0xffffffffa06ed6a0      0x0000000000000000
0xffff88081300c3a8:     0x0000000000006e7c      0x00000000000001bf
0xffff88081300c3b8:     0xffff88081300c3b8      0xffff88081300c3b8
0xffff88081300c3c8:     0x0000000000000000      0xffff8815d9120a28
0xffff88081300c3d8:     0x6630312e302e6272      0x3534623862362e30
0xffff88081300c3e8:     0x30303030302e3736      0x0062363938666530
0xffff88081300c3f8:     0x0000000000000000      0x0000000000000000
0xffff88081300c408:     0x0000000000000000      0x0000000000000000

0xffff88081300c118 + 0x280 = 0xffff88081300c398: 0xffffffffa06ed6a0
crash> dis 0xffffffffa06ed6a0
0xffffffffa06ed6a0 <rbd_osd_req_callback>:      push   %r13

Hence: the r_callback function is rbd_osd_req_callback().
This will in turn invoke the callback function for the rbd_obj_request:
the rbd_obj_request is contained in element r_priv of ceph_osd_request
at offset 0x2b8

0xffff88081300c118 + 0x2b8 = 0xffff88081300c3d0: 0xffff8815d9120a28

crash> gdb x/20xg 0xffff8815d9120a28
0xffff8815d9120a28:     0x0000000000000000      0x0000000000000000
0xffff8815d9120a38:     0x0000000000001000      0x0000000000000003
0xffff8815d9120a48:     0x0000000000000000      0x000000ef896b0000
0xffff8815d9120a58:     0xdead000000100100      0xdead000000200200
0xffff8815d9120a68:     0x00000001ffffffff      0xffff8814f5590c80
0xffff8815d9120a78:     0x0000000000000000      0x0000000000000000
0xffff8815d9120a88:     0x0000000000000000      0xffff88081300c118
0xffff8815d9120a98:     0x0000000000001000      0x0000000000000000
0xffff8815d9120aa8:     0x0000000000000000      0x0000000000000000
0xffff8815d9120ab8:     0x0000000000000000      0xffff8815d9120ac0

The img_request field at 0xffff8815d9120a48 is NULL: this is the cause
for the assertion.

The img_offset is a valid offset into the 1 TByte rbd image under test:
0xef896b0000 < 0x10000000000 (1 TByte).
The invalid pointers in the links of the rbd_obj_request (which is
a list_head structure) are marked as invalid:

0xffff8815d9120a58:     0xdead000000100100      0xdead000000200200


This suggests that the rbd_obj_request had been invalidated already
before the processing of the answer to the corressponding request
processing takes place.


} END call flow and data analysis

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2013-06-20 15:45 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-20  8:40 rbd: assertion failure in rbd_img_obj_callback() Andreas Bluemle
2013-06-20 15:45 ` Sage Weil

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.