From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Christie
Subject: Re: Issue with Ceph File System and LIO
Date: Tue, 15 Dec 2015 03:26:37 -0600
Message-ID: <566FDCCD.4090800@redhat.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:58104 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932275AbbLOJ0i (ORCPT ); Tue, 15 Dec 2015 04:26:38 -0500
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Eric Eastman , Ceph Development

On 12/15/2015 12:08 AM, Eric Eastman wrote:
> I am testing Linux Target SCSI, LIO, with a Ceph File System backstore
> and I am seeing this error on my LIO gateway. I am using Ceph v9.2.0
> on a 4.4rc4 Kernel, on Trusty, using a kernel mounted Ceph File
> System. A file on the Ceph File System is exported via iSCSI to a
> VMware ESXi 5.0 server, and I am seeing this error when doing a lot of
> I/O on the ESXi server. Is this a LIO or a Ceph issue?
>
> [Tue Dec 15 00:46:55 2015] ------------[ cut here ]------------
> [Tue Dec 15 00:46:55 2015] WARNING: CPU: 0 PID: 1123421 at
> /home/kernel/COD/linux/fs/ceph/addr.c:125
> ceph_set_page_dirty+0x230/0x240 [ceph]()
> [Tue Dec 15 00:46:55 2015] Modules linked in: iptable_filter ip_tables
> x_tables xfs rbd iscsi_target_mod vhost_scsi tcm_qla2xxx ib_srpt
> tcm_fc tcm_usb_gadget tcm_loop target_core_file target_core_iblock
> target_core_pscsi target_core_user target_core_mod ipmi_devintf vhost
> qla2xxx ib_cm ib_sa ib_mad ib_core ib_addr libfc scsi_transport_fc
> libcomposite udc_core uio configfs ipmi_ssif ttm drm_kms_helper
> gpio_ich drm i2c_algo_bit fb_sys_fops coretemp syscopyarea ipmi_si
> sysfillrect ipmi_msghandler sysimgblt kvm acpi_power_meter 8250_fintek
> irqbypass hpilo shpchp input_leds serio_raw lpc_ich i7core_edac
> edac_core mac_hid ceph libceph libcrc32c fscache bonding lp parport
> mlx4_en vxlan ip6_udp_tunnel udp_tunnel ptp pps_core hid_generic
> usbhid hid hpsa mlx4_core psmouse bnx2 scsi_transport_sas fjes [last
> unloaded: target_core_mod]
> [Tue Dec 15 00:46:55 2015] CPU: 0 PID: 1123421 Comm: iscsi_trx
> Tainted: G W I 4.4.0-040400rc4-generic #201512061930
> [Tue Dec 15 00:46:55 2015] Hardware name: HP ProLiant DL360 G6, BIOS
> P64 01/22/2015
> [Tue Dec 15 00:46:55 2015] 0000000000000000 00000000fdc0ce43
> ffff880bf38c38c0 ffffffff813c8ab4
> [Tue Dec 15 00:46:55 2015] 0000000000000000 ffff880bf38c38f8
> ffffffff8107d772 ffffea00127a8680
> [Tue Dec 15 00:46:55 2015] ffff8804e52c1448 ffff8804e52c15b0
> ffff8804e52c10f0 0000000000000200
> [Tue Dec 15 00:46:55 2015] Call Trace:
> [Tue Dec 15 00:46:55 2015] [] dump_stack+0x44/0x60
> [Tue Dec 15 00:46:55 2015] [] warn_slowpath_common+0x82/0xc0
> [Tue Dec 15 00:46:55 2015] [] warn_slowpath_null+0x1a/0x20
> [Tue Dec 15 00:46:55 2015] []
> ceph_set_page_dirty+0x230/0x240 [ceph]
> [Tue Dec 15 00:46:55 2015] [] ?
> pagecache_get_page+0x150/0x1c0
> [Tue Dec 15 00:46:55 2015] [] ?
> ceph_pool_perm_check+0x48/0x700 [ceph]
> [Tue Dec 15 00:46:55 2015] [] set_page_dirty+0x3d/0x70
> [Tue Dec 15 00:46:55 2015] []
> ceph_write_end+0x5e/0x180 [ceph]
> [Tue Dec 15 00:46:55 2015] [] ?
> iov_iter_copy_from_user_atomic+0x156/0x220
> [Tue Dec 15 00:46:55 2015] []
> generic_perform_write+0x114/0x1c0
> [Tue Dec 15 00:46:55 2015] []
> ceph_write_iter+0xf8a/0x1050 [ceph]
> [Tue Dec 15 00:46:55 2015] [] ?
> ceph_put_cap_refs+0x143/0x320 [ceph]
> [Tue Dec 15 00:46:55 2015] [] ?
> check_preempt_wakeup+0xfa/0x220
> [Tue Dec 15 00:46:55 2015] [] ? zone_statistics+0x7c/0xa0
> [Tue Dec 15 00:46:55 2015] [] ? copy_page_to_iter+0x5e/0xa0
> [Tue Dec 15 00:46:55 2015] [] ?
> skb_copy_datagram_iter+0x122/0x250
> [Tue Dec 15 00:46:55 2015] [] vfs_iter_write+0x76/0xc0
> [Tue Dec 15 00:46:55 2015] []
> fd_do_rw.isra.5+0xd8/0x1e0 [target_core_file]
> [Tue Dec 15 00:46:55 2015] []
> fd_execute_rw+0xc5/0x2a0 [target_core_file]
> [Tue Dec 15 00:46:55 2015] []
> sbc_execute_rw+0x22/0x30 [target_core_mod]
> [Tue Dec 15 00:46:55 2015] []
> __target_execute_cmd+0x1f/0x70 [target_core_mod]
> [Tue Dec 15 00:46:55 2015] []
> target_execute_cmd+0x195/0x2a0 [target_core_mod]
> [Tue Dec 15 00:46:55 2015] []
> iscsit_execute_cmd+0x20a/0x270 [iscsi_target_mod]
> [Tue Dec 15 00:46:55 2015] []
> iscsit_sequence_cmd+0xda/0x190 [iscsi_target_mod]
> [Tue Dec 15 00:46:55 2015] []
> iscsi_target_rx_thread+0x51d/0xe30 [iscsi_target_mod]
> [Tue Dec 15 00:46:55 2015] [] ? __switch_to+0x1dc/0x5a0
> [Tue Dec 15 00:46:55 2015] [] ?
> iscsi_target_tx_thread+0x1e0/0x1e0 [iscsi_target_mod]
> [Tue Dec 15 00:46:55 2015] [] kthread+0xd8/0xf0
> [Tue Dec 15 00:46:55 2015] [] ?
> kthread_create_on_node+0x1a0/0x1a0
> [Tue Dec 15 00:46:55 2015] [] ret_from_fork+0x3f/0x70
> [Tue Dec 15 00:46:55 2015] [] ?
> kthread_create_on_node+0x1a0/0x1a0
> [Tue Dec 15 00:46:55 2015] ---[ end trace 4079437668c77cbb ]---
> [Tue Dec 15 00:47:45 2015] ABORT_TASK: Found referenced iSCSI task_tag: 95784927
> [Tue Dec 15 00:47:45 2015] ABORT_TASK: ref_tag: 95784927 already
> complete, skipping
>

For writes, LIO just allocates pages using GFP_KERNEL, passes them to
sock_recvmsg to read the data into them, then passes them to the fs
using the function you see above, vfs_iter_write. So it does not do
anything fancy.

Do we need to send specific types of pages to ceph?
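
To make the write path described above concrete, here is a rough userspace
analogy in Python. It is only a sketch of the data flow, not the LIO code:
anonymous page-sized mmap buffers stand in for pages allocated with
GFP_KERNEL, recv_into stands in for sock_recvmsg, and a single vectored
os.pwritev call stands in for vfs_iter_write. The function name
receive_and_write and its parameters are made up for this illustration, and
os.pwritev is assumed to be available (Linux, Python 3.7 or later).

```python
import mmap
import os
import socket
import tempfile

PAGE_SIZE = mmap.PAGESIZE


def receive_and_write(sock, fd, length, offset=0):
    """Hypothetical helper: receive `length` bytes from `sock` into fresh
    page-sized buffers, then write them to `fd` at `offset` in one call.
    Returns the byte count reported by the vectored write."""
    # Plain anonymous buffers, one per page; nothing special about them
    # (a loose stand-in for pages allocated with GFP_KERNEL).
    npages = (length + PAGE_SIZE - 1) // PAGE_SIZE
    pages = [mmap.mmap(-1, PAGE_SIZE) for _ in range(npages)]
    views = []
    try:
        remaining = length
        for page in pages:
            want = min(remaining, PAGE_SIZE)
            view = memoryview(page)[:want]
            views.append(view)
            got = 0
            # Read network data directly into the page
            # (a stand-in for sock_recvmsg).
            while got < want:
                n = sock.recv_into(view[got:], want - got)
                if n == 0:
                    raise ConnectionError("peer closed before all data arrived")
                got += n
            remaining -= want
        # Hand the whole list of buffers to the filesystem at once
        # (a stand-in for vfs_iter_write). A short write is possible in
        # general; this sketch does not retry.
        return os.pwritev(fd, views, offset)
    finally:
        # Views must be released before the mmaps can be closed.
        for view in views:
            view.release()
        for page in pages:
            page.close()


if __name__ == "__main__":
    sender, receiver = socket.socketpair()
    data = b"x" * (2 * PAGE_SIZE + 100)
    sender.sendall(data)
    with tempfile.TemporaryFile() as backing:
        print(receive_and_write(receiver, backing.fileno(), len(data)))
    sender.close()
    receiver.close()
```

The point of the sketch is that the buffers handed to the filesystem are
ordinary freshly allocated memory that was never part of the file's own
page cache, which is the situation the question above is asking about.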