From mboxrd@z Thu Jan 1 00:00:00 1970
From: Don Dutile
Subject: Re: [PATCH for-next 5/5] IB/isert: Fix for lib/dma_debug check_sync warning
Date: Thu, 17 May 2018 19:01:30 -0400
Message-ID: <5372c148-a07c-64bb-5608-9ec64a90ef73@redhat.com>
References: <20180516012947.12474.2286.stgit@scvm10.sc.intel.com>
 <20180516013137.12474.10259.stgit@scvm10.sc.intel.com>
 <20180516200441.GN25661@ziepe.ca>
 <4039808d-7997-8b95-f1b9-d3d1d181f00e@intel.com>
 <20180517151057.GC10842@ziepe.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20180517151057.GC10842@ziepe.ca>
Content-Language: en-US
Sender: stable-owner@vger.kernel.org
To: Jason Gunthorpe, Dennis Dalessandro
Cc: dledford@redhat.com, linux-rdma@vger.kernel.org, Mike Marciniszyn, Alex Estrin, stable@vger.kernel.org, Christoph Hellwig
List-Id: linux-rdma@vger.kernel.org

On 05/17/2018 11:10 AM, Jason Gunthorpe wrote:
> On Thu, May 17, 2018 at 11:03:49AM -0400, Dennis Dalessandro wrote:
>> On 5/16/2018 4:04 PM, Jason Gunthorpe wrote:
>>> On Tue, May 15, 2018 at 06:31:39PM -0700, Dennis Dalessandro wrote:
>>>> From: Alex Estrin
>>>>
>>>> The following error message occurs on a target host in a debug build
>>>> during session login:
>>>>
>>>> [ 3524.411874] WARNING: CPU: 5 PID: 12063 at lib/dma-debug.c:1207 check_sync+0x4ec/0x5b0
>>>> [ 3524.421057] infiniband hfi1_0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x0000000000000000] [size=76 bytes]
>>>> ......snip .....
>>>>
>>>> [ 3524.535846] CPU: 5 PID: 12063 Comm: iscsi_np Kdump: loaded Not tainted 3.10.0-862.el7.x86_64.debug #1
>>>> [ 3524.546764] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.2.6 06/08/2015
>>>> [ 3524.555740] Call Trace:
>>>> [ 3524.559102] [] dump_stack+0x19/0x1b
>>>> [ 3524.565477] [] __warn+0xd8/0x100
>>>> [ 3524.571557] [] warn_slowpath_fmt+0x5f/0x80
>>>> [ 3524.578610] [] check_sync+0x4ec/0x5b0
>>>> [ 3524.585177] [] ? set_cpus_allowed_ptr+0x5f/0x1c0
>>>> [ 3524.592812] [] debug_dma_sync_single_for_cpu+0x80/0x90
>>>> [ 3524.601029] [] ? x2apic_send_IPI_mask+0x13/0x20
>>>> [ 3524.608574] [] ? native_smp_send_reschedule+0x5b/0x80
>>>> [ 3524.616699] [] ? resched_curr+0xf6/0x140
>>>> [ 3524.623567] [] isert_create_send_desc.isra.26+0xe0/0x110 [ib_isert]
>>>> [ 3524.633060] [] isert_put_login_tx+0x55/0x8b0 [ib_isert]
>>>> [ 3524.641383] [] ? try_to_wake_up+0x1a4/0x430
>>>> [ 3524.648561] [] iscsi_target_do_tx_login_io+0xdd/0x230 [iscsi_target_mod]
>>>> [ 3524.658557] [] iscsi_target_do_login+0x1a7/0x600 [iscsi_target_mod]
>>>> [ 3524.668084] [] ? kstrdup+0x49/0x60
>>>> [ 3524.674420] [] iscsi_target_start_negotiation+0x56/0xc0 [iscsi_target_mod]
>>>> [ 3524.684656] [] __iscsi_target_login_thread+0x90e/0x1070 [iscsi_target_mod]
>>>> [ 3524.694901] [] ? __iscsi_target_login_thread+0x1070/0x1070 [iscsi_target_mod]
>>>> [ 3524.705446] [] ? __iscsi_target_login_thread+0x1070/0x1070 [iscsi_target_mod]
>>>> [ 3524.715976] [] iscsi_target_login_thread+0x28/0x60 [iscsi_target_mod]
>>>> [ 3524.725739] [] kthread+0xef/0x100
>>>> [ 3524.732007] [] ? insert_kthread_work+0x80/0x80
>>>> [ 3524.739540] [] ret_from_fork_nospec_begin+0x21/0x21
>>>> [ 3524.747558] [] ? insert_kthread_work+0x80/0x80
>>>> [ 3524.755088] ---[ end trace 23f8bf9238bd1ed8 ]---
>>>> [ 3595.510822] iSCSI/iqn.1994-05.com.redhat:537fa56299: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
>>>>
>>>> The code calls dma_sync on login_tx_desc->dma_addr prior to initializing it
>>>> with dma-mapped address.
>>>> login_tx_desc is a part of iser_conn structure and is used only once
>>>> during login negotiation, so the issue is fixed by eliminating
>>>> dma_sync call for this buffer using a special case routine.
>>>>
>>>> Cc:
>>>> Reviewed-by: Mike Marciniszyn
>>>> Reviewed-by: Don Dutile
>>>> Signed-off-by: Alex Estrin
>>>> Signed-off-by: Dennis Dalessandro
>>>> drivers/infiniband/ulp/isert/ib_isert.c | 26 +++++++++++++++++---------
>>>> 1 files changed, 17 insertions(+), 9 deletions(-)
>>>
>>> This sounds like -rc material? Why is it for-next?
>>
>> Should something that only occurs in a debug build be considered -rc
>> material?
>
> Usually errors produced in debug builds reflect some kind of real-world
> error as well.
>
> Calling dma_sync on random garbage seems like the sort of thing that
> might crash on some archs?? But I guess HFI1 don't care about that..
>
> Jason
>

That's not the case here.
What is actually happening is a blind dma-sync call before any DMA has been
done, and the call is not needed.

Additionally, I believe dma-debug has a bug:
  -- doing a check on an op when there is no op in the dma-ops struct is not correct:

static inline void dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
					   size_t size,
					   enum dma_data_direction dir)
{
	struct dma_map_ops *ops = get_dma_ops(dev);

	BUG_ON(!valid_dma_direction(dir));
	if (ops->sync_single_for_cpu)
		ops->sync_single_for_cpu(dev, addr, size, dir);
	debug_dma_sync_single_for_cpu(dev, addr, size, dir);
}

Note, above: debug_dma_sync is called independently of whether an op function exists.
For hfi1, qib and rxe, which use dma-virt-ops for non-IOMMU-enabled configs,
the sync_single_for_cpu op does not exist, yet debug_dma_sync is still called.

Christoph: care to comment, since you've done a fair amount in this space?

- Don
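
To make the point about the unconditional debug hook concrete, here is a
minimal userspace sketch. It is not kernel code: every mock_* name is
hypothetical and only models the control flow of the helper quoted above.
The "guarded" variant is one possible reading of what the check could look
like; whether that is the right behaviour for dma-debug is exactly the open
question in this mail.

```c
/* Userspace model only; all mock_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t mock_dma_addr_t;

struct mock_dma_ops {
	/* Optional op: NULL models an ops table with no sync_single_for_cpu. */
	void (*sync_single_for_cpu)(mock_dma_addr_t addr, size_t size);
};

/* Counters stand in for the real side effects so behaviour can be observed. */
static int mock_debug_checks;
static int mock_real_syncs;

/* Stand-in for the dma-debug check. */
static void mock_debug_sync(mock_dma_addr_t addr, size_t size)
{
	(void)addr;
	(void)size;
	mock_debug_checks++;
}

/* Stand-in for a driver-provided sync op. */
static void mock_real_sync(mock_dma_addr_t addr, size_t size)
{
	(void)addr;
	(void)size;
	mock_real_syncs++;
}

/* Same shape as the quoted helper: the debug hook always runs. */
static void mock_sync_unconditional(const struct mock_dma_ops *ops,
				    mock_dma_addr_t addr, size_t size)
{
	assert(ops != NULL);
	if (ops->sync_single_for_cpu)
		ops->sync_single_for_cpu(addr, size);
	mock_debug_sync(addr, size);
}

/* Assumed alternative: the debug hook runs only when a sync op exists. */
static void mock_sync_guarded(const struct mock_dma_ops *ops,
			      mock_dma_addr_t addr, size_t size)
{
	assert(ops != NULL);
	if (ops->sync_single_for_cpu) {
		ops->sync_single_for_cpu(addr, size);
		mock_debug_sync(addr, size);
	}
}
```

With an ops table whose sync_single_for_cpu is NULL, mock_sync_unconditional
still bumps the debug counter while mock_sync_guarded does not; with
mock_real_sync installed, both variants run the op and the check.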