From mboxrd@z Thu Jan 1 00:00:00 1970
From: Vu Pham
Subject: Re: [ewg] Mellanox target workaround in SRP
Date: Mon, 10 Jan 2011 11:58:02 -0800
Message-ID: <4D2B64CA.6040609@mellanox.com>
References: <1294439717.6219.54.camel@lap75545.ornl.gov> <1294510396.7914.82.camel@obelisk.thedillows.org> <4D2B4E13.6070903@mellanox.com> <1294686163.3038.12.camel@obelisk.thedillows.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <1294686163.3038.12.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: David Dillow
Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, ewg-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org, Ishai Rabinovitz
List-Id: linux-rdma@vger.kernel.org

David Dillow wrote:
> On Mon, 2011-01-10 at 10:21 -0800, Vu Pham wrote:
>> David Dillow wrote:
>>> On Fri, 2011-01-07 at 20:05 -0800, Roland Dreier wrote:
>>>> looking at the patch, I would guess that the corruption occurred when
>>>> the target got an IO request that started at a non-page-aligned address
>>>> but that spanned more than one page.
> [snip]
>>> Here's hoping someone from Mellanox can shed some light.
>>
>> I think that the patch is specific for srp initiator using Mellanox
>> FMR. It tried to avoid indirect desc with Mellanox FMR having
>> first-byte-offset != 0.
>> Since the low level implementation of mlx4/mthca_map_phys_fmr() did
>> not create + setup MPT for FMR with first_byte_offset != 0. The
>> corruption can happen with any target.
>
> Thanks for taking a look Vu --

Thanks for taking ownership of srp :)

> but I'm not sure that is the problem,
> either. The SRP FMR mapping code is careful to mask the SG address with
> the FMR page mask, so we should never ask the HCA to map a page with the
> first_byte_offset != 0. Instead, we tell the target to request an IO
> virtual address appropriately offset into the first page of the FMR.
>
> Or perhaps I misunderstood you, and it's the non-zero first byte offset
> in the RDMA command on the wire that is the issue, and not the FMR setup
> in the initiator? And it only affects FMR-mapped memory, not the
> kernel's MR?
>

It's not the kernel's MR. I suspect that the corruption happens *only*
with Mellanox FMR + MPT setup without fbo and the target doing RDMA with
an offset vaddr. I need to ask the internal hw/fw guys to confirm whether
that's true.

-vu
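
For readers following the thread, the masking David describes can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the ib_srp code: the struct, the function names, and the
fmr_iova parameter are all hypothetical. It shows the SG address being
split into a page-aligned base (what would be handed to the FMR mapping,
so the first byte offset seen by the HCA is 0) and an in-page offset that
is added to the virtual address advertised to the target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration only; none of these names come from ib_srp. */
struct fmr_split {
	uint64_t page_base;	/* page-aligned address given to the FMR mapping */
	uint64_t first_off;	/* byte offset of the data within the first page */
};

/* Split an SG DMA address using the FMR page mask (page_size must be 2^n). */
static struct fmr_split split_sg_addr(uint64_t dma_addr, uint64_t page_size)
{
	struct fmr_split s;
	uint64_t page_mask;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	page_mask = ~(page_size - 1);

	s.page_base = dma_addr & page_mask;	/* mapped with offset 0 */
	s.first_off = dma_addr & ~page_mask;	/* carried in the vaddr instead */
	return s;
}

/*
 * Virtual address the target would be told to use: the (assumed) base IO
 * virtual address of the FMR plus the offset into its first page.
 */
static uint64_t target_vaddr(uint64_t fmr_iova, struct fmr_split s)
{
	return fmr_iova + s.first_off;
}
```

With a 4096-byte FMR page size, an SG address of 0x12345678 splits into a
page base of 0x12345000 and an offset of 0x678, so the target would be
pointed at fmr_iova + 0x678. That is the "offset vaddr" case suspected
above.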