Date: Sat, 5 Dec 2020 11:50:30 +0000 (UTC)
From: Christoph Lameter
To: Håkon Bugge
cc: Honggang LI, Jason Gunthorpe, Mark Haywood, OFED mailing list
Subject: Re: Is there a working cache for path record and lids etc for librdmacm?
In-Reply-To: <7812B8AB-7D26-4148-8C8C-E1241A1FC8CD@oracle.com>
References: <20201117193329.GH244516@ziepe.ca> <6F632AE0-7921-4C5F-8455-F8E9390BD071@oracle.com> <801AE4A1-7AE8-4756-8F32-5F3BFD189E2B@oracle.com> <648D2533-E8E8-4248-AF2D-C5F1F60E5BFC@oracle.com> <20201125081057.GA547111@dhcp-128-72.nay.redhat.com> <7812B8AB-7D26-4148-8C8C-E1241A1FC8CD@oracle.com>
X-Mailing-List: linux-rdma@vger.kernel.org

On Fri, 4 Dec 2020, Håkon Bugge wrote:

> >> Nop, the kernel falls back and uses the neighbour cache instead.
> >
> > But ib_acme hangs? The main issue here is what the user space app does.
> > And we need ibacm to cache user space address resolutions.
>
> I got the impression that you are debugging this with Honggang. If you
> want me to help, I need, to start with, an strace of ib_acme and ditto
> of ibacm.

Ok, will do that. Do you have access to the RH case on this one?

> >>>> To resolve IPoIB address to PathRecord, you must:
> >>>> 1) The IPoIB interface must UP and RUNNING on the client and target
> >>>> side.
> >>>> 2) The ibacm service must RUNNING on the client and target.
> >>>
> >>> That is working if you want to resolve only the IP addresses of the IB
> >>> interfaces on the client and target. None else.
> >>
> >> That is why it is called IBacm, right?
> >
> > Huh? IBACM is an address resolution service for IB. Somehow that only
> > includes addresses of hosts running IBACM?
>
> Yes. As Honggang explained, ibacm's address resolution protocol is
> based on IB multicast, as such, the peer must have ibacm running in
> order to send a unicast response back with the L2 addr.

What is the point of the route_prot and addr_prot then?

> >>> Here is the description of ibacms function from the sources:
> >>>
> >>> "Conceptually, the ibacm service implements an ARP like protocol and
> >>> either uses IB multicast records to construct path record data or queries
> >>> the SA directly, depending on the selected route protocol. By default, the
> >>> ibacm services uses and caches SA path record queries."
> >>>
> >>> SA queries dont work. So its broken and cannot talk to the SM.
> >>
> >> Why do you say that? It works all the time for me which uses "sa" as
> >> "route_prot".
> >
> > Not here and not in the tests that RH ran to verify the issue.
> >
> > "route_prot" set to "sa" is the default config for the Redhat release of
> > IBACM.
> >
> > However, the addr_prot is set to "acm" by default. I set it to "sa" with
> > no effect.
>
> OK. Understood. As stated above, let me know if you want me to debug this.

Well, what's the point of debugging this if it's only doing address
resolution via multicast and not via the SA? Is there a particular issue
with using the SA? Could the route information contain process-specific
information?
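
For reference, the two settings discussed above are ibacm options. The fragment below is a minimal sketch of the configuration tried in this thread, assuming the stock rdma-core options file (ibacm_opts.cfg, typically under /etc/rdma/) and its "name value" line format. Both the path and the exact layout are assumptions and should be checked against the installed version. This is not a recommended configuration: as reported above, switching addr_prot to "sa" had no visible effect here.

```
# ibacm_opts.cfg (location assumed, commonly /etc/rdma/ibacm_opts.cfg)

# How path records are obtained: "acm" (multicast based) or "sa" (query the SA).
# "sa" is reported above as the default in the Red Hat package.
route_prot sa

# How addresses are resolved: "acm" (multicast, peer must run ibacm) or "sa".
# Default reported above is "acm"; "sa" is the value tried in this thread.
addr_prot sa
```

ibacm reads its options at startup, so the service would presumably need a restart after editing the file before retesting with ib_acme.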