Message-ID: <2330d7dc-d17d-45d5-a162-f8f95c24c051@linux.dev>
Date: Fri, 27 Oct 2023 12:01:47 +0800
MIME-Version: 1.0
Subject: Re: [PATCH 1/1] RDMA/rxe: Fix blktests srp lead kernel panic with 64k page size
To: Jason Gunthorpe
Cc: "Zhijian Li (Fujitsu)", Yi Zhang, "Daisuke Matsuda (Fujitsu)",
 Zhu Yanjun, "leon@kernel.org", "linux-rdma@vger.kernel.org",
 "zyjzyj2000@gmail.com", Bart Van Assche
References: <20231020140139.GF691768@ziepe.ca>
 <6c57cf0d-c7a7-4aac-9eb2-d8bb1d832232@fujitsu.com>
 <1ffaeaa4-4ac2-4531-8e0c-586e13c14c97@fujitsu.com>
 <366da960-6036-49c5-ad47-3ae3f4e55452@fujitsu.com>
 <8f705223-6fde-4b29-880b-570349f40db8@fujitsu.com>
 <20231026114221.GT691768@ziepe.ca>
 <2374eb54-6a7e-4a56-b7e9-3aa5c9048fa1@linux.dev>
 <20231026232327.GZ691768@ziepe.ca>
From: Zhu Yanjun
In-Reply-To: <20231026232327.GZ691768@ziepe.ca>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-rdma@vger.kernel.org

On 2023/10/27 7:23, Jason Gunthorpe wrote:
> On Thu, Oct 26, 2023 at 08:59:34PM +0800, Zhu Yanjun wrote:
>> On 2023/10/26 19:42, Jason Gunthorpe wrote:
>>> On Thu, Oct 26, 2023 at 09:05:52AM +0000, Zhijian Li (Fujitsu) wrote:
>>>> The root cause is that
>>>>
>>>> rxe:rxe_set_page() gets wrong when mr.page_size != PAGE_SIZE where it only stores the *page to xarray.
>>>> So the offset will get lost.
>>>>
>>>> For example,
>>>> store process:
>>>> page_size = 0x1000;
>>>> PAGE_SIZE = 0x10000;
>>>> va0 = 0xffff000020651000;
>>>> page_offset = 0 = va & (page_size - 1);
>>>> page = va_to_page(va);
>>>> xa_store(&mr->page_list, mr->nbuf, page, GFP_KERNEL);
>>>>
>>>> load_process:
>>>> page = xa_load(&mr->page_list, index);
>>>> page_va = kmap_local_page(page) --> it must be a PAGE_SIZE align value, assume it as 0xffff000020650000
>>>> va1 = page_va + page_offset = 0xffff000020650000 + 0 = 0xffff000020650000;
>>>>
>>>> Obviously, *va0 != va1*, page_offset get lost.
>>>>
>>>>
>>>> How to fix:
>>>> - revert 325a7eb85199 ("RDMA/rxe: Cleanup page variables in rxe_mr.c")
>>>> - don't allow ulp registering mr.page_size != PAGE_SIZE ?
>>> Lets do the second one please. Most devices only support PAGE_SIZE anyhow.
>> Normally page_size is PAGE_SIZE or the size of the whole compound page (in
>> the latest kernel version, it is the size of folio).
>> When compound page or folio is taken into account, the page_size is
>> not equal to PAGE_SIZE.
> folios are always multiples of PAGE_SIZE. rxe splits everything into
> PAGE_SIZE units in the xarray.
>
>> If the ULP uses the compound page or folio, the similar problem will occur
>> again.
> No, it won't. We never store folios in the xarray.

Sure. I mean, in the ULP, if a folio is used, the page size is set to a
multiple of PAGE_SIZE, but in RXE the page size is set to PAGE_SIZE.

So the page size in the ULP is different from the page size in RXE.

I am not sure whether a similar problem can still occur in that case.

I hope this problem will not occur even with folios.

Zhu Yanjun

>
> Jason
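
For reference, the address arithmetic from the quoted root-cause analysis can be reproduced in user space. The following is only a minimal sketch, not rxe code: the helper names (mr_page_offset, sys_page_base, reconstruct_va, offset_is_lost) are hypothetical, the two page sizes are the values from the quoted example, and sys_page_base() merely models the PAGE_SIZE-aligned address that mapping the stored page would give back.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed values, taken from the example quoted in this thread. */
#define MR_PAGE_SIZE  0x1000ULL   /* mr.page_size requested by the ULP */
#define SYS_PAGE_SIZE 0x10000ULL  /* PAGE_SIZE on a 64k-page kernel */

/* Store side: offset computed against mr.page_size (hypothetical helper). */
static uint64_t mr_page_offset(uint64_t va)
{
	return va & (MR_PAGE_SIZE - 1);
}

/* Models the PAGE_SIZE-aligned address obtained by mapping the stored page. */
static uint64_t sys_page_base(uint64_t va)
{
	return va & ~(SYS_PAGE_SIZE - 1);
}

/* Load side: mapped page base plus the mr.page_size-relative offset. */
static uint64_t reconstruct_va(uint64_t va)
{
	/* The masks above are only valid for power-of-two page sizes. */
	assert((MR_PAGE_SIZE & (MR_PAGE_SIZE - 1)) == 0);
	assert((SYS_PAGE_SIZE & (SYS_PAGE_SIZE - 1)) == 0);
	return sys_page_base(va) + mr_page_offset(va);
}

/* Prints both addresses; returns 1 when the round trip loses the offset. */
static int offset_is_lost(uint64_t va0)
{
	uint64_t va1 = reconstruct_va(va0);

	printf("va0 = 0x%016llx, va1 = 0x%016llx\n",
	       (unsigned long long)va0, (unsigned long long)va1);
	return va0 != va1;
}
```

With these constants, offset_is_lost(0xffff000020651000ULL) reports a mismatch, matching the va0 != va1 result in the quoted example. If MR_PAGE_SIZE is set equal to SYS_PAGE_SIZE, the round trip becomes exact, which is the situation the second proposed fix (not allowing mr.page_size != PAGE_SIZE) would enforce.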