Message-ID: <7a84ba40-aa73-4d93-8a22-53583868f3ba@linux.dev>
Date: Fri, 27 Oct 2023 09:36:26 +0800
Subject: Re: [PATCH 1/1] RDMA/rxe: Fix blktests srp lead kernel panic with 64k page size
From: Zhu Yanjun
To: Jason Gunthorpe
Cc: "Zhijian Li (Fujitsu)", Yi Zhang, "Daisuke Matsuda (Fujitsu)", Zhu Yanjun, "leon@kernel.org", "linux-rdma@vger.kernel.org", "zyjzyj2000@gmail.com", Bart Van Assche
In-Reply-To: <20231026232327.GZ691768@ziepe.ca>
References: <20231020140139.GF691768@ziepe.ca> <6c57cf0d-c7a7-4aac-9eb2-d8bb1d832232@fujitsu.com> <1ffaeaa4-4ac2-4531-8e0c-586e13c14c97@fujitsu.com> <366da960-6036-49c5-ad47-3ae3f4e55452@fujitsu.com> <8f705223-6fde-4b29-880b-570349f40db8@fujitsu.com> <20231026114221.GT691768@ziepe.ca> <2374eb54-6a7e-4a56-b7e9-3aa5c9048fa1@linux.dev> <20231026232327.GZ691768@ziepe.ca>
X-Mailing-List: linux-rdma@vger.kernel.org

On 2023/10/27 7:23, Jason Gunthorpe wrote:
> On Thu, Oct 26, 2023 at 08:59:34PM +0800, Zhu Yanjun wrote:
>> On 2023/10/26 19:42, Jason Gunthorpe wrote:
>>> On Thu, Oct 26, 2023 at 09:05:52AM +0000, Zhijian Li (Fujitsu) wrote:
>>>> The root cause is that
>>>>
>>>> rxe:rxe_set_page() gets wrong when mr.page_size != PAGE_SIZE where it only stores the *page to xarray.
>>>> So the offset will get lost.
>>>>
>>>> For example,
>>>> store process:
>>>> page_size = 0x1000;
>>>> PAGE_SIZE = 0x10000;
>>>> va0 = 0xffff000020651000;
>>>> page_offset = 0 = va & (page_size - 1);
>>>> page = va_to_page(va);
>>>> xa_store(&mr->page_list, mr->nbuf, page, GFP_KERNEL);
>>>>
>>>> load_process:
>>>> page = xa_load(&mr->page_list, index);
>>>> page_va = kmap_local_page(page) --> it must be a PAGE_SIZE align value, assume it as 0xffff000020650000
>>>> va1 = page_va + page_offset = 0xffff000020650000 + 0 = 0xffff000020650000;
>>>>
>>>> Obviously, *va0 != va1*, page_offset get lost.
>>>>
>>>>
>>>> How to fix:
>>>> - revert 325a7eb85199 ("RDMA/rxe: Cleanup page variables in rxe_mr.c")
>>>> - don't allow ulp registering mr.page_size != PAGE_SIZE ?
>>> Lets do the second one please. Most devices only support PAGE_SIZE anyhow.
>> Normally page_size is PAGE_SIZE or the size of the whole compound page (in
>> the latest kernel version, it is the size of folio). When compound page or
>> folio is taken into account, the page_size is not equal to
>> PAGE_SIZE.
> folios are always multiples of PAGE_SIZE. rxe splits everything into
> PAGE_SIZE units in the xarray.

Sure, thanks. A folio consists of multiple base pages, so the page size
should be a multiple of PAGE_SIZE. This page size is set in the infiniband
core and in rxe.

I hope no problem will occur when a folio or compound page is used in a ULP.

Zhu Yanjun

>
>> If the ULP uses the compound page or folio, the similar problem will occur
>> again.
> No, it won't. We never store folios in the xarray.
>
> Jason
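
For reference, the arithmetic in Zhijian's quoted example can be checked in userspace. The following is only a minimal sketch, not rxe code: every name in it (MR_PAGE_SIZE, SYS_PAGE_SIZE, stored_page_base(), offset_in_mr_page(), rebuilt_va(), lost_offset(), mr_page_size_supported(), check_mail_example()) is hypothetical, and it assumes that storing only the struct page behaves like rounding the address down to PAGE_SIZE. The last helper illustrates the second proposed fix (refusing mr.page_size != PAGE_SIZE) as a plain comparison; the actual check in the driver may look different.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the example in the quoted mail. */
#define MR_PAGE_SIZE  0x1000ULL   /* mr.page_size requested by the ULP */
#define SYS_PAGE_SIZE 0x10000ULL  /* PAGE_SIZE on a 64k-page kernel */

/* Assumption: keeping only the page in the xarray is modelled as
 * rounding the address down to the system page size. */
static uint64_t stored_page_base(uint64_t va)
{
	return va & ~(SYS_PAGE_SIZE - 1);
}

/* Offset as computed in the mail, relative to mr.page_size. */
static uint64_t offset_in_mr_page(uint64_t va)
{
	return va & (MR_PAGE_SIZE - 1);
}

/* Address rebuilt at load time from the stored page plus that offset. */
static uint64_t rebuilt_va(uint64_t va)
{
	return stored_page_base(va) + offset_in_mr_page(va);
}

/* Number of bytes by which the rebuilt address falls short. */
static uint64_t lost_offset(uint64_t va)
{
	return va - rebuilt_va(va);
}

/* Hypothetical form of the second fix: accept only PAGE_SIZE. */
static int mr_page_size_supported(uint64_t page_size)
{
	return page_size == SYS_PAGE_SIZE;
}

/* Replays the numbers from the mail. */
static void check_mail_example(void)
{
	const uint64_t va0 = 0xffff000020651000ULL;

	assert(offset_in_mr_page(va0) == 0);
	assert(rebuilt_va(va0) == 0xffff000020650000ULL); /* va1 in the mail */
	assert(lost_offset(va0) == 0x1000ULL);            /* va0 != va1 */
}
```

Calling check_mail_example() from a test program reproduces va1 = 0xffff000020650000 for va0 = 0xffff000020651000. With page_size equal to PAGE_SIZE, the offset mask covers the whole system page, so in this model nothing is lost.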