Message-ID: <33f23809-abec-4d39-ab80-839dc525a2e6@gmail.com>
Date: Wed, 2 Oct 2024 10:34:34 +0800
Subject: Re: [PATCH net v2 2/2] page_pool: fix IOMMU crash when driver has already unbound
From: Yunsheng Lin
To: Paolo Abeni, Yunsheng Lin, davem@davemloft.net, kuba@kernel.org
Cc: liuyonglong@huawei.com, fanghaiqing@huawei.com, zhangkun09@huawei.com,
 Robin Murphy, Alexander Duyck, IOMMU, Wei Fang, Shenwei Wang, Clark Wang,
 Eric Dumazet, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
 Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
 John Fastabend, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
 Felix Fietkau, Lorenzo Bianconi, Ryder Lee, Shayne Chen, Sean Wang,
 Kalle Valo, Matthias Brugger, AngeloGioacchino Del Regno, Andrew Morton,
 Ilias Apalodimas, imx@lists.linux.dev, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
 bpf@vger.kernel.org, linux-rdma@vger.kernel.org,
 linux-wireless@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-mediatek@lists.infradead.org, linux-mm@kvack.org
References: <20240925075707.3970187-1-linyunsheng@huawei.com>
 <20240925075707.3970187-3-linyunsheng@huawei.com>
 <4968c2ec-5584-4a98-9782-143605117315@redhat.com>
In-Reply-To: <4968c2ec-5584-4a98-9782-143605117315@redhat.com>

On 10/1/2024 9:32 PM, Paolo Abeni wrote:
> On 9/25/24 09:57, Yunsheng Lin wrote:
>> Networking driver with page_pool support may hand over page
>> still with dma mapping to network stack and try to reuse that
>> page after network stack is done with it and passes it back
>> to page_pool to avoid the penalty of dma mapping/unmapping.
>> With all the caching in the network stack, some pages may be
>> held in the network stack without returning to the page_pool
>> soon enough, and with VF disable causing the driver unbound,
>> the page_pool does not stop the driver from doing its
>> unbinding work, instead page_pool uses workqueue to check
>> if there is some pages coming back from the network stack
>> periodically, if there is any, it will do the dma unmapping
>> related cleanup work.
>>
>> As mentioned in [1], attempting DMA unmaps after the driver
>> has already unbound may leak resources or at worst corrupt
>> memory. Fundamentally, the page pool code cannot allow DMA
>> mappings to outlive the driver they belong to.
>>
>> Currently it seems there are at least two cases that the page
>> is not released fast enough causing dma unmapping done after
>> driver has already unbound:
>> 1. ipv4 packet defragmentation timeout: this seems to cause
>>     delay up to 30 secs.
>> 2. skb_defer_free_flush(): this may cause infinite delay if
>>     there is no triggering for net_rx_action().
>>
>> In order not to do the dma unmapping after driver has already
>> unbound and stall the unloading of the networking driver, add
>> the pool->items array to record all the pages including the ones
>> which are handed over to network stack, so the page_pool can
>> do the dma unmapping for those pages when page_pool_destroy()
>> is called. As the pool->items need to be large enough to avoid
>> performance degradation, add a 'item_full' stat to indicate the
>> allocation failure due to unavailability of pool->items.
>
> This looks really invasive, with room for potentially large performance
> regressions or worse. At very least it does not look suitable for net.

I am open to targeting this at net-next; it can be backported once it
has been tested through one or two kernel versions, if there is still
interest in backporting it. Alternatively, I am open to a non-invasive
way to fix this, if there is one.

>
> Is the problem only tied to VFs drivers? It's a pity all the page_pool
> users will have to pay a bill for it...

I am afraid it is not only tied to VF drivers, as attempting DMA unmaps
after the driver has already unbound may leak resources or at worst
corrupt memory.

Unloading a PF driver might cause the above problems too. I guess the
probability of crashing is low for the PF, as a PF cannot be disabled
unless it can be hot-unplugged, but the probability of leaking the
resources behind the DMA mapping might be similar.

>
> /P
>
>
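
For anyone following the thread, the pool->items bookkeeping described in
the quoted commit message can be modelled in user space roughly as below.
This is only a sketch under stated assumptions, not the code from the
patch: every name with a sketch_ prefix is hypothetical, the fixed-size
array with a linear search merely stands in for however the patch
organises pool->items, and a boolean flag stands in for a real DMA
mapping. It shows the three ideas: record every page that gets mapped,
count a failure in item_full when no slot is left, and unmap everything
that is still recorded at destroy time, including pages the network stack
has not returned yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capacity; the real sizing of pool->items is not modelled. */
#define SKETCH_POOL_ITEMS 4

/* Hypothetical stand-in for a page that may carry a DMA mapping. */
struct sketch_page {
	bool dma_mapped;
};

struct sketch_pool {
	/* Every page handed out by the pool, including in-flight ones. */
	struct sketch_page *items[SKETCH_POOL_ITEMS];
	/* Allocations refused because no items[] slot was available. */
	unsigned int item_full;
};

/* Map a page and record it; fail and bump item_full if no slot is free. */
static bool sketch_pool_track(struct sketch_pool *pool,
			      struct sketch_page *page)
{
	for (size_t i = 0; i < SKETCH_POOL_ITEMS; i++) {
		if (!pool->items[i]) {
			page->dma_mapped = true; /* models the DMA map */
			pool->items[i] = page;
			return true;
		}
	}
	pool->item_full++;
	return false;
}

/* Page leaves the pool for good: unmap it and release its slot. */
static void sketch_pool_untrack(struct sketch_pool *pool,
				struct sketch_page *page)
{
	for (size_t i = 0; i < SKETCH_POOL_ITEMS; i++) {
		if (pool->items[i] == page) {
			page->dma_mapped = false; /* models the DMA unmap */
			pool->items[i] = NULL;
			return;
		}
	}
}

/*
 * Destroy-time cleanup: unmap every page that is still recorded, even if
 * the network stack still holds it, so that no mapping outlives the
 * driver. Returns the number of pages that had to be unmapped here.
 */
static unsigned int sketch_pool_destroy(struct sketch_pool *pool)
{
	unsigned int unmapped = 0;

	for (size_t i = 0; i < SKETCH_POOL_ITEMS; i++) {
		struct sketch_page *page = pool->items[i];

		if (page && page->dma_mapped) {
			page->dma_mapped = false;
			unmapped++;
		}
		pool->items[i] = NULL;
	}
	return unmapped;
}
```

In this model a caller would invoke sketch_pool_track() on each
allocation, sketch_pool_untrack() when a page is released back to the
page allocator, and sketch_pool_destroy() once at teardown. The linear
search is chosen only for brevity; it says nothing about the cost of the
real implementation, which is exactly the performance concern raised
above.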