Date: Thu, 7 Dec 2023 12:13:31 +0100
From: Philipp Rudo
To: Michal Hocko
Cc: Baoquan He, Donald Dutile, Jiri Bohac, Pingfan Liu, Tao Liu,
 Vivek Goyal, Dave Young, kexec@lists.infradead.org,
 linux-kernel@vger.kernel.org, David Hildenbrand
Subject: Re: [PATCH 0/4] kdump: crashkernel reservation from CMA
Message-ID: <20231207121331.59c7e370@rotkaeppchen>
References: <20231201123353.2b3db7fa@rotkaeppchen>
 <20231201165113.43211a48@rotkaeppchen>
 <20231206120805.4fdcb8ab@rotkaeppchen>
Organization: Red Hat inc.

On Wed, 6 Dec 2023 16:19:51 +0100
Michal Hocko wrote:

> On Wed 06-12-23 14:49:51, Michal Hocko wrote:
> > On Wed 06-12-23 12:08:05, Philipp Rudo wrote:
> [...]
> > > If I understand Documentation/core-api/pin_user_pages.rst correctly you
> > > missed case 1 Direct IO. In that case "short term" DMA is allowed for
> > > pages without FOLL_LONGTERM. Meaning that there is a way you can
> > > corrupt the CMA and with that the crash kernel after the production
> > > kernel has panicked.
> >
> > Could you expand on this? How exactly direct IO request survives across
> > into the kdump kernel? I do understand the RDMA case because the IO is
> > async and out of control of the receiving end.
>
> OK, I guess I get what you mean. You are worried that there is
> DIO request
> program DMA controller to read into CMA memory
>
> boot into crash kernel backed by CMA
> DMA transfer is done.
>
> DIO doesn't migrate the pinned memory because it is considered a very
> quick operation which doesn't block the movability for too long. That is
> why I have considered that a non-problem. RDMA on the other hand might pin
> memory for transfer for much longer but that case is handled by
> migrating the memory away.

Right, that is the scenario we need to prevent.

> Now I agree that there is a chance of the corruption from DIO. The
> question I am not entirely clear about right now is how big of a real
> problem that is. DMA transfers should be a very swift operation. Would
> it help to wait for a grace period before jumping into the kdump kernel?

Please see my other mail.

> > Also if direct IO is a problem how come this is not a problem for kexec
> > in general. The new kernel usually shares all the memory with the 1st
> > kernel.
>
> This is also more clear now. Pure kexec is shutting down all the devices
> which should terminate the in-flight DMA transfers.

Right, it _should_ terminate all transfers. But here we are back at the
shitty device drivers that don't have a working shutdown method. That's
why we have already seen the problem you describe above with kexec. And
please believe me that debugging such a scenario is an absolute pain.
Especially when it's a proprietary, out-of-tree driver that caused the
mess.

Thanks
Philipp

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec