Re: [PATCH 0/4] kdump: crashkernel reservation from CMA

public inbox for linux-kernel@vger.kernel.org

From: Michal Hocko <mhocko@suse.com>
To: Philipp Rudo <prudo@redhat.com>
Cc: Baoquan He <bhe@redhat.com>, Donald Dutile <ddutile@redhat.com>,
	Jiri Bohac <jbohac@suse.cz>, Pingfan Liu <piliu@redhat.com>,
	Tao Liu <ltao@redhat.com>, Vivek Goyal <vgoyal@redhat.com>,
	Dave Young <dyoung@redhat.com>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	David Hildenbrand <dhildenb@redhat.com>
Subject: Re: [PATCH 0/4] kdump: crashkernel reservation from CMA
Date: Wed, 6 Dec 2023 14:49:50 +0100
Message-ID: <ZXB7_rbC0GAkIp7p@tiehlicka>
In-Reply-To: <20231206120805.4fdcb8ab@rotkaeppchen>

On Wed 06-12-23 12:08:05, Philipp Rudo wrote:
> On Fri, 1 Dec 2023 17:59:02 +0100
> Michal Hocko <mhocko@suse.com> wrote:
> 
> > On Fri 01-12-23 16:51:13, Philipp Rudo wrote:
> > > On Fri, 1 Dec 2023 12:55:52 +0100
> > > Michal Hocko <mhocko@suse.com> wrote:
> > >   
> > > > On Fri 01-12-23 12:33:53, Philipp Rudo wrote:
> > > > [...]  
> > > > > And yes, those are all what-if concerns but unfortunately that is all
> > > > > we have right now.    
> > > > 
> > > > > Should theoretical concerns without actual evidence (e.g. multiple
> > > > drivers known to be broken) become a roadblock for this otherwise useful
> > > > feature?   
> > > 
> > > Those concerns aren't just theoretical. They are experiences we have
> > > from a related feature that suffers exactly the same problem regularly
> > > which wouldn't exist if everybody would simply work "properly".  
> > 
> > What is the related feature?
> 
> kexec

OK, but that is a completely different thing, no? The crashkernel parameter
doesn't affect kexec. Or what is the actual relation?

> > > And yes, even purely theoretical concerns can become a roadblock for a
> > > feature when the cost of those theoretical concerns exceeds the benefit
> > > of the feature. The thing is that bugs will be reported against kexec.
> > > So _we_ need to figure out which of the shitty drivers caused the
> > > problem. That puts additional burden on _us_. What we are trying to
> > > evaluate at the moment is if the benefit outweighs the extra burden
> > > with the information we have at the moment.  
> > 
> > I do understand your concerns! But I am pretty sure you do realize that
> > it is really hard to argue theoreticals.  Let me restate what I consider
> > facts. Hopefully we can agree on these points
> > 	- the CMA region can be used by user space memory which is a
> > 	  great advantage because the memory is not wasted and our
> > 	  experience has shown that users do care about this a lot. We
> > 	  _know_ that pressure on making those reservations smaller
> > 	  results in a less reliable crashdump and more resources spent
> > 	  on tuning and testing (especially after major upgrades).  A
> > 	  larger reservation which is not completely wasted for the
> > 	  normal runtime is addressing that concern.
> > 	- There is no other known mechanism to achieve the reusability
> > 	  of the crash kernel memory to stop the wastage without much
> > 	  more intrusive code/api impact (e.g. a separate zone or
> > 	  dedicated interface to prevent any hazardous usage like RDMA).
> > 	- implementation wise the patch has a very small footprint. It
> > 	  is using an existing infrastructure (CMA) and it adds a
> > 	  minimal hooking into crashkernel configuration.
> > 	- The only identified risk so far is RDMA acting on this memory
> > 	  without using proper pinning interface. If it helps to have a
> > 	  statement from RDMA maintainers/developers then we can pull
> > 	  them in for a further discussion of course.
> > 	- The feature requires an explicit opt-in so this doesn't bring
> > 	  any new risk to existing crash kernel users until they decide
> > 	  to use it. AFAIU there is no way to tell that the crash kernel
> > 	  memory used to be CMA based in the primary kernel. If you
> > 	  believe that having that information available for
> > 	  debugability would help then I believe this shouldn't be hard
> > 	  to add.  I think it would even make sense to mark this feature
> > 	  experimental to make it clear to users that this needs some
> > 	  time before it can be marked production ready.
> > 
> > I hope I haven't really missed anything important. The final
> 
> If I understand Documentation/core-api/pin_user_pages.rst correctly you
> missed case 1 Direct IO. In that case "short term" DMA is allowed for
> pages without FOLL_LONGTERM. Meaning that there is a way you can
> corrupt the CMA and with that the crash kernel after the production
> kernel has panicked.

Could you expand on this? How exactly does a direct IO request survive
into the kdump kernel? I do understand the RDMA case because the IO is
async and out of the control of the receiving end.

Also, if direct IO is a problem, how come it is not a problem for kexec
in general? The new kernel usually shares all the memory with the 1st
kernel.

/me confused.
-- 
Michal Hocko
SUSE Labs
