From: Michal Hocko <mhocko@suse.com>
To: Pingfan Liu <piliu@redhat.com>
Cc: Baoquan He <bhe@redhat.com>, Donald Dutile <ddutile@redhat.com>,
	Jiri Bohac <jbohac@suse.cz>, Tao Liu <ltao@redhat.com>,
	Vivek Goyal <vgoyal@redhat.com>, Dave Young <dyoung@redhat.com>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] kdump: crashkernel reservation from CMA
Date: Thu, 30 Nov 2023 14:43:08 +0100
Message-ID: <ZWiRbLGdBMO2jFGs@tiehlicka>
In-Reply-To: <CAF+s44QSJL5e6BVTAyyHR9Kzx7RJqZSkR=uXEypaouK_XuBbEw@mail.gmail.com>

On Thu 30-11-23 21:33:04, Pingfan Liu wrote:
> On Thu, Nov 30, 2023 at 9:29 PM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Thu 30-11-23 20:04:59, Baoquan He wrote:
> > > On 11/30/23 at 11:16am, Michal Hocko wrote:
> > > > On Thu 30-11-23 11:00:48, Baoquan He wrote:
> > > > [...]
> > > > > Now, we are worried about whether there is a risk if the CMA area is
> > > > > taken over by the kdump kernel as system RAM. E.g. is it possible that
> > > > > the 1st kernel's ongoing RDMA or DMA will interfere with the kdump
> > > > > kernel's normal memory accesses? The kdump kernel usually only resets
> > > > > and initializes the devices it needs, e.g. the dump target. Devices it
> > > > > does not need are not shut down and are left running.
> > > >
> > > > I do not really want to discount your concerns but I am a bit confused why
> > > > this matters so much. First of all, if there is a buggy RDMA driver
> > > > which doesn't use the proper pinning API (which would migrate away from
> > > > the CMA) then what is the worst case? We will potentially get the crash
> > > > kernel corrupted and fail to take a proper kernel crash dump, right? Is
> > > > this worrisome? Yes. Is it a real roadblock? I do not think so. The problem
> > > > seems theoretical to me and it is not CMA usage at fault here IMHO. It
> > > > is the said theoretical driver that needs fixing anyway.
> > > >
> > > > Now, it is really fair to mention that CMA-backed crash kernel memory
> > > > has some limitations:
> > > >     - CMA reservation can only be used by the userspace in the
> > > >       primary kernel. If the size is overshot this might have
> > > >       negative impact on kernel allocations
> > > >     - userspace memory dumping in the crash kernel is fundamentally
> > > >       incomplete.
> > >
> > > I am not sure if we are talking about the same thing. My concern is:
> > > ====================================================================
> > > 1) system corruption happened, crash dumping is prepared, CPUs and
> > > interrupt controllers are shut down;
> > > 2) all PCI devices are kept alive;
> > > 3) kdump kernel boots up, initialization is only done on those devices
> > > whose drivers are added into the kdump kernel's initrd;
> > > 4) those in-flight DMA engines could still be working if their kernel
> > > module is not loaded;
> > >
> > > In this case, if the DMA's destination is located in the crashkernel=,cma
> > > region, the DMA writing could continue even when the kdump kernel has put
> > > important kernel data into the area. Is this possible or absolutely not
> > > possible with DMA, RDMA, or any other stuff which could keep accessing
> > > that area?
> >
> > I do understand your concern. But as already stated, if anybody uses
> > movable memory (CMA included) as a target of {R}DMA then that memory
> > should be properly pinned. That would mean that the memory will be
> > migrated to somewhere outside of movable (CMA) memory before the
> > transfer is configured. So modulo bugs this shouldn't really happen.
> > Are there {R}DMA drivers that do not pin memory correctly? Possibly. Is
> > that a roadblock to using CMA to back crash kernel memory? I do
> > not think so. Those drivers should be fixed instead.
> >
> I think that is our concern. Is there any method to guarantee that this
> will not happen, instead of 'should be'?
> Any static analysis at compile time or dynamic checking method?

I am not aware of any method to detect that a driver is going to configure
an RDMA transfer.
 
> If this can be resolved, I think this method is promising.

Are you indicating this is a mandatory prerequisite?
-- 
Michal Hocko
SUSE Labs
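
The pinning argument made in the thread above can be modelled with a toy
userspace sketch. It is only an illustration of the reasoning, not kernel
code: every name in it (toy_page, toy_pin_longterm, and so on) is
hypothetical. In the kernel itself the corresponding mechanism is the
pin_user_pages*() family used with FOLL_LONGTERM, which is what the thread
means by "the proper pinning API". The sketch assumes a page is either in
CMA or not, and that a long-term pin migrates it out of CMA before any
transfer is set up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
enum toy_area { TOY_AREA_NORMAL, TOY_AREA_CMA };

struct toy_page {
	enum toy_area area;	/* where the page currently lives */
	bool pinned;		/* true once a long-term pin is held */
};

/* Toy "migration": the data ends up in a page outside of CMA. */
static void toy_migrate_out_of_cma(struct toy_page *page)
{
	page->area = TOY_AREA_NORMAL;
}

/*
 * Correct driver path: take a long-term pin before configuring {R}DMA.
 * The pin first moves the page out of CMA, so the transfer can never
 * target memory that a crash kernel might later reuse.
 */
static void toy_pin_longterm(struct toy_page *page)
{
	assert(page != NULL);
	if (page->area == TOY_AREA_CMA)
		toy_migrate_out_of_cma(page);
	page->pinned = true;
}

/* True if a transfer programmed on this page would write into CMA. */
static bool toy_dma_hits_cma(const struct toy_page *page)
{
	return page->area == TOY_AREA_CMA;
}
```

A driver that calls toy_pin_longterm() before programming the device never
leaves a transfer pointing into CMA. A driver that skips the pin can, and
that is the "buggy driver" case the thread says should be fixed in the
driver rather than treated as a flaw of the CMA reservation.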
