All of lore.kernel.org
 help / color / mirror / Atom feed
From: Olaf Hering <olaf@aepfle.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: kexec@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] add hook to read_from_oldmem() to check for non-ram pages
Date: Fri, 6 May 2011 12:55:46 +0200	[thread overview]
Message-ID: <20110506105545.GA16945@aepfle.de> (raw)
In-Reply-To: <20110505142551.b4d2d95a.akpm@linux-foundation.org>

On Thu, May 05, Andrew Morton wrote:

> On Tue, 3 May 2011 21:08:06 +0200
> Olaf Hering <olaf@aepfle.de> wrote:

> > This will reduce the time to read /proc/vmcore.  Without this change a
> > 512M guest with 128M crashkernel region needs 200 seconds to read it,
> > with this change it takes just 2 seconds.
> 
> Seems reasonable, I suppose.

Andrew,
Thanks for your feedback.

> Is there some suitable ifdef we can put around this stuff to avoid
> adding it to kernel builds which will never use it?

The change is for pv-on-hvm guests. In this setup the (out-of-tree)
paravirtualized drivers shutdown the emulated hardware, then they
communicate directly with the backend.
There is no ifdef right now.  I guess at some point, when xen is fully
merged, this hook can be put into some CONFIG_XEN_PV_ON_HVM thing.

> > +void register_oldmem_pfn_is_ram(int (*fn)(unsigned long))
> > +{
> > +	if (oldmem_pfn_is_ram == NULL)
> > +		oldmem_pfn_is_ram = fn;
> > +}
> 
> This is racy, and it should return a success code.  And we may as well
> mark it __must_check to prevent people from cheating.

I will update that part.

> > +void unregister_oldmem_pfn_is_ram(void)
> > +{
> > +	wait_event(oldmem_fn_waitq, atomic_read(&oldmem_fn_refcount) == 0);
> > +	oldmem_pfn_is_ram = NULL;
> > +	wmb();
> > +}
> 
> I'd say we should do away with the (also racy) refcount thing. 
> Instead, require that callers not be using the thing when they
> unregister.  ie: that callers not be buggy.

I think oldmem_pfn_is_ram can be cleared unconditionally, the NULL check
in pfn_is_ram() below will prevent a crash.
This whole refcount thing is to prevent a module unload while
pfn_is_ram() is calling the hook, in other words the called code should
not go away until the hook returns to pfn_is_ram().
Should the called code increase/decrease the modules refcount instead?
I remember there was some MODULE_INC/MODULE_DEC macro (cant remember the
exact name) at some point. What needs to be done inside the module to
prevent an unload while its still in use? Is it __module_get/module_put
for each call of fn()? 

The called function which will go into the xen source at some point
looks like shown below. HVMOP_get_mem_type was just merged into xen-unstable.

xen-unstable.hg/unmodified_drivers/linux-2.6/platform-pci/platform-pci.c

#ifdef HAVE_OLDMEM_PFN_IS_RAM
static int xen_oldmem_pfn_is_ram(unsigned long pfn)
{
	struct xen_hvm_get_mem_type a;
	int ret;

	a.domid = DOMID_SELF;
	a.pfn = pfn;
	if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a))
		return -ENXIO;

	switch (a.mem_type) {
		case HVMMEM_mmio_dm:
			ret = 0;
			break;
		case HVMMEM_ram_rw:
		case HVMMEM_ram_ro:
		default:
			ret = 1;
			break;
	}

	return ret;
}
#endif

static int __devinit platform_pci_init(...)
{
        /* other init stuff */
#ifdef HAVE_OLDMEM_PFN_IS_RAM
        register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram);
#endif
        /* other init stuff */
}


Also, this xen driver has no module_exit. So for xen the unregister is
not an issue. I havent looked at the to-be-merged pv-on-hvm drivers,
maybe they can be properly unloaded.

> > +static int pfn_is_ram(unsigned long pfn)
> > +{
> > +	int (*fn)(unsigned long);
> > +	/* pfn is ram unless fn() checks pagetype */
> > +	int ret = 1;
> > +
> > +	atomic_inc(&oldmem_fn_refcount);
> > +	smp_mb__after_atomic_inc();
> > +	fn = oldmem_pfn_is_ram;
> > +	if (fn)
> > +		ret = fn(pfn);
> > +	if (atomic_dec_and_test(&oldmem_fn_refcount))
> > +		wake_up(&oldmem_fn_waitq);
> > +
> > +	return ret;
> > +}
> 
> This function would have been a suitable place at which to document the
> entire feature.  As it stands, anyone who is reading this code won't
> have any clue why it exists.

I will add a comment.

> > +EXPORT_SYMBOL_GPL(register_oldmem_pfn_is_ram);
> > +EXPORT_SYMBOL_GPL(unregister_oldmem_pfn_is_ram);
> 
> Each export should be placed immediately after the function which is
> being exported, please.  Checkpatch reports this.  Please use checkpatch.

Will do.

> > +++ linux-2.6.39-rc5/include/linux/crash_dump.h
> > @@ -66,6 +66,11 @@ static inline void vmcore_unusable(void)
> >  	if (is_kdump_kernel())
> >  		elfcorehdr_addr = ELFCORE_ADDR_ERR;
> >  }
> > +
> > +#define HAVE_OLDMEM_PFN_IS_RAM 1
> 
> What's this for?

So that out-of-tree drivers dont fail to compile when they call that
hook unconditionally. Perhaps they could use the kernel version.

Olaf

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

WARNING: multiple messages have this Message-ID (diff)
From: Olaf Hering <olaf@aepfle.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org
Subject: Re: [PATCH] add hook to read_from_oldmem() to check for non-ram pages
Date: Fri, 6 May 2011 12:55:46 +0200	[thread overview]
Message-ID: <20110506105545.GA16945@aepfle.de> (raw)
In-Reply-To: <20110505142551.b4d2d95a.akpm@linux-foundation.org>

On Thu, May 05, Andrew Morton wrote:

> On Tue, 3 May 2011 21:08:06 +0200
> Olaf Hering <olaf@aepfle.de> wrote:

> > This will reduce the time to read /proc/vmcore.  Without this change a
> > 512M guest with 128M crashkernel region needs 200 seconds to read it,
> > with this change it takes just 2 seconds.
> 
> Seems reasonable, I suppose.

Andrew,
Thanks for your feedback.

> Is there some suitable ifdef we can put around this stuff to avoid
> adding it to kernel builds which will never use it?

The change is for pv-on-hvm guests. In this setup the (out-of-tree)
paravirtualized drivers shutdown the emulated hardware, then they
communicate directly with the backend.
There is no ifdef right now.  I guess at some point, when xen is fully
merged, this hook can be put into some CONFIG_XEN_PV_ON_HVM thing.

> > +void register_oldmem_pfn_is_ram(int (*fn)(unsigned long))
> > +{
> > +	if (oldmem_pfn_is_ram == NULL)
> > +		oldmem_pfn_is_ram = fn;
> > +}
> 
> This is racy, and it should return a success code.  And we may as well
> mark it __must_check to prevent people from cheating.

I will update that part.

> > +void unregister_oldmem_pfn_is_ram(void)
> > +{
> > +	wait_event(oldmem_fn_waitq, atomic_read(&oldmem_fn_refcount) == 0);
> > +	oldmem_pfn_is_ram = NULL;
> > +	wmb();
> > +}
> 
> I'd say we should do away with the (also racy) refcount thing. 
> Instead, require that callers not be using the thing when they
> unregister.  ie: that callers not be buggy.

I think oldmem_pfn_is_ram can be cleared unconditionally, the NULL check
in pfn_is_ram() below will prevent a crash.
This whole refcount thing is to prevent a module unload while
pfn_is_ram() is calling the hook, in other words the called code should
not go away until the hook returns to pfn_is_ram().
Should the called code increase/decrease the modules refcount instead?
I remember there was some MODULE_INC/MODULE_DEC macro (cant remember the
exact name) at some point. What needs to be done inside the module to
prevent an unload while its still in use? Is it __module_get/module_put
for each call of fn()? 

The called function which will go into the xen source at some point
looks like shown below. HVMOP_get_mem_type was just merged into xen-unstable.

xen-unstable.hg/unmodified_drivers/linux-2.6/platform-pci/platform-pci.c

#ifdef HAVE_OLDMEM_PFN_IS_RAM
static int xen_oldmem_pfn_is_ram(unsigned long pfn)
{
	struct xen_hvm_get_mem_type a;
	int ret;

	a.domid = DOMID_SELF;
	a.pfn = pfn;
	if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a))
		return -ENXIO;

	switch (a.mem_type) {
		case HVMMEM_mmio_dm:
			ret = 0;
			break;
		case HVMMEM_ram_rw:
		case HVMMEM_ram_ro:
		default:
			ret = 1;
			break;
	}

	return ret;
}
#endif

static int __devinit platform_pci_init(...)
{
        /* other init stuff */
#ifdef HAVE_OLDMEM_PFN_IS_RAM
        register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram);
#endif
        /* other init stuff */
}


Also, this xen driver has no module_exit. So for xen the unregister is
not an issue. I havent looked at the to-be-merged pv-on-hvm drivers,
maybe they can be properly unloaded.

> > +static int pfn_is_ram(unsigned long pfn)
> > +{
> > +	int (*fn)(unsigned long);
> > +	/* pfn is ram unless fn() checks pagetype */
> > +	int ret = 1;
> > +
> > +	atomic_inc(&oldmem_fn_refcount);
> > +	smp_mb__after_atomic_inc();
> > +	fn = oldmem_pfn_is_ram;
> > +	if (fn)
> > +		ret = fn(pfn);
> > +	if (atomic_dec_and_test(&oldmem_fn_refcount))
> > +		wake_up(&oldmem_fn_waitq);
> > +
> > +	return ret;
> > +}
> 
> This function would have been a suitable place at which to document the
> entire feature.  As it stands, anyone who is reading this code won't
> have any clue why it exists.

I will add a comment.

> > +EXPORT_SYMBOL_GPL(register_oldmem_pfn_is_ram);
> > +EXPORT_SYMBOL_GPL(unregister_oldmem_pfn_is_ram);
> 
> Each export should be placed immediately after the function which is
> being exported, please.  Checkpatch reports this.  Please use checkpatch.

Will do.

> > +++ linux-2.6.39-rc5/include/linux/crash_dump.h
> > @@ -66,6 +66,11 @@ static inline void vmcore_unusable(void)
> >  	if (is_kdump_kernel())
> >  		elfcorehdr_addr = ELFCORE_ADDR_ERR;
> >  }
> > +
> > +#define HAVE_OLDMEM_PFN_IS_RAM 1
> 
> What's this for?

So that out-of-tree drivers dont fail to compile when they call that
hook unconditionally. Perhaps they could use the kernel version.

Olaf

  reply	other threads:[~2011-05-06 10:55 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-07  9:56 dynamic oldmem in kdump kernel Olaf Hering
2011-04-07  9:56 ` Olaf Hering
2011-04-07 10:23 ` Américo Wang
2011-04-07 10:23   ` Américo Wang
2011-04-07 13:12   ` Olaf Hering
2011-04-07 13:12     ` Olaf Hering
2011-04-08 10:49     ` Américo Wang
2011-04-08 10:49       ` Américo Wang
2011-05-02 10:22 ` Olaf Hering
2011-05-02 10:22   ` Olaf Hering
2011-05-03 19:08 ` [PATCH] add hook to read_from_oldmem() to check for non-ram pages Olaf Hering
2011-05-03 19:08   ` Olaf Hering
2011-05-05 21:25   ` Andrew Morton
2011-05-05 21:25     ` Andrew Morton
2011-05-06 10:55     ` Olaf Hering [this message]
2011-05-06 10:55       ` Olaf Hering
2011-05-06 19:30       ` Andrew Morton
2011-05-06 19:30         ` Andrew Morton
2011-05-06 19:39         ` Olaf Hering
2011-05-06 19:39           ` Olaf Hering
2011-05-06 19:55           ` Andrew Morton
2011-05-06 19:55             ` Andrew Morton
2011-05-06 13:20   ` [PATCH v2] " Olaf Hering
2011-05-06 13:20     ` Olaf Hering

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110506105545.GA16945@aepfle.de \
    --to=olaf@aepfle.de \
    --cc=akpm@linux-foundation.org \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.