public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Robert Richter <robert.richter@amd.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Carl E. Love" <cel@us.ibm.com>,
	Michael Ellerman <michaele@au1.ibm.com>,
	oprofile-list <oprofile-list@lists.sourceforge.net>
Subject: Re: [PATCH] oprofile: fix crash when accessing freed task structs
Date: Mon, 16 Aug 2010 08:22:04 +1000	[thread overview]
Message-ID: <1281910924.2811.0.camel@pasglop> (raw)
In-Reply-To: <20100813153910.GD26154@erda.amd.com>

On Fri, 2010-08-13 at 17:39 +0200, Robert Richter wrote:
> On 02.08.10 21:39:33, Benjamin Herrenschmidt wrote:
> 
> > I can't tell that much about the workload, I don't have access to it
> > either, let's say that from my point of view it's a "customer" binary
> > blob.
> > 
> > I can re-trigger it though.
> 
> Benjamin,
> 
> can you try the patch below?

Thanks. I'll see if the folks who have a repro-case can give it a spin
for me.

Cheers,
Ben.

> Thanks,
> 
> -Robert
> 
> >From 4435322debc38097e9e863e14597ab3f78814d14 Mon Sep 17 00:00:00 2001
> From: Robert Richter <robert.richter@amd.com>
> Date: Fri, 13 Aug 2010 16:29:04 +0200
> Subject: [PATCH] oprofile: fix crash when accessing freed task structs
> 
> This patch fixes a crash during shutdown reported below. The crash is
> caused be accessing already freed task structs. The fix changes the
> order for registering and unregistering notifier callbacks.
> 
> All notifiers must be initialized before buffers start working. To
> stop buffer synchronization we cancel all workqueues, unregister the
> notifier callback and then flush all buffers. After all of this we
> finally can free all tasks listed.
> 
> This should avoid accessing freed tasks.
> 
> On 22.07.10 01:14:40, Benjamin Herrenschmidt wrote:
> 
> > So the initial observation is a spinlock bad magic followed by a crash
> > in the spinlock debug code:
> >
> > [ 1541.586531] BUG: spinlock bad magic on CPU#5, events/5/136
> > [ 1541.597564] Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d03
> >
> > Backtrace looks like:
> >
> >       spin_bug+0x74/0xd4
> >       ._raw_spin_lock+0x48/0x184
> >       ._spin_lock+0x10/0x24
> >       .get_task_mm+0x28/0x8c
> >       .sync_buffer+0x1b4/0x598
> >       .wq_sync_buffer+0xa0/0xdc
> >       .worker_thread+0x1d8/0x2a8
> >       .kthread+0xa8/0xb4
> >       .kernel_thread+0x54/0x70
> >
> > So we are accessing a freed task struct in the work queue when
> > processing the samples.
> 
> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Robert Richter <robert.richter@amd.com>
> ---
>  drivers/oprofile/buffer_sync.c |   27 ++++++++++++++-------------
>  drivers/oprofile/cpu_buffer.c  |    2 --
>  2 files changed, 14 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/oprofile/buffer_sync.c b/drivers/oprofile/buffer_sync.c
> index a9352b2..b7e755f 100644
> --- a/drivers/oprofile/buffer_sync.c
> +++ b/drivers/oprofile/buffer_sync.c
> @@ -141,16 +141,6 @@ static struct notifier_block module_load_nb = {
>  	.notifier_call = module_load_notify,
>  };
>  
> -
> -static void end_sync(void)
> -{
> -	end_cpu_work();
> -	/* make sure we don't leak task structs */
> -	process_task_mortuary();
> -	process_task_mortuary();
> -}
> -
> -
>  int sync_start(void)
>  {
>  	int err;
> @@ -158,7 +148,7 @@ int sync_start(void)
>  	if (!zalloc_cpumask_var(&marked_cpus, GFP_KERNEL))
>  		return -ENOMEM;
>  
> -	start_cpu_work();
> +	mutex_lock(&buffer_mutex);
>  
>  	err = task_handoff_register(&task_free_nb);
>  	if (err)
> @@ -173,7 +163,10 @@ int sync_start(void)
>  	if (err)
>  		goto out4;
>  
> +	start_cpu_work();
> +
>  out:
> +	mutex_unlock(&buffer_mutex);
>  	return err;
>  out4:
>  	profile_event_unregister(PROFILE_MUNMAP, &munmap_nb);
> @@ -182,7 +175,6 @@ out3:
>  out2:
>  	task_handoff_unregister(&task_free_nb);
>  out1:
> -	end_sync();
>  	free_cpumask_var(marked_cpus);
>  	goto out;
>  }
> @@ -190,11 +182,20 @@ out1:
>  
>  void sync_stop(void)
>  {
> +	/* flush buffers */
> +	mutex_lock(&buffer_mutex);
> +	end_cpu_work();
>  	unregister_module_notifier(&module_load_nb);
>  	profile_event_unregister(PROFILE_MUNMAP, &munmap_nb);
>  	profile_event_unregister(PROFILE_TASK_EXIT, &task_exit_nb);
>  	task_handoff_unregister(&task_free_nb);
> -	end_sync();
> +	mutex_unlock(&buffer_mutex);
> +	flush_scheduled_work();
> +
> +	/* make sure we don't leak task structs */
> +	process_task_mortuary();
> +	process_task_mortuary();
> +
>  	free_cpumask_var(marked_cpus);
>  }
>  
> diff --git a/drivers/oprofile/cpu_buffer.c b/drivers/oprofile/cpu_buffer.c
> index 219f79e..f179ac2 100644
> --- a/drivers/oprofile/cpu_buffer.c
> +++ b/drivers/oprofile/cpu_buffer.c
> @@ -120,8 +120,6 @@ void end_cpu_work(void)
>  
>  		cancel_delayed_work(&b->work);
>  	}
> -
> -	flush_scheduled_work();
>  }
>  
>  /*
> -- 
> 1.7.1.1
> 
> 
> 



  reply	other threads:[~2010-08-15 22:22 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-22  5:14 Possible Oprofile crash/race when stopping Benjamin Herrenschmidt
2010-07-28 12:21 ` Robert Richter
2010-08-03  1:39   ` Benjamin Herrenschmidt
2010-08-13 15:39     ` [PATCH] oprofile: fix crash when accessing freed task structs Robert Richter
2010-08-15 22:22       ` Benjamin Herrenschmidt [this message]
2010-08-31 10:28         ` Robert Richter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1281910924.2811.0.camel@pasglop \
    --to=benh@kernel.crashing.org \
    --cc=cel@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michaele@au1.ibm.com \
    --cc=oprofile-list@lists.sourceforge.net \
    --cc=robert.richter@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox