All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kiszka <jan.kiszka@domain.hid>
To: Philippe Gerum <rpm@xenomai.org>
Cc: adeos-main <adeos-main@gna.org>
Subject: Re: [Adeos-main] [RFC][PATCH] x86: Work around deadlock between ipipe_critical_enter and rwlocks
Date: Mon, 23 Aug 2010 15:25:33 +0200	[thread overview]
Message-ID: <4C7276CD.1010305@domain.hid> (raw)
In-Reply-To: <4C6BF202.4@domain.hid>

Jan Kiszka wrote:
> When preempting the read side of a Linux rwlock with a critical IPI
> while the write side was already spinning and Linux is the top-most
> domain, ipipe_critical_enter will trigger a deadlock.
> 
> Work around this issue by detecting long stalls while waiting for all
> CPUs, terminate the request in this case, and restart it. In that case
> we have to make sure that a potential sync function is not executed
> multiple time, so we have to synchronize the restart on all CPUs passing
> through the IPI handler, though now in arbitrary order.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
> ---
> 
> This patch is still under test, and as it usually took at least several
> hundred reboot cycles to trigger the issue, this test may take some more
> days. Still I would like to collect first feedback on the proposal. I'm
> not fully happy with having to define an arbitrary loop count as restart
> condition. Let's hope this behaves properly when the number of CPU
> increases.

Patch ran well a few thousand reboot cycles inside kvm (until the latter
crashed the host - should have updated it first...) while it used to
lock up after a few hundreds before.

> 
> I haven't checked all other arches, but at least powerpc requires a
> similar fix - maybe by pushing the generic bits into generic ipipe code.
> 
>  arch/x86/kernel/ipipe.c |   38 +++++++++++++++++++++++++++++++++++++-
>  1 files changed, 37 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kernel/ipipe.c b/arch/x86/kernel/ipipe.c
> index 76283c3..cc2ca03 100644
> --- a/arch/x86/kernel/ipipe.c
> +++ b/arch/x86/kernel/ipipe.c
> @@ -55,10 +55,14 @@ DEFINE_PER_CPU(struct pt_regs, __ipipe_tick_regs);
>  
>  #ifdef CONFIG_SMP
>  
> +#define IPIPE_CRITICAL_TIMEOUT	1000000
> +
>  static cpumask_t __ipipe_cpu_sync_map;
>  
>  static cpumask_t __ipipe_cpu_lock_map;
>  
> +static cpumask_t __ipipe_cpu_pass_map;
> +
>  static unsigned long __ipipe_critical_lock;
>  
>  static IPIPE_DEFINE_SPINLOCK(__ipipe_cpu_barrier);
> @@ -406,6 +410,8 @@ void __ipipe_do_critical_sync(unsigned irq, void *cookie)
>  		/* Call the sync routine if any. */
>  		__ipipe_cpu_sync();
>  
> +	cpu_set(cpu, __ipipe_cpu_pass_map);
> +
>  	spin_unlock(&__ipipe_cpu_barrier);
>  
>  	cpu_clear(cpu, __ipipe_cpu_sync_map);
> @@ -441,6 +447,7 @@ unsigned long ipipe_critical_enter(void (*syncfn) (void))
>  
>  	{
>  		int cpu = ipipe_processor_id();
> +		unsigned long loops;
>  		cpumask_t lock_map;
>  
>  		if (!cpu_test_and_set(cpu, __ipipe_cpu_lock_map)) {
> @@ -451,17 +458,46 @@ unsigned long ipipe_critical_enter(void (*syncfn) (void))
>  				} while (++n < cpu);
>  			}
>  
> +restart:
>  			spin_lock(&__ipipe_cpu_barrier);
>  
>  			__ipipe_cpu_sync = syncfn;
>  
> +			cpus_clear(__ipipe_cpu_pass_map);
> +			cpu_set(cpu, __ipipe_cpu_pass_map);
> +
>  			/* Send the sync IPI to all processors but the current one. */
>  			apic->send_IPI_allbutself(IPIPE_CRITICAL_VECTOR);
>  
>  			cpus_andnot(lock_map, cpu_online_map, __ipipe_cpu_lock_map);
>  
> -			while (!cpus_equal(__ipipe_cpu_sync_map, lock_map))
> +			loops = IPIPE_CRITICAL_TIMEOUT;
> +
> +			while (!cpus_equal(__ipipe_cpu_sync_map, lock_map)) {
>  				cpu_relax();
> +
> +				if (--loops == 0) {
> +					/*
> +					 * We ran into a deadlock due to a
> +					 * contended rwlock. Cancel this round
> +					 * and retry after some cycles.
> +					 */
> +					__ipipe_cpu_sync = NULL;
> +
> +					spin_unlock(&__ipipe_cpu_barrier);
> +
> +					/*
> +					 * Ensure all CPUs consumed the IPI to
> +					 * avoid running __ipipe_cpu_sync
> +					 * prematurely.
> +					 */
> +					while (!cpus_equal(cpu_online_map,
> +							   __ipipe_cpu_pass_map))
> +						cpu_relax();
> +
> +					goto restart;
> +				}
> +			}
>  		}
>  
>  		atomic_inc(&__ipipe_critical_count);

Any further comments or change requests? Otherwise I would send out a
git pull request for this patch and two minor ones in my 34-x86 queue.

Also, I'm considering to refactor the critical sync code, moving it
completely to kernel/ipipe/core.c. There seem to be only minor
variations in its "arch-specific" implementations. But that would be
2.6.35 material.

BTW, any x86 bits for 2.6.35 in some private branch of yours, Philippe?
I have "kernel update" on my agenda, and I would like to use that
version as base. If there are no bits, I would start generating some.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



  reply	other threads:[~2010-08-23 13:25 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-18 14:45 [Adeos-main] [RFC][PATCH] x86: Work around deadlock between ipipe_critical_enter and rwlocks Jan Kiszka
2010-08-23 13:25 ` Jan Kiszka [this message]
2010-08-23 13:32   ` Philippe Gerum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C7276CD.1010305@domain.hid \
    --to=jan.kiszka@domain.hid \
    --cc=adeos-main@gna.org \
    --cc=rpm@xenomai.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.