All of lore.kernel.org
 help / color / mirror / Atom feed
From: jamie@shareable.org (Jamie Lokier)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] ARM: Add SWP/SWPB emulation for ARMv7 processors (v3)
Date: Wed, 6 Jan 2010 21:53:36 +0000	[thread overview]
Message-ID: <20100106215336.GA19488@shareable.org> (raw)

Leif Lindholm wrote:
> > From: Jamie Lokier [mailto:jamie at shareable.org]
> > Then calling it like this:
> > 
> > 		__user_swp_asm(data, address, res, "");
> > 		__user_swp_asm(data, address, res, "b");
> 
> Neat.
> But how about, for clarity, keeping the calling syntax in the calling
> functions and add macros for the variants?:
> 
> #define __user_swp_asm_generic(data, addr, res, B) \
> ...
> #define __user_swp_asm(data, addr, res) \
>         __user_swp_asm_generic(data, addr, res, "")
> #define __user_swpb_asm(data, addr, res) \
>         __user_swp_asm_generic(data, addr, res, "b")

I'd just call the generic one __user_swpX_data and call it - I don't
think the additional tiny macros add any clarity, particularly with
being used in one place just a few lines down.  But it's totally
subjective and up to you.

> > > +static int emulate_swp(struct pt_regs *regs, unsigned int address,
> > > +		       unsigned int destreg, unsigned int data)
> > 
> > > +static int emulate_swpb(struct pt_regs *regs, unsigned int address,
> > > +			unsigned int destreg, unsigned int data)
> > 
> > Two almost identical functions.  I wonder if it would be better to
> > merge them and take a flag.  It would also reduce the compiled code
> > size.
> 
> I'm hesitant to add more than 4 arguments (adds stack overhead).
> Also, at least cs2009q3 gcc (4.3.3) seems to inline both of these, so
> not sure a codesize improvement would occur in practise.

If they are inlined, there is no stack overhead.  Mainly I thought
it would make the source smaller/tidier tidier :-)

> > Why is the smp_mb() needed?  I don't doubt there's a reason, but I
> > don't see what it is.
> 
> A DMB is required between acquiring a lock and accessing the protected
> resource, as well as between modifying a protected resource and
> releasing its lock. Because there is no way to tell whether the SWP
> performed a lock or unlock operation, inserting the barriers on
> either side seemed the safest way to ensure that code written for ARMv5
> or earlier would work as expected.
> 
> I guess a case could be made that this is an application problem and
> should be resolved at that end.

That's a really good reason, and thanks for thinking of it: To make
ARMv5 application code that doesn't know about DMB work properly.  Any
code which is DMB-aware is likely to use LDREX/STREX itself, so this
is great match. :-)

However, please follow this advice from Documentation/SubmitChecklist:

>>    24: All memory barriers {e.g., barrier(), rmb(), wmb()} need a
>>        comment in the source code that explains the logic of what
>>        they are doing and why.

I think a comment is even more important in this case, because the
barriers are an ABI design decision you've made; nobody could deduce
why they are there from SMP correctness alone.

Unfortunately there are other places in threaded code that need DMB
(for example any good implementation of pthread_once), so ARMv5
threaded code can still fail in subtle, unpredictable ways :-( Is it
possible to turn off weak memory ordering, after a SWP instruction is
detected?  Though even that would not ensure correctness of
pthread_once equivalent if it comes before any mutex locks in a program :(

I wonder, too, about what ARM ARM says about LDREX/STREX only working
on memory with the "Shared TLB attribute".  Will single-threaded code
using SWP on mapped shared memory get its expected atomic behaviour at
all with your emulation?

> Good point, will do that in the next version.
>  
> > > +#ifndef CONFIG_ALIGNMENT_TRAP
> > > +	res = proc_mkdir("cpu", NULL);
> 
> > ?  Is that to work with different kernel versions?
> 
> It's to ensure it would work (without console warnings) even if someone
> decides to disable ALIGNMENT_TRAP. An alternative would be to strip the
> creation of /proc/cpu out from mm/alignment.c and put it somewhere else
> (or move the stats file somewhere else - but it seemed logical to group
> with /proc/alignment).

Seems to me both should be sysfs CPU attributes anyway, but I don't
know much about that so don't take my word for it.  The ifdef is kind
of ugly but maybe unavoidable.

-- Jamie

             reply	other threads:[~2010-01-06 21:53 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-06 21:53 Jamie Lokier [this message]
  -- strict thread matches above, loose matches on Subject: below --
2010-01-05 18:24 [PATCH] ARM: Add SWP/SWPB emulation for ARMv7 processors (v3) Leif Lindholm
2010-01-05 19:43 ` Jamie Lokier
2010-01-06 16:23   ` Catalin Marinas
2010-01-06 16:32     ` Russell King - ARM Linux
2010-01-06 16:58       ` Catalin Marinas
2010-01-06 18:17     ` Jamie Lokier
2010-01-07  9:59       ` Catalin Marinas
2010-01-08 14:19         ` Jamie Lokier
2010-01-06 19:19   ` Leif Lindholm
2010-01-06 19:36     ` Russell King - ARM Linux
2010-01-14 13:08       ` Leif Lindholm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100106215336.GA19488@shareable.org \
    --to=jamie@shareable.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.