public inbox for linux-arch@vger.kernel.org
 help / color / mirror / Atom feed
* Proposal: (u)intptr_t replaces (unsigned) long as opaque type
@ 2010-09-18 21:42 Greg McGary
  2010-09-18 23:54 ` Matthew Wilcox
  2010-09-19  4:33 ` Alexey Zaytsev
  0 siblings, 2 replies; 10+ messages in thread
From: Greg McGary @ 2010-09-18 21:42 UTC (permalink / raw)
  To: linux-arch

  I am porting Linux to a new architecture (a massively multi-threaded network processor), which has a unique and bothersome characteristic: sizeof (void*) > sizeof (long).  In most regards, it is a 32-bit machine, but pointers and all GPRs are 48 bit.  Pointers occupy 64 bits in memory, with 16 bits ignored by load/store.

Unfortunately, the kernel is thick with pointer/int casts and universally assumes that sizeof (void*) == sizeof(long).  The first phase of the port is s/long/intptr_t/ and s/unsigned long/uintptr_t/ for data that hold addresses and for casts to/from address data.  I think it improves code comprehensibility to use (u)intptr_t for opaque types and for address arithmetic.  I'm three weeks into the project and have almost 5000 changes.  I probably have another week and a few thousand more to go before I have eliminated the pointer/int cast warnings and can work on more substantive phases.

For most (all?) existing ports, (u)intptr_t=(unsigned) long, so in theory there should be zero change to the compiled Linux image for conventional ILP=32 LP=64 architectures.

If the intent is really to extend or truncate with a cast between pointer and integer, it can be done like so:

     void *ptr;
     long l;
     int i;
     long long ll;

     ptr = (void *)(intptr_t) l;
     l = (long)(intptr_t) ptr;

     ptr = (void *)(intptr_t) ll;
     ll = (long long)(intptr_t) ptr;

     ptr = (void *)(intptr_t) i;
     i = (int)(intptr_t) ptr;

Are maintainers willing to accept patches to covert from (unsigned) long and (u)intptr_t ?  I propose to do this in many small installments for digestibility and to moderate the downstream impact of merge conflicts.

G

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Proposal: (u)intptr_t replaces (unsigned) long as opaque type
  2010-09-18 21:42 Proposal: (u)intptr_t replaces (unsigned) long as opaque type Greg McGary
@ 2010-09-18 23:54 ` Matthew Wilcox
  2010-09-19  2:43   ` Greg McGary
  2010-09-19  4:33 ` Alexey Zaytsev
  1 sibling, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2010-09-18 23:54 UTC (permalink / raw)
  To: Greg McGary; +Cc: linux-arch

On Sat, Sep 18, 2010 at 02:42:35PM -0700, Greg McGary wrote:
>  I am porting Linux to a new architecture (a massively multi-threaded network processor), which has a unique and bothersome characteristic: sizeof (void*) > sizeof (long).  In most regards, it is a 32-bit machine, but pointers and all GPRs are 48 bit.  Pointers occupy 64 bits in memory, with 16 bits ignored by load/store.

Linux really only supports ILP32 and LP64 models.  Pick one, and make
your gcc mmtnp-unknown-linux triplet support it.

ILP32 may be a better model for you, depending how much RAM your mmtnp
processor is likely to support.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Proposal: (u)intptr_t replaces (unsigned) long as opaque type
  2010-09-18 23:54 ` Matthew Wilcox
@ 2010-09-19  2:43   ` Greg McGary
  2010-09-19  4:06     ` Matthew Wilcox
  0 siblings, 1 reply; 10+ messages in thread
From: Greg McGary @ 2010-09-19  2:43 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-arch

  On 09/18/10 16:54, Matthew Wilcox wrote:
> Linux really only supports ILP32 and LP64 models.  Pick one, and make
> your gcc mmtnp-unknown-linux triplet support it.
>
> ILP32 may be a better model for you, depending how much RAM your mmtnp
> processor is likely to support.
>
Yes, I know Linux currently supports only ILP32 and LP64.  What I propose is a way to make it support something new beyond those models, and do so in a way that has no impact on vmlinux images for existing ports.

What is mmtnp?

ILP32 will not work for me because pointers coerced to long will be truncated.  LP64 could work, but performance would suck, so that's not viable.

G

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Proposal: (u)intptr_t replaces (unsigned) long as opaque type
  2010-09-19  2:43   ` Greg McGary
@ 2010-09-19  4:06     ` Matthew Wilcox
  2010-09-19 16:53       ` Greg McGary
  0 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2010-09-19  4:06 UTC (permalink / raw)
  To: Greg McGary; +Cc: linux-arch

On Sat, Sep 18, 2010 at 07:43:42PM -0700, Greg McGary wrote:
>  On 09/18/10 16:54, Matthew Wilcox wrote:
>> Linux really only supports ILP32 and LP64 models.  Pick one, and make
>> your gcc mmtnp-unknown-linux triplet support it.
>>
>> ILP32 may be a better model for you, depending how much RAM your mmtnp
>> processor is likely to support.
>>
> Yes, I know Linux currently supports only ILP32 and LP64.  What I propose is a way to make it support something new beyond those models, and do so in a way that has no impact on vmlinux images for existing ports.

But it's only the beginnings of your problems.  There's so much userspace
code that's written assuming ILP32 / LP64.  Even inside the kernel,
people are going to constantly break your port by introducing new code
that doesn't use uintptr_t.

Then there are other things we use unsigned long for, like interrupt
status flags (spin_lock_irqsave and friends).  Should those be converted
to some new type?

> What is mmtnp?

Your processor.  I took the acronym of the description you gave.

> ILP32 will not work for me because pointers coerced to long will be truncated.  LP64 could work, but performance would suck, so that's not viable.

How bad would performance really suck?  x86-32 performance sucks on
64-bit arithmetic, but that's because it has about three registers.
With a decent size register file (16 or 32), I doubt performance will
be that bad.

If you set up your memory map correctly, truncating pointers to 32 bit
may not be a huge problem.  You can use the kmap() abstraction to run
your kernel in a 32-bit address space.

Essentially, you're asking us to take on a huge number of changes in
order to make your new processor's performance not suck.  I don't think
it likely you're going to find much sympathy here.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Proposal: (u)intptr_t replaces (unsigned) long as opaque type
  2010-09-18 21:42 Proposal: (u)intptr_t replaces (unsigned) long as opaque type Greg McGary
  2010-09-18 23:54 ` Matthew Wilcox
@ 2010-09-19  4:33 ` Alexey Zaytsev
  2010-09-19  6:50   ` David Howells
  1 sibling, 1 reply; 10+ messages in thread
From: Alexey Zaytsev @ 2010-09-19  4:33 UTC (permalink / raw)
  To: Greg McGary; +Cc: linux-arch, Matthew Wilcox

On Sun, Sep 19, 2010 at 01:42, Greg McGary <greg@mcgary.org> wrote:
>  I am porting Linux to a new architecture (a massively multi-threaded
> network processor), which has a unique and bothersome characteristic: sizeof
> (void*) > sizeof (long).  In most regards, it is a 32-bit machine, but
> pointers and all GPRs are 48 bit.  Pointers occupy 64 bits in memory, with
> 16 bits ignored by load/store.

Forgive my ignorence, but is there something fundamentally wrong with
having 48-bit longs?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Proposal: (u)intptr_t replaces (unsigned) long as opaque type
  2010-09-19  4:33 ` Alexey Zaytsev
@ 2010-09-19  6:50   ` David Howells
  0 siblings, 0 replies; 10+ messages in thread
From: David Howells @ 2010-09-19  6:50 UTC (permalink / raw)
  To: Alexey Zaytsev; +Cc: dhowells, Greg McGary, linux-arch, Matthew Wilcox

Alexey Zaytsev <alexey.zaytsev@gmail.com> wrote:

> Forgive my ignorence, but is there something fundamentally wrong with
> having 48-bit longs?

48 is not a power of 2.  I wonder what that might break...

David

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Proposal: (u)intptr_t replaces (unsigned) long as opaque type
  2010-09-19  4:06     ` Matthew Wilcox
@ 2010-09-19 16:53       ` Greg McGary
  2010-09-20  9:30         ` Arnd Bergmann
  0 siblings, 1 reply; 10+ messages in thread
From: Greg McGary @ 2010-09-19 16:53 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-arch

 On 09/18/10 21:06, Matthew Wilcox wrote:

> But it's only the beginnings of your problems.  There's so much userspace
> code that's written assuming ILP32 / LP64.  Even inside the kernel,
> people are going to constantly break your port by introducing new code
> that doesn't use uintptr_t.

Actually, the kernel is my final problem.  The userspace problem turned-out to
be quite manageable.  Within a year, working mostly alone, I have complete binutils,
GCC, uClibc, busybox and QEMU (Linux userland mode) ports running and I didn't even
come close to pulling-out my hair.  I have two-thirds of the SPEC2000 benchmarks
running correctly on QEMU, and I expect the remaining third to yield when I have
some time to debug them.  QEMU required the most surgery, and of the same kind as
Linux needs: s/target_long/target_intptr_t/ for vars that hold addresses.  uClibc
was also fast & loose in a few places regarding pointer/integer casts, but for the
most part, non-kernel code uses void* as the opaque type and char* or void* for
doing address arithmetic.  The kernel is the sloppiest of all regarding
pointer/integer casts.


I'm unconcerned about future breakage.  Over time, people will catch-on to the new
convention and breakage will diminish.  I'm much happier about fixing new problems
as they arise than about maintaining a port that is perpetually far from the mainline.


> Then there are other things we use unsigned long for, like interrupt
> status flags (spin_lock_irqsave and friends).  Should those be converted
> to some new type?

The only types that should be converted are for (a) opaque data that might hold
either a long or a pointer (e.g., ioctl arg, fcnt arg, generic callback function
args) and (b) data on which Linux does address arithmetic.  If an interrupt status
flag is not intended to hold an address, then there's no reason to change it.


> Essentially, you're asking us to take on a huge number of changes in
> order to make your new processor's performance not suck.  I don't think
> it likely you're going to find much sympathy here.

I propose this because (a) there is no impact on generated code for other ports,
and (b) it benefits code quality, flexibility, portability and self-documentation.
If the kernel is so ossified that avoiding changes has become the cardinal virtue,
then I guess I'm hosed.  8^)  As stated before, I'm prepared to submit changes
slowly over time in order to moderate the impact.

G

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Proposal: (u)intptr_t replaces (unsigned) long as opaque type
  2010-09-19 16:53       ` Greg McGary
@ 2010-09-20  9:30         ` Arnd Bergmann
  2010-09-20 20:28           ` Greg McGary
  0 siblings, 1 reply; 10+ messages in thread
From: Arnd Bergmann @ 2010-09-20  9:30 UTC (permalink / raw)
  To: Greg McGary; +Cc: Matthew Wilcox, linux-arch

On Sunday 19 September 2010, Greg McGary wrote:
> > Essentially, you're asking us to take on a huge number of changes in
> > order to make your new processor's performance not suck.  I don't think
> > it likely you're going to find much sympathy here.
> 
> I propose this because (a) there is no impact on generated code for other ports,
> and (b) it benefits code quality, flexibility, portability and self-documentation.
> If the kernel is so ossified that avoiding changes has become the cardinal virtue,
> then I guess I'm hosed.  8^)  As stated before, I'm prepared to submit changes
> slowly over time in order to moderate the impact.

We make a lot of changes everywhere in the kernel that directly benefit only
a small subset of the community, like the RT work. I would think that the
only important requirements are that your proposed changes do not become
a burden for maintainance and that you minimise the risk for regressions.

Having proper types for opaque scalars that can be used for both pointer
and long values is a reasonable thing to do in general, and seems to fit
that requirement. Right now, we use 'unsigned long' to mean a number of
different things and from reading code it's not clear which one you want.
Not sure of intptr_t is the best name, so we might come up with something
better for this purpose, but it's a standard C99 type and the meaning
is pretty much what we want anyway.

Another possible alternative might be to add a kernel mode to your compiler
that defines long as 64 bits, but with only 48 significant bits, like you
have for your pointers. Does that work with your hardware?
It seems that this would be fast and means fewer changes to the kernel,
but opens a completely different can of worms in that it breaks code assuming
that "ULONG_MAX == (1ul << (sizeof (unsigned long) * 8)) - 1", and might
require many more changes to gcc.

	Arnd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Proposal: (u)intptr_t replaces (unsigned) long as opaque type
  2010-09-20  9:30         ` Arnd Bergmann
@ 2010-09-20 20:28           ` Greg McGary
  2010-09-21  6:32             ` Arnd Bergmann
  0 siblings, 1 reply; 10+ messages in thread
From: Greg McGary @ 2010-09-20 20:28 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Matthew Wilcox, linux-arch

On 09/20/10 02:30, Arnd Bergmann wrote:

> Having proper types for opaque scalars that can be used for both pointer
> and long values is a reasonable thing to do in general, and seems to fit
> that requirement. Right now, we use 'unsigned long' to mean a number of
> different things and from reading code it's not clear which one you want.
> Not sure of intptr_t is the best name, so we might come up with something
> better for this purpose, but it's a standard C99 type and the meaning
> is pretty much what we want anyway


Some common cases where kernel type choices can improve:

1) byte-address arithmetic (add, subtract, mask, align, low-order bit-flag
   manipulation): The C99 types (u)intptr_t are the best fit for this use
   case. (void*) and (char*) are not so good--they are serviceable for
   add/subtract on addresses, but don't work for masking and other bitops.

2) generic pointer: ioctl args account for approx 30% of all cases for type
   improvement.  Here, the best choice to replace (unsigned long) is (void *).

3) generic pointer-or-integer: We should probably invent a new typedef here:
   e.g., opaque_t or long_or_ptr_t.  For my port, sizeof(void*) > sizeof(long),
   but sometimes APIs have sizeof (void*) < sizeof(long).  With the
   indirection of a new type name, we could define it as the larger of
   (uintptr_t) or (unsigned long).

Of these three cases, #1 and #2 are straightforward and should not involve any
controversy.  The only issue with #3 is whether to invent a new type name, and
if so, what that name should be.

If I were to submit some patches for cases #1 and #2, would I be wasting my
time? 8^)  I'll leave-off for case #3 until there's consensus.

> Another possible alternative might be to add a kernel mode to your compiler
> that defines long as 64 bits, but with only 48 significant bits, like you
> have for your pointers. Does that work with your hardware?
> It seems that this would be fast and means fewer changes to the kernel,
> but opens a completely different can of worms in that it breaks code assuming
> that "ULONG_MAX == (1ul << (sizeof (unsigned long) * 8)) - 1", and might
> require many more changes to gcc.
>
> 	Arnd


The address space is not linear.  The upper 16 bits of pointers have segment
info so I can't just confine the address space to 2^32 and truncate pointers
to 32 bits.  I could implement a toolchain mode where long is 64 bits at the
cost of some performance to do 64-bit ALU ops as two 32-bit subops.  I will
keep that in mind as an option.  However, the work involved in implementing
and validating the 64-bit API/ABI mode toolchain is at least as much and maybe
more than fixing the kernel types.  I prefer to spend my time on the approach
that will yield the best performing product, even if I must maintain all these
type changes in my own git repo.

G

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Proposal: (u)intptr_t replaces (unsigned) long as opaque type
  2010-09-20 20:28           ` Greg McGary
@ 2010-09-21  6:32             ` Arnd Bergmann
  0 siblings, 0 replies; 10+ messages in thread
From: Arnd Bergmann @ 2010-09-21  6:32 UTC (permalink / raw)
  To: Greg McGary; +Cc: Matthew Wilcox, linux-arch

On Monday 20 September 2010 22:28:20 Greg McGary wrote:
> On 09/20/10 02:30, Arnd Bergmann wrote:

> 2) generic pointer: ioctl args account for approx 30% of all cases for type
>    improvement.  Here, the best choice to replace (unsigned long) is (void *).

Not sure about this one, maybe it should be uintptr_t as well. The problem
is that a significant number of ioctls also directly operate on the argument
as a scalar, even though the majority clearly needs to convert to a pointer.
 
> 3) generic pointer-or-integer: We should probably invent a new typedef here:
>    e.g., opaque_t or long_or_ptr_t.  For my port, sizeof(void*) > sizeof(long),
>    but sometimes APIs have sizeof (void*) < sizeof(long).  With the
>    indirection of a new type name, we could define it as the larger of
>    (uintptr_t) or (unsigned long).

No, I really wouldn't go that far without a need. If we can come up with
a good solution to cover the sizeof(void*) > sizeof(long) case but
not the sizeof (void*) < sizeof(long) case, I don't think it's worth
thinking about the more complex cases.

> If I were to submit some patches for cases #1 and #2, would I be wasting my
> time? 8^)  I'll leave-off for case #3 until there's consensus.

Case #1 looks worthwhile from a code cleanup perspective, I see that as
a useful step independent of your architecture, so if you're willing
to do the work, there shouldn't be anything holding you up.

For the specific case of ioctl, we actually have a cleanup pending in
that area anyway. The prototype is a bit weird for historic reasons
and should just be cleaned up now. We currently have

struct file_operations {
	...
	unsigned long unlocked_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
	unsigned long compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
	...
};

There are now three things wrong with this:
- the return type should be 'int'
- the name of the first function should be 'ioctl', not unlocked_ioctl
- you want the third argument to be uintptr_t.

If you can come up with a semantic patch that can be applied treewide to clean
this all up, I think that would be excellent.

> > Another possible alternative might be to add a kernel mode to your compiler
> > that defines long as 64 bits, but with only 48 significant bits, like you
> > have for your pointers. Does that work with your hardware?
> > It seems that this would be fast and means fewer changes to the kernel,
> > but opens a completely different can of worms in that it breaks code assuming
> > that "ULONG_MAX == (1ul << (sizeof (unsigned long) * 8)) - 1", and might
> > require many more changes to gcc.
> 
> The address space is not linear.  The upper 16 bits of pointers have segment
> info so I can't just confine the address space to 2^32 and truncate pointers
> to 32 bits.  I could implement a toolchain mode where long is 64 bits at the
> cost of some performance to do 64-bit ALU ops as two 32-bit subops.  I will
> keep that in mind as an option.  However, the work involved in implementing
> and validating the 64-bit API/ABI mode toolchain is at least as much and maybe
> more than fixing the kernel types.  I prefer to spend my time on the approach
> that will yield the best performing product, even if I must maintain all these
> type changes in my own git repo.

It would probably be helpful to know more about the segment info, maybe
there is yet another way to do it. This sounds similar to the s390
architecture, where we never used the segments (access registers) in Linux.

So your GPRs are all 32 bits, and a pointer consists of a GPR plus a segment
register, right? Do you always use more than one segment, both in kernel
and in user space? If they are just for user space, it would be a lot
simpler because you can get away with simply extending the __user pointers
in the kernel (which still includes all ioctl functions).

There has also been some work on having gcc support for multiple address
spaces recently. Maybe you can use that to describe each segment as
an C address space, but keep one segment as the default address space
for regular pointers.

	Arnd

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-09-21  6:32 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-18 21:42 Proposal: (u)intptr_t replaces (unsigned) long as opaque type Greg McGary
2010-09-18 23:54 ` Matthew Wilcox
2010-09-19  2:43   ` Greg McGary
2010-09-19  4:06     ` Matthew Wilcox
2010-09-19 16:53       ` Greg McGary
2010-09-20  9:30         ` Arnd Bergmann
2010-09-20 20:28           ` Greg McGary
2010-09-21  6:32             ` Arnd Bergmann
2010-09-19  4:33 ` Alexey Zaytsev
2010-09-19  6:50   ` David Howells

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox