From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Richard Cooper"
Subject: OT: Re: newbie question about integers size/portabilty.
Date: Wed, 29 Dec 2004 02:21:49 -0500
Message-ID:
References: <20041228122916.GA7137@ic.unicamp.br> <16849.57445.118809.760890@eidolon.muppetlabs.com>
Mime-Version: 1.0
Content-Transfer-Encoding: 7BIT
Return-path:
In-Reply-To: <16849.57445.118809.760890@eidolon.muppetlabs.com>
Sender: linux-assembly-owner@vger.kernel.org
List-Id:
Content-Type: text/plain; charset="us-ascii"; format="flowed delsp=yes"
To: linux-assembly@vger.kernel.org

If anyone's annoyed by offtopic discussions, I've got a perfectly good web board that's being used for nothing at the moment that we can move this to (assuming it goes any further). It's at:

http://www.xersedefixion.com/forum/

However, I doubt anyone cares, so I'll assume that's the case until someone tells us to shut up, but I'll go ahead and post a copy of this there as well in case everyone else knows differently.

> This rather misses the point, I think.

It may very well, as it's simply how things currently appear to me.

> This is untrue. The ANSI C standard specifies minimum guaranteed
> sizes. chars have to be able to hold at least 8 bits, shorts and ints
> both have to be able to hold at least 16 bits, longs at least 32 bits,
> and long longs at least 64 bits.

Now that's much more useful for getting code written, but totally useless for porting to machines with different data type sizes.

It sounds fine from the perspective of eight-bits-to-the-byte computing, but go down to seven-bits-to-the-byte and it looks completely silly, as your chars will contain 6 more bits than you're allowed to use, your shorts will probably contain 12 more bits than you're allowed to use, and your longs will contain 24 extra bits.

Go up to nine-bits-to-the-byte and it's not quite as bad as that, but then you still have bits you're not allowed to use because they aren't there when compiled on an eight-bits-to-the-byte system.
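For reference, the minimum sizes quoted above can be compared against what a given compiler actually provides by looking at the macros in <limits.h>. This is only a minimal sketch, assuming a C99 compiler (needed for long long); the helper names value_bits and report_widths are made up for illustration.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: count the value bits needed to represent max. */
int value_bits(unsigned long long max)
{
    int bits = 0;
    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Hypothetical helper: print the width this implementation provides
   next to the minimum the standard guarantees for each unsigned type. */
void report_widths(void)
{
    printf("CHAR_BIT           = %d (at least 8)\n", CHAR_BIT);
    printf("unsigned char      = %d bits (at least 8)\n", value_bits(UCHAR_MAX));
    printf("unsigned short     = %d bits (at least 16)\n", value_bits(USHRT_MAX));
    printf("unsigned int       = %d bits (at least 16)\n", value_bits(UINT_MAX));
    printf("unsigned long      = %d bits (at least 32)\n", value_bits(ULONG_MAX));
    printf("unsigned long long = %d bits (at least 64)\n", value_bits(ULLONG_MAX));
}
```

Calling report_widths() from main shows how many bits beyond the guaranteed minimum a particular machine offers, which are the bits that code written only to the minimums never gets to use.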
So to stay compliant you have to pretend like you have an eight-bits-to-the-byte system, and if you're willing to do that to be portable, then why have those extra bits in your system at all? Why not just go buy an eight-bits-to-the-byte system?

It makes sense I suppose in a C context, where the entire premise of portability is merely coding to the lowest common denominator, but it's not what I would consider a workable solution to the problem.

>> I myself would have made sizes like "1bit" and "2bit" and "3bit" and
>> "4bit" all the way up to "1000bit" and said that if you need at
>> least 11 bits in your number, use "11bit" and it'll compile into a
>> data size at least large enough to hold 11 bit numbers.

> ... thereby passing the problem of what integer size will produce
> efficient code on to the programmer, so that you don't have to deal
> with it.

No, it wasn't something I even considered. But now that I have, it only took me a few seconds to think of something better than ANSI C's hack of a solution.

Just have two sets of those data types, one for when you want the smallest type available, and another for when you want the fastest type available. Something like char1 through char99 for things like characters where you're just looking for small storage size, which always translate to the smallest size available, and another set like int1 through int99 for when you're looking for something you can make fast calculations with, which always translate to the machine's word size, except for the cases where it doesn't have enough bits, in which case it translates to something less-optimal yet sufficient.

Then if you write your code for a 36 bit machine, you can use "int36" and make full use of all 36 bits without having to think about eight-bits-to-the-byte systems, since on those machines it will compile into a 64 bit number.
Then you can use "int17" for other numbers which are smaller, but for which you still want your machine's 36 bit number for speed reasons, and they'll compile into 32 bit numbers on eight-bits-to-the-byte systems.

Now that is portable, whereas making the programmer go through their code and change all of their data types isn't, and making a programmer pretend like their 36 bit machine is a 32 bit machine is just plain dumb.

Am I mistaken that the goal is to have code that compiles without modification on different architectures? I don't see how C as it is can do that without forcing less-than-optimal code on those other architectures. Either people with 36 bit systems use only 32 bits, or they go and change half of their "ints" to "long longs" when they want to compile their code on a different system.

If you count a language as portable when someone still has to "port" the code, then by that definition every language is portable, and so including all of this nonsense for portability is just a waste of time, since someone will have to go in and change all of the data type sizes anyway.

It seems to me like the goal is simply to be silly. It's portability by appearance only. You can't totally disregard bit sizes and end up with a solution that works, and ANSI C's standard of "let's just pretend everyone has eight-bits-to-the-byte systems" is still disregarding bit sizes, just in a way that works just fine for people with eight-bits-to-the-byte systems.

I really think some people have it stuck in their head that all that's needed to be done to create portability is to pretend like bits don't exist, and then everything will magically work because nothing depends on data type sizes. But everything depends on data type sizes, and so you can't just ignore the bits.
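
For what it's worth, the two families described above (one set that means "smallest available" and one that means "fastest available") resemble the int_leastN_t and int_fastN_t types that C99 added in <stdint.h>, though those exist only for N = 8, 16, 32 and 64 rather than for every width. A minimal sketch of the "17 bit" and "36 bit" cases, assuming a C99 compiler; the names small17, quick17, quick36, STORAGE_BITS and report_types are hypothetical and chosen only to echo the proposal.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical aliases: C99 cannot be asked for exactly 17 or 36 bits,
   so each request is rounded up to the next width it does offer. */
typedef int_least32_t small17;  /* smallest type with at least 32 bits */
typedef int_fast32_t  quick17;  /* "fastest" type with at least 32 bits */
typedef int_fast64_t  quick36;  /* "fastest" type with at least 64 bits */

/* Storage size of a type in bits, counting any padding bits. */
#define STORAGE_BITS(type) ((int)(sizeof(type) * CHAR_BIT))

/* Hypothetical helper: show what each alias became on this machine. */
void report_types(void)
{
    printf("small17 occupies %d bits\n", STORAGE_BITS(small17));
    printf("quick17 occupies %d bits\n", STORAGE_BITS(quick17));
    printf("quick36 occupies %d bits\n", STORAGE_BITS(quick36));
}
```

Which underlying type each alias maps to is up to the implementation, so the same source can end up with different widths on different machines without anyone editing the declarations.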