Pointer arithmetic error

linux-sparse.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Pointer arithmetic error
@ 2008-06-26 23:40 David Given
  2008-06-26 23:51 ` Chris Li
  0 siblings, 1 reply; 20+ messages in thread
From: David Given @ 2008-06-26 23:40 UTC (permalink / raw)
  To: linux-sparse

[-- Attachment #1: Type: text/plain, Size: 855 bytes --]

I've just found what I think is a bug. In evaluate_ptr_add(), in
evaluate.c, are the following lines:

	/* Get the size of whatever the pointer points to */
	multiply = base->bit_size >> 3;

This divides the bit size by 8 to get the size in bytes. However, this
doesn't take into account that bits_in_char might not be 8. I think this
should actually be:

	/* Get the size of whatever the pointer points to */
	multiply = base->bit_size / bits_in_char;

(This now produces the right result on my non-8-bit-char 'platform'.)

-- 
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│ "I have always wished for my computer to be as easy to use as my
│ telephone; my wish has come true because I can no longer figure out
│ how to use my telephone." --- Bjarne Stroustrup


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-26 23:40 Pointer arithmetic error David Given
@ 2008-06-26 23:51 ` Chris Li
  2008-06-27  0:17   ` David Given
  0 siblings, 1 reply; 20+ messages in thread
From: Chris Li @ 2008-06-26 23:51 UTC (permalink / raw)
  To: David Given; +Cc: linux-sparse

On Thu, Jun 26, 2008 at 4:40 PM, David Given <dg@cowlark.com> wrote:
>        /* Get the size of whatever the pointer points to */
>        multiply = base->bit_size >> 3;
>
> This divides the bit size by 8 to get the size in bytes. However, this
> doesn't take into account that bits_in_char might not be 8. I think this
> should actually be:
>
>        /* Get the size of whatever the pointer points to */
>        multiply = base->bit_size / bits_in_char;

I don't think so. The offset part of the ptr_add op is mean to be the
absolute offset, not how many element of the pointer index. In other
words, it is how many address number it need to add.

Chris

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-26 23:51 ` Chris Li
@ 2008-06-27  0:17   ` David Given
  2008-06-27  9:00     ` Christopher Li
  2008-06-27  9:49     ` Bernd Petrovitsch
  0 siblings, 2 replies; 20+ messages in thread
From: David Given @ 2008-06-27  0:17 UTC (permalink / raw)
  To: linux-sparse

[-- Attachment #1: Type: text/plain, Size: 881 bytes --]

Chris Li wrote:
[...]
> I don't think so. The offset part of the ptr_add op is mean to be the
> absolute offset, not how many element of the pointer index. In other
> words, it is how many address number it need to add.

Hmm. True. But even so, it's still not right on my system, which doesn't
use 8-bit bytes. (It's word addressable where each word can contain any
value, so sizeof(int) == sizeof(double) == sizeof(char) == 1.)

Should there, then, be another symbol to define the number of bits in a
byte, distinct from the number of bits in a char?

-- 
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│ "I have always wished for my computer to be as easy to use as my
│ telephone; my wish has come true because I can no longer figure out
│ how to use my telephone." --- Bjarne Stroustrup


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27  0:17   ` David Given
@ 2008-06-27  9:00     ` Christopher Li
  2008-06-27  9:49     ` Bernd Petrovitsch
  1 sibling, 0 replies; 20+ messages in thread
From: Christopher Li @ 2008-06-27  9:00 UTC (permalink / raw)
  To: David Given; +Cc: linux-sparse

On Thu, Jun 26, 2008 at 5:17 PM, David Given <dg@cowlark.com> wrote:

> Hmm. True. But even so, it's still not right on my system, which doesn't
> use 8-bit bytes. (It's word addressable where each word can contain any
> value, so sizeof(int) == sizeof(double) == sizeof(char) == 1.)

Ah, I see.  I was thinking some thing else.
What platform exactly is that?

> Should there, then, be another symbol to define the number of bits in a
> byte, distinct from the number of bits in a char?

Byte need to big enough to hold the char. Using bits_in_byte is better.
There might be other place in sparse assume byte is 8 bits.

Chris

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27  0:17   ` David Given
  2008-06-27  9:00     ` Christopher Li
@ 2008-06-27  9:49     ` Bernd Petrovitsch
  2008-06-27 10:55       ` David Given
  1 sibling, 1 reply; 20+ messages in thread
From: Bernd Petrovitsch @ 2008-06-27  9:49 UTC (permalink / raw)
  To: David Given; +Cc: linux-sparse

On Fre, 2008-06-27 at 01:17 +0100, David Given wrote:
> Chris Li wrote:
> [...]
> > I don't think so. The offset part of the ptr_add op is mean to be the
> > absolute offset, not how many element of the pointer index. In other
> > words, it is how many address number it need to add.
> 
> Hmm. True. But even so, it's still not right on my system, which doesn't
> use 8-bit bytes. (It's word addressable where each word can contain any
> value, so sizeof(int) == sizeof(double) == sizeof(char) == 1.)

/usr/include/limits.h should have a correct #define CHAR_BIT.

> Should there, then, be another symbol to define the number of bits in a
> byte, distinct from the number of bits in a char?

In C, there is no type "byte" (unless you typedef oder #define it).
"byte" is usually (but not necessarily) meant as "unsigned char".

	Bernd
-- 
Firmix Software GmbH                   http://www.firmix.at/
mobil: +43 664 4416156                 fax: +43 1 7890849-55
          Embedded Linux Development and Services



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27  9:49     ` Bernd Petrovitsch
@ 2008-06-27 10:55       ` David Given
  2008-06-27 11:20         ` Bernd Petrovitsch
  0 siblings, 1 reply; 20+ messages in thread
From: David Given @ 2008-06-27 10:55 UTC (permalink / raw)
  To: linux-sparse

Bernd Petrovitsch wrote:
[...]
> In C, there is no type "byte" (unless you typedef oder #define it).
> "byte" is usually (but not necessarily) meant as "unsigned char".

The issue here is that sparse is helpful and converts pointer offsets 
into byte offsets when generating code for pointer arithmetic. So:

const int* p = (const char*) 1234;
p += 10;

-->

set.32 %p0 <- 1234
add.32 %p1 <- %p0, 40
(In fact, in this example it'd collapse these together and use 1274 
instead.)

This appears to be done using hard-coded knowledge that a byte (one unit 
of addressingness) is 8 bits wide.

Chris Li wrote:
> Byte need to big enough to hold the char. Using bits_in_byte is better.
> There might be other place in sparse assume byte is 8 bits.

IIRC C specifies that sizeof() returns values measured in chars, but I 
don't believe it specifies any mapping between the size of chars and the 
underlying addressing units --- it should be possible to use 16-bit 
chars, for example, on an 8-bit byte system. Using 32-bit ints, 
sizeof(int) would then return 2; but you wouldn't be able to access 
individual bytes from C.

That's assuming I've remembered the spec correctly, of course --- this 
stuff is annoying subtle.

If you like I'll go and have a look to see if this can be easily fixed 
--- I need to do it anyway. In fact, I have vague recollections that 
load and store were actually using offsets like 0, 1, 2, etc (the 
'right' values), which means they must have been using bits_per_char to 
do their calculations; I need to check up on this.

-- 
David Given
dg@cowlark.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27 10:55       ` David Given
@ 2008-06-27 11:20         ` Bernd Petrovitsch
  2008-06-27 14:03           ` David Given
  0 siblings, 1 reply; 20+ messages in thread
From: Bernd Petrovitsch @ 2008-06-27 11:20 UTC (permalink / raw)
  To: David Given; +Cc: linux-sparse

On Fre, 2008-06-27 at 11:55 +0100, David Given wrote:
> Bernd Petrovitsch wrote:
> [...]
> > In C, there is no type "byte" (unless you typedef oder #define it).
> > "byte" is usually (but not necessarily) meant as "unsigned char".
> 
> The issue here is that sparse is helpful and converts pointer offsets 
> into byte offsets when generating code for pointer arithmetic. So:
> 
> const int* p = (const char*) 1234;
> p += 10;
> 
> -->
> 
> set.32 %p0 <- 1234
> add.32 %p1 <- %p0, 40
> (In fact, in this example it'd collapse these together and use 1274 
> instead.)
> 
> This appears to be done using hard-coded knowledge that a byte (one unit 
> of addressingness) is 8 bits wide.

It is done IMHO with the false knowledge that "sizeof(int) == 4 *
sizeof(char)".
IIUC your hardware (a DSP or what is it exactly) has "sizeof(int) ==
sizeof(char)" so the size of a char (or int) as such is irrelevant.

> Chris Li wrote:
> > Byte need to big enough to hold the char. Using bits_in_byte is better.
> > There might be other place in sparse assume byte is 8 bits.
> 
> IIRC C specifies that sizeof() returns values measured in chars, but I 

ACK. Therefore "sizeof(char) == 1" must always hold.

> don't believe it specifies any mapping between the size of chars and the 
> underlying addressing units --- it should be possible to use 16-bit 

Yes, that's what CHAR_BIT is for.

> chars, for example, on an 8-bit byte system. Using 32-bit ints, 
> sizeof(int) would then return 2; but you wouldn't be able to access 
> individual bytes from C.

ACK (apart from "shift and mask it" ans similar).

	Bernd
-- 
Firmix Software GmbH                   http://www.firmix.at/
mobil: +43 664 4416156                 fax: +43 1 7890849-55
          Embedded Linux Development and Services



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27 11:20         ` Bernd Petrovitsch
@ 2008-06-27 14:03           ` David Given
  2008-06-27 14:45             ` Bernd Petrovitsch
  0 siblings, 1 reply; 20+ messages in thread
From: David Given @ 2008-06-27 14:03 UTC (permalink / raw)
  To: linux-sparse

Bernd Petrovitsch wrote:
> It is done IMHO with the false knowledge that "sizeof(int) == 4 *
> sizeof(char)".

No, it's correctly been told that ints are 32 bits wide, but then it's 
converting this to *bytes* (not chars) by dividing by the hard-coded 
constant 8.

(By 'byte' I'm referring to the quantum of addressingness of the 
underlying machine architecture. In other words, the numbers that are 
used as parameters as offsets into load and store. This isn't something 
that's exposed to C except on architectures where sizeof(byte) == 
sizeof(char), i.e., all sensible ones.)

[...]
> ACK. Therefore "sizeof(char) == 1" must always hold.

Yes; but that is only true from C's perspective.

We're dealing with things from the machine code perspective, where 
sizeof(byte) == 1, and sizeof(char) is not necessarily the same as 
sizeof(byte).

I find it really helps with this stuff if you are capable of holding two 
completely contradictory beliefs at the same time. A certain level of 
insanity can help too...

-- 
David Given
dg@cowlark.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27 14:03           ` David Given
@ 2008-06-27 14:45             ` Bernd Petrovitsch
  2008-06-27 15:45               ` David Given
  0 siblings, 1 reply; 20+ messages in thread
From: Bernd Petrovitsch @ 2008-06-27 14:45 UTC (permalink / raw)
  To: David Given; +Cc: linux-sparse

On Fre, 2008-06-27 at 15:03 +0100, David Given wrote:
> Bernd Petrovitsch wrote:
> > It is done IMHO with the false knowledge that "sizeof(int) == 4 *
> > sizeof(char)".
> 
> No, it's correctly been told that ints are 32 bits wide, but then it's 
> converting this to *bytes* (not chars) by dividing by the hard-coded 

That's the bug. there is no difference between "byte" and "char". Tell
it that a char has 32 bits too *if* it's the case.

> constant 8.

Why do you (or sparse) want to divide a the size of a type where
"sizeof(type) == 1"?
Use (a correct) CHAR_BIT for that.

> (By 'byte' I'm referring to the quantum of addressingness of the 
> underlying machine architecture. In other words, the numbers that are 
> used as parameters as offsets into load and store. This isn't something 
> that's exposed to C except on architectures where sizeof(byte) == 
> sizeof(char), i.e., all sensible ones.)

Such hardware-specific issues must be implemented by the compiler. So
you probably have to extend "sparse" for that feature.

> [...]
> > ACK. Therefore "sizeof(char) == 1" must always hold.
> 
> Yes; but that is only true from C's perspective.

Is there another remotely relevant perspective (let alone more
important) in a C parser?

> We're dealing with things from the machine code perspective, where 
> sizeof(byte) == 1, and sizeof(char) is not necessarily the same as 
> sizeof(byte).

And that's AFAIU the problem: "sizeof(char) == 1" in C (K&R, c8x, c99)
per definitionem. There is no way around it. For all architectures in
the world, past, present and future.

If your hardware has 32bit char, short and int types, there is no way in
C to define a "byte" in a somewhat sane way. Yes, there are bitfields
but that tend to get somewhat clumsy.

Perhaps the are some compiler-specific extensions for such strange
hardware. But then you leave the defined ground and need to implement
them into "sparse" too.

> I find it really helps with this stuff if you are capable of holding two 
> completely contradictory beliefs at the same time. A certain level of 

I'm just pointing out the view of "C" (or what I understand it is).
Since "sparse" is a C code parser, this should matter most (if not all
and only) IMNSHO.
I don't think the (above newly beyond "C" introduced) concept of "byte"
for your specific, unknown and secret CPU/architecture is helpful in any
way.

> insanity can help too...

	Bernd
-- 
Firmix Software GmbH                   http://www.firmix.at/
mobil: +43 664 4416156                 fax: +43 1 7890849-55
          Embedded Linux Development and Services

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27 14:45             ` Bernd Petrovitsch
@ 2008-06-27 15:45               ` David Given
  2008-06-27 18:01                 ` Christopher Li
  0 siblings, 1 reply; 20+ messages in thread
From: David Given @ 2008-06-27 15:45 UTC (permalink / raw)
  To: linux-sparse

Bernd Petrovitsch wrote:
[...]
> That's the bug. there is no difference between "byte" and "char". Tell
> it that a char has 32 bits too *if* it's the case.

Having checked the standard it turns out that we've been talking at 
cross purposes as I've been using the wrong terminology --- it actually 
defines (unhelpfully) that byte and char are the same size. Sorry for 
the confusion.

What I was referring to when I previously said (erroneously) 'byte' was 
'an address delta of 1', as understood by the assembler. Let's just call 
this a 'unit' for clarity. This is not necessarily the same size as a char.

[...]
>>> ACK. Therefore "sizeof(char) == 1" must always hold.
>> Yes; but that is only true from C's perspective.
> 
> Is there another remotely relevant perspective (let alone more
> important) in a C parser?

The issue here is that sparse is not just a C parser. It is also a 
compiler, and needs to know how to manipulate addresses that are 
represented as units. (This is why I'm using it for my project.) It has 
to be able to express an offset into a structure in units to pass on to 
the 'assembler' (with stock sparse, this is the human who's reading the 
output of test-linearise).

Currently I believe that some parts of sparse are assuming that a unit 
is 8 bits, and other parts are assuming that a unit is CHAR_BITS bits, 
neither of which is necessarily correct. However, I need to go and check 
up on this; I'm away from the machine with my project on it right now.

I'm proposing adding a bits_in_unit (or something) setting and then 
going through and tracking down these places and changing them to use 
it. That way it should still work fine on exotic architectures like mine.

(Wikipedia has a rather interesting table of machines with interesting 
word/byte/char sizes:

http://en.wikipedia.org/wiki/Word_(computing)#Table_of_word_sizes

The last machine with a freaky word size was the Cray C-90 in 1991, 
where a unit was 64 bits but a char was 8 bits...)

-- 
David Given
dg@cowlark.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27 15:45               ` David Given
@ 2008-06-27 18:01                 ` Christopher Li
  2008-06-27 23:32                   ` David Given
  2008-06-29 12:19                   ` Bernd Petrovitsch
  0 siblings, 2 replies; 20+ messages in thread
From: Christopher Li @ 2008-06-27 18:01 UTC (permalink / raw)
  To: David Given; +Cc: linux-sparse, Bernd Petrovitsch

> In C, there is no type "byte" (unless you typedef oder #define it).
> "byte" is usually (but not necessarily) meant as "unsigned char".

In C spec, there is a concept of "byte". The union return by sizeof()
is byte. Char must fit in a byte. But char does not necessary have the
same bits as byte. Char can have more.

C99: 3.6, 3.7.1

Because char can always fit in byte, sizeof(char) == 1.

> IIRC C specifies that sizeof() returns values measured in chars, but I don't believe it specifies any mapping between the size of chars and the underlying addressing units --- it should be
> possible to use 16-bit chars, for example, on an 8-bit byte system. Using 32-bit ints, sizeof(int) would then return 2; but you wouldn't be able to access individual bytes from C.

sizeof() return value measure in _byte_.
C99: 6.5.3.4

On Fri, Jun 27, 2008 at 8:45 AM, David Given <dg@cowlark.com> wrote:
> Bernd Petrovitsch wrote:
> [...]
>>
>> That's the bug. there is no difference between "byte" and "char". Tell
>> it that a char has 32 bits too *if* it's the case.

No, there is a different between "byte" and "char". See above.


>
> Having checked the standard it turns out that we've been talking at cross
> purposes as I've been using the wrong terminology --- it actually defines
> (unhelpfully) that byte and char are the same size. Sorry for the confusion.
>
> What I was referring to when I previously said (erroneously) 'byte' was 'an
> address delta of 1', as understood by the assembler. Let's just call this a
> 'unit' for clarity. This is not necessarily the same size as a char.

In C's term, that is call a "byte" :-)

> I'm proposing adding a bits_in_unit (or something) setting and then going
> through and tracking down these places and changing them to use it. That way
> it should still work fine on exotic architectures like mine.

You are right that point out a bug (assumption) of sparse which byte is 8 bits.
Using bits_in_byte is instead of 8 is better there.
Using bits_in_char assumes char has same bits as byte. That is my read
of the C spec.

Thanks

Chris

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27 18:01                 ` Christopher Li
@ 2008-06-27 23:32                   ` David Given
  2008-06-28  0:17                     ` Christopher Li
                                       ` (3 more replies)
  2008-06-29 12:19                   ` Bernd Petrovitsch
  1 sibling, 4 replies; 20+ messages in thread
From: David Given @ 2008-06-27 23:32 UTC (permalink / raw)
  To: linux-sparse


[-- Attachment #1.1: Type: text/plain, Size: 1160 bytes --]

Christopher Li wrote:
[...]
> You are right that point out a bug (assumption) of sparse which byte is 8 bits.
> Using bits_in_byte is instead of 8 is better there.
> Using bits_in_char assumes char has same bits as byte. That is my read
> of the C spec.

Yes, indeed, I'd managed to get my terminology muddled to the confusion
of everyone.

Okay, I've gone and looked at the implementation of this stuff; it would
appear that modifying the code to supply the back end with units scaled
in something other than C chars is actually quite complicated, so I
haven't tried to do that. However, I'm enclosing a patch that should,
hopefully, fix the cases where it's assuming 8 bit bytes. Hopefully I've
 managed to catch all the cases, without breaking the octal parse code...

(I haven't used git before; is this the right format?)

-- 
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│ "I have always wished for my computer to be as easy to use as my
│ telephone; my wish has come true because I can no longer figure out
│ how to use my telephone." --- Bjarne Stroustrup

[-- Attachment #1.2: diff --]
[-- Type: text/plain, Size: 18815 bytes --]

commit 0752e02d2c7c3f67b7c1835c86bf77f11c8384dd
Author: dg <dg@cowlark.com>
Date:   Sat Jun 28 00:24:36 2008 +0100

    Changed use of hardcoded 8s when converting bits to bytes to use bits_in_char instead.

diff --git a/compile-i386.c b/compile-i386.c
index 8526408..3bbc9c7 100644
--- a/compile-i386.c
+++ b/compile-i386.c
@@ -1584,7 +1584,7 @@ static struct storage *emit_select_expr(struct expression *expr)
 	/*
 	 * Do the actual select: check the conditional for zero,
 	 * move false over true if zero
-	 */ 
+	 */
 	insn("test", reg_cond, reg_cond, NULL);
 	insn("cmovz", reg_false, reg_true, NULL);
 
@@ -2081,7 +2081,7 @@ static struct storage *x86_call_expression(struct expression *expr)
 		insn("pushl", new, NULL,
 		     !framesize ? "begin function call" : NULL);
 
-		framesize += size >> 3;
+		framesize += size / bits_in_char;
 	} END_FOR_EACH_PTR_REVERSE(arg);
 
 	fn = expr->fn;
diff --git a/evaluate.c b/evaluate.c
index 2a126dd..6018e0c 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -87,13 +87,13 @@ static struct symbol *evaluate_string(struct expression *expr)
 	array->ctype.alignment = 1;
 	array->ctype.modifiers = MOD_STATIC;
 	array->ctype.base_type = &char_ctype;
-	
+
 	addr->symbol = sym;
 	addr->ctype = &lazy_ptr_ctype;
 
 	expr->type = EXPR_PREOP;
 	expr->op = '*';
-	expr->unop = addr;  
+	expr->unop = addr;
 	expr->ctype = sym;
 	return sym;
 }
@@ -579,7 +579,7 @@ static struct symbol *evaluate_ptr_add(struct expression *expr, struct symbol *i
 	}
 
 	/* Get the size of whatever the pointer points to */
-	multiply = base->bit_size >> 3;
+	multiply = base->bit_size / bits_in_char;
 
 	if (ctype == &null_ctype)
 		ctype = &ptr_ctype;
@@ -831,7 +831,7 @@ static struct symbol *evaluate_ptr_sub(struct expression *expr)
 		struct expression *sub = alloc_expression(expr->pos, EXPR_BINOP);
 		struct expression *div = expr;
 		struct expression *val = alloc_expression(expr->pos, EXPR_VALUE);
-		unsigned long value = lbase->bit_size >> 3;
+		unsigned long value = lbase->bit_size / bits_in_char;
 
 		val->ctype = size_t_ctype;
 		val->value = value;
@@ -850,7 +850,7 @@ static struct symbol *evaluate_ptr_sub(struct expression *expr)
 		div->left = sub;
 		div->right = val;
 	}
-		
+
 	return ssize_t_ctype;
 }
 
@@ -1591,7 +1591,7 @@ static struct symbol *degenerate(struct expression *expr)
 				e3->op = '+';
 				e3->left = e0;
 				e3->right = alloc_const_expression(expr->pos,
-							expr->r_bitpos >> 3);
+							expr->r_bitpos / bits_in_char);
 				e3->ctype = &lazy_ptr_ctype;
 			} else {
 				e3 = e0;
@@ -1727,7 +1727,7 @@ static struct symbol *evaluate_postop(struct expression *expr)
 	} else if (class == TYPE_PTR) {
 		struct symbol *target = examine_pointer_target(ctype);
 		if (!is_function(target))
-			multiply = target->bit_size >> 3;
+			multiply = target->bit_size / bits_in_char;
 	}
 
 	if (multiply) {
@@ -1850,7 +1850,7 @@ static struct symbol *find_identifier(struct ident *ident, struct symbol_list *_
 					continue;
 				*offset += sym->offset;
 				return sub;
-			}	
+			}
 		}
 	} while ((list = list->next) != head);
 	return NULL;
@@ -1949,7 +1949,7 @@ static struct symbol *evaluate_member_dereference(struct expression *expr)
 			expr->base = deref->base;
 			expr->r_bitpos = deref->r_bitpos;
 		}
-		expr->r_bitpos += offset << 3;
+		expr->r_bitpos += offset * bits_in_char;
 		expr->type = EXPR_SLICE;
 		expr->r_nrbits = member->bit_size;
 		expr->r_bitpos += member->bit_offset;
@@ -2040,7 +2040,7 @@ static struct symbol *evaluate_sizeof(struct expression *expr)
 	if ((size < 0) || (size & 7))
 		expression_error(expr, "cannot size expression");
 	expr->type = EXPR_VALUE;
-	expr->value = size >> 3;
+	expr->value = size / bits_in_char;
 	expr->taint = 0;
 	expr->ctype = size_t_ctype;
 	return size_t_ctype;
@@ -2074,7 +2074,7 @@ static struct symbol *evaluate_ptrsizeof(struct expression *expr)
 	if (size & 7)
 		size = 0;
 	expr->type = EXPR_VALUE;
-	expr->value = size >> 3;
+	expr->value = size / bits_in_char;
 	expr->taint = 0;
 	expr->ctype = size_t_ctype;
 	return size_t_ctype;
@@ -2865,7 +2865,7 @@ static struct symbol *evaluate_offsetof(struct expression *expr)
 			unrestrict(idx, i_class, &i_type);
 			idx = cast_to(idx, size_t_ctype);
 			m = alloc_const_expression(expr->pos,
-						   ctype->bit_size >> 3);
+						   ctype->bit_size / bits_in_char);
 			m->ctype = size_t_ctype;
 			m->flags = Int_const_expr;
 			expr->type = EXPR_BINOP;
diff --git a/example.c b/example.c
index ae897dc..b844d4f 100644
--- a/example.c
+++ b/example.c
@@ -27,7 +27,7 @@ static const char *opcodes[] = {
 	[OP_INVOKE] = "invoke",
 	[OP_COMPUTEDGOTO] = "jmp *",
 	[OP_UNWIND] = "unwind",
-	
+
 	/* Binary */
 	[OP_ADD] = "add",
 	[OP_SUB] = "sub",
@@ -40,7 +40,7 @@ static const char *opcodes[] = {
 	[OP_SHL] = "shl",
 	[OP_LSR] = "lsr",
 	[OP_ASR] = "asr",
-	
+
 	/* Logical */
 	[OP_AND] = "and",
 	[OP_OR] = "or",
@@ -66,7 +66,7 @@ static const char *opcodes[] = {
 
 	/* Special three-input */
 	[OP_SEL] = "select",
-	
+
 	/* Memory */
 	[OP_MALLOC] = "malloc",
 	[OP_FREE] = "free",
@@ -818,7 +818,7 @@ static const char *generic(struct bb_state *state, pseudo_t pseudo)
 		put_operand(state, op);
 		reg = target_reg(state, pseudo, NULL);
 		output_insn(state, "lea %s,%s", show_op(state, op), reg->name);
-		return reg->name;		
+		return reg->name;
 
 	default:
 		str = show_op(state, op);
@@ -870,7 +870,7 @@ static void kill_dead_reg(struct hardreg *reg)
 {
 	if (reg->dead) {
 		pseudo_t p;
-		
+
 		FOR_EACH_PTR(reg->contains, p) {
 			if (CURRENT_TAG(p) & TAG_DEAD) {
 				DELETE_CURRENT_PTR(p);
@@ -1011,7 +1011,7 @@ static void kill_pseudo(struct bb_state *state, pseudo_t pseudo)
 				continue;
 			if (CURRENT_TAG(p) & TAG_DEAD)
 				reg->dead--;
-			output_comment(state, "removing pseudo %s from reg %s", 
+			output_comment(state, "removing pseudo %s from reg %s",
 				show_pseudo(pseudo), reg->name);
 			DELETE_CURRENT_PTR(p);
 		} END_FOR_EACH_PTR(p);
@@ -1069,7 +1069,7 @@ static const char *conditional[] = {
 	[OP_SET_BE] = "be",
 	[OP_SET_AE] = "ae"
 };
-	
+
 
 static void generate_branch(struct bb_state *state, struct instruction *br)
 {
@@ -1187,7 +1187,7 @@ static void replace_asm_percent(const char **src_p, char **dst_p, struct asm_arg
 		if (index < nr)
 			replace_asm_arg(dst_p, args+index);
 		break;
-	}	
+	}
 	*src_p = src;
 	return;
 }
@@ -1583,7 +1583,7 @@ static int final_pseudo_flush(struct bb_state *state, pseudo_t pseudo, struct ha
 	struct hardreg *dst;
 
 	/*
-	 * Since this pseudo is live at exit, we'd better have output 
+	 * Since this pseudo is live at exit, we'd better have output
 	 * storage for it..
 	 */
 	hash = find_storage_hash(pseudo, state->outputs);
@@ -1829,8 +1829,8 @@ static void set_up_arch_entry(struct entrypoint *ep, struct instruction *entry)
 
 			in->type = REG_FRAME;
 			in->offset = offset;
-			
-			offset += bits >> 3;
+
+			offset += bits / bits_in_char;
 		}
 		i++;
 		NEXT_PTR_LIST(argtype);
@@ -1936,7 +1936,7 @@ static int compile(struct symbol_list *list)
 		if (ep)
 			output(ep);
 	} END_FOR_EACH_PTR(sym);
-	
+
 	return 0;
 }
 
diff --git a/expand.c b/expand.c
index 032f0c5..d1aa019 100644
--- a/expand.c
+++ b/expand.c
@@ -100,7 +100,7 @@ Int:
 	// Stop here unless checking for truncation
 	if (!Wcast_truncate || conservative)
 		return;
-	
+
 	// Check if we dropped any bits..
 	oldsignmask = 1ULL << (old_size-1);
 	oldmask = oldsignmask | (oldsignmask-1);
@@ -179,7 +179,7 @@ static int simplify_int_binop(struct expression *expr, struct symbol *ctype)
 		sl |= ~(mask-1);
 	if (is_signed && (sr & mask))
 		sr |= ~(mask-1);
-	
+
 	switch (CONVERT(expr->op,is_signed)) {
 	case SIGNED('+'):
 	case UNSIGNED('+'):
@@ -224,7 +224,7 @@ static int simplify_int_binop(struct expression *expr, struct symbol *ctype)
 
 	case UNSIGNED('/'):
 		if (!r) goto Div;
-		v = l / r; 
+		v = l / r;
 		break;
 
 	case SIGNED('%'):
@@ -241,7 +241,7 @@ static int simplify_int_binop(struct expression *expr, struct symbol *ctype)
 	case SIGNED(SPECIAL_LEFTSHIFT):
 	case UNSIGNED(SPECIAL_LEFTSHIFT):
 		v = l << r;
-		break; 
+		break;
 
 	case SIGNED(SPECIAL_RIGHTSHIFT):
 		v = sl >> r;
@@ -536,7 +536,7 @@ static int expand_conditional(struct expression *expr)
 
 	return cost + cond_cost + BRANCH_COST;
 }
-		
+
 static int expand_assignment(struct expression *expr)
 {
 	expand_expression(expr->left);
@@ -822,7 +822,7 @@ static int expand_expression_list(struct expression_list *list)
 	return cost;
 }
 
-/* 
+/*
  * We can simplify nested position expressions if
  * this is a simple (single) positional expression.
  */
@@ -880,7 +880,7 @@ static unsigned long bit_offset(const struct expression *expr)
 {
 	unsigned long offset = 0;
 	while (expr->type == EXPR_POS) {
-		offset += expr->init_offset << 3;
+		offset += expr->init_offset * bits_in_char;
 		expr = expr->init_expr;
 	}
 	if (expr && expr->ctype)
diff --git a/flow.c b/flow.c
index 82fb23a..bdb2c25 100644
--- a/flow.c
+++ b/flow.c
@@ -16,6 +16,7 @@
 #include "expression.h"
 #include "linearize.h"
 #include "flow.h"
+#include "target.h"
 
 unsigned long bb_generation;
 
@@ -265,8 +266,8 @@ void convert_load_instruction(struct instruction *insn, pseudo_t src)
 
 static int overlapping_memop(struct instruction *a, struct instruction *b)
 {
-	unsigned int a_start = a->offset << 3;
-	unsigned int b_start = b->offset << 3;
+	unsigned int a_start = a->offset * bits_in_char;
+	unsigned int b_start = b->offset * bits_in_char;
 	unsigned int a_size = a->size;
 	unsigned int b_size = b->size;
 
@@ -364,11 +365,11 @@ found_dominator:
 		use_pseudo(insn, phi, add_pseudo(dominators, phi));
 	} END_FOR_EACH_PTR(parent);
 	return 1;
-}		
+}
 
 /*
  * We should probably sort the phi list just to make it easier to compare
- * later for equality. 
+ * later for equality.
  */
 void rewrite_load_instruction(struct instruction *insn, struct pseudo_list *dominators)
 {
@@ -529,7 +530,7 @@ static void kill_dead_stores(pseudo_t pseudo, unsigned long generation, struct b
  * This should see if the "insn" trivially dominates some previous store, and kill the
  * store if unnecessary.
  */
-static void kill_dominated_stores(pseudo_t pseudo, struct instruction *insn, 
+static void kill_dominated_stores(pseudo_t pseudo, struct instruction *insn,
 	unsigned long generation, struct basic_block *bb, int local, int found)
 {
 	struct instruction *one;
@@ -581,13 +582,14 @@ void check_access(struct instruction *insn)
 	pseudo_t pseudo = insn->src;
 
 	if (insn->bb && pseudo->type == PSEUDO_SYM) {
-		int offset = insn->offset, bit = (offset << 3) + insn->size;
+		int offset = insn->offset, bit = (offset * bits_in_char) + insn->size;
 		struct symbol *sym = pseudo->sym;
 
 		if (sym->bit_size > 0 && (offset < 0 || bit > sym->bit_size))
 			warning(insn->pos, "invalid access %s '%s' (%d %d)",
 				offset < 0 ? "below" : "past the end of",
-				show_ident(sym->ident), offset, sym->bit_size >> 3);
+				show_ident(sym->ident), offset,
+				sym->bit_size / bits_in_char);
 	}
 }
 
@@ -705,7 +707,7 @@ external_visibility:
 			} END_FOR_EACH_PTR(bb);
 		}
 	}
-			
+
 	return;
 }
 
@@ -887,7 +889,7 @@ static void vrfy_children(struct basic_block *bb)
 	default:
 		break;
 	}
-		
+
 	FOR_EACH_PTR(bb->children, tmp) {
 		vrfy_bb_in_list(bb, tmp->parents);
 	} END_FOR_EACH_PTR(tmp);
diff --git a/show-parse.c b/show-parse.c
index 064af32..63d9144 100644
--- a/show-parse.c
+++ b/show-parse.c
@@ -246,8 +246,8 @@ deeper:
 
 		s = modifier_string(mod);
 		len = strlen(s);
-		name->start -= len;    
-		memcpy(name->start, s, len);  
+		name->start -= len;
+		memcpy(name->start, s, len);
 		mod = 0;
 		as = 0;
 	}
@@ -412,7 +412,7 @@ void show_symbol(struct symbol *sym)
 		struct statement *stmt = type->stmt;
 		if (stmt) {
 			int val;
-			printf("\n");		
+			printf("\n");
 			val = show_statement(stmt);
 			if (val)
 				printf("\tmov.%d\t\tretval,%d\n", stmt->ret->bit_size, val);
@@ -589,7 +589,7 @@ int show_statement(struct statement *stmt)
 		if (pre_condition) {
 			if (pre_condition->type == EXPR_VALUE) {
 				if (!pre_condition->value) {
-					loop_bottom = new_label();   
+					loop_bottom = new_label();
 					printf("\tjmp\t\t.L%d\n", loop_bottom);
 				}
 			} else {
@@ -623,7 +623,7 @@ int show_statement(struct statement *stmt)
 	}
 	case STMT_NONE:
 		break;
-	
+
 	case STMT_LABEL:
 		printf(".L%p:\n", stmt->label_identifier);
 		show_statement(stmt->label_statement);
@@ -649,9 +649,9 @@ int show_statement(struct statement *stmt)
 		int val = show_expression(stmt->range_expression);
 		int low = show_expression(stmt->range_low);
 		int high = show_expression(stmt->range_high);
-		printf("\trange( %d %d-%d)\n", val, low, high); 
+		printf("\trange( %d %d-%d)\n", val, low, high);
 		break;
-	}	
+	}
 	}
 	return 0;
 }
@@ -673,7 +673,7 @@ static int show_call_expression(struct expression *expr)
 		int new = show_expression(arg);
 		int size = arg->ctype->bit_size;
 		printf("\tpush.%d\t\tv%d\n", size, new);
-		framesize += size >> 3;
+		framesize += size / bits_in_char;
 	} END_FOR_EACH_PTR_REVERSE(arg);
 
 	fn = expr->fn;
@@ -845,7 +845,7 @@ static int show_inc_dec(struct expression *expr, int postop)
 	printf("\t%s.%d\t\tv%d,v%d,$1\n", opname, bits, new, retval);
 	show_store_gen(bits, new, expr->unop, addr);
 	return retval;
-}	
+}
 
 static int show_preop(struct expression *expr)
 {
@@ -864,7 +864,7 @@ static int show_preop(struct expression *expr)
 static int show_postop(struct expression *expr)
 {
 	return show_inc_dec(expr, 1);
-}	
+}
 
 static int show_symbol_expr(struct symbol *sym)
 {
@@ -918,7 +918,7 @@ static int show_cast_expr(struct expression *expr)
 
 	old_type = expr->cast_expression->ctype;
 	new_type = expr->cast_type;
-	
+
 	oldbits = old_type->bit_size;
 	newbits = new_type->bit_size;
 	if (oldbits >= newbits)
@@ -1017,7 +1017,7 @@ again:
 			entry = entry->ident_expression;
 			goto again;
 		}
-			
+
 		if (entry->type == EXPR_INDEX) {
 			printf(" AT '%d..%d:\n", entry->idx_from, entry->idx_to);
 			entry = entry->idx_expression;
@@ -1057,11 +1057,11 @@ int show_expression(struct expression *expr)
 			pos->line, pos->pos);
 		return 0;
 	}
-		
+
 	switch (expr->type) {
 	case EXPR_CALL:
 		return show_call_expression(expr);
-		
+
 	case EXPR_ASSIGNMENT:
 		return show_assignment(expr);
 
diff --git a/symbol.c b/symbol.c
index 3292907..a22731f 100644
--- a/symbol.c
+++ b/symbol.c
@@ -22,7 +22,7 @@
 
 /*
  * Secondary symbol list for stuff that needs to be output because it
- * was used. 
+ * was used.
  */
 struct symbol_list *translation_unit_used_list = NULL;
 
@@ -117,7 +117,7 @@ static void lay_out_struct(struct symbol *sym, struct struct_union_info *info)
 	}
 
 	bit_size = info->bit_size;
-	base_size = sym->bit_size; 
+	base_size = sym->bit_size;
 
 	/*
 	 * Unsized arrays cause us to not align the resulting
@@ -128,7 +128,7 @@ static void lay_out_struct(struct symbol *sym, struct struct_union_info *info)
 		base_size = 0;
 	}
 
-	align_bit_mask = (sym->ctype.alignment << 3) - 1;
+	align_bit_mask = (sym->ctype.alignment * bits_in_char) - 1;
 
 	/*
 	 * Bitfields have some very special rules..
@@ -143,7 +143,7 @@ static void lay_out_struct(struct symbol *sym, struct struct_union_info *info)
 			bit_size = (bit_size + align_bit_mask) & ~align_bit_mask;
 			bit_offset = 0;
 		}
-		sym->offset = (bit_size - bit_offset) >> 3;
+		sym->offset = (bit_size - bit_offset) / bits_in_char;
 		sym->bit_offset = bit_offset;
 		sym->ctype.base_type->bit_offset = bit_offset;
 		info->bit_size = bit_size + width;
@@ -156,7 +156,7 @@ static void lay_out_struct(struct symbol *sym, struct struct_union_info *info)
 	 * Otherwise, just align it right and add it up..
 	 */
 	bit_size = (bit_size + align_bit_mask) & ~align_bit_mask;
-	sym->offset = bit_size >> 3;
+	sym->offset = bit_size / bits_in_char;
 
 	info->bit_size = bit_size + base_size;
 	// warning (sym->pos, "regular: offset=%d", sym->offset);
@@ -182,7 +182,7 @@ static struct symbol * examine_struct_union_type(struct symbol *sym, int advance
 		sym->ctype.alignment = info.max_align;
 	bit_size = info.bit_size;
 	if (info.align_size) {
-		bit_align = (sym->ctype.alignment << 3)-1;
+		bit_align = (sym->ctype.alignment * bits_in_char)-1;
 		bit_size = (bit_size + bit_align) & ~bit_align;
 	}
 	sym->bit_size = bit_size;
@@ -340,7 +340,7 @@ static struct symbol * examine_node_type(struct symbol *sym)
 		if (node_type && node_type->bit_size >= 0)
 			bit_size = node_type->bit_size * count;
 	}
-	
+
 	sym->bit_size = bit_size;
 	return sym;
 }
@@ -654,7 +654,7 @@ static int expand_warning(struct expression *expr, int cost)
 	FOR_EACH_PTR (arglist, arg) {
 		/*
 		 * Constant strings get printed out as a warning. By the
-		 * time we get here, the EXPR_STRING has been fully 
+		 * time we get here, the EXPR_STRING has been fully
 		 * evaluated, so by now it's an anonymous symbol with a
 		 * string initializer.
 		 *
@@ -877,7 +877,7 @@ void init_ctype(void)
 		struct symbol *sym = ctype->ptr;
 		unsigned long bit_size = ctype->bit_size ? *ctype->bit_size : -1;
 		unsigned long maxalign = ctype->maxalign ? *ctype->maxalign : 0;
-		unsigned long alignment = (bit_size + 7) >> 3;
+		unsigned long alignment = (bit_size + bits_in_char - 1) / bits_in_char;
 
 		if (alignment > maxalign)
 			alignment = maxalign;
diff --git a/target.c b/target.c
index bf1bb8f..a0ee010 100644
--- a/target.c
+++ b/target.c
@@ -11,6 +11,8 @@ struct symbol *ssize_t_ctype = &int_ctype;
  */
 int max_alignment = 16;
 
+int bits_in_unit = 8;
+
 /*
  * Integer data types
  */
diff --git a/tokenize.c b/tokenize.c
index e72c56e..67daa97 100644
--- a/tokenize.c
+++ b/tokenize.c
@@ -96,7 +96,7 @@ static char *charstr(char *ptr, unsigned char c, unsigned char escape, unsigned
 	}
 	if (!isdigit(next))
 		return ptr + sprintf(ptr, "%o", c);
-		
+
 	return ptr + sprintf(ptr, "%03o", c);
 }
 
@@ -161,7 +161,7 @@ const char *show_token(const struct token *token)
 	case TOKEN_STREAMEND:
 		sprintf(buffer, "<end of '%s'>", stream_name(token->pos.stream));
 		return buffer;
-	
+
 	default:
 		return "WTF???";
 	}
@@ -483,7 +483,7 @@ static int escapechar(int first, int type, stream_t *stream, int *valp)
 				int nr = 2;
 				value -= '0';
 				while (next >= '0' && next <= '9') {
-					value = (value << 3) + (next-'0');
+					value = (value*8) + (next-'0');
 					next = nextchar(stream);
 					if (!--nr)
 						break;
@@ -572,7 +572,7 @@ static int get_string_token(int next, stream_t *stream)
 	token_type(token) = TOKEN_STRING;
 	token->string = string;
 	add_token(stream);
-	
+
 	return next;
 }
 
@@ -875,7 +875,7 @@ static int get_one_identifier(int c, stream_t *stream)
 	token->ident = ident;
 	add_token(stream);
 	return next;
-}		
+}
 
 static int get_one_token(int c, stream_t *stream)
 {

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27 23:32                   ` David Given
@ 2008-06-28  0:17                     ` Christopher Li
  2008-06-28  0:23                       ` David Given
  2008-06-29  0:10                       ` David Given
  2008-06-28  0:29                     ` Josh Triplett
                                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 20+ messages in thread
From: Christopher Li @ 2008-06-28  0:17 UTC (permalink / raw)
  To: David Given; +Cc: linux-sparse

Your patch is white space damaged.

While you are there, you might want to consider macro

bits_to_byte(x)
byte_to_bits(x)

To isolate out the conversion. It should generate the same code.
Marginally more readable.

Chris

On Fri, Jun 27, 2008 at 4:32 PM, David Given <dg@cowlark.com> wrote:
> Christopher Li wrote:
> [...]
>> You are right that point out a bug (assumption) of sparse which byte is 8 bits.
>> Using bits_in_byte is instead of 8 is better there.
>> Using bits_in_char assumes char has same bits as byte. That is my read
>> of the C spec.
>
> Yes, indeed, I'd managed to get my terminology muddled to the confusion
> of everyone.
>
> Okay, I've gone and looked at the implementation of this stuff; it would
> appear that modifying the code to supply the back end with units scaled
> in something other than C chars is actually quite complicated, so I
> haven't tried to do that. However, I'm enclosing a patch that should,
> hopefully, fix the cases where it's assuming 8 bit bytes. Hopefully I've
>  managed to catch all the cases, without breaking the octal parse code...
>
> (I haven't used git before; is this the right format?)
>
> --
> ┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
> │ "I have always wished for my computer to be as easy to use as my
> │ telephone; my wish has come true because I can no longer figure out
> │ how to use my telephone." --- Bjarne Stroustrup
>
> commit 0752e02d2c7c3f67b7c1835c86bf77f11c8384dd
> Author: dg <dg@cowlark.com>
> Date:   Sat Jun 28 00:24:36 2008 +0100
>
>    Changed use of hardcoded 8s when converting bits to bytes to use bits_in_char instead.
>
> diff --git a/compile-i386.c b/compile-i386.c
> index 8526408..3bbc9c7 100644
> --- a/compile-i386.c
> +++ b/compile-i386.c
> @@ -1584,7 +1584,7 @@ static struct storage *emit_select_expr(struct expression *expr)
>        /*
>         * Do the actual select: check the conditional for zero,
>         * move false over true if zero
> -        */
> +        */
>        insn("test", reg_cond, reg_cond, NULL);
>        insn("cmovz", reg_false, reg_true, NULL);
>
> @@ -2081,7 +2081,7 @@ static struct storage *x86_call_expression(struct expression *expr)
>                insn("pushl", new, NULL,
>                     !framesize ? "begin function call" : NULL);
>
> -               framesize += size >> 3;
> +               framesize += size / bits_in_char;
>        } END_FOR_EACH_PTR_REVERSE(arg);
>
>        fn = expr->fn;
> diff --git a/evaluate.c b/evaluate.c
> index 2a126dd..6018e0c 100644
> --- a/evaluate.c
> +++ b/evaluate.c
> @@ -87,13 +87,13 @@ static struct symbol *evaluate_string(struct expression *expr)
>        array->ctype.alignment = 1;
>        array->ctype.modifiers = MOD_STATIC;
>        array->ctype.base_type = &char_ctype;
> -
> +
>        addr->symbol = sym;
>        addr->ctype = &lazy_ptr_ctype;
>
>        expr->type = EXPR_PREOP;
>        expr->op = '*';
> -       expr->unop = addr;
> +       expr->unop = addr;
>        expr->ctype = sym;
>        return sym;
>  }
> @@ -579,7 +579,7 @@ static struct symbol *evaluate_ptr_add(struct expression *expr, struct symbol *i
>        }
>
>        /* Get the size of whatever the pointer points to */
> -       multiply = base->bit_size >> 3;
> +       multiply = base->bit_size / bits_in_char;
>
>        if (ctype == &null_ctype)
>                ctype = &ptr_ctype;
> @@ -831,7 +831,7 @@ static struct symbol *evaluate_ptr_sub(struct expression *expr)
>                struct expression *sub = alloc_expression(expr->pos, EXPR_BINOP);
>                struct expression *div = expr;
>                struct expression *val = alloc_expression(expr->pos, EXPR_VALUE);
> -               unsigned long value = lbase->bit_size >> 3;
> +               unsigned long value = lbase->bit_size / bits_in_char;
>
>                val->ctype = size_t_ctype;
>                val->value = value;
> @@ -850,7 +850,7 @@ static struct symbol *evaluate_ptr_sub(struct expression *expr)
>                div->left = sub;
>                div->right = val;
>        }
> -
> +
>        return ssize_t_ctype;
>  }
>
> @@ -1591,7 +1591,7 @@ static struct symbol *degenerate(struct expression *expr)
>                                e3->op = '+';
>                                e3->left = e0;
>                                e3->right = alloc_const_expression(expr->pos,
> -                                                       expr->r_bitpos >> 3);
> +                                                       expr->r_bitpos / bits_in_char);
>                                e3->ctype = &lazy_ptr_ctype;
>                        } else {
>                                e3 = e0;
> @@ -1727,7 +1727,7 @@ static struct symbol *evaluate_postop(struct expression *expr)
>        } else if (class == TYPE_PTR) {
>                struct symbol *target = examine_pointer_target(ctype);
>                if (!is_function(target))
> -                       multiply = target->bit_size >> 3;
> +                       multiply = target->bit_size / bits_in_char;
>        }
>
>        if (multiply) {
> @@ -1850,7 +1850,7 @@ static struct symbol *find_identifier(struct ident *ident, struct symbol_list *_
>                                        continue;
>                                *offset += sym->offset;
>                                return sub;
> -                       }
> +                       }
>                }
>        } while ((list = list->next) != head);
>        return NULL;
> @@ -1949,7 +1949,7 @@ static struct symbol *evaluate_member_dereference(struct expression *expr)
>                        expr->base = deref->base;
>                        expr->r_bitpos = deref->r_bitpos;
>                }
> -               expr->r_bitpos += offset << 3;
> +               expr->r_bitpos += offset * bits_in_char;
>                expr->type = EXPR_SLICE;
>                expr->r_nrbits = member->bit_size;
>                expr->r_bitpos += member->bit_offset;
> @@ -2040,7 +2040,7 @@ static struct symbol *evaluate_sizeof(struct expression *expr)
>        if ((size < 0) || (size & 7))
>                expression_error(expr, "cannot size expression");
>        expr->type = EXPR_VALUE;
> -       expr->value = size >> 3;
> +       expr->value = size / bits_in_char;
>        expr->taint = 0;
>        expr->ctype = size_t_ctype;
>        return size_t_ctype;
> @@ -2074,7 +2074,7 @@ static struct symbol *evaluate_ptrsizeof(struct expression *expr)
>        if (size & 7)
>                size = 0;
>        expr->type = EXPR_VALUE;
> -       expr->value = size >> 3;
> +       expr->value = size / bits_in_char;
>        expr->taint = 0;
>        expr->ctype = size_t_ctype;
>        return size_t_ctype;
> @@ -2865,7 +2865,7 @@ static struct symbol *evaluate_offsetof(struct expression *expr)
>                        unrestrict(idx, i_class, &i_type);
>                        idx = cast_to(idx, size_t_ctype);
>                        m = alloc_const_expression(expr->pos,
> -                                                  ctype->bit_size >> 3);
> +                                                  ctype->bit_size / bits_in_char);
>                        m->ctype = size_t_ctype;
>                        m->flags = Int_const_expr;
>                        expr->type = EXPR_BINOP;
> diff --git a/example.c b/example.c
> index ae897dc..b844d4f 100644
> --- a/example.c
> +++ b/example.c
> @@ -27,7 +27,7 @@ static const char *opcodes[] = {
>        [OP_INVOKE] = "invoke",
>        [OP_COMPUTEDGOTO] = "jmp *",
>        [OP_UNWIND] = "unwind",
> -
> +
>        /* Binary */
>        [OP_ADD] = "add",
>        [OP_SUB] = "sub",
> @@ -40,7 +40,7 @@ static const char *opcodes[] = {
>        [OP_SHL] = "shl",
>        [OP_LSR] = "lsr",
>        [OP_ASR] = "asr",
> -
> +
>        /* Logical */
>        [OP_AND] = "and",
>        [OP_OR] = "or",
> @@ -66,7 +66,7 @@ static const char *opcodes[] = {
>
>        /* Special three-input */
>        [OP_SEL] = "select",
> -
> +
>        /* Memory */
>        [OP_MALLOC] = "malloc",
>        [OP_FREE] = "free",
> @@ -818,7 +818,7 @@ static const char *generic(struct bb_state *state, pseudo_t pseudo)
>                put_operand(state, op);
>                reg = target_reg(state, pseudo, NULL);
>                output_insn(state, "lea %s,%s", show_op(state, op), reg->name);
> -               return reg->name;
> +               return reg->name;
>
>        default:
>                str = show_op(state, op);
> @@ -870,7 +870,7 @@ static void kill_dead_reg(struct hardreg *reg)
>  {
>        if (reg->dead) {
>                pseudo_t p;
> -
> +
>                FOR_EACH_PTR(reg->contains, p) {
>                        if (CURRENT_TAG(p) & TAG_DEAD) {
>                                DELETE_CURRENT_PTR(p);
> @@ -1011,7 +1011,7 @@ static void kill_pseudo(struct bb_state *state, pseudo_t pseudo)
>                                continue;
>                        if (CURRENT_TAG(p) & TAG_DEAD)
>                                reg->dead--;
> -                       output_comment(state, "removing pseudo %s from reg %s",
> +                       output_comment(state, "removing pseudo %s from reg %s",
>                                show_pseudo(pseudo), reg->name);
>                        DELETE_CURRENT_PTR(p);
>                } END_FOR_EACH_PTR(p);
> @@ -1069,7 +1069,7 @@ static const char *conditional[] = {
>        [OP_SET_BE] = "be",
>        [OP_SET_AE] = "ae"
>  };
> -
> +
>
>  static void generate_branch(struct bb_state *state, struct instruction *br)
>  {
> @@ -1187,7 +1187,7 @@ static void replace_asm_percent(const char **src_p, char **dst_p, struct asm_arg
>                if (index < nr)
>                        replace_asm_arg(dst_p, args+index);
>                break;
> -       }
> +       }
>        *src_p = src;
>        return;
>  }
> @@ -1583,7 +1583,7 @@ static int final_pseudo_flush(struct bb_state *state, pseudo_t pseudo, struct ha
>        struct hardreg *dst;
>
>        /*
> -        * Since this pseudo is live at exit, we'd better have output
> +        * Since this pseudo is live at exit, we'd better have output
>         * storage for it..
>         */
>        hash = find_storage_hash(pseudo, state->outputs);
> @@ -1829,8 +1829,8 @@ static void set_up_arch_entry(struct entrypoint *ep, struct instruction *entry)
>
>                        in->type = REG_FRAME;
>                        in->offset = offset;
> -
> -                       offset += bits >> 3;
> +
> +                       offset += bits / bits_in_char;
>                }
>                i++;
>                NEXT_PTR_LIST(argtype);
> @@ -1936,7 +1936,7 @@ static int compile(struct symbol_list *list)
>                if (ep)
>                        output(ep);
>        } END_FOR_EACH_PTR(sym);
> -
> +
>        return 0;
>  }
>
> diff --git a/expand.c b/expand.c
> index 032f0c5..d1aa019 100644
> --- a/expand.c
> +++ b/expand.c
> @@ -100,7 +100,7 @@ Int:
>        // Stop here unless checking for truncation
>        if (!Wcast_truncate || conservative)
>                return;
> -
> +
>        // Check if we dropped any bits..
>        oldsignmask = 1ULL << (old_size-1);
>        oldmask = oldsignmask | (oldsignmask-1);
> @@ -179,7 +179,7 @@ static int simplify_int_binop(struct expression *expr, struct symbol *ctype)
>                sl |= ~(mask-1);
>        if (is_signed && (sr & mask))
>                sr |= ~(mask-1);
> -
> +
>        switch (CONVERT(expr->op,is_signed)) {
>        case SIGNED('+'):
>        case UNSIGNED('+'):
> @@ -224,7 +224,7 @@ static int simplify_int_binop(struct expression *expr, struct symbol *ctype)
>
>        case UNSIGNED('/'):
>                if (!r) goto Div;
> -               v = l / r;
> +               v = l / r;
>                break;
>
>        case SIGNED('%'):
> @@ -241,7 +241,7 @@ static int simplify_int_binop(struct expression *expr, struct symbol *ctype)
>        case SIGNED(SPECIAL_LEFTSHIFT):
>        case UNSIGNED(SPECIAL_LEFTSHIFT):
>                v = l << r;
> -               break;
> +               break;
>
>        case SIGNED(SPECIAL_RIGHTSHIFT):
>                v = sl >> r;
> @@ -536,7 +536,7 @@ static int expand_conditional(struct expression *expr)
>
>        return cost + cond_cost + BRANCH_COST;
>  }
> -
> +
>  static int expand_assignment(struct expression *expr)
>  {
>        expand_expression(expr->left);
> @@ -822,7 +822,7 @@ static int expand_expression_list(struct expression_list *list)
>        return cost;
>  }
>
> -/*
> +/*
>  * We can simplify nested position expressions if
>  * this is a simple (single) positional expression.
>  */
> @@ -880,7 +880,7 @@ static unsigned long bit_offset(const struct expression *expr)
>  {
>        unsigned long offset = 0;
>        while (expr->type == EXPR_POS) {
> -               offset += expr->init_offset << 3;
> +               offset += expr->init_offset * bits_in_char;
>                expr = expr->init_expr;
>        }
>        if (expr && expr->ctype)
> diff --git a/flow.c b/flow.c
> index 82fb23a..bdb2c25 100644
> --- a/flow.c
> +++ b/flow.c
> @@ -16,6 +16,7 @@
>  #include "expression.h"
>  #include "linearize.h"
>  #include "flow.h"
> +#include "target.h"
>
>  unsigned long bb_generation;
>
> @@ -265,8 +266,8 @@ void convert_load_instruction(struct instruction *insn, pseudo_t src)
>
>  static int overlapping_memop(struct instruction *a, struct instruction *b)
>  {
> -       unsigned int a_start = a->offset << 3;
> -       unsigned int b_start = b->offset << 3;
> +       unsigned int a_start = a->offset * bits_in_char;
> +       unsigned int b_start = b->offset * bits_in_char;
>        unsigned int a_size = a->size;
>        unsigned int b_size = b->size;
>
> @@ -364,11 +365,11 @@ found_dominator:
>                use_pseudo(insn, phi, add_pseudo(dominators, phi));
>        } END_FOR_EACH_PTR(parent);
>        return 1;
> -}
> +}
>
>  /*
>  * We should probably sort the phi list just to make it easier to compare
> - * later for equality.
> + * later for equality.
>  */
>  void rewrite_load_instruction(struct instruction *insn, struct pseudo_list *dominators)
>  {
> @@ -529,7 +530,7 @@ static void kill_dead_stores(pseudo_t pseudo, unsigned long generation, struct b
>  * This should see if the "insn" trivially dominates some previous store, and kill the
>  * store if unnecessary.
>  */
> -static void kill_dominated_stores(pseudo_t pseudo, struct instruction *insn,
> +static void kill_dominated_stores(pseudo_t pseudo, struct instruction *insn,
>        unsigned long generation, struct basic_block *bb, int local, int found)
>  {
>        struct instruction *one;
> @@ -581,13 +582,14 @@ void check_access(struct instruction *insn)
>        pseudo_t pseudo = insn->src;
>
>        if (insn->bb && pseudo->type == PSEUDO_SYM) {
> -               int offset = insn->offset, bit = (offset << 3) + insn->size;
> +               int offset = insn->offset, bit = (offset * bits_in_char) + insn->size;
>                struct symbol *sym = pseudo->sym;
>
>                if (sym->bit_size > 0 && (offset < 0 || bit > sym->bit_size))
>                        warning(insn->pos, "invalid access %s '%s' (%d %d)",
>                                offset < 0 ? "below" : "past the end of",
> -                               show_ident(sym->ident), offset, sym->bit_size >> 3);
> +                               show_ident(sym->ident), offset,
> +                               sym->bit_size / bits_in_char);
>        }
>  }
>
> @@ -705,7 +707,7 @@ external_visibility:
>                        } END_FOR_EACH_PTR(bb);
>                }
>        }
> -
> +
>        return;
>  }
>
> @@ -887,7 +889,7 @@ static void vrfy_children(struct basic_block *bb)
>        default:
>                break;
>        }
> -
> +
>        FOR_EACH_PTR(bb->children, tmp) {
>                vrfy_bb_in_list(bb, tmp->parents);
>        } END_FOR_EACH_PTR(tmp);
> diff --git a/show-parse.c b/show-parse.c
> index 064af32..63d9144 100644
> --- a/show-parse.c
> +++ b/show-parse.c
> @@ -246,8 +246,8 @@ deeper:
>
>                s = modifier_string(mod);
>                len = strlen(s);
> -               name->start -= len;
> -               memcpy(name->start, s, len);
> +               name->start -= len;
> +               memcpy(name->start, s, len);
>                mod = 0;
>                as = 0;
>        }
> @@ -412,7 +412,7 @@ void show_symbol(struct symbol *sym)
>                struct statement *stmt = type->stmt;
>                if (stmt) {
>                        int val;
> -                       printf("\n");
> +                       printf("\n");
>                        val = show_statement(stmt);
>                        if (val)
>                                printf("\tmov.%d\t\tretval,%d\n", stmt->ret->bit_size, val);
> @@ -589,7 +589,7 @@ int show_statement(struct statement *stmt)
>                if (pre_condition) {
>                        if (pre_condition->type == EXPR_VALUE) {
>                                if (!pre_condition->value) {
> -                                       loop_bottom = new_label();
> +                                       loop_bottom = new_label();
>                                        printf("\tjmp\t\t.L%d\n", loop_bottom);
>                                }
>                        } else {
> @@ -623,7 +623,7 @@ int show_statement(struct statement *stmt)
>        }
>        case STMT_NONE:
>                break;
> -
> +
>        case STMT_LABEL:
>                printf(".L%p:\n", stmt->label_identifier);
>                show_statement(stmt->label_statement);
> @@ -649,9 +649,9 @@ int show_statement(struct statement *stmt)
>                int val = show_expression(stmt->range_expression);
>                int low = show_expression(stmt->range_low);
>                int high = show_expression(stmt->range_high);
> -               printf("\trange( %d %d-%d)\n", val, low, high);
> +               printf("\trange( %d %d-%d)\n", val, low, high);
>                break;
> -       }
> +       }
>        }
>        return 0;
>  }
> @@ -673,7 +673,7 @@ static int show_call_expression(struct expression *expr)
>                int new = show_expression(arg);
>                int size = arg->ctype->bit_size;
>                printf("\tpush.%d\t\tv%d\n", size, new);
> -               framesize += size >> 3;
> +               framesize += size / bits_in_char;
>        } END_FOR_EACH_PTR_REVERSE(arg);
>
>        fn = expr->fn;
> @@ -845,7 +845,7 @@ static int show_inc_dec(struct expression *expr, int postop)
>        printf("\t%s.%d\t\tv%d,v%d,$1\n", opname, bits, new, retval);
>        show_store_gen(bits, new, expr->unop, addr);
>        return retval;
> -}
> +}
>
>  static int show_preop(struct expression *expr)
>  {
> @@ -864,7 +864,7 @@ static int show_preop(struct expression *expr)
>  static int show_postop(struct expression *expr)
>  {
>        return show_inc_dec(expr, 1);
> -}
> +}
>
>  static int show_symbol_expr(struct symbol *sym)
>  {
> @@ -918,7 +918,7 @@ static int show_cast_expr(struct expression *expr)
>
>        old_type = expr->cast_expression->ctype;
>        new_type = expr->cast_type;
> -
> +
>        oldbits = old_type->bit_size;
>        newbits = new_type->bit_size;
>        if (oldbits >= newbits)
> @@ -1017,7 +1017,7 @@ again:
>                        entry = entry->ident_expression;
>                        goto again;
>                }
> -
> +
>                if (entry->type == EXPR_INDEX) {
>                        printf(" AT '%d..%d:\n", entry->idx_from, entry->idx_to);
>                        entry = entry->idx_expression;
> @@ -1057,11 +1057,11 @@ int show_expression(struct expression *expr)
>                        pos->line, pos->pos);
>                return 0;
>        }
> -
> +
>        switch (expr->type) {
>        case EXPR_CALL:
>                return show_call_expression(expr);
> -
> +
>        case EXPR_ASSIGNMENT:
>                return show_assignment(expr);
>
> diff --git a/symbol.c b/symbol.c
> index 3292907..a22731f 100644
> --- a/symbol.c
> +++ b/symbol.c
> @@ -22,7 +22,7 @@
>
>  /*
>  * Secondary symbol list for stuff that needs to be output because it
> - * was used.
> + * was used.
>  */
>  struct symbol_list *translation_unit_used_list = NULL;
>
> @@ -117,7 +117,7 @@ static void lay_out_struct(struct symbol *sym, struct struct_union_info *info)
>        }
>
>        bit_size = info->bit_size;
> -       base_size = sym->bit_size;
> +       base_size = sym->bit_size;
>
>        /*
>         * Unsized arrays cause us to not align the resulting
> @@ -128,7 +128,7 @@ static void lay_out_struct(struct symbol *sym, struct struct_union_info *info)
>                base_size = 0;
>        }
>
> -       align_bit_mask = (sym->ctype.alignment << 3) - 1;
> +       align_bit_mask = (sym->ctype.alignment * bits_in_char) - 1;
>
>        /*
>         * Bitfields have some very special rules..
> @@ -143,7 +143,7 @@ static void lay_out_struct(struct symbol *sym, struct struct_union_info *info)
>                        bit_size = (bit_size + align_bit_mask) & ~align_bit_mask;
>                        bit_offset = 0;
>                }
> -               sym->offset = (bit_size - bit_offset) >> 3;
> +               sym->offset = (bit_size - bit_offset) / bits_in_char;
>                sym->bit_offset = bit_offset;
>                sym->ctype.base_type->bit_offset = bit_offset;
>                info->bit_size = bit_size + width;
> @@ -156,7 +156,7 @@ static void lay_out_struct(struct symbol *sym, struct struct_union_info *info)
>         * Otherwise, just align it right and add it up..
>         */
>        bit_size = (bit_size + align_bit_mask) & ~align_bit_mask;
> -       sym->offset = bit_size >> 3;
> +       sym->offset = bit_size / bits_in_char;
>
>        info->bit_size = bit_size + base_size;
>        // warning (sym->pos, "regular: offset=%d", sym->offset);
> @@ -182,7 +182,7 @@ static struct symbol * examine_struct_union_type(struct symbol *sym, int advance
>                sym->ctype.alignment = info.max_align;
>        bit_size = info.bit_size;
>        if (info.align_size) {
> -               bit_align = (sym->ctype.alignment << 3)-1;
> +               bit_align = (sym->ctype.alignment * bits_in_char)-1;
>                bit_size = (bit_size + bit_align) & ~bit_align;
>        }
>        sym->bit_size = bit_size;
> @@ -340,7 +340,7 @@ static struct symbol * examine_node_type(struct symbol *sym)
>                if (node_type && node_type->bit_size >= 0)
>                        bit_size = node_type->bit_size * count;
>        }
> -
> +
>        sym->bit_size = bit_size;
>        return sym;
>  }
> @@ -654,7 +654,7 @@ static int expand_warning(struct expression *expr, int cost)
>        FOR_EACH_PTR (arglist, arg) {
>                /*
>                 * Constant strings get printed out as a warning. By the
> -                * time we get here, the EXPR_STRING has been fully
> +                * time we get here, the EXPR_STRING has been fully
>                 * evaluated, so by now it's an anonymous symbol with a
>                 * string initializer.
>                 *
> @@ -877,7 +877,7 @@ void init_ctype(void)
>                struct symbol *sym = ctype->ptr;
>                unsigned long bit_size = ctype->bit_size ? *ctype->bit_size : -1;
>                unsigned long maxalign = ctype->maxalign ? *ctype->maxalign : 0;
> -               unsigned long alignment = (bit_size + 7) >> 3;
> +               unsigned long alignment = (bit_size + bits_in_char - 1) / bits_in_char;
>
>                if (alignment > maxalign)
>                        alignment = maxalign;
> diff --git a/target.c b/target.c
> index bf1bb8f..a0ee010 100644
> --- a/target.c
> +++ b/target.c
> @@ -11,6 +11,8 @@ struct symbol *ssize_t_ctype = &int_ctype;
>  */
>  int max_alignment = 16;
>
> +int bits_in_unit = 8;
> +
>  /*
>  * Integer data types
>  */
> diff --git a/tokenize.c b/tokenize.c
> index e72c56e..67daa97 100644
> --- a/tokenize.c
> +++ b/tokenize.c
> @@ -96,7 +96,7 @@ static char *charstr(char *ptr, unsigned char c, unsigned char escape, unsigned
>        }
>        if (!isdigit(next))
>                return ptr + sprintf(ptr, "%o", c);
> -
> +
>        return ptr + sprintf(ptr, "%03o", c);
>  }
>
> @@ -161,7 +161,7 @@ const char *show_token(const struct token *token)
>        case TOKEN_STREAMEND:
>                sprintf(buffer, "<end of '%s'>", stream_name(token->pos.stream));
>                return buffer;
> -
> +
>        default:
>                return "WTF???";
>        }
> @@ -483,7 +483,7 @@ static int escapechar(int first, int type, stream_t *stream, int *valp)
>                                int nr = 2;
>                                value -= '0';
>                                while (next >= '0' && next <= '9') {
> -                                       value = (value << 3) + (next-'0');
> +                                       value = (value*8) + (next-'0');
>                                        next = nextchar(stream);
>                                        if (!--nr)
>                                                break;
> @@ -572,7 +572,7 @@ static int get_string_token(int next, stream_t *stream)
>        token_type(token) = TOKEN_STRING;
>        token->string = string;
>        add_token(stream);
> -
> +
>        return next;
>  }
>
> @@ -875,7 +875,7 @@ static int get_one_identifier(int c, stream_t *stream)
>        token->ident = ident;
>        add_token(stream);
>        return next;
> -}
> +}
>
>  static int get_one_token(int c, stream_t *stream)
>  {
>
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-28  0:17                     ` Christopher Li
@ 2008-06-28  0:23                       ` David Given
  2008-06-29  0:10                       ` David Given
  1 sibling, 0 replies; 20+ messages in thread
From: David Given @ 2008-06-28  0:23 UTC (permalink / raw)
  To: linux-sparse

[-- Attachment #1: Type: text/plain, Size: 652 bytes --]

Christopher Li wrote:
> Your patch is white space damaged.
> 
> While you are there, you might want to consider macro
> 
> bits_to_byte(x)
> byte_to_bits(x)

Yes, that's a good idea. I'll redo it.

(What's the right way to generate the patch? I made that with 'git show
> diff', which I thought should do the right thing.)

-- 
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│ "I have always wished for my computer to be as easy to use as my
│ telephone; my wish has come true because I can no longer figure out
│ how to use my telephone." --- Bjarne Stroustrup


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27 23:32                   ` David Given
  2008-06-28  0:17                     ` Christopher Li
@ 2008-06-28  0:29                     ` Josh Triplett
  2008-06-29  0:13                     ` Tommy Thorn
       [not found]                     ` <48658B28.6010301@numba-tu.com>
  3 siblings, 0 replies; 20+ messages in thread
From: Josh Triplett @ 2008-06-28  0:29 UTC (permalink / raw)
  To: David Given; +Cc: linux-sparse

On Sat, 2008-06-28 at 00:32 +0100, David Given wrote:
> diff --git a/compile-i386.c b/compile-i386.c
> index 8526408..3bbc9c7 100644
> --- a/compile-i386.c
> +++ b/compile-i386.c
> @@ -1584,7 +1584,7 @@ static struct storage *emit_select_expr(struct
> expression *expr)
>  	/*
>  	 * Do the actual select: check the conditional for zero,
>  	 * move false over true if zero
> -	 */ 
> +	 */

Don't include whitespace changes in unrelated patches.  You have many
whitespace changes in this patch; I haven't commented on all of them.

> @@ -2040,7 +2040,7 @@ static struct symbol *evaluate_sizeof(struct
> expression *expr)
>  	if ((size < 0) || (size & 7))
>  		expression_error(expr, "cannot size expression");

This "size & 7" represents an assumption about bits_in_char as well; it
checks if the size represents an integral number of chars.  You'll need
to look for other instances of 7 as well, to catch cases like this.

> @@ -2074,7 +2074,7 @@ static struct symbol *evaluate_ptrsizeof(struct
> expression *expr)
>  	if (size & 7)
>  		size = 0;

Same problem here.

>  	 * storage for it..
>  	 */
>  	hash = find_storage_hash(pseudo, state->outputs);
> @@ -1829,8 +1829,8 @@ static void set_up_arch_entry(struct entrypoint
> *ep, struct instruction *entry)
>  
>  			in->type = REG_FRAME;
>  			in->offset = offset;
> -			
> -			offset += bits >> 3;
> +
> +			offset += bits / bits_in_char;

The first line represents a whitespace-only change.

> --- a/target.c
> +++ b/target.c
> @@ -11,6 +11,8 @@ struct symbol *ssize_t_ctype = &int_ctype;
>   */
>  int max_alignment = 16;
>  
> +int bits_in_unit = 8;
> +

Unused.

> @@ -483,7 +483,7 @@ static int escapechar(int first, int type,
> stream_t *stream, int *valp)
>  				int nr = 2;
>  				value -= '0';
>  				while (next >= '0' && next <= '9') {
> -					value = (value << 3) + (next-'0');
> +					value = (value*8) + (next-'0');

This does not relate to bits_per_char, and should not change.

- Josh Triplett


--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-28  0:17                     ` Christopher Li
  2008-06-28  0:23                       ` David Given
@ 2008-06-29  0:10                       ` David Given
  1 sibling, 0 replies; 20+ messages in thread
From: David Given @ 2008-06-29  0:10 UTC (permalink / raw)
  To: linux-sparse


[-- Attachment #1.1: Type: text/plain, Size: 812 bytes --]

Christopher Li wrote:
> Your patch is white space damaged.
> 
> While you are there, you might want to consider macro
> 
> bits_to_byte(x)
> byte_to_bits(x)
> 
> To isolate out the conversion. It should generate the same code.
> Marginally more readable.

Yes, good idea. Updated patch enclosed. I've also applied Josh
Triplett's fixes.

(This one's generated manually because I've been totally unable to make
git behave for me. Can anyone suggest a decent getting-started guide?)

-- 
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│ "I have always wished for my computer to be as easy to use as my
│ telephone; my wish has come true because I can no longer figure out
│ how to use my telephone." --- Bjarne Stroustrup

[-- Attachment #1.2: diff --]
[-- Type: text/plain, Size: 7364 bytes --]

diff -rwu ./compile-i386.c ../sparse/compile-i386.c
--- ./compile-i386.c	2008-06-28 22:44:21.706554998 +0100
+++ ../sparse/compile-i386.c	2008-06-28 23:45:21.246554627 +0100
@@ -2081,7 +2081,7 @@
 		insn("pushl", new, NULL,
 		     !framesize ? "begin function call" : NULL);
 
-		framesize += size >> 3;
+		framesize += bits_to_bytes(size);
 	} END_FOR_EACH_PTR_REVERSE(arg);
 
 	fn = expr->fn;
Only in .: diff
diff -rwu ./evaluate.c ../sparse/evaluate.c
--- ./evaluate.c	2008-06-28 22:44:21.706554998 +0100
+++ ../sparse/evaluate.c	2008-06-28 23:50:42.306555440 +0100
@@ -579,7 +579,7 @@
 	}
 
 	/* Get the size of whatever the pointer points to */
-	multiply = base->bit_size >> 3;
+	multiply = bits_to_bytes(base->bit_size);
 
 	if (ctype == &null_ctype)
 		ctype = &ptr_ctype;
@@ -831,7 +831,7 @@
 		struct expression *sub = alloc_expression(expr->pos, EXPR_BINOP);
 		struct expression *div = expr;
 		struct expression *val = alloc_expression(expr->pos, EXPR_VALUE);
-		unsigned long value = lbase->bit_size >> 3;
+		unsigned long value = bits_to_bytes(lbase->bit_size);
 
 		val->ctype = size_t_ctype;
 		val->value = value;
@@ -1591,7 +1591,7 @@
 				e3->op = '+';
 				e3->left = e0;
 				e3->right = alloc_const_expression(expr->pos,
-							expr->r_bitpos >> 3);
+							bits_to_bytes(expr->r_bitpos));
 				e3->ctype = &lazy_ptr_ctype;
 			} else {
 				e3 = e0;
@@ -1727,7 +1727,7 @@
 	} else if (class == TYPE_PTR) {
 		struct symbol *target = examine_pointer_target(ctype);
 		if (!is_function(target))
-			multiply = target->bit_size >> 3;
+			multiply = bits_to_bytes(target->bit_size);
 	}
 
 	if (multiply) {
@@ -1949,7 +1949,7 @@
 			expr->base = deref->base;
 			expr->r_bitpos = deref->r_bitpos;
 		}
-		expr->r_bitpos += offset << 3;
+		expr->r_bitpos += bytes_to_bits(offset);
 		expr->type = EXPR_SLICE;
 		expr->r_nrbits = member->bit_size;
 		expr->r_bitpos += member->bit_offset;
@@ -2037,10 +2037,10 @@
 		return NULL;
 
 	size = type->bit_size;
-	if ((size < 0) || (size & 7))
+	if ((size < 0) || (size & (bits_in_char - 1)))
 		expression_error(expr, "cannot size expression");
 	expr->type = EXPR_VALUE;
-	expr->value = size >> 3;
+	expr->value = bits_to_bytes(size);
 	expr->taint = 0;
 	expr->ctype = size_t_ctype;
 	return size_t_ctype;
@@ -2071,10 +2071,10 @@
 		return NULL;
 	}
 	size = type->bit_size;
-	if (size & 7)
+	if (size & (bits_in_char-1))
 		size = 0;
 	expr->type = EXPR_VALUE;
-	expr->value = size >> 3;
+	expr->value = bits_to_bytes(size);
 	expr->taint = 0;
 	expr->ctype = size_t_ctype;
 	return size_t_ctype;
@@ -2865,7 +2865,7 @@
 			unrestrict(idx, i_class, &i_type);
 			idx = cast_to(idx, size_t_ctype);
 			m = alloc_const_expression(expr->pos,
-						   ctype->bit_size >> 3);
+						   bits_to_bytes(ctype->bit_size));
 			m->ctype = size_t_ctype;
 			m->flags = Int_const_expr;
 			expr->type = EXPR_BINOP;
diff -rwu ./example.c ../sparse/example.c
--- ./example.c	2008-06-28 22:44:21.706554998 +0100
+++ ../sparse/example.c	2008-06-28 23:47:38.750558346 +0100
@@ -1830,7 +1830,7 @@
 			in->type = REG_FRAME;
 			in->offset = offset;
 			
-			offset += bits >> 3;
+			offset += bits_to_bytes(bits);
 		}
 		i++;
 		NEXT_PTR_LIST(argtype);
diff -rwu ./expand.c ../sparse/expand.c
--- ./expand.c	2008-06-28 22:44:21.710555415 +0100
+++ ../sparse/expand.c	2008-06-28 23:49:46.523309559 +0100
@@ -880,7 +880,7 @@
 {
 	unsigned long offset = 0;
 	while (expr->type == EXPR_POS) {
-		offset += expr->init_offset << 3;
+		offset += bytes_to_bits(expr->init_offset);
 		expr = expr->init_expr;
 	}
 	if (expr && expr->ctype)
diff -rwu ./flow.c ../sparse/flow.c
--- ./flow.c	2008-06-28 22:44:21.710555415 +0100
+++ ../sparse/flow.c	2008-06-28 23:49:45.014554518 +0100
@@ -16,6 +16,7 @@
 #include "expression.h"
 #include "linearize.h"
 #include "flow.h"
+#include "target.h"
 
 unsigned long bb_generation;
 
@@ -265,8 +266,8 @@
 
 static int overlapping_memop(struct instruction *a, struct instruction *b)
 {
-	unsigned int a_start = a->offset << 3;
-	unsigned int b_start = b->offset << 3;
+	unsigned int a_start = bytes_to_bits(a->offset);
+	unsigned int b_start = bytes_to_bits(b->offset);
 	unsigned int a_size = a->size;
 	unsigned int b_size = b->size;
 
@@ -581,13 +582,14 @@
 	pseudo_t pseudo = insn->src;
 
 	if (insn->bb && pseudo->type == PSEUDO_SYM) {
-		int offset = insn->offset, bit = (offset << 3) + insn->size;
+		int offset = insn->offset, bit = bytes_to_bits(offset) + insn->size;
 		struct symbol *sym = pseudo->sym;
 
 		if (sym->bit_size > 0 && (offset < 0 || bit > sym->bit_size))
 			warning(insn->pos, "invalid access %s '%s' (%d %d)",
 				offset < 0 ? "below" : "past the end of",
-				show_ident(sym->ident), offset, sym->bit_size >> 3);
+				show_ident(sym->ident), offset,
+				bits_to_bytes(sym->bit_size));
 	}
 }
 
diff -rwu ./show-parse.c ../sparse/show-parse.c
--- ./show-parse.c	2008-06-28 22:44:21.718555926 +0100
+++ ../sparse/show-parse.c	2008-06-28 23:47:40.010553617 +0100
@@ -673,7 +673,7 @@
 		int new = show_expression(arg);
 		int size = arg->ctype->bit_size;
 		printf("\tpush.%d\t\tv%d\n", size, new);
-		framesize += size >> 3;
+		framesize += bits_to_bytes(size);
 	} END_FOR_EACH_PTR_REVERSE(arg);
 
 	fn = expr->fn;
diff -rwu ./symbol.c ../sparse/symbol.c
--- ./symbol.c	2008-06-28 22:44:21.722555178 +0100
+++ ../sparse/symbol.c	2008-06-28 23:49:45.758555385 +0100
@@ -128,7 +128,7 @@
 		base_size = 0;
 	}
 
-	align_bit_mask = (sym->ctype.alignment << 3) - 1;
+	align_bit_mask = bytes_to_bits(sym->ctype.alignment) - 1;
 
 	/*
 	 * Bitfields have some very special rules..
@@ -143,7 +143,7 @@
 			bit_size = (bit_size + align_bit_mask) & ~align_bit_mask;
 			bit_offset = 0;
 		}
-		sym->offset = (bit_size - bit_offset) >> 3;
+		sym->offset = bits_to_bytes(bit_size - bit_offset);
 		sym->bit_offset = bit_offset;
 		sym->ctype.base_type->bit_offset = bit_offset;
 		info->bit_size = bit_size + width;
@@ -156,7 +156,7 @@
 	 * Otherwise, just align it right and add it up..
 	 */
 	bit_size = (bit_size + align_bit_mask) & ~align_bit_mask;
-	sym->offset = bit_size >> 3;
+	sym->offset = bits_to_bytes(bit_size);
 
 	info->bit_size = bit_size + base_size;
 	// warning (sym->pos, "regular: offset=%d", sym->offset);
@@ -182,7 +182,7 @@
 		sym->ctype.alignment = info.max_align;
 	bit_size = info.bit_size;
 	if (info.align_size) {
-		bit_align = (sym->ctype.alignment << 3)-1;
+		bit_align = bytes_to_bits(sym->ctype.alignment)-1;
 		bit_size = (bit_size + bit_align) & ~bit_align;
 	}
 	sym->bit_size = bit_size;
@@ -877,7 +877,7 @@
 		struct symbol *sym = ctype->ptr;
 		unsigned long bit_size = ctype->bit_size ? *ctype->bit_size : -1;
 		unsigned long maxalign = ctype->maxalign ? *ctype->maxalign : 0;
-		unsigned long alignment = (bit_size + 7) >> 3;
+		unsigned long alignment = bits_to_bytes(bit_size + bits_in_char - 1);
 
 		if (alignment > maxalign)
 			alignment = maxalign;
diff -rwu ./target.h ../sparse/target.h
--- ./target.h	2008-06-28 22:44:21.722555178 +0100
+++ ../sparse/target.h	2008-06-28 23:44:50.766554787 +0100
@@ -42,4 +42,14 @@
 extern int bits_in_enum;
 extern int enum_alignment;
 
+/*
+ * Helper functions for converting bits to bytes and vice versa.
+ */
+
+static inline int bits_to_bytes(int bits)
+{ return bits / bits_in_char; }
+
+static inline int bytes_to_bits(int bytes)
+{ return bytes * bits_in_char; }
+
 #endif

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27 23:32                   ` David Given
  2008-06-28  0:17                     ` Christopher Li
  2008-06-28  0:29                     ` Josh Triplett
@ 2008-06-29  0:13                     ` Tommy Thorn
       [not found]                     ` <48658B28.6010301@numba-tu.com>
  3 siblings, 0 replies; 20+ messages in thread
From: Tommy Thorn @ 2008-06-29  0:13 UTC (permalink / raw)
  To: David Given; +Cc: linux-sparse, Christopher Li

[Resent for the 3rd time. I seems I can't write to the list]

David Given wrote:
 > Christopher Li wrote:
 > [...]
 >   
 >> You are right that point out a bug (assumption) of sparse which byte 
is 8 bits.
 >> Using bits_in_byte is instead of 8 is better there.
 >> Using bits_in_char assumes char has same bits as byte. That is my read
 >> of the C spec.
 >>     
 >
 > Yes, indeed, I'd managed to get my terminology muddled to the confusion
 > of everyone.
 >
 > Okay, I've gone and looked at the implementation of this stuff; it would
 > appear that modifying the code to supply the back end with units scaled
 > in something other than C chars is actually quite complicated, so I
 > haven't tried to do that. However, I'm enclosing a patch that should,
 > hopefully, fix the cases where it's assuming 8 bit bytes. Hopefully I've
 >  managed to catch all the cases, without breaking the octal parse code...
 >
 > (I haven't used git before; is this the right format?)
 >   

Yes, but it would have been better to have kept all the white space
changes for a separate change so we could just see that relevant changes.

This introduces divides all over the place for all users, replacing a
very cheap constant shift with an expensive divide. All sparse users
would be paying the cost for a feature that is useful for 1 user, I
think we should perhaps think carefully about this.

One obvious solution would be to introduce a BITS_IN_CHAR macro in
target.h defaulting to 8 and let "exotic" architectures redefine it to
bits_in_char. This is very similar to the approach GCC is using.

On a completely unrelated node, I'm excited to see your work on using
sparse for compilation. Hopefully your experience will lead to it being
easier for the next guy. I would like to see sparse target for a simple
abstract machine, but I so far haven't had time to work on it.

Tommy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
       [not found]                     ` <48658B28.6010301@numba-tu.com>
@ 2008-06-29  0:30                       ` David Given
  2008-06-29  0:38                         ` Tommy Thorn
  0 siblings, 1 reply; 20+ messages in thread
From: David Given @ 2008-06-29  0:30 UTC (permalink / raw)
  To: linux-sparse

[-- Attachment #1: Type: text/plain, Size: 3431 bytes --]

Tommy Thorn wrote:
[...]
> Yes, but it would have been better to have kept all the white space
> changes for a separate change so we could just see that relevant changes.

Corrected; turned out it had been so long since I've used a diff that
didn't use my alias I'd forgotten you have to explicitly tell it to
ignore white space.

> This introduces divides all over the place for all users, replacing a
> very cheap constant shift with an expensive divide. All sparse users
> would be paying the cost for a feature that is useful for 1 user, I
> think we should perhaps think carefully about this.

I really don't think this is an issue. Certainly, a division by a
variable is more expensive than a shift by a constant... but you'd have
to do billions of them before anyone actually started to notice. If it
really did turn out to be an issue, it would be simple enough to change
later to "value >> bits_in_char_power_of_2".

> One obvious solution would be to introduce a BITS_IN_CHAR macro in
> target.h defaulting to 8 and let "exotic" architectures redefine it to
> bits_in_char. This is very similar to the approach GCC is using.

The problem with this approach is that it involves recompiling all of
sparse for each target, which is a bit user unfriendly. Currently all
configuration can be done at run time, which allows such nice features
as having all potential back-ends share the same code via shared libraries.

> On a completely unrelated node, I'm excited to see your work on using
> sparse for compilation. Hopefully your experience will lead to it being
> easier for the next guy. I would like to see sparse target for a simple
> abstract machine, but I so far haven't had time to work on it.

sparse is working rather well, and seems to be one of the better
compiler front ends that I've found. The linearising support is
particularly helpful. I have, however, been finding that the learning
curve is rather steep --- the API shows a lot of signs of having grown
organically over time --- and I've had to build big chunks of code that
I'm sure are unnecessary. For example, in order to generate the right
kind of code, I need to figure out whether a pseudo contains an integer,
float or pointer. The only way I've found out to do this is to
recursively follow the chain of pseudo->def pointers until I find an
instruction with enough type information attached to it to figure it
out. I'm sure there must be an easier way of doing this...

(BTW, given a function symbol, how do I find its return type? I can find
the list of arguments, but there's nothing in struct symbol that seems
to refer to the return value...)

I am intending to release my project once done, of course, and I hope
people will find it useful, but I'm not sure how much general use it
will be; I've basically ripped out and replaced the register allocator /
storage mechanism from my back end as I didn't understand it and didn't
need the complexity. That should probably all get rewritten properly at
some stage, but right now I'm focusing on getting things working...

-- 
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│ "I have always wished for my computer to be as easy to use as my
│ telephone; my wish has come true because I can no longer figure out
│ how to use my telephone." --- Bjarne Stroustrup

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-29  0:30                       ` David Given
@ 2008-06-29  0:38                         ` Tommy Thorn
  0 siblings, 0 replies; 20+ messages in thread
From: Tommy Thorn @ 2008-06-29  0:38 UTC (permalink / raw)
  To: David Given; +Cc: linux-sparse

David Given wrote:
> sparse is working rather well, and seems to be one of the better
> compiler front ends that I've found. The linearising support is
> particularly helpful. I have, however, been finding that the learning
> curve is rather steep --- the API shows a lot of signs of having grown
> organically over time --- and I've had to build big chunks of code that
> I'm sure are unnecessary. For example, in order to generate the right
> kind of code, I need to figure out whether a pseudo contains an integer,
> float or pointer. The only way I've found out to do this is to
> recursively follow the chain of pseudo->def pointers until I find an
> instruction with enough type information attached to it to figure it
> out. I'm sure there must be an easier way of doing this...
>
> (BTW, given a function symbol, how do I find its return type? I can find
> the list of arguments, but there's nothing in struct symbol that seems
> to refer to the return value...)
>
> I am intending to release my project once done, of course, and I hope
> people will find it useful, but I'm not sure how much general use it
> will be; I've basically ripped out and replaced the register allocator /
> storage mechanism from my back end as I didn't understand it and didn't
> need the complexity. That should probably all get rewritten properly at
> some stage, but right now I'm focusing on getting things working...
>   

What I had in mind would not even perform register allocation or any 
attempt at optimization. Such only obscures how to use sparse as a compiler.

Cheers,
Tommy


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Pointer arithmetic error
  2008-06-27 18:01                 ` Christopher Li
  2008-06-27 23:32                   ` David Given
@ 2008-06-29 12:19                   ` Bernd Petrovitsch
  1 sibling, 0 replies; 20+ messages in thread
From: Bernd Petrovitsch @ 2008-06-29 12:19 UTC (permalink / raw)
  To: Christopher Li; +Cc: David Given, linux-sparse

On Fre, 2008-06-27 at 11:01 -0700, Christopher Li wrote: 
> > In C, there is no type "byte" (unless you typedef oder #define it).
> > "byte" is usually (but not necessarily) meant as "unsigned char".
> 
> In C spec, there is a concept of "byte". The union return by sizeof()

I stand corrected. Hmm, I need the time to read C99 thoroughly.

> is byte. Char must fit in a byte. But char does not necessary have the
> same bits as byte. Char can have more.
> 
> C99: 3.6, 3.7.1
>
> Because char can always fit in byte, sizeof(char) == 1.

But how can a char have more bits than a byte?

> > IIRC C specifies that sizeof() returns values measured in chars, but
> > I don't believe it specifies any mapping between the size of chars
> > and the underlying addressing units --- it should be possible to use
> > 16-bit chars, for example, on an 8-bit byte system. > Using 32-bit
> > ints, sizeof(int) would then return 2; but you wouldn't be able to
> > access individual bytes from C.
> 
> sizeof() return value measure in _byte_.
> C99: 6.5.3.4

Yes. But "sizeof(char)" is always 1 (as stated in the same chapter). 
So I see no real difference between "byte" and "char" (at least with the
size of them).

	Bernd
-- 
Firmix Software GmbH                   http://www.firmix.at/
mobil: +43 664 4416156                 fax: +43 1 7890849-55
          Embedded Linux Development and Services


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2008-06-29 12:20 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-26 23:40 Pointer arithmetic error David Given
2008-06-26 23:51 ` Chris Li
2008-06-27  0:17   ` David Given
2008-06-27  9:00     ` Christopher Li
2008-06-27  9:49     ` Bernd Petrovitsch
2008-06-27 10:55       ` David Given
2008-06-27 11:20         ` Bernd Petrovitsch
2008-06-27 14:03           ` David Given
2008-06-27 14:45             ` Bernd Petrovitsch
2008-06-27 15:45               ` David Given
2008-06-27 18:01                 ` Christopher Li
2008-06-27 23:32                   ` David Given
2008-06-28  0:17                     ` Christopher Li
2008-06-28  0:23                       ` David Given
2008-06-29  0:10                       ` David Given
2008-06-28  0:29                     ` Josh Triplett
2008-06-29  0:13                     ` Tommy Thorn
     [not found]                     ` <48658B28.6010301@numba-tu.com>
2008-06-29  0:30                       ` David Given
2008-06-29  0:38                         ` Tommy Thorn
2008-06-29 12:19                   ` Bernd Petrovitsch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).