[PATCH] fix casts when linearizing compound assignments

linux-sparse.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] fix casts when linearizing compound assignments
@ 2012-06-08 10:47 Xi Wang
  2012-06-08 15:45 ` Xi Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Xi Wang @ 2012-06-08 10:47 UTC (permalink / raw)
  To: linux-sparse; +Cc: Xi Wang

linearize_assignment() emits confusing casts for compound assignments.
Consider p += 1, where p is a pointer.

	oldvalue = cast_pseudo(ep, oldvalue, src->ctype, expr->ctype);
	opcode = opcode_sign(op_trans[expr->op - SPECIAL_BASE], src->ctype);
	dst = add_binary_op(ep, src->ctype, opcode, oldvalue, value);
	value = cast_pseudo(ep, dst, expr->ctype, src->ctype);

It would make more sense to swap src->ctype and expr->ctype in both
cast_pseudo(ep, pseudo, from, to) calls:

In the first call, `oldvalue' is a pointer (expr->ctype), but it is
converted from an integer (src->ctype) to a pointer (expr->ctype);

In the second call, `dst' is an integer (src->ctype), but it is
converted from a pointer (expr->ctype) to an integer (src->ctype).

This patch instead converts `value' from an integer (src->ctype) to a
pointer (expr->ctype) so as to emit fewer casts.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
---
 linearize.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/linearize.c b/linearize.c
index 1d15cfd..d186387 100644
--- a/linearize.c
+++ b/linearize.c
@@ -1161,7 +1161,6 @@ static pseudo_t linearize_assignment(struct entrypoint *ep, struct expression *e
 		return value;
 	if (expr->op != '=') {
 		pseudo_t oldvalue = linearize_load_gen(ep, &ad);
-		pseudo_t dst;
 		static const int op_trans[] = {
 			[SPECIAL_ADD_ASSIGN - SPECIAL_BASE] = OP_ADD,
 			[SPECIAL_SUB_ASSIGN - SPECIAL_BASE] = OP_SUB,
@@ -1179,10 +1178,9 @@ static pseudo_t linearize_assignment(struct entrypoint *ep, struct expression *e
 		if (!src)
 			return VOID;
 
-		oldvalue = cast_pseudo(ep, oldvalue, src->ctype, expr->ctype);
-		opcode = opcode_sign(op_trans[expr->op - SPECIAL_BASE], src->ctype);
-		dst = add_binary_op(ep, src->ctype, opcode, oldvalue, value);
-		value = cast_pseudo(ep, dst, expr->ctype, src->ctype);
+		opcode = opcode_sign(op_trans[expr->op - SPECIAL_BASE], expr->ctype);
+		value = cast_pseudo(ep, value, src->ctype, expr->ctype);
+		value = add_binary_op(ep, expr->ctype, opcode, oldvalue, value);
 	}
 	value = linearize_store_gen(ep, value, &ad);
 	finish_address_gen(ep, &ad);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 10:47 [PATCH] fix casts when linearizing compound assignments Xi Wang
@ 2012-06-08 15:45 ` Xi Wang
  2012-06-08 19:10   ` Pekka Enberg
  0 siblings, 1 reply; 12+ messages in thread
From: Xi Wang @ 2012-06-08 15:45 UTC (permalink / raw)
  To: linux-sparse; +Cc: Pekka Enberg

On Jun 8, 2012, at 6:47 AM, Xi Wang wrote:
> -		oldvalue = cast_pseudo(ep, oldvalue, src->ctype, expr->ctype);
> -		opcode = opcode_sign(op_trans[expr->op - SPECIAL_BASE], src->ctype);
> -		dst = add_binary_op(ep, src->ctype, opcode, oldvalue, value);
> -		value = cast_pseudo(ep, dst, expr->ctype, src->ctype);
> +		opcode = opcode_sign(op_trans[expr->op - SPECIAL_BASE], expr->ctype);
> +		value = cast_pseudo(ep, value, src->ctype, expr->ctype);
> +		value = add_binary_op(ep, expr->ctype, opcode, oldvalue, value);

For sparse-llvm, is it better to get rid of the cast_pseudo()
at all?

For example, what to emit for p += 1:

1) pointer p + integer 1,

2) pointer + pointer [cast from integer], or

3) (pointer)(integer [cast from pointer] + integer)?

- xi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 15:45 ` Xi Wang
@ 2012-06-08 19:10   ` Pekka Enberg
  2012-06-08 19:18     ` Linus Torvalds
  2012-06-08 20:08     ` Jeff Garzik
  0 siblings, 2 replies; 12+ messages in thread
From: Pekka Enberg @ 2012-06-08 19:10 UTC (permalink / raw)
  To: Xi Wang
  Cc: linux-sparse, Jeff Garzik, Linus Torvalds, Benjamin Herrenschmidt,
	Christopher Li

On Fri, Jun 8, 2012 at 6:45 PM, Xi Wang <xi.wang@gmail.com> wrote:
> On Jun 8, 2012, at 6:47 AM, Xi Wang wrote:
>> -             oldvalue = cast_pseudo(ep, oldvalue, src->ctype, expr->ctype);
>> -             opcode = opcode_sign(op_trans[expr->op - SPECIAL_BASE], src->ctype);
>> -             dst = add_binary_op(ep, src->ctype, opcode, oldvalue, value);
>> -             value = cast_pseudo(ep, dst, expr->ctype, src->ctype);
>> +             opcode = opcode_sign(op_trans[expr->op - SPECIAL_BASE], expr->ctype);
>> +             value = cast_pseudo(ep, value, src->ctype, expr->ctype);
>> +             value = add_binary_op(ep, expr->ctype, opcode, oldvalue, value);
>
> For sparse-llvm, is it better to get rid of the cast_pseudo()
> at all?
>
> For example, what to emit for p += 1:
>
> 1) pointer p + integer 1,
>
> 2) pointer + pointer [cast from integer], or
>
> 3) (pointer)(integer [cast from pointer] + integer)?

Dunno.

This:

void *foo(void *p)
{
  p += 1;

  return p;
}

is converted as follows by clang:

define i8* @foo(i8* %p) nounwind uwtable readnone {
  %1 = getelementptr i8* %p, i64 1
  ret i8* %1
}

[ You can try it out yourself here: http://llvm.org/demo/index.cgi ]

So it seems that LLVM really wants to use the "getelementptr" thingy
for these. I have no idea what makes the most sense from sparse IR
point of view.

Jeff, Linus, Chris, hmm?

                        Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 19:10   ` Pekka Enberg
@ 2012-06-08 19:18     ` Linus Torvalds
  2012-06-08 19:25       ` Xi Wang
  2012-06-08 20:08     ` Jeff Garzik
  1 sibling, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2012-06-08 19:18 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Xi Wang, linux-sparse, Jeff Garzik, Benjamin Herrenschmidt,
	Christopher Li

On Fri, Jun 8, 2012 at 12:10 PM, Pekka Enberg <penberg@kernel.org> wrote:
>
> So it seems that LLVM really wants to use the "getelementptr" thingy
> for these. I have no idea what makes the most sense from sparse IR
> point of view.

I think the right thing to do is as an EXPR_BINOP with a
"SPECIAL_ADDPTR" operation which takes a pointer and a (pre-scaled)
'long'.

From a code generation and optimization standpoint, OP_ADDPTR would be
the same as '+'

                   Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 19:18     ` Linus Torvalds
@ 2012-06-08 19:25       ` Xi Wang
  2012-06-08 19:28         ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Xi Wang @ 2012-06-08 19:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka Enberg, linux-sparse, Jeff Garzik, Benjamin Herrenschmidt,
	Christopher Li

On Jun 8, 2012, at 3:18 PM, Linus Torvalds wrote:

> On Fri, Jun 8, 2012 at 12:10 PM, Pekka Enberg <penberg@kernel.org> wrote:
>> 
>> So it seems that LLVM really wants to use the "getelementptr" thingy
>> for these. I have no idea what makes the most sense from sparse IR
>> point of view.
> 
> I think the right thing to do is as an EXPR_BINOP with a
> "SPECIAL_ADDPTR" operation which takes a pointer and a (pre-scaled)
> 'long'.
> 
> From a code generation and optimization standpoint, OP_ADDPTR would be
> the same as '+'

Not sure we really need getelementptr.

What I am doing now is to convert all operands to integers first,
perform the addition, and then convert the result back to pointer.
Anything wrong with that?

- xi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 19:25       ` Xi Wang
@ 2012-06-08 19:28         ` Linus Torvalds
  2012-06-08 19:47           ` Pekka Enberg
  2012-06-08 20:09           ` Jeff Garzik
  0 siblings, 2 replies; 12+ messages in thread
From: Linus Torvalds @ 2012-06-08 19:28 UTC (permalink / raw)
  To: Xi Wang
  Cc: Pekka Enberg, linux-sparse, Jeff Garzik, Benjamin Herrenschmidt,
	Christopher Li

On Fri, Jun 8, 2012 at 12:25 PM, Xi Wang <xi.wang@gmail.com> wrote:
>
> What I am doing now is to convert all operands to integers first,
> perform the addition, and then convert the result back to pointer.
> Anything wrong with that?

Wrong with it? No, it's what all reasonable hardware does anyway. But
it's possible that it would effectively disable some LVM warnings or
optimizations.

             Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 19:28         ` Linus Torvalds
@ 2012-06-08 19:47           ` Pekka Enberg
  2012-06-08 19:57             ` Xi Wang
  2012-06-08 20:09           ` Jeff Garzik
  1 sibling, 1 reply; 12+ messages in thread
From: Pekka Enberg @ 2012-06-08 19:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Xi Wang, linux-sparse, Jeff Garzik, Benjamin Herrenschmidt,
	Christopher Li

On Fri, Jun 8, 2012 at 12:25 PM, Xi Wang <xi.wang@gmail.com> wrote:
>> What I am doing now is to convert all operands to integers first,
>> perform the addition, and then convert the result back to pointer.
>> Anything wrong with that?

On Fri, Jun 8, 2012 at 10:28 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Wrong with it? No, it's what all reasonable hardware does anyway. But
> it's possible that it would effectively disable some LVM warnings or
> optimizations.

Yeah. I'd really prefer we avoid any special tricks with LLVM and just
do what clang does.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 19:47           ` Pekka Enberg
@ 2012-06-08 19:57             ` Xi Wang
  0 siblings, 0 replies; 12+ messages in thread
From: Xi Wang @ 2012-06-08 19:57 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Linus Torvalds, linux-sparse, Jeff Garzik, Benjamin Herrenschmidt,
	Christopher Li

On Jun 8, 2012, at 3:47 PM, Pekka Enberg wrote:
> Yeah. I'd really prefer we avoid any special tricks with LLVM and just
> do what clang does.

This is not really a special trick:

http://llvm.org/docs/GetElementPtr.html#int

I won't worry about optimizations; an alias analysis that takes care of
inttoptr/ptrtoint should work.

I agree that using getelementptr looks more llvm-ish anyway. :-)

- xi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 19:10   ` Pekka Enberg
  2012-06-08 19:18     ` Linus Torvalds
@ 2012-06-08 20:08     ` Jeff Garzik
  2012-06-08 20:16       ` Linus Torvalds
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff Garzik @ 2012-06-08 20:08 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Xi Wang, linux-sparse, Linus Torvalds, Benjamin Herrenschmidt,
	Christopher Li

On 06/08/2012 03:10 PM, Pekka Enberg wrote:
> So it seems that LLVM really wants to use the "getelementptr" thingy
> for these. I have no idea what makes the most sense from sparse IR
> point of view.

IMO it's a major backend design choice, GEP or not.

Either choice requires a bit of ugly work, in different places.  sparse 
already calculates sufficient information to create low-level 
address/offset load/store generated code.  LLVM API seems to "prefer" 
that one use GEP code and associated type information, and was written 
with this preference in mind.  HOWEVER it is not a hard requirement, and 
addr/ofs load/store lowlevel code can be generated.  You wind up 
abandoning LLVM's type-related APIs, with all that entails.

Using GEP with sparse, OTOH, enables smarter warnings, optimizations, 
aliasing behaviors etc. during code generation, by virtue of providing 
more type information throughout the LLVM compile process.  The hard 
work of GEP+sparse is that sparse in places prefers to linearize to a 
lower level, and you may wind up having to re-create (or store early) 
some advanced type information sparse deems unimportant in later 
linearization phases.

I am biased in favor of GEP (more information) but sparse at present is 
biased in favor of addr/ofs load/store.  Each has benefits and costs.

	Jeff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 19:28         ` Linus Torvalds
  2012-06-08 19:47           ` Pekka Enberg
@ 2012-06-08 20:09           ` Jeff Garzik
  1 sibling, 0 replies; 12+ messages in thread
From: Jeff Garzik @ 2012-06-08 20:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Xi Wang, Pekka Enberg, linux-sparse, Benjamin Herrenschmidt,
	Christopher Li

On 06/08/2012 03:28 PM, Linus Torvalds wrote:
> On Fri, Jun 8, 2012 at 12:25 PM, Xi Wang<xi.wang@gmail.com>  wrote:
>>
>> What I am doing now is to convert all operands to integers first,
>> perform the addition, and then convert the result back to pointer.
>> Anything wrong with that?
>
> Wrong with it? No, it's what all reasonable hardware does anyway. But
> it's possible that it would effectively disable some LVM warnings or
> optimizations.

Yep.  That is the tradeoff.

	Jeff





^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 20:08     ` Jeff Garzik
@ 2012-06-08 20:16       ` Linus Torvalds
  2012-06-16  6:10         ` Xi Wang
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2012-06-08 20:16 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Pekka Enberg, Xi Wang, linux-sparse, Benjamin Herrenschmidt,
	Christopher Li

On Fri, Jun 8, 2012 at 1:08 PM, Jeff Garzik <jeff@garzik.org> wrote:
> LLVM API seems to "prefer" that one use GEP code
> and associated type information, and was written with this preference in
> mind.

Looking at the LLVM API docs that Xi referred to, the LLVM preference
is actually pretty weak.

So I'd almost say continue with the current IR, and just do the
pointer arithmetic by hand. If LLVM doesn't treat GEP *that*
specially, doing a separate pointer arithmetic level is quite possibly
going to just cause problems: simplification of combinations of GEP
and arithmetic is potentially quite annoying and has problems that
simplification of perfectly regular addition does not.

                 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] fix casts when linearizing compound assignments
  2012-06-08 20:16       ` Linus Torvalds
@ 2012-06-16  6:10         ` Xi Wang
  0 siblings, 0 replies; 12+ messages in thread
From: Xi Wang @ 2012-06-16  6:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff Garzik, Pekka Enberg, linux-sparse, Benjamin Herrenschmidt,
	Christopher Li

On Jun 8, 2012, at 4:16 PM, Linus Torvalds wrote:
> 
> Looking at the LLVM API docs that Xi referred to, the LLVM preference
> is actually pretty weak.
> 
> So I'd almost say continue with the current IR, and just do the
> pointer arithmetic by hand. If LLVM doesn't treat GEP *that*
> specially, doing a separate pointer arithmetic level is quite possibly
> going to just cause problems: simplification of combinations of GEP
> and arithmetic is potentially quite annoying and has problems that
> simplification of perfectly regular addition does not.

I tried a simple strategy to emit an offset:

1) cast the pointer to i8* (char*);

2) emit GEP with the offset computed by sparse;

3) cast the pointer back to its original type.

There is no pointer-integer conversion at all.  This seems to work
well with both current IR and LLVM's optimizer.

Here are some examples, using my experimental backend "splay".

  struct X { int a, b; };

  void szero(struct X *x)
  {
    x->a = 0;
    x->b = 0;
  }

splay (with opt -O2) emits:

  %0 = getelementptr inbounds %struct.X* %x, i64 0, i32 0
  store i32 0, i32* %0, align 4
  %1 = getelementptr inbounds %struct.X* %x, i64 0, i32 1
  store i32 0, i32* %1, align 4

LLVM's optimizer is even able to "recover" GEP x, 0, 1 for x->b,
from what splay (without opt) actually emits:

  %4 = bitcast %struct.X* %x to i32*
  %5 = bitcast i32* %4 to i8*
  %6 = getelementptr inbounds i8* %5, i64 4

Here's an example of variable offset.

  void izero(int *x, size_t n)
  {
    x[n] = 0;
  }

splay (with opt -O2) emits:

  %0 = shl nsw i64 %n, 2
  %1 = bitcast i32* %x to i8*
  %2 = getelementptr inbounds i8* %1, i64 %0
  %3 = bitcast i8* %2 to i32*
  store i32 0, i32* %3, align 4

This time the result doesn't look as perfect as Clang, which emits
GEP %x, %n, but the assembly code is the same:

  movl  $0, (%rdi,%rsi,4)

Here goes a more complex one.

  void azero(int *x, size_t n)
  {
    size_t i;

    for (i = 0; i != n; ++i)
      x[i] = 0;
  }

splay (with opt -O2) emits:

  ...
  %1 = bitcast i32* %x to i8*
  %2 = shl i64 %n, 2
  call void @llvm.memset.p0i8.i64(i8* %1, i8 0, i64 %2, i32 4, i1 false)
  ...

LLVM's optimizer recognizes the in-loop GEP and rewrites the whole
thing using memset.

- xi


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-06-16  6:10 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-08 10:47 [PATCH] fix casts when linearizing compound assignments Xi Wang
2012-06-08 15:45 ` Xi Wang
2012-06-08 19:10   ` Pekka Enberg
2012-06-08 19:18     ` Linus Torvalds
2012-06-08 19:25       ` Xi Wang
2012-06-08 19:28         ` Linus Torvalds
2012-06-08 19:47           ` Pekka Enberg
2012-06-08 19:57             ` Xi Wang
2012-06-08 20:09           ` Jeff Garzik
2012-06-08 20:08     ` Jeff Garzik
2012-06-08 20:16       ` Linus Torvalds
2012-06-16  6:10         ` Xi Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).