linux-sparse.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* sparse-llvm incorrect definition of local variables
@ 2017-03-07  0:08 Dibyendu Majumdar
  2017-03-07  6:44 ` Luc Van Oostenryck
  2017-03-13 11:16 ` Dibyendu Majumdar
  0 siblings, 2 replies; 14+ messages in thread
From: Dibyendu Majumdar @ 2017-03-07  0:08 UTC (permalink / raw)
  To: Linux-Sparse

Hi,

When I try to compile following in sparse-llvm:

int main(const char *argv[]) {
 int values[5];
 values[0] = 9;
 values[1] = 8;
 values[2] = 7;
 values[3] = 6;
 values[4] = 5;
 int *p = &values[4];
 return (*p == 5) ? 0 : 1;
}

I get:

@values = external global [5 x i32]
define i32 @main(i8**) {
L0:
  store i32 9, i32* getelementptr inbounds ([5 x i32], [5 x i32]*
@values, i32 0, i32 0)
  store i32 8, i32* bitcast (i8* getelementptr inbounds (i8, i8*
bitcast ([5 x i32]* @values to i8*), i64 4) to i32*)
  store i32 7, i32* bitcast (i8* getelementptr inbounds (i8, i8*
bitcast ([5 x i32]* @values to i8*), i64 8) to i32*)
  store i32 6, i32* bitcast (i8* getelementptr inbounds (i8, i8*
bitcast ([5 x i32]* @values to i8*), i64 12) to i32*)
  store i32 5, i32* bitcast (i8* getelementptr inbounds (i8, i8*
bitcast ([5 x i32]* @values to i8*), i64 16) to i32*)
  ret i32 0
}

Here 'values' has been defined above as external global, whereas it
should be allocated on the stack.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-07  0:08 sparse-llvm incorrect definition of local variables Dibyendu Majumdar
@ 2017-03-07  6:44 ` Luc Van Oostenryck
  2017-03-07  7:31   ` Dibyendu Majumdar
  2017-03-13 11:16 ` Dibyendu Majumdar
  1 sibling, 1 reply; 14+ messages in thread
From: Luc Van Oostenryck @ 2017-03-07  6:44 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Linux-Sparse

On Tue, Mar 07, 2017 at 12:08:22AM +0000, Dibyendu Majumdar wrote:
> Hi,
> 
> When I try to compile following in sparse-llvm:
> 
> int main(const char *argv[]) {
>  int values[5];
>  values[0] = 9;
>  ...
>  values[4] = 5;
>  int *p = &values[4];
>  return (*p == 5) ? 0 : 1;
> }
> 
> I get:
> 
> @values = external global [5 x i32]
> ...
> 
> Here 'values' has been defined above as external global, whereas it
> should be allocated on the stack.

Mmmh, I'll need to look a bit further at this one.

Thanks, for the report.
 
-- Luc

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-07  6:44 ` Luc Van Oostenryck
@ 2017-03-07  7:31   ` Dibyendu Majumdar
  2017-03-08  6:53     ` Luc Van Oostenryck
  0 siblings, 1 reply; 14+ messages in thread
From: Dibyendu Majumdar @ 2017-03-07  7:31 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linux-Sparse

Hi Luc,

On 7 March 2017 at 06:44, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> On Tue, Mar 07, 2017 at 12:08:22AM +0000, Dibyendu Majumdar wrote:
>> When I try to compile following in sparse-llvm:
>>
>> int main(const char *argv[]) {
>>  int values[5];
>>  values[0] = 9;
>>  ...
>>  values[4] = 5;
>>  int *p = &values[4];
>>  return (*p == 5) ? 0 : 1;
>> }
>>
>> I get:
>>
>> @values = external global [5 x i32]
>> ...
>>
>> Here 'values' has been defined above as external global, whereas it
>> should be allocated on the stack.
>
> Mmmh, I'll need to look a bit further at this one.
>

It appears that sparse-llvm only creates a local stack object for each
OP_PHI instruction. What should be the general approach for
identifying local stack objects in sparse linearized output?

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-07  7:31   ` Dibyendu Majumdar
@ 2017-03-08  6:53     ` Luc Van Oostenryck
  2017-03-08 23:38       ` Dibyendu Majumdar
  0 siblings, 1 reply; 14+ messages in thread
From: Luc Van Oostenryck @ 2017-03-08  6:53 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Linux-Sparse

On Tue, Mar 07, 2017 at 07:31:34AM +0000, Dibyendu Majumdar wrote:
> >> Here 'values' has been defined above as external global, whereas it
> >> should be allocated on the stack.
> >
> > Mmmh, I'll need to look a bit further at this one.
> >
> 
> It appears that sparse-llvm only creates a local stack object for each
> OP_PHI instruction. What should be the general approach for
> identifying local stack objects in sparse linearized output?

I've looked a bit at this and there is several issues at hand.
It's also not something LLVM specific.
So I can't answer to this for now. Hopefully, I'll be able to tell
you more in a few days.

-- Luc 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-08  6:53     ` Luc Van Oostenryck
@ 2017-03-08 23:38       ` Dibyendu Majumdar
  2017-03-09  1:50         ` Dibyendu Majumdar
  2017-03-09 14:42         ` Luc Van Oostenryck
  0 siblings, 2 replies; 14+ messages in thread
From: Dibyendu Majumdar @ 2017-03-08 23:38 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linux-Sparse

Hi Luc,

On 8 March 2017 at 06:53, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> On Tue, Mar 07, 2017 at 07:31:34AM +0000, Dibyendu Majumdar wrote:
>> >> Here 'values' has been defined above as external global, whereas it
>> >> should be allocated on the stack.
>> >
>> > Mmmh, I'll need to look a bit further at this one.
>> >
>>
>> It appears that sparse-llvm only creates a local stack object for each
>> OP_PHI instruction. What should be the general approach for
>> identifying local stack objects in sparse linearized output?
>
> I've looked a bit at this and there is several issues at hand.
> It's also not something LLVM specific.
> So I can't answer to this for now. Hopefully, I'll be able to tell
> you more in a few days.
>

Thank you for looking into this. It would be good to know what issues
you have discovered. I was wondering if a simple solution is to check
whether the symbol is marked  as toplevel / extern - and if not then
define it as a local symbol in LLVM. However we would need to handle
scope - i.e. if the same symbol occurs multiple times in a function in
different scopes then these need to be defined accordingly. Example:

extern void init(int *v);
int main(const char *argv[]) {
 int n = 0;
 {
  int x[5];
  init(x);
  n += x[0];
 }
 {
  int x[6];
  init(x);
  n += x[0];
 }
 return n;
}

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-08 23:38       ` Dibyendu Majumdar
@ 2017-03-09  1:50         ` Dibyendu Majumdar
  2017-03-09 14:35           ` Luc Van Oostenryck
  2017-03-09 14:42         ` Luc Van Oostenryck
  1 sibling, 1 reply; 14+ messages in thread
From: Dibyendu Majumdar @ 2017-03-09  1:50 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linux-Sparse

On 8 March 2017 at 23:38, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote:
> On 8 March 2017 at 06:53, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
>> On Tue, Mar 07, 2017 at 07:31:34AM +0000, Dibyendu Majumdar wrote:
>>> >> Here 'values' has been defined above as external global, whereas it
>>> >> should be allocated on the stack.
>>> >
>>> > Mmmh, I'll need to look a bit further at this one.
>>> >
>>>
>>> It appears that sparse-llvm only creates a local stack object for each
>>> OP_PHI instruction. What should be the general approach for
>>> identifying local stack objects in sparse linearized output?
>>
>> I've looked a bit at this and there is several issues at hand.
>> It's also not something LLVM specific.
>> So I can't answer to this for now. Hopefully, I'll be able to tell
>> you more in a few days.
>>
>
> Thank you for looking into this. It would be good to know what issues
> you have discovered. I was wondering if a simple solution is to check
> whether the symbol is marked  as toplevel / extern - and if not then
> define it as a local symbol in LLVM. However we would need to handle
> scope - i.e. if the same symbol occurs multiple times in a function in
> different scopes then these need to be defined accordingly. Example:
>
> extern void init(int *v);
> int main(const char *argv[]) {
>  int n = 0;
>  {
>   int x[5];
>   init(x);
>   n += x[0];
>  }
>  {
>   int x[6];
>   init(x);
>   n += x[0];
>  }
>  return n;
> }
>

I think by combining the symbol's identifier and address a unique
reference can be created - thus whenever we encounter a symbol that is
not extern or toplevel we should create a local object using the
symbol's name + address as the LLVM identifier.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-09  1:50         ` Dibyendu Majumdar
@ 2017-03-09 14:35           ` Luc Van Oostenryck
  2017-03-09 14:42             ` Dibyendu Majumdar
  2017-03-09 23:30             ` Dibyendu Majumdar
  0 siblings, 2 replies; 14+ messages in thread
From: Luc Van Oostenryck @ 2017-03-09 14:35 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Linux-Sparse

On Thu, Mar 09, 2017 at 01:50:16AM +0000, Dibyendu Majumdar wrote:
> 
> I think by combining the symbol's identifier and address a unique
> reference can be created - thus whenever we encounter a symbol that is
> not extern or toplevel we should create a local object using the
> symbol's name + address as the LLVM identifier.

That's more or less the idea, of course.
And you need to consider all objects that you don't need to
allocate because they are always in a (pseudo-) register.

-- Luc 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-09 14:35           ` Luc Van Oostenryck
@ 2017-03-09 14:42             ` Dibyendu Majumdar
  2017-03-09 23:30             ` Dibyendu Majumdar
  1 sibling, 0 replies; 14+ messages in thread
From: Dibyendu Majumdar @ 2017-03-09 14:42 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linux-Sparse

On 9 March 2017 at 14:35, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> On Thu, Mar 09, 2017 at 01:50:16AM +0000, Dibyendu Majumdar wrote:
>>
>> I think by combining the symbol's identifier and address a unique
>> reference can be created - thus whenever we encounter a symbol that is
>> not extern or toplevel we should create a local object using the
>> symbol's name + address as the LLVM identifier.
>
> That's more or less the idea, of course.
> And you need to consider all objects that you don't need to
> allocate because they are always in a (pseudo-) register.
>

Ok thank you. I will try to implement a fix and get back when I have
something that works or if I get stuck.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-08 23:38       ` Dibyendu Majumdar
  2017-03-09  1:50         ` Dibyendu Majumdar
@ 2017-03-09 14:42         ` Luc Van Oostenryck
  1 sibling, 0 replies; 14+ messages in thread
From: Luc Van Oostenryck @ 2017-03-09 14:42 UTC (permalink / raw)
  To: Dibyendu Majumdar; +Cc: Linux-Sparse

On Wed, Mar 08, 2017 at 11:38:22PM +0000, Dibyendu Majumdar wrote:
> Hi Luc,
> 
> On 8 March 2017 at 06:53, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> > On Tue, Mar 07, 2017 at 07:31:34AM +0000, Dibyendu Majumdar wrote:
> >> >> Here 'values' has been defined above as external global, whereas it
> >> >> should be allocated on the stack.
> >> >
> >> > Mmmh, I'll need to look a bit further at this one.
> >> >
> >>
> >> It appears that sparse-llvm only creates a local stack object for each
> >> OP_PHI instruction. What should be the general approach for
> >> identifying local stack objects in sparse linearized output?
> >
> > I've looked a bit at this and there is several issues at hand.
> > It's also not something LLVM specific.
> > So I can't answer to this for now. Hopefully, I'll be able to tell
> > you more in a few days.
> >
> 
> Thank you for looking into this. It would be good to know what issues
> you have discovered.
The issues is about which symbols are used, which ones don't need to be
allocated because all their use are via pseudos, ...
A lot of infrastructure already exist for it bit is incomplete or
wrong.

Also, I first focus to finalize the previous issues with the NULLs,
ADDs, CALLs, ...

-- Luc Van Oostenryck

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-09 14:35           ` Luc Van Oostenryck
  2017-03-09 14:42             ` Dibyendu Majumdar
@ 2017-03-09 23:30             ` Dibyendu Majumdar
  2017-03-09 23:37               ` Dibyendu Majumdar
  1 sibling, 1 reply; 14+ messages in thread
From: Dibyendu Majumdar @ 2017-03-09 23:30 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linux-Sparse

Hi Luc,

On 9 March 2017 at 14:35, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> On Thu, Mar 09, 2017 at 01:50:16AM +0000, Dibyendu Majumdar wrote:
>>
>> I think by combining the symbol's identifier and address a unique
>> reference can be created - thus whenever we encounter a symbol that is
>> not extern or toplevel we should create a local object using the
>> symbol's name + address as the LLVM identifier.
>
> That's more or less the idea, of course.
> And you need to consider all objects that you don't need to
> allocate because they are always in a (pseudo-) register.
>

This is my initial attempt at a fix. In the function pseudo_to_value()
I modified the handling of PSEUDO_SYM. Note that I needed to add a new
field 'priv' in the symbol structure (type void *) to hold the
allocated value - there is an 'aux' field already for the backend to
use but this is already used, so I wasn't sure whether there would be
a conflict if I tried to reuse that field.

 case PSEUDO_SYM: {
  struct symbol *sym = pseudo->sym;
  struct expression *expr;
  assert(sym->bb_target == NULL);
  expr = sym->initializer;
  if (expr) {
    ...
  } else {
   const char *name = show_ident(sym->ident);
   LLVMTypeRef type = symbol_type(fn->module, sym);
   if (LLVMGetTypeKind(type) == LLVMFunctionTypeKind) {
    result = LLVMGetNamedFunction(fn->module, name);
    if (!result)
     result = LLVMAddFunction(fn->module, name, type);
   } else {
    char localname[256] = { 0 };
    if (is_extern(sym) || is_toplevel(sym)) {
     result = LLVMGetNamedGlobal(fn->module, name);
     if (!result)
      result = LLVMAddGlobal(fn->module, type, name);
    }
    else {
     result = (LLVMValueRef)sym->priv;
     if (!result) {
      /* insert alloca into entry block */
      /* LLVM requires allocas to be at the start */
      LLVMBasicBlockRef entrybbr = LLVMGetEntryBasicBlock(fn->fn);
      /* Use temporary Builder as we don't want to mess the function builder */
      LLVMBuilderRef tmp_builder = LLVMCreateBuilder();
      LLVMValueRef firstins = LLVMGetFirstInstruction(entrybbr);
      if (firstins)
       LLVMPositionBuilderBefore(tmp_builder, firstins);
      else
       LLVMPositionBuilderAtEnd(tmp_builder, entrybbr);
      /* Since multiple locals may have same name but in different scopes we
         append the symbol's address to make each variable unique */
      snprintf(localname, sizeof localname, "%s_%p", name, sym);
      result = LLVMBuildAlloca(tmp_builder, type, localname);
      LLVMDisposeBuilder(tmp_builder);
      /* Store the alloca instruction for references to the value later on.
         TODO - should we convert this to pointer here? */
      sym->priv = result;
     }
    }
   }
  }
  break;
 }

Of course I hit the next issue - which is that when calling functions,
array parameters need to degenerate to pointers - I haven't resolved
this yet. Perhaps, the cached value should be converted to a pointer
immediately so that any references see the pointer rather than the
array, or there needs to be a conversion when an array is passed to a
function.

Anyway to workaround this issue. I added a cast to the simple test program:

extern void init(int *v);
int main(const char *argv[]) {

 int n = 0;
 {
  int x[5];
  init((int *)x);
  n += x[0];
 }
 {
  int x[6];
  init((int *)x);
  n += x[0];
 }
 return n;
}

With that the LLVM IR I get looks like this:

define i32 @main(i8**) {
L0:
  %x_0000020C5D05B2E8 = alloca [6 x i32]
  %x_0000020C5D05AF68 = alloca [5 x i32]
  %R2 = bitcast [5 x i32]* %x_0000020C5D05AF68 to i32*
  call void @init(i32* %R2)
  %1 = bitcast [5 x i32]* %x_0000020C5D05AF68 to i32*
  %2 = bitcast i32* %1 to i8*
  %3 = getelementptr inbounds i8, i8* %2, i64 0
  %4 = bitcast i8* %3 to i32*
  %load_target = load i32, i32* %4
  %R9 = bitcast [6 x i32]* %x_0000020C5D05B2E8 to i32*
  call void @init(i32* %R9)
  %5 = bitcast [6 x i32]* %x_0000020C5D05B2E8 to i32*
  %6 = bitcast i32* %5 to i8*
  %7 = getelementptr inbounds i8, i8* %6, i64 0
  %8 = bitcast i8* %7 to i32*
  %load_target1 = load i32, i32* %8
  %R13 = add i32 %load_target, %load_target1
  ret i32 %R13
}
declare void @init(i32*)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-09 23:30             ` Dibyendu Majumdar
@ 2017-03-09 23:37               ` Dibyendu Majumdar
  0 siblings, 0 replies; 14+ messages in thread
From: Dibyendu Majumdar @ 2017-03-09 23:37 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Linux-Sparse

On 9 March 2017 at 23:30, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote:
> On 9 March 2017 at 14:35, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
>> On Thu, Mar 09, 2017 at 01:50:16AM +0000, Dibyendu Majumdar wrote:
>>>
>>> I think by combining the symbol's identifier and address a unique
>>> reference can be created - thus whenever we encounter a symbol that is
>>> not extern or toplevel we should create a local object using the
>>> symbol's name + address as the LLVM identifier.
>>
>> That's more or less the idea, of course.
>> And you need to consider all objects that you don't need to
>> allocate because they are always in a (pseudo-) register.
>>
>
> This is my initial attempt at a fix. In the function pseudo_to_value()
> I modified the handling of PSEUDO_SYM. Note that I needed to add a new
> field 'priv' in the symbol structure (type void *) to hold the
> allocated value - there is an 'aux' field already for the backend to
> use but this is already used, so I wasn't sure whether there would be
> a conflict if I tried to reuse that field.
>

Also works with structs:

struct mytype {
 int a, b;
 double c;
};
extern void init(struct mytype *v);
int main(const char *argv[]) {

 int n = 0;
 {
  struct mytype x[5];
  init((struct mytype *)x);
  n += x[0].b;
 }
 {
  struct mytype x[6];
  init((struct mytype *)x);
  n += x[0].a;
 }
 return n;
}

%struct.mytype = type { i32, i32, double }
define i32 @main(i8**) {
L0:
  %x_000001A677EE7788 = alloca [6 x %struct.mytype]
  %x_000001A677EE7408 = alloca [5 x %struct.mytype]
  %R2 = bitcast [5 x %struct.mytype]* %x_000001A677EE7408 to %struct.mytype*
  call void @init(%struct.mytype* %R2)
  %1 = bitcast [5 x %struct.mytype]* %x_000001A677EE7408 to i32*
  %2 = bitcast i32* %1 to i8*
  %3 = getelementptr inbounds i8, i8* %2, i64 4
  %4 = bitcast i8* %3 to i32*
  %load_target = load i32, i32* %4
  %R9 = bitcast [6 x %struct.mytype]* %x_000001A677EE7788 to %struct.mytype*
  call void @init(%struct.mytype* %R9)
  %5 = bitcast [6 x %struct.mytype]* %x_000001A677EE7788 to i32*
  %6 = bitcast i32* %5 to i8*
  %7 = getelementptr inbounds i8, i8* %6, i64 0
  %8 = bitcast i8* %7 to i32*
  %load_target1 = load i32, i32* %8
  %R13 = add i32 %load_target, %load_target1
  ret i32 %R13
}
declare void @init(%struct.mytype*)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-07  0:08 sparse-llvm incorrect definition of local variables Dibyendu Majumdar
  2017-03-07  6:44 ` Luc Van Oostenryck
@ 2017-03-13 11:16 ` Dibyendu Majumdar
  2017-03-14 15:54   ` Dibyendu Majumdar
  1 sibling, 1 reply; 14+ messages in thread
From: Dibyendu Majumdar @ 2017-03-13 11:16 UTC (permalink / raw)
  To: Linux-Sparse

On 7 March 2017 at 00:08, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote:
> int main(const char *argv[]) {
>  int values[5];
>  values[0] = 9;
>  values[1] = 8;
>  values[2] = 7;
>  values[3] = 6;
>  values[4] = 5;
>  int *p = &values[4];
>  return (*p == 5) ? 0 : 1;
> }
>

The latest version of the changes I made are given below. This appears
to help also with simple assignments, and fixes another issue I
reported when accessing the data from a string.

Changes to symbol:

I added a field in the symbol structure for the backend to save the
LLVM instruction that creates / allocates the symbol.

struct symbol {
 ...
 void *priv;
};

The remaining changes are in sparse-llvm.c. (Apologies for showing
extracts here from my modified codebase but this should translate
easily to sparse).

Firstly in pseudo_to_value() the changes are to do with handling of PSEUDO_SYM:

 case PSEUDO_SYM: {
  struct symbol *sym = pseudo->sym;
  struct expression *expr;
  result = (LLVMValueRef)sym->priv;
  if (result)
   return result;
  assert(sym->bb_target == NULL);
  expr = sym->initializer;
  if (expr) {
   switch (expr->type) {
   case EXPR_STRING: {
    const char *s = expr->string->data;
    LLVMValueRef indices[] = { LLVMConstInt(LLVMInt64Type(), 0, 0),
LLVMConstInt(LLVMInt64Type(), 0, 0) };
    LLVMValueRef data;
    data = LLVMAddGlobal(fn->module, LLVMArrayType(LLVMInt8Type(),
strlen(s) + 1), ".str");
    LLVMSetLinkage(data, LLVMPrivateLinkage);
    LLVMSetGlobalConstant(data, 1);
    char *scopy = allocator_allocate(&C->byte_allocator, strlen(s) + 1);
    strcpy(scopy, s);
    LLVMSetInitializer(data, LLVMConstString(scopy, strlen(scopy) + 1, true));
    result = LLVMConstGEP(data, indices, ARRAY_SIZE(indices));
    sym->priv = result;
    break;
   }
   case EXPR_SYMBOL: {
    assert(0 && "unresolved symbol reference");
    struct symbol *sym = expr->symbol;
    result = LLVMGetNamedGlobal(fn->module, show_ident(C, sym->ident));
    assert(result != NULL);
    break;
   }
   case EXPR_VALUE:
    result = LLVMConstInt(symbol_type(C, fn->module, sym), expr->value, 1);
    sym->priv = result;
    break;
   case EXPR_FVALUE:
    result = LLVMConstReal(symbol_type(C, fn->module, sym), expr->fvalue);
    sym->priv = result;
    break;
   default:
    assert(0);
   }
  }
  else {
   const char *name = show_ident(C, sym->ident);
   LLVMTypeRef type = symbol_type(C, fn->module, sym);
   if (LLVMGetTypeKind(type) == LLVMFunctionTypeKind) {
    result = LLVMGetNamedFunction(fn->module, name);
    if (!result)
     result = LLVMAddFunction(fn->module, name, type);
    sym->priv = result;
   }
   else if (is_extern(sym) || is_toplevel(sym)) {
    result = LLVMGetNamedGlobal(fn->module, name);
    if (!result)
     result = LLVMAddGlobal(fn->module, type, name);
    sym->priv = result;
   }
   else {
    char localname[256] = { 0 };
    /* insert alloca into entry block */
    /* LLVM requires allocas to be at the start */
    LLVMBasicBlockRef entrybbr = LLVMGetEntryBasicBlock(fn->fn);
    /* Use temporary Builder as we don't want to mess the function builder */
    LLVMBuilderRef tmp_builder = LLVMCreateBuilder();
    LLVMValueRef firstins = LLVMGetFirstInstruction(entrybbr);
    if (firstins)
     LLVMPositionBuilderBefore(tmp_builder, firstins);
    else
     LLVMPositionBuilderAtEnd(tmp_builder, entrybbr);
    /* Since multiple locals may have same name but in different scopes we
       append the symbol's address to make each variable unique */
    snprintf(localname, sizeof localname, "%s_%p", name, sym);
    result = LLVMBuildAlloca(tmp_builder, type, localname);
    LLVMDisposeBuilder(tmp_builder);
    /* Store the alloca instruction for references to the value later on.
       TODO - should we convert this to pointer here? */
    sym->priv = result;
   }
  }
  break;
 }

Secondly in output_data() I added support for floating types and also
ensured that the sym->priv is set.

static LLVMValueRef output_data(struct dmr_C *C, LLVMModuleRef module,
struct symbol *sym)
{
 struct expression *initializer = sym->initializer;
 LLVMValueRef initial_value;
 LLVMValueRef data;
 const char *name;
 if (initializer) {
  switch (initializer->type) {
  case EXPR_VALUE:
   initial_value = LLVMConstInt(symbol_type(C, module, sym),
initializer->value, 1);
   break;
  case EXPR_FVALUE:
   initial_value = LLVMConstReal(symbol_type(C, module, sym),
initializer->fvalue);
   break;
  case EXPR_SYMBOL: {
   struct symbol *sym = initializer->symbol;
   initial_value = LLVMGetNamedGlobal(module, show_ident(C, sym->ident));
   if (!initial_value)
    initial_value = output_data(C, module, sym);
   break;
  }
  case EXPR_STRING: {
   const char *s = initializer->string->data;
   char *scopy = allocator_allocate(&C->byte_allocator, strlen(s) + 1);
   strcpy(scopy, s);
   initial_value = LLVMConstString(scopy, strlen(scopy) + 1, true);
   break;
  }
  default:
   assert(0);
  }
 } else {
  LLVMTypeRef type = symbol_type(C, module, sym);
  initial_value = LLVMConstNull(type);
 }
 name = show_ident(C, sym->ident);
 data = LLVMAddGlobal(module, LLVMTypeOf(initial_value), name);
 LLVMSetLinkage(data, data_linkage(C, sym));
 if (sym->ctype.modifiers & MOD_CONST)
  LLVMSetGlobalConstant(data, 1);
 if (sym->ctype.modifiers & MOD_TLS)
  LLVMSetThreadLocal(data, 1);
 if (sym->ctype.alignment)
  LLVMSetAlignment(data, sym->ctype.alignment);
 if (!(sym->ctype.modifiers & MOD_EXTERN))
  LLVMSetInitializer(data, initial_value);
 sym->priv = data;
 return data;
}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-13 11:16 ` Dibyendu Majumdar
@ 2017-03-14 15:54   ` Dibyendu Majumdar
  2017-03-17 10:48     ` Dibyendu Majumdar
  0 siblings, 1 reply; 14+ messages in thread
From: Dibyendu Majumdar @ 2017-03-14 15:54 UTC (permalink / raw)
  To: Linux-Sparse, Luc Van Oostenryck, Christopher Li

On 13 March 2017 at 11:16, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote:
> On 7 March 2017 at 00:08, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote:
>> int main(const char *argv[]) {
>>  int values[5];
>>  values[0] = 9;
>>  values[1] = 8;
>>  values[2] = 7;
>>  values[3] = 6;
>>  values[4] = 5;
>>  int *p = &values[4];
>>  return (*p == 5) ? 0 : 1;
>> }
>>
>
> The latest version of the changes I made are given below. This appears
> to help also with simple assignments, and fixes another issue I
> reported when accessing the data from a string.
>
> Changes to symbol:
>
> I added a field in the symbol structure for the backend to save the
> LLVM instruction that creates / allocates the symbol.
>
> struct symbol {
>  ...
>  void *priv;
> };
>
> The remaining changes are in sparse-llvm.c. (Apologies for showing
> extracts here from my modified codebase but this should translate
> easily to sparse).
>
> Firstly in pseudo_to_value() the changes are to do with handling of PSEUDO_SYM:
>
>  case PSEUDO_SYM: {
>   struct symbol *sym = pseudo->sym;
>   struct expression *expr;
>   result = (LLVMValueRef)sym->priv;
>   if (result)
>    return result;
>   assert(sym->bb_target == NULL);
>   expr = sym->initializer;
>   if (expr) {
>    switch (expr->type) {
>    case EXPR_STRING: {
>     const char *s = expr->string->data;
>     LLVMValueRef indices[] = { LLVMConstInt(LLVMInt64Type(), 0, 0),
> LLVMConstInt(LLVMInt64Type(), 0, 0) };
>     LLVMValueRef data;
>     data = LLVMAddGlobal(fn->module, LLVMArrayType(LLVMInt8Type(),
> strlen(s) + 1), ".str");
>     LLVMSetLinkage(data, LLVMPrivateLinkage);
>     LLVMSetGlobalConstant(data, 1);
>     char *scopy = allocator_allocate(&C->byte_allocator, strlen(s) + 1);
>     strcpy(scopy, s);
>     LLVMSetInitializer(data, LLVMConstString(scopy, strlen(scopy) + 1, true));
>     result = LLVMConstGEP(data, indices, ARRAY_SIZE(indices));
>     sym->priv = result;
>     break;
>    }
>    case EXPR_SYMBOL: {
>     assert(0 && "unresolved symbol reference");
>     struct symbol *sym = expr->symbol;
>     result = LLVMGetNamedGlobal(fn->module, show_ident(C, sym->ident));
>     assert(result != NULL);
>     break;
>    }
>    case EXPR_VALUE:
>     result = LLVMConstInt(symbol_type(C, fn->module, sym), expr->value, 1);
>     sym->priv = result;
>     break;
>    case EXPR_FVALUE:
>     result = LLVMConstReal(symbol_type(C, fn->module, sym), expr->fvalue);
>     sym->priv = result;
>     break;
>    default:
>     assert(0);
>    }
>   }
>   else {
>    const char *name = show_ident(C, sym->ident);
>    LLVMTypeRef type = symbol_type(C, fn->module, sym);
>    if (LLVMGetTypeKind(type) == LLVMFunctionTypeKind) {
>     result = LLVMGetNamedFunction(fn->module, name);
>     if (!result)
>      result = LLVMAddFunction(fn->module, name, type);
>     sym->priv = result;
>    }
>    else if (is_extern(sym) || is_toplevel(sym)) {
>     result = LLVMGetNamedGlobal(fn->module, name);
>     if (!result)
>      result = LLVMAddGlobal(fn->module, type, name);
>     sym->priv = result;
>    }
>    else {
>     char localname[256] = { 0 };
>     /* insert alloca into entry block */
>     /* LLVM requires allocas to be at the start */
>     LLVMBasicBlockRef entrybbr = LLVMGetEntryBasicBlock(fn->fn);
>     /* Use temporary Builder as we don't want to mess the function builder */
>     LLVMBuilderRef tmp_builder = LLVMCreateBuilder();
>     LLVMValueRef firstins = LLVMGetFirstInstruction(entrybbr);
>     if (firstins)
>      LLVMPositionBuilderBefore(tmp_builder, firstins);
>     else
>      LLVMPositionBuilderAtEnd(tmp_builder, entrybbr);
>     /* Since multiple locals may have same name but in different scopes we
>        append the symbol's address to make each variable unique */
>     snprintf(localname, sizeof localname, "%s_%p", name, sym);
>     result = LLVMBuildAlloca(tmp_builder, type, localname);
>     LLVMDisposeBuilder(tmp_builder);
>     /* Store the alloca instruction for references to the value later on.
>        TODO - should we convert this to pointer here? */
>     sym->priv = result;
>    }
>   }
>   break;
>  }
>
> Secondly in output_data() I added support for floating types and also
> ensured that the sym->priv is set.
>
> static LLVMValueRef output_data(struct dmr_C *C, LLVMModuleRef module,
> struct symbol *sym)
> {
>  struct expression *initializer = sym->initializer;
>  LLVMValueRef initial_value;
>  LLVMValueRef data;
>  const char *name;
>  if (initializer) {
>   switch (initializer->type) {
>   case EXPR_VALUE:
>    initial_value = LLVMConstInt(symbol_type(C, module, sym),
> initializer->value, 1);
>    break;
>   case EXPR_FVALUE:
>    initial_value = LLVMConstReal(symbol_type(C, module, sym),
> initializer->fvalue);
>    break;
>   case EXPR_SYMBOL: {
>    struct symbol *sym = initializer->symbol;
>    initial_value = LLVMGetNamedGlobal(module, show_ident(C, sym->ident));
>    if (!initial_value)
>     initial_value = output_data(C, module, sym);
>    break;
>   }
>   case EXPR_STRING: {
>    const char *s = initializer->string->data;
>    char *scopy = allocator_allocate(&C->byte_allocator, strlen(s) + 1);
>    strcpy(scopy, s);
>    initial_value = LLVMConstString(scopy, strlen(scopy) + 1, true);
>    break;
>   }
>   default:
>    assert(0);
>   }
>  } else {
>   LLVMTypeRef type = symbol_type(C, module, sym);
>   initial_value = LLVMConstNull(type);
>  }
>  name = show_ident(C, sym->ident);
>  data = LLVMAddGlobal(module, LLVMTypeOf(initial_value), name);
>  LLVMSetLinkage(data, data_linkage(C, sym));
>  if (sym->ctype.modifiers & MOD_CONST)
>   LLVMSetGlobalConstant(data, 1);
>  if (sym->ctype.modifiers & MOD_TLS)
>   LLVMSetThreadLocal(data, 1);
>  if (sym->ctype.alignment)
>   LLVMSetAlignment(data, sym->ctype.alignment);
>  if (!(sym->ctype.modifiers & MOD_EXTERN))
>   LLVMSetInitializer(data, initial_value);
>  sym->priv = data;
>  return data;
> }

It turns out that the handling of EXPR_VALUE and EXPR_FVALUE doesn't
work if the address is taken of the local variable, so I had to change
this to allocate memory on the stack.

Regards
Dibyendu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: sparse-llvm incorrect definition of local variables
  2017-03-14 15:54   ` Dibyendu Majumdar
@ 2017-03-17 10:48     ` Dibyendu Majumdar
  0 siblings, 0 replies; 14+ messages in thread
From: Dibyendu Majumdar @ 2017-03-17 10:48 UTC (permalink / raw)
  To: Linux-Sparse, Luc Van Oostenryck, Christopher Li

Hi

I found additional issues with the handling of variables, in
particular if a variable is declared static inside a function.
Following is an updated set of changes that improve the situation.
These are against my modified code base but should translate easily to
sparse-llvm.

static LLVMValueRef build_local(struct dmr_C *C, struct function *fn,
struct symbol *sym)
{
 const char *name = show_ident(C, sym->ident);
 LLVMTypeRef type = symbol_type(C, fn->module, sym);
 LLVMValueRef result;
 char localname[256] = { 0 };
 snprintf(localname, sizeof localname, "%s_%p", name, sym);
 if (is_static(sym) || is_extern(sym) || is_toplevel(sym)) {
  result = LLVMGetNamedGlobal(fn->module, localname);
  if (!result) {
   result = LLVMAddGlobal(fn->module, type, localname);
   if (!is_extern(sym))
    LLVMSetLinkage(result, LLVMInternalLinkage);
   else
    LLVMSetLinkage(result, LLVMExternalLinkage);
  }
 }
 else {
  /* insert alloca into entry block */
  /* LLVM requires allocas to be at the start */
  LLVMBasicBlockRef entrybbr = LLVMGetEntryBasicBlock(fn->fn);
  /* Use temporary Builder as we don't want to mess the function builder */
  LLVMBuilderRef tmp_builder = LLVMCreateBuilder();
  LLVMValueRef firstins = LLVMGetFirstInstruction(entrybbr);
  if (firstins)
   LLVMPositionBuilderBefore(tmp_builder, firstins);
  else
   LLVMPositionBuilderAtEnd(tmp_builder, entrybbr);
  /* Since multiple locals may have same name but in different scopes we
  append the symbol's address to make each variable unique */
  result = LLVMBuildAlloca(tmp_builder, type, localname);
  LLVMDisposeBuilder(tmp_builder);
 }
 sym->priv = result;
 return result;
}

static LLVMValueRef pseudo_to_value(struct dmr_C *C, struct function
*fn, struct instruction *insn, pseudo_t pseudo)
{
 LLVMValueRef result = NULL;
...
 case PSEUDO_SYM: {
  struct symbol *sym = pseudo->sym;
  struct expression *expr;
  result = (LLVMValueRef)sym->priv;
  if (result)
   return result;
  assert(sym->bb_target == NULL);
  expr = sym->initializer;
  if (expr) {
   switch (expr->type) {
   case EXPR_STRING: {
    const char *s = expr->string->data;
    LLVMValueRef indices[] = { LLVMConstInt(LLVMInt64Type(), 0, 0),
LLVMConstInt(LLVMInt64Type(), 0, 0) };
    LLVMValueRef data;
    data = LLVMAddGlobal(fn->module, LLVMArrayType(LLVMInt8Type(),
strlen(s) + 1), ".str");
    LLVMSetLinkage(data, LLVMPrivateLinkage);
    LLVMSetGlobalConstant(data, 1);
    char *scopy = allocator_allocate(&C->byte_allocator, strlen(s) + 1);
    strcpy(scopy, s);
    LLVMSetInitializer(data, LLVMConstString(scopy, strlen(scopy) + 1, true));
    result = LLVMConstGEP(data, indices, ARRAY_SIZE(indices));
    sym->priv = result;
    break;
   }
   case EXPR_SYMBOL: {
    expression_error(C, expr, "unresolved symbol reference in initializer\n");
    show_expression(C, expr);
    exit(1);
    break;
   }
   case EXPR_VALUE:
    result = build_local(C, fn, sym);
    if (is_static(sym))
     LLVMSetInitializer(result, LLVMConstInt(symbol_type(C,
fn->module, sym), expr->value, 1));
    else
     LLVMBuildStore(fn->builder, LLVMConstInt(symbol_type(C,
fn->module, sym), expr->value, 1), result);
    sym->priv = result;
    break;
   case EXPR_FVALUE:
    result = build_local(C, fn, sym);
    if (is_static(sym))
     LLVMSetInitializer(result, LLVMConstReal(symbol_type(C,
fn->module, sym), expr->fvalue));
    else
     LLVMBuildStore(fn->builder, LLVMConstReal(symbol_type(C,
fn->module, sym), expr->fvalue), result);
    sym->priv = result;
    break;
   default:
    expression_error(C, expr, "unsupported expr type in initializer:
%d\n", expr->type);
    show_expression(C, expr);
    exit(1);
   }
  }
  else {
   const char *name = show_ident(C, sym->ident);
   LLVMTypeRef type = symbol_type(C, fn->module, sym);
   if (LLVMGetTypeKind(type) == LLVMFunctionTypeKind) {
    result = LLVMGetNamedFunction(fn->module, name);
    if (!result)
     result = LLVMAddFunction(fn->module, name, type);
    sym->priv = result;
   }
   else if (is_extern(sym) || is_toplevel(sym)) {
    result = LLVMGetNamedGlobal(fn->module, name);
    if (!result) {
     result = LLVMAddGlobal(fn->module, type, name);
     if (is_extern(sym))
      LLVMSetLinkage(result, LLVMExternalLinkage);
    }
    sym->priv = result;
   }
   else {
    result = build_local(C, fn, sym);
    if (is_static(sym)) {
     LLVMSetInitializer(result, LLVMConstNull(type));
    }
    sym->priv = result;
   }
  }
  break;
 }
 case PSEUDO_VAL: {
...
 return result;
}

static LLVMValueRef output_data(struct dmr_C *C, LLVMModuleRef module,
struct symbol *sym)
{
 struct expression *initializer = sym->initializer;
 LLVMValueRef initial_value = NULL;
 LLVMValueRef data = NULL;
 const char *name;
 if (initializer) {
  switch (initializer->type) {
  case EXPR_VALUE:
   initial_value = LLVMConstInt(symbol_type(C, module, sym),
initializer->value, 1);
   break;
  case EXPR_FVALUE:
   initial_value = LLVMConstReal(symbol_type(C, module, sym),
initializer->fvalue);
   break;
  case EXPR_SYMBOL: {
   struct symbol *sym = initializer->symbol;
   if (sym->ident)
    initial_value = LLVMGetNamedGlobal(module, show_ident(C, sym->ident));
   if (!initial_value)
    initial_value = output_data(C, module, sym);
   break;
  }
  case EXPR_STRING: {
   const char *s = initializer->string->data;
   char *scopy = allocator_allocate(&C->byte_allocator, strlen(s) + 1);
   strcpy(scopy, s);
   initial_value = LLVMConstString(scopy, strlen(scopy) + 1, true);
   break;
  }
  default:
   expression_error(C, initializer, "unsupported expr type in global
data initializer: %d\n", initializer->type);
   show_expression(C, initializer);
  }
  if (!initial_value)
   return NULL;
 } else {
  LLVMTypeRef type = symbol_type(C, module, sym);
  initial_value = LLVMConstNull(type);
 }
 name = show_ident(C, sym->ident);
 if (!sym->priv) {
  if (sym->ident)
   data = LLVMGetNamedGlobal(module, name);
  if (!data)
   data = LLVMAddGlobal(module, LLVMTypeOf(initial_value), name);
  LLVMSetLinkage(data, data_linkage(C, sym));
  if (sym->ctype.modifiers & MOD_CONST)
   LLVMSetGlobalConstant(data, 1);
  if (sym->ctype.modifiers & MOD_TLS)
   LLVMSetThreadLocal(data, 1);
  if (sym->ctype.alignment)
   LLVMSetAlignment(data, sym->ctype.alignment);
  if (!(sym->ctype.modifiers & MOD_EXTERN))
   LLVMSetInitializer(data, initial_value);
  sym->priv = data;
 }
 else {
  data = sym->priv;
  if (!(sym->ctype.modifiers & MOD_EXTERN))
   LLVMSetInitializer(data, initial_value);
 }
 return data;
}

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-03-17 10:48 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-07  0:08 sparse-llvm incorrect definition of local variables Dibyendu Majumdar
2017-03-07  6:44 ` Luc Van Oostenryck
2017-03-07  7:31   ` Dibyendu Majumdar
2017-03-08  6:53     ` Luc Van Oostenryck
2017-03-08 23:38       ` Dibyendu Majumdar
2017-03-09  1:50         ` Dibyendu Majumdar
2017-03-09 14:35           ` Luc Van Oostenryck
2017-03-09 14:42             ` Dibyendu Majumdar
2017-03-09 23:30             ` Dibyendu Majumdar
2017-03-09 23:37               ` Dibyendu Majumdar
2017-03-09 14:42         ` Luc Van Oostenryck
2017-03-13 11:16 ` Dibyendu Majumdar
2017-03-14 15:54   ` Dibyendu Majumdar
2017-03-17 10:48     ` Dibyendu Majumdar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).