* Some thoughts on the SSA debate @ 2017-09-08 10:28 Dibyendu Majumdar 2017-09-09 2:06 ` Luc Van Oostenryck 0 siblings, 1 reply; 8+ messages in thread From: Dibyendu Majumdar @ 2017-09-08 10:28 UTC (permalink / raw) To: Linux-Sparse Hi, Having tried out the newssa approach for a bit, and after working on another backend that doesn't use SSA phi nodes, I would like to share some of my personal thoughts with regards to the changes: 1. Firstly I think it is important to retain the existing "pre-phi" linear code. This is fine for backends such as LLVM which do a good job of transforming code anyway. It is also usually less buggy and serves as a baseline for correctness. So after careful consideration I would vote against adding phi instructions in the initial phase. 2. It also appears to me that the original phi instructions, while not textbook correct in terms of placement, are internally consistent, in the sense that the only issue I have found so far is the inability to handle variables that are initialized in a branch of the code. Perhaps this is just a bug? The other issue with the original implementation appears to be related to the patches that attempted to kill things - but I get the feeling that the original design did not envision removing unused things physically, but just marked them logically dead? 3. It also appears that the SSA construction approach by Matthias Braun et al is inadequate in the sense that it is not a full solution and requires additional hacks. This is unfortunate as I understood the aim was to avoid hacks and ad-hoc changes and go with a more published algorithm. In view of this, perhaps the current experiment has failed and the effort should be directed to building the industry standard approach? 4. Final thought is that although I was previously suggesting changes should go into master, I think for such a large piece of fundamental work maybe the changes should be done on a dedicated branch and only merged when completely tested and validated. I hesitated putting forth my thoughts here as I still have the freedom in my project to accept / reject changes happening in Sparse, and so I am able to take a different approach. Also, I am not really qualified to judge the full implications of the new SSA construction approach - hence I am going by my impressions after testing and using the approach, and observing the issues that have been discussed so far on the list. Regards Dibyendu ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Some thoughts on the SSA debate 2017-09-08 10:28 Some thoughts on the SSA debate Dibyendu Majumdar @ 2017-09-09 2:06 ` Luc Van Oostenryck 2017-09-12 10:06 ` Dibyendu Majumdar 0 siblings, 1 reply; 8+ messages in thread From: Luc Van Oostenryck @ 2017-09-09 2:06 UTC (permalink / raw) To: Dibyendu Majumdar; +Cc: Linux-Sparse On Fri, Sep 08, 2017 at 11:28:57AM +0100, Dibyendu Majumdar wrote: > Hi, > > Having tried out the newssa approach for a bit, and after working on > another backend that doesn't use SSA phi nodes, I would like to share > some of my personal thoughts with regards to the changes: > > 1. Firstly I think it is important to retain the existing "pre-phi" > linear code. This is fine for backends such as LLVM which do a good > job of transforming code anyway. It is also usually less buggy and > serves as a baseline for correctness. So after careful consideration I > would vote against adding phi instructions in the initial phase. That's forgetting that even with the "old/current" method create phi-nodes during linearization: - for the return mechanism - for some conditionals This also doesn't really take in account the fact that the SSA form is pretty fundamental to sparse (like it's most modern compilers). If you have a backend that doesn't understand phi-nodes, it means that this backend doesn't know and doesn't care about SSA and in this case you have to call unssa() to put back the IR in 'normal' form. > 2. It also appears to me that the original phi instructions, while not > textbook correct in terms of placement, are internally consistent, No. For *some* purposes/aspect, they are somehow consistent and the current code can do something relatively sensical with them (and often it seems to be more because of simple coincidence). But this doesn't change the fact that a lof ot things can't possibly be done correctly with the actual phi-nodes. I already explain what are the problems and here it is again: - broken because the phi-nodes are put not at the meet point but at a place where the value is needed, where there was a load. this place *may* be in the right BB but often it's in a 'later' BB. Real problems then arise when: * changing some BB edges (like removing a dead parent, BB merge, ...) * some undefined values are (conceptually) present (and the current code ignore them). - broken because when a phi-node is created it somehow inherit the phi-sources of any 'parent' phi-nodes when there are some. This is even more broken than the previous situation. You can't possibly do anything sensical with this. > 3. It also appears that the SSA construction approach by Matthias > Braun et al is inadequate in the sense that it is not a full solution > and requires additional hacks. This is a misrepresentation of the situation. And what are those 'hacks' are you talking about? There are 2 things with their method: 1) It only converts 'purely local variables', the ones that can't possibly be aliased to anything. It can be adapted to also convert more variables but this would, of course, requires doing some sort of alias analysis before the conversion and thus before the linearization which is absurd. The whole advantage of this method is to be able to do a quick conversion of the most essential variables and this with a very low cost. This can be very useful when: - you're in a lightweight environment and can't afford heavier methods (think JIT for example) - as a preliminary step for doing alias analysis with code already in SSA form or to be able to do some fast optimization/simplification before doing alais analysis (and then convert the others variables, when possible). In case, it is not clear, the current method do this in two steps too: - first in simplify_symbol_usage() - later in simplify_memops()/simplify_loads() but for sure the first step convert more variables than the SSSA method (and is not only the SSA conversion is broken, but the detection of aliasing is also wrong (but I do't know yet if it is just a small bug or if it is more fundamental). 2) The article doesn't explain how to deal with gotos. The article also doesn't explain how to deal with ifs, do-whiles, switches, ... The article is just that, an article presenting a method. It's not a step-by-step guide on how to write a compiler which will explain all the practical difficulties you will have. > 4. Final thought is that although I was previously suggesting changes > should go into master, I think for such a large piece of fundamental > work maybe the changes should be done on a dedicated branch and only > merged when completely tested and validated. Isn't what is done? Is the code for SSSA already upstreamed? -- Luc Van Oostenryck ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Some thoughts on the SSA debate 2017-09-09 2:06 ` Luc Van Oostenryck @ 2017-09-12 10:06 ` Dibyendu Majumdar 2017-09-12 10:48 ` Luc Van Oostenryck 0 siblings, 1 reply; 8+ messages in thread From: Dibyendu Majumdar @ 2017-09-12 10:06 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Linux-Sparse H Luc, On 9 September 2017 at 03:06, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > On Fri, Sep 08, 2017 at 11:28:57AM +0100, Dibyendu Majumdar wrote: >> >> 1. Firstly I think it is important to retain the existing "pre-phi" >> linear code. This is fine for backends such as LLVM which do a good >> job of transforming code anyway. It is also usually less buggy and >> serves as a baseline for correctness. So after careful consideration I >> would vote against adding phi instructions in the initial phase. > > That's forgetting that even with the "old/current" method create > phi-nodes during linearization: > - for the return mechanism > - for some conditionals > > This also doesn't really take in account the fact that the SSA form > is pretty fundamental to sparse (like it's most modern compilers). > I do not agree that the phi nodes "must" be present in the initial linear output. In fact if you look at clang output, initial linear output has allocas, and loads/stores. Later on these get converted to phi nodes. As long as the SSA property is maintained, which it is for Sparse, then this is fine in my view. I have some practical reasons for it: a) In the JIT world, we often want fast compilation. I already disable all subsequent phases, and generate code from the initial linear output. b) Introducing phi nodes and then taking them out doesn't help because it creates extra steps. I haven't yet tried the unssa option yet - I may have a look at this at some point. c) As both LLVM backend and my new backend use load/stores and allocas anyway, the presence of phi nodes just makes the output more convoluted as I showed earlier. d) You are right that some phi nodes are output already in the initial linear output. I would have preferred if this wasn't the case of course. Regards Dibyendu ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Some thoughts on the SSA debate 2017-09-12 10:06 ` Dibyendu Majumdar @ 2017-09-12 10:48 ` Luc Van Oostenryck 2017-09-12 11:47 ` Dibyendu Majumdar 0 siblings, 1 reply; 8+ messages in thread From: Luc Van Oostenryck @ 2017-09-12 10:48 UTC (permalink / raw) To: Dibyendu Majumdar; +Cc: Linux-Sparse On Tue, Sep 12, 2017 at 12:06 PM, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote: > H Luc, > > On 9 September 2017 at 03:06, Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: >> On Fri, Sep 08, 2017 at 11:28:57AM +0100, Dibyendu Majumdar wrote: >>> >>> 1. Firstly I think it is important to retain the existing "pre-phi" >>> linear code. This is fine for backends such as LLVM which do a good >>> job of transforming code anyway. It is also usually less buggy and >>> serves as a baseline for correctness. So after careful consideration I >>> would vote against adding phi instructions in the initial phase. >> >> That's forgetting that even with the "old/current" method create >> phi-nodes during linearization: >> - for the return mechanism >> - for some conditionals >> >> This also doesn't really take in account the fact that the SSA form >> is pretty fundamental to sparse (like it's most modern compilers). >> > > I do not agree that the phi nodes "must" be present in the initial > linear output. I fail to see where someone said they must. It's a fact though, that currently, during linearization, sparse already create phi-nodes for return values and for some conditional expressions. > I have some practical reasons for it: > > a) In the JIT world, we often want fast compilation. I already disable > all subsequent phases, and generate code from the initial linear > output. Amusingly, the Simple SSA method was (partially) created for such uses. I think you can easily understand why. Anyway, I'm more than a bit tired of this 'debate', mainly based by half-truths, assumptions and misunderstandings. As I have already stated twice, I've put the code for SSSA on hold, focusing instead on the *correct* conversion of the *others variables* (since SSSA is only concerned by local variables which can never be aliased to anything, depending on your code that can be most variables or only a small number). When I'll have something I'm happy with, I'll evaluate (read: "based on numbers") if there is any value left to the SSSA method or not. If I see there is a significant value for some uses, I'll submit it and we can have this 'debate'. *If* the code reaches upstream and bothers you too much, as you had already asked for, you will be able to disable it by simply doing so that the call to 'is_promotable()' always returns false and you will have all your nice stores & loads. Meanwhile, the current code is broken and the existing code with SSSA *can* be used to test, experiment, evaluate, compare, ... -- Luc ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Some thoughts on the SSA debate 2017-09-12 10:48 ` Luc Van Oostenryck @ 2017-09-12 11:47 ` Dibyendu Majumdar 2017-09-12 13:06 ` Luc Van Oostenryck 0 siblings, 1 reply; 8+ messages in thread From: Dibyendu Majumdar @ 2017-09-12 11:47 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Linux-Sparse Hi Luc, On 12 September 2017 at 11:48, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: >> I have some practical reasons for it: >> >> a) In the JIT world, we often want fast compilation. I already disable >> all subsequent phases, and generate code from the initial linear >> output. > > Amusingly, the Simple SSA method was (partially) created for such uses. > I think you can easily understand why. Yes, I think the SSA form is important for simplifications. And for some back-ends such as NanoJIT which do not have optimizations that LLVM has, the Sparse layer can act as the optimizer. But the issue right now is that the simplifications are not always correct even with the new SSA so that I have to disable everything anyway. And having the phi nodes then actually complicates things. > > > Anyway, I'm more than a bit tired of this 'debate', mainly based by half-truths, > assumptions and misunderstandings. Given that there could be major changes in the output (at least for the initial one) then it is good to make sure that there is a proper consideration of pros and cons. I am not arguing against the new implementation at all - just saying it should be a later phase. I also accept your point that currently there are issues with the SSA construction, and accept that changes are necessary. > > As I have already stated twice, I've put the code for SSSA on hold, focusing > instead on the *correct* conversion of the *others variables* (since SSSA is > only concerned by local variables which can never be aliased to anything, > depending on your code that can be most variables or only a small number). > > When I'll have something I'm happy with, I'll evaluate (read: "based > on numbers") > if there is any value left to the SSSA method or not. If I see there > is a significant > value for some uses, I'll submit it and we can have this 'debate'. > *If* the code reaches upstream and bothers you too much, as you had > already asked > for, you will be able to disable it by simply doing so that the call > to 'is_promotable()' > always returns false and you will have all your nice stores & loads. > > Meanwhile, the current code is broken and the existing code with SSSA > *can* be used to test, experiment, evaluate, compare, ... > Okay that is exactly what I am giving you - feedback based on using the new implementation. In fact I started with the assumption that it solved all the problems, only after using and testing it found that it didn't, i.e. later simplifications are still causing issues. Regards Dibyendu ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Some thoughts on the SSA debate 2017-09-12 11:47 ` Dibyendu Majumdar @ 2017-09-12 13:06 ` Luc Van Oostenryck 2017-09-13 18:25 ` Dibyendu Majumdar 0 siblings, 1 reply; 8+ messages in thread From: Luc Van Oostenryck @ 2017-09-12 13:06 UTC (permalink / raw) To: Dibyendu Majumdar; +Cc: Linux-Sparse On Tue, Sep 12, 2017 at 1:47 PM, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote: > Hi Luc, > > On 12 September 2017 at 11:48, Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: >>> I have some practical reasons for it: >>> >>> a) In the JIT world, we often want fast compilation. I already disable >>> all subsequent phases, and generate code from the initial linear >>> output. >> >> Amusingly, the Simple SSA method was (partially) created for such uses. >> I think you can easily understand why. > > Yes, I think the SSA form is important for simplifications. Sure, but the point here is about the Simple SSA method (which can do the conversion of local variables on the fly, during linearization). The advantage is that you have then converted the variables which matter the most and this: - very early - you can easily combine simple optimization during the conversion - with a very low cost (no need for another pass, no need to calculate the dominance frontiers). > And for > some back-ends such as NanoJIT which do not have optimizations that > LLVM has, the Sparse layer can act as the optimizer. And those optimizations are critical for sparse current usage as semantic checker for the kernel: - to be able to detect some situations - to not have false warnings in the context checking. > But the issue > right now is that the simplifications are not always correct even with > the new SSA so that I have to disable everything anyway. It's what you do but it's not what I do. The other part that is broken is simplify_loads() which try to convert more variables after each simplification/CSE steps. Since the code is copied from the code for the initial conversion it's broken in the same ways. I just disabled it and things are then correct. Now, be assured that I understand very well that the way you use sparse is quite different. Things like trying to reach a fix-point while doing simplification is at the opposite spectrum from what people would like for JIT. > And having > the phi nodes then actually complicates things. I also want to insist once more that SSA is fundamental to sparse. I won't do anything that purposely makes sparse more difficult to use without SSA but you must understand that it's the least of my priorities. Even more so because even without conversion in itself the 'normal' already create phi-nodes for returns & some conditionals. On the contrary, my priority is to make sparse to produce and keep correct SSA. >> Anyway, I'm more than a bit tired of this 'debate', mainly based by half-truths, >> assumptions and misunderstandings. > > Given that there could be major changes in the output (at least for > the initial one) then it is good to make sure that there is a proper > consideration of pros and cons. I am not arguing against the new > implementation at all - just saying it should be a later phase. It's where we disagree. What I'm saying is: - we want to also convert pointers & non-local variables decently (the general case is undecidable, anyway). - you can't convert all the non-local variables at once. If you have several optimization steps/rounds then each of them can create more conversion opportunities. - let see what is possible to have at which cost - *if* there is some good enough advantage to do a first quick conversion on-the-fly let's do it (and it can be a very good thing, for some usage, to just do that and stop there). - the extreme cases are not really concerned anyway as they don't want to run any of this - not doing the conversion (of local variables) has also its cost since then you: * have to deal in the IR with stores and loads instructions for *every* access * are unable to do most simplifications, even the cheapest ones * have to deal at run-time with these memory accesses instead of being able to use registers. >> >> Meanwhile, the current code is broken and the existing code with SSSA >> *can* be used to test, experiment, evaluate, compare, ... > > Okay that is exactly what I am giving you - feedback based on using > the new implementation. In fact, I started with the assumption that it > solved all the problems, only after using and testing it found that it > didn't, i.e. later simplifications are still causing issues. Be assured that it is something I didn't directly realize but it doesn't change anything to what can be done at the initial stage. And as I said above, you can put the single call to simplify_loads() in comment and be done with it, just like I do here. -- Luc ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Some thoughts on the SSA debate 2017-09-12 13:06 ` Luc Van Oostenryck @ 2017-09-13 18:25 ` Dibyendu Majumdar 2017-09-13 19:10 ` Luc Van Oostenryck 0 siblings, 1 reply; 8+ messages in thread From: Dibyendu Majumdar @ 2017-09-13 18:25 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Linux-Sparse Hi Luc, On 12 September 2017 at 14:06, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > On Tue, Sep 12, 2017 at 1:47 PM, Dibyendu Majumdar >> But the issue >> right now is that the simplifications are not always correct even with >> the new SSA so that I have to disable everything anyway. > > It's what you do but it's not what I do. > The other part that is broken is simplify_loads() which try to convert > more variables after each simplification/CSE steps. > Since the code is copied from the code for the initial conversion > it's broken in the same ways. > I just disabled it and things are then correct. > I probably ought to update my project with the latest version you have. Please let me know if you have an updated minimal changes branch that I can apply. Thanks and Regards Dibyendu ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Some thoughts on the SSA debate 2017-09-13 18:25 ` Dibyendu Majumdar @ 2017-09-13 19:10 ` Luc Van Oostenryck 0 siblings, 0 replies; 8+ messages in thread From: Luc Van Oostenryck @ 2017-09-13 19:10 UTC (permalink / raw) To: Dibyendu Majumdar; +Cc: Linux-Sparse On Wed, Sep 13, 2017 at 8:25 PM, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote: > Hi Luc, > > On 12 September 2017 at 14:06, Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: >> On Tue, Sep 12, 2017 at 1:47 PM, Dibyendu Majumdar >>> But the issue >>> right now is that the simplifications are not always correct even with >>> the new SSA so that I have to disable everything anyway. >> >> It's what you do but it's not what I do. >> The other part that is broken is simplify_loads() which try to convert >> more variables after each simplification/CSE steps. >> Since the code is copied from the code for the initial conversion >> it's broken in the same ways. >> I just disabled it and things are then correct. >> > > I probably ought to update my project with the latest version you > have. Please let me know if you have an updated minimal changes branch > that I can apply. For SSSA, I have what you should have already or something quite close: * https://github.com/lucvoo/sparse/tree/sssa-mini-v3 And the same but with all LLVM fixes and others fixes or optimizations: * https://github.com/lucvoo/sparse/tree/sssa-next I also have something I use only as a sort of reference which I have absolutely no intention to ever submit. It doesn't do the SSA conversion during linearization and converts more vars. It also seems to be quite correct and so is useful to me for doing comparisons but that's all (and it's not tested for against the kernel, for example): * https://github.com/lucvoo/sparse/tree/mem2ssa-next None of these trees are stable as I often rebuild them while working on the base. So, I'm not sure how useful they can be for you but if you find a problem with linearization, SSA conversion or optimization, you can always check with one of these (probably the last one) to see if it's already fixed or not. -- Luc ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2017-09-13 19:10 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-09-08 10:28 Some thoughts on the SSA debate Dibyendu Majumdar 2017-09-09 2:06 ` Luc Van Oostenryck 2017-09-12 10:06 ` Dibyendu Majumdar 2017-09-12 10:48 ` Luc Van Oostenryck 2017-09-12 11:47 ` Dibyendu Majumdar 2017-09-12 13:06 ` Luc Van Oostenryck 2017-09-13 18:25 ` Dibyendu Majumdar 2017-09-13 19:10 ` Luc Van Oostenryck
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).