I worked on a PHP compiler: http://phpcompiler.org. Although it only supports PHP 5.2, it has a really nice parser, which had a lot of work put into it (a ton of edge cases in particular).
It's not in Go, but it does a lot of the interesting things you asked about: static analysis, dead code elimination, transpiling. It also compiles to (pretty poor) C code.
I worked on this for about 4 years, and if my experience is indicative of working on PHP compilers in general, you have a lot of fun, and a massive amount of frustration, in front of you.
Since scrutinizer-ci took their interesting "PHP-analyser" private I've been looking for a better static analysis tool for PHP that I can contribute to. HPHPc is alright in HHVM but learning OCaml is slowing me down, so I'm definitely going to take a look at your compiler! Well done :)
Unlike the OP's, Nikita's PHP parser is actually used for a lot of things. He wrote a script to detect code broken by syntax changes in the next version of PHP, for example.
nikic and Anthony are some of my favourite PHP developers. The stuff that people have been doing in PHP, what with HHVM and Hack and the PSR standards and Composer/Packagist, etc etc. is just amazing!
^ Wow is all I have to say. Solving PHP's grammatical syntax issues alone is a big thing, I commend efforts like this, it seems to work pretty well too (based on playing around with it for a few minutes).
It's being worked on for Go 1.4 (scheduled to release in December). It's the primary feature they want to get done for that release. There was a talk about it back in May[0]. rsc is working on c2go to convert the existing compiler to Go[1][2].
Whoa, I'm pretty stoked to see someone link to my Ruby implementation (née Grubby).
It seems like the authors of Golang believe that a lot of problems with languages (refactoring, updating code to work with new libraries / versions, etc) can be solved as parsing problems. Hence, Golang has a lot of good tools for parsing text.
I'd be really delighted if you could show me a better tool than goyacc for writing a parser in Golang, given a grammar. You're absolutely right that the error reporting in yacc isn't that modern, but it's very functional, very powerful and (best of all) a lot of people have experience with it.
I certainly couldn't find any better tools in Golang when I started, but I wouldn't be surprised if someone had started one since.
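To illustrate the "good tools for parsing" point above: Go ships parsing for its own language in the standard library (go/parser, go/ast), which is the kind of built-in tooling that makes refactoring and analysis tools practical. This snippet and its identifier names are my own illustration, not something from the thread:

```go
// Minimal sketch: use the standard go/parser and go/ast packages to
// list the function names declared in a Go source string.
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// funcNames parses src as a Go file and returns the names of its
// top-level function declarations.
func funcNames(src string) []string {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		panic(err)
	}
	var names []string
	for _, decl := range f.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok {
			names = append(names, fn.Name.Name)
		}
	}
	return names
}

func main() {
	src := "package demo\nfunc Hello() {}\nfunc World() {}\n"
	fmt.Println(funcNames(src)) // [Hello World]
}
```

The same AST can then be rewritten and printed back out with go/printer, which is how tools like gofmt and gofix work.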
How performant are this and other similar projects (pfff, PHP-Parser)? Are any of them a viable base for improved PHP support in text editors (say vim, st, atom)?
Parsing, lexing and the general task of writing compilers is such a breeze in OCaml. I remember when I was in college, one of our year projects was to write a small compiler for a subset of postscript using ocamllex and ocamlyacc, couldn't believe how nice and natural it felt. What a great language.
Personally I would be a lot more interested in a good platform for writing static analysis tools. I believe the community in general would take a more immediate benefit (and what a benefit!...) from this than from a lone PHP to Go transpiler.
But with a transpiler, Go could become an IL for PHP compilation...?
Not that it would be faster or better than say, HHVM or any other of a number of compilers for PHP [1] but my knowledge of that space is quite limited.
This could be a step in that process, but in the grand scheme of things required to compile one language to another, the mere front-end parser is not generally all that significant a portion of the effort. The vast majority of the effort would be the bug-for-bug compatible implementation of PHP semantics and base libraries and functionality.
("Bug-for-bug" here does not mean that PHP has a lot of bugs per se. What it is is the highest level of compatibility. An emulator of a game console strives to be "bug-for-bug" compatible, for instance. Programming and programming languages being what they are, anything less often turns out to be surprisingly non-linearly less useful, i.e., "80% compatible" isn't anywhere near "80% useful".)
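A concrete example of why "bug-for-bug" compatibility is the hard part: PHP's loose `==` compares two numeric strings as numbers, so a PHP-to-Go compiler cannot simply map `==` to Go's byte-wise string equality. The helper below is a hypothetical sketch of one tiny corner of that semantic surface, not code from any compiler mentioned here:

```go
// Sketch: emulating one corner of PHP 5's loose == for string
// operands. If both strings parse as numbers, PHP compares them
// numerically; Go's == compares them byte by byte.
package main

import (
	"fmt"
	"strconv"
)

// phpLooseEq is a hypothetical runtime helper a transpiler might
// emit in place of PHP's == on two strings.
func phpLooseEq(a, b string) bool {
	fa, errA := strconv.ParseFloat(a, 64)
	fb, errB := strconv.ParseFloat(b, 64)
	if errA == nil && errB == nil {
		return fa == fb // both numeric: compare as numbers, PHP-style
	}
	return a == b // otherwise: plain string comparison
}

func main() {
	fmt.Println("0" == "0.0")           // false: Go compares bytes
	fmt.Println(phpLooseEq("0", "0.0")) // true: PHP-style numeric compare
}
```

Multiply that by every operator, coercion rule, and standard-library quirk, and the scale of the semantic work dwarfs the parser.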
I agree with you (that it could be an important tool).
However I will say that the VB6->VB.Net transpiler which Microsoft produced (and clearly spent significant amounts of effort on) was pretty terrible. And that is one of the most "complete" transpilers I know of...
The problem is that for a transpiler to produce "good" output code it needs a deep understanding of both context and intent. This is particularly important when converting from one language to another with slightly different underlying concepts (like VB5-6 vs. VB.NET). Without that understanding it just produces spaghetti code that will technically compile (*although often it didn't in the VB6->VB.NET example) but is unmaintainable.
I liken it to Microsoft Word's HTML engine. Word can produce websites, and those websites technically looked correct in most browsers, but they became an unmaintainable mess in the medium to long term. A lot of transpilers have the same issue.
The best thing I can say about transpilers is that they're very good as a starting point (assume 100% refactoring anyway) and for converting simplistic data storage vehicles (e.g. classes with tons of constants).
There's also the problem of dealing with the standard library. Not everything has a directly equivalent function. So you'll make your app dependent on an obscure library based on another language's standard library.
More advanced transpilers even translate library calls directly into "native" library calls in the target language. I think it's mostly projects working between languages with already-similar syntax - Processing.js going from Java to JavaScript, for example - that decide it's easier to do a relatively simple syntax transformation plus some sort of wrapper for function calls (as you describe) than a potentially deeper and more complicated transformation.
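The "wrapper for function calls" approach described above can be sketched concretely: rather than deeply rewriting each call site, a transpiler emits calls into a small runtime shim that mirrors the source language's standard library. The `phpImplode` name here is my own hypothetical choice, assuming a PHP-to-Go transpiler:

```go
// Sketch: a runtime shim mirroring PHP's implode() on top of Go's
// strings.Join, so transpiled code keeps its original call shape.
package main

import (
	"fmt"
	"strings"
)

// phpImplode is a hypothetical shim: same argument order as PHP's
// implode(glue, pieces), delegating to the Go standard library.
func phpImplode(glue string, pieces []string) string {
	return strings.Join(pieces, glue)
}

func main() {
	// PHP: implode(", ", array("a", "b", "c"))
	fmt.Println(phpImplode(", ", []string{"a", "b", "c"})) // a, b, c
}
```

The trade-off is exactly the one the parent comments raise: the output runs, but it now depends on an obscure compatibility library instead of idiomatic target-language code.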
The migration tool that came with Visual Studio 2002-2008 is a licensed product from a third party company (a "lite" version actually). The tool was demonstrated (and this fact mentioned) in some Channel9 videos (Microsoft website) about ten years ago.
Word and FrontPage used the same COM code based on Trident (IE). FrontPage is dead; its successor Expression Web used a new HTML engine and is dead as well. The second FrontPage successor that still used the Trident engine was SharePoint Designer 2007. Version 2010+ lacks most layout features, as the old FrontPage-based code generated ugly HTML4. Word 2010 (and probably also 2013) still generates (ugly) HTML4 with inlined VML (graphics, WordArt) based on older Trident.
And there is also InfoPath 2003-2013 (dead as of 2014), which is based on modified FrontPage/Trident code. It uses a CAB-based archive format to store XML and XSD files that represent the user-defined form data. The InfoPath WebForms are generated server-side from the XML stylesheet and XML data. Microsoft is working on an InfoPath successor, merged with other Office products and mobile-compatible.
For static analysis, there's a lot to do. Here's my PhD on the topic: http://paulbiggar.com/research/#phd-dissertation