Pharo VM Transpiler: Wrapping up
Progress made while contributing to Google Summer of Code
Another 6 weeks have passed! This is all the work done in the second half of my GSoC project.
Development
Wrapping up validations
Pull Requests: #662
In the previous article I went pretty in-depth on validations. However, one small feature that I had been eager to implement since starting the project was displaying these validations in the IDE, so that the developer saves even more time compared to getting an error during the transpilation.
The most natural way of displaying this information was as a linter rule. In this case, a new linter rule was added to check for redundant type declarations (for more details on the validation, check my previous article).
Implementation
Linter rules in Pharo are implemented as classes, which must be subclasses of ReAbstractRule. This gives you a simple interface: a method receives a node and answers whether that RbAST node violates the rule.
ReSlangRedundantTypeDeclarationRule >> basicCheck: aNode [
    ^ aNode isPragma and: [
        aNode isTypeDefinition and: [
            (aNode methodNode allDefinedVariables includes:
                (aNode argumentAt: #var:) value) not ] ]
]
And that's it! With this rule in place, the validation shows up in the IDE.
Type guided translations
Pull Requests: #683 (pending as of writing this article)
The issue
The last feature I worked on was based on a preexisting issue that had to do with the '&' operator in Pharo. To understand it, let's use this code example:
AClass >> aMethod
    | result1 result2 |
    result1 := self anOperation.
    result2 := self anotherOperation.
    ^result1 & result2
How should the last line be translated? It actually depends on the type. When result1 and result2 are booleans, the & is a logical AND, which is translated as && in C. On the other hand, if they are numbers, then the & is a bitwise AND, which is represented with the & symbol in C.
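To see why picking the right operator matters, here is a minimal C sketch (not code from the VM). The function names bit_and, logical_and and check_and_operators are hypothetical and only serve the illustration. For operands that are strictly 0 or 1 both operators agree, but for other integers they give different answers.

```c
#include <assert.h>

/* Bitwise AND: combines the two operands bit by bit. */
int bit_and(int a, int b) { return a & b; }

/* Logical AND: yields 1 when both operands are nonzero, otherwise 0. */
int logical_and(int a, int b) { return a && b; }

/* Self-check showing where the two operators agree and disagree. */
void check_and_operators(void) {
    /* With 0/1 operands the results coincide. */
    assert(bit_and(1, 1) == 1 && logical_and(1, 1) == 1);
    assert(bit_and(1, 0) == 0 && logical_and(1, 0) == 0);

    /* With general integers they do not. */
    assert(bit_and(6, 3) == 2);      /* 110 & 011 = 010 */
    assert(logical_and(6, 3) == 1);  /* both nonzero */
    assert(bit_and(2, 1) == 0);      /* 10 & 01 = 00 */
    assert(logical_and(2, 1) == 1);  /* both nonzero */
}
```

So emitting & where && was intended (or the other way around) can silently change the result as soon as a "true" value is something other than 1.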
Tapping into the translation pipeline
To get this behavior I had to modify the CAST generation. Many operations are actually "intercepted" by a translation dictionary, which maps selectors to generator methods so that logic can be added before the CAST is produced. We can use it to call a method, generateCASTInferredAnd:, which handles the type-guided translation.
CCodeGenerator >> initializeCASTTranslationDictionary [
    | pairs |
    castTranslationDict := Dictionary new: 200.
    pairs := #(
        #& #generateCASTInferredAnd:
        #| #forbiddenSelector:
        #abs #generateCASTAbs:
        #and: #generateCASTSequentialAnd:
        (...)
Dealing with string-based types
The biggest issue for this implementation was dealing with types, because these are handled as strings. For example, take a look at the way a constant's type is inferred:
TConstantNode >> typeOrNilFrom: aCodeGenerator in: aTMethod [
    | hb |
    value isInteger
        ifTrue:
            [value positive
                ifTrue:
                    [hb := value highBit.
                    hb < 32 ifTrue: [^#int].
                    hb = 32 ifTrue: [^#'unsigned int'].
                    hb = 64 ifTrue: [^#'unsigned long long'].
                    ^#'long long']
                ifFalse:
                    [hb := value bitInvert highBit.
                    hb < 32 ifTrue: [^#int].
                    ^#'long long']].
    value isFloat ifTrue: [^#double].
    (#(nil true false) includes: value) ifTrue: [^#int].
    (value isString and: [value isSymbol not]) ifTrue: [^#'char *'].
    ^nil
]
Not only are they all strings, but this example displays the biggest issue: the boolean type is the same as the number type, which is int. The solution we came up with was using objects that wrap the string type and can also answer whether they are boolean or not. This worked great; however, it required changing a bunch of logic around the type system.
An important note is that this PR only transforms some strings into objects. The coexistence of these two representations is painful, and transitioning completely to objects is a must.
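To make the idea of a wrapped type more concrete, here is a minimal sketch written in C rather than Pharo. It is not the implementation from the PR: the names TypeInfo, make_type and type_equals are hypothetical. It only shows how a wrapper can keep the C type name while still remembering whether the value is a boolean.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical reified type: wraps the C type name and records
   whether the value it describes is a boolean. */
typedef struct {
    const char *c_name;  /* the string the old type system used, e.g. "int" */
    bool is_boolean;     /* the extra knowledge a bare string cannot carry */
} TypeInfo;

/* Builds a wrapped type from a C type name and a boolean flag. */
TypeInfo make_type(const char *c_name, bool is_boolean) {
    TypeInfo type = { c_name, is_boolean };
    return type;
}

/* Two wrapped types are equal only if both the C name and the
   boolean flag agree. */
bool type_equals(TypeInfo a, TypeInfo b) {
    return strcmp(a.c_name, b.c_name) == 0 && a.is_boolean == b.is_boolean;
}
```

With plain strings, a boolean and a small integer are both just "int" and cannot be told apart. With the wrapper, make_type("int", true) and make_type("int", false) share the same C name but compare as different types.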
Generating the CAST based on type
Once we can deduce whether the type is boolean or not, the actual type-guided translation has a simple implementation:
CCodeGenerator >> generateCASTInferredAnd: aTSendNode [
    | receiverType argumentType |
    "fetch types"
    receiverType := self
        tTypeFor: aTSendNode receiver
        in: self currentMethod.
    argumentType := self
        tTypeFor: aTSendNode arguments first
        in: self currentMethod.
    "if types are different then throw an error"
    receiverType ~= argumentType ifTrue: [ TypeError signal: 'Cannot infer & type' ].
    "translate depending on type"
    ^ receiverType isBoolean
        ifTrue: [ self generateCASTAnd: aTSendNode ]
        ifFalse: [ self generateCASTBitAnd: aTSendNode ]
]
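The decision itself can be summarized with a small C sketch. This is only an illustration under simplifying assumptions: the Kind enum and the and_operator_for function are hypothetical, types are reduced to two kinds, and a type mismatch is reported by returning NULL instead of signaling an error.

```c
#include <stddef.h>

/* Hypothetical, simplified view of an operand's inferred type. */
typedef enum { KIND_BOOLEAN, KIND_INTEGER } Kind;

/* Returns the C operator to emit for a Pharo & send, or NULL when the
   operand kinds differ and the translation cannot be inferred. */
const char *and_operator_for(Kind receiver, Kind argument) {
    if (receiver != argument) {
        return NULL;  /* mirrors the 'Cannot infer & type' error case */
    }
    return receiver == KIND_BOOLEAN ? "&&" : "&";
}
```

Two booleans select &&, two integers select &, and mixing them is rejected.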
Now that the types are reified, this could be implemented for any operation!
Conclusions
Wrapping up my GSoC and looking back at my proposal, I can say that although it wasn't a rigid path, the main goal of improving the development experience was achieved in several ways!
Lastly, I'd like to thank my mentors, who helped me every step of the way.