{"id":458,"date":"2020-03-12T00:19:42","date_gmt":"2020-03-11T23:19:42","guid":{"rendered":"https:\/\/www.nikostotz.de\/blog\/?p=458"},"modified":"2022-09-29T17:17:22","modified_gmt":"2022-09-29T15:17:22","slug":"mps-quest-of-the-holy-graalvm-of-interpreters","status":"publish","type":"post","link":"https:\/\/www.nikostotz.de\/blog\/mps-quest-of-the-holy-graalvm-of-interpreters\/","title":{"rendered":"MPS&#8217; Quest of the Holy GraalVM of Interpreters"},"content":{"rendered":"<p>A vision how to combine MPS and GraalVM<\/p>\n<p>Way too long ago, I <a href=\"https:\/\/www.nikostotz.de\/blog\/high-performance-interpreters-for-jetbrains-mps\/\">prototyped<\/a> a way to use <a href=\"https:\/\/www.graalvm.org\/\">GraalVM<\/a> and <a href=\"https:\/\/github.com\/oracle\/graal\/tree\/master\/truffle\">Truffle<\/a> inside <a href=\"https:\/\/www.jetbrains.com\/mps\/\">JetBrains MPS<\/a>. I hope to pick up this work soon. In this article, I describe the <em>grand picture<\/em> of what might be possible with this combination.<\/p>\n<h2 id=\"_part_i_get_it_working\">Part I: Get it Working<\/h2>\n<h3 id=\"_step_0_teach_annotation_processors_to_mps\">Step 0: Teach Annotation Processors to MPS<\/h3>\n<p>Truffle uses <a href=\"https:\/\/docs.oracle.com\/en\/java\/javase\/13\/docs\/api\/java.compiler\/javax\/annotation\/processing\/Processor.html\">Java Annotation Processors<\/a> heavily. Unfortunately, MPS doesn\u2019t support them during its internal Java compilation. The <a href=\"https:\/\/youtrack.jetbrains.com\/issue\/MPS-27653\">feature request<\/a> doesn\u2019t show any activity.<\/p>\n<p>So, we have to do it ourselves. A little less time ago, I started with an <a href=\"https:\/\/github.com\/enikao\/mps-annotation-processor-facet\/tree\/intermediate-state\">alternative Java Facet<\/a> to include Annotation Processors. I just pushed my work-in-progress state from 2018. 
As far as I remember, there were no fundamental problems with the approach.<\/p>\n<h3 id=\"_optional_step_1_teach_truffle_structured_sources\">Optional Step 1: Teach Truffle Structured Sources<\/h3>\n<p>For Truffle, all executed programs stem from a <a href=\"https:\/\/www.graalvm.org\/truffle\/javadoc\/com\/oracle\/truffle\/api\/source\/Source.html\"><tt>Source<\/tt><\/a>. However, this <tt>Source<\/tt> can only provide <tt>Byte<\/tt>s or <tt>Character<\/tt>s. In our case, we want to provide the input model. The prototype just put the <em>Node id<\/em> of the input model as a String into the Source; later steps resolved the id against the MPS API. This approach works and is acceptable; directly passing the input node as an object would be much nicer.<\/p>\n<h3 id=\"_step_2_implement_truffle_annotations_as_mps_language\">Step 2: Implement Truffle Annotations as MPS Language<\/h3>\n<p>We have to provide all additional hints as Annotations to Truffle. These annotations are complex enough that we want to leverage MPS&#8217; language features to directly represent all Truffle concepts.<\/p>\n<p>This might be a simple one-to-one representation of Java Annotations as MPS Concepts, but I\u2019d guess we can add some more semantics and checks. Such feedback within MPS should simplify the next steps: Annotation Processors (and thus, Truffle) have only limited options to report issues back to us.<\/p>\n<p>We use this MPS language to implement the interpreter for our DSL. This results in a <a href=\"https:\/\/www.graalvm.org\/truffle\/javadoc\/com\/oracle\/truffle\/api\/TruffleLanguage.html\"><tt>TruffleLanguage<\/tt><\/a> for our DSL.<\/p>\n<h3 id=\"_step_3_start_truffle_within_mps\">Step 3: Start Truffle within MPS<\/h3>\n<p>When I wrote the proof-of-concept, a <tt>TruffleLanguage<\/tt> had to be loaded at JVM startup. To my understanding, Truffle has overcome this limitation. 
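The node-id workaround from Optional Step 1 can be sketched in plain Java. MpsNode and the registry map are stand-ins for the real MPS API; the prototype stores only the id string in the Truffle Source and resolves it later:

```java
import java.util.Map;

// Sketch of the Step 1 workaround: the Truffle Source carries only the MPS
// node id as text; the interpreter resolves it against MPS later.
public class NodeIdSource {
    record MpsNode(String id, String concept) {}

    // Stand-in for MPS' node lookup; "r:42/3" is a made-up node id.
    static final Map<String, MpsNode> REGISTRY =
            Map.of("r:42/3", new MpsNode("r:42/3", "StateMachine"));

    // What the prototype puts into com.oracle.truffle.api.source.Source:
    static String toSourceText(MpsNode node) { return node.id(); }

    // What later steps do when Truffle hands the characters back:
    static MpsNode resolve(String sourceText) { return REGISTRY.get(sourceText); }

    public static void main(String[] args) {
        String text = toSourceText(REGISTRY.get("r:42/3"));
        System.out.println(resolve(text).concept()); // round trip back to the node
    }
}
```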
I haven\u2019t looked into the current possibilities in detail yet.<\/p>\n<p>I can imagine two ways to provide our DSL interpreter to the Truffle runtime:<\/p>\n<ol class=\"loweralpha\" type=\"a\">\n<li>Always register <tt>MpsTruffleLanguage1<\/tt>, <tt>MpsTruffleLanguage2<\/tt>, etc. as placeholders. This would also work at JVM startup. If required, we can register additional placeholders with one JVM restart.<br \/>\nAll <em>non-colliding<\/em> DSL interpreters would be <tt>MpsTruffleLanguage1<\/tt> from Truffle\u2019s point of view. This works, as we know the MPS language for each input model, and can make sure Truffle uses the right evaluation for the node at hand. We might suffer a performance loss, as Truffle has to manage more evaluations.<br \/>\nWhat are non-colliding interpreters? Assume we have a state machine DSL, an expression DSL, and a test DSL. The expression DSL is used within the state machines; we provide an interpreter for both of them.<br \/>\nWe provide two interpreters for the test DSL: One executes the test and checks the assertions, the other one only marks model nodes that are covered by the test.<br \/>\nThe state machine interpreter, the expression interpreter, and the first test interpreter are non-colliding, as they never want to execute on the same model node. All of them go to <tt>MpsTruffleLanguage1<\/tt>.<br \/>\nThe second test interpreter <em>does<\/em> collide, as it wants to do something with a node also covered by the other interpreters. We put it into <tt>MpsTruffleLanguage2<\/tt>.<\/li>\n<li>We register every DSL interpreter as a separate <tt>TruffleLanguage<\/tt>. A nice and clean one-to-one relation. In this scenario, we\u2019d probably have to get <a href=\"https:\/\/www.graalvm.org\/truffle\/javadoc\/com\/oracle\/truffle\/api\/interop\/InteropLibrary.html\">Truffle Language Interop<\/a> right. 
I have not yet investigated this topic.<\/li>\n<\/ol>\n<h3 id=\"translate-input-model\">Step 4: Translate Input Model to Truffle Nodes<\/h3>\n<p>A lot of Truffle\u2019s magic stems from its AST representation. Thus, we need to translate our input model (a.k.a. DSL instance, a.k.a. program to execute) from MPS nodes into <a href=\"https:\/\/www.graalvm.org\/truffle\/javadoc\/com\/oracle\/truffle\/api\/nodes\/Node.html\">Truffle <tt>Node<\/tt>s<\/a>.<\/p>\n<p>Ideally, the Truffle AST would dynamically adapt to any changes of the input model\u2009\u2014\u2009like hot code replacement in a debugger, except we don\u2019t want to stop the running program. From Truffle\u2019s point of view this shouldn\u2019t be a problem: It rewrites the AST all the time anyway.<\/p>\n<p><a href=\"https:\/\/github.com\/ModelingValueGroup\/DclareForMPS\">DclareForMPS<\/a> seems like a fitting technology. We define mapping rules from MPS node to Truffle <tt>Node<\/tt>. Dclare makes sure they are in sync, and input changes are propagated optimally. These rules could either be generic, or be generated from the interpreter definition.<\/p>\n<p>We need to take care that Dclare doesn\u2019t try to adapt the MPS nodes to Truffle\u2019s optimizing AST changes (no back-propagation).<\/p>\n<p>We require special handling for edge cases of MPS \u2192 Truffle change propagation, e.g. when the user deletes the currently executed part of the program.<\/p>\n<p>For memory optimization, we might translate only the entry nodes of our input model immediately. Instead of the actual child Truffle <tt>Node<\/tt>s, we\u2019d add special nodes that translate the next part of the AST on demand.<br \/>\nUnloading the parts that are no longer required might be an issue. 
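The on-demand translation idea might look roughly like this plain-Java sketch; real Truffle code would subclass Node and call replace() on itself, and the ExecNode interface and translator callback are made up for illustration:

```java
import java.util.function.Supplier;

// Sketch of lazy translation: instead of eagerly translating the whole model,
// a placeholder translates its subtree on first execution and caches the
// result (a Truffle node would replace() itself in the AST instead).
public class LazyTranslation {
    interface ExecNode { int execute(); }

    static class LazyNode implements ExecNode {
        private ExecNode delegate;
        private final Supplier<ExecNode> translator; // translates the MPS subtree
        LazyNode(Supplier<ExecNode> translator) { this.translator = translator; }
        public int execute() {
            if (delegate == null) {        // first hit: translate now
                delegate = translator.get();
            }
            return delegate.execute();
        }
    }

    public static void main(String[] args) {
        // The subtree is only translated when the program actually reaches it.
        ExecNode program = new LazyNode(() -> () -> 40 + 2);
        System.out.println(program.execute());
    }
}
```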
Also, on-demand processing seems to conflict with Dclare\u2019s rule-based approach.<\/p>\n<h2 id=\"_part_ii_adapt_to_mps\">Part II: Adapt to MPS<\/h2>\n<h3 id=\"_step_5_re_create_interpreter_language\">Step 5: Re-create Interpreter Language<\/h3>\n<p>The <a href=\"http:\/\/mbeddr.com\/interpreter\/Interpreter.html\">MPS interpreter framework<\/a> removes even more boilerplate from writing interpreters than Truffle. The same language concepts should be built again, as an abstraction on top of the Truffle Annotation DSL. This would be a new language aspect.<\/p>\n<h3 id=\"migrate-framework\">Step 6: Migrate MPS Interpreter Framework<\/h3>\n<p>Once we have the Truffle-based interpreter language, we want to use it! Also, we don\u2019t want to rewrite all our nice interpreters.<\/p>\n<p>I think it\u2019s feasible to automatically migrate at least large parts of the existing MPS interpreter framework to the new language. I would expect some manual adjustment, though. That\u2019s the price we\u2019d have to pay for a performance improvement of two orders of magnitude.<\/p>\n<h3 id=\"plumbing\">Step 7: Provide Plumbing for BaseLanguage, Checking Rules, Editors, and Tests<\/h3>\n<p>Using the interpreter should be as easy as possible. Thus, we have to provide the appropriate utilities:<\/p>\n<ul>\n<li>Call the interpreter from any BaseLanguage code.<br \/>\nWe have to make sure we get language \/ model loading and dependencies right. This should be easier with Truffle than with the current interpreter, as most language dependencies are only required at interpreter build time.<\/li>\n<li>Report interpreter results in Checking Rules.<br \/>\nCreating warnings or errors based on the interpreter\u2019s results is a standard use-case, and should be supported by dedicated language constructs.<\/li>\n<li>Show interpreter results in an editor.<br \/>\nAs another standard use-case, we might want to show the interpreter\u2019s results (or a derivative) inside an MPS editor. 
Especially for long-running or asynchronous calculations, getting this right is tricky. Dedicated editor extensions should take care of the details.<\/li>\n<li>Run tests that involve the interpreter.<br \/>\nYet another standard use-case: our DSL defines both calculation rules and examples. We want to ensure they are in sync by executing the rules in our DSL interpreter and comparing the results with the examples. This must work both inside MPS, and in a headless build \/ CI test environment.<\/li>\n<\/ul>\n<h3 id=\"_step_8_support_asynchronous_interpretation_andor_caching\">Step 8: Support Asynchronous Interpretation and\/or Caching<\/h3>\n<p>The simple implementation of interpreter support accepts a language, parameters, and a program (a.k.a. input model), and blocks until the interpretation is complete.<\/p>\n<p>This working mode is useful in various situations. However, we might want to run long-running interpretations in the background, and notify a callback once the computation is finished.<\/p>\n<p>Example: An MPS editor uses an interpreter to color a rule red if it is not in accordance with a provided example. This interpretation result is very useful, even if it takes several seconds to calculate. However, we don\u2019t want to block the editor (or even the whole of MPS) for that long.<\/p>\n<p>Extending the example, we might also want to show an error on such a rule. The typesystem runs asynchronously anyway, so blocking is not an issue. However, we now run the same expensive interpretation twice. The interpreter support should provide configurable caching mechanisms to avoid such waste.<\/p>\n<p>Both asynchronous interpretation and caching benefit from proper <a href=\"#plumbing\">language extensions<\/a>.<\/p>\n<h3 id=\"_step_9_integrate_with_mps_typesystem_and_scoping\">Step 9: Integrate with MPS Typesystem and Scoping<\/h3>\n<p>Truffle needs to know about our DSL\u2019s types, e.g. for resolving overloaded functions or type casting. 
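To make the role of this type information concrete: in plain Java, the dispatch that Truffle's specialization machinery derives from type annotations looks roughly like the following sketch; the add operation and its cases are hypothetical:

```java
// Plain-Java sketch of type-driven overload resolution, roughly what Truffle
// generates from @Specialization methods once it knows the DSL's types.
public class OverloadDispatch {
    static Object add(Object left, Object right) {
        if (left instanceof Long l && right instanceof Long r) {
            return l + r;                 // fast integer specialization
        }
        if (left instanceof String s) {
            return s + right;             // string-concatenation specialization
        }
        throw new IllegalArgumentException("no specialization for " + left);
    }

    public static void main(String[] args) {
        System.out.println(add(40L, 2L)); // integer case
        System.out.println(add("4", 2L)); // string case
    }
}
```

Truffle additionally caches which specialization was taken, so the check is not repeated on the hot path.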
We already provide this information to the MPS typesystem. I haven\u2019t looked into the details yet; I\u2019d expect we could generate at least part of the Truffle input from MPS&#8217; type aspect.<\/p>\n<p>Truffle requires scoping knowledge to store variables in the right stack frame (and possibly other things I don\u2019t understand yet). I\u2019d expect we could use the resolved references in our model as input to Truffle. I\u2019m less optimistic about re-using MPS&#8217; actual scoping system.<\/p>\n<p>For both aspects, we can supply the missing information in the Interpreter Language, similar to the existing one.<\/p>\n<h3 id=\"_step_10_support_interpreter_development\">Step 10: Support Interpreter Development<\/h3>\n<p>As DSL developers, we want to make sure we implemented our interpreter correctly. Thus, we write tests; they are similar to other <a href=\"#plumbing\">tests involving the interpreter<\/a>.<\/p>\n<p>However, if they fail, we <em>don\u2019t<\/em> want to <a href=\"#debugger\">debug the program expressed in our DSL<\/a>, but our interpreter. For example, we might implement the interpreter for a <tt>switch<\/tt>-like construct, and have forgotten to handle an implicit <tt>default<\/tt> case.<\/p>\n<p>Using a regular Java debugger (attached to our running MPS instance) has only limited use, as we\u2019d have to debug through the highly optimized Truffle code. We cannot use Truffle\u2019s debugging capabilities, as they work on the DSL, not on the interpreter.<br \/>\nThere might be ways to attach a regular Java debugger running inside MPS in a different thread to its own JVM. 
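The switch-like construct mentioned above can be made concrete with a small sketch; Case and interpretSwitch are hypothetical stand-ins for evaluator code, with the easily-forgotten implicit default handled explicitly:

```java
import java.util.List;

// Sketch of the switch-interpreter example: a buggy evaluator would fall off
// the end when no case matches; this version handles the implicit default.
public class SwitchInterp {
    record Case(int value, String result) {}

    static String interpretSwitch(int selector, List<Case> cases, String defaultResult) {
        for (Case c : cases) {
            if (c.value() == selector) return c.result();
        }
        return defaultResult; // the easily-forgotten implicit default
    }

    public static void main(String[] args) {
        List<Case> cases = List.of(new Case(1, "one"), new Case(2, "two"));
        System.out.println(interpretSwitch(2, cases, "other")); // matching case
        System.out.println(interpretSwitch(9, cases, "other")); // default case
    }
}
```

An interpreter test for this evaluator would assert exactly the second call: the path no example exercised.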
Combining the direct debugger access with our knowledge of the interpreter\u2019s structure, we might be able to provide sensible stepping through the interpreter to the DSL developer.<\/p>\n<p>Simpler ways to support the developers might be providing traces through the interpreter, or shipping test support that lets the DSL developer assert that specific evaluators were (or were not) executed.<\/p>\n<h3 id=\"_step_11_create_language_for_interop\">Step 11: Create Language for Interop<\/h3>\n<p>Truffle provides a framework to describe any runtime in-memory data structure as a <a href=\"https:\/\/www.graalvm.org\/truffle\/javadoc\/com\/oracle\/truffle\/api\/object\/Shape.html\"><tt>Shape<\/tt><\/a>, and to convert such structures between languages. This should be a nice extension of MPS&#8217; multi-language support into the runtime space, supported by an appropriate Meta-DSL (a.k.a. language aspect).<\/p>\n<h2 id=\"_part_iii_leverage_programming_language_tooling\">Part III: Leverage Programming Language Tooling<\/h2>\n<h3 id=\"debugger\">Step 12: Connect Truffle to MPS&#8217; Debugger<\/h3>\n<p>MPS contains the standard interactive debugger inherited from the IntelliJ platform.<\/p>\n<p>Truffle exposes a standard <a href=\"https:\/\/www.graalvm.org\/truffle\/javadoc\/com\/oracle\/truffle\/api\/debug\/DebuggerSession.html\">interface for interactive debuggers<\/a> of the interpreted input. It takes care of the heavy lifting from Truffle AST to MPS input node.<\/p>\n<p>If we run Truffle in a different thread than the MPS debugger, we should be able to connect the two.<\/p>\n<h3 id=\"instrumentation\">Step 13: Integrate Instrumentation<\/h3>\n<p>Truffle also exposes an <a href=\"https:\/\/www.graalvm.org\/truffle\/javadoc\/com\/oracle\/truffle\/api\/instrumentation\/TruffleInstrument.html\">instrumentation interface<\/a>. 
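The node-coverage idea can be sketched without the Truffle API: wrap each interpreter node so it reports its originating MPS node id to a collector. ExecNode and the id strings are made up for illustration; a real implementation would attach an execution event listener through a TruffleInstrument instead:

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java sketch of DSL node coverage: every executed interpreter node
// records the id of the MPS node it was translated from.
public class NodeCoverage {
    static final Set<String> covered = new HashSet<>();

    interface ExecNode { void execute(); }

    // Wraps a node so execution marks the originating MPS node as covered.
    static ExecNode instrumented(String mpsNodeId, ExecNode wrapped) {
        return () -> { covered.add(mpsNodeId); wrapped.execute(); };
    }

    public static void main(String[] args) {
        ExecNode a = instrumented("node-a", () -> {});
        ExecNode b = instrumented("node-b", () -> {});
        a.execute();
        a.execute(); // repeated execution does not change coverage
        System.out.println(covered.size()); // node-b was never executed
    }
}
```

The collected ids map straight back to MPS nodes, so an editor could color their background by coverage.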
We could provide standard instrumentation applications like &#8220;code&#8221; coverage (in our case: DSL node coverage) and tracing out-of-the-box.<\/p>\n<p>One might think of nice visualizations:<\/p>\n<ul>\n<li>Color node background based on coverage<\/li>\n<li>Mark the currently executed part of the model<\/li>\n<li>Project runtime values inline<\/li>\n<li>Show traces in trace explorer<\/li>\n<\/ul>\n<p>Other possible applications:<\/p>\n<ul>\n<li>Snapshot mechanism for current interpreter state<\/li>\n<li>Provide traces for offline debugging, and play them back<\/li>\n<\/ul>\n<h2 id=\"_part_iv_beyond_mps\">Part IV: Beyond MPS<\/h2>\n<h3 id=\"_step_14_serialize_truffle_nodes\">Step 14: Serialize Truffle Nodes<\/h3>\n<p>If we could serialize Truffle <tt>Node<\/tt>s (before any run-time optimization), we would have an MPS-independent representation of the executable DSL. Depending on the serialization format (implementing <a href=\"https:\/\/docs.oracle.com\/en\/java\/javase\/13\/docs\/api\/java.base\/java\/io\/Serializable.html\">Serializable<\/a>, a custom binary format, JSON, etc.), we could optimize for use-case, size, loading time, or other priorities.<\/p>\n<h3 id=\"_step_15_execute_dsl_stand_alone_without_generator\">Step 15: Execute DSL stand-alone without Generator<\/h3>\n<p>Assume an insurance calculation DSL.<br \/>\nUsually, we would implement<\/p>\n<ul>\n<li>an interpreter to execute test cases within MPS,<\/li>\n<li>a Generator to C to execute on the production server,<\/li>\n<li>and a Generator to Java to provide a preview for the insurance agent.<\/li>\n<\/ul>\n<p>With serialized Truffle <tt>Node<\/tt>s, we need only one interpreter:<\/p>\n<ul>\n<li>It runs out-of-the-box in MPS,<\/li>\n<li>works stand-alone through GraalVM\u2019s <a href=\"https:\/\/github.com\/oracle\/graal\/tree\/master\/substratevm\">ahead-of-time compiler Substrate VM<\/a>,<\/li>\n<li>and can be used on any JVM using the Truffle runtime.<\/li>\n<\/ul>\n<h2 
id=\"_part_v_crazy_ideas\">Part V: Crazy Ideas<\/h2>\n<h3 id=\"step-back\">Step 16: Step Back Debugger<\/h3>\n<p>By combining <a href=\"#instrumentation\">Instrumentation<\/a> and <a href=\"#debugger\">debugger<\/a>, it might be feasible to provide step-back debugging.<\/p>\n<p>In the interpreter, we know the complete global state of the program, and can store deltas (to reduce memory usage). For quite some DSLs, this might be sufficient to store every intermediate state and thus arbitrary debug movement.<\/p>\n<h3 id=\"side-step\">Step 17: Side Step Debugger<\/h3>\n<p>By <a href=\"#step-back\">stepping back<\/a> through our execution and following different execution paths, we could explore alternate outcomes. The different execution path might stem from other input values, or <a href=\"#translate-input-model\">hot code replacement<\/a>.<\/p>\n<h3 id=\"_step_18_explorative_simulations\">Step 18: Explorative Simulations<\/h3>\n<p>If we had a <a href=\"#side-step\">side step debugger<\/a>, nice support to <a href=\"#plumbing\">project interpretation results<\/a>, and a <a href=\"#migrate-framework\">really fast interpreter<\/a>, we could run explorative simulations on lots of different executions paths. 
This might enable <a href=\"https:\/\/www.youtube.com\/watch?v=EGqwXt90ZqA&amp;t=642s\">legendary interactive development<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A vision of how to combine MPS and GraalVM<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[29],"tags":[35,33,36,32,34],"class_list":["post-458","post","type-post","status-publish","format-standard","hentry","category-mps","tag-dsl","tag-graalvm","tag-interpreter","tag-jetbrains-mps","tag-truffle"],"_links":{"self":[{"href":"https:\/\/www.nikostotz.de\/blog\/wp-json\/wp\/v2\/posts\/458","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.nikostotz.de\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.nikostotz.de\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.nikostotz.de\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.nikostotz.de\/blog\/wp-json\/wp\/v2\/comments?post=458"}],"version-history":[{"count":6,"href":"https:\/\/www.nikostotz.de\/blog\/wp-json\/wp\/v2\/posts\/458\/revisions"}],"predecessor-version":[{"id":491,"href":"https:\/\/www.nikostotz.de\/blog\/wp-json\/wp\/v2\/posts\/458\/revisions\/491"}],"wp:attachment":[{"href":"https:\/\/www.nikostotz.de\/blog\/wp-json\/wp\/v2\/media?parent=458"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.nikostotz.de\/blog\/wp-json\/wp\/v2\/categories?post=458"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.nikostotz.de\/blog\/wp-json\/wp\/v2\/tags?post=458"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}