
Generating AST nodes

The core of a rewriter is a function that outputs code in the form of an AST. However, generating AST values directly with the constructors has some issues, in particular that the AST types change when a new OCaml version adds syntax.

This point is important: since ppxlib translates the AST to the newest OCaml AST available before applying rewriting, your PPX would become incompatible not only with the new OCaml version, but also with every ppxlib version released after the new AST type is introduced.

For this reason, ppxlib provides abstractions over the OCaml AST, with a focus on usability and stability.

The different options

The two main options are Ast_builder and Ppxlib_metaquot.

Ast_builder provides an API to generate AST nodes for the latest OCaml version in a backward-compatible way. Ppxlib_metaquot is different: it is a PPX that lets you generate OCaml AST nodes by writing OCaml code, using quoting and splicing.

Using Ppxlib_metaquot requires less knowledge of the OCaml AST than Ast_builder, as it only uses natural OCaml syntax. However, it is less general: it can only generate a few kinds of nodes, such as structure items, expressions and patterns. For instance, it cannot generate a value of type row_field_desc.
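To make the difference concrete, here is a minimal sketch that builds the expression 1 + 1 in both styles. It assumes the file is preprocessed with ppxlib.metaquot; the function names with_builder and with_metaquot are hypothetical and only serve the illustration.

```ocaml
open Ppxlib

(* Assumption: this file is preprocessed with ppxlib.metaquot. *)

(* [1 + 1] built with Ast_builder: every node is an explicit function call. *)
let with_builder ~loc : expression =
  let open Ast_builder.Default in
  eapply ~loc (evar ~loc "+") [ eint ~loc 1; eint ~loc 1 ]

(* The same expression with metaquot: the payload is ordinary OCaml syntax. *)
let with_metaquot ~loc : expression = [%expr 1 + 1]
```

Both functions take the location as a labelled argument, so they can be used interchangeably inside a rewriter.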

Note: The OCaml compiler API also provides a module, called Ast_helper, to manipulate the AST.

The Ast_builder module

General presentation

The Ast_builder module provides several kinds of functions to generate AST nodes. The first kind are functions whose names closely match their Parsetree equivalents. There are also "higher-level" wrappers around those basic blocks, for common patterns such as creating an integer or string constant.

Low-level builders

The function names match the Parsetree names closely, which makes it easy to build AST fragments by just knowing the Parsetree.

For types wrapped in the _desc field of a record, a helper is generated for each constructor; the helper also builds the record wrapper. For instance, for the type Parsetree.expression:

type expression =
  { pexp_desc       : expression_desc
  ; pexp_loc        : Location.t
  ; pexp_attributes : attributes
  }
and expression_desc =
  | Pexp_ident    of Longident.t loc
  | Pexp_constant of constant
  | Pexp_let      of rec_flag * value_binding list * expression
  ...

The following helpers are created:

val pexp_ident    : loc:Location.t -> Longident.t loc -> expression
val pexp_constant : loc:Location.t -> constant -> expression
val pexp_let      : loc:Location.t -> rec_flag -> value_binding list -> expression -> expression
...
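As an illustration, the following sketch combines a few of these low-level helpers to build the expression let x = y in x. The function name let_x_in_x is hypothetical; value_binding and ppat_var are the Ast_builder.Default helpers for the corresponding Parsetree types.

```ocaml
open Ppxlib

(* Builds the expression [let x = y in x] from low-level builders only.
   Each helper takes the constructor's arguments plus [~loc] and fills in
   the surrounding record. *)
let let_x_in_x ~loc : expression =
  let open Ast_builder.Default in
  let binding =
    value_binding ~loc
      ~pat:(ppat_var ~loc { txt = "x"; loc })
      ~expr:(pexp_ident ~loc { txt = Lident "y"; loc })
  in
  pexp_let ~loc Nonrecursive [ binding ]
    (pexp_ident ~loc { txt = Lident "x"; loc })
```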

For other record types, such as type_declaration, we have the following helper:

type type_declaration =
  { ptype_name       : string Located.t
  ; ptype_params     : (core_type * variance) list
  ; ptype_cstrs      : (core_type * core_type * Location.t) list
  ; ptype_kind       : type_kind
  ; ptype_private    : private_flag
  ; ptype_manifest   : core_type option
  ; ptype_attributes : attributes
  ; ptype_loc        : Location.t
  }

val type_declaration
  :  loc      : Location.t
  -> name     : string Located.t
  -> params   : (core_type * variance) list
  -> cstrs    : (core_type * core_type * Location.t) list
  -> kind     : type_kind
  -> private  : private_flag
  -> manifest : core_type option
  -> type_declaration

Attributes are always set to the empty list. If you want to set them you have to override the field with the { e with pexp_attributes = ... } notation.
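A minimal sketch of this override, assuming you want to attach an attribute with an empty payload (such as [@inline]) to an expression that was already built. The helper name add_attribute is hypothetical.

```ocaml
open Ppxlib

(* Hypothetical helper: attaches an attribute with an empty payload,
   e.g. [@inline], to an already built expression. *)
let add_attribute ~loc (name : string) (e : expression) : expression =
  let attr =
    Ast_builder.Default.attribute ~loc ~name:{ txt = name; loc }
      ~payload:(PStr [])
  in
  (* The builders leave [pexp_attributes] empty, so the field is overridden. *)
  { e with pexp_attributes = attr :: e.pexp_attributes }
```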

High-level builders

Those functions are just wrappers around the low-level functions that simplify the most common uses. For instance, creating the integer constant 1 with the low-level building blocks looks like this:

Ast_builder.Default.pexp_constant ~loc (Parsetree.Pconst_integer ("1", None))

This seems a lot for such a simple node. So, in addition to the low-level building blocks, Ast_builder provides higher-level building blocks, such as Ast_builder.Default.eint to create integer constants:

Ast_builder.Default.eint ~loc 1

Those functions also follow a naming pattern, to make them easier to use. Functions that generate an expression start with an e, followed by what they build, such as eint, echar, estring, eapply, elist... Similarly, functions whose names start with a p build a pattern, such as pstring, pconstruct, punit, ...
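For example, here is a short sketch using some of these high-level builders. The names mapped, yes_pattern and unit_pattern are hypothetical; evar is used here with a dotted path, which it parses as a long identifier.

```ocaml
open Ppxlib

(* Builds the expression [List.map succ [1; 2]]. *)
let mapped ~loc : expression =
  let open Ast_builder.Default in
  eapply ~loc (evar ~loc "List.map")
    [ evar ~loc "succ"; elist ~loc [ eint ~loc 1; eint ~loc 2 ] ]

(* Builds the string pattern ["yes"] and the unit pattern [()]. *)
let yes_pattern ~loc : pattern = Ast_builder.Default.pstring ~loc "yes"
let unit_pattern ~loc : pattern = Ast_builder.Default.punit ~loc
```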

Dealing with locations

As explained in the dedicated section, it is crucial to correctly deal with locations. For this, Ast_builder can be used in several ways, depending on the context:

Ast_builder.Default contains functions that take the location as a named argument. It is useful if you want fine-grained control over locations, or if the additional ~loc argument does not make the code too verbose.

If you want to specify the location once and for all, and use it for all later AST construction, you can use the Ast_builder.Make functor or the Ast_builder.make function (which returns a first-class module).
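The three styles can be sketched side by side. Each function below builds the application succ e; the function names are hypothetical.

```ocaml
open Ppxlib

(* Ast_builder.Default: the location is passed at every call. *)
let succ_default ~loc e =
  let open Ast_builder.Default in
  eapply ~loc (evar ~loc "succ") [ e ]

(* Ast_builder.Make: the location is fixed once by the functor argument. *)
let succ_functor ~loc e =
  let module B = Ast_builder.Make (struct
    let loc = loc
  end) in
  B.eapply (B.evar "succ") [ e ]

(* Ast_builder.make: same idea, with a first-class module. *)
let succ_first_class ~loc e =
  let (module B) = Ast_builder.make loc in
  B.eapply (B.evar "succ") [ e ]
```

The last two variants pay off when a function builds many nodes that all share one location.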

Compatibility

In order to stay as compatible as possible when a new option appears in the AST, Ast_builder always integrates the new option in a backward-compatible way. So, the signature of each function won't change, and Ast_builder will choose a backward-compatible way of generating the AST node of the updated type.

However, sometimes you might want to use a feature that was introduced recently in OCaml and is not integrated in Ast_builder. For instance, OCaml 4.14 introduced the possibility to explicitly introduce type variables in a constructor declaration. This modified the AST type. For backward compatibility, Ast_builder did not modify the signature of the corresponding function, so it cannot generate code that uses this new feature.

If you need to access a new feature, you can use the Latest submodule (e.g. Ast_builder.Default.Latest when specifying the locations). This module includes new functions that let you control all the features introduced, at the cost of potential breaking changes when a new feature modifies a function you are already using.

If a feature that was introduced in some recent version of OCaml is essential for your PPX to work, it might imply that you need to restrict the OCaml version on your opam dependencies: remember that ppxlib will rewrite using the latest Parsetree version, but it will then migrate the Parsetree back to the OCaml version of the switch, possibly losing the information given by the new feature.

Metaquot metaprogramming

General presentation

As you have seen, defining code with Ast_builder does not feel perfectly natural: some knowledge of the Parsetree types is needed. Yet, every part of a program we write corresponds to a specific AST node, so there is no need for AST generation to be more difficult than that.

Metaquot is a very useful PPX that lets you define values of a Parsetree type by writing natural code, using the quoting and splicing mechanism of metaprogramming.

Simplifying a bit, Metaquot rewrites an expression extension point directly with its payload. Since the payload was parsed by the OCaml parser to a value of Parsetree type, this rewriting turns naturally written code into AST values.

Usage

First, in order to use Metaquot, add it to the preprocess field of your dune stanza:

(preprocess (pps ppxlib.metaquot))

Using Metaquot to generate code is simple: any Metaquot extension node in an expression context will be rewritten into the Parsetree value that lies in its payload.

However, the Parsetree.payload of an extension node can only take a few forms: a structure, a signature, a core type or a pattern. We might want to generate other kinds of nodes, such as expressions or structure items. Ppxlib_metaquot provides different extension nodes for this:

Note that the replacement only works when the extension node is an "expression" extension node: indeed, the payload is rewritten into a value (of a Parsetree type) that would not fit elsewhere in the AST. So, let x : [%str "incoherent"] would not be rewritten by Metaquot. (Actually, it also rewrites "pattern" extension nodes, as you'll see in the chapter on matching AST nodes.)

Also note the : and ? in the sigi, type and pat extension nodes (as in [%sigi: ...], [%type: ...] and [%pat? ...]): they are needed for the payload to be parsed as the right kind of node.
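As a sketch, the following definitions use one Metaquot extension node per kind of generated value, with a type annotation showing which Parsetree type each one produces. It assumes the file is preprocessed with ppxlib.metaquot; the top-level names and the use of Location.none are only for the illustration.

```ocaml
open Ppxlib

(* Assumption: this file is preprocessed with ppxlib.metaquot. *)

(* Metaquot refers to a variable named [loc]; a dummy one is enough here. *)
let loc = Location.none

let an_expression : expression = [%expr 1 + 1]
let a_pattern : pattern = [%pat? ("", _)]
let a_core_type : core_type = [%type: int -> string]
let a_structure_item : structure_item = [%stri let a = 1]
let a_signature_item : signature_item = [%sigi: val i : int]

(* [%str ...] and [%sig: ...] produce whole structures and signatures. *)
let a_structure : structure = [%str let a = 1 let b = 2]
let a_signature : signature = [%sig: val i : int val j : int]
```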

Consider now the extension node [%expr 1 + 1], in an expression context. Metaquot will actually expand it into the following code:

{
  pexp_desc =
    (Pexp_apply
       ({
          pexp_desc = (Pexp_ident { txt = (Lident "+"); loc });
          pexp_loc = loc;
          pexp_attributes = []
        },
         [(Nolabel,
            {
              pexp_desc = (Pexp_constant (Pconst_integer ("1", None)));
              pexp_loc = loc;
              pexp_attributes = []
            });
         (Nolabel,
           {
             pexp_desc = (Pexp_constant (Pconst_integer ("1", None)));
             pexp_loc = loc;
             pexp_attributes = []
           })]));
  pexp_loc = loc;
  pexp_attributes = []
}

Looking at the example, you might notice two things: the Parsetree constructors and record fields are used without a module path, and every location is taken from a free variable named loc.

So, for this to compile, you need both to open Ppxlib and to have a loc : Location.t variable in scope. The produced AST node and every node within it will be located at this loc. You should therefore make sure that loc is the location you want for your generated code when using Metaquot.
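Two common ways to bring loc into scope can be sketched as follows, assuming the file is preprocessed with ppxlib.metaquot. The names one_plus_one and expand are hypothetical; expand is written in the shape of an extension expander that receives an expansion context.

```ocaml
open Ppxlib

(* Assumption: this file is preprocessed with ppxlib.metaquot. *)

(* [loc] is an ordinary labelled argument; the expansion of [%expr ...]
   refers to it by name, so it must be called exactly [loc]. *)
let one_plus_one ~loc : expression = [%expr 1 + 1]

(* In an extension expander, [loc] can be taken from the expansion context,
   so the generated code is located at the extension point. *)
let expand ~ctxt : expression =
  let loc = Expansion_context.Extension.extension_point_loc ctxt in
  one_plus_one ~loc
```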

Anti-quotations

Using these extensions alone, you can only produce constant (static) AST nodes. Metaquot has a solution for that: anti-quotation. You can use anti-quotation to insert any expression that evaluates to an AST node. That way, you can include dynamically generated nodes inside a Metaquot expression extension point.

Consider the following example:

let with_suffix_expr ~loc s =
  let dynamic_node = Ast_builder.Default.estring ~loc s in
  [%expr [%e dynamic_node] ^ "some_fixed_suffix"]

The with_suffix_expr function creates an expression that represents the concatenation of the s argument and the fixed suffix. That is, with_suffix_expr ~loc "some_dynamic_stem" is equivalent to [%expr "some_dynamic_stem" ^ "some_fixed_suffix"].

The syntax for anti-quotation depends on the type of the node you wish to insert (which must also correspond to the context of the anti-quotation extension node): for instance, [%e ...] for an expression, [%p ...] for a pattern and [%t ...] for a core type.

Note that if an anti-quote extension node is in the wrong context, it won't be rewritten by Metaquot. For instance, in [%expr match [] with [%e some_value] -> 1] the anti-quote extension node for expression is put in a pattern context, and it won't be rewritten.

Instead, use anti-quotes whose kind ([%e ...], [%p ...]) matches the context. For example:

let let_generator ~loc pat type_ expr =
  [%stri let [%p pat] : [%t type_] = [%e expr]]

Finally, note that since we are inserting values, we never use patterns in the payloads of anti-quotations. Those will be used for matching.
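As a usage sketch, the let_generator function can be fed with nodes that come from Ast_builder or from other Metaquot quotations. The version below takes the location as a labelled argument so that the block is self-contained, and it assumes the file is preprocessed with ppxlib.metaquot; answer_item is a hypothetical name.

```ocaml
open Ppxlib

(* Assumption: this file is preprocessed with ppxlib.metaquot. *)

let let_generator ~loc pat type_ expr =
  [%stri let [%p pat] : [%t type_] = [%e expr]]

(* Builds the structure item [let answer : int = 42]: the pattern and the
   expression come from Ast_builder, the type from another quotation. *)
let answer_item ~loc : structure_item =
  let open Ast_builder.Default in
  let_generator ~loc (pvar ~loc "answer") [%type: int] (eint ~loc 42)
```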
