PPX for plugin authors

This section describes how to use ppxlib for PPX plugin authors.

Getting started

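There are two main kinds of PPX plugins you can write with ppxlib: extension rewriters, i.e. PPX plugins that rewrite extension points such as [%my_ext ...] into valid OCaml code, and derivers, i.e. PPX plugins that generate code from declarations (such as type declarations) annotated with [@@deriving my_deriver].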

It is also possible to write more advanced transformations, such as rewriting constants that bear the right suffix, rewriting function calls based on the function identifier, or generating code from items annotated with a custom attribute, but we won't cover those in this section.

ppxlib compiles those transformations into rules, which allows it to apply them to the right AST nodes, even recursively in nodes generated by other transformations, in a single AST traversal.

Note that you can also write arbitrary, whole-AST transformations with ppxlib, but they don't have clear composition semantics since they have to be applied sequentially, as opposed to the better-defined rewriting rules. You should always prefer the transformations mentioned above when possible.

The OCaml AST

As described in ppx-overview, PPX rewriters don't operate at the text level but instead use the compiler's internal representation of the source code: the Abstract Syntax Tree or AST.

Many of the following sections of the manual assume a certain level of familiarity with the OCaml AST, so we'll try to cover the basics here and give you some pointers to deepen your knowledge of the subject.

The types describing the AST are defined in the Parsetree module of OCaml's compiler-libs. Note that they vary from one version of the compiler to another, so make sure you look at an up-to-date version and, most importantly, at the one corresponding to what ppxlib uses internally. The module's API documentation is available online. If you're new to these parts of OCaml, it's not always easy to navigate, as it only contains the raw type declarations and no actual documentation. That documentation is written in parsetree.mli, but unfortunately not in a way that allows it to make its way to the online doc. Until this is fixed in the compiler, you can look at the local copy in one of your opam switches: <path-to-opam-switch>/lib/ocaml/compiler-libs/parsetree.mli. There you'll find detailed explanations as to which part of the concrete syntax the various types correspond to.

Ppxlib includes a Parsetree module for every version of OCaml since 4.02. For instance, the version for 4.05 is in Astlib.Ast_405.Parsetree. In what comes next, we will link the values we describe to the Ppxlib.Parsetree module, which corresponds to one version of Parsetree.

Parsetree is quite a large module and there are plenty of types there, a lot of which you don't necessarily have to know when writing a rewriter. The two main entry points are the structure and signature types which, amongst other things, describe the content of .ml and .mli files respectively. Other types you should be familiar with are expression, pattern, core_type, structure_item and signature_item.

Knowing what these types correspond to puts you in a good position to write a PPX plugin as they are the parts of the AST you will deal with the most in general.
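To make these types a little more concrete, here is a minimal sketch that parses a source string into an expression node and inspects its top-level constructor. The describe function is a hypothetical helper written for this illustration. The sketch assumes Ppxlib's Parse and Parsetree constructors as exposed by opening Ppxlib, and the exact set of constructors may differ between AST versions.

```ocaml
open Ppxlib

(* Hypothetical helper: give a rough description of the top-level shape
   of an expression node by looking at its pexp_desc field. *)
let describe (e : expression) : string =
  match e.pexp_desc with
  | Pexp_constant _ -> "constant"
  | Pexp_ident { txt = Lident name; _ } -> "identifier " ^ name
  | Pexp_apply (_, args) ->
      Printf.sprintf "application to %d argument(s)" (List.length args)
  | _ -> "other expression"

let () =
  (* "1 + 1" is the application of the identifier + to two arguments. *)
  let e = Parse.expression (Lexing.from_string "1 + 1") in
  print_endline (describe e)
(* prints: application to 2 argument(s) *)
```

Matching on the _desc field of a node, as done here, is the usual way to find out which syntactic construct a node represents.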

Writing an extension rewriter

To write your ppx plugin you'll need to add the following stanza in your dune file:

(library
 (public_name my_ppx_rewriter)
 (kind ppx_rewriter)
 (libraries ppxlib))

You'll note that you have to let dune know this is not a regular library but a ppx_rewriter, using the kind field. The public name you choose here is the name your users will use to refer to your ppx in their preprocess field. E.g. here, to use this ppx rewriter, one would add (preprocess (pps my_ppx_rewriter)) to their library or executable stanza.

You will also need the following my_ppx_rewriter.ml:

open Ppxlib

let expand ~ctxt payload =
  ...

let my_extension =
  Extension.V3.declare
    "my_ext"
    <extension_context>
    <ast_pattern>
    expand

let rule = Ppxlib.Context_free.Rule.extension my_extension

let () =
  Driver.register_transformation
    ~rules:[rule]
    "my_ext"

There are a few things to explain here. The last part, i.e. the call to Driver.register_transformation, is common to almost all ppxlib-based PPX plugins and is how you let ppxlib know about your transformation. You'll note that here we register a single rule, but it is possible to register several rules for a single logical transformation.

The rest of the code is specific to extension rewriters: you need to declare a ppxlib Extension. The first argument is the extension name, which is what appears after the % in the extension point when using your rewriter, e.g. here this will transform [%my_ext ...] nodes. The <extension_context> argument describes where in OCaml code your extension can be used. You can find the full list in the API documentation of the Extension.Context module. The <ast_pattern> argument helps you restrict what users can put into the payload of your extension, i.e. [%my_ext <what goes there!>]. We cover Ast_pattern in depth elsewhere, but the simplest form it can take is Ast_pattern.__, which allows any payload allowed by the language and passes it to the expand function, the last argument here.

The expand function is where the logic for your transformation is implemented. It receives an Expansion_context.Extension.t argument labelled ctxt and other arguments whose type and number depend on the <ast_pattern> argument. The return type of the function is determined by the <extension_context> argument, e.g. in the following example:

Extension.V3.declare "my_ext" Extension.Context.expression Ast_pattern.__ expand

The type of the expand function is:

val expand : ctxt: Expansion_context.Extension.t -> payload -> expression

If you want to look at a concrete example of an extension rewriter, you can find one in the examples/ folder of the ppxlib repository.
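As an illustration of how the pieces fit together, here is a minimal sketch of a complete rewriter that turns [%my_ext "some string"] into the upper-cased string literal. The behaviour and the upper_expr helper are hypothetical choices made for this example. It assumes an expression context and a payload restricted to a single string literal through Ast_pattern.

```ocaml
open Ppxlib

(* Hypothetical helper: build the string literal node for the
   upper-cased version of s. *)
let upper_expr ~loc (s : string) : expression =
  Ast_builder.Default.estring ~loc (String.uppercase_ascii s)

(* The pattern below captures one string, so expand receives one
   string argument after the labelled ctxt. *)
let expand ~ctxt (s : string) : expression =
  let loc = Expansion_context.Extension.extension_point_loc ctxt in
  upper_expr ~loc s

let my_extension =
  Extension.V3.declare "my_ext" Extension.Context.expression
    (* Only accept a payload made of a single string literal. *)
    Ast_pattern.(single_expr_payload (estring __))
    expand

let rule = Context_free.Rule.extension my_extension
let () = Driver.register_transformation ~rules:[ rule ] "my_ext"
```

Because the pattern is more specific than Ast_pattern.__, a payload that is not a string literal is rejected by ppxlib before expand is ever called.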

Writing a deriver

Similarly to extension rewriters, derivers must be declared as such to dune. To do so you can use the following stanza in your dune file:

(library
 (public_name my_ppx_deriver)
 (kind ppx_deriver)
 (libraries ppxlib))

Same as above, the public name used here determines how users will refer to your ppx deriver in their dune stanzas.

You will also need the following my_ppx_deriver.ml:

open Ppxlib

let generate_impl ~ctxt (rec_flag, type_declarations) =
  ...

let generate_intf ~ctxt (rec_flag, type_declarations) =
  ...

let impl_generator = Deriving.Generator.V2.make_noarg generate_impl

let intf_generator = Deriving.Generator.V2.make_noarg generate_intf

let my_deriver =
  Deriving.add
    "my_deriver"
    ~str_type_decl:impl_generator
    ~sig_type_decl:intf_generator

The call to Deriving.add is how you let ppxlib know about your deriver. The first string argument is the name of the deriver as referred to by your users: in the above example, one would add a [@@deriving my_deriver] annotation to use this plugin. Here our deriver can be used on type declarations, be it in structures or signatures (i.e. implementations or interfaces, .ml or .mli).

To add a deriver you first have to define a generator. You need one for each node you want to derive code from. Here we just need one for type declarations in structures and one for type declarations in signatures. To do that you need the Deriving.Generator.V2.make_noarg constructor. You'll note that there exists a Deriving.Generator.V2.make variant if you wish to allow passing arguments to your deriver, but to keep this tutorial simple we won't cover it here.

The only mandatory argument to the constructor is a function which takes a labelled Expansion_context.Deriving.t and an 'input_ast and returns an 'output_ast, which gives us a ('output_ast, 'input_ast) Deriving.Generator.t. Much like the expand function described in the section about extension rewriters, this function is where the actual implementation of your deriver lives. The str_type_decl argument of Deriving.add expects a (structure, rec_flag * type_declaration list) Generator.t, so our generate_impl function must take a pair (rec_flag, type_declaration list) and return a structure, i.e. a structure_item list, for instance a list of function or module declarations. The same goes for the generate_intf function, except that it must return a signature.

It is often the case that a deriver has a generator for both the structure and signature variants of a node. That allows users to generate the signature corresponding to the code generated by the deriver in their .ml files instead of having to type it and maintain it themselves.

If you want to look at a concrete example of a deriver, you can find one in the examples/ folder of the ppxlib repository.
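To give an idea of what the generator functions can look like, here is a minimal sketch of a deriver that, for each annotated type t, generates let t_name = "t" in structures and val t_name : string in signatures. What it generates, and the name_item and name_sig_item helpers, are hypothetical choices made for this example. Nodes are built with Ast_builder.Default so that the sketch does not depend on metaquot.

```ocaml
open Ppxlib

(* Hypothetical helper: build the item [let <name>_name = "<name>"]. *)
let name_item ~loc (name : string) : structure_item =
  let open Ast_builder.Default in
  pstr_value ~loc Nonrecursive
    [ value_binding ~loc
        ~pat:(pvar ~loc (name ^ "_name"))
        ~expr:(estring ~loc name) ]

(* Hypothetical helper: build the item [val <name>_name : string]. *)
let name_sig_item ~loc (name : string) : signature_item =
  let open Ast_builder.Default in
  psig_value ~loc
    (value_description ~loc
       ~name:{ txt = name ^ "_name"; loc }
       ~type_:(ptyp_constr ~loc { txt = Lident "string"; loc } [])
       ~prim:[])

(* One generated item per type declaration in the annotated group. *)
let generate_impl ~ctxt (_rec_flag, type_declarations) =
  let loc = Expansion_context.Deriving.derived_item_loc ctxt in
  List.map
    (fun (td : type_declaration) -> name_item ~loc td.ptype_name.txt)
    type_declarations

let generate_intf ~ctxt (_rec_flag, type_declarations) =
  let loc = Expansion_context.Deriving.derived_item_loc ctxt in
  List.map
    (fun (td : type_declaration) -> name_sig_item ~loc td.ptype_name.txt)
    type_declarations

let my_deriver =
  Deriving.add "my_deriver"
    ~str_type_decl:(Deriving.Generator.V2.make_noarg generate_impl)
    ~sig_type_decl:(Deriving.Generator.V2.make_noarg generate_intf)
```

With this sketch, type foo = int [@@deriving my_deriver] in a .ml file would be followed by a generated let foo_name = "foo".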

Metaquot

metaquot is a PPX plugin that helps you write PPX plugins. It lets you write AST node values using the actual corresponding OCaml syntax instead of building them with the more verbose AST types or Ast_builder.
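For comparison, here is a minimal sketch of what building the node for 1 + 1 looks like with Ast_builder alone, i.e. without metaquot. The one_plus_one name is hypothetical. The sketch assumes the eapply, evar and eint helpers from Ast_builder.Default.

```ocaml
open Ppxlib

(* Build the AST node for [1 + 1] by hand: the identifier + applied
   to two integer constants, all located at loc. *)
let one_plus_one ~loc : expression =
  let open Ast_builder.Default in
  eapply ~loc (evar ~loc "+") [ eint ~loc 1; eint ~loc 1 ]

let () =
  (* Print the node back as source code to check what was built. *)
  print_endline
    (Pprintast.string_of_expression (one_plus_one ~loc:Location.none))
```

This stays manageable for small nodes but quickly becomes hard to read for larger ones, which is where writing the code in plain OCaml syntax helps.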

To use metaquot you need to add it to the list of preprocessors for your PPX plugin:

(library
 (name my_plugin_lib)
 (preprocess (pps ppxlib.metaquot)))

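metaquot can be used both to write expressions of some of the AST types and to write patterns to match over those same types. The various extensions it exposes can be used in both contexts, expressions or patterns.

The extension you should use depends on the type of AST node you're trying to write or to pattern-match over. You can use the following extensions with the following syntax: expr for Parsetree.expression, e.g. [%expr 1 + 1]; pat for Parsetree.pattern, e.g. [%pat? ("", _)]; type for Parsetree.core_type, e.g. [%type: int -> string]; stri for Parsetree.structure_item, e.g. [%stri let a = 1]; and sigi for Parsetree.signature_item, e.g. [%sigi: val i : int]. The str and sig extensions, for Parsetree.structure and Parsetree.signature respectively, use the same syntax as their _item counterparts.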

If you consider the first example [%expr 1 + 1], in an expression context, metaquot will actually expand it into:

{
  pexp_desc =
    (Pexp_apply
       ({
          pexp_desc = (Pexp_ident { txt = (Lident "+"); loc });
          pexp_loc = loc;
          pexp_attributes = []
        },
         [(Nolabel,
            {
              pexp_desc = (Pexp_constant (Pconst_integer ("1", None)));
              pexp_loc = loc;
              pexp_attributes = []
            });
         (Nolabel,
           {
             pexp_desc = (Pexp_constant (Pconst_integer ("1", None)));
             pexp_loc = loc;
             pexp_attributes = []
           })]));
  pexp_loc = loc;
  pexp_attributes = []
}

For this to compile you need the AST types to be in scope, so you should always use metaquot where Ppxlib is opened. You'll also note that the generated node expects a loc : Location.t value to be available. The produced AST node value and every other node within it will be located at loc. You should make sure loc is the location you want for your generated code when using metaquot.
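In practice this usually means taking loc as a function argument or binding it just before the extension point. The following is a minimal sketch of both. It assumes the file is preprocessed with ppxlib.metaquot as shown in the dune stanza above, and the sum_node and expand names are hypothetical.

```ocaml
open Ppxlib

(* loc is a plain function parameter here: metaquot only requires that
   a value named loc of type Location.t is in scope. *)
let sum_node ~(loc : Location.t) : expression = [%expr 1 + 1]

(* In an expander, one possible choice is the location of the extension
   point being rewritten, taken from the expansion context. *)
let expand ~ctxt (_payload : payload) : expression =
  let loc = Expansion_context.Extension.extension_point_loc ctxt in
  sum_node ~loc
```

Forgetting to bind loc typically shows up as an unbound value error pointing inside the metaquot extension.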

When using the pattern extension, it will produce a pattern that matches no matter what the location and attributes are. For the previous example for instance, it will produce the following pattern:

{
  pexp_desc =
    (Pexp_apply
       ({
          pexp_desc = (Pexp_ident { txt = (Lident "+"); loc = _ });
          pexp_loc = _;
          pexp_attributes = _
        },
         [(Nolabel,
            {
              pexp_desc = (Pexp_constant (Pconst_integer ("1", None)));
              pexp_loc = _;
              pexp_attributes = _
            });
         (Nolabel,
           {
             pexp_desc = (Pexp_constant (Pconst_integer ("1", None)));
             pexp_loc = _;
             pexp_attributes = _
           })]));
  pexp_loc = _;
  pexp_attributes = _
}

Using these extensions alone, you can only produce constant/static AST nodes. You can't bind variables in the generated patterns either. metaquot has a solution for that as well: anti-quotation. You can use anti-quotation to insert any expression or pattern representing an AST node. That way you can include dynamically generated nodes inside a metaquot expression extension point or use a wildcard or variable pattern in a pattern extension.

Consider the following example:

let with_suffix_expr ~loc s =
  let dynamic_node = Ast_builder.Default.estring ~loc s in
  [%expr [%e dynamic_node] ^ "some_fixed_suffix"]

The with_suffix_expr function will create an expression which is the concatenation of the s argument and the fixed suffix. I.e. with_suffix_expr ~loc "some_dynamic_stem" is equivalent to [%expr "some_dynamic_stem" ^ "some_fixed_suffix"].

Similarly if you want to ignore some parts of AST nodes and extract some others when pattern-matching over them, you can use anti-quotation:

match some_expr_node with
| [%expr 1 + [%e? _] + [%e? third]] -> do_something_with third
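The match above can be wrapped into a complete function as in the following minimal sketch. It assumes the file is preprocessed with ppxlib.metaquot. The third_operand name and the choice to return an option are hypothetical, and the catch-all case is added so that the match is exhaustive.

```ocaml
open Ppxlib

(* Return the last operand of an expression of the form 1 + _ + e,
   or None if the expression does not have that shape. *)
let third_operand (e : expression) : expression option =
  match e with
  | [%expr 1 + [%e? _] + [%e? third]] -> Some third
  | _ -> None

let () =
  (* "1 + 2 + 3" parses as (1 + 2) + 3, which has the expected shape. *)
  let e = Parse.expression (Lexing.from_string "1 + 2 + 3") in
  match third_operand e with
  | Some third -> print_endline (Pprintast.string_of_expression third)
  | None -> print_endline "no match"
```

Since metaquot patterns ignore locations and attributes, the function matches regardless of where the expression comes from in the source.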

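The syntax for anti-quotation depends on the type of the node you wish to insert: e for an expression, e.g. [%expr 1 + [%e some_expr_node]]; p for a pattern, e.g. [%pat? (1, [%p some_pat_node])]; t for a core_type, e.g. [%type: int -> [%t some_core_type_node]]; m for a module_expr or module_type, e.g. [%expr let module M = [%m some_module_expr_node] in M.x]; and i for a structure_item or signature_item, e.g. [%str let a = 1 [%%i some_structure_item_node]].

Note that when anti-quoting in a pattern context, you must always use the ? in the anti-quotation extension, as its payload must always be a pattern, the same way it must always be an expression in an expression context.

As you may have noticed, you can anti-quote expressions whose type differs from the type of the whole metaquot extension point. E.g. you can write: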

let structure_item =
  [%stri let [%p some_pat] : [%t some_type] = [%e some_expr]]