odoc is a CLI tool that creates API documentation for OCaml projects. However, it operates at a rather low level, taking individual files through several distinct phases until the HTML output is generated.
For this reason, just like when building any multi-file OCaml project, odoc needs to be driven by a higher-level tool. The driver takes care of calling the odoc command with the right arguments throughout the different phases. Several drivers for odoc exist, such as dune and odig.
The odoc tool also contains a "reference driver" that is kept up to date with the latest developments of odoc.
This document explains how to drive odoc, as of version 3. You do not need to know any of this to use odoc: it is aimed at driver authors, authors of tools that interact with odoc, and any curious passerby. It covers several subjects:
- the odoc pipeline,
- the convention for opam-installed packages.

In addition to the documentation, the reference driver is a good tool to understand how to build odoc projects. It can be useful to look at the implementation code, but it can also help to simply look at all invocations of odoc during a run of the driver.
In its third major version, odoc has been improved so that the same documentation can work in multiple scenarios, from local switches to big monorepos or the ocaml.org documentation hub for all packages, without anything breaking, especially references.
The idea is that we have named groups of documentation, which we call clusters here. There are two kinds of them: page clusters and module clusters. Inside a cluster, everything is managed by odoc. Outside of the clusters, the driver is free to arrange them however it likes. In order to reference another cluster, a documentation author can use the name of the cluster in the reference.
Different situations will give different meanings to the clusters. In the case of opam packages, though, there is a natural meaning to give to those clusters (you'll find more details in the convention for opam-installed packages). Any opam package will have an associated "documentation cluster", named after the package. Any of its libraries will have an associated "module cluster", named after the library. Another package can thus refer to the docs using the package name, or to any of its libraries using the library name, no matter where the package is located in the hierarchy.
Just like compiling OCaml modules, generating docs for these modules needs to happen in a specific order, as some information needed to generate the docs for a file might reside in another one. However, odoc actually allows a particular file to reference a module that depends on it, seemingly creating a circular dependency.
This circular dependency problem is one of the reasons we have several phases in odoc. Let's review them:
- The compile phase, which is used to create the .odoc artifacts from .cm{i;t;ti} and .mld files. This is where odoc does similar work to that of OCaml, computing expansions for each module type. The dependencies between units are the same as the ones for the .cm{i;t;ti} input.
- The link phase transforms the .odoc artifacts into .odocl files. The main result of this phase is to resolve odoc references, which also has an effect on canonical modules.
- The indexing phase generates .odoc-index files from sets of .odocl files. These index files will be used both for generating a global sidebar, and for generating a search index.
- The generation phase takes the .odocl and .odoc-index files and turns them into either HTML, man pages or LaTeX files.

The compile phase takes as input a set of .cm{i;t;ti} as well as .mld files, and builds a directory hierarchy of .odoc files.
There are distinct commands for this phase: odoc compile for interfaces and pages, odoc compile-impl for implementations, and odoc compile-asset for assets.
Let's have a look at a generic invocation of odoc during the compile phase:
$ odoc compile --output-dir <od> --parent-id <pid> -I <dir1> -I <dir2> <input-file>.<ext>

- <input-file>.<ext> is the input file, either a .cm{i;t;ti} file or an .mld file. Prefer .cmti files over the other formats!
- --output-dir <od> specifies the directory that will contain all the .odoc files. This directory has to be fully managed by odoc and should not be modified by another tool! The location of the output file inside it depends on the --parent-id option.
- --parent-id <pid> places the output .odoc file in the documentation hierarchy. It is a /-separated sequence of non-empty strings (used as directory names). This "path" determines where the .odoc file will be located below the <od> output directory. The name of the output file is <input-file>.odoc for modules, and page-<input-file>.odoc for pages. Documentation artifacts that belong to the same unit of documentation need to share a common root in their parent id.
- -I <dir> adds a directory to the search path for other .odoc files. It can be given as many times as necessary, and must give access to every .odoc file generated from a .cm{i;t;ti} listed when calling odoc compile-deps on the input file. Each <dir> is a directory under the <od> directory, computed from the --parent-id option given to a previous call to odoc compile.

A concrete example of such a command would be:
$ odoc compile ~/.opam/5.2.0/lib/ppxlib/ppxlib__Extension.cmti --output-dir _odoc/ -I _odoc/ocaml-base-compiler/lib/compiler-libs.common -I _odoc/ocaml-base-compiler/lib/stdlib -I _odoc/ocaml-compiler-libs/lib/ocaml-compiler-libs.common -I _odoc/ppxlib/lib/ppxlib -I _odoc/ppxlib/lib/ppxlib.ast -I _odoc/ppxlib/lib/ppxlib.astlib -I _odoc/ppxlib/lib/ppxlib.stdppx -I _odoc/ppxlib/lib/ppxlib.traverse_builtins -I _odoc/sexplib0/lib/sexplib0 --parent-id ppxlib/lib/ppxlib

A compile-impl command is pretty similar:
$ odoc compile-impl --output-dir <od> --source-id <sid> --parent-id <pid> -I <dir1> -I <dir2> <input-file>.cmt

- <input-file>.cmt is the input file; it has to be a .cmt file.
- --output-dir <od> has the same meaning as for odoc compile.
- --parent-id <pid> also has the same meaning as for odoc compile. However, the name of the output file is impl-<input-file>.odoc. Implementations need to be available through the -I search path, so it is very likely that one wants the implementation and interface .odoc files to share the same parent id.
- -I <dir> also adds a directory to the search path for other .odoc files.
- --source-id <sid> is a new argument specific to compile-impl. It corresponds to the location of the rendering of the source, which is required to generate links to it.

A concrete example of such a command would be:
$ odoc compile-impl ~/.opam/5.2.0/lib/ppxlib/ppxlib__Spellcheck.cmt --output-dir _odoc/ -I _odoc/ocaml-base-compiler/lib/compiler-libs.common -I _odoc/ocaml-base-compiler/lib/stdlib -I _odoc/ocaml-compiler-libs/lib/ocaml-compiler-libs.common -I _odoc/ppxlib/lib/ppxlib -I _odoc/ppxlib/lib/ppxlib.ast -I _odoc/ppxlib/lib/ppxlib.astlib -I _odoc/ppxlib/lib/ppxlib.stdppx -I _odoc/sexplib0/lib/sexplib0 --enable-missing-root-warning --parent-id ppxlib/lib/ppxlib --source-id ppxlib/src/ppxlib/spellcheck.ml

Assets are given during the generation phase. But we still need to create an .odoc file for them, for odoc's resolution mechanism.
$ odoc compile-asset --output-dir <od> --parent-id <pid> --name <assetname>

- --output-dir and --parent-id have the same meaning as for the compile and compile-impl commands,
- --name <assetname> gives the asset name,
- the output file is <output-dir>/<parent-id>/asset-<assetname>.odoc.

The link phase requires the directory produced by the compile phase in order to generate its set of .odocl files. This phase resolves references and canonicals.
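Since the link phase has to locate the artifacts written by the three compile commands, it can help to recap their naming rules in code. The following is a minimal shell sketch, not part of odoc: the function name odoc_artifact and the kind labels (module, page, impl, asset) are made up for this illustration, and the sketch simply applies the rules stated above.

```sh
# Sketch: predict where the compile phase writes its .odoc artifact.
# usage: odoc_artifact <kind> <output-dir> <parent-id> <input-file-or-asset-name>
# <kind> is one of module, page, impl, asset (labels invented for this sketch).
odoc_artifact() {
  kind=$1 od=${2%/} pid=$3 input=$4
  base=${input##*/}                     # drop the directory part of the input
  case $kind in
    module) name=${base%.*} ;;          # <input-file>.odoc
    page)   name=page-${base%.*} ;;     # page-<input-file>.odoc
    impl)   name=impl-${base%.*} ;;     # impl-<input-file>.odoc
    asset)  name=asset-$base ;;         # asset-<assetname>.odoc
    *) echo "unknown kind: $kind" >&2; return 1 ;;
  esac
  printf '%s/%s/%s.odoc\n' "$od" "$pid" "$name"
}
```

For the concrete compile example above, `odoc_artifact module _odoc/ ppxlib/lib/ppxlib ppxlib__Extension.cmti` prints `_odoc/ppxlib/lib/ppxlib/ppxlib__Extension.odoc`, which is the file later given to odoc link.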
A generic link command is:
$ odoc link
    -I <dir1> -I <dir2>
    -P <pname1>:<pdir1> -P <pname2>:<pdir2>
    -L <lname1>:<ldir1> -L <lname2>:<ldir2>
    <path/to/file.odoc>

- <path/to/file.odoc> is the input .odoc file. The result of this command is path/to/file.odocl. This path was determined by the --output-dir and --parent-id options given during the compile phase, and it is important for the indexing phase that the file stays in the same location.
- -P <name>:<dir> lists a "page cluster", used to resolve references such as {!/ocamlfind/index}.
- -L <name>:<dir> lists a "module cluster", used to resolve references such as {!/findlib.dynload/Fl_dynload}. This also adds <dir> to the search path.
- -I <dir> adds <dir> to the search path. The search path is used to resolve references that do not use the cluster mechanism, such as {!Module} and {!page-pagename}.

The indexing phase refers to the "crunching" of information split across several .odocl files. Currently, there are two use cases for this phase: generating a global sidebar and generating a search index.
The occurrence-counting step counts the number of occurrences of each value, type, ... in the implementations, and stores them in a big table. A generic invocation is:
$ odoc count-occurrences <dir1> <dir2> -o <path/to/name.odoc-occurrences>

An example of such a command:

$ odoc count-occurrences _odoc/ -o _odoc/occurrences-all.odoc-occurrences

The odoc compile-index command produces an .odoc-index file from .odocl files, other .odoc-index files, and possibly some .odoc-occurrences files.
To create an index for the page and documentation units, we use the -P and -L arguments.
$ odoc compile-index -o path/to/<indexname>.odoc-index -P <pname1>:<ppath1> -P <pname2>:<ppath2> -L <lname1>:<lpath1> -L <lname2>:<lpath2> --occurrences <path/to/name.odoc-occurrences>

An example of such a command:

$ odoc compile-index -o _odoc/ppxlib/index.odoc-index -P ppxlib:_odoc/ppxlib/doc -L ppxlib:_odoc/ppxlib/lib/ppxlib -L ppxlib.ast:_odoc/ppxlib/lib/ppxlib.ast -L ppxlib.astlib:_odoc/ppxlib/lib/ppxlib.astlib -L ppxlib.metaquot:_odoc/ppxlib/lib/ppxlib.metaquot -L ppxlib.metaquot_lifters:_odoc/ppxlib/lib/ppxlib.metaquot_lifters -L ppxlib.print_diff:_odoc/ppxlib/lib/ppxlib.print_diff -L ppxlib.runner:_odoc/ppxlib/lib/ppxlib.runner -L ppxlib.runner_as_ppx:_odoc/ppxlib/lib/ppxlib.runner_as_ppx -L ppxlib.stdppx:_odoc/ppxlib/lib/ppxlib.stdppx -L ppxlib.traverse:_odoc/ppxlib/lib/ppxlib.traverse -L ppxlib.traverse_builtins:_odoc/ppxlib/lib/ppxlib.traverse_builtins --occurrences _odoc/occurrences-all.odoc-occurrences

The generation phase takes all the information computed in the previous phases and actually generates the documentation. The output can take the form of HTML, LaTeX or man pages, although HTML is currently the odoc backend that supports the most features (such as images, videos, ...).
In this manual, we describe the situation for generating HTML. Usually, generating for another backend boils down to replacing html-generate with latex-generate or man-generate; refer to the man page for the diverging options.
Given an .odocl file, odoc might generate a single .html file, or a complete directory of .html files. The --output-dir option specifies the root directory for that output.
odoc provides a way to plug in a JavaScript file containing the code that answers users' queries. In order to never block the UI, this file will be loaded in a web worker to perform searches:
- The file has to be placed under the --output-dir directory; the driver can decide where.

A generic html-generate command for interfaces has the following form:
$ odoc html-generate
    --output-dir <odir>
    --index <path/to/file.odoc-index>
    --search-uri <relative/to/output-dir/file.js> --search-uri <relative/to/output-dir/file2.js>
    <path/to/file.odocl>

- --output-dir <odir> specifies the root output directory for the generated .html files.
- --index <path/to/file.odoc-index> is given to odoc for sidebar generation.
- --search-uri <relative/to/output-dir/file.js> tells odoc which file(s) to load in a web worker.

The output directory or file can be computed from this command's --output-dir, the initial --parent-id given when creating the .odoc file, as well as the unit name. In the case of a module, the output is a directory named with the name of the module. In the case of a page, the output is a file with the name of the page and the .html extension.
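A driver that wants to predict the output location can apply this rule directly. The sketch below is only an illustration: odoc_html_output is a hypothetical helper name, and it assumes that the output sits at <odir>/<parent-id>/<unit name>, which is one reading of the rule above rather than a documented guarantee.

```sh
# Sketch: predict the HTML output location for a module or a page.
# usage: odoc_html_output <kind> <html-output-dir> <parent-id> <unit-name>
# Assumption: the output is located at <odir>/<parent-id>/<unit-name>.
odoc_html_output() {
  kind=$1 odir=${2%/} pid=$3 unit=$4
  case $kind in
    module) printf '%s/%s/%s/\n' "$odir" "$pid" "$unit" ;;       # a directory
    page)   printf '%s/%s/%s.html\n' "$odir" "$pid" "$unit" ;;   # a single file
    *) echo "unknown kind: $kind" >&2; return 1 ;;
  esac
}
```

Under that assumption, `odoc_html_output page _html/ ppxlib/doc index` prints `_html/ppxlib/doc/index.html`.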
An example of an html-generate command is:
$ odoc html-generate _odoc/ppxlib/doc/page-index.odocl --index _odoc/ppxlib/index.odoc-index --search-uri ppxlib/sherlodoc_db.js --search-uri sherlodoc.js -o _html/

A generic html-generate-source command, used to render source code, has the following form:

$ odoc html-generate-source --output-dir <odir> --impl <path/to/impl-file.odocl> <path/to/source/file.ml>

- --output-dir <odir> has been covered already.
- --impl <path/to/impl-file.odocl> gives the implementation unit.
- <path/to/source/file.ml> is the source file.

The output file can be computed from this command's --output-dir, and the initial --source-id and --name given when creating the impl-*.odoc file.
An example of such a command is:

$ odoc html-generate-source --impl _odoc/ppxlib/lib/ppxlib/impl-ppxlib__Reconcile.odocl /home/panglesd/.opam/5.2.0/lib/ppxlib/reconcile.ml -o _html/

This is the phase where we pass the actual asset. We pass it as a positional argument, and give the asset unit using --asset-unit.

$ odoc html-generate-asset --output-dir <odir> --asset-unit <path/to/asset-file.odocl> <path/to/asset/file.ext>

In order to build the documentation for installed packages, the driver needs to give a meaning to several of the concepts above. In particular, it needs to define the page and library clusters, to know where to find the pages and assets and what id to give them, and to know, when linking, which clusters an artifact may be linking to.
So that the different drivers and installed packages play well together, we define here a convention for building installed packages. If both the package and the driver follow it, building the docs should go well!
The -P and -L clusters, and their root ids

Each package defines a set of clusters, each of them having a root id. These roots are used in --parent-id and in -P and -L.
The driver can choose any set of mutually disjoint roots without causing problems for reference resolution. For instance, both -P pkg:<output_dir>/pkg/doc and -P pkg:<output_dir>/pkg/version/doc are acceptable. However, we define here "canonical" roots:
Each installed package <p> defines a single page root id: <p>/doc.
For each package <p>, each library <l> defines a library root id: <p>/lib/<l>.
For instance, a package foo with two libraries: foo and foo.bar will define three clusters:
- A page cluster named foo, with root id foo/doc. When referred to from other clusters, a -P foo:<odoc_dir>/foo/doc argument needs to be added at the link phase.
- A module cluster named foo, with root id foo/lib/foo. When referred to from other clusters, a -L foo:<odoc_dir>/foo/lib/foo argument needs to be added at the link phase.
- A module cluster named foo.bar, with root id foo/lib/foo.bar. When referred to from other clusters, a -L foo.bar:<odoc_dir>/foo/lib/foo.bar argument needs to be added at the link phase.

Installed opam packages need to specify which clusters they may be referencing during the link phase, so that the proper -P and -L arguments are added. (Note that these dependencies can be circular without problem, as they happen during the link phase and only require the artifacts from the compile phase.)
An installed package <p> specifies its cluster dependencies in a file at <opam root>/doc/<p>/odoc-config.sexp. This file contains s-expressions.
Stanzas of the form (packages p1 p2 ...) specify that page clusters p1, p2, ..., should be added using the -P argument: with the canonical roots, it would be -P p1:<output_dir>/p1/doc -P p2:<output_dir>/p2/doc -P ....
Stanzas of the form (libraries l1 l2 ...) specify that module clusters l1, l2, ..., should be added using the -L argument: with the canonical roots, it would be -L l1:<output_dir>/p1/lib/l1 -L l2:<output_dir>/p2/lib/l2 -L ..., where p1 is the package l1 is in, etc.
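To make the mapping from stanzas to link arguments concrete, here is a minimal shell sketch using the canonical roots. It is an illustration only: cluster_flags is a hypothetical helper, it does not parse odoc-config.sexp itself, and the <library>=<package> pairing is a notation invented here, since the driver has to find out by itself which package installs each library.

```sh
# Sketch: turn the contents of (packages ...) and (libraries ...) stanzas
# into -P and -L arguments, following the canonical roots.
# usage: cluster_flags <output_dir> "<p1> <p2> ..." "<l1>=<pkg1> <l2>=<pkg2> ..."
cluster_flags() {
  out=${1%/} flags=
  for p in $2; do                       # page clusters: <output_dir>/<p>/doc
    flags="$flags -P $p:$out/$p/doc"
  done
  for pair in $3; do                    # module clusters: <output_dir>/<p>/lib/<l>
    l=${pair%%=*} p=${pair#*=}
    flags="$flags -L $l:$out/$p/lib/$l"
  done
  printf '%s\n' "${flags# }"
}
```

For a file containing (packages ppxlib) and (libraries ppxlib.ast sexplib0), calling `cluster_flags _odoc "ppxlib" "ppxlib.ast=ppxlib sexplib0=sexplib0"` prints `-P ppxlib:_odoc/ppxlib/doc -L ppxlib.ast:_odoc/ppxlib/lib/ppxlib.ast -L sexplib0:_odoc/sexplib0/lib/sexplib0`.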
The module units of a package p are all files installed by p that can be found in <opam root>/lib/p/ or a subdirectory.
The page units of a package p are those files that can be found in <opam root>/doc/p/odoc-pages/ or a subdirectory, and that have an .mld extension.
The asset units of a package p are those files that can be found in <opam root>/doc/p/odoc-pages/ or a subdirectory, but that do not have an .mld extension. Additionally, all files found in <opam root>/doc/p/odoc-assets/ are asset units.
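The page and asset rules can be summarised as a small classifier. This is a sketch under assumptions: classify_doc_file is a hypothetical name, its argument is a path relative to the directory that contains odoc-pages/ and odoc-assets/, and files in subdirectories of odoc-assets/ are treated as assets too, which the text above does not state explicitly.

```sh
# Sketch: classify a file installed in a package's documentation directory.
# usage: classify_doc_file <path relative to the directory holding odoc-pages/>
classify_doc_file() {
  case $1 in
    odoc-pages/*.mld) echo page ;;      # .mld files, at any depth
    odoc-pages/*)     echo asset ;;     # anything else under odoc-pages/
    odoc-assets/*)    echo asset ;;     # assumption: any depth under odoc-assets/
    *)                echo ignored ;;   # e.g. odoc-config.sexp
  esac
}
```

In a shell case pattern, `*` also matches `/`, so `odoc-pages/tutorials/intro.mld` is classified as a page.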
The --parent-id arguments

Interface and implementation units have as parent id the root of the library cluster they belong to: with "canonical" roots, <pkgname>/lib/<libname>.
Page units that are found in <opam root>/doc/<pkgname>/odoc-pages/<relpath>/<name>.mld have the parent id from their page cluster, followed by <relpath>. So, with canonical roots, <pkgname>/doc/<relpath>.
Asset units that are found in <opam root>/doc/<pkgname>/odoc-pages/<relpath>/<name>.<ext> have the parent id from their page cluster, followed by <relpath>. With canonical roots, <pkgname>/doc/<relpath>.
Asset units that are found in <opam root>/doc/<pkgname>/odoc-assets/<filename> have the parent id from their page cluster, followed by _assets/<filename>. With canonical roots, <pkgname>/doc/_assets/<filename>.
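For files under odoc-pages/, the parent id rule amounts to replacing the odoc-pages prefix with the page cluster root and dropping the file name. The sketch below illustrates this with canonical roots; doc_parent_id is a hypothetical helper, and it deliberately leaves out files under odoc-assets/.

```sh
# Sketch: parent id of a page or asset unit found under odoc-pages/,
# using the canonical page root <pkgname>/doc.
# usage: doc_parent_id <pkgname> <path relative to <opam root>/doc/<pkgname>/>
doc_parent_id() {
  pkg=$1 rel=$2
  case $rel in
    odoc-pages/*/*)                     # file in a subdirectory: keep <relpath>
      dir=${rel#odoc-pages/}
      printf '%s/doc/%s\n' "$pkg" "${dir%/*}" ;;
    odoc-pages/*)                       # file directly in odoc-pages/
      printf '%s/doc\n' "$pkg" ;;
    *) echo "not under odoc-pages/: $rel" >&2; return 1 ;;
  esac
}
```

For example, `doc_parent_id ppxlib odoc-pages/tutorials/intro.mld` prints `ppxlib/doc/tutorials`, and the same call with `odoc-pages/index.mld` prints `ppxlib/doc`.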
The --source-id arguments

The driver could choose the source id without breaking references. However, following the canonical roots convention, implementation units must have as source id: <pkgname>/src/<libraryname>/<filename>.ml.
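Put together with the parent id rule for implementation units, the convention determines both ids from the package name, the library name and the source file name. The following sketch prints the corresponding arguments; impl_ids is a hypothetical helper name, and only the --parent-id and --source-id options shown earlier are used.

```sh
# Sketch: canonical --parent-id and --source-id for an implementation unit.
# usage: impl_ids <pkgname> <libraryname> <path/to/file.ml>
impl_ids() {
  pkg=$1 lib=$2 src=$3
  # Only the file name of the source is kept in the source id.
  printf '%s %s/lib/%s %s %s/src/%s/%s\n' \
    --parent-id "$pkg" "$lib" --source-id "$pkg" "$lib" "${src##*/}"
}
```

Calling `impl_ids ppxlib ppxlib /some/dir/spellcheck.ml` prints `--parent-id ppxlib/lib/ppxlib --source-id ppxlib/src/ppxlib/spellcheck.ml`, the same ids as in the concrete compile-impl example.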