odoc
odoc is a CLI tool for creating API references and other documentation for OCaml projects. However, it operates at a rather low level, taking individual files through several distinct phases until the HTML output is generated.
For this reason, just like building any multi-file OCaml project, odoc needs to be driven by a higher-level tool. The driver takes care of calling the odoc command with the right arguments throughout the different phases. Several drivers for odoc exist, such as dune and odig.
The odoc tool also contains a "reference driver" that is kept up to date with the latest development of odoc.
This document explains how to drive odoc, as of version 3. You do not need to know any of this to use odoc; it is aimed at driver authors, authors of tools that interact with odoc, and any curious passerby. It covers several subjects, including:
- the odoc pipeline,
- the convention for opam-installed packages.

In addition to the documentation, the reference driver is a good tool for understanding how to build odoc projects. It can be useful to read its implementation, but it can also help simply to look at all the invocations of odoc during a run of the driver.
In its third major version, odoc has been improved so that the same documentation can work in multiple scenarios, from local switches to big monorepos or the ocaml.org hub of documentation for all packages, without anything breaking, especially references.
The idea is that we have named groups of documentation, which we call clusters here. There are two kinds: page clusters and module clusters. Inside a cluster, everything is managed by odoc. Outside of the clusters, the driver is free to arrange them however it likes. In order to reference another cluster, a documentation author can use the name of the cluster in the reference.
Different situations will give different meanings to the clusters. For opam packages, though, there is a natural meaning to give to them (more details are given in the convention for opam-installed packages). Any opam package has an associated "documentation cluster", named after the package. Each of its libraries has an associated "module cluster", named after the library. Another package can thus refer to the docs using the package name, or to any of its libraries using the library name, no matter where the package is located in the hierarchy.
Just like compiling OCaml modules, generating docs for these modules needs to be done in a specific order, as some information needed to generate the docs for one file might reside in another. However, odoc actually allows a particular file to reference a module that depends on it, seemingly creating a circular dependency.
This circular dependency problem is one of the reasons odoc has several phases. Let's review them:
- The compile phase creates the .odoc artifacts from .cm{i;t;ti} and .mld files. This is where odoc does work similar to that of the OCaml compiler, computing expansions for each module type. The dependencies between units are the same as those of the .cm{i;t;ti} inputs.
- The link phase transforms the .odoc artifacts into .odocl files. The main result of this phase is to resolve odoc references, which also has an effect on canonical modules.
- The indexing phase generates .odoc-index files from sets of .odocl files. These index files are used both for generating a global sidebar and for generating a search index.
- The generation phase takes the .odocl and .odoc-index files and turns them into HTML, man pages, or LaTeX files.

The compile phase takes as input a set of .cm{i;t;ti} files as well as .mld files, and builds a directory hierarchy of .odoc files.
There are distinct commands for this phase: odoc compile
for interfaces and pages, odoc compile-impl
for implementations, and odoc compile-asset
for assets.
Let's have a look at a generic invocation of odoc
during the compile phase:
$ odoc compile --output-dir <od> --parent-id <pid> -I <dir1> -I <dir2> <input-file>.<ext>
- <input-file>.<ext> is the input file, either a .cm{i;t;ti} file or an .mld file. Prefer .cmti files over the other formats!
- --output-dir <od> specifies the directory that will contain all the .odoc files. This directory has to be fully managed by odoc and should not be modified by another tool! The location of the output file inside it depends on the --parent-id option.
- --parent-id <pid> places the output .odoc file in the documentation hierarchy. It consists of a /-separated sequence of non-empty strings (used as directory names). This "path" determines where the .odoc file will be located below the <od> output dir. The name of the output file is <input-file>.odoc for modules, and page-<input-file>.odoc for pages. Documentation artifacts that belong to the same unit of documentation need to share a common root in their parent id.
- -I <dir> adds a directory to the search path for other .odoc files. It can be given as many times as necessary, and should give access to every .odoc file generated from a .cm{i;t;ti} listed when calling odoc compile-deps on the input file. The <dir> values are directories under the <od> directory, computed from the --parent-id option given to previous calls to odoc compile.

A concrete example of such a command would be:
$ odoc compile ~/.opam/5.2.0/lib/ppxlib/ppxlib__Extension.cmti --output-dir _odoc/ -I _odoc/ocaml-base-compiler/lib/compiler-libs.common -I _odoc/ocaml-base-compiler/lib/stdlib -I _odoc/ocaml-compiler-libs/lib/ocaml-compiler-libs.common -I _odoc/ppxlib/lib/ppxlib -I _odoc/ppxlib/lib/ppxlib.ast -I _odoc/ppxlib/lib/ppxlib.astlib -I _odoc/ppxlib/lib/ppxlib.stdppx -I _odoc/ppxlib/lib/ppxlib.traverse_builtins -I _odoc/sexplib0/lib/sexplib0 --parent-id ppxlib/lib/ppxlib
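The naming rule described above can be sketched as a small shell helper. This is only an illustration of the rule as stated in this document: the function name odoc_compile_output is hypothetical, and it assumes that the output name is simply the input file's base name without its extension.

```shell
# Hypothetical helper (not part of odoc): predict the path of the .odoc
# file written by `odoc compile`, following the rule stated above.
# Usage: odoc_compile_output <od> <pid> <input-file>
odoc_compile_output() {
  od=${1%/}                 # output dir, without a trailing slash
  pid=$2                    # parent id, e.g. ppxlib/lib/ppxlib
  file=$(basename "$3")     # input file name without its directory
  name=${file%.*}           # strip the extension (.cmti, .cmt, .cmi, .mld)
  case "$file" in
    *.mld) printf '%s/%s/page-%s.odoc\n' "$od" "$pid" "$name" ;;  # pages
    *)     printf '%s/%s/%s.odoc\n' "$od" "$pid" "$name" ;;       # modules
  esac
}

odoc_compile_output _odoc/ ppxlib/lib/ppxlib ~/.opam/5.2.0/lib/ppxlib/ppxlib__Extension.cmti
# prints: _odoc/ppxlib/lib/ppxlib/ppxlib__Extension.odoc
```

A driver can use such a prediction to declare build targets before running odoc, but the files actually produced by odoc remain the source of truth.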
A compile-impl
command is pretty similar:
$ odoc compile-impl --output-dir <od> --source-id <sid> --parent-id <pid> -I <dir1> -I <dir2> <input-file>.<ext>
- <input-file>.cmt is the input file; it has to be a .cmt file.
- --output-dir <od> has the same meaning as for odoc compile.
- --parent-id <pid> also has the same meaning as for odoc compile. However, the name of the output file is impl-<input-file>.odoc. Implementations need to be available through the -I search path, so it is very likely that one wants the implementation and interface .odoc files to share the same parent id.
- -I <dir> also corresponds to the search path for other .odoc files.
- --source-id <sid> is a new argument specific to compile-impl. It corresponds to the location of the rendering of the source, which is required to generate links to it.

A concrete example of such a command would be:
$ odoc compile-impl ~/.opam/5.2.0/lib/ppxlib/ppxlib__Spellcheck.cmt --output-dir _odoc/ -I _odoc/ocaml-base-compiler/lib/compiler-libs.common -I _odoc/ocaml-base-compiler/lib/stdlib -I _odoc/ocaml-compiler-libs/lib/ocaml-compiler-libs.common -I _odoc/ppxlib/lib/ppxlib -I _odoc/ppxlib/lib/ppxlib.ast -I _odoc/ppxlib/lib/ppxlib.astlib -I _odoc/ppxlib/lib/ppxlib.stdppx -I _odoc/sexplib0/lib/sexplib0 --enable-missing-root-warning --parent-id ppxlib/lib/ppxlib --source-id ppxlib/src/ppxlib/spellcheck.ml
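Since the -I directories of both compile and compile-impl are derived from the parent ids used in earlier compile calls, a driver can build them mechanically. The sketch below assumes the driver already knows the parent ids of the dependencies; odoc_include_flags is a hypothetical name, not an odoc command.

```shell
# Hypothetical helper (not part of odoc): turn the parent ids of
# already-compiled dependencies into -I flags rooted at the output dir.
# Usage: odoc_include_flags <od> <pid>...
odoc_include_flags() {
  od=${1%/}                      # output dir, without a trailing slash
  shift
  flags=""
  for pid in "$@"; do
    flags="$flags -I $od/$pid"   # one -I per dependency parent id
  done
  printf '%s\n' "${flags# }"     # drop the leading space
}

odoc_include_flags _odoc/ ppxlib/lib/ppxlib sexplib0/lib/sexplib0
# prints: -I _odoc/ppxlib/lib/ppxlib -I _odoc/sexplib0/lib/sexplib0
```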
Assets are given during the generation phase, but we still need to create an .odoc file for odoc's resolution mechanism.
$ odoc compile-asset --output-dir <od> --parent-id <pid> --name <assetname>
- --output-dir and --parent-id are identical to those of the compile and compile-impl commands,
- --name <assetname> gives the asset name.

The output file is <output-dir>/<parent-id>/asset-<assetname>.odoc.

The link phase requires the directory produced by the compile phase to generate its set of .odocl files. This phase resolves references and canonicals.
A generic link command is:
$ odoc link
-I <dir1> -I <dir2>
-P <pname1>:<pdir1> -P <pname2>:<pdir2>
-L <lname1>:<ldir1> -L <lname2>:<ldir2>
<path/to/file.odoc>
- <path/to/file.odoc> is the input .odoc file. The result of this command is path/to/file.odocl. This path was determined by the --output-dir and --parent-id given during the compile phase, and it is important for the indexing phase that it stays in the same location.
- -P <name>:<dir> arguments list the "page clusters", used to resolve references such as {!/ocamlfind/index}.
- -L <name>:<dir> arguments list the "module clusters", used to resolve references such as {!/findlib.dynload/Fl_dynload}. This also adds <dir> to the search path.
- -I <dir> adds <dir> to the search path. The search path is used to resolve references that do not use the cluster mechanism, such as {!Module} and {!page-pagename}.

The indexing phase refers to the "crunching" of information split across several .odocl files. Currently, there are two use-cases for this phase: generating a global sidebar and generating a search index.
This step counts the number of occurrences of each value/type/... in the implementation, and stores them in a big table. A generic invocation is:
$ odoc count-occurrences <dir1> <dir2> -o <path/to/name.odoc-occurrences>
An example of such command:
$ odoc count-occurrences _odoc/ -o _odoc/occurrences-all.odoc-occurrences
The odoc compile-index command produces an .odoc-index file from .odocl files, other .odoc-index files, and possibly some .odoc-occurrences files.
To create an index for the page and documentation units, we use the -P
and -L
arguments.
$ odoc compile-index -o path/to/<indexname>.odoc-index -P <pname1>:<ppath1> -P <pname2>:<ppath2> -L <lname1>:<lpath1> -L <lname2>:<lpath2> --occurrences <path/to/name.odoc-occurrences>
An example of such command:
$ odoc compile-index -o _odoc/ppxlib/index.odoc-index -P ppxlib:_odoc/ppxlib/doc -L ppxlib:_odoc/ppxlib/lib/ppxlib -L ppxlib.ast:_odoc/ppxlib/lib/ppxlib.ast -L ppxlib.astlib:_odoc/ppxlib/lib/ppxlib.astlib -L ppxlib.metaquot:_odoc/ppxlib/lib/ppxlib.metaquot -L ppxlib.metaquot_lifters:_odoc/ppxlib/lib/ppxlib.metaquot_lifters -L ppxlib.print_diff:_odoc/ppxlib/lib/ppxlib.print_diff -L ppxlib.runner:_odoc/ppxlib/lib/ppxlib.runner -L ppxlib.runner_as_ppx:_odoc/ppxlib/lib/ppxlib.runner_as_ppx -L ppxlib.stdppx:_odoc/ppxlib/lib/ppxlib.stdppx -L ppxlib.traverse:_odoc/ppxlib/lib/ppxlib.traverse -L ppxlib.traverse_builtins:_odoc/ppxlib/lib/ppxlib.traverse_builtins --occurrences _odoc/occurrences-all.odoc-occurrences
The generation phase takes all the information computed in the previous phases and actually generates the documentation. The output can be HTML, LaTeX, or man pages, although HTML is currently the odoc backend that supports the most features (such as images, videos, ...).
In this manual, we describe the situation for generating HTML. Usually, generating for another backend boils down to replacing html-generate with latex-generate or man-generate; refer to the manpage for the options that differ.
Given an .odocl file, odoc might generate a single .html file, or a complete directory of .html files. The --output-dir option specifies the root directory for this output.
odoc provides a way to plug in a JavaScript file containing the code that answers users' queries. In order to never block the UI, this file is loaded in a web worker to perform searches. Its location is given relative to the --output-dir value; the driver can decide where.

A generic html-generate command for interfaces has the following form:
$ odoc html-generate
--output-dir <odir>
--index <path/to/file.odoc-index>
--search-uri <relative/to/output-dir/file.js> --search-uri <relative/to/output-dir/file2.js>
<path/to/file.odocl>
- --output-dir <odir> specifies the root output directory for the generated .html files.
- --index <path/to/file.odoc-index> is given to odoc for sidebar generation.
- --search-uri <relative/to/output-dir/file.js> tells odoc which file(s) to load in a web worker.

The output directory or file can be computed from this command's --output-dir, the initial --parent-id given when creating the .odoc file, and the unit name. For a module, the output is a directory named after the module. For a page, the output is a file named after the page, with the .html extension.
An example of such command is:
$ odoc html-generate _odoc/ppxlib/doc/page-index.odocl --index _odoc/ppxlib/index.odoc-index --search-uri ppxlib/sherlodoc_db.js --search-uri sherlodoc.js -o _html/
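As a rough illustration of how a driver might predict the HTML output location, the sketch below assumes the output sits at the output dir followed by the parent id and then the unit name. That layout is an assumption drawn from the description above, not a guarantee from odoc, and odoc_html_output is a hypothetical name.

```shell
# Hypothetical helper (not part of odoc): guess where html-generate
# writes its output, ASSUMING the layout <odir>/<parent-id>/<unit>.
# Usage: odoc_html_output <odir> <pid> module|page <unit-name>
odoc_html_output() {
  odir=${1%/}   # output dir, without a trailing slash
  pid=$2        # parent id given when the .odoc file was created
  kind=$3       # "module" or "page"
  name=$4       # unit name
  case "$kind" in
    module) printf '%s/%s/%s/\n' "$odir" "$pid" "$name" ;;      # a directory
    page)   printf '%s/%s/%s.html\n' "$odir" "$pid" "$name" ;;  # a single file
    *)      echo "unknown unit kind: $kind" >&2; return 1 ;;
  esac
}

odoc_html_output _html/ ppxlib/doc page index
# prints: _html/ppxlib/doc/index.html
```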
$ odoc html-generate-source --output-dir <odir> --impl <path/to/impl-file.odocl> <path/to/source/file.ml>
- --output-dir <odir> has been covered already.
- --impl <path/to/impl-file.odocl> gives the implementation file.
- <path/to/source/file.ml> is the source file.

The output file can be computed from this command's --output-dir, and the initial --source-id and --name given when creating the impl-*.odoc file.
An example of such command is:
$ odoc html-generate-source --impl _odoc/ppxlib/lib/ppxlib/impl-ppxlib__Reconcile.odocl /home/panglesd/.opam/5.2.0/lib/ppxlib/reconcile.ml -o _html/
This is the phase where we pass the actual asset. We pass it as a positional argument, and give the asset unit using --asset-unit
.
$ odoc html-generate-asset --output-dir <odir> --asset-unit <path/to/asset-file.odocl> <path/to/asset/file.ext>
In order to build the documentation for installed packages, the driver needs to give a meaning to several of the concepts above. In particular, it needs to define the page and library clusters, know where to find the pages and assets and what id to give them, and, when linking, know which clusters an artifact may link to.
So that the different drivers and installed packages play well together, we define here a convention for building installed packages. If both the package and the driver follow it, building the docs should go well!
-P and -L clusters, and their root ids

Each package defines a set of clusters, each of which has a root id. These roots are used in --parent-id and in -P and -L.
The driver can choose any set of mutually disjoint roots without causing problems for reference resolution. For instance, both -P pkg:<output_dir>/pkg/doc and -P pkg:<output_dir>/pkg/version/doc are acceptable choices. However, we define here "canonical" roots:
Each installed package <p> defines a single page root id: <p>/doc.
For each package <p>
, each library <l>
defines a library root id: <p>/lib/<l>
.
For instance, a package foo with two libraries, foo and foo.bar, will define three clusters:
- A page cluster foo, with root id foo/doc. When it is referred to from other clusters, a -P foo:<odoc_dir>/foo/doc argument needs to be added at the link phase.
- A module cluster foo, with root id foo/lib/foo. When it is referred to from other clusters, a -L foo:<odoc_dir>/foo/lib/foo argument needs to be added at the link phase.
- A module cluster foo.bar, with root id foo/lib/foo.bar. When it is referred to from other clusters, a -L foo.bar:<odoc_dir>/foo/lib/foo.bar argument needs to be added at the link phase.

Installed opam packages need to specify which clusters they may reference during the link phase, so that the proper -P and -L arguments are added. (Note that these dependencies can be circular without any problem, as they occur during the link phase and only require the artifacts from the compile phase.)
An installed package <p>
specifies its cluster dependencies in a file at <opam root>/doc/<p>/odoc-config.sexp
. This file contains s-expressions.
Stanzas of the form (packages p1 p2 ...) specify that the page clusters p1, p2, ..., should be added using the -P argument: with the canonical roots, it would be -P p1:<output_dir>/p1/doc -P p2:<output_dir>/p2/doc -P ... .
Stanzas of the form (libraries l1 l2 ...) specify that the module clusters l1, l2, ..., should be added using the -L argument: with the canonical roots, it would be -L l1:<output_dir>/p1/lib/l1 -L l2:<output_dir>/p2/lib/l2 -L ... , where p1 is the package l1 is in, etc.
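The mapping from cluster names to link-phase arguments under the canonical roots can be sketched as follows. The function names are hypothetical, and the package that owns each library is assumed to be known to the driver already, since the stanzas themselves only list library names.

```shell
# Hypothetical helpers (not part of odoc): build link-phase cluster
# arguments using the canonical roots described above.

# Usage: page_cluster_flag <output_dir> <package>
page_cluster_flag() {
  printf '%s %s:%s/%s/doc\n' -P "$2" "${1%/}" "$2"
}

# Usage: module_cluster_flag <output_dir> <package> <library>
module_cluster_flag() {
  printf '%s %s:%s/%s/lib/%s\n' -L "$3" "${1%/}" "$2" "$3"
}

page_cluster_flag _odoc foo
# prints: -P foo:_odoc/foo/doc
module_cluster_flag _odoc foo foo.bar
# prints: -L foo.bar:_odoc/foo/lib/foo.bar
```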
The module units of a package p
are all files installed by p
that can be found in <opam root>/lib/p/
or a subdirectory.
The page units are those files that can be found in <opam root>/doc/p/odoc-pages/ or a subdirectory, and that have an .mld extension.
The asset units are those files that can be found in <opam root>/doc/p/odoc-pages/ or a subdirectory, but that do not have an .mld extension. Additionally, they include all files found in <opam root>/doc/p/odoc-assets/.
--parent-id arguments

Interface and implementation units have as parent id the root of the library cluster they belong to: with "canonical" roots, <pkgname>/lib/<libname>.
Page units that are found in <opam root>/doc/<pkgname>/odoc-pages/<relpath>/<name>.mld
have the parent id from their page cluster, followed by <relpath>
. So, with canonical roots, <pkgname>/doc/<relpath>
.
Asset units that are found in <opam root>/doc/<pkgname>/odoc-pages/<relpath>/<name>.<ext>
have the parent id from their page cluster, followed by <relpath>
. With canonical roots, <pkgname>/doc/<relpath>
.
Asset units that are found in <opam root>/doc/<pkgname>/odoc-assets/<filename> have the parent id from their page cluster, followed by _assets/<filename>. With canonical roots, <pkgname>/doc/_assets/<filename>.
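For page units, the parent id rule with canonical roots amounts to a simple path computation. The sketch below takes the package name and the page's path relative to the odoc-pages directory; page_parent_id is a hypothetical name used only for illustration.

```shell
# Hypothetical helper (not part of odoc): compute the canonical parent
# id of a page from its path relative to <opam root>/doc/<pkgname>/odoc-pages/.
# Usage: page_parent_id <pkgname> <relpath>/<name>.mld
page_parent_id() {
  pkg=$1
  rel=$(dirname "$2")           # "." when the page sits at the top level
  if [ "$rel" = "." ]; then
    printf '%s/doc\n' "$pkg"
  else
    printf '%s/doc/%s\n' "$pkg" "$rel"
  fi
}

page_parent_id foo index.mld
# prints: foo/doc
page_parent_id foo tutorials/intro.mld
# prints: foo/doc/tutorials
```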
--source-id arguments

The driver could choose the source id without breaking references. However, following the canonical roots convention, implementation units must have as source id: <pkgname>/src/<libraryname>/<filename>.ml.
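The canonical source id can likewise be assembled from the package name, the library name and the source file name. The helper name odoc_source_id is hypothetical; the example values mirror the compile-impl invocation shown earlier in this document.

```shell
# Hypothetical helper (not part of odoc): build the canonical source id
# <pkgname>/src/<libraryname>/<filename>.ml from its three components.
# Usage: odoc_source_id <pkgname> <libraryname> <path/to/file.ml>
odoc_source_id() {
  printf '%s/src/%s/%s\n' "$1" "$2" "$(basename "$3")"
}

odoc_source_id ppxlib ppxlib /some/path/spellcheck.ml
# prints: ppxlib/src/ppxlib/spellcheck.ml
```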