Doxygen Filter

From Okapi Framework


The Doxygen Filter is an Okapi component for extracting Doxygen-style comments from source code. An example:

 /*! A test class */
 class Test
 {
     /** An enum type.
      *  The documentation block cannot be put after the enum!
      */
     enum EnumType
     {
       EVal1, /**< enum value 1 */
       EVal2  /**< enum value 2 */
     };
     void member(); //!< a member function.
     int value; /*!< an integer value */
 };

C++-style (///), Javadoc-style (/**), Qt-style (/*!), and Python-style (''' or """) comment blocks are supported.

Processing Details

Input Encoding

The filter decides which encoding to use for the input document using the following logic:

  • If the file has a Unicode Byte-Order-Mark:
    • Then, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.
  • Otherwise, the input encoding used is the default encoding that was specified when opening the document.
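
The decision above can be sketched in a few lines of Python. This is only an illustration of the logic as described, not the filter's actual implementation; the function name choose_encoding and the list of recognized Byte-Order-Marks are assumptions made for the example.

```python
import codecs

# Checked in this order because the UTF-32 LE BOM begins with the UTF-16 LE BOM.
# The set of BOMs is an assumption for this sketch.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def choose_encoding(data: bytes, default_encoding: str) -> str:
    """Return the encoding implied by a leading BOM, else the given default."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default_encoding


# A file with a UTF-8 BOM overrides the default; a file without one does not.
print(choose_encoding(codecs.BOM_UTF8 + b"/** doc */", "windows-1252"))
print(choose_encoding(b"/** doc */", "windows-1252"))
```

The default passed as the second argument stands in for the encoding specified when opening the document.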

Inline Codes

The full set of Doxygen special commands, HTML commands, and XML commands is recognized and interpreted. For instance,

 /*! \class Test class.h "inc/class.h"
  *  \brief This is a test class.
  * Some details about the Test class
  */

will be extracted to the following Text Units:

  1. <1/><2/> This is a test class.
  2. Some details about the Test class

Line Numbers

The filter preserves line numbers so that a one-to-one correspondence between source line number and translated line number is maintained.


Parameters

Supported commands are listed in the filter configuration under one of three categories:

  • custom_commands
  • doxygen_commands
  • html_commands

You can customize the behavior of the filter by editing existing entries or adding new ones. An example doxygen_commands entry:

   COMMAND_NAME:
     type: TYPE
     inline: INLINE
     pair: PAIR_CMD_NAME
     translatable: CMD_TRANSLATABLE
     parameters:
       - name: PARAM_NAME
         length: LENGTH
         required: REQUIRED
         translatable: PARAM_TRANSLATABLE
       - ...

Replace the uppercase placeholders above with values conforming to the following table.

Item | Description | Example value
COMMAND_NAME | The name of the command as it appears in the Doxygen comment, without any prefix or suffix. E.g. \code{.py} should be code. Case-sensitive. | code
TYPE | The type of the command: one of PLACEHOLDER, OPENING, or CLOSING. | PLACEHOLDER
INLINE | Whether the command is an inline item (true) or a block-level element (false). Default: false. | true
PAIR_CMD_NAME | For OPENING- and CLOSING-type commands, the name of the paired command. E.g. \code is paired with \endcode, so for code we have pair: endcode. Not required for PLACEHOLDER commands. | endcode
CMD_TRANSLATABLE | Whether the entire content of the command is translatable. This is intended for block-level OPENING commands that delimit entire blocks, such as \code. Default: true. | true
PARAM_NAME | The name of a parameter. This is for organizational purposes only and is not used by the filter. | name
LENGTH | The length of the parameter: one of WORD, LINE, PHRASE, or PARAGRAPH. These map to the designations described at the top of the Doxygen special commands page, except for PHRASE, which indicates a string bounded by double quotes, like "image caption". | WORD
REQUIRED | Whether the parameter is required (true) or optional (false). This affects how aggressively the filter tries to interpret the text that follows the command as a parameter. Default: true. | true
PARAM_TRANSLATABLE | Whether the parameter is translatable (true) or not (false). Each parameter may be set independently, though untranslatable parameters following translatable ones are recorded as separate inline codes. Default: true. | true


  • The parameters listing is optional.
  • When present, parameters should be listed in the order in which they are written following the command.
  • Parameters with non-whitespace delimiters (e.g. .py in \code{.py}) are not currently supported.
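
To make the defaults and allowed values concrete, here is a small Python sketch that fills in the documented defaults for one entry. It is not part of the filter: the function normalize_entry, the dictionary layout, the parameters key name, and the sample \image entry are all hypothetical, and treating type and length as mandatory is an assumption, since no default is stated for them.

```python
VALID_TYPES = {"PLACEHOLDER", "OPENING", "CLOSING"}
VALID_LENGTHS = {"WORD", "LINE", "PHRASE", "PARAGRAPH"}


def normalize_entry(name, entry):
    """Return a copy of a command entry with the documented defaults filled in."""
    if entry.get("type") not in VALID_TYPES:
        raise ValueError(f"{name}: type must be one of {sorted(VALID_TYPES)}")
    normalized = {
        "type": entry["type"],
        "inline": entry.get("inline", False),             # documented default: false
        "pair": entry.get("pair"),                        # None when not given
        "translatable": entry.get("translatable", True),  # documented default: true
        "parameters": [],
    }
    # The parameters listing is optional; order is the order written after the command.
    for param in entry.get("parameters", []):
        if param.get("length") not in VALID_LENGTHS:
            raise ValueError(f"{name}: length must be one of {sorted(VALID_LENGTHS)}")
        normalized["parameters"].append({
            "name": param.get("name", ""),                    # organizational only
            "length": param["length"],
            "required": param.get("required", True),          # documented default: true
            "translatable": param.get("translatable", True),  # documented default: true
        })
    return normalized


# Hypothetical entry for \image with two fixed words and an optional quoted caption.
image_entry = {
    "type": "PLACEHOLDER",
    "parameters": [
        {"name": "format", "length": "WORD", "translatable": False},
        {"name": "file", "length": "WORD", "translatable": False},
        {"name": "caption", "length": "PHRASE", "required": False},
    ],
}

image = normalize_entry("image", image_entry)
print(image["inline"], image["translatable"])
print([p["required"] for p in image["parameters"]])
```

In this sketch the caption is the only translatable parameter, and it comes last, so no untranslatable parameter follows a translatable one.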

You may also define custom commands as follows (all of the above options except COMMAND_NAME are supported; the following is a minimal case):

   - pattern: "REGEX_PATTERN"
     type: TYPE

Item | Description | Example value
REGEX_PATTERN | Any valid regex that matches non-zero-width runs of text within the comment. Matches will be turned into codes according to the parameters as described above. | ###ACCESS_CHECKS###.*?;
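
As a rough illustration of what turning matches into codes means, the Python sketch below replaces each non-empty match of the example pattern with a numbered placeholder. The helper apply_custom_pattern and the sample comment text are hypothetical, the placeholder notation simply mirrors the <1/> style used in the Text Unit example earlier, and Python's re module is used here only for illustration; the regex dialect the filter itself uses may differ.

```python
import re


def apply_custom_pattern(text, pattern):
    """Replace each non-empty match with a numbered placeholder code."""
    codes = []

    def to_code(match):
        if match.group(0) == "":
            return ""  # zero-width matches are ignored in this sketch
        codes.append(match.group(0))
        return f"<{len(codes)}/>"

    return re.sub(pattern, to_code, text), codes


# Hypothetical comment text containing a custom marker.
text = "Call init() first. ###ACCESS_CHECKS### user, admin; Then call run()."
result, codes = apply_custom_pattern(text, r"###ACCESS_CHECKS###.*?;")
print(result)
print(codes)
```

The lazy quantifier .*? stops at the first semicolon, so only the marker and its arguments become a code and the surrounding sentences stay translatable.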


Prevent the filter from collapsing whitespace by setting preserve_whitespace: true.
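
The effect of this setting can be pictured with a short Python sketch. It is an approximation for illustration only: the function collapse_paragraphs is hypothetical, and it assumes that collapsing joins single line breaks and runs of spaces within a paragraph while leaving blank-line paragraph breaks in place.

```python
import re


def collapse_paragraphs(text, preserve_whitespace=False):
    """Join each paragraph onto one line unless whitespace is preserved."""
    if preserve_whitespace:
        return text
    # Paragraphs are assumed to be separated by at least one blank line.
    paragraphs = re.split(r"\n[ \t]*\n", text)
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)


sample = "First line\nsecond line\n\nNext   paragraph"
print(repr(collapse_paragraphs(sample)))
print(repr(collapse_paragraphs(sample, preserve_whitespace=True)))
```

With the default setting the first two lines are joined into one, while the preserved variant returns the text unchanged.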


Limitations

  • Single linebreaks in a text run that are not part of a Doxygen command are collapsed. The filter does not enforce a maximum line width on output, so each translatable paragraph is effectively collapsed to a single (potentially very long) line.
  • Command parameters with non-whitespace delimiters (e.g. .py in \code{.py}) are not currently supported.
  • Non-translatable command parameters are not exposed for any special processing.