wxPdfDocument 1.2.0
Styling text using a simple markup language

Overview

The method wxPdfDocument::WriteXML writes text to PDF using a simple markup language. This makes it possible, for example, to change font attributes within a cell, which is not supported by methods like wxPdfDocument::WriteCell or wxPdfDocument::MultiCell. The supported markup language consists of a small subset of HTML. Although the subset might be extended in future versions of wxPdfDocument, it is not the goal of this method to convert full-fledged HTML pages to PDF.

Important! The XML dialect used is very strict. Each tag must have a corresponding closing tag and all attribute values must be enclosed in double quotes.

Usually the current position should be at the left margin when calling wxPdfDocument::WriteXML. If the current position is not at the left margin and the text passed to wxPdfDocument::WriteXML occupies more than a single line, you may get strange results.

Currently there is only limited error handling. You will get strange results or no results at all if tags are incorrectly used. Unknown tags and all their content are silently ignored.
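
Because error handling is limited, it can help to check a markup string before passing it to wxPdfDocument::WriteXML. The following is a minimal sketch of such a check, not part of wxPdfDocument: the function name LooksWellFormed is hypothetical, and the check only covers the two rules stated above (matching closing tags and double-quoted attribute values). It assumes that attribute values do not contain the character >.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, not part of wxPdfDocument: returns true if every
// opening tag has a matching closing tag in the right order and every
// attribute value is enclosed in double quotes.
bool LooksWellFormed(const std::string& markup)
{
  std::vector<std::string> open;  // stack of currently open tag names
  std::size_t pos = 0;
  while ((pos = markup.find('<', pos)) != std::string::npos)
  {
    const std::size_t end = markup.find('>', pos);
    if (end == std::string::npos) return false;  // unterminated tag
    std::string tag = markup.substr(pos + 1, end - pos - 1);
    pos = end + 1;

    const bool closing = !tag.empty() && tag.front() == '/';
    if (closing) tag.erase(0, 1);
    const bool selfClosing = !tag.empty() && tag.back() == '/';
    if (selfClosing) tag.pop_back();
    if (tag.empty() || (closing && selfClosing)) return false;

    // Split the tag into its name and the attribute part.
    std::size_t nameEnd = 0;
    while (nameEnd < tag.size() &&
           !std::isspace(static_cast<unsigned char>(tag[nameEnd])))
      ++nameEnd;
    const std::string name = tag.substr(0, nameEnd);
    const std::string attrs = tag.substr(nameEnd);
    if (name.empty()) return false;

    // Every '=' must be followed by a double-quoted value.
    for (std::size_t i = 0; i < attrs.size(); ++i)
    {
      if (attrs[i] != '=') continue;
      if (i + 1 >= attrs.size() || attrs[i + 1] != '"') return false;
      const std::size_t close = attrs.find('"', i + 2);
      if (close == std::string::npos) return false;
      i = close;  // skip the quoted value
    }

    if (closing)
    {
      if (open.empty() || open.back() != name) return false;
      open.pop_back();
    }
    else if (!selfClosing)
    {
      open.push_back(name);
    }
  }
  return open.empty();  // all opened tags must have been closed
}
```

A string that passes this check is not guaranteed to render as intended, since unknown tags are still silently ignored by WriteXML.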

Reference of supported tags

The following sections describe the tags supported by the wxPdfDocument markup language.

Notes

Starting with version 1.0.2 of wxPdfDocument, numeric values for attributes of markup elements that denote measures (such as margins, line widths and cell heights) may include the measurement unit. If no unit is given, the default unit of the wxPdfDocument instance is assumed, with 2 exceptions:

  • a font size is measured in points ("pt")
  • an image size (width, height, and viewport) is measured in pixels ("px")

The following 2-letter units can be used:

  • pt = point (72 points = 1 inch)
  • mm = millimeter (25.4 millimeters = 1 inch)
  • cm = centimeter (2.54 centimeters = 1 inch)
  • in = inch
  • px = pixel (see note below)

Note: For image-related values measured in pixels (unit px), the resulting value is multiplied by the image scale factor to compensate for high-resolution images. For values that are not image-related, the unit px is treated as a synonym for the unit pt.

Example: <img src="pic1.png" width="40mm" height="25mm" viewport="0 0 0 20mm"/>

This defines the size of the image as 40 millimeters x 25 millimeters with a viewport which moves the lower 5 millimeters of the image height below the baseline of the surrounding cell.

Simple text markup

There are several tags to influence the size and weight of the font used for displaying the text and the relative vertical position within a line:

Tag                        Description
<b> ... </b>               bold text
<i> ... </i>               italic text
<u> ... </u>               underlined text
<o> ... </o>               overlined text
<s> ... </s>               strike-through text
<strong> ... </strong>     bold text (same as <b>)
<em> ... </em>             emphasized text (same as <i>)
<small> ... </small>       text with reduced font size
<sup> ... </sup>           superscripted text
<sub> ... </sub>           subscripted text
<h1> ... </h1>             headline level 1
<h2> ... </h2>             headline level 2
<h3> ... </h3>             headline level 3
<h4> ... </h4>             headline level 4
<h5> ... </h5>             headline level 5
<h6> ... </h6>             headline level 6

Structuring text markup

Some tags for structuring the text layout are available. Most of these tags have one or more attributes to change their properties. The attributes are described in detail in the section named after each tag description.

Tag                     Description
<ul> ... </ul>          Unordered lists
<ol> ... </ol>          Ordered lists
<li> ... </li>          List item of an ordered or unordered list
<br />                  Line break, positions the current position to the left margin of the next line
<p> ... </p>            Paragraph
<hr />                  Horizontal rule
<a> ... </a>            Internal or external link
<font> ... </font>      Font specification
<table> ... </table>    Tables

Miscellaneous text markup

This section lists a few additional tags not fitting in any other category. The attributes are described in detail in the section named after each tag description.

Tag                 Description
<msg> ... </msg>    Translatable text
<img ... />         Images

Unordered lists

Unordered lists start on a new line. Each list item is preceded by a list item marker and the content of the list item is indented.

Tag: <ul>

Attribute                    Description
type="bullet|dash|number"    Sets the type of the list item marker:
                               • bullet displays a bullet character
                               • dash displays a dash character
                               • number has a value between 0 and 255; the corresponding character of the ZapfDingBats font is used as the list item marker

Ordered lists

Ordered lists start on a new line. Each list item is preceded by a list item enumerator and the content of the list item is indented.

Tag: <ol>

Attribute                       Description
type="1|a|A|i|I|z1|z2|z3|z4"    Sets the type of the list item enumerator:
                                  • 1 displays a decimal number as the list item enumerator
                                  • a displays a lowercase alphabetic character as the list item enumerator
                                  • A displays an uppercase alphabetic character as the list item enumerator
                                  • i displays a lowercase Roman numeral as the list item enumerator
                                  • I displays an uppercase Roman numeral as the list item enumerator
                                  • z1|z2|z3|z4 displays number symbols of one of the 4 number series in the ZapfDingBats font. This option should only be used for lists of at most 10 items.
start="number"                  number represents the enumerator value of the first list item
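
As an illustration of how the type and start attributes fit together, the following sketch assembles the markup for an ordered list. The function name MakeOrderedList is hypothetical and not part of wxPdfDocument; item texts are inserted as given, so they must already be valid markup.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds <ol> markup with the given enumerator type
// and start value. Item texts are inserted without escaping.
std::string MakeOrderedList(const std::vector<std::string>& items,
                            const std::string& type, int start)
{
  std::string xml = "<ol type=\"" + type + "\" start=\"" +
                    std::to_string(start) + "\">";
  for (const std::string& item : items)
    xml += "<li>" + item + "</li>";  // every <li> gets its closing tag
  xml += "</ol>";
  return xml;
}
```

The returned string could then be passed to wxPdfDocument::WriteXML, ideally with the current position at the left margin.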

Paragraph

A paragraph starts on a new line and forces an empty line after the closing paragraph tag.

Tag: <p>

Attribute                            Description
align="left|right|center|justify"    As specified by this option the content of the paragraph will be left or right aligned, centered or justified. The default is left aligned.

Horizontal rule

A horizontal rule is a line of specified width which is drawn on a separate line.

Tag: <hr>

Attribute             Description
width="number"        The width of the horizontal rule is an integer number between 1 and 100 giving the width in percent of the available width (from left to right margin). The default value is 100.
linewidth="number"    The line width of the rule.
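
The percentage rule for the width attribute amounts to simple arithmetic, sketched below. The function name RuleLength and its parameters are hypothetical, and clamping out-of-range values to 1..100 is an assumption of this sketch, not documented library behaviour.

```cpp
#include <algorithm>

// Hypothetical helper: length of a horizontal rule for a given width
// attribute (in percent). Page width and margins share one unit.
double RuleLength(int widthPercent, double pageWidth,
                  double leftMargin, double rightMargin)
{
  // Assumption: values outside 1..100 are clamped to that range.
  const int percent = std::min(100, std::max(1, widthPercent));
  const double available = pageWidth - leftMargin - rightMargin;
  return available * percent / 100.0;
}
```

On a page 210 mm wide with 10 mm margins on both sides, width="50" would correspond to a rule of 95 mm.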

Internal or external link

An internal or external link is displayed as blue underlined text. Clicking on the text opens a browser window loading the referenced URL.

Tag: <a>

Attribute        Description
href="url"       url is a uniform resource locator. If url starts with # it is interpreted as a reference to an internal link anchor; the characters following # are used as the name of the anchor.
name="anchor"    anchor is the name of an internal link anchor.

Note: Either the name or the href attribute may be specified, but not both.

Font specification

This tag specifies several font attributes for the embedded content. Font family, font size and colour can be set. Attributes not given retain their previous value.

Tag: <font>

Attribute             Description
face="fontfamily"     The name of the font family. It can be the name of one of the 14 core fonts or the name of a font previously added by wxPdfDocument::AddFont.
size="fontsize"       The font size in points.
color="fontcolour"    The font colour in HTML notation, i.e. #rrggbb, or as a named colour, e.g. red.
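
To make the #rrggbb notation concrete, the following sketch splits such a value into its red, green and blue components. The names Rgb and ParseHtmlColour are hypothetical and not part of wxPdfDocument; named colours such as red are deliberately not handled here.

```cpp
#include <cctype>
#include <string>

struct Rgb { int r, g, b; };  // components in the range 0..255

// Hypothetical helper: parses a colour in "#rrggbb" notation.
// Returns false (leaving `out` untouched) if the text is malformed.
bool ParseHtmlColour(const std::string& text, Rgb& out)
{
  if (text.size() != 7 || text[0] != '#') return false;
  for (std::size_t i = 1; i < text.size(); ++i)
    if (!std::isxdigit(static_cast<unsigned char>(text[i]))) return false;

  // Each component is a two-digit hexadecimal number.
  out.r = std::stoi(text.substr(1, 2), nullptr, 16);
  out.g = std::stoi(text.substr(3, 2), nullptr, 16);
  out.b = std::stoi(text.substr(5, 2), nullptr, 16);
  return true;
}
```

For instance, "#ff8000" corresponds to the components 255, 128 and 0.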

Translatable text

For international applications a simple mechanism is provided to pass a string to wxGetTranslation.

Tag: <msg>

The text string included in the msg tag will be translated if a translation is available before it is written to PDF.

Note: Within the msg tag additional markup is not allowed.

Images

In the current implementation output of an image always starts on a new line.

Tag: <img>

Attribute                     Description
src="imagefile"               The name of the image file.
width="image width"           The width of the image measured in pixels.
height="image height"         The height of the image measured in pixels.
align="left|right|center"     As specified by this option the image will be left or right aligned, or centered. The default is left aligned.
viewport="tlx tly brx bry"    The viewport into the image that should correspond to the actual content of the surrounding cell. The viewport is defined as a rectangular area by specifying the coordinates of the top left corner (tlx,tly) and the bottom right corner (brx,bry). The viewport coordinates are measured in pixels. The values tlx and tly may be negative, in which case the viewport will be larger than the image. The values brx and bry may be specified as 0, in which case they will be replaced internally by the width and the height of the image.
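
The rule for brx and bry values of 0 can be sketched as follows. The names Viewport and ResolveViewport are hypothetical; the library's internal representation may differ.

```cpp
// Hypothetical types and helper, not part of wxPdfDocument.
struct Viewport { double tlx, tly, brx, bry; };

// Replaces brx and bry values of 0 by the image width and height,
// as described for the viewport attribute.
Viewport ResolveViewport(Viewport vp, double imageWidth, double imageHeight)
{
  if (vp.brx == 0) vp.brx = imageWidth;
  if (vp.bry == 0) vp.bry = imageHeight;
  return vp;
}
```

Applied to the earlier example (an image of 40 x 25 with viewport "0 0 0 20"), brx becomes 40 while bry stays 20, so the lower 5 units of the image lie outside the viewport.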

Tables

Very often information is presented in a tabular structure. This is also supported by the wxPdfDocument markup language by using a specific kind of HTML table syntax. The structure is as follows:

    <table>
      <colgroup>
        <col ... />
        ...
      </colgroup>
      <thead>
        <tr><td> ... </td></tr>
        ...
      </thead>
      <tbody>
        <tr><td> ... </td></tr>
        ...
      </tbody>
    </table>

The colgroup tag and embedded col tags are always required since all column widths have to be specified a priori. width attributes are not interpreted when used in other table tags.

The thead tag and embedded table rows and cells are allowed, but since the current implementation only supports tables fitting completely on one page, the rows are handled as ordinary rows. (A future release will support tables spanning more than one page. Header rows will be repeated on each page.)

The use of the tbody tag is always required.

Nested tables are supported.
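
The required structure can be assembled programmatically, as in the following sketch. The function name MakeTable is hypothetical and not part of wxPdfDocument; it always emits the required colgroup and tbody sections, omits the optional thead section, and inserts cell texts without escaping.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds table markup with the required <colgroup>
// and <tbody> sections. Column widths are strings such as "40mm".
std::string MakeTable(const std::vector<std::string>& colWidths,
                      const std::vector<std::vector<std::string>>& rows)
{
  std::string xml = "<table border=\"1\"><colgroup>";
  for (const std::string& width : colWidths)
    xml += "<col width=\"" + width + "\" />";  // one <col> per column
  xml += "</colgroup><tbody>";
  for (const std::vector<std::string>& row : rows)
  {
    xml += "<tr>";
    for (const std::string& cell : row)
      xml += "<td>" + cell + "</td>";
    xml += "</tr>";
  }
  xml += "</tbody></table>";
  return xml;
}
```

Since all column widths have to be known in advance, the number of entries in colWidths should match the number of cells per row (taking any colspan into account if cells are written by hand).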

The table tag may have the following attributes:

Tag: <table>

Attribute                     Description
border="number"               Table cells may have borders on each side. This attribute specifies whether cells will have borders on every side or not. This may be overridden for each individual cell. The attribute value is a number:
                                • 0 - no borders
                                • > 0 - borders on all sides of each cell
align="left|right|center"     Defines the horizontal alignment of the table. Default is the alignment of the surrounding context.
valign="top|middle|bottom"    Defines the vertical alignment of the table. Default is top.
cellpadding="number"          Number defines the padding width on each side of a cell. Default is 0.

The supported tags and their attributes are shown in the following tables:

Tag and description:

<table> ... </table>
  Defines a table. Contains a <colgroup> tag, an optional <thead> tag and a <tbody> tag.

<colgroup> ... </colgroup>
  Groups the definitions of column widths. Contains one or more <col> tags.

<col width="width" span="number"> ... </col>
  Defines the width of one or more columns. number specifies for how many columns the width is specified, default is 1.

<thead odd="background colour for odd numbered rows" even="background colour for even numbered rows"> ... </thead>
  Defines a group of table header rows. Contains one or more <tr> tags. If a table does not fit on a single page these rows are repeated on each page. The attributes odd and even are optional.

<tbody odd="background colour for odd numbered rows" even="background colour for even numbered rows"> ... </tbody>
  Defines a group of table body rows. Contains one or more <tr> tags. The attributes odd and even are optional.

<tr bgcolor="background colour" height="height"> ... </tr>
  Defines a table row. Contains one or more <td> tags.
  The background colour may be specified in HTML notation, i.e. #rrggbb, or as a named colour, e.g. red. If no background colour is given the background is transparent.
  Usually the height of the tallest cell in a row is used as the row height, but a minimal row height may be specified, too.

<td> ... </td>
  Defines a table cell.
  The available attributes are described in section Table cells.

Table cells

A table cell can have several attributes:

Tag: <td>

Attribute                      Description
border="LTBR"                  A cell may have a border on each side. This attribute overrides the border specification in the <table> tag. The attribute value consists of the combination of up to 4 letters:
                                 • L - border on the left side of the cell
                                 • T - border on the top side of the cell
                                 • B - border on the bottom side of the cell
                                 • R - border on the right side of the cell
align="left|right|center"      Defines the horizontal alignment of the cell content. Default is left.
valign="top|middle|bottom"     Defines the vertical alignment of the cell content. Default is top.
bgcolor="background colour"    The background colour of the cell in HTML notation, i.e. #rrggbb, or as a named colour, e.g. red. This attribute overrides the background colour specification of the row. If neither a row nor a cell background colour is specified the background is transparent.
rowspan="number"               Number of rows this cell should span. Default is 1.
colspan="number"               Number of columns this cell should span. Default is 1.