An Introduction to Structured Text

By Paul Everitt

Engineers spend a lot of time communicating, primarily by email but also in documentation. However, writing by engineers is complicated by a simple fact: the world consumes writing largely in presentation formats such as HTML and PDF.

In theory this should be no problem, as we would all march happily off and write in "DocBook" (or perhaps LaTeX), the supposed lingua franca of documentation. However, most tools don't support DocBook (or LaTeX) very well, and even if the tools were mature, most engineers would reject them.

Why? Engineers spend most of their time communicating in plain text. Their tools (vi and Emacs) are oriented toward text. The vast majority of words they communicate are in email. Finally, what little documentation you can squeeze out of engineers is in the form of "docstrings" in source code.

Wouldn't it be nice if there were a non-tag, text-oriented system for engineers to express semantic meaning? This is the problem Structured Text tackles. With Structured Text, format-independent writing becomes extremely convenient and natural once a few rules are learned. Furthermore, Structured Text can be extended to cover advanced and custom uses.

To get a quick idea of what Structured Text does, the following words in Structured Text:

      Sometimes the *best* approach to complexity is
      simplicity. A good structured text system is:

      * Convenient

      * Rich

is rendered into the following HTML:

      <p>
      Sometimes the <em>best</em> approach to complexity is
      simplicity. A good structured text system is:
      </p>

      <ul>
      <li><p>Convenient</p></li>
      <li><p>Rich</p></li>
      </ul>

and the following DocBook XML:

      <para>
      Sometimes the <emphasis>best</emphasis> approach to
      complexity is simplicity. A good structured text system is:
      </para>

      <itemizedlist>
      <listitem><para>Convenient</para></listitem>
      <listitem><para>Rich</para></listitem>
      </itemizedlist>

In fact, the text of this article is written in Structured Text. In this article, we'll look at the basics of Structured Text, organizing large text into sections, advanced formatting, and metadata issues.

Structured Text Basics

Let's plunge into structured text and look at the basics by correlating it to ideas in HTML.

The most basic idea in Structured Text is a paragraph. The following snippet:

      This is the first paragraph.

      This is the second paragraph.

...is converted to the following in HTML:

      <p>This is the first paragraph.</p>

      <p>This is the second paragraph.</p>

That is, white space matters in Structured Text. This is a very intuitive idea. For instance, in email paragraphs are separated by white space.
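
The blank-line rule can be illustrated with a minimal Python sketch. This is not the actual Structured Text implementation; the function name `paragraphs_to_html` is hypothetical and the code only handles plain paragraphs:

```python
import html
import re


def paragraphs_to_html(text):
    """Wrap each blank-line-separated chunk of text in <p> tags."""
    # One or more blank lines (possibly containing spaces) end a paragraph.
    chunks = re.split(r"\n\s*\n", text.strip())
    rendered = []
    for chunk in chunks:
        # Collapse internal line breaks and escape HTML-significant characters.
        body = html.escape(" ".join(chunk.split()), quote=False)
        rendered.append("<p>%s</p>" % body)
    return "\n\n".join(rendered)


source = "This is the first paragraph.\n\nThis is the second paragraph."
print(paragraphs_to_html(source))
```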

To introduce emphasis, Structured Text uses another text convention: asterisks. Note the following snippet:

      This is the *first* paragraph.

      This is the **second** paragraph.

In HTML, this snippet introduces the em tag and the strong tag:

      <p>This is the <em>first</em> paragraph.</p>

      <p>This is the <strong>second</strong> paragraph.</p>

Again, this is a common pattern in email. Several other common patterns are supported, such as referring to a piece of jargon:

      When you see 'STX', you know this is shorthand for 'Structured
      Text'.

The HTML output is as follows:

      <p>When you see <code>STX</code>, you know this is shorthand for
      <code>Structured Text</code>.</p>
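
The three inline conventions above can be approximated with regular expressions. The following is a rough sketch under simplifying assumptions (no nesting, no escaping of literal asterisks), and `inline_to_html` is a hypothetical name rather than part of Structured Text:

```python
import re


def inline_to_html(text):
    """Convert **strong**, *emphasis*, and 'code' markers to HTML tags."""
    # Handle ** before * so a double asterisk is not read as two single ones.
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    # Only treat a quote as a code marker when it is not inside a word,
    # so apostrophes in words such as "don't" are left alone.
    text = re.sub(r"(?<!\w)'([^']+?)'(?!\w)", r"<code>\1</code>", text)
    return text


print(inline_to_html("This is the *first* paragraph."))
print(inline_to_html("This is the **second** paragraph."))
print(inline_to_html("When you see 'STX', you know the shorthand."))
```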

Using Indentation

The preceding section focused on text conventions that convey a semantic meaning. This semantic meaning, when processed by Structured Text, produces certain HTML tags.

In Structured Text, indentation is also very important in conveying semantic meaning. The most basic case is the heading, an idea borrowed from HTML.

In the following snippet, indentation is used to convey an outline-like structure:

      Using Indentation

        The preceding section focused on text conventions that convey a
        semantic meaning. This semantic meaning, when processed by
        Structured Text, produces certain HTML tags.

This produces the following HTML:

      <h1>Using Indentation</h1>

      <p>The preceding section focused on text conventions that convey
      a semantic meaning. This semantic meaning, when processed by
      Structured Text, produces certain HTML tags.</p>

That is, the indentation conveyed a semantic meaning. The paragraph was subordinate to the heading, and the relationship is thus expressed in HTML. In fact, the outline relationship can be continued:

      Using Indentation

        The preceding section focused on text conventions that convey a
        semantic meaning. This semantic meaning, when processed by
        Structured Text, produces certain HTML tags.

        Basics of Indentation

          In this section we will investigate the basics of
          indentation...

        Hyperlinks

This produces the following HTML:

      <h1>Using Indentation</h1>

      <p>The preceding section focused on text conventions that convey
      a semantic meaning. This semantic meaning, when processed by
      Structured Text, produces certain HTML tags.</p>

      <h2>Basics of Indentation</h2>

      <p>In this section we will investigate the basics of
      indentation...</p>

      <h2>Hyperlinks</h2>
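
One way to picture the outline rule is a small Python sketch that treats a paragraph as a heading whenever the next paragraph is indented more deeply. This is an assumption made for illustration, not the real Structured Text algorithm, and `outline_to_html` is a hypothetical name. In this sketch, a trailing title with nothing indented beneath it is rendered as an ordinary paragraph:

```python
import re


def outline_to_html(text):
    """Turn an indented outline into <hN> headings and <p> paragraphs."""
    # Split into paragraphs, remembering the indentation of each first line.
    blocks = []
    for chunk in re.split(r"\n\s*\n", text.strip("\n")):
        lines = chunk.splitlines()
        indent = len(lines[0]) - len(lines[0].lstrip())
        blocks.append((indent, " ".join(line.strip() for line in lines)))

    out = []
    open_headings = []  # indentation of each heading that is still "open"
    for i, (indent, body) in enumerate(blocks):
        # A paragraph closes every heading at the same or deeper indentation.
        while open_headings and open_headings[-1] >= indent:
            open_headings.pop()
        next_indent = blocks[i + 1][0] if i + 1 < len(blocks) else -1
        if next_indent > indent:
            # Something is subordinate to this paragraph, so it is a heading.
            open_headings.append(indent)
            level = len(open_headings)
            out.append("<h%d>%s</h%d>" % (level, body, level))
        else:
            out.append("<p>%s</p>" % body)
    return "\n".join(out)


outline = """\
Using Indentation

  The preceding section focused on text conventions.

  Basics of Indentation

    In this section we will investigate the basics of
    indentation...
"""
print(outline_to_html(outline))
```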

Lists and Items

Lists are also supported in Structured Text, including unordered, ordered, and descriptive lists. The convention for unordered lists is a common pattern in text-based communication:

      HTML has three kinds of lists:

      * Unordered lists

      * Ordered lists

      * Descriptive lists

Structured Text allows you to use the symbols *, o, and - to introduce list items. The above example produces this HTML:

      <p>HTML has three kinds of lists:</p>

      <ul>

      <li><p>Unordered lists</p></li>

      <li><p>Ordered lists</p></li>

      <li><p>Descriptive lists</p></li>

      </ul>

The Structured Text convention for ordered lists is shown below:

      HTML has three kinds of lists:

      1. Unordered lists

      2. Ordered lists

      3. Descriptive lists

This produces:

      <p>HTML has three kinds of lists:</p>

      <ol>

      <li><p>Unordered lists</p></li>

      <li><p>Ordered lists</p></li>

      <li><p>Descriptive lists</p></li>

      </ol>

Descriptive lists are also easily accommodated using double dashes:

      Unordered Lists -- Generally includes a series of bullets when
      viewed in HTML.

      Ordered Lists -- HTML viewers convert the list items into a
      numbered series.

      Descriptive Lists -- Usually used for definitional lists such as
      glossaries.

This becomes the following HTML:

      <dl><dt>Unordered Lists</dt><dd><p>Generally includes a series of
      bullets when viewed in HTML.</p>
      </dd>
      <dt> Ordered Lists</dt><dd><p>HTML viewers convert the list
      items into a numbered series.</p>
      </dd>
      <dt> Descriptive Lists</dt><dd><p>Usually used for definitional
      lists such as glossaries.</p>
      </dd>
      </dl>
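
The three list conventions can be told apart by the way each paragraph starts. The sketch below is a simplified illustration with hypothetical names (`classify` and the three patterns), and it does not reproduce every rule the real parser applies:

```python
import re

# Hypothetical patterns for the three list conventions described above.
BULLET = re.compile(r"^[*o-]\s+(.*)$")
NUMBERED = re.compile(r"^\d+\.\s+(.*)$")
DEFINITION = re.compile(r"^(.+?)\s+--\s+(.*)$")


def classify(paragraph):
    """Return a tuple naming the kind of paragraph and its text parts."""
    text = " ".join(paragraph.split())  # join wrapped lines
    match = BULLET.match(text)
    if match:
        return ("ul", match.group(1))
    match = NUMBERED.match(text)
    if match:
        return ("ol", match.group(1))
    match = DEFINITION.match(text)
    if match:
        return ("dl", match.group(1), match.group(2))
    return ("p", text)


print(classify("* Unordered lists"))
print(classify("2. Ordered lists"))
print(classify("Descriptive Lists -- Usually used for definitional\nlists such as glossaries."))
```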

Example Code

As mentioned above, Structured Text authors can use an easy convention to get the monospace semantics of the CODE tag from HTML. For instance:

      When you see the dialog box, hit the 'Ok' button.

...is rendered into the following HTML:

      <p>When you see the dialog box, hit the <code>Ok</code> button.</p>

However, sometimes you want long passages of code. For instance, what if you wanted to document a Python function in the middle of an article discussing Python? You can indicate a code block by ending a paragraph with ::, and indenting the following paragraph(s). For instance, this Structured Text snippet:

      In our next Python example, we convert human years to dog years::

        def dog_years(age):
            """Convert an age to dog years"""
            return age*7

...would be converted to the following HTML:

      <p>In our next Python example, we convert human years to dog
      years:</p>

      <pre>
      def dog_years(age):
          """Convert an age to dog years"""
          return age*7
      </pre>

The convention of combining :: at the end of a paragraph-ending sentence and indenting a block does more than apply CODE semantics. It also escapes the indented block. That is how the Structured Text and HTML snippets in this article are left alone, rather than being rendered.

For example, the less than, greater than, and ampersand symbols in this code block are escaped:

      Here's an HTML example::

        <html>
        <p>This is a page about dogs & cats.</p>
        </html>

...to produce this HTML:

       <p>Here's an HTML example:</p>

       <pre>
        &lt;html&gt;
        &lt;p&gt;This is a page about dogs &amp; cats.&lt;/p&gt;
        &lt;/html&gt;
       </pre>
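
The escaping step can be sketched with Python's standard `html.escape`. The function name `literal_block_to_html` is hypothetical, and the code only illustrates the idea of a paragraph ending in `::` followed by an indented block; it is not how Structured Text itself is implemented:

```python
import html
import textwrap


def literal_block_to_html(intro, block):
    """Render an intro paragraph ending in '::' plus its indented block."""
    if not intro.endswith("::"):
        raise ValueError("the introducing paragraph must end with '::'")
    lead = intro[:-1]  # the trailing "::" is shown as a single ":"
    # Remove the common indentation, then escape <, >, and &.
    code = html.escape(textwrap.dedent(block).strip("\n"), quote=False)
    return "<p>%s</p>\n<pre>\n%s\n</pre>" % (html.escape(lead, quote=False), code)


print(literal_block_to_html(
    "Here's an HTML example::",
    "  <html>\n  <p>This is a page about dogs & cats.</p>\n  </html>",
))
```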

Hyperlinks

In the previous sections we focused on ways to get certain presentation semantics in HTML by using common text conventions.

But the web isn't just HTML. Linking words and phrases to other information and including images are equally important. Fortunately, Structured Text supports conventions for hyperlinks and image tags.

Let's start with a simple hyperlink. Suppose we have a Structured Text paragraph discussing Python:

      For more information on Python, please visit the "Python website" :http://www.python.org/.

This becomes:

      <p>For more information on Python, please visit the <a
      href="http://www.python.org/">Python website</a>.</p>

The convention is fairly simple:

  • The text of the reference is enclosed in quotes.
  • The second quotation mark is followed by a colon and a URL.
  • The URL can be followed by punctuation.

This basic convention has a number of variations. For instance, relative URLs are possible, as are mailto URLs.

(Note: in the above example, there should not be a space between the last quote and the colon. This is due to a bug in the version of structured text currently running on Zope.org. This bug has been fixed in more recent versions of Zope.)
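
The quoted-text, colon, URL convention (written without a space before the colon) can be matched with a single regular expression. This is a minimal sketch with a hypothetical function name, `links_to_html`, and it makes the simplifying assumption that a URL never ends in a punctuation character:

```python
import re

# "link text":URL, where trailing punctuation is not part of the URL.
LINK = re.compile(r'"([^"]+)":(\S*[^\s.,;:!?])')


def links_to_html(text):
    """Replace each "text":url pair with an HTML anchor."""
    return LINK.sub(r'<a href="\2">\1</a>', text)


print(links_to_html(
    'For more information, visit the "Python website":http://www.python.org/.'
))
```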

Advanced Usage

There are more obscure extensions to Structured Text to handle cross references, tables, images, and more.

One of the great things about structured text is that if you don't like its rules it's fairly easy to extend. This is made possible by the recent rewriting of Structured Text sometimes referred to as "Structured Text NG". For example, you could create a LaTeX outputter, or you could change structured text to recognize a different syntax for hyperlinks.

Structured Text is available in Zope (and is integrated into the Zope Content Management Framework), but you can also use it outside of Zope. To use Structured Text in Zope, just create a document or file containing structured text, then call it like so:

This will give you the HTML representation of my_document.

The Zope Book is an example of a project that uses Structured Text outside of Zope. The book was written in Structured Text with some modifications to support figure handling and the publisher's in-house markup format. Python scripts parse the input and create output in HTML and PDF.

Structured Text is also used in Python docstrings. A number of Python documentation extraction tools support Structured Text. Work is currently under way on the Python doc-sig to develop docstring conventions and a docstring processing system.

Conclusion

Structured Text gives you an easy way to express yourself in plain text. The Structured Text implementation allows you to tailor the syntax and output. Structured Text is integrated into Zope and is also usable outside Zope.

Resources

Structured Text Wiki - discusses structured text and STXNG.

reStructuredText - A Structured Text alternative being developed as a Python docstring standard.

Comment

Missing Fragment

Posted by: dle at 2004-04-22

This article contains the paragraphs:

Structured Text is available in Zope (and is integrated into the Zope Content Management Framework,) but you can also use it outside of Zope. To use Structured Text in Zope just create a document or file containing structured text, then call it like so::

This will give you the HTML representation of my_document.

Can someone clarify this by inserting the missing code fragment?

Comment

STX examples unclear

Posted by: heiko.stoermer at 2007-05-30

The actual STX code that is used and cited to produce the HTML outputs displayed above is unclear. Every block of STX code should go into a code environment to precisely show the characters one has to type to achieve the desired result (e.g. the "bulleted list" example cannot be reproduced).