StructuredTextNG Vision


StructuredText has proven extremely useful; however, a number of shortcomings have become apparent in the current implementation:

  • Not everyone likes the ClassicStructuredTextRules, and it is too hard to change the rules used.
  • It is too hard to produce different kinds of output, other than the existing HTML (and MML) output.
  • Structuring and formatting are combined in such a way that the structuring information used to create the output is lost after processing. It would be useful to make the structure more accessible.


Provide a new implementation of StructuredText that addresses the above problems by satisfying the following goals:

  1. Make it much easier to adapt StructuredText to support different structuring rules and output options.
  2. Increase the maintainability of the StructuredText software.
  3. Provide a DOM interface to parsed StructuredText. *How do we plan to do this? It seems we would need some kind of NameSpace to allow DOM addressability, whereas currently StructuredText's structures are unnamed. Does providing a NameSpace perhaps violate StructuredTextZen, or is that a non-issue since StructuredTextUsers won't see the DOM interface? --zigg*
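
As one possible reading of goal 3, the sketch below uses Python's standard xml.dom.minidom to show that unnamed structures can still be addressed by element type and by position, so a separate naming scheme may not be strictly required. The element names (document, section, title, paragraph) and the hand-built tree are assumptions made for illustration; they are not the node types of any StructuredText implementation.

```python
from xml.dom.minidom import getDOMImplementation


def build_example_document():
    """Hand-build the tree a parser might produce for a title and two paragraphs."""
    impl = getDOMImplementation()
    doc = impl.createDocument(None, "document", None)
    section = doc.createElement("section")  # hypothetical element name
    doc.documentElement.appendChild(section)
    for tag, text in [("title", "Vision"),
                      ("paragraph", "First."),
                      ("paragraph", "Second.")]:
        node = doc.createElement(tag)
        node.appendChild(doc.createTextNode(text))
        section.appendChild(node)
    return doc


doc = build_example_document()

# Address by element type and index: no names were assigned to the paragraphs.
paragraphs = doc.getElementsByTagName("paragraph")
second = paragraphs[1].firstChild.data

# Address purely by position: document -> section -> title -> text node.
title = doc.documentElement.firstChild.firstChild.firstChild.data
```

Whether positional addressing is enough, or whether authors would need a way to name structures explicitly, is exactly the open question raised above; this sketch only shows that the DOM itself does not force the issue.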

Risk Factors

  • Backward compatibility for products that depend on the current implementation of StructuredText


This project will have essentially the same scope as the original structured text project, which is to provide a Python-level interface for managing structured texts. Higher-level applications and interfaces (e.g. Wiki pages or Zope editing interfaces) are outside the scope of this effort.


gvanrossum (Mar 29, 2001 7:12 pm; Comment #1)
This vision, and the whole project, seems almost exclusively focused on fixing the ST software architecture, and not on fixing the rules. A current doc-sig discussion suggests that many of the problems with ST lie in the ClassicStructuredTextRules.

aexl (Jan 2, 2003 4:49 pm; Comment #2) Editor Remark Requested
this topic is parallel to many wiki discussions.

my conclusions are:

  • different structured text / wiki notations will evolve and users will pick the elements they need
  • some intermediate format will evolve, most probably an xml app

so in my opinion the way to go is

  • make the structured text engine more generic; the keyword here is "recursive-descent parser", examples are Yapps, kwParsing and TPG
  • store the text in parsed xml (e.g. xhtml)
  • render the text with xslt or some other declarative transformation
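
the three steps above can be sketched in plain Python. this is a minimal illustration under stated assumptions, not a proposed implementation: the parser is hand-written rather than generated by Yapps, kwParsing or TPG; the only rule it knows is "a paragraph followed by more deeply indented paragraphs becomes a section title"; the element names (document, section, title, p) are invented; and because the Python standard library has no xslt processor, a small rendering function stands in for the stylesheet.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def split_paragraphs(text):
    """Return (indent, text) pairs, one per blank-line-separated paragraph."""
    result = []
    for chunk in text.split("\n\n"):
        if chunk.strip():
            first = chunk.lstrip("\n")
            indent = len(first) - len(first.lstrip(" "))
            result.append((indent, " ".join(chunk.split())))
    return result


def parse_blocks(paras, pos, parent, min_indent):
    """Rule 'blocks': consume paragraphs indented at least min_indent."""
    while pos < len(paras) and paras[pos][0] >= min_indent:
        pos = parse_block(paras, pos, parent)
    return pos


def parse_block(paras, pos, parent):
    """Rule 'block': a paragraph, plus any deeper paragraphs as its children."""
    indent, text = paras[pos]
    has_children = pos + 1 < len(paras) and paras[pos + 1][0] > indent
    if has_children:
        section = ET.SubElement(parent, "section")
        ET.SubElement(section, "title").text = text
        return parse_blocks(paras, pos + 1, section, indent + 1)  # recurse
    ET.SubElement(parent, "p").text = text
    return pos + 1


def parse(text):
    """Step 1: source text to an XML tree."""
    root = ET.Element("document")
    parse_blocks(split_paragraphs(text), 0, root, 0)
    return root


def render_html(node, level=1):
    """Step 3: stored tree to HTML (a stand-in for an xslt stylesheet)."""
    if node.tag == "p":
        return "<p>%s</p>" % escape(node.text)
    if node.tag == "title":
        return "<h%d>%s</h%d>" % (level, escape(node.text), level)
    if node.tag == "section":
        return "".join(
            render_html(c, level + 1 if c.tag == "section" else level)
            for c in node)
    return "".join(render_html(c, level) for c in node)  # document root


SOURCE = "Vision\n\n  First point.\n\n  Second point.\n\nClosing remark."
tree = parse(SOURCE)
stored_xml = ET.tostring(tree, encoding="unicode")  # step 2: what gets stored
html = render_html(tree)
```

here `stored_xml` is the pre-parsed form that would be kept, and `render_html` could be swapped for another renderer without touching the parser.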

now some questions come to mind:

  • performance: a generic parsing solution has its costs, but parsing happens just once: the stored data is pre-parsed, and rendering is quite cheap.
  • proof of inversion correctness: of course the rendering and the parsing process must satisfy r(p(x))=x and p(r(y))=y, where p parses source text x and r renders a parsed document y back to the source notation. the use of appropriate declarative descriptions for the parsing (recursive-descent grammar) and rendering (xslt) process makes this feasible.
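
the inversion property can be made concrete with a toy notation. the sketch below is an assumption-laden illustration, not a proof technique: the notation (blank-line-separated blocks, with "* " marking list items) and the tuple-based tree are invented here, and only the names p and r are taken from the text above.

```python
def p(text):
    """Parse: a block whose lines all start with '* ' is a list, else a paragraph."""
    tree = []
    for block in text.split("\n\n"):
        lines = block.split("\n")
        if all(line.startswith("* ") for line in lines):
            tree.append(("list", tuple(line[2:] for line in lines)))
        else:
            tree.append(("para", block))
    return tuple(tree)


def r(tree):
    """Render a parsed tree back to the source notation."""
    blocks = []
    for kind, content in tree:
        if kind == "list":
            blocks.append("\n".join("* " + item for item in content))
        else:
            blocks.append(content)
    return "\n\n".join(blocks)


x = "Intro text\nsecond line\n\n* one\n* two\n\nEnd"
y = p(x)

text_round_trip_ok = r(p(x)) == x
tree_round_trip_ok = p(r(y)) == y

# A tree the parser could never have produced: a paragraph that looks like a list.
malformed = (("para", "* a"),)
malformed_survives = p(r(malformed)) == malformed
```

in this toy case both checks succeed for x and y, while `malformed_survives` is false: p(r(y))=y can only be expected for trees that the parser itself could produce, which suggests the stored format needs a definition of what a well-formed tree is.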

this proposal has some more payoffs:

  • rendering content can be done in different ways (html, wap, xml, stx, ...)
  • dom comes in for free with an xml format
  • controlled editing: for many content maintainers this is an important point. example: authors should be able to use headings, links and images to write a magazine article, but not suffer from the temptation to use all the other fancy markup. this feature comes in for free if we customize the parser.
  • the xml can be checked for conformance with xml schema (controlled editing!)
  • xml content allows easy migration to wysiwyg editors like xopus (which allows controlled editing governed by an xml schema!)
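
to illustrate the controlled-editing idea, here is a minimal sketch of a conformance check on stored xml. note the substitution: the Python standard library has no xml schema validator, so a plain whitelist of element names stands in for a real schema. the allowed set and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy for magazine authors: headings, paragraphs, links, images.
ALLOWED = {"article", "h1", "h2", "p", "a", "img"}


def disallowed_elements(xml_text, allowed=ALLOWED):
    """Return the sorted names of elements that fall outside the allowed set."""
    root = ET.fromstring(xml_text)
    return sorted({el.tag for el in root.iter() if el.tag not in allowed})


plain = ('<article><h1>Title</h1>'
         '<p>See <a href="x">this</a>.</p><img src="y.png"/></article>')
too_fancy = "<article><p>Some <blink>fancy</blink> <font>markup</font></p></article>"

plain_problems = disallowed_elements(plain)
fancy_problems = disallowed_elements(too_fancy)
```

a real schema would also constrain nesting and attributes, which this name-only check does not attempt.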

phew, when i started i meant to write just "two sentences"...