StructuredText has proven extremely useful. However, a number of shortcomings have become apparent in the current implementation:
- Not everyone likes the ClassicStructuredTextRules, and it is too hard to change the rules used.
- It is too hard to produce output formats other than the existing HTML (and MML).
- Structuring and formatting are combined in such a way that the structuring information used to create the output is lost after processing. It would be useful to make the structure more accessible.
Provide a new implementation of StructuredText that addresses the above problems by satisfying the following goals:
- Make it much easier to adapt StructuredText to support different structuring rules and output options.
- Increase the maintainability of the StructuredText software.
- Provide a DOM interface to parsed StructuredText. *How do we plan to do this? It seems we would need some kind of NameSpace to allow DOM addressability, whereas currently StructuredText's structures are unnamed. Does providing a NameSpace perhaps violate StructuredTextZen, or is that a nonissue since StructuredTextUsers won't see the DOM interface? --[zigg]*
- Maintain backward compatibility for products that depend on the current implementation of StructuredText.
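On the DOM question raised in the goals above: one possible reading is that DOM addressing does not need user-visible names, because elements can be reached by type and position. The sketch below uses Python's standard `xml.dom.minidom`; the XML shape and the element names (`document`, `section`, `title`, `para`) are assumptions made for illustration, not the output of any existing StructuredText code.

```python
from xml.dom.minidom import parseString

# Hypothetical XML form of a parsed document; the element names are assumptions.
SAMPLE = (
    "<document><section><title>Intro</title>"
    "<para>First.</para><para>Second.</para></section></document>"
)

def nth_of_type(node, tag, index):
    """Address an element by type and position instead of by name."""
    return node.getElementsByTagName(tag)[index]

def text_of(element):
    """Concatenate the text nodes directly under an element."""
    return "".join(c.data for c in element.childNodes if c.nodeType == c.TEXT_NODE)

dom = parseString(SAMPLE)
section = dom.documentElement.firstChild
print(section.childNodes[0].tagName)             # first child, by position
print(text_of(nth_of_type(section, "para", 1)))  # second paragraph, by type
```

Under this reading the structures stay unnamed in the source text, and only programs that walk the tree see the element types.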
This project will have essentially the same scope as the original structured text project, which is to provide a Python-level interface for managing structured texts. Higher-level applications and interfaces (e.g. Wiki pages or Zope editing interfaces) are outside the scope of this effort.
- A new StructuredText package with the capabilities described above
- Documentation of the StructuredTextRules as implemented by the package
- gvanrossum (Mar 29, 2001 7:12 pm; Comment #1)
- This vision, and the whole project, seems almost exclusively focused on fixing the ST software architecture, and not on fixing the rules. A current doc-sig discussion suggests that many of the problems with ST lie in the ClassicStructuredTextRules.
- aexl (Jan 2, 2003 4:49 pm; Comment #2)
- this topic is parallel to many wiki discussions.
my conclusions are:
- different structured text / wiki notations will evolve and users will pick the elements they need
- some intermediate format will evolve, most probably an xml app
so in my opinion the way to go is
- make the structured text engine more generic; a keyword for this is "recursive-descent parser", and examples are Yapps, kwParsing and TPG
- store the text in parsed xml (e.g. xhtml)
- render the text with xslt or some other declarative transformation
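A minimal sketch of the three steps above (parse, store as xml, render), under loud assumptions: the notation is a toy in which a paragraph followed by a more deeply indented paragraph becomes a section title, the element names are invented for illustration, and the renderer is a small Python function standing in for xslt because the standard library has no xslt processor.

```python
import xml.etree.ElementTree as ET
from html import escape

def split_paragraphs(text):
    """Return (indent, text) pairs, one per blank-line-separated paragraph."""
    paragraphs = []
    for block in text.split("\n\n"):
        lines = [ln for ln in block.splitlines() if ln.strip()]
        if lines:
            indent = len(lines[0]) - len(lines[0].lstrip())
            paragraphs.append((indent, " ".join(ln.strip() for ln in lines)))
    return paragraphs

def parse_level(paragraphs, pos, indent, parent):
    """Recursive descent: consume paragraphs at `indent`, recursing into deeper ones."""
    while pos < len(paragraphs) and paragraphs[pos][0] >= indent:
        level, body = paragraphs[pos]
        deeper = pos + 1 < len(paragraphs) and paragraphs[pos + 1][0] > level
        if deeper:
            # A paragraph followed by deeper text is treated as a section title.
            section = ET.SubElement(parent, "section")
            ET.SubElement(section, "title").text = body
            pos = parse_level(paragraphs, pos + 1, paragraphs[pos + 1][0], section)
        else:
            ET.SubElement(parent, "para").text = body
            pos += 1
    return pos

def parse(text):
    """Parse the toy notation into an xml tree (element names are hypothetical)."""
    root = ET.Element("document")
    parse_level(split_paragraphs(text), 0, 0, root)
    return root

def render_html(node, depth=0):
    """Stand-in for an xslt stylesheet: map each element type to html."""
    if node.tag == "title":
        return f"<h{depth}>{escape(node.text)}</h{depth}>"
    if node.tag == "para":
        return f"<p>{escape(node.text)}</p>"
    # document and section only render their children; sections deepen headings
    next_depth = depth + 1 if node.tag == "section" else depth
    return "".join(render_html(child, next_depth) for child in node)

source = "Title\n\n  First paragraph.\n\n  Sub\n\n    Deep.\n\nTail"
tree = parse(source)
stored = ET.tostring(tree, encoding="unicode")  # the pre-parsed form to store
print(stored)
print(render_html(tree))
```

Only `stored` would need to be kept; a different renderer over the same tree could target another output format without touching the parser.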
now some questions come to mind:
- performance: a generic parsing solution has its costs. but: parsing happens just once, the stored data is pre-parsed and rendering is quite cheap.
- proof of inversion correctness: of course the rendering and the parsing process must satisfy r(p(x))=x and p(r(y))=y, where p parses source text x and r renders parsed content y back to source text. the use of appropriate declarative descriptions for the parsing (recursive-descent grammar) and rendering (xslt) process makes this feasible.
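The inversion property can at least be tested, if not proven, on sample documents. The sketch below assumes a hypothetical line-oriented toy notation (`= ` for headings, `- ` for items, anything else a paragraph) and uses plain Python for both `p` and `r`; it is an illustration of the check, not a proposed grammar.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy notation: line prefix -> element tag; no prefix means paragraph.
PREFIXES = {"= ": "heading", "- ": "item"}

def p(text):
    """Parse source text into an xml tree."""
    root = ET.Element("doc")
    for line in text.splitlines():
        for prefix, tag in PREFIXES.items():
            if line.startswith(prefix):
                ET.SubElement(root, tag).text = line[len(prefix):]
                break
        else:
            ET.SubElement(root, "para").text = line
    return root

def r(root):
    """Render an xml tree back to source text."""
    by_tag = {tag: prefix for prefix, tag in PREFIXES.items()}
    return "\n".join(by_tag.get(el.tag, "") + (el.text or "") for el in root)

def same_tree(a, b):
    """Compare two trees by their serialized form."""
    return ET.tostring(a, encoding="unicode") == ET.tostring(b, encoding="unicode")

x = "= Title\nSome text\n- one\n- two"
y = p(x)
print(r(p(x)) == x)            # r(p(x)) = x
print(same_tree(p(r(y)), y))   # p(r(y)) = y
```

Even in this toy the identities hold only on canonical forms: a trailing newline in `x` is dropped by `p`, and a paragraph whose text begins with `- ` would come back from `p(r(y))` as an item. A real grammar would need escaping rules or a normalization step for such cases.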
this proposal has some more payoffs:
- rendering content can be done in different ways (html, wap, xml, stx, ...)
- dom comes in for free with an xml format
- controlled editing: for many content maintainers this is an important point. example: authors should be able to use headings, links and images to write a magazine article, but not suffer from the temptation to use all the other fancy markup. this feature comes in for free if we customize the parser.
- the xml can be checked for conformance with xml schema (controlled editing!)
- xml content allows easy migration to wysiwyg editors like xopus (which allows controlled editing governed by an xml schema!)
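Controlled editing can be sketched as a conformance check on the stored xml. Python's standard library has no xml schema validator, so the example below swaps in a hand-written table of allowed child elements; the `ALLOWED` profile and all element names are hypothetical and only mirror the magazine-article example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile for magazine authors: headings, links and images only.
ALLOWED = {
    "article": {"heading", "para"},
    "heading": set(),
    "para": {"link", "image"},
    "link": set(),
    "image": set(),
}

def violations(node, path=""):
    """Return the paths of elements that the ALLOWED profile does not permit."""
    here = f"{path}/{node.tag}"
    if node.tag not in ALLOWED:
        return [here]
    found = []
    for child in node:
        if child.tag not in ALLOWED[node.tag]:
            found.append(f"{here}/{child.tag}")
        else:
            found.extend(violations(child, here))
    return found

ok = ET.fromstring(
    '<article><heading>News</heading><para>See <link href="x">this</link></para></article>'
)
bad = ET.fromstring(
    "<article><para>Some <blink>fancy</blink> text</para><table/></article>"
)
print(violations(ok))
print(violations(bad))
```

A real deployment would more likely express the profile as an xml schema and let a validating tool or editor enforce it; the table here only shows the shape of the rule.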
phew, when i started i thought i would write "two sentences"...