History for WhatDoWeWantToChange

The next generation of structured text has three main goals:

	1. Break the processes of organizing the paragraphs, tagging special
	   elements, and parsing the paragraphs into distinct and separate
	   phases.

	2. Enable the user to define new structured text types and/or
	   overload/ignore old types.

	3. Enable the user to define new or custom parsers.

How are the processes broken up?

	- The first module is ST. This module contains the StructuredText
	   function and the DOC class.

	- The first step is organizing the paragraphs. This is done by
	   the StructuredText function. This function returns a structure
	   of the format ["A",[B]]. Part A is a paragraph; part B is a
	   list of sub-paragraphs of part A. This list is nested and can be
	   traversed to find the sub-paragraphs of part A's sub-paragraphs.
	   Each element of the list has the same format, ["A",[B]].
	
	- The second step is tagging structured text types in each
	   paragraph. This is done by the DOC class. DOC first receives the
	   structure returned by the StructuredText function. DOC maintains an
	   internal list of class instances, one for each structured text type.
	   Each class takes a raw string (an unprocessed paragraph) and searches
	   it for its structured text type. The structure returned by DOC is very
	   similar to the structure returned by StructuredText. The only
	   difference is that part A is now a list of raw strings and
	   structured text instances.
	   'EX : original paragraph = "this is a link to "john":john\n" is
	   returned as [["this is a link to ", <ST.doc.href1 instance>, "\n"], []]'.
	   Each instance maintains an internal string which
	   holds the original matching string and any other structured text
	   types found in the string.

	- The final stage is parsing the structure returned by DOC. The
	   parser, by my definition, is something that interprets the raw
	   strings and type instances and generates the appropriate code,
	   such as HTML. The parser traverses each paragraph and each
	   instance's string and interprets it based on what type of
	   instance the string belongs to.
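
The ["A",[B]] structure from the first step can be sketched in Python as follows. This is only an illustration, not the ST module's code: the names 'structure' and 'walk' are hypothetical, and the sketch assumes that paragraphs are separated by blank lines and that a more deeply indented paragraph is a sub-paragraph of the one before it.

```python
import re

def indent_of(paragraph):
    """Number of leading spaces on the paragraph's first line."""
    return len(paragraph) - len(paragraph.lstrip(" "))

def structure(text):
    """Group paragraphs into nested [paragraph, [sub-paragraphs]] nodes.

    Assumption: nesting is decided purely by indentation.
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", text.expandtabs())
                  if p.strip()]
    root = []                # top-level list of nodes
    stack = [(-1, root)]     # (indent, child list) of the open ancestors
    for p in paragraphs:
        level = indent_of(p)
        # Close every ancestor that is not shallower than this paragraph.
        while stack[-1][0] >= level:
            stack.pop()
        node = [p.strip(), []]
        stack[-1][1].append(node)
        stack.append((level, node[1]))
    return root

def walk(nodes, depth=0):
    """Yield (depth, paragraph) pairs, parents before their children."""
    for paragraph, subs in nodes:
        yield depth, paragraph
        yield from walk(subs, depth + 1)

if __name__ == "__main__":
    sample = "Top\n\n  Child one\n\n    Grandchild\n\n  Child two"
    for depth, paragraph in walk(structure(sample)):
        print("  " * depth + paragraph)
```

Because every node has the same two-element shape, one recursive function is enough to visit the whole document.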

How does the user define new types or extend/overload old ones?

	- To define a new type, the user would need to create a class for
	  the new type. This class would contain the expr that matches the
	  new type. The class's '__call__' method would be overloaded to
	  receive a raw string and determine if it matches the new type.
	  If the raw string matches, a new instance of the type is created
	  and that instance's string becomes the matching sub-string. The
	  instance also maintains the start and end positions of the
	  sub-string in relation to the original string.
	  The class also needs to provide a span method, which returns a
	  tuple (start,end) of the sub-string's position.
	
	- To change how a type is matched the user would need to alter
	  the expr in the class for the type to be changed.

	- To ignore a type, the easy way is to remove the type from
	  self.types in DOC. This can either be done by brute force, by
	  literally removing it from the code, or by subclassing DOC
	  and simply splicing the type out of self.types.
	  **NOTE : this requires knowing the location of the type in the list**
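
A minimal sketch of ignoring a type by subclassing follows. 'Doc', 'Href' and 'Emphasis' are hypothetical stand-ins that model only the self.types list, not the real classes. The sketch also assumes that every type offers the .type() call, so the list can be filtered by name instead of by position.

```python
class Href:
    """Stand-in for an existing structured text type (hypothetical)."""
    def type(self):
        return "href"

class Emphasis:
    """Stand-in for a second existing type (hypothetical)."""
    def type(self):
        return "emphasis"

class Doc:
    """Stand-in for DOC: only the self.types list is modeled here."""
    def __init__(self):
        self.types = [Href(), Emphasis()]

class DocWithoutHref(Doc):
    """Subclass that no longer recognizes the href type."""
    def __init__(self):
        Doc.__init__(self)
        # Splicing by position would be: del self.types[0]
        # Filtering on the reported type name avoids having to know
        # where the type sits in the list.
        self.types = [t for t in self.types if t.type() != "href"]

if __name__ == "__main__":
    print([t.type() for t in DocWithoutHref().types])
```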

How to extend DOC to recognize new types
	
	Define the new type

		1. Need to write a new class for the type. This class must have the
		   following:

			- an overloaded call function

			- an overloaded init function

			- a span function

			- a type function

			- a string function

		2. The init function will create a self.str item for the new type. It also
		   creates two items for the span function, self.start and self.end.

		3. The type function will return a string which tells what type the instance is.
		   Ex : the current header class's .type() returns "header"

		4. The string function returns self.str.

		5. The overloaded call function receives a string and determines if there is
		   a matching structured text type in the string. If there is, set self.start and
		   self.end to the range of the sub-string that matches. Create a new
		   instance whose string is the matching sub-string. Return the new instance.

		6. span returns the tuple (self.start,self.end)
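
The six requirements above can be sketched as one small class. Everything specific here is an assumption made for illustration: the 'Strong' type, its double-asterisk expr, storing only the text between the markers, and returning None when nothing matches. The 'tag' helper is also hypothetical and only shows why span is needed.

```python
import re

class Strong:
    """Hypothetical new type: text wrapped in double asterisks."""

    expr = re.compile(r"\*\*([^*]+)\*\*")

    def __init__(self, text="", start=0, end=0):
        self.str = text        # string held by this instance
        self.start = start     # span of the match in the raw string
        self.end = end

    def __call__(self, raw_string):
        """Return a new instance for the first match, else None."""
        match = self.expr.search(raw_string)
        if match is None:
            return None        # assumption: no match is signalled by None
        self.start, self.end = match.span()
        # Assumption: keep the text between the markers as the string.
        return Strong(match.group(1), self.start, self.end)

    def span(self):
        return (self.start, self.end)

    def type(self):
        return "strong"

    def string(self):
        return self.str

def tag(raw_string, matcher):
    """Split a raw string around the first match (hypothetical helper)."""
    found = matcher(raw_string)
    if found is None:
        return [raw_string]
    start, end = found.span()
    return [raw_string[:start], found, raw_string[end:]]

if __name__ == "__main__":
    parts = tag("this is **very** bold", Strong())
    print(parts[0], parts[1].string(), parts[2])
```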

	Make it so DOC can recognize the new type
		
		1. Need to create a new DOC, which subclasses the old DOC

		2. Overload the init function. Perform the original DOC init, but then
		   self.types needs to be modified. This is a **list** of structured text
		   types. An instance of the new type must be inserted/appended to
		   the list. **NOTE : Order does matter**.
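
The two steps above might look like this. 'Doc' and the three type classes are hypothetical stand-ins that model only self.types. The reason given in the comment for placing the new type before emphasis is an assumption about how the list is consulted, namely that types are tried in list order.

```python
class Header:
    def type(self):
        return "header"

class Emphasis:
    def type(self):
        return "emphasis"

class Strong:
    """The new type to be recognized (hypothetical)."""
    def type(self):
        return "strong"

class Doc:
    """Stand-in for the old DOC: only self.types is modeled."""
    def __init__(self):
        self.types = [Header(), Emphasis()]

class MyDoc(Doc):
    """New DOC that also recognizes the strong type."""
    def __init__(self):
        Doc.__init__(self)     # perform the original init first
        # Order matters. Assuming types are tried in list order, a
        # "**" marker must be offered to Strong before Emphasis can
        # claim its single "*" characters.
        names = [t.type() for t in self.types]
        self.types.insert(names.index("emphasis"), Strong())

if __name__ == "__main__":
    print([t.type() for t in MyDoc().types])
```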

How to extend DOC to overload old types

How to extend a parser
	
   Sub-Class an older parser

	1. Why Sub-class an older parser?

		- If the user is modifying a small number of structured text types, it is faster
		   to sub-class and have the majority of the types pre-defined by an old
		   parser.

   To add a new type
		
	2. Modify the self.types for the parser class. This is a dictionary, so order is
	    irrelevant. Ex: self.types["newtypename"] = self.typefunction
	    where newtypename is the string returned by the new instance's .type()
	    call and typefunction is the function in the parser which handles that
	    instance type.

	3. Modify the self.self_par if the structure marks paragraphs internally.
	   Ex: self.self_par.append("newtypename") where newtypename is the
	   string returned by the new instance's .type() call.
	  **Headers and lists do this currently**
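
Registering a new type in a sub-classed parser could be sketched as below. The 'HTML' base class, the 'Note' type and the generated markup are hypothetical stand-ins. Only the self.types dictionary, the self.self_par list and the .type()/.string() calls come from the description above.

```python
class Note:
    """Stand-in for a tagged instance of a hypothetical "note" type."""
    def __init__(self, text):
        self.text = text

    def type(self):
        return "note"

    def string(self):
        return self.text

class HTML:
    """Stand-in for an older parser (hypothetical name)."""
    def __init__(self):
        self.string = ""
        self.types = {"header": self.header}
        self.self_par = ["header"]

    def instance(self, obj):
        # Dispatch on the name reported by the instance.
        self.types[obj.type()](obj)

    def header(self, obj):
        self.string = self.string + "<h1>%s</h1>" % obj.string()

class NoteHTML(HTML):
    """Sub-class that adds a handler for the new type."""
    def __init__(self):
        HTML.__init__(self)
        self.types["note"] = self.note
        # The note handler writes its own paragraph markup, so the
        # type is listed in self_par as well.
        self.self_par.append("note")

    def note(self, obj):
        self.string = self.string + '<p class="note">%s</p>' % obj.string()

if __name__ == "__main__":
    parser = NoteHTML()
    parser.instance(Note("careful"))
    print(parser.string)
```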

   To overload a built-in type
	
	1. In the new class, re-define the function which handles the type.
	   Ex: to overload header, re-define header in the sub-class:
	   'def header(self,object): self.string = self.string + "I am not a header or crook"'
	
	2. Remember that functions receive instances of the types they handle. To
	   go through the instance's string, use the .string() call.

	3. The .string() call returns either a string (if the object's string is text only) or
	   a list (if the object's string contains other instances). If a list is returned, it
	   is necessary to go through each item. There can be only three things in
	   a list: strings, lists, and instances. For strings, call the self.paragraph function, or
	   whatever function handles strings. For lists, call the self.loop function. For
	   instances, call the self.instance function.
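
Overloading a handler and walking the result of .string() can be sketched together. The 'Instance' and 'Parser' classes and the 'contents' helper are hypothetical stand-ins. The function names paragraph, loop and instance follow the text above, but their bodies are assumptions.

```python
class Instance:
    """Stand-in for a tagged structured text instance."""
    def __init__(self, kind, content):
        self.kind = kind
        self.content = content   # a str, or a list of str/list/Instance

    def type(self):
        return self.kind

    def string(self):
        return self.content

class Parser:
    """Stand-in for an older parser."""
    def __init__(self):
        self.string = ""
        self.types = {"header": self.header, "emphasis": self.emphasis}

    def paragraph(self, text):
        self.string = self.string + text

    def loop(self, items):
        # Only strings, lists and instances can appear in a list.
        for item in items:
            if isinstance(item, str):
                self.paragraph(item)
            elif isinstance(item, list):
                self.loop(item)
            else:
                self.instance(item)

    def instance(self, obj):
        self.types[obj.type()](obj)

    def contents(self, obj):
        """Handle whatever .string() returns (hypothetical helper)."""
        body = obj.string()
        if isinstance(body, str):
            self.paragraph(body)
        else:
            self.loop(body)

    def header(self, obj):
        self.string = self.string + "<h1>"
        self.contents(obj)
        self.string = self.string + "</h1>"

    def emphasis(self, obj):
        self.string = self.string + "<em>"
        self.contents(obj)
        self.string = self.string + "</em>"

class SmallHeaderParser(Parser):
    """Sub-class that overloads only the header handler."""
    def header(self, obj):
        self.string = self.string + "<h2>"
        self.contents(obj)
        self.string = self.string + "</h2>"

if __name__ == "__main__":
    head = Instance("header", ["Hello ", Instance("emphasis", "world")])
    parser = SmallHeaderParser()
    parser.instance(head)
    print(parser.string)
```

In this sketch the base class fills self.types with bound methods looked up on the instance, so re-defining header in the sub-class is enough and the dictionary does not need to be touched.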