1.0 Document id
	
  
  
      1.1 General
  
        Copyright © 1996-2010 Jari Aalto
        License: This material may be distributed only subject to
        the terms and conditions set forth in GNU General Public
        License v2 or later; or, at your option, distributed under the
        terms of GNU Free Documentation License version 1.2 or later
        (GNU FDL).
  
  
      1.2 T2html program features
  
        Writing text documents is different from writing messages to
        Usenet or to your fellow workers. Several tools already exist
        for converting email messages into HTML, such as MHonArc and
        Email hyper archiver, but a few years back there was no
        suitable HTML converter for regular text documents such as
        memos, FAQs, help pages and other papers. The author wanted a
        simple HTML tool that would read plain text documents, such
        as guides, tips pages, documentation and bookmark pages, and
        convert them into HTML. Here you will find the specification
        for formatting your text documents for the t2html.pl Perl
        script, a text to HTML converter.

        A few arguments for why plain text is the best source document
        format:

        - It is readable by all, without any extra software.

        - Deliverable by email, as is.

        - Most easily kept in version control.

        - Most easily patched (when someone sends a diff -u ...).

        - Most easily handed to someone else when the author no longer
          maintains it. (If you have specialized tools, people need to
          learn them in order to maintain your FAQ.)
       1.2.1 Overview of features:
        - Requires Perl 5.004 or newer.

        - A 500K text document takes 70 seconds to convert to HTML.

        - TF to Perl POD conversion may be in a future plan.

        - Better linking of multiple files is planned.

        - A configuration file for individual file options is planned.
       1.2.2 HTML conversion
        - Minimal markup: rendering is based on indentation level.
          The written text document looks like a "Natural Document"
          and is suitable for reading as such.

        - Text layout with indentation rules is called Technical Format
          (TF), and a document must be formatted according to it before
          it is suitable for HTML generation.

        - Rules are simple: place headings to the left and text at
          column 8.

        - Program generates META tags for search engines.

        - Colored HTML page: <EM> <STRONG> <PRE> ...

        - Hyperlinks and email addresses are automatically detected.
          No markup is needed.
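
        As a rough illustration of the layout rule above (headings to
        the left, text at column 8), the following shell sketch labels
        each input line by its indentation. The function name
        tf_classify is hypothetical, and the check is a deliberate
        simplification, not the parser that t2html.pl uses.

```shell
# Label each line read from standard input by its indentation.
# Assumption: a heading starts in the first column and body text is
# indented by at least 8 spaces; anything else is reported as "other".
tf_classify() {
    awk '
        /^[ \t]*$/ { print "blank";   next }
        /^[^ \t]/  { print "heading"; next }
        /^        / { print "text";   next }
                   { print "other" }
    '
}
```

        Running a draft through such a check before conversion can
        point out lines that sit at an unexpected column, for example:
        tf_classify < index.txt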
       1.2.3 HTML 4.01
        - Make a single HTML page (1 file) or a framed version (3 files).

        - Sample CSS2 (Cascading Style Sheet) is included in the HTML
          code for document rendering. Users can import their own CSS2.
       1.2.4 Link check for the text file
        - You need the LWP module in order to use this feature. (Comes
          with the latest Perl.)

        - Program has switches to run Link check on your text file
          to find any broken or moved links. Currently you have to
          fix the links manually, but an Emacs mode to do this
          automatically is planned. The output from Link check is in
          standard grep style:  *FILE:NBR:Error-Description*
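
        Because the report uses the grep style FILE:NBR:Error-Description,
        it can be post-processed with ordinary shell tools. The sketch
        below only assumes that three-field layout; the function name
        summarize_link_errors and the sample error text in the usage
        note are hypothetical.

```shell
# Reformat grep-style link check lines (FILE:NBR:Error-Description)
# read from standard input into a more verbose one-line summary.
summarize_link_errors() {
    # With IFS=: the third variable receives the rest of the line,
    # so colons inside the description (e.g. in URLs) are preserved.
    while IFS=: read -r file nbr desc; do
        # Skip lines whose second field is not a plain line number.
        case $nbr in
            ''|*[!0-9]*) continue ;;
        esac
        printf '%s line %s: %s\n' "$file" "$nbr" "$desc"
    done
}
```

        For a hypothetical report line such as
        index.txt:12:404 Not Found http://example.com/x
        the function prints the file name, the line number and the
        untouched description, which makes it easier to walk through
        the links that need manual fixing.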
       1.2.5 Splitting the text file to pieces
        - You can split a very large document into pieces, e.g. according
          to top level headings, and convert each piece to HTML. This is
          also handy if you're planning to print slides for a class:
          split on headings to individual files, raise the font point
          size and print each file separately.
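
        The idea of splitting on top level headings can be sketched
        outside the program as well. The example below is an
        illustration only and does not use t2html.pl's own split
        switches; it assumes that a top level heading is any line that
        starts in the first column, and the function name and output
        file names (PREFIX-01.txt, PREFIX-02.txt, ...) are hypothetical.

```shell
# Split a text file into one file per top level heading.
# Usage: split_on_headings INPUT PREFIX
# Assumption: a top level heading is a line that starts in column 0
# with a non-blank character. Lines before the first heading are
# not written anywhere.
split_on_headings() {
    input=$1
    prefix=$2
    awk -v prefix="$prefix" '
        /^[^ \t]/ {
            # A new heading: close the previous piece, start the next.
            if (out != "") close(out)
            out = sprintf("%s-%02d.txt", prefix, ++n)
        }
        out != "" { print > out }
    ' "$input"
}
```

        Each resulting piece starts with its heading line, so it can
        be converted to HTML or printed on its own.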
      1.3 How to convert text files into HTML?
  
        The TF specification can be found in the Manual.
        The command used to generate this page was:
    
      t2html.pl                                                     \
      --author           "Jari Aalto"                               \
      --title            "Conversion for text files"                \
      --html-body         LANG=en                                   \
      --Out                                                         \
      index.txt
  
  
      1.4 Writing a text document
  
        All you need is a text editor that displays the current column
        number or that can be configured to advance your TAB by 4
        spaces. That's it.

        An Emacs minor mode (see package tinytf.el) can make writing
        documents easy. The mode helps with formatting paragraphs,
        filling bullets, numbering headings and keeping the TOC up to
        date.
  
  
      1.5 Ripping program documentation
  
       1.5.1 Documentation tools in programming languages
        Perl is an exception among programming languages, because it
        includes an internal documentation syntax called POD (Plain Old
        Documentation), with which you can embed documentation right
        into the program source. Deriving the documentation from Perl
        programs is a straightforward job. Another well known language
        (invented long after Perl) is Java, which calls its embedded
        documentation javadoc. For all others, there is a need to
        write separate documentation.
       1.5.2 Other programming languages
        But it is possible to embed documentation inside any
        programming language, directly in the code. A small Perl
        utility can be used to extract the documentation, provided it
        was written in TF format. Documentation is put at the
        beginning of the file and updated there. The program ripdoc.pl
        extracts documentation that follows TF guidelines. The
        idea is that you can generate HTML documents from the embedded
        'TF pod'. The conversion goes like this:
    
      ripdoc.pl code.sh | t2html.pl > code.html
      ripdoc.pl code.el | t2html.pl > code.html
      ripdoc.pl code.cc | t2html.pl > code.html

        This is suitable for awk, shell, sh, ksh, C++, Java, Lisp,
        Python, Tcl and other programming languages. The only
        criterion is that the language supports a single comment
        starter and that the documentation has been written using it.
        Languages that have comment-start and comment-end delimiters,
        like C with /* and */, are not suitable for ripdoc.pl.
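
        To show why a single comment starter is enough, here is a
        minimal sketch of the extraction idea. It is not ripdoc.pl
        itself and may differ from what that program does: it assumes
        the documentation is the first run of consecutive lines that
        begin with the comment starter, and the function name
        rip_leading_comments is hypothetical.

```shell
# Print the first block of comment lines in FILE with the comment
# starter removed.
# Usage: rip_leading_comments STARTER FILE   (e.g. '#' or ';;')
# Assumption: the documentation is the first run of consecutive
# lines beginning with STARTER; a "#!" interpreter line is skipped.
rip_leading_comments() {
    starter=$1
    file=$2
    awk -v s="$starter" '
        NR == 1 && /^#!/ { next }
        index($0, s) == 1 {
            # Drop the starter and one following space, keeping the
            # remaining indentation so the TF column layout survives.
            line = substr($0, length(s) + 1)
            sub(/^ /, "", line)
            print line
            seen = 1
            next
        }
        seen { exit }
    ' "$file"
}
```

        Only one space after the starter is removed, so text that was
        written at column 8 behind the comment prefix is still at
        column 8 in the extracted output.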
	
	
2.0 Other converters
	
  
  
      2.1 Postscript
  
  
  
      2.2 Texinfo
  
        - See page http://www.fido.de/kama/texinfo/texinfo-en.html
          where you can find the C program html2texinfo.

        - Perl program html2texi.pl
          http://www.cs.washington.edu/homes/mernst/software/#html2texi
          html2texi converts HTML documentation trees into Texinfo
          format. Texinfo format can be easily converted to Info format
          (for browsing in Emacs or the stand alone Info browser), to a
          printed manual, or to HTML. Thus, html2texi.pl permits
          conversion of HTML files to Info format, and secondarily
          enables producing printed versions of Web page
          hierarchies. Unlike HTML, Info format is searchable. Since Info
          is integrated into Emacs, one can read documentation without
          starting a separate Web browser. Additionally, Info browsers
          (including Emacs) contain convenient features missing from Web
          browsers, such as easy index lookup and mouse-free browsing.

      2.3 Other text to HTML tools
  
  
  
      2.4 Other Utilities
  
        -  DocBook - SGML online book

        -  Texi2html
             Perl script.

        -  HTML tidy
             Removes extra markup.

        -  FTL
             LaTeX-like Perl formatting.

        -  Hyperlatex
             "Hyperlatex is a package that allows you to prepare documents
             in HTML, and, at the same time, to produce a neatly printed
             document from your input. Unlike some other systems that you
             may have seen, Hyperlatex is not a general LaTeX-to-HTML
             converter. In my eyes, conversion is not a solution to HTML
             authoring. A well written HTML document must differ from a
             printed copy in a number of rather subtle ways. I doubt that
             these differences can be recognized mechanically, and I
             believe that converted LaTeX can never be as readable as a
             document written in HTML.  The basic idea of Hyperlatex is to
             make it possible to write a document that will look like a
             flawless LaTeX document when printed and like a handwritten
             HTML document when viewed with an HTML browser."

        -  html2texi
             "html2texi converts HTML documentation trees into Texinfo format.
             Texinfo format can be easily converted to Info format (for browsing
             in Emacs or the stand alone Info browser), to a printed manual, or
             to HTML. Thus, html2texi.pl permits conversion of HTML files to
             Info format, and secondarily enables producing printed versions of
             Web page hierarchies. Unlike HTML, Info format is searchable. Since
             Info is integrated into Emacs, one can read documentation without
             starting a separate Web browser. Additionally, Info browsers
             (including Emacs) contain convenient features missing from Web
             browsers, such as easy index lookup and mouse-free browsing."

        -  RTF in PC

        -  catdoc
             Viewing MS Word files.
             Catdoc is simple: one C source file that compiles on any
             system (DOS, Unix). Feed an MS Word file to it and it
             gives 7-bit text out of it.

        -  word2x
             Viewing MS Word files.

        -  MSWordView
             "MSWordView is a program that can understand the microsofts word
             8 binary file format (office97), it currently converts word into
             html, which can then be read with a browser."

        -  Laola
             Viewing MS Word files.
             "Laola(perl) does a respectable job of taking MSWord files to text
             ...LAOLA is giving access to the raw document streams of any program
             using "structured storage" technology to save its documents.
             ELSER is dealing especially with these streams as they are present
             in Word 6 and Word 7 documents."

      2.5 General Document Maintenance tools