Difference between revisions of "Manual Generation 3.0"


Revision as of 23:20, 13 June 2020

Manual Generation for Gramps 3.x and newer

Creation of the Gramps manual (docbook/pdf/html) starting from the Gramps 3.0 Wiki Manual.

How to create a manual starting from the wiki?

MediaWiki to OpenDocument

MediaWiki to PDF

XML to XML

  1. Wikipedia uses the Wikimedia DTD, an XML-based format, for sharing its data. DocBook is available in XML form too (as well as SGML).
  2. We can test exporting our wiki data to the Wikimedia DTD.
  3. Write a script (XSLT, Python, Perl, sh?) that converts data from the Wikimedia DTD to DocBook/SGML.
  • Pandoc converts files from one markup format into another.
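
Step 3 could start from something like the following Python sketch. It assumes the wiki pages were exported to an XML file containing <page>, <title> and <text> elements, as MediaWiki's export produces. The function name export_to_docbook and the one-section-per-page mapping are illustrative choices, not an existing Gramps script. The wiki markup inside each page is copied unchanged and would still need converting.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return an element tag without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def export_to_docbook(export_xml):
    """Map each <page> of a MediaWiki XML export to a DocBook <section>."""
    root = ET.fromstring(export_xml)
    article = ET.Element("article")
    for page in root.iter():
        if local(page.tag) != "page":
            continue
        title = text = ""
        for node in page.iter():
            if local(node.tag) == "title":
                title = node.text or ""
            elif local(node.tag) == "text":
                text = node.text or ""
        section = ET.SubElement(article, "section")
        ET.SubElement(section, "title").text = title
        # Raw wiki markup is kept as-is; converting it is a separate step.
        ET.SubElement(section, "literallayout").text = text
    return ET.tostring(article, encoding="unicode")


# Small inline sample standing in for a real export file.
sample = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Preface</title>
    <revision><text>Welcome to ''Gramps''.</text></revision>
  </page>
</mediawiki>"""

docbook = export_to_docbook(sample)
print(docbook)
```

Matching on local tag names keeps the sketch independent of the export schema version declared in the file.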

Text to XML

  • All wiki pages are saved as txt: header.txt, preface.txt, chapter_01.txt, ..., which can be combined into one file later. txt2tags supports Wikipedia.
  • Output will be a full gramps.xml/gramps.sgml file in UTF-8 encoding, to avoid problems with non-ASCII characters. The present Makefiles in Gramps can create html/manual/pdf from these xml files. A possible solution for keeping DocBook: OpenJade + DSSSL. Note that yelp may open xhtml too.
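
Combining the saved text files could look like the sketch below. The part names header.txt, preface.txt and chapter_01.txt come from the list above; the function name combine_parts and the output name manual.txt are made up for illustration, and the demo writes placeholder content into a temporary directory instead of using real wiki pages.

```python
import tempfile
from pathlib import Path


def combine_parts(part_paths, output_path):
    """Concatenate text parts in order and write them as one UTF-8 file."""
    chunks = []
    for path in part_paths:
        # Read explicitly as UTF-8 so non-ASCII characters survive unchanged.
        chunks.append(Path(path).read_text(encoding="utf-8").rstrip("\n"))
    combined = "\n\n".join(chunks) + "\n"
    Path(output_path).write_text(combined, encoding="utf-8")
    return combined


with tempfile.TemporaryDirectory() as tmp:
    names = ["header.txt", "preface.txt", "chapter_01.txt"]
    for name in names:
        # Placeholder content with a non-ASCII character.
        (Path(tmp) / name).write_text(f"Text of {name}, é\n", encoding="utf-8")
    manual = combine_parts([Path(tmp) / n for n in names], Path(tmp) / "manual.txt")
    print(manual)
```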

We should keep an eye on official developments here: [1]

Wikibooks

An alternative is to proceed as Wikibooks do.

PHP

  • wiki2xml was a GPL script for parsing MediaWiki.
  • WikiRenderer is a PHP component that can parse wiki content and transform it to XHTML, to any other markup language, or to wiki content with a different syntax. It appears to work with DokuWiki syntax, which is close to MediaWiki syntax (the headline rule is inverted). => Demo

wt2db

wt2db converts a text file in a special format similar to that used in WikiWikiWebs into DocBook XML/SGML

wiki text to html

A Python program could be used to generate HTML from the text of the Gramps manual wiki pages. See Manual Html Generation (no Python code available).
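
As no code is available, here is a minimal sketch of what such a program might look like. It only handles headings, bold, italic and single-level * lists; the function name wiki_to_html is hypothetical, and templates, links, images and nested lists are deliberately left out.

```python
import html
import re


def wiki_to_html(wiki_text):
    """Convert a tiny subset of MediaWiki markup to HTML."""
    out = []
    in_list = False
    for raw in wiki_text.splitlines():
        # Escape &, < and > first so later inserted tags stay intact.
        line = html.escape(raw.strip(), quote=False)
        # Bold must be handled before italic: ''' contains ''.
        line = re.sub(r"'''(.+?)'''", r"<b>\1</b>", line)
        line = re.sub(r"''(.+?)''", r"<i>\1</i>", line)
        heading = re.fullmatch(r"(={2,4})\s*(.+?)\s*\1", line)
        is_item = line.startswith("*")
        if in_list and not is_item:
            out.append("</ul>")
            in_list = False
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{heading.group(2)}</h{level}>")
        elif is_item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{line.lstrip('*').strip()}</li>")
        elif line:
            out.append(f"<p>{line}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


page = "==A Test==\n'''Gramps''' is ''free''.\n* one\n* two"
print(wiki_to_html(page))
```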

xhtml to ODT

xhtml2odt stylesheets convert namespaced XHTML to ODT.

html to html translation

  • Translate toolkit

We can try to translate the generated HTML using the Translate Toolkit:

html2po <html> > <pot>
msgmerge --no-wrap <po> <pot> > <new_po>
po2html -t <html> -i <new_po> -o <new_html> 

where each <x> is a placeholder for a file of that format; substitute your own file names.

  • GNUnited Nations

GNUnited Nations (GNUN) is a build system for www.gnu.org translations. It generates a PO template (.pot) for an original HTML article, and merges the changes into all translations, which are maintained as PO (.po) files. Finally, it regenerates the translations in HTML format.

The goal of GNUN is to make maintenance of gnu.org translations easier and to avoid the effect of seriously outdated translations when a particular team becomes inactive.

html to docbook

  • Html2Docbook converts project documentation from HTML to DocBook.
  1. Convert all of your HTML to XHTML using Tidy. Enable 'enclose-block-text' in the configfile, else any unenclosed text (where this is allowed under XHTML Transitional but not under XHTML Strict) will vanish.
  2. Use the XSL stylesheet (below) to convert the XHTML into DocBook (There's no way to merge the multiple XHTML files into a single document, so the stylesheet converts each HTML page into a section). Be sure to pass in the filename (minus the extension) as a parameter. This will become the section id.
  3. Combine the multiple DocBook section files into a single file, and re-arrange the sections into the proper order
  4. Correct any validity errors. (At this point, there are likely to be a few, depending on how good the original HTML was.)
  5. Peruse the now valid DocBook document, and look for the following:

    • Broken links
    • xref elements that should be links
    • Missing headers (the heading logic isn't perfect. You'll lose at most 1 header per page, though, and most pages come through with all headers intact.)
    • Overuse of emphasis and emphasis role="bold"

  • HTML::ToDocBook is a CPAN Perl module that converts an XHTML file into DocBook.
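
Step 3 of the Html2Docbook procedure (combining the section files in the proper order) could be scripted roughly as follows. This is only a sketch: combine_sections is a hypothetical helper, the sections are passed as strings in the desired order, and wrapping them in an <article> is one possible choice of root element.

```python
import xml.etree.ElementTree as ET


def combine_sections(section_docs, title):
    """Wrap DocBook <section> documents, in the given order, in one <article>."""
    article = ET.Element("article")
    ET.SubElement(article, "title").text = title
    seen_ids = set()
    for doc in section_docs:
        section = ET.fromstring(doc)
        if section.tag != "section":
            raise ValueError(f"expected <section>, got <{section.tag}>")
        section_id = section.get("id")
        # Section ids come from the file names, so they must stay unique.
        if section_id is not None:
            if section_id in seen_ids:
                raise ValueError(f"duplicate section id: {section_id}")
            seen_ids.add(section_id)
        article.append(section)
    return ET.tostring(article, encoding="unicode")


parts = [
    '<section id="preface"><title>Preface</title></section>',
    '<section id="chapter_01"><title>Getting started</title></section>',
]
combined = combine_sections(parts, "Gramps Manual")
print(combined)
```

Validity errors (step 4) still have to be found with a real DocBook validator; the sketch only checks the root element and duplicate ids.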

Manual Text Guidelines

For the above scripts to work, we must restrict the manual to a small set of templates and syntax, as we cannot support everything. Hence, the Manual section does not have the full capabilities of a normal MediaWiki.

General textual guidelines

  • Only approved templates may be used. These are
    1. {{grampsmanualcopyright}}: the copyright template. This will be stripped out on manual generation.
    2. {{man label|Labels}}: template for GUI elements, example: Labels
    3. {{man button|Buttons}}: template for GUI buttons, example Buttons
    4. {{man tip| 1=title |2=text.}}: template to add a tip in the text
    5. {{man note| title |text}}: template to add a note to the text
    6. {{man warn| title |text}}: template to add a warning to the text
    7. {{man index|prevpage|nextpage}}: template to add the bottom index bar. This will be stripped out on manual generation.
    8. {{man menu|Edit->Preferences}}: template for the menu items sequence, example Edit->Preferences
    9. {{languages}}: template to add language bar. This will be stripped out on manual generation.
  • The following markup code may be used:
    1. ''' bold ''': for bold in GRAMPS, eg. Gramps
    2. '' italic '': for italic or filenames in GRAMPS, eg. filename
    3. <code> code sections</code>: for commands you type in the command line.
    4. <tt>''Replaceable text''</tt>: for GUI elements the user must type in replacing text, eg John Doe.
    5. <tt>Anything you type in</tt>: for GUI elements the user must type in, eg John Doe.
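
A manual page could be checked against the approved template list with a short script such as the sketch below. The template names are taken from the list above; the function names, the regular expression and the lower-casing of template names are assumptions for illustration, and nested templates are not handled.

```python
import re

# Templates allowed in the manual, as listed in the guidelines.
APPROVED = {
    "grampsmanualcopyright", "man label", "man button", "man tip",
    "man note", "man warn", "man index", "man menu", "languages",
}
# Templates that are stripped out on manual generation.
STRIPPED = {"grampsmanualcopyright", "man index", "languages"}

# Matches {{name}} or {{name|arguments}} without nested braces.
TEMPLATE = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\|[^{}]*)?\}\}")


def unapproved_templates(wiki_text):
    """Return the sorted names of templates that are not approved."""
    names = {m.group(1).lower() for m in TEMPLATE.finditer(wiki_text)}
    return sorted(names - APPROVED)


def strip_generation_templates(wiki_text):
    """Remove the templates that are dropped on manual generation."""
    def replace(match):
        return "" if match.group(1).lower() in STRIPPED else match.group(0)
    return TEMPLATE.sub(replace, wiki_text)


page = "{{man label|Name}} {{infobox|x}} {{languages}}"
print(unapproved_templates(page))
print(strip_generation_templates(page))
```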

Tables, lists

Tables are special. Can we support them? Perhaps only tables via a template? As far as I can see, the present manual has only one table, which we can easily change into a list.

Lists. Can we support the list tokens: * and #, also nested?

Indent. Can we support the indent token: :

Cross referencing images/text

One may NOT link to pages, only to subsections in the Manual, so

  • [[manualpage#subsection]]: only this is allowed
  • [http://non_manual_subsection description]: all the other links must be cross site links.
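
The subsection-only rule lends itself to an automatic check. The sketch below lists internal links whose target has no #subsection part. The function name is hypothetical, and skipping targets that start with Image:, File: or media: is an assumption, since linking to images is still under discussion.

```python
import re

# Matches [[target]] or [[target|label]].
INTERNAL_LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")
# Assumed prefixes for image and media links, which are not page links.
MEDIA_PREFIXES = ("image:", "file:", "media:")


def page_level_links(wiki_text):
    """Return internal link targets that do not name a subsection."""
    bad = []
    for match in INTERNAL_LINK.finditer(wiki_text):
        target = match.group(1).strip()
        if target.lower().startswith(MEDIA_PREFIXES):
            continue
        _page, _sep, subsection = target.partition("#")
        if not subsection.strip():
            bad.append(target)
    return bad


text = "See [[manualpage#subsection]], [[Manual_Generation]] and [[Image:x.png]]."
print(page_level_links(text))
```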

One may also link to Manual images in the text. See discussion on Talk:Manual_Generation_3.0


Pdf output

JadeTeX (needs TeX)

openjade -t tex -d DSSSL DCL file_XML
  • DSSSL is the stylesheet
  • DCL is the declaration, something like: /usr/share/sgml/openjade-1.3.2/pubtext/xml.dcl
pdfjadetex TEX_file

xmlto (needs PassiveTeX and TeX)

A tool for converting XML files to various formats

xmlto pdf mydoc.xml

Ancestors.xsl

Birthday.xsl

FOP (Formatting Objects Processor)(needs Java)

see Manual generation

A Test

There is a user request on the bug manager (2132) for a downloadable text-format user manual.

Steps:

  1. Go to the web pages of the manual (wiki) and save full local copies.
  2. Create an empty file and copy and paste (Ctrl+C/Ctrl+V) the HTML <body> content of the pages saved in step 1 into it, leaving out scripts and JavaScript.
  3. Copy all images into one directory and change the href links in the code accordingly.
  4. Make some href links relative.
  5. Add or clean up anchors.
  6. Run Tidy on the result.
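
Parts of steps 2 and 4 could be automated along these lines. The sketch removes <script> blocks and rewrites links that point into the wiki so that they refer to local files. The function name clean_body, the example.org address and the convention that each saved page is named after the last part of its URL plus .html are all assumptions; regular expressions are also a fragile way to edit HTML, so this suits only simple, regular pages.

```python
import re
from urllib.parse import urlparse

SCRIPT = re.compile(r"<script\b.*?</script\s*>", re.IGNORECASE | re.DOTALL)
HREF = re.compile(r'href="([^"]*)"')


def clean_body(body_html, site_root):
    """Drop <script> blocks and make hrefs under site_root relative."""
    without_scripts = SCRIPT.sub("", body_html)

    def relativize(match):
        url = match.group(1)
        if not url.startswith(site_root):
            return match.group(0)  # external link: leave untouched
        parsed = urlparse(url)
        # Assumed local naming: last path component plus ".html".
        name = parsed.path.rstrip("/").rsplit("/", 1)[-1] + ".html"
        if parsed.fragment:
            name += "#" + parsed.fragment
        return f'href="{name}"'

    return HREF.sub(relativize, without_scripts)


body = (
    '<p><a href="https://example.org/wiki/Preface#Intro">Preface</a> '
    '<a href="https://other.example/x">x</a></p>'
    "<script>alert(1)</script>"
)
cleaned = clean_body(body, "https://example.org/wiki/")
print(cleaned)
```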

See also