The Informational Language, abbreviated InfoLang or IL, is an application of SGML, the Standard Generalized Markup Language. I know XML is hip and the thing to do now, but I need more than what XML offers. The basis is structured textual markup, such as may be found in books and technical documents. This includes figures and tables, which are included in no standard fashion.
InfoLang attempts to be like HTML 1.0, which was a language used for marking up text for speedy sharing via the Internet. The emphasis is placed on structure, text, accessibility, and internationalization (I18N, because it really is a long word).
     1: <!DOCTYPE INFO PUBLIC "-//W3M//DTD INFO//1.0"
     2:     "http://micron999.com/web/info/info10.dtd">
     3: <!-- ~/info/ex.info -->
     4: <info lang="en-US">
     5:   <meta cdate="1999-08-07T21:33:14-05:00" mdate="1999-08-07T21:34:00"
     6:       name="Mike Burns" email="netgeek@speakeasy.net" ver="2.3">
     7:     <title>Monkies and Oranges</title>
     8:     <link lang="en" num=17 uri="info:lang:human:en:all:orangemonkey"
     9:         type="text/sgml" name="Source Document">
    10:     <link lang="en" num=18 uri="learn018.tbl" type="text/xml"
    11:         name="Table 14">
    12:   </meta>
    13:   <content genre="tech-phy-study">
    14:     <sec name="Abstract">
    15:       <para>
    16:         <sum>The words
    17:         <term format="word" ref="monkey" >monkey</term>
    18:         and <term format="word" id="orange">orange</term>
    19:         have been exchanged. The effects on humans is
    20:         <em>amazing</em>.</sum>
    21:       </para>
    22:     </sec>
    23:     <sec name="The Experiment">
    24:       <para>
    25:         We switched the words
    26:         <term format="word" ref="monkey">monkey</term>
    27:         and <term format="word" ref="orange">orange</term>
    28:         on three of our test subjects, <var>Foo</var>,
    29:         <var>Bar</var>, and <var>Baz</var>.
    30:       </para>
    31:       <para>
    32:         On test subjects <var>Fred</var>,
    33:         <var>Barney</var>, and
    34:         <var>Quaam</var> the word
    35:         <term format="word" ref="monkey">monkey</term>
    36:         was removed. <im>This caused major
    37:         problems in speaking of primates.</im>.
    38:       </para>
    39:       <para>
    40:         According to <var>Barney</var>,
    41:         <quote from="subject5">Yeah,
    42:         I had difficulties when asked what I saw at
    43:         the zoo.</quote>.
    44:       </para>
    45:       <para>
    46:         <list format="unorder">
    47:           <lh>Difficulties experienced
    48:           by all test subjects were</lh>
    49:           <li>Discussing colours.</li>
    50:           <li>Discussing the jungle.</li>
    51:         </list>
    52:         This may be seen in table 14.
    53:         <inlink refnum=18
    54:             caption="Table of monkies and oranges.">
    55:       </para>
    56:     </sec>
    57:   </content>
    58: </info>
Lines one and two are the DOCTYPE declaration. Line three is a comment. A comment begins with <!--, ends with -->, and may contain any text in between. This comment says that this file is ~/info/ex.info. Comments are skipped by the InfoLang parser.
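The comment rule can be sketched in a few lines of Python. This is only an illustration, not the InfoLang parser: the name strip_comments is hypothetical, and the pattern assumes the simple form described above (<!--, any text, -->) rather than the full SGML comment syntax.

```python
import re

# Assumption: comments always have the simple form "<!--" ... "-->".
# Full SGML allows more elaborate comment declarations than this pattern covers.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def strip_comments(source: str) -> str:
    """Return the source text with every <!-- ... --> comment removed."""
    return COMMENT.sub("", source)

print(strip_comments('<!-- ~/info/ex.info --><info lang="en-US">'))
```

The non-greedy match stops at the first -->, so two comments on one line are removed separately instead of swallowing the markup between them.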
Line four is the start of the InfoLang document. This line may also state the human language of the document. This one is en-US, or American English. This is so documents written in, say, French, may be translated to Spanish. No InfoLang elements may occur outside of the info element. This element must be closed.
Information about the document, or meta information, is contained in the meta element. Attributes for the meta element are lang, cdate, mdate, name, email, and ver. The meta element must be both opened and closed.
As explained before, the lang attribute takes as its value a language code from RFC 1766 for internationalization purposes. The cdate attribute is the creation date of the document and mdate is the modification date, both specified in ISO format. ISO format is YYYY-MM-DDTHH:mm:ssZ, where Y is the year, M is the month, D is the day, H is the hour, m is the minute, s is the second, and Z is the time difference from Greenwich Mean Time.
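As a rough illustration of the date format, the cdate value from the example can be read with Python's standard library. The helper name parse_info_date is hypothetical, and the sketch assumes Python 3.7 or later, where datetime.fromisoformat is available.

```python
from datetime import datetime

def parse_info_date(value: str) -> datetime:
    """Parse a cdate/mdate value such as 1999-08-07T21:33:14-05:00."""
    return datetime.fromisoformat(value)

cdate = parse_info_date("1999-08-07T21:33:14-05:00")
print(cdate.year, cdate.month, cdate.day)  # the Y, M, and D fields
print(cdate.utcoffset())                   # the Z field, as a timedelta

# The example's mdate omits the time difference; this sketch accepts it
# and returns a datetime without any offset information.
mdate = parse_info_date("1999-08-07T21:34:00")
print(mdate.utcoffset() is None)
```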
The name attribute is the author's name, email is the author's email address, and ver is the version of the document. Version x.0 is stable, where x is a variable greater than zero.
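The stability rule could be checked as follows. This is a sketch under an assumption the text does not state outright, namely that ver always has the form major.minor; the function name is_stable is hypothetical.

```python
def is_stable(ver: str) -> bool:
    """True when ver has the form x.0 with x an integer greater than zero."""
    major, sep, minor = ver.partition(".")
    return sep == "." and major.isdecimal() and int(major) > 0 and minor == "0"

print(is_stable("2.3"))  # the example document's version
print(is_stable("1.0"))
```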
Suggestion for rendering: display the author's name, email address, and at least the modification date at the top or bottom right of the page, in an italic font face.
Within the meta element must be one title element and zero or more link elements. Nothing else may appear in the meta element's scope.
The title element contains the title of the document. This has many uses: when printed, it may be the title page; when online, it may be centered at the top of the screen; and it may have other uses on different media. This element may take the aforementioned lang attribute.
Linking to other documents, such as references, figures, or further explanations, may be achieved via the link element. This element may take the lang attribute, as described earlier, for information about the linked document. It may also take a name attribute, which is the title of the referenced document (such as Figure Thirteen or All About Doors). Also optional, unless the link is to be explicitly included in the document, is the num attribute, which takes an unsigned integer as its value. Its use is seen later with the inlink element.
Required attributes for the link element are uri and type. The value for uri is the Uniform Resource Identifier for the document, such as news:alt.tv.aeon-flux or http://www.ebooboo.org. The type attribute takes a MIME type as its value, such as text/html, image/png, or text/plain. Example uses of the link element are on lines 8 through 11.
Suggestions for rendering: display the title in a large font face at the top center of the page display and in the title bar. Have a drop-down or menu listing available for link items, displaying the value of their name attribute and linking to the value of their uri attribute.
Line 13 starts the content of the page, as implied by the content element. All data to be directly experienced by the users must be within the content element.
Optional attributes for the content element are lang and genre. The lang attribute is described earlier. The genre attribute is the genre type of the document's content. This will be defined later.
All content is divided into sections. Sections may be nested. Attributes for the sectioning element, sec, are lang, name, and genre. The lang and genre attributes are used as described earlier. The name attribute is the title of the section, such as "Abstract" or "Air Conditioner Mechanics".
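Nested sections lend themselves to an outline. The sketch below uses Python's html.parser as a stand-in for an SGML parser (an assumption made for convenience; it is not an InfoLang parser), and the class name SectionOutline and the section name "Compressors" are made up for the illustration.

```python
from html.parser import HTMLParser

class SectionOutline(HTMLParser):
    """Collect (depth, name) pairs for every sec element, in document order."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # how many sec elements are currently open
        self.outline = []    # (depth, name) for each sec start tag seen

    def handle_starttag(self, tag, attrs):
        if tag == "sec":
            self.outline.append((self.depth, dict(attrs).get("name")))
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "sec":
            self.depth -= 1

doc = """
<content>
<sec name="Air Conditioner Mechanics">
  <sec name="Abstract"><para>...</para></sec>
  <sec name="Compressors"><para>...</para></sec>
</sec>
</content>
"""

parser = SectionOutline()
parser.feed(doc)
for depth, name in parser.outline:
    print("  " * depth + name)
```

Tracking depth on the start and end tags is enough here because sec elements must be properly nested.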
Suggestions for rendering: start sections with the value of the name attribute in a large font face, and possibly a line break.
Paragraphs are the next logical breakdown of content. The para element may take the lang and genre attributes, as defined previously. This is shown in lines 24, 31, 39, and 45.
Suggestion for rendering: separating paragraphs with line breaks has been known to be standard and easy to read online, while separating paragraphs with a tab is standard on paper.
Lists are created with the list, list head item, and list item elements. The list element surrounds the other list elements; neither list item nor list head item elements may appear outside of the list element. The list element must contain zero or more list head item elements and one or more list item elements. Attributes for the list element, list, are lang and format. The lang attribute works as described earlier. The format attribute takes either order or unorder, with the default being unorder. The value unorder creates a list with no order, such as a bulleted list, while order creates a list with an order, such as a numbered list. List elements may be nested.
The list head item element, lh, and the list item element, li, may take the predefined lang attribute.
List head items are used for main category descriptors, while list items are used for the individual contents being listed.
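Because the list head item and the list items are marked separately, a renderer is free to present a list as running text instead of bullets. A minimal Python sketch of that idea follows; the function name list_as_sentence and the joining rules (drop trailing periods, lower-case the first letter, join with commas and "and") are assumptions for illustration only.

```python
def list_as_sentence(head: str, items: list[str]) -> str:
    """Join a list head and its list items into a single sentence."""
    parts = []
    for item in items:
        text = item.strip().rstrip(".")            # drop the item's own period
        parts.append(text[:1].lower() + text[1:])  # lower-case the first letter
    if len(parts) > 1:
        body = ", ".join(parts[:-1]) + " and " + parts[-1]
    else:
        body = "".join(parts)
    return f"{head.strip()} {body}."

print(list_as_sentence(
    "Difficulties experienced by all test subjects were",
    ["Discussing colours.", "Discussing the jungle."],
))
```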
Suggestions for rendering: display the list head item in a bold face and each list item preceded by a bullet. Another way to render lists would be in full text. For example, the list in the example InfoLang document above may be rendered as "Difficulties experienced by all test subjects were discussing colours and discussing the jungle."
Links may be explicitly embedded in a document via the inlink element. Attributes are refnum and caption. The value for refnum is equal to the num of the link to be included. The value for caption is the text of the caption for the inline link. For an example, see lines 53 and 54.
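The relationship between refnum and num amounts to a lookup. The following Python sketch models the two link elements from the example document; the Link class and the resolve_inlink function are hypothetical names, not part of InfoLang.

```python
from dataclasses import dataclass

@dataclass
class Link:
    num: int
    uri: str
    type: str
    name: str

# The two link elements from the example's meta element (lines 8 through 11).
links = [
    Link(17, "info:lang:human:en:all:orangemonkey", "text/sgml", "Source Document"),
    Link(18, "learn018.tbl", "text/xml", "Table 14"),
]

def resolve_inlink(refnum: int, links: list[Link]) -> Link:
    """Return the link whose num equals refnum, or raise LookupError."""
    for link in links:
        if link.num == refnum:
            return link
    raise LookupError(f"no link with num={refnum}")

target = resolve_inlink(18, links)
print(target.uri, target.type, target.name)
```

Raising an error for an unknown refnum is one possible design; a renderer might instead skip the inline link, since displaying inline content is optional.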
Suggestion for rendering: supporting different file formats is optional, and likewise displaying inline figures, et cetera, is optional.
Text-level elements occur inline with the text. They are used to mark up words or phrases for special rendering. They should not, however, cause any break in the flow, as a paragraph would.
General emphasis may be placed on text with the emphasis element, em. An example appears on line 20 of the example document.
Suggestion for rendering: format emphasized text in an italic font face.
Information deemed important by the author may be marked that way with the important element, im. This may be used for, e.g., highlighting or drawing attention.
Suggestion for rendering: display important text in a bold font face.
A summary of a section or document may be marked using the summary element. This is marked with the sum tag. See lines 16 through 20 of the example.
Suggestion for rendering: rendering engines may wish to display summary information with a light bold face or italic font.
Words and their definitions, or just words or just definitions, may be marked up using the term element. Attributes for the term element are the optional lang attribute, explained earlier, the ref attribute, and the format attribute. The format attribute may take one of word or definition, marking whether the marked text is the word or the definition. The ref attribute is the reference for the term, so it may be called later or used by the definition. See line 17 for an example.
Suggestions for rendering: words may be displayed with quotes surrounding them or in italics. Definitions may be highlighted in some fashion (bold face, etc.).
Variable terms, letters that may be replaced by another value, are marked using the variable element, var. The optional attribute is value, which takes the value of the variable.
Suggestion for rendering: it is common for variables to be rendered in an italic font face.
Text quoted from another source is marked using the quote element. This is useful for text from another document, person, or other source. Optional attributes are lang, as described before, and from. The from attribute takes as its value the name of the source of the quote. For an example use, see lines 41 through 43.
Suggestion for rendering: display the quoted text within the local quotation marks (“ ‘ ’ ” in English and American).
Text may be translated to the local language. Rendering may be controlled by the user via, e.g., style sheets. Information browser developers may wish to experiment with textual layout. For example, text may be displayed in opposite directions, or may be 'read' by the browser. Other media may be experimented with, such as aural or tactile.
Element Name | Optional Attributes | Required Attributes | Element Description |
---|---|---|---|
info | lang | - | Must surround all other elements. Start of InfoLang document. |
meta | lang, cdate, mdate, name, email, ver | - | Must surround title and link. Information about the document. |
title | lang | - | Title of the document. |
link | lang, num, name | uri, type | Links to other resources. Empty element; cannot be closed. |
content | lang, genre | - | Actual content. Containing element for sec, para, inlink, em, im, sum, term, var, quote, and list. |
sec | lang, genre | name | Sectioning. May be nested. |
para | lang, genre | - | Paragraphs. All text must be in the para element. Container for em, im, sum, term, var, quote, list. |
inlink | caption | refnum | Inline links. Empty element. |
em | - | - | General emphasis. |
im | - | - | Important text. |
sum | - | - | Summary information. |
term | lang | format, ref | Words and definitions. |
var | value | - | Variables. |
quote | lang, from | - | Quoted text. |
list | lang | format | List container. Container element for lh and li. |
lh | lang | - | List head. |
li | lang | - | List item. |
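The table maps naturally onto a lookup structure that a checker could consult. The Python sketch below copies the optional and required attribute columns as they appear above; the names ATTRIBUTES and check_attributes are hypothetical, and this is an illustration rather than a replacement for validating against the DTD.

```python
# element name -> (optional attributes, required attributes), copied from the table.
ATTRIBUTES = {
    "info":    ({"lang"}, set()),
    "meta":    ({"lang", "cdate", "mdate", "name", "email", "ver"}, set()),
    "title":   ({"lang"}, set()),
    "link":    ({"lang", "num", "name"}, {"uri", "type"}),
    "content": ({"lang", "genre"}, set()),
    "sec":     ({"lang", "genre"}, {"name"}),
    "para":    ({"lang", "genre"}, set()),
    "inlink":  ({"caption"}, {"refnum"}),
    "em":      (set(), set()),
    "im":      (set(), set()),
    "sum":     (set(), set()),
    "term":    ({"lang"}, {"format", "ref"}),
    "var":     ({"value"}, set()),
    "quote":   ({"lang", "from"}, set()),
    "list":    ({"lang"}, {"format"}),
    "lh":      ({"lang"}, set()),
    "li":      ({"lang"}, set()),
}

def check_attributes(element: str, given: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the attributes fit the table."""
    optional, required = ATTRIBUTES[element]
    problems = [f"missing required attribute: {a}" for a in sorted(required - given)]
    problems += [f"unknown attribute: {a}" for a in sorted(given - required - optional)]
    return problems

print(check_attributes("link", {"lang", "num", "uri", "type", "name"}))
print(check_attributes("term", {"format", "id"}))
```

The second call mirrors line 18 of the example document, which gives the term element an id attribute where the table lists ref; a check like this one would report it.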
Not sure why I cared that it was SGML instead of XML. There's some of Ted Nelson's Xanadu in here, if you squint.
"Comments are skipped by the InfoLang parser."
Clearly I didn't know what a parser was capable of.