The Informational Language

The Informational Language, abbreviated InfoLang or IL, is an application of SGML, the Standard Generalized Markup Language. I know XML is hip and the thing to do now, but I need more than what XML offers. The basis is structured textual markup of the kind found in books and technical documents. This includes figures and tables, although there is no standard way of including them.

Abstract

InfoLang attempts to be like HTML 1.0, which was a language used for marking up text for speedy sharing via the Internet. The emphasis is on structure, text, accessibility, and internationalization (I18N, because it really is a long word).

Example InfoLang Document

1: <!DOCTYPE INFO PUBLIC "-//W3M//DTD INFO//1.0"
2: 	"http://micron999.com/web/info/info10.dtd">
3: <!-- ~/info/ex.info -->
4: <info lang="en-US">
5: 	<meta cdate="1999-08-07T21:33:14-05:00" mdate="1999-08-07T21:34:00"
6: 	 name="Mike Burns" email="netgeek@speakeasy.net" ver="2.3">
7: 		<title>Monkies and Oranges</title>
8: 		<link lang="en" num=17 uri="info:lang:human:en:all:orangemonkey"
9: 		 type="text/sgml" name="Source Document">
10: 		<link lang="en" num=18 uri="learn018.tbl" type="text/xml"
11: 		 name="Table 14">
12: 	</meta>
13: 	<content genre="tech-phy-study">
14: 		<sec name="Abstract">
15: 			<para>
16: 				<sum>The words
17: 				<term format="word" ref="monkey" >monkey</term>
18: 				and <term format="word" ref="orange">orange</term>
19: 				have been exchanged. The effect on humans is
20: 				<em>amazing</em>.</sum>
21: 			</para>
22: 		</sec>
23: 		<sec name="The Experiment">
24: 			<para>
25: 				We switched the words
26: 				<term format="word" ref="monkey">monkey</term>
27: 				and <term format="word" ref="orange">orange</term>
28: 				on three of our test subjects, <var>Foo</var>,
29: 				<var>Bar</var>, and <var>Baz</var>.
30: 			</para>
31: 			<para>
32: 				On test subjects <var>Fred</var>,
33: 				<var>Barney</var>, and
34: 				<var>Quaam</var> the word
35: 				<term format="word" ref="monkey">monkey</term>
36: 				was removed. <im>This caused major
37: 				problems in speaking of primates.</im>.
38: 			</para>
39: 			<para>
40: 				According to <var>Barney</var>,
41: 				<quote from="subject5">Yeah,
42: 				I had difficulties when asked what I saw at
43: 				the zoo.</quote>
44: 			</para>
45: 			<para>
46: 				<list format="unorder">
47: 					<lh>Difficulties experienced
48: 					by all test subjects were</lh>
49: 					<li>Discussing colours.</li>
50: 					<li>Discussing the jungle.</li>
51: 				</list>
52: 				This may be seen in table 14.
53: 				<inlink refnum=18
54: 				 caption="Table of monkies and oranges.">
55: 			</para>
56: 		</sec>
57: 	</content>
58: </info>

Dissection

Lines one and two are the DOCTYPE declaration. Line three is a comment. A comment begins with <!--, ends with -->, and may contain any text in between. This comment says that this file is ~/info/ex.info. Comments are skipped by the InfoLang parser.

InfoLang Start

Line four is the start of the InfoLang document. This line may also carry information on the human language of the document. This one is en-US, or American English. This is so documents written in, say, French, may be translated to Spanish. No InfoLang elements may occur outside of the info element. This element must be closed.
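To make the lang value concrete, here is a minimal sketch of how a processor might split a tag such as en-US into its primary language and subtags before deciding whether to translate. The function name split_lang_tag is made up for illustration and is not part of InfoLang; the only assumption about the tag syntax is that parts are separated by hyphens and compared case-insensitively.

```python
def split_lang_tag(tag):
    """Split a language tag such as 'en-US' into (primary, subtags).

    Hypothetical helper: parts are lower-cased so that 'en-US' and
    'EN-us' compare equal.
    """
    parts = tag.lower().split("-")
    return parts[0], parts[1:]


primary, subtags = split_lang_tag("en-US")
print(primary, subtags)
```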

Metadata

Information about the document, or meta information, is contained in the meta element. Attributes for the meta element are lang, cdate, mdate, name, email, and ver. The meta element must be both opened and closed.

As explained before, the lang attribute takes as its value a language code from RFC 1766 for internationalization purposes. cdate is the creation date of the document and mdate is the modification date, both specified in ISO 8601 format: YYYY-MM-DDTHH:mm:ssZ, where Y is the year, M is the month, D is the day, H is the hour, m is the minute, s is the second, and Z is the time difference from Greenwich Mean Time (for example, -05:00).
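As a rough illustration of the date format, the two values from the example document can be read with Python's standard datetime module. This is only a sketch; parse_infolang_date is a hypothetical name, and treating the time difference as optional is my assumption, made because the example's mdate leaves it out.

```python
from datetime import datetime


def parse_infolang_date(value):
    """Parse a cdate or mdate value written as YYYY-MM-DDTHH:mm:ss[+-HH:mm].

    Hypothetical helper; the offset is treated as optional here.
    """
    return datetime.fromisoformat(value)


cdate = parse_infolang_date("1999-08-07T21:33:14-05:00")
mdate = parse_infolang_date("1999-08-07T21:34:00")

# cdate carries its offset from GMT; mdate has none.
print(cdate.utcoffset(), mdate.tzinfo)
```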

name is the author's name, email is the author's email address, and ver is the version of the document. Version x.0 is stable, where x is a variable greater than zero.
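The stability rule can be written down in a few lines. This sketch assumes ver always has the form major.minor with decimal digits on both sides; is_stable is a made-up name.

```python
def is_stable(ver):
    """Return True when ver looks like x.0 with x greater than zero."""
    major, dot, minor = ver.partition(".")
    if not (dot and major.isdecimal() and minor.isdecimal()):
        return False  # not of the assumed major.minor form
    return int(major) > 0 and int(minor) == 0


print(is_stable("2.3"), is_stable("1.0"))
```

By this rule the example document, at version 2.3, is not a stable version.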

Suggestion for rendering: display the author's name, email address, and at least the modification date at the top or bottom right of the page, in an italic font face.

Title and Links

Within the meta element must be one title element and zero or more link elements. Nothing else may appear in the meta element's scope.

The title element contains the title of the document. This has many uses: in print it may become the title page, online it may be centered at the top of the screen, and other media may use it differently. It may take the aforementioned lang attribute.

Linking to other documents, such as references, figures, or further explanations, may be achieved via the link element. This element may take the lang attribute, as described earlier, for information about the linked document. It may also take a name attribute, which is the title of the referenced document (such as Figure Thirteen or All About Doors). Also optional, unless the link is to be explicitly included in the document, is the num attribute, which takes an unsigned integer as its value. Its use is seen later with the inlink element.

Required attributes for the link element are uri and type. The value for uri is the Uniform Resource Identifier for the document, such as news:alt.tv.aeon-flux or http://www.ebooboo.org. The type attribute takes a MIME type as its value, such as text/html, image/png, or text/plain. Example uses of the link element are on lines 8 through 11.
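A checker for the required attributes might look something like the sketch below. Python has no SGML parser in its standard library, so this borrows html.parser.HTMLParser as a stand-in; it happens to tolerate unquoted values such as num=18 and unclosed link elements, but it is not an InfoLang parser. The class name LinkChecker and the second, deliberately broken link are made up for the demonstration.

```python
from html.parser import HTMLParser

REQUIRED_LINK_ATTRS = ("uri", "type")


class LinkChecker(HTMLParser):
    """Collect link elements and report missing required attributes."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        self.links.append(attr_map)
        for name in REQUIRED_LINK_ATTRS:
            if not attr_map.get(name):
                self.problems.append(f"link missing {name}: {attr_map}")


sample = """
<meta name="Mike Burns">
  <link lang="en" num=18 uri="learn018.tbl" type="text/xml"
   name="Table 14">
  <link name="Broken link" uri="news:alt.tv.aeon-flux">
</meta>
"""

checker = LinkChecker()
checker.feed(sample)
checker.close()
print(len(checker.links), "links;", len(checker.problems), "problem(s)")
```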

Suggestions for rendering: display the title in a large font face, centered at the top of the page, and in the title bar. Have a drop-down or menu listing available for link items, displaying the value of their name attribute and linking to the value of their uri attribute.

Content

Line 13 starts the content of the page, as implied by the content element. All data to be directly experienced by the users must be within the content element.

Optional attributes for the content element are lang and genre. lang is described earlier. genre is the genre type of the document's content. This will be defined later.

Sections

All content is divided into sections. Sections may be nested. Attributes for the sectioning element, sec, are lang, name, and genre. lang and genre are used as described earlier. name is the title of the section, such as "Abstract" or "Air Conditioner Mechanics".

Suggestions for rendering: start sections with the value of the name attribute in a large font face, and possibly a line break.

Paragraphs

Paragraphs are the next logical breakdown of content. The para element may take the lang and genre attributes, as defined previously. This is shown in lines 15, 24, 31, 39, and 45.

Suggestion for rendering: separating paragraphs with line breaks has been known to be standard and easy to read online, while separating paragraphs with a tab is standard on paper.

Lists

Lists are created with the list, list head, and list item elements. The list element surrounds the other two; neither list head nor list item elements may appear outside of a list element. The list element must contain zero or more list head elements and one or more list item elements. Attributes for the list element, list, are lang and format. The lang attribute works as described earlier. The format attribute takes either order or unorder, with the default being unorder. The value unorder creates a list with no order, such as a bulleted list, while order creates a list with an order, such as a numbered list. List elements may be nested.

The list head element, lh, and the list item element, li, may take the predefined lang attribute.

List heads are used for main category descriptors, while list items are used for the individual contents being listed.

Suggestions for rendering: display the list head in a bold face and each list item with a preceding bullet. Another way to render lists would be in full text. For example, the list in the example InfoLang document above may be rendered as "Difficulties experienced by all test subjects were discussing colours and discussing the jungle."
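The full-text rendering can be sketched as a small function. This is one possible approach, not a prescribed algorithm: list_as_sentence is a made-up name, and lower-casing the first letter of every item is a naive assumption that would mangle items starting with a proper noun.

```python
def list_as_sentence(head, items):
    """Join a list head and its items into a single sentence."""
    # Drop each item's trailing period, then lower-case its first letter.
    cleaned = [item.rstrip(".") for item in items]
    cleaned = [text[:1].lower() + text[1:] for text in cleaned]
    if len(cleaned) > 2:
        body = ", ".join(cleaned[:-1]) + ", and " + cleaned[-1]
    else:
        body = " and ".join(cleaned)
    return f"{head} {body}."


print(list_as_sentence(
    "Difficulties experienced by all test subjects were",
    ["Discussing colours.", "Discussing the jungle."],
))
```

Fed the head and items from the example document, this prints the sentence quoted above.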

Inline Links

Links may be explicitly embedded in a document via the inlink element. Attributes are refnum and caption. The value for refnum is the num value of the link to be included. The value for caption is the caption text for the inline link. For an example, see lines 53 and 54.
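Here is a sketch of how a browser might match an inlink to its link by number. As a stand-in for a real SGML parser it uses Python's html.parser.HTMLParser, which copes with the unquoted attribute values; InlinkResolver is a hypothetical name. It assumes every link appears before the inlink that refers to it, which holds when meta precedes content.

```python
from html.parser import HTMLParser


class InlinkResolver(HTMLParser):
    """Pair each inlink's refnum with the link carrying the same num."""

    def __init__(self):
        super().__init__()
        self.links_by_num = {}
        self.resolved = []  # (caption, link attributes or None)

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "link" and "num" in attr_map:
            self.links_by_num[int(attr_map["num"])] = attr_map
        elif tag == "inlink":
            target = self.links_by_num.get(int(attr_map["refnum"]))
            self.resolved.append((attr_map.get("caption"), target))


sample = """
<link lang="en" num=18 uri="learn018.tbl" type="text/xml"
 name="Table 14">
<inlink refnum=18
 caption="Table of monkies and oranges.">
"""

resolver = InlinkResolver()
resolver.feed(sample)
resolver.close()
caption, target = resolver.resolved[0]
print(caption, "->", target["uri"], target["type"])
```

An inlink whose refnum matches no link resolves to None, which a browser could report or quietly skip.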

Suggestion for rendering: supporting different file formats is optional, and likewise displaying inline figures, et cetera, is optional.

Text Level Elements

Text level elements occur inline with the text. They are used to mark up words or phrases for special rendering. They should not, however, cause any break in the flow, as would a paragraph or a sentence.

Emphasis

General emphasis may be placed on text with the emphasis element, em. An example is on line 20 of the example document.

Suggestion for rendering: format emphasized text in an italic font face.

Important Information

Information deemed important by the author may be marked that way with the important element, im. This may be used for e.g. highlighting or drawing attention.

Suggestion for rendering: display important text in a bold font face.

Summary Information

A summarization of a section or document may be marked using the summary element. This is marked with the sum tag. See lines 16 through 20 of the example.

Suggestion for rendering: rendering engines may wish to display summary information with a light bold face or italic font.

Definitions

Words and their definitions, or just words or just definitions, may be marked up using the term element. Attributes for the term element are the optional, and explained earlier, lang attribute, the ref attribute, and the format attribute. The format attribute may take one of word or definition, marking whether the marked text is the word or definition. The ref attribute is the reference for the term, so it may be called later or used by the definition. See line 17 for an example.
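One reading of the ref attribute is that a word and its definition share the same ref, so a browser could collect them into a glossary. The sketch below follows that reading, which is my interpretation rather than something the description spells out. It again leans on html.parser.HTMLParser as a stand-in parser; TermIndex and the sample definition text are invented for the demonstration.

```python
from html.parser import HTMLParser


class TermIndex(HTMLParser):
    """Group the text of term elements by ref and by format."""

    def __init__(self):
        super().__init__()
        self.terms = {}       # ref -> {"word": ..., "definition": ...}
        self._current = None  # (ref, format) while inside a term

    def handle_starttag(self, tag, attrs):
        if tag == "term":
            attr_map = dict(attrs)
            self._current = (attr_map.get("ref"), attr_map.get("format"))

    def handle_data(self, data):
        if self._current is not None:
            ref, fmt = self._current
            entry = self.terms.setdefault(ref, {})
            entry[fmt] = entry.get(fmt, "") + data

    def handle_endtag(self, tag):
        if tag == "term":
            self._current = None


sample = (
    '<para>A <term format="word" ref="monkey">monkey</term> is '
    '<term format="definition" ref="monkey">a small primate</term>.</para>'
)

index = TermIndex()
index.feed(sample)
index.close()
print(index.terms)
```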

Suggestions for rendering: words may be displayed with quotes surrounding them or in italics. Definitions may be highlighted in some fashion (bold face, etc.).

Variables

Variable terms, letters that may be replaced by another value, are marked using the variable element, var. The optional attribute is value, which takes the value of the variable.

Suggestion for rendering: it is common for variables to be rendered in an italic font face.

Quoted Text

Text quoted from another source is marked using the quote element. This is useful for text from another document, person, or other source. Optional attributes are lang, as described before, and from. The from attribute takes as its value the name of the source of the quote. For an example use, see lines 41 through 43.

Suggestion for rendering: display the quoted text within the local quotation marks (" " or ' ' in British and American English).

Other Information

Text may be translated to the local language. Rendering may be controlled by the user via, e.g., style sheets. Information browser developers may wish to experiment with textual layout. For example, text may be displayed in opposite directions, or may be 'read' by the browser. Other media may be experimented with, such as aural or tactile.

Quick Table of Elements

Element Name | Optional Attributes                  | Required Attributes | Element Description
-------------+--------------------------------------+---------------------+--------------------
info         | lang                                 | -                   | Must surround all other elements. Start of InfoLang document.
meta         | lang, cdate, mdate, name, email, ver | -                   | Must surround title and link. Information about the document.
title        | lang                                 | -                   | Title of the document.
link         | lang, num, name                      | uri, type           | Links to other resources. Empty element; cannot be closed.
content      | lang, genre                          | -                   | Actual content. Containing element for sec, para, inlink, em, im, sum, term, var, quote, and list.
sec          | lang, genre                          | name                | Sectioning. May be nested.
para         | lang, genre                          | -                   | Paragraphs. All text must be in the para element. Container for em, im, sum, term, var, quote, list.
inlink       | caption                              | refnum              | Inline links. Empty element.
em           | -                                    | -                   | General emphasis.
im           | -                                    | -                   | Important text.
sum          | -                                    | -                   | Summary information.
term         | lang                                 | format, ref         | Words and definitions.
var          | value                                | -                   | Variables.
quote        | lang, from                           | -                   | Quoted text.
list         | lang                                 | format              | List container. Container element for lh and li.
lh           | lang                                 | -                   | List head.
li           | lang                                 | -                   | List item.


Mike Burns <mike@mike-burns.com>

Retrospective

Not sure why I cared that it was SGML instead of XML. There's some of Ted Nelson's Xanadu in here, if you squint.

"Comments are skipped by the InfoLang parser."

Clearly I didn't know what a parser was capable of.