mintyml

12 releases

0.1.18	May 28, 2024
0.1.17	May 23, 2024
0.1.11	Apr 29, 2024
0.1.1	Mar 9, 2024

Used in mintyml-cli

MIT license

What is MinTyML?

MinTyML (from Minimalist HTML) is an alternative HTML syntax intended for writing documents.

Principles

This markup language is designed with the following principles in mind:

Structure and formatting should be as concise as possible so the writer can focus on the text.
Writing documents without complex formatting or interactivity should not require strong knowledge of HTML.
Any reasonable, valid HTML should be representable in MinTyML.

Using MinTyML

Demo

This web-based demo shows how a MinTyML document will look.

Examples:

Command-line interface

mintyml-cli

The command-line MinTyML converter can be downloaded from GitHub or installed with `cargo install mintyml-cli`.

IDE tooling

Try the VS Code extension.

Libraries

Language	Package	Install	Doc
Rust	mintyml	`cargo add mintyml`	doc
JavaScript/TypeScript	mintyml (compatible with NodeJS or in the browser with WebPack)	`npm install --save mintyml`

Basic structure

A MinTyML document is made up of nodes that correspond to HTML constructs.

Paragraph

Paragraphs separated by empty lines.
Source	Rendered
`Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.`	Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

The simplest node is the paragraph. A paragraph is made up of one or more consecutive lines of plain text. Paragraphs are separated from each other by empty lines and cannot be nested inside other paragraphs.

Selector

Some elements may include a selector, which describes the type and attributes of the element. It's syntactically very similar to a CSS selector.

A selector is made up of the following components (each of which is optional):

tag name: Indicates the type of the element, and must be placed at the beginning of the selector. For example, a to create a link (also called an anchor). A wildcard (*) is the same as not including a tag name at all, which signals that the element type should be inferred from context.
class: A class name, preceded by a dot (.). Classes are used by CSS to apply styles to certain elements. For example, .external, which one might use to indicate a link points to another domain. A selector can have multiple classes like so: .class1.class2
id: An identifier for this specific element, preceded by a hash (#). For example, #example-link. No two elements in the same document should share an ID. Can be used by CSS to style a specific element.
attribute: One or more HTML attributes that provide more information about the element, surrounded by brackets ([ ]). For example, [href=http://example.com/] sets the destination of a link. Multiple attributes are separated by a space. If an attribute value contains a space or a closing bracket, it can be surrounded by single quotes (' ') or double quotes (" "). For example, [href=http://example.com/ title='Example site']

Putting all of the components together to form a selector might look like this:

a#example-link.external[href=http://example.com/ title='Example site']

The above selector describes a link that:

Has the unique identifier example-link.
Is given the external class.
Points to http://example.com/.
Has the title Example site.

Most elements' selectors will not be nearly as complex as this. In fact, many elements will not need a selector at all.
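The anatomy of a single selector can be illustrated with a small, self-contained Rust sketch. This is not the mintyml crate's parser: the `Selector` struct and `parse_selector` function are hypothetical names introduced here, and the sketch ignores `>` chaining and escape sequences and assumes the attribute list, if any, comes last.

```rust
/// Parsed pieces of a single selector (hypothetical type, for illustration only).
#[derive(Debug, Default, PartialEq)]
pub struct Selector {
    pub tag: Option<String>,
    pub id: Option<String>,
    pub classes: Vec<String>,
    pub attrs: Vec<(String, String)>,
}

/// Splits one selector into tag name, id, classes, and attributes.
/// Assumes no `>` chaining and that `[...]` appears last.
pub fn parse_selector(src: &str) -> Selector {
    let mut sel = Selector::default();

    // Separate the bracketed attribute list from the rest.
    let (head, attr_src) = match src.find('[') {
        Some(i) => {
            let inner = &src[i + 1..];
            (&src[..i], inner.strip_suffix(']').unwrap_or(inner))
        }
        None => (src, ""),
    };

    // Walk the head; '.' starts a class and '#' starts an id.
    // A trailing '.' is chained on as a sentinel to flush the last piece.
    let mut kind = 't'; // 't' = tag name, '.' = class, '#' = id
    let mut buf = String::new();
    for c in head.chars().chain(std::iter::once('.')) {
        if c == '.' || c == '#' {
            let text = std::mem::take(&mut buf);
            if !text.is_empty() {
                match kind {
                    '.' => sel.classes.push(text),
                    '#' => sel.id = Some(text),
                    // A wildcard is the same as giving no tag name.
                    _ if text == "*" => {}
                    _ => sel.tag = Some(text),
                }
            }
            kind = c;
        } else {
            buf.push(c);
        }
    }

    // Split attributes on spaces that are outside quotes.
    let mut quote: Option<char> = None;
    let mut item = String::new();
    let mut items = Vec::new();
    for c in attr_src.chars() {
        match quote {
            Some(q) if c == q => quote = None,
            Some(_) => item.push(c),
            None if c == '\'' || c == '"' => quote = Some(c),
            None if c == ' ' => {
                if !item.is_empty() {
                    items.push(std::mem::take(&mut item));
                }
            }
            None => item.push(c),
        }
    }
    if !item.is_empty() {
        items.push(item);
    }
    for it in items {
        let (k, v) = it.split_once('=').unwrap_or((it.as_str(), ""));
        sel.attrs.push((k.to_string(), v.to_string()));
    }
    sel
}
```

Applied to the example selector above, this sketch yields the tag `a`, the id `example-link`, the class `external`, and the two attributes `href` and `title`.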

Multiple selectors can be chained, separated by > to nest elements together.

For example, header>h2> describes a header element with a level 2 heading inside. This can also be used to compose elements with inferred types:

>img[src=/icon.png]>, when used within a section context, describes a paragraph containing a single image.
section>*> or simply section>> describes a section with a single paragraph element.

Element

Elements are components of a document that have a distinct purpose and translate directly to an HTML element.

If an element directly follows a paragraph, the paragraph ends and does not include the element.
Source	Rendered
`This is a paragraph. footer> This footer is outside the paragraph.`	This is a paragraph. This footer is outside the paragraph.

Some common element types (or tag names) that might be used in a text-centric document are:

p (paragraph)

A block of related text. Block elements that contain section content will automatically encapsulate chunks of text in this element.

h1, h2, h3, h4, h5, and h6 (heading levels 1-6)

Headings implicitly organize the document into sections and subsections based on the heading level.

ul (unordered list) and ol (ordered list)

Bulleted and numbered lists, respectively. Any paragraph or element contained within that doesn't specify a type will be inferred as li (list item). See list inference.

table

Displays data in a grid. The most basic table is made up of tr (table row) elements. Each row contains one or more cells of type th (table header) or td (table data).

Like list items, rows and data cells can be inferred. See table inference.

article

A complete work of written text. In some cases, there may be more than one article per document.

section

A logical section of a page or document. The first heading found in the section is that section's heading. Some examples of uses for sections:

To denote chapters in a book.
To separate the introduction, body, and bibliography of a paper.
To mark categories of items in a menu.

div (content division)

A container used to group other elements. This element behaves the same as a section, but doesn't imply any logical grouping of its contents. It is often used to style or visually group one or more elements.

A block declared in a section context that contains section content automatically creates a div if no tag name is given:

section {
  {
    This is the content of a div.
  }

  .foo {
    This is the content of a div with the class 'foo'.
  }
}

MDN is an excellent resource for learning more about HTML elements.

Line Element

A line element consists of an optional selector followed by > and optionally another node (for example a paragraph or element).

A line element containing text can span multiple lines
Source	Rendered
`footer> This text is the content of the footer. Here's another sentence in the footer. This sentence is not part of the footer.`	This text is the content of the footer. Here's another sentence in the footer.

Line elements are generally used for nodes that contain no children, or that are meant to wrap around a single child or line of text.

Line elements contain the node following > if one is present.


Source	Rendered
p> Explicit <`p`> tag.	Explicit `p` tag.
> Implicit <`p`> tag when placed in a section context.	Implicit `p` tag when placed in a section context.
`hr>`
div> A div containing plain text (not wrapped in <`p`> elements).	A div containing plain text (not wrapped in `p` elements).

Inference

When no element selector is provided, a line element will become a p when used in a section context, or a span in a paragraph context. For other contexts, see element inference.

Block Element

A block element consists of an optional selector followed by a pair of curly braces ({ }) containing zero or more nodes.

Block elements can be nested to form the structure of the document. Borders have been added for visualization purposes.
Source	Rendered
`article { Article section { Section 1 } section { Section 2 } }`	Article Section 1 Section 2

Blocks are generally used to define containers, or nodes that are meant to contain other nodes. For example, the article, section, table, details, ul, and ol elements will usually be declared with braces.

Inference

When no element type is specified, a block element is inferred to be a div in a section context, or span in a paragraph context. For other contexts, see element inference.

Line-Block Element

A line-block element consists of an optional selector followed by > { ... }, where the curly braces contain zero or more nodes. A line-block behaves like a multi-node line element.

The crucial difference between block and line-block elements is that a line-block doesn't wrap bare text up in paragraphs or any other element.

Inference

Line-block elements that don't specify a type will be inferred exactly the same as a line element:

Block vs. line-block inference comparison
Example	MinTyML	HTML
Block	`div { { ... } }`	`<div> <div> ... </div> </div>`
Line-Block	`div { > { ... } }`	`<div> <p> ... </p> </div>`

Inline Element

An inline element is an element placed inside a paragraph node. It consists of an optional node wrapped in angle brackets and parentheses: <( ... )>.

An inline element can contain any kind of node, including elements and paragraphs.

Inner Node	Example
Inner Node	MinTyML	Rendered
Paragraph	`<( Hello, <_world_>! )>`	Hello, world!
Line	`<(button[type=button]> OK)>`	OK
Block	`ol { > item 1 <(ol { > item 2 > item 3 })> > item 4 <(ol { > item 5 > item 6 <(ol { > item 7 })> })> }`	item 1 item 2 item 3 item 4 item 5 item 6 item 7
Line-block	`ul> <(eggs)> <(milk)> <(> { del> red ins> green chilis })>`	eggs milk ~~red~~ green chilis

Inline Formatting

Inline formatting nodes are a set of shorthands for certain inline elements. These should cover the most common scenarios for applying formatting to a slice of a paragraph.

Inline formatting nodes and equivalents
Formatting	Element	Appearance
`<#strong#>`	`<(strong> strong)>`	strong
`</emphasis/>`	`<(em> emphasis)>`	emphasis
`<_underline_>`	`<(u> underline)>`	underline
`<~strikethrough~>`	`<(s> strikethrough)>`	~~strikethrough~~
`<"quote">`	`<(q> quote)>`	quote
<`code`>	`<(code> <[[code]]>)>`	`code`

Note that <`code`> is different from the others: instead of parsing its contents as MinTyML, it reads the string as-is. This means inline code can't contain the terminator `>, as it would be ambiguous. To work around this, use an equivalent like <(code> \<`code`\>)> or <(code> <[[<`code`>]]>)> (see escape sequence and verbatim segment).

Raw Text

Sometimes you want to include text without it being interpreted in the form of MinTyML nodes. For example, if your text includes characters that normally need to be escaped.

It's recommended that script and style elements contain raw text to avoid conflict between MinTyML syntax and JavaScript or CSS syntax.

Plaintext Block

A plaintext block consists of zero or more lines of text surrounded by either ''' or """, where the opening and closing quotes are on their own lines. Any text on the lines between the opening and closing quotes will be interpreted as plain text and will not create any nodes. Plaintext blocks delimited by double quotes (""") will have escape sequences substituted, but those delimited by single quotes (''') will not.

Comparison of single-quoted and double-quoted plaintext blocks
Delimiter	MinTyML	Rendered
Single quotes	`''' Hello, \u{1F30E} </world/>! '''`	Hello, \u{1F30E} </world/>!
Double quotes	`""" Hello, \u{1F30E} </world/>! """`	Hello, 🌎 </world/>!

Any indentation prior to the closing quotes' indentation level will be discarded.
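The indentation rule can be sketched in Rust as follows. This is an illustration of the rule as stated, not the crate's implementation: `dedent_plaintext` is a hypothetical helper, and it assumes indentation consists only of spaces and tabs and that the input includes the opening and closing delimiter lines.

```rust
/// Counts leading spaces and tabs (both are one byte each in UTF-8).
fn indent_of(line: &str) -> usize {
    line.len() - line.trim_start_matches(|c: char| c == ' ' || c == '\t').len()
}

/// Returns the content lines of a plaintext block with the closing
/// delimiter's indentation removed from each line.
/// `block` is expected to start with the opening quotes line and end
/// with the closing quotes line.
pub fn dedent_plaintext(block: &str) -> Vec<String> {
    let lines: Vec<&str> = block.lines().collect();
    if lines.len() < 2 {
        return Vec::new();
    }
    let indent = indent_of(lines[lines.len() - 1]);
    lines[1..lines.len() - 1]
        .iter()
        .map(|line| {
            // Drop at most `indent` leading whitespace characters;
            // anything deeper than the closing quotes is kept.
            let cut = indent_of(line).min(indent);
            line[cut..].to_string()
        })
        .collect()
}
```

For a block whose closing quotes are indented by two spaces, a content line indented by four spaces keeps two of them.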

Verbatim Segment

Inspired by XML's CDATA syntax, a verbatim segment is a string of text surrounded by <[[ and ]]>. The contents of a verbatim segment are interpreted as text and included in the document as-is. This means no nodes can be nested inside the segment and escape sequences will not be substituted.

Verbatim segments using the alternate delimiters.
Source	Rendered
`<[#[A verbatim segment may look like this: <[[ ... ]]>]#]> <[##[or like this: <[#[ ... ]#]>]##]> or even like this: \<\[##\[ ... \]##\]\>`	A verbatim segment may look like this: <[[ ... ]]> or like this: <[#[ ... ]#]> or even like this: <[##[ ... ]##]>

The alternate delimiters <[#[ and ]#]> or <[##[ and ]##]> may also be used in case the text contains ]]> or ]#]>.
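When generating MinTyML from arbitrary text, one way to choose among these delimiters is to pick the first pair whose closing sequence does not occur in the text. The Rust sketch below shows that idea; `wrap_verbatim` is a hypothetical helper, not part of the mintyml crate, and it only knows the three delimiter pairs listed above.

```rust
/// Wraps `text` in the first verbatim delimiter pair whose closing
/// sequence does not appear in the text. Returns `None` if all three
/// closing sequences occur in the text.
pub fn wrap_verbatim(text: &str) -> Option<String> {
    const DELIMS: [(&str, &str); 3] = [
        ("<[[", "]]>"),
        ("<[#[", "]#]>"),
        ("<[##[", "]##]>"),
    ];
    DELIMS
        .iter()
        .find(|(_, close)| !text.contains(*close))
        .map(|(open, close)| format!("{}{}{}", open, text, close))
}
```

Text containing `]]>` is wrapped in `<[#[` and `]#]>`, and text containing both `]]>` and `]#]>` falls through to `<[##[` and `]##]>`.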

MinTyML	HTML	Rendered
`<[[Hello, world!]]>`	`Hello, world!`	Hello, world!
`pre { <[[Hello, world!]]> }`	`<pre> Hello,&Newline;world! </pre>`	Hello, world!
`<[[Hello,\nworld!]]>`	`Hello,\nworld!`	Hello,\nworld!
`<[[Hello, </world/>!]]>`	`Hello, </world/>!`	Hello, </world/>!

Template Interpolation Segment

Segments of text that resemble some common interpolation tags for template languages will remain unchanged so the MinTyML source can be compiled to an HTML template.

The following delimiters mark interpolations that will be unchanged:

Open	Close	Usage examples
`{{`	`}}`	Angular, Handlebars, Liquid
`{%`	`%}`	Liquid
`<%`	`%>`	Embedded Ruby
`<?`	`?>`	PHP
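As a rough illustration of how such segments could be recognized, the Rust sketch below checks whether a string starts with one of the opening delimiters from the table and, if so, finds the end of the segment. The function `interpolation_len` is a hypothetical name, and the sketch makes no claim about how the mintyml parser actually detects these segments (for example, it does not handle nesting or quoted delimiters).

```rust
/// Delimiter pairs taken from the table above.
const PAIRS: [(&str, &str); 4] = [("{{", "}}"), ("{%", "%}"), ("<%", "%>"), ("<?", "?>")];

/// If `s` starts with an interpolation segment, returns its length in
/// bytes (opening delimiter through closing delimiter, inclusive).
pub fn interpolation_len(s: &str) -> Option<usize> {
    for (open, close) in PAIRS.iter() {
        if let Some(rest) = s.strip_prefix(*open) {
            if let Some(i) = rest.find(*close) {
                return Some(open.len() + i + close.len());
            }
        }
    }
    None
}
```

A caller could copy the first `n` bytes to the output untouched and resume normal parsing after them.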

Code Block

A code block closely resembles a plaintext block, but it begins and ends with three backticks (```) rather than quotation marks. It differs from the single-quoted plaintext block in that the contents are wrapped in a code element within a pre element. This means it will usually be rendered in a monospace font, and whitespace will be preserved. A code block is equivalent to a single-quoted plaintext block following pre>code>.

Code block defined with code block syntax vs. a plaintext block
Code Block Source	Plaintext Block Source	Rendered
``` function add(a, b) { return a + b; } ```	`pre>code>''' function add(a, b) { return a + b; } '''`	`function add(a, b) { return a + b; }`

Comment

A comment contains text that is visible in the source of the document, but excluded from the presentation. Comments are enclosed with <! and !> like so:

<! This is a comment !>

The above example would be represented in HTML with:

<!-- This is a comment -->

Comments can be used anywhere a node is valid, including within a paragraph:

Comments within a paragraph
Source	Rendered
`Hello, <! this is a comment !> world!`	Hello, world!
`Hello,<! this is a comment !>world!`	Hello,world!

Escape Sequence

An escape sequence begins with a backslash (\) and provides an alternate representation of a character.

All valid escapes
Escape	Output
`\n`	Line feed (new line)
`\r`	Carriage return
`\t`	Tab
`\\`	Backslash (`\`)
`\` (a space following `\`)	Space
`\xhh` Where `hh` is a 2-digit hexadecimal number no greater than 7F	The character with ASCII number `hh`. e.g. `\x7B` becomes {.
`\u{hex}` Where `hex` is a hexadecimal number of 1 to 6 digits.	The character with Unicode number `hex`. e.g. `\u{20AC}` becomes €
`\sym` Where `sym` is any of: `<` `>` `{` `}` `'` `"`	`sym` e.g. `\>` becomes >
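The table above can be read as a small decoding routine. The following Rust sketch implements the listed escapes under that reading; `unescape` is a hypothetical function written for illustration, not the crate's API, and its error messages are invented.

```rust
/// Replaces the escape sequences listed in the table with the
/// characters they stand for. Returns an error message for anything
/// that is not in the table.
pub fn unescape(src: &str) -> Result<String, String> {
    let mut out = String::new();
    let mut chars = src.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('r') => out.push('\r'),
            Some('t') => out.push('\t'),
            // Backslash, space, and the `\sym` group map to themselves.
            Some(sym @ ('\\' | ' ' | '<' | '>' | '{' | '}' | '\'' | '"')) => out.push(sym),
            Some('x') => {
                // Exactly two hex digits, no greater than 7F.
                let hex: String = chars.by_ref().take(2).collect();
                if hex.len() != 2 || !hex.chars().all(|h| h.is_ascii_hexdigit()) {
                    return Err(format!("bad \\x escape: {}", hex));
                }
                let n = u8::from_str_radix(&hex, 16).map_err(|e| e.to_string())?;
                if n > 0x7F {
                    return Err(format!("\\x escape above 7F: {}", hex));
                }
                out.push(n as char);
            }
            Some('u') => {
                if chars.next() != Some('{') {
                    return Err("expected '{' after \\u".to_string());
                }
                // Collect 1 to 6 hex digits up to the closing brace.
                let mut hex = String::new();
                loop {
                    match chars.next() {
                        Some('}') => break,
                        Some(h) if h.is_ascii_hexdigit() && hex.len() < 6 => hex.push(h),
                        _ => return Err("bad \\u escape".to_string()),
                    }
                }
                if hex.is_empty() {
                    return Err("empty \\u escape".to_string());
                }
                let n = u32::from_str_radix(&hex, 16).map_err(|e| e.to_string())?;
                let ch = char::from_u32(n)
                    .ok_or_else(|| format!("not a Unicode scalar value: {}", hex))?;
                out.push(ch);
            }
            other => return Err(format!("unknown escape: {:?}", other)),
        }
    }
    Ok(out)
}
```

This matches the behaviour described for double-quoted plaintext blocks, where escape sequences are substituted; single-quoted blocks and verbatim segments would skip this step entirely.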

Element Inference

Context

The type of an unspecified element is inferred based on context, which is determined by the type of the containing element.

Standard Contexts

Implicit vs equivalent explicit types
Implicit	Explicit
`section { { > { > Hello, world! } } Goodbye, world! }`	`section { div { p> { span> Hello, world! } } p> Goodbye, world! }`

section

The inside of an element that may contain paragraphs, headings, lists, tables, and structural elements like header, footer, section, or article.

Elements that contain a section context include:

body
main
article
header
footer
section
nav
aside
figure
dialog
blockquote
div
template
hgroup

paragraph

The inside of an element that should only contain text or items that can flow with text (like buttons or images).

Elements that contain a paragraph context include but are not limited to:

p
h1...h6
span
inline formatting

Some elements contain section context if declared as a block element and paragraph context if declared as a line, line-block, or inline element. This includes the following:

td (including when inferred as a child of tr)
th
li (including when inferred as a child of ul or ol)
dd
figcaption

Inference by context and node type
Context	Node Type
	Element		Paragraph
	Line	Block	Paragraph
Section	`p`	`div`	`p`
Paragraph	`span`	`span`	plain text

Specialized Contexts

Some contexts are specific to a single element type or small group of element types.

`details`'s specialized context infers the first paragraph as the summary.
Source	Rendered
`details[open] { More info This is the more detailed information. }`	More info This is the more detailed information.

list: The inside of a ul, ol, or menu element. Infers contents to be list items (li).
table: The inside of a table, thead, tbody, or tfoot element. Infers contents to be rows (tr).
table row: The inside of a tr element. Infers contents to be data cells (td). Note: header cells (th) will need their element type explicitly stated.
details: The inside of a details element. The first line, block, or paragraph is inferred to be a summary element. The remainder of the contents are in section context.
fieldset: If the fieldset element's first node is a paragraph, that paragraph will be inferred as a legend. All other nodes are in section context.
description list: The inside of a dl element. Infers line elements to be description terms (dt) and all other nodes to be description details (dd).
links and custom elements: An a element or any custom element contains a paragraph context if used in a paragraph context. Otherwise, it contains a section context.
others: The following special contexts can also be found in the table below:

Inference by context and node type
Context	Node Type
	Element		Paragraph
	Line	Block	Paragraph
List	`li`	`li`	`li`
Table	`tr`	`tr`	`tr`
Table Row	`td`	`td`	`td`
Description List	`dt`	`dd`	`dd`
Label	`input`	`div`	`p`
Select	`option`	`optgroup`	`option`
Data List	`option`	`option`	`option`
Column Group	`col`	`col`
Image Map	`area`	`area`
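The two inference tables (standard and specialized contexts) amount to a lookup from context and node type to an element name. The Rust sketch below encodes them as a single `match`. The `Context` and `NodeKind` enums and the `infer` function are hypothetical names for illustration, not the crate's types, and the sketch leaves out the details and fieldset rules, which depend on node position. `None` means no element is inferred: a paragraph in a paragraph context stays plain text, and the tables list no paragraph entry for column groups or image maps.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Context {
    Section,
    Paragraph,
    List,
    Table,
    TableRow,
    DescriptionList,
    Label,
    Select,
    DataList,
    ColumnGroup,
    ImageMap,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum NodeKind {
    Line,
    Block,
    Paragraph,
}

/// Looks up the inferred element name for a node without an explicit type.
pub fn infer(ctx: Context, node: NodeKind) -> Option<&'static str> {
    use Context::*;
    match (ctx, node) {
        // Standard contexts
        (Section, NodeKind::Block) => Some("div"),
        (Section, _) => Some("p"),
        (Paragraph, NodeKind::Paragraph) => None, // stays plain text
        (Paragraph, _) => Some("span"),
        // Specialized contexts
        (List, _) => Some("li"),
        (Table, _) => Some("tr"),
        (TableRow, _) => Some("td"),
        (DescriptionList, NodeKind::Line) => Some("dt"),
        (DescriptionList, _) => Some("dd"),
        (Label, NodeKind::Line) => Some("input"),
        (Label, NodeKind::Block) => Some("div"),
        (Label, NodeKind::Paragraph) => Some("p"),
        (Select, NodeKind::Block) => Some("optgroup"),
        (Select, _) => Some("option"),
        (DataList, _) => Some("option"),
        (ColumnGroup, NodeKind::Paragraph) => None, // no entry in the table
        (ColumnGroup, _) => Some("col"),
        (ImageMap, NodeKind::Paragraph) => None, // no entry in the table
        (ImageMap, _) => Some("area"),
    }
}
```

For example, a block with no selector inside a section resolves to `div`, while a line element with no selector inside a `dl` resolves to `dt`.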

MinTyML	HTML	Rendered
`div { A B }`	`<div> <p>A</p> <p>B</p> </div>`	A B
`div> { A B }`	`<div> A B </div>`	A B