12 releases
Version | Released |
---|---|
0.1.18 | May 28, 2024 |
0.1.17 | May 23, 2024 |
0.1.11 | Apr 29, 2024 |
0.1.1 | Mar 9, 2024 |
What is MinTyML?
MinTyML (from Minimalist HTML) is an alternative HTML syntax intended for writing documents.
Principles
This markup language is designed with the following principles in mind:
- Structure and formatting should be as concise as possible so the writer can focus on the text.
- Writing documents without complex formatting or interactivity should not require strong knowledge of HTML.
- Any reasonable, valid HTML should be representable in MinTyML.
Using MinTyML
Demo
This web-based demo shows how a MinTyML document will look.
Command-line interface
The command-line MinTyML converter can be downloaded from GitHub or installed with `cargo install mintyml-cli`.
IDE tooling
Try the VS Code extension.
Libraries
Language | Package | Install | Doc |
---|---|---|---|
Rust | mintyml | `cargo add mintyml` | doc |
JavaScript/TypeScript | mintyml (compatible with Node.js or in the browser with webpack) | `npm install --save mintyml` | |
Basic structure
A MinTyML document is made up of nodes that correspond to HTML constructs.
Paragraph
The simplest node is the paragraph. A paragraph is made up of one or more consecutive lines of plain text. Paragraphs are separated from each other by empty lines and cannot be nested inside other paragraphs.
Selector
Some elements may include a selector, which describes the type and attributes of the element. It's syntactically very similar to a CSS selector.
A selector is made up of the following components (each of which is optional):
- tag name: Indicates the type of the element, and must be placed at the beginning of the selector. For example, `a` to create a link (also called an anchor). A wildcard (`*`) is the same as not including a tag name at all, which signals that the element type should be inferred from context.
- class: A class name, preceded by a dot (`.`). Classes are used by CSS to apply styles to certain elements. For example, `.external`, which one might use to indicate a link points to another domain. A selector can have multiple classes like so: `.class1.class2`
- id: An identifier for this specific element, preceded by a hash (`#`). For example, `#example-link`. No element should have the same ID as another within the same document. Can be used by CSS to style a specific element.
- attribute: One or more HTML attributes that provide more information about the element, surrounded by brackets (`[ ]`). For example, `[href=http://example.com/]` sets the destination of a link. Multiple attributes are separated by a space. If an attribute value contains a space or a closing bracket, it can be surrounded by single quotes (`' '`) or double quotes (`" "`). For example, `[href=http://example.com/ title='Example site']`
Putting all of the components together to form a selector might look like this:
a#example-link.external[href=http://example.com/ title='Example site']
The above selector describes a link that:
- Has the unique identifier `example-link`.
- Is given the `external` class.
- Points to `http://example.com/`.
- Has the title `Example site`.
Most elements' selectors will not be nearly as complex as this. In fact, many elements will not need a selector at all.
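To make the component rules concrete, here is a minimal Python sketch that splits a single selector into its tag name, id, classes, and attributes, then prints an HTML opening tag. It is not the MinTyML parser: `parse_selector` and `opening_tag` are hypothetical names, the regular expressions cover only the forms described above, and the attribute order in the output is an assumption.

```python
import re
from html import escape

# Hypothetical helper patterns; they cover only the forms described above.
_SELECTOR = re.compile(
    r"^(?P<tag>[A-Za-z][\w-]*|\*)?"   # optional tag name or wildcard
    r"(?P<mods>(?:[.#][\w-]+)*)"       # any mix of .class and #id
    r"(?:\[(?P<attrs>.*)\])?$"         # optional [attribute list]
)
_ATTR = re.compile(r"""([^\s=]+)(?:=('[^']*'|"[^"]*"|[^\s\]]+))?""")


def parse_selector(selector):
    """Split one selector into tag, id, classes, and attributes."""
    m = _SELECTOR.match(selector)
    if m is None:
        raise ValueError(f"unsupported selector: {selector!r}")
    tag = m.group("tag")
    if tag == "*":
        tag = None  # a wildcard is the same as omitting the tag name
    classes = re.findall(r"\.([\w-]+)", m.group("mods"))
    ids = re.findall(r"#([\w-]+)", m.group("mods"))
    attrs = {}
    for name, value in _ATTR.findall(m.group("attrs") or ""):
        if value[:1] in ("'", '"'):
            value = value[1:-1]  # drop the surrounding quotes
        attrs[name] = value
    return {"tag": tag, "id": ids[0] if ids else None,
            "classes": classes, "attrs": attrs}


def opening_tag(selector, inferred="div"):
    """Render an HTML opening tag; `inferred` stands in for context inference."""
    parts = parse_selector(selector)
    out = [parts["tag"] or inferred]
    if parts["id"]:
        out.append(f'id="{escape(parts["id"])}"')
    if parts["classes"]:
        out.append(f'class="{escape(" ".join(parts["classes"]))}"')
    for name, value in parts["attrs"].items():
        out.append(f'{name}="{escape(value)}"')
    return "<" + " ".join(out) + ">"


print(opening_tag("a#example-link.external[href=http://example.com/ title='Example site']"))
```

The `inferred` parameter is only a placeholder: in MinTyML the element type of a selector without a tag name depends on the surrounding context.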
Multiple selectors can be chained, separated by `>`, to nest elements together.
For example, `header>h2>` describes a `header` element with a level 2 heading inside. This can also be used to compose elements with inferred types:
- `>img[src=/icon.png]>`, when used within a section context, describes a paragraph containing a single image.
- `section>*>` or simply `section>>` describes a section with a single paragraph element.
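The chaining examples can be traced with a small Python sketch that lists the element type each link of a chain resolves to. `chain_tags` is a hypothetical helper, not part of any MinTyML package. It assumes only the section and paragraph inference rules for line elements, and it splits naively on `>`, so attribute values containing `>` are out of scope.

```python
import re

# Elements this document lists as containing a section context.
SECTION_ELEMENTS = {
    "body", "main", "article", "header", "footer", "section", "nav",
    "aside", "figure", "dialog", "blockquote", "div", "template", "hgroup",
}


def chain_tags(chain, context="section"):
    """Return the element types named or implied by a chain like 'header>h2>'."""
    if not chain.endswith(">"):
        raise ValueError("a selector chain ends with '>'")
    tags = []
    for part in chain[:-1].split(">"):
        named = re.match(r"[A-Za-z][\w-]*", part)
        if named:
            tag = named.group(0)
        else:
            # No tag name (or a wildcard): infer from the enclosing context.
            tag = "p" if context == "section" else "span"
        tags.append(tag)
        # Simplification: anything outside the listed elements is treated
        # as a paragraph context.
        context = "section" if tag in SECTION_ELEMENTS else "paragraph"
    return tags


print(chain_tags("header>h2>"))
print(chain_tags(">img[src=/icon.png]>"))
print(chain_tags("section>>"))
```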
Element
Elements are components of a document that have a distinct purpose and translate directly to an HTML element.
Some common element types (or tag names) that might be used in a text-centric document are:
- `p` (paragraph): A block of related text. Block elements that contain section content will automatically encapsulate chunks of text in this element.
- `h1`, `h2`, `h3`, `h4`, `h5`, and `h6` (heading levels 1-6): Headings implicitly organize the document into sections and subsections based on the heading level.
- `ul` (unordered list) and `ol` (ordered list): Bulleted and numbered lists, respectively. Any paragraph or element contained within that doesn't specify a type will be inferred as `li` (list item). See list inference.
- `table`: Displays data in a grid. The most basic table is made up of `tr` (table row) elements. Each row contains one or more cells of type `th` (table header) or `td` (table data). Like list items, rows and data cells can be inferred. See table inference.
- `article`: A complete work of written text. In some cases, there may be more than one article per document.
- `section`: A logical section of a page or document. The first heading found in the section is that section's heading. Some examples of uses for sections:
  - To denote chapters in a book.
  - To separate the introduction, body, and bibliography of a paper.
  - To mark categories of items in a menu.
- `div` (content division): A container used to group other elements. This element behaves the same as a section, but doesn't imply any logical grouping of its contents. It is often used to style or visually group one or more elements. A block declared in a section context that contains section content automatically creates a `div` if no tag name is given: `section { { This is the content of a div. } .foo { This is the content of a div with the class 'foo'. } }`
MDN is an excellent resource for learning more about HTML elements.
Line Element
A line element consists of an optional selector followed by `>` and optionally another node (for example a paragraph or element).
Line elements are generally used for nodes that contain no children, or that are meant to wrap around a single child or line of text.
Line elements contain the node following `>` if one is present.
Source | Rendered |
---|---|
| Explicit |
| Implicit |
| |
| A div containing plain text (not wrapped in `p` elements). |
Inference
When no element selector is provided, a line element will become a `p` when used in a section context, or a `span` in a paragraph context. For other contexts, see element inference.
Block Element
A block element consists of an optional selector followed by a pair of curly braces (`{ }`) containing zero or more nodes.
Blocks are generally used to define containers, or nodes that are meant to contain other nodes. For example, the `article`, `section`, `table`, `details`, `ul`, and `ol` elements will usually be declared with braces.
Inference
When no element type is specified, a block element is inferred to be a `div` in a section context, or a `span` in a paragraph context. For other contexts, see element inference.
Line-Block Element
A line-block element consists of an optional selector followed by `> { ... }`, where the curly braces contain zero or more nodes. A line-block behaves like a multi-node line element.
The crucial difference between block and line-block elements is that a line-block doesn't wrap bare text up in paragraphs or any other element.
Inference
Line-block elements that don't specify a type will be inferred exactly the same as a line element:
Example | MinTyML | HTML |
---|---|---|
Block | | |
Line-Block | | |
Inline Element
An inline element is an element placed inside a paragraph node. It consists of an optional node wrapped in angle brackets and parentheses: `<( ... )>`.
An inline element can contain any kind of node, including elements and paragraphs.
Inner Node | MinTyML | Rendered |
---|---|---|
Paragraph | | Hello, world! |
Line | | OK |
Block | | |
Line-block | | |
Inline Formatting
Inline formatting nodes are a set of shorthands for certain inline elements. These should cover the most common scenarios for applying formatting to a slice of a paragraph.
Formatting | Element | Appearance |
---|---|---|
`<#strong#>` | `<(strong> strong)>` | strong |
`</emphasis/>` | `<(em> emphasis)>` | emphasis |
`<_underline_>` | `<(u> underline)>` | underline |
`<~strikethrough~>` | `<(s> strikethrough)>` | strikethrough |
`<"quote">` | `<(q> quote)>` | quote |
`` <`code`> `` | `<(code> <[[code]]>)>` | code |
Note that `` <`code`> `` is different from the others: instead of parsing its contents as MinTyML, it reads the string as-is. This means inline code can't contain the terminator `` `> ``, as it would be ambiguous. To work around this, use an equivalent like `` <(code> \<`code`\>)> `` or `` <(code> <[[<`code`>]]>)> `` (see escape sequence and verbatim segment).
Raw Text
Sometimes you want to include text without it being interpreted as MinTyML nodes, for example when your text includes characters that would normally need to be escaped.
It's recommended that `script` and `style` elements contain raw text to avoid conflicts between MinTyML syntax and JavaScript or CSS syntax.
Plaintext Block
A plaintext block consists of zero or more lines of text surrounded by either `'''` or `"""`, where the opening and closing quotes are on their own lines. Any text on the lines between the opening and closing quotes will be interpreted as plain text and will not create any nodes. Plaintext blocks delimited by double quotes (`"""`) will have escape sequences substituted, but those delimited by single quotes (`'''`) will not.
Delimiter | MinTyML | Rendered |
---|---|---|
Single quotes | | Hello, \u{1F30E} </world/>! |
Double quotes | | Hello, 🌎 </world/>! |
Any indentation prior to the closing quotes' indentation level will be discarded.
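One way to read the indentation rule is sketched below in Python. `plaintext_block` is a hypothetical function, not the library's API. It assumes the block is passed as a single string, strips at most as much leading whitespace as precedes the closing quotes, and leaves escape substitution for double-quoted blocks out of scope.

```python
def plaintext_block(source):
    """Return the text of a plaintext block with shared indentation removed."""
    lines = source.split("\n")
    opener, closer = lines[0].strip(), lines[-1].strip()
    if opener not in ("'''", '"""') or closer != opener:
        raise ValueError("expected matching ''' or \"\"\" on their own lines")
    # Indentation level of the closing quotes.
    indent = len(lines[-1]) - len(lines[-1].lstrip())
    body = []
    for line in lines[1:-1]:
        # Drop at most `indent` leading whitespace characters.
        leading = len(line) - len(line.lstrip())
        body.append(line[min(indent, leading):])
    return "\n".join(body)


example = "    '''\n    Hello,\n      world!\n    '''"
print(plaintext_block(example))
```

In this example the four spaces shared with the closing quotes are discarded, while the two extra spaces before `world!` are kept.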
Verbatim Segment
Inspired by XML's `CDATA` syntax, a verbatim segment is a string of text surrounded by `<[[` and `]]>`. The contents of a verbatim segment are interpreted as text and included in the document as-is. This means no nodes can be nested inside the segment and escape sequences will not be substituted.
The alternate delimiters `<[#[` and `]#]>` or `<[##[` and `]##]>` may also be used in case the text contains `]]>` or `]#]>`.
MinTyML | HTML | Rendered |
---|---|---|
| | Hello, world! |
| | Hello, world! |
| | Hello,\nworld! |
| | Hello, </world/>! |
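When generating MinTyML from another program, the choice between the three delimiter pairs can be automated. The Python sketch below uses a hypothetical helper named `verbatim` and relies only on the delimiter pairs listed above. It picks the first pair whose closing delimiter does not occur in the text.

```python
# Delimiter pairs named in the text, from plainest to most decorated.
DELIMITERS = [("<[[", "]]>"), ("<[#[", "]#]>"), ("<[##[", "]##]>")]


def verbatim(text):
    """Wrap text in the first delimiter pair whose closer is absent from it."""
    for opener, closer in DELIMITERS:
        if closer not in text:
            return f"{opener}{text}{closer}"
    raise ValueError("text contains every known closing delimiter")


print(verbatim("Hello, </world/>!"))
print(verbatim("a ]]> b"))
```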
Template Interpolation Segment
Segments of text that resemble some common interpolation tags for template languages will remain unchanged so the MinTyML source can be compiled to an HTML template.
The following delimiters mark interpolations that will be unchanged:
Open | Close | Usage examples |
---|---|---|
`{{` | `}}` | Angular, Handlebars, Liquid |
`{%` | `%}` | Liquid |
`<%` | `%>` | Embedded Ruby |
`<?` | `?>` | PHP |
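As a rough illustration of which spans are affected, the Python sketch below separates a line of text into interpolation segments and ordinary text using the four delimiter pairs from the table. `split_interpolations` is a hypothetical name, and the non-greedy matching is an assumption about how the segments end.

```python
import re

# One alternative per delimiter pair in the table above.
_INTERPOLATION = re.compile(
    r"(\{\{.*?\}\}|\{%.*?%\}|<%.*?%>|<\?.*?\?>)", re.DOTALL
)


def split_interpolations(text):
    """Split text into (is_interpolation, segment) pairs."""
    parts = _INTERPOLATION.split(text)
    # With one capturing group, re.split alternates plain text and matches.
    return [(i % 2 == 1, part) for i, part in enumerate(parts) if part]


print(split_interpolations("Hi {{ name }}!"))
```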
Code Block
A code block closely resembles a plaintext block, but it begins and ends with three backticks rather than quotation marks. It differs from the single-quoted plaintext block in that the contents are wrapped in a `code` element within a `pre` element. This means it will usually be rendered in a monospace font, and whitespace will not be ignored. A code block is equivalent to a single-quoted plaintext block following `pre>code>`.
Code Block Source | Plaintext Block Source | Rendered |
---|---|---|
| | |
Comment
A comment contains text that is visible in the source of the document but excluded from the presentation. Comments are enclosed with `<!` and `!>` like so:
<! This is a comment !>
The above example would be represented in HTML with:
<!-- This is a comment -->
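The mapping between the two comment forms is mechanical, as this minimal Python sketch shows. `comments_to_html` is a hypothetical helper that rewrites every `<! ... !>` span it finds. A real converter would also have to skip raw text such as verbatim segments, which this sketch does not attempt.

```python
import re

_COMMENT = re.compile(r"<!(.*?)!>", re.DOTALL)


def comments_to_html(source):
    """Rewrite MinTyML-style comments as HTML comments."""
    return _COMMENT.sub(lambda m: f"<!--{m.group(1)}-->", source)


print(comments_to_html("<! This is a comment !>"))
```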
Comments can be used anywhere a node is valid, including within a paragraph:
Source | Rendered |
---|---|
| Hello, world! |
| Hello,world! |
Escape Sequence
An escape sequence begins with a backslash (`\`) and provides an alternate representation of a character.
Escape | Output |
---|---|
`\n` | Line feed (new line) |
`\r` | Carriage return |
`\t` | Tab |
`\\` | Backslash (`\`) |
`\ ` (a space following `\`) | Space |
Where hh is a 2-digit hexadecimal number no greater than 7F | The character with ASCII number hh |
`\u{hex}`, where hex is a hexadecimal number between 1 and 6 digits | The character with Unicode number hex, e.g. `\u{1F30E}` for 🌎 |
Where sym is any of: | sym |
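The substitution can be sketched in a few lines of Python. `unescape` is a hypothetical function covering only the escapes whose spelling appears in this document: `\n`, `\r`, `\t`, `\\`, backslash-space, and the `\u{...}` form seen in the plaintext block example. Leaving every other backslash sequence untouched is an assumption made to keep the sketch conservative.

```python
import re

# Single-character escapes from the table above.
SIMPLE = {"n": "\n", "r": "\r", "t": "\t", "\\": "\\", " ": " "}

_ESCAPE = re.compile(r"\\(?:u\{([0-9A-Fa-f]{1,6})\}|(.))", re.DOTALL)


def unescape(text):
    """Substitute the escape sequences this sketch knows about."""
    def replace(match):
        hex_digits, char = match.groups()
        if hex_digits is not None:
            return chr(int(hex_digits, 16))  # Unicode code point
        # Assumption: unknown escapes are left exactly as written.
        return SIMPLE.get(char, match.group(0))
    return _ESCAPE.sub(replace, text)


print(unescape(r"Hello, \u{1F30E} world!"))
```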
Element Inference
Context
The type of an unspecified element is inferred based on context, which is determined by the type of the containing element.
Standard Contexts
- section: The inside of an element that may contain paragraphs, headings, lists, tables, and structural elements like `header`, `footer`, `section`, or `article`. Elements that contain a section context include `body`, `main`, `article`, `header`, `footer`, `section`, `nav`, `aside`, `figure`, `dialog`, `blockquote`, `div`, `template`, and `hgroup`.
- paragraph: The inside of an element that should only contain text or items that can flow with text (like buttons or images). Elements that contain a paragraph context include, but are not limited to, `p`, `h1`...`h6`, `span`, and inline formatting.
Some elements contain section context if declared as a block element and paragraph context if declared as a line, line-block, or inline element. This includes the following:
- `td` (including when inferred as a child of `tr`)
- `th`
- `li` (including when inferred as a child of `ul` or `ol`)
- `dd`
- `figcaption`
Context | Line element | Block element | Paragraph |
---|---|---|---|
Section | `p` | `div` | `p` |
Paragraph | `span` | `span` | plain text |
Specialized Contexts
Some contexts are specific to a single element type or small group of element types.
- list: The inside of a `ul`, `ol`, or `menu` element. Infers contents to be list items (`li`).
- table: The inside of a `table`, `thead`, `tbody`, or `tfoot` element. Infers contents to be rows (`tr`).
- table row: The inside of a `tr` element. Infers contents to be data cells (`td`). Note: header cells (`th`) will need their element type explicitly stated.
- details: The inside of a `details` element. The first line, block, or paragraph is inferred to be a `summary` element. The remainder of the contents are in section context.
- fieldset: If the `fieldset` element's first node is a paragraph, that paragraph will be inferred as a `legend`. All other nodes are in section context.
- description list: The inside of a `dl` element. Infers line elements to be description terms (`dt`) and all other nodes to be description details (`dd`).
- links and custom elements: `a`, as well as any custom element, contains a paragraph context if used in a paragraph context. Otherwise it contains a section context.
- others: The following special contexts can also be found in the table below:
  - label (`label`): Line elements inferred as `input`.
  - data list (`datalist`, `optgroup`): All children inferred as `option`.
  - select (`select`): Like data list, but blocks inferred as `optgroup`.
  - column group (`colgroup`): Infers elements as `col`.
  - image map (`imagemap`): Infers elements as `area`.
Context | Line element | Block element | Paragraph |
---|---|---|---|
List | `li` | `li` | `li` |
Table | `tr` | `tr` | `tr` |
Table Row | `td` | `td` | `td` |
Description List | `dt` | `dd` | `dd` |
Label | `input` | `div` | `p` |
Select | `option` | `optgroup` | `option` |
Data List | `option` | `option` | `option` |
Column Group | `col` | `col` | |
Image Map | `area` | `area` | |
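The two inference tables amount to a lookup keyed by context and node kind, which the Python sketch below spells out. The dictionary rows are copied from the tables. `infer_tag` and the context names used as keys are illustrative choices, not identifiers from the MinTyML libraries. Line-blocks are mapped to the line column, following the rule that they are inferred exactly like line elements.

```python
# Rows copied from the inference tables: (line, block, paragraph).
INFERENCE = {
    "section": ("p", "div", "p"),
    "paragraph": ("span", "span", None),  # bare text stays plain text
    "list": ("li", "li", "li"),
    "table": ("tr", "tr", "tr"),
    "table row": ("td", "td", "td"),
    "description list": ("dt", "dd", "dd"),
    "label": ("input", "div", "p"),
    "select": ("option", "optgroup", "option"),
    "data list": ("option", "option", "option"),
    "column group": ("col", "col", None),  # paragraph cell is empty in the table
    "image map": ("area", "area", None),   # paragraph cell is empty in the table
}
NODE_KINDS = ("line", "block", "paragraph")


def infer_tag(context, node_kind):
    """Look up the element type inferred for an untyped node, or None."""
    if node_kind == "line-block":
        node_kind = "line"  # line-blocks are inferred like line elements
    return INFERENCE[context][NODE_KINDS.index(node_kind)]


print(infer_tag("section", "block"))
print(infer_tag("description list", "line"))
```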
lib.rs:
This library exists to convert MinTyML (for Minimalist HTML) markup to its equivalent HTML.
This should be considered the reference implementation for MinTyML.