|0.1.4||Nov 6, 2022|
|0.1.3||Oct 24, 2022|
|0.1.2||Sep 7, 2022|
|0.1.1||Aug 29, 2022|
|0.0.13||Aug 17, 2022|
A lightweight, textual tagging system aimed at DJs for managing custom metadata.
A gig tag is a flat structure with the following, pre-defined fields or components:
- Label
- Facet (including an optional calendar date)
- Props (custom properties)
All components are optional with the following restrictions:
- A valid gig tag must have a label or a facet.
- A gig tag with only a facet and neither a label nor props is valid only if the facet has a date suffix.
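The restrictions above can be sketched as a small validity check. This is only an illustration: the `GigTag` struct, its field names, and the `has_date_suffix` helper are hypothetical and are not taken from the crate's API. The check covers only which components must be present; it does not enforce the content rules for labels, facets, or property names.

```rust
/// Hypothetical in-memory representation of a gig tag (an assumption for
/// this sketch, not the crate's actual types).
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct GigTag {
    /// Optional label; `None` or an empty string means absent.
    pub label: Option<String>,
    /// Optional facet, including any date suffix.
    pub facet: Option<String>,
    /// Ordered name/value pairs; an empty list means absent.
    pub props: Vec<(String, String)>,
}

/// Returns `true` if `facet` ends with `@` followed by 8 ASCII digits.
fn has_date_suffix(facet: &str) -> bool {
    let bytes = facet.as_bytes();
    bytes.len() >= 9
        && bytes[bytes.len() - 9] == b'@'
        && bytes[bytes.len() - 8..].iter().all(u8::is_ascii_digit)
}

impl GigTag {
    /// Checks only the component restrictions:
    /// - a label or a facet must be present,
    /// - a facet without label and props needs a date suffix.
    pub fn is_valid(&self) -> bool {
        let has_label = self.label.as_deref().map_or(false, |l| !l.is_empty());
        let facet = self.facet.as_deref().filter(|f| !f.is_empty());
        let has_props = !self.props.is_empty();
        match (has_label, facet) {
            (true, _) => true,
            (false, None) => false,
            (false, Some(f)) => has_props || has_date_suffix(f),
        }
    }
}
```

Under these assumptions, a tag with only the facet `spotify` is rejected, while a tag with only the facet `@20220625` is accepted.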
A label is a non-empty string that contains arbitrary text without leading/trailing whitespace.
Labels are supposed to be edited by users and are displayed verbatim in the UI.
||a single word|
||multiple words concatenated in PascalCase|
||multiple words separated by whitespace|
The same content rules that apply to labels also apply to facets.
Moreover, facets must not start with a slash (`/`) character, which would otherwise interfere with the serialization format (see below).
Facets serve a different semantic purpose than labels. They are used for categorizing, namespacing or grouping a set of labels or for defining the context of associated properties.
Facets are supposed to represent pre-defined identifiers that are neither editable nor directly displayed in the UI.
A reserved suffix can be used to encode a calendar date into facets. Facets that end with a `@` character followed by 8 decimal digits are considered date-like facets. The digits are supposed to encode an ISO 8601 calendar date without a time zone in the format `YYYYMMDD`.
Facets are considered date-like even if the 8 decimal digits do not encode a valid date. This less restrictive constraint has been chosen deliberately to allow using regular expressions for recognizing date-like facets.
The `@` character of the date suffix must follow the preceding text without any intermediate whitespace. Thus the remaining prefix after stripping the date-like suffix remains a valid facet.
The following regular expressions could be used:
||Recognize date-like facets|
||Reject facets with a date-like suffix if preceded by whitespace|
||a tag for encoding properties related to Spotify|
||date-like facet without a prefix that denotes the calendar day 2022-06-25 in any time zone|
||date-like facet with prefix
||date-like facet without a prefix and an invalid date|
||date-like facet with prefix
||invalid date-like facet with a prefix containing trailing whitespace before the date-like suffix|
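The date suffix rules can also be checked without regular expressions. The following sketch uses only the standard library; the function name `split_date_like` is hypothetical and not part of the crate. It deliberately does not verify that the digits form a real calendar date, in line with the relaxed constraint described above.

```rust
/// Splits a facet into its prefix and the 8 digits of a date-like suffix.
///
/// Returns `None` if the facet has no date-like suffix or if the suffix is
/// preceded by whitespace. The digits are not validated as a calendar date.
pub fn split_date_like(facet: &str) -> Option<(&str, &str)> {
    // The suffix always occupies 9 ASCII bytes: '@' plus 8 decimal digits.
    let split_at = facet.len().checked_sub(9)?;
    // Avoid slicing in the middle of a multi-byte character.
    if !facet.is_char_boundary(split_at) {
        return None;
    }
    let (prefix, suffix) = facet.split_at(split_at);
    let digits = suffix.strip_prefix('@')?;
    if !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // The '@' must follow the prefix without intermediate whitespace.
    if prefix.chars().next_back().map_or(false, char::is_whitespace) {
        return None;
    }
    Some((prefix, digits))
}
```

With this sketch, `@20220625` yields an empty prefix and the digits `20220625`, whereas a facet whose prefix ends in whitespace before the `@` is rejected.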
Custom properties, abbreviated as props, can be attached to tags.
Properties are represented as a non-empty, ordered list of name/value pairs.
Names are non-empty strings that contain arbitrary text without leading/trailing whitespace. There are no restrictions regarding the uniqueness of names, i.e. duplicate names are permitted.
Values are arbitrary strings without any restrictions. Empty values are permitted.
Applications are responsible for interpreting the names and values in their respective context. Facets could be used for defining this context.
Individual tags are encoded as URIs:
URI = scheme ":" ["//" authority] path ["?" query] ["#" fragment]
authority = [userinfo "@"] host [":" port]
Only the path, query, and fragment components may be present. All other components must be absent, i.e. the URI string must contain neither a scheme nor an authority component.
The following table defines the component mapping:
|Tag component||URI component||Percent-encoded character set|
|label||fragment||fragment percent-encode set +
|facet||path||path percent-encode set +
|props (name/value)||query||query percent-encode set +
Tags, or rather their URIs, are serialized as text and the components are percent-encoded according to RFC 2396/1738. The above table specifies which characters need to be encoded for each tag component. Property names and values are encoded separately.
Empty components are considered absent when parsing a gig tag from a URI string.
The following examples show variations of the encoded string with empty components that are ignored when decoding the URI.
|Encoded||Facet||Date||Label||Props: Names||Props: Values|
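As a rough illustration of the component mapping, the following sketch splits an encoded string into its raw path, query, and fragment parts and treats empty components as absent. The names `RawComponents` and `split_components` are hypothetical. The sketch neither percent-decodes the components nor rejects strings that contain a scheme or an authority, so it covers only one step of a real decoder.

```rust
/// Raw (still percent-encoded) components of an encoded gig tag.
#[derive(Debug, PartialEq, Eq)]
pub struct RawComponents<'a> {
    /// URI path, mapped to the facet.
    pub facet: Option<&'a str>,
    /// URI query, mapped to the props.
    pub props: Option<&'a str>,
    /// URI fragment, mapped to the label.
    pub label: Option<&'a str>,
}

/// Maps an empty component to `None`, i.e. treats it as absent.
fn non_empty(s: &str) -> Option<&str> {
    Some(s).filter(|s| !s.is_empty())
}

/// Splits `uri` at the first `#` and then at the first `?` before it.
pub fn split_components(uri: &str) -> RawComponents<'_> {
    let (rest, fragment) = match uri.split_once('#') {
        Some((rest, fragment)) => (rest, Some(fragment)),
        None => (uri, None),
    };
    let (path, query) = match rest.split_once('?') {
        Some((path, query)) => (path, Some(query)),
        None => (rest, None),
    };
    RawComponents {
        facet: non_empty(path),
        props: query.and_then(non_empty),
        label: fragment.and_then(non_empty),
    }
}
```

For example, the inputs `#Label`, `?#Label`, and `#Label` preceded by an empty path all produce the same result in this sketch: only the label is present.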
The following tokens do not represent valid gig tags:
||URL scheme/authority are present|
||Only a facet without a date, neither a label nor props|
||Facet starts with a
||Date suffix in facet is prefixed by whitespace|
||Empty property name|
||Special characters like
||Empty label is considered as absent|
||Empty facet and props are considered as absent|
||Empty facet, props, and label are considered as absent|
Multiple tags are formatted and stored as text by concatenating the corresponding, encoded URIs. Subsequent URIs are separated by whitespace, e.g. a single ASCII space character.
Often it is not possible to store the encoded gig tags in a reserved field. In this case gig tags can be appended to any text field by separating them from the preceding text with arbitrary whitespace.
Text is split into tokens that are separated by whitespace. Parsing starts with the last token and continues from back to front. It stops when encountering a token that could not be parsed as a valid gig tag.
The first token that could not be parsed as a valid gig tag is considered the last token of the preceding text. The preceding text including this token and the whitespace until the first valid gig tag token must be preserved as an undecoded prefix.
When re-encoding the gig tags, the undecoded prefix that was captured during parsing must be prepended to the re-encoded gig tags string. This rule ensures that only whitespace characters can get lost during a decode/re-encode roundtrip, e.g. when arbitrary words from the preceding text are unintentionally parsed as valid gig tags (false positives).
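The back-to-front parsing and the re-encoding rule could look roughly like the sketch below. The function names are hypothetical, and the actual tag validation is passed in as a caller-supplied predicate `is_valid_tag`, since decoding a single token is outside the scope of this example.

```rust
/// Splits `text` into the undecoded prefix and the trailing tag tokens.
///
/// Tokens are scanned from back to front. Scanning stops at the first
/// token that `is_valid_tag` rejects. The prefix keeps that token and the
/// whitespace up to the first accepted tag token.
pub fn split_prefix_and_tags<'a>(
    text: &'a str,
    is_valid_tag: impl Fn(&str) -> bool,
) -> (&'a str, Vec<&'a str>) {
    let mut tags = Vec::new();
    // Byte offset at which the first accepted tag token starts.
    let mut tags_start = text.len();
    loop {
        let head = text[..tags_start].trim_end();
        // Start of the last whitespace-separated token in `head`.
        let token_start = head
            .char_indices()
            .rev()
            .find(|&(_, c)| c.is_whitespace())
            .map_or(0, |(i, c)| i + c.len_utf8());
        let token = &head[token_start..];
        if token.is_empty() || !is_valid_tag(token) {
            break;
        }
        tags.push(token);
        tags_start = token_start;
    }
    // Tokens were collected back to front; restore the original order.
    tags.reverse();
    (&text[..tags_start], tags)
}

/// Prepends the undecoded prefix to the tags, joined by single spaces.
pub fn reencode(prefix: &str, tags: &[&str]) -> String {
    format!("{prefix}{}", tags.join(" "))
}
```

Because the prefix retains its trailing whitespace, a roundtrip through `split_prefix_and_tags` and `reencode` can only change the whitespace between the recognized tag tokens, which matches the guarantee stated above.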
The text with the encoded gig tags is appended (separated by whitespace) to the Content Group field of audio files:
Permissions of this copyleft license are conditioned on making available source code of licensed files and modifications of those files under the same license (or in certain cases, one of the GNU licenses). Copyright and license notices must be preserved. Contributors provide an express grant of patent rights. However, a larger work using the licensed work may be distributed under different terms and without source code for files added in the larger work.
Any contribution intentionally submitted for inclusion in the work by you shall be licensed under the Mozilla Public License 2.0 (MPL-2.0).
It is required to add the following header with the corresponding SPDX short identifier to the top of each file:
// SPDX-License-Identifier: MPL-2.0