1 unstable release: 0.2.0 (Dec 30, 2024)
PDL: Prompt Description Language
Pdl is a special file format used by the ragit project to represent prompts. It allows you to
- write pragmatic prompts using the tera template language.
- embed image files.
- force LLMs to output a json with a designated schema.
Language
Pdl is basically a readable format of LLM messages. For example,
<|user|>
Hi, what's your name?
<|assistant|>
I'm Llama.
<|user|>
How old are you?
is converted to
[
  {
    "role": "user",
    "content": "Hi, what's your name?"
  }, {
    "role": "assistant",
    "content": "I'm Llama."
  }, {
    "role": "user",
    "content": "How old are you?"
  }
]
Each turn must start with a turn-separator: <|user|>, <|assistant|>, <|system|> or <|schema|>. A turn-separator must follow, and be followed by, a newline character. If any content comes before the first turn-separator, that's an error.
<|schema|> is a special type of turn. It is described in the Schema section.
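The turn rules above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions, not ragit's actual parser: `Message` and `parse_turns` are hypothetical names, a separator is assumed to occupy a whole line, and blank lines before the first separator are assumed to be harmless.

```rust
/// One chat message, mirroring the `role` / `content` pairs in the json above.
#[derive(Debug, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Splits pdl text into messages (hypothetical helper, not ragit's API).
/// A line consisting solely of a separator such as `<|user|>` opens a new turn.
pub fn parse_turns(pdl: &str) -> Result<Vec<Message>, String> {
    const ROLES: [&str; 4] = ["user", "assistant", "system", "schema"];
    let mut messages: Vec<Message> = Vec::new();
    let mut lines_of_turn: Vec<&str> = Vec::new();

    for line in pdl.lines() {
        // `Some(role)` only if the whole line is `<|role|>` with a known role.
        let role = line
            .strip_prefix("<|")
            .and_then(|rest| rest.strip_suffix("|>"))
            .filter(|name| ROLES.contains(name));

        match role {
            Some(name) => {
                // Close the previous turn before opening a new one.
                if let Some(last) = messages.last_mut() {
                    last.content = lines_of_turn.join("\n").trim().to_string();
                }
                lines_of_turn.clear();
                messages.push(Message {
                    role: name.to_string(),
                    content: String::new(),
                });
            }
            // Content before any turn-separator is an error.
            // Assumption: whitespace-only lines are tolerated here.
            None if messages.is_empty() => {
                if !line.trim().is_empty() {
                    return Err(format!("content before any turn-separator: {:?}", line));
                }
            }
            None => lines_of_turn.push(line),
        }
    }

    // Close the final turn.
    if let Some(last) = messages.last_mut() {
        last.content = lines_of_turn.join("\n").trim().to_string();
    }
    Ok(messages)
}
```

Calling `parse_turns` on the three-turn example above would yield three messages with the roles `user`, `assistant` and `user`, while text placed before the first `<|user|>` line yields an `Err`.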
Template
You can write a pragmatic prompt with the tera template engine. When the pdl engine parses a pdl file, the file first goes through tera. That means tera syntax is applied before any pdl syntax. You can create or remove a turn using tera syntax, or create a templated schema. You can also write comments with tera's comment syntax.
Images
There's a special syntax in pdl that allows you to embed images.
TODO: write doc
Schema
You can force LLMs to output a json value with a schema. You can set the schema with a <|schema|> turn. If it's not given, it doesn't check anything. If it's given more than once, that's an error.
<|schema|>
{ name: str, age: int }
<|user|>
Tell me about you.
The above pdl forces LLMs to output a json like { "name": "Llama", "age": 4 }. It's not magic. It's just a prompt enhancement. So I recommend that you
- Explain your schema in the user prompt or system prompt. The <|schema|> turn does not reach the LLM.
- Keep your schema simple. It works by telling the LLM which part of the output is wrong if it's wrong. It's like fixing your code with compiler error messages. If the schema is too complicated, the error messages become less readable. If it fails too many times, it just returns a default value.
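The feedback loop described in the list above might look roughly like the following. It is a sketch, not ragit's implementation: `ask_with_schema`, its parameters, and the two closures standing in for the model call and the schema check are all hypothetical, and the retry limit is an assumed knob.

```rust
/// Asks `llm` until `validate` accepts the reply, feeding each error back.
/// Hypothetical helper: `llm` receives the previous error message (if any),
/// and `validate` plays the role of the schema check.
pub fn ask_with_schema<T>(
    mut llm: impl FnMut(Option<&str>) -> String,
    validate: impl Fn(&str) -> Result<T, String>,
    max_attempts: usize,
    default: T,
) -> T {
    let mut last_error: Option<String> = None;

    for _ in 0..max_attempts {
        // On a retry, the model sees what was wrong with its last answer,
        // much like a programmer reading a compiler error.
        let reply = llm(last_error.as_deref());

        match validate(&reply) {
            Ok(value) => return value,
            Err(message) => last_error = Some(message),
        }
    }

    // Too many failures: give up and return the default value.
    default
}
```

The shorter and clearer the message returned by `validate`, the more likely the next attempt succeeds, which is one reason to keep schemas simple.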
Constraints
You can add constraints to a schema. For example, { name: str, age: int { min: 0, max: 100 } } forces the age value to be between 0 and 100 (both inclusive).
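To make the inclusive bounds concrete, here is a minimal sketch of such a check. `IntConstraint` and `check` are hypothetical names chosen for illustration, and the wording of the error messages is an assumption.

```rust
/// Inclusive bounds for an `int` field, as in `age: int { min: 0, max: 100 }`.
/// Hypothetical type, not part of ragit's API.
pub struct IntConstraint {
    pub min: Option<i64>,
    pub max: Option<i64>,
}

impl IntConstraint {
    /// Returns `Ok(())` if `value` is within bounds, otherwise an error
    /// message that could be sent back to the LLM.
    pub fn check(&self, field: &str, value: i64) -> Result<(), String> {
        if let Some(min) = self.min {
            if value < min {
                return Err(format!("`{}` must be at least {}, got {}", field, min, value));
            }
        }
        if let Some(max) = self.max {
            if value > max {
                return Err(format!("`{}` must be at most {}, got {}", field, max, value));
            }
        }
        Ok(())
    }
}
```

With `min: Some(0)` and `max: Some(100)`, both 0 and 100 pass, while -1 and 101 are rejected.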
Non-json schema
Basically, the pdl engine first extracts a json-looking string from the LLM output, then parses it. For example, if the schema is a json object, the engine tries to match curly braces using a regular expression. If it fails to parse the json, that's an error.
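The extraction step can be illustrated with the standard library alone. Note that this sketch swaps the regular expression for a plain search for the first `{` and the last `}`, so it is not the engine's actual matching logic. `extract_json_object` is a hypothetical name.

```rust
/// Returns the span from the first `{` to the last `}` in `output`,
/// or `None` if no such span exists. Hypothetical helper that uses
/// plain string search instead of a regular expression.
pub fn extract_json_object(output: &str) -> Option<&str> {
    let start = output.find('{')?;
    let end = output.rfind('}')?;

    if start < end {
        // `}` is a single byte, so the inclusive range ends on a char boundary.
        Some(&output[start..=end])
    } else {
        None
    }
}
```

The returned slice would then be handed to a json parser. A reply with chatter around the object, such as `Sure! { "name": "Llama", "age": 4 } Hope that helps.`, is reduced to just the braces and what lies between them.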
There are 2 cases where it doesn't parse json.
- If the schema is str, it just treats the entire output as the string. It doesn't look for quotation marks, and it doesn't run the parser. You can also add constraints to str. For example, if the schema is str { min: 100 }, it makes sure that the length of the entire output is at least 100 characters.
- (TODO) If the schema is yesno, it makes sure that the LLM's output is either yes or no. You cannot mix it with other json schemas because yes and no are not valid json values. If yes/no is all you need, yesno is better than bool because LLMs are usually better at English than json.
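The two non-json cases above could be checked along these lines. This is a sketch under assumptions: `check_str` and `check_yesno` are hypothetical names, length is counted in Unicode scalar values rather than bytes, and surrounding whitespace and letter case are ignored for yesno. Since yesno is still marked TODO, its real behavior may differ.

```rust
/// `str { min: N }`: the whole output is the string, and only its length
/// is checked. Assumption: length means number of `char`s, not bytes.
pub fn check_str(output: &str, min_chars: usize) -> Result<&str, String> {
    let len = output.chars().count();
    if len < min_chars {
        Err(format!("expected at least {} characters, got {}", min_chars, len))
    } else {
        Ok(output)
    }
}

/// `yesno`: accepts only `yes` or `no`.
/// Assumption: surrounding whitespace and letter case are ignored.
pub fn check_yesno(output: &str) -> Result<bool, String> {
    match output.trim().to_lowercase().as_str() {
        "yes" => Ok(true),
        "no" => Ok(false),
        other => Err(format!("expected `yes` or `no`, got `{}`", other)),
    }
}
```

Neither function touches a json parser, which is the point of both cases: the output is judged as plain text.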