policyai
0.1.0 | Jan 31, 2025
Instructions form the backbone of agent-based workloads. Crucially, agents need instructions to know how to proceed with their tasks. The acai framework specifies the structure of an instruction as a core primitive, but leaves it to the user to write tooling that turns written instructions into predictable, reliable, structured outputs.
For an illustration of the problem, consider a manager for GitHub issues and notifications. Giving an open source model instructions about an issue's relative priority based upon the issue's assignee yields poor results because the model will often cross instructions from different assignees. For example, I asked a model, in the most straightforward way possible, to prioritize messages from Alice and deprioritize messages from Bob; it decided that no decision could be reached because the instructions conflicted: Alice must be priority and Bob must not be.
I want the reliability of o1 with the speed and cost of phi4.
policyai is about hitting that mark.
What is a policy?
To define the term precisely, a policy is a semantic injection coupled with a structured output. Policies are designed so that policies of the same type compose into a larger policy in a reliable way. To continue our GitHub example, we would be able to tell whether the policy for Alice or the policy for Bob applies to the output and represent just the information extracted by the relevant pieces of semantic policy writing.
For another example, consider a simple policy about what should happen with an email. We know that we want to mark the email read/unread, categorize and label it, prioritize human interaction with it, and reply with a drafted response using a template. What we can say is that the email should default to unread, but be toggle-able. The priority must take the highest priority assigned by any policy. The category must be uniformly agreed upon, but labels are an open set of strings. Finally, the template, if present, must be agreed upon by every policy that specifies it. Such a policy has a type like:
type policyai::EmailPolicy {
    unread: bool = true,
    priority: ["low", "medium", "high"] @ highest wins,
    category: ["ai", "distributed systems", "other"] @ agreement = "other",
    template: string @ agreement,
    labels: [string],
}
It is a declarative representation of everything we expressed in the prose. Crucially, this language for policy types prohibits representing structured outputs that are hard to compute reliably when composing policies.
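To make the annotations concrete, here is a minimal hand-rolled sketch of the merge semantics the type above implies: highest priority wins, categories must agree or fall back to "other", templates must agree or composition fails, and labels accumulate as a set. The `Priority`, `EmailAction`, and `merge` names are illustrative assumptions for this sketch, not the policyai crate's API, and treating a later `unread` write as the winner is likewise an assumption about "toggle-able".

```rust
// Illustrative sketch only; not the policyai crate's API.

// Variants are declared lowest-to-highest so the derived
// ordering makes "highest wins" a plain max().
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Low,
    Medium,
    High,
}

#[derive(Debug, Clone, Default)]
struct EmailAction {
    unread: Option<bool>,       // defaults to true if no policy sets it
    priority: Option<Priority>, // highest wins
    category: Option<String>,   // disagreement falls back to "other"
    template: Option<String>,   // disagreement is a hard error
    labels: Vec<String>,        // open set: union of all labels
}

fn merge(actions: &[EmailAction]) -> Result<EmailAction, &'static str> {
    let mut out = EmailAction::default();
    for a in actions {
        // Toggle-able boolean: assume the latest policy to set it wins.
        if let Some(u) = a.unread {
            out.unread = Some(u);
        }
        // Highest wins; None compares below Some(_) for Option<Priority>.
        out.priority = out.priority.take().max(a.priority.clone());
        // Agreement with a declared fallback value.
        if let Some(y) = a.category.clone() {
            out.category = match out.category.take() {
                None => Some(y),
                Some(x) if x == y => Some(x),
                Some(_) => Some("other".to_string()),
            };
        }
        // Agreement with no fallback: disagreement fails the merge.
        if let Some(y) = a.template.clone() {
            out.template = match out.template.take() {
                None => Some(y),
                Some(x) if x == y => Some(x),
                Some(_) => return Err("policies disagree on template"),
            };
        }
        // Open set of strings: take the union.
        for l in &a.labels {
            if !out.labels.contains(l) {
                out.labels.push(l.clone());
            }
        }
    }
    // Apply the declared default for any field no policy set.
    if out.unread.is_none() {
        out.unread = Some(true);
    }
    Ok(out)
}
```

The point of the type language is that every annotation maps to a merge rule like the ones above that is order-insensitive or explicitly resolved, so composing many policies stays predictable.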
Instances of this policy look like:
{
  "semantic-injection": "
    The email is relevant to football of either form.
    Mark \"unread\" false with \"low\" \"priority\".
  ",
  "action": {
    "unread": false,
    "priority": "low"
  }
}
{
  "semantic-injection": "
    The email pertains to ecommerce.
    Add \"Shopping\" to \"labels\".
  ",
  "action": {
    "labels": [
      "Shopping"
    ]
  }
}
{
  "semantic-injection": "
    The email is from mom@example.org.
    Record \"high\" \"priority\" and add \"Family\" to \"labels\".
  ",
  "action": {
    "priority": "high",
    "labels": [
      "Family"
    ]
  }
}
There's plenty of room for cross-policy interaction here. What should the outcome of applying these policies be if my mom sends me an email about shopping for football-related gear? It should be marked as unread with high priority and the labels "Family" and "Shopping". That's the unit-tested outcome for these policies and this query.
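Walking through that outcome under the type's merge rules (assuming the football policy is judged not to apply, since the email is about shopping for gear rather than football itself): neither applicable policy sets "unread", so the declared default of true holds; "high" from the mom policy wins the priority merge; and the labels from both policies accumulate. The composed action would look like:

```
{
  "unread": true,
  "priority": "high",
  "labels": ["Family", "Shopping"]
}
```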
How it works