13 releases

0.0.13	Nov 29, 2024
0.0.12	Nov 29, 2024
0.0.3	Oct 31, 2024

AGPL-3.0-or-later

Amsterdam Prompt Gateway

Features:

Routing: route requests to suitable LLM providers (possibly using something like https://github.com/lm-sys/RouteLLM)
Monitoring:
- request / response token usage
- latency, failure rates
- allow clients to give feedback on results to track quality
Tracking: group requests by templates / tags and threads
Modifying: allow requests to specify what parts of a prompt are variable, store the templates in a database, and experiment with different templates.
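
As a rough illustration of the monitoring and tracking features, the sketch below aggregates token usage, latency, and failure rate per template. The `RequestRecord` schema and the `summarize` helper are hypothetical names chosen for this example; they are not taken from the gateway's code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One proxied request as the gateway might log it (hypothetical schema)."""
    template_id: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float
    failed: bool = False


def summarize(records):
    """Aggregate request count, tokens, mean latency, and failure rate per template_id."""
    groups = defaultdict(list)
    for record in records:
        groups[record.template_id].append(record)

    summary = {}
    for template_id, group in groups.items():
        summary[template_id] = {
            "requests": len(group),
            "total_tokens": sum(r.prompt_tokens + r.completion_tokens for r in group),
            "mean_latency_ms": sum(r.latency_ms for r in group) / len(group),
            "failure_rate": sum(r.failed for r in group) / len(group),
        }
    return summary


# Made-up sample data, only to show the shape of the summary.
records = [
    RequestRecord("greeting-v1", 12, 30, 200.0),
    RequestRecord("greeting-v1", 10, 0, 400.0, failed=True),
    RequestRecord("farewell-v1", 8, 20, 100.0),
]
stats = summarize(records)
```

Grouping by `template_id` is only one option; the same aggregation could be keyed on tags, threads, or an agent identifier.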

Request API:

The gateway has endpoints that mimic LLM API endpoints, but with additional fields to support the features above.

For example, for the OpenAI API the request body would look like this:

{
  "messages": [
    {
      "role": "user",
      "content": "Hello Tinco!",
      "template": "Hello {{ name }}!",
      "template_id": "bla_template-v1.2321beta5",
      "variables": {
        "name": "Tinco"
      }
    }
  ],
  "agent_id": "greeting-agent-v1.1231beta5",
  "run_id": "abcdef123",
  "request_parent_id": "abcdef123",
  "request_id": "abcdef123"
}

Modifications:

variables + template / template_id: Passing these along allows the gateway to replace the default prompt with alternative prompts. It would be possible to drop the content property when the template and variables are given, but it seems nicer to stay compatible with the OpenAI protocol by only adding fields to it. Having a template_id allows us to easily group requests.
run_id, request_parent_id, request_id: These fields allow us to establish a context for the requests and identify each request uniquely.
agent_id: This field allows us to group requests based on what agent is being run.
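
To make the template override concrete, here is a minimal sketch of how a message's content could be resolved. It assumes that `{{ name }}` placeholders are filled by plain string substitution and that alternative templates live in a mapping keyed by `template_id`. The README does not name a template engine or a storage layout, so `render`, `resolve_content`, and `overrides` are hypothetical.

```python
import re

# Matches placeholders such as {{ name }} (assumed syntax, based on the example request).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, variables):
    """Replace each {{ name }} placeholder with its value from `variables`."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing template variable: {name}")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


def resolve_content(message, overrides):
    """Pick the content to send upstream for one message.

    `overrides` maps template_id -> alternative template text (hypothetical store).
    Messages without a template and variables keep their original content.
    """
    template = message.get("template")
    variables = message.get("variables")
    if template is None or variables is None:
        return message["content"]
    template = overrides.get(message.get("template_id"), template)
    return render(template, variables)


message = {
    "role": "user",
    "content": "Hello Tinco!",
    "template": "Hello {{ name }}!",
    "template_id": "bla_template-v1.2321beta5",
    "variables": {"name": "Tinco"},
}

default_content = resolve_content(message, {})
experiment_content = resolve_content(
    message, {"bla_template-v1.2321beta5": "Hi {{ name }}, welcome back!"}
)
```

With no override registered, the rendered template equals the original content, which is what keeps the request compatible with a client that sends only the standard fields.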

Endpoints

Gateway endpoints start with /v<version>/<provider>, for example /v1/openai/v1/chat/completions. To ensure compatibility, the requests are proxied to the provider as-is, with the additional fields stripped off.
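
The routing and stripping described above could look roughly like the sketch below. It assumes that the first path segment is the gateway version, the second is the provider, and everything after it is the provider's own path, and that the gateway-specific fields are exactly the ones shown in the example request. The function names are hypothetical.

```python
# Gateway-only fields, taken from the example request body.
GATEWAY_BODY_FIELDS = {"agent_id", "run_id", "request_parent_id", "request_id"}
GATEWAY_MESSAGE_FIELDS = {"template", "template_id", "variables"}


def split_gateway_path(path):
    """Split '/v1/openai/v1/chat/completions' into (version, provider, upstream path)."""
    parts = path.split("/", 3)
    if len(parts) != 4 or parts[0] != "" or not parts[1].startswith("v") or not parts[2]:
        raise ValueError(f"not a gateway path: {path!r}")
    _, version, provider, rest = parts
    return version, provider, "/" + rest


def strip_gateway_fields(body):
    """Return a copy of the request body with the gateway-only fields removed."""
    upstream = {k: v for k, v in body.items() if k not in GATEWAY_BODY_FIELDS}
    if "messages" in body:
        upstream["messages"] = [
            {k: v for k, v in message.items() if k not in GATEWAY_MESSAGE_FIELDS}
            for message in body["messages"]
        ]
    return upstream


version, provider, upstream_path = split_gateway_path("/v1/openai/v1/chat/completions")

body = {
    "messages": [
        {
            "role": "user",
            "content": "Hello Tinco!",
            "template": "Hello {{ name }}!",
            "template_id": "bla_template-v1.2321beta5",
            "variables": {"name": "Tinco"},
        }
    ],
    "agent_id": "greeting-agent-v1.1231beta5",
    "run_id": "abcdef123",
    "request_parent_id": "abcdef123",
    "request_id": "abcdef123",
}
upstream_body = strip_gateway_fields(body)
```

Building a new dictionary instead of deleting keys in place leaves the original body available for logging and tracking after the request is forwarded.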
