
General and Deployment Tips

Usage is metered by API calls, where an API call is:

  • Text: 100 words, where a word is a whitespace-separated piece of text

  • Image: A single image

  • Video: One minute of video processed, rounded up (e.g. 23 seconds is rounded up to 1 minute)

The number of API calls used is returned in the response to each POST request, in the field “api_calls_used”.
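As a rough illustration of the metering rules above, the following sketch estimates usage before a request is sent. The function names are hypothetical, and rounding a partial 100-word block up to a whole call is an assumption (the rules above state rounding up only for video). The “api_calls_used” field in the response remains the authoritative figure.

```python
import math

WORDS_PER_CALL = 100     # text: one API call per 100 words
SECONDS_PER_MINUTE = 60  # video: one API call per minute


def text_api_calls(text: str) -> int:
    """Estimate API calls for a text input.

    Words are whitespace-separated pieces of text. Rounding a partial
    100-word block up is an assumption, not a documented rule.
    """
    words = len(text.split())
    return math.ceil(words / WORDS_PER_CALL)


def image_api_calls(num_images: int) -> int:
    """Each image counts as one API call."""
    return num_images


def video_api_calls(duration_seconds: float) -> int:
    """Video is metered per minute processed, rounded up."""
    return math.ceil(duration_seconds / SECONDS_PER_MINUTE)
```

For example, `video_api_calls(23)` gives 1, matching the 23-second case in the list above.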

Whilst the Private AI Docker solution can make use of all available CPU cores, it is optimised to run on a single-core machine.

The Docker solution relies on Uvicorn to service requests. Uvicorn can accept multiple simultaneous requests, but this does not improve processing speed beyond removing network latency, because inputs are still processed sequentially by our code.

For more information on Uvicorn, please visit https://www.uvicorn.org/

The recommended deployment setup is Kubernetes with a cluster of single-core nodes, together with a load balancer to distribute requests. This can be an on-premise deployment or run on a cloud provider such as GCP, AWS or Azure.
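To illustrate what distributing requests across single-core nodes means in practice, here is a minimal round-robin sketch. It is illustrative only: in a real deployment a Kubernetes Service or a cloud load balancer would do this work. The function name, the endpoint URLs and the `send` callable are all hypothetical and not part of the Private AI API.

```python
import itertools
from typing import Callable, Iterable, List, Sequence


def dispatch_round_robin(
    payloads: Iterable[str],
    endpoints: Sequence[str],
    send: Callable[[str, str], object],
) -> List[object]:
    """Send each payload to the next endpoint in turn and collect the results.

    `send(endpoint, payload)` is a caller-supplied function (hypothetical)
    that performs the actual request against one container.
    """
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    targets = itertools.cycle(endpoints)  # endless rotation over the nodes
    return [send(endpoint, payload) for endpoint, payload in zip(targets, payloads)]
```

Because each container processes its inputs sequentially, spreading requests evenly across nodes like this is what allows throughput to grow with the number of nodes.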

The recommended worker type is a single-core Intel Cascade Lake with 8GB RAM; a 2GB RAM option can be delivered upon request. Other CPU types with AVX512 VNNI support will also perform well. In AWS, we recommend the m5zn instance type, and in GCP the N2 instance type.

A benchmark report can be delivered upon request.

Please contact us if you’d like to know how your infrastructure can best utilize the runtime.

The health of the container can be monitored by calling the ‘healthz’ endpoint as follows:

$ curl -X GET localhost:8080/healthz

The “last_auth_call_successful” field indicates whether the most recent call to the Private AI authentication servers was successful. Note that this value defaults to false on startup and remains false until the first deid API call has been made successfully.
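The same check can be scripted for monitoring. The sketch below assumes that the ‘healthz’ response body is a JSON object containing “last_auth_call_successful” (the exact response format is not specified here). The host and port are taken from the curl example, and the function names are hypothetical.

```python
import json
import urllib.request

HEALTH_URL = "http://localhost:8080/healthz"  # host and port from the curl example


def auth_ok(body: str) -> bool:
    """Return True only if the body reports last_auth_call_successful as true.

    Assumes the body is a JSON object; anything else is treated as unhealthy.
    """
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and data.get("last_auth_call_successful") is True


def check_health(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Fetch the health endpoint and evaluate the authentication flag."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return auth_ok(resp.read().decode("utf-8"))
```

Since the flag is false until the first successful deid call, a monitor built on this should not treat a false value immediately after startup as a failure.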

Authentication and External Communications

Private AI’s de-identification suite is designed to run entirely on-device, on-premise, or in a private cloud. The only outside communications are with Private AI’s servers, for authentication and usage reporting. These communications do not contain any customer data. If training data is required, it must be given to Private AI separately. An authentication call is made upon the first API call after the Docker image is started, and again at pre-defined intervals based on your subscription.