Private AI’s privacy solution is primarily designed to be self-hosted by the user via a Docker container, which provides the best possible latency and security: it would be counter-productive to send sensitive data across the Internet to a third-party system for the purpose of improving privacy. Self-hosting also ensures that Private AI never sees or handles customer data, unlike cloud APIs that retain a right to use any data passed through the system for service improvements and ML model development.

It is also possible to use a cloud version of the API, hosted by Private AI at the following endpoint:

Custom integrations that do not rely on Docker can also be delivered upon request.

System Requirements

The container comes in two build flavours: a compact, CPU-only container that runs on any Intel or AMD CPU, and a container with GPU acceleration. The CPU container is highly optimised and suitable for the majority of use cases: it uses hand-coded AVX2/AVX512/AVX512 VNNI instructions in conjunction with neural-network compression techniques to deliver a ~25X speedup over a reference transformer-based system. The GPU container is designed for large-scale deployments making billions of API calls or processing terabytes of data per month.

The minimum & recommended system requirements for the Docker container are as follows:



| Container | Minimum Requirements | Recommended Requirements |
| --- | --- | --- |
| CPU | Any x86 (Intel or AMD) processor with 6GB RAM | Intel Cascade Lake or newer CPU supporting AVX512 VNNI, with 8GB RAM |
| GPU | Any x86 (Intel or AMD) processor with 28GB RAM and an Nvidia GPU with compute capability 6.0 or higher (Pascal or newer) | Any x86 (Intel or AMD) processor with 32GB RAM & an Nvidia Tesla T4 GPU |
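The AVX512 VNNI support mentioned in the recommended requirements can be verified on a Linux host by inspecting the CPU flags, for example:

```shell
# Detect whether the host CPU exposes the AVX512 VNNI instructions that the
# optimised CPU container takes advantage of (reads /proc/cpuinfo on Linux).
if grep -qm1 avx512_vnni /proc/cpuinfo 2>/dev/null; then
  vnni_status="supported"
else
  vnni_status="not detected"
fi
echo "AVX512 VNNI: $vnni_status"
```

If the flag is not present, the container still runs on any x86 CPU, falling back to the AVX2/AVX512 code paths.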


The Private AI container can also run on the new Apple chips, such as the M1. However, performance will be degraded due to the Rosetta 2 emulation of the AVX instructions. Native ARM CPU builds & builds requiring only 1-2GB RAM can also be delivered upon request.

Whilst the Private AI CPU-based container will run on any x86-compatible instance, the cloud instance types below give optimal throughput & latency per dollar:


| Recommended Instance Type | Specifications |
| --- | --- |
| | 2x Intel Ice Lake vCPUs, 16GB RAM |
| M5zn.large | 2 vCPU (Intel Cascade Lake), 8GB RAM |
| N2-Standard-2 | 2 vCPU (Intel Cascade Lake or Ice Lake), 8GB RAM |

Note: In the event lower latency is required, the instance type should be scaled up; e.g., using an M5zn.2xlarge in place of an M5zn.xlarge. While the Private AI Docker solution can make use of all available CPU cores, it delivers the best throughput per dollar on single-CPU-core machines; scaling CPU cores does not result in a linear increase in performance.

Similarly for the GPU-based container, Private AI recommends the following instance types:


| Recommended Instance Type | Specifications |
| --- | --- |
| | 8x AMD EPYC 7V12 (Rome) vCPUs, 56GB RAM, Nvidia Tesla T4 GPU |
| | 8x Intel Cascade Lake vCPUs, 32GB RAM, Nvidia Tesla T4 GPU |
| N1-Standard-8 + Tesla T4 | 8x Intel Skylake vCPUs, 32GB RAM, Nvidia Tesla T4 GPU |

The aforementioned instance types were selected based on extensive benchmarking performed by Private AI. Please contact Private AI for a copy of the benchmark report.

Note: Please run only one container instance per CPU or GPU. Running multiple containers results in vastly reduced performance.


The following prerequisites are required to run the Docker container:

Docker Image Setup Instructions

The Docker image can be pulled via two methods: from Docker Hub or via an encrypted Docker image export.

Docker Hub

  1. To pull the image from Docker Hub, please log in to Private AI’s customer Docker account using the access token provided by Private AI:

    $ docker login -u paiuser -p <Access Token>
  2. Next, pull the appropriate version of the image:

    $ docker pull privateai/deid:<version>

Docker Image Export

  GnuPG and wget are required to download and decrypt the exported image.

  1. Download the Docker image file using the following command:

    $ wget <download link>

    This will download the file to the current working directory. Please email us for the download link.

  2. Decrypt the Docker image file using the following command:

    $ gpg private_ai_<version number>.tar.gpg

    Enter the decryption password when prompted. This will create a .tar file in the current working directory.

  3. Load the tar file to the Docker engine using the following command:

    $ docker load -i private_ai_<version number>.tar

After setting up the image, it will be available under the name deid. (An image pulled from Docker Hub appears as privateai/deid; it can be retagged with docker tag privateai/deid:<version> deid:<version> to match the run commands below.)

Run Instructions

The CPU container can be run with the following command:

$ docker run --rm -p 8080:8080 -it deid:<version number>
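Once the container is up, a quick reachability check can be made from the host. This is a sketch assuming the default 8080 port mapping above; it only reports the HTTP status code from the service root, and the actual de-identification request format is covered in Processing Text.

```shell
# Print the HTTP status code returned by the service root.
# Any response at all confirms the container is reachable on the mapped port.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/
```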

The command to run the GPU container requires an additional ‘--gpus’ flag to select the GPU device to use:

$ docker run --gpus device=<GPU_ID, usually 0> --rm -p 8080:8080 -it deid:<version number>

Note that it is recommended to deploy the container on single GPU machines. For multi-GPU machines, please launch a container instance for each GPU and specify the GPU_ID accordingly.
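For example, the per-GPU deployment described above could be sketched as follows on a two-GPU host; the port choices are illustrative, and each container listens on 8080 internally.

```shell
# One container per GPU: each instance is pinned to a single device via
# --gpus and bound to its own host port.
docker run -d --rm --gpus device=0 -p 8080:8080 deid:<version number>
docker run -d --rm --gpus device=1 -p 8081:8080 deid:<version number>
```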

For cloud deployment, such as on Azure or AWS, the Private AI devops team can provide best practice guides on installation.

Environment Variables

The Private AI container supports a number of environment variables. The environment variables can be set in the Docker run command as follows:

$ docker run --rm -e <ENVIRONMENT_VARIABLE>=<VALUE> -p 8080:8080 -it deid:<version number>
Supported Environment Variables

| Variable Name | Description |
| --- | --- |
| | Controls the verbosity of the container logging output. Allowed values are ‘info’, ‘warning’ or ‘error’. Default is ‘info’. |
| | Allows the redaction marker format to be set globally, instead of passing it into each POST request. Please see Processing Text. |
| ALLOW_LIST | Allows the allow list to be set globally, instead of passing it into each POST request, e.g. ALLOW_LIST='["John","Grace"]'. Please see Processing Text. |

  • To change the port used by the container, please set the host port as per the command below:

    $ docker run --rm -p <host port>:8080 -it deid:<version number>
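Putting the above together, a container could, for example, be started with a global allow list and a non-default host port; the values here are illustrative.

```shell
# Set ALLOW_LIST globally via -e and expose the API on host port 9000
# instead of the default 8080.
docker run --rm -e ALLOW_LIST='["John","Grace"]' -p 9000:8080 -it deid:<version number>
```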

Authentication and External Communications

Private AI’s self-hosted solution is designed to run entirely on-device, on-premise, or in private cloud. The only outside communications made are for authentication & usage reporting with Private AI’s servers. These communications do not contain any customer data – if training data is required, this must be given to Private AI separately. An authentication call is made upon the first API call after the Docker image is started, and again at pre-defined intervals based on your subscription.

An “airgapped” version of the container that doesn’t require external communication can be delivered upon request.