Blog

Sometimes we take a break from building cutting-edge AI redaction models to stretch our academic muscles and write about privacy and machine learning. Check back here regularly for our musings.

The latest data breaches are a regular topic in the news. Raising awareness about the prevalence and severity of the issue, as well as how

Data protection is a critical concern in today’s digital world. As more and more data are collected and processed, the need for effective data protection

On February 8, 2023, the International Organization for Standardization adopted privacy by design in ISO 31700:2023 as a voluntary standard for organizations to implement into

What does privacy mean to us at Private AI? As a tech company whose purpose it is to enhance privacy, we are acutely aware of

Large language models (LLMs) are a type of machine learning model that are trained on vast amounts of text data to generate human-like text. They

In today’s data-driven world, businesses are constantly collecting information from their customers in order to provide a better product or service, to understand and alleviate

The Canadian healthcare and health tech space is robust and growing at warp speed. Globally health tech, especially in the AI field, is evolving much faster

What is the Metaverse? Ever since the word “Metaverse” hit our vocabularies in 2021, there has been ample confusion with many publications trying their hands

Differential privacy is a hot topic given the many conflicting opinions on its effectiveness. For some background, we previously wrote a comprehensive post on the

Over the years, large pre-trained language models like BERT and RoBERTa have led to significant improvements in natural language understanding (NLU) tasks. However, these pre-trained

Automated Container Resource Checks: Does your container have the required resources?

At Private AI, we are building a privacy suite centered around personally identifiable information (PII) detection and remediation in unstructured data, such as text. Users interact with our

In the previous episode of Private AI’s ML Speaker Series, Patricia Thaine (CEO of Private AI) sat down with Dr. Aida Nematzadeh (Staff Research Scientist

Discussing Responsible AI & International Governance

In the previous episode of Private AI’s ML Speaker Series, Patricia Thaine (CEO of Private AI) sat down with Dr. Sarah Shoker (Research Scientist at

In today’s world, large models with billions of parameters trained on terabytes of datasets have become the norm as language models are the foundations of

There are several resources available on the internet on how to scale your Kubernetes pods based on CPU, but when it comes to Kubernetes pods

In the previous episode of Private AI’s ML Speaker Series, Patricia Thaine (CEO of Private AI) sat down with Arvid Frydenlund (PhD candidate at the University

Personally Identifiable Information (PII) is any data that can be used to identify an individual. This can be done using direct identifiers (name, social security

Previously on Private AI’s speaker series, CEO Patricia Thaine sat down with Franziska Boenisch to discuss her latest paper, ‘When the Curious Abandon Honesty: Federated Learning Is Not Private’. Franziska

In the latest episode of Private AI’s ML Speaker Series, Patricia Thaine (CEO of Private AI) sits down to chat about MLOps and Machine Learning Deployment

9 Companies to Help You Get Your Privacy $hit Together

With the ever-growing number of global regulations, legislations, and amendments, it can be overwhelming to know where to start (or continue) your data privacy journey.

Transformer networks have taken the NLP world by storm, but the sheer size of these networks presents new challenges for deployment, such as how to provide acceptable latency and unit economics.

Parameter Prediction without Training and SGD

Previously on Private AI’s Speaker Series, our CEO Patricia Thaine sat down with data privacy law expert Carole Piovesan to talk about the legal ramifications

5 Facts You Probably Didn’t Know About Data Privacy

Data privacy, in simplest terms, is the right to control how your personal information is collected and used. Although this may seem obvious, it hasn’t

Data Protection Regulations to Watch Out for in 2022

Carole Piovesan discusses legal responsibilities, what companies are getting wrong with data governance, and more.

GDPR compliance, privacy and engineering team collaboration, and common mistakes companies make with their data.

Discussing developer responsibility, Bill C-11, positive consent, and the importance of Privacy by Design

“When is anonymization useful?” is a tricky question, because the answer is highly data-type- and task-dependent.

On the misleading ways journalists and industry use the term "anonymization."

Understanding key tech for data protection regulation compliance

There’s a saying, ‘the last 20% of the work takes 80% of the time’, and nowhere is that more true than in AI systems.

Regexes are highly effective in the perfect world of computer data, but unfortunately the real world is much more complicated.

There exists a vibrant ecosystem of specialized security tools. The sad truth is that it is almost impossible to reach 100% invulnerability. What can we do to get closer?

In the past three years there has been a massive wake-up in customer awareness about privacy. Many customers are now refactoring how they buy, taking their business elsewhere if they don’t trust a company’s data practices.

Privacy Enhancing Technologies Decision Tree: for developers, managers, and founders looking to integrate privacy into their software pipelines and products.

AI is rapidly being deployed around the world with few to follow. Along with the complexity of creating the technology, there remain many unanswered legal questions.

The new TensorFlow Lite XNNPACK delegate enables best-in-class performance on x86 and ARM CPUs: over 10x faster than the default TensorFlow Lite backend in some cases.

Some techniques to improve DALI resource usage & create a completely CPU-based pipeline.

We introduce the four pillars required to achieve perfectly privacy-preserving AI and discuss various technologies that can help address each of the pillars.

We discuss a practical application of homomorphic encryption to privacy-preserving signal processing, particularly focusing on the Fourier transform.

Terms and Conditions of Use Effective March 10, 2020 These Terms and Conditions of Use (“Terms”) apply to and govern: your use of Private AI’s

We cover the basics of homomorphic encryption, followed by a brief overview of open source HE libraries and a tutorial on how to use one of those libraries (namely, PALISADE).

A number of people ask us why we should bother creating NLP tools that preserve privacy. Apparently not everyone spends hours thinking about data breaches and privacy infringements.

A very brief overview of privacy-preserving technologies follows for anyone who’s interested in starting out in this area. I cover symmetric encryption, asymmetric encryption, homomorphic encryption, differential privacy, and secure multi-party computation.

Language Packs

Expand the categories below to see which languages are included within each language pack.
Note: English capabilities are automatically included within the Enterprise pricing tier. 

French
Spanish
Portuguese

Arabic
Hebrew
Persian (Farsi)
Swahili

French
German
Italian
Portuguese
Russian
Spanish
Ukrainian
Belarusian
Bulgarian
Catalan
Croatian
Czech
Danish
Dutch
Estonian
Finnish
Greek
Hungarian
Icelandic
Latvian
Lithuanian
Luxembourgish
Polish
Romanian
Slovak
Slovenian
Swedish
Turkish

Hindi
Korean
Tagalog
Bengali
Burmese
Indonesian
Khmer
Japanese
Malay
Moldovan
Norwegian (Bokmål)
Punjabi
Tamil
Thai
Vietnamese
Mandarin (simplified)

Arabic
Belarusian
Bengali
Bulgarian
Burmese
Catalan
Croatian
Czech
Danish
Dutch
Estonian
Finnish
French
German
Greek
Hebrew
Hindi
Hungarian
Icelandic
Indonesian
Italian
Japanese
Khmer
Korean
Latvian
Lithuanian
Luxembourgish
Malay
Mandarin (simplified)
Moldovan
Norwegian (Bokmål)
Persian (Farsi)
Polish
Portuguese
Punjabi
Romanian
Russian
Slovak
Slovenian
Spanish
Swahili
Swedish
Tagalog
Tamil
Thai
Turkish
Ukrainian
Vietnamese

Recall

Tested on a dataset consisting of messy conversational data containing sensitive health information. Download our whitepaper for more details, as well as our performance in terms of accuracy and F1 score, or contact us for a copy of the evaluation code.

99.5%+ Accuracy

The number quoted reflects the number of PII words missed as a fraction of the total number of words. It was computed on an internal test dataset of 268 thousand words, comprising data from over 50 different sources, including web scrapes, emails, and ASR transcripts.

Please contact us for a copy of the code used to compute these metrics, try it yourself here, or download our whitepaper.
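
As a rough illustration of the word-level metric described above, the sketch below assumes the accuracy figure is one minus the fraction of PII words missed over all words. That reading is an assumption, the function name `word_level_accuracy` is hypothetical, and the counts are made up for illustration; this is not Private AI's evaluation code.

```python
def word_level_accuracy(missed_pii_words: int, total_words: int) -> float:
    """Return 1 - (missed PII words / total words).

    Assumed reading of the metric; not the official evaluation code.
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    if not 0 <= missed_pii_words <= total_words:
        raise ValueError("missed_pii_words must be between 0 and total_words")
    return 1.0 - missed_pii_words / total_words


# Hypothetical counts for illustration only; not reported results.
accuracy = word_level_accuracy(missed_pii_words=1_000, total_words=268_000)
print(f"{accuracy:.2%}")
```

Under this reading, a dataset of 268 thousand words would need fewer than 0.5% of its words to be missed PII for the figure to exceed 99.5%.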