Every month, Osano’s technology serves up 2.5 billion cookie consents for more than 750,000 businesses around the world. That number is only likely to grow, as new European and American laws that require companies to ask site visitors before collecting their data — and tell them how they use it — go into effect.
CEO Arlo Gilbert said scaling poses a challenge for the Austin-based data privacy software firm: Osano can’t afford to have its software go down or be slow to load. If it does, the company’s customers are literally breaking the law.
“We had to be prepared from the very beginning to scale and support high volumes of everything,” Gilbert said. “If we’re down, then your cookie pop-ups don’t show up, and your website’s out of compliance.”
Built In spoke with Gilbert about how Osano built its technology for scale.
Edge computing helps the company keep speeds up around the world. Osano stores copies of its application on edge servers located across the world, which allows it to serve its product to customers in, say, Japan at the same speed as to those in its hometown of Austin.
Supplementing AI with human expertise. Osano’s attorneys have reviewed approximately 9,000 companies’ data privacy policies, and the firm is now using that dataset to train a natural language processing system to rate how companies use consumer data. But its attorneys are still checking the AI’s work.
Cataloging the world's privacy policies
Founded in late 2018, and launched commercially in October, Osano is a software-as-a-service platform that gives businesses insight into whether they’re complying with privacy laws, and alerts them when vendor privacy policies change. The Austin firm also rates companies’ data privacy practices and keeps tabs on open lawsuits against tech companies, so it can automatically alert companies whenever they, or a tech vendor that powers their website, are on the wrong side of the law.
In addition to helping companies analyze the data standards of their vendors, Osano also maintains a popular open-source tool companies use to make pop-up consents for cookies, which are used to track user visits and activity, record login information and more.
Gilbert said 50,000 new businesses are adding the code to their sites every month, thanks in part to the EU’s General Data Protection Regulation (GDPR).
“It’s just accidentally become the main open-source tool for cookie consents,” Gilbert said.
Throw in an additional, soon-to-be-in-effect California Consumer Privacy Act that is essentially the U.S. version of the GDPR, and Osano execs knew scale would soon be an even bigger concern.
For these reasons, engineers at Osano are currently in a “rapid development cycle,” Gilbert said, building out features like a data subject access request portal, which will allow companies to pull up information immediately about what data they have on file for individual customers. This will help companies comply with privacy laws in Europe and the U.S., Gilbert said.
Building for scale
Most of Osano’s systems run on Amazon Web Services technology, which Gilbert said gives the company a global footprint. The firm uses Amazon’s serverless computing services, which let it run applications and services without managing servers because Amazon automatically scales capacity to Osano’s needs. It also runs on Amazon’s Aurora database, which likewise lets Osano scale its storage on demand.
Osano also utilizes edge servers located around the world, where copies of its application reside.
“If you log in from Japan, our application will be just as fast as if you log in from Austin, because it’s always being served by a computer that’s close to the end user,” Gilbert said.
Integrating humans into an AI-first workflow
Osano developers have also built a scanning tool, created with a combination of in-house software and open-source tools and written in the Erlang programming language, that scans companies’ websites on a nightly basis to identify their vendors and analyze those vendors’ compliance documents.
The scanning tool operates as a headless Chrome browser, meaning a browser with no graphical interface that can run on a server without human input. It reads the HTML and PDF privacy documents it finds and converts them into the Markdown formatting language, removing the images, navigation and other decorative elements of each document. Each night, Osano then compares the new Markdown version of a vendor’s privacy document to the version it has stored.
Sometimes, changes can be nothing more than a word or two. Other times, the updates are significant.
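The nightly comparison described above can be sketched in a few lines of Python. This is only an illustration, not Osano's code (the article says its scanner is written in Erlang); the function name policy_changes and the sample policy text are hypothetical, and the sketch assumes both versions have already been converted to Markdown.

```python
import difflib


def policy_changes(stored_md: str, new_md: str) -> list[tuple[str, list[str], list[str]]]:
    """Compare two Markdown versions of a policy, line by line.

    Returns one (kind, old_lines, new_lines) tuple per changed region,
    where kind is "replace", "delete" or "insert".
    """
    old = stored_md.splitlines()
    new = new_md.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged text is not interesting to a reviewer
        changes.append((tag, old[i1:i2], new[j1:j2]))
    return changes


# Hypothetical stored and freshly scanned versions of a vendor policy.
stored = "# Privacy\nWe share data with partners.\nContact us."
scanned = "# Privacy\nWe sell data to partners.\nContact us."

for kind, before, after in policy_changes(stored, scanned):
    print(kind, before, after)
```

An empty result means the policy is unchanged. A short result may be the "word or two" case, while a long one could flag a significant update for closer review.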
Storing user consent on a cryptographic ledger
Osano’s pop-up asks web visitors if they’re comfortable with cookies. Those responses are recorded in Osano’s cryptographically verifiable ledger.
Osano uses Quantum Ledger Database, another Amazon technology, to record user responses. Gilbert said Osano chose Quantum because it loads records more quickly than a decentralized blockchain product, which requires many participants to come to a consensus about each record before it’s approved.
Quantum has a single, “central trusted authority” that approves each addition. Using the ledger helps Osano securely record users’ consent, and companies can use the record as proof against consumer allegations of collecting their data without permission.
“By using a third party, and by recording that data on a [ledger], it is cryptographically verifiable when a record was created and that that record has not been modified in any way,” Gilbert said.
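To make "cryptographically verifiable" more concrete, here is a minimal sketch of a hash-chained, append-only log in Python. It does not use Quantum Ledger Database's API and is not Osano's implementation; the record fields (visitor, consent, recorded_at) and the function names are assumptions made for illustration. Each entry's hash covers both its own payload and the previous entry's hash, so altering any earlier record breaks every later link.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    # Serialize deterministically so the same record always hashes the same way.
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(ledger: list, payload: dict) -> dict:
    """Append a consent record, chaining it to the previous entry's hash."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    entry = {"payload": payload, "prev": prev_hash, "hash": _digest(prev_hash, payload)}
    ledger.append(entry)
    return entry


def verify(ledger: list) -> bool:
    """Recompute every hash in order; False if any record was altered or reordered."""
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical consent records.
ledger = []
append_record(ledger, {"visitor": "abc123", "consent": True, "recorded_at": "2019-11-01T12:00:00Z"})
append_record(ledger, {"visitor": "def456", "consent": False, "recorded_at": "2019-11-01T12:05:00Z"})
print(verify(ledger))
```

In a centralized ledger of this kind, a single trusted party appends entries, which is why no multi-party consensus step is needed before a record is accepted.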
Gilbert said Osano was an early partner for Quantum, which has helped guide its product roadmap.
Where law and tech intersect
One of Osano’s first moves as a new company was to pay an “undisclosed sum” for the open-source code that powers its cookie consents.
After acquiring the code, Gilbert said Osano spent its first six months building out the dataset it uses to identify how well a company complies with privacy laws. The dataset consists of answers to 163 legal and technical questions. Before launch, a team of 24 attorneys used the dataset to review approximately 9,000 companies’ privacy policies and rate how their use of consumer data stacks up.
Developers are in the process of using Osano’s ratings database to build a natural language processing system. Gilbert said Osano aims to use this system by June 2020 to review companies’ privacy policies, with attorneys still reviewing its findings for quality assurance.
“We have the dataset now that we can use to train, but for the most part it is still a manual process because we want to ensure that our customers have faith in the data,” he said.
In addition to creating this dataset, Osano also connects directly to PACER, a federal lawsuit database, to keep tabs on legal complaints against companies. It files that data into its system too. Through an application programming interface (API), Osano also connects to seven state court websites and scrapes complaint data from California, Texas, Arizona, Florida, Maryland, Delaware and New York.
Gilbert said Osano plans to integrate with an additional state court every month, and eventually expand to scrape data from state courts across the nation.
“The court systems are stuck in the ’90s. Integrating with them is not as simple as making a couple API calls. It’s a pretty brutal process,” he said. “It’s going to take time.”