#StartupsEverywhere: Cambridge, Mass.

#StartupsEverywhere profile: Anne Kim, Co-Founder and CEO, Secure AI Labs (SAIL)

This profile is part of #StartupsEverywhere, an ongoing series highlighting startup leaders in ecosystems across the country. This interview has been edited for length, content, and clarity.


Innovating Systems for Health Research, Improving Data Usage with Enhanced Security

Anne Kim, Co-Founder and CEO of Secure AI Labs (SAIL), is a researcher at heart. While working through her graduate studies, Kim found that obtaining patient data from hospitals was a significant obstacle to her research. This led her to the concept of SAIL, a data management platform built around the need for accessible yet secure medical data. We sat down with Kim to hear what brought her to develop the platform, her thoughts on how startups and policymakers can be more forward-thinking about privacy policy, and how she thinks the government can give startups a boost as a partner and customer.

Could you tell us about your background and how it led you to SAIL?

I'm a big nerd. Starting from a very early age, in middle school, I discovered I love math and all things data. As I got into high school and beyond, this developed into not just a love for math, but a love of research: the process of developing science and technology and the rigor it takes to conduct any kind of disease research. So I followed this passion, spending more time studying biology and oncology, but that wasn’t enough. I wondered how we could do research faster and bigger. And that's when I was introduced to computational biology. Later in high school I had the opportunity to work on a supercomputer project, using Desmond. This allowed me to do protein folding simulations, and understand how, for example, a protein responds to high heat energy or how the presence or absence of chaperones in protein folding can lead to cancer. In grad school I saw all these other ways health data can be analyzed in a high-throughput manner with a computer: genome-wide association studies, molecular dynamics, cellular differentiation, patient records, and entire clinical trials.

This was all really exciting, and I wanted to pursue a PhD. But I was learning that it takes so long to access the data needed to conduct this sort of research. You need to apply to get data from hospitals. That can take many months or longer: these are sensitive records, lawyers have to carefully review data use agreements, and you have to ensure data compliance and responsible usage. Then the actual logistics of privacy and security are pretty antiquated. For example, hospitals would mail me hard drives with patient data. It seemed like a pretty insecure way to transport something that was clearly so valuable and that so many people were invested in. So I decided to put my studies on hold and focus on the logistics of the security process because that's the linchpin, and a huge bottleneck, for this whole workflow. I saw the need for a secure way of making health data accessible.

Tell us about SAIL. What is the work that you are doing?

In grad school, I was working in the MIT Media Lab on privacy-preserving techniques to work with clinical trial data. This involved using Open Algorithms (OPAL) and federated learning to access data. At SAIL, we are combining this federated structure with something similar to digital rights management.  

To put this in context, currently hospitals share data for all kinds of research—it is necessary for clinical trials and drug discovery—but they're doing it in a really slow, messy way. Lawyers draw up long legal contracts that dictate what data will be shared, how it can be used, and who has to pay if there’s a breach. But there's no way to actually check what is happening during usage. What we have done at SAIL is take those contracts, author a digital version of the data use agreement, and attach it directly to the data. You can think of it sort of as if we have watermarked the data with the use agreement, so that any time you're accessing the data, it has to be compliant with the original use agreement. This makes it much simpler for the outside researchers to work with data.
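As an editorial illustration of the idea of attaching a use agreement to the data, here is a minimal Python sketch. It is not SAIL's implementation: the `UseAgreement` and `GovernedDataset` classes, their fields, and the user and purpose names are all hypothetical, and a real data use agreement would cover far more than three fields. The sketch only shows the general pattern of checking every access against the agreement and logging the attempt.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UseAgreement:
    # Hypothetical fields; a real data use agreement covers far more.
    allowed_users: frozenset
    allowed_purposes: frozenset
    expires: date


class GovernedDataset:
    """Records that can only be reached through an agreement check."""

    def __init__(self, records, agreement):
        self._records = list(records)
        self._agreement = agreement
        self.audit_log = []  # every attempt is recorded, allowed or not

    def query(self, user, purpose, today, analysis):
        a = self._agreement
        allowed = (
            user in a.allowed_users
            and purpose in a.allowed_purposes
            and today <= a.expires
        )
        self.audit_log.append((user, purpose, allowed))
        if not allowed:
            raise PermissionError(f"{user!r} may not use this data for {purpose!r}")
        # The analysis runs here; the caller receives only what it returns.
        return analysis(self._records)


# Made-up agreement and records, for illustration only.
agreement = UseAgreement(
    allowed_users=frozenset({"researcher_1"}),
    allowed_purposes=frozenset({"side_effect_study"}),
    expires=date(2030, 1, 1),
)
dataset = GovernedDataset([{"age": 52}, {"age": 61}], agreement)

mean_age = dataset.query(
    "researcher_1",
    "side_effect_study",
    date(2025, 6, 1),
    lambda rows: sum(r["age"] for r in rows) / len(rows),
)
```

In this sketch a request for a purpose outside the agreement raises `PermissionError` and is still written to `audit_log`, which is the kind of usage check the paragraph above says paper contracts cannot provide. A production system would also have to restrict what an analysis is allowed to return; this sketch does not attempt that.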

Our platform pairs federated learning and digital rights management concepts to allow researchers to use data while hospitals keep data safe. The idea with federated learning is that instead of moving data around, such as on a removable hard drive as is typical for multi-site collaborations, you keep the data where it is and you instead distribute the analysis. And the idea of digital rights management is similar to the basis for platforms like Spotify and Netflix, which enable you to access movies and music through their services, but you never actually have the MP3 file or the movie on your computer. 

To explain, using an example from the research context—say I want to know the correlation of certain side effects for Asian women over the age of 40 in a breast cancer trial. With SAIL, all the hospitals involved in the study can run the question locally against their patient data. They may have 20 patients that are Asian women over 40 and only 10 of them are being treated with this drug. The system can then pull a list of their side effects. And in the end the system just returns that list of side effects. This enables researchers using the data to access it, and understand typical side effects, but to do so in a compliant, privacy-preserving way.
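The side-effect example above can be sketched in a few lines of Python. This is a simplified illustration of distributing a query to the data, not SAIL's actual system and not federated model training: the `Patient` layout, the function names, the hospital names, and the records are all made up. As an assumption of this sketch, each site returns side-effect counts rather than a raw list.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical record layout, for illustration only.
    ethnicity: str
    sex: str
    age: int
    on_trial_drug: bool
    side_effects: tuple


def run_locally(patients):
    """Runs inside one hospital. Only aggregate counts leave the site."""
    counts = Counter()
    for p in patients:
        in_cohort = p.ethnicity == "Asian" and p.sex == "F" and p.age > 40
        if in_cohort and p.on_trial_drug:
            counts.update(p.side_effects)
    return counts


def federated_side_effects(sites):
    """Send the same question to every site and merge the answers."""
    total = Counter()
    for patients in sites.values():
        total += run_locally(patients)
    return total


# Made-up data for two hypothetical hospitals.
sites = {
    "hospital_a": [
        Patient("Asian", "F", 52, True, ("nausea", "fatigue")),
        Patient("Asian", "F", 45, False, ("headache",)),  # not on the drug
    ],
    "hospital_b": [
        Patient("Asian", "F", 61, True, ("nausea",)),
        Patient("White", "M", 58, True, ("rash",)),  # outside the cohort
    ],
}

side_effects = federated_side_effects(sites)
print(side_effects.most_common())
```

The patient records never leave `run_locally`; the researcher only sees the merged counts. With very small cohorts even counts can identify individuals, so a real deployment would likely need further safeguards that this sketch leaves out.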

Data privacy is a hot topic for policymakers and it is even more of a concern with sensitive data like health data. How do you think policymakers should think about privacy, especially when it comes to new tools and services?

Right now in the U.S. we have the Genetic Information Nondiscrimination Act (GINA) and the Health Insurance Portability and Accountability Act (HIPAA), and in California we also have the California Consumer Privacy Act. And in Europe, there is the General Data Protection Regulation (GDPR). This results in a pretty complex patchwork of policies, and there are ongoing discussions as policies continue to grow and change. And I think it is important to recognize that every single policy, regardless of how it addresses privacy, adds more rules that companies, no matter their size, have to navigate. One of the benefits for our company is that the focus of our technology is maintaining data privacy and protection. But there are all kinds of organizations that need to make sure, as privacy policies change or develop, that they are future-proofing their data. By future-proofing, I mean that anytime a new policy comes along, it would not be a pain to redact data, and you would not be forced to complete some burdensome audit to comply with the new policy.

The challenge with these policies and the idea of future-proofing is that the easiest solution to privacy and security is just locking data up and never sharing it. This is a tactic we have seen a lot of hospitals take, especially smaller, non-academic medical centers. But that would be a big shame for research, because you would get uneven patient populations. Researchers and innovators need to have a lot of diversity in the datasets they pull from; it is a critical basis for unbiased AI analysis. So the question becomes how do you balance high utility of the data and privacy? That's what we've been doing at SAIL through the development of our systems, but it is also a question that applies to policymakers thinking about the ways we regulate data privacy.

One other particularly helpful thing in this context is for policymakers to include the voices of technologists in conversations around policy developments. There are programs nowadays, like TechCongress, that have established fellowships that connect policymakers with those who better understand the implications of certain technologies, as well as where the development of technology will lead. I think it would be great for every member of Congress to have a technical expert aiding them. This would allow for a more informed policy debate when it comes to the topic of new technologies and privacy. 

Is there anything that policymakers should be doing to better support startups like SAIL? Are there any tech startup issues and concerns that you believe should receive more attention from local, state, and federal policymakers?

There are a number of government grants out there, which are very helpful to startups in need of resources. But I think one thing that is mutually beneficial and not often thought of is bringing startups in to help the government solve its own problems. It would be great for the government to have startups come in and help it build processes that work better. This would also be really helpful for startups: they would have a testimonial about a successful project with a trusted institution. It would also give startups the government as a first customer. Getting that first customer is really hard, and the government could open up RFPs for a lot of projects, whether it be putting systems in place for improving access to government data, making existing systems more secure or streamlined, or even doing basic website updates to make information more accessible. Tech startups could help modernize or refurbish a range of government systems.

There are also some specific grants to support female founders or people of color who are founders. This is a great concept, but my impression has been that outside of that funding, grant recipients are left out on a limb on their own. Founders often really need more business success and mentorship. This goes back to the adage about giving a man a fish versus teaching a man to fish. Starting a business and getting the opportunity to prove your technology works can be harder for underrepresented founders. And it may be harder for underrepresented founders to find mentors. So the government could provide resources or incentivize mentorship to help startups learn to close customers and grow. I think I've benefited most from actual mentors that have taken the time to teach me how to develop my business.

What are your goals for SAIL going forward?

One implication of what we are doing at SAIL is that it should be possible to connect all the cancer data sets out there in a privacy-preserving way. This would support really strong, innovative ways for researchers to discover more about cancer biology. At SAIL, we want to connect all these data sets that already exist. This is the future of healthcare. Wouldn’t it be awesome if you could get all the available cancer data and you did not need to worry about privacy?


All of the information in this profile was accurate at the date and time of publication.

Engine works to ensure that policymakers look for insight from the startup ecosystem when they are considering programs and legislation that affect entrepreneurs. Together, our voice is louder and more effective. Many of our lawmakers do not have first-hand experience with the country's thriving startup ecosystem, so it’s our job to amplify that perspective. To nominate a person, company, or organization to be featured in our #StartupsEverywhere series, email ian@engine.is.