Transcript
Chris Barnett (00:06):
Thanks for joining today. I’m Chris Barnett, and I’m head of Marketing and Partnerships here at TripleBlind. We appreciate you signing up for this webinar and making time with us today.
Chris Barnett (00:23):
We are, of course, going to talk about privacy enhancing technologies today, and who should care. Presumably, some of you folks in the audience will be in that group. Our esteemed panelists today include my colleagues Mitchell Roberts, who is our Director of Product Marketing, and Chad Lagomarsino, who is one of our Partnership Engineers. He’ll bring it for us on the technical side.
Chris Barnett (00:45):
Let’s go ahead and get into the material. Please feel free to put questions in the Q and A box at the bottom. The button for that should be at the bottom of your screen. If you’d like to post use cases or business problems, do that as well. All right, here we go.
Chris Barnett (01:06):
The first thing that I want to point out is that when it comes to the fields of artificial intelligence and analytics, a lot of the problems in those fields, what we hear from folks that we talk to are data problems. Access, prep, bias, those are the kinds of problems that are at the heart of wanting to do more and better and bigger with AI. So Chad, in your work with our partners, you go through these kinds of topics all the time. Maybe you can give some color commentary on this for our group.
Chad Lagomarsino (01:46):
Sure. Hey, everyone. Chad Lagomarsino here. I do a lot of work with healthcare insurance payers, as well as clinicians, providers, et cetera. I see a lot of these issues occurring in the healthcare space. I would say that data access, which we translate to data interoperability in healthcare, is the number one bottleneck that is stopping healthcare insurance providers, hospital systems, and other regulatory agencies from being able to provide good analytics.
Chad Lagomarsino (02:19):
So a little bit of background as to what this kind of problem can look like. Typically, if you’re working in something like a hospital system, you’re going to have multiple EHR records, depending on, could be internal. Each wing of a department could be using a different system. Clinicians might be using something like Epic for their EHR records or electronic health records.
Chad Lagomarsino (02:40):
Whereas you might have another wing that’s behavioral health focused, and they might be using an entirely different tool to collect behavioral health questionnaires and information on all of that. Where this comes together is that a lot of private and public healthcare hospital aggregators are looking to work with something called whole person care.
Chad Lagomarsino (03:04):
So there’s been an initiative at, for example, the California state level. I know Nevada as well as Utah are also taking similar initiatives where they’re gathering information from many different sources. It’s not just the clinical data from a one time visit with the provider.
Chad Lagomarsino (03:20):
It’s also, they’re gathering information from places such as social benefits packages, law enforcement, behavioral health is a huge one, housing agencies, et cetera. And they’re aggregating that information together.
Chad Lagomarsino (03:35):
On that level, we have a huge problem. For example, I worked with the California Department of Health Care Services, and we had a project where we had 21 different EHR systems that we needed to link together. The plumbing for each of these was different.
Chad Lagomarsino (03:52):
We had to build a system that could handle Medicare, Medicaid, and Tricare claims, and then aggregate all that information in a third party Cloud environment. And needless to say, it took a lot of labor, which was good for the vendors, bad for taxpayers. So this is one of the things that is really causing a huge bottleneck within the data access space for healthcare agencies.
Chad Lagomarsino (04:17):
Maybe you’re not a giant conglomerate like the California government, or a major hospital insurance system. But you might just have a local business or a local hospital that even within there, you might have different Cloud environments set up, you might have different EHR systems set up, data’s coming in from multiple sources –– and making sure that all that stuff is sitting in one place is really challenging.
Chad Lagomarsino (04:42):
So the common paradigm that people take for this, is they try and create a place where they can take all those different sources of data and put all the data in that one place. The question we want to answer today is, “What if you did not have to move the data from where it is?”
Chad Lagomarsino (05:01):
Moving on to the next point here with data prep, data movement is going to kill projects. Data movement will be the lion’s share of the budget for any kind of initiative for data. In particular, there are going to be heavy compute costs, whether you’re doing on-prem or Cloud computing. It’s going to be very expensive to move that data.
Chad Lagomarsino (05:21):
And also, data prep is a largely manual resource-intensive task that requires skilled labor. You need somebody who has knowledge of data, how to wrangle it, how to scale it correctly. So again, working with data can be really challenging because it all has different shapes and structures.
Chad Lagomarsino (05:39):
It’s in different spots, silos are different. As soon as you start adding external entities, other businesses, it becomes a really complicated endeavor, which kills the majority of projects before they get a chance to get off the ground. And lastly, data bias. This is definitely a huge issue for analytics companies.
Chad Lagomarsino (06:00):
I think it’s an issue that is really important that we get on top of early. This is something that you’ll see when you’re working with various data sets that have not been vetted properly. So if you’re just throwing whatever data you have at a problem, you’ll end up with inherent biases that could cause problems down the line and will require fixing down the line.
Chad Lagomarsino (06:23):
I would say that those issues, while important, are not quite as important as the data accessibility issue and the data prep issues. Because again, that’s about 90% of the pipeline. And to be fair, with regards to both healthcare and the financial industry, that’s where the majority of the vendors and players are in that space.
Chad Lagomarsino (06:48):
So you’ve heard a little bit about some of the experiences that I’ve had working with large healthcare and insurance companies. We wanted to get a little more information about some of your use cases, get a better idea of how we can better address your questions with regards to data access, data prep, data bias, and other issues that you’re having with your data access and modeling pipeline.
Chris Barnett (07:13):
Great. Yeah. So please go to the Q and A button at the bottom there and post your use case, and we’ll respond to that after we get through a couple more slides. Thank you, Chad.
Chad Lagomarsino (07:23):
Mm-hmm (affirmative).
Chris Barnett (07:24):
In terms of use cases, one of the things that we wanted to touch on is Gartner’s research on the various forms of privacy enhancing technology and what are the compelling use cases according to Gartner. So Mitchell, why don’t you go through this for the group?
Mitchell Roberts (07:44):
Yeah. Absolutely. So once again, I’m Mitchell Roberts. I’m the Director of Product Marketing here at TripleBlind. In my role, a lot of what I do is market research and deciding or figuring out what the use cases are for not just our technology, but other privacy technologies and kind of understanding the market as a whole.
Mitchell Roberts (08:06):
What are the opportunities? What are the problems that are most pressing? The privacy enhancing technology category, as we say, is not extremely old. It’s still fairly young. But that doesn’t mean that all these technologies are very new. Some of them have been around for a little while.
Mitchell Roberts (08:28):
What’s new is the practical application of these technologies to business problems –– so what’s really encouraging is when somebody like Gartner takes the time to research into this category and identifies some really key, important use cases where we can apply these technologies.
Mitchell Roberts (08:50):
If we look at what Gartner says are the three key use cases for PETs, number one is AI model training and sharing models with third parties. So Chad just talked about data access, prep, and bias. In this first use case, this is both a data access and a data bias problem.
Mitchell Roberts (09:12):
If I’m training an AI model, I want to get access to the best possible data out there. I may not already have that. I may need to look to third parties for that information and for that data. I also want my data to be unbiased, because I don’t want the model to be biased.
Mitchell Roberts (09:32):
So I need to source from multiple locations. Those locations may be subject to different data laws, residency restrictions, and competitive pressures. And business reasons may keep me from getting access to that data. So, privacy-enhancing technologies are addressing this problem. The second one, usage of public Cloud platforms amid data residency restrictions.
Mitchell Roberts (10:02):
This one is really important because as we think about pushing more data and more information to the Cloud, lots of these resources are shared and we’re trying to figure out how can we create scenarios in which we can use sensitive data to its best utility, without adding risk and liability [inaudible 00:10:30] we’re going to be exposing any sensitive information.
Mitchell Roberts (10:34):
So some privacy enhancing technologies, and we’ll go through in specifics a little bit later, are addressing this problem: How do we store data in encrypted states and actually operate on it without decrypting it, without moving it? How do we add trust in kind of trustless or semi-trusted environments?
Mitchell Roberts (10:59):
The third use case here is internal, external and business intelligence activities. This would touch on data access, prep and bias, all three of them. So how do we better operationalize data within an organization? When do we identify opportunities to bring in third party data or to leverage our first party data with third parties for potential partnerships, additional revenue opportunities, and more?
Mitchell Roberts (11:27):
These are just kind of the three main categories of use cases Gartner identifies. Of course, there’s more. But these are what the market is saying are very important right now.
Chris Barnett (11:42):
Great. Mitchell, thank you. So that’s Gartner’s take. And then next we wanted to talk for a second about MITRE. Some folks in the audience may be very familiar with MITRE. If you want to learn more, you can go to MITRE.org, their website. They’re a nonprofit group that’s operated in the public interest for the federal government and for other constituencies in the United States.
Chris Barnett (12:06):
And they operate 42 federally funded R and D centers, including these kind of topics that we’re talking about today, and just tons and tons of expertise. So in addition to Gartner, I like to look to MITRE for sort of what’s important in this area. And so what you can see here, what MITRE has to say, valuable insights come from applying analytics across shared data sets, right?
Chris Barnett (12:33):
And what I’ve bolded here, halfway down, the analyst needs of generating insight, that’s probably what folks in our audience are really focused on. How do we generate insight, and what are the issues around that? That has to be balanced of course, with the individual needs of privacy and also protecting the analytics, the algorithms they come up with.
Chris Barnett (12:54):
It’s been hard classically to balance those things. That’s really where the privacy-enhancing computation comes in to enable that balance and to get the insights that you need as we’ve been talking about. So that’s what MITRE says. By the way, sidebar, MITRE has independently done an extremely thorough and exhaustive review of TripleBlind’s technology.
Chris Barnett (13:17):
There’s a public report on that that you can get from our website. Or you can email one of us and we’d be happy to send it to you, just as a side note. Okay. So privacy-enhancing technology is a term that’s relatively new. Maybe most people on this session are familiar, but these are the seven buckets.
Chris Barnett (13:39):
You may be familiar with some of them, not all of them. But this is the set of seven items that are classified under privacy-enhancing technology. We’re obviously, in this one session, not going to go deep on all these, but this is a good level set in terms of what it is that privacy enhancing technology encompasses.
Chris Barnett (13:58):
In the interest of getting everybody on the same page and just as a reference for you and your colleagues, Chad’s going to go through a quick summary of each of these. I’ll turn it over to you for that, Chad.
Chad Lagomarsino (14:20):
Thanks. So effectively, differential privacy is similar to using aggregated data. You’re looking at large scale populations and looking at general patterns that you can infer from those populations without looking at the individual record.
Chad Lagomarsino (14:35):
This is commonly used in anything sensitive containing PII, like healthcare or financial information. For example, you’re looking at credit information. You can glean a lot of information without actually looking at those individual rows.
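To make the idea of answering population-level questions without exposing individual rows concrete, here is a minimal sketch of one standard differential privacy technique, the Laplace mechanism, applied to a counting query. The webinar does not name a specific mechanism, so this choice is an assumption; the function names, the `epsilon` value, and the credit-score records are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(rows, predicate, epsilon, rng):
    """Return a noisy count of the rows matching `predicate`.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so the Laplace scale is 1 / epsilon.
    """
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical credit records; a real analyst would only see the noisy answer.
records = [{"score": 580 + 5 * i} for i in range(60)]
rng = random.Random(0)  # seeded only so the sketch is repeatable
noisy = private_count(records, lambda r: r["score"] >= 700, epsilon=0.5, rng=rng)
print(f"Noisy count of scores >= 700: {noisy:.1f}")
```

A smaller `epsilon` adds more noise and therefore more privacy, at the cost of a less accurate aggregate.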
Chad Lagomarsino (14:50):
Federated learning is a process where you are using multiple machines to learn on a data set, to train and validate a data set at the same time. What you’re able to do is basically split a data set or an algorithm apart, and then train independently, get results from both of those sides and then bring the aggregated information together to get an inference.
Chad Lagomarsino (15:17):
A good example of this is Apple doing system updates. Apple is constantly looking at your phone’s usage patterns, collecting some information about that. Then, they send the usage pattern information back to their headquarters without revealing any personal information on the device itself.
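The train-independently-then-combine pattern described here can be sketched with federated averaging, one common way to do federated learning. This is an illustration under stated assumptions, not Apple's or TripleBlind's implementation: the two sites, their data, the tiny linear model, and every function name are hypothetical. Only model weights leave each site; the raw rows never do.

```python
def local_update(weights, data, lr=0.05, epochs=20):
    """Run a few gradient-descent steps on one site's private data (y ~ w*x + b)."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates, sizes):
    """Combine per-site weights, weighting each site by its number of rows."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)


# Hypothetical private data at two sites, both drawn from y = 2x + 1.
site_a = [(x / 10, 2 * (x / 10) + 1) for x in range(0, 10)]
site_b = [(x / 10, 2 * (x / 10) + 1) for x in range(10, 20)]

global_weights = (0.0, 0.0)
for _ in range(200):  # each round: train locally, then average the weights
    updates = [local_update(global_weights, site) for site in (site_a, site_b)]
    global_weights = federated_average(updates, [len(site_a), len(site_b)])

print("Learned slope and intercept:", global_weights)
```

In practice the weight updates themselves can leak information, which is one reason federated learning is often paired with other techniques on this list.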
Chad Lagomarsino (15:36):
Homomorphic encryption, this is a big topic, and describes a whole set of algorithms that work on various forms of computation. Definitely would like to get into that if anyone has curiosities or questions about that. But it’s a big topic, so we’re just going to glance over that for a moment.
Chad Lagomarsino (15:54):
Secure in place, this is what a lot of hospital systems or banks are doing, where you have a hardware based solution. Maybe it’s a separate Cloud environment. I have my Epic EHR records in one secure enclave, so on one machine. And then I have my behavioral health information on another machine.
Chad Lagomarsino (16:14):
A very common pattern you’ll see is “I have sensitive PII, personal information that is stored on a private server that only a very select group has access to” and then a more public facing server that has just general de-aggregated information. SMPC is a fundamental building block technology that allows computation to occur between different computers.
Chad Lagomarsino (16:40):
With SMPC, you are able to create logic gates with addition and multiplication. And long story short, be able to run arbitrary code while not having all of that code be present on one machine. Very cool technology. Again, if anyone has questions, I’m happy to dig into that. Could spend an hour talking about that.
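The addition and multiplication building blocks mentioned here can be sketched with additive secret sharing. This is a minimal single-process simulation, assuming a trusted dealer who hands out a Beaver triple for the multiplication step; the function names and the choice of modulus are illustrative and are not taken from the webinar or from TripleBlind's protocol.

```python
import secrets

PRIME = 2**61 - 1  # all share arithmetic is done modulo this prime


def share(value, n_parties=2):
    """Split `value` into additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    return sum(shares) % PRIME


def add_shared(x_shares, y_shares):
    # Each party adds its own two shares locally; no communication is needed.
    return [(x + y) % PRIME for x, y in zip(x_shares, y_shares)]


def multiply_shared(x_shares, y_shares):
    """Multiply two shared values using a Beaver triple (a, b, c = a*b).

    Assumption: a trusted dealer generates and shares the triple.
    """
    n = len(x_shares)
    a, b = secrets.randbelow(PRIME), secrets.randbelow(PRIME)
    a_sh, b_sh, c_sh = share(a, n), share(b, n), share(a * b % PRIME, n)
    # The parties open d = x - a and e = y - b. Because a and b are uniformly
    # random masks, d and e reveal nothing about x and y.
    d = reconstruct([(x - ai) % PRIME for x, ai in zip(x_shares, a_sh)])
    e = reconstruct([(y - bi) % PRIME for y, bi in zip(y_shares, b_sh)])
    # x*y = c + d*b + e*a + d*e; the public d*e term is added by one party only.
    z = [(ci + d * bi + e * ai) % PRIME for ai, bi, ci in zip(a_sh, b_sh, c_sh)]
    z[0] = (z[0] + d * e) % PRIME
    return z


x_shares, y_shares = share(6), share(7)
print("sum:", reconstruct(add_shared(x_shares, y_shares)))
print("product:", reconstruct(multiply_shared(x_shares, y_shares)))
```

Neither party's share on its own says anything about the underlying value; only the combined result is revealed.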
Chad Lagomarsino (17:02):
Synthetic data: This is where you are creating a fake set of data that is representing the shape and form of the data that you are originally trying to look at. This is a strategy if you’re interested in seeing what a data set could be like, what kind of information, or correlations you could get from a data set, but you can’t look at the actual data itself for privacy reasons.
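As a deliberately simple sketch of the idea, the code below learns the mean and spread of each column in a small table and samples fake rows from those summaries. It assumes independent, roughly normal columns, so it keeps each column's shape but not the correlations between columns; production synthetic-data generators model the joint structure as well. The patient values and function names are hypothetical.

```python
import random
import statistics


def fit_columns(rows):
    """Record the mean and standard deviation of each numeric column."""
    profile = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        profile[column] = (statistics.mean(values), statistics.stdev(values))
    return profile


def sample_synthetic(profile, n, rng):
    """Draw n fake rows, sampling each column independently from a normal fit."""
    return [
        {column: rng.gauss(mu, sigma) for column, (mu, sigma) in profile.items()}
        for _ in range(n)
    ]


# Hypothetical sensitive records that the analyst is not allowed to see.
real_rows = [
    {"age": 34, "cholesterol": 180},
    {"age": 45, "cholesterol": 210},
    {"age": 52, "cholesterol": 225},
    {"age": 61, "cholesterol": 240},
    {"age": 29, "cholesterol": 170},
    {"age": 48, "cholesterol": 205},
]

profile = fit_columns(real_rows)
synthetic_rows = sample_synthetic(profile, n=5000, rng=random.Random(0))
print(synthetic_rows[0])
```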
Chad Lagomarsino (17:25):
And then tokenization is everywhere on online payment systems. This is something that you’re going to see whenever you’re dealing with sensitive information. They’re going to create a token for that information, and then that token will represent where that data is as opposed to actually sending the raw data across different networks.
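A token vault can be sketched in a few lines: the sensitive value stays inside the vault, and only a random stand-in travels across networks. The `TokenVault` class, its method names, and the `tok_` prefix are hypothetical, and a real payment tokenization service would add access control, auditing, and durable storage on top of this.

```python
import secrets


class TokenVault:
    """Maps sensitive values to random tokens; the mapping never leaves the vault."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)  # random, so not derivable from the value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        # Only code with access to the vault can recover the original value.
        return self._token_to_value[token]


vault = TokenVault()
card_number = "4111 1111 1111 1111"  # a well-known test card number
token = vault.tokenize(card_number)
print("Token sent across the network:", token)
```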
Chad Lagomarsino (17:46):
All right. Chris, you want to move on? Good. Okay. So some applicability to use cases. For the first one, these are level set here. These are three different use cases that Gartner has highlighted are very important. The first one is about just training models with third parties, right?
Chad Lagomarsino (18:05):
That could be an external party such as an untrusted source that’s more public facing, or it could be a trusted source. You already have some kind of legal agreement. That refers to both of those. Federated learning, SMPC, and synthetic data are useful for that case.
Chad Lagomarsino (18:23):
The second one here, the usage of public Cloud platforms: this is describing basically putting all of your data from various sources in a Cloud and making sure that Cloud is secure and private. Just making a big assumption there. That refers to everything on the board, except for the federated learning and synthetic data aspects.
Chad Lagomarsino (18:48):
Those are all use cases for that. And then finally, you have internal, external and business intelligence activities. This is a very broad term to basically say sharing, using data, analytics, getting insights from various data sources. That’s going to be applicable to everything, because there’s multiple ways to do everything.
Chad Lagomarsino (19:17):
Now, we can go much more in depth on these individual use cases, but for the sake of time, the best thing we can do is send this slide deck to you as part of our follow up email. You’ll be able to dig in on some of our research on specific use cases for specific privacy enhancing technologies.
Chris Barnett (19:44):
That’s great. Chad, thank you.
Chad Lagomarsino (19:46):
Mm-hmm (affirmative).
Chris Barnett (19:46):
I’m going to ask you to pick your favorite row from this and just talk about that and talk a little bit about how, if you’re architecting, you might decide for one of these rows, which techniques to use. You’re right, we can’t go through them all, but let’s tell folks about one as an example, and they can read the rest later, as you mentioned.
Chad Lagomarsino (20:09):
Sure. One that a lot of people get really interested in when I mention is about the iPhone. Apple takes information about usage patterns, so that’s not containing PII. And that information is then used to basically train a local instance of a machine learning algorithm.
Chad Lagomarsino (20:30):
What they’re doing is they’re trying to see which apps are most commonly used, which apps are correlated together, if somebody downloads one app, are they going to download another app, et cetera. That information is then trained locally.
Chad Lagomarsino (20:44):
The output of that model is sent to a central server that Apple owns. That uses differential privacy, because you’re not looking at PII, personally identifiable information. You’re just looking at the aggregate information.
Chad Lagomarsino (21:00):
Secure enclave is your device, its own separate computer that contains your information on it. SMPC is used, because you’re actually doing computation at both Apple’s host server and the client device, which is your iPhone.
Chris Barnett (21:14):
That’s great. Thank you. I appreciate you sharing that. And the folks that have joined, you’ll get this in the follow up. So you can look at that and you can ask us questions as Chad mentioned. All right. So next. Again, friendly reminder to post your use cases or questions in the Q and A box.
Chris Barnett (21:38):
I’ll give our panelists heads up that slide after next, we do have one use case that’s come in so far, which is to talk about training AI models on healthcare data that’s not an EHR file, that’s not tabular data. That’s maybe image data. So let’s go on and talk a little bit more about the comparison. We’ll bring you in for this one, Mitchell.
Mitchell Roberts (22:11):
Yeah, absolutely. One thing that can often be a little bit confusing when talking about this category of privacy enhancing technologies is, “What does each individual technology solve for and what criteria should we judge them on when we are making decisions?”
Mitchell Roberts (22:33):
Of course, each of these technologies sets out to solve slightly different nuanced problems in the privacy space. So we’re not saying one of these technologies is necessarily superior to another in a lot of ways, but we’re also showing where each one really dominates and where some gaps are.
Mitchell Roberts (22:59):
TripleBlind was founded and our product was designed to really solve these 11 criteria that you see down the left side. So we want to create the maximum degree of privacy in our product. Some of the considerations in that are, “Is there a decryption key, is the data being moved, how much raw data is seen by the end user?”
Mitchell Roberts (23:33):
Those types of considerations go into that. Ability to operate at scale is, “How easy is this to scale horizontally into different business problems, but also vertically to really scale within an organization?”
Mitchell Roberts (23:48):
Types of data: we want to work on more than just tabular data. This is really important. We want to work with image data and genomics and large files and voice data and everything. We want to be able to keep all of it private and compute on it. Speed is really important.
Mitchell Roberts (24:06):
Data expires really quickly in a lot of spaces. So, the faster we can use it and the less burden we add to the process, the better. Supporting training new AI and ML models? Not every solution will offer this. It’s pretty unique. And to be able to leverage data from multiple locations to train an AI model, that’s something we really wanted to provide.
Mitchell Roberts (24:33):
Digital rights is also a very unique aspect of our solution, because we’re able to allow our customers to permission how and why their data is used, and how often. And then rolling into number seven: algorithm encryption is also extremely important. More and more, there is intellectual property wrapped up in some of the models and algorithms that people are developing.
Mitchell Roberts (25:06):
We want to actually protect algorithms in use as well. Then, of course, compliance. We want to eliminate masking, synthetic data, hashing, and accuracy reduction, basically to preserve the full fidelity of data and eliminate having to make that trade off between utility and privacy. We want to maximize for both.
Mitchell Roberts (25:32):
Hardware dependencies really slow down data usage when everything is being virtualized. Why should we design a solution that requires specific hardware? And then interoperability with third parties. Like we said, we want this to scale both within an organization and externally. How easy is it for your data partners to get up and running?
Mitchell Roberts (25:57):
As you can see, at TripleBlind, we have a lot of green checked boxes, but that’s because we designed for these things. We designed to fill these gaps, these red circles that are left by some of these other techniques. This is why we call ourselves the most complete and scalable solution for privacy enhancing technology, because we really try to take a holistic view of solving these problems.
Chris Barnett (26:27):
That’s great, Mitchell. We’ve got a question here in the chat that I’ll start to try and answer, and then you guys can chime in. So the question, which is a great one. First of all, saying, “Hey, is there a demo?” And I mentioned here in the chat, we actually aren’t doing a demo today.
Chris Barnett (26:44):
This was intended to be more sort of fundamental educational, but we’d be happy to schedule demos with anybody that’s interested so you can see the Python fly by, see it work in real time, and see actual AI models trained and queries answered. We’re happy to do that.
Chris Barnett (27:02):
And then a good related question is, “Will you demonstrate or explain how you solve all 11 requirements? Do you have a new PET technology or your platform allows using one of the existing technologies?” That’s a great one. Mitchell, do you want to talk about sort of the foundational pieces that we’ve built on and improved versus things that we don’t use at all? I think is what this question’s getting at. Does that make sense?
Mitchell Roberts (27:28):
Yeah, absolutely. It’s a great question. The answer is a little bit of both. So TripleBlind, I use the term complete and scalable. What this means is we’re addressing the problem from multiple angles and we’re providing the best privacy solution for a given scenario. How we do that is by leveraging some novel advancements that TripleBlind cryptographers and engineers have made on top of some existing solutions.
Mitchell Roberts (28:00):
You see traditional SMPC over on the right side of this chart. Secure multi-party compute is really a core piece of our solution, but we have made some important advancements on the practicality and scalability of that. There’s also some elements of things like federated learning. We’ve kind of taken a different approach to solving the same problem, and there are some novel aspects of our solution there.
Mitchell Roberts (28:33):
We do have peer reviewed articles and things we can share supporting some of the novel aspects of our technology. The answer is really, it’s kind of a blend of using the best technologies out there as well as developing some of our own.
Mitchell Roberts (28:54):
One really exciting thing is if you look at homomorphic encryption, which has been sometimes talked about as the holy grail of encryption. The inventor of some of the more practical, fully homomorphic encryption schemes, Craig Gentry is now our Chief Technology Officer.
Mitchell Roberts (29:18):
He’s working with us as well. And while homomorphic encryption is not part of our specific solution, it is really great to have somebody who has been kind of at the forefront of this space, joining our team and helping lead the future of it. I hope that kind of answers the question. I’ll let Chad or Chris chime in to add anything else.
Chris Barnett (29:44):
I think that’s pretty good. Thank you for that question and good coverage of that. Mitchell, thanks. All right. Let’s go to a couple specific use cases that have come up in the chat. I’m going to just leave us here on this. Let me just get the mouse back.
Chris Barnett (30:08):
I’m going to just leave us here on this slide so that we can just have a conversation. The first use case that folks are asking about is, again, it’s about training AI models, but it’s not tabular data, it’s image data. The example that was used was an x-ray. How do you train AI models on a giant stack of x-rays from five different hospitals, all of which are private data?
Chris Barnett (30:35):
Based on these questions, I’m thinking we could speak to, “How does privacy enhancing technology generally address this?” We’ve talked about the seven techniques, and then I think folks are interested in specifically, how does TripleBlind address this as well. Chad, do you want to kick that discussion off?
Chad Lagomarsino (30:55):
Sure. If you are, for example –– let’s forget about external entities –– you’re a hospital system, you own all the data, of course, with HIPAA regulations. You have one department that’s using EHR records and you have another department, radiology, they’re using x-rays, right? What do you do if you want to run analytics on this data? Can you train an AI model on both data at the same time?
Chad Lagomarsino (31:22):
How do you manage that? Well, the question is a good one because effectively the way that it’s generally done is that it’s not done. I’m just kidding. The way that most hospital systems will attempt to try and create a system to train data on various types of data. Excuse me, train models on various types of data is that they will create a place where they pipe all that data into.
Chad Lagomarsino (31:51):
So they’re going to pipe the data into a Cloud environment. They’re going to pipe it in from a tabular source, they’re going to pipe it in from an image source, and then they will run separate models on both of those data types. So the effective thing is that they are creating a secure enclave that they are then piping data into from different means.
Chad Lagomarsino (32:13):
They usually will integrate with a vendor that is going to help them read that data, get it to the right size, scale it, and then pipe it into a centralized database. And then they have extreme triple-A-class, we hope they have extreme triple-A-class cyber security around that Cloud environment, everything is secured.
Chad Lagomarsino (32:36):
Or they do it on-prem and [inaudible 00:32:38] it’s on them to secure that server and make sure that all of that data cannot be exposed to the outside world. Right? The question then is, what do you do differently if you don’t want to be concerned about the data movement working with different data types, all of those integrations with external vendors, et cetera?
Chad Lagomarsino (33:00):
The answer is you don’t move the data, you keep it where it is. You actually have a local storage for that data where it’s generated and you just create an access to that data from a[n] internal source. It’s a paradigm shift. Most of the time, companies will move data and run analytics.
Chad Lagomarsino (33:26):
The alternative, and what TripleBlind does, is keep the data exactly where it is and run the analytics on-site. So run the analytics where the data is at –– the “data point.” That way, you don’t have to worry about piping the data anywhere, you skip all that plumbing. And that’s really the core value of what TripleBlind is doing.
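The shift from moving data to moving the computation can be illustrated with a small sketch in which each site runs an analytic locally and returns only summary numbers. This is not TripleBlind's API or protocol and it applies no cryptographic protection; the `Site` class, the `length_of_stay` field, and the sample records are all hypothetical.

```python
class Site:
    """A data owner that runs analytics locally and returns only the result."""

    def __init__(self, name, records):
        self.name = name
        self._records = records  # stays behind the site's firewall

    def run_local(self, analytic):
        # The analytic travels to the data; the raw records are never returned.
        return analytic(self._records)


def local_stats(records):
    """Summarize one site's records as a count and a total."""
    values = [record["length_of_stay"] for record in records]
    return {"count": len(values), "total": sum(values)}


def combine(results):
    """Compute the cross-site mean from per-site summaries only."""
    count = sum(result["count"] for result in results)
    total = sum(result["total"] for result in results)
    return total / count


# Hypothetical records held by two separate hospital departments.
sites = [
    Site("clinic", [{"length_of_stay": 3}, {"length_of_stay": 5}, {"length_of_stay": 4}]),
    Site("radiology", [{"length_of_stay": 10}, {"length_of_stay": 2}]),
]

results = [site.run_local(local_stats) for site in sites]
print("Mean length of stay across sites:", combine(results))
```

The coordinator only ever handles the per-site counts and totals, so there is no central copy of the records to secure or pay to move.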
Chad Lagomarsino (33:46):
We’re abstracting away all of that plumbing, turning the question on its head, and we have a more effective way to access data without having to worry about the concerns of, “I just copied this data somewhere. Now I have to protect that data. I have to pay for all the compute costs of moving that data,” et cetera. So in a nutshell, we’re happy to expand on this, but in a nutshell, what TripleBlind does is we let you use data privately behind your firewall, where it’s generated.
Chris Barnett (34:21):
That’s great. Thank you. We continue to get a couple more questions. I think that we’ll have time for two questions and then we’ll wrap to let everybody get back to their day. The next question says, “How do I do analytics or AI on genetic data? Classically, it’s extremely hard because of the size of the data, and also the fact that the data can just implicitly be tied back to the individual patients. What do I do about genetic data?”
Chad Lagomarsino (34:55):
Genetic data is an excellent example of why you don’t want to move data. It’s very expensive. It’s computationally very heavy. And then yes, you do not want to expose the individual records of recombinant or variant data for genetic studies. What’s typically done is you will do aggregation.
Chad Lagomarsino (35:16):
This is, again, going back to differential privacy. You’re going to have your secure enclave where the data is stored and you’ll use aggregated data, differential privacy. From there, you’ll do a population level study generally speaking.
Chad Lagomarsino (35:33):
The challenge with that is that at present, most genetic organizations, companies that work in the bio space, pharma, et cetera –– They create something called biobanks, which again, goes back to “Let’s create this super hyped-up, steroid-filled computing environment in the Cloud that has Fort Knox level security in theory and pump all of our data into that.”
Chad Lagomarsino (35:57):
That’s expensive, time-consuming and risky, not to mention all the legal compliance issues that come with putting all that data in one spot. Either they do that or they just don’t do the studies at all, because it’s going to cost too much. Again, one of the advantages of computing the data where it sits is that you can run studies on this data, train models on this data, without moving it from its source.
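One well-known way to train a model without moving data is federated averaging, sketched below for a one-variable linear model in plain Python. This is offered as a generic illustration of the idea, not as a description of how TripleBlind trains models; the site data, learning rate, and round counts are made up for the example.

```python
def local_update(w, b, data, lr=0.05, epochs=5):
    """Runs at a site: a few gradient-descent steps on its own (x, y) pairs."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, n


def federated_average(updates):
    """Runs at the coordinator: averages parameters weighted by sample count."""
    total = sum(n for _, _, n in updates)
    w = sum(w * n for w, _, n in updates) / total
    b = sum(b * n for _, b, n in updates) / total
    return w, b


# Hypothetical data at two sites, both following y = 2x + 1.
site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
site_b = [(3.0, 7.0), (4.0, 9.0)]

w, b = 0.0, 0.0
for _ in range(200):
    # Only model parameters travel; the (x, y) pairs stay at each site.
    updates = [local_update(w, b, site) for site in (site_a, site_b)]
    w, b = federated_average(updates)

print(round(w, 3), round(b, 3))
```

Model updates themselves can leak information about the training data, so real deployments usually combine this pattern with additional protections such as secure aggregation or differential privacy.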
Chris Barnett (36:30):
That’s great. Thank you, Chad. Oh, okay. So one last question. This one says, “This sounds good, but I need to actually get my hands on this so I can play with it. How do I get started or do a trial or a proof of concept? I need something practical.” So Mitchell, maybe that’s over to you for that question.
Mitchell Roberts (36:50):
Yeah, absolutely. A pretty common thing that we support with customers and prospective partners is we’ll run an evaluation where you get to play around with our software development kit, our user interface, and really get to know the tools hands-on.
Mitchell Roberts (37:13):
That includes some example use cases that we’ve set up. Then we can also work on custom evaluations where you bring your own data or your own algorithms. We have a pretty frictionless setup for running those and getting people comfortable with our solution.
Mitchell Roberts (37:33):
If that’s something you’re interested in, definitely reach out after this webinar and we will support that. We also can provide product demos. That usually would come before an evaluation. There’s lots of ways to kind of get your hands on the product.
Chris Barnett (37:57):
That’s great. I think it’s worth mentioning in the follow up email that we send, we’ll have all these slides and then we have a link of resources that you can see here on the screen. That includes that you can download TripleBlind software from both the AWS Marketplace and also the Azure Marketplace.
Chris Barnett (38:16):
If you already have some of your data or some of your development operations in those environments, you can just go into those marketplaces and get the software right there, including free trial availability, if you’d like. We’ll support you, but in terms of just getting the software and downloading it to your environment, it’s just right there, which is pretty handy.
Chris Barnett (38:37):
So panelists, I appreciate your great comments and your responses to the Q and A today. I think this has been very helpful. Folks that joined, I see everybody that came in at the beginning is still here. We appreciate that. So wait, wait, there’s an encore.
Chris Barnett (38:52):
We have one more question. Just so, “Does TripleBlind provide API integration capabilities?” So Chad, maybe talk a little bit about the Python and the R APIs and kind of work toward this question. I’m not sure exactly what integration means, but just talk about the APIs. Thank you for that question.
Chad Lagomarsino (39:14):
TripleBlind is effectively built [inaudible 00:39:15], so it is an API itself. It is natively written in Python. It is able to connect to other languages as well. APIs are the primary way that we do integrations with databases, with external vendors, analytics services, et cetera. So basically everything you can do with Python as an API, you can do with TripleBlind. Yeah.
Chris Barnett (39:43):
Chad, would it be fair to say that the syntax and the commands and the language that we have in Python is really similar to what data scientists and analysts are used to using in the regular world when they come over to the TripleBlind world? It looks very familiar, just a little bit...
Chad Lagomarsino (39:59):
Exactly.
Chris Barnett (40:00):
More private. Is that right? Can you just talk about that for a second?
Chad Lagomarsino (40:03):
Yeah. We have designed TripleBlind to be this easy-to-use package. You import it into an existing Python environment and use it the same way that you’d use commonly used packages for data science in Python, such as scikit-learn or PyTorch.
Chad Lagomarsino (40:25):
We can do TensorFlow, et cetera. Not that we are limited to those, but those are the tools that we’re modeled after. So for anyone who has experience working across the data pipeline, doing data prep, analytics, building models, comparing models, et cetera, that whole process will look very familiar in TripleBlind.
Chris Barnett (40:48):
That’s right. We’d be happy to get you the docs or more information on that, or how to, and also just to be able to try it. Great. I think we’ve covered all the questions, so we’re going to wrap here. Again, you’ll get an email with the recording of this, with the resources, with the slides. You’ve got our contact information, so you can follow up. We appreciate your time and thank you, everybody. Have a great day.
Privacy Enhancing Technologies: Who Should Care and Why?
The biggest problems facing healthcare and finserv professionals in 2022? Data problems…
We hear from C-Suite and compliance officers, data scientists and even cloud architects that if your work includes machine learning or analytics, you’re likely facing data access, data prep, and data bias challenges, along with a host of compliance requirements and more. What if emerging privacy-enhancing technologies could reshape and catalyze your organization’s data-based innovations?
Speakers include:
- Chris Barnett, VP, Partnerships & Marketing, TripleBlind
- Chad Lagomarsino, Partnership Engineer, TripleBlind
- Mitchell Roberts, Director, Product Marketing, TripleBlind
Date/Time: Wednesday, May 25th, 11:00 am CT / 12:00 pm ET
According to Gartner¹, the three key use cases for Privacy Enhancing Technologies are:
- AI model training and sharing models with third parties.
- Usage of public cloud platforms amid data residency restrictions.
- Internal and external analytics and business intelligence activities.
According to MITRE², which operates federally funded data R&D centers:
“The most valuable insights come from applying highly valuable analytics to shared data across multiple organizations, which increases the risk of exposing private information or algorithms. This three-way bind – balancing the individual needs of privacy, the analyst’s needs of generating insight and the inventor’s needs of protecting analytics – has been hard to balance…”
The emergent category of privacy-enhancing technologies (PET), also referred to as privacy preserving technologies or privacy-enhancing computation (PEC), represents a cohort of technological solutions which seek to ease the pains, pressures, and risks involved in working with sensitive and protected data.
In this webinar, hear TripleBlind’s experts discuss how to choose the optimal PET technique(s) for your business problem and use cases – and how to evaluate and implement solutions that will have the greatest impact. Techniques covered include:
- Differential Privacy
- Federated Learning
- Homomorphic Encryption
- Secure Enclaves (aka Confidential Compute or Trusted Execution Environment)
- Secure Multi-party Computation
- Synthetic Data
- Tokenization (along with data masking and data hashing)
REFERENCES:
- Three Critical Use Cases for Privacy-Enhancing Computation Techniques, Bart Willemsen, et al., 28 June 2021.
- Federated Approaches to Observational Research – Technical Considerations, Stacy Chen, Zeshan Rajput, Nichole Persing, February 2022.