Transcript
Suraj Kapa (00:08:55):
Hi, everyone. Thanks for joining. We’ll wait for a couple more people to join and then we’ll go ahead and get started.
Suraj Kapa (00:10:35):
We’ll give it two more minutes and we’ll go ahead and start at 12:05 Eastern.
Suraj Kapa (00:12:02):
Good afternoon, everybody, or good morning, depending on where everybody is. Just to make sure, everybody can hear me okay? Great, looks like everybody can. So I appreciate the opportunity to speak with everybody and to actually introduce our wonderful panelists who are joining me today to talk about how to overcome bias in big data. So my name, for starters, is
Suraj Kapa. I’m the SVP of healthcare and government at TripleBlind, and actually also a practicing cardiologist, cardiac electrophysiologist.
Suraj Kapa (00:12:31):
Next in our panelist group is Aashima Gupta. She’s the director of global healthcare solutions at Google Cloud. And then Daniel Kraft, who’s the faculty chair for medicine at Singularity University. And Brian Anderson, who’s the chief digital health physician for MITRE.
Suraj Kapa (00:12:47):
And what we’ll be talking about today is getting an understanding of bias in big data: what it is, and what the issues are related to it when we start thinking about the creation of digital health platforms and the scalable deployment of digital health tools to society in general, which is becoming more and more of an important issue as the resources and the technology get mature enough to allow for such deployments.
Suraj Kapa (00:13:16):
And we’re going to start today with a general primer about what bias in big data actually means, just to set a foundation for how we think about it, and then we’ll move to a panel conversation. We really do want this to be interactive, so I’d encourage all the attendees to feel free to ask any relevant questions in the chat, and we will take them as they come after the initial foundational session.
Suraj Kapa (00:13:44):
So as I said, we’re going to start by talking about what bias in big data actually means. And I really think it’s important to get back to the point that the explosion of interest in big data in digital health platforms actually emanates from how digitized our lives are nowadays. I still remember from when I was in training in medical school and residency, and even my early years on faculty, we were still writing out our notes at some of the premier medical institutions in the country. That very quickly translated into typing our notes and creating a digital footprint for all of the thinking and all the thought processes we employ about how to manage patient care. And that increasing availability of digitized information facilitates the explosion of opportunity to get novel insights, to drive clinical decision support, or to improve, even at the patient consumer level, how we deliver health.
Suraj Kapa (00:14:47):
But when we think about this, we also have to think about where the data is coming from, how it’s being used, and how it’s being integrated. All of that enters into the potential imposition of bias into these processes, which has the potential to actually widen the disparities of care that might exist.
Suraj Kapa (00:15:09):
So there are certain key considerations we have to think about. First, in general, while the digital economy has both concentrated and dispersed power, data in particular has been a concentrating force within it. The reason is that the institutions, especially in healthcare, with data platforms mature enough that the data is truly integrated and the internal silos have been broken down are a small few within a much larger ecosystem where the data actually exists. That creates the potential issue of an outsized focus on those who have reached the level they need to with their data platforms to translate their data into the digital economy.
Suraj Kapa (00:16:00):
Secondly, the data economy is changing our approach to accountability from one based on direct causation to one based on correlation, and this has a lot to do with issues of explainability and understanding what is actually going on. When we look at big data, a lot of the time the data has already been produced before we see it. But in how that data is produced and exposed to analytics and to algorithm creation, what we care about as scientists is not just correlation but causation, so we can do further studies and analyses, take out certain pieces of the puzzle, and see whether the result shifts one way or the other, to understand whether something is a causal factor as opposed to a merely correlative one.
Suraj Kapa (00:16:53):
But when we assume causation from correlation, because all we have is a massive dataset with limited explainability, any bias in how we obtained the result, based on how the results were accounted for within the actual plan of care, say in healthcare for a given patient, gets magnified into an algorithm. That assumption about causation also becomes magnified, and can potentially lead to worse care outcomes, especially in specific populations where there’s not as much data or they’re not as well represented. And it’s important to remember that we sometimes just assume that if we have enough data, we’ll be free of bias. But the reality is, data systems often mirror the existing social world.
Suraj Kapa (00:17:39):
The truth of the matter is, the more data you have, the more opinions you’re bringing in. And if opinions about the management of a given patient were shaped by socioeconomic factors, by ethnic factors, by racial factors, or even by factors that don’t have anything specific to do with any one of those particular attributes, those opinions get magnified. Many of these are unconscious biases that we can’t easily account for otherwise. And the issue is, humans determine what data is captured, they create the algorithms, they assemble the datasets, and they’re ultimately involved in building and validating the artificial intelligence too. We know that humans have inherent bias, and that’s one of the extraordinary difficulties when that bias is integrated in some way into the data assets we’re working through.
Suraj Kapa (00:18:33):
So analytical techniques which utilize existing data to predict the future inevitably have the tendency to replicate, and often amplify, existing biases. And all of this gets to the limits of over-trusting data silos as they exist, basically claiming that one silo of data, one institution’s data, or two institutions’ data is sufficient to create an unbiased picture of the world. It might be or it might not be, because we need to understand whether the algorithm is actually built off data that represents the populations it will ultimately apply to. That doesn’t mean an algorithm is by its very nature going to be biased because it wasn’t built off a particular population’s data. So for example, if I go to Iceland and build a novel genetics-based algorithm, that doesn’t mean the algorithm won’t be applicable to patients in Nigeria or India or China, but it does require the insight that there might need to be an additional step of validation to ensure that the algorithm is truly applicable. That guardrail of validation in a different, out-of-distribution population is extraordinarily necessary, because applying the algorithm without it can create worse health outcomes by over- or under-representing disease risk compared to existing best standards.
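A minimal sketch of that external-validation guardrail, assuming a scikit-learn-style classifier and hypothetical per-site datasets and column names, might look like this:

```python
# Sketch: external validation of a model across out-of-distribution populations.
# Assumes a fitted scikit-learn-style classifier `model` and one DataFrame per site,
# with hypothetical feature columns and a binary "outcome" label.
import pandas as pd
from sklearn.metrics import roc_auc_score

def validate_by_site(model, site_frames: dict, feature_cols, label_col="outcome"):
    """Return AUC per external site so performance gaps are visible before deployment."""
    results = {}
    for site, df in site_frames.items():
        scores = model.predict_proba(df[feature_cols])[:, 1]
        results[site] = roc_auc_score(df[label_col], scores)
    return results

# Example usage (hypothetical data and names):
# aucs = validate_by_site(model,
#                         {"iceland_holdout": df_is, "nigeria": df_ng, "india": df_in},
#                         feature_cols=["age", "sex", "variant_score"])
# Flag any site whose AUC falls well below the development-site AUC before deploying there.
```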
Suraj Kapa (00:20:15):
And the over-dependence on a single or a few data silos also creates an issue. The problem is, while we can claim, “Okay, we can say exactly what the diversity of the population is, how many people of specific determinants such as race, gender, and other SDH, social determinants of health, exist within a given dataset,” those are the measurable aspects of bias that can exist within data; there are also many immeasurable aspects of bias that can exist within data. One can ask: is an Indian person, just taking myself as an example, who’s able to go to a really wealthy hospital in a wealthy community, the same as somebody who just immigrated here, living in a very urban, socially deprived region, and using the free hospital and the free access points there? Do the same social determinants hold when we think about how I fared in that wealthy hospital I have the resources to get to, versus that poor hospital which I’m dependent on because I don’t have those resources? Am I, as fundamentally the same person, represented the same way in both settings? We do have to think about these immeasurable aspects that can pose bias within these data assets.
Suraj Kapa (00:21:40):
So ultimately, we have to think about: how do we measure representation within training data? There’s the easy aspect of it, and I say it’s easy, but it’s not, because otherwise everybody would be doing it right away, but [inaudible 00:21:51] on how we are represented in the training data. This can take the simple form of, how many people like Suraj are in the training dataset that led to the creation of this algorithm? All right, Suraj, being a 40-something Indian male, is well represented. But there’s also the question of being appropriately represented in the training data. What I mean by that is, there are biases imposed not just by not being represented, but by how you were managed in that particular entity, in that institution, by that group whose care of you, or whose involvement with you, produced the data. And there are several examples of that.
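As a rough illustration of the “how many people like Suraj are in the training data” question, a small audit comparing subgroup shares in the training set against a reference population (hypothetical column and subgroup names) could look like the sketch below. Note that counts alone capture being represented, not being appropriately represented.

```python
# Sketch: audit subgroup representation in a training set against a reference population.
# Column names, subgroup labels, and reference shares are hypothetical.
import pandas as pd

def representation_audit(train_df: pd.DataFrame, reference_shares: dict,
                         group_col: str, min_ratio: float = 0.5) -> pd.DataFrame:
    """Compare each subgroup's share of the training data to its share of a
    reference population; flag groups at less than `min_ratio` of the expected share."""
    observed = train_df[group_col].value_counts(normalize=True)
    rows = []
    for group, expected in reference_shares.items():
        share = float(observed.get(group, 0.0))
        rows.append({
            "group": group,
            "train_share": share,
            "reference_share": expected,
            "ratio": share / expected if expected else float("nan"),
            "underrepresented": share < min_ratio * expected,
        })
    return pd.DataFrame(rows)

# Example usage (hypothetical census-style shares):
# audit = representation_audit(train_df,
#                              {"south_asian": 0.06, "black": 0.13, "hispanic": 0.19},
#                              group_col="ethnicity")
# print(audit[audit.underrepresented])
```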
Suraj Kapa (00:22:45):
So for example, gender bias. Amazon had an AI algorithm that looked at resumes and tried to identify what’s a good resume versus a bad resume. It tended to overemphasize masculine words in a resume while downgrading anything that had more to do with female counterparts, and that led to Amazon shutting down the algorithm in 2017. These are subtle, potentially very hard to measure or account for aspects in the development of an AI tool that can impose bias in that tool unless they’re considered and accounted for. And there are other aspects, obviously: demographic bias, ethnicity bias, socioeconomic bias. So we need to account for all of these when we think about how, ultimately, we’re going to neutralize AI bias.
Suraj Kapa (00:23:38):
And I do also want to be clear that bias can hit every element of the data scientist’s world. It starts, and I’ve spent a lot of time talking about this, with what kind of data you’re using, and that’s data gathering bias. That’s the easy bias to think about: are we fully and appropriately represented in the training dataset, or in the validation dataset that led to the approval of an algorithm, say by FDA or by some other regulatory authority? Maybe that can be solved for by having diverse enough datasets, et cetera. But there are also data analysis biases and data application biases that are critically important. So when we think about the creation of digital platforms, we have to mitigate bias not just in the creation of algorithms, but in the actual validation of the algorithms and the approaches we used to create that validation standard, and finally in the application of these algorithms downstream, once they’re approved and delivered through digital platforms.
Suraj Kapa (00:24:48):
And it’s, again, important to realize that these sources of bias exist throughout the data life cycle, all the way back to the funding that led to the creation of the data assets you’re using. So for example, suppose I created the algorithms off of every clinical trial participant that’s ever been in a clinical trial globally, because we know that data is very robust, not perfectly clean, but cleaner than real-world data, and ample, and somewhat, though not fully, representative of the general population, and we can get enough patients. The problem is, there are very specific types of people who tend to engage in clinical trials, and they’re not reflective of the larger portion of the population. So we need to think about all of these points, throughout the life cycle of data, where bias in the data elements might ultimately be magnified into the eventual use of algorithms once approved.
Suraj Kapa (00:25:54):
So how can we solve for this? And this is what we’ll get to with the panel. First, we need multidisciplinary work. We can’t sit here and think that industry by itself, academia by itself, or government by itself is going to be able to figure out, as a silo, how to solve all of this. The creation of appropriate guardrails is probably going to require consortiums of all of these entities. We’re also going to have to ensure secure, privacy-preserving, yet scalable approaches to collaborate on data broadly, to ultimately, hopefully, mitigate the bias that emanates from limited data diversity. Because as I said earlier, while some elements of data diversity are measurable, like what proportion of your population is African American versus Hispanic American versus Indian or another population, there are some immeasurable aspects that you might need larger datasets to account for, because we don’t know what we don’t know.
Suraj Kapa (00:26:51):
And finally, improving methods to understand how well an individual is truly represented within a dataset. In other words, if I am the first person on whom an algorithm is being deployed, we think, “All right, this person’s probably well represented in the dataset,” but how do we really know? How do we create these labels to understand the representation of a given new individual coming in, to whom this algorithm is now being applied?
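One hedged way to approximate whether a new individual is well represented is a simple distance-to-training-data check; the sketch below uses Mahalanobis distance over a handful of hypothetical numeric features and is only one of many possible approaches.

```python
# Sketch: flag whether an incoming patient looks out-of-distribution relative to
# the training data, using Mahalanobis distance over hypothetical numeric features.
import numpy as np

def fit_reference(train_features: np.ndarray):
    """Store the mean and (pseudo-)inverse covariance of the training feature matrix."""
    mean = train_features.mean(axis=0)
    cov = np.cov(train_features, rowvar=False)
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse for numerical safety
    return mean, inv_cov

def mahalanobis_distance(x: np.ndarray, mean: np.ndarray, inv_cov: np.ndarray) -> float:
    """Distance of a single feature vector from the training distribution."""
    d = x - mean
    return float(np.sqrt(d @ inv_cov @ d))

# Example usage (hypothetical features: age, BMI, systolic BP, creatinine):
# mean, inv_cov = fit_reference(train_X)
# threshold = np.percentile([mahalanobis_distance(r, mean, inv_cov) for r in train_X], 99)
# if mahalanobis_distance(new_patient, mean, inv_cov) > threshold:
#     print("Patient looks unlike the training population; interpret model output cautiously.")
```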
Suraj Kapa (00:27:23):
Now, I work for a company that’s heavily focused on how to do secure privacy preservation. And one aspect we’ve really thought about is, how does that help in the context of bias? And this is one piece of the puzzle, and I really want to be clear on that. The piece of the puzzle this helps with is creating agnostic, software-based approaches that allow data to stay resident where it’s located, so that we mitigate the fears, especially of underrepresented populations, about the use of their data for the purpose of training algorithms or analytics, while nevertheless allowing them to be represented in the final algorithms. So we ultimately expand access to the data pools and reduce underrepresentation. And ultimately, we need to allow for consideration of the entire data pool. This is one of the issues and limitations we experience when we’re trying to train algorithms with the data that’s made available: when we’re trying to abide by privacy and by varying regulatory standards, a lot of times we’re forced, on purpose and for good reason, to strip extensive degrees of context from the data, such that the algorithms end up being built from data that has had its context removed. Ideally, we’d like to maintain that context in those data sources.
Suraj Kapa (00:28:50):
But I’ll stop here so we can actually hear from our panelists. What I’m going to do is let each of them introduce themselves briefly and their specific interests in this area, and then we’ll get to questions. And again, for the audience, feel free to ask whatever questions you have. I know we already have one, which we’ll get to. But Aashima, maybe I’ll hand it to you first.
Aashima Gupta (00:29:11):
Thank you, Suraj. Hi, everyone. This is Aashima Gupta. I’m the global director for healthcare at Google Cloud, and much of my work involves three vectors. Number one is creating forward-thinking solutions where we are leveraging our engineering and product advancements for healthcare. The second is working with the industry at large, so transformative work with institutes like Mayo Clinic, like [inaudible 00:29:40], where it is about helping the industry. And third is being involved in the healthcare ecosystem; I serve on the board of HIMSS and [inaudible 00:29:50], generally thinking broadly as an ecosystem. Happy to be a part of this panel.
Daniel Kraft (00:29:59):
Great. Hi, I’m Daniel Kraft, coming to you from Creek at the moment. I’m a Med-Peds trained physician-scientist, hematology-oncology by clinical background, chair of medicine for Singularity University since it started in 2008 and of a program called Exponential Medicine, as well as, more recently, chairing the XPRIZE Pandemic and Health Alliance, where we’re trying to pull different datasets together. I try to put the lens of what’s here and what’s coming next in terms of, in this case, data. We have new forms of information, whether it’s your digital exhaust or new forms of multiomics, and I try to help folks converge to think about solving for, in this case, unrepresented elements or bias in data, but also seeing where the puck is going and what elements might need to be put into the solving set. When I think of data and healthcare briefly, data might be the oil, but we want to think more about the insights that can be driven, and driving those new insights and knowledge to the bedside or, increasingly, the web side, and of course bias is an issue. I think perfection is also the enemy of the good here, and we need to be building stacks and new approaches, and allowing folks who are often unrepresented to become data donors, to build less biased starting points. Looking forward to the discussion. Thanks.
Suraj Kapa (00:31:17):
Okay. Thanks, Daniel. And I’ll hand it over to Brian.
Brian Anderson (00:31:18):
Thanks, Suraj. And I really appreciate the opportunity to join my colleagues from Google and Singularity here with you, Suraj, from TripleBlind, on this panel. So, Brian Anderson, MITRE’s chief digital health physician. At MITRE, I have the pleasure of working very closely at the intersection of private sector work with the public sector; MITRE’s principal work is with the federal government, and I help lead a lot of the health-related work. The particular area of interest here is very much what you described, Suraj, around guardrails. We recently started and publicly announced a coalition called the Coalition for Health AI, or CHAI, and it brings together just the stakeholders you described, academia, industry, and public sector government, around a common mission and vision to build a framework, or guardrails, that promote the kind of trustworthiness and transparency into models and their applicability, such that we are realizing algorithms and models that are useful for all of us and that are trained on data that is inclusive of all of us.
Brian Anderson (00:32:31):
I really liked that Venn diagram where you had data gathering, data analysis, and data application. Within CHAI, we just had a virtual roundtable where we focused on bias along three similar vectors: one, data gathering; two, model development, which is really the analysis and the building or training of the models; and then application, how humans actually take these models and apply them appropriately or inappropriately in the real-world environments where they’re deployed. And there is so much to address in each one of those areas; certainly, just focusing on the easiest one that you described, the data gathering one, only gets at one part of the problem. So there’s a lot of opportunity for a community to come together to address all of those areas of bias, to build that kind of more transparent, trustworthy framework, so that we can trust these kinds of algorithms being deployed for all of us.
Suraj Kapa (00:33:32):
Oh, great. And I think everything you’re all saying about the need for that focus, and how that focus is being applied, is very synergistic across the different areas, from government to the startup space to the very established providers, the very established digital providers, in this space. But maybe I’ll start with you, Aashima. From your perspective, being involved with a major cloud provider that has been very focused over the last couple of decades on understanding how to get into healthcare and how to work in that space, where do you see the cloud provider’s role, big tech’s role, in addressing bias as it relates to digital health platform enablement? And if you can comment, how is Google looking at it as well?
Aashima Gupta (00:34:24):
Thank you, Suraj. At Google, we have seen firsthand that rigorous evaluation of how to build AI responsibly and the work that you’re talking about, the fairness, the data bias, are not only the right thing to do, they are critical components of AI’s successful deployment. There’s a growing distrust in AI, and we listen to our customers very closely. Our work I will share across three dimensions. Number one is us as a company. I am part of Cloud, but Google-wide, there is a common set of AI principles. There are eight of them, and they govern all of our work, including in healthcare, but they are common across all domains. These values of responsible and inclusive AI are at the core. So when we are talking about a new product development, or when we are doing a custom deal, for example working with the Mayo Clinic or the ecosystem at large, all of that work goes through a repeatable process of applying the AI principles in the context of the product, and the product in the context of the customer work. So that’s the one dimension.
Aashima Gupta (00:35:41):
The second, a very important part, is establishing a shared understanding of AI models. Dr. Halamka uses this example a lot: whether it’s knowing the nutritional content of our food or a medication adherence warning in healthcare, regardless of the use case, we rely on information to make responsible decisions. But what about AI? Despite its ability and potential to transform so much of the way we work and live, machine-learning models are often distributed without a clear understanding of how they function; for example, the example that you were giving, Suraj.
Aashima Gupta (00:36:25):
So what we have built in that context is a framework called the model card, or the data card, and we share a common vision with the industry. Model cards are not a Google product; it is a framework that we’ve shared with the industry to define the explainability of a model. For example, did the model perform consistently across a diverse range of people? So I would encourage, and I’ll share the link here, the data card and the model card, the framework that we have built. It has come from years of research within Google, bringing that forward to define under what conditions the model breaks and what data elements have gone in. Much like tying it back to the nutrition label: what are the protein, sugar, carbs? Same thing. What is the diversity of the data?
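To make the “nutrition label” analogy concrete, a model card can be pictured as a small structured record; the sketch below is illustrative only, with hypothetical field names, and does not reproduce the schema of Google’s published Model Cards framework.

```python
# Sketch: an illustrative "nutrition label" for a model, loosely inspired by the
# model card concept. Field names are hypothetical, not Google's published schema.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelCard:
    name: str
    intended_use: str
    training_data_summary: str                                    # what went in (the "ingredients")
    demographic_breakdown: dict = field(default_factory=dict)     # e.g., share of data per subgroup
    performance_by_subgroup: dict = field(default_factory=dict)   # e.g., AUC per subgroup
    known_failure_modes: list = field(default_factory=list)       # conditions where the model breaks

    def to_json(self) -> str:
        """Serialize the card so it can ship alongside the model."""
        return json.dumps(asdict(self), indent=2)

# Example usage (all numbers hypothetical):
# card = ModelCard(
#     name="ecg_risk_v1",
#     intended_use="Adjunct risk screening; not a standalone diagnostic.",
#     training_data_summary="1.2M ECGs from three US health systems, 2010-2020.",
#     demographic_breakdown={"female": 0.52, "over_65": 0.31},
#     performance_by_subgroup={"female": 0.81, "male": 0.84},
#     known_failure_modes=["paced rhythms", "pediatric patients"],
# )
# print(card.to_json())
```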
Aashima Gupta (00:37:21):
And the final, third point, an important one: we have seen that despite the growing recognition of AI and machine learning as, in some cases, even a crucial pillar of digital transformation, there are significant bottlenecks in the effective operation of machine-learning platform-building. Some staggering stats I’ve heard: only one in two organizations has moved beyond pilots and proofs of concept to scale. So how do you make the adjustment when a new set of data has come in? How does that change your machine learning data pipeline? Across industries, if you look 10, 15 years ago, DevOps or DataOps were brand new disciplines. Similarly, we are now seeing machine learning ops, and Google Cloud is building those tools and bringing that operationalization of machine learning, data labeling, data pipelines, repeatability of the model, retraining of the model, because without this tooling, the intent of retesting the models becomes too painful to follow through on.
Aashima Gupta (00:38:31):
So this is the domain. If you think about building the model, you need to think about AI operationalization. Those are the three things: having a common set of principles and making sure your products and partnerships go through that AI review consistently, in a repeatable fashion; second, explainability with model cards and data cards, and sharing that as a community resource; and third, machine learning operations. Those are all three vectors, but this problem is much bigger than Google alone, right? This is where partnership with academia, with customers, with the ecosystem at large will be important.
Suraj Kapa (00:39:16):
Great. And with everything you said, I would actually ask you, Brian: when we think about the Coalition for Health AI, it seems like, Aashima, the great [inaudible 00:39:26] work is a foundation for a lot of what you were getting at regarding what the Coalition for Health AI needs to think about and talk about as far as guardrails. But how do you envision that ultimately translating into policy, into standards, as opposed to being just general talking points about best standards?
Aashima Gupta (00:39:48):
Well, it’s-
Brian Anderson (00:39:48):
Great question. Oh, go ahead, Aashima.
Aashima Gupta (00:39:53):
No, I was going to say, Brian can help answer it.
Brian Anderson (00:39:55):
So part of building any kind of standards effort and movement involves a coalition of the willing and a critical mass of implementers. Aashima’s example about the model cards is a great one that started with Google and has grown to be beyond Google, just as she’s described. I believe, Aashima, it’s an open-source, freely available approach and technical framework. And so, in CHAI, what we are attempting to do is pull together, on the industry side, stakeholders like Google, like Microsoft, and if you saw the news that Dr. Horvitz shared yesterday, Microsoft’s recent publication on their AI code of ethics, taking those bodies of work that organizations are developing and publishing across the industry, and then looking to develop the kind of agreeable framework that we can all as a coalition say, “Yes, this is what we are going to move forward with to address data bias, or to address human application bias. Or this is how we are going to approach testability or promote transparency in algorithm development.” And then to have that real technical framework that is implementable in industry.
Brian Anderson (00:41:18):
And it’s that critical part of actually implementing these technical frameworks that then provides the iterative feedback into a standards process that builds the kind of coalition and adoption curve you like to see. Look back, in the health IT space, at how US Core became USCDI: it started with four or five companies partnering together to do something very similar to what we’re trying to address when it comes to bias in model development.
Brian Anderson (00:41:49):
Now, you asked, “Where does government come into play?” Well, if we again rewind the clock and look at how US Core and USCDI came about, it started in the private sector. It started with academia and private sector industry coming together. And then the government said, “Wow, what’s going on here? This is something really remarkable. This is something that we can also get behind, where we may be able to offer some input and some advice on the equities and concerns from a government standpoint.” Within CHAI, what we’ve tried to do is create a coalition of equal partnership across those three categorical areas, industry, academia, and government, where the government has a seat at the table, understanding where these implementations, these technical frameworks, are being developed and actually used in real-world implementations, so that when they begin thinking about how to regulate this, they can, in their internal conversations, appropriately develop a much more informed process with which to create regulation, and again provide the regulatory guardrails for all of us.
Brian Anderson (00:43:05):
And so, when we think about the FDA’s inclusion in CHAI, or the ONC’s inclusion in CHAI, and I don’t want to speak for either of them other than to say ONC’s traditional scope of responsibility has been in electronic health records, you might imagine ONC having a very strong interest in, “Well, how do we regulate models that are deployed and used within EHRs?” FDA’s wheelhouse is really medical devices and software as a medical device, so they would be very interested in the kinds of tools being developed from a human subject standpoint, tools that will have potential effects on humans, and the regulatory framework they need to have in place for that. So having them at the table, working together, having the informed conversations each group may need to have, internally within themselves as a government when talking about regulation, or openly with the private sector, that’s the kind of community we’re hoping to create, so regulations aren’t created without insight into where the private sector is going and the considerations that many smart people at Google, Microsoft, and other companies are laboring to develop and think about. That’s the hope of CHAI and where we’re going, Suraj.
Suraj Kapa (00:44:29):
Wonderful. And I’ll actually turn that over to you, Daniel, because obviously, we’re talking about Google and Microsoft, two huge organizations, and you have a very unique vantage point in digital health, running Exponential Medicine, your role at Singularity, and in the venture space, working with startups, everything from early stage to mid stage to late stage, where the mentality tends to be more “build fast, grow fast.” And now we’re thinking about guardrails and bias mitigation. How do you see this dovetailing with that mentality and integrating into this broader private industry mindset?
Daniel Kraft (00:45:11):
Well, I think that’s a really important question. I think the challenge is, often there are well-meaning guardrails from the past that haven’t operated so well. HIPAA is a bit antiquated; it’s still analog-era regulation in a digitally connected age. And we’ve seen examples. I’ve seen patients die because we were waiting for the HIPAA sign-off to get their EKG sent over from another hospital. So I think there’s probably some middle ground. As I mentioned earlier, perfection being the enemy of the good, we still need to enable startups, hopefully within good context and in good faith, to collect, leverage, and learn from data, but also have the responsibility to do that in ways where they can optimally share it, and understand where there may be a role to educate startups, academics, and others, even clinicians today, about where the biases may occur.
Daniel Kraft (00:46:02):
I think many times we’re still using, let’s say, guidelines from Framingham, which you could argue is very biased; the data comes from a subset of mostly Caucasian European nurses in Western Massachusetts. Do we get reminded in our workflow that this data is relatively unrepresentative? It’s interesting, with Microsoft and the Googles represented, and the Amazons around the table, and Apple, the consumerization of access to not only leveraging your own data but being able to share it is now here. I’m part of the All of Us trial; I signed up as a data donor. I tried to sign up for Verily’s Baseline, but maybe I’m too represented as a Bay Area based physician. But to better answer your question, I think there just needs to be some give and take, not waiting for it all to be lined up, and enabling the startups to be part of that conversation along with the regulators.
Suraj Kapa (00:47:05):
Great. And there’s a question from the audience: how does this human bias apply when we’re talking about recorded data? I think this gets to a large portion of the question when we talk about data gathering, because we can say, “All right, fine, we’re dealing with healthcare data. Let’s take out all the subjective data. Let’s throw out the doctor’s notes, because no one knows how to read them anyway. Let’s take out what was prescribed, because there might be bias in that. Let’s just purely look at, quote unquote, objective data: the ECG, the chest x-ray, the CT scan.” So maybe I’ll hand that to you, Brian. Especially in your life as a physician, as a clinician, do you see an element of potential bias in primarily and solely focusing on the quote unquote objective world of data within medicine? Do you see fallacies within that thinking, or what’s your perception on that?
Brian Anderson (00:48:02):
That’s a good question. So when I think of human bias, or application bias, it’s not necessarily tied to the objective data itself. It’s tied to the application of a model. To take this EKG example a little bit further: the EKG data, you’re right, is objective. It has its waveform, and analysis can be applied to it. And if we use the Mayo example, there’s a model developed that predicts any number of cardiovascular endpoints that would be of concern. Human bias, for me, means: is an individual actually going to apply a model that would be appropriate, or perhaps inappropriate, for the individual whose waveform has been captured? And so, it doesn’t have so much to do with the actual objective data being captured as it does with the application of models at the end user level.
Brian Anderson (00:49:19):
So perhaps that’s not exactly answering your question, Suraj, but getting to the objective data within the EKG: when I think about the human biases associated with it, it’s that human in the loop deciding what they’re going to do with that data, and whether or not they’re going to take a model, appropriately so or not, and apply it to that objective data. The objective data in and of itself is what it is. There are certainly types of biases we can talk about from a data gathering standpoint, but from an EKG standpoint, I guess that’s how I would approach it. But I certainly welcome Aashima’s or Daniel’s other thoughts on that.
Aashima Gupta (00:50:02):
No, Brian, great point. I wanted to also mention that this isn’t only manifested in AI, in responsible AI; it’s a question of human-centered design, right? Humans are at the center of technology design and decisions, and humans have not always made decisions that are in line with everyone’s needs. So I’ll offer two examples. First, until 2011, auto companies were not required to use crash test dummies that represented female bodies. As a result, women were 47% more likely to be seriously injured in a car accident than men. This is about human-centered design creating that diversity. The second is adhesive bandages, which were primarily made with white skin tones as the benchmark, forcing everyone with darker skin to just go with that. To me, there’s no AI there, right? This is about the design of the product, which needs to be more inclusive. So as we create new systems, the amplification with AI is far greater and far more profound, but it goes back to the basics of diversity and inclusion and human-centered design as we think about creating these products.
Suraj Kapa (00:51:28):
Daniel, any additional comments?
Daniel Kraft (00:51:32):
I love that comment about design thinking. I think it still comes down to how you design the workflow and the insights for the clinician or the engaged human consumer patient to see where the challenges are, and maybe design better solutions. I think none of us as clinicians want more data. We want actionable insights, and how those get presented could start to inform us about the opportunity, with the patient in front of us, to have them opt in, in appropriate ways, to share data if they’re a member of, let’s say, an unrepresented set, all the way to, again, highlighting, as mentioned before, where there might be bias, and giving you that sort of little check engine light: you’re on the path of managing this patient like you would the average European, when they have a lot of other elements, from their socioeconomic to their genetic determinants. So I think that design piece is key, as well as how to engage people in sharing and opting in. We have these long legal forms about sharing data, and there are all these new ways to manage it, with blockchain and beyond. How do you explain that in smart ways to the folks who are being asked to hopefully share and opt in to contribute some of their data and knowledge?
Aashima Gupta (00:52:56):
Right. And then, one thing to add there, Daniel-
Suraj Kapa (00:52:56):
Yeah. And I mean, I like a lot of what Aashima said. In my former life as an academic clinician, I did a lot of research and a lot of looking at just a general understanding of what data shows us. And it’s interesting that we don’t know what we don’t know. What I mean by that is, even if you look at something as simple as the form factor of the electrocardiogram, which has been around since the 1800s, the reality is that normal ranges on the electrocardiogram vary by race and by gender. Most people were never taught that in medical school, not to mention residency or even fellowship or sub-fellowship, but there are papers that look at racial differences in the normal values on ECGs, and where applying a reference range traditionally derived from, say, a non-Hispanic Caucasian population to an Asian population actually results in increased mortality.
Suraj Kapa (00:53:49):
So it actually raises a key point when we talk about this design aspect: not just the design of the technology itself, but even the definition of “normal” within that design can have potential ramifications downstream. And some of the work that I had done previously with colleagues was looking at, can we predict, for example, COVID from an ECG? We very quickly realized we would need an extraordinarily diverse global dataset in order to do this. We tried our best to get as much data together, and there was some degree of reasonable predictive accuracy, with an AUC in the 0.8 range. However, there was a huge limitation in getting all that data together. And that’s some of the research work that TripleBlind has been doing with cardiology colleagues at Mayo: seeing how a privacy-preserving approach can help in terms of getting a broader dataset, so we’re not only training on population A, B, or C and thus potentially limiting ourselves, because the normals aren’t necessarily globally normal when we think about varying populations.
Suraj Kapa (00:55:04):
But as a next question, maybe I’ll hand it over to you, Daniel. When you look at the sphere in which this is being discussed, I mean, you speak all over the world at various conferences focused on the digital economy, digital health, et cetera. How big of a role do you see bias in AI playing? Is it something that’s been getting increasing focus over the years? Has it always been a focus from the very start, or do you think it’s something that really needs more attention in digital health circles?
Daniel Kraft (00:55:41):
I think maybe in some of the insider circles it’s well understood. I was at the AIMed conference, and I think there were several panels and others that definitely hit on the topic. But for, let’s say, the average physician, is it even part of medical school education or CME? Is it in the zeitgeist? I don’t necessarily think so. I think there’s a lot of excitement about the buzz of big data and AI machine learning, and applying that everywhere from radiology to pathology, but maybe not enough recognition that we have a long way to go, whether it’s with genetic datasets or others. So it’s something I’ll try to integrate more when I interact with folks, and I’m taking a lot from the session here. I think there’s a big need to understand and describe the challenge and the opportunity, and where we might need to go next.
Suraj Kapa (00:56:28):
And to that point, Brian, do you see that as part of the role of the Coalition for Health AI, to focus not just on the guardrails but on the education, or do you think that needs to be more of a focus? And then I’ll hand the same question over to you as well, Aashima.
Brian Anderson (00:56:44):
Well, certainly. So within CHAI, the mission and the vision is to create a more trustworthy and transparent way of implementing and using AI. And as Daniel described, certainly in my medical education there was no health AI class or even digital health class. It may date me somewhat, but… So physicians certainly don’t have the kind of understanding of or trust in these kinds of algorithms, right? Many of them, and Aashima, you can probably speak much more knowledgeably to this than I, deep convolutional neural nets, are they ever really fully explainable? How do we communicate with and educate providers in such a way that they can understand them and trust them? How do we create the layers of transparency around them so that a physician can use them in a way that they trust? So yes, I think the short answer, Suraj, is that the goal of our coalition is to create a framework that promotes transparency and trustworthiness by addressing things like bias, by creating an open framework that is easily adoptable for validation or testability purposes. Because as for fully explaining a deep convolutional neural net, I’m going to defer to my colleagues at Google to help me with that one; that’s not something I’ve been able to do yet. So it is a real challenge, Suraj.
Aashima Gupta (00:58:21):
Right. And to add to that, Brian: as a tech company or tech provider, especially wearing my Google Cloud hat, we can build the explainable AI tooling or responsible AI, but it needs ecosystem participation. And one thing I would mention is the culture of support, right? These are hard conversations industry-wide, and oftentimes a top-level mandate is necessary, but it’s not sufficient. Part of this comes from training on what you touched upon, tech ethics, in order to actively connect ethics to technology that might otherwise be believed to be value-neutral. I see a big gap there; I believe there’s a new discipline around tech ethics that needs to come into all strata of the ecosystem, be it a startup, a big health system, or a tech company. It’s not always comfortable to talk about how these transformative technologies can be harmful, but it is critical to build that community by modeling it, even when it’s difficult. And that’s the approach we are taking, both from the tooling perspective and the principles perspective, but it will need that participation, Brian. Otherwise, we will… And I don’t know how many organizations today have their tech ethics scores, as an example, for the developers on the keyboard, the designers, and are taking that into account when they’re creating a new product or project, right? It’s not about-
Brian Anderson (00:59:54):
But Aashima, I think one of the things that you’ve done really well at Google is that you haven’t fallen into the trap of “How do we just explain a convolutional neural net?” I would assert that that is very difficult, if not impossible, to do for the non-tech-savvy physician out there. What you’ve created is a transparent framework by which that end user can understand, “Is this model appropriate for me to use?” based on the nutrition label framework, the model cards that Google’s developed. So I think we need to have different ways of thinking about how we educate the healthcare ecosystem, Suraj, and physicians in this space, because explaining a convolutional neural net in such a way that a doctor says, “Ah, of course, I understand that,” is going to be really hard.
Suraj Kapa (01:00:47):
I mean, I’ll take that a step further, Brian, and say that it’s more than explaining a convolutional neural net. Within my field of cardiac electrophysiology, we have a calculation that’s a very simple numeric addition to calculate stroke risk, called the CHA2DS2-VASc score. But I bet you, if you asked almost any cardiologist in the country, “How was that CHA2DS2-VASc score created?”, in the context of what the reasoning was for every component of it, and what level of value we put on the reasoning behind each component, I bet there are very, very few people in the country, in cardiology, even the best cardiac electrophysiologists in academia, who could actually opine on exactly what those nuances are to the degree that another person would feel, “Oh yeah, that makes every sense in the world.” And it becomes a question of, how much explainability is required, and is it the fact that a computer’s just spitting out an answer, as opposed to, “I added up a bunch of risk factors on my own, and I trust the fact that this means you’re at a high risk for an event”? So yeah, you were going to say something, Brian.
Brian Anderson (01:02:03):
I was going to say that’s very fair. I don’t know if the CHA2DS2-VASc was based on odds ratios or how the different features were weighted, but it’s a fair point. And maybe to take that a step further, what you’re really saying is that algorithms and models may just become so readily accessible and adopted that, in the future, we won’t have that kind of need to overcome the mistrust of computers just spitting out answers. That may be true, but it’s certainly not the case now. And so, in terms of how we educate people now, I think things like what Google is doing, what you’re doing at TripleBlind, and what other organizations are doing are really some of the initial steps we need to take.
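For readers outside cardiology, the score being discussed really is a simple weighted sum; below is a sketch of the commonly published CHA2DS2-VASc point assignments, illustrating the “I added up a bunch of risk factors” point.

```python
# Sketch: the CHA2DS2-VASc stroke-risk score is literally a sum of weighted risk factors.
def cha2ds2_vasc(chf: bool, hypertension: bool, age: int, diabetes: bool,
                 prior_stroke_tia: bool, vascular_disease: bool, female: bool) -> int:
    score = 0
    score += 1 if chf else 0                              # C: congestive heart failure / LV dysfunction
    score += 1 if hypertension else 0                     # H: hypertension
    score += 2 if age >= 75 else (1 if age >= 65 else 0)  # A2/A: age >= 75 (2 pts), 65-74 (1 pt)
    score += 1 if diabetes else 0                         # D: diabetes mellitus
    score += 2 if prior_stroke_tia else 0                 # S2: prior stroke / TIA / thromboembolism
    score += 1 if vascular_disease else 0                 # V: vascular disease (MI, PAD, aortic plaque)
    score += 1 if female else 0                           # Sc: sex category (female)
    return score                                          # possible range: 0-9

# Example: a 72-year-old woman with hypertension and diabetes scores 1 + 1 + 1 + 1 = 4.
print(cha2ds2_vasc(chf=False, hypertension=True, age=72, diabetes=True,
                   prior_stroke_tia=False, vascular_disease=False, female=True))  # -> 4
```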
Aashima Gupta (01:02:51):
Yeah. And to add to that, that’s why we are very deliberate that the model cards and the data cards are technologies that are means to an end. They’re not intended to be a Google product, but a shared, evolving framework with a range of voices. So we invite academia, we invite health systems, to really get it right, and collaboration is going to be key there. We also have our own examples. For example, the Cloud Vision API, with face detection and object detection: we provide the model card for our own AI algorithm, showing how the model deals with its inputs, what the key limitations of the model are, and basic performance metrics. So we are creating enough examples with our own products, like the Cloud Vision API, where you can detect an object. But that’s one example. As we develop different algorithms, you can see how we have applied this, but taking that to healthcare will need that work. And much of the work, Brian, that you and your team are doing is bringing this range of voices into that conversation.
Suraj Kapa (01:04:05):
Great. And I know there’s only three minutes left, so I’ll ask [inaudible 01:04:08].
Daniel Kraft (01:04:08):
[inaudible 01:04:08].
Suraj Kapa (01:04:10):
Sorry. Go ahead, Daniel. Go ahead. Please go ahead, Daniel. Sorry, [inaudible 01:04:17].
Daniel Kraft (01:04:16):
Well, I was just going to raise a point; hopefully you can hear me. I was just thinking about other elements around bias, et cetera, which is to better unlock all the different silos and align incentives for the folks who share them, whether it’s between EMRs or health systems or hospital systems, et cetera. Because it does seem like many times the data may exist, and the bias could be minimized by better collaborating and connecting the dots between datasets, which maybe TripleBlind or others help modulate. That just seems to be part of the solution [inaudible 01:04:48].
Suraj Kapa (01:04:51):
Great. And we’re in the last three minutes, but I have one last question in this frame, and anybody’s more than welcome to answer. Do you all feel, when we start thinking about these guardrails, these standards to mitigate the bias issues, that the technological solutions to actually address the guardrail needs, the nutrition label needs that John talks about all the time, are there in an adequate way? Or do you think there needs to be more development and more growth to realize the right tool sets to address these issues? Maybe I’ll hand it to you first, Aashima.
Aashima Gupta (01:05:29):
I would say we need to take a very humble approach. AI is changing rapidly, and our world is not static. We have to consciously remember that we are always learning; while it’s an unreachable goal to create a perfect product, there are always ways we can improve, and we will make difficult decisions over time. So yes, the tooling is there, it’s not perfect, and I think that’s where humility is important. We are still learning about this, and definitely about applying it in the healthcare setting.
Brian Anderson (01:06:07):
My thoughts on that: I agree with everything Aashima and Daniel have said. For me, at a technical level, it really comes down to specific, use case driven implementations, Suraj. And I don’t think we’re there yet. We have some frameworks, we have some codes of ethics developed, which I would say are broadly the infrastructure or the architecture by which we can then actually implement real technical standards in a transparent way. That use case driven effort is hard. There’s a long tail to it, and it requires a community that’s willing to invest time and effort to do it. And Aashima’s right, there are evolutions in the field of AI occurring daily. So building a coalition to actually implement a framework in a use case driven manner is hard, but I think it’s certainly necessary. And then building that adoption curve up in each of those use cases, that’s where some of the real work needs to happen at the next stage. So many organizations have been working to collate and publish very worthy and notable places to start from, but a use case driven approach that is applicable to the startups Daniel was mentioning, and to the entrenched behemoths like Google and Microsoft and Epic and others, really needs to be, I think, where we go next.
Suraj Kapa (01:07:44):
Great. So we’re at time, but I really appreciate the excellent conversation and the excellent insights on these issues related to bias, the technical solutions, and where we’re actually trying to move toward. So I appreciate the attention of all of our panelists and all of our attendees. Thanks again, and I hope you have a good day.
Description:
In this webinar, Dr. Suraj Kapa, TripleBlind’s SVP of Healthcare, will be joined by three healthcare industry leaders to discuss the current challenges around bias in big data. A common obstacle among statisticians, data scientists and AI developers stems from three challenges: data access, data prep and data bias. Collaboration between healthcare enterprises is currently regulated by global data privacy protection, such as HIPAA, which limits access to personal health information (PHI). Privacy enhancing technologies (PET) are enabling broader, diverse data engagement when it comes to data prep. But, how can you enable cross-institutional data marketplaces for secure data collaboration and genomic information while decreasing bias? How much data do we need, and is there a better approach for vetting this data?
Speakers/Guests include:
- Suraj Kapa, SVP Healthcare, TripleBlind
- Aashima Gupta, Director, Global Healthcare Solutions, Google Cloud
- Daniel Kraft, Faculty Chair for Medicine at Singularity, Chair XPrize Pandemic Alliance Task Force
- Brian Anderson, Chief Digital Health Physician, MITRE
Date/Time: Wednesday, June 22nd, 11:00am CT / 12:00pm ET