Ep. 201: Inside the Black Box: Can Neurologists Trust AI?
Show notes
Moderator: Georg Starke (Munich, Germany)
Guest: Giulia Di Rauso (New York, USA)
In this episode, Georg Starke speaks with Giulia Di Rauso about trustworthiness and the use of artificial intelligence in neurological research and clinical practice. They discuss explainability, data quality, interpretability, and human oversight in AI systems, highlighting key considerations for responsible integration of AI tools into neurology and the importance of maintaining clinical judgement.
Show transcript
00:00:00: Welcome to EANcast, your weekly source for education, research, and updates from the European Academy of Neurology.
00:00:15: Hello!
00:00:15: And welcome to EANcast: Weekly Neurology.
00:00:18: My name is Georg Starke.
00:00:20: I'm a physician and philosopher working on the ethics of AI and neuroscience at the Technical University in Munich.
00:00:27: Our topic today sits at the intersection of two things that I think about in most of my current work, namely artificial intelligence and neurological clinical practice and research.
00:00:38: In this episode, we want to address one question which sounds at first deceptively simple: can neurologists trust AI?
00:00:48: My guest is Giulia Di Rauso, a neurologist and researcher with experience using AI models in clinical research in neurology.
00:00:56: She used to work at the Movement Disorder Center of Reggio Emilia and is currently a fellow at the Fresco Institute for Parkinson's and Movement Disorders at NYU Langone Health in New York.
00:01:09: Giulia, welcome!
00:01:10: And thank you for joining us today.
00:01:12: Thank you, Georg. Very happy to be here.
00:01:15: So before we dive in, I think it would be good to set the scene a little. As our listeners of course know, AI promises to enrich neurology in many areas, right, from seizure prediction and neuroimaging analytics
00:01:30: to, let's say, closed-loop neural stimulation or AI-enabled brain-computer interfaces. And yet, despite all the excitement and other things going on, adoption in clinical practice remains somewhat slower than many may have expected.
00:01:46: And one of the points that commonly comes up when we think about this gap comes down to trust, or maybe rather its absence.
00:01:54: So I would suggest that we start here. Giulia, can you tell us a bit about how you've been using AI in your own work?
00:02:03: Sure. Just to clarify, I come from a research angle rather than everyday clinical use, and so throughout our conversation I will share my personal opinion from this perspective.
00:02:17: As you mentioned, I'm involved in research mainly in the field of neurodegenerative diseases and Parkinson's disease, with a particular focus on genetic forms of PD.
00:02:37: So just to give a concrete example, we published a study conducted at the Movement Disorder Center of Reggio Emilia whose main objective was to develop a machine learning model capable of providing a pre-estimate of GBA1 mutation status in PD patients, based on clinical and demographic features, with a high predictive value.
00:03:00: That's, I think, a really nice example to ground the conversation, because it brings up, at least for me, immediately
00:03:06: the question that I imagine many of our listeners will also recognize, right?
00:03:10: You can see the inputs and you can see the outputs, but the middle part, what the model is actually doing, can feel slightly opaque.
00:03:19: Yes, so I can speak to this quite personally, because I experienced this feeling myself, actually, especially at the beginning.
00:03:27: There were moments when I asked myself:
00:03:29: How can I actually trust this result?
00:03:33: Does it make sense, clinically
00:03:35: speaking?
00:03:37: Yeah, and I think this discomfort is somewhat widespread, right? And I want to return to that in a second.
00:03:43: But first, since we are tackling this idea of trust here extensively, it's also worth thinking about it more carefully at this stage, because this notion of trust comes up in so many contexts involving AI, from marketing to regulation on the EU level, for instance.
00:04:02: And yet there's actually something quite fundamental here if we talk about trust and not just, say, about reliability or performance measures, because when I rely on a calculator, I probably don't trust it in any meaningful sense.
00:04:18: I expect it to perform a function. If need be, I could actually also do the calculation myself by hand.
00:04:25: But trust involves something more.
00:04:28: It implies that there's some inherent risk in a complex situation, some kind of vulnerability, primarily for patients but potentially also for clinicians using these systems, for instance in terms of liability, and this adds a normative dimension to trust
00:04:46: that I find particularly interesting as an ethicist, and that I also think is deeply linked to the model opacity that you've mentioned.
00:04:56: Yes, one of the main obstacles, at least for me initially, was trusting the output, because I couldn't fully see how the model arrived at its result.
00:05:07: So, the famous black box problem. As I mentioned earlier, the first research project I worked on using AI was trying to develop a predictive model, and I remember this feeling quite clearly.
00:05:21: The model would give us a prediction, but I found it difficult to understand where that result was coming from.
00:05:29: This discussion can also be extended to the clinical context,
00:05:32: you know, not just the research one.
00:05:34: For example, when a model provides information about patient prognosis, how can you understand where the results come from and then explain them?
00:05:43: I believe this is an example of the kind of situation people have in mind when they stress the importance of explainability, interpretability, and transparency of AI systems, also when it comes to trust.
00:05:57: Yeah, I mean, I fully agree.
00:05:58: These are concepts that help address common questions such as: can I understand why this model flagged this particular patient?
00:06:07: Do you know how it was built, on which data it was trained?
00:06:11: And what I actually saw in two recent projects, where we interviewed experts on AI in neurology and in psychiatry, is that clinicians are quite pragmatic about this.
00:06:22: In a way, they don't necessarily want to understand the full architecture of the model, or at least many do not.
00:06:31: But what they do want, you know, is: tell me what kind of input data this system was trained on, show me how that output relates to something clinically meaningful.
00:06:41: And I think this is the kind of interpretability, the level of interpretability, that actually matters for clinical practice in many cases.
00:06:51: And now, as you know as someone also working in the field, a whole array of explainability tools, what researchers often call XAI, has of course improved substantially in the past years, and you can use techniques like SHAP to show which features drove a particular prediction.
00:07:10: And also, I think this is really useful for understanding how models work, especially at the development stage.
00:07:18: But something that the experts I spoke with also flagged here is that explainability, if it's not used properly or is misused, can become a kind of fig leaf to cover up deeper problems such as biased training data, limited external validation, lack of accountability, and so on.
00:07:38: And I would say that if we just say this model is explainable, therefore you can trust it,
00:07:47: that's almost certainly too simple an answer.
00:07:51: This is, again, only considering the side of the model, but there is also the human side of actually understanding these explainability measures, to make sense of them.
00:08:00: So when AI tools arrive in a clinical context without adequate education and training for the people who are supposed to use them, they wouldn't necessarily know how to properly interrogate them.
00:08:13: So you could have a perfectly interpretable model, I think,
00:08:17: but if clinicians haven't been taught what questions to ask here, this kind of interpretability may even be of limited use to them in practice.
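To make the feature-attribution idea behind SHAP (mentioned at 00:06:51) more concrete, here is a minimal sketch that computes exact Shapley values for a toy risk score by brute-force enumeration. Everything in it is an assumption made for illustration: the function `toy_risk_score`, the feature names, the coefficients, and the baseline patient are hypothetical and are not taken from the study discussed in the episode. Real-world tooling such as the `shap` library relies on approximations and optimized algorithms rather than this exhaustive loop.

```python
from itertools import combinations
from math import factorial


def toy_risk_score(features):
    """Hypothetical hand-written score, NOT the model from the study."""
    score = 0.02 * (70 - features["age_at_onset"])
    score += 0.5 * features["family_history"]
    score += 0.3 * features["cognitive_impairment"]
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values: features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)
    values = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                # Standard Shapley weight for a coalition of this size.
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_f = {
                    f: instance[f] if (f in subset or f == name) else baseline[f]
                    for f in names
                }
                without_f = {
                    f: instance[f] if f in subset else baseline[f]
                    for f in names
                }
                total += weight * (model(with_f) - model(without_f))
        values[name] = total
    return values


# Hypothetical patient and reference ("baseline") patient.
patient = {"age_at_onset": 50, "family_history": 1, "cognitive_impairment": 0}
baseline = {"age_at_onset": 60, "family_history": 0, "cognitive_impairment": 0}

contributions = shapley_values(toy_risk_score, patient, baseline)
for feature, value in contributions.items():
    print(f"{feature}: {value:+.3f}")
```

The contributions add up to the difference between the patient's score and the baseline score, which is the property that makes such attributions readable. As the conversation stresses, an attribution like this only says which inputs moved the prediction; it says nothing about whether the training data were representative or the model was externally validated.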
00:08:27: Yes, you raise an important point.
00:08:29: Thank you, Georg.
00:08:30: I think AI is a tool like any tool.
00:08:39: In my opinion, it would be important to provide any neurologist who deals with AI
00:08:44: with basic education and training in this field, so that they can better understand its strengths and limitations, interpret its results, and then use it in an informed and responsible way.
00:08:57: I think that education and knowledge are fundamental for building informed, well-grounded trust.
00:09:10: ... explainability, interpretability, and transparency of AI models and their relation to trust.
00:09:14: But from your perspective as both a researcher and a clinically working neurologist, what else do you think would be helpful
00:09:23: to build some form of justified trust in AI in neurology?
00:09:28: In my opinion, an essential aspect is also recognizing the key roles that humans play in AI systems.
00:09:36: So not just as users of the output but as active participants at every stage. This is important to build justified trust and fully acknowledge our responsibility in this process.
00:09:53: In addition, I think that by actively participating in this process we can also make the system itself more trustworthy.
00:10:02: Let's take a step back and schematically think of AI processes as a pipeline. Humans provide data, then the model processes it to generate an output, and then humans interpret and evaluate the results within their clinical context.
00:10:18: So, humans are present at both ends of the pipeline and play an active role throughout this process.
00:10:25: I think that this pipeline also aligns with several key points essential for improving the trustworthiness of AI systems while building informed trust.
00:10:38: That sounds very reasonable!
00:10:39: Could you maybe guide us through what you have in mind here?
00:10:42: Yes, let's start with the input.
00:10:46: First, and absolutely fundamental, we need to ensure the quality of input data.
00:10:52: If they are inaccurate or biased, then the output will be unreliable, regardless of how sophisticated the model is.
00:11:00: There's a well-known saying, "garbage in, garbage out", and I think it fully applies here.
00:11:05: Even the most advanced model will produce unreliable results if the data it is trained on are biased.
00:11:12: And second, clear and well-defined clinical questions that justify the use of AI.
00:11:19: AI is not always the right tool; it shouldn't be applied simply because it's available.
00:11:25: The question should drive the model, not the other way around.
00:11:29: So to trust the final output, we first need good-quality data and relevant questions as a basis for trustworthy systems.
00:11:40: Third, of course we need to consider transparency, explainability and clinically meaningful interpretability as mentioned before.
00:11:49: And finally, when it comes to the output, it is essential to maintain a critical stance, not relying on it in an unconditional or uncritical way. Even with good data,
00:12:01: well-designed models, and interpretability tools, you must keep a critical perspective.
00:12:06: We need to ask whether the output makes sense and whether it is appropriate for our specific clinical question.
00:12:13: This helps us prevent over-reliance on algorithmic results and ensures that clinical judgment remains central.
00:12:21: Moreover, adopting a critical approach can also help uncover potential limitations of the model and ultimately contribute to improving it.
00:12:32: What do you think about this?
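The first point in the list above, input data quality, can be sketched as a simple pre-training audit. This is only an illustration under stated assumptions: the function `audit_dataset`, the field names `age` and `site`, and the 10% representation threshold are all hypothetical choices, not something described in the episode. A real audit would cover far more (label quality, measurement protocols, demographic coverage, and so on).

```python
from collections import Counter


def audit_dataset(records, required_fields, group_field, min_group_share=0.10):
    """Report missing-value fractions and groups below a minimum share.

    Hypothetical helper: the threshold and field names are illustrative only.
    """
    n = len(records)
    if n == 0:
        raise ValueError("records must not be empty")

    # Fraction of records in which each required field is missing (None).
    missing_fraction = {
        field: sum(1 for r in records if r.get(field) is None) / n
        for field in required_fields
    }

    # Groups (e.g. recruiting sites) that contribute less than the minimum share.
    counts = Counter(r.get(group_field) for r in records)
    underrepresented = sorted(
        (group for group, count in counts.items() if count / n < min_group_share),
        key=str,
    )

    return {
        "missing_fraction": missing_fraction,
        "underrepresented_groups": underrepresented,
    }


# Hypothetical toy data: 19 records from one site, 1 from another, 2 missing ages.
records = [{"age": 60 + i, "site": "site_A"} for i in range(19)]
records.append({"age": None, "site": "site_B"})
records[0]["age"] = None

report = audit_dataset(records, required_fields=["age"], group_field="site")
print(report)
```

A report like this does not fix biased data, but it turns "garbage in, garbage out" into a concrete question that can be asked before any model is trained.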
00:12:35: Yeah, I think you're absolutely right about stressing these points, and it actually also helps me to drive home one point that I really believe can't be stressed enough here, namely
00:12:43: that what ultimately matters is not that we just promote trust in AI for the sake of increasing trust.
00:12:50: Trust can also be misplaced, with potentially really dangerous consequences.
00:12:55: So what we actually should aim for is making systems trustworthy,
00:13:00: to have, as you mentioned, better data, better validation, better explainability tools, and so on. But at the same time, we also mustn't forget that there's this genuinely human side to trust, right?
00:13:11: I think that uptake of these tools will almost certainly stall.
00:13:29: So in my view, it's for instance absolutely crucial that we involve all relevant stakeholders early on, including both patients and clinicians, in the development of these AI systems, in some kind of co-design process ideally.
00:13:44: And my worry, but this is of course the ethicist speaking here, is that instead of focusing on properly building trust, we sometimes just aim to increase trust, through marketing for instance, without also making sure that there is underlying trustworthiness
00:14:02: that's grounded in this kind of thorough process. And of course it may speed things up at first, but if we do so, we actually risk misplacing trust later on, and we also risk that things go really wrong and that we disrupt existing relations of trust in medicine, most importantly between patients and physicians. And I think these will be quite hard to patch later on.
00:14:27: So, this is something that's really worth stressing here.
00:14:33: Thank you, Georg, for sharing your perspective!
00:14:36: I think we have highlighted a crucial point.
00:14:39: It isn't just about increasing trust but also about ensuring the systems are trustworthy.
00:14:45: But from your point of view, what do you actually think should be improved?
00:14:50: That's a big question, I think.
00:14:54: With regard to trust, it really matters whom we have in mind when we talk about trust.
00:14:58: Most discussions typically focus on individual clinicians or individual patients in our Western countries.
00:15:06: So do you, as a neurologist or potentially as a patient, trust this specific AI-enabled tool?
00:15:14: But trust is also public and institutional and, to some extent, also a global phenomenon.
00:15:21: And I think each of these levels deserves attention, because at the public level we've seen in recent years how quickly trust can erode, also in the context of medicine, not necessarily because a technology failed but because of how it is governed and how it is communicated.
00:15:39: And in the field that we are talking about,
00:15:42: controversies around companies like Neuralink are a good example. Here it wasn't necessarily the technology itself, but rather opaque trial reporting or perceived conflicts of interest,
00:15:56: or the perception that private actors were moving faster than accountability structures could keep up with.
00:16:02: And once that kind of skepticism takes hold in the public imagination, in the public perception of a field, it doesn't necessarily stay contained there.
00:16:12: It spills over and casts doubt on adjacent areas, on the entire field.
00:16:19: And that, I believe, matters for every neurologist who wants to use AI responsibly in their clinic, because your patients don't arrive as blank slates.
00:16:29: They arrive with whatever they've read, what they've heard or experienced.
00:16:35: Second, at the institutional level, I think we also need institutional safeguards: systematic auditing of AI tools in deployment, for instance, clear accountability structures when a model fails, and also governance frameworks that go beyond simply ticking regulatory boxes.
00:16:55: Within Europe,
00:16:56: the EU AI Act is a meaningful step here, as it mandates human oversight and includes transparency requirements and post-market monitoring for high-risk systems, such as AI used in neurology.
00:17:13: And also on an even more international level, there's increasing attention to the fact that this intersection of AI, neuroscience, and neurotechnology requires specific regulation, as mirrored in last year's recommendation on the ethics of neurotechnology from UNESCO, which you may be familiar with.
00:17:37: But I want to be clear: these frameworks are important, and they can create conditions under which trust can flourish.
00:17:44: But they cannot create these complex and also somewhat vulnerable webs of trust themselves.
00:17:52: And last but not least, as a third point, on this more global level, there's a dimension that I think is often, unfortunately, underappreciated, which is that most of the AI tools being developed and validated in neurology are actually built on data from a relatively narrow slice of the world, predominantly well-resourced Western clinical settings.
00:18:19: But if those tools are then deployed globally, or if they shape the standards by which AI is evaluated on a global scale,
00:18:28: we have a trust problem
00:18:29: that isn't just a clinical problem but is also really a justice problem.
00:18:35: If you add to this the huge environmental costs of training AI models, which may disproportionately affect people in other parts of the world, this problem of justice, I believe, becomes even more pronounced.
00:18:51: Now, these questions are not entirely new problems, right?
00:18:54: We see similar issues with much medical progress, but with AI they become potentially even more far-reaching.
00:19:03: I want to conclude on a more positive note, because among all these concerns, I think it's also crucial that we don't lose sight of the fact that there is real potential here.
00:19:14: AI tools that can predict or even prevent seizures, detect neurodegenerative diseases early on, or support us in finding the best therapy for individual patients are genuinely great.
00:19:27: But this potential is why, in my view, it's really worth getting questions about trust right, precisely because there's so much at stake.
00:19:37: Giulia, any final thoughts?
00:19:41: Yes, I believe it is important, as I mentioned earlier, especially given the increasing use of AI systems, to stay informed and educate ourselves, so that we can approach them as neurologists in a more informed and responsible way, avoiding both prejudice and uncritical trust.
00:20:01: Yeah, with this I really agree.
00:20:02: And in fact, it seems like a perfect way to end today's session, as it shows that we are not completely passive recipients of the technology but also have some degree of agency, and with this also some responsibility, to shape its integration into our work as far as we can.
00:20:25: Thank you so much, Georg.
00:20:27: Thank you, everyone, for listening.
00:20:29: No, thank you, Giulia, and yeah,
00:20:31: thank you all for listening.
00:20:33: If this topic interests you, there will also be more content on EANcast this month on AI, so stay tuned!
00:20:57: You can also listen to this and all of our previous episodes on the EAN Campus. EANcast: Weekly Neurology is your unbiased and independent source for educational and research-related neurological content.
00:21:34: Although all the content is provided by experts in their field, it should not be considered official medical advice.