Problems Worth Solving
Technology doesn’t transform services. People do.
Problems Worth Solving brings you conversations with the leaders, practitioners, and radical thinkers reshaping health, care and support services. It's hosted by Sam Menter, co-founder of Healthia (www.healthia.services).
From transformation and AI to prevention and human-centred design, each episode uncovers the ideas and experiences behind lasting change.
Guests include NHS directors, policy shapers, entrepreneurs, clinicians, and designers — all united by a drive to solve complex problems.
Listen if you would like to understand how health systems can evolve to meet today’s pressures and tomorrow’s possibilities.
Dr Lia Ali: The fictional binary human
At a radiology AI conference, a clinician described a simple interface choice: an agree/disagree button for feeding back on AI diagnoses. It looked clean and efficient. But it assumed something fundamentally untrue: that clinicians experience a binary internal state when reviewing AI outputs. In reality, what they experience is uncertainty, context, fatigue and gut instinct. And when you force all of that into a binary choice, you don't just create a poor experience. You distort the data training the model.
In this episode, consultant psychiatrist Dr Lia Ali argues that behaviour isn't a surface layer in health technology — it's infrastructure. Drawing on experimental psychology and her work at NHS England, she makes the case that most health AI is built for a version of human cognition that doesn't exist. She calls this the fictional binary human, and she thinks it's one of the biggest unspoken risks in health technology today.
Problems Worth Solving is brought to you by Healthia, the collaborative service design consultancy for transformation in health, care and public services.
Find out more about our work at healthia.services.
Humans Aren't Rational, and Designers Know It
Sam: Good designers have always known that humans aren't rational. Don Norman made this case almost 40 years ago, and as a consequence, in health we've built sophisticated frameworks for understanding behaviour. For example, COM-B maps capability, opportunity and motivation so we can design services that actually change what people do. So when a psychiatrist says we're designing health AI for a version of human cognition that doesn't exist, your first instinct might be: yes, we know, but we've got tools for this. But those tools were built for a world where the interface serves the user. What happens when the interface is also a data collection instrument, and every interaction shapes what the AI does next? You're no longer just designing for experience; you're designing the quality of the intelligence itself.

I'm Sam Menter, founder and managing director at Healthia, the collaborative service design consultancy. If you enjoy listening, you can subscribe to this podcast and the accompanying newsletter at healthia.services.

In this episode, I'm joined by Dr Lia Ali. Lia is a consultant psychiatrist at South London and Maudsley and a clinical advisor at NHS England's Transformation Directorate, where she works on the clinical design of digital services, including the NHS app and shared care records. Today we're talking about something spotted at a radiology AI conference that stopped Lia in her tracks: a simple agree/disagree button designed to capture clinician feedback on AI outputs. On the face of it, it sounds logical and intuitive, but Lia argues that it exposes a deep problem: we're designing AI systems for a version of human cognition that doesn't really exist. She calls this a behavioural architecture problem. Lia, welcome back. Thank you for joining me again. What's going on here, and what's the problem that needs solving?
Bringing Psychologists Into Product Teams
Lia: So when I attended this radiology AI conference, I was sitting in the audience listening to radiologists and technical people describing problems they have with the development of their algorithms, and actually with their entire businesses: how they grow and how they succeed. A particular problem one of the presenters talked about was a situation in which their algorithm required clinicians to agree or disagree with the diagnosis the algorithm had made, which helped with calibrating that algorithm and its future development. And the users, the clinicians, had really significant problems with this. The reason was that they were being asked to make a binary decision, agree or disagree, but the reality was actually quite uncertain. They might be thinking things like: I think that's right, but the image quality isn't great. Or: I disagree a bit, I'm not quite sure why, but overall I think I'm going to agree. Or: actually, I really can't make a decision here at all. In each of those situations, what they found was that they weren't getting enough feedback for the model training, and that's a problem for keeping the algorithm accurate. When they really dived into what was going on in that interaction, it became clear that it was all about forcing this binary decision. And I was sitting in that audience, with experience of running experiments that use experimental psychology paradigms, thinking: what this needs is that approach, that way of thinking about how humans make those decisions and how that interaction works with your AI system. What's clear to me from working in many areas like this, not just radiology AI, is that it's more than an interface problem. It's a problem that relates to the psychology of what's going on, interacting with how the system actually operates. That's why I call it a behavioural architecture problem: there are a number of different parameters you need to think about that are both to do with the human and with the decision environment that person is in.
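A concrete way to read that, for technical listeners: the agree/disagree button is a data-capture schema decision. Here is a minimal sketch, in Python, of what a feedback record might look like if it preserved graded confidence and missing context rather than a single bit. The field names and structure are illustrative assumptions, not taken from any system discussed in the episode.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class Verdict(Enum):
    AGREE = "agree"
    DISAGREE = "disagree"
    UNSURE = "unsure"  # the internal state a binary button erases

@dataclass
class AIFeedback:
    """One clinician's response to an AI-suggested diagnosis.

    A boolean agree/disagree field collapses everything below into a
    single bit; keeping these signals separate preserves the
    uncertainty the model needs to calibrate against.
    """
    verdict: Verdict
    confidence: float                          # 0.0-1.0 self-rated certainty
    image_quality_ok: Optional[bool] = None    # "the image quality isn't great"
    missing_context: list[str] = field(default_factory=list)  # e.g. ["patient age"]
    free_text: str = ""                        # gut instinct that fits no field

# The forced-binary version loses all of the above:
binary_feedback = True  # agree? unsure? tired? we can't tell from one bit

rich_feedback = AIFeedback(
    verdict=Verdict.AGREE,
    confidence=0.6,
    image_quality_ok=False,
    missing_context=["patient age"],
)
```

Even this small change means the training pipeline can distinguish a confident agreement from a reluctant one, which is exactly the distinction the conference presenters said they were losing.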
Sam: Does that mean we need to involve psychologists much earlier, or even throughout the process? Because lots of designers can be great designers without a background in psychology.
Lia: Yeah, I really think so. I can give you a really tangible example. When we were doing work on how to deliver digital therapeutics for people with musculoskeletal conditions in the national channels, one of the things we did quite early was involve a colleague of mine, Chloe Stewart, who is a health psychologist and a national clinical advisor, specifically on behavioural psychology and particularly on musculoskeletal conditions. One of the real advantages of involving Chloe early was that she was able to characterise for our team some really important areas to focus on. Human psychology is complex, and producing behaviour change can be really complex, so when you're starting a design task you need frameworks that allow you to target what you're doing. She was able to direct us to the psychological evidence base, which showed that in studies of people doing physiotherapy exercises, the two psychological components most strongly related to successful completion of exercises, and therefore to improved health outcomes, reduced back pain and so on, were fear of making the pain worse and fear of making the injury worse. What that tells you is that if you're taking a population-level approach to musculoskeletal conditions and you don't tackle those two things in your design, you're very unlikely to get people to adhere to their exercises. So you must tackle them. If you take a standard user-centred design approach, you might get to those insights, but you also might not. Another great example I was given comes from physical design: one of the biggest constraints when you're trying to modify hospital environments is the asbestos in the walls. Your design has to take that into account before it can do anything else. I think there are psychological equivalents of that when you're thinking about human behaviour.
Sam: What would be different if we had a psychologist embedded in every product team?
Lia: I think we would have a really nuanced understanding of how a particular decision could be optimised, and I suspect it would make design cycles shorter. In my experience it certainly has, and that means things cost less. It starts to help us really target what we're doing and improve the overall quality of the work, as well as making it cheaper and faster.
Sam: I want to bring this back to the AI radiology example we were talking about just now. You've talked about the fictional binary human, this idea that AI systems are designed for a version of human thinking that doesn't really exist. What does uncertainty actually look like for a clinician reading a scan?
Lia: We talked a little before about all the things that might be going through someone's mind when they're asked to decide whether they agree or disagree with what's presented on the screen. Is it the image quality? Do I have enough context? Do I know whether this is an elderly person or someone younger? All of that affects how sure I am, as a clinician, of what I'm seeing on the screen. And that's just a flavour of the complexity of what's going on when anybody makes a decision about anything.

There's an American neurologist called Antonio Damasio who works very much in this area. He's famous for a number of things, but there's a quote of his I particularly like: we are not thinking machines that feel; rather, we are feeling machines that think. That's a really important distinction. Particularly in the globalised North, we tend, you've heard me talk about Descartes before, to be a bit Cartesian about it and separate mind from body: I think, therefore I am. Whereas Damasio's position is: I am, therefore I think. Our feelings really affect what we're thinking.

So let's go back to that radiologist saying: I haven't got enough context, I don't know whether this is an elderly person or a young person. Let's unpack that a bit. Why does the radiologist think that's an important thing to know when assessing their degree of certainty about this decision? Some of it might be how they were trained: what were the data sets they looked at? Maybe you're looking at something that might be a breast cancer, a tumour on the screen. What influenced how you make that judgement? But we also know it comes down to even deeper layers. If you're in a particular emotional state, or you're in a rush that day, the way you make decisions is affected by all of those factors.

We've got really good examples from emergency care: in a crisis situation, safer and more accurate decisions are made when you make the decision really easy for people. For example, when assessing a patient who is acutely unwell, you think about airway, breathing and circulation: ABC. It's almost one of the first things you're taught in medical training. What that does is give you a very quick framework to assess a lot of complex things quickly and apply a hierarchy to them: you're trying to tease out the things that will kill the quickest. There's a whole evidence base for why we do that. Presenting the decision architecture, that decision environment, back to the behavioural architecture piece, in that way is proven to improve outcomes. It improves the quality of those decisions. That's exactly equivalent to what the clinician is doing when they say they need more context: they're applying a whole wealth of experience and training, their feeling state, and the immediate pressure of how quickly they have to decide. Back to Damasio: feeling, complexity, things that aren't articulated, actually informing what surfaces consciously in your head as important.
And we know from the literature on clinical decision-making that there is a piece clinicians can articulate, but there's a big bit below the waterline of the iceberg that they don't articulate.
Sam: What might be going on for a clinician who's having to make binary decisions about a diagnosis, looking at a screen at a certain point, that they're not aware of?
Lia: It could be something like thinking: this AI is going to take my job. They probably won't be aware of it, but it's not an unreasonable thing to think, and it's probably colouring how they feel emotionally. Does that mean they're a bit more likely to be uncertain? You can extrapolate quite a lot from that. Does it mean: I don't really want to help this algorithm learn more, because if it gets better it's more likely to replace me? It's not ridiculous to think that might be going on for people. And again, we know from studies that there are concerns about this, and I'm sure it's playing out in some of the difficulty with how these interactions work.
Sam: So if the interface doesn't accurately reflect how clinicians are processing uncertainty, what happens to the data that's being generated and fed back into the AI?
Lia: It means it's biased, right? That's one of the things we know we have to keep a really strong eye on in terms of how algorithms perform and how AI develops. It's all about how these models are maintained; we know that's crucial to making sure we get the right health outcomes. Essentially, the training data is being distorted, and that has quite serious implications for how these things might perform going forward.
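To make the distortion concrete for technical listeners: if forced agree/disagree labels feed a retraining loop, every uncertain response enters the loss as a hard 0 or 1. A minimal sketch of one standard mitigation, weighting each example by self-reported confidence, is below. This is an illustrative technique under assumed data, not something described in the episode.

```python
import numpy as np

def binary_loss(predictions, labels):
    """Cross-entropy with hard labels: an unsure 'agree' counts exactly
    as much as a confident one, so the model is pulled toward a
    certainty the clinician never expressed."""
    eps = 1e-9
    return -np.mean(labels * np.log(predictions + eps)
                    + (1 - labels) * np.log(1 - predictions + eps))

def confidence_weighted_loss(predictions, labels, confidence):
    """Same loss, but each example's contribution is scaled by the
    clinician's self-rated confidence (0..1). Low-confidence feedback
    still counts, just proportionally less."""
    eps = 1e-9
    per_example = -(labels * np.log(predictions + eps)
                    + (1 - labels) * np.log(1 - predictions + eps))
    return np.sum(confidence * per_example) / np.sum(confidence)

# Hypothetical batch: model probabilities, forced binary labels, and the
# confidence the clinicians actually reported.
preds = np.array([0.8, 0.55, 0.9])
labels = np.array([1.0, 1.0, 0.0])   # agree / agree / disagree
conf = np.array([0.9, 0.3, 0.8])     # how sure they really were

print(binary_loss(preds, labels))
print(confidence_weighted_loss(preds, labels, conf))
```

The point is not this particular weighting scheme but that, unless uncertainty is captured at the interface, there is nothing downstream to weight by.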
Sam: And the expectation is that within a few years there's going to be lots of new meaning coming from AI. It's going to be solving all sorts of problems for us that we haven't been able to solve as humans. And I guess that's a point where everything changes.
Lia: Yeah. There's an interesting example, from radiology again, around racial bias in diagnosing sickle cell disease. There was a study, I think done in the States. Classically, we know that people in sickle cell crisis don't always get pain relief or management on time; they're sometimes seen as drug-seeking. That's real racial bias coming into clinical decision-making. In the study, I think the algorithm was making a diagnosis by looking at features in a scan, but not necessarily the same features a human would look at, and it more accurately diagnosed whether people had sickle cell or not. I find that really interesting, because the algorithm isn't being racist or not racist; it's clearly picking up on something different from what the human is doing, and in that scenario coming out with something less biased than the humans were. And of course the problem can run in the negative direction too.

I talk about this a bit as two different operating models. In therapeutic work, we say it's really important to understand what you bring to the therapeutic relationship as a clinician, because you're always interpreting what the patient brings through your own lens. The ideal is to really know yourself; that's why there's a history of psychiatrists going through their own psychodynamic psychotherapy or psychoanalysis. It allows you to know yourself and know what you're bringing. Between two humans it's at least the same overall operating model: it might look really different between those two people, but it's the same bones. When it's AI and human, you've got two completely different operating models. For me that's really interesting territory, because it's really new, and we're going to have to be thoughtful about how we design to make sure those socio-technical systems work together.
Sam: So not only are we thinking about what the clinician brings to the relationship, we're thinking about what the AI brings to the relationship.
Lia: Absolutely.
Sam: And is that quantifiable?
Lia: I don't know if it is yet. I think we're really at the very early days of understanding that. A lot of my clinical work is based on this component: what the clinician brings is not fully articulated. We've talked before about biopsychosocial models. A lot of what I do in my clinical work is articulating, for people trained in the medicalised model, their patient's experience in the biopsychosocial model, this holistic model. You really have to work to understand that. We're at quite early stages of even doing that between humans, let alone with a completely alien model that has a different operating system entirely, where we don't always know exactly what an AI is looking at when it comes out with a certain answer.
Sam: Do we need a new discipline that is psychology for AIs?
Human Factors Lessons From Aviation
Lia: Yeah, maybe. I think there's a real opportunity in embedding more psychological thinking into the process, as we've talked about. And it starts with taking a step back, right at the proposition stage, to really understand what your problem frame is, which is nothing new; that's good design-thinking practice. The piece we don't always do is understanding what we ourselves bring to it. Where is your energy going? Recognising what you attend to, and understanding what that means for the work you're doing. Once you do that, and if I equate it to the way we think works best in building a therapeutic relationship in a clinical interaction, it allows you to see more clearly what's happening for the patient, or in this case for your user. That's the first thing I would advise people to do.
Sam: Are there examples from other high-stakes fields, like aviation, for example, where you've seen this done well?
Lia: I think every industry is still learning how it interacts with AI, but we definitely know that aviation has some of the most advanced thinking on human factors. There's a recognition that we as humans are complex, and the way we interact with our environment can be complex. You've got the individual interaction, then the team interaction, and being able to step back and analyse that, and particularly to learn from incidents, is really important. That field is quite advanced in how this is applied, and I've seen it applied quite well in surgery, in health. Martin Bromiley, who came from aviation, has done some incredible work and is a very well-known advocate in this space, sadly because he tragically lost his wife in an operation where some of the factors were around how decisions were made at a point of crisis. It comes down to very human things: understanding how power dynamics operate in a particular situation, how body language works, how hydrated and how tired people are. All of these things really affect the decisions you make in the moment.
Why HCD Alone Falls Short
Sam: You've argued that standard human-centred design methods, user research, journey mapping and all the techniques that come with human-centred design, aren't enough for this problem. That's quite a provocative thing to say. Can you build on that a bit?
Lia: In some ways I'm really referring, particularly in health technology, to the problem of people understanding user-centred design as something very limited: user interaction, interface design. Whereas you and I both know that really good work comes from understanding experience in all its complexity. When I say that standard methods on their own aren't enough, it's because I think there's an opportunity to augment them with the full depth of the methods we use in experimental psychology and across academic medicine more broadly. And I actually think there's a problem on the other side too: in academic medicine we've tended not to use user-centred design methods in a way that leaves room for uncertainty and creativity. So for me, the power is in combining the two. That's where I see real opportunity, like the example I gave of using the academic health psychology evidence base to refine and direct where the user-centred design work went on musculoskeletal conditions.
Sam: If you're a health tech designer listening to this and thinking there's a real opportunity in how we approach design, based on some of the things we've talked about in this interview, what could someone actually do differently on a Monday morning?
Lia: I think they could begin by engaging with some of the behavioural frameworks. There are some really useful ones. The COM-B model, for example, which maps capability, opportunity and motivation as the drivers of behaviour change, is one of the most powerful; it synthesises a number of different behavioural models. Just starting by reading about what these frameworks do and how they're used can really change how you think about your particular problem frame. And when you're doing, for example, your riskiest-assumption testing, take a step back and think about what you are bringing to it. What are your assumptions, and how should they influence the way you unpack the problem? So bring that really human-centred lens in, but in a way that's a bit more grounded in what the health psychology evidence base tells us.
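For readers who want that Monday-morning suggestion in concrete form, here is a minimal sketch of COM-B as a design-review checklist. The questions are illustrative, drawn from themes in this conversation; they are not Healthia's published checklist or part of the COM-B literature itself.

```python
# COM-B (Michie et al.): behaviour occurs when Capability, Opportunity
# and Motivation are all present. A design review can walk through each
# component explicitly rather than assume it is covered.
COM_B_CHECKLIST = {
    "capability": [
        "Does the user have the physical/psychological skills this flow assumes?",
        "What does the evidence base say blocks this behaviour (e.g. fear of worsening pain)?",
    ],
    "opportunity": [
        "Does the environment (time, device, setting) actually allow the behaviour?",
        "What social context surrounds the decision (hierarchy, peers, patients)?",
    ],
    "motivation": [
        "What emotions or beliefs colour this decision (e.g. 'will this AI replace me')?",
        "What habit or routine are we asking the user to break?",
    ],
}

def unaddressed(answers: dict[str, list[str]]) -> list[str]:
    """Return the COM-B components a design review has not yet answered."""
    return [component for component in COM_B_CHECKLIST if not answers.get(component)]

print(unaddressed({"capability": ["covered in research round 2"]}))
# -> ['opportunity', 'motivation']
```

The value is less in the code than in the habit it encodes: making each COM-B component a named, checkable item in reviews, gateways and discovery.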
Sam: Lia, you're speaking at Rewired this year. What are you going to be talking about, and is it related to what we've discussed today?
Lia: Yeah, absolutely. I'm going to be talking about the opportunity to build this kind of behavioural architecture and how it can help us embrace the full complexity of what people need. The session is called Beyond the Dashboard, and there are three other really great speakers on it as well.
Sam: Amazing. I will see you there. And thank you so much for joining me today, Lia. It's been a pleasure to talk to you again.

Lia mentioned the COM-B model as a starting point for understanding what shapes behaviour. We've shared a COM-B checklist for service leaders at healthia.services/insights. It's designed to be used in design reviews, gateways and discovery: exactly the moments where the kinds of assumptions Lia is talking about could go unchallenged.

What I come back to is just how small the design decision was that started this conversation: a button to agree or disagree. And how large the consequences are when we get it wrong. It's not just frustrated users and clinicians, but corrupted data, reduced trust and failing adoption, a spiral that dashboards won't explain until it's too late. If you're building technology that depends on human judgement, the question is about much more than whether the interface is usable. We need to be confident that, fundamentally, it fits with how humans think and behave.