The Promise and Pitfalls of AI in the Complex World of Diagnosis, Treatment, and Disease Management | Health Informatics | JAMA

This conversation is part of a series of interviews in which JAMA Editor in Chief Kirsten Bibbins-Domingo, PhD, MD, MAS, and expert guests explore issues surrounding the rapidly evolving intersection of artificial intelligence (AI) and medicine.

Medical practice is about the human interaction between clinicians and patients, but what does it mean when a technology with human-like attributes such as AI enters the examination room? How does the dynamic between clinicians and patients change when AI is involved?

In a recent interview, Kirsten Bibbins-Domingo, PhD, MD, MAS, editor in chief of JAMA and the JAMA Network, discussed this aspect of AI with primary care physician Ida Sim, PhD, MD (Video). Sim is codirector of a joint program between the University of California, Berkeley, and the University of California, San Francisco (UCSF) in computational precision health. She is also an elected member of the National Academy of Medicine and the American College of Medical Informatics.

The following is an edited version of the interview.

Dr Bibbins-Domingo: I think of you as a physician who’s been in this informatics research space for a long time. You’ve also thought about how we create structures for sharing data and making data more accessible, and now all of us are talking about AI, and it’s transforming how we’re going to practice. It’s transforming how we’re going to do science. Why are we talking with a different urgency right now?

Dr Sim: It is an urgency, absolutely. I think November 30, 2022, is going to go down in history. That was the day when ChatGPT came out. I had been talking to other people here in Silicon Valley about GPT-2, about DALL-E, and it was just mind-blowing what was going on in the computer science world that we were not seeing publicly. But November 30 changed that.

That was an inflection point, and this is why I think it’s so transformative. AI and machine learning are 2 different terms, but we won’t dissect those. A lot of the work before what the public now sees has been machine learning. You take a bunch of data, stick it in a black box, and out comes something. And that something is usually a prediction.

Machine learning, often called predictive analytics, works like this: for example, this patient has a 68% chance of needing ICU [intensive care unit] transfer in the next 24 hours. This patient has a 72% chance of being readmitted in 32 days. That we can wrap our heads around. We do logistic regression. We have decades of JAMA papers about prediction. We know how to do prediction. We know how to think about it as clinicians. It’s a number.
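For readers less familiar with this kind of predictive analytics, risk scores like the ones Dr Sim mentions typically come from a fitted model such as logistic regression. The sketch below is purely illustrative: the variables, coefficients, and the resulting probability are invented for the example, not drawn from any real clinical model or study.

```python
import math

# Hypothetical coefficients for an ICU-transfer risk model.
# These values are made up for illustration, not from any fitted model.
INTERCEPT = -6.5
COEFS = {"heart_rate": 0.025, "resp_rate": 0.10, "lactate": 0.60}

def icu_transfer_risk(patient):
    """Predicted probability of ICU transfer in the next 24 hours."""
    # Linear combination of patient features, passed through the
    # logistic (sigmoid) function to yield a probability in (0, 1).
    z = INTERCEPT + sum(COEFS[k] * patient[k] for k in COEFS)
    return 1.0 / (1.0 + math.exp(-z))

patient = {"heart_rate": 110, "resp_rate": 24, "lactate": 3.5}
risk = icu_transfer_risk(patient)
print(f"{risk:.0%} chance of ICU transfer in the next 24 hours")
# prints: 68% chance of ICU transfer in the next 24 hours
```

A real clinical model would be fitted to historical patient data and validated before use; the point here is only that the output is a single, interpretable number.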

ChatGPT is not a number. What it generates is natural language. Natural language is what you and I are doing; this is how we interact with each other. It’s where we connect with other humans. And for the first time, a machine can interpose itself into that exchange in a way where we in our millennia of evolution see language exchange coming from another being.

So now we have this machine that can be kind of human-like, which is qualitatively different from all the machine learning that we’ve seen before in AI. You see the amazing interest and also fear, honestly, that this technology has engendered. It can do all sorts of things. Sure, it can predict, and all that kind of thing, but because it’s got this quality of being natural language and something that can almost act human-like in its interactions, it’s fundamentally different.

The patient-doctor relationship is about human relationships and language is the mediation of that relationship. And now we have a machine that can do that.

Dr Bibbins-Domingo: Is it too simplistic for me to use generative AI as the label that describes the set of things that is different from simple predictive analytics?

Dr Sim: Generative AI is a really good umbrella term because it’s generating things as a stream. It’s generating language, it can generate music, and it can generate images. DALL-E is a model that can generate images.

Probably the other thing to know about is something called large language models. You’ve probably heard of those. Those are models like GPT-3.5, GPT-4, Llama 2, and on and on. There are many of these large language models. It isn’t just GPT. ChatGPT is sort of like the web-based interface to the model behind it. There are different kinds of large language models.

A large language model ingests everything. And I mean everything—the entire web from 2021 and before. Some people may have heard the term stochastic parrot, with stochastic meaning random and parrot because a parrot repeats what it hears. So in a sense, the large language model has listened to what we have said on the internet and everything else, and it’s parroting it. It’s just generating the next word or the next pixel that makes an image.

It’s this class of technologies that’s really transformative. But of course, a parrot can sound very intelligent and you wonder, does a parrot really understand what it’s saying? So if we think about it as being a parrot, we can kind of say, “Well, geez, does it really understand a disease or a diagnosis?”
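The "stochastic parrot" idea can be made concrete with a toy sketch. The code below is a hedged illustration, not how large language models actually work internally: it uses a tiny invented corpus and simple bigram counts, whereas real models use neural networks trained on vast amounts of text. But the generate-the-next-word loop has the same shape.

```python
from collections import Counter, defaultdict

# A tiny made-up "training corpus" (a real model ingests much of the web).
corpus = ("the patient has diabetes . the patient has hypertension . "
          "the doctor sees the patient .").split()

# Count which word follows which word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_word(word):
    """Emit the most frequent continuation: parroting, not understanding."""
    return following[word].most_common(1)[0][0]

# Generate text by repeatedly predicting the next word.
text = ["the"]
while text[-1] != ".":
    text.append(next_word(text[-1]))
print(" ".join(text))
# prints: the patient has diabetes .
```

The model produces fluent-looking output purely by echoing the statistics of what it has "heard," which is exactly the sense in which a parrot can sound intelligent without understanding.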

Dr Bibbins-Domingo: It reminds me that so much is often made of ChatGPT getting an answer wrong, of it telling me the wrong things. As you describe, it’s predicting the next logical thing that would come after this set of things. How should we be thinking about this—that it generated something wrong or that these models understand the question and the answer?

Dr Sim: We should think about how these technologies can augment what we do in our work. It’s really an augmentation, not a replacement. That forces us to think—what do we want to augment, who is being augmented, and what does that mean? Because you can close inequities by augmenting some people, or you can increase inequities by augmenting different people. It should be a choice.

We should think very carefully about it. In what cases, if you augment something, are you also replacing it and then it goes away—what is the loss? For example, there was an article in JAMA Internal Medicine in April by Ayers et al showing that GPT can give answers that are as accurate and as empathetic as physicians answering questions posed by the public on the internet.

If it’s as accurate and as empathetic, is that good? Should we just use GPT or do we lose something? We do need to think about whether we are losing something. In a way, I think this GPT generative AI thing is asking us to think about—certainly making me think about—what is doctoring, what is human about doctoring, and what is it that we should preserve and enhance and then use technology around it.

That’s always been the promise, but can we really do it as opposed to just having these technologies come in and say, “Well, they can do it, so why don’t we just go ahead and do it?” I’m worried about that.

Dr Bibbins-Domingo: You and I are both primary care providers. These are fields where we have a lot of interaction with patients. What makes you excited when you think about what’s on the horizon in terms of AI to make the life of primary care physicians, physicians, or patients better?

Dr Sim: A lot. At UCSF, we were thinking about it as horizon 1, 2, and 3. Horizon 1 is your business today—what can you do to improve your business today? If you were in the transportation industry, it would be like your gas car. How do you make your gas car more efficient?

Horizon 2 looks a little bit ahead and maybe it’s your EV [electric vehicle] or hybrid. What is your charging network around the country? How can we have, I don’t know, solar chargers? And then horizon 3 is to think, well why cars? Why not autonomous flying, individualized jet packs or whatever? What’s really challenging is that in a normal world, we would think about horizon 1 as now, years 1 to 2. Horizon 2 is years 3 to 5, and horizon 3 is like 10 years out.

With generative AI, we need to think about horizons 1, 2, and 3 at the same time. So operationally horizon 1 would be things like helping us chart, helping us code. Why am I telling you why I’m ordering a hemoglobin A1c on this patient who has diabetes? Why do I have to click that? I’m sure our audience has all kinds of pain points that we can use large language models and other generative AI for.

Horizon 2, I would think, is bringing in digital devices and digital data in a way that supports patients more directly. Sixty percent of Americans have at least 1 chronic disease, and 40% have 2 or more. If you think about chronic disease management, it’s not you and I who are managing those diseases. It’s the patients, and we don’t do very much to help them.

We say, “Well, you go do this and this and come back in 3 months. Well, how did you do?” I think we need to upskill everyone. And I mean everyone—even if they don’t speak English; if they’re a high school grad or not a high school grad. So I’m really excited about technologies that augment our patients and our families and our communities. I think that’s a huge area.

And then we get to horizon 3, which would be diagnostic and other things that these technologies can do that are stunning. We should think explicitly about what our focus is on horizon 1—increased operational efficiency. Horizon 2 is getting to where we’re thinking about what values we’re inculcating. What is the kind of doctoring we’re supporting? And then at horizon 3, we better think about what it is we want. Otherwise, we’re going to end up with something that we might not like or that our patients aren’t going to benefit from.

Dr Bibbins-Domingo: I feel like I have heard about the examples in horizon 1 and horizon 3. But the middle strikes me as the thing I’ve probably heard less about, especially in thinking through what it means for our older population—a population with multiple conditions and increasing means to generate information themselves on their wearable devices or on their phones. We’re already using that individually, but we don’t have as many great examples of the integration into clinical care.

Dr Sim: That goes back perhaps to how we think about these machines and what seems to be the most brilliant thing they can do. I think sometimes there’s the sense that the machine needs to be the smartest doctor in the room.

That is not the way it should be, but if we think about it that way, then we think the machine needs to be the master diagnostician. So we hold that up as the ideal: the master clinician is a diagnostician. We do that even among humans. We expect predictions, right? 68.2%. But as primary care docs, we know that we do diagnosis, but the bulk of what we do is not diagnosis.

It’s treatment; it’s management. It’s managing complexity to keep things going. And frankly, it’s the patient and often the patient’s family who are doing the managing. There’s a whole stream of AI around planning. If you use AI to plan a war, for example—the opposite of medicine—planners are not sitting there saying, “I predicted this missile was going to hit with whatever accuracy.” They’re actually planning the whole thing.

It’s the logistics. We don’t do that in medicine. We’re not using some of these technologies to think through what treatment and management are, diagnosis being just the first step. Part of the reason for that is because medicine is extremely complex; the processes are complex.

We have people who are expert in computation but are also embedded in medicine, so they can understand what that process is and then think about how we could use these technologies to help patients. An example: suppose I suspect somebody has cancer—maybe there’s a mass on the abdominal CT [computed tomographic] scan. What is the most efficient way to work that up?

And it’s not just the diagnostics; it’s also, for example, the capacity of the radiology suite. How do I optimize so that the patient can come in, get these blood tests, and get the right scan at the right time, minimizing the workup time and time off from work, while drawing as little blood as possible so I’m as efficient as I can be in my diagnostics? I could use help with that. There are many examples like that that can make not just our lives easier, but our patients’ lives easier. I think that’s in horizon 2 because it’s still within the framework of the way we do doctoring now.

Dr Bibbins-Domingo: Could you talk a little bit more about that? Certainly AI innovation is going to come from those with a computer science background, not necessarily from those who are trained in medicine or who think primarily in that domain. So what are those models that can bring these different disciplines together so that we are not just advancing this really transformative technology, but we’re doing so in a way that is ultimately useful for the domain we’re trying to have an impact on, in this case medicine?

Dr Sim: The first and most important thing is that those of us in clinical medicine can’t sit on the sidelines. We can’t just watch it go by and be consumers of this technology. We need to actively engage at all levels, especially those in leadership in education, clinical medicine, and research.

I think it’s incumbent on those of us who are in academic medical centers to actively put forth the vision and test that vision over time as to whether that’s what we need and what we want. We need to be very engaged and very creative. Health is 20% of our GDP. Think about all the billions of dollars going into generative AI and then look at the economy—what sector has a lot of inefficiency, a lot of money, and is a huge part of the economy? It’s health care.

People with the money and the technology are coming. They’re going to do things, and that’s great. But how do we draw in people who really understand clinical medicine and our values and our vision? We can partner; we absolutely need to do that. Some people think even if you bring 2 people together, that’s not as good as having 1 person who understands both.

I think that is true, so more of this may need to be in the medical school curriculum. This is pretty critical. We need to start integrating that sensibility into our current practitioners. There’s a lot of work to be done here. There are so many wide-open issues about how we do this, how we generate clinicians who can really participate in a fundamental way in where this is going.

Dr Bibbins-Domingo: The program you’re describing, where those who are at the formative stages of their careers, and who don’t have to be physicians, are honing their skills in an environment of medicine already, seems to be something that can help bridge some of the barriers that have historically existed.

Dr Sim: What’s important is that we bring those kinds of technologies within the real world of medicine; that we actually tackle the fact that this is the best drug, but it’s not covered by insurance or it’s not available in the pharmacy, or that people have different values.

We don’t often bring patient values into our computation, because the natural way a computer scientist might look at it is almost like a mechanical problem, and it’s not. It’s a human problem. With generative AI that is far more human in its appearance, I think that’s going to be where people really see that we can’t think about this as just a machine spitting out a number or a prediction.

This has to fit right into the human process and the human experience of clinical care. I think people are going to see that more and more. It is a more difficult task to be trained and understand the real world and be embedded in the real world. But the good news is that we have more technologies now, machine learning, causal inference, large language models, generative AI, where I think we can start to get a much better handle on the messiness of the real world computationally than we had before. So it’s a really exciting time.

Dr Bibbins-Domingo: It strikes me that when you listen to what leads to burnout for clinicians, it is not about the misdiagnosis or the fact that they couldn’t get a test that’s really important. It is about the way we practice and the ways that the systems we interact with either don’t work or aren’t designed for efficiency. So what I hear you saying you’re excited about with the promise of these technologies is that some of those things that make health care hard for both doctors and patients could actually be a perfect role for generative machines to step into.

Dr Sim: Yes, if we want to conceive of it that way, and I think we should. And as we do that, the issue of ethics always comes up—is it ethical, is it fair? Is it increasing inequities or reducing them? Those are all principles for protecting the human rights of patients, and that’s as it should be. But think about our primary care clinic: when we walk into the clinic room, there’s the generative AI and there’s the patient. The patient is not the only human in that room. There are 2 humans, the clinician and the patient, and then the machine, if we reduce it down to that. So I wonder whether there are bioethical principles that relate to the clinician as a human and what our rights are when this machine is also there.

And that goes to the burnout issue. I think that we have traditionally focused on the rights and the experience of the patient, and the doctor is just there. I don’t think this is going to happen, but at one point I thought that the machine was going to do all the work. It’s going to generate everything. It’s going to write the note, and we’re going to say, wow, it’s going to write the note.

No. We’re going to be spending all our time at night reviewing the notes that the computer has generated and fixing them. Now you could say that’s really great, but that’s not the kind of doctoring I want to do. So what is it that gives me the reason why I went into medicine, the drive that makes us value ourselves as a profession? What is that, and how does the machine fit in there?

I think we have to explicitly take the clinician as a human and elevate them, maybe even to the same level as the patient. I mean, we’re all human, right? It’s a relationship. I hope this would make us think about that also—especially in this time of moral injury and burnout—that we should be thinking of clinicians as humans, and how does the machine augment our humanity?

Dr Bibbins-Domingo: I love the way you said that. It sounds like you’re describing both the types of research studies that you’d want in place when we’re evaluating new technologies and a framework for how we should think about whether these technologies bring value. The value comes in evaluating clinical outcomes, which is always about the patient and how we protect the patient in these environments. But the promise and the challenge of these technologies extend to both the patient and the clinician, and what I’m hearing you say is that we have to have frameworks for thinking about those as well.

Dr Sim: Indeed. And you might ask who’s going to look out for that. Who is making the investments in these technologies? Let’s say it’s a hospital: do they have a framework for thinking about physician well-being not just as an afterthought, but as a real factor in what we’re doing and how we’re set up?

In a sense, we’re not set up in terms of organization or governance to confront an opportunity and a threat like generative AI. The pandemic was a pressure test, right? Although generative AI is not in any way like a pandemic, I think it does make us think institutionally and then broader across our whole medical system, how are we going to respond and how should we configure ourselves so that the right signals and the right values get embedded in what we do?

Dr Bibbins-Domingo: That’s such a perfect way to challenge us, because you’re describing a technology so transformative that we’re thinking about horizons 1, 2, and 3 all at the same time, and a system, in terms of organized medicine and academic medicine, that isn’t quite set up to think about all the governance and ways of evaluating this. I think we can all hear from what you’re saying that not only is this important for us to address but also how we’re going to do it; that’s what’s going to play out over the next few months and years.

Dr Sim: It’s up to us. It would be easy enough to go back to our inboxes, to say I need to do that this afternoon or I need to fill out our insurance paperwork. But we do need to engage in a really deep way with what’s going on, and we need to do that now. There is an urgency, and you’re right in calling that out.

Published Online: September 27, 2023. doi:10.1001/jama.2023.19180

Conflict of Interest Disclosures: Dr Sim reports being a paid consultant and a member of the board of directors for Vivli, Inc and is a stockholder and medical advisory board member of 98point6, Inc.

