Exploring the Impact of Large Language Models on Diagnosing and Managing Obstetric Patients: A Pilot Study Utilizing Simulated Cases

By
Iason Psilopatis
Katharina Redling
Valeria Filippi
Sofia Kappos
Julius Emons
Beatrice Mosimann
Tibor A. Zwimpfer
April 27, 2026

Archives of Gynecology and Obstetrics

At a Glance

Category	Detail
Condition	Common obstetric conditions including preeclampsia, fetal growth restriction, preterm premature rupture of membranes, antepartum vaginal bleeding, and minor abdominal trauma in late pregnancy
Key Mechanisms	Large language models (LLMs) autonomously generate diagnostic impressions, differential diagnoses, and management plans by synthesizing clinical data and guidelines
Target Population	Obstetric patients presenting with time-sensitive and complex clinical scenarios
Care Setting	Obstetric clinical practice, particularly in time-critical decision-making environments

Key Highlights

LLMs demonstrated potential to support clinical reasoning and decision-making in obstetrics by interpreting guidelines and suggesting management strategies.
Five simulated obstetric cases representing diverse and common clinical conditions were used to evaluate three freely available LLMs: ChatGPT, Gemini, and DeepSeek.
Expert evaluation using a modified Global Quality Score assessed diagnostic accuracy, completeness, urgency recognition, maternal and fetal safety, and guideline concordance.
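
To make the evaluation step concrete, the following is a minimal sketch of how expert ratings across the five assessed domains could be averaged per model. It is not the study's method: the 1-5 scale, the function names, and the data layout are assumptions for illustration, and the ratings in the usage example are placeholders, not study results.

```python
from statistics import mean

# The five domains named in the summary. The 1-5 scale is an assumption;
# the summary does not state the scale of the modified Global Quality Score.
DOMAINS = (
    "diagnostic_accuracy",
    "completeness",
    "urgency_recognition",
    "maternal_fetal_safety",
    "guideline_concordance",
)
MIN_SCORE, MAX_SCORE = 1, 5


def validate(rating: dict) -> None:
    """Raise ValueError if a rating misses a domain or is out of range."""
    missing = set(DOMAINS) - set(rating)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    for domain in DOMAINS:
        if not MIN_SCORE <= rating[domain] <= MAX_SCORE:
            raise ValueError(f"{domain} score out of range: {rating[domain]}")


def summarize(ratings: list) -> dict:
    """Return the mean score per domain and an unweighted overall mean."""
    if not ratings:
        raise ValueError("no ratings supplied")
    for rating in ratings:
        validate(rating)
    summary = {d: mean(r[d] for r in ratings) for d in DOMAINS}
    summary["overall"] = mean(summary[d] for d in DOMAINS)
    return summary


# Placeholder ratings from two hypothetical raters for one model and one case.
example_ratings = [
    {d: 4 for d in DOMAINS},
    {d: 2 for d in DOMAINS},
]
example_summary = summarize(example_ratings)
```

Averaging each domain separately before computing the overall mean keeps a weak domain, such as urgency recognition, visible instead of hiding it inside a single total.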

Guideline-Based Recommendations

Diagnosis

Use LLMs to generate initial diagnostic impressions and differential diagnoses based on patient data and clinical presentation.
Ensure LLM outputs align with established national and international obstetric guidelines (e.g., German Society of Obstetrics and Gynecology, ACOG, RCOG).

Management

LLMs can suggest management plans consistent with clinical standards for conditions such as preeclampsia, FGR, PPROM, and vaginal bleeding.
Interactive, staged dialogue with LLMs may help refine management strategies in complex cases.

Monitoring & Follow-up

Continuous expert oversight is essential to validate LLM recommendations and ensure maternal and fetal safety.
Monitor LLM performance for accuracy, completeness, and guideline adherence before clinical integration.

Risks

Potential for inaccurate or incomplete advice leading to harmful clinical decisions if LLM outputs are uncritically accepted.
LLMs currently lack integration into real clinical environments and require rigorous validation before routine use.

Patient & Prescribing Data

Simulated obstetric patients with varying clinical presentations and complexities reflective of real-world cases.

LLMs provided structured recommendations comparable to those of experienced clinicians, indicating potential to reduce cognitive burden and standardize assessments.

Clinical Best Practices

Use LLMs as adjunct tools to support, not replace, clinician judgment in obstetric care.
Incorporate expert review and validation of LLM-generated outputs prior to clinical application.
Employ LLMs in time-sensitive scenarios to assist with rapid synthesis of clinical information and guideline-based decision-making.
Continue research and development to improve LLM accuracy, safety, and integration into clinical workflows.


