CPI Musculoskeletal Radiology Module 2023
Author: Eric Chang, Stephane Desouches, Lauren Ladd, Hyojeong Lee, Catherine Roberts, Vanessa Zayas-Colon
Journal Name: American College of Radiology
Published: 12/06/2023
A world-class team of musculoskeletal (MSK) radiologists designed an excellent teaching and learning tool for general diagnostic radiologists, residents, and subspecialists. Review 50 self-assessment questions with detailed explanations of the images, correct answers, and incorrect answers. This self-assessment module includes:
- MSK case topics relevant to community and academic settings, ranging from commonly encountered sports and traumatic injuries to arthritic, inflammatory, and infectious conditions, tumor and tumor-like conditions, metabolic disorders, and post-surgical and hardware complications.
- Over 235 multi-modality images using radiography, MRI, CT, and US.
- Practical questions on developmental variants, MRI artifacts, TMJ dysfunction, and the use of ACR Appropriateness Criteria in MSK imaging.
Objectives:
After completing this activity, the participant should be better able to:
- Describe how to assess and manage a variety of hypothetical clinical situations related to musculoskeletal radiology.
- Identify image findings and exercise independent medical judgment based on the images and/or facts provided concerning musculoskeletal radiology.
- Cite clinical and imaging principles relevant to the cases encountered and topics discussed in the practice of musculoskeletal radiology.
- Review clinical practice strategies based on the practitioner’s self-assessment in the area of musculoskeletal radiology.
Daytime, evening, and overnight: the 24-h radiology cycle and impact on interpretative accuracy
Author: Shannon Zhou, Tarek Hanna, Tianwen Ma, Timothy Johnson, Christine Lamoureux, Scott Weber, Jamlik-Omari Johnson, Scott Steenburg, Jeffrey Dunkle, Suzanne Chong
Journal Name: Emergency Radiology
Published: 07/31/2023
Purpose
To assess the influence of time of day when a study is interpreted on discrepancy rates for common and advanced studies performed in the acute community setting.
Methods
This retrospective study used the databank of a U.S. teleradiology company to retrieve studies between 2012 and 2016 with a preliminary report followed by a final report by the on-site client hospital. Neuroradiology, abdominal radiology, and musculoskeletal radiology studies were included. Teleradiologists were fellowship trained in one of these subspecialty areas. Daytime, evening, and overnight times were defined. Associations between major and minor discrepancies, time of day, and whether the study was common or advanced were tested with significance set at p = .05.
Results
A total of 5,883,980 studies were analyzed. There were 8444 major discrepancies (0.14%) and 17,208 minor discrepancies (0.29%). For common studies, daytime (0.13%) and evening (0.13%) had lower major discrepancy rates compared to overnight (0.14%) (daytime to overnight, RR = 0.57, 95% CI: 0.45, 0.72, p < 0.01 and evening to overnight, RR = 0.57, 95% CI: 0.49, 0.67, p < 0.01). Minor discrepancy rates for common studies were decreased for evening (0.29%) compared to overnight (0.30%) (RR = 0.89, 95% CI: 0.80, 0.99, p = 0.029). For advanced studies, daytime (0.15%) had lower major discrepancy rates compared to evening (0.20%) and overnight (0.23%) (daytime to evening, RR = 0.77, 95% CI: 0.61, 0.97, p = 0.028 and daytime to overnight, RR = 0.66, 95% CI: 0.50, 0.87, p ≤ 0.01).
Conclusion
Significantly higher major discrepancy rates for studies interpreted overnight suggest the need for radiologists to exercise greater caution when interpreting studies overnight and may require practice management strategies to help optimize overnight work conditions. The lower major discrepancy rates on advanced studies interpreted during the daytime suggest the need for reserving advanced studies for interpretation during the day when possible.
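As a rough check on the arithmetic in the Results above, the overall discrepancy rates can be recomputed from the reported counts. The sketch below also includes a textbook unadjusted relative risk with a log-method confidence interval. This is only an illustration: the abstract does not give per-shift counts or describe this exact calculation, so the counts passed to `relative_risk` are hypothetical, and the function names are assumptions.

```python
import math

# Counts as reported in the Results.
TOTAL_STUDIES = 5_883_980
MAJOR = 8_444
MINOR = 17_208


def rate_percent(events: int, total: int) -> float:
    """Return events/total expressed as a percentage."""
    return 100.0 * events / total


def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Unadjusted relative risk of group A vs. group B.

    The confidence interval uses the standard log-method approximation;
    it is not necessarily the method used in the study.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


print(f"major: {rate_percent(MAJOR, TOTAL_STUDIES):.2f}%")
print(f"minor: {rate_percent(MINOR, TOTAL_STUDIES):.2f}%")

# Hypothetical per-shift counts, for illustration only.
rr, lower, upper = relative_risk(130, 100_000, 200, 100_000)
print(f"RR = {rr:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

A relative risk below 1 with an interval that excludes 1 corresponds to the "lower discrepancy rate" statements in the abstract.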
Radiologist age and diagnostic errors
Author: Christine Lamoureux, Tarek Hanna, Edward Callaway, Michael Bruno, Scott Weber, Devin Sprecher, Timothy Johnson
Journal Name: Emergency Radiology
Published: 07/17/2023
Purpose
Previous investigations into the causes of error by radiologists have addressed work schedule, volume, shift length, and sub-specialization. Studies regarding possible associations between radiologist errors and radiologist age and timing of residency training are lacking in the literature, to our knowledge. The aim of our study was to determine if radiologist age and residency graduation date are associated with diagnostic errors.
Methods
Our retrospective analysis included 1.9 million preliminary interpretations (out of a total of 5.2 million preliminary and final interpretations) of imaging examinations by 361 radiologists in a US-based national teleradiology practice between 1/1/2019 and 1/1/2020. Quality assurance data regarding the number of radiologist errors was generated through client facility feedback to the teleradiology practice. With input from both the client radiologist and the teleradiologist, the final determination of the presence, absence, and severity of a teleradiologist error was determined by the quality assurance committee of radiologists within the teleradiology company using standardized criteria. Excluded were 3.2 million final examination interpretations and 93,963 (1.8%) of total examinations from facilities reporting less than one discrepancy in examination interpretation in 2019. Logistic regression with covariates radiologist age and residency graduation date was performed for calculation of relative risk of overall error rates and by major imaging modality. Major errors were separated from minor errors as those with a greater likelihood of affecting patient care. Logistic regression with covariates radiologist age, residency graduation date, and log total examinations interpreted was used to calculate odds of making a major error to that of making a minor error.
Results
Mean age of the 361 radiologists was 51.1 years, with a mean residency graduation date of 2001. Mean error rate for all examinations was 0.5%. Radiologist age at any residency graduation date was positively associated with major errors (p < 0.05), with a relative risk 1.021 for each 1-year increase in age and relative risk 1.235 for each decade as well as for minor errors (p < 0.05, relative risk 1.007 for each year, relative risk 1.082 for each decade). By major imaging modality, radiologist age at any residency graduation date was positively associated with computed tomography (CT) and X-ray (XR) major and minor error, magnetic resonance imaging (MRI) major error, and ultrasound (US) minor error (p < 0.05). Radiologist age was positively associated with odds of making a major vs. minor error (p < 0.05).
Conclusions
The mean error rate for all radiologists was low. We observed that increasing age at any residency graduation date was associated with increasing relative risk of major and minor errors as well as increasing odds of a major vs. minor error among providers. Further study is needed to corroborate these results, determine clinical relevance, and highlight strategies to address these findings.
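The per-year and per-decade relative risks reported in the Results are related by compounding, assuming the effect is multiplicative per year (as it would be in a log-linear model). The following sketch shows that relationship. Because the published per-year figures are rounded to three decimals, the compounded values only approximate the reported per-decade figures (1.235 and 1.082).

```python
def compound(rr_per_year: float, years: int = 10) -> float:
    """Compound a per-year relative risk over a number of years."""
    return rr_per_year ** years


# Per-year relative risks as reported (rounded) in the Results.
major_decade = compound(1.021)
minor_decade = compound(1.007)

print(f"major errors, per decade: {major_decade:.3f}")
print(f"minor errors, per decade: {minor_decade:.3f}")
```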
Detection of Critical Spinal Epidural Lesions on CT Using Machine Learning
Author: Robert Harris, Scott Baginski, Yulia Bronstein, Dietrich Schultze, Kenneth Segel, Shwan Kim, Jerry Lohr, Steve Towey, Nishit Shahi, Ian Driscoll, Brian Baker
Journal Name: Spine
Published: 01/01/2023
Background
Critical spinal epidural pathologies can cause paralysis or death if untreated. Although magnetic resonance imaging is the preferred modality for visualizing these pathologies, computed tomography (CT) occurs far more commonly than magnetic resonance imaging in the clinical setting.
Objective
A machine learning model was developed to screen for critical epidural lesions on CT images at a large-scale teleradiology practice. This model has utility for both worklist prioritization of emergent studies and identifying missed findings.
Materials and Methods
There were 153 studies with epidural lesions available for training. These lesions were segmented and used to train a machine learning model. A test data set was also created using previously missed epidural lesions. The trained model was then integrated into a teleradiology workflow for 90 days. Studies were sent to secondary manual review if the model detected an epidural lesion but none was mentioned in the clinical report.
Results
The model correctly identified 50.0% of epidural lesions in the test data set with 99.0% specificity. For prospective data, the model correctly prioritized 66.7% of the 18 epidural lesions diagnosed on the initial read with 98.9% specificity. There were 2.0 studies flagged for potential missed findings per day, and 17 missed epidural lesions were found during a 90-day time period. These results suggest almost half of critical spinal epidural lesions visible on CT imaging are being missed on initial diagnosis.
Conclusion
A machine learning model for identifying spinal epidural hematomas and abscesses on CT can be implemented in a clinical workflow.
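The "almost half" figure in the Results can be reproduced from the reported counts. The sketch below is a back-of-the-envelope check, with two assumptions: the average of 2.0 flagged studies per day held across the full 90 days, and lesions missed by both the radiologist and the model are not counted (they are unknown).

```python
# Counts as reported in the Results.
DIAGNOSED_INITIALLY = 18   # lesions diagnosed on the initial read
MISSED_FOUND = 17          # missed lesions found via secondary review
DAYS = 90
FLAGS_PER_DAY = 2.0        # studies flagged for potential missed findings

# Share of all known lesions that were missed on the initial read.
missed_fraction = MISSED_FOUND / (DIAGNOSED_INITIALLY + MISSED_FOUND)

# 66.7% of the 18 initially diagnosed lesions were correctly prioritized.
prioritized = round(0.667 * DIAGNOSED_INITIALLY)

# Approximate yield of the secondary manual review.
flagged_total = FLAGS_PER_DAY * DAYS
review_yield = MISSED_FOUND / flagged_total

print(f"missed on initial read: {missed_fraction:.1%}")
print(f"correctly prioritized:  {prioritized} of {DIAGNOSED_INITIALLY}")
print(f"review yield:           {review_yield:.1%} of flagged studies")
```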
Leveraging Artificial Intelligence to Enhance Peer Review: Missed Liver Lesions on Computed Tomographic Pulmonary Angiography
Author: Sarah Thomas, Tyler Fraum, Lawrence Ngo, Robert Harris, Elie Balesh, Mustafa Bashir, Benjamin Wildman-Tobriner
Journal Name: Journal of the American College of Radiology
Published: 11/01/2022
Purpose
The aim of this study was to use artificial intelligence (AI) to facilitate peer review for detection of missed suspicious liver lesions (SLLs) on CT pulmonary angiographic (CTPA) examinations.
Methods
This retrospective study included 1 month of consecutive CTPA examinations from a multisite teleradiology practice. Visual classification (VC) software analyzed images for the presence (+) or absence (−) of SLLs (>1 cm, >20 Hounsfield units). Separately, a natural language processing (NLP) algorithm evaluated corresponding reports for description (+) of an SLL or lack thereof (−). Studies containing possible missed SLLs (VC+/NLP−) were reviewed by three abdominal radiologists in a two-step adjudication process to confirm if an SLL was missed by the interpreting radiologist. The number of VC+/NLP− cases, the number of images needing radiologist review, and the number of cases with confirmed missed SLLs were recorded. Interobserver agreement for SLLs was calculated for the radiologist readers.
Results
A total of 2,573 CTPA examinations were assessed, and 136 were classified as potentially containing missed SLLs (VC+/NLP−). After radiologist review, 13 cases with missed SLLs were confirmed, representing 0.5% of analyzed CT studies. Using AI, the ratio of CT studies requiring review to missed SLLs identified was 10:1; the ratio without the help of AI would be at least 66:1. Among the 136 cases reviewed by radiologists, interobserver agreement for SLLs was excellent (κ = 0.91).
Conclusions
AI can accelerate meaningful peer review by rapidly assessing thousands of examinations to identify potentially clinically significant errors. Although radiologist involvement is necessary, the amount of effort required after initial AI screening is dramatically reduced.
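The triage rule described in the Methods, where a study goes to radiologist review only when the image classifier is positive and the report is negative (VC+/NLP−), can be sketched as a simple filter. The `Study` type, its field names, and the toy records are hypothetical; only the counts 136 and 13 come from the Results.

```python
from dataclasses import dataclass


@dataclass
class Study:
    study_id: str
    vc_positive: bool   # visual classification found a suspicious liver lesion
    nlp_positive: bool  # report text describes a suspicious liver lesion


def needs_review(study: Study) -> bool:
    """Flag possible misses: lesion seen on images but absent from the report."""
    return study.vc_positive and not study.nlp_positive


# Hypothetical toy records, for illustration only.
studies = [
    Study("A", vc_positive=True, nlp_positive=True),    # lesion reported
    Study("B", vc_positive=True, nlp_positive=False),   # possible miss
    Study("C", vc_positive=False, nlp_positive=False),  # no lesion
]
review_queue = [s.study_id for s in studies if needs_review(s)]
print(review_queue)

# Reported counts: 136 studies reviewed, 13 confirmed missed lesions.
review_ratio = 136 / 13
print(f"studies reviewed per confirmed miss: {review_ratio:.1f}")
```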
ACR Appropriateness Criteria® Imaging After Shoulder Arthroplasty: 2021 Update
Author: Catherine Roberts, Darlene Metter, Michael Fox, Marc Appel, Shari Jawetz, William Morrison, Nicholas Nacey, Nicholas Said, James Stensby, Naveen Subhas, Katherine Tynus, Eric Walker, Joseph Yu, Mark Kransdorf
Journal Name: Journal of the American College of Radiology
Published: 05/01/2022
Shoulder arthroplasty is a common orthopedic procedure with a complication rate reported to be as high as 39.8% and revision rates as high as 11%. Symptoms related to postoperative difficulties include activity-related pain, decreased range of motion, and apprehension. Some patients report immediate and persistent dissatisfaction, although others report a symptom-free postoperative period followed by increasing pain and decreasing shoulder function and mobility. Imaging plays an important role in diagnosing postoperative complications of shoulder arthroplasties. The imaging algorithm should always begin with radiographs. The selection of the next imaging modality depends on several factors, including findings on the initial imaging study, clinical suspicion of an osseous versus soft-tissue injury, and clinical suspicion of infection. The American College of Radiology Appropriateness Criteria are evidence-based guidelines for specific clinical conditions that are reviewed annually by a multidisciplinary expert panel. The guideline development and revision include an extensive analysis of current medical literature from peer reviewed journals and the application of well-established methodologies (RAND/UCLA Appropriateness Method and Grading of Recommendations Assessment, Development, and Evaluation or GRADE) to rate the appropriateness of imaging and treatment procedures for specific clinical scenarios. In those instances where evidence is lacking or equivocal, expert opinion may supplement the available evidence to recommend imaging or treatment.
Multi-institutional evaluation of a deep learning model for fully automated detection of aortic aneurysms in contrast and non-contrast CT
Author: Yiting Xie, Benedikt Graf, Parisa Farzam, Brian Baker, Christine Lamoureux, Arkadiusz Sitek
Journal Name: SPIE Medical Imaging
Published: 04/04/2022
We developed and validated a research-only deep learning (DL) based automatic algorithm to detect thoracic and abdominal aortic aneurysms on contrast and non-contrast CT images and compared its performance with assessments obtained from retrospective radiology reports. The DL algorithm was developed using 556 CT scans. Manual annotations of aorta centerlines and cross-sectional aorta boundaries were created to train the algorithm. Aorta segmentation and aneurysm detection performances were evaluated on 2263 retrospective CT scans (154 thoracic and 176 abdominal aneurysms). Evaluation was performed by comparing the automatically detected aneurysm status to the aneurysm status reported in the radiology reports and the AUC was reported. In addition, a quantitative evaluation was performed to compare the automatically measured aortic diameters to manual diameters on a subset of 59 CT scans. Pearson correlation coefficient was used. For aneurysm detection, the AUC was 0.95 for thoracic aneurysm detection (95% confidence region [0.93, 0.97]) and 0.94 for abdominal aneurysm detection (95% confidence region [0.92, 0.96]). For aortic diameter measurement, the Pearson correlation coefficient was 0.973 (p<0.001).
Interpretations of Examinations Outside of Radiologists’ Fellowship Training: Assessment of Discrepancy Rates Among 5.9 Million Examinations from a National Teleradiology Databank
Author: Suzanne Chong, Tarek Hanna, Christine Lamoureux, Tianwen Ma, Scott Weber, Jamlik-Omari Johnson, Eric Friedberg, Robert Pyatt Jr., Catherine Everett, Timothy Johnson
Journal Name: American Journal of Roentgenology
Published: 10/25/2021
Background
In community settings, radiologists commonly function as multispecialty radiologists, interpreting examinations outside of their fellowship training.
Objective
To compare discrepancy rates for preliminary interpretations of acute community-setting examinations concordant versus discordant with interpreting radiologists’ fellowship training.
Methods
This retrospective study used the databank of a U.S. teleradiology company that provides preliminary interpretations for client community hospitals. The analysis included 5,883,980 acute examinations performed from 2012 to 2016 that were preliminarily interpreted by 269 teleradiologists with a fellowship of neuroradiology, abdominal radiology, or musculoskeletal radiology. When providing final interpretations, client on-site radiologists voluntarily submitted quality assurance (QA) requests if preliminary and final interpretations were discrepant; the teleradiology company’s QA committee categorized discrepancies as major (n=8,444) or minor (n=17,208). Associations among examination type (common vs advanced), relationship between examination subspecialty and the teleradiologist’s fellowship (concordant vs discordant), and major and minor discrepancies were assessed using three-way conditional analyses with generalized estimating equations.
Results
For examinations with concordant subspecialty, major discrepancy rate was lower for common than advanced examinations [0.13% vs 0.26%; relative risk (RR) 0.50, 95% CI: 0.42, 0.60; p < .001]. For examinations with discordant subspecialty, major discrepancy rate was lower for common than advanced examinations (0.14% vs 0.18%; RR 0.81, 95% CI: 0.72, 0.90; p < .001). For common examinations, major discrepancy rate was not different between examinations with concordant versus discordant subspecialty (0.13% vs 0.14%; RR 0.90, 95% CI: 0.81, 1.01; p = .07). For advanced examinations, major discrepancy rate was higher for examinations with concordant versus discordant subspecialty (0.26% vs 0.18%; RR 1.45, 95% CI: 1.18, 1.79; p < .001). Minor discrepancy rate was higher among advanced examinations for those with concordant versus discordant subspecialty (0.34% vs 0.29%; RR 1.17, 95% CI: 1.001, 1.36; p = .04), but not different for other comparisons (p > .05).
Conclusion
Major and minor discrepancy rates were not higher for acute community-setting examinations outside of interpreting radiologists’ fellowship training. Discrepancy rates increased for advanced examinations.
Radiologist errors by modality, anatomic region, and pathology for 1.6 million exams: what we have learned
Author: Christine Lamoureux, Tarek Hanna, Devin Sprecher, Scott Weber, Edward Callaway
Journal Name: Emergency Radiology
Published: 07/30/2021
Purpose
To evaluate the feasibility of adding pathology to recent radiologist error characterization schemes of modality and anatomic region and the potential of this data to more specifically inform peer review and peer learning.
Methods
Quality assurance data originating from 349 radiologists in a national teleradiology practice were collected for 2019. Interpretive errors were simply categorized as major or minor. Reporting or communication errors were classified as administrative errors. Interpretive errors were then divided by modality, anatomic region and placed into one of 64 pathologic categories.
Results
Out of 1,628,464 studies, the discrepancy rate was 0.5% (8181/1,634,201). The 8181 total errors consisted of 2992 major errors (0.18%) and 5189 minor errors (0.32%). Of total errors, 3.1% (257/8181) were administrative. Of major interpretive errors, 75.5% occurred on CT, with CT abdomen and pelvis accounting for 40.4%. The most common pathologic discrepancy for all exams was in the category of mass, nodule, or adenopathy (1583/8181), the majority of which were minor (1315/1583). The most common pathologic discrepancy for the 2937 major interpretive errors was fracture or dislocation (27%; 793/2937), followed by bleed (10.7%; 315/2937).
Conclusion
The addition of error-related pathology to peer review is both feasible and practical and provides a more detailed guide to targeted individual and practice-wide peer learning quality improvement efforts. Future research is needed to determine if there are measurable improvements in detection or interpretation of specific pathologies following error feedback and educational interventions.
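The characterization scheme described in the Methods (severity, modality, anatomic region, pathologic category) lends itself to simple tallies. The sketch below shows one way such records might be counted. The `ErrorRecord` type and every record in the list are hypothetical and do not reproduce the study's data or its 64 categories.

```python
from collections import Counter
from typing import NamedTuple


class ErrorRecord(NamedTuple):
    severity: str   # "major" or "minor"
    modality: str   # e.g. "CT", "XR", "US"
    region: str     # anatomic region
    pathology: str  # pathologic category


# Hypothetical records, for illustration only.
records = [
    ErrorRecord("major", "CT", "abdomen and pelvis", "bleed"),
    ErrorRecord("major", "XR", "extremity", "fracture or dislocation"),
    ErrorRecord("minor", "CT", "chest", "mass, nodule, or adenopathy"),
    ErrorRecord("minor", "CT", "abdomen and pelvis", "mass, nodule, or adenopathy"),
    ErrorRecord("major", "CT", "spine", "fracture or dislocation"),
    ErrorRecord("minor", "US", "abdomen", "mass, nodule, or adenopathy"),
]

# Tally all errors by pathology, and major errors by modality.
by_pathology = Counter(r.pathology for r in records)
major_by_modality = Counter(r.modality for r in records if r.severity == "major")

for pathology, count in by_pathology.most_common():
    print(f"{pathology}: {count}")
print(dict(major_by_modality))
```

Tallies of this kind are what allow peer learning to be targeted at a specific pathology rather than at a modality as a whole.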
Measurement of Endotracheal Tube Positioning on Chest X-Ray Using Object Detection
Author: Robert Harris, Scott Baginski, Yulia Bronstein, Shwan Kim, Jerry Lohr, Steve Towey, Zeljko Velichkovich, Tim Kabachenko, Ian Driscoll, Brian Baker
Journal Name: Journal of Digital Imaging
Published: 07/28/2021
Patients who are intubated with endotracheal tubes often receive chest x-ray (CXR) imaging to determine whether the tube is correctly positioned. When these CXRs are interpreted by a radiologist, they evaluate whether the tube needs to be repositioned and typically provide a measurement in centimeters between the endotracheal tube tip and carina. In this project, a large dataset of endotracheal tube and carina bounding boxes was annotated on CXRs, and a machine-learning model was trained to generate these boxes on new CXRs and to calculate a distance measurement between the tube and carina. This model was applied to a gold standard annotated dataset, as well as to all prospective data passing through our radiology system for two weeks. Inter-radiologist variability was also measured on a test dataset. The distance measurements for both the gold standard dataset (mean error = 0.70 cm) and prospective dataset (mean error = 0.68 cm) were noninferior to inter-radiologist variability (mean error = 0.70 cm) within an equivalence bound of 0.1 cm. This suggests that this model performs at an accuracy similar to human measurements, and these distance calculations can be used for clinical report auto-population and/or worklist prioritization of severely malpositioned tubes.
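One way a tube-to-carina distance could be derived from two bounding boxes is sketched below. The abstract does not describe the actual geometry used, so everything here is an assumption: boxes are `(x_min, y_min, x_max, y_max)` in pixels with y increasing downward, the tube tip is taken as the bottom-center of the tube box, the carina as the center of its box, and the pixel spacing is isotropic. The function name and example values are hypothetical.

```python
import math

Box = tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def tip_to_carina_cm(tube: Box, carina: Box, pixel_spacing_mm: float) -> float:
    """Estimate the tube-tip-to-carina distance in centimeters."""
    # Assumed tube tip: bottom-center of the tube bounding box.
    tip_x = (tube[0] + tube[2]) / 2
    tip_y = tube[3]
    # Assumed carina location: center of the carina bounding box.
    carina_x = (carina[0] + carina[2]) / 2
    carina_y = (carina[1] + carina[3]) / 2
    distance_px = math.hypot(carina_x - tip_x, carina_y - tip_y)
    return distance_px * pixel_spacing_mm / 10.0  # mm to cm


# Hypothetical boxes and pixel spacing, for illustration only.
tube_box = (100.0, 50.0, 120.0, 300.0)
carina_box = (95.0, 390.0, 135.0, 430.0)
distance_cm = tip_to_carina_cm(tube_box, carina_box, pixel_spacing_mm=0.4)
print(f"{distance_cm:.2f} cm")
```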
Effect of Independent Resident Night Call Versus 24-7 Attending Radiologist Coverage on Subsequent Practice Performance
Author: Alexia Tatem, Christine Lamoureux, Elizabeth Krupinski, Scott Weber, Kristen DeStigter, Michael Bruno
Journal Name: Journal of the American College of Radiology
Published: 07/17/2021
The traditional model of residency education in radiology, which has prevailed since the first residency programs in radiology were established in the 1930s, has included independent or autonomous after-hours coverage, in which radiology residents provide after-hours interpretations independently. Under this scenario, referring clinicians base their care decisions on the radiology residents’ preliminary interpretations, which are accepted as being subject to revision following attending radiologist review. Radiology residents receive delayed supervision by attending radiologists, most commonly the following morning, at which time any errors uncovered are remediated.
This standard practice has been frequently evaluated and is generally accepted as being extremely safe for patients, with a very low rate of significant resident errors and creating no appreciable increased risk of patient harm resulting from the residents’ role.
Detection of Superior Mesenteric Artery Occlusion on Abdominal CT Using a Machine Learning Model
Author: Robert Harris
Journal Name: SIIM Annual Meeting
Published: 05/24/2021
The superior mesenteric artery (SMA) branches from the aorta and is a major arterial supplier of the intestines. Occlusion of the SMA often results in bowel ischemia, which can lead to severe morbidity or death. SMA occlusion can be missed on abdominal CT in clinical practice. Our radiology practice processes a high volume of abdominal CT studies, and developing a screening tool for this pathology could improve patient care.
Hypothesis
We hypothesized that by training a bounding box-based machine learning model on a labeled dataset of both occluded and non-occluded SMA studies, this model could be used to prospectively screen patients for SMA occlusion (SMAO). The results could then be used in a quality assurance (QA) pipeline. We also hypothesized this model could be implemented with high specificity to avoid a large number of false positives in a high-throughput setting.
Methods
A natural language processing (NLP) model was used to select 142 retrospective post-contrast abdominal CT studies from our database that were positive for SMAO. The occlusions on each slice in these series were segmented by a radiologist and the segmentations were converted into bounding box data. A total of 1,286 images with SMAO were used. Additionally, 1,286 post-contrast abdominal CT images negative for SMAO were added to the training dataset. The model was trained using the YOLOv3 bounding box framework with a single output label for SMAO. A training/validation split of 90/10 was used, and the trained model was run on a test set containing SMAO and non-SMAO studies. The model was incorporated into our prospective data pipeline and run on incoming data, comparing the model result to the NLP result for a two-week period.
Results
The best validation mean average precision (mAP) achieved during training was 0.616. The test set AUC was 0.936. For prospective data, 21,326 post-contrast abdominal CT studies passed through our system over a two-week period with a sensitivity and specificity of 50.0% and 99.4%, respectively. Because SMAO is relatively rare, only 8 cases came through during this time, of which 4 were identified. Many of the false positives contained atherosclerosis or partial occlusions, which were considered negative for the purposes of this study but may also be important clinically.
Conclusion
A machine learning model was trained that identifies SMAO in the clinical setting with high specificity and reasonable sensitivity. This model can be incorporated into a clinical workflow to avoid missed diagnoses of this critical condition.
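The Methods mention converting radiologist segmentations into bounding box data. A minimal sketch of that kind of conversion follows, assuming each segmentation is a 2D binary mask and that the target is the normalized `(x_center, y_center, width, height)` layout commonly used for YOLO-style labels. The function names and the toy mask are hypothetical, and the authors' actual conversion code is not described in the abstract.

```python
def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of nonzero pixels, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return min(cols), min(rows), max(cols), max(rows)


def bbox_to_normalized(bbox, width, height):
    """Convert inclusive pixel bounds to normalized center/size values."""
    x_min, y_min, x_max, y_max = bbox
    # +1 because the maximum indices are inclusive pixel coordinates.
    box_w = (x_max - x_min + 1) / width
    box_h = (y_max - y_min + 1) / height
    x_center = (x_min + x_max + 1) / 2 / width
    y_center = (y_min + y_max + 1) / 2 / height
    return x_center, y_center, box_w, box_h


# Hypothetical 4x5 mask, for illustration only.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
bbox = mask_to_bbox(mask)
label = bbox_to_normalized(bbox, width=5, height=4)
print(bbox, label)
```

Images with an empty mask return `None`, which is how negative training images (no SMAO) could be distinguished in a pipeline like this.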
Natural Language Processing and Machine Learning for Detection of Respiratory Illness by Chest CT Imaging and Tracking of COVID-19 Pandemic in the US
Author: Ricardo Cury, Istvan Megyeri, Tony Lindsey, Robson Macedo, Juan Batlle, Shwan Kim, Brian Baker, Robert Harris, Reese Clark
Journal Name: Radiology Cardiothoracic Imaging
Published: 02/25/2021
Background
Coronavirus disease 2019 (COVID-19) has spread quickly throughout the United States (US) causing significant disruption in healthcare and society. Tools to identify hot spots are important for public health planning. The goal of our study was to determine if natural language processing (NLP) algorithm assessment of thoracic computed tomography (CT) imaging reports correlated with the incidence of official COVID-19 cases in the US.
Methods
Using de-identified HIPAA compliant patient data from our common imaging platform interconnected with over 2,100 facilities covering all 50 states, we developed three NLP algorithms to track positive CT imaging features of respiratory illness typical in SARS-CoV-2 viral infection. We compared our findings against the number of official COVID-19 cases daily, weekly, and state-wide.
Results
The NLP algorithms were applied to 450,114 patient chest CT comprehensive reports gathered from January 1st to October 3rd, 2020. The best performing NLP model exhibited strong correlation with daily official COVID-19 cases (r² = 0.82, p < 0.005). The NLP models demonstrated an early rise in cases followed by the increase of official cases, suggesting the possibility of an early predictive marker, with strong correlation to official cases on a weekly basis (r² = 0.91, p < 0.005). There was also substantial correlation between the NLP and official COVID-19 incidence by state (r² = 0.92, p < 0.005).
Conclusion
Using big data, we developed a novel machine-learning based NLP algorithm that can track imaging findings of respiratory illness detected on chest CT imaging reports with strong correlation with the progression of the COVID-19 pandemic in the US.
Automated Segmentation and Worklist Prioritization of Pneumoperitoneum in Abdominal CT Images Using a Convolutional Neural Network
Author: Robert Harris
Journal Name: RSNA Annual Meeting
Published: 02/12/2021
Purpose
Pneumoperitoneum, the presence of free gas in the peritoneal cavity, can be a sign of critical pathology such as bowel perforation or trauma. Pneumoperitoneum is often diagnosed with abdominal CT and early detection is important to a patient’s outcome. Our institution processes approximately 3,300 abdominal CT studies per day, of which 1.3% are positive for pneumoperitoneum. We hypothesized that a convolutional neural network could be trained to detect pneumoperitoneum in prospective patients in order to expedite patient care.
Method and materials
Natural language processing (NLP) of radiology CT reports was used retrospectively to identify 297 body CT studies containing pneumoperitoneum. Axial CT images of these studies were annotated by a Board Certified radiologist to train a convolutional neural network. The training dataset consisted of 2,986 positive images and their segmentations, along with an equal number of negative images. A uNet model was trained using ResNet32 as the backbone. The model was first applied to a test cohort of 100 patients. This model was then integrated with our teleradiology pipeline to screen prospective patients for pneumoperitoneum in real-time, with NLP of the subsequent radiology report used as ground truth.
Results
The model achieved an AUC of 0.906 on the test dataset. A detection threshold of 3 cc pneumoperitoneum was selected. Over a two-week period, for prospective patients, the model had a sensitivity of 50.1% and a specificity of 94.7%. The mean volume of pneumoperitoneum was 37.4 cc for true positives with a maximum of 413.5 cc.
Conclusion
An artificial intelligence model was trained to quantify pneumoperitoneum on CT images and implemented in a real-time clinical system. To our knowledge, this is the first use of machine learning to identify pneumoperitoneum on CT images and perform worklist prioritization for patients based on its presence. This model is currently being expanded to identify additional types of free air such as pneumothorax, pneumomediastinum, and soft tissue gas.
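The 3 cc detection threshold implies that a segmentation is converted to a volume before a study is flagged. A minimal sketch of that step is below, assuming the volume is the count of segmented voxels multiplied by the voxel size, and that a study is flagged when the volume is at or above the threshold. The voxel spacing, voxel counts, and function names are hypothetical, and whether the actual system uses "at or above" or "strictly above" is not stated.

```python
THRESHOLD_CC = 3.0  # detection threshold reported in the Results


def volume_cc(voxel_count: int, spacing_mm: tuple[float, float, float]) -> float:
    """Volume of a segmentation in cubic centimeters (1 cc = 1000 mm^3)."""
    dx, dy, dz = spacing_mm
    return voxel_count * dx * dy * dz / 1000.0


def should_prioritize(voxel_count: int, spacing_mm: tuple[float, float, float]) -> bool:
    """Flag the study when the segmented gas volume reaches the threshold."""
    return volume_cc(voxel_count, spacing_mm) >= THRESHOLD_CC


# Hypothetical voxel spacing and counts, for illustration only.
spacing = (0.8, 0.8, 2.5)
large = volume_cc(3000, spacing)
small = volume_cc(1000, spacing)
print(f"{large:.1f} cc -> flag: {should_prioritize(3000, spacing)}")
print(f"{small:.1f} cc -> flag: {should_prioritize(1000, spacing)}")
```

A volume threshold of this kind trades sensitivity for specificity: small pockets of gas below the cutoff are not flagged, which reduces false positives from segmentation noise.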
ACR Appropriateness Criteria® Chronic Foot Pain
Author: Monica Tafur, Jenny Bencardino, Catherine Roberts, Marc Appel, Angela Bell, Soterios Gyftopoulos, Darlene Metter, Douglas Mintz, William Morrison, Kirstin Small, Naveen Subhas, Barbara Weissman, Joseph Yu, Mark Kransdorf
Journal Name: Journal of the American College of Radiology
Published: 11/01/2020
Chronic foot pain is a frequent clinical complaint, which can significantly impact the quality of life in some individuals. These guidelines define best practices with regard to requisition of imaging studies based on specific clinical scenarios, which have been grouped into different variants. Each variant is accompanied by a brief description of the usefulness, advantages, and limitations of different imaging modalities. The present narrative is the result of an exhaustive assessment of the available literature and a thorough review process by a panel of experts on Musculoskeletal Imaging.
The American College of Radiology Appropriateness Criteria are evidence-based guidelines for specific clinical conditions that are reviewed annually by a multidisciplinary expert panel. The guideline development and revision include an extensive analysis of current medical literature from peer reviewed journals and the application of well-established methodologies (RAND/UCLA Appropriateness Method and Grading of Recommendations Assessment, Development, and Evaluation or GRADE) to rate the appropriateness of imaging and treatment procedures for specific clinical scenarios. In those instances where evidence is lacking or equivocal, expert opinion may supplement the available evidence to recommend imaging or treatment.
Classification of Endotracheal Tube Positioning on Chest XR using a Convolutional Neural Net Trained with Annotated Images
Author: Robert Harris, Jerry Lohr, Steve Towey, Tim Kabachenko, Ian Driscoll, Kate Weber, Shwan Kim, Brian Baker
Journal Name: SIIM Annual Meeting
Published: 06/24/2020
Endotracheal tube intubation is often used when patients are ill and require respiratory assistance. These tubes must be positioned properly in relation to the carina; too high and the lungs may not be respirated, too low and only one lung may be respirated. Our institution receives approximately 4,000 XR Chest images every day, 5% of which contain an endotracheal tube. If the tube is determined to be malpositioned by the reading radiologist, this information is relayed back to the site for tube adjustment. We hypothesized that by training a convolutional neural net using annotations of Chest XR images, we could localize both the endotracheal tube and the carina on prospective Chest XR data and use this information to classify images as having a malpositioned tube or not, along with the distance in cm that the tube must be adjusted if malpositioned.
Radiologist Opinions of a Quality Assurance Program: The Interaction Between Error, Emotion, and Preventative Action
Author: Christine Lamoureux, Jennifer Mahoney, Scott Weber, Jamlik-Omari Johnson, Tarek Hanna
Journal Name: Academic Radiology
Published: 03/02/2020
Rationale and Objectives
To investigate inter-relationships between radiologist opinions of a quality assurance (QA) program, QA Committee communications, negative emotions, self-identified risk factors, and preventive actions taken following major errors.
Materials and Methods
A 48-question electronic survey was distributed to all 431 radiologists within the same teleradiology organization between June 15 and July 3, 2018. Two reminders were sent during the survey time period. Descriptive statistics were generated, and comparisons were made with Fisher exact test. Significance level was set at p < 0.05.
Results
Response rate was 67.5% (291/431), and 72.5% of respondents completed all survey questions. A total of 64.3% of respondents were male, and the highest proportion of radiologists (28.9%, 187/291) had been in practice >20 years. Preventative actions following an error were positively correlated to a higher opinion of the QA process, self-identification of personal risk factors for error, and greater negative emotions following an error (all p < 0.05). A higher opinion of communications with the QA committee was associated with a positive opinion of the QA process (p < 0.001). An inverse relationship existed between negative emotion and opinion of QA committee communications (p < 0.05) and negative emotion and opinion of the QA process (p < 0.05). Radiologist gender and full time versus part time status had a significant effect on perception of the QA process (p < 0.05).
Conclusion
Radiologist opinions of their institutional QA process were related to the number of negative emotions experienced and preventative actions taken following major errors. Nurturing trust and incorporating more positive feedback in the QA process may improve interactions with QA Committees and mitigate future errors.
Classification of Aortic Dissection and Rupture on Post-contrast CT Images Using a Convolutional Neural Network
Author: Robert Harris, Shwan Kim, Jerry Lohr, Steve Towey, Zeljko Velichkovich, Tim Kabachenko, Ian Driscoll, Brian Baker
Journal Name: Journal of Digital Imaging
Published: 09/12/2019
Aortic dissections and ruptures are life-threatening injuries that must be immediately treated. Our national radiology practice receives dozens of these cases each month, but no automated process is currently available to check for critical pathologies before the images are opened by a radiologist. In this project, we developed a convolutional neural network model trained on aortic dissection and rupture data to assess the likelihood of these pathologies being present in prospective patients. This aortic injury model was used for study prioritization over the course of 4 weeks and model results were compared with clinicians’ reports to determine accuracy metrics. The model obtained a sensitivity and specificity of 87.8% and 96.0% for aortic dissection and 100% and 96.0% for aortic rupture. We observed a median reduction of 395 s in the time between study intake and radiologist review for studies that were prioritized by this model. False-positive and false-negative data were also collected for retraining to provide further improvements in subsequent versions of the model. The methodology described here can be applied to a number of modalities and pathologies moving forward.
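For readers less familiar with the accuracy metrics quoted above, the following minimal sketch shows how sensitivity and specificity are computed from confusion-matrix counts. The counts in the example are hypothetical round numbers and are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly positive studies that the model flagged."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive cases")
    return true_pos / total


def specificity(true_neg, false_pos):
    """Fraction of truly negative studies that the model left unflagged."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no negative cases")
    return true_neg / total


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(f"sensitivity: {sensitivity(90, 10):.1%}")
    print(f"specificity: {specificity(950, 50):.1%}")
```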
Effect of intravenous contrast for CT abdomen and pelvis on detection of urgent and non-urgent pathology: can repeat CT within 72 hours be avoided?
Author: Christine Lamoureux, Scott Weber, Tarek Hanna, Andrew Grabiel, Reese Clark
Journal Name: Emergency Radiology
Published: 07/22/2019
Purpose
To determine if administering IV contrast for CT abdomen and pelvis improves detection of urgent and clinically important non-urgent pathology in patients with urgent clinical symptoms compared to patients not receiving IV contrast, and in turn to determine whether repeat CT exams on the same patient within 72 h were of low diagnostic benefit if the first CT was performed with IV contrast.
Methods
We evaluated 400 consecutive patients who had CT abdomen and pelvis (CT AP) examinations repeated within 72 h. For each patient, demographic data, reason for examination, examination time stamps, and examination technique were documented. CT AP radiology reports were reviewed and both urgent and non-urgent pathology was extracted.
Results
Of 400 patients, 63% had their initial CT AP without contrast. Administration of IV contrast for the first CT AP was associated with increased detection of urgent findings compared with non-contrast CT (p = 0.004) and a contrast-enhanced CT AP following an initial non-contrast CT AP examination better characterized both urgent (p = 0.002) and non-urgent findings (p < 0.001). Adherence to ACR appropriateness criteria for IV contrast administration was associated with increased detection of urgent pathology on the first CT (p = 0.02), and the second CT was more likely to be performed with IV contrast if recommended by the radiologist reading the first CT (p = 0.0006).
Conclusion
In the absence of contraindications, encouraging urgent care physicians to preferentially order IV contrast-enhanced CT AP examinations in adherence with ACR appropriateness criteria may increase detection of urgent pathology and avoid short-term repeat CT AP.
ACR Appropriateness Criteria® Shoulder Pain-Atraumatic
Author: Kirstin Small, Ronald Adler, Shaan Shah, Catherine Roberts, Jenny Bencardino, Marc Appel, Soterios Gyftopoulos, Darlene Metter, Douglas Mintz, William Morrison, Naveen Subhas, Ralf Thiele, Jeffrey Towers, Katherine Tynus, Barbara Weissman, Joseph Yu, Mark Kransdorf
Journal Name: Journal of the American College of Radiology
Published: 11/01/2018
Shoulder pain is one of the most common reasons for musculoskeletal-related physician visits. Imaging plays an important role in identifying the specific cause of atraumatic shoulder pain. This review is divided into two parts. The first part provides a general discussion of various imaging modalities (radiographs, arthrography, nuclear medicine, ultrasound, CT, and MRI) and their usefulness in evaluating atraumatic shoulder pain. The second part focuses on the most appropriate imaging algorithms for specific shoulder conditions including: rotator cuff disorders, labral tear/instability, bursitis, adhesive capsulitis, biceps tendon abnormalities, postoperative rotator cuff tears, and neurogenic pain.
The American College of Radiology Appropriateness Criteria are evidence-based guidelines for specific clinical conditions that are reviewed annually by a multidisciplinary expert panel. The guideline development and revision include an extensive analysis of current medical literature from peer reviewed journals and the application of well-established methodologies (RAND/UCLA Appropriateness Method and Grading of Recommendations Assessment, Development, and Evaluation or GRADE) to rate the appropriateness of imaging and treatment procedures for specific clinical scenarios. In those instances where evidence is lacking or equivocal, expert opinion may supplement the available evidence to recommend imaging or treatment.
Expert Panel on Musculoskeletal Imaging. ACR Appropriateness Criteria® Chronic Wrist Pain
Author: David Rubin, Catherine Roberts, Jenny Bencardino, Angela Bell, Carter Cassidy, Eric Chang, Soterios Gyftopoulos, Darlene Metter, William Morrison, Naveen Subhas, Siddharth Tambar, Jeffrey Towers, Joseph Yu, Mark Kransdorf
Journal Name: Journal of the American College of Radiology
Published: 05/01/2018
Radiographs are indicated as the first imaging test in all patients with chronic wrist pain, regardless of the suspected diagnosis. When radiographs are normal or equivocal, advanced imaging with MRI (with or without intravenous contrast or following arthrography), CT (usually without contrast), and ultrasound each has a role in establishing a diagnosis. Furthermore, these examinations may contribute to staging disease, treatment planning, and prognostication, even when radiographs are diagnostic of a specific condition. Which examination or examinations are best depends on the specific location of pain and the clinically suspected conditions.
The American College of Radiology Appropriateness Criteria are evidence-based guidelines for specific clinical conditions that are reviewed annually by a multidisciplinary expert panel. The guideline development and revision include an extensive analysis of current medical literature from peer reviewed journals and the application of well-established methodologies (RAND/UCLA Appropriateness Method and Grading of Recommendations Assessment, Development, and Evaluation or GRADE) to rate the appropriateness of imaging and treatment procedures for specific clinical scenarios. In those instances where evidence is lacking or equivocal, expert opinion may supplement the available evidence to recommend imaging or treatment.
The Benefit of a Triage System to Expedite Acute Stroke Head Computed Tomography Interpretations
Author: Thomas Osborne, Andrew Grabiel, Reese Clark
Journal Name: Journal of Stroke and Cerebrovascular Diseases
Published: 01/03/2018
Background and purpose
We developed and tested a triage system to accelerate the interpretation of stroke head computed tomographies (CTs), with the goal of optimizing the time available for acute stroke therapy.
Materials and methods
In our practice, acute stroke protocol head CTs have been given the highest reading priority. We implemented a technologically enabled prioritization infrastructure to consistently present these critical cases to our radiologists so they are evaluated before other examinations. In our 1-year retrospective multicenter study of 350,495 head CT examinations, we compared the reading time of stroke protocol head CTs to our next highest priority head CT.
Results
Our average acute stroke head CT reading turnaround time was 6.5 minutes. This represented a 17.3-minute improvement over the next highest priority head CT in our practice (confidence interval: 17.2-17.4 minutes, P < .001).
Conclusions
A technologically enabled acute stroke protocol CT triage system consistently improves the reading times of critically time-dependent head CT examinations. As a result, this system has the potential to improve treatment times, treatment eligibility, and clinical outcomes.
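The prioritization idea described above can be sketched as a simple priority worklist. This is only an illustration under stated assumptions: the class name, the numeric priority levels, and the study identifiers are hypothetical, and the abstract does not describe how the actual infrastructure is implemented.

```python
import heapq
import itertools


class Worklist:
    """Serve studies by priority (lower number first), then arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker keeps first-in order

    def add(self, study_id, priority):
        heapq.heappush(self._heap, (priority, next(self._arrival), study_id))

    def next_study(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    wl = Worklist()
    wl.add("head-ct-routine", priority=2)
    wl.add("abdomen-ct", priority=3)
    wl.add("head-ct-stroke", priority=0)   # hypothetical top priority level
    wl.add("head-ct-trauma", priority=1)
    while (study := wl.next_study()) is not None:
        print(study)
```

The arrival counter ensures that two studies with the same priority are read in the order they were received.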
Effect of Shift, Schedule, and Volume on Interpretive Accuracy: A Retrospective Analysis of 2.9 Million Radiologic Examinations
Author: Tarek Hanna, Christine Lamoureux, Elizabeth Krupinski, Scott Weber, Jamlik-Omari Johnson
Journal Name: Radiology
Published: 11/02/2017
Purpose
To determine whether there is an association between radiologist shift length, schedule, or examination volume and interpretive accuracy.
Materials and Methods
This study was institutional review board approved and HIPAA compliant. A retrospective analysis of all major discrepancies from a 2015 quality assurance database of a teleradiology practice was performed. Board-certified radiologists provided initial preliminary interpretations. Discrepancies were identified during a secondary review by a practicing radiologist or through an internal quality assurance process and were vetted through a consensus radiology quality assurance committee. Unique anonymous radiologist identifiers were used to link the discrepancies to radiologists’ shifts and schedules. Data were analyzed by using analysis of variance, t test, or χ² test.
Results
A total of 4294 major discrepancies resulted from 2 922 377 examinations (0.15%). There was a significant difference for shift length (P < .0001) and volume (P < .0001) for shifts with versus those without discrepancies. On average, errors occurred a mean (± standard deviation) of 8.97 hours ± 2.28 into the shift (median, 10 hours; interquartile range, 2.0 hours). Significantly more errors occurred late in shifts than early (P < .0001), peaking between 10 and 12 hours. The number of major discrepancies in a single shift ranged from one to four, with a significant difference in the number of discrepancies as a function of study volume (volume for all shifts, 67.60 ± 60.24; volume for shifts with major discrepancies, 118.96 ± 66.89; P < .001). Despite a trend for more discrepancies after more consecutive days worked, the difference was not significant (P = .0893).
Conclusion
Longer shifts and higher diagnostic examination volumes are associated with increased major interpretive discrepancies. These are more likely to occur later in a shift, peaking after the 10th hour of work.
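The overall discrepancy rate reported in the Results can be reproduced with a short calculation. The helper name below is an assumption for illustration; the two counts are the ones stated in the abstract.

```python
def rate_percent(events, total):
    """Return events as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * events / total


if __name__ == "__main__":
    major_discrepancies = 4294
    examinations = 2_922_377
    rate = rate_percent(major_discrepancies, examinations)
    print(f"{rate:.2f}% of examinations had a major discrepancy")
    print(f"about {rate * 100:.1f} per 10,000 examinations")
```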
Emergency Radiology Practice Patterns: Shifts, Schedules, and Job Satisfaction
Author: Tarek Hanna, Haris Shekhani, Christine Lamoureux, Hanna Mar, Refky Nicola, Clint Sliker, Jamlik-Omari Johnson
Journal Name: Journal of the American College of Radiology
Published: 12/04/2016
Purpose
To assess the practice environment of emergency radiologists with a focus on schedule, job satisfaction, and self-perception of health, wellness, and diagnostic accuracy.
Methods
A survey drawing from prior radiology and health care shift-work literature was distributed via e-mail to national societies, teleradiology groups, and private practices. The survey remained open for 4 weeks in 2016, with one reminder. Data were analyzed using hypothesis testing and logistic regression modeling.
Results
Response rate was 29.6% (327/1106); 69.1% of respondents (n = 226) were greater than 40 years old, 73% (n = 240) were male, and 87% (n = 284) practiced full time. With regard to annual overnight shifts (NS): 36% (n = 118) did none, 24.9% (n = 81) did 182 or more, and 15.6% (n = 51) did 119. There was a significant association between average NS worked per year and both perceived negative health effects (P < .01) and negative impact on memory (P < .01). There was an inverse association between overall job enjoyment and number of annual NS (P < .05). The odds of agreeing to the statement “I enjoy my job” for radiologists who work no NS is 2.21 times greater than for radiologists who work at least 119 NS, when shift length is held constant. Radiologists with 11+ years of experience who work no NS or 1 to 100 NS annually have lower odds of feeling overwhelmed when compared with those working the same number of NS with <10 years’ experience.
Conclusion
There is significant variation in emergency radiology practice patterns. Annual NS burden is associated with lower job satisfaction and negative health self-perception.