File: 2013-01-01-I-PrACTISE.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “I-PrACTISE” categories:
tags:
Helped to create the Improving Primary Care Through Industrial and Systems Engineering (I-PrACTISE) collaborative. I-PrACTISE is an educational and research collaborative focused on connecting problems in primary care with solutions from industrial engineering.
It is a formal partnership between the University of Wisconsin Department of Industrial and Systems Engineering, and the Departments of Family Medicine and Community Health, Medicine and Pediatrics of the UW School of Medicine and Public Health.
I-PrACTISE focuses on applying industrial engineering methods and systems thinking to primary care settings, with the aim of improving patient outcomes while reducing costs and minimizing waste. In doing so, the collaborative seeks to address some of the challenges facing modern healthcare delivery, including rising healthcare costs, limited resources, and clinician burnout.
The goal of I-PrACTISE is to provide a home for cross-disciplinary research that fosters innovative solutions involving the re-engineering of existing clinical workflows and tools.
The care of patients will be improved and the practice of primary care medicine will become more efficient through new knowledge and techniques created by the collaboration between Industrial Engineering and the primary care specialties.
Create a home for scholars and clinicians with interest and expertise in industrial engineering and/or primary care to conduct funded projects directed at improving the quality of primary care for patients, clinicians and staff.
The membership consists of interested UW Faculty from the School of Medicine and Public Health and the Department of Industrial and Systems Engineering as well as interested scholars from other professions and institutions.
File: 2013-04-01-I-PrACTISE-White-Paper.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “I-PrACTISE White Paper” categories:
The first Improving PrimAry Care Through Industrial and Systems Engineering (I-PrACTISE) conference was held at Union South at the University of Wisconsin-Madison in April 2013. It was funded by the Agency for Healthcare Research and Quality and co-sponsored by the UW-Madison Departments of Family Medicine and Industrial and Systems Engineering. A key objective of the first I-PrACTISE conference was to develop a cross-disciplinary research agenda, bringing together engineers and physicians.
I helped to organize themes from across the conference and created this paper to summarize our findings.
Primary healthcare is in critical condition with too few students selecting careers, multiple competing demands stressing clinicians, and increasing numbers of elderly patients with multiple health problems. The potential for transdisciplinary research using Industrial and Systems Engineering (ISyE) approaches and methods to study and improve the quality and efficiency of primary care is increasingly recognized. To accelerate the development and application of this research, the National Collaborative to Improve Primary Care through Industrial and Systems Engineering (I-PrACTISE) sponsored an invitational conference in April, 2013 which brought together experts in primary care and ISyE. Seven workgroups were formed, organized around the principles of the Patient Centered Medical Home: Team-Based Care, Coordination and Integration, Health Information Technology (HIT) – Registries and Exchanges, HIT – Clinical Decision Support and Electronic Health Records, Patient Engagement, Access and Scheduling, and Addressing All Health Needs. These groups: (A) Explored critical issues from a primary care perspective and ISyE tools and methods that could address these issues; (B) Generated potential research questions; and (C) Described methods and resources, including other collaborations, needed to conduct this research.
Download paper.
————————
File: 2015-01-31-SMS-Website.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Send Me Specials Website” categories:
In the days before widespread smartphone adoption, it was hard for broke college students on the go to find deals on meals and drinks.
SMS bottlecap logo
To enable restaurants and bars to reach out to college-age customers, Adam Maus and I created a custom text message gateway integrated with an application and website. These businesses could upload information about their menus and weekly specials and then share them with interested customers by sending out a text message blast.
SMS welcome screen
SMS gateway services existed at the time, but they were very expensive (i.e., you had to pay for each text). To avoid paying per text, we got an Android smartphone and had it serve as the text message router. We had a web service that would pass information to an app on the smartphone, which would then send text messages using the phone’s unlimited data and text plan.
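Here’s a minimal sketch of that pattern, with hypothetical endpoint and field names (the real system was a full web app plus a custom Android client, so this only shows the shape of the idea): the web service queues a blast, and the phone periodically pulls pending messages and sends each one itself.

```python
# Minimal sketch of the gateway pattern; endpoint and field names are hypothetical,
# not the actual Send Me Specials code.
from flask import Flask, request, jsonify

app = Flask(__name__)
outbox = []  # queued text messages waiting for the phone to pick up


@app.route("/blast", methods=["POST"])
def queue_blast():
    """A business uploads a special; fan it out to every subscriber."""
    payload = request.get_json()
    for number in payload["subscribers"]:
        outbox.append({"to": number, "body": payload["message"]})
    return jsonify(queued=len(payload["subscribers"]))


@app.route("/pending", methods=["GET"])
def pending():
    """The Android app polls this endpoint, then sends each SMS locally."""
    batch = list(outbox)
    outbox.clear()
    return jsonify(messages=batch)


if __name__ == "__main__":
    app.run()
```

On the device side, the app would hand each pulled message to something like Android’s SmsManager to actually send the text over the unlimited plan.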
SMS messaging screen
Ultimately, while we were technically successful, this project didn’t really go anywhere. We were not addressing a pain point that businesses in Madison were experiencing. Students would have benefited, but they weren’t our “customers”. It was a cautionary tale about doing good customer discovery and working hard to achieve product-market fit; that matters more than cool technology.
File: 2015-03-31-SHS-FlexSim.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “2015 FlexSim - SHS ED Modeling Competition” categories:
I led the University of Wisconsin team to victory in the inaugural FlexSim - SHS Emergency Department Modeling Competition in 2015. This international competition was sponsored by Flexsim Healthcare and took place at the 2015 Orlando Society for Health Systems conference. The team consisted of Samuel Schmitt, April Sell, Michael Russo and myself. We were advised by Dr. Brian Patterson and Dr. Laura Albert.
This case competition involved optimizing the operations of an emergency department (ED) using discrete event simulation and operations research tools. The goal was to analyze the Susquehanna Health ED’s current operations and determine the best care delivery model to meet productivity requirements while satisfying staffing and care constraints.
We used a combination of discrete event simulation (FlexSim healthcare software), design of experiments, and mathematical programming to determine the ideal care delivery model. See below for a copy of our winning presentation.
Susquehanna Health, a four‐hospital, not‐for‐profit health system, has deployed an Emergency Department (ED) Leadership Team to reduce expenses and optimize operations at their flagship hospital, Williamsport Regional Medical Center (WRMC). The Emergency Department has been experiencing pressure from a recently enacted marketing campaign that ensures patients are seen by a provider in 30 minutes or less at two competitor hospitals in the region. This campaign concerns Susquehanna Health because their current average door to provider time is 42.7 minutes with peak times as long as 140 minutes. As a result, 2.8% of their patients are leaving without being seen.
The Susquehanna Health System needs to be competitive in order to face today’s healthcare trends of declining reimbursement, increasingly high debt, and greater focus on outpatient services. The Emergency Department Leadership Team reached out to UW‐Madison’s Industrial & Systems Engineering students to assist them in creating a simulation that will help them improve patient safety, staff productivity, and overall efficiency.
The UW‐Madison Industrial & Systems Engineering students developed a discrete‐event simulation of WRMC Emergency Department’s traditional triage and bed process using FlexSim HC simulation software. Input data consisted of processing time distributions and probabilities supplied from the Emergency Department Leadership Team. To enhance the accuracy of the model, the team also collaborated with physicians at the University of Wisconsin Hospitals and Clinics (UWHC) to gather information on average processing times. Based on best practices in other institutions, simulation models were created to represent the two additional delivery methods: PITT and PITT/Super Fast Track.
After the modeling process was completed, the team ran a series of experiments to determine the optimal delivery method and staffing levels. Super Fast Track appeared to be the best delivery system; however, the team recommends that this analysis be redone on a more powerful machine. The machine used for modeling was not powerful enough to run the simulation experiments needed for statistical certainty.
The team views this as the first phase of a longer-term project. The team will continue to refine the model and run new experiments once a new machine is procured. Collaborators at the UW-Madison School of Medicine and Public Health have asked the team to build a second set of models to be used for the UW Health ED.
File: 2015-10-12-Optimizing-the-ER.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Wisconsin Engineer: Optimizing the ER” categories:
April Sell, Samuel Schmitt, and I discussed our win at the Flexsim-SHS Emergency Department Modeling Competition with Kelsey Murphy for an article in the Wisconsin Engineer magazine.

File: 2015-10-31-Predicting-ED-Patient-Throughput.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Predicting ED Patient Throughput Times Utilizing Machine Learning” categories:
Annals of Emergency Medicine research forum abstract. Work done in conjunction with Dr. Brian Patterson and Dr. Laura Albert. Link to paper.
Patient throughput time in the emergency department is a critical metric affecting patient satisfaction and service efficiency. We performed a retrospective analysis of electronic medical record (EMR) derived data to evaluate the effectiveness of multiple modeling techniques in predicting throughput times for patient encounters in an academic emergency department (ED). Analysis was conducted using various modeling techniques and on differing amounts of information about each patient encounter. We hypothesized that more comprehensive and inclusive models would provide greater predictive power.
Retrospective medical record review was performed on consecutive patients at a single, academic, university-based ED. Data were extracted from an EMR derived dataset. All patients who presented from January 1, 2011 to December 31, 2013 and met inclusion criteria were included in the analysis. The data were then partitioned into two sets: one for developing models (training) and a second for analyzing the predictive power of these models (testing). The Table lists model types used. The primary outcome measured was the ability of the trained models to accurately predict the throughput times of test data, measured in terms of mean absolute error (MAE). Secondary outcomes were R2 and mean squared error (MSE). Model factors included a mix of patient specific factors such as triage vital signs, age, chief complaint; factors representing the state of the ED such as census and running average throughput time; and timing factors such as time of day, day of week, and month. The most comprehensive models included a total of 29 distinct factors.
Of the 134,194 patients seen in the 3-year period of the study, 128,252 met the inclusion criteria; the mean throughput time was 183.327 min (SD = 98.447 min). Compared to using a single average throughput time as a naïve model (MAE = 80.801 min), univariate models provided improved predictive abilities. More sophisticated models, using machine learning methods and including all available factors, provided greater predictive power, with the lowest MAE achieved at 73.184 min.
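As a rough illustration of this evaluation setup (hypothetical column names, not the study’s actual pipeline), comparing a naïve mean predictor to a multivariate regression on MAE might look like this:

```python
# Illustrative sketch only: hypothetical file and column names, not the study's code.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

encounters = pd.read_csv("ed_encounters.csv")  # hypothetical EMR-derived extract
features = ["age", "triage_heart_rate", "ed_census", "running_avg_throughput", "hour_of_day"]
X_train, X_test, y_train, y_test = train_test_split(
    encounters[features], encounters["throughput_min"], test_size=0.3, random_state=0
)

# Naive model: predict the training-set mean throughput for every encounter.
naive_mae = mean_absolute_error(y_test, [y_train.mean()] * len(y_test))

# Multivariate model using patient, ED-state, and timing factors together.
model = LinearRegression().fit(X_train, y_train)
model_mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"naive MAE = {naive_mae:.1f} min, model MAE = {model_mae:.1f} min")
```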
We have demonstrated that including information about incoming patients and the state of the ED at the time of an arrival can aid in the prediction of individual patients’ throughput times. The Multiple Linear Regression model, including all available factors, had the highest predictive accuracy, reducing mean absolute error by over 9% compared to the naïve model. While this represents an improvement in the current state of the art, we believe there is room for further work to generate high quality individual patient predictions. More sophisticated models based on ED workflows may lead to greater predictive power to prospectively estimate patient throughput times at arrival.
Download paper.
————————
File: 2015-12-31-Arena-Simulation-Modeling-Course.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Arena Simulation Modeling Course” categories:
I developed an online course to introduce the Arena simulation application. Arena is a discrete event simulation tool that is widely used throughout the field of industrial engineering. Despite its frequent use and inclusion in undergraduate curricula, it is often not well understood by students, largely due to a lack of high-quality training materials.
I taught an in-person simulation lab (ISyE 321) and assisted in teaching a theory of simulation course (ISyE 320) with Dr. Laura Albert in 2015 at the University of Wisconsin. During this time I developed a series of modules to demonstrate the functionality of Arena. I subsequently recorded these modules and developed a free online course that is available on YouTube.
Here’s the first video in the online Arena course that I developed:
File: 2016-02-13-Cherry-Picking.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Cherry Picking Patients: Examining the Interval Between Patient Rooming and Resident Self-assignment” categories:
Study titled “Cherry Picking Patients: Examining the Interval Between Patient Rooming and Resident Self-assignment”. We aimed to evaluate the association between patient chief complaint and the time interval between patient rooming and resident physician self-assignment (“pickup time”). The team hypothesized that significant variation in pickup time would exist based on chief complaint, thereby uncovering resident preferences in patient presentations.[^1]
The authorship team consisted of Brian W. Patterson MD, MPH, Robert J. Batt PhD, Morgan D. Wilbanks MD, myself, Mary C. Westergaard MD, and Manish N. Shah MD, MPH.
We aimed to evaluate the association between patient chief complaint and the time interval between patient rooming and resident physician self-assignment (“pickup time”). We hypothesized that significant variation in pickup time would exist based on chief complaint, thereby uncovering resident preferences in patient presentations.
A retrospective medical record review was performed on consecutive patients at a single, academic, university-based emergency department with over 50,000 visits per year. All patients who presented from August 1, 2012, to July 31, 2013, and were initially seen by a resident were included in the analysis. Patients were excluded if not seen primarily by a resident or if registered with a chief complaint associated with trauma team activation. Data were abstracted from the electronic health record (EHR). The outcome measured was “pickup time,” defined as the time interval between room assignment and resident self-assignment. We examined all complaints with >100 visits, with the remaining complaints included in the model in an “other” category. A proportional hazards model was created to control for the following prespecified demographic and clinical factors: age, race, sex, arrival mode, admission vital signs, Emergency Severity Index code, waiting room time before rooming, and waiting room census at time of rooming.
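For readers curious what such a model looks like in code, here is a minimal sketch using the lifelines package with hypothetical column names; it is not the study’s actual implementation.

```python
# Sketch using the lifelines package; file and column names here are hypothetical.
import pandas as pd
from lifelines import CoxPHFitter

visits = pd.read_csv("ed_pickup_times.csv")  # one row per roomed patient
covariates = ["pickup_minutes", "age", "esi_code", "waiting_room_census", "chief_complaint_group"]
df = pd.get_dummies(visits[covariates], columns=["chief_complaint_group"], drop_first=True)

# Every rooming in this sketch ends in a pickup, so all observations are "events".
cph = CoxPHFitter()
cph.fit(df, duration_col="pickup_minutes")
cph.print_summary()  # hazard ratios by chief complaint group, adjusted for the other factors
```

The exponentiated coefficients are hazard ratios: chief complaint groups with hazard ratios below one are picked up more slowly, all else being equal.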
Of the 30,382 patients eligible for the study, the median time to pickup was 6 minutes (interquartile range = 2–15 minutes). After controlling for the above factors, we found systematic and significant variation in the pickup time by chief complaint, with the longest times for patients with complaints of abdominal problems, numbness/tingling, and vaginal bleeding and shortest times for patients with ankle injury, allergic reaction, and wrist injury.
A consistent variation in resident pickup time exists for common chief complaints. We suspect that this reflects residents preferentially choosing patients with simpler workups and less perceived diagnostic ambiguity. This work introduces pickup time as a metric that may be useful in the future to uncover and address potential physician bias. Further work is necessary to establish whether practice patterns in this study are carried beyond residency and persist among attendings in the community and how these patterns are shaped by the information presented via the EHR.
File: 2016-05-31-Forecasting-ED-Patient-Admissions.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Forecasting ED Patient Admissions Utilizing ML” categories:
“Forecasting Emergency Department Patient Admissions Utilizing Machine Learning” was a clinical abstract submitted to Academic Emergency Medicine. In this study, we aimed to predict the need for admission at the time of patient triage utilizing data already available in the electronic health record (EHR). We performed a retrospective analysis of EHR-derived data to evaluate the effectiveness of machine learning techniques in predicting the likelihood of admission for patient encounters in an academic emergency department. We hypothesized that more comprehensive & inclusive models would provide greater predictive power.
This work was done in conjunction with Dr. Brian Patterson, Dr. Jillian Gorski, and Dr. Laura Albert.
Multiple studies have identified inpatient bed availability as a key metric for Emergency Department operational performance. Early planning for patient admissions may allow for optimization of hospital resources.
Our study aimed to predict the need for admission at the time of patient triage utilizing data already available in the electronic health record (EHR). We performed a retrospective analysis of EHR derived data to evaluate the effectiveness of machine learning techniques in predicting the likelihood of admission for patient encounters in an academic emergency department. We hypothesized that more comprehensive & inclusive models would provide greater predictive power.
All patients who presented from 1/1/2012 to 12/31/2013 and met inclusion criteria were included in the analysis. The data were then partitioned into two sets for training and testing. The primary outcome measured was the ability of the trained models to discern the future admission status of an encounter, measured in terms of area under the receiver operator curve (ROC AUC). A secondary outcome was accuracy (ACC). Model features included a mix of patient specific factors (demographics, triage vital signs, visit and chief complaint history), the state of the ED (census and other performance metrics); and timing factors (time of day, etc.). The most comprehensive models included 682 variables, encoding 328 features, aggregated into 3 feature groups.
Our final analysis included 91,060 patient encounters, 28,838 (31.7%) of which resulted in an inpatient admission. Compared to a naïve model, single-feature-group models provided improved predictive abilities (1.8%-50.8% improvement in ROC AUC; see figure for details). More sophisticated models including all available feature groups provided greater predictive power, with the best model achieving a ROC AUC of 0.756.
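A simplified sketch of this kind of triage-time admission model (hypothetical features and file names, not the study code) might look like:

```python
# Illustrative only: hypothetical feature groups and file name, not the study's pipeline.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

df = pd.read_csv("triage_encounters.csv")
patient_cols = ["age", "triage_sbp", "triage_hr", "prior_visits"]
ed_state_cols = ["ed_census", "waiting_count"]
timing_cols = ["hour_of_day", "day_of_week"]

X = df[patient_cols + ed_state_cols + timing_cols]
y = df["admitted"]  # 1 if the encounter ended in an inpatient admission
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = GradientBoostingClassifier().fit(X_train, y_train)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"ROC AUC = {auc:.3f}")
```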
We have demonstrated that including information about incoming patients and the state of the ED at the time of triage can aid in the prediction of individual patients’ likelihood of admission. More sophisticated models using claims, weather, and social media data may lead to greater predictive power to prospectively estimate patient admission likelihood at arrival.
File: 2016-06-31-I-PrACTISE-Colloquia-Primary-Care-Predictive-Analytics.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “I-PrACTISE Colloquium Primary Care & Predictive Analytics” categories:
I had the opportunity to give a talk titled “Primary Care & Predictive Analytics” as a part of the I-PrACTISE colloquia series. We discussed artificial intelligence/machine learning and their applications in medicine, with a particular focus on primary care. In the presentation, I aimed to demystify machine learning, discuss its potential benefits in healthcare, and address the challenges associated with implementing these cutting-edge techniques.
Machine learning is a discipline that explores the construction and study of algorithms that can learn from data. These algorithms improve their performance at specific tasks as they gain experience, which is often measured in terms of data. In my talk, I explained the concept of machine learning by drawing parallels between training an algorithm and training an undergraduate. Just as we teach undergraduates general concepts and facts that they then synthesize and apply to specific situations, we train algorithms using data to improve their performance at a given task.
Machine learning has the potential to revolutionize the field of medicine, and primary care is no exception. By leveraging vast amounts of data, we can train algorithms to predict patient outcomes, diagnose conditions more accurately, and identify potential treatment options. For example, we could use machine learning to analyze tumor samples and train a model to evaluate new samples, helping doctors make more informed decisions about cancer diagnosis and treatment.
Despite its potential, there are several challenges to integrating machine learning into healthcare, particularly in sensitive areas like primary care. One of the key issues I addressed in my talk is the need for collaboration between engineers, computer scientists, statisticians, and healthcare professionals to ensure that these advanced techniques are applied responsibly and effectively.
Additionally, it is crucial to consider the human factors involved in implementing machine learning in healthcare settings. Understanding how healthcare providers interact with and use these algorithms is essential to ensuring their successful integration into medical practice.
As we continue to explore the potential of machine learning in primary care and the broader medical field, it is vital to remain focused on responsible development and implementation. By collaborating across disciplines and considering the human factors involved, we can work towards harnessing the power of machine learning to improve patient outcomes and revolutionize healthcare.
File: 2016-07-01-Metastar-Community-Pharmacy_Initiative.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “A community pharmacy initiative to decrease hospital readmissions by increasing patient adherence and competency of therapy” categories:
While working as the lead data scientist at MetaStar, I helped to analyze the impact of a community pharmacy-based intervention designed to reduce the rate of hospital admissions and readmissions. For patients enrolled in the intervention, the community pharmacy delivered medications to their homes and provided education. We found that enrolling patients in the program reduced their rate of admissions.
Direct pharmacist care has been associated with substantial reduction in hospital admission and readmission rates and other positive outcomes, as compared with the absence of such care.
To decrease readmissions for community pharmacy patients through a program of improved medication packaging, delivery and patient education.
Comparison of the number of admissions and readmissions for each patient enrolled in the program, comparing the time elapsed since enrollment with the equivalent period prior to enrollment.
A community pharmacy in Kenosha, Wisconsin.
Medicare beneficiaries served by the community pharmacy conducting the intervention. This includes 263 patients, 167 of whom are Medicare beneficiaries, who had been placed in the intervention group as of June 2016.
A voluntary program to package medications according to patient-specific characteristics and physician orders, to deliver medication to patients’ homes, and to educate and follow up with patients regarding problems with adherence.
Hospital admissions and readmissions post-enrollment as compared with the equivalent pre-enrollment period.
An analysis that limits the study period to a year centered on the patient’s enrollment date in the PACT intervention found a highly statistically significant (p < 0.01) reduction in admissions. An analysis that included the entire duration of the patient’s enrollment in PACT also found a statistically significant (p < 0.001) reduction in admissions. However, neither analytic technique found a statistically significant reduction in readmissions (p=0.2 and 0.1 respectively).
The study’s inability to show a decrease in readmissions accompanying the decrease in admissions may be due to the success of the intervention in decreasing the denominator as well as the numerator of the readmissions measure. In addition, the study did not stratify for changes in the intervention over time, or for differences in patient characteristics or outcomes other than admissions and readmissions.
File: 2016-08-01-INFORMS-in-the-News.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Quoted in INFORMS in the News” categories:
File: 2016-09-19-Impact-of-ED-Census-on-Admission.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “The Impact of ED Census on the Decision to Admit” categories:
Academic Emergency Medicine paper studying the impact of ED census on admission decisions: The Impact of Emergency Department Census on the Decision to Admit.
Jillian K. Gorski, Robert J. Batt, PhD, myself, Manish N. Shah, MD MPH, Azita G. Hamedani MD, MPH, MBA, and Brian W. Patterson MD, MPH, studied the impact of emergency department (ED) census on disposition decisions made by ED physicians. Our findings reveal that disposition decisions in the ED are not solely influenced by objective measures of a patient’s condition, but are also affected by workflow-related concerns.
The retrospective analysis involved 18 months of all adult patient encounters in the main ED at an academic tertiary care center. The results demonstrated that both waiting room census and physician load census were significantly associated with an increased likelihood of patient admission. This highlights the need to consider workflow-related factors when making disposition decisions, in order to ensure optimal patient care and resource allocation in emergency departments.
We evaluated the effect of emergency department (ED) census on disposition decisions made by ED physicians.
We performed a retrospective analysis using 18 months of all adult patient encounters seen in the main ED at an academic tertiary care center. Patient census information was calculated at the time of physician assignment for each individual patient and included the number of patients in the waiting room (waiting room census) and number of patients being managed by the patient’s attending (physician load census). A multiple logistic regression model was created to assess the association between these census variables and the disposition decision, controlling for potential confounders including Emergency Severity Index acuity, patient demographics, arrival hour, arrival mode, and chief complaint.
A total of 49,487 patient visits were included in this analysis, of whom 37% were admitted to the hospital. Both census measures were significantly associated with increased chance of admission; the odds ratio (OR) per patient increase for waiting room census was 1.011 (95% confidence interval [CI] = 1.001 to 1.020), and the OR for physician load census was 1.010 (95% CI = 1.002 to 1.019). To put this in practical terms, this translated to a modeled rise from 35.3% to 40.1% when shifting from an empty waiting room and zero patient load to a 12-patient wait and 16-patient load for a given physician.
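That practical translation can be sanity-checked on the odds scale. The back-of-envelope below uses only the numbers reported in the abstract, so it will not exactly reproduce the 40.1% figure, which comes from the full adjusted model.

```python
# Back-of-envelope translation of the reported odds ratios; not the paper's model output.
base_p = 0.353                 # modeled admission probability with an empty waiting room and zero load
base_odds = base_p / (1 - base_p)

or_wr, or_load = 1.011, 1.010  # per-patient odds ratios from the abstract
shifted_odds = base_odds * or_wr**12 * or_load**16  # 12-patient wait, 16-patient physician load

shifted_p = shifted_odds / (1 + shifted_odds)
print(f"shifted admission probability ≈ {shifted_p:.1%}")  # ~42%, same direction and rough magnitude as the modeled 40.1%
```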
Waiting room census and physician load census at time of physician assignment were positively associated with the likelihood that a patient would be admitted, controlling for potential confounders. Our data suggest that disposition decisions in the ED are influenced not only by objective measures of a patient’s disease state, but also by workflow-related concerns.
File: 2016-10-01-Cues-for-PE-diagnosis-in-the-ED.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Cues for PE Diagnosis in the Emergency Department: A Sociotechnical Systems Approach for Clinical Decision Support” categories:
American Medical Informatics Association Annual Symposium abstract. Work done in conjunction with Dr. Brian Patterson, MD MPH, Ann Schoofs Hundt, MS, Peter Hoonakker, PhD, and Pascale Carayon, PhD.
Pulmonary embolism (PE) diagnosis presents a significant challenge for emergency department (ED) physicians, as both missed or delayed diagnosis and overtesting can have serious consequences for patients. The implementation of health information technology, such as clinical decision support systems, has the potential to mitigate diagnostic errors and enhance the overall diagnostic process. However, to achieve this, the technology must be practical, user-friendly, and seamlessly integrate into clinical workflows. This calls for a sociotechnical systems approach to understand the cues involved in the PE diagnosis process and how they relate to the information available in electronic health records (EHRs).
In this study, we sought to comprehend the cues in the PE diagnosis process within the ED sociotechnical system and compare them to the information found in the EHR. The objective was to establish design requirements for clinical decision support for PE diagnosis in the ED.
Pulmonary embolus (PE) is among the most challenging diagnoses made in the emergency department (ED). While missed or delayed diagnosis of PE is a major problem in the ED [1], overtesting, which subjects patients to harm from radiation, overdiagnosis, and increased cost, is also a concern. Health information technology, such as clinical decision support, has the potential to reduce diagnostic errors and support the diagnostic process. However, this requires that the technology be useful and usable, and fit within the clinical workflow, providing justification for a sociotechnical systems approach. The purpose of this study is to understand cues in the PE diagnosis process in the ED sociotechnical system and to compare these cues to the information available in the EHR. This will help in defining design requirements for a clinical decision support for PE diagnosis in the ED.

Using the Critical Decision Method, we interviewed 16 attending physicians and residents in three EDs of two academic medical centers and one community hospital. The total duration of the interviews was over 12 hours. Using an iterative qualitative content analysis, we identified 4 categories of cues: (1) patient signs and symptoms (e.g., leg swelling, chest pain), (2) patient risk factors (e.g., immobilization, surgery or trauma, cancer), (3) explicit risk scoring (e.g., PERC), and (4) clinical judgment. We then mapped these cues to information available in the EHR at one of the participating hospitals. About 80-90% of the cues may be available in the EHR; many of them rely on the physical exam and information obtained by talking to the patient. This finding underlines the need to identify the various roles involved in obtaining, documenting and reviewing the information that informs the PE diagnostic process. The PE diagnostic process in the ED is distributed across multiple roles, individuals and technologies in a sometimes chaotic and often busy physical and organizational environment.
File: 2016-12-13-WHO-human-factors.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “WHO Technical Series on Safer Primary Care: Human Factors” categories:
Tosha Wetterneck, MD MS, Richard Holden, PhD, John Beasley, MD, and myself wrote a technical chapter for the World Health Organization. Link to technical chapter.
It’s part of the World Health Organization’s technical series on safer primary care, and has a particular focus on human factors. This report highlights the crucial role that human factors play in ensuring patient safety, improving the quality of care, and optimizing the overall efficiency of primary care systems. By understanding the interaction between humans, systems, and technologies, healthcare organizations can implement more effective strategies to reduce errors, enhance communication, and ultimately improve patient outcomes.
This monograph describes what “human factors” are and what relevance this approach has for improving safety in primary care. This section defines human factors. The next sections outline some of the key human factors issues in primary care, and the final sections explore potential practical solutions for safer primary care.
Download technical chapter.
————————
File: 2017-12-31-M-is-for-Medicine.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “M is for Medicine” categories:
File: 2018-01-18-Immune-Genomic-Expression-Correlates-Outcomes-in-Trauma-Patients.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Immune Genomic Expression Correlates with Discharge Location and Poor Outcomes in Trauma Patients” categories:
Academic Surgical Congress abstract, can be found here.
File: 2019-05-20-AAFP-Innovation-Fellow.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “AAFP’s Innovation Fellow Studies Tech, Digital Scribes” categories:
File: 2019-08-10-RTW-after-injury-sequential-prediction-and-decision.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Return to Work After Injury: A Sequential Prediction & Decision Problem” categories:
Machine Learning for Healthcare Conference clinical abstract, can be found here.
File: 2020-04-20-COVID-Staffing-Project.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “COVID Staffing Project: Three Medical Students’ Contributions” categories:
File: 2020-04-22-COVID-19-Visualization.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “COVID-19 Analysis” categories:
Quick exploration of case spread and mortality rates of the novel coronavirus.
File: 2020-05-12-Faster-than-COVID.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Faster than COVID: a computer model that predicts the disease’s next move” categories:
File: 2020-05-29-AADL-Friday-Night-AI.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Ann Arbor District Library - Friday Night AI: AI and COVID-19” categories:
Virtual panel discussion on how artificial intelligence could guide the response to the coronavirus outbreak. Hosted by the Ann Arbor District Library. The panel included speakers from across the Michigan AI Lab and Michigan Medicine.
File: 2020-05-30-its-time-to-bring-human-factors-to-primary-care.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “It’s time to bring human factors to primary care policy and practice” categories:
Appeared in Applied Ergonomics. Link
Primary health care is a complex, highly personal, and non-linear process. Care is often sub-optimal and professional burnout is high. Interventions intended to improve the situation have largely failed. This is due to a lack of a deep understanding of primary health care. Human Factors approaches and methods will aid in understanding the cognitive, social and technical needs of these specialties, and in designing and testing proposed innovations. In 2012, Ben-Tzion Karsh, Ph.D., conceived a transdisciplinary conference to frame the opportunities for human factors and industrial engineering research in primary care. In 2013, this conference brought together experts in primary care and human factors to outline areas where human factors methods can be applied. The results of this expert consensus panel highlighted four major research areas: cognitive and social needs, patient engagement, care of community, and integration of care. Work in these areas can inform the design, implementation, and evaluation of innovations in primary care. We provide descriptions of these research areas, highlight examples, and give suggestions for future research.
————————
File: 2020-09-23-UM-Precision-Health-Symposium.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “UMich Precision Health Symposium: Prediction & Prevention - Powering Precision Health” categories:
Virtual panel discussion on precision health. A video segment from the 2020 University of Michigan Precision Health Virtual Symposium.
File: 2020-11-13-UM-Precision-Health-Onboarding-Session.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “UMich Precision Health Onboarding Session: Precision Health De-Identified RDW” categories:
The Precision Health Data Analytics & IT workgroup held an onboarding session for Engineering students who could use Precision Health tools and resources for their classes and research. I provided a technical demonstration on how to find and query the de-identified research data warehouse (RDW) through SQL Server.
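The demonstration walked through connecting and running a first query. A minimal sketch of that workflow is below, with a hypothetical connection string and table names; the actual de-identified RDW schema and server details are not shown here.

```python
# Hypothetical server, database, and table names; the real de-identified RDW schema differs.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=precision-health-rdw.example.edu;"
    "DATABASE=DeidentifiedRDW;"
    "Trusted_Connection=yes;"
)
cursor = conn.cursor()
cursor.execute(
    """
    SELECT TOP 10 EncounterID, AdmitDateTime, DischargeDateTime
    FROM dbo.Encounters
    WHERE AdmitDateTime >= '2020-01-01'
    ORDER BY AdmitDateTime
    """
)
for row in cursor.fetchall():
    print(row.EncounterID, row.AdmitDateTime, row.DischargeDateTime)
```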
File: 2021-05-19-UMich-MSTP-Promo-Video.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “UMich MSTP Promo Video” categories:
I was featured in the University of Michigan Medical Scientist Training Program recruiting video.
The MSTP at Michigan prepares physician scientists for careers in academic medicine with a focus on biomedical research. More than just an M.D. and Ph.D. spliced together, our program offers comprehensive support and guidance, integrating academic excellence and flexibility to help you reach your career goals.
File: 2021-07-21-External-Validation-of-a-Widely-Implemented-Proprietary-Sepsis-Prediction-Model-in-Hospitalized-Patients.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “External Validation of a Widely Implemented Proprietary Sepsis Prediction Model in Hospitalized Patients” categories:
JAMA Internal Medicine. Can be found here.
How accurately does the Epic Sepsis Model, a proprietary sepsis prediction model implemented at hundreds of US hospitals, predict the onset of sepsis?
In this cohort study of 27 697 patients undergoing 38 455 hospitalizations, sepsis occurred in 7% of the hospitalizations. The Epic Sepsis Model predicted the onset of sepsis with an area under the curve of 0.63, which is substantially worse than the performance reported by its developer.
This study suggests that the Epic Sepsis Model poorly predicts sepsis; its widespread adoption despite poor performance raises fundamental concerns about sepsis management on a national level.
The Epic Sepsis Model (ESM), a proprietary sepsis prediction model, is implemented at hundreds of US hospitals. The ESM’s ability to identify patients with sepsis has not been adequately evaluated despite widespread use.
To externally validate the ESM in the prediction of sepsis and evaluate its potential clinical value compared with usual care.
This retrospective cohort study was conducted among 27 697 patients aged 18 years or older admitted to Michigan Medicine, the academic health system of the University of Michigan, Ann Arbor, with 38 455 hospitalizations between December 6, 2018, and October 20, 2019.
The ESM score, calculated every 15 minutes.
Sepsis, as defined by a composite of (1) the Centers for Disease Control and Prevention surveillance criteria and (2) International Statistical Classification of Diseases and Related Health Problems, Tenth Revision diagnostic codes accompanied by 2 systemic inflammatory response syndrome criteria and 1 organ dysfunction criterion within 6 hours of one another. Model discrimination was assessed using the area under the receiver operating characteristic curve at the hospitalization level and with prediction horizons of 4, 8, 12, and 24 hours. Model calibration was evaluated with calibration plots. The potential clinical benefit associated with the ESM was assessed by evaluating the added benefit of the ESM score compared with contemporary clinical practice (based on timely administration of antibiotics). Alert fatigue was evaluated by comparing the clinical value of different alerting strategies.
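As a rough sketch of what hospitalization-level discrimination looks like in code: collapsing the 15-minute scores to one prediction per hospitalization via the maximum score is an illustrative assumption, and the file and column names are hypothetical, not the paper’s actual analysis.

```python
# Rough sketch; aggregation choice and names are assumptions, not the paper's method.
import pandas as pd
from sklearn.metrics import roc_auc_score

scores = pd.read_csv("esm_scores.csv")           # one row per 15-minute ESM score
outcomes = pd.read_csv("hospitalizations.csv")   # one row per hospitalization, with a sepsis flag

# Collapse to one prediction per hospitalization (here: the maximum score observed).
hosp_max = scores.groupby("hospitalization_id")["esm_score"].max()
df = outcomes.set_index("hospitalization_id").join(hosp_max)

auroc = roc_auc_score(df["sepsis"], df["esm_score"])
print(f"hospitalization-level AUROC = {auroc:.2f}")

# Alert-burden style summary for a threshold of 6 or higher.
alerted = df["esm_score"] >= 6
print(f"hospitalizations generating an alert: {alerted.mean():.0%}")
```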
We identified 27 697 patients who had 38 455 hospitalizations (21 904 women [57%]; median age, 56 years [interquartile range, 35-69 years]) meeting inclusion criteria, of whom sepsis occurred in 2552 (7%). The ESM had a hospitalization-level area under the receiver operating characteristic curve of 0.63 (95% CI, 0.62-0.64). The ESM identified 183 of 2552 patients with sepsis (7%) who did not receive timely administration of antibiotics, highlighting the low sensitivity of the ESM in comparison with contemporary clinical practice. The ESM also did not identify 1709 patients with sepsis (67%) despite generating alerts for an ESM score of 6 or higher for 6971 of all 38 455 hospitalized patients (18%), thus creating a large burden of alert fatigue.
This external validation cohort study suggests that the ESM has poor discrimination and calibration in predicting the onset of sepsis. The widespread adoption of the ESM despite its poor performance raises fundamental concerns about sepsis management on a national level.
————————
File: 2021-07-21-STAT-News-Epic-sepsis.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “STAT News: A popular algorithm to predict sepsis misses most cases and sends frequent false alarms, study finds” categories:
File: 2021-07-21-WIRED-An-Algorithm-That-Predicts-Deadly-Infections-Is-Often-Flawed.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “WIRED: An Algorithm That Predicts Deadly Infections Is Often Flawed” categories:
File: 2021-07-22-The-Verge-A-hospital-algorithm-designed-to-predict-a-deadly-condition-misses-most-cases.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “The Verge: A hospital algorithm designed to predict a deadly condition misses most cases” categories:
File: 2021-07-26-The-Washington-Post-A-hospital-algorithm-designed-to-predict-a-deadly-condition-misses-most-cases copy.md Creation Date: — title: “The Washington Post: Sepsis prediction tool used by hospitals misses many cases, study says. Firm that developed the tool disputes those findings.” categories:
File: 2021-08-01-Mind-the-Performance-Gap.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Mind the Performance Gap: Dataset Shift During Prospective Validation” categories:
Our 2021 Machine Learning for Healthcare Conference paper! It discusses a special kind of dataset shift that is particularly pervasive and pernicious when developing and implementing ML/AI models for use in healthcare. Here’s a link to the Mind the Performance Gap paper that I authored with Jeeheh Oh, Benjamin Li, Michelle Bochinski, Hyeon Joo, Justin Ortwine, Erica Shenoy, Laraine Washer, Vincent B. Young, Krishna Rao, and Jenna Wiens.
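The abstract below quantifies the gap using AUROC and Brier score with bootstrap confidence intervals; as a rough sketch (synthetic data and my own helper, not the paper’s evaluation code), those metrics might be computed like this:

```python
# Rough sketch with synthetic data; not the paper's actual evaluation code.
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=5000)                          # observed infection labels
y_prob = np.clip(y_true * 0.3 + rng.random(5000) * 0.7, 0, 1)   # model risk estimates


def bootstrap_ci(metric, y, p, n_boot=1000):
    """Percentile bootstrap CI for a metric over encounter-level resamples."""
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))
        stats.append(metric(y[idx], p[idx]))
    return np.percentile(stats, [2.5, 97.5])


print("AUROC", roc_auc_score(y_true, y_prob), bootstrap_ci(roc_auc_score, y_true, y_prob))
print("Brier", brier_score_loss(y_true, y_prob), bootstrap_ci(brier_score_loss, y_true, y_prob))
```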
Once integrated into clinical care, patient risk stratification models may perform worse compared to their retrospective performance. To date, it is widely accepted that performance will degrade over time due to changes in care processes and patient populations. However, the extent to which this occurs is poorly understood, in part because few researchers report prospective validation performance. In this study, we compare the 2020-2021 (’20-’21) prospective performance of a patient risk stratification model for predicting healthcare-associated infections to a 2019-2020 (’19-’20) retrospective validation of the same model. We define the difference in retrospective and prospective performance as the performance gap. We estimate how i) “temporal shift”, i.e., changes in clinical workflows and patient populations, and ii) “infrastructure shift”, i.e., changes in access, extraction and transformation of data, both contribute to the performance gap. Applied prospectively to 26,864 hospital encounters during a twelve-month period from July 2020 to June 2021, the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.767 (95% confidence interval (CI): 0.737, 0.801) and a Brier score of 0.189 (95% CI: 0.186, 0.191). Prospective performance decreased slightly compared to ’19-’20 retrospective performance, in which the model achieved an AUROC of 0.778 (95% CI: 0.744, 0.815) and a Brier score of 0.163 (95% CI: 0.161, 0.165). The resulting performance gap was primarily due to infrastructure shift and not temporal shift. So long as we continue to develop and validate models using data stored in large research data warehouses, we must consider differences in how and when data are accessed, measure how these differences may negatively affect prospective performance, and work to mitigate those differences.
————————
File: 2021-08-01-evaluating-a-widely-implemented-proprietary-deterioration-index-among-inpatients-with-COVID.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Evaluating a Widely Implemented Proprietary Deterioration Index Model among Hospitalized Patients with COVID-19” categories:
Annals of the American Thoracic Society. Can be found here.
The Epic Deterioration Index (EDI) is a proprietary prediction model implemented in over 100 U.S. hospitals that was widely used to support medical decision-making during the coronavirus disease (COVID-19) pandemic. The EDI has not been independently evaluated, and other proprietary models have been shown to be biased against vulnerable populations.
To independently evaluate the EDI in hospitalized patients with COVID-19 overall and in disproportionately affected subgroups.
We studied adult patients admitted with COVID-19 to units other than the intensive care unit at a large academic medical center from March 9 through May 20, 2020. We used the EDI, calculated at 15-minute intervals, to predict a composite outcome of intensive care unit–level care, mechanical ventilation, or in-hospital death. In a subset of patients hospitalized for at least 48 hours, we also evaluated the ability of the EDI to identify patients at low risk of experiencing this composite outcome during their remaining hospitalization.
Among 392 COVID-19 hospitalizations meeting inclusion criteria, 103 (26%) met the composite outcome. The median age of the cohort was 64 (interquartile range, 53–75) with 168 (43%) Black patients and 169 (43%) women. The area under the receiver-operating characteristic curve of the EDI was 0.79 (95% confidence interval, 0.74–0.84). EDI predictions did not differ by race or sex. When exploring clinically relevant thresholds of the EDI, we found patients who met or exceeded an EDI of 68.8 made up 14% of the study cohort and had a 74% probability of experiencing the composite outcome during their hospitalization with a sensitivity of 39% and a median lead time of 24 hours from when this threshold was first exceeded. Among the 286 patients hospitalized for at least 48 hours who had not experienced the composite outcome, 14 (13%) never exceeded an EDI of 37.9, with a negative predictive value of 90% and a sensitivity above this threshold of 91%.
We found the EDI identifies small subsets of high-risk and low-risk patients with COVID-19 with good discrimination, although its clinical use as an early warning system is limited by low sensitivity. These findings highlight the importance of independent evaluation of proprietary models before widespread operational use among patients with COVID-19.
————————
File: 2021-08-05-MLHC-Presentation.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Machine Learning for Healthcare Conference: Characterizing the Performance Gap” categories:
Jeeheh Oh and I presented our work on dataset shift at the 2021 Machine Learning for Healthcare Conference. This talk briefly summarizes our conference paper.
Once integrated into clinical care, patient risk stratification models may perform worse compared to their retrospective performance. To date, it is widely accepted that performance will degrade over time due to changes in care processes and patient populations. However, the extent to which this occurs is poorly understood, in part because few researchers report prospective validation performance. In this study, we compare the 2020-2021 (’20-’21) prospective performance of a patient risk stratification model for predicting healthcare-associated infections to a 2019-2020 (’19-’20) retrospective validation of the same model. We define the difference in retrospective and prospective performance as the performance gap. We estimate how i) “temporal shift”, i.e., changes in clinical workflows and patient populations, and ii) “infrastructure shift”, i.e., changes in access, extraction and transformation of data, both contribute to the performance gap. Applied prospectively to 26,864 hospital encounters during a twelve-month period from July 2020 to June 2021, the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.767 (95% confidence interval (CI): 0.737, 0.801) and a Brier score of 0.189 (95% CI: 0.186, 0.191). Prospective performance decreased slightly compared to ’19-’20 retrospective performance, in which the model achieved an AUROC of 0.778 (95% CI: 0.744, 0.815) and a Brier score of 0.163 (95% CI: 0.161, 0.165). The resulting performance gap was primarily due to infrastructure shift and not temporal shift. So long as we continue to develop and validate models using data stored in large research data warehouses, we must consider differences in how and when data are accessed, measure how these differences may negatively affect prospective performance, and work to mitigate those differences.
————————
File: 2021-10-11-CHEPS-Seminar.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “CHEPS Seminar: Engineering Machine Learning for Medicine” categories:
Invited to give a talk for the 2021 University of Michigan Center for Healthcare Engineering and Patient Safety (CHEPS) fall seminar series. Discussed engineering machine learning for medicine. Gave an overview of the whole healthcare AI/ML lifecycle and discussed how it is chockablock with cool industrial and health systems engineering problems.
File: 2021-10-31-Using-NLP-to-Automatically-Assess-Feedback-Quality.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Using Natural Language Processing to Automatically Assess Feedback Quality: Findings From 3 Surgical Residencies” categories:
Academic Medicine. Can be found here.
Learning is markedly improved with high-quality feedback, yet assuring the quality of feedback is difficult to achieve at scale. Natural language processing (NLP) algorithms may be useful in this context as they can automatically classify large volumes of narrative data. However, it is unknown if NLP models can accurately evaluate surgical trainee feedback. This study evaluated which NLP techniques best classify the quality of surgical trainee formative feedback recorded as part of a workplace assessment.
During the 2016–2017 academic year, the SIMPL (Society for Improving Medical Professional Learning) app was used to record operative performance narrative feedback for residents at 3 university-based general surgery residency training programs. Feedback comments were collected for a sample of residents representing all 5 postgraduate year levels and coded for quality. In May 2019, the coded comments were then used to train NLP models to automatically classify the quality of feedback across 4 categories (effective, mediocre, ineffective, or other). Models included support vector machines (SVM), logistic regression, gradient boosted trees, naive Bayes, and random forests. The primary outcome was mean classification accuracy.
The authors manually coded the quality of 600 recorded feedback comments. Those data were used to train NLP models to automatically classify the quality of feedback across 4 categories. The NLP model using an SVM algorithm yielded a maximum mean accuracy of 0.64 (standard deviation, 0.01). When the classification task was modified to distinguish only high-quality vs low-quality feedback, maximum mean accuracy was 0.83, again with SVM.
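For a sense of what such a pipeline looks like, here is a minimal TF-IDF plus linear SVM sketch; the study’s actual preprocessing, feature engineering, and hyperparameters may differ.

```python
# Minimal sketch; toy examples, not the study's data or tuned models.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

comments = [
    "Great control of the camera, next time slow down on the dissection",
    "good job",
]  # narrative feedback text, coded by human raters
labels = ["effective", "mediocre"]

pipeline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
pipeline.fit(comments, labels)
# With the real ~600 coded comments, mean accuracy would be estimated with k-fold cross-validation.
print(pipeline.predict(["excellent exposure and efficient movements"]))
```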
To the authors’ knowledge, this is the first study to examine the use of NLP for classifying feedback quality. SVM NLP models demonstrated the ability to automatically classify the quality of surgical trainee evaluations. Larger training datasets would likely further increase accuracy.
————————
File: 2021-11-01-INFORMS-Dynamic-Machine-Learning.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “INFORMS: Dynamic Machine Learning for Medical Practice” categories:
INFORMS conference talk focused on dynamic machine learning for medicine. Based on joint work with Jon Seymour, MD (Peers Health) and Brian Denton, PhD (University of Michigan).
Time is a crucial factor in clinical practice. Our work explores the intersection of time and machine learning (ML) in the context of medicine. This presentation will examine the creation, validation, and deployment of dynamic ML models. We discuss dynamic prediction of future work status for patients who have experienced occupational injuries. Methodologically, we cover a framework for dynamic health-state prediction that combines a novel data transformation with an appropriate, automatically generated deep learning architecture. These projects expand our understanding of how to effectively train and utilize dynamic machine learning models in the service of advancing health.
File: 2021-11-02-Trust-The-AI-You-Decide.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Forbes: Trust The AI? You Decide” categories:
File: 2021-11-19-Quantification-of-Sepsis-Model-Alerts-in-24-US-Hospitals-Before-and-During-the-COVID-19-Pandemic.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Quantification of Sepsis Model Alerts in 24 US Hospitals Before and During the COVID-19 Pandemic” categories:
JAMA Network Open. Can be found here.
File: 2021-12-02-NLP-and-Assessment-of-Resident-Feedback-Quality.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Natural Language Processing and Assessment of Resident Feedback Quality” categories:
Journal of Surgical Education. Can be found here.
To validate the performance of a natural language processing (NLP) model in characterizing the quality of feedback provided to surgical trainees.
Narrative surgical resident feedback transcripts were collected from a large academic institution and classified for quality by trained coders. 75% of classified transcripts were used to train a logistic regression NLP model and 25% were used for testing the model. The NLP model was trained by uploading classified transcripts and tested using unclassified transcripts. The model then classified those transcripts into dichotomized high- and low- quality ratings. Model performance was primarily assessed in terms of accuracy and secondary performance measures including sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC).
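As a brief illustration of how the reported performance measures are computed from held-out predictions (toy arrays, not the study data):

```python
# Illustrative arrays only; not the study's data or code.
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

y_true = np.array([1, 0, 0, 1, 0, 0, 1, 0])  # 1 = high-quality feedback
y_prob = np.array([0.9, 0.2, 0.4, 0.3, 0.1, 0.2, 0.8, 0.6])
y_pred = (y_prob >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("accuracy   ", accuracy_score(y_true, y_pred))
print("sensitivity", tp / (tp + fn))  # share of truly high-quality comments correctly identified
print("specificity", tn / (tn + fp))  # share of truly low-quality comments correctly identified
print("AUROC      ", roc_auc_score(y_true, y_prob))
```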
A surgical residency program based in a large academic medical center.
All surgical residents who received feedback via the Society for Improving Medical Professional Learning smartphone application (SIMPL, Boston, MA) in August 2019.
The model classified the quality (high vs. low) of 2,416 narrative feedback transcripts with an accuracy of 0.83 (95% confidence interval: 0.80, 0.86), sensitivity of 0.37 (0.33, 0.45), specificity of 0.97 (0.96, 0.98), and an area under the receiver operating characteristic curve of 0.86 (0.83, 0.87).
The NLP model classified the quality of operative performance feedback with high accuracy and specificity. NLP offers residency programs the opportunity to efficiently measure feedback quality. This information can be used for feedback improvement efforts and ultimately, the education of surgical trainees.
————————
File: 2021-12-02-NLP-to-Estimate-Clinical-Competency-Committee-Ratings.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Natural Language Processing to Estimate Clinical Competency Committee Ratings” categories:
Journal of Surgical Education. Can be found here.
Residency program faculty participate in clinical competency committee (CCC) meetings, which are designed to evaluate residents’ performance and aid in the development of individualized learning plans. In preparation for the CCC meetings, faculty members synthesize performance information from a variety of sources. Natural language processing (NLP), a form of artificial intelligence, might facilitate these complex holistic reviews. However, there is little research involving the application of this technology to resident performance assessments. With this study, we examine whether NLP can be used to estimate CCC ratings.
We analyzed end-of-rotation assessments and CCC assessments for all surgical residents who trained at one institution between 2014 and 2018. We created models of end-of-rotation assessment ratings and text to predict dichotomized CCC assessment ratings for 16 Accreditation Council for Graduate Medical Education (ACGME) Milestones. We compared the performance of models with and without predictors derived from NLP of end-of-rotation assessment text.
We analyzed 594 end-of-rotation assessments and 97 CCC assessments for 24 general surgery residents. The mean (standard deviation) for area under the receiver operating characteristic curve (AUC) was 0.84 (0.05) for models with only non-NLP predictors, 0.83 (0.06) for models with only NLP predictors, and 0.87 (0.05) for models with both NLP and non-NLP predictors.
NLP can identify language correlated with specific ACGME Milestone ratings. In preparation for CCC meetings, faculty could use information automatically extracted from text to focus attention on residents who might benefit from additional support and guide the development of educational interventions.
————————
File: 2021-12-04-Comparative-Assessment-of-a-Machine-Learning-Model-and-Rectal-Swab-Surveillance-to-Predict-Hospital-Onset-Clostridioides-difficile.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Comparative Assessment of a Machine Learning Model and Rectal Swab Surveillance to Predict Hospital Onset Clostridioides difficile” categories:
IDWeek Abstract. Can be found here.
File: 2021-12-07-IOE-Research-Spotlight.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “IOE Research Spotlight” categories:
Shared an overview of my research during the 2021 University of Michigan Department of Industrial and Operations Engineering recruiting weekend.
File: 2021-12-09-Precision-Health-Webinar.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Precision Health Webinar: What Clinicians Need to Know when Using AI” categories:
Panel discussion on what is important for clinicians to know and how confident they can be when using these AI tools. Conversation with Drs. Rada Mihalcea, Max Spadafore, and Cornelius James.
File: 2022-01-07-hello-world.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Hello, World!” categories:
Hello, World!
Welcome to Ötleş Notes! It’s a blog by me (Erkin Ötleş).
For a little background: I am a Medical Scientist Training Program Fellow at the University of Michigan. What does that mean in English? It means I am a very silly person who decided to go to school forever in order to study medicine (MD) and engineering (PhD in industrial and operations engineering). Generally, I am fascinated by the intersection of engineering and medicine. I strongly believe that both fields have a lot to learn from one another. While working between the two presents challenges, I am genuinely grateful to learn from wonderful mentors and colleagues in both fields.
As I come across interesting topics that pertain to medicine or engineering, I’ll try to share them here along with my perspective. I won’t make any guarantees regarding posting frequency or topics. However, I will make every effort to cite original sources and be as factual as possible.
Ultimately this is a project for myself: 1) to help strengthen my written communication skills and 2) to allow me to explore a broader space of ideas. If you happen to get something out of it too in the meantime, that’s a wonderful byproduct.
If you have ideas about my ideas feel free to reach out to me on twitter (@eotles) or write me an email.
Cheers,
Erkin
Go ÖN Home
————————
File: 2022-01-10-solving-wordle.md
Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000”
—
title: “Solving Wordle”
categories:
Let’s talk about Wordle. [1] You, like me, might have been drawn into this game recently, courtesy of those yellow and green squares on twitter. The rules are simple: you get 6 attempts to guess the 5-letter word. After every attempt you get feedback in the form of the colored squares around your letters. Grey means this character isn’t used at all. Yellow means that the character is used, but in a different position. Finally, green means you nailed the character to (one of) the right position(s). Here’s an example of a played game:
A valiant wordle attempt by J.B. Cheadle (January 10th 2022)
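To make that feedback rule concrete, here is a minimal Python sketch of how a guess could be scored against the answer. It follows the simple description above and ignores the duplicate-letter subtleties of the real game; the function name and example words are just illustrative.

```python
# A simplified scoring function: green if the character is in the right position,
# yellow if it appears elsewhere in the answer, grey otherwise.
# (The real game handles repeated letters more carefully.)
def score_guess(guess: str, answer: str) -> list:
    colors = []
    for i, ch in enumerate(guess):
        if answer[i] == ch:
            colors.append("green")
        elif ch in answer:
            colors.append("yellow")
        else:
            colors.append("grey")
    return colors

print(score_guess("alert", "query"))  # ['grey', 'grey', 'green', 'green', 'grey']
```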
It’s pretty fun to play, although wracking your brain for 5-letter words can be annoying, especially since you are not allowed to guess words that aren’t real words (e.g., you can’t use AEIOU). Once I got the hang of the game’s mechanics, my natural inclination was not to simply enjoy the once-daily word-guessing diversion, but to find a way to “solve wordle”.
Now, what does it mean to “solve wordle”? Maybe you would like to start with a really good guess? Maybe you would like to guarantee that you win the game (i.e., guess the right word by your sixth try)? Or perhaps you’d like to win the game and get the most greens or yellows along the way? “Solving” is subjective and probably depends on your preferences.
Due to this subjectivity, I think there are a couple of valid ways to tackle wordle. If you have a strong preference for one type of solution you might be able to express that directly and then solve the game in order to get the optimal way to play. I’m going to try to avoid the O-word because: 1) I don’t know what you’d like to optimize for and 2) the approaches below don’t solve for the true optimal solution (they are heuristics).
The solution strategies I’ve explored thus far can be broken down into two major categories. The first set of strategies try to find really good first words to start with (First Word) and the second set are strategies that can be used to pick good words throughout the course of the game in response to the feedback received from guesses (Gameplay).
Let’s start with the First Words strategies: there are two first word strategies that can be employed based on how you’d like to start your game. First Word - Common Characters: ideal if you’d like to start your game using words that have the most common characters with all the solution words. Think of this as trying to maximize the number of yellow characters that you get on the first try.
| Rank | Solution Words | Usable Words |
|---|---|---|
| 1st | later, alter, alert | oater, orate, roate |
| 2nd | sonic, scion | lysin |
| 3rd | pudgy | chump :) |
First Word - Right Character in Right Position: ideal if you’d like to start the game using words that have the highest likelihood of having the right characters in the right position. This would yield the most number of green characters.
| Rank | Solution (& Usable) Words |
|---|---|
| 1st | slate |
| 2nd | crony |
| 3rd | build |
A note on solution words vs. usable words: Wordle has two sets of words, solution words and other words. Other words are never the correct answer but can be used as a guess. There’s a chance that other words can be used to get a lot of yellows, despite never being the correct answer. So I created a list of usable words that combined the solution words and the other words. Notice that the First Word - Common Characters strategy has two lists. That’s because there are other words like “oater” that are more likely to produce yellows than the best solution word “later”. This isn’t the case for the First Word - Right Character in Right Position, as it produces the same results for both sets of words.
You might also observe that there are several sets of words in terms of 1st, 2nd, and 3rd. If you wanted you could use these strategies over several rounds to build up your knowledge. However, these strategies don’t take into account the feedback that you get from the game. So there may be better ways to play the game that take into account what kind of results you get after you put in a guess.
These strategies are the Gameplay strategies. I’ll present two potential approaches that use knowledge as it is collected.
Here is an example of the Gameplay - Refine List + Common Characters strategy in action based on the Wordle from January 10th 2022.
| Guess # | Green Characters | Grey Characters | Guess |
|---|---|---|---|
| 1 | ***** | (none) | alert |
| 2 | **er* | a, l, t | fiery |
| 3 | **ery | a, f, i, l, t | query |
Here you can see that after every guess we get to update the green characters and the grey characters that we know about. For example, after round 1 we know that the word must be **er* (where * represents a wildcard) and must not contain the characters a, l (el), or t. I use regular expressions to search through the list of words. The search expression is really simple: it just replaces each * in the green character string with a character class of the remaining viable characters (the set of alphabet characters minus the grey characters).
The reinforcement learning based approach would operate in a similar manner for a user. However, the mechanics under the hood are a bit more complicated. If you are interested in how it (or any of the other strategies) works, please see the appendix.
As I mentioned above, solving wordle is subjective. You might not like my approaches or might think there are ways for them to be improved. Luckily I’m not the only one thinking about this problem. [3, 4]
Erkin
Go ÖN Home
This appendix contains some technical descriptions of the approaches described above.
First Word - Common Characters: this one is pretty simple. I am essentially trying to find the word that has the most unique characters in common with other words (this is a yellow match).
In order to do this I reduce words down to character strings, which are just lists of the unique characters that the words are made up of. For example, the word “savvy” becomes the character list: a, s, v, y. We then use the character strings to count the number of words represented by each character. So using the character string from above, the characters a, s, v, and y would all have their counts incremented by 1. These counts represent the number of words covered by a character (word coverage).
We then search through all words and calculate their total word coverage. This is done by summing up the counts for every character in the word. We then select the word with the highest total word coverage. In order to find words to be used in subsequent rounds, we can remove the characters already covered by previously selected words and repeat the previous step.
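Here is a minimal sketch of that counting procedure in Python; the small word list is just a stand-in for the real solution list used in the notebook.

```python
from collections import Counter

# Stand-in for the real Wordle solution list.
solution_words = ["later", "alter", "alert", "sonic", "scion", "pudgy", "savvy"]

# Count how many words each character appears in (unique characters per word).
char_coverage = Counter()
for word in solution_words:
    for ch in set(word):      # "savvy" -> {a, s, v, y}
        char_coverage[ch] += 1

# A word's total coverage is the sum of the counts of its unique characters.
def word_coverage(word):
    return sum(char_coverage[ch] for ch in set(word))

best_first_word = max(solution_words, key=word_coverage)
print(best_first_word)
```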
Code can be found in the first_word_common_characters.ipynb notebook.
First Word - Right Character in Right Position: this one is a pretty straightforward extension of the First Word - Common Characters approach, with the added constraint that position must be tracked along with the characters.
To do this we count character-position tuples. For every word we loop through the characters and their positions. We keep track of the number of times a character-position is observed. For example, the word “savvy” would increment the counts for the following character-position tuples: (s, 1), (a, 2), (v, 3), (v, 4), (y, 5). These counts represent the number of words covered by a character-position tuple (word coverage).
We then loop through every word and calculate their total word coverage. This is done by breaking the word into character-position tuples and summing up the counts of the observed character-positions.
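A sketch of this version looks nearly identical to the one above, except the counts are keyed on (character, position) tuples; positions are 0-indexed here and the word list is again illustrative.

```python
from collections import Counter

# Stand-in for the real Wordle solution list.
solution_words = ["later", "alter", "alert", "slate", "crony", "savvy"]

# Count how many words have character c at position i;
# "savvy" contributes (s,0), (a,1), (v,2), (v,3), (y,4).
position_coverage = Counter()
for word in solution_words:
    for i, ch in enumerate(word):
        position_coverage[(ch, i)] += 1

def word_coverage(word):
    return sum(position_coverage[(ch, i)] for i, ch in enumerate(word))

print(max(solution_words, key=word_coverage))
```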
Code can be found in the first_word_right_character_in_right_position.ipynb notebook.
Both the First Word strategies can be converted from counts to probabilities. I haven’t done this yet, but maybe I’ll update this post in the future to have that information.
The Gameplay strategies are a little more complicated than the First Word strategies because they need to be able to incorporate the state of the game into the suggestion for the next move.
Gameplay - Refine List + Common Characters: this approach reminds me of an AI TA I had. He would always say “AI is just search”. Which is true. This approach is pretty much searching over the word list with some filtering and using some distributional knowledge. I was surprised at how easily it came together and how effective it is. As a side note, it was probably the easiest application of regex that I’ve had in a while.
There are three components to this approach: 1) generate a regex from the known information, 2) get the possible solutions, and 3) rank order the solutions.
I will briefly detail some of the intricacies of these components.
Generate Regex: the user needs to provide 3 things before a guess: 1) a string with the green characters positioned correctly and wildcards (*) elsewhere, 2) a list of the yellow characters found thus far, and finally 3) a list of the grey characters. Using this information we build a regular expression that describes the structure of the word we are looking for. For example, let’s say we had **ery as the green letters and every character other than q and u was greyed out; then we would have the regex search pattern [qu][qu]ery.
Get possible solutions: after building the regex search string we can loop through the list of solution words and filter out all the words that don’t match the regex search pattern. We can additionally remove any words that do not use characters from the yellow characters list. Finally, we Rank Order Solutions by finding each word’s coverage using the approach described in Common Characters above. This produces a list of words ranked by their likelihood of producing yellow characters among the remaining possible words.
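Putting those pieces together, a rough sketch of the filtering step might look like the following; the game state and word list are illustrative, and the real implementation lives in the notebook.

```python
import re

# Illustrative game state after two guesses (see the example table above).
solution_words = ["query", "fiery", "every", "jerky", "merry"]
green = "**ery"          # green characters in position, * for unknowns
yellow = set()           # yellow characters found so far
grey = set("aflit")      # characters known to be absent

# Build the regex: each wildcard becomes a character class of still-viable letters.
viable = "".join(sorted(set("abcdefghijklmnopqrstuvwxyz") - grey))
pattern = re.compile("".join(ch if ch != "*" else f"[{viable}]" for ch in green))

# Keep words matching the pattern and containing all yellow characters;
# the survivors can then be ranked with the common-characters coverage above.
candidates = [w for w in solution_words
              if pattern.fullmatch(w) and yellow.issubset(set(w))]
print(candidates)
```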
Code can be found in the gameplay_refine_list_common_characters.ipynb notebook. There’s also a blogpost with this solver implemented.
There’s also a website with this solver implemented.
Gameplay - Reinforcement Learning: this approach is based on tabular Q-learning. [2, 5] It’s a little bit complicated and I’m unsure the training procedure produced ideal results, but I’ll provide a brief overview.
Reinforcement learning seeks to learn the right action to take in a given state. [6] You can use it to learn how to play games if you can formulate that game as a series of states (e.g., representing a board position) and actions (potential moves to take). [5] In order to tackle the wordle task with RL we need a way to represent the guesses that we’ve already made (state) and the next guess we should make (action).
The actions are pretty obvious: one action for each potential solution word we can guess. There are about 2,000 of these.
The states are where things get hairy. If you wanted to encode all the information that the keyboard contains you would need at least 4^26 states (roughly 4.5 × 10^15). This is because there are 4 states a character can take {black/un-guessed, yellow, green, grey}, and each of the 26 characters can be in any one of these states. This is problematic - way too big! Additionally, this doesn’t encode the guesses we have tried. What I eventually settled on was a state representation that combined the last guessed word along with the results (the colors) for each character. This is a much more manageable 2,000 x 4^5 (about 2 million) states.
I then coded up the wordle game and used tabular Q-learning to learn the value of state-action pairs. This was done by rewarding games that resulted in a win with a 1 and losses with a 0.
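For the curious, here is a heavily simplified sketch of what the tabular Q-learning pieces look like. The state encoding, word list, and hyperparameters are assumptions for illustration, not the exact setup from my experiments.

```python
import random
from collections import defaultdict

# Stand-in for the ~2,000 solution words (the action space).
SOLUTION_WORDS = ["query", "slate", "crony", "pudgy"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Q maps (state, action) -> estimated value; a state is a (last_guess, feedback) pair.
Q = defaultdict(float)

def choose_action(state):
    # Epsilon-greedy selection over the solution words.
    if random.random() < EPSILON:
        return random.choice(SOLUTION_WORDS)
    return max(SOLUTION_WORDS, key=lambda w: Q[(state, w)])

def q_update(state, action, reward, next_state, done):
    # Standard tabular Q-learning update; reward is 1 for a win, 0 otherwise.
    best_next = 0.0 if done else max(Q[(next_state, w)] for w in SOLUTION_WORDS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```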
I think this also might be solvable using dynamic programming as we know the winning states. These are terminal and then I think you can work backwards to assign values to the intermediary states. It’s been almost a decade since I took my dynamic programming class, so I need a bit of a refresher before I dive into it.
As you can see, there are a lot of interesting questions that arise from formulating this task as an RL problem. I will probably come back to this and explore it further in the future.
File: 2022-01-10-wordle-solver.md Creation Date: “Sun, 2 Nov 2025 10:03:04 +0000” — title: “Wordle Solver” categories:
[Download abstract.](https://eotles.com/assets/papers/dynamic_prediction_of_work_status_for_workers_with_occupational_injuries.pdf)
## Abstract
### Objective
Occupational injuries (OIs) cause an immense burden on the US population. Prediction models help focus resources on those at greatest risk of a delayed return to work (RTW). RTW depends on factors that develop over time; however, existing methods only utilize information collected at the time of injury. We investigate the performance benefits of dynamically estimating RTW, using longitudinal observations of diagnoses and treatments collected beyond the time of initial injury.
### Materials and Methods
We characterize the difference in predictive performance between an approach that uses information collected at the time of initial injury (baseline model) and a proposed approach that uses longitudinal information collected over the course of the patient’s recovery period (proposed model). To control the comparison, both models use the same deep learning architecture and differ only in the information used. We utilize a large longitudinal observation dataset of OI claims and compare the performance of the two approaches in terms of daily prediction of future work state (working vs not working). The performance of these two approaches was assessed in terms of the area under the receiver operator characteristic curve (AUROC) and expected calibration error (ECE).
### Results
After subsampling and applying inclusion criteria, our final dataset covered 294 103 OIs, which were split evenly between train, development, and test datasets (1/3, 1/3, 1/3). In terms of discriminative performance on the test dataset, the proposed model had an AUROC of 0.728 (90% confidence interval: 0.723, 0.734) versus the baseline’s 0.591 (0.585, 0.598). The proposed model had an ECE of 0.004 (0.003, 0.005) versus the baseline’s 0.016 (0.009, 0.018).
### Conclusion
The longitudinal approach outperforms current practice and shows potential for leveraging observational data to dynamically update predictions of RTW in the setting of OI. This approach may enable physicians and workers’ compensation programs to manage large populations of injured workers more effectively.
------------------------
File: 2022-08-30-IOE-RTW-JAMIA-Press.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: "Helping people get back to work using deep learning in the occupational health system"
categories:
- Blog
- Press
tags:
- Blog
- Press
- occupational health
- return to work
- medicine
- healthcare
- artificial intelligence
- machine learning
header:
teaser: "/assets/images/insta/IMG_1408.JPG"
overlay_image: "/assets/images/insta/IMG_1408.JPG"
---
Discussed our recent [JAMIA paper on predicting return to work](/blog/research/Dynamic-prediction-of-work-status-for-workers-with-occupational-injuries/) with Jessalyn Tamez. Check out the news brief [here](https://ioe.engin.umich.edu/2022/08/30/helping-people-get-back-to-work-using-deep-learning-in-the-occupational-health-system/).
------------------------
File: 2022-09-19-Prospective-evaluation-of-data-driven-models-to-predict-daily-risk-of-clostridioides-difficile-infection-at-2-large-academic-health-centers.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: "Prospective evaluation of data-driven models to predict daily risk of Clostridioides difficile infection at 2 large academic health centers"
categories:
- Blog
- Research
tags:
- Blog
- Research
- Clostridioides difficile
- infectious disease
- early warning system
- medicine
- healthcare
- artificial intelligence
- machine learning
header:
teaser: "/assets/images/insta/IMG_1144.JPG"
overlay_image: "/assets/images/insta/IMG_1144.JPG"
---
Infection Control and Hospital Epidemiology. Can be found [here](https://doi.org/10.1017/ice.2022.218).
[Download paper.](https://eotles.com/assets/papers/prospective_evaluation_of_data_driven_models_to_predict_daily_risk_of_clostridioides_difficile_infection_at_2-large_academic_health_centers.pdf)
## Abstract
Many data-driven patient risk stratification models have not been evaluated prospectively. We performed and compared the prospective and retrospective evaluations of 2 Clostridioides difficile infection (CDI) risk-prediction models at 2 large academic health centers, and we discuss the models’ robustness to data-set shifts.
------------------------
File: 2022-09-19-UMich-IOE-Promo-Video.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: "UMich IOE Promo Video"
categories:
- Blog
tags:
- Blog
- industrial engineering
- operations research
---
Was featured in the University of Michigan Department of Industrial and Operations Engineering promotional video.
> University of Michigan Industrial and Operations Engineering graduates are in high demand and use mathematics, and data analytics to launch their careers and create solutions across the globe in business, consulting, energy, finance, healthcare, manufacturing, robotics, aerospace, transportation, supply chain and more.
------------------------
File: 2022-11-02-Using-NLP-to-determine-factors-associated-with-high-quality-feedback.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: "Using natural language processing to determine factors associated with high‐quality feedback"
categories:
- Blog
- Research
tags:
- Blog
- Research
- medicine
- healthcare
- artificial intelligence
- machine learning
- natural language processing
- medical education
- SIMPL
header:
teaser: "/assets/images/insta/IMG_0591.JPG"
overlay_image: "/assets/images/insta/IMG_0591.JPG"
---
Global Surgical Education. Can be found [here](https://doi.org/10.1007/s44186-022-00051-y).
[Download paper.](https://eotles.com/assets/papers/using_NLP_to_determine_factors_associated_with_high_quality_feedback.pdf)
## Abstract
### Purpose
Feedback is a cornerstone of medical education. However, not all feedback that residents receive is high-quality. Natural language processing (NLP) can be used to efficiently examine the quality of large amounts of feedback. We used a validated NLP model to examine factors associated with the quality of feedback that general surgery trainees received on 24,531 workplace-based assessments of operative performance.
### Methods
We analyzed transcribed, dictated feedback from the Society for Improving Medical Professional Learning’s (SIMPL) smartphone-based app. We first applied a validated NLP model to all SIMPL evaluations that had dictated feedback, which resulted in a predicted probability that an instance of feedback was “relevant”, “specific”, and/or “corrective.” Higher predicted probabilities signaled an increased likelihood that feedback was high quality. We then used linear mixed-effects models to examine variation in predictive probabilities across programs, attending surgeons, trainees, procedures, autonomy granted, operative performance level, case complexity, and a trainee’s level of clinical training.
### Results
Linear mixed-effects modeling demonstrated that predicted probabilities, i.e., a proxy for quality, were lower as operative autonomy increased (“Passive Help” B = − 1.29, p < .001; “Supervision Only” B = − 5.53, p < 0.001). Similarly, trainees who demonstrated “Exceptional Performance” received lower quality feedback (B = − 12.50, p < 0.001). The specific procedure or trainee did not have a large effect on quality, nor did the complexity of the case or the PGY level of a trainee. The individual faculty member providing the feedback, however, had a demonstrable impact on quality with approximately 36% of the variation in quality attributable to attending surgeons.
### Conclusions
We were able to identify actionable items affecting resident feedback quality using an NLP model. Attending surgeons are the most influential factor in whether feedback is high quality. Faculty should be directly engaged in efforts to improve the overall quality of feedback that residents receive.
------------------------
File: 2022-12-20-Teaching-AI.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: "Teaching AI as a Fundamental Toolset of Medicine"
categories:
- Blog
- Research
tags:
- Blog
- Research
- medical education
- medical school
- artificial intelligence
- machine learning
header:
teaser: "/assets/images/insta/IMG_0440.JPG"
overlay_image: "/assets/images/insta/IMG_0440.JPG"
---
New article out in Cell Reports Medicine. It is a [perspective paper on incorporating AI into medical education](https://doi.org/10.1016/j.xcrm.2022.100824) with Drs. Cornelius A. James, Kimberly D. Lomis, and James Woolliscroft.
[Download paper.](https://eotles.com/assets/papers/teaching_AI_as_a_fundamental_toolset_of_medicine.pdf)
## Abstract
Artificial intelligence (AI) is transforming the practice of medicine. Systems assessing chest radiographs, pathology slides, and early warning systems embedded in electronic health records (EHRs) are becoming ubiquitous in medical practice. Despite this, medical students have minimal exposure to the concepts necessary to utilize and evaluate AI systems, leaving them underprepared for future clinical practice. We must work quickly to bolster undergraduate medical education around AI to remedy this. In this commentary, we propose that medical educators treat AI as a critical component of medical practice that is introduced early and integrated with the other core components of medical school curricula. Equipping graduating medical students with this knowledge will ensure they have the skills to solve challenges arising at the confluence of AI and medicine.
------------------------
File: 2023-01-12-STAT-News-medical-schools-missing-mark-on-AI.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: "STAT News: How medical schools are missing the mark on artificial intelligence"
categories:
- Blog
- Press
tags:
- Blog
- Press
- artificial intelligence
- machine learning
- medical education
- medical school
- STAT News
header:
teaser: "/assets/images/insta/IMG_0388.JPG"
overlay_image: "/assets/images/insta/IMG_0388.JPG"
---
Discussed my recent [perspective paper on incorporating AI into medical education](https://www.sciencedirect.com/science/article/pii/S2666379122003834) with Dr. James Woolliscroft and Katie Palmer of STAT News. Check out the full discussion [here](https://www.statnews.com/2023/01/12/medical-school-artificial-intelligence-health-curriculum/).
------------------------
File: 2023-02-22-RISE-VTC-AI-MedEd.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: "RISE Virtual Talking Circle: Innovations in Machine Learning and Artificial Intelligence for Application in Education"
categories:
- Blog
- Talk
tags:
- medicine
- machine learning
- artificial intelligence
- medical education
header:
teaser: "/assets/images/insta/IMG_0302.JPG"
overlay_image: "/assets/images/insta/IMG_0302.JPG"
---
University of Michigan Medical School RISE (Research. Innovation. Scholarship. Education) virtual talking circle discussion with Dr. Cornelius James.
Discussed the need for integration of AI education into undergraduate medical education (medical school). Echoed some of the findings from our [Cell Reports Medicine paper](https://www.sciencedirect.com/science/article/pii/S2666379122003834).
[Link to presentation.](https://eotles.com/assets/presentations/2023_RISE_VTC/AI_RISE_VTC.pdf)
------------------------
File: 2023-03-16-NAM-AI-HPE.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: "National Academy of Medicine: AI in Health Professions Education Workshop"
categories:
- Blog
- Talk
tags:
- medicine
- machine learning
- artificial intelligence
- medical education
- national academies
header:
teaser: "/assets/images/insta/IMG_0212.JPG"
overlay_image: "/assets/images/insta/IMG_0212.JPG"
---
Panel discussion on AI in health professions education.
I joined a panel of learners to share our perspectives on how AI should be incorporated into health professions education. Moderated by Mollie Hobensack and Dr. Cornelius James.
Panelists included: Noahlana Monzon, CPMA Nutrition Student, University of Oklahoma; Dallas Peoples, PhD Candidate in Sociology, Texas Woman's University; Winston Guo, MD Candidate, Weill Cornell Medical College; Gabrielle Robinson, PhD Student in Medical Clinical Psychology, Uniformed Services University of the Health Sciences; Alonzo D. Turner, PhD Student, Counseling and Counselor Education, Syracuse University and 2022 NBCC Doctoral Minority Fellow; and myself.
------------------------
File: 2023-03-23-html-svg-experiments.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: HTML/SVG Experiment
categories:
- Blog
tags:
- Blog
- HTML
- SVG
header:
teaser: "/assets/images/random_gradient_hello.svg"
overlay_image: "/assets/images/random_gradient_hello.svg"
---
------------------------
File: 2023-08-11-Machine-Learning-for-Healthcare-2023.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: "2023 Machine Learning for Healthcare Conference"
categories:
- Blog
- Talk
tags:
- Machine Learning for Healthcare Conference
- medicine
- healthcare
- research
- machine learning
- artificial intelligence
header:
teaser: "/assets/images/insta/E35BD8D3-0BE7-4D05-BDD7-C42C47F7C487.jpg"
overlay_image: "/assets/images/insta/E35BD8D3-0BE7-4D05-BDD7-C42C47F7C487.jpg"
---
Presentation at Machine Learning for Healthcare 2023 in New York on our work on rank-based compatibility. During the conference I presented a brief spotlight talk introducing our work and also had the chance to present a poster going into more detail. I've included copies of both in this blog post. You can find a link to the post about the paper [here](https://eotles.com/blog/research/Updating-Clinical-Risk-Stratification-Models-Using-Rank-Based-Compatibility/).
A recording of the spotlight intro video.
Spotlight presentation slides
[Link to download presentation.](https://eotles.com/assets/presentations/2023_MLHC/20230811_MLHC_rank_based_compatibility.pdf)
Poster
[Link to download poster.](https://eotles.com/assets/presentations/2023_MLHC/2023_MLHC_poster_20230809.pdf)
## Abstract
Updating clinical machine learning models is necessary to maintain performance, but may cause compatibility issues, affecting user-model interaction. Current compatibility measures have limitations, especially where models generate risk-based rankings. We propose a new rank-based compatibility measure and loss function that optimizes discriminative performance while promoting good compatibility. We applied this to a mortality risk stratification study using MIMIC data, resulting in more compatible models while maintaining performance. These techniques provide new approaches for updating risk stratification models in clinical settings.
------------------------
File: 2023-09-12-Github-Action-for-Post-Concatenation.md
Creation Date: "Sun, 2 Nov 2025 10:03:04 +0000"
---
title: "It's Automation All the Way Down! How to Use GitHub Actions for Blogging Automation with LLMs"
last_modified_at: 2023-12-08
categories:
- Blog
tags:
- git
- github
- github actions
- github pages
- CI/CD
- blogging
- jekyll
- minimal mistakes
- minimal-mistakes
- automation tools
- web development
- workflow optimization
- LLM
- chatGPT
- data engineering
header:
teaser: "/assets/images/insta/IMG_2253.JPG"
overlay_image: "/assets/images/insta/IMG_2253.JPG"
overlay_filter: 0.5 # same as adding an opacity of 0.5 to a black background
excerpt: "CI/CD automation isn't just for large-scale projects; it's a game-changer for individual programmers. I've started using the power of GitHub Actions to improve my blogging process, making it more efficient. I ❤️ Automation."
---
# The LLM Advantage in Blogging
I've used [large language model (LLM)](https://en.wikipedia.org/wiki/Large_language_model) powered chatbots ([ChatGPT](https://chat.openai.com) & [Claude](https://claude.ai/chats)) to help with some of my writing. They've been especially beneficial with blog posts where I have functionality dependent on JavaScript code.
# The Automation Dilemma
Utilizing these LLM chatbots is pretty straightforward, but it gets annoying when you want to provide them with writing samples. You can pick and choose a couple representative posts and share those, but that's too scattershot for me.
Ideally, I'd like my whole corpus of blog posts to be used as samples for the chatbots to draw from. I had written some python scripts that loop over my posts and create a concatenated file. This worked fine for creating a file - but it was annoying to manually kick off the process every time I made a new post. So, I started thinking about how to automate the process. There are many ways to approach it, but I wanted to keep it simple. The most straightforward route was to build off my existing automation infrastructure - the GitHub pages build process.
# GitHub Actions: My Automation Hero
The GitHub pages build process automatically converts the documents I use to write my blog (markdown files) into the web pages you see (HTML). GitHub provides this service as a tool for developers to quickly spin up webpages using the [GitHub Actions](https://github.com/features/actions) framework. GitHub actions are fantastic as they enable [continuous integration and continuous delivery/deployment (CI/CD)](https://en.wikipedia.org/wiki/CI/CD).
```mermaid
graph TB
%% Primary Path
A[Push new blog .md post to github] --> BA
BB --> CA
CB --> D[Commit & push changes]
%% GitHub Pages Build Process
subgraph B[GitHub Pages Build Process]
BA[Build eotles.com webpages] --> BB[Trigger: gh-pages branch]
end
%% Concatenate .md Files Action
subgraph C[Concatenate .md Files Action]
CA[Create file] --> CB[Loop over all posts and concat to file]
end
%% .md Files
A -.-> P[.md files]
P -.-> B
P -.-> C
```
*The above diagram provides a visual overview of the automation process I've set up using GitHub Actions.*
# Connecting the Dots with Jekyll, GitHub Pages, and Minimal Mistakes Theme
We've primarily centered our discussion of automation around GitHub Actions; however, it's essential to recognize [the broader ecosystem that supports my blogging](/blog/Hello-World-2/). I use the [Jekyll blogging platform](https://jekyllrb.com), a simple, blog-aware, static site generator. It's a fantastic tool that allows me to write in Markdown (.md), keeping things straightforward and focused on content. And Jekyll seamlessly integrates with GitHub Pages! The aesthetic and design of my blog is courtesy of the [Minimal Mistakes theme](https://mmistakes.github.io/minimal-mistakes/). It's a relatively flexible theme for Jekyll that's ideal for building personal portfolio sites.
For those of you who are on the Jekyll-GitHub Pages-Minimal Mistakes trio, the automation process I've described using GitHub Actions can be a game-changer. It's not just about streamlining; it's about harnessing the full potential of these interconnected tools to actually *speed up* your work.
# Diving into CI/CD
CI/CD is essential if you regularly ship production code. For example, it enables you to automatically kick off testing code as a part of your code deployment process. This is really important when you are working on a large codebase as a part of a team. Fortunately/unfortunately, I'm in the research business, so I'm usually just coding stuff up by my lonesome. CI/CD isn't a regular part of my development process (although maybe it should be 🤔). Despite not using it before, I decided to see if I could get it to work for my purposes.
# My First Foray into GitHub Action
Since this was my first time with GitHub Actions, I turned to an expert, ChatGPT. I had initially asked it to make a bash script that I was going to run manually, but then I wondered:
> so I have a website I host on GitHub. Is there a way to use the GitHub actions to automatically concantenate all the .md files in the /_posts directory?
It described the process, which comprised two steps:
1. Create a GitHub Action Workflow: you tell GitHub about an action by creating a YAML file in a special subdirectory (`.github/workflows`) of the project
2. Define the Workflow: in the YAML file, specify what you want to happen.
ChatGPT suggested some code to put in this file.
I committed and pushed the changes. A couple minutes later, I got an email that my GitHub Action(s) had errored out. The action that I created conflicted with the existing website creation actions. With assistance from ChatGPT, I solved this by having my new concatenation action wait for the website creation action to finish before running. We achieved this by using the gh-pages branch as a trigger, ensuring our action ran after the webpages were built and deployed.
# The Code Behind the Magic
The code for this GitHub Action is as follows:
```
name: Concatenate MD Files with Metadata

on:
  push:
    paths:
      - '_posts/*.md'

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Concatenate .md files with metadata
        run: |
          mkdir -p workflows_output
          > workflows_output/concatenated_posts.md
          cd _posts
          for file in *.md; do
            echo "File: $file" >> ../workflows_output/concatenated_posts.md
            echo "Creation Date: $(git log --format=\"%aD\" -n 1 -- $file)" >> ../workflows_output/concatenated_posts.md
            cat "$file" >> ../workflows_output/concatenated_posts.md
            echo "------------------------" >> ../workflows_output/concatenated_posts.md
          done

      - name: Commit and push if there are changes
        run: |
          git config --local user.email "action@github.com"
          git config --local user.name "GitHub Action"
          git add -A
          git diff --quiet && git diff --staged --quiet || git commit -m "Concatenated .md files with metadata"
          git push
```
# Conclusion: Automation Can Be a Warm Hug
The final result was an automation process that runs in the background every time a new post is added. Overall, I was impressed with the power and flexibility of GitHub Actions. This experience demonstrated that CI/CD isn't just for large software projects but can be a valuable tool for individual researchers and developers!
# Update!
This automation didn't end up working well.
I ended up switching the automation trigger to be time-based.
You can read about the updated setup [here](/blog/Github-Action-for-Post-Concatenation-Update/).
Cheers,