A lot of time and effort has gone into methods for predicting utilization and spending, primarily to target care management programs. Most of those efforts use claims as the primary data source. An article in the American Journal of Managed Care describes using electronic health record (EHR) information for this purpose, instead of or in addition to claims data. (AJMC Article) The authors focused on hospitalizations and on whether algorithms could be developed to predict who was likely to be hospitalized in the coming 6 months. Adult patients in a large health system covering several states made up the study population. Data existing in the EHR, or in claims, as of a certain date were used to predict hospitalizations over the following 6 months. In addition to health status variables, utilization data, diagnoses and demographic information were used in the regressions. As is usually the case in this kind of work, a large proportion of the population served as a “training” data set to develop the predictive algorithms, and the algorithms were then tested on the rest of the population. The researchers evaluated models using EHR data only, claims data only, and both types of data. They also tested models that used limited sets of the variables. The final models included 169 variables.
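For readers who want to see the train-then-test idea concretely, here is a minimal Python sketch. It is not the authors' method: the function names, the 75/25 split, the single "prior admission" predictor and the synthetic patients are all assumptions made for illustration, and the C-statistic is used only as one common way to score how well a model separates patients who are hospitalized from those who are not.

```python
import random


def split_holdout(records, train_fraction=0.75, seed=0):
    """Shuffle the records and split them into training and holdout sets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_rate_by_group(train):
    """Toy 'model': the observed hospitalization rate within each group."""
    totals, events = {}, {}
    for group, hospitalized in train:
        totals[group] = totals.get(group, 0) + 1
        events[group] = events.get(group, 0) + hospitalized
    return {g: events[g] / totals[g] for g in totals}


def c_statistic(scores, outcomes):
    """Chance that a hospitalized patient outscores a non-hospitalized one.

    Ties count as half. 0.5 is no better than chance, 1.0 is perfect.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic (made-up) patients: (had_prior_admission, hospitalized_in_6_months)
rng = random.Random(42)
patients = []
for _ in range(2000):
    prior = 1 if rng.random() < 0.2 else 0
    risk = 0.30 if prior else 0.05  # assumed rates, for illustration only
    patients.append((prior, 1 if rng.random() < risk else 0))

train, holdout = split_holdout(patients)
rates = fit_rate_by_group(train)                 # develop on the training set
overall = sum(y for _, y in train) / len(train)  # fallback for unseen groups
scores = [rates.get(group, overall) for group, _ in holdout]
outcomes = [y for _, y in holdout]
print(f"holdout C-statistic: {c_statistic(scores, outcomes):.2f}")
```

The point of the split is that the score is computed only on patients the model never saw during development, which guards against a model that merely memorizes its training data.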
The variables with the highest individual predictive value included diagnoses, demographic characteristics, and prior utilization, all of which is consistent with common sense. Certain diagnoses were strongly linked to the likelihood of a future hospitalization, such as sickle cell anemia, heart transplant and lipidoses. Being age 76 or over was also a strong predictor. The model that used all variables from both claims and EHR data had the best overall predictive score, but it was only slightly better than the models using either EHR data or claims data alone. These models predicted around 85% of hospitalizations. The model tended to overpredict hospitalization for the patients at the highest risk, which is useful to consider in care management efforts. One advantage of including EHR data is that it is generally available faster, which would allow more current and more frequent updating of predictive results. It is encouraging to see progress in creating models that do a good job of predicting utilization and spending.
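The overprediction at the top of the risk range is the kind of thing a simple calibration check can reveal: sort patients by predicted risk, cut them into bands, and compare the average predicted risk in each band with the share actually hospitalized. The Python sketch below is only an illustration under assumed names and made-up numbers; the article does not say how the authors assessed this.

```python
def calibration_by_band(predicted, observed, n_bands=10):
    """Return (mean predicted risk, observed rate, count) per risk band.

    Bands run from lowest to highest predicted risk and are roughly equal
    in size. Empty bands (when there are few patients) are skipped.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    bands = []
    for b in range(n_bands):
        chunk = pairs[b * n // n_bands:(b + 1) * n // n_bands]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        bands.append((mean_pred, obs_rate, len(chunk)))
    return bands


# Made-up example: four patients, two bands.
predicted = [0.1, 0.2, 0.8, 0.9]
observed = [0, 0, 1, 0]
for mean_pred, obs_rate, count in calibration_by_band(predicted, observed, 2):
    flag = "overpredicts" if mean_pred > obs_rate else "ok or underpredicts"
    print(f"predicted {mean_pred:.2f}  observed {obs_rate:.2f}  n={count}  {flag}")
```

For a care management program, a top band whose predicted risk runs well above its observed rate means the highest-risk list will include more patients who would not have been hospitalized anyway, which matters when deciding how many people to enroll.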