Putting personal data to use

Impressions

The following is a collection of posts about individualized approaches to data analysis, highlighting some general considerations as well as specific applications.


5. Feedback


A Dynamic Process

At this point, it should be apparent that the individualized process is not a static, one-size-fits-all approach to health and wellness.  Rather, it is a process that requires careful control and iteration in order to make the inferences most relevant to each user.  Such a dynamic process is frustratingly rare in today's healthcare system, where providers are increasingly motivated to apply systematic, consistent treatments to patients based on broad similarity of disease state.  There are several reasons for this, such as the fee-for-service payment approach that bills a specific amount for each condition treated and each treatment delivered, as well as the ever-present 'evidence-based medicine' emphasis on treating patients using only methods that have been studied in large, population-level randomized controlled studies.  Ironically, despite the advances highlighted in previous sections in sensors and monitors for data collection, user interfaces and mobile apps for delivery of information, and analytical approaches for managing large amounts of information, it is the static, one-size-fits-all approach that is considered modern, with personalized, or individualized, medical decision-making falling under the umbrella of the old days of 'doctor knows best'.  We hope that the approaches outlined previously raise objections to this supposedly modern, static approach in favor of a more dynamic, individualized approach that also incorporates systematic data collection and analysis.  In this section, we will expand on the dynamic approach by incorporating principles from reinforcement learning to understand how the individualized process can be further tailored and automated.

Reinforcement Learning

Without getting too technical, reinforcement learning is the branch of machine learning focused on how we make decisions based on the rewards we receive for the actions we take.  Reinforcement learning, which builds on related frameworks for systematic learning and quality improvement, such as Markov decision processes, has gained recent acclaim through its use in deep-learning models that have enabled computers to best human experts in games like chess and Go.  There are a number of excellent references and descriptions available online, as well as open-source code for experimenting with reinforcement learning, and we strongly encourage anyone interested to explore further independently.

At a basic level, reinforcement learning focuses on how we tend to make decisions based on a policy that is tied directly to the information available to us at the time of the decision, from which we take a certain action with the expectation of a reward.  For example, my policy for playing blackjack may be that whenever I am holding 16 or less and the dealer shows a face card, I take another card.  My reward for taking another card may be that I draw a 5, which brings my total to 21 and wins the hand unless the dealer also reaches 21.  The information available (the cards I can see) is called the 'state', and the policy space is the set of policies mapping each specific state to an action with an expected reward.  If my cards total 18 and the dealer has a 6 showing, this represents another state, for which my policy may be to stand (not take another card).
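To make that vocabulary concrete, here is a minimal sketch in Python of the blackjack policy described above.  The function name and thresholds are our own illustration of a state-to-action mapping, not a recommendation for optimal play.

```python
# A minimal sketch of the state/action/policy idea using the blackjack example above.
# The 'state' is just what I can see (my total, the dealer's up card); the policy maps
# each state to an action. Thresholds are illustrative, not optimal strategy.

def simple_policy(my_total: int, dealer_upcard: int) -> str:
    """Return 'hit' or 'stand' for the observed state."""
    dealer_shows_ten = dealer_upcard >= 10      # face cards count as 10
    if my_total <= 16 and dealer_shows_ten:
        return "hit"                            # holding 16 or less against a face card
    if my_total >= 17:
        return "stand"
    return "hit"

print(simple_policy(16, 10))  # 'hit'   -- the first state described in the text
print(simple_policy(18, 6))   # 'stand' -- the second state described in the text
```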

So how do we develop our set of policies?  The answer is often through trial and error, or through education from an outside source.  For blackjack, I may have learned from playing numerous games that when the dealer shows a face card, her hidden card is often another ten-value card, giving her a total of 20 that beats anything lower.  Or perhaps I might incorporate outside information about the odds of her hidden card being greater than a 6, which leads me toward my action of taking another card.  Or perhaps some combination of the two.  The point is that as I test out and develop my policy set, I make adjustments based on the long-term reward I receive from following one policy over another.
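Here is a sketch of what that trial-and-error adjustment could look like in code, assuming we simply keep a running average of the reward observed after each state-action pair (this tabular approach and the names used are our own illustration):

```python
from collections import defaultdict

# Estimate the long-term reward of each (state, action) pair from experience by
# keeping an incremental average, then prefer the action with the higher estimate.

value = defaultdict(float)   # estimated reward per (state, action)
count = defaultdict(int)     # number of times each (state, action) has been tried

def update(state, action, reward):
    """Fold one observed reward into the running average for (state, action)."""
    count[(state, action)] += 1
    value[(state, action)] += (reward - value[(state, action)]) / count[(state, action)]

def greedy_action(state, actions=("hit", "stand")):
    """Pick the action with the highest estimated value in this state."""
    return max(actions, key=lambda a: value[(state, a)])

# After many hands, updates like these gradually reshape the policy:
update((16, 10), "hit", +1.0)    # drew a 5, won the hand
update((16, 10), "stand", -1.0)  # stood, dealer made 20, lost
print(greedy_action((16, 10)))   # 'hit'
```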

Now, suppose that I play a few games of blackjack and develop what I believe to be a solid set of policies for what actions to take based on each state of the cards.  How do I know that my policy set is the best, or optimal, policy set for the game of blackjack?  Further, how do I decide whether to stick with a policy set that I believe is pretty good for my purposes, or make changes in the hope of finding a better set?  This dilemma is referred to as the exploitation/exploration trade-off, and what we're really getting at is whether my pretty good policy set is merely a local optimum or the global optimum.  It turns out that in many situations, in the process of exploring all policy sets to find the global optimum, we actually do worse for a period.  We take a step back in order to take two steps forward.  It is this situation that has motivated the use of the term 'disruptive' technology: disrupting the old policy set, and the exploitation that comes with sticking to the 'old ways', for the sake of finding better policy sets closer to the global optimum.  As you can imagine, there are a number of theories about the best method for balancing exploration and exploitation, which are beyond the scope of this discussion.
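One commonly used rule of thumb for managing this trade-off is 'epsilon-greedy' action selection: most of the time we exploit the action our current estimates favor, and a small fraction of the time we explore a random alternative.  The sketch below is our own simplified illustration of the idea, reusing the tabular value estimates from the previous snippet:

```python
import random

# Epsilon-greedy selection: with probability epsilon, try a random action (explore);
# otherwise follow the current best estimate (exploit). Decaying epsilon over time
# lets behavior settle on what has worked once enough exploration has been done.

def epsilon_greedy(state, actions, value, epsilon=0.1):
    """Mostly exploit the best-known action, occasionally explore a random one."""
    if random.random() < epsilon:
        return random.choice(list(actions))                            # explore
    return max(actions, key=lambda a: value.get((state, a), 0.0))      # exploit

# Example: on roughly 10% of hands we deviate from our 'pretty good' blackjack
# policy, which is the only way to discover whether a better policy exists.
```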

Figure 1. Local and global optima.  Oftentimes we may be at a local optimum, in which case we actually need to become less optimal for a time in order to reach the global optimum.  See text for details.


So what are the implications of this, at times disruptive, learning process for feedback and improvement of the individualized process?  Moreover, how can we be certain that we have explored the range of policies and approaches to medical decision-making needed to ensure that our approach is globally optimal, and not merely locally optimal?  Medicine is not a field highly comfortable with disruption, and yet the history of medicine is marked by disruptive innovations that render past approaches obsolete.  Managing this balance is not something that many of us are comfortable with, or willing to put our own health at risk for, so how do we develop a system for continual improvement while also ensuring safety and efficacy?

The Minimum Viable Product

In the Silicon Valley world of software development, experts in start-up initiation preach the importance of the minimum viable product: a prototype application that is far from bulletproof, but adaptable to the needs of a growing user base.  Books such as The Lean Startup teach entrepreneurs to avoid putting too much time into making the first versions of a product high-quality, since it is hard to predict what users will ultimately want, and effort is better spent learning those needs and adapting the product to meet them.  You may start out trying to build an app that accurately records daily activity from an accelerometer, only to find out that people are less concerned with accuracy and more concerned with an interface that allows them to compare with friends.  We touched on this situation briefly in the User Interface section, but let's dive into this approach further.

Now, if your app is for entertainment purposes, where a loss of accuracy is not a big deal, then this approach makes a lot of sense, since indeed you would waste a lot of time developing something that people don't really want.  On the other hand, if your app is going to be used to make medical decisions, such as whether to start or stop a medication, then even an early version could have dire consequences if it were not designed to provide accurate information.  The individualized data analysis approach described here can be both, and this is our great challenge.

If our application were solely for medical use, then we would need to follow the standard approach: build a solid prototype, conduct clinical studies in populations of increasing size, and then test it formally in the setting of a randomized controlled trial (RCT).  Only then, based on a successful result in the RCT, would we be able to release the app for public use.  However, what happens if, during the process of testing our app in clinical trials, we made a few changes to the analysis platform that improved its predictive accuracy?  Do we need to repeat the study?

If you think this issue is only relevant to app development, think again.  The issue of technology improving during the course of clinical trials and care arises all the time in healthcare, particularly in the area of medical procedures.  A great example of this phenomenon has occurred recently with the development of a procedure to treat the heart rhythm disorder called atrial fibrillation (Afib).  Early investigators noted that for many patients with Afib, extra beats from one part of the heart seemed to be triggering the rhythm, and that burning those areas with a technique called ablation appeared to reduce the amount of Afib in their patients.  This observation was tested in a small clinical study, and the procedure was approved by the FDA.  Several companies designed equipment for use in the Afib ablation procedure, and over time each improved on prior iterations to make the procedure safer and more efficient.  Of course, while each iteration improved safety in small clinical studies, larger RCTs had not been conducted to determine the overall effect, or whether this particular procedure should be applied to each of the millions of people worldwide with Afib.  Then, after the procedure had been available to the public for over 10 years, an RCT was designed to test it overall.  That study, called the CABANA study, would enroll patients over the next 8-10 years (RCTs take a lot of time and money to conduct, by the way).  However, during enrollment, the iterative process continued, with 'minor' changes to the procedure again aimed at improving efficiency and safety.  By the end of the study, the Afib ablation procedure being done across sites was quite different from what it had been at the start, such that when the results were published, showing benefit in some patients but not others, no one knew how to interpret them.  This situation has prompted much discussion in the cardiology community about the true efficacy of the Afib ablation procedure, but the point is that the same situation could occur in healthcare app development.  The question then becomes: how do we balance innovation with safety and efficacy when our methods for determining the latter rest on approaches that are not compatible with the pace of the former?

As with the Afib ablation controversy, we are unlikely to definitively solve the problem of exploitation and exploration inherent in innovation clashing with the safety and regulation needed in healthcare applications.  Our approach is to recognize that the development of individualized approaches to health and wellness is a dynamic process, one that requires iteration and improvement, with close feedback from users, both providers and patients, and scientific study whenever possible.  Indeed, the quantitative approach to individualized medicine is a new field in medical research, and the purpose of our group and others will be to expand our knowledge in an open and transparent way so that we can ultimately balance innovation and efficacy to improve the health of our patients and the effectiveness of our providers.

There are a number of methods from reinforcement learning and decision analysis for exploring the policy space and searching for better optima in terms of outcomes and health.  At this point, we haven't even scratched the surface of these applications in healthcare, much less individualized medicine.  The future is bright, however, as the first key step in the process is systematizing and quantifying our health.  If nothing else, this is the first innovation of individualized medicine: it enables us to measure our outcomes and risk factors quantitatively and longitudinally, to build statistical models identifying trends and correlations, and to use statistics and epidemiology to test and verify hypotheses of causation.  This process, once established, is ripe for improvement.
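As a toy illustration of the local-versus-global optimum idea from Figure 1 (a made-up one-dimensional 'reward' curve, not a model of any health outcome), the sketch below shows how a purely greedy search gets stuck on a lower peak, while occasionally restarting somewhere new, and being temporarily less optimal, finds the higher one:

```python
import math
import random

def reward(x: float) -> float:
    """A bumpy 1-D curve with a local peak near x=1 and a higher global peak near x=6."""
    return math.exp(-(x - 1) ** 2) + 2 * math.exp(-(x - 6) ** 2)

def hill_climb(x: float, step: float = 0.1, iters: int = 200) -> float:
    """Greedy local search: only accept moves that immediately improve the reward."""
    for _ in range(iters):
        for candidate in (x + step, x - step):
            if reward(candidate) > reward(x):
                x = candidate
                break
    return x

random.seed(0)
greedy = hill_climb(0.0)    # pure exploitation: climbs the nearby (local) peak and stops
explored = max((hill_climb(random.uniform(0, 8)) for _ in range(10)), key=reward)
print(round(reward(greedy), 2), round(reward(explored), 2))   # roughly 1.0 vs 2.0
```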

Conclusion

This concludes our brief series on the components of the individualized medicine approach.  Hopefully, we have provided enough information to spark additional reading and study into each of the wide-ranging factors that must come together in order to make individualized medicine a reality.  None of these sections is intended to be comprehensive, but we hope that a reader with solid experience and knowledge of one component will gain insight into the others in order to fully grasp the challenge.  If you have suggestions or comments, please feel free to leave them below, or email us at IDAO@ucdenver.edu.

 

Michael Rosenberg