
Healthy Behavior Data Challenge

Working with a team of developers and data scientists, we were tasked with designing a simple, friendly user interface for the Centers for Disease Control and Prevention.


 

Step 1

Defining the Problem

Traditionally, public health data has been collected person-to-person, either by phone or face-to-face.

This approach is resource intensive and suffers from several limitations, such as recall error and self-reported bias. The Centers for Disease Control and Prevention (CDC) wanted to address the challenges and limitations of self-reported health surveillance information and tap into the potential of innovative data sources and alternative methodologies for public health surveillance.


Step 2

Methods of Outreach

Our solution for closing the gap in data collected from mobile sources was to build an app that houses a survey feature, gathers data collected from wearable devices and apps, and provides a method to validate and add context to that data.

The survey feature allows users to answer questions with a multiple-choice answer set or by entering a value. Users answer one question per screen for an optimal viewing experience. Answers are saved as the user moves through the survey, and the percent complete is a visual cue that there are more questions to be answered if they choose not to complete the survey in one sitting.

 

We wanted to provide a solution that improves on the quality of data collected, leverages novel technologies (wearables, apps) and results in a data set that is comparable to the historical data that’s been collected.

 

 

Step 3

Gathering the Data

To gather data for users to validate, we leveraged our partnership with Validic to aggregate physical activity data from wearable devices and apps. We built a mobile app that pre-populates the collected physical activity data into validation grids that a user can add data to (location and activity type), confirm or edit, and supplement with additional or missing bouts of activity. Completing these grids adds context and validity to the data.

Additionally, the data collected uses the same location and activity categories as the Behavioral Risk Factor Surveillance System (BRFSS), providing data that both complements and extends the results of the survey methodology currently used by the CDC.

Direct collection of a number of metrics that are tracked by the BRFSS (e.g., hours of sleep per night, servings of fruits and vegetables) and aggregation of our collected data into other BRFSS metrics (e.g., number of non-job-related physical activity bouts per month) will make direct comparison with BRFSS results straightforward. Our survey methodology also allows for the collection of age, gender, zip code, and other demographic information that would be helpful as a data source complementary to CDC health surveillance data.
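One example of the kind of roll-up described above is counting non-job-related bouts per month. This is an illustrative sketch (the location category "Work" and the record layout are assumptions, not the BRFSS coding scheme):

```python
from collections import Counter

def monthly_nonjob_bouts(validated_bouts: list[dict]) -> dict[str, int]:
    """Aggregate validated bouts into a BRFSS-style monthly count,
    excluding job-related activity."""
    counts: Counter = Counter()
    for bout in validated_bouts:
        if bout["location"] != "Work":
            month = bout["date"][:7]  # "YYYY-MM"
            counts[month] += 1
    return dict(counts)

bouts = [
    {"date": "2017-06-01", "location": "Neighborhood"},
    {"date": "2017-06-03", "location": "Work"},
    {"date": "2017-06-10", "location": "Park"},
]
monthly_nonjob_bouts(bouts)  # -> {"2017-06": 2}
```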

 

 

Step 4

The Outcomes

Outcomes from this pilot are summarized based on participation and data collection. From a participatory perspective, we had 262 members sign up for the pilot, 59% (n=154) of which downloaded and registered in the app.

There are three primary components of our app: connecting a device/app, completing a survey, and validating bouts of physical activity. Of the 154 participants who downloaded the app, 98% (n=151) successfully connected a device. Seventy-nine percent (n=121) of participants started the survey and 77% (n=119) completed the full 23-item questionnaire. Bouts of physical activity were validated by 99% (n=149) of participants, with an average of 12 days of physical activity validated.
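The funnel percentages above can be checked directly. One note: the 99% validation figure matches the 151 participants who connected a device rather than the 154 who downloaded the app, so that denominator is assumed below:

```python
signups = 262
downloaded = 154        # downloaded and registered in the app
connected = 151         # successfully connected a device
started_survey = 121
completed_survey = 119
validated = 149         # validated bouts of physical activity

def pct(n: int, d: int) -> int:
    return round(100 * n / d)

assert pct(downloaded, signups) == 59
assert pct(connected, downloaded) == 98
assert pct(started_survey, downloaded) == 79
assert pct(completed_survey, downloaded) == 77
# Assumed denominator: participants who connected a device.
assert pct(validated, connected) == 99
```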

 


Sample Question:
During the past month, did you participate in any physical activities?

 

 

BRFSS = Behavioral Risk Factor Surveillance System / HBC = Healthy Behavioral Challenge

In the majority of age bands, a smaller percentage of BRFSS participants self-reported 150 minutes or more of physical activity per week than was observed in our objectively collected data. Differences range from 7.5 percentage points to 41.8 percentage points, with an average difference of 18.7 percentage points.
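For readers unfamiliar with the comparison, the percentage-point differences are computed per age band and then averaged. The numbers below are purely hypothetical placeholders to show the calculation; the pilot's actual per-band figures are not reproduced here:

```python
# Hypothetical per-age-band rates (% meeting >=150 min/week).
brfss = {"18-24": 55.0, "25-34": 50.0, "35-44": 45.0}  # self-reported
hbc   = {"18-24": 70.0, "25-34": 75.0, "35-44": 80.0}  # objectively collected

# Percentage-point difference per band, then the average across bands.
diffs = {band: hbc[band] - brfss[band] for band in brfss}
avg_diff = sum(diffs.values()) / len(diffs)
# diffs -> {"18-24": 15.0, "25-34": 25.0, "35-44": 35.0}; avg_diff -> 25.0
```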

 

While we were not able to achieve the requested sample size of 300 participants, we were still able to collect a significant amount of data in a short period of time. This is a huge improvement compared to traditional methods of data collection: we were able to gather 39 data points on 150 people in a few weeks with limited marketing and a small incentive. Results such as these are a strong signal of how a solution like ours could significantly bolster the data collection capabilities of the CDC.

 

 

Step 5

The Limitations

Practical Solution

In this solution, participants must own a smartphone or wearable device in order to collect data. While the proportion of the population that owns these technologies continues to grow, this requirement limits the pool of people who can provide data in this manner. Long-term engagement with the app has not been tested and may require strong marketing campaigns to ensure continued use. Based on experience in the health and wellness industry, communication, awareness, and incentives are often required to achieve initial and sustained engagement with data collection apps.

We discovered through the pilot that we did not have an intuitive, clear option for indicating that no activity had been completed on a particular day. In the current design, a member who wants to indicate they did no physical activity on a particular day must tap a button that says, "Tap to validate your activity!" Once on that screen, they must select an option that says, "I did not do this activity" and leave the validation grid blank. Neither step is clear or intuitive for the user. This issue would be rectified in future iterations by providing a clear prompt for indicating that no activity was completed.

Through the data analysis process, we found that it would have been helpful to ask questions about a participant’s perception of their average physical activity so we could compare their perception to what was captured objectively through the device/app. This limitation is not an inherent flaw as it can easily be rectified by making small changes to the survey questions.

Technical Solution

Our current design only allows for linear questionnaires. Further development is needed to allow for branching functionality and more sophisticated techniques, such as adaptive questionnaires. We have not included an administrative interface in our solution; further iterations should include one so that researchers can modify and schedule questionnaires based on their needs.

A robust registration model based on survey eligibility data would be required to roll out the solution to a large population. Despite the fundamental quality of the underlying design, the limited size of the test data set is not representative of a mobile-based application at scale. We feel confident in our design's scalability, but further testing is required.