The 2022 DataFest challenge came from the Play2Prevent lab at the Yale Center for Health & Learning Games in the Yale School of Medicine. DataFest participants analyzed data from Elm City Stories, an educational video game designed for middle and high school students. The goal of the game was to prevent negative health outcomes in young, at-risk adolescents by helping teens increase their perception of risk and develop skills for predicting and understanding the future consequences of their actions. The DataFest challenge was to characterize, measure and display patterns of play within the game that the researchers could then use to make connections to real-life behavior.
DataFest Through the Years
DataFest 2021 participants used data provided by the Rocky Mountain Poison and Drug Safety (RMPDS) center to discover and identify patterns of drug use and misuse. RMPDS administers the Researched Abuse, Diversion and Addiction-Related Surveillance (RADARS®) System Survey of Non-Medical Use of Prescription Drugs Program, which monitors drug use and abuse across the country and internationally using a variety of data sources. DataFest participants were provided with nearly 100,000 responses to an online survey that studies drug use and other behaviors and risk factors among the general population in four countries (the United States, Canada, Germany and the United Kingdom of Great Britain and Northern Ireland). Hundreds of demographic and drug-specific variables were available for DataFest participants to use to help identify and predict patterns of misuse, with an emphasis on opioid medications.
In order to protect the health and safety of all our students, staff, faculty, and community members, ASA DataFest @ OSU 2020 was canceled due to the COVID-19 outbreak.
DataFest participants received four different data sets from the National Canadian Women’s Rugby team that spanned the team’s 2017-2018 season. Some of the data was qualitative (self-reported ratings of different health and wellness factors), while other data was quantitative (win/loss ratios, players’ rates of speed and acceleration on the field, and geolocation data collected during game play). Over 2 GB of data could be used to explore the relationships between individual player performance and team performance. Many DataFest teams chose to combine data from the different data sets to determine the most likely predictors of players’ fatigue, stress, and injury, with a goal of improving players’ stamina, performance, and overall health and wellness.
DataFest participants used over 17 million records of data related to job postings on Indeed's web site to gain insight into job markets over the course of a year across the US, Canada and Germany. Over 3 GB of data could be used to explore trends in job availability across months and seasons, to examine how users engaged with jobs in different industrial sectors, and to build analyses that could be used to help inform new job seekers. Many DataFest teams chose to combine Indeed's data with outside data on the economy and labor markets to help predict local, national and international economic activity.
DataFest participants used over 10 million records of hotel searches from Expedia’s web sites to analyze how customers interact with Expedia on their path from search to selection to purchase. Over 2 GB of search data could be combined with over 5 million fields of data describing travel destinations in order to understand how customer segments differ in their search and travel behavior and, ultimately, to help Expedia differentiate between “lookers”—those browsing Expedia’s sites—and “bookers”—customers who ultimately make a hotel reservation.
The inaugural DataFest at Ohio State. Participants used three data sets from Ticketmaster (one with information about customer use of its web site, one describing events listed on the site, and one with information about Google ad campaigns on the site) to help understand how site visits could be converted to ticket sales, and to identify "true fans" of artists and bands.