The University of South Carolina (USC) Big Data Health Science Center (BDHSC) held the fourth Annual Big Data Health Science Case Competition virtually on February 3-5, 2023. In its fourth year, the competition attracted 37 teams from 15 prestigious universities in the U.S.

Since 2020, the BDHSC has organized the Big Data Health Science Case Competition ahead of its other signature event, the Annual National Big Data Health Science Conference. The competition is open to undergraduate and graduate students at U.S. colleges and universities and aims to give enthusiastic student teams the opportunity to apply their knowledge to the analysis of big healthcare datasets.

The 2023 Big Data Health Science Student Case Competition focused on creating a diagnostic and prognostic algorithmic tool that applies advanced data analytics to gene expressions from a panel of healthy and unhealthy patients in order to classify and define a disease condition. The 37 competing teams came from 15 U.S. universities: the University of South Carolina, Arkansas State University, Auburn University, Boston University, College of Charleston, Dartmouth College, Duke University, Iowa State University, Minnesota State University Mankato, Oklahoma State University, University of Connecticut, University of Florida, University of Illinois at Chicago, University of Minnesota, and University of North Carolina at Charlotte.

More specifically, the fourth year’s case was a classic “black box” unsupervised learning problem, made even more difficult because over 90% of teams had no genomic domain knowledge. Each team was provided with a massive synthesized genomic dataset containing over 64,000 gene expressions for many patients.

The data consisted of two cohorts: a healthy cohort and a cohort suffering from an undisclosed ailment. Each team was given 24 hours to create a tool that distinguished healthy from unhealthy patients. Teams were asked to build an algorithm-based diagnostic tool using a data analytics approach, taking into account important features such as interpretability and usability, as well as the tool’s ethical and equity implications for different population groups. Each participating team analyzed the case and datasets, then implemented and presented its best diagnostic/prognostic algorithm using the genetic data provided.

Each team’s presentation was judged by a panel of three to seven judges based on the following criteria: 1) Organization and Presentation of Facts, 2) Accuracy, 3) Relevance, 4) Case Objectives Met, 5) Professional Appearance and Timeliness, and 6) Ability to Answer Judges’ Questions. The 21 judges included industry, government, and academic representatives. After the preliminary round presentations, seven teams competed in the final round. Finalists were required to identify the genes and the specific ailment affecting the unhealthy group.

G. Thomas Chandler, Dean, Arnold School of Public Health, and Professor, Environmental Health Sciences, at the University of South Carolina, announced the winners in his opening remarks at the USC National Big Data Health Science Conference on February 10, 2023. Yanjiao Yang, Shang Zeng, and Xingzhi Zhou from Duke University, Durham, NC, won the top prize of $5,000. Joseph Gyorda, Benjamin Levesque, and Digvijay Yadav of Dartmouth College, Hanover, NH, won the second-place prize of $3,000, while Bofan Chen, Anton Hung, and Kevin Rouse, also from Dartmouth College, won the third-place prize of $2,000.

Additionally, the teams of Minchuan Qin, Yaoyu Zang, and Tianyue Zhou from Oklahoma State University, Stillwater, OK; Jiacheng (Eric) Liu, Meghna Singh, and Trevor Winger from the University of Minnesota, Minneapolis, MN; Rafael Bidese Puhl, Bang Truong, and Li Zhou from Auburn University; and Jennifer Liu, Yuxuan Peng, and Xinwen Xu from Dartmouth College, Hanover, NH, received honorable mentions and won a $500 prize for their participation as finalists at the 2023 Annual Big Data Health Science Student Case Competition.

The Big Data Health Science Case Competition is designed to be a hands-on experience that tests the students’ analytical, teamwork, communication, and presentation skills in order to build a talent pipeline in big data health science.

“I really appreciate University of South Carolina for providing this enlightening competition and giving me the opportunity to participate in the event. For a statistics background student as me, the biggest challenge here is to learn the underlying background information under time pressure. In the tasks, I learned to not only give a technically correct solution, but also focus on the related impact and application of our analysis,” said Zhou.

“I am excited to be part of this case competition and it is a great opportunity to showcase our skills and learn from our peers. The health competition really gives us a new insight on how the data analytics process fusion with the real-life industry. I am glad that we rise to the challenge and deliver a strong performance,” added Zeng.

“Working among graduate and Ph.D. students, this competition challenged me to stretch the limits of my undergraduate skills. At the same time, this competition served as my first foray into Big Data and demonstrated the practical applications of machine-learning techniques to the field of healthcare. Moreover, the experience encouraged me to generate robust and interpretable insights from the data, presenting these insights to various stakeholders in the field,” said Levesque.

“It was really cool to use methods we are learning in our program at Dartmouth during the case competition. I didn’t realize how much I had learned until I had to apply it to a real-world situation,” said Chen.

“The competition was very intense but also fun. I had never used genomic data as a diagnostic tool and felt like we accomplished something very meaningful. I hope to do similar work in my eventual career to improve the lives of others,” said Rouse.

Moreover, the competition gave students an opportunity to present their analyses and recommendations to a broad panel of judges from academia, business, and the healthcare industry.

“Having such a diverse and impressive panel of judges was intimidating but also thrilling. I want to thank all at USC who volunteered their weekends, and this is an experience I will remember forever,” said Hung.

“I really enjoy drawing insights from our data and discussing our findings with judges,” said Zhou.

“The University of South Carolina’s case competition was truly an excellent experience to apply our data science knowledge to a realistic healthcare dilemma with big data. I also greatly appreciated the opportunity to practice our presentation skills in front of a panel of renowned judges within academia and industry,” added Gyorda.

“I want to express my gratitude to the University of South Carolina for giving us this fantastic opportunity to work with real-world health data. It had always been my goal to use statistical analysis and data science concepts in the healthcare industry, and thanks to this experience, I am convinced that I have made a significant contribution,” said Yadav.

A Special Thanks to Our Judges!

Each team’s presentation was judged by a panel of three to seven judges. Each judging panel was composed of industry and academic representatives. The BDHSC would like to thank our judges for lending their expertise; this competition would not have been possible without them.

Industry and Government  

  • Dr. Susan Burroughs, MHA, FACHE, Associate Chief Executive Officer, MUSC Health Columbia Medical Center Northeast
  • Chandra Dronavajjala, Senior Data Scientist, CVS Health
  • Jay Hamm, VP Operations, LMC
  • Jacqueline (Jackie) Johnson, Principal Analytical Training Consultant, SAS
  • John McCall, Principal Technical Training Consultant, SAS
  • Aunyika Moonan, PhD, MSPH, CPHQ, Executive Director of Data and Measurement, SC Hospital Association

Academia (External)

  • Dr. Carla Sampson, Clinical Associate Professor, Director of Healthcare Programs, Robert F. Wagner Graduate School of Public Service, New York University

Academia (Internal)

  • Hamid Abdollah, PhD, Post-Doc Researcher at USC
  • Forest Agostinelli, Ph.D., Assistant Professor, Computer Science and Engineering, AI Institute, USC College of Engineering and Computing
  • Denise Davise, Informatics PhD Student, Integrated Information Technology
  • Theodoros V. Giannouchos, Ph.D., Assistant Professor, Health Services Policy & Management, Arnold School of Public Health
  • Kevin Lu, Ph.D., Associate Professor, Clinical Pharmacy and Outcomes Sciences (CPOS), College of Pharmacy
  • Nabil Natafgi, Ph.D., MPH, CPH, Assistant Professor, Associate Director of the Patient Engagement Studio, Health Services Policy & Management, USC Arnold School of Public Health
  • Elizabeth A. Regan, Ph.D., Department Chair and Professor, Department of Integrated Information Technology, College of Engineering and Computing
  • Homayoun Valafar, Professor & Department Chair, USC Computer Science & Engineering; Biomedical Engineering
  • Songhua Xu, Ph.D., Associate Professor, Integrated Information Technology, USC College of Engineering and Computing

About BDHSC:

The University of South Carolina Big Data Health Science Center (BDHSC), as one of USC’s Excellence Initiatives, serves as a campus-wide interdisciplinary enterprise that conducts cutting-edge research and discovery, offers professional development and academic training, and provides service to the community and industry. The BDHSC consists of five content cores (Artificial Intelligence for Sensing and Diagnosis, Electronic Health Records, Genomics, Geospatial, and Social Media) and two functional hubs (Business/Entrepreneurship and Technology) that embody interdisciplinary collaboration and foster cutting-edge research and discovery. The combination of BDHSC Cores and Hubs represents a paradigm shift from traditional academic research to a focus on engagement and collaboration between academia, industry, and community.

The leadership structure of the BDHSC follows a team science approach with two MPIs (Xiaoming Li and Bankole Olatosi). Research at BDHSC takes place with campus-wide representation by 50 faculty affiliates from 20 departments and 9 colleges across USC. The BDHSC is governed by a Steering Committee and supported by Internal and External Advisory Committees.

About the University of South Carolina National Big Data Health Science Conference:

The Big Data Health Science Conference is a signature annual event of the USC Big Data Health Science Center (BDHSC). The theme of the 2023 conference was “Unlocking the Power of Big Data in Health: Translating Data Science into Program Development and Implementation.” Highlights of the conference included keynote and panel speakers from diverse areas of the health sciences, government, and academia, as well as poster sessions, networking events, and breakout sessions in the areas of artificial intelligence for sensing and diagnosis, electronic health records, genomics, geospatial, and social media research. The conference was held on February 10-11, 2023. More information is available at