Seed Grant Awardees
Every year the Data Science Academy awards seed grants to help support interdisciplinary data science research efforts. Below is a list of previous awardees and the projects funded.
2023 Awardees
Animal Behavior During the Great American Eclipse of 2024 – a Citizen Science Web Application
- Team members:
- Caren Cooper – College of Natural Resources
- Kelly Lynn Mulvey – College of Humanities and Social Sciences
- Adam Hartstone-Rose – College of Sciences
- Awarded: $48,475
- Description: We request funds to support an interdisciplinary team of students in building and user-, pilot-, and beta-testing a web application for a citizen science project focused on animal and human behaviors during extreme natural phenomena. This project is preparation for a data-intensive project focused on the 2024 solar eclipse, and will be useful for studies of animal and human behavior during subsequent phenomena, including before and after hurricanes and earthquakes. The web app increases the competitiveness of immediate grant proposals for: 1) documenting circadian and anxiety behaviors of a variety of animal taxa in multiple environments and phenomena and 2) broadening public engagement in Earth science to foster belonging and identity in science. The student team will create a well-documented and managed database that balances participant privacy protections and data sharing/re-use, and a user-tested data-entry system with automated report-backs to participants in the form of data visualizations. The student team will also draw on multiple disciplines to create recruitment materials, curricula for partners, materials to train project ambassadors, and user-friendly protocols and training for participants. We will user- and pilot-test within the Citizen Science Campus program during dawn/dusk scenarios, and beta-test with partners during the 2023 annular eclipse.
Data-Driven Research and Training in Emerging Semiconductors: A Cluster-Wide Pilot Project
- Team members:
- Aram Amassian – College of Engineering
- Harald Ade – College of Sciences
- Dali Sun – College of Sciences
- Ryan Chiechi – College of Sciences
- Brendan O’Connor – College of Engineering
- Kelly Zering – College of Agriculture & Life Sciences
- Awarded: $29,542
- Description: Emerging semiconductor materials, namely conjugated polymers and/or hybrid perovskites, have many energy, electronic, photonic and quantum use cases. A key feature of emerging semiconductors is their chemical and compositional diversity, which creates unique opportunities and challenges, such as a design space explosion. While nearly all 100 Carbon Electronic Cluster members are working on these materials, our research can benefit from a centralized database and the use of informatics to support larger team efforts, especially in relation to how the common material chemistry, structure, properties, and device performances measured by different research units relate to each other. There is tremendous opportunity to make breakthroughs in our research fields by adopting Materials Genomics Initiative 2.0 practices combining data science with theory, computation, and experiment. We aim to (1) develop a data science infrastructure, (2) develop data science skillsets, workflows, and culture, and (3) extend existing automated data generation workflows to a wider Cluster membership to enable team-based data-driven inquiry. Our vision is for the Cluster to collaboratively conduct data-driven fundamental and use-based materials research on emerging semiconductors, from synthesis all the way to device fabrication, and to become a national leader in data-driven research in emerging semiconductors.
Delivering on-farm recommendations through interdisciplinary data science
- Team members:
- Cranos Williams – College of Engineering
- Rachel Vann – College of Agriculture & Life Sciences
- Katherine Stowe – College of Agriculture & Life Sciences
- Awarded: $24,619
- Description: To ensure continued on-farm profitability, it will be imperative that farmers have access to dynamic tools that allow them to use data science to advance the economic and environmental sustainability of their operations. The goal of the proposed project is to leverage investments in current cloud computing and analytics infrastructure with data science expertise across NC State University coupled with current small-plot field data to create a farmer decision support tool via the SAS Viya platform that will be easily navigable by farmers across North Carolina. This tool will be created using a dataset from the field generated since 2019 that comprises 50,000 data points demonstrating the relationship of production practices with yield and quality across the state. Coupling this dataset with weather data to create an accessible tool that over 5,000 farmers can use will bring data science off campus into every county of North Carolina. Funds are requested to hire a graduate student with the analytical and programming expertise needed to create this tool in SAS Viya. The outcome of this project will be web-accessible models that provide data-driven solutions about soybean production practices directly to farmers, ultimately leading to an increase in the North Carolina agricultural economy.
- Related Articles: Seed Grant Awardees Aim to Revolutionize Farming with Data Science
Exploring the Pedagogical Effectiveness of an Open-Source Approach to Data Acquisition and Data Provenance
- Team members:
- Shuyin Jiao – College of Engineering
- Warren J. Jasper – Wilson College of Textiles
- Awarded: $43,465
- Description: Many engineering students enrolled in STEM courses lack the skills needed to design novel methods for collecting data and managing its provenance to support scientific reproducibility and reliability. Without these essential skills, students struggle to validate prototypes and resort to expensive proprietary equipment with restrictive interfaces that may not actually collect the right type of data. While most curricula emphasize data analysis and visualization (two core tenets of data science)[1, 9], there is a paucity of emphasis on data collection, and to a lesser extent on data storage. Students are generally given representative data sets to analyze but are not given the proper tools to design systems which will acquire, collect and store physical measurements – critical experience necessary for success in later years, graduate school, and industry. To better connect engineering education with industrial practices and prepare our students for the STEM workforce of tomorrow, we propose a major change to the current pedagogy centered on open source/open standards and the Raspberry Pi [6]. Adopting an open-source philosophy toward data collection will enable stability, security, interoperability, and reliability across a variety of engineering disciplines that acquire data, both in the classroom and in research labs.
Hidden Markov methodology for assessing racial stressors
- Team members:
- Jonathan Williams – College of Sciences
- Vanessa Volpe – College of Humanities and Social Sciences
- Awarded: $50,000
- Description: In America, Black individuals are at a higher risk for cardiovascular diseases than other ethnic/racial groups. A major contributor to such health disparities is the stress imposed on Black Americans due to racism. This work seeks to uncover the heterogeneity of racial stress responses among Black young adults through the application of Bayesian hidden Markov models (HMMs) to cardiac output data, namely respiratory sinus arrhythmia (RSA). This approach will allow for an accurate assessment of the magnitude with which RSA changes due to stress, as well as provide an approach for classifying participants based on their physiological response to stress. Ultimately, this research aims to bring more awareness to the racial health disparities witnessed in America through the application of new statistical methods to psychophysiological data.
Modeling Social Media Information Pathways and Mitigating the Effect of Disinformation
- Team members:
- William Rand – Poole College of Management
- Ana-Maria Staicu – College of Sciences
- Xiaocia Champon – College of Sciences
- Nikhil Ankolkar – College of Sciences
- David Roycroft – College of Sciences
- Landon Smith – Poole College of Management
- Nate Schaefer – Poole College of Management
- Awarded: $20,000
- Description: Social media influencers play a significant role in shaping the information transmitted on any social media platform, and there are growing concerns over disinformation and manipulated messages. A systematic understanding of influencer behaviors and information diffusion could assist in reducing the effect of mis/disinformation. Motivated by the Modeling Influence Pathways project (MIPs), our interdisciplinary team used Tweets related to the Skripal case to investigate users’ time-dependent behavioral differences and assess multiple social media platforms’ interaction effects on different campaigns. However, the limited MIPs funding only allows us to explore a single case. Upon the depletion of the MIPs funding, this seed funding will sustain our interdepartmental team and help us transition to securing additional external funding. This funding offers an opportunity to continue expanding our team and to test our methodology on additional cases like the war on Ukraine. The goal of this research is to build a tool to identify the most important users involved in information diffusion and map out the information transfer pathways. This project aims to connect information flows into the pathways that are used to disseminate and amplify mis/disinformation, and to help mitigate the effect of disinformation.
Predicting Air Pollution in Disadvantaged Communities in the United States with Deep Learning
- Team members:
- Zhen Qu – College of Sciences
- Xiaourui Liu – College of Engineering
- Kofi Boone – College of Design
- Katy May – Center for Human Health and the Environment
- Awarded: $49,973
- Description: Air pollution is the fifth leading risk factor for mortality worldwide. In the US, air pollution causes 100,000 premature deaths per year, 80% of which are due to fine particulate matter (PM2.5). Effective regulation of air pollution requires a better understanding of PM2.5 concentrations than we have at present. The current PM2.5 monitoring stations are mainly located in high-income and predominantly White areas, leaving the most vulnerable communities in the dark about their levels of exposure. The existing PM2.5 predictions by the EPA air quality model can fill these spatial gaps but still have large uncertainties. To resolve these challenges, we will apply novel semi-supervised deep learning techniques such as Graph Neural Networks (GNNs) to predict PM2.5 concentrations over the entire US. We will compare the resulting PM2.5 predictions with those from existing machine learning and air quality models, perform environmental justice analysis, disseminate our results to planners and key decision-makers, and conduct outreach to disadvantaged communities in North Carolina. This seed grant will initiate collaborations between three Colleges and a Center, support two early career PIs in building their research programs, and produce preliminary results for collaborative external grant proposals.
Researchers in AI for Safety and Ethics (RAISE) NC State
- Team members:
- Veljko Dubljević – College of Humanities and Social Sciences
- George List – College of Engineering
- Darby Orcutt – University Libraries
- William Bauer – College of Humanities and Social Sciences
- Munindar Singh – College of Engineering
- Crystal Lee – College of Education
- Awarded: $25,837
- Description: Fostering diverse connections and an engaged interdisciplinary community of NC State researchers working on wicked problems around AI in Society will RAISE the University’s profile as a leading voice on matters of science and society. RAISE would provide two day-long ideation and workshopping sessions, a half-day grant-writing retreat, and support for honoraria to grow and extend the existing AI in Society colloquium. Most notably, this seed grant will support one cross-college graduate student position and two undergraduate research assistant positions (from multiple colleges) to support activities, maintain the website and listservs, and manage communications between and among participating researchers. The goals of RAISE are to increase the number of colleges at NC State actively engaged in the AI in Society community, to work on the submission of major interdisciplinary AI in Society-related research grant proposals in 2024, and to develop a new National Science Foundation (NSF) Research Traineeship (NRT) grant application. In our view, values are inherent to the very consideration of the place of AI in society. Ensuring the morality of AI agent interactions with each other and with humans requires anticipating challenges including: (1) avoiding coding bias; (2) achieving a comprehensive view; (3) providing safety guarantees; and (4) public engagement.
Statistical Evaluation of Data-Driven Mechano-Electrochemistry of Lithium Batteries and Beyond Lithium
- Team members:
- Shadow Huang – College of Engineering
- Logan Opperman – College of Sciences
- Awarded: $40,000
- Description: In computational materials science and mechanics of lithium batteries and beyond lithium, statistics is critical for evaluating the confidence in predictions and models with substantial potential for impact on battery materials development and qualification. The role for statistical techniques is not limited to obtaining uncertainty intervals on phase boundaries. These methods can drive the development of selection criteria for the best combinations of datasets of battery materials and mechano-electrochemical descriptions that support each other for a given electrochemical system. The proposed work will be accompanied by a comparison of the use of statistical techniques in mechano-electrochemistry to inform critical lithium battery design choices and glean scientific insights.
The role of textile dyes in the atopic march
- Team members:
- Lizzette Lorenz – Veterinary Medicine
- Jasmine Olivares – Veterinary Medicine
- Nelson Vinueza Benitez – Wilson College of Textiles
- Awarded: $16,418
2023 Wearables Pilot
- Team members:
- Oguz Akbilgic – Wake Forest University School of Medicine
- James Dieffenderfer – NC State University College of Engineering
- Awarded: $40,000 – split with Wake Forest Center for Biomedical Informatics (WFBMI)
- Description: Evaluation of Left Arm ECG for Deep Learning Risk Prediction of Heart Failure seeks to achieve a first step in solving two big issues in cardiovascular disease prevention: the current standard of diagnostic prediction is (1) infrequent and (2) not easily accessible because the equipment is expensive. This research builds upon previous work in the field (including work performed by the PIs) and aims to build an ECG armband that uses artificial intelligence to predict the risk of cardiovascular disease.
2022 Awardees
WolfWebs: Social Network Analysis at NC State
- Team members:
- Steve McDonald – Humanities & Social Sciences, Sociology & Anthropology
- Andrew Davis – Humanities & Social Sciences, Sociology & Anthropology
- Branda Nowell – Humanities & Social Sciences, School of Public & International Affairs
- Robin Dodsworth – Humanities & Social Sciences, English
- College of Humanities & Social Sciences
- Awarded: $35,000
- Description: Social Network Analysis (SNA) is a methodological tool frequently deployed as part of data science, though it is not currently one of the courses offered by the Data Science Academy. Many faculty members and graduate students at NC State use SNA as part of their research, but they tend to work within rather than across disciplinary units. There is a growing need to develop an interdisciplinary community of SNA scholars at NC State in order to advance our institution’s goal of conducting high-impact and cutting-edge research, while also preparing students to become future leaders in the field of data science. Under the heading of WolfWebs, we aim to develop interdisciplinary connections among faculty and graduate students at NC State who conduct or are interested in conducting research using SNA. These efforts will develop a social infrastructure of SNA expertise to facilitate research collaboration, grant submission, and training. Specifically, the seed grant funds will be used to support a high-profile speaker series, research panels to introduce and connect SNA faculty, grant proposal ideation workshops, and intensive SNA training sessions for faculty and students.
Social Media Mining to inform Park use and Public Health decision-making
- Team members:
- J. Aaron Hipp – Natural Resources; Center for Geospatial Analytics; Parks, Recreation, & Tourism
- Deepti Adlakha – College of Design; Landscape Architecture & Environmental Planning
- Laura Tateosian – Center for Geospatial Analytics
- Jason Bocarro – Natural Resources; Parks, Recreation & Tourism Management
- Jing-Huei Huang – Center for Geospatial Analytics
- College of Natural Resources and College of Design
- Awarded: $32,417
- Description: The COVID-19 pandemic has highlighted the importance of access to and use of quality outdoor spaces, including parks and trails. The pandemic has also exacerbated the downward trend in survey response rates and increased survey fatigue. Parks and recreation and public health professionals face challenges in recruiting and responding to diverse voices in their communities to deliver the programs, policies, and spaces desired. Social media data, here specifically Twitter data, may provide a reliable and valid source of social and behavioral science data for appropriating parks, recreation, and public health resources. With this seed funding we seek to bring together two Colleges, a Center, and an Initiative, as well as two early career researchers, to explore the feasibility of using social media data mining to inform program, policy, and design decision-making for park and community health professionals. Aim 1 is to use and compare two distinct Twitter datasets to scrape North Carolina-originating tweets with park-related and physical activity-related text. Aim 2 is to validate content and topic results against existing survey data related to parks and physical activity among the public and parks and recreation practitioners. Aim 3 is to disseminate this work and seek extramural funding.
WormScanAI: A neural phenotype prediction tool for C. elegans using machine learning and automated high-throughput microscopy
- Team members:
- Adrianna San Miguel – Engineering
- Kevin Flores – Sciences and Math
- College of Engineering and College of Sciences
- Awarded: $48,606
- Description: Morphological changes in organs, tissues, and cells are indicative of underlying disorders or decline induced by aging. Images can be highly informative for diagnosing, understanding, and treating disease. However, identification of the relevant features from biological images is a very challenging task. In this work, we propose to integrate deep-learning approaches with high-throughput, high-content images in the model organism C. elegans. We plan to develop pipelines that enable analyzing and extracting informative insights from images of neurodegenerating neurons and declining tissues, by converting biological images into large descriptive data-sets amenable to statistical and mathematical examination. This work will allow the identification of neuronal defects and the in-depth analysis of features that can reveal mechanisms that lead to diverse structural changes, such as the drivers of neurodegeneration in different contexts. This grant brings together a strong team with expertise in high-throughput imaging, C. elegans aging and neurodegeneration, and machine learning applied to biological prediction problems.
Teaching History through Mining Patterns in Primary Source Texts: Using Artificial Intelligence for a student-led inquiry into redlining and data bias
- Team members:
- Amato Nocera – Education, TELS
- Christy Byrd – Education, TELS
- Shiyan Jiang – Education, TELS
- College of Education
- Awarded: $50,000
- Description: This proposed project aims to bring machine learning to high school history classrooms to foster historical thinking and critical AI (Artificial Intelligence) literacy. This project will build on an innovative pilot study that the PI team conducted in five history classrooms in North Carolina in May 2022. Our study had students use StoryQ—a web-based text mining and AI modeling platform for K-12—to investigate patterns across primary source texts, gain a deeper understanding of the systemic nature of historical discrimination, and foster AI literacy in evaluating bias and sources of data. With support from the Seed Grant program, we will hire a full-time graduate student to analyze the rich data collected from the initial pilot study (including pre- and post-surveys, in-class assignments, field notes, and interviews), help refine the unit (based on student learning outcomes from the first iteration), and implement a second iteration next academic year, following a design-based approach. The resulting data and analysis will provide an important foundation for the PI team to write a compelling NSF grant proposal for the ITEST program.
Data Science to connect earth movement across length scales
- Team members:
- Karen Daniels – Sciences; Physics
- Karl Wegmann – Sciences; Marine, Earth & Atmospheric Sciences
- Vrinda Desai – Sciences; Physics
- Nakul Deshpande – Sciences; Physics/MEAS
- Al Handwerger – JPL
- Vashan Wright – Scripps Institution of Oceanography, UCSD
- Ted Brzinski – Haverford College, Physics
- College of Sciences
- Awarded: $46,016
- Description: The ground beneath our feet shifts over time: sometimes slowly creeping downhill, sometimes in sudden landslides. The separation in length scales, between soil particles and landforms, is similarly immense. As global warming creates extreme weather events that destabilize the earth’s surface, and population growth pushes development into unfavorable building sites, there is an urgent need to understand how to identify and protect vulnerable populations from such hazards. Physicists and engineers have made great strides in performing laboratory studies of particulate materials which connect grain-scale deformation, applied loads, and the flow rules that relate them. Similarly, geoscientists have developed satellite technology to monitor the shifting landforms. However, these length and time scales remain far removed from the grain scale, and measurements of the internal forces within the earth’s surface are much less well-developed. How do we take our nascent understanding beyond the lab, to address real earth data? How do we take the most pressing earth hazard concerns and use laboratory data to seek solutions? This seed grant would fund open data science that maps out the most promising routes for bridging the chasm of dramatically different length and time scales, and targets those projects for external funding.
Data-Driven Reliability Analysis of Lithium-ion Battery Pack with Cell-to-Cell Degradation Dependence and Variation
- Team members:
- Mengmeng Zhu – Textiles
- Xiangwu Zhang – Textiles
- Wilson College of Textiles
- Awarded: $5,000
- Description: Driven by stricter emission regulations, energy intensity, and lower cost, Lithium-ion (Li-ion) battery technology is emerging as one of the most prominent sources for powertrain electrification and large-scale energy storage systems. Safety and reliability are the critical concerns impeding the adoption of Li-ion battery technology. Traditional methods to assess battery health, such as electrochemical models, rely on multiple partial differential equations, which are time-consuming and computationally expensive to solve. The research objective of this proposal is to develop a data-driven reliability analysis framework for the battery pack by understanding the dependence and variation of degradation behaviors of cells in a Li-ion battery pack, and to leverage the gained insights to advance modeling and computing for Li-ion battery pack health management. The proposed data-driven approach consists of data collection, stochastic process-based modeling that captures the heterogeneities of battery cells, and model validation using the collected data. This approach is expected to greatly improve the robustness and accuracy of evaluating the health of Li-ion battery packs, thereby improving the safety, reliability, and efficiency of battery-driven applications and furthering adoption of Li-ion battery technology. In addition, we will organize small-group proposal ideation workshops within NC State to bring together experts in data science, materials, communications, and power systems for future research projects.
Using Machine Learning to Automate the Identification of Transducing Events from Metagenomic Data
- Team members:
- Manuel Kleiner – CALS
- Benjamin Callahan – CVM
- College of Veterinary Medicine and College of Agriculture and Life Sciences
- Awarded: $41,712
- Description: Horizontal gene transfer (HGT) between microbes is critical for the evolution of microorganisms in communities and for global health threats such as the rapid spread of antibiotic resistance in microbiomes. We have developed and published a novel approach, termed “transductomics”, that allows for the sequence determination of genome fragments that are horizontally transferred between microbes by viruses. Currently, this approach requires extensive manual inspection of metagenomic sequencing data by an expert, which limits its user base. In this project, we will develop machine learning approaches to automate the detection of HGT events in transductomics datasets to enable faster analysis of larger datasets by non-experts. This interdisciplinary research project will combine the expertise of the computational research group of Dr. Callahan and the meta-omics data generation and analysis focused research group of Dr. Kleiner. The project will fund a graduate student who has already done some preliminary work on developing transductomics machine learning approaches. The key goal of this seed project is to provide preliminary data for a favorably reviewed R01 grant resubmission, as well as additional planned grant submissions.
Oppressive Infrastructures: Data, Equity and Access
- Team members:
- Tania Allen – Design; Art & Design
- Sara Queen – Design; Architecture
- College of Design
- Awarded: $50,000
- Description: The primary goal of this project is to build on new perspectives at the intersection of data science, ethics, representation and bias specifically as it relates to equity, inclusion and systems of oppression that may or may not be evident in the way that data is collected, cleaned, manipulated and communicated. Funding from this grant will support a colloquium of scholars and practitioners from multiple fields who are actively collecting and using a variety of quantitative and qualitative data in their work to discuss how they account for and address issues of equity and access. Through this proposal we hope to strengthen our cross-disciplinary network of researchers exploring how data and data visualization contribute to inequity and injustice as oppressive and liberating tools, and to collectively build an impactful body of research that diversifies the dimensions used to describe and study these wicked challenges and expands critical conversations to initiate change.
2021 Awardees
- Promoting Youth Critical Data Literacy Through Computing and Community Storytelling With Data
- PIs: Shiyan Jiang (TELS), Bita Akram (CSC)
- Awarded: $25,000
- Machine Learning-based Mathematical Representation of Model Uncertainty for Bayesian Inverse Uncertainty Quantification
- PIs: Xu Wu (Nuclear Engineering), Ralph Smith (Mathematics)
- Awarded: $25,000
- Think and Do: A Workshop to Advance Open Climate Data Science in North Carolina
- PIs: Kathie Dello (State Climate Office), Jessica Matthews (NCICS) with collaborators Carl Schreck (NCICS), Bjorn Brooks (NCICS), Sheila Saia (BAE), Yuhan Rao (NCICS) and Micah Vandegrift (Libraries)
- Awarded: $25,000