Seed Grant Awardees
Every year the Data Science Academy makes 5-10 seed grant awards to support interdisciplinary data science research efforts. Below is a list of previous awardees and the projects funded.
2022 Awardees
WolfWebs: Social Network Analysis at NC State
- Team members:
- Steve McDonald – Humanities & Social Sciences, Sociology & Anthropology
- Andrew Davis – Humanities & Social Sciences, Sociology & Anthropology
- Branda Nowell – Humanities & Social Sciences, School of Public & International Affairs
- Robin Dodsworth – Humanities & Social Sciences, English
- College of Humanities & Social Sciences
- Awarded: $35,000
- Description: Social Network Analysis (SNA) is a methodological tool frequently deployed as part of data science, though it is not currently the subject of any course offered by the Data Science Academy. Many faculty members and graduate students at NC State use SNA as part of their research, but they tend to work within rather than across disciplinary units. There is a growing need to develop an interdisciplinary community of SNA scholars at NC State in order to advance our institution’s goal of conducting high-impact, cutting-edge research, while also preparing students to become future leaders in the field of data science. Under the heading of WolfWebs, we aim to develop interdisciplinary connections among faculty and graduate students at NC State who conduct or are interested in conducting research using SNA. These efforts will develop a social infrastructure of SNA expertise to facilitate research collaboration, grant submission, and training. Specifically, the seed grant funds will be used to support a high-profile speaker series, research panels to introduce and connect SNA faculty, grant proposal ideation workshops, and intensive SNA training sessions for faculty and students.
Social Media Mining to inform Park use and Public Health decision-making
- Team members:
- J. Aaron Hipp – Natural Resources Center for Geospatial Analytics; Parks, Recreation, & Tourism
- Deepti Adlakha – College of Design; Landscape Architecture & Environmental Planning
- Laura Tateosian – Center for Geospatial Analytics
- Jason Bocarro – Natural Resources; Parks, Recreation & Tourism Management
- Jing-Huei Huang – Center for Geospatial Analytics
- College of Natural Resources and College of Design
- Awarded: $32,417
- Description: The COVID-19 pandemic has highlighted the importance of access to and use of quality outdoor spaces, including parks and trails. The pandemic has also exacerbated the downward trend in survey response rates and increased survey fatigue. Parks and recreation and public health professionals face challenges in recruiting and responding to diverse voices in their communities to deliver the programs, policies, and spaces desired. Social media data, here specifically Twitter data, may provide a reliable and valid source of social and behavioral science data for appropriating parks, recreation, and public health resources. With this seed funding we seek to bring together two Colleges, a Center, and an Initiative, as well as two early career researchers, to explore the feasibility of using social media data mining to inform program, policy, and design decision-making for park and community health professionals. Aim 1 is to use and compare two distinct Twitter datasets to scrape tweets originating in North Carolina with park-related and physical activity-related text. Aim 2 is to validate content and topic results against existing survey data related to parks and physical activity among the public and parks and recreation practitioners. Aim 3 is to disseminate this work and seek extramural funding.
WormScanAI: A neural phenotype prediction tool for C. elegans using machine learning and automated high-throughput microscopy
- Team members:
- Adriana San Miguel – Engineering
- Kevin Flores – Sciences and Math
- College of Engineering and College of Sciences
- Awarded: $48,606
- Description: Morphological changes in organs, tissues, and cells are indicative of underlying disorders or decline induced by aging. Images can be highly informative for diagnosing, understanding, and treating disease. However, identifying the relevant features in biological images is a very challenging task. In this work, we propose to integrate deep-learning approaches with high-throughput, high-content images in the model organism C. elegans. We plan to develop pipelines for analyzing and extracting insights from images of neurodegenerating neurons and declining tissues by converting biological images into large descriptive datasets amenable to statistical and mathematical examination. This work will allow the identification of neuronal defects and the in-depth analysis of features that can reveal mechanisms that lead to diverse structural changes, such as the drivers of neurodegeneration in different contexts. This grant brings together a strong team with expertise in high-throughput imaging, C. elegans aging and neurodegeneration, and machine learning applied to biological prediction problems.
Teaching History through Mining Patterns in Primary Source Texts: Using Artificial Intelligence for a student-led inquiry into redlining and data bias
- Team members:
- Amato Nocera – Education, TELS
- Christy Byrd – Education, TELS
- Shiyan Jiang – Education, TELS
- College of Education
- Awarded: $50,000
- Description: This proposed project aims to bring machine learning to high school history classrooms to foster historical thinking and critical AI (Artificial Intelligence) literacy. This project will build on an innovative pilot study that the PI team conducted in five history classrooms in North Carolina in May 2022. Our study had students use StoryQ—a web-based text mining and AI modeling platform for K-12—to investigate patterns across primary source texts, gain a deeper understanding of the systemic nature of historical discrimination, and foster AI literacy in evaluating bias and sources of data. With support from the Seed Grant program, we will hire a full-time graduate student to analyze the rich data collected from the initial pilot study (including pre- and post-surveys, in-class assignments, field notes, and interviews), help refine the unit (based on student learning outcomes from the first iteration), and implement a second iteration next academic year, following a design-based approach. The resulting data and analysis will provide an important foundation for the PI team to write a compelling NSF grant proposal for the ITEST program.
Data Science to connect earth movement across length scales
- Team members:
- Karen Daniels – Sciences; Physics
- Karl Wegmann – Sciences; Marine, Earth & Atmospheric Sciences
- Vrinda Desai – Sciences; Physics
- Nakul Deshpande – Sciences; Physics/MEAS
- Al Handwerger – JPL
- Vashan Wright – Scripps Institution of Oceanography, UCSD
- Ted Brzinski – Haverford College, Physics
- College of Sciences
- Awarded: $46,016
- Description: The ground beneath our feet shifts over time, sometimes slowly creeping downhill and sometimes in sudden landslides. The separation in length scales, between soil particles and landforms, is similarly immense. As global warming creates extreme weather events that destabilize the earth’s surface, and population growth pushes development into unfavorable building sites, there is an urgent need to understand how to identify and protect vulnerable populations from such hazards. Physicists and engineers have made great strides in performing laboratory studies of particulate materials which connect grain-scale deformation, applied loads, and the flow rules that relate them. Similarly, geoscientists have developed satellite technology to monitor the shifting landforms. However, these length and time scales remain far removed from the grain scale, and measurements of the internal forces within the earth’s surface are much less well-developed. How do we take our nascent understanding beyond the lab, to address real earth data? How do we take the most pressing earth hazard concerns and use laboratory data to seek solutions? This seed grant would fund open data science that maps out the most promising routes for bridging the chasm of dramatically different length and time scales, and targets those projects for external funding.
Data-Driven Reliability Analysis of Lithium-ion Battery Pack with Cell-to-Cell Degradation Dependence and Variation
- Team members:
- Mengmeng Zhu – Textiles
- Xiangwu Zhang – Textiles
- Wilson College of Textiles
- Awarded: $5,000
- Description: Driven by stricter emission regulations, energy intensity, and lower cost, Lithium-ion (Li-ion) battery technology is emerging as one of the most prominent sources for powertrain electrification and large-scale energy storage systems. Safety and reliability are the critical concerns impeding the adoption of Li-ion battery technology. Traditional methods to assess battery health, such as electrochemical models, rely on multiple partial differential equations, which are time-consuming and computationally expensive to solve. The research objective of this proposal is to develop a data-driven reliability analysis framework for the battery pack by understanding the dependence and variation of degradation behaviors of cells in a Li-ion battery pack, and to leverage the gained insights to advance modeling and computing for Li-ion battery pack health management. The proposed data-driven approach consists of data collection, stochastic process-based modeling that captures the heterogeneities of battery cells, and model validation using the collected data. It is expected to greatly improve the robustness and accuracy of evaluating the health of Li-ion battery packs, thereby improving the safety, reliability, and efficiency of battery-driven applications and furthering adoption of Li-ion battery technology. In addition, we will organize small-group proposal ideation workshops within NC State to bring together experts in data science, materials, communications, and power systems for future research projects.
Using Machine Learning to Automate the Identification of Transducing Events from Metagenomic Data
- Team members:
- Manuel Kleiner – CALS
- Benjamin Callahan – CVM
- College of Veterinary Medicine and College of Agriculture and Life Sciences
- Awarded: $41,712
- Description: Horizontal gene transfer (HGT) between microbes is critical to the evolution of microorganisms in communities and to global health threats such as the rapid spread of antibiotic resistance in microbiomes. We have developed and published a novel approach, termed “transductomics”, that allows for the sequence determination of genome fragments that are horizontally transferred between microbes by viruses. Currently, this approach requires extensive manual inspection of metagenomic sequencing data by an expert, which limits its user base. In this project, we will develop machine learning approaches to automate the detection of HGT events in transductomics datasets to enable faster analysis of larger datasets by non-experts. This interdisciplinary research project will combine the expertise of the computational research group of Dr. Callahan and the meta-omics data generation and analysis focused research group of Dr. Kleiner. The project will fund a graduate student who has already done some preliminary work on developing transductomics machine learning approaches. The key goal of this seed project is to provide preliminary data for a favorably reviewed R01 grant resubmission, as well as additional planned grant submissions.
Oppressive Infrastructures: Data, Equity and Access
- Team members:
- Tania Allen – Design; Art & Design
- Sara Queen – Design; Architecture
- College of Design
- Awarded: $50,000
- Description: The primary goal of this project is to build on new perspectives at the intersection of data science, ethics, representation, and bias, specifically as they relate to equity, inclusion, and systems of oppression that may or may not be evident in the way that data is collected, cleaned, manipulated, and communicated. Funding from this grant will support a colloquium of scholars and practitioners from multiple fields who are actively collecting and using a variety of quantitative and qualitative data in their work to discuss how they account for and address issues of equity and access. Through this proposal we hope to strengthen our cross-disciplinary network of researchers exploring how data and data visualization contribute to inequity and injustice as oppressive and liberating tools, and to collectively build an impactful body of research that diversifies the dimensions used to describe and study these wicked challenges and expands critical conversations to initiate change.
2021 Awardees
- Promoting Youth Critical Data Literacy Through Computing and Community Storytelling With Data
- PIs: Shiyan Jiang (TELS), Bita Akram (CSC)
- Awarded: $25,000
- Machine Learning-based Mathematical Representation of Model Uncertainty for Bayesian Inverse Uncertainty Quantification
- PIs: Xu Wu (Nuclear Engineering), Ralph Smith (Mathematics)
- Awarded: $25,000
- Think and Do: A Workshop to Advance Open Climate Data Science in North Carolina
- PIs: Kathie Dello (State Climate Office), Jessica Matthews (NCICS) with collaborators Carl Schreck (NCICS), Bjorn Brooks (NCICS), Sheila Saia (BAE), Yuhan Rao (NCICS) and Micah Vandegrift (Libraries)
- Awarded: $25,000