#TechTalk: What is the Big Data, AI, and Genomics Technology Center?

A Q&A with Jay Unruh, Ph.D., Director of Scientific Data, Sean McKinney, Ph.D., Head of Computational Imaging, and Chris Seidel, Ph.D., Genomics Scientist

07 January 2025

Tell us more about the Big Data, AI, and Genomics Tech Center. How does it connect to the overall mission of the Stowers Institute?

Jay Unruh: The role of artificial intelligence (AI) and big data throughout biology has been exploding in recent years. Just as the Investigator labs at the Institute study a wide variety of organisms, we work on a wide variety of data. While the Computational Biology Technology Center tends to do more mainstream genomics and statistical analyses, there is a constant need for expertise in specialty areas.

Our team consists of subject matter experts who can partner with researchers on topics such as computer vision, image analysis, mathematical modeling, genomics, and project planning from experimental design through data interpretation.

What types of data do you work with, and how do AI and machine learning tools help you make sense of large genomic datasets?

Sean McKinney: The human eye is very good at looking for shapes and finding objects in an image. Computer algorithms, however, struggled with the most basic of tasks, until about seven years ago when a type of AI called neural networks demonstrated the ability to identify biological objects with remarkable accuracy.

Analysis of microscopy data can now be easily automated to find proteins, organelles, cells, or whole animals in images at unprecedented scale. For example, in a recent experiment, we were able to process 2,000 cells per second. Seven years ago, these cells would have had to be identified manually.

What is a typical challenge you encounter when analyzing genomic data, and how does the integration of AI or big data technologies help overcome these challenges?

3D electron microscopy images of a developing sea anemone clearly show all of the cilia on the outside of the animal. A special organ, the apical tuft, whose function remains unknown, houses a large bundle of cilia. Here the total length of the cilia is color coded, showing a drastic difference in length (and presumably function) between the cilia coming from the tuft and those on the rest of the body column. (Keith Sabin, Gibson Lab, and Melainia McClain, Electron Microscopy)

Chris Seidel: The data sets are large and unwieldy. They often represent many tens of thousands of measurements, or dimensions of a system. AI and data science provide methods for extracting patterns, information, and features from high-dimensional data. The other challenge is more traditional and involves bringing hypotheses to evaluate against the data.

In terms of AI development, what are some of the most promising applications you see emerging within the field of computational biology in the next 5 to 10 years?

Sean McKinney: I do not think it is possible to see five years into the future right now, as things are moving too rapidly. Efforts so far have been largely limited to looking at 2D images, but newer developments make much better use of a third dimension, whether it be time or depth. This will enable researchers to look more closely at behavioral or other dynamic processes.

Jay Unruh: In the short span of a decade or two, biology will shift away from humans finding insights through direct observation and toward humans training computers to make those observations, then learning to draw insights from what the computers have learned.

Chris Seidel: Just like ChatGPT understands language, we will have models that can understand DNA and protein sequences. This will help us understand biological molecules, and generate new ones with novel properties, enabling big advancements in synthetic biology and medicine. We'll also get better at understanding cells as systems.

Ribosomal DNA exists on only some human chromosomes and appears to be inherited intact from parents. Here AI is used to find these chromosomes automatically to perform quantification on their ribosomal DNA arrays. (Tamara Potapova, Gerton Lab)

What collaborations or interdisciplinary approaches are most important in advancing the research you’re involved in, and how does your team interact with other departments at Stowers Institute?

Jay Unruh: One of the biggest strengths of the Stowers Technology Centers is collaboration. This is especially true for our department. Because we work on so many kinds of data, we rely heavily on others, from curating data sets to evaluating the accuracy of results. AI can predict all sorts of things, but until you have a plan to test those predictions, they can’t make an impact on our scientific understanding.

Sean McKinney: We collaborate extensively with the Light and Electron Microscopy teams, who help users design and run microscopy experiments that our team can actually quantify.

Chris Seidel: We work one-on-one with researchers in the PI labs to bridge the gap between experimentation and computation, always with an eye towards what the next experiment might be. We especially like to work on open-ended exploratory projects to help solve scientific problems.

Can you share an example of a recent project or discovery that has benefited from the combination of big data, AI, and genomics? How has this impacted scientific understanding or potential applications in medicine?

Sean McKinney: We worked with the Gerton Lab to identify individual chromosomes in microscopy data and quantify the amount of a particular type of DNA called ribosomal DNA (rDNA) in hundreds of them. By analyzing chromosome sets from family members, we were able to show that these rDNA amounts are inherited.

Jay Unruh: In a recent project, I worked with Bret Redwine from Custom Protein Resources to design a small protein, or peptide, that would block an immune signal for the Li Lab. Our hope is that the peptide inhibitor could one day be used clinically for cancer treatment.

Chris Seidel: Advances in DNA sequencing recently enabled the complete sequencing of the human genome, giving us insight into long-overlooked areas. This has helped us understand how chromosomal structure can change and has allowed us to quantify differences in these regions across populations.

Jay Unruh, Sean McKinney, and Chris Seidel (left to right)

Are there any new instruments and/or technology that you are excited about?

Sean McKinney: We are working in a bizarre time when our field, though academic, is being advanced more by AI powerhouses like Google and Facebook than by image processing academicians. It is exciting for me to just check what new tools are available every day.

Jay Unruh: As Sean mentioned, new tools are released every day. With modern AI tools, it is becoming possible to sequence the genome of an organism with a unique trait and propose the genes and molecules responsible for that trait.

Chris Seidel: The Systems Mass Spectrometry team has new machines that can rapidly quantify proteins in many samples. I'm curious about the large numbers of data sets that can be generated in this way and what we can learn from them. They also have machines for quantifying products of metabolism. I'm excited about the idea of combining genomic data with metabolomic data.
