Data science is an important interdisciplinary field with significant impacts on many aspects of the modern world, including government, industry, academia, and the general public. Transdisciplinary Research In Principles Of Data Science (TRIPODS) brings together the statistics, mathematics, and theoretical computer science communities to develop the theoretical foundations of data science through integrated research and training activities focused on core algorithmic, mathematical, and statistical principles.

Phase II, Cohort I Institutes

  • University of California–Berkeley, Massachusetts Institute of Technology, Harvard University, Northeastern University, Boston University

    The Foundations of Data Science Institute (FODSI) brings together a large and diverse team of researchers from UC Berkeley, MIT, Boston University, Harvard University, Northeastern University, Bryn Mawr College and Howard University, with the aim of laying the theoretical foundations for the field of data science across the full breadth of scientific issues that arise in the rich and complex processes by which data can be used to make decisions – modeling issues, inferential issues, computational issues, and societal issues. Research in the institute is organized around eight themes. Four of these themes focus on key challenges arising from strategic, sequential, combinatorial and experimental interactions (Learning and Economics, Reinforcement Learning, Networks and Graphical Models, and Causal Inference), and the other four represent opportunities for major impacts across disciplinary boundaries: on elucidating the algorithmic landscape of statistical problems (Computational Complexity of Statistical Estimation); on scalability in data science problems (Sketching, Sampling, and Sub-Linear Time Algorithms); on exploiting statistical methodology in the service of algorithms (Machine Learning for Algorithms); and on using breakthroughs in applied mathematics to address computational and inferential challenges (Geometry of Sampling and Optimization). Societal issues in data science will feature throughout this set of themes. The institute aims to educate and mentor a diverse cohort of future leaders in data science, and to broaden participation and diversity in the data science workforce, and it hosts a range of public activities such as research workshops, collaborative research programs and summer schools.

  • University of Washington, University of Wisconsin–Madison, University of California–Santa Cruz, University of Chicago

    Data science is making an enormous impact on science and society, but its success is uncovering pressing new challenges that stand in the way of further progress. Outcomes and decisions arising from many machine learning processes are not robust to errors and corruption in the data; data science algorithms are yielding biased and unfair outcomes, as concerns about data privacy continue to mount; and machine learning systems suited to dynamic, interactive environments are less well developed than corresponding tools for static problems. Only by an appeal to the foundations of data science can we understand and address challenges such as these.

    Building on the work of three TRIPODS Phase I institutes, the Institute for Foundations of Data Science (IFDS) brings together researchers from the Universities of Washington, Wisconsin-Madison, California-Santa Cruz, and Chicago, with the goal of tackling these critical issues. IFDS organizes its research around four core themes: complexity, robustness, closed-loop data science, and ethics and algorithms. By making concerted progress on these fundamental fronts, IFDS aims to lower several of the barriers to better understanding of data science methodology and to its improved effectiveness and wider relevance to application areas. In concert with its research agenda, IFDS engages the data science community through workshops, summer schools, and hackathons, and is committed to equity and inclusion through extensive plans for outreach to traditionally underrepresented groups.

Phase I, Cohort I Institutes

  • Brown University

    The mission of the Brown TRIPODS institute is to foster development and principled application of theory and methods of big data to discover, refine, and validate underlying theoretical models that govern a system or data-generating process, which in turn improve predictions of new outcomes.

  • Columbia University

    The Columbia TRIPODS Institute, hosted in the Data Science Institute at Columbia University, fosters research, education and center building around foundational topics that support the practice of data science, including (nonconvex) optimization, primitives for efficient computation, and interactive machine learning.

  • Cornell University

    This project creates a center of data science for improved decision-making that combines expertise from computer science, information science, mathematics, operations research, and statistics. The five concrete research directions proposed are: Privacy and Fairness, Learning on Social Graphs, Learning to Intervene, Uncertainty Quantification, and Deep Learning.

  • Georgia Institute of Technology

    The Transdisciplinary Research Institute for Advancing Data Science (TRIAD) integrates research and education in mathematical, statistical, and algorithmic foundations for data science.

  • Massachusetts Institute of Technology

    The MIT Institute for Foundations of Data Science (MIFODS) is an interdisciplinary effort to develop the theoretical foundations of data science through integrated research and training activities. Our goal is to stimulate research and educational interactions between mathematics, statistics and theoretical computer science, both within MIT and in the research community at large.

  • Northwestern University, Lehigh University, State University of New York at Stony Brook

    The NSF TRIPODS Institute on Optimization and Learning, based at Lehigh University and in collaboration with Stony Brook and Northwestern Universities, has its current focus on new advances in tools for non-convex machine learning applications, in particular for various cases of training deep learning models.

  • Ohio State University

    This center advances the methodological and theoretical foundations of data analytics by considering the geometric and topological aspects of complex data from mathematical, statistical and algorithmic perspectives, thus enhancing the synergy between the Computer Science, Mathematics, and Statistics communities.

  • University of Arizona

    UA-TRIPODS is an integrated research and educational institute in data sciences at UA, with a focus on theoretical foundations. The mission of UA-TRIPODS is to produce long-term and deep-level collaborative research among computer science, statistics, and mathematics; to build effective partnerships with domain sciences, local industry and business; and to reach out to the public.

  • University of California–Berkeley

    The UC Berkeley FODA (Foundations of Data Analysis) Institute will focus on deepening the theoretical foundations of data science, from basic education to cutting-edge research, and translating those foundational developments to data science practice in the diverse range of domains that generate data.

  • University of California–Santa Cruz

    The UC–Santa Cruz TRIPODS effort brings together researchers from mathematics, statistics, and computer science to develop a unified theory of data science applied to uncertain and heterogeneous graph and network data. We collaborate closely with the D3 Data Science Research Center and Data Science Santa Cruz.

  • University of Washington

    The UW TRIPODS Institute on Algorithmic Foundations of Data Science (ADSI) focuses on theoretical foundations and algorithms for data science. At their core, each of the disciplines of computer science, mathematics, and statistics has rich theories of complexity and robustness, which have influenced the design of the available tools used in real-world computational problems. ADSI seeks new algorithms and design principles that unify ideas and provide a common language for addressing contemporary data science challenges.

  • University of Wisconsin–Madison

    The Institute for Foundations of Data Science at UW-Madison is doing research in the fundamentals of data science by working collaboratively across traditional domain boundaries of mathematics, statistics, and computer science.

Phase I, Cohort II Institutes

  • Duke University

    The Transdisciplinary Research and Education Collective at Duke University (TREC@Duke) was established to further our understanding of foundational principles in data science and to identify opportunities for innovation by working across disciplines. TREC@Duke focuses on creating new tools, methods, and dialogues in data science throughout North Carolina’s “Research Triangle.”

  • Iowa State University

    The D4 (Dependable Data Driven Discovery) Institute at Iowa State University is focused on advancing the theoretical foundations of data science by fostering foundational research to enable understanding of the risks to the dependability of data-science lifecycles, to formalize the rigorous mathematical basis of the measures of dependability for data science lifecycles, and to identify mechanisms to create dependable data-science lifecycles.

  • Johns Hopkins University

    The Mathematical Institute for Data Science (MINDS) brings together a multidisciplinary team of mathematicians, statisticians, theoretical computer scientists, and electrical engineers from Johns Hopkins University to develop the foundations of deep neural models (e.g., feedforward networks) and graphical models of data (e.g., random graphs), with the ultimate goal of arriving at integrated models that are more interpretable, robust to perturbations, and learnable with minimal supervision. In addition, the institute will foster interactions among data scientists through a monthly seminar series, semester-long research themes, an annual research symposium, and a summer research school and workshop on the foundations of data science.

  • Northwestern University

    The Institute for Data, Econometrics, Algorithms, and Learning (IDEAL) is a multi-discipline (computer science, statistics, economics, electrical engineering, and operations research) and multi-institution (Northwestern University, Toyota Technological Institute at Chicago, and University of Chicago) collaborative institute that focuses on key aspects of the theoretical foundations of data science. The institute will support the study of foundational problems related to machine learning, high-dimensional data analysis and optimization in both strategic and non-strategic environments. The primary activity of the institute will be thematically focused quarters which will coordinate graduate course work with workshops and external visitors.

  • Rutgers, The State University of New Jersey

    DATA-INSPIRE is the Institute in the nation’s TRIPODS network engaged in fundamental transdisciplinary data science research, education & workforce development related to the operation of intelligent machines and their interaction with people.

  • Texas A&M University

    The Texas A&M Research Institute for Foundations of Interdisciplinary Data Science (FIDS) will bring together researchers from six disciplinary areas (Statistics, Electrical Engineering, Mathematics, Computer Science, Industrial & Systems Engineering, and Operations Management) to conduct research on the foundations of data science motivated by problems arising in bioinformatics, the energy arena, power systems, and transportation systems. This Institute will be well positioned to develop rigorous theories, novel methodologies, and efficient computational techniques to solve data challenges in many other application domains.

  • Tufts University

    The Tufts T-TRIPODS Institute connects an interdisciplinary team of mathematicians, computer scientists, statisticians and electrical engineers advancing the foundations of data science with domain experts in four broad application areas. The four areas are biological and biomedical data; education and cognitive science; smart cities, development and design; and computational arts and humanities (including language and music).

  • University of California, Davis

    UCD4IDS is composed of 35 researchers (four PIs and 31 senior participants) coming from four departments (Computer Science, Electrical & Computer Engineering, Mathematics, and Statistics) and will cross interdepartmental barriers and promote interdisciplinary research collaborations among faculty members, postdocs, and graduate students, particularly focusing on: 1) Fundamentals of machine learning directed toward biological and medical applications; 2) Optimization theory and algorithms for machine learning; and 3) High-dimensional data analysis on graphs and networks.

  • University of Illinois at Chicago

    This collaborative research institute combines aspects of mathematics, statistics, computer science, and electrical engineering to study the foundations of data science. The research focus of the institute is centered around the topics of: representation and structure of data, machine learning and complexity, and robustness and privacy.

  • University of Illinois at Urbana–Champaign

    The mission of the Illinois Institute for Data Science and Dynamical Systems (iDS^2) is to develop the theoretical and algorithmic foundations of the synthesis of data science and dynamical systems. The institute activities will center on four interrelated themes: Data Modeling and Dynamical Systems; Sampling, Inference, and Dynamical Systems; Algorithm Design and Dynamical Systems; and Decision-Making and Dynamical Systems.

  • University of Massachusetts

    The UMass TRIPODS Institute brings together faculty from computer science, mathematics and statistics, and electrical engineering, along with post-docs, undergraduate and graduate students, and high school students, for research and training in the foundations of data science. Research interests include algorithms for massive data sets, computational and statistical trade-offs, quantifying uncertainty, and interactive data acquisition. Application areas include biomedical and chemical engineering.

  • University of Pennsylvania

    The focus of FINPenn, the Center for the Foundations of Information Processing at the University of Pennsylvania, is to establish fundamental theory to enable the study of data beyond time and images. Humans’ rich intuitive understanding of space and time may not be applicable to the processing of complex signals, necessitating the discovery and development of foundational principles to guide the design of generic artificial intelligence algorithms. FINPenn will support a class of scholar trainees along with a class of visiting postdocs and students to advance this agenda.

  • University of Pennsylvania

    The Penn Institute for Foundations of Data Science (PIFODS) brings together scientists and ideas from multiple disciplines, including computer science, electrical engineering, statistics, and mathematics, in order to collectively develop long-lasting principles for data science that can serve the field for decades to come. Specific research focuses of the institute include developing principles for complex learning tasks; for efficient optimization (convex, non-convex, and submodular); for streaming, distributed, and massively parallel data analysis; for privacy-preserving and fairness-preserving data analysis; and for reproducible data analysis.

  • University of Rochester & Cornell University

    Motivated by today’s greatest foundational data science challenges arising in medicine, healthcare, and beyond, the Greater Data Science Cooperative’s mission is to develop a mathematical foundation that integrates trans-disciplinary perspectives and enables applications that can ultimately benefit everyone worldwide.

  • University of Texas at Austin

    UT Austin’s TRIPODS IFDS brings together 8 faculty from 4 departments – ECE, CS, Statistics and Mathematics – to provide a highly collaborative and productive environment for research into three main thrust areas: (a) advancing the theoretical understanding of training and generalization in neural networks, (b) rigorous approaches to robustness in machine learning, and (c) incorporating graphical structure into how data is modeled and used. The institute is synergistic with a recently announced NSF AI Institute at UT Austin.