Aaron N. Brooks, Ph.D.
        
aaron.neil.brooks@gmail.com | www.aaronbrooks.info | scalefreegan | aaron-n-brooks | Google Scholar
Scientist and leader with a track record of transforming data into insight using machine learning. I build tools, processes and teams to
structure, integrate and distill biological data into formats for stakeholders to make effective decisions.

  Python, R
  scikit-learn, PyTorch, Hydra
  DNA/RNA-Seq, multi-omics, genome assembly
 AWS, GCP
  SQL, NoSQL (ArangoDB, MongoDB), Neo4j
  Dash, Django, Shiny
  git, Docker
  Snakemake, Nextflow
  Jira, Confluence
 German (A2), Spanish

   Boston, Massachusetts (Remote)
    Jan. 2022 - present
Leading a team that innovates approaches to curation, characterization and communication of information about best-in-class genetic parts.
Outcome: Curated 229 DNA parts across 6 organisms yielding part sets with tunable expression and up to 30-fold increases in expression levels.
Founded and developed a high-performing team to meet increasing business demands. Developed and implemented Agile processes and a
support service architecture (Jira) to execute on external requests and internal projects effectively.
Prototyped AI-assisted knowledge management systems, including retrieval augmented generation (RAG) and other embedding approaches.
Predicted DNA synthesizability using machine learning (gradient boosting, random forest, logistic regression).
Fine-tuned a DNA foundation model (HyenaDNA) for multiple, application-specific tasks, including regression and classification.
  Boulder, Colorado (Remote)
    Dec. 2020 - Dec. 2022
Developed statistical and analytical software for interpretation of highly-multiplexed selection experiments with CRISPR-engineered cell
libraries. Outcome: 3 publications and 3 patent applications.
Designed and implemented an interactive web service for submitting deep mutational scanning data for analysis and visualization (Dash).
Supported interpretation of pooled selection experiments for external customers. Distilled complex data into actionable insights.
Designed DNA libraries (promoter insertion) leveraging a pre-trained convolutional neural network (CNN).
Led implementation of ETL processes for generation and querying of Knowledge Graphs.
       Heidelberg, Germany
  Jun. 2015 - Nov. 2020
Established a synthetic biology research subgroup, directed day-to-day research activities and secured funding. Outcome: 1.4M Euro funding,
4 publications in top journals, including Science and Cell, and 1 patent application.
Collected and analyzed hundreds of millions of Nanopore direct RNA sequencing reads on more than 60 highly-rearranged synthetic yeast
genomes.
Wrote a semi-automated analysis pipeline (Snakemake) to perform all steps in a sequencing workflow on HPC infrastructure, from basecalling
to transcript quantification. Used this pipeline to process terabytes of sequencing data.
Applied machine learning (gradient boosting) to disentangle multiple factors influencing transcript start and end sites in S. cerevisiae.
     
        

Cell
            
          
            
2023
       
       
 
Cell
            
             
 
2023
         

Nature Communications
        2023
          
   
mSystems
           

2022
         Science
           2022
“Understanding genomes, piece by piece”. EurekAlert.
“Wie die Platzierung eines Gens seine Expression beeinflusst” (How the placement of a gene influences its expression). Nach Welt.
        
 
bioRxiv
             
     
2021
       PLoS Biol
        2019
     Cell Systems
      2016
         Front. Microbiol.
         2015
         

BMC Systems Biology
                2014
      Front. Microbiol.
        

2014
       Mol Syst Biol.
               2014
“New Open-Access Multiscale Model Captures Dynamic Molecular Processes in Unprecedented Detail”. ISB Molecular Me.
      Wiley Interdiscip Rev Syst Biol Med.
         2011

        2024
   
       
     
   
2023

        2022


   Seattle, WA
      Sep. 2008 - Aug. 2014
Developed an ensemble learning approach to predict gene regulatory networks from microarray data
Implemented a full-stack web service (Django) for model access and exploration: http://egrin2.systemsbiology.net
Nominated for UW Distinguished Dissertation Award
    Albuquerque, NM
        Aug. 2002 - Aug. 2007
Characterized interactions of cytoplasmic poly(A)-binding protein with poly(A) RNA using transmission electron microscopy
Robert B. Loftfield Award for outstanding thesis in biochemistry
Graduated summa cum laude and with general university honors
Regents’ scholarship (highest awarded by the University)
Minor: Philosophy
 

       
Heidelberg, Germany

         Seattle, WA
       Seattle, WA
           Albuquerque, NM
         Albuquerque, NM
       Albuquerque, NM
      Rio Rancho, NM
 
         New York, USA
         London, UK
         Sydney, AU
         Edinburgh, UK
  
      EMBL Heidelberg
 2019
          EMBL Heidelberg
 2018
      EMBL Heidelberg
  2015
Teaching materials: http://scalefreegan.github.io/Teaching/DataIntegration/
       
     Washington, DC
   2014
Designed and facilitated a hands-on activity and web-based game to understand the structure and function of networks. Over 300 students
have played the online game.
“Knowing Networks”. NIH NIGMS Inside Life Science.
“ISB at USA Science and Engineering Festival”. ISB Molecular Me.
“USA Science and Engineering Festival”. ISB Molecular Me.
       Seattle, WA
 2010
       