Senior / Lead Computer Vision/Clinical ML Scientist



Software Engineering, Data Science
San Francisco, CA, USA
Posted on Saturday, July 1, 2023
The Opportunity
Key to insitro’s approach to rethinking drug development is leveraging disease models, genetics, and clinical datasets to link in vitro and cellular phenotypes with patient outcomes.
Imaging-based, high-content phenotyping is at the heart of insitro’s efforts to characterize and quantify sophisticated patient datasets. Our goal is to use machine learning to characterize image content and patient state from clinical imaging modalities, such as pathology images, in clinical cohorts.
As a machine learning / computer vision researcher with an emphasis on clinical data, your focus will be to develop innovative ML approaches to analyze and integrate large-scale medical imaging datasets such as histopathology, MRI, and other clinical imaging modalities from randomized clinical trials, electronic health records/PACS, national biobanks, and other sources. You will also work with colleagues to integrate those data with associated clinical, genetic, or other variables. Via this collaborative effort, you will have the opportunity to contribute to developing models for understanding patient disease state and progression, predicting patient outcomes, and identifying therapeutic targets and developing drugs that have high efficacy and low toxicity.
Your work will involve the development and deployment of cutting-edge methods in both classical computer vision and deep learning. The data we deal with will require addressing challenges such as distribution shift (for example, hospital-site variability in histology stain characteristics), data missingness, class imbalance, and small sample sizes. You will need to develop fit-for-purpose approaches that draw on methods such as self-supervised learning, multi-task learning, and few-shot learning. You will work in collaboration with the software engineering team to develop these methods as robust, reusable platform components that can be deployed on large-scale datasets in a portable way.
You will be joining a vibrant biotech startup that has long-term stability, due to significant funding, and is in a high-growth phase. A lot can change in this early and exciting phase, providing many opportunities for significant impact. You will work closely with a very talented team, learn a broad range of skills, and help shape insitro’s culture, strategic direction, and outcomes. Join us, and help make a difference to patients!
This role is preferably based in the San Francisco Bay Area or Boston, but we are open to discussing other locations in the United States and the United Kingdom.
About You
  • Ph.D. in computer vision, machine learning, computer science, clinical machine learning or a related discipline, or equivalent practical experience (e.g., a Master’s degree plus 2 years of relevant industry experience)
  • Experience developing models for diverse computer vision tasks (e.g., segmentation, recognition, classification, domain adaptation), including using modern deep learning frameworks (PyTorch, JAX, Keras, etc.)
  • Extensive hands-on experience working with biomedical imaging modalities
  • Experience with modern representation learning approaches such as self-supervised learning, transfer learning, multi-modal modeling, and few-shot learning
  • Strong programming skills in Python, including processing TB-scale image datasets using Linux/Bash
  • Ability to communicate effectively and collaborate with people of diverse backgrounds and job functions
  • Passion for making a difference in the world
Nice to Have
  • Experience building infrastructure for clinical image dataset preprocessing
  • Experience with real-world challenges of clinical data analysis, including working with clinical datasets such as EHR/PACS or clinical trial data and interactions with clinicians
  • Demonstrated ability to write software in a team, through industry experience or substantial involvement with open source projects
  • Some understanding of human physiology or disease biology (especially cancer, metabolism, or neurodegeneration)
  • Publication record of meaningful contributions to high-quality work in relevant computer vision, clinical ML, or biomedical venues
  • Familiarity with cloud computing services (e.g., AWS or GCP)
  • Proficiency in C++/OpenCV/CUDA, or other compiled, statically-typed languages and related computer vision/graphics libraries
Compensation & Benefits at insitro

Our target starting salary for successful US-based applicants for this role is $160,000 - $215,000. To determine starting pay, we consider multiple job-related factors including a candidate’s skills, education and experience, the level at which they are actually hired, market demand, business needs, and internal parity. We may also adjust this range in the future based on market data.

This role is eligible for participation in our Annual Performance Bonus Plan (based on company targets by role level and annual company performance) and our Equity Incentive Plan, subject to the terms of those plans and associated policies.

In addition, insitro provides our employees:

  • 401(k) plan with employer matching for contributions
  • Excellent medical, dental, and vision coverage (insitro pays 100% of premiums for employees), as well as mental health and well-being support
  • Open, flexible vacation policy
  • Paid parental leave
  • Quarterly budget for books and online courses for self-development
  • Support to occasionally attend professional conferences that are meaningful to your career growth and development
  • New hire stipend for home office setup
  • Monthly cell phone & internet stipend
  • Access to free onsite baristas and cafe with daily lunch and breakfast
  • Access to free onsite fitness center
  • Commuter benefits
About insitro
insitro is a drug discovery and development company using machine learning (ML) and data at scale to decode biology for transformative medicines. At the core of insitro’s approach is the convergence of in-house generated multi-modal cellular data and high-content phenotypic human cohort data. We rely on these data to develop ML-driven, predictive disease models that uncover underlying biologic state and elucidate critical drivers of disease. These powerful models rely on extensive biological and computational infrastructure and allow insitro to advance novel targets and patient biomarkers, design therapeutics and inform clinical strategy. insitro is advancing a wholly owned and partnered pipeline of insights and therapeutics in neuroscience, oncology and metabolism. Since launching in 2018, insitro has raised over $700 million from top tech, biotech and crossover investors, and from collaborations with pharmaceutical partners. For more information on insitro, please visit