Resume

Machine Learning Engineer · Mountain View, CA · jiaojunyi90@gmail.com · github.com/JunyiJ

Summary

Machine Learning Engineer with 7+ years building production ML systems at Google/YouTube and Bloomberg. Expertise in large-scale retrieval/ranking, cold-start recommendation, and ads targeting. Led cross-functional teams delivering high-impact launches, including initiatives contributing $150M+ ARR. Hands-on GenAI builder with open-source implementations in LLM post-training (GRPO/PPO/DPO + LoRA), decoder-only GPT, and Diffusion Transformer (DiT). Targeting MLE roles in GenAI/LLM training and applications.

Experience

Google LLC — Mountain View, CA

Staff Software Engineer, Tech Lead Manager — YouTube Shorts Discovery · May 2023 – Present

  • Lead Shorts cold-start exploration as TLM for ~23 engineers, developing ML strategies to identify initial audiences for new uploads and expand the recommendable corpus.
  • Delivered retrieval improvements for cold-start, uplift modeling for traffic optimization, and uncertainty-aware ranking.
  • Led large-model multimodal embedding initiatives for content understanding and recommendation quality.

Senior Software Engineer — YouTube Shorts Discovery · Nov 2021 – Sep 2023

  • Initiated and led local Shorts discovery, building place-based recommendation experiences from scratch.

Software Engineer, Tech Lead — Display Ads Audience Targeting · Mar 2019 – Nov 2021

  • Built and scaled ML infrastructure across training, serving, and model application for audience targeting.
  • Led quality and serving-efficiency launches that improved advertiser ROI and contributed $150M+ ARR.
  • Built targeting simulation/evaluation infrastructure for the post-third-party-cookie privacy era, including clustering and feature-space experimentation.
  • Recognition: AViDly Award.

Bloomberg L.P. — San Francisco, CA

Software Engineer · 2018 – 2019

  • Built Kafka-based data pipelines for customer/quant workflows.
  • Implemented automated versioned release and test-release frameworks with Kubernetes + CI/CD.

Selected GenAI Projects

PostTraining-LLM-Small

  • Implemented an LLM post-training pipeline on Gemma 2B with GRPO + LoRA, plus PPO/DPO experimentation.
  • Built training loop with sampling, reward/advantage computation, ratio-based loss, and evaluation scripts.

Practical-Decoder

  • Implemented decoder-only GPT-style model from scratch.
  • Experimented with attention mechanisms and MoE structures; built end-to-end training and text generation workflow.

Practical-DiT

  • Implemented Diffusion Transformer (DiT) from scratch, including DDPM training/sampling flow.
  • Added debugging/eval utilities for sampling quality analysis and iterative model improvements.

Education

Georgia Institute of Technology

M.S., Computer Science (Machine Learning), 2018 – 2021

Yale University

Ph.D., Biophysics/Cell Biology, May 2018

Peking University (Beijing, China)

B.S., Life Sciences, Jul 2012

Skills

ML/GenAI: LLM Post-Training (GRPO/PPO/DPO), LoRA, Transformer/Decoder Models, Diffusion Models (DiT/DDPM), Retrieval & Ranking, ANN, Cold Start, Uncertainty Modeling, Online Experimentation

Programming Languages: C++, Python, SQL