Resume

Machine Learning Engineer · Mountain View, CA · jiaojunyi90@gmail.com · github.com/JunyiJ

Summary

Machine Learning Engineer with 7+ years building production ML systems at Google/YouTube and Bloomberg. Expertise in large-scale retrieval/ranking, cold-start recommendation, and ads targeting. Led cross-functional teams delivering high-impact launches, including initiatives contributing $150M+ ARR. Hands-on GenAI builder with open-source implementations in LLM post-training (GRPO/PPO/DPO + LoRA), decoder-only GPT, and Diffusion Transformer (DiT). Targeting MLE roles in GenAI/LLM training and applications.

Experience

Google LLC — Mountain View, CA

Staff Software Engineer, Tech Lead Manager — YouTube Shorts Discovery · May 2023 – Present

  • Lead Shorts cold-start exploration as TLM for ~23 engineers, developing ML strategies to identify initial audiences for new uploads and expand the recommendable corpus.
  • Delivered retrieval improvements for cold-start, uplift modeling for traffic optimization, and uncertainty-aware ranking.
  • Led large-model multimodal embedding initiatives for content understanding and recommendation quality.

Senior Software Engineer — YouTube Shorts Discovery · Nov 2021 – Sep 2023

  • Initiated and led local Shorts discovery, building place-based recommendation experiences from scratch.

Software Engineer, Tech Lead — Display Ads Audience Targeting · Mar 2019 – Nov 2021

  • Built and scaled ML infrastructure across training, serving, and model application for audience targeting.
  • Led quality and serving-efficiency launches that improved advertiser ROI and contributed $150M+ ARR.
  • Built targeting simulation/evaluation infrastructure for the post-third-party-cookie privacy era, including clustering and feature-space experimentation.
  • Recognition: AViDly Award.

Bloomberg L.P. — San Francisco, CA

Software Engineer · 2018 – 2019

  • Built Kafka-based data pipelines for customer/quant workflows.
  • Implemented automated versioned release and test-release frameworks with Kubernetes + CI/CD.

Selected GenAI Projects

PostTraining-LLM-Small

  • Implemented an LLM post-training pipeline on Gemma 2B with GRPO + LoRA, plus PPO/DPO experimentation.
  • Built training loop with sampling, reward/advantage computation, ratio-based loss, and evaluation scripts.

Practical-Decoder

  • Implemented decoder-only GPT-style model from scratch.
  • Experimented with attention mechanisms and MoE structures; built end-to-end training and text generation workflow.

Practical-DiT

  • Implemented Diffusion Transformer (DiT) from scratch, including DDPM training/sampling flow.
  • Added debugging/eval utilities for sampling quality analysis and iterative model improvements.

Education

Georgia Institute of Technology

M.S., Computer Science (Machine Learning), 2018 – 2021

Yale University

Ph.D., Biophysics/Cell Biology, May 2018

Peking University (Beijing, China)

B.S., Life Sciences, Jul 2012

Skills

ML/GenAI: LLM Post-Training (GRPO/PPO/DPO), LoRA, Transformer/Decoder Models, Diffusion Models (DiT/DDPM), Retrieval & Ranking, ANN, Cold Start, Uncertainty Modeling, Online Experimentation

Programming Languages: C++, Python, SQL