Hi! I'm Peter Lee

As a seasoned data engineer and DevOps engineer, I have a strong background in data infrastructure, data science, and backend development.

I have hands-on experience with public clouds (GCP and AWS) and a track record of designing and deploying pipelines in on-premises environments.

Work

Kokomonster (TIOM AI)

Senior Backend Engineer & DevOps Engineer

2023 May -> Present

Taipei, Taiwan

Kokomonster is a Hong Kong-based startup developing an AI-driven app. The team works remotely across multiple countries, and I am responsible for the backend and the infrastructure.

Snapask

Data Engineer & DevOps Engineer

2021 Dec -> 2023 March

Taipei, Taiwan

Designed new data infrastructure on AWS, halving cost and tripling query speed. For example, I migrated Airflow 1.x to Airflow 2.0 on AWS MWAA and optimized machine allocation to reduce cloud cost. (EKS, MWAA, EMR, EC2, RDS)
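
Part of that migration was moving from Airflow 1.x operator boilerplate to Airflow 2's TaskFlow API. A minimal sketch of the post-migration DAG style (the DAG and its tasks are illustrative, not the actual pipelines):

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule_interval="@daily", start_date=datetime(2022, 1, 1), catchup=False)
def daily_etl():
    """Illustrative daily ETL skeleton; task contents are placeholders."""

    @task
    def extract() -> list:
        # A real task would pull from S3/RDS; stubbed for the sketch.
        return [{"user_id": 1, "event": "login"}]

    @task
    def load(rows: list) -> None:
        print(f"loading {len(rows)} rows")

    load(extract())


daily_etl()
```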

Designed a log service for collecting custom log events, improving data collection reliability and reducing reliance on third-party analytics platforms.
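
The service's internals are not shown here; as a rough sketch of the ingestion edge, assuming an HTTP endpoint that validates events before handing them to a delivery buffer (the framework choice and field names are illustrative):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

REQUIRED_FIELDS = {"event_name", "user_id", "timestamp"}  # illustrative schema


@app.route("/v1/events", methods=["POST"])
def collect_event():
    """Validate a custom log event and accept it for buffered delivery."""
    event = request.get_json(silent=True) or {}
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        return jsonify({"error": f"missing fields: {sorted(missing)}"}), 400
    # A real deployment would push to a queue (e.g. Kinesis) instead of printing.
    print(event)
    return jsonify({"status": "accepted"}), 202
```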

Maintained and developed Terraform modules for our AWS infrastructure, managing it as code (IaC). Terraform let us easily provision and manage AWS resources such as EC2 instances and EKS clusters.

Synchronized Kubernetes jobs between Flux CD and our GitHub repository, ensuring the jobs stayed up to date and that any change in the repository was reflected in the cluster. Automating the synchronization reduced manual errors and saved time managing our Kubernetes infrastructure.

Vpon Taiwan

Senior Data Engineer, Data Team

2018 Oct -> 2021 July

Taipei, Taiwan

Website: Vpon

On the data team, I was responsible for building data infrastructure and designing pipelines. I have a strong passion for learning and sharing knowledge with teammates, and I specialize in optimizing system performance by introducing new technology.

Data Collection and Exchange:

1. Designed algorithms and pipelines for automated data exchange with business partners.

2. Designed the backend system for Data SDK data collection. The system reduced data-update time from a day to an hour and cut ETL machine cost by 40%.

3. Deployed the Elastic Stack to validate and visualize data from each new client in real time (sketched below).
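
A minimal sketch of that real-time validation step, assuming the elasticsearch Python client and an index-per-client layout (both illustrative choices, not the production design):

```python
from datetime import datetime, timezone

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # endpoint is illustrative


def index_client_event(client_id: str, payload: dict) -> None:
    """Index one incoming event so a Kibana dashboard can chart it immediately."""
    doc = {
        "client_id": client_id,
        "received_at": datetime.now(timezone.utc).isoformat(),
        **payload,
    }
    es.index(index=f"client-events-{client_id}", document=doc)
```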

Data Architecture

1. Introduced Google Kubernetes Engine to automate deployment and monitoring, cutting cost by 75% and saving about 20 hours of manual work.

2. Advised on architecture design and deployment for Kubernetes with Istio, Cloud Functions, Apache Airflow, Compute Engine, serverless services, and more.

3. Developed an internal bot notification system that sends instant alerts on ETL state errors and health-check results (a sketch follows this list).
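
The bot's transport was an internal chat integration; a minimal sketch of the notification call, assuming a webhook endpoint (the URL and message shape are hypothetical):

```python
import json
import urllib.request

WEBHOOK_URL = "https://chat.example.com/webhook"  # hypothetical bot endpoint


def notify(pipeline: str, state: str, detail: str) -> None:
    """Post an ETL state change to the team chat bot."""
    body = json.dumps({"text": f"[{state}] {pipeline}: {detail}"}).encode()
    req = urllib.request.Request(
        WEBHOOK_URL, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)


# Example: an ETL job reporting a failure.
notify("daily-user-etl", "FAILED", "step `aggregate` exited with code 1")
```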

Data Application

1. Built a platform for highly customized tagging services. The platform assigns each individual ID tags such as personal interests, language, age, and gender, and reports the tagging rate of each tag.

2. Optimized the data-update pipeline to generate files and reports directly, cutting turnaround from two weeks to under one hour.

3. Replaced the previous VM/EMR ETL environment with Apache Airflow and Kubernetes CronJobs, making the ETL pipeline more flexible and cost-effective. I also built a tool for the team that reports the cost (scanned size) of every query (sketched below).
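
A sketch of how such a query-cost tool can work, assuming the queries run on BigQuery, where a dry run reports the bytes a query would scan without executing it (the sample table name is a placeholder):

```python
from google.cloud import bigquery

client = bigquery.Client()


def query_cost(sql: str) -> float:
    """Dry-run a query and return the estimated scanned size in GiB."""
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)  # dry run: nothing is executed
    return job.total_bytes_processed / 2**30


print(f"{query_cost('SELECT * FROM `project.dataset.events`'):.2f} GiB")
```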

Continuous Integration and Deployment

1. Designed the complete CI/CD pipeline: building Docker images, pushing them to Container Registry, rolling images out to the Kubernetes cluster within two hours, and deploying images to the application environment after unit tests pass (a build-and-push sketch follows this list).

2. Set up Jenkins on GKE and designed the surrounding architecture to run ETL pipelines automatically each day.
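
The pipeline itself lived in CI configuration; purely as an illustration, here is the build-and-push step expressed with the docker Python SDK (the registry path and tag are placeholders):

```python
import docker

client = docker.from_env()

TAG = "gcr.io/my-project/my-service:abc123"  # placeholder registry path and tag


def build_and_push() -> None:
    """Build the service image from the local Dockerfile and push it."""
    image, build_logs = client.images.build(path=".", tag=TAG)
    for line in client.images.push(TAG, stream=True, decode=True):
        if "error" in line:
            raise RuntimeError(line["error"])
```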

Data Migration

Migrated existing projects and the ETL pipeline from AWS to GCP, transferring data from Hive and S3 to GCP. Set up a GitLab server on GCP, and built ETL pipelines for effective and efficient migration so team members could always access the latest data on GCP.
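
Bulk transfers ran through dedicated pipelines; as a minimal per-object sketch of the S3-to-GCS idea (bucket names are placeholders):

```python
import boto3
from google.cloud import storage

s3 = boto3.client("s3")
gcs = storage.Client()


def copy_object(src_bucket: str, key: str, dst_bucket: str) -> None:
    """Stream one object from S3 into GCS without staging it on disk."""
    body = s3.get_object(Bucket=src_bucket, Key=key)["Body"]
    gcs.bucket(dst_bucket).blob(key).upload_from_file(body)
```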

PDIS (Public Digital Innovation Space)

Software Engineer (Alternative Military Service)

2017 Sep -> 2018 Aug

Executive Yuan 行政院, Taipei, Taiwan

Website: PDIS

PDIS is a government organization in Taiwan, also known as the Executive Yuan office of Digital Minister Audrey Tang (唐鳳). She leads the PDIS team, which incubates and facilitates public digital innovation for the government.

1. As a software developer: I used the LINE Bot API to connect our internal systems into an easy-to-use app that saves colleagues time. I also designed API interfaces so other internal systems can hook into our bot; so far it bridges our meeting reservation system and the electronic bulletin board system. (A minimal sketch follows.)
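
The internal systems are not public; a minimal sketch of the webhook pattern with the line-bot-sdk package (credentials are placeholders, and the echo reply stands in for calls to the meeting-reservation and bulletin-board APIs):

```python
from flask import Flask, abort, request
from linebot import LineBotApi, WebhookHandler
from linebot.exceptions import InvalidSignatureError
from linebot.models import MessageEvent, TextMessage, TextSendMessage

app = Flask(__name__)
line_bot_api = LineBotApi("CHANNEL_ACCESS_TOKEN")  # placeholder credentials
handler = WebhookHandler("CHANNEL_SECRET")


@app.route("/callback", methods=["POST"])
def callback():
    """Verify LINE's signature, then dispatch the event to its handler."""
    signature = request.headers["X-Line-Signature"]
    try:
        handler.handle(request.get_data(as_text=True), signature)
    except InvalidSignatureError:
        abort(400)
    return "OK"


@handler.add(MessageEvent, message=TextMessage)
def handle_message(event):
    # A real handler would call an internal system API here.
    line_bot_api.reply_message(
        event.reply_token, TextSendMessage(text=f"You said: {event.message.text}")
    )
```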

2. On productivity and quality: I designed software that displays real-time subtitles during streaming. To meet the schedule, I took it from idea to prototype in one week. The system now runs reliably in all of our video conferences.

3. As a quick learner and contributor: I have been involved in many open-source projects, such as Sandstorm (an open-source container-based private cloud) and Rocket.Chat (an open-source chat app similar to Slack). We fixed many issues, added features, and gave back to the community.

4. Eager to learn: in my free time I think about how to make our systems more efficient, and I keep finding ways to improve them. For the subtitle display system, I would like parts of it to run automatically, for example having the computer label speakers with their names in real time and show that in the subtitles. Along the way I fixed various TensorFlow issues and bugs, and in April 2018 NVIDIA recognized me as an outstanding community developer.

Selected Skills

Data


Kubernetes

Airflow

Spark

Elasticsearch

AWS

GCP

DevOps


CI/CD

Docker

Terraform

Flux CD

Python

Git

Backend/Web


Node.js

Python

Nginx

PostgreSQL

MySQL

MongoDB

Publications

Implementation of Lambda Architecture: A Restaurant Recommender System over Apache Mesos

Advanced Information Networking and Applications (AINA)

2017 IEEE 31st International Conference

A Lambda Architecture on DC/OS, using Spark, Spark Streaming, Kafka, and Hadoop HDFS.

Stock market analysis from Twitter and news based on streaming big data infrastructure

Awareness Science and Technology (iCAST)

2017 IEEE 8th International Conference

Real-time trend analysis of Twitter and news, using Spark, Spark Streaming, and Kafka, with classification and web-based visualization.

Real-time Trend Analysis of Streaming Twitter and News Based on Big Data Infrastructure

IEICE (電子情報通信学会) conference paper

IEICE Technical Report, vol. 117, no. 184, SC2017-13, pp. 1-6, August 2017.

SC2017-13 2017-08-18 (SWIM, SC)

Selected Projects

PDIS CastingWords

Service for sending audio/video files to CastingWords for transcription.

DNS Security

This iOS app lets users configure their iOS DNS settings with DNS-over-HTTPS or DNS-over-TLS encryption. It sold more than 10,000 units in one year.

Airbox

An open-source PM2.5 sensor.

Serverless; uses the ELK stack to store and visualize data.

GitHub

Awards

NVIDIA Jetson Community Developer

2018 April - Outstanding Jetson developer community contributions: porting TensorFlow to NVIDIA Jetson.

NVIDIA Developer Forum

TensorFlow Contributor

Porting TensorFlow to NVIDIA Jetson

Repository tensorflow-nvJetson

TensorFlow #26985, #20025, #19075, #17394

Education

University of Aizu, Fukushima, Japan.

2016 March - 2017 March

Master's degree in Computer Science.

Double Degree Program.

Tamkang University, Taipei, Taiwan.

2011-2014

Bachelor of Engineering (B.E.) in Computer Science.

Embedded System Lab


2014-2017

Master's degree in Computer Science.

Cloud Computing Lab