Teaching Assistant @ UCLA (2020 - present):
Select courses:
- Introduction to Statistical Reasoning (STATS 10/13), Linear Regression (STATS 101A), Analysis of Experimental Design (STATS 101B), Statistical Modeling (STATS 101C), Introduction to SQL and Databases (STATS 147), Statistical Computing (STATS 404, graduate course), Financial Forecasting (ECON 409)
Data Science Intern @ Capital One (2023 & 2024):
Keywords: fraud detection, model building, feature selection, high-dimensional data
- Designed and implemented a complementary model to capture missed instances of fraudulent applications
- Created 69 features from biometric data to enhance the main fraud detection model
Data Science Intern @ DocuSign (2021):
Keywords: A/B testing, frequentist experimental design, Bayesian experimental design
- Implemented an internal computation tool for Bayesian experimental analysis that integrates previous observational data with experimental data
- Built an in-house tool using Streamlit to determine experiment sample size and duration given the desired statistical power and treatment effect
Data Science Intern @ Viasat (2020):
Keywords: competitive intelligence, clustering/representation learning, data collection, geographical data
- Created a method that downsizes the household population to a representative, comprehensive subset by using a density-based clustering method (DBSCAN) to identify household addresses
- Captured information on competitor internet service pricing plans by region by surveying the selected household addresses
Undergraduate Research @ UCLA HeartBD2K Lab (2018-2020):
Keywords: data imputation, feature selection, deep learning
- Processed molecular data by identifying proteins present in both treatment groups and imputing missing values using MICE
- Performed feature selection to extract protein sequences linked to heart disease using Linear SVM, LASSO, and Elastic Net
- Implemented unsupervised deep learning and conventional modeling methods to cluster proteins based on time-series biological data
Keywords: graph representation learning, high-dimensional data, pre-clinical/clinical data
- Created a network from high-dimensional multi-omics data that models gene and protein relationships through linear regression and a multi-layer perceptron
- Employed an autoencoder to extract molecular features significant in differentiating breast cancer cell lines from other cancer types