Intro

I am a Data Engineer experienced in cloud-based big data solutions, including data warehousing, data quality, data modeling, data ingestion, data transformation, APIs, and cloud computing. I have worked with SQL/NoSQL databases, ETL tools (SSIS, Matillion), orchestration tools (Apache Airflow), cloud data warehouses (BigQuery, Redshift, Snowflake), data processing frameworks (PySpark), infrastructure tools (Terraform), and data visualization tools (Tableau, Sisense). I enjoy working in cross-team environments, learning new technologies, understanding business requirements, and delivering high-quality data products that add value to the business. If you would like to contact me, reach out to me here or through my LinkedIn profile.

Education

University of Toronto
Bachelor of Science, Computer Science & Statistics Double Major
GPA: 3.97/4.00
Sep 2019 - Apr 2023 (Expected)



University of Florida
Master of Engineering, Chemical Engineering Major
GPA: 3.78/4.00
Aug 2012 - May 2014



Lanzhou University
Bachelor of Science, Chemistry Major
GPA: 4.21/5.00
Sep 2008 - Jun 2012

Work

Database Assistant
University of Toronto, Department of Computer Science
Oct 2022 - Present
Skills: SQL; Microsoft Access; Database design principles; Data analysis and management; Pivot tables and data validation; Programming in VBA and Python; Automation of tasks such as table generation and offer letter generation; Collaboration and communication; Troubleshooting and problem-solving

  • Helped manage the admissions database for the University of Toronto Computer Science Department Graduate Program.
  • Designed and implemented new features that replaced manual data ingestion and data transformation, reducing errors by 50%.
  • Developed a series of customized reports to help the admissions team track and analyze applicant data more efficiently.



Data Engineer, Business Intelligence
Ecobee Inc.
May 2021 - Apr 2022
Skills: Relational database management; High-volume data ingestion from various sources; Data modeling; Data pipeline architecture (design/deployment/maintenance); Construction of large-scale data lakes and data warehouses; Workflow automation; Data validation and visualization

  • Ingested high volumes of data from a variety of sources, such as MySQL, PostgreSQL, MongoDB, JSON files, CSV files, Salesforce on Cloud Storage, and NetSuite.
  • Participated in data modeling, such as Ecobee sales modeling and subscription modeling.
  • Developed, deployed, and maintained resilient data ingestion and transformation pipelines with a variety of tools, such as Matillion, Cloud Functions in Python, and SQL stored procedures on GCP BigQuery.
  • Built a large-scale data lake and data warehouses on BigQuery that adhere to Kimball's dimensional data modeling principles.
  • Utilized Airflow to schedule and monitor all ELT tasks, and resolved Airflow job failures in a timely manner to avoid substantial impact on business operations.
  • Administered the team's Google Cloud Platform with Terraform.
  • Developed data visualizations in Tableau and Sisense for data reporting.



Data Analyst Trainee
Easy Career Group
Dec 2019 - Feb 2020

  • Queried large volumes of data from an AWS Cloud SQL database and created weekly and monthly reports to monitor the sales performance of over 1,000 products by store location.
  • Designed and developed inventory reports to track and monitor product availability at each store; provided solid data to support business decisions on replenishment strategies.
  • Automated the monitoring and reporting process for a Credit Limit Increase marketing campaign and created a Tableau dashboard to evaluate post-campaign performance.



Data Analyst, Supply and Material Planning
Wuhu Longtai Steel Structure Equipment Manufacturing Co., Ltd
Mar 2016 - Jul 2018

  • Used Excel and VBA to manipulate data and designed new analytical tools to monitor and forecast sales, shortening reporting time from one day to two hours.
  • Maintained positive relationships with 5 national vendors and worked closely with a team of 4 on product allocation, resulting in an average product delivery rate of 95%.

Projects

Machine Learning Project - Online Education Assessment

  • Collected raw datasets of 1 million records from an online education platform used by many universities, cleansed the data using Pandas, and prepared the final dataset for model building;
  • Performed EDA (exploratory data analysis) to generate descriptive statistics and univariate analysis results; identified the relationships between the main features and the target variable;
  • Built different machine learning models, including KNN and Neural Networks, to predict whether a student will answer specific questions correctly; conducted cross-validation to improve model performance

Software Design Project - Calendar Android App

  • Used Java to develop a Calendar application for Android;
  • Designed the UI for users to log in and implemented advanced features such as sharing with friends and searching for appointments;
  • Created unit tests and mocked applications to validate developed functionality;
  • Maintained accurate written documentation on GitHub covering code changes, feature improvements, and debugging results

PostgreSQL Database Project - Ride-sharing Company

  • Conducted data modeling for analysis purposes on a ride-sharing company's dataset;
  • Developed complex, stand-alone SQL queries and integrated them into a Python program using psycopg2;
  • Designed comprehensive datasets, including scenarios and edge cases, to thoroughly test SQL queries for accuracy and reliability

MIPS Assembly Project - Doodle Jump Game Simulation

  • Completed a comprehensive implementation of the Doodle Jump mobile game using MIPS assembly language;
  • Used a simulated environment within MARS to test the implementation;
  • Accomplished a set of objectives, including creating animations, implementing movement controls, and developing a random platform generator;
  • Added game features such as a scoreboard and a dynamic increase in difficulty, as well as additional features such as power-ups and fancier graphics

Time Series Project - CO2 Concentration Time-series Data Analysis

  • Conducted exploratory analysis of CO2 concentration time-series data from Mauna Loa Observatory, Hawaii;
  • Applied transformations to convert the data into a stationary process and proposed at least two ARIMA or SARIMA models;
  • Estimated model parameters, performed diagnostics, and selected the best model based on selection criteria;
  • Forecast the next ten time periods with 95% prediction intervals and performed periodogram analysis to identify the first three predominant periods, obtaining confidence intervals and interpreting the findings

Courses

Computer Science Courses:
Introduction to Databases
Neural Nets and Deep Learning
Data Structures & Analysis
Numerical Methods
Software Design
Software Test & Verification
Software Tools and Systems Programming
Computer Organization
Intro Machine Learning
Introduction to Computer Science
Introduction to the Theory of Computation
Mathematical Expression and Reasoning for Computer Science

Statistics Courses:
Probability & Statistics
Methods of Data Analysis
Methods of Applied Statistics
Methods for Multivariate Data
Time Series Analysis
Design and Analysis of Experiments
Theory of Statistical Practice
Probability

Math Courses:
Linear Algebra
Advanced Calculus

Economics Courses:
Principles of Microeconomics
Principles of Macroeconomics
