braddy committed on
Commit
7aca958
·
verified ·
1 Parent(s): 97830bc

Delete docs

Files changed (2)
  1. docs/He Yingxu_2806.pdf +0 -0
  2. docs/resume.md +0 -90
docs/He Yingxu_2806.pdf DELETED
Binary file (195 kB)
 
docs/resume.md DELETED
@@ -1,90 +0,0 @@
- # personal information
- ## identification
- Singapore Permanent Resident|Chinese citizen
-
- ## address
- 17 Jalan Masjid, Singapore
-
- ## contact
- [email protected]|+65 91752741|+86 15063250971
-
- # Working Experience
- ## Machine Learning Engineer at Huawei Ltd.
- • from Dec 2022 to present
-
- • Built a pipeline to automatically visualize data tables using an LSTM network trained on ChatGPT-generated
- data with a pairwise loss, achieving 80% recall@5 on 100+ internal test cases.
-
- • Designed and implemented a novel SISR method that enhanced Wi-Fi signal simulations for office buildings
- by achieving a 10x speedup compared to physics-based simulation with negligible loss in accuracy (1% MAE)
- on over 80 large-scale office layouts.
-
- ## Machine Learning Research Engineer at Dyson Ltd.
- • from Sept 2021 to Dec 2022
-
- • Implemented an object localization model in a few-shot context by semi-supervised training. The model
- achieved comparable results to professional software with improved adaptability and robustness.
-
- • Designed and implemented an air quality estimation model, using LGBM, Bayesian Regression, etc., with
- geographical and meteorological features. Demonstrated its advantages over spatial interpolation methods
- and deployed the pipeline with the Metaflow framework on AWS services.
-
- ## ML Research Assistant at NUS-Singtel Cyber Security Lab
- • from Sept 2020 to July 2021
-
- • Identified anomalies from system logs leveraging DBSCAN and hierarchical clustering for model training.
-
- • Developed an information retrieval method for web-attack strategy identification from system and firewall
- logs. Recall@3 reached 80% on 100+ hand-labelled samples.
-
- ## Data Analyst Intern at GIC Pte. Ltd.
- • from Dec 2018 to July 2019
-
- • Deployed an R application that forecasts the mid-term returns of a portfolio with visualization using R Shiny.
-
- • Optimized the coefficients of a mean reversion forecasting model using a genetic algorithm.
-
- ## Data Analyst Intern at PropertyGuru
- • from May 2018 to Aug 2018
-
- • Developed dashboards in Tableau to analyze user behavior and listings’ performance to better match
- user demand to agents’ recommendations.
-
- • Implemented a POC to calculate and geographically visualize the liveability score for properties.
-
- # Education
- ## Master of Computing in Artificial Intelligence at National University of Singapore
- • from Aug 2020 to Sept 2021
- • School of Computing: CAP 4.42/5.0
- • Teaching Assistant: Advanced Analytics and Machine Learning (from Jan 2021 to May 2021)
-
- ## Bachelor of Science (Hons) in Business Analytics at National University of Singapore
- • from Aug 2016 to June 2020
- • School of Computing: CAP 4.15/5.0, Dean’s List in Semester 3 AY 2018/2019
- • Distinction: Analytics Techniques Knowledge Area (awarded in Dec 2020)
- • Teaching Assistant: Programming Methodology in Python (from Aug 2017 to June 2018)
-
- # Relevant Projects
- ## Distilling ChatGPT for finetuning image captioning models
- • from Jan 2023 to Present
- • Employed Chain-of-Thought with verification prompting technique on ChatGPT to create 10k+ accurate
- captions from the xView annotations. Fine-tuned a GIT image captioning model and significantly improved
- the CIDEr score from 11.59 to 85.93 over 2k RSICD samples.
- ## Dialogue Response Generation (Master Thesis) at NUS NExT++ Lab
- • from Nov 2020 to Aug 2021
- • Built an enriched task-oriented response generation by implementing copy-mechanism on GPT-2 using
- PyTorch. The proposed model is capable of naturally incorporating external tips/user reviews about venues
- into responses. The generated response outperforms many state-of-the-art models on user satisfaction.
- ## Property Resale Price Prediction
- • from Jan 2021 to May 2021
- • Fitted CatBoost, LGBM, XGBoost on 43k pieces of property sales data. Selected features by correlation and
- information gain. Engineered new features describing properties’ livability. Reduced data dimensionality
- with WOE encoding. The final ensemble placed 5th of 64 on accuracy.
-
- # Skills
- • Python (PyTorch, TensorFlow), R: Machine
- Learning, Deep Learning, Data processing
- • SQL, Spark: Data query and big data
- • Tableau, PowerBI: Visualization development
- • Java, Git, Scala, JavaScript, HTML, CSS: Software
- Development