CV
Education
- BSc in Computer Science & Technology, Northeastern University, 2022-2026
Professional Experience
CelerData (Powered by StarRocks)
Feb 2025 - Present | Menlo Park, California
R&D Engineer, SaaS Team, StarRocks Division
- Designed data integration solutions for ecosystem products and proposed kernel-level optimization strategies for StarRocks database engine
- Led integration of Apache Arrow and a high-performance RPC framework, achieving a 20× speedup on general queries (up to 160× at peak)
- Developed ecosystem tools: StarRocks Tableau JDBC Connector and official dbt-starrocks adapter
- Open-source contributions: Arrow Flight SQL support for StarRocks
Kuaishou Technology
Jul 2024 - Dec 2024 | Beijing, China
Backend R&D Engineer, DevOps Infrastructure Division
- Built task scheduling system for KDev API platform supporting automated data migration
- Designed parsing framework for HTTP API specifications written in Swagger/OpenAPI 3.0
- Implemented file upload support (200MB limit) in API mocking/proxy modules
- Optimized empty directory cleanup mechanisms for internal API platform
Meituan
Mar 2024 - Jul 2024 | Beijing, China
Backend R&D Engineer, Resource Scheduling Technology Center
- Developed foundational modules for Strategy Analysis Tool 2.0 (ToB business strategy optimization platform)
- Implemented dynamic UI layout adjustment using Lion distributed configuration system
- Conducted A/B testing for strategy optimization across adjacent data slices
- Solved large-scale data growth challenges (250GB/table) via MySQL sharding + Blade tiered storage
TutorEva
Jan 2024 - Mar 2024 | Beijing, China
Algorithm Engineer, AI Research Institute
- Developed syntax parsing/semantic mapping algorithms for K-12 math expressions
- Optimized symbolic algebra computation framework, improving accuracy and performance
- Created automated problem-solving engine with Drools-based logical reasoning
Yonyou Hong Kong
Nov 2023 - Jan 2024 | Wan Chai, Hong Kong
Test Development Engineer
- Executed QA for marketing systems: test case design, requirements analysis, defect tracking
- Implemented automated API testing framework and expanded test scenarios
- Performed source code audits to identify security vulnerabilities
Research Experience
iDC-NEU Group
Jul 2024 - Mar 2025 | Shenyang, China
Research Intern (Mentors: Prof. Yanfeng Zhang & Prof. Shufeng Gong)
- Co-authored preprint on vector database optimization: Updating Graph-based Index with Fine-grained Blocks
- Focus areas: Vector DB architectures, I/O optimization for large-scale storage systems
Northeastern University AI Lab
Mar 2023 - Jul 2023 | Shenyang, China
Research Intern (Mentor: Prof. Miao Fang)
Technical Skills
Programming Languages
C/C++, Python, Java, Golang, C#, CUDA, Bash/Shell, MATLAB, SQL
Web & Scripting
HTML, CSS, JavaScript, Jinja2, LaTeX, Markdown
Cloud & Infrastructure
Linux/Unix, Docker, Git, Nginx, MySQL, MongoDB, Redis, SPDK
Frameworks & Libraries
- Java Ecosystem: Spring (MVC/Boot/Cloud), MyBatis, Maven, Spring Security, Spring Task, Dubbo
- Data Engineering: Apache Arrow, dbt, StarRocks, MPP, EasyExcel
- AI/ML: PyTorch, Drools
- Testing: Selenium, JUnit, Pytest, FIO
Specialized Technologies
Arrow Flight RPC, Blade (TiDB-based storage), Lion (distributed config)
Benchmarks
TOEFL: 103 | GRE: 332
Publications
Updating Graph-based Index with Fine-grained Blocks for Large-scale Streaming High-dimensional Vectors
Preprint | arXiv:2503.00402v1
Song Yu, Shengyuan Lin, Shufeng Gong, Yongqing Xie, Ruicheng Liu, Yijie Zhou, Pufan Zuo, Yanfeng Zhang, Ji Sun, Ge Yu
(Aug 2024 - Mar 2025)
- Novel vector database indexing method for streaming high-dimensional data
- Fine-grained block updating technique for graph-based indexes
- Hardware-aware optimizations for large-scale vector storage systems