ML Data Engineer Internship
Job Description
- Support the curation, cleaning, and validation of large-scale datasets for Machine Learning (ML) applications.
- Design and implement data pipelines for efficient data ingestion and structuring.
- Use annotation tools to label datasets and maintain labeling quality.
- Perform data validation, deduplication, and error-checking to ensure data quality.
- Maintain organized datasets using version control systems (e.g., DVC, Git LFS).
- Collaborate with ML engineers to prepare datasets for model training and testing.
- Assist in evaluating new data tools and prototyping small-scale proofs of concept (POCs).
- Report and track data issues so that datasets meet the standards required for model performance.
Job Requirements
- Undergraduate students, diploma students, or fresh graduates in Computer Science, IT, Data Engineering, or related majors.
- Good problem-solving skills with a focus on data quality and accuracy.
- Eagerness to learn new data tools and technologies.
- Basic programming skills in Python or other scripting languages for data manipulation.
- Familiarity with data annotation platforms (e.g., Label Studio, Roboflow) and data labeling techniques.
- Basic understanding of data engineering concepts and dataset preprocessing.
- Basic knowledge of version control systems (e.g., Git, DVC) to manage datasets.
Benefits
- Certificate and Allowance.
- Career Path.
- Bonus Allowance.
- Meal Allowance.
- Device Allowance (if required).
- Dormitory (if required).
- 24-Hour Office (AC + WiFi).
- Round-trip Economy Class Train Ticket.
- Start-up Mentoring.
- Teamwork Experience.
- Extended Network.
Duration of employment
- Minimum 4 months.
Beginning of employment
- Flexible; can be adjusted according to campus policy.