Job Description

  1. Support the curation, cleaning, and validation of large-scale datasets for Machine Learning (ML) applications.
  2. Design and implement data pipelines for efficient data ingestion and structuring.
  3. Use annotation tools to label datasets and maintain labeling quality.
  4. Perform data validation, deduplication, and error-checking to ensure data quality.
  5. Maintain organized datasets using version control systems (e.g., DVC, Git LFS).
  6. Collaborate with ML engineers to prepare datasets for model training and testing.
  7. Assist in evaluating new data tools and prototyping small-scale proofs of concept (POCs).
  8. Report and track data issues to ensure datasets meet required standards for model performance.


Job Requirements 

  1. Undergraduate students, diploma students, or fresh graduates majoring in Computer Science, IT, Data Engineering, or related fields.
  2. Good problem-solving skills with a focus on data quality and accuracy.
  3. Eagerness to learn new data tools and technologies.
  4. Basic programming skills in Python or other scripting languages for data manipulation.
  5. Familiarity with data annotation platforms (e.g., Label Studio, Roboflow) and data labeling techniques.
  6. Basic understanding of data engineering concepts and dataset preprocessing.
  7. Basic knowledge of version control systems (e.g., Git, DVC) to manage datasets.


Benefits

  1. Certificate and Allowance.
  2. Career Path.
  3. Bonus Allowance.
  4. Meal Allowance.
  5. Device Allowance (if required).
  6. Dormitory (if required).
  7. 24-Hour Office Access (AC + WiFi).
  8. Round-trip Economy Class Train Ticket.
  9. Start-up Mentoring.
  10. Teamwork Experience.
  11. Extended Network.


Duration of employment

  1. Minimum 4 months.


Beginning of employment

  1. Flexible; can be adjusted according to campus policy.
