Chinese Evaluator Expert (Remote)

Keystone Recruitment

Taiwan

Fresher

This job is no longer accepting applications

Posted a month ago

Job Description

Title: Chinese Evaluator Expert (Remote)
Engagement: Hourly contract (independent contractor)
Location: Remote
Rate: USD 25-30/hour

Role Overview

One of our clients, a leading AI research lab, is seeking professionals with native-level Traditional Chinese and exceptional writing skills to support a high-impact AI evaluation initiative.

In this role, you will create Traditional Chinese/English prompt and golden-answer pairs used to train and evaluate advanced language models. This opportunity is ideal for individuals who combine language mastery, strong analytical thinking, and the ability to translate complex ideas into culturally accurate, Taiwan-specific Traditional Chinese while maintaining technical precision in English.

Key Responsibilities

Multilingual Prompt Design & Optimization

Create detailed prompts in Traditional Chinese and/or English with structured constraints and clear instructions
Ensure natural phrasing and real-world relevance for Taiwan-based users

Evaluation Standards & Rubric Development

Define clear expectations for high-quality responses in Traditional Chinese consumer contexts
Develop comprehensive evaluation rubrics that account for linguistic nuance, tone, and cultural conventions

Model Testing & Grading

Run prompts through AI models and evaluate outputs for accuracy, fluency, reasoning quality, and cultural alignment
Compare Traditional Chinese outputs with English equivalents when necessary

Benchmarking & Quality Assurance

Collaborate in QA review processes to ensure consistency, rigor, and reliability across Traditional Chinese benchmarks
Support integration of the benchmarks into official evaluation frameworks

Minimum Qualifications

Native-level written fluency in Traditional Chinese with strong English reading/writing ability
BS or BA from a reputable institution (completed or in progress)
Strong writing and critical thinking skills
Ability to work independently and meet deadlines
Significant familiarity with ChatGPT or similar AI tools
Based in Taiwan or able to consistently produce culturally accurate, Taiwan-specific Traditional Chinese

Preferred Qualifications

Experience in teaching, research, editing, or academic writing
Experience developing evaluation criteria, grading guidelines, or rubrics
Familiarity with LLM evaluation, prompt engineering, or model testing

Engagement Details

Approximately 20 or more hours per week
Estimated commitment: 2+ months
Structured project environment with defined tools and goals
Fully remote, flexible schedule
Independent contractor engagement
Weekly payments via Stripe or Wise
Projects may be extended, shortened, or concluded early based on needs and performance
We are unable to support H-1B or STEM OPT candidates at this time

Application Process