- Role: Red Team Operator (English & Chinese) - Remote
- Location: 100% Remote (Global)
- Rate: USD 50.50/hour
Why This Role Exists
One of our clients is building a dedicated red team to proactively identify weaknesses in advanced AI systems. The safest AI systems are those that have already been rigorously stress-tested by skilled human adversaries.
This role focuses on adversarial testing of conversational AI models, identifying vulnerabilities before they surface in production environments. You will generate structured red-team datasets that directly strengthen AI robustness and safety.
Some projects may involve reviewing AI outputs related to sensitive topics (e.g., bias, misinformation, harmful behaviors). All work is text-based. Participation in higher-sensitivity tracks is optional, clearly communicated in advance, and supported by structured guidelines and wellness resources.
What You'll Do
- Conduct adversarial testing of conversational AI systems (jailbreaks, prompt injections, misuse cases, bias exploitation, multi-turn manipulation)
- Generate high-quality red-team data by annotating failures and classifying vulnerabilities
- Identify systemic weaknesses and recurring risk patterns
- Apply structured taxonomies, benchmarks, and testing frameworks
- Produce clear, reproducible documentation including reports, datasets, and attack cases
- Deliver actionable findings for AI research and safety teams
Who You Are
- Experienced in AI red teaming, adversarial testing, cybersecurity, or socio-technical risk analysis
- Naturally adversarial and curious: you instinctively push systems to edge cases
- Structured in your approach: you rely on benchmarks and repeatable methodologies
- Strong communicator: able to explain complex risks to both technical and non-technical stakeholders
- Adaptable and comfortable working across multiple projects
Nice-to-Have Specialties
- Adversarial ML: jailbreak datasets, prompt injection attacks, RLHF/DPO vulnerabilities, model extraction
- Cybersecurity: penetration testing, exploit development, reverse engineering
- Socio-technical risk: harassment probing, misinformation testing, abuse analysis
- Creative adversarial thinking: psychology, narrative construction, unconventional probing strategies
What Success Looks Like
- You uncover vulnerabilities that automated testing pipelines miss
- You produce reproducible artifacts that measurably improve AI safety
- Evaluation coverage expands across linguistic and cultural edge cases
- AI systems deploy with fewer safety blind spots
Engagement Details
- Rate: $50.50/hour (aligned with expertise and project sensitivity)
- Fully remote and asynchronous
- Independent contractor engagement
- Projects may be extended, shortened, or concluded based on performance and research needs
- Weekly payments via Stripe or Wise
- Visa sponsorship (H-1B or STEM OPT) is not available
All qualified applicants will be considered without regard to legally protected characteristics. Reasonable accommodations are available upon request.
APPLY NOW!