A5 Labs is the advanced technology group behind some of the largest online mobile gaming sites in the world, pioneering AI-driven solutions that redefine security, player experience, and game design. If you've played an online casino game, there's a high chance you've engaged with our technology. We reward exceptional talent with an outstanding work environment, industry-leading incentives, and continuous growth opportunities.
We are seeking a Principal Engineer specializing in LLM Reasoning & Tool-Use Intelligence to lead the development of next-generation language models with advanced reasoning, tool-use capabilities, and human-machine collaboration fluency. This role will play a central part in applying and advancing reinforcement learning techniques, hybrid system architectures, and multi-agent coordination for scalable, intelligent, and efficient AI systems.
Key Responsibilities:
- Design and implement post-training optimization techniques (e.g., RLHF, RLAIF) to improve LLM reasoning capabilities in complex, multi-step tasks.
- Build LLM-driven tool-use agents that intelligently integrate external APIs, computational tools, and memory systems in real-time reasoning scenarios.
- Lead development of hybrid Retrieval-Augmented Generation (RAG) architectures combining symbolic and neural reasoning for enhanced factuality and generalization.
- Develop frameworks for controlled synthetic data generation to target gaps in model reasoning, compositionality, and planning.
- Architect and train context-aware, multi-agent LLM systems to facilitate robust collaboration and delegation among humans and autonomous agents.
- Collaborate cross-functionally with research scientists, product engineers, and infrastructure teams to productionize intelligent agents.
- Contribute to strategic direction and technical leadership on LLM reasoning systems and their real-world deployments.
Desired Qualifications:
- 7+ years of experience in applied machine learning, reinforcement learning, or LLM development, with demonstrated leadership in research or product settings.
- Deep understanding of large language model architectures (e.g., Transformer-based models), fine-tuning strategies, and inference optimization.
- Strong experience with RL for language (e.g., reward modeling, policy optimization, preference-based learning).
- Familiarity with tool-use environments (e.g., OpenAI function calling, LangChain agents, Toolformer-style training).
- Proven track record in designing or applying hybrid neural-symbolic or RAG systems.
- Experience developing and evaluating synthetic data pipelines for training or evaluation.
- Expertise in multi-agent systems, coordination strategies, and collaborative learning frameworks is a strong plus.
- Advanced degree (Ph.D. or M.S.) in Computer Science, Machine Learning, or a related field.
Bonus:
- Publications in top-tier ML/AI conferences (NeurIPS, ICLR, ICML, ACL, etc.).
- Experience deploying LLM-based agents in real-world software or robotics settings.
- Contributions to open-source LLM/RL ecosystems (e.g., Hugging Face, RLlib, OpenAI Gym, LangChain).
What we offer:
- Competitive compensation: Glassdoor rating of 4.5, with 100% of reviewers satisfied with pay.
- Fully remote & flexible work: 4-5 additional weeks of paid leave on top of public holidays.
- Work alongside world-class AI experts: we ship real products, not just research.
- Global team: no need for fluent English; basic communication is totally fine.