(1) Team & Project Introduction
We are dedicated to reshaping enterprise productivity through AI. Our team's core project is KNemo, a next-generation intelligent meeting organization software. KNemo integrates Large Language Models (LLMs) with advanced speech processing (ASR and speaker diarization) to provide meeting summaries, speaker identification, and action item extraction on edge servers.
As part of our team, you will work on cutting-edge tech integrations. We highly value engineering quality and the ability to ship real-world applications, and we encourage our members to leverage AI-assisted tools for maximum efficiency. You will collaborate with an outstanding product and engineering team to transform the latest AI models into reliable, low-latency, and privacy-focused products.
(2) Responsibilities
- AI Core Feature Development: Design and implement LLM- and audio-based applications, including the development of RAG architectures, Chat Q&A systems, and highly accurate long-context/meeting summarization features.
- Systematic Prompt Engineering: Optimize prompts across complex contexts and various use cases. Ensure the system consistently outputs structured data to accurately extract key meeting takeaways and Action Items.
- Backend Architecture & Deployment: Participate in API development and architectural planning for AI applications. Containerize models and applications with Docker, deploy them to Linux servers, and help maintain CI/CD pipelines and day-to-day system operations.
- Model Inference Optimization & Tech Integration: Evaluate and adopt LLM acceleration frameworks (e.g., vLLM, llama.cpp) to enhance system performance. Assist in researching and designing architectures to deploy audio processing or AI models onto edge devices (NPUs).
- Cross-Functional Collaboration: Work closely with PMs and other engineering teams to define product requirements and design system workflows. Actively utilize AI-assisted tools to optimize the team's overall development efficiency.
(3) Job Requirements
Basic Qualifications
- LLM Application Development: Practical experience developing LLM applications. Proficient in Chat Q&A, RAG (Retrieval-Augmented Generation) architectures, and long-context/document summarization techniques.
- Prompt Engineering: Proven ability to systematically tune and optimize prompts to consistently output structured data (e.g., JSON) for precise extraction of meeting summaries and Action Items.
- System Deployment & Version Control: Familiarity with Python and Linux environments. Solid understanding of Docker containerization, GitLab version control, and basic CI/CD pipelines, along with hands-on server-side deployment experience.
- AI Tool Utilization: Proficiency in leveraging various AI-assisted development tools to enhance overall engineering and workflow efficiency.
How to Stand Out
- Audio & Text AI Processing: Experience in developing and training ASR (Automatic Speech Recognition), Speaker Diarization, or speech frontend technologies. Familiarity with low-level audio processing (e.g., FFmpeg) or real-time streaming (e.g., WebRTC, WebSocket) is a strong plus.
- Workflow & Context Design: Practical experience in designing complex system workflows and Context Engineering.
- Model Deployment & Acceleration: Hands-on experience with LLM inference deployment and acceleration techniques. Familiarity with frameworks such as llama.cpp, sglang, vLLM, or BentoML.
- Edge Computing Integration: Ability to optimize AI models, with conceptual knowledge of and practical experience in deploying models to edge devices or NPUs.
- Backend & API Development: Familiarity with backend frameworks (e.g., FastAPI, Flask, or Node.js). Experience in API integration and connecting internal/external systems.
- Database Management: Experience in the architectural design and operation of relational and non-relational databases (e.g., PostgreSQL, Redis).
- Server Operations & Management: Experience managing model execution across multiple servers, configuring load balancing, and handling daily system operations.