AI Architect/Insurance/12 months (Extendable/Convert to Perm)

Hong Kong

Sector: Technology

Function: AI/ML

Contact Name: Vivian On

Expiry Date: 24-Jul-2026

Job Ref: JN -052026-493766

Date Published: 24-Jun-2026

Job Description: Technical AI Architect

Role Overview

We are seeking a Technical AI Architect to lead the design, scaling, and governance of our Enterprise Agentic RAG platform. You will move beyond basic semantic search to architect production-grade, end-to-end multi-agent products and high-performance retrieval systems.

This role demands deep technical mastery of Agentic RAG and LangGraph, strict attention to cost/token optimization, and the ability to ship resilient, production-grade products that enforce robust enterprise guardrails and security compliance.

Key Responsibilities

  • Production-Grade Agentic Architecture: Design and build end-to-end Agentic RAG products using state-driven, multi-agent systems and cyclic workflows via LangGraph. Move from sequential pipelines to iterative, self-correcting reasoning loops (e.g., query decomposition, self-reflection, and dynamic context validation).

  • Enterprise-Scale Retrieval Systems: Architect high-precision, layout-aware semantic chunking pipelines. Implement enterprise hybrid search (combining dense vector retrieval and sparse BM25 keyword matching, fused with Reciprocal Rank Fusion) backed by two-stage cross-encoder reranking layers.

  • Cost & Token Optimization: Drive LLM unit economics at scale. Implement advanced strategies for token optimization, context-window compression, semantic caching, and dynamic cost-based model routing (e.g., routing lookups to lightweight models and deep reasoning to frontier models).

  • AI Governance, Security & Guardrails: Deploy production-ready enterprise safety nets. Enforce secure tool execution environments, Source Access Control Lists (ACLs), data privacy/PII redaction, and automated LLM-as-a-judge evaluation frameworks (e.g., Ragas, TruLens) tracking faithfulness, context precision, and latency SLAs.

  • Technical Leadership & DevOps: Lead and mentor a dedicated team of AI/ML engineers, and establish best practices for the team. Oversee containerization (Docker, Kubernetes) and inference server optimization (e.g., vLLM, PagedAttention) to achieve low-latency SLAs.

Technical Stack & Requirements

  • Orchestration & Agents: Expert-level mastery of LangGraph (critical), LangChain, or LlamaIndex for state tracking and tool use.

  • Data & Vector Infrastructure: Deep experience with enterprise vector databases (Pinecone, Milvus, Qdrant, pgvector) and robust extraction pipelines for complex enterprise documents (PDFs, financial tables).

  • Models & Deployment: Hands-on experience with commercial APIs (OpenAI, Anthropic) and with deploying, fine-tuning, or quantizing open-source models (Llama, Mistral) via production engines like vLLM.

  • Core Engineering: Strong Python foundation, including asynchronous programming, microservices (FastAPI), and observability infrastructure (LangSmith, Weights & Biases).

  • Experience: 10 years of software/data experience, with a minimum of 3 years in AI enterprise architecture and a proven track record of shipping end-to-end, production-ready enterprise GenAI products.

Argyll Scott Asia is acting as an Employment Agency in relation to this vacancy.
