
Cloud Engineer - AI ML

Kuala Lumpur | Job No. 14317390 | Full-time, On-Site

Job Description

You will serve as a subject‑matter expert (SME) providing Level‑3 technical support across Google Cloud’s AI/ML portfolio, with emphasis on Vertex AI, GenAI, Conversational AI, and Other AI services. The role centers on rapid, high‑quality incident response, root‑cause diagnosis, and resolution for complex customer cases—while maintaining SLOs, CSAT targets, and rigorous documentation standards across phone, email, and chat channels.

Key Responsibilities

  • Own complex incidents end‑to‑end: triage, reproduce, diagnose, and resolve issues for AI/ML products; maintain transparent customer communication and accurate case records.
  • Respond to, diagnose, resolve, and track customer support queries by phone, email, and chat.
  • Maintain response and resolution speed as defined by SLOs.
  • Maintain high customer satisfaction scores and meet quality standards in 90% of cases.
  • Assist and respond to consults from other technical support representatives through existing systems and tools.
  • Use existing troubleshooting tools and techniques to establish the root cause of queries and provide a customer-facing root-cause assessment.
  • Understand the business impact of customer issue reports, follow internal issue prioritization guidelines, and justify the priority assigned to any single customer report.
  • Classify queries internally, documenting classes of problems and preventative actions for later retrospective analysis.
  • Reactively (e.g. as a result of a query) file issue reports to Google engineers and collaborate with them to diagnose customer issues; build documentation and procedures; document desired behavior and/or steps to reproduce; suggest code-level resolutions for complex product bugs; and assist engineers in driving bugs to resolution.
  • Perform community management tasks as needed by the business.
  • Promptly and independently resolve technical incidents and escalations, with effective communication to all stakeholders internally and externally, so that no monitoring is needed by Google engineers.
  • Take cases involving customer-specific architectural design requirements and provide solutions scoped to a particular product (or a subset of product features).
  • Contribute to the community: solution posts, FAQs, and guidance on best practices for AI/ML deployments and responsible AI usage.

Product Scope & Typical Case Patterns

Vertex AI

  • Introduction/AutoML: dataset ingestion, labeling, AutoML training failures, metric drift, imbalance handling.
  • Notebooks: environment provisioning, dependency/runtime conflicts, GPU/TPU access, kernel issues.
  • AI Vector Search: index build latency, recall/precision tuning, ANN configuration, embedding mismatches.
  • Pipelines: DAG orchestration failures, component contract issues, artifact lineage, caching.
  • Prediction (Online/Batch): endpoint scaling, model versioning, cold‑start latency, batch job retries.
  • Training: hyperparameter tuning, distributed training, accelerator utilization, checkpointing.
  • Model Registry: version promotion policies, metadata integrity, rollback flows.
  • Managed Datasets: schema evolution, governance, access control.
  • Explainable AI: feature attributions, baselines, compliance requests.
  • Feature Store: ingestion latency, online/offline store consistency, backfills.

GenAI

  • LLMs & GenAI Introduction: prompt engineering pitfalls, safety filters, quota/latency.
  • Vertex AI Gemini: model selection, context window sizing, tool‑use function calling, grounding.
  • Vertex AI Search & Conversation: data connectors, retrieval quality, schema/FAQ ingestion.
  • Discovery AI Retail Search: relevance tuning, synonym/attribute mapping, cold‑start catalogue issues.
  • Vertex Gen AI Studio: prototype to production handoff, evaluation harnesses.
  • Vertex Model Garden: model availability, versioning, licenses, tuning envelopes.

Conversational AI

  • Dialogflow ES/CX: intent/flow design, session state, webhook reliability, NLU regression.
  • CCAI Platform / CCaaS: telephony integration, routing, agent desktop, compliance.
  • CCAI Insights: transcript accuracy, sentiment, redaction, analytics pipelines.
  • Contact Center AI (General): deployment patterns, multichannel orchestration.
  • Speech‑to‑Text / Text‑to‑Speech: language/acoustic models, latency, accuracy, voice settings.
  • Agent Assist: suggestion quality, knowledge base integration, real‑time performance.

Other AI

  • Healthcare Data Engine (HDE): FHIR mapping, interoperability, privacy controls.
  • Document AI: processor selection, field extraction accuracy, batch throughput.
  • Vision API: model outputs, rate limits, edge cases, dataset curation.

Job Requirements

Minimum Qualifications

  • Technical Support Experience (L2/L3) for cloud AI/ML platforms, with proven incident ownership, RCA delivery, and cross‑functional collaboration.
  • Troubleshooting & Analysis: proficiency with logs, metrics, tracing; ability to interpret model artifacts, pipeline steps, and service quotas.
  • Communication: customer‑friendly RCA and escalation narratives; ability to handle sensitive, high‑impact scenarios.
  • Language: Mandarin B2 (CEFR) mandatory; English professional working proficiency.
  • Experience: 2-6 years working with Google Cloud or another cloud platform such as AWS or Azure.

Preferred Skills & Product Certifications

Vertex AI Track

  • AutoML, Notebooks, Pipelines, Vector Search, Training/Prediction (online/batch), Model Registry, Managed Datasets, Explainable AI, Feature Store.

GenAI Track

  • Gemini family on Vertex AI; Search & Conversation; Discovery AI Retail Search; Gen AI Studio; Model Garden (model selection, safety, evaluation).

Conversational Track

  • Dialogflow ES/CX design and troubleshooting; CCAI Platform/CCaaS integrations; CCAI Insights; STT/TTS; Agent Assist.

Other AI Track

  • HDE (FHIR/health data), Document AI processors, Vision API.

Certifications (nice‑to‑have)

  • Google Cloud Professional ML Engineer, Professional Cloud Architect/Developer, Data Engineer; Dialogflow/CCAI badges; Responsible AI training.
  • Relevant third‑party: conversational design, speech technologies, healthcare data standards.

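Learn More About Accenture

Our Expertise

Guided by our corporate purpose, "to deliver on the promise of technology and human ingenuity," we are committed to creating value by leading change and to building a better future for our clients, people, shareholders, partners, and society as a whole.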
