Lead SW Architect

NeuReality

Posted on Jun 14, 2026

System Architecture
Israel
Full-time

Description

NeuReality is seeking a Lead Software Architect to join our system architecture team and help define NR-Nexus, our next-generation AI inference platform.

Responsibilities

Lead the software architecture and technical roadmap for NeuReality’s NR-Nexus.
Write system specifications for the NR-Nexus product.
Research trends in AI infrastructure, SaaS platforms, model serving, and inference.
Work with engineering to translate technical capabilities into product value.
Work closely with engineering teams to optimize performance, scalability, and feature delivery.
Define performance goals and lead profiling, benchmarking, and optimization efforts for GenAI and distributed AI workloads.
Collaborate with customers, partners, and open-source communities to ensure ecosystem compatibility and adoption.
Mentor software engineers and provide technical leadership.

Requirements

7+ years of software engineering experience, including 3+ years in software architecture or technical leadership.
Strong experience with Kubernetes-based platforms and cloud-native architecture.
Deep understanding of GenAI/LLM infrastructure and distributed workloads.
Experience designing management software or SaaS platforms for production systems.
Strong background in distributed systems, microservices, APIs, and automation.
Hands-on experience with observability stacks, monitoring, logging, alerting, and SLA/SLO tracking.
Experience with CI/CD, deployment automation, upgrades, and rollback mechanisms.
Good understanding of security, authentication, authorization, and integration with customer data center environments.

Nice to have

Deep understanding of GenAI/LLM inference infrastructure, including model serving, scaling, batching, latency, throughput, and resource utilization.
Experience with production AI inference clusters using GPUs, AI accelerators, or other specialized compute infrastructure.
Understanding of how distributed inference systems operate, including scheduling, load balancing, autoscaling, failover, and cluster-level observability.
Experience with LLM serving frameworks such as vLLM, Triton Inference Server, TensorRT-LLM, or similar.
Familiarity with GPU/accelerator orchestration, device plugins, resource scheduling, and cluster capacity planning.
Familiarity with GPU communication technologies such as GPUDirect RDMA, NCCL, NVLink, or UALink.
Experience optimizing communication for distributed AI/ML workloads.
Knowledge of Prometheus, Grafana, OpenTelemetry, Helm, Argo CD, Istio, KServe, Kubeflow, or similar tools.
Experience deploying software in on-prem, edge, private cloud, or hybrid environments.
