EXO: End-to-End Local AI Cluster Deployment Training Course
EXO is an open-source framework that links Apple Silicon devices into a distributed AI cluster, enabling local inference of frontier models too large for any single device.
This instructor-led, live training (available online or onsite) targets system administrators and DevOps engineers who want to deploy, configure, and manage EXO clusters for private LLM inference across multiple Apple Silicon or Linux nodes.
Upon completing this training, participants will be able to:
- Install and configure EXO on macOS and Linux nodes.
- Enable automatic device discovery and establish multi-node clusters.
- Activate and verify RDMA over Thunderbolt 5 to achieve ultra-low-latency inter-device communication.
- Deploy frontier models (DeepSeek, Qwen, Llama) across clustered devices.
- Monitor cluster health and resolve common deployment issues.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practical application.
- Hands-on implementation within a live laboratory environment.
Customization Options
- To request customized training, please contact us to make arrangements.
Course Outline
Introduction to EXO and Local AI Clustering
- Overview of the EXO framework and the exo-explore ecosystem
- Comparison of centralized cloud inference versus distributed local inference
- Architecture: libp2p device discovery, MLX backend, dashboard, and API layers
- Hardware requirements: Apple Silicon (M3 Ultra, M4 Pro/Max), Thunderbolt 5, and shared storage
Installing EXO on macOS
- Setting up Xcode, Metal Toolchain, and macOS prerequisites
- Installing uv, Node.js, and the Rust nightly toolchain
- Installing the pinned macmon fork for Apple Silicon monitoring
- Cloning the repository and building the dashboard using npm
- Running EXO from source and verifying the localhost:52415 dashboard
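The macOS build steps above depend on several command-line tools being present. As a quick pre-flight check before cloning and building, something like the following can be used; the tool list follows the outline above, and no version floors are checked (exact requirements may differ by EXO release):

```python
# Pre-flight check before building EXO from source.
# Tool list follows the course outline (git, uv, Node.js/npm, Rust);
# version requirements are not verified here.
import shutil

REQUIRED_TOOLS = ["git", "uv", "node", "npm", "rustc"]

def preflight(tools=REQUIRED_TOOLS):
    """Map each required tool to the path found on PATH, or None if missing."""
    return {tool: shutil.which(tool) for tool in tools}

if __name__ == "__main__":
    for tool, path in preflight().items():
        print(f"{'found' if path else 'MISSING'}: {tool} -> {path}")
```

Any `MISSING` entry points at a prerequisite to install before the dashboard build will succeed.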
Installing EXO on Linux
- Installing dependencies via apt or Homebrew on Linux
- Configuring uv, Node.js 18+, and Rust nightly
- Building the dashboard and running EXO in CPU-only mode
- Directory layout: XDG Base Directory paths for config, data, cache, and logs
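The XDG Base Directory layout mentioned above resolves to concrete paths per node. A minimal sketch of that resolution, assuming an `exo` application subdirectory (the exact directory names used by your EXO version may differ):

```python
# Resolve XDG Base Directory paths for config, data, cache, and state/logs.
# The "exo" subdirectory name is an assumption for illustration.
import os
from pathlib import Path

def xdg_dirs(app="exo", env=os.environ):
    home = Path(env.get("HOME", "~")).expanduser()
    return {
        "config": Path(env.get("XDG_CONFIG_HOME", home / ".config")) / app,
        "data":   Path(env.get("XDG_DATA_HOME",   home / ".local/share")) / app,
        "cache":  Path(env.get("XDG_CACHE_HOME",  home / ".cache")) / app,
        "state":  Path(env.get("XDG_STATE_HOME",  home / ".local/state")) / app,  # logs
    }
```

Knowing these paths matters in practice: the cache directory is where downloaded model weights accumulate, which is the directory to migrate or mount over NFS in the maintenance module later in the course.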
Automatic Device Discovery and Cluster Formation
- Understanding libp2p-based auto-discovery across local networks
- Configuring custom namespaces with EXO_LIBP2P_NAMESPACE for cluster isolation
- Verifying node membership in the dashboard cluster view
- Handling discovery failures and network segmentation issues
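Namespace isolation works because nodes only discover peers advertising the same libp2p namespace, so two clusters can share a LAN without merging. `EXO_LIBP2P_NAMESPACE` is the variable named in this outline; the launch mechanics below are a sketch, and the actual start command depends on how EXO was installed:

```python
# Launch an EXO node into an isolated namespace by setting
# EXO_LIBP2P_NAMESPACE in its environment.
import os

def exo_env(namespace, extra_env=None):
    """Build the environment for one node of an isolated cluster."""
    return dict(os.environ, EXO_LIBP2P_NAMESPACE=namespace, **(extra_env or {}))

env = exo_env("lab-a")
# In a real deployment, pass this env when starting the node, e.g.:
#   subprocess.run(["exo"], env=env)
print("namespace:", env["EXO_LIBP2P_NAMESPACE"])
```

Nodes started with namespace `lab-a` will never appear in the cluster view of nodes running under `lab-b`, which is also a useful first thing to check when diagnosing discovery failures.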
Enabling RDMA over Thunderbolt 5
- RDMA architecture and the reported 99 percent latency reduction
- Enabling RDMA in macOS Recovery mode using rdma_ctl
- Cable requirements and port topology constraints on Mac Studio
- Ensuring macOS versions match across all cluster nodes
- Troubleshooting RDMA discovery and DHCP configuration
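RDMA troubleshooting starts with the version-match requirement above. Given per-node versions gathered manually (e.g. from `sw_vers -productVersion` on each Mac), a small consistency check flags the outliers; node names here are illustrative:

```python
# Flag cluster nodes whose macOS version differs from the majority.
from collections import Counter

def version_mismatches(node_versions):
    """Return {node: version} for nodes that disagree with the majority version."""
    if not node_versions:
        return {}
    majority, _ = Counter(node_versions.values()).most_common(1)[0]
    return {node: v for node, v in node_versions.items() if v != majority}

print(version_mismatches({"studio-1": "26.1", "studio-2": "26.1", "mini-1": "26.0"}))
# {'mini-1': '26.0'}
```

Any node reported here should be updated before digging into cabling, port topology, or DHCP issues.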
Deploying Frontier Models
- Using the dashboard to load and shard DeepSeek v3.1, Qwen3-235B, and Llama family models
- Previewing instance placements via the /instance/previews API endpoint
- Creating model instances with pipeline or tensor-parallel sharding
- Configuring custom model cards from the HuggingFace hub
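Placement previews let you see where shards would land before committing to an instance. A minimal client for the `/instance/previews` endpoint named above, assuming the default dashboard port from earlier in the outline; no response shape is assumed, and the helper simply returns parsed JSON or `None` when no EXO node is reachable:

```python
# Query the EXO /instance/previews endpoint and return the parsed JSON,
# or None if no node is listening at the given address.
import json
from urllib.request import urlopen
from urllib.error import URLError

def fetch_previews(base_url="http://localhost:52415"):
    try:
        with urlopen(f"{base_url}/instance/previews", timeout=2) as resp:
            return json.load(resp)
    except (URLError, OSError, ValueError):
        return None

if __name__ == "__main__":
    previews = fetch_previews()
    if previews is None:
        print("dashboard not reachable -- is an EXO node running?")
```

Comparing previews for pipeline versus tensor-parallel sharding before creation avoids loading hundreds of gigabytes of weights onto the wrong nodes.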
Monitoring and Troubleshooting
- Reading EXO logs and understanding distributed tracing
- Interpreting cluster health in the dashboard cluster view
- Diagnosing worker node failures and reconnection behavior
- Using EXO_TRACING_ENABLED for performance bottleneck analysis
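Log triage usually begins with counting lines per severity to spot a failing component quickly. The log format below is an assumption for illustration; EXO's real log lines may differ, but the counting pattern carries over:

```python
# Count log lines per severity keyword -- a first-pass triage helper.
# The sample log format is hypothetical.
from collections import Counter

def triage(log_lines):
    """Return a Counter of severity keywords seen in the given log lines."""
    levels = Counter()
    for line in log_lines:
        for level in ("ERROR", "WARN", "INFO"):
            if level in line:
                levels[level] += 1
                break
    return levels

sample = [
    "INFO  node mini-1 joined cluster",
    "WARN  worker studio-2 slow heartbeat",
    "ERROR worker studio-2 disconnected",
]
print(triage(sample))
```

A spike in `WARN` heartbeat lines for one worker, for example, often precedes the disconnection and reconnection behavior covered in this module.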
Cluster Maintenance and Updates
- Updating the EXO binaries and rebuilding the dashboard after upgrades
- Migrating model caches and managing pre-downloaded models over NFS
- Gracefully removing nodes and rebalancing workloads
Requirements
- Familiarity with networking fundamentals (IP addressing, subnetting, firewalls)
- Experience with command-line administration on macOS or Linux
- Knowledge of Python package management (pip/uv) and Node.js tools
Audience
- System administrators
- DevOps engineers
- AI infrastructure architects responsible for on-premise LLM deployment
Open Training Courses require 5+ participants.
Related Courses
Advanced LangGraph: Optimization, Debugging, and Monitoring Complex Graphs
35 Hours
LangGraph is a framework designed for building stateful, multi-agent LLM applications as composable graphs with persistent state and execution control.
This instructor-led live training (available online or onsite) is tailored for advanced-level AI platform engineers, AI DevOps specialists, and ML architects who aim to optimize, debug, monitor, and operate production-grade LangGraph systems.
Upon completion of this training, participants will be able to:
- Design and optimize complex LangGraph topologies for improved speed, cost efficiency, and scalability.
- Enhance reliability through retries, timeouts, idempotency, and checkpoint-based recovery mechanisms.
- Debug and trace graph executions, inspect states, and systematically reproduce production issues.
- Instrument graphs with logs, metrics, and traces; deploy them to production; and monitor SLAs and costs.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practice sessions.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Building Coding Agents with Devstral: From Agent Design to Tooling
14 Hours
Devstral is an open-source framework designed for building and running coding agents that can interact with codebases, developer tools, and APIs to enhance engineering productivity.
This instructor-led, live training (online or onsite) is aimed at intermediate-level to advanced-level ML engineers, developer-tooling teams, and SREs who wish to design, implement, and optimize coding agents using Devstral.
By the end of this training, participants will be able to:
- Set up and configure Devstral for coding agent development.
- Design agentic workflows for codebase exploration and modification.
- Integrate coding agents with developer tools and APIs.
- Implement best practices for secure and efficient agent deployment.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Open-Source Model Ops: Self-Hosting, Fine-Tuning and Governance with Devstral & Mistral Models
14 Hours
Mistral and Devstral models are open-source AI technologies crafted for flexible deployment, fine-tuning, and scalable integration.
This instructor-led live training (available online or onsite) is designed for intermediate to advanced machine learning engineers, platform teams, and research engineers seeking to self-host, fine-tune, and govern Mistral and Devstral models in production environments.
Upon completing this training, participants will be able to:
- Set up and configure self-hosted environments for Mistral and Devstral models.
- Apply fine-tuning techniques to achieve domain-specific performance.
- Implement versioning, monitoring, and lifecycle governance.
- Ensure security, compliance, and responsible usage of open-source models.
Course Format
- Interactive lectures and discussions.
- Hands-on exercises focused on self-hosting and fine-tuning.
- Live-lab implementation of governance and monitoring pipelines.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Fiji: Image Processing for Biotechnology and Toxicology
14 Hours
This instructor-led, live training in South Korea (online or onsite) targets beginner to intermediate-level researchers and laboratory professionals who want to process and analyze images of histological tissues, blood cells, algae, and other biological specimens.
Upon completion of this training, participants will be able to:
- Navigate the Fiji interface and leverage ImageJ’s core functionalities.
- Preprocess and enhance scientific images to improve analysis accuracy.
- Perform quantitative image analysis, such as cell counting and area measurement.
- Automate repetitive tasks using macros and plugins.
- Customize workflows to meet specific image analysis requirements in biological research.
LangGraph Applications in Finance
35 Hours
LangGraph serves as a framework for constructing stateful, multi-agent LLM applications through composable graphs, enabling persistent state management and precise execution control.
This instructor-led live training, available online or onsite, targets intermediate to advanced professionals aiming to design, implement, and manage LangGraph-based finance solutions with robust governance, observability, and compliance standards.
Upon completion of this training, participants will be equipped to:
- Design finance-specific LangGraph workflows that align with regulatory and audit requirements.
- Integrate financial data standards and ontologies into graph states and tooling.
- Implement reliability, safety mechanisms, and human-in-the-loop controls for critical processes.
- Deploy, monitor, and optimize LangGraph systems to ensure performance, cost-efficiency, and adherence to SLAs.
Course Format
- Interactive lectures and discussions.
- Extensive exercises and practical application.
- Hands-on implementation within a live lab environment.
Customization Options
- For personalized training requests, please contact us to arrange a session.
LangGraph Foundations: Graph-Based LLM Prompting and Chaining
14 Hours
LangGraph is a framework designed for creating graph-structured LLM applications that support planning, branching, tool use, memory, and controllable execution.
This instructor-led, live training (available online or onsite) targets beginner-level developers, prompt engineers, and data practitioners who want to design and build reliable, multi-step LLM workflows using LangGraph.
By the end of this training, participants will be able to:
- Explain core LangGraph concepts (nodes, edges, state) and when to use them.
- Build prompt chains that branch, call tools, and maintain memory.
- Integrate retrieval and external APIs into graph workflows.
- Test, debug, and evaluate LangGraph apps for reliability and safety.
Format of the Course
- Interactive lecture and facilitated discussion.
- Guided labs and code walkthroughs in a sandbox environment.
- Scenario-based exercises on design, testing, and evaluation.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
LangGraph in Healthcare: Workflow Orchestration for Regulated Environments
35 Hours
LangGraph empowers stateful, multi-agent workflows driven by LLMs, offering precise control over execution paths and state persistence. In the healthcare sector, these capabilities are essential for ensuring compliance, interoperability, and the development of decision-support systems that seamlessly integrate with medical workflows.
This instructor-led live training (available online or onsite) targets intermediate to advanced professionals who aim to design, implement, and manage LangGraph-based healthcare solutions while navigating regulatory, ethical, and operational challenges.
Upon completion of this training, participants will be able to:
- Design healthcare-specific LangGraph workflows with compliance and auditability as core principles.
- Integrate LangGraph applications with medical ontologies and standards (FHIR, SNOMED CT, ICD).
- Apply best practices for reliability, traceability, and explainability in sensitive environments.
- Deploy, monitor, and validate LangGraph applications in healthcare production settings.
Course Format
- Interactive lectures and discussions.
- Hands-on exercises featuring real-world case studies.
- Implementation practice within a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us to arrange.
LangGraph for Legal Applications
35 Hours
LangGraph is a framework designed for constructing stateful, multi-actor LLM applications using composable graphs that maintain persistent state and offer precise execution control.
This instructor-led live training (available online or onsite) is tailored for intermediate to advanced professionals seeking to design, implement, and manage LangGraph-based legal solutions, ensuring they meet necessary compliance, traceability, and governance standards.
Upon completing this training, participants will be able to:
- Design legal-specific LangGraph workflows that ensure auditability and compliance.
- Integrate legal ontologies and document standards into graph state and processing workflows.
- Implement guardrails, human-in-the-loop approvals, and traceable decision paths.
- Deploy, monitor, and maintain LangGraph services in production environments with observability and cost controls.
Format of the Course
- Interactive lecture and discussion.
- Numerous exercises and practical activities.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Building Dynamic Workflows with LangGraph and LLM Agents
14 Hours
LangGraph serves as a framework for assembling graph-based LLM workflows that support branching, tool usage, memory management, and controlled execution.
This instructor-led, live training (available online or onsite) targets intermediate-level engineers and product teams aiming to integrate LangGraph’s graph logic with LLM agent loops to develop dynamic, context-aware applications, such as customer support agents, decision trees, and information retrieval systems.
Upon completing this training, participants will be able to:
- Design graph-based workflows that coordinate LLM agents, tools, and memory.
- Implement conditional routing, retries, and fallbacks to ensure robust execution.
- Integrate retrieval mechanisms, APIs, and structured outputs into agent loops.
- Evaluate, monitor, and secure agent behavior for enhanced reliability and safety.
Course Format
- Interactive lectures and facilitated discussions.
- Guided labs and code walkthroughs within a sandbox environment.
- Scenario-based design exercises and peer reviews.
Customization Options
- For customized training on this topic, please contact us to arrange.
LangGraph for Marketing Automation
14 Hours
LangGraph is a graph-based orchestration framework that enables conditional, multi-step LLM and tool workflows, ideal for automating and personalizing content pipelines.
This instructor-led, live training (online or onsite) is aimed at intermediate-level marketers, content strategists, and automation developers who wish to implement dynamic, branching email campaigns and content generation pipelines using LangGraph.
By the end of this training, participants will be able to:
- Design graph-structured content and email workflows with conditional logic.
- Integrate LLMs, APIs, and data sources for automated personalization.
- Manage state, memory, and context across multi-step campaigns.
- Evaluate, monitor, and optimize workflow performance and delivery outcomes.
Format of the Course
- Interactive lectures and group discussions.
- Hands-on labs implementing email workflows and content pipelines.
- Scenario-based exercises on personalization, segmentation, and branching logic.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Le Chat Enterprise: Private ChatOps, Integrations & Admin Controls
14 Hours
Le Chat Enterprise offers a private ChatOps solution that delivers secure, customizable, and governed conversational AI capabilities for organizations, supporting RBAC, SSO, connectors, and enterprise app integrations.
This instructor-led live training (available online or onsite) is designed for intermediate-level product managers, IT leads, solution engineers, and security/compliance teams who wish to deploy, configure, and govern Le Chat Enterprise in enterprise environments.
By the end of this training, participants will be able to:
- Set up and configure Le Chat Enterprise for secure deployments.
- Enable RBAC, SSO, and compliance-driven controls.
- Integrate Le Chat with enterprise applications and data stores.
- Design and implement governance and admin playbooks for ChatOps.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Cost-Effective LLM Architectures: Mistral at Scale (Performance / Cost Engineering)
14 Hours
Mistral is a high-performance family of large language models optimized for cost-effective production deployment at scale.
This instructor-led, live training (online or onsite) is aimed at advanced-level infrastructure engineers, cloud architects, and MLOps leads who wish to design, deploy, and optimize Mistral-based architectures for maximum throughput and minimum cost.
By the end of this training, participants will be able to:
- Implement scalable deployment patterns for Mistral Medium 3.
- Apply batching, quantization, and efficient serving strategies.
- Optimize inference costs while maintaining performance.
- Design production-ready serving topologies for enterprise workloads.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Productizing Conversational Assistants with Mistral Connectors & Integrations
14 Hours
Mistral AI serves as an open AI platform, empowering teams to develop and embed conversational assistants within enterprise and customer-facing workflows.
This instructor-led training, available in online or onsite formats, is designed for beginner to intermediate product managers, full-stack developers, and integration engineers. The course focuses on designing, integrating, and productizing conversational assistants using Mistral’s connectors and integrations.
Upon completion, participants will be able to:
- Integrate Mistral conversational models with enterprise and SaaS connectors.
- Implement retrieval-augmented generation (RAG) to ensure grounded responses.
- Design UX patterns for both internal and external chat assistants.
- Deploy assistants into product workflows to address real-world use cases.
Course Format
- Interactive lectures and discussions.
- Hands-on integration exercises.
- Live lab sessions for developing conversational assistants.
Course Customization Options
- For customized training requests, please contact us to make arrangements.
Enterprise-Grade Deployments with Mistral Medium 3
14 Hours
Mistral Medium 3 is a high-performance, multimodal large language model designed for production-grade deployment across enterprise environments.
This instructor-led, live training (online or onsite) is aimed at intermediate-level to advanced-level AI/ML engineers, platform architects, and MLOps teams who wish to deploy, optimize, and secure Mistral Medium 3 for enterprise use cases.
By the end of this training, participants will be able to:
- Deploy Mistral Medium 3 using API and self-hosted options.
- Optimize inference performance and costs.
- Implement multimodal use cases with Mistral Medium 3.
- Apply security and compliance best practices for enterprise environments.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Mistral for Responsible AI: Privacy, Data Residency & Enterprise Controls
14 Hours
Mistral AI offers an open, enterprise-ready AI platform designed to support secure, compliant, and responsible AI deployments.
This instructor-led live training, available both online and onsite, is tailored for intermediate-level compliance leads, security architects, and legal or operations stakeholders aiming to establish responsible AI practices. The course focuses on leveraging privacy safeguards, data residency solutions, and enterprise control mechanisms within the Mistral ecosystem.
Upon completion of this training, participants will be equipped to:
- Deploy privacy-preserving techniques within Mistral environments.
- Execute data residency strategies to satisfy regulatory obligations.
- Configure enterprise-grade controls, including RBAC, SSO, and audit logging.
- Assess vendor and deployment options to ensure alignment with compliance standards.
Course Format
- Interactive lectures and group discussions.
- Case studies and exercises focused on compliance.
- Practical implementation of enterprise AI controls.
Customization Options
- To request a customized version of this course, please contact us to arrange.