Course Outline
Introduction to Multimodal LLMs in Vertex AI
- Overview of multimodal capabilities in Vertex AI
- Gemini models and supported modalities
- Use cases in enterprise and research
Setting Up the Development Environment
- Configuring Vertex AI for multimodal workflows
- Working with datasets across modalities
- Hands-on lab: environment setup and dataset preparation
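The dataset-preparation step in the lab above can be sketched as a small helper that sorts a mixed dataset into modalities by file extension. This is a hypothetical illustration, not part of the Vertex AI SDK; the file names and extension map are placeholders.

```python
from pathlib import Path
from collections import defaultdict

# Hypothetical extension-to-modality map; extend it for your own dataset.
MODALITY_BY_EXT = {
    ".txt": "text", ".md": "text",
    ".wav": "audio", ".mp3": "audio",
    ".jpg": "image", ".png": "image",
}

def group_by_modality(paths):
    """Group file names by modality so each can be preprocessed separately."""
    groups = defaultdict(list)
    for p in map(Path, paths):
        modality = MODALITY_BY_EXT.get(p.suffix.lower(), "other")
        groups[modality].append(p.name)
    return dict(groups)

files = ["report.txt", "interview.wav", "chart.png", "notes.md"]
grouped = group_by_modality(files)
# grouped: {'text': ['report.txt', 'notes.md'], 'audio': ['interview.wav'],
#           'image': ['chart.png']}
```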
Long Context Windows and Advanced Reasoning
- Understanding long-context workflows
- Use cases in planning and decision-making
- Hands-on lab: implementing long-context analysis
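A core part of long-context work is deciding how much source material fits in one prompt. A minimal sketch, assuming a rough four-characters-per-token estimate (a common heuristic, not the model's actual tokenizer):

```python
def pack_long_context(sections, max_tokens=8000):
    """Pack document sections into a single prompt within a token budget."""
    approx = lambda s: len(s) // 4 + 1  # crude token estimate, not a tokenizer
    packed, used = [], 0
    for sec in sections:
        cost = approx(sec)
        if used + cost > max_tokens:
            break  # stop before exceeding the context budget
        packed.append(sec)
        used += cost
    return "\n\n".join(packed), used

# Two 400-character sections fit; the oversized third is dropped.
prompt, tokens = pack_long_context(["a" * 400, "b" * 400, "c" * 40000],
                                   max_tokens=300)
```

In practice you would use the provider's token-counting endpoint rather than a length heuristic, but the packing logic stays the same.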
Cross-Modal Workflow Design
- Combining text, audio, and image analysis
- Chaining multimodal steps in pipelines
- Hands-on lab: designing a multimodal pipeline
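The chaining pattern from this module can be sketched as functions that pass a shared context dictionary from step to step. The model calls here are stubs standing in for real transcription, captioning, and summarization requests; the function and key names are illustrative, not an SDK API.

```python
def transcribe_audio(ctx):
    ctx["transcript"] = f"transcript of {ctx['audio']}"  # stub for a model call
    return ctx

def caption_image(ctx):
    ctx["caption"] = f"caption of {ctx['image']}"  # stub for a model call
    return ctx

def summarize(ctx):
    # A later step can read every earlier step's output from the context.
    ctx["summary"] = f"{ctx['transcript']}; {ctx['caption']}"
    return ctx

def run_pipeline(ctx, steps):
    """Apply each pipeline step in order, threading the context through."""
    for step in steps:
        ctx = step(ctx)
    return ctx

result = run_pipeline({"audio": "call.wav", "image": "slide.png"},
                      [transcribe_audio, caption_image, summarize])
```

Keeping each stage a plain function makes it easy to reorder, skip, or test stages independently.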
Working with Gemini API Parameters
- Configuring multimodal inputs and outputs
- Optimizing inference latency and cost
- Hands-on lab: tuning Gemini API parameters
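The Gemini API exposes generation parameters such as temperature, top_p, top_k, and max_output_tokens. A minimal sketch that builds a configuration with basic range checks, written as a plain dict rather than the SDK's GenerationConfig class so it runs offline; the defaults and the [0, 2] temperature range are assumptions to verify against the model version you use.

```python
def make_generation_config(temperature=0.7, top_p=0.95, top_k=40,
                           max_output_tokens=1024):
    """Build a generation-parameter dict with simple sanity checks."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0, 2]")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    return {
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "max_output_tokens": max_output_tokens,
    }

# Lower temperature trades diversity for more deterministic output.
config = make_generation_config(temperature=0.2)
```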
Advanced Applications and Integrations
- Interactive multimodal agents and assistants
- Integrating external APIs and tools
- Hands-on lab: building a multimodal application
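Integrating external tools typically follows the function-calling pattern: the model names a tool and supplies arguments, and application code dispatches the call. A hypothetical sketch with a stubbed weather lookup; the registry and call format are illustrative, not the Gemini SDK's types.

```python
def get_weather(city):
    return f"sunny in {city}"  # stub for a real external API call

# Registry mapping tool names the model may request to local functions.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_call):
    """Execute a tool call of the form {'name': ..., 'args': {...}}."""
    fn = TOOLS[tool_call["name"]]
    return fn(**tool_call["args"])

reply = dispatch({"name": "get_weather", "args": {"city": "Paris"}})
```

The tool result would then be returned to the model so it can compose a final answer.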
Evaluation and Iteration
- Testing multimodal performance
- Metrics for accuracy, alignment, and drift
- Hands-on lab: evaluating multimodal workflows
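The metrics in this module can be sketched with a minimal exact-match accuracy plus a simple drift check comparing two evaluation runs. This is an illustrative baseline, not a full evaluation harness; the 0.05 threshold is an arbitrary placeholder.

```python
def accuracy(predictions, labels):
    """Exact-match accuracy over paired predictions and labels."""
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)

def drift(old_accuracy, new_accuracy, threshold=0.05):
    """Flag a regression when accuracy drops by more than `threshold`."""
    return (old_accuracy - new_accuracy) > threshold

baseline = accuracy(["cat", "dog", "bird"], ["cat", "dog", "fish"])  # 2/3 correct
latest = accuracy(["cat", "dog", "bird"], ["cat", "fish", "fish"])   # 1/3 correct
regressed = drift(baseline, latest)
```

In practice exact match would be replaced with task-appropriate scoring (semantic similarity for text, alignment checks for cross-modal outputs), while the comparison loop stays the same.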
Summary and Next Steps
Requirements
- Proficiency in Python programming
- Experience with machine learning model development
- Familiarity with multimodal data (text, audio, image)
Audience
- AI researchers
- Advanced developers
- ML scientists
14 Hours