A Practical Introduction to Data Analysis and Big Data Training Course

Course Code

bigdata_

Duration

35 hours (usually 5 days including breaks)

Requirements

  • A general understanding of math.
  • A general understanding of programming.
  • A general understanding of databases.

Audience

  • Developers / programmers
  • IT consultants

Overview

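Participants who complete this instructor-led, live training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies, and tools.

Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback are important components of the class.

The course starts with an introduction to the basic concepts of Big Data, then progresses to the programming languages and methodologies used to perform Data Analysis. Finally, it covers the tools and infrastructure that support Big Data storage, distributed processing, and scalability.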

Format of the Course

  • Part lecture, part discussion, hands-on practice and implementation, and occasional quizzes to measure progress

Course Outline

Introduction to Data Analysis and Big Data

  • What Makes Big Data "Big"?
    • Velocity, Volume, Variety, Veracity (VVVV)
  • Limits to Traditional Data Processing
  • Distributed Processing
  • Statistical Analysis
  • Types of Machine Learning Analysis
  • Data Visualization

Big Data Roles and Responsibilities

  • Administrators
  • Developers
  • Data Analysts

Languages Used for Data Analysis

  • R Language
    • Why R for Data Analysis?
    • Data manipulation, calculation and graphical display
  • Python
    • Why Python for Data Analysis?
    • Manipulating, processing, cleaning, and crunching data
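
As a rough illustration of the "manipulating, processing, cleaning, and crunching data" item above, the following sketch uses only the Python standard library. The sample records, the field names, and the `clean_records` and `mean_by_city` helpers are hypothetical and are not taken from the course material.

```python
from statistics import mean

# Hypothetical raw records, as they might arrive from a CSV export.
raw_records = [
    {"city": " Seoul ", "temp_c": "21.5"},
    {"city": "busan", "temp_c": "n/a"},   # reading that cannot be parsed
    {"city": "Seoul", "temp_c": "23.5"},
    {"city": "Busan", "temp_c": "19.0"},
]


def clean_records(records):
    """Normalize city names and drop rows whose temperature is not numeric."""
    cleaned = []
    for row in records:
        try:
            temp = float(row["temp_c"])
        except ValueError:
            continue  # skip rows that cannot be converted to a number
        cleaned.append({"city": row["city"].strip().title(), "temp_c": temp})
    return cleaned


def mean_by_city(records):
    """Group cleaned rows by city and average the temperatures."""
    groups = {}
    for row in records:
        groups.setdefault(row["city"], []).append(row["temp_c"])
    return {city: mean(temps) for city, temps in groups.items()}


cleaned = clean_records(raw_records)
averages = mean_by_city(cleaned)
print(averages)
```

In practice a library such as pandas is commonly used for this kind of work; the standard library is used here only to keep the sketch self-contained.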

Approaches to Data Analysis

  • Statistical Analysis
    • Time Series analysis
    • Forecasting with Correlation and Regression models
    • Inferential Statistics (estimating)
    • Descriptive Statistics in Big Data sets (e.g. calculating mean)
  • Machine Learning
    • Supervised vs unsupervised learning
    • Classification and clustering
    • Estimating cost of specific methods
    • Filtering
  • Natural Language Processing
    • Processing text
    • Understanding the meaning of the text
    • Automatic text generation
    • Sentiment analysis / topic analysis
  • Computer Vision
    • Acquiring, processing, analyzing, and understanding images
    • Reconstructing, interpreting and understanding 3D scenes
    • Using image data to make decisions
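
To make the statistical items in the list above more concrete (descriptive statistics such as the mean, and forecasting with a regression model), here is a minimal sketch in plain Python. The `months` and `sales` figures and the `fit_line` helper are invented for illustration, and the model is an ordinary least-squares straight line, which is only one of many possible regression approaches.

```python
from statistics import mean

# Hypothetical monthly sales figures for months 1 through 6.
months = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Descriptive statistic: the average of the observed values.
average_sales = mean(sales)

# Forecast: extend the fitted line one step beyond the observed data.
slope, intercept = fit_line(months, sales)
forecast_month_7 = slope * 7 + intercept

print(average_sales, slope, intercept, forecast_month_7)
```

Because the invented data lies exactly on a line, the fit is perfect; real data would leave residuals that should be inspected before trusting a forecast.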

Big Data Infrastructure

  • Data Storage
    • Relational databases (SQL)
      • MySQL
      • Postgres
      • Oracle
    • Non-relational databases (NoSQL)
      • Cassandra
      • MongoDB
      • Neo4j
    • Understanding the nuances
      • Hierarchical databases
      • Object-oriented databases
      • Document-oriented databases
      • Graph-oriented databases
      • Other
  • Distributed Processing
    • Hadoop
      • HDFS as a distributed filesystem
      • MapReduce for distributed processing
    • Spark
      • All-in-one in-memory cluster computing framework for large-scale data processing
      • Structured streaming
      • Spark SQL
      • Machine Learning libraries: MLlib
      • Graph processing with GraphX
  • Scalability
    • Public cloud
      • AWS, Google, Aliyun, etc.
    • Private cloud
      • OpenStack, Cloud Foundry, etc.
    • Auto-scalability
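
The MapReduce model listed under Hadoop above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce phases using a word count; it does not use the Hadoop API, and the `documents` sample and the three phase functions are hypothetical names chosen for this example. On a real cluster, each phase would run in parallel across many nodes.

```python
from collections import defaultdict

# Hypothetical input: each string stands in for one input split.
documents = ["big data is big", "data analysis with big data"]


def map_phase(doc):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in doc.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


mapped = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle_phase(mapped))
print(word_counts)
```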

Choosing the Right Solution for the Problem

The Future of Big Data

Summary and Conclusion
