A Practical Introduction to Data Analysis and Big Data Training Course


Duration

35 hours (usually 5 days including breaks)


Requirements

  • A general understanding of math.
  • A general understanding of programming.
  • A general understanding of databases.


Audience

  • Developers / programmers
  • IT consultants


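In this instructor-led, live training, participants gain a practical, real-world understanding of Big Data and its related technologies, methodologies and tools.

Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback are important components of the class.

The course starts with an introduction to the basic concepts of Big Data, then moves on to the programming languages and methodologies used to perform Data Analysis. Finally, it covers the tools and infrastructure that enable Big Data storage, distributed processing, and scalability.

Format of the course

  • Part lecture, part discussion, hands-on practice and implementation, with occasional quizzes to measure progress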


Course Outline

Introduction to Data Analysis and Big Data

  • What Makes Big Data "Big"?
    • Velocity, Volume, Variety, Veracity (VVVV)
  • Limits to Traditional Data Processing
  • Distributed Processing
  • Statistical Analysis
  • Types of Machine Learning Analysis
  • Data Visualization

Big Data Roles and Responsibilities

  • Administrators
  • Developers
  • Data Analysts

Languages Used for Data Analysis

  • R Language
    • Why R for Data Analysis?
    • Data manipulation, calculation and graphical display
  • Python
    • Why Python for Data Analysis?
    • Manipulating, processing, cleaning, and crunching data
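As a rough illustration of the "manipulating, processing, cleaning, and crunching" item above, the following minimal sketch uses only the Python standard library. The sample data, the column names and the helper functions `clean_rows` and `mean_by_city` are made up for this example and are not taken from the course material.

```python
import csv
import io

# Hypothetical raw input: stray whitespace, mixed casing, one missing reading.
RAW = """city,temperature
 Seoul ,21.5
busan,
SEOUL,23.5
Busan,19.0
"""


def clean_rows(text):
    """Parse CSV text, normalize city names, and drop rows with no reading."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        city = record["city"].strip().title()
        value = record["temperature"].strip()
        if not value:  # skip rows whose temperature is missing
            continue
        rows.append((city, float(value)))
    return rows


def mean_by_city(rows):
    """Group readings by city and return the average per city."""
    totals = {}
    for city, temp in rows:
        count, total = totals.get(city, (0, 0.0))
        totals[city] = (count + 1, total + temp)
    return {city: total / count for city, (count, total) in totals.items()}


cleaned = clean_rows(RAW)
averages = mean_by_city(cleaned)
print(averages)
```

In practice a library such as pandas would usually handle these steps; the plain-Python version is used here only to keep the sketch self-contained.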

Approaches to Data Analysis

  • Statistical Analysis
    • Time Series analysis
    • Forecasting with Correlation and Regression models
    • Inferential Statistics (estimating)
    • Descriptive Statistics in Big Data sets (e.g. calculating mean)
  • Machine Learning
    • Supervised vs unsupervised learning
    • Classification and clustering
    • Estimating cost of specific methods
    • Filtering
  • Natural Language Processing
    • Processing text
    • Understanding the meaning of the text
    • Automatic text generation
    • Sentiment analysis / topic analysis
  • Computer Vision
    • Acquiring, processing, analyzing, and understanding images
    • Reconstructing, interpreting and understanding 3D scenes
    • Using image data to make decisions
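To make the statistical items above (descriptive statistics, correlation, and forecasting with a regression model) more concrete, here is a minimal sketch in plain Python. The monthly sales figures are invented and deliberately perfectly linear, and the helper names `fit_line` and `pearson` are assumptions chosen for this illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


# Made-up data: month number and sales for that month.
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

average_sales = mean(sales)               # descriptive statistic
correlation = pearson(months, sales)      # strength of the linear relationship
slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept          # extrapolate to month 6
print(average_sales, correlation, forecast)
```

Real data is rarely this clean, so a forecast like this would normally be accompanied by an error estimate.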

Big Data Infrastructure

  • Data Storage
    • Relational databases (SQL)
      • MySQL
      • Postgres
      • Oracle
    • Non-relational databases (NoSQL)
      • Cassandra
      • MongoDB
      • Neo4j
    • Understanding the nuances
      • Hierarchical databases
      • Object-oriented databases
      • Document-oriented databases
      • Graph-oriented databases
      • Other
  • Distributed Processing
    • Hadoop
      • HDFS as a distributed filesystem
      • MapReduce for distributed processing
    • Spark
      • All-in-one in-memory cluster computing framework for large-scale data processing
      • Structured streaming
      • Spark SQL
      • Machine Learning libraries: MLlib
      • Graph processing with GraphX
  • Scalability
    • Public cloud
      • AWS, Google, Aliyun, etc.
    • Private cloud
      • OpenStack, Cloud Foundry, etc.
    • Auto-scalability
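The MapReduce model listed under Hadoop above can be sketched in a few lines of Python. This is a single-process illustration of the map, shuffle and reduce phases, not the Hadoop API: the function names and the two input splits are assumptions made for this example, and a real cluster would run the map and reduce steps in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce sequentially over all input splits."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Two hypothetical input splits, as if stored on different nodes.
splits = ["big data is big", "data is distributed"]
counts = word_count(splits)
print(counts)
```

Because each split is mapped independently and each key is reduced independently, both phases can be spread across machines, which is what makes the model suitable for distributed processing.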

Choosing the Right Solution for the Problem

The Future of Big Data

Summary and Conclusion
