Introductory R for Biologists Training Course

Course Code

rintrob

Duration

28 hours (usually 4 days including breaks)

Overview

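R is a free, open-source programming language for statistical computing, data analysis and graphics. A growing number of managers and data analysts in business and academia use it. R has also found a following among statisticians, engineers and scientists without computer programming skills, who find it easy to use. Its popularity stems from the increasing use of data mining for goals as varied as setting ad prices, finding new drugs more quickly and fine-tuning financial models. R has a wide variety of packages for data mining.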

Course Outline

I. Introduction and preliminaries

1. Overview

  • Making R more friendly, R and available GUIs
  • RStudio
  • Related software and documentation
  • R and statistics
  • Using R interactively
  • An introductory session
  • Getting help with functions and features
  • R commands, case sensitivity, etc.
  • Recall and correction of previous commands
  • Executing commands from or diverting output to a file
  • Data permanency and removing objects
  • Good programming practice: self-contained scripts and good readability, e.g. structured scripts, documentation, markdown
  • Installing packages; CRAN and Bioconductor

2. Reading data

  • Text files (read.delim)
  • CSV files

3. Simple manipulations: numbers, vectors and arrays

  • Vectors and assignment
  • Vector arithmetic
  • Generating regular sequences
  • Logical vectors
  • Missing values
  • Character vectors
  • Index vectors; selecting and modifying subsets of a data set
  • Arrays
  • Array indexing. Subsections of an array
  • Index matrices
  • The array() function and simple operations on arrays, e.g. multiplication and transposition
  • Other types of objects

4. Lists and data frames

  • Lists
  • Constructing and modifying lists
    • Concatenating lists
  • Data frames
    • Making data frames
    • Working with data frames
    • Attaching arbitrary lists
    • Managing the search path

5. Data manipulation

  • Selecting and subsetting observations and variables
  • Filtering, grouping
  • Recoding, transformations
  • Aggregation, combining data sets
  • Forming partitioned matrices, cbind() and rbind()
  • The concatenation function, c(), with arrays
  • Character manipulation, stringr package
  • Short introduction to grep and regexpr

6. More on reading data

  • XLS, XLSX files
  • readr and readxl packages
  • Data in SPSS, SAS, Stata and other formats
  • Exporting data to txt, csv and other formats

7. Grouping, loops and conditional execution

  • Grouped expressions
  • Control statements
  • Conditional execution: if statements
  • Repetitive execution: for loops, repeat and while
  • Introduction to apply, lapply, sapply, tapply

8. Functions

  • Creating functions
  • Optional arguments and default values
  • Variable number of arguments
  • Scope and its consequences

9. Simple graphics in R

  • Creating a Graph
  • Density Plots
  • Dot Plots
  • Bar Plots
  • Line Charts
  • Pie Charts
  • Boxplots
  • Scatter Plots
  • Combining Plots

II. Statistical analysis in R

1. Probability distributions

  • R as a set of statistical tables
  • Examining the distribution of a set of data

2. Testing of Hypotheses

  • Tests about a Population Mean
  • Likelihood Ratio Test
  • One- and two-sample tests
  • Chi-Square Goodness-of-Fit Test
  • Kolmogorov-Smirnov One-Sample Statistic 
  • Wilcoxon Signed-Rank Test
  • Two-Sample Test
  • Wilcoxon Rank Sum Test
  • Mann-Whitney Test
  • Kolmogorov-Smirnov Test

3. Multiple Testing of Hypotheses

  • Type I Error and FDR
  • ROC curves and AUC
  • Multiple Testing Procedures (BH, Bonferroni etc.)

4. Linear regression models

  • Generic functions for extracting model information
  • Updating fitted models
  • Generalized linear models
    • Families
    • The glm() function
  • Classification
    • Logistic Regression
    • Linear Discriminant Analysis
  • Unsupervised learning
    • Principal Components Analysis
    • Clustering Methods (k-means, hierarchical clustering, k-medoids)

5. Survival analysis (survival package)

  • Survival objects in R
  • Kaplan-Meier estimate, log-rank test, parametric regression
  • Confidence bands
  • Censored (interval censored) data analysis
  • Cox PH models, constant covariates
  • Cox PH models, time-dependent covariates
  • Simulation: Model comparison (Comparing regression models)

6. Analysis of Variance

  • One-Way ANOVA
  • Two-Way Classification of ANOVA
  • MANOVA

III. Worked problems in bioinformatics

  • Short introduction to the limma package
  • Microarray data analysis workflow
  • Data download from GEO: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397
  • Data processing (QC, normalisation, differential expression)
  • Volcano plot
  • Clustering examples and heatmaps
