Module 1 - Basic Analytics
1. Statistics Basics
- Introduction to Data Analytics and Statistical Techniques
- Types of Variables, Measures of Central Tendency and Dispersion
- Variable Distributions and Probability Distributions
- Normal Distribution and Properties
- Central Limit Theorem and Application
2. Hypothesis Testing
- Null/Alternative Hypothesis formulation
- One-Sample and Two-Sample (Paired and Independent) t/z Tests
- P-Value Interpretation
- Analysis of Variance (ANOVA)
- Chi-Square Test
- Non-Parametric Tests (Kruskal-Wallis, Mann-Whitney, Kolmogorov-Smirnov)
3. Multivariate Regression
- Introduction to Correlation - Karl Pearson and Graphical Methods
- Spearman Rank Correlation
- OLS Regression - Simple and Multiple
Module 2
1. Logistic Regression
- Non-Linear Regression using Link Functions
- Logit Link Function
- Binomial Propensity Modeling
- Training-Validation approach
- ROC-AUC, Lift charts, Decile Analysis
2. Factor Analysis
- Introduction to Factor Analysis - PCA
- KMO-MSA Tests, Eigenvalue Interpretation
- Factor Rotation and Extraction
3. Cluster Analysis
- Introduction to Cluster Techniques
- Distance Methodologies
- Hierarchical and Non-Hierarchical Procedures
- K-Means clustering
- Ward's Method
Module 3 - Time Series Analysis
1. Introduction and Exponential Smoothing
- Introduction to Time Series Data and Analysis
- Decomposition of Time Series
- Trend and Seasonality detection and forecasting
- Exponential Smoothing (Single, Double and Triple)
2. ARIMA Modeling
- Box-Jenkins Methodology
- Introduction to Auto Regression and Moving Averages, ACF, PACF
- Detecting order of ARIMA processes
- Seasonal ARIMA Models (p,d,q)(P,D,Q)
- Introduction to Multivariate ARIMA
Module 4 - Advanced Data Mining
1. Introduction to R/Rattle Environment
- R-Rattle GUI Familiarization
- Rattle Tabs
- Data Import and Variable role setting
- Data Exploration and Visualization, Hypothesis Testing
- Data Manipulation, Standardization, Missing value Treatment
2. Statistical Analysis & Data Mining/Machine Learning
- Cluster Analysis using R-Rattle
- Association Rule Mining
- Predictive Modeling using Decision Trees, Random Forests, Adaptive Boosting and Logistic Regression
3. Evaluating & Deploying Models
- Evaluating Model Performance on Training and Validation Data
- ROC, Sensitivity, Specificity, Lift charts, Error Matrix
- Deploying models using Score options
- Opening and Saving models using Rattle