Data Science and Machine Learning – Placement-Oriented Course

Course Structure:

UNIT 1 | 10 CLASSES | Introduction to Data Management
UNIT 1.1: Programming in R – Foundation Course
• Introduction to R Programming
• Installing R and RStudio
• Basic Operations in R
• Vectors
• Factors
• Matrices
• Data Frames
• Lists
UNIT 1.2: Intermediate Course in R
• Introduction
• Logical and Relational Operators
• Conditional Statements
• Loops
• Functions
• Apply Family Functions

UNIT 2 | 10 CLASSES | Data Analysis in Excel
UNIT 2.1: Data Analysis in Excel – I
• Introduction
• Understanding the Excel Interface
• Slicing and Dicing Data – Sort and Filter
• Report Making II: Conditional Formatting
• Report Making III: Advanced Formatting
• Printing and Page Layout
• Passwords and Naming Files
• Delimited Files
• Discovering Shortcuts
• Practice Assignment
UNIT 2.2: Data Analysis in Excel – II
• Formulae in Excel
• Complex Functions
• Cell Referencing and Text Functions
• Logical Formulae
• Creating & Formatting Charts in Excel
• Choosing the Right Type of Chart
• Creating a Pivot Table
• Analyzing Data in a Pivot Table
• Filtering Data in a Pivot Table
• VLOOKUP – Linking Data from Multiple Files & Tables
• Common Errors in Excel

UNIT 3 | 10 CLASSES | Data Analysis Using SQL
UNIT 3.1: Data Analysis Using SQL
• Introduction
• Writing Basic SQL SELECT Statements
• Restricting and Sorting Data
• SQL Functions
• Displaying Data from Multiple Tables
• Subqueries
• Constraints

UNIT 4 | 10 CLASSES | Data Warehouse
UNIT 4.1: Basic Course in Data Warehouse
• Introduction to Data Warehouse
• Defining Data Warehouse
• OLAP vs OLTP
• Structure of Data Warehouse
• Star Schema
• Star Schema – Demonstration
• ETL Operations
• Data Warehouse Schema – Industry Implementation
• Introduction to Business Intelligence
• Data Cubes
• OLAP Operations
UNIT 4.2: Exercises in Data Warehouse
• Case Study
• Merging
• Operations in R
• Coding Exercise 1
• Coding Exercise 2
• Exporting Files from R to Excel
• Operations in Excel

UNIT 5 | 10 CLASSES | Problem Solving Methodology
UNIT 5.1: CRISP-DM Framework – Business and Data Understanding
• Introduction
• Define the Business Problem – Business Understanding
• Understanding the Problem Statement
• Understanding Raw Data
• Preparing Data for Analysis
• Data Modeling
• Model Evaluation and Deployment
• Getting Local Data
• Data from SQL
• JSON

UNIT 5.2: Manipulating Data
• Introduction
• Wide and Long Data
• Wide to Long Format
• tidyr – Some More Functions
• Manipulating Data with dplyr
• swirl – Manipulating Strings with stringr

UNIT 6 | 15 CLASSES | Statistics and Exploratory Data Analysis
UNIT 6.1: Data Visualisation I
• Introduction to Data Visualisation
• Visualisation with Examples
• Visualisations – The World of Imagery
• Understanding Basic Chart Types I
• Understanding Basic Chart Types II
• Visualisation in R – Using the Base Package
• Basic Plotting in R
• Histogram and Box Plots Using the Base Package
• The ggplot2 Package
• Creating Visualisations Using ggplot()
• Scatter Plots in ggplot

UNIT 6.2: Data Visualisation II
• Data Visualisation in R – Using ggplot2
• Plotting Larger Data Sets
• Bar Charts in ggplot()
• Factors Affecting Visualisation
• Jitter
• ggplot – Histogram and Bar Chart
• Working on Time Series Data
• Practice Exercises

UNIT 6.3: Visualisation Using Tableau: Data Exploration in Tableau
• Introduction
• Data Formats and Tableau Interface
• Connecting to the Data
• Data Preparation in Tableau
• Hierarchies and Drill Down
• Visualising and Analysing Data in Tableau – I
• Bar Charts
• Scatter Plots and Pie Charts
• Tree Maps
• Dual Axes Charts
• Visualising and Analysing Data in Tableau – II
• Histograms
• Box Plots
• Area Maps
• Calculations in Tableau
• Dashboards and Stories
• Practice Questions

UNIT 7 | 10 CLASSES | Inferential Statistics
UNIT 7.1: Basics of Probability
• Introduction to Inferential Statistics
• Introduction to Basics of Probability
• Random Variables
• Probability Distributions – I
• Probability Distributions – II
• Expected Value – I
• Expected Value – II
• Discrete Probability Distribution
• Introduction to Discrete Probability
• Probability without Experiment – I
• Probability without Experiment – II
• Binomial Distribution (Examples)
• Cumulative Probability
• Practice Questions
UNIT 7.2: Continuous Probability Distribution
• Introduction: Continuous Probability Distribution
• Probability Density Function – I
• Probability Density Function – II
• Normal Distribution
• Standard Normal Distribution
• Central Limit Theorem
• Introduction to Central Limit Theorem
• Samples
• Sampling Distribution
• Properties of Sampling Distribution
• Sampling Distributions – R Simulation
• Central Limit Theorem
• Estimating Mean Using CLT
• Confidence Interval – Example
• Practice Questions on Central Limit Theorem

UNIT 8 | 10 CLASSES | Hypothesis Testing
UNIT 8.1: Concepts of Hypothesis Testing
• Introduction
• Understanding Hypothesis Testing
• Null and Alternate Hypotheses
• Making a Decision
• Critical Value Method
• Critical Value Method – Examples
• p-Value Method
• p-Value Method – Examples
• Types of Errors
UNIT 8.2: Industry Demonstration of Hypothesis Testing
• Introduction
• T Distribution
• Two-Sample Mean Test
• Two-Sample Proportion Test
• A/B Testing Demonstration
• Industry Relevance
• Hypothesis Testing in R

UNIT 9 | 15 CLASSES | Exploratory Data Analysis
UNIT 9.1: Data Sourcing
• Introduction to EDA
• Public and Private Data
• Public Data
• Private Data
• Public Data Exercise
Data Cleaning
• Introduction
• Election Data: Case Study
• Fixing Rows and Columns
• Missing Values
• Standardizing Values
• Invalid Values
• Filtering Data
• Data Cleaning Practice Questions

UNIT 9.2: Univariate Analysis and Segmented Univariate
Univariate Analysis
• Introduction
• Data Description
• Unordered Categorical Variables – Univariate Analysis
• Ordered Categorical Variables – Univariate Analysis
• Quantitative Variables – Univariate Analysis
• Quantitative Variables – Summary Metrics
• Practice Questions

Segmented Univariate
• Introduction to Segmented Univariate Analysis
• Basis of Segmentation
• Quick Way of Segmentation
• Comparison of Averages
• Comparison of Other Metrics

UNIT 9.3: Bivariate Analysis and Derived Metrics
Bivariate Analysis
• Introduction to Bivariate Analysis
• Bivariate Analysis on Continuous Variables
• Industry Insights on Correlation
• Bivariate Analysis on Categorical Variables

Derived Metrics
• Introduction to Derived Metrics
• Types of Derived Metrics: Type Driven Metrics
• Types of Derived Metrics: Business Driven Metrics
• Types of Derived Metrics: Data Driven Metrics
• Practice Questions

UNIT 10 | 10 CLASSES | Predictive Analytics
UNIT 10.1: Linear Regression
Linear Regression: Simple Linear Regression
• Course Overview
• Introduction to Machine Learning
• Regression Line
• Best Fit Line
• Strength of Simple Linear Regression
• Simple Linear Regression in R
• Correlation & R²
Multiple Linear Regression
• Introduction
• Multiple Linear Regression
• Modelling in R
• Housing Case Study
• Dummy Variables
• Understanding the Data
• Multicollinearity
• Building Predictive Models
• Variable Selection Method
• Exercises

Linear Regression Industry Demo
• Linear Regression: Revision
• Prediction vs Projection
• Case Study – I
• Exploratory Data Analysis
• Predictive Modeling – I
• Predictive Modeling – II
• Predictive Modeling – III
• Assessing the Model
• Interpreting the Results