Hands-On Data Science with R
Vitor Bianchi Lanzetta, Nataraj Dasgupta, Ricardo Anjoleto Farias
Last updated: 2021-06-10 19:13:14
Latest chapter: Leave a review - let other readers know what you think
Cover page
Title Page
About Packt
Why subscribe?
Packt.com
Contributors
About the authors
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Getting Started with Data Science and R
Introduction to data science
Key components of data science
Computer science
Predictive analytics (machine learning)
Domain knowledge
Active domains of data science
Finance
Healthcare
Pharmaceuticals
Government
Manufacturing and retail
Web industry
Other industries
Solving problems with data science
Using R for data science
Key features of R
Our first R program
UN development index
Summary
Quiz
Descriptive and Inferential Statistics
Measures of central tendency and dispersion
Measures of central tendency
Calculating mean, median, and mode with base R
Measures of dispersion
Useful functions to draw automated summaries
Statistical hypothesis testing
Running t-tests with R
Decision rule – a brief overview of the p-value approach
Be careful
Running z-tests with R
Elaborating a little longer
A/B testing – a brief introduction and a practical example with R
Summary
Quiz
Data Wrangling with R
Introduction to data wrangling with R
Data types, formats, and sources
Data extraction, transformation, and load
Basic tools of data wrangling
Using base R for data manipulation and analysis
Applying families of functions
Aggregation functions
Merging DataFrames
Using tibble and dplyr for data manipulation
Basic dplyr usage
Using select
Filtering with filter
Using arrange for sorting
Summarise
Sampling data
The tidyr package
Converting wide tables into long tables
Converting long tables into wide tables
Joining tables
dbplyr – databases and dplyr
Using data.table for data manipulation
Grouping operations
Adding a column
Ordering columns
What is the advantage of searching using key by?
Creating new columns in data.table
Deleting a column
Pivots on data.table
The melt functionality
Reading and writing files with data.table
A special note on dates and/or time
Miscellaneous topics
Checking data quality
Reading other file formats – Excel, SAS, and other data sources
On-disk formats
Working with web data
Web APIs
Tutorial – looking at airline flight times data
Summary
Quiz
KDD, Data Mining, and Text Mining
Good practices of KDD and data mining
Stages of KDD
Scraping a dwarf name
Retrieving text from the web
Legality of web scraping
Web scraping made easy with rvest
Retrieving tweets from R community
Creating your Twitter application
Fetching the number of tweets
Cleaning and transforming data
Looking for patterns – peeking, visualizing, and clustering data
Peeking data
Visualizing data
Cluster analysis
Summary
Quiz
Data Analysis with R
Preparing data for analysis
Data categories
Data types in R
Reading data
Managing data issues
Mixed data types
Missing data
Handling strings and dates
Handling dates using POSIXct or POSIXlt
Handling strings in R
Reading data
Combining strings
Simple pattern matching and replacement with R
Printing results
Data visualisation
Types of charts – basic primer
Histograms
Line plots
Scatter plots
Boxplots
Bar charts
Heatmaps
Summarizing data
Saving analysis for future work
Packrat
Checkpoint
Rocker
Summary
Quiz
Machine Learning with R
What is machine learning?
Machine learning everywhere
Machine learning vocabulary
Generic problems solved by machine learning
Linear regression with R
Tricks for lm
Tree models
Strengths and weaknesses
The Chilean plebiscite data
Starting with decision trees
Growing trees with tree and rpart
Random forests – a collection of trees
Support vector machines
What about regressions?
Hierarchical and k-means clustering
Neural networks
Introduction to feedforward neural networks with R
Summary
Quiz
Forecasting and ML App with R
The UI and server
Forecasting machine learning application
Application details
Summary
Quiz
Neural Networks and Deep Learning
Daily neural nets
Overview – NNs and deep learning
Neuroscience inspiration
ANN nodes
Activation functions
Layers
Training algorithms
NNs with Keras
Getting things ready for Keras
Getting practical with Keras
Further tips
Summary
Quiz
Markovian in R
Markovian-type models
Markovian models – real-world applications
The Markov chain
Programming an HMM with R
Summary
Quiz
Visualizing Data
Retrieving and cleaning data
Crafting visualizations
Summary
Quiz
Going to Production with R
What is R Shiny?
How to build a Shiny app
Building an application inside R
The reactive and isolate functions
The observeEvent and eventReactive functions
Approach for creating a data product from statistical modeling and web UI
Some advice about Shiny
Summary
Quiz
Large Scale Data Analytics with Hadoop
Installing the package and Spark
Manipulating Spark data using both dplyr and SQL
Filtering and aggregating Spark datasets
Using Spark machine learning or H2O Sparkling Water
Providing interfaces to Spark packages
Spark DataFrames within the RStudio IDE
Summary
Quiz
R on Cloud
Cloud computing
Cloud types
Things to look for
Why Azure?
Azure registration
Azure Machine Learning Studio
How modules work
Building an experiment that uses R
Summary
Quiz
The Road Ahead
Growing your skills
Gathering data
Content to stay tuned to
Meeting Stack Overflow
Other Books You May Enjoy
Leave a review - let other readers know what you think