ISBN-10: 1498742092
ISBN-13: 9781498742092
Pub. Date: 06/16/2017
Publisher: Taylor & Francis
Data Science and Analytics with Python / Edition 1

by Jesús Rogel-Salazar
Original price: $61.95

Overview

Data Science and Analytics with Python is designed for practitioners in data science and data analytics in both academic and business environments. The aim is to present the reader with the main concepts used in data science using tools developed in Python, such as scikit-learn, pandas, NumPy, and others. The use of Python is of particular interest, given its recent popularity in the data science community. The book can be used by seasoned programmers and newcomers alike.

The book is organized so that individual chapters are sufficiently independent of each other for the reader to use them comfortably as a reference. The book discusses what data science and analytics are, from the point of view of the process and the results obtained. Important features of Python are also covered, including a Python primer. The basic elements of machine learning, pattern recognition, and artificial intelligence that underpin the algorithms and implementations used in the rest of the book also appear in the first part of the book.

Regression analysis using Python, clustering techniques, and classification algorithms are covered in the second part of the book. Hierarchical clustering, decision trees, and ensemble techniques are also explored, along with dimensionality reduction techniques and recommendation systems. The support vector machine algorithm and the kernel trick are discussed in the last part of the book.

About the Author

Dr. Jesús Rogel-Salazar is a Lead Data Scientist with experience working for companies such as AKQA, IBM Data Science Studio, Dow Jones, and others. He is a visiting researcher at the Department of Physics at Imperial College London, UK, and a member of the School of Physics, Astronomy and Mathematics at the University of Hertfordshire, UK. He obtained his doctorate in physics at Imperial College London for work on quantum atom optics and ultra-cold matter. He has held a position as senior lecturer in mathematics as well as a consultant in the financial industry since 2006. He is the author of the book Essential MATLAB and Octave, also published by CRC Press. His interests include mathematical modelling, data science, and optimization in a wide range of applications including optics, quantum mechanics, data journalism, and finance.

Product Details

ISBN-13: 9781498742092
Publisher: Taylor & Francis
Publication date: 06/16/2017
Series: Chapman & Hall/CRC Data Mining and Knowledge Discovery Series
Pages: 400
Product dimensions: 6.00(w) x 9.00(h) x (d)

About the Author

Dr. Jesús Rogel-Salazar is a Lead Data Scientist at IBM Data Science Studio and visiting researcher at the Department of Physics at Imperial College London, UK. He is also a member of the School of Physics, Astronomy and Mathematics at the University of Hertfordshire, UK. He obtained his doctorate in Physics at Imperial College London for work on quantum atom optics and ultra-cold matter. He has held a position as senior lecturer in mathematics as well as a consultant and data scientist in the financial industry since 2006. He is the author of the book “Essential Matlab and Octave”, also published with CRC Press. His interests include mathematical modelling, data science and optimisation in a wide range of applications including optics, quantum mechanics, data journalism and finance.

Table of Contents

The Trials and Tribulations of a Data Scientist
Data? Science? Data Science!
The Data Scientist: A Modern Jackalope
Data Science Tools
From Data to Insight: the Data Science Workflow

Python: For Something Completely Different
Why Python? Why not?!
First Slithers with Python
Control Flow
Computation and Data Manipulation
Pandas to the rescue
Plotting and visualising: Matplotlib

The Machine that Goes "Ping": Machine Learning and Pattern Recognition
Recognising Patterns
Artificial Intelligence and Machine Learning
Data is good, but other things are also needed
Learning, Predicting and Classifying
Machine Learning and Data Science
Feature selection
Bias, Variance and Regularisation: A Balancing Act
Some Useful Measures: Distance and Similarity
Beware the Curse of Dimensionality
Scikit-learn is our Friend
Training and Testing
Cross-validation

The Relationship Conundrum: Regression
Relationships between variables: Regression
Multivariate Linear Regression
Ordinary Least Squares
Brain and Body: Regression with one variable
Logarithmic transformation
Making the Task Easier: Standardisation and Scaling
Polynomial Regression
Variance-Bias Trade-Off
Shrinkage: LASSO and Ridge

Jackalopes and Hares: Clustering
Clustering
Clustering with k-means
Summary

Unicorns and Horses: Classification
Classification
Classification with KNN
Classification with Logistic Regression
Classification with Naïve Bayes

Decisions, Decisions: Hierarchical Clustering, Decision Trees and Ensemble Techniques
Hierarchical Clustering
Decision Trees
Ensemble Techniques
Ensemble Techniques in Action

Less is More: Dimensionality Reduction
Dimensionality Reduction
Principal Component Analysis
Singular Value Decomposition
Recommendation Systems

Kernel Tricks under the Sleeve: Support Vector Machines
Support Vector Machines and Kernel Methods

Pipelines in Scikit-learn
