EbookNice.com

Most ebook files are in PDF format, so you can easily read them using various software such as Foxit Reader or directly on the Google Chrome browser.
Some ebook files are released by publishers in other formats such as .azw, .mobi, .epub, and .fb2. You may need to install specific software, such as Calibre, to read these formats on mobile or PC.

Please read the tutorial at this link: https://ebooknice.com/page/post?id=faq

We offer FREE conversion to the popular formats you request; however, this may take some time. Therefore, right after payment, please email us, and we will try to provide the service as quickly as possible.

For some exceptional file formats or broken links (if any), please refrain from opening any disputes. Instead, email us first, and we will try to assist within a maximum of 6 hours.

EbookNice Team

Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools by David Mertz ISBN 9781801071291, 1801071292 instant download

SKU: EBN-23849894
Price: 32 USD (list price 40 USD, 20% off)
Availability: In stock
Rating: 4.67 (8 reviews)

Instant download of the eBook Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools after payment.

Authors: David Mertz

Pages: 492 pages

Year: 2021

Publisher: Packt Publishing

Language: English

File Size: 6.62 MB

Format: PDF

ISBNs: 9781801071291, 1801071292

Categories: Ebooks

Product description

A comprehensive guide for data scientists to master effective data cleaning tools and techniques

Key Features

Master data cleaning techniques in a language-agnostic manner
Learn from intriguing hands-on examples from numerous domains, such as biology, weather data, demographics, physics, time series, and image processing
Work with detailed, commented, well-tested code samples in Python and R

Book Description

It is something of a truism in data science, data analysis, or machine learning that most of the effort needed to achieve your actual purpose lies in cleaning your data. Written in David’s signature friendly and humorous style, this book discusses in detail the essential steps performed in every production data science or data analysis pipeline and prepares you for data visualization and modeling results.

The book dives into the practical application of tools and techniques needed for data ingestion, anomaly detection, value imputation, and feature engineering. It also offers long-form exercises at the end of each chapter to practice the skills acquired.

You will begin by ingesting data from formats such as JSON, CSV, SQL RDBMSes, HDF5, NoSQL databases, image files, and binary serialized data structures. The book also provides numerous example data sets and data files, which are available for download and independent exploration.

Moving on from formats, you will impute missing values, detect unreliable data and statistical anomalies, and generate synthetic features that are necessary for successful data analysis and visualization goals.

By the end of this book, you will have acquired a firm understanding of the data cleaning process necessary to perform real-world data science and machine learning tasks.

What you will learn

Identify problem data pertaining to individual data points
Detect problem data in the systematic “shape” of the data
Remediate data integrity and hygiene problems
Prepare data for analytic and machine learning tasks
Impute values into missing or unreliable data
Generate synthetic features that are more amenable to data science, data analysis, or visualization goals

Who This Book Is For

This book is designed to benefit software developers, data scientists, aspiring data scientists, and students who are interested in data analysis or scientific computing.

Basic familiarity with statistics, general concepts in machine learning, knowledge of a programming language (Python or R), and some exposure to data science are helpful. A glossary, references, and friendly asides should help bring all readers up to speed.

The text will also be helpful to intermediate and advanced data scientists who want to improve their rigor in data hygiene and wish for a refresher on data preparation issues.

*Free conversion into popular formats such as PDF, DOCX, DOC, AZW, EPUB, and MOBI after payment.

Related Products

(Ebook) Python Data Science Handbook: Essential Tools for Working with Data by Jake VanderPlas ISBN 9781491912058, 1491912057

(Ebook) Python Data Science Handbook: Essential Tools for Working with Data, 2nd Edition by Jake VanderPlas ISBN 9781098121228, 1098121228

(Ebook) Python Data Science Handbook: Essential Tools for Working with Data by Jake VanderPlas ISBN 9781371401412, 9781491912058, 1371401411, 1491912057

(Ebook) Data Science at the Command Line: Obtain, Scrub, Explore, and Model Data with Unix Power Tools by Jeroen Janssens ISBN 9781492087915, 1492087912

(Ebook) Python Data Science Handbook: Essential Tools for Working with Data, 2nd Edition by Jake VanderPlas ISBN 9781098121228, 9781098121198, 1098121228, 1098121198

Python Data Cleaning Cookbook: Modern techniques and Python tools to detect and remove dirty data and extract key insights by Michael Walker ISBN 9781800565661, 1800565666 instant download

(Ebook) Data Science Fundamentals with R, Python, and Open Data (for True Epub) by Marco Cremonini ISBN 9781394213269, 1394213263

(Ebook) Data Science at the Command Line by Jeroen Janssens ISBN 9781491947852, 1491947853

Effective Shell: A Practical User’s Guide to Working Smarter on the Command Line by Dave Kerr instant download
