The Best Free Data Science eBooks: 2020 Update

We are in an ever-advancing industry, and learning resources are unlimited.

Last year I compiled a list of ebooks that helped me on my data science learning path, many of them recommended by mentors and professors for tackling specific projects or deepening my understanding of key concepts.

As I spent time deepening my learning, I discovered new books that I hadn’t recommended before and found updated editions of the books I had already recommended. All the ebooks are legally free or offered on a "pay what you want" basis with a $0 minimum.

If you enjoyed a book and you have the resources to do so, I suggest that you look for a way to support the author by buying the printed version, supporting them on Patreon, or buying them a coffee.

Let’s keep quality education content available for the masses.

Disclaimer: Python and sometimes R are my go-to programming languages, which is why most of the books are based on them. If you have recommendations for books in other languages, please share them in the comments or send me a tweet and I will add them.

Probability and Statistics

  • OpenIntro Statistics

Description: A complete foundation for statistics that also serves as a foundation for data science. OpenIntro Statistics offers a traditional college-level introduction to statistics. The textbook is widely used and gives students from community colleges to the Ivy League an exceptional and accessible introduction.

  • Introduction to Probability (2019), the official book for Harvard’s Stats 110, by Joseph K. Blitzstein and Jessica Hwang

Description: This book provides essential language and tools for understanding statistics, randomness, and uncertainty. The book explores a wide variety of applications and examples, ranging from coincidences and paradoxes to Google PageRank and Markov chain Monte Carlo (MCMC). Additional application areas explored include genetics, medicine, computer science, and information theory.

The authors present the material in an accessible style and motivate concepts using real-world examples. Be prepared: it is a big book!

probabilitybook.net

Image by probabilitybook.net

Also, check out their great probability cheat sheet here:

  • Probabilistic Programming & Bayesian Methods for Hackers (2020) by Cam Davidson-Pilon

Description: Bayesian Methods for Hackers is designed as an introduction to Bayesian inference from a computational/understanding-first, and mathematics-second, point of view. Of course, as an introductory book, we can only leave it at that: an introductory book. For the mathematically trained, they may cure the curiosity this text generates with other texts designed with mathematical analysis in mind. For the enthusiast with a less mathematical background or one who is not interested in mathematics but simply the practice of Bayesian methods, this text should be sufficient and entertaining.
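
To give a feel for the computation-first approach, here is a minimal sketch (not taken from the book, which works with probabilistic programming libraries) that approximates a posterior on a grid in plain Python. The coin-flip counts, the flat prior, and the function name `grid_posterior` are assumptions made purely for illustration.

```python
def grid_posterior(heads, tails, grid_size=101):
    """Approximate the posterior over a coin's bias on an evenly spaced grid.

    Assumes a flat prior, so the posterior is proportional to the likelihood.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalized posterior: binomial likelihood times a constant prior.
    weights = [p ** heads * (1 - p) ** tails for p in grid]
    total = sum(weights)
    return grid, [w / total for w in weights]


# Hypothetical data: 7 heads and 3 tails.
grid, posterior = grid_posterior(heads=7, tails=3)

# The grid point with the highest posterior probability.
best = max(zip(posterior, grid))[1]
print(f"most probable bias on the grid: {best:.2f}")
```

No calculus is needed: the posterior is obtained by evaluating, normalizing, and inspecting numbers, which is the spirit of understanding first and mathematics second.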

Check out their amazing TensorFlow-based GitHub repo here:

  • Practical Statistics for Data Scientists (2017) by Peter Bruce and Andrew Bruce

Description: This book is aimed at the data scientist with some familiarity with the R programming language and with some prior (perhaps spotty or ephemeral) exposure to statistics. Both of us came to the world of data science from the world of statistics, so we have some appreciation of the contribution that statistics can make to the art of data science. At the same time, we are well aware of the limitations of traditional statistics instruction: statistics as a discipline is a century and a half old, and most statistics textbooks and courses are laden with the momentum and inertia of an ocean liner.

Programming

  • R Programming for Data Science by Roger D. Peng

Description: This book brings the fundamentals of R programming to you, using the same material developed as part of the industry-leading Johns Hopkins Data Science Specialization. The skills taught in this book will lay the foundation for you to begin your journey learning data science.

  • Exploratory Data Analysis with R by Roger D. Peng

Description: This book teaches you to use R to effectively visualize and explore complex datasets. Exploratory data analysis is a key part of the data science process because it allows you to sharpen your question and refine your modeling strategies. This book is based on the industry-leading Johns Hopkins Data Science Specialization.

  • Data Science at the Command Line (2020) by Jeroen Janssens

Description: This book shows you how to:

  • Obtain data from websites, APIs, databases, and spreadsheets

  • Perform scrub operations on text, CSV, HTML/XML, and JSON

  • Explore data, compute descriptive statistics, and create visualizations

  • Manage your data science workflow

  • Create reusable command-line tools from one-liners and existing Python or R code

  • Parallelize and distribute data-intensive pipelines

  • Model data with dimensionality reduction, clustering, regression, and classification algorithms

  • Python 101 (updated 2019) by Michael Driscoll

Description: Learn how to program with Python 3 from beginning to end. Python 101 starts off with the fundamentals of Python and then builds onto what you’ve learned from there. The audience of this book is primarily people who have programmed in the past but want to learn Python. This book covers a fair amount of intermediate level material in addition to the beginner material.

https://python101.pythonlibrary.org/

  • Natural Language Processing with Python by Steven Bird, Ewan Klein, and Edward Loper

Description: This book is a practical introduction to NLP. You will learn by example, write real programs, and grasp the value of being able to test an idea through implementation. If you haven’t learned already, this book will teach you programming. Unlike other programming books, we provide extensive illustrations and exercises from NLP. The approach we have taken is also principled, in that we cover the theoretical underpinnings and don’t shy away from careful linguistic and computational analysis. We have tried to be pragmatic in striking a balance between theory and application, identifying the connections and the tensions. Finally, we recognize that you won’t get through this unless it is also pleasurable, so we have tried to include many applications and examples that are interesting and entertaining, sometimes whimsical.

  • Mining of Massive Datasets (2019) by Jure Leskovec (Stanford University), Anand Rajaraman (Rocketship Ventures), and Jeffrey D. Ullman (Stanford University)

Description: This book focuses on data mining of very large amounts of data, that is, data so large it does not fit in main memory. Because of the emphasis on size, many of their examples are about the Web or data derived from the Web. Further, the book takes an algorithmic point of view: data mining is about applying algorithms to data, rather than using data to “train” a machine-learning engine of some sort.
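
As a small illustration of working with data that does not fit in main memory, the sketch below computes a mean and variance in a single pass over a stream while holding only three numbers in memory (Welford's method). It shows the general one-pass idea and is not an algorithm reproduced from the book; the function name and the synthetic stream are assumptions.

```python
def running_mean_variance(values):
    """Single pass over any iterable, storing only a count and two accumulators."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        # Uses the updated mean, which keeps the running sum of squares stable.
        m2 += delta * (x - mean)
    # Sample variance; defined as 0.0 here when fewer than two values are seen.
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, variance


# A generator stands in for a dataset too large to load at once (hypothetical data).
stream = (i % 10 for i in range(1_000_000))
count, mean, variance = running_mean_variance(stream)
print(count, mean, variance)
```

Because the function only iterates, the same code would work on lines read lazily from a file or records pulled from a network source.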

  • Machine Learning Yearning (2016) by Andrew Ng

Description: AI is transforming numerous industries. Machine Learning Yearning teaches you how to structure machine learning projects.

This book is focused not on teaching you ML algorithms, but on how to make ML algorithms work. After reading Machine Learning Yearning, you will be able to:

  • Prioritize the most promising directions for an AI project

  • Diagnose errors in a machine learning system

  • Build ML in complex settings, such as mismatched training/test sets

  • Set up an ML project to compare to and/or surpass human-level performance

  • Know when and how to apply end-to-end learning, transfer learning, and multi-task learning.

Leading a Data Science Team

  • Executive Data Science (2018) by Brian Caffo, Roger D. Peng, and Jeffrey Leek

Description: This book teaches you how to assemble and lead a data science enterprise so that your organization can move towards extracting information from big data.

Is there another ebook that MUST be on this list? Share it with me in the comments or send me a tweet: https://twitter.com/brendahali

Source: https://towardsdatascience.com/the-best-free-data-science-ebooks-2020-update-dac5e170a478
