Andrew Ng Machine Learning Notes (1): Getting to Know Machine Learning


These notes are based on Andrew Ng's Machine Learning course.

1. What is Machine Learning?

Two definitions of Machine Learning are offered:

  • Arthur Samuel described it as:
    "the field of study that gives computers the ability to learn without being explicitly programmed."

  • Tom Mitchell provides a more modern definition:
    “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

    • E.g.:
      playing checkers.
      E = the experience of playing many games of checkers
      T = the task of playing checkers.
      P = the probability that the program will win the next game.

[Figure: machine learning algorithms, divided into supervised learning and unsupervised learning]

2. Supervised learning

In supervised learning, we are given a data set and already know what the correct output should look like, on the assumption that there is a relationship between the input and the output.

Regression problem

In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function.

  • E.g.:
    Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output.
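
The house-price example can be sketched as a one-variable least-squares fit. This is only an illustration: the sizes and prices are made-up numbers, and the names `fit_line` and `predict_price` are hypothetical helpers, not anything taken from the course.

```python
# Hypothetical training data (made up for illustration):
# house size in square feet -> price in thousands of dollars.
sizes = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [210.0, 290.0, 410.0, 490.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(size, slope, intercept):
    """Map an input size to a continuous output (the predicted price)."""
    return slope * size + intercept


slope, intercept = fit_line(sizes, prices)
predicted = predict_price(1750.0, slope, intercept)
print(f"price = {slope:.3f} * size + {intercept:.1f}")
print(f"predicted price for 1750 sq ft: {predicted:.1f}")
```

The output is a real number that can take any value along the fitted line, which is what makes this a regression problem rather than a classification problem.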

Classification problem

In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

  • E.g.:
    Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.
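
A minimal sketch of the tumor example follows, assuming a single feature (tumor size) and a nearest-class-mean rule. Both the data and the rule are assumptions chosen for brevity; this is not the classifier taught in the course, and `fit_class_means` and `classify` are hypothetical names.

```python
# Hypothetical labeled examples (made up for illustration):
# (tumor size in cm, label), where 0 = benign and 1 = malignant.
training = [(1.0, 0), (1.5, 0), (2.0, 0), (3.5, 1), (4.0, 1), (4.5, 1)]


def fit_class_means(examples):
    """Compute the mean feature value of each class from labeled examples."""
    sums = {}
    counts = {}
    for size, label in examples:
        sums[label] = sums.get(label, 0.0) + size
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(size, class_means):
    """Return the label whose class mean is closest to the given size."""
    return min(class_means, key=lambda label: abs(size - class_means[label]))


class_means = fit_class_means(training)
for size in (1.2, 3.8):
    label = classify(size, class_means)
    print(size, "->", "malignant" if label == 1 else "benign")
```

Unlike the regression sketch, the output here is one of a fixed set of categories (0 or 1), not a value on a continuous scale.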

3. Unsupervised learning

Unsupervised learning allows us to approach problems with little or no idea what our results should look like.

  • We can derive structure from data where we don’t necessarily know the effect of the variables.
  • We can derive this structure by clustering the data based on relationships among the variables in the data.
  • With unsupervised learning there is no feedback based on the prediction results.
    • E.g.:
      Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on. (Clustering)
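
Clustering can be illustrated with a small k-means sketch. The six 2-D points below are hypothetical stand-ins for real feature vectors (not gene data), k-means is just one common clustering method chosen here as an example, and the starting centroids are picked by hand so the run is deterministic.

```python
import math

# Hypothetical unlabeled data: 2-D feature vectors with no "correct answer" attached.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]


def nearest(point, centroids):
    """Return the index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(data, initial_centroids, iterations=10):
    """Alternate between assigning points to centroids and recomputing centroids."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in data:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    labels = [nearest(p, centroids) for p in data]
    return centroids, labels


# Assumption: start from two of the data points, one from each visible group.
centroids, labels = k_means(points, initial_centroids=[points[0], points[3]])
print("centroids:", centroids)
print("labels:", labels)
```

No labels are supplied at any point; the grouping comes only from distances between the data points, which matches the "no feedback" property described above.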