Machine Learning Foundations (机器学习基石) - The Learning Problem

Machine Learning Foundations (机器学习基石上): Mathematical Foundations
Hsuan-Tien Lin (林轩田), Associate Professor, Department of Computer Science and Information Engineering

The Learning Problem

What Is Machine Learning

  • observation → learning → skill
  • data → ML → skill
  • skill ↔ improve some performance measure
  • machine learning: improve some performance measure with experience computed from data
  • ML: an alternative route to building complicated systems
    • some use scenarios

Key Essence of Machine Learning

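  • there is an underlying pattern to be learned
  • the pattern cannot easily be written down as a program
  • there is a large amount of data about the pattern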


Examples


Applications of Machine Learning

  • Food, clothing, housing, transportation, education, entertainment
  • Learn our preferences
  • Examples

Components of Machine Learning

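  • input: $x \in \mathcal{X}$
  • output: $y \in \mathcal{Y}$
  • unknown pattern to be learned (target function): $f: \mathcal{X} \to \mathcal{Y}$
  • data (training examples): $\mathcal{D} = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$
  • hypothesis (skill with hopefully good performance): $g: \mathcal{X} \to \mathcal{Y}$
  • $\{(x_n, y_n)\}$ from $f$ → ML → $g$: $f$ is the true pattern, which cannot be learned exactly; $g$ is the pattern that learning produces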
    • $f$ is unknown (no programmable definition)
    • we want $g \approx f$: the hypothesis should be as close as possible to the target function
    • assume $g \in \mathcal{H} = \{h_k\}$; the hypothesis set $\mathcal{H}$ can contain good or bad hypotheses
    • learning algorithm $\mathcal{A}$ picks the ‘best’ one as $g$
  • machine learning: use data to compute hypothesis $g$ that approximates target $f$
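
The components listed above (target f, data D, hypothesis set H, learning algorithm A, final hypothesis g) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the lecture: the one-dimensional threshold target, the grid of candidate thresholds used as H, and the helper names `make_hypothesis`, `errors`, and `learn`. In a real problem f is unknown; it is defined here only so that labelled examples can be generated.

```python
import random

# Hypothetical target f. In practice f is unknown; it is written down here
# only to generate labelled training examples for the sketch.
def f(x):
    return 1 if x >= 0.3 else -1

def make_hypothesis(threshold):
    """One hypothesis h_k: predict +1 at or above the threshold, else -1."""
    return lambda x: 1 if x >= threshold else -1

def errors(h, data):
    """Number of training examples that hypothesis h labels incorrectly."""
    return sum(1 for x, y in data if h(x) != y)

def learn(data, thresholds):
    """Toy learning algorithm A: return the threshold whose hypothesis
    makes the fewest mistakes on the training data."""
    return min(thresholds, key=lambda t: errors(make_hypothesis(t), data))

rng = random.Random(0)                      # fixed seed for repeatability
xs = [rng.uniform(-1, 1) for _ in range(200)]
D = [(x, f(x)) for x in xs]                 # data: examples generated by f
H = [k / 10 for k in range(-10, 11)]        # hypothesis set, one threshold per h_k

best_t = learn(D, H)                        # A picks the 'best' hypothesis
g = make_hypothesis(best_t)                 # final hypothesis g
print("chosen threshold:", best_t, "training errors:", errors(g, D))
```

Because this H happens to contain a hypothesis identical to f, the algorithm can reach zero training error. With a poorer hypothesis set, A would still return the best available candidate, but g would only approximate f.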

Machine Learning and Other Fields

Machine Learning and Data Mining

  • Data Mining (DM)

    use huge data to find properties that are interesting

  • if ‘interesting property’ is the same as ‘hypothesis that approximates target’: ML = DM

  • if ‘interesting property’ is related to ‘hypothesis that approximates target’: DM can help ML, and vice versa (but not always)

  • traditional DM also focuses on efficient computation in large databases

Machine Learning and Artificial Intelligence

  • Artificial Intelligence (AI)

    compute something that shows intelligent behavior

  • ML is one possible route to realize AI

  • $g \approx f$ is something that shows intelligent behavior

Machine Learning and Statistics

  • Statistics

    use data to make inferences about an unknown process

  • g is an inference outcome while f is something unknown

  • statistics can be used to achieve ML

  • traditional statistics also focuses on provable results with mathematical assumptions, and cares less about computation

  • statistics provides many useful tools for ML