Coursera: Sequences, Time Series and Prediction, Week 1

These are my notes on the Coursera / deeplearning.ai course TensorFlow in Practice (original video link).

Time series examples

A time series is typically defined as an ordered sequence of values that are usually equally spaced over time.

Univariate

e.g. stock prices
e.g. weather forecast
historical trend: e.g. Moore’s law
(I'm not sure what this "Arcade revenue" example is.)

Multivariate

Multivariate Time Series charts can be useful ways of understanding the impact of related data.

e.g. when the two series are plotted together, the correlation is easy to see
e.g. the path of a car as it travels.

Machine learning applied to time series

Predict / Forecast


Imputation (past/hole)


Anomaly Detection


Spot Pattern

e.g. in speech recognition

Common patterns in time series

Trend

e.g. an upward-facing trend

Seasonality

e.g. active users at a website for software developers
Weekdays (small plateaus) / weekends (dips)
Other sites, such as shopping sites, peak on weekends instead.

Combination of seasonality & trend

overall upwards trend but there are local peaks and troughs
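The patterns above can be synthesized in a few lines. The following is a minimal sketch, not the course notebook: the helper names (`trend`, `seasonality`, `white_noise`), the cosine-shaped season, and all numeric values are assumptions chosen for illustration.

```python
import math
import random

def trend(time, slope=0.0):
    """Straight-line trend: the value grows by `slope` per time step."""
    return [slope * t for t in time]

def seasonality(time, period, amplitude=1.0):
    """Repeat the same shape every `period` steps (here, one cosine cycle)."""
    return [amplitude * math.cos(2 * math.pi * (t % period) / period) for t in time]

def white_noise(n, noise_level=1.0, seed=None):
    """Independent Gaussian noise: the part of the series that cannot be predicted."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, noise_level) for _ in range(n)]

# Four "years" of daily data: baseline + upward trend + yearly season + noise.
time = list(range(4 * 365))
baseline = 10.0
series = [
    baseline + tr + se + no
    for tr, se, no in zip(
        trend(time, slope=0.05),
        seasonality(time, period=365, amplitude=40.0),
        white_noise(len(time), noise_level=5.0, seed=42),
    )
]
```

Setting `slope=0.0` gives pure seasonality, and dropping the seasonal term leaves a noisy trend, so each pattern can be inspected on its own.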

Cases where time series forecasting does not work: random (white) noise / random values, i.e. data that cannot be predicted


Autocorrelation

The spikes appear at random timestamps. You can’t predict when that will happen next or how strong they will be. But clearly, the entire series isn’t random. Between the spikes there’s a very deterministic type of decay.

We can see here that the value at each time step is 99 percent of the value at the previous time step, plus an occasional spike. This is an autocorrelated time series: it correlates with a delayed copy of itself, and that delay is often called a lag.
Often a time series like this is described as having memory, as steps are dependent on previous ones. The spikes, which are unpredictable, are often called innovations. In other words, they cannot be predicted based on past values.
Another example is here, where there are multiple autocorrelations, in this case at time steps 1 and 50. The lag-1 autocorrelation gives these very quick short-term exponential decays, and the lag-50 autocorrelation gives the small bounce after each spike.
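The lag-1 case described above (each value is 99 percent of the previous one, plus an occasional spike) can be sketched as follows. This is an illustrative sketch only: the function names, the spike probability, and the spike size are assumptions, and the sample autocorrelation is the usual textbook estimate rather than anything taken from the course notebook.

```python
import random

def autocorrelated_series(n, phi=0.99, spike_prob=0.01, spike_size=10.0, seed=None):
    """Each step is `phi` times the previous step, plus a rare random spike."""
    rng = random.Random(seed)
    series = []
    value = 0.0
    for _ in range(n):
        value = phi * value                      # deterministic decay ("memory")
        if rng.random() < spike_prob:
            value += rng.random() * spike_size   # unpredictable innovation
        series.append(value)
    return series

def autocorrelation(series, lag):
    """Sample correlation between the series and a copy of itself delayed by `lag`."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance

series = autocorrelated_series(2000, seed=0)
lag1 = autocorrelation(series, lag=1)  # expected to be close to 1 for this kind of series
```

Between spikes the series is fully determined by its previous value, which is why the lag-1 autocorrelation comes out high even though the spike times themselves are random.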

Code implementation of the above

ipynb

As we’ve learned, a machine-learning model is designed to spot patterns, and when we spot patterns we can make predictions. For the most part this also works with time series, except for the noise, which is unpredictable. But we should recognize that this assumes that patterns that existed in the past will continue into the future.

Real-life data: a combination of the above, sometimes with big events

Stationary & non-stationary time series

If this were a stock price, then maybe it was a big financial crisis, a big scandal, or perhaps a disruptive technological breakthrough causing a massive change. For example, the current COVID-19 pandemic or the earlier SARS outbreak.
After that the time series started to trend downward without any clear seasonality. We’ll typically call this a non-stationary time series.
To predict on this, we could train on just a limited period of time, for example only the last 100 steps. That will probably give better performance than training on the entire time series. This breaks the mold for typical machine learning, where we always assume that more data is better. For time series forecasting, it really depends on the time series. If it’s stationary, meaning its behavior does not change over time, then great: the more data you have, the better. But if it’s not stationary, then the optimal time window to use for training will vary.
Ideally, we would like to be able to take the whole series into account and generate a prediction for what might happen next. As you can see, this isn’t always as simple as you might think, given a drastic change like the one we see here. So that’s some of what you’re going to be looking at in this course. But let’s start by going through a workbook that generates sequences like those you saw in this video. After that, we’ll try to predict some of these synthesized sequences as practice before moving on to real-world data.

Train, validation, and test sets

Fixed Partitioning

Unlike the random splits used for other kinds of data, here each of the train, validation, and test periods must contain a full seasonal pattern.
To measure the performance of our forecasting model, we typically want to split the time series into a training period, a validation period, and a test period. This is called fixed partitioning. If the time series has some seasonality, you generally want to ensure that each period contains a whole number of seasons: for example, one year, two years, or three years if the time series has a yearly seasonality. You generally don’t want one year and a half, or else some months will be represented more than others. This might appear a little different from the training/validation/test split you might be familiar with from non-time-series data sets, where you just pick random values out of the corpus to make all three, but you should see that the impact is effectively the same.
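Fixed partitioning with whole seasons can be sketched as below. The function name `fixed_partition` and its parameters are hypothetical, and dropping the oldest leftover steps (rather than the newest) is my own assumption, made so that the most recent data is always kept.

```python
def fixed_partition(series, season_length, val_seasons=1, test_seasons=1):
    """Split a series chronologically into train / validation / test periods,
    each made of a whole number of seasons."""
    n_seasons = len(series) // season_length
    train_seasons = n_seasons - val_seasons - test_seasons
    if train_seasons < 1:
        raise ValueError("not enough whole seasons for a training period")

    # Drop the oldest leftover steps so every period holds complete seasons.
    start = len(series) - n_seasons * season_length
    train_end = start + train_seasons * season_length
    val_end = train_end + val_seasons * season_length

    return series[start:train_end], series[train_end:val_end], series[val_end:]

# Four years of daily data plus 100 extra days at the beginning.
series = list(range(4 * 365 + 100))
train, val, test = fixed_partition(series, season_length=365)
```

Here the training period covers two years and the validation and test periods one year each, always in time order and never shuffled.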

Once tuning on val is about done, retrain using both train & val; if it performs well on test, retrain again using test as well

Next, you’ll train your model on the training period and evaluate it on the validation period. Here’s where you can experiment to find the right architecture for training, and work on it and your hyperparameters until you get the desired performance, measured using the validation set. Often, once you’ve done that, you can retrain using both the training and validation data, and then test on the test period to see if your model performs just as well. If it does, then you could take the unusual step of retraining again, this time also using the test data. But why would you do that? It’s because the test data is the closest data you have to the current point in time, and as such it’s often the strongest signal in determining future values. If your model is not trained using that data too, then it may not be optimal. Because of this, it’s actually quite common to forgo a test set altogether and just train using a training period and a validation period, with the test set being in the future.

Roll-Forward Partitioning

(Roughly, I think it just cuts the series into small pieces, one at a time.)
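These notes do not spell out the procedure, so the sketch below rests on an assumption: roll-forward partitioning as it is commonly described (also called walk-forward validation), where the training period starts short and grows by one small block at a time, and each block is forecast before it is added to the training data. The function name `roll_forward_splits` and its parameters are hypothetical.

```python
def roll_forward_splits(n_steps, initial_train_size, horizon=1):
    """Yield (train_indices, val_indices) pairs with a growing training period.

    Each iteration validates on the `horizon` steps right after the training
    period, then folds those steps into the training period for the next round.
    """
    train_end = initial_train_size
    while train_end + horizon <= n_steps:
        yield range(0, train_end), range(train_end, train_end + horizon)
        train_end += horizon

# 10 time steps, start with 6 for training, validate 2 steps at a time.
splits = list(roll_forward_splits(n_steps=10, initial_train_size=6, horizon=2))
for train_idx, val_idx in splits:
    print(f"train on steps 0..{train_idx[-1]}, validate on steps {val_idx[0]}..{val_idx[-1]}")
```

Compared with fixed partitioning, this costs more training runs, but every validation block is predicted by a model that has seen all the data before it.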

Moving average and differencing

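(I don't understand this part yet. My feeling is that Adam and similar optimizers are all a moving average plus an exponential weighting, an acceleration correction, and so on.)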
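Since this part is not explained in the notes, here is a minimal sketch of the two ideas in the heading, under my own assumptions about what they mean: a moving-average forecast predicts each step as the mean of the last few values, and differencing subtracts the value from one season earlier to remove trend and seasonality before averaging. All function names are hypothetical.

```python
def moving_average_forecast(series, window_size):
    """Forecast each step as the mean of the `window_size` values before it.
    Returns forecasts for steps window_size .. len(series) - 1."""
    return [
        sum(series[t - window_size:t]) / window_size
        for t in range(window_size, len(series))
    ]

def difference(series, period):
    """Value at step t minus the value one season (`period` steps) earlier."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

def forecast_with_differencing(series, period, window_size):
    """Moving average on the differenced series, then add back the value from
    one season earlier. Returns forecasts for steps period + window_size onward."""
    diff_forecast = moving_average_forecast(difference(series, period), window_size)
    start = period + window_size
    return [
        diff_forecast[j] + series[start + j - period]
        for j in range(len(diff_forecast))
    ]

# Toy series: upward trend plus a bump every 4th step (a season of length 4).
series = [t + (10 if t % 4 == 0 else 0) for t in range(40)]
plain = moving_average_forecast(series, window_size=3)
improved = forecast_with_differencing(series, period=4, window_size=3)
```

On this noise-free toy series the differenced values are constant, so the differenced forecast reproduces the series exactly, while the plain moving average lags behind the trend and smears out the seasonal bumps.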

Code implementation of the above (metrics)

ipynb
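The metrics themselves are not listed in these notes. As an assumption, the sketch below uses two that are common for forecasting, mean squared error and mean absolute error, and compares them on a naive forecast that simply repeats the previous value. The function names are hypothetical.

```python
def mse(actual, forecast):
    """Mean squared error: penalizes large errors much more than small ones."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mae(actual, forecast):
    """Mean absolute error: average size of the error, in the series' own units."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def naive_forecast(series):
    """Predict each step as the value of the step before it.
    Returns forecasts for steps 1 .. len(series) - 1."""
    return series[:-1]

series = [3.0, 5.0, 4.0, 6.0, 9.0]
actual = series[1:]                 # steps 1..4
forecast = naive_forecast(series)   # forecasts for steps 1..4
naive_mse = mse(actual, forecast)
naive_mae = mae(actual, forecast)
```

A naive forecast like this is a useful baseline: any model worth keeping should score better on the validation period.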

wk1 exercise

wk1 ans