Coursera Sequences, Time Series and Prediction, Week 1
These are notes on the Coursera deeplearning.ai course TensorFlow in Practice; original video link:
Time series examples
A time series is typically defined as an ordered sequence of values that are usually equally spaced over time.
Univariate
e.g.
stock prices
e.g.weather forecast
historical trend: e.g. Moore’s law
(Not sure what this "Arcade revenue" example refers to.)
Multivariate
Multivariate Time Series charts can be useful ways of understanding the impact of related data.
e.g.
e.g. when the series are plotted together, the correlation between them is easy to see
e.g. the path of a car as it travels.
Machine learning applied to time series
Predict / Forecast
Imputation (filling in missing past values or holes in the data)
Anomaly detection
Spot patterns
e.g. in speech recognition
Common Patterns in time series
Trend
e.g. an upward-facing trend
Seasonality
e.g. active users at a website for software developers
Weekdays show small plateaus; weekends show dips.
Other sites, e.g. shopping sites, peak on weekends instead.
Combination of seasonality & trend
overall upwards trend but there are local peaks and troughs
Cases where time-series forecasting fails: random (white) noise / random values, i.e. data that cannot be predicted.
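The patterns above (trend, seasonality, noise) can be synthesized and combined in a few lines of NumPy. This is a minimal sketch; the function names and parameter values here are my own, in the style of the course's synthetic-data notebook.

```python
import numpy as np

def trend(time, slope=0.0):
    # A straight-line trend component.
    return slope * time

def seasonal_pattern(season_time):
    # An arbitrary repeating shape: a cosine for the first 40% of the
    # season, then an exponential decay for the rest.
    return np.where(season_time < 0.4,
                    np.cos(season_time * 2 * np.pi),
                    1 / np.exp(3 * season_time))

def seasonality(time, period, amplitude=1.0, phase=0):
    # Repeat seasonal_pattern with the given period and amplitude.
    season_time = ((time + phase) % period) / period
    return amplitude * seasonal_pattern(season_time)

def white_noise(time, noise_level=1.0, seed=None):
    # Unpredictable component: i.i.d. Gaussian noise.
    rng = np.random.default_rng(seed)
    return rng.normal(size=len(time)) * noise_level

# Four years of daily data: baseline + trend + yearly seasonality + noise.
time = np.arange(4 * 365 + 1)
series = 10 + trend(time, 0.05) + seasonality(time, period=365, amplitude=40)
series += white_noise(time, noise_level=5, seed=42)
```

Plotting `series` against `time` (e.g. with matplotlib) reproduces the "combination of seasonality & trend" charts from the video.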
Autocorrelation 自相关
The spikes appear at random timestamps. You can’t predict when that will happen next or how strong they will be. But clearly, the entire series isn’t random. Between the spikes there’s a very deterministic type of decay.
We can see here that the value of each time step is 99 percent of the value of the previous time step, plus an occasional spike. This is an autocorrelated time series; namely, it correlates with a delayed copy of itself, often called a lag.
Often a time series like this is described as having memory as steps are dependent on previous ones. The spikes which are unpredictable are often called Innovations. In other words, they cannot be predicted based on past values.
Another example is here where there are multiple autocorrelations, in this case at time steps one and 50. The lag-1 autocorrelation gives these very quick short-term exponential decays, and the lag-50 one gives the small bounce after each spike.
Programmatic implementation of the above:
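The lag-1 autocorrelated series described above ("99 percent of the previous value plus an occasional spike") can be generated like this. A minimal sketch; the function name and the spike probability/scale are my own choices, not from the course.

```python
import numpy as np

def autocorrelated_series(time, phi=0.99, spike_prob=0.02,
                          spike_scale=10.0, seed=0):
    # Each step is phi (0.99) times the previous step, plus an occasional
    # random spike: the unpredictable "innovation".
    rng = np.random.default_rng(seed)
    series = np.zeros(len(time))
    for t in range(1, len(time)):
        series[t] = phi * series[t - 1]
        if rng.random() < spike_prob:
            series[t] += rng.random() * spike_scale
    return series

series = autocorrelated_series(np.arange(1000))
```

Between spikes the series decays deterministically by a factor of 0.99 per step, which is the "memory" the transcript describes.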
As we’ve learned, a machine-learning model is designed to spot patterns, and when we spot patterns we can make predictions. For the most part this can also work with time series, except for the noise, which is unpredictable. But we should recognize that this assumes that patterns that existed in the past will of course continue into the future.
Real-life data: a combination of the above, sometimes plus big one-off events.
Stationary & non-stationary time series
If this were stock price, then maybe it was a big financial crisis, a big scandal, or perhaps a disruptive technological breakthrough causing a massive change. (E.g. the COVID-19 pandemic now, or SARS earlier.)
After that the time series started to trend downward without any clear seasonality. We’ll typically call this a non-stationary time series.
To predict on this we could just train for a limited period of time. For example, here where I take just the last 100 steps, you’ll probably get better performance than if you had trained on the entire time series. But that breaks the mold for typical machine learning, where we always assume that more data is better.

For time series forecasting it really depends on the time series. If it’s stationary, meaning its behavior does not change over time, then great: the more data you have the better. But if it’s not stationary, then the optimal time window that you should use for training will vary.

Ideally, we would like to be able to take the whole series into account and generate a prediction for what might happen next. As you can see, this isn’t always as simple as you might think, given a drastic change like the one we see here. So that’s some of what you’re going to be looking at in this course. But let’s start by going through a workbook that generates sequences like those you saw in this video. After that we’ll try to predict some of these synthesized sequences as practice, before we later move on to real-world data.
Train, Val, Test sets
Fixed Partitioning
Unlike the random splits used for other kinds of data, here we must ensure that each of the train, validation, and test periods contains a whole number of seasonal periods.
To measure the performance of our forecasting model, we typically want to split the time series into a training period, a validation period and a test period. This is called fixed partitioning. If the time series has some seasonality, you generally want to ensure that each period contains a whole number of seasons: for example one year, or two years, or three years, if the time series has a yearly seasonality. You generally don’t want one year and a half, or else some months will be represented more than others. While this might appear a little different from the training/validation/test split that you might be familiar with from non-time-series data sets, where you just pick random values out of the corpus to make all three, you should see that the impact is effectively the same.
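For a series with yearly seasonality, fixed partitioning is just slicing on whole-year boundaries so each period holds complete seasons. A minimal sketch with made-up data and my own split points:

```python
import numpy as np

# Four years of daily data with yearly seasonality (stand-in series).
time = np.arange(4 * 365)
series = np.sin(2 * np.pi * time / 365) + 0.001 * time

# Fixed partitioning on whole years: two years train, one year
# validation, one year test. Never split mid-season (e.g. at 1.5 years).
train_end = 2 * 365
val_end = 3 * 365

x_train = series[:train_end]
x_val = series[train_end:val_end]
x_test = series[val_end:]
```

Note the split is always chronological: validation comes after training in time, and test after validation, unlike the random shuffling used for non-sequential data.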
Tune hyperparameters on the validation set; once performance is good, retrain using both train & val data, evaluate on test, and (if that holds up) optionally retrain once more including the test data.
Next you’ll train your model on the training period, and you’ll evaluate it on the validation period. Here’s where you can experiment to find the right architecture for training, and tune it and your hyperparameters, until you get the desired performance, measured using the validation set. Often, once you’ve done that, you can retrain using both the training and validation data, and then test on the test period to see if your model will perform just as well. And if it does, then you could take the unusual step of retraining again, using also the test data. But why would you do that? Well, it’s because the test data is the closest data you have to the current point in time, and as such it’s often the strongest signal in determining future values. If your model is not trained using that data too, then it may not be optimal. Due to this, it’s actually quite common to forgo a test set altogether, and just train using a training period and a validation period, with the test set being the future itself.
Roll-Forward Partitioning
(Roughly: start with a short training period and repeatedly move the split point forward, e.g. by one day or one week at a time, training on everything before the split and forecasting the period just after it. This mimics retraining the model as new data arrives.)
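Roll-forward partitioning can be sketched as a generator over progressively longer training windows. The function name, starting size, and step size below are my own choices for illustration:

```python
import numpy as np

series = np.arange(100.0)  # toy series of 100 daily values

def roll_forward_splits(series, start=30, step=7):
    # Yield (train, eval) pairs: train on everything up to the split
    # point, evaluate on the next `step` values, then roll the split
    # point forward by `step` and repeat.
    for split in range(start, len(series) - step + 1, step):
        yield series[:split], series[split:split + step]

splits = list(roll_forward_splits(series))
```

Each iteration's training window grows by one step, unlike fixed partitioning where the split points never move.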
Moving average and differencing
(I don’t fully understand this part yet. I suspect optimizers like Adam are related: they also seem to be moving averages with exponential weighting plus acceleration-style corrections.)
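The two ideas named in the heading can be sketched briefly. A moving-average forecast predicts each value as the mean of the previous window; differencing subtracts the value one season earlier to remove trend and seasonality, forecasts the differenced series, then adds the seasonal value back. This is a sketch under my own naming and a made-up noiseless series (for which the combined forecast happens to be exact, since the seasonal difference is constant):

```python
import numpy as np

def moving_average_forecast(series, window_size):
    # Forecast series[t] as the mean of series[t - window_size : t].
    # Returns len(series) - window_size values, aligned with
    # series[window_size:].
    csum = np.cumsum(np.insert(series, 0, 0.0))
    ma = (csum[window_size:] - csum[:-window_size]) / window_size
    return ma[:-1]

# Noiseless toy series: trend + yearly (period-365) seasonality.
time = np.arange(4 * 365)
series = 10 + 0.05 * time + 20 * np.sin(2 * np.pi * time / 365)

# Differencing: subtract the value 365 days earlier, removing both the
# seasonality and (up to a constant) the trend.
diff = series[365:] - series[:-365]

# Forecast the differenced series with a 30-day moving average, then
# add back the value from one season earlier to undo the differencing.
diff_ma = moving_average_forecast(diff, 30)   # aligned with time[395:]
forecast = diff_ma + series[30:-365]          # series[t-365] for t >= 395
```

Here `forecast` matches `series[395:]` exactly because the series has no noise; on real data the moving average would smooth out the noise in the differenced series instead.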