What is a dataset's evaluation protocol?

An example:
The OULU-NPU dataset, for instance, is described as follows:
Evaluation protocols
For the evaluation of the generalization capability of the face PAD methods, four protocols are used.

Protocol I:

The first protocol is designed to evaluate the generalization of the face PAD methods under previously unseen environmental conditions, namely illumination and background scene. As the database is recorded in three sessions with different illumination condition and location, the train, development and evaluation sets are constructed using video recordings taken in different sessions.

Protocol II:

The second protocol is designed to evaluate the effect of attacks created with different printers or displays on the performance of the face PAD methods as they may suffer from new kinds of artifacts. The effect of attack variation is assessed by introducing a previously unseen print and video-replay attack in the test set.

Protocol III:

One of the critical issues in face PAD and image classification in general is sensor interoperability. To study the effect of the input camera variation, a Leave One Camera Out (LOCO) protocol is used. In each iteration, the real and the attack videos recorded with five smartphones are used to train and tune the algorithms, and the generalization of the models is assessed using the videos recorded with the remaining one.

Protocol IV:

In the last and most challenging protocol, all above three factors are considered simultaneously and generalization of face PAD methods are evaluated across previously unseen environmental conditions, attacks and input sensors.

The following table gives a detailed information about the video recordings used in the train, development and test sets of each protocol (P refers to printer and D refers to display).

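So what is a dataset's evaluation protocol?
Literally, it is the protocol by which results are evaluated.
As I understand it, it specifies how each method's performance on the dataset is to be measured.
In other words, the dataset lays down one or more rules that everyone must follow, because results are only comparable under a common standard.
Take Protocol I as an example: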
The first protocol is designed to evaluate the generalization of the face PAD methods under previously unseen environmental conditions, namely illumination and background scene. As the database is recorded in three sessions with different illumination condition and location, the train, development and evaluation sets are constructed using video recordings taken in different sessions.
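This protocol tests generalization across environmental conditions, which correspond to the three recording sessions.
It matches the Protocol I row of the table above:
training and validation (development) both use Sessions 1 and 2,
while testing uses Session 3.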
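
The session rule of Protocol I can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Video` record, its field names, and the function `protocol_one_split` are hypothetical and not part of any official OULU-NPU tooling, and each video is assumed to arrive already assigned to a train, dev or test subset. The sketch applies only the session filter described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    # Hypothetical record; the field names are assumptions for illustration.
    name: str      # any identifier for the recording
    session: int   # recording session, 1 to 3
    subset: str    # "train", "dev" or "test", assumed to be assigned already


# Sessions that Protocol I allows in each subset (from the description above).
PROTOCOL_ONE_SESSIONS = {"train": {1, 2}, "dev": {1, 2}, "test": {3}}


def protocol_one_split(videos):
    """Group videos by subset, keeping only the sessions Protocol I allows."""
    split = {"train": [], "dev": [], "test": []}
    for video in videos:
        if video.session in PROTOCOL_ONE_SESSIONS[video.subset]:
            split[video.subset].append(video)
    return split


if __name__ == "__main__":
    demo = [
        Video("a", 1, "train"),
        Video("b", 3, "train"),  # dropped: Session 3 is reserved for testing
        Video("c", 2, "dev"),
        Video("d", 3, "test"),
    ]
    for subset, items in protocol_one_split(demo).items():
        print(subset, [v.name for v in items])
```

Because the model never sees a Session 3 recording during training or validation, the test score reflects performance under an unseen illumination and background.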