A Survey of Visual Analytic Pipelines

Paper link

Authors:

Zhejiang University

  • Xumeng Wang (王叙萌)
  • Tianye Zhang (张天野)
  • Yuxin Ma (马昱欣)
  • Jing Xia (夏菁)
  • Wei Chen (陈为)

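Abstract

Visual analytics has been widely studied over the past decade. One key to making visual analytics practical in both research and industrial applications is the appropriate definition and implementation of the visual analytics pipeline, which provides an effective abstraction for designing and implementing visual analytics systems. In this paper, we review previous work on visual analytics pipelines and their individual modules from multiple perspectives: data, visualization, model, and knowledge. For each module, we discuss the various representations and descriptions of the pipelines inside the module, and compare their commonalities and differences.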

Conventional Visual Analytics Pipeline

Raw data contains errors and invalid values, so it needs to be preprocessed.

Data

  • Data Integration
  • Data Cleaning
  • Data Transformation
  • Data Reduction
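
The four preprocessing steps listed above can be sketched on toy data. This is only an illustration in plain Python: the records, the field names (`temp_c`, `city`) and the helper functions are hypothetical and do not come from the paper, and each step is reduced to its simplest form (a key join, dropping missing values, min-max normalization, and a per-group mean).

```python
from statistics import mean

# Hypothetical toy records from two sources; field names are illustrative only.
source_a = [{"id": 1, "temp_c": 21.5}, {"id": 2, "temp_c": None}, {"id": 3, "temp_c": 19.0}]
source_b = [{"id": 1, "city": "Hangzhou"}, {"id": 2, "city": "Ningbo"}, {"id": 3, "city": "Hangzhou"}]


def integrate(left, right, key):
    """Data integration: join two record lists on a shared key."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


def clean(rows, field):
    """Data cleaning: drop rows whose value for `field` is missing."""
    return [row for row in rows if row.get(field) is not None]


def transform(rows, field):
    """Data transformation: min-max normalize a numeric field into [0, 1]."""
    values = [row[field] for row in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [{**row, field + "_norm": (row[field] - lo) / span} for row in rows]


def reduce_by(rows, group_field, value_field):
    """Data reduction: aggregate to one mean value per group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_field], []).append(row[value_field])
    return {group: mean(vals) for group, vals in groups.items()}


merged = integrate(source_a, source_b, "id")
cleaned = clean(merged, "temp_c")
normalized = transform(cleaned, "temp_c")
summary = reduce_by(normalized, "city", "temp_c")
print(summary)
```

In a real system each step would be far richer (entity resolution, outlier handling, sampling, dimensionality reduction), but the order of the stages is the same.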

Visualization

  • Information visualization data state reference model
  • Generic visualization model
  • Reference model
  • Nested model of visualization creation
    Specific visualization pipeline
    Visual channels

There are also many classic visual mapping methods, including parallel coordinates, force-directed graphs, chord diagrams, scatterplot matrices, and so on.
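
To make "visual mapping" concrete, here is a minimal sketch of how parallel coordinates might map data values to screen positions: each dimension gets a vertical axis, and each record becomes a polyline with one vertex per axis. The function names, the canvas size and the sample fields (`mpg`, `hp`, `weight`) are assumptions chosen for illustration; nothing here is taken from the paper or from a specific library.

```python
def linear_scale(domain, pixel_range):
    """Return a function mapping a data value in `domain` to a position in `pixel_range`."""
    d0, d1 = domain
    r0, r1 = pixel_range
    if d0 == d1:
        raise ValueError("domain must have non-zero extent")

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale


def parallel_coordinates(rows, dimensions, width=300.0, height=100.0):
    """Map each row to a polyline with one (x, y) vertex per dimension axis.

    Assumes at least two dimensions, so the axes can be spread across the width.
    """
    step = width / (len(dimensions) - 1)
    scales = {}
    for dim in dimensions:
        values = [row[dim] for row in rows]
        # Screen y grows downward, so the maximum value maps to y = 0.
        scales[dim] = linear_scale((min(values), max(values)), (height, 0.0))
    return [
        [(i * step, scales[dim](row[dim])) for i, dim in enumerate(dimensions)]
        for row in rows
    ]


# Hypothetical sample records.
rows = [
    {"mpg": 10.0, "hp": 200.0, "weight": 4.0},
    {"mpg": 30.0, "hp": 100.0, "weight": 2.0},
]
polylines = parallel_coordinates(rows, ["mpg", "hp", "weight"])
for line in polylines:
    print(line)
```

The returned vertex lists could be handed to any drawing backend; the mapping itself is independent of how the lines are rendered.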

View Generation and Coordination

Visualization system with multiple views
Overview + detail

Model

Witten and Frank group machine learning models into eight basic types according to the structure exhibited by the dataset.

Data mining as a step in the process of knowledge discovery
Predictive visual analytics pipeline

The paper's proposed model-design-driven pipeline
Visual analytics tools can be used to make feature selection more efficient.

  • Feature Selection and Generation
    • clustering
    • ranking
    • sorting
  • Model Building, Selection and Validation
    • statistical models
    • physical models
    • data mining models
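
As one possible reading of the "ranking" item under feature selection, the sketch below scores each feature by the absolute Pearson correlation with a target and sorts the features by that score. The choice of metric and the sample data are assumptions made for illustration; the paper does not prescribe a particular ranking criterion.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def rank_features(features, target):
    """Return (name, score) pairs sorted by |correlation with target|, highest first."""
    scores = {name: abs(pearson(column, target)) for name, column in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature columns and target.
features = {
    "size": [1.0, 2.0, 3.0, 4.0],
    "noise": [2.0, 1.0, 2.0, 1.0],
    "constant": [5.0, 5.0, 5.0, 5.0],
}
target = [2.0, 4.0, 6.0, 8.0]

ranking = rank_features(features, target)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A ranked list like this is the kind of intermediate result a visual analytics tool could show as a sorted bar chart, letting the analyst accept or override the automatic ordering.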

Knowledge

  • intelligence gaining
  • sense-making
  • decision-making
  • concept building

Sense-making model
Knowledge Generation Pipeline
Other Pipelines

Human cognition model:
generating hypotheses → listing evidence → proving/disproving → creating the matrix of hypotheses and evidence → drawing conclusions → reanalyzing conclusions based on evidence
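
The "matrix of hypotheses and evidence" step can be pictured with a small sketch. The scoring convention (+1 supports, -1 contradicts, 0 neutral) and the rule of preferring the hypothesis with the fewest contradictions are assumptions made for this example, not something the paper specifies; the hypotheses and evidence items are invented.

```python
# Hypothetical hypotheses and evidence items.
evidence = [
    "neighboring sensors agree",
    "spike lasts a single reading",
    "weather report confirms",
]

# One row per hypothesis, one score per evidence item, in the order above.
# Assumed convention: +1 = supports, -1 = contradicts, 0 = neutral.
matrix = {
    "sensor fault": [-1, +1, -1],
    "real heat wave": [+1, -1, +1],
}


def contradictions(scores):
    """Count how many evidence items contradict a hypothesis."""
    return sum(1 for score in scores if score < 0)


def rank_hypotheses(matrix):
    """Order hypotheses from least to most contradicted by the evidence."""
    return sorted(
        ((name, contradictions(scores)) for name, scores in matrix.items()),
        key=lambda item: item[1],
    )


ranked = rank_hypotheses(matrix)
for name, count in ranked:
    print(f"{name}: contradicted by {count} of {len(evidence)} items")
```

When new evidence arrives, a column is appended to every row and the ranking is recomputed, which corresponds to the final "reanalyzing" step of the chain.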

Knowledge generation model:
data → hypothesis → theory → explanation

data and frame:

Induction and Deduction

In inductive reasoning, analysts build concepts from observations or schemas.
In deductive reasoning, analysts search for evidence that either confirms or refutes the initial hypotheses.

Guidelines

  • Enable Induction and Deduction
    • bottom-up reasoning and top-down reasoning
  • Enable Knowledge Externalization
  • Enable Data Provenance
    • To enable data provenance is another aspect of enabling deductive reasoning because it facilitates bottom-up reasoning
  • Enable Uncertainty-Aware Knowledge Generation

Reflections

Critical thinking:
The analysis pipeline may differ for different types of data, such as geospatial data, high-dimensional data, and network data.

Creative thinking:
Analyze how completely common BI software covers and implements the visual analytics pipeline.

How to apply to our work:
Use the pipeline to design a prototype system quickly, then improve it iteratively. Consider whether some steps can run in parallel.
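
As a starting point for such a prototype, the sketch below treats each pipeline module as a plain function and runs the visualization and model stages concurrently, since in this toy setup both depend only on the preprocessed data. The stage names, their return values and the use of a thread pool are assumptions for illustration, not a design taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def preprocess(raw):
    """Data stage: drop missing values."""
    return [value for value in raw if value is not None]


def build_view(data):
    """Visualization stage: return a (hypothetical) view specification."""
    return {"type": "histogram", "n": len(data)}


def fit_model(data):
    """Model stage: return a trivial summary model."""
    return {"mean": sum(data) / len(data)}


def run_pipeline(raw):
    """Run preprocessing first, then the two independent stages in parallel."""
    data = preprocess(raw)
    with ThreadPoolExecutor(max_workers=2) as pool:
        view_future = pool.submit(build_view, data)
        model_future = pool.submit(fit_model, data)
        return {"view": view_future.result(), "model": model_future.result()}


result = run_pipeline([1.0, None, 3.0])
print(result)
```

Keeping each stage a pure function of its input makes it easy to swap implementations between iterations and to see which stages have no dependency on each other.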