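How to Write a Good Research Paper
Background
While organizing some documents recently, I came across material from ICCV 2019, which I attended last year. Ross Girshick of Facebook AI gave a tutorial there on object detection and instance segmentation, and closed it with 19 slides on how to write a good research paper. I went back through them and wrote up this summary for future reference.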
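What can I learn from your paper?
First, a paper should be about a single focused idea or question.
"Idea" usually means a method. What should I learn from it?
- Under what conditions does it work?
- When does it not work?
- If the method has multiple components, which are the most important?
- Which implementation details matter?
An idea that reaches SOTA results only with unrelated tricks piled on top is not what a reviewer values most.
- The reviewer's first priority is: what can I learn from your idea, and is there anything interesting in it?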
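Writing the paper:
- Start from a baseline and apply your idea on top of it;
- Run ablation experiments, and make sure each table conveys exactly one message (One table, one message).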
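To make "one table, one message" concrete, here is a minimal sketch of how I would lay out a single ablation table: one baseline row, one row per added component, and a single metric with its change relative to the baseline. The function names, the variant names, and all the numbers are placeholders I made up to show the layout; none of them come from the slides or from real experiments.

```python
def ablation_rows(baseline_name, results):
    """Return (name, score, delta_vs_baseline) rows, with the baseline first."""
    base = results[baseline_name]
    rows = [(baseline_name, base, 0.0)]
    for name, score in results.items():
        if name != baseline_name:
            rows.append((name, score, round(score - base, 2)))
    return rows


def format_table(rows, metric):
    """Render the rows as a plain-text table with a single metric column."""
    width = max(len(name) for name, _, _ in rows)
    lines = [f"{'variant'.ljust(width)}  {metric:>6}  {'delta':>6}"]
    for name, score, delta in rows:
        lines.append(f"{name.ljust(width)}  {score:>6.1f}  {delta:>+6.1f}")
    return "\n".join(lines)


# Placeholder numbers, invented purely to show the layout; not real results.
placeholder_results = {
    "baseline": 30.0,
    "baseline + component A": 31.0,
    "baseline + component B": 30.5,
}
rows = ablation_rows("baseline", placeholder_results)
print(format_table(rows, metric="score"))
```

Keeping a single metric column and a single axis of variation per table is what lets the reader take away one message; a second question (for example, sensitivity to a hyper-parameter) would go into its own table.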
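Support all of your claims
Every claim needs to be backed by a citation to existing work or by an experiment;
- Claim && Table/Figure
Avoid vague, unsupported wording:
- For example: "Intuitively…", "may"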
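Beware of speed/accuracy claims
Comparisons with existing methods across publications are often uncontrolled, mainly because:
- Accuracy varies with hyper-parameters (the training "recipe");
- Speed varies with low-level optimization (performance tuning) and hardware;
- Speed varies with inference details (e.g., batching during inference);
- Some people write fast code for a living, and they can achieve a 10-100x speedup.
Therefore, reported accuracy and speed gains should be treated with skepticism. When running your own experiments, pay attention to the following four points:
- Keep the training setup as identical as possible
- Keep the inference setup as identical as possible
- Make sure low-level optimization is fair across methods
- Make sure all methods run on the same hardware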
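The four points above can be sketched as a small timing harness that runs every method through the same protocol: same inputs, same warm-up, same number of repeats, on the same machine in the same process. This is only a rough illustration under my own assumptions; `benchmark`, `method_a`, and `method_b` are hypothetical names, and the two methods are toy stand-ins rather than real detectors.

```python
import statistics
import time


def benchmark(methods, inputs, warmup=2, repeats=5):
    """Time every method on the same inputs with the same protocol.

    Returns the median wall-clock seconds per full pass over `inputs`.
    """
    timings = {}
    for name, fn in methods.items():
        for _ in range(warmup):  # warm up before measuring
            for x in inputs:
                fn(x)
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            for x in inputs:
                fn(x)
            samples.append(time.perf_counter() - start)
        timings[name] = statistics.median(samples)
    return timings


# Hypothetical stand-ins for the two methods under comparison.
def method_a(x):
    return sum(i * i for i in range(x))


def method_b(x):
    return sum(map(lambda i: i * i, range(x)))


shared_inputs = [1000] * 50  # identical inputs for both methods
timings = benchmark({"method_a": method_a, "method_b": method_b}, shared_inputs)
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms per pass")
```

The median is used instead of the mean so that one slow run (for example, from background load) does not dominate the reported number. For GPU code, the timing would also need to wait for asynchronous work to finish before reading the clock, which this CPU-only sketch does not cover.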
Excerpts from the original slides:
What Did I Learn from Your Paper?
A paper should be about a single focused idea or question
“Idea” usually means method; what should I learn?
- Under what conditions does it work?
- When does it not work?
- If the idea has multiple components, which are the most important?
- Which implementation details are important?
I seldom care if “your idea + unrelated ideas/tricks” -> S.O.T.A. results
- My first priority: to learn some interesting things about your idea.
Support all of Your Claims
Beware of Speed/Accuracy Claims
Comparisons across publications are often uncontrolled
- Accuracy varies with hyper-parameters (‘recipe’)
- Speed varies with low-level optimization (perf tuning) and hardware
- Speed varies with inference details (e.g., batching during inference)
- Someone else writes fast code for a living (10-100x speedup)
- Therefore, speed/accuracy should be taken with a large grain of salt.
Implement all Methods in One Codebase
There are so many details that matter in detection
E.g., COCO mask AP increases by ~1% AP (absolute) going from detectron (v1) to detectron2 (same model)
- A baseline in detectron (v1) is not a valid comparison to a method in detectron2
Many good codebases now: mmdetection, simpledet, detectron2
- No excuses anymore;
- Use the same codebase to the greatest extent possible
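One way to keep yourself honest about this is to check, before comparing numbers, that the baseline and the proposed method differ only in the settings you intend to change. The sketch below is my own illustration, not something from the slides: `config_diff` and `assert_controlled` are hypothetical helpers, and the configuration keys and values are invented for the example.

```python
MISSING = "<missing>"  # marker for a key present in only one config


def config_diff(baseline_cfg, method_cfg):
    """Return {key: (baseline_value, method_value)} for every key that differs."""
    diff = {}
    for key in sorted(set(baseline_cfg) | set(method_cfg)):
        a = baseline_cfg.get(key, MISSING)
        b = method_cfg.get(key, MISSING)
        if a != b:
            diff[key] = (a, b)
    return diff


def assert_controlled(baseline_cfg, method_cfg, allowed):
    """Raise if the two configs differ in any key outside `allowed`."""
    unexpected = set(config_diff(baseline_cfg, method_cfg)) - set(allowed)
    if unexpected:
        raise ValueError(f"uncontrolled differences: {sorted(unexpected)}")


# Hypothetical settings; keys and values are illustrative only.
baseline_cfg = {"lr": 0.02, "schedule": "1x", "batch_size": 16, "use_my_idea": False}
method_cfg = {"lr": 0.02, "schedule": "1x", "batch_size": 16, "use_my_idea": True}

diff = config_diff(baseline_cfg, method_cfg)
assert_controlled(baseline_cfg, method_cfg, allowed={"use_my_idea"})
print(diff)
```

A check like this only covers what is written in the config. Differences hidden in the codebase itself (data loading, augmentation defaults, evaluation code) are exactly why running both methods in one codebase matters.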