[Deep Learning Frameworks] A Study of the PyTorch Source Code: Fundamentals, Part One

Code Snippet 1

// Python object that backs torch.autograd.Variable
struct THPVariable {
    PyObject_HEAD
    torch::autograd::Variable cdata;
    PyObject* backward_hooks;
};

The PyObject_HEAD macro standardizes Python objects: it expands to a struct header containing a pointer to a type object (which defines initialization, memory allocation, and so on) and a reference counter. The Python C API also provides two macros, Py_INCREF() and Py_DECREF(), which increment and decrement a Python object's reference count.
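The reference counter that PyObject_HEAD provides can be observed from Python itself via sys.getrefcount (a small sketch; getrefcount is a CPython-specific API, and the call itself temporarily adds one reference):

```python
import sys

# Every CPython object carries a reference count, maintained in C via
# Py_INCREF/Py_DECREF. Binding and unbinding names changes the count.
obj = object()
base = sys.getrefcount(obj)

alias = obj                     # new reference -> count rises by one
after_alias = sys.getrefcount(obj)

del alias                       # reference dropped -> count falls back
after_del = sys.getrefcount(obj)

print(after_alias - base)       # 1
print(after_del - base)         # 0
```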

Code Snippet 2

typedef struct THTensor
{
    int64_t *size;
    int64_t *stride;
    int nDimension;
    THStorage *storage;
    ptrdiff_t storageOffset;
    int refcount;
    char flag;
} THTensor;

As you can see, THTensor holds the size/stride/dimension/offset metadata together with a pointer to a THStorage, which holds the actual data buffer.
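How the size, stride, and storageOffset fields locate an element inside the flat storage buffer can be sketched in a few lines of plain Python (a simplified model of the THTensor indexing rule, not PyTorch's actual code):

```python
# Map a multi-dimensional index onto a position in flat storage:
# position = storageOffset + sum(index[d] * stride[d] for each dim d)
def storage_index(index, stride, storage_offset=0):
    return storage_offset + sum(i * s for i, s in zip(index, stride))

# A contiguous 2x3 row-major tensor over this storage has strides (3, 1):
storage = [0, 1, 2, 3, 4, 5]

print(storage[storage_index((1, 2), (3, 1))])  # element [1][2] -> 5

# A transpose just swaps the strides to (1, 3); the storage is untouched.
# Element [2][1] of the transpose is element [1][2] of the original:
print(storage[storage_index((2, 1), (1, 3))])  # 5
```

This is why views like transpose are cheap: they produce a new THTensor header with different strides over the same THStorage.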
Code Snippet 3

static THStorage* THPStorage_(newFilenameStorage)(ptrdiff_t size)
{
  int flags = TH_ALLOCATOR_MAPPED_SHAREDMEM | TH_ALLOCATOR_MAPPED_EXCLUSIVE;
  std::string handle = THPStorage_(__newHandle)();
  auto ctx = libshm_context_new(NULL, handle.c_str(), flags);
  return THStorage_(newWithAllocator)(size, &THManagedSharedAllocator, (void*)ctx);
}

THManagedSharedAllocator contains function pointers into PyTorch's internal libshm library. On macOS, for example, this library uses Unix domain socket communication to exchange handles to shared-memory regions, which is how memory allocation is coordinated across multiple processes.
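The handle-based sharing scheme can be sketched with Python's standard library, which likewise identifies shared-memory regions by a name that other processes attach to (a simplified analogy for illustration, not PyTorch's actual libshm mechanism):

```python
from multiprocessing import shared_memory

# Create a shared-memory region; its auto-generated name plays the role
# of the handle that libshm passes between processes.
block = shared_memory.SharedMemory(create=True, size=16)
block.buf[:5] = b"hello"

# Another process would attach purely by that handle (the name):
attached = shared_memory.SharedMemory(name=block.name)
print(bytes(attached.buf[:5]))  # b'hello'

attached.close()
block.close()
block.unlink()  # release the region once all users have detached
```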

Memory Sharing
DLPack
DLPack is an open standard for in-memory tensor structures. Because the standard closely matches the memory layouts that many frameworks already use, it enables zero-copy sharing of data between frameworks, which lets developers mix operations from different frameworks.
The core structure of DLPack is DLTensor, shown below:

/*!
 * \brief Plain C Tensor object, does not manage memory.
 */
typedef struct {
  /*!
   * \brief The opaque data pointer points to the allocated data.
   *  This will be CUDA device pointer or cl_mem handle in OpenCL.
   *  This pointer is always aligns to 256 bytes as in CUDA.
   */
  void* data;
  /*! \brief The device context of the tensor */
  DLContext ctx;
  /*! \brief Number of dimensions */
  int ndim;
  /*! \brief The data type of the pointer*/
  DLDataType dtype;
  /*! \brief The shape of the tensor */
  int64_t* shape;
  /*!
   * \brief strides of the tensor,
   *  can be NULL, indicating tensor is compact.
   */
  int64_t* strides;
  /*! \brief The offset in bytes to the beginning pointer to data */
  uint64_t byte_offset;
} DLTensor;

DLTensor contains a pointer to the raw data together with the tensor's shape/stride/offset metadata.
The following Python snippet converts a tensor to the DLTensor format:

import torch
from torch.utils import dlpack
t = torch.ones((5, 5))
dl = dlpack.to_dlpack(t)

dlpack.to_dlpack(t) calls the toDLPack function in ATen to convert PyTorch's data format into the DLPack format, as follows:

DLManagedTensor* toDLPack(const Tensor& src) {
  ATenDLMTensor * atDLMTensor(new ATenDLMTensor);
  atDLMTensor->handle = src;
  atDLMTensor->tensor.manager_ctx = atDLMTensor;
  atDLMTensor->tensor.deleter = &deleter;
  atDLMTensor->tensor.dl_tensor.data = src.data_ptr();
  int64_t device_id = 0;
  if (src.type().is_cuda()) {
    device_id = src.get_device();
  }
  atDLMTensor->tensor.dl_tensor.ctx = getDLContext(src.type(), device_id);
  atDLMTensor->tensor.dl_tensor.ndim = src.dim();
  atDLMTensor->tensor.dl_tensor.dtype = getDLDataType(src.type());
  atDLMTensor->tensor.dl_tensor.shape = const_cast<int64_t*>(src.sizes().data());
  atDLMTensor->tensor.dl_tensor.strides = const_cast<int64_t*>(src.strides().data());
  atDLMTensor->tensor.dl_tensor.byte_offset = 0;
  return &(atDLMTensor->tensor);
}

As you can see, the conversion is straightforward: each field of the PyTorch tensor is copied one-to-one into the DLPack structure. Note that only metadata is copied; the data pointer itself is shared, so the tensor's contents are never duplicated.
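Because only the data pointer is handed over, a tensor round-tripped through DLPack shares memory with the original. A short sketch, assuming PyTorch is installed:

```python
import torch
from torch.utils import dlpack

t = torch.ones((5, 5))
capsule = dlpack.to_dlpack(t)    # PyCapsule wrapping a DLManagedTensor
u = dlpack.from_dlpack(capsule)  # consume the capsule back into a tensor

# The two tensors share the same underlying storage, so a write
# through one is visible through the other -- no data was copied.
u[0, 0] = 42.0
print(t[0, 0].item())  # 42.0
```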

Meaning of Common Variable Prefixes in the Source Code

The PyTorch codebase uses a consistent set of prefixes for its C libraries (as summarized in the referenced article): TH = TorcH, THC = TorcH Cuda, THCS = TorcH Cuda Sparse, THCUNN = TorcH CUda Neural Network, THD = TorcH Distributed, THNN = TorcH Neural Network, THS = TorcH Sparse, THP = TorcH Python.

References
http://blog.christianperone.com/2018/03/pytorch-internal-architecture-tour/