Why Is Dropping a Ten-Billion-Row Table in GP So Fast?

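     Yesterday I dropped a fairly large business table in GP (Greenplum). The table held roughly 16 billion rows. Later measurement showed the data occupied about 6 TB, and across primary and mirror segments a total of 12 TB was freed.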

     Every relation in GP is recorded in the pg_class system catalog table, so a SQL query can show the on-disk details of any given table.

                           pg_class columns

column type references description
relname name   Name of the table, index, view, etc.
relnamespace oid pg_namespace.oid The OID of the namespace (schema) that contains this relation
reltype oid pg_type.oid The OID of the data type that corresponds to this table's row type, if any (zero for indexes, which have no pg_type entry)
relowner oid pg_authid.oid Owner of the relation
relam oid pg_am.oid If this is an index, the access method used (B-tree, Bitmap, hash, etc.)
relfilenode oid   Name of the on-disk file of this relation; 0 if none.
reltablespace oid pg_tablespace.oid The tablespace in which this relation is stored. If zero, the database's default tablespace is implied. (Not meaningful if the relation has no on-disk file.)
relpages int4   Size of the on-disk representation of this table in pages (of 32K each). This is only an estimate used by the planner. It is updated by VACUUM, ANALYZE, and a few DDL commands.
reltuples float4   Number of rows in the table. This is only an estimate used by the planner. It is updated by VACUUM, ANALYZE, and a few DDL commands.
reltoastrelid oid pg_class.oid OID of the TOAST table associated with this table, 0 if none. The TOAST table stores large attributes "out of line" in a secondary table.
reltoastidxid oid pg_class.oid For a TOAST table, the OID of its index. 0 if not a TOAST table.
relaosegidxid oid   Deprecated in Greenplum Database 3.4.
relaosegrelid oid   Deprecated in Greenplum Database 3.4.
relhasindex boolean   True if this is a table and it has (or recently had) any indexes. This is set by CREATE INDEX, but not cleared immediately by DROP INDEX. VACUUM will clear if it finds the table has no indexes.
relisshared boolean   True if this table is shared across all databases in the system. Only certain system catalog tables are shared.
relkind char   The type of object: r = heap or append-optimized table, i = index, S = sequence, v = view, c = composite type, t = TOAST value, o = internal append-optimized segment files and EOFs, u = uncataloged temporary heap table
relstorage char   The storage mode of a table: a = append-optimized, c = column-oriented, h = heap, v = virtual, x = external table.

relnatts int2   Number of user columns in the relation (system columns not counted). There must be this many corresponding entries in pg_attribute.
relchecks int2   Number of check constraints on the table.
reltriggers int2   Number of triggers on the table.
relukeys int2   Unused
relfkeys int2   Unused
relrefs int2   Unused
relhasoids boolean   True if an OID is generated for each row of the relation.
relhaspkey boolean   True if the table has (or once had) a primary key.
relhasrules boolean   True if table has rules.
relhassubclass boolean   True if table has (or once had) any inheritance children.
relfrozenxid xid   All transaction IDs before this one have been replaced with a permanent (frozen) transaction ID in this table. This is used to track whether the table needs to be vacuumed in order to prevent transaction ID wraparound or to allow pg_clog to be shrunk. Zero (InvalidTransactionId) if the relation is not a table.
relacl aclitem[]   Access privileges assigned by GRANT and REVOKE.
reloptions text[]   Access-method-specific options, as "keyword=value" strings.
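
The lookup described before the table can be sketched as a query against pg_class. This is only an illustration: the schema name public and the table name my_big_table are placeholders, not names from the original post, and depending on the Greenplum version the relfilenode reported on the master may not match the value on each segment.

```sql
-- Sketch: inspect where one table lives on disk.
-- 'public' and 'my_big_table' are placeholder names (assumptions).
SELECT c.relname,
       c.relfilenode,    -- name of the on-disk file; 0 if none
       c.reltablespace,  -- 0 means the database's default tablespace
       c.relkind,        -- r = table, i = index, ...
       c.relpages,       -- planner estimate of size in pages
       c.reltuples       -- planner estimate of row count
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relname = 'my_big_table';
```

Because relpages and reltuples are estimates, they are only as fresh as the last VACUUM or ANALYZE on the table.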

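There is a common misconception here: that the parent directory of a data file on the server is named after relowner. That is not the case. The directory one level above the relfilenode file is the database's OID under base, that is, it tells you which database the relation belongs to.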

Verified on the server:

862951 -- the system database

17149 -- the business database
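
One way to confirm which database a directory belongs to is to list the database OIDs from pg_database and compare them with the directory names. A minimal sketch, assuming you are connected to any database in the cluster:

```sql
-- Sketch: map database OIDs to database names.
-- Each oid value corresponds to a directory name above the relfilenode files.
SELECT oid, datname
FROM pg_database
ORDER BY oid;
```

The OIDs observed above (862951 and 17149) should each appear in the oid column next to the name of the database they identify.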

 

The other question is why dropping 5 TB of data finished in a matter of seconds. That is remarkably fast, so what can we learn from it?