Postgres分表

这个需求就是一个典型的按时间创建分区表,首先看一下步骤:

  1. 创建父表

 

先创建一张父表,所有分区表都从它继承,这个表中没有数据,也不要在这个表上定义任何检查约束及索引,现在我们就先创建这样一张表,但之前先建一个序列:

CREATE SEQUENCE "public"."control_alarm_info_uid_seq"

 INCREMENT 1

 MINVALUE 1

 MAXVALUE 99999999

 START 1

 CACHE 1;

 

ALTER TABLE "public"."control_alarm_info_uid_seq " OWNER TO "postgres";

 

接下来创建父表

-- Table: control_alarm_info

 

-- DROP TABLE control_alarm_info;

 

CREATE TABLE control_alarm_info

(

  uid bigserial NOT NULL,

  status bigint,

  create_time timestamp without time zone,

  update_time timestamp without time zone,

  creator character varying(32),

  store_id integer,

  store_name character varying(255),

  target_id integer,

  person_name character varying(255),

  telphone character varying(255),

  alarm_time character varying(255),

  face_picurl character varying(255),

  camera_info_id character varying(255),

  camera_name character varying(255),

  event_log_id character varying(255),

  similarity real,

  person_code character varying(64),

  alarm_type character varying(255),

  bkg_picurl character varying(255),

  app_key character varying(64)

)

WITH (

  OIDS=FALSE

);

ALTER TABLE control_alarm_info4

  OWNER TO postgres;

  1. 按时间触发,创建n个子表

 

 

 

 

  1. 创建n个子表,每个子表都是继承于父表

由于每个分区表都是从父表继承的,所以分区表不会增加任何字段,下面我们按需求创建4张分区子表,分别用于存放9月、10月、11月和12月的日志数据:

create table control_alarm_info_201809

(CHECK (alarm_time >= '2018-09-01' AND alarm_time < '2018-10-01'))

INHERITS (control_alarm_info);

 

create table control_alarm_info_201810

(CHECK (alarm_time >= '2018-10-01' AND alarm_time < '2018-11-01'))

INHERITS (control_alarm_info);

 

create table control_alarm_info_201811

(CHECK (alarm_time >= '2018-11-01' AND alarm_time < '2018-12-01'))

INHERITS (control_alarm_info);

 

接下来在这4张分区表的每个分区键上建立索引:

在上面的语句中我们添加了一个约束表示只允许插入本月的数据,接下来在这4张分区表的每个分区键上建立索引:

create index control_alarm_info_201809_alarm_time ON control_alarm_info_201809 (alarm_time);

create index control_alarm_info_201810_alarm_time ON control_alarm_info_201810 (alarm_time);

create index control_alarm_info_201811_alarm_time ON control_alarm_info_201811(alarm_time);

 

查询时查询条件中包含这些索引时才会提高查询效率,如果能定位到一张子表内,效率更高。

  1. 定义一个规则(Rule)或触发器(Trigger),把对主表的数据插入重定向到合适的分区表

如何才能让不同日期的数据自动的插入与其对应的分区子表中呢?有两种解决方案,分别是:规则(Rule)和触发器(Trigger),相比触发器,Rule的开销更大,所以我在这里就不做过多介绍了,下面直接介绍Trigger的方式。

Trigger通常会结合自定义函数(Function)来实现分区插入,Function负责根据条件选择插入,而Trigger则负责Function的自动调用。首先定义Function,功能很简单,即根据日期区间insert数据即可:

Drop trigger control_alarm_info_insert_trigger;

CREATE

OR REPLACE FUNCTION control_alarm_info_insert_trigger () RETURNS TRIGGER AS $$

BEGIN

 

IF (

  NEW .alarm_time >= '2018-09-01'

  AND NEW .alarm_time < '2018-10-01'

) THEN

  INSERT INTO control_alarm_info_201809

VALUES

  (NEW .*) ;

ELSEIF (

  NEW .alarm_time >= '2018-10-01'

  AND NEW .alarm_time < '2018-11-01'

) THEN

  INSERT INTO control_alarm_info_201810

VALUES

  (NEW .*) ;

ELSEIF (

  NEW .alarm_time >= '2018-11-01'

  AND NEW .alarm_time < '2018-12-01'

) THEN

  INSERT INTO control_alarm_info_201811

VALUES

  (NEW .*) ;

ELSE

  RAISE EXCEPTION 'Date out of range!' ;

END

IF ; RETURN NULL ;

END ; $$ LANGUAGE plpgsql;

 

最后再创建触发器用于执行刚才的Function

CREATE TRIGGER control_alarm_info_insert_trigger  BEFORE INSERT ON control_alarm_info

FOR EACH ROW

EXECUTE PROCEDURE control_alarm_info_insert_trigger();

 

 

未分表时200万数据中查询最后一条告警记录,2.505s

Postgres分表

 

 

分表后200万数据中查询最后一条告警记录,0.221s,提升了10倍

Postgres分表

 

插入性能

未分表时200万数据中插入10万条数据,需要104s