PostgreSQL: case-insensitive string comparison
Is there a simple ignore-case comparison for PostgreSQL?
I want to replace:
SELECT id, user_name
FROM users
WHERE lower(email) IN (lower('[email protected]'), lower('[email protected]'));
with something like:
SELECT id, user_name
FROM users
WHERE email IGNORE_CASE_IN ('[email protected]', '[email protected]');
Edit: the like and ilike operators work on single values (e.g. like '[email protected]'), but not on sets.
Any ideas?
Adam
First, what not to do: don't use ILIKE...
create table y
(
id serial not null,
email text not null unique
);
insert into y(email)
values('[email protected]') ,('[email protected]');
insert into y(email)
select n from generate_series(1,1000) as i(n);
create index ix_y on y(email);
explain select * from y
where email ilike
ANY(ARRAY['[email protected]','[email protected]']);
Execution plan:
memdb=# explain select * from y where email ilike ANY(ARRAY['[email protected]','[email protected]']);
QUERY PLAN
----------------------------------------------------------------------------------------
Seq Scan on y (cost=0.00..17.52 rows=1 width=7)
Filter: (email ~~* ANY ('{[email protected],[email protected]com}'::text[]))
(2 rows)
Either you create an index on a lower() expression...
create function lower(t text[]) returns text[]
as
$$
select lower($1::text)::text[]
$$ language sql;
create unique index ix_y_2 on y(lower(email));
explain select * from y
where lower(email) =
ANY(lower(ARRAY['[email protected]','[email protected]']));
...which properly uses the index:
memdb=# explain select * from y where lower(email) = ANY(lower(ARRAY['[email protected]','[email protected]']));
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on y (cost=22.60..27.98 rows=10 width=7)
Recheck Cond: (lower(email) = ANY ((lower(('{[email protected],[email protected]}'::text[])::text))::text[]))
-> Bitmap Index Scan on ix_y_2 (cost=0.00..22.60 rows=10 width=0)
Index Cond: (lower(email) = ANY ((lower(('{[email protected],[email protected]}'::text[])::text))::text[]))
(4 rows)
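If the list of addresses comes from application code, the right-hand side can also be lower-cased on the client before it is bound as a parameter, in which case the custom lower(text[]) function is not needed. The following is a minimal Python sketch under that assumption. The helper name normalize_emails, the example addresses, and the placeholder syntax in QUERY are hypothetical, and Python's str.lower() is not guaranteed to fold non-ASCII characters exactly the way PostgreSQL's lower() does under a given locale.

```python
def normalize_emails(emails):
    """Lower-case and de-duplicate addresses, keeping first-seen order."""
    seen = set()
    result = []
    for email in emails:
        key = email.lower()
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


# Hypothetical query text: the placeholder syntax depends on the driver in use.
# The left-hand side still matches the expression index on lower(email).
QUERY = "select * from y where lower(email) = any(%s)"

print(normalize_emails(["Alice@Example.com", "alice@example.COM", "bob@example.com"]))
```

The resulting list would be passed as the single array parameter of the query, so the comparison stays a plain equality against the indexed expression.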
Or you use the citext data type...
create table x
(
id serial not null,
email citext not null unique
);
insert into x(email)
values('[email protected]'),('[email protected]');
insert into x(email)
select n from generate_series(1,1000) as i(n);
create index ix_x on x(email);
explain select * from x
where email =
ANY(ARRAY['[email protected]','[email protected]']::citext[]);
...which properly uses the index, even if you don't create an expression index (e.g. create index zzz on yyy(lower(field))):
memdb=# explain select * from x where email = ANY(ARRAY['[email protected]','[email protected]']::citext[]);
QUERY PLAN
--------------------------------------------------------------------------------------------------
Bitmap Heap Scan on x (cost=8.52..12.75 rows=2 width=7)
Recheck Cond: (email = ANY ('{[email protected],[email protected]}'::citext[]))
-> Bitmap Index Scan on ix_x (cost=0.00..8.52 rows=2 width=0)
Index Cond: (email = ANY ('{[email protected],[email protected]}'::citext[]))
(4 rows)
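Note that you can make 'ILIKE' use an index when using a trigram index: https://www.postgresql.org/docs/current/static/pgtrgm.html (although a B-Tree index would be faster to update) – 2017-01-17 07:17:56
You also don't need to create an index on 'email' if you declare it as 'unique': the unique constraint already creates an index. – 2017-01-17 07:36:19
Use a case-insensitive text data type. Use citext: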
create table emails
(
user_id int references users(user_id),
email citext
);
insert into emails(user_id, email) values(1, '[email protected]');
insert into emails(user_id, email) values(2, '[email protected]');
select * from emails where email in ('[email protected]','[email protected]');
If you can't find citext.sql in your contrib directory, copy and paste this into pgAdmin:
/* $PostgreSQL: pgsql/contrib/citext/citext.sql.in,v 1.3 2008/09/05 18:25:16 tgl Exp $ */
-- Adjust this setting to control where the objects get created.
SET search_path = public;
--
-- PostgreSQL code for CITEXT.
--
-- Most I/O functions, and a few others, piggyback on the "text" type
-- functions via the implicit cast to text.
--
--
-- Shell type to keep things a bit quieter.
--
CREATE TYPE citext;
--
-- Input and output functions.
--
CREATE OR REPLACE FUNCTION citextin(cstring)
RETURNS citext
AS 'textin'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION citextout(citext)
RETURNS cstring
AS 'textout'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION citextrecv(internal)
RETURNS citext
AS 'textrecv'
LANGUAGE internal STABLE STRICT;
CREATE OR REPLACE FUNCTION citextsend(citext)
RETURNS bytea
AS 'textsend'
LANGUAGE internal STABLE STRICT;
--
-- The type itself.
--
CREATE TYPE citext (
INPUT = citextin,
OUTPUT = citextout,
RECEIVE = citextrecv,
SEND = citextsend,
INTERNALLENGTH = VARIABLE,
STORAGE = extended,
-- make it a non-preferred member of string type category
CATEGORY = 'S',
PREFERRED = false
);
--
-- Type casting functions for those situations where the I/O casts don't
-- automatically kick in.
--
CREATE OR REPLACE FUNCTION citext(bpchar)
RETURNS citext
AS 'rtrim1'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION citext(boolean)
RETURNS citext
AS 'booltext'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION citext(inet)
RETURNS citext
AS 'network_show'
LANGUAGE internal IMMUTABLE STRICT;
--
-- Implicit and assignment type casts.
--
CREATE CAST (citext AS text) WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (citext AS varchar) WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (citext AS bpchar) WITHOUT FUNCTION AS ASSIGNMENT;
CREATE CAST (text AS citext) WITHOUT FUNCTION AS ASSIGNMENT;
CREATE CAST (varchar AS citext) WITHOUT FUNCTION AS ASSIGNMENT;
CREATE CAST (bpchar AS citext) WITH FUNCTION citext(bpchar) AS ASSIGNMENT;
CREATE CAST (boolean AS citext) WITH FUNCTION citext(boolean) AS ASSIGNMENT;
CREATE CAST (inet AS citext) WITH FUNCTION citext(inet) AS ASSIGNMENT;
--
-- Operator Functions.
--
CREATE OR REPLACE FUNCTION citext_eq(citext, citext)
RETURNS bool
AS '$libdir/citext'
LANGUAGE C IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION citext_ne(citext, citext)
RETURNS bool
AS '$libdir/citext'
LANGUAGE C IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION citext_lt(citext, citext)
RETURNS bool
AS '$libdir/citext'
LANGUAGE C IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION citext_le(citext, citext)
RETURNS bool
AS '$libdir/citext'
LANGUAGE C IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION citext_gt(citext, citext)
RETURNS bool
AS '$libdir/citext'
LANGUAGE C IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION citext_ge(citext, citext)
RETURNS bool
AS '$libdir/citext'
LANGUAGE C IMMUTABLE STRICT;
--
-- Operators.
--
CREATE OPERATOR = (
LEFTARG = CITEXT,
RIGHTARG = CITEXT,
COMMUTATOR = =,
NEGATOR = <>,
PROCEDURE = citext_eq,
RESTRICT = eqsel,
JOIN = eqjoinsel,
HASHES,
MERGES
);
CREATE OPERATOR <> (
LEFTARG = CITEXT,
RIGHTARG = CITEXT,
NEGATOR = =,
COMMUTATOR = <>,
PROCEDURE = citext_ne,
RESTRICT = neqsel,
JOIN = neqjoinsel
);
CREATE OPERATOR < (
LEFTARG = CITEXT,
RIGHTARG = CITEXT,
NEGATOR = >=,
COMMUTATOR = >,
PROCEDURE = citext_lt,
RESTRICT = scalarltsel,
JOIN = scalarltjoinsel
);
CREATE OPERATOR <= (
LEFTARG = CITEXT,
RIGHTARG = CITEXT,
NEGATOR = >,
COMMUTATOR = >=,
PROCEDURE = citext_le,
RESTRICT = scalarltsel,
JOIN = scalarltjoinsel
);
CREATE OPERATOR >= (
LEFTARG = CITEXT,
RIGHTARG = CITEXT,
NEGATOR = <,
COMMUTATOR = <=,
PROCEDURE = citext_ge,
RESTRICT = scalargtsel,
JOIN = scalargtjoinsel
);
CREATE OPERATOR > (
LEFTARG = CITEXT,
RIGHTARG = CITEXT,
NEGATOR = <=,
COMMUTATOR = <,
PROCEDURE = citext_gt,
RESTRICT = scalargtsel,
JOIN = scalargtjoinsel
);
--
-- Support functions for indexing.
--
CREATE OR REPLACE FUNCTION citext_cmp(citext, citext)
RETURNS int4
AS '$libdir/citext'
LANGUAGE C STRICT IMMUTABLE;
CREATE OR REPLACE FUNCTION citext_hash(citext)
RETURNS int4
AS '$libdir/citext'
LANGUAGE C STRICT IMMUTABLE;
--
-- The btree indexing operator class.
--
CREATE OPERATOR CLASS citext_ops
DEFAULT FOR TYPE CITEXT USING btree AS
OPERATOR 1 < (citext, citext),
OPERATOR 2 <= (citext, citext),
OPERATOR 3 = (citext, citext),
OPERATOR 4 >= (citext, citext),
OPERATOR 5 > (citext, citext),
FUNCTION 1 citext_cmp(citext, citext);
--
-- The hash indexing operator class.
--
CREATE OPERATOR CLASS citext_ops
DEFAULT FOR TYPE citext USING hash AS
OPERATOR 1 = (citext, citext),
FUNCTION 1 citext_hash(citext);
--
-- Aggregates.
--
CREATE OR REPLACE FUNCTION citext_smaller(citext, citext)
RETURNS citext
AS '$libdir/citext'
LANGUAGE 'C' IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION citext_larger(citext, citext)
RETURNS citext
AS '$libdir/citext'
LANGUAGE 'C' IMMUTABLE STRICT;
CREATE AGGREGATE min(citext) (
SFUNC = citext_smaller,
STYPE = citext,
SORTOP = <
);
CREATE AGGREGATE max(citext) (
SFUNC = citext_larger,
STYPE = citext,
SORTOP = >
);
--
-- CITEXT pattern matching.
--
CREATE OR REPLACE FUNCTION texticlike(citext, citext)
RETURNS bool AS 'texticlike'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION texticnlike(citext, citext)
RETURNS bool AS 'texticnlike'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION texticregexeq(citext, citext)
RETURNS bool AS 'texticregexeq'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION texticregexne(citext, citext)
RETURNS bool AS 'texticregexne'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OPERATOR ~ (
PROCEDURE = texticregexeq,
LEFTARG = citext,
RIGHTARG = citext,
NEGATOR = !~,
RESTRICT = icregexeqsel,
JOIN = icregexeqjoinsel
);
CREATE OPERATOR ~* (
PROCEDURE = texticregexeq,
LEFTARG = citext,
RIGHTARG = citext,
NEGATOR = !~*,
RESTRICT = icregexeqsel,
JOIN = icregexeqjoinsel
);
CREATE OPERATOR !~ (
PROCEDURE = texticregexne,
LEFTARG = citext,
RIGHTARG = citext,
NEGATOR = ~,
RESTRICT = icregexnesel,
JOIN = icregexnejoinsel
);
CREATE OPERATOR !~* (
PROCEDURE = texticregexne,
LEFTARG = citext,
RIGHTARG = citext,
NEGATOR = ~*,
RESTRICT = icregexnesel,
JOIN = icregexnejoinsel
);
CREATE OPERATOR ~~ (
PROCEDURE = texticlike,
LEFTARG = citext,
RIGHTARG = citext,
NEGATOR = !~~,
RESTRICT = iclikesel,
JOIN = iclikejoinsel
);
CREATE OPERATOR ~~* (
PROCEDURE = texticlike,
LEFTARG = citext,
RIGHTARG = citext,
NEGATOR = !~~*,
RESTRICT = iclikesel,
JOIN = iclikejoinsel
);
CREATE OPERATOR !~~ (
PROCEDURE = texticnlike,
LEFTARG = citext,
RIGHTARG = citext,
NEGATOR = ~~,
RESTRICT = icnlikesel,
JOIN = icnlikejoinsel
);
CREATE OPERATOR !~~* (
PROCEDURE = texticnlike,
LEFTARG = citext,
RIGHTARG = citext,
NEGATOR = ~~*,
RESTRICT = icnlikesel,
JOIN = icnlikejoinsel
);
--
-- Matching citext to text.
--
CREATE OR REPLACE FUNCTION texticlike(citext, text)
RETURNS bool AS 'texticlike'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION texticnlike(citext, text)
RETURNS bool AS 'texticnlike'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION texticregexeq(citext, text)
RETURNS bool AS 'texticregexeq'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION texticregexne(citext, text)
RETURNS bool AS 'texticregexne'
LANGUAGE internal IMMUTABLE STRICT;
CREATE OPERATOR ~ (
PROCEDURE = texticregexeq,
LEFTARG = citext,
RIGHTARG = text,
NEGATOR = !~,
RESTRICT = icregexeqsel,
JOIN = icregexeqjoinsel
);
CREATE OPERATOR ~* (
PROCEDURE = texticregexeq,
LEFTARG = citext,
RIGHTARG = text,
NEGATOR = !~*,
RESTRICT = icregexeqsel,
JOIN = icregexeqjoinsel
);
CREATE OPERATOR !~ (
PROCEDURE = texticregexne,
LEFTARG = citext,
RIGHTARG = text,
NEGATOR = ~,
RESTRICT = icregexnesel,
JOIN = icregexnejoinsel
);
CREATE OPERATOR !~* (
PROCEDURE = texticregexne,
LEFTARG = citext,
RIGHTARG = text,
NEGATOR = ~*,
RESTRICT = icregexnesel,
JOIN = icregexnejoinsel
);
CREATE OPERATOR ~~ (
PROCEDURE = texticlike,
LEFTARG = citext,
RIGHTARG = text,
NEGATOR = !~~,
RESTRICT = iclikesel,
JOIN = iclikejoinsel
);
CREATE OPERATOR ~~* (
PROCEDURE = texticlike,
LEFTARG = citext,
RIGHTARG = text,
NEGATOR = !~~*,
RESTRICT = iclikesel,
JOIN = iclikejoinsel
);
CREATE OPERATOR !~~ (
PROCEDURE = texticnlike,
LEFTARG = citext,
RIGHTARG = text,
NEGATOR = ~~,
RESTRICT = icnlikesel,
JOIN = icnlikejoinsel
);
CREATE OPERATOR !~~* (
PROCEDURE = texticnlike,
LEFTARG = citext,
RIGHTARG = text,
NEGATOR = ~~*,
RESTRICT = icnlikesel,
JOIN = icnlikejoinsel
);
--
-- Matching citext in string comparison functions.
-- XXX TODO Ideally these would be implemented in C.
--
CREATE OR REPLACE FUNCTION regexp_matches(citext, citext) RETURNS TEXT[] AS $$
SELECT pg_catalog.regexp_matches($1::pg_catalog.text, $2::pg_catalog.text, 'i');
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION regexp_matches(citext, citext, text) RETURNS TEXT[] AS $$
SELECT pg_catalog.regexp_matches($1::pg_catalog.text, $2::pg_catalog.text, CASE WHEN pg_catalog.strpos($3, 'c') = 0 THEN $3 || 'i' ELSE $3 END);
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION regexp_replace(citext, citext, text) returns TEXT AS $$
SELECT pg_catalog.regexp_replace($1::pg_catalog.text, $2::pg_catalog.text, $3, 'i');
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION regexp_replace(citext, citext, text, text) returns TEXT AS $$
SELECT pg_catalog.regexp_replace($1::pg_catalog.text, $2::pg_catalog.text, $3, CASE WHEN pg_catalog.strpos($4, 'c') = 0 THEN $4 || 'i' ELSE $4 END);
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION regexp_split_to_array(citext, citext) RETURNS TEXT[] AS $$
SELECT pg_catalog.regexp_split_to_array($1::pg_catalog.text, $2::pg_catalog.text, 'i');
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION regexp_split_to_array(citext, citext, text) RETURNS TEXT[] AS $$
SELECT pg_catalog.regexp_split_to_array($1::pg_catalog.text, $2::pg_catalog.text, CASE WHEN pg_catalog.strpos($3, 'c') = 0 THEN $3 || 'i' ELSE $3 END);
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION regexp_split_to_table(citext, citext) RETURNS SETOF TEXT AS $$
SELECT pg_catalog.regexp_split_to_table($1::pg_catalog.text, $2::pg_catalog.text, 'i');
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION regexp_split_to_table(citext, citext, text) RETURNS SETOF TEXT AS $$
SELECT pg_catalog.regexp_split_to_table($1::pg_catalog.text, $2::pg_catalog.text, CASE WHEN pg_catalog.strpos($3, 'c') = 0 THEN $3 || 'i' ELSE $3 END);
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION strpos(citext, citext) RETURNS INT AS $$
SELECT pg_catalog.strpos(pg_catalog.lower($1::pg_catalog.text), pg_catalog.lower($2::pg_catalog.text));
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION replace(citext, citext, citext) RETURNS TEXT AS $$
SELECT pg_catalog.regexp_replace($1::pg_catalog.text, pg_catalog.regexp_replace($2::pg_catalog.text, '([^a-zA-Z_0-9])', E'\\\\\\1', 'g'), $3::pg_catalog.text, 'gi');
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION split_part(citext, citext, int) RETURNS TEXT AS $$
SELECT (pg_catalog.regexp_split_to_array($1::pg_catalog.text, pg_catalog.regexp_replace($2::pg_catalog.text, '([^a-zA-Z_0-9])', E'\\\\\\1', 'g'), 'i'))[$3];
$$ LANGUAGE SQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION translate(citext, citext, text) RETURNS TEXT AS $$
SELECT pg_catalog.translate(pg_catalog.translate($1::pg_catalog.text, pg_catalog.lower($2::pg_catalog.text), $3), pg_catalog.upper($2::pg_catalog.text), $3);
$$ LANGUAGE SQL IMMUTABLE STRICT;
'create extension "citext";' will install the module – 2014-10-30 16:59:50
select *
where email ilike '[email protected]'
ilike is similar to like, but case-insensitive. To escape metacharacters, use replace():
where email ilike replace(replace(replace($1, '~', '~~'), '%', '~%'), '_', '~_') escape '~'
or you could create a function to escape the text; for an array of texts use:
where email ilike any(array['[email protected]', '[email protected]'])
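The nested replace() calls above can also be done in application code before the value is bound as a parameter. The following is a minimal Python sketch of the same three substitutions. The function name escape_like and the sample input are hypothetical, and the query is assumed to still carry the escape '~' clause.

```python
def escape_like(value, escape="~"):
    """Escape LIKE/ILIKE metacharacters so the value is matched literally."""
    # Double the escape character first; doing it last would also double
    # the escapes that were just added in front of '%' and '_'.
    value = value.replace(escape, escape + escape)
    value = value.replace("%", escape + "%")
    value = value.replace("_", escape + "_")
    return value


print(escape_like("50%_off~now"))  # prints 50~%~_off~~now
```

The order of the substitutions matters for the same reason as in the SQL expression: the innermost replace() there handles '~' before the other two run.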
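+1 the 'any' operator is exactly what I was looking for. Thanks! – 2010-12-19 09:41:33
'LIKE' and 'ILIKE' are very different from string equality, and the 'replace' magic needed to neutralize metacharacters is much worse than the original 'lower' calls. While 'ILIKE' without bothering to account for metacharacters will often work as a quick and dirty one-off, I wouldn't advocate it as a general case-insensitive string comparison. – Ben 2012-08-15 02:38:09
@Bonshington I love the idea of 'ILike' - after all these years, I never knew about it. But do you know whether this works for any language, or only for English and Latin character sets? Thanks! +1 for the answer above. – itsols 2013-01-02 08:02:15
You could also create an index on lower(email).
That would kind of defeat the purpose of the question; the asker doesn't want to be bothered with using lower, I guess :-) Some rationale for using citext: http://www.depesz.com/index.php/2008/08/10/waiting-for-84-case-insensitive-text-citext/ – 2010-12-19 13:21:04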
Use ‘Collate SQL_Latin1_General_CP1_CS_AS’ for it.
declare @a nvarchar(5)='a'
declare @b nvarchar(5)='A'
if(@a=@b Collate SQL_Latin1_General_CP1_CS_AS)
begin
print 'Match'
end
else
begin
print 'Not Matched'
end
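The OP asked about PostgreSQL, not SQL Server. – NathanAldenSr 2016-08-17 03:13:20
Things have changed in the four years since this question was answered, and the recommendation "don't use ILIKE" is no longer true (at least not in such a general way).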
In fact, depending on the data distribution, ILIKE with a trigram index might even be faster than citext.
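To see why a trigram index can serve a case-insensitive match at all, it helps to look at what gets indexed. The following Python sketch is a simplified model of the trigram extraction described in the pg_trgm documentation (each word lower-cased, prefixed with two spaces and suffixed with one, non-alphanumeric characters ignored). It is not the extension's actual code, and the function name trigrams is hypothetical.

```python
import re


def trigrams(text):
    """Return the set of padded, lower-cased trigrams of each word in text."""
    result = set()
    # Words are runs of alphanumeric characters; everything else is ignored.
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = "  " + word + " "  # two leading spaces, one trailing space
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


print(sorted(trigrams("cat")))
print(trigrams("Data1") == trigrams("DATA1"))  # prints True
```

Because the trigrams are case-folded in this model, differently cased spellings of the same value map to the same set, which is what lets the index narrow down candidates before the ILIKE condition is rechecked on the actual rows.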
For a unique index there is indeed a big difference, as can be seen when using Michael's test setup:
create table y
(
id serial not null,
email text not null unique
);
insert into y(email)
select 'some.name'||n||'@foobar.com'
from generate_series(1,100000) as i(n);
-- create a trigram index to support ILIKE
create index ix_y on y using gin (email gin_trgm_ops);
create table x
(
id serial not null,
email citext not null unique
);
-- no need to create an index
-- the UNIQUE constraint will create a regular B-Tree index
insert into x(email)
select email
from y;
The execution plan for using ILIKE:
explain (analyze)
select *
from y
where email ilike ANY (ARRAY['[email protected]','[email protected]']);
Bitmap Heap Scan on y (cost=126.07..154.50 rows=20 width=29) (actual time=60.696..60.818 rows=2 loops=1)
Recheck Cond: (email ~~* ANY ('{[email protected],[email protected]}'::text[]))
Rows Removed by Index Recheck: 13
Heap Blocks: exact=11
-> Bitmap Index Scan on ix_y (cost=0.00..126.07 rows=20 width=0) (actual time=60.661..60.661 rows=15 loops=1)
Index Cond: (email ~~* ANY ('{[email protected],[email protected]}'::text[]))
Planning time: 0.952 ms
Execution time: 61.004 ms
And using citext:
explain (analyze)
select *
from x
where email = ANY (ARRAY['[email protected]','[email protected]']);
Index Scan using x_email_key on x (cost=0.42..5.85 rows=2 width=29) (actual time=0.111..0.203 rows=2 loops=1)
Index Cond: (email = ANY ('{[email protected],[email protected]}'::citext[]))
Planning time: 0.115 ms
Execution time: 0.254 ms
Note that the ILIKE query is in fact something different from the = query for citext, as ILIKE honors wildcards.
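The difference is easy to miss with e-mail addresses, because '_' is both a legal character in an address and a single-character wildcard in a pattern. The following Python sketch models the two comparisons. The functions ilike and ci_equal and the sample addresses are hypothetical, and the model ignores escape characters and locale-specific case folding.

```python
import re


def ilike(value, pattern):
    """Approximate ILIKE: '%' matches any run of characters, '_' exactly one."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    regex = "".join(parts)
    return re.fullmatch(regex, value, re.IGNORECASE | re.DOTALL) is not None


def ci_equal(a, b):
    """Approximate a case-insensitive equality test."""
    return a.lower() == b.lower()


print(ilike("userX1@example.com", "USER_1@example.com"))     # prints True
print(ci_equal("userX1@example.com", "USER_1@example.com"))  # prints False
```

In this model the pattern 'USER_1@example.com' also matches 'userX1@example.com', while the equality test does not, so the two queries can return different rows unless the metacharacters are escaped.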
However, for a non-unique index things look different. The following setup is based on a recent question asking the same:
create table data
(
group_id serial primary key,
name text
);
create table data_ci
(
group_id serial primary key,
name citext
);
insert into data(name)
select 'data'||i.n
from generate_series(1,1000) as i(n), generate_series(1,1000) as i2(n);
insert into data_ci(group_id, name)
select group_id, name
from data;
create index ix_data_gin on data using gin (name public.gin_trgm_ops);
create index ix_data_ci on data_ci (name);
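So we have one million rows in each table, with 1000 distinct values for the name column and 1000 duplicates of each distinct value. A query looking for 3 different values will thus return 3000 rows.
In this case the trigram index is substantially faster than the B-Tree index: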
explain (analyze)
select *
from data
where name ilike any (array['Data1', 'data2', 'DATA3']);
Bitmap Heap Scan on data (cost=88.25..1777.61 rows=1535 width=11) (actual time=2.906..11.064 rows=3000 loops=1)
Recheck Cond: (name ~~* ANY ('{Data1,data2,DATA3}'::text[]))
Heap Blocks: exact=17
-> Bitmap Index Scan on ix_data_gin (cost=0.00..87.87 rows=1535 width=0) (actual time=2.869..2.869 rows=3000 loops=1)
Index Cond: (name ~~* ANY ('{Data1,data2,DATA3}'::text[]))
Planning time: 2.174 ms
Execution time: 11.282 ms
And the query on the citext column with the B-Tree index now uses a Seq Scan:
explain analyze
select *
from data_ci
where name = any (array['Data1', 'data2', 'DATA3']);
Seq Scan on data_ci (cost=0.00..10156.00 rows=2904 width=11) (actual time=0.449..304.301 rows=1000 loops=1)
Filter: ((name)::text = ANY ('{Data1,data2,DATA3}'::text[]))
Rows Removed by Filter: 999000
Planning time: 0.152 ms
Execution time: 304.360 ms
Note that the filter in this plan compares (name)::text against a text[] array, so the comparison here is case-sensitive and returns 1000 rows rather than 3000; the array was not cast to citext[] as in the first citext example above.
The size of the GIN index is actually smaller than that of the index on the citext column:
select pg_size_pretty(pg_total_relation_size('ix_data_gin')) as gin_index_size,
pg_size_pretty(pg_total_relation_size('ix_data_ci')) as citex_index_size
gin_index_size | citex_index_size
---------------+-----------------
11 MB | 21 MB
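The above was done using Postgres 9.6.1 on a Windows laptop with random_page_cost set to 1.5.
Don't use ILIKE, it will result in a sequential scan: http://www.ienablemuch.com/2010/12/postgresql-case-insensitive-design-and.html – 2010-12-19 12:35:01
@MichaelBuen Are you sure? Do you have a reference? – 2013-05-06 12:28:09
I think there are many examples on the net. Here is the documentation: https://wiki.postgresql.org/wiki/FAQ 'Case-insensitive searches such as ILIKE and ~* do not utilize indexes' – 2013-05-06 13:16:51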