ruk·si

🐘 PostgreSQL
Text Operations

Updated at 2015-09-26 17:55

PostgreSQL text manipulation and lookup related notes.

Searching for text with LIKE.

SELECT
      t.name,
      t.some_cost
 FROM my_schema.my_table t
WHERE t.name LIKE 'name starts with this text%';

SELECT
      t.name,
      t.some_cost
 FROM my_schema.my_table t
WHERE t.name LIKE '%name ends with this text';

SELECT
      t.name,
      t.some_cost
 FROM my_schema.my_table t
WHERE t.name LIKE '%name contains this text%';

Case insensitive search with ILIKE.

SELECT t.name, t.some_cost
FROM my_schema.my_table t
WHERE t.name ILIKE '%name contains this text in any case%';

concat() function combines multiple strings together. || operator gives the same result but is a bit more messy.

SELECT concat(m.first_name, ' ', m.surname) AS full_name
  FROM my_schema.member m
ORDER BY full_name;

SELECT m.first_name || ' ' || m.surname AS full_name
  FROM my_schema.member m
ORDER BY full_name;

Add padding to a string.

SELECT lpad('10', 5, '0');
-- => 00010 (text)

Number of bytes in a string.

SELECT octet_length('a');     -- 1 in UTF-8
SELECT octet_length('ä');     -- 2 in UTF-8
SELECT octet_length('は');    -- 3 in UTF-8

POSIX regex (7.1+). The ~* operator does case insensitive POSIX regex pattern match. You can make it case sensitive with just ~.

SELECT 'abc' ~ 'abc';    -- true
SELECT 'abc' ~ '^a';     -- true
SELECT 'abc' ~ '(b|d)';  -- true
SELECT 'abc' ~ '^(b|c)'; -- false
SELECT p.title FROM product p WHERE p.title ~* 'master';
SELECT p.title FROM product p WHERE p.title !~* 'master';
-- You can also do text search queries using `to_tsvector` and `to_tsquery`.

Full text search is better than regex. E.g. searching word "satisfy" and return all text that also contain "satisfies".

-- FTS Configurations select a parser and a set of dictionaries to use.
-- FTS Dictionaries convert tokens to normalized form and reject stop words.
-- FTS Parsers break documents into tokens.
-- FTS Templates provide the functions underlying dictionaries.

-- Functions ts_rank_cd and tsquery in use.
SELECT
    title,
    ts_rank_cd(textsearch, query) AS rank
FROM
    apod,
    to_tsquery('neutrino|(dark & matter)') query
WHERE query @@ textsearch
ORDER BY rank DESC
LIMIT 10;

--                     title                     |   rank
-------------------------------------------------+-------
-- Neutrinos in the Sun                          |      3.1
-- The Sudbury Neutrino Detector                 |      2.4
-- A MACHO View of Galactic Dark Matter          |  2.01317
-- Hot Gas and Dark Matter                       |  1.91171
-- The Virgo Cluster: Hot Plasma and Dark Matter |  1.90953
-- Rafting for Solar Neutrinos                   |      1.9
-- NGC 4650A: Strange Galaxy and Dark Matter     |  1.85774
-- Hot Gas and Dark Matter                       |   1.6123
-- Ice Fishing for Cosmic Neutrinos              |      1.6
-- Weak Lensing Distorts the Universe            | 0.818218

Pattern matching (7.1+).

_ matches any single character.
% matches zero or more characters.
~~ operator is equivalent to LIKE.
~~* operator is equivalent to ILIKE.
ILIKE and operators are not part of SQL standard.
'abc' LIKE 'abc'    true
'abc' LIKE 'a%'     true
'abc' LIKE '_b_'    true
'abc' LIKE 'c'      false

Sources