Encoding – The alphabet used to write the words (what characters are allowed, how they’re stored)
Collation – The rules for sorting the words (dictionary order, accents, case handling)
1. PostgreSQL
Encoding (DB-wide)
Always:
ENCODING 'UTF8'
UTF-8 can represent every Unicode character, so it comfortably covers English, Cyrillic and pretty much everything else.
Collation + case-insensitive behaviour
PostgreSQL is the awkward one here:
Locale-based collations like en_GB.UTF-8 are NOT case-insensitive.
Case-insensitive behaviour is usually done with:
Nondeterministic ICU collations (PostgreSQL 12 or later, built with ICU support), or
The citext extension, or
Functional indexes on LOWER(column).
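The third approach in that list, a functional index on LOWER(column), can be sketched roughly as follows. The table name people and the index name people_name_lower_idx are hypothetical, chosen only for illustration. Note that lower() only folds Cyrillic correctly when the database's LC_CTYPE is a real UTF-8 locale rather than C.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE people (
    name text NOT NULL
);

-- Index the lower-cased value so case-insensitive lookups can use the index.
CREATE INDEX people_name_lower_idx ON people (lower(name));

-- The query has to use the same expression as the index definition.
SELECT name
FROM people
WHERE lower(name) = lower('Иван');
```

The trade-off is that every query has to remember to wrap both sides in lower(); a plain name = '...' comparison stays case-sensitive.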
You have two realistic “good compromise” options:
Option A – Use ICU collation (if your Postgres build supports ICU)
Check if ICU is available:
SELECT collname FROM pg_collation WHERE collprovider = 'i';
If you see rows, you can create a Unicode-aware, case-insensitive collation (PostgreSQL 12 or later) like:
CREATE COLLATION multilingual_ci (
provider = icu,
locale = 'und-u-ks-level2', -- 'und' = no specific language; 'ks-level2' = compare letters and accents, ignore case
deterministic = false -- required so strings that differ only in case compare as equal
);
Then:
CREATE TABLE example (
name text COLLATE multilingual_ci
);
Result:
UTF-8 storage
ICU Unicode rules (good for mixed English + Cyrillic)
Case-insensitive comparisons + ORDER BY
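To see what that means in practice, here is a small self-contained sketch. The names demo_ci and demo_people are hypothetical and used so the example does not clash with the objects above; the locale string 'und-u-ks-level2' is the form the PostgreSQL documentation uses for a case-insensitive ICU collation. It assumes PostgreSQL 12 or later with ICU support.

```sql
-- Hypothetical collation: strength level 2 ignores case differences.
CREATE COLLATION demo_ci (
    provider = icu,
    locale = 'und-u-ks-level2',
    deterministic = false
);

-- Hypothetical table whose column uses that collation.
CREATE TABLE demo_people (
    name text COLLATE demo_ci
);

INSERT INTO demo_people (name)
VALUES ('Иван'), ('иван'), ('John'), ('JOHN');

-- Equality ignores case, so this should match both 'Иван' and 'иван'.
SELECT name FROM demo_people WHERE name = 'ИВАН';

-- Sorting uses the same ICU rules for Latin and Cyrillic text.
SELECT name FROM demo_people ORDER BY name;
```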
Option B – Use citext (works everywhere, even without ICU)
CREATE EXTENSION IF NOT EXISTS citext;
CREATE TABLE example (
name citext
);
citext behaves like text, but it lower-cases both sides before comparing, so comparisons are case-insensitive under the database's locale settings.
Works fine for English + Cyrillic as long as the DB is ENCODING 'UTF8'.
Recommended combo for you (simple + reliable):
Encoding: UTF8
Collation: OS default UTF-8 locale (e.g. en_GB.UTF-8)
Use citext for case-insensitive columns
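Put together, the recommended combo might look something like the sketch below. The database name myapp and the table users are hypothetical, and the locale name has to exist on the server's operating system (en_GB.UTF-8 is just the example used above).

```sql
-- Run as a role that is allowed to create databases.
-- template0 is needed when the encoding or locale differs from template1's.
CREATE DATABASE myapp
    ENCODING 'UTF8'
    LC_COLLATE 'en_GB.UTF-8'
    LC_CTYPE 'en_GB.UTF-8'
    TEMPLATE template0;

-- Then, connected to myapp:
CREATE EXTENSION IF NOT EXISTS citext;

-- Hypothetical table: UNIQUE on a citext column is case-insensitive too.
CREATE TABLE users (
    name citext UNIQUE
);

INSERT INTO users (name) VALUES ('Иван'), ('John');

-- Should find the row stored as 'Иван'.
SELECT name FROM users WHERE name = 'иван';
```

Because name is UNIQUE, a later insert of 'JOHN' would be rejected as a duplicate of 'John', which is usually what you want for usernames and emails.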
2. MariaDB
MariaDB is much easier here: _ci collations are already case-insensitive.
Character set
Use full Unicode:
CHARACTER SET utf8mb4
Collation
Recommended:
COLLATE utf8mb4_unicode_ci
This gives you:
Unicode (English + Cyrillic fully supported)
Case-insensitive comparisons and sorting
More accurate multilingual sorting than utf8mb4_general_ci, because it follows the Unicode Collation Algorithm
Typical DB creation:
CREATE DATABASE myapp
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;
Tables created in this database inherit these defaults, so VARCHAR/TEXT columns compare case-insensitively for both English and Cyrillic unless a table or column overrides the collation.
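A quick way to check the behaviour is sketched below. It assumes the myapp database from the statement above; the table example is hypothetical. SET NAMES is there so the client connection also talks utf8mb4.

```sql
USE myapp;
SET NAMES utf8mb4;

-- Confirm the defaults of the current database.
SELECT @@character_set_database, @@collation_database;

-- Hypothetical table; it inherits utf8mb4 / utf8mb4_unicode_ci from the database.
CREATE TABLE example (
    name VARCHAR(200)
);

INSERT INTO example (name)
VALUES ('Иван'), ('иван'), ('John'), ('JOHN');

-- Under a _ci collation this should match both 'Иван' and 'иван'.
SELECT name FROM example WHERE name = 'ИВАН';

-- A single column can opt out by switching to a binary (case-sensitive) collation.
ALTER TABLE example MODIFY name VARCHAR(200) COLLATE utf8mb4_bin;
```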
3. SQL Server
Look for collation names containing these suffixes:
_CI → case-insensitive
_AS → accent-sensitive (you probably want this)
_SC → supports supplementary characters (nice to have)
You’ll be using Unicode types (NVARCHAR, NCHAR) anyway.
Recommended collation
A very solid, widely used choice:
Latin1_General_100_CI_AS_SC
Why?
Latin1_General Windows collations are Unicode-aware, so with NVARCHAR/NCHAR columns they cover Cyrillic too
100 = newer collation version (better Unicode support)
CI = case-insensitive
AS = accent-sensitive (usually what you want for names, etc.)
SC = supplementary characters (emoji etc.)
Example:
CREATE DATABASE MyApp
COLLATE Latin1_General_100_CI_AS_SC;
GO
USE MyApp;
GO
CREATE TABLE dbo.Example (
Name NVARCHAR(200) COLLATE Latin1_General_100_CI_AS_SC
);
That will sort and compare strings in a case-insensitive way, for English and Cyrillic alike.
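A short usage sketch against the dbo.Example table above; the sample rows are made up for illustration. The main thing to notice is the N prefix on string literals: without it, Cyrillic text can be converted through the database's code page and end up stored as question marks.

```sql
-- Unicode literals need the N prefix.
INSERT INTO dbo.Example (Name)
VALUES (N'Иван'), (N'иван'), (N'John'), (N'JOHN');

-- Under a _CI collation this should match both N'Иван' and N'иван'.
SELECT Name FROM dbo.Example WHERE Name = N'ИВАН';

-- One comparison can be made case-sensitive by overriding the collation inline.
SELECT Name
FROM dbo.Example
WHERE Name = N'John' COLLATE Latin1_General_100_CS_AS_SC;

-- List the case-insensitive Latin1_General_100 collations with _SC that this server offers.
SELECT name
FROM sys.fn_helpcollations()
WHERE name LIKE N'Latin1_General_100_CI%SC';
```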
Quick cheat-sheet
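| Engine     | Encoding / Charset | Collation / Type                                      | Case-insensitive?       | Good for English + Cyrillic? |
|------------|--------------------|-------------------------------------------------------|-------------------------|------------------------------|
| PostgreSQL | ENCODING 'UTF8'    | ICU multilingual_ci (und-u-ks-level2) or citext type  | ✅ (with ICU or citext) | ✅ Yes                       |
| MariaDB    | utf8mb4            | utf8mb4_unicode_ci                                    | ✅ _ci = CI             | ✅ Yes                       |
| SQL Server | (Unicode types)    | Latin1_General_100_CI_AS_SC                           | ✅ _CI = CI             | ✅ Yes                       |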