PHP scenarijus su SQLite3 kodu

Enhancing SQlite3 with ICU extension

I have been using SQLite3 databases in a few of my projects lately, and liked a lot the simplicity of development using SQLite3. It is so easy when database is a single file, and there is no extra database server to connect to. I liked SQLite3 so much that I have even moved this blog to use it as backend DB.

One thing that troubled me, however, was lack of proper text ordering in Lithuanian language, as well as lack of case-insensitive search for UTF8 characters.

Here is how SQLite3 would order Lithuanian letters:

$ sqlite3
SQLite version 3.37.2 2022-01-06 13:25:41
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TABLE test (name TEXT COLLATE NOCASE);
sqlite> INSERT INTO test (name) VALUES ('ž'), ('Ž'), ('e'), ('q'), ('į'), ('i'), ('y'), ('ė'), ('ę'), ('c'), ('z');
sqlite> SELECT name FROM test ORDER BY name;
c
e
i
q
y
z
ė
ę
į
Ž
ž
sqlite>

All specific Lithuanian letters would be moved to the end by the available collations.

At first I tried to solve the ordering issue by a workaround (retrieve database results into a PHP array and then reorder it; or order the results on the frontend using, for example, Datatables). That was not very scalable (it meant loading full tables into PHP’s memory for sorting!) and it meant extra programming. So, not good enough.

Later I found out that PHP’s SQlite3 extension can pass the ordering or lowercasing of text to a PHP function, which removed the need for extra programming to order (or search) properly. I described how to do that in this forum post (for both Vanilla PHP and Codeigniter 4). Searches, however, became so much slower. Not a big problem on a small app, mostly unnoticeable, but once I tried it on a database with 0.5 mln records searches began to take up to 5 seconds.

Finally today I managed to compile the ICU extension for SQLite3, which reduced the database operation time by half in comparison to the operation with PHP functions, since SQLite3 no longer needs to employ external PHP functions for ordering, lowercasing and searching. That is perhaps the best that can be done in trying to optimize SQLite3 for web projects on my part.

Let me share it.

So, my development computer has a Kubuntu 22.04 OS.

First I had to get compile dependencies as well as the development packages for SQlite3 and the ICU libraries:

$ sudo apt build-dep sqlite3
$ sudo apt install libicu-dev libsqlite3-dev

Then I got the source code of SQlite3:

$ mkdir /tmp/SQLITE
$ cd /tmp/SQLITE
$ apt source sqlite3

Then got into the source, the ICU extension folder:

$ cd sqlite3-3.37.2/ext/icu

Within that directory there were three files:

$ ls
icu.c README.txt  sqliteicu.h

So I read the `README.txt` file first, where I found the compilation instructions:

$ gcc -fPIC -shared icu.c `pkg-config --libs \  
     --cflags icu-uc icu-io` -o libSqliteIcu.so

After running the command the `libSqliteIcu.so` extension was compiled. I tried to use it in SQLite3 cli client:

$ sqlite3
SQLite version 3.37.2 2022-01-06 13:25:41
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .load ./libSqliteIcu.so
Error: ./libSqliteIcu.so: undefined symbol: sqlite3_sqliteicu_init

So, something did not quite work, some undefined symbol `sqlite3_sqliteicu_init`, whatever it is…

It sent me googling, and I found explanation that suggested I needed to open the file `icu.c` and replace every instance of string `sqlite3_icu_init` with `sqlite3_sqliteicu_init`.

So I replaced `sqlite3_icu_init` with `sqlite3_sqliteicu_init` (found one string only) and rerun the compile command. And voila, it worked! Here is the test:

$ sqlite3
SQLite version 3.37.2 2022-01-06 13:25:41
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .load ./libSqliteIcu.so
sqlite> SELECT icu_load_collation('lt', 'lithuanian');

sqlite> CREATE TABLE test (name TEXT COLLATE lithuanian)
   ...> ;
sqlite> INSERT INTO test (name) VALUES ('ž'), ('Ž'), ('e'), ('q'), ('į'), ('i'), ('y'), ('ė'), ('ę'), ('c'), ('z');
sqlite> SELECT name FROM test ORDER BY name;
c
e
ę
ė
i
į
y
q
z
ž
Ž

So, now the ordering was good.

To use the extension with Apache’s php I needed to put the extension into some common folder, so I created it:

$ mkdir -p /usr/lib/sqlite3/ext/
$ cp libSqliteIcu.so /usr/lib/sqlite3/ext/

Then I pointed php.ini file located at `/etc/php/php8.1/apache/php.ini`, found relevant SQLite3 section and pointed the variable `sqlite3.extension_dir` to the directory:

[sqlite3]
# Directory pointing to SQLite3 extensions
sqlite3.extension_dir = /usr/lib/sqlite3/ext/

The extension has to be loaded in PHP scripts after the initiation of database. Below is a full example:

<?php

$db = new SQLite3(':memory:');
$db->loadExtension('libSqliteIcu.so');
$db->exec("SELECT icu_load_collation('lt', 'lithuanian')");

$db->exec('CREATE TABLE test (name TEXT COLLATE lithuanian)');

$db->exec("INSERT INTO test (name) VALUES ('ž'), ('Ž'), ('e'), ('q'), ('į'), ('i'), ('y'), ('ė'), ('ę'), ('c'), ('z')");

$result = $db->query('SELECT name FROM test ORDER BY name');

while ($row = $result->fetchArray(SQLITE3_ASSOC)) {
    echo $row['name'] . "\n";
}

Paskelbta

sukūrė

Žymos:

Komentarai

Parašykite komentarą

El. pašto adresas nebus skelbiamas. Būtini laukeliai pažymėti *

Brukalų kiekiui sumažinti šis tinklalapis naudoja Akismet. Sužinokite, kaip apdorojami Jūsų komentarų duomenys.