SQLite FTS3 Virtual Table

You need to enable full-text searching using SQLite's FTS3 indexing mechanism, creating a search function like Google. Use SQLite to index all the words in a TEXT column in your SQLite3 database. This article shows you how to get started with the FTS3 feature of SQLite.

=== Optimization results for FTS3 database ===
    Database contained 700+ text documents.   
    Reduced by not indexing stopwords.        

No FTS3 enabled:   7.0 MB  [smallest]
Unreduced FTS3:   19.0 MB  [largest]
Reduced FTS3:      9.0 MB 

Creating FTS3 tables

First we make a virtual table. Every FTS3 table is actually a virtual table with extensions enabled in the SQLite engine. After we look at some examples of using FTS3, we will see some ideas of how its performance can be improved. You can create a virtual FTS3 table using SQL in SQLite. You can store other kinds of columns in SQLite virtual tables, but only the TEXT ones are searched with FTS3 and the MATCH syntax.

-- Create a fts3 table that contains two columns.
-- The first is a text column and the second is a datetime column.
-- The column names, table name, etc. must be customized.
CREATE VIRTUAL TABLE TableName
    USING FTS3(ColOne TEXT, ColTwo DATETIME)

How to UPDATE your FTS3 table

Here we look at updating FTS3 tables. Remember that before you are able to update data into the virtual table, you need to insert rows into it. With SQLite there is no way around checking whether you need to insert or update a row. Here's how to update.

-- Update our virtual table.
UPDATE TableName
    SET ColOne=@a
    WHERE ColTwo=@b

Description of the UPDATE statement. You need to select first. Select first to see if the row exists, and you can insert as usual if it isn't there.

FTS3 is like a regular table. The virtual table acts like a regular table, but you need to manually maintain it. You must maintain the indexes the virtual table to maintain referential integrity. You need to UPDATE the FTS3 table. The preceding code example shows how you can update the virtual table.

How to use MATCH

Here we see how to use MATCH to search through full-text tables in SQLite. Make sure you use MATCH instead of equals or LIKE to scan a TEXT column in a virtual table.

-- Look at the line starting with MATCH.
-- That's where the fts3 stuff kicks in.
-- FullText is a fts3 virtual table.
SELECT jt.EntryTitle, jt.DateTimeNew, jt.WordCount
    FROM JournalTable jt, FullText ft
    WHERE ft.FullContent
    MATCH @query
    AND jt.DateTimeNew = ft.FullDate
    ORDER BY ft.FullDate DESC
    LIMIT 100

It selects a row. We find a row that contains the query, and then we take the date from that row in the FTS3 table. Then we do a join and find the row in the other table that has more information. This 'links' the virtual table with the main table through a surrogate key. This syntax is very similar to LIKE or WHERE syntaxes, so it won't be a problem to understand.

How to use rowid for matching

In SQLite and other databases, each row has a unique identifier called a rowid. You don't need to tell the database to create this. The rowid gives us some more power and is very cheap.

-- Get some matches from the fts3 table, then select the rowid.
SELECT rowid
    FROM VirtualTable
    WHERE ContentCol
    MATCH @query
    LIMIT @limit
    OFFSET @offset;

How to SELECT with rowid

You have a main content table, which contains content and data that your user interacts with, and the SQLite FTS3 virtual table. Search text using MATCH on the virtual table, and then get the corresponding row content.

-- Fetch the relevant row from the other table.
SELECT ColA, ColB
    FROM DataTable
    WHERE rowid=@id;

How to JOIN on rowid

To optimize performance and improve code clarity, minimize the trips to the SQLite database. Use an implicit join on the two tables using rowid. Because rowid is an int, this will be a fast seek.

-- Join virtual table with regular table on rowid.
SELECT dt.ColA, dt.ColB
    FROM DataTable dt, VirtualTable vt
    WHERE vt.ContentCol
    MATCH @query AND dt.rowid = vt.rowid
    LIMIT @limit
    OFFSET @offset;

Description. It selects from both tables. The above SQL selects from both tables at once, and names them dt and vt. Next it finds all ContentCol where the query is present and then does an implicit join on the two tables. It returns the full content that you want to give the user instead of the virtual table contents.

Optimize database size with FTS3

The author's database grew to three times its previous size when I enabled FTS3. Stopwords are short, common words that Google and Yahoo don't index. These make searching slower and bloat the indexes. To solve this problem, he added code to remove stopwords.

See Stopword Dictionary.

Summary

In this tutorial, we saw how you can use FTS3 in SQLite 3 for fast searching similar to Google. SQLite does this well. The results of FTS3 implementations in my programs were impressive. The searches using this code were faster than those in Windows Vista or Apple Spotlight.

See Data Content.

© 2007-2010 Sam Allen. All rights reserved.

Dot Net Perls  Sam Allen