Querying other databases from SQLite

For such a tiny database engine, SQLite can really do a lot. It also offers many extensibility points, which QueryStorm makes good use of. QueryStorm extends its SQLite engine so that it can see Excel tables as if they were regular database tables. But did you know that the SQLite database can also access data from external databases? You can, for example, write a query in SQLite that joins data from a Postres database with data from a SQL Server database (or even data returned from a web service). In this article, I’ll show you how.

This functionality uses table-valued functions defined in C#. I describe this functionality in more detail in this article. Internally, it uses C# as glue but this article does not require that you know C#.

Step 1: Prepare a query for an external database

For a start we need to define the queries that fetch the data we’re interested in.

For this demo, I’ll load some data from my AdventureWorks2014 database. I’ll fetch data for all people in the database, and for each person, I’ll fetch their name, email, city and country.

Loading data from AdventureWorks2014 db

I’ll embed this query into the workbook, and call it GetCustomers.

I’m also interested in the population of the city that each person comes from because, for example, I want to send a nice email to all customers that live in a town with less than 100k poeple.

I don’t have information about the population of cities in the AdventureWorks database. However, I do have a Postgres database with information on the population of cities, and I can load the data from there.

Loading city populations from Postgres db

I’ll embed this query into the workbook as well and call it GetCityPopulations.

Step 2: Generate table-valued functions

Now that I have both queries embedded in the workbook, I can generate table-valued functions them that I can use from SQLite. I can do this from the context menus of the queries:

Generating table-valued functions

This generates a C# wrapper that calls the query internally. It enables calling the query as a table-valued function from SQLite.

You don’t have to know C# or care about the contents of these C# scripts, but in case you’re curious, here’s what one looks like:

Generated C# API code

Step 3: Run queries from SQLite

Now that the table-valued functions are ready, we can switch to SQLite and join data from both data sources:

SELECT
    gc.*, gcp.population
FROM
    GetCustomers() gc
    INNER JOIN GetCityPopulations() gcp 
        ON gc.City = gcp.city AND gc.country = gcp.country
WHERE
    population < 100000

Generated C# API code

Query parameterization

One thing to note here is that that data from both databases was loaded independently. This could have been a problem if both databases stored huge amounts of data. If the queries could be parameterized, it would be possible to load subsets of data as needed. The query would look like this:

SELECT
    gc.*, gcp.population
FROM
    GetCustomers() gc
    INNER JOIN GetCityPopulations(gc.country, gc.City) gcp 
WHERE
    population < 100000

This would mean that GetCityPopulations would be called only for cities where there are customers. A caching mechanism takes care of reusing the results of table-valued functions called with the same parameters.

However, at present, this query will not work. Currently there is no way to pass parameters to queries to remote databases, but support for this is on the way!

Conclusion

Working with data from various data sources is extremely useful for data exploration, analysis and reporting tasks. With QueryStorm’s ability to query multiple databases from SQLite, these scenarios are much easier deal with.

Leave a Comment

Your email address will not be published. Required fields are marked *