Origin of Cohelion: The action of process of multiple independent parts to become aligned around a central theme. Used in Astronomy and in Information Theory

The role of SQL within the Cohelion Data Platfom

In this blogpost Peter de Heer explains the basics of SQL and why this industry standard is heavily used within the Cohelion Data Platfrom

What is SQL?

SQL stands for Structured Query Language. It’s an ANSI and ISO standard for accessing relational database systems. Several dialects exist as each database system vendor made tweaks and enhancements over time to fit their product and the state of technology. The standard itself evolves as well, adding new powerful features over time.

At Cohelion we use Microsoft SQL Server, which works with the Transact-SQL (T-SQL) dialect.

Standardized since 1986, SQL has survived the test of time as today it’s the primary solution for data-querying and manipulation.

What is SQL used for?

Invented specifically for relational database systems, SQL covers everything that these systems do, such as:

  • Configuring the database management system and account management;
  • Defining and manipulating databases and their files;
  • Defining and manipulating data-tables and their relationships;
  • Manipulating and querying of data stored in tables.

With this many responsibilities, T-SQL is divided into more specialized sub-languages. For this article the focus is on manipulating and querying of data stored in tables and will refer to that as SQL.

What is SQL mostly used for?

The dominant use of SQL is data querying/filtering, usually to display something in the user-interface. This use is closely followed by data-manipulation tasks like editing/importing data.

Recognizable activities that rely on SQL:

  • Opening a page with stored information (rights check + fetching of data);
  • Populating a drop-down list (regions, countries, …);
  • Searching in existing data;
  • Saving, updating or removing existing data;
  • Importing data from any source;
  • Any kind of reporting activity or preparation;
  • Notifications of all kinds.

Anything to do with persisted data will require SQL because the data itself or the meta-data is stored in the database (accessible only by SQL)!

Let's try it!

This is a simple example that helps to get a sense on how SQL looks and is structured.

The function is: Get the first five results from a sorted list of countries that have a non-empty country-code.

The result is show in the bottom table.

A few things are happening here:

  1. Select specifies this is a data-retrieval operation.
  2. Top 5 specifies that only the first 5 rows from the result are needed.
  3. IdCountry = C.Id specifies a column in the result. The desired name of the column is specified first and assigned is it’s value. Here C.Id is the source for the value, where C. is a shorthand alias to refer to the dbo.Country table.
  4. From specifies a list of source tables follows.
  5. dbo.Country as C is the first and only source table in the example. It’s given a short alias as not to have to specify dbo.Country in full every time it’s referenced. This practice is common as in complex SQL queries a table is often specified more than one time as a source. In which case the alias is the only thing that makes each source distinctly addressable.
  6. Where specifies some filter conditions follows.
  7. len( C.Code ) > 0 is the condition that all results have to satisfy. The length of the code column in the dbo.Country table is measured and compared to zero. In most non-trivial queries many conditions apply and get specified here.
  8. Order by specifies an ordering of the rows that passed the filtering process must happen.
  9. C.Label asc specifies the first and only column to sort on, the sorting happens in ascending order. In most non-trivial queries multiple columns are part of the sorting process.

What makes SQL so beneficial?

As the earlier SQL example demonstrated, writing SQL is much like writing a specification of the needed functional result. It’s left to the database server to provide a result that satisfies this specification. It remains however a bit of an art to write good specifications (nothing is entirely free).

But how the database sever does optimally provide a result and deals with concurrent requests is not something the SQL code writer has to deal with (typically). This arrangement works quite well, provided the database is structured properly and “clean” (numbers are stored as numbers, etc.).

The advantages of this division in responsibility include:

  • SQL provides rapid iteration opportunities on solutions and ideas as it avoids the costly manual task of writing “how-to” code for each iteration. Many complex data processing tasks are solved faster and will perform better than possible in other languages (by several orders of magnitude)!

  • The SQL syntax allows for easy insertion of new filter conditions. For example: when a user types in a numeric value in a search control, application code can apply this as a numeric filter on numeric columns, instead of searching for it as text in text columns. The Cohelion Data Platform makes heavy use of context dependent and automatically extended SQL.

  • The SQL syntax allows for easy construction of entirely dynamic queries, provided sufficient meta-data is available to work with. The Cohelion Data Platform benefits a lot from this because of high quality meta-data defined specifically for each client. The platform can deliver a lot of functionality without repeated manual coding out each and every use-case.

SQL is also flexible about the rule-set for matching/ranking/searching texts. To be user-friendly our applications default to case insensitive and accent insensitive. But in rare situations strict/literal matching is needed and in SQL it’s quite easy to specify where another rule-set must be used.

Other SQL functionality that saves development time and improves performance:

  • High performance aggregation / grouping of data;

  • In-query (un-)pivoting of data;

  • Partitioning and sorting of data (example: a ranking per partition, based on sort order);

  • Inline functions to abstract complex fixed logic, without performance penalty.

Important to understand is that SQL excels at data handling and data-manipulation in all common forms. Today data is everything and everywhere and as a result so is SQL. There is no going around such a powerful tool or dare to under-utilize the benefits it offers.