SQL : Java Glossary

SQL
SQL (Standard Query Language) is a platform independent relational database query language. It is accessed via JDBC (Java Data Base Connectivity) in Java. SQL and relational databases were pioneered by Codd and C.J. Date.

JDBC. Sun’s official position is that expansion is wrong, although that is the generally accepted assumption. It describes a list of methods a Java programmer can use to access an SQL relational database. JDBC is similar to Microsoft’s ODBC (Open Data Base Connectivity) interface to SQL databases.

One of the major benefits to SQL is that you always have a map of your file structure, fields and data types that is completely up to date. Without a database, it is up to programmers to maintain the file descriptions manually. If they let them slide, you won’t be able to export your data to a replacement application program.

The Vendor Lists Deleting Atomic Updates
Alternatives to SQL Simple Statement Security
CLasses of Statement PreparedStatement Text Searching
Discovery ResultSets Under The Hood
Comments Quoting O-R Tools
Creating Transaction Processing A Modest Proposal
Querying SQL Cleverness Books
Updating Gotchas Learning More
Inserting Escapes Links

The Vendor Lists

I have compiled four vendor lists, now split off into separate documents:

  1. JDBC driver vendors.
  2. SQL engine vendors including prices.
  3. Java-friendly ISPs.
  4. Canadian Internet Access Providers.

Alternatives to SQL

You may vaguely sense you need a data base and so reflexively reach for SQL. SQL is a fairly big hammer. Perhaps something simpler and lighter weight will suffice.
Alternatives To SQL
Technique Advantages Disadvantages Description
Serialisied Collections Very fast. Easy to program. No protection from crashing. Not scaleable to large datasets. Totally RAM (Random Access Memory) resident HashMaps, TreeSets etc, serialised when the prgram is not running.
Serialisied Objects Very fast. Easy to program. No protection from crashing. Very wasteful of disk space. No caching. Inefficient use of operating system cache. You serialise your objects to a ByteArrayStream. Let us say m bytes is the worst case longest serialised object. You divide your RandomAccessFile into n slots each m bytes. You write your serialised object in slot s, computing the offset to seek to as s * m; The File I/O Amanuensis will show you how how to serialise/write and read/reconstitute objects to the RandomAccessFile. This of course wastes space for Objects shorter than m, and does not allow objects to grow larger than m bytes. The Hermit Crab file described next avoids those problems.
Hermit Crab Files Very fast. Easy to install. Code is compact. No caching other than what the operating system does. No need to install/start/stop/manage a separate SQL engine. No conflicts with other apps using the SQL engine. Limited protection from crashing. Provides only lookup of variable length records by integer key. Indexes must be done with ordinary HashMaps etc. Can provide extremly fast access to very large databases. You have to hire someone to write the code, e.g. me. There are simpler variants: e.g. HashMap of offsets to random access file, or to NIO (New Input/Output) Memory mapped file offsets. You don*rsquo;t recycle file slots, just tack on the end.
POD (Persistent Object Database) Flexible, may offer crash protection, coding similar to working with Serialised collections Expensive. Tend to be slow. Though some claim to be much faster than SQL databases. Ideally the entire database floats into RAM and stays there.
Caché Does multidimensional searches, not just tables. Single source from Intersystems Caché Innovations. They won’t tell you the pricing. All the documentation I have read was written by a salesman who had no clue what the product really did, or he was not willing to divulge much, other than it is fast and wonderful. It is both a POD and a SQL engine.
SQL The SQL engine intelligently manages indexes to do searches. It is not the programmer’s responsibility. An SQL app will not fail, just degrade, when RAM gets tighter. With an in RAM key lookup, you must have sufficient RAM for all the keys. You can improve the performance of an SQL engine just by throwing more RAM at it. The engine can use the RAM for both keys and objects. If your indexing needs become more complex, you can just add. You don’t have to start your coding approach over from scratch. Usually SQL engines run in a separate address space. This lets your app use its entire 2 GB address space for its own purposes. Clumsy to code and install because it is platform specific and runs as a separate process. Can be very expensive for high end engines. relational database. Programs only see parts of the database they are authorised to see.

Classes of Statement

Discovery

To get started, find the names of the databases. From there you can discover the tables in one of those databases, and

Comments

Comments are done three ways:
  1. -- to end of line, Ada style.
  2. /* or /** to matching */ C/C++/Java style
  3. non-standard # to end of line, CSV (Comma-Separated Value) style.

Creating

If someone has not already set up the databases for you, you will have to do it with code something like this. It is quite different depending on vendor. The vendors all use different names for the various field types. Happily your

To modify a table:

DROP TABLE animals;
DROP DATABASE mydata;
ALTER TABLE animals DROP COLUMN toothCount;
You will need high privilege to do that.

Querying

In SQL, you request sets of records with statements like this to show just the name, city and state of people in Massachusetts. DESC requests descending order. ORDER BY sorts in SQL are always case-insensitive, in other words apple is treated the same as Apple.
SELECT last_name, first_name, city, state
FROM contacts
WHERE state = 'MA'
ORDER BY last_name DESC;
You can limit the number of results returned, though the syntax is non-standard. This is MySQL syntax:
SELECT confirm, ordertimestamp
FROM orders
WHERE confirm < 2000 AND vendorid = 1234
ORDER BY confirm DESC
LIMIT 1
Will find the previous record to confirm number 2000.

There is some slick syntax like BETWEEN and IN writing terser WHERE clauses. LIKE 'Mc'; gives you wildcard matching. You can also summarise data with queries like this to get the count of people in each state (not bothering

SELECT state, count(*)
FROM contacts
WHERE age > 18
GROUP BY state
HAVING COUNT(*) > 1
ORDER BY state;

Updating

To change individual fields is a bit tedious. You must compose ASCII (American Standard Code for Information Interchange) sentences. You can’t just hand over the modified record in binary. You must tell it precisely which fields changed and how to find the record that needs
UPDATE contacts
SET last_name='Brown', state='WA'
WHERE acct=2103 AND state='MA';
By adding AND state=MA you ensure no recent changes have been made by someone else.

Inserting

The syntax for adding new records is quite different from that for updating. If you left off the WHERE clause, every record in the table would be updated! To insert a new record you need something like
INSERT INTO contacts(last_name, first_name, city, state)
VALUES('Brown','James','Seattle', 'WA');
With INSERT, you have to supply all the must enter fields. For bulk insertions, there is the LOAD TABLE command that accepts a file of comma and apostrophe delimited data.
LOAD INTO TABLE contacts FROM 'C:\temp\contacts.txt';

Deleting

Delete is straightforward. Be careful. If you forget the WHERE clause, every record in the Don’t confuse deleting with dropping. Deleting refers to discarding data. Dropping refers to deleting table structures.

Simple Statement

Here is how you do a one-off SQL statement in Java.
// using a simple Statement
final Statement updater = conn.createStatement();

updater.executeUpdate( "UPDATE mboards "
                        + "SET formFactor="
                        + FormFactor.ATX.ordinal()
                        + ",widthInCm=30.5,heightInCm=22.4 "
                        + "WHERE manufacturer="
                        + manufacturer.ordinal()
                        + " AND model='GA-M750SLI-DS4'");

PreparedStatement

Most of the time you reuse PreparedStatement, filling in different data values for each use. It has the added advantage is the data fields you insert don’t have to be manually quoted. PreparedStatement deals with awkward embedded characters in your selector fields for you.

PrepareStatement protects you from composing ordinary Statements from Strings containing incidental 's, e.g. 'O'Toole'. It also protects you from using SQL keywords incidentally occuring in data from being interpreted as SQL commands, e.g. 'OAK PRIMARY SCHOOL TABLE'. For more about the security problem, read up on injections.

In Oracle, it does not pay to use preparedStatement unless you are going to use the same query at least 50 times. In fine tuning, you want to discover where your own database’s break point is. You still might use it even for one-shots just to achieve the security advantages.

The server maintains an object to represent the prepared statement query. The server may plan its strategy when you submit the prepared statement, or may postpone that decision, or part of it, until it has the actual data values filled in, which give it further hints on whether a given index would be any use, particularly with LIKE clauses.

ResultSets

How do you get results back from a query into your variables? This is not so easy. You might think SQL would hand you an Iterator of Objects populated by fields named after the columns. No such luck (unless you used Hibernate or some sort of POD interface). It is quite a production, with JDBC method calls for each field. You will have to pore over the JDBC documentation. You need code roughly like
String employeeName = result.getString( "EmpName" );
int employeeNumber = result.getInt( "EmpNum" );
To get a row back you make a series of method calls, roughly one per field. The exact format of what the SQL engine sends back is thus invisible to the programmer. The JDBC method calls are ghastly code only their mother could The JDBC interface is not really designed for direct human use. We desperately need, but do not yet have, a layer to shield you from all the administrative details. Rows should just be a group of smart self-validating objects that automatically refresh the screen and inform the database of changes when their values are changed either by keying or computation. This is the way Abundance works.

Quoting

SQL uses quite different string literal conventions from Java. Strings are surrounded in ( ') not ( "). Embedded ( ') are written ( '') [two single quotes in a row] not ( ") not ( \') and embedded ( ") are left plain as ( "). These conventions also apply to data imported into SQL as comma-delimited Strings. It gets really hairy creating string literals in Java to be fed to SQL since you have two layers of quoting. First you compose the string to get it right for SQL, then you apply the Java quoting conventions. You also have to be aware of the SQL quoting conventions when you dynamically compose SQL statements in Java or when you feed data to SQL from Java. None of this would be necessary if SQL had a method interface instead of an ASCII sentence interface.

One more complication: SQL sometimes uses double quotes ", to surround identifiers, e.g. table names that have spaces or other awkward characters in them. Not all SQLs (Standard Query Languages) support this. In Sybase, you must use set quoted_identifier on to enable the feature.

Transaction Processing

Transaction processing is too complicated to explain in a paragraph or two. Happily there are some tutorials, (see below) that goes into a fair bit of detail of how it works both in SQL and JDBC.

The basic idea is you can do a group of SQL operations. If any of them fail, all changes back out to the way they were when you started the group. The group of SQL operations is called a transaction. You mark the end of a transaction with Connection. commit. To force partially complete changes to be undone, use Connection. rollback. Control the how different transaction threads interact with Connection. setTransactionIsolation.

SQL Cleverness

SQL looks quite simple, but is suprisingly powerful. It will let you look up by fields which are not indexed. It will let you change the primary key in a record. It will let you change individual fields in a record without disturbing the others. SQL has its own procedural language to write triggers, code that is automatically run before or after various database events.

SQL tries hard to avoid transporting data to and from the server. Instead of fetching records for you to look at the client, you send a command to the server to do what you want and return just the summarised information.

Don’t be timid about creating huge result sets then only using part of them. Most database engines are quite clever, and only transmit a hundred records at a time of the result set. This buffering is completely transparent to your application.

Let’s say, for example, you asked for a list of people living in France with a WHERE country = 'France' clause. While you were processing your giant ResultSet, one of these people moved to Belgium and somebody else updated their record. SQL will ensure either you get that person’s old record showing him living in France, or it will exclude him from the ResultSet before you process it. Your ResultSet is guaranteed to contain only people listed as living in France. If you update them, you might put a WHERE country = 'France' clause to ensure that fact has not recently changed.

You should only see the new state (sometimes, and probably very rarely) if the transaction isolation level is READ_UNCOMMITTED. If it’s any higher than that, then the update in another transaction should not cause the SELECT in this transaction to produce spurious results. Most database products default to a transaction isolation level higher than READ_UNCOMMITTED.

Gotchas

SQL uses = both for assignment and comparison unlike Java with uses = for assignment and == for comparison.

If you load your triggers individually they work. If you try to load them in batches, SQL gets confused about terminating semicolons. You can view your triggers with:

SELECT * FROM SYS.SYSTRIGGERS;

SQL uses CASE/WHEN/ELSE instead of SWITCH/CASE/DEFAULT. Its these little differences that often trip you up and leave you scratching your head.

SQL is missing features you would expect such as the ability to traverse forward and back in ResultSets. It is gradually improving; JDBC now has ResultSet.TYPE_SCROLL_INSENSITIVE which lets you examine ResultSets by random access. However, there is still no guarantee the version of your JDBC driver and the version of your SQL engine will support the crucial new feature you need. Support for almost every feature is optional.

LIMIT row_count lets you limit the size of a result set. Unfortunately, this is not standard in all SQLs. Your vendor may do it a different way. SQL-2003 is the most standardised of all the variants. Users are beginning to dig in their heels and refuse to put up with lock-in proprietary extensions. There is now wide choice.

Escapes

One problem with SQL is its age. There are many divergent dialects, and the standards allow for a lot of slop. It can take more time to move an app from one SQL engine to another than to write it in the first place. The problem is you start to expect things to work a give way the second time around. The first time, you test everything incrementally. JDBC tries to restore order to at least date/time literals by inventing its own, that it guarantees to convert into whatever form your particular SQL engine likes, e. g. the documentation is vague on which time zone is implied. I strongly suggest storing all database information in GMT (Greenwich Mean Time).

For more information on escapes, see JDBC literals.

Atomic Updates

Imagine what would happen if you updated a bank balance in two transactions, one to discover the balance, and one to set the new balance adding on the deposit. If someone else were doing the same thing, interleaved with you, you would get the wrong balance. Similarly if you had two threads trying to track the highest value so far interleaved could get the lesser of the two values finally stored in the database. The easiest way around this is to get SQL to do such updates in one atomic operation done all of a piece such as this:

Security

Typically you have users and programs coming at your database from all over the web, talking on sockets directly to your database engine. Users don’t login first to your OS (Operating System). This means that the SQL engine has to manage its own completely separate security system.

By entering GRANT commands into your database, you control who can access which tables from where with which passwords. You can separately control read and write access.

Text Searching

SQL was originally designed to store and process only relatively small fields. Now people are trying to use it to store and search entire documents as fields, with text searching tools similar to what you would have with a search engine.

MySQL offers server-side full-text indexing and searching. Simply declare an index of type FULLTEXT on the TEXT/CHAR/VARCHAR table columns which contains the text you want to search, then query the table using the "MATCH()… AGAINST" syntax. There’s a manual section which describes this facility.

OpenFTS is a full text search third party add-on for Postgre. TSearch2 is another full text search for Postgre.

Oracle Text Search is Oracle’s facility in there Oracle 10 database. Read the Oracle Text Search How To.

Full text search is now becoming a common feature in databases.

Under the Hood

Conceptually you compose an ASCII English-like sentence query and send it off to the SQL engine. The SQL engine goes through all its records the relevant tables one by one finding the ones that fit your criteria. It puts these in a separate file called a result set. You then process the rows/records in it one by one.

What actually happens is much more efficient and clever. When I first used the Sybase SQL engine I could not believe how fast it was compared with the Btrieve DOS (Disk Operating System) files and ISAM (Indexed Sequential Access Method) I was used to. What makes it so fast?

First, it does not actually wade through all the records in each relevant table looking for matches. It has indexes. It uses those indexes to narrow down the search to likely candidates. Clever SQL engines even create new indexes on the fly without being asked to help them process queries faster.

Next, SQL engines cache as much of the database as they can in RAM. Sometimes databases are totally RAM-resident. This is quite feasible now-a-days with RAM as cheap as it is.

Next the SQL engine does not actually fetch the entire result set. It just grabs a decent sized chunk of it, say 15 rows worth and hands that to you in a chunk. When you have processed that, it gets you the next lump, or it may get the next chunk ready while you are processing the first chunk. This is why you can get away with creating giant result sets then using just the first few rows of them. Unfortunately MySQL does not have this cleverness. It always hands your program the complete set of results.

You might think the way you update a row is to submit a C-like binary struct representing it and that, to fetch a row, you would get such a beast back. Oddly, it does not work that way. Baud knows why. Instead you compose ASCII sentences to update fields. I kid you not. You have to painstakingly compose things like this:

UPDATE contacts
SET last_name='Brown', state='WA'
WHERE acct=2103 AND state='MA';

Happily the format of the resultset is completely up to the implementor. It will likely be a mixture of binary data and text. In the JDBC entry I talk about the conversations the client and server have. The client end is generally not very bright. It just relays requests to the server using a binary protocol.

O-R Tools

O/R Object relational tools help you map between objects and SQL relational databases. In most cases code generators take the database schema (con.getMetaData()) and create corresponding java classes, for example one data class and one manager class per table. Some OR tools (TopLink, Cocobase, etc) create both java classes and database schema from single xml master file.

A Modest Proposal

Writing ordinary SQL code is extremely tedious. I suggest there needs to be a simpler Java interface to SQL. I would work like this:

Answers to queries come back as Iterators of row objects. These objects are custom classes with primitive fields with the same names precisely as the SQL columns. The corresponding Java for these classes needs to be generated ahead of time, much the way RMI (Remote Method Invocation) stubs are.

Granted this does not allow fancy dynamic queries, but it does the bull work in a way that is much easier to write and read.

From there it is just one more step to smart GUI (Graphic User Interface) objects that automatically update when a query result comes in, and that can allow GUI objects to modify the row with a minimum of code.

Books

book cover recommend book⇒Guide to the SQL Standard: A User’s Guide to the Standard Database Language SQLto book home
by Chris J. Date, yes the C.J. Date. 978-0-201-96426-4 paperback
birth 1941 age: 72
publisher Addison-Wesley
published 1996-11-18
Considered the best on understanding the SQL standard.
Australian flag abe books anz abe books.co.uk UK flag
Chinese flag amazon.cn amazon.co.uk UK flag
German flag abe books.de abe books.ca Canadian flag
German flag amazon.de amazon.ca Canadian flag
Spanish flag amazon.es Chapters Indigo Canadian flag
Spanish flag iberlibro.com abe books.com American flag
French flag abe books.fr amazon.com American flag
French flag amazon.fr Barnes & Noble American flag
Italian flag abe books.it Google play American flag
Italian flag amazon.it O’Reilly Safari American flag
India flag junglee.com Powells American flag
UN flag Kobo other stores UN flag
Greyed out stores probably do not have the item in stock. Try looking for it with a bookfinder.
book cover recommend book⇒Murach’s Oracle SQL and PL/SQL (Training & Reference)to book home
by Joel Murach 978-1-890774-50-9 paperback
publisher Mike Murach
published 2008-08-01
This talks about Oracle’s extended SQL, not just the generic SQL JDBC interface in Java. book website
Australian flag abe books anz abe books.co.uk UK flag
Chinese flag amazon.cn amazon.co.uk UK flag
German flag abe books.de abe books.ca Canadian flag
German flag amazon.de amazon.ca Canadian flag
Spanish flag amazon.es Chapters Indigo Canadian flag
Spanish flag iberlibro.com abe books.com American flag
French flag abe books.fr amazon.com American flag
French flag amazon.fr Barnes & Noble American flag
Italian flag abe books.it Google play American flag
Italian flag amazon.it O’Reilly Safari American flag
India flag junglee.com Powells American flag
UN flag Kobo other stores UN flag
Greyed out stores probably do not have the item in stock. Try looking for it with a bookfinder.

see the list of JDBC books.

Learning More

Oracle’s Javadoc on java.sql package : available:
Oracle’s Javadoc on javax.sql package : available:
Oracle’s Javadoc on Connection class : available:
Oracle’s Javadoc on DriverManager class : available:
Oracle’s Javadoc on Datasource class : available:
Oracle’s Javadoc on PreparedStatement class : available:
Oracle’s Javadoc on Statement class : available:
Oracle’s Javadoc on ResultSet class : available:

available on the web at:

http://mindprod.com/jgloss/sql.html
ClustrMaps is down

optional Replicator mirror
of mindprod.com
on local hard disk J:

J:\mindprod\jgloss\sql.html
logo
Please the feedback from other visitors, or your own feedback about the site.
Contact Roedy.
Blog
IP:[65.110.21.43]
Your face IP:[54.204.163.26]
You are visitor number 68,576.