Have you ever wanted to be able to SELECT a random record or records from a database table, but was unsure how to do it with SQL, instead of through code?
I ran across this problem a few years ago when I was developing Vibez.ca. Vibez, if you would like to know, is an electronic music community that I developed when I first got into doing web development almost 6 years ago, and was written in ASP with Access (I know, I know …).
What I needed, was a way to SELECT one random link from the Links Directory, every time someone viewed the main page. I wasn’t able to properly figure out how to accomplish this, and ended up slapping together some code to give me the end result that I wanted. If you were to ever see the code, you probably would feel sick to your stomach.
Now that I’m in the planning stages of rewriting the site in ASP.Net with SQL Server 2005 replacing the Access back-end, I decided to revisit this problem and see if there was an easier way to do this.
Fortunately there is a way to do it in SQL Server (and MySQL, PostgreSQL, Oracle, DB2, SQLite, and yes, even your favourite, MS Access), and I’ll show you how.
For each of the Databases and their random SQL statement listed below, I’ve installed the database, ran the SQL through some simple C# code three times, and graphed the results to show that the statements work (or didn’t), and to see if there were any anomalies with each Database.
For each test, I simply ran the random SQL statement 5000 times. The x-axis represents the record’s Id from 1 to 10, and the y-axis represents the number of times each Id was selected.
Let’s begin ….
SQL Server 2005
The secret with retrieving random records from SQL Server is really simple. All that you have to do is specify how many random records that you want returned with the TOP N (where ‘N’ is any number of records), and end your query with ORDER BY NewID().
SELECT TOP 1 Field1, ..., FieldN FROM Table1 ORDER BY NEWID()
I don’t know about you, but I consider that pretty painless.
For my test, I ran the following statement against the Northwind database.
SELECT TOP 1 ProductID FROM Products WHERE ProductID <= 10 ORDER BY NEWID()
I’m no statistical major, but these results look pretty random with each test.
To select a random record or records from MS Access, the method isn’t much different.
Instead of doing an ORDER BY NewID() like in SQL Server, you would use ORDER BY Rnd(Field1), where Field1 is the name of your primary auto-numbered key field.
SELECT TOP 1 Field1 , ..., FieldN FROM Table1 ORDER BY Rnd(Field1)
Although the SQL statements are very similar, the results however, are not.
I ran the following SQL statement through my C# sharp code three times, each time looping to select 5000 random records. Each time however, the results were exactly the same.
I closed down the web app, and restarted it, and what do you know, the results were identical again.
I then decided to change the loop from 5000 to 10000, and got a new set of results, with the only similarity between the two results being that the record Id of 5 had the least number of occurrences in both sets.
So one last time, I re-ran the first test with the 5000 loops, and guess what my results were? If you answered that it was identical to the first set, you are correct.
Although the results are identical in each test, I think this is random enough for now. I’ll come back to this one in a future update post.
Just like the previous two examples, MySQL’s way of selecting random records is also pretty painless.
SELECT Field1, ..., FieldN FROM Table1 ORDER BY RAND() LIMIT 1
For the following results, I ran my C# code against the help_category table and selected a random record between 1 and 10.
If, for some reason, you would like to connect to a MySQL database with .Net, you’ll need to download MySQL Connector/ODBC (MyODBC).
SELECT help_category_id FROM help_category WHERE help_category_id <= 10 ORDER BY RAND() LIMIT 1
Not surprisingly, since MySQL comes with a RAND() method, the results are pretty evenly distributed, and are random with each test.
Oracle 10g Express
Wow, Oracle so far, has to be the biggest PITA to get working in Visual Studio. For some reason, I can create and populate a table with TOAD, and I am able to see the data I entered, but Visual Studio, through the Server Explorer and through code, could only see the table, and not the data. I had to end up doing a DELETE and then populate the table before I could run each test. Anyways ….
Not only was Oracle the biggest PITA, it also so far has the ugliest SQL statement to retrieve a random record or records.
SELECT Field1, ..., FieldN FROM (
SELECT Field1, ..., FieldN FROM Table1 ORDER BY dbms_random.value
) WHERE rownum <= 1
If you’re somewhat confused by this statement, think of the inner SELECT statement as a “temp” table that is filled with data from Table1 and ordered by dbms_random.value (guess it’s like MySQL’s Rand(), or SQL Server’s NewId()).
The outer SELECT statement then retrieves rows from the temp table where the rownum is equal to or less than the first row.
And VOILA! Nothing shocking here, that I can see.
SQLite’s random method is almost identical to MySQL’s, except with SQLite you have to use the full Random word.
SELECT Field1, ..., Field2 FROM Table1 ORDER BY Random() LIMIT 1
If you ever plan on using SQLite in one of your .Net projects, I suggest that you check out System.Data.SQLite, which is an enhanced version of the original SQLite database engine. For my tests, I used Finisar.SQLite which is no longer being developed.
Here are my results:
This graph looks random.
DB2 (9.5) Express-C
Out of all of the Databases that I’ve installed and tested, DB2 has to be the easiest, and quickest of the bunch. How much easier can connecting to a database be when your connection string is “DATABASE=SAMPLE“?
Anyway’s … this is how to SELECT a random record from a table in DB2.
SELECT Field1, ..., FieldN, RAND() as RID FROM Table1 ORDER BY RID FETCH FIRST 1 ROWS ONLY
PostgreSQL’ method of selecting a random record is identical to SQLite’s, which I’m a bit surprised. I thought for some reason, that it would be identical to MySQL’s.
SELECT "Field1", "...", "FieldN" FROM "Table1" ORDER BY RANDOM() LIMIT 1
And yes, each column is wrapped in quotes, and so is the table name. At least that’s what I had to do to get it to work.
So there you have it. I think I’ve covered all of the popular DBs out there. If there’s one that you think should be added, or you’re having any problems with the statements above, leave a comment below.