Selecting Random Records With SQL

By | December 16, 2007

Have you ever wanted to be able to SELECT a random record or records from a database table, but was unsure how to do it with SQL, instead of through code?

I ran across this problem a few years ago when I was developing Vibez.ca. Vibez, if you would like to know, is an electronic music community that I developed when I first got into doing web development almost 6 years ago, and was written in ASP with Access (I know, I know …).

What I needed, was a way to SELECT one random link from the Links Directory, every time someone viewed the main page. I wasn’t able to properly figure out how to accomplish this, and ended up slapping together some code to give me the end result that I wanted. If you were to ever see the code, you probably would feel sick to your stomach.

Now that I’m in the planning stages of rewriting the site in ASP.Net with SQL Server 2005 replacing the Access back-end, I decided to revisit this problem and see if there was an easier way to do this.

Fortunately there is a way to do it in SQL Server (and MySQL, PostgreSQL, Oracle, DB2, SQLite, and yes, even your favourite, MS Access), and I’ll show you how.

For each of the Databases and their random SQL statement listed below, I’ve installed the database, ran the SQL through some simple C# code three times, and graphed the results to show that the statements work (or didn’t), and to see if there were any anomalies with each Database.

For each test, I simply ran the random SQL statement 5000 times. The x-axis represents the record’s Id from 1 to 10, and the y-axis represents the number of times each Id was selected.

Let’s begin ….

SQL Server 2005

The secret with retrieving random records from SQL Server is really simple. All that you have to do is specify how many random records that you want returned with the TOP N (where ‘N’ is any number of records), and end your query with ORDER BY NewID().

SELECT TOP 1 Field1, ..., FieldN
FROM Table1
ORDER BY NEWID()

I don’t know about you, but I consider that pretty painless.

For my test, I ran the following statement against the Northwind database.

SELECT TOP 1 ProductID
FROM Products 
WHERE ProductID <= 10 
ORDER BY NEWID()

 

I’m no statistical major, but these results look pretty random with each test.

Access

To select a random record or records from MS Access, the method isn’t much different.

Instead of doing an ORDER BY NewID() like in SQL Server, you would use ORDER BY Rnd(Field1), where Field1 is the name of your primary auto-numbered key field.

SELECT TOP 1 Field1 , ..., FieldN 
FROM Table1
ORDER BY Rnd(Field1)

Although the SQL statements are very similar, the results however, are not.

I ran the following SQL statement through my C# sharp code three times, each time looping to select 5000 random records. Each time however, the results were exactly the same.

I closed down the web app, and restarted it, and what do you know, the results were identical again.

I then decided to change the loop from 5000 to 10000, and got a new set of results, with the only similarity between the two results being that the record Id of 5 had the least number of occurrences in both sets.

So one last time, I re-ran the first test with the 5000 loops, and guess what my results were? If you answered that it was identical to the first set, you are correct.

 

Although the results are identical in each test, I think this is random enough for now. I’ll come back to this one in a future update post.

MySQL 5.0.45

Just like the previous two examples, MySQL’s way of selecting random records is also pretty painless.

SELECT Field1, ..., FieldN
FROM Table1
ORDER BY RAND()
LIMIT 1

For the following results, I ran my C# code against the help_category table and selected a random record between 1 and 10.

If, for some reason, you would like to connect to a MySQL database with .Net, you’ll need to download MySQL Connector/ODBC (MyODBC).

SELECT help_category_id
FROM help_category
WHERE help_category_id <= 10 
ORDER BY RAND() 
LIMIT 1

Not surprisingly, since MySQL comes with a RAND() method, the results are pretty evenly distributed, and are random with each test.

Oracle 10g Express

Wow, Oracle so far, has to be the biggest PITA to get working in Visual Studio. For some reason, I can create and populate a table with TOAD, and I am able to see the data I entered, but Visual Studio, through the Server Explorer and through code, could only see the table, and not the data. I had to end up doing a DELETE and then populate the table before I could run each test. Anyways ….

Not only was Oracle the biggest PITA, it also so far has the ugliest SQL statement to retrieve a random record or records.

SELECT Field1, ..., FieldN
FROM (
SELECT Field1, ..., FieldN FROM Table1 ORDER BY dbms_random.value
) WHERE rownum <= 1

If you’re somewhat confused by this statement, think of the inner SELECT statement as a “temp” table that is filled with data from Table1 and ordered by dbms_random.value (guess it’s like MySQL’s Rand(), or SQL Server’s NewId()).

The outer SELECT statement then retrieves rows from the temp table where the rownum is equal to or less than the first row.

 

And VOILA! Nothing shocking here, that I can see.

SQLite 3

SQLite’s random method is almost identical to MySQL’s, except with SQLite you have to use the full Random word.

SELECT Field1, ..., Field2 
FROM Table1 
ORDER BY Random()
LIMIT 1

If you ever plan on using SQLite in one of your .Net projects, I suggest that you check out System.Data.SQLite, which is an enhanced version of the original SQLite database engine. For my tests, I used Finisar.SQLite which is no longer being developed.

Here are my results:

 

This graph looks random.

DB2 (9.5) Express-C

Out of all of the Databases that I’ve installed and tested, DB2 has to be the easiest, and quickest of the bunch. How much easier can connecting to a database be when your connection string is “DATABASE=SAMPLE“?

Anyway’s … this is how to SELECT a random record from a table in DB2.

SELECT Field1, ..., FieldN, RAND() as RID
FROM Table1
ORDER BY RID
FETCH FIRST 1 ROWS ONLY

PostgreSQL 8.3

PostgreSQL’ method of selecting a random record is identical to SQLite’s, which I’m a bit surprised. I thought for some reason, that it would be identical to MySQL’s.

SELECT "Field1", "...", "FieldN" 
FROM "Table1"
ORDER BY RANDOM()
LIMIT 1

And yes, each column is wrapped in quotes, and so is the table name. At least that’s what I had to do to get it to work.

For my tests, I used the PostgreSQL Windows Installer, with the Npgsql Data Provider.

 

So there you have it. I think I’ve covered all of the popular DBs out there. If there’s one that you think should be added, or you’re having any problems with the statements above, leave a comment below. 

51 thoughts on “Selecting Random Records With SQL

  1. Pingback: DotNetKicks.com

  2. carlj Post author

    Thanks John.

    I’m working on an update for that, with a comparison between NewID() and another method.

  3. Pingback: Web 2.0 Announcer

  4. Danilo Celic

    A while ago I found a solution for Access that seems to work well on this page: http://www.petefreitag.com/item/466.cfm. Haven’t done any tests on it, but for my usage it is random enough.

    If uses SQL similar to the following:
    SELECT * FROM tableName ORDER BY RND(INT(NOW*tableIDColumnName)-NOW*tableIDColumnName)

    Don’t know how well it performs

  5. Bijay Rungta

    I had once needed this ..
    I did know about the RAND() function but my requirement is a little different..

    How to finc a solution where you create a listing of random items in pages and if the user clicks on any of the page numbers the user gets to see the corresponding page with listings which is generated with the same RANDOM Key as the one generated previously…

    its ok if the user visits the page without clicking on the page links and the whole random listing is generated afresh..

    Did anyone follow what I am looking for..

    Will elaborate or we could start a discusion anywhere you like on this…

    could be interesting..

  6. carlj Post author

    I think I follow, but I’m a bit confused.

    Why would you want the second page to be generated with random data, using the same key as the link on the previous page?

    If you were to do the following (i’m assuming this is for MySQL), you can retrieve the random value that was generated for each record.

    SELECT Field1, …, FieldN, Rand([Insert Random Generated Key Here]) AS RID
    FROM Table1
    ORDER BY RID DESC

    Replace [Insert Random Generated Key Here] with the Generated Key. If there was no generated key, then leave blank.

    Now back to my confusion. Why would you want the second page to be generated with the same random key that was generated for the link that the user clicked on in the first page?

    The results are random, so there’ll be no relationship between the two pages. Or am I missing something?

  7. carlj Post author

    Thanks Danilo. I was fooling around with giving the Rand() in Access a different value, but was getting some weird results.

    Depending on the number you give it, it would return the same record 10 times (there were only 10 records to select from).

    I’ll give your idea a shot as soon as I can, and post the results. I expect that it’s the seeding that’s causing the problems.

  8. Bijay Rungta

    Here is the reason.
    I remember there was a requirement as the following..

    The Client wanted that the list that is generated is random (not in any particular order like the primary id, number of visits or anything). In this scenario the paging doesn’t make sense.. but the client was also insisting on a paging so that all the entries are accessible.

    Note: The Random Key that I talk about might be confusing.
    What I mean is that suppose there are 10 listings in all the pages and suppose that there are 30 + items and the first page is generated using the RAND function. Then the next page should again create RANDOM items but should not contain any item that was in the page 1. again if the user goes to page 3 the items should not be among anyone that appeared in page 1 or 2
    Also if now the user goes to Page 1 He should again see the same items that he saw the first time…
    and so on…

    The solution that I could think of was that you select all entries in the table with the RANDOM or use shuffle the results through the Program (its all the same and save the item’s primary key in an indexed array) and use that to generate pagination in the subsequent call from the same user satisfying the conditions…

  9. Bijay Rungta

    This might be useful:
    ———————–

    RAND(), RAND(N)

    Returns a random floating-point value v in the range 0 <= v SELECT RAND();
    -> 0.9233482386203
    mysql> SELECT RAND(20);
    -> 0.15888261251047
    mysql> SELECT RAND(20);
    -> 0.15888261251047
    mysql> SELECT RAND();
    -> 0.63553050033332
    mysql> SELECT RAND();
    -> 0.70100469486881
    mysql> SELECT RAND(20);
    -> 0.15888261251047

    The effect of using a non-constant argument is undefined. As of MySQL 5.0.13, non-constant arguments are disallowed.

    To obtain a random integer R in the range i <= R < j, use the expression FLOOR(i + RAND() * (j – i)). For example, to obtain a random integer in the range the range 7 <= R SELECT * FROM tbl_name ORDER BY RAND();

    ORDER BY RAND() combined with LIMIT is useful for selecting a random sample from a set of rows:

    mysql> SELECT * FROM table1, table2 WHERE a=b AND c ORDER BY RAND() LIMIT 1000;

    Note that RAND() in a WHERE clause is re-evaluated every time the WHERE is executed.

    RAND() is not meant to be a perfect random generator, but instead is a fast way to generate ad hoc random numbers which is portable between platforms for the same MySQL version.

    From http://dev.mysql.com/doc/refman/5.0/en/mathematical-functions.html#function_rand

  10. Pingback: links for 2007-12-21 « Bijay Rungta’s Weblog

  11. Mehzabeen

    Hi,

    This info was a bit useful for me, however, my requirement was a bit different from this.

    That is I need to generate the random records in MSSQL where in the next random record which is generated should be according to the user’s choice (ie., if a record 4 is generated, then its very next record ie., 5 should come next, and suppose if 8 is generated, then its very next record that is 9 should come next).

    If there is any solution for that please do let me know.

    Thanks.

  12. carlj Post author

    Hi Mehzabee,

    Off of the top of my head, If you are using MS SQL 2005, I would suggest picking a random number in code (do a SELECT Count() to get the max number of rows), then select all records that are equal to or greater than that random number using MS SQL 2005′s rownum.

    If that doesn’t work for you, then do two select statements.

    SELECT TOP 1 [PrimaryID]
    FROM Table1
    ORDER BY NEWID()

    Store the [PrimaryID] and then …

    SELECT * FROM
    Table1
    WHERE [PrimaryID] >= x

    where ‘x’ is the PrimaryID from the first statement.

    I have another idea, and will post it a bit later.

  13. carlj Post author

    Here you go …

    SELECT *
    FROM Table1
    WHERE PrimaryID > =
    (
    SELECT TOP 1 PrimaryID
    FROM Table1
    ORDER BY NEWID()
    )

  14. Pingback: Carl J » February 2008 Stats

  15. Rapture

    Dude!!! You are a genious. You just saved me 40 lines of code just by one line ‘ORDER BY NEWID()’

  16. carlj Post author

    I wouldn’t consider myself a ‘genius’, I only did what you probably did, and Googled for it.

    Glad I could help though.

  17. Pingback: Selecting a random row from a table in mysql | Prashanth Ellina

  18. Tim

    Hi,

    Do you have any idea how to implement selecting random records together with sql paging?

  19. Carl J

    Hey Tim,

    That would be a bit difficult to do, because everytime you ask SQL for a new page, you’re asking it to select a page of random records. So whats going to stop page 2 from having some of the same records as page 1?

    Now, if you want to sort the records randomly, then page through that list, then thats a different story.

  20. Will

    I’m considering using this for random pagination (on Postgres):

    SELECT *
    FROM records
    ORDER BY (recordID % 5)

    In the above query 5 would either be the hour or day, so the records would be in a different order every hour or day. The drawbacks are obviously it’s not frequent randomisation and if a user is paging just before the changeover of hour or day, they will probably see duplicate records.

  21. Pingback: Carl J » This Day in History: September 2, 2008

  22. carlj Post author

    Hey JWhite,

    Thanks for the link. I’ll try it out (not that I don’t believe the article) as soon as I can. I’ve been wanting to get around to doing an update on Access, with something similar.

  23. Pingback: Eduard Gamonal » Blog Archive » What «random» means in Microsoft

  24. Gary

    Im using access with a classic ASP front end…on MS Access the query works great:

    SELECT TOP 1 Campaign.Camp_ID, Companies.AD_ID, Campaign.CampURL
    FROM Companies INNER JOIN (Campaign INNER JOIN CampStats ON Campaign.Camp_ID = CampStats.CampaignID) ON Companies.AD_ID = Campaign.Co_Parent
    WHERE (((Campaign.CampActive)=’Yes’))
    ORDER BY Rnd(Campaign.Camp_ID);

    When this is ran through my ASP code, I get the same freaking ad every single time…it’s the ad with the highest Camp_ID though..

    thoughts?

  25. Carl J Post author

    Hi Gary, the problem that you’re experiencing, is the same problem that I was having above. What may work (and I can’t test from here right) is to change your Rnd() in the ORDER BY, and use a random number instead of the Camp_Id, which the Id never changes. Instead you could generate a random number (or use Ticks) in ASP and use that.
    “ORDER BY Rnd(Campaign.Camp_ID)” becomes “ORDER BY Rnd(” & RandomASPGeneratedNumber &”)”
    Basically, to create a random number, you need a random number

  26. samantha

    Dear all,

    I have this problem in mysql database trigger.

    I have two tables namely messagein and messageout.

    I want to update messageout table when a message received to messagein table.
    So I wrote a database trigger in mysql.
    I am using wamp-php admin,
    Code as follows,

    create trigger autoreply after insert on ozekimessagein
    FOR each row
    begin
    DECLARE tel VARCHAR(30);
    SELECT @tel=sender FROM ozekimessagein ORDER BY sender DESC limit 1;
    INSERT INTO ozekimessageout(receiver,msg,statuss) VALUES (@tel,’Thank you for the
    message’,'send’);
    end

    But when it runs give an mysql server version error.

    Can you help me to resolve this matter.

    Regards to all
    Samantha

  27. shashank

    good info. i NEWID() is new to me. thanks for the post. i am using the SQL is one of my projects now.

  28. Kirk

    I find it incredible that I can not find a way good way to select random records from a DB2 table by a record pointer of some type. This would be necessary for large tables. Those in the millions of records. Selecting by a record pointer would eliminate the need of dumping the entire table to just get a sampling.

  29. Pingback: SQL – Select Random Records From Table « Eureka!

  30. Kadama

    I had being seaching for these information for long time with vain. it clear, simple and it working.

    Thanks

  31. Pingback: Select random sample from SQLite table | Ecostudies

  32. partha

    how i can fetch record randomly from a table without rand() in Mysql??

  33. Pingback: Il blog di Gabriele Favrin » Archivio del blog » Estrarre un record a caso da una tabella di un database SQL

  34. P. F.

    Thank you for your thorough explanation and use of the RAND() function. It helped save me time and also expanded my SQL experience.

  35. Jill

    Wow, SQL “standards” are great.

    That’s why there are so many of them!

  36. Joan

    What if I needed to get “10 random records” using Sqlite… but without any duplicates?

  37. Pingback: Randomizing based on criteria

Leave a Reply

Your email address will not be published. Required fields are marked *