Derived Attributes with UNION

Posted by Francis Avila on August 10, 2008

A Story

Recently, a client of ours wanted to institute a “point” system for an existing body of users. The idea was that certain actions of the user would generate points for that user, which the client could then track as part of an incentive program.

But What are “Points”?

At the time, we had a simple “users” table in our database which stored all our user-related data. Now we were asked, essentially, to add a new “points” attribute to the “user” entity. However, we could not simply add a “points” column to the “user” table, because the client needed to track individual point-granting actions separately, with descriptions and such.

But this was also not a one-to-many relationship with an abstract “point-event” entity either, since some points were inferred from information which was properly normalized into other parts of the database. For example, referring another user (information we know at user registration time) was worth a certain number of points, but to copy a “referred user” event to a “point-event” entity would mean denormalizing the database. If a user-referral were added or changed later, we would have to make sure to do the same thing to a corresponding point-event.

Thus a user’s “points” are an attribute of the user, but the value of this attribute is derived from potentially many different entities or attributes. Guess what? It’s a derived attribute (scroll to the bottom).

So, how are we going to deal with this?

Implementation

Derived Columns

Some “real” databases have native support for derived attributes (e.g., SQL Server) but as far as I know they all require that the value of the derived attribute be defined as an expression, not the result of an arbitrary query. We could get around this using a stored function which calculates the points for us, but this particular database was MySQL (which does not support derived attributes), version 4.1 (which does not support stored functions).

In any case, this is a bad solution for us because any changes to the point calculation algorithm would require modification of the database, yet we had been accustomed to putting this kind of logic into the application. Additionally, a lazy SELECT * (many of which were unfortunately sprinkled throughout our application) would suddenly become much more expensive, requiring an additional function call per row.

Application Code

The other solution, of course, is that we simply put all the point-calculation code into the application. The problem with this is that it would take multiple queries to the database for every user that interested us, and we could potentially get the wrong point value if a change were made to the database in between our queries (since MySQL MyISAM does not have transactions). Plus, if we want to sort by points (or something more complicated), we would have to do the sorting ourselves, in the application.

UNION

Clearly, we wanted to handle point calculation by a single query. The solution we finally hit upon was to use a temporary table (not a view, since MySQL 4.1 doesn’t support them) filled by a UNION. This is quite possibly the only good use for a UNION. Each subquery of the UNION would calculate points based on a particular attribute or entity, and all the subqueries would SELECT to common column names.

DROP TEMPORARY TABLE IF EXISTS tmp_all_points;
CREATE TEMPORARY TABLE tmp_all_points
-- Get referrer-derived points
(SELECT user.id AS user_id, COUNT(*)*5 AS points
FROM user ... INNER JOIN ... GROUP BY ...)
UNION
-- Get pointevents-derived points
(SELECT user_id AS user_id, SUM(points) AS points
FROM pointevents GROUP BY user_id HAVING points != 0);

This will give us a temporary table with 0, 1, or 2 rows per user. If we want to limit this to particular users, we can add the relevant WHERE conditions to the individual subqueries before we send them to the database.

Now if we want to do any queries which involve points, we can just treat tmp_all_points as a “points” entity with a many-to-one relationship with the “users” entity.

Want the top five point-holders?

SELECT users.name, SUM(tmp_all_points.points) AS points
FROM users
INNER JOIN tmp_all_points ON users.id = tmp_all_points.users_id
GROUP BY users.id
ORDER BY points DESC
LIMIT 5

Happy Ending?

By using a UNION, we were able to neatly model the derived attribute as a table, using a single query that maps easily to the logic of the derived attribute and is easy to extend to account for any additional criteria that the client may dream up. And we didn’t have to denormalize our database or introduce complex application code.

There is a caveat, however. Tables defined by a query have no index, and probably we are going to want to join on this table, which means we’ll be doing a join without an index. For this reason, it is pretty important to keep the result set of your UNION query as small as possible using additional WHERE conditions.

If your result set will always be large, split off the temporary table creation into a definition with keys and use a INSERT INTO tmp_table SELECT ... UNION SELECT .... Don’t use CREATE INDEX after filling your table, since creating an index on a full table is much slower than building it incrementally (except for FULLTEXT indexes, where the opposite is true).

Don’t Try This With Views

If you are using MySQL 5.0 or above, you won’t be able to mitigate this problem by using a VIEW. MySQL is not very good at optimizing views. If there is not a one-to-one relationship between the rows of your view and the rows of the underlying tables, MySQL will use ALGORITHM = TEMPTABLE for your view. So any view with a UNION in it will be created as a temporary table anyway.

Thus I would not wrap a UNION in a view for this technique, since you can’t control the result set size for a view and you will be generating a new temporary table every time you use the view, instead of once per connection.

Please Take a Number

Posted by Brian Kieffer on July 9, 2008

In the Internet world of seemingly endless computer resources, it’s not often that a website requires visitors to wait in line to visit, but a recent Dancing Mammoth project called for just that.

The Requirements:

  • Visitors will be added to a Virtual Waiting Room prior to advancing to website content.
  • Visitors will be advanced according to First In First Out.
  • Page must indicate current position in line via a client provided Flash object.
  • An administrator must be able to control flow of traffic.

The client expected light traffic to their video chat feature, so we decided that capturing queue data in a single table was the most efficient way to go. We could then poll at a some set interval to determine a visitor’s place in line, and take action based on the result.

The client-provided Flash object required use of a bit of javascript, so I decided to go ahead and implement the polling with the jQuery library’s ajax functionality — I ♥ jQuery. Here’s what the javascript function looks like:

function updatePosition()
{
	$.ajax({
		type: "GET",
		url: "queue/index.php",
		data: "sess=<?php echo $session_id ?>&random=" + new Date().getTime(),
		dataType: "xml",
		success: function(xml){
			var ky = "";
			var val = "";
			var action = "";
			var pos = 0;
			$("response", xml).each(function(){
				$("action", this).each(function(){
					action = $("key", this).text();
					val = $("value", this).text();
					if(action == "forward")
					{
						// Forward
						forwardToUrl(val);
					}else{
						//Update
						pos = val;
						thisMovie('queueCountdown').update(pos);
					}
				});
			});
		}
	});
}

If you’d like to review all the sample files, you can download them here, but no warranty is expressed or implied.

jQuery sends the visitor’s session id (as well as a random string — always send a random string when making ajax GET calls or IE will give you cached content) to a script that checks the queue and returns some XML with the action and associate value. Then the visitor is either forward to the content, or their position in line is displayed.

On the back end, the VirtualWaitingRoom class provides functionality to retrieve queue position, administratively advance visitors through the queue, and remove records for abandoned sessions.

The project was a success and while there’s nothing especially complex about it, this Virtual Waiting Room is a good short example of how various web technologies can come together to provide a unique solution.

Use Your iMac as a Display

Posted by Francis Avila on May 28, 2008

I have an Intel iMac (the white kind). It’s my personal machine. I like it. It’s nice. What I especially like about it is that it has a big screen (1680×1050).

I also have an Intel MacBook. It’s my work machine. I like it. It’s nice. But what I don’t like about it is that the screen is a bit smaller than my iMac (1200×800). Using the smaller keyboard and mouse isn’t so nice either.

What to do?

Well, there’s VNC. OS X even has a VNC server built in. So I could turn that on and then use a VNC client on my iMac. But that only gives me the keyboard and mouse and a 1280×800 window mirroring the MacBook screen. Not cool.

The same guy who makes this excellent VNC client also makes ScreenRecycler. ScreenRecycler turns your VNC client into an attached display. The monitor of the computer your VNC client runs on looks to OS X like just another monitor, plugged in through the mini-DVI port. So now I can work on my MacBook and have a 1680×1050 screen in addition. Joy!

But ScreenRecycler ignores input from the VNC client, so I can’t use my iMac’s keyboard and mouse to control my MacBook. No joy.

But some other guy on the internet makes Transport. Transport lets you control other Macs using your keyboard and mouse. Joy has returned!

So, the plan:

  1. Install and run ScreenRecycler and Transport on the MacBook.
  2. Install and run JollysFastVNC and Transport on the iMac.
  3. The VNC client finds ScreenRecycler via Bonjour. No sweat.
  4. On the MacBook, tell Teleport to “Share this Mac.”

All done! Now I can use my iMac as a second display to my MacBook and control my MacBook with my iMac. (I can even make the iMac the MacBook’s main display!) Using the power of Spaces, I can even have multiple workspaces, and keep (for example) Mail and iChat permanently displayed in the MacBook screen, no matter what workspace I’m in.

A caveat: Transport doesn’t seem to recognize the ScreenRecycler display, at least when one machine is Panther (iMac) and the other Leopard (MacBook). You have to arrange your virtual screens in Transport in such a way that they don’t share the same borders. Otherwise your pointer will get stuck on the MacBook.

  • tags: , ,
  • No Comments

THE BALLERINAS

  • PJ Doland

    PJ was born in a cross-fire hurricane And he howled at his ma in the driving rain.

    Gary DuVall

    Strength of a thousand coding languages with the patience of a ninja.

    Matt Niemi

    Adept at mobilizing the forces of good.

    Brian Kieffer

    Solving website problems before you even knew they existed.

    Matt Fetissoff

    Making sure all our websites have at least 15 pieces of flair.

    Chris Coté

    Slinging the code with deadly accuracy.

    Erin Doland

    100 percent all-natural high-quality content machine.

    Francis Avila

    Newest coding squad deputy, quickly earning his badge.