
Saturday, October 9, 2010

Beyond the Glass Ceiling

When running a virtual environment system, one must always account for the cost per user when scaling the architecture. In the beginning the costs may seem negligible, but thanks to accelerating returns we will quickly find that there comes a time when the cost per user begins to skyrocket.


Oftentimes companies do not think that far ahead when the initial planning is being hashed out, and as we see with Second Life today (and most other systems in existence), the cost per user eventually reaches that turning point, where the bandwidth and server expenses needed to scale to a higher number of concurrent users become prohibitive.


There is, and always has been, a solution for this problem.


On a conference call Friday, I had the pleasure of listening to what the United States Army had to say about their military simulation project and the constraints they seem to be running into. They are rightly proud to have found a way to handle 250 concurrent participants in the simulation, far more than Second Life can reliably deliver, but they want a system that can reliably handle thousands of concurrent users in the same area.


Therein is the problem.


It’s not a software limitation, by any means. Rather, it is a problem of architecture and creative solutions. Sure, we can host four regions on a quad-core processor, and each will reliably hold about 50 concurrent users before we see the system degrade horribly. You could even devote an entire server to a single region and possibly see a boost to 200-300 concurrent users. But we’re missing the bigger picture here, as we often do.


What we need is to step back and look at the whole if we truly want to go beyond this glass ceiling of concurrency.


Something we rarely think about is all of the latent processing power in the datacenters running the many simulators on the grid. How many regions have we visited where there is nobody around, only to then travel to a region that is jam-packed with users and slower than molasses?


This is where big-picture thinking comes in handy. If we take into account all of the regions that have little to no consistent traffic, we are left with the collective processing power of a supercomputer going to waste. But what if those regions were smart enough to know that their unused processing power can, and should, be diverted to the heavily used regions while they sit idle themselves?


At this point, we are no longer thinking in terms of brute-force processing on a single server, but in terms of a decentralized processing fabric, where resources from the entirety of the datacenter are applied exactly where they are needed.


Think about this for a moment. Imagine the countless processors in the Linden Lab datacenter lying idle. Now imagine what happens when they all come together to handle heavily used regions. Instead of four regions per server, we’re throwing the collective processing muscle of the entire grid into those regions, split evenly where it is needed, without wasted CPU or GPU cycles.
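To make the fabric idea concrete, here is a minimal sketch in Python of a scheduler that lends an idle region's spare headroom to an overloaded neighbor. The Region and rebalance names, and the capacity figures, are hypothetical illustrations of the concept, not any real grid's API.

```python
# A minimal sketch of the "processing fabric" idea: hosts running quiet
# regions lend spare capacity to overloaded ones. All names here are
# hypothetical, invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Region:
    name: str
    users: int
    capacity: int = 50          # users one host handles comfortably
    helpers: list = field(default_factory=list)

    @property
    def spare(self) -> int:
        return max(self.capacity - self.users, 0)

    @property
    def overload(self) -> int:
        return max(self.users - self.capacity, 0)

def rebalance(regions: list[Region]) -> None:
    """Divert spare capacity from idle regions to overloaded ones."""
    donors = sorted(regions, key=lambda r: r.spare, reverse=True)
    for busy in (r for r in regions if r.overload):
        needed = busy.overload
        for donor in donors:
            if needed <= 0:
                break
            lent = min(donor.spare, needed)
            if lent:
                donor.capacity -= lent      # donor gives up headroom
                busy.capacity += lent       # busy region absorbs it
                busy.helpers.append((donor.name, lent))
                needed -= lent

grid = [Region("Welcome Area", users=180),
        Region("Empty Sandbox", users=2),
        Region("Quiet Shop", users=5)]
rebalance(grid)
for r in grid:
    print(r.name, r.capacity, r.helpers)
```

In practice the hard part is migrating live simulation state between hosts; the accounting above is only the heart of the idea.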


You would think that such server architecture would be prohibitively expensive, and until recently it actually was. This sort of decentralized fabric processing has been on my mind for many years, and only recently has it materialized as an actual server. Maybe I’m ahead of my time, doomed to forever wait for the rest of the world to catch up?


Enter the SeaMicro.com SM10000 High Density, Low Power Server.


The SM10000 is the first server purpose built for scale-out workloads. Designed to replace 40 1 RU dual socket quad core servers, the SM10000 integrates 512 Intel Atom low power processors, top of rack Ethernet switching, server management, and application load balancing in a single 10 RU "plug and play" standards-based server. The SM10000 uses 1/4 the power and takes 1/4 the space of today's best in class volume servers without requiring any modifications to existing software.


Yes, you’ve just read that correctly. A server designed to take 1/4 of the space, draw 1/4 of the power, and replace forty 1 RU dual-socket quad-core servers: 512 CPUs in a single chassis.


The SM10000 uses virtualization to create a processing fabric in which any of the CPUs can take on any part of the workload, switching instantly to the task at hand.


One full rack (four SeaMicro servers) would be 2,048 CPUs working in tandem. Network them together across the entire Second Life grid and you may as well have upgraded every region to its own personal supercomputer, at a quarter of the space, power, and TCO.
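The arithmetic behind those claims is easy to check; the figures below are the vendor's, not my own benchmarks:

```python
# Back-of-the-envelope math for the figures above.
replaced_servers = 40                        # 1 RU dual-socket quad-core boxes
replaced_cores = replaced_servers * 2 * 4    # 320 conventional cores
sm10000_cpus = 512                           # Atom CPUs per 10 RU SM10000
rack = 4 * sm10000_cpus                      # four SM10000s in one rack
print(replaced_cores, sm10000_cpus, rack)    # 320 512 2048

space_ratio = 10 / replaced_servers          # 10 RU vs 40 x 1 RU
print(space_ratio)                           # 0.25 -> "1/4 the space"
```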


The big picture here is Long-Tail thinking.


If we look at these systems not in their infancy, where they appear underpowered individually, but as a collective whole that grows as more servers are migrated to them, we begin to understand the real genius of this long-tail approach: the whole is greater than the sum of its parts.


Of course, there is still the issue of bandwidth, but that is a separate problem to address. The first step to breaking the glass ceiling of concurrency is to fix the underlying hardware architecture, which is the foundation. As the old saying goes, “You shouldn’t build a house on sand, but instead build it upon the rock.”


If the Second Life grid had been running on this type of server architecture, we wouldn’t be seeing today’s scalability issues, or the hasty decision to revoke the educational and non-profit discounts in a mad rush to cut costs.


Instead of worrying about whether a new region server would put more load and stress on the entire grid, Linden Lab would be overjoyed with each new region; they would welcome each with open arms, knowing that every new region would not weaken the grid’s capacity but drastically strengthen the whole.


Breaking Bandwidth Limitations


There comes a time when we must also look at the bandwidth constraints inherent in massive multiuser virtual environments, and this scenario is no different. While a collective computing fabric would make higher concurrency easier to manage, it does not solve this second bottleneck.


In Lessons Learned From Lucasfilm’s Habitat, written by Chip Morningstar and F. Randall Farmer for the First Annual International Conference on Cyberspace in 1990, there is a blatant solution to today’s continuing concurrency and scalability limits.


The first area to investigate involves the elimination of the centralized backend. The backend is a communications and processing bottleneck that will not withstand growth above too large a size. While we can support tens of thousands of users with this model, it is not really feasible to support millions. Making the system fully distributed, however, requires solving a number of difficult problems. The most significant of these is the prevention of cheating. Obviously, the owner of the network node that implements some part of the world has an incentive to tilt things in his favor there. We think that this problem can be addressed by secure operating system technologies based on public-key cryptographic techniques.


Twenty years later, one of the most important pieces of advice concerning virtual worlds technology seems not only forgotten but blatantly ignored. In 1990, peer-to-peer networking and BitTorrent had not yet been invented, and the idea of a BrightNet system wasn’t even a figment of anyone’s imagination. Yet here we are in the year 2010, reading words from 1990 and wondering how they knew this was coming.


Not only did they see this coming, they spelled out (to the best of their knowledge) how to potentially solve it.
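For illustration, here is a minimal sketch of the public-key approach they hint at, using Python's third-party cryptography package: each peer signs the world-state updates it originates, so a node cannot quietly tilt things in its favor without the forgery being detected. The update format here is my own invention, not anything from the Habitat paper.

```python
# A minimal sketch of signed world-state updates between untrusted
# peer nodes. Requires: pip install cryptography
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Each peer node holds a key pair; the public key is shared grid-wide.
node_key = Ed25519PrivateKey.generate()
node_pub = node_key.public_key()

def sign_update(update: dict) -> tuple[bytes, bytes]:
    """Serialize an update deterministically and sign it."""
    payload = json.dumps(update, sort_keys=True).encode()
    return payload, node_key.sign(payload)

def verify_update(payload: bytes, signature: bytes) -> bool:
    """Any peer can check the update against the sender's public key."""
    try:
        node_pub.verify(signature, payload)
        return True
    except InvalidSignature:
        return False

payload, sig = sign_update({"region": "Welcome Area", "avatar": "A", "x": 12})
print(verify_update(payload, sig))                        # True
print(verify_update(payload.replace(b"12", b"99"), sig))  # False: tampered
```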


I will be the first to admit that the “Breaking Bandwidth Limitations” section is not a full solution spelled out for Linden Lab, but only a hint. While this article gives a 75% solution to those in need at Linden Research, I consciously choose to withhold the remaining 25% that would tie it all together properly.


The last piece of this puzzle resides not in the What (decentralization) but in the How (protocols) and the Where (where in the system do we apply it?).


It’s a very tricky solution, and just like proverbial lightsaber construction, if you screw up even a tiny bit, the whole thing will explode.


I think for that extra 25%, I’d trade for the name Aeonix Linden.

Friday, August 13, 2010

Asynchronous Updating

Logging into the virtual environment, users began to notice that the frequency of system updates increased, eventually culminating in passive updates streaming behind the scenes. Dynamically updating the protocols and software did away with the cumbersome batch updates and awkward installations of the past. Rightly so, it was noted, because the near-constant updates would have made it completely impossible to utilize the technologies in any other manner, and each time the users logged into their accounts, the virtual environment greeted them with more to offer and ever higher fidelity.


At some point during the past week, while I was making final corrections and edits to my chapter in the upcoming book Virtual Worlds and E-commerce: Technologies and Applications for Building Customer Relationships, I had a revelation upon logging into the Second Life virtual environment. I currently use the Viewer 2 alpha for testing purposes and to participate in JIRA feedback and bug reporting, and I noticed something interesting over the past few days that I don't believe had really hit home previously.

The premise of this blog entry is found in the opening quote (from my chapter), which outlines how updates will be handled in a virtual environment program as accelerating returns begin to take hold. If you aren't aware, accelerating returns is the speeding up of paradigm shifts over time, and in terms of software and how people handle their information, I assume it will become commonplace to deliver passive updates to software rather than direct bulk updates.

When you log into a software system such as Second Life, you are quite often greeted with a message saying that a newer version is available and that you must download and install it before you can continue. I find this method ill-conceived at best in the 21st century, and would go so far as to say that asynchronous updating should be the preferred method for software updates going forward.

There is no need to wait until a person runs the software to inform them that a new version is available, force them to quit the program, download the new version, install it, and then continue. When faced with the alternatives, it seems quite silly that software still does this, and I can only stare in wonder and ask "Why?".

So what's all this about "asynchronous updating", you may ask? Well, it's nothing spectacular or new; it's simply the order in which updates are handled that changes. Suppose the software does an update check and finds that a newer version is indeed available. Instead of telling you that a newer version exists and prompting you to quit the application, download the new version, and install it, the following should happen:

  1. You execute the program.
  2. It checks for new versions or updates.
  3. If it finds an update or a new version, it downloads it in the background to its own temp folder.
  4. You continue to use the old version of the program unhindered.
  5. When you close the program and later restart it, the updates are automatically applied as part of the start-up process.

This is more of a passive, or transparent, update system whereby the experience is not abruptly interrupted for a mandatory update. Users continue with the old version unhindered and without noticeable interruption, and when they run the program again later, the updated version that was downloaded in the background is installed before the main software starts.
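A minimal sketch of that five-step flow in Python might look like the following. The update URL, metadata format, and helper names are all hypothetical, and a production updater would also verify signatures and resume partial downloads.

```python
# A minimal sketch of passive, asynchronous updating: download quietly
# in the background, apply on the next start-up. All URLs and file
# names below are hypothetical.
import json, os, shutil, tempfile, threading, urllib.request

UPDATE_URL = "https://example.com/app/latest.json"   # hypothetical
PENDING_DIR = os.path.join(tempfile.gettempdir(), "app_pending_update")
CURRENT_VERSION = "1.4.2"

def check_and_download() -> None:
    """Steps 2-3: runs in the background; the user is never interrupted."""
    with urllib.request.urlopen(UPDATE_URL) as resp:
        meta = json.load(resp)
    if meta["version"] == CURRENT_VERSION:
        return
    os.makedirs(PENDING_DIR, exist_ok=True)
    dest = os.path.join(PENDING_DIR, meta["filename"])
    urllib.request.urlretrieve(meta["url"], dest)    # download quietly

def apply_pending_update(install_dir: str) -> None:
    """Step 5: called at next start-up, before the main program runs."""
    if not os.path.isdir(PENDING_DIR):
        return
    for name in os.listdir(PENDING_DIR):
        shutil.move(os.path.join(PENDING_DIR, name), install_dir)
    os.rmdir(PENDING_DIR)

if __name__ == "__main__":
    apply_pending_update(os.path.dirname(__file__))              # step 5
    threading.Thread(target=check_and_download, daemon=True).start()  # 2-3
    # ... step 4: launch and run the main program here, unhindered ...
```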

I believe as release times shorten, and updates for software become more numerous, this sort of asynchronous update system should be used. What do you think?