Introducing Zetamex.Solutions

Zetamex Network has offered a host of standard packages for OpenSim in the past. That is all well and good, but sometimes standard, run-of-the-mill solutions are not what you are looking for. For that purpose we have created Zetamex.Solutions, aimed specifically at those who want to make more out of OpenSim. Our existing customers already enjoy many of the perks this branch of our company offers; now they are available to everyone.

Along with this we are changing part of our billing system to better serve the different budgets and projects people approach us with. This means removing the standardized packages and replacing them with flexible, budget-based billing. We are also streamlining our support system, removing the need for the Dedicated Support package. Instead, support is given equally to everyone, and special requests are met with a more transparent billing system. These changes were made after feedback received over the last couple of months showed they were needed. We hope you are as excited about them as we are. We will continue to gauge the feedback we receive to improve the quality of our products and the support we give for them.

phpSAS Performance Analysis

Since releasing phpSAS into production we wanted to explain a bit more about how it works, specifically in terms of the performance differences.

To start off, phpSAS is much like SRAS in that it stores files to disk directly, then organizes the data using SQL. It was built to support both PostgreSQL and MySQL/MariaDB so that an existing SRAS database can be dropped in without any issues. Furthermore, we have noticed that PHP 7 queries far faster than Ruby 1.9.3, which was the last working version that could run SRAS. So in simple terms: a request comes into phpSAS, which queries the SQL database, which in turn returns the information and location of the asset, and the asset is then served back to the grid/simulator/user.
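To make that flow a bit more concrete, here is a minimal sketch of what such a lookup-and-serve step could look like in PHP 7 with PDO. This is not phpSAS source code; the table name, column names, and on-disk layout are purely illustrative assumptions.

```php
<?php
// Minimal sketch of the request flow described above, not phpSAS itself.
// Table/column names (assets, sha256, content_type) and paths are hypothetical.

// PDO speaks to both MySQL/MariaDB and PostgreSQL, matching the dual support mentioned above.
$db = new PDO('mysql:host=localhost;dbname=assets', 'asset_user', 'secret');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// 1. Take the asset ID from the incoming request, e.g. /assets/<uuid>.
$assetId = basename(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH));

// 2. Ask SQL where the asset lives on disk and what type it is.
$stmt = $db->prepare('SELECT sha256, content_type FROM assets WHERE id = :id');
$stmt->execute([':id' => $assetId]);
$row = $stmt->fetch(PDO::FETCH_ASSOC);

if ($row === false) {
    http_response_code(404);
    exit;
}

// 3. Serve the file straight from disk; the hashed directory layout is an assumption.
$path = '/srv/assets/' . substr($row['sha256'], 0, 2) . '/' . $row['sha256'];
header('Content-Type: ' . $row['content_type']);
header('Content-Length: ' . filesize($path));
readfile($path);
```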

Since running it in production on several different types of grids, from low- to high-load environments, we wanted to report exactly what we are seeing on our end in terms of performance. First off, let's look at the raw statistics.

| Grid Active Users | Number of Assets | Memory Usage (phpSAS) | Memory Usage (SRAS) |
|---|---|---|---|
| 37 | 902,113 (84.2 GB) | 32 MB | 215 MB |
| 135 | 534,052 (48.7 GB) | 48 MB | 1.2 GB |
| 284 | 3,308,648 (148.9 GB) | 54 MB | 3 GB |

As you can see from the data above, the statistics speak for themselves. The new asset server is able to scale significantly higher, and the ability to put it behind a load balancer allows it to handle even higher spikes of traffic. The reason we have not included CPU load is that there is really no significant CPU load to speak of: neither SRAS nor phpSAS ever spiked above 2 percent load, even during the heavy traffic of loading IARs or OARs from multiple simulators.

In terms of end-point performance for the client, the following chart should give you some idea of the performance compared to the other asset servers. Please note that mileage may vary on these results depending on hardware and the grid's connection. We tested this with fairly simple mesh, textures, and objects. The viewer was run on a machine with 8 GB of RAM and an NVIDIA GRID card on ultra settings, on a 1 Gbps connection provided by LiquidSky.

| Asset Type | Stock | FSAssets | SRAS | phpSAS |
|---|---|---|---|---|
| Mesh | 4.12 | 1.55 | 1.89 | 1.31 |
| Textures | 2.07 | 2.01 | 2.15 | 2.01 |
| Objects | 2.81 | 3.01 | 2.67 | 1.41 |

If you are interested in seeing the difference for yourself then head over to ZetaWorlds and create a local account. While you could use the hypergrid, you will most likely not see a great difference in performance, as load times greatly depend on the performance of the grid you are teleporting in from.

Replacing SRAS

Zetamex Network has been a big user of SRAS for a long time. We did not make the switch to FSAssets because its system resource usage was still much higher than that of SRAS. With the most recent releases of many mainstream Linux distributions, it is becoming more and more noticeable that SRAS is showing its age. We had already been using software to allow SRAS to run on older versions of its dependencies just to get it to work at all.

We realized this was just not cutting it anymore, and we really needed something new to replace the outdated SRAS and make sure we could support it for the future to come. We reached out to freelancers to assist us in converting or rewriting something backwards compatible with SRAS, since FSAssets is not directly backwards compatible and can only be migrated to with some effort. After shopping around for freelancers, we came upon one already deep inside the OpenSimulator community who has done custom work for other grids and with whom we have worked in the past on a few smaller projects for some of our clients.

We decided to use PHP for the replacement, as we have already built almost the entire Zetamex Network back-end on it. Furthermore, we made sure it was built without relying on any dependencies other than PHP itself. This keeps it slim and lightweight, and having it written from scratch makes future upgrades easy. The development took a while, but this past week it was rolled out and is now running on all hosted and managed Zetamex Network grids. We named it phpSAS (PHP Simple Asset Service), close to the naming of SRAS (Simple Ruby Asset Service).

What makes it different is that we run it on the latest version of PHP, PHP 7, which is the fastest version of PHP to date. NGINX serves as the proxy layer for serving assets and handling the load, making it even faster at storing assets and allowing it to serve thousands of requests without causing any major load spikes. Best of all, we are now able to load balance the asset server like never before, as assets are served just like a website would be. This means we can put it behind large CDN providers with minimal effort, which is our next project for this.
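To give a rough idea of what fronting a PHP asset service with NGINX looks like, here is a small configuration sketch. This is not our production configuration; the upstream addresses, server name, paths and cache settings are placeholders.

```nginx
# Hypothetical example: two phpSAS-style back-ends behind one NGINX front-end.
upstream phpsas_backends {
    server 10.0.0.11:9000;   # php-fpm on asset node 1
    server 10.0.0.12:9000;   # php-fpm on asset node 2
}

server {
    listen 80;
    server_name assets.example.com;

    location /assets/ {
        include        fastcgi_params;
        fastcgi_pass   phpsas_backends;
        fastcgi_param  SCRIPT_FILENAME /srv/phpsas/index.php;
        # Assets rarely change, so they cache well behind a CDN placed in front of this.
        add_header     Cache-Control "public, max-age=86400";
    }
}
```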

Resting means Rusting

Constant innovation and planning for the future is nothing new for us, but while we are forging ahead our equipment has always struggled to keep up. In the last year we have drastically changed that and now have plenty of computational resources to test and develop new features with. However, since we are not a 24/7 company, some of that new equipment sits idle for a couple of hours each day while we recharge our batteries. While shutting down and then rebooting all these servers and systems would be the best option for the environment, it also means even more time spent not developing.

Folding@Home

So the logical conclusion is to use the downtime for something useful, and so we have set up a Folding@Home machine within our test system. Using the spare computational power of the cluster during the night, it helps the search for a cure for cancer. You can see the statistics of that in the blog's sidebar, which also has a direct link to the first of the machines we set up this way along with our team number, in case you want to join us. In addition to helping cancer research, the software also allows us to simulate synthetic loads during test runs, which provides us with valuable data on the stability and reliability of what we test. Finding a cure for cancer one feature at a time, if you will. We hope to have sparked some interest in Folding@Home, and maybe we will see you on our Folding team in the future.

Patching Vulnerabilities

Making sure everything is up to date and secured properly is our daily bread and butter. Great care is given to making sure updates are deployed regularly and their behavior is monitored so they do not break anything along the way. It might be boring and consume a good deal of time, but the benefits are well worth the effort. Most security holes are patched by these updates, which leaves only a few for us to have an actual look at. What is left is to make sure we actually take advantage of new security measures and implement them into our systems.

Unfortunately, with security comes inconvenience. Carefully designing security measures so as not to impede the ease of use of the systems we create is a skill not easily learned. Over time a lot of different approaches to common security issues have sprung up, so there is an array to choose from. Some are highly secure, but very cumbersome to use. Some can be very elegant, but have critical flaws in their design. Our approach is to gradually increase the level of security so as to allow our customers and users to adapt and familiarize themselves with the workflow of these measures. This has worked very well in the past, but lately the pace seems to have picked up quite a bit.

Going off the usual blog-writing here to actually address an issue directly. Recently one of our bigger customers was “attacked” by what can only be described as “movie-style hacking for leet haxor pros”, ahem. To be more precise, an “attack” was launched against the user management system, which resulted in a mess of data being entered into one of the databases. The origin of this “attack” was a piece of software distributed by Acunetix, a company specializing in vulnerability testing of websites. According to them, their software has been stolen and is now used without license or consent to stir up all sorts of trouble. While this software offers a great way for website administrators to test their work and make sure their site is safe, it apparently lacks safety of its own. Most companies offering these services make sure such tests are run with the consent of the website's owner; unfortunately, their software lacks the capability to make sure this is the case. To us this shows a distinct lack of care toward security on the web and the proper practice of it. What is most disturbing is the lack of transparency and information from their side, specifically after being promised more information regarding their software and how to deal with the “attacks” it generates.

At this point the damage has been done, and we were forced to proceed with implementing additional measures to deal with this sort of thing, although we would have liked more time to evaluate and build a system that is less intrusive and easier to use. Specifically, we have implemented a Captcha system to prevent automated registrations and injections from automated software. As it is just one click on the Captcha button, it is not complex to use yet very effective. We also implemented checks to make sure registrations come from a legitimate source and use proper credentials. We will continue to increase the security of our systems and, as this case shows, pick up the pace doing so.
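For those curious what such a check involves, the sketch below shows a typical server-side verification step in PHP against a reCAPTCHA-style verify endpoint. The provider, endpoint, field names and secret handling are assumptions for illustration, not a description of our exact implementation.

```php
<?php
// Sketch of server-side Captcha verification during registration.
// The actual provider is not named in this post; this assumes a
// reCAPTCHA-style verify endpoint and a shared secret in the environment.

function captchaPassed(string $token): bool
{
    $response = file_get_contents(
        'https://www.google.com/recaptcha/api/siteverify',
        false,
        stream_context_create([
            'http' => [
                'method'  => 'POST',
                'header'  => 'Content-Type: application/x-www-form-urlencoded',
                'content' => http_build_query([
                    'secret'   => getenv('CAPTCHA_SECRET'),   // placeholder secret
                    'response' => $token,
                ]),
            ],
        ])
    );

    if ($response === false) {
        return false;           // treat verification outages as a failed check
    }

    $result = json_decode($response, true);
    return is_array($result) && ($result['success'] ?? false) === true;
}

// Reject the registration if the Captcha token does not verify.
if (!captchaPassed($_POST['g-recaptcha-response'] ?? '')) {
    http_response_code(403);
    exit('Captcha verification failed.');
}
```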

For Science! OpenSim Research

Open-source software tends to invite a sense of adventure and discovery, which is why we have been actively testing all sorts of new solutions in the past. Apart from DDoS mitigation and CDN-based asset delivery, we also explore the bare-metal side of things. One of the biggest arguments for OpenSim has always been its cost-effectiveness, and we are taking that idea further.

One idea is to isolate customers on their own virtual machines, which makes resource allocation simpler and can improve security. However, most of these systems trade hard resource restrictions for low overall overhead, so customers may still be able to overdraw on their assigned resources. The only way to counteract this is to isolate them on physical hardware. Sounds easy enough, but in practice cost-effective hardware for just a single simulator is difficult to find. This is why we are exploring technologies like ARM, and specifically the Odroid line of small ARM boards by Hardkernel. While ARM is a great technology and certainly capable of running OpenSim, it can be a bit tricky to deal with due to a lack of support for some of the software we use alongside OpenSim. Another option is the APU line of boards from PC Engines. These book-sized boards offer the more widely used x86 architecture and thus have all the usual software available to them. Size is key here; after all, space inside datacenters is not cheap. There are many factors that will ultimately have to be evaluated before we can move ahead with these solutions, but that does not make them any less exciting to explore.

Apart from tinkering with hardware, there is more general testing going on as well. Everything from reproducing issues our customers encounter to building new systems requires testing, and that brings with it the cost of running long-term tests. To reduce this cost and to allow for a versatile testing environment capable of replicating very complex systems, we opted to run dedicated hardware locally at our headquarters. This allows for hands-on configuration and makes adding or removing parts very easy and, above all, affordable. Since some of you might be interested in the specifics of this system, the next paragraph goes into some detail on what we have set up. Most of this hardware is not overwhelming, but it does not need to be to provide adequate performance for testing.

Test System

Our main testing is done on an ESXi 6 cluster with various virtual machines. The cluster sits on an HP ProLiant DL580 G5 with four Intel Xeon X7350 processors, 128 GB of memory and 800 GB of drive space. Most of the ARM testing is done on the two Odroids, with one of them serving as an external proxy to make navigating the test system a bit easier, because remembering ever-changing IPs is not easy. The Odroid U3+ packs an ARM Cortex-A9 at 1.7 GHz and 2 GB of memory, all within a credit-card form factor; the X2 has similar specifications and is pretty much the bigger, uglier brother 🙂
The PC Engines APU is designed more as a pfSense router, but its form factor makes it a contender for stuffing a bunch of them into a server enclosure to reduce hardware cost. It features an AMD G-series GX-412TC 1 GHz quad core with 4 GB of memory and three gigabit LAN ports, all with a mere 12 W TDP. Excluded from the above are the backups, since they are just some older HDDs that happened to be spares; one of them has failed already. If any of you want to know more, feel free to ask.

Now that all the geeks and sysadmins have had their fill, back to something more easily comprehensible. We obviously also test on the actual systems these solutions will be deployed on. This forces the solutions to be solid enough to run on any hardware or system, and it helps expose problems caused by conditions we cannot simulate on our internal systems. Only after that should solutions be allowed into a production environment. Now, we are not completely free of pushing solutions without thorough testing; sometimes this cannot be helped. We would be lying if we did not admit that this has happened, but frankly that is the nature of IT. We aim to keep service interruptions to a minimum, and that usually causes such solutions to be pushed out quicker than we would normally like. Rest assured, we always go back, re-evaluate them, make sure it was the right thing to do and make changes where needed. Fortunately, with nearly a decade of experience between everyone at Zetamex Network, we have become fairly proficient at this, and our customers profit greatly from that.

Obviously we will be expanding our testing equipment and will continue to conduct research on all things OpenSim. At the same time we are looking onward to what the future holds, and with the rise of VR gaming and increasing interest from the education sector, we hope you are as excited as we are for what this will evolve into. We would not be here if it were not for our pursuit of improving what we have and creating new and better things, but at times this road has been rocky, and innovation brings with it a sense of uncertainty. We hope that our approach of being transparent and sharing some of our technology and ideas gives you a better understanding of our goals.