Data is beautiful – pt. 2

Continuing on from the dry statistical analysis to another fairly coarse subject matter. As statistics provide numbers, those numbers do actually mean something. This is especially true when considering that all of the metrics signifying growth carry the consequence of increased physical data on hardware and the potential to be a much bigger headache than anticipated. To avoid headaches regular maintenance is key, but beyond the obvious there is a lot more going on under the hood than might be expected. In technical terms, there are some numbers and data to look at, which provide an insight into what is going on within an installation and also what potential issues are being created.

There is a lot of information to look at, but the most prone to issue metrics are actually not that complicated to understand. It is still a fairly technical, and thus dry, topic to work through, but all the more important. Leaving everything to software to handle itself is not a good idea when you cannot be sure the appropriate routines to manage all data points exist within it. As to why they may not exist, given the complexity of the data and the an equal lack of information as to why it might come into existence, there can never be a routine that accurately guesses an adequate action. Operating on assumptions often enough leads to trouble, but it can be even more detrimental in regards to software. Knowing what to look for and understanding the capabilities of software or lack thereof is rather important when it comes to long-term stability in the face of continued growth.

 

Technical data

 

Outside of the statistical information regarding the usage and scale of an installation there also exists data worth looking into from time to time. Primarily to further judge both the scale of an installation and the types of data it consists of. The biggest such provider of data is the asset server itself, as it comprises the single largest source of information for most installations.

Assets are categorized by type, which offers an insight into what sort of content is mostly present and can also give an indication of what growth to expect and thusly what requirements the installation will have to the hardware it sits on. In some cases the distribution of asset types can also provide a warning about problematic sources of content. This is especially interesting in the case of Notecards being created automatically by scripts in manners that create unnecessary data.

A simple count of the type can already provide this indication, as usually the distribution of asset types should lean towards textures, objects and scripts. Though this would also count Notecards generated by users rather than scripts and we specifically want to know, beyond just the number, where the source of them could be, which requires more information. The simplest way, then, is to look at the amount that share the same name as the source is likely to be one or multiple script(s) creating them.

SELECT `name`, COUNT(*) FROM assets WHERE description='Script generated notecard' GROUP BY `name` ORDER BY COUNT(*) DESC;

Generating a ranking of sorts to show the biggest offenders first. If not from the name itself, then the contents of any Notecard by that name can usually provide insights into their origin. It goes without saying that catching such things early is usually best, but that does not mean damage done should or needs to be ignored. While it is very difficult to determine the in-use state of a given Notecard, in cases of hundreds or even hundreds of thousands of them, action is still required. Resolving the source of the problem first and thus reducing the potential for further spam should be the first step. Then, once a mitigation has been made the data no longer required should be removed. This can mean simply deleting the entries from the database, but might require also removing the corresponding data from the filesystem. With most asset servers the filesystem contains a structure of folders and files that can be recursively searched via the hash stored in the database. Entries removed from the database can thus be used to look up the corresponding files. Similarly the opposite can be achieved through crawling the filesystem and attempting to find a corresponding entry in the database. With none present the file serves no purpose and can be safely removed.

SELECT sha256 FROM assets WHERE sha256='$filename' LIMIT 0,1;

As an example from PHPSAS. Crawling larger filesystems can be quite slow as the nesting and number of folders can reach into the tens of thousands. It also stresses the IO capacity of the hardware so it is advisable to add delays to the routines or only perform this task during times with low utilization.

In terms of other data with technical impact, mostly to performance, but in extreme cases to stability, are frienships. This may sound strange, but the internal routines to manage the information regarding friends requires OpenSimulator to renew information about each of them whenever a user changes their location within a grid. As this information doesn’t carry much data each query is usually dealt with relatively quickly. A problem, however, arises when this information cannot be sent to a designated endpoint. Eventually a target not responding to the request will result in a timeout, but during the waiting period for a response the request holds up a spot in the queue. Enough of those in a row and the queue is full, with no more data, friend or otherwise, getting through. In extreme cases this can cause a region to become unresponsive.

In order to reduce this problem it is usually a good idea to try and keep the number of friendships in a manageable area, but that is easier said than done for some social butterfly. On the other end we can go through the frienship information and determine a ranking of who happens to have gotten around the most and thus pose a potential for having either too many or potentially problematic frienships associated with them.

SELECT UserAccounts.FirstName,UserAccounts.LastName,Friends.PrincipalID, COUNT(*) AS Friendnum FROM Friends JOIN UserAccounts ON (Friends.PrincipalID=UserAccounts.PrincipalID) GROUP BY PrincipalID ORDER BY Friendnum DESC;

A total count of friends is one with, but removing those entries that point to a dead endpoint is a different matter. Users are unable to remove friendships that cannot reliably inform the other end of their dismissal. This means, that once an endpoint is dead, the friendship cannot be removed by the user. It requires removal directly through the database as well as removing the Calling Card item associated with the friendship. Determining whether and endpoint is temporarily unavailable or completely dead remains to be evaluated on a case-by-case basis.

Above a mention is made about the known grids information contained in the GridUser table. There simply a count is made of how many unique entries are present. We can also extract their urls from the same table. The remote users are entered via their UUI, which is a combination of their UUID, name and origin grid address. By removing the parts we do not need from the UUI we can create a list of just the origin urls.

SELECT DISTINCT(SUBSTRING(UserID,(LOCATE(';',UserID,1))+1, (LOCATE(';',UserID,(LOCATE(';',UserID,1)+1)) - (LOCATE(';',UserID,2)-1)-2 ))) FROM (
SELECT UserID from GridUser WHERE UserID LIKE "%;%;%") as users;

Removing the parts of the UUI by the ; delimiter and returning the distinct result we get a list of just the urls. Between MySQL and MariaDB the syntax to accomplish this can be a bit different as the methods for locating substrings and returning parts of strings differ.

For administrative purposes or in the case of a popularity contest it may be beneficial to know the locations of users. While it is not possible to get their exact coordinates through the database, we can determine the specific region they are currently connected to. However, as avatars can also be visiting from remote grids we cannot simply correlate the Presence information with the UserAccounts data alone. Instead the GridUser table needs to also be checked for matching entries and subsequent returned matches be altered to return the same usable data.

SELECT CONCAT_WS(' ', REPLACE(SUBSTRING(GridUser.UserID,(LOCATE(';',GridUser.UserID,1)+1)),';',' @ '), UserAccounts.FirstName, UserAccounts.LastName) AS Avatar, regionName
FROM Presence
LEFT JOIN regions ON Presence.RegionID = regions.uuid
LEFT JOIN GridUser ON GridUser.UserID LIKE CONCAT(Presence.UserID, ';%;%')
LEFT JOIN UserAccounts ON UserAccounts.PrincipalID = Presence.UserID;

Similarly to the query above there may be differences in syntax between database software, but the method remains the same. We look at both the UserAccounts table for the avatar names as well as parsing the UUI’s contained in GridUser table to create usable names. Then we can simply match the Presence information to the matching entries in both tables and add the region names to them. Further to note is the type of join performed to only match the actually present information when matches are found in the relevant tables as GridUser does also contain information about local users matched by their UUID.

Much like friendships, large groups can cause performance issues as well. Primarily through the functions that send information to all users within a group. This can also cause regions to become unresponsive for the time it takes to complete all the resulting scheduled sending of information. As with friendships, we mostly just want a ranking of the largest groups in order to take action on them first.

SELECT os_groups_groups.`Name`, (count(*)) as Members from os_groups_membership INNER JOIN os_groups_groups ON os_groups_groups.GroupID=os_groups_membership.GroupID GROUP BY os_groups_membership.GroupID ORDER BY Members DESC;

As is probably easy to guess by now the behavior of external users within those groups is similarly problematic when it comes to removing them. Without the endpoint replying to the request the removal is not possibly to be done by the groups founder and this action must also be performed via the database. Fortunately there is no Calling Card to remove, so no changes to inventory are required.

Not all of these queries are equally important and must be constantly checked. Trend data can be collected and usually acted upon just as much as hard limits, but the idea of offering an unrestricted space for users is also very appealing. Outside of extreme cases it usually takes quite some time before problems arise, but this should not mean there are not outliers either. As usage increases it is a good idea to reduce the intervals between checking the various statistics and technical information. What returned data is reasonable or within soft limits depends on more factors and is different for each installation. When numbers differ from all others by magnitudes it is usually a good indicator for potential problems. Eventually symptoms arise, but with the knowledge of their potential causes and a quick check of the information the cause can be narrowed down. This goes both ways as well, when the statistical or technical information shows a reasonable or even small utilization then the cause of a problem isn’t found in the simple overburdening of the installation.

To clear up any potential concerns, the information and relations made through these queries are all part of process-relevant information and their usage in regards to statistics and maintenance does not constitute any data-processing that would be subject to requirements for processing consent. Of course, the data itself is still specific to all the individuals that have created it, but none of the queries return data that can be attributed directly to an individual. In case of your own protections remember that anyone with access to the database can view and even manipulate all of your data, which is not just the case for OpenSimulator.

Leave a Reply

Your email address will not be published. Required fields are marked *