Saturday, April 16, 2011

New FireGSS version for Firefox 4.0

FireGSS is the desktop client of gss and the Pithos service, in the form of a Firefox extension. Yesterday, I tested it on Firefox 4.0 for the first time (a little bit late, I know). All looked well, except for two things: a) the version number in the about box was displayed as ... undefined! and b) all menus were rendered almost transparent (definitely not usable).

What's wrong?!? Firebug and Chromebug to the rescue! The Firebug console displayed an error message at the line


var extManager = Cc["@mozilla.org/extensions/manager;1"].getService(Ci.nsIExtensionManager);


It seems that Firefox 4.0 has changed the way to programmatically access the extension manager. The new way is

// Load the AddonManager module
Components.utils.import("resource://gre/modules/AddonManager.jsm");
AddonManager.getAddonByID("addon id", function (addon) {
    // Do something with your addon; we read addon.version here
});


As you can see, the new way is asynchronous and uses a callback function to return the results. That required some refactoring on our part to make sure that the about box is not displayed before we have the version number.
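The refactoring could look roughly like the sketch below. It is only an illustration, not the actual FireGSS code: openAboutBox and showAboutBox are hypothetical names, and the manager is passed in as a parameter so that the function works with any object exposing getAddonByID(id, callback), as AddonManager does.

```javascript
// Sketch: open the about box only after the add-on version is known.
// `addonManager` must expose getAddonByID(id, callback);
// `showAboutBox` is a hypothetical function that renders the dialog.
function openAboutBox(addonManager, addonId, showAboutBox) {
  addonManager.getAddonByID(addonId, function (addon) {
    // This callback runs once the add-on record has been looked up.
    // Fall back to a placeholder if no add-on was found for the id.
    var version = addon ? addon.version : "unknown";
    showAboutBox(version);
  });
}
```

Inside the extension, the call would be something like openAboutBox(AddonManager, "addon id", ...) after importing AddonManager.jsm. The point is that nothing that needs the version runs outside the callback.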

The second problem was that we had defined the menus as popup elements inside a popupset element. The popup element is another thing that is no longer supported in 4.0; it has been replaced by menupopup.

So after those changes the plugin was playing well with Firefox 4.0 but no longer with 3.*. Since the new version does not have any other functionality enhancements, this is not a problem. Users of Firefox 3.* can continue using FireGSS v. 0.18 and users of Firefox 4.0 can upgrade to 0.19.

Until we resolve an issue with the update site, you can manually update to version 0.19 by downloading from here.

Sunday, April 10, 2011

Using Solr as a fast database cache

In gss (the open source project that the "Pithos" service and mynetworkfolders are based upon), we use Apache Solr for full-text indexing and searching of the stored documents. However, when a user searches for some terms, we have to show her only results from documents that she has read permission on, i.e. her own documents, documents shared with her by others and documents made public.

During some benchmarks we did recently, we observed extremely high response times even for searches that had very few results. After some code reviews and more fine-grained benchmarks, we realized that over 60% of the time that it took for a search to complete was due to permission checks in the database that stores the document metadata. An example search for the term 'java' returned something in the area of 2000 results from Solr. After that, each result had to have its permissions checked to see if the user that did the search had read permission on it, and results that could not be read by that user had to be filtered out. The response from Solr was blazingly fast, the transformation of the SolrDocument objects to gss resources and the marshaling to JSON took around 40% of the total time, and the remaining 60% was the permission checking.

So, we thought that if the Solr search is so fast, why don't we store the document permissions in the index and transform the search query to include the user? That way the search will return only the relevant results (those that the user has read permission on) and no permission checking and filtering will be necessary. More specifically, whenever a file is created or updated, we store in the index the user and group ids that have read permission on the file. Now, when a user does a search, we retrieve the groups that the user belongs to and append to the search query a term that checks whether the user id or any of the group ids is among those stored with the file. That way the search returns only the relevant results, thus improving search times by more than 60%.
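To make the query transformation more concrete, here is a minimal sketch. The field names readUserIds and readGroupIds and the function buildPermissionQuery are made up for this example and are not the actual gss schema or code. The sketch also assumes numeric ids (string ids would need escaping) and does not cover documents made public, which would need an extra clause.

```javascript
// Sketch: restrict a user's search to documents she may read.
// readUserIds / readGroupIds are hypothetical index fields holding the
// ids of the users and groups that have read permission on a file.
function buildPermissionQuery(userQuery, userId, groupIds) {
  var clauses = ["readUserIds:" + userId];
  if (groupIds.length > 0) {
    // Match if any of the user's groups is stored with the file.
    clauses.push("readGroupIds:(" + groupIds.join(" OR ") + ")");
  }
  // The original terms must match AND at least one permission clause.
  return "(" + userQuery + ") AND (" + clauses.join(" OR ") + ")";
}

// buildPermissionQuery("java", 42, [7, 9])
// returns "(java) AND (readUserIds:42 OR readGroupIds:(7 OR 9))"
```

With a query like this, Solr itself drops the documents the user cannot read, so no per-result check against the database is needed afterwards.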

Note: Care should be taken to update the index not only when a file is updated but also whenever its permissions change. However, this is not something that happens often, and index updating is done asynchronously through a message queue, so the load imposed on the server is insignificant.

Friday, March 25, 2011

My Google Web Toolkit Talk

If you are new to GWT technology or have never heard about it, take a look at my recent talk about Google Web Toolkit during the 3rd Greece GTUG (Google Technology User Group) Meetup. It is quite introductory, highlighting the main features of the toolkit. I am planning (if time permits) to add the transcript as well, because some slides are not self-explanatory. Your comments are always welcome.