Well, there are other things that can be done in JavaScript beyond substructure search. For example, calculating the Tanimoto similarity of two binary fingerprints needs just two short functions:
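function popcount(b) {
    // Counts the set bits of a 32-bit block, one byte at a time.
    // bi3b packs the bit counts of the values 0..7 into 2 bits each.
    var c = 0, v, bi3b = 0xE994;
    for (var shift = 0; shift < 32; shift += 8) {
        v = (b >>> shift) & 0xFF;
        c += 3 & (bi3b >> ((v << 1) & 14));  // bits 0-2 of the byte
        c += 3 & (bi3b >> ((v >> 2) & 14));  // bits 3-5 of the byte
        c += 3 & (bi3b >> ((v >> 5) & 6));   // bits 6-7 of the byte
    }
    return c;
}
function tanimoto(fp1, fp2) {
    var a = 0;  // bits set in fp1
    var b = 0;  // bits set in fp2
    var c = 0;  // bits set in both
    for (var i = fp1.length - 1; i >= 0; i--) {
        var block_fp1 = fp1[i];
        var block_fp2 = fp2[i];
        a += popcount(block_fp1);
        b += popcount(block_fp2);
        c += popcount(block_fp1 & block_fp2);
    }
    return c / (a + b - c);
}
The fingerprints have to be converted into JavaScript arrays of equal length containing signed 32-bit numbers:
onclick="alert(tanimoto(new Array(1, -1073741825), new Array(3, 2147483647)));"
31/34 ≈ 0.912 (32 and 33 bits set, 31 of them shared)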
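How the fingerprints get converted depends on how they are stored. As a minimal sketch, assuming they arrive as hex strings (an assumption, the post does not say), the hypothetical helper `hexToBlocks` below cuts the string into 8-digit chunks and reinterprets each chunk as a signed 32-bit number. Padding short strings with zeros on the right is also just a choice made for this sketch.

```javascript
// Hypothetical helper, not part of the original post: turn a hex-encoded
// fingerprint into an array of signed 32-bit blocks.
function hexToBlocks(hex) {
    // Pad on the right so the length is a multiple of 8 hex digits (32 bits).
    while (hex.length % 8 !== 0) {
        hex += '0';
    }
    var blocks = [];
    for (var i = 0; i < hex.length; i += 8) {
        // parseInt yields an unsigned value; "| 0" reinterprets it as signed 32-bit.
        blocks.push(parseInt(hex.slice(i, i + 8), 16) | 0);
    }
    return blocks;
}
```

With this helper, `hexToBlocks('00000001BFFFFFFF')` and `hexToBlocks('000000037FFFFFFF')` give the two arrays used in the onclick example, ready to be passed to `tanimoto`. Both hex strings must have the same length so that the arrays do too.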
Monday, February 15, 2010
Friday, February 12, 2010
Chemoinformatics in the browser: Firefox catches up
>That's a big difference. Which version of Firefox? If 3.5, have you tried 3.6?
Yes, today. Chrome 4 is not faster than Chrome 3, but Firefox 3.6 now allows jobs of about 50 structures.
| Browser | Max. job size (structures) |
|---|---|
| Chrome 3.x | 100 |
| Chrome 4.x | 100 |
| Firefox 3.5.x | 25 |
| Firefox 3.6.x | 50 |
| IE 6.x | 5 |
| IE 8.x | 10 |
These batch sizes keep script execution times at about 1 second. The idea is that a job this short does not interfere with other scripts on a page when it runs embedded, e.g. in an invisible iframe.
If the page is dedicated to the job, much larger batches might be possible, up to the limit at which the browser shows its 'A script is not responding' error message.
Update: IE 8.x is twice as fast as IE 6.x, but still slow compared to the competitors.
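The job sizes above are fixed per browser. As an alternative, here is a rough sketch of how a page could measure a batch itself and aim for the 1 second budget. This is not what the demo does: `runTimedBatch`, `nextBatchSize` and the `workFn` callback are hypothetical names, and the sketch assumes that run time grows roughly linearly with batch size.

```javascript
// Hypothetical sketch: time one batch, then suggest the size of the next one.
function runTimedBatch(items, workFn) {
    var start = new Date().getTime();
    var results = [];
    for (var i = 0; i < items.length; i++) {
        results.push(workFn(items[i]));
    }
    return { results: results, elapsedMs: new Date().getTime() - start };
}

function nextBatchSize(currentSize, elapsedMs, targetMs) {
    // Guard against a zero reading from a coarse timer.
    var measured = Math.max(elapsedMs, 1);
    // Scale proportionally, assuming run time grows linearly with batch size.
    var scaled = Math.floor(currentSize * targetMs / measured);
    // Never grow more than twofold per step, never drop below one item.
    return Math.max(1, Math.min(scaled, currentSize * 2));
}
```

A batch of 25 that took 2000 ms would be followed by one of 12, while a batch of 25 that took 500 ms would be followed by one of 50. Capping the growth keeps a single unusually fast measurement from producing a batch that runs into the browser's script timeout.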
Thursday, February 11, 2010
Chemoinformatics in the browser: Chrome finishes first
While developing my little demo in the previous article, I found that different browsers could handle different job sizes depending on how fast their JavaScript engines are.
Chrome 3.x finishes first, ahead of Firefox 3.x, and ye olde IE 6.x is almost unusably slow for substructure searching with JavaScript.
The possible job sizes are:
| Browser | Max. job size (structures) |
|---|---|
| Chrome 3.x | 100 |
| Firefox 3.x | 25 |
| IE 6.x | 5 |
Thus, the server sizes jobs according to the user-agent header sent:
if (uatype.find('Firefox/3') != -1):
    timeout = 500
    maxsize = 25
elif (uatype.find('Chrome/3') != -1):
    timeout = 200
    maxsize = 100
elif (uatype.find('MSIE') != -1):
    maxsize = 5
    timeout = 1000
else:
    return
While I knew that Chrome's JavaScript engine is fast, I didn't expect it to be that dominant.
Monday, February 8, 2010
Browsers of the world: Map! Reduce! Map! Reduce!
This article about the idea of collaborative map/reduce in the browser and this one on Depth-First gave me the idea to try something other than distributed word counting: distributed substructure matching.
The server was quickly written in Python. The backend in this case is PostgreSQL, with a table holding the structures as V2000 molfiles in plain text format. No magic so far.
Here's the code of the server.
More interesting might be how the substructure matching itself is done in 100% JavaScript. This is doable now thanks to JSDraw, a pure JavaScript structure editor, which on closer inspection has some more interesting tricks up its sleeve, notably a substructure matching capability.
The server schedules a job of maxsize random molecules from the database and constructs a page containing those molecules as molfiles. After the page has completely loaded in the browser, the matching is done and the page is posted back to the server, which parses the result. Once manually started by opening http://:8080/get, the pages keep reloading automatically by means of a meta http-equiv="refresh" tag in the result page.
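The client-side step of that cycle can be sketched roughly as follows. This is not the demo's actual code and it does not use JSDraw's real API: `runJob`, `encodeResult`, the shape of the `job` object and the `matchFn` callback (which stands in for the editor's substructure matcher) are all hypothetical.

```javascript
// Hypothetical sketch of the client-side step: match every molecule of a job
// against the query and collect the ids of the hits.
function runJob(job, matchFn) {
    var hits = [];
    for (var i = 0; i < job.molecules.length; i++) {
        var mol = job.molecules[i];
        // matchFn stands in for the editor's substructure matcher.
        if (matchFn(mol.molfile, job.query)) {
            hits.push(mol.id);
        }
    }
    return hits;
}

function encodeResult(jobId, hits) {
    // Plain "jobId:id1,id2,..." string, e.g. for a hidden form field
    // that is posted back to the server.
    return jobId + ':' + hits.join(',');
}
```

Passing the matcher in as a function keeps the loop independent of any particular toolkit, so it can be tried out with a trivial stand-in before the real matcher is wired in.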
Of course, the server is very basic. It notably lacks result tracking and the housekeeping needed to restart broken jobs, and it uses a hardcoded substructure as the search argument.
But it can be done.