How fast is an HTML5 Supercomputer?

One question we get asked very often is how fast a supercomputer made entirely of web browsers is. Given that GFLOPS are relatively meaningless to most people, we often compare to how fast something would run locally, or on AWS.

In one example, we ran a Monte Carlo Pi estimation on:

  • Intel Core i7 2600 Laptop
  • AWS’s cc2.8xlarge compute optimized instances (88 ECUs)
  • CrowdProcess (at 8900 workers)

The laptop used one node.js process, the cc2.8xlarge instance used Hadoop, and CrowdProcess used Web Workers across a variety of browsers.

Results, in millions of tasks per second (each task was composed of 200M trials):

image

With CrowdProcess at full power, the platform performed the equivalent number of calculations to 4750 cores of i7 processors, or 3350 cc2.8xlarge AWS instances (294,800 ECUs).

image

Not surprisingly, CrowdProcess at full power easily beats the compute nodes. What is surprising, however, is how well each browser does. The laptop is running on a single thread, and each browser node is getting around 50% of that speed.
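The "around 50%" figure follows from the numbers quoted above. A quick check, using only those figures (the variable names are ours, for illustration):

```javascript
// Figures quoted in the post.
const workers = 8900;             // CrowdProcess browsers at full power
const equivalentI7Cores = 4750;   // i7 cores needed to match the platform
const equivalentInstances = 3350; // cc2.8xlarge instances needed to match it
const ecusPerInstance = 88;

// Share of one i7 core delivered by the average browser worker.
const coreSharePerWorker = equivalentI7Cores / workers;

// Total ECUs represented by the equivalent AWS fleet.
const equivalentEcus = equivalentInstances * ecusPerInstance;

console.log(coreSharePerWorker.toFixed(2)); // 0.53
console.log(equivalentEcus);                // 294800
```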

A major caveat is in the random number generators, as the browsers lack uniformity in the way they produce random numbers, and certainly produce them differently from the Java implementation we ran on AWS. This can make the results biased towards the JavaScript run.

A fairer comparison should have all nodes running the same random number generator, and run more than one algorithm (these were pure Monte Carlo Runs). More comparisons to follow.
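To give an idea of what "the same random number generator" could look like, here is a minimal sketch that swaps Math.random for a small seeded generator. It uses xorshift32, picked purely for illustration; it is not the generator used in the runs above, and the names makeXorshift32 and estimatePi are ours.

```javascript
// xorshift32 (Marsaglia's 13/17/5 variant): a small deterministic generator.
// Illustration only; this is not the generator used in the benchmark.
function makeXorshift32(seed) {
  let state = (seed | 0) || 1; // the state must never be zero
  return function random() {
    state ^= state << 13;
    state ^= state >>> 17;
    state ^= state << 5;
    return (state >>> 0) / 4294967296; // scale to [0, 1)
  };
}

// Monte Carlo estimate of pi using whichever generator is passed in.
function estimatePi(trials, random) {
  let inQuarterCircle = 0;
  for (let i = 0; i < trials; i++) {
    const x = random();
    const y = random();
    if (x * x + y * y < 1) {
      inQuarterCircle++;
    }
  }
  return 4 * inQuarterCircle / trials;
}

// The same seed produces the same sequence, and so the same estimate,
// on every JavaScript engine.
const estimate = estimatePi(200000, makeXorshift32(12345));
console.log(estimate);
```

Because the generator relies only on 32-bit integer operations, it would be straightforward to port to Java for the Hadoop side of such a comparison.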

Naturally, there is always the case for GPUs, and we are aware that any Tesla could beat all of these numbers. It should be noted, however, that the vast majority of developers only know how to write code for CPUs. Besides the vast majority of languages being easier and simpler to write than OpenCL, JavaScript is even simpler than tons of other languages (go ahead and try our Playground), which makes it an extremely popular language. And the best part: it’s quite fast.

In a while, we intend to start looking into the potential of using WebCL on CrowdProcess, which would make it more competitive with GPGPU computing. For now, we are very happy to see how fast it is compared to all accessible alternatives.

As always, we invite you to run your own tests and benchmarks by registering on the CrowdProcess platform!

The World’s first REPL for distributed computing

Ok, it is braggy. But we are just that proud. As of today, there is a REPL for launching distributed computing tasks over thousands of nodes that even a 9-year-old can understand.



We chose to call it the “CrowdProcess Playground”, for that is what it is for: trying out things, messing around with an HTML5 supercomputer, without having to really learn much or optimize before trying. 

Welcome to the Playground!

The CrowdProcess Challenge: 1000$ for every app

One of the most fascinating things about working at CrowdProcess has been getting to know the users.

Amazing work, by physicists, statisticians, medical researchers and a plethora of developers and engineers running fascinating problems of distributed computing on web browsers.

As our users’ jobs fundamentally belong to them, we are often saddened not to be able to publish everything that goes through.

Which is why CrowdProcess is paying 1000$ to everyone who comes up with an idea for an application, develops it and releases it as open source code.

Be sure to read the complete announcement and rules before entering, but it’s more or less the following:

  • You come up with a cool idea for an application that solves a real world problem and is suitable for CrowdProcess (we can help);
  • You develop and release this application as a web app;
  • You get paid;

Of course you can submit multiple applications and make more money!

Need some inspiration?

Here’s a nice hello world application that calculates the value of Pi.

Gentlemen, start your engines!

Looks simple, right? All you have to do once you have an idea for an application is contact us to enter the contest (which is not a contest, you’ll win just for delivering the project!).

Simplest distributed computing APIs

We love simple and elegant modules and frameworks, the one-liners that make you understand something right away without needing to go through any documentation at all, REPLs, immediate experimentation and feedback, and more importantly, the instant reward from small accomplishments.

That’s why we (recently) released an API client for Python and (not so recently) an API client for JavaScript.

We couldn’t be prouder of how simple and elegant they are. Take a look at a basic hello world for the Python client:

and for the JavaScript client:

As a comparison, here is a word count program for Hadoop, the most popular distributed computing platform. It has 122 lines of code, which is not that immediate.

JavaScript in Biotech

One of the fun things about working at CrowdProcess is meeting the amazing developers who work on bleeding edge computational challenges. One of them, Bruno Vieira, has been working on bringing bioinformatics into the world of JavaScript.


Besides his work in helping us with genomics in the browser, he has also been working on bionode.io, a Node.js library for client and server side bioinformatics.


His reasoning is that having bioinformatics methods that can run on the browser could immediately benefit biological visualisation projects like BioJS or web genome browsers. It would also allow using those methods on browser based distributed computing grids like CrowdProcess. Thanks to Node.js, the same code could be run on the server to perform some tasks, which are currently performed with other bio libraries, such as Biopython or BioRuby. In this sense, one can think of bionode as an underscore.js for bioinformatics.


We fully encourage developers interested in collaborating with Bruno on bionode (lovers of JavaScript + Biology) to get in touch with him directly (project submissions finish this Friday), and remind everyone that CrowdProcess is free for awesome scientists and researchers like Bruno.

Free for scientists and researchers

Over a year ago we had a vision. A world where scientists, researchers, and those working for the benefit of humanity could have access to supercomputing resources, for free. 

Since then, a lot has happened:

  • We connected millions of different devices, ran huge computations, and helped developers work on incredible projects, from catastrophe simulation to genomics.

  • We launched an enterprise version, and started working with companies on leveraging their own resources.

  • We were joined by some of the most awesome investors, who have become core members of our young team (more on this very soon).

But at a personal level, few things can compare to being able to say the following:
We welcome scientists and researchers into the CrowdProcess platform. For free.

Genetic Algorithms on CrowdProcess

"Computer programs that evolve in ways that resemble natural selection can solve complex problems even their creators do not fully understand." - John Henry Holland

A few months ago, one of our users took on the fascinating task of implementing a genetic algorithm for optimization of job scheduling on CrowdProcess.

Flowshop Scheduling Program

In a rather crude nutshell, genetic algorithms act the same way as nature: they take a group of candidate solutions and allow them to mutate, reproduce, and crossover. They then keep the best results (as defined by a fitness function) from one generation to the next. For more on genetic algorithms, we recommend this book by David E. Goldberg.

Genetic algorithms are, by their nature, well suited to parallelism. Going back to the parallel (no pun intended) with nature, the more candidates and generations you have, the more likely one of them will reach a better optimum. In nature, that optimum is simply the ability to survive.

So how does this overlap with a distributed computing platform made of thousands of web browsers? Extremely well, it turns out. The solution is to make each web browser an independent population, run each simulation in isolation from all others, and return the best result from each one. These can then be compared locally, and the “best of the best” chosen as the optimum.

In the case of the current problem, the objective was to find the order of jobs that gave the fastest completion time of the whole production cycle. The problem is essential for production management, as proper planning of jobs can lead to significant savings in production costs.
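To make the "one population per browser" idea concrete, here is a minimal sketch of that scheme for a permutation flowshop. It is not the code linked from this post: the operators, the seeded generator, the parameter values and the tiny 5-job instance are all assumptions chosen to keep the example short.

```javascript
// Small seeded generator (xorshift32) so that each island is repeatable.
function makeRandom(seed) {
  let state = (seed | 0) || 1;
  return () => {
    state ^= state << 13;
    state ^= state >>> 17;
    state ^= state << 5;
    return (state >>> 0) / 4294967296;
  };
}

// Fitness: completion time (makespan) of a permutation flowshop schedule.
// times[job][machine] is the processing time of that job on that machine.
function makespan(order, times) {
  const finish = new Array(times[0].length).fill(0);
  for (const job of order) {
    for (let m = 0; m < finish.length; m++) {
      const ready = m === 0 ? finish[0] : Math.max(finish[m], finish[m - 1]);
      finish[m] = ready + times[job][m];
    }
  }
  return finish[finish.length - 1];
}

// Random starting order (Fisher-Yates shuffle).
function randomOrder(jobCount, random) {
  const order = Array.from({ length: jobCount }, (_, i) => i);
  for (let i = jobCount - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [order[i], order[j]] = [order[j], order[i]];
  }
  return order;
}

// Crossover: keep the head of one parent, fill in the rest in the other's order.
function crossover(a, b, random) {
  const head = a.slice(0, Math.floor(random() * a.length));
  return head.concat(b.filter((job) => !head.includes(job)));
}

// Mutation: swap two randomly chosen positions.
function mutate(order, random) {
  const copy = order.slice();
  const i = Math.floor(random() * copy.length);
  const j = Math.floor(random() * copy.length);
  [copy[i], copy[j]] = [copy[j], copy[i]];
  return copy;
}

// One island: the work a single browser would do in isolation.
function runIsland(times, seed, populationSize, generations) {
  const random = makeRandom(seed);
  const byFitness = (a, b) => makespan(a, times) - makespan(b, times);
  let population = Array.from({ length: populationSize }, () => randomOrder(times.length, random));
  for (let g = 0; g < generations; g++) {
    // Keep the better half, then refill with mutated offspring of survivors.
    const survivors = population.sort(byFitness).slice(0, Math.ceil(populationSize / 2));
    population = survivors.slice();
    while (population.length < populationSize) {
      const a = survivors[Math.floor(random() * survivors.length)];
      const b = survivors[Math.floor(random() * survivors.length)];
      population.push(mutate(crossover(a, b, random), random));
    }
  }
  const best = population.sort(byFitness)[0];
  return { order: best, makespan: makespan(best, times) };
}

// "Best of the best": compare what every island returned.
function bestOfIslands(results) {
  return results.reduce((best, r) => (r.makespan < best.makespan ? r : best));
}

// Made-up instance: 5 jobs on 3 machines, four islands with different seeds.
const times = [[3, 2, 4], [1, 5, 2], [4, 1, 3], [2, 3, 1], [5, 2, 2]];
const results = [1, 2, 3, 4].map((seed) => runIsland(times, seed, 20, 30));
const best = bestOfIslands(results);
console.log(best.order, best.makespan);
```

On the platform, each call to runIsland would be one task running in one browser, and only the final bestOfIslands comparison would happen locally.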

Applications such as telecommunications routing, financial forecasting or fleet logistics use genetic algorithms, and we hope that the CrowdProcess platform can bring them into more common use.

The source code is on this link, and we encourage developers to try it out, and to run genetic algorithms on thousands of web browsers in parallel.

Our heartfelt Thank You to Jerzy Duda from AGH University of Science and Technology in Krakow, Poland for providing us with the initial GA code and the test problems.

We are very interested in supporting developers who would like to run genetic algorithms on CrowdProcess for their own use cases, and potentially add functionality such as editable fitness functions, and control over parameters such as population size, crossover type, mutation type, etc.

If you are working with genetic algorithms, feel free to get in touch.

A few vaguely interesting numbers

CrowdProcess is a very particular distributed computing platform. It has browsers connecting and disconnecting all the time, and their number varies considerably during the day.

So how are job times affected by these variations? We decided to run a very basic experiment, which gives a (non scientific) feel for this. If you want to know how we did it, read on. If you are more interested in the cool graphs, scroll down.

Here is what we did: 
We took a simple Run function, which uses a Monte Carlo simulation to calculate pi with 1000000000 points, and returns only the time it took to calculate. In node.js, each run takes about 12 and a half seconds.
   

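function Run() {
   var inQuarterCircle = 0,
       n = 1000000000,
       i = n,
       timer = Date.now();
   // Sample n random points in the unit square and count those inside the quarter circle.
   while (i--) {
      if (Math.pow(Math.random(), 2) + Math.pow(Math.random(), 2) < 1) {
         inQuarterCircle++;
      }
   }
   // The estimate itself is discarded; only the elapsed time (in seconds) is returned.
   var pi = 4 * inQuarterCircle / n;
   return (Date.now() - timer) / 1000;
}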

Next, we made 4 JSON files and called them (to be very original) small, medium, large and huge. Each one got a different number of empty objects, corresponding to the number of tasks to be run (our function does not take an input in this case).

small.json:    2,080 tasks
medium.json:  10,000 tasks
large.json:   20,000 tasks
huge.json:    60,000 tasks
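Files like these are easy to generate. The sketch below assumes each file is a plain JSON array with one empty object per task; the exact layout the platform expects is not shown in this post, so treat that, and the helper name emptyTasks, as assumptions.

```javascript
// Build the contents of a data file holding `count` empty objects.
// Assumption: the file is a JSON array with one element per task.
function emptyTasks(count) {
  const tasks = [];
  for (let i = 0; i < count; i++) {
    tasks.push({});
  }
  return JSON.stringify(tasks);
}

// Task counts from the list above.
const sizes = { small: 2080, medium: 10000, large: 20000, huge: 60000 };

for (const name of Object.keys(sizes)) {
  const contents = emptyTasks(sizes[name]);
  console.log(name + ".json: " + JSON.parse(contents).length + " tasks");
}
```

In node.js, each string could then be saved with fs.writeFileSync(name + ".json", contents).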

Excellent. Now it only remained to run them on the platform. During the different runs (across multiple days), the number of browsers varied considerably, as follows:
image

So it was important to control for the number of browsers, and each experiment was run multiple times.

Each job we sent was returned with the times that each task took to execute. The interesting thing was to determine how different browsers took different times to run the same task. 

The average time on the local run, in node, on a Toshiba L50 (Intel® Core™ i5-3230M Dual Core) was 12.771 seconds. 

On the browsers, the distribution looked more like this:

image

This is an example from a small.json. Interestingly, a full 27.85% of browsers outperformed the local run (72.12% took longer). If you are wondering about the long tail, 3.3% crossed the 1 minute mark, and none passed the two minute mark (because of a platform timeout at 2 minutes).
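Percentages like these can be read straight off the list of task times a job returns. A minimal sketch, with a made-up sample standing in for real results and a hypothetical helper name:

```javascript
// Summarise task times (in seconds) against a local baseline.
function summarise(taskSeconds, localSeconds) {
  const total = taskSeconds.length;
  const faster = taskSeconds.filter((t) => t < localSeconds).length;
  const overMinute = taskSeconds.filter((t) => t > 60).length;
  return {
    fasterShare: faster / total,        // tasks that beat the local run
    overMinuteShare: overMinute / total // tasks in the long tail
  };
}

// Hypothetical sample, not data from the experiment.
const sample = [10, 12, 13, 20, 70];
const summary = summarise(sample, 12.771);
console.log(summary);
```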

Interesting… now on to the experiments themselves! If a single task would take 12.771 seconds, and a small.json has 2080 objects, then the expected sequential run would take a bit over 7 hours. To the platform!
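The "expected time" and "speedup" figures in this post are simple arithmetic on the local average. Spelled out (the function names are ours, and the 120 second platform time is a hypothetical value, not a measurement):

```javascript
const LOCAL_SECONDS_PER_TASK = 12.771; // average local run quoted above

// Time a single machine would need to run every task one after another.
function sequentialHours(taskCount) {
  return taskCount * LOCAL_SECONDS_PER_TASK / 3600;
}

// Speedup: expected sequential time divided by wall-clock time on the platform.
function speedup(taskCount, platformSeconds) {
  return taskCount * LOCAL_SECONDS_PER_TASK / platformSeconds;
}

console.log(sequentialHours(2080).toFixed(2)); // 7.38

// Hypothetical example: small.json finishing in 120 seconds on the platform.
console.log(speedup(2080, 120));
```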

image

Not bad: the worst result was a 172x speedup, and the best was 288x. Time goes down pleasantly linearly with the number of browsers.

Beyond, to the medium jobs! 10k tasks, expected time 35.5 hours.

image

Again, beautifully linear, as expected, with speedups ranging between 240x and 517x. Not much of a challenge for a distributed computing platform though…

So up again, to 20k tasks. (expected time: 70.9 hours).

image

All linear, except for a massive outlier. The most likely answer is that the platform was being shared by multiple developers running different tasks at the same time. Speedups at a respectable minimum 238x and maximum 643x.

Finally, the run corresponding to the ‘huge’ file, with 60k tasks (“huge” is a major overstatement, the platform has run jobs orders of magnitude larger, but it sounded like the natural thing after “large”).

Expected time for the “huge” file was 212 hours (a bit over a week sequentially).
image

Max speedup at 755x, and a minimum at 268x. One question springs to mind, what happens if we plot speedup vs number of browsers? 

Well, this happens: 

image

Interestingly, the job size has a clear impact on speedup, and not only on computation time, even though the number of tasks often exceeds the number of nodes by more than an order of magnitude.

Which raises the question… What happens to speedup per browser as the number of browsers grows?

image

Speedup per browser went up to almost 0.5 with a large number of tasks on a small number of nodes, but clearly decreased as the number of nodes increased. It also seems to be consistently higher on larger jobs.

So what’s next? Well, today we ran a job with 1 million tasks in 13.11 minutes, with a speedup of over 3000x. We will be publishing more on that next week so follow the blog, or try out the platform yourself (which is probably an even better idea). 

Scheduler improvements

CrowdProcess began February with a very considerable improvement in the scheduler, and consequent improvements in platform performance. With this we have been able to reduce the processing time per job by over 29%.

image
Now that we’ve told you the good news, let us explain how we came to this result, and how we compare results on a browser powered distributed computer:

As you might know the number of browsers connected to CrowdProcess’ platform is volatile by design. We therefore ran experiments when the number of browsers was fairly stable, ranging between 1435 and 1588.

We used a sample of more than 300 jobs for each experiment: one using the old scheduler, and the other using the new one. Next, we calculated the accumulated average time per job, as the number of jobs increased.

We used a dummy job, in which each task would wait 2 seconds before returning 1. Each job had 1500 tasks (more about jobs and tasks on CrowdProcess). We cleared the data of outliers, defined as results more than 2 standard deviations from the mean. We then plotted the accumulated average job time, which is what you can see on the graph.
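The outlier filter and the accumulated average are both a few lines of code. The sketch below uses the population standard deviation and made-up job times; both are assumptions, since the post does not say which variant was used or list the raw data.

```javascript
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Population standard deviation (an assumption; the sample variant divides by n - 1).
function standardDeviation(values) {
  const m = mean(values);
  return Math.sqrt(mean(values.map((v) => (v - m) * (v - m))));
}

// Drop results more than 2 standard deviations away from the mean.
function removeOutliers(values) {
  const m = mean(values);
  const sd = standardDeviation(values);
  return values.filter((v) => Math.abs(v - m) <= 2 * sd);
}

// Accumulated (running) average: element i is the mean of values[0..i].
function accumulatedAverage(values) {
  let sum = 0;
  return values.map((v, i) => {
    sum += v;
    return sum / (i + 1);
  });
}

// Hypothetical job times in seconds, with one obvious outlier.
const jobTimes = [10, 10, 10, 10, 10, 10, 10, 10, 10, 100];
const cleaned = removeOutliers(jobTimes);
console.log(cleaned.length);                 // 9
console.log(accumulatedAverage([2, 4, 6])); // [ 2, 3, 4 ]
```

Plotting accumulatedAverage(cleaned) against the job index gives a curve of the kind shown in the graph, one per scheduler.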

In the process, we have been collecting data that will enable us to further improve the CrowdProcess platform. We will be sharing new findings here, so follow our blog or register at our platform so that we can keep you updated.

Helping you debug yourself

We are currently working with two experts in Bioinformatics: Prof. Jonas Almeida, a researcher at the University of Alabama at Birmingham (UAB) and Prof. Alexandre Francisco from INESC in Portugal. Together we developed a browser-powered version of a Microbiome Sequence Alignment tool.

"Microbiome Sequence Alignment?! What’s that? Please tell me, CrowdProcess, I want to know!”, you say.

So, imagine you put the content of your stomach into a sequencing machine - a machine that translates genes into genetic code (ACGTs) - and you get a “book” of your stomach contents. A book written using only four letters (ACGT). Gibberish.

You then get your dictionary of ACGT-ish - a database that matches known organisms with the respective sequences - and you read it to find out what microorganisms, bacteria and bugs are populating the contents of your stomach.
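To give a flavour of the dictionary lookup, here is a toy sketch that matches short fragments (k-mers) of a read against a tiny, made-up reference. Real alignment tools tolerate mismatches and gaps and work on vastly larger databases; this exact-match version is a much simpler stand-in, not the tool described in this post.

```javascript
// Toy reference "dictionary": organism name -> known sequence (made-up data).
const reference = {
  "organism A": "ACGTACGTGGCCTTAA",
  "organism B": "TTGACCAGTTGACCAG"
};

// Index every substring of length k (k-mer) by the organisms containing it.
function buildIndex(sequences, k) {
  const index = new Map();
  for (const name of Object.keys(sequences)) {
    const seq = sequences[name];
    for (let i = 0; i + k <= seq.length; i++) {
      const kmer = seq.slice(i, i + k);
      if (!index.has(kmer)) {
        index.set(kmer, new Set());
      }
      index.get(kmer).add(name);
    }
  }
  return index;
}

// Count, per organism, how many k-mers of a read appear in its sequence.
function countHits(read, index, k) {
  const hits = {};
  for (let i = 0; i + k <= read.length; i++) {
    const names = index.get(read.slice(i, i + k));
    if (!names) {
      continue;
    }
    names.forEach((name) => {
      hits[name] = (hits[name] || 0) + 1;
    });
  }
  return hits;
}

const index = buildIndex(reference, 4);
console.log(countHits("GTGGCCTT", index, 4)); // { 'organism A': 5 }
```

Each read can be looked up independently of every other read, which is what makes this kind of work a natural fit for thousands of browsers.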

Now that you know what Microbiome Sequence Alignment is, you might be wondering what a distributed computing platform powered by web browsers has to do with it.

Here’s your answer: in the development of the work we are doing with Prof. Jonas Almeida and Prof. Alexandre Francisco, we’ve made a simple demo just to show what’s possible with CrowdProcess.

CrowdProcess Genomes

We made this because we believe that massively parallel computing obtained through web browsers can pave the way to a future where distributed computing is accessible to everyone and anyone. That and the ease of use of JavaScript can truly bring (computing) power to the people. Additionally, by using CrowdProcess behind the firewall, we want to help hospitals and clinics bring genome sequencing to everybody.

Check the demo and let us know what you think in the comments. Alternatively feel free to get in touch with us directly if you want to use it.