Web Workers - Multithreading in the Browser

Introduction

As software developers, we often deal with long-running blocking tasks, such as generating a PDF from a collection of data or batch processing thousands of records to extract statistics that feed a chart. When building non-web applications like desktop or mobile apps, we all know the solution: leverage the beloved threads. On the web, for a long time, there was nothing we could use to run a long, blocking task in the background while displaying a progress bar or a busy animation to let users know that results are on their way, or better, to let them continue working while the results are being processed. There were some workarounds, and I can think of two of them:

  • Running the work on a timer (this is not really parallel execution, but for some scenarios it might do the job)
  • Sending the work to the server and waiting for the results to come back.

While the latter approach might seem appealing, it can heavily tax your server if the task is CPU hungry, and your bandwidth if it is data hungry.
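To make the timer workaround concrete, here is a minimal sketch. The helper name processInChunks and its parameters are made up for illustration; it assumes chunkSize is a positive number and that the work can be split per item.

```javascript
// Hypothetical helper: process `items` in slices so the main thread can
// handle user input and repaint between slices.
// `schedule` defaults to a zero-delay timer; it is a parameter only so the
// scheduling strategy can be swapped.
function processInChunks(items, chunkSize, handleItem, onDone,
                         schedule = (fn) => setTimeout(fn, 0)) {
  let index = 0;

  function runChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      handleItem(items[index], index);
    }
    if (index < items.length) {
      schedule(runChunk); // yield to the event loop, then continue
    } else {
      onDone();
    }
  }

  runChunk();
}

// Usage: sum 10,000 numbers, 1,000 per timer tick.
const numbers = Array.from({ length: 10000 }, (_, i) => i + 1);
let total = 0;
processInChunks(numbers, 1000, (n) => { total += n; }, () => {
  console.log('total:', total);
});
```

Everything still runs on the main thread, so a single slow item blocks the page just as before; the timer only creates gaps between slices.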

Up until some years ago, there was no standard way of doing multithreaded programming in the browser. Now, thanks to Web Workers, we can enjoy the power of multithreading in the realm of the browser!

Why and when to use Web Workers, or not?

As discussed previously, some tasks are CPU hungry, bandwidth hungry, or both. I had to deal with this exact scenario on a client project that required the web app to display 3D models and to let the user interact with them in real time.

The challenge was that the raw 3D models the client worked on were very high resolution, so they had to be compressed and optimised to be bandwidth friendly and to render in real time. The obvious and only solution here was Web Workers.

An important thing to keep in mind is that Web Workers might not be the appropriate solution for every long-running task. Consider a task whose results are generated on the fly and must be cached on the server. Computing them in a client-side Web Worker would mean uploading every result back to the server, so running the task on the server is likely the better fit.

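Also, Web Workers add significant overhead in terms of both performance and code complexity: each worker has a startup cost, data exchanged with it is copied by default, and all communication goes through asynchronous messages. This is why you need to weigh the pros and cons carefully before diving into the code.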

Additionally, as we will discuss further in this article, Web Workers have some major technical limitations that might be a deal breaker for many scenarios.

How to use Web Workers?

In their simplest form, Web Workers are not that hard to use. All you need to do is instantiate a Worker object and pass your task to the constructor as the path to a JavaScript file.

To send data to the worker, you simply call postMessage on the instance and pass it the required data as a parameter. The data is copied using the structured clone algorithm, so you can pass almost anything except functions and other non-cloneable values (DOM nodes, for example).

Once you have instantiated the Worker object, you can assign a handler to its onmessage property (or listen for the message event), which allows you to receive data back from the Worker.
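The main-thread side described so far might look like the sketch below. The helper name startWorker, the file name 'stats-worker.js', and the payload shape are assumptions made for illustration. The WorkerCtor parameter is not part of any API; it only exists so the sketch can be exercised outside a browser.

```javascript
// Hypothetical helper: start a worker, send it one payload, and forward
// every reply to `onResult`. In a page, WorkerCtor defaults to the
// browser's Worker constructor.
function startWorker(scriptUrl, payload, onResult, WorkerCtor = globalThis.Worker) {
  const worker = new WorkerCtor(scriptUrl);
  // event.data holds whatever the worker passed to its own postMessage.
  worker.onmessage = (event) => onResult(event.data);
  // Uncaught errors inside the worker are reported here.
  worker.onerror = (event) => console.error('worker failed:', event.message);
  // The payload is cloned for the worker, not shared with it.
  worker.postMessage(payload);
  return worker;
}

// Usage; the guard makes this part run only inside a browser page.
if (typeof document !== 'undefined') {
  startWorker('stats-worker.js', { values: [3, 5, 8] }, (result) => {
    console.log('worker replied:', result);
  });
}
```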

In your worker code, you can use the postMessage function to send data back to the context that created the Worker. That context is not necessarily the main thread, since it is possible to start Workers from within other Workers.

For your Worker to do any work, it must define an onmessage handler, which gets called each time the creating context (your main application thread or another Worker) calls postMessage. This is what triggers your worker.
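On the worker side, a minimal sketch could look like this. The file name, the summarize function, and the { values: [...] } message shape are assumptions chosen for illustration; only onmessage, postMessage, and WorkerGlobalScope are real APIs.

```javascript
// Contents of a hypothetical worker file, e.g. 'stats-worker.js'.

// The actual work lives in a plain function, separate from the messaging code.
function summarize(values) {
  const sum = values.reduce((acc, v) => acc + v, 0);
  return {
    count: values.length,
    sum,
    mean: values.length ? sum / values.length : 0,
  };
}

// Wire up the handler only when this file really runs inside a worker.
if (typeof WorkerGlobalScope !== 'undefined' && self instanceof WorkerGlobalScope) {
  self.onmessage = (event) => {
    // event.data is whatever the creating context passed to postMessage.
    self.postMessage(summarize(event.data.values));
  };
}
```

Keeping the computation in its own function means it can be tested without spinning up a worker at all.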

You can import any library or any other JavaScript file from your worker. To do so, simply use the importScripts function. It takes a list of the files to import, separated by commas (e.g. importScripts('foo.js', 'bar.js');).

When you are done with your Worker, you will want to free up the resources that were allocated to it. To do so, simply call the terminate method on the worker instance.
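For a one-off job, terminating the worker as soon as its result arrives is a common pattern. The sketch below assumes a worker that replies exactly once. The name runOnce and the injectable WorkerCtor parameter are hypothetical and only there to keep the example testable outside a browser.

```javascript
// Hypothetical one-shot helper: run a single job, then release the worker.
function runOnce(scriptUrl, payload, onResult, WorkerCtor = globalThis.Worker) {
  const worker = new WorkerCtor(scriptUrl);
  worker.onmessage = (event) => {
    // terminate() stops the worker right away, even if it is still busy.
    worker.terminate();
    onResult(event.data);
  };
  worker.postMessage(payload);
  return worker;
}
```

A worker can also shut itself down from the inside by calling close().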

Note that you can create as many Workers as you like. But, as with threads, you should not have more Workers running at the same time than there are processors available (browsers report the number of logical processors as navigator.hardwareConcurrency), since this leads to performance issues due to frequent and expensive CPU context switches.
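One way to respect that limit is to cap the number of workers at what the browser reports. This is a rough sketch: workerCount is a made-up helper, and navigator.hardwareConcurrency counts logical processors, which may be higher than the number of physical cores.

```javascript
// Hypothetical helper: never start more workers than the number of logical
// processors, and always start at least one.
// `cores` can be passed explicitly; otherwise the browser's value is used,
// falling back to 1 when nothing is reported.
function workerCount(requested, cores) {
  const reported = typeof navigator !== 'undefined'
    ? navigator.hardwareConcurrency
    : undefined;
  const available = cores || reported || 1;
  return Math.max(1, Math.min(requested, available));
}

console.log('workers to start:', workerCount(8));
```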

Limitations of Web Workers

As discussed previously, Web Workers have some technical limitations that might be a deal breaker for many scenarios.

The first, mentioned earlier, is that you cannot pass non-cloneable entities to the Worker. Plain data is fine, including nested objects and arrays, but functions cannot be cloned, and an object's methods and prototype chain are not carried over: a class instance arrives as a plain object holding only its data. This is a severe limitation when you need to pass the behavior of an object to the Worker. In that case, your only option is to move the behavior code into the Worker and have it live there.
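The cloning rules are easy to observe without a worker, because the global structuredClone function uses the same algorithm as postMessage. The Counter class below is invented for the demonstration; the sketch assumes an environment that provides structuredClone (current browsers, Node.js 17 or later).

```javascript
class Counter {
  constructor() { this.count = 0; }
  increment() { this.count += 1; }
}

// Data survives, behavior does not: the copy is a plain object.
const copy = structuredClone(new Counter());
console.log(copy.count);               // 0
console.log(copy instanceof Counter);  // false
console.log(typeof copy.increment);    // "undefined"

// An own property holding a function cannot be cloned at all.
let cloneError = null;
try {
  structuredClone({ run: () => 42 });
} catch (err) {
  cloneError = err;
}
console.log(cloneError && cloneError.name); // "DataCloneError"

// Nested data, Dates, and Maps are cloned without trouble.
const nested = structuredClone({ when: new Date(0), tags: new Map([['a', 1]]) });
console.log(nested.tags.get('a'));     // 1
```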

The second limitation prevents you from accessing local storage and session storage from the Worker's code. You have to pass the data back up to the main application thread to store it in local or session storage. (IndexedDB, by contrast, is available inside Workers.)

The third limitation, and the most "limiting" one, is that you cannot access the DOM from a Web Worker. If you need to do so, your only option is to send tagged messages to the main application thread and perform your DOM manipulations from there.
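A tagged-message scheme can be as simple as a type field plus a lookup table on the main thread. The sketch below is one possible shape, not a standard protocol: the tags 'render' and 'store', the payload fields, and the function name dispatchWorkerMessage are all made up for illustration.

```javascript
// Main-thread side: route each worker message to a handler by its `type` tag.
function dispatchWorkerMessage(message, handlers) {
  if (!Object.prototype.hasOwnProperty.call(handlers, message.type)) {
    throw new Error('unknown message type: ' + message.type);
  }
  return handlers[message.type](message.payload);
}

// Only the main thread touches the DOM and localStorage.
const browserHandlers = {
  render: (payload) => {
    document.getElementById(payload.id).textContent = payload.text;
  },
  store: (payload) => {
    localStorage.setItem(payload.key, payload.value);
  },
};

// In the page:
//   worker.onmessage = (event) => dispatchWorkerMessage(event.data, browserHandlers);
// In the worker:
//   postMessage({ type: 'render', payload: { id: 'status', text: 'Done' } });
//   postMessage({ type: 'store', payload: { key: 'lastRun', value: '42' } });
```

The same table covers the storage limitation as well: the worker asks, and the main thread performs the write.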

Security concerns

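Web Workers are executed within a dedicated context independent of the caller's. In general, a Worker is not governed by the content security policy of the document that created it, but by the policy served with the worker script itself. So, if the caller script is served with a policy that prevents it from calling eval, the Workers created by that script may still be able to call eval unless their own script is served with an equally strict policy. Workers created from blob: or data: URLs are the exception: they inherit the policy of their creator.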

Conclusion

Web Workers are an efficient and elegant way to run long and heavy tasks in parallel with the main application thread within the web browser, improving the user experience by providing a more responsive front end.

Web Workers also allow us to potentially save significant processing power and bandwidth by shifting the workload to the client machine, which is very interesting since these are exactly what cloud providers tend to bill for.