A Raspberry Pi a day keeps COVID-19 away (how to fight COVID-19 using Raspberry Pi SoC machines)

At a time when the coronavirus (COVID-19) is ravaging the world and most of us are stuck indoors, it’s easy to feel a bit powerless and vulnerable.

Well, fellow techies, it’s time for the geeks to step up and show the world how we handle our business!

To this end I’m going to run through a very simple process to build a Raspberry Pi 4 ‘cluster’ and have it run protein folding simulations to help researchers battle COVID-19 and other diseases such as cancer and Alzheimer’s.

There will be a particular focus on automation and scalability on the management side of things.

The main project for simulating protein folding is Folding@Home and they say:

Our specialty is in using computer simulations to understand proteins’ moving parts. Watching how the atoms in a protein move relative to one another is important because it captures valuable information that is inaccessible by any other means.

Hardware

In terms of Raspberry Pi (RPI) model, I would recommend nothing less than the 4GB RPI4.
The generation (RPI3 vs RPI4) isn’t a big deal in itself since it mostly affects processing speed, although I consider the small price difference to be worth the bump to an RPI4.

The RAM, however, will be a bigger issue as some work units have minimum memory requirements.
While each work unit for the ARM platform has relatively meager requirements (typically a few hundred megabytes), you need to remember that each work unit will run on a different CPU core.

Since the RPI4 has a quad-core CPU, we’re going to want to schedule work units on all cores concurrently to get the most out of each board, but that means having enough RAM for 4 concurrent work units. Otherwise some work units will get put on hold waiting for RAM to be freed up (which may never happen) and a CPU core will sit idle.
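As a rough sanity check of that RAM-per-core reasoning, here is a small sketch you could run on a node. The function names and the 500 MB budget used in the usage line are my own assumptions for illustration, not figures published by any project.

```shell
# per_core_mb TOTAL_KB CORES -> prints the whole megabytes of RAM available per core.
per_core_mb() {
    local total_kb=$1 cores=$2
    echo $(( total_kb / 1024 / cores ))
}

# enough_ram TOTAL_KB CORES NEED_MB -> succeeds if every core can hold one work unit.
# NEED_MB is an assumed per-work-unit budget that you pick yourself.
enough_ram() {
    [ "$(per_core_mb "$1" "$2")" -ge "$3" ]
}

# On a live Linux node: read MemTotal (in kB) and the core count, then report.
if [ -r /proc/meminfo ] && command -v nproc >/dev/null 2>&1; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "RAM per core: $(per_core_mb "$total_kb" "$(nproc)") MB"
fi
```

For example, enough_ram "$total_kb" "$(nproc)" 500 && echo "OK for 4 concurrent work units" would flag a board that is too small under that assumed budget. Note that MemTotal is a little lower than the advertised RAM, so the result is on the conservative side.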

There already are plenty of great guides on how to specify and build the hardware for a Raspberry Pi cluster and so I won’t reproduce that here. (Just click here and you’ll get a guide to the parts I’ve used)

Operating System

Raspberry Pis have an ARM processor (as opposed to the usual x86 of most PCs/laptops). However, the default operating system for the Raspberry Pi is 32-bit Raspbian, and Folding@Home don’t currently have a 32-bit ARM folding agent core (in fact, they only very recently released a 64-bit ARM core).

The Raspberry Pi 4 has a 64-bit processor, and some people will want to use a 64-bit operating system because the Raspberry Pi 4 comes in a 4GB model (which I am using here) and in theory it might be faster for some memory-intensive apps.

In my case, the motivation is a bit simpler: Raspbian is a nice distribution (downstream from Debian) but the packages that you can install from the official Raspbian repositories tend to be a bit on the old side. This has been especially painful as I’ve been playing a lot with Docker Swarm and Kubernetes.

As such, I’m using the Ubuntu 64-bit ARM image that can be installed via the recently released Raspberry Pi imager:

https://www.raspberrypi.org/downloads/
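Once the image has booted, it’s worth confirming that you really are on a 64-bit ARM system before going further. This is a minimal sketch; the is_64bit_arm helper is just a name I made up, and uname -m reports the kernel’s machine type, so treat it as a quick check rather than proof of a full 64-bit install.

```shell
# is_64bit_arm ARCH -> succeeds when ARCH names a 64-bit ARM machine type.
is_64bit_arm() {
    case "$1" in
        aarch64|arm64) return 0 ;;
        *)             return 1 ;;
    esac
}

# Ask the running kernel what it is and report the result.
arch=$(uname -m)
if is_64bit_arm "$arch"; then
    echo "$arch: 64-bit ARM kernel"
else
    echo "$arch: not a 64-bit ARM kernel"
fi
```

On 32-bit Raspbian you would typically see something like armv7l here instead of aarch64.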

Networking

The RPI4 comes with gigabit Ethernet, 2.4 & 5GHz Wi-Fi and Bluetooth, so you’re spoiled for choice on the networking front.
However, for reliability and security I go with Ethernet only. Fortunately, I have a 24-port switch and plenty of free ports.

Folding Client

Since we’re going to be running on more than one machine we ought to have a little think about how to manage all the machines:

The clients can be configured very quickly (for that just jump straight here) but since I want to easily scale things I’m going to run through a few steps that will save a huge amount of time later on so that we don’t have to keep re-configuring each client or new clients.

BOINC:

If we go with the Folding@Home client then we’ll need to run/manage things via the terminal. However, BOINC is an over-arching project that allows you to sign up for countless scientific projects that need number-crunching power, and it has some nice (if old-looking) management tools.

BOINC is an acronym for Berkeley Open Infrastructure for Network Computing.

From the BOINC website:

BOINC is a platform for distributed high throughput computing, i.e. large numbers of independent compute-intensive jobs, where the performance goal is high rate of job completion rather than low turnaround time of individual jobs. It also offers low-level mechanisms for distributed data storage. BOINC has a client/server architecture: the server distributes jobs, while the client runs on worker nodes, which execute jobs.

BOINC Stats & BAM!

So the first thing we’re going to do is sign up for BOINC stats.

This lets us centrally track our stats and provides the base for the next piece of centralised control.

https://www.boincstats.com/


Then make sure you are signed up for BOINC BAM!
This allows us to centrally manage the projects we want to contribute to and to automate the deployment of configurations to the client machines.

https://www.boincstats.com/bam/account/

From there you can take a look at the projects list and pick one to sign up to.
In fact, you can sign up for multiple projects and have your clients rotate CPU/GPU time between those projects.
For this guide, however, we’re just signing up for Rosetta@home.

N.B. The sign-up option will attempt to create an account in the project (e.g. rosetta@home) using your BOINC/BAM email address and password; likewise, the “find account” option uses those same credentials.

Client Setup

Installing the client on Ubuntu 19.10 ARM64 is as simple as running this in the terminal:

sudo apt install boinc-client

If you are running the desktop environment on the Pi then there are other packages and GUIs that you can add to manage the local client, but this is all you need to run headless.
We’ll be using a different tool to manage all the clients remotely.

The above command will have installed the binaries and also a systemd service to run the client in the background and on boot-up.
So we need to check it’s working fine:

sudo systemctl status boinc-client
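With more than a couple of nodes, repeating these commands by hand gets old quickly. Below is a minimal sketch of a loop that runs a command on every node over SSH. The node names are hypothetical placeholders, and it assumes key-based SSH logins (plus passwordless sudo if you use it for the install step), none of which is set up in this guide.

```shell
# Hypothetical node names; replace with your own hosts.
NODES="${NODES:-pi-node-01 pi-node-02 pi-node-03}"

# run_on_nodes CMD... -> runs CMD on every node in $NODES over SSH.
# With DRY_RUN=1 it only prints the ssh command lines instead of running them.
run_on_nodes() {
    local node
    for node in $NODES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh $node $*"
        else
            # BatchMode makes ssh fail instead of prompting for a password.
            ssh -o BatchMode=yes "$node" "$@" || echo "FAILED: $node" >&2
        fi
    done
}

# Usage:
#   DRY_RUN=1 run_on_nodes sudo apt install -y boinc-client
#   run_on_nodes systemctl is-active boinc-client
```

I’d suggest a dry run first so you can see exactly what would be sent to each node; systemctl is-active prints a one-word state per node, which is easier to scan than the full status output.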

At this point let’s edit the remote_hosts config file to allow our management machine (in my case a Windows desktop) to connect remotely.

sudo nano /var/lib/boinc/remote_hosts.cfg

Make sure to add the IP or DNS name of the machine you want to manage the client from (one host per line).
I didn’t bother to set a password since I’ve whitelisted the host above and my firewalls are also configured to limit the connections.
However, if you do want to add a password (or retrieve a pre-existing one) then edit the following file:

sudo nano /var/lib/boinc/gui_rpc_auth.cfg

Now just restart the service for the settings to take effect:

sudo systemctl restart boinc-client
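If you are setting up several nodes, the remote_hosts edit can be scripted instead of done in nano. This is a sketch only: add_remote_host is a helper name I made up, and the address in the usage lines is a placeholder for your own management machine.

```shell
# add_remote_host FILE HOST -> appends HOST to FILE unless that exact line already exists,
# so running it twice does not create duplicate entries.
add_remote_host() {
    local file=$1 host=$2
    touch "$file"
    if ! grep -qxF -- "$host" "$file"; then
        printf '%s\n' "$host" >> "$file"
    fi
}

# Usage on a node (as root); 192.168.1.50 is a placeholder address:
#   add_remote_host /var/lib/boinc/remote_hosts.cfg 192.168.1.50
#   systemctl restart boinc-client
```

Because the helper is idempotent, it is safe to re-run against nodes that were already configured.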

BOINC Manager

BOINC has its own manager (GUI) but it can only connect to one client per GUI instance. You can have multiple instances of the manager but it’s just messy.
So we’re going to use a tool called BOINCtasks.

Once installed, open the app and just cancel any screens asking you to choose a project.

From the main screen, choose the “Computers” button at the top:

and then open the ‘Computer‘ menu at the very top:

You can either add hosts individually or scan the local network.

Once added it should look something like this:

Connecting clients to BAM!

Make sure that in the ‘Computer‘ view you select the root “All computers” node in the left-side pane, as this will ensure that the settings we are about to change are applied to all clients.

Now select Projects > Account Manager

On the next screen just choose “BOINCstats BAM!”, log in with your account and select “Add manager”.

After this point your connected clients should all now be associated with the BOINC BAM! web service.
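If you’d rather not use a GUI for this step, the boinccmd tool that ships with the client can also attach a client to an account manager from the terminal. This is a hedged sketch: the wrapper function and the BAM_URL, BAM_USER and BAM_PASS variables are my own names, and you should take the account manager URL from the BOINCstats site rather than from me.

```shell
# Allow the binary to be overridden (e.g. BOINCCMD=echo to preview the arguments).
BOINCCMD="${BOINCCMD:-boinccmd}"

# attach_account_manager URL USER PASS -> attaches the local client to an account manager.
attach_account_manager() {
    local url=$1 user=$2 pass=$3
    "$BOINCCMD" --acct_mgr attach "$url" "$user" "$pass"
}

# Usage (variables are placeholders you set yourself):
#   attach_account_manager "$BAM_URL" "$BAM_USER" "$BAM_PASS"
#   boinccmd --acct_mgr info    # show the account manager the client is attached to
```

Depending on your setup, boinccmd may need to be run from the BOINC data directory (or given --passwd) so that it can authenticate against the local client.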

Assigning clients to projects

We can now head back over to the BOINC stats website and start assigning work.

First off, let’s ensure that all of our hosts are showing up at:

https://www.boincstats.com/bam/hosts/

Host list – (I have some nodes showing up multiple times because I rebuilt the node)

If you want to assign work to groups of machines then just create a “Host group” and add your clients/hosts in there by selecting “Add/Remove Hosts”.

The final piece is to use the “Edit Projects” button to assign the Rosetta@Home Project to your Host group so that it winds up looking something like this:

You’ll want to click the project in this view to get additional options and toggle “Attach” to “Yes“:

Fetch the client policy

Now that we’ve set up our management groups and settings, we just need the clients to pick up those changes by polling BOINC stats.

By default this happens every 24 hours but we can force a poll by switching back to the BOINCtasks GUI.

Remember to ensure that the root node is selected so we apply this action to all nodes/hosts:

At the top of the screen, choose the “Projects” > “Synchronize with

Then select all the nodes you want to update and trigger the synchronization:

The messages pane can now be used to confirm the Account Manager connections:
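The same forced poll can be sketched from a terminal with boinccmd, which is handy if you want to script it. The sync_hosts function and the host names in the usage line are hypothetical; it assumes the machine you run it from is listed in each client’s remote_hosts.cfg and that no GUI RPC password is set (otherwise add --passwd).

```shell
# Allow the binary to be overridden (e.g. BOINCCMD=echo to preview the arguments).
BOINCCMD="${BOINCCMD:-boinccmd}"

# sync_hosts HOST... -> asks each listed client to contact its account manager now.
sync_hosts() {
    local host
    for host in "$@"; do
        "$BOINCCMD" --host "$host" --acct_mgr sync || echo "sync failed: $host" >&2
    done
}

# Usage (placeholder host names):
#   sync_hosts pi-node-01 pi-node-02 pi-node-03
#   boinccmd --host pi-node-01 --get_messages    # read that client's message log
```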

Lastly, don’t be too concerned if you get a message back in the logs that says that there are no work units available.
This project and others like Folding@Home have seen a massive spike in demand for work units, so you might need to wait a few hours (more if it’s a weekend) for new ones to be issued. Sometimes you might also get messages (especially in Folding@Home) saying that there are no work units for your specific platform. Just wait it out for 24 hours and you should get some work units issued to you.
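If you’d rather not keep checking by hand, here is a rough sketch that polls the local client until it holds at least one task. It assumes that boinccmd --get_tasks prints one “WU name:” line per task, which is what I’d expect from the client but is worth checking on your version; the function names are my own.

```shell
BOINCCMD="${BOINCCMD:-boinccmd}"

# count_tasks -> reads `boinccmd --get_tasks` output on stdin and prints the number of tasks.
# Assumption: each task block contains exactly one "WU name:" line.
count_tasks() {
    grep -c 'WU name:' || true   # grep exits non-zero on zero matches but still prints 0
}

# wait_for_work [INTERVAL_SECONDS] -> blocks until the local client has at least one task.
wait_for_work() {
    local interval=${1:-3600}
    until [ "$("$BOINCCMD" --get_tasks | count_tasks)" -gt 0 ]; do
        sleep "$interval"
    done
    echo "work units received"
}

# Usage: check once an hour
#   wait_for_work 3600
```

An hourly interval is plenty given how slowly new work units tend to appear during a demand spike.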

Get Folding!!

Now just sit back and relax, grab a beer and let the computers save the planet this time!
