Discussion: Existing cloud mirror automation
Bastian Blank
2018-10-15 08:55:54 UTC
Hi folks

We decided we would be willing to maintain cloud provider mirrors.

These are the things I have already created:

You can try the current test setup running on GCE:

http://cdn-gce.deb.debian.org/debian
https://cdn-gce.deb.debian.org/debian
http://cdn-gce.deb.debian.org/debian-security
https://cdn-gce.deb.debian.org/debian-security

The Terraform definition:
https://salsa.debian.org/mirror-team/debian-mirror-cloud-terraform

The system Ansible definition:
https://salsa.debian.org/mirror-team/debian-mirror-cloud-ansible

And the mirror config definition:
https://salsa.debian.org/mirror-team/mirror/ansible

I developed this mainly for the GCE case, but there are initial AWS
definitions in the Terraform repository.

Regards,
Bastian
--
"... freedom ... is a worship word..."
"It is our worship word too."
-- Cloud William and Kirk, "The Omega Glory", stardate unknown
Bastian Blank
2018-10-15 11:44:44 UTC
Here is my initial description of this setup:

Google has a globally distributed anycast load balancer, which I'd like
to use. So we have one IPv4 (plus one IPv6) address that supports HTTP
and HTTPS and is, hopefully, reachable from most parts of the world.
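As a quick sanity check from any client, the test hostname from the
first message should resolve to exactly one IPv4 and one IPv6 address,
both pointing at the anycast load balancer. A minimal sketch using the
Python standard library (only the hostname from the test setup above is
assumed):

    # Resolve the test hostname and collect the distinct addresses;
    # the anycast setup should yield one IPv4 and one IPv6 address.
    import socket

    addrs = {
        info[4][0]
        for info in socket.getaddrinfo(
            "cdn-gce.deb.debian.org", 443, proto=socket.IPPROTO_TCP
        )
    }
    print(addrs)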

There is one large difference to what we currently have at Fastly and
CloudFront: the load balancer can only use backends within Google's own
cloud infrastructure, so we need to run backends there. This also makes
sure the setup does not share any failure modes with Fastly and
CloudFront.

Also, every system carries only one archive, so even the main and
security archives are completely separated.

Setup
-----

My currently planned initial setup covers everything for the main and
security archives. The others, debug and ports, will be redirects for
the time being.

This means the following number of instances:
- one syncproxy per archive (= 2),
- two backends per archive in three regions (= 12),
- one jump host and management router (= 1), and
- one monitoring host for Prometheus and Icinga2 (= 1).

The numbers for the backends can easily be changed later. Setting up a
new one for the main archive from an existing snapshot takes less than
30 minutes.
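For reference, those numbers add up as follows; a small sketch using
only the figures from the list above:

    # Instance count for the initial setup: 2 archives, 3 regions,
    # 2 backends per archive and region, plus jump and monitoring hosts.
    archives = ["main", "security"]
    regions = 3
    backends_each = 2  # per archive and region

    syncproxies = len(archives)                          # 2
    backends = len(archives) * regions * backends_each   # 12
    jump_hosts = 1
    monitoring_hosts = 1

    print(syncproxies + backends + jump_hosts + monitoring_hosts)  # 16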

Prometheus is used to extract both host and web access statistics,
including from the load balancer. Apache is not able to provide this
kind of information, so this setup runs nginx.
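To illustrate how such web access statistics can be pulled back out,
here is a sketch that queries the Prometheus HTTP API for the
per-backend request rate. The server address and the metric name
(nginx_http_requests_total, as exposed by a typical nginx exporter) are
assumptions, not details of this setup:

    # Query the standard Prometheus HTTP API (/api/v1/query) for the
    # request rate per backend instance over the last five minutes.
    import requests

    PROMETHEUS_URL = "http://prometheus.example.org:9090"  # placeholder

    query = "sum by (instance) (rate(nginx_http_requests_total[5m]))"
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query",
        params={"query": query},
        timeout=10,
    )
    resp.raise_for_status()
    for result in resp.json()["data"]["result"]:
        print(result["metric"].get("instance"), result["value"][1])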

More traditional monitoring via Icinga2 is also partially configured,
but does not produce any output yet.

The whole setup is maintained in three different git repositories:
- the cloud resources themselves are managed via Terraform,[^terraform-git]
- the system setup is done via Ansible,[^ansible-git] and
- the mirror sync setup is done via the mirror team's Ansible repository.

Terraform does the whole network, load balancer, and instance setup.
Only Terraform knows how many systems exist; everything else just asks
the Google platform to see what's available.
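As an illustration of the "ask the platform" part, the instances
Terraform created can simply be listed through the gcloud CLI. The
label filter used here (labels.role=backend) is a hypothetical naming
convention, not necessarily the one from the Terraform repository:

    # List the backend instances by asking the Google platform directly,
    # instead of keeping a separate inventory of systems.
    import json
    import subprocess

    out = subprocess.run(
        [
            "gcloud", "compute", "instances", "list",
            "--filter=labels.role=backend",
            "--format=json",
        ],
        check=True,
        capture_output=True,
        text=True,
    ).stdout

    for instance in json.loads(out):
        print(instance["name"], instance["zone"].rsplit("/", 1)[-1])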

Capacity planning
-----------------

The described setup should (judging from the official quota
documentation and my own observations) provide us, per backend, with:
- Network: 4 Gbps
- Disk for the main archive: 240 MB/s, 15k IOPS
- Disk for the security archive: 72 MB/s, 7k IOPS

So, cumulated across the backends, we get:
- for the main archive:
  - 24 Gbps network
  - 1440 MB/s disk
- for the security archive:
  - 24 Gbps network
  - 432 MB/s disk
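Those totals follow directly from the per-backend figures and the
backend count (two backends per region in three regions):

    # Cumulated throughput per archive: 2 backends x 3 regions = 6 backends.
    backends_per_archive = 2 * 3

    per_backend = {
        "main":     {"network_gbps": 4, "disk_mb_s": 240},
        "security": {"network_gbps": 4, "disk_mb_s": 72},
    }

    for archive, figures in per_backend.items():
        print(
            archive,
            figures["network_gbps"] * backends_per_archive, "Gbps,",
            figures["disk_mb_s"] * backends_per_archive, "MB/s",
        )
    # main 24 Gbps, 1440 MB/s
    # security 24 Gbps, 432 MB/s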

As everything is also cached in the CDN, we can push much higher
figures to the users.

Still unfinished stuff
----------------------

- Instance mail. Google blocks access to ports 25, 465, and 587, so
mail needs to go another way.
- If we want to do HTTPS, DSA needs to provide certificates and
automatically update them in the Google cloud target proxy (see the
sketch below).
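For the HTTPS point, one possible shape of the renewal step is sketched
below. It drives gcloud from Python; the resource names
(mirror-cert-2018-10, mirror-https-proxy) and the overall flow are
assumptions, not an agreed design:

    # Hypothetical renewal step: upload a freshly issued certificate and
    # switch the load balancer's HTTPS target proxy over to it.
    import subprocess

    def gcloud(*args):
        subprocess.run(["gcloud", "compute", *args], check=True)

    # Upload certificate and key as a new SSL certificate resource.
    gcloud("ssl-certificates", "create", "mirror-cert-2018-10",
           "--certificate=fullchain.pem", "--private-key=privkey.pem")

    # Point the target proxy at the new certificate.
    gcloud("target-https-proxies", "update", "mirror-https-proxy",
           "--ssl-certificates=mirror-cert-2018-10")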
--
Death, when unnecessary, is a tragic thing.
-- Flint, "Requiem for Methuselah", stardate 5843.7