# Routing
The other machines use `front-web` both as their DNS server and as their default gateway. In order for `front-web` to be usable as a router, the following modification must be applied on **front-web**:
1. In `/etc/sysctl.conf`, set `net.ipv4.ip_forward=1`.
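The change only takes effect at boot; to apply it immediately, something along these lines can be run on **front-web** as `root`:
```
# re-read /etc/sysctl.conf without rebooting
sysctl -p
# verify that forwarding is now enabled (should print 1)
cat /proc/sys/net/ipv4/ip_forward
```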
Once that is done, the other machines can be set up as follows:
`/etc/network/interfaces`
```
[...]
iface ens4 inet static
address 192.168.0.XXX
netmask 255.255.255.0
gateway 192.168.0.59
dns-nameservers 192.168.0.59
[...]
```
In case the default gateway is not picked up automatically, it can be added by hand:
```
$ route add default gw 192.168.0.59 ens4
```
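Note that `route` comes from the legacy `net-tools` package; on instances where only `iproute2` is available, the equivalent command would be:
```
$ ip route add default via 192.168.0.59 dev ens4
```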
The line `auto ens3` can be commented out in the `/etc/network/interfaces.d/50-cloud-init.cfg` file, in order to prevent `ens3` from being brought up at reboot.
In order for the modification to be persistent, we need to disable cloud-init's network configuration capabilities, by editing the file `/etc/cloud/cloud.cfg.d/99-disable-network-config.cfg` with the following content:
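```
network: {config: disabled}
```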
The software is hosted on 5 machines, having the following hostnames and specs:
* **front-web**: 7 GB RAM; 2 vCores; 50 GB SSD
* **back-office**: 15 GB RAM; 4 vCores; 100 GB SSD
* **es-1**: 30 GB RAM; 8 vCores; 200 GB SSD
* **es-2**: 30 GB RAM; 8 vCores; 200 GB SSD
* **es-3**: 30 GB RAM; 8 vCores; 200 GB SSD
The above machines exchange information through a private LAN: `192.168.0.0/24`; `front-web` is the only instance directly connected to the Internet, through its WAN interface (`ens3`) and public IP address (`51.83.13.51`).
The following diagram provides a sketch of the various applications hosted by the infrastructure:
Deployments are performed using Gitlab CI. Details on each machine's role and configuration are provided below.
## front-web
The **front-web** machine has the following roles:
* router, firewall
* DNS server
* SMTP server
* Reverse Proxy
These roles are accomplished through the configuration detailed below.
### router, firewall
The relevant configuration is stored within the file `/etc/iptables/rules.v4`:
```
*nat
:PREROUTING ACCEPT [541:33128]
:INPUT ACCEPT [333:20150]
:OUTPUT ACCEPT [683:49410]
:POSTROUTING ACCEPT [683:49410]
-A POSTROUTING -s 192.168.0.0/24 -o ens3 -j MASQUERADE
COMMIT
```
Moreover, the following line must appear in the `/etc/sysctl.conf` file:
`net.ipv4.ip_forward=1`
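The `/etc/iptables/rules.v4` path suggests (though the document does not state it) that the `iptables-persistent` package is what restores these rules at boot. In any case, the file is in `iptables-restore` format and can be reloaded by hand, e.g. after a modification:
```
# load the NAT rules from the persistent rules file
iptables-restore < /etc/iptables/rules.v4
# check that the MASQUERADE rule is in place
iptables -t nat -L POSTROUTING -n -v
```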
### DNS server
We rely on the `dnsmasq` software, which was installed via `apt`. The relevant configuration is stored in the `/etc/dnsmasq.conf` file, which reads as follows:
```
domain-needed
bogus-priv
server=213.186.33.99
listen-address=192.168.0.59
no-dhcp-interface=ens4
bind-interfaces
```
The following lines were appended to the `/etc/hosts` file, allowing the DNS to resolve the entire infrastructure:
```
51.83.13.51 front-web.wan
192.168.0.59 front-web.lan
51.83.15.2 back-office.wan
192.168.0.146 back-office.lan
51.68.115.202 es-1.wan
192.168.0.74 es-1.lan
51.77.229.85 es-2.wan
192.168.0.65 es-2.lan
51.83.13.94 es-3.wan
192.168.0.236 es-3.lan
```
The `.wan` entries were declared even though they are not actually used (except for the `front-web` instance).
It is important to note that, by default, the `/etc/hosts` file is managed by the hosting service. In order to prevent user modifications from being reset at every reboot, a line has to be modified in the `/etc/cloud/cloud.cfg` file:
`manage_etc_hosts: false`
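`dnsmasq` only re-reads `/etc/hosts` at startup (or upon receiving a `SIGHUP`), so the service has to be restarted after editing the file. Resolution can then be checked from any machine on the LAN, assuming `dig` (from the `dnsutils` package) is available:
```
# make dnsmasq pick up the new /etc/hosts entries (on front-web)
systemctl restart dnsmasq
# query the dnsmasq instance directly
dig @192.168.0.59 back-office.lan +short
# external names are forwarded to the upstream server (213.186.33.99)
dig @192.168.0.59 debian.org +short
```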
### SMTP server
`postfix` and `opendkim` were installed through `apt`. The latter was set up following the instructions found at [https://wiki.debian.org/opendkim](https://wiki.debian.org/opendkim). In particular, the following commands were issued as `root`:
```
mkdir /etc/postfix/dkim/
opendkim-genkey -D /etc/postfix/dkim/ -d data.beta.grandlyon.com -s mail
chgrp opendkim /etc/postfix/dkim/*
chmod g+r /etc/postfix/dkim/*
chmod o= /etc/postfix/dkim/*
```
Moreover,
* the line "Mode sv" was uncommented in `/etc/opendkim.conf` (for unknown reasons :-()
* the following lines were appended to the same file:
```
# Specify the list of keys
KeyTable file:/etc/postfix/dkim/keytable
# Match keys and domains. To use regular expressions in the file, use refile: instead of file:
```
The following DNS TXT records complete the setup. The first one implements the Sender Policy Framework (SPF):
```
data.beta.grandlyon.com. 86400 IN TXT "v=spf1 +ip4:51.83.13.51 ~all"
```
The second one publishes the DKIM public key; a copy of it can be found in the file `/etc/postfix/dkim/mail.txt`:
```
mail._domainkey.data.beta.grandlyon.com. 86400 IN TXT "v=DKIM1; h=sha256; k=rsa; " "p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAzzoL8dvkfhm3xCpGxW8COUIgmw4r0PV/5GSUekCA8sLGPiqNh8//Jj4tFpLK6eUMacKYPbL4goUdRyTF5gqh/MdEWwafodZczELETRcp3a7mGdmM2nDhD6lk2Xtdf+nS+HWobYN18a3abNFchcF62LJWGTd4fwKV8gOIIuvTiakVxFuC7eIBUO+7m0JU0EnnivLUabphFSL3yV" "hEdpCD3csRGedSnG6+ocpZw25ll8/5f6WZnobU2d5KKqk7MVgOFXfuJMhdjmd6UvSGPaxR+/E+PsxQCU0f9vLG4R8fLPLh0ngNGGiyNYGHB5Sn8VxIrxqpH2pQKaJsfHLK/IgRJwIDAQAB"
```
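Assuming the `opendkim-tools` package is installed (it provides `opendkim-testkey`) and that `opendkim-genkey` produced its default file names (`mail.private`/`mail.txt`), the published records can be double-checked once the DNS zone has been updated:
```
# verify that the published DKIM record matches the generated private key
opendkim-testkey -d data.beta.grandlyon.com -s mail -k /etc/postfix/dkim/mail.private -vvv
# inspect the SPF record
dig TXT data.beta.grandlyon.com +short
```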
### Reverse Proxy
`nginx` was installed through `apt`. The various "virtual host" configuration files can be found in the `/etc/nginx/sites-available` and `/etc/nginx/sites-enabled` folders. TLS certificates are stored in `/etc/nginx/ssl`.
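As a purely illustrative sketch (the server name, certificate paths and upstream address below are placeholders, not taken from the actual configuration), a typical reverse-proxy entry in `sites-available` looks like:
```
server {
    listen 443 ssl;
    server_name example.data.beta.grandlyon.com;     # hypothetical name

    ssl_certificate     /etc/nginx/ssl/example.crt;  # hypothetical paths
    ssl_certificate_key /etc/nginx/ssl/example.key;

    location / {
        # forward requests to a service on the private LAN
        proxy_pass http://192.168.0.146:8080;        # hypothetical upstream
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```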
## back-office
This instance hosts both custom and off-the-shelf applications, as illustrated by the diagram displayed at the beginning of this document. These applications serve several purposes:
* administration, configuration
* monitoring
* business
The public network interface (`ens3`) was deactivated by commenting out the line `auto ens3` in the `/etc/network/interfaces.d/50-cloud-init.cfg` file. In order for the modification to be persistent, we need to disable cloud-init's network configuration capabilities, by editing the file `/etc/cloud/cloud.cfg.d/99-disable-network-config.cfg` with the following content:
```
network: {config: disabled}
```
The private network interface (`ens4`) was statically configured. Here are the relevant lines of the `/etc/network/interfaces` file:
```
[...]
auto ens4
iface ens4 inet static
address 192.168.0.146
netmask 255.255.255.0
gateway 192.168.0.59
dns-nameservers 192.168.0.59
[...]
```
The `back-office` instance runs Docker and docker-compose, which were installed following the official documentation.
The default configuration was tweaked in order to keep the address ranges used by Docker's virtual networks under control. Here is the content of the `/etc/docker/daemon.json` file:
```
{
  "default-address-pools": [
    {
      "scope": "local",
      "base": "172.17.0.0/16",
      "size": 24
    },
    {
      "scope": "global",
      "base": "172.90.0.0/16",
      "size": 24
    }
  ]
}
```
Moreover, the file `/etc/systemd/system/docker.service.d/startup_options.conf` was edited in order to make the Docker daemon listen on a TCP socket instead of the default Unix socket. This allows Portainer to connect to the Docker daemons running on the various Docker-enabled instances of the infrastructure (cf. https://success.docker.com/article/how-do-i-enable-the-remote-api-for-dockerd).
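Based on the linked article, such a systemd drop-in typically looks like the following (the bind address and port are placeholders, not the actual values used on this infrastructure):
```
[Service]
# reset the ExecStart inherited from the packaged unit file
ExecStart=
# make dockerd listen on a TCP socket reachable from the private LAN
ExecStart=/usr/bin/dockerd -H tcp://192.168.0.146:2375
```
After such a change, `systemctl daemon-reload` followed by `systemctl restart docker` is needed for the new options to be taken into account.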
## es-1, es-2, es-3
These three instances host some distributed applications:
* Elasticsearch
* Kong (backed by the Cassandra database)
* MinIO
Moreover,
* they collect and parse HTTP logs (via Filebeat and Logstash, respectively); the resulting events are then sent to a "small" Elasticsearch instance running on the `back-office` machine, for monitoring purposes;
* they store (cold) backups of the configuration of the entire infrastructure, as well as some of the relevant application data. Backups are performed by `rsnapshot`, which was installed via `apt`. Its setup requires the following steps:
1. `rsync` needs to be installed on all the instances of the infrastructure
2. a public SSH key owned by the `root` user of each `es-X` instance must be appended to the `/root/.ssh/authorized_keys` of all the other instances
3. a first SSH session from each `es-X` instance to all the others must be established, in order to answer "yes" to the question concerning the authenticity of the host we wish to connect to
4. the `/etc/rsnapshot.conf` file must be customized according to our needs. Here's the copy of the relevant lines that can be found on `es-1`:
**The `es-1`, `es-2`, `es-3` instances share the same network and Docker (+ docker-compose) configuration as the `back-office` instance.**
## Additional notes
The following software packages are installed on all the machines (via `apt`):
* `resolvconf`
* `prometheus-node-exporter`
On the `back-office` and `es-{1,2,3}` instances, `gitlab-runner` was installed following the [official documentation](https://docs.gitlab.com/runner/install/linux-repository.html). Gitlab Runners were then registered as "group runners" associated with the following group: https://gitlab.alpha.grandlyon.com/groups/refonte-data/deployment-beta. The following tags were used
* data-beta-grandlyon-com-back-office
* data-beta-grandlyon-com-es-1
* data-beta-grandlyon-com-es-2
* data-beta-grandlyon-com-es-3
in order to be able to trigger CI jobs only on selected machines.
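For reference, registering such a tagged group runner amounts to one `gitlab-runner register` invocation per machine; the executor and the token placeholder below are assumptions, to be adapted to the actual setup:
```
gitlab-runner register \
  --non-interactive \
  --url https://gitlab.alpha.grandlyon.com/ \
  --registration-token <GROUP_REGISTRATION_TOKEN> \
  --tag-list data-beta-grandlyon-com-back-office \
  --executor shell \
  --description "back-office group runner"
```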
## Critical points and potential improvements
1. **The `front-web` instance is the single point of failure (SPOF) of the infrastructure. How should we cope with it? Shall we use an HA instance? If not, how could we set up an infrastructure with two routers?**
2. Despite the periodic backups performed by `rsnapshot`, restoring data and services after a failure would take a non-negligible amount of time. Some applications are already deployed in High Availability mode:
* Kong, thanks to the Cassandra cluster
* Elasticsearch, which stores both the (meta)data related to datasets and the editorial content (edited from within the Ghost CMS application)
Some others, hosted by the `back-office` instance, are not yet distributed/replicated, but could be in the near future:
* by deploying the stateless services (mail, AUTHN, CSV catalog download, single page app, ...) on `es-{1,2,3}`;
* by deploying PostgreSQL (needed by the "organizations" and "resources wizard" services) in master-slave mode, the slaves being hosted by `es-{1,2,3}` and the master by `back-office` (N.B.: writes to the database come from the "Admin GUI" service);
* by deploying Redis (needed by the "legacy AUTH middleware" service) in HA mode, cf. https://redis.io/topics/sentinel.
N.B.: It's not such a big deal to leave the administration tools (Konga, Portainer, pgAdmin, Prometheus + Elasticsearch + Grafana) unreplicated.