diff --git a/docs/deployment/beta-deployment.md b/docs/deployment/beta-deployment.md deleted file mode 100644 index d0fc9c5ebe0d14458523452e9e2ede3b79df974d..0000000000000000000000000000000000000000 --- a/docs/deployment/beta-deployment.md +++ /dev/null @@ -1,333 +0,0 @@ -# Deployment of the beta version - -The software is hosted on 5 machines, with the following hostnames and specs: - -* **front-web**: 7 GB RAM; 2 vCores; 50 GB SSD -* **back-office**: 15 GB RAM; 4 vCores; 100 GB SSD -* **es-1**: 30 GB RAM; 8 vCores; 200 GB SSD -* **es-2**: 30 GB RAM; 8 vCores; 200 GB SSD -* **es-3**: 30 GB RAM; 8 vCores; 200 GB SSD - -The above machines exchange information through a private LAN: `192.168.0.0/24`; `front-web` is the only instance directly connected to the Internet, through its WAN interface `ens3` and public IP addresses: `51.83.13.51` (standard), `91.121.35.236` (failover). - -The following diagram provides a sketch of the various applications hosted by the infrastructure: - -Deployments are performed using GitLab CI. Details on each machine's role and configuration are provided below. - -## front-web - -The **front-web** machine has the following roles: - -* router, firewall -* DNS server -* SMTP server -* Reverse Proxy - -These roles are fulfilled by the configuration detailed below.
- -### router, firewall - -The relevant configuration is stored within the file `/etc/iptables/rules.v4`: - -``` -*nat -:PREROUTING ACCEPT [541:33128] -:INPUT ACCEPT [333:20150] -:OUTPUT ACCEPT [683:49410] -:POSTROUTING ACCEPT [683:49410] --A POSTROUTING -s 192.168.0.0/24 -o ens3 -j MASQUERADE --A POSTROUTING -o ens3 -j SNAT --to-source 91.121.35.236 -COMMIT - -*filter -:INPUT DROP [173:7020] -:FORWARD ACCEPT [2218:856119] -:OUTPUT ACCEPT [5705:2627050] --A INPUT -s 192.168.0.0/24 -m comment --comment "FULL ACCESS LAN" -j ACCEPT --A INPUT -i lo -m comment --comment "FULL ACCESS LOOPBACK" -j ACCEPT --A INPUT -s 217.182.252.78/32 -p tcp -m tcp --dport 22 -m comment --comment "SSH neogeo-ansible" -j ACCEPT --A INPUT -s 80.12.88.99/32 -p tcp -m tcp --dport 22 -m comment --comment "SSH neogeo-bureau" -j ACCEPT --A INPUT -s 213.245.116.190/32 -p tcp -m tcp --dport 22 -m comment --comment "SSH erasmes" -j ACCEPT --A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -m comment --comment "in order to receive responses to outgoing requests" -j ACCEPT --A INPUT -d 51.83.13.51/32 -i ens3 -p tcp -m tcp --dport 443 -j ACCEPT --A INPUT -d 51.83.13.51/32 -i ens3 -p tcp -m tcp --dport 80 -j ACCEPT --A INPUT -d 91.121.35.236/32 -i ens3 -p tcp -m tcp --dport 443 -j ACCEPT --A INPUT -d 91.121.35.236/32 -i ens3 -p tcp -m tcp --dport 80 -j ACCEPT -COMMIT -``` - -Moreover, the following line must appear in the `/etc/sysctl.conf` file: - -`net.ipv4.ip_forward=1` - -### DNS server - -We rely on the `dnsmasq` software, which was installed via `apt`. 
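For reference, the installation and activation follow the usual Debian workflow (a sketch — the exact package state on `front-web` may differ):

```
# Run as root on front-web: install dnsmasq and start it at boot.
apt-get update
apt-get install -y dnsmasq
systemctl enable --now dnsmasq

# After any configuration change, check the syntax before reloading:
dnsmasq --test
systemctl restart dnsmasq
```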
The relevant configuration is stored in the `/etc/dnsmasq.conf` file, which reads as follows: -``` -domain-needed -bogus-priv -server=213.186.33.99 -listen-address=192.168.0.59 -no-dhcp-interface=ens4 -bind-interfaces -``` - -The following lines were appended to the `/etc/hosts` file, allowing the DNS to resolve the entire infrastructure: -``` -51.83.13.51 front-web.wan -192.168.0.59 front-web.lan - -51.83.15.2 back-office.wan -192.168.0.146 back-office.lan - -51.68.115.202 es-1.wan -192.168.0.74 es-1.lan - -51.77.229.85 es-2.wan -192.168.0.65 es-2.lan - -51.83.13.94 es-3.wan -192.168.0.236 es-3.lan - -``` - -The `.wan` entries were declared even though they are not actually used (except for the `front-web` instance). - -It is important to note that, by default, the `/etc/hosts` file is managed by the hosting service. In order to prevent user modifications from being reset at every reboot, a line has to be modified in the `/etc/cloud/cloud.cfg` file: - -`manage_etc_hosts: false` - - -### SMTP server - -`postfix` and `opendkim` were installed through `apt`. The latter was set up following the instructions found at [https://wiki.debian.org/opendkim](https://wiki.debian.org/opendkim). In particular, the following commands were issued as `root`: - -``` -mkdir /etc/postfix/dkim/ -opendkim-genkey -D /etc/postfix/dkim/ -d data.beta.grandlyon.com -s mail -chgrp opendkim /etc/postfix/dkim/* -chmod g+r /etc/postfix/dkim/* -chmod o= /etc/postfix/dkim/* -``` - -Moreover, - -* the line `Mode sv` was uncommented in `/etc/opendkim.conf` (this makes OpenDKIM act as both a signer and a verifier) -* the following lines were appended to the same file: - - ``` - # Specify the list of keys - KeyTable file:/etc/postfix/dkim/keytable - - # Match keys and domains. To use regular expressions in the file, use refile: instead of file: - SigningTable refile:/etc/postfix/dkim/signingtable - - # Match a list of hosts whose messages will be signed. By default, only localhost is considered an internal host.
- InternalHosts refile:/etc/postfix/dkim/trustedhosts - ``` -* the line starting with `Socket` was modified as follows: - - ``` - Socket inet:8892@localhost - ``` - -Some other files were edited: - -* `/etc/postfix/dkim/keytable`: - - ```mail._domainkey.data.beta.grandlyon.com data.beta.grandlyon.com:mail:/etc/postfix/dkim/mail.private``` - -* `/etc/postfix/dkim/signingtable`: - - ```*@data.beta.grandlyon.com mail._domainkey.data.beta.grandlyon.com``` - -* `/etc/postfix/dkim/trustedhosts`: - - ``` - 127.0.0.1 - 192.168.0.0/24 - ``` - -The relevant lines in the `postfix` configuration file (`/etc/postfix/main.cf`) read as follows: - -``` -[...] -myhostname = data.beta.grandlyon.com -alias_maps = hash:/etc/aliases -alias_database = hash:/etc/aliases -myorigin = /etc/mailname -mydestination = $myhostname, data.beta.grandlyon.com, front-web.localdomain, localhost.localdomain, localhost -relayhost = -mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 192.168.0.0/24 -mailbox_size_limit = 0 -recipient_delimiter = + -inet_interfaces = all -inet_protocols = ipv4 -[...] -milter_default_action = accept -milter_protocol = 6 -smtpd_milters = inet:127.0.0.1:8892 -non_smtpd_milters = $smtpd_milters -[...] -``` - -The DNS records were updated as follows: -``` -data.beta.grandlyon.com. 86400 IN TXT "v=spf1 +ip4:51.83.13.51 ~all" -``` - -``` -mail._domainkey.data.beta.grandlyon.com. 86400 IN TXT "v=DKIM1; h=sha256; k=rsa; " "p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAzzoL8dvkfhm3xCpGxW8COUIgmw4r0PV/5GSUekCA8sLGPiqNh8//Jj4tFpLK6eUMacKYPbL4goUdRyTF5gqh/MdEWwafodZczELETRcp3a7mGdmM2nDhD6lk2Xtdf+nS+HWobYN18a3abNFchcF62LJWGTd4fwKV8gOIIuvTiakVxFuC7eIBUO+7m0JU0EnnivLUabphFSL3yV" "hEdpCD3csRGedSnG6+ocpZw25ll8/5f6WZnobU2d5KKqk7MVgOFXfuJMhdjmd6UvSGPaxR+/E+PsxQCU0f9vLG4R8fLPLh0ngNGGiyNYGHB5Sn8VxIrxqpH2pQKaJsfHLK/IgRJwIDAQAB" -``` - -The first record implements the Sender Policy Framework (SPF); the second publishes the DKIM public key, which can also be found in the file `/etc/postfix/dkim/mail.txt`.
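Once published, the records can be checked from any machine with DNS tooling (a sketch; `dig` is provided by the `dnsutils` package and `opendkim-testkey` by `opendkim-tools`):

```
# Check the SPF record on the bare domain:
dig +short TXT data.beta.grandlyon.com

# Check the DKIM record on the selector subdomain:
dig +short TXT mail._domainkey.data.beta.grandlyon.com

# On front-web: verify that the published key matches the local private key.
opendkim-testkey -d data.beta.grandlyon.com -s mail -k /etc/postfix/dkim/mail.private -vvv
```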
- -### Reverse Proxy - -`nginx` was installed through `apt`. The various "virtual host" configuration files can be found in the `/etc/nginx/sites-available` and `/etc/nginx/sites-enabled` folders. TLS certificates are stored in `/etc/nginx/ssl`. - -## back-office - -This instance hosts both custom and off-the-shelf applications, as illustrated by the diagram displayed at the beginning of this document. These applications serve several purposes: - -* administration, configuration -* monitoring -* business - -The public network interface (`ens3`) was deactivated by commenting out the line -`auto ens3` in the `/etc/network/interfaces.d/50-cloud-init.cfg` file. In order for the modification to be persistent, we need to disable cloud-init's network configuration capabilities, by editing the file `/etc/cloud/cloud.cfg.d/99-disable-network-config.cfg` with the following content: -``` -network: {config: disabled} -``` - -The private network interface (`ens4`) was statically configured. Here are the relevant lines of the `/etc/network/interfaces` file: - -``` -[...] -auto ens4 -iface ens4 inet static - address 192.168.0.146 - netmask 255.255.255.0 - gateway 192.168.0.59 - dns-nameservers 192.168.0.59 -[...] -``` - -The `back-office` instance runs Docker and docker-compose, which were installed following the official documentation: - -* https://docs.docker.com/install/linux/docker-ce/debian/ -* https://docs.docker.com/compose/install/ - -The default configuration was tweaked in order to prevent Docker from interfering with the virtual networks in use.
Here is the content of the `/etc/docker/daemon.json` file: -``` -{ - "default-address-pools": [ - { - "scope": "local", - "base": "172.17.0.0/16", - "size": 24 - }, - { - "scope": "global", - "base": "172.90.0.0/16", - "size": 24 - } - ] -} -``` - -Moreover, the content of the file `/etc/systemd/system/docker.service.d/startup_options.conf` was edited as follows, -``` -[Service] -ExecStart= -ExecStart=/usr/bin/dockerd -H fd:// -H tcp://192.168.0.146:2375 -``` -in order to make the Docker daemon listen on a TCP socket in addition to the default Unix socket. This allows Portainer to connect to the Docker daemons running on the various Docker-enabled instances of the infrastructure (cf. https://success.docker.com/article/how-do-i-enable-the-remote-api-for-dockerd). - -## es-1, es-2, es-3 - -These three instances host some distributed applications: - -* Elasticsearch -* Kong (backed by the Cassandra database) -* MinIO - -Moreover, - -* they collect HTTP logs via Filebeat and parse them with Logstash; the parsed logs are then sent to a "small" Elasticsearch instance running on the `back-office` machine for monitoring purposes; -* they store (cold) backups of the configuration of the entire infrastructure, as well as some of the relevant application data. Backups are performed by `rsnapshot`, which was installed via `apt`. Its setup requires the following steps: - -1. `rsync` needs to be installed on all the instances of the infrastructure -2. a public SSH key owned by the `root` user of each `es-X` instance must be appended to the `/root/.ssh/authorized_keys` of all the other instances -3. a first SSH session from each `es-X` instance to all the others must be established, in order to answer "yes" to the question about the authenticity of the host being connected to -4. the `/etc/rsnapshot.conf` file must be customized according to our needs. Here is a copy of the relevant lines found on `es-1`: - - ``` - [...] - - cmd_ssh /usr/bin/ssh - - [...]
- - retain hourly 6 - retain daily 7 - retain weekly 4 - retain monthly 3 - - [...] - - backup /home/ es-1/ - backup /etc/ es-1/ - backup /usr/local/ es-1/ - - backup root@es-2.lan:/etc/ es-2/ - backup root@es-2.lan:/home/ es-2/ - backup root@es-2.lan:/usr/local/ es-2/ - - backup root@es-3.lan:/etc/ es-3/ - backup root@es-3.lan:/home/ es-3/ - backup root@es-3.lan:/usr/local/ es-3/ - - backup root@back-office.lan:/etc/ back-office/ - backup root@back-office.lan:/home/ back-office/ - backup root@back-office.lan:/usr/local/ back-office/ - backup root@back-office.lan:/var/local/docker-apps/ back-office/ - - backup root@front-web.lan:/etc/ front-web/ - backup root@front-web.lan:/home/ front-web/ - backup root@front-web.lan:/usr/local/ front-web/ - ``` - - N.B.: `rsnapshot` requires the fields of its configuration file to be separated by tabs, not blank spaces - -**The `es-1`, `es-2`, `es-3` instances share the same network and Docker (+ docker-compose) configuration as the `back-office` instance.** - -## Additional notes - -The following software packages are installed on all the machines (via `apt`): - -* `resolvconf` -* `prometheus-node-exporter` - -On the `back-office` and `es-{1,2,3}` instances, `gitlab-runner` was installed following the [official documentation](https://docs.gitlab.com/runner/install/linux-repository.html). GitLab Runners were then registered as "group runners" associated with the following group: https://gitlab.alpha.grandlyon.com/groups/refonte-data/deployment-beta. The following tags were used, in order to be able to trigger CI jobs only on selected machines: -* data-beta-grandlyon-com-back-office -* data-beta-grandlyon-com-es-1 -* data-beta-grandlyon-com-es-2 -* data-beta-grandlyon-com-es-3 - - -## Critical points and potential improvements - -1. **The `front-web` instance is the single point of failure (SPOF) of the infrastructure. How should we cope with it? Shall we use an HA instance? If not, how could we set up an infrastructure with two routers?** -2.
Despite the periodic backups performed by `rsnapshot`, restoring data and services after a failure would take a non-negligible amount of time. Some applications are already deployed in High Availability mode: - * Kong, thanks to the Cassandra cluster - * Elasticsearch, which stores both the (meta)data related to datasets and the editorial content (edited from within the Ghost CMS application) - - Some others, hosted by the `back-office` instance, are not yet distributed/replicated, but could be in the near future: - - * by deploying the stateless services (mail, AUTHN, CSV catalog download, single page app, ...) on `es-{1,2,3}`; - * by deploying PostgreSQL (needed by the "organizations" and "resources wizard" services) in master-slave mode, the slaves being hosted by `es-{1,2,3}` and the master by `back-office` (N.B.: writes to the database come from the "Admin GUI" service); - * by deploying Redis (needed by the "legacy AUTH middleware" service) in HA mode, cf. https://redis.io/topics/sentinel. - - N.B.: leaving the administration tools (Konga, Portainer, pgAdmin, Prometheus + Elasticsearch + Grafana) unreplicated is not such a big deal.
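For the Redis point, a minimal Sentinel setup could look as follows. This is only a sketch: the master name `legacy-auth`, the choice of `back-office.lan` as the initial master, and the quorum of 2 are assumptions, not decisions already made.

```
# /etc/redis/sentinel.conf — one copy on each of es-1, es-2, es-3.
# Monitor the master running on back-office; the final "2" is the quorum:
# at least 2 Sentinels must agree that the master is down before a failover.
sentinel monitor legacy-auth back-office.lan 6379 2
sentinel down-after-milliseconds legacy-auth 5000
sentinel failover-timeout legacy-auth 60000
```

With three Sentinels (one per `es-X` instance) a majority remains available after the loss of any single machine, which is the minimum recommended deployment.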