# Deployment of the beta version
The software is hosted on five machines, with the following hostnames and specs:
* **front-web**: 7 GB RAM; 2 vCores; 50 GB SSD
* **back-office**: 15 GB RAM; 4 vCores; 100 GB SSD
* **es-1**: 30 GB RAM; 8 vCores; 200 GB SSD
* **es-2**: 30 GB RAM; 8 vCores; 200 GB SSD
* **es-3**: 30 GB RAM; 8 vCores; 200 GB SSD
The above machines exchange information through a private LAN (`192.168.0.0/24`); `front-web` is the only instance directly connected to the Internet, through its WAN interface `ens3` and its public IP addresses: `51.83.13.51` (standard) and `91.121.35.236` (failover).
The following diagram provides a sketch of the various applications hosted by the infrastructure:
Deployments are performed using GitLab CI. Details on each machine's role and configuration are provided below.
## front-web
The **front-web** machine has the following roles:
* router, firewall
* DNS server
* SMTP server
* Reverse Proxy
These roles are fulfilled through the configuration detailed below.
### router, firewall
The relevant configuration is stored within the file `/etc/iptables/rules.v4`:
```
*nat
:PREROUTING ACCEPT [541:33128]
:INPUT ACCEPT [333:20150]
:OUTPUT ACCEPT [683:49410]
:POSTROUTING ACCEPT [683:49410]
-A POSTROUTING -s 192.168.0.0/24 -o ens3 -j MASQUERADE
-A POSTROUTING -o ens3 -j SNAT --to-source 91.121.35.236
COMMIT
*filter
:INPUT DROP [173:7020]
:FORWARD ACCEPT [2218:856119]
:OUTPUT ACCEPT [5705:2627050]
-A INPUT -s 192.168.0.0/24 -m comment --comment "FULL ACCESS LAN" -j ACCEPT
-A INPUT -i lo -m comment --comment "FULL ACCESS LOOPBACK" -j ACCEPT
-A INPUT -s 217.182.252.78/32 -p tcp -m tcp --dport 22 -m comment --comment "SSH neogeo-ansible" -j ACCEPT
-A INPUT -s 80.12.88.99/32 -p tcp -m tcp --dport 22 -m comment --comment "SSH neogeo-bureau" -j ACCEPT
-A INPUT -s 213.245.116.190/32 -p tcp -m tcp --dport 22 -m comment --comment "SSH erasmes" -j ACCEPT
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -m comment --comment "in order to receive responses to outgoing requests" -j ACCEPT
-A INPUT -d 51.83.13.51/32 -i ens3 -p tcp -m tcp --dport 443 -j ACCEPT
-A INPUT -d 51.83.13.51/32 -i ens3 -p tcp -m tcp --dport 80 -j ACCEPT
-A INPUT -d 91.121.35.236/32 -i ens3 -p tcp -m tcp --dport 443 -j ACCEPT
-A INPUT -d 91.121.35.236/32 -i ens3 -p tcp -m tcp --dport 80 -j ACCEPT
[...]
COMMIT
```
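Since the rules live in `/etc/iptables/rules.v4`, they are presumably loaded at boot by the `iptables-persistent` package (an assumption; the package is not named explicitly above). A minimal sketch for applying and inspecting the rules by hand:
```
# Apply the saved rules directly
iptables-restore < /etc/iptables/rules.v4
# Inspect the NAT table to check the MASQUERADE/SNAT rules
iptables -t nat -L POSTROUTING -n -v
# Inspect the filter table to check the INPUT policy and exceptions
iptables -L INPUT -n -v
```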
Moreover, the following line must appear in the `/etc/sysctl.conf` file:
`net.ipv4.ip_forward=1`
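This setting only takes effect after a reboot or an explicit reload. A quick way to apply and verify it (standard `sysctl` usage, not copied from the machines):
```
# Re-read /etc/sysctl.conf without rebooting
sysctl -p
# Verify that forwarding is enabled: should print "net.ipv4.ip_forward = 1"
sysctl net.ipv4.ip_forward
```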
### DNS server
We rely on `dnsmasq`, which was installed via `apt`. The relevant configuration is stored in the `/etc/dnsmasq.conf` file, which reads as follows:
```
domain-needed
bogus-priv
server=213.186.33.99
listen-address=192.168.0.59
no-dhcp-interface=ens4
bind-interfaces
```
The following lines were appended to the `/etc/hosts` file, allowing the DNS server to resolve every host of the infrastructure:
```
51.83.13.51 front-web.wan
192.168.0.59 front-web.lan
51.83.15.2 back-office.wan
192.168.0.146 back-office.lan
51.68.115.202 es-1.wan
192.168.0.74 es-1.lan
51.77.229.85 es-2.wan
192.168.0.65 es-2.lan
51.83.13.94 es-3.wan
192.168.0.236 es-3.lan
```
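Since `dnsmasq` serves the entries found in `/etc/hosts`, resolution can be checked from any LAN machine by querying `front-web` directly (a hypothetical check, using the `dig` utility from the `dnsutils` package):
```
# Ask the DNS server running on front-web for a LAN hostname
dig @192.168.0.59 back-office.lan +short
# Expected answer: 192.168.0.146
```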
The `.wan` entries were declared even though the corresponding interfaces are not actually used (except on the `front-web` instance).
It is important to note that, by default, the `/etc/hosts` file is managed by the hosting service. In order to prevent user modifications from being reset at every reboot, the following line has to be modified in the `/etc/cloud/cloud.cfg` file:
`manage_etc_hosts: false`
### SMTP server
`postfix` and `opendkim` were installed through `apt`. The latter was set up following the instructions found at [https://wiki.debian.org/opendkim](https://wiki.debian.org/opendkim). In particular, the following commands were issued as `root`:
```
mkdir /etc/postfix/dkim/
opendkim-genkey -D /etc/postfix/dkim/ -d data.beta.grandlyon.com -s mail
chgrp opendkim /etc/postfix/dkim/*
chmod g+r /etc/postfix/dkim/*
chmod o= /etc/postfix/dkim/*
```
Moreover,
* the line `Mode sv` was uncommented in `/etc/opendkim.conf` (for unknown reasons :-()
* the following lines were appended to the same file:
```
# Specify the list of keys
KeyTable file:/etc/postfix/dkim/keytable
# Match keys and domains. To use regular expressions in the file, use refile: instead of file:
SigningTable refile:/etc/postfix/dkim/signingtable
# Match a list of hosts whose messages will be signed. By default, only localhost is considered as internal host.
InternalHosts refile:/etc/postfix/dkim/trustedhosts
```
* the line starting with `Socket` was modified as follows:
```
Socket inet:8892@localhost
```
Some other files were edited:
* `/etc/postfix/dkim/keytable`:
```mail._domainkey.data.beta.grandlyon.com data.beta.grandlyon.com:mail:/etc/postfix/dkim/mail.private```
* `/etc/postfix/dkim/signingtable`:
```*@data.beta.grandlyon.com mail._domainkey.data.beta.grandlyon.com```
* `/etc/postfix/dkim/trustedhosts`:
```
127.0.0.1
192.168.0.0/24
```
The relevant lines in the `postfix` configuration file (`/etc/postfix/main.cf`) read as follows:
```
[...]
myhostname = data.beta.grandlyon.com
alias_maps = hash:/etc/aliases
alias_database = hash:/etc/aliases
myorigin = /etc/mailname
mydestination = $myhostname, data.beta.grandlyon.com, front-web.localdomain, localhost.localdomain, localhost
relayhost =
mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 192.168.0.0/24
mailbox_size_limit = 0
recipient_delimiter = +
inet_interfaces = all
inet_protocols = ipv4
[...]
milter_default_action = accept
milter_protocol = 6
smtpd_milters = inet:127.0.0.1:8892
non_smtpd_milters = $smtpd_milters
[...]
```
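For these changes to take effect, both daemons have to be restarted. A sketch of the usual commands (standard Debian tooling, not copied from the machines):
```
# Double-check the non-default postfix settings
postconf -n
# Restart the milter and the MTA so the new configuration is picked up
systemctl restart opendkim postfix
```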
The DNS records were updated as follows:
```
data.beta.grandlyon.com. 86400 IN TXT "v=spf1 +ip4:51.83.13.51 ~all"
```
```
mail._domainkey.data.beta.grandlyon.com. 86400 IN TXT "v=DKIM1; h=sha256; k=rsa; " "p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAzzoL8dvkfhm3xCpGxW8COUIgmw4r0PV/5GSUekCA8sLGPiqNh8//Jj4tFpLK6eUMacKYPbL4goUdRyTF5gqh/MdEWwafodZczELETRcp3a7mGdmM2nDhD6lk2Xtdf+nS+HWobYN18a3abNFchcF62LJWGTd4fwKV8gOIIuvTiakVxFuC7eIBUO+7m0JU0EnnivLUabphFSL3yV" "hEdpCD3csRGedSnG6+ocpZw25ll8/5f6WZnobU2d5KKqk7MVgOFXfuJMhdjmd6UvSGPaxR+/E+PsxQCU0f9vLG4R8fLPLh0ngNGGiyNYGHB5Sn8VxIrxqpH2pQKaJsfHLK/IgRJwIDAQAB"
```
in order to implement the Sender Policy Framework (SPF) and DomainKeys Identified Mail (DKIM). The DKIM public key can also be found in the file `/etc/postfix/dkim/mail.txt`.
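Once the TXT records have propagated, the DKIM key can be checked with `opendkim-testkey` (shipped with the `opendkim-tools` package), and a test message can be sent to verify end-to-end signing; the addresses below are hypothetical placeholders:
```
# Verify that the published DNS record matches the private key
opendkim-testkey -d data.beta.grandlyon.com -s mail -vvv
# Send a test message; the receiving side should report dkim=pass and spf=pass
echo "test body" | sendmail -f noreply@data.beta.grandlyon.com someone@example.com
```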
### Reverse Proxy
`nginx` was installed through `apt`. The various "virtual host" configuration files can be found in the `/etc/nginx/sites-available` and `/etc/nginx/sites-enabled` folders. TLS certificates are stored in `/etc/nginx/ssl`.
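As an illustration of the pattern (the actual virtual hosts are not reproduced here), a minimal reverse-proxy configuration could look like the following sketch, where the server name, certificate paths and upstream address are assumptions:
```
server {
    listen 443 ssl;
    server_name data.beta.grandlyon.com;

    ssl_certificate     /etc/nginx/ssl/data.beta.grandlyon.com.crt;
    ssl_certificate_key /etc/nginx/ssl/data.beta.grandlyon.com.key;

    location / {
        # Forward requests to an application hosted on the back-office machine
        proxy_pass http://192.168.0.146:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```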
## back-office
This instance hosts both custom and off-the-shelf applications, as illustrated by the diagram displayed at the beginning of this document. These applications serve several purposes:
* administration, configuration
* monitoring
* business
The public network interface (`ens3`) was deactivated by commenting out the line
`auto ens3` in the `/etc/network/interfaces.d/50-cloud-init.cfg` file. In order for this modification to be persistent, cloud-init's network configuration capabilities need to be disabled, by editing the `/etc/cloud/cloud.cfg.d/99-disable-network-config.cfg` file with the following content:
```
network: {config: disabled}
```
The private network interface (`ens4`) was statically configured. Here are the relevant lines of the `/etc/network/interfaces` file:
```
[...]
auto ens4
iface ens4 inet static
address 192.168.0.146
netmask 255.255.255.0
gateway 192.168.0.59
dns-nameservers 192.168.0.59
[...]
```
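One way to apply such a change without rebooting (standard `ifupdown` commands, assuming no other service manages the interface):
```
# Bring the interface down and up again with the new static configuration
ifdown ens4 && ifup ens4
# Check the assigned address and the default route
ip addr show ens4
ip route
```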
The `back-office` instance runs Docker and docker-compose, which were installed following the official documentation:
* https://docs.docker.com/install/linux/docker-ce/debian/
* https://docs.docker.com/compose/install/
The default configuration was tweaked in order to prevent Docker's automatically-assigned subnets from clashing with the existing virtual networks. Here is the content of the `/etc/docker/daemon.json` file:
```
{
"default-address-pools": [
{
"scope": "local",
"base": "172.17.0.0/16",
"size": 24
},
{
"scope": "global",
"base": "172.90.0.0/16",
"size": 24
}
]
}
```
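The effect of this setting can be verified by creating a throw-away network and checking which subnet Docker picks (a hypothetical check, not part of the original setup):
```
# Create a test bridge network; Docker should carve a /24 out of 172.17.0.0/16
docker network create test-pool
docker network inspect test-pool --format '{{(index .IPAM.Config 0).Subnet}}'
# Clean up
docker network rm test-pool
```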
Moreover, the content of the file `/etc/systemd/system/docker.service.d/startup_options.conf` was edited as follows,
```
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -H tcp://192.168.0.146:2375
```
in order to make the Docker daemon listen on a TCP socket instead of the default Unix socket. This allows Portainer to connect to the Docker daemons running on the various Docker-enabled instances of the infrastructure (cf. https://success.docker.com/article/how-do-i-enable-the-remote-api-for-dockerd).
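With the daemon listening on TCP, any Docker client on the LAN can target it through the `-H` flag or the `DOCKER_HOST` variable (a usage sketch; note that port 2375 is unauthenticated, which is tolerable here only because access is restricted to the private LAN):
```
# Query the remote Docker daemon from another machine on the LAN
docker -H tcp://192.168.0.146:2375 info
# Equivalently:
export DOCKER_HOST=tcp://192.168.0.146:2375
docker ps
```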
## es-1, es-2, es-3
These three instances host some distributed applications:
* Elasticsearch
* Kong (backed by the Cassandra database)
* MinIO
Moreover,
* they collect HTTP logs via Filebeat and parse them via Logstash; the parsed logs are then sent to a "small" Elasticsearch instance running on the `back-office` machine, for monitoring purposes;
* they store (cold) backups of the configuration of the entire infrastructure, as well as some of the relevant application data. Backups are performed by `rsnapshot`, which was installed via `apt`. Its setup requires the following steps:
1. `rsync` needs to be installed on all the instances of the infrastructure
2. a public SSH key owned by the `root` user of each `es-X` instance must be appended to the `/root/.ssh/authorized_keys` of all the other instances
3. a first SSH session from each `es-X` instance to all the others must be established, in order to answer "yes" to the question concerning the authenticity of the remote host
4. the `/etc/rsnapshot.conf` file must be customized according to our needs. Here is a copy of the relevant lines found on `es-1`:
```
[...]
cmd_ssh /usr/bin/ssh
[...]
retain hourly 6
retain daily 7
retain weekly 4
retain monthly 3
[...]
backup /home/ es-1/
backup /etc/ es-1/
backup /usr/local/ es-1/
backup root@es-2.lan:/etc/ es-2/
backup root@es-2.lan:/home/ es-2/
backup root@es-2.lan:/usr/local/ es-2/
backup root@es-3.lan:/etc/ es-3/
backup root@es-3.lan:/home/ es-3/
backup root@es-3.lan:/usr/local/ es-3/
backup root@back-office.lan:/etc/ back-office/
backup root@back-office.lan:/home/ back-office/
backup root@back-office.lan:/usr/local/ back-office/
backup root@back-office.lan:/var/local/docker-apps/ back-office/
backup root@front-web.lan:/etc/ front-web/
backup root@front-web.lan:/home/ front-web/
backup root@front-web.lan:/usr/local/ front-web/
```
N.B.: `rsnapshot` loves tabs and hates blank spaces: fields in the configuration file must be separated by tabs, not spaces.
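The `retain` intervals above only rotate snapshots; the actual runs have to be scheduled. The configuration can be validated and a plausible `root` crontab sketched as follows (the crontab is an assumption, not reproduced from the machines):
```
# Validate the configuration first; this catches the tab-vs-space mistakes
# mentioned above
rsnapshot configtest

# A plausible root crontab matching the retain values above (6 hourly runs/day):
# 0 */4 * * *   /usr/bin/rsnapshot hourly
# 30 3 * * *    /usr/bin/rsnapshot daily
# 0  3 * * 1    /usr/bin/rsnapshot weekly
# 30 2 1 * *    /usr/bin/rsnapshot monthly
```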
**The `es-1`, `es-2`, `es-3` instances share the same network and Docker (+ docker-compose) configuration as the `back-office` instance.**
## Additional notes
The following software packages are installed on all the machines (via `apt`):
* `resolvconf`
* `prometheus-node-exporter`
On the `back-office` and `es-{1,2,3}` instances, `gitlab-runner` was installed following the [official documentation](https://docs.gitlab.com/runner/install/linux-repository.html). GitLab Runners were then registered as "group runners" associated with the following group: https://gitlab.alpha.grandlyon.com/groups/refonte-data/deployment-beta. The following tags were used
* data-beta-grandlyon-com-back-office
* data-beta-grandlyon-com-es-1
* data-beta-grandlyon-com-es-2
* data-beta-grandlyon-com-es-3
in order to be able to trigger CI jobs only on selected machines.
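For reference, a job in a project's `.gitlab-ci.yml` would then pin itself to a given machine through the `tags` keyword (a hypothetical job, for illustration only):
```
deploy-back-office:
  stage: deploy
  tags:
    - data-beta-grandlyon-com-back-office
  script:
    - docker-compose up -d
```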
## Critical points and potential improvements
1. **The `front-web` instance is the SPOF of the infrastructure. How should we cope with it? Shall we use an HA instance? If not, how could we set up an infrastructure with two routers?**
2. Despite the periodic backups performed by `rsnapshot`, restoring data and services after a failure would take a non-negligible amount of time. Some applications are already deployed in High Availability mode:
* Kong, thanks to the Cassandra cluster
* Elasticsearch, which stores both the (meta)data related to datasets and the editorial content (edited from within the Ghost CMS application)
Some others, hosted by the `back-office` instance, are not yet distributed/replicated, but could be in the near future:
* by deploying the stateless services (mail, AUTHN, CSV catalog download, single page app, ...) on `es-{1,2,3}`;
* by deploying PostgreSQL (needed by the "organizations" and "resources wizard" services) in master-slave mode, the slaves being hosted by `es-{1,2,3}` and the master by `back-office` (N.B.: writes to the database come from the "Admin GUI" service);
* by deploying Redis (needed by the "legacy AUTH middleware" service) in HA mode, cf. https://redis.io/topics/sentinel.
N.B.: It's not such a big deal to leave the administration tools (Konga, Portainer, pgAdmin, Prometheus + Elasticsearch + Grafana) unreplicated.