/////
# The home-server project produces a multi-purpose setup using Ansible.
# Copyright © 2018 Y. Gablin, under the GPL-3.0-or-later license.
# Full licensing information in the LICENSE file, or gnu.org/licences/gpl-3.0.txt if the file is missing.
/////
:keymap: fr-bepo
:front-name: dmz
:front-ip: 192.168.1.254
:back-name: home
:back-ip: 192.168.1.253
:pc-ip: 192.168.1.252
:net-bits: 24
:net-gateway: 192.168.1.1
:your-uid: me
:sys-disk: /dev/mmcblk0
:sys-esp: /dev/mmcblk0p1
:sys-pv: /dev/mmcblk0p2
:sys-vg: Sys
:data-vg: Data
:appdata-lv: AppData
:userdata-lv: UserData
:bt-storage-name: p2p
:bt-storage-todo: iso.torrent
:bt-storage-doing: .iso.wip
:bt-storage-done: iso
:prosody-db: prosody
:prosody-db-user: prosody
:nextcloud-db: nextcloud
:nextcloud-db-user: nextcloud
:nextcloud-root: /usr/share/webapps/nextcloud
:nextcloud-user: cloud
= Bootstrap of the home-server
:toc:
TIP: Modify this document's header attributes so that the instructions below reflect your own setup.
https://addons.mozilla.org/fr/firefox/addon/asciidoctorjs-live-preview/[View the result in Firefox].
== Purpose
The server is entirely configured by https://docs.ansible.com/[Ansible].
In theory, then, everything this document describes should also be done with Ansible.
However, Ansible can only reach and control the server if the server has some basic software installed (namely, SSH and Python), and has its network interface correctly configured.
This is a chicken-and-egg problem, which is solved by manually bootstrapping the server.
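Once the bootstrap is done, reachability can be verified from the controlling PC with an ad-hoc call such as the following (purely illustrative; the real inventory lives in this repository):
[subs="+attributes"]
```bash
{your-uid}@pc ~ $ ansible all -i '{back-ip},' -u root --private-key ~/.ssh/id_ansible -m ping
```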
== Archlinux standard installation
Once the Archlinux installation media (USB in my case) is inserted and booted (in EFI mode), the https://wiki.archlinux.org/index.php/Installation_guide[official documentation] basically comes down to this (to be adapted for your actual preferences):
Basic configuration and partitioning::
* `{sys-disk}` is the small integrated storage area, where the system gets installed.
* The “{data-vg}” LVM-VG is a (set of) storage device(s) (SATA, eSATA, or USB3) with lots of extra space (for example on `/dev/sdb`).
* Each application that manages state data gets its own mount points inside a BTRFS “{appdata-lv}” volume.
* User data is stored in a BTRFS “{userdata-lv}” volume.
+
[subs="+attributes"]
```bash
root@archiso ~ # export LVM=/dev/mapper
root@archiso ~ # export DMZ=/mnt/var/lib/machines/{front-name}
root@archiso ~ # export APPDATA=/mnt/mnt/AppData
root@archiso ~ # export USERDATA=/mnt/mnt/UserData
root@archiso ~ # loadkeys {keymap}
root@archiso ~ # ping -c 1 archlinux.org
1 packets transmitted, 1 received, 0% packet loss, time 0ms
root@archiso ~ # timedatectl set-ntp true
root@archiso ~ # fdisk {sys-disk}
Command (m for help): g
Created a new GPT disklabel…
Command (m for help): n
Partition number (1-128, default 1):
First sector (…):
Last sector, +sectors or +size{K,M,G,T,P} (…): +128M
Created a new partition 1…
Command (m for help): t
Selected partition 1
Hex code (type L to list all codes): 1
Changed type of partition 'Linux filesystem' to 'EFI System'.
Command (m for help): n
Partition number (2-128, default 2):
First sector (…):
Last sector, +sectors or +size{K,M,G,T,P} (…):
Created a new partition 2…
Command (m for help): t
Partition number (1,2, default 2):
Hex code (type L to list all codes): 31
Changed type of partition 'Linux filesystem' to 'Linux LVM'.
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
root@archiso ~ # mkfs.vfat -n ESP {sys-esp}
root@archiso ~ # pvcreate {sys-pv}
root@archiso ~ # vgcreate {sys-vg} {sys-pv}
root@archiso ~ # lvcreate -L 5G -n Root {sys-vg}
root@archiso ~ # lvcreate -L 2G -n Cont {sys-vg}
root@archiso ~ # mkfs.ext4 $LVM/{sys-vg}-Root
root@archiso ~ # mkfs.btrfs --mixed --label Cont $LVM/{sys-vg}-Cont
root@archiso ~ # lvcreate -L 10G -n RootVar {data-vg}
root@archiso ~ # mkfs.ext4 $LVM/{data-vg}-RootVar
root@archiso ~ # lvcreate -L 1G -n ContVar {data-vg}
root@archiso ~ # mkfs.ext4 $LVM/{data-vg}-ContVar
root@archiso ~ # lvcreate -L 100G -n {appdata-lv} {data-vg}
root@archiso ~ # mkfs.btrfs --mixed --label {appdata-lv} $LVM/{data-vg}-{appdata-lv}
root@archiso ~ # lvcreate -L 700G -n {userdata-lv} {data-vg}
root@archiso ~ # mkfs.btrfs --mixed --label {userdata-lv} $LVM/{data-vg}-{userdata-lv}
root@archiso ~ # lvcreate -L 1G -n Home {data-vg}
root@archiso ~ # mkfs.ext4 $LVM/{data-vg}-Home
```
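+
Before mounting anything, the resulting layout can be double-checked; a quick, purely optional sanity check:
+
[subs="+attributes"]
```bash
root@archiso ~ # lsblk -f {sys-disk}
root@archiso ~ # vgs {sys-vg} {data-vg}
root@archiso ~ # lvs {sys-vg} {data-vg}
```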
Host and guest mounting::
* The hardware host holds the sensitive data, and is not reachable from the Internet.
* The guest container is the DMZ and holds the services directly reachable from the Internet.
+
[subs="+attributes"]
```bash
root@archiso ~ # mount $LVM/{sys-vg}-Root /mnt
root@archiso ~ # mkdir -p /mnt/{boot,home,var} $APPDATA $USERDATA
root@archiso ~ # mount LABEL=ESP /mnt/boot
root@archiso ~ # mount $LVM/{data-vg}-Home /mnt/home
root@archiso ~ # mount $LVM/{data-vg}-RootVar /mnt/var
root@archiso ~ # mount $LVM/{data-vg}-{appdata-lv} $APPDATA
root@archiso ~ # mkdir -p /mnt/var/cache/{minidlna,pacman/pkg}
root@archiso ~ # mkdir -p \
> /mnt/var/lib/{acme,dovecot,gitea,kodi,machines,nextcloud,openldap,postgres}
root@archiso ~ # mkdir -p /mnt/var/spool/mail
root@archiso ~ # btrfs subvolume create $APPDATA/acme.lib
root@archiso ~ # btrfs subvolume create $APPDATA/acme.srv
root@archiso ~ # btrfs subvolume create $APPDATA/ddclient.cache
root@archiso ~ # btrfs subvolume create $APPDATA/dovecot.lib
root@archiso ~ # btrfs subvolume create $APPDATA/gitea.lib
root@archiso ~ # btrfs subvolume create $APPDATA/kodi.lib
root@archiso ~ # btrfs subvolume create $APPDATA/mail.spool
root@archiso ~ # btrfs subvolume create $APPDATA/minidlna.cache
root@archiso ~ # btrfs subvolume create $APPDATA/movim.cache
root@archiso ~ # btrfs subvolume create $APPDATA/movim.lib
root@archiso ~ # btrfs subvolume create $APPDATA/nextcloud.lib
root@archiso ~ # btrfs subvolume create $APPDATA/nginx.log
root@archiso ~ # btrfs subvolume create $APPDATA/openldap.lib
root@archiso ~ # btrfs subvolume create $APPDATA/pacman_pkg.cache
root@archiso ~ # btrfs subvolume create $APPDATA/postgres.lib
root@archiso ~ # btrfs subvolume create $APPDATA/prosody.lib
root@archiso ~ # btrfs subvolume create $APPDATA/transmission.lib
root@archiso ~ # btrfs subvolume create $APPDATA/webapps.srv
root@archiso ~ # mount \
> -o subvol=acme.lib,compress=lzo \
> $LVM/{data-vg}-{appdata-lv} /mnt/var/lib/acme
root@archiso ~ # mount \
> -o subvol=dovecot.lib,compress=lzo \
> $LVM/{data-vg}-{appdata-lv} /mnt/var/lib/dovecot
root@archiso ~ # mount \
> -o subvol=gitea.lib,nodatacow \
> $LVM/{data-vg}-{appdata-lv} /mnt/var/lib/gitea
root@archiso ~ # mount \
> -o subvol=kodi.lib,compress=lzo \
> $LVM/{data-vg}-{appdata-lv} /mnt/var/lib/kodi
root@archiso ~ # mount \
> -o subvol=mail.spool,compress=lzo,nodatacow \
> $LVM/{data-vg}-{appdata-lv} /mnt/var/spool/mail
root@archiso ~ # mount \
> -o subvol=minidlna.cache,nodatacow \
> $LVM/{data-vg}-{appdata-lv} /mnt/var/cache/minidlna
root@archiso ~ # mount \
> -o subvol=nextcloud.lib,compress=lzo \
> $LVM/{data-vg}-{appdata-lv} /mnt/var/lib/nextcloud
root@archiso ~ # mount \
> -o subvol=openldap.lib,nodatacow \
> $LVM/{data-vg}-{appdata-lv} /mnt/var/lib/openldap
root@archiso ~ # mount \
> -o subvol=pacman_pkg.cache,nodatacow \
> $LVM/{data-vg}-{appdata-lv} /mnt/var/cache/pacman/pkg
root@archiso ~ # mount \
> -o subvol=postgres.lib,nodatacow \
> $LVM/{data-vg}-{appdata-lv} /mnt/var/lib/postgres
root@archiso ~ # mount $LVM/{sys-vg}-Cont /mnt/var/lib/machines
root@archiso ~ # btrfs subvolume create $DMZ
root@archiso ~ # mkdir -p $DMZ/var
root@archiso ~ # mount $LVM/{data-vg}-ContVar $DMZ/var
root@archiso ~ # mkdir -p $DMZ/srv/{acme,webapps}
root@archiso ~ # mkdir -p $DMZ/var/cache/{ddclient,movim}
root@archiso ~ # mkdir -p $DMZ/var/lib/{movim,prosody,transmission}
root@archiso ~ # mkdir -p $DMZ/var/log/nginx
root@archiso ~ # mount \
> -o subvol=acme.srv,nodatacow \
> $LVM/{data-vg}-{appdata-lv} $DMZ/srv/acme
root@archiso ~ # mount \
> -o subvol=ddclient.cache,compress=lzo \
> $LVM/{data-vg}-{appdata-lv} $DMZ/var/cache/ddclient
root@archiso ~ # mount \
> -o subvol=movim.cache \
> $LVM/{data-vg}-{appdata-lv} $DMZ/var/cache/movim
root@archiso ~ # mount \
> -o subvol=movim.lib,compress=lzo \
> $LVM/{data-vg}-{appdata-lv} $DMZ/var/lib/movim
root@archiso ~ # mount \
> -o subvol=nginx.log,compress=lzo,nodatacow \
> $LVM/{data-vg}-{appdata-lv} $DMZ/var/log/nginx
root@archiso ~ # mount \
> -o subvol=prosody.lib,nodatacow \
> $LVM/{data-vg}-{appdata-lv} $DMZ/var/lib/prosody
root@archiso ~ # mount \
> -o subvol=transmission.lib,nodatacow \
> $LVM/{data-vg}-{appdata-lv} $DMZ/var/lib/transmission
root@archiso ~ # mount \
> -o subvol=webapps.srv,compress=lzo \
> $LVM/{data-vg}-{appdata-lv} $DMZ/srv/webapps
root@archiso ~ # mkdir $DMZ/var/lib/transmission/{Todo,Doing,Done}
root@archiso ~ # mount -o nodatacow $LVM/{data-vg}-{userdata-lv} $USERDATA
root@archiso ~ # mkdir -p $USERDATA/{bt-storage-name}
root@archiso ~ # for d in {bt-storage-todo} {bt-storage-doing} {bt-storage-done}; do
> btrfs subvolume create $USERDATA/{bt-storage-name}/$d
> done
root@archiso ~ # mount \
> -o subvol={bt-storage-name}/{bt-storage-todo},nodatacow \
> $LVM/{data-vg}-{userdata-lv} $DMZ/var/lib/transmission/Todo
root@archiso ~ # mount \
> -o subvol={bt-storage-name}/{bt-storage-doing},nodatacow \
> $LVM/{data-vg}-{userdata-lv} $DMZ/var/lib/transmission/Doing
root@archiso ~ # mount \
> -o subvol={bt-storage-name}/{bt-storage-done},nodatacow \
> $LVM/{data-vg}-{userdata-lv} $DMZ/var/lib/transmission/Done
```
Archlinux installation::
* When this is done, be sure to check that `/mnt/etc/fstab` perfectly matches the wanted result (the above mount points).
+
```bash
root@archiso ~ # pacstrap /mnt base arch-install-scripts intel-ucode linux \
> openssh python etckeeper git lvm2 btrfs-progs rsync
root@archiso ~ # genfstab -L /mnt >>/mnt/etc/fstab
```
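+
As suggested above, one way to review the result before rebooting (purely illustrative):
+
```bash
root@archiso ~ # findmnt -R /mnt -o TARGET,SOURCE,OPTIONS
root@archiso ~ # less /mnt/etc/fstab
```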
Archlinux initial configuration::
* The basic files for the host must roughly match the final configuration, enough to let Ansible control the right host on the right IP without error.
* The values used here *must* match those in link:group_vars/all[].
+
[subs="+attributes"]
```bash
root@archiso ~ # arch-chroot /mnt
[root@archiso /]# echo {back-name} >/etc/hostname
[root@archiso /]# cat >/etc/systemd/network/bridge.netdev <<-"THEEND"
> [NetDev]
> Name=wire
> Kind=bridge
> THEEND
[root@archiso /]# cat >/etc/systemd/network/bridge.network <<-"THEEND"
> [Match]
> Name=wire
>
> [Network]
> IPForward=yes
> Address={back-ip}/{net-bits}
> Gateway={net-gateway}
> THEEND
[root@archiso /]# cat >/etc/systemd/network/wired.network <<-"THEEND"
> [Match]
> Name=en*
>
> [Network]
> Bridge=wire
> THEEND
[root@archiso /]# systemctl enable systemd-networkd.service
[root@archiso /]# sed -i '/prohibit-password/s/.*/PermitRootLogin yes/' \
> /etc/ssh/sshd_config
[root@archiso /]# mkdir ~root/.ssh
[root@archiso /]# chmod 700 ~root/.ssh
[root@archiso /]# scp {your-uid}@{pc-ip}:.ssh/id_ansible.pub \
> ~root/.ssh/authorized_keys
[root@archiso /]# chmod 600 ~root/.ssh/authorized_keys
[root@archiso /]# systemctl enable sshd.service
[root@archiso /]# sed -i '/^HOOKS=/s/block filesystems/block lvm2 filesystems/' \
> /etc/mkinitcpio.conf
[root@archiso /]# mkinitcpio -p linux
[root@archiso /]# passwd
passwd: password updated successfully
[root@archiso /]# bootctl --path=/boot install
[root@archiso /]# cat >/boot/loader/entries/arch.conf <<-THEEND
> title Arch Linux
> linux /vmlinuz-linux
> initrd /intel-ucode.img
> initrd /initramfs-linux.img
> options root=$LVM/{sys-vg}-Root rw
> THEEND
[root@archiso /]# cat >/boot/loader/loader.conf <<-"THEEND"
> default arch
> editor 0
> THEEND
[root@archiso /]# printf '%s, %s\n' \
> 'ACTION=="add", SUBSYSTEM=="usb"' \
> 'TEST=="power/control", ATTR{power/control}="off"' \
> >/etc/udev/rules.d/50-usb_power_save.rules
[root@archiso /]# exit
root@archiso ~ # systemctl reboot
```
The last command writes a udev rule that disables USB power saving; it is only relevant if the main data drive is connected over USB.
[IMPORTANT]
===========
In theory, at this stage, the machine is ready to be controlled by Ansible.
However, Ansible fails at first, because for some reason, `pacstrap` in the “front” Ansible role fails to initialize the DMZ if the location already contains mount points, so:
. I had to temporarily unmount everything under `/var/lib/machines/{front-name}`, and delete `/var/lib/machines/{front-name}/*`.
. I also temporarily commented out the whole front half of `site.yml`, as well as the “front-run” role of the back part.
. Then I ran Ansible again.
. When the DMZ was correctly initialized, I renamed `/var/lib/machines/{front-name}/var` to `/var/lib/machines/{front-name}/var.new`.
. Then I created a new `/var/lib/machines/{front-name}/var`, inside of which I mounted all the above DMZ-specific mount points again.
. In the `/var/lib/machines/{front-name}/` directory, I ran `rsync -av var.new/ var/`.
. After that, I could remove the `var.new` directory (see below), restore `site.yml` to its original state, and run Ansible once again.
When I wanted to delete the DMZ's `var.new` directory as root, permission was denied!
This is because `pacstrap` created the DMZ's own `var/lib/machines` as a btrfs subvolume, which can only be deleted with the `btrfs subvolume delete var.new/lib/machines` command (`var.new` because of the renaming above).
Then removing `var.new` worked.
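Roughly, and leaving the Ansible runs from the PC aside, the workaround looks like this sketch (adapt the paths, and double-check every step before running it):
[subs="+attributes"]
```bash
[root@{back-name} ~]# cd /var/lib/machines/{front-name}
[root@{back-name} {front-name}]# umount -R var
[root@{back-name} {front-name}]# umount srv/acme srv/webapps
[root@{back-name} {front-name}]# rm -rf -- *
# …run Ansible with the front half of site.yml commented out (steps 2 and 3), then:
[root@{back-name} {front-name}]# mv var var.new
[root@{back-name} {front-name}]# mkdir var
# …remount the DMZ-specific subvolumes under var/ (same mount commands as above), then:
[root@{back-name} {front-name}]# rsync -av var.new/ var/
[root@{back-name} {front-name}]# btrfs subvolume delete var.new/lib/machines
[root@{back-name} {front-name}]# rm -r var.new
# …finally restore site.yml and run Ansible once more.
```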
===========
== Post-installation tasks
You may want to restore some data from a former installation.
This section contains some examples of data restoration.
NOTE: Most values and paths here are examples, and shall be adapted.
=== Dotclear
[subs="+attributes"]
```bash
[root@{back-name} ~]# systemctl -M {front-name} stop haproxy.service
[root@{back-name} ~]# systemctl -M {front-name} stop nginx.service
[root@{back-name} ~]# systemctl -M {front-name} stop php-fpm.service
[root@{back-name} ~]# sudo -u postgres pg_restore -c -C -F c -d postgres \
> </backup/dotclear.cdump
[root@{back-name} ~]# systemctl -M {front-name} start php-fpm.service
[root@{back-name} ~]# systemctl -M {front-name} start nginx.service
[root@{back-name} ~]# systemctl -M {front-name} start haproxy.service
```
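A quick way to confirm the database came back is to list the databases (illustrative; the database name depends on the old installation):
[subs="+attributes"]
```bash
[root@{back-name} ~]# sudo -u postgres psql -l
```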
=== Prosody
[subs="+attributes"]
```bash
[root@{back-name} ~]# systemctl -M {front-name} stop haproxy.service
[root@{back-name} ~]# systemctl -M {front-name} stop nginx.service
[root@{back-name} ~]# systemctl -M {front-name} stop prosody.service
[root@{back-name} ~]# sudo -u postgres pg_restore -c -C -F c -d postgres \
> </backup/prosody.cdump
[root@{back-name} ~]# su - postgres
[postgres@{back-name} ~]$ psql
postgres=# ALTER DATABASE {prosody-db} OWNER TO {prosody-db-user};
ALTER DATABASE
postgres=# \c {prosody-db}
{prosody-db}=# ALTER TABLE prosody OWNER TO {prosody-db-user};
ALTER TABLE
{prosody-db}=# \q
[postgres@{back-name} ~]$ exit
[root@{back-name} ~]# systemctl -M {front-name} start prosody.service
[root@{back-name} ~]# systemctl -M {front-name} start nginx.service
[root@{back-name} ~]# systemctl -M {front-name} start haproxy.service
```
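To confirm the ownership changes took effect, listing the tables with their owners is a quick check (purely illustrative):
[subs="+attributes"]
```bash
[root@{back-name} ~]# sudo -u postgres psql -d {prosody-db} -c '\dt'
```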
=== Nextcloud
There is a twist here…
My former installation actually was ownCloud, _not_ Nextcloud.
But knowing that I would use Nextcloud from then on, before doing the backup I upgraded my ownCloud installation to the corresponding compatible Nextcloud version (version `10.0.2.1`). +
The upgrade process broke my ownCloud… Not a big deal, since I only needed the backup of the data, to be restored in a clean Nextcloud installation on the new server.
But I don't remember if, on the new server, I restored the backup of the migrated database, or the backup of the ownCloud database…
Besides, my old ownCloud did _not_ use LDAP, instead relying on its internal database of users.
Unfortunately, there is no way to convert internal users (with their contacts, calendars, and so on) into LDAP users.
So I did it the programmer's way, by studying the data model and running SQL queries.
These are described below.
At the time of the data restoration, the current Nextcloud release (installed on the server) was version `12.…`.
Stop Nextcloud and restore the data::
+
[subs="+attributes"]
```bash
[root@{back-name} ~]# systemctl -M {front-name} stop haproxy.service
[root@{back-name} ~]# systemctl -M {front-name} stop nginx.service
[root@{back-name} ~]# systemctl stop nextcloud-maintenance.timer
[root@{back-name} ~]# systemctl stop uwsgi@nextcloud.socket
[root@{back-name} ~]# systemctl stop uwsgi@nextcloud.service
[root@{back-name} ~]# sudo -u postgres pg_restore -c -C -F c -d postgres \
> </backup/owncloud10.cdump
[root@{back-name} ~]# sed -i "s/'version' => '12.*'/'version' => '10.0.2.1'/" \
> /etc/webapps/nextcloud/config/config.php
[root@{back-name} ~]# cd {nextcloud-root}
[root@{back-name} nextcloud]# sudo -u {nextcloud-user} \
> /usr/bin/env NEXTCLOUD_CONFIG_DIR=/etc/webapps/nextcloud/config \
> /usr/bin/php occ upgrade
[root@{back-name} nextcloud]# cd /etc
[root@{back-name} etc]# git reset --hard
[root@{back-name} etc]# etckeeper init
```
Migrate users to LDAP (they keep the same name)::
* connect to the database:
+
[subs="+attributes"]
```bash
[root@{back-name} etc]# su - postgres
[postgres@{back-name} ~]$ psql
postgres=# ALTER DATABASE {nextcloud-db} OWNER TO {nextcloud-db-user};
ALTER DATABASE
postgres=# \c {nextcloud-db}
{nextcloud-db}=#
```
* browse a table (e.g. `oc_addressbooks`) to note the number associated with each user (e.g. “`{your-uid}`” is associated with number “`6266`”; a sample query is shown after the migration statements below);
* migrate user `{your-uid}` (repeat for each user): the idea is to delete most per-user data, on the assumption that it is synced elsewhere and can be restored by resynchronizing:
+
[subs="+attributes"]
```sql
{nextcloud-db}=# delete from oc_accounts where uid='{your-uid}';
DELETE 1
{nextcloud-db}=# delete from oc_addressbooks where principaluri='principals/users/{your-uid}_6266';
DELETE 1
{nextcloud-db}=# delete from oc_calendars where principaluri='principals/users/{your-uid}_6266';
DELETE 1
{nextcloud-db}=# delete from oc_credentials;
DELETE 0
{nextcloud-db}=# delete from oc_filecache where name='{your-uid}_6266';
DELETE 1
{nextcloud-db}=# delete from oc_jobs where argument='{"uid":"{your-uid}_6266"}';
DELETE 1
{nextcloud-db}=# delete from oc_mounts where user_id like '%{your-uid}_6266%';
DELETE 1
{nextcloud-db}=# delete from oc_preferences where userid='{your-uid}_6266';
DELETE 10
{nextcloud-db}=# delete from oc_storages where id='home::{your-uid}_6266';
DELETE 1
{nextcloud-db}=# delete from oc_users where uid='{your-uid}';
DELETE 1
{nextcloud-db}=# update oc_ldap_user_mapping set owncloud_name='{your-uid}' where owncloud_name='{your-uid}_6266';
UPDATE 1
{nextcloud-db}=# commit;
{nextcloud-db}=# \q
```
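+
For reference, the “`6266`” suffix used above came from browsing the tables beforehand; a hypothetical query (using only a column already referenced in the statements above) could look like this:
+
[subs="+attributes"]
```bash
[postgres@{back-name} ~]$ psql -d {nextcloud-db} -c 'select principaluri from oc_addressbooks;'
```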
Restart Nextcloud::
+
[subs="+attributes"]
```bash
[root@{back-name} ~]# systemctl start uwsgi@nextcloud.socket
[root@{back-name} ~]# systemctl start nextcloud-maintenance.timer
[root@{back-name} ~]# systemctl -M {front-name} start nginx.service
[root@{back-name} ~]# systemctl -M {front-name} start haproxy.service
```
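After the restart, `occ status` is one way to confirm the instance reports the expected version (an illustrative check, reusing the invocation pattern shown above):
[subs="+attributes"]
```bash
[root@{back-name} ~]# cd {nextcloud-root}
[root@{back-name} nextcloud]# sudo -u {nextcloud-user} \
> /usr/bin/env NEXTCLOUD_CONFIG_DIR=/etc/webapps/nextcloud/config \
> /usr/bin/php occ status
```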
=== Restore emails
I was formerly using BincIMAP, then Courier-IMAP, and I also ran Dovecot once, on a backup server, when my main server's power supply burnt out.
As a consequence, the Maildirs were polluted with dot-files from various origins.
I decided to do a clean import, especially since I configured Dovecot for better performance, which requires that it have exclusive access to the mail storage.
[subs="+attributes"]
```bash
[root@{back-name} ~]# find /backup/user-Maildirs -depth \
> \( -iname '*binc*' -o -iname '*courier*' -o -iname '*dovecot*' \) \
> -exec rm -rf {} \;
[root@{back-name} ~]# for u in $(ls /backup/user-Maildirs); do
> chown -R $u /backup/user-Maildirs/$u
> doveadm import -s -u $u maildir:/backup/user-Maildirs/$u/Maildir/ '' ALL
> done
```
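To spot-check the result for one user, Dovecot's own tools can be queried directly (purely illustrative):
[subs="+attributes"]
```bash
[root@{back-name} ~]# doveadm mailbox list -u {your-uid}
[root@{back-name} ~]# doveadm mailbox status -u {your-uid} messages INBOX
```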
[literal.small]
.....
# The home-server project produces a multi-purpose setup using Ansible.
# Copyright © 2018 Y. Gablin, under the GPL-3.0-or-later license.
# Full licensing information in the LICENSE file, or gnu.org/licences/gpl-3.0.txt if the file is missing.
.....