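The road to the MAGI system, once again: moving forward with Ansible
This time the settings are applied in one go from configuration files that were prepared in advance. After the earlier reinstall, SLURM and the rest are installed again as well.
Configuration management is done with Ansible.
Let's keep going.
Step 1: understanding the layout
First, let's go over the configuration. melchior (192.168.1.101), balthasar (192.168.1.102), and casper (192.168.1.103) form a simple parallel computer that, as the names suggest, consists of three machines.
In addition, there is a management node, CTL (192.168.1.100), and a VPN node, VPN (192.168.1.98).
The router is already set up to assign each device its IP address automatically based on its MAC address.
This time the configuration is managed from the VPN machine. First, we edit /etc/hosts.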
[evakichi@localhost ~]$ cat < /etc/hosts
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.1.98 vpn vpn.sc-magi.com
192.168.1.100 ctl ctl.sc-magi.com
192.168.1.110 ctl-wl ctl-wl.sc-magi.com
192.168.1.101 melchior melchior.sc-magi.com system01 system01.sc-magi.com
192.168.1.111 melchior-wl melchior-wl.sc-magi.com system01-wl system01-wl.sc-magi.com
192.168.1.102 balthasar balthasar.sc-magi.com system02 system02.sc-magi.com
192.168.1.112 balthasar-wl balthasar-wl.sc-magi.com system02-wl system02-wl.sc-magi.com
192.168.1.103 casper casper.sc-magi.com system03 system03.sc-magi.com
192.168.1.113 casper-wl casper-wl.sc-magi.com system03-wl system03-wl.sc-magi.com
The *-wl names are meant for the wireless LAN, but I plan to set that up at the very end this time.
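To double-check that the file maps each name to the address I expect, a small lookup helper can read it back. This is only a sketch: `lookup_host` and `check_nodes` are names I made up for illustration, and the node list simply mirrors the hosts file above.

```shell
#!/usr/bin/env bash
# Print the IP address that a hosts-format file assigns to a name.
# Usage: lookup_host NAME [HOSTS_FILE]   (HOSTS_FILE defaults to /etc/hosts)
lookup_host() {
    local name="$1" file="${2:-/etc/hosts}"
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }          # skip comment lines
        {
            for (i = 2; i <= NF; i++)       # field 1 is the IP, the rest are names
                if ($i == name) { print $1; exit }
        }
    ' "$file"
}

# Report every expected node name that has no entry in the file.
# The node list is an assumption taken from the hosts file shown above.
check_nodes() {
    local file="${1:-/etc/hosts}" node
    for node in vpn ctl melchior balthasar casper; do
        if [ -z "$(lookup_host "$node" "$file")" ]; then
            echo "missing: $node"
        fi
    done
}
```

Calling `lookup_host casper` prints the address recorded for casper, and `check_nodes` prints nothing when all five names are present.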
First, create an SSH key pair.
Right, the first job is to create an SSH key pair.
[evakichi@localhost ~]$ ssh-keygen -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (/home/evakichi/.ssh/id_rsa): /home/evakichi/.ssh/id_rsa_fedora_magi
Created directory '/home/evakichi/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/evakichi/.ssh/id_rsa_fedora_magi.
Your public key has been saved in /home/evakichi/.ssh/id_rsa_fedora_magi.pub.
The key fingerprint is:
SHA256:LqBYmpK1mZ+uP6QBSyQxA33Ogms1U511R2u6AKOuW04 evakichi@localhost.localdomain
The key's randomart image is:
+---[RSA 4096]----+
|*o . o. ..o |
|.+. .. o . . . |
|o. +. o o |
|o..+o . o o |
|.++.+. S. . |
|oO.=o. . . . |
|B =+ E. . . . |
|. ..*. . |
| .**o |
+----[SHA256]-----+
The key pair is done.
Next up would be changing the host names and distributing hosts, but before that, let's send the key to each machine.
[evakichi@localhost ~]$ ssh-copy-id -i .ssh/id_rsa_fedora_magi casper
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: ".ssh/id_rsa_fedora_magi.pub"
The authenticity of host 'casper (192.168.1.103)' can't be established.
ECDSA key fingerprint is SHA256:13yeYjc0c69tDbRlbCGNpmqvu/7h+jIE8IAhntUKhEg.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
evakichi@casper's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'casper'"
and check to make sure that only the key(s) you wanted were added.
To keep this short, I'll skip the rest and simply do the same for all four machines.
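Doing all four by hand gets tedious, so a loop can do the same thing. A minimal sketch, assuming the same key path as above; `copy_key_to_nodes`, `NODES` and `DRY_RUN` are made-up names, and by default it only prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Nodes taken from /etc/hosts above; KEY is the key pair created earlier.
NODES="ctl melchior balthasar casper"
KEY="$HOME/.ssh/id_rsa_fedora_magi"

# Print (DRY_RUN=1, the default) or run ssh-copy-id for every node.
# Usage: copy_key_to_nodes REMOTE_USER
copy_key_to_nodes() {
    local user="$1" node
    for node in $NODES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh-copy-id -i $KEY ${user}@${node}"
        else
            ssh-copy-id -i "$KEY" "${user}@${node}"
        fi
    done
}
```

Running `DRY_RUN=0 copy_key_to_nodes evakichi` would actually call ssh-copy-id and ask for the password on each node in turn.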
Installing Ansible
Now let's install Ansible on the control machine (the VPN node in this setup). Easy, right?
$ sudo dnf install ansible
Next, create an ansible user on each client (one master and three slaves, whose settings differ slightly). A simple password will do.
[evakichi@localhost ~]$ sudo useradd ansible
[sudo] password for evakichi:
[evakichi@localhost ~]$ sudo passwd ansible
Changing password for user ansible.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
[evakichi@localhost ~]$ sudo usermod -aG wheel ansible
I run this on every node.
Oh, I forgot: the ansible user has to be added on this machine as well.
[root@vpn ~]# passwd ansible
Changing password for user ansible.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
[root@vpn ~]# usermod -aG wheel ansible
It's probably a good idea to check that the user was really added to the "wheel" group.
[root@vpn ~]# id ansible
uid=1001(ansible) gid=1001(ansible) groups=1001(ansible),10(wheel)
I forgot one more thing: the SSH public key has to be copied to the ansible user as well...
[evakichi@vpn ~]$ ssh-copy-id -i ./.ssh/id_rsa_fedora_magi ansible@casper
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "./.ssh/id_rsa_fedora_magi.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
ansible@casper's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'ansible@casper'"
and check to make sure that only the key(s) you wanted were added.
Copying hosts with Ansible
I'll only write a simple configuration. I just want to see whether it works today, so copying /etc/hosts is a good fit.
So, create the inventory.
[magi-system]
casper ansible_host=192.168.1.103
balthasar ansible_host=192.168.1.102
melchior ansible_host=192.168.1.101
ctl ansible_host=192.168.1.100
[linux-servers:children]
magi-system
Next, write the playbook.
---
- name: main.yml
  hosts: linux-servers
  remote_user: ansible
  roles:
    - copy-hosts
---
- name: deploy hosts
  become: yes
  copy:
    src: hosts
    dest: /etc/hosts
    owner: root
    group: root
    mode: "0644"
[ansible@vpn ansible-playbooks]$ ansible-playbook -i inventory.ini init.yml --ask-become-pass
BECOME password:
[DEPRECATION WARNING]: The TRANSFORM_INVALID_GROUP_CHARS settings is set to allow bad characters in group names by default, this will change, but still be user configurable on deprecation. This feature will be removed in version 2.10.
Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details
PLAY [main.yml] *****************************************************************************************************************************************************************************************************************************
TASK [Gathering Facts] **********************************************************************************************************************************************************************************************************************
ok: [melchior]
ok: [ctl]
ok: [casper]
ok: [balthasar]
TASK [copy-hosts : deploy hosts] ************************************************************************************************************************************************************************************************************
changed: [ctl]
changed: [melchior]
changed: [balthasar]
changed: [casper]
PLAY RECAP **********************************************************************************************************************************************************************************************************************************
balthasar : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
casper : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
ctl : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
melchior : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
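That seems to have gone well.
Still, the warnings above worry me a little.
So I did what the message says and set deprecation_warnings=False in ansible.cfg.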
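The message names the setting but not the file layout, so for reference, this is roughly what it comes down to. A sketch only: `write_ansible_cfg` is a made-up helper, it assumes ansible.cfg lives in the directory the playbooks are run from, and it refuses to overwrite an existing file.

```shell
#!/usr/bin/env bash
# Create a minimal ansible.cfg that turns off deprecation warnings.
# Usage: write_ansible_cfg [DIR]   (DIR defaults to the current directory)
write_ansible_cfg() {
    local cfg="${1:-.}/ansible.cfg"
    if [ -e "$cfg" ]; then
        echo "refusing to overwrite $cfg" >&2
        return 1
    fi
    # The setting belongs in the [defaults] section.
    printf '%s\n' '[defaults]' 'deprecation_warnings = False' > "$cfg"
}
```

If an ansible.cfg already exists, adding the same line under its [defaults] section by hand has the same effect.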
[ansible@vpn ansible-playbooks]$ ansible-playbook -i inventory.ini init.yml --ask-become-pass
BECOME password:
[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details
PLAY [main.yml] *****************************************************************************************************************************************************************************************************************************
TASK [Gathering Facts] **********************************************************************************************************************************************************************************************************************
ok: [ctl]
ok: [balthasar]
ok: [melchior]
ok: [casper]
TASK [copy-hosts : deploy hosts] ************************************************************************************************************************************************************************************************************
ok: [melchior]
ok: [balthasar]
ok: [casper]
ok: [ctl]
PLAY RECAP **********************************************************************************************************************************************************************************************************************************
balthasar : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
casper : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
ctl : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
melchior : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
One warning is gone. I'll deal with the other one later.
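The warning that is left comes from hyphens in group names: Ansible expects group names made of letters, digits and underscores, and not starting with a digit. A quick check like the one below can list the offenders. It is a sketch under that assumption; `invalid_groups` is a made-up helper and it only understands INI-style inventories.

```shell
#!/usr/bin/env bash
# List inventory group names that contain characters Ansible warns about.
# Usage: invalid_groups INVENTORY_FILE
invalid_groups() {
    local file="$1"
    # Group headers look like [name] or [name:children]; keep only the name,
    # then drop every name that is already a valid identifier.
    sed -n 's/^\[\([^]:]*\).*]$/\1/p' "$file" |
        grep -Ev '^[A-Za-z_][A-Za-z0-9_]*$'
}
```

Running `invalid_groups inventory.ini` would print hyphenated names such as magi-system. Renaming those groups with underscores, and updating the hosts: line in the playbooks to match, should make the warning go away.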
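Building the parallel environment with Ansible
Anyway, everything is running smoothly. Next, we bring in the parallel computing environment.
First comes creating the MUNGE key.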
dd if=/dev/random bs=1 count=1024 >./install-MPI_OPENMP/files/munge.key
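For reference, here is the same key generation wrapped with a size check. A sketch with two assumptions of mine: it reads /dev/urandom instead of /dev/random so that it cannot block while waiting for entropy, and `make_munge_key` is a made-up helper name. The 1024-byte size matches the dd command above.

```shell
#!/usr/bin/env bash
# Write a 1024-byte random key to the given path, readable by the owner only.
# Usage: make_munge_key OUTPUT_FILE
make_munge_key() {
    local out="$1" size
    # umask 077 in a subshell so the file is never created world-readable.
    ( umask 077 && dd if=/dev/urandom of="$out" bs=1 count=1024 2>/dev/null ) || return 1
    chmod 600 "$out"
    # Arithmetic expansion strips any padding that wc adds around the number.
    size=$(( $(wc -c < "$out") ))
    if [ "$size" -ne 1024 ]; then
        echo "unexpected key size: $size bytes" >&2
        return 1
    fi
}
```

With the layout used here, the call would be `make_munge_key ./install-MPI_OPENMP/files/munge.key`.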
Now that the key is made too, on to the actual install-fest.
First, we write the playbook.
[ansible@vpn ansible-playbooks]$ cat install-MPI_OPENMP/tasks/main.yml
---
- name: install mpich
  become: yes
  dnf:
    name: mpich
    state: latest
    update_cache: yes
- name: install mpich-devel
  become: yes
  dnf:
    name: mpich-devel
    state: latest
    update_cache: yes
- name: install mpich-doc
  become: yes
  dnf:
    name: mpich-doc
    state: latest
    update_cache: yes
- name: install openmpi
  become: yes
  dnf:
    name: openmpi
    state: latest
    update_cache: yes
- name: install openmpi
  become: yes
  dnf:
    name: openmpi-devel
    state: latest
    update_cache: yes
- name: install libomp
  become: yes
  dnf:
    name: libomp
    state: latest
    update_cache: yes
- name: install libomp-test
  become: yes
  dnf:
    name: libomp-test
    state: latest
    update_cache: yes
- name: install libomp-devel
  become: yes
  dnf:
    name: libomp-devel
    state: latest
    update_cache: yes
- name: install slurm
  become: yes
  dnf:
    name: slurm*
    state: latest
    update_cache: yes
- name: deploy munge.key
  become: yes
  copy:
    src: munge.key
    dest: /etc/munge/munge.key
    owner: munge
    group: munge
    mode: "0600"
[ansible@vpn ansible-playbooks]$ ansible-playbook -i inventory.ini init.yml --ask-become-pass
BECOME password:
[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details
PLAY [main.yml] *****************************************************************************************************************************************************************************************************************************
TASK [Gathering Facts] **********************************************************************************************************************************************************************************************************************
ok: [ctl]
ok: [balthasar]
ok: [casper]
ok: [melchior]
TASK [copy-hosts : deploy hosts] ************************************************************************************************************************************************************************************************************
ok: [ctl]
ok: [melchior]
ok: [balthasar]
ok: [casper]
TASK [install-MPI_OPENMP : install mpich] ***************************************************************************************************************************************************************************************************
changed: [ctl]
changed: [balthasar]
changed: [casper]
changed: [melchior]
TASK [install-MPI_OPENMP : install mpich-devel] *********************************************************************************************************************************************************************************************
changed: [ctl]
changed: [casper]
changed: [melchior]
changed: [balthasar]
TASK [install-MPI_OPENMP : install mpich-doc] ***********************************************************************************************************************************************************************************************
changed: [ctl]
changed: [melchior]
changed: [casper]
changed: [balthasar]
TASK [install-MPI_OPENMP : install openmpi] *************************************************************************************************************************************************************************************************
changed: [ctl]
changed: [melchior]
changed: [casper]
changed: [balthasar]
TASK [install-MPI_OPENMP : install openmpi] *************************************************************************************************************************************************************************************************
changed: [ctl]
changed: [melchior]
changed: [casper]
changed: [balthasar]
TASK [install-MPI_OPENMP : install libomp] **************************************************************************************************************************************************************************************************
changed: [ctl]
changed: [balthasar]
changed: [casper]
changed: [melchior]
TASK [install-MPI_OPENMP : install libomp-test] *********************************************************************************************************************************************************************************************
changed: [ctl]
changed: [balthasar]
changed: [melchior]
changed: [casper]
TASK [install-MPI_OPENMP : install libomp-devel] ********************************************************************************************************************************************************************************************
ok: [ctl]
ok: [melchior]
ok: [casper]
ok: [balthasar]
TASK [install-MPI_OPENMP : install slurm] ***************************************************************************************************************************************************************************************************
changed: [ctl]
changed: [melchior]
changed: [balthasar]
changed: [casper]
TASK [install-MPI_OPENMP : deploy munge.key] ************************************************************************************************************************************************************************************************
changed: [ctl]
changed: [balthasar]
changed: [melchior]
changed: [casper]
PLAY RECAP **********************************************************************************************************************************************************************************************************************************
balthasar : ok=12 changed=9 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
casper : ok=12 changed=9 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
ctl : ok=12 changed=9 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
melchior : ok=12 changed=9 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
After that, all that is left is to create and run a playbook for regular updates.
---
- name: update packages
  become: yes
  dnf:
    name: "*"
    state: latest
    update_cache: yes
---
- name: main.yml
  hosts: linux-servers
  remote_user: ansible
  roles:
    - update
[ansible@vpn ansible-playbooks]$ ansible-playbook -i inventory.ini update.yml --ask-become-pass
BECOME password:
[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details
PLAY [main.yml] *****************************************************************************************************************************************************************************************************************************
TASK [Gathering Facts] **********************************************************************************************************************************************************************************************************************
ok: [balthasar]
ok: [melchior]
ok: [ctl]
ok: [casper]
PLAY RECAP **********************************************************************************************************************************************************************************************************************************
balthasar : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
casper : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
ctl : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
melchior : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Done.