2017-08-03

Using Locust with Python 3.x

python

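At the moment, the released version of Locust does not support Python 3.x and has to be run on 2.x.
However, a pre-release with Python 3.x support is available and can be used.

A plain pip install of Locust pulls in locustio-0.7.5.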

$ pip install locustio
...(omitted)...
Successfully installed Werkzeug-0.12.2 certifi-2017.7.27.1 chardet-3.0.4 click-6.7 flask-0.12.2 gevent-1.1.1 greenlet-0.4.12 idna-2.5 itsdangerous-0.24 locustio-0.7.5 msgpack-python-0.4.8 requests-2.18.3 urllib3-1.22

Running Locust in this state fails with an error.

$ locust -f locustfile.py
...(omitted)...
ModuleNotFoundError: No module named 'core'

To use Locust on Python 3.x, install locustio-0.8a2 explicitly.

$ pip install locustio==0.8a2

$ locust -f locustfile.py
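The choice between the two installs can be sketched as a small helper. This is only an illustration: the function name `locust_requirement` is made up, and the two requirement strings simply mirror the versions mentioned above as of this post, so later releases may behave differently.

```python
import sys

# Requirement strings taken from this post (state as of 2017-08).
# The helper name below is hypothetical, for illustration only.
PY2_REQUIREMENT = "locustio"         # resolved to locustio-0.7.5 at the time
PY3_REQUIREMENT = "locustio==0.8a2"  # pre-release with Python 3.x support


def locust_requirement(version_info=None):
    """Return the pip requirement string suited to a Python version tuple."""
    if version_info is None:
        version_info = sys.version_info
    major = version_info[0]
    return PY3_REQUIREMENT if major >= 3 else PY2_REQUIREMENT


if __name__ == "__main__":
    # Print the install command for the interpreter running this script.
    print("pip install " + locust_requirement())
```

Running it under the interpreter you plan to use for Locust prints the matching `pip install` line.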

Reference:

github.com

2017-07-27

Building Zabbix 3.0 LTS (with a Proxy)

zabbix

Setup procedure for Zabbix 3.0 LTS.
Going by version number, 3.4 is the latest, but for practical use 3.0, being the LTS release, is more convenient.

https://www.zabbix.com/download

The official documentation is here.

Zabbix Manual [Zabbix Documentation 3.0]

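Layout

This time, Zabbix Server, Zabbix Proxy, Zabbix Web, MySQL, and a monitored server are each set up on their own machine, five in total.

The Proxy runs in passive mode and keeps its own DB on the same server.
Note: the Proxy DB and the Server DB must not be shared. Doing so would defeat the purpose of the Proxy.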

Preparing the virtual servers

Prepare a Vagrantfile and bring up five CentOS 7 machines.

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"

  config.vm.define "server" do |server|
    server.vm.network "private_network", ip: "192.168.33.31"
  end

  config.vm.define "proxy" do |server|
    server.vm.network "private_network", ip: "192.168.33.32"
  end

  config.vm.define "web" do |server|
    server.vm.network "private_network", ip: "192.168.33.33"
  end

  config.vm.define "db" do |server|
    server.vm.network "private_network", ip: "192.168.33.34"
  end

  config.vm.define "client" do |server|
    server.vm.network "private_network", ip: "192.168.33.35"
  end
end

$ cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)

Building Zabbix Server

First, add the yum repository. This creates /etc/yum.repos.d/zabbix.repo.

$ sudo rpm -ivh http://repo.zabbix.com/zabbix/3.0/rhel/7/x86_64/zabbix-release-3.0-1.el7.noarch.rpm

Install the Zabbix Server package.

$ sudo yum install zabbix-server-mysql -y

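Edit the configuration file (add the DB connection settings).
ProxyConfigFrequency is the interval at which configuration is pushed to a passive-mode Proxy. 3600 sec is too long, so set it to 60 sec (around 300 sec is probably reasonable in production).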

$ sudo cp -p /etc/zabbix/zabbix_server.conf{,.org}
$ sudo vi /etc/zabbix/zabbix_server.conf
$ sudo diff /etc/zabbix/zabbix_server.conf{.org,}
82a83,84
> DBHost=192.168.33.34
>
116a119,120
> DBPassword=zabbix
>
525a530,531
>
> ProxyConfigFrequency=60
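To double-check that the edit landed, the file can be read back as plain `Key=Value` pairs. The following is a minimal sketch, not a Zabbix tool: the function names are made up, the sample text is a shortened stand-in for zabbix_server.conf, and parameters that may legitimately repeat are collapsed to their last value.

```python
def parse_zabbix_conf(text):
    """Parse 'Key=Value' lines, skipping blank lines and '#' comments.

    Assumption: a later occurrence of a key overrides an earlier one,
    which is a simplification for parameters that may be repeated.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def mismatches(settings, expected):
    """Return {key: actual_value} for every expected key that differs."""
    return {k: settings.get(k) for k, v in expected.items()
            if settings.get(k) != v}


# Shortened stand-in for /etc/zabbix/zabbix_server.conf after the edit.
SAMPLE = """\
# DBHost=localhost
DBHost=192.168.33.34
DBPassword=zabbix
ProxyConfigFrequency=60
"""

EXPECTED = {
    "DBHost": "192.168.33.34",
    "DBPassword": "zabbix",
    "ProxyConfigFrequency": "60",
}

if __name__ == "__main__":
    # An empty dict means every expected setting is present.
    print(mismatches(parse_zabbix_conf(SAMPLE), EXPECTED))
```

Against the real file, replace `SAMPLE` with the contents of /etc/zabbix/zabbix_server.conf (reading it needs root, since the file holds the DB password).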

Start Zabbix Server. It fails...

$ sudo systemctl start zabbix-server
Job for zabbix-server.service failed because a configured resource limit was exceeded. See "systemctl status zabbix-server.service" and "journalctl -xe" for details.

The log shows "cannot set resource limit: [13] Permission denied".

$ tail -n 3 /var/log/zabbix/zabbix_server.log
  4928:20170724:161152.933 using configuration file: /etc/zabbix/zabbix_server.conf
  4928:20170724:161152.934 cannot set resource limit: [13] Permission denied
  4928:20170724:161152.934 cannot disable core dump, exiting...
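Since the same failure shows up again later for the proxy and the agent, spotting it in a log can be automated. Below is a rough sketch that assumes the line layout seen above (`PID:YYYYMMDD:HHMMSS.mmm message`); the function names are made up and the sample lines are copied from the log excerpt.

```python
import re
from datetime import datetime

# Layout assumed from the excerpt above: PID:YYYYMMDD:HHMMSS.mmm message
LOG_LINE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6})\.(\d{3})\s+(.*)$")


def parse_log_line(line):
    """Return (pid, timestamp, message), or None if the line does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    pid, date, clock, millis, message = m.groups()
    timestamp = datetime.strptime(date + clock, "%Y%m%d%H%M%S")
    timestamp = timestamp.replace(microsecond=int(millis) * 1000)
    return int(pid), timestamp, message


def hit_resource_limit_error(lines):
    """True if any parsed line reports the 'cannot set resource limit' error."""
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is not None and "cannot set resource limit" in parsed[2]:
            return True
    return False


SAMPLE_LINES = [
    "  4928:20170724:161152.933 using configuration file: /etc/zabbix/zabbix_server.conf",
    "  4928:20170724:161152.934 cannot set resource limit: [13] Permission denied",
    "  4928:20170724:161152.934 cannot disable core dump, exiting...",
]

if __name__ == "__main__":
    print(hit_resource_limit_error(SAMPLE_LINES))
```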

Searching turns up quite a few reports of the same symptom.

さっしーの試してみるか3: Zabbix 3.0.7を建ててみたがはまったのでメモ

SELinux being enabled turned out to be the cause, so, following the article above, I changed the policy for zabbix-server.

$ getenforce
Enforcing

$ sudo yum install policycoreutils-python -y
$ sudo grep zabbix_server /var/log/audit/audit.log | audit2allow
$ sudo grep zabbix_server /var/log/audit/audit.log | audit2allow -M zabbix-limit
$ sudo semodule -i zabbix-limit.pp

Trying to start zabbix-server again now succeeds.
The log shows a DB connection error, but this is also SELinux-related. Use setsebool to allow Zabbix Server to communicate over the network.

$ sudo systemctl start zabbix-server
$ sudo setsebool -P zabbix_can_network on

Building the DB server

MariaDB is used this time. First install the package and start it.
Set MariaDB's character encoding to utf8.

$ sudo yum install mariadb-server -y
$ sudo systemctl start mariadb

Initial setup. First set the root password.

$ /usr/bin/mysql_secure_installation
(answer the prompts as they come)

Then create the zabbix database and the zabbix user.

$ mysql -uroot -p

MariaDB [(none)]> show variables like 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

MariaDB [(none)]> create database zabbix;
Query OK, 1 row affected (0.00 sec)

MariaDB [(none)]> grant all privileges on zabbix.* to 'zabbix'@'%' identified by 'zabbix';
Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.00 sec)

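Copy the DDL over from the Zabbix Server and load it. The DDL file is at /usr/share/doc/zabbix-server-mysql-3.0.10/create.sql.gz.
Incidentally, it works as-is here because this is MariaDB, but with Percona Server 5.7 part of the DDL has to be rewritten (because primary keys are required).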

$ zcat create.sql.gz | mysql -uzabbix -p zabbix

Building Zabbix Proxy

As with Zabbix Server, first add the yum repository.

$ sudo rpm -ivh http://repo.zabbix.com/zabbix/3.0/rhel/7/x86_64/zabbix-release-3.0-1.el7.noarch.rpm

Install the Zabbix Proxy and MariaDB packages.

$ sudo yum install zabbix-proxy-mysql mariadb-server -y

Set up MariaDB.
Set the character encoding to utf8 beforehand.

$ sudo systemctl start mariadb
$ /usr/bin/mysql_secure_installation

$ mysql -uroot -p

MariaDB [(none)]> create database zabbix_proxy;
Query OK, 1 row affected (0.00 sec)

MariaDB [(none)]> grant all privileges on zabbix_proxy.* to 'zabbix'@'localhost' identified by 'zabbix';
Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.00 sec)

Load the DDL that ships with the Zabbix Proxy package.

$ zcat /usr/share/doc/zabbix-proxy-mysql-3.0.10/schema.sql.gz | mysql -uzabbix -p zabbix_proxy

Next, configure Zabbix Proxy. It runs in passive mode this time.

$ sudo cp -p /etc/zabbix/zabbix_proxy.conf{,.org}
$ sudo vi /etc/zabbix/zabbix_proxy.conf
$ sudo diff /etc/zabbix/zabbix_proxy.conf{.org,}
14a15,16
> ProxyMode=1
>
162a165,166
>
> DBPassword=zabbix

Start Zabbix Proxy. It fails the same way Zabbix Server did, so apply the same SELinux policy change and then start it.

$ sudo systemctl start zabbix-proxy
Job for zabbix-proxy.service failed because a configured resource limit was exceeded. See "systemctl status zabbix-proxy.service" and "journalctl -xe" for details.

$ tail -n 3 /var/log/zabbix/zabbix_proxy.log
 24459:20170724:172729.981 using configuration file: /etc/zabbix/zabbix_proxy.conf
 24459:20170724:172729.981 cannot set resource limit: [13] Permission denied
 24459:20170724:172729.981 cannot disable core dump, exiting...

$ sudo yum install policycoreutils-python -y
$ sudo grep zabbix_proxy /var/log/audit/audit.log | audit2allow
$ sudo grep zabbix_proxy /var/log/audit/audit.log | audit2allow -M zabbix-limit
$ sudo semodule -i zabbix-limit.pp

$ sudo systemctl status zabbix-proxy

Building Zabbix Web

As before, first add the yum repository.

$ sudo rpm -ivh http://repo.zabbix.com/zabbix/3.0/rhel/7/x86_64/zabbix-release-3.0-1.el7.noarch.rpm

Install the Zabbix Web package.
There is also a zabbix-web-japanese package that adds Japanese support, but it is left out for now.

$ sudo yum install zabbix-web-mysql -y

Place the Zabbix PHP files (via a symbolic link) so that httpd can serve them.

$ sudo ln -s /usr/share/zabbix /var/www/html/zabbix

Fix the PHP settings. Without a time zone set, opening the Web UI produces a flood of errors.

$ sudo cp -p /etc/httpd/conf.d/zabbix.conf{,.org}
$ sudo vi /etc/httpd/conf.d/zabbix.conf
$ sudo diff /etc/httpd/conf.d/zabbix.conf{.org,}
19c19
<         # php_value date.timezone Europe/Riga
---
>         php_value date.timezone Asia/Tokyo

Prepare the Zabbix Web configuration file.

$ sudo cp /usr/share/zabbix/conf/zabbix.conf.php.example /etc/zabbix/web/zabbix.conf.php
$ sudo vi /etc/zabbix/web/zabbix.conf.php
$ sudo diff /usr/share/zabbix/conf/zabbix.conf.php.example /etc/zabbix/web/zabbix.conf.php
6,7c6,7
< $DB['SERVER']                 = 'localhost';
< $DB['PORT']                           = '0';
---
> $DB['SERVER']                 = '192.168.33.34';
> $DB['PORT']                           = '3306';
10c10
< $DB['PASSWORD']                       = '';
---
> $DB['PASSWORD']                       = 'zabbix';
14c14
< $ZBX_SERVER                           = 'localhost';
---
> $ZBX_SERVER                           = '192.168.33.31';
16c16
< $ZBX_SERVER_NAME              = '';
---
> $ZBX_SERVER_NAME              = 'zabbix-server';

Allow httpd to communicate over the network (for the connection to MariaDB).

$ sudo setsebool -P httpd_can_network_connect on
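When the Web UI later complains about the database, it helps to separate plain network reachability from SELinux. A minimal sketch of a TCP reachability check follows; the function name is made up, and the address and port are the DB server values used in this post. Note that a script run from a shell is not confined like httpd, so a success here says nothing about the SELinux boolean itself.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # DB server address and MariaDB port as used in this post.
    print(can_connect("192.168.33.34", 3306))
```

If this prints True from the web server but the Web UI still cannot reach the DB, SELinux or the DB grants are the more likely suspects.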

Then start httpd.

$ sudo systemctl start httpd

Opening it in a browser shows the Zabbix Web UI.
Initially you can log in with user name "Admin" and password "zabbix".

http://192.168.33.33/zabbix/

Building the monitored server

As before, first add the yum repository.

$ sudo rpm -ivh http://repo.zabbix.com/zabbix/3.0/rhel/7/x86_64/zabbix-release-3.0-1.el7.noarch.rpm

Then install Zabbix Agent.

$ sudo yum install zabbix-agent -y

Configure it to talk to the Zabbix Proxy.

$ sudo cp -p /etc/zabbix/zabbix_agentd.conf{,.org}
$ sudo vi /etc/zabbix/zabbix_agentd.conf
$ sudo diff /etc/zabbix/zabbix_agentd.conf{.org,}
95c95
< Server=127.0.0.1
---
> Server=192.168.33.32
136c136
< ServerActive=127.0.0.1
---
> ServerActive=192.168.33.32
147c147
< Hostname=Zabbix server
---
> Hostname=Zabbix agent

Start Zabbix Agent; this also fails because of SELinux. Apply the same policy change as before.

$ sudo systemctl start zabbix-agent
Job for zabbix-agent.service failed because a configured resource limit was exceeded. See "systemctl status zabbix-agent.service" and "journalctl -xe" for details.

$ tail -n 3 /var/log/zabbix/zabbix_agentd.log
 31979:20170724:183644.392 using configuration file: /etc/zabbix/zabbix_agentd.conf
 31979:20170724:183644.392 cannot set resource limit: [13] Permission denied
 31979:20170724:183644.392 cannot disable core dump, exiting...

$ sudo yum install policycoreutils-python -y
$ sudo grep zabbix_agent /var/log/audit/audit.log | audit2allow
$ sudo grep zabbix_agent /var/log/audit/audit.log | audit2allow -M zabbix-limit
$ sudo semodule -i zabbix-limit.pp

Once Zabbix Agent is running, configure monitoring via the Proxy from the Web UI.

[Two screenshots of the Web UI settings were here.]

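Next: HA configuration

One could argue that a monitoring server can afford to be down for a while, but automatic failover undoubtedly makes operations easier.
Or rather, a tool meant to support operations should not itself be a burden to operate.
It appears that an HA configuration can be built with Pacemaker and the like, so I will try that separately.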

2017-07-21

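Running Pacemaker + Corosync on CentOS 7

Built following the official documentation (ClusterLabs).

Red Hat's documentation is also useful.

Preparing the virtual machines

Start two machines with Vagrant.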

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"

  config.vm.define "cluster01" do |server|
    server.vm.network "private_network", ip: "192.168.33.21"
  end

  config.vm.define "cluster02" do |server|
    server.vm.network "private_network", ip: "192.168.33.22"
  end
end

Check the CentOS version. Naturally it is the same on both machines.

$ cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)

Package installation

Installing pcs also pulls in the rest (pacemaker, resource-agents, corosync, psmisc, policycoreutils-python).

$ sudo yum install pcs

Right after package installation pcsd is not running, so start it.

$ sudo systemctl status pcsd
● pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

$ sudo systemctl start pcsd
$ sudo systemctl enable pcsd

User management

The hacluster user and the haclient group are added automatically when the packages are installed.

$ id hacluster
uid=189(hacluster) gid=189(haclient) groups=189(haclient)

$ tail -n 2 /etc/passwd
vagrant:x:1000:1000:vagrant:/home/vagrant:/bin/bash
hacluster:x:189:189:cluster user:/home/hacluster:/sbin/nologin

$ tail -n 2 /etc/group
vagrant:x:1000:vagrant
haclient:x:189:

Set a password for the hacluster user.

$ sudo passwd hacluster

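Incidentally, setting a password for the hacluster user is not strictly required: running pcs commands locally works without one, but a password is needed for tasks that involve other nodes.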

The installed packages will create a hacluster user with a disabled password. While this is fine for running pcs commands locally, the account needs a login password in order to perform such tasks as syncing the corosync configuration, or starting and stopping the cluster on other nodes.

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html#_enable_pcs_daemon

Corosync configuration

Add the node entries to the hosts file.

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.33.21 cluster01
192.168.33.22 cluster02

Run the following command on either node to authenticate the hacluster user (so that pcs can be run).

$ sudo pcs cluster auth cluster01 cluster02
Username: hacluster
Password:
cluster02: Authorized
cluster01: Authorized

Then, on either node, run the following command to create the two-node cluster.

$ sudo pcs cluster setup --name mycluster cluster01 cluster02
Destroying cluster on nodes: cluster01, cluster02...
cluster02: Stopping Cluster (pacemaker)...
cluster01: Stopping Cluster (pacemaker)...
cluster02: Successfully destroyed cluster
cluster01: Successfully destroyed cluster

Sending cluster config files to the nodes...
cluster01: Succeeded
cluster02: Succeeded

Synchronizing pcsd certificates on nodes cluster01, cluster02...
cluster02: Success
cluster01: Success

Restarting pcsd on the nodes in order to reload the certificates...
cluster02: Success
cluster01: Success

This writes the cluster information into /etc/corosync/corosync.conf on all nodes.

$ cat /etc/corosync/corosync.conf
totem {
    version: 2
    secauth: off
    cluster_name: mycluster
    transport: udpu
}

nodelist {
    node {
        ring0_addr: cluster01
        nodeid: 1
    }

    node {
        ring0_addr: cluster02
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}

Starting the cluster

pcs cluster setup only distributes the cluster configuration.
Starting the cluster takes a separate command (it can also be started via a setup option).

$ sudo pcs cluster status
Error: cluster is not currently running on this node

$ sudo pcs cluster start --all
cluster01: Starting Cluster...
cluster02: Starting Cluster...

$ sudo pcs cluster status
Cluster Status:
 Stack: unknown
 Current DC: NONE
 Last updated: Wed Jul 19 16:56:06 2017         Last change: Wed Jul 19 16:56:02 2017 by hacluster via crmd on cluster01
 2 nodes and 0 resources configured

PCSD Status:
  cluster02: Online
  cluster01: Online

Configure the cluster to start automatically when a node boots.

$ sudo pcs cluster enable --all

This is in fact equivalent to running the following commands on every node.

$ sudo systemctl enable corosync
$ sudo systemctl enable pacemaker

Cluster settings

STONITH is not used this time, so disable it.

$ sudo pcs property set stonith-enabled=false

STONITH is explained here.

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html#_what_is_stonith

The gihyo article also explains STONITH clearly.

第3回 Pacemakerでいろいろ設定してみよう！［構築応用編］：Pacemakerでかんたんクラスタリング体験してみよう！｜gihyo.jp … 技術評論社

Also, quorum is meaningless in a two-node configuration, so disable it.
Set things up so that the service moves when one node fails.

$ sudo pcs property set no-quorum-policy=ignore
$ sudo pcs resource defaults migration-threshold=1
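The reason quorum does not help with two nodes comes down to simple arithmetic. The sketch below assumes one vote per node and a strict-majority rule, and it deliberately ignores corosync's two_node option seen in corosync.conf above; the function names are made up.

```python
def quorum_threshold(total_votes):
    """Votes needed for a strict majority (assumes one vote per node)."""
    return total_votes // 2 + 1


def has_quorum(live_votes, total_votes):
    """True if the surviving nodes still hold a strict majority."""
    return live_votes >= quorum_threshold(total_votes)


if __name__ == "__main__":
    # With 2 nodes the threshold is 2, so losing either node loses quorum.
    print(quorum_threshold(2), has_quorum(1, 2))
    # With 3 nodes the threshold is also 2, so one node can fail safely.
    print(quorum_threshold(3), has_quorum(2, 3))
```

With two nodes the surviving node alone can never reach the majority, which is why the policy is set to ignore here.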

Quorum is explained here.

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html#_perform_a_failover

Creating a service

Create a Dummy service for testing.

$ sudo pcs resource create my_first_svc Dummy op monitor interval=10s

Manual failover

Initially the service runs on cluster01.

$ sudo pcs status resources
 my_first_svc   (ocf::heartbeat:Dummy): Started cluster01

Move this service to cluster02 (fail it over).

$ sudo pcs resource move my_first_svc cluster02
$ sudo pcs status resources
 my_first_svc   (ocf::heartbeat:Dummy): Started cluster02

Putting a specific node into standby

When cluster01 is put into standby while the service is running on it, the service moves to cluster02 automatically.

$ sudo pcs status resources
 my_first_svc   (ocf::heartbeat:Dummy): Started cluster01

$ sudo pcs cluster standby cluster01
$ sudo pcs status resources
 my_first_svc   (ocf::heartbeat:Dummy): Started cluster02

Bringing cluster01 back from standby moves the service back to cluster01 automatically.

$ sudo pcs cluster unstandby cluster01
$ sudo pcs status resources
 my_first_svc   (ocf::heartbeat:Dummy): Started cluster01

Incidentally, stopping the cluster on the node instead of putting it into standby behaves the same way (simulating a failure).

$ sudo pcs cluster stop cluster01
$ sudo pcs status resources
 my_first_svc   (ocf::heartbeat:Dummy): Started cluster02

$ sudo pcs cluster start cluster01
$ sudo pcs status resources
 my_first_svc   (ocf::heartbeat:Dummy): Started cluster01
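When repeating these checks it is handy to pull the node name out of the status output rather than eyeballing it. The following is a rough sketch that only handles the simple one-line-per-resource layout shown in this post; the function name is made up, and grouped or cloned resources are not covered.

```python
import re

# Matches lines like:
#  my_first_svc   (ocf::heartbeat:Dummy): Started cluster01
RESOURCE_LINE = re.compile(
    r"^\s*(\S+)\s+\(([^)]+)\):\s+(\S+)(?:\s+(\S+))?\s*$"
)


def parse_resources(output):
    """Map resource name -> (agent, state, node) from status text.

    node is None when the line carries no node name.
    """
    resources = {}
    for line in output.splitlines():
        m = RESOURCE_LINE.match(line)
        if m:
            name, agent, state, node = m.groups()
            resources[name] = (agent, state, node)
    return resources


SAMPLE = " my_first_svc   (ocf::heartbeat:Dummy): Started cluster01\n"

if __name__ == "__main__":
    print(parse_resources(SAMPLE))
```

Feeding it the output of `sudo pcs status resources` before and after a standby or stop makes the move between nodes easy to assert in a script.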

Configuring Apache as active/standby

Set things up so that Apache runs on cluster01 and, if it goes down, Apache starts on cluster02.
This is where I got stuck: following the official documentation as written did not work.

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html#idm139647252480704

First create the resource. Apache starts running on cluster01. At this point Apache is not running on cluster02.

$ sudo pcs resource create webserver systemd:httpd op monitor interval=10s

$ sudo pcs resource show
 webserver      (systemd:httpd):        Started cluster01

Then put cluster01 into standby. Apache on cluster01 stops and Apache starts on cluster02.

$ sudo pcs cluster standby cluster01
$ sudo pcs status resources
 webserver      (systemd:httpd):        Started cluster02

Bringing cluster01 back from standby does not trigger an automatic failback.

Forcibly shutting down cluster01 produces the same sequence. Manual failover (pcs resource move) is of course also possible.

Miscellaneous

Multiple services can end up starting on different nodes, so pcs constraint can be used to pin them.

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html#_prefer_one_node_over_another

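For Apache, there are situations where I only want the cluster to create and remove the file used by the LB health check; is that impossible without writing a custom resource agent?
I will keep looking into this.

Also, distributing configuration to all nodes at once with the pcs command is convenient, but in secure environments with heavily restricted server-to-server communication there are times when the configuration has to be applied node by node. I will keep looking into how to do that as well.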

Rhythm & Biology

Engineering, Science, et al.