documentation - reed-alert - Lightweight agentless alerting system for server
git clone git://bitreich.org/reed-alert/ git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws65d7roiv6bfj7d652fid.onion/reed-alert/
---
commit 5e86ea0009b75bc4acfc9da5836dac309e4473ba
parent 9410e05e37aefd2e1880178e655a8bd64173c645
Author: solene rapenne <solene@dataswamp.org>
Date:   Fri, 7 Oct 2016 15:27:01 +0200

documentation

Diffstat:
  A README.rd | 153 +++++++++++++++++++++++++++++++

1 file changed, 153 insertions(+), 0 deletions(-)
---
diff --git a/README.rd b/README.rd
@@ -0,0 +1,153 @@
+Presentation
+============
+
+reed-alert is a tool to check the status of various things on a server
+and trigger user-defined notifications when something goes wrong. In the code,
+each check is called a "probe" and has parameters.
+
+The code is very rough for now. I will try to make the config file
+easier than it currently is, but I think it's already easy enough for
+people who need this kind of tool.
+
+reed-alert is regularly tested on FreeBSD, OpenBSD and Linux.
+
+
+Defining a notification system
+==============================
+
++ function : the name of the probe
++ date : the current date with format YYYY/MM/DD hh:mm:ss
++ params : the parameters of the probe
++ hostname : the hostname of the server
++ result : the error returned (the value exceeding the limit, file not found)
++ description : an arbitrary description naming a check
++ level : the type of notification used
++ os : the type of operating system (FreeBSD/Linux/OpenBSD)
++ _ : a space character
++ space : a space character
++ newline : a newline character
+
+If you want to send a mail with a message like "At 2016/10/06 11:11:12
+server.foo.com has encountered a problem during LOAD-AVERAGE-15
+(:LIMIT 10) with a value of 30" you can write the following and use
+**pretty-mail** in your checks.
+
+    (defvar *alerts*
+      (list
+        '(pretty-mail ("echo 'At " date _ hostname " has encountered a problem during " function _
+           params " with a value of " result "' | mail yourmail@foo.bar"))))
+
+If you don't want anything to be triggered, you can use the following
+in *alerts*:
+
+    '(nothing-to-send nil)
+
+If you find it easier to read, you can add + in the concatenation;
+it is simply discarded when the program parses the list.
+
+    '(pretty-mail (date + " " + hostname + " has encountered a problem " + function))
+
+The different probes
+====================
+
+Probes are written in Lisp and sometimes rely on system calls, like
+for ping or the average load of the system. They are meant to run on
+different operating systems.
+
+number-of-processes
+-------------------
+Check if the actual number of processes on the system exceeds the limit
+
+> Set the limit that will trigger an alert when exceeded
+    :limit INTEGER
+
+Example : `(=> example number-of-processes (:limit 200))`
+
+pid-running
+-----------
+Check if the PID number found in a .pid file is alive
+
+> Set the path of the pid file. If the user doesn't have permission to open it, return "file not found"
+    :path "STRING"
+
+Example : `(=> example pid-running (:path "/var/run/nginx.pid"))`
+
+
+disk-usage
+----------
+Check if the used percentage of the chosen partition exceeds the limit
+
+> Set the mountpoint to check
+    :path "STRING"
+
+> Set the limit that will trigger an alert when exceeded
+    :limit INTEGER
+
+Example : `(=> example disk-usage (:path "/tmp" :limit 50))`
+
+
+file-exists
+-----------
+Check if a file exists
+
+> Set the path of the file to check
+    :path "STRING"
+
+Example : `(=> example file-exists (:path "/var/postgresql/standby"))`
+
+file-updated
+------------
+Check if a file exists and has been updated since a defined time
+
+> Set the limit in minutes since the last modification time before triggering an alert
+    :limit INTEGER
+
+> Set the path of the file to check
+    :path "STRING"
+
+Example : `(=> example file-updated (:path "/var/log/nginx/access.log" :limit 60))`
+
+load-average-1
+--------------
+Check if the load average over the last minute exceeds the limit
+
+> Set the limit not to exceed
+    :limit INTEGER
+
+Example : `(=> example load-average-1 (:limit 2))`
+
+load-average-5
+--------------
+Check if the load average over the last five minutes exceeds the limit
+
+> Set the limit not to exceed
+    :limit INTEGER
+
+Example : `(=> example load-average-5 (:limit 2))`
+
+load-average-15
+---------------
+Check if the load average over the last fifteen minutes exceeds the limit
+
+> Set the limit not to exceed
+    :limit INTEGER
+
+Example : `(=> example load-average-15 (:limit 2))`
+
+ping
+----
+Check if a remote host answers 2 ICMP pings
+
+> Set the host to ping. Return an error if the ping command returns non-zero
+    :host "STRING" (can be IP or hostname)
+
+Example : `(=> example ping (:host "8.8.8.8"))`
+
+command
+-------
+Execute an arbitrary command; an alert is triggered if the command returns a non-zero value
+
+> Command to execute, accepts commands with pipes
+    :command "STRING"
+
+Example : `(=> example command (:command "tail -n 10 /var/log/messages | grep -v CRITICAL"))`
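
The README says an alert is a list of strings and variable names, and that any `+` in the list is discarded when it is parsed. The following is a minimal sketch of how such a list could be turned into one command string. It is not taken from reed-alert's source: the function name `expand-template` and the association-list `bindings` argument are hypothetical, chosen only for illustration.

```lisp
;; Hypothetical sketch, not reed-alert's actual implementation.
;; TEMPLATE is a list such as the one given to pretty-mail in the README.
;; BINDINGS is an assumed alist mapping variable symbols to their values.
(defun expand-template (template bindings)
  "Concatenate TEMPLATE into a single string.
Strings are copied as-is, the symbol + is skipped, and any other
symbol is replaced by its value in BINDINGS (ignored if unbound)."
  (with-output-to-string (out)
    (dolist (item template)
      (cond ((stringp item) (write-string item out))
            ((eq item '+))              ; readability marker, discarded
            (t (let ((pair (assoc item bindings)))
                 (when pair
                   (princ (cdr pair) out))))))))

;; Usage with the "+" form shown in the README:
(expand-template
 '(date + " " + hostname + " has encountered a problem " + function)
 '((date . "2016/10/06 11:11:12")
   (hostname . "server.foo.com")
   (function . "LOAD-AVERAGE-15")))
;; => "2016/10/06 11:11:12 server.foo.com has encountered a problem LOAD-AVERAGE-15"
```

Under this reading, `_` and `space` would simply be bindings whose value is `" "`, and `newline` a binding whose value is a newline string.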