iSync with new alert declaration and add explanations with code usage - reed-alert - Lightweight agentless alerting system for server Err bitreich.org 70
hgit clone git://bitreich.org/reed-alert/ git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws65d7roiv6bfj7d652fid.onion/reed-alert/ URL:git://bitreich.org/reed-alert/ git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws65d7roiv6bfj7d652fid.onion/reed-alert/ bitreich.org 70
i--- Err bitreich.org 70
1commit 439acf53f4c8be2665c3459055a57b3d03656fd8 /scm/reed-alert/commit/439acf53f4c8be2665c3459055a57b3d03656fd8.gph bitreich.org 70
1parent 8e2203d405f186f1c5e6968d37e45482a7175399 /scm/reed-alert/commit/8e2203d405f186f1c5e6968d37e45482a7175399.gph bitreich.org 70
hAuthor: Solene Rapenne <solene@perso.pw> URL:mailto:solene@perso.pw bitreich.org 70
iDate: Wed, 10 Jan 2018 20:17:32 +0100 Err bitreich.org 70
i Err bitreich.org 70
iSync with new alert declaration and add explanations with code usage Err bitreich.org 70
i Err bitreich.org 70
iDiffstat: Err bitreich.org 70
i M README | 147 ++++++++++++++++++++++++------- Err bitreich.org 70
i Err bitreich.org 70
i1 file changed, 115 insertions(+), 32 deletions(-) Err bitreich.org 70
i--- Err bitreich.org 70
1diff --git a/README b/README /scm/reed-alert/file/README.gph bitreich.org 70
i@@ -20,11 +20,16 @@ tested with both **sbcl** and **ecl** - which should be available for Err bitreich.org 70
i most distributions. Err bitreich.org 70
i Err bitreich.org 70
i (On OpenBSD you may prefer to use ecl because sbcl needs 'wxallowed' Err bitreich.org 70
i-where the binary is.) Err bitreich.org 70
i+on the partition where the binary is.) Err bitreich.org 70
i Err bitreich.org 70
i To make reed-alert's deployment easier I avoid using external Err bitreich.org 70
i libraries. reed-alert only requires a Common LISP interpreter and a Err bitreich.org 70
i-few files. Err bitreich.org 70
i+its own files. Err bitreich.org 70
i+ Err bitreich.org 70
i+An attempt to use quicklisp libraries for more sophisticated checks, Err bitreich.org 70
i+like "does this URL contain a pattern?", was started and then Err bitreich.org 70
i+abandoned. Instead, users who need more elaborate checks should write Err bitreich.org 70
i+a shell command and run it with the **command** probe. Err bitreich.org 70
i Err bitreich.org 70
i Err bitreich.org 70
i Code-Readability Err bitreich.org 70
i@@ -34,7 +39,7 @@ Although the code is very rough for now, I think it's already fairly Err bitreich.org 70
i understandable by people who do need this kind of tool. Err bitreich.org 70
i Err bitreich.org 70
i I will try to improve on the readability of the config file in future Err bitreich.org 70
i-commits. Err bitreich.org 70
i+commits. NOTE: declaring notifiers is easier now. Err bitreich.org 70
i Err bitreich.org 70
i Err bitreich.org 70
i Usage Err bitreich.org 70
i@@ -58,52 +63,53 @@ The configuration is explained below. Err bitreich.org 70
i The Notification System Err bitreich.org 70
i ======================= Err bitreich.org 70
i Err bitreich.org 70
i-+ function : the name of the probe Err bitreich.org 70
i-+ date : the current date with format YYYY/MM/DD hh:mm:ss Err bitreich.org 70
i-+ params : the parameters of the probe Err bitreich.org 70
i-+ hostname : the hostname of the server Err bitreich.org 70
i-+ result : the error returned (the value exceeding the limit, file not found) Err bitreich.org 70
i-+ description : an arbitrary description naming a check Err bitreich.org 70
i-+ level : the type of notification used Err bitreich.org 70
i-+ os : the type of operating system (FreeBSD/Linux/OpenBSD) Err bitreich.org 70
i-+ _ : a space character Err bitreich.org 70
i-+ space : a space character Err bitreich.org 70
i-+ newline : a newline character Err bitreich.org 70
i+When a check returns an error, a previously defined notifier is Err bitreich.org 70
i+called. A notifier is a named shell command. The shell command Err bitreich.org 70
i+can contain the following variables from reed-alert: Err bitreich.org 70
i+ Err bitreich.org 70
i++ %function% : the name of the probe Err bitreich.org 70
i++ %date% : the current date with format YYYY/MM/DD hh:mm:ss Err bitreich.org 70
i++ %params% : the parameters of the probe Err bitreich.org 70
i++ %hostname% : the hostname of the server Err bitreich.org 70
i++ %result% : the error returned (the value exceeding the limit, file not found) Err bitreich.org 70
i++ %description% : an arbitrary description naming a check Err bitreich.org 70
i++ %level% : the type of notification used Err bitreich.org 70
i++ %os% : the type of operating system (FreeBSD/Linux/OpenBSD) Err bitreich.org 70
i++ %newline% : a newline character Err bitreich.org 70
i Err bitreich.org 70
i Err bitreich.org 70
i-Example Probe: 'Check For Load Average' Err bitreich.org 70
i+Example Probe 1: 'Check For Load Average' Err bitreich.org 70
i --------------------------------------- Err bitreich.org 70
i If you want to send a mail with a message like: Err bitreich.org 70
i Err bitreich.org 70
i- "At 2016/10/06 11:11:12 server.foo.com has encountered a problem Err bitreich.org 70
i+ "On 2016/10/06 11:11:12 server.foo.com has encountered a problem Err bitreich.org 70
i during LOAD-AVERAGE-15 (:LIMIT 10) with a value of 30" Err bitreich.org 70
i Err bitreich.org 70
i Err bitreich.org 70
i-write the following and use **pretty-mail** in your checks: Err bitreich.org 70
i+write the following at the top of the file and use **pretty-mail** in your checks: Err bitreich.org 70
i Err bitreich.org 70
i- (defvar *alerts* Err bitreich.org 70
i- (list Err bitreich.org 70
i- '(pretty-mail ("echo '" date _ hostname " has encountered a problem during" function Err bitreich.org 70
i- params " with a value of " result "' | mail yourmail@foo.bar")))) Err bitreich.org 70
i+ (alert pretty-mail "echo 'On %date% %hostname% has encountered a problem during %function% Err bitreich.org 70
i+ %params% with a value of %result%' | mail yourmail@foo.bar") Err bitreich.org 70
i Err bitreich.org 70
i-Variant 1 Err bitreich.org 70
i-~~~~~~~~~ Err bitreich.org 70
i-If you find it easier to read, you can add + in the concatenation. Err bitreich.org 70
i-The + is discarded by reed-alert as soon as it parses the list. Err bitreich.org 70
i+Example Probe 2: 'Don't do anything' Err bitreich.org 70
i+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Err bitreich.org 70
i+If you don't want anything to be done when an error occurs, use the following: Err bitreich.org 70
i Err bitreich.org 70
i- '(pretty-mail (date + " " + hostname + " has encountered a problem " + function)) Err bitreich.org 70
i+ (alert nothing-to-send "") Err bitreich.org 70
i Err bitreich.org 70
i-Variant 2 Err bitreich.org 70
i-~~~~~~~~~ Err bitreich.org 70
i-If you don't want anything to be triggered use the following in *alerts*: Err bitreich.org 70
i+Example Probe 3: 'Send SMS' Err bitreich.org 70
i+~~~~~~~~~~~~~~~~~~~~~~~~~~~ Err bitreich.org 70
i+You may want to use an external service to send an SMS. This is Err bitreich.org 70
i+possible because notifiers are shell commands: Err bitreich.org 70
i Err bitreich.org 70
i- '(nothing-to-send nil) Err bitreich.org 70
i+ (alert sms "echo 'error on %hostname% : %function% %result%' Err bitreich.org 70
i+ | curl -u login:pass http://api.sendsms.com/") Err bitreich.org 70
i Err bitreich.org 70
i Err bitreich.org 70
i The Probes Err bitreich.org 70
i ========== Err bitreich.org 70
i Err bitreich.org 70
i-Probes are written in Common LISP. Err bitreich.org 70
i+Probes are written in Common LISP. They are predefined checks. Err bitreich.org 70
i Err bitreich.org 70
i The :desc Parameter Err bitreich.org 70
i ------------------- Err bitreich.org 70
i@@ -230,6 +236,7 @@ Example : `(=> alert ping (:host "8.8.8.8"))` Err bitreich.org 70
i command Err bitreich.org 70
i ------- Err bitreich.org 70
i Execute an arbitrary command which triggers an alert if it returns a non-zero value. Err bitreich.org 70
i+This may be the most useful probe because it lets the user do any check needed. Err bitreich.org 70
i Err bitreich.org 70
i > Command to execute, accept commands with pipes. Err bitreich.org 70
i :command "STRING" Err bitreich.org 70
i@@ -255,4 +262,80 @@ Check if a file has a size less than a specified limit. Err bitreich.org 70
i > Set the limit in bytes before triggering an alert. Err bitreich.org 70
i :limit INTEGER Err bitreich.org 70
i Err bitreich.org 70
i-Example : `(=> alert file-less-than (:path "/var/log/nginx/access.log" :limit 60))` Err bitreich.org 70
i+Example : `(=> alert file-less-than (:path "/var/log/nginx.log" :limit 60))` Err bitreich.org 70
i+ Err bitreich.org 70
i+ Err bitreich.org 70
i+The configuration file Err bitreich.org 70
i+====================== Err bitreich.org 70
i+ Err bitreich.org 70
i+The configuration file is Common LISP code, so it's evaluated. It's Err bitreich.org 70
i+possible to write some logic within it. Err bitreich.org 70
i+ Err bitreich.org 70
i+ Err bitreich.org 70
i+Loops Err bitreich.org 70
i+----- Err bitreich.org 70
i+It's possible to write loops if you don't want to repeat code: Err bitreich.org 70
i+ Err bitreich.org 70
i+ (loop for host in '("bitreich.org" "dataswamp.org" "floodgap.com") Err bitreich.org 70
i+ do Err bitreich.org 70
i+ (=> mail ping (:host host))) Err bitreich.org 70
i+ Err bitreich.org 70
i+or another example Err bitreich.org 70
i+ Err bitreich.org 70
i+ (loop for service in '("smtpd" "nginx" "mysqld" "postgresql") Err bitreich.org 70
i+ do Err bitreich.org 70
i+ (=> mail service (:name service))) Err bitreich.org 70
i+ Err bitreich.org 70
i+and another example using rows from a file to check remote hosts Err bitreich.org 70
i+ Err bitreich.org 70
i+ (with-open-file (stream "hosts.txt") Err bitreich.org 70
i+ (loop for line = (read-line stream nil) Err bitreich.org 70
i+ while line Err bitreich.org 70
i+ do Err bitreich.org 70
i+ (=> mail ping (:host line)))) Err bitreich.org 70
i+ Err bitreich.org 70
i+ Err bitreich.org 70
i+Conditional Err bitreich.org 70
i+----------- Err bitreich.org 70
i+It is also possible to write conditionals. There are two very useful Err bitreich.org 70
i+groups of conditionals. Err bitreich.org 70
i+ Err bitreich.org 70
i+ Err bitreich.org 70
i+Dependency Err bitreich.org 70
i+~~~~~~~~~~ Err bitreich.org 70
i+Sometimes it is a good idea to skip the remaining probes when a probe Err bitreich.org 70
i+fails. For example, you may need to check a path through a network, from Err bitreich.org 70
i+the nearest machine to the remote target. If we can't reach our local Err bitreich.org 70
i+router, probes requiring the router to work will trigger errors so we Err bitreich.org 70
i+should skip them. Err bitreich.org 70
i+ Err bitreich.org 70
i+(stop-if-error Err bitreich.org 70
i+ (=> mail ping (:host "192.168.1.1" :desc "My local router")) Err bitreich.org 70
i+ (=> mail ping (:host "89.89.89.89" :desc "My ISP DNS server")) Err bitreich.org 70
i+ (=> mail ping (:host "kernel.org" :desc "Remote website"))) Err bitreich.org 70
i+ Err bitreich.org 70
i+Note: stop-if-error is an alias for the Common LISP **and** macro. Err bitreich.org 70
i+ Err bitreich.org 70
i+ Err bitreich.org 70
i+Escalation Err bitreich.org 70
i+~~~~~~~~~~ Err bitreich.org 70
i+It can be a good idea to use different alerts Err bitreich.org 70
i+depending on how critical a check is, but sometimes the criticality Err bitreich.org 70
i+depends on the value of the error and/or the delay between Err bitreich.org 70
i+detecting a problem and fixing it. You may want to receive a mail when Err bitreich.org 70
i+things can be fixed in your spare time, but alert other people if Err bitreich.org 70
i+things aren't fixed before they reach some level. Err bitreich.org 70
i+ Err bitreich.org 70
i+(escalation Err bitreich.org 70
i+ (=> mail-me disk-usage (:path "/" :limit 70)) Err bitreich.org 70
i+ (=> sms-me disk-usage (:path "/" :limit 90)) Err bitreich.org 70
i+ (=> buzzer disk-usage (:path "/" :limit 98))) Err bitreich.org 70
i+ Err bitreich.org 70
i+In this example, we check the disk usage. I will get a mail through the Err bitreich.org 70
i+"mail-me" alert if the disk usage goes above 70%. Once it goes Err bitreich.org 70
i+that far, reed-alert checks whether the disk usage is above 90%; if so, Err bitreich.org 70
i+I'll receive an SMS through the "sms-me" alert. Then, if it goes above Err bitreich.org 70
i+98%, the "buzzer" alert will make some bad noises in the room to Err bitreich.org 70
i+warn me about this. Err bitreich.org 70
i+ Err bitreich.org 70
i+Note: escalation is an alias for the Common LISP **or** macro. Err bitreich.org 70
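
The README text in this diff recommends plain shell commands, run through the **command** probe, for checks such as "does this URL contain a pattern?". A minimal sketch of such a check is shown here. The helper name `body_has_pattern` and the example URL are hypothetical and are not part of reed-alert. The sketch relies only on the documented behaviour that the command probe triggers an alert on a non-zero exit status.

```shell
#!/bin/sh
# Hypothetical helper, not part of reed-alert.
# Reads a page body on stdin and exits 0 if it contains the fixed
# string given as $1, non-zero otherwise.
body_has_pattern() {
    # -q: no output, only an exit status; -F: treat $1 as a literal string
    grep -qF -- "$1"
}

# Assumed usage (needs curl and network access, so it is left commented out):
#   curl -fsS http://example.org/ | body_has_pattern 'Example Domain'
# A pipeline's exit status is that of its last command, so a missing
# pattern (or an empty body after a failed download) yields non-zero.
```

Because the probe accepts commands with pipes, the same pipeline could be given inline in the configuration file, for instance (assuming an alert named `mail` has already been declared):

    (=> mail command (:command "curl -fsS http://example.org/ | grep -qF 'Example Domain'" :desc "Front page content"))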
.