iCount failures and send begin/end notifications - reed-alert - Lightweight agentless alerting system for server	Err	bitreich.org	70
hgit clone git://bitreich.org/reed-alert/ git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws65d7roiv6bfj7d652fid.onion/reed-alert/	URL:git://bitreich.org/reed-alert/ git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws65d7roiv6bfj7d652fid.onion/reed-alert/	bitreich.org	70
1Log	/scm/reed-alert/log.gph	bitreich.org	70
1Files	/scm/reed-alert/files.gph	bitreich.org	70
1Refs	/scm/reed-alert/refs.gph	bitreich.org	70
1Tags	/scm/reed-alert/tag	bitreich.org	70
1README	/scm/reed-alert/file/README.gph	bitreich.org	70
1LICENSE	/scm/reed-alert/file/LICENSE.gph	bitreich.org	70
i---	Err	bitreich.org	70
1commit f352b8458e9b406ce8795bf00c704c260c511cd6	/scm/reed-alert/commit/f352b8458e9b406ce8795bf00c704c260c511cd6.gph	bitreich.org	70
1parent 1b2f15bf2974f893f7dd55ff6b4742dd0c0430d2	/scm/reed-alert/commit/1b2f15bf2974f893f7dd55ff6b4742dd0c0430d2.gph	bitreich.org	70
hAuthor: Solene Rapenne <solene@perso.pw>	URL:mailto:solene@perso.pw	bitreich.org	70
iDate:   Wed, 17 Jan 2018 20:38:54 +0100	Err	bitreich.org	70
i	Err	bitreich.org	70
iCount failures and send begin/end notifications	Err	bitreich.org	70
i	Err	bitreich.org	70
iDiffstat:	Err	bitreich.org	70
i  M README                              |      37 ++++++++++++++++++++++++++++---	Err	bitreich.org	70
i  M config.lisp.sample                  |       6 +++---	Err	bitreich.org	70
i  M example.lisp                        |       8 ++++----	Err	bitreich.org	70
i  M functions.lisp                      |      53 +++++++++++++++++++++++++++----	Err	bitreich.org	70
i	Err	bitreich.org	70
i4 files changed, 88 insertions(+), 16 deletions(-)	Err	bitreich.org	70
i---	Err	bitreich.org	70
1diff --git a/README b/README	/scm/reed-alert/file/README.gph	bitreich.org	70
i@@ -63,9 +63,29 @@ The configuration is explained below.	Err	bitreich.org	70
i The Notification System	Err	bitreich.org	70
i =======================	Err	bitreich.org	70
i 	Err	bitreich.org	70
i-When a check return an error, a previously defined notifier will be	Err	bitreich.org	70
i-called. The notifier is a shell command with a name. The shell command	Err	bitreich.org	70
i-can contains variables from reed-alert.	Err	bitreich.org	70
i+When a check returns a failure, a previously defined notifier is	Err	bitreich.org	70
i+called. This happens only once reed-alert has seen exactly **3**	Err	bitreich.org	70
i+failures in a row for this check. 3 is the default value; it can be	Err	bitreich.org	70
i+changed per probe with the :try parameter, as explained later in	Err	bitreich.org	70
i+this document. This keeps reed-alert from sending notifications	Err	bitreich.org	70
i+over and over while a problem lasts (for example a disk space usage	Err	bitreich.org	70
i+that can't be fixed for a long time), and from sending notifications	Err	bitreich.org	70
i+about a check on the edge of its limit, like a ping that mostly works	Err	bitreich.org	70
i+but fails from time to time, or a load average hovering around the	Err	bitreich.org	70
i+limit.	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+reed-alert calls the notifier when a check reaches its try number	Err	bitreich.org	70
i+and again when the problem is fixed, so you know when it begins and	Err	bitreich.org	70
i+when it ends.	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+reed-alert keeps track of the failure count with one file per	Err	bitreich.org	70
i+failing probe in the "states" folder. To ensure unique filenames, the	Err	bitreich.org	70
i+following format is used (+ means concatenation):	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+    alert-name + probe-name + hash of probe parameters	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+The notifier is a shell command with a name. The shell command can	Err	bitreich.org	70
i+contain variables from reed-alert.	Err	bitreich.org	70
i 	Err	bitreich.org	70
i + %function%    : the name of the probe	Err	bitreich.org	70
i + %date%        : the current date with format YYYY/MM/DD hh:mm:ss	Err	bitreich.org	70
i@@ -76,6 +96,7 @@ can contains variables from reed-alert.	Err	bitreich.org	70
i + %level%       : the type of notification used	Err	bitreich.org	70
i + %os%          : the type of operating system (FreeBSD/Linux/OpenBSD)	Err	bitreich.org	70
i + %newline%     : a newline character	Err	bitreich.org	70
i++ %state%       : "Start" / "End" when the problem happens / is solved	Err	bitreich.org	70
i 	Err	bitreich.org	70
i 	Err	bitreich.org	70
i Example Probe 1: 'Check For Load Average'	Err	bitreich.org	70
i@@ -119,6 +140,16 @@ does. It can be put in every probe.	Err	bitreich.org	70
i     :desc "STRING"	Err	bitreich.org	70
i 	Err	bitreich.org	70
i 	Err	bitreich.org	70
i+The :try Parameter	Err	bitreich.org	70
i+------------------	Err	bitreich.org	70
i+The :try parameter sets how many consecutive failures to wait for	Err	bitreich.org	70
i+before the alert is triggered. By default, it's triggered after 3	Err	bitreich.org	70
i+failures. Sometimes, when using ping for example, you want to be	Err	bitreich.org	70
i+notified only after it fails for a few cycles, not at first failure.	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+    :try INTEGER	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+	Err	bitreich.org	70
i Overview	Err	bitreich.org	70
i --------	Err	bitreich.org	70
i As of this commit, reed-alert ships with the following probes:	Err	bitreich.org	70
1diff --git a/config.lisp.sample b/config.lisp.sample	/scm/reed-alert/file/config.lisp.sample.gph	bitreich.org	70
i@@ -1,8 +1,8 @@	Err	bitreich.org	70
i (load "functions.lisp")	Err	bitreich.org	70
i 	Err	bitreich.org	70
i-(alert mail "echo -n 'Problem with %function% %date% %params%' | mail -s alarm mail@isp.net")	Err	bitreich.org	70
i-(alert sms  "/home/user/sms.sh '%date% %function% %params% %hostname%")	Err	bitreich.org	70
i-(alert available-variables "REMINDER : %function% %params% %date% %hostname% %desc% %level% %os% %newline% %result%")	Err	bitreich.org	70
i+(alert mail "echo -n '[%state%] Problem with %function% %date% %params%' | mail -s '[%state%] alarm' mail@isp.net")	Err	bitreich.org	70
i+(alert sms  "/home/user/sms.sh '%date% %state% %function% %params% %hostname%'")	Err	bitreich.org	70
i+(alert available-variables "REMINDER : %function% %params% %date% %hostname% %desc% %level% %os% %newline% %result% %state%")	Err	bitreich.org	70
i (alert empty "")	Err	bitreich.org	70
i 	Err	bitreich.org	70
i 	Err	bitreich.org	70
1diff --git a/example.lisp b/example.lisp	/scm/reed-alert/file/example.lisp.gph	bitreich.org	70
i@@ -1,9 +1,9 @@	Err	bitreich.org	70
i (load "functions.lisp")	Err	bitreich.org	70
i 	Err	bitreich.org	70
i-(alert dont-use-it "REMINDER %function% %params% %date% %hostname% %desc% %level% %os% %newline% _ %space% %result%")	Err	bitreich.org	70
i+(alert dont-use-it "REMINDER %state% %function% %params% %date% %hostname% %desc% %level% %os% %newline% _ %space% %result%")	Err	bitreich.org	70
i (alert empty "")	Err	bitreich.org	70
i (alert mail "")	Err	bitreich.org	70
i-(alert peroket "echo 'problem at %date% with %function% %params%'")	Err	bitreich.org	70
i+(alert peroket "echo '%state% problem at %date% with %function% %params% : %result%'")	Err	bitreich.org	70
i (alert sms "echo -n '%date% %function% CRITICAL on %hostname%' | curl http://somewebservice")	Err	bitreich.org	70
i ;(alert mail "echo -n '%date% %hostname% had problem on %function% %newline% %params% values %result% %newline%	Err	bitreich.org	70
i ;                      %desc%' | mail -s '[Error] %function% - %hostname%' foo@bar.com")	Err	bitreich.org	70
i@@ -15,8 +15,8 @@	Err	bitreich.org	70
i (=> peroket disk-usage   :path "/tmp" :limit 0) ;; failure	Err	bitreich.org	70
i 	Err	bitreich.org	70
i ;; check if :path file exists	Err	bitreich.org	70
i-(=> mail file-exists  :path "/bsd.rd" :desc "OpenBSD kernel /bsd.rd")	Err	bitreich.org	70
i-(=> empty file-exists  :path "/non-existant-file") ;; failure file not found	Err	bitreich.org	70
i+(=> mail  file-exists  :path "/bsd.rd" :desc "OpenBSD kernel /bsd.rd")	Err	bitreich.org	70
i+(=> empty file-exists  :path "/non-existant-file" :try 1) ;; failure file not found	Err	bitreich.org	70
i 	Err	bitreich.org	70
i ;; check if :path file exists and has been updated since :limit minutes	Err	bitreich.org	70
i (=> empty file-updated :path "/var/log/messages" :limit 400)	Err	bitreich.org	70
1diff --git a/functions.lisp b/functions.lisp	/scm/reed-alert/file/functions.lisp.gph	bitreich.org	70
i@@ -1,6 +1,8 @@	Err	bitreich.org	70
i (require 'asdf)	Err	bitreich.org	70
i 	Err	bitreich.org	70
i+(defparameter *tries* 3)	Err	bitreich.org	70
i (defparameter *alerts* '())	Err	bitreich.org	70
i+(ensure-directories-exist "states/")	Err	bitreich.org	70
i 	Err	bitreich.org	70
i (defun color(num1 num2)	Err	bitreich.org	70
i   (format nil "~a[~a;~am" #\Escape num1 num2))	Err	bitreich.org	70
i@@ -57,9 +59,10 @@	Err	bitreich.org	70
i      (push (list ',name ,string)	Err	bitreich.org	70
i                 *alerts*)))	Err	bitreich.org	70
i 	Err	bitreich.org	70
i-(defun trigger-alert(level function params result)	Err	bitreich.org	70
i+(defun trigger-alert(level function params result state)	Err	bitreich.org	70
i   (let* ((notifier-command (assoc level *alerts*))	Err	bitreich.org	70
i          (command-string (cadr notifier-command)))	Err	bitreich.org	70
i+    (setf command-string (replace-all command-string "%state%"    (if (eql 'error state) "Start" "End")))	Err	bitreich.org	70
i     (setf command-string (replace-all command-string "%result%"   (format nil "~a" result)))	Err	bitreich.org	70
i     (setf command-string (replace-all command-string "%hostname%" (machine-instance)))	Err	bitreich.org	70
i     (setf command-string (replace-all command-string "%os%"       (software-type)))	Err	bitreich.org	70
i@@ -85,15 +88,53 @@	Err	bitreich.org	70
i 	Err	bitreich.org	70
i (defun =>(level fonction &rest params)	Err	bitreich.org	70
i   (format t "[~a~a ~20A~a] ~45A" *yellow* level fonction *white* (getf params :desc params))	Err	bitreich.org	70
i-  (let ((hash (fnv-hash (format nil "~{~a~}" (nconc (list level fonction) (remove-if #'symbolp params)))))	Err	bitreich.org	70
i-        (result (funcall fonction params)))	Err	bitreich.org	70
i+  (let* ((hash (fnv-hash (format nil "~{~a~}" (remove-if #'symbolp params))))	Err	bitreich.org	70
i+         (result (funcall fonction params))	Err	bitreich.org	70
i+         (filename (format nil "~a-~a-~a" level fonction hash))	Err	bitreich.org	70
i+         (filepath (format nil "states/~a" filename)))	Err	bitreich.org	70
i     (if (not (listp result))	Err	bitreich.org	70
i         (progn	Err	bitreich.org	70
i-          (format t " => ~asuccess~a~%" *green* *white*)	Err	bitreich.org	70
i+          (if (probe-file filepath)	Err	bitreich.org	70
i+              ;; last time was a failure	Err	bitreich.org	70
i+              (progn	Err	bitreich.org	70
i+                (uiop:run-program (trigger-alert level fonction params t 'success) :output t)	Err	bitreich.org	70
i+                (delete-file filepath)	Err	bitreich.org	70
i+                (format t " => ~afailure => success~a~%" *green* *white*))	Err	bitreich.org	70
i+              ;; last time was a success	Err	bitreich.org	70
i+              (format t " => ~asuccess~a~%" *green* *white*))	Err	bitreich.org	70
i+          ;; we return t because it's ok	Err	bitreich.org	70
i           t)	Err	bitreich.org	70
i+	Err	bitreich.org	70
i         (progn	Err	bitreich.org	70
i-          (format t " => ~aerror~a~%" *red* *white*)	Err	bitreich.org	70
i-          (uiop:run-program (trigger-alert level fonction params (cadr result)) :output t)	Err	bitreich.org	70
i+          (if (probe-file filepath)	Err	bitreich.org	70
i+              ;; error before	Err	bitreich.org	70
i+              ;; but how many ?	Err	bitreich.org	70
i+              (with-open-file (stream filepath :direction :input)	Err	bitreich.org	70
i+                (let ((tries (parse-integer (read-line stream 0 nil))))	Err	bitreich.org	70
i+                  (format t " => ~aerror (~a failures before)~a~%" *red* tries *white*)	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+                  ;; limit reached with this failure, send alert once	Err	bitreich.org	70
i+                  (when (= (+ 1 tries) (getf params :try *tries*))	Err	bitreich.org	70
i+                    (uiop:run-program (trigger-alert level fonction params (cadr result) 'error) :output t))	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+                  ;; increment the file	Err	bitreich.org	70
i+                  (progn	Err	bitreich.org	70
i+                    (with-open-file (stream-out filepath :direction :output	Err	bitreich.org	70
i+                                                :if-exists :supersede)	Err	bitreich.org	70
i+                      (format stream-out "~a~%~a~%" (+ 1 tries) params)))))	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+              ;; file doesn't exist	Err	bitreich.org	70
i+              (with-open-file (stream-out filepath :direction :output	Err	bitreich.org	70
i+                                          :if-exists :supersede)	Err	bitreich.org	70
i+                (format t " => ~aerror (first failure)~a~%" *red* *white*)	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+                ;; maybe we would be warned at first error ?	Err	bitreich.org	70
i+                ;; code is duplicated from above because it	Err	bitreich.org	70
i+                ;; requires reading the non existent file	Err	bitreich.org	70
i+                (when (= 1 (getf params :try *tries*))	Err	bitreich.org	70
i+                  (uiop:run-program (trigger-alert level fonction params (cadr result) 'error) :output t))	Err	bitreich.org	70
i+	Err	bitreich.org	70
i+                (format stream-out "1~%~a~%" params)))	Err	bitreich.org	70
i           nil))))	Err	bitreich.org	70
i 	Err	bitreich.org	70
i (load "probes.lisp")	Err	bitreich.org	70
.
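The counting rule introduced by this commit can be easier to follow without the file handling around it. Below is a minimal sketch in Common Lisp, not code from reed-alert: the function name next-state and its argument names are hypothetical. It assumes the behaviour described in the README, where the "Start" notification goes out once, on the failure that brings the count up to the :try value, and, as in the diff, "End" goes out on the first success that follows any recorded failure.

```lisp
;; Hypothetical helper, not part of reed-alert: a pure model of the
;; failure counter that the commit stores in one file per probe.
(defun next-state (previous-failures success-p &optional (try 3))
  "Return two values: the new failure count (NIL when no state is kept)
and the notification to send (\"Start\", \"End\" or NIL)."
  (cond
    ;; success and no failure recorded: nothing to do
    ((and success-p (null previous-failures))
     (values nil nil))
    ;; success after at least one recorded failure: the state is
    ;; cleared and the end of the problem is announced
    (success-p
     (values nil "End"))
    ;; failure: count it, and announce the start only when the count,
    ;; including this failure, is exactly TRY
    (t
     (let ((failures (+ 1 (or previous-failures 0))))
       (values failures
               (when (= failures try) "Start"))))))

;; Walk through four failures followed by a success, printing the
;; notification (if any) produced at each step.
(let ((count nil))
  (dolist (ok '(nil nil nil nil t))
    (multiple-value-bind (new-count notification) (next-state count ok)
      (setf count new-count)
      (format t "~a: ~a~%" (if ok "success" "failure") notification))))
```

The count passed in plays the role of the number read from the state file, and a NIL count stands for a missing file. Comparing with = rather than >= is what keeps the notifier from firing again on every later failure.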