Revision 6c13e1d9
Added extinfo to list affected logs and improved documentation
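
As a rough illustration of what this commit adds, the sketch below accumulates a per-service match total and a per-service list of affected logfiles through bash indirect expansion (`${!name}`), the same mechanism the diff uses. The helper names `sanitize` and `record_matches` and the example paths are hypothetical and do not appear in the plugin. The sketch also assigns with `printf -v` rather than the plugin's `typeset`, because `typeset` inside a helper function would create a local variable.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and paths are assumptions, not plugin code.

# Turn a service name into a string that is safe inside a variable name,
# following the two sed expressions visible in the diff.
sanitize() {
    echo "$1" | sed -E 's/^[^a-zA-Z]+//' | sed -E 's/[^a-zA-Z0-9]+/_/g'
}

# record_matches SERVICE LOGFILE MATCH_COUNT
# Adds MATCH_COUNT to <service>_total and, only when there were matches,
# appends LOGFILE to <service>_extinfo.
record_matches() {
    local svcnm counter_var extinfo_var
    svcnm="$(sanitize "$1")"
    counter_var="${svcnm}_total"
    extinfo_var="${svcnm}_extinfo"

    if [ "$3" -gt 0 ]; then
        # ${!counter_var} reads the variable whose name is stored in counter_var.
        # printf -v assigns by name; the target is global because it is not
        # declared local in this function.
        printf -v "$counter_var" '%s' "$(( ${!counter_var:-0} + $3 ))"
        printf -v "$extinfo_var" '%s' "${!extinfo_var}$2, "
    fi
}

record_matches "www.example.com" "/srv/www/my-site/logs/app.log" 3
record_matches "www.example.com" "/srv/www/my-site/logs/errors.log" 0
record_matches "www.example.com" "/srv/www/my-site/logs/php-fpm.log" 2

echo "www_example_com.value ${www_example_com_total:-0}"
# prints: www_example_com.value 5
echo "www_example_com.extinfo ${www_example_com_extinfo}"
# lists app.log and php-fpm.log; errors.log had no matches and is omitted
```

Logs with zero matches never reach the extinfo string, which is why the field only names logs that currently contain important events.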
| plugins/logs/service_events | ||
|---|---|---|
| 16 | 16 |
The idea is that any given service may produce events in various areas of |
| 17 | 17 |
operation. For example, while a typical web app might log runtime errors |
| 18 | 18 |
to its app.log file, a filesystem change may prevent the whole app from |
| 19 |
event being bootstrapped, which may be logged in an apache log or in syslog. |
|
| 19 |
even being bootstrapped, and this crucial error may be logged in an apache |
|
| 20 |
log or in syslog. |
|
| 21 |
|
|
| 22 |
This plugin attempts to give visibility into all such "important events" |
|
| 23 |
that may affect the proper functioning of a given service. It attempts to |
|
| 24 |
answer the question, "Is my service running normally?". |
|
| 20 | 25 |
|
| 21 |
This plugin attempts to answer the question, "how is my service doing?". |
|
| 22 | 26 |
Unfortunately, it won't help you trace down exactly where the events are |
| 23 | 27 |
coming from if you happen to be watching a number of different logs, but |
| 24 | 28 |
it will at least let you know that something is wrong and that action |
| 25 |
should be taken. |
|
| 29 |
should be taken. To try to help with this, the plugin uses the extinfo |
|
| 30 |
field to list which logs currently have important events in them. |
|
| 26 | 31 |
|
| 27 | 32 |
The plugin can be included multiple times to create graphs for various |
| 28 | 33 |
differing kinds of services. For example, you may have both webservices |
| ... | ... | |
| 30 | 35 |
different ways. |
| 31 | 36 |
|
| 32 | 37 |
You can accomplish this by linking the plugin twice with different names |
| 33 |
and providing different configuration for each instance. |
|
| 38 |
and providing different configuration for each instance. In general, you |
|
| 39 |
should think of a single instance of this plugin as representing a single |
|
| 40 |
class of services. |
|
| 41 |
|
|
| 34 | 42 |
|
| 35 | 43 |
=head1 CONFIGURATION |
| 36 | 44 |
|
| ... | ... | |
| 80 | 88 |
both /var/log/my-site/errors.log and /srv/www/my-site/logs/app.log to the |
| 81 | 89 |
defined my-site service. |
| 82 | 90 |
|
| 91 |
|
|
| 83 | 92 |
=head2 SERVICE AUTOCONF |
| 84 | 93 |
|
| 85 | 94 |
Because services are often dynamic and you don't want to have to manually update |
| ... | ... | |
| 93 | 102 |
If you choose not to use the autoconf feature, you MUST specify services as a |
| 94 | 103 |
space-separated list of service names in the \`services\` variable. |
| 95 | 104 |
|
| 96 |
=head2 EXAMPLE CONFIG |
|
| 105 |
|
|
| 106 |
=head2 EXAMPLE CONFIGS |
|
| 107 |
|
|
| 108 |
This example uses services autoconf: |
|
| 97 | 109 |
|
| 98 | 110 |
[service_events] |
| 99 | 111 |
user root |
| ... | ... | |
| 106 | 118 |
env.apache_regex error|alert|crit|emerg |
| 107 | 119 |
env.warning 1 |
| 108 | 120 |
env.critical 5 |
| 109 |
env.my_special_service_warning 100
|
|
| 121 |
env.my_special_service_warning 100 |
|
| 110 | 122 |
env.my_special_service_critical 300 |
| 111 | 123 |
|
| 124 |
This example DOESN'T use services autoconf: |
|
| 125 |
|
|
| 126 |
[service_events] |
|
| 127 |
user root |
|
| 128 |
env.services auth.example.com admin.example.com www.example.com |
|
| 129 |
env.auth_example_com_logbinding my-custom-binding[0-9]+ |
|
| 130 |
env.cfxsvc_logfiles /srv/*/*/logs/app.log |
|
| 131 |
env.cfxsvc_regex error|alert|crit|emerg |
|
| 132 |
env.phpfpm_logfiles /srv/*/*/logs/php-fpm*.log |
|
| 133 |
env.phpfpm_regex Fatal error |
|
| 134 |
env.apache_logfiles /srv/*/*/logs/errors.log |
|
| 135 |
env.apache_regex error|alert|crit|emerg |
|
| 136 |
env.warning 1 |
|
| 137 |
env.critical 5 |
|
| 138 |
env.auth_example_com_warning 100 |
|
| 139 |
env.auth_example_com_critical 300 |
|
| 140 |
env.www_example_com_warning 50 |
|
| 141 |
env.www_example_com_critical 100 |
|
| 142 |
|
|
| 143 |
This graph will ONLY ever show values for the three listed services, even |
|
| 144 |
if other services are installed whose logfiles match the logfiles search. |
|
| 145 |
|
|
| 146 |
Also notice that in this example, we've only listed a log binding for the |
|
| 147 |
auth service. The plugin will use the service name by default for any |
|
| 148 |
services that don't specify a log binding, so in this case, auth has a |
|
| 149 |
custom log binding, while all other services have log bindings equal to |
|
| 150 |
their names. |
|
| 151 |
|
|
| 112 | 152 |
|
| 113 | 153 |
=head1 AUTHOR |
| 114 | 154 |
|
| 115 | 155 |
Kael Shipman <kael.shipman@gmail.com> |
| 116 | 156 |
|
| 157 |
|
|
| 117 | 158 |
=head1 LICENSE |
| 118 | 159 |
|
| 119 | 160 |
MIT LICENSE |
| ... | ... | |
| 138 | 179 |
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR |
| 139 | 180 |
OTHER DEALINGS IN THE SOFTWARE. |
| 140 | 181 |
|
| 182 |
|
|
| 141 | 183 |
=head1 MAGIC MARKERS |
| 142 | 184 |
|
| 143 | 185 |
#%# family=manual |
| ... | ... | |
| 245 | 287 |
echo "graph_args --base 1000 -l 0" |
| 246 | 288 |
echo "graph_vlabel ${vlabel}"
|
| 247 | 289 |
echo "graph_category other" |
| 248 |
echo "graph_info Lists number of matching lines found in various logfiles associated with each service" |
|
| 290 |
echo "graph_info Lists number of matching lines found in various logfiles associated with each service. Extinfo displays currently affected logs."
|
|
| 249 | 291 |
|
| 250 | 292 |
local var_prefix |
| 251 | 293 |
while read -u 3 -r svc; do |
| ... | ... | |
| 266 | 308 |
local curstate="$(cat "$MUNIN_STATEFILE")" |
| 267 | 309 |
local nextstate=() |
| 268 | 310 |
|
| 269 |
local n svcnm varnm service svc svc_counter logbinding logfile lognm logmatch prvlines curlines matches |
|
| 311 |
local n svcnm varnm service svc svc_counter logbinding logfile lognm logmatch prvlines curlines matches extinfo_var
|
|
| 270 | 312 |
|
| 271 | 313 |
# Set service counters to 0 and set any logbindings that aren't yet set |
| 272 | 314 |
while read -u 3 -r svc; do |
| ... | ... | |
| 306 | 348 |
svcnm="$(echo "$service" | sed -r 's/^[^a-zA-Z]+//g' | sed -r 's/[^a-zA-Z0-9]+/_/g')" |
| 307 | 349 |
lognm="$(echo "$logfile" | sed -r 's/^[^a-zA-Z]+//g' | sed -r 's/[^a-zA-Z0-9]+/_/g')" |
| 308 | 350 |
|
| 309 |
# Get previous line count to determine whether or not the file may have been rotated |
|
| 351 |
# Get previous line count to determine whether or not the file may have been rotated (defaulting to 0)
|
|
| 310 | 352 |
prvlines="$(echo "$curstate" | grep "^${lognm}_lines=" | cut -f 2 -d "=")"
|
| 311 |
if [ -z "$prvlines" ]; then |
|
| 312 |
prvlines=0 |
|
| 313 |
fi |
|
| 353 |
prvlines="${prvlines:-0}"
|
|
| 314 | 354 |
|
| 315 |
# Get the current number of lines in the file |
|
| 355 |
# Get the current number of lines in the file (defaulting to 0 on error)
|
|
| 316 | 356 |
curlines="$(wc -l < "$logfile")" |
| 317 |
if ! [ "$curlines" -eq "$curlines" ] &>/dev/null; then |
|
| 318 |
curlines=0 |
|
| 319 |
fi |
|
| 357 |
curlines="${curlines:-0}"
|
|
| 320 | 358 |
|
| 321 | 359 |
# If the current line count is less than the previous line count, we've probably rotated. |
| 322 | 360 |
# Reset to 0. |
| ... | ... | |
| 330 | 368 |
logmatch="${LOGFILEMAP[$n]}_regex"
|
| 331 | 369 |
matches="$(tail -n +"$prvlines" "$logfile" | grep -Ec "${!logmatch}" || true)"
|
| 332 | 370 |
|
| 333 |
# Aggregate and add to the correct service counter |
|
| 334 |
svc_counter="${svcnm}_total"
|
|
| 335 |
!((matches+=${!svc_counter}))
|
|
| 336 |
typeset "$svc_counter=$matches" |
|
| 371 |
# If there were matches, aggregate them and add this log to the extinfo for the service |
|
| 372 |
if [ "$matches" -gt 0 ]; then |
|
| 373 |
# Aggregate and add to the correct service counter |
|
| 374 |
svc_counter="${svcnm}_total"
|
|
| 375 |
!((matches+=${!svc_counter}))
|
|
| 376 |
typeset "$svc_counter=$matches" |
|
| 377 |
|
|
| 378 |
# Add this log to extinfo for service |
|
| 379 |
extinfo_var="${svcnm}_extinfo"
|
|
| 380 |
typeset "$extinfo_var=${!extinfo_var}$logfile, "
|
|
| 381 |
fi |
|
| 337 | 382 |
|
| 338 | 383 |
# Push onto next state |
| 339 | 384 |
nextstate+=("${lognm}_lines=$curlines")
|
| ... | ... | |
| 348 | 393 |
while read -u 3 -r svc; do |
| 349 | 394 |
svcnm="$(echo "$svc" | sed -r 's/^[^a-zA-Z]+//g' | sed -r 's/[^a-zA-Z0-9]+/_/g')" |
| 350 | 395 |
svc_counter="${svcnm}_total"
|
| 396 |
extinfo_var="${svcnm}_extinfo"
|
|
| 351 | 397 |
echo "${svcnm}.value ${!svc_counter}"
|
| 398 |
echo "${svcnm}.extinfo ${!extinfo_var}"
|
|
| 352 | 399 |
done 3< <(IFS=$'\n'; echo "${services[*]}")
|
| 353 | 400 |
|
| 354 | 401 |
return 0 |
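
The line-count handling touched by this diff (previous count defaulting to 0, current count from `wc -l`, reset on suspected rotation, then counting regex matches in the unseen tail) can be sketched as a standalone function. This is a minimal sketch under stated assumptions: `count_new_matches` is a hypothetical name, the state is passed in as an argument instead of being read from `$MUNIN_STATEFILE`, and the sketch starts reading at line `prvlines + 1`, whereas the diff passes `prvlines` directly to `tail -n +`.

```shell
#!/usr/bin/env bash
# Sketch only: count_new_matches is a hypothetical helper, not plugin code.

# count_new_matches LOGFILE REGEX [PREVIOUS_LINE_COUNT]
# Prints the number of lines added since the previous run that match REGEX.
count_new_matches() {
    local logfile="$1" regex="$2" prvlines="${3:-0}" curlines

    # An unreadable file contributes no matches.
    if ! [ -r "$logfile" ]; then
        echo 0
        return 0
    fi

    # Current number of lines, defaulting to 0 if wc produced nothing.
    curlines="$(wc -l < "$logfile")"
    curlines="${curlines//[[:space:]]/}"   # some wc implementations pad the number
    curlines="${curlines:-0}"

    # Fewer lines than last time suggests the log was rotated: start over.
    if [ "$curlines" -lt "$prvlines" ]; then
        prvlines=0
    fi

    # tail -n +K starts output at line K, so K = prvlines + 1 skips seen lines.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    tail -n "+$((prvlines + 1))" "$logfile" | grep -Ec "$regex" || true
}
```

A caller would store the current line count after each run and pass it back as the third argument on the next one, for example `count_new_matches /var/log/my-site/errors.log 'error|alert|crit|emerg' 120`. Note that the `${var:-0}` default only covers an empty value; it does not replace a non-numeric one, which the removed `-eq` test used to catch.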