#!/bin/bash

: <<=cut

=head1 NAME

 emc_vnx_block_lun_perfdata - Plugin to monitor Block statistics of EMC VNX 5300 Unified Storage Processors

=head1 AUTHOR

 Evgeny Beysembaev <megabotva@gmail.com>

=head1 LICENSE

 GPLv2

=head1 MAGIC MARKERS

  #%# family=auto
  #%# capabilities=autoconf

    
=head1 DESCRIPTION

 The plugin monitors the LUNs of EMC Unified Storage FLARE SPs. It is probably also compatible with
 other Clariion systems. It uses SSH to connect to the Control Stations, remotely executes
 /nas/sbin/navicli there, and parses its output. The plugin can easily be reconfigured to use a locally
 installed CLI under /opt/Navisphere instead of the Control Stations' navicli. It makes no difference which
 Storage Processor is queried, so the plugin tries both and uses the first active one.
 The plugin also automatically picks the Primary Control Station from the list by calling
 /nasmcd/sbin/getreason and /nasmcd/sbin/t2slot.

 I left some rudimentary parts in the plugin to make it easy to reconfigure it to draw more (or less) data.

    
=head1 COMPATIBILITY

 The plugin was written for the EMC VNX5300 storage system, as this is the only EMC storage I have.
 I am fairly sure it also works with other VNX1 systems, such as the VNX5100 and VNX5500, and with
 old-style Clariion systems.
 I do not know whether the plugin works with the VNX2 series; it may need some corrections in the
 command-line backend. The same applies to other EMC systems, so I encourage you to try it and fix the plugin.
 
=head1 CONFIGURATION

=head2 Prerequisites

 First of all, make sure that statistics collection is turned on. You can do this by typing:
 navicli -h spa setstats -on
 on your Control Station or locally through /opt/Navisphere.

 The plugin also relies heavily on Munin's buggy "cdef" feature, so it can be hit by the following bugs:
 http://munin-monitoring.org/ticket/1017 - The plugin contains some workarounds; make sure they work for you.
 http://munin-monitoring.org/ticket/1352 - Metrics in this plugin can be much longer than 15 characters,
 so you have to edit the following file:
 /usr/share/perl5/Munin/Master/GraphOld.pm
 Find the get_field_name() function and change "15" to "255".

    
=head2 Installation

 The plugin uses SSH to connect to the Control Stations. It is possible to use the 'nasadmin' user, but it is
 better to create a read-only global user with the Unisphere Client. The user should have only the Operator role.
 I created an "operator" user, but because the Control Stations already had an internal "operator" user,
 the new one was named "operator1". So be careful.

 On the munin-node side, choose a user that will connect through SSH. Generally the "munin" user is fine. Then
 run "sudo su munin -s /bin/bash", "ssh-keygen", and "ssh-copy-id" to both Control Stations with the newly
 created user.

 Make a link from /usr/share/munin/plugins/emc_vnx_block_lun_perfdata to
 /etc/munin/plugins/emc_vnx_block_lun_perfdata_<NAME>, where <NAME> is an arbitrary name for your storage
 system. The plugin returns <NAME> in its answer as the "host_name" field.
 Assume your storage system is called "VNX5300".

 Make a configuration file at /etc/munin/plugin-conf.d/emc_vnx_block_lun_perfdata_VNX5300

 [emc_vnx_block_lun_perfdata_VNX5300]
 user munin							# SSH Client local user
 env.username operator1						# Remote user with Operator role
 env.cs_addr 192.168.1.1 192.168.1.2				# Control Stations addresses

    
=head1 ERRATA

 Queue Length is not computed entirely correctly. The counters are totals over both SPs, but they are then
 scaled separately by the load of SPA and of SPB. Still, in most AAA / ALUA cases the formula is correct.

=head1 HISTORY

 09.11.2016 - First Release
 26.12.2016 - Compatibility with Munin coding style

=cut

    
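export LANG=C
# Storage name: the sixth underscore-separated field of the plugin link name
TARGET=$(echo "${0##*/}" | cut -d _ -f 6)
SPALL="SPA SPB"
NAVICLI="/nas/sbin/navicli"

# Print the status code of the Control Station at address $1
# (check_conf expects "10" for the primary one)
ssh_check() {
	ssh -q "$username@$1" "/nasmcd/sbin/getreason | grep -w slot_\`/nasmcd/sbin/t2slot\` | cut -d- -f1"
}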

    
check_conf () {
	if [ -z "$username" ]; then
		echo "No username ('username' environment variable)!"
		return 1
	fi

	if [ -z "$cs_addr" ]; then
		echo "No control station addresses ('cs_addr' environment variable)!"
		return 1
	fi

	# Choose the Control Station. The code has to be "10"
	for CS in $cs_addr; do
		if [[ "10" -eq "$(ssh_check $CS)" ]]; then
			PRIMARY_CS=$CS
			break
		fi
	done

	if [ -z "$PRIMARY_CS" ]; then
		echo "No alive primary Control Station from list \"$cs_addr\"";
		return 1
	fi
	return 0
}

    
if [ "$1" = "autoconf" ]; then
	check_conf_ans=$(check_conf)
	if [ $? -eq 0 ]; then
		echo "yes"
	else
		echo "no ($check_conf_ans)"
	fi
	exit 0
fi

check_conf
if [[ $? -eq 1 ]]; then
	exit 1;
fi

    
SSH="ssh -q $username@$PRIMARY_CS "
# Probe both Storage Processors and use the first one that answers
for PROBESP in $SPALL; do
	$SSH $NAVICLI -h $PROBESP > /dev/null 2>&1
	if [ "$?" -eq 0 ]; then SP="$PROBESP"; break; fi
done

if [ -z "$SP" ]; then
	echo "No active Storage Processor found!";
	exit 1;
fi
NAVICLI="/nas/sbin/navicli -h $SP"

# Get the LUN list
LUNLIST="$($SSH $NAVICLI lun -list -drivetype | sed -ne 's/^Name:\ *//p')"

echo -e "host_name ${TARGET}\n"

    
if [ "$1" = "config" ] ; then
	cat <<-EOF
	multigraph emc_vnx_block_blocks
	graph_category disk
	graph_title EMC VNX 5300 LUN Blocks
	graph_vlabel Blocks Read (-) / Written (+)
	graph_args --base 1000

	EOF

	while read -r LUN ; do
		cat <<-EOF
		${LUN}_read.label none
		${LUN}_read.graph no
		${LUN}_read.min 0
		${LUN}_read.draw AREA
		${LUN}_read.type COUNTER
		${LUN}_write.label $LUN Blocks
		${LUN}_write.negative ${LUN}_read
		${LUN}_write.type COUNTER
		${LUN}_write.min 0
		${LUN}_write.draw STACK
		EOF
	done <<< "$LUNLIST"

    
	cat <<-EOF

	multigraph emc_vnx_block_req
	graph_category disk
	graph_title EMC VNX 5300 LUN Requests
	graph_vlabel Requests: Read (-) / Write (+)
	graph_args --base 1000

	EOF
	while read -r LUN ; do
		cat <<-EOF
		${LUN}_readreq.label none
		${LUN}_readreq.graph no
		${LUN}_readreq.min 0
		${LUN}_readreq.type COUNTER
		${LUN}_writereq.label $LUN Requests
		${LUN}_writereq.negative ${LUN}_readreq
		${LUN}_writereq.type COUNTER
		${LUN}_writereq.min 0
		EOF
	done <<< "$LUNLIST"

    
	cat <<-EOF

	multigraph emc_vnx_block_ticks
	graph_category disk
	graph_title EMC VNX 5300 Counted Load per LUN
	graph_vlabel Load, % * Number of LUNs
	graph_args --base 1000 -l 0 -r
	EOF
	echo -n "graph_order "
	while read -r LUN ; do
		echo -n "${LUN}_busyticks ${LUN}_idleticks ${LUN}_bta=${LUN}_busyticks_spa ${LUN}_idleticks_spa ${LUN}_btb=${LUN}_busyticks_spb ${LUN}_idleticks_spb "
	done <<< "$LUNLIST"
	echo ""
	while read -r LUN ; do
		cat <<-EOF
		${LUN}_busyticks_spa.label $LUN Busy Ticks SPA
		${LUN}_busyticks_spa.type COUNTER
		${LUN}_busyticks_spa.graph no
		${LUN}_bta.label $LUN Busy Ticks SPA
		${LUN}_bta.graph no
		${LUN}_idleticks_spa.label $LUN Idle Ticks SPA
		${LUN}_idleticks_spa.type COUNTER
		${LUN}_idleticks_spa.graph no
		${LUN}_busyticks_spb.label $LUN Busy Ticks SPB
		${LUN}_busyticks_spb.type COUNTER
		${LUN}_busyticks_spb.graph no
		${LUN}_btb.label $LUN Busy Ticks SPB
		${LUN}_btb.graph no
		${LUN}_idleticks_spb.label $LUN Idle Ticks SPB
		${LUN}_idleticks_spb.type COUNTER
		${LUN}_idleticks_spb.graph no
		${LUN}_load_spa.label $LUN load SPA
		${LUN}_load_spa.draw AREASTACK
		${LUN}_load_spb.label $LUN load SPB
		${LUN}_load_spb.draw AREASTACK
		${LUN}_load_spa.cdef 100,${LUN}_bta,${LUN}_busyticks_spa,${LUN}_idleticks_spa,+,/,*
		${LUN}_load_spb.cdef 100,${LUN}_btb,${LUN}_busyticks_spb,${LUN}_idleticks_spb,+,/,*
		EOF
	done <<< "$LUNLIST"

    
	cat <<-EOF

	multigraph emc_vnx_block_outstanding
	graph_category disk
	graph_title EMC VNX 5300 Sum of Outstanding Requests
	graph_vlabel Requests
	graph_args --base 1000
	EOF
	while read -r LUN ; do
		cat <<-EOF
		${LUN}_outstandsum.label $LUN
		${LUN}_outstandsum.type COUNTER
		EOF
	done <<< "$LUNLIST"

    
	cat <<-EOF

	multigraph emc_vnx_block_nonzeroreq
	graph_category disk
	graph_title EMC VNX 5300 Non-Zero Request Count Arrivals
	graph_vlabel Count Arrivals
	graph_args --base 1000
	EOF
	while read -r LUN ; do
		cat <<-EOF
		${LUN}_nonzeroreq.label $LUN
		${LUN}_nonzeroreq.type COUNTER
		EOF
	done <<< "$LUNLIST"

    
	cat <<-EOF

	multigraph emc_vnx_block_trespasses
	graph_category disk
	graph_title EMC VNX 5300 Trespasses
	graph_vlabel Trespasses
	EOF
	while read -r LUN ; do
		cat <<-EOF
		${LUN}_implic_tr.label ${LUN} Implicit Trespasses
		${LUN}_explic_tr.label ${LUN} Explicit Trespasses
		EOF
	done <<< "$LUNLIST"

    
	cat <<-EOF

	multigraph emc_vnx_block_queue
	graph_category disk
	graph_title EMC VNX 5300 Counted Block Queue Length
	graph_vlabel Length
	EOF
	while read -r LUN ; do
		cat <<-EOF
		${LUN}_busyticks_spa.label ${LUN}
		${LUN}_busyticks_spa.graph no
		${LUN}_busyticks_spa.type COUNTER
		${LUN}_idleticks_spa.label ${LUN}
		${LUN}_idleticks_spa.graph no
		${LUN}_idleticks_spa.type COUNTER
		${LUN}_busyticks_spb.label ${LUN}
		${LUN}_busyticks_spb.graph no
		${LUN}_busyticks_spb.type COUNTER
		${LUN}_idleticks_spb.label ${LUN}
		${LUN}_idleticks_spb.graph no
		${LUN}_idleticks_spb.type COUNTER
		${LUN}_outstandsum.label ${LUN}
		${LUN}_outstandsum.graph no
		${LUN}_outstandsum.type COUNTER
		${LUN}_nonzeroreq.label ${LUN}
		${LUN}_nonzeroreq.graph no
		${LUN}_nonzeroreq.type COUNTER
		${LUN}_readreq.label ${LUN}
		${LUN}_readreq.graph no
		${LUN}_readreq.type COUNTER
		${LUN}_writereq.label ${LUN}
		${LUN}_writereq.graph no
		${LUN}_writereq.type COUNTER
		EOF
		# Queue Length SPA = ((Sum of Outstanding Requests SPA - NonZero Request Count Arrivals SPA / 2)
		#   / (Host Read Requests SPA + Host Write Requests SPA))
		#   * (Busy Ticks SPA / (Busy Ticks SPA + Idle Ticks SPA))
		# We count SPA and SPB together, although this is not fully correct
		cat <<-EOF
		${LUN}_ql_l_a.label ${LUN} Queue Length SPA
		${LUN}_ql_l_a.cdef ${LUN}_outstandsum,${LUN}_nonzeroreq,2,/,-,${LUN}_readreq,${LUN}_writereq,+,/,${LUN}_busyticks_spa,*,${LUN}_busyticks_spa,${LUN}_idleticks_spa,+,/
		${LUN}_ql_l_b.label ${LUN} Queue Length SPB
		${LUN}_ql_l_b.cdef ${LUN}_outstandsum,${LUN}_nonzeroreq,2,/,-,${LUN}_readreq,${LUN}_writereq,+,/,${LUN}_busyticks_spb,*,${LUN}_busyticks_spb,${LUN}_idleticks_spb,+,/
		EOF
	done <<< "$LUNLIST"
	exit 0
fi

    
# Prepare one big command for the SPs so that most of the work is done remotely.
BIGSSHCMD="$SSH"
while read -r LUN ; do
	BIGSSHCMD+="$NAVICLI lun -list -name $LUN -perfData |
		sed -ne 's/^Blocks Read\:\ */${LUN}_read.value /p;
		s/^Blocks Written\:\ */${LUN}_write.value /p;
		s/Read Requests\:\ */${LUN}_readreq.value /p;
		s/Write Requests\:\ */${LUN}_writereq.value /p;
		s/Busy Ticks SP A\:\ */${LUN}_busyticks_spa.value /p;
		s/Idle Ticks SP A\:\ */${LUN}_idleticks_spa.value /p;
		s/Busy Ticks SP B\:\ */${LUN}_busyticks_spb.value /p;
		s/Idle Ticks SP B\:\ */${LUN}_idleticks_spb.value /p;
		s/Sum of Outstanding Requests\:\ */${LUN}_outstandsum.value /p;
		s/Non-Zero Request Count Arrivals\:\ */${LUN}_nonzeroreq.value /p;
		s/Implicit Trespasses\:\ */${LUN}_implic_tr.value /p;
		s/Explicit Trespasses\:\ */${LUN}_explic_tr.value /p;
		' ;"
done <<< "$LUNLIST"
ANSWER="$($BIGSSHCMD)"
echo "multigraph emc_vnx_block_blocks"
echo "$ANSWER" | grep "read\.\|write\."
echo -e "\nmultigraph emc_vnx_block_req"
echo "$ANSWER" | grep "readreq\.\|writereq\."

echo -e "\nmultigraph emc_vnx_block_ticks"
while read -r LUN ; do
	echo "${LUN}_load_spa.value 0"
	echo "${LUN}_load_spb.value 0"
done <<< "$LUNLIST"
echo "$ANSWER" | grep "busyticks_spa\.\|idleticks_spa\."
echo "$ANSWER" | grep "busyticks_spb\.\|idleticks_spb\."

echo -e "\nmultigraph emc_vnx_block_outstanding"
echo "$ANSWER" | grep "outstandsum\."

echo -e "\nmultigraph emc_vnx_block_nonzeroreq"
echo "$ANSWER" | grep "nonzeroreq\."

echo -e "\nmultigraph emc_vnx_block_trespasses"
echo "$ANSWER" | grep "implic_tr\.\|explic_tr\."

echo -e "\nmultigraph emc_vnx_block_queue"
# Queue Length
echo "$ANSWER" | grep "busyticks"
echo "$ANSWER" | grep "idleticks."
echo "$ANSWER" | grep "outstandsum\."
echo "$ANSWER" | grep "nonzeroreq\."
echo "$ANSWER" | grep "readreq\."
echo "$ANSWER" | grep "writereq\."
while read -r LUN ; do
	echo "${LUN}_ql_l_a.value 0 "
	echo "${LUN}_ql_l_b.value 0 "
done <<< "$LUNLIST"
exit 0
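
The Queue Length cdef in the config section is written in RPN, which is hard to check by eye. As a rough sanity check, the same arithmetic can be sketched in infix form with awk. The function name queue_length and the sample numbers are made up for illustration and are not part of the plugin; Munin applies the cdef to the rates derived from the COUNTER fields, not to raw counter values as done here.

```shell
#!/bin/bash
# Hypothetical helper (not part of the plugin): evaluates the Queue Length
# formula for one SP in infix form.
# Arguments: outstandsum nonzeroreq readreq writereq busyticks idleticks
queue_length() {
	awk -v os="$1" -v nz="$2" -v rr="$3" -v wr="$4" -v bt="$5" -v it="$6" 'BEGIN {
		# Print "U" (unknown) instead of dividing by zero
		if (rr + wr == 0 || bt + it == 0) { print "U"; exit }
		printf "%.3f\n", ((os - nz / 2) / (rr + wr)) * (bt / (bt + it))
	}'
}

# Invented sample values: (1200 - 400/2) / (300 + 200) * 250 / (250 + 750)
queue_length 1200 400 300 200 250 750   # prints 0.500
```

The zero check is an addition of this sketch; the cdef itself has no such guard.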
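
The plugin turns "Field: value" lines from navicli into Munin "field.value N" lines with sed. The sketch below replays that idea locally on canned text so it can be tried without a storage system. The sample text, the LUN name LUN_0, and the function names are assumptions made for illustration; real navicli output contains many more lines and may be formatted differently.

```shell
#!/bin/bash
# Invented LUN name and sample text; only the field names are taken from
# the sed expressions used by the plugin.
LUN="LUN_0"
sample_perfdata() {
	cat <<'EOF'
Name:  LUN_0
Blocks Read:  1024
Blocks Written:  2048
Read Requests:  10
Write Requests:  20
EOF
}

# Rewrite the matching "Field:  value" lines as "<lun>_<field>.value <n>"
# and drop every other line (-n plus the p flag).
parse_perfdata() {
	sed -ne "s/^Blocks Read: */${LUN}_read.value /p;
		s/^Blocks Written: */${LUN}_write.value /p;
		s/^Read Requests: */${LUN}_readreq.value /p;
		s/^Write Requests: */${LUN}_writereq.value /p"
}

sample_perfdata | parse_perfdata
```

Unlike the plugin, this sketch anchors every pattern with ^ and runs sed locally instead of over SSH.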