root / plugins / emc / emc_vnx_block_lun_perfdata @ a8e1084b


#!/bin/bash

: <<=cut

=head1 NAME

 emc_vnx_block_lun_perfdata - Plugin to monitor Block statistics of EMC VNX 5300 Unified Storage Processors

=head1 AUTHOR

 Evgeny Beysembaev <megabotva@gmail.com>

=head1 LICENSE

 GPLv2

=head1 MAGIC MARKERS

  #%# family=auto
  #%# capabilities=autoconf

    
=head1 DESCRIPTION

 The plugin monitors the LUNs of EMC Unified Storage FLARE SPs. It is probably also compatible with
 other Clariion systems. It uses SSH to connect to the Control Stations, then remotely executes
 /nas/sbin/navicli and parses its output. It is easy to reconfigure the plugin to use a locally
 installed /opt/Navisphere CLI instead of the Control Stations' navicli. It makes no difference which
 Storage Processor is used to gather data, so the plugin tries both and uses the first active one.
 The plugin also automatically chooses the Primary Control Station from the list by calling
 /nasmcd/sbin/getreason and /nasmcd/sbin/t2slot.

 Some parts of this plugin are left rudimentary to make it easy to reconfigure it to draw more (or less) data.

=head1 COMPATIBILITY

 The plugin was written for the EMC VNX5300 storage system, as this is the only EMC storage I have.
 I am fairly sure it also works with other VNX1 systems, such as the VNX5100 and VNX5500, and with
 old-style Clariion systems.
 I don't know whether the plugin works with the VNX2 series; it may need some corrections in the
 command-line backend. The same applies to other EMC systems, so I encourage you to try the plugin
 and fix it.

=head1 CONFIGURATION

    
=head2 Prerequisites

 First of all, make sure that statistics collection is turned on. You can do this by running:
 navicli -h spa setstats -on
 on your Control Station, or locally through /opt/Navisphere.

 The plugin also makes heavy use of Munin's buggy "cdef" feature, so it can be hit by the following bugs:
 http://munin-monitoring.org/ticket/1017 - The plugin contains workarounds for this one; make sure they work.
 http://munin-monitoring.org/ticket/1352 -
 Metric names in this plugin can be much longer than 15 characters, so you have to edit the following file:
 /usr/share/perl5/Munin/Master/GraphOld.pm
 Find the get_field_name() function and change "15" to "255".

    
=head2 Installation

 The plugin uses SSH to connect to the Control Stations. It is possible to use the 'nasadmin' user, but it is
 better to create a read-only global user with Unisphere Client. The user should have only the Operator role.
 I created an "operator" user, but because the Control Stations already had an internal "operator" user,
 the new one was named "operator1". So be careful.

 On the munin-node side, choose a user that will be used to connect through SSH. Generally the "munin" user is
 fine. Then run "sudo su munin -s /bin/bash", "ssh-keygen" and "ssh-copy-id" to both Control Stations with the
 newly created user.

 Make a link from /usr/share/munin/plugins/emc_vnx_block_lun_perfdata to
 /etc/munin/plugins/emc_vnx_block_lun_perfdata_<NAME>, where <NAME> is an arbitrary name for your storage
 system. The plugin returns <NAME> in its answer as the "host_name" field.
 Assume your storage system is called "VNX5300".

 Make a configuration file at /etc/munin/plugin-conf.d/emc_vnx_block_lun_perfdata_VNX5300

 [emc_vnx_block_lun_perfdata_VNX5300]
 user munin							# SSH Client local user
 env.username operator1						# Remote user with Operator role
 env.cs_addr 192.168.1.1 192.168.1.2				# Control Stations addresses

    
=head1 ERRATA

 Queue Length is not calculated entirely correctly. The counters are totals from both SPs, but they are then
 scaled independently by the load of SPA and of SPB. Anyway, in most AAA / ALUA cases the formula is correct.

=head1 HISTORY

 09.11.2016 - First Release
 26.12.2016 - Compatibility with Munin coding style

=cut

    
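export LANG=C
# Storage name: the sixth "_"-separated field of the plugin file name (the <NAME> suffix of the link)
TARGET=$(echo "${0##*/}" | cut -d _ -f 6)
SPALL="SPA SPB"
NAVICLI="/nas/sbin/navicli"
# Prints the getreason code of the Control Station in $CS; expanded later with eval
SSH_CHECK='ssh -q $username@$CS "/nasmcd/sbin/getreason | grep -w slot_\`/nasmcd/sbin/t2slot\` | cut -d- -f1"'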

    
if [ "$1" = "autoconf" ]; then
	echo "yes"
	exit 0
fi

if [ -z "$username" ]; then
	echo "No username!"
	exit 1
fi

if [ -z "$cs_addr" ]; then
	echo "No control station addresses!"
	exit 1
fi

# Choose the Control Station. The getreason code has to be "10".
for CS in $cs_addr; do
	if [[ "10" -eq "$(eval "$SSH_CHECK")" ]]; then
#		echo "$CS is Primary"
		PRIMARY_CS=$CS
		break
	fi
done

    
if [ -z "$PRIMARY_CS" ]; then
	echo "No alive primary Control Station from list \"$cs_addr\""
	exit 1
fi

SSH="ssh -q $username@$PRIMARY_CS "
# Probe each Storage Processor and use the first one that answers
for PROBESP in $SPALL; do
	if $SSH $NAVICLI -h $PROBESP > /dev/null 2>&1; then SP="$PROBESP"; break; fi
done

if [ -z "$SP" ]; then
	echo "No active Storage Processor found!"
	exit 1
fi
NAVICLI="/nas/sbin/navicli -h $SP"

# Get LUN list
#LUNLIST="$($SSH $NAVICLI lun -list -drivetype | grep Name | sed -ne 's/^Name:\ *//p')"
LUNLIST="$($SSH $NAVICLI lun -list -drivetype | sed -ne 's/^Name:\ *//p')"

echo -e "host_name ${TARGET}\n"

    
if [ "$1" = "config" ] ; then
	echo "multigraph emc_vnx_block_blocks
graph_category disk
graph_title EMC VNX 5300 LUN Blocks
graph_vlabel Blocks Read (-) / Written (+)
graph_args --base 1000"
	while read -r LUN ; do
		echo "${LUN}_read.label none
${LUN}_read.graph no
${LUN}_read.min 0
${LUN}_read.draw AREA
${LUN}_read.type COUNTER
${LUN}_write.label $LUN Blocks
${LUN}_write.negative ${LUN}_read
${LUN}_write.type COUNTER
${LUN}_write.min 0
${LUN}_write.draw STACK"
	done <<< "$LUNLIST"

    
	echo -e "\nmultigraph emc_vnx_block_req
graph_category disk
graph_title EMC VNX 5300 LUN Requests
graph_vlabel Requests: Read (-) / Write (+)
graph_args --base 1000"
	while read -r LUN ; do
		echo "${LUN}_readreq.label none
${LUN}_readreq.graph no
${LUN}_readreq.min 0
${LUN}_readreq.type COUNTER
${LUN}_writereq.label $LUN Requests
${LUN}_writereq.negative ${LUN}_readreq
${LUN}_writereq.type COUNTER
${LUN}_writereq.min 0"
	done <<< "$LUNLIST"

    
	echo -e "\nmultigraph emc_vnx_block_ticks
graph_category disk
graph_title EMC VNX 5300 Counted Load per LUN
graph_vlabel Load, % * Number of LUNs
graph_args --base 1000 -l 0 -r "
	echo -n "graph_order "
	while read -r LUN ; do
		echo -n "${LUN}_busyticks ${LUN}_idleticks ${LUN}_bta=${LUN}_busyticks_spa ${LUN}_idleticks_spa ${LUN}_btb=${LUN}_busyticks_spb ${LUN}_idleticks_spb "
	done <<< "$LUNLIST"
	echo ""
	while read -r LUN ; do
		echo "${LUN}_busyticks_spa.label $LUN Busy Ticks SPA
${LUN}_busyticks_spa.type COUNTER
${LUN}_busyticks_spa.graph no
${LUN}_bta.label $LUN Busy Ticks SPA
${LUN}_bta.graph no
${LUN}_idleticks_spa.label $LUN Idle Ticks SPA
${LUN}_idleticks_spa.type COUNTER
${LUN}_idleticks_spa.graph no
${LUN}_busyticks_spb.label $LUN Busy Ticks SPB
${LUN}_busyticks_spb.type COUNTER
${LUN}_busyticks_spb.graph no
${LUN}_btb.label $LUN Busy Ticks SPB
${LUN}_btb.graph no
${LUN}_idleticks_spb.label $LUN Idle Ticks SPB
${LUN}_idleticks_spb.type COUNTER
${LUN}_idleticks_spb.graph no"

    
		echo "${LUN}_load_spa.label $LUN load SPA
${LUN}_load_spa.draw AREASTACK
${LUN}_load_spb.label $LUN load SPB
${LUN}_load_spb.draw AREASTACK
${LUN}_load_spa.cdef 100,${LUN}_bta,${LUN}_busyticks_spa,${LUN}_idleticks_spa,+,/,*
${LUN}_load_spb.cdef 100,${LUN}_btb,${LUN}_busyticks_spb,${LUN}_idleticks_spb,+,/,*
"
	done <<< "$LUNLIST"

    
	echo -e "\nmultigraph emc_vnx_block_outstanding
graph_category disk
graph_title EMC VNX 5300 Sum of Outstanding Requests
graph_vlabel Requests
graph_args --base 1000"
	while read -r LUN ; do
		echo "${LUN}_outstandsum.label $LUN
${LUN}_outstandsum.type COUNTER"
	done <<< "$LUNLIST"

	echo -e "\nmultigraph emc_vnx_block_nonzeroreq
graph_category disk
graph_title EMC VNX 5300 Non-Zero Request Count Arrivals
graph_vlabel Count Arrivals
graph_args --base 1000"
	while read -r LUN ; do
		echo "${LUN}_nonzeroreq.label $LUN
${LUN}_nonzeroreq.type COUNTER"
	done <<< "$LUNLIST"

	echo -e "\nmultigraph emc_vnx_block_trespasses
graph_category disk
graph_title EMC VNX 5300 Trespasses
graph_vlabel Trespasses"
	while read -r LUN ; do
		echo "${LUN}_implic_tr.label ${LUN} Implicit Trespasses
${LUN}_explic_tr.label ${LUN} Explicit Trespasses"
	done <<< "$LUNLIST"

    
	echo -e "\nmultigraph emc_vnx_block_queue
graph_category disk
graph_title EMC VNX 5300 Counted Block Queue Length
graph_vlabel Length"
	while read -r LUN ; do
		echo "${LUN}_busyticks_spa.label ${LUN}
${LUN}_busyticks_spa.graph no
${LUN}_busyticks_spa.type COUNTER
${LUN}_idleticks_spa.label ${LUN}
${LUN}_idleticks_spa.graph no
${LUN}_idleticks_spa.type COUNTER
${LUN}_busyticks_spb.label ${LUN}
${LUN}_busyticks_spb.graph no
${LUN}_busyticks_spb.type COUNTER
${LUN}_idleticks_spb.label ${LUN}
${LUN}_idleticks_spb.graph no
${LUN}_idleticks_spb.type COUNTER
${LUN}_outstandsum.label ${LUN}
${LUN}_outstandsum.graph no
${LUN}_outstandsum.type COUNTER
${LUN}_nonzeroreq.label ${LUN}
${LUN}_nonzeroreq.graph no
${LUN}_nonzeroreq.type COUNTER
${LUN}_readreq.label ${LUN}
${LUN}_readreq.graph no
${LUN}_readreq.type COUNTER
${LUN}_writereq.label ${LUN}
${LUN}_writereq.graph no
${LUN}_writereq.type COUNTER"
# Queue Length SPA = ((Sum of Outstanding Requests SPA - NonZero Request Count Arrivals SPA / 2) / (Host Read Requests SPA + Host Write Requests SPA)) *
# (Busy Ticks SPA / (Busy Ticks SPA + Idle Ticks SPA))
# We count SPA and SPB together, although this is not fully correct
		echo "${LUN}_ql_l_a.label ${LUN} Queue Length SPA
${LUN}_ql_l_a.cdef ${LUN}_outstandsum,${LUN}_nonzeroreq,2,/,-,${LUN}_readreq,${LUN}_writereq,+,/,${LUN}_busyticks_spa,*,${LUN}_busyticks_spa,${LUN}_idleticks_spa,+,/
${LUN}_ql_l_b.label ${LUN} Queue Length SPB
${LUN}_ql_l_b.cdef ${LUN}_outstandsum,${LUN}_nonzeroreq,2,/,-,${LUN}_readreq,${LUN}_writereq,+,/,${LUN}_busyticks_spb,*,${LUN}_busyticks_spb,${LUN}_idleticks_spb,+,/"
	done <<< "$LUNLIST"
exit 0
fi

    
# Prepare one big compound command for the SPs so that most of the work is done remotely.
BIGSSHCMD="$SSH"
while read -r LUN ; do
	BIGSSHCMD+="$NAVICLI lun -list -name $LUN -perfData |
		sed -ne 's/^Blocks Read\:\ */${LUN}_read.value /p;
		s/^Blocks Written\:\ */${LUN}_write.value /p;
		s/Read Requests\:\ */${LUN}_readreq.value /p;
		s/Write Requests\:\ */${LUN}_writereq.value /p;
		s/Busy Ticks SP A\:\ */${LUN}_busyticks_spa.value /p;
		s/Idle Ticks SP A\:\ */${LUN}_idleticks_spa.value /p;
		s/Busy Ticks SP B\:\ */${LUN}_busyticks_spb.value /p;
		s/Idle Ticks SP B\:\ */${LUN}_idleticks_spb.value /p;
		s/Sum of Outstanding Requests\:\ */${LUN}_outstandsum.value /p;
		s/Non-Zero Request Count Arrivals\:\ */${LUN}_nonzeroreq.value /p;
		s/Implicit Trespasses\:\ */${LUN}_implic_tr.value /p;
		s/Explicit Trespasses\:\ */${LUN}_explic_tr.value /p;
		' ;"
done <<< "$LUNLIST"
ANSWER="$($BIGSSHCMD)"
echo "multigraph emc_vnx_block_blocks"
echo "$ANSWER" | grep "read\.\|write\."
echo -e "\nmultigraph emc_vnx_block_req"
echo "$ANSWER" | grep "readreq\.\|writereq\."

    
echo -e "\nmultigraph emc_vnx_block_ticks"
while read -r LUN ; do
	echo "${LUN}_load_spa.value 0"
	echo "${LUN}_load_spb.value 0"
done <<< "$LUNLIST"
echo "$ANSWER" | grep "busyticks_spa\.\|idleticks_spa\."
echo "$ANSWER" | grep "busyticks_spb\.\|idleticks_spb\."

echo -e "\nmultigraph emc_vnx_block_outstanding"
echo "$ANSWER" | grep "outstandsum\."

echo -e "\nmultigraph emc_vnx_block_nonzeroreq"
echo "$ANSWER" | grep "nonzeroreq\."

echo -e "\nmultigraph emc_vnx_block_trespasses"
echo "$ANSWER" | grep "implic_tr\.\|explic_tr\."

echo -e "\nmultigraph emc_vnx_block_queue"
# Queue Length
echo "$ANSWER" | grep "busyticks"
echo "$ANSWER" | grep "idleticks"
echo "$ANSWER" | grep "outstandsum\."
echo "$ANSWER" | grep "nonzeroreq\."
echo "$ANSWER" | grep "readreq\."
echo "$ANSWER" | grep "writereq\."
while read -r LUN ; do
	echo "${LUN}_ql_l_a.value 0 "
	echo "${LUN}_ql_l_b.value 0 "
done <<< "$LUNLIST"
exit 0
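
The load and queue-length cdef lines in the config section of the listing are RPN expressions. In infix form they read 100 * busy / (busy + idle) and ((outstanding - nonzero / 2) / (reads + writes)) * (busy / (busy + idle)). The following is a minimal sketch that evaluates both on made-up counter deltas, which may help when sanity-checking a graphed value by hand. The helper names sp_load and queue_length and all sample numbers are assumptions for illustration only; they are not part of the plugin.

```shell
# Illustrative helpers (hypothetical names, not called by the plugin).
# Arguments are counter deltas between two polls; the sample values are made up.

# SP load in percent: 100 * busy / (busy + idle)
sp_load() {
	LC_ALL=C awk -v busy="$1" -v idle="$2" 'BEGIN {
		total = busy + idle
		if (total == 0) { print "nan"; exit }
		printf "%.2f\n", 100 * busy / total
	}'
}

# Queue length for one SP:
# ((outstanding - nonzero / 2) / (reads + writes)) * (busy / (busy + idle))
queue_length() {
	LC_ALL=C awk -v o="$1" -v n="$2" -v r="$3" -v w="$4" -v busy="$5" -v idle="$6" 'BEGIN {
		reqs = r + w
		total = busy + idle
		if (reqs == 0 || total == 0) { print "nan"; exit }
		printf "%.2f\n", ((o - n / 2) / reqs) * (busy / total)
	}'
}

sp_load 300 700                          # prints 30.00
queue_length 5000 2000 1500 500 300 700  # prints 0.60
```

Both helpers print "nan" when a denominator is zero, which is a choice made for this sketch rather than something the plugin or Munin defines.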