NeoID

SNMP not working on DSM 6.0.2 Loader


Hi guys.

 

Anyone else using SNMP for monitoring? On 5.2 I was using LibreNMS and Grafana to keep track of my NAS and virtual machines, but since upgrading to Jun's DSM 6.0.2 loader SNMP no longer works. You can enable it, but it stops working after just a few minutes. Once it stops, it also crashes the resource monitor ("failed to connect" errors when opened), and it won't work again until you disable SNMP and reboot the NAS.

 

I'm really eager to figure out what's going on. I'm currently on ESXi 6.5 (the VM is configured as a 6.5 VM) with two 1000e NICs. The NAS has 2 cores and 12GB of memory assigned. The resource monitor and everything else work perfectly until SNMP is enabled and has been polled or active for a short while. I'd like to hear from other people who also use ESXi, especially those with a PCI device in pass-through. I'd appreciate any feedback that could help me resolve the issue. :smile:


Found the following in the logs:

 

2017-05-03T02:10:02+02:00 moto kernel: [11455.177314] Can't get Core 2 temperature data
2017-05-03T02:15:02+02:00 moto kernel: [11755.309262] Can't get Core 2 temperature data
2017-05-03T02:20:02+02:00 moto kernel: [12055.513710] general protection fault: 0000 [#1] SMP
2017-05-03T02:20:02+02:00 moto kernel: [12055.516855] CPU: 1 PID: 13921 Comm: snmpd Tainted: P         C O 3.10.77 #8451
2017-05-03T02:20:02+02:00 moto kernel: [12055.516956] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.0.B64.1605280101 05/28/2016
2017-05-03T02:20:02+02:00 moto kernel: [12055.517114] task: ffff88031f0a80c0 ti: ffff88031eed4000 task.ti: ffff88031eed4000
2017-05-03T02:20:02+02:00 moto kernel: [12055.517218] RIP: 0010:[]  [] mutex_lock+0x4/0x20
2017-05-03T02:20:02+02:00 moto kernel: [12055.517332] RSP: 0018:ffff88031eed7d08  EFLAGS: 00010246
2017-05-03T02:20:02+02:00 moto kernel: [12055.517408] RAX: ffff880329fb2400 RBX: a7c0c3c748bd76f8 RCX: 0000000000000000
2017-05-03T02:20:02+02:00 moto kernel: [12055.517507] RDX: 000000000000258b RSI: 0000000000000002 RDI: a7c0c3c748bd76f8
2017-05-03T02:20:02+02:00 moto kernel: [12055.517607] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
2017-05-03T02:20:02+02:00 moto kernel: [12055.517735] R10: 000000000000000b R11: 0000000000000246 R12: ffffffff8184e0f0
2017-05-03T02:20:02+02:00 moto kernel: [12055.517836] R13: ffff88031eed7d80 R14: 000000000000fdc0 R15: a7c0c3c748bd76f8
2017-05-03T02:20:02+02:00 moto kernel: [12055.517936] FS:  00007f51fcfbc780(0000) GS:ffff88033fd00000(0000) knlGS:0000000000000000
2017-05-03T02:20:02+02:00 moto kernel: [12055.518050] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2017-05-03T02:20:02+02:00 moto kernel: [12055.518131] CR2: 00007f51fcfba000 CR3: 000000031ec8b000 CR4: 00000000001407e0
2017-05-03T02:20:02+02:00 moto kernel: [12055.518262] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2017-05-03T02:20:02+02:00 moto kernel: [12055.518376] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2017-05-03T02:20:02+02:00 moto kernel: [12055.518476] Stack:
2017-05-03T02:20:02+02:00 moto kernel: [12055.518506]  a7c0c3c748bd75a0 ffffffff8137a91b 0000000000000000 ffff88031eed7d80
2017-05-03T02:20:02+02:00 moto kernel: [12055.518622]  00007fff560d5c30 ffff88031eed7d80 00007fff560d5c30 0000000000000000
2017-05-03T02:20:02+02:00 moto kernel: [12055.518739]  00000000008407f0 ffffffffa0aecece ffffffffffffffff ffffffffa0ae9cf7
2017-05-03T02:20:02+02:00 moto kernel: [12055.518859] Call Trace:
2017-05-03T02:20:02+02:00 moto kernel: [12055.518908]  [] ? syno_cpu_temperature+0xfb/0x1d0
2017-05-03T02:20:02+02:00 moto kernel: [12055.519032]  [] ? GetCpuTemperatureDenlowI3Transfer+0xe/0x80 [bromolow_synobios]
2017-05-03T02:20:02+02:00 moto kernel: [12055.519178]  [] ? synobios_ioctl+0x507/0x1100 [bromolow_synobios]
2017-05-03T02:20:02+02:00 moto kernel: [12055.519290]  [] ? dput+0x1e/0x2a0
2017-05-03T02:20:02+02:00 moto kernel: [12055.519363]  [] ? mntput_no_expire+0x13/0x130
2017-05-03T02:20:02+02:00 moto kernel: [12055.519451]  [] ? path_openat.isra.43+0x146/0x4f0
2017-05-03T02:20:02+02:00 moto kernel: [12055.519542]  [] ? free_pages_and_swap_cache+0x9d/0xc0
2017-05-03T02:20:02+02:00 moto kernel: [12055.519636]  [] ? do_filp_open+0x2f/0x70
2017-05-03T02:20:02+02:00 moto kernel: [12055.519720]  [] ? dput+0x1e/0x2a0
2017-05-03T02:20:02+02:00 moto kernel: [12055.519798]  [] ? do_vfs_ioctl+0x20e/0x880
2017-05-03T02:20:02+02:00 moto kernel: [12055.519883]  [] ? SyS_ioctl+0x80/0xa0
2017-05-03T02:20:02+02:00 moto kernel: [12055.519959]  [] ? system_call_fastpath+0x16/0x1b
2017-05-03T02:20:02+02:00 moto kernel: [12055.520403] Code: 31 c0 48 8d 74 24 20 4c 89 e7 89 44 24 0c e8 54 d3 ba ff 8b 44 24 0c 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 90 53 48 89 fb  ff 0f 79 05 e8 b2 0c 00 00 65 48 8b 04 25 c0 a7 00 00 48 89
2017-05-03T02:20:02+02:00 moto kernel: [12055.520836] RIP  [] mutex_lock+0x4/0x20
2017-05-03T02:20:02+02:00 moto kernel: [12055.520917]  RSP 
2017-05-03T02:20:02+02:00 moto kernel: [12055.521000] ---[ end trace f35f665396d7d824 ]---
2017-05-03T02:20:02+02:00 moto [12055.522175] init: snmpd main process (13921) killed by SEGV signal
2017-05-03T02:20:02+02:00 moto [12055.522334] init: snmpd main process ended, respawning
2017-05-03T02:20:02+02:00 moto [12055.712944] init: snmpd main process (32005) terminated with status 1
2017-05-03T02:20:02+02:00 moto [12055.713056] init: snmpd main process ended, respawning
2017-05-03T02:20:06+02:00 moto synoupgrade_SYNO.Core.Upgrade.Server_1_check[32009]: rssfile.cpp:222 Fail to open [/var/run/autoupdate_tmp_file_RSS] err=No such file or directory
2017-05-03T02:20:06+02:00 moto synoupgrade_SYNO.Core.Upgrade.Server_1_check[32009]: dsmupdate.cpp:192 Fail to parse RSS file
2017-05-03T02:20:06+02:00 moto [12060.364074] init: synosnmpcd main process (32007) killed by KILL signal

 

 

It looks like it's complaining about the Core 2 temperature before the SNMP process gets killed. At first I thought it was the CPU it was struggling with, but another test VM on the same host works perfectly. Going through the SNMP logs I see this:

 

Guessing that there's a floating point co-processor hrDeviceCoprocessor

 

That log entry comes from the SATA controller I've added in pass-through, so I assume it sees part of the device as a processor that doesn't properly report its temperature.
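To see which code path snmpd was in when it crashed, it can help to pull just the symbol names out of the "Call Trace" section of the kernel log. Below is a minimal Python sketch of that idea. The function name `parse_frames` and the regular expression are my own and not part of DSM or any tool; the sample lines are copied from a trace posted in this thread. In the pasted logs the frame addresses inside the brackets are empty, so the pattern accepts both `[]` and `[<address>]`.

```python
import re

# Matches call-trace frames such as
#   "[] ? synobios_ioctl+0x507/0x1110 [bromolow_synobios]"
# The address inside the first brackets may be missing, as in the pasted logs.
FRAME_RE = re.compile(
    r"\[(?:<[0-9a-f]+>)?\]\s+(?:\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (symbol, module) pairs for each frame after 'Call Trace:'.

    module is None when the frame carries no [module] tag.
    """
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            # The first line that is not a frame ends the trace.
            in_trace = False
            continue
        frames.append((match.group("symbol"), match.group("module")))
    return frames


sample = """\
[  488.985618] Call Trace:
[  488.986149]  [] ? mutex_lock+0xe/0x20
[  488.987229]  [] ? syno_cpu_temperature+0xfb/0x1d0
[  488.988560]  [] ? GetCpuTemperatureDenlowI3Transfer+0xe/0x80 [bromolow_synobios]
[  488.990376]  [] ? synobios_ioctl+0x507/0x1110 [bromolow_synobios]
[  488.991930]  [] ? dput+0x1e/0x2a0
[  489.003619] Code: e5 41 57
"""

for symbol, module in parse_frames(sample):
    print(f"{symbol:40s} {module or '(no module tag)'}")
```

On the traces in this thread, the frames closest to the fault are `syno_cpu_temperature`, `GetCpuTemperatureDenlowI3Transfer` and `synobios_ioctl`, the latter two tagged with the `bromolow_synobios` module, which fits the "Can't get Core 2 temperature data" messages that precede the crash.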

 

Any ideas on how to deal with this issue? Might changing the VM hardware version make a difference? It's currently set to 13 (ESXi 6.5).

 

 

Edit: I see that I'm getting this too: "[May 3 20:55] general protection fault: 0000 [#1] SMP"


Same issue here, running on ESXi 6.0, VM version: 10.

[  488.923716] BUG: unable to handle kernel NULL pointer dereference at 000000000000009b
[  488.925408] IP: [] __mutex_lock_slowpath+0x2e/0x1e0
[  488.926761] PGD 21328067 PUD 77fe9067 PMD 0 
[  488.927763] Oops: 0000 [#1] SMP 
[  488.928508] Modules linked in: cifs udf isofs loop nfsd exportfs rpcsec_gss_krb5 iscsi_target_mod(O) target_core_ep(O) target_core_file(O) target_core_iblock(O) target_core_mod(O) syno_extent_pool(PO) rodsp_ep(O) hid_generic usbhid hid usblp bromolow_synobios(PO) button ax88179_178a usbnet tg3 r8169 cnic bnx2 vmxnet3 pcnet32 e1000 sfc netxen_nic qlge qlcnic qla3xxx pch_gbe ptp_pch sky2 skge jme ipg uio alx atl1c atl1e atl1 libphy mii btrfs synoacl_vfs(PO) zlib_deflate hfsplus md4 hmac bnx2x(O) libcrc32c mdio mlx5_core(O) mlx4_en(O) mlx4_core(O) mlx_compat(O) compat(O) tn40xx(O) i40e(O) ixgbe(O) be2net(O) igb(O) i2c_algo_bit e1000e(O) dca fuse vfat fat crc32c_intel aesni_intel glue_helper lrw gf128mul ablk_helper arc4 cryptd ecryptfs sha512_generic sha256_generic sha1_generic ecb aes_x86_64 authenc
[  488.945855]  des_generic ansi_cprng cts md5 cbc cpufreq_conservative cpufreq_powersave cpufreq_performance cpufreq_ondemand mperf processor thermal_sys cpufreq_stats freq_table dm_snapshot crc_itu_t crc_ccitt quota_v2 quota_tree psnap p8022 llc sit tunnel4 ip_tunnel ipv6 zram(C) sg mpt3sas mpt2sas(O) megaraid_sas ata_piix mptctl mptsas mptspi mptscsih mptbase scsi_transport_spi megaraid megaraid_mbox megaraid_mm vmw_pvscsi BusLogic usb_storage etxhci_hcd xhci_hcd uhci_hcd ohci_hcd ehci_pci ehci_hcd usbcore usb_common el000(O) [last unloaded: bromolow_synobios]
[  488.957521] CPU: 0 PID: 17776 Comm: snmpd Tainted: P         C O 3.10.102 #15047
[  488.959035] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.0.B64.1506250318 06/25/2015
[  488.961320] task: ffff880065b7cee0 ti: ffff8800213b8000 task.ti: ffff8800213b8000
[  488.962834] RIP: 0010:[]  [] __mutex_lock_slowpath+0x2e/0x1e0
[  488.964662] RSP: 0018:ffff8800213bbcb0  EFLAGS: 00010202
[  488.965744] RAX: 0000000000000073 RBX: ffff88007f5f35b8 RCX: 0000000000000000
[  488.967188] RDX: 0000000000006e65 RSI: 0000000000000002 RDI: ffff88007f5f35b8
[  488.968643] RBP: ffff8800213bbcf8 R08: 0000000000000000 R09: 0000000000000000
[  488.970084] R10: 0000000000000105 R11: 0000000000000246 R12: ffffffff8184df70
[  488.971528] R13: ffff8800213bbd80 R14: ffff880065b7cee0 R15: ffff88007f5f35b8
[  488.972975] FS:  00007fe6c9d377c0(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000
[  488.974609] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  488.975784] CR2: 000000000000009b CR3: 000000005f48a000 CR4: 00000000001007f0
[  488.977287] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  488.978785] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  488.980231] Stack:
[  488.980665]  ffff8800213bbd1c ffff88007f5f7de0 ffff8800213bbd80 000000000000fa80
[  488.982316]  ffff88007f5f35b8 ffffffff8184df70 ffff8800213bbd80 000000000000fa80
[  488.983966]  ffff88007f5f35b8 0000000000000001 ffffffff814a75ee ffff88007f5f3460
[  488.985618] Call Trace:
[  488.986149]  [] ? mutex_lock+0xe/0x20
[  488.987229]  [] ? syno_cpu_temperature+0xfb/0x1d0
[  488.988560]  [] ? GetCpuTemperatureDenlowI3Transfer+0xe/0x80 [bromolow_synobios]
[  488.990376]  [] ? synobios_ioctl+0x507/0x1110 [bromolow_synobios]
[  488.991930]  [] ? dput+0x1e/0x2a0
[  488.992936]  [] ? mntput_no_expire+0x13/0x130
[  488.994148]  [] ? path_openat.isra.45+0x146/0x4f0
[  488.995427]  [] ? free_pages_and_swap_cache+0x9d/0xc0
[  488.996781]  [] ? handle_mm_fault+0x13e/0x2a0
[  488.997998]  [] ? do_filp_open+0x2f/0x70
[  488.999123]  [] ? dput+0x1e/0x2a0
[  489.000129]  [] ? do_vfs_ioctl+0x20e/0x880
[  489.001282]  [] ? SyS_ioctl+0x80/0xa0
[  489.002358]  [] ? system_call_fastpath+0x16/0x1b
[  489.003619] Code: e5 41 57 41 56 41 55 41 54 53 48 89 fb 65 4c 8b 34 25 c0 a7 00 00 48 83 e4 f0 48 83 ec 20 48 8b 47 18 48 85 c0 0f 84 99 01 00 00 <8b> 40 28 65 4c 8b 3c 25 b0 a7 00 00 85 c0 4c 8d 6b 20 75 18 eb 
[  489.009857] RIP  [] __mutex_lock_slowpath+0x2e/0x1e0
[  489.011226]  RSP 
[  489.011954] CR2: 000000000000009b
[  489.012706] ---[ end trace 0dac5c8a951aa974 ]---
[  489.017045] init: snmpd main process (17776) killed by KILL signal
[  489.018518] init: snmpd main process ended, respawning

And because there is no Zabbix Agent package for DSM 6, I'm a bit stuck when it comes to monitoring my Xpenology VM :sad:
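As a stopgap while neither SNMP nor an agent is usable, the log itself can at least tell you how often the daemons are dying. Both traces in this thread end with `init:` lines reporting that snmpd was killed or terminated. The Python sketch below counts those events per process and reason. The function name `summarize_exits` and the pattern are hypothetical, written only against the line format seen in the logs above; where your log lives and how you feed it in is up to you.

```python
import re
from collections import Counter

# Matches lines such as
#   "init: snmpd main process (13921) killed by SEGV signal"
#   "init: snmpd main process (32005) terminated with status 1"
# "main process ended, respawning" lines are deliberately not matched.
EVENT_RE = re.compile(
    r"init: (?P<proc>\w+) main process \((?P<pid>\d+)\) "
    r"(?:killed by (?P<signal>\w+) signal"
    r"|terminated with status (?P<status>\d+))"
)


def summarize_exits(log_text):
    """Count abnormal exits, keyed by (process name, reason)."""
    counts = Counter()
    for match in EVENT_RE.finditer(log_text):
        reason = match.group("signal") or f"status {match.group('status')}"
        counts[(match.group("proc"), reason)] += 1
    return counts


sample = """\
2017-05-03T02:20:02+02:00 moto [12055.522175] init: snmpd main process (13921) killed by SEGV signal
2017-05-03T02:20:02+02:00 moto [12055.522334] init: snmpd main process ended, respawning
2017-05-03T02:20:02+02:00 moto [12055.712944] init: snmpd main process (32005) terminated with status 1
2017-05-03T02:20:06+02:00 moto [12060.364074] init: synosnmpcd main process (32007) killed by KILL signal
"""

for (proc, reason), count in sorted(summarize_exits(sample).items()):
    print(f"{proc}: {reason} x{count}")
```

This only reports crashes after the fact; it does not replace proper monitoring and does nothing about the underlying fault.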


I'm currently using Paessler PRTG for SNMP monitoring on my systems, including XPE/DSM boxes. I'm 'testing' a bare-metal ASRock J3455 that I migrated from 5.2 to 6.0.2 with SNMP (v1/2) enabled, and it seems to work OK. PRTG has a lot of built-in MIB info for Synology kit, so I'm getting full details from a lot of sensors with no failures or freezing of services. I realise this is a different setup from an ESXi VM, but I hope it helps.


Unfortunately, it doesn't. The problem seems to be either the particular card or PCI passthrough in general. The SNMP service crashes whether or not anything is polling it for data.
