naasking Posted April 20, 2017 #1

I recently upgraded from a rock-solid DSM 5 to DSM 6 for some much-needed features. The upgrade went pretty smoothly, but the network and file services on DSM 6 update 10 seem to drop periodically. A 1GB file transfer over SMB or NFS will be chugging along at 100MB/s and then suddenly drop to zero. It sometimes resumes maybe 30 seconds later.

The drops aren't predictable as far as I can see. Sometimes I won't see one for quite some time, and sometimes it happens repeatedly within 10 minutes. I don't see anything in the SMB logs, even if I log in via ssh and inspect them manually.

SMB seems to get hit the worst. Right now I'm on the NAS via ssh, but can't load it in Windows Explorer over SMB. Changing the supported SMB version usually fixes the issue relatively quickly, because that resets the network configuration and so restarts some services, but not always.

About the only error I see in /var/log/messages is the following:

    2017-04-20T17:33:14-04:00 MagiSAN synow3: net_get_mac.c:165 ioctl mac failed
    2017-04-20T17:33:14-04:00 MagiSAN synow3: net_get_mac.c:63 Failed to get local original mac

At this point I'm thinking it might be a network driver issue, but I don't see any issues being logged. Unless I'm missing some log somewhere? Any suggestions would be much appreciated.
chege Posted April 20, 2017 #2

One question: do you use the MAC from the loader or the real MAC from the NIC?
naasking Posted April 20, 2017 Author #3

The DSM install instructions suggested changing the MAC in the grub file. I did change it slightly from the default (2-3 characters), but I didn't think to use the actual NIC MAC. I just tried changing the MAC to a randomly generated one and the problems persist (they might actually be worse, as transfer speeds are dramatically slower, but that's possibly due to cold caches).

The usual methods of checking the MAC address just report the random value I entered... Ah, I just found ethtool -P eth0, which returned something different, so I will try that.
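For anyone comparing addresses the same way, here is a minimal Python sketch that pulls the address out of the text printed by ethtool -P eth0 and normalizes it, so it can be compared with whatever was typed into the grub file. The function names are made up for illustration, the colon-free 12-hex-digit form is only an assumption about how the loader expects mac1 to be written, and the sample addresses are placeholders.

```python
import re

# Matches aa:bb:cc:dd:ee:ff or aa-bb-cc-dd-ee-ff anywhere in a line of text.
MAC_RE = re.compile(r"\b([0-9A-Fa-f]{2}(?:[:-][0-9A-Fa-f]{2}){5})\b")


def normalize_mac(text):
    """Return a MAC as 12 uppercase hex digits, without separators."""
    digits = re.sub(r"[:\-\s]", "", text)
    if not re.fullmatch(r"[0-9A-Fa-f]{12}", digits):
        raise ValueError("not a 48-bit MAC address: %r" % text)
    return digits.upper()


def permanent_mac_from_ethtool(output):
    """Extract the address from 'ethtool -P <iface>' output."""
    match = MAC_RE.search(output)
    if match is None:
        raise ValueError("no MAC address found in: %r" % output)
    return normalize_mac(match.group(1))


def is_locally_administered(mac):
    """True if the 'locally administered' bit (0x02 of the first octet) is set.

    Randomly generated addresses normally have this bit set; addresses
    burned into a NIC by the vendor normally do not.
    """
    return bool(int(normalize_mac(mac)[:2], 16) & 0x02)


if __name__ == "__main__":
    # Placeholder text in the format ethtool prints, not a real address.
    sample = "Permanent address: 00:11:22:aa:bb:cc"
    print(permanent_mac_from_ethtool(sample))
```

If the normalized permanent address differs from the value in the grub file, the interface is running with an overridden MAC rather than the one stored on the NIC.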
chege Posted April 20, 2017 #4

Try starting XPEnology with the option set mac1= left empty; this will use the actual MAC from the NIC.

PS: I just checked transfers on my setup: bare-metal DSM 5.2 transferring a file to DSM 6.1 on ESXi 6.0. File size 4GB, transfer speed about 75MB/s over a 1Gb network, with the MAC set up as above.
naasking Posted April 20, 2017 Author #5

Same problem with the MAC from "ethtool -P eth0" too. The NAS is an AMD machine, so dmesg at the beginning reports:

    [ 0.000000] CPU: vendor_id 'AuthenticAMD' unknown, using generic init.
    CPU: Your system may be unstable.

I'm pretty sure I specified the correct grub line to boot though, or is that message an indication that I got it wrong?

I just saw your other message about the empty line, so I'll try that next.
chege Posted April 20, 2017 #6

Transfer from a CIFS-mounted remote folder is about 75MB/s, and that is not the maximum; it could be higher, but my DSM 6.1 shows that the volume is 100% occupied. With the remote folder mounted as NFS, transfer can be as high as 125MB/s.
naasking Posted April 20, 2017 Author #7

Leaving the mac1 setting blank seems worse. Full network disconnects seem to happen more frequently. The eth0 MAC actually changes on each boot, so I think it's random if it's left empty; it doesn't actually use the device MAC.
chege Posted April 20, 2017 #8

That's weird. In my case, leaving mac1 etc. blank takes the MAC addresses from the ESXi settings.
naasking Posted April 21, 2017 Author #9

Still getting network drops. I can reproduce it pretty consistently now: I just have to boot and then start a large SMB transfer. SMB quickly fails and I can't ping the server for a minute, then IP connectivity is restored. I can reach SMB by IP, but not by name. Lookup by name still hasn't come back after 5 minutes. I usually restart the samba process and lookup by name works again.

No messages in /var/log/messages, and no errors in dmesg that I haven't already mentioned. I'm now using the MAC I found and have entered the proper motherboard serial number, but no dice.

I'm out of ideas. Let me know if you have any suggestions for where to look in the logs to diagnose the problem, and thanks for your help!
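The "reachable by IP but not by name" symptom can be checked from a client in a repeatable way. The following is a rough Python sketch using only the standard socket module: it asks the system resolver for the name and separately tries a TCP connection to the SMB port. The helper names are illustrative, 192.168.1.10 is a placeholder address, and the hostname is simply the one that appears in the logs in this thread.

```python
import socket


def resolve(name):
    """Return the addresses the system resolver gives for name, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def tcp_reachable(host, port=445, timeout=3.0):
    """True if a TCP connection to host:port succeeds (445 is the SMB port)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    nas_name = "MagiSAN"        # hostname as it appears in the logs
    nas_ip = "192.168.1.10"     # placeholder: substitute the NAS address
    print("name resolves to:", resolve(nas_name))
    print("SMB reachable by IP:", tcp_reachable(nas_ip))
```

If the second check succeeds while the first returns an empty list, the file service itself is up and only name resolution is failing, which would fit a name service that did not come back after the interface dropped.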
chege Posted April 21, 2017 #10

This is the last thing I can think of: maybe the issue is not related to Ethernet. Did you use the Resource Monitor in the web interface to check what happened to the system during the file copy?
naasking Posted April 21, 2017 Author #11

It has to be network related. I can't even ping the NAS when the SMB copy halts. Resource monitor and ssh connections also temporarily fail. ssh reconnects when IP comes back up, but the web interface logs me out, so I can't view the resource monitor throughout. ifconfig eth0 reports 0 errors.

SMB transfer speed is brutally slow when it reconnects (max 21MB/s vs. the typical 100+MB/s), even after I restart smbd, but it seems to be a little more reliable. If I switch from SMB 2 with jumbo frames to just SMB 2, transfer speed goes back up to 100+MB/s. This whole thing is just weird.
naasking Posted April 21, 2017 Author #12

Hmm, I just noticed that Synology is only recognizing 2.5GB of the 4GB of RAM installed in the NAS. dmesg reports the following error:

    [Thu Apr 20 18:51:27 2017] SMBIOS 2.6 present.
    [Thu Apr 20 18:51:27 2017] DMI: System manufacturer System Product Name/E35M1-I DELUXE, BIOS 1501 04/25/2013
    [Thu Apr 20 18:51:27 2017] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
    [Thu Apr 20 18:51:27 2017] e820: remove [mem 0x000a0000-0x000fffff] usable
    [Thu Apr 20 18:51:27 2017] e820: last_pfn = 0x13f000 max_arch_pfn = 0x400000000
    [Thu Apr 20 18:51:27 2017] MTRR default type: uncachable
    [Thu Apr 20 18:51:27 2017] MTRR fixed ranges enabled:
    [Thu Apr 20 18:51:27 2017] 00000-9FFFF write-back
    [Thu Apr 20 18:51:27 2017] A0000-BFFFF write-through
    [Thu Apr 20 18:51:27 2017] C0000-D1FFF write-protect
    [Thu Apr 20 18:51:27 2017] D2000-E7FFF uncachable
    [Thu Apr 20 18:51:27 2017] E8000-FFFFF write-protect
    [Thu Apr 20 18:51:27 2017] MTRR variable ranges enabled:
    [Thu Apr 20 18:51:27 2017] 0 base 000000000 mask F00000000 write-back
    [Thu Apr 20 18:51:27 2017] 1 base 0A7F00000 mask FFFF00000 uncachable
    [Thu Apr 20 18:51:27 2017] 2 base 0A8000000 mask FF8000000 uncachable
    [Thu Apr 20 18:51:27 2017] 3 base 0B0000000 mask FF0000000 uncachable
    [Thu Apr 20 18:51:27 2017] 4 base 0C0000000 mask FC0000000 uncachable
    [Thu Apr 20 18:51:27 2017] 5 disabled
    [Thu Apr 20 18:51:27 2017] 6 disabled
    [Thu Apr 20 18:51:27 2017] 7 disabled
    [Thu Apr 20 18:51:27 2017] e820: update [mem 0xa7f00000-0x13effffff] usable ==> reserved
    [Thu Apr 20 18:51:27 2017] WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 1007MB of RAM.

I don't know if this was the case under DSM 5, but it seems strange that such an old mobo would still have this problem.
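The dmesg excerpt above contains enough numbers to check the warning by hand. Here is a short Python sketch of that arithmetic. It assumes a 36-bit physical address width (inferred from the 9-hex-digit masks, not stated in the log) and 4 KiB pages, and the helper name mtrr_size is made up for illustration.

```python
PHYS_ADDR_BITS = 36   # assumption: inferred from the 9-hex-digit masks in the log
PAGE_SIZE = 4096      # assumption: standard 4 KiB x86 pages
MIB = 1 << 20

# (base, mask, type) copied from the "MTRR variable ranges enabled" lines.
VARIABLE_MTRRS = [
    (0x000000000, 0xF00000000, "write-back"),
    (0x0A7F00000, 0xFFFF00000, "uncachable"),
    (0x0A8000000, 0xFF8000000, "uncachable"),
    (0x0B0000000, 0xFF0000000, "uncachable"),
    (0x0C0000000, 0xFC0000000, "uncachable"),
]

LAST_PFN = 0x13F000   # from "e820: last_pfn = 0x13f000"


def mtrr_size(mask, bits=PHYS_ADDR_BITS):
    """Size in bytes of the range selected by a contiguous MTRR mask."""
    return (1 << bits) - mask


# End of the region marked write-back (cacheable) by the variable MTRRs.
wb_end = max(base + mtrr_size(mask)
             for base, mask, kind in VARIABLE_MTRRS if kind == "write-back")

# Lowest address carved back out of that region as uncachable.
uc_start = min(base for base, mask, kind in VARIABLE_MTRRS if kind == "uncachable")

top_of_ram = LAST_PFN * PAGE_SIZE
cached_low = min(wb_end, uc_start)            # cacheable RAM below the hole
ram_above_wb = max(0, top_of_ram - wb_end)    # RAM no write-back MTRR covers

if __name__ == "__main__":
    for base, mask, kind in VARIABLE_MTRRS:
        end = base + mtrr_size(mask) - 1
        print("%09X-%09X %6d MiB %s" % (base, end, mtrr_size(mask) // MIB, kind))
    print("cacheable below the hole:", cached_low // MIB, "MiB")
    print("RAM above write-back coverage:", ram_above_wb // MIB, "MiB")
```

With these inputs the write-back range stops at 4 GiB while e820 places RAM up to 0x13f000000, leaving 1008 MiB outside any write-back range, which is close to the 1007MB in the kernel warning. The cacheable part below the hole works out to 2687 MiB, roughly in line with the 2.5GB DSM reported.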
naasking Posted April 21, 2017 Author #13

I confirmed that the default boot selection was not AMD for some reason, which is why I received that AuthenticAMD error in dmesg. When I boot with the correct option I see 3.6 GB of memory instead of the previous 2.5 GB. About 500MB would be reserved for onboard devices, like built-in graphics and network, which seems reasonable, so that's solved.

I also reset my BIOS memory timings to auto, and the network seems pretty stable so far. So fingers crossed...
Polanskiman Posted April 21, 2017 #14

naasking said:

    I confirmed that the default boot selection was not AMD for some reason, which is why I received that AuthenticAMD error in dmesg. When I boot with the correct option I see 3.6 GB of memory instead of the previous 2.5 GB. About 500MB would be reserved for onboard devices, like built-in graphics and network, which seems reasonable, so that's solved. I also reset my BIOS memory timings to auto, and the network seems pretty stable so far. So fingers crossed...

Strange, though, that you were able to boot without selecting the AMD boot line in the Grub menu. Anyhow, use your NIC's real MAC address. Not that this was the problem, but this is what I recommend to everyone. No need to generate anything.

If your problem is solved, please add [SOLVED] to the title.
naasking Posted April 21, 2017 Author #15

Unfortunately it's not solved. Network drops are still fairly common, over SMB at least. I noticed this sequence of errors in /var/log/synoservice.log right around the time of the network drop:

    2017-04-21T09:20:33-04:00 MagiSAN synoservice: service_type_action.c:130 synoservice: Type [LINK_SENS] restart finished
    2017-04-21T09:21:07-04:00 MagiSAN synoservice: service_resume_by_reason.c:12 synoservice: resume [avahi] by reason [ipv4_change] ...
    2017-04-21T09:21:07-04:00 MagiSAN synoservice: service_restart.c:21 synoservice: restart [synotunnel] ...
    2017-04-21T09:21:07-04:00 MagiSAN synoservice: service_restart.c:34 synoservice: [synotunnel] is not enabled, skip restart action ...
    2017-04-21T09:21:07-04:00 MagiSAN synoservice: service_restart.c:52 synoservice: finish restart [synotunnel].
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:69 synoservice: Type [iP_SENS] restarting
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:82 synoservice: service [nmbd] restart
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:82 synoservice: service [ftpd-ssl] restart
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:82 synoservice: service [snmp] restart
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:82 synoservice: service [pppoerelay] restart
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:82 synoservice: service [avahi] restart
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:82 synoservice: service [iscsitrg] restart
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:82 synoservice: service [ssdp] restart
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:69 synoservice: Type [LINK_SENS] restarting
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:82 synoservice: service [nmbd] restart
    2017-04-21T09:21:12-04:00 MagiSAN synoservice: service_type_action.c:82 synoservice: service [ssdp] restart
    2017-04-21T09:21:14-04:00 MagiSAN synoservice: service_reload.c:20 synoservice: reload [nginx].
    2017-04-21T09:21:14-04:00 MagiSAN synoservice: service_restart.c:21 synoservice: restart [nmbd] ...
    2017-04-21T09:21:15-04:00 MagiSAN synoservice: service_restart.c:52 synoservice: finish restart [nmbd].
    2017-04-21T09:21:15-04:00 MagiSAN synoservice: service_restart.c:21 synoservice: restart [avahi] ...
    2017-04-21T09:21:15-04:00 MagiSAN synoservice: service_restart.c:52 synoservice: finish restart [avahi].
    2017-04-21T09:21:15-04:00 MagiSAN synoservice: service_reload.c:46 synoservice: finish reload [nginx].
    2017-04-21T09:21:15-04:00 MagiSAN synoservice: service_type_action.c:130 synoservice: Type [LINK_SENS] restart finished
    2017-04-21T09:21:20-04:00 MagiSAN synoservice: service_reload.c:20 synoservice: reload [nginx].
    2017-04-21T09:21:20-04:00 MagiSAN synoservice: service_restart.c:21 synoservice: restart [nmbd] ...
    2017-04-21T09:21:21-04:00 MagiSAN synoservice: service_restart.c:52 synoservice: finish restart [nmbd].
    2017-04-21T09:21:21-04:00 MagiSAN synoservice: service_restart.c:21 synoservice: restart [avahi] ...
    2017-04-21T09:21:22-04:00 MagiSAN synoservice: service_restart.c:52 synoservice: finish restart [avahi].
    2017-04-21T09:21:22-04:00 MagiSAN synoservice: service_reload.c:46 synoservice: finish reload [nginx].
    2017-04-21T09:21:23-04:00 MagiSAN synoservice: service_type_action.c:130 synoservice: Type [iP_SENS] restart finished

This repeats many times, and the timing seems roughly correlated with network drops. You can see all the network services restarting (including OpenVPN, which shows up in synosys.log), but I don't see any reasons logged. Any thoughts?

Some other observations:

* Some threads on the Synology forums[1] report the same log entries as above, but I'm not using any of PPTP, IPSEC/L2TP or AFP.
* Now that I've been up and running for a good half hour, network drops seem less frequent (could be a coincidence).
* The above restart messages occur every time the OpenVPN connection on the NAS connects/disconnects, presumably because the IP changes so all IP services are restarted. Perhaps the problem is ultimately with the OpenVPN connection.

[1] https://forum.synology.com/enu/viewtopic.php?t=118970
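To compare these restart bursts against the times the network dropped, the log can be reduced to a count of "restarting" events per minute. The following is a minimal Python sketch that assumes every line has the same shape as the excerpt above; the function name and regular expression are illustrative only, and the type name is upper-cased because the pasted log shows it in mixed case.

```python
import re
from collections import Counter

# Captures the timestamp down to the minute and the service type being restarted.
LINE_RE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):\d{2}\S*\s+\S+\s+synoservice:.*"
    r"Type \[(?P<type>[^\]]+)\] restarting"
)


def restart_bursts(lines):
    """Count 'Type [...] restarting' events per (minute, type)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group("minute"), match.group("type").upper())] += 1
    return counts


if __name__ == "__main__":
    # Path taken from the post above; run this on the NAS itself.
    with open("/var/log/synoservice.log", errors="replace") as log:
        for (minute, kind), n in sorted(restart_bursts(log).items()):
            print(minute, kind, n)
```

Minutes with restart events can then be lined up against the times a continuous ping from a client stopped getting replies, or against the OpenVPN connect/disconnect entries in synosys.log.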
NoFate Posted September 16, 2017 #16

Hi, same issues here on my ASRock and HP Gen8 systems. Look here, from page 3, and watch the videos: is it the same issue as you have?

With the ASRock system, I managed it by lowering the memory from 4GB to 2GB. With the HP Gen8, I switched over completely to ESX, with another SATA controller. Read that topic; everything is there.
Decapix Posted October 6, 2017 #17

@naasking have you managed to solve this? I have almost exactly the same board. I am using an Asus E35M1-M Pro with Realtek 8111E Gigabit. SMB will stall and come back every 30 seconds. You could try the solution below: swapping the driver loading sequence. But I solved mine by getting a refurbished Intel PRO/1000 PT.

On 2/5/2017 at 9:15 PM, Bear said:

    I got tired of waiting for the other loader, and gave this a try. I have an Asrock C2550D4I, and there was only one snag when installing this. The network drivers didn't work (nothing after the "Booting the kernel."), but a change in the file like this post https://xpenology.com/forum/topic/6253-dsm-6xx-loader/?do=findComment&comment=57513 suggested worked!

    Basically, in the grub.cfg file, this section should look like this:

        function loadinitrd {
            if [ -s $img/$info ]; then
                if [ -n "$has_serial" ]; then
                    terminal_output --remove serial
                fi
                cat $img/$info
                if [ -n "$has_serial" ]; then
                    terminal_output --append serial
                fi
            fi
            if [ -s $img/$extra_initrd ]; then
                initrd $img/$extra_initrd $img/ramdisk.lzma
            else
                initrd $img/ramdisk.lzma
            fi
        }

    And it's this line you change:

        initrd $img/$extra_initrd $img/ramdisk.lzma

    Updated to .9 with no problem.
naasking Posted October 7, 2017 Author #18

I have not managed to solve this. Flipping the image load order as you suggested makes my NAS inaccessible over the network. It still appears to boot, except for the network drivers.

Another person suggested restricting the accessible RAM to 2GB, which solved it for their AMD box, so I may try that next since I'm currently running with 4GB.
Decapix Posted October 9, 2017 #19

Just want to confirm that the latest boot loader changed the name to:

    if [ -s $img/$extra_initrd ]; then
        initrd $img/$extra_initrd $img/rd.gz
naasking Posted October 9, 2017 Author #20

Yes, I just swapped the order of the parameters that were already there. The NAS no longer booted with network connectivity.

Edited October 9, 2017 by naasking
Decapix Posted October 10, 2017 #21

On 9/12/2017 at 3:29 AM, IG-88 said:

    Hi, continuing from here, I have created a new v3 with the source of DSM 6.1.3 (15152). There are drivers that would be useful but do not load after compiling (SATA/PATA). I marked them and commented the reason. Maybe I will find time to look into this and try to find the point in the kernel source where the function is and why it might happen (I'm no coder, so I don't expect to find out much). If someone else is able to find out, comment here.

    All modules are tested with 6.1.3 (successfully loaded with insmod). As before, it contains all modules and firmware Jun has used, so in theory what worked OOTB with 1.02b should also work with this one.

    extra.lzma v.3: http://s000.tinyupload.com/?file_id=71323561438971251178

    Modules log

    net/ethernet
        Atheros L2 Fast Ethernet support atl2.ko
        ---temp remove - Broadcom 440x/47xx ethernet support b44.ko ->
            b44: Unknown symbol ssb_device_is_enabled (err 0)
            b44: Unknown symbol ssb_pcicore_dev_irqvecs_enable (err 0)
            b44: Unknown symbol ssb_bus_may_powerdown (err 0)
            b44: Unknown symbol ssb_pcihost_register (err 0)
            b44: Unknown symbol ssb_device_disable (err 0)
            b44: Unknown symbol ssb_device_enable (err 0)
            b44: Unknown symbol ssb_driver_unregister (err 0)
            b44: Unknown symbol __ssb_driver_register (err 0)
            b44: Unknown symbol ssb_bus_powerup (err 0)
            b44: Unknown symbol ssb_clockspeed (err 0)
            b44: Unknown symbol ssb_dma_translation (err 0)
        Intel(R) PRO/100+ support e100.ko
        Intel(R) 82576 Virtual Function Ethernet support igbvf.ko
        Intel(R) PRO/10GbE support ixgb.ko
        Intel(R) 82599 Virtual Function Ethernet support ixgbevf.ko
        nForce Ethernet support forcedeth.ko
        Marvell MDIO interface support mvmdio.ko

    net/usb
        USB RTL8150 based ethernet device support rtl8150.ko
        Realtek RTL8152 Based USB 2.0 Ethernet Adapters r8152.ko
        Conexant CX82310 USB ethernet port cx82310_eth.ko
        ASIX AX88xxx Based USB 2.0 Ethernet Adapters asix.ko
        Prolific PL-2301/2302/25A1 based cables plusb.ko

    block
        Promise SATA SX8 support sx8.ko

    scsi
        Adaptec AACRAID Support aacraid.ko
        Adaptec AIC94xx SAS/SATA Support aic94xx.ko
        3ware 9xxx SATA-RAID support 3w-9xxx.ko 3w-sas.ko
        HP Smart Array SAS hpsa.ko
        Marvell 88SE64XX/88SE94XX SAS/SATA support mvsas.ko
        ARECA (ARC11xx/12xx/13xx/16xx) SATA/SAS RAID Host Adapter arcmsr.ko
        HighPoint RocketRAID 3xxx/4xxx Controller support hptiop.ko
        Intel(R) C600 Series Chipset SAS Controller isci.ko
        Marvell UMI driver mvumi.ko

    ata
        ---temp remove - NVIDIA SATA support sata_nv.ko
        ---temp remove - Silicon Image SATA support sata_sil.ko
        ---temp remove - VIA SATA support sata_via.ko
        ---temp remove - Promise SATA TX2/TX4 support sata_promise.ko
        ---temp remove - Promise SATA SX4 support sata_sx4.ko -> Unknown symbol syno_libata_index_get (err 0)
        ---temp remove - JMicron PATA support pata_jmicron.ko
        ---temp remove - Marvell PATA support via legacy mode pata_marevell.ko
        ---temp remove - VIA PATA support pata_via.ko
        ---temp remove - CMD / Silicon Image 680 PATA support pata_sil680
        ---temp remove - Intel PATA old PIIX support pata_oldpiix.ko
        ---temp remove - Intel SCH PATA support pata_sch.ko
        ---temp remove - Intel PATA MPIIX support pata_mpiix.ko
        ---temp remove - SERVERWORKS OSB4/CSB5/CSB6/HT1000 PATA support pata_serverworks.ko -> Unknown symbol syno_libata_index_get (err 0)

    firmware
        bnx2/bnx2-mips-06-6.2.3.fw
        bnx2/bnx2-mips-09-6.2.1b.fw
        e100/d101m_ucode.bin
        e100/d101s_ucode.bin
        e100/d102e_ucode.bin
        tigon/tg357766.bin

    The newest version for testing is down below in the thread, as it's still a work in progress and for testing (no "stable" version).

How about loading the driver from IG-88?
waspsoton Posted October 12, 2017 #22

I am having the same problem with my AMD system. I was thinking it was my network cable, but after reading this I am not so sure. I will update my loader to the latest version and see what happens.
naasking Posted October 19, 2017 Author #23

The problem seems well-known in every Linux distro. My board has the Realtek 8111E chipset, and along with the 8168, these load the driver for the Realtek 8169. Unfortunately, this driver is known to produce unreliable Ethernet connections on these chipsets.

I'm not sure what I can do to remedy this, as all the recommended solutions I can find suggest building the drivers Realtek provides from source, and I'm not set up to do that with the Synology images. Anyone have any ideas?
naasking Posted October 19, 2017 Author #24

Hmm, a module r8168 is installed and running. The version says 8.044.02, which seems to be only one minor version behind the up-to-date one on the Realtek site, which is 8.045.08. If someone could point me in the right direction to update this module, I could play around with this and hopefully get it working.
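Whether eth0 is actually bound to r8168 rather than r8169 can be read from sysfs, without relying on which modules happen to be loaded. The following is a small Python sketch along those lines. The function names are illustrative, the sysfs_root parameter exists only so the helpers can be pointed at a test directory, and the version file is only present when a module declares a version.

```python
import os


def bound_driver(iface, sysfs_root="/sys"):
    """Name of the kernel driver bound to a network interface, via sysfs."""
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    # The 'driver' entry is a symlink to the driver's directory; its last
    # path component is the driver name (for example r8168 or r8169).
    return os.path.basename(os.path.realpath(link))


def module_version(module, sysfs_root="/sys"):
    """Version string of a loaded module, or None if it is not exposed."""
    path = os.path.join(sysfs_root, "module", module, "version")
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


def version_tuple(version):
    """Turn '8.044.02' into (8, 44, 2) so versions compare numerically."""
    core = version.split("-")[0]   # drop any suffix after a dash
    return tuple(int(part) for part in core.split("."))


if __name__ == "__main__":
    driver = bound_driver("eth0")
    print("eth0 driver:", driver, "version:", module_version(driver))
    # Version numbers quoted in the post above.
    print("older than Realtek's:", version_tuple("8.044.02") < version_tuple("8.045.08"))
```

A loaded r8168 module does not by itself show that the NIC is using it; the driver symlink for the interface is what indicates which driver claimed the device.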
waspsoton Posted October 23, 2017 #25

On 19/10/2017 at 6:19 AM, naasking said:

    Hmm, a module r8168 is installed and running. The version says 8.044.02, which seems to be only one minor version behind the up-to-date one on the Realtek site, which is 8.045.08. If someone could point me in the right direction to update this module, I could play around with this and hopefully get it working.

I am having the same problem. Did you ever get it fixed? I am wondering if I should get an Intel NIC and see if that sorts out the problem.