Any Linux troubleshooting experts (probably hardware) around?



Varsuuk
April 12th, 2017, 04:51
Long story shorter:
When I ssh to the host, sometimes while typing the terminal becomes "stuck/frozen/unresponsive" - yet if I wait, whatever I typed shows up in the terminal after a few seconds, sometimes longer, but always under 30 secs.

Far as I recall, my system was fine on whatever I was running back then. Believe it was Ubuntu. I upgraded PLUS added a new WD Red 2TB drive (this was like 2 years ago). The next time I finally got to use it after the re-OS... the odd behavior was noticed.
Problem is I cannot guarantee it did not occur before, since I so rarely logged on to the host anymore. Mainly it became my TS3 server and Minecraft or other game server so I could play with my son on the local net.


I tried running top on the screen connected to the host and nothing seems odd at the time. Back before I upgraded hardware and changed OSs (see below) I'd see an xterm or video related thing (2 years ago, I forget which) using decent CPU.
I noticed that I can immediately open a new ssh window on my windows box and connect and do "stuff" like ls, etc and get it to work before the original ssh terminal (putty btw) "catches up" - this depends on my typing fast enough.
I recall determining last time a year or more back that going on to the host itself, I could type on a terminal there but that it too was capable of freezing. Do NOT recall if tried with non-graphic terminal via the F-keys. I have to retest my memory of first part again - prob tomorrow.

I retried install of then current 14-xx version of Ubuntu. Waited for next 6 month one and same prob (in case faulty - hail mary). Tried Fedora and saw same issue.
I SUSPECT that I was using the older 2D gnome? gui in my Ubuntu before they deprecated it. I had a cheeeeeeesy nVidia card.
-- tried switching to 3rd party driver or to Nouveau, forgot which was first.
-- Was a "silent" card, took an old dual fan BFG 7900GTX to see if "newer" (lol) card was better. Seems to perform better but same issue.
-- Bought a new EVGA GeForce 710 with 2GB DDR3 "passive" no fan one figure cheap at $40 and quieter than my BFG. Works great, pretty and smooth... still has putty freezing.
-- I bought a new WD 1TB Gold drive to just use as the "root drive" since before that I was using a very old WD Black 750GB drive.
-- Tried installing then using ONLY the new drive, same issue. (Occam's razor, back then I just changed the data HD so thought maybe that was it.)


I have it on an old Core 2 Duo 8500. 4GB of Corsair Dominator whatever DIMMs (2 sticks - tested with memtest 2 nights back). Motherboard is an old ASUS P5K Premium.
I don't see anything written to syslog at the time.
Note the swap dirs are currently on 2 (until I re-OS the old DATA drives, 2 have SWAPs) of the older drives. In future only the 750GB will have a swap.

Since then tried:
-Passing noapic to kernel (was looking at some troubleshooting)
-Updating hdparm.conf to "apm=" for all but the new Gold drive, since hdparm on each of them said not supported and I had seen an error in syslog relating to setting one of them to =254 and timing out.
-Went back to BIOS, disabled all unused things like firewire, serial ports, second NIC, WiFi on board, etc. Turned on ACPI 2.0. Turned on PnP OS. (I reverted the noapic option after one try on kernel options btw)

Now re-OSing after setting these options, in case the more esoteric things (to me) like ACPI 2.0/PnP affect options when installing.


I know this is not likely to make anyone have a EUREKA and fix it for me, but figured listing it all might get some feedback on what to try.
Not much I can take out - technically never disconnected the SATA CDROM, there's that and the floppy 3.5 I can try without.
Also did not disable the built in audio HD.

Hadn't worked on this in a while because old monitor was bad and had a placeholder 19" (eeeek!) monitor with scratches on surface in meanwhile. Finally bought a nice 24" Ultrasharp which I can use with my macbook when using as a desktop and no longer need to pick mac or windows on my main monitor.


Happy to chat on TS if that helps.

Sorry is kind of OffTopic as it isn't "game related", although, I plan to try to install FG on it ;P for son or visiting friend to play next to me ;)


Thanks for your time, feel free to grill me for more data.

Varsuuk
April 12th, 2017, 06:01
Forgot - PS... I tried Linux Mint 18.1

Same issue, but that's the one I will leave installed and work on figuring out for now.

Andraax
April 12th, 2017, 13:49
What's your load average when this is going on? Could just be dropped network packets, or could be your LA is high and your process is waiting on CPU slices.
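A minimal sketch of that load-average check, assuming a Unix-like host where Python's `os.getloadavg()` is available. The helper names and the 1.0-per-core threshold are illustrative assumptions, not a hard rule:

```python
import os


def load_per_core(load_1min, cores):
    """Return the 1-minute load average divided by the number of CPU cores."""
    return load_1min / max(cores, 1)


def looks_cpu_starved(load_1min, cores, threshold=1.0):
    """True when the load per core exceeds the (assumed) threshold."""
    return load_per_core(load_1min, cores) > threshold


if __name__ == "__main__":
    # Same three numbers that `uptime` and the first line of `top` print.
    one, five, fifteen = os.getloadavg()
    cores = os.cpu_count() or 1
    print(f"load averages: {one:.2f} {five:.2f} {fifteen:.2f} on {cores} core(s)")
    print("CPU starved?", looks_cpu_starved(one, cores))
```

Running it in a loop while the terminal is stuck would show whether the load climbs at the same moment. If it stays well under 1.0 per core, waiting on CPU slices becomes an unlikely explanation.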

Varsuuk
April 12th, 2017, 14:23
Won't be able to check until back home but severely doubt it.

I was watching top for many of the testing runs and never saw a clue. Closest I saw to a pattern is that sometimes "irq/31-nvi+" (not sure on number off top of head) would flicker near the top in top just around the time the terminal "comes back", but then I noticed it came up at times when nothing was going on, and a couple of times it didn't appear when the terminal "came back".

The most recent testing is post clean install with no non-core things running. In the past the machine should never have had a load, but hey...

Also both are plugged into same switch. Tried on a different switch (have 2 8s next to each other) also like said pretty close to certain I saw this happen connected in terminal in my Ubuntu incarnation right on the host. But need to reverify that.

I really assumed it would be a HD (maybe going to sleep) or some problem with newer Linux after dropping Gnome2D running on a really old crappy nvidia card. Tried it with only a brand new HD and with a new (low end) current nvidia card yesterday so....

Varsuuk
April 12th, 2017, 14:24
I'm totally stumped.

celestian
April 12th, 2017, 16:15
Long story shorter:
When I ssh to the host, sometimes while typing the terminal becomes "stuck/frozen/unresponsive" - yet if I wait, whatever I typed shows up in the terminal after a few seconds, sometimes longer, but always under 30 secs.

Sounds like a network and/or latency issue. Check the interface for errors? Any packet loss?
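One way to check the interface for errors on the Linux host is to read the per-interface counters in `/proc/net/dev`. This is a rough sketch only: the function names are made up for illustration, and it assumes the usual layout of that file (two header lines, then 16 numeric columns per interface).

```python
import os


def parse_net_dev(text):
    """Extract error and drop counters per interface from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = [int(x) for x in data.split()]
        # Receive columns come first (bytes, packets, errs, drop, ...),
        # transmit columns start at index 8.
        counters[name.strip()] = {
            "rx_errs": fields[2],
            "rx_drop": fields[3],
            "tx_errs": fields[10],
            "tx_drop": fields[11],
        }
    return counters


def has_problems(iface_counters):
    """True if any error or drop counter is non-zero."""
    return any(value > 0 for value in iface_counters.values())


if __name__ == "__main__" and os.path.exists("/proc/net/dev"):
    with open("/proc/net/dev") as fh:
        for iface, c in parse_net_dev(fh.read()).items():
            print(iface, c, "<-- check this" if has_problems(c) else "")
```

Counters that keep growing between two runs, taken before and after a stall, would point at the NIC, cable or switch port rather than at ssh itself.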

Varsuuk
April 12th, 2017, 21:41
Best way to do this stuff? I'm technically capable, it's just not my area (I'm a C++ engineer)

Btw - it's freaky Celestian, I was JUST reading your questions:
"Celestian on March 30
Quick question, will we be able to get a digital version of the map for use with a virtual tabletop? I know the PDF will have the map but im more interested in a copy that has proper alignment/scaling to use grids/etc.
Thanks!"


I was thinking of all the lost opportunities like missing out on buying Upper Works (always wished could play it) and that led me to google the Harridan to see if she had licensed anything or still holding her whatever grudge and then I thought of the Gygaxers and Ben's dungeon project (which I too heard about too late to get in on all the good stretch goals - heck, even if it HAS been stretching so long, I wouldn't have missed the cash and just been happy when came out :( ) and I saw your name.

So I come here to look at where to post and was still on this page and BLAM...there you are! lol so funny.


I went to ssh to it, but the damn Mint screensaver was frozen and realized host is frozen now.

This is the second time I rebooted for this. So possibly my BIOS setting changes (off top of head: going to SATA AHCI, enabling ACPI 2.0, and disabling (sure that isn't it) the extra NIC, serial port, firewire, etc) have caused this.
Will reboot when can after work (remote working atm, that would take too much attn vs wandering net on lunch) and see if there was any logged messages to give hints.

But this is definitely a different issue, because under ubuntu LTS I had it running over a year without rebooting other than on purpose to test some things. It should be unrelated to the SSH thing other than maybe caused by my actions troubleshooting freeze thing. Unless Mint 18.1 has this issue which I am SURE it doesn't - just my first time using other than Gentoo, Redhat or Ubuntu (plain)

celestian
April 12th, 2017, 21:47
Best way to do this stuff? I'm technically capable, it's just not my area (I'm a C++ engineer)

Btw - it's freaky Celestian, I was JUST reading your questions:
"Celestian on March 30
Quick question, will we be able to get a digital version of the map for use with a virtual tabletop? I know the PDF will have the map but im more interested in a copy that has proper alignment/scaling to use grids/etc.
Thanks!"


I was thinking of all the lost opportunities like missing out on buying Upper Works (always wished could play it) and that led me to google the Harridan to see if she had licensed anything or still holding her whatever grudge and then I thought of the Gygaxers and Ben's dungeon project (which I too heard about too late to get in on all the good stretch goals - heck, even if it HAS been stretching so long, I wouldn't have missed the cash and just been happy when came out :( ) and I saw your name.

So I come here to look at where to post and was still on this page and BLAM...there you are! lol so funny.


Yeap, I'm looking forward to the Tomb project! You'll find me anywhere there is old school AD&D on the net probably. Like dragonsfoot.org, I've been using it since 2002 (well before that, he just updated forums then and I had to re-signup I think).



I went to ssh to it, but the damn Mint screensaver was frozen and realized host is frozen now.


Yeah, based on that alone it's definitely not a network/latency issue. That's going to be a more difficult issue to track if you're not getting any errors in your logs and/or console output.

Ken L
April 13th, 2017, 03:51
Long story shorter:
When I ssh to the host, sometimes while typing the terminal becomes "stuck/frozen/unresponsive" - yet if I wait, whatever I typed shows up in the terminal after a few seconds, sometimes longer, but always under 30 secs.

...


I tried running top on the screen connected to the host and nothing seems odd at the time. Back before I upgraded hardware and changed OSs (see below) I'd see an xterm or video related thing (2 years ago, I forget which) using decent CPU.
I noticed that I can immediately open a new ssh window on my windows box and connect and do "stuff" like ls, etc and get it to work before the original ssh terminal (putty btw)


From what I'm gleaning here, 2 years ago you performed a hardware upgrade, but this problem occurred only recently as a result of a reformat? Sorry, you have a bit of information bloat here in terms of what has caused the problem.

Second, the issue involves a remote terminal delay to your PuTTY client on your connecting Windows machine, correct? Given that you've performed hardware upgrades, I'd assume that the server is physically accessible? Also, is this the only issue? Do normal operations perform in a timely manner?

Varsuuk
April 13th, 2017, 06:32
2 years ago I replaced a HD with a new WD 2TB Red drive to use as a pure storage drive. I had not been using the machine for some time before that. So the problem COULD have existed before. But it did not exist when I was on an older Ubuntu with the old Gnome 2D? setup. I upgraded but barely used it. Thing is, my at home PC time is so short starting about then that cannot be sure.

I didn't want to troubleshoot. Decided to update in 2 months when next Ubuntu came out 16.04LTS I think. I did. Same prob. Left it again, I didn't bother finishing the install (ie: my dev tools and setting up the web server etc - just left it as is.)
I later installed 2 Minecraft Server instances both running for my son to play and us to Multiplay on. I never noticed delays or problems with this but if they were not like 2-15 second hiccups like I get on SSH - I wouldn't have noticed it.

Decided to try again. Added a new 1TB Gold drive to install just the base Linux stuff on. Disconnected all other drives. Installed Mint 18.1 to try it. Same issue. The pause on the ssh. Did not duplicate when on Terminal directly on the server (yes, it's in the den, older C2D 8500) - did not try on a non-X screen since it did not duplicate on xterm (or whatever Mint uses). Also bought a cheap EVGA 7300 vid card to replace the noisy old BFG7900OTC. I don't usually do anything directly on the box's screen. I use ssh to edit or remote netbeans etc to code.

Tried mucking with BIOS. Turned off unused things. Switched to AHCI for SATA. Turned on ACPI 2.0. Not sure if other stuff (took pics, will look tomorrow)
Saw same issue BUT now noticed that it freezes (the whole PC) and drops the SSH connection after a long time of inactivity. At this point I assume was some sort of power/sleep thing that it does not come back from. (I really don't want it to hibernate ever or any such thing. Spin down drives, whatever sure, but not actually sleep/hibernate. Didn't play with settings yet though.)

Reinstalled with Mint 18.1. Had same issues, reinstalled with Ubuntu 16.04 since I had more direct experience and to switch it up.

So I do not think this is connected. Suspect it's something I did with BIOS. After the first time this happened I noticed I was 5 BIOS versions back and went from 702 to 1001 (old ASUS P5K Premium MB).
Same issue happened with freeze in SSH and total lock up after long delay. Hitting power button once brought up Ubuntu screen. But clicking with mouse didn't open menus and keyboard (both PS/2s) didn't type.


So... finally went to BIOS and switched back to NO ACPI-2.0. And put the XXXX mode back to AUTO from trying S1 (vs S3 POST). I did not reinstall so not sure if the ACPI non 2.0 is a prob without reinstall but was heading to sleep so figured can do tomorrow.
Left an SSH session connected and was going to see if PC was frozen or SSH connect broken (which is part of the frozen PC issue vs the ORIGINAL ssh "slowdown/buffering" issue)

I suspect the totally frozen PC thing will go away if I put all BIOS back to what was BEFORE and I will try that.

The SSH issue, where there is a delay and then all of a sudden all the buffered input (it still COLLECTS input) is processed, is something else. It looks like the WD Red drive replacement was just a coincidence, since it happens with another drive as the solo drive. Doubt it is video related, which I suspected since I had switched to gnome 3d windowing and had a reaaallllly old cheap passive DVI card in there before.



I will next run tcpdump to see if I see anything funny at the time it "freezes/pauses" in the SSH connection. I am not a wireshark pro but am hoping I will see something out of the ordinary via patterns when it occurs.
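Rather than eyeballing the capture, the quiet periods can be pulled out mechanically. This is only a sketch: it assumes tcpdump's default text output, where each packet line starts with an `HH:MM:SS.ffffff` timestamp, and the function names and the 1-second threshold are arbitrary choices for illustration.

```python
from datetime import datetime


def parse_ts(line):
    """Parse the leading HH:MM:SS.ffffff timestamp of a tcpdump text line."""
    return datetime.strptime(line.split()[0], "%H:%M:%S.%f")


def find_gaps(lines, min_gap_s=1.0):
    """Return (previous_ts, next_ts, seconds) for each silence >= min_gap_s."""
    gaps = []
    prev_ts, prev_raw = None, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            ts = parse_ts(line)
        except ValueError:
            continue  # continuation or non-packet line, ignore it
        if prev_ts is not None:
            delta = (ts - prev_ts).total_seconds()
            if delta < 0:
                delta += 86400  # capture ran past midnight
            if delta >= min_gap_s:
                gaps.append((prev_raw, line.split()[0], delta))
        prev_ts, prev_raw = ts, line.split()[0]
    return gaps
```

Feeding it the saved text output of a capture filtered to the ssh session would list every stretch where no packets moved at all. Those stretches can then be compared against the wall-clock times of the pauses.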

Varsuuk
April 13th, 2017, 13:17
Yup, reverting either ACPI 2.0 or the other thing I changed (need to look at BIOS Advanced tab to compare against last time) took care of (what I am calling) the Sleep/freeze.

Back to just ssh issue. Will investigate more later tonight when home from work.

Varsuuk
April 13th, 2017, 13:26
Ken: Do normal operations occur in timely manner?

From all I can see, there is nothing "wrong" other than this occasional "pause" in comms between ssh client and the host. I am still not 100% sure that it cannot occur while typing on the term app DIRECTLY on the host but I could not get it to do so. I hope it is remote only because it then points to the net/ssh vs more general troubleshooting. It's just that (with my baaaaad memory) I kind of remember seeing it occur locally as well.

I'll make a point of working directly on the host keyboard when I do things tonight like install git/minecraft servers/teamspeak server etc, to be closer to certain it does not happen that way.

Maybe looking at pcap will identify general delays since I doubt I'd notice anything at the couple millis level much less micros.


Now, gotta run to work and deal with measuring micros most of the morning ... round trip processing is ~67mics (~30mics is the Trading System that processes the orders) so I hit my perf goals for 2016 of reducing 30-50mics out of the 120mics we were at last March :) but it's been a LOT of pcap staring and code reviewing...

Ken L
April 14th, 2017, 04:58
Sorry, I skimmed much of your second post since you seem to be jumping through a ton of hoops without actually identifying a root cause. Plug in a keyboard and monitor and verify that the host server is fine. SSH should not have a delay that large, especially on LAN, which should be near instant. I wouldn't put it past PuTTY being wonky at times, and relying on that single window for diagnosing the problem is a case of tunnel vision. The next test after checking the host would ideally be to SSH via another linux machine on your network and verify again. You can simply flash a 2gb USB drive with Knoppix or any lightweight GNU/Linux distribution and boot from that USB drive on your windows machine to get a 100% second temporary gnu/linux machine as your test case.

skj310
April 14th, 2017, 12:59
What you are experiencing sounds familiar and as Ken L mentioned, I too haven't read through everything you mentioned, but his idea is valid. Connect a local keyboard, mouse, monitor and see if you are experiencing the same.

My specific thought is that you've a bad port(s) on your switch, but as you mention, there's no good way to know unless you are monitoring packet loss. To do that you need to use a tool like jperf spanning your network (i.e. running on the problematic linux box, and on your windows PC too). It helps to know what your network speed should be: it'll always be the speed of the slowest link on your network (e.g. if you have gigabit NICs and hub but your cabling is only 10MB then you'll only ever see 10MB). I mention this because knowing your network speed will give you an indication of how speedy to expect jperf to be. As well, you can then use wireshark to monitor your linux NIC and see if there's packet loss.

The issues you're experiencing sound very much like network ... UNLESS plugging in the keyboard/mouse/monitor turns up the same problem. Then you know for sure that it's related to your local hardware. That you're messing with BIOS versions is good, and that you mention BIOS tells me that you're not dealing with UEFI ... which makes things easier. But that being said, modern kernels, linuxmint 18.1 (my personal flavour of choice), should recognise your h/w very well independent of your bios version. So if you ARE experiencing trouble directly on this PC independent of the network, then we might be thinking about a bad mobo.

I recently had to rebuild a media server that i have running ubuntu. Humidity finally got to the fans, and rusted the hell out of the chassis. Thankfully my drives were all ok, so getting up and running on new h/w was FAST (probably took me about 2hr).

Ken L
April 14th, 2017, 13:26
My specific thought is that you've a bad port(s) on your switch, but as you mention, there's no good way to know unless you are monitoring packet loss.

I would not even go there until some preliminaries are made. He claims that it works fine given he runs minecraft servers and other such services on it except for this PUTTY issue. If it was a network problem there would be degradation across all network based processes.

Varsuuk
April 14th, 2017, 21:03
Hey guys, in the middle of some things with my son - will be trying the "work directly on the linux box" thing later today. Normally, I did everything with putty since I like this keyboard (ergo crap) for my hands and, before, because the Linux box had a crap screen. Now it has a good one so I can just scoot across the desk to it. I figure between copying to the new install and editing various configs, setting up my code repo etc, I should see this issue in THAT terminal if it isn't related to net or putty specifically. Definitely a good call to eliminate it. In my spew above, I mentioned trying it directly on the Linux host a long time back and running into the same pause. BUT... I have a horrible memory and for all I know I recall it wrong, it was long ago. Definitely will do this first.

As for Minecraft play, well, not sure that I would notice it unless the data going back and forth between client and server needed to be continuous. If there were a 10sec stop but the server sent more than enough data or the client could work on "stale" data meanwhile - I might miss it. Certainly though, I did not experience any old-time release-day EverQuest rubberbanding/stop-motion bleepery ;)

Will report later, thanks for thinking on this.


Skj310 --> Any reason I should really switch to Mint 18.1 vs continuing with my Ubuntu 16.04 reinstall? I had decided to try 18.1 when I started this a few days back and when saw same problem and the fact that I was even less familiar with the layout of Mint (and meh on coloring) I said let me debug with Ubuntu vanilla.
But, if you have specific info on why I am better off adjusting to change - I am willing to listen :) I currently do very little unlike past on the Linux boxes. I used to do a lot of coding at home and ran Gentoo for bleeding edge compilers etc and for increasing Linux knowledge fixing all the emerge issues! ;) . Now, I cannot take that much time to keep in sync and years back went to standard ones. I would prefer to be on 17.04 to get "new stuff" but they require more upgrading to next versions more often and sadly in my age, it seems I am preferring to leave as is. Hell, not like my dev team is using even C++11 much less 14/17 anyways. Nasdaq is more a Java place.

Varsuuk
April 15th, 2017, 04:53
OK, didn't get it to occur on the local "Terminal" app in Ubuntu 16.04.
It is very "random" on the Windows box running putty. On one putty screen it took like 77 "ls -lrt" with bunch of enters until it scrolls off screen.
Another putty screen it happened after like 5 enters while typing the "ls -lrt"


Which leads me to better phrasing the issue and I regoogled (not sure if the last thing discussed could be related, but yeah about back then I probably went to full gigabit switches etc... : https://unix.stackexchange.com/questions/76121/ssh-will-sporadically-hang-temporarily-on-fast-connection)
I will leave a ping running between them on next try. This should help me when looking at the pcap because the one I saw had lots of other stuff going on. Will do on a fresh restart of windows since I had soooo many things running and using net (no "especially active" connections, web pages, origin client, outlook etc) - but while I saw a delay in constant comms client->server in pcap at around the time I wallclock noticed it "pause", thing is I am confused because it shows my comms micros apart so it HAS to be grouping things or some such (nagling? not sure). I certainly do not expect the enter processing to be micros apart but hey...


Next, I will be doing the SSH from my macbook in a few mins (then have to put boy in bed - vacation stays up late) but wanted to mention/stress that it is NOT the "response" to a command typed in the putty terminal that pauses - it pauses AS I TYPE in the putty window... which may free your mind to consider other things.

Varsuuk
April 15th, 2017, 06:16
OK - verified it also happens doing the same thing on my MacBook. I type "ls -lrt" and hold enter, scrolling until it gets near top then up-arrow enter to redo ls -lrt and continue alternating how long between etc.

At one point during the "enter" phase the screen stopped scrolling. I quickly switched to entering random keys "kdfdkddk" etc + enter key and when it "came back" it processed those keys and gave unknown command etc.


This was done using a physical ethernet connection to switch. Will disconnect and try via wifi.


UPDATE: Yup, even on wifi, it happened on the MacBook. In fact the second time, on wifi, it occurred after only a couple screen scrolls. I've seen it occur that way on the ethernet cable connection as well - just mentioning it.


It's too late in the evening for more checking from me - but if no updates on what to try - figure will look at where it is wired and then try moving both test box and linux host to connect on same switch (in case that isn't the case now - I have 1 Gb switch, one netgear wifi+router (used as a switch) and the verizon router + wifi).

If it behaves the same, will try using a different ethernet port on the linux host mobo. The Mobo is an old ASUS P5K Premium (Black Pearl) which has 2 ethers + 1 wifi. (Wifi is disabled in BIOS... and yeah, not UEFI or what have ya - only my newer Windows 10 box uses that sort of "bios")

skj310
April 15th, 2017, 06:59
OK, didn't get it to occur on the local "Terminal" app in Ubuntu 16.04.
So that is good news! Starting to look more like a network issue. I was gonna suggest trying cygwin with the openssh package instead of PuTTy (i stopped using PuTTy some time ago, preferring the greater tools available in having cygwin installed on a windows machine); but since you attempted using MAC and its terminal ... then it's looking more and more like something with your network. Your ideas are all good ones ... so i hope you see positive results with your testing.


Skj310 --> Any reason I should really switch to Mint 18.1 vs continuing with my Ubuntu 16.04
Not really. They are both Debian-based linux (Mint is built on Ubuntu). I just hate Unity desktop and the whole Mir display server annoyed the hell outta me. So I chose to go with linuxmint instead, and chose the cinnamon desktop over mate. It's really just a desktop preference. Nothing more. Apparently there's supposed to be some big changes happening with ubuntu, so who knows what that might mean for linuxmint. I have been thinking about BSD, but only because of the ZFS filesystem and how easy and robust that system is to recover from user stupidity. That being said I'm really not sold on BSD as a desktop, and might just experiment with the ZFS filesystem when i find out more regarding Ubuntu's future.

Anyway cheers and good luck!
I have to admit ... i'm really happy with how FG is working on my linux OS. Seems to be rock solid since 3.2.x was released!

Varsuuk
April 15th, 2017, 07:07
Interesting... ping alone (with --apple-time option to show times) seems to fail to communicate for varying periods of time and when that occurs, my MacBook (testing with ping and tcpdump on MacBook) terminal, if I quickly click there and type, shows the same behavior... pauses until ping goes back to normal!
(cap originally only was capturing hosts 192.168.1.5 --> the linux server. Now need to run with that + MacBook box because DID see ARP and MDNS calls around the point where it recovers but not in all cases.) But I forgot the ssh was live on the windows box - so need to test without that extra factor. Tomorrow.

01:29:39.256294 64 bytes from 192.168.1.5: icmp_seq=156 ttl=64 time=0.310 ms
01:29:40.258467 64 bytes from 192.168.1.5: icmp_seq=157 ttl=64 time=0.279 ms
01:29:41.262325 64 bytes from 192.168.1.5: icmp_seq=158 ttl=64 time=0.280 ms
01:29:42.265340 64 bytes from 192.168.1.5: icmp_seq=159 ttl=64 time=0.282 ms
01:29:43.266090 64 bytes from 192.168.1.5: icmp_seq=160 ttl=64 time=0.274 ms
01:29:44.268827 64 bytes from 192.168.1.5: icmp_seq=161 ttl=64 time=0.269 ms
01:29:45.273500 64 bytes from 192.168.1.5: icmp_seq=162 ttl=64 time=0.306 ms
Request timeout for icmp_seq 163
Request timeout for icmp_seq 164
Request timeout for icmp_seq 165
Request timeout for icmp_seq 166
Request timeout for icmp_seq 167
Request timeout for icmp_seq 168
Request timeout for icmp_seq 169
Request timeout for icmp_seq 170
Request timeout for icmp_seq 171
Request timeout for icmp_seq 172
Request timeout for icmp_seq 173
Request timeout for icmp_seq 174
Request timeout for icmp_seq 175
01:29:59.316079 64 bytes from 192.168.1.5: icmp_seq=176 ttl=64 time=0.371 ms
01:30:00.321226 64 bytes from 192.168.1.5: icmp_seq=177 ttl=64 time=0.355 ms
01:30:01.326395 64 bytes from 192.168.1.5: icmp_seq=178 ttl=64 time=0.337 ms
01:30:02.327495 64 bytes from 192.168.1.5: icmp_seq=179 ttl=64 time=0.349 ms



01:50:34.244719 64 bytes from 192.168.1.5: icmp_seq=117 ttl=64 time=0.183 ms
01:50:35.249992 64 bytes from 192.168.1.5: icmp_seq=118 ttl=64 time=0.353 ms
01:50:36.251545 64 bytes from 192.168.1.5: icmp_seq=119 ttl=64 time=0.291 ms
01:50:37.252335 64 bytes from 192.168.1.5: icmp_seq=120 ttl=64 time=0.199 ms
Request timeout for icmp_seq 121
Request timeout for icmp_seq 122
Request timeout for icmp_seq 123
Request timeout for icmp_seq 124
Request timeout for icmp_seq 125
Request timeout for icmp_seq 126
Request timeout for icmp_seq 127
Request timeout for icmp_seq 128
01:50:46.291446 64 bytes from 192.168.1.5: icmp_seq=129 ttl=64 time=0.339 ms
01:50:47.296628 64 bytes from 192.168.1.5: icmp_seq=130 ttl=64 time=0.371 ms
01:50:48.301755 64 bytes from 192.168.1.5: icmp_seq=131 ttl=64 time=0.345 ms
01:50:49.306891 64 bytes from 192.168.1.5: icmp_seq=132 ttl=64 time=0.355 ms
01:50:50.309828 64 bytes from 192.168.1.5: icmp_seq=133 ttl=64 time=0.315 ms
01:50:51.314567 64 bytes from 192.168.1.5: icmp_seq=134 ttl=64 time=0.309 ms




01:52:47.631361 64 bytes from 192.168.1.5: icmp_seq=250 ttl=64 time=0.357 ms
01:52:48.634635 64 bytes from 192.168.1.5: icmp_seq=251 ttl=64 time=0.386 ms
Request timeout for icmp_seq 252
01:52:50.643847 64 bytes from 192.168.1.5: icmp_seq=253 ttl=64 time=0.350 ms
01:52:51.648810 64 bytes from 192.168.1.5: icmp_seq=254 ttl=64 time=0.319 ms
01:52:52.651691 64 bytes from 192.168.1.5: icmp_seq=255 ttl=64 time=0.361 ms
01:52:53.656673 64 bytes from 192.168.1.5: icmp_seq=256 ttl=64 time=0.352 ms
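For longer runs, the timeout bursts in output like the above can be summarized automatically. A small sketch, assuming the macOS `ping --apple-time` line format shown in the samples; the function name is made up for illustration:

```python
import re

# Reply lines look like: "01:29:45.273500 64 bytes from ...: icmp_seq=162 ttl=64 ..."
REPLY = re.compile(r"^\d\d:\d\d:\d\d\.\d+ .*icmp_seq=(\d+) ")
# Loss lines look like: "Request timeout for icmp_seq 163"
TIMEOUT = re.compile(r"^Request timeout for icmp_seq (\d+)")


def find_outages(lines):
    """Group consecutive timeouts into (first_seq, last_seq, lost_count) tuples."""
    outages = []
    run = []
    for line in lines:
        line = line.strip()
        match = TIMEOUT.match(line)
        if match:
            run.append(int(match.group(1)))
        elif REPLY.match(line) and run:
            # A reply ends the current burst of timeouts.
            outages.append((run[0], run[-1], len(run)))
            run = []
    if run:  # capture ended in the middle of an outage
        outages.append((run[0], run[-1], len(run)))
    return outages
```

On the three samples above this would report bursts of 13, 8 and 1 lost pings. At one ping per second, that lines up with stalls of a few seconds up to roughly a quarter of a minute.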

skj310
April 15th, 2017, 07:16
Almost feeling vindicated in my network guess! :) heh heh
Good work!

Varsuuk
April 15th, 2017, 07:22
Ran 2 different pings on my MacBook - one to linux host, one to windows box. So far, saw where Windows was fine but Linux logged timeouts in two separate incidents, minutes apart.

Varsuuk
April 15th, 2017, 07:26
Still technically guess could be some sort of busyness or freeze on host but since haven't seen occur while typing on it directly, it would be a freeze related to the net stack not the whole box.

Didn't try the "rewiring" yet - now that have an "automated" test where don't have to rely on a lot of human triggered typing, can feel better about seeing it triggered.

Nighters, for real this time - got my rotator cuff torture by my near sadist PT ;) in 7 hours, need sleep.


Edit:
And yes, also going to look at this "jperf" thing to see if that gives more help than ping. Net should be gigabit. Not going to outside world, if was, its like 100-150 or so Verizon setup.

Saw 5 periods of varying length where ping times out on the ping to linux host and 0 over same time on ping to windows host.

Varsuuk
April 15th, 2017, 18:43
Just got home from PT, going to be trying stuff after I eat.

What got me confused is how even just typing on the putty (or mac Terminal) is paused - not just after hitting enter there is a delay. So apparently, the terminals have some comms going on at all times where when this happens it stops displaying what you type. Threw me off a bit. But yeah, it happens and stops happening at the exact moment ping goes timeout and returns to normal respectively. When wife goes to a Wedding Shower, will kill wife [LOL, it autocorrected from "WIFI" ... or DID it?](so the bazillion mobile and appliance devices are not connected and not getting IPs etc) and disconnect all physical ethernet cables except the linux host the cable router to outside, my test macbook and finally, my son's PC or he will attack me with a slew of nerf guns, light sabers and wooden swords and crossbow. I grew up when touch tone was still a "thing", so different worlds ;)

Andraax
April 15th, 2017, 20:07
Terminal sessions are full duplex. When you press a key, the terminal software sends that key to the host. The host then decides how to handle that keystroke (echo the character for regular typing, don't echo if echo is turned off for passwords or whatever, apply an editing command like up arrow or whatever) and then the host responds to the terminal. After that, the host waits again for the next keystroke and the process repeats. A terminal session is not "line mode" (by default) - it doesn't wait for you to type an entire line and press enter before the host decides what to do; if it did, you couldn't use normal editing keys.
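A toy model of why typed characters seem to vanish and then all show up at once. It is deliberately simplified: it ignores normal network latency and assumes keystrokes sent during an outage are held (TCP keeps retransmitting them) and echoed the moment the link returns. The function name and the numbers are hypothetical.

```python
def echo_times(keystroke_times, outage_start, outage_end):
    """Return the time each keystroke's echo appears on the client screen.

    Every key needs a round trip to the host before it is displayed, so keys
    typed while the link is down only show up once the outage is over.
    """
    shown = []
    for typed_at in keystroke_times:
        if outage_start <= typed_at < outage_end:
            shown.append(outage_end)  # echoed in a burst when the link recovers
        else:
            shown.append(typed_at)  # echoed (almost) immediately
    return shown


if __name__ == "__main__":
    # Keys typed at 0s, 1s, 2s, 3s and 10s, with the link down from 1.5s to 8s.
    print(echo_times([0.0, 1.0, 2.0, 3.0, 10.0], 1.5, 8.0))
```

In this model the keys typed at 2s and 3s both appear at 8s, which is the "catch up" effect seen in the ssh windows.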

Varsuuk
April 16th, 2017, 22:37
Yeah, Andraax - I was simply thinking (barely) about typing commands TO the host... of course when I am editing or doing anything interactive like hitting key to continue etc, it needs to be constant back and forth.


Had to change some things, so decided to take the time to reinstall as Mint.
Left it for a long time, overnight actually. Came back and it was "frozen" - now before leaving I connected a putty term to it (in case it had frozen screen/keyboard but not "background") and when looked it had the Window "network link broken" or whatever message.


When I installed Mint, all I did is install Mint and then do update (which updates some update software) then update again. I took ALL so it means the new kernel was installed. This might or might not be relevant. During initial install I clicked "take 3rd party stuff" but I did not choose either the new nvidia driver OR the microcode. I intended on selecting both after verified I had no "left alone crash" issues. Welp...


When I did my BIOS, I saw a suspend/resume option where it said "S1 or S3 or auto" - it was always auto in the past. I left it as such. In the Power section of the settings, I left never turn off monitor and set never to hibernate. I really don't want ANY power savings other than possibly drive spin things since I fear it may be related to suspend/hibernate etc on such an old setup (ASUS P5K Premium "Black Pearl" MB, E6850 (3.0GHz) CPU, 4 GB).


The syslog showed nothing but constant "ntpd: Soliciting pool server ip-address" sometimes many times a minute (like every 0-2 seconds or 40secs...it's odd looking to me, I never recalled so many log messages for ntp) Those were the last things logged at the time of the "freeze." No idea if related, probably not since it isn't coincidental since it is CONSTANTLY logging that even now. Doesn't seem it should be normal imo. There are other "red" entries in syslog, but nothing jumps at me (in my ignorance.)
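
A rough sketch for putting a number on "many times a minute", assuming the classic syslog line layout where the first 12 characters are month, day, hour and minute. The sample lines, the host name and the addresses are invented; the real log may be formatted differently:

```python
from collections import Counter
from typing import Iterable

def solicits_per_minute(lines: Iterable[str]) -> Counter:
    """Count 'Soliciting pool server' messages per minute of log time."""
    counts = Counter()
    for line in lines:
        if "Soliciting pool server" not in line:
            continue
        # Assumed timestamp form "Apr 16 22:37:01"; the first 12 characters
        # ("Apr 16 22:37") identify the minute.
        counts[line[:12]] += 1
    return counts

# Made-up sample lines, for illustration only.
sample = [
    "Apr 16 22:37:01 linuxbox ntpd[900]: Soliciting pool server 192.0.2.10",
    "Apr 16 22:37:03 linuxbox ntpd[900]: Soliciting pool server 192.0.2.11",
    "Apr 16 22:37:40 linuxbox kernel: some unrelated message",
    "Apr 16 22:38:20 linuxbox ntpd[900]: Soliciting pool server 192.0.2.12",
]

for minute, n in sorted(solicits_per_minute(sample).items()):
    print(minute, n)
```

To run it on the real log, something like `solicits_per_minute(open("/var/log/syslog"))` should work, assuming that is where this install writes its syslog.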


***OK...JUST finished copying a large dir to another drive so can reformat clean the big "data" partition and saw the screen "dimming" ... moved the mouse and it stopped the dim. I moved the mouse to the "Welcome Screen" to hit the X and close it to uncover the file windows and BLAM - mouse is frozen now. Tried putty, no connection to host.***
I don't know if this is any circumstantial evidence that it IS related to some sort of power savings... I used to think Linux was very forgiving of old hardware but these days I am getting the feeling it might be less so due to becoming more "first world" than in my day. Of course, maybe my hardware is broken in some subtle way.

Varsuuk
April 17th, 2017, 00:10
Yup... ignore my "machine freeze" report above (different from the now detected as network-related ping/ssh thing)

It froze copying one of the directories, this is sooo hosed. I left off on working on this a while back, left it just for Minecraft/TS3 and GIT - was working fine for that. I never got farther than that minimal thing because the SSH connection thing was too annoying to really use that host. Once I got new hardware for it, I decided to actually use it and re-osed.
Going to put Ubuntu 16.04 back on it and let it sit and see if it also gets hosed with freezes. If it does then I tell the boy we lost the MC server for now and eventually when have time I'll buy the parts and build a new server. Doing nothing but wasting folks time on this stuff and I don't have the understanding to fix it myself.

No need to respond unless I come back with it "working" under plain Ubuntu and I try to further troubleshoot the net issue.

Thanks SO much for your time guys.

Ken L
April 18th, 2017, 04:56
Are you using PUTTY on Mac? You can natively SSH via Mac's native command line. You need to look at this problem directly rather than tunnel visioning through that PUTTY lens.

Varsuuk
April 18th, 2017, 13:21
Yes, I chose mac vs using a usb stick because mac is a Linuxy OS.

I eliminated putty back on the 15th and was able to see how a continuous ping times out whenever this issue rears its head. And was not able to get the thing to happen yet typing directly on it.

Then the "overnight lockups" started and I am close to tossing the hardware. After repartitioning and copying tons of files and leaving it on for nearly 36 hours this time - it froze around 4:20am it seems by the screensaver time. Previously it would freeze much quicker when left unattended. Only shutdown works.

(I referred to it as "network related ping/ssh thing" in my last post because that identified the total and unrecoverable machine freeze as different from the OP issue.)

Varsuuk
April 20th, 2017, 06:21
So yeah... I checked the ASUS P5K Premium manual, found that the PCIe x16 slot is supposed to be x16 ONLY i.e., won't work with x1 or x4 according to the manual - but it's old so probably x8 wasn't yet invented. There is a light per slot to indicate if an incorrect card is inserted, and the light doesn't light up. But... there is more: it's a PCIe slot, no need to mention 1.0 because I guess that's all there was then. And the new EVGA 710 was PCIe 2.0 x8. Oops... mayhap that 1.0 vs 2.0 was causing the new full machine freeze (crash really since cannot ssh to it once it is frozen) :(
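
For reference, a back-of-envelope sketch of the link rates involved. It assumes the published per-lane signalling rates (2.5 GT/s for PCIe 1.x, 5 GT/s for 2.0, both with 8b/10b encoding) and the usual behaviour of a link settling on the lower generation and narrower width of the two ends. Whether this particular board and card actually negotiate that way is exactly the open question, so treat it as arithmetic only:

```python
# Per-lane raw signalling rate in megatransfers per second.
MT_PER_S = {"1.0": 2500, "2.0": 5000}
GEN_ORDER = ["1.0", "2.0"]

def lane_mb_per_s(gen: str) -> int:
    """Usable MB/s per lane, per direction: 8b/10b encoding, 8 bits per byte."""
    return MT_PER_S[gen] * 8 // 10 // 8

def link_mb_per_s(slot_gen: str, slot_width: int, card_gen: str, card_width: int) -> int:
    """Bandwidth if the link settles on the lower generation and width."""
    gen = min(slot_gen, card_gen, key=GEN_ORDER.index)
    width = min(slot_width, card_width)
    return lane_mb_per_s(gen) * width

# PCIe 2.0 x8 card in a PCIe 1.0 x16 slot, under the assumptions above.
print(link_mb_per_s("1.0", 16, "2.0", 8), "MB/s per direction")
```
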


Not gonna try to find a PCIe 1.0 passive vid card. I guess I will just buy a new Intel CPU and MB for it.


Is the new Intel i3 good enough for a Linux box? Remember, I was using an old Core 2 Duo E6850 3.0GHz before. Looking to buy inexpensive since didn't intend on building a new PC.
https://www.amazon.com/dp/B01NCESRJX/ref=psdc_229189_t1_B015VPX2EO

Figure, will take the old MB, memory and PSU to Florida in August when I visit my folks and upgrade their old AMD. Luckily, the vid card I bought that I will "eat" is only $39 - unless it is better than the i3's built in, but it's only PCIe 2.0, so guessing prob not as good as built in.




As for MB, I will never be using it with multiple video cards - the motherboards I see at lower cost seem to come with built-in hdmi/dp/dvi - so if the Processor PCI-E Configuration line is JUST for Graphics cards, maybe doesn't matter? Price diff on some is negligible, I only wanted to know if BOTH were acceptable.

I presume the built in Intel HD Graphics 630 will be fine for Linux OS? Won't be gaming on it, if ever did - it may be to have son play Minecraft with me in Den so we don't need to use TS when he plays from PC in his room. If sucks, can always buy a discrete in future.

OR... if should NOT go with new i3 and better off using some older chip for a specific reason/improvement - let me know.

Specifications                                  | Z270                       | H270
Processor Support                               | Kaby Lake/Skylake LGA 1151 | Kaby Lake/Skylake LGA 1151
CPU Overclocking                                | Yes                        | No
Processor PCI-E Configuration                   | 1x16 or 2x8 or 1x8+2x4     | 1x16
Chipset PCI-E Lanes (Gen)*                      | 24 (3.0)                   | 20 (3.0)
Maximum HSIO Lanes**                            | 30                         | 30
Max PCI-E Storage (x4 M.2 or x2 SATA Express)   | 3                          | 2
Independent Display Ports/Pipes                 | 3/3                        | 3/3
Mem/DIMMs Per Channel                           | 2/2                        | 2/2
USB Total (USB 3.0)                             | 14 (10)                    | 14 (8)
Total SATA 6Gb/s                                | 6                          | 6



Please feel free to suggest an MB from any seller, below was fast scan of Amazon may have missed good ones. Again remember, not a "main" pc - just a Linux box for occasional programming work and mostly TS3, GIT, Minecraft server.
https://www.amazon.com/GIGABYTE-GA-H270M-D3H-LGA1151-Crossfire-Motherboard/dp/B01NBX22H6/ref=sr_1_19?s=electronics&ie=UTF8&qid=1492660294&sr=1-19&keywords=kaby+lake+motherboard
https://www.amazon.com/MSI-Intel-CrossFire-Motherboard-PRO/dp/B01MR31OZ8/ref=sr_1_6?s=electronics&ie=UTF8&qid=1492658668&sr=1-6&keywords=kaby+lake+motherboard
https://www.amazon.com/GIGABYTE-GA-Z270M-D3H-LGA1151-Crossfire-Motherboard/dp/B01N9IGECC/ref=sr_1_3?s=electronics&ie=UTF8&qid=1492658668&sr=1-3&keywords=kaby+lake+motherboard
etc...

Ken L
April 20th, 2017, 12:13
I feel like you're making a mountain out of a mole hill here.

It's good that you've isolated it, but the next step is for your windows and mac machines to ping each other (open a port on one end and ping them) to see if the problem is the network switch. If it isn't, then I'd check the network card on your server.
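
One way to do the "open a port and check it" part without extra tools is a small TCP connect test. This is only a sketch; the address and port in the usage comment are placeholders, not values from this network:

```python
import socket

def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

# Hypothetical usage: run from the mac against the windows box, then the
# other way around, then against the linux host's SSH port.
# print(can_connect("192.168.1.50", 22))
```

If the two desktops reach each other reliably while the linux host keeps dropping, that points away from the switch and toward the server's network card or cabling.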

I still find it strange that your hosted services are doing fine yet you're willing to upend your box for terminal delay. I suspect it's something more that's been omitted. When I hit a network issue, it usually affects all running services; so color me skeptical.

Varsuuk
April 20th, 2017, 21:54
The host is hosed. Mint will not boot after the last system freeze. It didn't shutdown clean so I would need to probably fix something on file system which I wouldn't do as it was just a fresh install and could just redo it in <30 mins.

System freeze is not the ping / network issue. It is something that only started occurring after I put in the new video card. As I mentioned in my last post, I found out the video card I bought is not in spec for this old MB. Now that I know this, I don't want to keep using it. Unfortunately the old video card is DVI only and my new monitor cannot accept DVI (2 hdmi, 1 DP, 1 mDP only).

Since all I need to replace is CPU, MB, mem, I can keep it under $400 (maybe closer to $300ish) so figure go that way.

I can reinsert the vid card with the old MB and mem and replace my parents' older pc (they just browse with it, in their 80s) so I even have a home for the old hardware once I fly to Florida to visit this year.


As I mentioned before, System totally locking up is NOT ;) the issue I used to see. It's new and different. Makes sense it could be related to a PCIe 2.0 x8 Vid card being inserted into a PCIe x16 ONLY card socket.