Large Module load time - Mac vs Win



mattekure
April 28th, 2020, 17:42
I have previously reported on the long delay when opening the sounds window while my sound module is loaded (it contains nearly 25,000 nodes). On my Windows machine (very fast CPU, plenty of RAM, and an M.2 drive), loading the initial window takes around 9 minutes.

I am getting reports from Mac users that the loading time is significantly shorter on a Mac. I have seen reports of as little as 15 seconds and 20 seconds to load the window and have access to the sound links.

zuilin
April 28th, 2020, 17:49
15.5 seconds tops--including my clicking-the-button-and-hitting-the-stopwatch differential. When I do the same process on an equivalently spec'd Windows machine, it's minutes, and minutes, and minutes, and minutes (about 9 of them for me, too).

celestian
April 28th, 2020, 18:38
This is a very interesting development and I'm curious to see what ends up being the issue. The performance on the Mac seems to imply this isn't how things are done in Unity but something else... it would be great if some adjustments to the compiler sequence would resolve the bulk of the performance issues on Windows.

zuilin
April 28th, 2020, 18:47
The first thing that came to mind for me, and I don't actually develop on Unity at all, so take it with a grain (or 10) of salt, is that Unity was Mac first--maybe there's something deep below...

zuilin
April 28th, 2020, 19:20
I just ran the exact same process (open FGU, click Syrinscape Sounds button in side bar) in both macOS Catalina 10.15.4 and Windows 10 (current update as of April 28, 2020--not sure what it's called) under BootCamp on the exact same computer; a 2019 16" MacBook Pro. All things considered, BootCamp should be faster since the OS won't throttle the CPU like macOS does.

Same computer:

FGU with my dataset on macOS 10.15.4: 15.5 seconds

FGU with the exact same dataset on Windows 10: 4 minutes, 7 seconds
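
For scale, the two timings above can be turned into a single slowdown factor. This is only a quick arithmetic sketch in Python; the function name `slowdown` is made up for illustration, and the inputs are the two stopwatch figures quoted above.

```python
def slowdown(fast_seconds: float, slow_seconds: float) -> float:
    """Return how many times longer the slow run took than the fast run."""
    return slow_seconds / fast_seconds


# Figures from the same-machine test above.
mac_seconds = 15.5            # macOS 10.15.4
win_seconds = 4 * 60 + 7      # Windows 10 under BootCamp: 4 min 7 s = 247 s

print(f"Windows took {slowdown(mac_seconds, win_seconds):.1f}x as long as macOS")
```

On those numbers Windows comes out roughly 16 times slower on identical hardware.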

pollux
April 28th, 2020, 19:22
In another thread (sorry, I don't have the link handy) someone mentioned a significant speedup when running FG off of a ramdrive. Especially oddly, they noted that running the PROGRAM DIRECTORY off the ramdrive had a bigger impact than running the data dir... which is not what I would have predicted.

I'm speculating wildly, so take this with a grain of salt... but I wonder if certain FG Lua APIs (the ones that get hit when loading lists of entities) trigger filesystem system calls that are expensive on Windows. I know, for example, that accessing the Windows filesystem from WSL is also very slow. Perhaps batching, caching, or reading ahead in some fashion could reduce the per-call overhead. I watched FGU with procmon and didn't see huge numbers of FS calls to the program directory to support this theory, though, so... I dunno. Speculation. Certain FS calls being expensive on Windows is a relatively well-known thing, though, and might help explain differences in cross-platform performance.
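
To make the "per-call overhead vs. batching" idea concrete, here is a minimal Python sketch. It has nothing to do with FGU's actual code, which nobody in this thread has seen. It only compares one `os.stat()` call per file against a single `os.scandir()` pass over the same directory. The file names (`node_0.xml` and so on), the file count, and the `demo` helper are all made up for illustration.

```python
import os
import tempfile
import time


def sizes_per_call(directory, names):
    """One os.stat() call per file: a separate filesystem request each time."""
    return {name: os.stat(os.path.join(directory, name)).st_size for name in names}


def sizes_batched(directory):
    """One os.scandir() pass; on Windows the entry metadata arrives with the scan."""
    with os.scandir(directory) as entries:
        return {entry.name: entry.stat().st_size for entry in entries if entry.is_file()}


def demo(file_count):
    """Create file_count small files, then time both approaches on them."""
    with tempfile.TemporaryDirectory() as directory:
        names = [f"node_{i}.xml" for i in range(file_count)]  # hypothetical names
        for i, name in enumerate(names):
            with open(os.path.join(directory, name), "wb") as handle:
                handle.write(b"x" * i)

        start = time.perf_counter()
        per_call = sizes_per_call(directory, names)
        per_call_seconds = time.perf_counter() - start

        start = time.perf_counter()
        batched = sizes_batched(directory)
        batched_seconds = time.perf_counter() - start

    return per_call, batched, per_call_seconds, batched_seconds


if __name__ == "__main__":
    per_call, batched, t_each, t_batch = demo(2000)
    print(f"same result: {per_call == batched}")
    print(f"per-file stat: {t_each:.4f}s, single scandir pass: {t_batch:.4f}s")
```

The timings will vary by OS, filesystem, and antivirus setup, so treat any difference you see as a hint about where per-call cost lives, not as proof of what FGU is doing.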

celestian
April 28th, 2020, 21:20
My system for FGU runs off SSDs. A "ram disk" would be a little faster, but not much, access-wise. If the user doing the testing was on spinning disks and then tried a ram disk, I could see the improvement being dramatic.

zuilin
April 28th, 2020, 21:43
Mine's the same computer, same SSD. Different OS's. Way faster in macOS than Windows 10.

celestian
April 28th, 2020, 21:48
Yeah, my response was specifically directed at the thought that it was ram disks. If it were just that, my system (Windows) would be fine, IMO.

zuilin
April 28th, 2020, 21:50
Ah yes, right. Makes sense.

pollux
April 28th, 2020, 22:07
I asked the poster whether they were running on an SSD previously or not but I don't think they responded. Depending on the SSD, a ramdisk still can be pretty significantly faster, though: https://www.geckoandfly.com/21507/ramdisk-virtual-disk-memory/

But my comment was not meant to suggest that we should all run FGU off ramdisks (which is probably pretty dangerous for the average non-technical user from a data-safety standpoint), but rather as weak evidence to support the theory that there's some (possibly unnecessary) filesystem access at the root of the performance differential.

mattekure
April 29th, 2020, 04:47
One other thing I’ve noticed in my testing is that the load times do not appear to be linear with the number of nodes. Loading a module with 9,000 nodes took between 30 and 45 seconds, while loading one with 25,000 took 10 minutes. That is a bit under 3x the nodes but roughly 13x to 20x the loading time.
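
As a rough way to quantify that non-linearity, the sketch below fits a power law (load time proportional to nodes^k) through the two measurements above. Two data points are far too few to pin down the real behaviour, and the power-law form itself is an assumption, so this is only a back-of-the-envelope estimate. The function name `scaling_exponent` is made up for illustration.

```python
import math


def scaling_exponent(n_small, t_small, n_large, t_large):
    """Estimate k in time ~ nodes**k from two (nodes, seconds) measurements."""
    return math.log(t_large / t_small) / math.log(n_large / n_small)


# Figures from the post above: 9,000 nodes in 30-45 s, 25,000 nodes in 10 min.
large_nodes, large_seconds = 25_000, 10 * 60
for small_seconds in (30, 45):
    k = scaling_exponent(9_000, small_seconds, large_nodes, large_seconds)
    print(f"9,000 nodes in {small_seconds}s -> exponent k of about {k:.2f}")
```

With these figures k lands between about 2.5 and 2.9. Linear scaling would give k = 1, so the numbers look more like something between quadratic and cubic growth, the kind of pattern you might see if each node being added triggers work over the nodes already loaded. That last part is a guess, not something the measurements can confirm.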