Hey everyone,
I have been getting huge file I/O times, with low CPU, RAM and network utilization, when sending jobs to my remote farm on AWS. The remote farm reaches the relevant files on my on-prem network drive over a site-to-site VPN.
I have read elsewhere that cache issues can cause high file I/O times, because the same data gets evicted and re-read repeatedly, and that expanding the cache size fixes it.
Here is my OpenImageIO log:
725 5.7 6m 41.1s 4096x4096x1.u16 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/jeepBroken/v_01/brokenJeepA_02_Roughness.1002.tx MIP-COUNT[237,272,132,57,16,4,1,1,1,1,1,1,1]
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 43 1 110 0.9 1m 21.7s 4096x4096x1.u16 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/jeepBroken/v_01/brokenJeepA_02_Roughness.1003.tx MIP-COUNT[0,28,26,29,16,4,1,1,1,1,1,1,1]
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 44 1 1162 9.1 9m 49.5s 4096x4096x1.u16 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/jeepBroken/v_01/brokenJeepA_02_Roughness.1004.tx MIP-COUNT[433,424,215,63,16,4,1,1,1,1,1,1,1]
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 45 1 199 1.6 2m 12.4s 4096x4096x1.u16 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/jeepBroken/v_01/brokenJeepA_02_Roughness.1005.tx MIP-COUNT[1,44,88,39,16,4,1,1,1,1,1,1,1]
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 46 1 1224 9.6 11m 30.5s 4096x4096x1.u16 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/jeepBroken/v_01/brokenJeepA_02_Roughness.1006.tx MIP-COUNT[253,644,237,63,16,4,1,1,1,1,1,1,1]
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 47 1 1 32.0 2m 41.0s 2048x2048x4.f16 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/plants_3d_sjlkG/Redstem_Colocasia_2K_Albedo.exr UNTILED
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB |
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | Tot: 42 63088 965.5 ( 56 0.5) 21h 29m 55.1s
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 5 were exact duplicates of other images
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 1 not tiled, 1 not MIP-mapped
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 8 were constant-valued in all pixels
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | Broken or invalid files: 5
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 1 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/jeepBroken/masks/tires_mask_v01_1001.tx
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 2 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/jeepBroken/masks/tires_mask_v01_1002.tx
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 3 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/jeepBroken/masks/tires_mask_v01_1003.tx
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 4 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/jeepBroken/masks/tires_mask_v01_1005.tx
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 5 /mnt/prod/directors/peter_sluszka/mcds54_multipack/mcds54_jurassic_world/production/maya/textures/jeepBroken/masks/tires_mask_v01_1006.tx
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | -----------------------------------------------------------------------------------
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | number of warnings, warning type:
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 91: [disp] %s: padding is at least %.3gx smaller than it should be! given disp_padding: %.9g, recommended: %.9g
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | 3: [subdiv] %s: polymesh does not have UV coordinates, ignoring subdiv_smooth_derivs
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | -----------------------------------------------------------------------------------
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | performance warnings:
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB WARNING | You are tracing 2289 rays per pixel. You may want to double check and reduce your sampling settings.
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB WARNING | Rendering CPU utilization was only 18%. Your render may be bound by a single threaded process or I/O.
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | -----------------------------------------------------------------------------------
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB |
2021-01-17 22:32:20: 0: STDOUT: 00:55:51 2950MB | releasing resources
2021-01-17 22:32:20: 0: STDOUT: 00:55:52 2457MB | Arnold shutdown
This section in particular is why I suspect cache issues:
Images : 47 unique
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | ImageInputs : 42 created, 36 current, 36 peak
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | Total pixel data size of all images referenced : 3.2 GB
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | Total actual file size of all images referenced : 787.3 MB
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | Pixel data read : 965.5 MB
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | File I/O time : 21h 29m 55.1s (40m 18.6s average per thread, for 32 threads)
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | File open time only : 7m 46.9s
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | ImageInput mutex locking time : 16m 59.3s
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | Tiles: 63473 created, 63032 current, 63032 peak
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | total tile requests : 7256450755
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | micro-cache misses : 805361013 (11.0986%)
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | main cache misses : 63473 (0.000874711%)
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | redundant reads: 56 tiles, 480 KB
2021-01-17 22:32:19: 0: STDOUT: 00:55:51 2950MB | Peak cache memory : 966.0 MB
I find this concerning because the peak cache memory is 966 MB while the total pixel data size is 3.2 GB. Am I interpreting this correctly? If so, how do I change the texture cache size when using MtoA on Linux via the command line or config files? My render nodes don't run a desktop environment.
Thanks and have a great day!
Only 965.5 MB of that 3.2 GB was actually read for the render.
Peak cache memory is 966 MB, with only 480 KB of redundant texture reads.
So there is no problem with the texture cache size, and no cache thrashing.
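As a rough cross-check, the figures from the log can be plugged into a few lines of Python. This is only a sketch: the variable names are made up for illustration, the numbers are copied by hand from the statistics above, and treating 1 GB as 1024 MB is an assumption about how the log rounds.

```python
# Figures copied by hand from the OpenImageIO statistics in the log above.
pixel_data_referenced_mb = 3.2 * 1024   # "Total pixel data size of all images referenced" (assumes 1 GB = 1024 MB)
pixel_data_read_mb = 965.5              # "Pixel data read"
peak_cache_mb = 966.0                   # "Peak cache memory"
tiles_created = 63473                   # "Tiles: 63473 created"
redundant_tiles = 56                    # "redundant reads: 56 tiles"
total_tile_requests = 7256450755        # "total tile requests"
main_cache_misses = 63473               # "main cache misses"
file_io_seconds = 21 * 3600 + 29 * 60 + 55.1   # "File I/O time : 21h 29m 55.1s"
threads = 32

# Share of the referenced pixel data that the render actually needed.
read_fraction = pixel_data_read_mb / pixel_data_referenced_mb

# Thrashing would show up as many tiles being evicted and read again.
redundant_fraction = redundant_tiles / tiles_created

# Main cache miss rate, in percent, as reported in the log.
main_miss_pct = 100.0 * main_cache_misses / total_tile_requests

# Where the time goes: I/O time per thread and per tile read from disk.
io_seconds_per_thread = file_io_seconds / threads
io_seconds_per_tile = file_io_seconds / tiles_created

print(f"fraction of referenced pixels read: {read_fraction:.1%}")
print(f"redundant tile reads:               {redundant_fraction:.3%}")
print(f"main cache miss rate:               {main_miss_pct:.9f}%")
print(f"I/O time per thread:                {io_seconds_per_thread / 60:.1f} min")
print(f"I/O time per tile read:             {io_seconds_per_tile:.2f} s")
```

Fewer than 0.1% of the tiles were read more than once, so a larger cache would have almost nothing to save. The average of more than a second of I/O time per tile points at the storage path instead.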
OIIO does lots of frequent, tiny reads from disk. If your data sits on a high-latency drive, such as your on-prem drive reached over the VPN, performance will be poor. You'll probably want to copy the texture files to AWS.
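One way to do that copy is a small staging step that runs on the render node before the job starts. The following is a minimal sketch, not a tested pipeline tool: the function name `stage_textures`, the suffix list and the example paths in the usage note are all hypothetical, and it assumes the source drive is mounted on the node.

```python
import shutil
from pathlib import Path


def stage_textures(src_root, dst_root, suffixes=(".tx", ".exr")):
    """Copy texture files from src_root to dst_root, keeping sub-directories.

    A file is skipped when the destination copy already has the same size
    and an equal or newer modification time. Returns the list of
    destination paths that were copied on this call.
    """
    src_root = Path(src_root)
    dst_root = Path(dst_root)
    copied = []
    for src in sorted(src_root.rglob("*")):
        # Only regular files with a texture suffix are staged.
        if not src.is_file() or src.suffix.lower() not in suffixes:
            continue
        dst = dst_root / src.relative_to(src_root)
        if dst.exists():
            s, d = src.stat(), dst.stat()
            if d.st_size == s.st_size and d.st_mtime >= s.st_mtime:
                continue  # destination is already up to date
        dst.parent.mkdir(parents=True, exist_ok=True)
        # copy2 reads the file sequentially and preserves the timestamp,
        # which is what lets the up-to-date check above work next time.
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

It could be called as `stage_textures("/mnt/prod/.../textures", "/local/textures")`, where both paths are placeholders. Each file then crosses the VPN once as a large sequential read instead of many small ones. The scene's texture paths would still need to resolve to the staged copy, and how that is done depends on the pipeline.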