This is a Katana + Arnold render problem.
The render itself finishes. The error occurs after rendering reaches 100%, when Arnold tries to save the first image to local disk (there is enough free space and no permission restriction). If I retry the job on the same machine, it renders and finishes properly, so the failure appears to be random.
The failure rate across the whole farm is about 3%, and it is not tied to a particular job or machine: it shows up on every job and every slave.
Here is a chunk of the log as an example:
-------------------------------------------------------------
2021-10-17 11:15:13: 0: STDOUT: [INFO python.RenderLog]: 00:25:16 110369MB | 5% done - 53821 rays/pixel
2021-10-17 11:25:32: 0: STDOUT: [INFO python.RenderLog]: 00:35:35 111589MB | 10% done - 58983 rays/pixel
2021-10-17 11:35:46: 0: STDOUT: [INFO python.RenderLog]: 00:45:49 112424MB | 15% done - 55302 rays/pixel
2021-10-17 11:44:18: 0: STDOUT: [INFO python.RenderLog]: 00:54:21 112703MB | 20% done - 62043 rays/pixel
2021-10-17 11:57:11: 0: STDOUT: [INFO python.RenderLog]: 01:07:15 112870MB | 25% done - 54049 rays/pixel
2021-10-17 12:07:45: 0: STDOUT: [INFO python.RenderLog]: 01:17:48 112901MB | 30% done - 53440 rays/pixel
2021-10-17 12:17:54: 0: STDOUT: [INFO python.RenderLog]: 01:27:58 112957MB | 35% done - 62832 rays/pixel
2021-10-17 12:18:22: 0: STDOUT: [INFO python.RenderLog]: 01:28:26 112960MB WARNING | ignoring infinity/NaN image sample in pixel (514,585), rgba=(nan,nan,nan,0.884671986)
2021-10-17 12:18:22: 0: STDOUT: [INFO python.RenderLog]: 01:28:26 112960MB WARNING | ignoring infinity/NaN image sample in pixel (514,585), rgba=(nan,nan,nan,0.884671986)
2021-10-17 12:18:25: 0: STDOUT: [INFO python.RenderLog]: 01:28:29 112960MB WARNING | ignoring infinity/NaN image sample in pixel (519,585), rgba=(nan,nan,nan,0.887660205)
2021-10-17 12:18:25: 0: STDOUT: [INFO python.RenderLog]: 01:28:29 112960MB WARNING | ignoring infinity/NaN image sample in pixel (519,585), rgba=(nan,nan,nan,0.989554942)
2021-10-17 12:22:35: 0: STDOUT: [INFO python.RenderLog]: 01:32:39 112997MB | 40% done - 39237 rays/pixel
2021-10-17 12:29:18: 0: STDOUT: [INFO python.RenderLog]: 01:39:21 113038MB | 45% done - 35046 rays/pixel
2021-10-17 12:37:46: 0: STDOUT: [INFO python.RenderLog]: 01:47:49 113061MB | 50% done - 38456 rays/pixel
2021-10-17 12:41:32: 0: STDOUT: [INFO python.RenderLog]: 01:51:35 113076MB | 55% done - 25652 rays/pixel
2021-10-17 12:46:48: 0: STDOUT: [INFO python.RenderLog]: 01:56:51 113104MB | 60% done - 32125 rays/pixel
2021-10-17 12:52:50: 0: STDOUT: [INFO python.RenderLog]: 02:02:54 113127MB | 65% done - 30167 rays/pixel
2021-10-17 12:55:01: 0: STDOUT: [INFO python.RenderLog]: 02:05:05 113143MB | 70% done - 15677 rays/pixel
2021-10-17 12:57:43: 0: STDOUT: [INFO python.RenderLog]: 02:07:46 113158MB | 75% done - 13627 rays/pixel
2021-10-17 13:00:50: 0: STDOUT: [INFO python.RenderLog]: 02:10:53 113175MB | 80% done - 18549 rays/pixel
2021-10-17 13:05:23: 0: STDOUT: [INFO python.RenderLog]: 02:15:26 113195MB | 85% done - 25736 rays/pixel
2021-10-17 13:09:57: 0: STDOUT: [INFO python.RenderLog]: 02:20:01 113229MB | 90% done - 28410 rays/pixel
2021-10-17 13:17:20: 0: STDOUT: [INFO python.RenderLog]: 02:27:23 113264MB | 95% done - 63460 rays/pixel
2021-10-17 13:22:41: 0: STDOUT: [INFO python.RenderLog]: 02:32:44 113305MB | 100% done - 67248 rays/pixel
2021-10-17 13:23:17: 0: STDOUT: [INFO python.RenderLog]: 02:33:20 10486MB | render done in 2:32:16.257
2021-10-17 13:23:17: 0: STDOUT: [INFO python.RenderLog]: 02:33:20 10486MB | [driver_exr] writing file `C:\Users\render\AppData\Local\Temp\katana_tmpdir_8284\render000004.exr'
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: 02:33:23 10633MB ERROR | signal caught: error C0000005 -- access violation
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: ****
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * Arnold 5.1.1.1 [3849b993] windows icc-17.0.2 oiio-1.7.17 osl-1.9.0 vdb-4.0.0 clm-1.0.3.513 rlm-12.2.2 2018/06/26 21:12:06
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * CRASHED in AiMsgWarning at 00:00:09, pixel (0, 0)
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * signal caught: error C0000005 -- access violation
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: *
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * backtrace:
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 0 0x00007ffc4ca384de [ai ]
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 1 0x00007ffc4ca3777f [ai ]
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 2 0x00007ffc8362feea [KERNELBASE] UnhandledExceptionFilter
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 3 0x00007ffc85924ab2 [ntdll ] memset
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 4 0x00007ffc8590c656 [ntdll ] _C_specific_handler
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 5 0x00007ffc859211cf [ntdll ] _chkstk
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 6 0x00007ffc858ea209 [ntdll ] RtlRaiseException
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 7 0x00007ffc8591fe3e [ntdll ] KiUserExceptionDispatcher
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: >> 8 0x00007ffc4ca736bf [ai ] AiMsgWarning
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 9 0x00007ffc4ca41c46 [ai ]
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 10 0x00007ffc4ca73843 [ai ] AiMsgWarning
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 11 0x00007ffc4ca464e5 [ai ]
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 12 0x00007ffc4ca73851 [ai ] AiMsgWarning
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 13 0x00007ffc4c906a37 [ai ] AiMsgTab
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 14 0x00007ffc4ca57710 [ai ]
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 15 0x00007ffc4c836f01 [ai ] AiAOVSampleIteratorGetNextDepth
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 16 0x00007ffc4c9f7bfc [ai ] AiNodeSetStr
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 17 0x00007ffc4ca3eb7c [ai ]
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 18 0x00007ffc4c8e2d30 [ai ] AiVolumeSampleFltFunc
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 19 0x00007ffc4c8e21c0 [ai ] AiVolumeSampleFltFunc
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 20 0x00007ffc83450e82 [ucrtbase ] beginthreadex
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 21 0x00007ffc84157bd4 [KERNEL32 ] BaseThreadInitThunk
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 22 0x00007ffc858ece51 [ntdll ] RtlUserThreadStart
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: *
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * loaded modules:
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 0x00007ffc4c790000 ai
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 0x00007ffc83530000 KERNELBASE
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 0x00007ffc85880000 ntdll
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 0x00007ffc83430000 ucrtbase
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: * 0x00007ffc84140000 KERNEL32
2021-10-17 13:23:20: 0: STDOUT: [INFO python.RenderLog]: ****
2021-10-17 13:23:20: 0: Done executing plugin command of type 'Render Task'
=======================================================
Details
=======================================================
Date: 10/17/2021 13:23:23
Frames: 131
Elapsed Time: 00:02:34:40
Job Submit Date: 10/16/2021 17:07:38
Job User: jiangxiaoping
Average RAM Usage: 121950117888 (89%)
Peak RAM Usage: 125436743680 (92%)
Average CPU Usage: 99%
Peak CPU Usage: 100%
Used CPU Clocks (x10^6 cycles): 1229811200
Total CPU Clocks (x10^6 cycles): 1242233473
=======================================================
Slave Information
=======================================================
Slave Name: render-46
Version: v10.0.7.0 Release (a0f30a477)
Operating System: Windows 10
Running As Service: Yes
Machine User: render
IP Address: 192.168.25.146
MAC Address: EC:F4:BB:BF:E0:0C
CPU Architecture: x64
CPUs: 56
CPU Usage: 3%
Memory Usage: 22.8 GB / 127.9 GB (17%)
Free Disk Space: 307.662 GB (215.490 GB on C:\, 92.172 GB on D:\)
Video Card: Microsoft Basic Display Adapter
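
To check how many failed tasks match this exact pattern (an access violation right after the `[driver_exr] writing file` line), here is a minimal sketch of a log scanner. It assumes the task log is available as a list of text lines. `find_write_crash` is a hypothetical helper written for this post, not part of Deadline, Katana, or Arnold, and the patterns are taken only from the log lines above.

```python
import re

# Patterns copied from the log above. Other Arnold versions may word these
# messages differently, so treat them as assumptions.
WRITE_RE = re.compile(r"\[driver_exr\] writing file `(?P<path>[^']+)'")
CRASH_RE = re.compile(r"CRASHED in (?P<func>\w+)")
VIOLATION = "access violation"


def find_write_crash(lines):
    """Return (exr_path, crash_function) if an access violation is logged
    after an EXR write has started, otherwise None.

    crash_function is None when no 'CRASHED in ...' line follows.
    """
    exr_path = None
    saw_violation = False
    for line in lines:
        match = WRITE_RE.search(line)
        if match:
            # Remember the most recent file the EXR driver started writing.
            exr_path = match.group("path")
            continue
        if exr_path is None:
            # Ignore everything logged before the first write attempt.
            continue
        if VIOLATION in line:
            saw_violation = True
        match = CRASH_RE.search(line)
        if match and saw_violation:
            return exr_path, match.group("func")
    return (exr_path, None) if saw_violation else None


# Three lines taken from the log above, shortened to the message part.
sample = [
    r"02:33:20 10486MB | [driver_exr] writing file `C:\Users\render\AppData\Local\Temp\katana_tmpdir_8284\render000004.exr'",
    "02:33:23 10633MB ERROR | signal caught: error C0000005 -- access violation",
    "* CRASHED in AiMsgWarning at 00:00:09, pixel (0, 0)",
]
result = find_write_crash(sample)
```

Running this over the logs of all failed tasks would show whether every failure happens at the EXR write step and in the same function, or whether the roughly 3% covers several different crashes.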