YouTube user @viti95 suggested avoiding calls to other procedures from within the assembly code, since the CALL itself (and the matching RET) carries overhead. I added another test: C++ calling ASM with the WATCOMC calling convention, but with no sub-calls (all of the functionality lives in a single ASM procedure). It had a noticeable impact.
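To make the idea concrete, here is a rough C++ sketch of the two shapes being compared. This is not the ASM from my tests: the 320x200 one-byte-per-pixel buffer and the names `put_pixel`, `fill_with_calls`, `fill_single_routine`, and `make_framebuffer` are all made up for illustration. The first fill goes through a helper for every pixel (kept out of line where the compiler supports it, so the call really happens), and the second does all the work in one routine.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical dimensions for illustration only (one byte per pixel).
constexpr std::size_t kWidth = 320;
constexpr std::size_t kHeight = 200;

// Keep the helper out of line on compilers that support it, so each
// use is a real call. On other compilers this macro expands to nothing
// and the optimizer may inline the helper anyway.
#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE
#endif

// Allocate a zeroed buffer of kWidth * kHeight bytes.
std::vector<std::uint8_t> make_framebuffer() {
    return std::vector<std::uint8_t>(kWidth * kHeight, 0);
}

// Helper that writes a single pixel. Each call pays for argument
// passing, the call itself, and the return.
NOINLINE void put_pixel(std::uint8_t* fb, std::size_t x, std::size_t y,
                        std::uint8_t color) {
    fb[y * kWidth + x] = color;
}

// Version 1: one helper call per pixel.
// Assumes fb holds at least kWidth * kHeight bytes.
void fill_with_calls(std::vector<std::uint8_t>& fb, std::uint8_t color) {
    for (std::size_t y = 0; y < kHeight; ++y) {
        for (std::size_t x = 0; x < kWidth; ++x) {
            put_pixel(fb.data(), x, y, color);
        }
    }
}

// Version 2: everything in a single routine, no sub-calls.
// Assumes fb holds at least kWidth * kHeight bytes.
void fill_single_routine(std::vector<std::uint8_t>& fb, std::uint8_t color) {
    std::uint8_t* p = fb.data();
    const std::size_t count = kWidth * kHeight;
    for (std::size_t i = 0; i < count; ++i) {
        p[i] = color;
    }
}
```

Both functions leave the buffer in the same state; the only difference is how many calls happen along the way. How much that matters will depend on the CPU and compiler, so treat this as a shape to measure rather than a guaranteed win.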
As I work on this graphics code, I will post updated code to GitHub: rehsd/FreeDOS_AppCode (misc. work for apps to run in FreeDOS on my builds).