Troubleshooting FreeDOS Boot on my 286 System - Hangs on execrh() Call in Kernel
Updated: 13 hours ago
As I continue to work towards getting FreeDOS running on my 286 homebrew build, I am encountering a new issue. During boot, the FreeDOS kernel makes a call to execrh(), and the system hangs at that point. Most likely, my system is missing something that FreeDOS expects to be there.
I will update this post as I receive suggestions, test changes, and learn more.
During startup of the kernel, the final steps prior to the system hanging are as follows (the best that I can tell, at least):
1. The truename() function [newstuff.c] is called.
2. The truename() function then calls media_check() [fatfs.c].
3. The media_check() function calls rqblockio() in the same file.
4. The rqblockio() function then calls execrh() [execrh.asm].
5. I am speculating that the execrh() routine is jumping to FL_DISKCHANGED in floppy.asm, based on the parameters. If correct, FL_DISKCHANGED should then raise interrupt 13H, function 16H (read change status type). My 286 system logging is not seeing this interrupt making it through.
   Scratch #5. It appears the execrh() call should be going to routines in dsk.c. I'm digging deeper into this.
6. The execrh() function makes a call. The kernel execution stops at this point.
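As a rough model of the last step, a DOS device driver call hands a request header to the driver's strategy routine and then invokes its interrupt routine. The hosted C sketch below imitates that two-step dispatch for a media check request. It is not the FreeDOS code: the struct layouts, the names execrh_model(), blk_strategy(), blk_interrupt(), and blk_dev, and the NULL guard are all assumptions made for illustration. As far as I can tell, the real assembly routine performs far calls through the device header with no such guard, so a bad header pointer there would simply jump into the weeds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified request header. Field names are illustrative and only
   loosely follow DOS conventions; the real structure has more fields. */
struct request {
    uint8_t  r_length;
    uint8_t  r_unit;
    uint8_t  r_command;
    uint16_t r_status;
};

#define CMD_MEDIACHECK      1        /* DOS driver command 1: media check */
#define STATUS_DONE         0x0100u  /* "done" bit of the status word */
#define STATUS_ERROR        0x8000u  /* "error" bit of the status word */
#define ERR_UNKNOWN_COMMAND 3u       /* DOS driver error code 3 */

/* Simplified device header: only the two entry points (hypothetical layout). */
struct device_header {
    void (*strategy)(struct request *rq);
    void (*interrupt)(void);
};

/* The strategy routine only remembers which request to process. */
static struct request *pending;

void blk_strategy(struct request *rq)
{
    pending = rq;
}

/* The interrupt routine does the work and fills in the status word. */
void blk_interrupt(void)
{
    if (pending == NULL)
        return;
    if (pending->r_command == CMD_MEDIACHECK)
        pending->r_status = (uint16_t)STATUS_DONE;
    else
        pending->r_status =
            (uint16_t)(STATUS_ERROR | STATUS_DONE | ERR_UNKNOWN_COMMAND);
    pending = NULL;
}

const struct device_header blk_dev = { blk_strategy, blk_interrupt };

/* Model of the dispatch: strategy first, then interrupt.
   Returns -1 when the header is unusable (a guard added for this sketch). */
int execrh_model(struct request *rq, const struct device_header *dev)
{
    assert(rq != NULL);
    if (dev == NULL || dev->strategy == NULL || dev->interrupt == NULL)
        return -1;
    rq->r_status = 0;
    dev->strategy(rq);
    dev->interrupt();
    return 0;
}
```

If the status word never gets its "done" bit, or the call never returns at all, the device header that was passed in is the first thing worth inspecting.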
I built a debug version of the kernel and added some additional logging. Here are screen captures of my system booting. See the later video for updated boot output.
I have not been able to figure out where the execrh() function is calling to, or why the system hangs at this point.
If I place the drive (actually, a CF Card) in a standard Dell PC with a PCIe to ISA to CF Card adapter, it seems to boot fine, so the problem must lie in the BIOS or hardware of my 286 build.
80286 with 640 KB of RAM, 128 KB of I/O which includes VGA, 256 KB of ROM BIOS. No high memory (HMA) is configured in the system.
BIOS built with NASM 2.16.01 on Windows 11 development PC. BIOS installed on high/low byte flash ICs on 286 system board.
FreeDOS virtual machine running on Windows 11 development PC. Used for building FreeDOS for 286 system.
FreeDOS kernel source 2.43
FreeDOS bootloader and kernel are on a CF Card, using 8086 with FAT32 options. The CF Card is connected to an ISA IDE adapter in the 286 system.
My BIOS currently supports the following INT13H functions (disk.asm):
0x00: reset disk system
0x02: read disk sectors
0x08: get current drive params
0x15: read DASD type
0x16: disk change status
0x41: check extensions present
0x42: extended read sectors
0x48: extended read drive params
I have not yet added write sector functions to the BIOS. I am not seeing any calls to write functions in the kernel boot process.
I log all interrupt calls through my 286 serial debugger. I can also catch any interrupt calls where support is not present in my BIOS. The execrh() function does not appear to be calling any BIOS interrupts (or at least no interrupt calls are making it to the BIOS).
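To illustrate the idea of catching unsupported calls, here is a small hosted C model of an INT 13H dispatcher that accepts only the function numbers listed above and flags everything else. The struct int13_regs type and the function names are hypothetical, and the real BIOS does this in assembly (disk.asm). The only conventions assumed here are the usual ones: carry set with AH = 01h reports an invalid function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal register image for this model (hypothetical type). */
struct int13_regs {
    uint8_t ah;     /* function number on entry, status on error */
    uint8_t dl;     /* drive number */
    bool    carry;  /* carry flag on return */
};

/* Function numbers taken from the list of supported INT 13H calls. */
static const uint8_t supported[] = {
    0x00, 0x02, 0x08, 0x15, 0x16, 0x41, 0x42, 0x48
};

static unsigned unsupported_calls;  /* stands in for a serial log line */

bool int13_is_supported(uint8_t ah)
{
    for (size_t i = 0; i < sizeof supported / sizeof supported[0]; i++) {
        if (supported[i] == ah)
            return true;
    }
    return false;
}

void int13_dispatch(struct int13_regs *r)
{
    assert(r != NULL);
    if (!int13_is_supported(r->ah)) {
        unsupported_calls++;   /* a real BIOS would log AH and DL here */
        r->ah = 0x01;          /* status 01h: invalid function */
        r->carry = true;
        return;
    }
    /* The real handler would run here and fill in its own return
       registers; this model only reports success via the carry flag. */
    r->carry = false;
}

unsigned int13_unsupported_count(void)
{
    return unsupported_calls;
}
```

A write request (AH = 03h), for example, would come back with carry set and bump the counter, which is one way to confirm that the kernel really issues no writes during boot.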
My BIOS supports CHS and LBA. Due to the size of the drive, FreeDOS appears to stay in CHS mode. In the logged output, I see LBA not enabled for drive C:. In the source code for initdisk.c, I see this comment: /* Turn of LBA if not forced and the partition is within 1023 cyls and of the right type */.
For this stage of development, I have dropped my system bus clock speed down to 8MHz, resulting in a 4MHz internal processor clock.
I do not have DMA support in my system, and I am using programmed I/O only.
Additional information about the 286 system: 286 Build - Six Months In.
BIOS Source Code
I am far from a skilled x86 assembly or C developer, so please be gentle. :) Here is a link to my current BIOS source: /WorkingCode/20230528_alignmentissue_questionmark at main · rehsd/x86 · GitHub.
Possible Causes ???
Some disk, partition, or file system information is not being properly populated somewhere (e.g., standard IBM PC memory location, a variable, etc.).
Earlier in the call chain, possibly cds, dpb, dpbp, or mediareqhdr aren't being properly populated.
Corrupted memory due to improper writes.
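One cheap way to test the "not properly populated" theory is to log each far pointer (cds, dpbp, and so on) and check whether it lands in usable RAM. In real mode, the linear address is segment × 16 + offset, and on this system RAM ends at 640 KB (0xA0000). The sketch below is a hosted C model of such a check. The far_ptr struct and the function names are hypothetical, and treating 0000:0000 as "unset" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A real-mode segment:offset pair (illustrative type). */
struct far_ptr {
    uint16_t seg;
    uint16_t off;
};

#define RAM_TOP 0xA0000u  /* 640 KB of conventional memory */

/* Real-mode address translation: segment * 16 + offset. */
uint32_t far_to_linear(struct far_ptr p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}

/* True when an object of `size` bytes at `p` lies entirely in RAM.
   A 0000:0000 pointer is treated as "never populated". */
bool far_ptr_in_ram(struct far_ptr p, uint32_t size)
{
    uint32_t linear = far_to_linear(p);
    if (p.seg == 0 && p.off == 0)
        return false;
    return size != 0 && linear < RAM_TOP && size <= RAM_TOP - linear;
}
```

A pointer that decodes to ROM space (for example F000:0000) or to 0000:0000 just before the hang would point strongly at an uninitialized structure rather than at a disk I/O problem.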
If anyone has suggestions, please let me know (thank you!). I plan to research IDE initialization on IBM PC systems to see if I am missing something critical on boot, such as populating some memory structures with system information (such as disk information). I also need to learn more about how FreeDOS handles IDE, FAT, and device drivers (as execrh.asm is documented as a "request handler for calling device drivers"). I will update this page as I learn more.
Things to Do / Test
As I receive suggestions of things to try, I will queue them up here and post updates as I work through them.
Force LBA. Complete, this did not change the behavior.
Inject logging/debugging code in kernel/floppy.asm. In process...
Test disk I/O outside of the FreeDOS kernel (e.g., a simple bootloader application). So far, all testing seems to indicate CHS and LBA reading is working fine. I can load the MBR, then load the boot sector, then load kernel.sys without issue.
Review all code in my BIOS to make sure I'm not updating incorrect memory locations or trashing registers, the stack, etc. In process...
Significantly improve error handling in INT13 disk services BIOS code.
Work backwards in the FreeDOS code from execrh() and see where pointers/variables might not be getting populated correctly (e.g., dpbp, dpb, cds).
Reduce FreeDOS's kernel down to very simple file access to see if I can get that to work.
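For the item above about testing disk I/O outside of the FreeDOS kernel, one useful check after loading the MBR is to validate it and decode the partition table, since the partition type and starting sector feed directly into the CHS/LBA decision. The sketch below is a hosted C illustration using the standard MBR layout (four 16-byte entries at offset 446, signature 55h AAh at offset 510). The struct and function names are invented for this example and do not come from FreeDOS or my BIOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PART_TABLE_OFFSET 446  /* first partition entry in the MBR */
#define PART_ENTRY_SIZE   16   /* bytes per partition entry */

/* Decoded subset of one partition entry (illustrative struct). */
struct mbr_partition {
    uint8_t  status;        /* 0x80 = active/bootable */
    uint8_t  type;          /* e.g. 0x0B = FAT32 (CHS), 0x0C = FAT32 (LBA) */
    uint32_t lba_start;     /* first sector of the partition */
    uint32_t sector_count;  /* number of sectors in the partition */
};

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

bool mbr_has_signature(const uint8_t *sector)
{
    return sector[510] == 0x55 && sector[511] == 0xAA;
}

/* Decode entry `index` (0..3) from a 512-byte MBR sector.
   Returns false for a bad signature, bad index, or unused entry. */
bool mbr_read_partition(const uint8_t *sector, int index,
                        struct mbr_partition *out)
{
    assert(sector != NULL && out != NULL);
    if (index < 0 || index > 3 || !mbr_has_signature(sector))
        return false;
    const uint8_t *entry =
        sector + PART_TABLE_OFFSET + index * PART_ENTRY_SIZE;
    out->status = entry[0];
    out->type = entry[4];
    out->lba_start = read_le32(entry + 8);
    out->sector_count = read_le32(entry + 12);
    return out->type != 0;  /* type 0 marks an unused entry */
}
```

Comparing the decoded values on the 286 against the same card read on the Dell PC would show whether both machines see identical partition data.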