SYS.COM corrupted a NetDrive image. But why?
Posted: 2025-02-22
Tags: DOS, NetDrive, ForgotToCheckReturnCode
2025-02-24: Somebody over at Hacker News pointed me at the source code. It confirms the bug and also shows how things like ambiguous media descriptor bytes are handled. Hat tip to mmastrac; I'll post an update after I digest everything.
In ye olden days to make a diskette bootable you had to format it using the /s option of the FORMAT command. That works fine for blank disks, but software vendors had a small problem: they would sell you a disk with their software on it, but they couldn't include the DOS files needed to make it bootable because they were not selling you DOS. To get around this they would leave space available on the disk and have you use the SYS command, which copied the magic boot loader code and the DOS system files from your DOS disk onto their disk. There were some restrictions on where the free space was located and how much was required, but it generally worked to allow you to make a diskette bootable without having to format it.
Last year somebody reported a problem with the DOS 3.3 SYS.COM command when used with NetDrive. They started with a valid FAT12 image, ran SYS.COM to make it bootable, and then they were not able to mount the image using NetDrive again. Running SYS.COM against the image had broken something.
Besides copying the operating system's hidden files to the target drive letter, SYS.COM also copies some boot code into the first sector of the disk. In general it does not make sense to run it against a NetDrive image because you already had to boot DOS to mount the image, but it should not hurt anything. So I decided to have a look at what was going on.
The first step was to recreate the problem. I created a 10MB FAT12 disk image using NetDrive. The first few bytes of the first sector (the volume boot record) are shown below:
The FAT12 NetDrive image when first created:
00000000 EB 3C 90 4E 45 54 44 52 49 56 45 00 02 08 01 00 .<.NETDRIVE.....
00000010 02 00 02 00 50 F8 60 00 00 00 00 00 00 00 00 00 ....P.`.........
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<... snip ...>
000001E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000001F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000200 F8 FF FF 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Dissecting that we find:
| Offset | Bytes | Description |
|---|---|---|
| 0x00 | EB 3C 90 | Jump to executable code |
| 0x03 | 4E 45 54 44 52 49 56 45 | OEM ID ("NETDRIVE") |
| 0x0B | 00 02 | Bytes per sector (512) |
| 0x0D | 08 | Sectors per cluster (8) |
| 0x0E | 01 00 | Reserved sectors (1) |
| 0x10 | 02 | Number of File Allocation Tables (2) |
| 0x11 | 00 02 | Root directory entries (512) |
| 0x13 | 00 50 | Sectors (20480) |
| 0x15 | F8 | Media Descriptor (hard drive) |
| 0x16 | 60 00 | Sectors per FAT (96) |
That is a minimal BIOS Parameter Block (BPB) as defined by DOS 2.0 but also recognizable to later versions of DOS. Later versions of DOS have extended it a few times.
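As a cross-check on the table, here is a minimal Python sketch that unpacks the same fields from the first 24 bytes of the sector. The function name `parse_bpb` and the dictionary keys are my own labels for illustration; the field layout is the DOS 2.0 BPB described above, and all multi-byte fields are read as little-endian.

```python
import struct

# First 24 bytes of the freshly created NetDrive image, copied from the dump.
BOOT_SECTOR_START = bytes.fromhex(
    "EB 3C 90 4E 45 54 44 52 49 56 45 00 02 08 01 00 "
    "02 00 02 00 50 F8 60 00"
)

# Illustrative field names (not taken from any DOS header file).
BPB_FIELDS = (
    "bytes_per_sector",
    "sectors_per_cluster",
    "reserved_sectors",
    "num_fats",
    "root_entries",
    "total_sectors",
    "media_descriptor",
    "sectors_per_fat",
)


def parse_bpb(sector: bytes) -> dict:
    """Decode the 13-byte DOS 2.0 BPB that starts at offset 0x0B."""
    # "<" = little-endian with no padding; H = 16-bit, B = 8-bit.
    values = struct.unpack_from("<HBHBHHBH", sector, 0x0B)
    return dict(zip(BPB_FIELDS, values))


bpb = parse_bpb(BOOT_SECTOR_START)
for name, value in bpb.items():
    print(f"{name:20} {value:6} (0x{value:X})")
```

Running the same function over the first sector after SYS.COM has touched it is a quick way to see whether a BPB still looks sane.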
At offset 0x200 you see the start of the first File Allocation Table (FAT). The first byte 0xF8 is the media descriptor byte, which should be the same as the one in the BPB. This is FAT12 so entries are 12 bits in size, with two entries packed into every three bytes; the first entry is actually 0xFF8 and the second entry is 0xFFF. Ignoring the media descriptor entry and the second entry which is reserved, this FAT is completely empty.
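Because two 12-bit entries share three bytes, the unpacking is easy to get wrong by hand. Here is a small Python sketch that decodes entry `n` from a raw FAT12 table; the helper name `fat12_entry` is mine, not something from DOS or NetDrive.

```python
def fat12_entry(fat: bytes, n: int) -> int:
    """Return 12-bit FAT entry number n from a raw FAT12 table."""
    offset = n + n // 2  # each entry occupies 1.5 bytes
    pair = fat[offset] | (fat[offset + 1] << 8)  # little-endian 16-bit read
    # Even entries live in the low 12 bits, odd entries in the high 12 bits.
    return pair >> 4 if n & 1 else pair & 0x0FFF


# Start of the FAT in the freshly created image (offset 0x200 in the dump).
empty_fat = bytes.fromhex("F8 FF FF 00 00 00")
for n in range(4):
    print(f"entry {n}: 0x{fat12_entry(empty_fat, n):03X}")
```

The same helper can be pointed at the FAT bytes of any FAT12 image to follow a cluster chain entry by entry.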
I mounted the new image under IBM PC DOS 3.3 and ran SYS.COM against it. That looked normal. However, when I disconnected the image and tried to mount it again NetDrive complained that it was a bad image:
Well, the BPB started off correctly but now it seems bad. Let's look at the first sector now that SYS.COM has altered it:
00000000 EB 34 90 49 42 4D 20 20 33 2E 33 00 02 3B C1 75 .4.IBM  3.3..;.u
00000010 1A 8B 16 DC 09 8B 0E DE 09 2B CA 74 D4 8B 1E 00 .........+.t....
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 12 ................
00000030 00 00 00 00 01 00 FA 33 C0 8E D0 BC 00 7C 16 07 .......3.....|..
00000040 BB 78 00 36 C5 37 1E 56 16 53 BF 2B 7C B9 0B 00 .x.6.7.V.S.+|...
<... snip ...>
000001C0 0D 0A 44 69 73 6B 20 42 6F 6F 74 20 66 61 69 6C ..Disk Boot fail
000001D0 75 72 65 0D 0A 00 49 42 4D 42 49 4F 20 20 43 4F ure...IBMBIO  CO
000001E0 4D 49 42 4D 44 4F 53 20 20 43 4F 4D 00 00 00 00 MIBMDOS  COM....
000001F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 AA ..............U.
00000200 F8 FF FF 03 40 00 05 60 00 07 F0 FF 09 A0 00 0B ....@..`........
00000210 C0 00 0D E0 00 0F F0 FF 00 00 00 00 00 00 00 00 ................
| Offset | Bytes | Description |
|---|---|---|
| 0x00 | EB 34 90 | Jump to executable code |
| 0x03 | 49 42 4D 20 20 33 2E 33 | OEM ID ("IBM  3.3") |
| 0x0B | 00 02 | Bytes per sector (512) |
| 0x0D | 3B | Sectors per cluster (59) |
| 0x0E | C1 75 | Reserved sectors (30145) |
| 0x10 | 1A | Number of File Allocation Tables (26) |
| 0x11 | 8B 16 | Root directory entries (5771) |
| 0x13 | DC 09 | Sectors (2524) |
| 0x15 | 8B | Media Descriptor (unknown) |
| 0x16 | 0E DE | Sectors per FAT (56846) |
It makes sense for the jump instruction and OEM ID to change. And the bytes per sector field is correct. But the rest of the BPB is garbage. Something corrupted it.
Looking at the rest of the sector, the jump at the start (EB 34) now lands on offset 0x36, and the boot code that starts there looks reasonable. There is also the bootable partition signature (0xAA55) at offset 0x1FE, and the FAT shows some additional entries for the two hidden files that were copied over.
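The location of the boot code can be read off the jump instruction itself: `EB xx` is a two-byte short jump whose target is the offset of the following instruction plus the signed displacement `xx`. Here is a minimal Python sketch that computes the target and checks the signature word. The helper names are mine, and the two sample sectors are zero-filled stand-ins that carry only the bytes of interest from the dumps.

```python
import struct


def short_jump_target(sector: bytes) -> int:
    """Offset that the EB xx short jump at the start of a boot sector lands on."""
    if sector[0] != 0xEB:
        raise ValueError("sector does not start with a short jump (EB xx)")
    (displacement,) = struct.unpack_from("<b", sector, 1)  # signed 8-bit
    return 2 + displacement  # relative to the byte after the 2-byte jump


def has_boot_signature(sector: bytes) -> bool:
    """True if bytes 0x1FE-0x1FF are 55 AA, i.e. the word 0xAA55."""
    return struct.unpack_from("<H", sector, 0x1FE)[0] == 0xAA55


# Stand-in sectors: real jump bytes and signature, everything else zeroed.
original = bytes.fromhex("EB 3C 90") + bytes(509)
after_sys = bytes.fromhex("EB 34 90") + bytes(507) + bytes.fromhex("55 AA")

for label, sector in (("original", original), ("after SYS", after_sys)):
    print(label, hex(short_jump_target(sector)), has_boot_signature(sector))
```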
I tried it again, this time with a diskette image mounted using NetDrive, and it did everything perfectly. Which implies that the problem is not in NetDrive, but in the difference between hard drive images and floppy disk images.
So the DOS 3.3 SYS command added the boot code and updated the FAT correctly, but it clobbered the BPB. But only on the NetDrive hard drive image. Why?
DOS 3.2 added a function called "Generic IOCTL" which allows DOS to query a device to get its geometry, write a track, read a track, format a track, etc. It also added code to handle these additional calls for the devices supported by the BIOS. For example, here is a call to "Get Device Parameters" (Generic IOCTL, subfunction 0x60) for drive C:
Note that at the first breakpoint (after the IOCTL call) the Carry Flag (NC) is not set. This means the call was successful and the data returned is reliable.
The documentation says that the DEVICEPARAMS data structure has a BPB starting at offset 0x06, which here shows:
| Offset | Bytes | Description |
|---|---|---|
| 0x06 | 00 02 | Bytes per sector (512) |
| 0x08 | 04 | Sectors per cluster (4) |
| 0x09 | 01 00 | Reserved sectors (1) |
| 0x0B | 02 | Number of File Allocation Tables (2) |
| 0x0C | 00 02 | Root directory entries (512) |
| 0x0E | B1 FF | Sectors (65457) |
| 0x10 | F8 | Media Descriptor (hard drive) |
| 0x11 | 40 00 | Sectors per FAT (64) |
That makes sense for a 32MB C: drive.
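A quick sanity check of those numbers, as a Python sketch. It assumes FAT16 (two bytes per FAT entry) and 32-byte directory entries, which is what I would expect for a DOS hard disk partition of this size but is my assumption rather than something shown in the dump. The variable names are mine, and the values are decoded little-endian from the raw bytes in the table.

```python
import math

# Raw bytes from the table, decoded as little-endian values.
bytes_per_sector = 0x0200     # 00 02
sectors_per_cluster = 0x04    # 04
reserved_sectors = 0x0001     # 01 00
num_fats = 0x02               # 02
root_entries = 0x0200         # 00 02
total_sectors = 0xFFB1        # B1 FF
sectors_per_fat = 0x0040      # 40 00

size_mib = total_sectors * bytes_per_sector / (1024 * 1024)

# Sectors left for file data after the boot sector, FATs and root directory.
root_sectors = root_entries * 32 // bytes_per_sector
data_sectors = (
    total_sectors - reserved_sectors - num_fats * sectors_per_fat - root_sectors
)
clusters = data_sectors // sectors_per_cluster

# Assumption: FAT16, so 2 bytes per entry, plus the 2 reserved entries.
fat_sectors_needed = math.ceil((clusters + 2) * 2 / bytes_per_sector)

print(f"{size_mib:.2f} MiB, {clusters} clusters, "
      f"FAT needs {fat_sectors_needed} sectors, BPB says {sectors_per_fat}")
```

If the FAT size computed from the geometry matches the one stored in the BPB, the structure is at least internally consistent.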
Let's run that code again against the NetDrive drive image, with the boot sector returned to what it was before it was corrupted:
I changed one instruction so that it uses the NetDrive drive number and ran the code again, but this time at the first breakpoint the Carry Flag (CY) is set. This means there was an error, and AX holds the error code. Value 0x0001 means "ERROR_INVALID_FUNCTION" which makes sense because the NetDrive device driver doesn't support this function. (It is not required to be supported.)
If we dig around inside of SYS.COM we can see a call to Generic IOCTL:
Here we see it getting the drive number from storage, setting AX to 0x440D, setting CH to 0x08 (a block device) and CL to 0x60 (get device parameters). DS:DX will be the pointer to the parameter block to fill in.
Let's run the code!
At the breakpoint after the Generic IOCTL we see that the Carry Flag (CY) is set and the error code is set to ERROR_INVALID_FUNCTION, just as it was above. And here is the bug ... nothing is checking the Carry Flag to see if there was an error after the Generic IOCTL call.
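To make the failure mode concrete, here is a small Python simulation. Nothing in it is real DOS code: `generic_ioctl_get_params` is a hypothetical stand-in for the INT 21h Generic IOCTL call that returns the carry flag and AX, the "valid" data it writes is made up, and I am assuming a failed call leaves the caller's buffer untouched. The point is only the difference between a caller that ignores the carry flag and one that checks it.

```python
ERROR_INVALID_FUNCTION = 0x0001


def generic_ioctl_get_params(buffer: bytearray, supported: bool):
    """Hypothetical model of the call; returns (carry_flag, ax).

    On success the buffer is overwritten; on failure it is left as it was.
    """
    if not supported:
        return True, ERROR_INVALID_FUNCTION
    buffer[:] = bytes([0x00, 0x02]) + bytes(len(buffer) - 2)  # made-up data
    return False, 0


def read_params_unchecked(buffer: bytearray, supported: bool) -> bytes:
    """Mimics the bug: the buffer is used whether or not the call failed."""
    generic_ioctl_get_params(buffer, supported)
    return bytes(buffer)


def read_params_checked(buffer: bytearray, supported: bool) -> bytes:
    """Checks the carry flag before trusting the buffer."""
    carry, ax = generic_ioctl_get_params(buffer, supported)
    if carry:
        raise OSError(f"Generic IOCTL failed, AX={ax:#06x}")
    return bytes(buffer)


# Whatever happened to be in the buffer before the call (arbitrary bytes).
stale = bytearray.fromhex("72 E4 3B C1 75 1A")

print(read_params_unchecked(stale, supported=False).hex(" "))
try:
    read_params_checked(stale, supported=False)
except OSError as err:
    print(err)
```

The unchecked caller hands back the stale bytes as if they were device parameters, while the checked caller stops with the error code.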
The Generic IOCTL writes a DEVICEPARAMS structure at DS:DX, assuming the call did not fail. If the call fails, as it does here, we'll just see whatever was already in that storage. As before, the BPB structure will be at offset 0x06. Here is what we got back in that data structure:
(Note that the segment registers changed causing the instruction pointer to shift ... we are still in the same code though, just using aliased memory locations. Thanks segmented x86!)
IOCTL BPB:     72 E4 3B C1 75 1A 8B 16 DC 09 8B 0E DE 09 2B CA 74 D4 8B 1E 19 09 B4 40 CD
Corrupted BPB: 00 02 3B C1 75 1A 8B 16 DC 09 8B 0E DE 09 2B CA 74 D4 8B 1E 00 00 00 00 00
Except for the first two bytes (the sector size) and the last five bytes (which include dpHugeSectors) that lines up perfectly. So the failure to check the Carry Flag wound up causing bad BPB data to be written to the volume boot record.
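The comparison can also be done mechanically. This Python sketch takes the two byte strings exactly as listed and reports the offsets, relative to the start of the BPB, at which they differ. The variable names are mine.

```python
ioctl_bpb = bytes.fromhex(
    "72 E4 3B C1 75 1A 8B 16 DC 09 8B 0E DE 09 2B CA 74 D4 8B 1E 19 09 B4 40 CD"
)
corrupted_bpb = bytes.fromhex(
    "00 02 3B C1 75 1A 8B 16 DC 09 8B 0E DE 09 2B CA 74 D4 8B 1E 00 00 00 00 00"
)

# Offsets where the IOCTL buffer and the on-disk BPB disagree.
differing = [
    i for i, (a, b) in enumerate(zip(ioctl_bpb, corrupted_bpb)) if a != b
]
matching = len(ioctl_bpb) - len(differing)

print("differing offsets:", differing)
print("matching bytes:", matching, "of", len(ioctl_bpb))
```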
So what are those bytes? It looks like code to me, and we can confirm that by just disassembling it:
Great, where did it come from? Using the search feature of DEBUG.COM we can find those bytes, and they appear right before the suspect code:
I am a little bit freaked out by that because the pointer to the buffer is set before the IOCTL call; the code knowingly sets a pointer to a buffer into what looks like its code area. Let's hope they knew they were done with that part of the code, or it's just another interesting bug to dissect.
So SYS.COM clearly doesn't work on hard drive images mounted with NetDrive, but it did work on a floppy image. What is the difference and why did it work?
The answer requires us to look inside of the BPB again. The BPB has a field called the "media descriptor byte" which is used to describe the layout of the image. This single byte has a limited range of valid values:
| Value | Description |
|---|---|
| F0 | 3.5 inch, 2 sides, 18 sectors per track, 80 tracks, 1440KB or 3.5 inch, 2 sides, 36 sectors per track, 80 tracks, 2880KB or 5.25 inch, 2 sides, 15 sectors per track, 80 tracks, 1.2MB |
| F8 | Hard disk, any geometry |
| F9 | 3.5 inch, 2 sides, 9 sectors per track, 80 tracks, 720KB or 5.25 inch, 2 sides, 15 sectors per track, 80 tracks, 1200KB |
| FA | 5.25 inch, 1 side, 8 sectors per track, 40 tracks, 160KB |
| FB | 3.5 inch, 2 sides, 8 sectors per track, 80 tracks, 640KB |
| FC | 5.25 inch, 1 side, 9 sectors per track, 40 tracks, 180KB |
| FD | 5.25 inch, 2 sides, 9 sectors per track, 40 tracks, 360KB or 8 inch, 2 sides, single density, 500KB |
| FE | 5.25 inch, 1 side, 8 sectors per track, 40 tracks, 160KB or 8 inch, 1 side, single density, 250KB or 8 inch, 2 sides, double density, 1220KB |
| FF | 5.25 inch, 2 sides, 8 sectors per track, 40 tracks, 320KB |
You can tell this wasn't well thought out. The media descriptor byte is often not enough to tell you what you are working with; you need to combine it with knowledge of the physical drive type too.
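The capacities of the PC diskette formats follow directly from the geometry: sides × tracks × sectors per track × bytes per sector. Here is a short Python sketch, assuming 512-byte sectors and leaving out the 8 inch formats because no geometry is given for them. The function name and the format labels are mine.

```python
def capacity_kb(sides: int, tracks: int, sectors_per_track: int,
                bytes_per_sector: int = 512) -> int:
    """Raw capacity of a diskette format in KB (1 KB = 1024 bytes)."""
    return sides * tracks * sectors_per_track * bytes_per_sector // 1024


# (sides, tracks, sectors per track) for the common PC formats
formats = {
    "5.25 inch single sided, 8 sectors": (1, 40, 8),
    "5.25 inch single sided, 9 sectors": (1, 40, 9),
    "5.25 inch double sided, 8 sectors": (2, 40, 8),
    "5.25 inch double sided, 9 sectors": (2, 40, 9),
    "3.5 inch double density": (2, 80, 9),
    "5.25 inch high density": (2, 80, 15),
    "3.5 inch high density": (2, 80, 18),
    "3.5 inch extended density": (2, 80, 36),
}

for name, geometry in formats.items():
    print(f"{name}: {capacity_kb(*geometry)} KB")
```

Two formats with different geometry can share one descriptor byte, which is why the byte alone cannot tell you the capacity.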
When I run SYS.COM against a floppy image mounted using NetDrive the breakpoint after the Generic IOCTL call does not even get hit:
I am pretty certain that it used the media descriptor byte from the BPB and did not bother making the Generic IOCTL call. I used a 360KB disk image with a media descriptor byte of FD, which is very common. And no IBM PC ever shipped from the factory with an 8 inch drive so it is not ambiguous. So as an experiment I used a 2880KB disk image which has a media descriptor byte of F0, which is also shared with 1440KB diskettes. Sure enough SYS.COM tried to make the Generic IOCTL call on that image, failed, and corrupted that BPB.
So in short:
The bug was probably introduced in DOS 3.2. I'm pretty sure that it is still present in DOS 4.0, as the code is still not checking the Carry Flag after the Generic IOCTL call.
Created February 22nd, 2025
(C)opyright Michael Brutman, mbbrutman at gmail dot com