Re: XT-IDE on PCjr
Posted: Fri Sep 09, 2011 6:08 am
by Brutman
Well, it's a compiler - it sets up its environment for the general case. It is perfectly appropriate to drop to inline ASM if you can do the job better. The #pragma aux feature of Watcom is pretty good about letting you inline ASM while preserving optimization by not trashing the register optimization that the compiler has already set up. (Borland Turbo products were terrible in that regard.)
One key thing I think you can do is set up CX with an operation count (256) and use a REP prefix to do 256 word moves without having to test for the condition each time through the loop. You only need to do the test at a sector boundary. You still need inline ASM to do this, but it will be much faster than the loop because the instruction prefetch on the 8088 cannot keep up with a tight loop.
This is the original code you are competing with:
readSectorPIO proc near public
        push    ax
        push    cx
        push    dx
        mov     cx, SECTOR_SIZE / 2             ; sector size in words
;       cli
        mov     dx, cs:[settings.ioBasePort]    ; data register
readLoop:
        in      al, dx                          ; get lower half
        mov     ah, al
        or      dx, 8                           ; switch to 2nd data reg
        in      al, dx
        and     dx, NOT 8                       ; switch back
        xchg    ah, al
        stosw                                   ; save it
        loop    readLoop
;       sti
        pop     dx
        pop     cx
        pop     ax
        ret
readSectorPIO endp
Notice the horror we have to go through because of the two different data registers.
It should be more like this:
That works fine on a V20 or 286 ... You should be able to do a rep mov* variant on an 8088 with your 16 bit data registers.
Mike
Re: XT-IDE on PCjr
Posted: Fri Sep 09, 2011 7:54 am
by alanh
I'm going to take a stab at optimizing the sector read at least tonight. I think by tomorrow night I can have a test BIOS in place that does minimal effort with respect to int13h. I would like to see DOS booting off the HD before I start making board changes. Sorry for the long delays. Small windows of time have just opened up and I'm taking advantage of them when I can.
Re: XT-IDE on PCjr
Posted: Fri Sep 09, 2011 6:08 pm
by alanh
It's going to take some testing. I'm wondering if I/O mapping would be more efficient since while you can do a rep insw, you can't do a rep movsw since you actually don't want si to auto increment on each loop. I'll just have to test things out. The good news is the current layout has both sets of strobes routed, so memory vs io mapped can be changed in PLD code. And the I/O address range used can be dynamically set through a memory mapped register at the top of the BIOS window.
Maybe what could eventually be done is to set up a 512 byte virtual sector window in the PLD. When you access any IDE register other than data, it's a normal 8-bit access. When you access anywhere inside the window, it acts the same as the current code does for a 16 bit read/write -> an assertion of IDE CS0 with A2..0 equal to 0. Virtualize the sector in a sense. That way rep movsw works with memory mapping. It also means all other registers go back to natural 8-bit accesses. I'm just not sure if the wait states are different between 8-bit I/O and memory. I think all 8-bit accesses may be fixed at 4+.
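To make the window idea concrete, here is a minimal Python model of the decode described above. It is only a sketch: the offsets WINDOW_BASE and REG_BASE, the register count, and the names are hypothetical placeholders, not the actual PLD equations or the board's memory map.

```python
from typing import NamedTuple, Optional

# Hypothetical layout, chosen only for illustration.
WINDOW_BASE = 0x0000   # start of the 512 byte virtual sector window
WINDOW_SIZE = 512
REG_BASE = 0x0200      # start of the 8-bit IDE register block
REG_COUNT = 8


class Decode(NamedTuple):
    cs0: bool       # IDE CS0 asserted
    ide_addr: int   # value driven on IDE A2..A0
    width: int      # width of the access on the IDE side, in bits


def decode(offset: int) -> Optional[Decode]:
    """Map a CPU offset to an IDE access, or None if it is not decoded."""
    if WINDOW_BASE <= offset < WINDOW_BASE + WINDOW_SIZE:
        # Every address in the window hits the 16-bit data register,
        # so SI can auto-increment freely during a rep movsw.
        return Decode(cs0=True, ide_addr=0, width=16)
    if REG_BASE <= offset < REG_BASE + REG_COUNT:
        # Other registers are plain 8-bit accesses; low address bits pass through.
        return Decode(cs0=True, ide_addr=offset & 0x7, width=8)
    return None


print(decode(WINDOW_BASE), decode(WINDOW_BASE + 510), decode(REG_BASE + 7))
```

The point of the model is that the first and last word of the window decode identically, which is what lets a string move walk the source pointer without the drive ever seeing a different register address.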
Re: XT-IDE on PCjr
Posted: Sat Sep 10, 2011 6:58 am
by Brutman
Crap - I forgot about the auto-incrementing of pointers.
I'm pretty sure that the Future Domain SCSI sidecar for the PCjr uses the REP MOVSW trick, and exposes the sector buffer as a 512 byte region of memory. Using my IOtest program I measured 300KB/sec reads, and there is no way on earth that could have been done with a loop. I'll have to dig that machine out and disassemble the BIOS to be sure.
So here is the hierarchy in performance order, slowest first:
- XT-IDE original design with two non-contiguous data registers (in a loop)
- XT-IDE modified to have a contiguous 16 bit data register (in a loop)
- Memory mapped data register (still with a loop)
- REP INSW (V20/286 only)
- REP MOVSW (any CPU, only if the entire sector buffer is memory mapped)
I think you need a pretty decent SCSI controller (like the Future Domain) to be able to use the memory mapped data sector. For IDE it is probably required that something drives the data transfer, 16 bits at a time. To make the memory mapped data sector work there you would have to pre-fetch the first word from the controller, and then put the controller and the PCjr in synchronized loops where the controller can fetch a word from the device one step ahead of the PCjr trying to read it. Probably not worth the hassle. Just memory mapping a single data register and reading/writing it in a loop is far faster than what we have today.
And for extra credit, REP INSW is the way to go. My primary machine has a V20.
Mike
Re: XT-IDE on PCjr
Posted: Sat Sep 10, 2011 8:22 am
by alanh
I hit a road bump last night when I was changing the PLD code to do it. All of a sudden the RTC stopped working. I've tried just about everything to fix it, but haven't been able to. I think maybe a cell has gone out on the smaller PLD, but I'm not 100% sure. Everything else still works including RAM, full range of flash, POST, etc, so I'm going to continue down the IDE remap route today and also try to get DOS booting later. College football may slow me down though!
It was actually quite simple to reorganize the mappings for IDE and internal latch selects to virtualize the sector buffer. It's not *really* virtualized entirely, just that every access into it acts the same way as the 8<->16 bit latching worked for all registers before, but it keeps the address lines to IDE zeroed and only asserts CS0. It's only giving room for SI to walk 512 bytes along with DI, but every access into the window works the same way, targeting the IDE data reg. It actually simplifies access in other ways as the rest of the IDE registers are 8-bit and now mapped back to a 16-byte window instead of a 32-byte one. So you no longer have to do the hokey thing of always doing word writes and reordering the address lines.
I was just concerned that it might not be worth it for memory mapped regs since I wasn't sure the wait states added by the JR were any different for memory vs io port access. But from what you've just mentioned, it sounds like even if access times are the same, there might be a big savings in just the 'rep movsw' vs an 'in ax,dx / stosw / loop' since there are no INS/OUTS string instructions on a vanilla 8088. I'll keep investigating this today as it seems 300KB/s might be possible on the peanut with any ole drive!
Re: XT-IDE on PCjr
Posted: Sat Sep 10, 2011 9:51 am
by Brutman
I went back to the tech ref to make sure I had this correct ..
- Clock cycle time: 210ns
- Normal bus cycle: four clock cycles (840ns)
- RAM read and write to the lower 128 KB: 6 cycles due to video memory sharing (1260ns)
- I/O reads and writes: 6 cycles (1260ns)
The I/O reads and writes are worse than I thought - I remembered them being five cycles.
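As a quick sanity check on those figures, here is a small Python sketch that derives the bus times from the 210ns clock. The constant and function names are mine, not from the tech ref.

```python
CLOCK_NS = 210  # PCjr clock cycle time quoted above


def bus_time_ns(clocks: int) -> int:
    """Duration in nanoseconds of a bus cycle lasting `clocks` CPU clocks."""
    return clocks * CLOCK_NS


normal = bus_time_ns(4)  # normal bus cycle
slow = bus_time_ns(6)    # lower 128 KB RAM access, or any I/O access
print(normal, slow, slow / normal)
```

So an I/O access or a lower 128 KB RAM access costs one and a half times a normal bus cycle.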
Now, for instruction timings ... straight from the Intel book:
MOVSW takes 18 cycles for instruction decode and 6 cycles for the word transfer. I'm not sure if that includes the two cycle penalty that the Jr seems to levy on I/O. So at best it is 24 cycles per word. This is without the REP prefix so you need to add loop overhead, and possible instruction prefetch buffer starvation.
With the REP prefix things get better .. 9 cycles for the prefix and then 17 cycles for the instruction and 6 cycles for the word transfer for each word. 23 vs. 24 isn't dramatic, but now there is no loop overhead and no possibility of having the prefetch buffer go empty.
So at the high end, a REP MOVSW should be able to do 400KB/sec assuming 4,700,000 cycles per second and 23 cycles per word moved.
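The arithmetic behind that estimate, as a short Python sketch. It uses only the cycle counts quoted in this post (9 for the prefix, 17 + 6 per word) and the rounded 4,700,000 cycles per second. The function names are made up for illustration, and real hardware adds wait states this ignores.

```python
CPU_HZ = 4_700_000       # cycles per second, as rounded in the post
WORDS_PER_SECTOR = 256   # 512 byte sector
BYTES_PER_WORD = 2


def bytes_per_second(cycles_per_word: int, cpu_hz: int = CPU_HZ) -> float:
    """Best-case transfer rate for a given per-word cycle cost."""
    return cpu_hz / cycles_per_word * BYTES_PER_WORD


def sector_cycles(cycles_per_word: int, prefix_cycles: int = 9) -> int:
    """Cycles for one REP MOVSW covering a whole sector."""
    return prefix_cycles + WORDS_PER_SECTOR * cycles_per_word


rate = bytes_per_second(17 + 6)
print(f"REP MOVSW ceiling: {rate:.0f} bytes/sec")
print(f"cycles per sector: {sector_cycles(17 + 6)}")
```

That works out to a little over 400,000 bytes per second and 5897 cycles per sector, so the one-time 9 cycle prefix cost is negligible next to the 256 word moves.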
In contrast, IN with a word argument takes 15 cycles for each word. But it only moves to AX, requiring another instruction to put AX to the final destination. And the 8088 has no INSW string instruction, so there is nothing to put a REP prefix on.
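To see how little room the IN-based loop has, here is a rough Python sketch using only the numbers in this post. The one assumption I am adding is that a word store on the 8088's 8-bit bus needs at least two normal bus cycles; that is a lower bound and ignores instruction execution time entirely.

```python
REP_MOVSW_CYCLES = 17 + 6       # per word, from the REP MOVSW estimate
IN_WORD_CYCLES = 15             # IN with a word argument
NORMAL_BUS_CLOCKS = 4           # normal bus cycle, from the tech ref
BUS_CYCLES_PER_WORD_STORE = 2   # assumption: 8-bit bus, two cycles per word

# Cycles the rest of the loop may use before it falls behind REP MOVSW.
budget = REP_MOVSW_CYCLES - IN_WORD_CYCLES

# Minimum bus time just to write the word to its destination.
store_floor = NORMAL_BUS_CLOCKS * BUS_CYCLES_PER_WORD_STORE

left_for_loop = budget - store_floor
print(budget, store_floor, left_for_loop)
```

The store's bus time alone uses up the whole 8 cycle budget, leaving nothing for the loop instruction, so the IN-based loop cannot match REP MOVSW on these figures.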
It sounds like you are making great progress! Yell if you need anything ...
Re: XT-IDE on PCjr
Posted: Sat Sep 10, 2011 7:24 pm
by alanh
So preliminary test = ~230 KB/s raw reads. That's un-optimized but also w/o interrupt overhead.
Re: XT-IDE on PCjr
Posted: Sat Sep 10, 2011 7:42 pm
by alanh
Things haven't been very productive today. My board is failing as I didn't realize how bad PLCCs were at insert/remove reliability. I've been popping the chip in and out a lot since it's easier for me to program it w/ my eeprom burner. I'll have to build up one of the JTAG cables Monday and continue only doing in-circuit programming nursing this board along till the next spin arrives. I think my confidence is to a point I can start on the P2 board changes this week with the target of an order by Friday (+1 week turn + 1 week shipping + 1 week validation and transit to Jeff, Mike and jmetal88 (first name?)). It will help to have more people with functional boards. The problems with this board pretty much made it unusable w/o a whole lot of hacking. It's a good start though. I'm also going to order ISA versions of the design (w/ just flash and IDE) at the same time. I like what Ian is trying to do at Dangerous Prototypes, but he's really going about it the wrong way.
It seems electrically things are sound though P1 has some issues with high ground impedance atm. Memory mapping with single instruction sector transfers seems sound, so I'm probably going to drop the IO lines to the main PLD in order to help some gate-racing concerns I've been countering by adjusting output slew rates. I'm going to leave the possibility of RAM fill in the cartridge area open for now. If we can work it in, someone on the VC forums found a UMB manager for DOS 5. So in theory, the JR could have nearly 800K of usable DOS RAM. And at 200+ KB/s disk read performance, could fill it in <4 seconds!
I still need to make a 'best guess' on the drill sizes for the side car pins. I might only order 3 instead of 6 in case I get it wrong again; which might leave someone out on the first build.

Re: XT-IDE on PCjr
Posted: Sat Sep 10, 2011 7:48 pm
by alanh
It just occurred to me, my transfer buffer is probably in lower 128 KB memory as it is a normal DOS .EXE (~12KB). I'll re-run the test with the dest pointer above 128K.
EDIT: NM, didn't seem to make a difference.
Re: XT-IDE on PCjr
Posted: Sat Sep 10, 2011 7:49 pm
by jmetal88
Haha, my first name is also Mike. Or Michael. I don't really have a preference. A lot of my friends call me the former, my parents call me the latter.