Refactoring
This will be a brief update, as most of this week went to refactoring code and adding more documentation in comments. I will include images and demonstrations in a later blog post before the next major one on September 1st.

I made the makefile more general so that it targets all *.asm and *.c source files, whether or not such files exist in a given directory. I also created custom nested file extensions for the makefile to make the compilation and linkage process much clearer. Speaking of linkage, I created a linker script, which I will hopefully expand upon instead of having to supply linker configuration through command-line arguments. This will also help down the road with things like adding a better stack, a filesystem, memory management, and other predefined memory locations.

In terms of documentation, I'm using doxygen to generate documentation for all the functions and special variables in my codebase. To do this, I need to go through every file and add special comments to my code so doxygen can understand it properly.

The IDT Skeleton

The IDT, or Interrupt Descriptor Table, is a data structure used in Protected Mode (32-bit) and Long Mode (64-bit) kernels to handle interrupts, both for interfacing with hardware and for traps such as memory faults. I set up the basic IDT code and I will update it in a later blog post. My immediate next step will be implementing the keyboard driver, which will only take a day or two. At that point, I can get ready for the final step in my summer goals. A separate update for the IDT will come soon...

Exciting Updates To Come and Finishing Off

Few features were added, but I'm still on track to finish the summer with a rudimentary shell. I'm also working with Mr. Baraty and another mentor to figure out the electronics of my project.

I think that once I create my shell language, and possibly include a file system and memory manager, I will have completed 70% of the baseline software needed for the entire fellows project.
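
For reference, here is a minimal sketch of what an IDT skeleton can look like in C for 32-bit Protected Mode. It is not the code from this project: the names idt_entry_t, idt_ptr_t, idt_set_gate and idt_descriptor are hypothetical, and the table base is passed in as a plain 32-bit number so the sketch also compiles on a 64-bit host. The packed attribute assumes GCC or Clang.

```c
#include <stdint.h>

/* One 8-byte gate descriptor of a 32-bit Protected Mode IDT. */
typedef struct {
    uint16_t offset_low;   /* handler address, bits 0..15 */
    uint16_t selector;     /* code segment selector from the GDT */
    uint8_t  zero;         /* reserved, always 0 */
    uint8_t  type_attr;    /* present bit, privilege level, gate type */
    uint16_t offset_high;  /* handler address, bits 16..31 */
} __attribute__((packed)) idt_entry_t;

/* 6-byte operand of the lidt instruction: table size minus one, then base. */
typedef struct {
    uint16_t limit;
    uint32_t base;
} __attribute__((packed)) idt_ptr_t;

#define IDT_ENTRIES        256
#define IDT_INTERRUPT_GATE 0x8E  /* present, ring 0, 32-bit interrupt gate */

static idt_entry_t idt[IDT_ENTRIES];

/* Fill one gate: split the 32-bit handler address across the two offset fields. */
static void idt_set_gate(uint8_t vector, uint32_t handler,
                         uint16_t selector, uint8_t type_attr)
{
    idt[vector].offset_low  = (uint16_t)(handler & 0xFFFFu);
    idt[vector].selector    = selector;
    idt[vector].zero        = 0;
    idt[vector].type_attr   = type_attr;
    idt[vector].offset_high = (uint16_t)(handler >> 16);
}

/* Build the lidt operand for a table located at `base` (assumed 32-bit address). */
static idt_ptr_t idt_descriptor(uint32_t base)
{
    idt_ptr_t ptr;
    ptr.limit = (uint16_t)(sizeof(idt) - 1);
    ptr.base  = base;
    return ptr;
}
```

A call such as idt_set_gate(0x21, handler_address, 0x08, IDT_INTERRUPT_GATE) would register one handler; the vector 0x21 and the selector 0x08 are illustrative values that depend on how the PIC and GDT are set up. In a real kernel the result of idt_descriptor would then be loaded with the lidt instruction.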
Finally the font problems have all been fixed and sanity has been restored! After debugging for literal weeks and several false revelations, I have finally gotten the font driver code to work properly, and I (hopefully) will never have to transition back to BIOS graphics mode 0x3 to debug. Now, as referenced in the title, I can printf and draw symbols to the screen using my own font code. The code will need to be optimized and R E F A C T O R E D many times later on... but this is still a huge step forward.

I had to solve a few key bugs. To give a refresher, this is my current (fixed) font specification. Every character is represented as a bitmap image in an array of bytes. The first line defines the fontchar datatype as a byte pointer, which just allows for more liberty in accessing memory. A font contains a width, a height, and an array of fontchars. Remember that fontchars are simply pointers to the location in memory where the graphical data lives. When declaring a new font, the graphical data is already injected in memory, and a simple init function stores pointers (fontchars) in the array upon startup, which is then reinterpret-casted to this struct.

You might wonder, why not just bypass having pointers and have a gigantic block of data that maps to every fontchar? Well, many ASCII characters actually aren't meant to be printed and are rather control codes, like the newline. Making all the data contiguous in this alternate scheme would require lots of dummy blocks of zeros or magic numbers. Since we have no filesystem yet, everything is loaded into memory, and I don't want this kernel to be b l o a t.

Now that we're done refreshing, time for the first bug:

First bug: Using lea instead of mov.

In the earlier blog post, I had labels on every block of bytes so I could retrieve their memory addresses and store the pointers in the array. To do this, my init function "mov"s the memory address of the label into the array. However, I used the LEA instruction instead of the MOV instruction. I got confused by the naming, and it wasn't until probing the code with GDB and Ghidra that I found this utterly unfortunate mistake.

Wait... I'm looking back at the Ghidra image... it seems that NASM automatically accounted for the LEA instruction since I wasn't using effective addressing, so it inferred it and put it in. Hmmmmm. Well, there are still bugs to come!

Second bug: When you forget that in assembly +1 means +1 byte, not +1 on the index.

This caused so much trouble, and it was such a devious and sneaky problem. After extensively using a combination of GDB, Ghidra, and qemu run flags, I was able to pin down that memory was being incorrectly MOV'd. After storing a pointer, the init function doesn't hop over 4 bytes to the next free block of zeros, but rather moves 1 byte and overwrites part of the previous pointer. You can see that at address 0x2080 the stored address is malformed. At first I thought it was endianness, but no. Pointers here are expected to be around 0x2000 to 0x2200. If this program were to run in user space, it would segfault immediately.

Third bug: movzx confounding me. Reducing eax to all zeros. An incorrect datatype.

In the refresher, I mentioned how the fontchar datatype is simply a pointer to bytes in memory. I think because I switched schemes so often, I made the array in the font struct fontchar* instead of fontchar. This meant it was an array of pointers to pointers to bytes. An extra recast instruction slipped in. When the bytes were already accessed and stored in a variable, the code mistakenly treated the variable as a pointer to the actual bytes. When testing with 'A', the first 4 bytes are 0x90906000 (little endian in the EAX register). When trying to access the memory address at 0x90906000, the kernel freaked out and just returned 0. This led to the supposed data being all blank.

Next step in the shell: Interrupts
Interrupts are already underway. I'm setting up the IDT and ISRs in the same way I'm doing the GDT. Some refactoring will be needed both before and after that to finalize some kernel and toolchain architecture considerations. Be prepared. Once I can use interrupts, user input is possible.

Summer goals for the shell:
Simple Shell
No Turing Completeness

When the IDT is completed, I can focus on implementing a rudimentary user input system. It won't be much, but it's honest work. I'll discuss with my mentors how to proceed and what my timeline for the summer goals looks like, as I'm starting to reach the end stages.
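
To tie the font refresher and the second and third bugs together, here is a minimal C sketch of that layout. It is an illustration under assumptions, not the project's actual code: font_t, FONT_SLOTS, font_row, store_addresses and load_address are hypothetical names, glyphs are assumed to use one byte per row, and uint32_t stands in for the kernel's 4-byte pointers so the stride problem can be reproduced on any host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A fontchar points at the first byte of one glyph's bitmap. */
typedef const uint8_t *fontchar;

#define FONT_SLOTS 128  /* assumption: one slot per 7-bit ASCII code */

typedef struct {
    uint8_t  width;
    uint8_t  height;
    fontchar chars[FONT_SLOTS];  /* fontchar, not fontchar*: one level of indirection */
} font_t;

/* Return row `row` of the glyph for `c`, or 0 when the code has no glyph
 * (control codes such as newline keep a NULL slot) or the row is out of range. */
static uint8_t font_row(const font_t *font, unsigned char c, unsigned row)
{
    if (c >= FONT_SLOTS || row >= font->height || font->chars[c] == NULL)
        return 0;
    return font->chars[c][row];  /* a single dereference reaches the bitmap bytes */
}

/* Write 32-bit glyph addresses into a raw table, advancing `stride` bytes per
 * entry. A stride of 4 is correct; a stride of 1 reproduces the overwrite bug,
 * because each store clobbers three bytes of the previous address. */
static void store_addresses(uint8_t *table, const uint32_t *addrs,
                            size_t count, size_t stride)
{
    for (size_t i = 0; i < count; i++)
        memcpy(table + i * stride, &addrs[i], sizeof addrs[i]);
}

/* Read one 32-bit address back from the raw table at a byte offset. */
static uint32_t load_address(const uint8_t *table, size_t byte_offset)
{
    uint32_t value;
    memcpy(&value, table + byte_offset, sizeof value);
    return value;
}
```

Storing the addresses 0x2000 and 0x2010 with a stride of 4 reads both back intact, while a stride of 1 leaves the first entry malformed, which is the kind of corruption described in the second bug. Declaring chars as an array of fontchar keeps the lookup to a single dereference, which is what the third bug got wrong.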