Contents
Update News
Project Overview
Downloads
In The News
May 31, 2006
The code is in a state where it can now parse almost all relevant sections of the ELF files. I am also able to process libraries without problems, which may prove useful. I will most likely be looking at what I can do for the ".text" section until Clark is back. Perhaps I could whip up a primitive x86 disassembler.
May 29, 2006
I have been able to fix the minor alignment issue I was having with the program header table. My ELF binaries should now be fully standard-compliant. On another note, I have also successfully parsed the dynamic symbol table section of the ELF files. This information will most likely not be useful for my purposes, however, since the symbols seem to refer to external libraries. Fortunately, the code I have developed to manipulate byte streams in memory should prove useful for parsing the relocation entries and the ".text" section of the ELF files.
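To illustrate the kind of byte-stream parsing described above, here is a minimal sketch of reading dynamic symbol table entries. It assumes a 32-bit little-endian ELF file (the Elf32_Sym layout from the ELF specification). The function names and the sample data are hypothetical and are not taken from the project's code. Symbols whose section index is SHN_UNDEF are the ones defined elsewhere, typically in external libraries.

```python
import struct

# Elf32_Sym (little-endian): st_name, st_value, st_size, st_info, st_other, st_shndx
SYM_FORMAT = "<IIIBBH"
SYM_SIZE = struct.calcsize(SYM_FORMAT)  # 16 bytes per entry
SHN_UNDEF = 0  # section index used for symbols defined in another object


def read_cstring(strtab, offset):
    """Return the NUL-terminated string starting at `offset` in the string table."""
    end = strtab.index(b"\x00", offset)
    return strtab[offset:end].decode("ascii")


def parse_symbols(symtab, strtab):
    """Decode every Elf32_Sym entry in `symtab`, resolving names through `strtab`."""
    symbols = []
    for pos in range(0, len(symtab) - SYM_SIZE + 1, SYM_SIZE):
        name_off, value, size, info, _other, shndx = struct.unpack_from(
            SYM_FORMAT, symtab, pos
        )
        symbols.append({
            "name": read_cstring(strtab, name_off),
            "value": value,
            "size": size,
            "bind": info >> 4,    # upper nibble of st_info: binding
            "type": info & 0xF,   # lower nibble of st_info: symbol type
            "undefined": shndx == SHN_UNDEF,
        })
    return symbols


# Hypothetical sample data: a null entry, an undefined "puts", and a defined "main".
strtab = b"\x00puts\x00main\x00"
symtab = (
    struct.pack(SYM_FORMAT, 0, 0, 0, 0, 0, 0)
    + struct.pack(SYM_FORMAT, 1, 0, 0, 0x12, 0, SHN_UNDEF)
    + struct.pack(SYM_FORMAT, 6, 0x08048400, 32, 0x12, 0, 13)
)
symbols = parse_symbols(symtab, strtab)
```

The same unpack-at-offset approach should carry over to relocation entries, which are also fixed-size records.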
May 25, 2006
No updates for a little while... But the good news is that I got it working. My new ELF file parser
produces executables that are not identical to the ones loaded (data ordering differences, etc.),
but still work perfectly. I finally managed to decipher all the quirks that were keeping things
from working properly. Most of them had to do with memory alignment, and what segments should
contain.
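One alignment rule from the ELF specification is that, for a loadable segment, the file offset and the virtual address must be congruent modulo the segment alignment (p_offset, p_vaddr and p_align in the program header). The sketch below checks that rule and picks a compliant file offset. Whether this is the exact quirk encountered here is an assumption; the helper names and the sample values are hypothetical.

```python
def is_congruent(p_offset, p_vaddr, p_align):
    """True if p_offset and p_vaddr agree modulo p_align.

    Alignment values of 0 and 1 mean that no alignment is required.
    """
    if p_align in (0, 1):
        return True
    return p_offset % p_align == p_vaddr % p_align


def next_aligned_offset(min_offset, p_vaddr, p_align):
    """Smallest file offset >= min_offset that is congruent to p_vaddr mod p_align."""
    if p_align in (0, 1):
        return min_offset
    padding = (p_vaddr - min_offset) % p_align
    return min_offset + padding


# Hypothetical example: a segment mapped at 0x08049f00 with page alignment,
# to be written somewhere at or after file offset 0x1234.
offset = next_aligned_offset(0x1234, 0x08049F00, 0x1000)
```

With these sample values the segment data would be written at file offset 0x1f00, leaving padding bytes between 0x1234 and 0x1f00.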
The next step for me will be to look into parsing (and eventually modifying) the data in the
ELF file sections. For this I will be specializing my generic ELF section class in order to
create sections that can handle specific contents. I should look into the relocation entries,
for a start, since this is what I will mostly care about, and also the .text (executable)
section of the binaries, although I will most likely not parse those down to the level of
individual instructions.
May 18, 2006
I began to rework my ELF file loading and saving code. I decided to make it more modular and extendable this time. Hopefully, using the new knowledge I have, I should be able to make it all work right. This time, the files produced should be identical to the input files, assuming that no changes took place. At least this code is not a total rewrite. I simply needed to modify the design because I felt that otherwise it would have turned into something very messy.
May 17, 2006
I met with Clark this morning to discuss my progress and the problematic issues I was having.
He made a number of useful suggestions for finding what was wrong, from stripping the program of
unnecessary info to running a debugger and even verifying the file type. To my surprise, the
"file" program did not detect the exact file type, and the debugger reported some sensible
information about the problem.
After more investigation on my part, it seems that the problem is not what data is being
written (no bugs there, that I know of), but rather how. It seems that the segments, contrary
to my expectations (and what seems to be implied in the ELF specification), can actually contain
anything in the file. As a matter of fact, one segment even contains the ELF header.
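The observation that a segment can cover the ELF header can be checked directly from the program header table. The following is a minimal sketch that assumes a 32-bit little-endian ELF file. The function names and the synthetic test image are hypothetical; a real check would read the bytes of an actual executable instead.

```python
import struct

EHDR_FORMAT = "<16sHHIIIIIHHHHHH"  # Elf32_Ehdr, 52 bytes
PHDR_FORMAT = "<8I"                # Elf32_Phdr, 32 bytes
PT_LOAD = 1


def read_program_headers(image):
    """Return a list of (p_type, p_offset, p_filesz) for every program header."""
    fields = struct.unpack_from(EHDR_FORMAT, image, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    headers = []
    for i in range(e_phnum):
        p_type, p_offset, _vaddr, _paddr, p_filesz, _memsz, _flags, _align = (
            struct.unpack_from(PHDR_FORMAT, image, e_phoff + i * e_phentsize)
        )
        headers.append((p_type, p_offset, p_filesz))
    return headers


def segments_containing(image, start, size):
    """Indices of the segments whose file range covers [start, start + size)."""
    return [
        index
        for index, (_type, p_offset, p_filesz) in enumerate(read_program_headers(image))
        if p_offset <= start and start + size <= p_offset + p_filesz
    ]


# Synthetic image (hypothetical values): two PT_LOAD segments, the first starting
# at file offset 0, so it covers the ELF header and the program header table.
ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
ehdr = struct.pack(EHDR_FORMAT, ident, 2, 3, 1, 0x08048080, 52, 0, 0, 52, 32, 2, 40, 0, 0)
phdr0 = struct.pack(PHDR_FORMAT, PT_LOAD, 0x000, 0x08048000, 0x08048000, 0x200, 0x200, 5, 0x1000)
phdr1 = struct.pack(PHDR_FORMAT, PT_LOAD, 0x200, 0x08049200, 0x08049200, 0x100, 0x100, 6, 0x1000)
image = (ehdr + phdr0 + phdr1).ljust(0x300, b"\x00")

header_owners = segments_containing(image, 0, 52)
```

In this synthetic image, segment 0 is reported as containing both the 52-byte ELF header and the program header table that follows it.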
Since I have no choice but to parse the file if I want to be able to modify it, I will have to
come up with a better system. My current idea is to modify the way things are loaded so that I
can easily order each significant block of data in the file according to its position, and also
have a containment relationship. This should ensure that, with an unmodified ELF file, the file
written is exactly the same as what was originally parsed, with everything in the same place.
Hopefully that will do the trick.
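The idea of ordering blocks by file position while tracking containment can be sketched as a small tree builder. This is only one possible design under stated assumptions: blocks either nest or do not overlap, and every name here (Block, build_layout, walk) is hypothetical rather than part of the actual parser.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    offset: int   # position in the file
    size: int     # length in bytes
    children: list = field(default_factory=list)

    @property
    def end(self):
        return self.offset + self.size

    def contains(self, other):
        return self.offset <= other.offset and other.end <= self.end


def build_layout(blocks):
    """Arrange blocks in file order, nesting each one inside its smallest container."""
    root = Block("file", 0, max(b.end for b in blocks))
    stack = [root]
    # Sort by position; at equal offsets the larger block comes first so that
    # containers are visited before their contents.
    for block in sorted(blocks, key=lambda b: (b.offset, -b.size)):
        while not stack[-1].contains(block):
            stack.pop()
        stack[-1].children.append(block)
        stack.append(block)
    return root


def walk(block, depth=0):
    """Yield (depth, name) pairs in the order the blocks appear in the file."""
    yield depth, block.name
    for child in block.children:
        yield from walk(child, depth + 1)


# Hypothetical layout, loosely modelled on a small executable.
blocks = [
    Block(".text", 0x80, 0x100),
    Block("segment 1", 0x200, 0x100),
    Block("ELF header", 0, 52),
    Block("segment 0", 0, 0x200),
    Block(".data", 0x200, 0x80),
    Block("program headers", 52, 64),
]
layout = build_layout(blocks)
order = [name for _depth, name in walk(layout)]
```

Writing the file back by walking this tree in order would place every block at its original position, with any gaps between siblings treated as padding.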
May 16, 2006
I have been working on writing back parsed ELF data into files. I used a simple "Hello
World!" program to test this, in order to keep things simple. I had to fix a few bugs here
and there, but it seems I managed to write the file almost exactly the way it was before I
loaded it, except for the debug data gcc appends at the end of the executable, which my
program rewrites in a different order (this should not affect the execution).
What was rather disturbing, however, is that I ran into "permission denied" errors when I
tried to run the newly generated executables. This seems strange to me since I made sure the
permissions were properly set beforehand, and the executables did run before. I will have to
show this to Clark. Perhaps something strange is happening during the execution.
May 15, 2006
I have been working on my new website and I am finally putting it online today. I will be adding progress updates to this page throughout the summer, and eventually making a demo of the optimizer tool available for download.
May 12, 2006
During the last week I have been working on establishing the basis for my ELF file parser. It seems to be working rather well. I have successfully been reading and writing back simple programs. This week I will work on more elaborate parsing, up to the level of executable code.
Project Overview
Brief Description
The Catalyst Code Optimizer project aims to explore ways of enhancing the performance of
compiled binary code by modifying its memory arrangement in order to improve instruction
cache performance on the target platform.
This will be done by implementing a tool which will be able to parse and modify ELF
executable binaries on the Linux platform. This tool will perform modifications to the
compiled code contained in these executables in order to modify their cache alignment,
in the hope of obtaining improved performance.
Theoretical Foundation
The theoretical foundation of this project is very simple. It is based on the fact that
the instruction cache of modern processors is divided into fixed-size cache lines, each
of which can only hold data from a specific set of memory addresses.
When a program is executed on the processor, new instructions get stored into the cache,
forcing other instruction data out of it. What we are trying to achieve is an improvement
in the alignment of the instruction data inside the memory image of a process, so that
the instruction cache performance improves.
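As a rough illustration of how addresses map onto the cache, the sketch below computes which cache line and which cache set an address falls into, and how many lines a block of code spans. The line size and number of sets are assumed values chosen for the example, not measurements of any particular processor, and the function names are hypothetical.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_SETS = 128   # number of sets in the cache (assumed)


def cache_line(address):
    """Index of the memory line that contains `address`."""
    return address // LINE_SIZE


def cache_set(address):
    """Cache set that the line containing `address` must be stored in."""
    return cache_line(address) % NUM_SETS


def lines_spanned(start, length):
    """Number of cache lines touched by `length` bytes of code starting at `start`."""
    return cache_line(start + length - 1) - cache_line(start) + 1


# A hypothetical 48-byte function: misaligned it straddles two lines,
# while starting it on a line boundary makes it fit in one.
misaligned = lines_spanned(0x08048430, 48)
aligned = lines_spanned(0x08048440, 48)

# Two addresses exactly LINE_SIZE * NUM_SETS bytes apart compete for the same set.
same_set = cache_set(0x08048440) == cache_set(0x08048440 + LINE_SIZE * NUM_SETS)
```

Under these assumptions, moving the function from 0x08048430 to 0x08048440 halves the number of cache lines it occupies, which is the kind of effect a change in memory arrangement is meant to produce.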
Downloads
There are no downloads at this time.