Last update: 01 jan 2005 12:17:10
A little knowledge of x86 processors is required to understand the information here. It is based on what we have learned over the years about optimizing at the machine level. Some of the ideas listed here may be flawed, impossible, or already in use, but if they are in use, they are at least not yet documented in the processor news media. If you find the ideas interesting, have ideas of your own, and want to share them with us, just send an email. It would be interesting to tell AMD and Intel what we want to see in future processors.
1 - About inserting executable code in an internal memory section of the processor for fast execution (01/Jan/2005 11:43) :

The interesting goal is not to raise the execution speed of processors, but to make better use of the speed already available. A good part of a processor's speed is lost to the data bus delay when reading from or writing to memory. And since the x86 has only 8 general purpose registers, even a fairly simple algorithm usually has to read and write memory in order to execute.

But there is a way the processor can be enhanced without breaking compatibility: keep the most heavily executed code inside the processor. Consider an algorithm that requires functions a and b, where function b is executed very often and needs all 8 registers plus a few memory accesses to do its work. The processor could read the whole of function b into an internal memory and redirect each call to b to the internal copy, with a few enhancements: since b is now internal, every memory reference to a temporary variable is replaced with a reference to an internal register.

Obviously the function must be generated so that it does not access global variables, or execution will fail. It may use only internal variables, which can be replaced with additional registers without problems. This can be given as a hint to the compiler, which will generate the additional assembly code telling the processor that the function can safely be used internally.

Where to save this internal executable code during a task switch?
The processor needs additional memory to keep the internal copy of the function, so that it is not lost during a task switch, and to keep the state of the additional registers as well. Since this memory is internal to the processor, the operating system does not even need to know that it or the additional registers exist. The hard part is to detect a task switch correctly without depending on the operating system, and to detect which task the processor is switching to, so that the additional memory can be restored for the task about to run.

It looks a little complex, but it can work. Indeed, today's processors already translate the original x86 code into an internal RISC-like representation for fast execution. Now imagine how fast code would execute if all the temporary variables a program requires were kept in additional internal general purpose registers.
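The save-and-restore part of this idea can be sketched as a small software simulation. This is only a sketch under stated assumptions, not a description of any real processor: it assumes the processor can already tell tasks apart by some key (here a plain `task_id`, which could for example be the page-table base address), and the names `Cpu`, `ShadowSlot`, `cpu_init`, `cpu_task_switch`, `EXTRA_REGS` and `SHADOW_SLOTS` are all hypothetical. How the switch itself is detected is left open, as in the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXTRA_REGS   32  /* hypothetical number of additional registers */
#define SHADOW_SLOTS 4   /* hypothetical number of tasks the chip remembers */

typedef struct {
    uint32_t task_id;           /* assumed key that identifies a task */
    int      in_use;            /* nonzero if this slot holds a saved state */
    uint32_t regs[EXTRA_REGS];  /* saved copy of the additional registers */
} ShadowSlot;

typedef struct {
    uint32_t   current_task;
    uint32_t   live[EXTRA_REGS];     /* registers seen by the running task */
    ShadowSlot slots[SHADOW_SLOTS];  /* internal memory, invisible to the OS */
    unsigned   next_victim;          /* round-robin eviction index */
} Cpu;

void cpu_init(Cpu *cpu, uint32_t first_task)
{
    assert(cpu != NULL);
    memset(cpu, 0, sizeof *cpu);
    cpu->current_task = first_task;
}

/* Return the slot holding the saved state of task_id, or NULL if none. */
static ShadowSlot *find_slot(Cpu *cpu, uint32_t task_id)
{
    for (unsigned i = 0; i < SHADOW_SLOTS; i++)
        if (cpu->slots[i].in_use && cpu->slots[i].task_id == task_id)
            return &cpu->slots[i];
    return NULL;
}

/* Choose where to save: the task's own slot, else a free one, else evict. */
static ShadowSlot *slot_for_save(Cpu *cpu, uint32_t task_id)
{
    ShadowSlot *s = find_slot(cpu, task_id);
    if (s)
        return s;
    for (unsigned i = 0; i < SHADOW_SLOTS; i++)
        if (!cpu->slots[i].in_use)
            return &cpu->slots[i];
    s = &cpu->slots[cpu->next_victim];
    cpu->next_victim = (cpu->next_victim + 1) % SHADOW_SLOTS;
    return s;
}

/* Called when the simulated processor has detected a switch to new_task. */
void cpu_task_switch(Cpu *cpu, uint32_t new_task)
{
    if (new_task == cpu->current_task)
        return;

    /* Fetch the incoming state first, in case its slot gets evicted below. */
    uint32_t incoming[EXTRA_REGS] = {0};  /* an unknown task starts clean */
    ShadowSlot *in = find_slot(cpu, new_task);
    if (in)
        memcpy(incoming, in->regs, sizeof incoming);

    /* Save the outgoing task's registers into the internal memory. */
    ShadowSlot *out = slot_for_save(cpu, cpu->current_task);
    out->task_id = cpu->current_task;
    out->in_use = 1;
    memcpy(out->regs, cpu->live, sizeof cpu->live);

    memcpy(cpu->live, incoming, sizeof cpu->live);
    cpu->current_task = new_task;
}
```

The sketch also shows the limit of the approach: with a fixed number of slots, a task whose slot has been evicted comes back with its additional registers cleared, so the internal memory has to be large enough for the tasks that actually use them.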
2 - Why the number of general purpose registers cannot be enlarged ? (01/Jan/2005 11:43) :

You have only 8 registers to do the work, and it is very difficult to understand why a new, large set of registers is not added to the processor.

Possible problems: the operating system was not created to support this new set of registers. True, the operating system does not know about the registers, so it cannot save and restore them during a task switch. But the processor can detect that a task switch is occurring and keep track of it, so the operating system does not even need to know that the new registers exist. The only thing required is a way for the processor to save the state of these registers internally. If 2 megabytes of cache can be added to the processor, then 1 megabyte of memory can be added to hold internal register states across task switches.

3 - A new set of registers and additional memory can be added to the x86 without problems (01/Jan/2005 11:43) :

The major problem with adding new general purpose registers is that the state of these registers is not saved and restored during a task switch. But if only one running thread uses these registers, there is no reason to save their state: no conflict can occur, and the processor can execute the code correctly. And since this thread can use the additional set of registers without a task switch problem, it can also use an additional area of internal processor memory to keep data for fast code execution. You can then write specific code that uses not only a large set of registers but also internal processor memory to allocate, deallocate, and store temporary values, or for any other need the code has. I wonder how fast code would execute if everything ran inside the processor and only the result went outside.
It is also easy to avoid a collision between two threads trying to use the same internal set of additional registers and memory: keep a flag that a thread queries to see whether the additional register set is available. If it is available, the thread uses it. If another thread is using it, the thread just executes the normal code. As soon as the thread finishes executing its code, it updates the flag to make the additional registers available to other threads. As you can see, it is not that difficult to extend the x86 processor without breaking compatibility.
More to come