Miro Kropacek wrote:
> I've decided to make some tests.
Nice tests.
> So unless m68000 is some exception, I'd say libm is needed only for gmp / mpfr tests on native builds
I remember you already said that, then I removed it, and finally I added it again because it was required by something... I don't remember. Probably by libstdc++.a, because it requires math.h, and certainly by some linker tests.
> So either I was really drunk or this new gcc has drastically changed since then
Yes, it has. GCC 4.x is very different from older versions (tree-based optimizations?). It is also a lot bigger and a lot slower. However, it usually produces better code. Sometimes there are regressions in code quality, too.
> or mintlib is not very good testcase (since it uses a lot of external stuff)
I think it is not a good testcase. The MiNTLib is composed of hundreds of C files, and each one compiles quite fast, but the build is highly dependent on the speed of the disk. Also, the GCC executables are huge; it takes time for the kernel to load them into memory and relocate them. To compile a single file you have to run make, sh, gcc, cc1, as... They are all spawned with fork() (?), and that is slow... Finally, the speed of the compiler itself is only a small factor in the overall build time.
I believe most of the time is lost in sh, disk access, and fork(). However, we can speculate all we want; we will never know for sure. This has to be profiled scientifically to see where the slowdowns really come from.
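For instance, a small test program along these lines (a sketch only, assuming an ordinary POSIX environment; the run count and /bin/true are arbitrary choices of mine) would isolate the raw cost of fork() + exec from the compiler itself:

/* fork+exec timing sketch (hypothetical; POSIX assumed) */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    const int runs = 100;              /* arbitrary number of children */
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < runs; ++i) {
        pid_t pid = fork();
        if (pid == 0) {
            /* child: exec a trivial program and exit immediately */
            execl("/bin/true", "true", (char *)NULL);
            _exit(127);                /* exec failed */
        } else if (pid > 0) {
            int status;
            waitpid(pid, &status, 0);
        } else {
            perror("fork");
            return EXIT_FAILURE;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%d fork+exec cycles: %.2f s total, %.1f ms each\n",
           runs, secs, 1000.0 * secs / runs);
    return 0;
}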
A good test would be to compile a small but extremely complicated file, one which takes 10 seconds or more to compile. You should also invoke cc1 directly, to minimize the overhead of the intermediate programs. However, the C language is quite simple, so a single C file usually compiles quite fast. C++ files are better candidates for benchmarking, because some of them, with a lot of templates, take a noticeable time to compile even on extremely fast computers.
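For example, a hypothetical stress file like this one (my own illustration: the Ackermann function evaluated through recursive template instantiation) keeps cc1plus busy with hundreds of instantiations, and the arguments can be raised until the compile takes long enough to measure:

/* bench.cpp -- hypothetical compile-time stress test; every distinct
   (M, N) pair becomes one template instantiation */
template <int M, int N>
struct Ack {
    static const int value = Ack<M - 1, Ack<M, N - 1>::value>::value;
};

template <int N>
struct Ack<0, N> { static const int value = N + 1; };

template <int M>
struct Ack<M, 0> { static const int value = Ack<M - 1, 1>::value; };

template <>
struct Ack<0, 0> { static const int value = 1; };

/* Ack<3, 5>::value == 253; raise the arguments (and -ftemplate-depth
   if needed) until the compile time becomes measurable */
int ackermann_3_5 = Ack<3, 5>::value;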
> or 68000 optimization in general is so damn good there's no space for massive speedups thanks to raw CPU power.
Maybe... The 68000 code generated by GCC is usually very good, except for floating point, of course: soft-float can't compete in speed with a hardware FPU.
As we discussed long ago, another big source of speed is... the alignment of longs on 32-bit boundaries. We have seen that accesses to aligned longs are faster than unaligned ones in FastRAM. In order to respect the existing TOS APIs, our GCC never tries to align longs on 32-bit boundaries, so they are randomly slow or fast. Enabling alignment would provide more speed, but the public OS structures would have to be kept unaligned to stay compatible. This would be a full project. I made a quick hack in EmuTOS for ColdFire to align the RAM returned by Malloc() on a 32-bit boundary, and the performance was significantly better.
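The idea of that hack can be sketched with standard functions (malloc() stands in here for GEMDOS Malloc(), the helper names are made up, and 32-bit pointers are assumed, as on the m68k):

/* sketch only: over-allocate, then round the returned address up to
   the next 32-bit boundary */
#include <stdlib.h>
#include <stdint.h>

void *malloc_aligned32(size_t size)
{
    /* extra room: one slot to remember the real block, plus up to
       3 padding bytes for the rounding */
    unsigned char *raw = (unsigned char *)malloc(size + sizeof(void *) + 3);
    if (raw == NULL)
        return NULL;

    /* skip the saved-pointer slot, then round up to a 4-byte boundary */
    uintptr_t addr = (uintptr_t)(raw + sizeof(void *));
    addr = (addr + 3) & ~(uintptr_t)3;

    /* stash the real start just below the aligned area */
    ((void **)addr)[-1] = raw;
    return (void *)addr;
}

void free_aligned32(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}

The cost is only a few bytes of overhead per allocation; in exchange, every long placed at the start of the block sits on a fast boundary.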
--
Vincent Rivière