Lionel Bouton wrote:
> These are CPU-level profiling tools, as:
> - the CPU is the same in and out of UML,
> - I certainly don't have access to the performance counters from
>   user-mode-linux,
> they won't be of much use for me (profiling the behavior of
> user-mode-linux is not what I'm after).

I disagree here.

1. User-mode Linux is a guest on some host. (Gentoo, wasn't it?) In a sense, UML is the application, even though it's executing code on behalf of the benchmark, which is the application you care about. So profiling the host with *oprofile* will tell you what the whole host is doing, including the UML guest and the benchmark within it.

2. The actual physical processor(s) will be the same in and out of UML, yes. However, what the "OS" does with those processors, especially with respect to caches, will be different. Other things could differ too, like branch prediction, or the system call possibility you noted before. oprofile and the CodeAnalyst wrapper will tell you how efficiently the processor is being used in both cases.

> Eventually when I have time to narrow down the problem myself, I'll
> launch strace on the benchmark, study the differences and submit the
> list of system calls to the UML coders asking why some can be faster on
> UML than on the host kernel.
>
> Lionel

As long as you're experimenting, you might want to try this with a Xen dom0 host and domU guest, or with VMware Server. In either case, oprofile on the host should give you some interesting information.
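
For the strace comparison mentioned in the quoted message, something like the following might save some manual diffing. It is only a rough sketch: it assumes the benchmark was run under `strace -c` once on the host and once inside UML, and that each summary table was saved to a text file. The function names `parse_strace_summary` and `faster_in_guest` are made up for this illustration, and the parser assumes the usual column order of the `strace -c` table (% time, seconds, usecs/call, calls, optional errors, syscall).

```python
def parse_strace_summary(text):
    """Parse the table printed by `strace -c` into {syscall: (seconds, calls)}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields, or 6 when the "errors" column is filled in.
        if len(parts) not in (5, 6):
            continue
        name = parts[-1]
        if name == "total":
            continue  # skip the summary row at the bottom
        try:
            seconds = float(parts[1])
            calls = int(parts[3])
        except ValueError:
            continue  # separator line made of dashes
        stats[name] = (seconds, calls)
    return stats


def faster_in_guest(host, guest):
    """Return syscalls whose mean time per call is lower in the guest run."""
    result = []
    for name in sorted(set(host) & set(guest)):
        host_seconds, host_calls = host[name]
        guest_seconds, guest_calls = guest[name]
        if host_calls == 0 or guest_calls == 0:
            continue
        if guest_seconds / guest_calls < host_seconds / host_calls:
            result.append(name)
    return result
```

Reading the two saved summaries, passing each through `parse_strace_summary`, and handing the results to `faster_in_guest` yields the list of system calls that look cheaper under UML. Keep in mind that `strace -c` reports time spent in system calls as seen by the tracer, so the numbers are a starting point for questions, not a precise measurement.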