p2p news feature / p2pnet: Which is faster for stats, Linux or Apple’s OS X?
Linux, says Jasjeet Sekhon, associate professor in the Travers Department of Political Science and the Survey Research Center at UC Berkeley.
What does the Apple crowd have to say about that?
“I joke that my webserver survived the digg onslaught and my mailserver the Apple fraternity onslaught,” Sekhon told p2pnet. “Joking aside, the response from the fraternity has been mostly but not entirely civil. A running issue has been that they don’t think it is fair to compare OS X with Linux because the latter is a specialty operating system. I find this claim to be bizarre.
“One is obviously not going to be running Photoshop on Linux (other than via vmware or something), but both are general-purpose operating systems. In any case, I added the Windows XP results (which I knew about beforehand) to the webpage in response.
“Even Windows XP performs better than OS X, which is embarrassing for Apple and a testament to Microsoft engineering given the legacy software requirements of Windows.
“The one very helpful response has been from people who run the Virginia Tech G5 high-performance computer system. They revealed to me that they get their excellent performance by using a modified memory manager – i.e., they dispensed with the OS X memory manager.
“This key fact is of course missing from Apple’s websites and PR on the Virginia Tech system.”
Linux versus Mac OS X and Windows XP on Intel Dual Core
By Jasjeet Sekhon – UC Berkeley
Linux is found to be much faster
than Apple’s OS X for
statistical computing. And although Linux is 5 to 10 percent faster
than Windows XP, both are markedly faster than OS X. For example, in
one benchmark both Linux and Windows XP are more than twice as
fast as OS X. The benchmarks on this page were run on a MacBook Pro with a 2.16GHz
Intel Core Duo chip and 2GB of RAM.
I had previously conducted Linux vs. Mac OS X and Windows XP benchmarks, and
Opteron vs. G5 and Pentium benchmarks. Those results were not good
for OS X and not particularly good for the G5 (970) chip. For example, my 2.7 pound Pentium-M Linux laptop is
faster than my 44 pound G5 running OS X. The floating point
performance of the 970 chip leaves much to be desired, but OS X makes
the performance problem significantly worse.
The Intel chip… For months, it’s been trapped inside a Mac, inside a
pretty little box, dutifully performing pretty little tasks when it
could have been doing so much more. Starting today, the Intel chip
will be set free, and get to live life in a Mac… running
Linux. Imagine the possibilities.
People often ask me about my opinion of
Apple’s OS X, both as an
alternative to Linux and as an
operating system useful for statistical computing. Because I support
software on various platforms, I have to think about the
idiosyncrasies of various operating systems and chips. In order to
save time repeating the same information to many people, I have
decided to post it on the web. The short answer: use Linux if you
want performance and stability. If you want to use Mac OS X or
Windows XP, go ahead. All of these operating systems are now above
the line (not long ago the operating systems out of Redmond and
Cupertino were a joke). However, if you decide to use Mac OS X for
whatever reason, don’t assume that it is just like Linux or some other
efficient unix but with a friendly GUI. Life is full of tradeoffs and
reasonable people can decide to make different choices. Don’t pretend
that tradeoffs don’t exist, and don’t fall victim to Apple’s marketing
which is an extension of the Steve Jobs Reality Distortion Field.
I present here a set of benchmarks which are relevant to my work and
to people working in statistical computing, particularly people using
the R Project for Statistical
Computing. These benchmarks are floating point bound, where the
main IO is to memory and not to disk. Cache and Translation
Lookaside Buffer (TLB) misses matter a great deal, as does memory speed.
This setup may be of more general interest, but it may not be
relevant for what you do. If you need a computer to do Y, and
these benchmarks are in no way related to Y, don’t write me to
complain about it. These benchmarks are useful for the work I
and some other computational statistics people do.
OS X is incredibly slow by design, in part because of the
hybrid XNU kernel it uses.
XNU is based on the Mach microkernel
(of Torvalds vs. Tanenbaum debate fame) and the excellent Berkeley Software Distribution
(BSD) kernel. The hybrid kernel is very inefficient and less
stable than alternatives such as the Linux kernel and the BSD kernel
found in FreeBSD. The reasons
for this are many. For example, in Linux, the arguments for a system
call are passed directly using the register file. In OS X, they are
packed into a memory buffer, passed to a variety of places, and the
results are then passed back using another memory buffer before the
results are written back to the register file. You can just imagine
what that does for TLB and cache hits. This just adds to the context
switching difficulties on some chips.
Memory management in OS X is awful. To quote Kazushige Goto talking
about his BLAS:
“Performance is suppressed on purpose due to [the] awful memory
management of OS X”. Goto’s work is described and praised on Apple’s
own website because he added a custom BLAS for the Apple super
computer at Virginia
Tech. The Apple site states that Goto was “pulling out
incredible efficiencies”. Well, given Goto’s own benchmarks and
comments, this is just another example of the Steve
Jobs Reality Distortion Field.
The benchmarks presented here are based on two of my statistical
software packages for R: Matching (Multivariate and Propensity Score
Matching Software) and rgenoud (R Version of
GENetic Optimization Using Derivatives). Both packages use C++
extensively. The two benchmark scripts are available here (Genetic Matching) and here (Matching). All benchmarks were done using
R-2.3 and gcc 4. The best timing of the three calls to GenMatch in the GenMatch script is presented, as is the best of
three consecutive runs of the matching script
(examining the worst or the average times yields the
same substantive results).
The machines are:
|Label|OS and Chip|
|OS X Core Duo|Mac OS X on MacBook Pro, Intel 2.16GHz Dual Core, 2GB RAM|
|Linux Core Duo|Linux on MacBook Pro, Intel 2.16GHz Dual Core, 2GB RAM. Note: Xorg server running with GNOME|
|XP Core Duo|Windows XP SP2 on MacBook Pro, Intel 2.16GHz Dual Core, 2GB RAM|
|Linux Pentium 4|Ubuntu Linux (Dapper Drake Beta 2) on 3GHz Pentium 4, 2GB RAM. Note: Xgl+compiz running with KDE|
|Linux Opteron|Ubuntu Linux (64-bit) on Opteron 250, 4GB RAM. Note: Xorg server running with KDE|
Both Linux and Windows XP are vastly faster than OS X: more than
twice as fast. And Linux is somewhat faster than Windows XP.
This benchmark does not take up much RAM (less than 30 MB), nor does it
work the filesystem much. But the application does flip between
various shared libraries and pass various data objects back and forth
in RAM. The following benchmark takes about the same amount of RAM,
but unlike the previous one it does not flip between various shared
libraries. It calls a shared library, but does so only once
and passes results back only once.
This second benchmark looks better for OS X, but OS X is still
about 1.2 times slower than Linux. And the gap between Linux and
Windows has grown from about 5 to about 10 percent.
These benchmarks do not use a graphical user interface. They are
batch jobs run from the command line and produce no graphical
output. No X11 or Aqua calls are made. And on all platforms the
benchmark process obtains 99%+ of a CPU or core. Moreover, neither
benchmark tests IO or runs multiple processes on the same
chip. If we do either of these, the Opteron’s
relative performance improves.
Many people commented that my previous
benchmarks, which compared OS X on the G5 with Linux on Opteron
chips, were limited because gcc is optimized for the x86 family. In
these benchmarks, this excuse obviously cannot be used. There are some
serious issues with OS X, and the gang in Cupertino should get to work.
Even Windows XP performs better than OS X, which is embarrassing for
Apple and a testament to Microsoft engineering given the legacy
software requirements of Windows.
As noted before, the hybrid
XNU kernel is probably to blame for OS X’s problems. People
on the web have recently been speculating about whether Apple will drop
the Mach micro-kernel portion of XNU. These rumors have picked up with
the departure of Avie Tevanian,
an important figure in the development of the Mach kernel, first at
Carnegie Mellon and then at Apple. Interestingly, Chris Emura, the
Filesystem Development Manager within Apple’s CoreOS organization, recently
stated that Apple is interested in porting Sun’s ZFS
filesystem to OS X. If true, it may be that Apple is interested
in fixing core issues with its operating system now that the eye
candy is stable.
I have conducted many more benchmarks on these and other machines.
For example, I have tested the HFS+ filesystem. It is slower than
ReiserFS, especially for small and
medium sized files, and slower than XFS, especially for large
files. If you want these additional benchmarks, let me know.
There are claims on the web that when Apple developers compile OS X on the 970, they use -Os. That is, they optimize for size and not for performance. “So even though
Apple talked a lot of smack about having a first-class 64-bit RISC workstation
chip under the hood of their towers, in the end they were more concerned about
OS X’s bulging memory requirements than they were about The Snappy(TM).”
AnandTech has an article which offers another explanation for why OS X is so
inefficient. See “No more mysteries:
Apple’s G5 versus x86, Mac OS X versus Linux”.
A writeup of my previous benchmarks, which includes a review of my general impressions of OS X, is available here.
If you have any suggestions on how to fix the terrible performance of
(this software on) OS X or if you think something here is erroneous,
please contact me.
See similar benchmarks available on AnandTech’s website: “No more
mysteries: Apple’s G5 versus x86, Mac OS X versus Linux” and “No more
mysteries, part two”.
For another review see When a Linux user buys
Apple’s Mac mini.