Mindcraft Performance Reports

Setting the Record Straight:
Where Linux Today Got It Right and Wrong

By Bruce Weiner

May 4, 1999


In an April 27, 1999 article entitled "Will Mindcraft II Be Better?" Linux Today presented a one-sided report clearly designed to destroy Mindcraft's credibility and to falsely make our reports look wrong. I want to set the record straight with this rebuttal, so I'll point out what's right and wrong with the Linux Today article. Unfortunately, it takes more words to right a wrong than it does to make someone look wrong, so please bear with me.

What's Right

Dave Whitinger and Dwight Johnson had several points right in their article:

  • Mindcraft did the tests stated in the article under contract with Microsoft in a Microsoft lab.

    Many have tried to imply that something is wrong with Mindcraft's tests because they were done in a Microsoft lab. You should know that Mindcraft verified the clients were set up as we documented in our report and that Mindcraft, not Microsoft, loaded the server software and tuned it as documented in our report. In essence, we took over the lab we were using and verified it was set up fairly.

  • Mindcraft did conduct a second test with support from Linus Torvalds, Alan Cox, Jeremy Allison, Dean Gaudet, and David Miller. Andrew Tridgell provided only one piece of input before he left on vacation. Mindcraft received excellent support from these leading members of the Linux community. I thank them for their help and very much appreciate it.

  • Jeremy Allison was correct that I made the initial contact at the suggestion of a journalist, Lee Gomes from the Wall Street Journal.

  • Jeremy was right that we were under an NDA and, as stated above, the tests were run at a Microsoft lab.

What was not mentioned in the article was the excellent support Red Hat provided for our second test. Doug Ledford, from Red Hat, answered my questions on the phone, always called back when I left messages, and participated in the email correspondence with the above named Linux experts.


What's Wrong

Unfortunately, Mr. Whitinger and Mr. Johnson did not even attempt to contact Mindcraft to get information from us. It seems as though they wanted to write a one-sided story from the beginning. The following points will give you the other side of their story.

  • Linus is attributed as saying ".... that nobody in the Linux community is really working on the Mindcraft test per se, because Mindcraft hasn't allowed them access to the test site." It's clear from the emails we exchanged that the Linux experts did make suggestions on tunes for Linux, Apache, and Samba. They also provided a kernel patch that was not readily available. We applied all tunes they suggested and the kernel patch. Here are some of the things that happened:

    • Red Hat provided version 1.0 of the MegaRAID driver during our tests and we used it, even though it meant retesting.

    • We sent out our Apache and Samba configuration files for review and received approval of them before we tested. (We actually got better performance in Apache when we made some changes to the approved configuration file on our own.)

    • Whenever we got poor performance we sent a description of how the system was set up and the performance numbers we were measuring. The Linux experts and Red Hat told us what to check out, offered tuning changes, and provided patches to try. We had several rounds of messages between us in which Mindcraft answered the questions they posed.

  • According to the article, Linus complained about the opaqueness of our test. This is a strange complaint since he and all of the Linux experts knew the exact configuration of the system we were testing and knew the benchmarks we were running. The NetBench and WebBench benchmarks are readily available on the Web for free and are probably some of the best documented benchmarks available. We withheld no technical details from him or the other Linux experts.

    Jeremy Allison directly contradicts Linus later in the article when he says "...I can confirm that we have reproduced Mindcraft's NT server numbers here in our lab." Clearly, Jeremy was tracking what we were doing and had the lab to verify our results.

  • The article says that all emails to the Linux experts came from a Microsoft address. That's wrong. On April 16, 17, 18, and 19 I sent emails to them from Mindcraft's office on a Mindcraft IP address. Emails sent during the second test were sent from a Microsoft IP number.

  • Mr. Whitinger and Mr. Johnson are wrong about the email alias of "will" belonging to me. It belongs to a person who is not a Mindcraft employee. He is someone who did a posting to a newsgroup about Linux on the system we were going to use for testing. He wanted to remain as anonymous as possible because he didn't want to get a ton of flaming email (based on the email Mindcraft has received, he underestimated how much he would get). I see no need to reveal who he is now because his worst nightmare would come true and because he had nothing to do with our test.

  • Jeremy did give me excellent support both on the phone and via email. I applied all of his suggestions. If he gave me all of the tuning parameters he used for the February 1, 1999 PC Week article showing Samba performance on a VA Research system, they should have been applicable to the system I was using. That certainly is true for systems as similar as those two when running Windows NT Server.


The Crux of The Matter

The whole controversy over Mindcraft's benchmark report is about three things: we showed that Windows NT Server was faster than Linux on an enterprise-class server, Apache did not outperform IIS, and we didn't get the same performance measurements for Samba that Jeremy got in the PC Week article or his lab. Let's look at these issues.

Comparing the performance of a resource-constrained desktop PC with an enterprise-class server is like saying a go-kart beat a grand prix race car on a go-kart race course.

  • Smart Reseller reported a head-to-head test of Linux and Windows NT Server in a January 25, 1999 article; they tested performance on a resource-constrained 266 MHz desktop PC. One cannot reasonably extrapolate the performance of a resource-constrained desktop PC to an unconstrained, enterprise-class server with four 400 MHz Xeon processors.

  • In a February 1, 1999 article, PC Week tested the file server performance of Linux and Samba on an enterprise-class system. They did not compare it to Windows NT Server on the same system. Jeremy Allison helped with these tests comparing the Linux 2.2 kernel with the Linux 2.0 kernel. I'll show you below what he thinks about Windows NT Server on an enterprise-class server.

  • If you doubt our published Apache performance, Dean Gaudet, who wrote the Apache Performance Notes and who provided tuning help, gives some insights in a recent newsgroup posting. In response to a request for tuning Apache for Web benchmarks, Dean wrote:

    "Unless by tuning you mean 'replace apache with something that's actually fast' ;)

    "Really, with the current multiprocess apache I've never really been able to see more than a handful of percentage improvement from all the tweaks. It really is a case of needing a different server architecture to reach the loads folks want to see in benchmarks."

    In other words, Apache cannot achieve the performance that companies want to see in benchmarks. That's probably why none of the Unix benchmark results reported at SPEC use Apache.

  • Jeremy Allison believes, according to the Linux Today article, that if we do another benchmark with his help, "...this doesn't mean Linux will neccessarily [sic] win, (it doesn't when serving Win95 clients here in my lab, although it does when serving NT clients)..." In other words, in a fair test we should find Windows NT Server outperforming Linux and Samba on the same system. That's what we found.

  • Jeremy's statement in the Linux Today article that "It is a shame that they [Mindcraft] cannot reproduce the PC Week Linux numbers ..." shows a lack of understanding of the NetBench benchmark. If he looked at the NetBench documentation, he would find a very significant reason why Mindcraft's measured Samba performance was lower:

    We used 133 MHz Pentium clients while Jeremy and PC Week used faster clients, although we don't know how much faster because neither documented that. We believe that PC Week uses clients running with at least a 266 MHz Pentium II CPU. Because they use clients that are at least twice as fast and because so much of the NetBench measurements are affected by the clients, this can account for most of the difference in the reported measurements.

    "You can only compare results if you used the same testbed each time you ran that test suite." (Understanding and Using NetBench 5.01)

    In addition, the following testbed and server differences add to the measured performance variances:

    1. Mindcraft used a server with 400 MHz Xeon processors while PC Week used one with 450 MHz Xeon processors. Jeremy did not disclose what speed processor he was using.

    2. Mindcraft used a server with a MegaRAID controller with a beta driver (which was the latest version available at the time of the test) for our first test while the PC Week server used an eXtremeRAID controller with a fully released driver. The MegaRAID driver was single threaded while the eXtremeRAID driver was multi-threaded.

    3. Mindcraft used Windows 9x clients while Jeremy and PC Week used Windows NT clients. According to Jeremy, he gets faster performance with Windows NT clients than with Windows 9x clients.

    Given these differences in the testbeds and servers, is it any wonder we got lower performance than Jeremy and PC Week did?

    If you scale up our numbers to account for their speed advantage, we get essentially the same results.

  • The only reason to use Windows NT clients is to give Linux and Samba an advantage, if you believe Jeremy's claim. In the real world, there are many more Windows 9x clients connected to file servers than Windows NT clients. So benchmarks that use Windows NT clients are unrealistic and should be viewed as benchmark-special configurations.

  • The fact that Jeremy did not publish the details of the testbed he used and the tunes he applied to Linux and Samba is a violation of the NetBench license. If he had published the tunes he used, we would have tried them. What's the big secret?

  • Jeremy states in the article "The essense of scientific testing is *repeatability* of the experiment..." I concur with his assertion. But a scientific test would use the same test apparatus set up and would use the same initial conditions. Jeremy's unscientific test did not use the same testbed or even one with client computers of the same speed we used. We reported enough information in our report so that someone could do a scientific test to determine the accuracy of our findings. Jeremy did not.

    Given the warning in the NetBench documentation against comparing results from different testbeds, it is Jeremy and Linus that are being unscientific in their thrashing of Mindcraft's results. Mindcraft never compared its NetBench results to those produced on a different testbed.
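As a rough illustration of the scaling argument made earlier in this section, the clock-speed differences cited there can be turned into simple ratios. This is only a sketch: it assumes throughput scales linearly with clock speed, which is a simplification and not something the NetBench documentation states. The 266 MHz client figure is the belief stated above, not a documented value, and the function names and the sample throughput value are hypothetical.

```python
# Rough illustration only. Assumes throughput scales linearly with clock
# speed, which is a simplification and not a documented NetBench property.

CLIENT_MHZ_MINDCRAFT = 133.0  # client speed stated in the text
CLIENT_MHZ_PC_WEEK = 266.0    # the text's stated belief ("at least 266 MHz")
SERVER_MHZ_MINDCRAFT = 400.0  # server CPU speed stated in the text
SERVER_MHZ_PC_WEEK = 450.0    # server CPU speed stated in the text


def clock_ratio(faster_mhz, slower_mhz):
    """Return how many times faster one clock is than the other."""
    return faster_mhz / slower_mhz


def scale_throughput(measured, ratio):
    """Scale a measured throughput by a clock ratio (linear assumption)."""
    return measured * ratio


client_ratio = clock_ratio(CLIENT_MHZ_PC_WEEK, CLIENT_MHZ_MINDCRAFT)
server_ratio = clock_ratio(SERVER_MHZ_PC_WEEK, SERVER_MHZ_MINDCRAFT)

# Hypothetical placeholder value, not a number taken from any report.
measured = 100.0

print("client clock ratio:", client_ratio)
print("server clock ratio:", server_ratio)
print("scaled by client ratio:", scale_throughput(measured, client_ratio))
```

Under that linear assumption the client difference dominates; the server clock difference is comparatively small. Real results depend on many other factors, which is the point of the NetBench warning against comparing different testbeds.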

Some Background on Mindcraft

Mindcraft has been in business for over 14 years doing various kinds of testing. For example, from May 1, 1991 through September 30, 1998 Mindcraft was accredited as a POSIX Testing Laboratory by the National Voluntary Laboratory Accreditation Program (NVLAP), part of the National Institute of Standards and Technology (NIST). During that time, Mindcraft did more POSIX FIPS certifications than all other POSIX labs combined. All of those tests were paid for by the client seeking certification. NIST saw no conflict of interest in our being paid by the company seeking certification and NIST reviewed and validated each test result we submitted. We apply the same honesty to our performance testing that we do for our conformance testing. To do otherwise would be foolish and would put us out of business quickly.

Some may ask why we decided not to renew our NVLAP accreditation. The reason is simple: NIST stopped its POSIX FIPS certification program on December 31, 1997. That program was picked up by the IEEE and on November 7, 1997 the IEEE announced that they recognized Mindcraft as an Accredited POSIX Testing Laboratory. We still are IEEE accredited and are still certifying systems for POSIX FIPS conformance.

We've received many emails and there have been many postings in newsgroups accusing us of lying in our report about Linux and Windows NT Server because Microsoft paid for the tests. Nothing could be further from the truth. No Mindcraft client, including Microsoft, has ever asked us to deliver a report that lied or misrepresented the results of a test. On the contrary, all of our clients ask us to get the best performance for their product and for their competitors' products. They want to know where they really stand. If a client ever asked us to rig a test, to lie about test results, or to misrepresent test results, we would decline to do the work.

A few of the emails we've received asked us why the company that sponsored a comparative benchmark always came out on top. The answer is simple. When that was not the case our client exercised a clause in the contract that allowed them to refuse us the right to publish the results. We've had several such cases.

Mindcraft works much like a CPA hired by a company to audit its books. We give an independent, impartial assessment based on our testing. Like a CPA, we're paid by our client. NVLAP-approved test labs that measure everything from asbestos to the accuracy of scales are paid by their clients. It is a common practice for test labs to be paid by their clients.

What's Fair

Considering the defamatory misrepresentations and bias in the Linux Today article written by Mr. Whitinger and Mr. Johnson, we believe that Linux Today should take the following actions in fairness to Mindcraft and its readers:

  1. Remove the article from its Web site and put an apology in its place. If you do not do that, at least provide a link to this rebuttal at the top of the article so that your readers can get both sides of the story.

  2. Disclose who Mr. Whitinger and Mr. Johnson work for. Were they paid by someone with a vested interest in seeing Linux outperform Windows NT Server?

  3. Disclose who owns Linux Today and if it gets advertising revenue from companies that have a vested interest in seeing Linux outperform Windows NT Server.

  4. Provide fair coverage from an unbiased reporter of Mindcraft's Open Benchmark of Windows NT Server and Linux. For this benchmark, we have invited Linus Torvalds, Jeremy Allison, Red Hat, and all of the other Linux experts we were in contact with to tune Linux, Apache, and Samba and to witness all tests. We have also invited Microsoft to tune Windows NT and to witness the tests. Mindcraft will participate in this benchmark at its own expense.

References to NetBench Documentation

The NetBench document entitled Understanding and Using NetBench 5.01 states on page 24, "You can only compare results if you used the same testbed each time you ran that test suite [emphasis added]."

Understanding and Using NetBench 5.01 clearly gives another reason why the performance measurements Mindcraft reported are so different from the ones Jeremy and PC Week found. Look at what's stated on page 236, "Client-side caching occurs when the client is able to place some or all of the test workspace into its local RAM, which it then uses as a file cache. When the client caches these test files, the client can satisfy locally requests that normally require a network access. Because a client's RAM can handle a request many times faster than it takes that same request to traverse the LAN, the client's throughput scores show a definite rise over scores when no client-side caching occurs. In fact, the client's throughput numbers with client-side caching can increase to levels that are two to three times faster than is possible given the physical speed of the particular network [emphasis added]."
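The client-side caching effect described in that passage can be sketched with a simple blended-rate model. This is a minimal sketch under stated assumptions, not how NetBench computes its scores: it assumes a fixed fraction of the requested data is served from client RAM and the rest crosses the network, and the function names, the 100 Mbit/s network speed, and the 10,000 Mbit/s RAM speed are all hypothetical values chosen for illustration.

```python
def effective_throughput(hit_ratio, network_mbps, cache_mbps):
    """Blended throughput when `hit_ratio` of the data comes from local RAM.

    Simplified model (an assumption, not NetBench's method): the time to
    move one megabit is the weighted sum of the cache and network times.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    time_per_megabit = hit_ratio / cache_mbps + (1.0 - hit_ratio) / network_mbps
    return 1.0 / time_per_megabit


def speedup_over_network(hit_ratio, network_mbps, cache_mbps):
    """Return the blended throughput as a multiple of the raw network speed."""
    return effective_throughput(hit_ratio, network_mbps, cache_mbps) / network_mbps


# Hypothetical speeds, chosen only to illustrate the shape of the effect.
NETWORK_MBPS = 100.0
CACHE_MBPS = 10000.0

for hit_ratio in (0.0, 0.5, 2.0 / 3.0):
    factor = speedup_over_network(hit_ratio, NETWORK_MBPS, CACHE_MBPS)
    print(f"hit ratio {hit_ratio:.2f}: {factor:.2f}x the network speed")
```

In this model, when RAM is far faster than the network the speedup approaches 1 / (1 - hit_ratio), so a client that caches roughly half to two thirds of the workspace would report throughput about two to three times the physical network speed, which is the range the NetBench documentation mentions.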



