
DirectoryMark 1.2 Run Rules
The purpose of these run rules is to:
- Define an industry-standard, rigorous benchmark for LDAP directory servers;
- Show the performance of a directory server under realistic, application-specific conditions;
- Require detailed documentation of what was tested and how it was tested so that the results can be reproduced by others; and
- Define metrics that can be used to compare LDAP directory servers.
The major
difference between these run rules and those for DirectoryMark 1.1 is
the elimination of the Update Test Scenario. The other significant
change is how test results are reported.
1.0 Run Requirements
1.1 Test Environment
DirectoryMark 1.2 tests directory servers that conform to either the LDAP Version 2 or LDAP Version 3 protocol as defined in RFC 1777 (http://ds.internic.net/rfc/rfc1777.txt) and RFC 2251 (http://ds.internic.net/rfc/rfc2251.txt), respectively. In addition, both the System Under Test and the client test systems must satisfy the applicable networking RFCs (see Sections 1.2.1 and 1.3).
1.2 System Under Test
The System Under Test (SUT) includes the following components:
- The computer system or systems on which the LDAP directory server runs. The run rules place no restriction on the computer architectures used; the SUT can be a symmetric multiprocessor system, a cluster of computers, a load-balanced collection of independent computers, or another configuration.
- The operating system(s) running on the SUT.
- The LDAP directory server running on the SUT.
- The networking subsystem connecting the various components of the SUT with each other and with the load generator systems (the portion that resides in the load generator systems is excluded from the SUT). This includes all network interface cards (NICs), networking software (such as a third-party TCP/IP stack in the computer operating system), network hubs, network routers, network switches, and software in any programmable network devices (e.g., operating system software in a switch, VLAN software, device drivers that run on programmable NICs, etc.).
1.2.1 Run Requirements
In order for a test run to be considered valid under these run rules, the following requirements must be met:
- The SUT must conform to the applicable networking standards.
- The value of TIME_WAIT must be at least 60 seconds (see RFC 1122 and RFC 793). On systems that do not dynamically allocate TIME_WAIT table entries, the appropriate system parameter should be configured to at least 1.1 * TIME_WAIT * Maximum Operation Rate so that the system can maintain all of its connections in the TIME_WAIT state (see the worked example after this list).
- The LDAP directory server must satisfy the LDAP Version 2 or Version 3 standard as used by DirectoryMark 1.2.
- The LDAP directory server must behave correctly for each request made. This means that it must return all information requested for the correct entry that is the target of a search, that it must correctly modify the value of a specified attribute for a given entry, etc.
- The SUT must use non-volatile storage for all entries in the LDAP directory. Entries may be cached in the main memory of a computer system while the LDAP directory server is running, but main memory protected only by an uninterruptible power supply does not satisfy this requirement, whereas a RAM disk with a battery backup that lasts more than 48 hours would be acceptable.
- The SUT must present the behavior and appearance of a single logical server to LDAP clients. This means that any LDAP requests that change data in the directory must be visible to any and all other clients on any subsequent request.
- If the results are to be published, all of the components of the SUT must be generally available to commercial users within 90 days of the publication of the test results.
- Any deviations from the standard, default configuration of the SUT must be documented so that an independent party would be able to reproduce the results without further assistance.
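As a worked illustration of the TIME_WAIT sizing rule above, the sketch below plugs in hypothetical numbers; the 1,000 operations/second figure is invented for the example, and only the 1.1 * TIME_WAIT * Maximum Operation Rate formula comes from the run rules. The same calculation applies to the load generators in Section 1.3, with the Maximum Number of Open Connections in place of the Maximum Operation Rate.

    # Worked example of the TIME_WAIT table sizing rule (hypothetical numbers).
    TIME_WAIT_SECONDS = 60        # minimum value required by the run rules
    MAX_OPERATION_RATE = 1_000    # hypothetical peak operations/second on the SUT

    # Rule: entries >= 1.1 * TIME_WAIT * Maximum Operation Rate
    required_entries = 1.1 * TIME_WAIT_SECONDS * MAX_OPERATION_RATE
    print(f"Configure at least {required_entries:,.0f} TIME_WAIT table entries")
    # -> Configure at least 66,000 TIME_WAIT table entries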
1.3 Load Generator Systems
DirectoryMark 1.2 uses computer systems, called load generators, to generate a load on the SUT. The load generator systems must meet the following requirements for a test run to be considered valid and for the results to be published:
- The load generators must conform to the applicable networking standards.
- The value of TIME_WAIT must be at least 60 seconds (see RFC 1122 and RFC 793). On systems that do not dynamically allocate TIME_WAIT table entries, the appropriate system parameter should be configured to at least 1.1 * TIME_WAIT * Maximum Number of Open Connections so that the system can maintain all of its connections in the TIME_WAIT state.
- All of the components of the load generators must be generally available to commercial users within 90 days of the publication of the test results.
- Any deviations from the standard, default configuration of the load generators must be documented so that an independent party would be able to reproduce the results without further assistance.
1.4 Directory Size Classes
The number of entries in a directory can affect performance. Also, some directory servers are more effective at dealing with directories of a certain size. In choosing a directory size requirement for DirectoryMark testing, we want to make a decision that will provide the highest value to the most people. There are many alternative approaches we could take to specify what size directory to use for testing, including:
1. A "one size fits all" approach. If we picked a large size, this would be misleading for directory servers targeted at smaller sizes. If we picked a small size, directory servers targeted at large sizes might have too much overhead to compete with those targeted at small directories, again misleading you.
2. A single directory size that increases as a function of server performance. The real world does not work this way and the results could be misleading. You pick a directory server that will work well with the directory size you have or expect to have. So, for example, if the test directory size were computed to be 1,000,000 entries given the performance of the SUT and if you had a directory with 10,000,000 entries, you would not be able to draw valid conclusions from the performance metrics for the directory you wanted to use.
3. Let the tester pick the directory size. This approach has the benefit of letting the tester target the directory size their product supports best. However, because any directory size could be used, it would not be possible to compare products fairly if they use different directory sizes. This would lead to confusion in the market.
4. Let the tester pick the directory size from a fixed set of size classes and require testing for each size in a class. This approach lets vendors target their products to various sizes of directories while providing a uniform set of sizes that facilitates product comparisons. Since directories have a tendency to grow over time, you can see how a directory server will behave as your directory grows and can focus your evaluations on those directory servers targeted at the directory sizes you expect to have.
We have chosen to follow alternative 4 because it offers the highest value for users and vendors.
Therefore, we have defined the classes of directory sizes shown in Table 1. For each class there are three sizes of directories that must be tested; each size corresponds to a data point that will be used in each test scenario.
Table 1: Directory Size Classes (number of directory entries)

    Class   Data Point #1   Data Point #2   Data Point #3
    1       10,000          50,000          100,000
    2       100,000         250,000         500,000
    3       500,000         750,000         1,000,000
    4       1,000,000       2,500,000       5,000,000
    5       5,000,000       7,500,000       10,000,000
    6       10,000,000      25,000,000      50,000,000
All directory size classes reported for the same SUT and load
generator configuration will be included on a single report for the SUT.
1.5 Test Scenarios
The following scenarios are required to be tested in
the order shown for each data point in each directory size class
selected (for example, test each scenario for Data Point #1 before
going to Data Point #2):
- Loading: This scenario times how long it takes to load the directory using LDIF files; it is measured as the directory is loaded to each data point. The values required to be reported for each class selected are:
  - For Data Point #1, the time it takes to load the number of entries specified in Data Point #1 and the load rate in entries/minute;
  - For Data Point #2, the time it takes to load the additional entries needed to reach Data Point #2 from Data Point #1, the load rate in entries/minute for the incremental load, and the overall load rate in entries/minute to reach Data Point #2 from an empty directory; and
  - For Data Point #3, the time it takes to load the additional entries needed to reach Data Point #3 from Data Point #2, the load rate in entries/minute for the incremental load, and the overall load rate in entries/minute to reach Data Point #3 from an empty directory.
- Messaging: The purpose of this scenario is to simulate an e-mail/messaging server using a directory server. There will be only one bind at the beginning of the test. All queries will be an exact match on a UID. The values required to be reported for each class selected are:
  - The number of searches/second for each Data Point.
- Address Look-Up: The purpose of this scenario is to simulate people looking up names in an address book as well as expanding a group for e-mail. The parameters for this scenario are:

      Operation                      Frequency
      Bind                           After every 5 operations
      UID (1) Searches               28%
      CN (2) Wildcard Searches       24% (* at the end of the value searched for)
      Exact Match on givenName (3)   16%
      Exact Match on SN (4)          8%
      Exact Match on CN              16%
      Not Found                      8%

  Notes:
  1. UID means a unique user identifier. By default, we use "description" in the LDAP organizationalPerson schema.
  2. CN is the common name attribute in the LDAP organizationalPerson schema.
  3. givenName means a given name identifier. By default, we use "seeAlso" in the LDAP organizationalPerson schema.
  4. SN is the surname attribute in the LDAP organizationalPerson schema.
For the above mix of operations, the values required to be reported for each class selected are the number of searches/second for each Data Point. (An illustrative sketch of this operation mix appears at the end of this section.)
You must use the test scripts generated by scriptgen for each data point for your measurements to be valid. For example, you must use the Data Point #3 messaging script to measure the performance of the Messaging scenario for Data Point #3.
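As a rough sketch of what the Address Look-Up operation mix looks like in LDAP terms, the fragment below restates the percentages above as filter templates. It is illustrative only: the real queries come from the scripts that scriptgen generates, the filter strings assume the default attribute mappings from the notes above (UID stored in description, givenName stored in seeAlso), and {value} is a placeholder for data drawn from the generated test directory.

    # Illustrative restatement of the Address Look-Up mix as weighted LDAP
    # filter templates.  Not part of the benchmark; the scriptgen-generated
    # scripts define the actual operations.
    ADDRESS_LOOKUP_MIX = [
        # (share of operations, filter template)
        (0.28, "(description={value})"),  # exact match on UID (default mapping)
        (0.24, "(cn={value}*)"),          # CN wildcard search, * at the end
        (0.16, "(seeAlso={value})"),      # exact match on givenName (default mapping)
        (0.08, "(sn={value})"),           # exact match on SN
        (0.16, "(cn={value})"),           # exact match on CN
        (0.08, "(cn={value})"),           # "not found": value absent from the directory
    ]

    # The shares must account for every search operation in the mix.
    assert abs(sum(share for share, _ in ADDRESS_LOOKUP_MIX) - 1.0) < 1e-9

    # The scenario also re-binds after every 5 operations (not shown here).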
1.5.1 Warm-Up and Measurement Run Times
The purpose of a warm-up run is to fill the LDAP directory server cache on the SUT to simulate a system that has been running for some time. A valid measurement will consist of one warm-up run followed immediately by one measurement run for each data point in each test scenario, except for the Loading scenario, which shall have no warm-up run.
Table 2 below shows the warm-up and measurement times in minutes required for each data point.
Table 2: Warm-up and Measurement Times (minutes)

                        Data Point #1        Data Point #2        Data Point #3
    Scenario            Warm-up  Measure     Warm-up  Measure     Warm-up  Measure     Total
    Loading             N/A      Actual      N/A      Actual      N/A      Actual      Actual
    Messaging           8        10          8        10          8        10          54
    Address Look-Up     8        10          8        10          8        10          54
    Total minutes       16       20          16       20          16       20          108 + loading time

("Actual" means the actual loading time measured for that data point is reported.)
For measurements to be considered valid, the LDAP directory server may not be restarted nor may the SUT be rebooted between data points in a test scenario or between test scenarios for a given directory size class. You are allowed to restart or re-initialize the LDAP directory server and/or reboot the SUT before testing a different directory size class.
1.5.2 LDAP Directory Schema
All testing shall be done using the LDAP organizationalPerson schema. At least the following attributes must be indexed for fast searching:
- UID (1)
- CN (2)
- givenName (3)
- SN (4)
Notes:
1. UID means a unique user identifier. By default, we use "description" in the LDAP organizationalPerson schema.
2. CN is the common name attribute in the LDAP organizationalPerson schema.
3. givenName means a given name identifier. By default, we use "seeAlso" in the LDAP organizationalPerson schema.
4. SN is the surname attribute in the LDAP organizationalPerson schema.
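To make the schema requirements concrete, the sketch below builds one hypothetical organizationalPerson entry in LDIF form using the default attribute mappings from the notes above (UID stored in description, givenName stored in seeAlso). The directory suffix and the attribute values are invented; the real LDIF files are produced by the DirectoryMark data-generation tools.

    # Hypothetical sketch of a single test entry in LDIF form.  The suffix and
    # values are made up; only the attribute usage follows the notes above.
    def example_entry(uid: str, given: str, surname: str,
                      suffix: str = "o=example.com") -> str:
        cn = f"{given} {surname}"
        return (
            f"dn: cn={cn},{suffix}\n"
            "objectClass: top\n"
            "objectClass: person\n"
            "objectClass: organizationalPerson\n"
            f"cn: {cn}\n"
            f"sn: {surname}\n"
            f"description: {uid}\n"    # UID, per the default mapping
            f"seeAlso: {given}\n"      # givenName, per the default mapping
        )

    print(example_entry("u0000001", "Pat", "Garcia"))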
2.0 Documentation Requirements
2.1 Report
To report DirectoryMark 1.2 results, you must show
the operation rate performance reported by the DirectoryMark client(s).
If you use more than one client system to test an SUT, you must
aggregate results from all client systems and report the total
operation rates.
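As an example of the aggregation requirement, the snippet below simply sums hypothetical per-client rates; the numbers are invented, and the only point is that the reported figure is the total across all load generators for the same data point.

    # Hypothetical per-client search rates (searches/second) for one data point,
    # one value per DirectoryMark load generator.  The reported rate is the sum.
    per_client_rates = [412.6, 405.9, 398.3]   # invented numbers
    print(f"Reported rate: {sum(per_client_rates):.1f} searches/second")  # -> 1216.8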
You are required to specify enough information about
the SUT and the test environment to allow a knowledgeable user to
reproduce your test results.
2.2 Archiving Requirements
You must archive the following items and make them available for Mindcraft's review if you want your results published at our Web site:
- All DirectoryMark result tables generated for each data point.
- All test scripts used for each data point.
- All parameters used to run each component of DirectoryMark for each data point.
- The directory server logs, if directory server logging was turned on.
3.0 DirectoryMark Metrics
3.1 Performance Metric
The DirectoryMark performance metric is
DirectoryMark Operations Per Second or DOPS™. Performance
is always to be reported for a specified directory size class.
DOPS are computed as a weighted average of the operation rates of all of the test scenarios. The weights used are:

    Scenario          Weight
    Loading           4%
    Messaging         48%
    Address Look-Up   48%
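A minimal sketch of how these weights might combine is shown below. The run rules do not spell out the unit conversion between the loading rate (entries/minute) and the search rates (searches/second), so the sketch assumes the loading rate is first converted to entries/second; the input numbers are invented for illustration only.

    # Minimal sketch of the DOPS weighted average, with invented numbers.
    # Assumption: the loading rate is converted from entries/minute to
    # entries/second so that all three rates share per-second units.
    WEIGHTS = {"loading": 0.04, "messaging": 0.48, "address_lookup": 0.48}

    rates = {
        "loading": 30_000 / 60.0,   # hypothetical overall load rate, entries/second
        "messaging": 900.0,         # hypothetical searches/second
        "address_lookup": 700.0,    # hypothetical searches/second
    }

    dops = sum(WEIGHTS[name] * rates[name] for name in WEIGHTS)
    print(f"{dops:.1f} DOPS")   # -> 788.0 DOPS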
The following examples show the three acceptable
alternative ways to express the DirectoryMark performance metric in
press releases and other publications (numbers in the examples below
obviously will change to reflect the measurements and classes tested):
In addition to reporting DOPS on the standard DirectoryMark Results
Web page, you must also report the metrics specified in Section 1.5 for
each data point of the test scenarios.
3.2 Price/Performance Metric
You are required to report the price/performance of the SUT if you
publish any DirectoryMark performance results. It is computed using the
formula:
price/performance = (SUT price in dollars)/DOPS
It is expressed in terms of $/DOPS.
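For example, a hypothetical SUT priced at $120,000 that measured 600 DOPS would be reported as $120,000 / 600 DOPS = $200/DOPS; these figures are invented purely to illustrate the arithmetic.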
The components that will make up the SUT price include:
- All server hardware and software used for the test.
- All networking hardware and software used for the test up to,
but not including, the wires or fibers coming out of the client
systems.
The prices used may be street price (substantiated by a reseller,
customer invoices, or publicly available street price information from
the manufacturer) or list price. The type of pricing used must be
included in the report.
You need to fill in all relevant data in the standard pricing Web
page (pricingdm11.html). You may use the spreadsheet (pricingdm11.xls)
to help you compute the price/performance metric.
4.0 Publishing Results
4.1 Unreviewed Results
You can publish your price/performance and DOPS
results in any medium you find appropriate, as long as you followed
these run rules. When publishing results, you need to say which version
of DirectoryMark was used for the measurements. You may publish the
standard DirectoryMark Results Web page anywhere you want as long as
you also publish the associated standard DirectoryMark Pricing Web
page.
Mindcraft will publish unreviewed results at its Web
site. If you want your results included there, please contact us. Note that you will need to provide us with a copy of the licenses for all software tested and a release from the product vendor if its license precludes publishing benchmark results without prior approval.
4.2 Reviewed Results
If you want your results reviewed by Mindcraft and
published at Mindcraft's Web site at a location for reviewed results,
please contact us. Note that you will need to provide us with a copy of the licenses for all software tested and a release from the product vendor if its license precludes publishing benchmark results without prior approval.
4.3 Certified Results
Mindcraft will perform DirectoryMark testing for you
and will certify the results. We will publish the results at our Web
site at a location reserved for certified results. Contact us for more information
about this service.
4.4 Contacting Mindcraft
You can contact Mindcraft at directorymark@mindcraft.com
or (408) 395-2404.