Fourth TMF Cache-Off: Organizational Meeting (Minutes)
The Measurement Factory
These are the notes from the cache-off organizational meeting. The meeting CFP is available elsewhere.
2. Big Picture
4. The future of PolyMix
5. Product Licensing
6. Workloads for Cache-off #4
6.1 PolyMix-4 features
6.2 WebAxe-4 specifics
8. To Do
This page talks about various issues discussed at the meeting. Further cache-off preparations will be based, in part, on the decisions reached at the meeting. However, some issues remain unresolved and old preferences might change. Further public discussions will be held on the mailing list. We welcome any constructive input.
Meeting slides (Postscript) and an already outdated network setup figure are available.
We have invited all companies with caching products we knew about. Besides The Measurement Factory, the following companies sent their representatives.
Dell, HP, Lucent, Microbits, Stratacache, and Swell Technology are probably interested in competing but did not send representatives, citing scheduling and resource problems as well as satisfaction with the current status quo.
We talked about the value of cache-off tests in general. Should we continue with cache-off testing, or should the PolyMix series be moved to a SPEC-like model where vendors can submit new results?
TMF argues that the only major differences between a cache-off and the SPEC model are scheduling and location.
Some like the way SPEC presents results normalized against standardized ``baseline'' systems. This makes it easier to compare results over time and from different workloads. TMF argues that different workloads should not be comparable, and that SPEC's approach of concentrating on a single number is wrong.
TMF suggested that it might be possible to include non-participating products ``anonymously'' in a cache-off report. This allows participants to say ``I am better than all other products'' but not ``I am better than product X.'' Somebody would have to financially support anonymous benchmarks.
One of the problems with anonymous results is that it will be hard to get current- and/or next-generation products from non-participants. Another problem is that nobody is at the event ``fighting'' for the anonymous product.
Somebody recalls that, in the past, academics tried to publish database benchmark results anonymously. They failed because each product was easily identifiable from its unique result characteristics. There may have been lawsuits by the database companies.
One participant volunteers to ask their legal department for advice regarding anonymous benchmark results.
Overall, the attendees seem skeptical about the feasibility of anonymous results.
Company X suggests that a cache-off report could include empty rows for non-participants to highlight the fact that they aren't benchmarked.
The proposed cache-off date (Oct 8-12) conflicts with Jewish holidays. Participants ask to move the date later, rather than earlier.
Scheduling is somewhat complicated by the need to figure out what would be tested at the cache-off. Can we add enough new features to the PolyMix workload to keep participation high? Do we want to test a surrogate (accelerator) workload as well?
There is some concern that releasing cache-off results in the late November/December timeframe is bad because everyone will be thinking about the holidays (Thanksgiving, Christmas) and won't care about proxy performance.
The new proposed date is October 29 - November 2.
It is clear that vendors are less and less interested in producing new PolyMix performance numbers. TMF published no PolyMix results after the third cache-off report. A number of vendors are ``satisfied'' with their previous results and do not feel compelled to spend time on PolyMix-4.
Company X suggests (with a nice diagram) that interest in plain HTTP performance is waning. Ideally we should be testing streaming media, SSL, and other features.
Most attendees agree that a WebAxe workload, especially with SSL termination, is very interesting, and probably enough to get them to attend a cache-off. At least two companies are ready to sponsor SSL support in Polygraph.
Some suggest testing other characteristics of firewall features such as TCP proxying or denial of service protection.
One of the products tested at the third cache-off is available only with a 10-user limited license. This product demonstrated performance at the cache-off that was significantly higher than 10 typical users would normally generate. This OEM negotiated a reduced royalty payment with the software provider because of the limited user license. TMF argues that this gives the vendor an unfair advantage in performance/price calculations. This topic spawned a lively and partially unresolved debate regarding restricted licenses and performance/price calculations. The consensus is that licensing differences should be highlighted in the report.
Some argue that prices reported in the cache-off reports are essentially meaningless. It is difficult for TMF and others to verify pricing. Companies that produce only software and rely on the customer to purchase their own hardware feel unreasonably burdened with the requirement to include hardware costs. They argue that customers may already have such hardware available for use with their software.
Some feel that if we require list price, then we need to take it a step further and require total cost of ownership (TCO) data as TPC does. Others say that proxies today are not what databases were ten years ago.
Some suggest that software and appliance products should be reported separately. The consensus is that whether a product is software or an appliance should be clearly identified in the report.
Some take issue with the ``$1000 can buy X req/sec'' reporting of previous cache-offs. A product's performance is not directly proportional to its cost. Some costs contribute to support, documentation, etc. and do not affect raw performance.
There was almost a consensus that it is okay to report price and performance separately, but TMF should not include a performance/price column in the result table. Customers would be welcome to perform the division on their own if they desire. At least one participant is very reluctant to agree that eliminating the performance/price column is a good idea.
Most attendees feel that WebAxe is a ``must'' for the fourth cache-off. It is better to have a single event that tests both workloads, rather than two events.
We talked about adding SSL termination to the WebAxe workload for cache-off #4. Rough consensus was that adding SSL requires more time than we have with the current schedule. Thus, it is better to have a plain WebAxe test earlier rather than a WebAxe+SSL test later.
We seemed to agree that PolyMix-4 and WebAxe-4 workloads should be available at the cache-off. A vendor will pick one, the other, or both.
TMF expects that there will be different participation fees for each test. For example, PolyMix costs X, WebAxe costs Y, and both cost X+Y.
Several new PolyMix features were discussed. DNS support and better server popularity modeling seem like the most important features to add, but DNS support may add a lot of complexity to the bench setup.
TMF proposed server-side think time delays in the range of 100-250 msec. Attendees felt these values are too high, but did not offer specific values.
Someone asked about IMS and reload requests. Real installations probably see a wide variance in the percentage of such requests.
Someone suggested that large objects are getting larger in real traffic. Polygraph's current size distributions should perhaps be adjusted to generate occasional huge replies.
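As a sketch of what ``occasional huge replies'' might look like, the following samples reply sizes from a log-normal body with a small heavy (Pareto) tail mixed in. The mixture weight, log-normal parameters, and tail index here are illustrative assumptions, not Polygraph's actual size distribution.

```python
import random

def reply_size(rng: random.Random) -> int:
    """Sample a reply size in bytes: mostly small log-normal objects,
    with an occasional Pareto-tailed huge reply mixed in.
    All parameters are illustrative, not Polygraph's settings."""
    if rng.random() < 0.01:                   # assume ~1% huge replies
        # heavy Pareto tail starting at 1 MB
        return int(1_000_000 * rng.paretovariate(1.2))
    # log-normal around ~10 KB for typical objects (exp(9.2) ~ 9.9 KB)
    return int(rng.lognormvariate(9.2, 1.0))

rng = random.Random(42)
sizes = [reply_size(rng) for _ in range(100_000)]
print(f"median {sorted(sizes)[50_000]} bytes, max {max(sizes)} bytes")
```

The median stays near 10 KB, while the tail produces multi-megabyte outliers, which is the kind of mix the suggestion seems to call for.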
TMF proposed a 95% recurrence ratio for the WebAxe workload. Given 20% uncachable responses and presence of IMS requests, this results in about 70% DHR.
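The arithmetic behind that estimate can be sketched as follows. The IMS share below is an assumed value chosen to land near the quoted ~70% figure; it is not a number stated at the meeting.

```python
# Back-of-the-envelope DHR estimate for the proposed WebAxe-4 mix.
# recurrence and cachable come from the minutes; ims_share is an assumption.

recurrence = 0.95   # fraction of requests that revisit a known URL
cachable   = 0.80   # 20% of responses are uncachable
ims_share  = 0.08   # assumed fraction of requests that are IMS revalidations

# Only recurring requests for cachable objects can be document hits.
ideal_dhr = recurrence * cachable        # 0.76 upper bound

# Assume IMS revalidations do not count as document hits,
# shaving a few points off the ideal ratio.
est_dhr = ideal_dhr - ims_share          # about 0.68

print(f"ideal DHR {ideal_dhr:.2f}, estimated DHR {est_dhr:.2f}")
```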
One attendee suggests that we need more scientifically derived numbers.
TMF needs to do a better job of testing rental equipment. In the past, a faulty PC cost one participant valuable testing time.
TMF should reduce the number of content-dependent assertions in Polygraph source code. It is probably possible to continue processing when certain conditions occur, rather than asserting and aborting the test.
TMF proposed that vendors can use L2-only aggregation devices in their zone with no penalty in the price used for performance/price reporting. This was rejected.
At least one vendor wants the cache-off report to equate throughput with number of simultaneous users.
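Little's law shows why throughput and simultaneous users are related but not interchangeable: the user count also depends on how long each request/think cycle takes. The throughput and cycle-time values below are made-up illustrations, not cache-off results.

```python
# Little's law: N (concurrent users) = X (throughput) * R (time per cycle).

def concurrent_users(throughput_rps: float, mean_cycle_sec: float) -> float:
    """Average population = arrival rate * mean time in system."""
    return throughput_rps * mean_cycle_sec

# The same 1000 req/sec maps to very different user counts
# depending on the assumed think + response time per cycle:
print(concurrent_users(1000, 0.5))   # 500.0 users at 0.5 s per cycle
print(concurrent_users(1000, 5.0))   # 5000.0 users at 5 s per cycle
```

So a report can only equate the two numbers by fixing an assumed think time, which is itself a workload parameter.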
Most attendees consider some form of participation fee discount fair. The fee structure is best determined by TMF.