In part 3 out of 4 of this blog, much like parts 1 and 2 I will hit on anti-patterns that allow your performance testing of clustered and or distributed software to lie to you. I'll be following up part 3 of this blog, the last 4 anti-patterns, with a blog about a simple distributed testing framework I have begun. Hopefully enough of you will be interested, try it and maybe even contribute to it.
Anti-pattern 7: In-memory vs. Distributed Performance Comparison
Writing a test that compares the speed of adding objects to a local, in-memory data structure vs. adding objects to a clustered data structure.
To avoid suspense, I'll tell you the results of that test without running it. Adding things to a local, in-memory data structure stakes virtually no time at all. In-memory object changes happen so fast, they are hard to even measure. However, when you are making changes to a distributed data structure, no matter what, those state changes have to be shipped off to another location. This takes instructions to be executed to make this happen on top of the ones used for the original task. This isn't just slower, it is way slower. The comparison between in-memory object changes and distributed object changes is useless.
Figure out how much data you are going to be clustering and what the usage patterns of that data becoming clustered will be. Then simulate and time that. Once again, focus on total throughput with acceptable latency.
Anti-pattern 8: Ignore Real-world Cross-node Patterns
Reading and writing the same data in every node.
Generally speaking, whether reading or writing, it is more expensive to access the same data concurrently across all nodes. Depending on the underlying clustering infrastructure, this can be more or less of a problem. If you are using an “everything everywhere” strategy, the performance hit of random access across all the data on all the nodes is less, but the “everything everywhere” sharing strategy generally does not scale well. Most other strategies perform better when data access is consistently read and or written from the same node
Write your performance tests in a way that allows you to set a percentage for locality of reference. Is an object accessed on the same node 80%, 90%, or 99% of the time? You should usually have some cross-node chatter, but usually not too much—although you should be as realistic to the problem you are trying to solve as possible.
Anti-pattern 9: Ignore Usage Patterns
The performance test either just creates objects or just reads objects
In the real world, an application does a certain amount of reading, writing, and updating of shared objects. And those reads, writes, and updates are of certain sizes.
If your app likely changes only a few fields in a large object graph, then that is what your performance test should do. If your app is 90% read from multiple threads and 10% write from multiple threads than that is what your test should do. Make your test be true to what you need when it comes to data and usage.
Anti-pattern 10: Log Yourself to Death
Last, but far from least, doing extra stuff like writing data out to a log chews up CPU. Logging too much in any performance test can render the test results meaningless.
This anti-pattern generally covers any extra CPU usage on a load-generating client that affects the performance test. In general, if one or more of your nodes is CPU bound in a cluster performance test, you likely have not maxed-out the performance of your cluster. Let me say that again, if you are resource constrained on any node, including your load generating nodes (but not including your server if one exists) then you are probably not maxing out what your cluster as a whole can handle. Investigate further.
If the individual load-generating nodes—or even the clustered nodes—are resource constrained, it is likely to create a false bottleneck in your test. You are trying to figure out the throughput of the cluster and your cluster nodes are likely busy doing other things like logging.
First, always have machine monitoring on all nodes in a performance test. Any time one of the nodes or load generators becomes resource constrained make sure you test with an additional node and see if it adds to the scale. If a node is unexpectedly resource constrained, then take a series of thread dumps (java only) and figure out where all the time is going.
Alright, that is the end of my anti-pattern list for now. I could probably come up with a few more but I'll save them for another day. The moral of this section of the blog is to be curious and skeptical with your testing results. Don't just ask what the numbers are. Find out why and you will end up a much happier person.