Thursday, September 3, 2009

First version bench

Transactions per second, measured at the receiving end (50 people sending to 50 peers, 2500 messages expected per batch). A new batch of messages is only sent once all 2500 have been received, so a single message being lost anywhere in the system will freeze the test, which counts as a failure.
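
To make the failure mode concrete, the receiving end effectively runs a gate like the sketch below (class and method names are my own, not the actual harness code): the next batch goes out only when the counter reaches 2500, so one dropped message stalls the run forever.

import java.util.concurrent.atomic.AtomicInteger;

public class BatchGate {
    private static final int EXPECTED = 50 * 50; // 50 senders x 50 peers = 2500
    private final AtomicInteger received = new AtomicInteger();

    // Called once per message observed at the receiving end.
    public void onMessageReceived() {
        if (received.incrementAndGet() == EXPECTED) {
            received.set(0);
            sendNextBatch(); // never reached if any message is dropped
        }
    }

    private void sendNextBatch() { /* kick off the next 2500 messages */ }
}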

0-0=0
124-0=124
400-124=276
788-400=388
2125-788=1337
2431-2125=306
2890-2431=459
3348-2890=458
3909-3348=561
4164-3909=255
4521-4164=357
4776-4521=255
4980-4776=204
5235-4980=255
5591-5235=356
5846-5591=255
6254-5846=408
6713-6254=459
7019-6713=306
8039-7019=1020
8548-8039=509
9211-8548=663
9362-9211=151
9362-9362=0
10153-9362=791
11305-10153=1152
11823-11305=518
12742-11823=919
14412-12742=1670
16138-14412=1726
18115-16138=1977
19696-18115=1581
20162-19696=466
20526-20162=364
20886-20526=360
21167-20886=281
21167-21167=0
21167-21167=0
21167-21167=0
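
Each line above is just the difference between successive reads of a cumulative received-message counter. A hypothetical sampler along these lines (not the actual harness code) produces exactly that current-previous=delta output:

import java.util.concurrent.atomic.AtomicLong;

public class ThroughputSampler implements Runnable {
    private final AtomicLong totalReceived; // cumulative messages seen so far
    private final long intervalMs;          // sampling period

    public ThroughputSampler(AtomicLong totalReceived, long intervalMs) {
        this.totalReceived = totalReceived;
        this.intervalMs = intervalMs;
    }

    public void run() {
        long previous = 0;
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Thread.sleep(intervalMs);
                long current = totalReceived.get();
                // Prints the current-previous=delta lines shown above.
                System.out.println(current + "-" + previous + "=" + (current - previous));
                previous = current;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}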

As you can see, we approached the desired speed for a while, but then fell back down and froze; no further work was accomplished for the remainder of the test. The auditing profile looks like this:

Snapshot[period=10000ms]:
Threads=4 Tasks=10632/11837
AverageQueueSize=32.67 tasks

Snapshot[period=10000ms]:
Threads=4 Tasks=3281/4837
AverageQueueSize=137.71 tasks

Snapshot[period=10000ms]:
Threads=4 Tasks=1074/2369
AverageQueueSize=226.95 tasks

Snapshot[period=10000ms]:
Threads=4 Tasks=36/116
AverageQueueSize=0.87 tasks

Snapshot[period=10000ms]:
Threads=4 Tasks=35/118
AverageQueueSize=0.87 tasks

Snapshot[period=10000ms]:
Threads=4 Tasks=34/115
AverageQueueSize=0.88 tasks

What that profile suggests to me is that the baseline 35/35 maintenance tasks, which run every ten seconds on an idling Darkstar instance, are completing, while all the other tasks are finding it impossible to finish. They might always be taking more than 100 ms, or they might all be grabbing for the same resources. While this merits further examination, I'm excited to try the server revisions suggested on the PDS forum, so I'm implementing and benchmarking those now. Stand by...
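
To illustrate the second hypothesis (everyone grabbing for the same resource), here is a hypothetical task shape, written against the standard Darkstar Task/DataManager API, that would starve in exactly this way: every delivery task write-locks a single shared managed object, so concurrent transactions conflict, abort, and retry until the timeout kills them. The SharedStats class and the "stats" binding are invented for the example.

import java.io.Serializable;
import com.sun.sgs.app.AppContext;
import com.sun.sgs.app.ManagedObject;
import com.sun.sgs.app.Task;

// A single shared counter that every delivery task updates: a classic hot spot.
class SharedStats implements ManagedObject, Serializable {
    private static final long serialVersionUID = 1L;
    int delivered;
}

class DeliverTask implements Task, Serializable {
    private static final long serialVersionUID = 1L;

    public void run() {
        SharedStats stats =
            (SharedStats) AppContext.getDataManager().getBinding("stats");
        AppContext.getDataManager().markForUpdate(stats); // write lock -> contention
        stats.delivered++;
        // ... actual message forwarding would go here ...
    }
}

If the problem turns out to look like this, splitting the shared state per client (or dropping the write entirely) would let the tasks run in parallel again.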
