Before you set up a Tableau Server, you have a lot of questions:
- Do I have enough servers for my users?
- How will my user mix (viewer/interactor) impact my server capacity planning?
- How many VizQLs do I need for my user community?
- Can I configure 8 VizQL instances on a single server?
- …… to name a few.
The idea behind this post is to guide you through running your own tests so you can answer some of these questions yourself.
So, let’s get started and understand a few things about scalability tests.
The idea is to understand how many users a server can support under a particular type of load.
Now, since I currently work for Tableau and have had exposure to some of their tools, I used one of them (TabJolt) to conduct the tests.
The rest of the article follows one approach. You can most certainly follow the same approach with a completely different tool.
Getting Started with TabJolt
I don’t plan to re-invent the wheel, so if you haven’t started with TabJolt yet, please refer to this great blog on TabJolt.
How do I start?
The most important thing that you have to remember when conducting tests is to understand the baseline. If you don’t have a baseline or a benchmark, then what is it that you are comparing to?
Even though this sounds like common sense, this is one of the areas that a lot of folks mess up. They keep changing scenarios and then keep comparing their results (effectively, comparing apples and oranges)
Benchmarking / Baselining
I have found that a great way to benchmark or baseline performance is to use a report/dashboard/visualization that ships out of the box.
An advantage of such a report is that, because it is provided by Tableau, you can (sort of) compare your results against a future version, or even against the performance seen in other environments, at other customers, etc.
But in the end, we have to remember that “your mileage will vary”.
So, for this test, I will use a report that Tableau provides out of the box, “Superstore”.
So, how do we benchmark results? Here is an example.
go --t=testplans\InteractVizLoadTest.jmx --d=120 --c=1 --r="Aggressive Interact | 1 Users | 120 secs" --e="8 Core 32 GB RAM - Vanilla TS"
Yes, this command is a bit more than what you have seen in the official documentation, which looks something like this:
go --t=testplans\InteractVizLoadTest.jmx --d=60 --c=1
And yes, both commands work just fine.
But try to use the other switches that the command provides, so you can use those values as dimensions when looking at the data in Desktop.
I like to follow a simple convention for the description:
“type of interaction | # of Users | Duration”
Once you have conducted the first test, repeat it a couple of times to ensure that your baseline results are consistent.
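One way to make "consistent" concrete is to compare the spread of the repeated runs against their average. The sketch below is only an illustration: the response times are made-up numbers, and the 10% tolerance (`max_cv`) is an assumption of mine, not a TabJolt setting.

```python
from statistics import mean, stdev


def is_consistent(samples_ms, max_cv=0.10):
    """Return True when repeated baseline runs agree within max_cv.

    max_cv is a hypothetical tolerance on the coefficient of variation
    (standard deviation divided by mean); pick a value that suits you.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two runs to compare")
    cv = stdev(samples_ms) / mean(samples_ms)
    return cv <= max_cv


# Hypothetical average response times (msec) from three identical 1-user runs.
baseline_runs = [905, 921, 890]
print("baseline consistent:", is_consistent(baseline_runs))
```

If the check fails, it is probably worth finding out why the runs differ before ramping up the user count, since every later comparison leans on this baseline.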
Now, the fun part
Let’s assume we set a reasonable limit on what we expect this server to handle. For my current server (8-core, 32 GB RAM VM), I assume I will hit the server’s limit somewhere between 50 and 100 users constantly pounding it.
So, how do we test that?
Use this approach
- Set up a simple Excel sheet with your test cases
- Create a formula that builds the “go” command
- Copy the data from Excel
- Paste into a command window
- Let it rip
The details are here
As you can see above, I run a few tests for 1 user, 5 users, and then ramp by 10 users
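If you would rather not build the commands in a spreadsheet, the same idea can be sketched in a few lines of Python. This is a rough equivalent of the formula approach, not the sheet I used. It assumes the switches take double hyphens (`--t`, `--d`, `--c`, `--r`, `--e`) and that the ramp tops out at 100 users; the function name `build_go_command` is made up for this example.

```python
def build_go_command(users, duration_secs, test_plan, label, environment):
    """Build one TabJolt "go" command line for a single test case."""
    # Description convention: "type of interaction | # of Users | Duration"
    run_desc = f"{label} | {users} Users | {duration_secs} secs"
    return (
        f"go --t={test_plan} --d={duration_secs} --c={users} "
        f'--r="{run_desc}" --e="{environment}"'
    )


# 1 user, 5 users, then ramp by 10 users (assumed ceiling: 100 users).
user_counts = [1, 5] + list(range(10, 101, 10))

commands = [
    build_go_command(
        users=n,
        duration_secs=120,
        test_plan=r"testplans\InteractVizLoadTest.jmx",
        label="Aggressive Interact",
        environment="8 Core 32 GB RAM - Vanilla TS",
    )
    for n in user_counts
]

for command in commands:
    print(command)

# Rough run time if the tests execute back to back.
total_minutes = len(commands) * 120 / 60
print(f"{len(commands)} tests, about {total_minutes:.0f} minutes")
```

The printed lines can be pasted into a command window exactly like the rows copied from Excel.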
Now, let’s look at the results
As you can see, somewhere in the middle of the tests with 60 users, things start going bad.
The error rate starts creeping up from 0% and reaches 100% a few minutes later.
In addition, the average response time jumps up really high.
This is good (well not so good, depends on how you look at it)
Good in the sense that we saw our tipping point. Not good for the server.
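If you export the results, spotting the tipping point can also be done programmatically. The sketch below assumes you have a list of (users, error rate) pairs; the numbers are invented for illustration and are not my actual results, and `find_tipping_point` is a made-up helper name.

```python
def find_tipping_point(results, max_error_rate=0.0):
    """Return the lowest user count whose error rate exceeds max_error_rate.

    results is a list of (users, error_rate) tuples, with error_rate
    expressed as a fraction between 0.0 and 1.0. Returns None when no
    test crossed the threshold.
    """
    for users, error_rate in sorted(results):
        if error_rate > max_error_rate:
            return users
    return None


# Hypothetical results: error rate stays at 0% until the 60-user test.
results = [
    (1, 0.0), (5, 0.0), (10, 0.0), (20, 0.0), (30, 0.0),
    (40, 0.0), (50, 0.0), (60, 0.35), (70, 1.0),
]
print("tipping point:", find_tipping_point(results), "users")
```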
So, let’s dissect this even more
What if we were not so aggressive on the server?
So, if you read a bit of the TabJolt documentation and the discussions on the community, you will find that when you are running the “Interactor” tests, you are basically pounding the server. The TabJolt harness is just relentlessly sending samples (what is a sample?) to the Tableau server.
Well, that’s not what happens in real life, is it?
So, let’s do some realistic tests
Ramping Up with 60/40 Mix and a bit of think time
So, for our next set of tests, let’s change a few things
First, let’s run the exact same number of tests with the same “# of users”.
So, we will do 1 user, 5 users, 10 users and from then on, increments of 10 users.
And, this is very important: when doing any sort of performance-related tests, it’s important to be very patient. That means changing only one variable at a time.
For our second tests, we will change the test plan
If you see above, we are only changing the test plan.
Again, copy the commands for all 12 rows, let it rip, and take a 24-minute break (good time to catch up on one of those short Hulu episodes, maybe?)
So, what does the second set of test results look like?
Interestingly, we see the error rate rise above 0% at about the same number of users (when we moved from 50 to 60 users).
But, what’s really interesting to note here is what happened to the “AVG success response time” metric.
In our first test, that metric climbed from 905 msec to 8647 msec by the time we got to 50 users
However, for this second test, it stays between 403 msec and 762 msec. In other words, pretty consistent behaviour.
Adding more users isn’t really adding more load on the server. And this makes sense.
It is because we have users (virtual) taking a few seconds break before they fire another request. We aren’t robots, right?
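To get a feel for how much a pause changes the load, here is a back-of-the-envelope sketch. It uses the standard closed-system relation (requests per second = users / (response time + think time)). The 5-second think time is an assumption for illustration only, since I haven't stated the actual value here, and the response times are borrowed from the two test sets purely as ballpark inputs (the test plans differ, so this is not an apples-to-apples comparison).

```python
def requests_per_second(users, response_secs, think_secs):
    """Request rate generated by users who each wait for a response,
    pause for think_secs, and then fire the next request."""
    return users / (response_secs + think_secs)


# One relentless virtual user: no pause between requests.
aggressive = requests_per_second(users=1, response_secs=0.905, think_secs=0.0)

# One virtual user who pauses between requests (assumed 5 s think time).
paced = requests_per_second(users=1, response_secs=0.403, think_secs=5.0)

print(f"aggressive: {aggressive:.2f} requests/sec per user")
print(f"paced:      {paced:.2f} requests/sec per user")
```

Under these assumptions, each paced user sends only a fraction of the requests a relentless one does, which is why the response time can stay flat as users are added.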
OKAY, THIS is encouraging. What’s next?
Ramping Up with 50/50 Mix and a bit of think time
So, for our next set of tests, we will keep other variables the same but just change the mix from 60/40 to 50/50
Here are the tests
And here are the results
This is great. We can get to between 70 and 80 users with a good balance of interacting with and viewing dashboards while keeping the server healthy.
What did we learn?
The idea behind this article is to provide you a methodology around how to conduct tests and how to interpret results from TabJolt.
Hey, wait a minute, I saw some funky things
Yes, if you are like me and looking at the various numbers, you will have noticed some interesting things. e.g.,
a) why is the 95th percentile success time (msec) so high for the single-user run in the second set of tests?
b) why did the tests with 80 and 100 users have a low error rate and low response time in the second set of tests?
c) why do my visualizations look different than the ones TabJolt code provides?
d) was the server really healthy when these tests were executed?
I agree, there are still some things to explain.
And quite frankly, there are still so many things to explain in the other sheets in the TabJolt workbook that I just can’t cover them in one blog.
Hopefully, this will get you started, and I will publish some follow-up articles to provide you more info.
Feel free to leave a comment, including if you need any of the files used here for conducting the tests.
EDIT (16 Feb 2016)
Here is a link to some TabJolt terms – de-mystified