People often ask what the recommended percentile is when forecasting with Pacemkr. Is it 85%? Or 95%? While the debate about the right percentile is outside the scope of this article, let’s take the 85th percentile and see how much data is needed to reach it.
Let’s use a simple example where a ball is picked from a bag to help us understand the amount of data required to reach the 85th percentile. In the image below, the ball is a blue circle with a number one on it. We put it on a line as follows.
[Image: ball #1, a blue circle labeled 1, placed on a line]
We then pick a second ball from the bag. We have a 50% probability that ball #2 will be above and a 50% probability it will be below ball #1. By above or below, I mean it could be bigger or heavier. In other words, it’s an attribute of comparison with ball #1.
[Image: balls #1 and #2 on the line]
We draw a third ball from the bag. This ball has a 33% probability of being between balls #1 and #2.
[Image: three balls on the line]
We then draw a fourth ball. It has a 50% probability of being between #1 and #3.
[Image: four balls on the line]
We draw a fifth ball from the bag. Ball #5 has a 60% probability of being between #1 and #4.
[Image: five balls on the line]
We draw a sixth ball from the bag. Ball #6 has a 66.67% probability of being between #1 and #5.
We can summarize the previous images in the following table:
Number of balls on the line | Probability of landing between the edge cases |
---|---|
2 | 33% |
3 | 50% |
4 | 60% |
5 | 66.7% |
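The probabilities in this table can be checked with a quick simulation. The sketch below is only an illustration: the function name `simulate_between_probability` is hypothetical, and it assumes each ball's attribute is an independent draw from the same continuous distribution (a uniform one here), so ties never happen.

```python
import random


def simulate_between_probability(balls_on_line: int,
                                 trials: int = 100_000,
                                 seed: int = 42) -> float:
    """Estimate the chance that the next ball lands between the two edge cases."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Balls already on the line (assumption: uniform, independent draws).
        line = [rng.random() for _ in range(balls_on_line)]
        next_ball = rng.random()
        # Count the trial if the new ball falls strictly between min and max.
        if min(line) < next_ball < max(line):
            hits += 1
    return hits / trials


for n in range(2, 6):
    print(n, round(simulate_between_probability(n), 3))
```

The estimates should land close to 33%, 50%, 60%, and 66.7%, with small differences due to sampling noise.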
We can extract the following mathematical formula from this example:
$$P(n) = \frac{n - 1}{n + 1}$$

where $n$ is the number of balls already on the line and $P(n)$ is the probability that the next ball lands between the two edge cases.
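One way to see where this pattern comes from, under the assumption (added here, not stated in the example) that the balls are drawn independently from the same bag and that no two balls tie: with $n$ balls on the line, the next ball is equally likely to take any of the $n + 1$ possible ranks. It falls outside the edge cases only if it is the smallest or the largest of all $n + 1$ balls, so

$$P(n) = 1 - \frac{2}{n + 1} = \frac{n - 1}{n + 1}$$

For example, $P(2) = 1/3 \approx 33\%$ and $P(5) = 4/6 \approx 66.7\%$, which matches the table.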
Finally, we can apply this formula to a growing number of balls until the probability reaches our 85% target.
Number of balls on the line | Probability of landing between the edge cases |
---|---|
2 | 33% |
3 | 50% |
4 | 60% |
5 | 66.7% |
6 | 71% |
7 | 75% |
8 | 78% |
9 | 80% |
10 | 82% |
11 | 83% |
12 | 85% |
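The rows above can be reproduced with a few lines of code. This is a minimal sketch, assuming the probability for n balls on the line is (n - 1) / (n + 1), which is the pattern the first four rows follow; the helper name `probability_between` is hypothetical.

```python
def probability_between(balls_on_line: int) -> float:
    """Probability that the next ball lands between the two edge cases."""
    if balls_on_line < 2:
        raise ValueError("need at least two balls to define both edge cases")
    # Assumed pattern from the example: (n - 1) / (n + 1).
    return (balls_on_line - 1) / (balls_on_line + 1)


# Print one row per number of balls, from 2 to 12, with one decimal place.
for n in range(2, 13):
    print(f"{n:>2} | {probability_between(n):.1%}")
```

The printed values keep one decimal, so they can differ slightly from the rounded figures in the table. For 12 balls the unrounded value is 84.6%.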
Once 12 work items are completed, the next work item has roughly an 85% probability (11/13 ≈ 84.6%) of landing between our edge cases, which is the equivalent of our 85th percentile line.
If you wish to get to a 95% confidence level, you need 39 completed work items (38/40 = 95%).
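Instead of extending the table row by row, the number of completed work items for any target can be computed directly. The sketch below assumes the probability for n items is (n - 1) / (n + 1), as in the example, and solves it for n. The function name `items_needed` is hypothetical, and `Fraction` is used only to avoid floating-point rounding at exact thresholds.

```python
import math
from fractions import Fraction


def items_needed(target: Fraction) -> int:
    """Smallest n for which (n - 1) / (n + 1) is at least `target`."""
    if not 0 <= target < 1:
        raise ValueError("target must be in the range [0, 1)")
    # Solving (n - 1) / (n + 1) >= p for n gives n >= (1 + p) / (1 - p).
    # At least two items are needed to have two edge cases at all.
    return max(2, math.ceil((1 + target) / (1 - target)))


print(items_needed(Fraction("0.95")))  # 95% target
print(items_needed(Fraction("0.85")))  # strict 85% target
```

With a strict threshold, the 95% target gives 39 items. A strict 85% target gives 13 items, because 12 items yield 11/13 ≈ 84.6%, which reaches 85% only after rounding.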
In conclusion, one criterion for choosing the right percentile for your forecast can be the number of data points required to get there. If your team completes work items frequently and already has 39 completed work items after a few days, the 95th percentile could be a better candidate for your forecasts.
On the other hand, if you are in a slower context where only a few work items are completed every week, the 85th percentile may be a better option, since it can take months to reach 39 completed work items.