Validation Case Study – Resource Allocation
This table summarizes the differences in forecasting accuracy between PBE’s models and the naïve models across the 118 cells. It shows the weighted-average absolute percent error for each model, with each cell’s error weighted by its script (Rx) volume.
PBE reduced the naïve model’s error by 75%. We have undertaken a total of 14 such validation exercises and have in each case reduced the naïve model’s error by 75% to 85%. In other words, this Case Study is far from our best example. As of this writing, we are aware of no one else who has been able to reduce the naïve model’s error at all, much less challenge PBE’s standards.
Forecast vs. Actual Filled Rx’s Weighted-Average of 118 Cells
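The weighted-average figure above can be sketched in a few lines of Python. This is a minimal illustration assuming each cell’s absolute percent error is weighted by its actual script volume (the exact weighting scheme is not spelled out here); the numbers are hypothetical, not data from the study:

```python
def weighted_avg_error(errors, weights):
    """Weighted average of per-cell absolute percent errors.

    errors  -- absolute percent error for each cell
    weights -- weight for each cell (here, its actual script volume)
    """
    return sum(e * w for e, w in zip(errors, weights)) / sum(weights)

# Hypothetical three-cell example (not data from the study)
errors = [2.0, 5.0, 10.0]    # absolute percent errors per cell
volumes = [1000, 500, 100]   # actual filled Rx's per cell
print(round(weighted_avg_error(errors, volumes), 2))  # -> 3.44
```

Weighting by volume keeps a handful of tiny, noisy cells from dominating the summary statistic.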
A Blinded Forecasting Test To Objectively Determine if it is
Possible to Quantify the Relationship Between Amounts of
Details/Samples and Rx’s at the Individual Doctor Level
Call Value Targeting
Pharmaceutical companies are constantly focused on improving the productivity of their detailing and sampling. This is not surprising, since an organization’s sales force is normally its second-largest expense item after R&D. Additionally, details and samples are such potent generators of Rx’s that the opportunity cost of failing to fully exploit their potential is enormous.
Companies are increasingly relying on mathematical models that their creators claim quantify the relationship between details and samples on the one hand and filled Rx’s on the other, for each individual doctor. In theory, such models could show how 1,000 representatives might detail and sample with the effectiveness of 1,200 or more — without making more calls.
However, there is a question every user of models should want answered: “Do these models really do what they claim to do?” How does a company know if its models actually quantify the cause and effect of personal selling? Specifically, do the models quantify cause and effect well enough so that, if representatives follow the resulting plans, they will be more productive than they would have been on their own?
In order to determine if anyone could actually quantify cause and effect, an interested party sponsored a shoot-out among organizations that claim to be able to forecast at the doctor level. The test was designed to determine if anyone could back up their claims. Five organizations agreed to participate, including PBE, and one later dropped out. Other invitees declined from the outset.
PBE won the shoot-out “hands down”. According to the sponsor, none of the other participants was able to account for the impact of details and samples at all! The Validation Methodology Section of this Case Study explains how the sponsor reached these conclusions. In short, they created a placebo.
The sponsor of the shoot-out has released very little detailed information. However, the Case Study that follows presents highly detailed information from a blinded validation test conducted by a leading pharmaceutical company. This exercise involved only PBE, but the protocol was the same as in the shoot-out. This protocol (which was reviewed by a leading academic) enables any company to quickly determine whether, and how well, a model-building methodology finds the relationship between numbers of details and samples on the one hand and total filled Rx’s on the other, at the individual doctor level, for an established brand.
A pharmaceutical company provided PBE with monthly, doctor-level detailing and sampling data for 30 consecutive months for one of its brands. The company also provided the corresponding doctor-level prescribing data for only the first 24 of those months. PBE’s mission was to tell the company how many Rx’s each doctor had filled during the remaining six months, which had been withheld from PBE. This way, the company was able to immediately validate PBE’s forecasts for each doctor by comparing them to the actual audited Rx’s for these same doctors. By taking advantage of existing data, the company avoided the time and expense of a test market – which would also have forced representatives to adhere strictly to specific call plans.
For purposes of analysis, the doctors were grouped into 120 cells. This was done by segmenting the doctors by 12 levels of market share and 10 levels of prescribing volume. Forecasts were made for each doctor within each cell and these forecasts were then summed to produce a single forecast for each cell. The absolute percent difference between forecasted Rx’s and actual Rx’s was calculated for each cell.
The doctors were grouped into 120 cells for two reasons. First, although 120 numbers are a lot to look at, this is far better than looking at forecasts for tens of thousands of doctors. Second, there is a tremendous amount of random variation in the number of scripts individual doctors write for a brand from one period to the next. This variation is primarily due to the luck of the draw in terms of the number of patients showing up in the office that the doctor considers to be appropriate for a specific drug. Grouping the doctors eliminates most of the effect of this random variation.
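The grouping-and-scoring step described above can be sketched in plain Python. The field names, group labels, and doctor records below are illustrative assumptions, not data from the study:

```python
from collections import defaultdict

def cell_errors(doctors):
    """Group doctor-level forecasts into (share_group, volume_group) cells,
    sum forecasts and actuals within each cell, and return each cell's
    absolute percent error."""
    cells = defaultdict(lambda: {"forecast": 0.0, "actual": 0.0})
    for d in doctors:
        key = (d["share_group"], d["volume_group"])
        cells[key]["forecast"] += d["forecast"]
        cells[key]["actual"] += d["actual"]
    return {
        key: abs(c["forecast"] - c["actual"]) / c["actual"] * 100
        for key, c in cells.items()
        if c["actual"] > 0  # skip empty cells
    }

# Hypothetical doctors spanning two cells
doctors = [
    {"share_group": 1, "volume_group": 1, "forecast": 10, "actual": 12},
    {"share_group": 1, "volume_group": 1, "forecast": 11, "actual": 10},
    {"share_group": 2, "volume_group": 3, "forecast": 50, "actual": 40},
]
errs = cell_errors(doctors)  # {(1, 1): ~4.55, (2, 3): 25.0}
```

Summing before scoring is what lets the patient-visit noise at the individual-doctor level average out within each cell.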
Prior to the exercise, it was agreed that the accuracy of PBE’s forecasts would be compared to the accuracy of forecasts produced by a naïve model, i.e., the placebo. The naïve model simply stated that each doctor’s filled Rx’s would increase at the national rate. In this case, Rx’s were flat, so the naïve model assumed that each and every doctor’s prescribing would remain constant.
The naïve model assumes, in effect, that detailing and sampling have no impact on filled Rx’s. Both the client and PBE agreed that any methodology that cannot at least beat the naïve model is worthless at best — and counter-productive at worst.
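The placebo baseline is simple enough to state as code. A minimal sketch, assuming the national growth rate is applied uniformly to every doctor’s Period 1 total:

```python
def naive_forecast(period1_rx, national_growth=0.0):
    """Placebo baseline: every doctor's Rx's change at the national rate.

    With a flat national trend (growth = 0), each doctor's forecast is
    simply his or her Period 1 total -- i.e., promotion has no effect."""
    return [rx * (1.0 + national_growth) for rx in period1_rx]

# Flat market, as in this exercise: the forecast equals the history
flat = naive_forecast([3, 7, 0])  # [3.0, 7.0, 0.0]
```

Any model worth using must beat this zero-information forecast.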
The sponsor of the shoot-out used this naïve model to determine that PBE had indeed quantified the cause and effect of personal selling, which none of the other participants was able to do.
TABLES 1-4
These tables show the period-to-period changes in detailing and sampling for the brand used in the exercise. PBE has had the opportunity to examine the detailing and sampling of scores of brands. We have consistently observed that there is a large amount of period-to-period variation in detailing and sampling at the individual doctor level even when the national numbers are stable. Such was the case with this brand.
This variation in promotion at the doctor level makes the validation protocol discussed earlier a true test of someone’s ability to model cause and effect well enough to create plans that find hidden profits.
Tables 1, 2 and 3 compare detailing activities (Primary, Secondary and Reminder) during the forecasted six months (Period 2) versus the previous six months (Period 1) for every doctor who wrote a script or received promotion during either period. For example, Table 1 shows that 3,939 doctors received exactly three Primary Details during Period 1. (This number is shown in the far-right column.) Looking across the corresponding row, one sees that of the 3,939 doctors who received three primary details in Period 1, 1,807 were not detailed at all during Period 2. In fact, fewer than 10% (298) of the doctors who received three primary details during Period 1 received that exact same number in Period 2.
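A cross-tabulation of this kind is straightforward to build. A minimal sketch, with hypothetical doctor IDs and detail counts (not the study’s data):

```python
from collections import Counter

def detail_crosstab(period1, period2):
    """Cross-tabulate detail counts: how many doctors received i details
    in Period 1 and j details in Period 2.

    period1, period2 -- dicts mapping doctor id -> number of details
    """
    table = Counter()
    for doc in set(period1) | set(period2):
        table[(period1.get(doc, 0), period2.get(doc, 0))] += 1
    return table

# Hypothetical five doctors
p1 = {"a": 3, "b": 3, "c": 3, "d": 0, "e": 1}
p2 = {"a": 0, "b": 3, "c": 5, "d": 2, "e": 1}
tab = detail_crosstab(p1, p2)
# tab[(3, 0)] == 1: one doctor got 3 details in Period 1, none in Period 2
```

Reading across row i of such a table shows how the doctors who received i details in Period 1 were promoted in Period 2, which is exactly how Table 1 is read above.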
TABLE 5
This table shows the results of PBE’s forecasts versus the naïve model’s forecasts. For purposes of the exercise, the doctors were grouped 12 ways according to their initial shares and 10 ways according to their prescribing volumes. We then added all the forecasts for doctors within each cell to produce 118 forecasts (2 cells were empty) rather than 134,000 forecasts. The data were examined at the cell level for two reasons. First, grouping the doctors into cells greatly reduces the impact of random variation created by patient visits. Second, aggregating the data makes it much easier to see what is going on.
Here is how to read the table, starting by reading across the first line of Table 5 (CELL 1). This cell contains the doctors in the lowest share and volume groups – Group I in both cases. There were 2,404 doctors in this cell, and they wrote a total of 2,404 scripts during Period 1. During Period 2, the forecasted period, their scripts jumped to 12,549. PBE had predicted 12,626, a forecast off by 0.6%. The naïve model (Period 1 vs. Period 2), which assumed no change, was off by 80%.
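As a sanity check, the two error figures for CELL 1 follow directly from the counts reported for it, expressing each forecast’s miss as an absolute percent of the Period 2 actual:

```python
# Figures for CELL 1 as reported in the table
actual_p1 = 2_404      # actual Rx's, Period 1
actual_p2 = 12_549     # actual Rx's, Period 2
predicted_p2 = 12_626  # PBE's forecast for Period 2

# PBE error: forecast vs. actual -- about 0.6%
pbe_error = abs(predicted_p2 - actual_p2) / actual_p2 * 100

# Naive error: the "no change" forecast vs. actual -- about 80.8%
naive_error = abs(actual_p1 - actual_p2) / actual_p2 * 100
```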
Results of PBE’s Forecasts vs. Naïve Model Forecasts

| Freq | Actual Rx’s (Period 1) | Actual Rx’s (Period 2) | Predicted Rx’s (Period 2) | PBE Error (Predicted vs. Actual) | Naïve Error (Period 1 vs. Period 2) |
|---|---|---|---|---|---|