A Checklist For Designing Data Collection Regimes

April 29, 2014 by Stacey Barr | Last modified: July 26, 2017

Data collection is a process, not an event. Thinking about it as a process makes it easier to appreciate all the steps that are involved, who is involved in each step and what resources will be needed to make these steps work well. Use the following checklist as a starting point for thinking through the design of your data collection processes.

When you’re designing a new data collection process, or revamping an existing one, here are the six steps and 36 checkpoints that will ensure you get the right data, in the right way, and at the right level of reliability:

Step 1: Make the purpose clear.

Identify the performance measures, business questions or decisions that you require the data for.
List the data items you need to collect (eg these may come from your Performance Measure Definitions, if you have them, or analysis of the information requirements for your business questions or decisions).
Make sure your data items are useful, NOT just interesting. If they are just interesting, then consider the unintended consequences of collecting them (such as cost, annoying respondents or data collectors, compromising integrity, etc.).
Develop a purpose statement for the data collection process, so that everyone understands why it exists.

Step 2: Define the scope of your data collection.

List the characteristics that define who or what you will be collecting data about (eg age groups, roles, activities involved in, education level).
List the characteristics that define where this data will be collected (eg specific departments or divisions, geographical locations, specific offices or places of work).
List the characteristics that define when this data will be collected (eg during November, all the time, for the next 3 years, until an improvement is achieved).
Use these lists to define the scope of your data collection: your ‘target population’.
Check and refine your scope definition by testing it with examples of people, things, places or times that are out of scope.
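
One way to make that last check concrete is to write the scope down as a rule and run a few examples through it. The sketch below is only an illustration: the roles, locations and dates are hypothetical stand-ins for whatever characteristics you listed above, and the `Candidate` and `in_scope` names are invented for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    """One person or thing you might collect data about (hypothetical fields)."""
    role: str
    location: str
    observed_on: date


# Hypothetical scope definition: who, where and when.
IN_SCOPE_ROLES = {"call centre agent", "team leader"}
IN_SCOPE_LOCATIONS = {"Brisbane", "Sydney"}
SCOPE_START = date(2024, 11, 1)
SCOPE_END = date(2024, 11, 30)


def in_scope(candidate: Candidate) -> bool:
    """Return True only if the candidate meets the who, where and when rules."""
    return (
        candidate.role in IN_SCOPE_ROLES
        and candidate.location in IN_SCOPE_LOCATIONS
        and SCOPE_START <= candidate.observed_on <= SCOPE_END
    )


# Examples chosen to probe the edges of the scope definition.
examples = [
    Candidate("call centre agent", "Brisbane", date(2024, 11, 15)),  # expected in
    Candidate("contractor", "Brisbane", date(2024, 11, 15)),         # wrong role
    Candidate("team leader", "Perth", date(2024, 11, 15)),           # wrong place
    Candidate("team leader", "Sydney", date(2024, 12, 1)),           # wrong time
]
for example in examples:
    print(example, "->", "in scope" if in_scope(example) else "out of scope")
```

If an example lands on the wrong side of the rule, that is a prompt to refine the scope definition rather than the example.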

Step 3: Design your sample.

Define how reliable you want the data to be (eg how small a change in your measures do you want to be able to reliably detect?). This may already be recorded in your Performance Measure Definitions.
Nominate any demographic or classification (or drilling) variables that you want to use in analysis of your data (eg do you want to have averages or percentages by geographic location or age group or department or gender?). This may already be recorded in your Performance Measure Definitions.
Discuss what kind of results you are expecting, in terms of the range of data values you think you are likely to get (eg are customers likely to rate their satisfaction mostly at 3 or 4 on your 5 point satisfaction scale, or are they likely to be more spread out on the scale?).
Explore logistical constraints of collecting data from your target population (eg accessibility, cost and data integrity).
Use the above four decisions (and a survey statistician or other assistance) to decide whether or not a sample will be more cost effective than a census.
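
To get a rough feel for the sample-versus-census decision before you talk to a statistician, a back-of-envelope calculation can help. The sketch below assumes a simple random sample and a measure that is a percentage (eg percent of customers satisfied). It uses the standard sample size formula for a proportion with a finite population correction. The function name and the default values (95% confidence, expected proportion of 0.5) are assumptions for illustration, not part of the checklist.

```python
import math


def sample_size_for_proportion(population, margin_of_error,
                               expected_proportion=0.5, z=1.96):
    """Approximate sample size for estimating a proportion.

    Assumes simple random sampling. z=1.96 corresponds to 95% confidence,
    and expected_proportion=0.5 is the most conservative guess.
    """
    # Sample size for a very large population.
    n0 = (z ** 2) * expected_proportion * (1 - expected_proportion) / margin_of_error ** 2
    # Finite population correction: smaller populations need fewer responses.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# Compare a small and a large target population at +/- 5 percentage points.
for population in (500, 100_000):
    n = sample_size_for_proportion(population, margin_of_error=0.05)
    print(population, n, f"{n / population:.0%} of the population")
```

With these assumptions, a target population of 500 needs roughly 218 responses, which is close to half a census, while a population of 100,000 needs roughly 383. The smaller your target population, the less a sample saves you. Treat the numbers as a conversation starter only: stratification, non-response and measures that are averages rather than percentages all change the answer.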

And if you have chosen to go with a sample, get professional help so you don’t inadvertently make it completely useless:

Identify a survey statistician or other assistance in survey sampling. It’s a science, not an art.
Decide whether or not your sample will be stratified (ie your total sample is really a collection of smaller samples based on your demographic or classification variables, which may be geographic location, age group, department or gender). Stratifying a sample can sometimes be a way to reduce the overall sample size or improve the overall reliability of the results.
Select a sample size (or sample sizes, if stratifying) that will deliver the reliability you require.
Select your sample using a random method – not a convenient method like quotas or volunteers – or else you run the risk of bias, where the data you get is not representative of your target population.
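
To show what a stratified random selection looks like in practice, here is a minimal sketch that splits a list of units into strata and draws randomly within each one. It assumes proportional allocation (each stratum gets a share of the sample equal to its share of the population), which is only one of several allocation methods a statistician might recommend. The `stratified_sample` function and its parameters are hypothetical names for this example.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, total_n, seed=0):
    """Draw a random sample within each stratum, sized proportionally.

    units: list of things you could collect data from.
    stratum_of: function returning the stratum label for a unit.
    total_n: target overall sample size (rounding may shift it slightly).
    seed: fixed so the selection can be reproduced and audited.
    """
    rng = random.Random(seed)

    # Group the units by stratum (eg by department or location).
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = {}
    for name, members in strata.items():
        # Proportional allocation, with at least one unit per stratum
        # and never more than the stratum contains.
        n = max(1, round(total_n * len(members) / len(units)))
        n = min(n, len(members))
        # rng.sample selects without replacement, each unit equally likely.
        sample[name] = rng.sample(members, n)
    return sample


# Hypothetical population: 100 staff, 60 in the north office, 40 in the south.
staff = [(i, "north" if i < 60 else "south") for i in range(100)]
selected = stratified_sample(staff, stratum_of=lambda s: s[1], total_n=20)
for office, people in selected.items():
    print(office, len(people))
```

The point of the random draw is that nobody, including you, gets to choose who is in the sample. Recording the seed means you can reproduce the selection later if anyone questions it.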

Step 4: Develop your data collection instrument.

Decide the basic method of data collection you want (or can afford), such as self-completion, telephone interview, face to face interview, focus group, or automated (if possible).
Formulate questions or constructs around the set of data items you listed at Step 1. Give consideration to the type of construct that will give you the data you need, such as open-ended questions, yes/no questions, multiple choice, rating scales, option lists, etc.
Sequence the questions or constructs in a logical order.
Check the language and wording of your questions or constructs to remove ambiguity and “fluff”. Give consideration to providing concise instructions for how to respond to each construct.
Design a layout for arranging your questions in a readable and usable way. Give consideration to the medium you will use (such as web page, computer data entry screen, paper, etc.), how you align things on the “page”, how you use white space to stop it looking like a huge blob of text, how you use contrast to make questions stand apart from instructions and the response area (eg the option list, the rating scale, etc.).
Test your questionnaire or form on a handful of people, ideally those who will collect the data or provide the data. The obvious problems won’t be obvious to you. Absorb their feedback for ideas on making the questionnaire more relevant, understandable and usable.

Step 5: Flowchart the procedure of collecting the data.

Identify the trigger that will let people know that data has to be collected. It might be a customer phone call, a specific event occurring or finishing, an activity starting.
Identify how the data will be captured, such as which database it will be entered into.
List the steps that you think will be involved in the data collection procedure, from the trigger to the capture of the data. Note down who will take a role in which step.
Draw a flowchart (or cross-functional process map) that shows the flow of the steps through time, against who performs them. Give consideration to the expected time frames within which each step should be performed.
For each step, identify the resources required to perform it successfully.

Step 6: Pilot test the whole thing.

Choose a part of your data scope, based on location and time, in which you will conduct your pilot test.
List the outcomes that define success for this data collection process. Explore what success might look like from each stakeholder’s point of view (eg people collecting the data, providing the data, capturing the data, using the data, etc.). These might include impact on people’s time, data integrity, data usability, costs and timeliness.
Develop a Pilot Test plan for testing the data collection process, and a way to observe “evidence of success”.
Implement the Pilot Test plan.
Reflect on the “evidence of success” and summarise what you learned. List changes or improvements that you need to make to the data collection process and/or resources.
Make the improvements to your data collection design.
Deploy the data collection process. Continue to monitor it over time to ensure the “success outcomes” are tracking well.

There you have it. Now it’s time to get busy! But don’t think for a second that it’s too much effort to design robust data collection regimes. You just have to consider how much time, effort and performance improvement opportunity are wasted through having irrelevant data, at the wrong time, and with low reliability.

TAKE ACTION:

How does your approach to collecting data for your performance measures compare with this checklist? Do you have some potential to improve it, and thereby collect more of the data you need, in a more timely and reliable way?