There are several reasons why you might choose to sample rather than collect every data point.
If you are a Lean Six Sigma Green Belt or Black Belt, an aspirant, or a Quality Engineer, you will encounter several scenarios where you have to sample data during data collection in the Measure Phase of a DMAIC project.
Before selecting the right method for sampling, it is important to understand why sampling needs to be done, what the objective of the data collection is, and what outcomes are expected from the data collected through sampling.
Data sampling saves a lot of time and effort in data collection, but it carries the risk of drawing wrong conclusions from the sampled data. To minimize this risk, one of the things we need to ensure is that the right sampling method is adopted.
Broadly, 4 sampling techniques are commonly used. There are a few more intricate sampling techniques, but these 4 should suffice in most scenarios. We will now get to the details of selecting the right sampling method.
First, determine whether you are sampling from a population or a process. A population is a lot or collection of items, such as parts, documents or transactions, whose size is fixed in advance. For example, if you have 100 applications, those 100 applications are the population. If you wish to assess some characteristic of them, you can employ sampling, but the results will be conclusive only for those 100 applications. This is because we have no idea how the lot was built: it may have come from one process or from many.
On the other hand, a process (manufacturing, service or tech) continuously produces output. If you wish to sample between the process steps or at the end of the process, so that you can check the health of the process and adjust the process parameters accordingly, you will deploy process sampling methods.
Thus, as an example, if you have a batch of parts or applications to be processed, you can employ population sampling to evaluate incoming quality and process sampling to evaluate the quality of the output the process is generating. This concept is very useful if you are running a Lean Six Sigma Green Belt or Black Belt project. In the Measure Phase of DMAIC, you have to decide what data is to be collected and how the data collection will be executed. The sampling method has to be chosen according to the nature of the problem or the improvement you wish to bring about using Lean Six Sigma.
Once this is clear, the next step is to decide exactly which method of sampling you will deploy. As mentioned earlier, there are broadly 4 methods of sampling: 2 for population sampling and 2 for process sampling.
Simple random sampling is the most common method of population sampling. For example, in your Lean Six Sigma Green Belt or Black Belt project, if you wish to sample a few applications from a lot of 100 applications to find how many are incomplete, you can deploy this method.
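As a rough illustration of drawing a random sample from a fixed lot, here is a minimal Python sketch. The application IDs, the set of incomplete applications and the sample size of 20 are all made-up assumptions for the example, not figures from a real project.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size items without replacement; every item is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


def estimate_incomplete_rate(sample, is_incomplete):
    """Proportion of the sampled items that are flagged as incomplete."""
    return sum(1 for item in sample if is_incomplete(item)) / len(sample)


# Hypothetical lot: application IDs 1..100, where every 10th one is incomplete.
applications = list(range(1, 101))
incomplete_ids = set(range(10, 101, 10))

sample = simple_random_sample(applications, 20, seed=42)
rate = estimate_incomplete_rate(sample, lambda app_id: app_id in incomplete_ids)
print(f"Sampled {len(sample)} applications, estimated incomplete rate: {rate:.0%}")
```

The estimate applies only to this lot of 100 applications, and it will vary from one random draw to the next.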
If your lot is not homogeneous but has different strata (or groups) within it, you will employ stratified random sampling. For example, in your Lean Six Sigma Green Belt or Black Belt project, if the applications are received through 3 different channels, such as physical applications, a web application form and a mobile app, then it's likely that the sources of error are different, and hence the error rates are likely to be different too. So we need to sample each stratum separately and weight each stratum's estimate by its contribution to the overall population. Hence, in this case, it is better to stratify the applications by source and then sample randomly within each stratum.
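The idea of sampling each stratum separately and then weighting by its share of the population can be sketched as follows. The channel sizes, the error flags and the per-stratum sample size are hypothetical, chosen only to make the example runnable.

```python
import random


def stratified_estimate(strata, sample_size_per_stratum, has_error, seed=None):
    """Sample each stratum separately, then weight each stratum's error rate
    by that stratum's share of the whole population."""
    rng = random.Random(seed)
    total = sum(len(items) for items in strata.values())
    per_stratum = {}
    overall = 0.0
    for name, items in strata.items():
        n = min(sample_size_per_stratum, len(items))
        sample = rng.sample(items, n)
        rate = sum(1 for item in sample if has_error(item)) / n
        per_stratum[name] = rate
        overall += rate * len(items) / total
    return per_stratum, overall


# Hypothetical lot of 100 applications: (channel, id, has_error).
strata = {
    "physical": [("physical", i, i % 5 == 0) for i in range(50)],
    "web": [("web", i, i % 10 == 0) for i in range(30)],
    "mobile": [("mobile", i, i % 4 == 0) for i in range(20)],
}

per_stratum, overall = stratified_estimate(strata, 10, lambda app: app[2], seed=1)
print(per_stratum, round(overall, 3))
```

Here every stratum gets the same sample size for simplicity. Sampling in proportion to stratum size is another common choice.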
If your Lean Six Sigma Green Belt or Black Belt project aims to improve a process, you can employ process sampling, and the simplest form of it is systematic sampling. For example, when applications are processed by a team of associates, you can conduct a dip-stick audit at regular intervals. This could involve collecting a few samples every hour. If you find that the error rates are increasing hour after hour, this information can help you correct any deficiencies in the process right away. Hence this sampling method is useful not only for improvement projects using Lean Six Sigma DMAIC or PDCA but also for ongoing process control and management.
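A minimal sketch of such an hourly dip-stick audit is shown below. The record format (hour, has_error), the choice of 4 samples per hour and the data itself are assumptions made for illustration. Taking the first few records of each hour is also a simplification, since a real audit might spread the picks across the hour.

```python
from collections import defaultdict


def hourly_dipstick(records, per_hour):
    """Keep the first per_hour records seen in each hour and return the
    error rate per hour, ordered by hour."""
    by_hour = defaultdict(list)
    for hour, has_error in records:
        if len(by_hour[hour]) < per_hour:
            by_hour[hour].append(has_error)
    return {hour: sum(flags) / len(flags) for hour, flags in sorted(by_hour.items())}


def is_rising(rates):
    """True when each hour's error rate is higher than the previous hour's."""
    values = list(rates.values())
    return len(values) > 1 and all(b > a for a, b in zip(values, values[1:]))


# Hypothetical process output: 10 applications per hour, in processing order.
records = []
for hour, errors_at_start in [(9, 0), (10, 1), (11, 2)]:
    records += [(hour, i < errors_at_start) for i in range(10)]

rates = hourly_dipstick(records, per_hour=4)
print(rates, "rising" if is_rising(rates) else "not rising")
```

A rising trend across consecutive hours is the kind of signal that would prompt an immediate look at the process.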
There is another process sampling method which you can employ in any Lean Six Sigma project, called rational sub-grouping. If your process has no known patterns and works consistently in the same manner throughout the working cycle, then systematic sampling is sufficient. But if there are known patterns in the process, such as weekend or month-end patterns, or even intra-day patterns such as morning, afternoon and evening, then it is wise to collect a few samples in every window of the pattern. Unusually high variation within the samples collected in a given window signals some special cause of variation. For example, suppose you expect errors to be low in the morning, but when you sample 3 consecutive applications in the morning, 1 has few errors and the other 2 have very many. That indicates some underlying issue in the process. Thus, within a sub-group (for example, within the 3 morning samples) we expect minimal variation, while between sub-groups (for example, between the averages of the morning and evening samples) we expect a significant difference, since such a pattern is natural in this process. But if the errors turn out to be random, so that the variation between sub-groups is no different from the variation within them, then either the rationale for the sub-groups is incorrect or something is wrong with the process.
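The within versus between comparison can be sketched in Python as follows. The error counts and the range limit of 3 are hypothetical. In practice the limit would more likely be derived from the data, for example with X-bar and R control charts, rather than fixed by hand.

```python
from statistics import mean


def subgroup_summary(subgroups):
    """For each sub-group, return its mean and its range (max minus min)."""
    return {
        name: {"mean": mean(values), "range": max(values) - min(values)}
        for name, values in subgroups.items()
    }


def flag_high_within_variation(summary, range_limit):
    """Names of sub-groups whose internal range exceeds the assumed limit."""
    return [name for name, stats in summary.items() if stats["range"] > range_limit]


# Hypothetical errors per application, 3 consecutive applications per window.
subgroups = {
    "morning": [1, 8, 9],    # expected to be low and tightly grouped
    "afternoon": [4, 5, 4],
    "evening": [7, 8, 7],
}

summary = subgroup_summary(subgroups)
flagged = flag_high_within_variation(summary, range_limit=3)
print(summary)
print("Investigate:", flagged)
```

With this data the afternoon and evening windows differ in their averages but are tight internally, while the morning window has a wide internal spread and is the one worth investigating.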