A common misconception about packaging validation is that we decide on the number of samples required for each test. In fact, that information comes from our customers’ FMEAs.
Sample size is determined by several factors, including whether the testing is attribute or non-attribute testing and what the risk level is for the device. In this blog, we will go over the two kinds of tests, the definition of risk, how risk is calculated, and how all of that plays into sample size!
First, what are Attribute Testing and Non-Attribute Testing?
Attribute testing does not produce a numerical output; it is simply a ‘pass’ or ‘fail’ test. An example of this is bubble leak testing. When performing a bubble leak test, we submerge a pouch underwater and inflate it with air. If we observe a constant stream of bubbles, that is an indication that the package has failed. If there is no stream of bubbles, the package passes.
Non-attribute testing produces a numerical output. Let us use peel testing as an example here. Every time we pull apart the two pieces of material that make a seal, the test outputs a measured value.
The cool thing about non-attribute testing is that we can use the data to run a statistical analysis of the client’s samples. If we crunch the numbers on 30 seal strength tests, we can look at the average and the standard deviation and understand how in control the process is.
Because non-attribute testing provides us with more data, we do not require as many samples for those kinds of tests. That said, the kind of test is not the only factor in choosing a sample size. The major deciding factor is the risk of our customer’s device.
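As a rough illustration of that number crunching, here is a minimal Python sketch that computes the average and standard deviation of a set of seal strength readings. The readings are made up for illustration (and shortened to 10 values rather than 30), and the function name `summarize` is hypothetical, not something from our lab software.

```python
import statistics

# Hypothetical seal strength readings, invented for illustration only.
# A real data set would come from the peel tester (e.g. 30 readings).
seal_strengths = [1.52, 1.48, 1.55, 1.50, 1.47, 1.53, 1.49, 1.51, 1.54, 1.46]


def summarize(values):
    """Return the sample mean and sample standard deviation of the readings."""
    return statistics.mean(values), statistics.stdev(values)


mean, stdev = summarize(seal_strengths)
print(f"n = {len(seal_strengths)}, mean = {mean:.3f}, std dev = {stdev:.3f}")
```

A small standard deviation relative to the average suggests a consistent sealing process; a large one suggests the process needs a closer look.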
What is risk?
Most of our clients are medical device manufacturers (MDMs). MDMs operate under ISO 13485, a standard that provides Quality System guidance on how to manufacture medical devices. The definition of risk under ISO 13485 is “the combination of the probability of occurrence of harm and the severity of that harm.” In layman’s terms, risk is the likelihood that something bad will happen, combined with the impact on the patient if that bad thing happens.
How do we determine the calculated risk?
MDMs commonly use a Failure Mode and Effects Analysis (FMEA) to help determine risk for the device. An FMEA is a step-by-step process that identifies how a device might fail and assesses the impact of each failure, in order to find the parts of the process most in need of change.
There are three things that are considered in an FMEA:
- probability of something happening (how likely is it?)
- severity if that happens (how bad is it if that happens?)
- probability of detection (what is the likelihood we will catch it anyway?)
Typically, each of these inputs is assigned a value from 1 to 5, with 1 being the safest (lowest risk) and 5 being the least safe. Those three values are then multiplied to give a value that we call a Risk Priority Number (RPN), which on this scale ranges from 1 to 125. The RPN reflects the potential harm associated with the device if it were to be defective. This ranges from no danger all the way to the most extreme risk, death.
Consider this example: your mom is on an operating table getting open-heart surgery. If the operating room nurse opens a package and the device that is going inside your mom is defective, that would be a very bad situation. This is exactly why a risk/benefit analysis is completed and why it is important.
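The multiplication described above can be sketched in a few lines of Python. This is only an illustration that assumes the 1 to 5 scale mentioned here; the function and parameter names are hypothetical, and each MDM’s own FMEA procedure defines the actual scales and how an RPN maps to a risk category.

```python
def risk_priority_number(occurrence, severity, detection):
    """Multiply the three FMEA scores (each assumed to be on a 1-5 scale)."""
    scores = {"occurrence": occurrence, "severity": severity, "detection": detection}
    for name, score in scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {score}")
    return occurrence * severity * detection


# Example scores, invented for illustration: unlikely to occur (2),
# serious if it does (4), moderately likely to be caught (3).
print(risk_priority_number(2, 4, 3))  # 24
```

On this scale the lowest possible RPN is 1 (1 x 1 x 1) and the highest is 125 (5 x 5 x 5).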
We then use the table below to determine risk acceptability and requirements for both design and process-related hazards.
Once MDMs go through the risk analysis process, they come up with risk categories and determine the confidence and reliability numbers. Reliability refers to the proportion of units expected to perform without failure (the complement of the failure rate). Confidence refers to the minimum certainty that the stated reliability is accurate.
If the RPN is LOW, we will typically validate to 90% confidence/90% reliability (this is what dictates sample size). Moderate risk is commonly validated to 95% confidence/95% reliability (the most common risk level observed by the lab).
If your device falls in the NAC category, it is likely you will be turned down by a contract manufacturer or be forced to redesign portions of the device or the manufacturing processes because the risk of that device is too high. If a manufacturer feels it will not be able to control the process well enough to keep the patient safe, or does not want to deal with the potential lawsuits that might arise from the device, it would likely turn down that device.
Once you determine that reliability and confidence, you refer to a statistical reference. At PCL, we use Juran’s Handbook of Statistics. Inside there are various tables that help to determine what the right sample size is for your device.
Once we have the sample size, we must also take into consideration whether the test is an attribute test or a non-attribute test.
How do we use the calculated risk to determine sample size?
The risk level we typically see is Moderate. That lands at 95% confidence and 95% reliability (in the lab, we refer to this as a 95% confidence interval). That equates to 30* samples for our non-attribute tests (seal strength, burst test) and 60* samples for attribute testing (bubble leak test, visual inspection).
What do we tell clients when they ask, ‘How Many Samples Should We Send You?’
Here at the lab, we do not determine the sample size for our clients. We tell our customers that their decision on sample size should be based on the risk of their device, and we recommend reviewing the risk analysis that they performed as part of their development process. What we do tell our customers is that we usually test to that 95% confidence interval.
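For readers curious where a number like 59 comes from, the Python sketch below uses the standard zero-failure (success-run) formula for pass/fail data, n = ln(1 - confidence) / ln(reliability), rounded up. This is a commonly used calculation and not necessarily the exact table we look up; the function name is hypothetical. It applies only to attribute tests where every sample must pass, and it does not reproduce the 30-sample figure for non-attribute tests, which comes from statistics on measured data.

```python
import math


def zero_failure_sample_size(confidence, reliability):
    """Smallest n with reliability**n <= 1 - confidence (all n samples must pass)."""
    if not (0 < confidence < 1 and 0 < reliability < 1):
        raise ValueError("confidence and reliability must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(reliability))


print(zero_failure_sample_size(0.95, 0.95))  # 59 (Moderate risk, 95/95)
print(zero_failure_sample_size(0.90, 0.90))  # 22 (Low risk, 90/90)
```

The 95/95 result of 59 matches the attribute quantity in the footnote, which most customers round up to 60.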
Potential Limiting Factors
If a device costs $5,000, it can be difficult for device manufacturers to provide 30-60 devices as samples. If this is the case, the MDM would need to present us with a justification for why it is acceptable to test with fewer than the suggested number of samples.
Limited number of devices
If the MDM we are working with is a startup, there is the potential that they are unable to procure the full number of devices required for packaging validation. In these cases, we can use simulated devices. There are caveats that allow us to use products that are representative of the actual device if the MDM is unable to provide the actual device in the sample (e.g., pieces of tubing in place of catheters). If we go down this road, we choose the worst-case feature of your product. Then we make sure we are capturing that worst case in the simulated product. Examples of worst-case features: heavy weight, sharp edges, awkward geometry, etc.
Want to Talk about Your Device’s Packaging Sample Size? Our experts are happy to discuss design verification testing and sample sizes for packaging of all types.
*The sample quantities are 30 and 59, but most of our customers round 59 up to 60 to make it an even number.