

STATISTICS 

Year : 2010  Volume
: 1
 Issue : 2  Page : 6769 

Sample size estimation in clinical trial
Tushar Vijay Sakpal
Principal Statistician, Biostatistics, PharmaNet Clinical Services Pvt. Ltd
Date of Web Publication  20Oct2010 
Correspondence Address: Tushar Vijay Sakpal Principal Statistician, Biostatistics, PharmaNet Clinical Services Pvt. Ltd
Source of Support: None, Conflict of Interest: None  Check 
PMID: 21829786
Abstract   
Every clinical trial should be planned. This plan should include the objective of trial, primary and secondary endpoint, method of collecting data, sample to be included, sample size with scientific justification, method of handling data, statistical methods and assumptions. This plan is termed as clinical trial protocol. One of the key aspects of this protocol is sample size estimation. The aim of this article is to discuss how important sample size estimation is for a clinical trial, and also to understand the effects of sample size over estimation or underestimation on outcome of a trial. Also an attempt is made to understand importance of minimum sample to detect a clinically important difference. This article is also an attempt to provide inputs on different parameters that impact sample size and basic rules for these parameters with the help of some simple examples. Keywords: Sample size, statistically significant, clinically significant, Type I error, Type II error, power
How to cite this article: Sakpal TV. Sample size estimation in clinical trial. Perspect Clin Res 2010;1:679 
Introduction   
The article will cover
 Why sample size estimation is important in a clinical trial (CT)
 Inputs and their importance for sample size estimation in CT
 Examples on sample size estimation
 Discuss some basic rules and guidelines
Importance of Sample Size in CT   
Let us see an example. Dr. ABC had developed a drug NEW which was effective in reducing pain. The drug NEW is clinically better in terms of efficacy and safety as compared to test drug. Hence, the drug NEW truly deserves to reach market so that patients suffering from pain can be at ease.
Dr. ABC decided to conduct a CT in order to get approval from regulators. How will the sample size affect Dr. ABC's trial on patients suffering from pain?
Scenario 1  Sample size underestimation (Sample size selected was less than what was required)
At the end of trial when data was analyzed, statistical significance was not achieved (statistical jargon, pvalue was ≤ 0.05).
Underestimation of sample size may result in drug turning out to be statistically nonsignificant even though clinical significance exists.
Scenario 2  Sample size overestimation (Sample size selected was much more than what was required)
Dr ABC conducted CT with a large number of subjects. When the data was analyzed, strong statistical significance was achieved as pvalue turned out to be 0.000001. However Dr ABC has to consider the following issues:
 The above pvalue indicates that a smaller sample size was needed to prove statistical significance. This raises ethical issues as more subjects were exposed to test drug, which could have deserved getting NEW drug.
 As sample selected was very large even a small difference between NEW and test drug will turn out statistically significant even if that difference is not clinically meaningful.
Hence sample size is an important factor in approval /rejection of CT results irrespective of how clinically effective/ ineffective, the test drug may be.
What does it take to estimate a statistically appropriate sample size?
Basic Requirement for Estimation of Sample Size in CT   
The estimation of sample size along with other study related parameters depends on Type I error, Type II error and Power.
Type I Error:   
Consider we are testing two brands of paracetamol to evaluate if Brand 1 is better in curing subject's suffering from fever as compared to Brand 2. As both brands contain paracetamol, it is expected that the effect of both brands is similar. Let us try to built a statistical hypothesis around this
Null Hypothesis (H _{0} ): Brand 1 is equal to Brand 2
Alternate Hypothesis (H _{1} ): Brand 1 is better than Brand 2
Let us try to evaluate error that can occur.
Error 1: Based on analysis it is concluded that Brand 1 is better than Brand 2, basically we reject H _{0} . Knowing that Brand 1 is equal to Brand 2 (H _{0} ), we are making an error here by rejecting H0. This is called as Type I error. Statistically it is defined as
Type I error = (Reject H _{0} / H _{0} is true).
Probability of Type I error is called as level of significance and is denoted as α.
Type II Error:   
Consider we are testing paracetamol against placebo to evaluate if paracetamol is better in curing subject's suffering from fever as compared to placebo. (It is expected that the effect of paracetamol is better than placebo). Let us try to built a statistical hypothesis around this
Null Hypothesis (H _{0} ): paracetamol is equal to placebo
Alternate Hypothesis (H _{1} ): paracetamol is better than placebo
Let us try to evaluate the error that can happen.
Error 2: If analysis concludes that paracetamol is equal to placebo, we accept H _{0} . Knowing that paracetamol is better than placebo (H _{1} ) we are making an error here by accepting H0. This is called as Type II error. Statistically it is defined as
Type II error = (Accept H _{0} / H _{1} is true).
Probability of type II error is denoted as β.
In above case if analysis concludes that paracetamol is better than placebo, we reject H0, which would be correct decision. Probability of such a decision taking place is called as "Power".
Power = Probability (Reject H _{0} / H _{1} is true) which is actually 1β.
We can tabulate type I and type II error as
[Additional file 1]
Following are the basic considerations for estimating sample size for a CT
Following are the basic considerations for estimating sample size for a CT
1. Study Design
In CT, different statistical study designs are available to achieve objectives. Typical designs that may be employed are parallel group design, crossover design etc.
For sample size estimation study design should be explicitly defined in the objective of the trial. A cross over design will have different approach and formula for sample size estimation as compared to parallel group design.
2) Hypothesis Testing
This is another critical parameter needed for sample size estimation, which describes aim of a CT. The aim can be equality, noninferiority, superiority or equivalence.
Equality and equivalence trials are twosided trials where as noninferiority and superiority trials are onesided trials.
Superiority or noninferiority trials can be conducted only if there is prior information available about the test drug on a specific end point.
3) Primary Study End Point
The sample size calculation depends on primary end point of a CT. The description of primary study end point should cover whether it is discrete or continuous or timetoevent. Sample size is estimated differently for each of these end points. Sample size is adjusted if primary end point involves multiple comparision.
4) Expected Response Test vs. Control
The information about expected response is usually obtained from previous trials done on the test drug. If this information is not available, it could be obtained from previous published literature. The important information required is:
 response expected with test drug (mean expected score / proportion of subjects achieving success.)
 response expected with control drug (mean expected score / proportion of subjects achieving success.)
 variability (standard deviation.)
5) Clinical Important/Meaningful Difference/Margin
This is one of most critical and one of most challenging parameters. The challenge here is to define a difference between test and reference which can be considered clinically meaningful. To put it in plain English, this is the difference which will make your family doctor to give you new medication over and above the existing gold standard which he is prescribing for last 10  20 years. This threshold figure is at times not easily available and should be decided based on clinical judgment.
6) Level of Significance
This is typically assumed as 5% or lesser. ^{[1],[2]} .Type I error is inversely proportional to sample size.
7) Power
As per guideline. ^{[1],[2]} power should not be less that 80%. Type II error is directly proportional to sample size.
8) Dropout rate
The sample size estimation formula will provide number of evaluable subjects required for achieving desired statistical significance for a given hypothesis. However in practice we may need to enroll more subjects to account for potential dropouts.
If n is the sample size required as per formula and if d is the dropout rate then adjusted sample size N1 is obtained as
N1 = n/ (1d).
9) Unequal Treatment Allocation
For some of clinical trials it is ethically desirable to have more subjects in one arm compared to the other arm. For example, for placebocontrolled trials with very ill subjects it is unethical to assign equal subjects to each arm. In such cases sample size is adjusted as below.
If n is sample size required as per formula and n1 is the sample size for test and n2 is sample size for placebo and n1 /n2 = k, then
n1 = (0.5) x n x (1 + k)
n2 = (0.5) x n x (1 + (1/k))
Suppose n = 100 and ratio between test (n1) and placebo (n2) is 2:1 (k=2) then
n1 = (0.5) x (100) x (1+2) = 150
n2 = (0.5) x (100) x (1+[1/2]) = 75
As a statistician one should have clear and clinically meaningful inputs on the above points prior to working on sample size estimation.
Calculation of Sample Size   
Example 1: Comparing two proportions
A placebocontrolled randomized trial proposes to assess the effectiveness of Drug A in curing infants suffering from sepsis. A previous study showed that proportion of subjects cured by Drug A is 50% and a clinically important difference of 16% as compared to placebo is acceptable.
Level of siginificance = 5%, Power = 80%, Type of test = twosided
Formula of calculating sample size is
n = [(Z _{α/2} + Z _{β} ) 2 x {(p1 (1p1) + (p2 (1p2))}]/(p1  p2) ^{2}
where
n = sample size required in each group,
p1 = proportion of subject cured by Drug A = 0.50,
p2 = proportion of subject cured by Placebo = 0.34,
p1p2 = clinically significant difference = 0.16
Z _{α/2}: This depends on level of significance, for 5% this is 1.96
Z _{β}: This depends on power, for 80% this is 0.84
Based on above formula the sample size required per group is 146. Hence total sample size required is 292.
The description sample size in the protocol will be:
A sample size of 292 infants, 142 in each arm, is sufficient to detect a clinically important difference of 16% between groups in curing sepsis using a twotailed ztest of proportions between two groups with 80% power and a 5% level of significance. This 16% difference represents a 50% cure rate using drug A and 34% cure rate using placebo.
Example 2: Comparing two means
An activecontrolled randomized trial proposes to assess the effectiveness of Drug A in reducing pain. A previous study showed that Drug A can reduce pain score by 5 points from baseline to week 24 with a standard deviation (o) of 1.195. A clinically important difference of 0.5 as compared to active drug is considered to be acceptable. Consider a dropout rate of 10%.
Level of significance = 5%, Power = 80%, Type of test = twosided
Formula of calculating sample size is
n = [(Z _{α/2} + Z _{β} ) ^{2} x {2(0) ^{2}}]/ (μ1  μ2) ^{2}
where
n = sample size required in each group,
μ1 = mean change in pain score from baseline to week 24 in Drug A = 5,
μ2 = mean change in pain score from baseline to week 24 in Active drug = 4.5,
μ1μ2 = clinically significant difference = 0.5
o = standard deviation = 1.195
Z _{α/2:} This depends on level of significance, for 5% this is 1.96
Z _{β}: This depends on power, for 80% this is 0.84
Based on the above formula, the sample size required per group is 90. Hence total sample size required is 180. Considering a dropout rate of 10% total sample size required is 200 (100 in each arm).
The description sample size in the protocol will be: A sample size of 180 subjects, 90 in each arm, is sufficient to detect a clinically important difference of 0.5 between groups in reducing pain assuming a standard deviation of 1.195 using a twotailed ttest of difference between means with 80% power and a 5% level of significance. Considering a dropout rate of 10% the sample size required is 200 (100 per group).
Basic Rules for Estimating Sample Size in CT   
Some basic rules for on sample size estimations are:
 Level of significance  It is typically taken as 5%. The sample size increases as level of significance decreases.
 Power  It should be >= 80%. Sample size increases as power increases. Higher the power, lower the chance of missing a real effect.
 Clinically meaningful difference  To detect a smaller difference, one needs a sample of large size and vice a versa.
 Sample size required to demonstrate equivalence is highest and to demonstrate equality is lowest.
The sample size estimation is challenging for complex designs such as noninferiority or, time to event end points. Also, the sample size estimation needs adjustment in accommodating a) unplanned interim analysis b) planned interim analysis and c) adjustment for covariates.
In conclusion, the sample size is one of the critical steps in planning a CT and any negligence in its estimation may lead to rejection of an efficacious drug and an ineffective drug may get approval. As calculation of sample size depends on statistical concepts, it is desirable to consult an experienced statistician in estimation of this vital study parameter.
Acknowledgement:   
I would like to thank Dr. Suresh Bowalekar, Managing Director PharmaNet Clinical Services Pvt. Ltd., for his guidance and inputs on writing this article.
References   
1.  SheinShung C., Jun S. and Hansheng W. Sample Size Calculation in Clinical Trial. New York, Marcel Dekker Inc., 2003, Chapter 1, page 11, section 1.2.3. 
2.  ICH Topic E 9, Statistical Principal for Clinical Trials Step 4, Consensus guideline, 05Feb1998 Note for Guidance On Statistical Principles For Clinical Trial. Available via http://www.ich.org . 
