|Year : 2021 | Volume : 4 | Issue : 2 | Page : 71-78
A magical journey into knowledge creation in emergency difficult airway access - Sample size calculation and choosing statistical tests with the ‘Research Genie’
Department of Otolaryngology Head and Neck Surgery, St. John's Medical College and Hospital, Bengaluru, Karnataka, India
|Date of Submission||17-Jul-2021|
|Date of Acceptance||20-Jul-2021|
|Date of Web Publication||10-Aug-2021|
Dr. Arumugam Ramesh
Department of Otolaryngology Head and Neck Surgery, St. John's Medical College Hospital, Koramangala, Bengaluru - 560 034, Karnataka
Source of Support: None, Conflict of Interest: None
This article is the third of a four-article series intended to ignite the minds of readers and empower them to create new knowledge in the context of 'emergency difficult airway access'. This article describes sample size calculation, descriptive statistics and inferential statistics in simple and lucid language without using any formulae. The reader should have followed the steps of knowledge creation as described in the first two articles and framed objectives for a given challenging healthcare situation. The study design and variables to operationalise the objective should have been defined. With this information in the background, the article empowers the reader to calculate sample size for a given objective. The pathway to access this information on the 'Research Genie (RG)' app is described for every objective in all the nine relevant domains of healthcare, i.e. description, laboratory range estimation, incidence/prevalence estimation, evaluating therapies, measuring costs in healthcare, critically evaluating new tests, measuring risk, correlating variables and describing experiences, perceptions and beliefs. Mathematical and statistical jargon are deliberately kept at bay. This is followed by describing summary measures and tests of significance for each objective. The pathway to access this on RG is described. On reading and assimilating this article, healthcare personnel can communicate meaningfully with the biostatistician while explaining the data required to calculate the sample size for a given objective. The researcher learns to list the possible summary measures and tests of significance for a particular objective. With an intention to demystify all these complicated concepts, I may have erred on the side of oversimplification. I pray for forgiveness from the biostatisticians and sincerely recommend all these are discussed with the biostatistician and approval sought before putting them in print.
Keywords: Descriptive statistics, inferential statistics, sample size calculation
|How to cite this article:|
Ramesh A. A magical journey into knowledge creation in emergency difficult airway access - Sample size calculation and choosing statistical tests with the ‘Research Genie’. Airway 2021;4:71-8
|How to cite this URL:|
Ramesh A. A magical journey into knowledge creation in emergency difficult airway access - Sample size calculation and choosing statistical tests with the ‘Research Genie’. Airway [serial online] 2021 [cited 2022 Jan 27];4:71-8. Available from: https://www.arwy.org/text.asp?2021/4/2/71/323581
| Introduction|| |
This article is the third of a series of four articles. The series intends to empower medical personnel to create new knowledge for managing the difficult airway in emergency situations. In the first article (September–December 2020), 'Defining the destination', readers were educated to frame precise objectives in nine domains of healthcare, namely descriptions, laboratory range estimations, estimating incidence and prevalence, evaluating therapies, measuring costs in healthcare, critically evaluating new tests, measuring risk, correlating variables and describing beliefs, perceptions and experiences. The process of operationalising the objective was explained. In the second article (January–April 2021), 'Planning your journey with Research Genie' (RG), the steps of creating new knowledge were pictorially described using an innovative and easy-to-understand diagram. The readers were introduced to a free educational app 'RG' available in Play Store (Android phone) and App Store (iPhone). RG gave a glimpse of the shining and shimmering world of research methodology using lively and interesting diagrams. The thought process of the researcher was channelled by RG to concisely define the target population and frame accurate objectives. This is the most crucial step in knowledge creation and sets the stage for the rest of the process. After this, the researcher has to perform a comprehensive review of literature to examine if any other well-designed study has already created new knowledge to answer this objective. The most appropriate study designs for the nine objectives and standards used to ensure the quality of the research were described. The concept of probability and non-probability sampling strategy was introduced. The second article concluded by stating that RG is limited in its ability to assist in selecting the study design and sampling strategy, for RG cannot think. The thinking has to be performed by the researcher. This article demystifies sample size calculation and statistical tests.
The RG can definitely assist in this task.
The magical journey continues…
| Principles of Sample Size Calculation|| |
Medical doctors are often distressed when faced with the task of calculating sample size for their research. They often seek the assistance of the all-powerful biostatistician. Due to their busy schedules, when the doctor is free, the biostatistician is not (usually after 6 pm IST) and vice versa. By divine providence, on the day that the doctor's and biostatistician's convenience align (a rare cosmological phenomenon), they do not seem to understand each other's language. They part sweetly till God unites them again! By the end of this article, I promise to create affinity towards the biostatistician by educating you in the language of sample size calculation, descriptive statistics and inferential statistics. Thereafter, you will definitely have meaningful and productive conversations with your biostatistician to calculate sample size and choose the most appropriate test of significance.
The principle of sample size calculation is similar to making inferences about the quality of rice in a large sack based on a sample from the topmost layer. Hence, how much to sample is determined by the uniformity of rice quality across the various layers in the sack. The measure of uniformity is termed 'variability' in statistical language. Variability is commonly measured as the standard deviation of means or ranges of categories within the group. This information is usually gathered from prior similar studies, or from a pilot study of your own work if there is no such prior research in your area of interest.
Before making a purchase where you have to choose between two items, the approximate cost and quality difference you are willing to accept are enquired by the shopkeeper. Similarly, when you want to calculate sample size to compare two or more groups, the biostatistician (there is no intention to equate a biostatistician with a shopkeeper!) will ask for the expected difference based on prior studies or your estimates based on your pilot studies. Difference between groups is also measured using effect size which is derived from prior studies with similar research objectives as yours. The concept of effect size will be explained in the context of group comparisons for better understanding in later sections.
One cannot sample the entire universe. Hence, inferences about the universe based on a sample have inherent error. The extent of error we are willing to accept determines the sample size. By convention, the maximum acceptable false-positive error is 5%. This type of error is termed type 1 or alpha error. Similarly, 20% is the maximum permissible false-negative error, also termed type 2 or beta error. As common sense suggests, the lower the acceptable error, the larger the sample size. As a corollary, the degree of acceptable error is based on the importance of the objective. If you are dealing with blood pressure, then an error in the range of 5 mm Hg may be acceptable; however, while dealing with pH, the acceptable range will be narrower, possibly 0.01 (pH does not have any units). Based on what is at stake, you have to inform the statistician of the degree of error you are willing to accept.
The next important concept for sample size calculation is the type of hypothesis stated in your study design. In a one-sided hypothesis, the direction is defined. For example, the hypothesis states that training in difficult airways improves the skill, and the study is about measuring the degree of improvement. In a two-sided hypothesis, the direction is not defined. For example, the hypothesis states that simulator-based training for managing a difficult airway has an effect on skills. The study will reveal whether that effect is positive or negative. The researcher has to commit to the type of hypothesis before the sample size is calculated.
In summary, the main determinants of sample size calculation are the extent of variability in study parameters, expected difference if groups are compared, acceptable error and type of hypothesis. For every objective in the nine healthcare domains, additional sets of information may be required to calculate the sample size. This will be explained under each objective.
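For readers who like to see numbers, the interplay between acceptable error and type of hypothesis can be sketched in a few lines of Python. This is only an illustration and not something RG displays: the function names are my own, and the sketch assumes the common normal-approximation approach in which the required sample size grows in proportion to the square of the sum of two standard normal multipliers, one for alpha and one for power.

```python
from statistics import NormalDist


def z_multipliers(alpha=0.05, power=0.80, two_sided=True):
    """Return the standard normal multipliers for the chosen errors."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # A two-sided hypothesis splits alpha between both tails.
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(power)  # power = 1 - type 2 error
    return z_alpha, z_beta


def relative_sample_size(alpha=0.05, power=0.80, two_sided=True):
    """Factor by which many normal-approximation formulas scale n."""
    z_alpha, z_beta = z_multipliers(alpha, power, two_sided)
    return (z_alpha + z_beta) ** 2


if __name__ == "__main__":
    scenarios = [
        ("alpha 5%, power 80%, two-sided", {}),
        ("alpha 5%, power 80%, one-sided", {"two_sided": False}),
        ("alpha 5%, power 90%, two-sided", {"power": 0.90}),
        ("alpha 1%, power 80%, two-sided", {"alpha": 0.01}),
    ]
    for label, kwargs in scenarios:
        print(label, round(relative_sample_size(**kwargs), 2))
```

Running the sketch shows the pattern described above: a one-sided hypothesis needs a smaller factor than a two-sided one, while asking for higher power or a lower alpha increases it. The actual sample size still depends on variability and expected difference, which your statistician will combine with these multipliers.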
| Demystifying Summary Measures and Statistical Tests|| |
Looking at a list of hundreds of numbers, it is difficult to understand what the numbers are trying to convey. A bunch of data is of no practical value until it is summarised in the form of point estimates and ranges. Each data pattern has a particular method of summarising. As a general principle, numerical variables are summarised by means and standard deviation (if data are normally distributed) and medians/interquartile range (if data are skewed). Categorical variables such as gender or religion are summarised as ratios or proportions. Median is the preferred summary measure for ordinal variables such as grades and scales. The most appropriate summary measure for every objective will be described in the following sections.
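The general principle above can be tried out with the Python standard library. The sketch below is illustrative only: the function names and the small data sets are invented for demonstration, and deciding whether data are skewed remains a judgement to be made with your statistician.

```python
import statistics


def summarise_numeric(values, skewed=False):
    """Mean and SD for normally distributed data, median and IQR if skewed."""
    if skewed:
        # quantiles(n=4) returns the three quartile cut points.
        q1, _, q3 = statistics.quantiles(values, n=4)
        return {"median": statistics.median(values), "iqr": (q1, q3)}
    return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}


def summarise_categorical(values):
    """Proportion of observations falling in each category."""
    total = len(values)
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    return {category: count / total for category, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical times (seconds) to secure an airway; one extreme value.
    times = [30, 35, 40, 45, 200]
    print(summarise_numeric(times, skewed=True))
    # Hypothetical categorical outcome.
    outcomes = ["secured", "secured", "failed", "secured"]
    print(summarise_categorical(outcomes))
```

In the hypothetical times, the single value of 200 seconds pulls the mean far above most observations, which is why the median and interquartile range describe skewed data better.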
If sampling and sample size estimation ensure that the study sample is representative of the universe, then tests of significance inform us if the results can be applied to the universe. This concept is called generalisability of the results. In simple words, if the test of significance gives P < 0.05, then a result at least as extreme as the one observed would be expected in fewer than 5% of similar studies if there were truly no effect in the universe. By convention, such a result is considered statistically significant and can be generalised to the universe, provided the study sample is representative of the universe. Under each objective in the following sections, 'Research Genie' will assist you to speak the language of the statistician to facilitate sample size calculation. RG will also assist you to select the most appropriate summary measure and choose the test of significance for a given objective.
| Description Objective|| |
'To estimate the proportion of emergency situations with difficult airway access (Outcome/Event) in people presenting to emergency rooms of district level hospitals of India (Population)'
On selecting the 'Domain' box on RG, nine boxes, each indicating a domain relevant to healthcare, are displayed with the page header 'To which category does your research fit?' Choose the 'Description' box. The description template is displayed. You must faithfully follow the objective template. Every word is important. Any deviation will lead to confusion and difficulties in the future. The data pattern is a guide to ensure that you are on the right path. 'Present/absent' indicates either the presence or absence of the outcome, namely 'emergency situations with difficult airway access'. On swiping the screen, data required to calculate sample size are displayed. Population proportion is the proportion of difficult airways in emergencies as observed in the literature. The sample proportion is the proportion of difficult airways you expect in your scenario based on your prior clinical experience. Power is 100% minus the type 2 error (false negative). As the maximum acceptable type 2 error is 20%, power is always 80% or more. Higher power is assumed if high accuracy is expected from the study. Alpha (type 1 error) should be 5% or lower. The type of hypothesis is based on the direction assumed by the researcher, which has been explained in the previous section. With this information, you may use an online calculator or consult a statistician. On swiping the page, the descriptive statistics (summary measure) are displayed. You may depict the results either as a proportion of difficult airways in emergency situations or as the ratio of difficult to easily accessible airways in these situations. Here, you will observe the 95% confidence interval of the ratio or proportion. This measure is depicted as (x, y), where x is the lower bound of the 95% confidence interval and y is the upper bound. These measures are calculated using computer programmes such as SPSS (Statistical Package for the Social Sciences).
If the 95% confidence interval of difficult airway is (0.40, 0.45), then it means that in the universe you can expect the proportion of difficult airway to fall between 40/100 and 45/100 in 95% of the scenarios. Finally, swipe the page and the inferential statistics, also called test of significance, will be displayed. For this objective, it is the Z test of proportions. Here, we will introduce the concept of assumptions before applying the test of significance. When we communicate with a person in English and expect a response, we assume that the person understands the English language. Similarly, data must fulfil a certain set of assumptions before the Z test of proportions is applied. These assumptions are random sampling and independence of measurement (a measurement obtained in an individual should not be dependent on any other prior factor that can systematically bias the measurement). The assumptions ensure that data analysed by the computer programme are representative of the population/universe. The computer cannot ensure random sampling and independence of measurement; this has to be done by the researcher. This completes all the steps required to create knowledge in this domain. You will need to consult the statistician from the formulation of objectives till final analysis. RG will assist mutual understanding and make research (knowledge creation) a magical experience for both you and the statistician.
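To show how the data listed on RG feed into a calculation, here is a minimal Python sketch. It is not the method used by RG or by any particular online calculator: it assumes the widely taught normal-approximation formula for comparing a sample proportion with a population proportion, and a simple (Wald) confidence interval. The function names and the example proportions are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_one_proportion(p_population, p_sample, alpha=0.05,
                               power=0.80, two_sided=True):
    """Approximate n to detect p_sample against p_population."""
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(power)
    numerator = (z_alpha * sqrt(p_population * (1 - p_population))
                 + z_beta * sqrt(p_sample * (1 - p_sample)))
    return ceil((numerator / (p_sample - p_population)) ** 2)


def proportion_ci(events, n, confidence=0.95):
    """Wald confidence interval, returned as (lower bound, upper bound)."""
    p = events / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


if __name__ == "__main__":
    # Hypothetical inputs: 10% in the literature, 15% expected locally.
    print(sample_size_one_proportion(0.10, 0.15))
    # Hypothetical result: 85 difficult airways among 200 emergencies.
    print(proportion_ci(85, 200))
```

Note how the interval narrows as n grows, which is another way of seeing why a larger sample gives a more precise estimate. Your statistician may prefer an exact or continuity-corrected method, especially when proportions are close to 0 or 1.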
| Lab Range Objective|| |
'To estimate the time taken to secure a stable airway (Outcome/Event) using a supraglottic airway device in unstable patients with open mandibular fractures (Population) presenting to emergency room'
For this objective, you need to select the 'Lab Range' box in the domain. The diagram beside the objective clearly illustrates that in each individual a numerical parameter is measured. In our objective, it is 'time taken to secure a stable airway'. On swiping the page, data required to calculate sample size are displayed. Population mean is the mean time required to secure the airway in difficult emergency situations, with its associated standard deviation, derived from previous studies. The sample mean is the mean time to secure an airway in these situations in your experience. Alpha error, power and type of hypothesis are similar to the previous objective. The concept of effect size is introduced here. It is a standardised measure of the difference in the mean time to secure the airway that you intend to detect in your study. The numbers 0.2, 0.5 and 0.8 are indicative of small, medium and large effect size, respectively. This measure is determined by the researcher. On swiping the page, it displays that mean and standard deviation along with the 95% confidence interval of the mean are the most appropriate summary measures if data are normally distributed. Normally distributed data appear like a bell-shaped curve when represented on a graph. This representation is performed by SPSS or Microsoft Excel. If the data are skewed, median, mode, tertiles, percentiles or interquartile range is employed. The last swipe displays the Z test of means or Student's 't' test as the most appropriate test of significance for this objective. With this, you have a complete framework to create knowledge in this domain.
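The link between population mean, sample mean, standard deviation and effect size can be sketched as follows. This is an illustrative normal-approximation calculation and not RG's own method; the function name and the example values (60 seconds, 50 seconds, standard deviation 20 seconds) are assumptions. Calculators based on the 't' distribution usually return a slightly larger number.

```python
from math import ceil
from statistics import NormalDist


def sample_size_one_mean(population_mean, sample_mean, sd, alpha=0.05,
                         power=0.80, two_sided=True):
    """Return (approximate n, effect size) for a one-sample comparison."""
    # Effect size: the expected difference expressed in SD units.
    effect_size = abs(sample_mean - population_mean) / sd
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(power)
    n = ((z_alpha + z_beta) / effect_size) ** 2
    return ceil(n), effect_size


if __name__ == "__main__":
    # Hypothetical: literature mean 60 s, expected 50 s, SD 20 s.
    n, d = sample_size_one_mean(60, 50, 20)
    print(f"effect size {d}, sample size about {n}")
```

With these hypothetical values the effect size is 0.5, a medium effect by the convention mentioned above. Halving the effect size roughly quadruples the sample size, which is why small differences are expensive to detect.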
| Incidence or Prevalence Objective|| |
'To estimate the prevalence of death/irreversible brain hypoxia (Outcome/Event) in patients presenting to emergency rooms of district hospitals (Population) manned by inadequately trained airway managers'
This objective is similar to the description objective, except for the sampling strategy. In the description situation, the sampling is done in a hospital setup, whereas here, sampling is performed in the community setting. The data pattern is similar to the description situation where proportions are estimated. On swiping the page, data required for sample size calculation are displayed. Population incidence/prevalence is derived from previous literature. Sample incidence/prevalence is based on your presumption about the prevalence of death/irreversible brain hypoxia. Power, alpha error and hypothesis assumptions are similar to previous objectives. Incidence/prevalence is a point estimate. On swiping the page, descriptive statistics for the objective are described. The 95% confidence interval is the limit within which incidence/prevalence may lie in 95% of similar scenarios. The most appropriate test of significance is displayed on the next page, namely the Z test of proportions in this case. This completes the framework to create knowledge in this domain.
| Therapy Objective|| |
'To compare the efficacy of securing the airway measured as time to secure airway (outcome) in patients presenting to emergency rooms of district hospitals with restricted mouth opening (Population) using supraglottic airway device (intervention 1) versus fibreoptic-guided intubation (Intervention 2)'
This objective gets displayed when you choose therapy in the domain box. On swiping the page, data required to calculate sample size for two groups appear. This is for therapy situations where two types of interventions are compared. The next page displays data required when three types of interventions are compared. In the first situation, the mean difference is the expected mean difference between time to secure the airway with a supraglottic airway device and with fibreoptic-guided intubation. Standard deviations 1 and 2 are those of the mean time to secure the airway by each technique, namely supraglottic airway device and fibreoptic-guided intubation. These data are derived from previous studies or a small pilot study done at your centre. The effect size is based on the above measures, and sample size calculators derive it from them. Power, alpha error and hypothesis assumptions are similar to previous objectives. In the situation where three interventions will be compared, between- and within-group variances are derived from previous literature. Effect size is decided by the researcher based on criticality of the outcomes. Number of measurements indicates whether outcomes will be measured once or over repeated periods. Power, alpha error and hypothesis assumptions are similar to previous objectives. The next page displays summary measures (descriptive statistics) for two groups. Here, it will be the difference in the mean time to secure the airway by the two techniques if data are normally distributed. If data are not normally distributed, the difference in median time to secure the airway is the most appropriate method to summarise the results. The next page shows a table to summarise results when comparing three groups. It is a corollary to the previous situation. Here, the difference in mean time to secure the airway between each of the three groups will be described.
The 95% confidence intervals of the difference between mean times to secure the airway give an estimate of the upper and lower bounds of the difference in time to secure the airway in 95% of similar scenarios. Moving ahead by swiping the pages, we reach the tests of significance to be employed for two-intervention and three-intervention comparisons. To test the statistical significance of the difference between two interventions, the independent-samples t-test is employed if data are normally distributed and the Mann–Whitney U test (also called the Wilcoxon rank-sum test) if data are skewed and assumptions are violated. The tests employed when assumptions are violated are less powerful in assisting us to generalise the results to the universe. If three interventions are compared, analysis of variance is used if data are normally distributed and the Kruskal–Wallis test if assumptions are violated. With this, we complete the framework to create new knowledge in this domain.
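For the two-intervention situation, the way a calculator combines mean difference and the two standard deviations can be sketched as below. This is a hedged illustration using the normal-approximation formula for two independent groups of equal size; it is not taken from RG, and the function names and example values are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cohens_d(mean_difference, sd1, sd2):
    """Effect size: mean difference divided by the pooled SD."""
    pooled_sd = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return abs(mean_difference) / pooled_sd


def sample_size_two_means(mean_difference, sd1, sd2, alpha=0.05,
                          power=0.80, two_sided=True):
    """Approximate number of patients needed in EACH group."""
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(power)
    n = (z_alpha + z_beta) ** 2 * (sd1 ** 2 + sd2 ** 2) / mean_difference ** 2
    return ceil(n)


if __name__ == "__main__":
    # Hypothetical: 15 s expected difference, SDs of 20 s and 25 s.
    print("effect size:", round(cohens_d(15, 20, 25), 2))
    print("patients per group:", sample_size_two_means(15, 20, 25))
```

The same sketch applies to the cost objective if costs are substituted for times. It does not cover the three-intervention situation, for which between- and within-group variances are needed.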
| Cost Objective|| |
'To compare the cost (Outcome/Event) of using supraglottic airway devices (intervention 1) and fibreoptic-guided intubation (Intervention 2) in patients presenting to emergency rooms of district hospitals with restricted mouth opening (Population)'
This objective is displayed on selecting cost from the list of domains in RG. The data pattern for this objective is similar to the therapy objective. Hence, the sample size calculation, descriptive measures and test of significance remain the same. The cost for each intervention is substituted in place of time to secure the airway. Thus, the sample size is calculated based on the standard deviation of cost for supraglottic airway and fibreoptic-guided intubation. The means and difference in means are for the costs. If a third intervention is included, then within- and between-group variances are derived from previous literature or pilot work done at your centre. Power, alpha error and hypothesis assumptions are similar to previous objectives. As discussed in the second article of this series, the outcome can be cost, cost-effectiveness, cost utility or cost benefit. All steps required for knowledge creation in this domain are complete with this section. Comparing costs is a critical outcome used by administrators and government agencies for policy and advocacy.
| New Test Objective|| |
'To compare the accuracy of decision making using smartphone-based application (New test) to predict probability of securing the airway using a supraglottic airway device (Outcome/Event) among patients presenting to emergency rooms of district hospitals with restricted mouth opening (Population)'
This objective is unique for its peculiar data patterns. It is displayed on selecting new test in the domain box. On swiping the page, data required for sample size calculation are displayed. The new test is a smartphone-based application and the reference is other conventional methods of predicting difficult airway. Power, alpha error and hypothesis assumptions are similar to previous objectives. Summary measures (descriptive statistics) are sensitivity (measure of true positives), specificity (measure of true negatives), positive predictive value (true positive when test is positive), negative predictive value (true negative when test is negative) and likelihood ratio (measure of probability of truth in various test outcomes). All these are described in detail in the second article in this series. On close observation of data patterns in the opening page, you will notice that test outcomes can be either numerical, like time to secure the airway, or categorical, like airway secured or not. Accordingly, the test of significance is Chi-square/Fisher's exact test for ordinal or dichotomous outcomes (categorical variables). Fisher's exact test is used when certain assumptions are not fulfilled. You may consult your statistician to understand this concept. I am avoiding a detailed explanation to reduce mental strain for you! For tests with numerical outcomes, a paired t-test is the best to evaluate statistical significance. Our journey of knowledge creation in this domain comes to completion at this stage.
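The summary measures for a new test with a dichotomous outcome all come from a two-by-two table of the new test against the reference. The sketch below shows the arithmetic; the function name and the counts are hypothetical and chosen only for illustration.

```python
def diagnostic_accuracy(true_pos, false_pos, false_neg, true_neg):
    """Summary measures from a 2x2 table of new test versus reference."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    # Likelihood ratios are undefined when the denominator is zero.
    lr_positive = (sensitivity / (1 - specificity)
                   if specificity < 1 else float("inf"))
    lr_negative = ((1 - sensitivity) / specificity
                   if specificity > 0 else float("inf"))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive predictive value": ppv,
        "negative predictive value": npv,
        "positive likelihood ratio": lr_positive,
        "negative likelihood ratio": lr_negative,
    }


if __name__ == "__main__":
    # Hypothetical counts: app prediction versus reference method.
    results = diagnostic_accuracy(true_pos=40, false_pos=10,
                                  false_neg=5, true_neg=45)
    for name, value in results.items():
        print(f"{name}: {value:.2f}")
```

Confidence intervals for each of these measures are best obtained from your statistician or a statistical package.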
| Risk Measurement Objective|| |
'To estimate the risk (Outcome/Event) of having death/irreversible brain injury (Outcome) in emergency airway management by residents trained using conventional observation training (At-risk group) in comparison to those trained using advanced simulators with embedded training algorithms for optimal positioning and transport (Not at-risk group)'
On selecting the risk measurement box from the domain box, you will observe that the data pattern is different. Here, we are comparing two categorical data sets. The next page displays the data required to calculate the sample size. The proportion in the control group is death/irreversible brain injury (Outcome) in emergency airway management by residents trained using advanced simulators with embedded training algorithms for optimal positioning and transport (Not at-risk group). The proportion in the case group is death/irreversible brain injury (Outcome) in emergency airway management by residents trained using conventional observation training (At-risk group). Power, alpha error and hypothesis assumptions are similar to previous objectives. Odds ratio and relative risk reduction are commonly employed summary measures with 95% confidence intervals which are displayed in the next page. On swiping to reach inferential statistics, it displays the most appropriate tests of significance, namely Chi-square test and Fisher's exact test. As discussed in the previous section, Chi-square is employed if assumptions are fulfilled, and Fisher's test if assumptions are violated. The entire framework for knowledge creation in this domain is complete with this step.
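The odds ratio and relative risk with their 95% confidence intervals can be computed from the counts in the at-risk and not at-risk groups. The sketch below assumes the standard log-based (Woolf-type) approximation for the confidence intervals and requires all four counts to be greater than zero; the function name and counts are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_measures(events_at_risk, no_events_at_risk,
                  events_not_at_risk, no_events_not_at_risk,
                  confidence=0.95):
    """Odds ratio and relative risk, each with a (lower, upper) CI."""
    a, b = events_at_risk, no_events_at_risk
    c, d = events_not_at_risk, no_events_not_at_risk
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (exp(log(odds_ratio) - z * se_log_or),
             exp(log(odds_ratio) + z * se_log_or))

    relative_risk = (a / (a + b)) / (c / (c + d))
    se_log_rr = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    rr_ci = (exp(log(relative_risk) - z * se_log_rr),
             exp(log(relative_risk) + z * se_log_rr))

    return {"odds ratio": (odds_ratio, or_ci),
            "relative risk": (relative_risk, rr_ci)}


if __name__ == "__main__":
    # Hypothetical: 12 of 100 adverse outcomes with conventional training,
    # 4 of 100 with simulator-based training.
    for name, (estimate, (low, high)) in risk_measures(12, 88, 4, 96).items():
        print(f"{name}: {estimate:.2f} (95% CI {low:.2f}, {high:.2f})")
```

When any count is small, the approximation becomes unreliable, which mirrors the situation in which Fisher's exact test replaces the Chi-square test.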
| Correlation Objective|| |
'To estimate the strength of correlation (Outcome/Event) between time spent on accessing the difficult airway by training using advanced simulation systems measured as hours of training (Quantitative parameter 1) and time to secure airway in an emergency situation with restricted mouth opening, measured in seconds (Quantitative parameter 2)'
On selecting the correlation box, this objective is displayed. The term 'strength of correlation' is a new term. It is a measure of the degree to which time spent on training using advanced simulation models influences time to secure the airway in emergency situations. Computer programmes such as SPSS calculate correlation coefficients based on the data of both variables. Coefficients of more than 0.5 (ignoring the sign) indicate the presence of correlation and values above 0.7 indicate good correlation. On swiping the page, data required for sample size calculation get displayed. Population correlation coefficient between time to train and time to secure the airway is derived from previous literature. Sample correlation coefficient is your assumption based on experience or pilot study. Power, alpha error and hypothesis assumptions are similar to previous objectives. The next page displays the correlation coefficient with 95% confidence interval as the most appropriate method to summarise the results. The following page states Pearson's correlation and Spearman's correlation as the most appropriate statistical tests for the correlation objective. While Pearson's correlation is used for normally distributed data, Spearman's correlation is used for skewed data. This completes the framework for knowledge creation in this domain.
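A rough idea of the numbers involved can be obtained with the sketch below. It assumes Fisher's z transformation and tests the expected correlation against no correlation at all, which is a simplification of the population-versus-sample comparison shown on RG. The function names and the example coefficient are hypothetical.

```python
from math import atanh, ceil, sqrt, tanh
from statistics import NormalDist


def sample_size_correlation(expected_r, alpha=0.05, power=0.80,
                            two_sided=True):
    """Approximate n to detect expected_r against zero correlation."""
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(power)
    fisher_z = atanh(abs(expected_r))  # Fisher's z transformation
    return ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)


def correlation_ci(r, n, confidence=0.95):
    """Confidence interval of r via Fisher's z, as (lower, upper)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z / sqrt(n - 3)
    return tanh(atanh(r) - half_width), tanh(atanh(r) + half_width)


if __name__ == "__main__":
    # Hypothetical: more training hours, fewer seconds to secure airway.
    print(sample_size_correlation(-0.6))
    print(correlation_ci(-0.6, 40))
```

The sign of the coefficient does not change the sample size; it only indicates the direction of the relationship.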
| Beliefs/Perception/Experience Objective|| |
'To describe the perceptions of hospital managers and finance authorities about investing in advanced airway access training simulators'
The objective that gets displayed on choosing beliefs/perceptions/experiences from the domain box is different from the remaining eight domains. These questions are answered by qualitative research methods. Data are not in the form of numbers. Here, data are either visuals like images/videos of ethnographic observations or transcripts of conversations from focus group discussions or in-depth interviews. There is no fixed sample size; data saturation, the point at which no new information is gathered, determines termination of the study. This is displayed on the next page. The page that appears on the following swipe shows a pictorial representation of qualitative data results. You will observe that the neonatal intensive care environment is depicted as a dynamic apparatus, with on-going adaptation to every new challenge. Similarly, the perceptions of hospital managers and finance authorities should be explored and described. A comprehensive discussion on qualitative research methodology is beyond the scope of this article. The reader can consult easy-to-understand resources and graduate to more advanced literature to understand this specialised and important domain of healthcare research.
| Does the Magical Journey End Here?|| |
You would have begun to feel that the magical journey is coming to an end. Every section seems to suggest that. We started the journey with clearly defined objectives in nine relevant healthcare domains in a specific clinical context. Research questions and objectives were generated to address all the challenges in managing a difficult airway in emergency situations. Following this, we created a conceptual framework and got familiar with operationalising the objectives with an understanding of variables. The concept of confounding variables was introduced. The most appropriate study design for each objective and the acceptable standards for ensuring quality were described. A pictorial and easy-to-understand framework to conduct research in any area of healthcare was presented. 'RG', an educational smartphone-based app available for free download from Play Store for Android and App Store for iPhones was introduced. The method to use RG to construct a framework for knowledge creation (Research methodology) in difficult airway was outlined. The magical journey does not stop here. Truly fulfilling and magical moments are experienced when the new knowledge created is utilised to make a difference in the universe. The cycle of innovation is all about taking your research results to the universe.
The next and final article in this series will educate you on making an impact on the world with your knowledge innovation. Till then, identify a challenging healthcare situation and create the framework to invent and discover new knowledge. If you have any questions or clarifications, the Genie will be at your service at '[email protected]'.
We acknowledge Dr George D'Souza, Dean, St John's Medical College, for administrative support.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
| References|| |
Ramesh A. A magical journey into knowledge creation in emergency difficult airway access – Defining the destination, reserving your seats on the magic carpet. Airway 2020;3:119-26.
Ramesh A. A magical journey into knowledge creation in emergency difficult airway access – Planning your journey with “Research Genie”. Airway 2021;4:21-7.
Singer C. From Magic to Science: Essays on the Scientific Twilight. New York: Dover; 1958.
Lwanga SK, Lemeshow S. Sample Size Determination in Health Studies. Geneva: World Health Organization; 1991.
Altman DG. Practical Statistics for Medical Research. London: Chapman and Hall; 1991.
Armitage P. Encyclopedia of Biostatistics. 2nd ed. Chichester GB: John Wiley and Sons Ltd; 2005.
Robert H. How to Tell Liars from Statisticians. New York: Marcel Dekker; 1983.
Moye LA. Statistical Reasoning in Medicine: The Intuitive P Value Primer. New York: Springer-Verlag; 2000.
Bowers D. Statistics from Scratch for Healthcare Professionals. New York: John Wiley and Sons; 1996.
Field A. Discovering Statistics using SPSS (and Sex and Drugs and Rock N Roll). 3rd ed. London: Sage Publications; 2010.
Park K. Park's Textbook of Preventive and Social Medicine. 21st ed. Jabalpur: M/S Banarsidas Bhanot Publishers; 2012.
Schulz KF, Grimes DA. The Lancet Handbook of Essential Concepts in Clinical Research. Michigan: Elsevier; 2006.
Flick U, von Kardoff E, Steinke I, editors. A Companion to Qualitative Research. Thousand Oaks: Sage Publications; 2004.
Bhandarkar PL, Wilkinson TS. Methodology and Techniques of Social Research. Mumbai: Himalaya Publishing House; 2010.
Flick U. An Introduction to Qualitative Research. 4th ed. London: Sage Publications; 2009.
Greene R. Mastery. London: Profile Books Ltd; 2012.
Narayanamurti V, Odumosu T. Cycles of Invention and Discovery – Rethinking the Endless Frontier. Cambridge, Massachusetts: Harvard University Press; 2016.