Protocol for the field-test and psychometric validation of the pectus excavatum evaluation questionnaire in the Dutch pectus excavatum population
Introduction
Beyond its visible manifestation, pectus excavatum often impairs cardiopulmonary function and psychosocial well-being, diminishing the health-related quality of life of affected individuals (1). A robust, disease-specific patient-reported outcome measure is essential to assess the impact of the deformity on health-related quality of life. The pectus excavatum evaluation questionnaire (PEEQ) is the most widely used instrument for the pectus excavatum population (2,3). Comprising 22 items, this questionnaire offers a comprehensive assessment divided into two sections: one completed by the patient and the other by the patient’s parent or legal guardian.
The PEEQ was initially developed and field-tested in the United States, which limits its direct applicability to other countries and non-English-speaking populations because of cultural and linguistic differences. Our research group previously performed a Dutch translation, cultural adaptation and linguistic validation of the PEEQ (4). Here, we outline the protocol for the field test and psychometric validation of the Dutch version of the PEEQ to ensure the suitability of this instrument for assessing the health-related quality of life in the Dutch pectus excavatum population and to facilitate tailored patient care.
Study objective
The primary objective of this study is to evaluate the following psychometric properties:
- Structural validity: assessment of the underlying factor structure.
- Internal consistency: evaluation of the reliability of each subscale and overall questionnaire.
- Test-retest reliability: determination of the stability of the PEEQ over time.
- Floor and ceiling effects: identification of potential score distributions limiting the questionnaire’s sensitivity.
- Responsiveness: assessment of the ability of the PEEQ to detect clinically meaningful changes.
- Smallest detectable change (SDC): calculation of the minimal change required to reflect a real change beyond the measurement error.
The secondary objective of the study is to refine the Dutch version of the PEEQ where appropriate.
Methods
Study design
This is a prospective, longitudinal single-center study conducted at Zuyderland Medical Center, Heerlen, the Netherlands. The study design adheres to the consensus-based standards for the selection of health measurement instruments (COSMIN) study design checklist for patient-reported outcome measures (5).
Patient and public involvement
No patients were involved in designing the study protocol. However, as part of our previous study, feedback from patients and their parents regarding the clarity, relevance and comprehensibility of the Dutch translation of the PEEQ was obtained and the questionnaire was refined based on their recommendations (4).
Participant selection
Eligible participants will be pectus excavatum patients aged 12 to 18 years who are scheduled for a Nuss procedure. Parents or legal guardians will also be invited to complete the parent section of the PEEQ. The exclusion criterion will be the inability of the patient or their parent/guardian to complete the corresponding section of the questionnaire due to language barriers.
Participant timeline
Participants will be identified and enrolled during their preoperative visit at the outpatient clinic by a member of the research team. Data will be collected at three moments (Figure 1):
- Preoperatively: the PEEQ will be completed on paper during the outpatient clinic visit.
- Postoperatively: the PEEQ will be completed electronically 2 months post-surgery.
- Test-retest reliability: the PEEQ will be administered electronically at least two weeks after a previous assessment.

Automatic reminders will be sent if the electronic questionnaire is not completed within one week, with up to two reminders spaced one week apart.
Outcomes and statistical analysis
Besides the primary outcomes, the following patient characteristics will be collected: age, sex and preoperative Haller index. Analysis will be performed using JASP software (JASP Team, JASP statistics for MacOS, V.0.19.0). The normality of the distribution of the PEEQ total scores will be assessed using histograms and Q-Q plots.
Structural validity
A Kaiser-Meyer-Olkin (KMO) test and Bartlett’s test of sphericity will be performed to assess sampling adequacy before further structural analyses. A KMO value ≥0.70 and P<0.05 in Bartlett’s test of sphericity indicate that the collected data are suitable for factor analysis (6,7). Exploratory factor analysis (EFA), using principal axis factoring and a promax rotation method, will be conducted to identify the underlying factor structure for both the child and parent sections (8,9). These factor structures will later be subjected to confirmatory factor analysis (CFA), using polychoric correlations and robust maximum likelihood estimation, to evaluate the validity of the structures derived from EFA (8,10). Within EFA, the Kaiser criterion (eigenvalue >1), explained variance (≥50%), and interpretability principle will be applied to determine the number of factors to be retained. Squared multiple correlations will be used to compute the communality estimates, and variables with estimates <0.4 will be removed (9). Items with a factor loading of at least 0.40 will be considered to contribute to the specific factor (11).
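The factor-retention rules described above can be illustrated with a minimal sketch. It assumes the eigenvalues of the item correlation matrix are already available (for instance, exported from the statistical software). The function name `retain_factors` and the example eigenvalues are hypothetical and not part of the protocol. The interpretability principle requires human judgment and is not captured here.

```python
def retain_factors(eigenvalues, min_eigenvalue=1.0, min_explained=0.50):
    """Apply the Kaiser criterion and report the variance explained.

    eigenvalues: eigenvalues of the item correlation matrix (assumed input).
    Returns the number of factors with eigenvalue > min_eigenvalue and the
    share of total variance those factors explain.
    """
    if not eigenvalues:
        raise ValueError("eigenvalues must not be empty")
    ordered = sorted(eigenvalues, reverse=True)
    retained = [ev for ev in ordered if ev > min_eigenvalue]
    explained = sum(retained) / sum(ordered)
    return {
        "n_factors": len(retained),
        "explained_variance": explained,
        "meets_variance_target": explained >= min_explained,
    }


# Hypothetical eigenvalues for a 10-item section (illustrative only).
example_eigenvalues = [4.0, 1.5, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.3]
result = retain_factors(example_eigenvalues)
```

In this illustration two factors exceed the eigenvalue threshold, and together they account for slightly more than half of the total variance.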
As part of the CFA, item-scale convergent validity will be tested by calculating corrected item-total correlations to confirm whether individual items contribute effectively to measuring the intended construct of the subscale. Items with an item-total correlation <0.5 will be further inspected by assessing their residual variance, with a value <0.3 considered acceptable (8,12). Furthermore, each factor’s average variance extracted (AVE) will be determined, with values >0.5 indicating adequate construct-level convergent validity. The correlation coefficient between the factors will be compared to the square root of this AVE to test discriminant validity. Model fit will be evaluated using the chi-square-to-degrees-of-freedom ratio (χ²/df), wherein a value ≤2 indicates a good fit, while a value between 2 and 3 indicates an acceptable fit (12). Additionally, the following goodness-of-fit indices and criteria will be applied: Root Mean Square Error of Approximation (RMSEA) <0.10, Standardized Root Mean Square Residual (SRMR) <0.08, Goodness of Fit Index (GFI) >0.9, Comparative Fit Index (CFI) >0.9, Tucker-Lewis Index (TLI) >0.9 and Non-normed Fit Index (NNFI) >0.9 (13,14). In case of an inadequate model fit, adjustments (e.g., allowing error terms to correlate or adding paths between variables) will be made in line with the underlying model theory, guided by modification indices. The models will be compared using the Akaike information criterion (AIC) and Bayesian information criterion (BIC), prioritizing simplicity to avoid unnecessary complexity when selecting the final model. Individual items will be reviewed during various stages of the psychometric validation process to identify any problematic items that may require revision or removal. For example, items contributing to low internal consistency or test-retest reliability of the entire subscale will be addressed.
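To make the AVE-based checks concrete, the sketch below computes the AVE as the mean of the squared standardized loadings of a factor and compares its square root with the inter-factor correlation (commonly known as the Fornell-Larcker criterion). This is an illustration under stated assumptions, not the study's analysis code: the loadings, the factor labels and the function names are hypothetical.

```python
import math


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one factor."""
    if not loadings:
        raise ValueError("at least one loading is required")
    return sum(value * value for value in loadings) / len(loadings)


def discriminant_validity_ok(ave_a, ave_b, inter_factor_corr):
    """True if sqrt(AVE) of both factors exceeds the absolute correlation."""
    r = abs(inter_factor_corr)
    return math.sqrt(ave_a) > r and math.sqrt(ave_b) > r


# Hypothetical standardized loadings for two illustrative factors.
factor_a_loadings = [0.80, 0.70, 0.60]
factor_b_loadings = [0.90, 0.80]

ave_a = average_variance_extracted(factor_a_loadings)
ave_b = average_variance_extracted(factor_b_loadings)

# Convergent validity: AVE > 0.5 is the threshold named in the text.
convergent_a = ave_a > 0.5
convergent_b = ave_b > 0.5

# Discriminant validity for a hypothetical inter-factor correlation of 0.50.
discriminant = discriminant_validity_ok(ave_a, ave_b, 0.50)
```

In this illustration the first factor falls just below the 0.5 AVE threshold while the second exceeds it, and both square roots exceed the assumed correlation of 0.50.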
Internal consistency
Cronbach’s alpha coefficient, a measure of internal consistency, will be calculated for each subscale. A value >0.70 indicates sufficient reliability of the questionnaire for application at the group level, while a value >0.90 implies suitability for individual assessment (14).
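As an illustration of the calculation, the sketch below computes Cronbach's alpha from a respondents-by-items score table using the standard formula α = k/(k−1) × (1 − Σ item variances / variance of total scores). The example scores and the function names are hypothetical. The study itself specifies JASP for the analysis.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(item_scores) < 2:
        raise ValueError("at least two respondents are required")
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    columns = list(zip(*item_scores))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def interpret_alpha(alpha):
    """Map alpha to the thresholds named in the text (>0.70 and >0.90)."""
    if alpha > 0.90:
        return "suitable for individual assessment"
    if alpha > 0.70:
        return "sufficient at group level"
    return "insufficient"


# Hypothetical subscale: 4 respondents, 3 items.
example_scores = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 4]]
alpha = cronbach_alpha(example_scores)  # equals 46/49, roughly 0.94
```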
Test-retest reliability
Test-retest reliability for each subscale will be evaluated using a two-way mixed-effects model for the intraclass correlation coefficient (ICC), type (3,1), with 95% confidence intervals (CI) reported (15). The strength of agreement will be interpreted following the guideline provided by Cicchetti [1994] (16): <0.40= poor, 0.40–0.59= fair, 0.60–0.74= good, 0.75–1.00= excellent. For individual items, quadratic weighted Cohen’s kappa will be calculated along with its 95% CI to identify problematic items, with interpretation based on the standards proposed by Landis and Koch [1977] (17): 0= poor, 0.01–0.20= slight, 0.21–0.40= fair, 0.41–0.60= moderate, 0.61–0.80= substantial, and 0.81–1.00 = almost perfect.
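For readers unfamiliar with ICC type (3,1), the following sketch derives it from the mean squares of a two-way ANOVA without replication, using ICC(3,1) = (MSR − MSE) / (MSR + (k − 1) × MSE), where MSR is the between-subjects mean square, MSE the residual mean square and k the number of measurement occasions. The data and function names are hypothetical, and confidence intervals are not computed here.

```python
def icc_3_1(scores):
    """ICC(3,1) for a list of subjects, each a list of k repeated scores."""
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two occasions")
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_occasions = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_occasions

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_subjects + (k - 1) * ms_error
    if denominator == 0:
        raise ValueError("scores have no variance")
    return (ms_subjects - ms_error) / denominator


def cicchetti_label(icc):
    """Interpretation bands quoted in the text (Cicchetti, 1994)."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Hypothetical test-retest subscale scores for four participants.
test_retest = [[1, 2], [2, 1], [3, 4], [4, 3]]
icc_value = icc_3_1(test_retest)
```

Because type (3,1) measures consistency rather than absolute agreement, a constant shift between test and retest (for example, every participant scoring two points higher) does not lower the coefficient.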
Floor and ceiling effects
The floor and ceiling effects will be assessed by calculating the proportion of participants scoring the lowest or highest possible scores. Effects are considered substantial if >15% of the participants score at the extremes for a specific item or subscale (18).
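A minimal sketch of this check is given below. The score range (1 to 4) and the example data are assumptions for illustration only; the 15% threshold is the one named in the text.

```python
def floor_ceiling_effects(scores, min_score, max_score, threshold=0.15):
    """Proportion of respondents at the lowest and highest possible score."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_share = sum(1 for s in scores if s == min_score) / n
    ceiling_share = sum(1 for s in scores if s == max_score) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical item scored from 1 to 4 by ten respondents.
item_scores = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
effects = floor_ceiling_effects(item_scores, min_score=1, max_score=4)
```

In this example one of ten respondents scores at the floor (10%, below the threshold) and four of ten at the ceiling (40%, above the threshold), so only a ceiling effect would be flagged.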
Responsiveness
The responsiveness of the questionnaire will be determined using the construct approach in which preoperative and postoperative scores are compared using a paired t-test or Wilcoxon signed-rank test for skewed data. The effect size will be expressed in Cohen’s d and is calculated as the mean difference divided by the standard deviation of the difference. We hypothesize that the total score on the PEEQ and mean scores per subscale improve after surgical correction of the pectus excavatum deformity as demonstrated by the original questionnaire (2,3). Responsiveness of individual items will be evaluated using a paired t-test or Wilcoxon signed-rank test, as appropriate, to guide further item refinement.
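The effect size definition above (mean of the paired differences divided by the standard deviation of those differences) can be sketched as follows. The scores are hypothetical, and the sign of the result depends on the scoring direction of the questionnaire, which is not assumed here.

```python
from statistics import mean, stdev


def cohens_d_paired(pre, post):
    """Cohen's d for paired data: mean difference / SD of the differences."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    if len(pre) < 2:
        raise ValueError("at least two pairs are required")
    differences = [after - before for before, after in zip(pre, post)]
    sd_diff = stdev(differences)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    return mean(differences) / sd_diff


# Hypothetical preoperative and postoperative subscale scores.
pre_scores = [2, 3, 2, 4, 3]
post_scores = [3, 5, 3, 4, 5]
effect_size = cohens_d_paired(pre_scores, post_scores)
```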
SDC
The standard error of measurement (SEM) quantifies the precision of the scores across different time points and is calculated as (18):
$$\mathrm{SEM} = \sqrt{\mathrm{MSE}}$$
Where mean square error (MSE) is the error term obtained from repeated measures analysis of variance (ANOVA).
The SDC represents the minimum change in points a patient must score on the questionnaire over time to ensure the observed change reflects a real change and not a measurement error. The SDC is expressed in points on the PEEQ and will be calculated for the mean scores per subscale, and the total scores of the child’s section, parent’s section and entire questionnaire. The SDC at a 95% confidence level will be calculated as (18):
$$\mathrm{SDC} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$$
Where:
- 1.96 is the z-score corresponding to a 95% confidence level;
- $\sqrt{2}$ adjusts for the measurement error contributed by both time points.
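A short worked example may help to show how the two quantities relate. The sketch assumes the commonly used forms SEM = √MSE and SDC = 1.96 × √2 × SEM, in line with the definitions given above. The MSE value and the function names are hypothetical.

```python
import math


def standard_error_of_measurement(mse):
    """SEM as the square root of the ANOVA error term (assumed form)."""
    if mse < 0:
        raise ValueError("MSE cannot be negative")
    return math.sqrt(mse)


def smallest_detectable_change(sem, z=1.96):
    """SDC at the confidence level implied by z (1.96 for 95%)."""
    return z * math.sqrt(2) * sem


# Hypothetical error term from a repeated measures ANOVA on a subscale.
example_mse = 4.0
sem = standard_error_of_measurement(example_mse)  # 2.0 points
sdc = smallest_detectable_change(sem)             # about 5.5 points
```

Under these assumed numbers, a change of roughly 5.5 points or more on the subscale would be needed before an individual patient's change could be distinguished from measurement error with 95% confidence.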
Data collection
All collected data will be stored in a secured database (Research Manager Software, Deventer, the Netherlands) and coded by participant ID. The patient characteristics will be extracted from the medical records. Data from the PEEQ assessment on paper will be transferred to the database by our research team. To complete the PEEQ electronically, participants will access the questionnaire through a secure link sent to their email and complete it directly within the Research Manager System. Data from the Research Manager System will be securely stored and retained for 15 years. Files containing the participant’s identifiers will be stored separately from the database. Access to these files and the database is restricted to the research team.
Sample size calculation
The sample size is based on the COSMIN study design checklist recommendations (5). Preoperative and postoperative data, and test-retest data from 66 patients are sufficient for psychometric validation of the PEEQ.
Ethics and dissemination
Ethical considerations
This study will be conducted in accordance with the Declaration of Helsinki and its subsequent amendments, and ethical approval was obtained from the local Medical Ethics Review Committee, METC Zuyderland Medical Center and Zuyd University of Applied Sciences (registration No. METCZ20210182; date of approval: December 6th, 2021). Prior to inclusion, written informed consent will be obtained from the patient and, if applicable, from the patient’s parent(s) or legal guardian(s). The written informed consent form will also be signed by the researcher who provided the study information. The signed informed consent forms will be stored at the study site for 15 years. Participation is voluntary and participants are allowed to withdraw at any time. They will receive follow-up as part of standard care. Patients will not directly benefit from participation in this study and there are no risks related to study participation.
Dissemination
The results of the current study will be submitted to an appropriate international medical journal. Data cannot be linked to individual participants. All authors of the protocol are eligible to participate in the dissemination process.
Discussion
Our center is a large tertiary referral center for chest wall deformities, attracting patients from across the Netherlands. Although this study will be conducted in a single center, the findings are therefore likely generalizable to other parts of the country. Nevertheless, because responsiveness will be assessed 2 months postoperatively, the long-term responsiveness of the PEEQ to changes in health-related quality of life may not be fully captured by this study. Future studies are needed to confirm the robustness of the PEEQ’s responsiveness and predictive validity over time. A potential source of bias in this study is the selection of a specific group of patients (those aged 12 to 17 years old who are scheduled for a Nuss procedure), as they may not fully represent all patients with pectus excavatum. Patients with a mild deformity, younger children or adults, and patients treated conservatively or using other surgical techniques, such as the Ravitch technique, will not be included in the study. In the absence of a definition of pectus excavatum, recruitment is limited to those undergoing the Nuss procedure to ensure inclusion of patients with a clinically significant deformity. Patients undergoing the Ravitch procedure will be excluded to maintain a homogeneous patient group, given that this technique is reserved in this age group for revision cases or specific comorbidities not adequately captured by the PEEQ. Similarly, the decision to only include patients under 18 years of age was driven by the necessity to validate the parent’s section of the PEEQ. Patients younger than 12 years old will be excluded, as our center and many others do not typically perform surgery in this age group (19). While these eligibility criteria enhance the internal validity of the study by focusing on a relevant and well-defined patient population, they may limit the generalizability to other subpopulations of pectus excavatum.
Despite these limitations, the chosen patient sample represents the majority of individuals presenting with pectus excavatum at referral centers, thereby supporting the relevance and applicability of the study findings for research purposes and routine clinical practice. Future research could explore validation in other subpopulations, such as adults, to enhance the applicability of the PEEQ.
Acknowledgments
None.
Footnote
Peer Review File: Available at https://tp.amegroups.com/article/view/10.21037/tp-2024-616/prf
Funding: None.
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tp.amegroups.com/article/view/10.21037/tp-2024-616/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Mohamed JS, Tan JW, Tam JKC. Quality of life with minimally invasive repair of pectus excavatum: a systematic review and meta-analysis. Ann Transl Med 2023;11:407.
2. Lawson ML, Cash TF, Akers R, et al. A pilot study of the impact of surgical repair on disease-specific quality of life among patients with pectus excavatum. J Pediatr Surg 2003;38:916-8.
3. Kelly RE Jr, Cash TF, Shamberger RC, et al. Surgical repair of pectus excavatum markedly improves body image and perceived ability for physical activity: multicenter study. Pediatrics 2008;122:1218-22.
4. Janssen N, Daemen JHT, van Polen EJ, et al. Translation, cultural adaptation and linguistic validation of the pectus excavatum evaluation questionnaire. J Thorac Dis 2022;14:2556-64.
5. Mokkink L, Prinsen C, Patrick D, et al. COSMIN Study Design checklist for Patient-reported outcome measurement instruments. 2019. Available online: https://www.cosmin.nl/wp-content/uploads/COSMIN-study-designing-checklist_final.pdf
6. Watkins MW. Exploratory Factor Analysis: A Guide to Best Practice. J Black Psychol 2018;44:219-46.
7. Hoelzle JB, Meyer GJ. Exploratory Factor Analysis: Basics and Beyond. In: Schinka JA, Velicer WF, Weiner IB, et al. Handbook of psychology: Research methods in psychology. 2nd ed. New York: Wiley; 2012:164-88.
8. Brown T. Confirmatory factor analysis for applied research. 2nd ed. New York: Guilford Press; 2015.
9. Costello AB, Osborne JW. Best practices in exploratory factor analysis: four recommendations for getting the most from your analysis. Practical Assessment, Research and Evaluation 2005;10:7.
10. Holgado-Tello FP, Chacón-Moscoso S, Barbero-García I, et al. Polychoric versus Pearson correlations in exploratory and confirmatory factor analysis of ordinal variables. Qual Quant 2010;44:153-66.
11. Pituch KA, Stevens JP. Applied multivariate statistics for the social sciences: Analyses with SAS and IBM’s SPSS. 6th ed. New York: Routledge; 2016.
12. Kline P. An Easy Guide to Factor Analysis. 1st ed. New York: Routledge; 2014.
13. Schreiber JB, Stage FK, King J, et al. Reporting Structural Equation Modeling and Confirmatory Factor Analysis Results: A Review. J Educ Res 2006;99:323-37.
14. Fayers PM. Quality of life: The assessment, analysis and reporting of patient-reported outcomes. 3rd ed. Chichester: Wiley Blackwell; 2016.
15. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979;86:420-8.
16. Cicchetti DV. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol Assess 1994;6:284-90.
17. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159-74.
18. Terwee CB, Bot SD, de Boer MR, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007;60:34-42.
19. Janssen N, Daemen JHT, van Polen EJ, et al. Pectus Excavatum: Consensus and Controversies in Clinical Practice. Ann Thorac Surg 2023;116:191-9.