Is telephone follow-up reliable in maternal and neonatal outcomes surveys in in vitro fertilization?
Reproductive Biology and Endocrinology volume 20, Article number: 128 (2022)
Many studies that collect maternal and neonatal outcomes rely on patient self-report by telephone. It is unclear how reliable or accurate these telephone reports are.
To evaluate the reliability of telephone follow-up for collecting maternal and neonatal outcome information after IVF.
The women were interviewed seven days after delivery by a nurse via telephone. The maternal and neonatal outcomes were recorded based on a self-report from one of the spouses. Meanwhile, the standardized electronic hospitalized discharge records were extracted from the hospital medical database. For each case, the maternal and neonatal information obtained from the telephone interview was compared with that extracted from the medical files.
Agreement was classified as “almost perfect, κ = 0.81–1.00” for preterm birth, cesarean delivery, low birth weight baby, and macrosomia. The strength of agreement was classified as “moderate, κ = 0.41–0.60” for some antepartum complications: gestational diabetes (κ = 0.569), pregnancy-induced hypertension (κ = 0.588), intrahepatic cholestasis of pregnancy (κ = 0.597), and oligohydramnios (κ = 0.432). The strength of agreement between telephone interviews and hospitalized discharge records was classified as “slight, κ = 0–0.20” for some complications: thyroid diseases (κ = 0.137), anemia (κ = 0.047), postpartum hemorrhage (κ = 0.016), and fetal distress (κ = 0.106).
Information on some variables (preterm birth, cesarean delivery, birth weight) collected by telephone follow-up was reliable. However, information on other complications (thyroid diseases, anemia, postpartum hemorrhage, and fetal distress) collected via self-report was not reliable. Compared with complications during labor, antepartum complications showed higher agreement between the two follow-up methods. IVF records and hospitalized discharge records should be matched and collected simultaneously when discussing maternal and neonatal outcomes of IVF.
Assisted reproductive technology (ART) is a group of medical procedures for treating infertility in which both male and female gametes are handled outside the body to achieve conception. Since its introduction in 1978, ART has contributed to the birth of millions of infants worldwide.
With the expanding use of ART, there is rising concern regarding the safety of these treatments for both mother and child. Hundreds of studies have focused on the obstetric and perinatal outcomes of in vitro fertilization (IVF) treatment [4,5,6,7]. However, the reported incidence of the same maternal complication differs markedly between studies. For example, the incidence of gestational diabetes in IVF singleton deliveries should, in theory, be comparable across studies, yet it varies considerably: a large study that enrolled 183,059 IVF singleton babies reported a 21.1% incidence of gestational diabetes, whereas other studies reported rates of 1.4%, 5.6%, and 12.2%.
The variation in incidence may be due to heterogeneity of the populations sampled, but it is more likely due to inconsistent data collection strategies. Several methods of collecting obstetric and perinatal information have been described in the published literature, including standardized electronic hospitalized discharge records and national medical birth registries. However, collecting data from the medical records of large cohorts is time-consuming. For this reason, many studies rely on patient self-report through postal questionnaires or phone calls [14, 15].
The Canadian ART Register collects data on IVF treatment cycles and their outcomes from all ART clinics in Canada. Data related to the pregnancy outcome, birth weight, and congenital malformations are obtained by each clinic through direct follow-up with parents via telephone or mail. Similar methods are applied in the United States, where the details of maternal and perinatal complications are collected by nurses who telephone each patient after delivery and are then submitted to the Society for Assisted Reproductive Technology Clinic Outcome Reporting System (SART CORS).
It is unclear how reliable or accurate these questionnaire and telephone reports are. The data in the SART CORS are validated annually against the medical records of some clinics. The 2019 ART cycle data validation indicated that most discrepancy rates were low (less than 5%). The items cross-checked in the SART CORS dataset included: patient date of birth, cycle intention, cycle start date, date of egg retrieval, number of eggs or embryos transferred, outcome of ART treatment (i.e., pregnant or not pregnant), pregnancy outcome (for example, miscarriage, live-birth delivery, or stillbirth), date of pregnancy outcome, number of infants born, and patient diagnosis (reason for ART). However, the data on maternal complications were not part of this validation.
Therefore, it is necessary to evaluate the reliability of questionnaires and phone calls for collecting maternal and perinatal information after IVF treatment. Such a study would be a meaningful addition to the literature and would help subsequent studies that collect perinatal data.
Materials and methods
Study design and study participants
This cross-sectional study was conducted in a tertiary maternity hospital between January 2010 and December 2019. Women who underwent ART (including in vitro fertilization (IVF), intracytoplasmic sperm injection (ICSI), and frozen-thawed embryo transfer (FET)) and had a live birth at the same hospital were enrolled. The study was approved by the Independent Ethics Committee of Guangzhou Women and Children’s Hospital (No. 2022-090A01).
According to the routine protocol, the women were interviewed seven days after delivery by a nurse via telephone. The maternal and neonatal outcomes were recorded based on a self-report from one of the spouses. Meanwhile, the standardized electronic hospitalized discharge records were extracted from the hospital medical database. For each case, the maternal and neonatal information obtained from the telephone interview was compared with that extracted from the medical files.
Data collection via telephone
The couples were informed and signed a follow-up consent form before IVF treatment and were interviewed by telephone seven days after delivery. Data collected via telephone included: date of delivery, mode of delivery, number of children born, gender and birth weight of each baby, congenital malformations of each baby, and maternal/neonatal complications. Data collected via telephone were recorded in the ART database.
To ensure the consistency of the follow-up process, all of the nurses were trained, and a uniform follow-up questionnaire was applied (Additional file 1). If the couples did not answer the first call, additional calls were made three or four days later to maximize the follow-up rate.
The following variables were extracted from the ART database for further analysis: the woman’s personal identification number, fertilization method (IVF/ICSI), transfer type (fresh/cryopreserved), date of embryo transfer, and number of embryos transferred.
Data collection from standardized electronic hospitalized discharge records
In standardized electronic hospitalized discharge records, all diagnoses of disorders and diseases were coded using the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10); all operating procedures were coded using the ICD-10 and the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM).
Before the linkage process, a limited data file was generated from the standardized electronic hospitalized discharge records, containing only the following fields: the woman’s personal identification number; the woman’s first name, middle name or initial, and last name; the date of hospitalized discharge; and all discharge diagnoses.
We linked the ART database and the hospitalized discharge records in four steps. First, the women’s personal identification numbers were cross-linked between the two databases to ensure proper identity recognition. Second, the date of hospitalized discharge was linked to the date of embryo transfer to exclude deliveries that followed a spontaneous conception in the same woman. Third, duplicate records were excluded if a woman was hospitalized several times during the same conception. Fourth, the study population was limited to deliveries only; hospitalized discharge records without a delivery were also excluded.
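The four linkage steps can be illustrated with a minimal sketch. It is not the authors' implementation: the field names (`pid`, `transfer_date`, `discharge_date`, `is_delivery`) and the 310-day window between embryo transfer and discharge are hypothetical choices made only for illustration, and the delivery filter is applied early for simplicity.

```python
from datetime import date

# Hypothetical field names; the 310-day window is an illustrative assumption,
# not a value given in the study.
MAX_DAYS_TRANSFER_TO_DISCHARGE = 310


def link_records(art_rows, discharge_rows, max_days=MAX_DAYS_TRANSFER_TO_DISCHARGE):
    """Return one delivery discharge record per (woman, embryo transfer)."""
    # Step 1: index embryo transfer dates by personal identification number.
    transfers = {}
    for row in art_rows:
        transfers.setdefault(row["pid"], []).append(row["transfer_date"])

    linked = {}
    for rec in discharge_rows:
        # Step 4: keep only hospitalizations that ended in a delivery.
        if not rec["is_delivery"]:
            continue
        # Step 2: the discharge must fall within a plausible interval after a
        # recorded transfer; otherwise the delivery is treated as spontaneous.
        for transfer_date in transfers.get(rec["pid"], []):
            gap = (rec["discharge_date"] - transfer_date).days
            if 0 < gap <= max_days:
                # Step 3: keep only the first record per woman and conception.
                linked.setdefault((rec["pid"], transfer_date), rec)
                break
    return linked


# Invented toy data, for illustration only.
art = [{"pid": "A", "transfer_date": date(2015, 1, 10)},
       {"pid": "B", "transfer_date": date(2016, 3, 1)}]
discharges = [
    {"pid": "A", "discharge_date": date(2015, 6, 1), "is_delivery": False},
    {"pid": "A", "discharge_date": date(2015, 9, 30), "is_delivery": True},
    {"pid": "A", "discharge_date": date(2018, 3, 1), "is_delivery": True},
    {"pid": "C", "discharge_date": date(2016, 5, 5), "is_delivery": True},
]
linked = link_records(art, discharges)
print(sorted(linked))
```

In this toy data only woman A's delivery in September 2015 is retained: her antenatal admission has no delivery, her 2018 delivery falls outside the window, woman B has no discharge record, and woman C has no ART record.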
Definition of maternal complications and neonatal outcomes
Maternal chronic diseases were defined as chronic diseases the pregnant woman had before pregnancy, including thyroid diseases, anemia, and other diseases. Maternal complications were defined as disorders that developed during pregnancy, including pregnancy-induced hypertension (persistent blood pressure ≥ 140/90 mmHg recorded after 20 weeks of gestation in a previously normotensive woman, including preeclampsia and eclampsia), gestational diabetes mellitus, placenta previa, placental abruption, oligohydramnios, polyhydramnios, preterm birth (gestational age at birth, 28–36 weeks), cesarean delivery, abnormal placental cord insertion, postpartum hemorrhage (bleeding volume ≥ 500 mL after vaginal delivery or ≥ 1000 mL after cesarean delivery), and intrahepatic cholestasis of pregnancy. Neonatal outcomes were defined as neonatal complications that developed before or after birth until discharge, including fetal distress, low birth weight (birth weight < 2500 g), and macrosomia (birth weight > 4000 g).
Among these variables, low birth weight and macrosomia were identified according to the birth weights reported in the records. Preterm birth was determined from the birth date and the embryo transfer date. Other variables were identified according to the related ICD-9-CM or ICD-10 codes.
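As a minimal sketch of how these derived variables could be computed, the example below applies the thresholds given above. The gestational age rule is an assumption, since the text does not state the formula: it uses the common convention of adding the embryo's age at transfer plus 14 days to the interval between transfer and birth. The function and parameter names are hypothetical.

```python
from datetime import date


def gestational_age_days(transfer_date, birth_date, embryo_age_days):
    # Assumed convention (not stated in the text): days since transfer,
    # plus embryo age at transfer, plus 14 days for LMP-equivalent dating.
    return (birth_date - transfer_date).days + embryo_age_days + 14


def classify_birth(transfer_date, birth_date, embryo_age_days, weight_g):
    weeks = gestational_age_days(transfer_date, birth_date, embryo_age_days) // 7
    return {
        "gestational_weeks": weeks,
        # Thresholds as defined in the text.
        "preterm": 28 <= weeks <= 36,
        "low_birth_weight": weight_g < 2500,
        "macrosomia": weight_g > 4000,
    }


# Invented example: day-5 embryo transferred on 10 Jan 2018, birth on 1 Sep 2018.
example = classify_birth(date(2018, 1, 10), date(2018, 9, 1), 5, 2400)
print(example)
```

For the invented example, the birth occurs 234 days after transfer, giving 253 days (36 completed weeks) under the assumed convention, so the birth is classed as preterm and, at 2400 g, as low birth weight.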
Categorical variables were expressed as frequency and percentage. Cohen’s kappa (κ) statistics were used to assess the agreement between telephone follow-up records and hospitalized discharge records. Kappa coefficients were interpreted as follows: almost perfect (0.81–1.00), substantial (0.61–0.80), moderate (0.41–0.60), fair (0.21–0.40), and slight (0–0.20) [21, 22]. All data analyses were performed using SPSS for Windows, version 23.0 (IBM, Armonk, NY). P values < 0.05 were considered significant.
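For readers unfamiliar with the statistic, Cohen's kappa for a binary variable can be computed from the 2 × 2 table of the two sources as κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The sketch below is a plain Python illustration, not the SPSS procedure used in the study; the function name, argument names, and counts are hypothetical.

```python
def cohen_kappa(both_yes, phone_only, record_only, both_no):
    """Cohen's kappa for two binary ratings summarized as a 2 x 2 table."""
    n = both_yes + phone_only + record_only + both_no
    # Observed agreement: both sources say yes, or both say no.
    observed = (both_yes + both_no) / n
    phone_yes = both_yes + phone_only
    record_yes = both_yes + record_only
    # Chance agreement from the marginal totals of each source.
    expected = (phone_yes * record_yes
                + (n - phone_yes) * (n - record_yes)) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Invented counts, for illustration only.
kappa = cohen_kappa(both_yes=40, phone_only=10, record_only=20, both_no=130)
print(round(kappa, 3))
```

With these invented counts, the observed agreement is 0.85 and the chance agreement is 0.60, so κ = 0.25 / 0.40 = 0.625.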
In total, 3,473 women who became pregnant after IVF/ICSI/FET treatment at Guangzhou Women and Children’s Hospital were included in the cross-linkage, and 1,268 hospitalized discharge records were matched. Of these, 135 delivery records were excluded because there was no corresponding embryo transfer record, and these deliveries were considered to have followed spontaneous pregnancies. Twenty-five women were hospitalized more than once during pregnancy, and their duplicate medical records were excluded. Nine medical records were excluded because they documented a hospitalization during pregnancy but no delivery. Finally, 1,099 records were included in the study. The detailed linkage results are shown in Fig. 1.
Of the 1,099 delivery records, 771 (70.2%) were singleton deliveries and 328 (29.8%) were twin deliveries. Thus, a total of 1,099 women and 1,427 newborns were enrolled in the study.
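The record flow can be checked with simple arithmetic. The sketch below uses only the counts given in the text; the variable names are illustrative.

```python
matched_records = 1268
exclusions = {
    "no corresponding embryo transfer record": 135,
    "duplicate hospitalization records": 25,
    "no delivery recorded": 9,
}
included = matched_records - sum(exclusions.values())

singleton_deliveries, twin_deliveries = 771, 328
newborns = singleton_deliveries + 2 * twin_deliveries

print(included, newborns)
print(round(100 * singleton_deliveries / included, 1),
      round(100 * twin_deliveries / included, 1))
```

The totals are internally consistent: 1,268 − 135 − 25 − 9 = 1,099 deliveries, and 771 + 2 × 328 = 1,427 newborns.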
Maternal and neonatal outcomes
More than ten types of maternal and neonatal complications were recorded in the phone interviews, whereas a much greater variety of complications was recorded in the hospitalized discharge records.
Table 1 shows the agreement between self-report by telephone and medical records for binary variables. The agreement was classified as “almost perfect, κ = 1.00” for the following variables: preterm birth, cesarean delivery (97.6% in twin pregnancies, 49.5% in singleton deliveries), low birth weight baby, and macrosomia.
Compared with the phone interviews, the hospitalized discharge records reported a higher proportion of other complications. The strength of agreement was classified as “moderate, κ = 0.41–0.60” for the following variables: gestational diabetes (κ = 0.569), pregnancy-induced hypertension (κ = 0.588), intrahepatic cholestasis of pregnancy (κ = 0.597), and oligohydramnios (κ = 0.432). Of note, gestational diabetes mellitus and pregnancy-induced hypertension had higher self-report rates than other complications. Even so, gestational diabetes mellitus was recorded far less often in the telephone interviews than in the hospitalized discharge records, with the information missing for 155 of the 295 cases documented in the discharge records. The overall incidence of gestational diabetes mellitus was 12.7% according to the telephone interviews and 26.8% according to the hospitalized discharge records. Similar results were found for pregnancy-induced hypertension (4.7% vs. 10.6%). Detailed data are shown in Table 1.
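The gestational diabetes figures can be reproduced from the counts in the text. The sketch below rests on two assumptions that the text does not state explicitly: the denominator is the 1,099 enrolled women, and no woman reported gestational diabetes by telephone without a matching discharge diagnosis. Table 1 remains the authoritative source for the full cross-tabulation.

```python
n_women = 1099
record_cases = 295        # cases in hospitalized discharge records
missed_by_phone = 155     # of those, not reported in the telephone interview
phone_cases = record_cases - missed_by_phone

phone_incidence = round(100 * phone_cases / n_women, 1)
record_incidence = round(100 * record_cases / n_women, 1)
captured_by_phone = round(100 * phone_cases / record_cases, 1)

# Assumption: no telephone-only reports, so the 2 x 2 table is fully determined.
both_yes, phone_only, record_only = phone_cases, 0, missed_by_phone
both_no = n_women - both_yes - phone_only - record_only
observed = (both_yes + both_no) / n_women
expected = ((both_yes + phone_only) * (both_yes + record_only)
            + (record_only + both_no) * (phone_only + both_no)) / n_women ** 2
gdm_kappa = round((observed - expected) / (1 - expected), 3)

print(phone_incidence, record_incidence, captured_by_phone, gdm_kappa)
```

Under these assumptions the telephone interviews captured 140 of 295 cases (47.5%), the two incidences come out at 12.7% and 26.8%, and the reconstructed kappa is 0.569, which agrees with the value reported above.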
Agreement for some complications was classified as “fair, κ = 0.21–0.40”, including placenta previa (κ = 0.383), placental abruption (κ = 0.233), polyhydramnios (κ = 0.233), and abnormal placental cord insertion (κ = 0.318). The strength of agreement between telephone interviews and hospitalized discharge records was classified as “slight, κ = 0–0.20” for the remaining complications: thyroid diseases (κ = 0.137), anemia (κ = 0.047), postpartum hemorrhage (κ = 0.016), and fetal distress (κ = 0.106). Detailed data are shown in Table 1.
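The banding of the reported kappa values can be expressed as a small lookup. The sketch below applies the cut-offs listed in the statistical methods to the coefficients quoted in the text; the function name is hypothetical, and the "poor" label for negative values follows Landis and Koch rather than this article, which lists only five bands.

```python
def agreement_band(kappa):
    """Map a kappa coefficient to the Landis and Koch agreement band."""
    if kappa < 0:
        return "poor"  # band from Landis and Koch; not listed in the article
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappa values as quoted in the text.
reported = {
    "gestational diabetes": 0.569,
    "pregnancy-induced hypertension": 0.588,
    "intrahepatic cholestasis of pregnancy": 0.597,
    "oligohydramnios": 0.432,
    "placenta previa": 0.383,
    "placental abruption": 0.233,
    "polyhydramnios": 0.233,
    "abnormal placental cord insertion": 0.318,
    "thyroid diseases": 0.137,
    "anemia": 0.047,
    "postpartum hemorrhage": 0.016,
    "fetal distress": 0.106,
}
bands = {name: agreement_band(k) for name, k in reported.items()}
for name, band in bands.items():
    print(f"{name}: {band}")
```

Applied to the quoted values, the lookup yields four "moderate", four "fair", and four "slight" variables, matching the grouping described in the text.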
To our knowledge, this is the first study to assess the consistency of telephone follow-up and hospitalized discharge records in collecting maternal and neonatal complications. We found that the information on preterm birth, cesarean delivery, low birth weight baby, and macrosomia was in complete agreement between the two methods. Other maternal and neonatal complications were reported far less often in telephone interviews than in hospitalized discharge records.
Four variables, namely cesarean delivery, preterm birth, low birth weight baby, and macrosomia, were in complete agreement. This finding is in line with a previous study, which reported a correlation of around 0.989 for birth weight. However, that study assessed consistency for only one variable, birth weight. Several factors may explain why these four variables were perfectly consistent. First, cesarean section is an operation that the patient remembers very well. Second, we asked the couples in detail about the birth date and birth weight of the newborn during telephone follow-up, and then derived preterm birth from the date of embryo transfer and low birth weight and macrosomia from the birth weight criteria. Parents’ recall of infant birth weight and birth date was highly accurate compared with hospitalized discharge records, making the correlation between the two sources of information close to unity.
Apart from the four obstetric outcomes mentioned above, the incidence of maternal and neonatal outcomes was considerably higher in obstetric discharge records than in telephone follow-up records. This means that a high percentage of complications went unreported in telephone follow-up.
Agreement between self-report and obstetric discharge records was classified as “moderate” or “fair” for antepartum complications included in this study. In comparison, it was classified as “slight” for complications during labor.
The relatively high self-report rate for antepartum complications may be because these complications arise in the second or third trimester of pregnancy, so the women are informed of them repeatedly during prenatal examinations; in addition, because gestational diabetes mellitus and pregnancy-induced hypertension are harmful, women pay more attention to monitoring them. In contrast, obstetric complications during labor, such as postpartum hemorrhage and placental abnormalities, are most likely to be missed in self-reports. This may be because patients pay more attention to the neonatal health of IVF babies and overlook less severe obstetric complications.
An important strength of the study is that all cases were followed up using the same standard procedures by equally trained professionals. On the other hand, because the results are based on a single center, they should be interpreted with caution, as the effectiveness of telephone follow-up may vary between centers.
In conclusion, information on some variables (preterm birth, cesarean delivery, birth weight) collected by telephone follow-up was reliable. However, information on other complications (thyroid diseases, anemia, postpartum hemorrhage, and fetal distress) collected by self-report via telephone was not reliable. Compared with complications during labor, antepartum complications showed higher agreement between the two follow-up methods. IVF records and hospitalized discharge records should be matched and collected simultaneously when discussing maternal and neonatal outcomes of IVF.
Availability of data and materials
The data are not publicly available; please contact the author for data requests.
Dunietz GL, et al. Assisted reproductive technology and the risk of preterm birth among primiparas. Fertil Steril. 2015;103(4):974-979.e1.
Bauquis C. The world’s number of IVF and ICSI babies has now reached a calculated total of 5 million. 2012. Available from: http://www.eshre.eu/ESHRE/English/PressRoom/Press-Releases/Press-releases-2012/5-million-babies/page.aspx/1606.
Ensing S, et al. Risk of poor neonatal outcome at term after medically assisted reproduction: a propensity score-matched study. Fertil Steril. 2015;104(2):384-90.e1.
Zheng W, et al. Obstetric and neonatal outcomes of pregnancies resulting from preimplantation genetic testing: a systematic review and meta-analysis. Hum Reprod Update. 2021;27(6):989–1012.
Sarmon KG, et al. Assisted reproductive technologies and the risk of stillbirth in singleton pregnancies: a systematic review and meta-analysis. Fertil Steril. 2021;116(3):784–92.
Conforti A, et al. Perinatal and obstetric outcomes in singleton pregnancies following fresh versus cryopreserved blastocyst transfer: a meta-analysis. Reprod Biomed Online. 2021;42(2):401–12.
Pandey S, et al. Obstetric and perinatal outcomes in singleton pregnancies resulting from IVF/ICSI: a systematic review and meta-analysis. Hum Reprod Update. 2012;18(5):485–503.
Wang Y, et al. Absolute Risk of Adverse Obstetric Outcomes Among Twin Pregnancies After In Vitro Fertilization by Maternal Age. JAMA Netw Open. 2021;4(9):e2123634.
Sazonova A, et al. Obstetric outcome after in vitro fertilization with single or double embryo transfer. Hum Reprod. 2011;26(2):442–50.
Liu L, et al. Obstetric and perinatal outcomes of intracytoplasmic sperm injection versus conventional in vitro fertilization in couples with nonsevere male infertility. Fertil Steril. 2020;114(4):792–800.
Szymusik I, et al. Perinatal outcome of in vitro fertilization singletons - 10 years’ experience of one center. Arch Med Sci. 2019;15(3):666–72.
Spangmose AL, et al. Obstetric and perinatal risks in 4601 singletons and 884 twins conceived after fresh blastocyst transfers: a Nordic study from the CoNARTaS group. Hum Reprod. 2020;35(4):805–15.
Rice F, et al. Agreement between maternal report and antenatal records for a range of pre and peri-natal factors: the influence of maternal and child characteristics. Early Hum Dev. 2007;83(8):497–504.
Wan HL, et al. Obstetric outcomes in women with polycystic ovary syndrome and isolated polycystic ovaries undergoing in vitro fertilization: a retrospective cohort analysis. J Matern Fetal Neonatal Med. 2015;28(4):475–8.
Romanski PA, et al. Reproductive and obstetric outcomes in mildly and significantly underweight women undergoing IVF. Reprod Biomed Online. 2021;42(2):366–74.
Dar S, et al. Increased risk of preterm birth in singleton pregnancies after blastocyst versus Day 3 embryo transfer: Canadian ART Register (CARTR) analysis. Hum Reprod. 2013;28(4):924–8.
Makhijani R, et al. Maternal and perinatal outcomes in programmed versus natural vitrified-warmed blastocyst transfer cycles. Reprod Biomed Online. 2020;41(2):300–8.
Centers for Disease Control and Prevention. ART data. 2019. Available from: https://www.cdc.gov/art/reports/2019/appendixes.html.
World Health Organization. International Statistical Classification of Diseases and Related Health Problems, 11th Revision. January 1, 2022. Accessed April 25, 2022. Available from: https://www.who.int/classifications/classification-of-diseases.
National Center for Health Statistics, Centers for Disease Control and Prevention. International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM). Reviewed November 3, 2021. Accessed April 25, 2022. Available from: https://www.cdc.gov/nchs/icd/icd9cm.htm.
Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–74.
Sim J, Wright CC. The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Phys Ther. 2005;85(3):257–68.
The authors thank the clinical and nursing staff of Guangzhou Women and Children’s Hospital.
No external funding was used for this study.
The authors declare no competing interests.
Ethics approval and consent to participate
This retrospective study was approved by the Independent Ethics Committee of Guangzhou Women and Children’s Hospital (Number: 2022-090A01).
Consent for publication
Conflict of interest
The authors have no conflicts of interest.
Sun, L., Xu, J., Liang, PL. et al. Is telephone follow-up reliable in maternal and neonatal outcomes surveys in in vitro fertilization?. Reprod Biol Endocrinol 20, 128 (2022). https://doi.org/10.1186/s12958-022-01001-5