Human sperm acrosome function assays are predictive of fertilization rate in vitro: a retrospective cohort study and meta-analysis
Reproductive Biology and Endocrinology, volume 16, Article number: 81 (2018)
To determine whether acrosome function scoring—including acrosomal enzyme (AE) levels and acrosome reaction (AR) results—can predict fertilization rate in vitro.
We examined the predictive value of acrosomal enzymes (AE) determined by spectrophotometry/N-α-benzoyl-dl-arginine-p-nitroanilide for fertilization rate (FR) in vitro in a retrospective cohort study of 737 infertile couples undergoing IVF therapy. Additionally, a meta-analysis of prospective cohort or case-control studies was performed to expand upon the findings, reporting the following summary measures: pooled Spearman correlation coefficient (Rs), standardized mean difference (SMD), sensitivity (SEN), specificity (SPE), positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic score (DS), diagnostic odds ratio (DOR), and area under the summary receiver operating characteristic curve (AUC).
Lower AE levels determined by spectrophotometry with a cut-off value of < 25 μIU/10⁶ spermatozoa were predictive of total fertilization failure (TFF) with moderate SEN (88.23%) and low SPE (16.50%). For the meta-analysis, 44 unique articles were selected; because several articles described multiple techniques, 67 datasets were extracted from these 44 articles, comprising 5356 infertile couples undergoing IVF therapy. AE levels and induced AR% were each positively correlated with FR (Rs = 0.38, SMD = 0.79; Rs = 0.40, SMD = 0.86, respectively). Lower AE levels or induced AR% predicted a lower fertilization rate with moderate accuracy (AUC = 0.78 and AUC = 0.84, respectively), with low SEN/moderate SPE (0.57/0.85) and moderate SEN/moderate SPE (0.79/0.87), respectively. For the AE assay, the diagnostic performance in Asia (Rs = 0.24, SMD = 0.50) was inferior to that in North America (Rs = 0.54, SMD = 0.81) and Europe (Rs = 0.46, SMD = 0.92). Cryopreserved spermatozoa (SMD = 0.20, P = 0.204) were inferior to fresh spermatozoa (SMD = 0.89, P < 0.001). Sperm preparation yielded inferior results as compared to no preparation; AE levels of spermatozoa after swim up were only weakly correlated with FR (Rs = 0.27, P = 0.044); and there was no correlation for spermatozoa after a discontinuous gradient (SMD = 1.07, P > 0.05). Lower AE levels determined by fluorometry or substrate assay predicted lower FR with low sensitivity and high specificity; the spectrophotometry assay had an uncertain predictive value. For the induced AR assay, the diagnostic performance in the other areas was inferior to that in Africa (Rs = 0.65, SMD = 1.86). No preparation or double preparation yielded inferior results as compared to one preparation (Rs = 0.41); discontinuous gradient (Rs = 0.17, SMD = 0.47) was inferior to swim up (Rs = 0.65, SMD = 1.51).
Nonphysiological triggers (SMD = 0.81) did not differ from physiological triggers (SMD = 0.95) in general; ZP (Rs = 0.63) or mannose (Rs = 0.59) was superior to other physiological or nonphysiological triggers; and there was no correlation for human follicle fluid, progesterone, cyclic adenosine 3′-5′-phosphate analogue and phorbol ester–BSA-GlcNAc Neoglycoproteins with N-acetylglucosamine residues. Lower induced AR% determined by indirect immunofluorescence, direct immunofluorescence with lectin, or triple stain predicted lower FR with moderate sensitivity/high specificity, moderate sensitivity/high specificity, or high sensitivity/low specificity, respectively.
Although the correlation between acrosome function scoring and FR was significant, the assays were neither highly sensitive nor highly specific. Additionally, the diagnostic performance showed regional effects as well as an effect of the sperm preparation or assay method. Further multicenter, large-scale, carefully designed studies that combine multiple sperm functional assays with oocyte quality assays are still needed in clinical settings to better predict fertilization outcome in IVF.
The sperm acrosome is a Golgi complex-derived flat granule overlaying the anterior two-thirds of the sperm head and contains numerous acrosomal enzymes (AEs) such as protease, glycosidase, acrosin, hyaluronidase, and high-electron density semisolid matrix proteins. Among AEs, the serine proteinase acrosin and hyaluronidase are of particular interest owing to their roles in fertilization, which include limited proteolysis of zona proteins to facilitate spermatozoa penetration into the various layers of the ovum. Acrosin, which is exclusive to the acrosome of mammalian spermatozoa, is mainly synthesized and stored in an enzymatically inactive zymogen form (i.e., proacrosin), and is released during acrosomal exocytosis following maturation. Hyaluronidase is secreted and depolymerizes the matrix between cells of the cumulus oophorus.
Intact acrosome function—containing adequate active AEs (proacrosin, acrosin, and hyaluronidase) and ability to undergo acrosome reaction (AR) after the induction—is necessary for sperm fertility. The detection of acrosome function can provide insight into the fertilizing capacity of spermatozoa, and is therefore considered a useful diagnostic tool for male infertility. Several methods have been described to assay AE, including fluorometry, western blotting, spectrophotometry, substrate assays, and radioimmunoassay (RIA). For the indirect fluorometry, polyclonal anti-acrosin (pAb-acrosin)  or anti-hyaluronidase (pAb-hyaluronidase) antibodies  or a monoclonal anti-proacrosin antibody (mAb 4D4-proacrosin)  is used. In addition, anti-acrosin antibody with low binding specificity has been used for western blotting . There are several types of spectrophotometry assay, including an acrosin/proacrosin target with N-α-benzoyl-dl-arginine-p-nitroanilide (BAPNA) substrate (spectrophotometry/BAPNA) [6,7,8]; acrosin/proacrosin target with BAPNA substrate in a commercially available acrosin activity assay kit (Accu-Sperm) (Accu-Sperm spectrophotometry/BAPNA) ; acrosin/proacrosin target with N-benzoyl-l-arginine ethyl ester (BAEE) substrate (spectrophotometry/BAEE) [10, 11]; acrosin/proacrosin/acrosin inhibitor target with BAEE substrate ; and hyaluronidase target with BAEE substrate . Substrate assays include a hyaluronidase target with cytochemical substrate ; acrosin target with gelatine substrate [15,16,17,18]; hyaluronidase target with agar/hyaluronic acid mixture substrate ; and hyaluronidase target with hyaluronic acid substrate . Finally, an RIA has been used to quantify acrosin in sperm acid extracts irrespective of the presence of acrosin inhibitors .
Three kinds of methods are used to assess human sperm AR: transmission electron microscopy (TEM), dyes for bright-field microscopy (DBM), and fluorescent labels. TEM is usually the gold standard against which a new assay is measured, but it cannot be used routinely because it is labor-intensive and does not assess sperm viability. For DBM, two stain (an acrosomal stain, a nuclear stain) and triple stain (Bismarck brown, rose Bengal, trypan blue) [23, 24] are the most widely used. There are three classes of fluorescent labels: those that label permeabilized spermatozoa with internally directed probes, including fluorescein isothiocyanate-conjugated Pisum sativum agglutinin (FITC-PSA) [25,26,27,28,29,30,31,32,33,34,35,36], peanut agglutinin (FITC-PNA) [37,38,39], Concanavalin A lectin (FITC-Con A), GB24 antibody (FITC-GB24) [41, 42], rhodamine-conjugated PSA (RITC-PSA), and tetramethylrhodamine-conjugated PSA (TRITC-PSA); those that label permeabilized spermatozoa by indirect immunofluorescence with antibodies (including HS21, HS63, GB24 [37, 47], MH61, and anti-CD46) directed against acrosome-associated antigens; and those, such as chlortetracycline (CTC), that can be used on living, nonpermeabilized cells.
Conflicting results have been reported concerning the utility of acrosome function scoring determined by different methods for predicting fertilization rate (FR) in vitro. Some studies showed that there was no correlation between acrosome function scoring and FR [9, 10, 41, 50, 51]. In contrast, others have reported a positive correlation between the two parameters by fluorometry [3, 4], spectrophotometry [3, 6,7,8, 52,53,54,55,56,57,58], and substrate assay [2, 15,16,17,18,19]. To clarify this contradiction, we retrospectively investigated the correlation between AE levels determined by spectrophotometry/BAPNA with FR. Additionally, a systematic review and meta-analysis of published literature on similar topic, without regard to acrosome function assay methods, was performed to further expand upon the findings.
Retrospective cohort study
From July 2015 to March 2016, 737 infertile couples undergoing in vitro fertilization (IVF) therapy, for whom ≥ 4 MII oocytes were used for fertilization in vitro on the day of therapy, were included in the retrospective analysis; couples presenting for IVF with intracytoplasmic sperm injection (ICSI) therapy were excluded. The aetiologies of infertility were as follows: male factor in 133 (single problem = 93; oligozoospermia: 6, asthenozoospermia: 38, teratozoospermia: 49; ≥ 2 male problems mentioned above = 40); female factor in 353 (single problem = 195; tubal occlusion: 190, ovulatory disorder: 0, endometriosis: 1, polycystic ovarian syndrome: 0, intrauterine adhesion: 1, uterine myomas: 1, uterine malformation: 0, genital tract malformation: 0, pelvic inflammatory disease: 2, immune infertility: 0, adiposis: 0, hyperlipemia: 0, hyperprolactinemia: 1; ≥ 2 female problems mentioned above = 158); couple factors in 251 (≥ 1 male problem and ≥ 1 female problem mentioned above).
Prior to further inclusion of couples in the therapy protocol, semen samples were collected and AE levels were determined by the procedure of Kennedy, with modifications. Briefly, the experimental and control tubes, each containing 7.5 × 10⁶ spermatozoa, were layered over 500 μL of 11% Ficoll (Sigma-Aldrich, St. Louis, MO, USA) and centrifuged at 2000×g for 20 min. Then 100 μL of benzamidine (500 mM, Sigma-Aldrich, St. Louis, MO, USA) was added to an equal volume of sperm pellet in the control tube. Afterwards, 1 mL of substrate-detergent mixture (BAPNA-Triton X-100 mixture, pH = 8.0, Sigma-Aldrich, St. Louis, MO, USA) was added to both tubes. After 1 h of incubation at 24 °C, benzamidine (100 μL) was added to the experimental tube to stop the reaction. All samples were centrifuged at 2000×g for 15 min and the absorbance of the supernatants was determined spectrophotometrically at 410 nm. AE activity (μIU/10⁶) was calculated from the difference in optical density between the experimental and control tubes of each sample.
Data sources and study selection
Two investigators independently carried out a search in PubMed, Web of Science, Cochrane Library, Embase, EBSCO, Ovid, ClinicalTrials.gov and Google Scholar databases for relevant literature up to February 2017. The [Title/Abstract] search was restricted to English language publications and was performed for the following MeSH terms: fertilization in vitro, acrosin, acrosome reaction, exocytosis, predictive value of tests, sensitivity and specificity (Additional file 1: search strategy). Inclusion criteria were as follows: (1) prospective cohort or case-control design; (2) infertile couples undergoing IVF therapy; (3) a study population of at least 30 couples; (4) AE or AR assay as an index test; (5) oocytes examined to establish fertilization as a reference standard test.
Data extraction and quality assessment
Information on study characteristics was independently abstracted by two investigators according to a standardized table (Tables 2–4), with decisions made by consensus in cases of disagreement. In four articles where there were ≥ 1 outcome indicators, data with a maximal correlation coefficient and corresponding 95% confidence interval (CI) were used [16, 18, 25, 44]. In four articles where there were ≥ 1 AE/AR cut-off values, data with the best sensitivity (SEN) or specificity (SPE) were used [3, 6, 39, 59]. The methodological quality of eligible articles was assessed with the QUADAS-2 tool. Based on user guidelines, items were tailored by omitting or modifying some signaling questions; for example, when reviewing Patient Selection, the item “Was a case–control design avoided?” was omitted; and for a review of Objective Index Test, the item “If a threshold was used, was it pre-specified?” was substituted with “Was the method of determining AEs or AR described?” This substitution was made because candidate articles were included regardless of the method of acrosome function detection.
In the retrospective cohort study, statistical analysis was performed with SPSS version 16.0 for Windows (SPSS Inc., Chicago, IL, USA). Categorical variables were presented as numbers and percentages, and non-normally distributed variables as medians and interquartile ranges. Spearman rank analysis was performed to determine which variables were related to FR. The Pearson χ²-test was used to compare the frequencies of categorical variables. Two-tailed p < 0.05 was considered statistically significant. In the meta-analysis, data analysis was performed using STATA 12.0 software (Stata Corp., College Station, TX, USA). Statistical heterogeneity was evaluated using the Q test or inconsistency index (I²), with significance set at p < 0.05 or I² > 50%, respectively. If heterogeneity existed, a random-effects model was adopted; otherwise, a fixed-effects model was selected. Sensitivity and subgroup analyses were carried out to identify suspected sources of heterogeneity. Subgroups were compared with the Q test for heterogeneity. The bivariate mixed effects regression model of the midas module in STATA 12.0 was used to calculate SEN, SPE, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic score (DS), and diagnostic odds ratio (DOR), to perform the summary receiver operating characteristic (SROC) curve analysis, and to draw the Fagan nomogram.
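The heterogeneity rule described above can be illustrated with a minimal Python sketch. It is not the STATA routine used in the study: the function names are hypothetical, the inputs are made-up effect sizes with their variances, and only the I² > 50% criterion is applied (the p-value of the Q test is omitted because it needs a chi-square distribution function).

```python
def cochran_q_and_i2(effects, variances):
    """Return (fixed-effect pooled estimate, Cochran's Q, I2 in percent).

    effects   -- per-study effect sizes (e.g. SMD or Fisher z values)
    variances -- per-study sampling variances of those effects
    """
    weights = [1.0 / v for v in variances]            # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1                              # degrees of freedom
    # I2 = (Q - df) / Q, truncated at zero, expressed as a percentage
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


def choose_model(i2_percent, threshold=50.0):
    """Apply the I2 > 50% rule (assumption: the Q-test p-value is ignored here)."""
    return "random-effects" if i2_percent > threshold else "fixed-effects"


# Illustrative, made-up inputs: two studies with clearly different effects.
pooled, q, i2 = cochran_q_and_i2([0.2, 0.8], [0.01, 0.01])
print(round(pooled, 2), round(q, 2), round(i2, 1), choose_model(i2))
```

With these illustrative inputs the two studies disagree strongly, so I² is high and the rule selects a random-effects model.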
Retrospective cohort study
The baseline characteristics, AE result, and fertilization rate for the couples included in the analysis are described in Table 1. The sample size retrieved (n = 737) for this retrospective study was greater than the calculated values (334–687) for a cohort study by Epi Info version 7.2 for Windows (https://www.cdc.gov/epiinfo/pc.html), with the two-sided confidence level set at 95%, power set at 90%, ratio (unexposed: exposed) set at 0.1945 (120/617), and the % outcomes in the unexposed group set at 5–10% (i.e., the occurrence of total fertilization failure [TFF, FR = 0%] described previously). The median AE level was 13.78 μIU/10⁶ spermatozoa (interquartile range: 12.12 μIU/10⁶ spermatozoa). The FR was positively correlated with forward progression motility (Spearman r = 0.119, p = 0.001) and AE levels (Spearman r = 0.075, p = 0.042; Additional file 2: Table S1). According to a previously published report, patients were separated into two groups (< 25 μIU/10⁶ spermatozoa, ≥ 25 μIU/10⁶ spermatozoa) based on the AE level results. A significantly higher FR was obtained in the group with AE activity ≥ 25 μIU/10⁶ spermatozoa than in the group with AE activity < 25 μIU/10⁶ spermatozoa (78.98% [1101/1394], n = 120 vs. 73.31% [4843/6606], n = 617, p < 0.001). The lower AE result with a cut-off value of < 25 μIU/10⁶ spermatozoa was not a significant risk factor for TFF (risk ratio [RR] = 1.46, 95% CI: 0.52–4.07); when used to predict TFF, it showed moderate SEN (88.23% [30/34]) and low SPE (16.50% [116/703], Additional file 3: Table S2).
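As a worked check of the figures in this paragraph, the sketch below recomputes SEN, SPE, and the risk ratio from a 2 × 2 table. The counts 30/34 and 116/703 are reported above; the other two cells (4 and 587) are derived here from the group sizes of 120 and 617. The function name is hypothetical, and the confidence interval uses the standard log-normal approximation, which is an assumption about how the reported interval was obtained.

```python
import math


def two_by_two_summary(tp, fn, fp, tn, z_crit=1.96):
    """Sensitivity, specificity, and risk ratio with a log-normal 95% CI.

    tp -- TFF couples with AE below the cut-off (test positive, outcome present)
    fn -- TFF couples with AE at or above the cut-off
    fp -- non-TFF couples with AE below the cut-off
    tn -- non-TFF couples with AE at or above the cut-off
    """
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    risk_exposed = tp / (tp + fp)        # TFF risk when AE is below the cut-off
    risk_unexposed = fn / (fn + tn)      # TFF risk when AE is at or above it
    rr = risk_exposed / risk_unexposed
    se_log_rr = math.sqrt(1 / tp - 1 / (tp + fp) + 1 / fn - 1 / (fn + tn))
    ci_low = math.exp(math.log(rr) - z_crit * se_log_rr)
    ci_high = math.exp(math.log(rr) + z_crit * se_log_rr)
    return {"SEN": sen, "SPE": spe, "RR": rr, "RR_CI": (ci_low, ci_high)}


# 30 of 34 TFF couples and 587 of 703 non-TFF couples had AE < 25 (617 in total).
summary = two_by_two_summary(tp=30, fn=4, fp=587, tn=116)
print({k: v for k, v in summary.items()})
```

Under these assumptions the recomputed values agree with the reported SEN, SPE, RR, and confidence interval to within rounding.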
Literature search results
We initially identified 16,024 candidate articles through database searches (n = 15,772) and additional records (n = 252). After removing 7606 duplicates, we browsed the titles and abstracts of 8418 articles and selected 579 for full-text reading. The reasons for excluding the others were as follows: irrelevant (n = 3043); non-human (n = 4405); case report/review (n = 224); protocol/patent (n = 24; protocol: 21, patent: 3); meeting abstract (n = 45); non-English (n = 97; Chinese: 75, Iranian: 1, French: 3, Japanese: 16, German: 2); and letter (n = 1). Of the 44 selected articles, 16 articles [3, 4, 6,7,8,9, 16, 18, 19, 50, 51, 53,54,55,56,57] addressed the relationship between the AE levels and FR (Table 2); one described three AE assay methods; another reported three sperm preparation methods; and three also mentioned different preparation methods [4, 7, 9], for a total of 23 datasets extracted from these 16 articles, comprising 2734 infertile couples undergoing IVF therapy. A total of 13 articles [22, 29, 33, 36, 37, 39,40,41,42, 44, 47, 63, 64] addressed the relationship between the spontaneous AR% and FR (Table 3); one described two AR assay methods, for a total of 14 datasets extracted from these articles, comprising 791 infertile couples. A total of 23 articles [23,24,25,26,27,28, 30,31,32,33,34,35, 37,38,39, 41,42,43, 47, 48, 59, 63, 64] addressed the relationship between the induced AR% and FR (Table 4); one described two AR assay methods; another reported five AR triggers; and two also mentioned different triggers [39, 42], for a total of 30 datasets extracted from these articles, comprising 1831 infertile couples (Fig. 1a).
All 44 included articles had at least four QUADAS-2 items rated as low bias, indicating high overall quality (Fig. 1b). Forty-one had a prospective cohort design and three had a prospective case-control design. Geographic areas included Asia (n = 10), North America (n = 10), Europe (n = 17), Africa (n = 3), Oceania (n = 2), and South America (n = 2). Sperm storage methods included fresh samples (n = 41; for AE assay: 13, for AR assay: 28) and cryopreservation (n = 3; for AE assay: 3, for AR assay: 0). Sperm preparation methods included no preparation (n = 12), one preparation (n = 34; α-chymotrypsin: 1; swim up: 18; discontinuous gradient: 14; swim up/discontinuous gradient: 1), double preparation (n = 2; swim up after discontinuous gradient: 1; double swim up: 1), and not reported (n = 1). AE assay methods included fluorometry (n = 3; pAb-acrosin: 1, pAb-hyaluronidase: 1, mAb 4D4-proacrosin: 1), spectrophotometry (n = 13; spectrophotometry/BAPNA: 9, Accu-Sperm spectrophotometry/BAPNA: 3, spectrophotometry/BAEE: 1), and substrate assay (n = 3; acrosin target with gelatine substrate assay: 2, hyaluronidase target with agar/hyaluronic acid mixture substrate assay: 1). All spectrophotometry in the 16 articles had acrosin/proacrosin as targets. AR triggers included physiological triggers (n = 10; human follicle fluid [HFF]: 4, progesterone [P]: 3, zona pellucida [ZP]: 3) and nonphysiological triggers (n = 18; calcium ionophore A23187: 12, low temperature: 1, cyclic adenosine 3′-5′-phosphate analogue [CAMP]: 1, phorbol ester [TPA]: 2, Neoglycoproteins with N-acetylglucosamine residues [BSA-GlcNAc]: 1, mannose: 1).
AR assay methods included DBM (n = 4, two stain Blutstan kit: 1, triple stain: 3) and fluorescent labels (n = 24; direct immunofluorescence with lectin: FITC-PSA: 12, FITC-PNA: 3, FITC-ConA: 1, RITC-PSA: 1, TRITC-PSA:1; direct immunofluorescence with antibody: FITC-GB24: 3; indirect immunofluorescence: GB24 antibody: 1, anti-CD46 antibody: 1, MH61 antibody: 1).
Data synthesis and analysis
Engauge Digitizer software (http://markummitchell.github.io/engauge-digitizer/) was used to convert the scatter plots in seven articles [6, 7, 18, 27, 38, 39, 63] into coordinates to indirectly obtain acrosome function scoring and FRs. Pearson correlation coefficients from thirteen studies [4, 8, 9, 18, 22, 24, 31, 39, 41, 43, 44, 54, 55] were converted into Spearman correlation coefficient (Rs) values, followed by Fisher’s r-to-z and z-to-r transformation.
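The Fisher r-to-z and z-to-r steps mentioned here can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the Pearson-to-Spearman conversion is not reproduced because its formula is not given in the text, and the standard error 1/√(n − 3) is the usual approximation for a correlation coefficient, assumed rather than taken from the study.

```python
import math


def fisher_r_to_z(r):
    """Fisher's r-to-z transformation: z = atanh(r)."""
    return math.atanh(r)


def fisher_z_to_r(z):
    """Back-transformation: r = tanh(z)."""
    return math.tanh(z)


def correlation_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation from a study of n couples.

    Assumes the standard error of z is 1 / sqrt(n - 3).
    """
    z = fisher_r_to_z(r)
    se = 1.0 / math.sqrt(n - 3)
    return fisher_z_to_r(z - z_crit * se), fisher_z_to_r(z + z_crit * se)


# Illustrative, made-up study: r = 0.40 observed in 103 couples.
low, high = correlation_ci(0.40, 103)
print(round(low, 3), round(high, 3))
```

Working on the z scale keeps the interval inside (−1, 1) and makes the sampling variance depend only on the study size, which is why coefficients are transformed before pooling and transformed back afterwards.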
Rs was extracted from 10 articles that included a total of 758 infertile couples. A total of 13 datasets were analyzed, including one article each that used three  and two  sperm preparation methods. AE levels and FRs that were higher and lower than the respective cut-off values were extracted from 12 articles, which included a total of 1037 infertile couples. Of the 16 datasets analyzed, one used two AE assay methods  and three used two sperm preparation methods [4, 7, 9]. Binary accuracy data from 939 infertile couples were extracted from 10 articles as 2 × 2 tables. We analyzed the 12 datasets, including one paper that used three assay methods  (Table 2).
According to a random-effects model, AE levels were positively correlated with FR (Rs = 0.39, 95% CI: 0.18–0.60, p < 0.001), albeit with notable heterogeneity (I2 = 95.7%, p < 0.001; Table 3; Fig. 2a, Table 5). Higher AE levels were obtained for higher as compared to lower FRs (standardized mean difference [SMD] = 0.79, 95% CI: 0.53–1.05, p < 0.001; Fig. 2b, Table 6). The bivariate mixed effects regression model predicted lower FR for lower AE levels with pooled low SEN/moderate SPE (SEN = 0.57, 95% CI: 0.41–0.71; SPE = 0.85, 95% CI: 0.73–0.93), moderate discriminant effect (PLR = 3.91, 95% CI: 2.31–6.61; NLR = 0.50, 95% CI: 0.37–0.68; DS = 2.05, 95% CI: 1.43–2.67; DOR = 7.78, 95% CI: 4.19–14.46) and moderate accuracy (area under the SROC curve [AUC] = 0.78, 95% CI: 0.74–0.81; Fig. 2c–e, Table 7). The Fagan nomogram showed that lower AE levels could be used to predict lower FR when the pre-test probability was 27% (i.e., occurrence rate of patients for whom < 70% fertilization was achieved by IVF in our hospital), with a post-test probability of 59%.
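The post-test probability read from the Fagan nomogram follows from Bayes' rule on the odds scale: post-test odds = pre-test odds × likelihood ratio. The short sketch below (function name hypothetical) reproduces the figure quoted above from the pre-test probability of 27% and the pooled PLR of 3.91.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability and a likelihood ratio to a post-test probability."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pre-test probability 27%, pooled PLR = 3.91 for lower AE levels.
probability = post_test_probability(0.27, 3.91)
print(f"{probability:.0%}")
```

The result rounds to 59%, matching the post-test probability reported for a positive AE test.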
After sensitivity analysis, two studies [8, 54] were identified as a source of heterogeneity when pooling Rs; after they were excluded, the correlation was essentially unchanged (Rs = 0.38, 95% CI: 0.27–0.48, p < 0.001) and the heterogeneity, although reduced, remained significant (I² = 67.1%, p = 0.001; Table 3). When SMD was pooled, four studies [3, 6, 50, 53] were found to contribute to the heterogeneity; when these were excluded, the association was unchanged (SMD = 0.75, 95% CI: 0.57–0.93, p < 0.001) and there was no obvious heterogeneity (I² = 26.5%, p = 0.184; Table 4). When pooling diagnostic accuracy data, excluding one outlier did not significantly change the overall results (SEN = 0.51, 95% CI: 0.39–0.62; SPE = 0.88, 95% CI: 0.79–0.94; PLR = 4.22, 95% CI: 2.42–7.36; NLR = 0.56, 95% CI: 0.45–0.69; DS = 2.02, 95% CI: 1.37–2.67; DOR = 7.53, 95% CI: 3.92–14.47; AUC = 0.73, 95% CI: 0.69–0.77; Table 7). Graphs of SROC curves generated before and after removing the outlier (Fig. 2e, f) indicated that a threshold effect contributed to inter-study heterogeneity, since the Spearman correlation coefficient between SEN and 1 − SPE was 0.685 (p = 0.014).
In the subgroup analysis (Tables 5–7), datasets were stratified according to geographic area, sperm storage method, sperm preparation method, and FR cut-off value combined with AE assay method. The diagnostic performance in Asia (Rs = 0.24, 95% CI: 0.05–0.42, p = 0.013; SMD = 0.50, 95% CI: 0.15–0.85, p = 0.006) was inferior to that in North America (Rs = 0.54, 95% CI: 0.43–0.65, p < 0.001; SMD = 0.81, 95% CI: 0.39–1.22, p < 0.001) and Europe (Rs = 0.46, 95% CI: 0.37–0.54, p < 0.001; SMD = 0.92, 95% CI: 0.48–1.36, p < 0.001; comparison between subgroups [p] < 0.05). Cryopreserved spermatozoa (SMD = 0.20, 95% CI: − 0.11–0.52, p = 0.204; SEN = 0.51; SPE = 0.51; DOR = 1.12) were inferior to fresh spermatozoa (SMD = 0.89, 95% CI: 0.63–1.15, p < 0.001; SEN = 0.60, 95% CI: 0.41–0.76; SPE = 0.87, 95% CI: 0.77–0.93; DOR = 9.99, 95% CI: 6.05–16.49; comparison between subgroups [p] < 0.001). Sperm preparation yielded inferior results as compared to no preparation (Rs = 0.42, 95% CI: 0.30–0.55, p < 0.001; SMD = 0.82, 95% CI: 0.50–1.13, p < 0.001; SEN = 0.72, 95% CI: 0.50–0.87; SPE = 0.80, 95% CI: 0.59–0.92; DOR = 10.56, 95% CI: 5.51–20.26; comparison between subgroups [p] < 0.05); AE levels of spermatozoa after swim up were only weakly correlated with FR (Rs = 0.27, 95% CI: 0.01–0.53, p = 0.044); and there was no correlation for spermatozoa after a discontinuous gradient (SMD = 1.07, 95% CI: − 0.10–2.25, p = 0.074).
AE levels determined by fluorometry, including pAb-acrosin (SMD = 1.68, 95% CI: 1.29–2.07, p < 0.001), pAb-hyaluronidase (SMD = 0.48, 95% CI: 0.13–0.82, p = 0.007), and mAb 4D4-proacrosin (Rs = 0.49, 95% CI: 0.27–0.70, p < 0.001; SMD = 0.72, 95% CI: 0.32–1.13, p < 0.001), were positively correlated with FR. For predicting TFF, all three fluorometric assays showed low SEN and high SPE: the pAb-acrosin assay with a cut-off value of < 60% for normal fluorescence scores (SEN = 0.63, SPE = 0.92, DOR = 23.87); the pAb-hyaluronidase assay with a cut-off value of < 80% for normal fluorescence scores (SEN = 0.23, SPE = 0.90, DOR = 2.68); and the mAb 4D4-proacrosin assay with a cut-off value of ≤ 50% for the normal acrosomal principal region (SEN = 0.40, SPE = 0.96, DOR = 16.00).
AE levels determined by spectrophotometry, including spectrophotometry/BAPNA (Rs = 0.44, 95% CI: 0.35–0.54, p < 0.001) and Accu-Sperm spectrophotometry/BAPNA (SMD = 0.70, 95% CI: 0.41–0.99, p < 0.001), were positively correlated with FR, but this did not apply to spectrophotometry/BAEE (SMD = − 0.04, 95% CI: − 0.72–0.64, p = 0.908). The spectrophotometry/BAPNA assay predicted an FR < 50% with pooled low SEN (0.63, 95% CI: 0.48–0.76), moderate SPE (0.87, 95% CI: 0.60–0.97), and DOR = 11.68 (95% CI: 3.47–39.36). Specifically, low SEN with high SPE and moderate SEN with low SPE were associated with cut-off values of 25 μIU/10⁶ spermatozoa (SEN = 0.51, SPE = 0.97, DOR = 33.78) and 18 μIU/10⁶ spermatozoa (SEN = 0.76, SPE = 0.63, DOR = 5.42), respectively. For predicting TFF, the spectrophotometry/BAPNA assay showed pooled moderate SEN and low SPE (SEN = 0.78, 95% CI: 0.38–0.95; SPE = 0.63, 95% CI: 0.40–0.81; DOR = 5.94, 95% CI: 1.34–26.34). Specifically, moderate SEN and SPE, high SEN and low SPE, low SEN and moderate SPE, and low SEN and SPE were obtained for cut-off values of 25 μIU/10⁶ spermatozoa (SEN = 0.78, SPE = 0.73, DOR = 9.50), 54 μIU/10⁶ spermatozoa (SEN = 1.00, SPE = 0.36, DOR = 23.31), 30 μIU/10⁶ spermatozoa (SEN = 0.53, SPE = 0.86, DOR = 7.27), and 15 μIU/μg DNA (SEN = 0.51, SPE = 0.51, DOR = 1.12), respectively. The Accu-Sperm spectrophotometry/BAPNA assay with a cut-off value of < 4.5 for the acrosin activity index showed low SEN and moderate SPE (SEN = 0.61, SPE = 0.84, DOR = 8.40).
AE levels determined by substrate assays—including acrosin target with gelatine substrate assay (Rs = 0.43, 95% CI: 0.30–0.56, p < 0.001), and hyaluronidase target with agar/hyaluronic acid mixture substrate assay (Rs = 0.54, 95% CI: 0.43–0.65, p < 0.001) —were positively correlated with FR. For predicting an FR of ≤50% or < 50%, the acrosin target with gelatine substrate assay with a cut-off value of < 6 for acrosin activity index or < 60% for halo formation rate showed low SEN and high SPE (SEN = 0.26, SPE = 0.97, DOR = 12.63; SEN = 0.50, SPE =0.93, DOR = 13.00, respectively).
The included studies were distributed symmetrically without obvious publication bias (Deeks’ funnel plot [p] = 0.53, Fig. 4b).
Spontaneous AR assay
Rs was extracted from 3 articles that included a total of 181 infertile couples. The spontaneous AR% and FRs that were higher and lower than the respective cut-off values were extracted from 9 articles, which included a total of 602 infertile couples. Of the 10 datasets analyzed, one used two AR assay methods. Binary accuracy data were extracted from only 3 articles as 2 × 2 tables; the diagnostic summary measures were not pooled because the bivariate mixed effects regression model requires a minimum of 4 studies (Table 3).
According to a random-effects model, spontaneous AR% was weakly correlated with FR (Rs = 0.32, 95% CI: 0.01–0.63, p = 0.045; Fig. 3a), with notable heterogeneity (I² = 85.1%, p = 0.001). However, spontaneous AR% did not differ between higher and lower FRs when pooling SMD (SMD = − 0.30, 95% CI: –0.80–0.20, p = 0.245; Fig. 3b), again with notable heterogeneity (I² = 87.0%, p < 0.001). After sensitivity analysis, three studies [29, 33, 63] were identified as a source of heterogeneity; after they were excluded, the lack of association was unchanged (SMD = − 0.06, 95% CI: –0.33–0.22, p < 0.001) but the heterogeneity decreased markedly (I² = 46.2%, p = 0.084). The included studies were distributed symmetrically without obvious publication bias (Egger’s test [p] = 0.713, Fig. 4d).
Induced AR assay
Rs was extracted from 12 articles that included a total of 917 infertile couples. A total of 17 datasets were analyzed, including one article each that used five  and two  AR triggers. Induced AR% and FRs that were higher and lower than the respective cut-off values were extracted from 15 articles, which included a total of 1033 infertile couples. Of the 22 datasets analyzed, one used two AR assay methods , another reported five AR triggers , and two also mentioned different triggers [39, 42]. Binary accuracy data from 953 infertile couples were extracted from 12 articles as 2 × 2 tables. We analyzed the 13 datasets, including one paper that used two triggers  (Table 4).
According to a random-effects model, induced AR% was positively correlated with FR (Rs = 0.40, 95% CI: 0.24–0.57, p < 0.001; Fig. 3c, Table 8), albeit with notable heterogeneity (I² = 96.5%, p < 0.001). Higher induced AR% was obtained for higher as compared to lower FRs (SMD = 0.86, 95% CI: 0.60–1.11, p < 0.001; Fig. 3d, Table 9). The bivariate mixed effects regression model predicted lower FR for lower induced AR% with pooled moderate SEN/SPE (SEN = 0.79, 95% CI: 0.71–0.85; SPE = 0.87, 95% CI: 0.74–0.94; Fig. 3e, Table 10), discriminant effect (PLR = 6.08, 95% CI: 2.77–13.36; NLR = 0.24, 95% CI: 0.17–0.35; DS = 3.22, 95% CI: 2.19–4.24; DOR = 24.91, 95% CI: 8.91–69.66; Fig. 3f, Table 10), and accuracy (AUC = 0.84, 95% CI: 0.81–0.87, Fig. 4a, Table 10). The Fagan nomogram showed that lower induced AR% could be used to predict lower FR when the pre-test probability was 27%, with a post-test probability of 69%.
After sensitivity analysis, seven studies (when pooling Rs: 3; when pooling SMD: 4) were identified as a source of heterogeneity; after they were excluded, the correlation was essentially unchanged (Rs = 0.36, 95% CI: 0.24–0.47, p < 0.001; SMD = 0.71, 95% CI: 0.52–0.90, respectively) and the heterogeneity, although reduced, remained significant (I² = 83.6%, p < 0.001; I² = 54.8%, p = 0.003, respectively). No outlier was identified when pooling diagnostic accuracy data (Fig. 4e). Graphs of the SROC curves indicated that a threshold effect did not contribute to inter-study heterogeneity (r = − 0.146, p = 0.634; Fig. 4a).
In the subgroup analysis, datasets were stratified according to geographic area, sperm preparation method, AR trigger, and AR assay method (Tables 8, 9 and 10). The diagnostic performance in the other areas (Europe [Rs = 0.33, 95% CI: 0.11–0.55, p = 0.003; pooled moderate SEN = 0.80, 95% CI: 0.66–0.89; moderate SPE = 0.86, 95% CI: 0.56–0.97], Oceania [Rs = 0.40, 95% CI: 0.03–0.76, p = 0.035; pooled moderate SEN = 0.75, 95% CI: 0.59–0.86; moderate SPE = 0.77, 95% CI: 0.61–0.87], South America [Rs = 0.46, 95% CI: 0.17–0.75, p = 0.002; moderate SEN = 0.78, high SPE = 0.91], Asia [moderate SEN = 0.82, moderate SPE = 0.70], and North America [Rs = 0.49, 95% CI: 0.30–0.69, p < 0.001; pooled moderate SEN = 0.77, 95% CI: 0.69–0.84, moderate SPE = 0.87, 95% CI: 0.64–0.96]) was inferior to that in Africa (Rs = 0.65, 95% CI: 0.05–1.25, p = 0.034; pooled high SEN = 0.94, 95% CI: 0.44–1.00, high SPE = 0.98, 95% CI: 0.85–1.00; comparison between subgroups [p] < 0.01).
No preparation (Rs = 0.39, 95% CI: 0.20–0.58, p < 0.001; moderate SEN = 0.71, low SPE = 0.55) or double preparation (Rs = 0.33, 95% CI: 0.13–0.53, p = 0.001; low SEN = 0.63, high SPE = 1.00) yielded inferior results as compared to one preparation (Rs = 0.41, 95% CI: 0.24–0.58, p < 0.001; pooled moderate SEN = 0.82, 95% CI: 0.73–0.88, moderate SPE = 0.87, 95% CI: 0.73–0.94; comparison between subgroups [p] < 0.001); discontinuous gradient (Rs = 0.17, 95% CI: 0.10–0.25, p < 0.001; SMD = 0.47, 95% CI: 0.28–0.66, p = 0.02) was inferior to swim up (Rs =0.65, 95% CI: 0.49–0.81, p < 0.001; SMD = 1.51, 95% CI: 1.13–1.89, p < 0.001; comparison between subgroups [p] < 0.001).
Nonphysiological triggers (SMD = 0.81, 95% CI: 0.56–1.06, p < 0.001; moderate SEN = 0.79, 95% CI: 0.70–0.85; pooled moderate SPE = 0.86, 95% CI: 0.65–0.95) did not differ from physiological triggers (SMD = 0.95, 95% CI: 0.29–1.61, p = 0.005; pooled moderate SEN = 0.82, 95% CI: 0.73–0.88; moderate SPE = 0.88, 95% CI: 0.76–0.94; comparison between subgroups [p] = 0.92) in general; ZP (Rs = 0.63, 95% CI: 0.25–1.01, p = 0.001; SMD = 1.86, 95% CI: 0.91–2.80, p < 0.001) or mannose (Rs = 0.59, 95% CI: 0.42–0.76, p < 0.001; SMD = 1.91, 95% CI: 1.18–2.63, p < 0.001) was superior to other physiological (comparison between subgroups [p] < 0.05) or nonphysiological triggers (A23187 [Rs = 0.36, 95% CI: 0.13–0.58, p = 0.002; SMD = 0.87, 95% CI: 0.66–1.08, p < 0.001], BSA-GlcNAc [Rs = 0.46, 95% CI: 0.17–0.75, p = 0.002; SMD = 0.97, 95% CI: 0.15–1.78, p = 0.02]; comparison between subgroups [p] < 0.001); and there was no correlation for HFF (Rs = 0.46, 95% CI: −0.03 to 0.95, p = 0.065; SMD = 0.97, 95% CI: −0.42 to 2.37, p = 0.172), P (Rs = 0.31, 95% CI: −0.01 to 0.63, p = 0.059), CAMP (Rs = 0.12, 95% CI: −0.06 to 0.29, p = 0.206; SMD = −0.12, 95% CI: −0.56 to 0.32, p = 0.588) and TPA (Rs = 0.03, 95% CI: −0.15 to 0.21, p = 0.773).
The diagnostic performance of fluorescent labels (Rs = 0.41, 95% CI: 0.25–0.58, p < 0.001; SMD = 0.82, 95% CI: 0.57–1.08, p < 0.001) did not differ from that of triple stain (Rs = 0.24, 95% CI: 0.03–0.45; SMD = 1.52, 95% CI: 0.87–2.18). Lower induced AR% determined by fluorescent labels or triple stain predicted lower FR with pooled moderate SEN/high SPE (SEN = 0.78, 95% CI: 0.71–0.84; SPE = 0.90, 95% CI: 0.78–0.96) or pooled high SEN/low SPE (SEN = 0.93, 95% CI: 0.76–0.98; SPE = 0.58, 95% CI: 0.52–0.64), respectively. The diagnostic performance of direct immunofluorescence (Rs = 0.40, 95% CI: 0.21–0.58; SMD = 0.80, 95% CI: 0.52–1.07) did not differ from that of indirect immunofluorescence (anti-CD46 antibody [Rs = 0.68, 95% CI: 0.59–0.77, p < 0.001], GB24 antibody [SMD = 1.11, 95% CI: 0.26–1.95, p = 0.01]; comparison between subgroups [p] > 0.05); direct immunofluorescence with antibody (FITC-GB24: Rs = 0.15, 95% CI: 0.05–0.25, p = 0.003; SMD = 0.28, 95% CI: 0.06–0.50, p = 0.014) was inferior to direct immunofluorescence with lectin (Rs = 0.53, 95% CI: 0.36–0.70, p < 0.001; SMD = 1.16, 95% CI: 0.84–1.47, p < 0.001; comparison between subgroups [p] < 0.001); there was no significant difference between lectins (FITC-PSA [SMD = 1.19, 95% CI: 0.68–1.71, p < 0.001], FITC-PNA [SMD = 0.96, 95% CI: 0.66–1.25, p < 0.001], and RITC-PSA [SMD = 1.91, 95% CI: 1.18–2.63, p < 0.001]; comparison between subgroups [p] = 0.06). Specifically, moderate SEN/moderate SPE (SEN = 0.81, 95% CI: 0.69–0.88; SPE = 0.83, 95% CI: 0.66–0.93), pooled low SEN/moderate SPE (SEN = 0.68, 95% CI: 0.58–0.76; SPE = 0.85, 95% CI: 0.81–0.88), and moderate SEN/high SPE (SEN = 0.83, SPE = 0.98) were obtained for FITC-PSA, FITC-PNA, and RITC-PSA, respectively.
The included studies were distributed symmetrically without obvious publication bias (Deeks’ funnel plot [p] = 0.36, Fig. 4f).
There are many functional assays that attempt to assess the fertilization capacity of spermatozoa based on hypoosmotic swelling, peroxidative damage, acrosome status, AEs, sperm chromatin, sperm-oocyte interaction, zona pellucida binding, and zona-free oocyte penetration. However, their clinical utility for diagnosing male infertility is unclear. One reliable criterion for evaluating the diagnostic performance of assays is whether or not they can predict fertilization outcomes in IVF [3, 66, 67].
The results of our cohort study showed that AE (i.e., proacrosin and acrosin) levels determined by spectrophotometry/BAPNA were positively correlated with FR. However, lower AE levels were predictive of TFF with moderate SEN but low SPE. In addition, a meta-analysis of the published literature on this topic was performed to expand upon the findings. To the best of our knowledge, this meta-analysis is the first study to evaluate both the association between acrosome function scoring (including AE levels and AR%) and FR and the diagnostic performance of acrosome function scoring. No attempt has been made here to correlate the scoring with conception rates because several other factors, such as endometrial secretions, receptivity, and systemic and local endocrine status, become significant after embryo transfer. After the correlation was validated by pooling Rs and SMD, lower AE levels or induced AR% was predictive of lower FR with moderate accuracy (AUC between 0.70 and 0.90); this was accompanied by low SEN/moderate SPE and moderate SEN/moderate SPE, respectively. A moderate SPE indicates that a male diagnosed as scoring-negative (i.e., higher than the AE cut-off value) has about 85% or greater probability of having a high FR (i.e., higher than the FR cut-off value). Fifteen percent of the patients with high AE levels and poor fertilization probably have defects other than impaired AEs. For induced AR assay, the findings were in agreement with the results of Oehninger et al., who reported that AR results were predictive of IVF rates, showing moderate accuracy, SEN and SPE. However, for AE levels, a low SEN indicates that a male diagnosed as AE-positive (i.e., lower than the AE cut-off value) still has a 43% probability of having a high FR. The cohort results described above are expected, as proacrosin/acrosin is an important enzyme for fertilization.
However, the SPE is low, probably because the action of proacrosin/acrosin depends on structural and biochemical events that take place during capacitation and the acrosome reaction, and because spectrophotometry, unlike fluorometry, cannot detect the enzyme in its proper location (i.e., the acrosome). The other kinds of AEs, such as hyaluronidase, were not taken into consideration. Furthermore, satisfactory diagnostic performance was not obtained for the assays in the meta-analysis, despite the synthesis of multiple assay methods. The relatively low SEN might be attributed to other parameters of sperm function, such as good membrane integrity, normal chromatin decondensation, an excellent ability to undergo capacitation and hyperactivation, high inducibility of the acrosome reaction (AR), increased sperm-oolemma interaction, mild peroxidative damage, or low DNA fragmentation. It should also be mentioned that fertilization is a multifactorial process in which female factors, such as young age of the woman, maturity of the oocyte/spindle/zona pellucida, intactness of the cumulus-oocyte complex, or a good ability to modulate/restore sperm functions, may contribute to high fertilization [15, 16]. For spontaneous AR assay, a weak correlation was obtained when pooling Rs; however, after enlarging the sample size, there was no significant correlation when pooling SMD. The spontaneous AR assay was considered for the evaluation of the initial acrosome stability before ZP binding; a low percentage of spontaneous AR did not seem to influence sperm fertility, possibly because of the high heterogeneity of spermatozoa.
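The pooled SEN/SPE pairs reported in the abstract for the AE assay (0.57/0.85) and the induced AR assay (0.79/0.87) can be related to the likelihood ratios and the diagnostic odds ratio through their textbook definitions: PLR = SEN/(1 − SPE), NLR = (1 − SEN)/SPE, and DOR = PLR/NLR. The sketch below applies those definitions directly; the function name is hypothetical, and the values it prints are simple point calculations that need not match the pooled PLR, NLR, and DOR estimated by the meta-analytic model.

```python
def likelihood_ratios(sen, spe):
    """Return (PLR, NLR, DOR) from sensitivity and specificity given as fractions."""
    if not (0.0 < sen < 1.0 and 0.0 < spe < 1.0):
        raise ValueError("sen and spe must lie strictly between 0 and 1")
    plr = sen / (1.0 - spe)      # positive likelihood ratio
    nlr = (1.0 - sen) / spe      # negative likelihood ratio
    dor = plr / nlr              # diagnostic odds ratio
    return plr, nlr, dor


# Pooled SEN/SPE pairs as quoted in the abstract of this article.
for label, sen, spe in [("AE levels", 0.57, 0.85), ("induced AR%", 0.79, 0.87)]:
    plr, nlr, dor = likelihood_ratios(sen, spe)
    print(f"{label}: PLR = {plr:.2f}, NLR = {nlr:.2f}, DOR = {dor:.2f}")
```

A higher PLR together with a lower NLR for the induced AR assay than for the AE assay would be in line with the larger AUC reported for the induced AR assay.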
In addition, there was notable heterogeneity when pooling summary measures in the present meta-analysis. After sensitivity analysis, two studies were identified as a source of heterogeneity when pooling Rs for AE assay. One reported a linear correlation between AE and the percentage of cases with ≥70% fertilization achieved by IVF. In the other, semen was prepared by α-chymotrypsin treatment, which is suitable for highly viscous semen. When SMD was pooled, four studies [3, 6, 50, 53] were found to contribute to this heterogeneity. Two used cryopreserved spermatozoa to assay AE [50, 53]; one used spermatozoa without preparation and another used spermatozoa subjected to a special discontinuous gradient (i.e., 1-ml fractions of 90%, 80%, and 50% Percoll in isotonic Ham’s-F10) in IVF therapy. When pooling diagnostic accuracy data, one outlier may have affected inter-study heterogeneity, for which the highest AE cut-off value was obtained by the spectrophotometry/BAPNA assay (54 μIU/106 spermatozoa). The sperm origin (fresh or cryopreserved), sperm preparation methods, FR cut-off values, and AE assay methods and cut-off values might contribute to inter-study heterogeneity. For spontaneous AR assay, three studies were found to contribute to this heterogeneity when pooling SMD. Two used FITC-PSA to determine AR after incubation for 60 min in synthetic human tubal fluid (HTF) media [33, 63]; one used two-color fluorescence staining of FITC-PSA and anti-CD46 antibody (MH61) to assay acrosomal status after 4 h of incubation in mBWW/3.5% HSA media. The sperm capacitation time, media, and assay methods might contribute to inter-study heterogeneity. For induced AR assay, seven studies [31, 33, 41, 43, 63] were identified as a source of heterogeneity.
The inconsistencies among studies regarding capacitation time (range between 1 h and 24 h), sperm preparation methods (swim up or discontinuous gradient), AR triggers (physiological [HFF, P, ZP] or nonphysiological [TPA, CAMP, mannose]), and AR assay methods (FITC-PSA, RITC-PSA, FITC-GB24) might contribute to inter-study heterogeneity.
Furthermore, the subgroup analysis revealed that the correlation between AE levels and FR depended on geographic area, with Asia being inferior in this regard to North America and Europe, which may be explained by methodological quality. For example, two of three studies in Asia [9, 18] did not describe the inclusion criteria for patients undergoing IVF therapy, whereas only a minority of North American (i.e., three of seven) [6, 8, 19] and European (i.e., one of six) studies did not report these criteria. In addition, two Asian studies [9, 57] did not clearly define the reference standard test (i.e., fertilization), which was true for only two North American [6, 50] and one European study. Racial differences could also contribute, but this is unknown. Moreover, the populations of certain areas of the world, such as parts of North America, can be very heterogeneous, so race cannot be assumed from geography. In the sperm head, the organelle most affected by cryopreservation damage is the acrosome, which is consistent with cryopreserved spermatozoa being inferior to fresh spermatozoa. Spermatozoa without preparation more closely reflected the population composition and fertility of the original ejaculate and were superior to spermatozoa after swim up and a discontinuous gradient in terms of diagnostic performance. It was difficult to predict FR based on AE levels with high accuracy, SEN, and SPE using any one assay method. Lower AE levels determined by fluorometry (including pAb-acrosin, pAb-hyaluronidase, and mAb 4D4-proacrosin) could predict TFF with low SEN and high SPE. Lower AE levels determined by the gelatin substrate assay could predict lower FR (i.e., FR ≤ 50% or < 50%) with low SEN and high SPE.
As for the hyaluronidase target with agar/hyaluronic acid mixture substrate assay, the diagnostic performance was not evaluated because the high SEN (0.91) and SPE (1.00) for predicting TFF described in the text contradict the low SEN (0.54) and high SPE (1.00) calculated from the scatterplot of the correlation between hyaluronidase activity and FR in the study by Abdul-Aziz et al. More studies are needed to determine its predictive value. The spectrophotometry assay had an uncertain predictive value. Specifically, lower AE levels determined by the most commonly used spectrophotometry/BAPNA assay could predict an FR < 70%, FR < 50%, or FR = 0%; this was accompanied by moderate SEN/moderate SPE, pooled low SEN/moderate SPE, and moderate SEN/low SPE, respectively. This result is also consistent with the finding from our retrospective study. AE levels determined by Accu-Sperm spectrophotometry/BAPNA could predict TFF with low SEN and moderate SPE. However, the lower AE levels obtained by spectrophotometry/BAEE in one study were not correlated with TFF. Another study that was not included in our analysis showed similar results by the same method (AE extraction with acid [i.e., pH = 2.8]), but that method did not reflect the actual levels of proacrosin converted to acrosin.
For induced AR assay, the diagnostic performance also showed regional effects; Africa was superior to the other areas in this regard, which may be explained by methodology or by high inter-study heterogeneity in some of the other areas. For example, all three studies [31, 35, 63] in Africa used the same sperm preparation method (swim up), trigger (ZP), and assay method (FITC-PSA) and clearly defined the reference standard test. Two of them performed laboratory quality control for the assay method by establishing intra- and interassay/technician coefficients of variation, whereas only one study in the other areas did so. Spermatozoa after one preparation, especially swim up, show better survival after incubation in capacitation media than unprepared or double-prepared spermatozoa, which may explain their optimal diagnostic performance. Nonphysiological triggers did not differ from physiological triggers in terms of diagnostic performance; mannose may act as a substitute when physiological triggers are lacking. Nevertheless, the use of human ZP, biologically active recombinant ZP3 or active, synthetic ZP3 peptides (or analogues) combined with a better understanding of the biochemistry of the carbohydrate–protein interactions that take place during gamete recognition, binding and induction of acrosomal exocytosis will undoubtedly help in their elaboration. Finally, it was difficult to predict FR based on induced AR% with high accuracy, SEN, and SPE using any one assay method. Multiple methods (i.e., indirect immunofluorescence, direct immunofluorescence with lectin, and triple stain) may be combined to obtain high SEN and SPE.
In conventional IVF therapy, one of the major disappointments that infertile couples may encounter is the unexpected failure to achieve fertilization. Some studies using an early rescue ICSI procedure performed 4–6 h post-insemination have described successful salvage of some total or near-total fertilization failure cycles [71, 72]. Therefore, acrosome function assays may provide more important clinical guidance when they are used for predicting TFF. For AE assay, lower AE levels determined by spectrophotometry/BAPNA, Accu-Sperm spectrophotometry/BAPNA, or fluorometry (including pAb-acrosin, pAb-hyaluronidase, and mAb 4D4-proacrosin) were used for predicting TFF, with moderate SEN/low SPE, low SEN/moderate SPE, or low SEN/high SPE, respectively. For induced AR assay, lower induced AR% determined by triple stain or direct immunofluorescence with lectin (including FITC-PSA and FITC-PNA) was used for predicting TFF, with high SEN/low SPE and moderate SEN/moderate SPE, respectively. Based on optimal diagnostic performance, a two-method assay using AE levels determined by the pAb-acrosin assay and induced AR% determined by triple stain can be recommended for assessing acrosome function and predicting TFF. The two-method assay yields four types of detection results: AE levels-positive (< 60% for normal fluorescence scores)/induced AR%-positive (< 31.3% for the difference between the induced and spontaneous AR results), AE levels-negative (≥ 60% for normal fluorescence scores)/induced AR%-negative (≥ 31.3% for the difference between the induced and spontaneous AR results), AE levels-positive/induced AR%-negative, and AE levels-negative/induced AR%-positive.
The early rescue ICSI procedure should be recommended for patients diagnosed as AE levels-positive/induced AR%-positive, who have a higher chance of TFF, and for patients with high-risk factors (such as unexplained infertility or primary infertility with a longer duration of infertility) and a conflicting diagnosis (i.e., AE levels-positive/induced AR%-negative or AE levels-negative/induced AR%-positive). Conventional IVF therapy should be recommended for patients diagnosed as AE levels-negative/induced AR%-negative, who have a higher chance of fertilization success, and for patients with a conflicting diagnosis but without high-risk factors.
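The two-method triage described above can be written as a small decision rule. The following is only an illustrative sketch of the logic as stated in the text, using the quoted cut-offs (60% for normal fluorescence scores and 31.3% for the induced minus spontaneous AR difference); the function name, argument names, and returned labels are hypothetical, and the rule is not a validated clinical tool.

```python
AE_CUTOFF_PCT = 60.0        # normal fluorescence score cut-off quoted in the text
AR_DIFF_CUTOFF_PCT = 31.3   # induced minus spontaneous AR cut-off quoted in the text


def recommend_procedure(ae_fluorescence_pct, induced_ar_pct, spontaneous_ar_pct, high_risk):
    """Map the two assay results and a high-risk flag to the suggested procedure."""
    ae_positive = ae_fluorescence_pct < AE_CUTOFF_PCT
    ar_positive = (induced_ar_pct - spontaneous_ar_pct) < AR_DIFF_CUTOFF_PCT

    if ae_positive and ar_positive:
        # Both assays abnormal: higher chance of total fertilization failure.
        return "early rescue ICSI"
    if not ae_positive and not ar_positive:
        # Both assays normal: higher chance of fertilization success.
        return "conventional IVF"
    # Conflicting results: decided by high-risk factors such as unexplained
    # infertility or primary infertility with a longer duration.
    return "early rescue ICSI" if high_risk else "conventional IVF"


print(recommend_procedure(50.0, 30.0, 10.0, high_risk=False))
print(recommend_procedure(70.0, 30.0, 10.0, high_risk=True))
```

In this sketch the high-risk flag only changes the outcome when the two assays disagree, which mirrors the wording of the recommendation.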
Our cohort study has several limitations. First, our dataset was collected retrospectively from a single center in a single geographic area, and AE was determined by a single spectrophotometric method. Second, the sample size was not large and FR was the only primary fertilization outcome. The meta-analysis results should be considered in the context of their strengths and limitations. The advantages were as follows: the pooling of multiple summary measures; sensitivity and subgroup analyses to identify sources of heterogeneity; and low publication bias, which supported the reliability of the results. Nonetheless, there were some limitations, such as the lack of available RCTs; the inclusion of old articles (published between 1988 and 2014) and studies with high heterogeneity; and the omission of some AE assay methods, including acrosin/proacrosin/acrosin inhibitor or hyaluronidase target with BAEE substrate assay, hyaluronidase target with cytochemical or hyaluronic acid substrate assay, and acrosin target with western blotting or RIA, for which articles were lacking.
The results of our study demonstrate that acrosome function assays capable of predicting FR with high SEN and SPE are lacking. Only limited prediction was obtained for AE assays, even though multiple methods (i.e., fluorometry, spectrophotometry, substrate assays) may be combined. For induced AR assay, however, multiple methods (i.e., indirect immunofluorescence, direct immunofluorescence with lectin, and triple stain) may be combined to obtain high SEN and SPE. The diagnostic performance showed regional effects as well as an effect of the sperm preparation or assay method. New assays of acrosome function, such as ones utilizing a panel of monoclonal or polyclonal antibodies against acrosome-related proteins, should be developed as a supplement for a more accurate diagnosis of structural and functional defects in the sperm acrosome. In addition, although most fertility centers prefer ICSI to IVF as the method of treatment for couples with male-factor infertility, this decision should be made more cautiously, considering the controversy over the potential safety of ICSI. More multicenter, large-scale, carefully designed studies that synthesize multiple sperm functional assays and oocyte quality assays are still needed in clinical settings to better predict fertilization outcome in IVF. The early rescue ICSI procedure should be recommended for patients with a higher chance of fertilization failure, and conventional IVF therapy should be recommended for patients with a higher chance of fertilization success.
AUC: Area under the summary receiver operating characteristic curve
BAEE: N-benzoyl-l-arginine ethyl ester
BSA-GlcNAc: Neoglycoproteins with N-acetylglucosamine residues
CAMP: Cyclic adenosine 3′-5′-phosphate analogue
Dyes for bright-field microscopy
DOR: Diagnostic odds ratio
FITC-Con A: Concanavalin A lectin
FITC-PNA: Fluorescein isothiocyanate-conjugated peanut agglutinin
FITC-PSA: Fluorescein isothiocyanate-conjugated Pisum sativum agglutinin
HFF: Human follicle fluid
ICSI: Intracytoplasmic sperm injection
IVF: In vitro fertilization
mAb 4D4-proacrosin: Monoclonal anti-proacrosin antibody
NLR: Negative likelihood ratio
pAb-acrosin: Polyclonal anti-acrosin antibody
pAb-hyaluronidase: Polyclonal anti-hyaluronidase antibody
PLR: Positive likelihood ratio
RITC-PSA: Rhodamine-conjugated Pisum sativum agglutinin
Rs: Spearman correlation coefficient
SMD: Standardized mean difference
SROC: Summary receiver operating characteristic
TEM: Transmission electron microscopy
TFF: Total fertilization failure
Vazquez-Levin MH, Furlong LI, Veaute CM, Ghiringhelli PD. An overview of the proacrosin/acrosin system in human spermatozoa. Treballs la Soc Catalana Biol. 2005;56:59–74.
Hirayama T, Hasegawa T, Hiroi M. The measurement of hyaluronidase activity in human spermatozoa by substrate slide assay and its clinical application. Fertil Steril. 1989;51:330–4.
Senn A, Germond M, De Grandi P. Immunofluorescence study of actin, acrosin, dynein, tubulin and hyaluronidase and their impact on in-vitro fertilization. Hum Reprod. 1992;7:841–9.
Albert M, Gallo JM, Escalier D, Parseghian N, Jouannet P, Schrevel J, et al. Unexplained in-vitro fertilization failure: implication of acrosomes with a small reacting region, as revealed by a monoclonal antibody. Hum Reprod. 1992;7:1249–56.
Howe SE, Grider SL, Lynch DM, Fink LM. Antisperm antibody binding to human acrosin: a study of patients with unexplained infertility. Fertil Steril. 1991;55:1176–82.
Kennedy WP, Kaminski JM, Van der Ven HH, Jeyendran RS, Reid DS, et al. A simple, clinical assay to evaluate the acrosin activity of human spermatozoa. J Androl. 1989;10:221–31.
Tummon IS, Yuzpe AA, Daniel SAJ, Deutsch A. Total acrosin activity correlates with fertility potential after fertilization in vitro. Fertil Steril. 1991;56:933–8.
De Jonge CJ, Tarchala SM, Rawlins RG, Binor Z, Radwanska E. Acrosin activity in human spermatozoa in relation to semen quality and in-vitro fertilization. Hum Reprod. 1993;8:253–7.
Yang YS, Chen SU, Ho HN, Chen HF, Lien YR, Lin HR, et al. Acrosin activity of human sperm did not correlate with IVF. Arch Androl. 1994;32:13–9.
Liu DY, Baker HWG. Relationships between human sperm acrosin, acrosomes, morphology and fertilization in vitro. Hum Reprod. 1990;5:298–303.
Schill WB. Quantitative determination of acrosin activity in human spermatozoa. Fertil Steril. 1974;25:703–12.
Goodpasture JC, Polakoski KL, Zaneveld LJD. Acrosin, proacrosin, and acrosin inhibitor of human spermatozoa: extraction, quantitation, and stability. J Androl. 1980;1:16–27.
Zaneveld LJD, Polakoski KL, Schumacher GFB. Properties of acrosomal hyaluronidase from bull spermatozoa. Evidence for its similarity to testicular hyaluronidase. J Biol Chem. 1973;248:564–70.
Joyce C, Jeyendran RS, Zaneveld LJD. Release, extraction, and stability of hyaluronidase associated with human spermatozoa. Comparisons with the rabbit. J Androl. 1985;6:152–61.
Henkel R, Maaß G, Bödeker R, Scheibelhut C, Stalf T, Mehnert C, et al. Sperm function and assisted reproduction technology. Reprod Med Biol. 2005;4:7–30.
Henkel R, Müller C, Miska W, Schill W, Kleinstein J, Gips H. Acrosin activity of human spermatozoa by means of a simple gelatinolytic technique: a method useful for IVF. J Androl. 1995;16:272–7.
Liu DY, Baker HWG. Andrology: disordered acrosome reaction of spermatozoa bound to the zona pellucida: a newly discovered sperm defect causing infertility with reduced sperm-zona pellucida penetration and reduced fertilization in vitro. Hum Reprod. 1994;9:1694–700.
Tavalaee M, Razavi S, Nasr-Esfahani MH. Effects of sperm acrosomal integrity and protamine deficiency on in vitro fertilization and pregnancy rate. Int J Fertil Steril. 2007;1:27–34.
Abdul-Aziz M, MacLusky NJ, Bhavnani BR, Casper RF. Hyaluronidase activity in human semen: correlation with fertilization in vitro. Fertil Steril. 1995;64:1147–53.
Mohsenian M, Syner FN, Moghissi KS. A study of sperm acrosin in patients with unexplained infertility. Fertil Steril. 1982;37:223–9.
Cross NL, Meizel S. Methods for evaluating the acrosomal status of mammalian sperm. Biol Reprod. 1989;41:635–41.
Fujino Y, Ozaki K, Nakamura Y, Sun TT, Ueda K, Ozaki A, et al. Clinical application of a new stain to detect acrosome-reacted sperm for predicting polyspermic fertilization in IVF-ET. Arch Androl. 1997;39:25–31.
Pampiglione JS, Tan S-L, Campbell S. The use of the stimulated acrosome reaction test as a test of fertilizing ability in human spermatozoa. Fertil Steril. 1993;59:1280–4.
Jędrzejczak P, Pawelczyk L, Taszarek-Hauke G, Kotwicka M, Warchoł W, Kurpisz M. Predictive value of selected sperm parameters for classical in vitro fertilization procedure of oocyte fertilization. Andrologia. 2005;37:72–82.
Calvo L, Dennison-Lagos L, Banks SM, Dorfmann A, Thorsell LP, Bustillo M, et al. Acrosome reaction inducibility predicts fertilization success at in-vitro fertilization. Hum Reprod. 1994;9:1880–6.
Yovich JM, Edirisinghe WR, Yovich JL. Use of the acrosome reaction to ionophore challenge test in managing patients in an assisted reproduction program: a prospective, double-blind, randomized controlled study. Fertil Steril. 1994;61:902–10.
Brandelli A, Miranda PV, Añón-Vazquez MG, Marin-Briggiler CI, Sanjurjo C, Gonzalez-Echeverria F, et al. A new predictive test for in-vitro fertilization based on the induction of sperm acrosome reaction by N-acetylglucosamine-neoglycoprotein. Hum Reprod. 1995;10:1751–6.
Liu DY, Baker HWG. Calcium ionophore-induced acrosome reaction correlates with fertilization rates in vitro in patients with teratozoospermic semen. Hum Reprod. 1998;13:905–10.
Kawamoto A, Ohashi K, Kishikawa H, Zhu L-Q, Azuma C, Murata Y. Two-color fluorescence staining of lectin and anti-CD46 antibody to assess acrosomal status. Fertil Steril. 1999;71:497–501.
Fukui K, Hori R, Yoshimoto I, Ochi H, Ito M. Correlation between progesterone binding sites on human spermatozoa and in vitro fertilization outcome. Gynecol Obstet Investig. 2000;49:1–5.
Esterhuizen AD, Franken DR, Lourens JGH, Rooyen LHV. Clinical importance of zona pellucida-induced acrosome reaction and its predictive value for IVF. Hum Reprod. 2001;16:138–44.
Liu DY, Baker HWG. Disordered zona pellucida–induced acrosome reaction and failure of in vitro fertilization in patients with unexplained infertility. Fertil Steril. 2003;79:74–80.
El-Ghobashy AA, West CR. The human sperm head: a key for successful fertilization. J Androl. 2003;24:232–8.
Katsuki T, Hara T, Ueda K, Tanaka J, Ohama K. Prediction of outcomes of assisted reproduction treatment using the calcium ionophore-induced acrosome reaction. Hum Reprod. 2005;20:469–75.
Abu DAH, Franken DR, Hoffman B, Henkel R. Sequential analysis of sperm functional aspects involved in fertilisation: a pilot study. Andrologia. 2012;44:175–81.
Wiser A, Sachar S, Ghetler Y, Shulman A, Breitbart H. Assessment of sperm hyperactivated motility and acrosome reaction can discriminate the use of spermatozoa for conventional in vitro fertilisation or intracytoplasmic sperm injection: preliminary results. Andrologia. 2014;46:313–5.
Parinaud J, Labal B, Vieitez G, Richoilley G, Grandjean H. Comparison between fluorescent peanut agglutinin lectin and GB24 antibody techniques for the assessment of acrosomal status. Hum Reprod. 1993;8:1685–8.
Sukcharoen N, Keith J, Irvine DS, Aitken RJ. Predicting the fertilizing potential of human sperm suspensions in vitro: importance of sperm morphology and leukocyte contamination. Fertil Steril. 1995;63:1293–300.
Krausz C, Bonaccorsi L, Maggio P, Luconi M, Criscuoli L, Fuzzi B, et al. Two functional assays of sperm responsiveness to progesterone and their predictive values in in-vitro fertilization. Hum Reprod. 1996;11:1661–7.
Takahashi K, Sakoda R, Yamasaki H, Uchida A, Kazuo Y, Kitao M. Evaluation of sperm fertilizing capacity using the determination of acrosome reaction. Asia-Oceania J Obstet Gynaecol. 1993;19:235–40.
Parinaud J, Vieitez G, Moutaffian H, Richoilley G, Labal B. Variations in spontaneous and induced acrosome reaction: correlations with semen parameters and in-vitro fertilization results. Hum Reprod. 1995;10:2085–9.
Parinaud J, Vieitez G, Moutaffian H, Richoilley G, Labal B. Relevance of acrosome function in the evaluation of semen in vitro fertilizing ability. Fertil Steril. 1995;63:598–603.
Benoff S, Hurley IR, Mandel FS, Paine T, Jacob A, Cooper GW, et al. Use of mannose ligands in IVF screens to mimic zona pellucida-induced acrosome reactions and predict fertilization success. Fertil Steril. 1997;3:839–46.
Hershlag A, Paine T, Scholl GM, Rosenfeld D, Mandel FS, Zhu JZ, et al. Acrobeads test as a predictor of fertilization in vitro. Am J Reprod Immunol. 1997;37:291–9.
Robertson L, Wolf DP, Tash JS. Temporal changes in motility parameters related to acrosomal status: identification and characterization of populations of hyperactivated human sperm. Biol Reprod. 1988;39:797–805.
Lee CY, Wong E, Hsu E, Huang ES. Molecular identity of a sperm acrosome antigen recognized by HS-63 monoclonal antibody. J Reprod Immunol. 1993;24:235–47.
Fénichel P, Donzeau M, Farahifar D, Basteris B, Ayraud N, Hsi B-L. Dynamics of human sperm acrosome reaction: relation with in vitro fertilization. Fertil Steril. 1991;55:994–9.
Carver-ward JA, Jaroudi KA, Hollanders JMG, Einspenner M. High fertilization prediction by flow cytometric analysis of the CD46 antigen on the inner acrosomal membrane of spermatozoa. Hum Reprod. 1996;11:1923–8.
Lee MA, Trucco GS, Bechtol KB, Wummer N, Kopf GS, Blasco L, et al. Capacitation and acrosome reactions in human spermatozoa monitored by a chlortetracycline fluorescence assay. Fertil Steril. 1987;48:649–58.
Kruger TF, Haque D, Acosta AA, Pleban P, Swanson RJ, Simmons KF, et al. Correlation between sperm morphology, acrosin, and fertilization in an IVF program. Arch Androl. 1988;20:237–41.
Sofikitis NV, Miyagawa I, Zavos PM, Toda T, Iino A, Terakawa N. Confocal scanning laser microscopy of morphometric human sperm parameters: correlation with acrosin profiles and fertilizing capacity. Fertil Steril. 1994;62:376–86.
Romano R, Santucci R, Marrone V, Gabriele AR, Necozione S, Valenti M, et al. A prospective analysis of the accuracy of the TEST-yolk buffer enhanced hamster egg penetration test and acrosin activity in discriminating fertile from infertile males. Hum Reprod. 1998;13:2115–21.
Yie S, Baillie J, Younglai EV. Acrosin activity in pelleted frozen sperm does not correlate with in vitro fertilization of oocytes. Andrologia. 1996;28:349–52.
Sharma R, Hogg J, Bromham DR. Is spermatozoan acrosin a predictor of fertilization and embryo quality in the human? Fertil Steril. 1993;60:881–7.
Menkveld R, Rhemrev JP, Franken DR, Vermeiden JP, Kruger TF. Acrosomal morphology as a novel criterion for male fertility diagnosis: relation with acrosin activity, morphology (strict criteria), and fertilization in vitro. Fertil Steril. 1996;65:637–44.
Langlois MR, Oorlynck L, Vandekerckhove F, Criel A, Bernard D, Blaton V. Discrepancy between sperm acrosin activity and sperm morphology: significance for fertilization in vitro. Clin Chim Acta. 2005;351:121–9.
Bartoov B, Reichart M, Eltes F, Lederman H, Kedem P. Relation of human sperm acrosin activity and fertilization in vitro. Andrologia. 1994;26:9–15.
Aghajanpour S, Ghaedi K, Salamian A, Deemeh MR, Tavalaee M, Moshtaghian J, et al. Quantitative expression of phospholipase C zeta, as an index to assess fertilization potential of a semen sample. Hum Reprod. 2011;26:2950–6.
Henkel R, Müller C, Miska W, Gips H, Schill WB. Determination of the acrosome reaction in human spermatozoa is predictive of fertilization in vitro. Hum Reprod. 1993;8:2128–32.
Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155:529–36.
Green S, Higgins JPT, Alderson P, Clarke M, Mulrow CD, et al. Cochrane handbook for systematic reviews of interventions. 5.0.2 ed. Hoboken: John Wiley & Sons, Ltd under “The Cochrane Book Series” Imprint; 2009.
He Y, Liu H, Zheng H, Li L, Fu X, Liu J. Effect of early cumulus cells removal and early rescue ICSI on pregnancy outcomes in high-risk patients of fertilization failure. Gynecol Endocrinol. 2018;15:1–15.
Bastiaan HS, Windt ML, Menkveld R, Kruger TF, Oehninger S, Franken DR. Relationship between zona pellucida-induced acrosome reaction, sperm morphology, sperm-zona pellucida binding and in vitro fertilization. Fertil Steril. 2003;79:49–55.
Rufas O, Gilman A, Fisch B, Shalgi R. Spontaneous and follicular fluid-induced acrosome reaction in sperm samples from in vitro fertilizing and nonfertilizing normozoospermic patients. J Assist Reprod Genet. 1998;15:84–9.
World Health Organization, Department of Reproductive Health and Research. WHO Laboratory Manual for the Examination and Processing of Human Semen. 5th ed. Geneva: WHO Press; 2010.
Liu DY, Du PY, Nayudu PL, Johnston WI, Baker HW. The use of in vitro fertilization to evaluate putative tests of human sperm function. Fertil Steril. 1988;49:272–7.
Acosta AA, Oehninger S, Morshedi M, Swanson RJ, Scott R, Irianni F. Assisted reproduction in the diagnosis and treatment of the male factor. Obstet Gynecol Surv. 1989;44:1–18.
Oehninger S, Franken DR, Sayed E, Barroso G, Kolm P. Sperm function assays and their predictive value for fertilization outcome in IVF therapy: a meta-analysis. Hum Reprod Update. 2000;6:160–8.
Gómez-Torres MJ, Medrano L, Romero A, Fernández-Colom PJ, Aizpurúa J. Effectiveness of human spermatozoa biomarkers as indicators of structural damage during cryopreservation. Cryobiology. 2017;78:90–4.
Thijssen A, Klerkx E, Huyser C, Bosmans E, Campo R, Ombelet W. Influence of temperature and sperm preparation on the quality of spermatozoa. Reprod BioMed Online. 2014;28:436–42.
Huang B, Li Z, Zhu L, Hu D, Liu Q, Zhu G, et al. Progesterone elevation on the day of HCG administration may affect rescue ICSI. Reprod BioMed Online. 2014;29:88–93.
Li M, Wang H, Li W, Shi J. Effect of normal sperm morphology rate (NSMR) on clinical outcomes for rescue-ICSI (R-ICSI) patients. Gynecol Endocrinol. 2017;33:458–61.
We thank WebShop for its linguistic assistance with an earlier version of the manuscript.
This work was supported by the National Natural Science Foundation of China (No. 31472054) and the National Research Foundation for the Doctor Program of Higher Education of China (No. 20120162110058).
Availability of data and materials
The data and materials are available from the corresponding author on reasonable request.
Sperm acrosome function scoring is positively correlated with fertilization rate and predicts fertilization outcome with moderate accuracy and specificity.
Ethics approval and consent to participate
The study was approved by the ethics committee of the Reproductive and Genetic Hospital of CITIC-Xiangya. Informed consent was waived because of the retrospective nature of the study.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.