The Pacific whiteleg shrimp, Litopenaeus vannamei, is the most farmed aquaculture species worldwide with global production exceeding 3 million tonnes annually. Litopenaeus vannamei has been the focus of many selective breeding programs aiming to improve growth and disease resistance. However, these have been based primarily on phenotypic measurements and omit potential gains by integrating genetic selection into existing breeding programs. Such integration of genetic information has been hindered by the limited available genomic resources, background genetic parameters and knowledge on the genetic architecture of commercial traits for L. vannamei. This study describes the development of a comprehensive set of genomic gene-based resources including the identification and validation of 234,452 putative single nucleotide polymorphisms in-silico, of which 8,967 high value SNPs were incorporated into a commercially available Illumina Infinium ShrimpLD-24 v1.0 genotyping array. A framework genetic linkage map was constructed and combined with locus ordering by disequilibrium methodology to generate an integrated genetic map containing 4,817 SNPs, which spanned a total of 4552.5 cM and covered an estimated 98.12% of the genome. These gene-based genomic resources will not only be valuable for identifying regions underlying important L. vannamei traits, but also as a foundational resource in comparative and genome assembly activities.
Breeding programs for animal production species have traditionally been developed around phenotypic selection in conjunction with quantitative genetic theory. As with other realms of biology, animal production science is currently in the midst of a genomics revolution and there has been an increasing global focus on the development of genomic resources and subsequent identification of markers linked to genes of economic importance. Although still in its infancy as a production industry, aquaculture is perfectly situated to uptake recent advances in quantitative genetics and to integrate new genomic technologies into future breeding program designs1.
The globally important whiteleg shrimp or Pacific white shrimp, Litopenaeus vannamei, is an aquaculture species that would benefit substantially from the integration of genomic information into traditional breeding programs, particularly for disease resistance and other difficult to measure or low heritability traits. Unfortunately, even though several genetic linkage maps have been produced2, 3, comprehensive genomic information available for L. vannamei is still very limited and there is currently a poor understanding of fine scale genome structure and the genetic basis underlying complex commercially important traits. For example, current breeding programs for L. vannamei use traditional phenotypic selection to produce shrimp with improved growth and resistance to various viral pathogens like Taura syndrome virus (TSV)4, 5. While this traditional approach has been moderately successful in producing more productive shrimp strains, genetic progress using multi-trait phenotypic selection in L. vannamei has been significantly impeded by unfavourable genetic correlations between growth and disease resistance4, 6, as well as a poor correlated response in susceptibility to multiple diseases7–9. In light of these unfavourable genetic correlations between traits of interest in L. vannamei (i.e. growth and disease), breeding strategies would benefit from the integration of genetic markers tightly associated with trait variation (i.e. quantitative trait loci - QTL). The development of single nucleotide polymorphism (SNP) marker panels with the power to simultaneously identify genome-wide QTLs for complex and/or correlated traits would assist shrimp breeding strategies, as it would allow for the improved identification of selection candidates possessing advantageous genes. This would negate the current requirement for multiple selection lines and allow selection decisions for traits to be made directly on candidates, thereby increasing the accuracy of selection and resultant genetic gains. Despite recent increased research effort into L. vannamei genomics which have yielded SNP markers and moderate density linkage maps2, 3, 10–14, limited gene-based (Type-I) genomic resources are publically available, and production trait architecture and localisations are based on low density maps containing either AFLPs or microsatellites12, 14, 15. Therefore, there is still a need to develop comprehensive genome-wide SNP marker panels and dense genomic maps that allow the simultaneous detection of genome-wide QTLs for commercially important traits.
SNPs derived within expressed sequence tags (ESTs) which originate from gene coding and 3′-UTR regions are considered a valuable resource useful in linking genotype to phenotype. Anchoring EST-derived SNPs and their associated transcript sequences to a high density genomic map not only allows insights into genome structure and marker spacing across the genome, but also helps identify the biochemical pathways underlying traits of interest. This study aimed to extend on the current genomic resources available for L. vannamei by developing a gene-based commercial SNP array and genomic linkage map, demonstrate the placement of additional markers using novel locus ordering by linkage disequilibrium (LODE) methods16, 17, and describe the genome synteny between two commercially valuable penaeid species. The gene-based Type-I SNP marker panel and comparative genomic maps will be valuable resources for investigating genome-wide genetic trait associations, creating optimal marker sets for selective breeding and genomic prediction, understanding functional biology and genome evolution, and assisting in genome assembly.
Sequencing, assembly and annotation
To enable the identification and development of genome-wide Type I SNPs, total RNA was extracted from the tail muscle tissue of 30 L. vannamei individuals representing prominent domesticated industry lines (Global Gen, Indonesia), using TRIZOL® Reagent (Life Technologies). RNA from each individual were pooled together in equimolar amounts before being converted to double stranded cDNA using the Mint cDNA synthesis kit (Evrogen), and normalised using the Trimmer cDNA normalisation kit (Evrogen). Normalised cDNA was then sequenced using an Illumina GA-IIX genome analyser, which produced approximately 25 gigabases of 76 bp paired-end EST sequence data (~10× genome coverage). Illumina sequence adaptors and primers were screened and removed using the software Seqclean (https://sourceforge.net/projects/seqclean/). MOTHUR was used to remove sequences with an average quality score (Phred score) less than 15 (window size = 10 bp) and/or shorter than 50 bp in length18. The cleaned sequence data was assembled using Velvet V1.019 and OASES20. Assembly parameters consisted of no extra gap penalty with all other options at default or recommended settings. Transcript assemblies were conducted at kmer lengths of k39, k41, k43, k45, k47, k49, k51 and k53 before being clustered together at a 90% sequence identify threshold using the software CD-HIT21. Where multiple transcript sequences were identified, only the longest sequence was retained. Transcript redundancy removal was undertaken, since it is a requirement for SNP discovery.
Sequence annotation of Gene Ontology terms
Annotation of the assembled sequence database was achieved using a Blastx search algorithm22 and the NCBI non-redundant protein database conducted through the software package Blast2GO23. Where multiple annotations were returned, the one with the best bit score was retained. For each successfully annotated contig, gene ontology (GO) terms InterPro scan results were retrieved using Blast2GO.
SNP discovery and filtering
To ensure high-quality SNPs were produced, strict data integrity measures were implemented. Genome-wide SNPs were identified using stringent SNP discovery filtering within the software package SAMTOOLs24 and custom scripts. NOVACRAFT (Novocraft Technologies, Selangor, Malaysia) was used to align the cleaned sequence reads to the full sequence assembly. The SAMTOOLs pileup command was used to produce mapping qualities. The varFilter option in SAMTOOLs was employed to filter SNPs, keeping only the most informative [i.e. minimum minor allele frequency (MAF) of 0.25, a minimum read depth of 10 reads, a minimum of two minor allele reads, a minimum SNP mapping quality of 25, a minimum flanking sequence quality of 25]. Any SNP identified within 50 bp of a candidate SNP was excluded to ensure a conservative flanking region for probe design. In addition, multi-allelic SNPs and SNPs requiring type I Illumina Infinium Probes (A/T or C/G) were removed and sequence repeat elements were masked. The resultant SNPs with the highest MAF and read depth were prioritised and submitted for assay development analysis using Illumina’s Assay Design Tool (ADT). Any SNP that returned an ADT score of less than 0.7 was excluded from the array. To ensure no unintentional duplicate SNPs were included on the array, probes for each SNP were mapped to the initial assembly using NOVOCRAFT (Novocraft Technologies, Selangor, Malaysia) and only the probes that mapped uniquely were included in the array. Following this procedure, 8,967 SNPs (8,616 novel SNPs with the highest ADT score and 351 from the public domain including those mapped in Du et al.3 and Ciobanu et al.11) were incorporated into the Illumina Infinium ShrimpLD-24 v1.0 SNP genotyping array (Table 1 and Supplementary Table S1).
The number of SNPs retained throughout subsequent filtering and data integrity during design of the custom L. vannamei 10 k iSelect beadchip.
Infinium array genotyping
To validate the performance of the L. vannamei Illumina Infinium ShrimpLD-24 v1.0 beadchip, 2,004 samples were genotyped, including 1,134 female and 193 male broodstock that produced families, along with 677 nauplii DNA pools (pools of >300 nauplii larvae from an individual family). For some nauplii pools, one of the two parents was either unknown or not sampled. Consequently for these families, the full unknown parental genotype was reconstructed using methods described in Supplementary Methods and Peiris, et al.25. All families were raised indoors in a Nucleus Breeding Centre under biosecure conditions from founding individuals representing most of the prominent industry domesticated/selected lines. To ensure all genotypes calls were genuine and to identify aberrant SNPs and DNA samples, strict genotypic data integrity was undertaken in GenomeStudio V2011.1 following methods outlined in Jones, et al.26. Family groups were reconstructed using SNP genotypic data (as described below) to enable the assessment of Mendelian inheritance (MI) of alleles. Genotype reproducibility between batches and across arrays was tested using 52 replicate samples and 26 replicate SNPs.
Genomic DNA was extracted either from the 2,004 L. vannamei samples or pools using a modified CTAB protocol27. DNA was standardised to 50 ng/µl using PicoGreen dsDNA quantification (Invitrogen), while DNA quality was inspected by agarose gel electrophoresis. All array genotyping was undertaken at PathWest Medical Laboratories, Perth, Western Australia, following manufacturer instructions28. Genotypes were calculated within the genotyping module of Genome Studio V2011.1 (Illumina Inc.) using the GenTrain genotype clustering algorithm. A minimum GenCall (GC) score cut-off (quality metric for each genotype) of 0.15 was used in SNP genotype clustering. The proportion of loci that produced a genotype for a sample is the sample genotyping rate. The SNP conversion rate is defined as the number of SNPs that produced a genotype divided by the number of total SNPs included. SNP validation rates were calculated as the number of SNPs with a heterozygous call divided by the number of SNPs that produced a genotype. SNPs with a minor allele frequency of greater than 0.01 were considered polymorphic. Mendelian errors for each SNP were reported as in Mendelian agreement whereby;
All GenCall scores are reported as the 10th percentile of the GC scores (GC10 scores). All SNPs were investigated for conformation to expected Hardy-Weinberg Equilibrium (HWE) and Mendelian Inheritance (MI) patterns. All recorded pedigree information was validated in a number of subsequent iterations using the 1,800 highly reliable “first class” SNPs produced from the array and the parentage programs Cervus 3.029, 30 and COLONY31. Briefly, all individually genotyped females and male family relationships were confirmed using this integrated approach, whereby all maternal assignments were verified in COLONY (1,121), before being used to verify paternal assignments (750). Then using all validated parental relationships, COLONY was used to cluster pedigrees as an extra level of validation and to estimate unknown parents by inferring genotypes (N = 30). Any disagreements or pedigree alterations were resolved.
Linkage mapping families and map construction
After parental relationship validation and genotype reconstruction, a total of 631 progeny from 30 grandmaternal and 19 grandpaternal traced families were selected for linkage map construction (the number of progeny within a family ranged from 8 to 33; Supplementary Fig. S2 and Supplementary Table S3). The genotypic data of these individuals over all 6,379 high quality SNPs (as described below) was manually phased into hexadecimal encoding using custom scripts and linkage analysis was conducted in Carthagene V1.332, 33. Markers were segregated into linkage groups by the group function at a logarithm of odds (LOD) threshold of 10 and a distance threshold of 30 cM. Linkage groups were defined as groups of at least three markers ordered on a map at a LOD 3 threshold, and having agreement with independent linkage disequilibrium (LD) and LODE mapping assignment (as described below). The remaining 1,447 markers which did not have three markers ordered at a LOD 3 threshold and/or were not confirmed by LD and LODE analysis were designated as orphan markers. The defined linkage groups were subsequently constructed using a hierarchical approach whereby ordering was determined using consecutive thresholds of LOD3, LOD2 and the most likely marker position. For each consecutive threshold, maps were created using the buildfw function, followed by annealing, flips 6 and polish, until the best sex average map was produced. After all linkage groups were ordered, orphan markers were tested again using two-point to determine whether they could be inserted into any ordered linkage groups. In addition, the five distal markers from both ends of each linkage group were compared by two-point to identify if any linkage groups could be merged together. Sex specific maps were also produced by locking in the sex average marker order and re-calculating interval distances based on separate male and female informative recombination events. The Kosambi34 mapping function was used for all centi-Morgan (cM) calculations.
Sex- and family- specific recombination heterogeneity
To investigate sex-specific heterogeneity throughout independent linkage groups, the following goodness of fit heterogeneity test was utilised with one degree of freedom as described in Ott35 and Jones, et al.36;
where, is the joint sex-specific recombination rate and represents the recombination rate when equal male and female recombination fractions are assumed. For each test, a false discovery rate (FDR) correction was applied to correct for multiple comparisons and minimise false positives37.
To detect any differences in sex-specific recombination rates, ratios of female-to-male map distances were calculated (R = Xf/Xm) for each interval and linkage group as well as over the entire map. To ensure any observed sex-specific recombination was truly due to differences between the sexes, and not affected by variation in individual F1 parents, family specific heterogeneity was investigated for each F1 parent independently. LINKMFEX version 2.438 was used to calculate the recombination fraction, number of co-informative meiotic events (N) and the number of recombinations (r) for all mapped locus intervals for the maternal and paternal lines of each family separately. The Zmax score (LOD) was calculated for the mother and father in each family, and combined across all mothers and fathers respectively using methods outlined in Ott35. The following M-test was employed to investigate individual F1 recombination heterogeneity within each mapping family Ott35.
Here, represents the LOD scores maximum likelihood estimation (MLE) for the ith F1 reference family for a pair of markers, with being the total LOD score MLE of all ith reference families.
Segregation distortion was investigated to determine if there was any evidence of deviations from expected Mendelian Inheritance (MI) patterns. This was investigated using log-likelihood ratio tests for goodness of fit to Mendelian expectations on manually phased genotypic data across all markers from all dams and sires as described in Sokal and Rohlf 39 and Jones, et al.36.
The extent of linkage disequilibrium and integration of LODE-placed markers
Locus ordering by disequilibrium (LODE) is a novel methodology that allows the utilisation of additional linkage disequilibrium data to place unpositioned or orphan SNPs within genetic maps or scaffolds16, 17, 40, 41. The LODE procedure used in this study is an adaptation of the two step procedure described in Khatkar, et al.16. Firstly, SNPs are assigned to a chromosome or linkage group, then subsequently its position within this linkage group is estimated. Both of these steps rely on pair-wise estimates of linkage disequilibrium (LD). LD estimates (r2 and D′) were computed among 6,379 SNPs and 1,963 individuals (631 individuals from mapping families, and 1,332 individuals representing prominent industry lines) using GOLD software42. The extent of LD among SNPs, within and across the linkage groups, was estimated using position of SNPs on the current linkage map. Placements of orphan SNPs using the LODE method were defined based on at least three pairs on a chromosome with r-squared 0.1 or more, but also looking at the maximum LD score.
Genome coverage of the integrated linkage and LODE sex-average map was calculated using observed and expected genome lengths. The observed genome length (Goa) was calculated by adding the observed linkage group lengths. The expected genome length (Ge) was produced by multiplying the length (cM) of each linkage group by (m + 1)/(m − 1), where m is the number of loci in each linkage group43. The total expected genome length was then the sum of Ge from all linkage groups. Genome coverage (Coa), was calculated by dividing Goa by Ge44.
Comparative genome analysis
Syntenic relationships were explored for the integrated linkage and LODE map against three previously published maps, a L. vannamei SNP linkage map with 6,359 markers2, a L. vannamei SNP linkage map with 418 SNP markers3, and a Penaeus monodon linkage map with 3,959 SNPs45. Assignment of orthologous sequences were undertaken by reciprocal BLAST searches of contigs sequences from which SNPs were discovered in the present study against respective sequence databases available for the maps of Yu, et al.2, Du, et al.3 and Baranski, et al.45 (at an e-value threshold of >1e-5). Comparisons to Yu, et al.2 were undertaken using their contiguous sequences generated from their genome survey sequencing, bacterial artificial chromosomes (BACs) and marker sequences, whereas comparisons to Baranski, et al.45 were undertaken with the contig sequences associated with their mapped SNPs. The primary hit was retained in each case. In addition to sequence similarity search of the marker sequences published in Du, et al.3, 159 SNPs from a previously published low density SNP map3 were included on our genotyping array to allow the direct comparison of their linkage map to ours. Comparison of genome synteny in this case was undertaken by matching marker IDs of all SNPs from this current study that were directly genotyped and mapped with our integrated map. BLAST annotations to Daphnia pulex and Drosophila melanogaster for SNPs with common IDs were also carried across from Du, et al.3. Oxford Grids46 of the integrated map presented here versus Yu, et al.2, Du, et al.3 and Baranski, et al.45 were plotted using custom R scripts to confirm mapping position and illustrate genome synteny. An example linkage group (LG4) was drawn using ArkMap47 to illustrate genome conservation.
The assembled contig sequences and mapped raw reads generated within the current study have been submitted to GenBank as a SRA database (Accession number: SRP094129). All SNPs included on the Illumina Infinium ShrimpLD-24 v1.0 array have been submitted to dbSNP on NCBI [Accession numbers: ss2137297825–ss2137306471 from the current study; rs159816077–rs159831399 mapped in Du et al.3; and rs142459135–rs142459627 developed in Ciobanu et al.11]. The Illumina Infinium ShrimpLD-24 v1.0 array is available from https://www.illumina.com/products/by-type/microarray-kits/infinium-shrimp-ld.html. All remaining data used and/or analysed during the current study are available from the corresponding author on reasonable request.
Sequencing and assembly of transcripts
In total, over 25 Gb of sequence data (329 million raw EST sequences, 76 bp paired-end, ~10× genome coverage) was produced from an Illumina GA-IIx run. After clean-up and trimming, 19.7 Gb of sequence data was retained (average Phred score of 25.9). Assembly of the cleaned-up sequence data (including transcript redundancy removal) produced 76,963 contigs. The N50 of the assembly was 2,375 bp, the average contig length was 1,429 bp and median contig length was 955. Over 72% of the 76,963 contigs had a read depth coverage of greater than 50 reads (average read depth over all contigs was 2527.5 reads). The assembled contig sequences and mapped raw reads have been submitted to GenBank as a SRA database (Accession number: SRP094129). This is a significant genomic resource enabling the sequence data mining of 27,477 specific genes (see below) and in-silico detection of over 234,452 SNPs and 133,960 indels (Table 1).
Sequence annotation and gene ontology terms
Blastx searches against NCBI’s non-redundant protein database produced 30,317 hits from the 76,963 contigs. Of these sequences, 27,477 (24.7%) also had GO categories assigned, from which these genes were categorised into biological processes (21,333), molecular function (22,142) and cellular components (19,155) (Fig. 1 and Supplementary Table S4). Within the listed biological processes, most genes were involved in cellular and metabolic processes (32.7%). The most common molecular function designations were binding (43.6%) and catalytic activity (38.9%). Finally, cell (20.1%), cell part (20.0%) and organelle (15.2%) formed the most common GO terms within cellular component designations. A total of 12,957 unique gene hits were identified including Myosin and Myostatin/Growth Differentiation Factor-11, which are involved in muscle cell growth48, 49, as well as genes involved in immune response pathways such as apoptosis, MAPK signalling, toll-like receptor and antigen processing and presentation50.
Proportions of Gene Ontology (GO) annotations of the assembled 454 mantle tissue transcripts from Litopenaeus vannamei.
SNP discovery and development of commercial array
In-silico SNP discovery and filtering
From the assembled sequence dataset, 234,452 putative SNPs and 133,960 indels were identified in-silico before strict filtering parameters were applied. By filtering out all SNPs with a read depth less than 10 reads and a minor allele frequency (MAF) of less than 0.25, a total of 26,662 high-quality SNPs were identified. A further 2,445 multi-allelic SNPs, 4,565 SNPs requiring Type I Illumina Infinium probes and 1,054 highly repetitive SNP probes were removed before ADT analysis. Illumina’s ADT analysis calculates the effectiveness of the SNP probes on the array. A total of 1,142 SNPs did not return ADT values > 0.7 and 1,006 SNPs did not map to unique contigs and were removed. A further 7,003 SNPs were excluded due to being located within the flanking region of another SNP resulting in a final list of 9,447 SNPs. Of these, 8,967 SNPs (8,616 novel SNPs with the highest ADT score and 351 from the public domain including those developed in Ciobanu, et al.11 and mapped in Du, et al.3) were incorporated into the Illumina Infinium ShrimpLD-24 v1.0 SNP genotyping array enabling high throughput, cost effective and accurate genotyping (Table 1 and Supplementary Table S1). The average MAF and ADT score of these high-value SNPs was 37% and 0.95 respectively. All SNPs included on the Illumina Infinium ShrimpLD-24 v1.0 array have been submitted to dbSNP on NCBI (Accession numbers: ss2137297825 - ss2137306471 from the current study; rs159816077 - rs159831399 mapped in Du, et al.3; and rs142459135 - rs142459627 developed in Ciobanu, et al.11). The Illumina Infinium ShrimpLD-24 v1.0 array is available from https://www.illumina.com/products/by-type/microarray-kits/infinium-shrimp-ld.html.
Infinium array genotyping and validation
In total, 2,004 shrimp samples were genotyped, including 1,134 female and 193 male parents of families, along with 677 nauplii pools. From these samples, 70 individuals produced call rates of less than 90% and were subsequently removed from further analysis leaving 1,257 unique individuals to investigate SNP array performance. Analysis of the resulting genotypic data revealed that 6.01% of the SNPs did not amplify successfully (probe did not bind to the DNA) and 13.04% of the SNPs returned ambiguous clusters. From the resulting 7,259 SNPs, the SNP conversion rate was calculated to be 80.95%. Within the converted SNPs, 318 SNPs did not return heterozygous genotype calls and therefore were considered monomorphic. After the removal of the monomorphic SNPs, 6,941 remained resulting in a SNP validation rate of 95.62%. To estimate the proportion of informative or polymorphic SNPs, within this experimental population, a further 562 SNPs with deviations from HWE and MI errors were excluded, resulting in 6,379 SNPs (87.88%) with minimal errors (Table 2). Further stringent data integrity (i.e. excluding SNPs with a MAF < 0.01, SNP duplication, or low call rates) resulted in the exclusion of an additional 323 SNPs (Table 2). From the final dataset of 6,056 high quality SNPs, the SNP call rate was extremely high (98.92%) and the Mendelian inheritance concordance exceeded 99.9%. The average minor allele frequency of these high-value SNPs was 0.37. Summary statistics for all SNPs included on the array are included in Supplementary Table S1. A total of 52 replicate samples were included to evaluate final array genotyping performance. No major deviations between replicate samples were observed, resulting in sample concordance exceeding 99.9%. This provides strong support for highly reliable genotypic data across all validated SNPs.
SNP array performance indicating the number of SNPs retained over subsequent filtering procedures.
Linkage map construction and LODE integration
A total of 708,209 phase known informative meiotic events were utilised to place and order SNPs across linkage groups. The average number of informative events per mapped locus was 147.02 (ranging from 4 to 444) compared to an average of 28.30 informative meiotic events for unmapped markers. A total of 4,370 SNPs were successfully mapped to their most likely position within the 44 linkage groups, which spans a total of 4,552.5 cM of the estimated sex-average 4,619.3 cM genome length, covering 98.12% of the L. vannamei genome. By utilising this linkage map in LODE analysis, an additional 447 markers were placed with high confidence. This integrated map (Build 1.2) contained a total of 4,817 SNPs which reduced the average marker interval across the genome to 0.97 cM, or 2.67 when all intervals of 0 cM were excluded (Fig. 2 and Table 3 and Supplementary Table S5). Linkage groups were ordered based on their total cM length.
Distribution of SNPs placed using linkage and LODE mapping methods across the 44 linkage groups.
Descriptive statistics for the Litopenaeus vannamei integrated linkage and LODE map (build 1.2).
Sex-specific and family-specific recombination heterogeneity
Sex-specific maps were also produced using the sex-average marker order to recalculate marker intervals based on 447,640 phase-known informative meiotic events for the female map and 260,569 phased known informative meiotic events for the male map. The total lengths of the female and male maps were 4,530.60 cM and 4,522.30 cM respectively (Table 3). Minimal differences in sex-specific recombination were observed throughout the linkage groups (Supplementary Fig. S6). However, LG23 and LG44 displayed slightly larger male maps and LG6, LG9, LG12, LG13 and LG24 displayed slightly larger female maps. Overall, the average female-to-male ratio was 1.02:1. The sex-specific log10 likelihood for each linkage group, averaged between the sexes ranged from −744.740 to −190.780 (average −516.333) and the total sex-specific log10 likelihood was −22,718.645. Cumulative cM distances of the sex average, female and male maps indicate that there is no pronounced pattern of sex-specific recombination throughout the 44 linkage groups (Supplementary Fig. S6).
Investigations into family specific heterogeneity also indicate minimal significant deviations throughout all maps after false discovery rate (FDR) corrections. Only one interval on the female map (66909_123–44494_691) and six intervals on the sex average map (75525_46–34150_668, 61146_423–57343_1404, 804_148–23736_451, 23736_451–24606_1440, 66909_123–44494_691 and 7007_119–48446_2196) returned significant recombination heterogeneity after FDR (χ2 = 11.809–31.301, P = 0.00011–0.00059, df = 1–9).
A total of 540 significant segregation distortions were detected across all markers and families following FDR correction (Supplementary Table S7). The majority of these distortions (82.3%) were within families F1014, M1014 and F1002. As no significant family specific heterogeneity was detected for these distortions, they show no evidence of influencing calculations on mapping distances.
Extent of linkage disequilibrium
Linkage disequilibrium estimates declined gradually with increasing map intervals both throughout the genome and for each linkage group (Fig. 3 and Supplementary Table S8). This is accentuated by the relatively high mean r2 values for SNP pairs less than 5 cM. Between the 4,817 adjacent SNP pairs, the mean r2 estimates were 0.184 (with a median of 0.096).
Mean linkage disequilibrium (LD) estimates at different linkage map distances throughout the P. maxima genome for r2 and D′.
Comparative genome analysis
Sequence similarity Blastn searches of assembled contig sequences to mapping sequence data published in Yu, et al.2 and Baranski, et al.45 returned 5,591 and 2,499 hits respectively. Of these hits, a total of 67 SNPs for Yu, et al.2 and 275 SNPs for Baranski, et al.45 were in common between our integrated and previously published maps (Supplementary Table S9). The small number of blast hit results to Yu, et al.2, is most likely due to the large number of contigs from which the SNPs originated in both studies (i.e. 7.4 million contigs from Yu, et al.2 and 76,963 from our transcriptome assembly), as well as the relatively low number of mapped SNPs (6,147 for Yu, et al.2 and 4,817 from our integrated map). Between these two studies, SNPs were also developed using different methodologies. Yu, et al.2 utilised genomic DNA sequencing and specific-locus amplified fragment sequencing (SLAF-seq), whereby the SNP discovery method utilised in the current study involved transcriptome sequencing, SNP identification and design which was optimal for developing Illumina custom array probes.
Of the 159 SNPs developed in Du, et al.3 and included on our genotyping array, 83 were assigned a map position in the integrated map (Supplementary Table S9). In addition, sequence similarity searches of our mapped contig sequences (from which the mapped SNPs were designed) to the marker sequences from Du, et al.3 returned 38 hits (evalue < 0.01, similarity >95%). Of these 38 marker sequence and contig matching pairs, 30 of the respective contigs were mapped adjacent to its pair confirming its placement and blast hit (average distance = 1.08 ± 2.1 cM).
The identified homologous sequences were used to identify homologous linkage groups across the independent maps (Figs 4, 5 and 6 and Supplementary Table S9). Comparing our integrated map to the previously published L. vannamei SNP map from Du, et al.3, 29 linkage groups were able to be matched. Marker order was highly conserved (Figs 4 and 7), although, six SNPs were placed on alternative linkage groups. Within this comparison, our integrated map indicates that LG11 and LG34 from Du, et al.3 may be able to be merged due to common SNPs on LG14 in the present map. In addition, the map produced in this study indicates that LG1 of Du, et al.3 may be a concatenation of two separate linkage groups. A total of 35 linkage groups from Yu, et al.2 were able to be matched to the linkage groups within the present integrated map (Fig. 5). Within the 67 common SNPs between these two maps, only two SNPs were placed on alternative linkage groups. However, homologous sequence matching indicates that LG7 and LG17, as well as LG26 and LG29 may be merged due to being mapped to LG35 and LG39 respectively in the present map. Comparisons to Baranski, et al.45 revealed 44 linkage groups matched by homologous sequences (Fig. 6). Marker synteny was highly conserved with only three markers being assigned to alternatively matched linkage groups (Figs 6 and 7). Both LG39 and LG43 from Baranski, et al.45 were matched to SNPs placed within LG2 reported here, which were placed with high significance (two-point LODs ranging from 10.8–39.6), indicating a potential merger.
Homologous linkage map relationships between the integrated linkage and LODE L. vannamei map and the L. vannamei linkage map produced in Du, et al.3. Each dot represents a homologous locus proportional to cM lengths (Kosambi).
Homologous linkage map relationships between the integrated linkage and LODE L. vannamei map and the L. vannamei linkage map produced in Yu, et al.2. Each dot represents a homologous locus proportional to cM lengths (Kosambi). Data from Yu, et al.2 was...
Homologous linkage map relationships between the integrated linkage and LODE L. vannamei map and the P. monodon linkage map produced in Baranski, et al.45. Each dot represents a homologous locus proportional to cM lengths (Kosambi). Data from Baranski,...
Demonstration of synteny analysis between LG4 of the integrated map, LG1 from Du, et al.3 and LG20 from Baranski, et al.45. Only matched markers are listed. Data from Baranski, et al.45 was utilised under a Creative Commons Attribution 4.0 International...
The rapid advancement of genetic and genomic technologies has made the generation of genome-wide genomic resources available for many non-model species. Genome-wide markers with known gene function and position are highly useful in comparative mapping studies, genome-wide association studies, and linking gene function to traits of interest. By using the high-throughput sequencing approach, this study provides over 234,452 putative in-silico SNPs and 26,662 filtered high-quality SNPs (MAF ≥ 0.25, read depth ≥ 10). A total of 8,967 high-utility SNPs were incorporated into a commercial array allowing cost-effective routine genotyping of validated content. In addition, 4,817 of these SNPs were placed within a moderate density integrated linkage and LODE map allowing insights into the genome structure of L. vannamei and comparisons to previously published genome maps in penaeids. These resources vastly improve the publically available genomic resources available for this important commercial species and have high-utility in studies aiming to identify genomic regions linked to traits of interest. Furthermore, the current integrated genetic map will help with forthcoming L. vannamei genome sequence assemblies, by providing robust gene-associated reference maps to anchor and orientate sequence data.
Highly reliable genotypes are integral to ensure the generation of integrated genetic maps and genome association studies are accurate. The success of Type I SNP assay development can be attributed to the quality of the EST sequence data used, sequence depth, in-silico MAF cutoff and SNP flanking region composition51, 52. By utilising a SNP mining approach within 25 Gb of assembled transcriptome-wide sequence data (~10× genome coverage), and applying strict SNP discovery filtering parameters (i.e. MAF of 0.25, read depth > 10, a minimum of two minor allele reads, a minimum flanking sequence quality of 25, and no observed variation in the flanking probe design region), we ensured that the SNPs reported within this study are of high utility and are dispersed throughout the genome. In addition, since all SNPs reported here were designed within large expressed contigs (N50 of 2,375 and average contig length of 1,429 bp) which are generally well conserved, they are not only known to be associated with functional genes, but will also be highly useful in ongoing comparative genomic analyses and genome sequence assemblies.
Standard measures of quantifying the success of genotyping arrays are the conversion (the proportion of SNPs producing genotypes) and validation (the proportion of SNPs that are polymorphic in a population) rates. Conversion and validation rates observed within this study (i.e. 80.95% and 95.62% respectively) were relatively high compared to Illumina genotyping arrays designed on other aquaculture species. Previous conversion and validation rate of Illumina panels from aquaculture species such as the silver-lipped pearl oyster (Pinctada maxima), the pacific oyster (Crassostrea gigas), European flat oyster (Ostrea edulis), rainbow trout (Oncorhynchus mykiss), Atlantic salmon (Salmo salar) and catfish (Ictalurus punctatus) range from 70.3–92.0% and 48.0–59.9% respectively26, 53–57. In comparison, the current L. vannamei array is very similar to the well-established Illumina livestock arrays such as the chicken, goat, bovine, porcine and the domestic horse which have conversation and validation rate that range from 88–97.5% and 78–99.1% respectively58–62. The current array was validated on a large number of samples distributed throughout the most prominent industry domesticated lines. This ensures that a high proportion of SNPs included on the commercial array will be polymorphic within the majority of farmed and wild populations of L. vannamei worldwide.
With the advent of genotype by sequencing (GBS) approaches for generating genotypic data, there has been a move away from solid state SNP genotyping arrays63. However, there are significant benefits of using solid state SNP arrays over GBS. For example, the laboratory techniques and procedures required to undertake genotyping as well as integral downstream bioinformatics analysis are much simpler and require less technical knowledge. As a result, genotyping arrays usually have a much quicker turnaround and are much more robust and less prone to errors64, 65. In addition, SNP genotyping arrays can be automated leading to higher reproducibility. Per sample, genotyping arrays are generally more expensive than GBS approaches, however, as long as the SNP arrays were designed with loci that are polymorphic within the populations for its intended use, there are many benefits that can be yielded over any GBS approach.
Assigning gene annotation and ontology terms to SNPs provide valuable insights into the functional biology and trait architecture particularly when coupled with location information within the genome. A total of 27,477 of the 76,963 contigs utilised in SNP discovery were annotated with one or more gene ontology terms. The proportion of annotated contigs reflects the still limited amount of annotated sequence data for decapods in the public domain. The major GO terms returned including cellular, metabolic and single-organism process in Biological Process; binding and catalytic activity in Molecular Function; and cell, organelle, membrane and macromolecular complexes in Cellular Components were all reflective of the proportions of GO terms returned within previous studies within penaeid transcriptome studies10, 66. In addition, the GO distribution patterns of protein coding genes and therefore gene compositions between the two shrimps was reported to be similar66.
Out of the 6,379 SNPs deemed suitable for linkage mapping, 4,370 were successfully placed via linkage analysis. An additional 447 SNPs were placed when integrating LODE methodologies resulting in a total of 4,817 SNPs mapped. The power of placing SNPs on a linkage map comes from the number of informative meiosis events within the reference mapping families. The average number of informative meiotic event across all mapped SNPs was 147.0, compared to 28.3, for all unmapped SNPs. This significant drop in the number of informative events is a major contributor to the ability to firstly assign a SNP to a linkage group, and finally order the markers unambiguously. In addition to the placement of markers, the number of informative meiotic events adds power to teasing apart the LD blocks or binning of markers placed at the same location on the linkage map. There were a number of clusters of co-localised markers within the map presented in this study (i.e. there were 4,817 markers places within 1,752 unique locations, indicating that on average, there were 2.75 markers co-localised throughout the genome). An increase in either the number of individuals per family, or the number of families should result in higher power to refine the position of these co-localised markers. Nevertheless, considering there were only minimal evidence of sex-specific recombination, family specific recombination and segregation distortion, the assignment of the SNP markers throughout the linkage mapping procedure is considered to be highly accurate.
To assign additional orphaned markers to the linkage map that were not assigned a position within linkage analysis, a locus ordering by disequilibrium (LODE) mapping procedure was implemented as described in Khatkar, et al.16. LODE methodologies rely on the linkage disequilibrium (LD) information from unrelated samples within a population and do not require specific resource populations or mapping families. A reliable estimate of r2 (statistical measure of LD) can be successfully obtained from a minimum samples size of 75 unrelated individuals40. Within this study, a total of 1,963 individuals (from prominent domesticated industry lines) were included in LODE analysis which allowed the generation of accurate estimates of LD. In doing so, the LODE method was able to harness additional samples and linkage disequilibrium information (outside of linkage mapping families) that was useful to place 447 additional orphan SNPs.
The estimation of LD does have some assumptions and limitations. The precision of determining locations of orphan SNPs by this method depends on the extent of linkage disequilibrium and density of the framework map. Furthermore, the extent of linkage disequilibrium, among other factors, depends on population structure. Therefore the number of LODE placed SNPs within this study may be increased by utilising larger random mating wild populations (to stabilise genetic drift).
Using the observed length of the integrated linkage and LODE map, the expected genome length of L. vannamei was calculated to be 4,619.3 cM (sex average) and the estimated genome coverage is 98.12%. This is comparable to the sex-average linkage map from Yu, et al.2, which incorporated 4,817 loci and had a total genome map length of 4,341.39 cM and genome coverage of 98.39%. Previous linkage maps to this contained fewer loci and as such are less accurate and smaller in size (i.e. 3677.65–4025.5 cM; refs 12–14).
To date, no complete shrimp genomes have been published due to their large genome size (ranging from 2.17 Gb to 2.64 Gb) and high levels of duplication2, 66. In addition, previous comparative mapping in penaeid shrimps has been very limited, although, some comparative mapping and divergence times have been recently conducted in decapod shrimps within the Pancrustacea clade using transcriptome data66. By comparing homologous loci between L. vannamei and P. monodon, this study reports one of the first genomic comparisons for gene order and synteny within penaeids. Homology was investigated in two L. vannamei maps2, 3 and one P. monodon linkage map45.
Within the homologous loci between the L. vannamei linkage map published in Du, et al.3 and the integrated map within this study, the gene order is well conserved. In total, only six out of the 83 homologous loci were assigned to alternative linkage groups). Positional two-point LOD thresholds of five of the six SNPs calculated from the current map ranged from 4.5 to 27.1 (informative meiotic events ranging from 98 to 230) indicating high statistical support for their current assignments. Only one marker, LV1007 had a low two-point LOD placement threshold of 0 resultant from 70 informative meiotic events. With a higher density of markers and number of informative meiotic events in the current integrated map, the placement of the large majority of SNPs is expected to be more accurate. Similarly, only two out of the 67 common SNPs were assigned to alternative linkage groups when comparing the map presented in this study to the L. vannamei map published in Yu, et al.2. These two SNPs had high positional two-point LOD thresholds at 14.4 and 19.3, indicating strong support for their placement in the current map.
The total length of the female and male maps for P. monodon map were 4,060 cM and 2,917 cM respectively. Within P. monodon, the female map was 28% larger than the male map indicating greater recombination frequency in female over males. Even though large differences in recombination rate between sexes have been reported in other penaeids12, 13, 45, 67–69, it was not observed within the L. vannamei map reported in this study (female and male map length of 4,530.6 cM and 4,522.3 cM respectively). Sex-specific recombination is highly variable throughout existing reported penaeid maps, which may come down to the number of markers mapped to the respective sex maps, whereby various numbers of markers mapped for respective sexes could be influencing the recombination rates observed. The incorporation of many more markers, families and offspring (and therefore more total informative meiosis events) within the integrated map is expected to produce a more accurate estimation of sex-specific recombination rate than the existing maps.
P. monodon and L. vannamei share an identical chromosome number of 2n = 4413, 67. A comparison of the 44 LGs from our integrated map to the P. monodon map produced in Baranski, et al.45 returned substantial macrosynteny throughout the 275 homologous loci identified despite their estimated divergence of 95 million years ago66. Only three SNPs were assigned to alternative linkage groups, whereby the two-point LOD SNP placements for the three SNPs placed on different linkage groups ranged from 6.6 to 22.3 (with informative meiotic events ranging from 79 to 226). There is minor evidence of interchromosomal rearrangements and marker shuffling between the species [i.e. within linkage group pairs LG6 (L. vannamei) and LG14 (P. monodon); LG40 (L. vannamei) and LG22 (P. monodon); LG34 (L. vannamei) and LG41 (P. monodon); and LG15 (L. vannamei) and LG44 (P. monodon)], however, comparisons between a higher density of markers is required before inferences of chromosomal rearrangements can be made reliably. Overall, the robustness of the marker orders in all maps is demonstrated by the high correlations between marker orders across most linkage groups calculated from independent analysis. The sporadic marker disagreements may be due to either genotyping errors, or differences in the mapping algorithms used during map construction1.
Penaeid genomics has come a long way in the last 15 years. Many species now have significant genomic resources which will enable more advanced methods of breeding such as marker assisted and genomic selection70. These novel techniques may help increase disease resistance to specific emerging diseases which is a major priority for current shrimp breeding programs. It is predicted from simulated genetic advancement using genomic information in selection programs for survival and disease resistance was up to 2.6 times as effective than that of phenotypic sib-selection alone70. Furthermore, considering vaccination is not an option and management interventions to curve disease are usually unfeasible, several shrimp breeding programs have already been implemented in a number of countries to improve disease resistance as reviewed in Neira71, Rye72, and Castillo-Juárez, et al.70. With the continuing development of genomic resources in penaeids, incorporation of genomic information into breeding programs is a viable option promising to increase the accuracy of selection and therefore response compared to conventional selection70.
Developing a large set of type I genome-wide molecular markers and genomic maps for L. vannamei
Regarding the practical farming of Litopenaeus vannamei, the deterioration of water quality from intensive culture systems and environmental pollution is a common but troublesome problem in the cultivation of this species. The toxicities that result from deteriorating water quality, such as that from ammonia stress, have lethal effects on juvenile shrimp and can increase their susceptibility to pathogens. The toxicity of ammonia plays an important role in the frequently high mortality during the early stage on shrimp farms. However, little information is available regarding the genetic parameters of the ammonia tolerance of juveniles in the early stage, but this information is necessary to understand the potential for the genetic improvement of this trait. Considering the euryhalinity of L. vannamei and the fact that low salinity can increase the toxicity of ammonia stress, we estimated the heritability of ammonia tolerance in juveniles in 30‰ (normal) and 5‰ (low) salinity in this study using the survival time (ST) at individual level and the survival status at the half-lethal time (SS50) at the family level. In the normal and low salinity conditions and for the merged data, the heritability estimates of the ST (0.784±0.070, 0.575±0.068, and 0.517±0.058, respectively) and SS50 (0.402±0.061, 0.216±0.050, and 0.264±0.050, respectively) were all significantly greater than zero, which indicates that the ammonia-tolerance of shrimp can be greatly improved. So it might provide an alternative method to reduce mortality, help to enhance resistance to pathogens and reduce the occurrence of infectious diseases. The significant positive genetic correlation between ST and body length suggested that ammonia is more toxic to shrimp in the early stage. The medium-strength genetic correlations of the ST and SS50 between the two environments (0.394±0.097 and 0.377±0.098, respectively) indicate a strong genotype-by-environment (G×E) interaction for ammonia tolerance between the different salinity levels. Therefore, salinity-specific breeding programs for ammonia tolerance in shrimp should be purposefully implemented.
Citation: Lu X, Luan S, Cao B, Meng X, Sui J, Dai P, et al. (2017) Estimation of genetic parameters and genotype-by-environment interactions related to acute ammonia stress in Pacific white shrimp (Litopenaeus vannamei) juveniles at two different salinity levels. PLoS ONE 12(3): e0173835. https://doi.org/10.1371/journal.pone.0173835
Editor: Peng Xu, Xiamen University, CHINA
Received: November 28, 2016; Accepted: February 27, 2017; Published: March 22, 2017
Copyright: © 2017 Lu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper.
Funding: This work was supported by the Central Public-interest Scientific Institution Basal Research Fund, CAFS (No. 2016HY-ZD04), the National Natural Science Foundation of China (No. 31302179), the Taishan scholar program for seed industry, the Introduction of International Advanced Agricultural Science and Technology Plan (No. 2016-X39), and the China Postdoctoral Science Foundation (No. 2015M572095).
Competing interests: The authors have declared that no competing interests exist.
The Pacific white shrimp Litopenaeus vannamei is an important worldwide aquaculture species [1, 2]. Due to the advantages of euryhalinity, rapid growth, and a high survival rate at high density, L. vannamei is becoming the most popular cultured species in the world and generally uses semi-intensive and intensive culture modes . However, these culture practices often lead to the degradation of the water quality due to uneaten food and the waste products of the animals. Additionally, the vagaries of climates and the serious pollution of aquaculture environments jointly cause the quality of the culture water to decrease. The toxic factors associated with deteriorating water quality exhibit very strong correlations with the high mortality of L. vannamei in farms [4–6].
High concentrations of ammonia are the most common and greatest toxic factor in the deteriorating water quality, which does serious harm to crustaceans, mollusks and fish [4, 7–9]. Much research has been performed to detect the detrimental effects of ammonia to shrimp, and this research has revealed that the accumulation of ammonia in pond water reduces growth, damages the hepatopancreas and gills, increases oxygen consumption, decreases the osmoregulatory capacity, reduces the ability of the hemolymph to transport oxygen, affects the molting frequency, and even causes high mortality in shrimp [10–18]. More important than all of the above, increased ammonia in the water can suppress the immune defense system of shrimp and increase their sensitivity to pathogens [19–21]. We previously performed a comparative transcriptome analysis to understand the molecular mechanisms of the detrimental effects of ammonia stress in L. vannamei, and we found that the majority of the key genes involved in the response to ammonia stress are potentially involved in immune defense function .
Because the aquaculture environment is becoming worse, and ammonia stress causes grievous harm to shrimp aquaculture, it is necessary to seek possibilities for culturing shrimp that efficiently tolerate ammonia stress, which might represent alternative methods for reducing mortality and infectious diseases. Selective breeding programs are an effective method to gain genetic improvements and have been utilized to improve performance in terms of growth, survival, reproduction, resistances to viruses and adverse environmental stress in shrimp [23–27]. Knowledge about the heritability of ammonia tolerance in shrimp is a prerequisite to understanding the potential for the genetic improvement of this trait. Previous studies have revealed that ammonia stress is more toxic to the earlier ontogenetic development of aquaculture organisms [28–30], and it has been widely found to exert lethal effects on penaeid shrimp juveniles [31–37]. The shrimp in the early juvenile stage are critical for successful breeding because high rates of mortality often occur during this period. We have performed an investigation to estimate the heritability of ammonia resistance in shrimp at the stage with an average body weight of 3.3 g in the normal salinity (30‰) , but there are no reports regarding the genetic parameters of ammonia tolerance in juvenile shrimp in the early stage. Additionally, the phenotype of a quantitative trait is determined by genetic and environmental resources and their interactions ; thus, genotype-by-environment (G×E) interactions will play an important role in genetic improvement . L. vannamei is cultured at different salinity levels due to its euryhalinity. However, many studies have revealed that low salinity can increase the toxicity of ammonia to the shrimp [34, 35, 41–43], and we have also verified that ammonia stress can influence the pathway that is involved in osmoregulation . If the G×E interaction for ammonia tolerance is significant at different salinity levels, the selection response for this trait will vary across different salinity levels . Based on the above reasoning, it is necessary to understand the heritability of ammonia tolerance in the early stage and the G×E interactions at different salinity levels in L. vannamei.
Considering that the majority of the production of L. vannamei occurs in seawater with normal salinity and freshwater with low salinity, in the present study, we estimated the heritability of ammonia tolerance in L. vannamei juveniles in the early stage (average body weight, 0.5 g) at two different salinity levels (30‰ and 5‰) and estimated the genetic correlations between body length and ammonia tolerance within the two environments to verify the relationship between ammonia tolerance and ontogenetic development. Additionally, in the present study, we also detected the G×E interaction between the two salinity levels. Our goal was to understand the potential for the genetic improvement of ammonia tolerance in L. vannamei in the early stage in normal and low salinity because such improvement might represent an alternative method for reducing the mortality of L. vannamei and will also be helpful for enhancing resistance to pathogen and reducing infectious diseases.
Materials and methods
Origin of the base population and selection procedure
The program was conducted at a Hebei Xinhai Aquatic Biological Technology Co., Ltd. Facility located in the Huanghua (longitude 111.3366, latitude 38.37738), Hebei province, China. The founders were collected from seven improved commercial strains that were introduced from different companies in the United States and Singapore. The virus-free individuals from the seven strains were tagged using numbered rings that were placed on one ocular peduncle and used to produce the base population (G0) via an incomplete diallel cross experiment. The G0 was constructed with 130 full-sib families using artificial insemination with 114 males and 108 females in 2011 and included 69 half-sib families. Individual estimated breeding values (EBV) for body weight and the family EBV for survival rate from the test were weighted into the selection index and used to rank the breeding candidates as described by Luan et al. . The families with low selection indices (<100) were eliminated from the breeding program, and the males and females with high selection indices (>100) from the remaining families were selected to produce the next generation. During the implementation process of the present program, the selection population was closed, and the generations were discrete. The pedigree of the selection population is clear and forms a relationship matrix. The shrimp production scheme was provided in our previous study .
The present experimental shrimp were from 91 full-sib families (52 half-sib families) of the fifth generation (G5). The half-sib families were produced by two females mating with one male or two males mating with one female, and all of the families were produced by artificial insemination. The processes of family construction, hatching, and larvae rearing were described in detail in our previous study . At the post-larvae stage of approximately 45 days, approximately 400 post-larvae were randomly selected from each family and equally transferred into tow net cages (0.5 m3) that were separately fixed in two large ponds (60 m3) for separate rearing at the same density for each family. The seawater in one of the ponds was diluted with filtered fresh water from 30‰ to 5‰, which lasted for approximately three weeks. However, the seawater in the other pond was maintained at 30‰. Aside from the salinity, the other management conditions were maintained identically between the two ponds.
Challenge with high concentration ammonia
To acquire the individual survival times for the estimations of the genetic parameters, we performed a pre-experiment to obtain the proper concentrations of ammonia in the 30‰ (normal) and 5‰ (low) salinity conditions. Based on the results of our pre-experiment, the concentrations of ammonia required to kill all of the shrimp after 72-h challenges were 32 mg/L and 18 mg/L in the normal and low salinity conditions, respectively. The ammonia solution was prepared using NH4Cl (Aldrich, Milwaukee, WI, USA).
The juveniles were too small (the average body weight of 0.50 g and average length of 3.73 cm) that they could not be tagged with visible implant elastomer (VIE) when the experiment began. Therefore, forty individuals were randomly selected from each net cage (each family) and transferred to separate 100 L tanks filled with 35 L 30‰ or 5‰ salinity water for ammonia stress tests under the same conditions of density and environmental management. After a period of three days for a temporary rearing, the ammonia acute stress experiment was initiated. During the experiment, the shrimp were given feedstuff twice per day, and the management environments were maintained identically between the tanks. The dissolved oxygen level was no less than 6 mg L-1, the pH ranged from 8.00 to 8.06, and the temperature was approximately 27±0.5℃. The tanks were cleaned daily by suction to remove feces, and the water that was removed by suction was replaced with clean water with the same concentration of ammonia. The dead shrimp were collected and recorded every hour during the experiment. When the shrimp sank, stopped movement and lost responsiveness to external stimulation, they were defined as dead and removed from the tanks with a scoop net. The family ID and death time of each dead shrimp were recorded to quantify the individual survival times (STs) and survival statuses (1 = alive, 0 = dead) at the half lethal time (SS50). Because the animals were too small that body weight measures would introduce large measurement errors due to the different degrees of excess water on the shrimp, we selected body length (BL) as an index of growth. Therefore, the individual BLs were also recorded when the shrimp died, which enabled us to investigate the relationship between ammonia tolerance and growth stage.
Variance components and heritability estimates.
An analysis of the descriptive statistics was conducted using the MEANS procedure . The complete pedigree of this breeding program from G0 to G5 was used in the following analysis to account for the genetic relationships among the individuals. The common environmental effect was omitted in this analysis due to the separate rearing of the families during the experiment. The ammonia tolerance was defined by the ST (individual level) and the SS50 (family level) in the present study. The variance components and heritabilities were calculated for ST and SS50 using the data from the normal and low salinity conditions and the merged data from the two conditions. Therefore, these variables are denoted as the STH/SS50H, STL/SS50L, and STM/SS50M for the normal salinity, low salinity, and merged data, respectively.
The individual animal model was used to estimate the heritability of the ST in the shrimp that were challenged with the high concentration of ammonia. Body length exhibited a linear relationship with ST (P < 0.01) and was fit as a covariate in the model. The fitted model was as follows: (Model 1) where yijk is the observed ST of the kth individual, μ is the overall mean, Salti is the fixed effect of the ith salt level (two levels) and it was not contained when estimated for the separate environments, BLj is the covariate of the jth body length, b is the regression coefficient, ak is the random additive genetic effect of the kth individual, and eijk is the random residual effect associated with observation ijk. The phenotypic variance () was taken as the sum of all of the variance components as follows: . The heritability (h2) was calculated as the ratio between the genetic variance and the total phenotypic variance ().
A standard threshold (probit) and a sire–dam model were used in this study to estimate the heritability of the SS50. The model was written in ASReml as follows : (Model 2) where yijkl is the survival status (1 = alive, 0 = dead) of the jth shrimp, λijkl is the underlying liability of yijkl, which was assumed to be a cumulative standard normal distribution, μ is the overall mean, BLj is the covariate of the jth body length, b is the regression coefficient, and Sirek and Daml are the additive genetic effects of the kth sire and the lth dam, Sire or Dam ~(0, ) (), where A is the additive genetic relationship matrix among all shrimp. The residual variance of λ was assumed to be 1. The phenotypic variance was the sum of and : . Heritability was computed as the ratio between and (). The heritability would be overestimated in this model, so the estimates for SS50 then were adjusted according to Robertson and Lerner .
The estimation of the heritability of BL was performed using the ASReml package . The individual animal model was used to estimate the genetic parameters for the BLs of the shrimps that were challenged with high concentrations ammonia. The age at the end of the experiment exhibited a linear relationship with body length and was fit as a covariate in the model. All of the fixed effects and the covariates were statistically significant (P < 0.01). The fitted model was as follows: (Model 3) where yijk is the observed BL of the kth individual, μ is the overall mean, Salti is the fixed effect of the ith salt level (two levels), and it was not contained when estimated for the separate environments, b is the regression coefficient, Agej is the covariate of the jth individual’s age when the experiment ended, ak is the random additive genetic effect of the kth individual, and eijk is the random residual effect associated with observation ijk. The phenotypic variance () was the sum of all variance components: . The heritability was calculated as the ratio between the animals’ genetic variance and the total phenotypic variance as follows: ().
The z-scores were used to test whether the heritability estimates were significantly different from each other and whether the heritability estimates were significantly different from zero . where xi and xj are the heritability estimates from different models and different salinity levels, and and are their respective standard errors. Both xj and were set to be zero when testing whether an estimate was significantly different from zero. The resulting z-score was then tested against a large-sample normal distribution.
Genotype by environment interaction estimate.
The homologous ST, SS50, and BL in the different environments were considered to be different traits; thus, the genotype by environment (G×E) interactions were estimated based on the genetic correlations of these traits between environments. The phenotypic (rp) and genetic (rg) correlations of the ST and BL values between the two salinity environments were estimated using the bivariate animal model described above. Because one animal can only be tested in one environment, there were no environmental (residual) correlations between the homologous traits, and the environmental covariance was set to be zero in the bivariate analysis . Because of the convergence problem, the rg between the two salinity environments for the SS50 was calculated with the estimated breeding values. A G×E interaction was measured as the difference between the genetic correlation and 1. Thus, genetic correlations closer to 1 indicated smaller the G×E interactions. The genetic correlations between the environments were used to measure the degree of re-ranking due to G×E interactions. Additionally, the rp and rg values between the STs and BLs within environments were estimated using the bivariate animal model, which is the extension of Model 1 and Model 3, with the ASReml package . The statistical significances of the genetic correlations within environments and between environments were assessed using the confidence intervals (CIs). The CIs of the genetic correlations were calculated using rg, the standard errors (SEs) and Z0.025, which is 1.96 (i.e., [rg−1.96×SE, rg+1.96×SE]). If 1 or 0 was contained within the CI, the difference between the genetic correlation and 1 or 0 was not significant.
A total of 7221 individual records, including 3624 records from the normal salinity condition and 3597 records from low salinity condition, were obtained and analyzed in the present study. The number of final testing individuals was less than the total number of starting individuals (7280), which might have been due to cannibalism during the experimental process. The numbers of samples, simple means, minima, maxima, standard deviations and the coefficients of variation for the survival rate of each family at the half-lethal time (SS50), ST, and BL are given in Table 1. The 25th percentiles, median percentiles, 75th percentiles, minima, maxima, and outliers of the SS50 and average ST values of each family are displayed in Fig 1. The results revealed the SS50 values under acute ammonia stress in the normal and low salinity conditions varied substantially between the families, but the variance was higher following exposure to acute ammonia stress in the normal salinity condition (Table 1; Fig 1A). Additionally, the STs under acute ammonia stress in the normal and low salinity conditions also varied substantially between the families and the overall individuals, but the variance was higher when analyzed at the individual level than the family level as indicated by the higher standard deviation (SD) and coefficient of variation (CV). The CVs of the STs were 27.38% and 29.27% among the 91 families in the two environments, but these values were 46.87% and 54.48% among the individuals in the two environments (Table 1). Although the concentration of ammonia (32 mg/L) was higher in the normal salinity condition than in the low salinity (18 mg/L) condition, the average ST under acute ammonia stress in the normal salinity condition (36.64 h) was substantially greater than that in the low salinity condition (24.80 h), and the variance was higher in the normal salinity condition (Table 1; Fig 1B). Regarding the BL, it also exhibited higher variance when analyzed at the individual level than at the family level, but there was no significant difference between the two environments (Table 1).
(a) Boxplot of the survival rates of the families at the half-lethal time. (b) Boxplot of the average survival times of the families. The 25th (upper line), median (inside line) and 75th (bottom line) percentiles of the families are plotted as boxes. The minima, maxima, and the observed values are shown as -, -, and ○, respectively.
The cumulative mortality and the average survival time of each family under acute ammonia stress in the normal and low salinity conditions are displayed in Fig 2 and Fig 3, respectively. The cumulative mortalities of the families under acute ammonia stress in the low salinity condition were substantially greater than those in the normal salinity condition at each sampling point (Fig 2). All of the individuals in the low salinity condition were dead after 69 h of exposure to acute ammonia stress, but at this time, 115 individuals (3.5%) in the normal salinity condition were still alive. It showed that the average survival time exhibited large variation between the families within the same environment and between the two environments, but the average survival time of the majority of the families under acute ammonia stress in the low salinity condition was substantially lower than that of the families in the normal salinity condition (Fig 3).
Variance components and heritability estimations
Estimates of the variance components and heritability of ST, SS50, and BL created with different models are provided in Table 2. All of the heritability estimates were significantly greater than zero (P < 0.01). According to the classification reported by Cardellino and Rovira  (i.e., low, 0.05–0.19; medium, 0.20–0.44; high, 0.45–0.64; and very high, >0.65), the heritability estimate of STH (0.784±0.070) was very high, the heritabilities of STL (0.575±0.068) and STM (0.517±0.058) were high, and the heritabilities of SS50H (0.402±0.061), SS50L (0.216±0.050) and SS50M (0.264±0.050) were medium. The heritability estimate of STH was significantly greater than those of STL and STM (P<0.05), but there was no significant difference between STL and STM (P>0.05). The heritability estimates of ST were all significantly greater than that of SS50 (P < 0.01). The heritability of SS50H was significantly greater than that of SS50L (P < 0.05), but there was no significant difference between SS50H and SS50M (P>0.05), and there was also no significant difference between SS50L and SS50M (P>0.05). The heritability estimates of BLH (0.346±0.052), BLL (0.386±0.054), and BLM (0.291±0.042) were all medium based on the above classification, and the values for BLH and BLL were higher than that of BLM but were not significantly different from each other (P > 0.05).
Genetic corrections within and between environments
To understand the effects of the G×E interactions on the three traits, particularly the trait of ammonia tolerance, genetic correlations were estimated between the two environments, and the results are displayed in Table 2. The phenotypic and genetic correlations among the three traits between the two environments were all positive, and their genetic correlations were higher than their respective phenotypic correlations. The estimated genetic correlation for BL between the two environments was higher (0.535±0.096) than those of ST (0.394±0.097) and SS50 (0.377±0.098) (Table 2). The genetic correlations for ST, SS50 and BL between the two environments were significantly different from 0 and 1 based on the CIs (ST, 0.20–0.58; SS50, 0.19–0.57; and BL, 0.35–0.72). These results indicate a strong re-ranking effect for ammonia tolerance and growth between the two environments.
The phenotypic and genetic correlations within the two environments for ST and BL are given in Table 3. The phenotypic and genetic correlations between the two traits within the two environments were also all positive, and their genetic correlations were greater than their respective phenotypic correlations. The phenotypic and genetic correlations between ST and BL were higher in the normal salinity condition (rp = 0.416±0.017; rg = 0.779±0.037) than in the low salinity condition (rp = 0.298±0.021; rg = 0.568±0.048).
In this study, we first reported the heritability of ammonia tolerance (adversity-stress adaptability) for juveniles in the early stage in normal salinity (30‰) and low salinity (5‰) conditions and the G×E interaction between the two environments to explore the possibility of reducing the mortality of shrimp by selection. Many studies have reported that ammonia stress is more toxic to shrimp in early stages [28–30, 53–55], and this finding was corroborated in the present study. In this experiment, approximately 98.6% of the individuals with an average body weight 0.5 g (average body length of 3.7 cm were dead following exposure to 32 mg/L ammonia for 72 h (pH = 8.05, temperature = 26.5±0.5℃, salinity = 30‰). However, all of the L. vannamei individuals with the an average body weight of 3.3 g were dead following exposure to 63 mg/L ammonia for 218 h (pH = 8.0, temperature = 25℃, salinity = 30‰) . Sun et al.  reported that the half-lethal concentration of ammonia is 32.44 mg/L at 72 h (pH = 8.15, temperature = 27℃, salinity = 20‰) for L. vannamei with an average body length of 5.0 cm. Although high temperatures and low salinities can increase the toxicity of ammonia stress, the shorter survival time at lower concentrations of ammonia in our study also suggests that younger shrimp are more sensitive to ammonia toxicity. Additionally, we also observed that the lower salinity increased the sensitivity of shrimp to the toxic effects of ammonia. The lethal concentration of ammonia at 72 h in the normal salinity condition (30‰) was 32 mg/L, which is significantly higher than the 18 mg/L value observed in the low salinity condition (5‰). The average survival time of all the individuals under acute ammonia stress in the normal salinity condition was 36.61 h, which is substantially greater than the 24.80 h observed in the low salinity condition (Table 1), and the average survival time of majority of the families under acute ammonia stress in the normal salinity condition was significantly greater than that in the low salinity condition (Fig 3). The cumulative mortalities of the families in the normal salinity condition were significantly lower than those in the low salinity condition at each sampling point (Fig 2). The above results indicated that low salinity increased the toxicity of ammonia stress to shrimp, which is consistent with the results of previous studies [34, 35, 41–43]. Our results suggest that the selection of ammonia-tolerant shrimp should be performed in the early stage and in low salinity water.
A high heritability of a trait suggests that rapid and highly accurate genetic improvement can be obtained by selection. Although many studies have been performed to detect ammonia toxicity for aquatic organisms [10–18], very few reports are available regarding the genetic parameters related to ammonia tolerance in aquatic species; thus, there is little information against which to compare our results. Currently, the only previous report of the heritability of ammonia resistance in shrimp with an average body weight of 3.3 g that are subjected to ammonia stress in normal salinity conditions came from our group . In that report, the heritabilities of ST and SS50 were 0.154±0.045 and 0.148±0.040, respectively, and these values were not significantly different from each other. In the present study, we estimated the heritability of ammonia tolerance of shrimp with an average body weight of 0.5 g under ammonia stress in normal and low salinity conditions. The heritability estimates of ST (STH = 0.784±0.070; STL = 0.575±0.068; STL = 0.517±0.058) and SS50 (SS50H = 0.402±0.061; SS50L = 0.216±0.050; SS50M = 0.264±0.050) under ammonia stress at the two salinity levels were all significantly greater than that reported in the previous study, which indicates that improving the ammonia tolerance of shrimp via selection performed at the early stage could be quite advantageous. Although we did our best to maintain the same environment between the tanks in the present study, the absent of common environmental effects because of the juveniles without communal rearing during the experiment may have contributed to the overestimation of the heritability estimates of ammonia tolerance, because the exclusion of common environmental effects can inflate heritability estimates [57–59]. In our previous study, an ammonia challenge experiment was performed in four tanks, and significant tank effects were detected, so the fixed effect of the tank in the models might have reduced the additive genetic variance to a certain extent, but the absence of common environmental effects also might have resulted in overestimations of the heritability estimates [27, 60]. The higher heritability of ammonia tolerance observed in the present study might have been due to cryptic genetic variation (CGV) that emerged due to ammonia stress, which could have increased the additive genetic variance (Va, i.e., the heritable phenotypic variation). Because shrimp in the early stage are more sensitive to the ammonia stress, the CGV is more easily released. Additional information about CGV can be found in the review by Paaby and Rockman that was published in Nature Reviews Genetics . Genomic matrices based on molecular information provide a more precise method for the estimation of heritability and can thus enhance the accuracy of selective breeding, and this approach has been successfully applied in animal husbandry and agriculture [62, 63]. With the development of molecular biological technology, the cost of obtaining genotypic information from individual shrimp (e.g., through sequencing-based or SNP panel-based genotyping technology) will be substantially reduced, and a few molecular genetic/genomic resources will become publically available. At this time, we will use genomic matrices for heritability estimation, which will enable families to be communal reared without tags.
In the present study, the heritability estimates of ST were all significantly higher than those for SS50, which is inconsistent with our previous results that indicated that the heritability estimates of ST and SS50 are basically the same . The significantly lower heritability of SS50 compared with ST might have resulted from the smaller sample size at the family level (91) for SS50 compared to the individual level (above 3500) for ST. Additionally, the heritability estimate of ammonia tolerance at normal salinity was significantly greater than that at low salinity in the present study, which might have resulted from the incomplete release of the additive genetic variance among the individuals and families because the time to death was shorter and more intensive in the low salinity condition; after 20 h of ammonia stress, the cumulative mortality in the low salinity condition was 50%, whereas this value was only 24% in the normal salinity condition (Fig 1). The heritability estimates of ST and SS50 from the merged data from the environments were lower than those in the normal salinity condition, which indicates that salinity plays an important role in the estimation of the heritability of ammonia tolerance. After this experiment, a WSSV infection experiment was performed with the ammonia-sensitive and ammonia-tolerant populations to detect the relationship between ammonia tolerance and WSSV resistance. The results revealed that the ammonia-tolerant population also exhibited a significantly greater resistance to WSSV than the ammonia-sensitive population (this result will be published soon in an article entitled “The investigation for the susceptibility difference of ammonia-tolerant and ammonia-sensitive populations to WSSV infection in Litopenaeus vannamei”). Fortunately, the ammonia-tolerant shrimp can be improved largely due to high heritability, which might provide an alternative method to reduce mortality, aid the enhancement of resistance to pathogens, and reduce infectious diseases.
Growth is a very important trait in selective breeding programs because it is highly correlated with economic return. The estimates of the heritability of growth in L. vannamei have primarily focused on harvest body weight [27, 64], and very little information is available regarding juveniles. Although the absence of the common environmental effects could have resulted in the overestimation of the heritability of BL in the present study, the medium heritabilities (0.346±0.052, 0.386±0.054, and 0.291±0.042) in the two environments still indicated that there is potential for improving the growth of juveniles. The growth performances of the families between the two environments were not fully released due to the short communal rearing period (approximately three weeks), and thus, the heritabilities of growth observed in the two salinity levels in the present study are only preliminary results. To understand the difference in growth performance between normal and low salinity conditions, estimates of the heritability of growth in the two salinity levels is carried out in a separate experiment with a longer testing time (approximately three months) in our program, and the results of such an experiment will be published later. Previous studies have reported negative correlations between body weight and resistance traits (such as TSV resistance and WSSV resistance) in L. vanname [23, 24]. However, a strong positive correlation between body length and ammonia tolerance was detected in the present study, which is consistent with the previous studies [28, 38]. The strong positive correlation between body size and ammonia tolerance suggests that it is necessary to select ammonia-tolerant shrimp during the early stage and that selection high growth will not adversely affect ammonia tolerance performance in shrimp.
G×E interactions refer to genotypes that respond differently in different environments and thus result in different performances [65–67]. Genetic correlations across environments that are significantly different from 1 indicate the existence of re-ranking effects, i.e., genotypes that should be ranked differently in different environments . The results from the simulation study of Sae-Lim et al.  suggested that more reliable estimates of the G×E interactions can be obtained with population sizes greater than 2000 (i.e., 50–200 families and 10–40 individuals of each family) and heritabilities greater than 0.30. The above conditions are suitable to the situation in our study, and thus, the present estimates of G×E interactions are credible. Generally, selection for a trait in aquaculture is always performed in the main environment; for example, ammonia tolerance would typically be selected for in normal salinity conditions. However, the offspring of the excellent shrimp procured from selection are transferred to different commercial production environments in most cases. Therefore, strong re-ranking effects may occur and result in different responses to ammonia stress at different salinity levels if there is a strong G×E interaction for this trait, and these effects would influence selection accuracy and the total economic gains. In the present study, we detected a strong re-ranking effect (i.e., a low genetic correlation) for ammonia tolerance between the different salinity levels (Table 3), which suggests that the genetic advantage of ammonia tolerance can be significantly influenced by salinity. Additionally, the genetic correlations across environments would decrease along with increases in differences between environments [70, 71]. When a strong G×E interaction exists for a trait, it is better to perform selection in multiple environments, such as the nucleus environment and other production environments . Therefore, salinity-specific breeding programs for ammonia tolerance in shrimp should be purposefully implemented.
Overall in this study, we first reported the heritability of ammonia tolerance in juveniles in the early stage and the G×E interaction between the normal and low salinity to explore the potential for improving ammonia tolerance in shrimp by selection. The results revealed that the heritability of ammonia tolerance was medium to high in L. vannamei juveniles, which suggests that rapid genetic gains in terms of ammonia tolerance could be obtained by increasing the selection intensity in our selective breeding program. However, the heritability of ammonia tolerance in the normal salinity condition was significantly higher than that in the low salinity condition. Additionally, a strong G×E interaction for ammonia tolerance was detected between the normal salinity and low salinity conditions, which suggests that salinity-specific breeding programs for ammonia tolerance in shrimp should be purposefully implemented and that the normal salinity environment is a better choice based on the faster rate of genetic improvement due to higher heritability. Additionally, the significantly and strong positive correction between ammonia tolerance and body size suggests that ammonia-tolerant shrimp should be selected in the early stage.
We thank Yuling Zhang and Zhangwei Kong from Ocean University of China, Guangfeng Qiang from Shanghai Ocean University, and Fanyong Zeng from Nanjing Agricultural University for assisting to complete the ammonia test experiment.
- Conceptualization: XL SL JK.
- Data curation: XL SL.
- Formal analysis: XL JS.
- Funding acquisition: XL JK.
- Investigation: XL BXC PD DCH KL GMH XLS.
- Methodology: XL SL JS.
- Project administration: XL JK.
- Resources: BXC SL DCH.
- Software: SL JS XL.
- Supervision: JK SL XHM.
- Validation: XL.
- Visualization: XL.
- Writing – original draft: XL.
- Writing – review & editing: XL.
- 1. FAO Yearbook. Fishery and Aquaculture Statistics Summary tables. 2015.
- 2. Lu X, Luan S, Luo K, Meng XH, Li WJ, Sui J, et al. Genetic analysis of the Pacific white shrimp (Litopenaeus vannamei): heterosis and heritability for harvest body weight. Aquaculture Research 2015; 47: 3365–3375.
- 3. Cuzon G, Lawrence A, Gaxiola G, Rosas C, Guillaume J. Nutrition of Litopenaeus vannamei reared in tanks or in ponds. Aquaculture 2004; 235: 513–551.
- 4. Colt JE, Armstrong DA. Nitrogen toxicity to crustaceans, fish, and molluscs. In: Allen LJ, Kinney EC, editors. Proceedings of the Bio-Engineering Symposium for Fish Culture. Fish Culture Section, American Fisheries Society. Northeast Society of Conservation Engineers, Bethesda, MD. 1981. p. 34–47.
- 5. Foss A, Vollen T, Øiestad V. Growth and oxygen consumption in normal and O2 supersaturated water, and interactive effects of O2 saturation and ammonia on growth in spotted wolfish (Anarhichas minor Olafsen). Aquaculture 2003; 224: 105–116.
- 6. Alagappan KM, Deivasigamani D, Somasundaram ST, Kumaran S. Occurrence of Vibrio parahaemolyticus and its specific phages from shrimp ponds in east coast of India. Current Microbiology 2010; 61(4): 235–240. pmid:20140436
- 7. Lewis JrWM Morris DP. Toxicity of nitrite to fish: a review. Transactions of the American Fisheries Society 1986; 115: 183–195.
- 8. Hurvitz A, Bercovier H, Rijn JV. Effect of ammonia on the survival and the immune response of rainbow trout (Oncorhynchus mykiss, Walbaum) vaccinated against Streptococcus iniae. Fish & Shellfish Immunology 1997; 7: 45–53.