|Applied Biological Materials, Inc. is an Illumina Certified Service Provider, dedicated to ensuring the delivery of the highest-quality data available for genetic analysis applications. Click here for more details about the Illumina CSpro program.|
|Key features for abm's WGS service|
|Starting Material||50 ng (eukaryote) or 1 ng (prokaryote)|
|Sequencing Type||75 bp paired end. Longer read lengths available upon customer request.|
|Library Type||500 bp fragment size (average). Larger fragment size available upon customer request.|
|Bioinformatics Analyses||Read alignment; SNP and Indel calling; de novo assembly*; Unified genomic variation detection*|
|Turn-around Time||2-4 weeks for sequencing plus 2-3 weeks for analysis|
|Data Storage||3 months|
abm’s WGS service overview:
Exploratory Bioinformatics Analyses:
abm’s proprietary Unified Variation Caller detects genomic variation using a combined alignment-based and de novo assembly-based approach that is more sensitive and accurate than many popular choices such as VarScan, Pindel, and GATK. Our key advantage is the ability to accurately detect compound variations (deletions and insertions occurring at the same locus).
High quality assembly (shown by the dot plot on the left) enables accurate calling of SNPs, Simple Indels, and large Compound Variations. Circos plot (right) shows from outside to inside: Reference genome (brown), SNP density (green), Simple Deletion size (blue), Simple Insertion size (red), Compound Variation size (grey).
- Sequencing results in FASTQ format
- SNP and Indel calling
- Contig and scaffold sequences from de novo assemblies in FASTA format (if applicable)
- QC and analysis report
All of abm's sequencing services are performed by a group of specially trained and experienced scientists following a streamlined workflow.
For more information, please contact our technical support team with details of your project requirements at email@example.com.
Library QC is also performed on the Agilent Bioanalyzer to determine library size and purity. In addition, before loading the libraries on the sequencer, we quantify them by qPCR. The cost of these QC steps is included in the sequencing service.
For QC of the final data from our sequencing service, the sequencer's integrated software generally provides adequate QC information. We can also provide additional QC data, such as a FastQC report, upon request.
The data we output will pass our Q30 filter. A Q score of 30 corresponds to a base-call error probability of 1 in 1000 (0.1%), so bases at Q30 or above have an error rate of at most 0.1%. At the end of each run we obtain the percentage of bases with a Q score ≥ 30, and this percentage is usually ≥85%.
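The relationship between a Q score and its error probability can be sketched as follows. This is a minimal illustration, not abm's QC software: the function names are hypothetical, and the quality string is assumed to use the Phred+33 ASCII encoding found in current Illumina FASTQ files.

```python
def phred_to_error_probability(q):
    """Probability that a base call with Phred score q is wrong: 10^(-q/10)."""
    return 10 ** (-q / 10)


def decode_fastq_quality(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def fraction_at_or_above(qualities, threshold=30):
    """Fraction of base calls whose Phred score is >= threshold (e.g. %Q30)."""
    if not qualities:
        raise ValueError("no quality scores given")
    return sum(1 for q in qualities if q >= threshold) / len(qualities)


if __name__ == "__main__":
    scores = decode_fastq_quality("IIII?5")  # hypothetical quality string
    print(scores)
    print(phred_to_error_probability(30))
    print(fraction_at_or_above(scores))
```

Applied to every quality line of a FASTQ file, `fraction_at_or_above` gives the same kind of percentage as the run-level figure described above.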
mirVana™ PARIS™ Kit or PureLink® miRNA Isolation kit to enable adequate enrichment of your sample. If preferred, Qiagen's kit can also be used, as long as the RNA Integrity Number (RIN) of the sample is above 8.0, which indicates that there is no significant degradation. (Samples with a RIN lower than 8.0 may contain small degraded RNA fragments that are sequenced along with the miRNAs.) However, we prefer to receive total RNA so that we can perform Bioanalyzer QC to check the RNA quality before beginning library construction. There is no extra charge if you submit total RNA instead of isolated miRNA.
For miRNA sequencing we require 200 ng to 2 µg of total RNA in 10 µl of nuclease-free water, as quantified with a fluorometric method. Lower amounts might result in inefficient ligation and low yield. If you are supplying purified miRNA, please submit a minimum of 50-100 ng of purified small RNA in 10 µl. Purified small RNAs must be in nuclease-free water or 10 mM Tris-HCl, pH 8.5.
For other services, there are generally no preferred DNA/RNA isolation kits as long as minimum requirements for QC are met.
The RNASeq_sample_cufflinks_output.tar.gz archive will also need to be extracted before its text files can be viewed in vim, nano, gedit, or Notepad++. The extracted data can also be imported as text from the Data tab of Excel 2007; other versions may differ.
To analyze further, Excel can sort, filter, find, match, etc., and has many functions for conditional coloring, graphing, etc.
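For users who would rather script this step than use Excel, the archive can be extracted and its tables read with the Python standard library. This is a rough sketch: the helper names are hypothetical, the files inside the archive are assumed to be tab-delimited text with a header row, and the archive is assumed to come from a trusted source.

```python
import csv
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive and return the extracted files, sorted by path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest)  # only do this with archives you trust
    return sorted(p for p in dest.rglob("*") if p.is_file())


def read_table(path):
    """Read a tab-delimited text file into a list of dicts keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


if __name__ == "__main__":
    # Archive name taken from the text above; the output folder is arbitrary.
    files = extract_archive("RNASeq_sample_cufflinks_output.tar.gz", "cufflinks_output")
    for path in files:
        print(path)
```

Each row returned by `read_table` is a dictionary, so sorting and filtering can then be done with ordinary Python list operations.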
For a quick estimate, take the number of gigabases (Gb) included in the service and divide it by the genome size to find the coverage that service will give:
e.g. 30 Gb = 30,000,000,000 bases.
If the genome size is 1 Gb = 1,000,000,000 bases, 30 Gb will give 30X coverage.
For human WGS, the genome size is around 3 Gb, so 30 Gb will give 10X coverage. For a list of coverages, please visit the "Coverage Guidelines" section on this page.
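The same arithmetic can be written as a short Python sketch. The function names are illustrative only, and the estimate ignores duplicate, low-quality, and unmapped reads, so the usable coverage will typically be somewhat lower.

```python
def estimated_coverage(sequenced_bases, genome_size_bases):
    """Average depth of coverage = total sequenced bases / genome size."""
    if genome_size_bases <= 0:
        raise ValueError("genome size must be positive")
    return sequenced_bases / genome_size_bases


def required_output(target_coverage, genome_size_bases):
    """Total bases needed to reach a target average coverage."""
    return target_coverage * genome_size_bases


if __name__ == "__main__":
    GB = 1_000_000_000  # 1 Gb = 1,000,000,000 bases
    print(estimated_coverage(30 * GB, 1 * GB))  # 1 Gb genome
    print(estimated_coverage(30 * GB, 3 * GB))  # human-sized genome
    print(required_output(30, 3 * GB) / GB)     # Gb needed for 30X human
```

Running the calculation in reverse with `required_output` shows how much sequencing output to request for a given target depth.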
BAM is the compressed binary format that holds the raw FASTQ reads after they have been mapped to a reference.
VCF is the processed file that records only the differences between the mapped sample and a known reference, and is therefore much smaller. These differences are the SNPs, mutations, and indels that you would be looking for. This is the only file you can use to locate these genetic markers.
To visualize the VCF file, you need to upload it to a viewer such as the UCSC Genome Browser (https://genome.ucsc.edu/) or use your own visualization program, such as genome in a box, Galaxy, etc.
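To give a feel for what a VCF contains, the sketch below reads the fixed columns of each record and labels every alternate allele as a SNP, insertion, or deletion by comparing REF and ALT lengths. It is a simplified illustration with hypothetical function names, not abm's variant caller: it assumes uncompressed, tab-delimited VCF text and lumps symbolic or complex alleles together as "other".

```python
from collections import Counter

VALID_BASES = set("ACGTN")


def classify_allele(ref, alt):
    """Label one REF/ALT pair by comparing allele lengths (simplified)."""
    ref, alt = ref.upper(), alt.upper()
    if not alt or not set(ref) <= VALID_BASES or not set(alt) <= VALID_BASES:
        return "other"  # symbolic alleles such as <DEL>, '*', or '.'
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "other"  # same-length multi-base substitutions


def iter_variants(lines):
    """Yield (chrom, pos, ref, alt, kind) for each ALT allele in VCF lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the header
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            yield chrom, pos, ref, alt, classify_allele(ref, alt)


def tally(lines):
    """Count variants by kind."""
    return Counter(kind for *_, kind in iter_variants(lines))
```

Passing an open file handle, for example `tally(open("sample.vcf"))` with a hypothetical file name, returns a count of each variant type.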
The level of annotation is often higher in UCSC, but its data tables and BED files use 0-based start coordinates, and the human reference is sometimes listed there as hg19 (GRCh37), which is an older assembly than the one NCBI uses right now. The current assembly is GRCh38 (hg38 in UCSC naming); VCF files and NCBI use 1-based coordinates.
This often matters if the user already has an established data pool, collaborators, etc. and wishes to compare annotations between old and new data. For most labs starting out, it will not matter which you use, so long as you import or identify the correct reference in your visualization tool and use it consistently so that you report comparable data.
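When comparing positions between tools, an off-by-one shift is the usual symptom of mixing coordinate conventions. The following sketch converts between 0-based, half-open intervals (as used in BED files and UCSC tables) and 1-based, closed intervals (as used for VCF positions). The function names are hypothetical, and the conversion does not translate between assemblies: moving between GRCh37 and GRCh38 requires a separate liftover step.

```python
def zero_based_to_one_based(start, end):
    """Convert a 0-based, half-open interval [start, end) to 1-based, closed [start, end]."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return start + 1, end


def one_based_to_zero_based(start, end):
    """Convert a 1-based, closed interval [start, end] to 0-based, half-open [start, end)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


if __name__ == "__main__":
    # A single base at VCF position 100 is the BED interval 99-100.
    print(one_based_to_zero_based(100, 100))
    print(zero_based_to_one_based(99, 100))
```

Only the start coordinate changes in either direction, which is why the end positions often appear to agree between tools while the starts differ by one.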