Omics Requirements
Requirements for Biobank Samples in Multi-Omics Technologies
In this section, we systematically summarize the biosample requirements of the major omics types: genomics, transcriptomics, proteomics, metabolomics, and epigenomics.
Genomics
Genomics studies the whole genome sequence and DNA variation (single-nucleotide variants, insertions and deletions, structural variants, etc.) to analyse genome structure, function, and evolution. The genome itself comprises protein-coding exons (about 1% of the sequence) and non-coding regions such as introns, which help regulate gene expression.
Whole-Genome Sequencing (WGS) involves DNA extraction, library preparation (fragmentation, end repair, adapter ligation), paired-end 150 bp sequencing (e.g., on Illumina platforms), and bioinformatics analysis (e.g., FastQC, BWA, GATK). The paired-end strategy improves detection of repeats and structural variants (SVs).
Whole-Exome Sequencing (WES) uses probes to capture exons, which make up about 1% of the genome but hold about 85% of disease-causing mutations, so it is cost-effective and less data-intensive. Its workflow resembles that of WGS, with an added exon capture and enrichment step during library preparation.
For microbial studies, amplicon sequencing amplifies marker genes (16S/18S/ITS) with universal primers to analyse community composition at the genus level and above. Metagenomic sequencing directly sequences all microbial DNA in environmental samples, revealing genetic and metabolic detail down to the strain level. Both involve end repair and adapter ligation during library preparation.
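The relation between paired-end 150 bp sequencing and sequencing depth can be illustrated with a short calculation. This is a minimal sketch: the function names, the 3.1 Gb genome size, and the 30x target are illustrative assumptions rather than values from this section, and losses from duplicates, low-quality reads, or capture efficiency are ignored.

```python
import math


def mean_depth(read_pairs: int, read_length: int = 150,
               target_bp: float = 3.1e9) -> float:
    """Mean depth = total sequenced bases / size of the target region.

    target_bp defaults to an assumed genome size of 3.1 Gb.
    """
    total_bases = read_pairs * 2 * read_length  # two reads per pair
    return total_bases / target_bp


def read_pairs_for_depth(depth: float, read_length: int = 150,
                         target_bp: float = 3.1e9) -> int:
    """Number of read pairs needed to reach a given mean depth."""
    return math.ceil(depth * target_bp / (2 * read_length))


print(mean_depth(310_000_000))      # 30.0
print(read_pairs_for_depth(30))     # 310000000
```

For WES, `target_bp` would be set to the size of the captured exome (roughly 1% of the genome, as stated above), which is why far fewer reads are needed for the same depth.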
Table 1: Genomics sample size requirements
| Specimen category | WGS/WES Sample volume | Amplicon sequencing Sample volume | Metagenomic sequencing Sample volume |
|---|---|---|---|
| Surgical tissue | 400μg-10mg | / | 400μg-25mg |
| Biopsy tissue | 400μg-10mg | / | 400μg-25mg |
| FFPE tissue | 10-20 sections, area 1 cm², thickness 5-10 µm | / | / |
| Plasma | / | ≥3ml | ≥5ml |
| Whole blood | ≥2ml | / | / |
| Saliva | 1-2ml | / | / |
| Faeces | / | ≥2g | ≥500mg |
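When such requirements are used for sample triage in a biobank, they can be encoded as a simple lookup. The sketch below transcribes only the WGS/WES lower bounds for four specimen types from Table 1; the dictionary layout, the function name `meets_minimum`, and the unit normalisation (mg for tissue, ml for liquids) are hypothetical choices made for illustration.

```python
# Lower bounds for WGS/WES transcribed from Table 1,
# normalised to mg (tissue) or ml (liquids).
WGS_WES_MINIMUM = {
    "surgical tissue": (0.4, "mg"),  # 400 μg
    "biopsy tissue": (0.4, "mg"),    # 400 μg
    "whole blood": (2.0, "ml"),
    "saliva": (1.0, "ml"),
}


def meets_minimum(specimen: str, amount: float, unit: str) -> bool:
    """Return True if the available amount reaches the Table 1 lower bound."""
    key = specimen.strip().lower()
    if key not in WGS_WES_MINIMUM:
        raise KeyError(f"no WGS/WES requirement recorded for {specimen!r}")
    minimum, expected_unit = WGS_WES_MINIMUM[key]
    if unit != expected_unit:
        raise ValueError(f"expected amount in {expected_unit}, got {unit}")
    return amount >= minimum


print(meets_minimum("Whole blood", 2.5, "ml"))  # True
print(meets_minimum("Saliva", 0.5, "ml"))       # False
```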
Transcriptomics
Transcriptomics, the study of all transcribed RNAs (mRNAs, non-coding RNAs, and small RNAs) in an organism, reveals gene expression and regulatory mechanisms and reflects the specific state of a cell.
mRNA-seq, a common technique, involves total RNA extraction, mRNA enrichment, double-stranded cDNA synthesis, end repair, A-tailing, adapter ligation, PCR enrichment, and library quality assessment, enabling high-throughput sequencing to uncover RNA expression dynamics and structural variation.
miRNA-seq, which focuses on small RNAs of 20-25 nt, includes RNA extraction, adapter ligation, reverse transcription, PCR amplification, and library quality assessment, and uses single-end 50 bp (SE50) sequencing to analyse miRNA expression levels and evolutionary history. lncRNA-seq resolves the functions of long non-coding RNAs by removing rRNA and retaining strand information. circRNA-seq constructs libraries by removing ribosomal and linear RNA and uses paired-end 50 bp (PE50) sequencing to analyse differential expression of circular non-coding RNAs and their miRNA binding sites, elucidating their functional mechanisms.
Metatranscriptome sequencing takes all RNA in a sample as its object of study; after standard library construction, it characterises the transcription and regulation of the genomes of a community in a specific environment. The sequencing data volume usually reaches 5-10 Gb, making it an important tool for studying gene expression networks in ecosystems.
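The 20-25 nt size window for miRNA-seq can be applied to sequenced reads as a simple length filter. The following is a minimal sketch that assumes the reads have already been adapter-trimmed; the function names and the toy reads are hypothetical.

```python
from collections import Counter


def filter_by_length(reads, min_len=20, max_len=25):
    """Keep reads whose length falls inside the expected miRNA size window."""
    return [r for r in reads if min_len <= len(r) <= max_len]


def length_distribution(reads):
    """Count reads per length, sorted by length."""
    return dict(sorted(Counter(len(r) for r in reads).items()))


# Toy reads of length 18, 20, 22, 25 and 30 nt.
reads = ["A" * 18, "ACGT" * 5, "ACGTACGTACGTACGTACGTAC", "A" * 25, "A" * 30]
kept = filter_by_length(reads)
print(length_distribution(kept))  # {20: 1, 22: 1, 25: 1}
```

A length distribution peaking in this window is a common sanity check for small-RNA libraries; reads outside it are usually degradation fragments or other RNA classes.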
Table 2: Transcriptomics sample size requirements
| Specimen category | mRNA-seq Sample volume | miRNA-seq Sample volume | lncRNA-seq Sample volume | circRNA-seq Sample volume | Metatranscriptome Sample volume |
|---|---|---|---|---|---|
| Surgical tissue | 400μg-25mg | 400μg-50mg | 400μg-25mg | 400μg-50mg | 40mg-500mg |
| Biopsy tissue | 400μg-25mg | 400μg-50mg | 400μg-25mg | 400μg-50mg | 40mg-500mg |
| FFPE tissue | 5-10 sections, thickness 5-10 μm, 25 mm² | 5-10 sections, thickness 5-10 μm, 25 mm² | / | / | / |
| Whole Blood | ≥500μl | ≥500μl | ≥500μl | / | / |
| Faeces | / | / | / | / | 1-3g |
Proteomics
Proteomics uses high-throughput methods to analyse the types, quantities, modifications, and interactions of proteins in cells or tissues. Together with genomics and transcriptomics, it uncovers the dynamics of gene expression, which is crucial for understanding physiological and disease mechanisms.
Among label-free techniques, data-independent acquisition (DIA) generates MS2 (secondary) spectra, interprets them with the help of a constructed spectral library, and integrates the MS1 (first-level) signals. Combined with LC-MS/MS and tools such as Spectronaut, it improves detection reliability. 4D-DIA adds ion mobility separation, enhancing scanning speed, sensitivity, and quantification accuracy. Label-free quantification uses LC-MS/MS for large-scale protein identification by analysing proteolytic peptides.
TMT enables simultaneous relative quantification of 8 or 18 samples using isobaric tags and involves protein extraction, digestion, labeling, chromatographic fractionation, and LC-MS/MS detection. PRM, after pre-experimental optimisation, targets specific peptides for accurate quantification. Metaproteomics combines label-free quantification, chromatographic fractionation, and mass spectrometry, using software such as Unipept and Krona to analyse the protein composition and species abundance of microbial communities.
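The proteolytic peptides analysed in these workflows can be predicted in silico. The sketch below assumes trypsin as the protease (the text does not name the enzyme) and applies the usual rule of cleaving after K or R unless the next residue is P; the function name and the example sequence are hypothetical.

```python
def tryptic_digest(protein: str, missed_cleavages: int = 0):
    """In-silico digestion: cleave after K or R, but not before P.

    Trypsin is an assumption here; other proteases follow other rules.
    """
    if not protein:
        return []
    # Cut positions lie after each K/R that is not followed by P.
    sites = [i + 1 for i, aa in enumerate(protein[:-1])
             if aa in "KR" and protein[i + 1] != "P"]
    bounds = [0] + sites + [len(protein)]
    peptides = []
    for start in range(len(bounds) - 1):
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end >= len(bounds):
                break
            peptides.append(protein[bounds[start]:bounds[end]])
    return peptides


print(tryptic_digest("MKWVTFISLLRPAGK"))
# ['MK', 'WVTFISLLRPAGK']  (the R is followed by P, so it is not cleaved)
```

Allowing `missed_cleavages=1` additionally returns the undigested span `MKWVTFISLLRPAGK`, which mirrors how search engines account for incomplete digestion.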
Post-translational modification (PTM) research covers glycosylation, ubiquitination, methylation, acylation, and others. Glycosylation is divided into N-linked and O-linked types, and three technical routes are used. In the first, a label-free approach, N-glycopeptides are enriched by lectin affinity chromatography after enzymatic digestion; subsequent deglycosylation in heavy water induces a mass shift that is detected by mass spectrometry. In the second, O-GlcNAc analysis, peptides are enzymatically digested, enriched with motif antibodies, and analysed by 4D label-free mass spectrometry, followed by qualitative and quantitative analysis of modification sites. In the third, intact N-glycopeptide analysis, peptides are enriched by ZIC-HILIC chromatography after enzymatic digestion; database searching and data analysis then identify modification sites and glycan compositions and support large-scale qualitative and quantitative analysis.
Ubiquitination analysis uses motif antibodies (K-ε-GG) to enrich peptides, with 4D label-free mass spectrometry detecting a 114.1 Da mass shift for site-specific analysis. Methylation analysis relies on motif antibody enrichment and 4D label-free detection of molecular weight changes to locate arginine and lysine methylation sites precisely. Acylation analysis uses antibody enrichment followed by TMT labeling (for acetylation) or 4D label-free detection of molecular weight shifts to analyse acylated peptides and their functional mechanisms.
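The 114.1 Da shift used to localise ubiquitination sites can be reproduced from the elemental composition of the di-glycine remnant left on the modified lysine after digestion (two glycine residues, C4H6N2O2). The sketch below is illustrative: the atomic masses are standard rounded values entered by hand, and the names are hypothetical. It shows that 114.1 Da corresponds to the average mass, while the monoisotopic shift seen in high-resolution spectra is about 114.043 Da.

```python
# Rounded atomic masses (Da); entered by hand for this illustration.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Di-glycine (GG) remnant: two glycine residues, each C2H3NO.
DIGLY = {"C": 4, "H": 6, "N": 2, "O": 2}


def mass(composition, table):
    """Sum of atomic masses weighted by the element counts."""
    return sum(table[element] * count for element, count in composition.items())


mono = mass(DIGLY, MONOISOTOPIC)
avg = mass(DIGLY, AVERAGE)
print(f"monoisotopic: {mono:.4f} Da")  # monoisotopic: 114.0429 Da
print(f"average:      {avg:.1f} Da")   # average:      114.1 Da
```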
Table 3: Proteomics sample size requirements
| Specimen category | DIA/4D DIA (Non-Depletion of High-Abundance Proteins) | DIA/4D DIA (Depletion of High-Abundance Proteins) | Label free/TMT (Non-Depletion of High-Abundance Proteins) | Label free/TMT (Depletion of High-Abundance Proteins) | PRM Sample volume | Metaproteomics Sample volume |
|---|---|---|---|---|---|---|
| Surgical tissue | ≥5mg | / | ≥4mg | / | ≥2mg | / |
| Biopsy tissue | ≥5mg | / | ≥4mg | / | ≥2mg | / |
| Core needle biopsy tissue | ≥2 cores, each about the size of a grain of rice and visible to the naked eye | / | / | / | / | / |
| FFPE tissue | ≥20 sections (5-10 μm thick, 50 mm² in area) | / | / | / | / | / |
| Serum | ≥20μl | ≥100μl | ≥20μl | ≥100μl | ≥50μl | / |
| Plasma | ≥20μl | ≥100μl | ≥20μl | ≥100μl | ≥50μl | / |
| Urine | ≥500μl | / | ≥500μl | / | ≥500μl | / |
| Saliva | ≥200μl | / | ≥200μl | / | ≥200μl | / |
| Faeces | / | / | / | / | / | ≥2g |
Table 4: Modified proteomics sample size requirements
| Specimen category | Phosphorylation Sample volume | Glycosylation Sample volume | Ubiquitination, Acylation, Methylation Sample volume |
|---|---|---|---|
| Surgical tissue | ≥30mg | ≥50mg | ≥75mg |
| Biopsy tissue | ≥50mg | ≥250mg | ≥75mg |
| FFPE tissue | ≥20 sections (5-10 μm thick, 50 mm² in area) | / | ≥20 sections (5-10 μm thick, 50 mm² in area) |
| Serum | ≥20μl | ≥20μl | ≥50μl |
| Plasma | ≥20μl | ≥20μl | ≥50μl |
| Urine | ≥25ml | ≥30ml | ≥25ml |
| Saliva | ≥200μl | ≥500μl | ≥200μl |
| Faeces | / | / | / |
Metabolomics
Metabolomics, a burgeoning "omics" field following genomics, transcriptomics, and proteomics, serves as an extension of the latter two, offering a more direct and precise reflection of an organism's physiological state. It constitutes a crucial component of systems biology. By investigating alterations in metabolites, metabolomics elucidates the roles and impacts of proteins within metabolic pathways, thereby reflecting the outcomes of gene expression and regulation.
The primary technical workflow in metabolomics research begins with the separation of target compounds from the sample. Subsequently, appropriate methods are employed for metabolite extraction; for instance, global metabolomics often utilizes a combined extraction approach for hydrophilic metabolites and lipids. Prior to instrumental analysis, quality control (QC) measures are implemented to ensure the accuracy and reliability of the instrument and the detection process. Following this, gas chromatography (GC) or liquid chromatography (LC) separation techniques are used to finely separate the complex sample mixture, providing distinct sample components for subsequent mass spectrometry (MS) analysis. Mass spectrometry, with its high sensitivity and specificity, then precisely identifies and quantifies each component. The acquired data undergoes filtering to eliminate invalid or interfering data, ensuring data quality and reliability, followed by qualitative and quantitative analysis.
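The data filtering step can be illustrated with one widely used criterion: discarding features whose relative standard deviation (RSD) across repeated injections of a pooled QC sample is too high. This is a sketch under stated assumptions; the 30% threshold, the function names, and the toy intensities are illustrative and not taken from this section.

```python
from statistics import mean, stdev


def qc_rsd(intensities):
    """Relative standard deviation (%) of a feature across QC injections."""
    m = mean(intensities)
    if m == 0:
        return float("inf")
    return stdev(intensities) / m * 100


def filter_features(qc_table, max_rsd=30.0):
    """Keep features that are reproducible in the QC injections.

    max_rsd=30.0 is an assumed threshold; laboratories choose their own.
    """
    return {name: values for name, values in qc_table.items()
            if qc_rsd(values) <= max_rsd}


# Toy QC intensities for two hypothetical features.
qc_table = {
    "feature_a": [100, 102, 98, 101],  # stable signal
    "feature_b": [100, 10, 250, 40],   # erratic signal
}
print(list(filter_features(qc_table)))  # ['feature_a']
```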
Table 5: Metabolomics sample size requirements
| Specimen category | Non-targeted metabolomics Sample volume | Targeted metabolomics Sample volume | Non-targeted lipid metabolomics Sample volume | Lipid Metabolomics Sample volume | Full-Spectrum Metabolome Sample volume |
|---|---|---|---|---|---|
| Surgical tissue | ≥20mg | ≥20mg | ≥20mg | ≥20mg | ≥20mg |
| Biopsy tissue | ≥20mg | ≥20mg | ≥20mg | ≥20mg | ≥20mg |
| Serum | ≥100μl | ≥300μl | ≥100μl | ≥300μl | ≥100μl |
| Plasma | ≥100μl | ≥300μl | ≥100μl | ≥300μl | ≥100μl |
| Urine | ≥100μl | ≥1ml | ≥100μl | ≥500μl | ≥100μl |
| Saliva | ≥100μl | ≥200μl | ≥100μl | ≥1ml | ≥100μl |
| Faeces | ≥50mg | ≥100mg | / | ≥200mg | ≥50mg |
Epigenomics
Epigenomics explores heritable modifications such as DNA methylation, chromatin accessibility, and RNA methylation that regulate gene expression without altering the DNA sequence, showing that genetic information is stored both in the sequence and in its modifications.
For DNA methylation, Whole-Genome Bisulfite Sequencing (WGBS) uses bisulfite to convert unmethylated cytosines to uracil, enabling single-base methylation mapping by high-throughput sequencing. The process includes DNA extraction, fragmentation, bisulfite conversion, and library preparation. Reduced Representation Bisulfite Sequencing (RRBS) enriches fragments containing CCGG sites by MspI digestion; after bisulfite treatment it covers about 12% of the genome's methylation sites (over 80% of promoter regions), which makes it well suited to clinical studies of CpG island methylation. Bisulfite Amplicon Sequencing (BSAS) uses targeted PCR amplification of specific genes or CpG islands, enabling absolute methylation quantification from as little as 1 ng of sample.
For chromatin accessibility and RNA methylation, ATAC-seq uses Tn5 transposase to cleave open chromatin regions and build sequencing libraries that map chromatin accessibility in the nucleus; it can be combined with RNA-seq to study gene regulatory networks. MeRIP-seq enriches m6A-methylated RNA fragments (around 100 nt) with m6A antibodies to profile the distribution of m6A modification across the transcriptome. ChIP-seq enriches DNA regions carrying histone modifications or bound by transcription factors through chromatin immunoprecipitation, detecting genome-wide interaction sites.
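The logic of bisulfite-based methods can be sketched in a few lines. The example below is a simplified single-strand model with hypothetical function names: unmethylated C is read as T after conversion and amplification, the methylation level at a site is the fraction of reads that still show C, and MspI digestion (as in RRBS) cuts each CCGG site between the two cytosines.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Unmethylated C becomes T (uracil is read as T); methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def methylation_level(c_reads: int, t_reads: int) -> float:
    """Fraction of reads reporting C (methylated) at one cytosine."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no coverage at this site")
    return c_reads / total


def mspi_fragments(seq: str):
    """Cut at every CCGG site between the first and second C (C^CGG)."""
    fragments, start = [], 0
    pos = seq.find("CCGG")
    while pos != -1:
        cut = pos + 1
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find("CCGG", pos + 1)
    fragments.append(seq[start:])
    return fragments


print(bisulfite_convert("ACGTCCGG", {1}))  # ACGTTTGG
print(methylation_level(15, 5))            # 0.75
print(mspi_fragments("AACCGGTTCCGGAA"))    # ['AAC', 'CGGTTC', 'CGGAA']
```

Real pipelines also handle the reverse strand, incomplete conversion, and fragment size selection, all of which this sketch omits.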
Table 6: Epigenomics sample size requirements
| Specimen category | WGBS Sample volume | RRBS Sample volume | BSAS Sample volume | ATAC-seq Sample volume | MeRIP-seq Sample volume | ChIP-seq Sample volume |
|---|---|---|---|---|---|---|
| Surgical tissue | ≥50mg | ≥50mg | ≥50mg | ≥50mg | ≥50mg | ≥50mg |
| Biopsy tissue | ≥50mg | ≥50mg | ≥50mg | ≥50mg | ≥50mg | ≥50mg |
| FFPE tissue | 5-10 sections, thickness 4-5 μm, 0.6 mm³ | ≥10 sections, thickness 5-10 μm, area 200 mm² | ≥10 sections, thickness 5-10 μm, area 200 mm² | / | / | / |
| Serum | / | ≥4ml | ≥4ml | / | / | / |
| Plasma | / | ≥4ml | ≥4ml | / | / | / |
| Whole Blood | ≥1ml | ≥2ml | ≥2ml | / | / | / |
| Faeces | ≥1ml | / | / | / | / | / |