Built-in presets

MiXCR provides a comprehensive list of built-in presets for many commercially available kits, data types, and library preparation protocols.

A preset can be used to run the whole upstream analysis pipeline with the analyze command. For example:

mixcr analyze <preset-name> \
      sample_R1.fastq.gz \
      sample_R2.fastq.gz \
      sample_result

The exportPreset command can help you understand the structure of a preset.

Below you can find a variety of presets for different types of input data and commercially available kits. Most of these presets do not require any additional arguments.
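When several paired-end samples share one preset, the analyze call shown above can be wrapped in a small loop. The following is a minimal sketch and not part of MiXCR itself: the analyze_cmd helper, the DRY_RUN switch, the sample names, and the <sample>_R1.fastq.gz / <sample>_R2.fastq.gz naming convention are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: print (or run) one `mixcr analyze` call per sample.
# Assumption: inputs are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz
# and sample names contain no spaces.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# analyze_cmd PRESET SAMPLE -> prints the command line for one sample
analyze_cmd() {
    printf 'mixcr analyze %s %s_R1.fastq.gz %s_R2.fastq.gz %s_result\n' \
        "$1" "$2" "$2" "$2"
}

preset="milab-human-rna-tcr-umi-multiplex"   # any preset name from this page
for sample in sampleA sampleB; do            # hypothetical sample names
    cmd=$(analyze_cmd "$preset" "$sample")
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        $cmd    # relies on word splitting, hence the no-spaces assumption
    fi
done
```

Printing the commands first makes it easy to review the preset and file names before any analysis is started.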

Kits

MiLaboratories

Human Ig RNA Multiplex

milab-human-rna-ig-umi-multiplex · Link · Code

Yields full-length IG heavy and light chain repertoires with UMI-based accuracy. Discriminates all IGH isotypes, including IgM, IgD, IgG3, IgG1, IgA1, IgG2, IgG4, IgE, and IgA2. By default, clones are assembled by {CDR1Begin:FR4End}. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

By default, clonotypes are separated by isotype; this can be changed with the --dont-split-clones-by C mix-in option.

Example:

mixcr analyze milab-human-rna-ig-umi-multiplex \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 
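The two options mentioned in the description can be combined with the preset on the command line. A minimal sketch, assuming a library sequenced with short reads where clones should be assembled by CDR3 and not split by isotype; the file names are placeholders and the command is only printed, not executed:

```shell
#!/bin/sh
# Sketch: compose the analyze call with both mix-in options from the text.
preset="milab-human-rna-ig-umi-multiplex"
opts="--assemble-clonotypes-by CDR3 --dont-split-clones-by C"

# Print the command for review; paste it into a shell to run it.
cmd="mixcr analyze $preset $opts input_R1.fastq.gz input_R2.fastq.gz result"
echo "$cmd"
```

The options are placed between the preset name and the input files, following the layout used in the other examples on this page.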

Human TCR RNA Multiplex

milab-human-rna-tcr-umi-multiplex · Link · Code · Tutorial

Yields human TCR alpha and beta CDR3 repertoires for different types of available RNA material, with high sensitivity and UMI-based accuracy.

Example:

mixcr analyze milab-human-rna-tcr-umi-multiplex \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

Human TCR RNA

milab-human-rna-tcr-umi-race · Link · Code

Yields unbiased TCR alpha and beta repertoires with UMI-based accuracy. By default, clones are assembled by VDJRegion. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze milab-human-rna-tcr-umi-race \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

Human TCR DNA Multiplex

milab-human-dna-tcr-multiplex · Link · Code

Yields TCR alpha and beta repertoires for different types of available DNA material, with the highest possible sensitivity. Clones are assembled by CDR3 sequence.

Example:

mixcr analyze milab-human-dna-tcr-multiplex \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

Human 7GENES DNA Multiplex

milab-human-dna-xcr-7genes-multiplex · Link · Code

Yields a comprehensive set of complete and incomplete IG/TCR rearrangements in one tube, from 0.1 ng to 150 ng of template DNA. Applicable to lymphoid malignancy clonality detection and MRD monitoring. Note that, for now, MiXCR reports only complete rearrangements; support for incomplete rearrangements will be added soon.

Example:

mixcr analyze milab-human-dna-xcr-7genes-multiplex \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Mouse TCR RNA

milab-mouse-rna-tcr-umi-race · Link · Code

The kit yields unbiased TCR alpha and beta repertoires with UMI-based accuracy. By default, clones are assembled by VDJRegion. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze milab-mouse-rna-tcr-umi-race \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

10xGenomics

10x Genomics single cell VDJ

10x-sc-xcr-vdj · Link · Code

Chromium Single Cell Immune Profiling provides a solution to your immunology questions. Analyze full-length V(D)J sequences for paired B-cell or T-cell receptors, all from a single cell. Note that the read lengths in the scheme below are shown according to the protocol recommendations, but the presets work regardless of sequencing read length.

The --species option is required.

Example:

mixcr analyze 10x-sc-xcr-vdj \
     --species hsa \
     sample_R1.fastq.gz \
     sample_R2.fastq.gz \
     sample_result

10x Genomics single cell 5' gene expression

10x-sc-5gex · Link · Code

These presets are specifically optimized to extract TCR and BCR repertoires from non-enriched single cell 5' RNA-seq cDNA libraries. By default the longest possible contig is assembled for every clone.

The --species option is required.

Example:

mixcr analyze 10x-sc-5gex \
     --species hsa \
     sample_R1.fastq.gz \
     sample_R2.fastq.gz \
     sample_result
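Because both 10x presets require --species, a wrapper script can check for it before building the command. The following is a sketch only: the tenx_cmd helper is a hypothetical name introduced here, and it prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: refuse to build a 10x analyze command when no species is given.
# tenx_cmd PRESET SPECIES R1 R2 OUT -> prints the command, or fails.
tenx_cmd() {
    if [ -z "$2" ]; then
        echo "error: --species is required for $1" >&2
        return 1
    fi
    printf 'mixcr analyze %s --species %s %s %s %s\n' "$1" "$2" "$3" "$4" "$5"
}

# Valid call: species code taken from the examples above.
tenx_cmd 10x-sc-xcr-vdj hsa sample_R1.fastq.gz sample_R2.fastq.gz sample_result

# Missing species: the helper reports an error and returns non-zero.
tenx_cmd 10x-sc-5gex "" sample_R1.fastq.gz sample_R2.fastq.gz sample_result \
    || echo "skipped: no species given"
```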

New England BioLabs

NEBNext® Immune Sequencing Kit (Human) BCR & TCR

neb-human-rna-xcr-umi-nebnext · Link · Code · Tutorial

With the NEBNext® Immune Sequencing Kit (Human), sequence the full-length immune gene repertoires of B cells and T cells. Profile somatic mutations across all relevant contexts (e.g., V, D, and J segments and isotypes IgM, IgD, IgG, IgA, and IgE) with improved sequence accuracy. Characterize BCR light, BCR heavy, TCRα and TCRβ chains. This kit includes UMIs for source-molecule identification. Depending on the repertoire being sequenced (TCR, BCR or both TCR and BCR) you can pick the dedicated preset. By default, clones are assembled by VDJRegion. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3. For BCR data, the mix-in option --dont-split-clones-by C can be used to avoid separating clones by isotype.

Example:

mixcr analyze neb-human-rna-xcr-umi-nebnext \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

NEBNext® Immune Sequencing Kit (Mouse) BCR & TCR

neb-mouse-rna-xcr-umi-nebnext · Link · Code

With the NEBNext® Immune Sequencing Kit (Mouse), sequence the full-length immune gene repertoires of B cells and T cells. Profile somatic mutations across all relevant contexts (e.g., V, D, and J segments and isotypes IgM, IgD, IgG, IgA, and IgE) with improved sequence accuracy. Characterize BCR light, BCR heavy, TCRα, TCRβ, TCRγ and TCRδ chains. This kit includes UMIs for source-molecule identification. Depending on the repertoire being sequenced (TCR, BCR or both TCR and BCR) you can pick the dedicated preset. By default, clones are assembled by VDJRegion. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3. For BCR data, the mix-in option --dont-split-clones-by C can be used to avoid separating clones by isotype.

Example:

mixcr analyze neb-mouse-rna-xcr-umi-nebnext \
      --dont-split-clones-by C \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

Cellecta

DriverMap Adaptive Immune Receptor (AIR) TCR-BCR Human RNA Profiling

cellecta-human-rna-xcr-umi-drivermap-air · Link · Code

Cellecta’s DriverMap™ AIR TCR-BCR Human RNA assay is designed to specifically amplify only functional CDR3 RNA molecules' TCR and BCR cells, avoiding non-functional pseudogenes with similar structures. The assay simultaneously amplifies, in a single, multiplex RT-PCR reaction, all TCR and BCR CDR3 regions using a set of 300 experimentally validated PCR primers to yield Illumina-compatible, next-generation sequencing (NGS) libraries.

Example:

mixcr analyze cellecta-human-rna-xcr-umi-drivermap-air \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

DriverMap Adaptive Immune Receptor (AIR) TCR-BCR Human DNA Profiling

cellecta-human-dna-xcr-umi-drivermap-air · Link · Code

Easy-to-run, single-day assay that uses multiplex PCR-NGS technology to profile T-cell receptor (TCR) or B-cell receptor (BCR) repertoire present in human DNA. The DriverMap AIR DNA assay measures the frequency of CDR3 clonotypes to quantify the clonal frequency of T and B cells. Assay primers contain unique molecular identifiers (UMIs) to enable accurate quantitation of each clonotype.

Example:

mixcr analyze cellecta-human-dna-xcr-umi-drivermap-air \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

DriverMap Adaptive Immune Receptor (AIR) TCR-BCR Full Length Human RNA Profiling

cellecta-human-rna-xcr-full-length-umi-drivermap-air · Link · Code

Cellecta’s DriverMap™ AIR TCR-BCR Full Length assay is designed to specifically amplify Full Length TCR and BCR RNA molecules' TCR and BCR cells. The assay simultaneously amplifies, in a single, multiplex RT-PCR reaction, all TCR and BCR using a set of experimentally validated PCR primers to yield Illumina-compatible, next-generation sequencing (NGS) libraries.

Example:

mixcr analyze cellecta-human-rna-xcr-full-length-umi-drivermap-air \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

DriverMap Adaptive Immune Receptor (AIR) TCR-BCR Mouse RNA Profiling

cellecta-mouse-rna-xcr-umi-drivermap-air · Link · Code

Cellecta’s DriverMap™ AIR TCR-BCR Mouse RNA assay is designed to specifically amplify only functional CDR3 RNA molecules' TCR and BCR cells, avoiding non-functional pseudogenes with similar structures. The assay simultaneously amplifies, in a single, multiplex RT-PCR reaction, all TCR and BCR CDR3 regions to yield Illumina-compatible, next-generation sequencing (NGS) libraries.

Example:

mixcr analyze cellecta-mouse-rna-xcr-umi-drivermap-air \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Illumina

AmpliSeq for Illumina Immune Repertoire Plus, TCR beta Panel

illumina-human-rna-trb-ampliseq-plus · Link · Code

AmpliSeq for Illumina Immune Repertoire Plus, TCR beta Panel is a highly multiplexed targeted resequencing panel to measure T cell diversity and clonal expansion by sequencing T cell receptor (TCR) beta chain rearrangements. It evaluates TCRβ chain rearrangements from RNA, including CDR1, CDR2, and CDR3 (with up to 400 bp read-length amplicons). By default, clones are assembled by {CDR1Begin:FR4End}. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze illumina-human-rna-trb-ampliseq-plus \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

AmpliSeq™ for Illumina® TCR beta-SR Panel

illumina-human-rna-trb-ampliseq-sr · Link · Code

Sequences TCR beta chain rearrangements, with up to 80 bp read-length amplicons for characterizing CDR3.

Example:

mixcr analyze illumina-human-rna-trb-ampliseq-sr \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

Qiagen

QIAseq™ Human TCR Panel Immune Repertoire RNA Library Kit

qiagen-human-rna-tcr-umi-qiaseq · Link · Code · Tutorial

The QIAseq Human TCR Panel Immune Repertoire RNA Library Kit uses Unique Molecular Indices (UMIs) with gene-specific primers to target specific RNAs for NGS sequencing. The Human T-cell Receptors Panel is used for sequencing the V(D)J region of the alpha, beta, delta and gamma genes, including the CDR3 regions. By default, clones are assembled by {CDR2Begin:FR4End}. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3. Conversely, if the library was sequenced with longer reads and the full-length receptor sequence is expected to be covered, the assembling feature can be adjusted accordingly: --assemble-clonotypes-by VDJRegion

Example:

mixcr analyze qiagen-human-rna-tcr-umi-qiaseq \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 
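The three assembling options described for this kit can be summarized in a small helper. This is a sketch under stated assumptions: the qiaseq_cmd function and its short/default/full labels are hypothetical names introduced here, and the command is printed rather than executed.

```shell
#!/bin/sh
# Sketch: pick the assembling feature for the QIAseq human TCR preset.
# qiaseq_cmd COVERAGE -> prints the matching analyze command.
#   short   : short reads, assemble by CDR3
#   default : preset default, {CDR2Begin:FR4End}
#   full    : longer reads covering the full receptor, assemble by VDJRegion
qiaseq_cmd() {
    case "$1" in
        short)   feature="--assemble-clonotypes-by CDR3 " ;;
        default) feature="" ;;
        full)    feature="--assemble-clonotypes-by VDJRegion " ;;
        *) echo "unknown coverage: $1" >&2; return 1 ;;
    esac
    printf 'mixcr analyze qiagen-human-rna-tcr-umi-qiaseq %s%s\n' \
        "$feature" "input_R1.fastq.gz input_R2.fastq.gz result"
}

qiaseq_cmd short
qiaseq_cmd full
```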

QIAseq™ Mouse TCR Panel Immune Repertoire RNA Library Kit

qiagen-mouse-rna-tcr-umi-qiaseq · Link · Code

The QIAseq Mouse TCR Panel Immune Repertoire RNA Library Kit uses Unique Molecular Indices (UMIs) with gene-specific primers to target specific RNAs for NGS sequencing. The Mouse T-cell Receptors Panel is used for sequencing the V(D)J region of the alpha, beta, delta and gamma genes, including the CDR3 regions.

By default, clones are assembled by {CDR2Begin:FR4End}. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3. Conversely, if the library was sequenced with longer reads and the full-length receptor sequence is expected to be covered, the assembling feature can be adjusted accordingly: --assemble-clonotypes-by VDJRegion

Example:

mixcr analyze qiagen-mouse-rna-tcr-umi-qiaseq \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

AbHelix

abhelix-human-rna-xcr · Link · Code · Tutorial

This kit allows identification of IgG1, IgG2, IgG3, IgG4, IgM, and IgA isotypes. Isotypes are separated prior to the final PCR reaction, so the resulting sequences do not cover enough of the C region. For this reason, the preset does not separate clones by C gene and assumes that different isotypes have already been separated into different samples. The kit also yields full-length sequences of TCR-alpha and TCR-beta V(D)J variable regions.

Example:

mixcr analyze abhelix-human-rna-xcr \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

By default, clones are assembled by {FR1Begin:FR4End}. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

iRepertoire

iRepertoire has multiple different primer systems that vary by the regions targeted, the desired read length, and the species.

iRepertoire’s RepSeq service (formerly AMP2Seq) utilizes arm-PCR technology, which uses hundreds of VDJ-specific primers in one reaction to semi-quantitatively and inclusively amplify all the expressed V(D)Js in B or T cells from a single sample.

Human RepSeq RNA Reagent System

irepertoire-human-rna-xcr-repseq-sr · irepertoire-human-rna-xcr-repseq-lr · Link · Code

Short-read (SR) RNA-compatible human primers (presets that end with sr) cover from within Framework-3 (FR3) into the Constant Region (C). These SR primers are compatible with 100/150 paired end read (PER) sequencing. Long-read (LR) primer systems (presets that end with lr) cover from within FR1 and continue through to the C region. iRepertoire’s LR primers are compatible with 250 PER sequencing. Note that by default the clones are assembled by the regions not covered by the primers (CDR3 for sr and {CDR1Begin:FR4End} for lr).

Example:

mixcr analyze irepertoire-human-rna-xcr-repseq-sr \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

Mouse RepSeq RNA Reagent System

irepertoire-mouse-rna-xcr-repseq-sr · irepertoire-mouse-rna-xcr-repseq-lr · Link · Code

Short-read (SR) RNA-compatible mouse primers (presets that end with sr) cover from within Framework-3 (FR3) into the Constant Region (C). These SR primers are compatible with 100/150 paired end read (PER) sequencing. Long-read (LR) primer systems (presets that end with lr) cover from within FR2 and continue through to the C region. iRepertoire’s LR primers are compatible with 250 PER sequencing. Note that by default the clones are assembled by the regions not covered by the primers (CDR3 for sr and {CDR2Begin:FR4End} for lr).

Example:

mixcr analyze irepertoire-mouse-rna-xcr-repseq-lr \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

RepSeq+ Reagent System for Human and Mouse

irepertoire-human-rna-xcr-repseq-lr · irepertoire-mouse-rna-xcr-repseq-lr · Link · Code

iRepertoire’s RepSeq+ service utilizes dam-PCR technology, which allows for any combination of TCR and BCR chains (TCR-alpha, TCR-beta, TCR-delta, TCR-gamma, BCR-IgHeavy, and BCR-kappa/lambda) to be amplified within a single reaction. The RepSeq+ service is available for human (BCR and TCR) and mouse (TCR only).

Example:

mixcr analyze irepertoire-mouse-rna-xcr-repseq-lr \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

RepSeq+ Human Reagent System with UMIs

irepertoire-human-rna-xcr-repseq-plus-umi-pe · irepertoire-human-rna-xcr-repseq-plus-umi-se · Link · Code

iRepertoire's RepSeq+ service integrates dam-PCR technology. This allows for the amplification of any combination of TCR and BCR chains (including TCR-alpha, TCR-beta, TCR-delta, TCR-gamma, BCR-IgHeavy, and BCR-kappa/lambda) within a single reaction, complemented by UMI-based error correction. For IGH the kit allows identification of the following isotypes: IGHM, IGHD, IGHA, IGHE, IGHG1-2, IGHG3-4.

The recommended protocol outlines two sequencing strategies:

Paired End sequencing, which spans all regions from CDR1 to FR4.

Single End sequencing, covering the CDR2 to FR4 regions.

TRG Coverage

For Single End sequencing, the TRG clones might only encompass the CDR3 and FR4 regions due to the extended length of the V-region. It may be beneficial to add --assemble-clonotypes-by CDR3 when analyzing this chain.

Example:

mixcr analyze irepertoire-human-rna-xcr-repseq-plus-umi-pe \
    input_R1.fastq.gz \
    input_R2.fastq.gz \
    result

Original raw RepSeq+ sequences contain sample barcodes. If your data is not demultiplexed and you want to use the barcode information to split the samples, you can do so using the command below:

mixcr analyze irepertoire-human-rna-xcr-repseq-plus-umi-pe \
    --tag-pattern "^(UMI:N{10})(SMPL:N{8})(R1:*)\^N{3}(R2:*)" \
    --sample-table sample_table.tsv \
    input_R1.fastq.gz \
    input_R2.fastq.gz \
    result

For single-end sequencing, use the tag pattern "^(UMI:N{10})(SMPL:N{8})(R1:*)".

The sample_table.tsv file looks like the example below (the TagPattern column is left empty):

Sample     TagPattern   SMPL
Sample1                 CAGCCCTA
Sample2                 GGCAATGT
...                     ...
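A sample table like the one above can be written from the shell. This is a minimal sketch that assumes the file is tab-separated (as the .tsv extension suggests) and that the TagPattern column stays empty; the two barcodes are the ones shown in the example.

```shell
#!/bin/sh
# Sketch: write a tab-separated sample table with an empty TagPattern column.
# Assumption: columns are separated by single tab characters.
table="sample_table.tsv"
{
    printf 'Sample\tTagPattern\tSMPL\n'
    printf 'Sample1\t\tCAGCCCTA\n'
    printf 'Sample2\t\tGGCAATGT\n'
} > "$table"

# Show the table with visible column separators for a quick check.
tr '\t' '|' < "$table"
```

Replacing tabs with a visible character makes an accidentally missing or doubled separator easy to spot before the table is passed to --sample-table.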

Human gDNA based RepSeq Reagent System

irepertoire-human-dna-xcr-repseq-sr · irepertoire-human-dna-xcr-repseq-lr · Link · Code

SR gDNA-compatible primers (SR-VJ) cover from within FR3 to the end of the J-gene. LR gDNA-compatible primers cover from within FR1 to the end of the J-gene. LR versions are available for both human TCR beta and human IgH. These can be sequenced as single-end reads on 300-cycle kits or, for full amplicon coverage, as 250 PER sequencing. By default, clones are assembled by {CDR1Begin:CDR3End}; if the library was sequenced as single-end reads, clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze irepertoire-human-dna-xcr-repseq-sr \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

RepSeq+ Mouse Reagent System with UMIs

irepertoire-mouse-rna-xcr-repseq-plus-umi-pe · Link · Code

iRepertoire's RepSeq+ service integrates dam-PCR technology. This allows for the amplification of any combination of TCR and BCR chains (including TCR-alpha, TCR-beta, TCR-delta, TCR-gamma, BCR-IgHeavy, and BCR-kappa/lambda) within a single reaction, complemented by UMI-based error correction. For IGH the kit allows identification of the following isotypes: IGHM, IGHD, IGHA, IGHE, IGHG1, IGHG2, IGHG3. The sequences should cover all regions from CDR2 to FR4.

Example:

mixcr analyze irepertoire-mouse-rna-xcr-repseq-plus-umi-pe \
    input_R1.fastq.gz \
    input_R2.fastq.gz \
    result

Original raw RepSeq+ sequences contain sample barcodes. If your data is not demultiplexed and you want to use the barcode information to split the samples, you can do so using the command below:

mixcr analyze irepertoire-mouse-rna-xcr-repseq-plus-umi-pe \
    --tag-pattern "^(UMI:N{10})(SMPL:N{8})(R1:*)\^N{3}(R2:*)" \
    --sample-table sample_table.tsv \
    input_R1.fastq.gz \
    input_R2.fastq.gz \
    result

The sample_table.tsv file looks like the example below (the TagPattern column is left empty):

Sample     TagPattern   SMPL
Sample1                 CAGCCCTA
Sample2                 GGCAATGT
...                     ...
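Since the tag pattern reads the sample barcode as (SMPL:N{8}), each barcode in the table should be 8 nucleotides long. The check below is a sketch, not a MiXCR feature: check_barcodes and demo_table.tsv are hypothetical names, and the table is assumed to be tab-separated with the barcode in the third column.

```shell
#!/bin/sh
# Sketch: check that every SMPL barcode in a sample table is 8 nt long,
# matching the (SMPL:N{8}) group of the tag pattern.
# check_barcodes FILE -> exit status 0 if all barcodes pass, 1 otherwise.
check_barcodes() {
    awk -F'\t' '
        NR == 1 { next }                      # skip the header row
        $3 !~ /^[ACGT]+$/ || length($3) != 8 {
            printf "line %d: bad barcode \"%s\"\n", NR, $3
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Demo table with one deliberately truncated barcode (7 nt) in the last row.
printf 'Sample\tTagPattern\tSMPL\nSample1\t\tCAGCCCTA\nSample2\t\tGGCAATG\n' > demo_table.tsv
check_barcodes demo_table.tsv || echo "sample table needs fixing"
```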

Thermo Fisher

Oncomine™ TCR Beta‑LR Assay

thermofisher-human-rna-trb-oncomine-lr · Link · Code

The Oncomine™ TCR Beta‑LR Assay is a highly sensitive, RNA-based NGS assay that enables the characterization of the T-cell receptor β (TCRβ) sequences, including all complementarity-determining regions (CDR1, 2, and 3) of the variable gene. The assay accurately measures T‑cell repertoire diversity and clonal expansion, and allows for identification of allele-specific polymorphisms in a wide array of sample types. By default, clones are assembled by {CDR1Begin:FR4End}. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze thermofisher-human-rna-trb-oncomine-lr \
      input.fastq.gz \
      result 

Oncomine™ TCR Beta‑SR Assay (RNA or DNA based)

thermofisher-human-rna-trb-oncomine-sr · thermofisher-human-dna-trb-oncomine-sr · Link · Code

The Oncomine™ TCR Beta‑SR Assay is a highly sensitive RNA- or DNA-based NGS assay that enables the characterization of the T-cell receptor β (TCRβ) complementarity-determining region 3 (CDR3) sequences of the TCRβ chain. The assay accurately measures T‑cell repertoire diversity and clonal expansion in a wide array of sample types, including those derived from FFPE-preserved or degraded material.

Example:

mixcr analyze thermofisher-human-rna-trb-oncomine-sr \
      input.fastq.gz \
      result 
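The SR assay has one preset per starting material, and the two names differ only in the rna/dna part. A small sketch of selecting the preset from the material; oncomine_trb_sr is a hypothetical helper name, and the final command is printed rather than executed:

```shell
#!/bin/sh
# Sketch: map the starting material to the Oncomine TCR Beta-SR preset.
# oncomine_trb_sr MATERIAL -> prints the preset name for "rna" or "dna".
oncomine_trb_sr() {
    case "$1" in
        rna|dna) echo "thermofisher-human-$1-trb-oncomine-sr" ;;
        *) echo "unknown material: $1 (expected rna or dna)" >&2; return 1 ;;
    esac
}

# Single-end input, as in the example above; the command is only printed.
preset=$(oncomine_trb_sr dna) || exit 1
echo "mixcr analyze $preset input.fastq.gz result"
```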

Oncomine™ BCR IGH‑LR Assay Kit

thermofisher-human-rna-igh-oncomine-lr · Link · Code

The Oncomine™ BCR IGH‑LR Assay is a highly sensitive, RNA-based NGS assay that enables the characterization of immunoglobulin heavy-chain sequences, including all complementarity-determining regions (CDR1, 2, and 3) and the CH1 domain of the constant gene. The assay accurately measures repertoire diversity and clonal expansion, allows for determination of B cell isotype (and subtype), and reports the level of somatic hypermutation in a wide array of sample types. By default, clones are assembled by {CDR1Begin:FR4End}. If needed (e.g. if the library was sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze thermofisher-human-rna-igh-oncomine-lr \
      input.fastq.gz \
      result 

Oncomine™ BCR IGH‑SR Assay (RNA or DNA based)

thermofisher-human-rna-igh-oncomine-sr · thermofisher-human-dna-igh-oncomine-sr · Link · Code

The Oncomine™ BCR IGH‑SR Assay is a highly sensitive RNA- or DNA-based NGS assay that enables the characterization of the immunoglobulin heavy chain complementarity-determining region 3 (CDR3) sequences. The assay accurately measures B cell repertoire diversity and clonal expansion in a wide array of sample types, including samples derived from FFPE-preserved or degraded material.

Example:

mixcr analyze thermofisher-human-dna-igh-oncomine-sr \
      input.fastq.gz \
      result 

Oncomine™ BCR Pan-Clonality Assay

thermofisher-human-dna-bcr-oncomine-pan-clonality · Link · Code

The Oncomine™ BCR Pan-Clonality Assay is a highly sensitive, single reaction, DNA-based NGS assay that enables the characterization of B cell heavy chain (IGH) and light chain (IgK and IgL) receptor sequences, including each complementarity-determining region 3 (CDR3). The assay accurately measures B cell repertoire metrics such as repertoire diversity and clonality in multiple receptors in a single reaction.

Example:

mixcr analyze thermofisher-human-dna-bcr-oncomine-pan-clonality \
      input.fastq.gz \
      result 

Oncomine™ IGH FR3(d)-J Assay

thermofisher-human-dna-igh-oncomine-fr3-j · Link · Code

The Oncomine™ IGH FR3(d)-J Assay is a highly sensitive, DNA-based NGS assay that enables the characterization of B cell heavy chain (IGH) sequences, including complementarity-determining region 3 (CDR3). The assay accurately measures B cell repertoire metrics such as repertoire diversity and clonality.

Example:

mixcr analyze thermofisher-human-dna-igh-oncomine-fr3-j \
      input.fastq.gz \
      result 

Oncomine™ IGH FR2-J Assay

thermofisher-human-dna-igh-oncomine-fr2-j · Link · Code

The Oncomine™ IGH FR2-J Assay is a highly sensitive, DNA-based NGS assay that enables the characterization of B cell heavy chain (IGH) sequences, including complementarity-determining region 3 (CDR3). The assay accurately measures B cell repertoire metrics such as repertoire diversity and clonality. It uses multiplex Ion AmpliSeq™ primers to target the FR2 region of the variable gene and the joining gene segment of IGH rearrangements in genomic DNA, producing 200–250 bp amplicons. Note that clonotypes in this preset are assembled by the {CDR2Begin:CDR3End} feature, because the primers are located in FR2 and FR4.

Example:

mixcr analyze thermofisher-human-dna-igh-oncomine-fr2-j \
      input.fastq.gz \
      result 

Oncomine™ IGH FR1-J Assay

thermofisher-human-dna-igh-oncomine-fr1-j · Link · Code

The Oncomine IGH FR1-J Assay is a targeted next-generation sequencing (NGS) assay for detection of somatic hypermutation (SHM) in the variable region of the immunoglobulin heavy chain of B-cell receptors. SHM increases the affinity of the B-cell receptor for antigen, post-VDJ recombination, and is frequently a region of interest in CLL research. Primers have been specifically designed to target framework 1 and joining gene regions of the immunoglobulin heavy chain, resulting in an amplicon ∼300 bp in length. Note that clonotypes in this preset are assembled by the {CDR1Begin:CDR3End} feature, because the primers are located in FR1 and FR4.

Example:

mixcr analyze thermofisher-human-dna-igh-oncomine-fr1-j \
      input.fastq.gz \
      result 

Oncomine™ IGHV Leader-J Assay

thermofisher-human-dna-igh-oncomine-v-leader-j · Link · Code

The Oncomine BCR IGHV Leader-J Assay is a targeted next-generation sequencing (NGS) assay for detection of somatic hypermutation (SHM) in the variable region of the immunoglobulin heavy chain of B-cell receptors. SHM increases the affinity of the B-cell receptor for antigen, post-VDJ recombination, and is used as a biomarker in chronic lymphocytic leukemia (CLL). Primers have been specifically designed to target the leader and joining regions of the immunoglobulin heavy chain, resulting in an amplicon ∼480 bp in length. Note that clonotypes in this preset are assembled by the {FR1Begin:CDR3End} feature, because the primers are located in the leader sequence and FR4.

Example:

mixcr analyze thermofisher-human-dna-igh-oncomine-v-leader-j \
      input.fastq.gz \
      result 

Oncomine™ TCR Pan-Clonality Assay

thermofisher-human-dna-tcr-oncomine-pan-clonality · Link · Code

The Oncomine TCR Pan-Clonality Assay is a targeted next-generation sequencing (NGS) assay for detection of T-cell diversity to measure clonal expansion/contraction with accuracy and for detection of rare clones with high sensitivity. Primers have been specifically designed to sequence the FR3-J regions of the TCR beta chain and TCR gamma chain. Through the use of Ion AmpliSeq technology, sequencing of the CDR3 region of the beta and gamma T-cell receptor chains occurs in a single reaction. When combined with Ion Reporter Software, data can be easily analyzed and reports created within a short period of time.

Example:

mixcr analyze thermofisher-human-dna-tcr-oncomine-pan-clonality \
      input.fastq.gz \
      result 

Ion AmpliSeq Mouse TCR Beta SR Assay

thermofisher-mouse-rna-tcb-ampliseq-sr · thermofisher-mouse-dna-tcb-ampliseq-sr · Link · Link · Code

The Ion AmpliSeq Mouse TCR Beta SR Assay is a robust, targeted next-generation sequencing (NGS) assay designed to accurately identify and measure the clonal expansion of T lymphocytes by targeting the complementarity-determining region 3 (CDR3) of the T-cell receptor (TCR) gene locus from gDNA input. The assay can be used for basic and translational research to identify T-cell clones since the nucleotide sequence of the CDR3 region is unique to each T cell and codes for the part of the TCR beta chain that is involved in antigen recognition. The kit comes in two options: for RNA and DNA starting material.

Example:

mixcr analyze thermofisher-mouse-rna-tcb-ampliseq-sr \
      input.fastq.gz \
      result

Ion AmpliSeq Mouse BCR IGH SR Assay, RNA

thermofisher-mouse-rna-igh-ampliseq-sr · thermofisher-mouse-dna-igh-ampliseq-sr · Link · Link · Code

The Ion AmpliSeq Mouse BCR IGH SR Assay, RNA/DNA, is a robust, targeted next-generation sequencing (NGS) assay for use in basic and translational immunology, immuno-oncology, hemato–oncology, and vaccine research. It is designed to accurately identify and measure the clonal expansion of B lymphocytes in blood, peripheral blood leukocytes (PBLs), peripheral blood mononuclear cells (PBMCs), and fresh-frozen (FF) and formalin-fixed paraffin-embedded (FFPE) samples. The assay identifies unique B-cell clones through targeting of the highly diverse complementarity-determining region 3 (CDR3) of the B-cell receptor (BCR) immunoglobulin heavy (IGH) chain from genomic DNA template . The nucleotide sequence of the IGH CDR3 region serves as a natural barcode to enable clone tracking and measurements of B-cell clonal expansion and diversity. Analysis of IGH CDR3-region amino acid motifs may reveal signatures of B-cell responses to defined antigens.

Example:

mixcr analyze thermofisher-mouse-dna-igh-ampliseq-sr \
      input.fastq.gz \
      result

Invivoscribe

LymphoTrack Dx TRG Assay Panel

invivoscribe-human-dna-trg-lymphotrack · Link · Code

The LymphoTrack Dx TRG Assay is an in vitro diagnostic product intended for next-generation sequencing (NGS) based determination of the frequency distribution of TRG gene rearrangements in patients suspected of having lymphoproliferative disease. This assay aids in the identification of lymphoproliferative disorders. Our single multiplex master mixes target all conserved regions within the variable (V) and the joining (J) region genes described in lymphoid malignancies. The preset works for both Illumina and Ion S5/PGM versions.

Example:

mixcr analyze invivoscribe-human-dna-trg-lymphotrack \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

LymphoTrack Dx TRB Assay Panel

invivoscribe-human-dna-trb-lymphotrack · Link · Code

The LymphoTrack Dx TRB Assay is an in vitro diagnostic product intended for next-generation sequencing (NGS) based determination of the frequency distribution of TRB gene rearrangements in patients suspected of having lymphoproliferative disease. This assay aids in the identification of lymphoproliferative disorders. Each single multiplex master mix for TRB targets the conserved regions within the Vβ and the Jβ regions described in lymphoid malignancies. The preset works for both Illumina and Ion S5/PGM versions.

Example:

mixcr analyze invivoscribe-human-dna-trb-lymphotrack \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

LymphoTrack Dx IGK Assay Panel

invivoscribe-human-dna-igk-lymphotrack · Link · Code

The LymphoTrack Dx IGK Assay is an in vitro diagnostic product intended for next-generation sequencing (NGS) based determination of the frequency distribution of IGK gene rearrangements in patients suspected of having lymphoproliferative disease. This assay aids in the identification of lymphoproliferative disorders. Our single multiplex master mix for IGK targets the Vк-Jк, the Vк-Kde, and the INTR-Kde gene rearrangements described in lymphoid malignancies.

The preset works for both Illumina and Ion S5/PGM versions.

Example:

mixcr analyze invivoscribe-human-dna-igk-lymphotrack \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

LymphoTrack Dx IGHV Leader Somatic Hypermutation Assay Panel

invivoscribe-human-dna-ighv-leader-lymphotrack · Link · Code

The LymphoTrack Dx IGHV Leader Somatic Hypermutation Assay for the Illumina MiSeq is an in vitro diagnostic product intended for next-generation sequencing (NGS) based determination of the frequency distribution of IGH gene rearrangements as well as the degree of somatic hypermutation of rearranged genes in patients suspected of having lymphoproliferative disease. This assay aids in the identification of lymphoproliferative disorders and in determining disease prognosis. Our single multiplex master mix targets the Leader (VHL) and the joining (J) gene regions of IGH.

Example:

mixcr analyze invivoscribe-human-dna-ighv-leader-lymphotrack \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

LymphoTrack Dx IGH FR1 Assay Kit Panel

invivoscribe-human-dna-igh-fr1-lymphotrack · Link · Code

The LymphoTrack Dx IGH FR1 Assay is an in vitro diagnostic product intended for next-generation sequencing (NGS) based determination of the frequency distribution of IGH gene rearrangements as well as the degree of somatic hypermutation of rearranged genes in patients suspected of having lymphoproliferative disease. This assay aids in the identification of lymphoproliferative disorders as well as providing an aid in determining disease prognosis. Our single multiplex master mix for IGH targets the conserved framework region 1 (FR1) within the VH and the JH regions described in lymphoid malignancies.

The preset works for both Illumina and Ion S5/PGM versions.

Example:

mixcr analyze invivoscribe-human-dna-igh-fr1-lymphotrack \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

LymphoTrack Dx IGH FR2 Assay Kit Panel

invivoscribe-human-dna-igh-fr2-lymphotrack · Link · Code

The LymphoTrack Dx IGH FR2 Assay is an in vitro diagnostic product intended for next-generation sequencing (NGS). The assay will determine the frequency distribution of IGH VH-JH gene rearrangements in patients suspected of having lymphoproliferative disease. This assay aids in the identification of lymphoproliferative disorders. Our single multiplex master mix for IGH targets the conserved framework region 2 (FR2) within the VH and the JH regions described in lymphoid malignancies. The preset works for both Illumina and Ion S5/PGM versions.

Example:

mixcr analyze invivoscribe-human-dna-igh-fr2-lymphotrack \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

LymphoTrack Dx IGH FR3 Assay Kit Panel

invivoscribe-human-dna-igh-fr3-lymphotrack · Link · Code

The LymphoTrack Dx IGH FR3 Assay is an in vitro diagnostic product intended for next-generation sequencing (NGS) for the Illumina MiSeq instrument. The assay will determine the frequency distribution of IGH VH-JH gene rearrangements in patients suspected of having lymphoproliferative disease. This assay aids in the identification of lymphoproliferative disorders. Our single multiplex master mix for IGH targets the conserved framework region 3 (FR3) within the VH and the JH regions described in lymphoid malignancies.

The preset works for both Illumina and Ion S5/PGM versions.

Example:

mixcr analyze invivoscribe-human-dna-igh-fr3-lymphotrack \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Takara Bio

SMART-Seq Human BCR (with UMIs)

takara-human-rna-bcr-umi-smartseq · Link · Code

SMART-Seq Human BCR Kit (with UMIs) provides a sensitive and reproducible solution for generating high-quality NGS libraries for profiling the human BCR repertoire. The kit leverages SMART (Switching Mechanism at 5' end of RNA Template) full-length cDNA synthesis technology and pairs NGS with a 5'-RACE approach to capture the complete V(D)J variable regions of all human B-cell receptor (BCR) heavy (IgG/M/D/A/E) and light (IgK/L) chains. By default, clones are assembled by VDJRegion; if needed (e.g. if the library has been sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3. The mix-in option --dont-split-clones-by C may be used to avoid separating clones by isotype.

Example:

mixcr analyze takara-human-rna-bcr-umi-smartseq \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 
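For a library sequenced with shorter reads, both mix-in options named above can be combined in a single call. The sketch below places them between the preset name and the input files, as the other examples on this page do with --species. The `run` wrapper is a hypothetical helper, not part of MiXCR: it only prints the command unless DRY_RUN=0 is set, so the command line can be checked before anything is executed. File names are placeholders.

```shell
# Hypothetical helper (not part of MiXCR): prints the command by default,
# executes it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Assemble clones by CDR3 only and do not split clones by isotype.
run mixcr analyze takara-human-rna-bcr-umi-smartseq \
    --assemble-clonotypes-by CDR3 \
    --dont-split-clones-by C \
    input_R1.fastq.gz \
    input_R2.fastq.gz \
    result
```

Running with DRY_RUN=0 in the environment executes the same command instead of printing it.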

SMART-Seq Human TCR (with UMIs)

takara-human-rna-tcr-umi-smartseq · Link · Code

SMART-Seq Human TCR (with UMIs) is powered by robust chemistry that provides unparalleled sensitivity and reproducibility. The kit leverages SMART (Switching Mechanism at 5' end of RNA Template) full-length cDNA synthesis technology and pairs NGS with a 5'-RACE approach to capture the complete V(D)J variable regions of TRA and TRB genes. By default, clones are assembled by VDJRegion; if needed (e.g. if the library has been sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze takara-human-rna-tcr-umi-smartseq \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

SMARTer Human BCR IgG IgM H/K/L Profiling Kit

takara-human-rna-bcr-umi-smarter · Link · Code · Tutorial

SMARTer Human BCR IgG IgM H/K/L Profiling Kit pairs 5' RACE with NGS technology to provide a sensitive, accurate, and optimized approach to BCR profiling from RNA input samples. The 5' RACE method reduces variability and allows for priming from the constant region of BCR heavy or light chains. This kit combines these benefits with gene-specific amplification to capture complete V(D)J variable regions of BCR transcripts and provide a highly sensitive and reproducible method for profiling B-cell repertoires. By default, clones are assembled by VDJRegion; if needed (e.g. if the library has been sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3. The mix-in option --dont-split-clones-by C may be used to avoid separating clones by isotype.

Example:

mixcr analyze takara-human-rna-bcr-umi-smarter \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

SMARTer Human TCR a/b Profiling Kit v2

takara-human-rna-tcr-umi-smarter-v2 · Link · Code

The SMARTer Human TCR a/b Profiling Kit v2 (TCRv2) is powered by robust chemistry that provides unparalleled sensitivity and reproducibility. The kit leverages SMART (Switching Mechanism at 5' end of RNA Template) full-length cDNA synthesis technology and pairs NGS with a 5'-RACE approach to capture the complete V(D)J variable regions of TRA and TRB genes. By default, clones are assembled by VDJRegion; if needed (e.g. if the library has been sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze takara-human-rna-tcr-umi-smarter-v2 \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

SMARTer Human TCR a/b Profiling Kit

takara-human-rna-tcr-smarter · Link · Code

The SMARTer Human TCR a/b Profiling Kit yields full-length sequences of the TCR-alpha and TCR-beta V(D)J variable regions. The -cdr3 preset may be used to reduce the clonotype assembling feature from the full V-D-J region to CDR3 only.

Example:

mixcr analyze takara-human-rna-tcr-smarter \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

SMART-Seq Mouse TCR (with UMIs)

takara-mouse-rna-tcr-umi-smarseq · Link · Code

SMART-Seq Mouse TCR (with UMIs), powered by SMART (Switching Mechanism at 5’ end of RNA Template) technology, enables detection of mouse TCR sequences with sensitivity and no bias. The kit combines NGS with a 5’-RACE approach to capture the full-length V(D)J variable regions of mouse TRA and TRB genes.

By default, clones are assembled by VDJRegion; if needed (e.g. if the library has been sequenced with a shorter read length, such as 150+150), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze takara-mouse-rna-tcr-umi-smarseq \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

SMARTer Mouse BCR IgG H/K/L Profiling Kit

takara-mouse-rna-bcr-smarter · Link · Code

The SMARTer Mouse BCR IgG H/K/L Profiling Kit pairs 5' RACE with NGS technology to provide a sensitive, accurate, and optimized approach to BCR profiling. The 5'-RACE method reduces variability and allows for priming from the constant region of BCR heavy or light chains. This kit combines these benefits with gene-specific amplification to capture complete V(D)J variable regions of BCR transcripts and provide a highly sensitive and reproducible method for profiling B-cell repertoires. By default, clones are assembled by VDJRegion; if needed (e.g. if the library has been sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze takara-mouse-rna-bcr-smarter \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

SMARTer Mouse TCR a/b Profiling Kit

takara-mouse-rna-tcr-smarter · Link · Code · Tutorial

The SMARTer Mouse TCR a/b Profiling Kit provides a powerful new solution for those seeking to perform T-cell receptor (TCR) repertoire analysis using NGS. The kit employs a 5'-RACE-based approach to capture complete V(D)J variable regions of TCR transcripts, starting from as little as 10 ng to 500 ng of total RNA obtained from mouse spleen, thymus, or PBMCs, or from 1,000 to 10,000 purified T cells. As the name suggests, the kit can be used to generate data for both TCR-alpha and TCR-beta chain diversity, either in the same experiment or separately. By default, clones are assembled by VDJRegion; if needed (e.g. if the library has been sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze takara-mouse-rna-tcr-smarter \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result 

BD Rhapsody

BD Rhapsody™ VDJ CDR3 Protocol

bd-human-sc-xcr-rhapsody-cdr3 · bd-mouse-sc-xcr-rhapsody-cdr3 · Link · Code

The BD Rhapsody™ VDJ CDR3 Protocol utilizes the existing BD Rhapsody™ Targeted Kits and the human/mouse immune response primer panel and is designed to work alongside the BD® AbSeq Assay and BD® Single-Cell Multiplexing Kit (SMK). Details about the individual reagents needed to run the assay are included in the respective protocols. A dedicated bioinformatics pipeline is also available for you to analyze sequencing data generated using the CDR3 protocol.

Example:

mixcr analyze bd-human-sc-xcr-rhapsody-cdr3 \
     sample_R1.fastq.gz \
     sample_R2.fastq.gz \
     sample_result

BD Rhapsody™ VDJ Full Length Protocol

bd-sc-xcr-rhapsody-full-length · Link · Code

The BD Rhapsody™ TCR/BCR Multiomic Assay helps you assemble the full length VDJ sequence for the T cell and B cell receptor. The --species option is required.

Example:

mixcr analyze bd-sc-xcr-rhapsody-full-length \
    --species human \
    sample_R1.fastq.gz \
    sample_R2.fastq.gz \
    sample_result

Oxford Nanopore Technologies

Full-length RNA-seq

generic-ont · generic-ont-with-umi · Link · Code

These presets are designed to handle long-read RNA-seq data obtained with Oxford Nanopore Technologies sequencers.

The --species option is required. The generic-ont-with-umi preset additionally requires a --tag-pattern that defines the UMI.

Example:

mixcr analyze generic-ont \
     --species hsa \
     sample.fastq.gz \
     sample_result

mixcr analyze generic-ont-with-umi \
     --species hsa \
     --tag-pattern "^(UMI:N{12})(R1:*)" \
     sample.fastq.gz \
     sample_result

Parse Biosciences

Parse Evercode™ single-cell

parsebio-sc-3gex-evercode-wt-mini · parsebio-sc-3gex-evercode-wt · parsebio-sc-3gex-evercode-wt-mega · Link · Code

Conventional droplet-based single-cell technologies struggle as cell or experiment sizes change. Parse makes it easy to scale your experiments regardless of cell size or sample type. The Evercode™ Whole Transcriptome technology, originally based on the split-pool combinatorial barcoding method published in Science and known widely as SPLiT-Seq, is accessible to any standard biology lab. Please note that as a 3'end RNA-seq based protocol it was not originally optimized for immune repertoire analysis, thus TCR/BCR yield might be low. Select the preset according to the kit used: Evercode WT, Evercode WT Mini or Evercode WT Mega.

The --species option is required.

Example:

mixcr analyze parsebio-sc-3gex-evercode-wt-mega \
     --species hsa \
     sample_R1.fastq.gz \
     sample_R2.fastq.gz \
     sample_result

Singleron

GEXSCOPE Single Cell V(D)J Kit

singleron-human-sc-xcr-gexscope-vdj · Link · Code

The GEXSCOPE® Single Cell V(D)J Kit enables simultaneous detection of T-cell or B-cell receptor variable region (CDR3) together with the whole transcriptome expression at single-cell level.

Example:

mixcr analyze singleron-human-sc-xcr-gexscope-vdj \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Public protocols

Biomed2

IGH primer set

biomed2-human-rna-igh · Publication · Code · Tutorial

Biomed2 FR1-FR4 human multiplex BCR primer set. By default, clones are assembled by {CDR1Begin:CDR3End}; if needed (e.g. if the library has been sequenced with a shorter read length), clones can be assembled by CDR3 by adding --assemble-clonotypes-by CDR3.

Example:

mixcr analyze biomed2-human-rna-igh \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

IGK/IGL primer set

biomed2-human-rna-igkl · Publication · Code

Biomed2 IGK/IGL human multiplex primer set. Clones are assembled by CDR3.

Example:

mixcr analyze biomed2-human-rna-igkl \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

TRB/TRD/TRG primer set

biomed2-human-rna-trbdg · Publication · Code

Biomed2 TRB/TRD/TRG human multiplex primer set. Clones are assembled by CDR3.

Example:

mixcr analyze biomed2-human-rna-trbdg \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Smart-Seq2

smart-seq2-vdj · Publication · Code

For Smart-Seq2, single cells are lysed in a buffer that contains free dNTPs and oligo(dT)-tailed oligonucleotides with a universal 5'-anchor sequence. RT is performed, which adds 2–5 untemplated nucleotides to the cDNA 3′ end. A template-switching oligo (TSO) is added, carrying 2 riboguanosines and a modified guanosine to produce an LNA as the last base at the 3′ end. After the first-strand reaction, the cDNA is amplified using a limited number of cycles. Next, tagmentation is used to construct sequencing libraries quickly and efficiently from the amplified cDNA.

Note that the Smart-Seq2 protocol implies that sequences are separated by cell into different FASTQ file pairs according to PCR plate row and column. The example below shows how to provide cell ID (row and column) information using filename patterns that contain Illumina indices. However, any text that is unique for each row-column pair can be used in place of the indices.

Example:

> ls

161014_SN163_0729_AH3VW7BCXY_L1_TAAGGCGA-CTCTCTAT_R1.fastq.gz
161014_SN163_0729_AH3VW7BCXY_L1_TAAGGCGA-CTCTCTAT_R2.fastq.gz
161014_SN163_0729_AH3VW7BCXY_L1_TAAGCTTA-GTGTTAAG_R1.fastq.gz
161014_SN163_0729_AH3VW7BCXY_L1_TAAGCTTA-GTGTTAAG_R2.fastq.gz

mixcr analyze smart-seq2-vdj \
    --species hsa \
    161014_SN163_0729_AH3VW7BCXY_L1_{{CELL0ROW:a}}-{{CELL0COL:a}}_R1.fastq.gz \
    161014_SN163_0729_AH3VW7BCXY_L1_{{CELL0ROW:a}}-{{CELL0COL:a}}_R2.fastq.gz \
    result
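To make the mapping concrete, the sketch below derives the two index strings from a file name the way the filename pattern above is meant to capture them: the text before the hyphen as the row tag and the text after it as the column tag. It only illustrates the naming convention with plain shell string operations. `cell_id_from_name` is a hypothetical helper and says nothing about how MiXCR parses file names internally.

```shell
# Print "<row-index> <column-index>" for a FASTQ name shaped like
# <prefix>_<INDEX1>-<INDEX2>_R1.fastq.gz (layout taken from the listing above).
cell_id_from_name() {
    name=${1##*/}                 # drop any directory part
    name=${name%_R[12].fastq.gz}  # drop the read suffix
    pair=${name##*_}              # last "_"-separated field: INDEX1-INDEX2
    echo "${pair%%-*} ${pair#*-}"
}

cell_id_from_name 161014_SN163_0729_AH3VW7BCXY_L1_TAAGGCGA-CTCTCTAT_R1.fastq.gz
```

Both files of a pair yield the same row and column values, which is what allows the R1 and R2 reads of one well to be assigned to the same cell.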

SPLiT-seq

split-seq-3gex · Publication · Code

SPLiT-seq uses combinatorial indexing to identify single cells without single-cell isolation. Multi-level indexing can be performed by ligation. Please note that, as a 3'-end RNA-seq based protocol, it was not originally optimized for immune repertoire analysis, so TCR/BCR yield might be low.

Example:

mixcr analyze split-seq-3gex \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

FLAIRR-seq

flairr-seq-bcr · Publication · Code

The FLAIRR-seq method employs 5′ RACE for targeted amplification, and when combined with single-molecule, real-time sequencing, it achieves a remarkable accuracy (99.99%) in generating human Ab H chain transcripts. This technique offers a full-length approach to immunoglobulin repertoire profiling. The preset processes raw data in line with the methodology outlined in the original publication and includes the identification of Human IgM and IgG isotypes. The initial protocol incorporates 16bp-long sample barcodes intended for multiplexing. Below, we provide an example of a sample_table.tsv file suitable for demultiplexing.

Other chains or species

If you employ primers different from those specified in the original publication to capture other chains (e.g., TRA, TRB, IGK, or IGL) or if your data originates from a species other than humans, you'll need to adjust the --tag-pattern in the preset accordingly.

Example:

mixcr analyze flairr-seq-bcr \
    --sample-table sample_table.tsv \
      input.fastq.gz \
      result

Where sample_table.tsv looks like the example below:

Sample     TagPattern    SMPL
Sample1                  CAGCCCTACAGCCCTA
Sample2                  GGCAATGTGGCAATGT
...                      ...
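As a sketch, a table like the one above could be written from the shell as follows. Two assumptions are made here and should be checked against the sample table reference before relying on the result: that the file is tab-separated, and that the 16-nt barcodes are values of the SMPL column while the TagPattern column stays empty. `write_sample_table` is a hypothetical helper name.

```shell
# Write a tab-separated sample table with columns Sample, TagPattern, SMPL.
# Assumption: barcodes go into SMPL and TagPattern is left empty.
write_sample_table() {
    out=$1
    printf 'Sample\tTagPattern\tSMPL\n' > "$out"
    # The format string is reused for every pair of arguments: name, barcode.
    printf '%s\t\t%s\n' \
        Sample1 CAGCCCTACAGCCCTA \
        Sample2 GGCAATGTGGCAATGT >> "$out"
}

write_sample_table sample_table.tsv
```

Writing the file with printf rather than an editor keeps the separators as real tab characters, which are easy to lose when a table is copied from a web page.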

Seq-Well VDJ

seq-well-vdj · Publication · Code

The seq-well-vdj preset is tailored for Seq-Well data adapted for targeted TCR immune profiling, as outlined in the original publication. In this protocol, both the CELL and UMI barcodes are situated in the Illumina index read, while R1 serves as the payload, specifically covering the CDR3 region.

Example:

mixcr analyze seq-well-vdj \
      input_R1.fastq.gz \
      input_I1.fastq.gz \
      result

Mikelov et al, 2021

mikelov-et-al-2021 · Publication · Code

Vergani et al, 2017

vergani-et-al-2017-cdr3 · vergani-et-al-2017-full-length · Publication · Code

Example:

mixcr analyze vergani-et-al-2017-full-length \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Han et al, 2014

han-et-al-2014-tcr · han-et-al-2014-bcr · Publication · Code

These presets are optimized for a single-cell use of the protocol when each plate well contains a single cell.

Example:

mixcr analyze han-et-al-2014-bcr \
      --species hsa \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Generic data

RNA-Seq data

rna-seq · Publication · Code

Non-enriched fragmented (shotgun) RNA-Seq data. By default MiXCR runs consensus contig assembly to reconstruct all available parts of V-D-J-C receptor rearrangement sequence.

The --species option is required.

Example:

mixcr analyze rna-seq \
    --species hsa \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Exome data

exome-seq · Code

Non-enriched fragmented (shotgun) Exome-Seq data. By default MiXCR runs consensus contig assembly to reconstruct all available parts of V-D-J receptor rearrangement sequence.

The --species option is required.

Example:

mixcr analyze exome-seq \
    --species hsa \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Generic amplicon

generic-amplicon · Code · Tutorial

Generic TCR/BCR amplicon library. Required configs that must be specified with corresponding mix-in options:

Species;

Material type;

Left alignment boundary (5'-end);

Right alignment boundary (3'-end).

The following example runs upstream analysis for some bulk mouse 5'RACE RNA library with 3'-end primers located on C-gene:

mixcr analyze generic-amplicon \
    --species mmu \
    --rna \
    --rigid-left-alignment-boundary \
    --floating-right-alignment-boundary C \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

The following mix-in options are used:

--species mmu
specify Mus Musculus species
--rna
set RNA as starting material (exon regions only will be used for alignments)
--rigid-left-alignment-boundary
use global left alignment boundary (5'RACE)
--floating-right-alignment-boundary C
use local right alignment boundary on C-segment as C-primers are used

By default, clones are assembled by CDR3. If the reads cover a longer part of the receptor, this behavior can be changed by adding --assemble-clonotypes-by VDJRegion.

Generic amplicon with UMI barcodes

generic-amplicon-with-umi · Code · Tutorial

Generic TCR/BCR amplicon library with UMI barcodes. Required configs that must be specified with corresponding mix-in options:

Species;

Material type;

Tag pattern;

Left alignment boundary (5'-end);

Right alignment boundary (3'-end).

The following example runs upstream analysis for some bulk mouse 5'RACE RNA library with 3'-end primers located on C-gene:

mixcr analyze generic-amplicon-with-umi \
    --species mmu \
    --rna \
    --tag-pattern "^(R1:*) \ ^(UMI:N{12})GTAC(R2:*)" \
    --rigid-left-alignment-boundary \
    --floating-right-alignment-boundary C \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

The following mix-in options are used:

--species mmu
specify Mus Musculus species
--rna
set RNA as starting material (exon regions only will be used for alignments)
--tag-pattern "^(R1:*) \ ^(UMI:N{12})GTAC(R2:*)"
UMI barcode pattern
--rigid-left-alignment-boundary
use global left alignment boundary (5'RACE)
--floating-right-alignment-boundary C
use local right alignment boundary on C-segment as C-primers are used

By default, clones are assembled by CDR3. If the reads cover a longer part of the receptor, this behavior can be changed by adding --assemble-clonotypes-by VDJRegion.
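To see what the tag pattern in this example asks for, the sketch below mimics its R2 half, ^(UMI:N{12})GTAC(R2:*), with an ordinary regular expression: the first 12 bases become the UMI, the literal GTAC must follow, and the rest of the read is kept as R2. It is an illustration only. `split_umi_read` is a hypothetical helper, and the regular expression is an approximation, not MiXCR's pattern engine.

```shell
# Split one R2 read into UMI and payload, or print nothing if the read
# does not start with 12 bases followed by the literal GTAC.
split_umi_read() {
    printf '%s\n' "$1" |
        sed -n -E 's/^([ACGTN]{12})GTAC(.*)$/UMI=\1 R2=\2/p'
}

split_umi_read AAAACCCCGGGGGTACTTTTACGT
```

A read without GTAC at positions 13 to 16 produces no output in this sketch, which corresponds to a read that does not match the pattern.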

Generic low throughput amplicon single-cell protocols

generic-lt-single-cell-amplicon · generic-lt-single-cell-amplicon-with-umi · Code

These presets are appropriate for targeted low throughput amplicon single-cell protocols (for instance, plate-based amplicon single-cell workflows) with or without Unique Molecular Identifiers (UMIs). The required configuration parameters are:

Species;

Material type;

Left alignment boundary (5'-end);

Right alignment boundary (3'-end).

For these presets, the CELL barcode must be defined either through --tag-pattern (if the CELL barcode is embedded within the sequence) or using the sample name if each cell (well) corresponds to a separate pair of FASTQ files.

mixcr analyze generic-lt-single-cell-amplicon \
    --species mmu \
    --rna \
    --rigid-left-alignment-boundary \
    --floating-right-alignment-boundary C \
    --tag-pattern "^(R1:*) \ ^(CELL:N{8})GTAC(R2:*)" \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

mixcr analyze generic-lt-single-cell-amplicon-with-umi \
    --species mmu \
    --rna \
    --rigid-left-alignment-boundary \
    --floating-right-alignment-boundary C \
    --tag-pattern "^NN(CELL3PLATE:N{5})ga(CELL1ROW:N{5})(R1:*) \ ^NN(CELL2COLUMN:N{5})(UMI:N{14})(R2:*)" \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

# If each pair of fastq files represents a different cell (e.g. A1,A2,A3 ... H12).

> ls
    input_sample1_A1_R1.fastq.gz
    input_sample1_A1_R2.fastq.gz
    input_sample1_A2_R1.fastq.gz
    input_sample1_A2_R2.fastq.gz
    input_sample1_A3_R1.fastq.gz
    input_sample1_A3_R2.fastq.gz
    input_sample1_A4_R1.fastq.gz
    input_sample1_A4_R2.fastq.gz
    ...

mixcr analyze generic-lt-single-cell-amplicon \
    --species mmu \
    --rna \
    --rigid-left-alignment-boundary \
    --floating-right-alignment-boundary C \
      input_sample1_{CELL:a}_R1.fastq.gz \
      input_sample1_{CELL:a}_R2.fastq.gz \
      result

Note: Cell barcodes should begin with CELL (e.g., CELL1, CELL2, CELL, CELL3PLATE, CELL2COLUMN, etc.). For the generic-lt-single-cell-amplicon-with-umi preset, a tag-pattern containing UMI (CELL barcodes can still be passed through filenames) is required and can be provided either through the --tag-pattern parameter or a sample table.

The following mix-in options are used:

--species mmu
specify Mus Musculus species
--rna
set RNA as starting material (exon regions only will be used for alignments)
--rigid-left-alignment-boundary
use global left alignment boundary (5'RACE)
--floating-right-alignment-boundary C
use local right alignment boundary on C-segment as C-primers are used

By default, clones are assembled by CDR3. However, if necessary, this behavior can be changed by adding --assemble-clonotypes-by VDJRegion if the longer receptor part is covered by the reads.

Filters

By default, each single-cell preset includes a series of filters that are consecutively applied to the data.

generic-lt-single-cell-amplicon
  1. In the assemble phase:
    • For every cell, and for each chain (TRAD/TRB/TRG/IGH/IGL), only those clones are retained whose cumulative frequency is 95% or more, as measured by the number of reads (containing 'CDR3').
generic-lt-single-cell-amplicon-with-umi
  1. In the refineTagsAndSort phase:
    • Filters out UMIs covered by fewer than the threshold number of reads. This threshold is automatically determined using Otsu's method. If an automatically determined threshold eliminates more than 85% of reads, an adjusted threshold is applied to preserve at least 85% of the reads.
  2. In the assemble phase:
    • For each cell and for each chain (TRAD/TRB/TRG/IGH/IGL), only those clones are retained whose cumulative frequency is 95% or more, as measured by the number of UMIs.
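The cumulative-frequency filter can be pictured with a small sketch. It follows one plausible reading of the rule above: within one cell and one chain, clones are ranked by count and kept from the top until their combined share first reaches 95%, and the remaining tail is dropped. The input format (clone ID and count separated by a tab) and the name `top_clones_95` are assumptions made for the illustration, and this is not MiXCR's implementation.

```shell
# Input: lines "clone_id<TAB>count" for one cell and one chain.
# Output: the most abundant clones whose counts together first reach
# at least 95% of the total.
top_clones_95() {
    sort -t "$(printf '\t')" -k2,2nr |
    awk -F '\t' '
        { id[NR] = $1; n[NR] = $2; total += $2 }
        END {
            for (i = 1; i <= NR; i++) {
                print id[i] "\t" n[i]
                cum += n[i]
                # integer form of cum / total >= 0.95
                if (cum * 100 >= 95 * total) break
            }
        }'
}

printf 'C\t3\nA\t90\nD\t1\nB\t6\n' | top_clones_95
```

With counts of 90, 6, 3 and 1, the first two clones together account for 96% of the total, so only A and B are printed.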

Generic low throughput shotgun single-cell protocols

generic-lt-single-cell-fragmented · generic-lt-single-cell-fragmented-with-umi · Code

These presets are suitable for low throughput shotgun (fragmented) single-cell protocols (e.g. plate-based single-cell RNA-seq workflows) with or without Unique Molecular Identifiers (UMIs).

The --species option is required.

For these presets, the CELL barcode must be defined either through --tag-pattern (if the CELL barcode is embedded within the sequence) or using the sample name if each cell (well) corresponds to a separate pair of FASTQ files. See the examples below:

mixcr analyze generic-lt-single-cell-fragmented \
    --species mmu \
    --tag-pattern "^(R1:*) \ ^(CELL:N{8})GTAC(R2:*)" \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

mixcr analyze generic-lt-single-cell-fragmented-with-umi \
    --species mmu \
    --tag-pattern "^NN(CELL3PLATE:N{5})ga(CELL1ROW:N{5})(R1:*) \ ^NN(CELL2COLUMN:N{5})(UMI:N{14})(R2:*)" \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

# If each pair of fastq files represents a different cell (e.g. A1,A2,A3 ... H12).

> ls
    input_sample1_A1_R1.fastq.gz
    input_sample1_A1_R2.fastq.gz
    input_sample1_A2_R1.fastq.gz
    input_sample1_A2_R2.fastq.gz
    input_sample1_A3_R1.fastq.gz
    input_sample1_A3_R2.fastq.gz
    input_sample1_A4_R1.fastq.gz
    input_sample1_A4_R2.fastq.gz
    ...

mixcr analyze generic-lt-single-cell-fragmented \
    --species mmu \
      input_sample1_{CELL:a}_R1.fastq.gz \
      input_sample1_{CELL:a}_R2.fastq.gz \
      result

Note that cell barcodes have to start with CELL (e.g. CELL1, CELL2, CELL, CELL3PLATE, CELL2COLUMN, etc.). For generic-lt-single-cell-fragmented-with-umi, a tag-pattern containing UMI (CELL barcodes can still be passed through filenames) is required and can be provided either through the --tag-pattern parameter or a sample table.

The following mix-in options are used:

--species mmu
specify Mus Musculus species

By default the clones are first assembled by CDR3 and then extended to the longest possible contig with mixcr assembleContigs.

Filters

By default each single-cell preset includes a set of filters that are applied to the data.

generic-lt-single-cell-fragmented
  1. In the assemble phase:
    • For every cell, and for each chain (TRAD/TRB/TRG/IGH/IGL), only those clones are retained whose cumulative frequency is 95% or more, as measured by the number of reads (containing 'CDR3').
generic-lt-single-cell-fragmented-with-umi
  1. In the refineTagsAndSort phase:

    • Filters out UMIs covered by fewer than the threshold number of reads. This threshold is automatically determined using Otsu's method. If an automatically determined threshold eliminates more than 85% of reads, an adjusted threshold is applied to preserve at least 85% of the reads.
  2. In the assemble phase:

    • For each cell and for each chain (TRAD/TRB/TRG/IGH/IGL), only those clones are retained whose cumulative frequency is 95% or more, as measured by the number of UMIs.

Generic high throughput amplicon single-cell protocols

generic-ht-single-cell-amplicon · generic-ht-single-cell-amplicon-with-umi · Code

These presets are suitable for targeted high throughput amplicon single-cell protocols (thousands of cells and more, not plate-based) with or without Unique Molecular Identifiers (UMIs). The required configuration parameters, which must be specified with the corresponding mix-in options, are:

Species;

Material type;

Left alignment boundary (5'-end);

Right alignment boundary (3'-end).

For these presets, the CELL barcode must be set through --tag-pattern.

See examples below:

mixcr analyze generic-ht-single-cell-amplicon \
    --species mmu \
    --rna \
    --rigid-left-alignment-boundary \
    --floating-right-alignment-boundary C \
    --tag-pattern "^(R1:*) \ ^(CELL:N{8})GTAC(R2:*)" \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

mixcr analyze generic-ht-single-cell-amplicon-with-umi \
    --species mmu \
    --rna \
    --rigid-left-alignment-boundary \
    --floating-right-alignment-boundary C \
    --tag-pattern "^(R1:*) \ ^(CELL:N{8})(UMI:N{14})(R2:*)" \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Note that cell barcodes have to start with CELL (e.g. CELL1, CELL2, CELL, etc.). For generic-ht-single-cell-amplicon-with-umi a tag-pattern containing both CELL and UMI is required and can be provided either through --tag-pattern parameter or a sample table.

The following mix-in options are used:

--species mmu
specify Mus Musculus species
--rna
set RNA as starting material (exon regions only will be used for alignments)
--rigid-left-alignment-boundary
use global left alignment boundary (5'RACE)
--floating-right-alignment-boundary C
use local right alignment boundary on C-segment as C-primers are used

By default, clones are assembled by CDR3. If the reads cover a longer part of the receptor, this behavior can be changed by adding --assemble-clonotypes-by VDJRegion.

Filters

By default each single-cell preset includes a set of filters that are applied to the data.

generic-ht-single-cell-amplicon
  1. In the refineTagsAndSort phase:

    • Filters out cells whose CELL barcode is covered by fewer than the threshold number of reads. The threshold is determined automatically using Otsu's method.
  2. In the assemble phase:

    • For every cell, and for each chain (TRAD/TRB/TRG/IGH/IGL), only those clones are retained whose cumulative frequency is 95% or more, as measured by the number of reads (containing 'CDR3').
generic-ht-single-cell-amplicon-with-umi
  1. In the refineTagsAndSort phase:

    • Filters out UMIs covered by fewer than the threshold number of reads. This threshold is automatically determined using Otsu's method. If an automatically determined threshold eliminates more than 85% of reads, an adjusted threshold is applied to preserve at least 85% of the reads.

    • Filters out cells whose CELL barcode is covered by fewer than the threshold number of UMIs. The threshold is determined automatically using Otsu's method.

  2. In the assemble phase:

    • For each cell and for each chain (TRAD/TRB/TRG/IGH/IGL), only those clones are retained whose cumulative frequency is 95% or more, as measured by the number of UMIs.
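Otsu's method, used by the UMI and cell filters above, picks the cut-off that best separates a distribution into two groups by maximizing the between-class variance. The sketch below applies the textbook form to a plain list of counts (reads per UMI, or UMIs per cell). It is an illustration under stated assumptions: how MiXCR bins or transforms the counts is not described here, the 85% safeguard is not modeled, and `otsu_threshold` is a hypothetical helper name.

```shell
# Input: one count per line. Output: the smallest count that is kept,
# i.e. the lower bound of the upper class at the best two-class split.
otsu_threshold() {
    sort -n |
    awk '
        { x[NR] = $1; sum += $1 }
        END {
            best = -1; thr = x[1]; lo_sum = 0
            for (i = 1; i < NR; i++) {
                lo_sum += x[i]
                if (x[i + 1] == x[i]) continue   # split only between distinct values
                w0 = i / NR; w1 = 1 - w0         # class weights
                m0 = lo_sum / i                  # mean of the lower class
                m1 = (sum - lo_sum) / (NR - i)   # mean of the upper class
                v = w0 * w1 * (m0 - m1) * (m0 - m1)
                if (v > best) { best = v; thr = x[i + 1] }
            }
            print thr
        }'
}

printf '%s\n' 1 1 2 1 2 40 50 45 | otsu_threshold
```

For the counts in the usage line, the split between the low group (1, 1, 1, 2, 2) and the high group (40, 45, 50) gives the largest between-class variance, so the function prints 40 and everything below that count would be filtered out.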

Generic high throughput shotgun single-cell protocols

generic-ht-single-cell-fragmented · generic-ht-single-cell-fragmented-with-umi · Code

These presets are suitable for high throughput (thousands of cells and more, not plate-based) shotgun (fragmented) single-cell protocols (e.g. 10x-like protocols that include fragmentation, random priming, etc.) with or without Unique Molecular Identifiers (UMIs).

The --species option is required.

For these presets, the CELL barcode must be set with --tag-pattern.

See examples below:

mixcr analyze generic-ht-single-cell-fragmented \
    --species mmu \
    --tag-pattern "^(R1:*) \ ^(CELL:N{8})(R2:*)" \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

mixcr analyze generic-ht-single-cell-fragmented-with-umi \
    --species mmu \
    --tag-pattern "^(CELL:N{16})(UMI:N{10})\^(R2:*)" \
      input_R1.fastq.gz \
      input_R2.fastq.gz \
      result

Note that cell barcodes have to start with CELL (e.g. CELL1, CELL2, CELL, CELL3PLATE, CELL2COLUMN etc.). For generic-ht-single-cell-fragmented-with-umi a tag-pattern containing UMI is required and can be provided either through --tag-pattern parameter or a sample table.

The following mix-in options are used:

--species mmu
specify Mus Musculus species

By default the clones are first assembled by CDR3 and then extended to the longest possible contig with mixcr assembleContigs.

Filters

By default each single-cell preset includes a set of filters that are applied to the data.

generic-ht-single-cell-fragmented
  1. In the refineTagsAndSort phase:

    • Filters out cells whose CELL barcode is covered by fewer than the threshold number of reads. The threshold is determined automatically using Otsu's method.
  2. In the assemble phase:

    • For every cell, and for each chain (TRAD/TRB/TRG/IGH/IGL), only those clones are retained whose cumulative frequency is 95% or more, as measured by the number of reads (containing 'CDR3').
generic-ht-single-cell-fragmented-with-umi
  1. In the refineTagsAndSort phase:

    • Filters out UMIs covered by fewer than the threshold number of reads. This threshold is automatically determined using Otsu's method. If an automatically determined threshold eliminates more than 85% of reads, an adjusted threshold is applied to preserve at least 85% of the reads.

    • Filters out cells whose CELL barcode is covered by fewer than the threshold number of UMIs. The threshold is determined automatically using Otsu's method.

  2. In the assemble phase:

    • For each cell and for each chain (TRAD/TRB/TRG/IGH/IGL), only those clones are retained whose cumulative frequency is 95% or more, as measured by the number of UMIs.