Commits on Source (3)
STAR 2.7.1a 2019/05/15
======================
**This version requires re-generation of the genome indexes**
* Implemented --soloFeatures GeneFull which counts reads overlapping full genes, i.e. includes reads that overlap introns. This can be combined with other features, e.g. --soloFeatures Gene SJ GeneFull .
* Implemented --soloCBwhitelist None option for solo* demultiplexing without CB whitelist. In this case error correction for CBs is not performed.
* Implemented Cell Barcodes longer than 16 bases (but shorter than 31 bases). Many thanks to Gert Hulselmans for implementing this feature (#588).
* Implemented collapsing of duplicate cell barcodes in the whitelist.
* Implemented --sjdbGTFtagExonParentGeneName and --sjdbGTFtagExonParentGeneType options to load gene name and biotype attributes from the GTF file.
* Fixed problems created by missing gene/transcript ID, name and biotype attributes in GTF files (issues #613, #628).
* Added warning for incorrectly scaled --genomeSAindexNbases parameter (issue #614).
* Added numbers of unmapped reads to the Log.final.out file (pull #622).
* Fixed a problem which may cause seg-faults for reads with many blocks (issue #342).
STAR 2.7.0f 2019/03/28
======================
* Fixed a problem in STARsolo with empty Unmapped.out.mate2 file. Issue #593.
* Fixed a problem with CR CY UR UQ SAM tags in solo output. Issue #593.
* Fixed problems with STARsolo and 2-pass.
STAR 2.7.0e 2019/02/25
======================
* Fixed problems with --quantMode GeneCounts and --parametersFiles options
STAR 2.7.0d 2019/02/19
======================
* Implemented --soloBarcodeReadLength option for barcode read length not equal to the UMI+CB length
* Enforced genome version rules for 2.7.0
STAR 2.7.0c 2019/02/08
======================
* This release is compiled with gcc-4.8.5, and requires at least gcc-4.8.5
* Fixed another problem in STARsolo genes.tsv output.
* Replaced tabs with spaces in STARsolo matrix.mtx output
* #559, #562 Fixed compilation problems.
* #550 (again, previous merge failed): Added correct header for the STARsolo matrix.mtx file, needed for python scipy mmread compatibility.
STAR 2.7.0b 2019/02/05
======================
* #550: Added correct header for the STARsolo matrix.mtx file, needed for python scipy mmread compatibility.
* #556: Fixed a problem with STARsolo genes.tsv file, which may also cause troubles with GTF files processing.
* Important: 2.7.0x releases require re-generation of the genome index.
STAR 2.7.0a 2019/01/23
======================
* This release introduces STARsolo for: mapping, demultiplexing and gene quantification for single cell RNA-seq.
......
......@@ -35,9 +35,9 @@ Download the latest [release from](https://github.com/alexdobin/STAR/releases) a
```bash
# Get latest STAR source from releases
wget https://github.com/alexdobin/STAR/archive/2.7.0a.tar.gz
tar -xzf 2.7.0a.tar.gz
cd STAR-2.7.0a
wget https://github.com/alexdobin/STAR/archive/2.7.1a.tar.gz
tar -xzf 2.7.1a.tar.gz
cd STAR-2.7.1a
# Alternatively, get STAR source using git
git clone https://github.com/alexdobin/STAR.git
......@@ -74,7 +74,6 @@ make STARforMacStatic CXX=/path/to/gcc
```
If you run STAR only on a single machine or a homogeneously configured cluster, you can help the compiler optimize in a way that is tailored to your platform. The flags LDFLAGSextra and CXXFLAGSextra are appended to the default optimizations specified in source/Makefile.
```
# platform-specific optimization for gcc/g++
make CXXFLAGSextra=-march=native
......@@ -82,6 +81,14 @@ make CXXFLAGSextra=-march=native
make LDFLAGSextra=-flto CXXFLAGSextra="-flto -march=native"
```
FreeBSD ports
=============
STAR can be installed on FreeBSD via the FreeBSD ports system.
To install via the binary package, simply run:
```
pkg install star
```
LIMITATIONS
===========
......
STAR 2.7.0a 2019/01/23
STAR 2.7.0c 2019/02/05
======================
STARsolo: mapping, demultiplexing and gene quantification for single cell RNA-seq
---------------------------------------------------------------------------------
STARsolo is a turnkey solution for analyzing droplet single cell RNA sequencing data (e.g. 10X Genomics Chromium System) built directly into STAR code.
STARsolo inputs the raw FASTQ reads files, and performs the following operations
(i) error correction and demultiplexing of cell barcodes using user-input whitelist
(ii) mapping the reads to the reference genome using the standard STAR spliced read alignment algorithm
(iii) error correction and collapsing (deduplication) of Unique Molecular Identifiers (UMIs)
(iv) quantification of per-cell gene expression by counting the number of reads per gene
* error correction and demultiplexing of cell barcodes using user-input whitelist
* mapping the reads to the reference genome using the standard STAR spliced read alignment algorithm
* error correction and collapsing (deduplication) of Unique Molecular Identifiers (UMIs)
* quantification of per-cell gene expression by counting the number of reads per gene
STARsolo output is designed to be a drop-in replacement for 10X CellRanger gene quantification output.
It follows CellRanger logic for cell barcode whitelisting and UMI deduplication, and produces nearly identical gene counts in the same format.
At the same time, STARsolo is ~10 times faster than CellRanger.
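The first operation listed above, matching cell barcodes against a whitelist while tolerating one mismatch, can be sketched as follows. This is a simplified Python illustration, not STAR's implementation: the function name `correct_barcode` is hypothetical, and a barcode that is one mismatch away from several whitelist entries is simply rejected here rather than resolved.

```python
from typing import Optional, Set


def correct_barcode(cb: str, whitelist: Set[str]) -> Optional[str]:
    """Return the whitelist barcode matching `cb` with at most one mismatch.

    Hypothetical helper for illustration only. Returns None when there is
    no match, or when more than one whitelist barcode is one mismatch away.
    """
    if cb in whitelist:
        return cb  # exact match, nothing to correct

    # Try every single-base substitution and collect whitelist hits.
    candidates = set()
    for i, base in enumerate(cb):
        for alt in "ACGT":
            if alt == base:
                continue
            variant = cb[:i] + alt + cb[i + 1:]
            if variant in whitelist:
                candidates.add(variant)

    if len(candidates) == 1:
        return next(iter(candidates))
    return None  # no match, or ambiguous between several barcodes


if __name__ == "__main__":
    toy_whitelist = {"AAAA", "CCCC", "AATT"}  # made-up 4-base barcodes
    for observed in ("AAAA", "CCCG", "AAAT", "GGGG"):
        print(observed, "->", correct_barcode(observed, toy_whitelist))
```

In this toy run, `CCCG` is corrected to `CCCC`, while `AAAT` is rejected because it is one mismatch away from both `AAAA` and `AATT`.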
The STAR solo algorithm is turned on with:
```
......@@ -32,6 +33,8 @@ Importantly, in the --readFilesIn option, the 1st file has to be cDNA read, and
--readFilesIn cDNAfragmentSequence.fastq.gz CellBarcodeUMIsequence.fastq.gz
```
Important: the genome index has to be re-generated with the latest 2.7.0x release.
Other parameters that control STARsolo output are listed below. Note that default parameters are compatible with 10X Chromium V2 protocol.
```
......
rna-star (2.7.1a+dfsg-1) unstable; urgency=medium
* New upstream release.
* Refresh patches.
-- Sascha Steinbiss <satta@debian.org> Mon, 08 Jul 2019 17:28:41 +0200
rna-star (2.7.0a+dfsg-1) unstable; urgency=medium
* New upstream release.
......
......@@ -2,11 +2,9 @@ Author: Steffen Moeller <moeller@debian.org>,
Last-Changed: Thu, 29 Jan 2015 14:18:44 +0100
Description: Use Debian packaged htslib
Index: rna-star/source/Makefile
===================================================================
--- rna-star.orig/source/Makefile
+++ rna-star/source/Makefile
@@ -12,15 +12,15 @@ CXXFLAGSextra ?=
--- a/source/Makefile
+++ b/source/Makefile
@@ -12,15 +12,15 @@
CXX ?= g++
# pre-defined flags
......@@ -25,7 +23,7 @@ Index: rna-star/source/Makefile
CXXFLAGS_main := -O3 $(CXXFLAGS_common)
CXXFLAGS_gdb := -O0 -g $(CXXFLAGS_common)
@@ -63,10 +63,10 @@ SOURCES := $(wildcard *.cpp) $(wildcard
@@ -64,10 +64,10 @@
%.o : %.cpp
......@@ -38,7 +36,7 @@ Index: rna-star/source/Makefile
all: STAR
@@ -77,12 +77,10 @@ clean:
@@ -78,19 +78,17 @@
.PHONY: CLEAN
CLEAN:
rm -f *.o STAR Depend.list
......@@ -49,9 +47,8 @@ Index: rna-star/source/Makefile
rm -f *.o Depend.list
- $(MAKE) -C htslib clean
.PHONY: install
install:
@@ -93,7 +91,7 @@ ifneq ($(MAKECMDGOALS),cleanRelease)
ifneq ($(MAKECMDGOALS),clean)
ifneq ($(MAKECMDGOALS),cleanRelease)
ifneq ($(MAKECMDGOALS),CLEAN)
ifneq ($(MAKECMDGOALS),STARforMac)
ifneq ($(MAKECMDGOALS),STARforMacGDB)
......@@ -60,7 +57,7 @@ Index: rna-star/source/Makefile
echo $(SOURCES)
'rm' -f ./Depend.list
$(CXX) $(CXXFLAGS_common) -MM $^ >> Depend.list
@@ -104,11 +102,6 @@ endif
@@ -101,11 +99,6 @@
endif
endif
......@@ -72,10 +69,8 @@ Index: rna-star/source/Makefile
parametersDefault.xxd: parametersDefault
xxd -i parametersDefault > parametersDefault.xxd
Index: rna-star/source/bamRemoveDuplicates.cpp
===================================================================
--- rna-star.orig/source/bamRemoveDuplicates.cpp
+++ rna-star/source/bamRemoveDuplicates.cpp
--- a/source/bamRemoveDuplicates.cpp
+++ b/source/bamRemoveDuplicates.cpp
@@ -1,7 +1,7 @@
#include <unordered_map>
#include "bamRemoveDuplicates.h"
......@@ -85,11 +80,9 @@ Index: rna-star/source/bamRemoveDuplicates.cpp
#include "IncludeDefine.h"
#include SAMTOOLS_BGZF_H
#include "ErrorWarning.h"
Index: rna-star/source/bam_cat.c
===================================================================
--- rna-star.orig/source/bam_cat.c
+++ rna-star/source/bam_cat.c
@@ -52,8 +52,8 @@ THE SOFTWARE.
--- a/source/bam_cat.c
+++ b/source/bam_cat.c
@@ -52,8 +52,8 @@
#include <stdlib.h>
#include <unistd.h>
......@@ -100,10 +93,8 @@ Index: rna-star/source/bam_cat.c
#include <cstring>
#define BUF_SIZE 0x10000
Index: rna-star/source/signalFromBAM.h
===================================================================
--- rna-star.orig/source/signalFromBAM.h
+++ rna-star/source/signalFromBAM.h
--- a/source/signalFromBAM.h
+++ b/source/signalFromBAM.h
@@ -1,6 +1,6 @@
#ifndef CODE_signalFromBAM
#define CODE_signalFromBAM
......@@ -112,10 +103,8 @@ Index: rna-star/source/signalFromBAM.h
#include <fstream>
#include <string>
#include "Stats.h"
Index: rna-star/source/IncludeDefine.h
===================================================================
--- rna-star.orig/source/IncludeDefine.h
+++ rna-star/source/IncludeDefine.h
--- a/source/IncludeDefine.h
+++ b/source/IncludeDefine.h
@@ -28,8 +28,8 @@
#define ERROR_OUT string ( __FILE__ ) +":"+ to_string ( (uint) __LINE__ ) +":"+ string ( __FUNCTION__ )
......@@ -127,10 +116,8 @@ Index: rna-star/source/IncludeDefine.h
using namespace std;
Index: rna-star/source/BAMfunctions.cpp
===================================================================
--- rna-star.orig/source/BAMfunctions.cpp
+++ rna-star/source/BAMfunctions.cpp
--- a/source/BAMfunctions.cpp
+++ b/source/BAMfunctions.cpp
@@ -1,5 +1,5 @@
#include "BAMfunctions.h"
-#include "htslib/htslib/kstring.h"
......@@ -138,10 +125,8 @@ Index: rna-star/source/BAMfunctions.cpp
string bam_cigarString (bam1_t *b) {//output CIGAR string
// kstring_t strK;
Index: rna-star/source/STAR.cpp
===================================================================
--- rna-star.orig/source/STAR.cpp
+++ rna-star/source/STAR.cpp
--- a/source/STAR.cpp
+++ b/source/STAR.cpp
@@ -30,7 +30,7 @@
#include "bam_cat.h"
......@@ -151,10 +136,8 @@ Index: rna-star/source/STAR.cpp
#include "parametersDefault.xxd"
void usage(int usageType) {
Index: rna-star/source/bam_cat.h
===================================================================
--- rna-star.orig/source/bam_cat.h
+++ rna-star/source/bam_cat.h
--- a/source/bam_cat.h
+++ b/source/bam_cat.h
@@ -1,7 +1,7 @@
#ifndef CODE_bam_cat
#define CODE_bam_cat
......
......@@ -34,7 +34,7 @@
\newcommand{\sechyperref}[1]{\hyperref[#1]{Section \ref{#1}. \nameref{#1}}}
\title{STAR manual 2.7.0a}
\title{STAR manual 2.7.1a}
\author{Alexander Dobin\\
dobin@cshl.edu}
\maketitle
......@@ -253,7 +253,7 @@ STAR produces multiple output files. All files have standard name, however, you
\subsection{SAM.}
\ofilen{Aligned.out.sam} - alignments in standard SAM format.
\subsubsection{Multimappers.}
The number of loci \code{Nmap} a read maps to is given by the \code{NH:i:Nmap} field. A value of 1 corresponds to unique mappers, while values \textgreater1 correspond to multi-mappers. The \code{HI} attribute enumerates multiple alignments of a read starting with 1 (this can be changed with the \opt{outSAMattrIHstart} option; setting it to 0 may be required for compatibility with downstream software such as Cufflinks or StringTie).
The number of loci \code{Nmap} a read maps to is given by the \code{NH:i:Nmap} field. A value of 1 corresponds to unique mappers, while values \textgreater1 correspond to multi-mappers. The \code{HI} attribute enumerates multiple alignments of a read starting with 1 (this can be changed with the \opt{outSAMattrIHstart} option; setting it to 0 may be required for compatibility with downstream software such as Cufflinks).
The mapping quality MAPQ (column 5) is 255 for uniquely mapping reads, and int(-10*log10(1-1/Nmap)) for multi-mapping reads. This scheme is the same as the one used by TopHat and is compatible with Cufflinks. The default MAPQ=255 for the unique mappers may be changed with the \opt{outSAMmapqUnique} parameter (integer 0 to 255) to ensure compatibility with downstream tools such as GATK.
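The MAPQ rule above can be evaluated directly. The following Python sketch only restates the formula from the text; the function name `star_mapq` and its `unique_mapq` argument (standing in for the value set by `outSAMmapqUnique`) are illustrative names, not part of STAR.

```python
import math


def star_mapq(nmap: int, unique_mapq: int = 255) -> int:
    """MAPQ as described in the text, for a read mapping to `nmap` loci.

    `unique_mapq` plays the role of the outSAMmapqUnique setting
    (default 255); the name is an assumption made for this sketch.
    """
    if nmap < 1:
        raise ValueError("nmap must be at least 1")
    if nmap == 1:
        return unique_mapq
    # int() truncates toward zero, matching int(-10*log10(1-1/Nmap)).
    return int(-10 * math.log10(1 - 1 / nmap))


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 10):
        print(n, star_mapq(n))
```

With these inputs the formula gives 255 for unique mappers, 3 for two loci, 1 for three or four loci, and 0 for five or more.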
......@@ -269,17 +269,35 @@ The \opt{outSAMmultNmax} parameter limits the number of output alignments (SAM l
The SAM attributes can be specified by the user using \opt{outSAMattributes} \optvr{A1 A2 A3 ...} option which accept a list of 2-character SAM attributes. The implemented attributes are: \optv{NH HI NM MD AS nM jM jI XS}. By default, STAR outputs \optv{NH HI AS nM} attributes.
\begin{itemize}
\item[]
\optv{NH HI NM MD} have standard meaning as defined in the SAM format specifications.
\optv{NH HI NM MD} : have standard meaning as defined in the SAM format specifications.
\item[]
\optv{AS} is the local alignment score (paired for paired-end reads).
\optv{AS} : is the local alignment score (paired for paired-end reads).
\item[]
\optv{nM} is the number of mismatches per (paired) alignment, not to be confused with \optv{NM}, which is the number of mismatches in each mate.
\optv{nM} : is the number of mismatches per (paired) alignment, not to be confused with \optv{NM}, which is the number of mismatches in each mate.
\item[]
\optv{jM:B:c,M1,M2,...} intron motifs for all junctions (i.e. N in CIGAR): 0: non-canonical; 1: GT/AG, 2: CT/AC, 3: GC/AG, 4: CT/GC, 5: AT/AC, 6: GT/AT. If splice junctions database is used, and a junction is annotated, 20 is added to its motif value.
\optv{jM:B:c,M1,M2,...} : intron motifs for all junctions (i.e. N in CIGAR): 0: non-canonical; 1: GT/AG, 2: CT/AC, 3: GC/AG, 4: CT/GC, 5: AT/AC, 6: GT/AT. If splice junctions database is used, and a junction is annotated, 20 is added to its motif value.
\item[]
\optv{jI:B:I,Start1,End1,Start2,End2,...} Start and End of introns for all junctions (1-based).
\optv{jI:B:I,Start1,End1,Start2,End2,...} : Start and End of introns for all junctions (1-based).
\item[]
\optv{jM jI} attributes require samtools 0.1.18 or later, and were reported to be incompatible with some downstream tools such as Cufflinks.
\optv{jM jI} : attributes require samtools 0.1.18 or later, and were reported to be incompatible with some downstream tools such as Cufflinks.
\item[]
\optv{vA} : variant allele
\item[]
\optv{vG} : genomic coordinate of the variant overlapped by the read
\item[]
\optv{vW} : 0/1 - alignment does not pass / passes WASP filtering. Requires --waspOutputMode SAMtag
\item[]
\optv{CR CY UR UY} : sequences and quality scores of cell barcodes and UMIs for the solo* demultiplexing, not error corrected
\item[]
\optv{uT} : for unmapped reads, reason for not mapping:
\begin{itemize}[noitemsep,topsep=-3pt]
\item[] 0 : no acceptable seed/windows, "Unmapped other" in the Log.final.out
\item[] 1 : best alignment shorter than min allowed mapped length, "Unmapped: too short" in the Log.final.out
\item[] 2 : best alignment has more mismatches than max allowed number of mismatches, "Unmapped: too many mismatches" in the Log.final.out
\item[] 3 : read maps to more loci than the max number of multimapping loci, "Multimapping: mapped to too many loci" in the Log.final.out
\item[] 4 : unmapped mate of a mapped paired-end read
\end{itemize}
\end{itemize}
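The numeric codes carried by the \optv{jM} and \optv{uT} attributes listed above can be turned back into labels with a small lookup. This is a minimal Python sketch: the tables are copied from the list above, while the function names `decode_jm` and `decode_ut` are hypothetical, and any value not listed in the text is reported as "unknown" rather than interpreted.

```python
from typing import List, Tuple

# Motif codes as listed for the jM attribute.
JM_MOTIFS = {
    0: "non-canonical", 1: "GT/AG", 2: "CT/AC", 3: "GC/AG",
    4: "CT/GC", 5: "AT/AC", 6: "GT/AT",
}

# Reasons for not mapping, as listed for the uT attribute.
UT_REASONS = {
    0: "no acceptable seed/windows",
    1: "best alignment shorter than min allowed mapped length",
    2: "best alignment has more mismatches than max allowed",
    3: "read maps to more loci than the max number of multimapping loci",
    4: "unmapped mate of a mapped paired-end read",
}


def decode_jm(field: str) -> List[Tuple[str, bool]]:
    """Decode 'jM:B:c,M1,M2,...' into (motif, is_annotated) pairs."""
    tag, _type, payload = field.split(":", 2)
    if tag != "jM":
        raise ValueError("not a jM attribute: " + field)
    codes = [int(v) for v in payload.split(",")[1:]]  # skip subtype 'c'
    decoded = []
    for code in codes:
        annotated = code >= 20  # 20 is added for annotated junctions
        motif = code - 20 if annotated else code
        decoded.append((JM_MOTIFS.get(motif, "unknown"), annotated))
    return decoded


def decode_ut(code: int) -> str:
    """Return the textual reason for a uT code."""
    return UT_REASONS.get(code, "unknown")


if __name__ == "__main__":
    print(decode_jm("jM:B:c,1,22"))
    print(decode_ut(1))
```

For the example field `jM:B:c,1,22`, the first junction is an unannotated GT/AG and the second is an annotated CT/AC.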
\subsubsection{Compatibility with Cufflinks/Cuffdiff.}
......@@ -522,6 +540,44 @@ Importantly, in the --readFilesIn option, the 1st FASTQ file has to be cDNA read
Other solo* options can be found in the Section \ref{STARsolo_(single_cell_RNA-seq)_parameters}.
\subsection{Feature statistics summaries.}
Feature statistics summaries are recorded in the \optvr{Solo.out/} directory in files \optvr{<Feature>.stats} where features are those used in the \opt{soloFeatures} option, e.g. \optvr{Gene.stats}. The following metrics are recorded:
\begin{itemize}[leftmargin=1.5in]
\itemsep -0.3em
\item[\optv{nNinBarcode:}] number of reads with more than 2 Ns in cell barcode (CB)
\item[\optv{nUMIhomopolymer:}] number of reads with homopolymer in CB
\item[\optv{nTooMany:}] not used at the moment
\item[\optv{nNoMatch:}] number of reads with CBs that do not match whitelist even with one mismatch
\end{itemize}
All of the above reads are discarded from Solo output. Remaining reads are checked for overlap with features (e.g. genes):
\begin{itemize}[leftmargin=2in]
\itemsep -0.3em
\item[\optv{nUnmapped:}] number of reads unmapped to the genome
\item[\optv{nNoFeature:}] number of reads that map to the genome but do not belong to a feature
\item[\optv{nAmbigFeature:}] number of reads that belong to more than one feature
\item[\optv{nAmbigFeatureMultimap:}] number of reads that belong to more than one feature and are also multimapping to the genome (this is a subset of nAmbigFeature)
\item[\optv{nTooMany:}] number of reads with ambiguous CB (i.e. CB matches whitelist with one mismatch but with posterior probability <0.95)
\item[\optv{nNoExactMatch:}] number of reads with CB that matches a whitelist barcode with 1 mismatch, but this whitelist barcode does not get any other reads with exact matches of CB
\end{itemize}
All of the reads above are output in feature (e.g. gene) / cell count matrices.
\begin{itemize}[leftmargin=1.5in]
\itemsep -0.3em
\item[\optv{nExactMatch:}] number of reads with CB that match the whitelist exactly
\item[\optv{nMatch:}] total number of reads that match CB with 0 or 1 mismatches (this is a superset of nExactMatch)
\item[\optv{nCellBarcodes:}] number of distinct CBs detected
\item[\optv{nUMIs:}] number of distinct UMIs detected
\end{itemize}
These metrics can be grouped into more broad categories:
\begin{itemize}
\itemsep -0.3em
\item[]\optv{nNinBarcode+nUMIhomopolymer+nNoMatch+nTooMany+nNoExactMatch} = number of reads with CBs that do not match whitelist.
\item[]\optv{nUnmapped+nAmbigFeature} = number of reads without defined feature (gene)
\item[]\optv{nMatch} = number of reads that are output as solo counts
\end{itemize}
The three categories above summed together should be equal to the total number of reads.
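The grouping above can be computed from a stats file with a few lines of Python. This is a sketch under stated assumptions: the file layout (one whitespace-separated `name value` pair per line) is assumed rather than documented here, the function names are hypothetical, and the numbers in the example are made up.

```python
from typing import Dict


def parse_stats(text: str) -> Dict[str, int]:
    """Parse 'name value' lines (assumed layout) into a dict of counts."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0].rstrip(":")] = int(parts[1])
    return stats


def broad_categories(stats: Dict[str, int]) -> Dict[str, int]:
    """Group metrics into the three broad categories named in the text."""
    cb_keys = ("nNinBarcode", "nUMIhomopolymer", "nNoMatch",
               "nTooMany", "nNoExactMatch")
    groups = {
        "cb_not_in_whitelist": sum(stats.get(k, 0) for k in cb_keys),
        "no_defined_feature": stats.get("nUnmapped", 0)
                              + stats.get("nAmbigFeature", 0),
        "output_as_solo_counts": stats.get("nMatch", 0),
    }
    # Per the text, this sum should equal the total number of reads.
    groups["total"] = sum(groups.values())
    return groups


if __name__ == "__main__":
    example = """\
nNinBarcode 5
nUMIhomopolymer 2
nTooMany 1
nNoMatch 10
nUnmapped 20
nAmbigFeature 4
nNoExactMatch 3
nMatch 70
"""
    print(broad_categories(parse_stats(example)))
```

Comparing `total` with the number of input reads reported elsewhere is a quick consistency check on a run.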
\section{Description of all options.}\label{Description_of_all_options}
For each STAR version, the most up-to-date information about all STAR parameters can be found in the \code{parametersDefault} file in the STAR source directory. The parameters in the \code{parametersDefault}, as well as in the descriptions below, are grouped by function:
\begin{itemize}
......
......@@ -36,7 +36,7 @@ if ($1=="###") {# new group/subsection of parameters
nOpt=0;
while ($1!="") {
$0=substr($0,match($0,/[^[:space:]]/));
no=split($0,oo,/[:space:]*\.\.\.[:space:]*/);
no=split($0,oo,/[[:space:]]*\.\.\.[[:space:]]*/);
if (no!=2) {# not option line
if (nOpt>0) print optOptTableEnd;
print " \\optLine{" $0 "}" " ";
......
......@@ -92,10 +92,16 @@
\optLine{string: feature type in GTF file to be used as exons for building transcripts}
\optName{sjdbGTFtagExonParentTranscript}
\optValue{transcript{\textunderscore}id}
\optLine{string: tag name to be used as exons' transcript-parents (default "transcript{\textunderscore}id" works for GTF files)}
\optLine{string: GTF attribute name for parent transcript ID (default "transcript{\textunderscore}id" works for GTF files)}
\optName{sjdbGTFtagExonParentGene}
\optValue{gene{\textunderscore}id}
\optLine{string: tag name to be used as exons' gene-parents (default "gene{\textunderscore}id" works for GTF files)}
\optLine{string: GTF attribute name for parent gene ID (default "gene{\textunderscore}id" works for GTF files)}
\optName{sjdbGTFtagExonParentGeneName}
\optValue{gene{\textunderscore}name}
\optLine{string(s): GTF attribute name for parent gene name}
\optName{sjdbGTFtagExonParentGeneType}
\optValue{gene{\textunderscore}type gene{\textunderscore}biotype}
\optLine{string(s): GTF attribute name for parent gene type}
\optName{sjdbOverhang}
\optValue{100}
\optLine{int{\textgreater}0: length of the donor/acceptor sequence on each side of the junctions, ideally = (mate{\textunderscore}length - 1)}
......@@ -153,14 +159,6 @@
\optName{readNameSeparator}
\optValue{/}
\optLine{string(s): character(s) separating the part of the read names that will be trimmed in output (read name after space is always trimmed)}
\optName{readStrand}
\optValue{Unstranded}
\optLine{string: library strandedness type}
\begin{optOptTable}
\optOpt{Unstranded} \optOptLine{unstranded library}
\optOpt{Forward} \optOptLine{1st read strand same as RNA (i.e. 2nd cDNA synthesis strand)}
\optOpt{Reverse} \optOptLine{1st read opposite to RNA (i.e. 1st cDNA synthesis strand)}
\end{optOptTable}
\optName{clip3pNbases}
\optValue{0}
\optLine{int(s): number(s) of bases to clip from 3p of each mate. If one value is given, it will be assumed the same for both mates.}
......@@ -290,7 +288,7 @@
\optOpt{vA} \optOptLine{variant allele}
\optOpt{vG} \optOptLine{genomic coordinate of the variant overlapped by the read}
\optOpt{vW} \optOptLine{0/1 - alignment does not pass / passes WASP filtering. Requires --waspOutputMode SAMtag}
\optOpt{CR,CY,UR,UY} \optOptLine{sequences and quality scores of cell barcodes and UMIs for the solo* demultiplexing}
\optOpt{CR CY UR UY} \optOptLine{sequences and quality scores of cell barcodes and UMIs for the solo* demultiplexing}
\end{optOptTable}
\optLine{Unsupported/undocumented:}
\begin{optOptTable}
......@@ -643,7 +641,7 @@
\optOpt{Right} \optOptLine{insertions are flushed to the right}
\end{optOptTable}
\end{optTable}
\optSection{Paired-End reads: presently unsupported/undocumented}\label{Paired-End_reads:_presently_unsupported/undocumented}
\optSection{Paired-End reads}\label{Paired-End_reads}
\begin{optTable}
\optName{peOverlapNbasesMin}
\optValue{0}
......@@ -808,6 +806,13 @@
\optName{soloUMIlen}
\optValue{10}
\optLine{int{\textgreater}0: UMI length}
\optName{soloBarcodeReadLength}
\optValue{1}
\optLine{int: length of the barcode read}
\begin{optOptTable}
\optOpt{1} \optOptLine{equal to sum of soloCBlen+soloUMIlen}
\optOpt{0} \optOptLine{not defined, do not check}
\end{optOptTable}
\optName{soloStrand}
\optValue{Forward}
\optLine{string: strandedness of the solo libraries:}
......@@ -822,6 +827,7 @@
\begin{optOptTable}
\optOpt{Gene} \optOptLine{genes: reads match the gene transcript}
\optOpt{SJ} \optOptLine{splice junctions: reported in SJ.out.tab}
\optOpt{GeneFull} \optOptLine{full genes: count all reads overlapping genes' exons and introns}
\end{optOptTable}
\optName{soloUMIdedup}
\optValue{1MM{\textunderscore}All}
......@@ -832,13 +838,14 @@
\optOpt{1MM{\textunderscore}NotCollapsed} \optOptLine{UMIs with 1 mismatch distance to others are not collapsed (i.e. all counted)}
\end{optOptTable}
\optName{soloOutFileNames}
\optValue{Solo.out/ genes.tsv barcodes.tsv matrix.mtx matrixSJ.mtx}
\optValue{Solo.out/ genes.tsv barcodes.tsv matrix.mtx matrixSJ.mtx matrixGeneFull.mtx}
\optLine{string(s): file names for STARsolo output}
\begin{optOptTable}
\optOpt{1st word} \optOptLine{file name prefix}
\optOpt{2nd word} \optOptLine{barcode sequences}
\optOpt{3rd word} \optOptLine{gene IDs and names}
\optOpt{4th word} \optOptLine{cell/gene counts matrix}
\optOpt{5th word} \optOptLine{cell/splice junction counts matrix}
\optOpt{2nd word} \optOptLine{gene IDs and names}
\optOpt{3rd word} \optOptLine{barcode sequences}
\optOpt{4th word} \optOptLine{cell/Gene counts matrix}
\optOpt{5th word} \optOptLine{cell/SJ counts matrix}
\optOpt{6th word} \optOptLine{cell/GeneFull counts matrix}
\end{optOptTable}
\end{optTable}
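The positional meaning of the words in \optv{soloOutFileNames} can be made explicit with a small helper. This Python sketch assumes the six-word form of the option (prefix, gene file, barcode file, then the Gene, SJ and GeneFull matrices) and assumes that the first word is simply prepended to the others; the function name `solo_output_paths` and the role labels are hypothetical.

```python
from typing import Dict

# Roles of words 2..6, in the order assumed for this sketch.
ROLES = ("genes", "barcodes", "matrix_gene", "matrix_sj", "matrix_genefull")

DEFAULT_VALUE = ("Solo.out/ genes.tsv barcodes.tsv matrix.mtx "
                 "matrixSJ.mtx matrixGeneFull.mtx")


def solo_output_paths(value: str = DEFAULT_VALUE) -> Dict[str, str]:
    """Split a soloOutFileNames value into a role -> path mapping."""
    words = value.split()
    if len(words) != len(ROLES) + 1:
        raise ValueError("expected %d words, got %d"
                         % (len(ROLES) + 1, len(words)))
    prefix, names = words[0], words[1:]
    # Assumption: the prefix is concatenated directly with each file name.
    return {role: prefix + name for role, name in zip(ROLES, names)}


if __name__ == "__main__":
    for role, path in solo_output_paths().items():
        print(role, path)
```

With the default value shown above, the GeneFull matrix resolves to `Solo.out/matrixGeneFull.mtx`.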
......@@ -2,9 +2,9 @@ FROM debian:stretch-slim
MAINTAINER dobin@cshl.edu
ENV STAR_VERSION 2.6.1d
ARG STAR_VERSION=2.7.1a
ENV PACKAGES gcc g++ make wget zlib1g-dev
ENV PACKAGES gcc g++ make wget zlib1g-dev unzip
RUN set -ex
......@@ -12,8 +12,8 @@ RUN apt-get update && \
apt-get install -y --no-install-recommends ${PACKAGES} && \
apt-get clean && \
cd /home && \
wget --no-check-certificate https://github.com/alexdobin/STAR/archive/${STAR_VERSION}.tar.gz && \
tar xzf ${STAR_VERSION}.tar.gz && \
wget --no-check-certificate https://github.com/alexdobin/STAR/archive/${STAR_VERSION}.zip && \
unzip ${STAR_VERSION}.zip && \
cd STAR-${STAR_VERSION}/source && \
make STARstatic && \
mkdir /home/bin && \
......
......@@ -10,6 +10,7 @@ void ChimericAlign::chimericJunctionOutput(fstream &outStream, uint chimN)
<<"\t"<< al2->exons[0][EX_G] - mapGen.chrStart[al2->Chr]+1 <<"\t"<< al2->generateCigarP() <<"\t"<< chimN;
if (P.outSAMattrPresent.RG)
outStream <<"\t"<< P.outSAMattrRG.at(RA->readFilesIndex);
if (P.pSolo.type>0)
outStream <<"\t"<< RA->soloRead->readBar->cbSeq <<"\t"<< RA->soloRead->readBar->umiSeq;
outStream <<"\n"; //<<"\t"<< trChim[0].exons[0][EX_iFrag]+1 --- no need for that, since trChim[0] is always on the first mate
};
\ No newline at end of file
......@@ -174,4 +174,3 @@ void ChimericAlign::chimericStitching(char *genSeq, char *readSeq) {
return;
};
};
\ No newline at end of file