Counts genomic cuts (5' ends of reads) from DNase-seq or ATAC-seq BAM alignment files using bedtools. For ATAC-seq, when shift_ATAC = TRUE, reads are shifted by shift_ATAC_bases to correct strand-specific offsets and align the signal across strands.

count_genome_cuts(
  bam_file,
  chrom_size_file,
  data_type = c("DNase", "ATAC"),
  shift_ATAC = TRUE,
  shift_ATAC_bases = c(4L, -4L),
  outdir = dirname(bam_file),
  outname,
  bedtools_path = "bedtools",
  bedGraphToBigWig_path = "bedGraphToBigWig"
)

Arguments

bam_file

Sorted BAM file.

chrom_size_file

Chromosome size file.
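
As a rough illustration, the sketch below assumes the common two-column, tab-delimited layout (chromosome name, length in bp) used by UCSC-style chrom.sizes files. The chromosome names and lengths are toy values, and the checks are only a suggestion, not something this function is documented to perform.

```r
# Write a toy chromosome size file (made-up names and lengths, illustration only).
chrom_size_file <- tempfile(fileext = ".chrom.sizes")
writeLines(c("chrA\t1000", "chrB\t500"), chrom_size_file)

# Read it back as a two-column table: chromosome name and length in bp.
chrom_sizes <- read.table(chrom_size_file, sep = "\t", header = FALSE,
                          col.names = c("chrom", "size"),
                          colClasses = c("character", "numeric"),
                          stringsAsFactors = FALSE)

# Basic sanity checks: unique chromosome names and positive lengths.
stopifnot(!anyDuplicated(chrom_sizes$chrom), all(chrom_sizes$size > 0))
```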

data_type

Data type. Options: ‘DNase’ or ‘ATAC’.

shift_ATAC

Logical. When shift_ATAC=TRUE (and data_type='ATAC'), shifts reads according to shift_ATAC_bases.

shift_ATAC_bases

Integer vector of length two giving the number of bases to shift reads on the + and - strands, in that order. Default: c(4L, -4L), which shifts reads on the + strand by 4 bp and reads on the - strand by -4 bp.
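
To make the strand-specific shift concrete, here is a minimal sketch of the arithmetic. The helper shift_cut_positions() is hypothetical and not part of this package; the function itself may apply the shift differently (for example on the BAM or BED records before counting).

```r
# Hypothetical helper (not part of the package): apply a strand-specific
# shift to 5' cut positions. shift_bases[1] is used for "+" strand reads,
# shift_bases[2] for "-" strand reads.
shift_cut_positions <- function(pos, strand, shift_bases = c(4L, -4L)) {
  stopifnot(length(shift_bases) == 2L, length(pos) == length(strand))
  stopifnot(all(strand %in% c("+", "-")))
  shift <- ifelse(strand == "+", shift_bases[1], shift_bases[2])
  pos + as.integer(shift)
}

# Toy cut positions, for illustration only.
cuts <- data.frame(pos = c(100L, 200L, 300L),
                   strand = c("+", "-", "+"),
                   stringsAsFactors = FALSE)
cuts$shifted <- shift_cut_positions(cuts$pos, cuts$strand)
```

With the default c(4L, -4L), the + strand positions move 4 bp downstream and the - strand position moves 4 bp upstream.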

outdir

Output directory (default: use the directory of bam_file).

outname

Output prefix (default: use the prefix of bam_file).
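
One plausible way to derive such a prefix is to drop the directory and the final extension from bam_file. This is an assumption for illustration; the package may compute the default differently.

```r
# Assumed derivation of a default prefix: file name without directory
# and without the trailing ".bam" extension.
bam_file <- file.path("data", "K562.ATAC.bam")
default_outname <- tools::file_path_sans_ext(basename(bam_file))
```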

bedtools_path

Path to bedtools executable.

bedGraphToBigWig_path

Path to UCSC bedGraphToBigWig executable.
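
Before running the function, it can help to confirm that both executables are discoverable. The sketch below uses base R's Sys.which(); the helper find_missing_tools() is hypothetical and not part of this package.

```r
# Hypothetical helper (not part of the package): return the entries of
# `paths` that Sys.which() cannot resolve to an executable.
find_missing_tools <- function(paths) {
  resolved <- Sys.which(paths)
  paths[!nzchar(resolved)]
}

missing_tools <- find_missing_tools(c("bedtools", "bedGraphToBigWig"))
if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
}
```

If a tool is reported as missing, passing its full path via bedtools_path or bedGraphToBigWig_path is the documented way to point the function at it.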

Examples

if (FALSE) {
# ATAC-seq data
count_genome_cuts(bam_file='K562.ATAC.bam',
                  chrom_size_file='hg38.chrom.sizes',
                  data_type='ATAC',
                  shift_ATAC=TRUE,
                  outdir='processed_data',
                  outname='K562.ATAC')

# DNase-seq data
count_genome_cuts(bam_file='K562.DNase.bam',
                  chrom_size_file='hg38.chrom.sizes',
                  data_type='DNase',
                  outdir='processed_data',
                  outname='K562.DNase')
}