R/process_dnase_atac_data.R
get_sites_counts.Rd

Extracts read counts around candidate binding sites on both strands
from the genome count data
(BigWig files generated by count_genome_cuts()).
It calls the extract bed command of the bwtool software
to extract the read counts,
then combines them into a single matrix: the first half of the columns
holds the read counts on the forward strand,
and the second half of the columns holds the read counts
on the reverse strand.

get_sites_counts(
  sites,
  genomecount_dir,
  genomecount_name,
  tmpdir = genomecount_dir,
  bedGraphToBigWig_path = "bedGraphToBigWig",
  bwtool_path = "bwtool"
)

sites
    A data frame containing the candidate sites.
genomecount_dir
    Directory for genome counts,
    the same as outdir in count_genome_cuts().
genomecount_name
    File prefix for genome counts,
    the same as outname in count_genome_cuts().
tmpdir
    Temporary directory to save intermediate files.
bedGraphToBigWig_path
    Path to UCSC bedGraphToBigWig executable.
bwtool_path
    Path to bwtool executable.

A count matrix. The first half of the columns are the read counts on the forward strand, and the second half of the columns are the read counts on the reverse strand.
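
Because the two strands share one matrix, downstream code usually has to split the columns at the midpoint. The following is a minimal sketch of that split, not code from the package: the toy matrices (fwd_toy, rev_toy) and all variable names are made up for illustration and stand in for a real get_sites_counts() result.

```r
# Hypothetical stand-in for the matrix returned by get_sites_counts():
# 3 sites, 4 positions per strand, so 8 columns in total.
fwd_toy <- matrix(1:12, nrow = 3)
rev_toy <- matrix(13:24, nrow = 3)
counts <- cbind(fwd_toy, rev_toy)

# The first half of the columns is the forward strand,
# the second half is the reverse strand.
half <- ncol(counts) / 2
fwd_counts <- counts[, seq_len(half), drop = FALSE]
rev_counts <- counts[, half + seq_len(half), drop = FALSE]

# Example use: pool the two strands position by position.
both_strands <- fwd_counts + rev_counts
```

Pooling the strands this way assumes that column i of each half refers to the same position relative to the site. This page does not state that, so check the column order before adding the halves together.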
if (FALSE) {
  # Extracts ATAC-seq count matrices around candidate sites
  sites_counts.mat <- get_sites_counts(sites,
                                       genomecount_dir = 'processed_data',
                                       genomecount_name = 'K562.ATAC')
}