Extracts read counts around candidate binding sites on both strands from the genome counts data (BigWig files generated by count_genome_cuts()). It uses the extract bed command of bwtool to extract the read counts, then combines them into one matrix: the first half of the columns holds the read counts on the forward strand, and the second half holds the read counts on the reverse strand.

get_sites_counts(
  sites,
  genomecount_dir,
  genomecount_name,
  tmpdir = genomecount_dir,
  bedGraphToBigWig_path = "bedGraphToBigWig",
  bwtool_path = "bwtool"
)

Arguments

sites

A data frame containing the candidate sites.

genomecount_dir

Directory containing the genome counts; the same as outdir in count_genome_cuts().

genomecount_name

File prefix of the genome counts; the same as outname in count_genome_cuts().

tmpdir

Temporary directory for intermediate files. Defaults to genomecount_dir.

bedGraphToBigWig_path

Path to UCSC bedGraphToBigWig executable.

bwtool_path

Path to bwtool executable.
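
Because both path arguments default to bare executable names, the tools need to be discoverable on the system PATH unless full paths are supplied. A minimal sketch of a pre-flight check follows; find_tools() is a hypothetical helper written for illustration and is not part of this package. It relies only on base R's Sys.which().

```r
# Hypothetical helper (not part of the package): report whether the
# external executables can be located on the PATH.
find_tools <- function(tools = c("bedGraphToBigWig", "bwtool")) {
  paths <- Sys.which(tools)  # named character vector; "" when not found
  data.frame(
    tool  = names(paths),
    path  = unname(paths),
    found = unname(nzchar(paths)),
    stringsAsFactors = FALSE
  )
}

tool_status <- find_tools()
print(tool_status)
```

If a tool is reported as not found, its full path can be passed through bedGraphToBigWig_path or bwtool_path instead.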

Value

A count matrix. The first half of the columns are the read counts on the forward strand, and the second half of the columns are the read counts on the reverse strand.
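
The column layout described above can be illustrated with a short sketch that splits such a matrix into its strand-specific halves. The matrix here is a made-up stand-in (3 sites, 4 positions per strand), not real output, and the variable names are chosen for illustration only.

```r
# Toy stand-in for the returned matrix: 3 sites (rows) and 4 positions
# per strand, so 8 columns in total. Values are made up.
counts <- matrix(1:24, nrow = 3, ncol = 8)

# Half the columns belong to each strand; the total must be even.
n_pos <- ncol(counts) / 2
stopifnot(n_pos == floor(n_pos))

fwd_counts <- counts[, seq_len(n_pos), drop = FALSE]          # forward strand
rev_counts <- counts[, n_pos + seq_len(n_pos), drop = FALSE]  # reverse strand

# Per-position counts summed over both strands
total_counts <- fwd_counts + rev_counts
```

Using drop = FALSE keeps the result a matrix even when only a single site is present.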

Examples

if (FALSE) {
# Extract the ATAC-seq count matrix around candidate sites
sites_counts.mat <- get_sites_counts(sites,
                                     genomecount_dir='processed_data',
                                     genomecount_name='K562.ATAC')
}