`labtools.adtools.counter`

Module Contents

Functions

`seq_counter`(fastq[, design_to_use, barcoded, only_bcs])	Counts occurrences of ADs or AD-barcode pairs in a fastq file.
`create_map`(ad_bcs[, filter])	Converts output of seq_counter with AD,bc pairs to a dict map.
`convert_bcs_from_map`(bcs, bc_dict)	Takes bc only data and uses a barcode dictionary to return AD counts.
`sort_normalizer`(pair_counts, bin_counts[, thresh])	Normalize by reads per sample, reads per tile and reads per bin.
`calculate_activity`(df_in, bin_values[, min_max])	Calculate the activity of a normalized sort df.
`main`()

labtools.adtools.counter.seq_counter(fastq, design_to_use=None, barcoded=False, only_bcs=False, **kwargs)[source]

Counts occurrences of ADs or AD-barcode pairs in a fastq file.

Parameters:

fastq (str) – Path to fastq or fastq.gz file.
design_to_use (str, default None) – Path to csv file containing ArrayDNA column.
barcoded (bool, default False) – Whether to count ADs with different barcodes separately.
only_bcs (bool or barcode map, default False) – True, False, or the barcode map to use. If True, barcodes are counted without a map.
**kwargs (dict) – Additional keyword arguments passed to pull_AD or pull_barcode.

Returns:

counts – Pandas series where indices are AD or AD/barcode sequences and values are counts.

Return type:

pandas.core.series.Series

Examples

>>> seq_counter("../exampledata/mini.fastq")
GGTTCTTCTAAATTGAGATGTGATAATAATGCTGCTGCTCATGTTAAATTGGATTCATTTCCAGCTGGTGTTAGATTTGATACATCTGATGAAGAATTGTTGGAACATTTGGCTGCTAAA    1
GAAGAATTGTTTTTACATTTGTCTGCTAAGATTGGTAGATCTTCTAGGAAACCACATCCATTCTTGGATGAATTTATTCATACTTTGGTTGAAGAAGATGGTATTTGTAGAACTCATCCA    3
dtype: int64
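The counting step behind output like the example above can be sketched in plain Python. This is a minimal illustration, not the implementation of seq_counter: `count_reads` is a hypothetical helper that tallies whole read sequences, assumes unwrapped four-line FASTQ records, and does not extract the AD or barcode from within each read the way pull_AD or pull_barcode would.

```python
import gzip
from collections import Counter
from itertools import islice


def count_reads(path):
    """Count identical read sequences in a FASTQ or FASTQ.gz file.

    Hypothetical helper for illustration only; it counts the full read
    rather than an AD or barcode extracted from it.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        # Assumes four lines per record; the sequence is the second line.
        for seq in islice(handle, 1, None, 4):
            counts[seq.strip()] += 1
    return counts
```

The resulting Counter maps each sequence to its count, which is the same shape of data as the series in the example above.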

labtools.adtools.counter.create_map(ad_bcs, filter=False)[source]

Converts output of seq_counter with AD,bc pairs to a dict map.

If the barcode is found with two different ADs, it is not included in the dictionary.

Parameters:

ad_bcs (pd.Series) – Output counts from seq_counter() with barcoded=True.
filter (int, default False) – Number of reads below which to ignore the barcode.

Returns:

bc_dict – Dictionary with barcodes as keys and a single AD as the value for each barcode.

Return type:

dict
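The rule that a barcode seen with two different ADs is left out of the map can be sketched as follows. This is an illustration under stated assumptions, not the code of create_map: `build_barcode_map` and `min_reads` are hypothetical names, the input is assumed to be keyed by (AD, barcode) tuples, and the read filter is assumed to apply to each pair before ambiguity is checked.

```python
def build_barcode_map(pair_counts, min_reads=0):
    """Map each barcode to its single AD, dropping ambiguous barcodes.

    pair_counts: mapping of (AD, barcode) tuples to read counts
    (an assumed layout for this sketch).
    """
    ads_per_barcode = {}
    for (ad, barcode), reads in pair_counts.items():
        if reads < min_reads:
            continue  # assumption: low-read pairs are ignored first
        ads_per_barcode.setdefault(barcode, set()).add(ad)
    # Keep only barcodes that were observed with exactly one AD.
    return {
        barcode: next(iter(ads))
        for barcode, ads in ads_per_barcode.items()
        if len(ads) == 1
    }
```

Note that under this filter-first assumption, a barcode paired with two ADs can still enter the map if one of the pairs falls below the threshold; filtering after the ambiguity check would give a stricter map.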

labtools.adtools.counter.convert_bcs_from_map(bcs, bc_dict)[source]

Takes bc only data and uses a barcode dictionary to return AD counts.

If the barcode is found with two different ADs, it is not included in the dictionary.

Parameters:

bcs (pd.Series) – Output counts from seq_counter() with only_bcs=True.
bc_dict (dict) – Dictionary from create_map() with barcodes as keys and a single AD as the value for each barcode.

Returns:

converted – Pandas series where indices are AD sequences and values are counts.

Return type:

pd.Series
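Converting barcode counts to AD counts amounts to a lookup followed by a sum over barcodes that share an AD. The sketch below shows that idea with plain dictionaries; `barcodes_to_ad_counts` is a hypothetical name, and dropping barcodes that are missing from the map is an assumption rather than documented behavior of convert_bcs_from_map.

```python
from collections import Counter


def barcodes_to_ad_counts(barcode_counts, bc_dict):
    """Sum barcode read counts per AD using a barcode-to-AD dictionary."""
    ad_counts = Counter()
    for barcode, reads in barcode_counts.items():
        ad = bc_dict.get(barcode)
        if ad is not None:  # assumption: unmapped barcodes are skipped
            ad_counts[ad] += reads
    return ad_counts
```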

labtools.adtools.counter.sort_normalizer(pair_counts, bin_counts, thresh=10)[source]

Normalize by reads per sample, reads per tile and reads per bin.

Parameters:

pair_counts (list of pandas.core.series.Series) – List of pandas series where indices are AD or AD/barcode sequences and values are counts.
bin_counts (list) – List of number of cells per bin in the same order as the pair counts.
thresh (int, default 10) – Number of reads above which to count the unique sequence.

Returns:

df (pandas.DataFrame) – Pandas dataframe containing the normalized read counts.
numreads (pandas.DataFrame) – Total read counts for each unique sequence.
reads (pandas.DataFrame) – Read counts per bin for each unique sequence.

Examples

>>> sort_normalizer([count1, count2], [1000,1000])
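The exact arithmetic of sort_normalizer is not spelled out here, so the following is only one plausible reading of "reads per sample, reads per tile and reads per bin", written with plain dictionaries. `normalize_sort_counts` is a hypothetical function; applying `thresh` to the total reads per sequence, weighting each bin by its share of cells, and rescaling each sequence to sum to 1 are all assumptions of this sketch.

```python
def normalize_sort_counts(pair_counts, bin_counts, thresh=10):
    """Return per-sequence fractions across bins (illustrative only)."""
    # Total reads per sequence across all bins.
    totals = {}
    for counts in pair_counts:
        for seq, reads in counts.items():
            totals[seq] = totals.get(seq, 0) + reads
    # Assumption: keep sequences whose total reads exceed the threshold.
    kept = [seq for seq, total in totals.items() if total > thresh]

    total_cells = sum(bin_counts)
    weighted = {seq: [] for seq in kept}
    for counts, cells in zip(pair_counts, bin_counts):
        sample_reads = sum(counts.values())
        for seq in kept:
            # Reads per sample: share of this bin's reads.
            share = counts.get(seq, 0) / sample_reads if sample_reads else 0.0
            # Reads per bin: weight by the bin's share of sorted cells.
            weighted[seq].append(share * cells / total_cells)

    normalized = {}
    for seq, values in weighted.items():
        row_sum = sum(values)
        # Reads per tile: rescale so each sequence sums to 1 across bins.
        normalized[seq] = [v / row_sum for v in values] if row_sum else values
    return normalized
```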

labtools.adtools.counter.calculate_activity(df_in, bin_values, min_max=False)[source]

Calculate the activity of a normalized sort df.

Parameters:

df_in (pandas.DataFrame) – Dataframe output of sort_normalizer()
bin_values (list) – List of mean or median fluorescence values per bin in the same order as the pair counts.
min_max (bool, default False) – Whether to normalize the activity using min 0 max 1.

Returns:

df – Pandas dataframe containing the activity values per sequence or sequence-barcode pair.

Return type:

pandas.DataFrame
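A common way to turn per-bin fractions into a single activity value is a weighted mean of the bin fluorescence values, optionally rescaled to the range 0 to 1. The sketch below shows that calculation; it is an assumption about what calculate_activity computes, and `weighted_activity` along with its dictionary input are hypothetical stand-ins for the dataframe returned by sort_normalizer().

```python
def weighted_activity(bin_fractions, bin_values, min_max=False):
    """Weighted mean of bin values per sequence (illustrative only).

    bin_fractions: mapping of sequence to a list of normalized values,
    one per bin, in the same order as bin_values.
    """
    activity = {}
    for seq, fractions in bin_fractions.items():
        total = sum(fractions)
        if total == 0:
            continue  # no signal in any bin, so no activity is defined
        activity[seq] = sum(f * v for f, v in zip(fractions, bin_values)) / total

    if min_max and activity:
        lowest = min(activity.values())
        span = max(activity.values()) - lowest
        if span > 0:
            # Rescale so the least active sequence is 0 and the most active is 1.
            activity = {seq: (a - lowest) / span for seq, a in activity.items()}
    return activity
```

With two bins valued 100 and 300, a sequence split evenly between them gets an activity of 200, or 0.5 after min-max scaling when the other sequences sit entirely in one bin each.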

labtools.adtools.counter.main()[source]
