:py:mod:`labtools.adtools.counter`
==================================

.. py:module:: labtools.adtools.counter


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   labtools.adtools.counter.seq_counter
   labtools.adtools.counter.create_map
   labtools.adtools.counter.convert_bcs_from_map
   labtools.adtools.counter.sort_normalizer
   labtools.adtools.counter.calculate_activity
   labtools.adtools.counter.main


.. py:function:: seq_counter(fastq, design_to_use=None, barcoded=False, only_bcs=False, **kwargs)

   Counts occurrences of ADs or AD-barcode pairs in a fastq file.

   :param fastq: Path to fastq or fastq.gz file.
   :type fastq: str
   :param design_to_use: Path to a CSV file containing an ArrayDNA column.
   :type design_to_use: str, default None
   :param barcoded: Whether to count ADs with different barcodes separately.
   :type barcoded: bool, default False
   :param only_bcs: True, False, or the barcode map to use. If True, barcodes are counted without a map.
   :type only_bcs: bool or barcode map, default False
   :param \*\*kwargs: Additional keyword arguments passed to pull_AD or pull_barcode.
   :type \*\*kwargs: dict

   :returns: **counts** -- Pandas series where indices are AD or AD/barcode sequences and values are counts.
   :rtype: pandas.core.series.Series

   .. rubric:: Examples

   >>> seq_counter("../exampledata/mini.fastq")
   GGTTCTTCTAAATTGAGATGTGATAATAATGCTGCTGCTCATGTTAAATTGGATTCATTTCCAGCTGGTGTTAGATTTGATACATCTGATGAAGAATTGTTGGAACATTTGGCTGCTAAA    1
   GAAGAATTGTTTTTACATTTGTCTGCTAAGATTGGTAGATCTTCTAGGAAACCACATCCATTCTTGGATGAATTTATTCATACTTTGGTTGAAGAAGATGGTATTTGTAGAACTCATCCA    3
   dtype: int64
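
To illustrate the counting step on its own, here is a minimal standard-library sketch. The helper name ``count_reads`` is hypothetical and is not part of ``labtools``; it counts whole read sequences only and does not extract the AD region, consult a design file, or handle barcodes the way ``seq_counter`` does.

```python
import gzip
import os
import tempfile
from collections import Counter
from itertools import islice


def count_reads(path):
    """Count identical read sequences in a FASTQ or gzipped FASTQ file.

    Hypothetical helper for illustration; not the labtools implementation.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Each FASTQ record spans four lines; the sequence is the second one.
        sequences = [line.strip() for line in islice(handle, 1, None, 4)]
    return Counter(sequences)


# Write a tiny three-read FASTQ file to demonstrate the counting.
records = "@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n@r3\nACGT\n+\nIIII\n"
with tempfile.NamedTemporaryFile("w", suffix=".fastq", delete=False) as tmp:
    tmp.write(records)

counts = count_reads(tmp.name)
os.remove(tmp.name)
print(counts["ACGT"], counts["TTGA"])  # prints "2 1"
```

The sketch returns a ``collections.Counter`` rather than a pandas series so that it runs without third-party packages.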


.. py:function:: create_map(ad_bcs, filter=False)

   Converts the AD-barcode pair counts from seq_counter into a barcode-to-AD dictionary.

   If the barcode is found with two different ADs, it is not included in
   the dictionary.

   :param ad_bcs: Output counts from seq_counter with ``barcoded=True``.
   :type ad_bcs: pd.Series
   :param filter: Number of reads below which to ignore the barcode. If False, no read filter is applied.
   :type filter: int or bool, default False

   :returns: **bc_dict** -- Dictionary with barcodes as keys and a single AD as the value for each.
   :rtype: dict
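
The mapping rule described above can be sketched in plain Python. The function ``build_barcode_map`` and its ``min_reads`` argument are hypothetical names, the pair counts are assumed to be keyed by ``(AD, barcode)`` tuples, and the read filter is assumed to apply before the ambiguity check; the actual ``create_map`` may differ on each of these points.

```python
def build_barcode_map(pair_counts, min_reads=0):
    """Map each barcode to its AD, dropping barcodes seen with several ADs.

    Hypothetical helper for illustration; not the labtools implementation.
    """
    ads_per_barcode = {}
    for (ad, barcode), reads in pair_counts.items():
        # Assumption: pairs with fewer than min_reads reads are ignored.
        if reads < min_reads:
            continue
        ads_per_barcode.setdefault(barcode, set()).add(ad)
    # Keep only barcodes that were observed with exactly one AD.
    return {
        barcode: next(iter(ads))
        for barcode, ads in ads_per_barcode.items()
        if len(ads) == 1
    }


pairs = {
    ("AD1", "bcA"): 12,
    ("AD2", "bcB"): 7,
    ("AD3", "bcB"): 5,  # bcB appears with two ADs, so it is dropped
    ("AD1", "bcC"): 1,  # below the read filter, so it is ignored
}
bc_map = build_barcode_map(pairs, min_reads=2)
print(bc_map)  # prints "{'bcA': 'AD1'}"
```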


.. py:function:: convert_bcs_from_map(bcs, bc_dict)

   Converts barcode-only counts to AD counts using a barcode dictionary.

   If the barcode is found with two different ADs, it is not included in
   the dictionary.

   :param bcs: Output counts from seq_counter with ``only_bcs=True``.
   :type bcs: pd.Series
   :param bc_dict: Dictionary from create_map() with barcodes as keys and a single AD as the value for each.
   :type bc_dict: dict

   :returns: **converted** -- Pandas series where indices are AD sequences and values are counts.
   :rtype: pd.Series
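
As a rough illustration of the conversion, the sketch below sums barcode reads per AD using a barcode dictionary. The name ``barcodes_to_ad_counts`` is hypothetical, and dropping barcodes that are missing from the dictionary is an assumption rather than documented behavior of ``convert_bcs_from_map``.

```python
from collections import Counter


def barcodes_to_ad_counts(barcode_counts, bc_dict):
    """Sum barcode read counts per AD using a barcode-to-AD dictionary.

    Hypothetical helper for illustration; not the labtools implementation.
    """
    converted = Counter()
    for barcode, reads in barcode_counts.items():
        ad = bc_dict.get(barcode)
        # Assumption: barcodes absent from the dictionary are skipped.
        if ad is not None:
            converted[ad] += reads
    return converted


barcode_counts = {"bcA": 10, "bcC": 4, "bcX": 3}
bc_dict = {"bcA": "AD1", "bcC": "AD1", "bcD": "AD2"}
ad_counts = barcodes_to_ad_counts(barcode_counts, bc_dict)
print(dict(ad_counts))  # prints "{'AD1': 14}"
```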


.. py:function:: sort_normalizer(pair_counts, bin_counts, thresh=10)

   Normalize by reads per sample, reads per tile and reads per bin.

   :param pair_counts: List of pandas series where indices are AD or AD/barcode sequences and values are counts.
   :type pair_counts: list of pandas.core.series.Series
   :param bin_counts: List of number of cells per bin in the same order as the pair counts.
   :type bin_counts: list
   :param thresh: Number of reads above which to count the unique sequence.
   :type thresh: int, default 10

   :returns: * **df** (*pandas.DataFrame*) -- Pandas dataframe containing the normalized read counts.
             * **numreads** (*pandas.DataFrame*) -- Total read counts for each unique sequence.
             * **reads** (*pandas.DataFrame*) -- Read counts per bin for each unique sequence.

   .. rubric:: Examples

   >>> sort_normalizer([count1, count2], [1000,1000])
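
The exact arithmetic of ``sort_normalizer`` is not spelled out here, so the following is only one plausible reading of the three normalization steps, written as a hedged sketch. The name ``normalize_sort`` is hypothetical, the threshold is assumed to apply to reads summed across bins, and plain dicts stand in for the pandas objects.

```python
def normalize_sort(pair_counts, bin_counts, thresh=10):
    """One plausible sort-seq normalization; not the labtools implementation."""
    total_cells = sum(bin_counts)

    # Assumption: keep sequences whose reads summed over all bins exceed thresh.
    totals = {}
    for counts in pair_counts:
        for seq, reads in counts.items():
            totals[seq] = totals.get(seq, 0) + reads
    kept = [seq for seq, reads in totals.items() if reads > thresh]

    normalized = {seq: [] for seq in kept}
    for counts, cells in zip(pair_counts, bin_counts):
        bin_reads = sum(counts.values())
        for seq in kept:
            # Per sample: fraction of this bin's reads belonging to seq.
            read_fraction = counts.get(seq, 0) / bin_reads
            # Per bin: weight by the share of sorted cells in this bin.
            normalized[seq].append(read_fraction * cells / total_cells)

    # Per sequence: rescale so each sequence's values sum to 1 across bins.
    for seq, row in normalized.items():
        row_sum = sum(row)
        normalized[seq] = [value / row_sum for value in row]
    return normalized


bin1 = {"A": 30, "B": 10, "C": 2}
bin2 = {"A": 10, "B": 30, "C": 2}
result = normalize_sort([bin1, bin2], [1000, 1000])
print(sorted(result))  # prints "['A', 'B']"
```

In this toy input, sequence ``C`` has only four reads in total and falls below the default threshold, so it is left out of the result.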


.. py:function:: calculate_activity(df_in, bin_values, min_max=False)

   Calculate the activity of a normalized sort df.

   :param df_in: Dataframe output of sort_normalizer().
   :type df_in: pandas.DataFrame
   :param bin_values: List of mean or median fluorescence values per bin in the same order as the pair counts.
   :type bin_values: list
   :param min_max: Whether to normalize the activity using min 0 max 1.
   :type min_max: bool, default False

   :returns: **df** -- Pandas dataframe containing the activity values per sequence or sequence-barcode pair.
   :rtype: pandas.DataFrame
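
A short sketch can make the role of ``bin_values`` and ``min_max`` concrete. It assumes that activity is the mean bin fluorescence weighted by each sequence's normalized distribution across bins, which is a common choice but is not confirmed by this page; ``activity_from_bins`` is a hypothetical name, and plain dicts stand in for the dataframe.

```python
def activity_from_bins(normalized, bin_values, min_max=False):
    """Weighted-mean activity per sequence; not the labtools implementation."""
    # Assumption: activity = sum over bins of (normalized weight * bin value).
    activity = {
        seq: sum(weight * value for weight, value in zip(row, bin_values))
        for seq, row in normalized.items()
    }
    if min_max:
        # Rescale so the lowest activity maps to 0 and the highest to 1.
        low, high = min(activity.values()), max(activity.values())
        span = high - low
        activity = {
            seq: (value - low) / span if span else 0.0
            for seq, value in activity.items()
        }
    return activity


normalized = {"A": [0.75, 0.25], "B": [0.25, 0.75]}
bin_values = [100, 1000]
raw = activity_from_bins(normalized, bin_values)
scaled = activity_from_bins(normalized, bin_values, min_max=True)
print(raw)     # prints "{'A': 325.0, 'B': 775.0}"
print(scaled)  # prints "{'A': 0.0, 'B': 1.0}"
```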


.. py:function:: main()