:py:mod:`labtools.adtools.counter` ================================== .. py:module:: labtools.adtools.counter Module Contents --------------- Functions ~~~~~~~~~ .. autoapisummary:: labtools.adtools.counter.seq_counter labtools.adtools.counter.create_map labtools.adtools.counter.convert_bcs_from_map labtools.adtools.counter.sort_normalizer labtools.adtools.counter.calculate_activity labtools.adtools.counter.main .. py:function:: seq_counter(fastq, design_to_use=None, barcoded=False, only_bcs=False, **kwargs) Counts occurences of ADs or AD-barcode pairs in a fastq file. :param fastq: Path to fastq or fastq.gz file. :type fastq: str :param design_to_use: Path to csv file containing ArrayDNA column. :type design_to_use: str, default None :param barcoded: Whether to count ADs with different barcodes separately. :type barcoded: bool, default False :param only_bcs: True, False or the barcode map to use. If True, no map is used. :type only_bcs: default False :param \*\*kwargs: Add additional arguments to pass to pull_AD or pull_barcode. :type \*\*kwargs: dict :returns: **counts** -- Pandas series where indices are AD or AD/barcode sequences and values are counts. :rtype: pandas.core.series.Series .. rubric:: Examples >>> seq_counter("../exampledata/mini.fastq") GGTTCTTCTAAATTGAGATGTGATAATAATGCTGCTGCTCATGTTAAATTGGATTCATTTCCAGCTGGTGTTAGATTTGATACATCTGATGAAGAATTGTTGGAACATTTGGCTGCTAAA 1 GAAGAATTGTTTTTACATTTGTCTGCTAAGATTGGTAGATCTTCTAGGAAACCACATCCATTCTTGGATGAATTTATTCATACTTTGGTTGAAGAAGATGGTATTTGTAGAACTCATCCA 3 dtype: int64 .. py:function:: create_map(ad_bcs, filter=False) Converts output of seq_counter with AD,bc pairs to a dict map. If the barcode is found with two different ADs, it is not included in the dictionary. :param ad_bcs: output counts from seq_counter with barcoded = True. :type ad_bcs: pd.Series :param filter: Number of reads below which to ignore the barcode. :type filter: int, default False :returns: **bc_dict** -- Dictionary with barcodes as keys and 1 AD as value. :rtype: dict .. py:function:: convert_bcs_from_map(bcs, bc_dict) Takes bc only data and uses a barcode dictionary to return AD counts. If the barcode is found with two different ADs, it is not included in the dictionary. :param bcs: output counts from seq_counter with only_bcs = True. :type bcs: pd.Series :param bc_dict: Dictionary with barcodes as keys and 1 AD as value from create_map(). :type bc_dict: dict :returns: **converted** -- Pandas series where indices are AD sequences and values are counts. :rtype: pd.Series .. py:function:: sort_normalizer(pair_counts, bin_counts, thresh=10) Normalize by reads per sample, reads per tile and reads per bin. :param pair_counts: List of pandas series where indices are AD or AD/barcode sequences and values are counts. :type pair_counts: list of pandas.core.series.Series :param bin_counts: List of number of cells per bin in the same order as the pair counts. :type bin_counts: list :param thresh: Number of reads above which to count the unique sequence. :type thresh: int, default 10 :returns: * **df** (*pandas.DataFrame*) -- Pandas dataframe containing the normalized read counts. * **numreads** (*pandas.DataFrame*) -- Total read counts for each unique sequence. * **reads** (*pandas.DataFrame*) -- Read counts per bin for each unique sequence. .. rubric:: Examples >>> sort_normalizer([count1, count2], [1000,1000]) .. py:function:: calculate_activity(df_in, bin_values, min_max=False) Calculate the activity of a normalized sort df. :param df_in: Dataframe output of sort_normalizer() :type df_in: pandas.DataFrame :param bin_values: List of mean or median fluorescence values per bin in the same order as the pair counts. :type bin_values: list :param min_max: Whether to normalize the activity using min 0 max 1. :type min_max: bool, default False :returns: **df** -- Pandas dataframe containing the activity values per sequence or sequence-barcode pair. :rtype: pandas.DataFrame .. py:function:: main()