:py:mod:`labtools.adtools.seqlib`
=================================

.. py:module:: labtools.adtools.seqlib


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   labtools.adtools.seqlib.read_fasta
   labtools.adtools.seqlib.read_fastq
   labtools.adtools.seqlib.read_fastq_big
   labtools.adtools.seqlib.get_numreads
   labtools.adtools.seqlib.get_numreads_old
   labtools.adtools.seqlib.write_bc_dict
   labtools.adtools.seqlib.read_bc_dict


.. py:function:: read_fasta(filename)

   Generator for reading entries in a fasta file.

   Yields 2 lines of a fasta file at a time (name, seq).

   :param filename: Path to fasta or fasta.gz file.
   :type filename: str

   :Yields: **(name, seq)** (*(str, str)*) -- Name of sequence, biological sequence.

   .. rubric:: Examples

   >>> for name, seq in read_fasta("example.fasta"):
   ...     print(name, seq)
   Geraldine ACGTGCTGAGGCTGCGCTAGCAT
   Gustavo CTGATGCTAGATGCTGATA
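
The two-line reading pattern described for ``read_fasta`` can be sketched in plain Python. This is a minimal illustration and not the ``labtools`` implementation: the helper name ``iter_fasta_pairs`` is hypothetical, and stripping the leading ``>`` from the header is an assumption based on the example output above.

```python
import gzip
import os
import tempfile


def iter_fasta_pairs(path):
    """Yield (name, seq) from a two-line-per-record fasta or fasta.gz file.

    Hypothetical stand-in for illustration; not the labtools implementation.
    """
    # gzip.open in text mode ("rt") returns str lines, like plain open().
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            name = handle.readline().strip()
            seq = handle.readline().strip()
            if not name:
                # Empty header line: end of file reached.
                break
            # Assumption: drop the ">" that marks a fasta header line.
            yield name.lstrip(">"), seq


# Write a small example file, then read it back.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "example.fasta")
with open(demo_path, "w") as out:
    out.write(">Geraldine\nACGTGCTGAGGCTGCGCTAGCAT\n")
    out.write(">Gustavo\nCTGATGCTAGATGCTGATA\n")

records = list(iter_fasta_pairs(demo_path))
for name, seq in records:
    print(name, seq)
```

Because the helper is a generator, only one record is held in memory at a time. It assumes each sequence sits on a single line, so wrapped (multi-line) fasta records would need a different parser.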


.. py:function:: read_fastq(filename, subset=None)

   Generator for reading entries in a fastq file.

   Reads 4 lines of a fastq file at a time (name, seq, +, quality) and yields
   the name, sequence, and quality of each entry.

   :param filename: Path to fastq or fastq.gz file.
   :type filename: str
   :param subset: Number of reads to randomly sample from the fastq file.
   :type subset: int, optional

   :Yields: **(name, seq, qual)** (*(str, str, str)*) -- Tuple of str containing the name, sequence, and quality string of each entry.

   .. rubric:: Examples

   >>> for name, seq, qual in read_fastq("example.fastq"):
   ...     print(name, seq)
   Geraldine ACGTGCTGAGGCTGCGCTAGCAT
   Gustavo CTGATGCTAGATGCTGATA
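
The four-line grouping described for ``read_fastq`` might look like the sketch below. The helper name ``iter_fastq_records`` is hypothetical, and the code is an illustration under the assumption that every record occupies exactly four lines; it is not the ``labtools`` implementation and does not cover the ``subset`` parameter.

```python
import gzip
import os
import tempfile
from itertools import islice


def iter_fastq_records(path):
    """Yield (name, seq, qual) from a fastq or fastq.gz file.

    Hypothetical stand-in for illustration; not the labtools implementation.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            # Take the next four lines: name, sequence, "+", quality.
            chunk = [line.rstrip("\n") for line in islice(handle, 4)]
            if len(chunk) < 4:
                # End of file (or a truncated final record).
                break
            name, seq, _plus, qual = chunk
            yield name, seq, qual


# Write a small gzipped example file, then read it back.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "example.fastq.gz")
with gzip.open(demo_path, "wt") as out:
    out.write("@read1\nACGT\n+\nIIII\n")
    out.write("@read2\nTTGA\n+\nFFFF\n")

records = list(iter_fastq_records(demo_path))
```

The ``+`` separator line is read but discarded, which is why each yielded tuple has three elements even though four lines are consumed per record.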


.. py:function:: read_fastq_big(filename, subset=None, progress=True, **kwargs)

   Generator for reading entries in a fastq file without loading the whole file into memory.

   Reads 4 lines of a fastq file at a time (name, seq, +, quality).
   Useful when the fastq file is large and loading it into RAM
   would crash the computer. Supports subsetting with
   sklearn.utils.random.sample_without_replacement().

   :param filename: Path to fastq or fastq.gz file.
   :type filename: str
   :param subset: Number of reads to randomly subsample from file.
   :type subset: int, optional

   :Yields: **(name, seq, qual)** (*(str, str, str)*) -- Tuple of str containing the name, sequence, and quality string of each entry.

   .. rubric:: Examples

   >>> for name, seq, qual in read_fastq_big("example.fastq"):
   ...     print(name, seq)
   Geraldine ACGTGCTGAGGCTGCGCTAGCAT
   Gustavo CTGATGCTAGATGCTGATA
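
One way to subsample a large fastq file without holding it in memory is a two-pass stream: count the records, pick the indices to keep, then read again and yield only those. The sketch below illustrates that idea. It uses the standard library's ``random.sample`` as a stand-in for the scikit-learn sampler mentioned above, and the helper name ``sample_fastq_stream`` and its ``seed`` parameter are hypothetical.

```python
import os
import random
import tempfile
from itertools import islice


def sample_fastq_stream(path, subset, seed=None):
    """Yield a random subset of (name, seq, qual) records from a fastq file.

    Hypothetical stand-in for illustration; not the labtools implementation.
    """
    # First pass: count records without keeping any of them.
    with open(path) as handle:
        n_reads = sum(1 for _ in handle) // 4

    # Choose record indices to keep (sampling without replacement).
    rng = random.Random(seed)
    keep = set(rng.sample(range(n_reads), min(subset, n_reads)))

    # Second pass: stream the records and yield only the chosen ones.
    with open(path) as handle:
        for index in range(n_reads):
            name, seq, _plus, qual = (
                line.rstrip("\n") for line in islice(handle, 4)
            )
            if index in keep:
                yield name, seq, qual


# Build a ten-read example file and draw three reads from it.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "example.fastq")
with open(demo_path, "w") as out:
    for i in range(10):
        out.write(f"@read{i}\nACGT\n+\nIIII\n")

sampled = list(sample_fastq_stream(demo_path, subset=3, seed=0))
all_names = {f"@read{i}" for i in range(10)}
```

Memory use is bounded by the size of the index set rather than the file, at the cost of reading the file twice. The sampled reads come out in file order, not in random order.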


.. py:function:: get_numreads(filename)

   Returns number of reads in a fastq or fastq.gz file.

   :param filename: Path to fastq or fastq.gz file.
   :type filename: str

   :returns: **numreads** -- Number of reads in the fastq file.
   :rtype: int

   .. rubric:: Examples

   >>> get_numreads("example.fastq")
   124
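
A read count for a fastq file can be derived from its line count, since each record spans four lines. The sketch below shows that approach for both plain and gzipped files. The helper name ``count_fastq_reads`` is hypothetical, and this is not necessarily how ``get_numreads`` is implemented.

```python
import gzip
import os
import tempfile


def count_fastq_reads(path):
    """Return the number of four-line records in a fastq or fastq.gz file.

    Hypothetical stand-in for illustration; not the labtools implementation.
    """
    opener = gzip.open if path.endswith(".gz") else open
    # Binary mode avoids decoding overhead; only line boundaries matter.
    with opener(path, "rb") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


# Write the same five reads to a plain file and a gzipped file.
demo_dir = tempfile.mkdtemp()
content = "".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(5))

plain_path = os.path.join(demo_dir, "example.fastq")
with open(plain_path, "w") as out:
    out.write(content)

gz_path = os.path.join(demo_dir, "example.fastq.gz")
with gzip.open(gz_path, "wt") as out:
    out.write(content)

plain_count = count_fastq_reads(plain_path)
gz_count = count_fastq_reads(gz_path)
```

Integer division by four silently ignores a truncated final record, so a stricter version might raise an error when the line count is not a multiple of four.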


.. py:function:: get_numreads_old(filename)

   Returns number of reads in a fastq file.

   :param filename: Path to fastq file.
   :type filename: str

   :returns: **numreads** -- Number of reads in the fastq file.
   :rtype: int

   .. rubric:: Examples

   >>> get_numreads_old("example.fastq")
   124


.. py:function:: write_bc_dict(bc_dict, name)

   Writes bc_dict to a csv.

   :param bc_dict: Dictionary output from counter.create_map().
   :type bc_dict: dict
   :param name: Filename for output csv. Example: "Library1_dictionary".
   :type name: str


.. py:function:: read_bc_dict(filename)

   Reads bc_dict from a csv.

   :param filename: Path to csv containing a single dictionary.
   :type filename: str

   :returns: **bc_dict** -- Dictionary read from the csv file.
   :rtype: dict
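
A round trip of a dictionary through a csv file can be sketched with the standard ``csv`` module. Everything specific here is an assumption made for illustration: the helper names ``save_dict_csv`` and ``load_dict_csv`` are hypothetical, the file layout (one ``key,value`` row per entry) may differ from what ``write_bc_dict`` produces, the ``.csv`` extension is appended on the assumption that ``name`` is given without one, and the dictionary is treated as a flat mapping of str to str.

```python
import csv
import os
import tempfile


def save_dict_csv(mapping, name):
    """Write one key,value row per entry to "<name>.csv" and return the path.

    Hypothetical stand-in for illustration; not the labtools implementation.
    """
    path = name + ".csv"  # assumption: name is passed without an extension
    with open(path, "w", newline="") as out:
        writer = csv.writer(out)
        for key, value in mapping.items():
            writer.writerow([key, value])
    return path


def load_dict_csv(path):
    """Read key,value rows back into a dict (all values come back as str)."""
    with open(path, newline="") as handle:
        return {row[0]: row[1] for row in csv.reader(handle)}


# Made-up barcode-to-sequence mapping used only for this demonstration.
demo_dir = tempfile.mkdtemp()
bc_dict = {"ACGTACGT": "TTGACC", "GGCATTAC": "CCAGTA"}

csv_path = save_dict_csv(bc_dict, os.path.join(demo_dir, "Library1_dictionary"))
restored = load_dict_csv(csv_path)
```

A csv file stores only text, so non-string values such as integers or lists would need explicit conversion after loading.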