monoseq
monoseq is a Python library for pretty-printing DNA and protein sequences using a monospace font. It also provides a simple command line interface.
Sequences are pretty-printed in the traditional way, as blocks of letters with each line prefixed by its sequence position. User-specified regions can be highlighted, and the output format can be HTML or plaintext, optionally styled with ANSI escape codes for use in a terminal.
A simple example:
>>> from monoseq import pprint_sequence
>>> sequence = 'MIMANQPLWLDSEVEMNHYQQSHIKSKSPYFPEDKHICWIKIFKAFGT' * 4
>>> print(pprint_sequence(sequence))
1 MIMANQPLWL DSEVEMNHYQ QSHIKSKSPY FPEDKHICWI KIFKAFGTMI MANQPLWLDS
61 EVEMNHYQQS HIKSKSPYFP EDKHICWIKI FKAFGTMIMA NQPLWLDSEV EMNHYQQSHI
121 KSKSPYFPED KHICWIKIFK AFGTMIMANQ PLWLDSEVEM NHYQQSHIKS KSPYFPEDKH
181 ICWIKIFKAF GT
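The block layout shown above can be approximated in a few lines of plain Python. The following is a minimal sketch, not monoseq's own implementation: the function name `format_blocks` and its parameters `block_length` and `blocks_per_line` are hypothetical and chosen only for illustration.

```python
def format_blocks(sequence, block_length=10, blocks_per_line=6):
    """Lay out a sequence as numbered lines of space-separated blocks.

    Hypothetical helper for illustration; not part of monoseq.
    """
    line_length = block_length * blocks_per_line
    # The position column must be wide enough for the start of the last line.
    last_start = max(len(sequence) - 1, 0) // line_length * line_length + 1
    width = len(str(last_start))
    lines = []
    for start in range(0, len(sequence), line_length):
        chunk = sequence[start:start + line_length]
        # Split the line into fixed-size blocks; the last one may be shorter.
        blocks = [chunk[i:i + block_length]
                  for i in range(0, len(chunk), block_length)]
        # Positions are reported one-based and right-aligned.
        lines.append(str(start + 1).rjust(width) + ' ' + ' '.join(blocks))
    return '\n'.join(lines)


sequence = 'MIMANQPLWLDSEVEMNHYQQSHIKSKSPYFPEDKHICWIKIFKAFGT' * 4
print(format_blocks(sequence))
```

Each line holds up to six blocks of ten letters, and the number in front of a line is the one-based position of its first letter.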
An example, admittedly contrived, with annotations:
>>> from monoseq import AnsiFormat
>>> twelves = [(p, p + 1) for p in range(11, len(sequence), 12)]
>>> conserved = [(11, 37), (222, 247)]
>>> middle = [(len(sequence) // 3, len(sequence) // 3 * 2)]
>>> print(pprint_sequence(sequence, format=AnsiFormat,
...                       annotations=[conserved, twelves, middle]))
1 cgcactcaaa acaaaggaag accgtcctcg actgcagagg aagcaggaag ctgtcggccc
61 agctctgagc ccagctgctg gagccccgag cagcggcatg gagtccgtgg ccctgtacag
121 ctttcaggct acagagagcg acgagctggc cttcaacaag ggagacacac tcaagatcct
181 gaacatggag gatgaccaga actggtacaa ggccgagctc cggggtgtcg agggatttat
241 tcccaagaac tacatccgcg tcaag
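To make the idea of an annotation concrete, the sketch below wraps regions of a string in ANSI escape codes. It is an illustration only and does not reflect how monoseq is implemented: `highlight_regions` is a hypothetical helper, and it assumes that regions are zero-based `(start, stop)` pairs with an exclusive stop, as the `twelves` example above suggests.

```python
BOLD = '\033[1m'   # ANSI escape code: switch on bold
RESET = '\033[0m'  # ANSI escape code: reset all styling


def highlight_regions(sequence, regions, start_code=BOLD, stop_code=RESET):
    """Wrap each (start, stop) region of `sequence` in styling codes.

    Hypothetical helper for illustration; regions are assumed to be
    zero-based with an exclusive stop position.
    """
    pieces = []
    position = 0
    for start, stop in sorted(regions):
        start = max(start, position)     # skip overlap with the previous region
        stop = min(stop, len(sequence))  # clip regions that run past the end
        if start >= stop:
            continue
        pieces.append(sequence[position:start])
        pieces.append(start_code + sequence[start:stop] + stop_code)
        position = stop
    pieces.append(sequence[position:])
    return ''.join(pieces)


# Highlight every twelfth letter of a short sequence.
sequence = 'MIMANQPLWLDSEVEMNHYQQSHIKSKSPYFPEDKHICWIKIFKAFGT'
twelves = [(p, p + 1) for p in range(11, len(sequence), 12)]
print(highlight_regions(sequence, twelves))
```

Passing different codes, for example `start_code='<b>'` and `stop_code='</b>'`, gives an HTML flavour of the same idea.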
An IPython Notebook demonstrating how to pretty-print sequences within a notebook is also available.
User documentation
New users should probably start here.
API reference
Documentation on a specific function, class or method can be found in the API reference.