Discover Top Posts Tagged with #biopython

Unraveling the Mysteries of COVID-19 Through Data Science

In the past few years, our world has been upended by the COVID-19 pandemic, affecting millions of lives and reshaping the future of public health, economies, and how we connect as a global community. Amidst these challenges, data science has emerged as a beacon of hope, offering innovative solutions and deep insights into the virus that has changed our way of life. The Power of Genomic…


#Biopython #CollaborativeScience #Covid-19 #DataScience #DataVisualization #Genomics #OpenScience #PandemicResponse #PublicHealth #VaccineDevelopment #ViralGenomics

EMBL-EBI exercises and course material on different topics

I found some very nice exercises and teaching material from the EMBL workshop on Plant and Pathogen Genomics!

I got the link from here:

https://twitter.com/widdowquinn/status/734495123397038080

Also make sure to have a look at the "train on-line" part of the website:

https://www.ebi.ac.uk/training/online/course-list

It has some very interesting courses. Some of them also include video from the actual course.

#bioinformatics #python #biopython

I'm terrified to test my Python code right now, because if it doesn't work I'm going to dissolve in a huge puddle of tears.

#saro and college #python #biopython #please let this code work

Programming is making me sad, I can't figure out how to check a string for a specific sequence. I am the worst programmer, it's me.
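For what it's worth, plain Python strings can already do this kind of check, no Biopython required. The snippet below is only a sketch: `find_all` and the example sequence are made up for illustration, and it assumes exact, case-sensitive matching.

```python
def find_all(sequence, motif):
    """Return the 0-based start index of every match, overlapping ones included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one character later so overlapping matches are not skipped.
        start = sequence.find(motif, start + 1)
    return positions


dna = "GATTACAGATTACA"  # made-up example sequence

print("TTAC" in dna)             # membership test: is the motif present at all?
print(dna.find("TTAC"))          # index of the first match, or -1 if absent
print(find_all(dna, "GATTACA"))  # start index of every match
```

The `in` operator is enough when only a yes/no answer is needed; `str.find` is the one to reach for when the position matters.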

#biopython #bioinformatics #saro and college #sighs

And the summer ends

The coordinate mapper, with updated documentation, is now located on [this branch](https://github.com/lennax/biopython/tree/f_loc4), awaiting the merging of Peter's f_loc4 branch. I've written an [entry](http://biopython.org/wiki/Coordinate_mapping) on coordinate mapping for the Cookbook. Additionally, at Peter's suggestion, I've written a clarification of strand as it relates to transcription and translation. It's available [here](https://docs.google.com/document/d/11R7EOJXn90lN5_SmaPOyN5rFfPQybbCbUBo6EY0R0pA/edit). It's been a great experience working with this project this summer. Thank you to everyone involved.

#BioPython #gsoc #gsoc12 #GSOC2012 #python

Strand

The summer is winding to a close. I've spent this week busy with orientation events and meetings for my upcoming PhD program. I hope to have time to continue to contribute to Biopython in my spare time, and ideally I would like to use and expand Biopython as a portion of my research.

I have been considering how to handle gene strandedness. As long as I'm correctly interpreting the following position, my coordinate mapper should produce the correct coordinates with negative strand or mixed strand features.

    GenBank:   join(complement(25..30), 36..40)
    Biopython: FeatureLocation(24, 30, -1) + FeatureLocation(35, 40)

    23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
          <----------------                ------------->
           5  4  3  2  1  0                 6  7  8  9 10

I have to admit that it wasn't until I read a BioStar [post](http://biostars.org/post/show/3423/forward-and-reverse-strand-conventions/) earlier this week that I fully understood the relationship between plus/minus forward/reverse sense/antisense coding/template strands. So please let me know as soon as possible if I've made a mistake in the above code.

`c2g` yields the correct genome position, but not the strand. I still need to integrate strand information into my `GenomePosition` object and/or partially merge it with `ExactLocation`.

This weekend I intend to expand documentation and write a brief cookbook entry.
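To make the `join(complement(25..30), 36..40)` example concrete, here is a minimal sketch of a CDS-to-genome lookup that also reports the strand. It is not the mapper's actual `c2g`: the function name `cds_to_genome` and the plain `(start, end, strand)` tuples are assumptions for illustration, using the same 0-based, half-open coordinates as `FeatureLocation`.

```python
def cds_to_genome(exons, cds_index):
    """Map a 0-based CDS index to (0-based genome position, strand).

    exons: (start, end, strand) tuples in transcript order, 0-based and
    half-open. Hypothetical stand-ins for FeatureLocation objects.
    """
    if cds_index < 0:
        raise IndexError("negative CDS indices are not handled in this sketch")
    offset = cds_index
    for start, end, strand in exons:
        length = end - start
        if offset < length:
            if strand == -1:
                # On the minus strand the exon is read from its last base backward.
                return end - 1 - offset, strand
            return start + offset, strand
        offset -= length
    raise IndexError("CDS index %d is outside the exons" % cds_index)


# join(complement(25..30), 36..40) expressed as 0-based half-open tuples
exons = [(24, 30, -1), (35, 40, +1)]

for i in (0, 5, 6, 10):
    pos, strand = cds_to_genome(exons, i)
    # Add 1 to show the GenBank-style 1-based position from the diagram.
    print(i, pos + 1, strand)
```

CDS indices 0 and 5 land on 1-based positions 30 and 25 on the minus strand, and indices 6 and 10 land on 36 and 40 on the plus strand, which agrees with the diagram.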

#BioPython #gsoc #gsoc12 #GSOC2012 #python

Coordinate Mapping update

Following extensive [discussion](http://biopython.org/pipermail/biopython-dev/2012-August/009849.html) on the dev list of the pros and cons of configuration classes/modules, I have refactored my [coordinate mapper](https://gist.github.com/3172753) to keep configuration as isolated as possible.

All mapping functions use base 0 internally. Transformation to and from 1-based coords is handled by custom `MapPosition` objects (they are currently separate from the Seq* positions but could probably subclass `ExactPosition`). The `MapPosition` objects have `to_dialect` and `from_dialect` methods that automatically handle conversion between bases and other formatting details.

There are two different ways a user can convert a coordinate from HGVS:

    # ... assuming cm is an instance of CoordinateMapper

    # Manually construct position from HGVS
    CDS_coord = CDSPosition.from_hgvs("6+1")
    genomic_coord = cm.c2g(CDS_coord)
    print(genomic_coord.to_hgvs())

    # Pass dialect argument to mapping function
    genomic_coord = cm.c2g("6+1", dialect="HGVS")
    print(genomic_coord.to_hgvs())

Furthermore, the inheritance hierarchy is designed to allow a user to set a default string representation:

    # Set MapPositions to print as HGVS by default
    def use_hgvs(self):
        return str(self.to_hgvs())

    MapPosition.__str__ = use_hgvs

The [revision](https://gist.github.com/3172753/577b7c383e057b78cdcee64be33f18117a46faaf) as of this writing is passing tests using base 0. I have not yet implemented tests for `from_hgvs` or `to_hgvs`, but that's next on my list. I'm hoping to have time for strand and mixed strand, too.

Update: The latest [revision](https://gist.github.com/3172753/7c5f285634b124b2ba2f65fd96114c441382a12d) now tests with default settings and HGVS settings.
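For readers who have not opened the gist, the idea of "0-based inside, dialect at the edges" can be sketched in a few lines. This is not the code from the gist: the class body below is a hypothetical, stripped-down stand-in that borrows the names `MapPosition`, `from_dialect`, `to_dialect`, `from_hgvs` and `to_hgvs` from the post and only handles plain exonic positions (no `6+1` style intron offsets).

```python
class MapPosition(object):
    """Hypothetical position that always stores a 0-based integer internally."""

    def __init__(self, pos):
        self.pos = int(pos)

    @classmethod
    def from_hgvs(cls, text):
        # Simplification: accepts only plain 1-based positions such as "6".
        return cls(int(text) - 1)

    @classmethod
    def from_dialect(cls, value, dialect=None):
        if dialect is None:
            return cls(value)  # already 0-based
        if dialect.upper() == "HGVS":
            return cls.from_hgvs(value)
        raise ValueError("unknown dialect: %r" % (dialect,))

    def to_hgvs(self):
        return str(self.pos + 1)

    def to_dialect(self, dialect=None):
        if dialect is None:
            return str(self.pos)
        if dialect.upper() == "HGVS":
            return self.to_hgvs()
        raise ValueError("unknown dialect: %r" % (dialect,))

    def __str__(self):
        return self.to_dialect()


p = MapPosition.from_dialect("6", dialect="HGVS")
print(p.pos)  # internal 0-based value
print(p)      # default string form is the 0-based value

# Switch the default string form to HGVS for every MapPosition
MapPosition.__str__ = MapPosition.to_hgvs
print(p)      # now prints the 1-based HGVS form
```

Because the stored value never changes, mapping arithmetic stays free of `+ 1` and `- 1` corrections, and the base conversion happens only when a position is parsed or printed.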

#python #BioPython #gsoc #gsoc12 #GSOC2012

Coordinate Mapping

I have been expanding the [coordinate mapper](http://lists.open-bio.org/pipermail/biopython/2010-June/006598.html) Reece posted to the dev list a couple of years ago. It's currently living as a [gist](https://gist.github.com/3172753), although it has grown rather precipitously (over 300 lines each of code and testing). I may have gotten slightly carried away with the concept of "test-driven development," but in this case, extensive testing is critical. Note that as of this writing, the code is in disarray; it was much less messy [before](https://gist.github.com/3172753/f12878bb9d34c524f7427fe7d0bde5747e7eb6d1) I started testing it with 1-based output. (More on that below.)

I have modified it to work with the new `CompoundLocation` Peter is working on, while retaining the ability to provide exons as a sequence of pairs.

### Representing intron locations ###

One of the more complicated operations in coordinate mapping is converting a genomic intron position to CDS coords. I am using the HGVS [conventions](http://www.hgvs.org/mutnomen/refseq_figure.html), in which positions are shown in the format 123+x and 124-y, where 123 is the last base of one exon in CDS coords and 124 is the first base of the next exon. I'm using a custom object called `NonExonPosition`. Are there standards that use an alternative display for intron positions relative to CDS coords?

### Configuration ###

I'm currently using a configuration class to store customization for display of positions, most importantly conversion between the 0-based, Pythonic internal representation and the 1-based representation used by many formats (GenBank and HGVS, to name just a few). However, I have read a variety of StackOverflow threads that imply that if I'm trying to use globals or a singleton class, I'm doing something wrong. Is it unwise to use a class simply as a storage object? Does anyone have a "more Pythonic" way to handle module-wide configuration?

On a related note, I have been fiddling with adding and subtracting the pesky 1s all day, and I am thinking that I need to adjust my design and refactor the code slightly to maintain internal 0-based coordinates while adjusting the string representation to match 0- or 1-based coords. I'll likely use a class similar to `ExactPosition` (i.e. subclass `int`). Actually, would it be useful to add to the *Position objects a method for representation in GenBank and other formats?

### Other considerations ###

* I have a few questions about negative integers. Can genomic or protein coordinates ever have negative coordinates? For CDS, as it would be confusing for the integer -1 and the string '-1' to be treated differently, my code interprets both as a downstream position. Using the pattern of the Python list index doesn't make biological sense.
* I haven't yet tackled circular genomes. Are they handled by `SeqFeature`/`FeatureLocation` yet? In GFF, the end is represented as an index greater than the "length" of the entire sequence. In order to handle this, it should only be a matter of `except`ing an `IndexError`.
* I also haven't considered how strand will influence coordinate mapping. Any pointers in this direction would be appreciated.
* Are there Biopython objects other than `FeatureLocation` that should be transformable? For example, is it worth attempting to map non-exact locations across coordinates?

-------

All that said, I hope to have a production-ready coordinate mapper by the end of the week.

Mailing list:

* http://lists.open-bio.org/pipermail/biopython-dev/2012-August/009847.html
* http://lists.open-bio.org/pipermail/biopython-dev/2012-August/009849.html
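To illustrate the 123+x / 124-y intron notation, here is a small sketch of a genome-to-CDS conversion that emits HGVS-style strings. It is not the gist's implementation: `genome_to_hgvs_cds` is a hypothetical name, and the sketch assumes plus-strand exons given as 0-based, half-open `(start, end)` pairs in genomic order, with a position in the exact middle of an odd-length intron reported relative to the upstream exon.

```python
def genome_to_hgvs_cds(exons, g):
    """Convert a 0-based genome position to a 1-based HGVS-style CDS string.

    exons: (start, end) pairs, 0-based half-open, plus strand, in genomic
    order (an assumption of this sketch).
    """
    cds_len = 0      # number of CDS bases in the exons already passed
    prev_end = None  # end (exclusive) of the previous exon
    for start, end in exons:
        if prev_end is not None and prev_end <= g < start:
            # Intron: count from whichever flanking exon is closer.
            up = g - prev_end + 1  # distance past the last base of the upstream exon
            down = start - g       # distance before the first base of the downstream exon
            if up <= down:
                return "%d+%d" % (cds_len, up)
            return "%d-%d" % (cds_len + 1, down)
        if start <= g < end:
            # Exon: plain 1-based CDS coordinate.
            return str(cds_len + (g - start) + 1)
        cds_len += end - start
        prev_end = end
    raise ValueError("position %d is outside the transcript" % g)


# Two 6-base exons separated by a 4-base intron (made-up coordinates)
exons = [(0, 6), (10, 16)]
for g in (5, 6, 7, 8, 9, 10):
    print(g, genome_to_hgvs_cds(exons, g))
```

With these exons, genome positions 6 and 7 come out as `6+1` and `6+2`, positions 8 and 9 as `7-2` and `7-1`, and the flanking exonic positions 5 and 10 as `6` and `7`.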

#python #BioPython #GSOC2012 #gsoc #gsoc12
