How to split up a string into a list, 5 characters per chunk

Postby johnick013 » Mon Feb 25, 2013 3:42 pm

Hi, I'm doing an exercise for bioinformatics. In the exercise I have to split a gene sequence, which is in the form of a string, into base groups of 5. So for example:
Code: Select all
s='GTAGTACGAATTTGAGCAAA'

and then I want my output to be in the form of a list:
Code: Select all
l=['GTAGT','ACGAA','TTTGA','GCAAA']

But I have absolutely no idea how to do this. Please help! :D
Last edited by Yoriz on Thu Feb 28, 2013 7:03 pm, edited 2 times in total.
Reason: Added code tags, Changed title

Re: How to split up a string

Postby zeycus » Mon Feb 25, 2013 4:58 pm

Admins will probably tell you to read this:
http://www.python-forum.org/viewtopic.php?f=6&t=145
You should use code tags and, most importantly, show your attempts to solve the problem.

Re: How to split up a string

Postby Yoriz » Mon Feb 25, 2013 6:02 pm

Here is a recursive solution. The string's length does not have to be a multiple of 5: when fewer than 5 characters are left for a group, it uses whatever is left as the last list item, which may or may not be what you want to happen.
Code: Select all
string = 'GTAGTACGAATTTGAGCAAA'


def chunk_five(string):
    return [string[:5]] + chunk_five(string[5:]) if string else []

print chunk_five(string)

['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']
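
For readers on Python 3, here is a hedged sketch of the same recursive idea. The 22-character sequence and the names `tail_example`, `long_sequence` and `hit_limit` are made up for illustration. It shows the shorter final chunk and, because there is one call per chunk, the point at which a long sequence exceeds the interpreter's recursion limit.

```python
import sys


def chunk_five(string):
    # Same recursive idea as above: peel off 5 characters, recurse on the rest.
    return [string[:5]] + chunk_five(string[5:]) if string else []


# 22 characters: the last chunk holds the 2 leftover bases.
tail_example = chunk_five('GTAGTACGAATTTGAGCAAAGT')
print(tail_example)  # ['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA', 'GT']

# Build a sequence that needs more calls than the current recursion limit allows.
long_sequence = 'A' * (5 * (sys.getrecursionlimit() + 10))
try:
    chunk_five(long_sequence)
    hit_limit = False
except RecursionError:
    hit_limit = True
print(hit_limit)  # True
```

So the recursive form is fine for short exercise inputs but is probably not the one to reach for with genome-sized strings.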
Due to the reasons discussed here we will be moving to python-forum.io/ on October 1 2016
This forum will be locked down and no one will be able to post/edit/create threads, etc. here from thereafter. Please create an account at the new site to continue discussion.

Re: How to split up a string

Postby micseydel » Mon Feb 25, 2013 8:32 pm

While recursion is neat, it's not efficient, and I'm not sure that list concatenation is either. Below is an iterator solution that works for strings longer than about 5000 characters (roughly where the recursive version, at one call per 5 characters, runs into Python's default recursion limit of 1000) and is significantly less likely to raise a MemoryError.
Code: Select all
>>> from itertools import izip
>>> def chunk_five(iterable):
   my_it = iter(iterable)
   return izip(*[my_it]*5)

>>> chunk_five('GTAGTACGAATTTGAGCAAA')
<itertools.izip object at 0x7f4390034248>
>>> list(chunk_five('GTAGTACGAATTTGAGCAAA'))
[('G', 'T', 'A', 'G', 'T'), ('A', 'C', 'G', 'A', 'A'), ('T', 'T', 'T', 'G', 'A'), ('G', 'C', 'A', 'A', 'A')]
>>>
>>> def chunk_five(iterable):
   # if getting back strings instead of tuples is important
   my_it = iter(iterable)
   return (''.join(five) for five in izip(*[my_it]*5))

>>> list(chunk_five('GTAGTACGAATTTGAGCAAA'))
['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']
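
The session above is Python 2. As a rough Python 3 sketch of the same trick (`izip` no longer exists there, and the built-in `zip` is already lazy), the following also shows one thing worth knowing: a final group shorter than 5 is silently dropped. The 22-character input and the names `complete` and `with_tail` are illustrative assumptions.

```python
def chunk_five(iterable):
    my_it = iter(iterable)
    # The same iterator is passed to zip 5 times, so each output tuple
    # consumes 5 consecutive items from it.
    return (''.join(five) for five in zip(*[my_it] * 5))


complete = list(chunk_five('GTAGTACGAATTTGAGCAAA'))
print(complete)   # ['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']

# 22 characters: zip stops at the first incomplete group, so 'GT' is lost.
with_tail = list(chunk_five('GTAGTACGAATTTGAGCAAAGT'))
print(with_tail)  # ['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']
```

Whether dropping the leftover bases is acceptable depends on the exercise, so it is worth checking the sequence length first.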

Re: How to split up a string

Postby ichabod801 » Mon Feb 25, 2013 8:51 pm

While recursion and iterators are nice, aren't they a bit high level? Why not just use slicing?

Code: Select all
genes = 'GTAGTACGAATTTGAGCAAA'
fives = [genes[start:(start + 5)] for start in range(0, len(genes), 5)]


Even list comprehensions might be above beginner level, so I might even put it in a loop:

Code: Select all
genes = 'GTAGTACGAATTTGAGCAAA'
fives = []
for start in range(0, len(genes), 5):
   fives.append(genes[start:(start + 5)])
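
The slicing approach generalizes easily. Below is a minimal sketch that wraps it in a function with a chunk size; the function name `chunk` and the `size` parameter are hypothetical and not part of the posts above. Slicing past the end of a string is safe in Python, so a short final chunk is kept rather than dropped.

```python
def chunk(sequence, size=5):
    """Return consecutive slices of `sequence`, each `size` long (last may be shorter)."""
    if size < 1:
        raise ValueError('size must be at least 1')
    return [sequence[start:start + size]
            for start in range(0, len(sequence), size)]


genes = 'GTAGTACGAATTTGAGCAAA'
print(chunk(genes))          # ['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']
print(chunk(genes, 4))       # ['GTAG', 'TACG', 'AATT', 'TGAG', 'CAAA']
print(chunk(genes + 'GT'))   # ['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA', 'GT']
```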

Re: How to split up a string

Postby Yoriz » Mon Feb 25, 2013 8:56 pm

Here's another go.
Code: Select all
string = 'GTAGTACGAATTTGAGCAAA'


def yield_chunk_five(string):
    while string:
        yield string[:5]
        string = string[5:]

print list(yield_chunk_five(string))

['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']

Re: How to split up a string

Postby micseydel » Mon Feb 25, 2013 9:04 pm

What's wrong with high level? The iterator works well for very large samples, which are common with DNA. Also, this person likely isn't someone who needs to learn general Python; they're just someone trying to do bioinformatics, and so they need to know how to do this one thing.

Yoriz: that solution makes new, potentially big strings every iteration of the loop.
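
One possible way around that, sketched here under the assumption that a lazy generator is still wanted: walk an index over the original string so that each step copies only the chunk itself, never the remainder. The name `yield_chunks` and the `size` parameter are hypothetical.

```python
def yield_chunks(string, size=5):
    # Each slice copies at most `size` characters; the original string
    # is never rebuilt or shortened.
    for start in range(0, len(string), size):
        yield string[start:start + size]


chunks = yield_chunks('GTAGTACGAATTTGAGCAAA')
first = next(chunks)   # chunks are produced on demand
rest = list(chunks)
print(first)  # GTAGT
print(rest)   # ['ACGAA', 'TTTGA', 'GCAAA']
```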

Re: How to split up a string

Postby ichabod801 » Mon Feb 25, 2013 9:05 pm

Is this just going to turn into how many ways can we split the string into lengths of five?

Code: Select all
[''.join(word) for word in zip(*[genes[start::5] for start in range(5)])]
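
That one-liner is dense, so here is a hedged step-by-step sketch of what it appears to do: build five strided "columns", then transpose them with `zip`. The intermediate names `columns`, `words` and `tail_words` are made up for illustration. Like any plain `zip`, it stops at the shortest column, so leftover bases are dropped.

```python
genes = 'GTAGTACGAATTTGAGCAAA'

# Column k holds every 5th character starting at offset k.
columns = [genes[start::5] for start in range(5)]
print(columns)  # ['GATG', 'TCTC', 'AGTA', 'GAGA', 'TAAA']

# zip(*columns) transposes the columns back into rows of 5 characters.
words = [''.join(word) for word in zip(*columns)]
print(words)    # ['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']

# With 2 extra bases the columns have unequal lengths and zip truncates.
longer = genes + 'GT'
tail_words = [''.join(word)
              for word in zip(*[longer[start::5] for start in range(5)])]
print(tail_words)  # ['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']
```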

Re: How to split up a string

Postby Yoriz » Mon Feb 25, 2013 9:09 pm

Oh bugger, I thought it was just chopping 5 off the string each time, but I think I see now that it's creating a new string that's 5 shorter than the last. Back to the drawing board. :(

Re: How to split up a string

Postby ichabod801 » Mon Feb 25, 2013 9:15 pm

micseydel wrote:What's wrong with high level? The iterator works well for very large samples, which are common with DNA. Also, this person likely isn't someone who needs to learn general Python; they're just someone trying to do bioinformatics, and so they need to know how to do this one thing.


When teaching, I stick to simple. I don't know who this guy is or what the context of his bioinformatics exercise is, so I would aim for something simple that he is more likely to understand.

Re: How to split up a string

Postby Yoriz » Mon Feb 25, 2013 9:48 pm

And I'm just a hobbyist Python coder who makes up crappy solutions that might help for the time being, until someone who knows what they're doing comes along.

Re: How to split up a string

Postby snippsat » Mon Feb 25, 2013 11:24 pm

Is this just going to turn into how many ways can we split the string into lengths of five?

Why not ;)
Code: Select all
>>> import re
>>> s = 'GTAGTACGAATTTGAGCAAA'
>>> re.findall(r'.'*5, s)
['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']
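
A small variation on the regex, offered as a sketch rather than a recommendation: a pattern of exactly five characters skips any leftover bases, while `.{1,5}` keeps them, because the greedy quantifier takes 5 where it can and fewer only at the end. The names `exact` and `with_tail` and the 22-character input are illustrative. Note that `.` does not match newlines unless `re.DOTALL` is passed.

```python
import re

s = 'GTAGTACGAATTTGAGCAAAGT'  # 22 characters, 2 left over

# Exactly 5 characters per match: the trailing 'GT' is not matched.
exact = re.findall(r'.{5}', s)
print(exact)      # ['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']

# Between 1 and 5 characters, greedy: the tail becomes its own chunk.
with_tail = re.findall(r'.{1,5}', s)
print(with_tail)  # ['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA', 'GT']
```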

Code: Select all
>>> map(None, *([iter(s)] * 5))
[('G', 'T', 'A', 'G', 'T'),
 ('A', 'C', 'G', 'A', 'A'),
 ('T', 'T', 'T', 'G', 'A'),
 ('G', 'C', 'A', 'A', 'A')]
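
`map(None, ...)` only works on Python 2, where it pads short groups with `None`. On Python 3 the closest counterpart is probably `itertools.zip_longest`; the sketch below assumes that substitution, and the names `groups` and `chunks` are made up for illustration.

```python
from itertools import zip_longest

s = 'GTAGTACGAATTTGAGCAAAGT'  # 22 characters, 2 left over

# The same iterator repeated 5 times; missing items are filled with None.
groups = list(zip_longest(*([iter(s)] * 5)))
print(groups[-1])  # ('G', 'T', None, None, None)

# Join each tuple back into a string, skipping the None padding.
chunks = [''.join(c for c in group if c is not None) for group in groups]
print(chunks)      # ['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA', 'GT']
```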

Re: How to split up a string

Postby Yoriz » Tue Feb 26, 2013 12:50 pm

ichabod801 wrote:Is this just going to turn into how many ways can we split the string into lengths of five?

It's already been done to death.
http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python
http://stackoverflow.com/questions/434287/what-is-the-most-pythonic-way-to-iterate-over-a-list-in-chunks

Re: How to split up a string

Postby Jaro » Tue Feb 26, 2013 6:51 pm

ichabod801 wrote:Is this just going to turn into how many ways can we split the string into lengths of five?

If so, let me drop a few lines:

Code: Select all
>>> import textwrap
>>> split_seq=textwrap.TextWrapper(width=5).wrap
>>> split_seq('GTAGTACGAATTTGAGCAAA')
['GTAGT', 'ACGAA', 'TTTGA', 'GCAAA']

