I would like any tips to increase performance on Windows Python 2.6.2.
I use a product that can use Python (2.6.2) internally as a scripting language. I'm not a regular Python user, so my apologies for any gaffes.
We had a requirement to split a large(ish) file into smaller chunks to pass on to downstream processing, so I wrote a quick script to loop through and split the file. I was amazed to find that on my MacBook (Python 2.7) it ran 10x faster than under Windows. I tried a number of Python versions (2.6 - 3.3), but it is always faster on Mac OS X. I've also tried removing the opens/writes, which had little effect on Mac OS X but gave a 5x speed increase on Windows.
I can't change the deployment platform from Windows (Python 2.6.2), and I am a little frustrated that my laptop performs better than a 32-core, 64 GB Windows server!
import os
import re
import sys
from time import time

t = time()
# Split a pre-sorted text file into multiple outputs based on the leftmost element,
# delimited by spaces.
# The second element can be used for an additional sort and will be stripped from the
# output when 'isLeadingSort=1'.
#   path:          char  path for the input file
#   outPath:       char  path for the output files
#   isLeadingSort: int   use the 2nd or 3rd element as output data
#   isdbg:         int   enable debug prints
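# Just use the command line at the moment for testing.
# NOTE: the argument positions below are assumed; the indices were lost from the post.
path = sys.argv[1]
outPath = sys.argv[2]
isLeadingSort = int(sys.argv[3])
isdbg = int(sys.argv[4])
#outPath = os.getcwd()
#isLeadingSort = 0
#isdbg = 0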
# define all the functions up front
def printStr(msg):
    """ print when the debug option is set """
    if isdbg:
        print(msg)

def checkPath(path):  # function name assumed; the original 'def' line was lost
    """raise an exception if we can't find the path or file"""
    if not os.path.exists(path):
        raise Exception('File not found: ' + path)

# This is where we start
# check that the paths exist or raise an exception
checkPath(path)
checkPath(outPath)
printStr('paths ok')
arLine = []
fnameOut = chr(1) # init the output filename
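# number of splits per line: key, optional sort element, then the data
# (assumed; the definition of 'wrds' was lost from the post)
wrds = 2 if isLeadingSort else 1
fOut = None # no output file open yet
# open the input file for reading and process through it in a loop
with open(path,'r') as f:
    for line in f:
        printStr('for line in f: ')
        arLine = re.split('[ \n]+',line,wrds)
        newFname = arLine[0]
        outLine = arLine[len(arLine)-1]
        if newFname == fnameOut:
            printStr('writing to open file: ' + fnameOut)
        else:
            fnameOut = newFname
            printStr('opennextfile: ' + fnameOut + ' - closing: ' + str(fOut))
            if fnameOut in ('' , '\n'):
                raise Exception('Filename is not the first element of the data: ')
            if fOut is not None:
                fOut.close() # close old
            fOut = open(os.path.join(outPath,fnameOut),'w') # open new
        fOut.write(outLine)
if fOut is not None:
    fOut.close()
print('timediff : ' + str(time() - t))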
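
The comparison mentioned above (running with and without the opens/writes) can be reproduced in isolation with a minimal sketch like the one below. The function names `parse_only`, `parse_and_write` and `timed`, and the generated sample data, are hypothetical and not part of the script above; the sketch only assumes the same line layout (file name first, data last) and the same split pattern.

```python
import os
import re
import shutil
import tempfile
from time import time

SPLIT_RE = re.compile('[ \n]+')  # same pattern as the script, compiled once


def parse_only(path, maxsplit=1):
    """Read and split every line without touching any output file."""
    count = 0
    with open(path, 'r') as f:
        for line in f:
            SPLIT_RE.split(line, maxsplit)
            count += 1
    return count


def parse_and_write(path, out_dir, maxsplit=1):
    """Split each line; write the last field to a file named after the first."""
    current, f_out, opened = None, None, 0
    with open(path, 'r') as f:
        for line in f:
            fields = SPLIT_RE.split(line, maxsplit)
            if fields[0] != current:
                # the key changed: close the old output and open the next one
                if f_out is not None:
                    f_out.close()
                current = fields[0]
                f_out = open(os.path.join(out_dir, current), 'w')
                opened += 1
            f_out.write(fields[-1])
    if f_out is not None:
        f_out.close()
    return opened


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time()
    result = func(*args)
    return result, time() - start


if __name__ == '__main__':
    work_dir = tempfile.mkdtemp()
    try:
        # hypothetical sample input: 200 keys with 50 rows each
        sample = os.path.join(work_dir, 'sample.txt')
        with open(sample, 'w') as f:
            for key in range(200):
                for row in range(50):
                    f.write('key%03d value %d\n' % (key, row))
        lines, t_parse = timed(parse_only, sample)
        files, t_write = timed(parse_and_write, sample, work_dir)
        print('parse only : %d lines in %.4fs' % (lines, t_parse))
        print('with writes: %d files in %.4fs' % (files, t_write))
    finally:
        shutil.rmtree(work_dir)
```

Running this on both machines should show whether the gap comes from reading and splitting the lines or from opening and writing the many small output files.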