[scrapped] tokenize issue (need data offset)


Postby Tcll » Thu Apr 24, 2014 3:39 pm

so what I'm trying to do is build a syntax highlighter...
I'm using a QPainter() 'cause I can't seem to get anything else to display the way I want it to... (Visual Studio 2010 with Python Tools 1.5 Intellisense, with Aptana's convenience features, but with an updated version of IDLE's syntax)

I'll later add word indexing and intellisense and all those convenience features. :P

for the rendering code, I'm passing only the lines between the scroll position and the widget height to be drawn,
which works more than perfectly and is extremely fast.
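
For anyone wanting to reproduce the "only draw what's visible" part, here is a minimal sketch. It assumes a scroll offset in pixels and a fixed line height; `visible_lines` and its parameters are hypothetical names, not taken from the actual rendering code.

```python
def visible_lines(lines, scroll_y, widget_height, line_height):
    """Return (index of first visible line, list of visible lines).

    Assumes every line is exactly line_height pixels tall and that
    scroll_y is the pixel offset of the top of the viewport.
    """
    first = max(scroll_y // line_height, 0)
    # +1 so a line that is only partly visible at the bottom is still drawn
    last = (scroll_y + widget_height) // line_height + 1
    return first, lines[first:last]
```

The paint routine would then loop over the returned slice only, so the cost per repaint depends on the widget height rather than on the size of the file.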

to add colors, I needed a list containing a QColor() for every character of the text so I could index it via the current character being drawn...

so here's the code that updates the colors for the characters:

    # relies on module-level imports: builtins, functools, keyword, tokenize (plus QtGui from PyQt)
    def colorize(this): # syntax coloring
        lines = this.data.splitlines(True)
        this.colors = [QtGui.QColor(  0,   0,   0, 255)]*len(this.data)

        prev_tok_type, prev_tok_str = tokenize.COMMENT, ''
        for (tok_type, tok_str, (srow, scol), (erow, ecol), logical_line
            ) in tokenize.generate_tokens(
                functools.partial( next, iter( lines ), '' ) ):

            kind = ''
            if tok_type == tokenize.COMMENT: kind = 'comment'
            elif tok_type == tokenize.OP and tok_str[:1] not in '{}[](),.:;@': kind = 'operator'
            elif tok_type == tokenize.STRING:
                kind = 'string'
                if prev_tok_type == tokenize.INDENT or scol==0: kind = 'docstring'
            elif tok_type == tokenize.NAME:
                if tok_str in ('def', 'class', 'import', 'from'): kind = 'definition'
                elif prev_tok_str in ('def', 'class'): kind = 'defname'
                elif keyword.iskeyword(tok_str): kind = 'keyword'
                elif hasattr(builtins, tok_str) and prev_tok_str != '.': kind = 'builtin'
            if kind:
                # running offset into this.colors: add up the lengths of lines[:srow], then the start column
                offset = 0
                if srow:
                    for line in lines[:srow]: offset+=len(line)

                offset += scol
                if erow==srow: # token sits on a single line
                    for i in range(ecol-scol):
                        this.colors[offset+i] = this.syntax[kind]
                else: # token spans several lines (e.g. a triple-quoted string)
                    section = lines[srow:erow]
                    ls = len(section)
                    for ln,line in enumerate(section):
                        if ln==0: line = line[scol:]
                        if ln==ls-1: line = line[:ecol]
                        for c in line: this.colors[offset] = this.syntax[kind]; offset+=1

            # remember this token so the next one can check what came before it
            prev_tok_type, prev_tok_str = tok_type, tok_str

the result is an IndexError from "this.colors[offset]"... I'm not sure what I'm doing wrong :/
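
One likely culprit: tokenize numbers rows starting at 1 (columns start at 0), while list indices start at 0. Summing lines[:srow] therefore also counts the token's own line, so for tokens on the last line the offset lands past the end of this.colors. Below is a minimal sketch of a row/column to flat-offset conversion that accounts for this. `token_spans` is a made-up helper name and the code is an illustration under those assumptions, not a drop-in replacement for colorize().

```python
import functools
import tokenize

def token_spans(text):
    """Yield (tok_type, tok_str, start, end) with flat offsets into text."""
    lines = text.splitlines(True)
    # starts[n] is the offset at which 0-based line n begins
    starts = [0]
    for line in lines:
        starts.append(starts[-1] + len(line))
    readline = functools.partial(next, iter(lines), '')
    for tok in tokenize.generate_tokens(readline):
        if not tok.string:
            continue  # ENDMARKER, DEDENT etc. carry no characters to color
        (srow, scol), (erow, ecol) = tok.start, tok.end
        # rows are 1-based, hence the -1; columns are already 0-based
        start = starts[srow - 1] + scol
        end = starts[erow - 1] + ecol
        yield tok.type, tok.string, start, end
```

With flat start/end offsets, single-line and multi-line tokens no longer need separate branches: coloring a token becomes one loop over range(start, end). Precomputing the line starts also avoids re-summing line lengths for every token.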
Last edited by Tcll on Thu Apr 24, 2014 4:39 pm, edited 1 time in total.

Re: tokenize issue (need data offset)

Postby Tcll » Thu Apr 24, 2014 4:39 pm

you know what... forget it... I just remembered PyQt has a C++ syntax highlighting example which uses regex that I can use for this...

should be waaay better than tokenize... heh
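
For reference, the regex idea can be sketched without any Qt at all. This is not the PyQt example itself, just a rough illustration using the standard re module; `TOKEN_RE` and `regex_spans` are hypothetical names, and the patterns only cover comments, single-line strings and keywords.

```python
import keyword
import re

# One alternation, scanned left to right, so a '#' inside a string
# is consumed by the string rule and never seen as a comment.
TOKEN_RE = re.compile(
    r'(?P<comment>#[^\n]*)'
    r'|(?P<string>"(?:\\.|[^"\\\n])*"|\'(?:\\.|[^\'\\\n])*\')'
    r'|(?P<keyword>\b(?:%s)\b)' % '|'.join(keyword.kwlist)
)

def regex_spans(text):
    """Yield (kind, start, end) for every match, offsets into text."""
    for m in TOKEN_RE.finditer(text):
        # lastgroup is the name of the alternative that matched
        yield m.lastgroup, m.start(), m.end()
```

The offsets come straight from the match object, so there is no row/column bookkeeping. The trade-off is that regexes only approximate Python's grammar (triple-quoted strings, for instance, would need an extra rule).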