Code Highlighting for the Blog
posted by Jake on
Here are some high-level highlights on what I did to get code highlighting working for the blog. The blog runs on python/django, so this is a very pythonic solution -- which, I have to say, is not a bad solution.
I've always wanted good syntax highlighting for the code I post on the web as I have various adventures in the software development world.
I used a wonderful custom django template filter that someone was kind enough to offer in a snippet (http://www.djangosnippets.org/snippets/119/) that brought all the important pieces together.
I altered it ever so slightly and reproduce it here:
    from django import template
    register = template.Library()

    # Pygments: http://pygments.org -- a generic syntax highlighter.
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import get_lexer_by_name, guess_lexer

    # Python Markdown (dropped in my project directory)
    from markdown import markdown

    # BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/
    from aprilandjake.tech.BeautifulSoup import BeautifulSoup

    @register.filter("code")
    def rendercode(content, safe="unsafe"):
        """Render this content for display."""

        # First, pull out all the code blocks, to keep them away
        # from Markdown (and preserve whitespace).
        soup = BeautifulSoup(str(content))
        code_blocks = soup.findAll('code')
        for block in code_blocks:
            block.replaceWith('<code class="removed"></code>')

        # Run the post through markdown.
        if safe == "unsafe":
            safe_mode = False
        else:
            safe_mode = True
        markeddown = markdown(str(soup), safe_mode=safe_mode)

        # Replace the pulled code blocks with syntax-highlighted versions.
        soup = BeautifulSoup(markeddown)
        empty_code_blocks, index = soup.findAll('code', 'removed'), 0
        formatter = HtmlFormatter(cssclass='source')
        for block in code_blocks:
            if block.has_key('class'):
                language = block['class']
            else:
                language = 'text'
            try:
                lexer = get_lexer_by_name(language, stripnl=True, encoding='UTF-8')
            except ValueError, e:
                try:
                    # Guess a lexer by the contents of the block.
                    lexer = guess_lexer(block.renderContents())
                except ValueError, e:
                    # Just make it plain text.
                    lexer = get_lexer_by_name('text', stripnl=True, encoding='UTF-8')
            empty_code_blocks[index].replaceWith(
                highlight(block.renderContents(), lexer, formatter))
            index = index + 1

        return str(soup)
This custom filter uses three third-party modules:
- Pygments - a generic syntax highlighter that runs on Python 2.3 and above.
Install is easy with a Python .egg:
easy_install Pygments
For Windows, a tarball is also available at SourceForge.
- Markdown for Python - a plain text to HTML converter
Installation is likewise easy:
easy_install markdown
If you're on Windows, a win32 installer is also available here.
- Finally, Beautiful Soup - a fun-sounding HTML parser.
Download the .py file and stick it somewhere in your project.
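The core trick in the filter is to hide the code blocks from Markdown and put highlighted versions back afterward. Here is a rough sketch of just that step using only the standard library. It is not the filter itself: `protect_code`, `restore_code`, `fake_markup`, `fake_highlight` and the `@@CODE0@@` placeholder format are all made-up names for illustration, with the two `fake_` functions standing in for Markdown and Pygments.

```python
import html
import re

# Matches <code class="lang">...</code> or <code>...</code>; the class is optional.
CODE_RE = re.compile(
    r"<code(?:\s+class=['\"](?P<lang>[^'\"]+)['\"])?\s*>(?P<body>.*?)</code>",
    re.DOTALL,
)


def protect_code(text):
    """Swap each code block for a numbered placeholder; return the text and the blocks."""
    blocks = []

    def stash(match):
        blocks.append((match.group("lang") or "text", match.group("body")))
        return "@@CODE%d@@" % (len(blocks) - 1)

    return CODE_RE.sub(stash, text), blocks


def restore_code(text, blocks, render):
    """Replace each placeholder with render(language, source)."""
    def unstash(match):
        lang, body = blocks[int(match.group(1))]
        return render(lang, body)

    return re.sub(r"@@CODE(\d+)@@", unstash, text)


def fake_markup(text):
    """Hypothetical stand-in for Markdown: collapses whitespace and wraps in <p>."""
    return "<p>%s</p>" % " ".join(text.split())


def fake_highlight(lang, source):
    """Hypothetical stand-in for Pygments: escapes the source and tags the language."""
    return '<pre class="source %s">%s</pre>' % (lang, html.escape(source))


post = "Intro text\n<code class='python'>if x < 1:\n    print(x)</code>"
stripped, blocks = protect_code(post)
result = restore_code(fake_markup(stripped), blocks, fake_highlight)
print(result)
```

Because the markup step only ever sees the placeholder, the indentation inside the code block comes through untouched, which is the whole point of pulling the blocks out first.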
With all the pieces in place, it's as easy as applying the filter like any of the other nifty django filters:
{{ entry.body|code|safe }}
Now you just need the stylesheet to make all the newly created spans around your code look as magical as they really are. I horked the .css file off the Pygments demo page and did a Find|Replace from ".syntax" to ".source", so now my .css looks something like this:
    .source .c { color: #60a0b0; font-style: italic } /* Comment */
    .source .err { border: 1px solid #FF0000 } /* Error */
    .source .k { color: #007020; font-weight: bold } /* Keyword */
    /** ... */
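If you'd rather not copy and rename a stylesheet by hand, Pygments can also emit the CSS itself. A minimal sketch, assuming Pygments is installed; the `.source` selector is chosen to match the `cssclass='source'` passed to `HtmlFormatter` in the filter, and the colors you get depend on the Pygments style in use, so they may differ from the ones shown above.

```python
from pygments.formatters import HtmlFormatter

# Use the same CSS class the filter passes to HtmlFormatter.
formatter = HtmlFormatter(cssclass="source")

# get_style_defs returns the style rules as one string, with each
# rule prefixed by the selector given here.
css = formatter.get_style_defs(".source")
print(css)
```

Redirect the output to a .css file and link it from your template.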
Then, in the content that you're feeding through the 'code' filter, just insert 'code' tags such as this:
<code class='python'>print "Hello, World"</code>
Replace the 'python' class with the name of any other language that Pygments supports.
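What happens when the class name isn't one Pygments knows? The filter falls back to guessing, then to plain text. Here is that fallback on its own as a Python 3 sketch, assuming Pygments is installed; `pick_lexer` is a hypothetical helper name, not something Pygments provides.

```python
from pygments.lexers import get_lexer_by_name, guess_lexer
from pygments.util import ClassNotFound


def pick_lexer(language, source):
    """Return a lexer for `language`, guessing from `source` if the name is unknown."""
    try:
        return get_lexer_by_name(language, stripnl=True)
    except ClassNotFound:
        try:
            # Unknown name: let Pygments guess from the code itself.
            return guess_lexer(source)
        except ClassNotFound:
            # Nothing matched: fall back to plain text.
            return get_lexer_by_name("text", stripnl=True)


lexer = pick_lexer("python", 'print("Hello, World")')
print(lexer.name)
```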
That's it!
Note, in the last usage code and the custom filter code, on this line:
block.replaceWith('<code class="removed"></code>')