I showed this to a few people at RubyConf, but none of us could find a
good solution.
Liquid is a non-evaling (HTML) template engine that I use to let my
customers edit their shop's appearance in Shopify.

I would like to tokenize the entire document in one pass.
This is a stripped-down test case demonstrating the blocker. Basically,
I can't find a good way to get all the text between {% tags %} and {{
variables }} (which I have omitted from the test case for
simplicity).

Once this is addressed I'll release Liquid on RubyForge.

Test case:

require 'test/unit'
TokenizationRegexp          = /\{%.*?%\}|[^\{]+/ # this doesn't work

class ParsingTest < Test::Unit::TestCase

   def test_tokenization

     # Please make me work

     text = "Hello im liquid this: {% is a tag %} curly brackets like " \
            "this { may appear in the text } please parse me"
     assert_equal ["Hello im liquid this: ",
                   "{% is a tag %}",
                   " curly brackets like this { may appear in the text } please parse me"],
                  text.scan(TokenizationRegexp)
   end

   def test_tokenization_without_curly

     text = "Hello im liquid this: {% is a tag %}"
     assert_equal ["Hello im liquid this: ", "{% is a tag %}"],
                  text.scan(TokenizationRegexp)
   end


end

# Loaded suite test
# Started
# F.
# Finished in 0.021796 seconds.
#
#   1) Failure:
# test_tokenization(ParsingTest) [test.rb:11]:
# <["Hello im liquid this: ",
#  "{% is a tag %}",
#  " curly brackets like this { may appear in the text } please parse me"]> expected but was
# <["Hello im liquid this: ",
#  "{% is a tag %}",
#  " curly brackets like this ",
#  " may appear in the text } please parse me"]>.
#
# 2 tests, 2 assertions, 1 failures, 0 errors
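
The failure seems to come from the text branch [^\{]+ stopping at every
"{", so a lone "{" matches neither branch and scan silently skips it.
One direction that might work, as a sketch only: let the text branch
also accept a "{" that is not followed by "%", using a negative
lookahead. CandidateRegexp and tokenize are hypothetical names for this
example, not part of Liquid, and {{ variables }} are still left out.

```ruby
# Hypothetical candidate, not part of Liquid.
# Branch 1: a complete {% ... %} tag, matched lazily.
# Branch 2: a run of text in which a "{" is allowed as long as it is
#           not followed by "%" (so it cannot be the start of a tag).
CandidateRegexp = /\{%.*?%\}|(?:[^\{]|\{(?!%))+/

# Hypothetical helper: split a template into tag and text tokens.
# The regexp has no capturing groups, so scan returns whole matches.
def tokenize(text)
  text.scan(CandidateRegexp)
end

p tokenize("Hello im liquid this: {% is a tag %} curly { in text } done")
```

I have only checked this against the two cases above. An unclosed "{%"
with no matching "%}" would still lose its opening brace, so it is not
a complete answer.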

--
Tobi
http://jadedpixel.com    - modern e-commerce software
http://typo.leetsoft.com - Open source weblog engine
http://blog.leetsoft.com - Technical weblog