"Hugh Sasse Staff Elec Eng" <hgs / dmu.ac.uk> wrote in message
news:Pine.GSO.4.53.0302231919130.23615 / neelix...

Thanks for the response.

> If TeX and LaTeX are too complex,

That wasn't entirely correctly formulated: it is not really the complexity
that bothered me, because I could avoid the advanced features. The problem
was that I quickly ran into trouble with non-ASCII characters like the
Danish æ and ø, and even with escaping ASCII characters that are frequently
used in LaTeX syntax. LaTeX also seems to have some difficulties with
whitespace: you have to write /LaTex/ to keep the command from consuming
the space after it. I wanted a clear start and end point for a command:
'{' and '}'. If the letters dd in the middle of a word are to be bold, I
simply write
mi{b dd}le.
Putting the command name outside of '{' would require a space before the
command name. That's why it's not b{something bold} but {b something bold}.
I forgot that in the original post.
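
To make the {b ...} form concrete, here is a minimal sketch in Python of how
such inline commands might be parsed and turned into HTML. It is only an
illustration: the names parse and to_html are made up here, the rule that one
space separates the command name from its body is an assumption, and escaping
is ignored.

```python
import html


def parse(text):
    """Parse '{name body}' commands into a tree.

    Plain text becomes str nodes; a command becomes (name, [children]).
    """
    pos = 0

    def parse_nodes(inside_command):
        nonlocal pos
        nodes, buf = [], []

        def flush():
            if buf:
                nodes.append("".join(buf))
                buf.clear()

        while pos < len(text):
            ch = text[pos]
            if ch == "{":
                flush()
                pos += 1
                start = pos
                # The command name runs up to the first space or brace.
                while pos < len(text) and text[pos] not in " {}":
                    pos += 1
                name = text[start:pos]
                if pos < len(text) and text[pos] == " ":
                    pos += 1  # assumed: one space separates name and body
                nodes.append((name, parse_nodes(True)))
            elif ch == "}" and inside_command:
                pos += 1
                flush()
                return nodes
            else:
                buf.append(ch)
                pos += 1
        if inside_command:
            raise ValueError("missing closing '}'")
        flush()
        return nodes

    return parse_nodes(False)


def to_html(nodes):
    """Render the tree, using each command name directly as an HTML tag."""
    parts = []
    for node in nodes:
        if isinstance(node, str):
            parts.append(html.escape(node))
        else:
            name, children = node
            parts.append(f"<{name}>{to_html(children)}</{name}>")
    return "".join(parts)


print(to_html(parse("mi{b dd}le")))  # mi<b>dd</b>le
```

Because the name sits inside the braces, no space is needed before the
command, which is what lets it appear in the middle of a word.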

> have you looked at Lout?
> http://snark.ptc.spbu.ru/~uwe/lout/lout.html
> I've not had time to get into it, but it seems to have relatively
> few rules.

I didn't know Lout. Thanks for the pointer.

Lout seems to address many of the same issues that I do with STEP. However,
it mixes the text entry syntax with document layout. I do not want to handle
document formatting. I want to write a simple syntax that translates into
existing formatting languages like DocBook, XSL-FO or LaTeX (or use it for
something unrelated to formatting, like assigning bug reports). Of course
commands will need to be defined on top of STEP; you could easily steal
DocBook's commands, for example. In the end each backend processor provides
a native set of commands, and additional macro definitions provide easier
use of those commands, not unlike LaTeX. The difference is that the
higher-level macros are delivered to the backend, so macro resolution is
optional and can be replaced by a different backend that handles these
commands directly. E.g. you can define {section} in terms of simple HTML
commands, or you could process it programmatically; this is where Ruby
would be great.
You could say I'm aiming at the same thing as YAML, except the focus is on
entering text, not generic data structures. Thus you can plug all kinds of
back-ends into the STEP processor.
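
The split between native back-end commands and optional macros could look
roughly like the following Python sketch. Everything named here (NATIVE_HTML,
MACROS, expand, render_html, and the choice of h2 for a section) is
hypothetical and only meant to show the idea: a macro rewrites a higher-level
command into native ones, and a back-end that understands the command
directly can simply skip the expansion.

```python
# Hypothetical native command set of a simple HTML back-end.
NATIVE_HTML = {"h2", "p", "b"}

# Hypothetical macro table: each macro rewrites a command into native ones.
MACROS = {
    "section": lambda children: ("h2", children),
    "strong": lambda children: ("b", children),
}


def expand(node):
    """Resolve macros in a (name, [children]) tree; strings are plain text."""
    if isinstance(node, str):
        return node
    name, children = node
    children = [expand(child) for child in children]
    if name in MACROS:
        # A macro may expand into another macro, so expand the result again.
        return expand(MACROS[name](children))
    return (name, children)


def render_html(node):
    """Render a fully expanded tree; unknown commands are an error."""
    if isinstance(node, str):
        return node
    name, children = node
    if name not in NATIVE_HTML:
        raise ValueError(f"command {name!r} is not native to this back-end")
    inner = "".join(render_html(child) for child in children)
    return f"<{name}>{inner}</{name}>"


tree = ("section", ["Intro ", ("strong", ["now"])])
print(render_html(expand(tree)))  # <h2>Intro <b>now</b></h2>
```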

> People always want to do more thigs, so more cases crop
> up, which I can see from the elided examples you have run into
> already.

True, but I try to address the syntax and structure part and leave the other
problems to those that have already addressed them, like XSL-FO. My problems
essentially relate to three issues: 1) automatically closing tags, 2)
whitespace and 3) escaping text. These are very fundamental problems that
must be solved. Anything else can be solved within the STEP syntax by
defining appropriate commands. Perhaps not always entirely elegantly, but
often more elegantly than the alternatives.

Problems like where paragraphs begin and end are really outside the scope
of STEP. I have made it possible to clearly identify the location of breaks,
but in the end a higher-level processor can strip them and require explicit
paragraph breaks, like DocBook, or use sensible rules to interpret the
breaks produced by STEP. I also made it possible to have exact control over
whitespace. The choice to collapse spaces to a single word break is for XML
compatibility and also for user-friendliness; all the space information
could be passed along with a break token.
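
As an illustration of the break-token idea, here is a small Python sketch.
The token shapes and function names are assumptions made for this example,
not STEP's actual output: every run of whitespace becomes one break token
that still carries the original characters, so a higher-level processor can
either collapse it or inspect it, for instance to treat a blank line as a
paragraph break.

```python
import re


def tokenize_whitespace(text):
    """Split text into ("word", s) and ("break", original_whitespace) tokens."""
    tokens = []
    for piece in re.split(r"(\s+)", text):
        if not piece:
            continue  # re.split yields empty strings at the edges
        kind = "break" if piece.isspace() else "word"
        tokens.append((kind, piece))
    return tokens


def collapse(tokens):
    """Collapse every break to a single space, as XML-style processing would."""
    return "".join(" " if kind == "break" else value for kind, value in tokens)


def is_paragraph_break(token):
    """One possible higher-level rule: a blank line ends a paragraph."""
    kind, value = token
    return kind == "break" and value.count("\n") >= 2


tokens = tokenize_whitespace("first  line\n\nsecond")
print(collapse(tokens))  # first line second
```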

It's important to remember that STEP is a structured text entry format and
the logic for processing that information. It is not in itself a document
formatter. Its purpose is to make it as simple as possible to enter text
information with as much metadata as possible.

For example, I could just require paragraphs to be explicit commands, like
in Xml-Doc. It would be less convenient, though. Also, the complexity of
the whitespace rules should be compared to Ruby's syntax: Ruby's syntax is
pretty involved, but in return it is intuitive to the user. It's about
balancing day-to-day usability against the problem of explaining the syntax
and its special cases.

> One thing that irked me about troff, which I otherwise
> like, is that I can never remember which commands just affect
> following text, and which need arguments on the same line.

Yes, this is really my main concern as well. This is why I am considering
two escape syntaxes: "[section header] body" and "{b a bolded text}". But
the distinction is often already implied by the command, and it may be
harder to remember and understand why there are two syntaxes. I would
appreciate some more feedback on how to deal with this. I have considered
several options, but the current solution seems to be the most practical.
You don't want to track end tags 200 pages down a document to close the
{part I} tag. You'd just write a new {part II} tag or a {postscript} tag.

One thing I didn't mention was that you can explicitly close the body of a
command:

{chapter Ch. 1} text in ch.1 {section My section} body text {/section}. More
text in ch.1 but outside the section. {chapter Ch. 2} ...
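
One way the implicit closing in this example could work is sketched below in
Python. This is a guess at the logic, not the actual STEP processor: it
assumes the text has already been split into events, and the LEVELS table of
nesting ranks is made up for the illustration. Opening a command closes any
open command of the same or deeper rank, and {/name} closes explicitly.

```python
# Hypothetical nesting ranks; a lower number means a more enclosing command.
LEVELS = {"part": 0, "chapter": 1, "section": 2}


def build_tree(events):
    """Build a tree from ("open", name, header), ("close", name), ("text", s)."""
    root = {"name": "doc", "header": "", "children": []}
    stack = [root]
    for event in events:
        kind = event[0]
        if kind == "text":
            stack[-1]["children"].append(event[1])
        elif kind == "open":
            _, name, header = event
            # Implicitly close every open command of equal or deeper rank.
            while len(stack) > 1 and LEVELS[stack[-1]["name"]] >= LEVELS[name]:
                stack.pop()
            node = {"name": name, "header": header, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif kind == "close":
            name = event[1]
            if name not in [node["name"] for node in stack[1:]]:
                raise ValueError(f"{{/{name}}} without an open {{{name}}}")
            # Pop up to and including the named command.
            while stack.pop()["name"] != name:
                pass
    return root


events = [
    ("open", "chapter", "Ch. 1"),
    ("text", " text in ch.1 "),
    ("open", "section", "My section"),
    ("text", " body text "),
    ("close", "section"),
    ("text", ". More text in ch.1 but outside the section. "),
    ("open", "chapter", "Ch. 2"),
]
root = build_tree(events)
print([child["header"] for child in root["children"]])  # ['Ch. 1', 'Ch. 2']
```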


> having tables and equations outside of troff itself meant that
> dealing with them was a pain when you tried after 6 months of not
> using them.  I don't hear much about troff nowadays.

I have never worked with troff.
I expect equations in STEP would initially require a LaTeX target and use
something like {LaTex ... a raw latex equation syntax here}. Then you could
use tools like TeX4ht to get MathML. This is really outside the scope of
STEP itself and is an issue for a higher-level processor (such as an XSL-FO
output or a Wiki engine). Of course you could define an equation syntax in
STEP which would be converted to LaTeX or MathML.

> One suggestion: get rid of '\' notation.  Too many programs use it.
> Yes, it is nice to have a consistent standard, but when you pipe a
> text string though a couple of these then you end up with \\\\\
> which are really bad for the eyes! :-)  {brace} {closebrace} could
> allow insertion of your special characters.

I totally agree, but this is also exactly what I did. The only escape
characters are '{' and '}'. However, I have two choices for escaping the
escape: either doubling them, like {{ and }}, which would give me the same
problem as with \\\\ and even worse problems related to nesting, or using a
different notation.
I chose '\{' for that purpose. '\' has no special meaning otherwise: if you
write "\\\\" you mean to write four backslashes, and if you write "\\{\\"
you mean to write "\{\\".
If I want to escape source code with balanced braces, the syntax is
{|if(x < 2) { printf("hello") }; |} (in this particular case space is
significant after the bar). If the curly braces are not balanced, you write
an end marker: {eot| if(x < 2) printf("left curly brace: {"); |eot}, where
eot is arbitrarily chosen.
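
The escaping rules above can be sketched as a small scanner in Python. This
is one possible reading of the rules, not the real implementation: the token
shapes and the names scan and find_raw_end are invented here, and it is
assumed that an unmarked {| ... |} block ends at the first |} found at brace
depth zero, while a marked block ends at the first |marker}.

```python
import re

# "{|" or "{marker|" opens a raw block; the marker may be empty.
RAW_OPEN = re.compile(r"\{(\w*)\|")


def find_raw_end(text, start, marker):
    """Return (end of body, position after the closing delimiter)."""
    if marker:
        closing = "|" + marker + "}"
        index = text.find(closing, start)
        if index < 0:
            raise ValueError("unterminated raw block")
        return index, index + len(closing)
    depth, pos = 0, start
    while pos < len(text):
        if depth == 0 and text.startswith("|}", pos):
            return pos, pos + 2
        if text[pos] == "{":
            depth += 1
        elif text[pos] == "}":
            depth -= 1
        pos += 1
    raise ValueError("unterminated raw block")


def scan(text):
    """Split text into ("text", s), ("raw", s) and ("brace", c) tokens."""
    tokens, buf, pos = [], [], 0

    def flush():
        if buf:
            tokens.append(("text", "".join(buf)))
            buf.clear()

    while pos < len(text):
        if text.startswith("\\{", pos):
            buf.append("{")  # '\{' is a literal brace; other '\' stay as-is
            pos += 2
            continue
        match = RAW_OPEN.match(text, pos)
        if match:
            flush()
            body_end, pos = find_raw_end(text, match.end(), match.group(1))
            tokens.append(("raw", text[match.end():body_end]))
        elif text[pos] in "{}":
            flush()
            tokens.append(("brace", text[pos]))  # start or end of a command
            pos += 1
        else:
            buf.append(text[pos])
            pos += 1
    flush()
    return tokens


print(scan(r"\\{\\"))  # [('text', '\\{\\\\')], i.e. the characters \{\\
```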

The important thing is that there are no traps. Unless you use '{' you can
type anything. You won't accidentally type a command you are not aware of.

Mikkel