# Outfix Notation

 Posts 186
Regular
arbiteroftruth
Regular
Joined: Feb 17 2013, 05:51 AM
I've been giving thought lately to mathematical notation in general, and I've come to the conclusion that html essentially got it right.

The problem with infix notation is that parentheses are often needed, and it doesn't generalize well to functions of more than 2 inputs.

The problem with prefix and postfix notation is that you need to know by heart how many inputs a given function takes, and you get no explicit reminder of where the scope of the function ends. This is no big deal for computing purposes, where you just have to avoid ambiguity, but it's annoying for human readability to always have to work out for yourself where the scope of each function ends.

In infix, with something like (a+b.)*(c+d), I can tell at a glance that the * is being applied to two small sub-expressions. In postfix, that would be ab+cd+*, which saves on parentheses and is therefore more concise, but makes it harder to immediately parse the inputs of *, since they aren't explicitly delimited.

So I figured the ideal mixture of the two would be to have the function name itself serve as the delimiter of its inputs. For a function f, you would have "f (inputs) \f". So, for example,

a+b+c+d

becomes

+a b c d\+

And the previous example of
(a+b.)*(c+d) in infix, which is
a b + c d + * in postfix, would be
*+a b\+ +c d\+\* in "outfix".

That looks odd at first, but once you get used to seeing + and \+ as delimiters like parentheses, it becomes easy to scan over the expression and immediately see that the terms of the multiplication are two small sub-expressions.

Outfix may seem less concise on the face of it, but it can be easily improved by omitting the closing delimiter at the end of the expression. This turns the above example into

*+a b\+ +c d\+

The final \* is unnecessary, because the end of the expression implicitly closes the main function, in this case multiplication.

But furthermore, because every function has distinct delimiters, closing one function can implicitly close others. For example, the final \+ above could also be omitted, giving

*+a b\+ +c d

Since +c d\+ is inside the multiplication, the end of the multiplication would implicitly indicate the end of the addition as well. And the end of the multiplication in this case is itself implicitly established by the end of the expression.

We can take one final step by using the standard convention of assuming multiplication when no function is specified, which simplifies the expression down to

+a b\++c d

Now lets try comparing outfix, infix, and postfix on something more complicated. My go-to example for comparing notations is the quadratic formula. It's worth noting that unary functions need no closing delimiter, since they're assumed to simply apply to the immediately following term. Also, I'll use # for "plus or minus".

Infix:
(-b+#sqrt(b^2-4ac))/(2a)

Postfix (with unary negation, not binary subtraction):
b-b2^4a*c*-+sqrt#+2 a*/

Outfix (with unary negation and reciprocation):
+-b #sqrt+^b 2\^ -4ac\+\+/2a

Outfix doesn't exactly prove superior to infix here, but I'd say it's a heck of a lot easier to parse than postfix, and it doesn't lose much concision compared to infix either. Where outfix really shines is with multiple nested functions. For example:

f(x,g(y,h(z,w)))

becomes

fx gy hz w

In an example like above, it might seem ambiguous in general whether f, g, and h are functions or variables, but this can be easily resolved just by having a convention for distinguishing the two. For example, using regular print for variables, and for functions using cursive when writing or italic when typing. Giving

fx gy hz w

In this example the notation reduces to prefix. But the way I see it, that's the beauty of it. It reduces to something simple like prefix whenever the expression is easy to read in prefix, but naturally introduces delimiters when necessary. For example, the similar expression

f(g(h(x,y),z),w)

becomes

fghx y\h z\g w

In prefix, you would need to parse the entire expression to find out that everything before w is a single term, and thus that w is the second input to f. But in outfix, as soon as you see the g, you can scan forward to the \g to see where that term ends, without having to parse everything in between.

I suppose you could go even further by allowing the notation to reduce to postfix when convenient by using a closing \f with an implied opening f at the start of the expression. So the previous example would be

x y\h z\g w\f

Basically, this unifies prefix, postfix, and the use of delimiters into one notation that allows you to do whatever's most convenient for any given expression.

 Posts 817
Dozens Disciple
m1n1f1g
Dozens Disciple
Joined: Feb 20 2011, 10:15 AM
arbiteroftruth @ Aug 2 2014, 04:53 AM wrote:Where outfix really shines is with multiple nested functions. For example:

f(x,g(y,h(z,w)))

becomes

fx gy hz w

In an example like above, it might seem ambiguous in general whether f, g, and h are functions or variables, but this can be easily resolved just by having a convention for distinguishing the two. For example, using regular print for variables, and for functions using cursive when writing or italic when typing. Giving

fx gy hz w

In this example the notation reduces to prefix. But the way I see it, that's the beauty of it. It reduces to something simple like prefix whenever the expression is easy to read in prefix, but naturally introduces delimiters when necessary. For example, the similar expression

f(g(h(x,y),z),w)

becomes

fghx y\h z\g w

In prefix, you would need to parse the entire expression to find out that everything before w is a single term, and thus that w is the second input to f. But in outfix, as soon as you see the g, you can scan forward to the \g to see where that term ends, without having to parse everything in between.
That breaks down if functions are being passed as arguments, though. There is no way to split symbols into functions and arguments, just functions and not functions (and even that has some caveats).

We can translate $$f(x,g(y,h(z,w)))$$ into more sensible notation: $$f\ x\ (g\ y\ (h\ z\ w))$$. We can even replicate that Lojbanic right bracket elision: $$f\ x\ (g\ y\ (h\ z\ w$$, but I'm not sure whether that helps. Round brackets are very visually helpful. They wrap around a subexpression and temporarily allow one to ignore the contents. Your notation does this to an extent, but is not as noticeable to me. That last example, by the way, can be replicated in Haskell (and, in principle, any notation with similar syntax) using the function application operator (&#036;): $$f\ x\mathbin&#036;g\ y\mathbin&#036;h\ z\ w$$ where $$f\mathbin&#036;x=f\ x$$ and $$&#036;$$ has lowest precedence.

Another problem is that any use of prefix operators requires more forethought when writing than use of infix or postfix operators. It is not as easy to just tag on a forgotten term. Infix sometimes requires some brackets to the left, but that's relatively rare in practice.
A few little conventions:
- Dozenal integers suffixed with prime (&#8242;). This is the uncial point.
- Decimal integers suffixed with middle dot (·). This is the decimal point.

You may see me use * prefix for messages before 11&#400;7-03-1X, and a whole range of similar radix points. I will often use X and &#400; for and .

Sometimes, I will imply that an integer is in dozenal, so I won't add any marks to it. You should be able to tell that "10 = 22 * 3" is in dozenal.

 Posts 186
Regular
arbiteroftruth
Regular
Joined: Feb 17 2013, 05:51 AM
If a function is being passed as an argument, then it won't be receiving an input. And when opening a function, the first input should be written immediately after the function name. But if the function name is just one element in a list, then the next element will come after a space or a comma. For example,

fg x

is a higher order function f taking the function g and the object x as inputs. In standard notation, it would be f(g,x) Whereas

fgx

is the composition of f and g applied to x. In standard notation, it would be f(g(x)).

Infix's feature of allowing you to glance over a sub-expression to the closing parenthesis is what I was trying to capture from infix, while also capturing prefix and postfix's more unified notation for functions in general (since infix only really works for binary functions). I think it's just a matter of getting used to immediately knowing to look for the \f after you see the f.

I don't think prefix especially requires more forethought than infix. It definitely doesn't require more than postfix. Infix is just more familiar, and prefix is just a matter of thinking of things in a slightly different order. Rather than starting with some number, and then doing some modification to it using another number, you instead start with knowing what you want to do, and then figuring out what you want to do it with.

 Posts 817
Dozens Disciple
m1n1f1g
Dozens Disciple
Joined: Feb 20 2011, 10:15 AM
Translation challenge: $$\text{map}\ (1+{}) \ [0,1,2,3] = [1,2,3,4]$$. Maybe that's slightly unfair, since I'm not sure that currying is generally understood and used in mathematics (and lists are rare). But it's a statement that gives you currying, a higher-order function and collections
A few little conventions:
- Dozenal integers suffixed with prime (&#8242;). This is the uncial point.
- Decimal integers suffixed with middle dot (·). This is the decimal point.

You may see me use * prefix for messages before 11&#400;7-03-1X, and a whole range of similar radix points. I will often use X and &#400; for and .

Sometimes, I will imply that an integer is in dozenal, so I won't add any marks to it. You should be able to tell that "10 = 22 * 3" is in dozenal.

 Posts 186
Regular
arbiteroftruth
Regular
Joined: Feb 17 2013, 05:51 AM
Before I can translate that, first you have to explain to me what that means. As far as I can tell, you're just distributing a "+1" operation over the set [0,1,2,3] to get [1,2,3,4]. And that's a matter of conventions about doing arithmetic on sets, rather than conventions for how to specify functions in general.

If that is what your statement means, then I would just write it as

+1 [0 1 2 3]=[1 2 3 4]

wendy.krieger
wendy.krieger
You could do RTF notation. The operator appears as a command inside the function, eg {b bold text}. I use this as a markup language for my web page.

eg {pi} = 3,14159 {pi 4} = 12.566370, {pi 4, 0} = 4.

{sum {step 1, 1, 15}} = 120.

 Posts 817
Dozens Disciple
m1n1f1g
Dozens Disciple
Joined: Feb 20 2011, 10:15 AM
wendy.krieger @ Aug 4 2014, 05:43 AM wrote:You could do RTF notation.&nbsp; The operator appears as a command inside the function, eg {b bold text}.&nbsp; I use this as a markup language for my web page.

eg {pi} = 3,14159&nbsp; &nbsp; {pi 4} = 12.566370,&nbsp; {pi 4, 0} = 4.

{sum {step 1, 1, 15}} = 120.
...and we rederive Lisp's s-exps!
arbiteroftruth @ Aug 4 2014, 12:48 AM wrote:Before I can translate that, first you have to explain to me what that means. As far as I can tell, you're just distributing a "+1" operation over the set [0,1,2,3] to get [1,2,3,4]. And that's a matter of conventions about doing arithmetic on sets, rather than conventions for how to specify functions in general.

If that is what your statement means, then I would just write it as

+1 [0 1 2 3]=[1 2 3 4]
Yes, that's the meaning I was going for (though it's a list, which preserves its order, but that's irrelevant). My solution of using implicit mapping with otherwise normal-looking notation gives a similar result. One thing, though: why do you revert to infix for â€˜=â€™?

Based on your answer, I'm interested about what you'd do with $$x\in\mathbb{R}\implies 0\leq x^2$$.
A few little conventions:
- Dozenal integers suffixed with prime (&#8242;). This is the uncial point.
- Decimal integers suffixed with middle dot (·). This is the decimal point.

You may see me use * prefix for messages before 11&#400;7-03-1X, and a whole range of similar radix points. I will often use X and &#400; for and .

Sometimes, I will imply that an integer is in dozenal, so I won't add any marks to it. You should be able to tell that "10 = 22 * 3" is in dozenal.

 Posts 186
Regular
arbiteroftruth
Regular
Joined: Feb 17 2013, 05:51 AM
m1n1f1g @ Aug 4 2014, 06:36 PM wrote:Yes, that's the meaning I was going for (though it's a list, which preserves its order, but that's irrelevant). My solution of using implicit mapping with otherwise normal-looking notation gives a similar result. One thing, though: why do you revert to infix for â€˜=â€™?

Based on your answer, I'm interested about what you'd do with $$x\in\mathbb{R}\implies 0\leq x^2$$.
The real reason I reverted to infix is probably just that it fits better with natural language. In arithmetic, a prefix function can just be read in the imperitive sense. "+x y" would be the imperitive sentence "add x and y". But boolean relations like "implies" or "is less than or equal to" seem more like verbs of which one of the inputs is the subject and the other the object.

But in the case of = or < and similar relations, it can also be justified by the fact that infix doesn't need delimiters when the function has different input and output data types. Although = can be used between booleans, usually the double arrow is used instead in that case, and =, if thought of as a function, is a function from numbers or sets or other non-boolean objects to a boolean output. So, for example,

$$x\in\mathbb{R}\implies 0\leq x^2$$

needs no disambiguation for its order of operations. R can't be the input to the implication, because implication takes propositions as inputs, and R is not a proposition. So the set element relation must take precedence. Similarly, the inequality on the right hand side must occur first if the right side input of the implication is to be a proposition. And the squaring operation must occur even sooner, since the inequality returns a boolean value, and squaring a boolean value doesn't make sense, so the squaring must come before the comparison.

But for boolean connectives that can be seen as functions from propositions to propositions, I suppose consistency is preferred. So, your example would become

-->x in R,0<=^x 2

The complete description of conventions would be as follows.

1) Unary functions are prefix and apply to the immediately following term.
2) Binary functions with different input and output types are infix, since the order of operations is implied by the data types.
3) Other functions are outfix and apply until the closing delimiter or the end of the expression or subexpression.
4) To refer to a function as an object in its own right, explicitly separate the next term in the expression with a space or comma. e.g., gfx is g(f(x)), while gf,x is g(f,x).

 Posts 186
Regular
arbiteroftruth
Regular
Joined: Feb 17 2013, 05:51 AM
Actually, it occurs to me that for any given set of object we might want to write expressions with, there's no problem with having 1 infix operator. As long as that operator is associative, chains of that operator can be put into a left-to-right order that needs no delimiters, and any subexpressions using different functions can be explicitly delimited in outfix notation.

Similarly, any operator that changes the data type completely can be made in infix, because the required data types can implicitly force the correct order of operations.

I consider the 4 basic data types to be numbers, sets, functions, and booleans. The natural infix operations are addition, union, direct sum(f "+" g is a function from ordered pairs of inputs to f and g respectively, and generates an ordered pair of the associated outputs), and disjunction. Also, comparitive relations on non-booleans are infix, since the output type is different from the input type, so ambiguity is avoided. So equality, inequalities, set membership, etc., can all be expressed in infix without the need for parentheses.

As a demonstration, notice that the following expression is unambiguous despite the lack of expression delimiters:

{a b c}U{x y z}={0 1 2 3}Vx=a+b

The V only takes boolean inputs, while the terms immediately around it are not boolean, so the ='s must be considered first. But on the other side of each = is an infix operator that doesn't accept booleans, while the = outputs booleans, so those infix operators must occur first.

Similarly, any expression involving only these operators has its order of operations established by the data types.

All other operations and relations would be outfix, but even then, a lot of delimiters can be avoided by allowing the next most common operation to be the default assumption when no function is specified. Writing it with no separation between terms indicates high precedence. Whenever that's not the case, it can still be explicitly written in outfix.

This allows us to recover expressions like x+yz with the usual interpretation. And (x+y)z becomes x+y z.

In general, for a given data type, the most natural notion of a sum should be the infix operator, and the most natural notion of a product should be the default outfix operator. Binary predicates on those objects can be infix as well. Everything else is explicit outfix. For the four basic types I mentioned, I figure the natural notions of "product" are multiplication, cartesian product, composition, and conjunction.

One last slight modification: often, the closing \f can be cumbersome, especially if the function name is multiple characters long and you're enclosing a very simple input. cosx\cos is a big waste of space, and actually makes it harder to pick out the x as the input. But you can write cosx\ instead, where the \ without a function name is assumed to close the most recently opened function. In cases like the above example, the input expression is so small that there's no need to explicitly remind the reader which function is being closed. True outfix notation then is only generally necessary for particularly complex expressions.