[developers] cfrom/cto

Ben Waldron benjamin.waldron at cl.cam.ac.uk
Wed Oct 12 15:23:41 CEST 2005


The currently agreed semantics of CFROM/CTO is that they refer to
character positions, with both ends inclusive. E.g., given the text
'abcd', the range CFROM=0 to CTO=2 refers to the "abc" substring.

abcd
0123 = character positions

I would like to suggest we use character _points_ (the points between
characters) instead of the above. This is more expressive and allows
the specification of empty ranges. E.g., given the text 'abcd', the
range CFROM=0 to CTO=2 would refer to the "ab" substring, whilst the
range CFROM=0 to CTO=3 would refer to the "abc" substring.

.a.b.c.d.
0 1 2 3 4 = character points
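
As a rough illustration only (the function name span_by_points is
hypothetical and not part of any existing code), character points
behave like the half-open slice indices of Python strings:

```python
def span_by_points(text, cfrom, cto):
    """Return the substring lying between character points cfrom and cto."""
    # Points run from 0 (before the first character) to len(text)
    # (after the last character), so cfrom == cto is a valid empty range.
    if not 0 <= cfrom <= cto <= len(text):
        raise ValueError("need 0 <= cfrom <= cto <= len(text)")
    return text[cfrom:cto]


print(span_by_points("abcd", 0, 2))        # ab
print(span_by_points("abcd", 0, 3))        # abc
print(repr(span_by_points("abcd", 2, 2)))  # '' (empty range at point 2)
```

The last call shows the empty range between "b" and "c", which the
character-position scheme has no direct way to express.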

How would people feel about such semantics? The conversion from
character positions to character points is simple: CFROM values are
unchanged, and a CTO character position becomes a character point by
adding 1.
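
A minimal sketch of that conversion, for illustration only. Both
function names are hypothetical, and the reverse direction is my own
assumption rather than something proposed above:

```python
def positions_to_points(cfrom, cto):
    """Convert an inclusive character-position range to character points."""
    # CFROM is unchanged; CTO moves to the point just after the last character.
    return cfrom, cto + 1


def points_to_positions(cfrom, cto):
    """Assumed inverse: convert character points back to inclusive positions."""
    # An empty range (cfrom == cto) covers no character, so it is
    # rejected here instead of being mapped to some position pair.
    if cto <= cfrom:
        raise ValueError("empty or reversed range has no position equivalent")
    return cfrom, cto - 1


text = "abcd"
cfrom, cto = positions_to_points(0, 2)  # positions 0..2 cover "abc"
print(cfrom, cto)                       # 0 3
print(text[cfrom:cto])                  # abc
```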

- Ben



