Ⓐ

Variable Byte Coding

Assign Onward uses a variable-length byte encoding scheme called Ricey Codes to represent words in the dictionary, the number of elements in arrays, and other unsigned integer quantities.
*Ricey Codes are only vaguely related to Rice Coding.
*The 2018 Assign Onward implementation (unfinished) used variable byte codes slightly differently. Consult the repository history of this document and the source code if interested.
A RiceyInt is an unsigned integer up to 2^63-1. Programmatically, RiceyInts are typically stored in unsigned 64-bit integer variables, though regular signed 64-bit integers can also work.

A RiceyCode stores a RiceyInt as a string of one to nine bytes, with 7 bits of value per byte, most significant bits first. The most significant bit of each byte is a continuation flag: when it is a 1, more bytes follow in this code; when it is a 0, this is the final byte. While leading bytes of 0x80 could be interpreted as leading zeroes on the binary representation, they are considered invalid and should not be used. A 0x80 byte appearing in the middle of a RiceyCode of three or more bytes is valid and simply represents seven binary zeroes in the middle of the value.
*Leading 0x80 bytes not only waste space, they also create an ambiguity of representation, with multiple codes representing the same value. As such, only the most compact representation of a given number is valid.
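
The encoding rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Assign Onward implementation: the names `ricey_encode`, `ricey_decode` and `MAX_RICEY_INT` are assumptions made for this example, and the error handling is one possible choice.

```python
MAX_RICEY_INT = 2**63 - 1  # largest RiceyInt: nine bytes x 7 value bits = 63 bits


def ricey_encode(value: int) -> bytes:
    """Encode a RiceyInt in its most compact form (hypothetical helper)."""
    if not 0 <= value <= MAX_RICEY_INT:
        raise ValueError("RiceyInt must be between 0 and 2^63-1")
    # Build the 7-bit groups least significant first, then reverse.
    groups = [value & 0x7F]  # final byte: continuation bit is 0
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: continuation bit is 1
        value >>= 7
    return bytes(reversed(groups))


def ricey_decode(code: bytes) -> int:
    """Decode exactly one RiceyCode, rejecting non-compact forms (hypothetical helper)."""
    if not 1 <= len(code) <= 9:
        raise ValueError("a RiceyCode is one to nine bytes long")
    if len(code) > 1 and code[0] == 0x80:
        raise ValueError("leading 0x80 bytes are invalid")
    value = 0
    for index, byte in enumerate(code):
        is_last = index == len(code) - 1
        has_more = bool(byte & 0x80)
        if has_more == is_last:
            # Either the final byte claims more bytes follow,
            # or a non-final byte claims to be the last one.
            raise ValueError("continuation bit does not match byte position")
        value = (value << 7) | (byte & 0x7F)
    return value
```

For example, under these assumptions 128 encodes as the two bytes 0x81 0x00, and 16384 encodes as 0x81 0x80 0x00, where the middle 0x80 carries seven zero bits of the value.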

When used to represent the number of children in an object, the number of elements in an array, or other numerical values, any non-negative value from 0 up to 2^63-1 is "valid Ricey."

When used to represent dictionary words, only those codes specifically defined in the dictionary (or the subset of the dictionary used in the applicable protocol context) are permissible. In part, this is because the json representation of RiceyCodes as keys or RiceyCode values is not the binary or hex string, but the human-readable UTF-8 string that corresponds to the RiceyCode, while the bao representation is the binary form of the RiceyCode.
*The same applies to the graphviz .dot representation.
*Generally, English word(s) and/or abbreviations are used as dictionary "words" for json and dot representations; however, localization of dictionaries is possible without altering the bao representation. Since protocols interact in bao, protocols with different localizations should interoperate seamlessly; only their json and dot representations will change.
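
The relationship between the binary (bao) form and the human-readable (json) form might be sketched as a simple lookup table. Everything in this example is hypothetical: the byte codes, the words `exampleKey` and `exampleList`, their localized counterparts, and the helper names are invented for illustration and are not taken from the actual Assign Onward dictionary.

```python
from typing import Dict

# HYPOTHETICAL dictionaries: these code/word pairs are invented for this sketch.
# Both map the same binary RiceyCodes to differently localized words.
ENGLISH: Dict[bytes, str] = {
    b"\x01": "exampleKey",
    b"\x81\x00": "exampleList",
}
LOCALIZED: Dict[bytes, str] = {
    b"\x01": "claveEjemplo",
    b"\x81\x00": "listaEjemplo",
}


def word_for_code(code: bytes, dictionary: Dict[bytes, str]) -> str:
    """Return the json/dot word for a binary RiceyCode, rejecting undefined codes."""
    if code not in dictionary:
        raise ValueError(f"RiceyCode {code.hex()} is not defined in this dictionary")
    return dictionary[code]


def code_for_word(word: str, dictionary: Dict[bytes, str]) -> bytes:
    """Return the binary RiceyCode for a json/dot word."""
    for code, name in dictionary.items():
        if name == word:
            return code
    raise ValueError(f"word {word!r} is not defined in this dictionary")
```

In this sketch, `code_for_word("exampleKey", ENGLISH)` and `code_for_word("claveEjemplo", LOCALIZED)` yield the same bytes, which is why differently localized protocols would still agree on the binary form.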