Skip to content

Digram Serialization

F. Conrads edited this page Sep 13, 2019 · 27 revisions

digram-ser.png

To serialize several Digrams we simply add the serialized Digrams to each other.

Complement of Edge Label Index 1

To set the start of a digram, we simply use the complement of the first edge label index. Hence to seperate two digrams, we can search for the next Integer with the leading bit. This is then the next ELI1 and the start of the next digram, thus setting the boundaries of the internals.

IAS

IAS is the Internal Array Size Flag which represents if the Digram has two or one internals.

  • 0 for 1 Internal Node
  • 1 for 2 Internal Nodes

Size Flag

The size flag represents the Number Class of the internals, if the internals are all smaller than the Maximum amount possible using a short, it would be a waste of space to save them as Integers. So the internals will be saved in the lowest number class all of them can be contained in.

  • 00 for byte
  • 01 for short
  • 10 for integer
  • 11 for long

External Index

An external index has 2 bits, as we exploit the fact that we only have 1 or 2 external indexes The indexes are one of [1,2,3,4] which can be represented as:

  • 1: 00
  • 2: 01
  • 3: 10
  • 4: 11

In the case that we only have one external Index, external Index 2 will be set to the same as external 1.

compression rate

In the following graph the compression ratio of this method in comparison to saving the Digram Occurences naively as a 24 byte object (N1 E1 N2 N_3 E2 N4) is shown digram-ratio.png

Clone this wiki locally