STxT: The Book

A language for the web

Chapter 5: Compaction (*)

STxT Information analysis

This chapter deals with the compaction of STxT documents. We are going to modify a document to make it more compact, so that it is faster to parse and contains the same information.

In order to do this we must analyze the content of one document.

A document is just a set of nodes with a specific indentation, so the information is actually of order and level.

Let's see an example:

Node_0 (www.example.com/...):
    Node_1:
        Node_2: Text of node 2
        Node_3: Text of node 3
    Node_4:
        Node_5: Text of Node 5
        Node_6: Text of Node 6
            A line of text
            Other line of text
        Node_7:
            Node_8:
                Node_9: Last line

We see that there are 10 nodes in this document. The first node is Node_0, and is the one containing the rest. We could say that it is the first one to appear (order) and that it has level 0 (is not indented and contains the rest). The second node is Node_1. It is the second one to appear and has level 1. This way, we could qualify all the lines, and extract the information. It isn’t hard to make a table with this information:

Order Node Level
1 Node_0 0
2 Node_1 1
3 Node_2 2
4 Node_3 2
5 Node_4 1
6 Node_5 2
7 Node_6 2
8 Text 3
9 Text 3
10 Node_7 2
11 Node_8 3
12 Node_9 4

Now we just need to include this information. This is what the next section deals with.

A way of compacting

From the table above, we are going to propose a way to compact the information. It will be as easy as inserting the level number instead of using tabs, and separate it with the usual ':' character.

Thus, the above example would be left like this:

0:Node_0 (www.example.com/...):
1:Node_1:
2:Node_2: Text of node 2
3:Node_3: Text of node 3
1:Node_4:
2:Node_5: Text of node 5
2:Node_6: Text of node 6
3:A line of text
3:Other line of text
2:Node_7:
3:Node_8:
4:Node_9: Last line

Actually, there is still redundant information, as the first node always has level 0 (in fact the parsers can assume that it is -1), and the level 1 could be assumed to be level 0. Thus, there will be programs that will leave it the following way:

Node_0 (www.example.com/...):
Node_1:
1:Node_2: Text of node 2
2:Node_3: Text of node 3
Node_4:
1:Node_5: Text of node 5
1:Node_6: Text of node 6
2:A line of text
2:Other line of text
1:Node_7:
2:Node_8:
3:Node_9: Last line

As a personal remark I will say that my I prefer that all nodes have a level (except the first one) as I find it more aesthetic and easy to read.

My preference would be thus:

Node_0 (www.example.com/...):
1:Node_1:
2:Node_2: Text of node 2
3:Node_3: Text of node 3
1:Node_4:
2:Node_5: Text of node 5
2:Node_6: Text of node 6
3:A line of text
3:Other line of text
2:Node_7:
3:Node_8:
4:Node_9: Last line

There is one basic rule for compaction:

If aliases have been used
they will be replaced by the Canonical Name

This is an important rule, since names are usually shorter than aliases, and at the same time it is faster to parse a document this way. In addition, it also plays an important role in document transformation, as we will see in Chapter 7: Equivalences and Internationalization).

In case we do not use canonical names we would talk of semicompacted documents, which although correct, would not it be as optimal as the compacted ones.

Finally keep in mind that text lines which belong to a node, will always be in a level above the node to which they belong. In the previous example we can see that after Node6 (with level 2) there are two text lines that belong to it, both with level 3.

Actually, the compacting rule is this simple:

Replace the start tabs with:
number of tabs + ':'

Other examples

If we compact the fritters recipe, we would ave the following text:

Recipe(www.cooking.demo/recipe.stxt):
1:Title: Sweet fritters from Girona
1:Description: This is a recipe for making sweet fritters at Lent 
2:time. They are known also as Sweet Fritters from l'Empordà.
2:They are usually served as a dessert, and there is enough quantity
2:for up to 12 people. It is an easy to make recipe.
2:Ingredients: 1 Kg flour
2:8 eggs
2:zest of 1-2 lemons
2:150 grams sugar
2:100 grams butter
2:1/4 liter milk
2:1 glass of grassolis (blend of muscatel, anise, cointreau)
2:75 grams yeast (from a bakery)
2:aniseed
2:a pinch of salt
2:a little bit of ground cinnamon
1:Preparation: Place flour and the other ingredients in a bowl.
2:Melt the butter with some milk, as well as the
2:yeast. The milk for dissolving the yeast has to be only 
2:slightly lukewarm, because otherwise it would probably
2:lose its effect. Mix everything well, and let stand
2:covered for 3-5 hours (until the dough raises).
2:With a spoon or teaspoon, depending on the size you want, 
2:fry in plentiful oil until you turn golden. As you take
2:them out, sprinkle them with a little bit of the grassolis mixture,
2:and cover them with sugar.

Let’s see what the compaction of the example we saw in the grammar chapter looks like.

We had the customer's information, which compacted looks like:

Customer (www.gym.demo/client.stxt):
1:Personal information:
2:Name: Shila
2:First last name: Kween
2:Age: 28
2:Vip customer: 1
2:Photograph:
3:R0lGODlhlgCNALMAAFtepau07X2I1tvg+pOe42dwwMPK9vz+/AAAAAAAAAAAAAAAAAAAA
3:AAAAAAAAAAAACH5BAAAAAAALAAAAACWAI0AAwT/8MhJq7046827/2AojmRpnmiqrmzrvn
3:Asz3Rt3/Vg7MbgD7igsATUBQgCAYEQ2AGH0KGPR9XxAkdBIdl8Rr8qK3bJxJp5vosBmxQ
3:EvOA4yIhdvw/PonVt2OjIfXKCF1ZpPzsuA1hwg18DBDoVa0sde04bPQc9aY1Ra14+AUmQ
3:B00Se3RMO1h4lxisEoGdQgZMOrdKTwZKWW2jZUcBW25XmRaPEnezQZ8UvLFIvErAaKsFw
3:b3RZZsSpAHLOM2nbhRLyh0CkXqTWkjmTQS1jOAwfBMD5BOitx/f86fBFJGJ9o3eDCT38n
3:UzUxCDkR61qnGiwKfIGAGnDL7YdwohQCak/wh1mdJEgJ1gA0dOiveIjChZGlO07GgyVi1
3:bFYws4WOsFJ5jq+os4QLSpBZXMU3g1OEGyJtteOiY0kAgw8ojmaykS7frEMykICa1wvkG
3:W0V+HM61qgPnUcMD8dxMbfUVLAclfZqY6nEEiayeaUnWORY1pKYm1zBSQGt3AxlNB8hZG
3:eqlbgamWeuGmjRSUToAHCVZbjxhV7we8b6VTV26AxpFrQ79mKgGibB4hEaTxlsK5FpVd2
3:jnRLpwZIidopQ4BEwaD6RArJvUevtzOO1bokgoeiRwy79YzTNSclqr1CTmkkDdKsL3lOz
3:ZPxbfA4LkWlQ13+ndOu18EZ+5arhX3f9aO0BShB+RSIJHO5qMphs4f1TVRzxVCaXMP9zM
3:5wpsHMQ3IHil5JIgUI3R8cNOQV1IHVAZUrAiYRpwUsBpws1n13RdMIHRIrglg94pP+oT4
3:4MClnINELoR2Yg5V3D1hzFr1GSdHy8WyUEgBhSwXX5BziIXPPLUMdcYOalF1WUf9AHENc
3:9poKQcbgFDY0R5aAFKeR5wGKAHfUyoyoNdDoKVGyDtMZcWBcgyAABVfdAoIR9MoYmWb+Q
3:HmUa2+AWJam8EguijcM0YaZUfIvhDolVZ+uYXPA20yWCTFgAATADMSGpO6aTQ5wFsXuqQ
3:pXH4Nw1s8qhZ6xblEAqqm0xY5Qf/lpBxMSIGq0rRhShH+HREGrcxSsGMpr2ZHLVmVgBLG
3:mtc0+Key4w0LA9c4TEjg8kQY1SHtv3DXwbK3beDG+lYOm0jfQ5gq3n4RDuKqPIKZacf8y
3:47nTgXaNHaDjNKeN8Ft8aRLikVdirMwoohxM6MAmeca0dhkroFtEYugZ2z9GDBsGqbFKD
3:zLigbXFAZoOmGz7z2xVISZhhsUVAkpsm8MiHAfsFEoq1EpgnAkGjxjV80ARAvx0nMmGga
3:XD3XrwWMNtQn1mdTW7NHmVRlTgG9qdJ2N7PWGJkSShz5Bt1UvFQquIFMAXA7qtZc9DeIJ
3:GrfS39aMLJPktiGdRkGnqfc/1yLluUePgbXuqzkNUupWksEaOmDSSzNE/qu5Wg6dVyI4L
3:MjMY8awKgim6C2qM6jK7iM7ezBhc+4faQj4T9h++rZnzt4LQy6uCFGeSmg5ZgdE4y+HGP
3:UUpiEpA86HgWZXr6aaxvlR7zMOM9JUE/TzMJEtA9IOif3oN4ec7NJG8iKW67+oYhraEkx
3:ovBOn5Kjrs9x4jnSwYumHiGrrTBsOcuonU/csQWNhWg98qnPFrwWilEQC36wmJS3xrKLY
3:CQjMjoA3vPywz8wmMIpkYmG6irVFUJJQ2e10gIAalWgJOTFPMCrCT4Ylbqm9GUapHhSAE
3:RXxCSBTwgsSYbBtBC/nv8gRlYvi0iWwDhERvHthcnTUjJIiAc2eE0uGXuVjt4IidQFD0S
3:zOFcvuHA9CsoKNAUUBmKm0TdZNWpXyBLGrFoDEDC6r43RKKM7zOG2ZSylFjo7in8SCEYR
3:irBWQ4xkEwl1E7HVygzSkkQZs+UcIB7LiPEjUQZ/Yg2AXWoXQuwb3wbiyCB6TV3TmVoZC
3:1lGUPqilxISQDHB9SXXXZEWUUkO3xjXG51Z85ep0eEydQY6gdRCmdyES1+G8ccOouiToG
3:TUhFhpgWoJwSmIItoVhgLETLqjDWcoptf0yU9ftMNyZawnokTHN6V5pkodY1XqisnFSxU
3:QjErURhLukbo/pvP/WNZMoncYyIsKKnOILzNi6hrSwOGAA5NbyORHkXW1UBqobrsgIkWS
3:AFJEhU1rypzGGLFZL41y8zRUk8AzcvLMcNTnnjMqJtB+qYz6dFQxH+Ei69JVT87sNEuyG
3:OUdOjU0tcXSRcNjx7byRxSbKjNzRiJnUEtREkrtK56NautRcOMZrTXoOaFjo8HWisdG1O
3:GAPVAXJeRlyCH6o30FDZp0ssS6vWhiCUn8Qz1nhZppoOSQaxhmMi5YiqLeYJxmQIJAHoU
3:kYWJkOhRUZnB4JooeqYkvBK2oMa3RQVupjR1DzIvFGNmJhGXpsnBRjofwQQCQ3qS1ycnC
3:y9gIIsmGsoJA/5yGd/J3vQaNc4hWUKMmPGuD83gtLqqZKDai8dHnZswd7fvbUdAFnqGBV
3:KmI+pf4nmYegNXzb/5wJw6e0sRdlGdcyVCJImtaOZuZkSIg2mlAbUXI54wOvBVFVBa2e1
3:KMnZJSIeoRkNgyYNHphaaaRfCAEnvKUdaEneVoYROdilX9GjV0bfBJljRcpoEScShCHCJ
3:UGcmPOJU3iE9o1jHSYUA7phSXGmlDEHckr4DFKJ5GWUJAIVXAe3yCd5zgDrVod+SdJTAm
3:WvKoT4gXNaceq6YYajJ1FnGPbWVAFfPiImhyKCVwjEF3O4uEncBH0056xyFGupvRLkJD1
3:gHslxjhjv9HwJG1zAJzb4hAhxDB9b2+GQZSbrLgFFPqFCPsuBOI8G9xsxdcAF2G0JH+FR
3:dpLIKXKG2fneYdqwV1CztUEDVcC8FiLxND85VgbkoIJYUjIeTeFqy+OnaLoF0za/mErVx
3:8QumoE60m7FxaEKkqnHKZ2L4XkO27LtaibGvVRmpGK6FBINZibsNYbXh7b5w9ju3KuIpa
3:g2ewgkpVVPo00pGNjLuAjsW8AW5dEA9rD5/jaxxe2pKCtLuDbSC4+pAkDJaOoCrBJugTP
3:CfwZUOBFfKFie1CykcUtDZWRRNBESP8NDS0kbiDMJAb/ryYZ+NY4vq4RsUtLoKpzXxnss
3:CGeXz/IwdTTDNLF5jaG1iHgnaHLdyPta81E7SdNTRILjgPQ2oAhkkMbaNNJthrf0vAhbB
3:10BjBkRBn0D2DRUBxKNfWxx9gMzAQqLRU/dhlELmJZX5H5RF4igLvyr4Tha/bDJ2FOgvx
3:XgnLqhV6tNwE+WAFBR0Ng0LT04AguWL1EnSB8X4widjEJg2cuOe1tlB80xM4CrjQbgOj4
3:FTY2XscNigtuVk4Ryjosq8hQNagDGmWZZySyQWdgPZ5ss0wRssLdv8lv6iFwiO4KBRFIB
3:49U6jo1nDuBOnk4dR9G4M5QKLeBPuDKZ8XAutDu13fWF3khssfijvEu5Vc8/IiYaDVvW+
3:H/8AqpnBRZAd1VwPRYAfLgw/vh3iGMAVjFCXgw0k55koCxQhVd1RclQlUwDPftx2xwRrM
3:QBTyoA4UokWRcFhFUFHRACwFJFWIJX5clAeX8E1cAxF6sAgJ5HA8cT6bonoXB0XigyVP0
3:Q1I0iBw0XHWtGgHshjpQhQiKAoyxTsAIUEdiCUwZ0cr8xToIx0plAPSQA63lTCd9XdH8x
3:PtwyZxlxN2Q35lGBKV0Q5bVUfP9wmiNRV50QWboCP7ZXmb4iKrczpD+E3QQk+M1ReUR1H
3:jdoSaUXFygUs74SKq4HpCWDV7kQpZBwJkED9rdlDWh0OTAi0toTW1AnjpNxzlQy8ukv9R
3:idFBZrJYDXcY6/YqgHgDhuYKntKHBaILV0N3QKAsZocoW5ITpUEcC4JRBiQ9UNMpttEoy
3:5J6F8F2TecqxfNY4iQh+uYpbbI2xjNQUvUqIqYHbQZSZCRTbvMzlFQbeTEQNiBNoXYgok
3:VJoCNwcPEa5QZ4ObZSkQMQVbcKEahP9NVOCcI3o4GAkIRvM4BPBfNNbAIRaiId4PEUxCI
3:Q9fdjxqQXVIAiKKVj+pRq+DEVCURDC/kYNFBxPdIdP/VyTjEYecEXtbN0dABO4DhM5mRm
3:yxRQEwUoVjM/mgckXdF2nNZIbdB5kHGOrPAQxsM4yJhdFqVP5wVi7wWTB/RSVhH/eDEEA
3:m4WA4yYXyuWL/egg2RBV3t4B5TASTNpURnFT0e2FaNojqd4HDyoeW1ACrjkQmmGStCALh
3:coVKrhSkqZlAFlR2w0FKYmOS03UXCCItsSJfLgJv0hZPGyHeaGgL9TllLWlEw0CisUXGU
3:QI9nhHsIgCH/xHw6JIBlGhqzkH9WBgEC0KKsWEN9kdnMGJLywPwpBfLOgkjPhGmtRGilU
3:FhOhA8dSgFcjDRhTKIthO1t4DHSzlZtJMNXjjNVGKvuXPuCWDg0xFAK4HFzwIOyUBos2C
3:LAYFviYAbuWByTULFrWikPyJVShHuI0PNA2JLFxattVOF4DQ6UBkghiWWdi5SOBGQdapn
3:LbdSumUCnYozxEYDsKIUsCopFw4oyEoSTIYB6hsj7aIYOfVpz40VthB5Ru8lpjRAxEECW
3:5EAI1FB4bwwGHpIgl+iy/dYZuUokmWjXYoy44t3SmEaNgkCVBcwLfIBf9iKM4UFxywaOu
3:50JAOgT/xINaUUcOeqS6AjwuaomD+J5OCgNveUckajccWKU0EBdDlQKMIz5c2nbWF6Ujo
3:DYwOqZbdhItEChqqitG4IotYG5v6gLpx6Aygad1aqd72qd++qeAGqiCOqiEWqiGeqiImq
3:iKuqiM2qiO+qiQGqlyEAEAOw==
1:Bank details:
2:Account holder: Joan Costa Mombiela
2:Account number: 0000-0000-00-000000000

Usefulness

The usefulness of compaction is to create a slightly more efficient format, which is just as easy to read and write by humans. With level numbers it is also very easy to read, and for machines, it has some advantages:

It can also be used in file transformations, but we will see this further ahead (Chapter 7).

This website uses cookies to ensure you get the best experience on our website. [cookies.html|Learn more] Got it! More info