Compression as indicator of document quality
April 6, 2005
Jean Véronis describes an informal experiment in compressing different texts, including the European constitution (original in French; auto-translated to English). He notes that French texts normally shrink by about 60-65% when compressed, but the EU constitution shrinks by about 75%. He puts the difference down to “jargon, puffery, redundancy…” in the constitution. I wonder what the figure is for the US Constitution (with and without the amendments added after the Bill of Rights). To be determined…
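For anyone who wants to try this at home, here is a minimal sketch of the measurement in Python. It assumes zlib (DEFLATE) as the compressor, which may not be what Véronis used, and the function name `size_reduction` and the sample strings are made up for illustration. It reports the fraction by which a text shrinks, so a higher number means a more redundant text.

```python
import zlib


def size_reduction(text: str, level: int = 9) -> float:
    """Return the fraction by which `text` shrinks under zlib compression.

    0.65 means the compressed form is 65% smaller than the UTF-8 original.
    zlib is an assumption here; other compressors give different figures.
    """
    raw = text.encode("utf-8")
    if not raw:
        return 0.0  # nothing to compress, avoid dividing by zero
    packed = zlib.compress(raw, level)
    return 1.0 - len(packed) / len(raw)


# Hypothetical sample texts, only to show the direction of the effect.
repetitive = "the party of the first part " * 200
varied = "Sphinx of black quartz, judge my vow."

print(f"repetitive: {size_reduction(repetitive):.0%}")
print(f"varied:     {size_reduction(varied):.0%}")
```

Very short inputs can come out negative because of the compressor's fixed overhead, so a fair comparison needs texts of similar, reasonably large size.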
By the way, Google’s translation provider translates “jargon, baratin, redondance…” as “jargon, sweet talk, redundancy…”, which makes me wonder about the compression ratios achievable on flattery.