unisteg.py -- Hiding text in text using Unicode
December 29, 2008
I’m proudly presenting my latest little script: unisteg.py.
This is a steganography tool that hides text inside Unicode text containing lots of diacritics. I’m exploiting a feature of Unicode that allows a character with a diacritic to be written either as a single “composed” (precomposed) character, or in a “decomposed” form, as a base character followed by one or more combining marks. The two representations are visually indistinguishable when rendered. This is where I’m hiding the secret plaintext: in the choice of representation for each such character.
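To make the idea concrete, here is a minimal sketch of the technique in Python 3. It is not the code from unisteg.py: the function names `hide_bits` and `reveal_bits` and the convention "decomposed means 1, composed means 0" are assumptions made for illustration, and the sketch works on a list of bits rather than on text.

```python
import unicodedata


def hide_bits(cover, bits):
    """Return text that renders like `cover` but carries `bits`.

    A "carrier" is any character whose NFD form differs from itself.
    Assumed convention: bit 1 = write it decomposed, bit 0 = keep it composed.
    """
    composed = unicodedata.normalize("NFC", cover)
    remaining = list(bits)
    out = []
    for ch in composed:
        decomposed = unicodedata.normalize("NFD", ch)
        # Only carriers consume a bit; other characters pass through.
        if decomposed != ch and remaining and remaining.pop(0) == 1:
            out.append(decomposed)
        else:
            out.append(ch)
    if remaining:
        raise ValueError("cover text has too few characters with diacritics")
    return "".join(out)


def reveal_bits(stego):
    """Recover one bit per carrier by checking which form was written."""
    bits = []
    pos = 0  # current offset into the stego text
    for ch in unicodedata.normalize("NFC", stego):
        decomposed = unicodedata.normalize("NFD", ch)
        if decomposed != ch and stego.startswith(decomposed, pos):
            bits.append(1)
            pos += len(decomposed)
        else:
            if decomposed != ch:
                bits.append(0)
            pos += len(ch)
    return bits


cover = "caf\u00e9 na\u00efve r\u00e9sum\u00e9"  # four characters with diacritics
stego = hide_bits(cover, [1, 0, 1, 1])
print(reveal_bits(stego))
```

Carriers left over after the last hidden bit stay composed and read back as 0, so a real tool also needs some way to record where the message ends. Note too that any program that normalizes the text on the way through will erase the hidden bits.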
Usage: unisteg.py [options] >

Prints output to stdout by default.

Options:
  -h, --help               show this help message and exit
  -s, --steg               Hide plaintext in covertext to produce cyphertext.
  --url-plain=URL_PLAIN    URL to retrieve plaintext from
  --url-cover=URL_COVER    URL to retrieve covertext from
  --file-plain=FILE_PLAIN  File to retrieve plaintext from
  --file-cover=FILE_COVER  File to retrieve covertext from
  -b, --binary             Use if the plaintext is a string of 1s and 0s
  -e ENCODING, --encoding=ENCODING
                           Encoding of the covertext, if not unicode. See
                           Python codecs module for possible values.
  -u, --unsteg             Derive plaintext from cyphertext.
  --url-steg=URL_STEG      URL to retrieve cyphertext from
  --file-steg=FILE_STEG    File to retrieve cyphertext from
  -o OUT, --out=OUT        Filename of output
To test:
$ unisteg.py -s --url-cover "http://www.theholyquran.org/sura_print.php?kid=1&sid=2" -e latin5 -o steg.txt "this is a test"
$ unisteg.py -u --file-steg steg.txt
This software is distributed under a BSD license with the endorsement restriction clause removed.