unisteg.py -- Hiding text in text using unicode

December 29, 2008

I’m proudly presenting my latest little script: unisteg.py.

This is a steganography tool that hides text inside Unicode cover text containing plenty of diacritics. I’m exploiting a feature of Unicode that allows a character with a diacritic to be written either as a single “composed” code point or in a “decomposed” form, as a base character followed by combining marks. For example, é can be stored as U+00E9 or as U+0065 followed by U+0301. The two representations are visually indistinguishable, and the choice between them is where I’m hiding the secret plaintext.
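A minimal sketch of the idea in Python 3, assuming one hidden bit per character that has a canonical decomposition (composed = 0, decomposed = 1). It is an illustration only and not necessarily how unisteg.py works internally; the names `hide_bits` and `recover_bits` and the bit convention are hypothetical.

```python
import unicodedata


def hide_bits(cover, bits):
    """Hide a string of '0'/'1' characters in cover (illustrative scheme)."""
    cover = unicodedata.normalize("NFC", cover)  # start from fully composed text
    out = []
    used = 0
    for ch in cover:
        decomposed = unicodedata.normalize("NFD", ch)
        if decomposed != ch and used < len(bits):
            # This character can carry one bit: decomposed = 1, composed = 0.
            out.append(decomposed if bits[used] == "1" else ch)
            used += 1
        else:
            out.append(ch)
    if used < len(bits):
        raise ValueError("cover text has too few characters with diacritics")
    return "".join(out)


def recover_bits(stego):
    """Read one bit from every character that could have carried one."""
    cover = unicodedata.normalize("NFC", stego)  # rebuild the composed cover
    bits = []
    pos = 0  # current position in the stego text
    for ch in cover:
        decomposed = unicodedata.normalize("NFD", ch)
        if decomposed == ch:
            pos += 1  # not a carrier, nothing hidden here
        elif stego.startswith(decomposed, pos):
            bits.append("1")
            pos += len(decomposed)
        else:
            bits.append("0")
            pos += 1
    return "".join(bits)


cover = "caf\u00e9 se\u00f1or \u00e0 la carte"
stego = hide_bits(cover, "101")
print(recover_bits(stego))
```

In this sketch every carrier after the last hidden bit reads as 0, so a real tool would also need to store the message length or a terminator; how unisteg.py handles that is not shown here.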

Usage: unisteg.py [options]  >

Prints output to stdout by default.

Options:
  -h, --help            show this help message and exit
  -s, --steg            Hide plaintext in covertext to produce cyphertext.
  --url-plain=URL_PLAIN
                        URL to retrieve plaintext from
  --url-cover=URL_COVER
                        URL to retrieve covertext from
  --file-plain=FILE_PLAIN
                        File to retrieve plaintext from
  --file-cover=FILE_COVER
                        File to retrieve covertext from
  -b, --binary          Use if the plaintext is a string of 1s and 0s
  -e ENCODING, --encoding=ENCODING
                        Encoding of the covertext, if not unicode. See Python
                        codecs module for possible values.
  -u, --unsteg          Derive plaintext from cyphertext.
  --url-steg=URL_STEG   URL to retrieve cyphertext from
  --file-steg=FILE_STEG
                        File to retrieve cyphertext from
  -o OUT, --out=OUT     Filename of output
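
The -e option matters because composed and decomposed forms only exist once the covertext is Unicode: bytes in a legacy encoding have to be decoded first. A small Python 3 sketch of that step, assuming the covertext arrives as raw bytes; the helper name `decode_cover` is hypothetical and the script's real handling may differ.

```python
import unicodedata


def decode_cover(raw, encoding="utf-8"):
    """Decode covertext bytes using a name known to Python's codecs module."""
    return raw.decode(encoding)


# 0xFE is "ş" (U+015F) in ISO-8859-9, which Python also accepts as "latin5".
text = decode_cover(b"\xfeiir", "latin5")

# Once decoded, the character can be split into "s" plus a combining cedilla.
decomposed = unicodedata.normalize("NFD", text)
```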

To test:

$ unisteg.py -s --url-cover "http://www.theholyquran.org/sura_print.php?kid=1&sid=2" -e latin5 -o steg.txt "this is a test"
$ unisteg.py -u --file-steg steg.txt

This software is distributed under a BSD license with the endorsement restriction clause removed.

unisteg.py -- Hiding text in text using unicode - December 29, 2008 - Michael Katsevman