ScanLife’s 2D code system – Flaws, privacy, and whatnot
A while ago, I covered a bit on Metro News' and other publications' implementation of ScanLife's 2D barcode system for users of smart phones.
I have gotten a bit interested in it seeing that it has advertised itself as a better alternative to QR and data matrix codes. This seemed unlikely and I began to take a look.
Ed: forgive any weird grammar or spelling errors here.
ScanLife's software
Primarily, all major smartphone devices on the market today can run ScanLife's software. My Nokia E71 had no issues installing the software at all and it seems that installing it on the iPhone or Android platforms is trivial too. However, I will have to say that the software is rather buggy. Countless times while testing the EZcode barcodes, the software would claim that the camera was in use and refuse to take over. This seems to happen only when I cancel a code after it has retrieved information regarding the code, or interrupt it after it has started to attempt a connection to the website I just scanned.
Having to restart the phone each time the software bugged out was just annoying and I certainly hope that ScanLife fixes this annoyance.
Here's a look at how the software should normally operate:

For some reason, the icon reminds me of a man running.

ScanLife finds some prey!

Connecting through my little packet sniffer.

Do you want to read this crappy article by a bankrupt newspaper?
Not much to it. It seems as simple as your standard QR and DM code reader that a lot of phones include. In fact, it almost seems unnecessary save for the fact that their reader reads the EZcode and so far no other reader out there exists.
Communication with ScanLife
ScanLife appears to have one main system--it's not really "just one", but I'll explain later. When your phone communicates with ScanLife, it connects to app.scanlife.com (98.129.7.179), which also happens to be the same host for scanlife.com. It connects via an unencrypted HTTP link and has no qualms about connecting either via a data service (EDGE, HSDPA, etc) or via another method (Wi-Fi or Bluetooth). The process works basically like this:

A really, really crummy diagram.
After the snapshot is taken and it is requested that a connection is made, the ScanLife software sends the following bits of information (this is off of a barcode I had made pointing at google.ca):
GET /resolver/barcoderesolver? sdkid=17087727& licenseid=17089923& userid=[23-digit string of numbers]& barcode=A119A0000000000E01& btype=EZ& appid=17007085& language=en_us& guid=& mktqfeed=false& smsc=+[10-digit phone number]
The USERID component doesn't appear to be related to the IMEI or anything from the phone itself--at least it is not apparent. What's also interesting is that the SMSC string contains the SMS gateway for my mobile phone carrier. I am guessing that they use this to identify which carrier you're connecting from--this is likely done just in case the user connects via Wi-Fi.
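As a rough sketch of how that request is put together, the snippet below rebuilds the query string from the parameters captured above. The function name is my own invention, the USERID and SMSC values are left blank as placeholders, and note that urlencode percent-encodes a leading "+" whereas the capture showed it as a literal.

```python
from urllib.parse import urlencode

# Host and path as seen in the captured traffic.
RESOLVER = "http://app.scanlife.com/resolver/barcoderesolver"


def build_resolver_url(barcode, btype="EZ", userid="", smsc=""):
    """Assemble the query string in the order the client was seen sending it.

    build_resolver_url is a hypothetical helper, not part of any ScanLife SDK.
    """
    params = [
        ("sdkid", "17087727"),      # value captured from the Symbian client
        ("licenseid", "17089923"),
        ("userid", userid),         # 23-digit string in the capture, blank here
        ("barcode", barcode),
        ("btype", btype),           # EZ, QR or DM
        ("appid", "17007085"),
        ("language", "en_us"),
        ("guid", ""),
        ("mktqfeed", "false"),
        ("smsc", smsc),             # carrier SMS gateway, blank here
    ]
    return RESOLVER + "?" + urlencode(params)


print(build_resolver_url("A119A0000000000E01"))
```

Nothing here contacts the server; it only prints the URL that the client would have requested.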
The ScanLife software is also able to scan QR and data matrix codes with ease. There isn't much difference in its request string, other than the barcode being reported as the URL itself:
GET /resolver/barcoderesolver? sdkid=17087727& licenseid=17089923& userid=[same 23-digit string]& barcode=http://google.ca& btype=QR& appid=17007085& language=en_us& guid=& mktqfeed=false& smsc=+[same SMS gateway]
With data matrix, it just places "DM" under BTYPE.
One interesting thing to note is that it doesn't seem to matter to ScanLife what format the barcode is in when it comes to the BARCODE string. I was able to place the same string (A119A0000000000E01) into a data matrix code and it returned the same information as the EZcode--this suggests to me that the data is just plaintext in whatever format EZcode adopted. I will note a bit later in this post what it does with BARCODE strings that are in non-ScanLife formats.
How ScanLife keeps track of your phone OS is also quite simple. The SDKID indicates which particular ScanLife program you're using. In the case of Symbian, it will give an SDKID of 17087727. For Android, it's 17087728. Oddly, the Android version doesn't feed the SMSC when making a request.
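To illustrate, here is a small sketch that picks a captured request apart and reports which client sent it. Only the two SDKID values mentioned above are known; the dictionary and function names are made up for this example.

```python
from urllib.parse import urlsplit, parse_qs

# The only two SDKID values observed so far.
SDK_PLATFORMS = {"17087727": "Symbian", "17087728": "Android"}


def describe_request(path):
    """Summarise a captured request path (hypothetical helper)."""
    query = parse_qs(urlsplit(path).query, keep_blank_values=True)
    sdkid = query.get("sdkid", [""])[0]
    return {
        "platform": SDK_PLATFORMS.get(sdkid, "unknown"),
        "btype": query.get("btype", [""])[0],
        # True only when a non-empty SMSC value was supplied.
        "sends_smsc": bool(query.get("smsc", [""])[0]),
    }


print(describe_request(
    "/resolver/barcoderesolver?sdkid=17087728&barcode=A119A0000000000E01&btype=EZ"
))
```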
After the data has been sent to ScanLife's servers, it responds back with XML data resembling the following (it's quite a bit so I'll break it apart):
<codeset numactions="1" title="Test to Google">
<action title="Test to Google" type="web" typeid="7">
<property name="url" type="tx">http://google.ca</property>
This data is pretty straightforward. When I created the barcode on ScanLife's server, it let me set an internal name for me to refer to and a description to let the public see. Nothing really nefarious here.
The stuff that comes after these three lines gets a bit interesting.
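For anyone wanting to poke at the response themselves, the opening part of it can be read with Python's standard XML parser. The sample below is the fragment from above with closing tags added by me so that it parses on its own; the real response carries more data after it, and extract_actions is a name I made up.

```python
import xml.etree.ElementTree as ET

# The captured fragment, closed off by hand (an assumption) so it is valid XML.
SAMPLE = """<codeset numactions="1" title="Test to Google">
  <action title="Test to Google" type="web" typeid="7">
    <property name="url" type="tx">http://google.ca</property>
  </action>
</codeset>"""


def extract_actions(xml_text):
    """Return one dictionary per <action> element found in the response."""
    root = ET.fromstring(xml_text)
    actions = []
    for action in root.iter("action"):
        properties = {
            prop.get("name"): (prop.text or "")
            for prop in action.findall("property")
        }
        actions.append({
            "title": action.get("title"),
            "type": action.get("type"),
            "typeid": action.get("typeid"),
            "properties": properties,
        })
    return actions


for entry in extract_actions(SAMPLE):
    print(entry["type"], entry["typeid"], entry["properties"].get("url"))
```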
The barcode itself
The barcode in itself is a very basic 2D system that is essentially 11 pixels by 11 pixels. Thanks to ScanLife's whitepaper (mirror available here), it works out basically like this:

Possibly my best artwork! Also, the software doesn't scan it when the whole thing is filled out--I've tried.
While the capacity is potentially able to hit 83 bits, in reality, it will only be able to store a maximum of 76 bits--the rest is used for error correction. Based on Hamming vectors, it's capable of withstanding 5 pixels (7%) of damage before the code is completely unreadable. However, in real-world tests, removing a single pixel from the code either made it unreadable or gave an incorrect result. This is in stark contrast to QR, where I have removed 1/6th of a 21x21 code and it was still readable--this is the beauty of error correction, folks.
Here are some comparisons between QR and EZcode.
                | QR              | EZcode
--------------------------------------------------
Max capacity:   | 6.9 KB          | 76-bits
                | (4.2 alphanum.) |
                | (2.9 binary)    |
                |                 |
Max error cor.: | 7-30%           | 7%
                |                 |
Min size:       | 2x2 cm          | 1.25x1.25 cm
                | 0.78x0.78 in.   | 0.5x0.5 in.
                |                 |
Min size (px.): | 21x21           | 11x11
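To show the general idea behind Hamming-style correction, here is a toy Hamming(7,4) encoder and decoder. This is not EZcode's actual scheme (I am not reproducing the whitepaper's layout here), just a textbook example: one flipped bit gets repaired, while two flipped bits get "repaired" into the wrong data, which is the sort of incorrect result I saw.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4   # 1-based position of the bad bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[5] ^= 1                      # flip one "pixel"
print(hamming74_decode(damaged))     # the single error is corrected
damaged[0] ^= 1                      # flip a second one
print(hamming74_decode(damaged))     # now the decoder returns wrong data
```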
The main reason why I wanted to compare the two was because they're both being used by publishers in print format. It's not uncommon to come across a QR code in a European or Asian newspaper or magazine. One other thing to note about my table is that while I used the term "px" (or "pixels"), it really should mean "modules". However, if you look at a code, you'll think "pixels" before "modules" anyway.
So when a code is created, the barcode will contain something akin to the following:
A119A0000000000E01 A119A0000000000E02
I took a look at a few other codes (taken from a copy of The Metro) to see how they match up against my own:
A119A0000000000A51 A119A0000000000A57 A119A0000000000A69 A119A0000000000A6F
Holy crap! The numbers just keep increasing! To make matters interesting, you can simply up the value (they're in base-16) and see what other codes were produced. In the case of The Metro, I was able to find EZcodes for Edmonton's and Toronto's editions.
This is probably the biggest mistake that ScanLife could have made. Why? Because now I can troll their entire database for potential private data! Why they couldn't have figured out a hash function that fits within the 76 bits of data is beyond me. The codes would still in theory be troll-able, but it wouldn't be as ridiculously bad as this.
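As a sketch of what I mean by a hash function, the snippet below turns an internal counter into a 76-bit value using a keyed hash (HMAC-SHA256, truncated). This is my own suggestion and not anything ScanLife does; the function name and the key are placeholders, and a real system would still need to check for collisions before issuing a code.

```python
import hashlib
import hmac

CODE_BITS = 76  # payload size quoted in the whitepaper


def opaque_code(counter, secret_key):
    """Map a sequential counter to a hard-to-guess 76-bit hex string."""
    digest = hmac.new(
        secret_key, counter.to_bytes(8, "big"), hashlib.sha256
    ).digest()
    # Keep only the top 76 bits of the 256-bit digest.
    value = int.from_bytes(digest, "big") >> (256 - CODE_BITS)
    return format(value, "019X")  # 76 bits = 19 hex digits


key = b"example-key-not-a-real-secret"
for n in (3585, 3586):
    print(n, opaque_code(n, key))
```

Neighbouring counters now produce unrelated-looking codes, so simply adding one to a code you already have no longer lands you on somebody else's.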
Here is an example script (written in Python, presumes you have wget installed) that'll let you download a range of barcodes (from A00 to AFF, feeding it decimal numbers between 000 and 255 to make it work). It's very limited because at this time, I don't wish to make a full-blown script to download their whole database. This is solely an example and nothing more.
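A minimal sketch of that idea, which only builds the request URLs and leaves the downloading out. The prefix is taken from the Metro codes listed earlier, the request uses just the barcode and appid attributes, and the function names are hypothetical.

```python
BASE = "http://app.scanlife.com/resolver/barcoderesolver"
SAMPLE = "A119A0000000000A51"   # one of the codes taken from The Metro
PREFIX = SAMPLE[:-2]            # everything except the last two hex digits
APPID = "17007085"


def codes_in_range(first, last):
    """Return the codes from PREFIX+00 to PREFIX+FF for decimal first..last."""
    if not (0 <= first <= last <= 255):
        raise ValueError("expected 0 <= first <= last <= 255")
    return [PREFIX + format(n, "02X") for n in range(first, last + 1)]


def urls_in_range(first, last):
    """Build (but do not fetch) one resolver URL per code."""
    return [
        "%s?barcode=%s&appid=%s" % (BASE, code, APPID)
        for code in codes_in_range(first, last)
    ]


for url in urls_in_range(81, 83):
    print(url)
```

Feeding each printed URL to wget (or any other HTTP client) is all that is left to do.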
To make this even funnier, it was pointed out to me that you can grab future codes this way as well. By increasing the value, you will run across errors in the XML data that tell you that the code won't be valid until a certain date. This could be a potential method to get into news articles that haven't been released yet.
Working with the server and the software
One of the things I came across quite quickly after scanning the codes for the first time is that it doesn't seem to give a damn about the User-Agent string. The default User-Agent given by the client is simply "SimpleClient 1.0".
A search for this particular string gives off some interesting results. It seems that a lot of example code taken from Nokia could potentially be embedded in the software. I could also be wrong as it could be an internal code name.
Connecting to the service with a browser like Firefox with no attributes supplied gives the following error:
[START][DM][INFO]The system cannot resolve the barcode. The barcode is missing from the request.[END]
Feeding it just the string given off from one of the previous codes gives off the following:
[START][DM][DATA]1|041google.ca|Test to Google|scanbuy[END]
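These short replies are easy enough to take apart. Here is a sketch that splits them into the two bracketed tags and the pipe-separated payload; I don't know what each field officially means, so the fields are left unnamed, and parse_short_reply is a name of my own.

```python
import re

REPLY_PATTERN = re.compile(r"\[START\]\[(\w+)\]\[(\w+)\](.*)\[END\]", re.DOTALL)


def parse_short_reply(text):
    """Split a [START]...[END] reply into its tags and payload fields."""
    match = REPLY_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError("not a [START]...[END] reply")
    fmt, kind, payload = match.groups()
    # Only DATA replies are pipe-separated; INFO replies are a plain message.
    fields = payload.split("|") if kind == "DATA" else [payload]
    return {"format": fmt, "kind": kind, "fields": fields}


print(parse_short_reply(
    "[START][DM][DATA]1|041google.ca|Test to Google|scanbuy[END]"
))
```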
However, feeding it the whole string of data that it normally would get returns back the XML data (with history and everything):

Being told your life story every time you request some information would be painful and time-consuming.
It is a bit disturbing to see that it returns your history data every time you request a link. On top of that, it wastes valuable bandwidth--mobile phone data rates tend to be rather expensive depending on what country you're in. The XML data totals 2 KB whereas the non-XML data came out to 59 bytes. Is there a limit on how large (or old) the XML file will get before it starts to remove old data?
However, there is a way to get the data in XML format without having to receive a load of useless data. Simply just supply the following attributes:
barcode=A119A0000000000E01 appid=17007085
It will in turn return just this much code (a bit chopped off on the right, but you get the idea):

Going on an XML diet.
I'd rather have this result, especially if I am using my HSDPA connection.
One other interesting thing about the XML data is the TYPEID under <ACTION>. By changing the value around, I found that it did the following:
- Requests to call phone number
- Create new contact (within ScanLife?) *
- SMS message creation *
- E-mail creation *
- N/A
- Unsupported barcode
- Requests to open website
The ones marked with an asterisk (*) allow the software to perform the function without intervention from the user--in the case of SMS and e-mail, it launched the messaging application without my permission. This is quite disturbing and is ripe for abuse.
One final aspect I'd like to touch upon with the software is how the dialogue boxes are produced. I don't have an example to give at this time, but the gist of it is that there is some laziness involved: the software is simply told what text to display in the dialogue box, rather than being told there is an error and producing the dialogue box itself.
It's kind of asinine and really easy to exploit. I did fiddle around with it after redirecting traffic from app.scanlife.com to one of my internal servers and managed to get it to display other URLs with the intended URL itself (in the hopes of masquerading) but didn't get very far. Perhaps I will look into this later.
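What I would expect from a reader is something along these lines: every action, whatever its TYPEID, goes through a confirmation step first. The numbering below is an assumption on my part (I am taking the list above as TYPEID 1 through 7, which at least agrees with the web action showing up as typeid="7"), and the names are invented for the example.

```python
# Assumed mapping: the list above read in order as TYPEID 1..7.
# 5 (N/A) and 6 (unsupported barcode) are deliberately left out.
ACTIONS = {
    1: "call a phone number",
    2: "create a new contact",
    3: "compose an SMS message",
    4: "compose an e-mail",
    7: "open a website",
}


def should_perform(typeid, confirm):
    """Ask the user before doing anything; refuse unknown TYPEIDs outright.

    confirm is any callable that takes a prompt and returns True or False.
    """
    description = ACTIONS.get(typeid)
    if description is None:
        return False
    return bool(confirm("Allow this barcode to %s?" % description))


# A stand-in for a real dialogue box that always answers "no".
print(should_perform(3, lambda prompt: False))
```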
So what can be done with this information?
Well, I won't go too much into this, but there is potential here for open source development. Instead of relying on ScanLife to provide the URLs, perhaps a mirror service could be created and thus have all of the codes stored elsewhere. Creating EZcodes outside of ScanLife's realm doesn't seem very practical, but at the very least the codes can be mirrored.
The whitepaper that I provided should allow one to create a reader. Perhaps someone could integrate this into the Android reader and allow one to switch between ScanLife's service and their own? A nice, non-bandwidth-hogging version would be great for those who have data plans that are quite limited.
ScanLife's Servers
I don't want to delve too much into this, but I did decide to see what sort of servers ScanLife had scattered around. A WHOIS of the IP given earlier reveals that they have a block of available IPs:
Rackspace.com, Ltd. RSCP-NET-4 (NET-98-129-0-0-1)
98.129.0.0 - 98.129.255.255
SCANBUY INC RSPC-1213288687252329 (NET-98-129-7-176-1)
98.129.7.176 - 98.129.7.191
None of the IPs that are hosted at Rackspace have any sort of rDNS enabled. Using some readily available tools, I was able to determine what the majority of their hosts are under the scanlife.com domain. They're as follows:
98.129.7.179 scanlife.com
98.129.7.180 serv1.scanlife.com
98.129.7.181 serv2.scanlife.com
98.129.7.179 app.scanlife.com
94.236.61.145 blog.scanlife.com
216.104.161.119 dk.scanlife.com
216.104.161.219 dk.scanlife.com
216.104.161.119 es.scanlife.com
216.104.161.219 es.scanlife.com
98.129.7.179 ftp.scanlife.com
94.236.59.90 helpdesk.scanlife.com
98.129.7.179 mail.scanlife.com
216.104.161.219 mx.scanlife.com
216.104.161.119 mx.scanlife.com
216.175.243.8 qa.scanlife.com
216.104.161.219 us.scanlife.com
216.104.161.119 us.scanlife.com
98.129.7.179 www.scanlife.com
207.97.244.81 x.scanlife.com
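To make that list a little easier to digest, here is a sketch that groups the hostnames by IP and flags which addresses fall inside the Scanbuy block from the WHOIS output (98.129.7.176 to 98.129.7.191, which works out to a /28). The helper names are mine.

```python
import ipaddress
from collections import defaultdict

# 98.129.7.176 - 98.129.7.191 from the WHOIS record.
SCANBUY_BLOCK = ipaddress.ip_network("98.129.7.176/28")

HOST_LIST = """\
98.129.7.179 scanlife.com
98.129.7.180 serv1.scanlife.com
98.129.7.181 serv2.scanlife.com
98.129.7.179 app.scanlife.com
94.236.61.145 blog.scanlife.com
216.104.161.119 dk.scanlife.com
216.104.161.219 dk.scanlife.com
216.104.161.119 es.scanlife.com
216.104.161.219 es.scanlife.com
98.129.7.179 ftp.scanlife.com
94.236.59.90 helpdesk.scanlife.com
98.129.7.179 mail.scanlife.com
216.104.161.219 mx.scanlife.com
216.104.161.119 mx.scanlife.com
216.175.243.8 qa.scanlife.com
216.104.161.219 us.scanlife.com
216.104.161.119 us.scanlife.com
98.129.7.179 www.scanlife.com
207.97.244.81 x.scanlife.com
"""


def group_by_ip(text):
    """Return {ip: [hostnames]} keeping the order of the list."""
    groups = defaultdict(list)
    for line in text.splitlines():
        if line.strip():
            ip, name = line.split()
            groups[ip].append(name)
    return dict(groups)


def in_scanbuy_block(ip):
    return ipaddress.ip_address(ip) in SCANBUY_BLOCK


groups = group_by_ip(HOST_LIST)
for ip, names in groups.items():
    where = "Scanbuy block" if in_scanbuy_block(ip) else "elsewhere"
    print(ip, "(" + where + "):", ", ".join(names))
```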
A scan of scanbuy.com's subdomains netted nothing interesting. However, it seems that both serv1.scanlife.com and serv2.scanlife.com are the two hosts to be interested in. Connecting to either one directly redirects you to scanlife.com, which suggests to me that there is some load-balancing action going on here.
This suggests that ScanLife isn't gambling here and has at least two available servers for users to go through in case of problems. However, this has its benefits and drawbacks--the main drawback I wonder about is how it copes with data centre troubles.
Closing
I'd like to thank both Handler and RogueClown (see her link on the left) for their assistance in this. I believe a few others gave me a hand on this as well, but unfortunately I have the memory of a goldfish. I did give this as a talk at BazCampYVR yesterday and it went over well. Feel free to point out any corrections or problems as I may just post a follow-up to this.
If ScanLife happens to read this: I ask that at the very least you look into fixing the code generation problem. It surprises me that you guys have a nice little platform to build upon but somehow are hindered by a really basic problem. Also, your software should be improved to get around the error correction issue and also to stop hogging the camera when it isn't closed in the way it wants.
I don't condone your storage of user information even if it's not asking me about who I am and how much I make. I certainly hope that the information given on your Danish page is incorrect.