Browsing the blog archives for November, 2009.


I Found a Twitter Bug!

Computers, Software

I found a Twitter bug! Hah!

Specifically, certain characters which must be escaped in the GSM 03.38 character encoding are getting treated as the wrong encoding when posted to Twitter from Verizon Wireless SMS, and showing up as ? in text messages sent by Twitter to Verizon Wireless customers via SMS.

I should add that I didn’t find this bug alone – @elliotreed asked why I used question marks to note something in a tweet when I had actually used square brackets around some text. Some quick investigation with him revealed the more specific nature of the problem, but it wasn’t until I actually found out that there was such a thing as GSM encoding that I came up with a hypothesis to explain the character weirdness.

As far as I can tell, Verizon’s HTTP/SMS gateway is now doing the GSM/UTF-8 mapping internally, but Twitter is assuming it still has to send GSM bytes to Verizon, so the encoding is happening twice, or at least attempting to happen twice. Verizon chokes on the GSM two-byte escape sequences, which don’t decode to the intended characters when read as UTF-8, while Twitter receives certain ASCII-range one-byte UTF-8 characters but converts them as if they were GSM one-byte characters, resulting in a totally different UTF-8 character!

The GSM-to-UTF-8 encoding bug, shown here for square brackets, curly braces, tilde, backslash, and caret.
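
The receiving half of the hypothesis can be sketched in a few lines of JavaScript. This is only an illustration of the mix-up, not anything Twitter actually runs: `misreadAsGsm` is a made-up helper, and the table covers just the eight punctuation characters discussed in this post, paired with the GSM 03.38 default-alphabet characters that occupy the same byte values.

```javascript
// GSM 03.38 default-alphabet characters that sit at the byte values
// where ASCII keeps the punctuation discussed in this post.
const GSM_AT_ASCII_BYTE = {
  0x5b: "Ä", // ASCII [
  0x5c: "Ö", // ASCII \
  0x5d: "Ñ", // ASCII ]
  0x5e: "Ü", // ASCII ^
  0x7b: "ä", // ASCII {
  0x7c: "ö", // ASCII |
  0x7d: "ñ", // ASCII }
  0x7e: "ü", // ASCII ~
};

// Hypothetical helper: treat already-ASCII text as if each byte were a GSM
// character. Letters, digits, and spaces share positions in both encodings;
// anything outside the table is passed through unchanged for brevity.
function misreadAsGsm(text) {
  return Array.from(text, function (ch) {
    const code = ch.charCodeAt(0);
    return code in GSM_AT_ASCII_BYTE ? GSM_AT_ASCII_BYTE[code] : ch;
  }).join("");
}

console.log(misreadAsGsm("RT [link] {tag} ~me"));
// → RT ÄlinkÑ ätagñ üme
```

That matches the "weird European characters" side of the symptom: the text arrives intact, but the punctuation is swapped for accented letters.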

The GSM encoding doesn’t allow certain characters as single-byte characters; this appears to be a way to shove a number of European characters into a 7-bit mutant ASCII, with control characters and certain punctuation replaced by characters from the Latin-1 codepage. To some extent this makes sense, given that with the 160-character limit on SMS messages (140 bytes of packed 7-bit characters) you want to avoid multibyte encodings while still supporting commonly used characters (UTF-16 is used for non-roman languages). Unfortunately, this leaves [, ], ~, {, }, \, |, and ^ out in the cold. As a programmer, I use these punctuation characters often as separators in various notations, so it is perhaps not surprising that one of my tweets revealed the problem. These characters can be sent as a two-byte sequence in the GSM encoding, but those sequences start with the escape byte 0x1B, which in ASCII and UTF-8 is the unprintable ESC control character, so the pair never decodes to the intended punctuation.
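
To make the escape mechanism concrete, here is a minimal sketch of the sending direction. The second-byte values come from the GSM 03.38 extension table; `encodeGsm` and `toHex` are hypothetical helpers written for this post, and the encoder deliberately ignores every other difference between ASCII and the GSM alphabet.

```javascript
const GSM_ESCAPE = 0x1b;

// Second byte of each two-byte sequence in the GSM 03.38 extension table.
const GSM_EXTENSION = new Map([
  ["^", 0x14], ["{", 0x28], ["}", 0x29], ["\\", 0x2f],
  ["[", 0x3c], ["~", 0x3d], ["]", 0x3e], ["|", 0x40],
]);

// Hypothetical helper. For brevity it assumes every other character is a
// letter, digit, or space, which have the same value in ASCII and GSM.
function encodeGsm(text) {
  const septets = [];
  for (const ch of text) {
    if (GSM_EXTENSION.has(ch)) {
      septets.push(GSM_ESCAPE, GSM_EXTENSION.get(ch));
    } else {
      septets.push(ch.charCodeAt(0));
    }
  }
  return septets;
}

// Format the septet values as space-separated two-digit hex.
function toHex(septets) {
  return septets.map((s) => s.toString(16).padStart(2, "0")).join(" ");
}

const encoded = encodeGsm("[x]");
console.log(toHex(encoded)); // → 1b 3c 78 1b 3e
console.log(encoded.length); // → 5
```

Three visible characters cost five of the message's 160 septets, and a receiver that reads those values as ASCII sees ESC followed by `<` where the sender meant `[`.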

I would have thought that the Age of Unicode would have ended many of these non-standard application-specific encodings (plus, given the way mobile carriers love to gouge on SMS, if they make your characters take more bytes, they get more money!). It looks like that’s exactly what Verizon is trying to do by exposing UTF-8 at the edge of their network… they just didn’t tell anyone that they had changed encodings, or if they did, Twitter hasn’t acted on the change yet.

Since Twitter disabled their help ticket creation (probably because too many stupid people were posting the same questions without reading the FAQs), I reported the bug using the Twitter API ticketing system on Google Code.

Short story: if you use any of the punctuation characters above in your tweets, expect texting Twitter users with Verizon to see ?, and expect to receive tweets from them with weird European characters, until this is fixed by one or both parties.

2 Comments

RIP Bike

Life

If you follow me on Twitter or are a friend on Facebook, you probably already heard that my bike got stolen. A brief memorial to my thoroughly well-used 2005 Trek 7500 FX ::cue sappy music::…

My 2005 Trek 7500 FX, fresh off the moving truck, clean, and unused.

It was a solid bike, and it served me well, in spite of occasional abuses such as forgetting to oil the chain often enough or wiping out on wet leaves and bending a pedal out of whack. I certainly put money into this on top of the base purchase price (adding cargo racks, new handlebars, replacing shifter cables, new brakes, etc.), but it is still well below the cost of dealing with a car… and I get some form of exercise, as well.

As for the theft itself, I have learned the hard way regarding cable locks. I had switched to one a while ago for the weight and convenience of being able to lock to more things, but they are of course far easier to cut. This particular one, a Kryptonite KryptoFlex 1218 6′, was sliced mostly silently while the bike was locked to a lamppost right below the window of my girlfriend’s apartment. I took a taxi home, and first thing in the morning filed a police report and an insurance claim.

Thankfully, my renter’s insurance from Liberty Mutual (obtained through work) covers loss, theft, or destruction of personal property even if it’s outside of my apartment; there’s just a $250 deductible (and potentially a deduction for depreciation) that comes out of the value of the item(s), which means it’s really only useful for replacing something on the order of a laptop or bicycle.

In the end, I’m getting a check for almost $500, which should mostly cover a new bike purchased during the Eastern Mountain Sports winter sale. My natural disposition then is to see the silver lining, and take this frustrating theft as an excuse to get a new bicycle for cheap (even after you amortize what I pay biweekly for the insurance).

Incidentally, while googling for the insurance quote, I discovered that when he still lived in Chicago, Obama rode a 7500 FX :oD.

Hopefully, the new bike (I’m currently leaning towards a Trek Valencia) will serve me as well as the last one.

No Comments

BackSnapper – My First Chrome Extension

Code Projects, Computers

BackSnapper

On a whim tonight, I whipped up my first Google Chrome extension in about 2 hours. A non-trivial amount of time was spent writing it up and making the icons. It’s obviously very simple, but it replicates one of my favorite features of Safari 3: SnapBack (the feature got eviscerated in Safari 4).

Basically all this extension does is add a button to the Chrome toolbar that you can click to jump back to the first page in a tab’s history. I realize the button and icons are ugly; I am not a design-type person.

The BackSnapper button once installed in Chrome 4

You can read a bit more about my BackSnapper extension, download it if you’re using the developer edition of Google Chrome (currently version 4), or view the code on GitHub.

As Chrome rolls out the Extensions Gallery, I’ll deploy the extension out there. It could probably use some better options, and some smarter heuristics for determining where the beginning is, but for my purposes it gives me the magic button I want.

Installation

You can install the BackSnapper extension from the .zip file more or less by following Step 4 in these instructions. Note that at present this only works for the dev channel (version 4) of Google Chrome.

  1. Download and unpack the .zip file.
  2. Select Extensions from the Tools menu.
  3. Click “Developer Mode” on the right in the Extensions display.
  4. Click “Load unpacked extension…” and select the unpacked BackSnapper folder.

Development Tips

There were a few things I learned getting this working that weren’t immediately obvious from the documentation:

  • The debug console is per tab
  • You may need to select your injected content JavaScript in the debug console to view logged messages
  • For simple calls into content scripts, chrome.tabs.sendRequest() is sufficient; you don’t need to use the more complicated connect() message passing calls.
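
For the last tip, a one-shot request might look roughly like the sketch below. It is not the actual BackSnapper source: the `snapBack` action name and all function names are invented for illustration, and the wiring assumes the `chrome.browserAction.onClicked` and `chrome.extension.onRequest` events that pair with `chrome.tabs.sendRequest()` in this generation of the extension API. The two halves would normally live in separate files (background page and content script); they share one block here only so the whole round trip is visible.

```javascript
// --- Background page side (would live in its own file) ---
// Ask the content script running in `tab` to snap back.
function requestSnapBack(tabsApi, tab) {
  tabsApi.sendRequest(tab.id, { action: "snapBack" }, function (response) {
    console.log("content script replied:", response);
  });
}

// --- Content script side (would live in its own file) ---
// Assumes the current page is the newest history entry; any forward
// entries would throw this estimate off.
function offsetToFirstEntry(historyLength) {
  return 1 - historyLength;
}

// Reply with the computed offset, then navigate if there is anywhere to go.
function handleRequest(request, sendResponse, hist) {
  if (request.action !== "snapBack") {
    return;
  }
  const offset = offsetToFirstEntry(hist.length);
  sendResponse({ offset: offset });
  if (offset !== 0) {
    hist.go(offset); // history.go(0) would reload the page instead
  }
}

// Wiring, guarded so the sketch also loads outside Chrome.
if (typeof chrome !== "undefined" && chrome.browserAction) {
  // Background page: browserAction is not exposed to content scripts.
  chrome.browserAction.onClicked.addListener(function (tab) {
    requestSnapBack(chrome.tabs, tab);
  });
} else if (typeof chrome !== "undefined" && chrome.extension) {
  // Content script: listen for one-shot requests from the background page.
  chrome.extension.onRequest.addListener(function (request, sender, sendResponse) {
    handleRequest(request, sendResponse, window.history);
  });
}
```

Passing the tabs API and the history object in as parameters is purely a convenience for trying the logic outside the browser; the `1 - history.length` offset is exactly the kind of naive heuristic that could use smarter handling.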

There were also a few things I couldn’t figure out:

  • Why won’t the current developers-only Extensions Gallery accept my unsigned zip file?
  • Why can’t I determine the current URL in the history after having called history.go()? location.href remains unchanged, and history.current is undefined.

2 Comments

Top 200 Video Games of All Time According to Game Informer

Video Games

Introduction

Oh yeah, I have a blog! Lots has been going on in the intervening months (see my Twitter feed for short attention span details), but I figured a video game post during NaBloPoMo would be a good way to get back on the wagon, even if I’m not actually posting every day during November.

While visiting my Little Brother this weekend, I noticed a rather unusual magazine cover… a (very pixelated) monster from the original Doom. This turned out to be the latest issue of Game Informer, specifically Volume XIX, Number 12, Issue 200. In honor of this decimalist anniversary, they published their Top 200 Video Games of All Time list, which unsurprisingly is linkbait for any video game fan who likes to rant about what should and should not be included in such a list. I ran through my opinions quickly with my Little, mostly fixating on why so many recent games were already on the list, but decided a deeper analysis was in order.

Instead of complaining about the contents of the list, I thought I’d use it to track my personal video game history (much as my father has in the past used the Rolling Stone 500 Greatest Albums of All Time and Rolling Stone 500 Greatest Songs of All Time to guide his music purchases). I’ve also done some histogram breakdowns of what’s on the list. I would say that my guideline for inclusion on any such list would involve adjectives like “innovative” and “influential”, and explicitly avoid conditions like “critically acclaimed”, “popular”, or “best-selling”. This in turn means that inclusion must be viewed through a somewhat temporally distant lens, for sufficient perspective on a particular cultural artifact’s import.

How many of these have you played? Do you strongly agree/disagree with any of the rankings?

The columns are Game Informer rank, game title, platform(s), and year of publication from the original article; I believe using this data for commentary is covered by Fair Use. I added platforms in a few places to account for the particular port of a game that I played. I have also added columns for myself, for Played, Owned, and Completed. The full table and further analysis are below the cut.

Continue Reading »

6 Comments