Thursday, September 23, 2010

Nightmares

I thought it'd be impossible to experience something as scary as Flower's terrifying rain/lightning level; naturally that means my dreams last night managed to top it. They were about Von Neumann machines taking over civilization, and I was the only one who realized what was happening. These particular machines started out small and masqueraded as friendly pets who loved to eat anything, growing bigger and bigger. They looked like small red gift boxes, with teeth. I'm still not sure if they were actually malign - anything bad seemed to happen offscreen, or perhaps it only seemed bad because I was so paranoid about them.

Anyways... I haven't mentioned Flower before, but I should - only picked it up recently, it was one of the original games for the PS3 arcade. The gameplay basically can be described as you playing through the dreams of a flower, where you control a petal and how the wind blows it around. It's absolutely gorgeous and very serene. The only exception is one level near the end, which is absolutely necessary to the story they tell but is also very very scary - you might find it hard to believe how scared one can get for a tiny flower petal [or group of them], but man. Much scarier than hordes of zombies, for sure.

Wednesday, September 22, 2010

Law Enforcement

Howdy all. Just wanted to post a quick link, along with some thoughts. I found this account of a reporter's experience in the LAPD deadly force video simulator very interesting. The simulator itself sounds fascinating, though he left out some details I would have loved to know - it's apparently a thing with real actors, weapons with laser blanks rather than bullets, etc. More interesting was the perspective the reporter was able to give on what it's like to be in law enforcement in that sort of situation. I don't really want to speak specifically about any of the recent cases in the media, as I haven't researched them enough to comment.

What I do want to say is this: Being in law enforcement [and civil service, etc] is a hard job. Thank you to those who can do it, and to those who do it well. It can be hard for many people, myself included, to have empathy for those in that position as the viewpoints and situations involved are so very different from our experience; I hope that I never hold a gun, much less fire it. To those who are entrusted to do that every day, I salute you for having the strength to do your jobs.

Ok, now that I said my deep thoughts on the subject, I had one other peripheral thought about this: If you ever see a traffic cop managing stuff at a light that's gone dead, or anyone in that sort of position, take a moment to thank them. Far too often it's easy to be impatient with the person directing traffic when it's not their fault the light died, a wreck happened, etc. I had a long chat with a traffic cop once after they spent a while directing traffic at Chester / Cordova [after a wreck]; she was very good at directing traffic, keeping things at least flowing, and I took a moment when the traffic had died down to thank her - I had needed to cross and she was very clear in both letting me know she was aware and when to actually cross, at the same time dealing with a driver who had decided that clearly the detour around that particular block didn't apply to them. We spoke for a little bit, and she told me that the situation with that driver was pretty normal, and that they got a lot of anger and impatience while doing that job! People arguing about the street being blocked off, angry they had to detour, and so on. I was glad that I got the chance to talk a little, as it was a perspective I hadn't seen before; more importantly, it reminded me that even a traffic cop deserves some appreciation for the job they're doing and not just impatience at the situation one usually encounters them in.

Monday, September 20, 2010

On the hazards of integer arithmetic

A short essay on why integer arithmetic is 'bad'. I'm not out to start a holy war, just ran into some trouble having to do with some of the facets of using integers...

My simulator, as a base component, needs to compute energies according to some model. We'd like it to actually agree on what it computes with the parameter sets and models used in various literature sources. It does just fine (and has for a while) on the current NUPACK parameters, but I recently needed to use a different parameter set to test some parts. It's been a while since I used it and due to the nature of the model it's actually a different underlying implementation.

So, that's the setup. The test set is ~55k different test cases, in 6 different classes of test (the largest class has ~32k cases, the smallest 2). My program tested fine for nearly every test case, except exactly 2048 from the class with 33344 [why yes, I did mean 'k'] test cases, and 8 from the class with 4956 test cases.

After several tries at bucketing my failing cases vs the 'good' ones to figure out which part of the distribution had this failure, I ended up with these two sets (buckets7 holds the failing cases, 8 x 256 = 2048 of them; buckets8 holds the 'good' ones):
In [122]: buckets7
Out[122]:
[{'(10, 2)': 256,
'(13, 2)': 256,
'(14, 2)': 256,
'(2, 10)': 256,
'(2, 13)': 256,
'(2, 14)': 256,
'(2, 22)': 256,
'(22, 2)': 256}]
In [123]: buckets8
Out[123]:
[{...
'(10, 1)': 2304,
'(11, 2)': 256,
'(12, 2)': 256,
'(15, 2)': 256,
'(16, 2)': 256,
'(17, 2)': 256,
'(18, 2)': 256,
'(19, 2)': 256,
'(20, 2)': 256,
'(21, 2)': 256,
'(23, 2)': 256,
'(24, 2)': 256,
'(25, 2)': 256,
'(26, 2)': 256,
'(27, 2)': 256,
'(28, 2)': 256,
'(29, 2)': 256,
'(30, 2)': 256,
... }]
Note that the ...'s are where I chopped the symmetric entries and all the ones that had a 1,x or x,1 [a large part of the test set]. There are some very suspicious absences in the 'good' list; notably, the two sets do not intersect at all!
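The bucketing itself can be sketched roughly as follows. This is only an illustration: the (size1, size2) tuples and the two sample lists are made up, and the real session presumably pulled the sizes out of the test cases some other way.

```python
from collections import Counter

def bucket_by_sizes(cases):
    """Count cases per (size1, size2) pair, keyed by the pair's string form."""
    return Counter(str(sizes) for sizes in cases)

# Hypothetical stand-ins for the failing and passing test cases.
failing = [(10, 2), (10, 2), (2, 10), (13, 2)]
passing = [(11, 2), (11, 2), (10, 1), (2, 11)]

bad = bucket_by_sizes(failing)
good = bucket_by_sizes(passing)

# Keys that show up on both sides; an empty set means the size pair alone
# separates the failures from the passes.
overlap = set(bad) & set(good)
print(dict(bad), dict(good), overlap)
```

An empty overlap is the same "do not intersect at all" signal as in the two sets above.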

So, what's going on here? Now we get to the hazards of integer arithmetic. As it happens, the energy for these cases has a particular dependence on the sizes [or rather the sum of the sizes, e.g. for (12,2) we care about n = 12+2 = 14]. For large n, we know that energy(n) ~= energy(0) + k * log(n), for some constant k and initial value energy(0). This approximation is not a good one for our actual physical system at small n, so what we do is get a lot of data about those small n and get really good numbers to fit those. For all the parameter sets we know that for n > 30 we can just use the logarithm approximation. Some of these sets have every value of n up to 30 specified, but depending on the quality of the set and other factors others may only have n <= 9 specified in the parameter set.

Now, in order to compute the energy we would prefer to just look up a value rather than take an expensive logarithm, so the normal practice is to store a lookup of values energy(n) for n <= 30, doing the log only if we get an n > 30 (unlikely for our systems). When the parameter set doesn't define all of those, we need to compute them. This is easy though, as we can just use the logarithm approximation to figure out the rest; all we need is that same k we needed for n > 30: if the last measured parameter is at length j (so we know energy(j) but not energy(j+1) for our parameter set), we can compute lengths j+1, ..., 30 using the log.
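As a rough sketch of that lookup scheme (not the simulator's actual code): the function names and the measured values below are made up for illustration, and anchoring the n > 30 case at the table's entry for 30 is my assumption about how the extrapolation is pinned.

```python
import math

MAX_TABLE = 30  # per the text: sizes above 30 always use the log form

def fill_table(measured, k):
    """Extend a {size: energy} table from its last measured size j up to 30.

    Each missing entry is energy(j) + k * log(n / j).
    """
    j = max(measured)
    table = dict(measured)
    for n in range(j + 1, MAX_TABLE + 1):
        table[n] = measured[j] + k * math.log(n / j)
    return table

def energy(table, k, n):
    """Look up small sizes; fall back to the log approximation above 30."""
    if n <= MAX_TABLE:
        return table[n]
    # Assumption: extrapolate from the last table entry.
    return table[MAX_TABLE] + k * math.log(n / MAX_TABLE)

# Hypothetical parameter set that only specifies n <= 9.
measured = {n: 100.0 + n for n in range(1, 10)}
table = fill_table(measured, 107.5)
print(len(table), energy(table, 107.5, 14))
```

This version works in floating point throughout, so it sidesteps the rounding question entirely.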

With me so far?

Oh, what about integers? Well, you'd think that given the use of logs and so on, these parameters might be real numbers and not integers, and in theory that's actually the case. In practice, though, all the energies are measured to 2 decimal places (e.g. '2.45 kcal/mol') and so some programmers decided that it was better to do everything in integer arithmetic. So everything gets multiplied by 100 and stored as an integer; this makes a lot of sense historically due to how cheap it is to perform computation on integers vs floating point numbers. Nowadays that's not really an issue [I think?], though.
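A minimal sketch of that storage convention, assuming the conversion is done by rounding to the nearest hundredth; the helper names are hypothetical and not taken from any of the actual energy codes.

```python
SCALE = 100  # two decimal places, as in '2.45 kcal/mol'

def to_fixed(energy_kcal):
    """Store an energy as an integer number of hundredths of a kcal/mol."""
    # round() rather than int(): many decimal values are not exact in binary
    # floating point, and plain truncation can come out one unit low.
    return round(energy_kcal * SCALE)

def to_float(fixed):
    """Convert a stored integer back to floating-point kcal/mol."""
    return fixed / SCALE

print(to_fixed(2.45), to_float(245))
```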

So, let's spot the error. For the moment, assume k = 107.5 (or so) is our constant, and we know position j has the last measured value.

From simple log arithmetic, we know:
log(a/b) = log(a) - log(b)

So if we have a sequence of numbers x[0], x[1], ... x[j], and know that they are logarithmic in distribution, we can then calculate the next number:
 x[j+1] = x[j] + k * log((j+1) / j)
So for any position j < m <= 30, we can calculate:
 x[m] = x[j] + k * log((j+1)/j) + k * log((j+2)/(j+1)) + ... + k * log(m / (m-1))
      = (x[j] + k * log((j+1)/j) + k * log((j+2)/(j+1)) + ...) + k * log(m / (m-1))
      = x[m-1] + k * log(m / (m-1))

Or, grouping them the other way:
 x[m] = x[j] + k * log((j+1)/j) + k * log((j+2)/(j+1)) + ... + k * log(m / (m-1))
      = x[j] + (k * log((j+1)/j) + k * log((j+2)/(j+1)) + ... + k * log(m / (m-1)))
      = x[j] + k * log(m / j)

So, either method gives you an equivalent algorithm for generating all the values up to 30, but there's one important difference: with the first method you 'remember' one piece of information as you go, 'x[m-1]', while the second remembers 'j' (which is then used because you also need x[j]). When implementing this, I used the first method: our other parameter sets are all floating point numbers anyways, and this way I can avoid storing 'j' in addition to the normal loop variable.

Now, even though the algorithms are equivalent on the reals, on integers (with our k) they differ by 1 in several places due to compounded rounding issues:

index 11: 434 vs 434
index 12: 443 vs 444
index 13: 452 vs 452
index 14: 460 vs 460
index 15: 467 vs 468
index 16: 474 vs 475
index 17: 481 vs 481
...
index 23: 514 vs 514
index 24: 519 vs 518
index 25: 523 vs 523
...
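Here is a small sketch of the two groupings in integer form. The inputs are hypothetical (last measured size j = 9 with a stored value of 400), and rounding each log term to the nearest integer is my assumption about how the integer version behaves, so this will not reproduce the exact numbers above; it only shows the same kind of off-by-one drift.

```python
import math

def incremental(x_j, j, k, top=30, rnd=round):
    """Grouping 1: x[m] = x[m-1] + k*log(m/(m-1)), rounding every step."""
    out = {j: x_j}
    for m in range(j + 1, top + 1):
        out[m] = out[m - 1] + rnd(k * math.log(m / (m - 1)))
    return out

def direct(x_j, j, k, top=30, rnd=round):
    """Grouping 2: x[m] = x[j] + k*log(m/j), rounding once per entry."""
    return {m: x_j + rnd(k * math.log(m / j)) for m in range(j, top + 1)}

# Hypothetical inputs, stored as integers (hundredths of a kcal/mol).
a = incremental(400, 9, 107.5)
b = direct(400, 9, 107.5)

# Every index where the two groupings disagree once rounding is involved.
diffs = [(m, a[m], b[m]) for m in a if a[m] != b[m]]
print(diffs)
```

Passing rnd=lambda v: v skips the rounding, and then the two functions agree up to floating point noise, which is the "equivalent on the reals" part.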

And there you have it. Integer rounding issues led to a very slight disagreement in the computed energy, which is what sent me off to track down the problem. Note that the failing cases (buckets7) all have size sums of 12, 15, 16 or 24, which are exactly the indices where the two columns above disagree. At the precisions I deal with, the right solution is probably to convert things into floating point at the input stage - the later operations on energy all need floating point anyway, but on the internals side it's a pain in the neck to convert everything.

On the hazards of binding Ctrl-P to "Publish Post"

If anyone is reading via one of those newfangled inventions and saw a bunch of random half-formed posts go up, sorry! Most of the usual emacs keybinds work great in blogger's editor, but for some reason ctrl-p is publish post, rather than previous line. (ctrl-n works though!) So you may have gotten a random block that wasn't really a post yet - I'll likely be actually finishing that and posting it when I get time; hopefully tonight?

Wednesday, September 15, 2010

Lighthearted

In case my last post got you down, here's some random links that are either cool or funny:

An NBA team tries the Crazy People method of advertising.

Gordon Freeman: The Legend. The Crowbar. A song about Gordon Freeman and his Crowbar. Pretty well done song.

Speaking of Crazy People, the headline for this one is a good one: "We're sorry we're not Apple."

Finally, a pair of cuteness (via L):

Penguins + Butterfly = Chaos

Sleepy Baby Bears

Please Think (PSA)

Just a quick post, as I read a very moving story that reminded me of other things from my life.

If you're driving, please think about the other drivers, other people on the street, the world outside your car's interior. I know that should be obvious, but frequently people get caught up inside their own worlds and when you're at the wheel of a car that can be lethal.

A recent wreck at the intersection outside our apartment falls in this category - the elderly man who was hit while riding in the bike lane is still in critical condition.

A long time ago, someone I'd known in high school died due to a car crash - just a bit more thought on the part of those involved and it wouldn't have happened.

Very recently, a game designer for Relic died in a car crash that was in no way his fault. He was thinking, though, and that's very likely why his pregnant wife survived.

Please, think.

Links:
Dad-to-be's final act saves family.
Hit-and-run driver critically injures elderly Pasadena man.

Monday, September 13, 2010

Football / You're Old When / Longevity is Awesome

Ok, I know I'm getting old. I just read a news story about the Carolina Panthers - they came to Charlotte about when I was leaving and so I've never actually been to one of their games. The news story was about one of their players, who also happens to be one of my favorite football players - he is still playing, and has been with the Panthers every year of their existence [16 years now]. Can you name that player?

Speaking of playing sports when old, one of my favorite baseball players was Julio Franco. Never spectacular [okay, I lie. That home run into the pool was amazing]; but man was he in shape and always hustling! He's got nearly all of the age-related records - oldest player to hit a home run (which he then topped a few times), etc, but nearly as impressive is that he's one of only 3 players to top 4200 hits in their career (major, minor, international play) - the others are Pete Rose and Ty Cobb [I suspect Ichiro may get there, maybe?]. Other amusing trivia: he went 20 years between starts at 3rd base. He was the 6th batter Roger Clemens faced in his career, and when they faced off in 2007 they were the oldest batter/pitcher pair [combined ages] since 1933. [So, when did he first face Clemens? That would likely be 1984, when Clemens debuted in the major leagues for the Red Sox. Franco first played in the majors in 1982.] While it's likely just a myth, there are reports that his registered birth year of 1958 may be incorrect - it could be as early as 1954!

Thursday, September 2, 2010

A disgrace

Nyjer Morgan is a disgrace to the game of baseball and should be tossed. In the past week I've seen at least three plays where he appeared to intentionally try to hurt someone, and he succeeded at least once (separated shoulder).