Archive for the ‘research’ Category

census estimates

July 24, 2006

on july 21, 2006 dc got 31,528 new people.  well, at least if you’re the US census.  the methodology used?  the washington post article notes:

the city submitted building permits from 1999 to 2005, Phillips said, in addition to school enrollment figures showing that public charter schools had absorbed much of a decline in the number of students attending public schools. The city also submitted data showing that the number of people filing taxes in the District had remained steady between 2004 and 2005 and that Pepco was serving an increasing number of residential units.

but you’d think that if this was done in DC, it’d be standard practice for all cities.  i’m actually kinda surprised the census didn’t incorporate these figures already.  but the .pdf i could find didn’t mention anything.

the more i look at these figures, the less i trust them.  lies, damn lies,…


more statistics!

July 24, 2006

statNotes: topics in multivariate analysis

statistical resources at UMichigan

statistics jokes!

county and city data tables
oh my, i love coupling – “i’m not that easy, but show me a muscular blond who can control the weather and this girl is on all fours”

poverty & the middle class

July 24, 2006

the la times has a piece on the growing gap between rich and poor, and the shrinking neighborhoods of middle-class families.

“The retailers in the two neighborhoods are very different,”… “It’s the difference between a Whole Foods and a corner grocer, or Citibank and the local check casher. They’re not competing, and in the end, you have higher prices for all basic goods and services.”

urban centers are becoming the domain of the richest and the poorest, as the middle class is increasingly moved to the suburbs.  this limits the upward mobility of the poor, cutting off options like moving into a better neighborhood, going to better schools, or maintaining social contacts.

you can bet i’ll be reading the brookings report later on.

[update:  a rich discussion over at feministe]

textMap, so so cool – but how does it work?

July 22, 2006

i am the absolute worst when it comes to methodologies and titles (titles, you probably guessed from these posts).  but it is becoming increasingly apparent that these are at the core of great statistics / information display / research.  take textmap, an engine to analyze the geographic and temporal distribution of news.  it is really quite cool, and something i’ve wanted to do for a while, but it always seemed like there were too many problems to overcome before the idea became workable. so i was psyched to find the site.

-but a problem-

playing with the ‘function of location’ charts has me worried.  montana has relatively few news sources, and therefore never shows a strong reading.  the east coast, however – particularly the metropolitan corridor – is a sea of red (more news sources in the area).  so, there is variation in both areas, but it isn’t entirely clear what the map is measuring, because comparing across regions is no longer intuitive.  i couldn’t find the methodology on the site (boo!) – and so it isn’t clear what intensity of red indicates.
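one way to make regions comparable, as a rough sketch: divide the number of mentions by the number of news sources in each region, so a quiet state with a handful of papers can still light up. every number and name below is made up for illustration, and i have no idea whether textmap does anything like this.

```python
# hypothetical counts: nothing here comes from textmap itself
mentions = {"montana": 12, "new york": 480}   # articles mentioning the term
sources = {"montana": 8, "new york": 400}     # news sources in each region


def mentions_per_source(mentions, sources):
    """Return mentions divided by source count for each region."""
    rates = {}
    for region, count in mentions.items():
        n = sources.get(region, 0)
        # skip regions with no known sources rather than divide by zero
        if n > 0:
            rates[region] = count / n
    return rates


rates = mentions_per_source(mentions, sources)
for region, rate in sorted(rates.items()):
    print(f"{region}: {rate:.2f} mentions per source")
```

with these toy numbers the raw counts make new york look 40 times hotter, but per source montana is actually the stronger reading. that is the kind of thing the map's shade of red could be hiding.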

this isn’t to say the site isn’t worthwhile – the mexico map shows an intuitive trend

Mexico TextMap

but i have to wonder if this is an artifact of paper coverage – why the band between n. florida and s. georgia? – and wonder about coverage in relation to associated thoughts.  (what is the unit of analysis, btw – census tract?)

there is also the old baseline problem: what is the ‘noise’ associated with a given concept (background usage not associated with events)? – and what is the median frequency of ‘related’ terms? it’s cool if mexico usage went up, but whether that was a function of world cup news or of immigration news makes a big difference.
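a minimal sketch of what i mean by baseline: take a term's weekly counts from some quiet stretch, then ask how far the current week sits above that background, measured in standard deviations. the counts are invented and the z-score is just my stand-in for "signal vs. noise", not anything the site documents.

```python
from statistics import mean, stdev


def baseline_zscore(history, current):
    """How many sample standard deviations `current` sits from the baseline mean."""
    sigma = stdev(history)
    if sigma == 0:
        # a perfectly flat baseline gives no scale to measure against
        raise ValueError("baseline has no variation")
    return (current - mean(history)) / sigma


# hypothetical weekly counts of articles mentioning 'mexico'
baseline_weeks = [40, 44, 38, 42, 41, 45, 39, 43]
this_week = 70

z = baseline_zscore(baseline_weeks, this_week)
print(f"this week is {z:.1f} standard deviations above baseline")
```

this only says that usage jumped, not why. separating world cup coverage from immigration coverage would need the same treatment applied to the co-occurring terms.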

hm… actually, with this data set, you could probably look at news conglomeration::variety of media sources, if the answers to above were clear… ooh, shiny.

data makes me do the happy dance

July 21, 2006
datamining has an interactive map of the blogosphere. the map layout is a “variant of the force layout approach to graph layout. There certainly is meaning to the location of nodes in the image: proximity indicates a tendancy for mutual citation.” meaning: the map is more than just a pretty face. the placement of nodes has actual social meaning.

but this is even more sexy, as a suggestion:

Time stability is an interesting problem. One way to do this is to fix nodes in location (or certain nodes). Alternatively, you could allow nodes to become more lethargic in movement according to how long they have been there. This seems like a good idea. Are you going for some form of animated representation?
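the "lethargic" idea can be sketched in a few lines: whatever displacement the force layout proposes for a node, shrink it by a factor that grows with the node's age. the function name, the `lethargy` knob, and the 1 / (1 + lethargy * age) damping are all my own assumptions, not datamining's code.

```python
def damped_step(position, displacement, age, lethargy=0.5):
    """Move a node by a displacement that shrinks as the node gets older.

    age: number of layout updates the node has already been present for.
    lethargy: made-up tuning knob; 0 disables the damping entirely.
    """
    scale = 1.0 / (1.0 + lethargy * age)
    x, y = position
    dx, dy = displacement
    return (x + scale * dx, y + scale * dy)


# same proposed move, applied to a brand-new node and a ten-update-old one
new_node = damped_step((0.0, 0.0), (1.0, 1.0), age=0)
old_node = damped_step((0.0, 0.0), (1.0, 1.0), age=10)
print(new_node, old_node)
```

pinning a node in place is just the limit of this: a node old enough barely moves at all, while fresh arrivals settle in around it.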

dangit, where is my programming computer when i need it!

[update 1]: ok, i heart datamining. this visualization method is pretty darn inspiring, and pretty straightforward to understand (compared to other methods i’ve read)

we start by giving some amount of money to some user (initiator) in LJ network telling him to evenly distribute it among his friends, then his friends are performing the same action among their friends and so on. Obviously, if these guys are the members of some clique it will not take too long until all of them have an equal amount of money (thanks to small-world property), meanwhile only some small part of the initial amount will leave this community. So the amount of money of a particular user defines his thermodynamic distance from the initiator. If we have two initiators – we can plot the figure like the one shown here.
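here is a toy version of the money-passing idea, under my own assumptions: everyone gives away all of their money each round, split evenly among friends, and all rounds happen in lockstep. the friend graph is invented (a tight trio with a tail hanging off it), and this is a sketch of the description above, not the original author's code.

```python
def spread_money(friends, initiator, steps, amount=1.0):
    """Pass money around a friend graph for a fixed number of rounds.

    friends: dict mapping each user to a list of friends (every friend
             must also be a key of the dict).
    Returns a dict of how much money each user holds at the end.
    """
    money = {user: 0.0 for user in friends}
    money[initiator] = amount
    for _ in range(steps):
        next_money = {user: 0.0 for user in friends}
        for user, cash in money.items():
            pals = friends[user]
            if not pals:
                # nobody to give to: the money stays put
                next_money[user] += cash
                continue
            share = cash / len(pals)
            for pal in pals:
                next_money[pal] += share
        money = next_money
    return money


# hypothetical network: a, b, c form a clique; d and e trail off from c
friends = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

money = spread_money(friends, initiator="a", steps=3)
for user in sorted(money):
    print(f"{user}: {money[user]:.3f}")
```

after a few rounds most of the money is still inside the trio and only a trickle has reached the tail, which is the clique effect the quote describes. running it again with a second initiator gives each user two numbers, enough to plot them on a plane.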

expect more updates as i read through the whole archives this weekend.

[update 2]: don’t run too far through the links. i accidentally made it to ‘linked’, a book that makes me angry. hulk angry


June 29, 2006

via crooked timber and jim gibbon, i stumbled into gapminder, a neat data-visualization package available online (alternate link). so so cool, and not just for us geeks.

while i’m at it…

neat link on the monte carlo method
we all use it, i just need to store links: mathworld
possibly going on my sidebar: social science stat blog. too cool