Thursday, April 16, 2009

Some changes are coming

I plan to make some changes to this blog in the near future. I've been reading a lot about how to make your blog worthwhile so that other people will want to read it.

I'm interested both in informing other people and honing my writing skills. I figure the two will work in a virtuous cycle. I'll start writing to hone my writing, and people will eventually read it. Then I'll want to keep writing so that more people can read it, which will spur me on to write even more.

I figure I'll write about what I spend a lot of time thinking about: media. I play a lot of games, consume some television and books, and spend a lot of time on different operating systems (I run Linux at home). I certainly have a lot of opinions and would like to get them out to the world so that other people can be informed about products before they use them.

Hopefully I'll find a community of like-minded people and be able to help them by reviewing the media so that they don't have to, if they don't have the time.

I plan to start slowly, just a post a week, and see how difficult that is while I'm still working on my doctoral dissertation. Posts may increase or decrease in frequency accordingly.

Until next time!

Friday, April 3, 2009

SAS Global Forum 2009 Recap: Tuesday

I know this post is late, seeing how the conference ended over two weeks ago, but I figured better late than never. My impressions of the presentations are not as fresh, but I have more time now to write a bigger post about the sessions I attended.


The first session I attended was on time series predictions using neural networks, from Goel. This session was interesting and informative if for no other reason than its novel approach. The author used neural networks and ARIMA models side by side to predict time series data. He used this approach across several different types of data sets. In the end, the neural networks were able to achieve a reduction of error of around 80% to 90%! If these results can be replicated, I would be interested in using neural networks in my work to see if I can get that kind of accuracy.


The second session I attended was on regression assumptions, from Cerrito. She gave a lot of helpful pointers about some common-sense tests you should run on your data before modeling, and some common-sense tips about interpreting your output. Some good takeaways? If n gets too large, the variance will go to 0 regardless of whether or not the model is correct; if you have a large n, take a random sample of your data and fit a model based on that. When you're predicting rare events, your model can have high accuracy but horrible predictive power. For instance, if a disease occurs in 1% of the population, a model that says no one ever gets the disease will be right 99% of the time. Great accuracy, but horrible predictive power!
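
To make the rare-events point concrete, here is a tiny Python sketch with made-up numbers. The 1,000-person population is my own assumption for illustration, not something from the talk. It scores a "model" that always predicts no disease on both accuracy and recall (the share of true cases the model catches).

```python
# Hypothetical population: 1,000 people, 1% of whom have the disease.
actual = [1] * 10 + [0] * 990

# A "model" that predicts nobody ever gets the disease.
predicted = [0] * len(actual)

# Accuracy: share of all predictions that match reality.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Recall: share of the true disease cases the model actually catches.
true_positives = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
recall = true_positives / sum(actual)

print(f"accuracy = {accuracy:.2%}")
print(f"recall   = {recall:.2%}")
```

The accuracy comes out at 99% while the recall is zero, which is why accuracy alone is a poor yardstick when the event is rare.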


The next session I went to was on customer retention, from Pruitt. This presentation was not too compelling. The presenter essentially gave one big example about how he had developed customer retention scores using Enterprise Miner. We don't have Enterprise Miner, and he didn't go into any real depth on the theory side, so I left early to go work on some other projects. Several of my colleagues were there, so if I missed anything truly important I can find it out from them.


The next session, on Bayesian modeling using MCMC, was one I had already been to! I'm not sure where, but I had definitely seen Fang present this paper before. I left early to go to lunch.


After lunch I went to a couple of short sessions. The first was from Knafl about a macro for adaptive regression modeling. The presentation was about a macro the author had written that used k-fold likelihood cross-validation to determine the best model from a class of models. This macro could be interesting; I will probably check it out from the author's website. The second was from Chou and Steenhard about count data regressions, which are useful to me since most of my data are count data. They just presented a macro they had written that deals with 17 different distributions for count data, instead of SAS's built-in Poisson and negative binomial distributions.
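
For anyone curious what k-fold likelihood cross-validation looks like in general, here is a rough Python sketch. This is not Knafl's macro and I don't know its internals. The count data, the two candidate distributions (Poisson and geometric), and all the function names are my own assumptions, picked only to show the idea: fit each candidate on k-1 folds, score it by the log-likelihood of the held-out fold, and keep the candidate with the highest total.

```python
import math

def poisson_loglik(y, lam):
    # log P(Y = y) for a Poisson distribution with mean lam
    return y * math.log(lam) - lam - math.lgamma(y + 1)

def geometric_loglik(y, p):
    # log P(Y = y) where P(Y = y) = (1 - p)**y * p for y = 0, 1, 2, ...
    return y * math.log(1 - p) + math.log(p)

def fit_poisson(train):
    # Maximum-likelihood estimate of the Poisson mean
    return sum(train) / len(train)

def fit_geometric(train):
    # Maximum-likelihood estimate of p for the geometric on 0, 1, 2, ...
    return 1.0 / (1.0 + sum(train) / len(train))

def cv_loglik(data, fit, loglik, k=5):
    # Split into k folds, fit on k-1 of them, score the held-out fold
    folds = [data[i::k] for i in range(k)]
    total = 0.0
    for i in range(k):
        train = [y for j, fold in enumerate(folds) if j != i for y in fold]
        param = fit(train)
        total += sum(loglik(y, param) for y in folds[i])
    return total

# Made-up count data, purely for illustration
counts = [3, 5, 4, 2, 6, 4, 3, 5, 4, 4, 2, 7, 3, 5, 4, 6, 3, 4, 5, 2]

scores = {
    "poisson": cv_loglik(counts, fit_poisson, poisson_loglik),
    "geometric": cv_loglik(counts, fit_geometric, geometric_loglik),
}
best = max(scores, key=scores.get)
print(scores)
print("best model:", best)
```

With these made-up counts, which cluster tightly around 4, the Poisson candidate gets the higher held-out log-likelihood. Presumably the same scoring idea would extend to a longer list of count distributions, but that is my guess rather than anything the presenters said.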


At this point I realized there were no more presentations that looked interesting to me, so I staked out a chair in the convention center and did some work for the next few hours.


Tuesday night a few colleagues and I went out to a wonderful restaurant in Alexandria's Old Town called Las Tapas. The food was good and there was live Flamenco dancing, with one dancer and one guitar player. It had a fun, intimate feeling. If you're ever in the area, I highly recommend it.

Tuesday, March 24, 2009

SAS Global Forum 2009 Recap: Monday

This post will recap what I did at SAS Global Forum on Monday. You can follow my as-it-happens commentary by following the #SGF09 channel on Twitter.

Getting to the conference center and picking up my registration packet (I got in late Sunday night) was a snap, because my hotel offers a free shuttle to and from the conference center. I thought I'd have to take a cab every day.

The first part of the keynote address was pretty informative, even if it did feel like a late-night infomercial. Some of the more interesting products showcased were all about integration: integrating SAS with Sharepoint, text mining tools, R, and Flash. The future of SAS will be visually much different than it is now, I think.

Dave Barry, the keynote speaker, was hilarious, as usual. When I was younger I used to read his column every week, and it was neat to hear him speaking in the style in which he writes. It was like one of his columns had come to life. As I said, hilarious. I'm not sure his talk had much to do with SAS or analytics, but it was a nice bit of entertainment for all us conference attendees.

The first session I attended was about design of experiments (DOE) from Milliken. He had a very good message, that a modeler needs to consider all sources of possible variation that can be accounted for in experiments, whether it's before or after the actual experiment takes place. Even for someone in retail, the message is meaningful, because I should get all the details of experiments (which have usually already happened) so that I know how to properly apply blocking to the analysis. He gave plenty of good examples but I wish he had focused a bit more on theory.

The second session I attended, after lunch, was on time series, given by none other than David Dickey, co-creator of the Dickey-Fuller test. Not only was it great to hear about time series from someone who's contributed so much to the field, but his talk was surprisingly funny and accessible. His talk was mainly filled with examples and not much on theory, but it was still very informative. We learned about the advantages and disadvantages of different types of time-series models. My favorite session so far.

I returned to the topic of DOE for my next session, which was the Basics of DOE for Multivariate Analysis, from Figard. This talk was essentially just a rehash of what Milliken said in the morning session about how to design experiments and analyze them afterwards to identify all systematic sources of variation. The value added from this talk was that Figard presented a checklist to go through when designing and analyzing an experiment to find truly global optima instead of local optima.

My last session of the day was on optimizing marketing mix, from Bhattacharya. This presentation was only so-so. The presenter spoke very quickly and was done in 30 minutes when 50 were allotted. She could have benefitted from slowing down a little. Going through so much content so quickly meant that very little of it stuck. It felt like she had written a large macro to analyze optimal marketing mix, then wrote up a run of the macro. It felt a little light on theory and on the example. Her basic point was that the effect of media is usually non-linear, and you need tools to handle non-linear functions.

That was it for the first day. I caught the shuttle back to the hotel, ordered some surprisingly good local Chinese food, and spent the evening catching up on some work and reading the news. It's surprising how much news happens when you're spending all day in conference rooms. I know, I should have been "cooler" and gone to one of the SAS evening events, but I was pretty tired and had some work I had to get done. Don't worry, Tuesday night I'm going out with some fellow attendees to a local tapas place; they have a flamenco show that should be fun.

Next: in a surprising twist, I'll be posting SGF 2009 Recap: Tuesday!

Sunday, March 22, 2009

Finally here in DC for SGF '09!

I'm planning on writing a blog post every night to recap my experiences at SAS Global Forum 2009. Since I'm tweeting all day, the nightly blog entry is going to be more in-depth than the short 140-character bursts you get on Twitter.

Unfortunately, it's now really late and I'm too tired to write anything substantive. So I'm going to sleep.

Besides, all I did today was travel, and while I do have some interesting stories from my day of traveling (seriously, who can spend 8 hours in airports and NOT hear some interesting stuff?), I'm sure no one wants to hear about what I ate at Quizno's as the high point of my day.

Be back tomorrow.

Friday, March 20, 2009

Man, I'm bad at this

Well, it's been a long time and I haven't blogged at all, so I guess I totally fell down on my promise to blog at least once a month. Well, I guess this counts. I'm going to try updating things both here and at my Twitter site, which you can find at http://twitter.com/jephwood. I'd like to use this forum for longer thoughts and my Twitter site for short updates. I guess now I'm trying to be more like Wil Wheaton. We'll see if that works out.

Monday, August 11, 2008

I am not Paul Graham

I really like Paul Graham's style, where he writes about one long essay a month and posts it on his website. I'd like to do something like that with this blog. Just write about whatever comes to my mind. I think it would be a good exercise. We'll see if it takes. :-)