Where Have You Gone, Bell Labs?

I just caught wind of a really interesting article in Business Week by Adrian Slywotzky about the importance of public and private research labs for America’s economic recovery.

He estimates that, because of the Recession and outsourcing, we need to create 6.7 million jobs, and that to spark enough demand to truly “recover” we need to create another 10 million. He says this isn’t impossible, because in the 1990s the U.S. economy generated 22 million jobs (2.2 million a year). From 2000 to 2007 (before the Recession), however, the economy generated only 900,000 a year.
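To make the size of that gap concrete, here is a rough back-of-the-envelope sketch using only the figures quoted above. The 10-year horizon is my own assumption (it is the time frame I use later in this post, not necessarily the article’s), and the function and variable names are made up for illustration.

```python
# Figures quoted in the post, in millions of jobs.
JOBS_TO_REPLACE = 6.7    # lost to the Recession and outsourcing
JOBS_FOR_DEMAND = 10.0   # extra jobs needed to spark demand
RATE_1990S = 2.2         # jobs created per year in the 1990s
RATE_2000_2007 = 0.9     # jobs created per year from 2000 to 2007


def required_annual_rate(total_jobs, years):
    """Average number of jobs per year needed to reach total_jobs in `years`."""
    return total_jobs / years


total_needed = JOBS_TO_REPLACE + JOBS_FOR_DEMAND

# ASSUMPTION: a 10-year recovery window, chosen only for illustration.
needed_per_year = required_annual_rate(total_needed, years=10)

print(f"Total jobs needed: {total_needed:.1f} million")
print(f"Needed per year over 10 years: {needed_per_year:.2f} million")
print(f"Above the 2000-2007 pace? {needed_per_year > RATE_2000_2007}")
print(f"Within the 1990s pace? {needed_per_year <= RATE_1990S}")
```

Under that assumption the required pace falls between the two historical rates: well above what the economy managed from 2000 to 2007, but below what it did in the 1990s.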

In addition, as he says,

Of the roughly 130 million jobs in the U.S., only 20% (26 million) pay more than $60,000 a year. The other 80% pay an average of $33,000. That ratio is not a good foundation for a strong middle class and a prosperous society. Rather than a demand engine, it’s a decay curve.

His argument is that basic scientific research by both government and private labs has fueled the various “blockbuster” economies over the past hundred something years:

Cars and petroleum in the 1920s, movies and radio in the 1930s, defense in the 1940s, appliances and television in the 1950s, pharmaceuticals in the 1960s, aerospace in the 1970s, PCs in the 1980s, the Internet and cellular telephony in the 1990s.

He observes that all of these industries grew out of basic research conducted either at private research labs like Bell Labs or government labs like DARPA.

What he sees as a problem is that, unlike in previous recessions, the funding for basic research has dwindled over the past decade. He cites Bell Labs as an example: as recently as 2001 it employed 30,000 scientists, and now it employs only 1,000.

Underlying much of this, of course, is the oft-observed truth, which I can certainly confirm personally, that most of the smart technical people (especially the ones I graduated with at Harvard) have been going into finance over the past 10 years. As he says,

Science has lost its allure as the domain for our best and brightest. Much of the best technical talent has been drawn to the promise of riches from Wall Street and financial engineering. We need to reestablish a culture that rewards and celebrates the scientist who is willing to work on tough problems even if the commercial return is less certain.

He fundamentally calls for greater investment in labs and R&D in the United States.  His three recommendations for how to get back on track are:

• Clear national goals in two or three key areas, such as carbon-free energy and preventive medicine.
• Government commitment of $10 billion a year above and beyond spending for national agencies to jump-start new industrial research labs.
• Government tax credits for corporations that commit to spending 5% to 10% (or more) of R&D on basic research.

Incidentally, the third point sounds like something I saw a while ago that President Obama is proposing as part of his general tax reforms: namely, a $74.5 billion tax cut over 10 years for R&D.

Having worked at a government research lab over the past 3 years, I can’t comment much on what it used to be like pre-2000. But I know that the people working there are brilliant.  And I also know that the finance sector is not going to create 17 million jobs over the next 10 years. What’s wrong with giving scientists some love, for the good of the country?

Google and IBM say we need to train more supercrunchers

There was an article in the New York Times today about the effort that companies like Google and IBM are making to give university students access to very powerful computing environments, so that engineers and scientists can plow through massive data sets. Their argument is that students are currently being trained to think on a gigabyte scale (if they’re lucky enough to be trained to analyze real data at all), when all the breakthroughs are happening with datasets on the terabyte and petabyte scales.

I couldn’t agree more with this analysis. If people are serious about analyzing those “very rare events”, “long tails” or whatever else can make the difference between profit and loss, success and failure, or even life and death, then we can’t keep running around assuming things because the model fits 80% of the time and, anyway, it’s too hard to do that level of analysis. We all saw what happened with that idea.

When I was working at Lincoln, we created a highly accurate model of U.S. near mid-air collisions. We did this by analyzing about 5 terabytes’ worth of radar data from across the country (about 8 months’ worth). Nobody had ever done this before on anything close to that scale.

As a result, we had orders of magnitude more data on near mid-air collisions (a very rare event) than the last model, from the early ’90s, had. Without this data, and the high-powered systems available at Lincoln that we used to analyze it, our model would have suffered from the same assumptions and modeling errors as previous attempts. That is just not good enough for developing something as important as the next generation of collision avoidance systems for manned and unmanned aircraft, which people are now doing at Lincoln, largely as a result of that effort.
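To give a feel for why orders of magnitude more data matters so much for rare events, here is a toy sketch. It is not the Lincoln model or anything close to it. It just uses the textbook formula for the relative standard error of an estimated probability, under the (loudly stated) assumption that observations are independent trials, and every number in it is hypothetical.

```python
import math


def relative_standard_error(p, n):
    """Relative standard error of the sample proportion for an event with
    true probability p, estimated from n independent trials.

    ASSUMPTION: trials are independent and identically distributed, which
    real radar tracks certainly are not. This is only an illustration.
    """
    return math.sqrt((1.0 - p) / (n * p))


# Hypothetical numbers, chosen only to illustrate the scaling.
p_rare = 1e-6          # made-up probability of the rare event per trial
small_n = 10**7        # a "small" dataset
large_n = 10**9        # two orders of magnitude more data

for n in (small_n, large_n):
    rse = relative_standard_error(p_rare, n)
    print(f"n = {n:.0e}: relative standard error is about {rse:.1%}")
```

The point is the scaling: the uncertainty shrinks with the square root of the amount of data, so a hundred times more data gives roughly a tenfold tighter estimate of how often the rare event happens.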

The ability to analyze massive data sets has proven again and again to be a competitive advantage in biotech, finance (for those who do it correctly), the internet, and even marketing, earning the companies that developed those competencies hundreds of billions of dollars.

Is it then a stretch to say that the next lucrative opportunity in operations management will be to develop the capabilities to harness the massive amounts of data companies already generate every day? I’m talking about everything from inventories to machine control outputs and even to intra-company emails.  There are signals in that data, just as there are signals in everything from our DNA to the stock markets, if you look hard enough.

To be honest, I don’t know (I’m new to this stuff!), but that’s why several of my classmates and I are trying to start a new track for LGOs in the EECS department this year called Information and Decision Systems. The focus of this track is to develop the theoretical, practical and communication skills of students who want to take on this operations challenge in the real world, for real companies. That means not just studying and learning the algorithms, but also getting a design background in the networking, database and parallel computing systems that are critical enablers of this type of work. It also means developing specialized communication skills to explain the opportunities and the results because, as the NYT article said, most people have not been trained to think on this scale before.

I could talk for pages more about this topic, but let’s just leave it at that for now. I just had to write something because I’m obsessed with this idea, and this article got me all excited. I’m definitely going to look into Hadoop…

Blogger^2

Quick note to say that I’m now blogging for the EECS department too.  I even got my first “first” comment!