Big data gets a lot of ink these days, including at Booz Allen. What interests me is answering real-world questions with big data.

I do not currently hold the title of Data Scientist, but I might actually be one. As a consultant, I know how to analyze data and make recommendations, consulting domain experts when necessary. We evaluate our analysis tools all the time, but a majority of our civil government clients still want an Excel file as the deliverable. Another useful skill for consultants is knowing when to say “I don’t know.” All of this is covered in the following piece by Stephanie Rivera:

Being a data scientist is more than having a technical background; it’s also about going beyond your tools and understanding what it really means to tackle complex data analysis problems. No matter if you are a seasoned big data expert or are just considering moving into the field, here are five things you ought to know.

  1. Assuming the title, Data Scientist, does not make you one. There are lots of individuals that claim the title without any robust experience with big data or data analysis. However, there are likely as many people who have the experience needed to be a ‘data scientist’ but do not claim the title. Many disciplines require DS skills to understand data; the only major difference between a physicist and a data scientist is that the latter applies the math/cs skills to many types of data with the assistance of a domain expert. The former is also the domain expert in a specific area of physics.

  2. Speed is sexy; live in the fast lane. Technology and resources for data science are in constant flux. Being able to stay on top of the change is vital. Recently, Spark has tromped MapReduce, drastically changing the “hottest tool” for big data science. If you fight the change, you will be left in the dust.

  3. Tools are just that, tools. Having access to machine learning libraries does not make a data scientist. Know the tools, use the tools, but most of all think about the problem. Depending on the situation, you may not have tools or you may need to approach feature extraction in an ‘outside the box’ way. To be successful, you may need to turn your data science problem on its head.

  4. Value your domain expert. Domain expertise is vital to your success and is not something you can pick up overnight. Many times the domain expert loaned to you for your work has been studying the data for years or even decades. Treat them with the respect they deserve.

  5. Know yourself. Take note of your strengths and weaknesses. When you have a data science team where you cannot fill a gap due to a weakness, own that. Acknowledging that you need to reach out to colleagues or read academic papers is the first step toward defeating any knowledge gap. Chances are you can learn what’s missing – just give yourself the time to do that.

One of my favorite sites that regularly tackles big data and presents meaningful analysis with solid interactive data visualizations is FiveThirtyEight, the latest offering by Nate Silver, who is best known for his election forecasting and baseball analytics. Recently FiveThirtyEight published an article, “A Better Way to Find the Best Flights And Avoid the Worst Airports,” with an accompanying interactive, “Which Flight Will Get You There Fastest?” Using a year’s worth of data from the Bureau of Transportation Statistics, the analysis offers a quick way to find the fastest airline on any particular route. It also identifies the best- and worst-performing airlines and airports.

Check out how Washington Dulles International (IAD) rates below – on average, the airport adds 8 minutes to your travel time and is the 224th fastest.

Full Article

A Better Way to Find the Best Flights And Avoid the Worst Airports


Which Flight Will Get You There Fastest?