Archive for the ‘java’ category

The future of IT is Big

October 13, 2009

The New York Times is running an interesting piece about the ever-growing glut of data. The article details IBM’s and Google’s concern over the data glut and whether new and upcoming students are being trained to handle the explosion of data. It is quite a fascinating piece.

At the heart of this criticism is data. Researchers and workers in fields as diverse as bio-technology, astronomy and computer science will soon find themselves overwhelmed with information. Better telescopes and genome sequencers are as much to blame for this data glut as are faster computers and bigger hard drives.

Please click through and read the whole article. It is very good and very true. This topic should be at the forefront of the mind of anyone who works in the computer/technology field. First there is the problem of how to store this much data. Currently I work for a small publisher (O’Reilly Media). It is easy to think that a small publisher probably doesn’t have huge storage needs. But in the time I’ve worked here (one full year, going on my second) we have already ordered our second storage shelf, this time for almost 14TB. The new shelf has yet to be installed, but the other day my IT coworker was talking to management in a meeting. Our last shelf was around 1TB, but lasted less than a year. He said that at almost 14TB this one should last us a long time, but then added, “But we say this every time.” It is so true, especially with storage so cheap and drives so big. It reminds me of my first computer in the mid 90’s with 10GB of storage. I told my parents I’d never need a bigger hard drive. Then I went away for my freshman year of college and filled it right up with stupid pictures and movie files.

When I worked for the University of Illinois Engineering department the problems were worse. One research group that I worked for had 1 professor and maybe 5 students (including undergrads). They were relatively new, so there was no infrastructure or file server, and there really wasn’t much money for it anyway. One day I went to the professor’s office. He must have had at least 30 hard drives, each at least 500GB if not 1TB. Those were just the hard drives he had; his students each carried around a handful themselves. Another research group, with decades of history, started a scanning project. They would scan hundreds of slides at once, each producing around 1MB of data. We installed a file array that started at 4TB but was expandable to 14TB. Unfortunately I left and am not sure what they have or need now. My point is that data storage is a huge problem, and it is growing extremely fast. The article mentioned Facebook’s 1 petabyte of photos (I’m guilty of quite a few of those), but that is just one company; many more could have been mentioned. Finally, there is even personal storage. Since I got my new camera I myself have been looking at more storage for home, in the form of personal NAS boxes. So I see the basic point. The future of IT is data and what to do with it.

Computer scientists and, for that matter, any scientists need to pay special attention. Not only do we need a way to store a lot of this data, but probably more importantly we need to do something with it. A lot of this will rest on programmers, but it isn’t limited to them. When I worked at the U of I, the students worked on a cluster I built for them. They would code in C, tweaking their algorithms to save every last processor cycle. These students weren’t in Computer Science. This summer I took a course at Boston University. One of my classmates was clearly not a computer person. I asked her why she took the course. She was a statistician and was heading to grad school for statistics. The school had asked her to take programming courses so she could analyze data sets. And then, of course, there are the computer scientists; our future depends upon analyzing such data.

The future is big data; lots of it. And it is no longer just Google and IBM analyzing and storing it. Now even the smallest of research groups or a little publisher can generate mounds of information. Time to start paying very close attention.

CS intro w/ Java and a bit of book review

July 1, 2009

I’m currently taking a basic Java course that’s supposed to bring me to the level needed to take the SCJA (Sun Certified Java Associate) certification, the first step in the Sun Java cert track.  Instead of using the standard book that my university recommends, I’m using Big Java by Cay S. Horstmann (ISBN 978-0-470-10554-2).  I’ll post a few thoughts here on this book.

So far so good.  I’m about four or five chapters in, and I think I have a good feel for the flow of the book.  I’ve done several other starter programming books (Zelle’s Python Programming: An Introduction to Computer Science, and an intro JavaScript course as well), but Big Java surprised me starting out.  Unlike some other courses that mainly start out with syntax and primitive data types, this one started out with class design and OO concepts.  It even teaches the student to use a few Swing components (JFrame and JOptionPane) early on in order to make the usual “monkey trick” exercises a bit more interesting.  I like this approach, as it makes the introductory chapters easier.

I like Horstmann’s writing style.  It’s concise and clear, and the code examples are good.  I have yet to find an error in any of the examples.  I’m reading it on Skillsoft Books 24×7, an online book service, and it’s been good so far.  I do kind of wish I had the paper copy, but that’s just how I am.  Anyhow…

A big help to me was that I started with JavaScript and Python.  Java’s syntax is very similar to JavaScript’s, so it gave me a head start to coding in Java.  The combination of JavaScript’s syntax with Python’s OO perspective gave me a good foundation from which to move through Big Java.

One more thought – I’ve worked through some programming books that have virtually no exercises.  This, IMO, is a terrible way to help people learn.  If you’re writing a beginner book, you MUST provide practice opportunities for those who can’t come up with their own.  Big Java does a great job of providing practice opportunities at multiple complexity levels.  The exercises build on each other (to some degree), and I feel they are quite effective.

So I do recommend the book for a self-taught Java beginner.  I’ll post more about it as I go along.