Archive for the ‘programming’ category

The future of IT is Big

October 13, 2009

The New York Times is running an interesting piece about the ever-growing glut of data. The article details IBM’s and Google’s concern over the data glut and whether upcoming students are being trained to handle the explosion of data. It is quite a fascinating piece.

At the heart of this criticism is data. Researchers and workers in fields as diverse as bio-technology, astronomy and computer science will soon find themselves overwhelmed with information. Better telescopes and genome sequencers are as much to blame for this data glut as are faster computers and bigger hard drives.

Please click through and read the whole article. It is very good and very true. This topic should be at the forefront of the mind of anyone who works in the computer/technology field. First there is the problem of how to store this much data. Currently I work for a small publisher (O’Reilly Media). It is easy to think that a small publisher probably doesn’t have huge storage needs. But in the time I’ve worked here (one full year, going on my second) we have already ordered our second storage shelf, this time for almost 14TB. Our last shelf was around 1TB and lasted less than a year. The new shelf has yet to be installed, but the other day my IT coworker was talking to management in a meeting. He said that at almost 14TB this should last us a long time, but then added, “But we say this every time.” It is so true, especially with storage so cheap and drives so big. It reminds me of my first computer in the mid ’90s with 10GB of storage. I told my parents I’d never need a bigger hard drive. Then I went away for my freshman year of college and filled it right up with stupid pictures and movie files.

When I worked for the University of Illinois Engineering department the problems were worse. One research group that I worked for had one professor and maybe five students (including undergrads). They were relatively new, so there was no infrastructure or file server, and there really wasn’t much money for it anyway. One day I went to the professor’s office. He must have had at least 30 hard drives, each at least 500GB if not 1TB. Those were just the hard drives he had; his students each carried around a handful themselves. Another research group, with decades of history, started a scanning project. They would scan hundreds of slides at once, each producing around 1MB of data. We installed a file array that started off at 4TB but was expandable to 14TB. Unfortunately I left and am not sure what they have or need now. My point is that data storage is a huge problem, and it is growing extremely fast. The article mentioned Facebook’s 1 petabyte of photos (I’m guilty of quite a few of those), but that is just one company; many more could have been mentioned. Finally there is even personal space. Since I got my new camera I myself am looking at more storage for home, namely personal NAS boxes. So I see the basic point. The future of IT is data and what to do with it.

Computer scientists and, for that matter, any scientists need to pay special attention. Not only do we need a way to store a lot of this data, but probably more importantly we need to do something with it. A lot of this will rest on programmers, but it isn’t limited to them. When I worked at the U of I the students worked on a cluster I built for them. They would code in C, tweaking their algorithms to save every last processor cycle. These students weren’t in Computer Science. This summer I took a course at Boston University. One of my classmates was clearly not a computer person. I asked her why she took the course. She was a statistician and was heading to grad school for statistics. The school asked her to take programming courses so she could analyze data sets. And of course then there are the computer scientists, and our future depends upon analyzing such data.

The future is big data; lots of it. And it is no longer just Google and IBM analyzing and storing it. Now even the smallest of research groups or a little publisher can generate mounds of information. Time to start paying very close attention.

Mono and C#, so what’s the big deal?

July 23, 2009

There has been a controversy brewing for quite some time in the Open Source world. I’ve sort of stayed out of it because I didn’t fully understand exactly what was happening, and I wasn’t sure that I cared. Still, if you are a programmer looking to do anything with C# in the non-MS world, there are good reasons to at least be aware of it. I currently don’t have any C# training, but I am scheduled to take a class on it in the spring. Finally I found an explanation that didn’t resort to needless flame wars. I think it is fair and balanced. Please click through, but I’ll post a few highlights. For background, Mono is an open source project that ports the .NET Framework to Linux and Mac. This is where the controversy lies: it is porting over a Microsoft technology, and MS does not have a good track record with Linux or open source. Also by way of background, there are a few very notable programs that are written with Mono and are being considered as default programs in Linux distributions (Tomboy, Banshee, F-Spot, and GNOME Do).

Mono, the free software implementation of .NET (C#), has been the subject of bitter debate for eight years. Yesterday, that debate ended — or at least shifted to another level — with Microsoft’s announcement that it was extending its Community Promise to include the patents that left Mono possibly encumbered.

The greatest fear has been that Mono-based programs like GNOME’s Tomboy or F-Spot could be the source of a patent violation case by Microsoft against some or all of the community.

In 2001, Microsoft released a letter to ECMA in which it promised that use of the patents involved would be available on request on a “royalty free and otherwise RAND ‘Reasonable and Non-Discriminatory’ basis.”

However, as Miguel de Icaza, the founder of Mono and a Novell vice president, points out, “The problem with ‘RAND’ is that it doesn’t say what ‘reasonable’ means. It has to be reasonable, but it doesn’t have to be free. Microsoft stated publicly and on the ECMA committee that nobody had to pay, but they never actually went and published the license.”

And there is the problem. While C# looks like a great language with awesome capabilities, the fact that MS holds patents and is a commercial entity leaves the door wide open to forcing Linux users to pay up.

As described on his blog, de Icaza plans to divide Mono source code into two repositories. One will include the ECMA-covered libraries, and the other Mono’s implementation of ASP.NET, ADO.NET, and Winforms. By making this division, de Icaza presumably hopes to make clear to developers at a glance what code they are working with.

I’m excited to learn C# and really have every intention to use it, but this does give me pause, especially if I were starting a large project that needed Mono (i.e., for any platform other than Windows).

CS intro w/ Java and a bit of book review

July 1, 2009

I’m currently taking a basic Java course that’s supposed to go to the proper level to take the SCJA (Sun Certified Java Associate) certification, the first step in the Sun Java cert track.  Instead of using the standard book that my university recommends, I’m using Big Java by Cay S. Horstmann (ISBN 978-0-470-10554-2).  I’ll post a few thoughts here on this book.

So far so good.  I’m about four or five chapters in, and I think I have a good feel for the flow of the book.  I’ve done several other starter programming books (Zelle’s Python Programming: An Introduction to Computer Science, and an intro JavaScript course as well), but Big Java surprised me starting out.  Unlike some other courses that mainly start out with syntax and primitive data types, this one started out with class design and OO concepts.  It even teaches the student to use a few Swing components (JFrame and JOptionPane) early on in order to make the usual “monkey trick” exercises a bit more interesting.  I like this approach, as it makes the introductory chapters easier.

I like Horstmann’s writing style.  It’s concise and clear, and the code examples are good.  I have yet to find an error in any of the examples.  I’m reading it on Skillsoft Books 24×7, an online book service and it’s been good so far.  I do kind of wish I had the paper copy, but that’s just how I am.  Anyhow…

A big help to me was that I started with JavaScript and Python.  Java’s syntax is very similar to JavaScript’s, so it gave me a head start to coding in Java.  The combination of JavaScript’s syntax with Python’s OO perspective gave me a good foundation from which to move through Big Java.

One more thought – I’ve worked through some programming books that have virtually no exercises.  This, IMO, is a terrible way to help people learn.  If you’re writing a beginner book, you MUST provide practice opportunities for those who can’t come up with their own.  Big Java does a great job of providing practice opportunities at multiple complexity levels.  The exercises build on each other (to some degree), and I feel they are quite effective.

So I do recommend the book for a self-taught Java beginner.  I’ll post more about it as I go along.

I must be getting better at programming…

April 21, 2009

… because I got this one without even having to read the alt text. 

comic

The alt text read “If androids someday DO dream of electric sheep, don’t forget to declare sheepCount as a long int.”

If you think that’s terribly unfunny, try this site.  Let me know if it’s more to your tastes.

Comic provided courtesy of http://www.xkcd.com
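
For anyone who did need the alt text: the joke turns on integer overflow. Here is a minimal sketch in Python, assuming the sleeper’s counter is a fixed-width signed integer. The widths chosen and the name sheep_count are my own illustration, not something taken from the comic. Python’s own integers never overflow, so the sketch borrows the C integer types from the standard ctypes module, which wrap around silently instead of raising an error.

```python
import ctypes

def count_one_more(counter):
    """Add one sheep to a fixed-width C integer and return the new value."""
    # ctypes integer types do no overflow checking, so an out-of-range
    # value wraps around the way a two's-complement C integer would.
    counter.value += 1
    return counter.value

# A 16-bit signed counter tops out at 32767 (illustrative width).
sheep_count = ctypes.c_int16(32767)
wrapped = count_one_more(sheep_count)

# A 64-bit counter has plenty of room left at the same starting point.
big_sheep_count = ctypes.c_int64(32767)
safe = count_one_more(big_sheep_count)

print(wrapped, safe)  # prints: -32768 32768
```

The narrow counter flips to a large negative number on the very next sheep, while the wide one just keeps counting, which is why a bigger integer type is the fix.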

Using Android on Netbooks

April 2, 2009

We have talked about netbooks and Google’s Android before. I am pretty excited to see where Google’s Android is headed, and I am very hopeful that netbooks will continue to adopt Linux as a viable OS. Not that I fault anyone for buying a netbook with XP, but as Linux becomes more common on netbooks, so too will the OS become more common in the mainstream.

So comes the news that HP is considering using Android in their netbooks. Ars does a good job breaking down the good and the bad. Android isn’t designed for the desktop, at least not yet. The OS is designed more for touch than for keyboard and mouse input, but those barriers can be easily overcome. Perhaps the bigger negative is the ecosystem. Android has its own development platform and a highly customized kernel. HP already ships netbooks with Ubuntu, and the advantage there is being able to install the vast wealth of Linux and Open Source applications. Android would limit that to, as of now, a small iPhone-like app store. However, this disadvantage to the Linux faithful may be an advantage to the Linux noob. If Google is reviewing each application, it ensures stability and compatibility, something Ubuntu can’t do as of yet. It will also be pretty easy to install these applications. With that in mind, having Google’s name backing up the otherwise nebulous OS may also be of great importance: who has heard of Ubuntu? But Google is a name people feel they can trust.

If HP takes Android in its current state then this is a bad move on HP’s part, but, given the open nature of Android, if they can make it into something truly revolutionary this may be one of the best moves for the Open Source community. Hopefully HP will surprise us all.

Python to get a speed boost by Google

March 27, 2009

If this is true, this is awesome news from Google. Python is a scripting language, and as such usually has poorer performance on compute-intensive projects. For me, though, I like Python because of its ease of use and flexibility. Google obviously uses Python extensively and effectively. If this new interpreter can really bring about a 5x speed boost then I say it is a win-win. It also looks like this new interpreter will help Python make better use of multi-processor and multi-threaded hardware.

From the article:

The goal of the Unladen Swallow project is to use LLVM, the Low Level Virtual Machine compiler infrastructure, to build a just-in-time (JIT) compilation engine that can replace Python’s own specialized virtual machine. This approach offers a number of significant advantages. As the developers describe in the project plan, the project will make it possible to transition Python to a register-based virtual machine and will pave the way for future optimizations.

Good luck Google. May you bring it to pass.
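
To make the “register-based virtual machine” remark in the quote a little more concrete: the standard CPython interpreter is stack-based, meaning its bytecode pushes operands onto a stack and pops them off again. Here is a small sketch using the standard library’s dis module to list the instructions for a trivial function. The function add is just a made-up example, and the exact opcode names vary between Python versions, so I’m not showing any output here.

```python
import dis

def add(a, b):
    # A deliberately tiny function so the bytecode stays readable.
    return a + b

# Collect the opcode names CPython generated for add().
opcode_names = [instruction.opname for instruction in dis.get_instructions(add)]

for name in opcode_names:
    print(name)
```

On the versions I’m aware of you should see the two arguments loaded onto the stack, an add instruction that consumes them, and a return. A register-based design would instead name its operands directly in each instruction, which generally means fewer instructions to dispatch.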

O’Reilly’s state of the book market programming languages

February 25, 2009

O’Reilly is releasing their numbers on the book market by programming language. It is pretty interesting to see where the growth was and which languages are more popular than others. For instance, they saw the most growth in Python and a pretty significant drop in Ruby and C++, while C# books are the most purchased.

Of course O’Reilly isn’t the be-all and end-all of computer book sales (there are other players who may have totally different results), and this is purely by number of units sold, but it is interesting to see what the numbers casually tell you.