Monday, February 13, 2006

Simplicity in design: an example

Like many people, I have a photo gallery on my personal website. My web hosting provides 2GB of space, which was a large amount when I first signed up. I was already pushing the limit when I got a Digital Rebel XT camera last spring. At eight megapixels, the files it produces are considerably larger, and by the end of 2005 not only had I vastly surpassed the 2GB limit for the minimum that I wanted to put up, but my monthly backup of my full photo collection took up several DVD-Rs. When I first started hitting the 2GB limit I would delete older photos (or ones I didn't like) from the site, but I realized that this was not a long-term solution: either I would have to pay for more space or do something about the photos.

Part of the problem stemmed from the fact that my photos were stored at the highest quality the camera could give me. I regret not doing this with my earlier digital photos, which will forever be the low quality and low resolution that they currently are. The other part of the problem is the image size. The Canon produces images with a resolution of 3456x2304, and my older Casio is respectable at 2048x1536. It was cool to offer them at that size online, but really 1600 or 1280 pixels wide will probably satisfy almost everyone, and the few who want the higher resolution can always e-mail me for the file. Armed with those two bits of information I knew what needed to be done. I set out to write a script that would 1) go through and make a copy of every photo from directory A into directory B (keeping all sub-folders) and 2) resize the images to 1600 if they were larger and lower the quality to 75 for those that were jpgs.

With those specs I quickly stubbed out a recursive function that would take a directory (A) and a destination (B) for the duplication. It would read everything in A; if an entry was a directory, it would create it in B and then call itself on that subdirectory. If it was a file, it would copy the file and, if it was a jpg larger than the size limit, resize it.
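For contrast with what comes later, the shape of that recursive design can be sketched roughly as follows. This is a hypothetical reconstruction, not the original 55-line script: the names `copy_tree` and `process_file` are invented for illustration, and the resize step is replaced by a plain copy so the sketch runs without ImageMagick.

```shell
#!/bin/bash
# Hypothetical sketch of the recursive approach; NOT the original script.

# Stand-in for "copy the file and resize it if it is a large jpg".
# Here it only copies, so the sketch has no ImageMagick dependency.
process_file() {
    cp -- "$1" "$2"
}

# Mirror directory $1 into $2, one level at a time.
copy_tree() {
    local src=$1 dst=$2 entry name
    mkdir -p -- "$dst"
    for entry in "$src"/*; do
        [ -e "$entry" ] || continue          # empty directory: the glob stays literal
        name=${entry##*/}                    # strip the leading path
        if [ -d "$entry" ]; then
            copy_tree "$entry" "$dst/$name"  # recurse into the subdirectory
        else
            process_file "$entry" "$dst/$name"
        fi
    done
}

# Usage (hypothetical paths):
#   copy_tree "$HOME/photos" "$HOME/photos_web"
```

Even in this trimmed form every expansion has to be quoted to survive spaces in names, and hidden files are silently skipped by the `*` glob, which hints at how many edge cases the hand-rolled version invites.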

After writing it I got the feeling that something wasn't right. This bash script took more than an hour to write (95% of that time was wasted fixing issues caused by file names and directories with spaces) and came to a bulky 55 lines. This was not a "hard task"; a dialog in digikam that implemented this feature would probably have fewer lines of code and take a fraction of the time to write. The code seemed way too brittle, with a lot of edge cases. I set aside the project and went to bed. The next day (as always happens) I realized my mistake. My problem had been with how I had worded my original design, which had directed my implementation down the wrong path. I quickly realized what my design should have been:

Make a copy of all my photos and lower the resolution and quality of the jpgs.

With a new awareness of the problem I quickly wrote a script that did the job in just two lines. Once you added in error checking for the argument, you get a total of nine lines.


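#!/bin/bash
to="${1}_web"    # destination: the source directory name plus "_web"
if [ -d "$to" ] ; then
echo "$to already exists"
exit 1
fi

cp -ar "${1}" "${to}" || exit    # copy the whole tree first, then shrink the copies in place
find "${to}" -regex '.*jpg$\|.*JPG' -exec mogrify -quality 70 -resize '1280x>' '{}' \;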


Not only is it simpler, but it does away with several issues that I had come up with in the fifteen hours since I had written the first version. Also, due to its simplicity, file names with spaces are no longer a source of problems. Best of all, it took ten minutes to write and test, which is what I had expected the original task to take.
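The point about spaces can be checked without touching any real photos. The sketch below is only a demonstration: `list_jpgs` is a made-up helper, `printf` stands in for the `mogrify` call, and `-iname '*.jpg'` is swapped in for the regex as a simpler case-insensitive match. It shows that `find -exec` hands each path to the command as one argument, so no word splitting takes place.

```shell
#!/bin/bash
# Demonstration only: printf replaces mogrify so this runs without ImageMagick.

# Print every jpg under directory $1, one path per line.
list_jpgs() {
    # -iname matches .jpg and .JPG alike; '{}' is passed to the command
    # as a single argument, even when the path contains spaces.
    find "$1" -type f -iname '*.jpg' -exec printf '%s\n' '{}' \;
}

# Build a throwaway tree with spaces in directory and file names.
demo=$(mktemp -d)
mkdir -p "$demo/summer trip/day 1"
touch "$demo/summer trip/day 1/beach photo.JPG" "$demo/summer trip/notes.txt"

list_jpgs "$demo"    # lists only the .JPG file, with its path intact

rm -rf "$demo"
```

Replacing `printf '%s\n'` with the `mogrify` options from the script above gives the real behavior.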

Many of you have probably heard the following story:
A manager gives a task to his three employees. The first employee, fresh out of school, goes right to his desk and starts coding a solution. At eight o'clock that evening he finally finishes, having had to change only 2000 lines. The second employee, who has worked there for a few years, sketches up some ideas on his whiteboard, creates his solution in 1000 lines and leaves at five. The third employee goes for a walk around the campus with a notepad. When she returns she deletes 200 lines and changes 100 before heading home at three. Who was the most productive?

Simplicity in design is often spoken about in books, so I thought I might write about a real-world example that is small enough for a blog. After coding my first solution I find a bit of elegance and pride in my final solution, even though really it was the folly of my first design that was the problem. When designing code, listen to your gut. If something seems too ugly or too convoluted, take a minute to step back. Take a walk with a notepad (don't check your e-mail!). Even explaining your design to someone else might be the perfect way to discover the real solution, simply by trying to articulate the full problem verbally. Your friend will probably be able to poke holes in your design or suggest new avenues to explore.

Coding can be a lot of fun. I especially enjoy moments like the one above, when, while walking to work, I realized that my difficult problem wasn't difficult at all and, just as I had originally hypothesized, was decidedly simple.

Saturday, February 11, 2006

Massif for memory profiling

So you might have noticed that your application is using a bit of memory, but it wasn't *that* much so you didn't worry about it too much. Then one day maybe you noticed your app was using more memory than Konq, and you don't have a leak. What do you do? Use a heap profiler.

You have probably used KCachegrind to improve performance, but did you know that Valgrind includes a heap profiler too? Check out the massif documentation, which should get you started. I have found a GNOME massif tutorial that provides a walkthrough of a GNOME application, and almost everything applies to KDE applications too.
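As a starting point, a massif run might look like the sketch below. It assumes a current Valgrind release with `valgrind` and `ms_print` on the PATH; the wrapper name `profile_heap` and the application name in the usage comment are made up for illustration.

```shell
#!/bin/bash
# Sketch: run a program under massif, then print the text report.
# Assumes valgrind and ms_print are installed; profile_heap is a made-up name.
profile_heap() {
    local out
    out=$(mktemp) || return 1
    # --massif-out-file names the data file explicitly instead of
    # relying on the default massif.out.<pid> in the current directory.
    valgrind --tool=massif --massif-out-file="$out" "$@" || return 1
    ms_print "$out"    # heap-usage graph plus per-snapshot allocation trees
    rm -f "$out"
}

# Usage (hypothetical application name):
#   profile_heap ./myapp
```

The report attributes heap usage to call stacks, so building the application with debug symbols makes the output much easier to read.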

KDE 3.5.1 is better than ever, but there are still applications that use way too much memory. On my system I regularly find Konversation and Akregator using more memory than KMail and Konq combined. I profiled them both and found some interesting results while exploring what massif can do.

Applications that use only a little memory matter too. Today I profiled a small Qt4 application of mine and was able to save a few MB after just a few small changes. Small savings really do add up and are important. In Jim Gettys's blog he has talked a number of times recently about how the $100 laptop will have 128MB of RAM total for the system, desktop *and* running application. Recently he discussed how the gnome-clock-applet was using more than 60 megabytes.

With the conversion to Qt4, KDE applications have the perfect opportunity to show that KDE can be really light. Running the most basic KDE3 application uses more than twice the memory (total, including libs) of some Qt4 example applications I tried. Just by migrating to Qt4, KDE applications should get a nice drop in memory size. But that is just the beginning: just as you take time to speed up your application with KCachegrind, you should profile your memory usage from time to time. And if you refactor your application while porting it to Qt4, take a minute to see if you can decrease the memory requirement. Many systems today still ship with only 256MB. KDE applications are in fantastic shape and will get better in the next year, and with tools like massif we will have no problem providing a desktop and applications with a low memory footprint.
