Eight Tips for Studying Computer Science

My bachelor’s program is coming to an end this summer. Soon I’ll be a Bachelor of Science. But besides my college degree I will have gained a lot more. My time at the University of Bamberg has changed who I am and taught me so much. In this blog post I want to give some tips on how to succeed in studying Computer Science; these are mostly lessons I learned “the hard way”. Keep these tips in mind and I promise you will do well in college and, even better, have something to show after your time as an undergrad.

1. Forget everything you “know” about Computer Science

While I was in high school I started to learn C++ and wrote my first programs. I came to college with the opinion that I knew a lot and that the studies would be easy. In my opinion, this was a huge disadvantage for me compared to other students who had no prior idea of computer science.

My advice: even if you already have some knowledge, take your studies seriously. Software development is much more than programming. And computer science is much more than software development.

2. Take part in extracurricular activities

I joined the students’ association in my first year at college. Most of the friends I made in college, I met there. Additionally, I came into contact with students in higher semesters who gave me advice on nearly every question I had during my enrollment.

But this is just one example of what you can do. Every university has a wide range of clubs and associations (or whatever they are called in your country) you can join. Do it!

3. Learn a new programming language every year

Some schools, like mine, only focus on one or two languages. I learned C/C++ in high school and Python and QML/JavaScript during my college studies. Currently, I am learning Python and a little Haskell.

You should do this too, because knowing a programming language is not just something you can put on your resume. Modern software systems are very rarely written in one language. Usually there are many subsystems that may originate from different (sub)projects. Additionally, you often have scripting languages to build and/or test your software. While you most likely will not master any language in college, having advanced knowledge in several programming languages will become very useful in your future work-life.

4. Take as many graduate-level courses as possible

You may think it is smart to take easy courses and that it is crazy to do any extra work. However, grades are just letters (or numbers) on your transcript. You should never take courses because they are “easy credit” or because you already know most of the things that are taught there. Take the courses that interest you. These are mostly graduate-level courses, but the additional workload will be worth it.

5. Get involved in research early

During your first years you may think you are not qualified enough for research, but if you are willing to work and have an interest in a topic, you are in fact qualified to assist in a research project.

Professors are always looking for motivated students. This is your chance to separate yourself from the crowd.

6. Use Linux on a daily basis

While the choice of operating system, and of software in general, is mostly about personal preference, I can highly recommend using Linux on your main machine during your college studies. After a few months you will know how to administer a Linux/Unix machine, a skill that can come in handy during your first internship or job.

7. Learn to fail

Coming from high school, you may have the opinion that you are very good at math or computer science. You will not expect to encounter any serious problems. Fun fact: you will!

One of the biggest lessons you will learn during your time as an undergraduate is how to handle failure and criticism. So don’t give up if something goes wrong, because everybody has to handle such situations at some point in their lives.

8. Start your own project

This is why you went to college in the first place, right? There is something you wanted to do. Maybe write that software you imagined or start your own firm? College is the best time to try out these things. Again, even if you are not successful, it will teach you a valuable lesson.

Some Thoughts on Java and C++

Leaving performance aside, Java is broadly considered superior to C++. However, there are things about Java that make the hair on the back of my neck stand up. Let me elaborate with three examples.

Threads API

Unlike C/C++, Java was born at a time when multithreading was already very common. In the Unix and C world it was long preferred to use fork() to spread the load across multiple processes. Some people even said Sun only focused on multithreading because their implementation of fork() was so slow. However, let’s not discuss which method is better. After all, threading is here to stay and will be a common way to leverage multi-core computing power in the present and future.

When I started using Java, I thought the fact that it was designed from scratch would make its threading API very straightforward, but I soon realized the opposite was the case. The official documentation from Oracle marks several methods of Thread as deprecated, such as stop(), resume() and suspend(). So it is often suggested to use higher-level facilities like Callables or Executors.
But that is not even the worst part. There are so many ways to manage shared memory in Java. Besides the synchronized keyword and semaphores, there are also high-level constructs like ConcurrentHashMap that you can use. As a beginner, I had no idea which was the best way to go. In my opinion, there is a general tendency in the Java standard library to throw in as many features as possible. This bloats the JRE, which is still not modularized, and also means that some parts of it are likely to be marked as deprecated at some point in the future. This is, in my opinion, no proper way to design a standard for a language.

So now you may wonder how C++ does multithreading. Basically there are two facilities: the thread class and the async function. The latter is about as powerful as executors in Java and must be used in conjunction with futures. Locking is done with mutexes, and that is about it.
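
To make this a bit more concrete, here is a minimal sketch of both facilities. The function names parallel_sum and locked_count are made up for this illustration and are not part of any library; only std::async, std::future, std::thread, std::mutex and std::lock_guard come from the standard.

```cpp
#include <future>
#include <mutex>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums a vector in two halves.
// The upper half is summed by a task started with std::async.
long parallel_sum(const std::vector<int>& values) {
    auto middle = values.begin() + values.size() / 2;
    // std::launch::async requests a separate thread; the future carries the result.
    std::future<long> upper = std::async(std::launch::async, [&values, middle] {
        return std::accumulate(middle, values.end(), 0L);
    });
    long lower = std::accumulate(values.begin(), middle, 0L);
    return lower + upper.get();  // get() blocks until the task has finished
}

// Hypothetical helper: several threads increment one shared counter.
// The mutex makes sure only one thread touches the counter at a time.
int locked_count(int threads, int increments_per_thread) {
    int counter = 0;
    std::mutex guard;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, &guard, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                std::lock_guard<std::mutex> lock(guard);  // unlocks at end of scope
                ++counter;
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();  // every thread must be joined before it is destroyed
    }
    return counter;
}
```

Calling parallel_sum({1, 2, 3, 4, 5}) should give 15, and locked_count(4, 1000) should give 4000 no matter how the threads are scheduled. On Linux you may have to link with -pthread.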

Generics and Templates

Let’s first talk about templates in C++ and what they are good for. In the old days, people used the preprocessor to include/exclude architecture-specific code, define constants and create inline functions. This had some major drawbacks. The biggest issue with using the preprocessor is that there is no type safety. Because of this, C++ has a template system. This is basically a Turing-complete language inside the language that can generate code at compile time. The template system replaces the preprocessor nearly completely. There is still some use for it though, for example for header guards.

Java has something like this too. It is called Generics. I say “something like this” because Generics are not nearly as powerful as templates in C++. They can only be used with reference types (types that can be cast to Object), not with primitive types. The reason for this is, as far as I know, that the developers wanted to maintain backwards compatibility. Generics also lack a lot of advanced features that I fell in love with when using the C++ template system, like variadic templates.
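
For readers who have not seen variadic templates, here is a small sketch of what I mean. The names append_all and concat are invented for this example. The compiler generates one overload per combination of argument types, so everything stays type safe.

```cpp
#include <sstream>
#include <string>

// Base case: no arguments left, nothing to append.
inline void append_all(std::ostringstream&) {}

// Recursive case: append the first argument, then recurse on the rest.
template <typename First, typename... Rest>
void append_all(std::ostringstream& out, const First& first, const Rest&... rest) {
    out << first;
    append_all(out, rest...);
}

// Hypothetical helper: concatenates any number of streamable values.
template <typename... Args>
std::string concat(const Args&... args) {
    std::ostringstream out;
    append_all(out, args...);
    return out.str();
}
```

With this, concat("x=", 42, ", y=", 1.5) yields the string "x=42, y=1.5". Note that int and double are passed directly, which is exactly what Generics cannot do without boxing.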

Const and Final

In C++, const can be applied not only to data types but also to methods. A const method cannot modify the object it is called on, and a const reference/pointer can only be used to call const methods. This makes it very easy to ensure read-only data access. This feature is almost completely missing in Java. Now you may think “hey, there is final and that is basically the same”, and you are kind of right. However, final in Java is very weak. For example, you cannot make the object behind a reference read-only. Yes, you can make the reference itself final, but that only means that it will always refer to the same object. The object can still be modified through the reference. The reason for this is that Java has no concept of const methods (final on a method only prevents overriding).
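
A short sketch of const methods and const references in C++ follows. The Account class and the peek function are hypothetical and exist only to illustrate the point.

```cpp
// Hypothetical class with one const and one non-const method.
class Account {
public:
    explicit Account(long balance) : balance_(balance) {}

    // const method: promises not to modify the object it is called on.
    long balance() const { return balance_; }

    // non-const method: may modify the object.
    void deposit(long amount) { balance_ += amount; }

private:
    long balance_;
};

// A const reference gives read-only access: only const methods may be called.
long peek(const Account& account) {
    // account.deposit(10);  // would not compile: deposit() is not const
    return account.balance();
}
```

Anyone who receives a const Account& can read the balance but cannot change it, and the compiler enforces this.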

I could give even more examples here, but I think I’ve made my point. Maybe I just look at Java the wrong way, but in my opinion the language has major design flaws, and changes to the language are often rushed and seem to lack a proper concept.
Feel free to comment and correct me if your opinion differs.

The Ubuntu SDK: How It Could Fail or Win

Figure: An Ubuntu smartphone. Your next mobile OS?

Since I started using Linux as my main operating system, my whole world has revolved around Gnome and GTK+. I sometimes toyed with cross-platform toolkits like wxWidgets, but they did not convince me. In the last year, particularly during my employment at Siemens, I became something of a Qt enthusiast.

There are several reasons for this. First, Qt uses C++. You might say that the programming language is a matter of taste. If so, I suggest you try working with GObject for a day or learn the horrible mess of non-debuggable code that is Vala. Secondly, there is QML. Object-oriented programming languages are not well suited to laying out user interfaces. The Qt Modeling Language is a nice way to build user interfaces quickly. Finally, there is the community. As of a few weeks ago GTK3 had not even been ported to Windows, while Qt has official and community ports for almost every platform there is. New additions, like Qt3D, are released frequently.

When Canonical announced the Ubuntu Touch platform and I heard that it is based on Qt, I was pretty eager to try out their SDK. In this blog post I want to present some insights I gained over the last few days. Note that I will only talk about the QML part of the Ubuntu SDK. I have not touched any C++ code yet.

Besides the awesomeness that is vanilla QML, Ubuntu Touch adds some components that aim to create a unified look and feel. While the idea is really nice and there are many good concepts, it still lacks a lot of advanced features, like tables or rich text editors. Currently, the SDK may be sufficient to build a simple app for the phone, but I could not imagine building a full-blown desktop app with it. Qt 5.1 introduced a wide range of native-looking widgets. If the engineers at Canonical are smart, they will embrace this toolkit for the desktop apps.

Now let’s talk about the big aim that the Ubuntu folks are targeting: convergence. During the last few days I have worked on adding a desktop layout to two different apps. Unfortunately, I ran into two major problems. The first is that the utilities for adaptive layouts cannot be nested. The documentation says they can, but at least on my machine it didn’t work. Secondly, these adaptive layouts cannot be used in combination with Tabs. A common approach for adaptive apps is to show multiple columns instead of tabs on a desktop, but this currently isn’t possible without a lot of nasty workarounds in your app. I suggest that the Ubuntu SDK add this functionality to the built-in Tabs element. This would ease the development of convergent apps a lot.
This is just one example. Generally, the development kit should come with adaptive widgets. Otherwise we end up in a world where each app has a different look and feel. A key feature of free desktop environments has always been the unified look of apps, and in my opinion we should not abandon this.

Friends-App Tabbed

The Same App Using a Multicolumn Layout

Some of you might have seen the documentation on qt-project.org already. While that one is not perfect, the documentation of the Ubuntu SDK is far from useful. The components have a lot of undocumented or wrongly documented behaviour, many examples are missing and it is structured very badly. Luckily, a feature has been added that allows the community to edit the documentation, so I am sure this will improve in the future.
Also, I want to point out the awesome Ubuntu community here. In my opinion this is one of the best features of the free operating system. I hope that Canonical tries to make the development process, from design to deployment, as open as possible and keeps embracing the community as part of that process. This could be the advantage of Ubuntu compared to systems like Android or iOS that are developed behind closed doors.

Finally, I am really not a fan of the life-cycle model of Android. In my opinion Maemo was one of the best mobile operating systems so far, because it allowed real multitasking while still offering a relatively long battery life. Sadly, Ubuntu seems to follow the path of Android here and suspends apps by default. Even worse, there is currently no way for a non-core app to run in the background. I hope that Ubuntu will make it easy for developers in the future to keep apps from being suspended, but I am afraid they will add a complex mechanism that forces developers to write a separate daemon using some restrictive API. If that turns out to be the case I might move to SailfishOS, which has stayed very close to the concept of Maemo.

Let’s be fair. The Ubuntu SDK is still very young. However, if someone asked me right now whether they should use it for a new app, I would most likely say no.
Still, I think that Ubuntu Touch has the potential for a bright future. The current release is just a preview of what will come, and the direction is definitely the right one. I hope that my suggestions from the point of view of an app developer are taken into account.

Research Focus: Fountain Codes

With this blog post I want to give some insights into my research. Note that this is not an academic article but is intended for a broad audience.

Classical error correction codes (e.g. LDPC) generate a fixed number of parity symbols. These parity symbols can be used to recover from errors or to check the received data for correctness. However, the number of parity symbols has to be tuned manually. There are many factors that have to be taken into account for this. The most important are latency and the amount of loss.

Fountain codes, on the other hand, are rateless. That means that they can generate a virtually infinite number of symbols. The receiver just has to receive a certain number of these symbols and can then decode the data regardless of which symbols it received. The name derives from an analogy to a water fountain, where it also doesn’t matter which specific water drops you “receive” as long as you can fill your bucket. Fountain codes are also often called rateless error correction codes or rateless erasure codes.

In this article I want to give a short intro to the three most prominent examples of Fountain codes: Luby transform, Raptor and Spinal codes.

Simplified example of an LT encoder.

Luby transform (LT) codes were the first real implementation of a Fountain code. In a nutshell, a function generates encoded symbols from the source symbols. The source symbols are fixed-size chunks that together make up the source data or source block. The encoding function has to be known by the sender as well as the receiver. Decoding is then done by building a system of linear equations from the encoded symbols and solving it using Gaussian elimination or another technique.
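
The following is a heavily simplified sketch of such an encoding function, not the code from my project. It assumes that every symbol is a single byte, that the symbol id doubles as the seed shared by sender and receiver, and that source symbols are combined with XOR. A real LT code picks the number of combined symbols from a soliton distribution; the uniform choice here is only a placeholder. The names neighbours and encode_symbol are made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <vector>

// Indices of the source symbols that are combined into the encoded symbol
// with the given id. Sender and receiver can both recompute this set from
// the id alone. Requires source_count > 0.
std::set<std::size_t> neighbours(std::uint32_t symbol_id, std::size_t source_count) {
    std::mt19937 rng(symbol_id);  // the id is used as the shared seed (assumption)
    // Placeholder degree choice; a real LT code uses a soliton distribution.
    std::size_t degree = 1 + rng() % source_count;
    std::set<std::size_t> chosen;
    while (chosen.size() < degree) {
        chosen.insert(rng() % source_count);
    }
    return chosen;
}

// One encoded symbol: the XOR of all chosen source symbols.
std::uint8_t encode_symbol(std::uint32_t symbol_id,
                           const std::vector<std::uint8_t>& source) {
    std::uint8_t encoded = 0;
    for (std::size_t index : neighbours(symbol_id, source.size())) {
        encoded ^= source[index];
    }
    return encoded;
}
```

Each received symbol then gives the decoder one equation of the form "encoded symbol = XOR of these source symbols", which is exactly the linear system mentioned above.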

Raptor codes can be seen as an optimization of LT codes. Here, a precode is added that can recover source symbols that could not be decoded successfully. This small modification increases the performance of the code significantly. The decoding of both parts can be combined in one generator matrix. If you are interested, there are many good publications on Raptor codes.

Visualisation of Spinal Encoder. Source: http://nms.csail.mit.edu/spinal/

Finally, Spinal codes are a pretty new attempt to build a rateless error correction code. These codes do not use linear equation systems to map the source symbols onto the encoded symbols, but a combination of a hash function and a random number generator (which sounds very cool in my opinion). Here, too, the data is broken into chunks. These chunks are then passed to the hash function, which generates the spine values. These values are then used to seed a random number generator. The generated numbers are the encoded symbols. Why “spine”? The visualisation of the encoder, with a little imagination, looks like a spine.
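
To illustrate the idea, here is a toy sketch of that pipeline. It is based on my reading of the construction, in which each spine value is computed from the previous spine value and the next chunk. The hash in mix is a stand-in that borrows the FNV constants and is not the hash used in the original work, chunks are single bytes, and the mapping of random numbers to channel symbols is left out entirely. All function names are made up.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Stand-in hash, for illustration only (uses the 32-bit FNV constants).
std::uint32_t mix(std::uint32_t previous_spine, std::uint8_t chunk) {
    std::uint32_t h = 2166136261u;
    h = (h ^ previous_spine) * 16777619u;
    h = (h ^ chunk) * 16777619u;
    return h;
}

// One spine value per chunk; each value depends on all chunks before it.
std::vector<std::uint32_t> spine(const std::vector<std::uint8_t>& chunks) {
    std::vector<std::uint32_t> values;
    std::uint32_t current = 0;  // initial spine value (assumption)
    for (std::uint8_t chunk : chunks) {
        current = mix(current, chunk);
        values.push_back(current);
    }
    return values;
}

// Encoded symbols of one pass: every spine value seeds its own generator,
// and the pass number selects which output of that generator is sent.
std::vector<std::uint32_t> encode_pass(const std::vector<std::uint8_t>& chunks,
                                       unsigned pass) {
    std::vector<std::uint32_t> symbols;
    for (std::uint32_t value : spine(chunks)) {
        std::mt19937 rng(value);
        rng.discard(pass);  // skip the outputs used by earlier passes
        symbols.push_back(static_cast<std::uint32_t>(rng()));
    }
    return symbols;
}
```

Because the generators never run dry, the sender can simply keep producing passes until the receiver has seen enough, which is what makes the code rateless.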

In my opinion, Spinal codes have a lot of potential. They are very easy to implement compared to Raptor codes and, according to the measurements from CSAIL at MIT, they perform better. Head to the website of Jonathan Perry or view my slides on Spinal codes for more information.

In my next research focus blog post I want to give some more insights into what I actually do, which can be summed up as follows: at the computer networks group we implement those codes and look for ways to optimize them.

A Journey through the World of Build Systems

I guess every serious programmer has had experience with a build system. At some point you want to build your application on different operating systems or run automated tests after each build. That is where build systems come in. They offer a way to script your build and install process and often also let you generate custom makefiles or projects for your IDE depending on your current operating system.

During my employment at Elektrobit I learned a lot about CMake, which is the de facto standard build system for C/C++. Its syntax is pretty close to classical makefiles, which makes it easy to understand. To build your project you first have to run cmake, which generates makefiles that can then be used to build the binaries. I did not find this a very straightforward process.
Additionally, I learned to use Maven, which is a build system for Java and a successor to Ant. My experience with Maven has not been very good, though I am not sure whether that was due to the build system itself or to the fact that I am not really a Java person. However, at school we were taught Gradle, which is in my opinion a much nicer and easier-to-use build system for Java. Still, I found it very slow, which is annoying when you have to build very often.

To sum up I have the following requirements for my build system:

  • Performance
  • Easily scriptable post-build actions like testing
  • Support for deploying binaries and cross-compiling for different targets

During my employment at Siemens I started to work with Qt. If you have ever used Qt you may also know qmake, Qt’s standard build system. And it is, in my opinion, very painful to use.
But there is hope! The Qt Build Suite (Qbs) is a proposed replacement for qmake. It uses the Qt Modeling Language (QML), a JavaScript derivative, to define projects. Instead of calling make recursively as qmake does, it directly calls the compiler/linker and thus allows incremental building. This can speed up the build process significantly, especially when dealing with big projects.

The following is a qbs file from my fountain-tools project.

import qbs.base 1.0

StaticLibrary {
	name: "hermes"
	targetName: name

	cpp.defines : ["LIBRARY"]
	cpp.includePaths : ["./include", "../libencode/include"]
	cpp.cxxFlags: ["-std=c++11"]
	cpp.dynamicLibraries: ["crypto", "ssl", "pthread"]

	Depends {
		name: "cpp"
	}

	Depends {
		name: "encode"
	}

	files: ["src/*.cpp", "src/*.h", "include/*.h"]
	fileTags: ["cpp"]
}

As you can see, the code is short and concise even though it uses several advanced features such as dependencies and adding files via wildcard patterns.

Note that I am not using Qt in my project. Qbs can also be used in non-Qt projects without problems. Qbs projects can be built directly with the qbs command line tool. If you are a fan of IDEs, I should point out that Qbs files can be opened directly in Qt Creator, without the need for any intermediate step.

I hope I created some interest in Qbs with this blog entry. However, this was not a tutorial. Please head over to the official documentation for that.

Pitfalls When Using Smart Pointers

So I’ve been using the new C++ smart pointers for about half a year now and I am quite happy with them. My current project gets by completely without the (manual) use of delete. It has very simple class relationships and thus I mostly use unique_ptr.

An advantage of using smart pointers, besides preventing memory leaks, is that the ownership of the objects is really clear. unique_ptrs are for objects that belong to a single owner. shared_ptrs are for objects that are used by more than one other object. And weak_ptrs are used to check whether an object still exists. The latter only appear in combination with a shared pointer.
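
Here is a tiny sketch of the three kinds of ownership. Engine, Car and still_alive are hypothetical names chosen for the example.

```cpp
#include <memory>
#include <utility>

struct Engine {
    int power = 90;
};

// unique_ptr: the Car is the single owner of its Engine.
// When the Car is destroyed, the Engine is destroyed with it.
struct Car {
    explicit Car(std::unique_ptr<Engine> e) : engine(std::move(e)) {}
    std::unique_ptr<Engine> engine;
};

// weak_ptr: observes an object owned by shared_ptrs without keeping it alive.
bool still_alive(const std::weak_ptr<Engine>& observer) {
    return !observer.expired();
}
```

If you create a shared_ptr<Engine>, hand a weak_ptr to still_alive and then reset the last shared_ptr, still_alive switches from true to false. That is the “check if an object still exists” use case.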

If you are not familiar with smart pointers, I highly recommend that you take some time to learn about them.

When analyzing the code for memory leaks I was shocked that the tool I used still found several. The reason for them, however, was pretty obvious. Some of my interfaces were missing virtual destructors. If you are not familiar with the virtual keyword, it tells the compiler to dispatch a call at run time to the override in the object’s actual (most derived) class. Without a virtual destructor, destroying an object through a pointer to its interface does not run the destructor of the implementing class. My classes often have members that are smart pointers. If the instances of these classes are never cleaned up properly, the objects the smart pointers point to are, of course, not cleaned up either.
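
The following sketch shows the fixed situation. Codec and XorCodec are invented names and not classes from my project. Deleting the line with the virtual destructor would turn the destruction through unique_ptr<Codec> into undefined behaviour; in practice the buffer member is then typically never released.

```cpp
#include <memory>
#include <utility>

// Hypothetical interface.
struct Codec {
    virtual ~Codec() = default;  // the line my interfaces were missing
    virtual int id() const = 0;
};

// Hypothetical implementation with a smart-pointer member.
struct XorCodec : Codec {
    explicit XorCodec(std::shared_ptr<int> buffer) : buffer_(std::move(buffer)) {}
    int id() const override { return 1; }

private:
    std::shared_ptr<int> buffer_;  // released by the destructor of XorCodec
};

// Destroys an XorCodec through a pointer to its interface and reports how
// many owners the buffer has afterwards.
long owners_after_destroy() {
    auto buffer = std::make_shared<int>(42);
    {
        std::unique_ptr<Codec> codec(new XorCodec(buffer));
        // At this point the buffer has two owners: buffer and codec.
    }
    // The virtual destructor made sure ~XorCodec ran and released its share.
    return buffer.use_count();
}
```

With the virtual destructor in place, owners_after_destroy() returns 1, meaning the codec gave up its reference when it was destroyed.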

You may wonder: GCC normally emits a warning if delete is called on an interface with a non-virtual destructor, so why not in this case? The reason is that the memory is not freed in my code but somewhere inside the implementation of the smart pointers. If you are using Clang (LLVM) or some other compiler, I would be happy if you could tell me whether it detects such problems.

In my opinion compilers should make all functions, or at least the destructors, virtual by default. It would decrease the performance of the program a little but prevent a lot of errors.

If you are curious which tool I used to analyze the code, it was Valgrind. This will probably not surprise most of you, as it is the most popular tool for that purpose. Again, I highly recommend familiarizing yourself with this tool.