No Effing Clue: C


Sunday, 8 January 2012

A Path to Python

I've always liked the saying, "The only truly stupid person is the one who has nothing left to learn." I have no idea who said it, or if I even have the quote right, but it's always stuck with me. So, I am always trying to learn new programming techniques and new languages.

Call it a fault of mine but I always learn best when I'm enjoying what I'm learning. I know I don't stand alone in that. So, I've discovered that when I'm going to learn a new language I need to know more about it first; and, to really know a language you have to see it in action. Features really don't mean a lot unless the language "speaks" to you. So, I always browse the source code of some applications written in the language. I always take a look at a minimal "hello, world" application, too.

Python has been recommended to me in the past on several occasions but I've never really paid it much interest. My first hang up was the need for the Python interpreter. I didn't like the idea that a user of my product would have to install the Python VM just to use my application. I prefer an all-in-one package and bundling Python with the installer would increase an installer's size dramatically. Of course, there are other solutions but at the time that's how I saw it. I was also worried that Python was very slow. Maybe, at the beginning, it was slow but in relation to what? Will the speed difference even matter or be noticeable? And is that still the case? For application level programming it would probably be a complete non-issue but, again, I couldn't see that at the time.

Go also challenged my programming dogmas. For example, take dynamic linking. I had grown to believe that dynamically linking libraries (not to be confused with dynamic loading) was the only way to go. I just couldn't wrap my head around why anyone would ever use static linking (make sure to follow the links at the bottom of the page) and chalked it up as the old, antiquated way of doing things. How wrong was I? Both methods of linking have advantages and disadvantages and I can certainly appreciate why Go does much of its linking statically.

Go does struggle with a couple of issues and one of the big ones is shared with C. Because most third-party libraries utilized by Go are linked to C libraries, portability becomes an issue. I've been looking for a "write-once, run-anywhere" solution ever since writing my first non-trivial application. There is only one platform that immediately jumps out at me to fit that bill: Java.

Initially, I thought Java might be a great language. It's C-like, the JVM is installed on most platforms, and it has a huge collection of libraries built right in! Unfortunately, despite its many strengths, I didn't enjoy programming in the language. My feelings about Oracle aside, I felt that it was very, very verbose. Take even a simple "Hello World!" program for example:

class HelloWorld {
  static public void main( String args[] ) {
    System.out.println( "Hello World!" );
  }
}

Now, compare that to Go:

package main
import "fmt"
func main() {
  fmt.Printf("Hello World\n")
}

And, finally, Python:

print "Hello World"

As I began to work with Java I started to feel like it was more of a patched-together, band-aid mess. Don't get me wrong, Java is a decent language but it just doesn't work for me. C is painful enough to program in; Java seems to be worse. Thankfully, other avenues to the JVM platform exist.

I considered Clojure, Jython, Groovy and Scala. I want to learn a LISP-ish language at some point so I thought Clojure might work. After looking at some source code and documentation for the other languages, the only one which spoke to me was Jython. Python + the JVM? How can you go wrong?

Documentation for Jython is somewhat lacking if you don't already know Python. Good thing there's an easy solution to that. The Jython documentation is good for learning how it differs from Python and how to utilize the JVM, and Java, from within Jython but not much on learning the language itself. So? Learn Python.

My first observation has been that it is remarkably similar to Go. Go has clearly been heavily influenced by Python and I feel right at home with it. It does differ from Go in several distinct ways but it's turning out to be a completely fantastic language and I can see why so many programmers are using it. It's expressive, concise and an altogether eloquent language.

Time is important to me. I have precious little of it. So getting bogged down with nonsense is something I do my best to avoid. Python, like Go, does away with a lot of the nonsense and lets you get down to producing functional projects. I can see myself working with it a lot in future projects and I look forward to that time. I think, too, it will help me bridge the mental gap with being able to work in a LISP-like language.

After learning the basics of Python I then started to work with Jython. The amount of time it took for the Jython interpreter to load took me aback. I had gotten used to how quickly the Python interpreter loaded. I then discovered that Jython is based on Python 2.5. Not only is it two CPython releases old (as of this writing CPython 2.7 is the current stable 2.x branch) but Python 3 has also been released. While this hasn't proved to be particularly handicapping thus far I am worried about the future of Jython as it falls further and further behind its parent project. I am also concerned about its speed. Is the slowness of its interpreter indicative of the language itself? I suppose time will tell, literally.

Monday, 28 November 2011

Creating a Treeview in Glade

Preface: This was one of the first articles I wrote for my blog but never published it. I can't say why but I didn't. It's quite old and it's a little embarrassing. I figure I may as well post it though since it might just help another Glade/GTK+ newbie out there!

Disclaimer: This article assumes you have an understanding of how to implement the Model/View/Controller methodology of a GTK+ treeview. It is also intended for GTK+ 2.xx but may still be applicable to GTK+ 3.xx.

It seems like every day I program I am reminded about how much more I have to learn. Some days, like the name of my blog suggests, I feel like I have no idea what I'm doing. Or more appropriately, what I'm doing wrong. Take for example, a ~~few nights~~ several months ago.

I was continuing the process of converting the user interface for Vocab Builder from a C source code implementation to a GtkBuilder XML user interface via Glade. Glade makes designing and maintaining a user interface a dream when it's behaving properly. I am using the latest version of Glade (3.8.0 at the time of writing) on Ubuntu 10.10 Maverick Meerkat and I have experienced a few problems. One of them is a real show-stopper. Literally. Somehow the widget tree gets corrupted and isn't displayed properly. Some widgets are just blank lines where the widget should be or are a garbled mess. If that happens, and you click on the place where the widget SHOULD be, Glade crashes. No warnings, just straight back to the desktop. All unsaved work is lost. Good times! One workaround is to not click on the spot and instead find the widget (if you can) in your UI and click on it, hoping all the while that it's a visible widget because if it's a vbox or adjustment widget you might be out of luck. Another workaround is to restart Glade. Closing the UI file and re-opening it may fix the issue too, I don't know.

Creating more than just the treeview widget seems to be a relatively new feature to Glade which might explain some of its oddities. First of all, how you add columns isn't exactly clear. Unlike other widgets, there's no treeview subcategory in the tools section where you can click on a category and easily add it to the treeview as a child widget. No sirree, you have to right-click on the treeview widget and select Edit. Okay, fine. Not a big deal. An odd design but whatever. How about a cell renderer? There is absolutely no visual clue that tells you how to add one. What you have to do is go into the Hierarchy tab, where you add new columns, and right-click on each column and select Add. Again, it sort of makes sense but with how easy it is to implement other widgets you would think there would be some kind of a clue like a button or tooltip which would indicate how you add a renderer. At least the Hierarchy tab had an Add button that let you know you could add something to it. View and controller done. What about the model?

Well, there's a property for the treeview where you can select the model to attach to it. It pops up a handy window that allows you to either select a tree model you've already created or, if you haven't created one yet, you can click New and Glade will create one for you. By default, Glade gives you a ListStore. This is where my problems began. I didn't clue in that it was a ListStore that was created by default. To be fair, Glade does call the model ListStore1 or something like that. I just never picked up on it. So I edited the model and put in the values I needed to store in the model. Uh-oh, what's this? Another stumbling block? Where's the string type? There MUST be a string type...Nope, can't find one! As it turns out, while a string is G_TYPE_STRING in GLib, in Glade it's called a gchararray. Yes, a string is a character array. So...why change what it's called? Why not gstring? Oh...maybe that's why? G-string? Naw...must be another reason. Maybe just to piss me off? Either way, off I went and re-ran Vocab Builder to see if my changes were working. Yup, the treeview headers are clickable and the sort indicators look like they work. It does almost everything I want sans a few options I haven't implemented, yet. However, when selecting New in the Editor the ability to sort by columns disappears. What the heck? What's going on? After hours of trying to solve the issue, nothing. Bah! "I'll fix it tomorrow."

Here it is ~~tomorrow~~ the following day and I'm still not sure what's going on with the treeview so I start converting the Builder interface to get my mind off the Editor. Turns out I use a treeview in the preferences window, too. I can't escape these darned treeviews! Well, I decide to take another stab at it. I go through all the steps, forgetting how to do some of them and getting frustrated again, but I muddle through it. In my original code, I simply removed the old model and created a new one whenever I needed to update the treeview because it would always be new data. However, I realized that I could do away with all that if I just cleared the old treeview. Hrm...why isn't it working? It says it's not a TreeStore...I stared blankly at the screen in consternation. Ya, right, it's not. It's a ListStore. Ohhhh....what an idiot I am! I quickly convert my functions to use those of a ListStore and voilà! Everything works as intended! Fantastic! Now, what if I want a TreeStore?

Luckily, this fix was relatively easy to figure out. Rather than have Glade create a default model for me I can just create the TreeStore model first and then attach it to the treeview. Easy. Fixed. Wish I'd known that ~~yesterday~~ the day before after spending hours trying to figure it out...

Addendum: If there turns out to be some interest in an actual step-by-step example of the process of creating a treeview in Glade/GTK+ I'd be more than happy to do so.

Wednesday, 24 August 2011

Compiling C Programs in Linux

A lot of beginner programmers, especially those new to the Linux world, have a lot of issues compiling programs. This isn't exclusive to C, of course, so hopefully you can take something away from this post and apply it to other languages, too.

There are two primary stages to "compiling" a program. The first stage is the compilation stage. There are several intermediary steps but the end result is that your code, in this case C, is translated into a machine readable language. The second stage is called "linking". This is when each part of your program and any external libraries are linked together to create an actual executable binary file you run or a library file.

Disclaimer - This post may be hazardous to your health (not really). It is a non-exhaustive and non-authoritative introduction to compiling and linking programs on the command line in Linux. I can't emphasize enough that you should reference the documentation provided by the compiler you're using and that said documentation will always trump anything contained herein. Side effects may include: nausea, dry mouth, hair loss, severe depression and momentary blindness. If any of these side effects occur, discontinue use immediately and contact your physician.

Stage 1 - Compilation:

The Compiler:

First off, you have your choice of compiler. We are going to concern ourselves with gcc, the GNU C compiler; currently, the most widely used C compiler for Linux. LLVM has been gaining ground but we won't be discussing it here though it stands to reason much of what you learn would still be applicable. GCC originally stood for GNU C Compiler but now stands for GNU Compiler Collection, because it is actually a collection of tools for compiling programming languages. Each tool has a unique name which reflects the language it is designed for. For example, compilers exist for the following languages in addition to C: C++ (g++), Java (gcj) and Fortran (gfortran). Only some of them, though possibly all, may be installed on your system by default.

Important: Do NOT use g++ to compile C language sources. There's a reason why they are separate compilers.

In the case of C, make sure you have GCC installed on your system. Open a terminal and type:

gcc --version

If you get an error of some kind, chances are you don't have GCC installed and will need to do so before you can go any further.

Hint: For those of you running a Linux system like Arch, Debian, Fedora or Ubuntu you can install any of the tools discussed within this document quite easily. For example, Debian based systems usually have a 'build-essential' package available via apt-get for installing commonly used tools for building programs from source. You can simply issue the command 'apt-get install build-essential' without the quotes (as root or via sudo). Check your specific distribution's instructions for installing these tools from their respective repositories.

Everything in this how-to should be run in a terminal. Make sure you change to the directory where your source files are contained and execute the supplied commands within that directory. There are many, many compiler flags to know but the following should be considered a minimum:

gcc -Wall -pedantic -std=c99 my_file.c -o my_program

...replacing "my_file.c" and "my_program" with the proper names of your file(s) and desired program name.

Compiling a source file with the above command will produce a final binary in one go; gcc compiles the file and then runs the linker for you, so you never see a separate linking stage. However, as your programs get more complex it becomes advantageous to build intermediary objects first then link them together later.

Compiler Flags:

These flags are passed to the compiler to tell it what you want. There are some very important ones to know:

-std=c99 - This flag sets the specific standard for gcc to comply with. At the time of writing gcc defaults to a standard called gnu89, which is the c89 standard with GNU extensions. If you plan on writing a program which may be compiled on a system which does not use gcc as its C compiler (LLVM, MS Visual Studio, Borland C, etc) then it is probably in your best interest to force the use of the most current C standard and disable the GNU extensions. gcc is not 100% c99 compliant but the non-compliant features are rarely used. If you require any of the features of c99 which have not yet been implemented in gcc then you'll have to use another compiler anyway.

-o - Specify an output name for the object or binary. In the most basic case, you use it to specify the name of the program you are producing. If you don't use this flag, gcc will default to using the name of the source file and create a .o file for an intermediary object (myfile.c becomes myfile.o) or, in the case of creating a binary, it will default to a.out for the binary file.

Compiler Warnings:

Warnings should always be turned on. Here are some very important/common ones to know:

-Wall - all warnings; this is misleading because it doesn't actually turn on ALL warnings but just the most commonly desired ones.

-Wextra - turns on extra, more strict warnings.

-ansi - specifies that you want to adhere strictly to the C standard. It turns off all GNU extensions to the C language making it fully ANSI compliant. This is important for portability between compilers. This is automatically turned on when you specify the standards c89 or c99 with the -std flag. However, as stated earlier, gcc defaults to gnu89 (c89 with GNU extensions) and the -ansi flag will explicitly disable these extensions. -std=gnu89 -ansi is equivalent to -std=c89.

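-pedantic - issues all the warnings demanded by strict adherence to the C standard, flagging code that relies on extensions the selected standard forbids. On its own it only warns; if you want those diagnostics reported as errors instead of warnings, halting compilation, use -pedantic-errors. It is usually a good idea to always enable this flag when you give -ansi or -std=c89/99.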

Extra Compiler Flags:

-c - compile object code but do not link. This produces a file with the .o extension. Object files, those ending with .o, are later linked together to create a final binary or library.

-I <directory> - This flag allows you to specify an additional location for the compiler to find your header (.h) files by replacing <directory> with the location of the headers being searched for. This is useful if you store your header files in a directory other than the one your regular source (.c) files are located. You can chain as many of these flags together to add as many directories as you need. If you don't know why you would need to use this then it's safe to just leave it out.

Stage 2 - Linking:

The Linker:

As noted earlier, building a program is a two step process. The second step, after compiling, is linking. In most, if not all, large projects sources are usually compiled first into object files. On their own, they do nothing and are essentially useless. In order to work, they need to be linked together to create a single binary file, an actual program, then made executable. This is where the linker, named 'ld', comes in. The linker has its own set of flags which may either be passed to gcc or to ld itself. It is probably easier, and in your best interest, to issue the flags to gcc for simplicity's sake.

Linker Flags:

-l<library> - where library is the name of the external library you need to link into your program, sans (minus) the 'lib' prefix and the file extension. In other words, if you wish to link the math library, libm, into your program, you drop the 'lib' part and use: -lm. There's no pattern matching involved; the linker simply looks for a file named lib<library>.so or lib<library>.a in its search paths.

Extra Linker Flags:

-static - Used to force libraries to be linked into your program statically. This means that the library itself is incorporated (bolted on) into your program. This increases the size of your program but does away with some of the issues associated with dynamic linking. A discussion on dynamic vs. static linking is WAY beyond the scope of this article.

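-Wl,-Bdynamic - Force libraries to be dynamically linked to your program. This means that the library is linked to your program but not actually loaded until the program is run. Dynamic linking is gcc's default on Linux, so you rarely need a flag for it; -Bdynamic is the GNU linker's option, passed through gcc with -Wl, and it applies to the -l flags that follow it.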

-L <directory> - Specify a path to find extra/custom libraries by replacing <directory> with the location of the libraries being searched for. This can be used to link external libraries not installed in the normal paths searched by the linker. It is also used to specify directories within your project if you are linking to internal libraries.

The Next Steps:

For compiling very small programs it is usually easiest just to run gcc from the command line. As programs get larger, though, it usually becomes necessary to remove some of the complexity. The next step would be to create a custom Makefile to build your program. You may then simply issue a single command, make, and the rest of the work is done for you. I mentioned earlier that it becomes simpler to compile sources into objects first then link them later. That is because the larger a project gets, the longer it takes to compile and link. By utilizing a Makefile, only files which have been modified are recompiled and then re-linked, thereby speeding up the build process when changes are being made. This is especially helpful when debugging.

The next step would probably be to use a full build system like the GNU autotools. Autotools is a general term for a suite of programs: aclocal, autoconf, autoheader, automake, autopoint and libtool. GNU autotools provides a method of making your code more portable and easier to distribute. They can even roll a tarball for you and compress it. By utilizing tools like GNU gettext and GNOME's intltool you can integrate translations into your project, too. The autotools are the backbone of other distribution methods, as well, like creating an .rpm or .deb in Linux and knowing how to use them tends to be an essential skill. There is plenty of help on the Internet on using GNU's autotools; just do a web search.

If you're having trouble, read: How to Get Help

Saturday, 6 August 2011

An Introduction of Sorts, Part 3

The next step of our journey takes us through the second to last leg of my introduction.

So, in my escapades with C, I have learned the following:

1) C is not the best language for every task. Naively, I thought that if I started with C, not only would I be able to write any kind of program a person can conceive of (which you can) but that it would also act as the basis for learning any future languages, should I so desire (which it doesn't).

2) There are many cool languages to learn, too many in fact. In working through Beginning Linux Programming I came across Tcl/Tk, Perl and shell scripting (Bash specifically). Through other sources I came across Java, Python, LISP and Go. And from there things explode. There are literally hundreds of languages out there!

3) Learning a language doesn't teach you to program, at least, not well. As it turns out, knowing a language doesn't help you much. Algorithms and structure go a lot further in teaching programming. Understanding how to implement a linked list, the best method to store persistent data or the most efficient method of manipulating data is far more essential than knowing where to place a semi-colon or bracket.

4) A compiled language is not the end-all be-all. The JVM, for example, is very cool and Java has all kinds of implementations. You don't need to write Java to use the JVM. Clojure or Jython are examples of languages which are compiled to Java bytecode for execution on the JVM and can take advantage of everything it can provide.

5) Interpreted languages like Perl, Tcl or Python are not slow. First thing to remember is that over the years things improve and a lot of information on the Internet is old. Really old. A post from 2002 ranting on how slow Python is, is no longer relevant. A heck of a lot has changed in 9 years. With the way technology changes these days, a matter of months can make all the difference. It also has to do with writing idiomatic code and implementing good/quick algorithms which complement the language. Yes, in certain instances, a C implementation of a specific task is faster than the equivalent in Python. However, compare parsing lines of text with C or C++ to Perl. Write an application with a GUI in C and compare that with Tcl/Tk, Java or Python. It's all about matching the tool with the job. You don't bring a hammer to tighten a screw.

6) There is no one way to do something. Quicksort or Bubble Sort are not the only ways to sort data. Not only that but often language implementations may be faster than anything you come up with on your own. You need to know the strengths and weaknesses of the language you're working in to get the most out of it.

7) Dynamic Linking is not necessarily superior to Static Linking. It is also not to be confused with Dynamic Loading. Google's Go language taught me that.

8) Skills aren't necessarily transferable. Knowing how to manipulate pointers, manage memory and use pre-processor directives doesn't usually apply outside the C world. Many languages don't have pointers (or they are at least not the headache they are in C), use garbage collection and have no need of pre-processor directives. That doesn't mean that C, or any other language (read: all) with non-transferable skills, doesn't have something to teach. It just means that not everything you learn will be transferable to another language.

9) There is a trade-off between writing a fast program and writing a program quickly. No one wants a slow and sluggish application. On the other hand, a couple seconds overall is likely not even noticeable in most situations. Now, compare that with how quickly a program can be written in Python or Go to C or C++. There's no comparison.

10) Design is where you should spend 90% of your time. I can't stress this enough. An excellent programmer following poor design will create a mediocre application. A poor programmer implementing an excellent design will create a good program. If you can combine the two, a good design with a good programmer, you'll make an awesome application.

11) Benchmarks mean little to nothing. Look no further than this example. Any language can be optimized in such a way but I can't think of a better example.

It's funny that, reading back, I haven't even scratched the surface of what there is to learn but this certainly is a foundation for describing how much I know.

In the end, not much.

Wednesday, 3 August 2011

An Introduction of Sorts, Part 2

Learning C without the help of formal education isn't the easiest task. Sometimes I feel like MacGyver trying to use an elastic band, a paperclip and a funnel to make a rocket ship. No really! That's what trying to write a game or large program is like in C. The language gives you some rudimentary tools with which to craft intricate art. However, that being said, some very powerful programs can be written in C. You just need to learn how to use the tools and fine tune their use.

I would describe myself as an intermediate to advanced C programmer. By no means am I an expert and I'm a far cry from being a master/guru. To understand why I consider myself thus, you need to understand the full breadth of a computer language. Any one computer language consists of two parts: 1) Syntax, 2) Libraries and Tools. Learning the syntax and idioms of a language can be difficult but as you grow and learn it turns out that discovering the idiosyncrasies of a language is the easy part. Learning the standard and third-party libraries and tools is the hard part. When I began learning C, I thought that once I had learned the language I was done. I was a master. Just call me Yoda. Ya, right. Let's see...what do we need to know:

1) Language Grammar: punctuation (semi-colons, brackets), code blocks, scope, reserved words, loops, conditionals, macros, preprocessor directives, etc. all make up language syntax;
2) Compiling: compiler flags, linker flags, and all other manners of subtle nuances of each compiler you need to know in order to even build your code;
3) Debuggers: backtraces, core dumps, debugging programs (gdb), profiling and symbol tables are some of the tools you need to learn to be an effective programmer in C;
4) Standard Libraries: printing, strings, locales, file input/output, etc.;
5) Non-Standard Standard Libraries: "You're kidding me, right?" you might be asking. Oh, no! I'm not joking. Unix libraries vs. Windows, POSIX libraries, and all manner of things in between. Start learning sockets, you'll see what I mean;
6) Standards Compliance: No, there's no one version of C. There are several C standards and more are on the way. Learning C once isn't the end. You'll need to keep reviewing the new standards, especially when a compiler changes to a different default standard (C89 for gcc but I expect it to change to C99 eventually);
7) Third-Party Libraries: This is the balance of all the other libraries out there. For C? There's a LOT.
8) Portability: This one is a mixed bag and it depends on how people are using the word. For some, portable code means it can be ported to other Unix variants like BSD, Linux, etc. Others, like myself, take it to mean code can be ported between operating systems. In most cases, this is no easy task and can quickly turn into a nightmare. Remember I mentioned sockets? Just the tip of the iceberg. More on that in another post.

I feel like I'm missing something. I probably am.

Well, there is something. A couple somethings. There are the tools you need to work with your code. The first of these is a development platform. What operating system do you use to program on? I, personally, program on my Ubuntu Linux box. You'll need to pick up a compiler. The most popular in the Linux world are GCC and LLVM. You'll need to install a debugger. How about an automatic build system? Oh, don't forget a programming environment. Now, I prefer to use a terminal and a text editor. Others, especially those from the Windows world, prefer an IDE.

I eventually chose a book called Beginning Linux Programming. I already had enough background to jump right in and this turned out to be a fantastic book for me. It is more focused on Linux programming, as the name implies, but it also teaches theory and techniques usable by anyone. The main thing I learned was that I had a lot more to learn. A lot more.

Your mind fried yet? I know mine is.