C vs. Java: Nethack vs Netwhack

Disclaimer: I was never any good at C. I was corrupted by C++ at an early age. So it would be too easy to accuse me of not knowing what I am talking about (i.e. I’ll just concede that right now).

The reason I am picking on Nethack here is to compliment it. Nethack has been in development for decades, by a team of several people. It should be the most advanced Roguelike, and as such an example of well-written C. I did manage to take a look at the Nethack sources recently, specifically the monster data, to avoid accidentally copying any of Nethack’s ideas or source code in my Java game. I am very happy to report that not only would it be almost impossible to copy Nethack’s ideas or source code into my game (even in a converted-to-Java way), but it would be a bad idea to do so, since Java uses different paradigms than C to deal with that kind of data. I’d like to write down my thoughts on the matter here because I find it helps me with the creative process of writing my own roguelike game.

The Hack Approach

The ‘hack approach’ is a common approach in C. There is a ‘permonst’ struct which contains data like name, symbol, armor class, damage and so forth. Then in monst.c this is used to create a static array of permonst which seems to be indexed by symbol, thus restricting the list of monsters to one per symbol. The code then goes on (in mhitu.c, etc.) to rely on the type of monster matching its symbol to check what kind of damage, etc. the monster does to the player. An interesting but limited approach, in that you can really only have one monster per symbol, as all monsters are referenced by their symbol. I may be slightly wrong about this, as I haven’t really studied the Hack code in depth.

The Nethack Approach

In Nethack, the exact same approach is used but with a twist. There is still a permonst structure (per-monster?) but it’s wrapped in a monst structure which contains an mnum (“/* permanent monster index number */”). I would say this is a better approach because it allows monsters to share symbols. As for what kind of damage monsters do, it is no longer based on the monster symbol but on a list of attacks with associated damage types (either #defined or in an enum). This allows the game to index monsters by some sort of enumerated value, and then deal with their attacks separately from their symbol. So now some snakes ‘s’ may be poisonous, and some might not be.

However (and oddly, because of how C works I suppose) almost everything about the monsters, including the mnum itself, is hardcoded in #defines. This means it would be very difficult to change the order of monsters in the game (such as to move a salamander from a level 5 monster to a level 2 monster, for example) without making multiple changes in multiple files to large lists of hardcoded constants. Imagine having to add a new type of lizard and then to increase the number of all the other monsters after it by 1. Now imagine making a mistake and having to check and reorder several different sets of #defines. I’d go crazy. To be honest I don’t understand this approach in its entirety; I may be wrong about how Nethack does things, as I didn’t study the Nethack code in depth. But from what little I did see it just seems really complex, with the number of structures and defines and inline functions and so forth, all split across many different files. Don’t get me wrong: obviously a very skilled and competent team of people came up with this data structure. I guess that is just how you do things in C.

In other parts of the code there is a statically defined array of monster data which ties all of this together (and in turn relies on all the enums to match all the #defines).

The Java Approach

I call this the ‘Java approach’ and not the ‘Netwhack approach’ because really this underscores some of the fundamental differences in writing a roguelike in Java versus in C. Yes, I could have copied the Nethack approach. I could have made a class for Attacks, and a class for Attack, for example, then used overloaded constructors to basically copy the monst.c file with very minor changes. However this kind of approach is a C approach, based on C thinking and C limitations. It may work very well in C but it is a strange beast in Java. If I used that method it would be an obvious rip-off of Nethack sources, because the fingerprint of C would be on the code. It would also make it too easy to accidentally start using actual Nethack data, and copying Nethack is the least of our desires. So what’s the Java approach? Simply, enums.

Enums are incredibly powerful in Java. They are so powerful they let you do things that you just can’t do in C (without writing your own metaprogramming language). Basically you define a class enum Monsters, and use the special features of Java Enums to store everything directly in the enum — including special logic on a per-monster basis. Let me give you an example.

Let’s say you have a kobold, defined as follows in monst.c:

Now what you are looking at is a direct definition into an array of struct monst, which itself holds various structs and so on. All of these values, such as AT_WEAP, AD_PHYS, S_KOBOLD etc., are hardcoded defines which other parts of the game rely on in switch statements (etc). Copying this approach by defining classes like ATTK or LVL is one way to write a roguelike game. But why torture yourself by writing such ancient code? This type of ‘problem’ in computer science data structures and algorithms has been solved for decades. In Java we just use enum classes. Off the top of my head, like this:
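A minimal sketch of what such an enum might look like. The names Monster, symbol and hitDice, and the values given, are assumptions for illustration only, not Netwhack’s actual source or Nethack’s data:

```java
// Hypothetical sketch: the type, fields and numbers are illustrative only.
enum Monster {
    KOBOLD('k', 1),
    ORC('o', 2);

    private final char symbol;  // glyph drawn on the map
    private final int hitDice;  // rough measure of toughness

    Monster(char symbol, int hitDice) {
        this.symbol = symbol;
        this.hitDice = hitDice;
    }

    char symbol() { return symbol; }
    int hitDice() { return hitDice; }
}
```

Adding a monster is then one new line in the list. The compiler numbers the constants itself (see ordinal()), so nothing has to be renumbered by hand.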

You may wonder where all the other information is. Simple. KOBOLD is an enum type. So you can now add attacks with a method like this:
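As a rough sketch of attaching attacks through a method: AttackKind, Attack and addAttack are hypothetical names chosen for this example, and the real method may well look different.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: AttackKind, Attack and addAttack are illustrative names.
enum AttackKind { WEAPON, KICK }

// One optional attack: what it is and how often (in percent) it is chosen.
final class Attack {
    final AttackKind kind;
    final int chancePercent;

    Attack(AttackKind kind, int chancePercent) {
        if (chancePercent < 0 || chancePercent > 100) {
            throw new IllegalArgumentException("chance must be 0..100");
        }
        this.kind = kind;
        this.chancePercent = chancePercent;
    }
}

enum Monster {
    KOBOLD, ORC;

    // Each enum constant is a single shared object, so it can carry its own list.
    private final List<Attack> attacks = new ArrayList<>();

    // Returns this so that calls can be chained.
    Monster addAttack(AttackKind kind, int chancePercent) {
        attacks.add(new Attack(kind, chancePercent));
        return this;
    }

    // Read-only view, so callers cannot edit the list behind the enum's back.
    List<Attack> attacks() {
        return Collections.unmodifiableList(attacks);
    }
}
```

With this in place, Monster.KOBOLD.addAttack(AttackKind.KICK, 5); registers a kick that is picked 5% of the time. Because an enum constant exists exactly once, the call affects every kobold in the game, which is what you want for per-species data.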

Now the kobold has a 5% chance to kick. This is cleaner than the way it’s done in Nethack, since it leaves the game to infer what it can from things like size and hit dice; i.e. you don’t have to define everything in the data. You may also wonder where the monster’s name is. No need to worry; System.out.println("Monster name: " + kobold); will print “Monster name: kobold”. Now guess what: this type hooks into Mobile, which in turn extends NWObj (a Java enum cannot itself extend a class, so the link has to go through an interface or a wrapping class), so you now have access to equivalents of Nethack’s aname() (and/or thename() if it exists), pluralizer functions, countable and uncountable processing, and so forth. And everything, including the name of the monster and its symbol, is generated by the code. It’s not stored in a static data structure.
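One caveat on the printing: by default Java prints an enum constant exactly as it is declared, so a constant written as KOBOLD prints as KOBOLD unless toString() is overridden. Here is a minimal sketch of generating the name in code, assuming upper-case constants. withArticle() is a hypothetical stand-in for an aname()-style helper, not a real Nethack or Netwhack function:

```java
import java.util.Locale;

// Hypothetical sketch: the constants and withArticle() are illustrative only.
enum Monster {
    KOBOLD, ORC, GIANT_RAT;

    // Derive the display name from the constant: GIANT_RAT -> "giant rat".
    @Override
    public String toString() {
        return name().toLowerCase(Locale.ROOT).replace('_', ' ');
    }

    // Naive indefinite article: "an" before a vowel, otherwise "a".
    String withArticle() {
        String n = toString();
        boolean vowel = "aeiou".indexOf(n.charAt(0)) >= 0;
        return (vowel ? "an " : "a ") + n;
    }
}
```

String concatenation calls toString() implicitly, so "Monster name: " + Monster.KOBOLD picks up the generated name without any stored string.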

You can even override individual enums and provide custom functions. Here’s an example in our hypothetical Monster.java file:
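A sketch of the mechanism, using Java’s constant-specific class bodies. DamageType, damageFor and the 20% figure are invented for this example and are not taken from any real monster data:

```java
// Hypothetical sketch: DamageType, damageFor and the 20% chance are illustrative.
enum DamageType { PHYSICAL, POISON }

enum Monster {
    ORC,
    KOBOLD {
        // Constant-specific body: only the kobold gets this behaviour.
        @Override
        DamageType damageFor(int percentRoll) {
            return percentRoll < 20 ? DamageType.POISON : DamageType.PHYSICAL;
        }
    };

    // Default for every monster that does not override it.
    // percentRoll is expected to be in the range 0..99.
    DamageType damageFor(int percentRoll) {
        return DamageType.PHYSICAL;
    }
}
```

The engine just calls m.damageFor(roll) on whatever monster it has. Monsters without special behaviour fall through to the default, so there is no placeholder to fill in for them.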

Of course, even this could be considered too verbose, since any time you deal with this monster you automatically know it’s a “kobold” because, remember, “kobold” is an enumerated type. And then you know one of its attacks could be poison. And if that attack is rolled, the special code for poisoning could be placed there.

So in the monster factory class/method/whatever, whenever a monster of type kobold is created, because the game understands ‘kobold’ as a compile-time constant and because we access the data along with the methods right at enum access time, there is no need to put special case switches anywhere in the game engine. All the special cases are defined with the monster and there are no placeholders like ‘noattack, noattack, etc’.

The code looks very very clean. It’s very easy to understand. No more wading through special case switches every single time you want to do something in the engine. It all just works.

This approach has other benefits, for example protecting data files because they’re code (i.e. you don’t need to patent a proprietary file format or data structure because your data is created by the code, which is protected by the license). To give an example, not that I am evil, but it is apparently completely legal to copy everything in Nethack into my game wholesale so long as the user has a copy of the Nethack sources which he obtained legally. This is a loophole in the NGPL, so to speak, because their data is kept in plaintext and if you obtained the code legally there is no restriction on your use of it. However, if the data is kept in code in the manner I described, it is protected under the software license, and the license can govern whether reverse engineering that data out of the program is allowed. That is how SmartGo prevents you from legally taking its Go games out of its database. A lot of programs use this technique, actually.

In Practice

In practice, difficult constructs into static arrays such as

  • PotionData.item[i].identified = false;
  • MobData.mobkind[MobInfo.ORC].maxhp = "1d8";

are now

  • i.identified = false;
  • orc.maxhp = "1d8";

Further, switch statements operate as “case orc:” and “case kobold:”. Although for the most part, these aren’t needed and special logic can go directly into the enum.
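For the cases where a switch is still wanted, a small sketch. The Greetings class and its messages are made up for the example, and the constants are written in upper case here, though lower-case constants work the same way:

```java
// Hypothetical sketch: Greetings and the messages are illustrative only.
enum Monster { ORC, KOBOLD, NEWT }

final class Greetings {
    // Inside a switch on an enum, case labels use the bare constant name,
    // without the Monster. prefix.
    static String greeting(Monster m) {
        switch (m) {
            case ORC:
                return "The orc snarls.";
            case KOBOLD:
                return "The kobold yips.";
            default:
                return "It stares at you.";
        }
    }
}
```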

I also no longer need MobInfo, MobData and Mobile — one class covers everything. This has reduced the SLOC of my project significantly.

It makes many aspects of coding easier and enables many new kinds of features. For example in custom level files (think Nethack .des files) I can add monsters and modify their stats by name because the code can read the name of the monster and access the enum immediately using (for example) Monster m = Monster.valueOf(String name).
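A sketch of that lookup, assuming upper-case constants. byName is a hypothetical helper added for the example, while valueOf is the method the compiler generates for every enum. Note that valueOf needs the exact constant name and throws IllegalArgumentException for anything else, so names read from a level file want some normalising first:

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch: byName is an illustrative helper, not part of the JDK.
enum Monster {
    ORC, KOBOLD;

    // Case-insensitive lookup that returns an empty Optional for unknown names
    // instead of letting valueOf's IllegalArgumentException escape.
    static Optional<Monster> byName(String name) {
        try {
            return Optional.of(Monster.valueOf(name.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }
}
```

A level-file parser could then call Monster.byName("kobold") and report a friendly error when the result is empty, rather than crashing on a typo.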

Exploring the magic of Java enums has brought Netwhack back into the rapid development stage where my fingers just can’t type fast enough to get down the massive stream of features and ideas that just keep coming and coming. I’m giddy excited about doing major content addition again for the first time since the accident in 2011. Maybe I’ll add a hundred cool new monsters tomorrow, and twenty new potions. Rise up!

Final Thoughts

So from this standpoint there is really no need to copy Nethack’s code because it is, frankly, outdated. That doesn’t mean it’s bad though. Just not so modern anymore. That’s C for you. Then again, touché, I am sure many Lisp, Haskell or Python programmers feel my use of Java is outdated. Point well taken.

Another point well taken is that the Nethack devteam could adopt this ‘Java approach’ in C, by adding function pointers to its structs and dynamically assigning code to each monster. But that would amount to hardcoding the data anyway, so it wouldn’t make much real difference. The only reason why the Java approach feels better is because the inheritance and dependencies are handled for you by the language.

The point is, Java is not C, and it would be as ludicrous to assume someone is borrowing C code for use in a Java program as it would be to actually try to do so.
