Saturday, June 9, 2012

Broadening Programming

I've decided that I want to broaden my abilities in terms of the languages I know. After looking at some stats on the web, I've picked the languages I want to learn: C/C++, Python, Java, Smalltalk, Ada, and Ruby. That's a wide range of languages and language types. Smalltalk lists Lisp, Simula, Logo, and Sketchpad among its influences, whereas Java and Python are heavily influenced by C/C++. I can also see some family resemblance between Ada and Python. And Ruby draws on all of them as influences, and I know it gets a fair bit of use (Ruby on Rails).

So those are the languages I want to learn. The biggest difficulty with learning a programming language is just finding something to do and knowing when you've done it right. I've just started a repository called "LittleLibs", a collection of libraries I'm writing to help me learn how to create these systems in each language, and which will probably provide me with useful libraries I can use later on.

Fluent. I'm going to define fluent here as using a programming language idiomatically: programming Ada like an Ada programmer, not like a Python programmer.

This also touches on how I plan to do unit testing. I personally don't believe that internal-consistency unit tests make sense. An internal-consistency unit test dictates how a library should work internally, and that isn't what matters! What matters is that the library is correct; how it works underneath shouldn't matter. If it's done right, even the programming language underneath shouldn't matter.

What I'm imagining is that I'll have a library, then a simple test interface to that library. The test interface exists just to get past the language-specific plumbing of calling into the library. It loads all of its data from a file and checks the library's results for correctness. Each language will have its own test interface to its own library, but the data being tested against is common amongst all the libraries. Thus they are either all right or all wrong together.
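
To make that idea concrete, here is a minimal sketch in Python of what such a file-driven test interface could look like. The file format, the cases.txt name, and the add_vectors function are hypothetical stand-ins, not the actual LittleLibs code.

```python
# A minimal sketch of a file-driven test interface (hypothetical names).
# Each line of the data file holds inputs and the expected output, e.g.:
#   1 2 3   4 5 6   5 7 9
# The same data file would be shared by the test interface in every language.

def add_vectors(a, b):
    """Stand-in for a LittleLibs function under test."""
    return [x + y for x, y in zip(a, b)]

def run_cases(path):
    failures = 0
    with open(path) as f:
        for line_no, line in enumerate(f, start=1):
            numbers = [float(n) for n in line.split()]
            a, b, expected = numbers[0:3], numbers[3:6], numbers[6:9]
            result = add_vectors(a, b)
            if result != expected:
                failures += 1
                print("case %d failed: got %s, expected %s" % (line_no, result, expected))
    print("%d failure(s)" % failures)
    return failures == 0

if __name__ == "__main__":
    run_cases("cases.txt")  # hypothetical shared data file
```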

This way I can both practice these languages and end up with useful libraries I can reuse with them later.

So, what do I need to focus on to learn these languages? These are the main things I've come up with to concentrate on while working with them.

I've already started working in Python on my first project, a collision detection library, which also happens to include various lower-level math libraries that will be of great value to me later: a linearAlgebra library, a Geometry library, and a boundingVolume library. These have lots of potential uses in collision/physics engines, as well as in graphics systems such as OpenGL renderers and ray tracers.
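
As a flavor of what those libraries involve, here is a minimal sketch of an axis-aligned bounding-box overlap test in Python. The class and method names are my own illustration, not the actual LittleLibs API.

```python
# A minimal sketch of an axis-aligned bounding box (AABB) overlap test.
# Class and method names are illustrative, not the actual LittleLibs API.

class AABB:
    def __init__(self, min_corner, max_corner):
        self.min = tuple(min_corner)  # (x, y, z) of the lowest corner
        self.max = tuple(max_corner)  # (x, y, z) of the highest corner

    def overlaps(self, other):
        # Two boxes overlap only if their intervals overlap on every axis.
        return all(self.min[i] <= other.max[i] and other.min[i] <= self.max[i]
                   for i in range(3))

a = AABB((0, 0, 0), (1, 1, 1))
b = AABB((0.5, 0.5, 0.5), (2, 2, 2))
c = AABB((3, 3, 3), (4, 4, 4))
print(a.overlaps(b))  # True: the boxes intersect
print(a.overlaps(c))  # False: the boxes are disjoint
```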

I'm considering adding MIXAL to the list, if for no other reason than to pay homage to Donald Knuth.

Friday, May 4, 2012

Quake Dissection, Part 1

I've decided not to use Kodachi for further development on Delta. Compiling and running Delta is just a little too much for the software graphics drivers. So you, my zero readers, will still hear me talk about it, but only when I'm actually at my desktop. What I'm going to do when I'm away from my desktop for now is start playing with other people's source code, starting with Quake. So today begins my first game dissection! I've got no idea what I'm doing, so here goes nothing!

Not knowing anything to begin with, I look at the root of the "Quake" folder. There are 3 folders (QW, qw-qc, WinQuake) and 2 files (gnu.txt, readme.txt). The first thing I notice is that the target audience is Windows users: partly because of "WinQuake", but also because in most Unix projects like this you would see "gnu" and "README" with no extension. Just a little quirk. The second thing I notice is that there are no data files. This is strictly the source code; I can't build this and run Quake. But that's OK, I don't really want to play Quake right now, I want to see how it works. So carrying on.

The gnu and readme files are just licenses and such, so on to the folders. QW first. I see a collection of files and folders; most of the folders contain various C files. One folder sticks out: "docs". Maybe there I could find some help getting started in understanding this. No avail. The 4 files inside were mostly about how to get different window wrappers to work. So I go back up to QW. I see "clean.bat"; maybe by seeing what gets deleted, I can see what's important. That didn't really help at all. I found an interesting file called "cmds.txt". It contains things like:

cd
+moveup -moveup
+movedown -movedown
+left -left
version", CL_Version_f
changing", CL_Changing_f
disconnect", CL_Disconnect_f
record", CL_Record_f
stop", CL_Stop_f
playdemo", CL_PlayDemo_f
timedemo", CL_TimeDemo_f

Obviously, these are console commands. Not what I'm looking for.

Next I inspect "Makefile.Linux" to see if I can decode what depends on what. I'm still such a rookie with make that I can't decode anything. Inspecting the remaining files in here didn't yield anything, and I glance over a couple of folders without finding anything particularly leading. So I jump into the server folder; this one and client I know have makefiles, and I start with the server for some reason. In the server makefile I find the lines:

$(EXE) : $(OFILES)
	cc $(CFLAGS) -o $(EXE) $(OFILES) $(LDFLAGS)

I glance up and find:

EXE = qwsv

I'm finding something! qwsv seems to be an important target. qwsv turns out to be 5 files: qwsv.dsp, qwsv.dsw, qwsv.mak, qwsv.mdp, and qwsv.plg. I look at qwsv.mak: another huge wall of make I can't make heads or tails of. *sigh*

Then I get an idea. What if I enter at the command line: grep "somefile.h" -l *? That will list any file that includes that header. So I'll start with something that looks low level, and work up until there aren't any files that include it. I tried "progdefs.h", which sounds like a low-level file. It gave me:

progs.h qwsv.dsp qwsv.mak qwsv.mdp

So I tried "progs.h" and it gave me nothing. Gah! OK, so what files do I have to work with here?

asm_i386.h newnet.txt pr_exec.c qwsvdef.h qwsv.plg sv_main.c sv_user.c world.c makefile notes.txt profile.txt qwsv.dsp server.h sv_move.c sys.h world.h math.s pr_cmds.c progdefs.h qwsv.dsw sv_ccmds.c sv_nchan.c sys_unix.c model.c pr_comp.h progs.h qwsv.mak sv_ents.c sv_phys.c sys_win.c move.txt pr_edict.c quakeasm.h qwsv.mdp sv_init.c sv_send.c worlda.s

So, what has hope here? Anything without a .c or a .h isn't going to be of interest. That reduces it to:

asm_i386.h pr_comp.h progdefs.h qwsvdef.h sv_ents.c sv_move.c sv_send.c sys_unix.c world.h model.c pr_edict.c progs.h server.h sv_init.c sv_nchan.c sv_user.c sys_win.c pr_cmds.c pr_exec.c quakeasm.h sv_ccmds.c sv_main.c sv_phys.c sys.h world.c

I know progdefs and progs won't help, and the asm files aren't going to be a good starting point either. Anything with "sys" in the name is likely to be just system wrappers and not what I need. model, move, phys, user, send, world, ccmds, and nchan I can rule out pretty quickly as well. That leaves:

pr_comp.h qwsvdef.h sv_ents.c pr_edict.c server.h sv_init.c pr_exec.c

I'm down to 7 files. server, init, and exec look the most promising, so I'm going to start with exec. The first thing I notice is:

#include "qwsvdef.h"

What's that file? It turns out to be a nice, easy-to-digest file with structs and other common includes, so I can take that one off my list as well. Then the most obvious thing hit me: grep for "int main". Duh! That got me sys_win.c. I tried "void main" and got sys_unix.c. Hmmm. I tried just "main" and got a bunch. So I'm going to jump over to sys_unix.c. That file contained, as expected, a bunch of system wrapper functions (making directories, reading, printing, errors, time, etc.), but it also had the line:

void main(int argc, char *argv[])

Woohoo! I'm so glad I discovered grep. But that's all I've got time for in this installment.
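
For anyone following along without grep handy, here is a rough Python equivalent of the include-search trick above. The QW/server path is just a placeholder for wherever the sources live.

```python
# A rough Python stand-in for: grep -l '"progdefs.h"' *
# Lists every .c/.h file in a directory that includes a given header.
# The directory path is a hypothetical placeholder.
import os

def files_including(header, directory="."):
    needle = '#include "%s"' % header
    hits = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith((".c", ".h")):
            continue
        with open(os.path.join(directory, name), errors="ignore") as f:
            if needle in f.read():
                hits.append(name)
    return hits

print(files_including("progdefs.h", "QW/server"))  # which files depend on progdefs.h?
print(files_including("progs.h", "QW/server"))     # an empty list means nothing includes it
```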

Monday, February 27, 2012

Tutorial Architecture

A friend of mine critiqued my post on parallelism, and it gave me some good insight into what needs to be done and what was done well. It has made me think about what makes a good tutorial, guide, or instructional writing of any kind. One thing I thought about was how much detail is necessary: if I put in too much detail the reading becomes overwhelming, and if I don't put in enough detail it may not get the important information across. Also, Gadget commented on the value of examples and summaries. So this is the approach I came up with.

I start with a topic, in this case parallelism. A topic is a piece that is meant to be read together as one chunk, kind of like a section in a chapter or a post on a blog. Each topic is made up of concepts and key words. At the beginning of a topic, I introduce the concepts to be covered and define the key words for that topic, give a summary overview of the topic, and then go into the concepts.

Each concept has several levels of explanation. The first paragraph of the concept is a summary aimed at someone who already knows the concept. This way, if someone reads the summary and feels confident that they understand it, they can continue on to the next concept. Someone who knows a lot but may be missing a few things can skim really easily.

The next section of the concept is optional. It tries to explain the concept carefully; someone to whom this is a new field should be able to understand this explanation. And there may be more than one attempt: this part of the concept is attempt after attempt to explain the idea, coming from different angles, with pictures, analogies, detailed explanations, whatever it takes to help anyone understand it. The reader need only read however much it takes to understand what was summarized earlier.

After the explanations, the topic continues on to the next concept. There isn't a closing summary, because the reader is expected to have read the summary that came first. I may change that and just restate the summary in different words.

After each concept has been covered there is a summary of the topic, how all the concepts fit together, as well as links or references for further reading. What I hope to accomplish with this approach: those who know a concept can skip it; the important information is presented up front and hashed over as many times as it takes to make it clear; the information is presented in a clear pattern that is carefully followed, so the reader can stay focused and keep track of where they are; and if they still don't understand or want to learn more, there are references for further study. It isn't a format that standard textbooks would be able to support well, and it will take a lot of work from me, but if I do it on my blog I can have sections of text expand and collapse. And the time I spend writing the tutorial I'm sure I'll get back in the confidence I gain in the field.

Thursday, February 23, 2012

What is Parallelism?

I've decided that this is where I need to start with my new OpenCL tutorial. For me to be able to easily and accurately explain how OpenCL works, I need you to understand what the different kinds of parallelism are.

If you are reading this, you are interested in learning OpenCL but have no experience programming in such environments. If you are reading this, you probably don't have a background in electrical or computer engineering. If you are reading this, you may be familiar with the concept of multi-threading, but may not be aware that there are more kinds of parallelism. Hopefully, you'll be more in the know here soon.

In computer science we can do things in one of two ways: in serial and in parallel. These terms typically speak for themselves, but just so we are on the same page I'll elaborate. Doing something in serial looks like:
A->B->C->D->etc.
A happens; then when it's done, B happens; then when that's done, C happens, and then D, and so on. In some cases there are what's called dependencies from A to B and on down the line. In other words, D can't execute before C does because it depends on some of the output values of C, C can't execute before B does, nor can B before A, because each depends on the outputs of the step before it. When there are dependencies, you must do things in serial.

However, if there are no dependencies, A, B, C, and D could all execute at the same time. This is called executing in parallel.
A->
B->
C->
D->
In order to do this, certain physical constraints come into play. The easiest to understand (and we'll stick with the CPU for this example) is the number of processors. If you have only 2 processors you could only do:
A->B
C->D
If you only had one processor you would have to do something like
A->D->C->B (represented out of order here to make the point that execution order doesn't matter)
The first parallel example would require 4 processors (or a 4-core processor). So the number of things you can do at once depends on the number of processors you have. This is the concept of multi-threading or multi-processing (the distinction between the two I will write about later).
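
To make that concrete, here is a minimal Python sketch of running four independent tasks first in serial and then in parallel. The work() function is a made-up placeholder, and how many tasks actually run at once depends on how many cores the machine has.

```python
# A minimal sketch of serial vs. parallel execution of independent tasks.
# The work() function is a hypothetical placeholder for A, B, C, and D.
import time
from concurrent.futures import ProcessPoolExecutor

def work(name):
    time.sleep(1)          # pretend to do one second of work
    return name + " done"

if __name__ == "__main__":
    tasks = ["A", "B", "C", "D"]

    start = time.time()
    print([work(t) for t in tasks])                  # serial: roughly 4 seconds
    print("serial took %.1fs" % (time.time() - start))

    start = time.time()
    with ProcessPoolExecutor() as pool:              # parallel: roughly 1 second,
        print(list(pool.map(work, tasks)))           # given at least 4 cores
    print("parallel took %.1fs" % (time.time() - start))
```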

Multi-threading/multi-processing is an example of task-level parallelism. There are 4 levels of parallelism (in order from the bottom up):
Bit level
Instruction level
Data level
Task level

Bit-level parallelism is not really a concern anymore. It was a concern back in the 1970s, when 8-bit processors were the common processor. In order to add two 16-bit numbers on an 8-bit processor, it would take 2 steps: add the lower 8 bits, then add the upper 8 bits along with the carry bit from the first step. This meant that doing this one thing, adding two 16-bit numbers, required 2 instructions. Now it is about as likely that you are using a 64-bit processor as a 32-bit one, and in either case the word size is so large that this isn't a concern anymore. So I won't talk about it any further.
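
Just to illustrate that two-step add, here is a small Python sketch that adds two 16-bit numbers using only 8-bit additions plus a carry, the way an 8-bit processor would have to.

```python
# Adding two 16-bit numbers using only 8-bit additions and a carry bit,
# the way an 8-bit processor has to do it (two instructions instead of one).

def add16_with_8bit_alu(a, b):
    lo = (a & 0xFF) + (b & 0xFF)       # step 1: add the lower 8 bits
    carry = lo >> 8                     # did the low byte addition overflow?
    hi = (a >> 8) + (b >> 8) + carry    # step 2: add the upper 8 bits plus the carry
    return ((hi << 8) | (lo & 0xFF)) & 0xFFFF

print(hex(add16_with_8bit_alu(0x12FF, 0x0001)))                           # 0x1300
print(add16_with_8bit_alu(0x1234, 0xABCD) == (0x1234 + 0xABCD) & 0xFFFF)  # True
```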

Instruction level:
This level of parallelism gets divided into 2 categories. The one we aren't really concerned with, because it happens under the hood, is instruction pipelining: in the average processor there are typically several instructions being executed at once. But the way this works isn't what creates the multi-core processors we know today; it is just the commonplace way instructions execute in a processor. Again, we aren't concerned with writing operating systems right now, so I'm going to leave this here. This isn't something we can really exploit or even interface with at all.

The other kind of instruction-level parallelism is found in "superscalar" processors. In this case, a single processor executes one instruction on more than one piece of data at the same time. (Instruction: in machine language, essentially binary, what your processor actually sees, each operation or step is called an instruction.) This is also sometimes called "SIMD", which stands for "Single Instruction Multiple Data". SIMD is what I want you to remember and wrap your mind around at this level.

Say I have 8 values and I want to add every 2 together: first + second, third + fourth, fifth + sixth, and seventh + eighth. With normal processor instructions it would happen just like that: group 1, then group 2, then group 3, then group 4. With SIMD, which is actually available to you in most modern processors, the 1st, 2nd, 3rd, and 4th values all get placed into special registers in the processor. (A register is the lowest level of storage. RAM and the cache are made up of groups of registers. Whenever your processor executes an instruction it reads the data from a register; if it wants to use something in RAM or cache, it first reads it into a register and then executes the instruction. See my post on memory access for further details.) The processor then executes the add instruction once, and it does both group 1 and group 2 at the same time. In this manner the number of instructions is halved.
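
Here is a small Python/NumPy sketch of that pairwise-add example. NumPy's vectorized add expresses the same one-instruction-many-values idea; whether it actually hits the CPU's SIMD registers depends on the NumPy build, so treat this as an illustration rather than a guarantee.

```python
# Pairwise adds: first+second, third+fourth, fifth+sixth, seventh+eighth.
# The loop does one add per group; the NumPy version expresses all four adds
# as a single vectorized operation (the SIMD idea: one instruction, many values).
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Scalar style: one addition per group.
sums_scalar = [values[i] + values[i + 1] for i in range(0, len(values), 2)]

# Vectorized style: add all the "left" elements to all the "right" elements at once.
sums_vector = values[0::2] + values[1::2]

print(sums_scalar)   # [3, 7, 11, 15]
print(sums_vector)   # [ 3  7 11 15]
```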

A graphics card is an example of the first word I used: it is a "superscalar" processor. Instead of loading groups 1 and 2 into special registers, the graphics card loads all the groups, in clumps of 2, into individual stream processors, so 4 stream processors get used. The thing about a stream processor is that it has no control systems. A stream processor does what it is told; it never figures out what to do. So the GPU says "add your first and your second" to the stream processors. It is important to note here that the GPU need not, and in fact cannot even if you wanted it to, address each stream processor individually. When the GPU says "ADD!", all the stream processors add. When the GPU says "load from your location + 1", stream processor 1 loads from memory location 2, stream processor 2 loads from memory location 3, and so on. This is called performing instructions in "lock step" (meaning each stream processor is locked into the same step as all the others). This may seem like it restricts what the graphics card can do; if you think that, you are right.

Here's the thing. Because we've restricted what the graphics card can do, there are fewer things to concern ourselves with, so we can make those few instructions go super fast. Great amounts of speed are gained by applying these restrictions. This is how your graphics card can paint 1920x1080 pixels 60 times a second. (If the GPU were to spend 1 instruction per pixel, that would be about 124 million instructions a second. To put it in context: a GPU with 1024 stream processors [middle-high end] can execute around 640,000,000,000 instructions per second, while a quad-core CPU can execute around 9,600,000,000 instructions per second. See how restrictions can be useful?)

So, to summarize: instruction-level parallelism, as far as we are concerned with it, takes multiple pieces of data and performs the exact same operation (the exact same operation being the important point here) on them all at once.

Data level parallelism:
Data-level parallelism is a theoretical concept. It actually rides on top of task-level parallelism, but with some key differences. The easiest way to imagine data-level parallelism is to think of a for loop. Assuming that no iteration of the loop depends on any other iteration, a data-parallel system will execute all (or as many as physically possible) of the iterations of the loop at once. Each processor runs the exact same code, but on different data. How does this differ from instruction-level parallelism? The big thing is that while the same code is being executed, the executions are not in lock step. There is an efficiency gain because the code is shared, but the instances don't all have to execute at exactly the same time.
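
A minimal Python sketch of that parallel for loop idea, using a multiprocessing pool to hand the independent iterations out to worker processes. The body() function is a made-up placeholder.

```python
# A "parallel for loop": every iteration is independent, so the iterations can
# be handed out to a pool of worker processes. The body() function is a
# hypothetical placeholder for whatever the loop does.
from multiprocessing import Pool

def body(i):
    return i * i   # same code for every iteration, different data

if __name__ == "__main__":
    data = range(10)

    # The serial for loop...
    serial = [body(i) for i in data]

    # ...and the data-parallel version of the same loop.
    with Pool() as pool:
        parallel = pool.map(body, data)

    print(serial == parallel)  # True: same results, just computed concurrently
```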

Task level parallelism:
This is where a lot of attention is currently being focused in computer engineering and software engineering. Here's what makes task-level parallelism so important: there need not be anything in common between the separate tasks. Here's what makes it so hard: what if there is something in common between the separate tasks?

OK, now to decode my last two statements a little. Instruction-level parallelism had each separate instance executing in lock step. At the data level it wasn't lock step, but all the instances came from the same for loop. With task-level parallelism there can be two completely unrelated tasks running at the same time: consider your computer's network driver and a text editor. They have nothing at all in common, but they can run at the same time. This creates a huge amount of flexibility, but extra flexibility brings extra overhead.

But this raises the question: what happens if or when the tasks do share something? There are mechanisms such as locks and semaphores that ensure different tasks don't step on each other's toes, but those control structures are not free. Sometimes there is no concern about stepping on each other's toes, yet it can still be hard just to share the information at all.
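
As a tiny Python sketch of that coordination cost: two tasks sharing one counter need a lock around the shared update, and that lock is exactly the kind of overhead that lock-step or data-parallel execution never has to pay. The counter is just an illustrative stand-in for shared state.

```python
# Two unrelated-looking tasks that happen to share one piece of state (a counter).
# The Lock keeps them from stepping on each other's toes, but taking the lock is
# overhead that lock-step SIMD or an independent parallel-for never pays.
import threading

counter = 0
counter_lock = threading.Lock()

def task(n_increments):
    global counter
    for _ in range(n_increments):
        with counter_lock:          # serialize access to the shared counter
            counter += 1

threads = [threading.Thread(target=task, args=(100000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 200000 every time; without the lock the result can come up short
```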

So my point is that while task-level parallelism sounds attractive because you can execute whatever you want whenever you want, be careful: that freedom comes at a price. Instruction-level parallelism may feel restrictive, but because of those restrictions it avoids the overhead involved in task-level parallelism.

So this, as with everything, is a trade-off, and remember that appropriateness is key. Completely independent tasks? Task level. Large data sets with identical instructions? Instruction level.

Saturday, February 18, 2012

Delta

A while back I wrote down a file I called "Game Engine Guide", which contained my original thoughts on making a game engine for my own academic purposes. I realized after my last post that I referenced "Delta" without any context for my 0 readers. (I consider this blog more of a journal than anything.) In my game engine guide I planned for 3 engines: Delta, Epsilon, and Zeta. Delta is for learning the technology: OpenGL, OpenCL, OpenAL, OpenMP, etc. Epsilon is designed to be my fantastic failure: once I'm over the learning curve of the technology, I'll over-plan and be over-ambitious and see what lessons I can learn and what my limits really are. Then, finally, Zeta is the acceptance of those limits and the design and creation of a usable engine that I am actually capable of creating.

In the interest of what I dreamed up last time, I want to accomplish Delta, and I want to do so soon. So I'm taking a moment to decide what Delta needs in order to be considered complete, as well as musing a bit about its architecture.

Delta is written entirely in C and C++. It uses simple polygons for its graphics and physics, both of which are computed on the GPU. The CPU is left in charge of managing the GPU and the audio through OpenAL. Collision detection and resolution happen on the GPU, something that will take some effort for me to master but will be very powerful when done. I had originally said that it was to have mouse look and keyboard input; I'm removing mouse look. There will still be the ability to trackball items, but mouse look is not possible through GLUT.

So, what do I need to complete Delta?

Graphics:
Finish reimplementing the graphics interface class to allow for a vector of models, interchangeable cameras, and whatnot.
Importing .obj files
Sharing location data with Physics.

Physics:
Collision detection
Collision resolution
Gravity

Sound:
Learn OpenAL
Triggers from GPU for sounds
Triggers from CPU for sounds
Internal hash library for sound effects

Engine:
Loading main game
Loading levels
Exiting

A fair bit to do, but I think I'll be able to accomplish a large chunk of this by the end of spring break; I've already done a lot of work over Christmas break. One important choice I've made deals with the actual running of the engine: the engine is to be 100% data driven, meaning that the difference between one game and another, as far as the engine is concerned, is solely in the assets. One game will have one set of models, sounds, scripts, etc.; another will have a completely different set. The engine will be a single standalone executable. This is highly portable, highly flexible, and very simple. To play a different game you just pass a different command line argument; to the end user this would theoretically just be a different shortcut that the theoretical developer made for them. Yes, there are flaws with this approach, but it will be sufficient for now. I just want to get Delta done so I can have something to show for my work.

Thursday, February 16, 2012

Kick starter

I was going to try to accomplish something on Facere, but a recent change in Ubuntu One broke it on my netbook, and things are currently re-assembling themselves, meaning I can't access my files for Facere right now. Not wanting to waste time, I decided to take a mental walk around an idea I have for next summer.

Between now and this time next year I hope to have completed Delta and Epsilon and to be working on Zeta. If that comes to pass, I might be able to take a step towards a Kickstarter. For a long time I've been interested in making a truly freeing, do-whatever-you-want space game with a heavy focus on physics and custom ships, the custom ships being by far the single most important feature. If I have Zeta in alpha, then I might be able to get a little interest in funding me towards that objective.

I estimate that I earned about $3k this summer working for computing services. If I were to try for a Kickstarter, I'd probably live with my mom over the summer, which would cut costs a bit, but this project would essentially be my summer job. If I could get a Kickstarter fund of $2k I could probably justify it as a reasonable choice. If I could get $3-4k it would not only be a reasonable choice, but quite probably the most prudent one.

And here my imagination flies off the hook: if, and it is a big if I know, I accomplish something with the status of Minecraft's first release, or Dwarf Fortress, or Overgrowth, and get to the point where people are purchasing pre-orders, I may very well be able to turn this into my livelihood while I'm working towards my master's. That would be uber cool. I could settle wherever I need to and just work on what I'm truly passionate about.

OK, the last paragraph may be a little far-fetched, but as we have seen it is entirely possible. I'm going to keep this idea in the back of my head and press forward with Delta and Epsilon. If I can complete them and start working on Zeta by Christmas break next year, I may just try this Kickstarter business.

Saturday, February 11, 2012

Simple Introduction

I recently had a little explosion on Stack Overflow. While I'm not afraid to admit I was probably in the wrong and the one being unreasonable, I think my point still stands. I wanted a simple introduction to geometry shaders that would help me wrap my mind around how they work. Most of what I ran across was either very poor in quality (poor explanations, poor examples, specific to particular systems or environments) or far too technical (lots of words or phrases that aren't familiar to unskilled users, an assumption that you already understand how the rest of the system works and this is just adding a new feature to what you already know, etc.). So I wanted to see if anyone out there knew of a good tutorial. My question was closed as not actually a question.

I'm taking a potentially permanent hiatus from any interaction with SO until I calm down. But it also made me think: of what quality is my OpenCL tutorial? I think Winkle was trying to get me to understand this too. It wasn't helpful for someone who doesn't already know how the system works. So I'm taking a step back with my OpenCL intro. I'm going to see if I can incorporate pictures and better explain how parallelization works. This will be a good exercise for me, I think; if I can explain it on that level, it also means I'll have to clarify things in my own head. I may end up reading my own tutorial as I write it. xD

Tuesday, January 31, 2012

Facere

I made a surprising amount of progress, truth be told. I've got nodes, arcs, and connecting dots. I don't have the ability to connect things yet, nor is there any functionality in the back end. But when I actually get around to making the functional back end, I feel that will go rather quickly; I know exactly how I want to do it.
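
For the curious, here is a minimal Tkinter sketch of the kind of node-and-arc canvas I'm describing. It only illustrates the idea; none of these names or shapes come from Facere's actual code.

```python
# A minimal node-and-arc canvas in Tkinter, roughly the kind of thing Facere
# draws. Names and layout here are illustrative only, not Facere's actual code.
import tkinter as tk

def draw_node(canvas, x, y, label):
    """Draw a node box with a connection dot on its right edge; return the dot's center."""
    canvas.create_rectangle(x, y, x + 90, y + 40, fill="lightgray")
    canvas.create_text(x + 45, y + 20, text=label)
    dot_x, dot_y = x + 90, y + 20
    canvas.create_oval(dot_x - 4, dot_y - 4, dot_x + 4, dot_y + 4, fill="black")
    return dot_x, dot_y

root = tk.Tk()
root.title("node sketch")
canvas = tk.Canvas(root, width=400, height=200, bg="white")
canvas.pack()

a = draw_node(canvas, 30, 40, "file reader")
b = draw_node(canvas, 220, 110, "decoder")
canvas.create_line(a[0], a[1], 220, 130)  # an arc from node A's dot to node B's edge

root.mainloop()
```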

This was an experiment of will. Can I just sit down and decide "I'm going to make this" and try to just simply make it? I certainly can. It is kind of nice to finally get a little bit of a handle on creating GUIs with python. I've learned quite a bit this evening and I've got a lot to ponder as I push forward with Facere. But I'm pleased with what I've got thus far.

Code Rush

I'm going to try something silly and ridiculous that probably won't work, but here it goes. I've come up with an idea for a universal editor. I'm going to teach myself Tkinter in Python and create this program in 5 hours. It isn't a complex program, so it has potential; but really, thinking I could teach myself Tkinter and make a program in the next 5 hours is foolishness. But here we go! Facere coming down the pike.

Thursday, January 19, 2012

SOPA & PIPA

I suppose it is about time I gave my two bits on SOPA and PIPA. Not surprisingly, I am greatly opposed to them. I also really do believe that piracy is bad. But Twenty Sided had a point: if you are a company with one channel for making money, and protecting or sustaining that channel comes to cost more than the channel is worth, you should drop it.

For example, the telegraph is dead. Truth be told, it was much cheaper to have a telegraph line than a phone line, but the public has spoken and the telegraph died. The postal system is another one. There was a big outcry when someone proposed that the post office deliver only 3 out of 6 days of the week to cut costs by dramatically reducing the number of mailmen. I don't blame the mailmen, nor am I surprised by the massive opposition they put forth. But think about it: bank statements, bills, and almost all letters come digitally anymore. It's faster, cheaper, and more reliable. Tell me, if something were faster, cheaper, and more reliable, wouldn't you toss the old one?

My point with the mail system is that times are changing. We don't use the mail system as a vital lifeline of information anymore; it has been replaced. Times are changing for computer games too. The days where you actually buy physical media are dying out. Publishers don't like this for a couple of reasons.

1. It is what they are used to. For a few decades everything was done by buying floppies, CDs, DVDs, cartridges, etc. This was a simple system and easy to control.
2. They lose their advantage on barrier to entry. In order to make money selling a video game you have to sell a large quantity of copies. With digital distribution, the cost of each additional copy is negligible, but when you are selling physical media each additional copy can cost you $2 or so. Publishers used this to their advantage: previously you couldn't make any money without a huge investment. Now the App Store, Steam, the Android Market, etc. just take a small royalty, meaning there is little barrier to entry, and this means indie games are rising. Big-time publishers don't like that extra competition.

Why SOPA and PIPA are bad is that they are backwards and close-minded. They are bills trying to protect an outdated system. Do I think piracy is good? No. But obviously people aren't using the postal system, so why defend it? I understand defending outdated systems when that is where your income comes from; I don't have any problem with people thinking they should, it's their livelihood. But you know what the really smart and successful people do? Rather than try to help an outdated industry limp along, they find ways to take advantage of the new industries that come around.

A fantastic example of this is the dot-com boom. Thousands of startups came around, but after all was said and done we have Amazon and eBay, and that's about it. There were dozens of companies that did exactly what Amazon and eBay did, but those two did it right, they did it better, and the competition died out and left just the two. The dot-com boom happened when everyone decided that putting everything on the internet was a fantastic idea. Well, surprise surprise, the internet has its limits, just like any other medium. Walmart clobbers lots of little stores, but somehow every little town in Iowa still has its own grocery store. Why? Because Walmart isn't sufficient; Walmart can't do everything. In particular, Walmart can't be in every small town.

I'm just kind of rambling at this point, so I'll bring this to a close. SOPA and PIPA are backwards and narrow-sighted. The DRM problem with video games will cease to exist when all video games are just stored in the cloud. Do I think that's what's going to happen? Probably; I'm not 100% certain, but it certainly seems probable. And guess what happens when everything is stored in the cloud? You can't pirate it.

Thursday, January 12, 2012

Generic Asset Editor

I had an idea a while back: it would be nice to have one program you use for all of your asset creation; one program for pictures, sound, models, etc. Now, obviously, if doing that were an easy endeavour, it would have been done long ago, but nothing has really come close to being an editor like that. It's hard; think about all the things Blender has in it, and it can manipulate only a small set of what is needed to make a game.

So I stopped and thought, "What does an editor do?" Put as simply as it possibly can be, an editor reads a file, changes values, and writes a file. That is as simple as it comes. Looking at Blender's node system for compositing, I thought, "What if an editor could work this way?" Meaning, instead of making a tool that can edit all sorts of things and then having to add functionality for each new type of file, why not make a program that can't edit anything by itself, but can be made to edit everything?

Like Blender's nodes, a node would have inputs and outputs, and what the node actually does would be entirely fluid: inside a node there would simply be Python code. So imagine a file reader node: it takes a file name as its input and produces a string as its output. You could then make a decoder node that takes that string and translates the values into variables you can work with in other nodes.
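
Here is a minimal Python sketch of how such nodes might be wired together. Every class and method name here is my own invention for illustration; Facere's real design may end up looking nothing like this.

```python
# A minimal sketch of the node idea: each node is just Python code with named
# inputs and outputs, and nodes are chained by feeding one's outputs to the next.
# All names here are invented for illustration; this is not Facere's real design.

class Node:
    def run(self, **inputs):
        raise NotImplementedError

class FileReaderNode(Node):
    """Input: a file name. Output: the file's contents as a string."""
    def run(self, filename):
        with open(filename) as f:
            return {"text": f.read()}

class DecoderNode(Node):
    """Input: a string of 'key = value' lines. Output: a dict of variables."""
    def run(self, text):
        values = {}
        for line in text.splitlines():
            if "=" in line:
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
        return {"values": values}

# Wiring two nodes into a tiny pipeline (the file name is hypothetical):
# reader_out = FileReaderNode().run(filename="asset.txt")
# decoder_out = DecoderNode().run(text=reader_out["text"])
# print(decoder_out["values"])
```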

Now, this is all good, but there is one piece that remains for this to make sense: editor nodes. These are nodes that the user would create using a UI painter inside the editor, and they are what makes it possible to edit anything. I'm still not entirely sure how this kind of node gets made, but the idea is compelling. One of the things I see having huge potential here is pipelines: you want to work on one asset, or a group of assets, so you open them up and the pipeline runs until it hits an editor node.

It's a rough idea that I'll probably post more about later, but for now, sleep.

Wednesday, January 4, 2012

The Process Pt.2

So I've spent the last 2 hours writing down everything I can think of in terms of how the library should work. It's proven to be incredibly insightful; I now actually know what I'm building. I wanted to note a revision to my little methodology, as I realized I had left something out:

1. Think of all the common uses of your program. What are the intended uses? Write down what those uses need. Now, with the time you have left, abuse it. What are some of the edge cases you can think of for your program? What might be some of the unintended yet still very viable uses of your program? How can you accommodate these to keep your program robust? Or how can you prevent these uses if they are security or performance pitfalls?

2. Pretend that your program is done. Write down some examples of using it. Describe its behavior. What are the characteristics of the implementation? If you think of any further needs while doing this, jump back to step one and add them in.

3. Figure out the atomic pieces. What is impossible to cut any smaller? Start building your architecture there. Don't be specific about how these pieces are made, just describe them. Then start putting the pieces together to make ever larger pieces until you've encompassed everything.

4. Test-driven development goes here (a minimal example of one pass through this loop follows the list):
a. Write a test that checks for one of the needs from step 1.
b. Run the test. If it passes, go back to step 4.a.
c. Write code to satisfy that test.
d. Run all tests. If any fail, go back to step 4.c.
e. Clean up your code and repeat until all requirements listed in step one are being tested for.

5. Take a break.
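
And here is the minimal example of one pass through the step 4 loop, using Python's standard unittest module. The TaskQueue requirement is invented purely for illustration; it isn't an actual requirement from the concurrency library.

```python
# One pass through step 4: the test is written first (4.a), run and seen to
# fail (4.b), then just enough code is written to satisfy it (4.c), and all
# tests are run again (4.d). The TaskQueue requirement is a hypothetical
# example, not an actual requirement from my concurrency library.
import unittest

class TaskQueue:
    """Step 4.c: the minimal code needed to make the test below pass."""
    def __init__(self):
        self._items = []

    def push(self, task):
        self._items.append(task)

    def pop(self):
        return self._items.pop(0)   # first in, first out

class TestTaskQueue(unittest.TestCase):
    """Step 4.a: a test for one need, written before the code above existed."""
    def test_tasks_come_out_in_fifo_order(self):
        q = TaskQueue()
        q.push("first")
        q.push("second")
        self.assertEqual(q.pop(), "first")
        self.assertEqual(q.pop(), "second")

if __name__ == "__main__":
    unittest.main()   # steps 4.b and 4.d: run the tests
```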

On to step 3!

The Process

It figures that this comes right after I posted about test-driven development. I'm starting a new project today; it's rather a classic type of program, a concurrency library. So I'll be making a lot of base classes and very testable units. I'm taking this project as a chance to walk through the whole development process, and I'm going to give a little overview of it here:

Requirements - determining the needs or conditions to be met.
Specification - the documentation that describes the requested behavior of the system.
Architecture - the set of structures needed to reason about the system: the software elements, the relations among them, and the properties of both.
Design - the process of problem solving and planning for a software solution.
Implementation - the actual process of writing code and debugging; what we typically think of when we think about coding.
Testing - running automated tests, manual tests, and anything else that can be used to verify that the software meets requirements and works as expected.
Deployment - everything that is done to make the software available for use: executables, zip files, web sites, etc.
Maintenance - modification of the product after delivery; basically implementation after deployment.

The Mythical Man-Month, be it nearly 40 years old, has a word to say on this topic, particularly the balance of these 8 items. The first 2 we will categorize as pre-production, the second 2 as design, the third 2 as development, and the last 2 as maintenance. According to MM-M, the first 2 categories should take 25% of your time each! That means you will spend half of your time on the project simply preparing to write the program, without writing a single line of code. I know that sounds strange; if I were a construction worker and told my boss I needed to spend half of the time just preparing for the project, I'd be fired. But the thing is, construction projects do spend a large amount of time in preparation; it's just that by the time the work gets to the construction worker, all of the preparation is complete. It isn't a very good analogy, as construction tends to spend far more time in production relative to preparation than software development does, but my point remains: you need to spend a lot of time planning and designing, and it will actually save you time in the long run.

The 3rd category, development, gets exactly 1/6th of the time on your project. Yeah, again, tiny. The reason is that by the time you get here, you will know so precisely what you need to do that you will sit down and just write out code at a frightening pace. You won't have to go back and rework things as much because you'll get it right the first time, and you'll spend less time thinking about what needs to be written because you already know.

The 4th category, maintenance, gets the remaining 1/3 of your time. The good thing, though, is that you will have spent so much time carefully writing designs and specifications that maintenance won't be the horrific nightmare we all know. You'll have every detail marked out and batteries of tests to keep everything in tip-top shape.

That's the brief summary of the process; now on to my actual attempt at it!

I know I won't finish this library in the time I am home. That's OK, but because of that I am going to cut some of those time frames a little short; I want to fit this all into a day, and I know I'm not going to write the whole thing in a single day anyway. So I'm allotting 2 hours for pre-production, 2 hours for design, and then 2 hours for development. I don't really have a deployment or maintenance phase, so I'm not going to concern myself with those.

I've read over a couple of programming methodologies, and I get the feeling that people are far too narrowly focused. A lot of the methodologies, like iterative development and TDD, really just focus on the implementation step. Agile, Scrum, waterfall, and the like, however, focus on the entire process as a whole. To each their own; I know that a lot of these work and they all have their advantages. But they are also geared towards large projects with multiple people. For example, I'm not going to have a stand-up meeting with myself, so I'm not using agile. This is what I plan on doing. Introducing: the Littlefoot methodology!

1. Think of all the common uses of your program. What are the intended uses? Write down what those uses need. Now, with the time you have left, abuse it. What are some of the edge cases you can think of for your program? What might be some of the unintended yet still very viable uses of your program? How can you accommodate these to keep your program robust? Or how can you prevent these uses if they are security or performance pitfalls?

2. Pretend that your program is done. Write down some examples of using it. Describe its behavior. What are the characteristics of the implementation? If you think of any further needs while doing this, jump back to step one and add them in.

3. Test Driven Development goes here:
a. Write a test that checks for one of the needs from step 1.
b. Run the test. If it passes, go back to step 3.a.
c. Write code to satisfy that test.
d. Run all tests. If any fail, go back to step 3.c.
e. Clean up your code and repeat until all requirements listed in step one are being tested for.

4. Take a break.

I believe in Occam's razor. My entire development process can be described in 3 steps. When you actually write your program, remember that: if there are two ways to accomplish something, the simplest one is the best one. Keep programs as simple as you can; modular programming really helps with achieving this.

Alright, I'm probably going to post again later today, but that will be after steps 1 and 2. Time to go ponder in a big cushy chair away from my computer, with nothing but a pencil and paper to program with. ^_^

Monday, January 2, 2012

Test Driven Development

Test-driven development, or TDD. It's hailed as one of the best approaches to programming; not many people truly enjoy doing it, but the results can't be argued with.

I've shifted gears from Delta for a little bit. I'm working on a project I've been making for a prof at school: Ceranubis, a distributed processing program. I've tried several times to do test-driven development for it, but I've never seemed to get it to work the way I want. Ceranubis deals a lot with networking, and it is hard to make a test class for a class that deals with networking. That's when what should have been obvious finally struck me: test-driven development should be used in certain cases, and not in others.

Let's reason through this idea. What does test-driven development do?

1. Accuracy first and always: Test-driven development lets programmers think about the edge cases of a program only once. You program the whole possible range of input for a class into the test class, then work out by hand what the results /should/ be. When you know what they should be, you can run your test class and see if the class you're testing gives the same results. This ensures program accuracy under all conditions. If you go back and change something and it breaks one of the tests, you know right away that you broke something, and you won't have to spend hours later trying to find the bug. This improves code maintainability and stability during growth.

2. Clarity of design: TDD forces programmers to think at least once about all the edge cases of a program. This means that when the programmer actually goes about writing the program, he has a much clearer idea of what the class is supposed to do. I found that when I was doing the TDD my data structures teacher required, I would look at the provided tests to get an idea of what I was supposed to be doing.

These are all good things, but what does TDD not do?

1. Test large systems: Could you make unit tests for an entire application? Not likely. TDD is limited to testing the functionality of base classes. High-level classes and the entire application tend to have far too many variables to be accounted for in a test; trying to do so leads to huge, overbearing tests that take far too much time to program, and you'd be better off debugging by hand.

2. Test interfaces to lower-level systems: This, again, is mostly due to the large number of variables, but there's more. This is the one that really got me: how do you make a test for a class that interfaces with networking or threading? I'm not saying it can't be done, but the complexities are daunting. Aside from that, most of the problems that could occur in such a class you'll probably be handling at run time anyway. If the thread fails to start, or the network socket fails to bind, those are probably not logic errors so much as things like "the port is in use". That is exactly one of my difficulties with doing TDD on my network class: everything that could go wrong with the class is handled at run time. And you say, "Well, don't you want to know that the exception handling is going to work correctly?" Yes, of course I do. What I don't want to do is spend large amounts of my time writing a test that recreates the entire environment of my system and then ensuring that the test itself is accurate. I know you should spend time writing tests; they are supposed to be about 50% of your development time. But that is just a bit ridiculous. I would be spending way more time than debugging by hand would take, and this is one case where debugging by hand might even get more accurate results.

What are the pros of TDD?:
1. Dependable code
2. Clarity of design
3. Makes it easier for more than one programmer to work together.
4. Catches most edge cases

What are the cons of TDD?:
1. Extra time spent writing tests.
2. You have to know your design before you can write tests.
3. Passing lots of tests may give you a false sense of security.
4. Should the test writer miss an edge case, you lose your dependability.

So where should TDD be used?:
Low-level systems and base classes: things that are entirely self-contained, such as data structures, parsers, and the like.

Where should TDD not be used?:
High-level systems and entire programs: things that have lots of variables and a lot of interconnectivity.