Foundational Knowledge for a Software Developer

In conversations with people who are much smarter about education than I am, one idea that came up was the notion of foundational knowledge. To define “foundational knowledge” by analogy, think of learning basic arithmetic. First you learn addition, then subtraction, then multiplication, and finally you move on to division. I can’t claim to understand the underlying pedagogical theories, but it seems to me that if a child hasn’t mastered addition there’s little point in him or her moving on to subtraction and so on. And you sure aren’t ready for algebra if you can’t deal with basic arithmetic.

In the same way, I’ve often thought there should be some sort of basic curriculum for software developers–some sort of foundational knowledge which must be mastered before one can progress to harder subjects. I’ve seen a lot of places teaching “Principles of Object-Oriented Programming” while never concerning themselves with whether a developer can competently write a loop. To me, being able to write a loop is the developer equivalent of mastering addition. (Feel free to insert charges of “gatekeeping” here.)

Now I know there are some folks reading this who will say that many developers will never need to concern themselves with writing loops. And that’s probably a fair criticism. However, some knowledge is foundational to other knowledge. Could a student of mathematics understand the notion of a summation if he or she didn’t understand the basic notion of addition?

Even beyond the question of certain pieces of knowledge being “foundational,” there’s the question of thinking in a computational way. While human beings are capable of leaps of intuition (i.e. arriving at an answer without having to process all the intervening steps), computers are not. In order to program a computer we need to tell it all the steps needed to reach a solution. In fact, it’s the very human trait of glossing over non-essential details that makes writing software hard.

Let’s make this more concrete. If I said to you, “I need a program to read a file of totals and find the largest dollar amount in the file,” and you don’t think of using a loop (or some language library that contains looping logic underneath), you’re doing it wrong.

The pseudocode for such a process would look something like this:

Open file
Set largest amount to zero
:start
Read line from file
Parse dollar amount from line
If dollar amount is greater than largest amount
    set largest amount to dollar amount
If next line is available go to :start
Close file

Now I would need to translate that into language-specific calls, but that’s essentially what I would need to do. Do I care about a File class (in the classic OOP sense)? Well, depending on the language, I may. Do I care how I store the dollar amount (i.e. integer or float)? Again, depending on the language and the problem, I may. But the most important thing is to think in terms of a loop to solve the problem.
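Just to show what that translation might look like, here’s a minimal C# sketch of the same loop–the file name and the assumption of one dollar amount per line are mine, and it deliberately does no error handling, just like the pseudocode:

```csharp
using System;
using System.Globalization;
using System.IO;

class LargestAmount
{
    // Find the largest dollar amount in a file of totals, one amount per line.
    public static decimal FindLargest(string path)
    {
        decimal largest = 0m;                          // Set largest amount to zero
        using (var reader = new StreamReader(path))    // Open file
        {
            string line;
            while ((line = reader.ReadLine()) != null) // Read line; repeat while lines remain
            {
                // Parse dollar amount from line (stripping a leading "$" if present)
                decimal amount = decimal.Parse(line.TrimStart('$'), CultureInfo.InvariantCulture);
                if (amount > largest)                  // If greater than the stored amount...
                    largest = amount;                  // ...store the new amount
            }
        }                                              // Close file (via Dispose)
        return largest;
    }

    static void Main()
    {
        File.WriteAllLines("totals.txt", new[] { "$12.50", "$3.99", "$120.00", "$45.10" });
        Console.WriteLine(FindLargest("totals.txt")); // prints 120.00
    }
}
```

Note that the `using` block and `while` loop are just one idiomatic way to express the “go to :start” of the pseudocode; the shape of the loop is the important part, not the particular calls.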

Of course, my pseudocode doesn’t take into account any potential error conditions; it’s a foundation that other code would need to be built on.

My point is that this basic sort of computational thinking is quite separate and apart from language paradigms (imperative vs. functional, OOP vs. procedural, etc.) and in some ways it’s fundamentally harder to master. If a developer cannot master something this basic, though, he or she is bound to fail eventually, because this sort of thing is going to come up. And even if the language’s library can hide the details from you, you still need to have a notion of what’s going on when you have to debug it.

If you want to be a better developer, first ensure that you’ve mastered these foundational concepts of software development. This is not to diminish the difficulty of learning more advanced concepts; it’s simply to say that, to me anyway, too often developers leap to advanced ideas without mastering basic ones, and their work suffers for it.

Playing By Color

When I was younger we would take our vacations in Arkansas, visiting my mom’s family down there. My grandmother, who was quite a good singer and had quite an intuitive gift for music, had an organ in her living room. I can still remember sitting there and playing with it. I couldn’t (and still can’t) play music, but as with most kids that didn’t stop me from sitting there and making noise. God bless my grandma–she must have had the patience of a saint or three to put up with my noodling on her organ.

At one point, though, I recall finding a colored piece of cardboard in the compartment of the piano bench. It could be used with a special songbook to allow someone who couldn’t read music to pick out a tune: you set the cardboard on the keyboard at a certain position and then followed the colors in the songbook. This seems to be an idea that comes up time and again (as evidenced by this) in teaching music to children.

I was so proud of being able to pick out “When The Saints Go Marching In” and have it sound somewhat like the tune should have sounded.

I would never in a million years claim that I learned very much musically. In fact it’s probably much more akin to some sort of Skinnerian behavioral conditioning (“Ooh–I follow the colors and I get pretty music!”) than real musicianship. But I could pick out a tune.

So this is part of my concern with the whole “everyone should learn to code” school of thought. While I think it’s a fine idea, I’m afraid there are a number of people coming out of these code camps with the software development analog of knowing how to play the organ by following the colors. That is, they can pick out a pleasant tune, but without the cardboard they’re lost.

Don’t get me wrong–everyone who writes software has to start with baby steps. My concern is that we’re not doing enough to encourage new software developers to move beyond the cardboard color guide of software development and truly understand its dynamics. I’ve seen it more than once or twice–new developers who are utterly lost without intellisense. We need to help people move beyond rote regurgitation of what they’ve been told to a deeper, richer understanding of the mechanics of what’s going on. If we’re just teaching people to repeat a series of rote steps, then while we’re giving them skills for a career change, we’re certainly limiting their future prospects in their new career. We owe them better than that.

RIP grandma Ruby and thank you for the clarity to sometimes see through the superficial.

Intellisense Considered Harmful To Novice Developers

(With apologies to Edsger Dijkstra)

I’ve suspected for a while that many of these coding boot camps and other “learn coding in an accelerated way” programs are turning out people who know just enough to be dangerous. I say this not to be mean or to denigrate these recent graduates; I’ve met a lot of them and they’re good, hard-working people. They’re genuinely trying to make a big career change, and that’s always tough. But they come out of these boot camps with a superficial understanding of software development. Since I was training novice developers for a while, I often tried to figure out better ways to train people to be software developers.

So lately I’ve been reading Felienne Hermans’ The Programmer’s Brain via the Manning Early Access Program. By the way, full disclosure, I am fortunate enough to count myself among Dr. Hermans’ acquaintances. Brilliant person, and doing a lot to help us understand how to teach software development. One thing she mentions in her book is retrieval practice, a term I have heard in other pedagogical literature. It basically means trying to remember some piece of information without reference to a memory aid. The more you try to retrieve something without a memory aid, the better it gets lodged into your long-term memory. And when things get embedded in your long-term memory you start to see more patterns–that is, roughly speaking, the better your long-term memory of things, the more likely you are to be able to cope with novel situations. I cannot recall the technical discussions I’ve seen on this, but as you learn a subject more deeply you develop something akin to an intuition about how things should work.

Dr. Hermans’ mention of retrieval practice finally helped me to articulate something that had been a half-formed thought in my mind for a while. Intellisense and similar technologies–while they are a boon for experienced developers–very likely hinder the learning of novices.

This is why I’m starting to consider intellisense harmful–at least for novices trying to learn coding. Because they’re never forced to try to retrieve syntax from memory, it doesn’t stick in their long-term memory as readily as it did in the days before intellisense. When you get immediate feedback that you’ve typed syntax wrong, and you can monkey around with your code until the red squigglies go away, you’re never forced to recall syntax. Or at least you can delay committing syntax to long-term memory much longer.

I think we need to seriously reconsider starting novices with Eclipse and Visual Studio. Rather than helping us to turn out more developers, I think this is mainly creating novices who will stay at the novice level much longer than they need to.

Empathy and the Other Side of Impostor Syndrome

I can’t be sure–I don’t know if there’s any formal diagnostic that a mental health professional can perform to diagnose impostor syndrome (assuming it’s even a recognized mental health concern). So I won’t say that I suffer from impostor syndrome, but it’d be hard for me to imagine that I have never felt its effects at any point in my life.

I was doing training for the last couple of years. The VP of the Training Team–a phenomenal trainer and a pretty good judge of people–said to me that one thing I could do better as a trainer is to be learner-focused. That is, to do more to put myself in the learner’s place and thus figure out how to make it easier for them to learn what I was trying to teach. And he’s quite correct–I do have a hard time understanding things from the perspective of someone else.

This Can’t Be That Hard!?!

One of the causes of impostor syndrome as I understand it is the feeling “it’s just not that hard”. I mean I can recall taking differential calculus and thanks to some excellent instructors it all made sense and it wasn’t that hard to understand. I can recall taking integral calculus and that, likewise, didn’t seem that difficult to understand till we got to a point where they told us to start memorizing tricks to solve certain integrals. That’s where they lost me. But again, due to great instructors, the preliminaries–the idea of a series of polygons under a curve and the sums of the areas of those polygons–well it just seemed obvious.

I’m not saying that it’s not tough. I’m not saying that I’m that smart. What I am saying is that when the material seems obvious in some sense of that word, it’s really tough to understand why it isn’t obvious to everyone. Well, of course, as the width of the polygons under the curve approaches zero the approximation of the area under the curve will become more accurate. Well, duh!

To me this is the other side of impostor syndrome. That inability to understand how the material could be difficult for anyone to learn or master. That feeling of, well, of course thus and so is true–this can’t be that hard, right?!

I’ve had people tell me that they think I’m a good trainer. I have my doubts. Considering that I would ask people about things I’d demonstrated to them just a few weeks before and they’d look at me like I’d grown a third eye, I think I was not a good trainer. I guess time will be the ultimate judge of that. But not being able to empathize with people you’re training because the material just doesn’t seem that difficult to understand doesn’t help the matter.

I’d like to flatter myself and think that I was good at explaining things to my learners. But the truth of the matter is that no matter how I approached the materials I was trying to teach (software development and some very basic CS), it’s tough material to master. If it seems obvious to me, that’s because I’ve had years of making mistakes and internalizing the explanations for things.

Software Development And The False Dilemma

So I started my morning, as I often do, by looking at Twitter. It’s better than a demitasse of espresso in terms of raising my blood pressure.

One thing I see over and over again on Twitter is the rather common false dilemma logical fallacy. And this fallacy seems to be in play a lot in software development.

We Can’t Test All Of It!

This is one of those old saws I’ve heard from multiple developers over multiple years. It goes something like “Well we don’t have time to test thoroughly . . . ” the unstated implication being that if we can’t test “thoroughly” (for whatever definition of thorough that person has in mind) then there’s not much point in testing at all.

This is, of course, a false dilemma. There’s quite a bit of value in testing even 10% of a system, because that’s 10% more of a chance of finding bugs before your code gets into the hands of your customers.

Now testing costs money. It takes time. But that has to be weighed against the loss of time for customers and the loss of trust. I think the loss of trust is an even larger (if much more difficult to measure) cost than the loss of time. Once a customer stops trusting that your software is mostly right, it’s very hard to regain their trust after the fact. A hasty generalization occurs–one crash makes the entire application suspect from there on out.

I Tested And Bugs Still Got By!

This is yet another form of the false dilemma. Developers will, sometimes very grudgingly, write unit tests against their code. Then after several iterations they may find a bug. “Well I guess all those unit tests were useless! They didn’t catch this one bug!” And there’s a great excuse not to do the work of writing unit tests any longer, right?

Of course this is a false dilemma. Unit tests certainly won’t eliminate the possibility of all bugs. For one thing, the code doesn’t stay static after we write our unit tests. For another, it’s human nature to miss details that need to be checked. Neither of these factors diminishes the value of having a good suite of unit tests. If you find bugs after the fact, well, add another unit test to catch that bug in the future and congratulate yourself on being a tiny bit wiser about potential bugs in your code!
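To make that concrete, here’s a minimal sketch of the idea in C#. The ParseTotal helper and the bug it once had are entirely hypothetical, and I’m using plain assertions rather than any particular test framework:

```csharp
using System;
using System.Globalization;

static class MoneyParser
{
    // Hypothetical helper under test: parse a "$1,234.56"-style total.
    // NumberStyles.Number allows thousands separators--the fix for the
    // imagined bug pinned down by the regression test below.
    public static decimal ParseTotal(string raw) =>
        decimal.Parse(raw.TrimStart('$'), NumberStyles.Number, CultureInfo.InvariantCulture);
}

static class MoneyParserTests
{
    static void AssertEqual(decimal expected, decimal actual, string name)
    {
        if (expected != actual)
            throw new Exception($"{name}: expected {expected}, got {actual}");
    }

    public static void Main()
    {
        // Tests written alongside the original code.
        AssertEqual(3.99m, MoneyParser.ParseTotal("$3.99"), "simple amount");

        // Regression test added after a bug report: totals with thousands
        // separators used to throw. Now the suite is a tiny bit wiser.
        AssertEqual(1234.56m, MoneyParser.ParseTotal("$1,234.56"), "thousands separator");

        Console.WriteLine("All tests passed");
    }
}
```

The second assertion is the point: once a bug escapes, it becomes a permanent test case, and that exact mistake can never sneak back in unnoticed.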

The Comments Lie To You!

Another common false dilemma is the notion that once comments get out of date there’s no point in updating them. “Well,” the reasoning goes, “this one comment is obviously incorrect so I won’t bother to read any of them. And there’s no point in correcting the comment either!”

I guess as I put more miles under my tires, I come more and more to appreciate the developer who, rather than leaving garbage, does his or her part to spruce up the code. Maybe you can’t fix the entire code base, but that doesn’t mean there’s no value in fixing the part of the code base you’re currently working on.

We Can’t Verify All Of It!

One area I’ve been focusing on lately is the idea of formal verification of software. I’ve been playing with Coq, and I have found an excellent tutorial on how to use it to formally prove assertions about code. But formal verification seems to invite one of those false dilemmas that developers feed themselves.

“Hey Onorio, you can’t formally verify the correctness of all of your code so why bother with any of it?” And that’s true. For all but the most trivial applications a full formal verification is pretty expensive and time consuming. But that doesn’t mean there’s no case where some formal verification can help build stronger software.

Pretend you’re dropped in the middle of a landscape you’ve never seen before. You’re offered a map, but the map, intentionally, has a lot of details omitted. Would you rather have the incomplete map or nothing? I don’t know about you, but I’d consider an incomplete map still better than absolutely no clue about the landscape.

Of course we can’t formally verify entire large software applications–especially applications that interact with human beings. But that doesn’t mean that we cannot verify parts of the application and have some assurance that some parts of our code are formally, provably correct.

I guess if I wanted you to take anything from this little thought ramble, it’s that “I can’t fix everything” is almost never a justification for not trying to fix anything. A single cleanup here and there will eventually add up to a code base that, if not perfect, is at least a lot more livable.

Focus And Productivity

So I can recall something that Jeff Atwood said a few years ago about “The Magpie Developer” and it stuck with me. And I’ve heard Robert Martin (aka “Uncle Bob”) discuss “The Churn,” which seems to be his name for magpie developers constantly seeking shiny new toys. Recently I also spotted an excellent write-up on the idea of “Diseases of the Will” on BrainPickings. It seems like the types described there as the “bibliophile” or “polyglot” could describe a lot of software developers.

But I don’t think it’s quite as simple or clear-cut as many of us would like it to be. Saying “Well, I know <TechnologyX> and therefore I’m going to stick with it because it’s just a tool and any tool can do any job” is not only foolish, it’s counterproductive. It’s somewhat analogous to saying “Well, I’ve learned to use a wrench and so, therefore, I’m going to use this wrench for every mechanical job I need to do everywhere!” You can drive a nail with a wrench, but a hammer makes it a lot easier. So it seems that we need to at least be aware of new ideas in software development even if we decide our best bet is to stick with the technologies we already know.

So how do we balance the need to learn new things against the constraints of our limited time? That’s what I want to address here. I don’t know that I have any answers but I certainly have some suggestions.

Learning How To Learn

One thing that has been impressed upon me again and again is that many of us software developers don’t know the best ways to learn a new subject. For one thing, for a lot of us, a lot of technical topics came somewhat easily up to a point. For myself, I thought I was pretty good at math until I hit integral calculus. I really struggled with it. Partly I wasn’t terribly motivated to learn it at the time, and partly I didn’t have any study habits to speak of because I’d never really needed them. As I am sure people reading this essay will understand, being unable to understand math immediately was one of the worst feelings I’ve ever felt. If I had developed some sort of study habits, I wouldn’t have been so badly thrown. If I had a bit less pride and had admitted to myself that I could stand some tutoring, I wouldn’t have been so badly thrown.

My point is that lifelong learning is an essential part of being a software developer so it’s worth your time to invest a bit in discovering good study habits if you don’t already possess them.

Making Time For Learning

Another essential ingredient of acquiring new skills is setting aside time to do it. I am fortunate enough to work for a company that places a great deal of importance on training its employees. They expect all of us to set aside 5 hours a month solely for learning. Not every company is that enlightened.

Of course the time for learning has to be balanced with time for other things. More importantly, time for learning can become an excuse for avoiding things we don’t want to do so it does make sense to set a time limit on your learning and try your best to stick to that time budget.

What Should I Learn?

This is, to me, one of the essential questions that a technologist must ask herself or himself. Given we only have limited time to acquire new skills and given that we don’t want to waste that most precious commodity, time, what should we focus ourselves on?

One answer I’ve heard trotted out repeatedly is to chase the next big trend. Right now the next big trend seems to be machine learning and artificial intelligence. I’m tempted to put those two terms in quotes because I think they are among some of the worst names for an idea since someone decided to call our genetic material colored bodies (chromosomes). But I digress.

My answer to the question “what should I learn?” is–learn whatever interests you. There will always be some shiny new technology to chase, so don’t spend your life trying to learn the latest fad. In that, I do agree with Uncle Bob.

But even if I say learn whatever interests you, I can improve that recommendation a bit. Learn something that interests you and that fundamentally stretches your mind. Are you the best C# developer at your company? Learn F#. Functional programming is a pretty dramatic paradigm shift from OO. The two are not opposed to each other (as everyone seems to think) but rather complement each other in nice ways. Of course, if you’re really very interested in learning another OO language, that’s fine too. But I think it’s safe to say that one grows more through fundamentally different approaches than through learning new variations on the same theme.

Detroit Tech Watch 1.0

Last year I had several friends inquire if I was going to stage another DetroitDevDay. I had taken over from my dear friend Dave McKinnon and I had run it for a couple of go-rounds. Honestly though, I was getting a bit worried that DetroitDevDay was becoming almost YADLTC (Yet Another Day Long Tech Conference). It’s wonderful to see the explosion of tech focused events here in Detroit and I am truly gratified to see so much going on here but where did DetroitDevDay fit in that picture? That was the question I was struggling to answer.

There are good .Net oriented conferences now. There are good Java oriented events (although truth be told they’re mostly Android/Google events). What could DetroitDevDay do to contribute something different–something worthwhile?

Then I hit on it: I want to talk about what’s coming next, and I want to talk about it with other developers who want to look at what’s coming up to see if there’s value in it. And that’s how Detroit Tech Watch was born.

So why not just stick with C# or Java?

I think a lot of developers are in the position of doing C# or Java because it pays the bills. That’s not a knock on either technology stack; it’s simply become somewhat de rigueur to pick one stack or the other and then use it for everything. But we all know that there was a time before OO and C++ and its progeny. And there will be a time afterward as well. COBOL was once cutting edge–now it’s legacy.

Another interesting (and, to be quite frank, a bit concerning) development is the rise of coding bootcamps and “You too can be a developer!” While this is absolutely terrific in terms of giving people an opportunity to better their lives, it’s going to drive the wages of C# and Java developers down. That’s not a noble or unselfish reason to look at stuff beyond C# and Java, but it is a fact.

At some point, if we don’t pick up our heads and look for what’s coming next, we risk becoming the software developer analog of KMart, Sears, Circuit City, Newsweek Magazine, etc.–that is, disrupted right out of business. The trick is not simply to see what’s coming next. The trick is to spot what’s upcoming that has real value and get on it. And that’s tough.

So we (Mike Onslow, who is really a heck of a smart guy, and I) are going to try DetroitTechWatch 1.0. Even though it is 1.0, it’s really sort of DetroitDevDay 8.0 with a different emphasis. We really hope that if you’re interested in hearing about what’s coming next you’ll join us. Or you’ll submit a talk. Or both!

Anecdotal Evidence

I’ve been having a conversation with some folks on Twitter about their evidence that mob programming makes Brooks’ Law irrelevant. Someone offered evidence, based upon his own experience, that mob programming breaks this law. I hope I’m not mischaracterizing his argument. That is, I don’t think he’s saying that mob programming negates Brooks’ Law–I think he’s saying that multiple programmers can work faster via mob programming.

I had pointed out that while his experience is certainly evidence it’s anecdotal evidence.  This is not to belittle the importance of anecdotal evidence. It is to make a distinction between experimental evidence and anecdotal.  I’ll relate an example that I hope will help to clarify my thinking on this point.

I’ve raised domesticated pigeons for a good portion of my life.  I know lots of folks who race their pigeons (I do not).  I was told that one of the old sachems of pigeon breeding, a gentleman named Joe Quinn, was trying to convince breeders to vaccinate their birds.  He was having a hard time convincing the racers to vaccinate their birds and as a result, flocks were getting hit with easily preventable diseases.  So he hit upon an idea.  He convinced one of the better breeders to let him (Joe) vaccinate his flock at no charge.  The breeder didn’t mind since it was free.  At the end of the season the other fanciers asked the master breeder what was his secret of winning races.  The master breeder said that he did pretty much what everyone else did (that is, training, selective breeding etc.) but he did get his flock vaccinated.  The next season all of the fanciers wanted to get their flock vaccinated.

This is what I mean by anecdotal evidence. Did the vaccinations make a difference?  Of course they did–they protected the birds against preventable diseases. But did the vaccinations make a difference in the races?  That’s more debatable.

Experimental evidence would be vaccinating half of the flock and leaving the other half unvaccinated and then comparing how the birds raced at the end of the season.  Of course everything else would need to be held equal–equal diet, equal training, equal housing etc.  The result would be experimental evidence that the vaccination either made the birds race faster or not.

As it stands however, saying vaccinating the flock made the birds race faster is anecdotal evidence.  Heck we might even be able to prove it had no effect by looking at the race times pre-vaccination and post-vaccination–although there are likely to be other complicating factors present.

The thing is, it’s really hard to do a double-blind study this way when it comes to software development, because there are just too many other factors to try to hold equal. If one team uses mob programming and they’re more productive–well, was it the mob programming that made them more productive, or was it that they had an easier task to accomplish, or was it the fact that the team had long experience together, and so forth.

So we’re mostly stuck with anecdotal evidence when it comes to software development because producing experimental evidence is so difficult.  As I said on Twitter I’m not doubting the anecdotal evidence–I’m just doubting that mob programming is the sole difference.  The fact, all by itself,  that a team is willing to try mob programming probably indicates a team more receptive to trying new ideas and hence possibly more productive for that reason alone.  I’m not doubting that mob programming is beneficial; while I don’t have experimental evidence to prove it, I also believe it’s a good idea.

Grow-Only Set In C#

So I’ve been trying to keep my development skills somewhat sharp by working on coding up certain data structures in C#. Of course I’d rather do this with F# or even some more interesting FP language (Elm, anyone?), but I need to be able to show this code to student developers, and the students I get know C#.

So there are a few things:

1.) I found it hard to find an abstract discussion of these CRDT data structures.  This article on Wikipedia is fine but there’s some math notation there that I find a bit tough to read. I suppose I need to do a bit more digging.

2.) It’s a bit of an impedance mismatch to try to build an immutable, grow-only data structure in an imperative language.  OO especially doesn’t really fit the idea of returning new data (as opposed to mutating in place) as well as it might.  Still, we soldier on as best we can.

Without any further ado here’s my code.

using System;
using System.Collections.Generic;
using System.Linq;

namespace CRDT
{
    public class GrowOnlySet<T>
    {
        private readonly HashSet<T> payload;

        public GrowOnlySet()
        {
            payload = new HashSet<T>();
        }

        public GrowOnlySet(HashSet<T> newstore)
        {
            payload = newstore ?? throw new ArgumentNullException(nameof(newstore));
        }

        public HashSet<T> GetPayload() => new HashSet<T>(payload);

        public GrowOnlySet<T> Add(T element)
        {
            // Copy first so this instance is never mutated in place.
            var next = GetPayload();
            next.Add(element);
            return new GrowOnlySet<T>(next);
        }

        public bool Lookup(T element) => payload.Contains(element);

        public bool Compare(GrowOnlySet<T> other) => other != null && other.GetPayload().IsSubsetOf(payload);

        public GrowOnlySet<T> Merge(GrowOnlySet<T> other)
        {
            if(other == null) throw new ArgumentNullException(nameof(other));
            return new GrowOnlySet<T>(new HashSet<T>(payload.Union(other.GetPayload())));
        }
    }
}
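And since the whole point of a grow-only set is that replicas converge, here’s a quick sketch exercising the merge properties. It carries a condensed copy of the class above (with an Add that copies rather than mutates) so the snippet compiles on its own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Condensed copy of the GrowOnlySet<T> above so this snippet stands alone.
public class GrowOnlySet<T>
{
    private readonly HashSet<T> payload;
    public GrowOnlySet() => payload = new HashSet<T>();
    public GrowOnlySet(HashSet<T> newstore) =>
        payload = newstore ?? throw new ArgumentNullException(nameof(newstore));
    public HashSet<T> GetPayload() => new HashSet<T>(payload);
    public GrowOnlySet<T> Add(T element)
    {
        var next = GetPayload();   // copy first; never mutate in place
        next.Add(element);
        return new GrowOnlySet<T>(next);
    }
    public GrowOnlySet<T> Merge(GrowOnlySet<T> other) =>
        new GrowOnlySet<T>(new HashSet<T>(payload.Union(other.GetPayload())));
}

class GrowOnlySetDemo
{
    static void Main()
    {
        // Two replicas add elements independently...
        var replicaA = new GrowOnlySet<string>().Add("apple").Add("pear");
        var replicaB = new GrowOnlySet<string>().Add("plum");

        // ...and converge to the same state whichever way the merge runs.
        var aThenB = replicaA.Merge(replicaB);
        var bThenA = replicaB.Merge(replicaA);
        Console.WriteLine(aThenB.GetPayload().SetEquals(bThenA.GetPayload())); // True

        // Merging a set with itself changes nothing (idempotence).
        Console.WriteLine(aThenB.Merge(aThenB).GetPayload().SetEquals(aThenB.GetPayload())); // True
    }
}
```

Commutativity, associativity and idempotence of Merge are exactly what make the G-Set a CRDT: replicas can exchange state in any order, any number of times, and still end up identical.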

I would greatly appreciate any comments that folks with a deeper knowledge of CRDT data structures may care to share.