A common sight in the enterprise world is the "implementation" of new technology using the same methods as the old technology.

It happens when teams do things "the old way" with the new tech. One piece of "santa tech" that has recently been abused is the term "REST". Many people rewrite their SOAP services and call the result "REST" without taking the time to understand the new technology's paradigm. What's left are long URLs with actions embedded in them, and GET requests that modify data.

Let me tell you a story about a codebase that went through a “nightmare before christmas” situation.

The Nightmare

The team was elated. Another successful feature added to their ever more feature-full codebase. As always, Jack, their lead architect, had led them to another wonderful success. Because they wanted their code to be "efficient", it was all still in assembly. The team had to pull off 80-hour weeks to finish the change on time, but they did.

Let's set this in the early '80s.

Late one night, Jack ran across ANSI C. It performed comparably to their assembly code, but its productivity gains were astonishing! No longer would developers have to worry about which registers held what. They could leave the pushing and popping to the compilers (and Madonna). Maybe those 80-hour weeks could fade back down to 40, and they could still compete with the other products in their space.

Jack set out on his mission to request a migration to this new language. Given his reputation, management, although initially fearful, was quickly behind him. Jack could do no wrong, and this seemed like the right thing to do. Several developers were assigned to port the existing code over to the new technology. This had to be simple work, right? They just wanted the exact same functionality in a new language.

When the team of assembly programmers was assigned the migration task, they realized that "C can be a lot like assembly". You can take the data segment and turn it into variable declarations at the top of the module. All the subroutines that were defined in assembly can then be written as void functions that act on those variables. It's almost a direct conversion!

They started by migrating a simple module from this assembly code:

global      _main
extern      populate_accounts
extern      _printf

add_extra_account:
    add     eax, 4
    ret

find_account_by_x:
    mov     ebx, dword[eax + accounts]
    ret

_main:
    mov     eax, 0

    call    populate_accounts
    call    add_extra_account
    call    find_account_by_x

    sub     esp, 12
    mov     dword[esp], msg
    mov     dword[esp + 4], ebx
    call    _printf
    add     esp, 12

    mov     eax, 0
    ret

section     .data
msg         db      '%d', 0xa, 0

section     .bss
accounts    resd    30

Into C:

#include <stdio.h>

int ax;
int accounts[30];
int account;
char * msg = "%d\n";

extern void populate_accounts();

void add_extra_account() {
    ax++;
}

void find_account_by_x() {
    account = accounts[ax];
}

int main (int argc, char ** argv) {
    ax = 0;
    populate_accounts();
    add_extra_account();
    find_account_by_x();
    printf(msg,account);
    return 0;
}

That didn’t seem so bad!

The team began to work, carefully creating global variables for all the things that were in the data segment before, and meticulously crafting functions that do exactly what the assembly had done before. And it worked exactly as before!

Management was excited! The team finished the migration ahead of schedule, and now they were working with a state-of-the-art language! But something seemed wrong: the 80-hour weeks came right back whenever they added new features to the code.

As the team became more familiar with C, there were measurable productivity gains in newer modules, but whenever a change touched this original code, timelines expanded to what they had been before the migration. Oh well; they blamed it on "legacy code" and hoped for a rewrite.

Learn your Paradigm

In the story above, we can see how the essence of C is lost. All that's left is the same spaghetti that the new programming language was supposed to save these developers from in the first place. I'm not saying this migration wasn't a step in the right direction; it saved them from a lot of tedious stack management. What it didn't do is make the final code any more readable.

It's also a lie. This isn't C code; it's assembly code pretending to be C. Any C programmer is going to feel a little pang of sadness every time they have to touch it.
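For contrast, here is a sketch of what a more idiomatic translation might have looked like, with state flowing through parameters and return values instead of globals (the signatures below are my own invention, not the team's actual code):

#include <stdio.h>

#define ACCOUNT_COUNT 30

/* Assumed to be defined elsewhere, just like in the original example. */
extern void populate_accounts(int accounts[], int count);

/* Data flows through parameters and return values, not globals. */
static int find_account(const int accounts[], int index) {
    return accounts[index];
}

int main(void) {
    int accounts[ACCOUNT_COUNT];
    populate_accounts(accounts, ACCOUNT_COUNT);

    /* The old add_extra_account was just a bump of a global index. */
    printf("%d\n", find_account(accounts, 1));
    return 0;
}

Nothing revolutionary, but now every function states what it needs and what it returns, and the spaghetti has nowhere to hide.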

The Dalai Lama once said, "Know the rules well, so you can break them effectively." I used to think learning the rules first was a waste of time, but that was before I had the opportunity to be a victim of those who decided not to follow them; not out of malice, but out of ignorance. Trying to understand what is happening in a sizable codebase written in the subset of C shown above is just as hard as trying to understand the original assembly code.

It's easy to spot these problems from the outside, but we who make these sorts of changes genuinely believe that we are doing things the right way. And we are, based on what we know of the world. It's easy to ignore what others have done; we are so much smarter than them, anyway.

Save some time: slow down and learn about your new technology before integrating it into your current worldview. The results will pleasantly surprise you as your world expands.

The following are instructions for minimizing SD card writes on the Raspberry Pi's "Raspbian" distribution.

If you’re like me, you’ve run into a corrupted SD card too many times to not become hell-bent on making it never happen again. I have the following setup, and it seems to be working well for me.

The biggest offender for filesystem writes on any Linux system is logging. If you are like me, you don't really look at /var/log after a reboot anyway. This area, and /var/run, where lock files, pid files, and other "stuff" show up, are the most common spots for corruption. Take a look at the blinking filesystem-activity light on the board; our goal is to make that light stay off as long as possible.

Set up tmpfs mounts for the worst offenders (and a few other tweaks)

Linux has the concept of an in-memory filesystem. If you write files to an in-memory filesystem, they only exist in memory and are never written to disk. There are two common mount types you can use here: ramfs, which will continue to eat memory until your system locks up (bad), and tmpfs, which sets a hard upper limit on the mount's size but will swap things out if memory gets low (bad for a Raspberry Pi; you will probably be hard-stopping your device if it is low on memory).
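If you want to see tmpfs in action before touching /etc/fstab, you can create a throwaway mount by hand (the mount point here is just an example):

sudo mkdir -p /mnt/tmptest
sudo mount -t tmpfs -o size=1M,noatime tmpfs /mnt/tmptest
df -h /mnt/tmptest
sudo umount /mnt/tmptest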

We will first solve the usual corruption culprit and then move on to making sure we are covered when our programs decide to blow up.

The following two lines should be added to /etc/fstab:

none        /var/run        tmpfs   size=1M,noatime         0   0
none        /var/log        tmpfs   size=1M,noatime         0   0

There's more, however. By default, Linux also records when each file was last accessed. That means every time you read a file, the SD card gets written to. That is no good! Luckily, you can specify the "noatime" option to disable this filesystem feature. I use this flag generously.

Also, for good measure, I set /boot to read-only. There's really no need to update it regularly, and you can come back here, change the option to "defaults", and reboot when you need to do something.
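When that time comes, you don't even have to edit /etc/fstab for a one-off change; a temporary remount works:

sudo mount -o remount,rw /boot
# ... make your changes ...
sudo mount -o remount,ro /boot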

After this, /etc/fstab should look as follows:

proc            /proc               proc    defaults                    0   0
/dev/mmcblk0p1  /boot               vfat    ro,noatime                  0   2
/dev/mmcblk0p2  /                   ext4    defaults,noatime            0   1
none            /var/run            tmpfs   size=1M,noatime             0   0
none            /var/log            tmpfs   size=1M,noatime             0   0

Go ahead and reboot now to see things come up. Check the filesystem-activity light on your Raspberry Pi after it has fully booted. You should see no blinking at all.
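You can also confirm the new mounts from a shell:

mount | grep tmpfs
df -h /var/run /var/log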

Disable swapping

As a note: since making the changes above, I have not corrupted an SD card. I'm not saying I've tried very hard to break one, but things are much better, even with power-plug pulls, which I tried a few times after making these changes.

An additional, optional, but potentially "I'm glad I did that" protection against SD card corruption is to disable swapping.

The Raspberry Pi uses dphys-swapfile to control swapping. It dynamically creates a swap file based on the available RAM. This tool needs to be used to turn off swap, and then it needs to be removed from startup.

Run the following commands to disable swapping forever on your system:

sudo dphys-swapfile swapoff
sudo dphys-swapfile uninstall
sudo update-rc.d dphys-swapfile remove

After doing this, call free -m in order to see your memory usage:

pi@raspberrypi ~ $ free -m
             total       used       free     shared    buffers     cached
Mem:           438         59        378          0          9         27
-/+ buffers/cache:         22        416
Swap:            0          0          0

If you reboot and run free -m again, you should still see Swap at 0. Now we don't have to worry about our tmpfs filesystems swapping out to the SD card!

Housekeeping!

As you go on and install other tools and frameworks on your Raspberry Pi (like ROS), you need to be aware of where caches and logfiles are written. If, for example, you use ROS, the nodes will by default log to ~/.ros/log/. For these sorts of things, and for ROS in particular, point the logs at a folder on /dev/shm. This mount is created by default on Linux boxes and is a tmpfs filesystem that is globally writable; mine has 88 MB.
For example, pointing ROS logs there can be done by adding the following to your .bashrc:

export ROS_LOG_DIR=/dev/shm/rosLogs
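Since /dev/shm is emptied on every boot, you may also want to create the folder in the same place (whether this is strictly needed depends on how your ROS version handles a missing log directory):

mkdir -p /dev/shm/rosLogs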

If you feel like creating more mounts, feel free, but I have run into ownership issues that required another startup script, run as root, to chown directories to their proper owners.

Getting to 100%

The only way to fully protect against SD card corruption is to mount your root filesystem read-only. For me, that was too much of a usability hit; I edit files and install new packages too often for it to be feasible. The steps listed above should, however, cover you 98% of the time. Try not to pull power while you're editing files or installing new packages on your device, and you should be fine. Still, make backup images of your SD card every once in a while! That is a best practice no matter what.

Happy Hacking!

I have heard the word "Markov" more than a few times, but only recently have I come to appreciate exactly what this simplification buys. Markov was a mathematician who liked to model processes using only the current state. This simplification is often appropriate and offers a reasonably accurate approximation, depending on the problem at hand. It will at least be more accurate than doing nothing, and it is a common computer science tool for modeling.

For example:

[Figure: A Markov chain for a Black Friday shopper]

In the above example, the links represent the probability of a transition to a new state given the current state. The “Markov Property” is seen by how the probability of any transition is only dependent on the current state. It’s a pretty straightforward idea that I’ll leave to Wikipedia to better explain here.

Example: Dealing with Temporal Uncertainty

We will explore how the "Markov Property" simplifies the calculation of time-based processes. Below is an example showing how to tell whether a Black Friday shopper is ready to check out. For this example, we only have one "evidence variable": we can see whether or not the cart is full. The "Markov Assumption" we will make is that the current state depends only on the previous state. The state "cart is full" is represented by C, and "ready to leave" by R. We will also assume that whether the cart is currently full does not depend on information from the previous state (suspend disbelief).

[Figure: When is the shopper ready to leave?]

The graph above basically says: "If their cart is full, they probably want to leave; otherwise, they probably don't. If they wanted to leave the last time we checked, they probably still want to leave, but maybe not." The nerd name for what we have drawn is a "Temporal Bayesian Network", and it bakes in the "Markov Assumption" that the current state is only affected by the state before it. Without that assumption, we would have to take all previous states into account when calculating each new one. Let's do an example now!
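Since the figures here are images, the numbers they encode are worth restating (these are recovered from the calculations and code below):

P(R_t | R_{t-1} ready)     = .8
P(R_t | R_{t-1} not ready) = .3
P(R_t | C_t full)          = .9
P(R_t | C_t not full)      = .4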

Below is a table of evidence for a Black Friday shopper. We are going to calculate the probability that the shopper is ready to go at any given time.

Time    Cart
6:00    Not Full
6:05    Full
6:10    Not Full
6:15    Full
6:20    Full

And that’s all the data we have. Let’s calculate the first time probability by hand.

For the first “ready to go” calculation, you have to make up a previous “steady state” probability about whether or not the shopper is ready to go. For this example, we’ll assume it’s .5. If any of these terms are weird, don’t worry, I’ll explain them afterwards.

Step 1: Calculate P(R_t | R_{t-1})

\begin{array}{lcl}
P(R_t|R_{t-1}) & = & \alpha \cdot \sum_{r \in R_{t-1}} P(R_t|r)\,P(r) \\
               & = & \alpha \cdot \langle .8 \cdot .5 + .3 \cdot .5,\; .2 \cdot .5 + .7 \cdot .5 \rangle \\
               & = & \alpha \cdot \langle .55,\; .45 \rangle \\
               & = & \langle .55,\; .45 \rangle
\end{array}

For this calculation, we sum the different probabilities of R_t given each possibility for R_{t-1}. Because each of those is weighted by the likelihood of R_{t-1}, that probability is multiplied in; this is called the chain rule. The α character you see re-weights the true/false probabilities so they sum to 1. Here α = 1/(.55 + .45) = 1, so it wasn't necessary for this example, but it usually is.

Also, the step we have done above is called "filtering": calculating the present state based on past information.

Now that we have accounted for the past in the current probability, it’s time to account for the current evidence available to us.

Step 2: Calculate P(R_t | C_t, R_{t-1})

\begin{array}{lcl}
P(R_t|\neg C_t, R_{t-1}) & = & \alpha \cdot P(R_t|\neg C_t) \cdot P(R_t|R_{t-1}) \\
                         & = & \alpha \cdot \langle .55 \cdot .4,\; .45 \cdot .6 \rangle \\
                         & = & \alpha \cdot \langle .22,\; .27 \rangle \\
                         & = & \langle .44898,\; .55102 \rangle
\end{array}

This was done assuming that the state of the cart at time step t is independent of whether the person was ready to go at time step t − 1. Since the two variables are independent, the probability of both occurring is simply their product.

Let's go back and add another evidence variable, this one dependent on the person's state instead of the other way around. We will say that a person who wants to leave is probably holding a box of cigarettes. According to PBS, 20 percent of Americans smoke. If we assume those who are ready to leave make it obvious by holding out a box of cigarettes, and that 25% of the time they end up holding the box out for no apparent reason, that creates the following network (we will use B to represent a visible box of cigarettes):

A more interesting example

[Figure: When is the shopper ready to leave? (with cigarette evidence)]
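Again, since the figure is an image, here are the evidence numbers it encodes (.2 for smokers who are ready to leave, and .2 × .25 = .05 for a box held out for no reason):

P(B_t | ready)     = .2
P(B_t | not ready) = .05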

And here’s an updated version of our first chart.

Time    Cart        Cigarettes Visible
6:00    Not Full    No
6:05    Full        No
6:10    Not Full    No
6:15    Full        Yes
6:20    Full        No

Thanks to everything being independent of everything else, we just need to fold one more factor into the answer we already have:

Step 3: Calculate P(R_t | B_t)

\begin{array}{lcl}
P(R_t|\neg B_t) & = & \alpha \cdot \sum_{r \in R_t} P(\neg B_t|r)\,P(r) \\
                & = & \alpha \cdot \langle .8 \cdot .44898,\; .95 \cdot .55102 \rangle \\
                & = & \alpha \cdot \langle .35918,\; .52347 \rangle \\
                & = & \langle .40694,\; .59306 \rangle
\end{array}

So far, it looks like this person is more likely to not be ready to go after one time step.

Coding it up

For dealing with temporal networks, we can use the Hidden Markov Model to convert all of these calculations into matrix operations. Below is what I believe is a correct implementation of the calculation above in Octave (it should also run in Matlab).

% An Octave example: filtering as matrix operations.
% Each row of Evidence is one time step: [cartFull, cigarettesVisible],
% where 0 = false and 1 = true.
function Rt = markovNet ( Evidence )
    Rt = [.5 .5];                   % prior: <P(ready), P(not ready)>
    Transition = [ .8 .2 ; .3 .7 ]; % P(R_t | R_{t-1})
    Cart = {[.4 0; 0 .6], [.9 0; 0 .1]};          % evidence weights {F, T}
    Cigarettes = {[.8 0; 0 .95], [.2 0; 0 .05]};  % evidence weights {F, T}

    % go through each time step (each column of the transposed matrix).
    for t = Evidence'
        Rt = Rt * Transition;            % account for the passage of time
        Rt = Rt * Cart{t(1) + 1};        % weigh in the cart evidence
        Rt = Rt * Cigarettes{t(2) + 1};  % weigh in the cigarette evidence
        Rt = Rt ./ sum(Rt);              % re-normalize (the alpha step)
    end
end

markovNet([ 0 0; 1 0; 0 0; 1 1; 1 0])

Assuming I didn't do something wrong in the code above, the probability that the person is ready to leave at 6:20 is 96.6%.

Conclusion

If you ever need to model a process probabilistically over time, the Markov Assumption can considerably simplify the model. It is not a 100% solution, but most of the time it can get you close. Very close.

RequireJS is an awesome library that lets you write explicitly modular JavaScript code, something most other languages have had right out of the box. Unfortunately, because there is such a thing as history, a lot of codebases carry a large number of globals. In some cases, a third-party in-house CDN may deliver such things, including a customized version of jQuery that other modules then depend on.

Requirements

To be sure this hack isn't used when it isn't necessary, make sure your situation meets the following requirements, all of them at that:

  1. You don't get to touch this file.
    • I mean, you don't even get to download this file. If the file lives in your codebase, break it up and shim it in like any other non-AMD library.
  2. You have other files that depend on it that are also being shimmed in.
    • This is necessary. If you don't have this problem, you can once again use a more traditional shim approach. Create stub files that depend on this one if you need to grab more than one global, or if you can't use the shim's init function to package them all up into one variable. This technique is necessary because a shimmed file will be loaded BEFORE a CDN file, meaning whatever the CDN delivered will not be in the global space when the other shimmed file loads.

Shimming in a Lie

Sometimes things just seem to not fit. For this situation, I suggest that you explicitly include the library in your HTML, and then create stub files that grab the particular globals you care about.

Because a shim has access to all global variables, the ones you just inherited are also available. To trick RequireJS into loading these, you will need several fake stub files: one per global that you want to bring in this way. I want to emphasize that this is a last resort, and should only be used when you have no control over the file delivering these globals to you.

To deal with a CDN-delivered dependency of another library that needs to be shimmed, the CDN file just needs to live outside of RequireJS. Explicitly load the script in your HTML. Shed a tear. Move on.

Now that the file is loaded, create empty stub files (one per global) and use them as shim dependencies. Note: if you try to use the same file for all globals, even if you specify different "virtual paths", you will run into RequireJS load errors.
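As a sketch (the file names and CDN URL here are made up for illustration), the explicit include and one of the stubs might look like this:

<!-- index.html: the CDN bundle loads outside of RequireJS, before require.js -->
<script src="https://cdn.example.com/internal-bundle.min.js"></script>
<script data-main="scripts/main" src="scripts/require.js"></script>

// path/to/stub1.js -- intentionally empty.
// The real code was already loaded by the script tag above; the shim's
// "exports" setting picks the global up off of window.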

Your config should somewhat resemble what is below:

requirejs.config({
    paths: {
        'jquery' : 'path/to/stub1',
        'weirdlibrary' : 'path/to/stub2'
    },
    shim: {
        'jquery': { exports: 'jQuery' },
        'weirdlibrary': {exports: 'LibraryGlobal'},
        'dependentLibrary': {deps : ['jquery'], exports: 'LibraryThing' }
    }
}); 
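With that in place, dependent modules can require these names as if they were real modules (the consuming module below is hypothetical):

define(['jquery', 'weirdlibrary'], function ($, LibraryGlobal) {
    // Both values were plucked out of the global space by the shim's exports.
    return {
        start: function () {
            $('body').addClass('ready');
        }
    };
});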

Conclusion

Congratulations: you can now completely avoid dealing with whatever team is responsible for that code. This method is a hack of a solution, but it does allow you to adopt this awesome dependency framework more quickly.

After a considerable amount of trial and error, I have discovered the following setup steps for enabling RequireJS in a Maven project.

Integrating Maven

The requirejs maven plugin world has been marked by two packages: brew and requirejs-maven-plugin. Brew takes the approach of "we'll do everything and try to support the entire config file format in our pom." This has the clear disadvantage of depending on its maintainers to match 100% of what the latest r.js and application configuration can do. For a rapidly evolving library, that is no easy feat, and, as expected, brew ends up falling short.

On the flip side, requirejs-maven-plugin offers a lightweight maven "integration point", allowing you to bring your own version of r.js along with a separate application configuration. The plugin is developed in the open on the author's GitHub.

Here is a very close approximation of the plugin configuration I used to get this working:

<plugins>
  <plugin>
    <groupId>com.github.mcheely</groupId>
    <artifactId>requirejs-maven-plugin</artifactId>
    <version>1.0.3</version>
    <executions>
      <execution>
        <goals>
          <goal>optimize</goal>
        </goals>
      </execution>
    </executions>
    <configuration>
      <configFile>${basedir}/src/main/config/app.config.js</configFile>
      <optimizerFile>${basedir}/src/main/scripts/r.js</optimizerFile>
      <filterConfig>true</filterConfig>
    </configuration>
  </plugin>
</plugins>

Mavenizing your app config

Because of the filterConfig setting above, your application configuration will be filtered by maven before it is run.

This lets you use the common variables you are used to having in a configuration file. Below is a potential setup:

({
    appDir: "${basedir}/src/main/js/",
    baseUrl: "./",
    dir: "${project.build.directory}/${project.build.finalName}/js",
    modules: [
        {
            name: "main"
        }
    ]
})
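With the execution wired up as above, the optimizer should run as part of a normal build; I would expect the usual invocation to be enough (the exact lifecycle phase is whatever the plugin binds to by default, so treat this as an assumption):

mvn package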

Conclusion

The modularity advantages of requirejs can exist inside your maven projects after all! This solution does not exactly follow the "maven way" of piling verbose configuration into your pom.xml, but I think you will survive. If you have gone so far as to mavenize your javascript dependencies, the plugin states that its default behavior is to look for r.js on the classpath, so I assume that would work as well.