Wednesday, May 12, 2010

Making use of all the space on a hard drive

If you have a big data drive, there's a good chance that Linux is reserving 5% of its available space for "emergencies".

This is necessary for your root filesystem: if the disk fills up, that reserved buffer is what lets root still log in and clean things up, and leaves room for log files etc. to be written.

However, I have a couple of large drives that store only data, and on those 5% is 100GB I'd rather be able to use.

Fortunately, it's easy to reclaim the space. The following will remove all "reserved blocks":

sudo tune2fs -m 0 /dev/sdXX

This can be reset back to the default 5% (or any percentage) if you need to:

sudo tune2fs -m 5 /dev/sdXX
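
To check how many blocks are currently reserved (and to confirm the change took), tune2fs can also list the filesystem's parameters:

sudo tune2fs -l /dev/sdXX | grep -i "reserved block count"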

Friday, January 22, 2010

iPhone woes

My iPhone 3GS died 2 days into our holiday. Something like an IMEI/ICCID not found error, and the only thing it would display was a screen asking to be connected to iTunes to be restored.

Connecting to iTunes wouldn't let me restore as it said my phone was locked with a passcode. Using the forced restore mode, iTunes wouldn't restore without an Internet connection. Once I got a connection, the restore failed with an "Error 23" and the log file contained "radio" errors. Didn't sound good. Tried a DFU mode restore, same "Error 23".

So, it was bricked. My last backup was in October. My bad, I guess, but I can only use iTunes via my partner's laptop and usually I don't need to. Thankfully it was still covered by the one-year warranty.

Calling Apple went smoothly. They acknowledged that I'd done everything I could and booked me an appointment to take the phone into the Apple store.

The Apple store is crazy: there were more staff than customers, and the Genius Bar was booked out for 2 days. That's where your extra $ go when you buy a Mac, buying blue t-shirts...

The Genius replaced my handset with no troubles, which was great.

Restoring from my backup didn't work very well.

My Apps weren't restored at all; only the links to Web Apps I'd saved to the home screen came back. This is possibly because I hadn't "authorised" iTunes with my account. It's hard to recall if I was warned about that; I didn't know it was necessary in any case.

Restoring Apps manually is a pain: trawling through iTunes receipt emails, figuring out which I hadn't uninstalled and which I wanted to keep using.

Additionally, you have to "buy" each App again; only once that's done does it confirm that you're not going to have to pay for it again (and gives you a cancel button, God knows why). If you choose the wrong App then bad luck: the one-click purchase will install it with no cancel button.

One tiny bit of good news: many Apps' settings were backed up, and the settings were restored after re-purchasing the App. This wasn't clear initially, but evidently it backs up and restores all the files in the "home" folder, whether the App is present or not.

The music wasn't synced at all as I had iTunes set to manage it manually. Fair enough in that case, but still tedious to get it back.

The lesson to be learnt is that an iPhone backup isn't really a backup at all. It apparently backs up your "home" folder and relies on the Sync to do the rest. This is sort of obvious in hindsight, as it's not like there's going to be room for a 32GB backup, and theoretically that's mostly copies of files (music) that are on the computer anyway.

Just because I don't want everything to be automatically Synced doesn't mean I don't want it backed up.

So, I'll be changing my habits from now on. I won't be switching to using iTunes, as that's not an option on Linux, but I'll definitely treat the iPhone as a terminal only and try to keep as much as possible "in the cloud": using Google Sync for my contacts in addition to my mail and calendars, for example.

Sunday, September 27, 2009

Canvas Game on iPhone

I just ported my little canvas game to the iPhone:

Go play!

Thursday, August 13, 2009

GPSLog Labs

GPSLog Labs is a site I've been working on where you can upload logs from your GPS tracking device and map, graph and analyse them.

It will let you track your exercise and training whether you're a cyclist or a runner, and can also be used to track your mileage and keep a diary of your activity.

It builds on a few of the things I've posted about here before, and there's some more information on the GPSLog Labs blog.

Signing up is trivial as it supports OpenID. I'd appreciate any comments and suggestions, and hope the site interests some of you.

Tuesday, July 14, 2009

Hopeless Exetel customer service

I got the following message regarding a line fault issue we have with our telephone service:

Please note that the supplier technician who has attended to your service issue has confirmed to us that there is no issue with in the infrastructure/network boundary point and or main distribution frame (MDF). Please re-check your equipment.

That was a surprise to me, as on Saturday the technician had found a fault in the line but had been unable to pull the cable and had said that Optus would have to dig it up and we'd find out more later.

Calling Exetel help was a struggle: after waiting on hold and explaining my issue, the call dropped out when they seemed to put me back on hold. Twice.

Thoroughly pissed off, I spent another 20 minutes on hold listening to complete silence while the help desk person was "just one second" (it turns out he was talking to Optus; it would have been nice to be told why I was waiting).

So, at this point, 50 wasted minutes later, I find out that Optus is going to fix the line after all, and that the message I received was due to Exetel "closing" the initial fault ticket and was completely wrong.

Idiots.

Saturday, October 11, 2008

Merging GPS logs and mapping them all

Inspired by cabspotting and Open Street Map, I wanted to merge all my GPS logs and create a map showing all the routes I've logged lately.

This is pretty easy using gpsbabel, but I needed to use a little Python to get the list of input log files. (I'm sure there's a way to do it in bash, but that's beyond me for now.) My GPS stores files in NMEA format, and the directory structure/purpose of my Python script should hopefully be apparent.

>>> import os
>>> from path import path
>>> logs = " ".join([" ".join(["-i nmea -f %s"%log
                               for log in sorted((raw/"raw").files("GPS_*.log"))]) 
                     for raw in path("/home/tom/docs/gpslogs").dirs() 
                     if raw.namebase.isdigit()])
>>> logs
'-i nmea -f /home/tom/docs/gpslogs/200810/raw/GPS_20080930_221152.log -i nmea -f /home/tom/docs/gpslogs/200810/raw/GPS_20081001_071234.log ...'
>>> os.system("gpsbabel %s -o kml,points=0,labels=0,trackdata=0 -F /home/tom/docs/gpslogs/all200810.kml" % logs)

The result of that is a 36.5 MB kml file I could load into Google Earth:

There was one spurious point somewhere in the log file at 0° E, 0° N, and the log has a lot of jitter when I'm walking near home.
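
That spurious point could be stripped from the raw logs before merging. Here's a rough sketch, assuming the bad fix shows up as zeroed coordinate fields in the $GPRMC sentences (the matching $GPGGA sentences would need the same treatment):

def drop_null_island(src, dst):
    # copy an NMEA log, skipping $GPRMC sentences located at 0,0
    # (latitude is field 3, longitude is field 5)
    out = open(dst, "w")
    for line in open(src):
        fields = line.split(",")
        if (fields[0] == "$GPRMC"
            and fields[3].startswith("0000")
            and fields[5].startswith("00000")):
            continue
        out.write(line)
    out.close()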

Thursday, September 18, 2008

A simple game using JavaScript and the CANVAS element

This should work in Opera and Firefox (though really slowly, why I don't know) and probably Safari. It's not going to work in IE because I can't be bothered getting the iecanvas thingy into the blog code.

Update: There is now an iPhone version.

The only thing to note is that fewer clicks == a better score.

[The game is embedded at this point on the original page, with a click counter and score readout; browsers without canvas support see a "Canvas not supported..." fallback message.]

Monday, August 25, 2008

Analysing GPS Logs with Awk

This post describes the first two "chop" functions that fit into the partitioning framework outlined in the last post.

def chopToSpeedHistogram(dest, p):
    # create histogram of speeds from nmea written to stdout
    os.system("cat "+sh_escape(dest)+".log"
              + " | awk -F , '{if($1==\"$GPVTG\" && int($8)!=0){count[int($8+0.5)]++}}"
              + " END {for(w in count) printf(\"[%d,%d],\\n\", w, count[w]);}'"
              # sort it
              + " | sort -g -k 1.2"
              # output json of histogram
              + " > "+sh_escape(dest)+".hist")

def chopToHeadingHistogram(dest, p):
    # create histogram of headings from nmea written to stdout (ignore heading when stopped)
    os.system("cat "+sh_escape(dest)+".log"
              + " | awk -F , '{if($1==\"$GPVTG\" && int($8)!=0){count[5.0*int($2/5.0+0.5)]++;}}"
              + " END {for(w in count) printf(\"[%d,%d],\\n\", w, count[w]);}'"
              # sort it
              + " | sort -g -k 1.2"
              # output json of histogram
              + " > "+sh_escape(dest)+".head")

Both functions use awk to create a histogram of the speed (in km/h) and heading (or bearing, in degrees) from the NMEA VTG sentences. The speed is rounded to an integer, and the bearing to the nearest 5 degrees. The data logger records one reading per second, so this gives a measure of how much time was spent at each speed/bearing.

The histogram is output in a "json" array format that can be inserted straight into a webpage where the flot library is used to generate some graphs.

Speed Histogram

The average and standard deviation (shaded at ±0.5σ) are indicated on the graph for two bike rides along the same route, and match pretty closely with the figures recorded by my bike computer:

Ride 1 (brown):
  GPS log: 2 hrs 59 min, minus 41 min stopped; 63.4 km; 27.7 km/h
  Bike computer: 2 hrs 16 min; 64.01 km; 28.00 km/h

Ride 2 (dark green):
  GPS log: 2 hrs 25 min, minus 13 min stopped; 63.4 km; 29.0 km/h
  Bike computer: 2 hrs 10 min; 63.85 km; 29.30 km/h

The two rides went in different directions, the first in the "uphill" direction and the second with a bit of a tail wind. I got a flat tire on the first ride too, hence the extra time spent stopped.

Heading Histogram

Up is north and the radius represents the time spent heading in that direction (normalised during the plotting process and "expanded" by taking the square root to show a little more detail).
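
In other words, each bin's radius ends up as the square root of its normalised count. Roughly, in Python terms, if the histogram were loaded into a dict mapping bearing to seconds:

# scale the counts to [0,1], then "expand" with a square root
biggest = float(max(count.values()))
radii = dict((bearing, (n / biggest) ** 0.5)
             for (bearing, n) in count.items())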

Thursday, August 21, 2008

Automatically Partitioning GPS Logs with gpsbabel

My GPS logger is capturing lots of useful information, but it's difficult to efficiently capture data for regular activities. Geotagging photos is easy, and manually working with the logs for a special event is possible, but it's not feasible to put in that much work to analyse commutes, for example.

The logger creates a separate log file each time it's switched on and off, and while these logs could be sorted into categories for analysis, it's easy to forget to turn it on and off at the start and end of a section of interest, and activities then end up merged in the logs. In addition, there is often "junk" data at the start and end of logs while leaving or arriving at a destination.

I wanted to be able to automatically capture information about my daily activities by simply switching on the logger and carrying it around with me. I then want to plug the logger into the computer and have the logs automatically chopped into segments of interest that can be compared to each other over time.

The rest of this post roughly outlines the Python script I created to perform this task, minus some of the hopefully irrelevant details.

Firstly, I collect the lat/long coordinates of the places I'm interested in: I want to collect data while I'm at them and while travelling between them. These include my home, work, the climbing gym and so on. Each point has a radius within which any readings will be considered to be in that place.

#         id:  name lat         long        radius
places = { 1: ("A", -37.123456, 145.123456, 0.050),
           2: ("B", -37.234567, 145.234567, 0.050),
           3: ("C", -37.345678, 145.345678, 0.050) }
otherid = 4

For each of these places of interest, I then use gpsbabel's radius filter to find all the times where I was within that zone:

# create a list of all raw log files to be processed
from path import path
month = path("/gpslogs/200808")
logs = " ".join(["-i nmea -f %s"%log 
                 for log in sorted((month/"raw").files("GPS_*.log"))])

for (id,(place,lat,lon,radius)) in places.items():
   os.system("gpsbabel "
             # input files
             + logs
             # convert to waypoints
             + " -x transform,wpt=trk,del"
             # remove anything outside place of interest
             + (" -x radius,distance=%.3fK,lat=%.6f,lon=%.6f,nosort"%(radius,lat,lon))
             # convert back to tracks
             + " -x transform,trk=wpt,del"
             # output nmea to stdout
             + " -o nmea -F -"
             # filter to just GPRMC sentences
             + " | grep GPRMC"
             # output to log file
             + (" > %s/processed/place%d.log"%(month,id)))

And all points outside any of the specific places of interest are sent into an "other" file:

os.system("gpsbabel "
          # input files
          + logs
          # convert to waypoints
          + " -x transform,wpt=trk,del"
          # remove anything in a place of interest
          + "".join([" -x radius,distance=%.3fK,lat=%.6f,lon=%.6f,nosort,exclude"%(radius,lat,lon)
                     for (id,(place,lat,lon,radius)) in places.items()])
          # convert back to tracks
          + " -x transform,trk=wpt,del"
          # output nmea to stdout
          + " -o nmea -F -"
          # filter to just GPRMC sentences
          + " | grep GPRMC"
          # output to log file
          + (" > %s/processed/place%d.log" % (month, otherid)))

These files are filtered with grep to contain only minimal data, as we only require the timestamps for this part of the process; specifically, only the NMEA GPRMC sentences are kept.
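
For reference, a GPRMC sentence packs the fix time, status, position, speed, heading and date into one line. An illustrative (made-up) example, with the checksum elided, would be:

$GPRMC,221152.000,A,3723.4567,S,14512.3456,E,15.2,45.3,300908,,*hh

The time is field 1 and the date is third from the end, which is what the timestamp parsing below relies on.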

To provide a brief illustration, the following picture shows two log files of data, one blue and one green, between three points of interest:

The above process would create four files, one for each point A, B and C and one for "Other" points that would contain something like the following information, where the horizontal axis represents time:

I then read all those log files back in to create a "time line" that, for each timestamp, stores my "location", in the sense that it knows whether I was at "home", at "work" or somewhere between the two.

# dict of timestamp (seconds since epoch, UTC) to placeid
where = {}
for placeid in places.keys()+[otherid,]:
   for line in (month/"processed"/("place%d.log"%placeid)).lines():
      fields = line.split(",")
      # convert date/time to seconds since epoch (UTC)
      t, d = fields[1], fields[-3]
      ts = calendar.timegm( (2000+int(d[4:6]), int(d[2:4]), int(d[0:2]),
                             int(t[0:2]), int(t[2:4]), int(t[4:6])) )
      where[ts] = placeid

This is then summarised from one value per second to a list of "segments" with a start and end time and a location. Unlogged time segments are also inserted at this point whenever there are no logged readings for 5 minutes or more.

# array of tuples (placeid, start, end, logged)
# placeid = 0 indicates "unknown location", i.e. unlogged
summary = []
current, start, stop, last_ts = 0, 0, 0, None
for ts in sorted(where.keys()):
   # detect and insert "gaps" if space between logged timestamps is greater than 5 minutes
   if last_ts and ts-last_ts > 5*60:
      if current:
         summary.append( [current, start, stop, True] )
      current, start, stop = where[ts], ts, ts
      summary.append( [0, last_ts, ts, False] )
 
   last_ts = ts

   if where[ts] != current:
      if current:
         summary.append( [current, start, stop, True] )
      current, start, stop = where[ts], ts, ts
   else:
      stop = ts
summary.append( [current, start, stop, True] )

(If there's a more "Pythonic" way of writing that kind of code, I'd be interested in knowing it.)
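
One tidier possibility (an untested sketch that should have the same semantics) is to always try to extend the last segment in the list, and otherwise start a new one:

def summarise(where, max_gap=5*60):
    # same output as above: [placeid, start, stop, logged] entries
    summary = []
    for ts in sorted(where.keys()):
        last = summary and summary[-1]
        if last and last[0] == where[ts] and ts - last[2] <= max_gap:
            last[2] = ts                              # extend current segment
            continue
        if last and ts - last[2] > max_gap:
            summary.append([0, last[2], ts, False])   # insert an unlogged gap
        summary.append([where[ts], ts, ts, True])     # start a new segment
    return summary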

"Spurious" segments are then removed. These show up because when the logger is inside buildings the location jumps around and often out of the 50m radius meaning that, for example, there will be a sequence of Home-Other-Home-Other-Home logs. The "Other" segments that are between two known points of interest and less than 5 minutes long are deleted, as are "Other" segments that sit between a known place of interest and an unlogged segment.

Based on the above graphic, the summary might look something like the following:

start      end        location
10.00am    10.05am    A
10.05am    10.30am    Other
10.30am    10.35am    B
10.35am    11.00am    Other
...

The "Other" segments are then labelled if possible to indicate they were "commutes" between known locations:

start      end        location
10.00am    10.05am    A
10.05am    10.30am    A-B
10.30am    10.35am    B
10.35am    11.00am    B-C
...

Some segments cannot be labelled automatically and are left as "Other". This may be a trip out to a one-off location and back again, which is fine to leave as "Other". Sometimes, however, it is because the logger didn't lock onto the satellites within the 50m radius on the way out of a place of interest, and these can be manually fixed up later.
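
The labelling itself just looks at each "Other" segment's neighbours. A sketch that also extends each segment to the six-value form used in the loop below:

labelled = []
for (i, (placeid, start, stop, logged)) in enumerate(summary):
    place_from, place_to = placeid, placeid
    if placeid == otherid:
        if i > 0 and summary[i-1][0] in places:
            place_from = summary[i-1][0]
        if i+1 < len(summary) and summary[i+1][0] in places:
            place_to = summary[i+1][0]
    labelled.append((placeid, start, stop, place_from, place_to, logged))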

Once a list of "activities" has been obtained, with start and end times, it is easy to use gpsbabel again to split the logs based on the start and end times of the segments:

for (place, start, stop, place_from, place_to, logged) in summary:
    dest = month / "processed" / ("%s-%s"%(time.strftime("%Y%m%d%H%M%S", time.localtime(start)),
                                           time.strftime("%Y%m%d%H%M%S", time.localtime(stop))))

    for (ext, chopFn) in [(".log", chopToLog),
                          (".kml", chopToKml),
                          (".speed", chopToSpeedVsDistance),
                          (".alt", chopToAltitudeVsDistance),
                          (".hist", chopToSpeedHistogram),
                          (".head", chopToHeadingHistogram),
                          (".stops", chopToStopsVsDistance)]:
        if not (dest+ext).exists():
            chopFn(dest, locals())
            # make the file in case it was empty and not created
            (dest+ext).touch()

This generates a bunch of files for each segment, named with the start and end timestamps of the segment and an extension depending on the content. The first "chop" function generates an NMEA format log file that is then processed further by the remaining "chop" functions. The other chop functions will probably be explained in a later post; the first two are:

def chopToLog(dest, p):
    # filter input file entries within times of interest to temp file
    os.system("gpsbabel " + p["logs"]
              + (" -x track,merge,start=%s,stop=%s"
                 % (time.strftime("%Y%m%d%H%M%S", time.gmtime(p["start"])),
                    time.strftime("%Y%m%d%H%M%S", time.gmtime(p["stop"]))))
              + " -o nmea -F "+sh_escape(dest)+".log")

def chopToKml(dest, p):
    # create kml file with reduced resolution
    os.system("gpsbabel -i nmea -f "+sh_escape(dest)+".log"
              + " -x simplify,error=0.01k"
              + " -o kml -F "+sh_escape(dest)+".kml")

def sh_escape(p):
    return p.replace("(","\\(").replace(")","\\)").replace(" ","\\ ")

(Again, if there's a better way to handle escaping special characters in shell commands, I would like to know it.)
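
One safer option is the standard library's pipes.quote (long undocumented, and later reborn as shlex.quote in Python 3), which handles all the shell metacharacters rather than just those three. And where no shell pipeline is involved, passing subprocess an argument list sidesteps quoting entirely:

import pipes
import subprocess

def sh_escape(p):
    # quote a single argument for the shell
    return pipes.quote(str(p))

# no escaping needed when the arguments never pass through a shell
subprocess.call(["gpsbabel", "-i", "nmea", "-f", dest + ".log",
                 "-x", "simplify,error=0.01k",
                 "-o", "kml", "-F", dest + ".kml"])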

Using this, I can simply plug in the logger, which launches an autorun script, and the end result is a set of nicely segmented log files that I can map and graph. More about that in another post.

Monday, August 18, 2008

GMail tip: "To be filtered" label

I use GMail's filters a lot, and in particular have a "Newsletters" label that keeps "bacn" from getting into my Inbox (and instead just lights up my ambient email notifier green).

I've added a "to be filtered" label that I can apply to anything that slips into the Inbox and then later on, when I have time, I can look at all these messages, select them in groups and use the "Filter messages like these" command to make sure they don't bother me again.

This solves two problems: being taken out of the "flow" to create a filter, and perpetually handling the items manually and never getting around to creating a filter.