Programming


Like I said in my last post, I have been working on some tools for writer brainstorming. The first one was the “article title writer” at TheWritersSecret.com. Next is one for brainstorming movie titles. I call it “Movie Brain”.

The tech behind it:

It is written in JavaScript. That way the work of sorting through the data is done by the client machine instead of the server. JavaScript also has some pretty decent string handling abilities.

I collected over 11,000 titles from the Wikipedia list of movies. These are stored by category and are processed into usable data using a combination of awk, grep, and PHP scripting. For example:

cat horror_words.txt | awk '1==1 {printf("\"%s\",\n",tolower($2));}' | sort | uniq >movies_all.js

The engine uses a very simple template technique. Examples:
“() meets (2).” The parentheses are replaced by two different random words: “Boy meets Bacon.”
“The {} and the ()” The braces are replaced by a random number of random words and the parentheses by a single word: “The egg substitute and the curmudgeon.”
“[]: {}” Here the [] stands for a year: “2012: Pirates Invade”
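Here is a minimal sketch of how a template engine like this could work. It is not the actual Movie Brain code: the word list, the year range, the "one to three words" rule for {} and the function name makeTitle are all assumptions made for illustration.

```javascript
// Hypothetical word list; the real tool draws on the processed movie-title data.
const WORDS = ["boy", "bacon", "egg", "pirates", "curmudgeon", "substitute"];

// Fill a template: [] becomes a year, {} becomes a few random words,
// and () or (2) becomes a single random word.
// rand is injectable so the output can be made repeatable.
function makeTitle(template, words, rand = Math.random) {
  const pool = words.slice();
  // Remove each picked word from the pool so two slots never get the same word.
  const pick = () => {
    if (pool.length === 0) pool.push(...words); // refill if we run out
    return pool.splice(Math.floor(rand() * pool.length), 1)[0];
  };
  return template
    // Assumed year range: 2000 to 2029.
    .replace(/\[\]/g, () => String(2000 + Math.floor(rand() * 30)))
    // Assumed rule: {} expands to between one and three words.
    .replace(/\{\}/g, () => {
      const count = 1 + Math.floor(rand() * 3);
      return Array.from({ length: count }, pick).join(" ");
    })
    .replace(/\(\d*\)/g, () => pick());
}

console.log(makeTitle("() meets (2).", WORDS));
console.log(makeTitle("The {} and the ()", WORDS));
console.log(makeTitle("[]: {}", WORDS));
```

The titles come out in lower case here; capitalizing them would be a separate step.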

The last piece of tech was the fun part. When you generate the random list of movies you can see what they would look like as your standard “coming soon” movie title over a black screen. Just click on one and see what I mean. It uses JavaScript and CSS to generate and animate the title. For a time I had the title zoom in as well, but I didn’t like how much processor it hogged.

To keep the coming soon generator entirely client side I also used a technique that I had never used before. A couple of days ago a friend asked if it was possible to do forms without a server. Apparently it is, if the form uses method GET: the browser puts the form fields into the query string of the URL, and JavaScript can parse the URL to read the data.
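A minimal sketch of reading GET form data on the client, using the standard URL and URLSearchParams objects found in current browsers. The function name readFormData and the field names title and year are made up for the example; this is not the code the generator itself uses.

```javascript
// Turn the query string of a URL into a plain object of form fields.
function readFormData(url) {
  const params = new URL(url).searchParams;
  const data = {};
  for (const [key, value] of params) {
    data[key] = value; // values arrive already URL-decoded
  }
  return data;
}

// In a page you would pass window.location.href instead of a fixed string.
const example = "http://example.com/soon.html?title=Boy+meets+Bacon&year=2012";
console.log(readFormData(example));
```

If a field name appears more than once, this version keeps only the last value.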

While writing the JavaScript I found a wonderful resource: JavaScript equivalents for PHP functions.

I am a huge fan of TiddlyWiki and use it all the time. This morning I wrote a quick utility in JavaScript to convert CSV to TiddlyWiki Table format.

I am releasing this code into the public domain. The main reason is that I had problems finding a JavaScript CSV interpreter that was already public domain. Mine is quick, easy, and kinda bug free. //smile
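To give an idea of what such a conversion involves, here is a small sketch. It is not the released utility: the function names are hypothetical, the first CSV row is assumed to be a header, and quoted fields are handled only when they stay on one line. It relies on the classic TiddlyWiki table syntax, where each row is written as |cell|cell| and a leading ! marks a header cell.

```javascript
// Split one CSV line into cells, honoring double-quoted fields
// and "" as an escaped quote inside a quoted field.
function parseCsvLine(line) {
  const cells = [];
  let cell = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') { cell += '"'; i++; }
      else if (ch === '"') inQuotes = false;
      else cell += ch;
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      cells.push(cell);
      cell = "";
    } else {
      cell += ch;
    }
  }
  cells.push(cell);
  return cells;
}

// Convert CSV text to TiddlyWiki table markup.
// Assumption: the first non-empty row holds the column headers.
function csvToTiddlyTable(csv) {
  return csv
    .split(/\r?\n/)
    .filter((line) => line.length > 0)
    .map((line, row) => {
      const cells = parseCsvLine(line).map((c) => (row === 0 ? "!" + c : c));
      return "|" + cells.join("|") + "|";
    })
    .join("\n");
}

console.log(csvToTiddlyTable('title,note\nMovie Brain,"quick, easy"'));
```

Cells that contain a | character are not escaped, so they would break the table.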

Sure there are over 6 million places on the internet to check your IP Address. I’m sure I’ve written this app at least 7-8 times.

But this one is different. No mission creep. Just a plain minimalist way to find out: What is My IP Address?

For some reason I got the idea that importing my Firefox bookmarks.html into a TiddlyWiki would be a good idea. I know it's just a .html file and that I could just open it in a text editor and paste it inside a TiddlyWiki tiddler and be done with it.

Deciding against the easy option, I chose to have the bookmarks.html file process itself by adding some JavaScript to it. After making a copy of my bookmarks.html file I added this code to the top of the file:



<SCRIPT LANGUAGE="JavaScript">
// Collects the wiki text while the bookmark tree is walked.
var theOutput="";

function run_me()
{
    var myDiv=document.getElementById("my_div");
    // Start from the first <DL> in the file.
    var theDL=document.getElementsByTagName('dl')[0];
    theOutput="";
    walkBookmarks(theDL);
    myDiv.innerHTML="<textarea cols=\"80\" rows=\"20\">"+theOutput+"</textarea>";
}

function walkBookmarks(node)
{
    // Folder names (H3) become wiki headings.
    if (node.nodeName == "H3" && node.firstChild)
    {
        theOutput += "!"+node.firstChild.nodeValue+"<br>\n";
    }
    // Links (A) become [[text|url]] wiki links.
    if (node.nodeName == "A" && node.firstChild)
    {
        var theURL = node.href;
        var theText = node.firstChild.nodeValue;
        theOutput += "[["+theText+"|"+theURL+"]]<br>\n";
    }
    // Recurse into any child nodes.
    if (node.childNodes != null)
    {
        for (var i=0; i < node.childNodes.length; i++)
        {
            walkBookmarks(node.childNodes.item(i));
        }
    }
}
</SCRIPT>

<div id="my_div">Working….</div>

And this code to the end of the file:



<script language="JavaScript">
run_me();
</script>

Which will give wiki formatted text like this:



!Bookmarks Toolbar Folder<br>
[[UnfocusedBrain.com -|http://www.unfocusedbrain.com/]]<br>
[[[Yahoo!]|http://www.yahoo.com/]]<br>

Works pretty nice.

My second attempt was to try using “rdf:bookmarks” as described in the RDF in Mozilla FAQ on the developer website. I really didn’t make much headway with this method. If I were to develop a plug-in for TiddlyWiki that works w/ Firefox’s built in bookmarks this would definitely be the way to go.

My third attempt was to use the ExternalTiddlersPlugin by Eric Shulman, which allowed me to just put one line in a tiddler and have the bookmarks show up:

<<tiddler "file:///home/username/.mozilla/firefox/l3qbefb3.default/bookmarks.html">>

It almost worked. Since bookmarks.html isn’t really a well-formed HTML file, it doesn’t come through the tiddly process very well. So I just added a couple of lines of code to wrap the file's text in html tags, and it worked perfectly.



function my_textprocess(text)
{
    text = "<html>" + text + "</html>";
    return text;
}

With this method the links are read-only inside the TiddlyWiki, reflecting the fact that the real bookmarks cannot be edited from there.

In the end I’m not entirely satisfied with any of these methods. What I would like to make would be a plug-in to TiddlyWiki that will open the “rdf:bookmarks” interface. For each folder in the bookmarks it would create a new tiddler and under each tiddler would be the links. In addition to this I would like to add a Bookmarklet to add links to the wiki instead of the real bookmarks.

Tiddly.

I have been developing a couple of projects with some friends for the Propeller chip, which is an extraordinarily powerful micro-controller. The latest project uses the Ping))) sensor and the Hydra Sound System (HSS) to generate sounds based on your hand position over the Ping))) sensor.

To make sense of the HSS sfx_play interface I wrote a little program that lets me adjust the parameters in real time using a keyboard attached to the Propeller. I use it to find a sound, then write down those parameters for future use.
tp03.zip This requires the HSS.

Given a directory of images you can just do:

ls | awk '1 {printf("<img src=\"%s\"><br/>\n",$0);}' >index.html

I don’t think it gets any easier than that.

As part of the purging process of becoming a nomad we are trying to make an archive of important data. This is not an easy task. We have about 30 Gigs of photos. I know there are duplicates in the data but I haven't done anything about it until now.

In integrity part1 I explained how to check the md5deep database (just a text file with an md5sum and a filename on each line) to see if there are any duplicates. Example:

sort -n md5.test_data.txt | uniq -D -w 32

After sorting, uniq compares only the first 32 characters of each line, which is the md5 sum, and prints every line whose sum appears more than once. This works great for detecting duplicates.
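For comparison, here is the same duplicate check as a small JavaScript sketch. It assumes the listing has one entry per line, with the 32 character md5 sum first and the filename after it; the function name findDuplicates and the sample data are made up.

```javascript
// Group the files in an md5 listing by hash and keep only hashes seen more than once.
// Assumed line format: <32 hex characters><whitespace><filename>
function findDuplicates(listing) {
  const byHash = new Map();
  for (const line of listing.split(/\r?\n/)) {
    if (line.length < 32) continue; // skip blank or malformed lines
    const hash = line.slice(0, 32);
    const file = line.slice(32).trim();
    if (!byHash.has(hash)) byHash.set(hash, []);
    byHash.get(hash).push(file);
  }
  return [...byHash.entries()].filter(([, files]) => files.length > 1);
}

// Made-up sample data.
const sample = [
  "0123456789abcdef0123456789abcdef  ./shoot1/img_001.jpg",
  "fedcba9876543210fedcba9876543210  ./shoot1/img_002.jpg",
  "0123456789abcdef0123456789abcdef  ./bob/img_001.jpg",
].join("\n");

for (const [hash, files] of findDuplicates(sample)) {
  console.log(hash, files.join(", "));
}
```

Unlike the sort and uniq pipeline, this does not need the input to be sorted, since it groups by hash directly.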

But what can I do about them? Sometimes I have a duplicate on purpose. For example if I have a directory tree with 500 photos from a single shoot I want to make a directory with the "best of." I could do several things: Make symbolic links to the files, copy them to a new folder, or store best of stuff outside the main backup tree. I opt for just making a copy of the file into a "bob" folder. It is wasteful of space but this is the method that I choose.

So now that I have copied them I have files with duplicate md5sums. After pondering on this for a while I came up with the idea of changing the jpeg comment field to say something like: "I know this is a copy" or "Best Of Photos."

To accomplish this I am using my good friend jhead - the jpeg header manipulator. Example:

find ./best_of_best/ -type f -print0 | xargs -0 -n1 jhead -cl "Best of Photos"

This does the trick. Now each copy has a different md5sum from its original, even though it is really the same photo with the same image data and the same file name.

Some duplicates in the archive are caused by sloppy photo management. Sometimes I do not delete the files off of a card before taking more and end up having more than one copy of a photo in different directories. With the jpeg header trick I can now either delete them or just change the comment field.

I suppose in the future when I start using the comment field more this method could overwrite valuable comment info. I guess when that becomes a problem I will add checks to make sure the comment field is empty before overwriting.

Have you ever just wanted to see a bunch of people doing random stuff? I know I sure have.

That's why I made the "People Doing Stuff - Random Image Finder." 

It's basically a random Google Image Search (GIS) query tool. The twist is that it searches for images based on Proper Noun + Verb. It's written using JavaScript. I used to use the diddly.com Random Personal Picture Finder for hours on end. I still do. I just wanted something a little different. Thus my People Doing Stuff finder was born.
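A minimal sketch of the Proper Noun + Verb idea. The name and verb lists are made up, and the search URL format is an assumption about how an image search can be requested; none of this is taken from the actual finder.

```javascript
// Hypothetical word lists; the real tool presumably uses much longer ones.
const NAMES = ["Alice", "Bob", "Carmen"];
const VERBS = ["dancing", "cooking", "fishing"];

// Pick one entry from a list. rand is injectable so results can be repeated.
function pickRandom(list, rand = Math.random) {
  return list[Math.floor(rand() * list.length)];
}

// Build a "Proper Noun + Verb" query such as "Bob fishing".
function randomQuery(rand = Math.random) {
  return pickRandom(NAMES, rand) + " " + pickRandom(VERBS, rand);
}

// Assumed URL format for an image search.
function searchUrl(query) {
  return "https://www.google.com/search?tbm=isch&q=" + encodeURIComponent(query);
}

console.log(searchUrl(randomQuery()));
```

In a framed page like the one described, the resulting URL would be loaded into the results frame.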

I know it uses frames. I couldn't think of any other way to accomplish the goal. Using AJAX I kept getting permission denied when trying to pull data from off site. As far as I can tell that is the browser's same-origin policy, which stops a script from reading data from another site.

Enjoy. If you have any suggestions let me know.

I have been working hard and furious on the next generation of POP Rage. Originally I had designed the site to be a one-a-day article about pop culture. The idea was that if you didn't have time to scour all the popular sites on the Internet, you could come to POP Rage and see the one story that mattered for the day. The one thing that everyone would be talking about tomorrow. It didn't work out that way. The first few articles were about what was hot that day. Of course if it was hot there were already a billion articles about it and it was super well covered. So the question became, 'what makes this site different?'

To fix this perceived problem we started covering things that nobody else was covering or things that hadn't been covered in a while. Finding subject matter became difficult, though. The problem was that if it was something nobody else was covering, how could it be a pop subject? Still, we did some great articles in this format, starting with an article about mashups, then moving on to subjects like image generators, celebrities with blogs, and one about why celebrities seem to move frequently.

The articles were great but didn't generate much interest from the cold cruel world. So I decided it was time for a change. Of the topics that we covered the two most popular were 'blogging with the stars' and 'your locks are unsafe and useless.' The one about celebrities who blog caught my eye.

I decided that a really neat thing would be to have a site that cataloged celebrity blogs. Not bloggers who became celebrities but rather celebrities who became bloggers. I thought this would be a good resource because while I was researching the 'blogging with the stars' article I discovered how hard it is to track down quality celebrity bloggers. I ran into unofficial blogs that claimed to be official. Found sites that used to be actively maintained but had been abandoned. And kept running into sites where it seemed like a publicist told someone, 'you need to blog to reach these kids today,' and the celeb did, but only for the duration of that promotion.

I remembered how much I was impressed by Wil Wheaton's blog and thought 'how about not just listing the blog, but telling people how good it is?' This is what gave me the full idea for the celebrity blog rank. 

So I set out not only to catalog the blogs but to rank them. How do you rank a blog for quality? Turns out it is pretty difficult. One of the easiest things to measure is frequency of posting. (That is, if they have an RSS feed.) I decided this was the most important factor in determining rank. Next I thought about the problem of sites that are not blogged by the celeb themselves but rather by an agent. I didn't want to discriminate against these sites, but I did want sites that were made by the celeb to rank higher. (Or at least that seem to be by them.) So I added some qualitative measurements to the formula. Things like: "How often does the celeb respond to fans?", "Is the blog just 'I will be here on this date and here is what I am working on,' or is it 'here is how I feel about what I'm working on'?" and "Does the celeb allow fans to comment or discuss on their site?" Moby, as an example, not only has a board where people can chat but even goes so far as to try to let it be self moderating.

Now the tech. The site is basically an RSS aggregator. The feeds are collected and analyzed in a 6 hour cycle that updates something every few minutes. A cron script runs lynx, and lynx pulls up a page that sets off the update. Why do it in this roundabout way rather than call the PHP script directly? Logs. The PHP script downloads the feed, parses it with Magpie, and stores the results in a MySQL database. Then another script is called that runs an analysis of all the data, not only from that RSS feed but from previous ones. I wanted to judge frequency not only on the few items in the RSS feed but over longer periods of about a month. Once these scripts run and determine a new rank for the blog, the master table is updated to reflect the new rank. No groundbreaking tech. Actually the tech side is fairly boring on this project.
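To make the ranking idea concrete, here is a rough sketch in JavaScript. The site itself is written in PHP and its real formula is not shown here, so everything in this example is hypothetical: the function names, the three yes/no qualitative flags, and the weights.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Average posts per week over the last windowDays days (about a month by default).
function postsPerWeek(postDates, now, windowDays = 30) {
  const cutoff = now.getTime() - windowDays * DAY_MS;
  const recent = postDates.filter(
    (d) => d.getTime() >= cutoff && d.getTime() <= now.getTime()
  );
  return recent.length / (windowDays / 7);
}

// Combine posting frequency with qualitative flags.
// The weight of 2 on frequency is an arbitrary choice that makes it the dominant factor.
function blogScore(postDates, qualitative, now) {
  const bonus =
    (qualitative.respondsToFans ? 1 : 0) +
    (qualitative.personalVoice ? 1 : 0) +
    (qualitative.allowsComments ? 1 : 0);
  return 2 * postsPerWeek(postDates, now) + bonus;
}

// Made-up example: three posts inside the window, one well outside it.
const now = new Date("2007-06-30T00:00:00Z");
const posts = ["2007-06-29", "2007-06-15", "2007-06-02", "2007-04-01"].map(
  (s) => new Date(s)
);
const flags = { respondsToFans: true, personalVoice: true, allowsComments: false };
console.log(blogScore(posts, flags, now));
```

Sorting the blogs by this score, highest first, would then give the rank.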

So there it is. The why and how of the celebrity blog ranking. Will it do better than the article-a-day POP Rage? We will see. 

I am launching a new site today called POP Rage! The idea is that every day we will cover one and only one topic that is related to Internet pop culture and news. One day the site may cover what's happening in the news, the next it may cover the latest viral video, or the hottest whatever on the net. The philosophy is that a user can come to the site and within 30 seconds know what their friends are going to be talking about today and hopefully be better informed on the topic.

I am working with a team of really talented writers who take care of the articles, which gives me time to focus on the backend. I did the backend software in PHP5 and MySQL. It's completely custom software that is designed to handle the philosophy of 'one and only one topic' each day. There is a commenting system that allows users to post their feedback on the article and hopefully contribute intelligent discussion on the topic. I have also implemented a comment ranking system in JavaScript. This allows people to tag comments as good/bad as they are reading without having to leave the thread.
