Cool Toys Pic of the day – What to Do Without WhatTheHashtag? Part Deux

Last night I went to sleep thinking, “Oh, darn, I forgot to mention
PHP and DIY Twitter archiving.” This morning I awoke to several
replies and suggestions to yesterday’s post. So, here is an expansion
on the topic of archiving Twitter chats and such. Remember, however,
that I am no longer any kind of codemonkey, so I’m not building
anything myself, and I am speaking from the vantage point of someone
who has not yet found a solution.

First off, the PHP bit I mentioned. It isn’t just PHP, but the idea of
coding your own, be it with PHP, Python, Git, C++, Yahoo Pipes, or
whatever works for you. Harvest and aggregate the RSS feeds for your
streams, splice them together, filter if desired, and push the result
at specified times on a daily basis into a blog that exists primarily
for the purpose of capturing and aggregating your stream(s). I seem to
know a lot of people taking this approach. It really impresses me. If
I had infinite time and wit I might even try it out myself.
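
For the curious, here is a minimal sketch of that harvest, splice, and filter idea in Python, using only the standard library. It is not anyone's actual script. The two tiny feeds, the function names parse_items and splice, and the #hcsm keyword filter are all made up for illustration, and it assumes plain RSS 2.0 items with a title, guid, and pubDate. Fetching the feeds and pushing the result to a blog are left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical sample feeds, standing in for RSS text you would fetch yourself.
FEED_A = """<rss version="2.0"><channel><title>A</title>
<item><title>pfanderson: Joining #hcsm tonight</title><guid>1</guid>
<pubDate>Sun, 06 Mar 2011 21:00:00 +0000</pubDate></item>
<item><title>pfanderson: Coffee first</title><guid>2</guid>
<pubDate>Sun, 06 Mar 2011 21:05:00 +0000</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>B</title>
<item><title>pfanderson: Joining #hcsm tonight</title><guid>1</guid>
<pubDate>Sun, 06 Mar 2011 21:00:00 +0000</pubDate></item>
<item><title>someone: Answer to topic 1 #HCSM</title><guid>3</guid>
<pubDate>Sun, 06 Mar 2011 21:02:00 +0000</pubDate></item>
</channel></rss>"""

# Fallback date for items with no pubDate, so sorting never compares None.
EARLIEST = datetime.min.replace(tzinfo=timezone.utc)


def parse_items(rss_text):
    """Pull guid, title, and publication date out of one RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        guid = item.findtext("guid", default=title)
        pub = item.findtext("pubDate")
        published = parsedate_to_datetime(pub) if pub else EARLIEST
        items.append({"guid": guid, "title": title, "published": published})
    return items


def splice(feeds, keyword=None):
    """Merge several feeds, drop duplicates by guid, optionally filter by keyword."""
    seen = set()
    merged = []
    for rss_text in feeds:
        for entry in parse_items(rss_text):
            if entry["guid"] in seen:
                continue
            if keyword and keyword.lower() not in entry["title"].lower():
                continue
            seen.add(entry["guid"])
            merged.append(entry)
    merged.sort(key=lambda e: e["published"])  # oldest first
    return merged


merged = splice([FEED_A, FEED_B], keyword="#hcsm")
for entry in merged:
    print(entry["published"].isoformat(), entry["title"])
```

The duplicate check matters because the same tweet can show up in more than one stream. Matching on the guid rather than the text keeps retweets with identical wording from being collapsed by accident.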

However, there is no guarantee this will continue to work, even if you
have built your own. Recently Twitter sent me a survey to ask why I am
one of the holdouts who has not switched to the NEW Twitter interface.
Well, the #1 reason is that none of the pages supply an RSS feed
anymore. Hello? There have been a lot of complaints about this. If you
are using a Twitter RSS feed, take a good look at it. I am accustomed
to RSS feeds that have URLs very similar to the base URL of the page
generating the feed. For example:

SOURCE:
http://twitter.com/pfanderson
RSS might be something like:
feed://twitter.com/pfanderson.xml

Please note, this does NOT work, although Twitter does something very
similar for their public timeline, so obviously they know how.

Twitter Public Timeline RSS Feed:
http://twitter.com/statuses/public_timeline.xml

Now, what does Twitter REALLY provide as an RSS feed?

SOURCE:
http://twitter.com/pfanderson
RSS:
http://twitter.com/statuses/user_timeline/5618162.rss

How would you guess that if there weren’t a link on the page? Exactly.
It would probably be darn difficult. Even jollier? Even if you have the
RSS link, it doesn’t always work. As I am writing this, the public
feed and my personal feed both say:

“error: Rate limit exceeded. Clients may not make more than 150
requests per hour. /error”

Why is this happening? I don’t know. In the past hour I’ve sent 4
tweets. The screen does automatically refresh every minute, but that
shouldn’t push it past the limit, if that counts. I have Packrati.us
set to archive my links in tweets to Delicious. Would that count? Do
all of these add up?
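
Here is the back-of-the-envelope arithmetic as a tiny Python sketch. The 150 figure comes from the error message. Everything else is an assumption: I do not know whether page refreshes or sent tweets actually count against the limit, so this only shows what the total would be if they did.

```python
RATE_LIMIT_PER_HOUR = 150  # from the quoted error message


def requests_per_hour(refresh_seconds, extra_requests=0):
    """Requests in one hour from a fixed refresh interval plus any extras."""
    return 3600 // refresh_seconds + extra_requests


# Assumption: each one-minute page refresh and each tweet counts as one request.
used = requests_per_hour(60, extra_requests=4)
headroom = RATE_LIMIT_PER_HOUR - used

# Fastest steady polling that would stay within the limit, in seconds.
min_interval = 3600 / RATE_LIMIT_PER_HOUR

print(used, headroom, min_interval)
```

Under those assumptions the refreshes and tweets come to 64 requests, well under 150, so something else would have to be making the remaining requests for the limit to trip.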

Back to the topic. So Twitter is creating arcane RSS links and
requiring an interface that hides them, when this information USED to
be readily available. Why would they do this? Also, I must confess
that it looks a trifle suspicious when they incrementally reduce
service and functionality for the main platform. If they are planning
to make access to chat archives or personal archives a
fee-for-service, I’d really appreciate it if they were open about that
strategy or plan, rather than little by little slicing off bits of
what made Twitter useful without apparent rhyme or reason.

Back to backups.

Here is one of the classic articles on how to back up your tweets.
Remember, what I’m trying to get is #hcsm chats, since Backupify seems
to do a nice job at a reasonable price on the personal archive side.

ReadWriteWeb: 10 Ways to Archive Your Tweets (By Sarah Perez / August 11, 2009):
http://www.readwriteweb.com/archives/10_ways_to_archive_your_tweets.php

I haven’t checked all of these, but I already know that several no
longer work, and the handful of others I checked are not working
either. The most likely to still work, I think (and I’m no expert on
code), seems to be option #10:

“10. Geek Tools Let You Archive in XML, PDF, HTML, TXT…or even with Python
RSS guru Dave Winer released a tool earlier this year which archives
Twitter posts using the OPML Editor and optionally synchronizes with a
structure on Amazon S3. Alternately, there’s this simple Python script
for archiving tweets. Sourceforge also hosts an app which lets you
back up tweets of different users as XML, HTML, PDF, or TXT.
However, it can only perform backups of 3200 tweets at a time. Each
subsequent backup will append the additional tweets to the current
existing archive.”
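
The "append to the existing archive" behavior described in that last item can be sketched roughly like this in Python. This is not how the Sourceforge app works internally (I have no idea how it does it). The append_new function, the one-JSON-object-per-line file, and the "id" and "text" fields are all hypothetical, and the 3200 cap is simply borrowed from the quote.

```python
import json
import os
import tempfile

BATCH_LIMIT = 3200  # per-run cap, taken from the quoted description


def append_new(archive_path, tweets, batch_limit=BATCH_LIMIT):
    """Append tweets not already in the archive file; return how many were added."""
    known = set()
    if os.path.exists(archive_path):
        with open(archive_path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    known.add(json.loads(line)["id"])

    added = 0
    with open(archive_path, "a", encoding="utf-8") as f:
        for tweet in tweets[:batch_limit]:  # never process more than the cap
            if tweet["id"] in known:
                continue  # already archived on an earlier run
            f.write(json.dumps(tweet) + "\n")
            known.add(tweet["id"])
            added += 1
    return added


# Two runs with overlapping, made-up tweets.
archive = os.path.join(tempfile.mkdtemp(), "hcsm_archive.jsonl")
first_run = append_new(archive, [{"id": 1, "text": "one"}, {"id": 2, "text": "two"}])
second_run = append_new(archive, [{"id": 2, "text": "two"}, {"id": 3, "text": "three"}])
print(first_run, second_run)
```

Running it repeatedly is safe, since a tweet that is already in the file is skipped rather than written twice.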

So what is there for those of us who don’t code? Andrew Spong and
Colleen Young were chatting about Archivist.

Archivist:
http://archivist.visitmix.com/

While I haven’t been successful figuring out how to get an “archive”
out of Archivist, it really does do lovely visualizations. Archivist
suffers from the same constraints faced by so many other similar
tools. They did an especially nice job of explaining these constraints
in their FAQ.

Archivist: FAQ:
http://archivist.visitmix.com/about/faq

Of particular note:
– “[Y]ou will only see at most the 500 last tweets about that topic”
– “The Archivist polls the Twitter Search API to collect data. As
such, there is a chance that the Archivist could miss tweets,
especially in the case of a search with a high volume of traffic.”
– “Can I export an archive and get the tweets in the archive? No. The
Twitter Terms of Service does not allow this.”
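
To see why a busy search could lose tweets, here is a rough Python sketch of the arithmetic. It rests on assumptions the FAQ does not spell out: that each poll can only see the most recent 500 tweets, and that tweets arrive at a steady rate. The function name and the example rates are made up.

```python
def tweets_missed_per_poll(tweets_per_minute, poll_interval_minutes, window=500):
    """Tweets that scroll out of the window before the next poll can see them."""
    arrived = tweets_per_minute * poll_interval_minutes
    return max(0, arrived - window)


# Hypothetical rates with a poll once an hour.
quiet = tweets_missed_per_poll(5, 60)   # 300 arrive, all fit in the window
busy = tweets_missed_per_poll(20, 60)   # 1200 arrive, the oldest 700 are lost

print(quiet, busy)
```

A fast-moving chat hour is exactly the case where the second line applies, which is why chats are harder to capture than a slow personal stream.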

I was chagrined to discover I had already created a #HCSM archive in
Archivist last year. Oops!

pfanderson’s Archive on #hcsm Containing 75,281 Tweets:
http://archivist.visitmix.com/pfanderson/1

Another tool that was mentioned was Tweethook. I had not mentioned it
yesterday because it costs money.

Tweethook:
https://tweethook.com/

Alan Levine, better known as CogDog, mentioned that he is using ThinkUp.

ThinkUp:
http://thinkupapp.com/

I really like what he is showing with it, and it is open source (very
nice), but it seems to require that you have your own domain and
server for a self-hosted solution. That is the same problem I
encountered previously when trying to back up my personal tweets. The
most likely way seemed to be a WordPress blog. There are several tools
for integrating a push from Twitter to your WordPress blog, but all
the ones I checked out would not work on WordPress.com blogs, only on
self-hosted blogs.

So I’m still stuck, but there you have what I know so far. Keep
sending me ideas and solutions!
