Smuggler - A Twitter Timeline Downloader

You are looking at revision 3 of this page, which may be out of date. View the latest version.

Fellow Aussie MVP Troy Hunt was asking on Twitter today if a tool exists that will download a user's entire timeline and save it to a file. That seemed like a simple enough job for Budgie, so I had a crack at it over lunch.

The idea is to fetch the latest 100 or so tweets from the given user, then take the Id of the earliest tweet returned, subtract one, and use that as the "max_id" of the next call. Budgie's current version on NuGet doesn't actually support the "max_id" parameter, but my working copy does. I'll get the NuGet package updated soon.
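
To make the paging idea concrete, here's a minimal sketch that runs against an in-memory list rather than Twitter. FakeTweet, GetPage and DownloadAll are hypothetical names made up for illustration, not part of Budgie. The one real fact it leans on is that Twitter treats "max_id" as inclusive, which is why one is subtracted from the lowest Id before asking for the next page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a tweet; only the Id matters for paging.
public class FakeTweet
{
    public long Id { get; set; }
    public string Text { get; set; }
}

public static class MaxIdPaging
{
    // Hypothetical stand-in for the timeline call: returns up to 'count' tweets,
    // newest first, whose Id is less than or equal to 'maxId' (when supplied).
    public static List<FakeTweet> GetPage(IEnumerable<FakeTweet> timeline, int count, long? maxId)
    {
        return timeline
            .Where(t => maxId == null || t.Id <= maxId.Value)
            .OrderByDescending(t => t.Id)
            .Take(count)
            .ToList();
    }

    // Walks the whole timeline one page at a time, newest to oldest.
    public static List<FakeTweet> DownloadAll(IEnumerable<FakeTweet> timeline, int pageSize)
    {
        var all = new List<FakeTweet>();
        long? maxId = null;

        while (true)
        {
            var page = GetPage(timeline, pageSize, maxId);
            if (page.Count == 0) break; // nothing older left

            all.AddRange(page);

            // max_id is inclusive, so step just below the oldest tweet we
            // already have to avoid fetching it a second time.
            maxId = page.Min(t => t.Id) - 1;
        }

        return all;
    }
}
```

Without the "- 1", every page after the first would start with a tweet you already have, and the final tweet would come back forever instead of an empty page.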

You call it like this:

>smuggler mabster

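That'll download all the tweets I've made, 100 or so at a time. It pauses for 30 seconds between calls because Twitter enforces a 150-calls-per-hour rate limit for anonymous calls. That limit works out to one call every 24 seconds, so a 30-second pause (120 calls per hour) stays comfortably under it.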
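
If you want to check the pause against the rate limit, or pick a different one, the arithmetic is simple enough to sketch. RateLimitMath and its two methods are hypothetical helpers, not anything Budgie or Twitter provides, and the sum ignores the time the calls themselves take.

```csharp
using System;

public static class RateLimitMath
{
    // Smallest gap between calls that stays within 'callsPerHour'.
    public static TimeSpan MinimumDelay(int callsPerHour)
    {
        return TimeSpan.FromSeconds(3600.0 / callsPerHour);
    }

    // Calls made in an hour when pausing 'delay' between them,
    // ignoring the time each call takes.
    public static int CallsPerHour(TimeSpan delay)
    {
        return (int)(3600.0 / delay.TotalSeconds);
    }
}
```

For a limit of 150 calls per hour, MinimumDelay gives 24 seconds, and a 30-second pause comes out at 120 calls per hour.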

The tweets are saved in a .txt file whose filename matches the username. The file is created in the current working directory, which is the exe's folder if you run it from there, so make sure that's a folder you have write permission to! :)
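
The program opens username + ".txt" as a relative path, and relative paths resolve against the current working directory. If you'd rather the file always land beside the exe no matter where you launch it from, one option is to build the path from the application's base directory. This is a sketch of that alternative, not what Smuggler itself does; OutputPath and ForUser are hypothetical names.

```csharp
using System;
using System.IO;

public static class OutputPath
{
    // Builds "<exe folder>\<username>.txt" regardless of the current directory.
    public static string ForUser(string username)
    {
        return Path.Combine(AppDomain.CurrentDomain.BaseDirectory, username + ".txt");
    }
}
```

You would then pass OutputPath.ForUser(username) to the FileStream constructor in place of username + ".txt".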

Note! In my tests, Twitter stops returning tweets once you go about six months back. Your mileage may vary.

Here's the code in its entirety. I'll probably end up putting it on Bitbucket at some point. In the meantime you can download a zipped executable from my Public SkyDrive folder.

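using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Budgie; // assumed: the Budgie library's namespace, home of TwitterAnonymousClient

class Program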
{
    static void Main(string[] args)
    {
        if (args.Length < 1)
        {
            Console.WriteLine("Usage: " + Path.GetFileName(Environment.GetCommandLineArgs()[0]) + " <username>");
            return;
        }

        GetTweets(args[0]).Wait();
    }

    static async Task GetTweets(string username)
    {
        using (var stream = new FileStream(username + ".txt", FileMode.Create))
        using (var writer = new StreamWriter(stream))
        {
            var client = new TwitterAnonymousClient();

            long? max = null;

            var response = await client.GetUserTimelineAsync(username, count: 100, max: max);

            while (response.StatusCode == System.Net.HttpStatusCode.OK)
            {
                if (!response.Result.Any())
                {
                    Console.WriteLine("No more tweets. Exiting.");
                    break;
                }

                Console.WriteLine("Got " + response.Result.Count() + " tweets, " + response.RateLimit.Remaining + " hits remaining.");
                foreach (var tweet in response.Result)
                {
                    writer.WriteLine(tweet.CreatedAt.ToString() + "\t" + tweet.Text);
                }
                // max_id is inclusive, so ask for tweets strictly older than the oldest one in this batch.
                max = response.Result.Min(t => t.Id) - 1;

                Console.Write("Pausing for 30 seconds ...");
                await Task.Delay(30000);
                Console.WriteLine(" done.");

                try
                {
                    response = await client.GetUserTimelineAsync(username, count: 100, max: max);
                }
                catch (Exception ex)
                {
                    Console.WriteLine("Ouch! Got an exception!");
                    Console.WriteLine(ex.ToString());
                    break;
                }
            }
        }
    }
}
budgie twitter .net
Posted by: Matt Hamilton
Last revised: 21 Aug, 2012 02:23 AM