Archive for the ‘Wikipedia’ Category

A bit of everything

Tuesday, May 29th, 2007

Today was a really good day, at least for summer break. I got to see my girlfriend again (second time in the last week… woot), got a long-standing WikiBench todo done, and found some pretty nasty bugs in Mono’s implementation of the 2.0 framework TryParse methods for integer types.

Somewhere in there I sent a form letter to the three Indiana congressmen urging them not to support the IPPA of 2007. I’d encourage you to do the same.

WikiBench now has proper support for blacklisting anonymous users, as well as blocks of IP addresses. You can ask it to blacklist, say, “192.168.0.0/16” and every edit coming from an anonymous user in that range will be reported. Combined with state persistence, which I just blogged about, this is really useful. I also toyed with the idea of allowing a user to maintain multiple user-defined blacklists, but am not certain when or if this will happen.
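For the curious, the range check itself is small. Here is a rough sketch of the idea in Python rather than C# (it is not the WikiBench code; the `Blacklist` class and its method names are made up for illustration). It assumes an anonymous editor shows up as a bare IP address and a registered one as an ordinary user name.

```python
import ipaddress


class Blacklist:
    """Hypothetical blacklist of IP networks; a single address is just a /32."""

    def __init__(self):
        self.networks = []

    def add(self, entry):
        # strict=False tolerates entries with host bits set, e.g. "192.168.0.7/16".
        self.networks.append(ipaddress.ip_network(entry, strict=False))

    def matches(self, user):
        # Registered user names do not parse as addresses, so they never match here.
        try:
            address = ipaddress.ip_address(user)
        except ValueError:
            return False
        return any(address in network for network in self.networks)


blacklist = Blacklist()
blacklist.add("192.168.0.0/16")
print(blacklist.matches("192.168.44.7"))   # inside the range
print(blacklist.matches("10.0.0.1"))       # outside the range
```

Blacklisted user names would need a separate exact-match list; this only covers the address side.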

Coming soon to WikiBench: watchlist and citation builder pads.

(I should really release a beta soon…)

WikiBench: Persistence

Friday, May 25th, 2007

One of the things I’d been dreading implementing in WikiBench is state persistence. This is, however, very important. When the user closes the application, they should find it in the same state they left it in when they open it later.

This is a tricky one to get right, though. Because one of the goals is portability, I can’t rely on certain paths to exist. Fortunately the CLI has a nifty answer to this one. A simple call to Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData) will return a user-specific path where configuration files can be saved. On *nix this is $HOME/.config and on Windows it’s %APPDATA%, which is typically %USERPROFILE%\Application Data.
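To make the mapping concrete, here is a small Python sketch of what that lookup amounts to on the two platforms mentioned. It is an illustration only: the real work is done by Environment.GetFolderPath, and the "WikiBench" directory and "config.xml" file names below are placeholders I picked for the example, not necessarily what WikiBench uses.

```python
import ntpath
import posixpath


def application_data_dir(platform, env):
    """Per-user configuration directory, following the rules described above."""
    if platform == "windows":
        return env["APPDATA"]
    return posixpath.join(env["HOME"], ".config")


def config_file_path(platform, env, app_dir="WikiBench", file_name="config.xml"):
    # app_dir and file_name are hypothetical names used only for this sketch.
    join = ntpath.join if platform == "windows" else posixpath.join
    return join(application_data_dir(platform, env), app_dir, file_name)


print(config_file_path("unix", {"HOME": "/home/alice"}))
```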

The format of the configuration file should also be portable. XML was the natural choice.

Then I had to deal with how to let addins hook into the configuration loading and saving process. If I were going for pure efficiency I would have forced addins to deal with XmlReaders and XmlWriters. This would be painful at best. While more of a memory hog, I decided to just maintain an XmlDocument in memory.

I went through several design ideas (one of which I implemented and later had to remove) before deciding on a refactoring of the IAddinEntry interface. The initial goal of this interface was to provide a way for addins to execute code immediately after being loaded, before anything else happened, and then to clean up after the addin was unloaded. The more I thought about it, the more this seemed like the perfect place to deal with configuration data.

So I converted this interface to a class. It retained its two main methods, Enter() and Exit(). I added a property that can be used by subclasses to get the XmlElement that encloses all of the addin’s data. I also added another method, UpdateConfig(). This is called just before the configuration file is written, and is guaranteed to be called only after a call to Enter(), but before a call to Exit(), which is important for addins that do set-up and tear-down of structures that need to be persisted.

So there are just two events an addin needs to deal with to persist state: Enter(), where the state is restored, and UpdateConfig(), where the state is saved. Exit() is intentionally not involved in state persistence.

This system allows a great deal of flexibility; addins can either push updates to the configuration document as they happen, or wait for WikiBench to notify them of the impending save and store their data then. Which is the appropriate technique will depend on how the addin works.
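The shape of this is easier to see in code. Below is a stripped-down Python sketch of the lifecycle, not the actual C# class: the method names mirror Enter(), UpdateConfig() and Exit(), while `BlacklistEntry`, the `<item>` element layout, and the `save` helper are assumptions made up for the example. This addin takes the "wait to be notified" approach and writes its data only in update_config().

```python
import xml.etree.ElementTree as ET


class AddinEntry:
    """Base class: restore state in enter(), save it in update_config()."""

    def __init__(self, config_element):
        # The element that encloses all of this addin's data.
        self.config = config_element

    def enter(self):
        pass

    def update_config(self):
        pass

    def exit(self):
        pass


class BlacklistEntry(AddinEntry):
    def enter(self):
        # Restore state from the in-memory document.
        self.items = [item.text for item in self.config.findall("item")]

    def update_config(self):
        # Replace the stored items with the current in-memory list.
        for old in self.config.findall("item"):
            self.config.remove(old)
        for value in self.items:
            ET.SubElement(self.config, "item").text = value


def save(document_root, addins):
    # Host side: notify every addin of the impending save, then serialize.
    for addin in addins:
        addin.update_config()
    return ET.tostring(document_root, encoding="unicode")
```

An addin that prefers to push updates as they happen would simply modify `self.config` whenever its state changes and leave update_config() empty.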

Three bits of WikiBench now use this system. WikipediaChangeStream persists the list of wikis to stream changes from, the Recent Changes pad persists the filter selection, and the Blacklist pad persists the items in the blacklist.

JavaScript Queues

Tuesday, May 15th, 2007

One of the most annoying parts of VandalSniper from a maintenance perspective was how the “JavaScript Queue” was run. Performing some complex action, such as rolling back an article and posting a warning to the talk page of the editor, required a sequence of JavaScript snippets to be run, one each time the browser finished loading a page. But the way this was implemented was absolutely horrendous. (But it was my first C# project, so I’m okay with that.)

When the queue was started, the entire window was set insensitive to prevent the user from messing with the fragile sequence of events that would take place. Two variables local to the main window class were set: a Queue containing the snippets to run, and a WebBrowser on which to execute these snippets. Once the queue was empty, or the script reported an error, these variables would be set to null and the main window made sensitive again.

Ick. Aside from being horribly annoying to the user (I mean we’ve got multiple browser tabs here… why shouldn’t they be able to use another?) it was just plain hackish.

I just finished implementing this same idea in WikiBench and I think I’ve done a lot better this time. I already have a BrowserTabState class that represents the state of one tab. Instances of this class are passed around as sort of identifiers, as well as being handles into a tab’s state. I added a boolean Locked property to this class; when set to true, all of the browser’s navigation controls are disabled, excluding the stop button. This includes the browser, the address entry, and the close tab button in the tab label. Other tabs remain functional. Oh, and the stop button that remains enabled will, when clicked, raise an event on the BrowserTabState instead of stopping navigation. This can be handled to abort whatever operation is in progress and restore control to the user.

The other part of this puzzle is the new JavaScriptQueue class. It just needs to be instantiated, given a BrowserTabState and an array of strings to work with, and it will manage all of the events.

Now this wouldn’t be complete without a little bit of an ugly hack. Part of the reason the entire main window was disabled in VandalSniper was because of the pads below the browser; if an item in the Recent Changes pad was clicked, for example, the browser would navigate away from the page that the JavaScript snippets expected it to be on. In WikiBench this could happen either through BrowserTabState.LoadUrl() or BrowserTabState.InvokeJavaScript(). The solution? These methods simply don’t work when BrowserTabState.Locked is true. I added two more methods: BrowserTabState.LoadUrlThroughLock() and BrowserTabState.InvokeJavaScriptThroughLock(). Things that expect to be working while a tab is locked (such as JavaScriptQueue) use these methods; things that don’t, like the recent changes pad, use the regular methods. A little bit screwy, but not a terrible solution.
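Here is roughly how the pieces fit together, sketched in Python instead of C#. The class and method names follow the ones above, but the bodies are stand-ins: the `history` list replaces real browser navigation, and having a refused call return False is my assumption for the sketch (the regular methods could just as well do nothing silently).

```python
class BrowserTabState:
    def __init__(self):
        self.locked = False
        self.history = []  # stand-in for real browser navigation

    def invoke_javascript(self, snippet):
        # The regular method refuses to do anything while the tab is locked.
        if self.locked:
            return False
        return self.invoke_javascript_through_lock(snippet)

    def invoke_javascript_through_lock(self, snippet):
        self.history.append("javascript:" + snippet)
        return True


class JavaScriptQueue:
    def __init__(self, tab, snippets):
        self.tab = tab
        self.snippets = list(snippets)
        tab.locked = True  # lock only this tab; other tabs stay usable

    def on_page_loaded(self):
        """Call once per finished page load: run the next snippet or unlock."""
        if self.snippets:
            self.tab.invoke_javascript_through_lock(self.snippets.pop(0))
        else:
            self.tab.locked = False


tab = BrowserTabState()
queue = JavaScriptQueue(tab, ["stepOne()", "stepTwo()"])
print(tab.invoke_javascript("fromSomePad()"))  # refused while the queue holds the lock
```

A handler for the stop-button event would do the same unlock as the empty-queue branch, after discarding whatever snippets remain.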

I’ve added the “Report to AIV” user menu item both as a proof of concept for this system, and… well, because it was in VandalSniper, and that’s the feature set I’m copying.

What’s in a name?

Sunday, May 13th, 2007

Quite a bit, if you’re Gtk#.

I’ve implemented the popup user menu in WikiBench.MediaWikiIntegration, along with the extension points for other addins to insert their own menu items into the menu. They can either provide a type extension using a class that derives from Gtk.MenuItem, or a custom extension node provided by MediaWikiIntegration to add a separator. It was working perfectly… well, almost.

If you are using the Clearlooks GTK+ engine then you know that when you hover over a menu item, the text color changes from black to white. Lots of other themes do this too. Well, the text on one of the menu items in my popup menu was not, and it happened to be a menu item provided by an addin. So, naturally, I started poking at this phenomenon.

Is it because the class I’m instantiating is in another assembly? Nope.

What about the fact that it’s instantiated using reflection? Nope again.

So I gave up and went to bed. Today I started digging around a bit more, and finally gave up again.

Later I decided to refactor a few classes, and one of the ones on the list was the menu item that was acting funky. The class was named BlacklistUserMenu, and since the other classes I’d defined were named things like TalkMenuItem and BlockLogMenuItem, I renamed it to BlacklistMenuItem for consistency. Then I compiled and ran the program to make sure my changes worked.

They did. In fact they really worked, because now the menu item text was correctly turning white when the mouse was on it. Just to make sure I wasn’t imagining it, I renamed the class back and sure enough the text color didn’t change.

I’m not sure whose fault this is, GTK+ or Gtk#, but it’s certainly one of the more interesting bugs I’ve seen. Who would have thought that the name you give a class could affect its behavior?

Glue-free JSCall#

Tuesday, May 8th, 2007

One of my goals during this rewriting of VandalSniper as a more general-purpose browser has been to reduce or eliminate the dependency on platform-specific glue libraries. JSCall# uses a C/C++ library to interact with the DOM, and this is just one more hurdle to be jumped over on the road to portability. Who wants to set up a build environment against Mozilla on Windows? Not me.

So I started fiddling around with Gecko# some, and discovered that it’s possible — and simple, really — to eliminate the glue library. To place a call to JavaScript from the CLI, it’s as simple as calling WebControl.LoadUrl() passing a string like “javascript:someFunction()”. This does change the return from WebControl.Location, but this can be worked around easily enough.

To make calls back, I copied an idea from JSCall#. The call is placed by constructing a string containing the function name and argument list, and the document title is set to this string, then immediately set back to what it was. This is picked up by the WebControl.TitleChange event handler, where the pieces are pulled apart into an array, the function name mapped to a delegate, and the delegate invoked with the arguments.
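The receiving side of that trick looks something like the Python sketch below. I'm not reproducing the real wire format here: the "jscall:" prefix and the JSON array encoding are assumptions chosen to keep the example short, and `CallbackDispatcher` is a made-up name. The point is only the flow: notice a specially formatted title, pull it apart, map the name to a handler, and invoke it.

```python
import json

PREFIX = "jscall:"  # hypothetical marker that distinguishes calls from real titles


class CallbackDispatcher:
    def __init__(self):
        self.handlers = {}

    def register(self, name, handler):
        self.handlers[name] = handler

    def on_title_change(self, title):
        """Return True if the title carried a call that was dispatched."""
        if not title.startswith(PREFIX):
            return False  # an ordinary title change
        name, *args = json.loads(title[len(PREFIX):])
        handler = self.handlers.get(name)
        if handler is None:
            return False  # nothing registered under that name
        handler(*args)
        return True


dispatcher = CallbackDispatcher()
dispatcher.register("reportUser", lambda user, reason: print(user, reason))
dispatcher.on_title_change('jscall:["reportUser", "Vandal1", "spam"]')
```

On the JavaScript side the page would set document.title to such a string and then immediately restore the old title, as described above.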

One of the cool things about the “javascript:” method of calling functions is that it seems to work just as well for whole function libraries, even those with newlines in them. The only caveat I’ve discovered is that if there is any //-style comment in the JavaScript, everything after it will be ignored. It seems that while embedded newlines are okay, they aren’t really treated as newlines. WikiBench does nothing to correct for this; it’s assumed that addins know about this. While I’d love to have it strip such comments out, that would require some parsing work (“//” can be embedded in a string safely, for example).
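For a sense of what that parsing work would involve, here is a minimal Python sketch of a comment stripper that leaves “//” alone inside quoted strings. WikiBench does not do this; the function is hypothetical, and it deliberately ignores regular expression literals and /* */ block comments, either of which could contain “//” and trip it up.

```python
def strip_line_comments(source):
    """Remove //-style comments, leaving '//' inside quoted strings alone."""
    out = []
    quote = None  # quote character of the string literal we are inside, if any
    i = 0
    while i < len(source):
        ch = source[i]
        if quote:
            out.append(ch)
            if ch == "\\" and i + 1 < len(source):
                out.append(source[i + 1])  # keep the escaped character verbatim
                i += 1
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
            out.append(ch)
        elif source.startswith("//", i):
            # Skip to the end of the line but keep the newline itself.
            end = source.find("\n", i)
            i = len(source) if end == -1 else end
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


print(strip_line_comments('var u = "http://x"; // note\nf();'))
```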

It’s one big kludge, but it works remarkably well.

More on WikiBench

Tuesday, May 8th, 2007

[Screenshot: What WikiBench looks like right now]

I’ve been hacking on WikiBench some more. The primary addition is the recent changes pad, which you can see in the screenshot. It is a separate addin that hooks into the WikipediaChangeStream addin to provide a list of changes to the user. Clicking a row in the list will display that diff in the browser, as expected.

The preferences dialog shows off the preferences pane provided by the WikipediaChangeStream addin for managing which Wikimedia projects to monitor changes from.

A lot is probably going to change between now and the public beta, but this is a pretty darn good start.

WikiBench

Wednesday, April 25th, 2007

I’ve been fiddling around with Mono.Addins and have decided that I will be rewriting VandalSniper from the ground up. I’ve had a lot of ideas for it that have become way too complicated to implement with the current design. VandalSniper was my first C#/Gtk# project anyway, so it’s about due for a rewrite. I’ve learned a lot since I designed it.

Instead of being a pure anti-vandal tool it will be a simple web browser that is extensible with Mono.Addins. Somewhat like Firefox, only “extensions” will be CLI assemblies, and will be able to integrate better with the UI and provide longer-lived services. It will ship with a few addins geared towards Wikipedia editors, not just RC patrollers.

Imagine being notified the moment a page on your watchlist is edited. VandalSniper already does this, but there’s too much anti-vandal stuff to make it attractive to the average editor. So these features will be isolated and regrouped by target audience, then packaged into addins.

The awesome thing about this design is that WikiBench can be whatever anyone wants it to be. If they have a nifty idea for a new feature they can code it themselves and have it integrate with the browser. Basically, we’re talking about the possibility of merging things like AWB, VandalSniper, and other Wikipedia-related tools into one coherent product.

Currently addins can extend the browser in two ways: preferences panes and pads. The former will show up in the WikiBench preferences dialog and will allow addin authors to present settings to the user in a convenient place. The latter will allow arbitrary widgets to be shown below the browser, in a tabbed widget — similar to what VandalSniper does now. However, the user will be able to hide pads they don’t want to see.

I already have one addin that provides change monitoring services to other addins; they can request that a certain wiki be watched (such as “en.wikipedia”), and an event will fire any time any change is made anywhere on the English Wikipedia. The event handler(s) will be passed an object that encapsulates the data of the change. In other words, addins can share the same change feed and don’t need to parse anything from browne.wikimedia.org themselves.
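In code, the sharing amounts to a small publish/subscribe arrangement. Here is a Python sketch of the idea (the real addin is a C# assembly; `ChangeStream`, `Change`, and the three fields on it are illustrative names and a guess at the minimum data, not the actual API).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical object that encapsulates the data of one edit."""
    wiki: str
    page: str
    user: str


class ChangeStream:
    def __init__(self):
        self.handlers = defaultdict(list)

    def watch(self, wiki, handler):
        # Any number of addins can ask for the same wiki.
        self.handlers[wiki].append(handler)

    def publish(self, change):
        # Called once per parsed edit; every subscriber gets the same object.
        for handler in self.handlers.get(change.wiki, []):
            handler(change)


stream = ChangeStream()
stream.watch("en.wikipedia", lambda change: print(change.page, "edited by", change.user))
stream.publish(Change("en.wikipedia", "Main Page", "Example"))
```

The feed is parsed once, in one place, and each addin only sees finished `Change` objects.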

Eventually the root extension points will include hooks into the browser, allowing addins to inject arbitrary JavaScript into pages, make calls to JavaScript functions from the CLR, and calls back into the CLR from JavaScript. This will make so many nifty things possible it boggles the mind.

I am already planning an addin that will make use of this feature to add links following Wikipedia username links. These will pop up a menu similar to the “[VS]” links in VandalSniper — except the menu will be an extension point, allowing other addins to add items to this menu.

I’m not sure how long before I’ll have a beta release, but I sure hope it’s soon. I’ll be posting some screenshots here occasionally.

This stuff is just too cool.

WatchlistBot

Monday, April 2nd, 2007

Today I got around to implementing an idea I’d had for a while. I already had code in VandalSniper that would parse the IRC feed from browne.wikimedia.org into objects that represented each edit. So I simply pulled out those pieces and threw them in with two other libraries for accessing Jabber and MySQL. The result is WatchlistBot, a Jabber bot that will send an IM to users when a page they’ve elected to watch is edited.

I think my total development time has been between three and four hours. The bot is live now; just send “help” to watchlistbot@jabber.org for directions.

Wikipedia in The Andersonian

Saturday, March 31st, 2007

My school newspaper published an article about Wikipedia. The author used me as a source, so I thought I’d share it here. I only noticed one minor factual error — VandalSniper is used by many non-administrators too; it’s not limited to admins.

VandalSniper and MonoDevelop

Thursday, July 27th, 2006

Today I decided to try converting my script-based build procedure to a MonoDevelop project. The process was pretty straightforward, although there were some pretty annoying glitches that saw me scrapping the project files several times before it was “just right.”

Now that the conversion is done, my experience with MonoDevelop is so-so. The work that has been done so far is indeed impressive, but I find myself disabling many of the unpolished features simply because they get in my way.

For example, automatic indentation makes assumptions about my coding style that are incorrect, forcing me to re-indent portions of code only to have them butchered again later when I change something. The code parser seems easily confused by things, to the point where it doesn’t recognize half of the members of a class, so I only see a portion of the members in the class browser. Code completion is likewise fragile: it frequently erases code by itself; is unable to find a class in the list after being given a few letters (though I can locate the class in the list it displays); and seems to be constantly using stale data for my own classes, as a field I recently changed from a string to a Regex still displays members of String in the member completion popup.

At this point, MonoDevelop provides very few (if any) benefits over vim and some build scripts. I will continue to watch its progress though — while I don’t find it particularly useful right now, it is a very promising project.