Why Not Wingnut?http://blog.katherineca.se/2022-12-07T17:00:07-05:00Hello Again2022-12-07T16:08:00-05:002022-12-07T17:00:07-05:00Katherine Casetag:blog.katherineca.se,2022-12-07:/personal/hello-again.html<p>I've thought a lot about writing this post.</p>
<p>I've also thought a lot about <em>not</em> writing it, and just leaving the whole thing
up as a semipermanent memorial to he-who-never-really-was.</p>
<p>But then I got to fixing up the domain redirects, and then I noticed the content
errors, and got around to fixing rather a lot of things that had piled up in the
nine years since I'd last updated this, and that meant taking some time to
read some of the posts I had made back in the FOSS@RIT days. And it really
crystalized a few things for me — and reinforced the idea that after all that
has changed, I am still me.</p>
<p>So, if you've been here before, please take note of the new name and location,
and welcome back. If you're new here, well, I really don't know what to tell
you. I can't say that I intend to update this ever again, but at least I will
feel at home if I do.</p>
Fail2ban woes2017-08-24T16:30:00-04:002022-12-07T14:28:30-05:00Katherine Casetag:blog.katherineca.se,2017-08-24:/personal/fail2ban-woes.html<p>So I had a bit of a scare this morning.</p>
<p>I'm in a hotel in North Carolina, hundreds of miles from my server, when I try to log in and get a "Connection refused" error. Phooey. I hope the hotel wifi is just filtering port 22 for some terrible reason, and hop to a friend's server over a non-standard port and I'm in. Great! I type out the connection again and...</p>
<p>Connection refused</p>
<p>Hmmm. This is starting to get concerning. I briefly freak out that my server might have had a terrible accident, when I realize that all of the other services I rely on all the time are still running. Phew. Still, I don't have ssh access, and that is very troubling. I start poking around with other things and eventually unconsciously try to ssh back in to check something and to my surprise, it works! As soon as I realize what I've done, I pull up journalctl to see if sshd is going crazy and I'm met with hundreds of lines of Chinese IPs plugging away at my server.</p>
<p>But wait! I have fail2ban installed. That should be stopping these, right? Well maybe something broke when the server was upgraded to Debian 9... and now I'm off trying to figure out what is running and where the config is.</p>
<p>Some time later, I finally have a solution. The problem (ultimately) was systemd, though not directly. I had moved this system to systemd some time early in Debian 7, which meant that the standard logging locations were still there, but no longer being written to. So fail2ban was looking for /var/log/auth.log, found a file, read it and found no problems, completely ignoring the fact that the file hadn't been written to since 2013. This isn't really fail2ban's fault; it has support for ingesting the systemd journal, but on Debian, the default backend still tries to use those files. I could set up rsyslog to start writing those files again, but I have no particular need or desire to do that outside of this program, especially as fail2ban knows how to read the journal on its own, if it's configured to do so.</p>
<p>So the solution is pretty simple, though it took me a while to get there. First, I had to <tt class="docutils literal"><span class="pre">apt-get</span> install <span class="pre">python3-systemd</span></tt>, to get the proper libraries to actually use the journal. Then, I had to have the following in my jail.local:</p>
<div class="highlight"><pre><span></span><span class="k">[sshd]</span><span class="w"></span>
<span class="na">backend</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">systemd</span><span class="w"></span>
</pre></div>
<p>I also put a few other tweaks in there as my config hadn't actually moved over from the previous version, but since it's now in the jail.local instead of editing the system config, this shouldn't be a problem again.</p>
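<p>For what it's worth, the failure mode here (a log file that exists but stopped being written years ago) is easy to check for. What follows is only a rough sketch in Python, not anything fail2ban itself does; the one-week threshold and the <tt class="docutils literal">is_stale</tt> helper are my own assumptions for illustration.</p>

```python
import os
import time

# Hypothetical threshold: treat a log as stale after a week without writes.
STALE_AFTER_SECONDS = 7 * 24 * 60 * 60


def is_stale(path, now=None, max_age=STALE_AFTER_SECONDS):
    """Return True if the file at `path` has not been modified recently."""
    if now is None:
        now = time.time()
    age = now - os.path.getmtime(path)
    return age > max_age


if __name__ == "__main__":
    log_path = "/var/log/auth.log"  # the file fail2ban was reading in this post
    if os.path.exists(log_path) and is_stale(log_path):
        print(log_path, "looks stale; the journal may be the real log source")
```

<p>Running something like this before trusting a file-based backend would have pointed at the journal much sooner.</p>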
GrooveBot 20152015-05-11T17:28:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2015-05-11:/personal/groovebot-2015.html<p>So <a class="reference external" href="http://www.grooveshark.com/">GrooveShark</a> shut down a few days ago.</p>
<p>I haven't really followed them much since I took over the original <a class="reference external" href="https://github.com/Qalthos/groovebot">GrooveBot</a>
codebase <a class="reference external" href="/personal/groovebot-updates.html">sometime in 2011</a>.</p>
<p>But now GrooveBot's namesake is gone, which marks a neat sort of milestone for
the project. It has now outlived its inspiration.</p>
<p>In any case, that isn't what I really want to talk about today. Instead of
talking about the loss of GrooveShark, let me tell you about what GrooveBot has
gained.</p>
<p>When I was last fiddling with GrooveBot sometime in 2014, I came to a sudden
realization. If I wanted to subject the people of the FOSSBox to the mad fever
dream of Smash Mouth mashups that is <a class="reference external" href="http://neilcic.com">Neil Cicierega</a>'s <a class="reference external" href="http://neilcic.com/mouthsounds">Mouth Sounds</a>, I
would have to add some form of SoundCloud integration <a class="footnote-reference" href="#need" id="footnote-reference-1">[1]</a>.</p>
<p>At the time, I was busy working on one of the issues that had been annoying me
(and others) since the bot's early beginnings, namely the lack of a permanent
queue that persisted through crashes (and live-coding restarts). Along with that
came some architectural updates that should make it easier to one day support
multiple backends at the same time, which is another longstanding issue that I
would dearly like to fix one day. Point being, I had a lot of other things I
wanted to accomplish, and not a lot of perceived benefit to stopping that and
dropping in a new backend instead.</p>
<p>Fast-forward to PyCon 2015 sprints, and Neil Cicierega has not only made another
ridiculous mashup album with <a class="reference external" href="http://neilcic.com/mouthsilence">Mouth Silence</a>, but has also been posting a
number of not-yet-albumized songs to SoundCloud individually. So, I took a look
through SoundCloud's <a class="reference external" href="https://developers.soundcloud.com/docs/api/guide">API</a>, and went through some of GrooveBot's less atrophied
backends <a class="footnote-reference" href="#old" id="footnote-reference-2">[2]</a>, and put together a new version that supports SoundCloud.</p>
<p>And then I proceeded to play <a class="reference external" href="https://soundcloud.com/neilcic/bustin">Bustin'</a> on repeat for a while.</p>
<table class="docutils footnote" frame="void" id="need" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-1">[1]</a></td><td>Well, I mean, I could use the mpd backend, but without multiple
backend support, the pool of music would get stale quickly, and as much as
this is a project to force my musical tastes onto others, it is important to
me to let them do the same to me in return.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="old" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-2">[2]</a></td><td>The 2014 changes I mentioned were actually pretty deep in scope, and
while I am happier with where the bot is now, for a while Spotify was the
only supported backend because it was the only one in use, and the only one
that made the switch. Obviously now SoundCloud works, and I think MPD should
be working, but Pandora has been broken for years, and GrooveShark never
worked in the first place.</td></tr>
</tbody>
</table>
New Domain!2015-04-29T18:23:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2015-04-29:/meta/new-domain.html<p>Hey, look at that, I have a new domain!</p>
<p>It's a thing I've been looking into for a while now, but I finally went through
with it. Hopefully it'll get me to update this a bit more in the future, I have
a few draft posts I have been meaning to put together, especially after PyCon
2015.</p>
Charsheet 1.02013-12-02T17:24:00-05:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2013-12-02:/fossrit/charsheet-10.html<p>I've been doing some work for <a class="reference external" href="https://charsheet-qalthos.rhcloud.com">Charsheet</a> recently, fixing old bugs and adding
new features. I figured it was about time for a 1.0, so I tagged it and released
it to the wild. Many thanks to <a class="reference external" href="http://oddshocks.com">oddshocks</a> not only for the original 0.1
version of charsheet, but for continued assistance in massaging the site into
something cool.</p>
<div class="section" id="notable-features">
<h2>Notable Features</h2>
<ul class="simple">
<li>Character classes! You are now given a class based on your favorite language
(by klocs <a class="footnote-reference" href="#kloc" id="footnote-reference-1">[1]</a>), and your top two stats on the page.</li>
<li>Fixing Github integration. This was technically fixed in August, but it was
not finalized or pushed upstream until a few weeks ago.
- Github backend switched from pygithub3 back to pygithub</li>
<li>Completely removed tw2 from the site <a class="footnote-reference" href="#tw2" id="footnote-reference-2">[2]</a></li>
<li>Charsheet will now deploy on Openshift!
- But not with a MySQL database. This is a known <a class="reference external" href="https://github.com/civx/knowledge/issues/5">bug</a> with knowledge.</li>
<li>GNU-cat will no longer stick around when the back button is pressed
- Yes, it is actually called GNU-cat.</li>
<li>Fixed some incompatibilities with Pyramid 1.5</li>
</ul>
<p>There's more, but that is the big stuff I'm seeing from the git log. In any
case, the site is live, so just hop over to
<a class="reference external" href="https://charsheet-qalthos.rhcloud.com">https://charsheet-qalthos.rhcloud.com</a> and put in your Github, Ohloh, and/or
Coderwall username(s) and see what your coder character sheet looks like!</p>
<table class="docutils footnote" frame="void" id="kloc" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-1">[1]</a></td><td>Kilo-Lines Of Code, or a unit of 1000 lines of code for the
uninitiated.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="tw2" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-2">[2]</a></td><td>This bears some explanation, but it was just tw2.forms, and the forms were
static, meaning they were quite underutilized. Forms are now pure HTML
forms.</td></tr>
</tbody>
</table>
</div>
Deploying MediaGoblin 2: SELinux2013-09-27T16:09:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2013-09-27:/fossrit/deploying-mediagoblin-2-selinux.html<p>So, <a class="reference external" href="deploying-mediagoblin-1-fastcgi-vs-uwsgi.html">earlier</a>, I wrote about my experience deploying <a class="reference external" href="http://mediagoblin.org">MediaGoblin</a>. None of
this was necessary; I was mostly trying to diagnose some problems I was having
and found uWSGI a more comfortable environment than FastCGI.</p>
<p>And what problem was this, you ask? Well, when I first ran through the
installation instructions, everything worked swimmingly (the port issues
mentioned aside). The server it was running on, however, was running Fedora 17,
while Fedora 20 has just reached alpha status. So, in the interest of
retaining compatibility and security fixes, there was a fun afternoon of double
distribution updates.</p>
<p>At first, everything seemed to be working fine, but then MediaGoblin began
inexplicably throwing 'permission denied' errors. As usually happens in these
cases, the culprit was SELinux, an additional security layer which normally
transparently protects your system, until something unexpected shows up.</p>
<p>I don't pretend to understand SELinux, but I do understand the security
improvements it brings. Plus I don't want to disable it on a system I don't
own.</p>
<p>Long, boring story short (and it was very long, being noticed first, and not
fixed until well after I figured out how to get uWSGI running), I got something
reasonably close to what I think I'm supposed to do. The key was the command
<tt class="docutils literal">setsebool <span class="pre">-P</span> httpd_can_network_connect on</tt>, which re-enabled the ability for
nginx to talk to programs on a network socket, as in the MediaGoblin
documentation.</p>
<p>I initially changed to a file-based Unix socket, but I could not, for the life
of me, figure out how to enable this simply without changing a large number of
SELinux booleans. There may be some simpler way of accomplishing this, but alas,
in this case SELinux has once again bested me.</p>
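<p>If you want to confirm the boolean actually stuck, <tt class="docutils literal">getsebool</tt> prints lines of the form <tt class="docutils literal">name --&gt; on</tt>. Here is a small Python sketch for reading that output; the <tt class="docutils literal">parse_getsebool</tt> helper is hypothetical and only assumes that line format.</p>

```python
import subprocess


def parse_getsebool(output):
    """Parse lines like 'name --> on' into a {name: bool} dict."""
    booleans = {}
    for line in output.splitlines():
        name, sep, state = line.partition("-->")
        if sep:
            # Only the first word after the arrow is the current state.
            booleans[name.strip()] = state.split()[:1] == ["on"]
    return booleans


if __name__ == "__main__":
    # Requires the SELinux userland tools to be installed.
    result = subprocess.run(
        ["getsebool", "httpd_can_network_connect"],
        capture_output=True, text=True, check=True,
    )
    print(parse_getsebool(result.stdout))
```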
Deploying MediaGoblin 1: FastCGI vs uWSGI2013-09-23T13:20:00-04:002022-12-07T15:03:23-05:00Katherine Casetag:blog.katherineca.se,2013-09-23:/fossrit/deploying-mediagoblin-1-fastcgi-vs-uwsgi.html<p>Last week I did a thing I really wasn't expecting to. I deployed
<a class="reference external" href="http://mediagoblin.org">MediaGoblin</a> to FOSS@RIT's yacht server <a class="reference external" href="http://yacht.rit.edu/mediagoblin/">here</a>. The initial setup and
<a class="reference external" href="https://mediagoblin.readthedocs.org/en/v0.5.0/siteadmin/deploying.html">instructions</a> are some of the clearest and most straightforward I have seen in an
open-source project.</p>
<p>There are several reasons I haven't written this up earlier. One of the reasons
was that the web server configuration file was more complex than I was used to, so
in order to get the server running quickly, I made a new config file for port
8080. Unfortunately, due to various arcane networking policies, while this
allowed anyone inside RIT to access the server, it was still not available to
the outside world.</p>
<p>Also, though the instructions were very clear, they used a few things I had not
used before, and a few things that weren't used in the way I was used to them.
This is the first blog post on the subject, detailing my confusion with
FastCGI and its eventual replacement with uWSGI.</p>
<div class="section" id="what-the-flup">
<h2>What the <tt class="docutils literal">flup</tt>?</h2>
<p>MediaGoblin, as documented, uses FastCGI to route requests from the web server
to MediaGoblin. The CGI in FastCGI refers to the 'Common Gateway Interface',
a standard developed to allow web servers to act as 'gateways' to serve not
just files but the output of executable programs. The MediaGoblin docs describe
how to use a Python module called <tt class="docutils literal">flup</tt> to enable this communication.</p>
<p>There's a bit more to it than that, but in Python land, this
turns out to be a more questionable prospect than it might seem. Python already
has its own gateway interface (the Web Server Gateway Interface, or WSGI), so
flup ends up translating WSGI into FastCGI just so that the web server can
interpret it and turn it into a web page. This would be fine except that there
are other WSGI-specific modules which can serve a WSGI application without the
FastCGI detour.</p>
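<p>If WSGI is new to you, the whole interface is one callable. This is a minimal sketch of a WSGI application, not anything from MediaGoblin itself; the greeting and the port are made up for illustration.</p>

```python
def application(environ, start_response):
    """A minimal WSGI callable: the interface any WSGI server expects."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    # Serve it with the standard library's reference server for a quick test.
    from wsgiref.simple_server import make_server

    with make_server("127.0.0.1", 8000, application) as httpd:
        httpd.serve_forever()
```

<p>Anything that can call that function with an <tt class="docutils literal">environ</tt> dict and a <tt class="docutils literal">start_response</tt> callback can serve the app, which is why swapping flup for uWSGI needs no changes to the application.</p>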
<p>At this point, I assume that you are either skipping ahead past things you
already know or are horribly lost, so I'll just say that I eventually moved
MediaGoblin from the <tt class="docutils literal"><span class="pre">paste->flup->FastCGI->nginx</span></tt> contraption it was to a more
comprehensible <tt class="docutils literal"><span class="pre">uWSGI->nginx</span></tt>, and this is how I did it.</p>
</div>
<div class="section" id="enter-uwsgi">
<h2>Enter uWSGI</h2>
<p>First, I changed the nginx config to talk to uWSGI instead of FastCGI.
As I was also trying to move MediaGoblin to a subdirectory, I also added the
<tt class="docutils literal">uwsgi_modifier1</tt> line and altered <tt class="docutils literal">SCRIPT_NAME</tt> accordingly:</p>
<div class="highlight"><pre><span></span><span class="c1"># Load MediaGoblin via uWSGI</span>
<span class="k">location</span><span class="w"> </span><span class="s">/mediagoblin/</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="kn">include</span><span class="w"> </span><span class="s">uwsgi_params</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kn">uwsgi_pass</span><span class="w"> </span><span class="n">127.0.0.1</span><span class="p">:</span><span class="mi">26543</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="c1"># our understanding vs nginx's handling of script_name vs</span>
<span class="w"> </span><span class="c1"># path_info don't match :)</span>
<span class="w"> </span><span class="kn">uwsgi_param</span><span class="w"> </span><span class="s">SCRIPT_NAME</span><span class="w"> </span><span class="s">"/mediagoblin"</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kn">uwsgi_modifier1</span><span class="w"> </span><span class="mi">30</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
</pre></div>
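<p>The point of setting <tt class="docutils literal">SCRIPT_NAME</tt> is that the application should see paths relative to its mount point. As I understand it, that is the rewriting modifier 30 asks uWSGI to do. Here is a rough Python sketch of the split; <tt class="docutils literal">split_mount</tt> is a hypothetical helper, not uWSGI's actual code.</p>

```python
def split_mount(script_name, request_path):
    """Split a request path into (SCRIPT_NAME, PATH_INFO) for a mounted app."""
    prefix = script_name.rstrip("/")
    if request_path == prefix or request_path.startswith(prefix + "/"):
        # The app only sees what comes after its mount point.
        return prefix, request_path[len(prefix):]
    # Not under the mount point: leave the path untouched.
    return "", request_path
```

<p>With the config above, a request for <tt class="docutils literal">/mediagoblin/u/alice/</tt> would reach MediaGoblin with <tt class="docutils literal">SCRIPT_NAME</tt> set to <tt class="docutils literal">/mediagoblin</tt> and <tt class="docutils literal">PATH_INFO</tt> set to <tt class="docutils literal">/u/alice/</tt>.</p>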
<p>Second, I altered the <tt class="docutils literal">lazystarter.sh</tt> file to accommodate being run with
uWSGI. This is a bit complicated as <tt class="docutils literal">lazyserver.sh</tt>, <tt class="docutils literal">lazystarter.sh</tt>, and
<tt class="docutils literal">lazycelery.sh</tt> are all actually the same file, with certain things changing
depending on the name by which it is invoked. I changed two sections <a class="footnote-reference" href="#footnote-1" id="footnote-reference-1">[1]</a>,
first:</p>
<div class="highlight"><pre><span></span><span class="w"> </span> local_bin="./bin"<span class="w"></span>
<span class="w"> </span> case "$selfname" in<span class="w"></span>
<span class="w"> </span> lazyserver.sh)<span class="w"></span>
<span class="gd">- starter_cmd=paster</span><span class="w"></span>
<span class="gi">+ starter_cmd=uwsgi</span><span class="w"></span>
<span class="w"> </span> ini_prefix=paste<span class="w"></span>
<span class="w"> </span> ;;<span class="w"></span>
</pre></div>
<p>And then near the very end of the file:</p>
<div class="highlight"><pre><span></span><span class="w"> </span> export CELERY_ALWAYS_EAGER=true<span class="w"></span>
<span class="w"> </span> case "$selfname" in<span class="w"></span>
<span class="w"> </span> lazyserver.sh)<span class="w"></span>
<span class="gd">- $starter serve "$ini_file" "$@" --reload</span><span class="w"></span>
<span class="gi">+ $starter --plugin python --virtualenv . --ini-paste "$ini_file" "$@"</span><span class="w"></span>
<span class="w"> </span> ;;<span class="w"></span>
</pre></div>
<p>This method allows you to keep using all the information on how to run
MediaGoblin from paste.ini, while using uWSGI to do all the heavy lifting.
The socket still needs to be defined with the command, though, with
<tt class="docutils literal">./lazyserver.sh <span class="pre">--socket</span> 127.0.0.1:26543</tt> or whatever socket you are using.</p>
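<p>The one-file-many-names trick is worth a quick illustration. The real script is shell, so treat this as a loose Python sketch of the same idea; the table only contains the two names and values that appear in this post, and <tt class="docutils literal">pick_mode</tt> is a made-up helper.</p>

```python
import os
import sys


def pick_mode(invoked_as):
    """Map the name a script was invoked under to (starter command, ini prefix)."""
    name = os.path.basename(invoked_as)
    # Values taken from the diff above and the lazyuwsgi.sh footnote.
    modes = {
        "lazyserver.sh": ("paster", "paste"),
        "lazyuwsgi.sh": ("uwsgi", "paste"),
    }
    if name not in modes:
        raise ValueError("unknown invocation name: " + name)
    return modes[name]


if __name__ == "__main__":
    # sys.argv[0] is the name (or symlink) the script was started as.
    print(pick_mode(sys.argv[0]))
```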
<p>As a side note, this also allows you to use your system's uWSGI <a class="reference external" href="http://uWSGI-docs.readthedocs.org/en/latest/Emperor.html">emperor</a> to
manage bringing up the uWSGI process for you. If you are running <a class="reference external" href="http://mediagoblin.readthedocs.org/en/v0.5.0/siteadmin/production-deployments.html#separate-celery">celery as a
separate process</a>, this still needs to be done somehow, but otherwise (or if
you've kept <tt class="docutils literal">CELERY_ALWAYS_EAGER=true</tt>), then MediaGoblin should be managed
automatically. This is the format I eventually settled upon, using the
following uWSGI ini file:</p>
<div class="highlight"><pre><span></span><span class="k">[uwsgi]</span><span class="w"></span>
<span class="na">plugin</span><span class="o">=</span><span class="s">python</span><span class="w"></span>
<span class="na">uid</span><span class="o">=</span><span class="s">mediagoblin</span><span class="w"></span>
<span class="na">gid</span><span class="o">=</span><span class="s">mediagoblin</span><span class="w"></span>
<span class="na">socket</span><span class="o">=</span><span class="s">127.0.0.1:26543</span><span class="w"></span>
<span class="na">virtualenv</span><span class="o">=</span><span class="s">/srv/www/mediagoblin</span><span class="w"></span>
<span class="na">chdir</span><span class="o">=</span><span class="s">/srv/www/mediagoblin</span><span class="w"></span>
<span class="na">ini-paste</span><span class="o">=</span><span class="s">/srv/www/mediagoblin/paste.ini</span><span class="w"></span>
<span class="na">logto</span><span class="o">=</span><span class="s">/srv/www/mediagoblin/mg.log</span><span class="w"></span>
</pre></div>
</div>
<div class="section" id="what-next">
<h2>What Next?</h2>
<p>As far as I can tell, this should have been all we needed to get running.
Well, this wouldn't have been necessary either, except for some of the
repercussions of the other big problem that reared its head, SELinux.</p>
<p>But that is <a class="reference external" href="deploying-mediagoblin-2-selinux.html">another post</a>.</p>
<table class="docutils footnote" frame="void" id="footnote-1" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-1">[1]</a></td><td>This statement is not entirely accurate. I actually made a new link
named <tt class="docutils literal">lazyuwsgi.sh</tt> and added the sections instead of altering the
existing ones. This format was chosen for clarity.</td></tr>
</tbody>
</table>
</div>
Building a Schoolserver2013-08-05T02:04:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2013-08-05:/fossrit/building-a-schoolserver.html<p>One of my tasks for this summer has been to try and get the FOSSBox's
schoolserver up and running again. We previously had one a number of years ago,
but the hardware failed some time ago and the system itself was running a
hacked-together Debian build and did not have access to some of the actual
schoolserver scripts.</p>
<p>I first attempted to follow the <a class="reference external" href="http://wiki.laptop.org/go/XS_Installing_Software_0.7">instructions</a> for the latest proper release of
the XO Schoolserver (henceforth XS), but this did not end very well. For one,
the instructions (and many automated scripts) assume you have two network
cards: one for the internal LAN to which the XO laptops connect, and another
connecting to the Internet. This assumption that the XS would be the gateway
device for a network of XO laptops would be fine in most deployments where
there is no existing infrastructure to get in the way, but at RIT where there
is not only significant infrastructure, but infrastructure I cannot easily
modify or control, it is less applicable.</p>
<p>Here are the steps I have taken to turn a fresh CentOS/RHEL server into an XS:</p>
<ol class="arabic simple">
<li>Set up <a class="reference external" href="http://fedoraproject.org/wiki/EPEL">EPEL</a>.</li>
<li>Add the <a class="reference external" href="http://wiki.laptop.org/go/XS_Installing_Software_0.7#Installing_on_top_of_existing_OS_installation">OLPC-XS repository</a> to your yum config.</li>
<li><tt class="docutils literal">yum install ejabberd idmgr <span class="pre">ds-backup-server</span> <span class="pre">xs-activity-server</span></tt></li>
<li><tt class="docutils literal"><span class="pre">xs-domain-config</span> <domain name></tt></li>
<li><tt class="docutils literal"><span class="pre">xs-setup</span></tt></li>
<li>Use <tt class="docutils literal"><span class="pre">system-config-firewall-tui</span></tt> to unblock ports 22, 80, 8080, 5222, 5223,
and 4369</li>
</ol>
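<p>After the firewall step it's worth checking from another machine that the ports really are open. This is a minimal Python sketch, assuming plain TCP reachability is a good enough test; <tt class="docutils literal">reachable_ports</tt> and the example hostname are hypothetical.</p>

```python
import socket

# Ports unblocked in the firewall step above.
XS_PORTS = [22, 80, 8080, 5222, 5223, 4369]


def reachable_ports(host, ports, timeout=1.0):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            # Refused, filtered, or timed out: treat as not reachable.
            pass
    return found


if __name__ == "__main__":
    # Replace the placeholder hostname with your own server.
    print(reachable_ports("schoolserver.example.org", XS_PORTS))
```

<p>A port only shows up as reachable once the matching service is actually listening, so an empty result right after step 6 does not necessarily mean the firewall is wrong.</p>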
<p><tt class="docutils literal"><span class="pre">xs-setup</span></tt> is the most trying of the commands, because it does a lot of
background work to set up the OLPC versions of many config files (while still
leaving the originals in place).</p>
<p>There's more to it: <tt class="docutils literal"><span class="pre">xs-setup</span></tt> tends to have some annoying side effects,
and some of the config files need to be manually updated, but this is the general
idea. This post will be further updated as time goes on.</p>
<p>The main problem seen so far is that this is being set up on a RHEL server
backed by Xen, and the OLPC-XS repository keeps wanting to install an
incompatible kernel, hosing the system on a regular basis.</p>
Remysmoke Improving2013-07-26T03:44:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2013-07-26:/personal/remysmoke-improving.html<p><a class="reference external" href="http://remysmoke.linkybook.com/">Remysmoke</a> has had a few updates over the past week, and I think they've gone
pretty well. Remysmoke is now at version 1.2; here's the changelog for the past
few versions:</p>
<div class="section" id="remysmoke-1-0">
<h2>Remysmoke 1.0</h2>
<ul class="simple">
<li>Moved graphs from tw2-protoviz to <a class="reference external" href="http://pygal.org/">pygal</a>.</li>
<li>Moved smoke input form from tw2-forms to hand-crafted HTML5 forms.</li>
<li>Completely removed ToscaWidgets and ToscaWidgets2 from Remysmoke (including
some very old boilerplate TurboGears code).</li>
</ul>
</div>
<div class="section" id="remysmoke-1-1">
<h2>Remysmoke 1.1</h2>
<ul class="simple">
<li>Updated codebase to TurboGears 2.3.</li>
<li>Redeployed Remysmoke on <a class="reference external" href="http://openshift.com/">OpenShift</a> Python 2.7 Cartridge (was Python 2.6).</li>
</ul>
</div>
<div class="section" id="remysmoke-1-2">
<h2>Remysmoke 1.2</h2>
<ul class="simple">
<li>Cleaned up login and smoke forms to be more consistent.</li>
<li>Themed form checkboxes to be consistent with the rest of the UI.</li>
<li>Added new theming for disabled input boxes.</li>
<li>Added new 'unsmoke' option, allowing a user to attest that they have not
smoked that day.</li>
</ul>
<p>Remysmoke 1.2 went live late last night, and a hotfix 1.2.1 will be up shortly
with an OpenShift-specific patch dealing with overriding the database location
at runtime.</p>
</div>
National Day of Civic Hacking2013-06-02T03:18:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2013-06-02:/fossrit/national-day-of-civic-hacking.html<p>One of the projects I worked on during the <a class="reference external" href="http://hackforchange.org/fossrit-rochester-civic-hackathon">Rochester edition</a> of the
<a class="reference external" href="http://hackforchange.org">National Day of Civic Hacking</a> hackathon was not actually anything intended
for the hackathon, but a short project I wrote last summer.</p>
<p>If you remember from <a class="reference external" href="introducing-fossrit-timeline-year-in-review.html">my last post</a> on the subject, I had made three files,
one for each of the years that the FOSSBox had been keeping track of its
activities on the <a class="reference external" href="http://foss.rit.edu/timeline">timeline</a>. The files were very silly: all the data was
loaded on the fly from a JSON file, so the only things in the files were the structure and the 'decoration' text.</p>
<p>Clearly this was not something that could stand. Today I finally managed to get
all the files together into one page. Now, when the page loads, it scans the
JSON file for all years mentioned, and populates a drop-down list with all the
years it has found. The first (and usually latest) year's data is then loaded
onto the page.</p>
<p>When the user clicks on another year from the list, the content is reloaded with
data from the appropriate year. If you want to see it in action, the new review
page for timeline now lives <a class="reference external" href="http://foss.rit.edu/timeline/summary.html">here</a>.</p>
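<p>The page itself does this in JavaScript, but the year-scanning logic is small
enough to sketch in Python. This is only an illustration: the sample data and
the <tt class="docutils literal">date</tt> and <tt class="docutils literal">title</tt> keys are assumptions, since the real
layout of the timeline's JSON file isn't shown here.</p>

```python
import json

# Hypothetical sample of the timeline data; the real JSON layout is not
# shown in the post, so the "date" and "title" keys are assumptions.
SAMPLE = """
[
  {"date": "2010-09-15", "title": "First event"},
  {"date": "2011-03-02", "title": "Second event"},
  {"date": "2012-01-21", "title": "Third event"},
  {"date": "2012-06-01", "title": "Fourth event"}
]
"""


def years_in(events):
    """Return every year mentioned in the events, latest first."""
    return sorted({event["date"][:4] for event in events}, reverse=True)


def events_for(events, year):
    """Return only the events that fall in the given year."""
    return [event for event in events if event["date"].startswith(year)]


events = json.loads(SAMPLE)
years = years_in(events)   # these would populate the drop-down list
selected = years[0]        # the first (latest) year is loaded by default
for event in events_for(events, selected):
    print(selected, event["title"])
```

<p>Picking another year from the list would simply call <tt class="docutils literal">events_for</tt> again
with the new year and redraw the content.</p>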
A note on lxml2013-06-01T20:46:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2013-06-01:/fossrit/a-note-on-lxml.html<p>This post is being written from the <a class="reference external" href="http://hackforchange.org/fossrit-rochester-civic-hackathon">Rochester edition</a> of the
<a class="reference external" href="http://hackforchange.org">National Day of Civic Hacking</a> hackathon.</p>
<p>I've been doing a lot of things with <a class="reference external" href="http://pygal.org">pygal</a> lately. It's a really neat tool
for making SVG graphs in python. One 'problem' is that it needs <a class="reference external" href="http://lxml.de">lxml</a>, and
that has C extensions …</p><p>This post is being written from the <a class="reference external" href="http://hackforchange.org/fossrit-rochester-civic-hackathon">Rochester edition</a> of the
<a class="reference external" href="http://hackforchange.org">National Day of Civic Hacking</a> hackathon.</p>
<p>I've been doing a lot of things with <a class="reference external" href="http://pygal.org">pygal</a> lately. It's a really neat tool
for making SVG graphs in python. One 'problem' is that it needs <a class="reference external" href="http://lxml.de">lxml</a>, and
that has C extensions that need to be compiled. This isn't too bad, though
sometimes it makes me stop and install a compiler. The real problem is the
external header files it needs to compile.</p>
<p>NOTE: this blog post is entirely the result of my own laziness. Had I simply
perused the <a class="reference external" href="http://lxml.de/installation.html#installation">documentation</a>, I would have found this much sooner. Thus, this
post is solely a marker of my own eagerness to get things running quickly.</p>
<p>After I was asked for the third time how to install lxml, I finally decided I
would figure out how it works so it could be done properly. I did find out
what the needed package was, but I also found that if the shell variable
<tt class="docutils literal">STATIC_DEPS=true</tt> was set prior to installation, lxml would seek out and
download its requirements for you. I don't know how legitimate this is for a
Python install, but it was certainly quite useful for me and the others
trying to use lxml. It even works inside a virtualenv, though I don't know why
it wouldn't.</p>
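<p>For reference, here is a small sketch of the same thing driven from Python
rather than typed at the shell. The helper names are made up for illustration;
the only parts taken from lxml itself are the <tt class="docutils literal">STATIC_DEPS=true</tt> variable
and the package name. Nothing is installed unless you call <tt class="docutils literal">install()</tt>
yourself.</p>

```python
import os
import subprocess
import sys


def lxml_install_command(python=sys.executable):
    """Build the pip command line for installing lxml (helper name is made up)."""
    return [python, "-m", "pip", "install", "lxml"]


def static_deps_env(base_env=None):
    """Return a copy of the environment with STATIC_DEPS=true added."""
    env = dict(os.environ if base_env is None else base_env)
    env["STATIC_DEPS"] = "true"
    return env


def install():
    """Actually run the install; not called here so the sketch has no side effects."""
    subprocess.check_call(lxml_install_command(), env=static_deps_env())


# Show the roughly equivalent shell one-liner instead of running anything.
print("STATIC_DEPS=true " + " ".join(lxml_install_command("python")))
```

<p>Because the variable only has to be present in the environment of the
installing process, the same call works from inside a virtualenv's own
interpreter.</p>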
ditaa Addition2013-05-15T00:00:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2013-05-15:/meta/ditaa-addition.html<p>Recently, while wandering around the Internet, I found a neat little project
called <a class="reference external" href="http://ditaa.sourceforge.net/">ditaa</a>. The basic idea is it takes a textual representation of a
diagram, and turns it into a real image. My first thought was that this looked
real neat for use in this blog.</p>
<p>I write this …</p><p>Recently, while wandering around the Internet, I found a neat little project
called <a class="reference external" href="http://ditaa.sourceforge.net/">ditaa</a>. The basic idea is it takes a textual representation of a
diagram, and turns it into a real image. My first thought was that this looked
real neat for use in this blog.</p>
<p>I write this now not in HTML directly, but in <a class="reference external" href="http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html">reStructured Text</a>, a somewhat
less verbose markup language native to Python. Thankfully, someone had already
made a <a class="reference external" href="https://gist.github.com/dvarrazzo/3807373">directive</a> for ditaa diagrams to be embedded in reST and get rendered
to an image in the final page. With this in hand, I managed to cobble together
a plugin for pelican to add that directive to my posts.</p>
<p>There's only one live example at the moment, which is my
<a class="reference external" href="/fossrit/american-greetings-hackathon-followup.html">American Greetings hackathon</a> post. The plugin itself has been added to my
fork of <a class="reference external" href="http://github.com/Qalthos/pelican-plugins">pelican-plugins</a>.</p>
<p>It's not all great, though. One thing I would really like is the ability to
render the diagram to SVG instead of PNG. SVG is much better for things like
simple diagrams, as it is web-native (all it is is a special kind of XML
document) and vector-based, making it inherently scalable for different
sized displays. As near as I can tell, ditaa appears to use SVG internally, at
least to some extent. Ideally, I'd like to try to re-implement it in Python, not
necessarily for any benefit, but because the problem sounds interesting.</p>
Blog Rebuild2013-04-21T18:03:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2013-04-21:/meta/blog-rebuild.html<p>As you may notice, my blog looks a lot different now. On the recommendation of
a few FOSSBoxers, I have moved to <a class="reference external" href="http://blog.getpelican.com/">Pelican</a>, a Python-based static site
generator. The practical upshot of which is that I can now write posts in
<a class="reference external" href="http://docutils.sourceforge.net/rst.html">reStructuredText</a>, keep the posts themselves in git, and host …</p><p>As you may notice, my blog looks a lot different now. On the recommendation of
a few FOSSBoxers, I have moved to <a class="reference external" href="http://blog.getpelican.com/">Pelican</a>, a Python-based static site
generator. The practical upshot of which is that I can now write posts in
<a class="reference external" href="http://docutils.sourceforge.net/rst.html">reStructuredText</a>, keep the posts themselves in git, and host the whole thing
on <a class="reference external" href="http://pages.github.com/">GitHub Pages</a>.</p>
<p>While all of my posts have been successfully moved, not every one has survived
intact. In particular, paragraph splits in a number of older posts have been
lost. I will try to rectify those as they crop up, but there's a lot to go
through.</p>
<p>One of the casualties of the move was the comments. Pelican does not handle
comments on its own, and while I can hook up a specialized service like Disqus,
I've decided that it isn't worth the effort at present.</p>
American Greetings Hackathon Followup2013-01-24T22:33:00-05:002022-12-07T15:26:07-05:00Katherine Casetag:blog.katherineca.se,2013-01-24:/fossrit/american-greetings-hackathon-followup.html<p>Last weekend was the <a class="reference external" href="http://foss.rit.edu/node/425">American Greetings Hackathon</a>, and it was one of
the most successful yet. We got more than 70 people attending and
working on projects, and most of those projects had at least something
working by the end of the 24 hours.</p>
<p>I worked with <a class="reference external" href="http://blog.helixoide.com/">Ross Delinger</a>, with …</p><p>Last weekend was the <a class="reference external" href="http://foss.rit.edu/node/425">American Greetings Hackathon</a>, and it was one of
the most successful yet. We got more than 70 people attending and
working on projects, and most of those projects had at least something
working by the end of the 24 hours.</p>
<p>I worked with <a class="reference external" href="http://blog.helixoide.com/">Ross Delinger</a>, with occasional contributions from <a class="reference external" href="http://www.ryansb.com/">Ryan
S. Brown</a> on a project eventually called <a class="reference external" href="http://github.com/ryansb/hetHUD">netHUD</a>. The idea was to
take the new <a class="reference external" href="http://nethackwiki.com/wiki/NetHack_4">NetHack 4</a> network <a class="reference external" href="http://nethackwiki.com/wiki/NetHack_4_Network_Protocol">protocol</a> and try to make something
more than just another interface to NetHack.</p>
<p>Originally, we set out simply to connect to the NetHack server and have
a second channel of information. Ideally, we would make calls to the
server and have it update us about games in progress. This turned out to
be problematic for a number of reasons, but the most immediate was that
we could not get two simultaneous connections to the server.</p>
<p>This meant we had to redesign our service. Instead of being a second
stream, we would need to piggyback on the initial connection, which
meant writing a server proxy. This may not have been the only way to do
it, or even the best way to do it, but that's just how we roll. The
eventual structure (all written in delicious <a class="reference external" href="http://www.twistedmatrix.com">twisted</a> protocols)
looked something like this:</p>
<img alt="Module diagram" class="ditaa" src="/images/netHUD module diagram.png" />
<p>tee.py acted as the proxy and sent any messages received from the
NetHack server to the controller, which cached the current state of all
the games and sent updates to any listening netHUD instances. This way,
you connect to NetHack as usual and log in, and then in a second window,
you connect to the server again on the netHUD port and get a slew of
information about your current inventory, nearby points of interest (e.g.
monsters, items, traps), and other information. There's a lot more we
could add to this over time; one of the ideas thrown around at the
beginning was integration with the NetHack wiki, providing additional
information about items, monsters, even entire levels.</p>
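<p>To make the data flow a little more concrete, here is a minimal sketch of
the tee-and-controller idea in plain Python. It is not the netHUD code: the
real thing was built from Twisted protocols, and the class names, message
shapes and game ids below are all invented for illustration.</p>

```python
class Controller:
    """Caches the latest state per game and pushes updates to listeners."""

    def __init__(self):
        self.games = {}       # game id -> latest known state (a dict)
        self.listeners = []   # callables standing in for netHUD connections

    def subscribe(self, listener):
        """Register a listener and replay the cached state to it."""
        self.listeners.append(listener)
        for game_id, state in self.games.items():
            listener(game_id, dict(state))

    def update(self, game_id, message):
        """Merge a message from the server into the cache and fan it out."""
        state = self.games.setdefault(game_id, {})
        state.update(message)
        for listener in self.listeners:
            listener(game_id, dict(state))


class Tee:
    """Passes server messages through unchanged, copying each to the controller."""

    def __init__(self, controller, send_to_client):
        self.controller = controller
        self.send_to_client = send_to_client

    def from_server(self, game_id, message):
        self.send_to_client(message)              # the player sees the usual stream
        self.controller.update(game_id, message)  # the HUD side gets a copy


seen_by_client, seen_by_hud = [], []
controller = Controller()
tee = Tee(controller, seen_by_client.append)
controller.subscribe(lambda game_id, state: seen_by_hud.append((game_id, state)))
tee.from_server("game-1", {"inventory": ["dagger"]})
tee.from_server("game-1", {"nearby": ["newt"]})
print(seen_by_hud[-1])
```

<p>The point of the cache is that a HUD connecting late still gets the current
picture straight away, instead of waiting for the next message from the
server.</p>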
Recent Projects: Democrat & Chronicle2012-08-22T22:16:00-04:002022-12-07T14:32:12-05:00Katherine Casetag:blog.katherineca.se,2012-08-22:/fossrit/recent-projects-democrat-chronicle.html<p>One of the new projects I had this summer was a project proposed by the
<a class="reference external" href="http://www.democratandchronicle.com/">Democrat & Chronicle</a>. The project involved access to a selection of
emails sent to the Greece school district in the wake of the YouTube
video involving several of their students. The idea was that we would …</p><p>One of the new projects I had this summer was a project proposed by the
<a class="reference external" href="http://www.democratandchronicle.com/">Democrat & Chronicle</a>. The project involved access to a selection of
emails sent to the Greece school district in the wake of the YouTube
video involving several of their students. The idea was that we would
get a dump of email bodies and try to glean some information out of
them.</p>
<p>The first hurdle, unfortunately, was getting at the information. Shortly
after we were approached with the request, I received an Access file
containing around 5000 email bodies to go through. Being what we are,
most of the resources in the FOSSBox are oriented around Linux, and no
one about had a copy of Access installed to get the data into a more
friendly format.</p>
<p>In theory, there are ODBC drivers for Access databases, just like there
are for any other database system. In practice, however, they seem to
only exist for Windows machines which, while not surprising, was
disappointing. This led me to dig out an old VirtualBox VM with Windows
XP on it, install the Access drivers from Microsoft, and throw
LibreOffice on it, too. There are other ways I could have gotten this done
instead of LibreOffice, but at that point I was still hoping this would be a
simple job.</p>
<p>LibreOffice Base eventually got into the Access file, but then the
troubles started again. It initially prompted me to save a LibreOffice
database file, which sounded great to me... it could export it
immediately, then I could copy it over and finish the task in Linux.
Unfortunately, all this file did was create a small wrapper around the
Access file, telling LibreOffice where the file was located, and what
was needed to open it. So now I was back to trying to export the data.
LibreOffice, though, was not willing to play along. I admit I am less
than familiar with the Base component of LibreOffice, but some
exploration and more searching online led me to believe I could not do
the simple translation of data from one format to another from within
Base.</p>
<p>Instead, I needed to select the table I was interested in (the only
table in the database), tell LibreOffice to copy the table, then open a
new spreadsheet in LibreOffice Calc and save the data that way. While
this makes some sense to me (Base being simply for basic interaction
with databases, Calc for manipulating raw data), I was dismayed that I
could not find some way to export a single table to a common data
format, like CSV, instead of having to go through yet another step. In
any case, once I dumped the data into Calc, I could easily save it to
CSV, drop that into my real computer, stop the VM, and get to work for
real.</p>
<p>The end result is <a class="reference external" href="https://github.com/Qalthos/mail_scrape">this</a>. I'm not sure it will ever be of any
particular use to anyone other than myself, as a reminder of how to use the
Python NLTK module (whose documentation seems to be geared more towards
researchers than those already familiar with Python), and it is hardcoded
to certain facets of the data I was given, but it does manage to do a
few things, and at each step it dumps the state of the data to a file so
I can inspect the process and consider possible improvements.</p>
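<p>The "dump the state at each step" habit is easy to show in miniature. The
sketch below is not the mail_scrape code: the CSV columns and sample rows are
invented, and a plain regular expression stands in for NLTK's tokenizer so the
example runs with nothing but the standard library.</p>

```python
import csv
import io
import json
import os
import re
import tempfile
from collections import Counter

# Hypothetical stand-in for the exported CSV; the real column names are not given.
SAMPLE_CSV = "id,body\n1,This is awful. Awful!\n2,Thank you for acting quickly\n"


def dump(step, data, out_dir):
    """Write the state after a step to its own JSON file for later inspection."""
    path = os.path.join(out_dir, step + ".json")
    with open(path, "w") as handle:
        json.dump(data, handle, indent=2)
    return path


def run(csv_text, out_dir):
    """Read bodies, tokenize them, count words, dumping after every step."""
    bodies = [row["body"] for row in csv.DictReader(io.StringIO(csv_text))]
    dump("1_bodies", bodies, out_dir)

    # Regex tokenizing here is a simple substitute for a real NLTK tokenizer.
    tokens = [re.findall(r"[a-z']+", body.lower()) for body in bodies]
    dump("2_tokens", tokens, out_dir)

    counts = Counter(word for words in tokens for word in words)
    dump("3_counts", dict(counts), out_dir)
    return counts


out_dir = tempfile.mkdtemp()
counts = run(SAMPLE_CSV, out_dir)
print(counts.most_common(1))
print(sorted(os.listdir(out_dir)))
```

<p>Each numbered file is a snapshot of one stage, so when a later step produces
something odd you can open the earlier files and see where it went wrong.</p>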
Introducing: FOSS@RIT Timeline Year in Review2012-08-22T22:15:00-04:002022-12-07T14:31:22-05:00Katherine Casetag:blog.katherineca.se,2012-08-22:/fossrit/introducing-fossrit-timeline-year-in-review.html<p>Yesterday, I sat down with Remy and went over the last of the things we
need to do to close out the summer. There was a long list of items,
split into two sections, each containing the same sort of stuff.
FOSS@RIT has done a lot of stuff in …</p><p>Yesterday, I sat down with Remy and went over the last of the things we
need to do to close out the summer. There was a long list of items,
split into two sections, each containing the same sort of stuff.
FOSS@RIT has done a lot of stuff in the past year, and we need to be
able to tell people about it.</p>
<p>The first section concerned the few things we had done that hadn't yet
hit <a class="reference external" href="http://foss.rit.edu/timeline/">Timeline</a>. This was a fairly sizable chunk of things, and we
needed a good, rapid-entry way of getting more events in there. No
problem: I had done some work on that before, so I could get it running and
dump events in there.</p>
<p>The second (and slightly longer) list dealt with things we needed to
compile some information about as a sort of "here's what we've done"
report. The list of things that needed to be in that was... just about
the same as the last one. Remy planned to go through the Timeline site
and add entries from it, categorize them, and push it to foss.rit.edu.</p>
<p>This deeply concerned me on two distinct levels. First was the part of
me that never liked writing. I've mentioned it here once before, and I
feel I've gotten better since then, but the concept of wading through
all that data to write a report was not something that made me happy.
The other part was that all the things in the second list had to be
added to the timeline at some point anyway, or were likewise available
from other sources. When Remy showed me what he had written for 2010, it
was obvious this could be easily replicated in code.</p>
<p>About an hour of JavaScript wrangling later, I had made <a class="reference external" href="http://foss.rit.edu/timeline/2011.html">this</a>. Also
<a class="reference external" href="http://foss.rit.edu/timeline/2010.html">this</a>, and if you're reading this from about a year in the future,
<a class="reference external" href="http://foss.rit.edu/timeline/2012.html">this</a> should even exist. It still needs some tweaking: the years are
hardcoded in the documents, the pages aren't linked from the main
timeline page, and I'd rather have them all use one common file than
make a new one each year, but it works, it's fairly similar to what was
handwritten last year, and it gets updated every time something gets
added to Timeline.</p>
Recent Projects: Knowledge2012-08-07T23:39:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2012-08-07:/civx/recent-projects-knowledge.html<p>One of the cool features of CIVX that really caught my interest when I
first started working on it was the <a class="reference external" href="https://github.com/FOSSRIT/knowledge">Knowledge DB</a>. Using the magic of
Python, Knowledge is a vertical, polymorphic database for storing
knowledge about things, which means very little to anyone, even me.</p>
<p>What Knowledge does …</p><p>One of the cool features of CIVX that really caught my interest when I
first started working on it was the <a class="reference external" href="https://github.com/FOSSRIT/knowledge">Knowledge DB</a>. Using the magic of
Python, Knowledge is a vertical, polymorphic database for storing
knowledge about things, which means very little to anyone, even me.</p>
<p>What Knowledge does is it allows you to store anything (or anything
Python can pickle, at least), as arbitrary collections of Entities and
Facts.</p>
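<p>To give a feel for what "vertical" means here, this is a toy version of the
idea using sqlite3 and pickle. It is emphatically not the Knowledge API: the
table layout and the <tt class="docutils literal">set_fact</tt> and <tt class="docutils literal">facts</tt> helpers are made up
for illustration.</p>

```python
import pickle
import sqlite3

db = sqlite3.connect(":memory:")
# One row per entity, and one row per (entity, key, value) fact: a "vertical" layout.
db.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
db.execute(
    "CREATE TABLE fact (entity_id INTEGER REFERENCES entity(id), key TEXT, value BLOB)"
)


def set_fact(name, key, value):
    """Attach a pickled value to an entity, creating the entity if needed."""
    db.execute("INSERT OR IGNORE INTO entity (name) VALUES (?)", (name,))
    (entity_id,) = db.execute(
        "SELECT id FROM entity WHERE name = ?", (name,)
    ).fetchone()
    db.execute(
        "INSERT INTO fact VALUES (?, ?, ?)", (entity_id, key, pickle.dumps(value))
    )


def facts(name):
    """Return every fact about an entity as a plain dict."""
    rows = db.execute(
        "SELECT fact.key, fact.value FROM fact "
        "JOIN entity ON entity.id = fact.entity_id WHERE entity.name = ?",
        (name,),
    )
    return {key: pickle.loads(value) for key, value in rows}


set_fact("widget", "tags", ["small", "blue"])
set_fact("widget", "weight", 1.5)
print(facts("widget"))
```

<p>Because every value is pickled, a fact can hold a list, a number or any other
picklable object without the schema ever changing, which is roughly what the
"polymorphic" part is getting at.</p>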
Recent Projects: Groovebot2012-07-24T18:47:00-04:002022-12-07T14:32:44-05:00Katherine Casetag:blog.katherineca.se,2012-07-24:/personal/recent-projects-groovebot.html<p>This is the first of a number of posts meant to overview the things I've
been doing recently. First up on the list is the long-overdue
<a class="reference external" href="https://github.com/Qalthos/groovebot">Groovebot</a> update.</p>
<p>Before this summer, Groovebot 'worked' on <a class="reference external" href="http://spotify.com/">Spotify</a>, and also had a
fairly useful but under-tested <a class="reference external" href="http://en.wikipedia.org/wiki/Music_Player_Daemon">MPD</a> backend that worked for me when …</p><p>This is the first of a number of posts meant to overview the things I've
been doing recently. First up on the list is the long-overdue
<a class="reference external" href="https://github.com/Qalthos/groovebot">Groovebot</a> update.</p>
<p>Before this summer, Groovebot 'worked' on <a class="reference external" href="http://spotify.com/">Spotify</a>, and also had a
fairly useful but under-tested <a class="reference external" href="http://en.wikipedia.org/wiki/Music_Player_Daemon">MPD</a> backend that worked for me when I
could get MPD running properly. Unfortunately, I no longer had an active
Spotify subscription, which meant that I couldn't use the Spotify
backend, and I had moved off of RIT's campus, meaning I no longer had a
~100Mb connection to my fileserver, so streaming files that way was more
problematic.</p>
<p>While looking around for other things I could get Groovebot
running on, I rediscovered <a class="reference external" href="http://kevinmehall.net/p/pithos/">Pithos</a>, a Python frontend to Pandora.
Using Pithos as an authenticator, I could get URLs to music files that I
could play if I could figure out how to play them in Python. As a
temporary test, I just sent them to mplayer to play the files. This
worked well enough, but had the same problem I had encountered with
Spotify, namely that I could not get a callback properly positioned to
fire when the song finished.</p>
<p>I put the project aside for a time, until <a class="reference external" href="http://www.jlewopensource.com/">Justin</a> pointed out
GStreamer is made to do this sort of thing. With this in mind, I took
another look at the task, first trying to run GStreamer in its own event
loop on top of Twisted, then giving up and processing events on a tick.
This means I have to respond to every event GStreamer throws out (which
is a fair number of them), but most of them can be thrown out without
looking at them. Some more tweaking later, and PandBot was born, able to
connect to a user's Pandora station, play music, and thumb songs up or
down based on user response in the IRC channel.</p>
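<p>The "process events on a tick" approach looks roughly like the sketch below.
It uses a plain <tt class="docutils literal">queue.Queue</tt> as a stand-in for the player's message bus, and
the message names are invented; it shows the shape of the loop, not PandBot's
actual code.</p>

```python
import queue

bus = queue.Queue()   # stand-in for the player's message bus


def on_tick(bus, on_song_end):
    """Drain every pending message, acting only on end-of-stream."""
    handled = 0
    while True:
        try:
            kind, payload = bus.get_nowait()
        except queue.Empty:
            return handled
        handled += 1
        if kind == "eos":          # the one message we care about
            on_song_end(payload)
        # everything else (state changes, tags, buffering) is simply discarded


# Simulate a few messages arriving between ticks.
bus.put(("state-changed", "playing"))
bus.put(("tag", {"title": "Some Song"}))
bus.put(("eos", "song-1"))

finished = []
handled = on_tick(bus, finished.append)
print(handled, finished)
```

<p>In a Twisted program, something like <tt class="docutils literal">twisted.internet.task.LoopingCall</tt>
could call a function like this a few times a second, which keeps everything
inside the one reactor instead of fighting over two event loops.</p>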
<p>At this point, I had been writing new bots each time I added a new
backend, with a lot of repeated code in each, leading to subtle bugs and
API implementations that were slowly drifting apart based on individual
needs. This made me a little annoyed, but I didn't have any particular
need to fix it, so I left it alone, until an accidental click around the
Spotify website left me with another Premium subscription (and $10
poorer...). I poked around on SpotBot to see if it still worked. Turns
out it did... just not on the DJ computer in the FOSSBox. It seems that
spytify, the Python bindings for <a class="reference external" href="http://despotify.se/">despotify</a>, had gotten a little stale
in the interim, and no longer compiled with modern versions of Cython.
This meant I should probably break down and get an API key from Spotify
and use the proper official API instead.</p>
<p>However, working on the MPD and Pandora bots in the meantime had given me
a number of ideas and fixes that would be problematic to port back to
Spotify, which meant I had finally gotten an excuse to try to merge the
three separate codebases into one unified version. Starting with MPD and
Pandora, I merged the files and ran the result through a three-way diff,
trying to identify important parts of each bot to make sure each worked
properly after the merge. Next, I looked into the architecture of
pyspotify, and immediately ran into some trouble. pyspotify works
differently to the backends I had been used to, having its own threading
I would need to juggle alongside my own. Additionally, the code is not
thread safe, which is a problem with the heavily-threaded Groovebot
code.</p>
<p>So now, the status is that we are down to two bots. One is the ancient
Grooveshark code that is in need of updating and integrating with the
second bot. This second bot has all the work I've been doing lately and,
some few remaining problems aside, mostly works. Ideally, I will soon
have a better host for the bot that will let me run Spotify, but for now
Pandora is working well enough.</p>
Final Stretch2012-02-29T16:15:00-05:002022-12-07T14:30:39-05:00Katherine Casetag:blog.katherineca.se,2012-02-29:/fossrit/final-stretch.html<p>So the past two days have been madness trying to get the recently
christened WebBotWar (or just WebBot) to actually work on the web.</p>
<p>We had it working locally some time last week (I think... the days are
really starting to mush together), but OpenShift was a whole other
thing …</p><p>So the past two days have been madness trying to get the recently
christened WebBotWar (or just WebBot) to actually work on the web.</p>
<p>We had it working locally some time last week (I think... the days are
really starting to mush together), but OpenShift was a whole other
thing. This boiled down to two basic problems we had:</p>
<ul class="simple">
<li>pybotwar depended on pyBox2D, which needs to be compiled, and that doesn't
work well even when you have control of the machine</li>
<li>We were relying on memcached to provide cheap communication between
the pybotwar process and the frontend. As near as I can tell,
memcached is not actually supported on OpenShift Express, though that
might not actually be true.</li>
</ul>
<p>The first was surprisingly easy to fix, though it took me some time to
actually think of the solution. In the final setup, there are
essentially three repositories: our modified pybotwar, the TG2 webbot
frontend, and a meta-repository containing both of the previous two in
the proper places. This third repo is not meant to be used for
development; its only purpose is to be pushed to OpenShift and act as a
quick pull for someone looking to run WebBotWar themselves. The
practical upshot of this is that if we commit pyBox2D inside the
pybotwar directory of our meta-repo, pybotwar can find its dependencies,
and no one else has to bother with it.</p>
<p>The second problem was more tricky, and eventually resulted in a rather
simple patch that just happened to take me around 12 hours to get right.
A quick Google of OpenShift Express Python and <a class="reference external" href="http://en.wikipedia.org/wiki/NoSQL">NoSQL</a> led me to
MongoDB, which has some benefits and drawbacks compared to just shoving
bits into memory, but seems to work very well in practice and is
probably the right way to go regardless. To be perfectly fair, memcached
<em>is</em> a type of NoSQL, but MongoDB is actually supported by OpenShift in
an easily-installable manner, and despite its more finicky syntax, it
works, which is something I failed to get with memcached.</p>
<p>Meanwhile, the rest of my team was hard at work making massive progress
on other fronts. Facebook authentication works, as does uploading custom
robot definitions, though I don't think the two are plugged into each
other yet. As well, there are brand new pretty images for the robots and
the turrets.</p>
<p>There are a few outstanding problems left, but (as long as I don't push
anything broken) you can have a look at webbotwar in action <a class="reference external" href="webbotwar-qalthos.rhcloud.com">here</a>.</p>
OpenShift Troubles2012-02-03T06:47:00-05:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2012-02-03:/personal/openshift-troubles.html<p>Recently I've been playing with <a class="reference external" href="http://openshift.redhat.com/">OpenShift</a>, a new(ish) service from
Red Hat as a sort of 'push to cloud' deployment strategy. It's
interesting for people like me who can whip up a site quick, but don't
necessarily have the framework in place to host it.</p>
<p>Due to my work …</p><p>Recently I've been playing with <a class="reference external" href="http://openshift.redhat.com/">OpenShift</a>, a new(ish) service from
Red Hat as a sort of 'push to cloud' deployment strategy. It's
interesting for people like me who can whip up a site quick, but don't
necessarily have the framework in place to host it.</p>
<p>Due to my work on <a class="reference external" href="http://civx.us/">CIVX</a>, I've gotten pretty familiar with
<a class="reference external" href="http://turbogears.org/">TurboGears</a>, and the idea of being able to take a site like that and
run it without having to set up apache or fiddle with paster sounded
real nice. Unfortunately, it was not so simple as it seems.</p>
<p>First up was to get something, anything, running. For a while, OpenShift
was throwing a <a class="reference external" href="http://www.flickr.com/photos/girliemac/6509400855/in/set-72157628409467125">500 error</a> when you tried to get a new application
registered. A brief poke into their IRC channel found them already aware of the
problem, and it was fixed shortly. Next came the other part that should
have been easy, running TurboGears.</p>
<p><a class="reference external" href="http://lewk.org/">Luke</a>, our favorite wizard around these parts, wrote a bit of <a class="reference external" href="https://github.com/lmacken/openshift-quickstarter">code</a>
to get most of the available applications up and running without too
much effort, including the currently unsupported TurboGears. Running
this went off without a problem, but the resulting site gave me another
happy 500 error. After a consult with Remy, we determined there were
some missing version requirements that kept the site from running. After
pulling those edits out of his repo and moving them upstream to Luke's,
I had a working default TurboGears site.</p>
<p>Until I tried to log in. Then I got another 500 error.
I was beginning to get used to this, but it was still annoying to make a
small change, then push it to the server and wait for the server to
update the settings before I could test it. Even more fun was the
occasional <a class="reference external" href="http://www.flickr.com/photos/girliemac/6540643319/in/set-72157628409467125/">503 error</a> when OpenShift couldn't keep up with my rapidly
building and tearing down sites.</p>
<p>Feeling that that was going to be a project by itself, I set about
moving all my non-db-interfacing files to this new repository. The
prebuilt version assumes that the site internally is named tg2app, and I
was having trouble convincing it to go by anything else. Eventually I
just decided to move files across one by one; first the templates that
don't care what they're named, then the root controller, then the new
model and widget. A lot of frustration and many <a class="reference external" href="http://www.flickr.com/photos/girliemac/6508023617/in/set-72157628409467125">403</a> and <a class="reference external" href="http://www.flickr.com/photos/girliemac/6508022985/in/set-72157628409467125/">404</a> errors
later, I had something that pretended to work as long as I didn't use
the database. But since the database is kind of the point of the site I
was building, this was not exactly acceptable.</p>
<p>So back to the drawing board then. I had a hunch something was wrong
when I saw SQLAlchemy errors scroll by every time I reloaded the site.
My best guess is that SQLAlchemy is failing to create the tables needed
to run the site and continuing on blindly. Once I realized that, I
dumped a test db from my local copy to the MySQL db, and suddenly
everything was working. Or almost everything, anyway.</p>
<p>I could read from the db fine, but any time I tried to modify it, I got
another dreaded 500 error. I poked into everything I could find to try
to figure out where it was failing, and finally determined it couldn't
be on my end, as my local copy worked just as expected.</p>
<p>Finally I stumbled across the answer, almost accidentally. When I moved
the db from local sqlite to MySQL, I failed to set the auto increment
setting on the id of my new databases, so when I neglected to provide an
id for the new entries I was making, MySQL quite rightly complained at
me. Unfortunately, since I can't find how to re-enable debug mode (nor
should I really try), I wasn't getting any good error messages.</p>
<p>So what is the site that has been giving me all these troubles? It's a
little site I set up to publicly shame Remy into stopping smoking:
<a class="reference external" href="http://remysmoke-qalthos.rhcloud.com/">remysmoke-qalthos.rhcloud.com</a></p>
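<p>The missing auto-increment problem is easy to reproduce in miniature. The
sketch below uses an in-memory SQLite database rather than MySQL, so it is only
an analogy: the table names are invented, and MySQL's exact reaction to a
missing id depends on its SQL mode. The shape of the failure is the same,
though: a required id that nothing generates.</p>

```python
import sqlite3

db = sqlite3.connect(":memory:")
# A table whose id is required but never generated, similar in spirit to
# a MySQL column that lost its AUTO_INCREMENT setting in the move.
db.execute("CREATE TABLE broken (id INTEGER NOT NULL, note TEXT)")
# INTEGER PRIMARY KEY makes SQLite assign ids by itself.
db.execute("CREATE TABLE fixed (id INTEGER PRIMARY KEY, note TEXT)")


def try_insert(table):
    """Insert a row without an id and report the new id or the error."""
    try:
        cursor = db.execute("INSERT INTO %s (note) VALUES (?)" % table, ("hello",))
        return cursor.lastrowid
    except sqlite3.IntegrityError as error:
        return "rejected: %s" % error


broken_result = try_insert("broken")
fixed_result = try_insert("fixed")
print(broken_result)
print(fixed_result)
```

<p>With debugging output available, the database's complaint points straight at
the id column; without it, all you see from the outside is another 500.</p>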
Openshift Troubles Continued2012-02-03T06:47:00-05:002022-12-07T14:32:28-05:00Katherine Casetag:blog.katherineca.se,2012-02-03:/personal/openshift-troubles-continued.html<p>I figured out the problem I was having with OpenShift.</p>
<p>To put it simply, I didn't pay enough attention.</p>
<p>For reference, when moving an existing TurboGears app to OpenShift, make
sure you add the changes in config/app_cfg.py.</p>
<p>As soon as I saw that, I felt really silly for …</p><p>I figured out the problem I was having with OpenShift.</p>
<p>To put it simply, I didn't pay enough attention.</p>
<p>For reference, when moving an existing TurboGears app to OpenShift, make
sure you add the changes in config/app_cfg.py.</p>
<p>As soon as I saw that, I felt really silly for missing it. I was so sure
that I had gotten all the relevant changes, but apparently I somehow
missed this file.</p>
<p>More detailed directions coming soon.</p>
TurboGears2 on OpenShift, just like it should be2012-02-03T06:47:00-05:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2012-02-03:/fossrit/turbogears2-on-openshift-just-like-it-should-be.html<p>After much work and many trials, I finally have an app pushed to
OpenShift with no manual tweaking necessary. As often happens with these
things, the solution was much simpler than expected.</p>
<p>Note: I still don't have a foolproof 'follow this' solution ready, as
the one I built works exactly …</p><p>After much work and many trials, I finally have an app pushed to
OpenShift with no manual tweaking necessary. As often happens with these
things, the solution was much simpler than expected.</p>
<p>Note: I still don't have a foolproof 'follow this' solution ready, as
the one I built works exactly as I want it to, but:</p>
<ul class="simple">
<li>It needs a lot of love and cleanup</li>
<li>It requires an external git script that isn't well documented</li>
</ul>
<p>The first isn't much of a problem, and can be worked out over the next
few days. I'm more worried about the second one. For the curious, the
script is <a class="reference external" href="https://github.com/apenwarr/git-subtree">git-subtree</a>, which acts like a submodule except that it is more
transparent to the repository, which is a plus given OpenShift's odd
structure.</p>
<p>Back on topic: when we last left off, I had finally gotten
OpenShift to acknowledge a project in a directory other than tg2app.
This is useful because, at least for me, most of my projects are not
named tg2app. That turned out to be a stupid problem I had made for
myself, but unfortunately, the next problem to tackle was not.</p>
<p>You see, when setting up an app on OpenShift, you have very little
control over the actual environment the app is running in (this isn't
entirely true, but is a useful fiction, especially as the service is
likely to become more 'plug-and-go'). One of the few ways you can retain
control is through a series of post-commit hooks, one of which was
starting off the problematic section of code. When you first push your
code to OpenShift, it needs to set up your database so it is ready to
store information and do other databasey things.</p>
<p>Naturally, this wasn't happening.</p>
<p>First up was a problem with OpenShift. Python's default egg cache (not
too important; it's a place Python can use to extract files from
installed packages temporarily) is not writable in OpenShift, so that
needs to be set before anything else will work. Next, the proper MySQL
library is not installed by TurboGears by default (the default is to use
sqlite), so that had to be added to the requires list.</p>
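<p>A minimal sketch of the egg cache fix, assuming the cache location is controlled through the PYTHON_EGG_CACHE environment variable that setuptools reads. The helper name and the python-eggs directory are made up for illustration, and the right writable directory depends on what OpenShift actually provides:</p>

```python
import os
import tempfile


def ensure_egg_cache(base_dir=None):
    """Point PYTHON_EGG_CACHE at a directory we know we can write to.

    base_dir is a hypothetical writable location; it falls back to the
    system temporary directory when nothing is given.
    """
    base = base_dir or tempfile.gettempdir()
    cache = os.path.join(base, "python-eggs")
    os.makedirs(cache, exist_ok=True)
    # This has to happen before anything tries to extract files from an egg.
    os.environ["PYTHON_EGG_CACHE"] = cache
    return cache
```

<p>The call would need to run at the very top of whatever script the hook executes, before the application itself is imported.</p>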
<p>And then I hit yet another wall. Despite everything being set up
properly, I could not connect to the MySQL database on OpenShift. It
wasn't a problem with MySQL, because I could connect fine with the MySQL
client. It wasn't even a problem with SQLAlchemy, because I was able to
connect from a short example script. Finally, in a fit of insanity, I
tried running the build script directly. I'm not even sure why; I was
just at the point where I would do anything just to see if it would work.</p>
<p>And, strangely enough, <em>it did</em>.</p>
<p>This had some pretty profound implications. It meant something was
different during the build hook than in normal execution. Armed with
this new knowledge, I headed over to OpenShift's IRC channel to get some
answers (I had actually been in there for some time prior, just not with
enough information for the more Ruby-oriented users to help).</p>
<p>They told me that yes, indeed there was a difference. During the build
step, the database is stopped, hence why I could not connect to it.
There were, however, hooks for deploy and post_deploy, during both of
which the database would be running. I moved the calls needing database
access to deploy, and suddenly everything worked! I made a few more
changes, cleaned up my tree, and tested it on a new app I wanted to get
on OpenShift, and it (mostly) worked. There were a few problems left,
but they seemed to be mostly my fault (and problems with the
application, not OpenShift), so it looked like I had finally fixed
deploying a standard TurboGears app. I've no doubt that there's
something I've left out, but I'm pretty amazed at the progress I've made
so far, and learned a lot about both OpenShift and TurboGears.</p>
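<p>In hindsight, a quick check like the one below might have pointed at the stopped database sooner. This is only a sketch, not part of the fix described above: it tests whether anything is accepting TCP connections on the database port, and the host and port have to be supplied by the caller, since how OpenShift exposes them is not covered here:</p>

```python
import socket


def database_is_reachable(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, unreachable host, and so on.
        return False
```

<p>Running this from both the build hook and the deploy hook should show the difference the IRC channel described: unreachable during build, reachable during deploy.</p>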
FLOSS Seminar Update2012-01-19T17:46:00-05:002022-12-07T14:31:11-05:00Katherine Casetag:blog.katherineca.se,2012-01-19:/fossrit/floss-seminar-update.html<p>So it's week 5, what has been going on?</p>
<p>The class has split into three groups, working on different games. My
group's idea is a social update to the venerable likes of <a class="reference external" href="http://robocode.sourceforge.net/">robocode</a>,
though how exactly that is going to pan out is still debatable.</p>
<p>Ideally, this would comprise of …</p><p>So it's week 5, what has been going on?</p>
<p>The class has split into three groups, working on different games. My
group's idea is a social update to the venerable likes of <a class="reference external" href="http://robocode.sourceforge.net/">robocode</a>,
though how exactly that is going to pan out is still debatable.</p>
<p>Ideally, this would consist of a wrapper around the basic game, not
requiring much tweaking of the internals, though a new interface would be
required for use on the web. All in all the class is shaping up to be
pretty interesting.</p>
First FLOSS Seminar post2011-12-01T20:22:00-05:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2011-12-01:/fossrit/first-floss-seminar-post.html<p>So this quarter I am taking a <a class="reference external" href="http://en.wikipedia.org/wiki/Free_and_open_source_software#FLOSS">FLOSS</a> Seminar being taught by <a class="reference external" href="http://threebean.org">Bean</a>
which should be pretty awesome. As a graduate student (as well as a
fairly advanced FLOSS citizen) there are a few extra things going to be
required of me, and we're working on ironing out the exact …</p><p>So this quarter I am taking a <a class="reference external" href="http://en.wikipedia.org/wiki/Free_and_open_source_software#FLOSS">FLOSS</a> Seminar being taught by <a class="reference external" href="http://threebean.org">Bean</a>
which should be pretty awesome. As a graduate student (as well as a
fairly advanced FLOSS citizen), a few extra things are going to be
required of me, and we're working on ironing out the exact details now.
It looks like it's going to be a really cool quarter.</p>
GrooveBot updates2011-09-28T01:28:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2011-09-28:/personal/groovebot-updates.html<p>Today I'm going to talk about something a bit more understandable, I
hope.</p>
<p>Back around the summer of 2010, the FOSSBox wanted some ambient music to
play in the background. One of the things we wanted was a way to have
control over the songs played be available to anyone …</p><p>Today I'm going to talk about something a bit more understandable, I
hope.</p>
<p>Back around the summer of 2010, the FOSSBox wanted some ambient music to
play in the background. One of the things we wanted was a way to have
control over the songs played be available to anyone, even if they
weren't here to listen with us. <a class="reference external" href="http://jlewopensource.com/">Jlew</a> was working on some IRC bots at
the time, so he wrote up a bot that could hook into the <a class="reference external" href="http://www.grooveshark.com/">GrooveShark</a>
API and play music that users requested through IRC. It was cool, when
it worked, and it was fun having community control over song selections,
even when that sometimes led to a bit of musical griefing.</p>
<p>Then GrooveShark kept changing its API without warning, then they
changed their pricing for access to said API, and then the bot stopped
working. Plans to build a more robust bot around <a class="reference external" href="https://gitorious.org/jlew/groovebot">GrooveBot</a>, as it was
now called, were scrapped, and the FOSSBox was quiet once more.</p>
<p>Then <a class="reference external" href="http://turntable.fm/">turntable.fm</a> happened. While we are not their optimal use-case,
it did allow us a measure of community choice in our songs, and
generally got us back to playing some sweet tunes while we work, with
the added bonus that anyone could run it, not just people with premium
access. Everything was pretty good, all things considered, and we could
continue on with our lives.</p>
<p>Except for one thing.</p>
<p>Now with GrooveBot, you could send all kinds of commands to the bot and
it would tell you what songs were playing, queue additional songs, pause
or resume playback, and so on, all from the IRC windows we were already
using to communicate with people outside of the area. I don't expect any
of this to be available for Turntable any time soon, but there is one
function that I have sorely missed. And that is the ability to change
the volume from my computer without having to get up and manually prod
whoever's running the audio.</p>
<p>So now there is a stripped-down version of GrooveBot on gitorious called
volbot, and all he does is respond to requests to change the volume.
Hopefully, when a new, free music API becomes available that we can hook
into, it should be easier to reimplement those functions in a more
general way so that multiple backends can be used for a more varied
music experience. And then we can get back to the OS grooving.</p>
Multiprocessing the PolyScraper2011-06-29T18:04:00-04:002022-12-07T15:55:10-05:00Katherine Casetag:blog.katherineca.se,2011-06-29:/civx/multiprocessing-the-polyscraper.html<p>Warning: This post was written at 4AM and contains a technical account
of what I have been doing attempting to parallelize CIVX's internals. If
you are looking for a more general overview of what I have been doing
recently, you are better off looking to another one of my posts …</p><p>Warning: This post was written at 4AM and contains a technical account
of what I have been doing attempting to parallelize CIVX's internals. If
you are looking for a more general overview of what I have been doing
recently, you are better off looking to another one of my posts.</p>
<p>Recently, I have been working on CIVX's PolyScraper, a neat little piece
of code designed to be able to read and understand structured text
without knowing how the text is structured beforehand. On Thursday, I
let a test scrape of a rather large dataset start, figuring it would
finish sometime over the weekend and I'd be able to pick it up on
Monday. Then Monday rolled around, and the scrape was still running.</p>
<p>Worse, it didn't seem to be taking full advantage of the available
resources. Running as it was on the boat, it had four cores at its
disposal, yet it was steadfastly using only one.</p>
<p>Now PolyScraping is not an inherently parallelizable task, but on large
datasets it should benefit from some kind of parallelization,
particularly when using large numbers of small files (less so with small
numbers of large files; those tasks are usually sequential, as
you have a smaller number of resources blocking on read IO). Clearly
there were other things to look into, but if I could do this, it would
mean a huge win for offline scraping, which was one of the things I had
enabled with my addition of local file support to the PolyScraper.</p>
<p>This all led me to <a class="reference external" href="http://docs.python.org/library/multiprocessing.html">multiprocessing</a>, a library I'd been wanting to try
out in Python for some time now. Without going into too much detail,
multiprocessing attempts to get around the <a class="reference external" href="http://docs.python.org/glossary.html#term-global-interpreter-lock">Global Interpreter Lock</a> by
spawning subprocesses instead of threads.</p>
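<p>For the many-small-files case, the general shape looks something like the sketch below. This is not the PolyScraper code: scrape_file is a hypothetical stand-in that just counts non-blank lines, and the real work per file would be whatever the scraper does with it:</p>

```python
import multiprocessing


def scrape_file(path):
    """Worker: count the non-blank lines in one file (a stand-in for real parsing)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        records = sum(1 for line in handle if line.strip())
    return path, records


def scrape_all(paths, workers=4):
    """Fan the files out to a pool of worker processes and collect the results."""
    with multiprocessing.Pool(processes=workers) as pool:
        # Each path is handled in a separate process, so the GIL of one
        # worker does not hold up the others.
        return pool.map(scrape_file, paths)
```

<p>On platforms that start workers by spawning a fresh interpreter, scrape_all should only be called from under an if __name__ == "__main__" guard.</p>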
<p>The first attempt was written pretty quickly, as I still remember a lot
from my Parallel Computing class from a while ago. Indeed, the main
problem turned out to be SQLAlchemy or, more specifically, our use of
sqlite for a backend db. Sqlite is not the most robust of databases and
can't really handle multiple processes attempting to write to the db at
once. Luke suggested (and I would love to try) moving over to Postgres
as we will eventually be doing on the boat, but unfortunately the boat
has been 'stuck ashore' for some time now due to an extended outage in
CSH's network.</p>
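<p>One way to sidestep the write contention without changing databases is to let the workers only compute and have the parent process do all of the writing. The sketch below shows that pattern with the standard sqlite3 module; it is an assumption about what could work rather than what CIVX does, and the table and column names are invented:</p>

```python
import sqlite3


def store_results(db_path, results):
    """Write (path, record_count) pairs from the parent process only."""
    connection = sqlite3.connect(db_path)
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(
                "CREATE TABLE IF NOT EXISTS scraped "
                "(path TEXT PRIMARY KEY, records INTEGER)"
            )
            connection.executemany(
                "INSERT OR REPLACE INTO scraped (path, records) VALUES (?, ?)",
                results,
            )
        # Report how many rows the table holds after this batch.
        return connection.execute("SELECT COUNT(*) FROM scraped").fetchone()[0]
    finally:
        connection.close()
```

<p>With a single writer there is never more than one process holding the write lock, at the cost of passing every result back to the parent first.</p>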
<p>In the meantime I have been whittling the process down to what I think
is the essentials. In the process I have made a complete mess of the
code concerning the PolyScraper, but I should be able to make things at
least look like the way they were before too long.</p>
<p>At this point, though, I have been working on this project for close to
20 hours today. Luke is in town and we're all posted up in Remy's
new place, all hacking on our projects together. With any luck a good
night's sleep will clear my head and give me new ideas for tomorrow.</p>
Back up to Speed2011-06-21T22:56:00-04:002022-12-07T15:51:45-05:00Katherine Casetag:blog.katherineca.se,2011-06-21:/civx/back-up-to-speed.html<p>I've almost closed <a class="reference external" href="https://fedorahosted.org/civx/ticket/106">bug #106</a>. I've done all I can do for now, until I
can figure out how to attribute actions to senators. Until then, I
should comment out the 'actions' tab tomorrow with a note letting
whoever tries this next what I've already tried.</p>
<p>There is still some …</p><p>I've almost closed <a class="reference external" href="https://fedorahosted.org/civx/ticket/106">bug #106</a>. I've done all I can do for now, until I
can figure out how to attribute actions to senators. Until then, I
should comment out the 'actions' tab tomorrow with a note letting
whoever tries this next know what I've already tried.</p>
<p>There is still some work to do in this code, particularly with fixing
the hacky Assembly scraping I wrote last year which also broke. However,
considering that the Assembly is not currently one of the bits we are
trying to expose (and they don't have a nice public API like the
Senate), it probably won't get done for a little while.</p>
<p>When I left yesterday, I had exposed the senator's social page, but none
of the other tabs were showing up. The problem for this turned out to be
that the image representing the tab was corrupted and the text beneath
it was white, making it overall look like the tab was invisible. Once I
got a new image for the tab, they all suddenly appeared, though only the
bills had any information.</p>
<p>The next thing to fix was the scraping of the committees, whose pages
had also changed subtly in the past few months. Fixing this was much
less annoying than building them the first time, and I was able to
remove a lot of old shims in the code from when I was first being
introduced to <a class="reference external" href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a>. Some of my later changes made this
particularly easy, and since BeautifulSoup is so powerful, I was able to
restore access to the data with relative ease. As a bonus, as soon as I
had committees back up, the tabs for votes and meetings came with it for
free. Suddenly, I was almost there!</p>
<p>The final problem I encountered today is actually not a new one, but one
we struggled with last year, and one I now feel confident I have
actually fixed and understand. Once I got all the data pulling
again, a few of the pages would crash the server with
UnicodeDecodeErrors.</p>
<p>As a warning, some heavy Python is about to come down.</p>
<p>UnicodeDecodeError is an error which happens (generally) when attempting
to decode a string into a Unicode object. This is usually a great
thing to do, as Python strings are by default treated as the relatively
restrictive ASCII, which does not have characters for any of the more
exciting symbols like accented letters and non-Latin scripts. Unicode has no
such restrictions, and indeed has data for many, many more symbols, at
the cost of a few more bits of storage per character.</p>
<p>So why were we getting this error? The relevant line of code was 's =
unicode(s)' and was contained within <a class="reference external" href="http://pythonpaste.org/webob/#introduction">WebOb</a> code, not something I was
going to be able to modify successfully. Still, even this shouldn't be a
problem. The purpose of this function is to turn strings into Unicode
strings.</p>
<p>Except I didn't have a string, I already had a Unicode string.
Even this shouldn't be a problem, except that unicode() tries to
interpret its input as a string and then turn it into a Unicode string.
And while Unicode strings can be easily represented as normal strings,
the default of the unicode() function is to try to interpret those
strings as ASCII strings, and I had accents in the strings. These
strings were representing the names of the senators, so I had to make
sure it came out right.</p>
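<p>The same failure is easy to reproduce in isolation. The sketch below uses Python 3, where the unicode() function no longer exists and the closest equivalent is decoding bytes explicitly; the name is a made-up example, and the ASCII codec is chosen on purpose to mimic the old default:</p>

```python
def decode_name(raw, encoding="utf-8"):
    """Decode raw bytes with an explicit encoding instead of relying on a default."""
    return raw.decode(encoding)


raw_name = b"Jos\xc3\xa9"  # the UTF-8 bytes for an accented example name

try:
    decode_name(raw_name, "ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True  # 0xc3 is outside the 7-bit ASCII range

name = decode_name(raw_name)  # succeeds because the bytes really are UTF-8
```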
<p>In order to solve this, I had to reversibly represent these names as a
sequence of ASCII characters.</p>
<p>There are a few ways to replace out-of-bound characters when changing
strings into lesser encodings, and I had two useful ones to choose from.
The obvious one to choose was to change the characters into XML character
entities; however, this quickly turned out to be insufficient. While
&amp;#233; correctly showed up as é on the page, this string is used to
represent the name everywhere, including in the internal URL
representing the page. And the ampersand was quickly stripped out as a
broken argument to the URL, leading to a page for a nonexistent Senator.</p>
<p>Looking through the code, there were three distinct uses for the name
string. The first, which had started all this, was as an ASCII key to a
dictionary which needed to be authoritative but not necessarily
accurate. In other words, I needed it to be the same everywhere, but it
didn't necessarily need to be the correct name of the senator. The
second was the use on the generated web page, which needed to be as
accurate as possible to the Senator's actual name, as it is going to be
viewed publicly. The third, and the current sticking point, was the name in the
URL. Again, this had to be authoritative but not necessarily accurate.
This one, however, also had to include only web-safe characters, and
&amp;, # and ; do not qualify.</p>
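<p>The clash between the entity form and the URL can be seen with nothing but the standard library. This is an illustration in Python 3 with a made-up name, not the CIVX code; the 'xmlcharrefreplace' error handler is the built-in way to get XML character references out of encode():</p>

```python
from urllib.parse import quote

name = "Jos\u00e9"  # an example name with an accented final letter

# Unencodable characters become numeric character references.
entity_form = name.encode("ascii", "xmlcharrefreplace").decode("ascii")

# Percent-encoding the UTF-8 bytes gives a URL-safe form of the same name.
url_form = quote(name)

# The entity form is not URL-safe: its ampersand, hash and semicolon
# all have to be escaped themselves before it can go in a URL.
escaped_entity_form = quote(entity_form)
```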
<p>I mulled this over for a while, thinking up more and more elaborate
schemes for intercepting the names before they reached critical areas,
but none of it was terribly good coding practice. After far too much
thinking, I realized the obvious answer: have separate internal and
external names. The system still relies on the senator's name, which is
still a questionable practice given the multiple spellings of names that
occasionally pop up (but mostly because I remember <a class="reference external" href="http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/">this post</a>, which
is something you should always keep in mind when programming around
names). The display name, on the other hand, has none of the restrictions
on characters (though it still needs to be ASCII to display properly),
but by using XML entities, we can make any character we want without
problems.</p>
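<p>A sketch of what the internal/external split could look like. How the internal name was actually derived is not stated above, so the accent-stripping slug here is purely an assumption, as are both function names:</p>

```python
import re
import unicodedata


def internal_key(name):
    """Lossy, ASCII-only, URL-safe key: consistent everywhere, not necessarily accurate."""
    decomposed = unicodedata.normalize("NFKD", name)  # split accents from letters
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def display_name(name):
    """ASCII-only form for the page that still renders every character correctly."""
    return name.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

<p>Because the key is lossy, two different names can collapse to the same key, which is one more reason relying on names at all remains questionable.</p>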
<p>This was a long path to take to get back to where we were, but I think
that I really understand Python's Unicode in a way I never grasped
before. This should definitely help in the future as Unicode is a very
important part of coding portable applications and that's something I
want to do.</p>
Shelves and Shoves2011-06-21T22:56:00-04:002022-12-07T15:52:26-05:00Katherine Casetag:blog.katherineca.se,2011-06-21:/civx/shelves-and-shoves.html<p>Today is the day to hit <a class="reference external" href="https://fedorahosted.org/civx/ticket/105">bug #105</a>, another of the leftover bugs from
last year.</p>
<p>The story goes something like this: there's a lot of data on
nysenate.gov that is nice to have, but asking for that info on every
call is a little cumbersome. We want to …</p><p>Today is the day to hit <a class="reference external" href="https://fedorahosted.org/civx/ticket/105">bug #105</a>, another of the leftover bugs from
last year.</p>
<p>The story goes something like this: there's a lot of data on
nysenate.gov that is nice to have, but asking for that info on every
call is a little cumbersome. We want to cache as much of the data as we
can, particularly the stuff that's not going to change in the next week
or longer. Previously, I had implemented a simple pylons cache which was
fast, but had no persistent storage, so every time the server went down
it pulled all the info again. And due to the way the scrape was written,
it pulled all the info for all the senators at once, creating quite a
bit of lag before the first page showed up. This clearly wasn't going to
be something we could continue to develop with.</p>
<p>Now, I know nothing about caching data, so I did some poking around in
CIVX to see how it is done elsewhere. Most of the other caches I found
in CIVX code were related to caching text feeds, which were not as
explanatory as I was hoping. Once I felt I had a handle on how things
were done there, I began to try to implement some of it, only to be
shown <a class="reference external" href="http://threebean.wordpress.com/2011/06/08/cached-function-calls-with-expiration-in-python-with-shelve-and-decorator/">this post</a> on <a class="reference external" href="http://docs.python.org/library/shelve.html">shelve</a>, a Python library for storing arbitrary
data. Combined with the decorator, this seemed to do exactly what I
wanted, namely provide a permanent storage area for a bunch of data with
a configurable expire time. I dumped the code into the dashboard, hooked
the proper inputs up and let it run. The results were... promising, but
not astonishing. The file storage worked: once the data was cached, we
stopped looking to nysenate.gov for data and instead used our own data,
even after server restarts.</p>
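<p>The idea of a persistent, expiring cache built from shelve and a decorator can be sketched roughly as follows. This is written from scratch for illustration and is not the code from the linked post; the decorator name and its parameters are invented:</p>

```python
import functools
import shelve
import time


def shelved(path, max_age):
    """Cache a function's results in a shelve file for max_age seconds."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = repr((func.__name__, args))  # shelve keys must be strings
            with shelve.open(path) as db:
                entry = db.get(key)
                if entry is not None and time.time() - entry[0] < max_age:
                    return entry[1]  # still fresh: skip the expensive call
                value = func(*args)
                db[key] = (time.time(), value)
                return value

        return wrapper

    return decorator
```

<p>Since the results live in a file, they survive a server restart; the price is a disk read on every call, cached or not.</p>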
<p>The problem was that the file storage seemed to be slower than the
previous memory cache. This is all perfectly reasonable, since disk
access is much slower than memory, and a lot of data has to get pulled
for each senator. The first obvious thing I could do is to re-enable the
memory cache, but this did not seem to help as much as I wanted it to.
At this point, <a class="reference external" href="lewk.org">Luke</a> popped up in chat to say that Moksha had a
<a class="reference external" href="http://pypi.python.org/pypi/shove">Shove</a> cache it uses for feeds. Sure enough, back in the files I had
been poking through earlier, there were some references to Shove. Back to
the net, I started to explore what Shove was and how it could help me.
It turns out Shove is mostly drop-in compatible with shelve, and aims to
be a more extensible replacement for it. Once I got a handle on how
Shove works differently from shelve (answer: not very), I made a few
tiny tweaks and got a version successfully working with Shove and a
sqlite backend. This didn't make the end result any faster (well maybe a
little, but not much), but there is a lot of room for improvement,
particularly if I can hook into Moksha's own stores. Further, Shove has
its own abilities to cache items in memory in addition to storing them,
which I would like to look into. The best route for efficiencies, I
think, is to change how the data gets stored in the cache. Currently all
the data gets pulled at once, which was done to pacify the pylons cache.
However, if I can get individual caches for each senator, then I can
pull smaller volumes of data at a time, hopefully speeding up the
process.</p>
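<p>A rough sketch of the per-senator idea, with a memory layer in front of the disk store. It deliberately uses plain shelve and a dict rather than Shove, to avoid guessing at Shove's exact API; the class name and the loader callback are hypothetical:</p>

```python
import shelve


class TwoTierCache:
    """Per-key cache: a dict in memory backed by a shelve file on disk."""

    def __init__(self, path, loader):
        self._path = path
        self._loader = loader  # called with a key when nothing is cached yet
        self._memory = {}

    def get(self, key):
        if key in self._memory:
            return self._memory[key]  # fastest path: no disk access at all
        with shelve.open(self._path) as db:
            if key not in db:
                # Only this one key is fetched, not every senator at once.
                db[key] = self._loader(key)
            value = db[key]
        self._memory[key] = value
        return value
```

<p>Each senator is only pulled when someone first asks for them, so the first page no longer waits on the whole Senate.</p>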
<p>We'll see where I get tomorrow, but so far I'm feeling pretty good about
all this.</p>
Summer of CIVX2011-06-21T22:56:00-04:002022-12-07T15:52:57-05:00Katherine Casetag:blog.katherineca.se,2011-06-21:/civx/summer-of-civx.html<p>So I'm back at RIT for the summer working on CIVX again. There's been a
lot of development on Moksha, the software stack CIVX runs on, and not
all of it was a trivial update. Still, many of the problems I
encountered were definitely my fault, not the least of …</p><p>So I'm back at RIT for the summer working on CIVX again. There's been a
lot of development on Moksha, the software stack CIVX runs on, and not
all of it was a trivial update. Still, many of the problems I
encountered were definitely my fault, not the least of which was
forgetting Arch Linux had switched to Python 3.</p>
<p>Once I was up, I looked into fixing the people dashboard we worked on
last summer. In the meantime, the NYS Senate updated their API for
getting open government data, so I had to figure out the new scheme and
try to make it work. Complicating this is that they still don't offer an
easy way to grab all the current senators, and the page we look
at to get this information changed enough to stop the script from
completing.</p>
<p>I'm actually surprised how quickly I was able to work this out, though
of course this is not even remotely new to me. But despite nearly a year
of inactivity on the project I seem to have gotten back into the swing
of it pretty well.</p>
The Last Week2011-06-21T22:56:00-04:002022-12-07T15:53:22-05:00Katherine Casetag:blog.katherineca.se,2011-06-21:/civx/the-last-week.html<p>This week has been a mess for a lot of reasons. Let's see what I managed
to get through so far.</p>
<p>One of the things I got to look at this week is threebean's mokshactl
branch of CIVX and Moksha. This is a project to simplify the
administration of a …</p><p>This week has been a mess for a lot of reasons. Let's see what I managed
to get through so far.</p>
<p>One of the things I got to look at this week is threebean's mokshactl
branch of CIVX and Moksha. This is a project to simplify the
administration of a Moksha installation. The project grew out of an
attempt to easily package Moksha and grew into a much larger system
capable of managing most aspects of CIVX at once, and in a very pretty
package, too. It has a few problems, but for the most part, it performs
quite well, and best of all, it will even integrate itself with Moksha,
controlling the necessary aspects of Moksha as well.</p>
<p>In less exciting news, I poked around in the people dashboard while I
had some time and got Assembly members working again. It didn't take
as much as I had expected, but served to keep me on task while Remy was in
one of the never-ending series of meetings he's had this week.</p>
<p>On a lighter side of things, I put some <a class="reference external" href="http://lobstertech.com/fabulous.html">fabulous</a> in the CIVX shell
today as I was working in it. The mokshactl branch is already
fabuloused, and once you see that, there's no going back. fabulous
makes things very pretty with only a little work.</p>
<p>Other than that not a lot has gone down. Some work has gone into the
polyscraper, but that's nothing worth mentioning at this point. Between
that and some internal matters and hours of meetings and my car
developing a leak in its brake line, that's all that went down this
week. Tomorrow I get to drive home and hopefully fix my car properly.</p>
Grokking the Core2011-06-21T22:55:00-04:002022-12-07T15:53:48-05:00Katherine Casetag:blog.katherineca.se,2011-06-21:/civx/grokking-the-core.html<p>This week begins the real dive into the core of what makes CIVX. Today
(and yesterday, though yesterday hardly counts as a real day) were spent
adding major functionality to the polyscraper, something that's been
overdue for a long time now.</p>
<p>But what is this magical polyscraper? Well, in short …</p><p>This week begins the real dive into the core of what makes CIVX. Today
(and yesterday, though yesterday hardly counts as a real day) were spent
adding major functionality to the polyscraper, something that's been
overdue for a long time now.</p>
<p>But what is this magical polyscraper? Well, in short, it's magic. A lot
of magic, actually, and that's half the problem. You see, in ye olden
days of CIVX, each data source had to have its own scraper, and these
were called whenever CIVX decided its data was old enough to get shoved
out and replaced with new data. This was all well and fine, except that
it took a very long time to get a scraper written for a new data source.
You would have to define all the columns, give it a location to look,
make sure you understood the site's particular dialect and scrubbed out
any irregularities in their data. What the polyscraper does is
replace all of those individual scrapers with one big
scraper which is smart enough to deal with any URL it finds.</p>
<p>What I've been doing is adding new sources of data to the polyscraper.
In particular, yesterday was spent adding the ability to read files off
of a local disk and properly store them. This, in turn, exposed a few
holes in the underlying framework which needed to be patched. However,
this is a vitally important function, as things like the SunlightNY
scraper I wrote last year work outside of CIVX proper (in Java, no
less) and cannot be thrown into the polyscraper as easily. But I can
download the files locally, and then work on them when it is convenient.
With proper message passing, I can even seamlessly tell the polyscraper
to pick up the files as soon as they are downloaded.</p>
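<p>Treating local files as just another source can be as simple as dispatching on the URL scheme. The function below is a hypothetical sketch of that idea, not the polyscraper's actual implementation:</p>

```python
from urllib.parse import urlparse
from urllib.request import url2pathname, urlopen


def read_source(location):
    """Return the raw bytes behind a plain path, a file:// URL, or a remote URL."""
    parsed = urlparse(location)
    if parsed.scheme in ("", "file"):
        # Local data: either a bare filesystem path or a file:// URL.
        path = url2pathname(parsed.path) if parsed.scheme == "file" else location
        with open(path, "rb") as handle:
            return handle.read()
    # Anything else is handed to urllib to fetch over the network.
    with urlopen(location) as response:
        return response.read()
```

<p>A downloader working outside the main system only has to drop its files somewhere and hand over the path.</p>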
<p>Prior to this, I had been working at the periphery of CIVX, adding
functionality to widgets and individual scrapers. This is my first real
push into the core functionality of CIVX, and it is good to see that I
really have picked enough up in all this time to really start to
understand the underlying structure of everything. Every day I learn
more about what goes on inside this machine, and every day marks another
set of tools I've learned to wield. I can't wait to see how far I get by
the end of the summer.</p>
pyDex 1.02011-05-08T03:19:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2011-05-08:/personal/pydex-10.html<p>And in mostly unrelated but terribly belated news, I finally released
<a class="reference external" href="https://gitorious.org/pydex">pyDex 1.0</a> this morning. Black/White pokedex, new config system, baby
pokemon, and game detection, are all new features introduced in this
version. Old configs should get read properly and get seamlessly updated
to the new version, though …</p><p>And in mostly unrelated but terribly belated news, I finally released
<a class="reference external" href="https://gitorious.org/pydex">pyDex 1.0</a> this morning. Black/White pokedex, new config system, baby
pokemon, and game detection are all new features introduced in this
version. Old configs should get read properly and get seamlessly updated
to the new version, though there may be a few bugs in the process.</p>
Imagine RIT 20112011-05-08T03:16:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2011-05-08:/fossrit/imagine-rit-2011.html<p>So I've been absent for a while lately, but with good reason. I've been
preparing for Imagine RIT, which was held earlier today.
After pulling possibly my longest day's hack yesterday (1pm to 4am,
approximately), I woke up at 9 and headed over to the Innovation Center
to talk to …</p><p>So I've been absent for a while lately, but with good reason. I've been
preparing for Imagine RIT, which was held earlier today.</p>
<p>After pulling possibly my longest day's hack yesterday (1pm to 4am,
approximately), I woke up at 9 and headed over to the Innovation Center
to talk to people about FOSS. All day long, the FOSSBox was filled with
curious people, looking to find out what it is we do. We had kids
poking around on the OLPCs, checking out our software, and occasionally
getting lost opening new programs. Every time I finished showing a group
around all the different projects hosted in the FOSSBox, a brand new
group would appear and start asking questions. It was tiring, especially
on about three hours of sleep, but it was quite fun and exciting showing
off our work to new people.</p>
<p>Of particular interest was Lemonade Stand, which seemed to excite many
of the children who came around. Most seemed to pick up the game with
minimal instruction (though it helped that by now I know very well which
areas to explain... they should get tightened in the future) and most
had lots of fun despite the large quantities of math involved.
And the backdrop to all this was <a class="reference external" href="rise.rit.edu">Rise Above the Crowd</a>, the project I
was spending so much time polishing yesterday. I don't know if I could
explain what exactly Rise is, other than a real-time journalism and news
collection framework, which doesn't really explain anything. Basically,
users submit stories and photos to Rise, which in turn keeps track of
popular and recent content and displays this information on strategically
placed screens throughout the campus. After a few hiccoughs, the servers
were online and serving data to the public, who then voted on their
favorite stories. At the FOSSBOX, we had the projector set up serving
the HD display mode so people could see what was current while they were
there. Since I was the one most familiar with the system at the time, I
tried to field any questions that popped up to the best of
my knowledge, though sometimes that wasn't much help. As the day
progressed, we uncovered a few new bugs in the new system, and have some
new tasks before the final release.</p>
<p>All in all, a very exciting and fun weekend. We made a few contacts,
introduced a few more people to FOSS@RIT, and gave away at least 40
LiveCDs. I think that qualifies as quite a success.</p>
Special Election Time2011-03-29T03:07:00-04:002022-12-07T15:49:18-05:00Katherine Casetag:blog.katherineca.se,2011-03-29:/fossrit/special-election-time.html<p>Rochester is having a special election tomorrow to choose a new Mayor,
as the previous Mayor, Robert Duffy, is now the Lieutenant Governor of
New York. I've been following the race with some interest despite not
being registered to vote in this area, especially since I know one of
the …</p><p>Rochester is having a special election tomorrow to choose a new Mayor,
as the previous Mayor, Robert Duffy, is now the Lieutenant Governor of
New York. I've been following the race with some interest despite not
being registered to vote in this area, especially since I know one of
the candidates.</p>
<p>But more exciting than the actual business of politics is the
election itself, which gives a great opportunity to introduce <a class="reference external" href="http://foss.rit.edu/election/">Election
Scraper 2.0</a>.</p>
<p>You may remember the old version. It was largely untested, had a few
design issues, and wasn't very pretty to look at. Through the election
night, I tweaked and honed it, until finally, mere minutes before the
close of the last poll, it gave a very close approximation to an
accurate number.</p>
<p>We probably won't have that many problems this time around. But who
knows? I'm looking forward to an exciting night in any case, and I'll be
ready to fix any problems as they come.</p>
Election Aftermath2011-03-29T02:59:00-04:002022-12-07T15:49:52-05:00Katherine Casetag:blog.katherineca.se,2011-03-29:/fossrit/election-aftermath.html<p>So the election was almost a week ago. Here's how our little experiment
turned out.</p>
<p><img alt="image0" src="http://farm2.static.flickr.com/1386/5143595646_9e4e56f556.jpg" /></p>
<p>The evening started out rather slowly, with information trickling in
about races that were all but confirmed. We had the Innovation Center's
display set up with various maps and information, and my page in the …</p><p>So the election was almost a week ago. Here's how our little experiment
turned out.</p>
<p><img alt="image0" src="http://farm2.static.flickr.com/1386/5143595646_9e4e56f556.jpg" /></p>
<p>The evening started out rather slowly, with information trickling in
about races that were all but confirmed. We had the Innovation Center's
display set up with various maps and information, and my page in the
center in five columns, all showing zero votes.
We waited until the polls closed at 9PM, and then we waited some more
for the information to travel to the Board of Elections and make it onto
the XML file. Even at this point, I still didn't know if the scraper was
going to work, so I was checking the <a class="reference external" href="http://66.192.47.50/flashresults.html">official unofficial results page</a>
to make sure they didn't have anything we didn't.</p>
<p>When the first results popped on the screen, it was amazing. When
results kept pouring in, it was even more impressive. When we finally
figured out that all those numbers were wrong, it was time to start
getting things fixed.</p>
<p>There were several differences from what the Canadian data had led me to
expect, and so I spent most of the night fixing small problems as they
showed up. First, votes in NYS are tallied by party, even if multiple
parties are running the same candidate. This was quite different from
what I had been led to expect, and so the numbers we started out with
were quite... bizarre. Before we realized what was going on, it looked
like Paladino was beating Cuomo in Monroe County almost 4 to 1. Checking
back with the flash page revealed that I was only counting one party for
each candidate, and whichever came last would show up on the screen. I
had to keep a running count for each candidate rather than throwing out
old data when going through the file.</p>
<p>Having fixed that, and seeing that the numbers now matched what was
coming out of the flash page, I settled back down and watched the third
parties fight for 50,000 votes. That is, until someone pointed out to me
that Andrew Cuomo was currently listed with 1.7 million votes for Monroe
County alone. Seeing as how the total population of the area is only
about 1 million, there was clearly something wrong. It seems that in my
haste to keep count of how many votes a candidate got from each party, I
was not clearing these numbers once the file was done being read. This
was probably the hardest fix I had to make during the night, as it broke
some of the models I had naively put forth on how the program should
run.</p>
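<p>For the curious, here is a minimal sketch of the two fixes described so
far. It is not the scraper's actual code: the row format, the function name,
and the sample numbers are all made up for illustration. The idea is simply
to add up every party line for a candidate, and to start from an empty tally
on every read of the file so nothing carries over.</p>

```python
from collections import defaultdict


def tally(rows):
    """Sum votes per candidate across all party lines.

    rows is an iterable of (candidate, party, votes) tuples taken from a
    single read of the results file (a hypothetical, already-parsed format).
    """
    # A fresh dictionary on every call: counts never leak between reads.
    totals = defaultdict(int)
    for candidate, party, votes in rows:
        # Keep a running count instead of overwriting with the last party seen.
        totals[candidate] += votes
    return dict(totals)


# Invented sample data: one candidate running on two party lines.
rows = [
    ("Candidate A", "Party 1", 120),
    ("Candidate A", "Party 2", 30),
    ("Candidate B", "Party 3", 90),
]

print(tally(rows))  # {'Candidate A': 150, 'Candidate B': 90}
```

<p>Calling the function a second time on the same rows gives the same
totals, which is exactly the property the 1.7 million vote bug was
missing.</p>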
<p>I eventually ended up with a workable solution, though I was sad to see
the third party numbers were much lower now. There remained only one
small problem, being that most vote totals seemed to be exactly twice
what the flash site claimed they were. Finally, I realized that there
was a master 'total votes' line for each candidate which made most of
the hacks and shims I had written into my scraper obsolete. Instead, I
could simply show only this total line and everything would be right
again.</p>
<p>This was an amazing experience, especially how the work I was doing was
being picked up by people outside of RIT. Having to think fast on my
feet was a new one too. I, like most programmers, usually like to
examine a problem in detail and have time to test theories before
publishing changes. Here, I was coding just about as fast as I could to
fix each bug that turned up, and we managed to make it work before the
last precincts rolled in.</p>
<p>But even more than that, this was an opportunity to take the skills and
tools I've been accumulating through my time at RIT and make them do
something more than just a class assignment. This wasn't just another
program to be written and forgotten, this was something that represented
the school to some of the public who perhaps hadn't heard about what
we're doing here. And that is cool.</p>
Election Night2011-03-29T02:59:00-04:002022-12-07T15:50:42-05:00Katherine Casetag:blog.katherineca.se,2011-03-29:/fossrit/election-night.html<p>So election night is tonight. I'll have free time soon.</p>
<p>The main reason for my lack of time lately has to do with the election.
On Wednesday, just before I was going to classes, I got a call from Remy
asking what my schedule was and if I'd like to …</p><p>So election night is tonight. I'll have free time soon.</p>
<p>The main reason for my lack of time lately has to do with the election.
On Wednesday, just before I was going to classes, I got a call from Remy
asking what my schedule was and if I'd like to take a trip to WXXI to
talk with Rachel Ward and look at their election coverage.</p>
<p>A few hours later, we're in the lobby of WXXI, being mistaken for
musicians (as anyone who meets Remy for the first time is likely to do),
and waiting for Rachel to come down.</p>
<p>We took a quick tour of the facilities, met a few other people, and got
down to business. Monroe County's Board of Elections hosts a server with
the live election information on a static IP and takes it down once the
election is done. Further, this information is released in the form of
a flash app, making it harder to move the information.</p>
<p>After a short discussion, we decided that we weren't going to get
anywhere without making a call to the BoE. Rachel did her magic on the
phone, getting more information out of them than I ever could have done.
To make a long story short, we learned that this flash application is driven by an XML
file (actually two), and that the system was provided by the company who
made the machines and was pretty much a black box as far as they were
concerned.</p>
<p>A short bit of Googling later, I had identified another location that
used these machines: London, Ontario. Using their XML files as a base, I
made a quick scraper that spat out a basic table of races and
candidates. It wasn't pretty, but it worked.</p>
<p>Over the next few days, I poked and prodded at it, adding features and
eventually moving to an HTML output and checking against a few more
locations, and cleaned up the code some more. We met with Rachel once
more yesterday, set the site up to be running on innovationtrail.org,
and started the countdown.</p>
<p>The current version of the scraper is now running on
<a class="reference external" href="http://foss.rit.edu/election/">http://foss.rit.edu/election/</a> and should update its information every 30
seconds, once information exists.</p>
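<p>The refresh behaviour can be sketched roughly like this. This is an
illustration only, not the scraper itself: the function names, the
<code>rounds</code> and <code>sleep</code> parameters, and the sample data
are assumptions added so the example can run without a live results
server.</p>

```python
import time


def poll(fetch, render, interval=30, rounds=None, sleep=time.sleep):
    """Call fetch() every `interval` seconds and render what it returns.

    fetch returns None while no results exist yet, in which case nothing is
    rendered. rounds=None polls forever; a number stops after that many tries.
    """
    completed = 0
    while rounds is None or completed != rounds:
        data = fetch()
        if data is not None:
            render(data)
        completed += 1
        sleep(interval)


# Stand-ins for the real feed: no data on the first poll, then two updates.
feed = iter([None, {"Race 1": 10}, {"Race 1": 25}])
shown = []
naps = []
poll(lambda: next(feed), shown.append, rounds=3, sleep=naps.append)

print(shown)  # [{'Race 1': 10}, {'Race 1': 25}]
```

<p>Passing in the sleep function makes the loop easy to try out without
actually waiting half a minute between polls.</p>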
Winter Hackathon 22011-02-19T02:58:00-05:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2011-02-19:/fossrit/winter-hackathon-2.html<p>So I was gonna do a write-up about the winter hackathon.
That didn't happen too much.
But there's another one going on now.
^__^</p>
Boston2011-01-23T07:14:00-05:002022-12-07T15:45:11-05:00Katherine Casetag:blog.katherineca.se,2011-01-23:/civx/boston.html<p>So here we are in OLPC HQ, right in the middle of MIT. It's pretty sweet
having <a class="reference external" href="http://lewk.org/">Luke</a> around again to hack CIVX with us.</p>
<p>I've had a lot to do in the past few days. Remy's been showing me
scrapers and models, and I've been helping transition <a class="reference external" href="http://foss.rit.edu/user/17">Kate's</a> people …</p><p>So here we are in OLPC HQ, right in the middle of MIT. It's pretty sweet
having <a class="reference external" href="http://lewk.org/">Luke</a> around again to hack CIVX with us.</p>
<p>I've had a lot to do in the past few days. Remy's been showing me
scrapers and models, and I've been helping transition <a class="reference external" href="http://foss.rit.edu/user/17">Kate's</a> people
dashboard into an integrated component or integrating <a class="reference external" href="http://rebeccanatalie.com/">Rebecca's</a> theme
changes on the side. It's been hectic and fun and tough, but I finally
feel like I'm contributing to a project, something with substance and
goals, not just writing code to accomplish a task like some of my
previous co-ops. Being in a team this large helps, especially when we
pull in outside help like Luke, but I think it's mainly Remy's
infectious excitement for the project. When he gets down to work, one
can't help but feel his vision and be excited for the possibilities.</p>
<p>Unfortunately, that means I had precious little time to pay attention to
the other teams. Three separate groups hacking away at their own
projects, tossing ideas about and getting input from a few members
upstream, not to mention the whole OLPC offices around the corner- this
was a right proper hackathon, and something that makes me excited for the
future of these projects.</p>
Killer Bunnies and the Towering Draw Pile2011-01-23T07:12:00-05:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2011-01-23:/personal/killer-bunnies-and-the-towering-draw-pile.html<p>There was a hackathon Friday night, and a write-up is coming soon, but
first a bit of random chatter.</p>
<p>I have, for a few years, been collecting the various pieces that make up
<a class="reference external" href="http://boardgamegeek.com/boardgame/3699/killer-bunnies-and-the-quest-for-the-magic-carrot">Killer Bunnies and the Quest for the Magic Carrot</a>. For Christmas, I
received the Stainless Steel booster …</p><p>There was a hackathon Friday night, and a write-up is coming soon, but
first a bit of random chatter.</p>
<p>I have, for a few years, been collecting the various pieces that make up
<a class="reference external" href="http://boardgamegeek.com/boardgame/3699/killer-bunnies-and-the-quest-for-the-magic-carrot">Killer Bunnies and the Quest for the Magic Carrot</a>. For Christmas, I
received the Stainless Steel booster pack, leaving me one short (Ominous
Onyx) of completing the game's many many expansion packs. In total, my
game had over 650 cards and was disturbingly fun to play.
Yesterday, I was surprised upon reaching my friend Eric's apartment to
be greeted with not just one, but two new expansions, one of which I
didn't even know existed. I am now the owner of not just the Ominous
Onyx booster pack, but also the mysterious new Chocolate pack, adding in
many of the hard-to-find promo cards that Playroom has put out over the
years.</p>
<p>We put together the full game, now consisting of 770 playable cards, 11
dice, 9 pawns, a large cardboard ball, and various plastic tokens and
stands to help try and make sense of all the nonsense. The draw pile
stacks over a foot high thanks to the thick card construction, and
shuffling amounts to little more than a best-effort situation. The game
is frankly insane and a blast to play, with a breadth of randomness not
normally found in such games. While strategy is important, one can
switch freely between winning and losing with a few short turns, and
quite frequently does. The game is long enough to be satisfying
but not so long as to drag on (as long as people are paying attention),
and the variety and humor in the cards keep the game fresh play after
play.</p>
<p>Still, my game is not quite complete. One more add-on remains, though
not really a necessary one. A spin-off game called Kinder Bunnies also
exists, aimed at a younger audience with streamlined rules and
simplified gameplay. It is also completely compatible with the main
Killer Bunnies game, with a few cards directly referencing Kinder
Bunnies from Killer Bunnies. It would also add another ten "Magic
Carrots" to the game, which are a sort of victory point mechanism, where
once all the Magic Carrots have been picked up, the game ends. Thus, the
addition would extend gameplay a bit more and provide new challenges for
longtime players (not to mention many more dice to play with).</p>
<p>So that's how I spent my weekend apart from coding like a superstar. But
don't worry, I'll get to that soon.</p>
pyDex on Sugar2011-01-19T23:51:00-05:002022-12-07T15:45:54-05:00Katherine Casetag:blog.katherineca.se,2011-01-19:/personal/pydex-on-sugar.html<p>A warning, this gets very technical with lots of acronyms and jargon.
tl;dr: I have a new pyDex branch which *should* run on the OLPC.
A few days ago, I decided I'd finally see how well pyDex runs on the
OLPC. Turns out, it runs pretty well, which isn't …</p><p>A warning, this gets very technical with lots of acronyms and jargon.
tl;dr: I have a new pyDex branch which *should* run on the OLPC.
A few days ago, I decided I'd finally see how well pyDex runs on the
OLPC. Turns out, it runs pretty well, which isn't too surprising as it
is written in pure pyGTK and Glade, both of which are well supported in
Sugar.</p>
<p>So today, I loaded up my dev files and set to work. First, though, I had
to clean up my dev branch and finish committing the few hacks I had
accumulated over the last few months. I finally fixed the zero index bug
and found I had a problem in my new scraper that was causing all the
evolution problems. In any case, that's all fixed, so my dev branch is
nice and clean and almost ready for the Black/White release in a few
weeks.</p>
<p>Getting back to Sugar, I finally found <a class="reference external" href="http://magazine.redhat.com/2007/04/26/building-the-xo-porting-a-pygtk-game-to-sugar-part-two/">a good tutorial</a> on *porting*
a pyGTK program rather than writing a new one. While admittedly I
haven't looked very hard, I had had a bit of a problem getting past the
example activity before, probably due to my use of Glade as I cannot
replace my top window as easily when it is automatically pulled from
Glade's XML and I really don't feel like defining everything in code.
The most useful thing I found in this tutorial is sending different
parents to the main panel depending on where it is called from. So if we
call the program normally, main_window is still loaded from Glade, but
if Sugar loads it, we use their prebuilt panel.</p>
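<p>Stripped of all the toolkit details, the trick looks something like the
sketch below. None of this is real pyGTK, Glade, or Sugar API; the
<code>Container</code> and <code>MainPanel</code> classes and both launcher
functions are hypothetical stand-ins that only show the shape of the
idea: build the panel once and hand it whichever parent applies.</p>

```python
class Container:
    """Stand-in for a top-level window or an activity canvas."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add(self, widget):
        self.children.append(widget)


class MainPanel:
    """The application's UI, attached to whatever parent it is given."""

    def __init__(self, parent):
        self.parent = parent
        parent.add(self)


def run_standalone():
    # Normal launch: the program supplies its own top-level window
    # (in pyDex this would be main_window loaded from the Glade file).
    window = Container("main_window")
    return MainPanel(window)


def run_as_activity(activity_canvas):
    # Sugar launch: the activity already provides a container, so the
    # panel is attached to it and no top-level window is created.
    return MainPanel(activity_canvas)


standalone = run_standalone()
canvas = Container("sugar_canvas")
embedded = run_as_activity(canvas)
print(standalone.parent.name, embedded.parent.name)  # main_window sugar_canvas
```

<p>The panel code never needs to know which environment it is running in,
which is what makes the port so small.</p>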
<p>It still needs to be tested, and I need to add actual activity
information (and an icon eventually), but I think it should work. I'll
probably get to test it sometime this weekend, maybe even turn it into a
real activity by then. This is still just a small project, and I doubt
it will ever go up on Sugar's activities site, but it has given me a
much better understanding about how non-pygame activities work.</p>
Wikiotics Visual Tweak2010-12-08T23:51:00-05:002022-12-07T15:48:23-05:00Katherine Casetag:blog.katherineca.se,2010-12-08:/fossrit/wikiotics-visual-tweak.html<p>So <a class="reference external" href="http://trosehfoss.blogspot.com/">trose</a> and I have been asked to put a bow on the Wikiotics work
while we're between projects. We both hit some bumps at the end of the
quarter, so there's a few unused hours left to put into this project.
One of the things we've been tasked with …</p><p>So <a class="reference external" href="http://trosehfoss.blogspot.com/">trose</a> and I have been asked to put a bow on the Wikiotics work
while we're between projects. We both hit some bumps at the end of the
quarter, so there's a few unused hours left to put into this project.
One of the things we've been tasked with is looking at the CSS and
seeing if there is anything we can do to it to make it look a little
better. Now, neither trose nor I are designers by any stretch of the
imagination, but we've got a few ideas that might go a ways to making
Wikiotics look more attractive.</p>
<p><img alt="image0" src="http://3.bp.blogspot.com/_NuCXZozR8O8/TQAXjymF2KI/AAAAAAAAAi8/02g5WgJiCj0/s320/Screenshot-Wikiotics%2B-%2BNamoroka-1.png" /></p>
<p>This is how Wikiotics looks today. The different actions possible on a
page are hanging out awkwardly in the center, and there's no clear
division between the header and the content. After looking at this for a
while, trose and I decided that we would emulate a few of the more
reasonable designs of the more common wiki systems out there.</p>
<p><img alt="image1" src="http://4.bp.blogspot.com/_NuCXZozR8O8/TQAXkGhFoPI/AAAAAAAAAjE/lnUp7vNtbAI/s320/Screenshot-Wikiotics%2B-%2BNamoroka.png" /></p>
<p>This is how Wikiotics looks on our local git. The 'tabs' have been moved
to the right, and pushed further into the top bar (maybe too much).
Further, the active tab bleeds into the content area, which is lightly
colored to mark it as a separate area from the header and footer areas.
The changed text and color on the top bar are not our doing; this
appears to be a side-effect of running a subinstance of another page as
we are, as is the case with the red link color.
These changes are a small step, but I think they go some way towards
making Wikiotics a bit more approachable to outsiders. There are a few
more things to be done on Friday, but I don't think we're yet done
looking at this.</p>
Wikiotics at the Constellation Commons2010-12-08T23:30:00-05:002022-12-07T15:33:58-05:00Katherine Casetag:blog.katherineca.se,2010-12-08:/fossrit/wikiotics-at-the-constellation-commons.html<p>This happened quite some time ago now, but with the election coverage I
haven't had much time to get this out.</p>
<p>Two weeks ago, Taylor and I were asked to appear at the opening of the
Constellation Commons for Global Learning, to show off Wikiotics as a
student project and …</p><p>This happened quite some time ago now, but with the election coverage I
haven't had much time to get this out.</p>
<p>Two weeks ago, Taylor and I were asked to appear at the opening of the
Constellation Commons for Global Learning, to show off Wikiotics as a
student project and set up Wikiotics on all the computers. We got to
show the site off to President Destler and to a few foreign language
professors who were all quite excited about such a thing existing.</p>
<p>In all we showed off the project to a fair number of people and got a
very positive reaction from most of them. I don't know if we've seen any
more users from this, but it seems to me it should help spread the word
about Wikiotics.</p>
Lemonade Stand 2.1 Released2010-11-12T07:03:00-05:002022-12-07T15:35:16-05:00Katherine Casetag:blog.katherineca.se,2010-11-12:/fossrit/lemonade-stand-21-released.html<p>I am proud to announce the immediate availability of <a class="reference external" href="http://activities.sugarlabs.org/en-US/sugar/addon/4321">Lemonade Stand 2.1</a>,
with a quarter's worth of work and effort behind it.</p>
<p>There are few gameplay changes in this release, but it feels like a
brand new game thanks to the wonderful new art assets courtesy of <a class="reference external" href="http://jtmengel.blogspot.com/">JT</a>.</p>
<p>There's …</p><p>I am proud to announce the immediate availability of <a class="reference external" href="http://activities.sugarlabs.org/en-US/sugar/addon/4321">Lemonade Stand 2.1</a>,
with a quarter's worth of work and effort behind it.</p>
<p>There are few gameplay changes in this release, but it feels like a
brand new game thanks to the wonderful new art assets courtesy of <a class="reference external" href="http://jtmengel.blogspot.com/">JT</a>.</p>
<p>There's also the beginning of a help system, and many generalizations in
the code to make adding new features even easier than before.</p>
<p>We're a little late for a midnight release, but for my first real game
release, I don't think we did half bad.</p>
Lemonade Recap2010-11-10T20:26:00-05:002022-12-07T14:31:54-05:00Katherine Casetag:blog.katherineca.se,2010-11-10:/fossrit/lemonade-recap.html<p>So here we are at the end of the quarter. Have we accomplished our
goals? I think so.
We've gotten a lot done in the last ten weeks. We got some new images,
we got some new ideas into the game, and on the whole, it looks much
better. The …</p><p>So here we are at the end of the quarter. Have we accomplished our
goals? I think so.
We've gotten a lot done in the last ten weeks. We got some new images,
we got some new ideas into the game, and on the whole, it looks much
better. The game is slowly taking shape into something people might
actually want to play.</p>
<p>But let's compare what we've actually done to our goals. The first goal
was to get an updated background image with prettier graphics that
changed with the weather. This is definitely done, and has been for a
week or two. We've since invented a few more graphics that need to be
included in the main screen, but the placeholders we have should do for
now.</p>
<p>Second, we wanted to get a graphical store in place and working. That's
almost done, and should be finished before the end of the week. The
store is fully functional, it just needs a few tweaks and some feedback
on the user's current inventory. For the time being, a quick fix will do until
we can get a more permanent solution.</p>
<p>Finally, the overlooked problem we were ignoring this whole time: an
in-game help system. This is moving along, not well, but it should be
functional by our release.</p>
Lemonade Week... 8?2010-10-26T23:59:00-04:002022-12-07T15:36:15-05:00Katherine Casetag:blog.katherineca.se,2010-10-26:/fossrit/lemonade-week-8.html<p>So... lots of stuff has happened in the last few weeks.</p>
<p>First, we have a new background, thanks to <a class="reference external" href="http://jtmengel.blogspot.com/">JT</a>. In fact, we have three
now, and as of five minutes ago, a different one will show up depending
on the current weather.</p>
<p>There's also been a lot of behind-the-scenes …</p><p>So... lots of stuff has happened in the last few weeks.</p>
<p>First, we have a new background, thanks to <a class="reference external" href="http://jtmengel.blogspot.com/">JT</a>. In fact, we have three
now, and as of five minutes ago, a different one will show up depending
on the current weather.</p>
<p>There's also been a lot of behind-the-scenes updates to make adding new
events and weather types easier in the future. Generally, all that kind
of stuff goes in constants.py, but that might have to change if we need
to add too much more stuff.</p>
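<p>As a rough idea of what that looks like, here is a sketch of a
constants-style lookup for the weather backgrounds. The weather names, file
paths, and function name are invented for this example and are not copied
from the game's constants.py.</p>

```python
# Hypothetical constants: one background image per weather type.
WEATHER_BACKGROUNDS = {
    "sunny": "images/background_sunny.png",
    "cloudy": "images/background_cloudy.png",
    "rainy": "images/background_rainy.png",
}

# Used whenever the weather has no dedicated artwork yet.
DEFAULT_BACKGROUND = WEATHER_BACKGROUNDS["sunny"]


def background_for(weather):
    """Return the background image path for the current weather."""
    return WEATHER_BACKGROUNDS.get(weather, DEFAULT_BACKGROUND)


print(background_for("rainy"))  # images/background_rainy.png
```

<p>With the data kept in one table, adding a new weather type is a one-line
change, and anything without its own image falls back to the default.</p>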
<p>We've got a few exciting ideas we hope to get done in the next few
weeks, and I look forward to seeing what we can get done.</p>
Updating git2010-10-04T19:39:00-04:002022-12-07T16:02:11-05:00Katherine Casetag:blog.katherineca.se,2010-10-04:/fossrit/updating-git.html<p>We here at RIT are working on our own branch of Ductus, and Ductus got a
big addition over the weekend, namespaces. Unfortunately, our repository
didn't know anything about that and needed to be updated.</p>
<p>Now, I'm not a novice to using git, nor are any of the people in …</p><p>We here at RIT are working on our own branch of Ductus, and Ductus got a
big addition over the weekend, namespaces. Unfortunately, our repository
didn't know anything about that and needed to be updated.</p>
<p>Now, I'm not a novice to using git, nor are any of the people in the
FOSSBOX. But we'd never had to worry about upstream repositories; for
all of our projects, we *were* upstream.</p>
<p>Taylor and I finally worked it out <a class="footnote-reference" href="#footnote-1" id="footnote-reference-1">[1]</a>, and now we're trying to fix all
the problems that came up with the new hotness as best as we can. But
even when you feel comfortable with something, you can never quite
escape the possibility that something you hadn't considered might show
up.</p>
<table class="docutils footnote" frame="void" id="footnote-1" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-1">[1]</a></td><td><p class="first">The process, for reference, as near as I can figure</p>
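<div class="last"><div class="highlight"><pre><span></span># "upstream", its URL, and the "master" branch are placeholders
git remote add upstream &lt;upstream-url&gt;
# fetch upstream's changes and merge them into the current branch
git pull upstream master
# publish the merged result to our own repository
git push origin master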
</pre></div>
</div></td></tr>
</tbody>
</table>
Wikiotics Week 22010-09-29T18:36:00-04:002022-12-07T15:31:42-05:00Katherine Casetag:blog.katherineca.se,2010-09-29:/fossrit/wikiotics-week-2.html<p>We're into week 2 of Wikiotics' <a class="reference external" href="http://alpha.wikiotics.org/wiki/four_week_plan">four week plan</a> to push Wikiotics
development, so I thought I'd get out all the stuff I did last week.</p>
<p>First, in order to get myself acquainted with the user interface I
translated (badly) the introduction lesson into <a class="reference external" href="http://alpha.wikiotics.org/wiki/Russian_lessons">Russian</a> (and later
tapped <a class="reference external" href="http://rockettium.net/wordpress/">Ellen Rocket's …</a></p><p>We're into week 2 of Wikiotics' <a class="reference external" href="http://alpha.wikiotics.org/wiki/four_week_plan">four week plan</a> to push Wikiotics
development, so I thought I'd get out all the stuff I did last week.</p>
<p>First, in order to get myself acquainted with the user interface I
translated (badly) the introduction lesson into <a class="reference external" href="http://alpha.wikiotics.org/wiki/Russian_lessons">Russian</a> (and later
tapped <a class="reference external" href="http://rockettium.net/wordpress/">Ellen Rocket's</a> German prowess for a <a class="reference external" href="http://alpha.wikiotics.org/wiki/German_lessons">German</a> one).</p>
<p>Later I was asked to look into creoleparser to see if I could fix <a class="reference external" href="http://code.google.com/p/creoleparser/issues/detail?id=42">bug
42</a> to disable external image loading, which we also want for Ductus.
Getting into yet another python codebase so soon was interesting,
especially when I had to switch back and forth looking at things
<a class="reference external" href="http://trosehfoss.blogspot.com/">Taylor</a> was doing.</p>
<p>That's all I really got into for week one, but now that I'm fairly
comfortable with the various technologies being used, I think things
will really start rolling soon.</p>
Lemons Taking Off2010-09-23T14:35:00-04:002022-12-07T15:32:33-05:00Katherine Casetag:blog.katherineca.se,2010-09-23:/fossrit/lemons-taking-off.html<p>It looks like Lemonade Stand is taking off for fall quarter's <a class="reference external" href="http://teachingopensource.org/index.php/RIT/The_Course">HFOSS</a>
class. This is quite exciting and there's a number of things I'm looking
to get in for a 2.2 release:</p>
<ul class="simple">
<li>More events</li>
<li>Fluctuating prices</li>
<li>Loyalty and advertising</li>
</ul>
<p>among <a class="reference external" href="http://wiki.sugarlabs.org/go/Lemonade_Stand#Additional_Ideas_.26_Features">others</a>.</p>
<p>The 2.1 release, meanwhile, I would like …</p><p>It looks like Lemonade Stand is taking off for fall quarter's <a class="reference external" href="http://teachingopensource.org/index.php/RIT/The_Course">HFOSS</a>
class. This is quite exciting and there's a number of things I'm looking
to get in for a 2.2 release:</p>
<ul class="simple">
<li>More events</li>
<li>Fluctuating prices</li>
<li>Loyalty and advertising</li>
</ul>
<p>among <a class="reference external" href="http://wiki.sugarlabs.org/go/Lemonade_Stand#Additional_Ideas_.26_Features">others</a>.</p>
<p>For the 2.1 release, meanwhile, I would like to focus on usability and
graphics, as those have been in the works the longest so far.
Whether any of this will work out the way I'm imagining is debatable,
but I've got a plan and two <a class="reference external" href="http://smw-os.blogspot.com/">awesome</a> <a class="reference external" href="http://jtmengel.blogspot.com">people</a> working on it with me.</p>
Refreshing Lemonade2010-09-23T14:04:00-04:002022-12-07T15:33:02-05:00Katherine Casetag:blog.katherineca.se,2010-09-23:/fossrit/refreshing-lemonade.html<p>In other news, yesterday marked the <a class="reference external" href="http://blog.jlewopensource.com/2010/07/lemonade-stand-release.html">final release of Lemonade
Stand 2.0</a>. Now running on the <a class="reference external" href="https://fedorahosted.org/fortune_hunter/wiki/FortuneEngine">Fortune Engine</a> and containing 0% of
my code, this marks an important milestone in my open-source
<a class="reference external" href="http://sugarlabs.org/">sugar-based</a> lemonade stand game as it's still getting developed!</p>
<p>While I can claim none of the credit …</p><p>In other news, yesterday marked the <a class="reference external" href="http://blog.jlewopensource.com/2010/07/lemonade-stand-release.html">final release of Lemonade
Stand 2.0</a>. Now running on the <a class="reference external" href="https://fedorahosted.org/fortune_hunter/wiki/FortuneEngine">Fortune Engine</a> and containing 0% of
my code, this marks an important milestone in my open-source
<a class="reference external" href="http://sugarlabs.org/">sugar-based</a> lemonade stand game as it's still getting developed!</p>
<p>While I can claim none of the credit for the code, it is still
satisfying to watch my very first open source project still on its feet,
teetering around like a drunken toddler.</p>
Wikiotics2010-09-20T19:45:00-04:002022-12-07T15:27:20-05:00Katherine Casetag:blog.katherineca.se,2010-09-20:/fossrit/wikiotics.html<p>As of Friday, I have been doing some work (along with <a class="reference external" href="http://trosehfoss.blogspot.com/">Taylor</a>) for a
new set of people, <a class="reference external" href="http://alpha.wikiotics.org">Wikiotics</a>. Their goal is basically to create a
FLOSS alternative to some of the commercial translation software out
there, particularly the ever-so-expensive Rosetta Stone. The whole thing
is built on a custom …</p><p>As of Friday, I have been doing some work (along with <a class="reference external" href="http://trosehfoss.blogspot.com/">Taylor</a>) for a
new set of people, <a class="reference external" href="http://alpha.wikiotics.org">Wikiotics</a>. Their goal is basically to create a
FLOSS alternative to some of the commercial translation software out
there, particularly the ever-so-expensive Rosetta Stone. The whole thing
is built on a custom wiki platform called <a class="reference external" href="http://code.ductus.us/">Ductus</a>.</p>
<p>Today, I cloned their development tree for local development, located at
<a class="reference external" href="http://gitorious.org/ductus-rit/">http://gitorious.org/ductus-rit/</a>. Right now, nothing's been done to
it, but this will be the staging area for our work here at RIT (and
anyone who wants to get any additions into our tree) before going off to
the master tree at ductus.us.</p>
<p>We're still figuring out where everyone stands with development. I think
the guys at ductus are pleasantly surprised at how much we already have
under our belts here at the FOSSBOX. But that's just how we roll here at
RIT.</p>
Almost There...2010-09-09T18:11:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2010-09-09:/civx/almost-there.html<p>Today I worked my way through more of <a class="reference external" href="http://rebeccanatalie.com">Rebecca's</a> changes to the people
dashboard. It was actually in better condition than I had thought last
night. The first tab was the only one with the majority of the changes,
and most of the changes were easily applied once I understood …</p><p>Today I worked my way through more of <a class="reference external" href="http://rebeccanatalie.com">Rebecca's</a> changes to the people
dashboard. It was actually in better condition than I had thought last
night. The first tab was the only one with the majority of the changes,
and most of the changes were easily applied once I understood them,
which is what I spent hours yesterday trying to do.</p>
<p>I have a problem with writing quantities of HTML because they end up as
giant messes. Even XHTML strict isn't enough for me, though it does come
closer. Invariably, any file of reasonable complexity is going to have
whitespace inconsistencies, mislaid elements, or even entire sections
forgotten. This morning I took a fresh look at the generated HTML and
dashboard.mak and tried simply to understand their structure. After
really getting into the template version, I began to see what was actually
necessary and what was not in Rebecca's blinged-up copy. There are still a few
bits missing from the final version, but I think I can get them nailed
down before I leave today.</p>
CIVX Stuff2010-09-09T18:11:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2010-09-09:/civx/civx-stuff.html<p>So I've been spending some time finally getting some time to familiarize
myself with CIVX and all the parts that make up the system. I'm almost
comfortable with how the whole thing works together, though I'm not
entirely clear on how some things work to finally get to the screen …</p><p>So I've finally been taking some time to familiarize
myself with CIVX and all the parts that make up the system. I'm almost
comfortable with how the whole thing works together, though I'm not
entirely clear on how some things work to finally get to the screen.</p>
<p>Still, I got things running, both the current CIVX build and <a class="reference external" href="http://foss.rit.edu/user/17">Kate's</a>
widget, though getting the two together is harder than it seems like it
should be. On the other hand, I tweaked the GettingStarted page of the
CIVX wiki to smooth out a few bumps I ran into during the install
process. The Ubuntu instructions are still broken, but I plan to
look into that in the future.</p>
<p>In terms of actually getting things done, I haven't done much of
substance. I've been immersing myself in CIVX, git, pep8, and all the
different things that go into the FOSS BOX. While I've been in the open
source community for a while now (and my very first project is almost
1.5 years old), I've never really tried to insert myself into a project
that was running in full swing with established work. Somehow this seems
different than the places I've worked previously... though that could
simply be the distributed nature of the work. With <a class="reference external" href="http://lewk.org">Luke</a> no longer
down the hall somewhere, getting information has turned into a more
interesting experience when the necessary parts are in his head.</p>
<p>I'm looking forward to an exciting next few weeks here, and a fruitful
next few years with the information learned from this experience.</p>
Here be Dragons2010-09-09T18:11:00-04:002022-12-07T15:30:46-05:00Katherine Casetag:blog.katherineca.se,2010-09-09:/civx/here-be-dragons.html<p>Today was the day we got <a class="reference external" href="http://www.rebeccanatalie.com">Rebecca</a> on to a git repo towards
implementing her changes to the people dashboard. It's only a halfway
step, as she isn't entirely using the proper version of the widgets but
a hacked one that works locally, and needs to be morphed into something …</p><p>Today was the day we got <a class="reference external" href="http://www.rebeccanatalie.com">Rebecca</a> on to a git repo towards
implementing her changes to the people dashboard. It's only a halfway
step, as she isn't entirely using the proper version of the widgets but
a hacked one that works locally, and needs to be morphed into something
that will work in the context of the CIVX stack. Hopefully she will soon
be up to speed and able to make a more direct submission.</p>
<p>As it stands, this is good stuff, and the end result is definitely
usable, even if it's not in the format we want it just yet. Translating
one to the other has proven tiresome as well; I spent the better part of
three hours just to clean up the first of our tabs. There are a few
items I think might not be translatable, things that mainly live in
Moksha, but we'll get to that when we get to that. Right now, I need a
graphical diff viewer that cares neither for whitespace nor line breaks,
just for what's in the code. Good old meld, which I had been using, has
proven itself inadequate to the massive differences between these files.
Hopefully tomorrow I will have a fresh start from which to sculpt this
mess I've constructed into its completed form.</p>
Just another ten-hour day2010-09-09T18:11:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2010-09-09:/civx/just-another-ten-hour-day.html<p>Today was another exciting day in CIVX-land.</p>
<p>tl;dr: I spend the day helping other people and being awesome
It started out with the latest in a series of attempts at getting
<a class="reference external" href="http://www.rebeccanatalie.com">Rebecca</a> to a working CIVX repo. As I work through this with her, I am
slowly working out …</p><p>Today was another exciting day in CIVX-land.</p>
<p>tl;dr: I spent the day helping other people and being awesome.</p>
<p>It started out with the latest in a series of attempts at getting
<a class="reference external" href="http://www.rebeccanatalie.com">Rebecca</a> to a working CIVX repo. As I work through this with her, I am
slowly working out how all this works together (though admittedly it is
largely me flailing while Rebecca watches). With <a class="reference external" href="http://lewk.org">Luke's</a> help, I
eventually realized that the only thing really needed for CIVX to run is
python-virtualenv (it can run without, but that involves actually
installing python packages to /usr/local, something generally undesired
for a development environment). After some question of whether Rebecca
had sudo permission, we eventually discovered she did, and a short sudo
easy_install virtualenv later, we were ready to start installing the
CIVX stack.</p>
<p>If you haven't read the <a class="reference external" href="https://fedorahosted.org/civx/wiki/Setup">CIVX developer's guide</a> (and I think it's
probably safe to assume you haven't), it's a bit of a mess. Not actually
bad, but short and disorganized. This isn't too bad when you've got a
small, fairly tight development group with the main brain usually a ping
away on IRC, but as people are finding CIVX, I have taken it upon myself
to document every bump in my path. When I was first thrust into CIVX,
the page was much sparser, with less detail and fewer sections; I have
added sections whenever there was a question of how to do something that
eventually came down to 'ask Luke'. Any time an arcane set of commands
came up, I tried to get them on the page with as much information as I
could figure out. Hopefully someone will take pity on my notes and make
them more descriptive.</p>
<p>Around this time <a class="reference external" href="http://foss.rit.edu/user/17">Kate</a> also had a few questions for me, most of which
I could figure out. However, I was still trying to get Rebecca running
and hadn't even touched my own computer more than to turn on IRC and
look a few things up for Rebecca. Kate was having some trouble as she was
trying to scrape information off the NYS Senate page of senators,
neither of which I had done before. Rebecca had other things to do, and
getting her Mac up to speed was a lot of wait and pray, so I switched
over to Kate's task. Now Kate also has a Mac, but she was set up with
CIVX long ago and could never quite explain to me how, thanks again to
arcane commands.</p>
<p>Anyway, her task involved a particular Unicode character in a senator's
name not playing well with her scraped name-to-URL converter magic.
Having only last night read <a class="reference external" href="http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/">Falsehoods Programmers Believe About
Names</a> off StumbleUpon, I immediately recognized this way as a dead
end. Sure, you could force all the current names into this pattern, but
it would never last. Some day, a senator would show up such that the
senate's conversion script and ours didn't match, and then that senator
would disappear from CIVX. After a bit of poking, however, I found that
each senator had a link to their contact page right in the div we were
scraping. A bit of poking around later, and I had a 100% reliable link
to each senator's page, as verified by the senate themselves. Suddenly
every senator's page worked, without any of this needless mucking about
in Unicode transformations.</p>
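<p>The idea can be sketched roughly as follows. This is a minimal
illustration, not the code that went into CIVX: the post doesn't say which
parsing library was used, so this sticks to Python's standard library, and
the sample markup, the "ends in /contact" test, and the names
ContactLinkParser and contact_links are all assumptions made up for the
example.</p>

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Site named in the post; used to resolve relative links.
BASE_URL = "http://www.nysenate.gov"


class ContactLinkParser(HTMLParser):
    """Collect every link whose href ends in /contact (an assumed convention)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Use the link exactly as published instead of rebuilding it from a name.
        if href.rstrip("/").endswith("/contact"):
            self.links.append(urljoin(BASE_URL, href))


def contact_links(listing_html):
    """Return absolute contact-page URLs found in a chunk of listing HTML."""
    parser = ContactLinkParser()
    parser.feed(listing_html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Hypothetical listing markup; the real page structure may differ.
    sample = (
        '<div class="senator">'
        '<a href="/senator/zoe-q-example">Zoe Q. Example</a> '
        '<a href="/senator/zoe-q-example/contact">Contact</a>'
        "</div>"
    )
    print(contact_links(sample))
```

<p>Because the URL is read straight off the page, a name with an unusual
character never has to survive a round trip through our own
name-to-URL guesswork.</p>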
<p>Which brought us to our second problem. Most (with one important
exception) senators have a page hosted on <a class="reference external" href="http://www.nysenate.gov">http://www.nysenate.gov</a>, and
most have a contact page at /senator/first-m-last/contact.
What that page contains, however, seems largely up to each senator. Kate
had (and was quite proud of) her 4-line, incredibly complex and
unmaintainable regular expression which she used to mangle each page
into a regular form. However, as we poked further and further, we found
more and more inconsistencies and exceptions to the regular expression.
Clearly this was completely the wrong way again, but what was the right
way?</p>
<p>I suddenly saw an interesting anomaly. For most senators, the contact
info was styled exactly the same, despite quite varying styles
otherwise. Kate had already seen that most of the addresses are together
in some sort of paragraph tag, and was trying to regexp on the contents
of each paragraph on the page. What she hadn't noticed was that every
page had a &lt;div class="field-content"&gt; that contained all the contact
info. Now that, that was something a bit more to go on. Furthermore,
this contained all the contact info: occasionally more, but always the
minimum was their District Office and their Albany Office. Better yet,
it was already in some form of HTML, which Kate had previously been
stripping and rebuilding manually. If we simply took this HTML as-is and
plugged it into CIVX's contact page, instantly every senator had exactly
what we (and they) wanted!</p>
<p>Well, almost.</p>
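<p>Pulling that div out whole looks something like the sketch below. Again,
this is an illustration under assumptions rather than the actual CIVX
scraper: it uses only the standard library's html.parser, it takes the
first div whose class list includes field-content, and the names
FieldContentExtractor and extract_field_content are hypothetical.</p>

```python
from html.parser import HTMLParser


class FieldContentExtractor(HTMLParser):
    """Capture the inner HTML of the first <div class="field-content">."""

    def __init__(self):
        # Keep entities such as &nbsp; untouched so the markup passes through as-is.
        super().__init__(convert_charrefs=False)
        self.depth = 0       # div nesting depth inside the target; 0 means outside
        self.found = False   # True once the target div has been entered
        self.done = False    # True once the target div has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            self.parts.append(self.get_starttag_text())
            if tag == "div":
                self.depth += 1
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if "field-content" in classes:
                self.depth = 1
                self.found = True

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br /> never change the nesting depth.
        if self.depth and not self.done:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self.depth or self.done:
            return
        if tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True
                return
        self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        if self.depth and not self.done:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth and not self.done:
            self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        if self.depth and not self.done:
            self.parts.append("&#%s;" % name)


def extract_field_content(page_html):
    """Return the inner HTML of the first field-content div, or None if absent."""
    parser = FieldContentExtractor()
    parser.feed(page_html)
    parser.close()
    return "".join(parser.parts) if parser.found else None
```

<p>The point of the design is that nothing inside the div gets interpreted:
whatever the senator's office put there is handed on unchanged, which is
exactly why the next step turned out to be necessary.</p>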
<p>It was about at this point that Rebecca went home for the day, CIVX not
yet working. A few important packages were missing from <a class="reference external" href="http://pypi.python.org/pypi">pypi</a>, keeping
us from completing the CIVX setup step so she could get cracking on real
CIVX, without needing me to merge every change she wanted to push. Still,
this left me with more time to work on the regular expressions.</p>
<p>Now, most senators worked flawlessly, with two obvious exceptions. The
first, and the one I didn't want to tackle just yet, was the senator I
mentioned briefly above, <a class="reference external" href="http://www.kemphannon.com/">Sen. Kemp Hannon</a>. Notice
anything different about his page? Well, for one, it's not hosted on
nysenate.gov, and for another, he has no explicit contact page. The
first made our scraper entirely useless without coding in an exception
for senators with separate websites, and the second made such an
exception next to impossible to make general, without reverting back to
the 'check each paragraph for addresses' method.</p>
<p>So Kemp was put on the backburner for now. The other one, which failed
somewhat more spectacularly, didn't even break: it was the contact page
of <a class="reference external" href="http://www.nysenate.gov/senator/john-j-flanagan/contact">Sen. John J. Flanagan</a>. Putting aside for the moment the excess
content in the div, including the NYS seal and a few lines about
contact information, this is the worst example of automatically
generated HTML I have had the misfortune of needing to scrape.</p>
<p>Problem 1: &lt;p &gt;&amp;nbsp;&lt;p /&gt;&lt;br /&gt; I kid you not, this is on the page a
minimum of 20 times in a row, so that his Albany Office is so far below
the fold as to be nonexistent. Sometimes there are inline styles,
sometimes not. One line has a simple space character instead of the
edgier, hipper &amp;nbsp;. I wanted them all gone. This resulted in five
separate regexps so python wouldn't get too greedy and remove all the
content: one to replace &amp;nbsp; with ' ', another to remove all
whitespace between a closing angle bracket and an opening one, a third
to remove anything matching style="*", a fourth to remove all the (now)
empty paragraphs, and a fifth and final one to turn any group of two or
more consecutive break tags into a single tag. It is probably fortunate
that python would not correctly apply my first attempt, which was far
less readable and more of a one-liner, as I don't know if I could have
understood it now had I not broken it into its component parts.</p>
<p>Problem 2 is a bit more of a WTF moment, both beautiful and frightening,
so I will reproduce it here verbatim:</p>
<pre class="literal-block">
&lt;P style="TEXT-ALIGN: center"&gt;&lt;SPAN style="COLOR: #012849; FONT-SIZE:
18pt"&gt;&lt;SPAN&gt;&lt;SPAN&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY:
'Calibri', 'sans-serif'; COLOR: #012849; FONT-SIZE: 16pt;
mso-fareast-font-family: Calibri; mso-ascii-theme-font: minor-latin;
mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin;
mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font:
minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: EN-US;
mso-bidi-language: AR-SA"&gt;&lt;SPAN&gt;&lt;STRONG&gt;&lt;SPAN style="FONT-FAMILY: Times
New Roman"&gt;District Office&lt;BR /&gt;&lt;/span&gt;&lt;/strong&gt;&lt;/span&gt;&lt;SPAN
style="COLOR: #012849; FONT-SIZE: 16pt"&gt;&lt;SPAN style="FONT-FAMILY: Times
New Roman"&gt;260 Middle Country Road, Suite 203&lt;BR /&gt;Smithtown, New York
11787&lt;BR /&gt;631-361-2154&lt;BR /&gt;631-361-5367
FAX&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
</pre>
<p>For those of you following along at home, that's
P(text-align) > SPAN(color, font-size) > SPAN > SPAN > SPAN(line-height,
font-family, color, font-size, a bunch of other font styles) >
SPAN > STRONG > SPAN(font-family) >
SPAN(color, font-size) > SPAN(font-family)</p>
<p>Naturally, my first order of business was to remove every single span
from the HTML we take in. Because, frankly, this is preposterous. We
already (by problem 1) strip out all the style information, because
frankly, we don't need it, so this mess just turns into six nested
spans, not a very useful thing. Suddenly, the HTML coming out of the
sanitizer is much more compact, and not just because of all the breaks
and paragraphs I took out.</p>
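<p>For the curious, the whole sanitizer has roughly the shape below. This is
a reconstruction from the description above, not the committed code: the
exact patterns, their names, and the choice to strip spans right after the
style attributes are my own guesses, and "empty paragraph" here is assumed
to mean either an open-and-close pair with nothing inside or a self-closed
paragraph tag.</p>

```python
import re

# Each pattern does one small job so that no single expression can get
# greedy and swallow real content. All patterns are illustrative guesses.
NBSP = re.compile(r"&nbsp;")
BETWEEN_TAGS = re.compile(r">\s+<")
STYLE_ATTR = re.compile(r'\s+style="[^"]*"', re.IGNORECASE)
SPAN_TAG = re.compile(r"</?span\b[^>]*>", re.IGNORECASE)
EMPTY_P = re.compile(r"<p\s*>\s*</p>|<p\s*/>", re.IGNORECASE)
REPEATED_BR = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def sanitize(html):
    """Apply the clean-up steps in order and return the tidied HTML."""
    html = NBSP.sub(" ", html)             # 1. &nbsp; becomes a plain space
    html = BETWEEN_TAGS.sub("><", html)    # 2. drop whitespace between tags
    html = STYLE_ATTR.sub("", html)        # 3. drop inline style attributes
    html = SPAN_TAG.sub("", html)          #    then drop the now-pointless spans
    html = EMPTY_P.sub("", html)           # 4. drop the (now) empty paragraphs
    html = REPEATED_BR.sub("<br />", html) # 5. collapse runs of break tags
    return html


if __name__ == "__main__":
    # Made-up input in the spirit of the page described above.
    dirty = (
        '<p style="TEXT-ALIGN: center"><SPAN style="COLOR: #012849">'
        "<STRONG>District Office<BR /></STRONG></SPAN></p>"
        "<p >&nbsp;</p> <p />\n<br /><br />\n<br />"
        "<p>Albany Office</p>"
    )
    print(sanitize(dirty))
```

<p>Using [^"]* rather than .* for the style value is what keeps that step
from eating everything up to the last quote on the page, which is the
greediness problem mentioned earlier.</p>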
<p>By the time I finished with this, it was about an hour after most
everyone else had left. I spent the next half hour checking that my
sanitizer didn't break existing pages (it did, but only minorly) and
making sure my code was legible.</p>
<p>At that point, almost ten hours after I had started, I sat back,
committed my final changes, and decompressed. <em>This</em>. This is why I
love open source.</p>
Another Blog2010-09-07T16:49:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2010-09-07:/meta/another-blog.html<p>Just as I get used to this one, another of my classes requires me to
keep a blog, and to keep it separate from any other blogs we may keep.
The new blog is linked at left, but anything interesting will get
cross-posted here and it's likely going to go …</p><p>Just as I get used to this one, another of my classes requires me to
keep a blog, and to keep it separate from any other blogs we may keep.
The new blog is linked at left, but anything interesting will get
cross-posted here and it's likely going to go unmaintained at the
conclusion of the quarter.</p>
<p>Anyway, here it is: <a class="reference external" href="other-person.tumblr.com">other-person.tumblr.com</a></p>
Huzzah!2010-09-06T17:57:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2010-09-06:/personal/huzzah.html<p>Here we are, back at the beginning of another school year. It's going to
be interesting, I'm still pretty laid back from Fan Expo Canada in
Toronto.</p>
<p>My new monitor finally arrived. Well, it arrived Saturday, but the RIT
post office isn't open on weekends over breaks, so I had …</p><p>Here we are, back at the beginning of another school year. It's going to
be interesting, I'm still pretty laid back from Fan Expo Canada in
Toronto.</p>
<p>My new monitor finally arrived. Well, it arrived Saturday, but the RIT
post office isn't open on weekends over breaks, so I had to wait until
this morning. Luckily, it seems that the stand on my last monitor which
made me love it so much is a standard VESA mount, so I detached it from
my old monitor, removed the sad excuse of a stand from my new monitor,
and now have a beautiful new screen on the beautiful old stand. The new
monitor, meanwhile, being an LED-backlit model, is much lighter now than
its stand. The speakers seem okay, although they're on the back of the
monitor (no doubt meant to reflect off the wall behind them) so if I
want to blast music, I'll just end up blasting Justin instead.</p>
<p>So I now have two monitors, though I don't know if I'm keeping the
ancient 15" LCD from home that was serving as a spare. We'll see. In the
meantime, Minecraft looks better than ever.</p>
Writing and Me2010-07-21T04:13:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2010-07-21:/personal/writing-and-me.html<p>I am not a writer. Had things progressed differently, I might have tried
it. I do enjoy the writing process. One important problem is that I hate
revision. I start from a theoretical perfect model, and slowly write it
out, far slower than my brain, which can race ahead to …</p><p>I am not a writer. Had things progressed differently, I might have tried
it. I do enjoy the writing process. One important problem is that I hate
revision. I start from a theoretical perfect model, and slowly write it
out, far slower than my brain, which can race ahead to new, more
exciting topics, while I still need to actually write what has been
planned out before.</p>
<p>Instead, I am a programmer. Really, the two are not very different. In
programming, however, your audience is a computer, and the computer is
far less forgiving of unclear communication. I'm not talking about typos
or misplaced parentheses here, but in the ability of the writer to
convey their ideas to the audience.</p>
<p>Programming generally has a very strict syntax compared to speech. Words
generally have only one meaning, and if they have more than one, it is
immediately obvious which one is meant in any situation. Human
communication, on the other hand, has shades of meaning, imagery, and
all sorts of other trickery to say the same thing in many ways. Years of
programming have left my vocabulary a bit dry, opting for specific,
technical jargon rather than the more general, descriptive prose.</p>
<p>To put it another way, if my writings were computer code, none of them
would compile. But then, none of my real code does either, not on the
first try. The compiler rejects a statement here or there, and when it
finally runs, the output is all wrong. So I tweak the code, simplify and
break down long statements, and try to say exactly what I mean.
Eventually, the code compiles and runs, and does exactly what I want it
to.</p>
<p>How does this relate to being a writer? Well, I follow mostly the same
pattern in my writing. The only difference is that I do it without a
compiler. I learned long ago that self-correction is often useless
for me; I tend to make the same mistakes while reading as I did writing
it. My best writing comes from when I <em>do</em> have a compiler of sorts;
school assignments where each revision compiles a little further with my
teacher, and more meaning comes through.</p>
<p>Unfortunately, I do not write regularly anymore. The days of required
writing are almost gone, and I have yet to find a compiler for writing
which works as well as my teachers. Add to that my dislike of revision
without glaring errors to fix, and my personal writing tends to throw
parsing errors in anyone who isn't me.</p>
<p>Hopefully I'll get better at this. At the very least I hope to catch the
common mistakes here.</p>
pyDex Progress2010-07-10T16:51:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2010-07-10:/personal/pydex-progress.html<p>I've finally gotten around to staging and committing a few enhancements
to the <a class="reference external" href="http://gitorious.org/pydex">pydex repository</a>. The more advanced stuff like Isshu pokemon
and the new file format are still in limbo, naturally, and won't get
moved to master any time soon since they make older files incompatible,
though I hope …</p><p>I've finally gotten around to staging and committing a few enhancements
to the <a class="reference external" href="http://gitorious.org/pydex">pydex repository</a>. The more advanced stuff like Isshu pokemon
and the new file format are still in limbo, naturally, and won't get
moved to master any time soon since they make older files incompatible,
though I hope to at least be able to read them for the time being.
I'm also beginning to wonder about dynamically limiting the National
Pokedex based on the game setting, once that actually does something.
I've only played Gen IV extensively, but after loading up Yellow on a
whim, I see how this could be useful. But I'm more concerned with
bugfixing 1.0 so that it can see the light of day than tacking on more
features right now.</p>
Diving Deep2010-06-30T16:55:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2010-06-30:/civx/diving-deep.html<p>Yesterday had been spent working on a proper dev branch to CIVX. Today
we gave it a home.</p>
<p>Today started off trying to get the current dev CIVX running on our
server. Now, I had some experience setting up a CIVX instance from
getting one running on my box, but …</p><p>Yesterday had been spent working on a proper dev branch to CIVX. Today
we gave it a home.</p>
<p>Today started off trying to get the current dev CIVX running on our
server. Now, I had some experience setting up a CIVX instance from
getting one running on my box, but my box is mine and I know what's on
it. This server is not, so we had a few troubles along the way. Once
again the instructions proved insufficient (though not actually bad per
se). The server was running Python 2.4 by default, and one of the
scripts needed to be run under Python 2.6, as it was pulling a few things
from __future__. Remy and I spent more than a little while wondering why
things weren't happening properly when we had neglected to run a command
or two. <a class="reference external" href="http://lewk.org">Luke</a> set us straight at every turn, and I can now say that I
actually understand what most of those commands do with respect to the
rest of the system.</p>
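<p>A version mismatch like that one is the sort of thing a small guard can
catch early. What follows is a minimal sketch and not anything from the
CIVX tree: require_python is a made-up helper name, and the (2, 6) minimum
simply mirrors the version mentioned above. One caveat worth stating: a
__future__ import that the interpreter doesn't know about fails when the
file is compiled, so a guard like this only helps if it lives in a separate
launcher that the older interpreter can still parse.</p>

```python
import sys


def require_python(minimum, version_info=None):
    """Raise RuntimeError unless the interpreter is at least `minimum`.

    `minimum` is a (major, minor) pair. `version_info` exists only so the
    check can be exercised without swapping interpreters; by default the
    running interpreter's version is used.
    """
    current = tuple((version_info or sys.version_info)[:2])
    wanted = tuple(minimum)
    if current < wanted:
        raise RuntimeError(
            "need Python %d.%d or newer, but this is %d.%d" % (wanted + current)
        )
    return current


if __name__ == "__main__":
    # A hypothetical launcher would call this before importing the real script.
    require_python((2, 6))
```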
<p>The second half of the day was a bit more interesting. Once CIVX was
running, my attention was turned to the scrapers. Most of the scrapers
are v2 scrapers, and one of our tasks for the summer is to get things
running up to v3. On the first try getting the stimulus watcher scraper
running, we had a few problems remembering where everything went and
where to call what when. On the second shot getting a scraper for
howdtheyvote.ca data, it went much faster. I need to remember to source
tg2env when I'm on the server, but mostly, things were good today.
Tomorrow we put together the pieces of v3, and keep on moving towards
our goals.</p>
More Conference Calls2010-06-24T12:10:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2010-06-24:/civx/more-conference-calls.html<p>Every time I go through one of these calls I feel like I come out
knowing less about the topic discussed. The topics aren't hard, and I
can now follow along and understand most of what they're talking about,
but most of the technologies are things I haven't really looked …</p><p>Every time I go through one of these calls I feel like I come out
knowing less about the topic discussed. The topics aren't hard, and I
can now follow along and understand most of what they're talking about,
but most of the technologies are things I haven't really looked at
before. It's interesting, and it's exciting, but it's a little wearying
too. The majority of the talks seem to revolve around the use of Amazon
EC2 instances, of which I know nothing.</p>
<p>On my own tasks, the semifinal versions of the getall script and the
associated documentation got pushed to <a class="reference external" href="http://bitbucket.org/slinkp/geowebdns">geowebdns</a>' repository today. I
feel pretty good about the changes I made. Even if they don't prove
useful to the current task, I have opened up some small part of someone
else's open source program, and that's really cool.</p>
Introducing: GeoPirate!2010-06-23T07:15:00-04:002022-12-07T14:07:06-05:00Katherine Casetag:blog.katherineca.se,2010-06-23:/civx/introducing-geopirate.html<p>So I've received my apparently mandated CIVX nickname, despite it
possibly being inaccurate in a week or two. I, along with <a class="reference external" href="http://rebeccanatalie.com/">Pixel Ninja</a>
and <a class="reference external" href="http://foss.rit.edu/user/17">Python Princess</a>, have been hacking away at our various tasks
together for over a week now, and despite the seemingly constant moving
from place to place …</p><p>So I've received my apparently mandated CIVX nickname, despite it
possibly being inaccurate in a week or two. I, along with <a class="reference external" href="http://rebeccanatalie.com/">Pixel Ninja</a>
and <a class="reference external" href="http://foss.rit.edu/user/17">Python Princess</a>, have been hacking away at our various tasks
together for over a week now, and despite the seemingly constant moving
from place to place, it's a pretty sweet gig.</p>
<p>My work yesterday was some final polishing of the getall script, and my
first real push to the geowebdns repository. I'm also working on a
supplemental, informational guide to maintenance. Unfortunately, my
changes can't quite replace the old script yet, because the import
script (the one that actually starts to bring the files into the
database and do something with them) is hardcoded to the current files
and that script is a bit harder to hack than the download script. The
real work was on the documentation, trying to get people to understand
my reasoning behind the changes and allow for them to continue my work
without too much difficulty.</p>
Hello, world...2010-06-21T14:53:00-04:002022-12-07T16:04:52-05:00Katherine Casetag:blog.katherineca.se,2010-06-21:/civx/hello-world.html<p>This is my first posting despite having been here for a week, because
the zaniness that was my first week did not allow for such silly things
as introductions. Now I finally have some of the infrastructure things
worked out, and can report on my findings.</p>
<p>Work has been progressing …</p><p>This is my first posting despite having been here for a week, because
the zaniness that was my first week did not allow for such silly things
as introductions. Now I finally have some of the infrastructure things
worked out, and can report on my findings.</p>
<p>Work has been progressing on a scraper for <a class="reference external" href="http://en.wikipedia.org/wiki/Shapefile">shapefiles</a> for the Senate.
I've been fairly successful in getting a large amount of data (up to
7.6 GB of zip files!), so my task today was to clean up the script and
make it readable to commit it back upstream. I had started by adding
more loops to the script for special cases, but that meant repeating the
core download code. Realizing this, I made a generalized downloader
function which gets called by each loop, simplifying individual loops
into a simple</p>
<div class="highlight"><pre><span></span><span class="k">for</span> URL <span class="k">in</span> <span class="nv">$URL_LIST</span><span class="p">;</span> <span class="k">do</span> download <span class="nv">$URL</span> <span class="nv">$FILE</span><span class="p">;</span> <span class="k">done</span>
</pre></div>
<p>Well, I think it's simpler...</p>
<p>Each time I tweak away at this file, my bash skill gets better and
better, but whether any of this is going to stick around is another
question. I always forget the simple bashisms, like for loops and
conditionals, and I can never remember quite how variables work. But as
I work more with this file, I get more comfortable with these
conventions. Though I still pine for the occasional
pythonic statement (and perhaps also for the fjords).</p>