Zotero library re-visioned

I have wanted to use Zotero for my reference library for a while, but could never work out how to back up the library using Subversion. My life is contained within Subversion. I do not know how I could possibly have survived before all my work (code, presentations, papers, images and, not to mention, my thesis) was perfectly backed up, re-visioned and floating happily in the cloud, available to me from any machine. Zotero installs itself inside the Firefox profile, which makes it difficult to put under revision control within the “C:\my-subversion” folder. What I decided to do was create a new Firefox profile (instructions here) within the my-subversion folder and then install Zotero, which creates a zotero folder inside that profile.


I then added only the zotero folder to my Subversion repository. You could always put your whole Firefox profile under revision control, but I decided not to. Now every time I add a new item to Zotero, the my-subversion folder indicates there has been a change that requires a commit. Obviously, every time you add a PDF file to the library you will have to “SVN add” the file itself. This is not a problem for me, as I try to keep my library light and not store too many PDFs.
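As a rough sketch of the “SVN add” step, the function below filters the output of svn status down to the unversioned paths (those flagged with “?”), so that newly attached PDFs are easy to spot. The function name unversioned and the folder name zotero are illustrative assumptions, not something Zotero or Subversion provides.

```shell
# Filter `svn status` output down to unversioned paths.
# Unversioned items are flagged with "?" in the first column.
unversioned() {
  awk '/^\?/ { sub(/^\?[ \t]+/, ""); print }'
}

# Typical use from inside the working copy (assumes the svn client is
# installed and that the library lives in a folder called "zotero"):
#   svn status zotero | unversioned   # list files that still need adding
#   svn add --force zotero            # schedule everything unversioned under zotero
#   svn commit -m "Update Zotero library" zotero
```

Listing the files first gives you a chance to check what is about to be added before committing.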

I am also going to try and use zotero as an interface to my subversion repository, describing and tagging documents and code that I write, but more specifically presentations, so no more trying to work out what is contained in “Presentation1.ppt” or what file name I gave to that talk on data standards which I have to give tomorrow!

I am tagging my hard drive via Zotero; it's just one big cloud.



Ontology crowdsourcing

I have the unenviable task of developing an ontology for the CARMEN project. The ontology must allow the process of electrophysiology experiments, the generated data, the analysis of the data and the services that perform the analysis to be described in a way that is also computationally amenable. Collecting the words required to describe these tasks is relatively trivial. However, getting the scientists to realise they have assigned numerous meanings to the same word or term requires a little more patience on my part.

It also requires me to educate the scientists, in that building an ontology for electrophysiology is a little more complicated than putting some “words” in a text file.

The words in an ontology have to be explicitly defined so as to be completely unambiguous both to the scientists, who generate the data, and to the informaticians who want to analyse the data, either immediately or several years down the line. The data should be described to an agreed level of detail, so that the informatician no longer has to pick up the phone and politely ask “how did you generate this piece of data?”.

The first point I am trying to relay to the scientists is that although you use the same “words”, you often use them to describe different things in different contexts. This is generally less important in a journal publication, but it presents issues when you use the words to annotate data and infer knowledge.

I have been trying to work out the best way to get this message across and to develop a methodology for collecting agreed definitions for words. I could have always put up a wiki or an issue tracker to do this, but this doesn’t always guarantee contribution. I feel the process needs to be mediated to turn the natural language definitions into more explicit normalised ontological definitions. Taking this into account I have decided to apply crowdsourcing to Ontology development.

Simply put, this means sending out an email entitled “Metadata term of the week”, a process suggested to me by my boss, Phil Lord. In this email I pick a word and attempt to define it. If I get it right then there is no need to respond. If you disagree with the definition then you have to respond with an alternative, so a discussion ensues and ends with an agreed definition. With this process the scientists get to see that other scientists within the project define or describe words differently enough that they are no longer talking about the same thing.

The first Metadata term of the week was “spike sorting”, and we received the following definitions:

  1. Spike sorting is a process of assigning data spikes to sets, where each set is identified with a single neuron
  2. Spike sorting is a process aiming at separating spikes generated by different cells based on shape discrimination algorithms
  3. Spike sorting is a technique used in single-cell neural recordings which assigns particular spike shapes to individual neurons
  4. Spike sorting is a classification procedure. We can think about a forest (time series) where M animals of K different types live (M spikes of K different neurons). All animals are different but say two rabbits are a little bit more similar than the rabbit and fox. So, we need classify all M animals and to say about each to what particular class among K classes this animal belongs.
  5. Spike sorting is the process of identifying the waveforms associated with action potentials of an individual neuron within time series data.

They are all trying to say the same thing, although when taken explicitly they start to “mean” different things. This led us to define three terms in order to answer the original question:

a) An action potential is a sudden depolarization of the membrane potential of a cell. [synonym: spike]

b) Spike detection is a data extraction process that classifies the waveforms associated with action potentials and identifies the time point of when the spike event initiates. The input to this process is a continuous waveform. The output is a single sequence of spike event times.

c) Spike sorting is a data extraction process that assigns detected spike event times to individual neurons. The input of this process can be a continuous waveform or a sequence of spike event times. The output of this process is sets (or categories) of spikes. Each set is assumed to correspond to a single neuron.

This peer-production process took approximately 4 days to conclude, and I think it has succeeded in addressing three issues:

  1. Highlighting the ambiguity in the use of terms, even within a small and enclosed group of scientists within a single project.
  2. The peer-production of ontology terms and definitions.
  3. The engagement of the community within the project.

I would love to hear people's comments on this process or any alternative suggestions. Feel free to comment.


Windows and ubuntu Synergy

I have just installed Synergy on my Gutsy Ubuntu machine. This allows me to switch seamlessly between my Ubuntu desktop and my Windows laptop using the same keyboard and mouse. It was very easy to install via apt. I followed the Ubuntu Synergy how-to and it worked perfectly first time.
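For reference, here is a minimal sketch of the kind of server configuration a Synergy setup like this needs. It assumes the Ubuntu desktop owns the keyboard and mouse (so it runs the Synergy server) and that the Windows laptop sits to its right. The screen names ubuntu-desktop and windows-laptop and the file name synergy.conf.example are placeholders and would need replacing with the real host names.

```shell
# Write an example Synergy server configuration to the current directory.
# Screen names are placeholders; they must match each machine's host name.
cat > synergy.conf.example <<'EOF'
section: screens
    ubuntu-desktop:
    windows-laptop:
end

section: links
    ubuntu-desktop:
        right = windows-laptop
    windows-laptop:
        left = ubuntu-desktop
end
EOF

# Then, on the machine with the keyboard and mouse (assumed to be Ubuntu):
#   synergys -f --config synergy.conf.example
# and on the other machine, point the Synergy client at that host.
```

Each link is one-directional, which is why both the “right” and the “left” entries are needed for the pointer to travel back again.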


How to install Java on ubuntu

I am writing this post because I keep forgetting how to do it and end up trawling the web trying to find it. The Ubuntu starter guide should be the first port of call for installing whatever flavour of Java you want via apt.

The following comes from [1]:

If you want to use Sun’s Java instead of the open source GIJ (GNU Java bytecode interpreter) you need to set it as default. To list installed JVMs:

update-java-alternatives -l

To select, for example, Sun’s JVM as provided in Ubuntu 6.06, run:

sudo update-java-alternatives -s java-1.5.0-sun

You should also edit /etc/jvm and move /usr/lib/jvm/java-1.5.0-sun to the top of JVMs offered.

To set the JAVA_HOME environment variable I followed this [2]

export JAVA_HOME=/usr/lib/jvm/java-1.5.0-sun-

You can find your JAVA_HOME using the locate command for a file belonging to the JDK.

locate /rt.jar
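Note that locate prints the full path to rt.jar rather than JAVA_HOME itself. As a small sketch, assuming the usual layout in which rt.jar lives under jre/lib (full JDK) or lib (JRE only), the helper below strips that suffix to leave the directory to use. The function name java_home_from_rt_jar and the example paths are hypothetical.

```shell
# Derive a JAVA_HOME candidate from the path to rt.jar.
# Assumes rt.jar sits in <home>/jre/lib (JDK) or <home>/lib (JRE only).
java_home_from_rt_jar() {
  case "$1" in
    */jre/lib/rt.jar) printf '%s\n' "${1%/jre/lib/rt.jar}" ;;
    */lib/rt.jar)     printf '%s\n' "${1%/lib/rt.jar}" ;;
    *)                return 1 ;;   # not an rt.jar path we recognise
  esac
}

# Example with a made-up path:
#   java_home_from_rt_jar /opt/example-jdk/jre/lib/rt.jar
# Possible use with locate (takes the first match only):
#   export JAVA_HOME="$(java_home_from_rt_jar "$(locate /rt.jar | head -n 1)")"
```

If several JVMs are installed, locate returns several matches, so it is worth checking which one the first line refers to before exporting it.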
