OK, so it is not quite up-to-the-minute, as-you-do-it-you-publish-it open science. However, I plan to make the data I generated during my PhD (just finished) open and available, and in writing this post I am making somewhat of a public commitment to do so. One difference, though, from some of the open science efforts I have seen so far is that I will be publishing my data in conformance with the recommended reporting requirements of the Proteomics Standards Initiative (PSI) MIAPE guidelines for gel electrophoresis (MIAPE GE). The data itself will be represented in XML using the PSI-recommended Gel electrophoresis Markup Language (GelML), and using terminology from sepCV and OBI should make the data set computationally amenable. I was involved in the development of these specifications, so I suppose I should be leading by example and be the first one to publish a complete gel electrophoresis proteomics dataset.
When finished, I would have liked it to be published somewhere like Nature Precedings; however, they only accept proprietary Microsoft files and PDFs rather than XML documents. I also thought of creating a Google Code project for it, but that seems quite elaborate for something nobody else would be contributing to and that, once completed, would be rather static. Any suggestions are very welcome.
#1 by Jean-Claude Bradley on January 28, 2008 - 5:18 pm
In terms of suggestions, experiment with a few different platforms and see what works. For our experiments I have been happy with the flexibility of a hosted wiki (like Wikispaces) with links to data.
Even though Precedings does not take XML, I see no problem with posting the occasional report in PDF.
As long as you don’t give away your copyright you can experiment freely.
#2 by Cameron Neylon on January 29, 2008 - 8:40 am
This is great to hear! Keep us updated with what is going on and how you get on, particularly any issues with formats not working or whatever. I am actually meeting with some of the Nature people next month, so if you have some specific suggestions I can take them along. We should, in my view, make a case for XML being a standard data format that repositories can take. Much more useful in the long term than PDF or DOC.
Also great to see a commitment to publicly available standards. I will be interested to see how this works out, in terms of whether you feel in the end that the data itself is enough to be useful, or whether you will need more general descriptions of what you did.
#3 by peanutbutter on February 4, 2008 - 2:22 pm
The XML format would be extremely useful, especially with the introduction of FuGE and the formats that are being developed based on it. It would also be nice to see OpenOffice formats and LaTeX being included.
#4 by Kristine Yu on February 17, 2008 - 6:18 am
It sounds like Google may be getting involved with hosting large datasets, cf.
Perhaps that could eventually be an option?